$(A^* e_j, e_k) = (e_j, A e_k) = \overline{(A e_k, e_j)} = \overline{a_{jk}}$. Therefore the operator $A$ is self-adjoint if $a_{ij} = \overline{a_{ji}}$. A matrix satisfying this condition is often called Hermitian.
Linear Operators on Hilbert Spaces
Example 4.4.2. Let $H$ be a separable infinite dimensional Hilbert space and let $\{e_1, e_2, e_3, \ldots\}$ be a complete orthonormal sequence in $H$. Let $A$ be a bounded operator on $H$ represented by an infinite matrix $(a_{ij})$; see Theorem 4.2.2. As in the finite dimensional case, the adjoint operator $A^*$ is represented by the infinite matrix $(\overline{a_{ji}})$. $A$ is self-adjoint if $a_{ij} = \overline{a_{ji}}$ for all $i, j \in \mathbb{N}$.
Example 4.4.3. Let $T$ be a Fredholm operator on $L^2([a, b])$ defined by
$$(Tx)(s) = \int_a^b K(s, t)\, x(t)\, dt,$$
where $K$ is a function defined on $[a, b] \times [a, b]$ such that
$$\int_a^b \int_a^b |K(s, t)|^2\, ds\, dt < \infty.$$
Note that the condition is satisfied if $K$ is continuous. We have
$$(Tx, y) = \int_a^b \int_a^b K(s, t)\, x(t)\, \overline{y(s)}\, ds\, dt = \int_a^b x(t)\, \overline{\left( \int_a^b \overline{K(s, t)}\, y(s)\, ds \right)}\, dt = \left( x,\ \int_a^b \overline{K(s, t)}\, y(s)\, ds \right).$$
This shows that
$$(T^* y)(s) = \int_a^b \overline{K(t, s)}\, y(t)\, dt.$$
Thus a Fredholm operator is self-adjoint if its kernel satisfies the equality $K(s, t) = \overline{K(t, s)}$.
Example 4.4.4. Let $A$ be the operator on $L^2([a, b])$ defined by $(Ax)(t) = t\, x(t)$. Since
$$(Ax, y) = \int_a^b t\, x(t)\, \overline{y(t)}\, dt = \int_a^b x(t)\, \overline{t\, y(t)}\, dt = (x, Ay),$$
$A$ is self-adjoint.
Theory
Example 4.4.5. Consider the operator $A$ defined on $L^2(\mathbb{R})$ by
$$(Ax)(t) = e^{-|t|} x(t).$$
This is also a bounded self-adjoint operator. Boundedness of $A$ can be shown as in Example 4.2.5. Moreover, we have
$$(Ax, y) = \int_{-\infty}^{\infty} e^{-|t|} x(t)\, \overline{y(t)}\, dt = \int_{-\infty}^{\infty} x(t)\, \overline{e^{-|t|} y(t)}\, dt = (x, Ay).$$
Thus $A$ is self-adjoint.

Example 4.4.6. Let […] $a > 0$ ($a \in \mathbb{R}$) such that $aA \le \mathbb{I}$.

Easy proofs are left as exercises.
Example 4.6.3. The product of two positive operators is not necessarily positive. Indeed, one can exhibit positive operators $A$ and $B$ on $\mathbb{R}^2$, given by $2 \times 2$ matrices, such that the product $AB$ is not positive.
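A concrete instance can be checked numerically. The matrices below are an illustrative choice made for this sketch (not necessarily the ones intended in the text): both are symmetric positive semidefinite, yet the quadratic form of their product takes a negative value.

```python
import numpy as np

# Two positive (semidefinite) operators on R^2; an illustrative choice,
# not taken from the text.
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # eigenvalues 0 and 2
B = np.array([[1.0, 0.0],
              [0.0, 0.0]])   # eigenvalues 1 and 0

# Both quadratic forms are non-negative...
assert np.all(np.linalg.eigvalsh(A) >= -1e-12)
assert np.all(np.linalg.eigvalsh(B) >= -1e-12)

# ...but (ABx, x) < 0 for x = (1, -2), so AB is not positive.
x = np.array([1.0, -2.0])
value = x @ (A @ B) @ x
print(value)  # -1.0
```

Note that $AB$ here is not even symmetric, which is exactly what fails when $A$ and $B$ do not commute.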
Theorem 4.6.3. The product of two commuting positive operators is a positive operator.

Proof. Let $A$ and $B$ be commuting positive operators (if $A = 0$ the assertion is trivial). Define
$$A_1 = \frac{A}{\|A\|}, \qquad A_{n+1} = A_n - A_n^2, \quad n = 1, 2, \ldots.$$
Note that the operators $A_n$ are self-adjoint and commuting. We will show, by induction, that
$$0 \le A_n \le \mathbb{I} \tag{4.6.1}$$
for all $n \in \mathbb{N}$. Clearly, (4.6.1) is satisfied for $n = 1$. Suppose now (4.6.1) holds for some $k \in \mathbb{N}$. Then
$$(A_k^2(\mathbb{I} - A_k)x, x) = ((\mathbb{I} - A_k)A_k x, A_k x) \ge 0$$
and
$$(A_k(\mathbb{I} - A_k)^2 x, x) = (A_k(\mathbb{I} - A_k)x, (\mathbb{I} - A_k)x) \ge 0,$$
which means $A_k^2(\mathbb{I} - A_k) \ge 0$ and $A_k(\mathbb{I} - A_k)^2 \ge 0$. Consequently
$$A_{k+1} = A_k - A_k^2 = A_k^2(\mathbb{I} - A_k) + A_k(\mathbb{I} - A_k)^2 \ge 0$$
and
$$\mathbb{I} - A_{k+1} = (\mathbb{I} - A_k) + A_k^2 \ge 0.$$
This shows that (4.6.1) holds for $k + 1$, and thus for all $n \in \mathbb{N}$, by induction.

We have
$$A_1 = A_1^2 + A_2 = A_1^2 + A_2^2 + A_3 = \cdots = \sum_{k=1}^{n} A_k^2 + A_{n+1},$$
and hence
$$\sum_{k=1}^{n} A_k^2 = A_1 - A_{n+1} \le A_1.$$
Therefore
$$\sum_{k=1}^{n} (A_k x, A_k x) \le (A_1 x, x).$$
This shows that the series $\sum_{n=1}^{\infty} \|A_n x\|^2$ converges and $\|A_n x\| \to 0$. Then
$$\left( \sum_{k=1}^{n} A_k^2 \right) x = A_1 x - A_{n+1} x \to A_1 x \quad \text{as } n \to \infty,$$
or, equivalently,
$$\sum_{n=1}^{\infty} A_n^2 x = A_1 x.$$
Since $B$ commutes with $A_n$ for all $n \in \mathbb{N}$, we have
$$(ABx, x) = \|A\|(B A_1 x, x) = \|A\| \sum_{n=1}^{\infty} (B A_n^2 x, x) = \|A\| \sum_{n=1}^{\infty} (B A_n x, A_n x) \ge 0.$$
This proves the theorem.

The following theorem is an interesting analog of a property of real numbers.
Theorem 4.6.4. Let $A_1 \le A_2 \le \cdots \le A_n \le \cdots$ be self-adjoint operators on a Hilbert space $H$ such that $A_n A_m = A_m A_n$ for all $m, n \in \mathbb{N}$. If $B$ is a self-adjoint operator on $H$ such that $A_n B = B A_n$ and $A_n \le B$ for all $n \in \mathbb{N}$, then the sequence $\{A_n\}$ converges strongly to a self-adjoint operator $A$, and $A_1 \le A \le B$.
Proof. Define $C_n = B - A_n$. The operators $C_n$ commute with each other, $C_n \ge 0$, and $C_m - C_n = A_n - A_m \ge 0$ for $n > m$. By Theorem 4.6.3, for $n > m$ the operators
$$C_m(C_m - C_n) = C_m^2 - C_m C_n \quad \text{and} \quad (C_m - C_n)C_n = C_m C_n - C_n^2$$
are positive. Hence
$$(C_m^2 x, x) \ge (C_m C_n x, x) \ge (C_n^2 x, x)$$
for every $x \in H$. Since, for an arbitrary fixed $x \in H$, $\{(C_n^2 x, x)\}$ is a nonincreasing sequence of non-negative numbers, it converges and thus
$$\lim_{m, n \to \infty} (C_m C_n x, x) = \lim_{n \to \infty} (C_n^2 x, x).$$
Hence
$$\|C_m x - C_n x\|^2 = (C_m^2 x, x) - 2 (C_m C_n x, x) + (C_n^2 x, x) \to 0$$
as $m, n \to \infty$. Therefore $\{C_n x\}$ is a Cauchy sequence for every $x \in H$. Consequently $\{C_n x\}$, and thus also $\{A_n x\}$, converges for every $x \in H$. It is easy to check that the operator $A$ defined by $Ax = \lim_{n \to \infty} A_n x$ is self-adjoint and $A_1 \le A \le B$.

Definition 4.6.2 (Square Root). A square root of a positive operator $A$ is a self-adjoint operator $B$ satisfying $B^2 = A$.
Theorem 4.6.5. Every positive operator $A$ has a unique positive square root $B$. Moreover, $B$ commutes with every operator commuting with $A$.

Proof. Let $A \ge 0$ and let $a > 0$ ($a \in \mathbb{R}$) be such that $a^2 A \le \mathbb{I}$. Define $T_0 = 0$ and
$$T_{n+1} = T_n + \tfrac{1}{2}\left( a^2 A - T_n^2 \right) \tag{4.6.2}$$
for $n = 0, 1, 2, \ldots$. Note that the operators $T_n$ are self-adjoint (as polynomials of $A$ with real coefficients) and positive. Moreover, they commute with every operator commuting with $A$. In particular, $T_n T_m = T_m T_n$ for all $m$ and $n$. For every $n$, we have
$$\mathbb{I} - T_{n+1} = \tfrac{1}{2}(\mathbb{I} - T_n)^2 + \tfrac{1}{2}(\mathbb{I} - a^2 A) \tag{4.6.3}$$
and
$$T_{n+1} - T_n = \tfrac{1}{2}\big( (\mathbb{I} - T_{n-1}) + (\mathbb{I} - T_n) \big)(T_n - T_{n-1}). \tag{4.6.4}$$
In view of (4.6.3), we have $T_n \le \mathbb{I}$ for all $n$. Moreover, $T_n \le T_{n+1}$ for all $n$. Indeed,
$$T_1 - T_0 = \tfrac{1}{2} a^2 A \ge 0,$$
and if $T_n - T_{n-1} \ge 0$, then $T_{n+1} - T_n \ge 0$, by (4.6.4). By Theorem 4.6.4, the sequence $\{T_n\}$ converges strongly to a positive self-adjoint operator $T$. Letting $n \to \infty$ in (4.6.2) yields $T = T + \tfrac{1}{2}(a^2 A - T^2)$, i.e.,
$$T^2 = a^2 A.$$
Denote $B = T/a$. Then $B^2 = A$. The operator $B$ is obviously positive. Since, for each $n \in \mathbb{N}$, $T_n$ commutes with every operator commuting with $A$, so do $T$ and $B$.
It remains to prove the uniqueness. Let $C$ be a positive operator such that $C^2 = A$. Since $C$ commutes with $A$, $C$ commutes with $B$. Let $x \in H$ and let $y_0 = (B - C)x$. Then
$$(B y_0, y_0) + (C y_0, y_0) = ((B + C) y_0, y_0) = ((B + C)(B - C)x, y_0) = ((B^2 - C^2)x, y_0) = 0.$$
Since $B$ and $C$ are positive, we have $(B y_0, y_0) = (C y_0, y_0) = 0$. If $D$ is a positive square root of $B$, then
$$\|D y_0\|^2 = (D^2 y_0, y_0) = (B y_0, y_0) = 0.$$
Hence $D y_0 = 0$ and also $B y_0 = D(D y_0) = 0$. Similarly $C y_0 = 0$. Finally,
$$\|Bx - Cx\|^2 = ((B - C)^2 x, x) = ((B - C) y_0, x) = 0$$
for arbitrary $x \in H$. This proves $B = C$, completing the proof of the theorem.
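The iteration (4.6.2) can be run numerically for a positive matrix. The sketch below is an illustration under one extra assumption: $a$ is chosen with $a^2 = 1/(2\|A\|)$, which guarantees $a^2 A \le \mathbb{I}$ with room to spare, so the iterates $T_n$ converge to $T$ with $T^2 = a^2 A$ and $B = T/a$ is the positive square root.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                  # positive definite

a = 1.0 / np.sqrt(2 * np.linalg.norm(A, 2))  # ensures a^2 A <= I strictly
T = np.zeros_like(A)                         # T_0 = 0
for _ in range(200):
    T = T + 0.5 * (a**2 * A - T @ T)         # T_{n+1} = T_n + (a^2 A - T_n^2)/2

B = T / a                                    # candidate positive square root
print(np.allclose(B @ B, A))                 # True
```

The scheme is only linearly convergent; it is chosen here because it mirrors the proof, not because it is the fastest way to compute a matrix square root.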
Definition 4.6.3 (Positive Definite Operator). A self-adjoint operator $A$ is called strictly positive or positive definite if $(Ax, x) > 0$ for all $x \in H$, $x \neq 0$.

[…]

$$\lambda_1 \overline{\lambda_2}\, (u_1, u_2) = (\lambda_1 u_1, \lambda_2 u_2) = (A u_1, A u_2) = (u_1, u_2).$$
Since $\lambda_1 \overline{\lambda_2} \neq 1$, we get $(u_1, u_2) = 0$, which proves that the eigenvectors $u_1$ and $u_2$ are orthogonal. The proof is complete.

Theorem 4.9.7. For every eigenvalue $\lambda$ of a bounded operator $A$ we have $|\lambda| \le \|A\|$.

Proof. Let $u$ be a non-zero eigenvector corresponding to $\lambda$. Since $Au = \lambda u$, we have $\|Au\| = \|\lambda u\|$, and thus
$$|\lambda|\, \|u\| = \|Au\| \le \|A\|\, \|u\|.$$
This implies $|\lambda| \le \|A\|$.
Remark. If the eigenvalues are considered as points in the complex plane, the above result implies that all the eigenvalues of a bounded operator $A$ lie inside the closed disk of radius $\|A\|$ centered at the origin.

Corollary 4.9.1. All eigenvalues of a bounded self-adjoint operator $A$ satisfy the inequality
$$|\lambda| \le \sup_{\|x\| \le 1} |(Ax, x)|. \tag{4.9.9}$$
Proof follows immediately from Theorem 4.4.5.

It is natural to ask whether the absolute value of some eigenvalue actually attains the value $\|A\|$. In general the answer is negative, but it is true for compact self-adjoint operators.
Theorem 4.9.8. If $A$ is a non-zero compact self-adjoint operator on a Hilbert space $H$, then it has an eigenvalue $\lambda$ equal to either $\|A\|$ or $-\|A\|$.

Proof. Let $\{u_n\}$ be a sequence of elements of $H$ such that $\|u_n\| = 1$ for all $n \in \mathbb{N}$ and
$$\|A u_n\| \to \|A\| \quad \text{as } n \to \infty. \tag{4.9.10}$$
Then
$$\begin{aligned}
\big\| A^2 u_n - \|A u_n\|^2 u_n \big\|^2 &= \big( A^2 u_n - \|A u_n\|^2 u_n,\ A^2 u_n - \|A u_n\|^2 u_n \big) \\
&= \|A^2 u_n\|^2 - 2 \|A u_n\|^2 (A^2 u_n, u_n) + \|A u_n\|^4 \|u_n\|^2 \\
&= \|A^2 u_n\|^2 - 2 \|A u_n\|^2 (A u_n, A u_n) + \|A u_n\|^4 \\
&= \|A^2 u_n\|^2 - \|A u_n\|^4 \\
&\le \|A\|^2 \|A u_n\|^2 - \|A u_n\|^4 = \|A u_n\|^2 \big( \|A\|^2 - \|A u_n\|^2 \big).
\end{aligned}$$
Since $\|A u_n\|$ converges to $\|A\|$, we obtain
$$\big\| A^2 u_n - \|A u_n\|^2 u_n \big\| \to 0 \quad \text{as } n \to \infty. \tag{4.9.11}$$
The operator $A^2$, being the product of two compact operators, is also compact. Hence there exists a subsequence $\{u_{p_n}\}$ of $\{u_n\}$ such that $\{A^2 u_{p_n}\}$ converges. Since $\|A\| \neq 0$, the limit can be written in the form $\|A\|^2 v$, $v \neq 0$. For every $n \in \mathbb{N}$ we have
$$\big\| \|A\|^2 v - \|A\|^2 u_{p_n} \big\| \le \big\| \|A\|^2 v - A^2 u_{p_n} \big\| + \big\| A^2 u_{p_n} - \|A u_{p_n}\|^2 u_{p_n} \big\| + \big\| \|A u_{p_n}\|^2 u_{p_n} - \|A\|^2 u_{p_n} \big\|.$$
Thus, by (4.9.10) and (4.9.11), we have
$$\|A\|^2 \|v - u_{p_n}\| \to 0 \quad \text{as } n \to \infty,$$
or $\|v - u_{p_n}\| \to 0$ as $n \to \infty$. This means that the sequence $\{u_{p_n}\}$ converges to $v$, and therefore
$$A^2 v = \|A\|^2 v.$$
The above equation can be written as
$$(A - \|A\| \mathbb{I})(A + \|A\| \mathbb{I}) v = 0.$$
If $w = (A + \|A\| \mathbb{I}) v \neq 0$, then $(A - \|A\| \mathbb{I}) w = 0$, and thus $\|A\|$ is an eigenvalue of $A$. On the other hand, if $w = 0$, then $-\|A\|$ is an eigenvalue of $A$. This completes the proof.

Corollary 4.9.2. If $A$ is a non-zero compact self-adjoint operator on a Hilbert space $H$, then there is a vector $w$ such that $\|w\| = 1$ and
$$|(Aw, w)| = \sup_{\|x\| \le 1} |(Ax, x)|.$$

Proof. Let $w$, $\|w\| = 1$, be an eigenvector corresponding to an eigenvalue $\lambda$ such that $|\lambda| = \|A\|$. Then
$$|(Aw, w)| = |(\lambda w, w)| = |\lambda|\, \|w\|^2 = |\lambda| = \|A\| = \sup_{\|x\| \le 1} |(Ax, x)|,$$
by Theorem 4.4.5.

Remarks. Theorem 4.9.8 guarantees the existence of at least one non-zero eigenvalue, but no more in general. The corollary gives a useful method for finding that eigenvalue by maximizing a certain quadratic expression. The following result is another example of a theorem describing spectral properties of an operator. We will not prove this result; the interested reader can find a proof in [E. Kreyszig, Introductory Functional Analysis with Applications, Wiley, 1978, Theorem 9.2-3].

Theorem 4.9.9. Let $A$ be a bounded self-adjoint operator on a Hilbert space $H$. Define
$$m = \inf_{\|x\| = 1} (Ax, x) \quad \text{and} \quad M = \sup_{\|x\| = 1} (Ax, x).$$
The spectrum of $A$ lies in the closed interval $[m, M]$. Moreover, $m$ and $M$ belong to the spectrum.

Theorem 4.9.10. The set of distinct non-zero eigenvalues $\{\lambda_n\}$ of a self-adjoint compact operator $A$ is either finite or satisfies $\lim_{n \to \infty} \lambda_n = 0$.
Proof. Suppose $A$ has infinitely many distinct eigenvalues $\lambda_n$, $n \in \mathbb{N}$. Let $u_n$, for $n \in \mathbb{N}$, be an eigenvector corresponding to $\lambda_n$ such that $\|u_n\| = 1$. By Theorem 4.9.6, $\{u_n\}$ is an orthonormal system. Moreover, by Theorem 4.8.7, we have
$$0 = \lim_{n \to \infty} \|A u_n\|^2 = \lim_{n \to \infty} (A u_n, A u_n) = \lim_{n \to \infty} (\lambda_n u_n, \lambda_n u_n) = \lim_{n \to \infty} \lambda_n^2 \|u_n\|^2 = \lim_{n \to \infty} \lambda_n^2.$$
This proves the theorem.

Example 4.9.4. We will find the eigenvalues and eigenfunctions of the operator $A$ on $L^2([0, 2\pi])$ defined by
$$(Au)(x) = \int_0^{2\pi} k(x - t)\, u(t)\, dt,$$
where $k$ is a periodic function with period $2\pi$, square integrable on $[0, 2\pi]$. As a trial solution we take
$$u_n(x) = e^{inx}$$
and note that
$$(A u_n)(x) = \int_0^{2\pi} k(x - t)\, e^{int}\, dt = e^{inx} \int_{x - 2\pi}^{x} k(s)\, e^{-ins}\, ds.$$
Thus
$$A u_n = \lambda_n u_n, \quad n \in \mathbb{Z},$$
where, by the periodicity of the integrand,
$$\lambda_n = \int_0^{2\pi} k(s)\, e^{-ins}\, ds.$$
The set of functions $\{u_n\}$, $n \in \mathbb{Z}$, is a complete orthogonal system in $L^2([0, 2\pi])$. Note that $A$ is self-adjoint if $k(x) = \overline{k(-x)}$ for all $x$, but the collection of eigenfunctions is complete even if $A$ is not self-adjoint.

Theorem 4.9.11. Let $\{P_n\}$ be an orthogonal sequence of projection operators on a Hilbert space $H$, and let $\{\lambda_n\}$ be a sequence of numbers such that $\lambda_n \to 0$ as $n \to \infty$. Then

(a) $\sum_{n=1}^{\infty} \lambda_n P_n$ converges;

(b) for each $n \in \mathbb{N}$, $\lambda_n$ is an eigenvalue of the operator $A = \sum_{n=1}^{\infty} \lambda_n P_n$, and the only other possible eigenvalue of $A$ is $0$;
(c) if all $\lambda_n$'s are real, then $A$ is self-adjoint;

(d) if all projections $P_n$ are finite dimensional, then $A$ is compact.

Proof. (a) Since the vectors $\lambda_n P_n x$ are orthogonal, for all $k \le m$ we have
$$\Big\| \sum_{n=k}^{m} \lambda_n P_n x \Big\|^2 = \sum_{n=k}^{m} \|\lambda_n P_n x\|^2 = \sum_{n=k}^{m} |\lambda_n|^2 \|P_n x\|^2.$$
Now, since $\lambda_n \to 0$ as $n \to \infty$, for every $\varepsilon > 0$ we have
$$\Big\| \sum_{n=k}^{m} \lambda_n P_n x \Big\|^2 \le \varepsilon^2 \sum_{n=k}^{m} \|P_n x\|^2 = \varepsilon^2 \Big\| \sum_{n=k}^{m} P_n x \Big\|^2 \le \varepsilon^2 \Big\| \sum_{n=k}^{m} P_n \Big\|^2 \|x\|^2 \tag{4.9.12}$$
for all sufficiently large $k$ and $m$. The sum $\sum_{n=k}^{m} P_n$, being a finite sum of orthogonal projection operators, is a projection operator and its operator norm is $1$. Thus (4.9.12) yields
$$\Big\| \sum_{n=k}^{m} \lambda_n P_n x \Big\| \le \varepsilon \|x\|$$
whenever $k$ and $m$ are sufficiently large. Thus the sequence of partial sums $\sum_{n=1}^{m} \lambda_n P_n$ is a Cauchy sequence, and by Theorem 1.6.5, the series converges to a bounded operator on $H$.

(b) Denote the range of $P_n$ by $\mathscr{R}(P_n)$ and let $n_0 \in \mathbb{N}$. If $u \in \mathscr{R}(P_{n_0})$, then $P_{n_0} u = u$ and $P_n u = 0$ for all $n \neq n_0$, because the $P_n$ are orthogonal. Thus $Au = \lambda_{n_0} u$, which shows that $\lambda_{n_0}$ is an eigenvalue of $A$. To prove that there are no other non-zero eigenvalues, suppose $u$ is an eigenvector corresponding to an eigenvalue $\lambda$. Set $v_n = P_n u$, $n = 1, 2, \ldots$, and let $w = Qu$, where $Q$ is the projection onto the orthogonal complement of $\mathscr{R}(A)$. Then
$$u = \sum_{n=1}^{\infty} v_n + w, \tag{4.9.13}$$
with $w \perp \mathscr{R}(P_n)$ for all $n \in \mathbb{N}$. Clearly,
$$Au = \sum_{n=1}^{\infty} \lambda_n v_n,$$
since $P_n w = 0$ and $A$ is continuous. Consequently, the eigenvalue equation $Au = \lambda u$ has the form
$$\sum_{n=1}^{\infty} \lambda_n v_n = \lambda \sum_{n=1}^{\infty} v_n + \lambda w,$$
or
$$\sum_{n=1}^{\infty} (\lambda - \lambda_n) v_n + \lambda w = 0. \tag{4.9.14}$$
Since all vectors in (4.9.14) are orthogonal, the sum vanishes only if every term vanishes. Hence $\lambda w = 0$, and for every $n \in \mathbb{N}$ either $\lambda = \lambda_n$ or $v_n = 0$. Finally, if $u$ in (4.9.13) is a non-zero eigenvector, then either $w \neq 0$ or $v_k \neq 0$ for some $k \in \mathbb{N}$. Therefore $\lambda = 0$ or $\lambda = \lambda_k$ for some $k \in \mathbb{N}$, by (4.9.14). This proves the assertion.

(c) Suppose all $\lambda_n$'s are real. Since orthogonal projections are self-adjoint operators, for any $x, y \in H$ we have
$$(Ax, y) = \sum_{n=1}^{\infty} (\lambda_n P_n x, y) = \sum_{n=1}^{\infty} \lambda_n (P_n x, y) = \sum_{n=1}^{\infty} \lambda_n (x, P_n y) = \sum_{n=1}^{\infty} (x, \lambda_n P_n y) = (x, Ay).$$

(d) $A$ is the limit of a uniformly convergent sequence of finite dimensional, hence compact, operators; therefore it is compact.

Definition 4.9.4 (Approximate Eigenvalue). Let $T$ be an operator on a Hilbert space $H$. A scalar $\lambda$ is called an approximate eigenvalue of $T$ if there exists a sequence of vectors $\{x_n\}$ such that $\|x_n\| = 1$ for all $n \in \mathbb{N}$ and $\|T x_n - \lambda x_n\| \to 0$ as $n \to \infty$.

Obviously, every eigenvalue is an approximate eigenvalue.

Example 4.9.5. Let $\{e_n\}$ be a complete orthonormal sequence in a Hilbert space $H$. Let $\{\lambda_n\}$ be a strictly decreasing sequence of scalars convergent to some $\lambda$. Define an operator on $H$ by
$$Tx = \sum_{n=1}^{\infty} \lambda_n (x, e_n) e_n.$$
It is easy to see that every $\lambda_n$ is an eigenvalue of $T$, but $\lambda$ is not. On the other hand,
$$\|T e_n - \lambda e_n\| = |\lambda_n - \lambda| \to 0$$
as $n \to \infty$. Thus $\lambda$ is an approximate eigenvalue of $T$. Note that the same is true if we just assume that $\lambda_n \to \lambda$ and $\lambda_n \neq \lambda$ for all $n \in \mathbb{N}$. For further properties of approximate eigenvalues see 4.13. Exercises at the end of this chapter.
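The situation in Example 4.9.5 can be imitated with a finite truncation of $T$. The sketch below is an illustration with the particular choices $\lambda = 2$ and $\lambda_n = \lambda + 1/n$ (assumptions made for this example): each $\lambda_n$ is an exact eigenvalue of the truncated operator, while the "late" basis vectors nearly satisfy the eigenvalue equation for $\lambda$ itself.

```python
import numpy as np

lam = 2.0
N = 500
lam_n = lam + 1.0 / np.arange(1, N + 1)   # strictly decreasing, -> lam

# Truncation of T in the orthonormal basis {e_n}: a diagonal matrix
T = np.diag(lam_n)

# ||T e_n - lam e_n|| = |lam_n - lam| = 1/n, so lam is approximated
e_N = np.zeros(N)
e_N[-1] = 1.0                              # the basis vector e_N
residual = np.linalg.norm(T @ e_N - lam * e_N)
print(residual)   # 1/N = 0.002
```

In the truncation $\lambda$ is still not an eigenvalue (every diagonal entry differs from $\lambda$), yet the residual can be made as small as desired by taking $N$ large, which is exactly the content of the definition.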
4.10. Spectral Decomposition

Let $H$ be a finite-dimensional Hilbert space, say $H = \mathbb{C}^N$. It is known from linear algebra that the eigenvectors of a self-adjoint operator on $H$ form an orthogonal basis of $H$. The following theorems generalize this result to infinite dimensional spaces.

Theorem 4.10.1 (Hilbert-Schmidt Theorem). For every self-adjoint compact operator $A$ on an infinite dimensional Hilbert space $H$ there exists an orthonormal system of eigenvectors $\{u_n\}$ corresponding to non-zero eigenvalues $\{\lambda_n\}$ such that every element $x \in H$ has a unique representation in the form
$$x = \sum_{n=1}^{\infty} \alpha_n u_n + v, \tag{4.10.1}$$
where $\alpha_n \in \mathbb{C}$ and $v$ satisfies the equation $Av = 0$.
Proof. By Theorem 4.9.8 and Corollary 4.9.2 there exists an eigenvalue $\lambda_1$ of $A$ such that
$$|\lambda_1| = \sup_{\|x\| \le 1} |(Ax, x)|.$$
Let $u_1$ be a normalized eigenvector corresponding to $\lambda_1$. We set
$$Q_1 = \{x \in H : x \perp u_1\},$$
i.e., $Q_1$ is the orthogonal complement of the set $\{u_1\}$. Thus $Q_1$ is a closed linear subspace of $H$. If $x \in Q_1$, then
$$(Ax, u_1) = (x, A u_1) = \lambda_1 (x, u_1) = 0,$$
which means that $x \in Q_1$ implies $Ax \in Q_1$. Therefore $A$ maps the Hilbert space $Q_1$ into itself. We can again apply Theorem 4.9.8 and Corollary 4.9.2 with $Q_1$ in place of $H$. This gives an eigenvalue $\lambda_2$ such that
$$|\lambda_2| = \sup \{ |(Ax, x)| : \|x\| \le 1,\ x \in Q_1 \}.$$
Let $u_2$ be a normalized eigenvector corresponding to $\lambda_2$. Clearly $u_1 \perp u_2$. Next we set
$$Q_2 = \{x \in Q_1 : x \perp u_2\}$$
and repeat the above argument. Having eigenvalues $\lambda_1, \ldots, \lambda_n$ and the corresponding normalized eigenvectors $u_1, \ldots, u_n$, we define
$$Q_n = \{x \in Q_{n-1} : x \perp u_n\}$$
and choose an eigenvalue $\lambda_{n+1}$ such that
$$|\lambda_{n+1}| = \sup \{ |(Ax, x)| : \|x\| \le 1,\ x \in Q_n \}. \tag{4.10.2}$$
For $u_{n+1}$ we choose a normalized eigenvector corresponding to $\lambda_{n+1}$.

This procedure can terminate after a finite number of steps. Namely, it can happen that there is a positive integer $k$ such that $(Ax, x) = 0$ for all $x \in Q_k$. Then every element $x$ of $H$ has a unique representation
$$x = \sum_{n=1}^{k} \alpha_n u_n + v, \quad v \in Q_k,$$
where $Av = 0$, and the theorem is proved in this case.

Now suppose that the described procedure yields an infinite sequence of eigenvalues $\{\lambda_n\}$ and eigenvectors $\{u_n\}$. Then $\{u_n\}$, as an orthonormal sequence, converges weakly to $0$. Consequently, by Theorem 4.8.8, the sequence $\{A u_n\}$ converges strongly to $0$. Hence
$$|\lambda_n| = \|\lambda_n u_n\| = \|A u_n\| \to 0.$$
Denote by $S$ the closed subspace spanned by the vectors $\{u_n\}$. By the Projection Theorem (Theorem 3.10.4), every $x \in H$ has a unique decomposition $x = u + v$, or
$$x = \sum_{n=1}^{\infty} \alpha_n u_n + v,$$
where $v \in S^\perp$. It remains to prove that $Av = 0$ for all $v \in S^\perp$. Let $v \in S^\perp$, $v \neq 0$. Define $w = v / \|v\|$. Then
$$(Av, v) = \|v\|^2 (Aw, w).$$
Since $w \in S^\perp \subset Q_n$ for every $n \in \mathbb{N}$, by (4.10.2) we have
$$|(Av, v)| = \|v\|^2 |(Aw, w)| \le \|v\|^2 \sup \{ |(Ax, x)| : \|x\| \le 1,\ x \in Q_n \} = \|v\|^2 |\lambda_{n+1}| \to 0.$$
This implies $(Av, v) = 0$ for every $v \in S^\perp$. Therefore, by Theorem 4.4.5, the norm of $A$ restricted to $S^\perp$ is $0$, and thus $Av = 0$ for all $v \in S^\perp$. This completes the proof.
Theorem 4.10.2 (Spectral Theorem for Self-Adjoint Compact Operators). Let $A$ be a self-adjoint compact operator on an infinite dimensional Hilbert space $H$. Then there exists in $H$ a complete orthonormal system (an orthonormal basis) $\{v_n\}$ consisting of eigenvectors of $A$. Moreover, for every $x \in H$,
$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, v_n) v_n, \tag{4.10.3}$$
where $\lambda_n$ is the eigenvalue corresponding to $v_n$.

Proof. Most of this theorem is already contained in Theorem 4.10.1. To obtain a complete orthonormal system $\{v_n\}$ we need to add to the system $\{u_n\}$ defined in the proof of Theorem 4.10.1 an arbitrary orthonormal basis of $S^\perp$. The eigenvalues corresponding to those vectors from $S^\perp$ all equal zero. Equality (4.10.3) follows from the continuity of $A$.
Theorem 4.10.3. For any two commuting self-adjoint compact operators $A$ and $B$ on a Hilbert space $H$, there exists a complete orthonormal system of common eigenvectors.

Proof. Let $\lambda$ be an eigenvalue of $A$ and let $S$ be the corresponding eigenspace. For any $x \in S$ we have
$$A(Bx) = B(Ax) = B(\lambda x) = \lambda Bx.$$
This means that $Bx$ is an eigenvector of $A$ corresponding to $\lambda$, provided $Bx \neq 0$. In any case, $Bx \in S$ and hence $B$ maps $S$ into itself. Since $B$ is a self-adjoint compact operator, by Theorem 4.10.2, $S$ has an orthonormal basis consisting of eigenvectors of $B$, but these vectors are also eigenvectors of $A$, because they belong to $S$. If we repeat the same with every eigenspace of $A$, then the union of all these eigenvectors is an orthonormal basis of $H$. This proves the theorem.

Theorem 4.10.4. Let $A$ be a self-adjoint compact operator on a Hilbert space $H$ with a complete orthonormal system of eigenvectors $\{v_n\}$ corresponding to eigenvalues $\{\lambda_n\}$. Let $P_n$ be the projection operator onto the space spanned by $v_n$. Then, for all $x \in H$,
$$Ax = \sum_{n=1}^{\infty} \lambda_n P_n x \tag{4.10.4}$$
and
$$A = \sum_{n=1}^{\infty} \lambda_n P_n. \tag{4.10.5}$$
Proof. From the Spectral Theorem (Theorem 4.10.2), we have
$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, v_n) v_n. \tag{4.10.6}$$
For every $k \in \mathbb{N}$, the projection operator $P_k$ onto the one dimensional subspace $S_k$ spanned by $v_k$ is given by
$$P_k x = (x, v_k) v_k.$$
Indeed, for every $x \in H$ we have
$$x = (x, v_k) v_k + \sum_{n \neq k} (x, v_n) v_n,$$
where $(x, v_k) v_k \in S_k$ and $\sum_{n \neq k} (x, v_n) v_n \perp S_k$. Thus $(x, v_k) v_k$ is the projection of $x$ onto $S_k$. Now (4.10.6) can be written as
$$Ax = \sum_{n=1}^{\infty} \lambda_n P_n x,$$
which proves (4.10.4). Hence, for all $x \in H$,
$$\left( \sum_{n=1}^{\infty} \lambda_n P_n \right) x = Ax,$$
which proves (4.10.5).

Note that the convergence of $\sum \lambda_n P_n$ is guaranteed by Theorem 4.9.11 and is quite different from the convergence of $\sum \lambda_n P_n x$.

Remarks. 1. Theorem 4.10.4 can be considered as another version of the Spectral Theorem. This version is important in the sense that it can be extended to non-compact operators. It is also useful because it leads to an elegant expression for powers and more general functions of an operator.

2. It follows from Theorem 4.10.4 that a self-adjoint compact operator is an infinite sum of very simple operators. One dimensional projection operators are not only the simplest self-adjoint compact operators, but they are also the fundamental ones, because any self-adjoint compact operator is a (possibly infinite) linear combination of them.
Let $A$, $\lambda_n$, and $P_n$ be as in Theorem 4.10.4. Then
$$A^2 x = \sum_{n=1}^{\infty} \lambda_n^2 P_n x,$$
because $A P_n x = \lambda_n P_n x$ for all $x \in H$. Similarly, for any $k \in \mathbb{N}$, we get
$$A^k = \sum_{n=1}^{\infty} \lambda_n^k P_n. \tag{4.10.7}$$
More generally, for any polynomial $p(t) = a_n t^n + \cdots + a_1 t$, we have
$$p(A) = \sum_{n=1}^{\infty} p(\lambda_n) P_n.$$
The constant term in $p$ must be zero, because otherwise the sequence $\{p(\lambda_n)\}$ would not converge to zero. In order to deal with polynomials with a non-zero constant term $a_0$, we have to add $a_0 \mathbb{I}$ to the series. Note that in such a case $p(A)$ is not a compact operator.

The above result can be generalized in the following way.

Definition 4.10.1 (Function of an Operator). Let $f$ be a real valued function on $\mathbb{R}$ such that $f(\lambda) \to 0$ as $\lambda \to 0$. For a self-adjoint compact operator $A = \sum_{n=1}^{\infty} \lambda_n P_n$ we define
$$f(A) = \sum_{n=1}^{\infty} f(\lambda_n) P_n. \tag{4.10.8}$$
Theorem 4.9.11 ensures the convergence of the series in (4.10.8), and that $f(A)$ is self-adjoint and compact.
Example 4.10.1. Let $A = \sum_{n=1}^{\infty} \lambda_n P_n$ be a self-adjoint compact operator such that all $\lambda_n$'s are non-negative. For any $\alpha > 0$ we can define $A^\alpha$ by
$$A^\alpha x = \sum_{n=1}^{\infty} \lambda_n^\alpha P_n x.$$
Note that in the case $\alpha = \frac{1}{2}$ the above definition agrees with Definition 4.6.2. Indeed, by (4.10.7), we have
$$\big( \sqrt{A} \big)^2 = \sum_{n=1}^{\infty} \big( \sqrt{\lambda_n} \big)^2 P_n = \sum_{n=1}^{\infty} \lambda_n P_n = A,$$
because all $\lambda_n$'s are non-negative.
Example 4.10.2. Let $A = \sum_{n=1}^{\infty} \lambda_n P_n$ be a self-adjoint compact operator. We can define the sine of $A$ by
$$\sin A = \sum_{n=1}^{\infty} (\sin \lambda_n) P_n.$$

The condition $f(\lambda) \to 0$ as $\lambda \to 0$ in Definition 4.10.1 can be replaced by boundedness of $f$ in a neighborhood of the origin. Indeed, if $A = \sum_{n=1}^{\infty} \lambda_n P_n$ and $P_n x = (x, v_n) v_n$, then for any $x \in H$ we have
$$f(A) x = \sum_{n=1}^{\infty} f(\lambda_n) (x, v_n) v_n,$$
where convergence of the series is justified by Theorem 3.8.3, because
$$|f(\lambda_n)(x, v_n)|^2 \le M |(x, v_n)|^2$$
for some constant $M$, and hence $\{f(\lambda_n)(x, v_n)\} \in l^2$. Clearly, in this case we cannot expect $f(A)$ to be a compact operator.
Theorem 4.10.5. If the eigenvectors $\{u_n\}$ of a self-adjoint operator $T$ on a Hilbert space $H$ form a complete orthonormal system in $H$ and all eigenvalues are positive (or non-negative), then $T$ is strictly positive (or positive).

Proof. Suppose $\{u_n\}$ is a complete orthonormal system of eigenvectors of $T$ corresponding to eigenvalues $\{\lambda_n\}$. For any non-zero vector $u = \sum_{n=1}^{\infty} a_n u_n \in H$ we have
$$(Tu, u) = \left( \sum_{n=1}^{\infty} \lambda_n a_n u_n,\ \sum_{m=1}^{\infty} a_m u_m \right) = \sum_{n=1}^{\infty} \lambda_n a_n \overline{a_n} = \sum_{n=1}^{\infty} \lambda_n |a_n|^2 \ge 0$$
if all eigenvalues are non-negative. If all $\lambda_n$'s are positive, then the last inequality becomes strict, since $a_n \neq 0$ for at least one $n$. This completes the proof.
4.11. The Fourier Transform

In this section we introduce the Fourier transform in $L^2(\mathbb{R})$ and discuss its basic properties. The definition of the transform in $L^2(\mathbb{R})$ is not trivial. The integral
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx$$
cannot be used as a definition of the Fourier transform in $L^2(\mathbb{R})$ because not all functions in $L^2(\mathbb{R})$ are integrable. It is, however, possible to extend the Fourier transform from $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. In the first part of this section we discuss properties of the Fourier transform in $L^1(\mathbb{R})$. Then we show that the extension onto $L^2(\mathbb{R})$ is possible and study properties of that extension.

Let $f$ be an integrable function on $\mathbb{R}$. Consider the integral
$$\int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx, \quad k \in \mathbb{R}. \tag{4.11.1}$$
Since the function $g(x) = e^{-ikx}$ is continuous and bounded, the product $e^{-ikx} f(x)$ is a locally integrable function for any $k \in \mathbb{R}$ (see Theorem 2.9.2). Moreover, since $|e^{-ikx}| \le 1$ for all $k, x \in \mathbb{R}$, we have
$$|e^{-ikx} f(x)| \le |f(x)|,$$
and thus, by Theorem 2.9.3, the integral (4.11.1) exists for all $k \in \mathbb{R}$.

Definition 4.11.1 (Fourier Transform in $L^1(\mathbb{R})$). Let $f \in L^1(\mathbb{R})$. The function $\hat{f}$ defined by
$$\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx \tag{4.11.2}$$
is called the Fourier transform of $f$.

In some books the Fourier transform is defined without the factor $1/\sqrt{2\pi}$ in the integral. Another variation is the definition without the "$-$" sign in the exponent. These details do not change the theory of Fourier transforms at all. Instead of "$\hat{f}$" the notation "$\mathscr{F}\{f(x)\}$" can also be used. The latter is especially convenient if instead of a letter "$f$" or "$g$" we want to use an expression describing a function, for example $\mathscr{F}\{e^{-x^2}\}$. We will use both symbols freely.
Example 4.11.1. (a) Let $a > 0$. Then […] The proof follows easily from Definition 4.11.1.
Theorem 4.11.6. If $f$ is a continuous piecewise differentiable function, $f, f' \in L^1(\mathbb{R})$, and $\lim_{|x| \to \infty} f(x) = 0$, then $\mathscr{F}\{f'\} = ik\, \mathscr{F}\{f\}$.

Proof. Simple integration by parts gives
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f'(x)\, e^{-ikx}\, dx = \frac{1}{\sqrt{2\pi}} \Big[ f(x)\, e^{-ikx} \Big]_{-\infty}^{\infty} + \frac{ik}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx = ik\, \hat{f}(k).$$

Corollary 4.11.1. If $f$ is a continuous piecewise $n$-times differentiable function, $f, f', \ldots, f^{(n)} \in L^1(\mathbb{R})$, and $\lim_{|x| \to \infty} f^{(k)}(x) = 0$ for $k = 0, \ldots, n-1$, then $\mathscr{F}\{f^{(n)}\} = (ik)^n\, \mathscr{F}\{f\}$.

Because of our definition of the Fourier transform it is convenient to redefine the convolution of two functions $f, g \in L^1(\mathbb{R})$ as follows:
$$(f * g)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x - u)\, g(u)\, du.$$
The main reason is the simplicity of the formula in the next theorem.

Theorem 4.11.7 (Convolution Theorem). Let $f, g \in L^1(\mathbb{R})$. Then $\mathscr{F}\{f * g\} = \mathscr{F}\{f\}\, \mathscr{F}\{g\}$.
Proof. Let $f, g \in L^1(\mathbb{R})$ and $h = f * g$. Then $h \in L^1(\mathbb{R})$, by Theorem 2.15.1, and we have
$$\begin{aligned}
\hat{h}(k) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} h(x)\, e^{-ikx}\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx} \int_{-\infty}^{\infty} f(x - u)\, g(u)\, du\, dx \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} g(u) \int_{-\infty}^{\infty} e^{-ikx} f(x - u)\, dx\, du \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} g(u) \int_{-\infty}^{\infty} e^{-ik(x + u)} f(x)\, dx\, du \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(u)\, e^{-iku}\, du \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx = \hat{g}(k)\, \hat{f}(k).
\end{aligned}$$
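A discrete analogue of the convolution theorem holds for the DFT, with circular convolution in place of the integral, and gives a quick numerical sanity check (a sketch with random data; the discrete transform carries no $1/\sqrt{2\pi}$ factor, so no extra constant appears):

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

# Circular convolution: h[x] = sum_u f[x - u] g[u], indices taken mod 64
h = np.array([sum(f[(x - u) % 64] * g[u] for u in range(64))
              for x in range(64)])

# The DFT turns circular convolution into pointwise multiplication
lhs = np.fft.fft(h)
rhs = np.fft.fft(f) * np.fft.fft(g)
print(np.allclose(lhs, rhs))   # True
```

This identity is also the standard way fast convolution is implemented in practice: transform, multiply pointwise, transform back.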
We will now discuss the extension of the Fourier transform onto $L^2(\mathbb{R})$. In the following theorem, and in the remaining part of this section, $\|\cdot\|_2$ denotes the norm in $L^2(\mathbb{R})$, i.e.,
$$\|f\|_2 = \sqrt{ \int_{-\infty}^{\infty} |f(x)|^2\, dx }.$$

Theorem 4.11.8. Let $f$ be a continuous function on $\mathbb{R}$ vanishing outside a bounded interval. Then $\hat{f} \in L^2(\mathbb{R})$ and
$$\|\hat{f}\|_2 = \|f\|_2.$$

Proof. Suppose first that $f$ vanishes outside the interval $[-\pi, \pi]$. Using Parseval's formula for the orthonormal sequence of functions on $[-\pi, \pi]$
$$\phi_n(x) = \frac{1}{\sqrt{2\pi}}\, e^{-inx}, \quad n = 0, \pm 1, \pm 2, \ldots,$$
we get
$$\|f\|_2^2 = \sum_{n=-\infty}^{\infty} |(f, \phi_n)|^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2.$$
Since the above equality holds also for $g(x) = e^{-i\xi x} f(x)$ in place of $f(x)$, and $\|f\|_2 = \|g\|_2$, we obtain
$$\|f\|_2^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n + \xi)|^2.$$
Integration of both sides with respect to $\xi$ from $0$ to $1$ yields
$$\|f\|_2^2 = \int_0^1 \sum_{n=-\infty}^{\infty} |\hat{f}(n + \xi)|^2\, d\xi = \int_{-\infty}^{\infty} |\hat{f}(k)|^2\, dk = \|\hat{f}\|_2^2.$$
If $f$ does not vanish outside $[-\pi, \pi]$, then we take a positive number $\lambda$ for which the function $g(x) = f(\lambda x)$ vanishes outside $[-\pi, \pi]$. Then
$$\hat{g}(k) = \frac{1}{\lambda}\, \hat{f}\!\left( \frac{k}{\lambda} \right),$$
and thus
$$\|f\|_2^2 = \lambda \|g\|_2^2 = \lambda \|\hat{g}\|_2^2 = \|\hat{f}\|_2^2.$$
The proof is complete.

The space of all continuous functions on $\mathbb{R}$ with compact support is dense in $L^2(\mathbb{R})$. Theorem 4.11.8 shows that the Fourier transform is a continuous mapping from that space into $L^2(\mathbb{R})$. Since the mapping is linear, it has a unique extension to a linear mapping from $L^2(\mathbb{R})$ into itself. This extension will be called the Fourier transform on $L^2(\mathbb{R})$.

Definition 4.11.2 (Fourier Transform in $L^2(\mathbb{R})$). Let $f \in L^2(\mathbb{R})$ and let $\{\phi_n\}$ be a sequence of continuous functions with compact support convergent to $f$ in $L^2(\mathbb{R})$, i.e., $\|f - \phi_n\|_2 \to 0$. The Fourier transform of $f$ is defined by
$$\hat{f} = \lim_{n \to \infty} \hat{\phi}_n, \tag{4.11.4}$$
where the limit is with respect to the norm in $L^2(\mathbb{R})$.

Theorem 4.11.8 guarantees that the limit exists and is independent of the particular sequence approximating $f$. It is important to remember that convergence in $L^2(\mathbb{R})$ does not imply pointwise convergence, and therefore the Fourier transform of a square integrable function is not defined at a point, unlike the Fourier transform of an integrable function. We can say that the Fourier transform of a square integrable function is defined almost everywhere. For this reason we cannot say that, if $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then the Fourier transform defined by (4.11.2) and the one defined by (4.11.4) are equal. To be precise, we should say that the function defined by (4.11.2) belongs to the equivalence class of square integrable functions defined by (4.11.4). In spite of this difference, we will use the same symbol to denote both transforms. It will not cause any misunderstanding.

The following theorem is an immediate consequence of Definition 4.11.2 and Theorem 4.11.8.

Theorem 4.11.9 (Parseval's Relation). If $f \in L^2(\mathbb{R})$, then
$$\|f\|_2 = \|\hat{f}\|_2.$$

Remark. In physical problems, the quantity $\|f\|_2^2$ is a measure of energy, and $|\hat{f}|^2$ represents the power spectrum of $f$.
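Parseval's relation has a direct discrete counterpart: for the unitary (`norm="ortho"`) DFT, the $\ell^2$ norm of a vector equals that of its transform. A quick numerical sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.standard_normal(1024)

# Unitary normalization (1/sqrt(N) in both directions), analogous to the
# symmetric 1/sqrt(2*pi) convention used in the text
f_hat = np.fft.fft(f, norm="ortho")

print(np.linalg.norm(f), np.linalg.norm(f_hat))  # equal norms
```

With the default (non-unitary) normalization of `np.fft.fft`, the two norms would instead differ by a factor of $\sqrt{N}$, which mirrors the remark above about conventions without the $1/\sqrt{2\pi}$ factor.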
Theorem 4.11.10. Let $f \in L^2(\mathbb{R})$. Then
$$\hat{f}(k) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{-ikx} f(x)\, dx, \tag{4.11.5}$$
where the convergence is with respect to the norm in $L^2(\mathbb{R})$.

Proof. For $n = 1, 2, 3, \ldots$, define
$$f_n(x) = \begin{cases} f(x) & \text{if } |x| < n, \\ 0 & \text{if } |x| \ge n. \end{cases}$$
Then $\|f - f_n\|_2 \to 0$, and thus $\|\hat{f} - \hat{f}_n\|_2 \to 0$ as $n \to \infty$.
Theorem 4.11.11. If $f, g \in L^2(\mathbb{R})$, then
$$\int_{-\infty}^{\infty} \hat{f}(x)\, g(x)\, dx = \int_{-\infty}^{\infty} f(x)\, \hat{g}(x)\, dx. \tag{4.11.6}$$

Proof. For $n = 1, 2, 3, \ldots$, define
$$f_n(x) = \begin{cases} f(x) & \text{if } |x| < n, \\ 0 & \text{if } |x| \ge n, \end{cases}
\qquad
g_n(x) = \begin{cases} g(x) & \text{if } |x| < n, \\ 0 & \text{if } |x| \ge n. \end{cases}$$
The function $e^{-ix\xi} g_n(x) f_m(\xi)$ is integrable over $\mathbb{R}^2$, and thus the Fubini theorem can be applied. Consequently
$$\int_{-\infty}^{\infty} \hat{f}_m(x)\, g_n(x)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-ix\xi} f_m(\xi)\, d\xi\, g_n(x)\, dx = \int_{-\infty}^{\infty} f_m(\xi)\, \hat{g}_n(\xi)\, d\xi.$$
Since $\|g - g_n\|_2 \to 0$ and $\|\hat{g} - \hat{g}_n\|_2 \to 0$, by letting $n \to \infty$ we obtain
$$\int_{-\infty}^{\infty} \hat{f}_m(x)\, g(x)\, dx = \int_{-\infty}^{\infty} f_m(x)\, \hat{g}(x)\, dx,$$
by the continuity of the inner product. For the same reason, by letting $m \to \infty$, we get
$$\int_{-\infty}^{\infty} \hat{f}(x)\, g(x)\, dx = \int_{-\infty}^{\infty} f(x)\, \hat{g}(x)\, dx,$$
completing the proof.

The following technical lemma will be useful in the proof of the important inversion theorem for the Fourier transform in $L^2(\mathbb{R})$.
completing the proof. The following technical lemma will be useful in the proof of the important inversion theorem for the Fourier transform in L 2 (R). Lemma 4.11.1.
Proof.
Let f
E
2
L (R) and let g
= fX
A
Then f =g.
From Theorems 4.11.9 and 4.11.11, and the equality g =
Cf. g)= cJ, g)= cJ,}) =IIi II~= IIIII ~-
X
f we obtain (4.11.7)
Hence also (f,
g)= llfll~.
(4.11.8)
Finally, by Parseval's equality,
llill~= llgll~= 11111~= 11!11~-
(4.11.9)
201
Linear Operators on Hilbert Spaces
Using (4.11. 7 -4.11.9) we get
This
II!- ill~= U- i.f- g)= 11111~- C!. §)- C!. §)+ 11§11~= o. shows that f = l
Theorem 4.11.12 (Inversion of Fourier Transforms in $L^2(\mathbb{R})$). Let $f \in L^2(\mathbb{R})$. Then
$$f(x) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx}\, \hat{f}(k)\, dk,$$
where the convergence is with respect to the norm in $L^2(\mathbb{R})$.

Proof. Let $f \in L^2(\mathbb{R})$. If $g = \overline{\hat{f}}$, then, by Lemma 4.11.1,
$$f(x) = \overline{\hat{g}}(x) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \overline{\int_{-n}^{n} e^{-ikx}\, g(k)\, dk} = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx}\, \overline{g(k)}\, dk = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx}\, \hat{f}(k)\, dk.$$

Corollary 4.11.2. If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then the equality
$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx}\, \hat{f}(k)\, dk \tag{4.11.10}$$
holds almost everywhere in $\mathbb{R}$.

The transform defined by (4.11.10) is called the inverse Fourier transform. One of the main reasons for introducing the factor $1/\sqrt{2\pi}$ in the definition of the Fourier transform is the symmetry of the transform and its inverse:
$$\mathscr{F}\{f(x)\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx, \qquad \mathscr{F}^{-1}\{f(k)\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} f(k)\, dk.$$
Theorem 4.11.13 (General Parseval's Relation). If $f, g \in L^2(\mathbb{R})$, then
$$\int_{-\infty}^{\infty} f(x)\, \overline{g(x)}\, dx = \int_{-\infty}^{\infty} \hat{f}(k)\, \overline{\hat{g}(k)}\, dk.$$

Proof. The polarization identity
$$(f, g) = \tfrac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + i \|f + ig\|^2 - i \|f - ig\|^2 \right)$$
implies that every isometry preserves the inner product. Since the Fourier transform is an isometry on $L^2(\mathbb{R})$, we have $(f, g) = (\hat{f}, \hat{g})$.

The following theorem summarizes the results of this section. It is known as the Plancherel Theorem.
Theorem 4.11.14 (Plancherel Theorem). For every $f \in L^2(\mathbb{R})$ there exists $\hat{f} \in L^2(\mathbb{R})$ such that:

(a) If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $\hat{f}(k) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx$.

(b) $\Big\| \hat{f}(k) - \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-n}^{n} e^{-ikx} f(x)\, dx \Big\|_2 \to 0$ and $\Big\| f(x) - \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-n}^{n} e^{ikx} \hat{f}(k)\, dk \Big\|_2 \to 0$ as $n \to \infty$.

(c) $\|f\|_2 = \|\hat{f}\|_2$.

(d) The mapping $f \mapsto \hat{f}$ is a Hilbert space isomorphism of $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$.

Proof. The only part of this theorem which remains to be proved is the fact that the Fourier transform is "onto". Let $f \in L^2(\mathbb{R})$ and define
$$h = \overline{f} \quad \text{and} \quad g = \overline{\hat{h}}.$$
Then, by Lemma 4.11.1 applied to $h$,
$$\overline{f} = h = \overline{\hat{g}},$$
and hence $f = \hat{g}$. This shows that every square integrable function is the Fourier transform of a square integrable function.

Theorem 4.11.15. The Fourier transform is a unitary operator on $L^2(\mathbb{R})$.

Proof. First note that
$$\overline{\mathscr{F}\{\overline{g}\}(k)} = \overline{ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx}\, \overline{g(x)}\, dx } = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx}\, g(x)\, dx = \mathscr{F}^{-1}\{g\}(k).$$
Now, using Theorem 4.11.11, we obtain
$$(\mathscr{F}\{f\}, g) = \int_{-\infty}^{\infty} \mathscr{F}\{f\}(x)\, \overline{g(x)}\, dx = \int_{-\infty}^{\infty} f(x)\, \mathscr{F}\{\overline{g}\}(x)\, dx = \int_{-\infty}^{\infty} f(x)\, \overline{\mathscr{F}^{-1}\{g\}(x)}\, dx = (f, \mathscr{F}^{-1}\{g\}).$$
This shows that $\mathscr{F}^{-1} = \mathscr{F}^*$, and thus $\mathscr{F}$ is unitary.
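The discrete counterpart of Theorem 4.11.15 is that the $N \times N$ DFT matrix with entries $e^{-2\pi i m n / N} / \sqrt{N}$ is unitary, so its conjugate transpose is its inverse. A short numerical check:

```python
import numpy as np

N = 8
m, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F = np.exp(-2j * np.pi * m * n / N) / np.sqrt(N)   # unitary DFT matrix

# F* F = I: the adjoint equals the inverse
print(np.allclose(F.conj().T @ F, np.eye(N)))      # True
```

Applying `F` to a vector agrees with `np.fft.fft(x, norm="ortho")`, which is how the unitary convention is exposed in NumPy.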
The Fourier transform can be defined for functions in $L^1(\mathbb{R}^N)$ by
$$\hat{f}(\mathbf{k}) = \frac{1}{(2\pi)^{N/2}} \int_{\mathbb{R}^N} e^{-i \mathbf{k} \cdot \mathbf{x}} f(\mathbf{x})\, d\mathbf{x},$$
where $\mathbf{k} = (k_1, \ldots, k_N)$, $\mathbf{x} = (x_1, \ldots, x_N)$, and $\mathbf{k} \cdot \mathbf{x} = k_1 x_1 + \cdots + k_N x_N$. The theory of the Fourier transform in $L^1(\mathbb{R}^N)$ is similar to the one dimensional case. Moreover, the extension to $L^2(\mathbb{R}^N)$ is possible and it has similar properties, including the Inversion Theorem and the Plancherel Theorem.
4.12. Unbounded Operators

Boundedness of an operator was an essential assumption in almost every theorem proved in this chapter, and the methods used were developed with boundedness or continuity in mind. However, in the most important applications of the theory of Hilbert spaces we often have to deal with operators which are not bounded. In this section we briefly discuss some basic problems, concepts, and methods in the theory of unbounded operators.

An operator $A$ defined in a Hilbert space $H$, i.e., with $\mathscr{D}(A) \subset H$, is called unbounded if it is not bounded. Therefore, to show that an operator $A$ is unbounded it suffices to find a sequence of elements $x_n \in H$ such that $\|x_n\| \le M$ (for some $M$ and all $n \in \mathbb{N}$) and $\|A x_n\| \to \infty$. Since for linear operators boundedness is equivalent to continuity, unboundedness is equivalent to discontinuity (at every point). Consequently, we can show that an operator $A$ is unbounded by finding a sequence $\{x_n\}$ convergent to $0$ such that the sequence $\{A x_n\}$ does not converge to $0$.

One of the most important unbounded operators is the differential operator, see Example 4.2.3. Other important unbounded operators arise in quantum mechanics and will be discussed in Chapter 7. In physical applications it is natural to assume that all eigenvalues are real. For this reason self-adjoint operators are of special interest.

It will be convenient to adopt the following convention: when we say "$A$ is an operator on a Hilbert space $H$" we mean that the domain of $A$ is the whole space $H$, and when we say "$A$ is an operator in a Hilbert space $H$" we mean that the domain of $A$ is a subset of $H$.

If the domain of a bounded operator $A$ is a proper subspace of a Hilbert space $H$, then $A$ can be extended to a bounded operator defined on the entire space $H$. More precisely, there exists a bounded operator $B$ defined on $H$, $\mathscr{D}(B) = H$, such that $Ax = Bx$ for every $x \in \mathscr{D}(A)$. Moreover, we can always find $B$ such that $\|B\| = \|A\|$; see 4.13. Exercises, (2). We may thus
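The criterion above can be made concrete for the differentiation operator $D = d/dt$ on $L^2([0, 2\pi])$. The sketch below is our own illustration (grid resolution and test functions are arbitrary choices): $x_n(t) = \sin nt$ has norm $\sqrt{\pi}$ for every $n$, while $\|D x_n\| = n\sqrt{\pi}$ grows without bound.

```python
import numpy as np

# Demonstrate unboundedness of D = d/dt on L^2([0, 2*pi]):
# the ratio ||D x_n|| / ||x_n|| equals n for x_n(t) = sin(n t).
t = np.linspace(0.0, 2 * np.pi, 200001)
dt = t[1] - t[0]

def l2_norm(y):
    return np.sqrt(np.sum(y**2) * dt)   # Riemann-sum approximation of the L^2 norm

ratios = [l2_norm(n * np.cos(n * t)) / l2_norm(np.sin(n * t))  # D sin(nt) = n cos(nt)
          for n in (1, 10, 100)]
```

The ratios come out close to $1, 10, 100$: no single constant $M$ can bound $\|Dx\|/\|x\|$.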
always assume that the domain of a bounded operator is the whole of $H$. In the case of unbounded operators this is impossible. For instance, the domain of the differential operator cannot be extended to all of $H$. On the other hand, it may still be possible to extend the domain of an unbounded operator in such a way that, although the domain of the extension is not the whole space, it has better properties. Extension of unbounded operators is one of the important problems of the theory.
Definition 4.12.1 (Extension of Operators). Let $A$ and $B$ be operators on a vector space $E$. If
$$\mathscr{D}(A) \subset \mathscr{D}(B) \qquad \text{and} \qquad Ax = Bx \ \text{for every } x \in \mathscr{D}(A),$$
then $B$ is called an extension of $A$, and we write $A \subset B$.

When performing typical operations on unbounded operators, we have to remember about the domains. For instance, the operator $A + B$ is defined for all $x \in \mathscr{D}(A) \cap \mathscr{D}(B)$, i.e., $\mathscr{D}(A + B) = \mathscr{D}(A) \cap \mathscr{D}(B)$. It may happen that $\mathscr{D}(A) \cap \mathscr{D}(B) = \{0\}$, and then the sum $A + B$ does not make sense. Similarly,
$$\mathscr{D}(AB) = \{x \in \mathscr{D}(B) : Bx \in \mathscr{D}(A)\}.$$
The usual properties need not hold. Although we have the equality $(A + B)C = AC + BC$, in general the inclusion $AB + AC \subset A(B + C)$ cannot be replaced by equality.
Definition 4.12.2 (Densely Defined Operator). An operator $A$ defined in a normed space $E$ is called densely defined if its domain is a dense subset of $E$, i.e., $\operatorname{cl} \mathscr{D}(A) = E$.

The differential operator $D = d/dx$ is densely defined in $L^2(\mathbb{R})$, because the space of differentiable square integrable functions is dense in $L^2(\mathbb{R})$.
Definition 4.12.3 (Adjoint of a Densely Defined Operator). Let $A$ be a densely defined operator in a Hilbert space $H$. Denote by $\mathscr{D}(A^*)$ the set of all $y \in H$ for which $(Ax, y)$ is a continuous functional on $\mathscr{D}(A)$. The adjoint $A^*$ of $A$ is the operator defined by
$$(Ax, y) = (x, A^* y) \qquad \text{for all } x \in \mathscr{D}(A) \text{ and } y \in \mathscr{D}(A^*).$$
In the above definition $A$ has to be densely defined in order to ensure the uniqueness of the adjoint $A^*$.
Theorem 4.12.1. Let $A$ and $B$ be densely defined operators in a Hilbert space $H$.

(a) If $A \subset B$, then $B^* \subset A^*$.
(b) If $\mathscr{D}(B^*)$ is dense in $H$, then $B \subset B^{**}$.

Proof. First note that $A \subset B$ implies
$$(Ax, y) = (x, B^* y) \qquad \text{for all } x \in \mathscr{D}(A) \text{ and all } y \in \mathscr{D}(B^*). \tag{4.12.1}$$
On the other hand, we have
$$(Ax, y) = (x, A^* y) \qquad \text{for all } x \in \mathscr{D}(A) \text{ and all } y \in \mathscr{D}(A^*). \tag{4.12.2}$$
Comparing (4.12.1) and (4.12.2), we conclude that $\mathscr{D}(B^*) \subset \mathscr{D}(A^*)$ and $A^*(y) = B^*(y)$ for all $y \in \mathscr{D}(B^*)$. This proves (a).

To prove (b), observe that the condition
$$(Bx, y) = (x, B^* y) \qquad \text{for all } x \in \mathscr{D}(B) \text{ and all } y \in \mathscr{D}(B^*)$$
can be rewritten as
$$(B^* y, x) = (y, Bx) \qquad \text{for all } y \in \mathscr{D}(B^*) \text{ and all } x \in \mathscr{D}(B). \tag{4.12.3}$$
Therefore, since $\mathscr{D}(B^*)$ is dense in $H$, $B^{**}$ exists and we have
$$(B^* y, x) = (y, B^{**} x) \qquad \text{for all } y \in \mathscr{D}(B^*) \text{ and all } x \in \mathscr{D}(B^{**}). \tag{4.12.4}$$
From (4.12.3) and (4.12.4) it follows that $\mathscr{D}(B) \subset \mathscr{D}(B^{**})$ and $B(x) = B^{**}(x)$ for any $x \in \mathscr{D}(B)$. The proof is complete.

Theorem 4.12.2. If $A$ is a one-to-one densely defined operator in a Hilbert space $H$ such that its inverse $A^{-1}$ is densely defined, then $A^*$ is also one-to-one and
$$(A^*)^{-1} = (A^{-1})^*. \tag{4.12.5}$$

Proof. Let $y \in \mathscr{D}(A^*)$. Then for every $x \in \mathscr{D}(A^{-1})$ we have $A^{-1} x \in \mathscr{D}(A)$ and hence
$$(A^{-1} x, A^* y) = (A A^{-1} x, y) = (x, y).$$
This means that $A^* y \in \mathscr{D}((A^{-1})^*)$ and
$$(A^{-1})^* A^* y = y. \tag{4.12.6}$$
Next, take an arbitrary $y \in \mathscr{D}((A^{-1})^*)$. Then, for each $x \in \mathscr{D}(A)$, we have $Ax \in \mathscr{D}(A^{-1})$. Hence
$$(Ax, (A^{-1})^* y) = (A^{-1} A x, y) = (x, y).$$
This shows that $(A^{-1})^* y \in \mathscr{D}(A^*)$ and
$$A^* (A^{-1})^* y = y. \tag{4.12.7}$$
Equality (4.12.5) follows from (4.12.6) and (4.12.7).
Theorem 4.12.3. If $A$, $B$, and $AB$ are densely defined operators in $H$, then $B^* A^* \subset (AB)^*$.

Proof. Suppose $x \in \mathscr{D}(AB)$ and $y \in \mathscr{D}(B^* A^*)$. Since $x \in \mathscr{D}(B)$ and $A^* y \in \mathscr{D}(B^*)$, it follows that
$$(Bx, A^* y) = (x, B^* A^* y).$$
On the other hand, since $Bx \in \mathscr{D}(A)$ and $y \in \mathscr{D}(A^*)$, we have
$$(A(Bx), y) = (Bx, A^* y).$$
Hence
$$((AB)x, y) = (x, B^*(A^* y)).$$
Since this holds for all $x \in \mathscr{D}(AB)$, we have $y \in \mathscr{D}((AB)^*)$ and $(AB)^* y = (B^* A^*) y$.
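For everywhere-defined bounded operators — for instance matrices on $\mathbb{C}^3$, our own finite-dimensional illustration — the inclusion of Theorem 4.12.3 becomes the familiar equality $(AB)^* = B^* A^*$, which is easy to check numerically:

```python
import numpy as np

# On C^3 every operator is bounded and everywhere defined, so the inclusion
# B*A* ⊂ (AB)* of Theorem 4.12.3 is the equality (AB)* = B*A*.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

adjoint = lambda M: M.conj().T         # matrix adjoint = conjugate transpose
lhs = adjoint(A @ B)
rhs = adjoint(B) @ adjoint(A)
```

In the unbounded case, by contrast, the two sides may have genuinely different domains, which is exactly what the inclusion records.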
Self-adjoint operators have already been discussed in Section 4.4. In that section, however, we limited our discussion to bounded operators. Without the boundedness condition the matter is more delicate.
Definition 4.12.4 (Self-Adjoint Operator). Let $A$ be a densely defined operator in a Hilbert space $H$. $A$ is called self-adjoint if $A = A^*$.

Remember that $A = A^*$ means that $\mathscr{D}(A^*) = \mathscr{D}(A)$ and $A(x) = A^*(x)$ for all $x \in \mathscr{D}(A)$. If $A$ is a bounded densely defined operator in $H$, then $A$ has a unique extension to a bounded operator on $H$, and then its domain as well as the domain of its adjoint is the whole space $H$. In the case of unbounded operators the situation is much more complicated. It is possible that a densely defined operator $A$ has an adjoint $A^*$ such that $A(x) = A^*(x)$ whenever $x \in \mathscr{D}(A) \cap \mathscr{D}(A^*)$, but $\mathscr{D}(A^*) \neq \mathscr{D}(A)$.

[...]

and the right-hand side, which contains the factor $1/(n-1)!$, tends to $0$ as $n \to \infty$. This shows $f(x) = 0$ for all $x \in [0, 1]$.

Theorem 5.5.3 (Non-Homogeneous Volterra Equation). The non-homogeneous Volterra equation
$$f(x) = \phi(x) + \lambda \int_a^x K(x, t) f(t)\,dt, \qquad a \le x \le b, \tag{5.5.9}$$
has a unique solution, for any $\lambda$, given by
$$f(x) = \phi(x) + \lambda \int_a^x \Gamma(x, t; \lambda)\,\phi(t)\,dt, \qquad \Gamma(x, t; \lambda) = \sum_{n=1}^{\infty} \lambda^{n-1} K_n(x, t).$$

[...]

The equation
$$\frac{d}{dx}\left[(1-x)^{\alpha+1}(1+x)^{\beta+1} \frac{du}{dx}\right] + \lambda (1-x)^{\alpha}(1+x)^{\beta} u = 0, \qquad -1 < x < 1, \quad \alpha, \beta > -1,$$
is called the Jacobi differential equation; here
$$p(x) = (1-x)^{\alpha+1}(1+x)^{\beta+1}, \qquad q(x) = 0, \qquad w(x) = (1-x)^{\alpha}(1+x)^{\beta}.$$
The functions $P_n^{(\alpha,\beta)}(x)$, called the Jacobi polynomials, are the eigenfunctions of the Jacobi equation. The eigenvalues are $\lambda_n = n(n + \alpha + \beta + 1)$. The Jacobi operator is defined by
$$Lu = -\frac{d}{dx}\left[(1-x)^{\alpha+1}(1+x)^{\beta+1} \frac{du}{dx}\right].$$
If $\alpha = \beta = 0$, the Jacobi equation becomes the Legendre equation. For $\alpha = \beta = -\tfrac{1}{2}$ it reduces to the Chebyshev equation. If $\alpha = \beta$, we obtain the Gegenbauer polynomials $C_n^{\nu}$ with $\nu = \alpha + \tfrac{1}{2}$.
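The eigenvalue relation can be verified in exact coefficient arithmetic for the special case $\alpha = \beta = 0$, where the Jacobi equation reduces to the Legendre equation $\frac{d}{dx}[(1-x^2)P_n'] + n(n+1)P_n = 0$ with eigenvalue $n(n+1) = n(n+\alpha+\beta+1)$. A minimal sketch (our own choice of $n$):

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial.legendre import leg2poly

# Check d/dx[(1 - x^2) P_n'] + n(n + 1) P_n = 0 in polynomial coefficients.
n = 5
c = leg2poly([0] * n + [1])            # power-basis coefficients of P_5
lhs = P.polyder(P.polymul([1.0, 0.0, -1.0], P.polyder(c)))   # d/dx[(1 - x^2) P_n']
residual = P.polyadd(lhs, n * (n + 1) * c)
```

The residual polynomial vanishes identically, confirming that $P_5$ is an eigenfunction with eigenvalue $5 \cdot 6 = 30$.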
Example 5.8.9 (Laguerre Operator and Laguerre Polynomials). The differential equation
$$e^{x} \frac{d}{dx}\left[x e^{-x} \frac{du}{dx}\right] + \lambda u = 0, \qquad 0 < x < \infty,$$
is called the Laguerre differential equation.

[...]

$a_1^2 + a_2^2 > 0$ and $b_1^2 + b_2^2 > 0$.

Definition 5.9.2 (Singular Sturm-Liouville System). Suppose $p$, $q$, and $w$ are functions defined on $[a, b]$ satisfying all the conditions in Definition 5.9.1 except that $p$ is only assumed to be positive in $(a, b)$ and vanishes at one or both end-points of the interval $[a, b]$. By the singular Sturm-Liouville system we mean the system consisting of the differential equation (5.9.1) in the open or semi-open interval with boundary conditions

(a) $u$ is bounded on $(a, b)$;
(b) if $p$ does not vanish at an end-point, $u$ satisfies a boundary condition of the form (5.9.2) at that end-point.

Example 5.9.1.
Consider the regular Sturm-Liouville system
$$u'' + \lambda u = 0, \qquad 0 \le x \le \pi, \qquad u(0) = u(\pi) = 0.$$
Applications
258
Suppose $\lambda < 0$ and let $\nu = \sqrt{|\lambda|}$. Then the general solution of the equation is
$$u(x) = A e^{\nu x} + B e^{-\nu x}.$$
This solution satisfies the given boundary conditions if and only if
$$A + B = 0, \qquad A e^{\nu \pi} + B e^{-\nu \pi} = 0.$$
Since $\nu > 0$, $e^{\nu \pi} \neq e^{-\nu \pi}$, and therefore the only solution is $A = B = 0$. This means that the system has no non-zero solutions if $\lambda < 0$. In other words, there are no negative eigenvalues. A similar argument shows that $\lambda = 0$ is not an eigenvalue. However, when $\lambda > 0$, the solutions of the equation are
$$u(x) = A \cos \sqrt{\lambda}\,x + B \sin \sqrt{\lambda}\,x.$$
The boundary conditions give
$$A = 0 \qquad \text{and} \qquad B \sin \sqrt{\lambda}\,\pi = 0.$$
Since $B = 0$ would yield the trivial solution, we must have $B \neq 0$ and $\sin \sqrt{\lambda}\,\pi = 0$. Hence the eigenvalues are $\lambda_n = n^2$, $n = 1, 2, \ldots$, and the eigenfunctions are
$$u_n(x) = \sin nx.$$
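The eigenvalues $\lambda_n = n^2$ can be reproduced numerically. A minimal sketch (the central-difference discretization and grid size are our own choices): replacing $-u''$ by the standard second-difference matrix on interior grid points gives a matrix whose smallest eigenvalues approach $1, 4, 9, \ldots$.

```python
import numpy as np

# Discretize u'' + lambda*u = 0, u(0) = u(pi) = 0 by central differences on
# N interior points; the eigenvalues of the second-difference matrix for -u''
# approximate lambda_n = n^2.
N = 400
h = np.pi / (N + 1)
T = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
lam = np.sort(np.linalg.eigvalsh(T))
```

The first few computed eigenvalues agree with $1, 4, 9$ to within the $O(h^2)$ discretization error.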
Note that $\lambda_n \to \infty$ as $n \to \infty$, unlike the case of self-adjoint compact operators, whose eigenvalues converge to $0$; see Theorem 4.9.10. Section 5.10, particularly Theorem 5.10.4, will explain this.

Another type of problem that often occurs in practice is the periodic Sturm-Liouville system:
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + (q(x) + \lambda w(x)) u = 0, \qquad a \le x \le b,$$
$$p(a) = p(b), \qquad u(a) = u(b), \qquad u'(a) = u'(b).$$
259
Applications to Integral and Differential Equations
Example 5.9.2. Find the eigenvalues and eigenfunctions of the following periodic Sturm-Liouville system:
$$u'' + \lambda u = 0, \qquad -\pi \le x \le \pi, \qquad u(-\pi) = u(\pi), \qquad u'(-\pi) = u'(\pi).$$
Note that here $p(x) = 1$ and hence $p(-\pi) = p(\pi)$. When $\lambda > 0$ the general solution of the equation is
$$u(x) = A \cos \sqrt{\lambda}\,x + B \sin \sqrt{\lambda}\,x.$$
Using the boundary conditions we get
$$2B \sin \sqrt{\lambda}\,\pi = 0, \qquad 2A \sqrt{\lambda} \sin \sqrt{\lambda}\,\pi = 0.$$
Thus, for non-trivial solutions, we must have
$$\sin \sqrt{\lambda}\,\pi = 0.$$
The equation is satisfied if $\lambda = n^2$, $n = 1, 2, 3, \ldots$. For every eigenvalue $\lambda_n = n^2$ we have two linearly independent solutions, $\cos nx$ and $\sin nx$. It can be readily shown that the system has no negative eigenvalues. However, $\lambda = 0$ is an eigenvalue and the corresponding eigenfunction is the constant function $u(x) = 1$. Thus the eigenvalues are
$$0, 1, 4, \ldots, n^2, \ldots$$
and the corresponding eigenfunctions are
$$1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx, \ldots.$$
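The double eigenvalues of the periodic system can also be seen numerically. A sketch under our own choice of discretization (a periodic second-difference matrix on a uniform grid): the eigenvalue $0$ appears once, while each $n^2$ appears as a near-exact pair, mirroring the $\cos nx$, $\sin nx$ degeneracy.

```python
import numpy as np

# Periodic finite-difference model of u'' + lambda*u = 0 on [-pi, pi]:
# wrap-around entries implement u(-pi) = u(pi), u'(-pi) = u'(pi).
N = 400
h = 2 * np.pi / N
T = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
T[0, -1] = T[-1, 0] = -1.0             # periodic coupling
lam = np.sort(np.linalg.eigvalsh(T / h**2))
```

Sorted eigenvalues begin $0, 1, 1, 4, 4, \ldots$ up to the $O(h^2)$ discretization error.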
Throughout the remainder of this section, $L$ will denote the differential operator in the Sturm-Liouville differential equation, i.e.,
$$Lu = \frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u.$$
For the regular Sturm-Liouville system, we denote by $\mathscr{D}(L)$ the domain of $L$, i.e., $\mathscr{D}(L)$ is the space of all complex valued functions $u$ defined on $[a, b]$ for which $u''$ belongs to $L^2([a, b])$ and which satisfy boundary conditions (5.9.2). We then have $L : \mathscr{D}(L) \to L^2([a, b])$.
For the singular Sturm-Liouville system we need only replace (5.9.2) by

(a) $u$ is bounded on $(a, b)$,
(b) $b_1 u(b) + b_2 u'(b) = 0$, where $b_1$ and $b_2$ are real constants such that $b_1^2 + b_2^2 > 0$.

Theorem 5.9.1 (Lagrange's Identity). For any $u, v \in \mathscr{D}(L)$,
$$u L v - v L u = \frac{d}{dx}\left[p\left(u \frac{dv}{dx} - v \frac{du}{dx}\right)\right]. \tag{5.9.3}$$

Proof. We have
$$u L v - v L u = u \frac{d}{dx}\left[p \frac{dv}{dx}\right] + quv - v \frac{d}{dx}\left[p \frac{du}{dx}\right] - quv = \frac{d}{dx}\left[p\left(u \frac{dv}{dx} - v \frac{du}{dx}\right)\right].$$

Theorem 5.9.2 (Abel's Formula).
If $u$ and $v$ are two solutions of
$$Lu + \lambda w u = 0 \tag{5.9.4}$$
in $[a, b]$, then $p(x) W(x; u, v) = \text{constant}$, where $W$ is the Wronskian:
$$W(x; u, v) = \det \begin{bmatrix} u(x) & u'(x) \\ v(x) & v'(x) \end{bmatrix}.$$

Proof. Since $u$ and $v$ are solutions of (5.9.4) we have
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + (q(x) + \lambda w(x)) u = 0, \qquad \frac{d}{dx}\left[p(x) \frac{dv}{dx}\right] + (q(x) + \lambda w(x)) v = 0.$$
Multiplying the first equation by $v$ and the second by $u$, and then subtracting, we obtain
$$u \frac{d}{dx}\left[p \frac{dv}{dx}\right] - v \frac{d}{dx}\left[p \frac{du}{dx}\right] = 0.$$
By integrating this equation from $a$ to $x$ we find
$$p(x)[u(x) v'(x) - u'(x) v(x)] = p(a)[u(a) v'(a) - u'(a) v(a)] = \text{constant}.$$
This is Abel's formula.

Theorem 5.9.3. Eigenfunctions of a regular Sturm-Liouville system are unique except for a constant factor.

Proof. Suppose $u$ and $v$ are eigenfunctions corresponding to the same eigenvalue $\lambda$. According to Abel's formula, we have $p(x) W(x; u, v) = \text{constant}$. Since $p > 0$, if $W(x; u, v)$ vanishes at a point in $[a, b]$, then it vanishes everywhere in $[a, b]$. From the boundary conditions we have
$$a_1 u(a) + a_2 u'(a) = 0, \qquad a_1 v(a) + a_2 v'(a) = 0.$$
Since $a_1$ and $a_2$ are not both zero, we get
$$W(a; u, v) = \det \begin{bmatrix} u(a) & u'(a) \\ v(a) & v'(a) \end{bmatrix} = 0.$$
Therefore $W(x; u, v) = 0$ for all $x \in [a, b]$, which proves linear dependence of $u$ and $v$.

Theorem 5.9.4. For any $u, v \in \mathscr{D}(L)$ we have
$$(Lu, v) = (u, Lv),$$
where $(\,\cdot\,, \cdot\,)$ denotes the inner product of $L^2([a, b])$. In other words, $L$ is a self-adjoint operator.
Proof. Since all constants involved in the boundary conditions of a Sturm-Liouville system are real, if $v \in \mathscr{D}(L)$, then $\bar{v} \in \mathscr{D}(L)$. Also, since $p$, $q$, and $w$ are real valued, $L\bar{v} = \overline{Lv}$. Consequently,
$$(Lu, v) - (u, Lv) = \int_a^b (\bar{v}\,Lu - u\,L\bar{v})\,dx = \left[p(u \bar{v}' - \bar{v} u')\right]_a^b, \tag{5.9.5}$$
by Lagrange's identity (5.9.3). We will show that the last term in the above equality vanishes, for both the regular and singular system. If $p(a) = 0$, the result follows immediately. If $p(a) > 0$, then $u$ and $\bar{v}$ satisfy boundary conditions of the form (5.9.2) at $x = a$. That is,
$$\begin{bmatrix} u(a) & u'(a) \\ \bar{v}(a) & \bar{v}'(a) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = 0.$$
Since $a_1$ and $a_2$ are not both zero, we have
$$u(a)\bar{v}'(a) - \bar{v}(a) u'(a) = 0.$$
A similar argument can be applied to the other end-point $x = b$, so that we conclude
$$\left[p(u\bar{v}' - \bar{v} u')\right]_a^b = 0.$$

Theorem 5.9.5.
Eigenvalues of a Sturm-Liouville system are real.

Proof. Let $\lambda$ be an eigenvalue of a Sturm-Liouville system and let $u$ be the corresponding eigenfunction. This means that $u \neq 0$ and $Lu = -\lambda w u$. Then
$$0 = (Lu, u) - (u, Lu) = (-\lambda w u, u) - (u, -\lambda w u) = (\bar{\lambda} - \lambda) \int_a^b w(x) |u(x)|^2\,dx.$$
Since $w(x) > 0$ in $[a, b]$ and $u \neq 0$, the integral is a positive number. Therefore $\bar{\lambda} = \lambda$, completing the proof.
Remark. This theorem states that all eigenvalues of a regular Sturm-Liouville system are real, but it does not guarantee that an eigenvalue exists. It is proved in Section 5.10 that a regular Sturm-Liouville system has an infinite sequence of eigenvalues.

Theorem 5.9.6. Eigenfunctions corresponding to distinct eigenvalues of a Sturm-Liouville system are orthogonal with respect to the inner product with the weight function $w(x)$.
Proof. Suppose $u_1$ and $u_2$ are eigenfunctions corresponding to eigenvalues $\lambda_1$ and $\lambda_2$, $\lambda_1 \neq \lambda_2$. Thus
$$L u_1 = -\lambda_1 w u_1 \qquad \text{and} \qquad L u_2 = -\lambda_2 w u_2.$$
Hence
$$u_2 L u_1 - u_1 L u_2 = (\lambda_2 - \lambda_1) w u_1 u_2. \tag{5.9.6}$$
By Theorem 5.9.1, we have
$$u_2 L u_1 - u_1 L u_2 = \frac{d}{dx}\left[p(u_2 u_1' - u_1 u_2')\right]. \tag{5.9.7}$$
Combining (5.9.6) and (5.9.7) and integrating from $a$ to $b$, we get
$$(\lambda_2 - \lambda_1) \int_a^b w(x) u_1(x) u_2(x)\,dx = \left[p(u_2 u_1' - u_1 u_2')\right]_a^b = 0,$$
the boundary term vanishing as in the proof of Theorem 5.9.4. Since $\lambda_1 \neq \lambda_2$, we conclude
$$\int_a^b w(x) u_1(x) u_2(x)\,dx = 0.$$
This completes the proof.
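For the system of Example 5.9.1 (weight $w = 1$, eigenfunctions $\sin nx$), the orthogonality relation can be checked directly by quadrature; the grid below is our own choice.

```python
import numpy as np

# Orthogonality of sin(nx), sin(mx) over [0, pi]: the integral vanishes for
# n != m and equals pi/2 for n = m.
x = np.linspace(0.0, np.pi, 100001)
dx = x[1] - x[0]

def inner(n, m):
    return np.sum(np.sin(n * x) * np.sin(m * x)) * dx

off = inner(2, 5)                      # distinct eigenvalues 4 and 25
diag = inner(3, 3)                     # same eigenfunction, norm squared
```

The cross term is numerically zero while the diagonal term reproduces $\pi/2$.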
5.10. Inverse Differential Operators and Green's Functions

A typical boundary value problem for an ordinary differential equation can be written in operator form as
$$Lu = f. \tag{5.10.1}$$
We seek a solution $u$ which satisfies this equation and the given boundary conditions. If $\mathscr{D}(L)$ is defined as the space of functions satisfying those boundary conditions, then the problem reduces to finding a solution of (5.10.1) in $\mathscr{D}(L)$. One way to approach the problem is by looking for the inverse operator $L^{-1}$. If it is possible to find $L^{-1}$, then the solution of (5.10.1) can be obtained as $u = L^{-1}(f)$. It turns out that in many important cases this is possible, and the inverse operator is an integral operator of the form
$$(L^{-1} f)(x) = \int_a^b G(x, t) f(t)\,dt.$$
The function $G$ is called the Green's function of the operator $L$. Existence of the Green's function and its determination is not a simple problem. We will examine the question more closely in the case of Sturm-Liouville systems.
Theorem 5.10.1. Suppose $\lambda = 0$ is not an eigenvalue of the following regular Sturm-Liouville system:
$$Lu = \frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u = f(x), \qquad a \le x \le b, \tag{5.10.2}$$
with the homogeneous boundary conditions
$$a_1 u(a) + a_2 u'(a) = 0, \tag{5.10.3}$$
$$b_1 u(b) + b_2 u'(b) = 0, \tag{5.10.4}$$
where $p$, $q$, and $w$ are continuous real valued functions on $[a, b]$, $p$ is positive in $[a, b]$, $p'(x)$ exists and is continuous in $[a, b]$, and $a_1, a_2, b_1, b_2$ are given real numbers such that $a_1^2 + a_2^2 > 0$ and $b_1^2 + b_2^2 > 0$. Then, for any $f \in \mathscr{C}([a, b])$, the system has a unique solution
$$u(x) = \int_a^b G(x, t) f(t)\,dt,$$
where $G(x, t)$ is the Green's function given by
$$G(x, t) = \begin{cases} \dfrac{u_1(t) u_2(x)}{p(x) W(x; u_1, u_2)} & \text{for } a \le t < x, \\[2mm] \dfrac{u_1(x) u_2(t)}{p(x) W(x; u_1, u_2)} & \text{for } x \le t \le b, \end{cases}$$
where $u_1$ and $u_2$ are non-zero solutions of the homogeneous equation
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u = 0$$
with boundary conditions (5.10.3) and (5.10.4), respectively, and
$$W(x; u_1, u_2) = u_1(x) u_2'(x) - u_1'(x) u_2(x)$$
is the Wronskian.

Proof. According to the theory of ordinary differential equations, the general solution of (5.10.2) is of the form
$$u = c_1 u_1 + c_2 u_2 + u_p, \tag{5.10.5}$$
where $c_1$ and $c_2$ are constants, $u_1$ and $u_2$ are two linearly independent solutions of the homogeneous equation $Lu = 0$, and $u_p$ is any particular solution of (5.10.2).
The particular solution $u_p$ can be found by the method of variation of parameters. Thus we look for a solution in the form
$$u_p = v_1 u_1 + v_2 u_2, \tag{5.10.6}$$
where $v_1$ and $v_2$ are functions to be determined. Since there are infinitely many pairs of functions $v_1$ and $v_2$ for which $u_p$ satisfies (5.10.2), we add a second condition:
$$v_1' u_1 + v_2' u_2 = 0. \tag{5.10.7}$$
We now have
$$u_p' = v_1 u_1' + v_2 u_2' \qquad \text{and} \qquad u_p'' = v_1 u_1'' + v_2 u_2'' + v_1' u_1' + v_2' u_2'.$$
Substituting into (5.10.2) we get
$$v_1(p u_1'' + p' u_1' + q u_1) + v_2(p u_2'' + p' u_2' + q u_2) + p(v_1' u_1' + v_2' u_2') = f.$$
Since $u_1$ and $u_2$ are solutions of the homogeneous equation, the first two terms vanish, so that the above result becomes
$$v_1' u_1' + v_2' u_2' = \frac{f}{p}. \tag{5.10.8}$$
Solving (5.10.7) and (5.10.8) for $v_1'$ and $v_2'$ we obtain
$$v_1' = -\frac{f u_2}{p\,W(x; u_1, u_2)}, \qquad v_2' = \frac{f u_1}{p\,W(x; u_1, u_2)}. \tag{5.10.9}$$
We will show that the Wronskian does not vanish at any point of $[a, b]$. Indeed, suppose that $W(x; u_1, u_2)$ vanishes at some $\xi \in [a, b]$. Then the system
$$\alpha u_1(\xi) + \beta u_2(\xi) = 0, \qquad \alpha u_1'(\xi) + \beta u_2'(\xi) = 0$$
has a non-trivial solution, with $\alpha$ and $\beta$ not both zero. Then the function $g = \alpha u_1 + \beta u_2$ is a solution of the initial value problem
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u = 0, \qquad g(\xi) = g'(\xi) = 0.$$
But we know that the above problem has only the trivial solution, and thus $g = 0$. This means that $u_1$ and $u_2$ are linearly dependent, i.e., $u_1 = \gamma u_2$ for some constant $\gamma$. Consequently, $u_1$ satisfies both boundary conditions (5.10.3) and (5.10.4), but this implies that $\lambda = 0$ is an eigenvalue of the system, contrary to the assumption. By Abel's formula (Theorem 5.9.2), $p(x) W(x; u_1, u_2)$ is a constant. Since $W$ does not vanish in $[a, b]$ and $p$ is assumed to be positive, the constant is not zero. Denote $c = [p(x) W(x; u_1, u_2)]^{-1}$.
Now, by integrating equalities (5.10.9) we get
$$v_1(x) = -\int c f(x) u_2(x)\,dx \qquad \text{and} \qquad v_2(x) = \int c f(x) u_1(x)\,dx,$$
and finally
$$u_p(x) = -c u_1(x) \int_b^x f(t) u_2(t)\,dt + c u_2(x) \int_a^x f(t) u_1(t)\,dt = \int_a^x c u_2(x) u_1(t) f(t)\,dt + \int_x^b c u_1(x) u_2(t) f(t)\,dt. \tag{5.10.10}$$
Consequently, if we define the Green's function as
$$G(x, t) = \begin{cases} c\,u_1(t) u_2(x) & \text{for } a \le t < x, \\ c\,u_1(x) u_2(t) & \text{for } x \le t \le b, \end{cases} \tag{5.10.11}$$
we can write
$$u_p(x) = \int_a^b G(x, t) f(t)\,dt,$$
provided the integral exists. This follows immediately from the continuity of $G$. The proof of continuity of $G$ is left as an exercise.
Denote by $T$ the integral operator defined in Theorem 5.10.1, i.e.,
$$(Tf)(x) = \int_a^b G(x, t) f(t)\,dt. \tag{5.10.12}$$
We are going to examine properties of $T$.

Theorem 5.10.2. The operator $T$ defined by (5.10.12) is a self-adjoint compact operator from $L^2([a, b])$ into $\mathscr{C}([a, b])$.
Proof. Compactness of integral operators has been discussed in Example 4.8.4. In Example 4.4.3 we proved that an integral operator of the form (5.10.12) is self-adjoint if $G(x, t) = \overline{G(t, x)}$. It is easy to see that in this case the condition is satisfied. Finally, continuity of $Tf$ follows from the continuity of $G$.

The operator $T$, as a compact self-adjoint operator, admits a spectral representation (see Theorem 4.10.2). The following two theorems describe the connection between eigenvalues and eigenfunctions of the regular Sturm-Liouville operator $L$ and the corresponding integral operator $T$.

Theorem 5.10.3. If $\lambda = 0$ is not an eigenvalue of the Sturm-Liouville system defined in Theorem 5.10.1, then $\lambda = 0$ is not an eigenvalue of the integral operator $T$ defined by (5.10.12).
Proof. Suppose $Tf = 0$. Then also
$$0 = (Tf)'(x) = \frac{d}{dx}\left[c u_1(x) \int_x^b f(t) u_2(t)\,dt + c u_2(x) \int_a^x f(t) u_1(t)\,dt\right]$$
$$= c\left(u_1'(x) \int_x^b f(t) u_2(t)\,dt - u_1(x) u_2(x) f(x) + u_2'(x) \int_a^x f(t) u_1(t)\,dt + u_2(x) u_1(x) f(x)\right)$$
$$= c\left(u_1'(x) \int_x^b f(t) u_2(t)\,dt + u_2'(x) \int_a^x f(t) u_1(t)\,dt\right).$$
Therefore, we have the following system of equations:
$$u_1(x) \int_x^b f(t) u_2(t)\,dt + u_2(x) \int_a^x f(t) u_1(t)\,dt = 0,$$
$$u_1'(x) \int_x^b f(t) u_2(t)\,dt + u_2'(x) \int_a^x f(t) u_1(t)\,dt = 0.$$
Since the determinant
$$\det \begin{bmatrix} u_1(x) & u_2(x) \\ u_1'(x) & u_2'(x) \end{bmatrix}$$
does not vanish at any point of $[a, b]$ (see the proof of Theorem 5.10.1), we conclude
$$\int_x^b f(t) u_2(t)\,dt = \int_a^x f(t) u_1(t)\,dt = 0$$
for all $x \in [a, b]$. This implies $f = 0$, and thus the equation $Tf = 0$ has only the trivial solution. The proof is complete.

Theorem 5.10.4. Under the assumptions of Theorem 5.10.1, $\lambda$ is an eigenvalue of $L$ if and only if $1/\lambda$ is an eigenvalue of $T$. Moreover, if $f$ is an eigenfunction of $L$ corresponding to the eigenvalue $\lambda$, then $f$ is an eigenfunction of $T$ corresponding to the eigenvalue $1/\lambda$.

Proof. Suppose $Lf = \lambda f$ for some non-zero $f$ in the domain of $L$. By Theorem 5.10.1, we have
$$f = T(\lambda f),$$
or equivalently, since $\lambda \neq 0$,
$$Tf = \frac{1}{\lambda} f.$$
This shows that $1/\lambda$ is an eigenvalue of $T$ and $f$ is the corresponding eigenfunction. Conversely, if $f$ is an eigenfunction of $T$ corresponding to $\lambda$, $f \neq 0$ and $\lambda \neq 0$, then
$$Tf = \lambda f,$$
and hence
$$f = L(Tf) = L(\lambda f) = \lambda L f.$$
Therefore $1/\lambda$ is an eigenvalue of $L$ and the corresponding eigenfunction is $f$.
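The eigenvalue correspondence of Theorem 5.10.4 can be observed numerically. In the sketch below we use our own example $Lu = u''$ on $[0, \pi]$ with $u(0) = u(\pi) = 0$ (eigenvalues $-n^2$): here $u_1 = t$, $u_2 = t - \pi$, $pW = \pi$, so $G(x,t) = t(x - \pi)/\pi$ for $t \le x$, and the discretized integral operator should have eigenvalues $-1/n^2$.

```python
import numpy as np

# Nystrom discretization of (Tf)(x) = Int G(x, t) f(t) dt on [0, pi].
N = 600
t = np.linspace(0.0, np.pi, N)
h = t[1] - t[0]
X, T_ = np.meshgrid(t, t, indexing="ij")
G = np.where(T_ <= X, T_ * (X - np.pi), X * (T_ - np.pi)) / np.pi
mu = np.sort(np.linalg.eigvalsh(G * h))   # G is symmetric in (x, t)
```

The most negative eigenvalues come out close to $-1, -1/4, -1/9$, the reciprocals of the Sturm-Liouville eigenvalues, while the remaining spectrum accumulates at $0$, as it must for a compact operator.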
5.11. Applications of Fourier Transforms to Ordinary Differential Equations and Integral Equations

In this section we discuss some examples of applications of the Fourier transform to ordinary differential equations and integral equations. Consider the $n$th order linear ordinary differential equation with constant coefficients
$$L(y) = f(x), \tag{5.11.1}$$
where $L$ is the $n$th order differential operator given by
$$L = a_n D^n + a_{n-1} D^{n-1} + \cdots + a_1 D + a_0, \tag{5.11.2}$$
where $a_n, a_{n-1}, \ldots, a_0$ are constants and $D = d/dx$.
Application of the Fourier transform to both sides of (5.11.1) gives
$$\left[a_n (ik)^n + a_{n-1} (ik)^{n-1} + \cdots + a_1 (ik) + a_0\right]\hat{y}(k) = \hat{f}(k),$$
or
$$p(ik)\,\hat{y}(k) = \hat{f}(k), \qquad \text{where } p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0.$$
Thus
$$\hat{y}(k) = \frac{\hat{f}(k)}{p(ik)} = \hat{f}(k)\,\hat{g}(k), \tag{5.11.3}$$
where
$$\hat{g}(k) = \frac{1}{p(ik)}.$$
Now the Convolution Theorem 4.11.7 gives the solution
$$y(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(\xi)\,g(x - \xi)\,d\xi, \tag{5.11.4}$$
provided $g(x) = \mathscr{F}^{-1}\{\hat{g}(k)\}$ is known explicitly.

In order to give a physical interpretation of the result, we consider the differential equation associated with a sudden impulse function $f(x) = \delta(x)$:
$$L(G) = \delta(x) \tag{5.11.5}$$
(for a rigorous discussion of the Dirac delta distribution $\delta$ see Section 6.2). Application of the Fourier transform to (5.11.5) yields the solution
$$G(x) = \mathscr{F}^{-1}\left\{\frac{1}{\sqrt{2\pi}}\,\hat{g}(k)\right\} = \frac{1}{\sqrt{2\pi}}\,g(x). \tag{5.11.6}$$
Now the solution (5.11.4) can be written as
$$y(x) = \int_{-\infty}^{\infty} f(\xi)\,G(x - \xi)\,d\xi. \tag{5.11.7}$$
Clearly, $G(x)$ behaves like a Green's function; that is, it is the response to a unit impulse. In any physical system, $f(x)$ is usually called the input function, while $y(x)$ is called the output, obtained by the superposition principle. The Fourier transform of $\sqrt{2\pi}\,G(x)$ is called the admittance $\hat{g}(k) = [p(ik)]^{-1}$. In order to determine the response to a given input, we first find the Fourier transform of the input, multiply the result by the admittance, and then apply the inverse Fourier transform to the product. We illustrate these ideas by a simple electrical circuit problem.
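The transform-divide-invert recipe is easy to carry out numerically. The sketch below applies it to $L = D^2 - 1$ (so $p(ik) = -k^2 - 1$ never vanishes); the test solution $y(x) = e^{-x^2}$ and the wide periodic FFT grid standing in for the real line are our own choices.

```python
import numpy as np

# Solve y'' - y = f spectrally: transform, divide by p(ik) = -k^2 - 1, invert.
# f is manufactured from the known solution y = exp(-x^2) so we can check it.
N, L = 2048, 20.0
x = np.linspace(-L, L, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=x[1] - x[0])

y_exact = np.exp(-x**2)
f = (4 * x**2 - 3) * y_exact           # y'' - y for y = exp(-x^2)

y = np.fft.ifft(np.fft.fft(f) / (-k**2 - 1)).real
err = np.max(np.abs(y - y_exact))
```

Because the data decay rapidly, the recovered $y$ agrees with the exact solution to near machine precision.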
Example 5.11.1. The electric current $I(t)$ in a circuit is governed by the equation
$$L \frac{dI}{dt} + R I = E(t), \tag{5.11.8}$$
where $L$ is the inductance, $R$ is the resistance, and $E(t)$ is the applied electromotive force. With $E(t) = E_0 e^{-|t|}$, application of the Fourier transform (with respect to $t$) to equation (5.11.8) gives
$$(ikL + R)\,\hat{I}(k) = E_0 \sqrt{\frac{2}{\pi}}\,\frac{1}{1 + k^2},$$
or
$$\hat{I}(k) = E_0 \sqrt{\frac{2}{\pi}}\,\frac{1}{(ikL + R)(1 + k^2)}.$$
The inverse Fourier transform yields
$$I(t) = \frac{E_0}{\pi} \int_{-\infty}^{\infty} \frac{e^{ikt}\,dk}{(ikL + R)(1 + k^2)}.$$
This integral can readily be evaluated by the theory of residues. For $t > 0$, closing the contour in the upper half-plane, we pick up the residues at $k = i$ and $k = iR/L$:
$$I(t) = \frac{E_0}{\pi} \cdot 2\pi i \left[\text{residue at } k = i + \text{residue at } k = \frac{iR}{L}\right] = E_0\left[\frac{e^{-t}}{R - L} + \frac{2L\,e^{-Rt/L}}{L^2 - R^2}\right]. \tag{5.11.9}$$
Similarly, for $t < 0$, we obtain
$$I(t) = -\frac{E_0}{\pi} \cdot 2\pi i\,[\text{residue at } k = -i] = \frac{E_0\,e^{t}}{L + R}.$$
At $t = 0$ the current is continuous; hence
$$I(0) = \lim_{t \to 0} I(t) = \frac{E_0}{R + L}. \tag{5.11.10}$$
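The residue result for $t > 0$ can be verified directly: the closed form should satisfy the circuit equation $L\,I' + R\,I = E_0 e^{-t}$ and match $I(0) = E_0/(R + L)$. A minimal check with arbitrary parameter values of our own choosing (any $R \neq L$ works):

```python
import numpy as np

# I(t) = E0*( e^{-t}/(R - L) + 2*L*e^{-R t/L}/(L^2 - R^2) ) for t > 0.
E0, Lind, R = 1.0, 2.0, 3.0            # arbitrary values with R != L

t = np.linspace(0.0, 5.0, 1001)
I = E0 * (np.exp(-t) / (R - Lind)
          + 2 * Lind * np.exp(-R * t / Lind) / (Lind**2 - R**2))
# Hand-differentiated closed form, so the ODE residual can be tested exactly.
Ip = E0 * (-np.exp(-t) / (R - Lind)
           - 2 * R * np.exp(-R * t / Lind) / (Lind**2 - R**2))
residual = Lind * Ip + R * I - E0 * np.exp(-t)
```

The residual vanishes to round-off, and $I(0) = E_0/(R + L)$ as required by continuity.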
Example 5.11.2 (Synthesis and Resolution of a Pulse; Physical Interpretation of Convolution). A time-dependent electric, optical, or electromagnetic pulse can be regarded as a superposition of plane waves of all real frequencies, so that the total pulse can be represented by the inverse Fourier transform
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega, \tag{5.11.11}$$
where the factor $1/2\pi$ is introduced because the angular frequency $\omega$ is related to the linear frequency $\nu$ by $\omega = 2\pi\nu$, and negative frequencies are introduced for mathematical convenience so that we can avoid dealing with the cosine and sine functions separately. Clearly, $F(\omega)$ can be represented by the Fourier transform of $f(t)$ as
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \tag{5.11.12}$$
This represents the resolution of the pulse $f(t)$ into its angular frequency components, and (5.11.11) gives a synthesis of the pulse from its individual components.

Consider a simple electrical device, such as an amplifier, with an input function $f(t)$ and an output function $g(t)$. For an input of a single frequency $\omega$, $f(t) = e^{i\omega t}$. The amplifier will change the amplitude and may also change the phase, so that the output can be expressed in terms of the input and an amplitude- and phase-modifying function $\Phi(\omega)$ as
$$g(t) = \Phi(\omega) f(t), \tag{5.11.13}$$
where $\Phi(\omega)$ is usually called the transfer function; it is, in general, a complex function of the real variable $\omega$. This function is generally independent of the presence or absence of any other frequency components. Thus the total output may be obtained by integrating over the entire input as modified by the amplifier:
$$g(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi(\omega) F(\omega)\,e^{i\omega t}\,d\omega. \tag{5.11.14}$$
Therefore, the total output $g(t)$ can readily be calculated from any given input $f(t)$ and known transfer function $\Phi(\omega)$. On the other hand, the transfer function is obviously characteristic of the amplifier and can, in general, be obtained as the Fourier transform of some function $\phi(t)$:
$$\Phi(\omega) = \int_{-\infty}^{\infty} \phi(t)\,e^{-i\omega t}\,dt. \tag{5.11.15}$$
The Convolution Theorem 4.11.7 allows us to rewrite (5.11.14) as
$$g(t) = \mathscr{F}^{-1}\{\Phi(\omega) F(\omega)\} = \int_{-\infty}^{\infty} f(\tau)\,\phi(t - \tau)\,d\tau. \tag{5.11.16}$$
Physically, this result represents the output (effect) as the integral superposition of the input (cause) function $f(t)$ modified by $\phi(t - \tau)$. Indeed, (5.11.16) is the most general mathematical representation of an output in terms of an input modified by the amplifier, where $t$ is the time variable. Assuming the principle of causality, that is, that every effect has a cause, we must require $\tau < t$. The principle of causality is imposed by requiring
$$\phi(t - \tau) = 0 \qquad \text{for } \tau > t. \tag{5.11.17}$$
Consequently, (5.11.16) reduces to the form
$$g(t) = \int_{-\infty}^{t} f(\tau)\,\phi(t - \tau)\,d\tau. \tag{5.11.18}$$
In order to determine the significance of $\phi(t)$, we use a sudden impulse function $f(\tau) = \delta(\tau)$, so that (5.11.18) becomes
$$g(t) = \int_{-\infty}^{t} \delta(\tau)\,\phi(t - \tau)\,d\tau = \phi(t) H(t). \tag{5.11.19}$$
This identifies $\phi(t)$ as the output corresponding to a unit impulse at $t = 0$, and the Fourier transform $\Phi(\omega)$ of $\phi(t)$ is given by
$$\Phi(\omega) = \int_{0}^{\infty} \phi(t)\,e^{-i\omega t}\,dt, \tag{5.11.20}$$
with $\phi(t) = 0$ for $t < 0$.

[...]

$$f(x) = \phi(x) + \lambda \int_0^{\pi} \Gamma(x, t; \lambda)\,\phi(t)\,dt,$$
where $\lambda$ is not an eigenvalue. Obtain the general solution, if it exists, for $\phi(x) = \sin x$.
(9) Show that the solution of the differential equation
$$\frac{d^2 f}{dx^2} + x f = 1, \qquad f(0) = f'(0) = 0,$$
satisfies the non-homogeneous Volterra equation
$$f(x) = \frac{x^2}{2} + \int_0^x t(t - x) f(t)\,dt.$$
(10) Transform the problems
$$\text{(a)} \quad \frac{d^2 f}{dx^2} + f = x, \qquad f(0) = 0, \quad f'(1) = 0,$$
$$\text{(b)} \quad \frac{d^2 f}{dx^2} + f = x, \qquad f(0) = 1, \quad f'(1) = 0,$$
into Fredholm integral equations.
(11) Discuss the solutions of the integral equation
$$f(x) = \phi(x) + \lambda \int_0^1 (x + t) f(t)\,dt.$$

(12) When do the following integral equations have solutions?
$$\text{(a)} \quad f(x) = \phi(x) + \lambda \int_0^1 (1 - 3xt) f(t)\,dt.$$
$$\text{(b)} \quad f(x) = \phi(x) + \lambda \int_0^{\pi} \sin(x + t) f(t)\,dt.$$
$$\text{(c)} \quad f(x) = \phi(x) + \lambda \int_0^1 x t\,f(t)\,dt.$$
$$\text{(d)} \quad f(x) = \phi(x) + \lambda \int_{-1}^{1} \sum_n P_n(x) P_n(t) f(t)\,dt, \quad \text{where } P_n \text{ is the } n\text{th degree Legendre polynomial}.$$
$$\text{(e)} \quad f(x) = x + \frac{1}{2} \int_{-1}^{1} (x + t) f(t)\,dt.$$
(13) Find the eigenvalues and eigenfunctions of the following integral equations:
$$\text{(a)} \quad f(x) = \lambda \int_0^{\pi} \cos(x - t) f(t)\,dt.$$
$$\text{(b)} \quad f(x) = \lambda \int_0^{1} (t - x) f(t)\,dt.$$
$$\text{(c)} \quad f(x) = \phi(x) + \lambda \int_0^{2\pi} \cos(x + t) f(t)\,dt.$$
(14) Solve the integral equations:
$$\text{(a)} \quad f(x) = \phi(x) + \lambda \int_0^1 t f(t)\,dt.$$
$$\text{(b)} \quad f(x) = x + \lambda \int_0^{1/2} f(t)\,dt.$$
$$\text{(c)} \quad f(x) = \frac{5x}{6} + \frac{1}{2} \int_0^1 x t\,f(t)\,dt.$$
$$\text{(d)} \quad f(x) = x + \int_0^1 (1 + xt) f(t)\,dt.$$
$$\text{(e)} \quad f(x) = e^x + \lambda \int_0^2 e^{x + t} f(t)\,dt.$$

(15) Use the separable kernel method to show that
$$f(x) = \lambda \int_0^{\pi} \cos x \sin t\,f(t)\,dt$$
has no solution except the trivial solution $f = 0$.
(16) Obtain the Neumann series solutions of the following equations:
$$\text{(a)} \quad f(x) = x + \frac{1}{2} \int_{-1}^{1} (t + x) f(t)\,dt.$$
$$\text{(b)} \quad f(x) = x + \int_0^x (t - x) f(t)\,dt.$$
$$\text{(c)} \quad f(x) = x - \int_0^x (t - x) f(t)\,dt.$$
$$\text{(d)} \quad f(x) = 1 - 2 \int_0^x t f(t)\,dt.$$
(17) If $Lu = u'' + \omega^2 u$, show that $L$ is formally self-adjoint and the concomitant is $J(u, v) = v u' - u v'$. Moreover, if $u$ is a solution of $Lu = 0$ and $v$ is a solution of $L^* v = 0$, then the concomitant of $u$ and $v$ is a constant.
(18) Let $L$ be a self-adjoint differential operator given by (5.8.15). If $u_1$ and $u_2$ are two solutions of $Lu = 0$, and $J(u_1, u_2) = 0$ for some $x$ for which $a_2(x) \neq 0$, then $u_1$ and $u_2$ are linearly dependent.

(19) Consider the differential operator $L$ with the boundary conditions
$$u'(0) = 0, \qquad u(1) = 0.$$
Show that $L$ is formally self-adjoint.
(20) Prove continuity of the Green's function defined in Theorem 5.10.1.

(21) Find eigenvalues and eigenfunctions of the following Sturm-Liouville system:
$$u'' + \lambda u = 0, \qquad 0 \le x \le \pi, \qquad u(0) = u'(\pi) = 0.$$

(22) Transform the Euler equation
$$x^2 u'' + x u' + \lambda u = 0, \qquad 1 \le x \le e,$$
with the boundary conditions $u(1) = u(e) = 0$ into the Sturm-Liouville system
$$\frac{d}{dx}\left[x \frac{du}{dx}\right] + \frac{\lambda}{x}\,u = 0, \qquad u(1) = u(e) = 0.$$
Find the eigenvalues and eigenfunctions.

(23) Prove that $\lambda = 0$ is not an eigenvalue of the system defined in Example 5.9.1.

(24) Show that the Sturm-Liouville operator $L = D p D + q$, $D = d/dx$, is positive if $p(x) > 0$ and $q(x) \ge 0$ for all $x \in [a, b]$.
(25) Show that the Sturm-Liouville operator $L$ in $L^2([a, b])$ given by
$$L = \frac{1}{r(x)}(D p D + q)$$
is not symmetric.

(26) Use the Fourier transform to solve the forced linear harmonic oscillator equation for $t > 0$, $\omega \neq n$, with
$$x(0{+}) = 0 = \dot{x}(0{+}).$$
Examine the case when $\omega = n$.
(27) Solve the problem discussed in Example 5.11.1 with $E(t) = E_0\,e^{-at} \sin \omega t\,H(t)$ and $I(0{+}) = I_0$.

(28) If there is a capacitor in the circuit discussed in Example 5.11.1, then the current $I(t)$ satisfies the integrodifferential equation
$$L \frac{dI}{dt} + R I + \frac{1}{C}\left[q_0 + \int_0^t I(\tau)\,d\tau\right] = E(t),$$
where $q_0$ is the initial charge on the capacitor, so that
$$q = q_0 + \int_0^t I(\tau)\,d\tau$$
is the charge and $dq/dt = I$. Solve this problem using the Fourier transform and the following conditions
fort 0. This completes the proof. In electrodynamics, the fundamental solution (6.3.53) has a well known interpretation. It is essentially the potential at the point x produced by a unit point charge at the point~· This is what can be expected from a physical point of view because 8 (x- ~) is the charge-density corresponding to a unit point charge at ~· The solution of ( 6.3.46) is u(x,y,z)= (
JR3
1 G(x,~)f(~)d~=( 47T JR3
fig'x-~'l{;) 11
dgd17d{
(6.3.54)
The integrand in (6.3.54) consists of the given charge distribution $f(x)$ at $x = \xi$ and the Green's function $G(x, \xi)$. Physically, $G(x, \xi) f(\xi)$ represents the resulting potential due to an elementary point charge, and the total potential due to a given charge distribution $f(x)$ is then obtained by the integral superposition of the resulting potentials. This is the so-called principle of superposition.

Example 6.3.15. The fundamental solution of the two dimensional Helmholtz equation ($-\infty < x, y < \infty$) [...]

[...] we seek solutions $u$ such that
$$\int_{\Omega} \nabla u \cdot \nabla \phi\,d\tau = \int_{\Omega} f \phi\,d\tau \tag{6.4.3}$$
for every $\phi \in \mathscr{D}(\Omega)$. This does not require any information on the second derivatives of $u$. On the other hand, if $f \notin \mathscr{C}(\Omega)$ the problem (6.4.1) does not have a classical solution. It is then necessary to generalize the notion of solution in an appropriate manner. If $f \in L^2(\Omega)$, Equation (6.4.3) makes sense if $\nabla u \in L^2(\Omega)$. If $u \in H_0^1(\Omega)$, where $H_0^1(\Omega)$ is the subspace of $H^1(\Omega)$ consisting of functions vanishing on $\partial\Omega$, and if the derivatives $\partial u / \partial x_k$ are considered in the generalized sense, then it follows from the definition of the Sobolev space that $\partial u / \partial x_k \in L^2(\Omega)$. Then if $u \in H_0^1(\Omega)$ and $u$ satisfies (6.4.3), it is a weak solution of (6.4.1).
Generalized Functions and Partial Differential Equations
Since H₀¹(Ω) is the closure of 𝒟(Ω), 𝒟(Ω) is a dense subspace of H₀¹(Ω). Therefore, solving Equation (6.4.3) is equivalent to finding u ∈ H₀¹(Ω) such that, for all φ ∈ 𝒟(Ω),
$$ (\nabla u, \nabla\phi) = (f, \phi), \tag{6.4.4} $$
where (·,·) is the inner product in L²(Ω): (φ, ψ) = ∫_Ω φψ. Equation (6.4.4) is known as the variational or weak formulation of the problem (6.4.1).
Theorem 6.4.1. Let Ω be a bounded open subset of ℝ^N and let f ∈ L²(Ω). Then there exists a unique weak solution u ∈ H₀¹(Ω) satisfying (6.4.4). Furthermore, u ∈ H₀¹(Ω) is a solution of (6.4.4) if and only if
$$ J(u) = \min_{v\in H_0^1(\Omega)} J(v), \tag{6.4.5} $$
where
$$ J(v) = \frac{1}{2}\int_\Omega \nabla v\cdot\nabla v\,d\tau - \int_\Omega fv\,d\tau. \tag{6.4.6} $$
Proof. In order to apply the Lax–Milgram Theorem 4.3.7, we set H = H₀¹(Ω) and, for u, v ∈ H₀¹(Ω),
$$ a(u, v) = \int_\Omega \nabla u\cdot\nabla v\,d\tau. \tag{6.4.7} $$
We first show that a(·,·) is coercive, that is, there exists a positive constant K such that
$$ a(u, u) \ge K\|u\|_1^2 $$
for all u ∈ H. This readily follows from Friedrichs' first inequality
$$ \int_\Omega |\nabla u|^2\,d\tau \ge \alpha\int_\Omega u^2\,d\tau, \qquad u\in H, \tag{6.4.8} $$
where α is a positive constant. Thus
$$ \int_\Omega |\nabla u|^2\,d\tau \ge \frac{1}{2}\int_\Omega |\nabla u|^2\,d\tau + \frac{\alpha}{2}\int_\Omega u^2\,d\tau \ge K\|u\|_1^2, \tag{6.4.9} $$
where K = min{1/2, α/2} and u ∈ H.
Applications
To prove the boundedness of a(·,·) we note that
$$ a(u, u) = \int_\Omega |\nabla u|^2\,d\tau \le \int_\Omega (|\nabla u|^2 + u^2)\,d\tau = \|u\|_1^2. \tag{6.4.10} $$
Thus a(·,·) is bounded, symmetric, and coercive. So, by the Lax–Milgram Theorem, there exists a unique weak solution of equation (6.4.4).

We next consider the Neumann boundary value problem
$$ -\nabla^2 u + bu = f \quad\text{in } \Omega, \tag{6.4.11a} $$
$$ \frac{\partial u}{\partial n} = 0 \quad\text{on } \partial\Omega, \tag{6.4.11b} $$
where Ω ⊂ ℝ^N is a bounded open set, n is the exterior unit normal to ∂Ω, and b is a non-negative constant. According to Green's first identity (6.3.24), if u is a classical solution, then u ∈ H¹(Ω) and it satisfies the equation
$$ \int_\Omega \nabla u\cdot\nabla v\,d\tau + \int_\Omega v\,\nabla^2 u\,d\tau = \int_{\partial\Omega} v\,\frac{\partial u}{\partial n}\,dS. $$
Or equivalently, by (6.4.11),
$$ \int_\Omega \nabla u\cdot\nabla v\,d\tau + \int_\Omega buv\,d\tau = \int_\Omega fv\,d\tau \tag{6.4.12} $$
for every v ∈ H¹(Ω). If f ∈ L²(Ω), then we define a weak solution of (6.4.11) as a u ∈ H¹(Ω) satisfying (6.4.12). Consider the bilinear form associated with the operator A = −∇² + b:
$$ a(u, v) = \int_\Omega \big[(\nabla u\cdot\nabla v) + buv\big]\,d\tau. \tag{6.4.13} $$
Clearly, a is a bilinear form on H¹(Ω) and
$$ a(u, v) = \int_\Omega (\nabla u\cdot\nabla v + buv)\,d\tau \le \max(1, b_1)\int_\Omega (\nabla u\cdot\nabla v + uv)\,d\tau = M(u, v) \le M\|u\|\,\|v\|, $$
where 0 < b ≤ b₁ and M = max(1, b₁), so a is continuous. On the other hand,
$$ a(u, u) = \int_\Omega (\nabla u\cdot\nabla u + bu^2)\,d\tau \ge \min(1, b_0)\,\|u\|^2, $$
where 0 < b₀ ≤ b. Therefore, a is a continuous and coercive bilinear form. Then, by the Lax–Milgram Theorem, there exists a unique solution u ∈ H¹(Ω) such that
$$ a(u, v) = (f, v) \tag{6.4.14} $$
for all v ∈ H¹(Ω). This u is called the weak solution of the equation Au = f; that is, u is the unique solution of the Neumann boundary value problem (6.4.11). Furthermore, the solution minimizes the functional
$$ J(v) = \frac{1}{2}\int_\Omega (\nabla v\cdot\nabla v + bv^2)\,d\tau - \int_\Omega fv\,d\tau. \tag{6.4.15} $$

Example 6.4.1.
Consider the boundary value problem
$$ -\nabla^2 u + a_0 u = f \quad\text{in } \Omega, \tag{6.4.16a} $$
$$ u = 0 \quad\text{on } \partial\Omega, \tag{6.4.16b} $$
where a₀ is a positive constant. Set Tu = −∇²u + a₀u. Define an inner product in H₀¹(Ω),
$$ (u, v) = \int_\Omega (u_xv_x + u_yv_y + uv)\,dx\,dy, \tag{6.4.17} $$
a bilinear form in H₀¹(Ω),
$$ a(u, v) = (v, Tu) = \int_\Omega v\,(-\nabla^2 u + a_0u)\,dx\,dy, \tag{6.4.18} $$
and a functional on H₀¹(Ω),
$$ I(v) = \int_\Omega fv\,dx\,dy. \tag{6.4.19} $$
A quadratic form for this problem can be defined in H₀¹(Ω) by
$$ J(u) = \frac{1}{2}a(u, u) - I(u) = \int_\Omega\Big[\frac{1}{2}\big\{(u_x^2 + u_y^2) + a_0u^2\big\} - fu\Big]\,dx\,dy. $$
The bilinear form a is symmetric, bounded, and positive definite. The boundedness follows from the Schwarz inequality:
$$ |a(u, v)| \le \Big(\int_\Omega (|u_x|^2 + |u_y|^2)\,dx\,dy\Big)^{1/2}\Big(\int_\Omega (|v_x|^2 + |v_y|^2)\,dx\,dy\Big)^{1/2} + a_0\Big(\int_\Omega |u|^2\,dx\,dy\Big)^{1/2}\Big(\int_\Omega |v|^2\,dx\,dy\Big)^{1/2} \le K\|u\|\,\|v\|, $$
where K = max(1, a₀).
The positive definiteness follows from (6.4.18) by setting u = v:
$$ a(u, u) = \int_\Omega (|\nabla u|^2 + a_0u^2)\,dx\,dy \ge \alpha\int_\Omega (|\nabla u|^2 + |u|^2)\,dx\,dy = \alpha\|u\|^2, $$
where α = min(1, a₀). Note that I(v) is bounded. Hence it follows from the Lax–Milgram Theorem that the problem a(u, v) = I(v) has a unique solution in H₀¹(Ω).

We can generalize the preceding result to cover the case of second order elliptic equations defined on an open bounded set Ω ⊂ ℝ^N with smooth boundary ∂Ω. We now consider the boundary value problem
$$ Tu = f \quad\text{in } \Omega\subset\mathbb{R}^N, \tag{6.4.20a} $$
$$ u = 0 \quad\text{on } \partial\Omega, \tag{6.4.20b} $$
where
$$ Tu = -\sum_{i,j=1}^{N}\frac{\partial}{\partial x_i}\Big[a_{ij}\frac{\partial u}{\partial x_j}\Big] + a_0u, $$
a_ij ∈ 𝒞¹(cl Ω) for 1 ≤ i, j ≤ N, a₀ ∈ 𝒞¹(cl Ω), and x = (x₁, ..., x_N) ∈ ℝ^N. The differential operator T is said to be in divergence form. It is called uniformly elliptic if the ellipticity condition
$$ \sum_{i,j=1}^{N} a_{ij}(x)\,\xi_i\xi_j \ge K|\boldsymbol{\xi}|^2 = K(\xi_1^2 + \cdots + \xi_N^2) \tag{6.4.21} $$
is satisfied for all ξ ∈ ℝ^N and all x ∈ Ω, where K is positive and independent of x and ξ. If f ∈ L²(Ω), a weak solution of (6.4.20) is a function u ∈ H₀¹(Ω) satisfying
$$ \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \int_\Omega a_0uv\,d\tau = \int_\Omega fv\,d\tau \tag{6.4.22} $$
for all v ∈ H₀¹(Ω). It can readily be verified that every classical solution is a weak solution. Conversely, every sufficiently smooth weak solution is a classical solution. We next define a bilinear form in H₀¹(Ω) by
$$ a(u, v) = \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \int_\Omega a_0uv\,d\tau \tag{6.4.23} $$
and the norm
$$ \|u\|_1 = \Big(\int_\Omega (|\nabla u|^2 + u^2)\,d\tau\Big)^{1/2}. \tag{6.4.24} $$
If a₀(x) ≥ 0 for all x ∈ Ω, then, in view of the ellipticity condition (6.4.21),
$$ a(u, u) = \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\,d\tau + \int_\Omega a_0u^2\,d\tau \ge \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\,d\tau \ge K\int_\Omega |\nabla u|^2\,d\tau. \tag{6.4.25} $$
It can be checked that the form a(u, v) is bounded in H₀¹(Ω), that is,
$$ |a(u, v)| \le M\|u\|_1\|v\|_1 \tag{6.4.26} $$
for some constant M and all u, v ∈ H₀¹(Ω). If a is symmetric, that is, a_ij = a_ji for all 1 ≤ i, j ≤ N, then by the Lax–Milgram Theorem there exists a unique solution u ∈ H₀¹(Ω) such that
$$ a(u, v) = (f, v) \tag{6.4.27} $$
for all v ∈ H₀¹(Ω). Consequently, u satisfies the equation (6.4.22). In other words, the unique solution u minimizes the functional
$$ J(v) = \frac{1}{2}\int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial v}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \frac{1}{2}\int_\Omega a_0v^2\,d\tau - \int_\Omega fv\,d\tau \tag{6.4.28} $$
on H₀¹(Ω). To define a weak solution through (6.4.22), it suffices to assume that a_ij and a₀ are bounded on Ω. Hence u is the weak solution of the equation Tu = f; that is, u is the unique weak solution of the elliptic boundary value problem (6.4.20).

More generally, we consider the following second order elliptic boundary value problem:
$$ Tu = f \quad\text{in } \Omega\subset\mathbb{R}^N, \qquad u = 0 \quad\text{on } \partial\Omega, \tag{6.4.29} $$
where
$$ Tu = -\sum_{i,j=1}^{N}\frac{\partial}{\partial x_i}\Big[a_{ij}\frac{\partial u}{\partial x_j}\Big] + \sum_{i=1}^{N} a_i\frac{\partial u}{\partial x_i} + a_0u, $$
the a_ij satisfy the ellipticity condition (6.4.21), and a_i ∈ 𝒞(cl Ω), 1 ≤ i ≤ N. A weak solution is a u ∈ H₀¹(Ω) satisfying
$$ a(u, v) = (f, v) \tag{6.4.30} $$
for every v ∈ H₀¹(Ω), where
$$ a(u, v) = \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \int_\Omega \sum_{i=1}^{N} a_i\frac{\partial u}{\partial x_i}\,v\,d\tau + \int_\Omega a_0uv\,d\tau. \tag{6.4.31} $$
This bilinear form is not always symmetric. If it is bounded and coercive, then there exists a unique solution by the Lax–Milgram Theorem.
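The weak formulation a(u, v) = (f, v) is also the starting point of the Galerkin method. The sketch below (this editor's illustration, not from the book) assembles the standard piecewise-linear Galerkin system for the one dimensional model problem −u″ = f on (0, 1) with u(0) = u(1) = 0, where a(u, v) = ∫ u′v′ dx; the data f = π² sin πx is chosen so that the exact solution sin πx is known.

```python
import numpy as np

# Galerkin method for a(u, v) = (f, v), a(u, v) = ∫ u'v' dx on (0, 1),
# with u(0) = u(1) = 0, using piecewise-linear "hat" basis functions.
n = 100                      # number of interior nodes
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)

# Stiffness matrix a(φ_i, φ_j): tridiagonal with 2/h on the diagonal, -1/h off it.
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h

f = np.pi**2 * np.sin(np.pi * x)   # right-hand side; exact solution is sin(πx)
b = h * f                          # lumped load vector ∫ f φ_i ≈ h f(x_i)

u = np.linalg.solve(A, b)
err = np.max(np.abs(u - np.sin(np.pi * x)))
print(f"max nodal error: {err:.2e}")   # O(h²) convergence
```

The assembled system coincides with the classical three-point finite-difference scheme here; the point is that it was derived purely from the bilinear form, never using second derivatives of u.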
6.5. Examples of Applications of Fourier Transforms to Partial Differential Equations

Example 6.5.1 (One Dimensional Diffusion Equation with No Sources or Sinks). Consider the initial value problem for the one dimensional diffusion equation with no sources or sinks:
$$ u_t = Ku_{xx}, \qquad -\infty < x < \infty,\ t > 0, \tag{6.5.1} $$
where K is a constant, with the initial data
$$ u(x, 0) = f(x). \tag{6.5.2} $$
This kind of problem can often be solved by the use of the Fourier transform
$$ \tilde u(k, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ikx}\,u(x, t)\,dx. $$
When the Fourier transform is applied to (6.5.1) and (6.5.2) we obtain
$$ \tilde u_t = -Kk^2\tilde u, \qquad \tilde u(k, 0) = \tilde f(k). $$
The solution of the transformed system is
$$ \tilde u(k, t) = \tilde f(k)\,e^{-Kk^2t}. \tag{6.5.3} $$
The inverse Fourier transform gives the solution
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \tilde f(k)\,e^{ikx - Kk^2t}\,dk, $$
which is, by the Convolution Theorem 4.11.7,
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)\,g(x - \xi)\,d\xi, \tag{6.5.4} $$
where
$$ g(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx - Ktk^2}\,dk = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\Big[-Kt\Big(k - \frac{ix}{2Kt}\Big)^2 - \frac{x^2}{4Kt}\Big]\,dk = \frac{1}{\sqrt{2Kt}}\exp\Big(-\frac{x^2}{4Kt}\Big). $$
Thus the solution (6.5.4) becomes
$$ u(x, t) = \frac{1}{\sqrt{4\pi Kt}}\int_{-\infty}^{\infty} f(\xi)\exp\Big[-\frac{(x-\xi)^2}{4Kt}\Big]\,d\xi. \tag{6.5.5} $$
The integrand involved in the integral solution consists of the initial data f(x) and the Green's function G(x, t):
$$ G(x, t) = \frac{1}{\sqrt{4\pi Kt}}\exp\Big[-\frac{(x-\xi)^2}{4Kt}\Big]. \tag{6.5.6} $$
Since
$$ \lim_{t\to 0^+}\frac{1}{\sqrt{4\pi Kt}}\exp\Big[-\frac{(x-\xi)^2}{4Kt}\Big] = \delta(x - \xi), \tag{6.5.7} $$
if we let t → 0⁺ the solution becomes
u(x, 0) = f(x).

Consider now the initial value problem
$$ u_t = u_{xx} + u_{yy}, \qquad -\infty < x, y < \infty,\ t > 0, \tag{6.5.8} $$
$$ u(x, y, 0) = f(x, y). \tag{6.5.9} $$
The function
$$ g(x, y, t) = \frac{1}{4\pi t}\exp\Big[-\frac{x^2 + y^2}{4t}\Big] \tag{6.5.10} $$
satisfies the equation (6.5.8). From this we can construct the formal solution
$$ u(x, y, t) = \frac{1}{4\pi t}\int_{\mathbb{R}^2} f(\xi, \eta)\exp\Big[-\frac{(x-\xi)^2 + (y-\eta)^2}{4t}\Big]\,d\xi\,d\eta. \tag{6.5.11} $$
Similarly, a formal solution of the initial value problem for the three dimensional diffusion equation
$$ u_t = u_{xx} + u_{yy} + u_{zz}, \qquad -\infty < x, y, z < \infty,\ t > 0, \tag{6.5.12} $$
$$ u(x, y, z, 0) = f(x, y, z), \tag{6.5.13} $$
is
$$ u(x, y, z, t) = \frac{1}{(4\pi t)^{3/2}}\int_{\mathbb{R}^3} f(\boldsymbol{\xi})\exp\Big[-\frac{|\mathbf{x} - \boldsymbol{\xi}|^2}{4t}\Big]\,d\boldsymbol{\xi}, \tag{6.5.14} $$
where x = (x, y, z) and ξ = (ξ, η, ζ).
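The Green's-function solution (6.5.5) is easy to check numerically. The snippet below (this editor's sketch; the Gaussian data f(x) = e^{−x²}, K = 1, and the grid are illustrative choices) evaluates the integral by quadrature and compares it with the closed-form solution for Gaussian data, u(x, t) = (1 + 4t)^{−1/2} e^{−x²/(1+4t)}.

```python
import numpy as np

# Heat equation u_t = K u_xx via the Green's function (6.5.5):
# u(x, t) = (4πKt)^(-1/2) ∫ f(ξ) exp(-(x-ξ)²/(4Kt)) dξ.
K, t = 1.0, 0.3
xi = np.linspace(-20.0, 20.0, 4001)        # quadrature grid for ξ
dx = xi[1] - xi[0]
f = np.exp(-xi**2)                          # initial data (editor's choice)

def u_green(x):
    kernel = np.exp(-(x - xi)**2 / (4*K*t)) / np.sqrt(4*np.pi*K*t)
    return np.sum(f * kernel) * dx          # simple quadrature of (6.5.5)

for x in (0.0, 0.5, 1.0):
    exact = np.exp(-x**2 / (1 + 4*t)) / np.sqrt(1 + 4*t)
    print(x, u_green(x), exact)             # the two values agree closely
```

The closed form follows because the convolution of two Gaussians is again a Gaussian whose "variances" add, which is exactly what the kernel (6.5.6) does to Gaussian initial data.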
Example 6.5.2 (One Dimensional Wave Equation). Obtain the d'Alembert solution of the Cauchy problem for the one dimensional wave equation
$$ u_{tt} = c^2u_{xx}, \qquad -\infty < x < \infty,\ t > 0, \tag{6.5.15} $$
$$ u(x, 0) = f(x), \qquad u_t(x, 0) = g(x). \tag{6.5.16} $$
We apply the joint Fourier and Laplace transforms defined by
$$ \bar{\tilde u}(k, s) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ikx}\,dx\int_0^{\infty} e^{-st}\,u(x, t)\,dt. \tag{6.5.17} $$
The transformed Cauchy problem has the solution in the form
$$ \bar{\tilde u}(k, s) = \frac{s\tilde f(k) + \tilde g(k)}{s^2 + c^2k^2}. \tag{6.5.18} $$
The joint inverse transformation gives the solution
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx}\,\mathscr{L}^{-1}\Big\{\frac{s\tilde f(k) + \tilde g(k)}{s^2 + c^2k^2}\Big\}\,dk, \tag{6.5.19} $$
where $\mathscr{L}^{-1}$ is the inverse Laplace transform operator. Finally, we obtain
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx}\Big[\tilde f(k)\cos ckt + \frac{\tilde g(k)}{ck}\sin ckt\Big]\,dk $$
$$ = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{1}{2}\,e^{ikx}\big[e^{ickt} + e^{-ickt}\big]\tilde f(k)\,dk + \frac{1}{\sqrt{2\pi}}\,\frac{1}{2ic}\int_{-\infty}^{\infty}\frac{1}{k}\,e^{ikx}\big[e^{ickt} - e^{-ickt}\big]\tilde g(k)\,dk $$
$$ = \frac{1}{2}\big[f(x+ct) + f(x-ct)\big] + \frac{1}{\sqrt{2\pi}}\,\frac{1}{2c}\int_{-\infty}^{\infty}\tilde g(k)\,dk\int_{x-ct}^{x+ct} e^{ik\xi}\,d\xi $$
$$ = \frac{1}{2}\big[f(x+ct) + f(x-ct)\big] + \frac{1}{2c}\int_{x-ct}^{x+ct} g(\xi)\,d\xi. \tag{6.5.20} $$
This is the classical d'Alembert solution. It can be shown, by direct substitution, that it is the unique solution of the wave equation provided f is twice continuously differentiable and g is once differentiable. This essentially proves the existence of the d'Alembert solution. It can also be shown, by direct substitution, that the solution (6.5.20) is uniquely determined by the initial data. It is important to point out that the solution u depends only on the initial values at points between x- ct and x + ct and not at all on initial values outside this interval on the line t = 0. This interval is called the domain of dependence of the variables (x, t). Moreover, the solution depends continuously on the initial data, that is, the problem is well posed.
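The d'Alembert formula (6.5.20) can be verified numerically by finite differences; the sketch below (this editor's check, with f, g, and c chosen for illustration) confirms the wave equation and both initial conditions.

```python
import numpy as np

# d'Alembert solution (6.5.20) with illustrative data:
# f(x) = exp(-x²), g(x) = cos x (whose antiderivative is sin x), c = 2.
c = 2.0
f = lambda x: np.exp(-x**2)
G = lambda x: np.sin(x)                  # antiderivative of g(x) = cos x

def u(x, t):
    return 0.5*(f(x + c*t) + f(x - c*t)) + (G(x + c*t) - G(x - c*t))/(2*c)

x, t, h = 0.7, 0.9, 1e-4
u_tt = (u(x, t+h) - 2*u(x, t) + u(x, t-h)) / h**2
u_xx = (u(x+h, t) - 2*u(x, t) + u(x-h, t)) / h**2
print(abs(u_tt - c**2 * u_xx))           # small: u_tt = c² u_xx holds
print(abs(u(x, 0.0) - f(x)))             # 0: initial displacement f(x)
u_t0 = (u(x, h) - u(x, -h)) / (2*h)
print(abs(u_t0 - np.cos(x)))             # small: initial velocity g(x)
```

Varying x and t in this check also illustrates the domain of dependence: u(x, t) only samples the data on [x − ct, x + ct].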
Example 6.5.3 (Laplace's Equation in a Half-Plane). We consider the Dirichlet problem consisting of the Laplace equation
$$ u_{xx} + u_{yy} = 0, \qquad -\infty < x < \infty,\ y > 0. \tag{6.5.27} $$
Similarly, we can solve the Dirichlet problem for the three dimensional Laplace equation in the half-space:
$$ u_{xx} + u_{yy} + u_{zz} = 0, \qquad -\infty < x, y < \infty,\ z > 0. $$

Example 6.5.4. The potential is given by
$$ \phi(x, y) = \frac{U}{2\pi}\int_{-\infty}^{\infty}\frac{\sin ka}{k\,|k|}\,e^{ikx - |k|y}\,dk. \tag{6.5.37} $$
Thus the velocity component in the y direction is given by
$$ v = -\phi_y = \frac{U}{2\pi}\,\mathrm{Re}\int_{-\infty}^{\infty}\frac{\sin ka}{k}\,e^{ikx - |k|y}\,dk = \frac{U}{2\pi}\int_{-\infty}^{\infty}\frac{\sin ka}{k}\cos kx\;e^{-|k|y}\,dk = \frac{U}{4\pi}\int_{-\infty}^{\infty}\big\{\sin k(x+a) - \sin k(x-a)\big\}\,\frac{e^{-|k|y}}{k}\,dk, \tag{6.5.38} $$
where Re stands for the real part. Using the result
$$ \int_0^{\infty}\frac{\sin ak}{k}\,e^{-ky}\,dk = \frac{\pi}{2} - \tan^{-1}\frac{y}{a}, $$
the above solution for v becomes
$$ v = \frac{U}{2\pi}\Big[\tan^{-1}\frac{y}{x-a} - \tan^{-1}\frac{y}{x+a}\Big]. \tag{6.5.39} $$
Similarly, for the x-component of the velocity we obtain
$$ u = -\phi_x = -\frac{iU}{2\pi}\int_{-\infty}^{\infty}\frac{\sin ka}{|k|}\,e^{ikx - |k|y}\,dk = \frac{U}{2\pi}\ln\frac{r_2}{r_1}, \tag{6.5.40} $$
where r₁² = (x − a)² + y² and r₂² = (x + a)² + y². Introducing a complex potential w = φ + iψ, we obtain
$$ \frac{dw}{dz} = \frac{\partial\phi}{\partial x} - i\frac{\partial\phi}{\partial y} = -u + iv, \tag{6.5.41} $$
which can be written, by (6.5.38)–(6.5.40), in the form
$$ \frac{dw}{dz} = \frac{U}{2\pi}\Big[\ln\frac{r_1}{r_2} + i(\theta_1 - \theta_2)\Big] = \frac{U}{2\pi}\ln\frac{z-a}{z+a}, \tag{6.5.42} $$
where tan θ₁ = y/(x − a) and tan θ₂ = y/(x + a). Integrating (6.5.42) with respect to z gives the complex potential
$$ w = \frac{U}{2\pi}\big[2a + (z-a)\ln(z-a) - (z+a)\ln(z+a)\big]. \tag{6.5.43} $$
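Equations (6.5.39), (6.5.40), and (6.5.43) can be cross-checked: differentiating the complex potential w(z) of (6.5.43) numerically should reproduce dw/dz = −u + iv of (6.5.41). The snippet below (this editor's check; U, a, and the sample point are illustrative) does exactly that.

```python
import cmath
import math

# Cross-check of (6.5.39)-(6.5.43): numerical dw/dz should equal -u + iv.
U, a = 3.0, 1.0

def w(z):
    return U/(2*math.pi) * (2*a + (z - a)*cmath.log(z - a)
                                - (z + a)*cmath.log(z + a))

z = 0.8 + 1.3j                       # a point in the upper half plane
h = 1e-6
dw = (w(z + h) - w(z - h)) / (2*h)   # centered numerical derivative

x, y = z.real, z.imag
r1 = math.hypot(x - a, y)
r2 = math.hypot(x + a, y)
u = U/(2*math.pi) * math.log(r2/r1)                                # (6.5.40)
v = U/(2*math.pi) * (math.atan2(y, x - a) - math.atan2(y, x + a))  # (6.5.39)
print(abs(dw - (-u + 1j*v)))          # small: the three formulas agree
```

`atan2` is used for θ₁ and θ₂ so that the angles stay in the correct quadrants for points with x < ±a, matching the principal branch of the complex logarithm in the upper half plane.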
Example 6.5.5 (The Navier–Stokes Equation). The Navier–Stokes equation in a viscous fluid of constant density ρ and constant viscosity ν with no external forces is
$$ \frac{D\mathbf{u}}{Dt} = -\frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u}, \tag{6.5.44} $$
where u = (u, v, w) is the local Eulerian fluid velocity at the point x = (x, y, z) and time t, p(x, t) is the pressure, and
$$ \frac{D}{Dt} = \frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla \tag{6.5.45} $$
is the total derivative following the motion, which consists of an unsteady term and a convective term. We next introduce the vorticity vector ω = (ξ, η, ζ) = curl u in rectangular Cartesian coordinates:
$$ \xi = \frac{\partial w}{\partial y} - \frac{\partial v}{\partial z}, \qquad \eta = \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}, \qquad \zeta = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}. \tag{6.5.46a,b,c} $$
Using the vector identity
$$ \mathbf{u}\times\operatorname{curl}\mathbf{u} = \tfrac{1}{2}\nabla(\mathbf{u}\cdot\mathbf{u}) - \mathbf{u}\cdot\nabla\mathbf{u} \tag{6.5.47} $$
with q² = u · u, Equation (6.5.44) assumes the form
$$ \frac{\partial\mathbf{u}}{\partial t} - \mathbf{u}\times\boldsymbol{\omega} = -\nabla\Big(\frac{p}{\rho} + \frac{1}{2}q^2\Big) + \nu\nabla^2\mathbf{u}. \tag{6.5.48} $$
Taking the curl of both sides of this equation, the pressure term disappears, and hence we get
$$ \frac{\partial\boldsymbol{\omega}}{\partial t} = \operatorname{curl}(\mathbf{u}\times\boldsymbol{\omega}) + \nu\nabla^2\boldsymbol{\omega}, \tag{6.5.49} $$
in which the continuity equations ∇ · u = 0 and ∇ · ω = 0 are used. Equation (6.5.49) can also be written in the form
$$ \frac{D\boldsymbol{\omega}}{Dt} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \nu\nabla^2\boldsymbol{\omega}. \tag{6.5.50} $$
This equation (or its equivalent form (6.5.49)) is called the vorticity transport equation, and represents the rate of change of the vorticity ω, which
is represented by the three terms on the right-hand side of (6.5.49). The first term, u · ∇ω, is the familiar rate of change due to convection of fluid, in which the vorticity is non-uniform, past a given point. The second term, ω · ∇u, describes the stretching of vortex lines, and the last term, ν∇²ω, represents the rate of change of ω due to molecular diffusion of vorticity, in exactly the way that ν∇²u represents the contribution to the acceleration from the diffusion of velocity (or momentum). In the case of two dimensional flow, ω is everywhere normal to the plane of the flow, and ω · ∇u = 0. Equation (6.5.50) then reduces to the scalar equation
$$ \frac{D\zeta}{Dt} = \nu\nabla^2\zeta, \tag{6.5.51} $$
so that only convection and viscous conduction occur. In terms of the stream function ψ, where u = ψ_y and v = −ψ_x (and ζ = −∇²ψ) satisfy the continuity condition identically, Equation (6.5.49) assumes the form
$$ \Big(\frac{\partial}{\partial t} + \frac{\partial\psi}{\partial y}\frac{\partial}{\partial x} - \frac{\partial\psi}{\partial x}\frac{\partial}{\partial y}\Big)\nabla^2\psi = \nu\nabla^4\psi. \tag{6.5.52} $$
In the steady state ∂/∂t = 0, and if the velocity of the fluid is very small and the viscosity is very large, all terms on the left hand side of (6.5.52) can be neglected in the first approximation. Consequently, (6.5.52) reduces to the biharmonic equation
$$ \nabla^4\psi = 0. \tag{6.5.53} $$
We solve this biharmonic equation for the viscous fluid bounded by the plane y = 0, with the fluid introduced through a strip |x| < a.
(5) Construct a test function φ such that φ(x) = 1 for |x| ≤ 1 and φ(x) = 0 for |x| ≥ 2.

(6) Which of the following expressions define a distribution?
(a) $(f, \phi) = \sum_{n=1}^{m}\phi^{(n)}(0)$;
(b) $(f, \phi) = \sum_{n=1}^{m}\phi(x_n)$, where $x_1, \ldots, x_m \in \mathbb{R}$ are fixed;
(c) $(f, \phi) = \sum_{n=1}^{\infty}\phi^{(n)}(0)$;
(d) $(f, \phi) = \sum_{n=1}^{\infty}\phi(x_n)$, where $x_1, x_2, \ldots \in \mathbb{R}$ are fixed;
(e) $(f, \phi) = \sum_{n=1}^{m}\phi^{(n)}(x_n)$, where $x_1, \ldots, x_m \in \mathbb{R}$ are fixed;
(f) $(f, \phi) = (\phi(0))^2$;
(g) $(f, \phi) = \sup\phi$;
(h) $(f, \phi) = \int_{-\infty}^{\infty}|\phi(t)|\,dt$;
(i) $(f, \phi) = \int_0^1 \phi(t)\,dt$;
(j) $(f, \phi) = \sum_{n=1}^{\infty}\phi(x_n)$, where $\lim_{n\to\infty}x_n = 0$.

(7) Let φ_n → φ and ψ_n → ψ in 𝒟. Prove the following:
(a) aφ_n + bψ_n → aφ + bψ for any scalars a, b;
(b) fφ_n → fφ for any smooth function f defined on ℝ^N;
(c) φ_n ∘ A → φ ∘ A for any affine transformation A of ℝ^N onto ℝ^N;
(d) D^α φ_n → D^α φ for any multi-index α.
is a distribution.
(9) Find the nth distributional derivative of f(x) = lxl( 10) Let fn (x) =sin
nx. Show that fn ~ 0 in the distributional sense.
( 11) Let Un} be the sequence of functions on R defined by 0, fn(x)= n, { 0,
if x 1j2n.
Show that the sequence converges to the Dirac delta distribution.
328
Applications
( 12) Show that the sequence of Gaussian functions on R defined by n = 1, 2, ... ,
converges to the Dirac delta distribution.
( 13) Show that the sequence of functions on R defined by r
Jn(X
)
sm nx =--,
n = 1, 2, ... ,
7TX
converges to the Dirac delta distribution.
Cro
( 14) Let ¢ 0 E 0)(R) be a fixed test function such that ¢ 0 (x) dx = 1. Show that every test function ¢ E 0)(R) can be represented in the form 4> = Kcf>o+ cP1, where K is a constant and ¢ 1 is a test function such that Moreover, the representation is unique.
Cro ¢
1
(x) dx = 0.
( 15) The fundamental solution of the one dimensional diffusion equation satisfies the equation G,- KGxx = 8(x- na(t- r),
-co<xO.
Show that
n J 2
G(x, t;
g,
H ( t - T) [ (Xr)= v'4K7T(t-r) exp --4K'----(t....:-c..:...r_) ·
Hence obtain the solution of the non-homogeneous equation
-co< x 0.
u,- Kuxx = f(x, t),
( 16) Find the fundamental solution for the one dimensional diffusion equation -co<xO.
( 17) Apply the joint Fourier and Laplace transforms to obtain the Green's function for the wave equation 0
11 -
2
c Gxx
= 8(x)8(t),
G(x, 0) = G,(x, 0) = 0.
-co<xO,
329
Generalized Functions and Partial Differential Equations
(18) (a) Show that the fundamental solution G(x, g, t) for the Cauchy problem G,(x, 0) = 8(x- g),
G(x, 0) = 0, is
1 G(x, g, t) = - [H(x- g+ ct)- H(x- g- ct)]. 2c
(b) Use this fundamental solution to solve a more general wave problem
u11
=
c2 uxx,
-co< x 0,
u(x,O)=O,
u,(x,O)=g(x).
(19) Prove the existence of the weak solution of the Dirichlet boundary value problem u =0
on
an,
where c is a positive function of x and y. Show that the weak solution is given by
L
2
v(-\l u+cu) dr= Lfvdr,
where u, v E Hb(n).
(20) Show that the Dirichlet problem for the biharmonic operator t/u = f inn, f E L\n),
au an
u=-=0
where
nc
on
an,
RN, has a weak solution u E H~(n) given by
L
t.u t.v dr =
L
fv dr
for every v E
H~(n).
(21) Show that the boundary value problem -t.u+u=f
in RN,
u-?-0
/EL2 (RN),
as lxl-?-co
1
has a unique solution u E H (RN) such that
r
JRN
\1 u . \1 v dr +
r
JRN
uv dr =
r
JRN
fv dr
330
Applications
(22) Let n c RN be a bounded open set. Consider the Robin boundary value problem
/E L\0),
inn,
-tw+u=f
au an
on
-+au=O
an,
a>O.
Show that there exists a unique solution u E H6(0) such that for every v E Hb(O),
a ( u, v) = (f, v)
where a ( u, v) =
Jnr V' u . V' v dr + Jnr uv dr + a
f
uv dr,
u,
VE
Hb(O).
an
(23) Use the Fourier transform method to show that the solution of the telegrapher's problem U11 + au 1 + bu =
C
2
-co< X, t 0, u(x,O)=f(x), -co<xtJ.i is the work done in this displacement, then
Qj 8% = =
L (X; 8x; + Y; 8y; + Z; 8z;) """' ax; Y;-+Z;ay; az;) t>qj· L ( X;-+ aqj aqj aqj
Substituting this result into (7.2.17) we obtain
~ (:~)- : : = Qj.
(7.2.18)
These are called Lagrange's equations of motion for a holonomic dynamical system. A great advantage of these equations for the solution of dynamical problems is that forces which do no work, such as reactions at smooth hinges, do not appear. However, these are to be included in the usual equations of motion. An important modification of (7.2.18) can be made for conservative dynamical systems. In such cases,
av
Qj=- a-' qj
(7.2.19)
where V is the potential energy of the system and is a function of the generalized coordinates q~> q 2 , ••• , q3 n and possibly of the time t. Since V does not involve the velocities qh a V1aqj = 0. Thus Lagrange's equation (7.2.18) can be written in terms of Las
~ (;~)- ;~ = 0,
j=1,2, ... ,3n.
(7.2.20)
If the kinetic energy of the system Tis a homogeneous quadratic function of velocities qh then
2T=
(7.2.21)
339
Mathematical Foundations of Quantum Mechanics
Assuming that V does not involve the time t explicitly, it follows from (7.2.20) that
.!!_ (T+ V) = .!!_ (2T- L) = L [.!!_(tv a~)- ija~tv aL] 1 dt
dt
=
dt
aqj
aqj
L. [ cij ~ (:~)- cij ;~] = o.
atJ.i
(7.2.22)
This shows that T+ V =constant for a conservative system. In order to discuss the so called Hamilton equations of motion, we introduce the concepts of generalized momentum pj and generalized force Fj by aL
(7.2.23a)
p=-
aqj
1
and aL aqj.
(7.2.23b)
F=1
Consequently, the Lagrange equations (7.2.20) become aL d aqj = dt Pj =Ph
(7.2.24)
where p; and qj are usually called conjugate variables. In Cartesian coordinates this result reduces to the familiar equation for aL aT
p =-=-=mx.
x, ax ax
r
(7.2.25)
Any equations that will hold in any coordinate system are of the most interest and useful for applications. To develop such equations of motion, we now introduce a new function, called the Hamiltonian (or Hamilton's function) H, which is defined in terms of the Lagrangian Land two conjugate variables pj and tJ.i by H(p, q) =
LPAj- L,
(7.2.26)
j
where L= L(q;, 4;, t) is, in general, a function of q;, cj,, t and qj enters through the kinetic energy as a quadratic term. Equation (7.2.23a) will give P; as a linear function of cir This system of linear equations involving p, and cj, can be solved to determine ci; in terms of p,, and then the cj,s can in
340
Applications
principle be eliminated from (7.2.26). This essentially means that H can always be expressed as a function of Ph qj and t so that H = H(ph qh t). Thus """' aH """' aH aH dt. dH = L - dp + L - dq +apj 1 aqj 1 at
(7.2.27)
On the other hand, differentiation of H in (7.2.26) with respect to t gives
or dH = """' L p1 dq1 +
aL dq - """' aL aL L ri dp - LL - . dq - - dt. a% aqj at 1
"'l}
1
(7.2.29)
'
In view of (7.2.23a), this equation becomes dH
aL aL ="""' q· dp- """'- dn, - - dt. L L aqj at 1
1
(7.2.30)
"'l}
Evidently, the two expressions of dH given in (7.2.27) and (7.2.30) must be equal so that the coefficients of the corresponding differentials can be equated to obtain
(J;=-
aH apj,
(7.2.3la)
aL aH --aqj a tV '
(7.2.3lb)
aL aH --at at
(7.2.3lc)
Using the Lagrange equations (7.2.20) and (7.2.23a), the first two of the above equations become, for each j, dqj_ aH dt- apj' dpj
aH
dt=- aq/
(7.2.32a)
j = 1, 2, ... , 3 n.
(7.2.32b)
These are known as Hamilton's canonical equations of motion. They constitute a set of6n coupled first order equations that reflects the symmetry except for a negative sign. Thus the Hamilton equations are completely equivalent to Lagrange's equation (7.2.13) which represent a set of 3n
Mathematical Foundations of Quantum Mechanics
341
coupled second order differential equations. These equations possess a unique solution if the initial data are prescribed at some time t = t0 • In other words, Hamilton's equations completely determine the position and momentum at all times provided the initial data are given. This shows that the fundamental laws of classical mechanics are completely deterministic. The Lagrange-Hamilton theory can be employed to deduce (i) the law of conservation of energy, and (ii) that H is equal to the total energy. To derive (i), we assume that L, and therefore H, in (7.2.26) do not involve the time t explicitly. Consequently,
"""' ( pq+pn--n,--_ .. . . aL . aL q.. ) =L 1 1 1 "'l] aqj "'l] aqj 1 =
L (pA+ PAJ-PAJ-PA)=O.
This shows that H is constant. To prove the second property, we assume that the coordinate transformations (7.2.14) do not depend explicitly on time t. We note that Tin L = T-V is given by (7.2.16) where the coefficients ajk are symmetric functions of the generalized coordinates (];· On the other hand, Vis, in general, independent of qj and hence aL aT Pj = - . = - . = aqJ aqJ
3
L aJkqk. k~i "
Thus the Hamiltonian H becomes
=2T-L= T+ V Thus H is equal to the total energy. Since H was proved to be a constant, the sum of the kinetic and potential energies is constant. This is the celebrated law of conservation of energy. We consider a conservative holonomic system which is described by the generalized coordinates q 1 , q2 , ••• , qn. For any complete set of specified initial conditions each of the coordinates q; must be a single-valued function of time t. Thus we assume that these functions are known and have the form q 1 =q 1 (t),q 1 =q 2 (t), ... ,qn=q,.(t). Then these equations may be regarded as the parametric equations of a path in n-dimensional Euclidean
342
Applications
space and the motion of the system can be related to that of a point which moves along this path. Given the initial state, the motion in subsequent times is uniquely determined by Newton's laws of motion. Therefore, there exists a unique path in the n-dimensional Euclidean space for a given set of initial data. It is of interest to compare this path with another one in the n-dimensional space, the two paths having the same end-points and such that they are traversed in the same time r. Assuming that at any instant of time the difference between the positions of two points which trace out the two paths is infinitesimally small, we denote. the variation between the two paths at any instant by 8. Both paths have the same end-points so that 8q 1 (0) = 8q 1 ( r) = · · · = t>qn(O) = t>qn( r) = 0. However the variations 8q 1 (t), 8q 2 (t), ... , t>qn(t) do not vanish at any timet between 0 and r. Then it follows that
Thus
=
[L
-aL t)q.J •
aqj
J
,~T
-
[L
" 'aL . t)q.J
aqj
Jt=o
=
0'
since both the paths have the same end-points. Evidently
8[
(7.2.33)
Ldt=O.
This result is well known as Hamilton's variational principle provided the Lagrangian L= L(CJJ, qj). This principle was obtained from Newton's law of motion. Conversely, it can be shown that Newton's laws can be derived from Hamilton's principle if L = L( qj, q1 ). Clearly t>L=
[a~ &j1 +aL t>q,J. aq aq 1
1
343
Mathematical Foundations of Quantum Mechanics
By Hamilton's principle
aL ) dt 0=8 TLdt= IT 8Ldt=L IT (aL -. 8q1 +-8{jj o o a% a% Io
J [
aL aL 8{jj ]T ' d (aL) ='L T[ - -. 8{jj+-8qj dt+ 'L-. dt aqj aCJJ 8q1 o Io where the last result is obtained by integration by parts. Since 8q1 = 0 at t = 0 and t = r, the last term vanishes and the above expression gives
LIT [aL _.!!._(a~)] 8% dt = o o aqj dt aq; for all 8% and all r. Thus the integrand must vanish, which yields Lagrange's equation
:t (:~)-:~
=0.
(7.2.34)
Hence Newton's laws can be derived from these equations. Hamilton's principle shows that motion according to Newton's laws is distinguished from all other kinds of motion by having the property that the integral f L dt for any given time interval has a stationary value. Hence it is regarded as a fundamental principle of classical mechanics from which everything else can be derived.
Poisson's Brackets in Mechanics The equations of motion for any canonical function F(p;, q;, t) can be expressed, using Hamilton's equations (7.2.32ab ), as
f f (aFaq; aH _ aF aH) + aF ap; ap; aq; at
dF = (aF q;+ aF ]\) + aF dt i~l aq; ap; at =
;~ 1
aF ={F,H}+at,
(7.2.35)
where {F, H} is called the Poisson bracket of two functions F and H If the canonical function F does not explicitly depend on time t, then aFjat=O so that (7.2.35) becomes
dF dt={F, H}.
(7.2.36)
344
Applications
In addition, if {F, H} = 0, then F is a constant of the motion. In fact, (7.2.35) really includes the Hamilton equations which can be verified by setting F= p;, F= q; or F= H. It readily follows the definition of the Poisson bracket that (7.2.37a)
{q;,pJ= aij, {q;, qj}={p;,pj}=O,
(7.2.37bc)
where t>ij is the Kronecker delta notation. These are the fundamental Poisson brackets for the canonically conjugate variables p; and q;. Any relation involving Poisson's brackets must be invariant under a canonical transformation. This is often used as an alternative definition of a canonical transformation. It can also be verified that the components of the angular momentum L=rxp satisfy i,j, k
=
x, y, z in cyclic order
(7.2.38)
and (7.2.39) It also follows from the definition of the Poisson bracket that the derivative of a canonical function with respect to generalized coordinates qj is equal to the Poisson bracket of that function with the canonically conjugate momentum pj, that is,
(7.2.40) In particular, we obtain aF ax= {F, Px},
(7.2.41a)
aF ay ={F,py},
(7.2.41 b)
aF az ={F,pz},
(7.2.41c)
or equivalently, F(x + dx, y, z)
=
F(x, y, z) +{F, Px} dx,
(7.2.42a)
F(x, y + dy, z)
=
F(x, y, z) + {F, p,.} dy,
(7.2.42b)
F(x, y, z+ dz)
=
F(x, y, z) +{F, pj dz.
(7.2.42c)
345
Mathematical Foundations of Quantum Mechanics
Thus the canonical momenta Px, pY, Pz are called the generators of infinitesimal translations along the x, y, and z directions, respectively. In general, a mechanical description of a physical system requires the concepts of (i) variables or observables, (ii) states (iii) equations of motion. Physically measurable quantities are called observables. In classical mechanics, examples of variables or observables are position, momentum, angular momentum and energy which are the characteristics of a physical system. They can be measured experimentally and are represented by dynamical variables which are well defined functions of two canonically conjugate variables (generalized coordinates and generalized momenta). So the observables in classical mechanics are completely deterministic. There are states which describe values of the observables at given times. The state of a physical system at a time t = t0 > 0 is uniquely determined by the appropriate physical law and the initial state at t = 0. For example, the state of a system of n interacting particles is determined by assigning 3 n position coordinates and 3n velocity coordinates. Finally, there are equations of motion which determine how the values of the observables change in time. As mentioned above, Newton's equations, Lagrange's equations or Hamilton's equations are well known examples of equations of motion.
7.3. Basic Concepts and Postulates of Quantum Mechanics Classical physics breaks down at the levels of atoms and molecules. Historically, the first indication of a breakdown of classical ideas occurred in the rather complex phenomenon of the so called "black body radiation" which essentially deals with electromagnetic radiation in a container in equilibrium with its surroundings. In other words, the black body radiation is concerned with the thermodynamics of the exchange of energy between radiation and matter. According to principles of classical physics, this exchange of energy is assumed to be continuous in the sense that light of frequency v can give up any amount of energy on absorption, the exact amount in any particular case depending on the energy intensity of the light beam. Specifically, in 1900 Max Planck first postulated that the vibrating particles of matter are regarded to act as harmonic oscillators, and do not emit or absorb light continuously but instead only in discrete quantities. Mathematically, the radiation of frequency v can only exchange energy with matter in units of hv, where h is the Planck constant of numerical value h =21Th= 6.625 x 10- 27 erg sec= 4.14 x 10- 21
MeV sec
(7 .3.1)
346
Applications
and li is called the universal constant. Clearly, h has dimension (energy x time) of action which is a dynamical quantity in classical mechanics. Equivalently, Planck's quantum postulate can be stated by saying that radiation of frequency v behaves like a stream of photons of energy E = hv = liw,
(7.3.2)
which may be emitted or absorbed by matter where w = 21rv is the angular frequency. Clearly, Planck's constant h measures the degree of discreteness which was required to explain the energy distribution of the black body radiation. Thus the concept of discreteness is fundamental in quantum mechanics, but it is totally unacceptable in classical physics. Finally, it is important to point out that the Planck equation (7.3.2) is fairly general so that it can be applied to any quantum system as a fundamental relation between its energy E and the frequency v of an oscillation associated with the system. Also the failure of classical concepts when applied to the motion of electrons appeared most clearly in connection with the hydrogen atom. According to the Rutherford model, an atom can be considered as a negatively charged electron orbiting around a relatively massive, positively charged nucleus. With the neglect of radiation, this system is exactly similar to the motion of a planet round the sun, with gravitational attraction between the masses being replaced by the Coulomb attraction between the charges. The potential energy of the Coulomb attraction between the fixed nucleus charge +Ze and the electron of charge-e is V(r) = -Ze 2 jr. The hydrogen atom consists of two particles-the nucleus, a proton of mass mP and charge + e ( Z = 1), and an electron of mass me and charge -e. The nucleus is small and heavy (mp/me~2000) and the radius of the proton ~10- 3 times the atomic radius. According to the classical atomic theory of Rutherford, the attractive potential would cause the electron to orbit around the nucleus, and the orbiting electron constitutes a rapidly accelerating charge, which according to Maxwell's theory acts as a source of radiant energy. Thus the accelerated charged electron would continuously radiate energy, and in a matter of 10- 10 sec the electron should coalesce with the nucleus, causing the atom to collapse. 
On the other hand, the frequency of the emitted radiation is related to that of the electron in its orbit. As the electron radiates energy, this frequency, according to classical theory, must change rapidly but continuously, thus giving rise to radiation with a continuous range of frequencies. Thus the Rutherford atomic model has two important qualitative weaknesses: (i) The atom should be very unstable.
Mathematical Foundations of Quantum Mechanics
(ii) It should radiate energy over a continuous range of frequencies. Both of these results are totally contradicted by experiment. The original problem of quantum mechanics was to investigate the stability of atoms and molecules, and to explain the discrete frequency spectra of the radiation emitted by excited atoms. The remarkable success in predicting observed atomic and molecular spectra is one of the major triumphs of quantum mechanics. In this chapter, we present the basic principles of quantum mechanics as postulates which will then be used to discuss various consequences. No attempt will be made to derive or justify these postulates. Both the number and content of the basic postulates are to some extent a matter of individual choice. The postulates together with their consequences form a basic but limited theory of quantum mechanics. It has already been mentioned in previous sections that classical mechanics identifies the state of a physical system with the values of certain observables (for example, the position $x$ and the momentum $p$) of the system. On the other hand, quantum mechanics makes a very clear distinction between states and observables. So we begin with the first postulate concerning the state of a quantum system.

Postulate I (The State Vector). To every physical system in quantum mechanics there corresponds a separable Hilbert space over the complex number field. A state of the system is represented by a non-zero vector in this space, and every non-zero scalar multiple of a state vector represents the same state. Conversely, every non-zero vector in the Hilbert space represents a possible physical state of the system. The particular state vector to which the state of the system corresponds at time $t$ is denoted by $\Psi(x, t)$ and is called the time-dependent state vector of the system.
The state of a physical system is completely described by this state vector $\Psi(x, t)$ in the sense that almost all information about the system at time $t$ can be obtained from the vector $\Psi(x, t)$. Usually, a state vector is denoted by $\psi(x)$. In the Dirac notation, any general state vector $\psi(x)$ is written as
$$\psi(x) = \langle x \mid \psi\rangle \tag{7.3.3}$$
and its complex conjugate as
$$\overline{\psi(x)} = \langle \psi \mid x\rangle. \tag{7.3.4}$$
Applications
This postulate makes several assertions. First, all physical properties of the system are unchanged if the state vector is multiplied by a non-zero scalar. We can remove this arbitrariness by imposing the normalizing condition
$$\int |\psi(x)|^2\, dx = 1, \tag{7.3.5}$$
where the integral is taken over all admissible values of $x$, or equivalently
$$\int \langle\psi\mid x\rangle\langle x\mid\psi\rangle\, dx = \int |\psi(x)|^2\, dx = 1.$$

… where $\phi$ is the phase of the wave. This can also be expressed in the form (7.5.2), where
$$\psi(\mathbf{r}) = A(\mathbf{r})\exp[-i\phi(\mathbf{r})]. \tag{7.5.3}$$
According to the Planck equation ($E = h\nu$), if a quantum system has energy $E$, its state vector $|\Psi(\mathbf{r}, t)\rangle$ at time $t$ should contain a factor $\exp[-2\pi i\nu t] = \exp[-(iE/\hbar)t]$, so that
$$|\Psi(\mathbf{r}, t)\rangle = e^{-iEt/\hbar}\,|\Psi(\mathbf{r}, 0)\rangle. \tag{7.5.4}$$
Clearly, energy is an observable, and hence, for a system to have a definite energy $E$, it must be in an eigenstate of this observable. If this is the case, Equation (7.5.4) expresses the fact that the state vector at time $t$ differs from that at $t = 0$ only by a scalar factor, and so it describes the same physical state. For this reason an eigenstate of energy is called a stationary state of the system. The following postulate of quantum mechanics is concerned with the existence of the energy operator and the time development of a quantum system:

Postulate VI (a) (Hamiltonian Operator). For every physical system there exists a linear Hermitian operator $\hat{H}$, the so-called Hamiltonian operator, which represents the observable corresponding to the total energy of the system.

(b) (Schrödinger's Equation). If a physical system is not disturbed by any experiment, the Hamiltonian operator $\hat{H}$ determines the time development
of the state vector $\Psi(\mathbf{r}, t)$ of the system through the partial differential equation
$$i\hbar\,\frac{\partial \Psi}{\partial t} = \hat{H}\Psi(\mathbf{r}, t). \tag{7.5.5}$$
This is called the time-dependent Schrödinger equation, and represents the fundamental equation of motion in quantum mechanics. Its ultimate justification is that it leads to predictions which are in remarkable agreement with experimental findings. As with the equations of motion in classical mechanics, Equation (7.5.5) is completely deterministic in the sense that, given the state $\Psi(\mathbf{r}, t_0)$ at some time $t = t_0$, Equation (7.5.5) uniquely determines the state $\Psi(\mathbf{r}, t)$ at any other time $t$. The Hamiltonian operator $\hat{H}$ corresponds to the force in classical mechanics because the total energy includes the potential energy which gives the force in classical mechanics. So $\hat{H}$ is equivalent to the force field. For a single particle of mass $m$ moving in space, the classical state is described by the position and momentum vectors $(\mathbf{r}, \mathbf{p})$. If the particle is under the action of a force $\mathbf{F}(\mathbf{r})$ which is derived from a potential $V(\mathbf{r})$ so that $\mathbf{F} = -\nabla V$, the Hamiltonian function $H$ is
$$H(\mathbf{r}, \mathbf{p}) = T + V = \frac{p^2}{2m} + V(\mathbf{r}). \tag{7.5.6}$$
The Hamiltonian operator $\hat{H}$ corresponding to this classical function $H$ in quantum mechanics is obtained by replacing $\mathbf{r}$ and $\mathbf{p}$ with the operators $\hat{\mathbf{r}}$ and $\hat{\mathbf{p}} = -i\hbar\nabla$, respectively. Consequently, the Schrödinger equation (7.5.5) assumes the form
$$i\hbar\,\frac{\partial \Psi}{\partial t} = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\Psi = \hat{H}\Psi, \tag{7.5.7}$$
where
$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}) \tag{7.5.8}$$
and $\Psi(\mathbf{r}, t)$ belongs to the Hilbert space $L^2(\mathbb{R}^3)$. This postulate has three main consequences. First, it asserts that (7.5.4) can be derived as a solution of the Schrödinger equation (7.5.5). We assume that $\hat{H}$ has a purely discrete spectrum of eigenvalues, so that it has a complete set of eigenstates $|\psi_n(\mathbf{r})\rangle$ with corresponding eigenvalues $E_n$. Then $|\Psi(\mathbf{r}, t)\rangle$ can be expanded in terms of the complete eigenstates as
$$|\Psi(\mathbf{r}, t)\rangle = \sum_n a_n(t)\,|\psi_n(\mathbf{r})\rangle. \tag{7.5.9}$$
Substituting this into (7.5.5) and equating the coefficients of $|\psi_n\rangle$ yields
$$i\hbar\,\dot{a}_n(t) = E_n a_n(t). \tag{7.5.10}$$
Hence the solution of this equation is
$$a_n(t) = a_n(0)\,e^{-iE_n t/\hbar}, \tag{7.5.11}$$
where $a_n(0)$ is the initial value of $a_n(t)$ at $t = 0$. Thus the time-dependent solution is
$$|\Psi(\mathbf{r}, t)\rangle = \sum_n a_n(0)\,e^{-iE_n t/\hbar}\,|\psi_n(\mathbf{r})\rangle. \tag{7.5.12}$$
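The expansion (7.5.9)–(7.5.12) can be illustrated numerically in a finite-dimensional analogue. The following Python sketch (with $\hbar = 1$ and an arbitrary illustrative $2\times 2$ Hermitian matrix standing in for $\hat{H}$, neither of which comes from the text) expands an initial state in the eigenstates of $H$ and attaches the phase factors of (7.5.11):

```python
import numpy as np

# Sketch (hypothetical finite-dimensional analogue, hbar = 1): expand the
# initial state in the eigenstates of H and attach exp(-i*E_n*t) phases,
# as in (7.5.11)-(7.5.12).
def evolve(H, psi0, t):
    E, V = np.linalg.eigh(H)          # eigenvalues E_n, orthonormal eigenvectors
    a0 = V.conj().T @ psi0            # coefficients a_n(0) = <psi_n | Psi(0)>
    return V @ (np.exp(-1j * E * t) * a0)

H = np.array([[1.0, 0.5], [0.5, 2.0]])       # illustrative Hermitian matrix
psi0 = np.array([1.0, 0.0], dtype=complex)   # normalized initial state
psi_t = evolve(H, psi0, t=0.7)
```

Since the evolution only multiplies each coefficient by a unit-modulus phase, the norm of the state vector is preserved, consistent with the normalization (7.3.5).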
The second consequence of (7.5.5) is the de Broglie wave for free particles of definite momentum $p = \hbar k$, which is required to explain electron diffraction. The one-dimensional de Broglie wave is
$$\langle x \mid p\rangle = a\,e^{ikx} = a\,e^{ipx/\hbar}, \tag{7.5.13}$$
where $a$ is a constant. It is normalized by using the orthonormality condition for continuous eigenvalues, $\langle \alpha \mid \alpha'\rangle = \delta(\alpha - \alpha')$, where the simplest representation of the Dirac delta function is
$$\delta(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\alpha x}\, dx.$$
… where $\hat{H}^{(1)}(t)$ represents the time-dependent term that arises from the presence of an external field. In the absence of the latter term, the time-evolution operator is obtained from (7.6.11) in the form
$$\hat{U}_0(t, t_0) = \exp\left[-\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]. \tag{7.8.2}$$
Both the state vector $\Psi_I(t)$ and the operator $\hat{A}_I(t)$ depend on time $t$ and are defined by
$$\Psi_I(t) = \hat{U}_0^{-1}(t, t_0)\Psi_S(t) = \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\Psi_S(t), \tag{7.8.3}$$
$$\hat{A}_I(t) = \hat{U}_0^{-1}(t, t_0)\hat{A}_S\hat{U}_0(t, t_0) = \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\hat{A}_S\exp\left[-\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right], \tag{7.8.4}$$
where $\Psi_S(t)$ is the state vector and $\hat{A}_S$ is the operator in the Schrödinger picture, so that
$$i\hbar\,\frac{\partial \Psi_S}{\partial t} = \hat{H}(t)\Psi_S(t) = \left[\hat{H}^{(0)} + \hat{H}^{(1)}(t)\right]\Psi_S(t), \tag{7.8.5}$$
$$\frac{d\hat{A}_S}{dt} = \frac{\partial \hat{A}_S}{\partial t}. \tag{7.8.6}$$
It follows from (7.8.3) and (7.8.5) that
$$\begin{aligned}
i\hbar\,\frac{\partial \Psi_I}{\partial t} &= i\hbar\,\frac{\partial}{\partial t}\left\{\exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\Psi_S(t)\right\} \\
&= -\hat{H}^{(0)}\Psi_I + \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\left(i\hbar\,\frac{\partial \Psi_S}{\partial t}\right) \\
&= -\hat{H}^{(0)}\Psi_I + \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\left[\hat{H}^{(0)} + \hat{H}^{(1)}(t)\right]\Psi_S(t) \\
&= -\hat{H}^{(0)}\Psi_I + \hat{H}^{(0)}\Psi_I + \hat{U}_0^{-1}(t, t_0)\hat{H}^{(1)}(t)\hat{U}_0(t, t_0)\Psi_I(t) \\
&= \hat{H}_I^{(1)}(t)\Psi_I(t),
\end{aligned} \tag{7.8.7}$$
where
$$\hat{H}_I^{(1)}(t) = \hat{U}_0^{-1}(t, t_0)\hat{H}^{(1)}(t)\hat{U}_0(t, t_0). \tag{7.8.8}$$
On the other hand, it also follows from (7.8.4) and (7.8.6) that
$$\frac{d\hat{A}_I}{dt} = \frac{\partial \hat{A}_I}{\partial t} + \frac{1}{i\hbar}\left[\hat{A}_I, \hat{H}^{(0)}\right], \tag{7.8.9}$$
where
$$\hat{H}_I^{(0)} = \hat{U}_0^{-1}(t, t_0)\hat{H}^{(0)}\hat{U}_0(t, t_0) = \hat{H}^{(0)}. \tag{7.8.10}$$
These results show that the state vector $\Psi_I(t)$ in the interaction picture satisfies the Schrödinger equation (7.8.7) with the Hamiltonian $\hat{H}_I^{(1)}$, while the operator $\hat{A}_I(t)$ obeys the Heisenberg equation with the time-independent Hamiltonian $\hat{H}^{(0)}$.
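Because (7.8.8) is a unitary conjugation of $\hat{H}^{(1)}$ by $\hat{U}_0(t, t_0)$, the interaction-picture Hamiltonian stays Hermitian and keeps the same eigenvalues. A Python sketch (a hypothetical two-level system with $\hbar = 1$; the matrices are illustrative choices, not from the text):

```python
import numpy as np

# Sketch (hbar = 1): H0 is taken diagonal with eigenvalues E0, so that
# U0(t, t0) = exp[-i (t - t0) H0] is an elementwise exponential.
def U0(t, t0, E0):
    return np.diag(np.exp(-1j * (t - t0) * E0))

E0 = np.array([0.0, 1.0])                 # illustrative eigenvalues of H^(0)
H1 = np.array([[0.0, 0.3], [0.3, 0.0]])   # illustrative external-field term H^(1)
U = U0(t=2.0, t0=0.0, E0=E0)
H1_I = U.conj().T @ H1 @ U                # H_I^(1)(t), Eq. (7.8.8)
```

The conjugated matrix `H1_I` is again Hermitian with the same spectrum as `H1`, as unitary equivalence requires.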
7.9. The Linear Harmonic Oscillator

According to classical mechanics, a harmonic oscillator is a particle of mass $m$ moving under the action of a force $F = -m\omega^2 x$. The equation of motion is then
$$\ddot{x} + \omega^2 x = 0. \tag{7.9.1}$$
The solution of this equation with the initial conditions $x(0) = a$, $\dot{x}(0) = 0$ is
$$x = a\cos\omega t. \tag{7.9.2}$$
This represents an oscillatory motion of angular frequency $\omega$ and amplitude $a$. The potential is related to the force by $F = -\partial V/\partial x$, so that $V(x) = \frac{1}{2}m\omega^2 x^2$. The energy of the oscillatory motion is the potential energy when the particle is at the extreme position. Therefore the energy is
$$E = \tfrac{1}{2}m\omega^2 a^2. \tag{7.9.3}$$
Since the amplitude $a$ can have any non-negative value, the energy $E$ can have any value greater than or equal to zero. In other words, the energy forms a continuous spectrum. We next consider the quantum theory of such a system. The total energy of the system is represented by the Hamiltonian operator
$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2}m\omega^2\hat{x}^2. \tag{7.9.4}$$
It is convenient to introduce two operators $\hat{a}$ and $\hat{a}^*$ by
$$\hat{a} = \frac{1}{\sqrt{2m}}\left(\hat{p} - im\omega\hat{x}\right), \tag{7.9.5}$$
$$\hat{a}^* = \frac{1}{\sqrt{2m}}\left(\hat{p} + im\omega\hat{x}\right). \tag{7.9.6}$$
Since $\hat{x}$ and $\hat{p}$ are Hermitian operators, it follows that
$$\langle \hat{a}\psi_1, \psi_2\rangle = \langle \psi_1, \hat{a}^*\psi_2\rangle \neq \langle \psi_1, \hat{a}\psi_2\rangle$$
for any two wave functions $\psi_1$ and $\psi_2$. Thus the operators $\hat{a}$ and $\hat{a}^*$ are not Hermitian and hence they do not represent physical observables. However, $\hat{a}\hat{a}^*$ and $\hat{a}^*\hat{a}$ are Hermitian operators, because they can be represented as real functions of $\hat{H}$:
$$\hat{a}\hat{a}^* = \frac{\hat{p}^2}{2m} + \frac{m\omega^2}{2}\hat{x}^2 - \frac{i\omega}{2}[\hat{x}, \hat{p}] = \hat{H} + \frac{1}{2}\hbar\omega,$$
$$\hat{a}^*\hat{a} = \frac{\hat{p}^2}{2m} + \frac{m\omega^2}{2}\hat{x}^2 + \frac{i\omega}{2}[\hat{x}, \hat{p}] = \hat{H} - \frac{1}{2}\hbar\omega,$$
and hence $\hat{H}$ can be written in terms of $\hat{a}$ and $\hat{a}^*$ as
$$\hat{H} = \hat{a}^*\hat{a} + \tfrac{1}{2}\hbar\omega = \hat{a}\hat{a}^* - \tfrac{1}{2}\hbar\omega, \tag{7.9.7ab}$$
so that
$$[\hat{a}, \hat{a}^*] = \hbar\omega. \tag{7.9.8}$$
The eigenstate of energy $E_n$ is $|E_n\rangle$, and
$$\hat{H}|E_n\rangle = E_n|E_n\rangle. \tag{7.9.9}$$
Using (7.9.7ab), we rewrite (7.9.9) either as
$$\hat{a}^*\hat{a}\,|E_n\rangle = \left(E_n - \tfrac{1}{2}\hbar\omega\right)|E_n\rangle \tag{7.9.10}$$
or as
$$\hat{a}\hat{a}^*\,|E_n\rangle = \left(E_n + \tfrac{1}{2}\hbar\omega\right)|E_n\rangle. \tag{7.9.11}$$
Multiplying (7.9.10) by $\hat{a}$, we obtain
$$\hat{a}\hat{a}^*\,\hat{a}|E_n\rangle = \left(E_n - \tfrac{1}{2}\hbar\omega\right)\hat{a}|E_n\rangle. \tag{7.9.12}$$
Then either
$$\hat{a}|E_n\rangle = 0 \tag{7.9.13}$$
or, say,
$$\hat{a}|E_n\rangle = c\,|E_{n-1}\rangle \tag{7.9.14}$$
for some non-zero constant $c$. This result is used to rewrite (7.9.12) as
$$\hat{a}\hat{a}^*\,|E_{n-1}\rangle = \left(E_n - \hbar\omega + \tfrac{1}{2}\hbar\omega\right)|E_{n-1}\rangle. \tag{7.9.15}$$
This is identical with (7.9.11) for $E_{n-1}$, provided
$$E_{n-1} = E_n - \hbar\omega. \tag{7.9.16}$$
Thus, given any eigenvector $|E_n\rangle$, it is possible to generate a new eigenvector $|E_{n-1}\rangle$ by (7.9.14), unless $|E_n\rangle$ is the lowest state $|E_0\rangle$, in which case (7.9.13) is satisfied. It then follows from (7.9.10) for $n = 0$ that
$$E_0 = \tfrac{1}{2}\hbar\omega. \tag{7.9.17}$$
This determines the lowest (or ground) state energy. Clearly, it follows from (7.9.14) that $\hat{a}$ is the operator which annihilates energy in the system in quantum units of $\hbar\omega$, and it is called the annihilation operator. Similarly, multiplication of (7.9.11) by $\hat{a}^*$ gives
$$\hat{a}^*\hat{a}\,\hat{a}^*|E_n\rangle = \left(E_n + \tfrac{1}{2}\hbar\omega\right)\hat{a}^*|E_n\rangle. \tag{7.9.18}$$
Then either
$$\hat{a}^*|E_n\rangle = 0 \tag{7.9.19}$$
or, say,
$$\hat{a}^*|E_n\rangle = c'\,|E_{n+1}\rangle. \tag{7.9.20}$$
This result is used to rewrite (7.9.18) as
$$\hat{a}^*\hat{a}\,|E_{n+1}\rangle = \left(E_n + \hbar\omega - \tfrac{1}{2}\hbar\omega\right)|E_{n+1}\rangle. \tag{7.9.21}$$
This is (7.9.10) for $E_{n+1}$, provided
$$E_{n+1} = E_n + \hbar\omega. \tag{7.9.22}$$
It follows that, given any eigenstate $|E_n\rangle$, it is also possible to generate a new eigenvector $|E_{n+1}\rangle$ by (7.9.20), with the eigenvalue given by (7.9.22), unless $E_n$ is the highest energy level, in which case (7.9.19) is satisfied. But the potential is an increasing function of $|x|$ and hence there is no highest level; the creation of higher energy levels is always possible. Thus the operator $\hat{a}^*$ generates energy in the system in quantum units of $\hbar\omega$, and it is called the creation operator. It then follows from (7.9.17) and (7.9.22) that the general energy level is
$$E_n = \left(n + \tfrac{1}{2}\right)\hbar\omega, \qquad n = 0, 1, 2, \ldots. \tag{7.9.23}$$
This obviously represents a discrete set of energies. Thus, in quantum mechanics, a stationary state of the harmonic oscillator can assume only one of the values from the set $E_n$. The energy is thus quantized, and forms a discrete spectrum. According to classical mechanics, the energy forms a continuous spectrum; that is, all non-negative numbers are allowed for the energy of a simple harmonic oscillator. This shows a remarkable contrast between the results of the classical and quantum theories. The non-negative integer $n$ which characterizes the energy eigenvalues (and hence eigenfunctions) is called the quantum number. The value $n = 0$ corresponds to the minimum value of the quantum number, with energy
$$E_0 = \tfrac{1}{2}\hbar\omega. \tag{7.9.24}$$
This is called the lowest (or ground) state energy, which never vanishes, as the lowest possible classical energy would. The ground state energy $E_0$ is proportional to $\hbar$, representing a quantum phenomenon. The discrete energy spectrum is in perfect agreement with the quantization rules of the quantum theory. To determine the energy eigenfunctions $\psi_n$ belonging to $E_n$, it is convenient to write the annihilation and creation operators as $\hat{A} = \hat{a}/\sqrt{\hbar\omega}$ and $\hat{A}^* = \hat{a}^*/\sqrt{\hbar\omega}$ and replace $\hat{p}$ by $-i\hbar(\partial/\partial x)$, so that
$$\hat{A} = \frac{1}{\sqrt{2}}\left[\left(\frac{\hbar}{m\omega}\right)^{1/2}\frac{\partial}{\partial x} + \left(\frac{m\omega}{\hbar}\right)^{1/2}\hat{x}\right] = \frac{1}{\sqrt{2}}\left(\frac{\partial}{\partial \eta} + \eta\right), \tag{7.9.25}$$
$$\hat{A}^* = \frac{1}{\sqrt{2}}\left[-\left(\frac{\hbar}{m\omega}\right)^{1/2}\frac{\partial}{\partial x} + \left(\frac{m\omega}{\hbar}\right)^{1/2}\hat{x}\right] = \frac{1}{\sqrt{2}}\left(-\frac{\partial}{\partial \eta} + \eta\right), \tag{7.9.26}$$
where $\eta = (m\omega/\hbar)^{1/2}x$. Consequently,
$$\hat{A}\hat{A}^* = \frac{\hat{H}}{\hbar\omega} + \frac{1}{2}, \tag{7.9.27a}$$
$$\hat{A}^*\hat{A} = \frac{\hat{H}}{\hbar\omega} - \frac{1}{2}. \tag{7.9.27b}$$
Since $\psi_0$ is the eigenfunction corresponding to the lowest energy $E_0$,
$$\hat{A}\psi_0 = 0 \quad\text{or}\quad \left(\frac{\partial}{\partial \eta} + \eta\right)\psi_0 = 0. \tag{7.9.28}$$
Its normalized solution can be written as
$$\psi_0(\eta) = \pi^{-1/4}\,e^{-\eta^2/2}. \tag{7.9.29}$$
All other eigenfunctions $\psi_n$ can be calculated from $\psi_0$ by successive applications of the creation operator $\hat{A}^*$, and thus $\psi_n$ is proportional to $(\hat{A}^*)^n\psi_0$. We also note that
$$\hat{A}^*\psi_n = (n + 1)^{1/2}\,\psi_{n+1}, \tag{7.9.30}$$
so that if $\psi_n$ is normalized, so is $\psi_{n+1} = (n + 1)^{-1/2}\hat{A}^*\psi_n$. Thus, it turns out that
$$\psi_n = (n!)^{-1/2}(\hat{A}^*)^n\psi_0 = (2^n n!)^{-1/2}\,\pi^{-1/4}\left(-\frac{d}{d\eta} + \eta\right)^n \exp\left(-\frac{\eta^2}{2}\right). \tag{7.9.31}$$
This result can be simplified by using the operator identities
$$\left(-\frac{d}{d\eta} + \eta\right) = -e^{\eta^2/2}\,\frac{d}{d\eta}\,e^{-\eta^2/2}, \tag{7.9.32a}$$
$$\left(-\frac{d}{d\eta} + \eta\right)^n = (-1)^n\,e^{\eta^2/2}\,\frac{d^n}{d\eta^n}\,e^{-\eta^2/2}, \tag{7.9.32b}$$
so that the final form of $\psi_n$ is
$$\psi_n(\eta) = \left(2^n n!\sqrt{\pi}\right)^{-1/2} e^{-\eta^2/2}\,H_n(\eta), \tag{7.9.33}$$
$$H_n(\eta) = (-1)^n\,e^{\eta^2}\,\frac{d^n}{d\eta^n}\,e^{-\eta^2}, \qquad n = 0, 1, 2, \ldots, \tag{7.9.34}$$
where (7.9.34) defines $H_n(\eta)$, the Hermite polynomial of degree $n$.
Example 7.9.1 (The Schrödinger Equation Treatment of Planck's Simple Harmonic Oscillator). The quantum mechanical motion of the Planck oscillator is described by the one-dimensional Schrödinger equation
$$\frac{d^2\psi}{dx^2} + \frac{2M}{\hbar^2}\left(E - \frac{1}{2}M\omega^2 x^2\right)\psi = 0. \tag{7.9.35}$$
In terms of the constants
$$\beta = \frac{2ME}{\hbar^2}, \tag{7.9.36a}$$
$$\alpha = \frac{M\omega}{\hbar} > 0 \tag{7.9.36b}$$
and an independent variable $x' = x\sqrt{\alpha}$, Equation (7.9.35) becomes, dropping the prime,
$$\frac{d^2\psi}{dx^2} + \left(\frac{\beta}{\alpha} - x^2\right)\psi = 0. \tag{7.9.37}$$
The eigenfunctions of this equation are the Hermite orthogonal functions
$$\psi_n(x) = e^{-x^2/2}\,H_n(x) \tag{7.9.38}$$
with the corresponding eigenvalues
$$\frac{\beta}{\alpha} = 2n + 1, \tag{7.9.39}$$
where $H_n(x)$ is the Hermite polynomial of degree $n$. Substituting the values of $\alpha$ and $\beta$, it turns out that
$$E = E_n = \left(\frac{2n + 1}{2}\right)\omega\hbar, \qquad n = 0, 1, 2, \ldots. \tag{7.9.40}$$
Thus the energy quanta characteristic of the oscillator are the so-called half-integral multiples, that is, the odd multiples of $\tfrac{1}{2}\omega\hbar$. This result is remarkably the same as in the Heisenberg theory. In view of the properties of the Hermite polynomials
$$H_0(x) = 1, \qquad H_1(x) = 2x, \qquad H_2(x) = 4x^2 - 2,$$
it follows that the first eigenfunction $\psi_0(x)$ represents a Gaussian distribution curve, and the second eigenfunction $\psi_1(x)$ vanishes at the origin and corresponds to a Maxwellian distribution curve for positive $x$, continued towards negative values of $x$ so that it is an odd function of $x$. The third eigenfunction $\psi_2(x)$ is negative at the origin and has two symmetric zeros $\pm 1/\sqrt{2}$, and so on. Thus the geometrical shape of these eigenfunctions can easily be determined. It is also important to note that the roots of successive polynomials separate one another.
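The separation (interlacing) of the roots of successive Hermite polynomials can be verified numerically; a Python sketch using NumPy's Hermite module (physicists' convention, matching (7.9.34)):

```python
import numpy as np
from numpy.polynomial.hermite import hermroots

# Sketch: H_n has n real roots, and the roots of successive Hermite
# polynomials separate one another.
def roots_of_H(n):
    c = np.zeros(n + 1)
    c[n] = 1.0                        # coefficients of H_n in the Hermite basis
    return np.sort(hermroots(c))

r2, r3 = roots_of_H(2), roots_of_H(3)
# H_2(x) = 4x^2 - 2 has zeros at +-1/sqrt(2), each lying between
# consecutive zeros of H_3(x) = 8x^3 - 12x.
```

Each root of $H_2$ indeed lies strictly between consecutive roots of $H_3$, as the text asserts.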
7.10. Angular Momentum Operators

The orbital angular momentum operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$ have already been introduced in Section 7.3. It has been shown that they obey the commutation relations (7.3.23). Using the spherical polar coordinates $(r, \theta, \phi)$, which are related to the rectangular Cartesian coordinates $(x, y, z)$ by
$$x = r\sin\theta\cos\phi, \tag{7.10.1a}$$
$$y = r\sin\theta\sin\phi, \tag{7.10.1b}$$
$$z = r\cos\theta, \tag{7.10.1c}$$
combined with the chain rule for differentiation
$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial x}\frac{\partial}{\partial \theta} + \frac{\partial \phi}{\partial x}\frac{\partial}{\partial \phi}$$
and similar results for $\partial/\partial y$ and $\partial/\partial z$, the angular momentum operators can be expressed in angular variables:
$$\hat{L}_x = i\hbar\left(\sin\phi\,\frac{\partial}{\partial \theta} + \cot\theta\cos\phi\,\frac{\partial}{\partial \phi}\right), \tag{7.10.2a}$$
$$\hat{L}_y = i\hbar\left(-\cos\phi\,\frac{\partial}{\partial \theta} + \cot\theta\sin\phi\,\frac{\partial}{\partial \phi}\right), \tag{7.10.2b}$$
$$\hat{L}_z = -i\hbar\,\frac{\partial}{\partial \phi}, \tag{7.10.2c}$$
$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial}{\partial \theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial \phi^2}\right]. \tag{7.10.3}$$
From (7.10.2c) and (7.10.3) it is easy to check that
$$[\hat{L}^2, \hat{L}_z] = 0. \tag{7.10.4}$$
It also follows from (7.10.2c) and (7.5.70) that
$$\hat{L}_z\,Y_l^m(\theta, \phi) = (\hbar m)\,Y_l^m(\theta, \phi). \tag{7.10.5}$$
For any given value of $l$, the possible eigenvalues of the $z$-component of the angular momentum, $\hat{L}_z$, are
$$m = 0, \pm 1, \pm 2, \ldots, \pm l, \tag{7.10.6}$$
giving $(2l + 1)$ admissible values.
On the other hand, it can easily be checked with the aid of (7.10.3), (7.5.60) and (7.5.70) with $\lambda = l(l + 1)$ that
$$\hat{L}^2\,Y_l^m(\theta, \phi) = \left[\hbar^2 l(l + 1)\right]Y_l^m(\theta, \phi), \tag{7.10.7}$$
where $|m| \le l$, $l = 0, 1, 2, \ldots$. This shows that the eigenvalues of $\hat{L}^2$ are
$$\hbar^2 l(l + 1), \qquad l = 0, 1, 2, 3, \ldots. \tag{7.10.8}$$
Evidently, the spherical harmonics $Y_l^m(\theta, \phi)$ are the simultaneous eigenfunctions of $\hat{L}^2$ and $\hat{L}_z$. The eigenvalues of the total angular momentum $\hat{L}^2$ are $\hbar^2 l(l + 1)$, $l = 0, 1, 2, \ldots$, and those of $\hat{L}_z$ are $m\hbar$, $m = 0, \pm 1, \ldots, \pm l$. Thus a measurement of $\hat{L}^2$ can yield as its result only the values $0, 2\hbar^2, 6\hbar^2, 12\hbar^2, \ldots$. The total angular momentum states with $l$ values $0, 1, 2, 3, 4, \ldots$ are known, for historical reasons, as $S, P, D, F, G, \ldots$ states, respectively. Similarly, the measured values of $\hat{L}_z$ are only $0, \pm\hbar, \pm 2\hbar, \ldots$. Hence both $\hat{L}^2$ and $\hat{L}_z$ are quantized and can upon measurement only reveal one of the specified discrete values. It is convenient to define two operators $\hat{L}_+$ and $\hat{L}_-$ by
$$\hat{L}_+ = \hat{L}_x + i\hat{L}_y, \tag{7.10.9a}$$
$$\hat{L}_- = \hat{L}_x - i\hat{L}_y. \tag{7.10.9b}$$
Theorem 7.10.1. (a) $\hat{L}_+$ and $\hat{L}_-$ are non-Hermitian operators; (b) $\hat{L}_+\hat{L}_-$ and $\hat{L}_-\hat{L}_+$ are Hermitian.

Proof. Since $\hat{L}_x$ and $\hat{L}_y$ are Hermitian,
$$\langle \hat{L}_+\psi_1, \psi_2\rangle = \langle \psi_1, \hat{L}_-\psi_2\rangle \neq \langle \psi_1, \hat{L}_+\psi_2\rangle$$
for any two wave functions $\psi_1$ and $\psi_2$. Thus $\hat{L}_+$ and $\hat{L}_-$ are not Hermitian operators, and hence they do not represent observables. On the other hand,
$$\hat{L}_+\hat{L}_- = (\hat{L}_x + i\hat{L}_y)(\hat{L}_x - i\hat{L}_y) = \hat{L}_x^2 + \hat{L}_y^2 - i[\hat{L}_x, \hat{L}_y] = \hat{L}_x^2 + \hat{L}_y^2 + \hbar\hat{L}_z = \hat{L}^2 - \hat{L}_z(\hat{L}_z - \hbar). \tag{7.10.10}$$
Similarly,
$$\hat{L}_-\hat{L}_+ = \hat{L}^2 - \hat{L}_z(\hat{L}_z + \hbar). \tag{7.10.11}$$
Thus both $\hat{L}_+\hat{L}_-$ and $\hat{L}_-\hat{L}_+$ are expressed as real functions of $\hat{L}^2$ and $\hat{L}_z$. Hence they are Hermitian operators. This completes the proof.
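The identities (7.10.10)–(7.10.11) can be verified on the standard matrix representation of the angular momentum operators on a fixed $l$-eigenspace. A Python sketch (with $\hbar = 1$; the matrix elements used are the standard ones, stated here as an assumption rather than derived in the text):

```python
import numpy as np

# Sketch (hbar = 1): basis |l, m>, m = l, l-1, ..., -l, with the standard
# matrix element <l, m+1| L+ |l, m> = sqrt(l(l+1) - m(m+1)).
def ladder_matrices(l):
    m = np.arange(l, -l - 1, -1.0)
    Lz = np.diag(m)
    off = np.sqrt(l * (l + 1) - m[1:] * (m[1:] + 1))
    Lp = np.diag(off, k=1)            # raising operator L+
    return Lz, Lp, Lp.conj().T        # L- is the adjoint of L+

l = 2
dim = int(2 * l + 1)
Lz, Lp, Lm = ladder_matrices(l)
L2 = l * (l + 1) * np.eye(dim)        # L^2 restricted to this eigenspace
```

Both products $\hat{L}_+\hat{L}_-$ and $\hat{L}_-\hat{L}_+$ then agree entrywise with the right-hand sides of (7.10.10) and (7.10.11).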
Since the orbital angular momentum can only take on integer values, this result indicates the necessity for some generalization of this formalism. It is necessary to introduce matrix operators of size $n \times n$ defined by …

… Let $B_1 = \mathbb{R}^N$ and $B_2 = \mathbb{R}$, and let $x = (x_1, \ldots, x_N) \in B_1$ and $h = (h_1, \ldots, h_N) \in B_1$. If $f$ has continuous partial derivatives of order one, then the Gateaux differential of $f$ is
$$df(x, h) = \sum_{k=1}^{N} \frac{\partial f(x)}{\partial x_k}\,h_k. \tag{8.2.2}$$
For a fixed $x_0 \in B_1$, the Gateaux derivative at $x_0$,
$$f'(x_0), \tag{8.2.3}$$
is a bounded linear operator from $\mathbb{R}^N$ into $\mathbb{R}$. We can also write
$$f'(x_0)h = \left(\frac{\partial f(x_0)}{\partial x_1}, \ldots, \frac{\partial f(x_0)}{\partial x_N}\right)\cdot h,$$
where the coefficient vector is the gradient of $f$ at $x_0$, denoted by $\nabla f(x_0)$.

Example 8.2.2.
Let $B_1 = \mathbb{R}^N$ and $B_2 = \mathbb{R}^M$. Let $f = (f_1, \ldots, f_M): \mathbb{R}^N \to \mathbb{R}^M$ be Gateaux differentiable at some $x \in \mathbb{R}^N$. The Gateaux derivative $A$ can be identified with an $M \times N$ matrix $(a_{ij})$. If $h$ is the $j$th coordinate vector, $h = e_j = (0, \ldots, 1, \ldots, 0)$, then
$$\lim_{t\to 0} \frac{f(x + th) - f(x)}{t} = A(h)$$
implies
$$\lim_{t\to 0} \frac{f_i(x + te_j) - f_i(x)}{t} = a_{ij}$$
for every $i = 1, \ldots, M$ and $j = 1, \ldots, N$. This shows that the $f_i$'s have partial derivatives at $x$ and
$$\frac{\partial f_i(x)}{\partial x_j} = a_{ij}$$
for every $i = 1, \ldots, M$ and $j = 1, \ldots, N$. The Gateaux derivative of $f$ at $x$ has the matrix representation
$$f'(x) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_N} \\ \vdots & & \vdots \\ \dfrac{\partial f_M}{\partial x_1} & \cdots & \dfrac{\partial f_M}{\partial x_N} \end{pmatrix}. \tag{8.2.4}$$
This is called the Jacobian matrix of $f$ at $x$. Note that if $M = 1$, then the matrix reduces to a row vector, which is the case discussed in Example 8.2.1.

Example 8.2.3. Let $B = \mathscr{C}([a, b])$ be the normed space of real-valued continuous functions on $[a, b]$ with the norm defined by
$$\|x\| = \sup_{t\in[a, b]} |x(t)|.$$
Let $K(s, t)$ be a continuous real-valued function defined on $[a, b] \times [a, b]$, and let $g(t, x)$ be a continuous real-valued function on $[a, b] \times \mathbb{R}$ with continuous partial derivative $\partial g/\partial x$ on $[a, b] \times \mathbb{R}$. Define a mapping $f: B \to B$ by
$$f(x)(s) = \int_a^b K(s, t)\,g(t, x(t))\, dt. \tag{8.2.5}$$
Then
$$df(x, h) = \left[\frac{d}{d\alpha}\int_a^b K(s, t)\,g(t, x(t) + \alpha h(t))\, dt\right]_{\alpha = 0}.$$

Optimization Problems and Other Applications

Interchange of the order of differentiation and integration is permissible under the given assumption on $g$, and hence it follows that
$$df(x, h) = \int_a^b K(s, t)\left[\frac{\partial}{\partial x}\,g(t, x(t))\right]h(t)\, dt. \tag{8.2.6}$$
Thus, the Gateaux derivative of the integral operator (8.2.5) is the linear integral operator (8.2.6), and its kernel is $K(s, t)\,g_x(t, x)$.

Remark. The Gateaux differential is a generalization of the idea of the directional derivative familiar in finite dimensional spaces.
Theorem 8.2.2 (Mean Value Theorem). Suppose the functional $f$ has a Gateaux derivative $df(x, h)$ at every point $x \in B$. Then, for any two points $x, x + h \in B$, there exists a constant $\xi \in (0, 1)$ such that
$$f(x + h) - f(x) = df(x + \xi h, h). \tag{8.2.7}$$

Proof. Put $\varphi(t) = f(x + th)$. Then
$$\varphi'(t) = \lim_{s\to 0}\left[\frac{\varphi(t + s) - \varphi(t)}{s}\right] = \lim_{s\to 0}\left[\frac{f(x + th + sh) - f(x + th)}{s}\right] = df(x + th, h).$$
Application of the mean value theorem for functions of one variable to $\varphi$ yields
$$\varphi(1) - \varphi(0) = \varphi'(\xi)$$
for some $\xi \in (0, 1)$. Consequently,
$$f(x + h) - f(x) = df(x + \xi h, h).$$
This proves the theorem.
f
of a real variable is defined by
. f(x+h)-f(x) ' f (x)= 1tm , h~o h
(8.2.8)
provided the limit exists. This definition cannot be used in the case of mappings defined on a Banach space because h is then a vector, and division by a vector is meaningless. On the other hand, the division by a vector can be easily avoided by rewriting (8.2.8) as f(x +h) =f(x) + f'(x)h
+ hw(h ),
(8.2.9)
416
Applications
where w is a function (which depends on h) such that w( h)---?> 0 as h---?> 0. Equivalently, we can now say that f'(x) is the derivative off at x if
f(x +h)- f(x) = f'(x)h +(h ),
(8.2.10)
where ( h) = hw( h), and thus ( h)/ h---?> 0 as h---?> 0. The definition based on (8.2.10) can be generalized to include mappings from a Banach space into a Banach space. This leads to the concept of the Frechet differentiability and Frechet derivative.
Definition 8.2.3 (Fréchet Derivative). Let $x$ be a fixed point in a Banach space $B_1$. A continuous linear operator $A: B_1 \to B_2$ is called the Fréchet derivative of the operator $T: B_1 \to B_2$ at $x$ if
$$T(x + h) - T(x) = Ah + \varphi(x, h) \tag{8.2.11}$$
provided
$$\lim_{\|h\|\to 0}\frac{\|\varphi(x, h)\|}{\|h\|} = 0 \tag{8.2.12}$$
or, equivalently,
$$\lim_{\|h\|\to 0}\frac{\|T(x + h) - T(x) - Ah\|}{\|h\|} = 0. \tag{8.2.13}$$
The Fréchet derivative at $x$ will be denoted by $T'(x)$ or $dT(x)$. In the case of a real-valued function $f: \mathbb{R} \to \mathbb{R}$, the ordinary derivative at $x$ is a number representing the slope of the graph of the function at $x$. The Fréchet derivative of $f$ is not a number, but a linear operator from $\mathbb{R}$ into $\mathbb{R}$. The existence of the ordinary derivative $f'(x)$ implies the existence of the Fréchet derivative at $x$, and the comparison of (8.2.9) and (8.2.11) shows that $A$ is the operator which multiplies every $h \in \mathbb{R}$ by the number $f'(x)$. In elementary calculus, the tangent to a curve is the straight line giving the best approximation of the curve in the neighborhood of the point of tangency. Similarly, the Fréchet derivative of an operator $f$ can be interpreted as its best local linear approximation. We consider the change in $f$ when its argument changes from $x$ to $x + h$, and then approximate this change by a linear operator $A$ so that
$$f(x + h) = f(x) + Ah + e, \tag{8.2.14}$$
where $e$ is the error in the linear approximation. In general, $e$ has the same order of magnitude as $h$, except when $A$ is equal to the Fréchet derivative of $f$. In that case $e = o(h)$, so that $e$ is much smaller than $h$ as $h \to 0$. In this sense, the Fréchet derivative gives the best linear approximation of $f$ near $x$. Finally, if $A$ is a linear operator, then the derivative of $A$ is $A$ itself, and the best linear approximation of $A$ is $A$ itself.

Theorem 8.2.3. If a mapping has the Fréchet derivative at a point, then it has the Gateaux derivative at that point and both derivatives are equal.
Proof. Let $f: B_1 \to B_2$, and let $x \in B_1$. If $f$ has the Fréchet derivative at $x$, then
$$\lim_{\|h\|\to 0}\frac{\|f(x + h) - f(x) - Ah\|}{\|h\|} = 0$$
for some continuous linear operator $A: B_1 \to B_2$. In particular, for any fixed non-zero $h \in B_1$, we have
$$\lim_{t\to 0}\left\|\frac{f(x + th) - f(x)}{t} - Ah\right\| = \lim_{t\to 0}\frac{\|f(x + th) - f(x) - A(th)\|}{\|th\|}\,\|h\| = 0.$$
Thus, $A$ is the Gateaux derivative of $f$ at $x$.

Corollary 8.2.1.
If the Fréchet derivative exists, it is unique.

Proof. Suppose $A_1$ and $A_2$ are Fréchet derivatives of $f$ at some $x \in B_1$. Then $A_1$ and $A_2$ are the Gateaux derivatives of $f$ at $x$. Thus, $A_1 = A_2$, by Theorem 8.2.1.
r
K(x, t)f(t, u(t)) dt,
where K: [a, b] x [a, b]---?> Rand f: [a, b] x R-?> Rare given functions. Iff is sufficiently smooth, then T(u+h)(x)=
r
K(x, t)[f(t, u)+hfu(t, u)+th 2fuu(t, u)+· · ·] dt
= ( Tu) ( x) + Ah + o ( h ) , where the Frechet derivative A= T'(u) is T'( u)(h) =
tb
K (x, t)fu ( t, u( t))h( t) dt.
Thus, the Frechet derivative of T at u is the linear integral operator with the kernel K(x, t)fu(t, u(t)).
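The defining property (8.2.13), that the remainder shrinks faster than $\|h\|$, can be observed numerically for this operator. A Python sketch (the kernel $K(x, t) = e^{-xt}$ and $f(t, u) = \sin u$ are illustrative choices, and the integral is discretized on a grid):

```python
import numpy as np

# Sketch: for (Tu)(x) = ∫ K(x,t) f(t, u(t)) dt with the illustrative choices
# K(x, t) = exp(-x*t) and f(t, u) = sin(u), the claimed Frechet derivative
# has kernel K(x, t)*cos(u(t)); the remainder should shrink faster than ||h||.
ts = np.linspace(0.0, 1.0, 401)
dt = ts[1] - ts[0]
K = np.exp(-np.outer(ts, ts))         # K(x_i, t_j) on the grid

def T(u):
    return K @ (np.sin(u) * dt)

def dT(u, h):
    return K @ (np.cos(u) * h * dt)   # linear integral operator, kernel K*f_u

u = ts**2
h = np.cos(3 * ts)
ratios = []
for scale in (1e-2, 1e-3, 1e-4):
    hh = scale * h
    rem = T(u + hh) - T(u) - dT(u, hh)
    ratios.append(np.max(np.abs(rem)) / np.max(np.abs(hh)))
```

The ratio $\|{\rm remainder}\|/\|h\|$ decreases roughly in proportion to $\|h\|$, reflecting the quadratic term $\tfrac{1}{2}h^2 f_{uu}$ of the expansion above.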
Theorem 8.2.4. If an operator defined on an open subset of a Banach space has the Fréchet derivative at a point, then it is continuous at that point.

Proof. Let $\Omega$ be an open set in a Banach space $B_1$, and let $T$ be an operator from $\Omega$ into a Banach space $B_2$. Let $x \in \Omega$ and let $\varepsilon > 0$ be such that $x + h \in \Omega$ whenever $\|h\| < \varepsilon$. Then
$$\|T(x + h) - T(x)\| = \|Ah + \varphi(x, h)\| \to 0$$
as $\|h\| \to 0$. This proves that $T$ is continuous at $x$.
Much of the theory, results and methods of ordinary calculus can be easily generalized to Fréchet derivatives. For example, the usual rules for differentiation of the sum and product (in the case of functionals) of two or more functions apply to Fréchet derivatives. The mean value theorem, the implicit function theorem and Taylor series have satisfactory extensions. The interested reader is referred to Liusternik and Sobolev (1974). In the next theorem, we prove the chain rule for Fréchet derivatives.

Theorem 8.2.5 (Chain Rule). Let $B_1, B_2, B_3$ be real Banach spaces. If $g: B_1 \to B_2$ is Fréchet differentiable at some $x \in B_1$ and $f: B_2 \to B_3$ is Fréchet differentiable at $y = g(x) \in B_2$, then $\Phi = f \circ g$ is Fréchet differentiable at $x$ and
$$\Phi'(x) = f'(g(x))\,g'(x).$$

Proof. For $x, h \in B_1$, we have
$$\Phi(x + h) - \Phi(x) = f(g(x + h)) - f(g(x)) = f(g(x + h) - g(x) + g(x)) - f(y) = f(d + y) - f(y),$$
where $d = g(x + h) - g(x)$. Thus,
$$\|\Phi(x + h) - \Phi(x) - f'(y)d\| = o(\|d\|).$$
In view of $\|d - g'(x)h\| = o(\|h\|)$, we obtain
$$\|\Phi(x + h) - \Phi(x) - f'(y)g'(x)h\| = o(\|h\|) + o(\|d\|).$$
Since $g$ is continuous at $x$, by Theorem 8.2.4, we have $\|d\| \to 0$ as $\|h\| \to 0$ …

… $(\alpha, \beta > -1)$, Legendre polynomials $P_n(x)$ ($\alpha = \beta = 0$, $w(x) = 1$), and Chebyshev polynomials $T_n(x)$ ($\alpha = \beta = -\tfrac{1}{2}$, $w(x) = (1 - x^2)^{-1/2}$). Other orthogonal polynomials are also of interest and can be obtained from the Chebyshev polynomials $T_n(x)$, which satisfy the recurrence relation
$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)$$
with $T_0(x) = 1$ and $T_1(x) = x$. It follows from $T_n(x) = \cos n\theta$ ($n = 0, 1, 2, \ldots$), where $x = \cos\theta$, $0 \le \theta \le \pi$, that $T_n'(x) = n\sin n\theta/\sin\theta$. We then define the new polynomials $U_n(x)$ of degree at most $n$ by
$$U_n(x) = \frac{\sin(n + 1)\theta}{\sin\theta}, \qquad n = 0, 1, 2, \ldots, \tag{8.7.22}$$
where $x = \cos\theta$. These are called the Chebyshev polynomials of the second kind. It is easy to check that the polynomials (8.7.22) are orthogonal with respect to $w(x) = (1 - x^2)^{1/2}$ and hence are constant multiples of the Jacobi polynomials $P_n^{(1/2,\,1/2)}(x)$. Using L'Hôpital's rule, it follows that $U_n(1) = n + 1$, and then
$$P_n^{(1/2,\,1/2)}(1) = \frac{1\cdot 3\cdot 5\cdots(2n + 1)}{2^n(n + 1)!}\,U_n(1).$$
There are many identities connecting $T_n(x)$ and $U_n(x)$. Some of them are given as exercises.
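The trigonometric characterization (8.7.22) can be checked against the three-term recurrence. A Python sketch (it is a standard identity, assumed here rather than proved in the text, that the second-kind polynomials satisfy the same recurrence as $T_n$, with $U_0(x) = 1$ and $U_1(x) = 2x$):

```python
import numpy as np

# Sketch: verify U_n(cos t) = sin((n+1)t)/sin(t), Eq. (8.7.22), using the
# recurrence U_{n+1}(x) = 2x U_n(x) - U_{n-1}(x), U_0 = 1, U_1 = 2x
# (a standard identity, stated here as an assumption).
def U(n, x):
    u_prev, u = np.ones_like(x), 2.0 * x
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u = u, 2.0 * x * u - u_prev
    return u

theta = np.linspace(0.1, 3.0, 50)
x = np.cos(theta)
err = np.max(np.abs(U(4, x) - np.sin(5 * theta) / np.sin(theta)))
```

The value $U_n(1) = n + 1$ obtained above by L'Hôpital's rule also falls out of the recurrence at $x = 1$.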
8.8. Linear and Nonlinear Stability

We consider linear and nonlinear problems of stability and instability for differential systems. In dynamical systems, the state at any time $t$ can be represented by an element of a Banach (or Hilbert) space $E$. Suppose that the dynamics of a physical system are governed by the evolution equation
$$\frac{du}{dt} = F(\lambda, u, t), \tag{8.8.1}$$
where $\lambda \in \Lambda$ is a parameter, $\Lambda$ is a set of parameters (for instance $\Lambda = \mathbb{R}$), $u$ is a function of a real variable $t$ with values in $E$, and $F$ is a mapping from $\Lambda \times E \times \mathbb{R}$ into $E$.

Definition 8.8.1 (Autonomous Dynamical System). The dynamical system governed by (8.8.1) is called autonomous if the function $F$ does not depend explicitly upon $t$. For autonomous systems, (8.8.1) can be written in the form $du/dt = F(\lambda, u)$.

Definition 8.8.2 (Equilibrium Solution). If $F(\lambda_0, u_0) = 0$ for some $\lambda = \lambda_0$ and $u = u_0$, then $u_0$ is called an equilibrium solution.
Definition 8.8.3 (Stable, Unstable and Asymptotically Stable Solutions). Let $u_0$ be an equilibrium solution of Equation (8.8.1).
(a) $u_0$ is said to be stable if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $\|u(t) - u_0\| < \varepsilon$ for all $t > 0$ and all solutions $u(t)$ of (8.8.1) such that $\|u(0) - u_0\| < \delta$.
(b) $u_0$ is called unstable if it is not stable.
(c) $u_0$ is called asymptotically stable if it is stable and $\|u(t) - u_0\| \to 0$ as $t \to \infty$.

Example 8.8.1. Consider the scalar equation $\dot{x} = 0$. Every solution of this equation has the form $x = c$, where $c$ is a constant. Thus, every solution is stable but not asymptotically stable.

Example 8.8.2. Consider the system $du/dt = \lambda u$, $u(0) = u_0$, where $u(t)$ is real for each $t$ and $\lambda \in \mathbb{R}$. This equation has the equilibrium solution $u_0(t) = 0$. The general solution is $u(t) = u_0 e^{\lambda t}$. If $\lambda \le 0$, then the zero solution is stable. If $\lambda > 0$, the solution is unstable because $u(t) \to \infty$ as $t \to \infty$, no matter how small $u_0$ is.
Example 8.8.3. Consider the equation $\dot{x} = x^2$ with $x(0) = x_0$. The solution of this equation is obtained by separating the variables, and has the form
$$x(t) = \frac{x_0}{1 - x_0 t}.$$
The solution is not defined at $t = 1/x_0$; for $x_0 > 0$ it blows up in finite time. Thus $x(t) \equiv 0$ is a solution which is unstable.

Example 8.8.4.
Consider a linear autonomous system
$$\dot{u} = Lu + v, \tag{8.8.2}$$
where $u(t) \in E$ for each $t$, $L: E \to E$ is a linear operator which does not depend on $t$, and $v$ is a given element of $E$. Clearly, $u_0 \in E$ is an equilibrium solution of (8.8.2) if $Lu_0 = -v$. We suppose the solution of (8.8.2) is of the form $u(t) = u_0 + e^{\lambda t}w$, where $\lambda$ is a constant and $w \in E$. Clearly, $u(t)$ satisfies (8.8.2) provided
$$Lw = \lambda w.$$
This means that $\lambda$ is an eigenvalue of $L$ with eigenvector $w$. If the eigenvalue $\lambda$ has a positive real part and $w$ is a normalized eigenvector, then for any $\varepsilon > 0$, the function $u(t) = u_0 + \varepsilon w\,e^{\lambda t}$ is a solution of (8.8.2) such that $\|u(0) - u_0\| = \varepsilon$ and $\|u(t) - u_0\| \to \infty$ as $t \to \infty$. This shows that the equilibrium solution $u_0$ is unstable provided there is an eigenvalue with positive real part. This example leads to the "Principle of Linearized Stability," which can be described as follows: Consider a system of ordinary differential equations
$$\dot{u} = F(\lambda, u), \tag{8.8.3}$$
where $u = (u_1, u_2, \ldots, u_n)$, $F = (F_1, F_2, \ldots, F_n)$ and $\lambda$ is a parameter. Let $u_0$ be the equilibrium solution with $\lambda = \lambda_0$, so that $F(u_0, \lambda_0) = 0$. Suppose the solution of (8.8.3) can be written as $u(t) = v(t) + u_0$, where $v(t)$ is the perturbation from equilibrium. It follows from $\dot{u} = F(u, \lambda_0)$ that
$$\dot{v} = \dot{u} = F(v + u_0, \lambda_0) = F(u_0, \lambda_0) + \left[\frac{\partial F_i}{\partial u_j}\right](v) + O(\|v\|^2)$$
or
$$\dot{v} = Av + G(v), \tag{8.8.4}$$
where $A = \left[\partial F_i/\partial u_j\right]\big|_{(u_0, \lambda_0)}$ and $G(v) = O(\|v\|^2)$ represents a term such that
$$\|G(v)\| \le c\,\|v\|^2,$$
where $c$ is a constant. Neglecting the second term in (8.8.4), we obtain the linear equation
$$\dot{v} = Av. \tag{8.8.5}$$
The solution of this equation is
$$v(t) = e^{tA}\,v(0). \tag{8.8.6}$$
Clearly, all solutions of this equation decay if the spectrum of $A$ lies in the open left half plane, while some solutions of (8.8.6) may grow exponentially if $A$ has eigenvalues in the right half plane. In general, the second order term is negligible when the perturbations are small. This heuristic argument can be justified by Lyapunov's theorem:

Theorem 8.8.1 (Lyapunov's Theorem). If all eigenvalues of $A$ have negative real parts, then $u_0$ is a stable equilibrium solution of (8.8.3). If some eigenvalues of $A$ have positive real parts, then $u_0$ is an unstable solution.

A rigorous proof of this theorem is beyond the scope of this book; the reader is referred to Coddington and Levinson (1955). The following example shows that the weak inequality $\mathrm{Re}(\lambda) \le 0$ for all eigenvalues does not ensure stability.

Example 8.8.5.
Consider the equation $\dot{u} = Au$, where $u(t) \in \mathbf{R}^2$ and $A$ is the matrix operator

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

If $u_0$ is an equilibrium solution of this equation, then $Au_0 = 0$. Clearly, $u_0 = (a, 0)$ represents an equilibrium solution for any number $a$. The only eigenvalue of $A$ is zero. If we write $u = (x, y)$, then the given equation becomes $\dot{x} = y$ and $\dot{y} = 0$. Hence, the general solution is $y = m$, $x = mt + c$, where $m$ and $c$ are constants. For sufficiently small $m$ and $c$, the solution $u(t) = (mt + c, m)$ can be made arbitrarily close to $u_0 = (a, 0)$ at $t = 0$. But $\|u(t) - u_0\| \to \infty$ as $t \to \infty$ whenever $m \ne 0$. This shows that the equilibrium solution is unstable.
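Lyapunov's criterion is easy to apply numerically: compute the spectrum of $A$ and inspect the real parts. The sketch below is an illustration added here (not part of the original text); the nilpotent matrix is the one from Example 8.8.5, for which the criterion is silent.

```python
import numpy as np

def linearly_stable(A, tol=1e-12):
    """Classify the equilibrium of v' = Av by the spectrum of A.

    Returns "stable" if all eigenvalues have negative real part,
    "unstable" if some eigenvalue has positive real part, and
    "marginal" otherwise (Lyapunov's theorem is silent then).
    """
    re = np.real(np.linalg.eigvals(A))
    if np.all(re < -tol):
        return "stable"
    if np.any(re > tol):
        return "unstable"
    return "marginal"

stable_A = np.array([[-1.0, 3.0], [0.0, -2.0]])
nilpotent_A = np.array([[0.0, 1.0], [0.0, 0.0]])  # matrix of Example 8.8.5

print(linearly_stable(stable_A))     # stable
print(linearly_stable(nilpotent_A))  # marginal: eigenvalue 0, yet unstable
```

As the example shows, the marginal verdict is honest: the zero eigenvalue of the nilpotent matrix puts the equilibrium outside the reach of the theorem.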
Theorem 8.8.2 (Stability Criterion). If $A$ is a linear operator on a space $E$ and $A + A^*$ is negative semi-definite, that is, $\langle v, (A + A^*)v\rangle \le 0$ for all $v \in E$, then all equilibrium solutions of the equation

$$\dot{u} = Au + f \tag{8.8.7}$$

are stable, where $u$ is an element of a Hilbert space $E$ for each $t$, $A: E \to E$ is independent of $t$, and $f$ is a given element of $E$.
Proof. Suppose $u_0$ is an equilibrium solution of (8.8.7), that is, $Au_0 + f = 0$, and $u(t)$ is any other solution. If $v = u - u_0$, then $\dot{v} = Av$. Thus,

$$\frac{d}{dt}\|v\|^2 = \frac{d}{dt}\langle v, v\rangle = \langle\dot{v}, v\rangle + \langle v, \dot{v}\rangle = \langle Av, v\rangle + \langle v, Av\rangle = \langle v, (A + A^*)v\rangle.$$

If $A + A^*$ is negative semi-definite, then

$$\frac{d}{dt}\|v\|^2 \le 0.$$

This means that $\|v\|$ is a non-increasing function. Consequently, if $\|u(0) - u_0\| < \varepsilon$, then $\|u(t) - u_0\| < \varepsilon$ for all $t > 0$. This shows that all equilibrium solutions are stable.

We next consider the stability of a general nonlinear autonomous equation

$$\dot{u} = Nu. \tag{8.8.8}$$
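Before turning to the nonlinear case, the monotonicity of $\|v(t)\|$ asserted in the proof of Theorem 8.8.2 can be observed numerically. In the sketch below (an illustration; the matrix is an arbitrary choice with $A + A^*$ negative definite), the matrix exponential is approximated by a truncated series:

```python
import numpy as np

def expm(A, terms=40):
    """Truncated power series for the matrix exponential (adequate for small ||A||)."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    return E

A = np.array([[-1.0, 2.0], [0.0, -3.0]])
S = A + A.T                                     # A + A* for a real matrix
assert np.all(np.linalg.eigvalsh(S) <= 1e-12)   # negative semi-definite

v0 = np.array([1.0, 1.0])
norms = [np.linalg.norm(expm(t * A) @ v0) for t in np.linspace(0.0, 2.0, 21)]
assert all(b <= a + 1e-9 for a, b in zip(norms, norms[1:]))  # non-increasing
print("||v(t)|| is non-increasing, as Theorem 8.8.2 predicts")
```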
The question of stability of an equilibrium solution $u_0$ of (8.8.8) is concerned with the effects of small initial displacements of $u$ from $u_0$, and it only involves values of $u$ in a neighborhood of $u_0$. If $N$ is Fréchet differentiable, then the operator $N$ can be approximated by the linear operator $N'(u_0)$ in a neighborhood of $u_0$, and linear stability theory can be used. Hence,

$$Nu = Nu_0 + N'(u_0)(u - u_0) + o(u - u_0), \tag{8.8.9}$$

where $Nu_0 = 0$. Neglecting the term $o(u - u_0)$, Equation (8.8.8) becomes approximately

$$\dot{u} = N'(u_0)(u - u_0). \tag{8.8.10}$$
This equation may be called the linearized approximation of the nonlinear Equation (8.8.8). Its stability can be determined by the stability criteria discussed earlier. When $u$ is near $u_0$, (8.8.10) is the linearized approximation to (8.8.8), so it is natural to assume that the stability of the linearized equation determines that of the nonlinear equation. This principle is generally accepted as valid in the applied literature, and stability is determined formally by solving the associated linear eigenvalue problem. However, this general principle is not necessarily true, as the following counterexample shows.
Example 8.8.6. Consider the nonlinear equation $\dot{u} = u^3$, where $u(t) \in \mathbf{R}$ for each $t$. The equilibrium solution is $u_0 = 0$. The equation can be solved explicitly with the initial condition $u(0) = u_0$, and the solution is

$$u^2 = \frac{u_0^2}{1 - 2u_0^2 t},$$

which is not defined for $t = 1/2u_0^2$. Thus, $u_0$ is an unstable equilibrium. However, the linearized equation $\dot{u} = 0$ admits a stable solution. Thus, the stability of the linearized equation does not imply stability of the nonlinear equation. The difficulty associated with this example is that the linearized equation has eigenvalue $\lambda = 0$ (the critical case $\mathrm{Re}\,\lambda = 0$); the linearized system is only marginally stable. This means that an arbitrarily small perturbation can push the eigenvalue into the right half plane and make the system unstable. In other words, the eigenvalue zero corresponds to a constant solution of the linearized equation, and an arbitrarily small perturbation can change this constant solution and thus lead to instability. However, if all the eigenvalues of a linearized problem are negative, then its solutions tend to $u_0$ exponentially. The small perturbations involved in going from the linearized to the nonlinear problem cannot change exponential decay of $u - u_0$ into growth, so in this case the nonlinear problem will be stable.
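The blow-up in Example 8.8.6 can be made concrete with a short computation (the initial value below is an arbitrary illustrative choice):

```python
def u_squared(t, u0):
    """Exact solution of u' = u**3 with u(0) = u0:  u(t)**2 = u0**2/(1 - 2*u0**2*t)."""
    return u0 ** 2 / (1.0 - 2.0 * u0 ** 2 * t)

u0 = 0.1                        # an arbitrarily small initial displacement
t_blow = 1.0 / (2.0 * u0 ** 2)  # the solution ceases to exist at this time (about 50)
print(t_blow)
print(u_squared(0.99 * t_blow, u0))  # about 100 times u0**2 just before blow-up
```

However small $u_0$ is chosen, the solution leaves every neighborhood of the equilibrium in finite time, while the linearized equation $\dot u = 0$ keeps it constant.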
8.9. Bifurcation Theory

Bifurcation is a phenomenon involved in nonlinear problems and is closely associated with the loss of stability. We have seen in Section 8.8 that the stability of a dynamical system depends on whether the eigenvalues of the linearized operator have positive or negative real parts; these eigenvalues correspond to bifurcation points. We shall discuss bifurcation theory in terms of operator equations in a real Banach (or Hilbert) space. By a nonlinear eigenvalue problem we usually mean the problem of determining appropriate solutions of a nonlinear equation of the form

$$F(\lambda, u) = 0, \tag{8.9.1}$$

where $F: \mathbf{R} \times E \to B$ is a nonlinear operator, depending on the parameter $\lambda$, which operates on the unknown function or vector $u$, and $E$ and $B$ are real Banach (or Hilbert) spaces.
Bifurcation theory deals with the existence and behavior of solutions $u(\lambda)$ of Equation (8.9.1) as a function of the parameter $\lambda$. Of particular interest is the process of bifurcation (or branching), where a given solution of (8.9.1) splits into two or more solutions as $\lambda$ passes through a critical value $\lambda_0$, called a bifurcation point.

Definition 8.9.1 (Bifurcation Points). The solution of (8.9.1) is said to bifurcate from the solution $u_0(\lambda_0)$ at the value $\lambda = \lambda_0$ if the equation has at least two distinct solutions $u_1(\lambda)$ and $u_2(\lambda)$ such that both tend to $u_0 = u_0(\lambda_0)$ as $\lambda \to \lambda_0$. The points $(\lambda_0, u_0)$ satisfying Equation (8.9.1) are referred to as bifurcation (or branch) points if, in every neighborhood of $(\lambda_0, u_0)$, there exists a solution $(\lambda, u)$ different from $(\lambda_0, u_0)$.

The first problem of bifurcation theory is to determine the solution $u_0$ and the parameter $\lambda_0$ at which bifurcation occurs. The second problem is to find the number of solutions which bifurcate from $u_0(\lambda_0)$. The third problem is to study the behavior of these solutions for $\lambda$ near $\lambda_0$. To illustrate bifurcation, we consider the linear eigenvalue problem

$$Lu = \lambda u, \tag{8.9.2}$$

where $L$ is a linear operator acting on a function or a vector $u$ in some Banach space and $\lambda \in \mathbf{R}$. For every value of $\lambda$, (8.9.2) has the trivial solution $u = 0$ with norm $\|u\| = 0$. Suppose there is a sequence of eigenvalues $\lambda_1 < \lambda_2 < \lambda_3 < \cdots$ and corresponding normalized eigenfunctions $u_1, u_2, u_3, \dots$ such that

$$Lu_k = \lambda_k u_k, \qquad k = 1, 2, 3, \dots. \tag{8.9.3}$$

Then, for any real number $a$, the non-trivial solutions are $u = au_k$, $k = 1, 2, 3, \dots$, with norm $\|u\| = |a|$. The norms of both trivial and non-trivial solutions are shown graphically in Figure 8.1.
FIGURE 8.1. Bifurcation diagram.
Many examples of bifurcation phenomena occur in both differential and integral equations. One such example is the following.

Example 8.9.1. Consider a thin elastic rod with pinned ends lying in the $x$-$z$ plane. The shape of the rod is described by two functions $u(x)$ and $w(x)$, the dimensionless displacement functions in the $x$ and $z$ directions. The $x$-displacements of its end points are prescribed. The displacement functions $u(x)$ and $w(x)$ satisfy the following differential equations and boundary conditions:
$$\frac{d^2w}{dx^2} + \lambda w(x) = 0, \qquad 0 \le x \le 1, \tag{8.9.4}$$

$$\frac{du}{dx} + \frac{1}{2}\left(\frac{dw}{dx}\right)^2 = -\mu\lambda, \qquad 0 \le x \le 1, \tag{8.9.5}$$

$$w(0) = w(1) = 0, \qquad u(0) = -u(1) = \alpha > 0, \tag{8.9.6}$$
where the parameter $\lambda$ is proportional to the axial stress in the rod, the constant $\alpha$ in (8.9.6) is proportional to the prescribed end displacement and is referred to as the end-shortening, and $\mu$ is a positive physical constant.

Consider the linearized problem, in which the nonlinear term $w_x^2$ is absent. The solution of the linearized Equation (8.9.5) is

$$u(x) = \alpha(1 - 2x), \tag{8.9.7}$$

where $\alpha = \lambda\mu/2$. The solution of (8.9.4) and (8.9.6) is $w(x) = 0$ unless $\lambda$ is an eigenvalue $\lambda_n$ given by

$$\lambda_n = n^2\pi^2, \qquad n = 1, 2, 3, \dots. \tag{8.9.8}$$

In this case, $w$ is a multiple of the eigenfunction $w_n$ given by

$$w_n = A_n \sin n\pi x, \qquad n = 1, 2, 3, \dots, \tag{8.9.9}$$

where the $A_n$ are constants. From $\alpha = \frac{1}{2}\lambda\mu$ and $\lambda = \lambda_n = n^2\pi^2$, we conclude that if $\alpha = \alpha_n = \frac{1}{2}\mu\lambda_n$, then the rod buckles into a shape given by (8.9.7) and (8.9.9) with an undetermined amplitude $A_n$. The numbers $\alpha_n$ are called the critical end-shortenings. For $\alpha \ne \alpha_n$, $n = 1, 2, \dots$, the rod remains straight because the solution of (8.9.4) and (8.9.6) is

$$w(x) = 0. \tag{8.9.10}$$
We now consider the nonlinear problem (8.9.4)-(8.9.6). The solution of the problem is still given by (8.9.9) when $\lambda = \lambda_n$, and by (8.9.10) when $\lambda \ne \lambda_n$. To find $u(x)$ when $\lambda = \lambda_n$, we put (8.9.9) into (8.9.5) and integrate, using (8.9.6) at $x = 0$, to obtain

$$u(x) = u_n(x) = \alpha - \mu\lambda_n\left(1 + \frac{A_n^2}{4\mu}\right)x - \frac{n\pi A_n^2}{8}\sin 2n\pi x. \tag{8.9.11}$$

In view of the boundary condition $u(1) = -\alpha$, we obtain

$$\alpha = \alpha_n\left(1 + \frac{A_n^2}{4\mu}\right). \tag{8.9.12}$$
This is a relation between the end-shortening and the amplitude. The bifurcation diagrams for the thin rod are given in Figure 8.2. The diagram shows that, for $\alpha < \alpha_1$, the only solution is the trivial solution $w \equiv 0$. At $\alpha = \alpha_1$, the non-trivial solution $w_1 = A_1\sin\pi x$ bifurcates from the trivial solution and continues to exist for all $\alpha > \alpha_1$. The point $\alpha = \alpha_1$ is called the first bifurcation point, and the non-trivial solution is called the first bifurcation solution. For each $n$, non-trivial solutions of (8.9.12) for $A_n$ are possible if and only if $\alpha \ge \alpha_n$. The solutions bifurcate from the trivial (unbuckled) state $A_n = 0$ at $\alpha = \alpha_n$. Thus, the solution of the linearized problem determines the bifurcation points of the nonlinear problem. For any $\alpha$ in $\alpha_n \le \alpha \le \alpha_{n+1}$, there are $2n + 1$ solutions. For $\alpha < \alpha_1$, no buckling is possible. We also note from (8.9.12) that $d\alpha/dA_n = \alpha_n A_n/2\mu$. Hence, for a fixed amplitude $A$, the parabola in Figure 8.2 bifurcating from $\alpha_n$ has a steeper slope than that bifurcating from $\alpha_m$ if $m < n$. Clearly, these parabolas do not intersect. For any fixed value of $\alpha$, the bifurcation solutions can be classified by the values of the potential energy associated with them. We also observe
FIGURE 8.2. Bifurcation diagram for the thin rod.
that the potential energy is equal to the internal energy, since the displacements are specified at the ends of the rod. Consequently, the potential energy is proportional to the functional $V$ defined by

$$V(w) = \frac{1}{2}\int_0^1\left[w_{xx}^2 + \frac{1}{\mu}\left(u_x + \frac{1}{2}w_x^2\right)^2\right]dx. \tag{8.9.13}$$

In the unbuckled state, Equations (8.9.7) and (8.9.10) hold with $\alpha = \lambda\mu/2$, and the corresponding potential energy is

$$V_0 = \frac{2}{\mu}\alpha^2. \tag{8.9.14}$$

The potential energy $V_n$ of the buckled state given by (8.9.9) is obtained by substituting (8.9.8), (8.9.9) and (8.9.11) into (8.9.13), in the form

$$V_n = V_0 - \frac{2}{\mu}(\alpha - \alpha_n)^2. \tag{8.9.15}$$

Hence,

$$V_0 - V_n = \frac{2}{\mu}(\alpha - \alpha_n)^2 \ge 0, \tag{8.9.16}$$

$$\frac{\mu}{2}(V_n - V_m) = (\alpha_n - \alpha_m)[(\alpha - \alpha_n) + (\alpha - \alpha_m)] \ge 0, \qquad \alpha \ge \alpha_n \ge \alpha_m. \tag{8.9.17}$$
It follows from (8.9.16) and (8.9.17) that, for fixed $\alpha > \alpha_1$, the straight state has the largest energy and the branch originating from $\alpha_1$ has the smallest energy. For fixed $\alpha$ in the interval $\alpha_n \le \alpha \le \alpha_{n+1}$, the energies of the branches are ordered as $V_0 > V_n > V_{n-1} > \cdots > V_1$. The state of smallest energy has displacement function

$$w = A_1 w_1 = \pm 2\sqrt{\mu}\left(\frac{\alpha}{\alpha_1} - 1\right)^{1/2}\sin\pi x \qquad \text{for all } \alpha > \alpha_1. \tag{8.9.18}$$
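The relations $\lambda_n = n^2\pi^2$, $\alpha_n = \frac{1}{2}\mu\lambda_n$ and $A_n^2 = 4\mu(\alpha/\alpha_n - 1)$ from (8.9.8), (8.9.12) and (8.9.18) are easy to tabulate. The sketch below is an illustration added here; the value of $\mu$ is an arbitrary choice:

```python
import math

mu = 0.5  # illustrative value of the positive physical constant

def alpha_crit(n, mu):
    """Critical end-shortening alpha_n = mu * lambda_n / 2 with lambda_n = (n*pi)**2."""
    return 0.5 * mu * (n * math.pi) ** 2

def amplitude(alpha, n, mu):
    """Buckling amplitude from alpha = alpha_n * (1 + A_n**2 / (4*mu));
    the branch exists only for alpha >= alpha_n."""
    a_n = alpha_crit(n, mu)
    if alpha < a_n:
        return None
    return 2.0 * math.sqrt(mu * (alpha / a_n - 1.0))

alpha = alpha_crit(2, mu)       # at the second critical end-shortening
print(amplitude(alpha, 2, mu))  # 0.0: the n = 2 branch is just being born
print(amplitude(alpha, 1, mu))  # positive: the n = 1 branch has already buckled
print(amplitude(alpha, 3, mu))  # None: the n = 3 branch does not yet exist
```

This reproduces the counting in the text: between $\alpha_n$ and $\alpha_{n+1}$ the branches $1, \dots, n$ each contribute a pair $\pm A_k$, which together with the straight state gives $2n + 1$ solutions.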
Suppose the solutions of (8.9.1) represent equilibrium solutions of a dynamical system which evolves according to the time-dependent equation

$$u_t = F(\lambda, u), \tag{8.9.19}$$

where $u: \mathbf{R} \to E$ and $E$ is a Banach (or Hilbert) space. An equilibrium solution $u_0$ is stable if small perturbations from it remain close to $u_0$ as $t \to \infty$; $u_0$ is asymptotically stable if small perturbations tend to zero as $t \to \infty$ (see Section 8.8). When the parameter $\lambda$ changes, one solution may persist but become unstable as $\lambda$ passes a critical value $\lambda_0$, and it is at such a transition point that new solutions may bifurcate from the known solution.
One of the simplest nonlinear partial differential equations which exhibits the transition phenomena shown in Figure 8.3 is

$$u_t = \nabla^2 u + \lambda u - u^3 \quad \text{in } D, \tag{8.9.20}$$

$$u = 0 \quad \text{on } \partial D, \tag{8.9.21}$$
where $D$ is a smooth bounded domain in $\mathbf{R}^N$. The equilibrium states of (8.9.20) are given by solutions of the time-independent equation ($u_t \equiv 0$). One solution is obviously $u = 0$, which is valid for all $\lambda$; this solution becomes unstable at $\lambda = \lambda_1$, the first eigenvalue of the Laplacian on $D$: $\nabla^2 u_1 + \lambda_1 u_1 = 0$, $u_1 = 0$ on $\partial D$. For $\lambda > \lambda_1$, there are at least three solutions of the nonlinear equilibrium equation. The nature of the solution set in the neighborhood of $(\lambda_1, 0)$ is given in Figure 8.3; the new bifurcating solutions are stable. The Laplacian has a set of eigenvalues $\lambda_1 < \lambda_2 < \lambda_3 < \cdots$ which tend to infinity, and all of these eigenvalues are potential bifurcation points.

In the theory of calculus in Banach spaces, the following version of the Implicit Function Theorem is concerned with the existence, uniqueness and smoothness properties of the solution of Equation (8.9.1).

Theorem 8.9.1 (Implicit Function Theorem). Suppose $\Lambda$, $E$, $B$ are real Banach spaces and $F$ is a Fréchet differentiable mapping from a domain $D \subset \Lambda \times E$ to $B$. Assume $F(\lambda_0, u_0) = 0$ and the partial Fréchet derivative $F_u(\lambda_0, u_0)$ is an isomorphism from $E$ to $B$. Then, locally, for $\|\lambda - \lambda_0\|$ sufficiently small, there is a differentiable mapping $u(\lambda)$ from $\Lambda$ to $E$, with $(\lambda, u(\lambda)) \in D$, such that $F(\lambda, u(\lambda)) = 0$. Moreover, $(\lambda, u(\lambda))$ is the only solution of $F = 0$ in a sufficiently small neighborhood $D' \subset D$. If $F$ is $C^n$, then $u$ is $C^n$. If $\Lambda$, $E$, and $B$ are complex Banach spaces and $F$ is Fréchet differentiable, then $F$ is analytic and $u$ is analytic in $\lambda$.
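When the derivative $F_u$ is invertible along a branch, the theorem justifies numerical continuation: solve $F(\lambda, u) = 0$ by Newton's method at each $\lambda$, starting from the solution found at the previous $\lambda$. Below is a scalar sketch for the hypothetical model problem $F(\lambda, u) = \lambda u - u^3$ (an illustration, not an example from the text):

```python
def F(lam, u):
    return lam * u - u ** 3

def F_u(lam, u):
    return lam - 3.0 * u ** 2

def continue_branch(lam_values, u_start):
    """Follow a solution branch of F(lam, u) = 0 by Newton's method,
    using the previous solution as the predictor (valid while F_u != 0)."""
    u = u_start
    branch = []
    for lam in lam_values:
        for _ in range(50):  # Newton iteration at fixed lam
            u = u - F(lam, u) / F_u(lam, u)
        branch.append(u)
    return branch

lams = [0.5 + 0.1 * k for k in range(6)]          # stay away from lam = 0
branch = continue_branch(lams, u_start=0.7)       # near the branch u = sqrt(lam)
print(branch[-1])                                 # approximately 1.0 at lam = 1.0
```

On this branch $u = \sqrt{\lambda}$, so $F_u = \lambda - 3u^2 = -2\lambda \ne 0$ and the continuation is well defined; at $\lambda = 0$, where $F_u$ vanishes at $u = 0$, the trivial and nontrivial branches meet and the hypothesis of the theorem fails, which is exactly where bifurcation occurs.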
FIGURE 8.3. Bifurcation diagram where unstable solutions are represented by dashed lines.
The proof of the theorem is beyond the scope of this book; it can be carried out by a contraction mapping argument, and the result is adequate for most physical applications. The reader is referred to Sattinger (1973) and Dieudonné (1969) for a detailed discussion of proofs. Bifurcation phenomena typically accompany the transition to instability when a characteristic parameter crosses a critical value, and hence they play an important role in applications to mechanics. Indeed, mechanics is a rich source of bifurcation and instability phenomena, and the subject has always stimulated the rapid development of functional analysis.
8.10. Exercises
(1) Let $H_1$ and $H_2$ be real Hilbert spaces. Show that if $T$ is a bounded linear operator from $H_1$ into $H_2$, and $f$ is a real functional on $H_1$ defined by $f(x) = \|Tx - u\|^2$, where $u$ is a fixed vector in $H_2$, then $f$ has a Fréchet derivative at every point, given by

$$f'(x) = -2T^*u + 2T^*Tx,$$

where $T^*$ is the adjoint of $T$.
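The stated derivative can be checked against a central-difference quotient when $T$ is a matrix. In the sketch below, $T$, $u$ and $x$ are arbitrary illustrative choices:

```python
import numpy as np

T = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])  # a bounded operator R^2 -> R^3
u = np.array([1.0, -1.0, 2.0])

def f(x):
    return float(np.dot(T @ x - u, T @ x - u))   # f(x) = ||Tx - u||**2

def grad_f(x):
    return -2.0 * T.T @ u + 2.0 * T.T @ (T @ x)  # f'(x) = -2 T*u + 2 T*T x

x = np.array([0.3, 0.7])
h = 1e-6
fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
print(np.max(np.abs(fd - grad_f(x))))  # close to zero
```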
(2) Suppose $T: B_1 \to B_2$ is Fréchet differentiable on an open set $\Omega \subset B_1$. Show that if $x \in \Omega$ and $h \in B_1$ are such that $x + th \in \Omega$ for every $t \in [0, 1]$, then

$$\|T(x + h) - T(x)\| \le \|h\| \sup_{0 < \theta < 1}\|T'(x + \theta h)\|.$$

(13) (a) $\lambda_1 = \lambda_2 = 1/\pi$, $f(x) = c_1\cos x + c_2\sin x$.
(b) $\lambda_1 = i\sqrt{3}/2$, $f_1(x) = 1 - i\sqrt{3}\,x$; $\lambda_2 = -i\sqrt{3}/2$, $f_2(x) = 1 + i\sqrt{3}\,x$.
(14) (d) $f(x) = -2$.

(16) (a) $f(x) = \frac{1}{2}(3x + 1)$, (b) $f(x) = \sin x$, (c) $f(x) = \sinh x$, (d) $f(x) = e^{-x^2}$.

(19) $(Lu, v) = \int_0^1 v(x)\left(e^x\frac{d^2}{dx^2} + e^x\frac{d}{dx}\right)u(x)\,dx = \int_0^1 v(x)\left(e^x u'(x)\right)'\,dx = \int_0^1 u\,\overline{Lv}\,dx = (u, Lv)$.
(21) $u_n(x) = B_n \sin\left(\dfrac{2n - 1}{2}\right)x$, $n = 1, 2, \dots$.

(22) $\lambda_n = n^2\pi^2$, $u_n(x) = B_n\sin(n\pi\ln x)$, $n = 1, 2, \dots$.

(24) $(Tu, u) = ((DpD + q)u, u) = (DpDu, u) + q(u, u) = -p\|Du\|^2 + q\|u\|^2$.

(25) $(u, v) = \int_a^b u(x)v(x)r(x)\,dx$.

(27)
Hints and Answers to Selected Exercises
(31) (b) $\mathscr{F}\{Tu\} = \dfrac{2}{1 + k^2}\mathscr{F}\{u\}$. Since $\mathscr{F}$ is unitary, we have $\|\mathscr{F}\{Tu\}\| = \|Tu\| \le 2\|u\|$.
Hints and Answers to 6.6. Exercises
(1) Show first that if $f$ is a continuous function which is not identically zero, then there exists a $\phi \in \mathscr{C}_0^\infty(\mathbf{R}^N)$ with compact support such that $\int_{\mathbf{R}^N} f(x)\phi(x)\,dx \ne 0$.

(5) Use the function
$$f(t) = \begin{cases} 0, & t \le 0, \\ e^{-1/t^2}, & t > 0. \end{cases}$$
(6) (a), (b), (e), (i) Yes. (c), (f), (g), (h), (j) No. (d) No, if $\{x_n\}$ has a convergent subsequence.

(10) Use the Riemann-Lebesgue Lemma.

(12) Note that, for every $\varepsilon > 0$,

(13) Use the Riemann-Lebesgue Lemma to show that
$$\lim_{n\to\infty}\int_{-\alpha}^{\infty}\frac{\phi(x) - \phi(0)}{x}\sin nx\,dx = 0.$$

(17) $G(x, t) = \dfrac{1}{2c}H(t)\{H(x + ct) - H(x - ct)\} = \dfrac{1}{2c}H(ct - |x|)$.

(18) (b) $u(x, t) = \int_{-\infty}^{\infty} G(x, \xi, t)g(\xi)\,d\xi$, where $G(x, \xi, t)$ is obtained in 18(a). Since $G(x, \xi, t) = 1/2c$ if $x - ct < \xi < x + ct$ and 0 elsewhere, we have $u(x, t) = \dfrac{1}{2c}\int_{x - ct}^{x + ct} g(\xi)\,d\xi$.
(26)
$$u(x, y) = \frac{1}{2a}\sin\frac{\pi y}{a}\int_{-\infty}^{\infty}\frac{f(t)\,dt}{\cosh\frac{\pi}{a}(x - t) - \cos\frac{\pi y}{a}} + \frac{1}{2a}\sin\frac{\pi y}{a}\int_{-\infty}^{\infty}\frac{g(t)\,dt}{\cosh\frac{\pi}{a}(x - t) + \cos\frac{\pi y}{a}}.$$

(27)
$$u(x, y) = \sin\frac{\pi y}{2a}\int_{-\infty}^{\infty}\frac{f(t)\cosh\frac{\pi(x - t)}{2a}\,dt}{\cosh\frac{\pi}{a}(x - t) - \cos\frac{\pi y}{a}}.$$

(29)
$$\phi(x, y) = \frac{1}{2}\int_{-\infty}^{\infty}\frac{f_0(ak)}{k}\,e^{ikx - |k|y}\,dk.$$
Hints and Answers to 7.11. Exercises

(1) (a) $(d/dt)(\partial L/\partial\dot{x}_i) - \partial L/\partial x_i = 0$ implies $m\ddot{x}_i + kx_i = 0$. Multiply this equation by $\dot{x}_i$ and integrate to obtain $\frac{1}{2}m\dot{x}_i^2 + \frac{1}{2}kx_i^2 = \text{constant}$. (b) Use (7.10.1abc) and $L$ in 1(a), and show that it becomes the expression for $L$ in 1(b).

(2) $p_r = \dfrac{\partial L}{\partial\dot{r}} = m\dot{r}$, $p_\theta = \dfrac{\partial L}{\partial\dot{\theta}} = mr^2\dot{\theta}$, where $L = T - V = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) - V(r)$;
$$H = T + V = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) + V(r) = \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + V(r),$$
$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}, \qquad \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2}.$$
Then use $\dot{p}_r = -\partial H/\partial r$ and $\dot{p}_\theta = -\partial H/\partial\theta$.
(4) (i) $p = \dfrac{\partial T}{\partial\dot{x}} = m\dot{x}$, $\dfrac{\partial H}{\partial p} = \dfrac{p}{m} = \dot{x}$, $\dfrac{\partial H}{\partial x} = kx = -\dot{p}$, which implies $\ddot{x} = -(k/m)x$.

(ii) $p_r = \dfrac{\partial T}{\partial\dot{r}} = m\dot{r}$, $p_\theta = \dfrac{\partial T}{\partial\dot{\theta}} = mr^2\dot{\theta}$,
$$H = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + m\mu\left(\frac{1}{2a} - \frac{1}{r}\right),$$
$$\frac{\partial H}{\partial p_r} = \frac{p_r}{m} = \dot{r}, \qquad \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} = \dot{\theta}, \qquad \frac{\partial H}{\partial\theta} = 0 = -\dot{p}_\theta.$$
Thus, $\ddot{r} - r\dot{\theta}^2 = -\dfrac{\mu}{r^2}$ and $\dfrac{d}{dt}(r^2\dot{\theta}) = 0$.
(8) (iii), (iv) $[\hat{p}, x^2] = x[\hat{p}, x] + [\hat{p}, x]x = -x[x, \hat{p}] - [x, \hat{p}]x = -2i\hbar x$.

(11) Use (7.10.9ab). (12) Use (7.10.48).

(15)

(19)
(21) $(\psi, \psi) = \left(\sum_{n=1}^\infty(\psi_n, \psi)\psi_n,\ \sum_{k=1}^\infty(\psi_k, \psi)\psi_k\right) = \sum_{n,k=1}^\infty(\psi_n, \psi)^*(\psi_k, \psi)(\psi_n, \psi_k)$.

(22) (i) Use the fact that $A'$ is the difference of two Hermitian operators, and then show that $(\psi, A'\phi) = (A'\psi, \phi)$ for any $\psi$ and $\phi$. (ii) Use the fact that $A\langle B\rangle = \langle B\rangle A$, since $A$ is a linear operator and $\langle B\rangle$ is a scalar. (iii) $(A'\psi, A'\psi) = (\psi, (A')^2\psi) = (\psi, [A - \langle A\rangle]^2\psi)$.

$(x_1, x_2, x_3)$ lies on the sphere.

(12) Maximize $\int y\,dx$ subject to the condition $\int\sqrt{1 + (y')^2}\,dx = L$. In other words, maximize the functional $I_1(y) = \int\left[y(x) + \lambda\left(\sqrt{1 + (y'(x))^2} - L\right)\right]dx$. Answer: $(x - \alpha)^2 + (y - \beta)^2 = \lambda^2$, where $\alpha$, $\beta$ and $\lambda$ are constants.
(23) This polynomial is the real part of the binomial expansion of $(\cos\theta + i\sin\theta)^n$, where $x = \cos\theta$.

(29) Use repeated integration by parts to show
$$\int_{-1}^{1} D^n[(x^2 - 1)^n]\,x^m\,dx = 0, \qquad m = 0, 1, \dots, n - 1,$$
and then find the leading coefficient of $D^n[(x^2 - 1)^n]$.

(30) Use Rodrigues' formula and the binomial expansion of $(x^2 - 1)^n$.
(31) (b) $D^{n+1}[(x^2 - 1)^n] = D^n\{D[(x^2 - 1)^n]\} = 2n\,D^n[x(x^2 - 1)^{n-1}] = 2n\{x\,D^n[(x^2 - 1)^{n-1}] + n\,D^{n-1}[(x^2 - 1)^{n-1}]\}$, and then use Rodrigues' formula.

(33) (a) Use the recurrence relations 31(a) and 31(b). (b) Use 31(b) and 33(a).

(34) (a) Multiply the equality in 33(a) by $x$ and subtract from the equality in 31(b). (b) Square and add the equalities in 31(b) and 33(b).

(37) Transform into polar coordinates.

(41) (a) $w = A\sin nx$, $A[-n^2 - (\lambda - A^2)] = 0$. (b) $w = A\sin nx$, $A(\lambda - A^2n^2) = 0$.

(43) The term within the first bracket of the equation can be replaced by a constant $\mu$. Answer: $\mu = n^2\pi^2$, $w = A_n\sin n\pi x$, $A_n^2 = 4[(\lambda/n^2\pi^2) - 1]$.

(45) Note that the quantity in the square bracket is a constant and can be replaced by a constant $\alpha$. Then $\alpha = \alpha_n = n^2\pi^2$, $u = A\sin n\pi x$, $\lambda^2 - |A|^2 = n^2\pi^2$. Draw the bifurcation diagram.

(46) Square both sides of the equation and integrate from 0 to 1.
Bibliography
Balakrishnan, A. V., Applied Functional Analysis, Springer-Verlag, New York, 1976.
Balakrishnan, A. V., Introduction to Optimization Theory in a Hilbert Space, Springer-Verlag, New York, 1971.
Banach, S., Théorie des opérations linéaires, Chelsea, New York, 1955.
Berkovitz, L., Optimal Control Theory, Springer-Verlag, New York, 1975.
Cheney, E. W., Introduction to Approximation Theory, McGraw-Hill, New York, 1966.
Coddington, E. A. and Levinson, N., Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.
Curtain, R. F. and Pritchard, A. J., Functional Analysis in Modern Applied Mathematics, Academic Press, New York, 1977.
De Boor, C., Approximation Theory, Proceedings of Symposia in Applied Mathematics, Vol. 36, American Mathematical Society, Providence, 1986.
Dieudonné, J., Foundations of Modern Analysis, Academic Press, New York, 1969.
Dirac, P. A. M., The Principles of Quantum Mechanics (Fourth Edition), Oxford University Press, Oxford, 1958.
Dunford, N. and Schwartz, J. T., Linear Operators, Part I, General Theory, Interscience, New York, 1958.
Friedman, A., Foundations of Modern Analysis, Dover Publications, New York, 1982.
Garabedian, P. R., Partial Differential Equations, John Wiley and Sons, New York, 1964.
Glimm, J. and Jaffe, A., Quantum Physics (Second Edition), Springer-Verlag, New York, 1987.
Gould, S. H., Variational Methods for Eigenvalue Problems, Toronto University Press, Toronto, 1957.
Halmos, P. R., Measure Theory, Springer-Verlag, New York, 1974.
Hilbert, D., Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Leipzig, 1912.
Hutson, V. and Pym, J. S., Applications of Functional Analysis and Operator Theory, Academic Press, New York, 1980.
Iooss, G. and Joseph, D. D., Elementary Stability and Bifurcation Theory, Springer-Verlag, New York, 1981.
Jauch, J. M., Foundations of Quantum Mechanics, Addison-Wesley Publishing Company, Reading, Mass., 1968.
Jones, D. S., Generalized Functions, McGraw-Hill, New York, 1966.
Kantorovich, L. V. and Akilov, G. P., Functional Analysis in Normed Spaces, Pergamon Press, London, 1964.
Keller, J. B. and Antman, S., Bifurcation Theory and Nonlinear Eigenvalue Problems, W. A. Benjamin, New York, 1969.
Kolmogorov, A. N. and Fomin, S. V., Elements of the Theory of Functions and Functional Analysis, Vol. 1, Graylock Press, Rochester, New York, 1957; Vol. 2, Graylock Press, Albany, New York, 1961.
Kolmogorov, A. N. and Fomin, S. V., Introductory Real Analysis, Prentice-Hall, New York, 1970.
Kreyn, S. G., Functional Analysis, Foreign Technology Division WP-AFB, Ohio, 1967.
Kreyszig, E., Introductory Functional Analysis with Applications, John Wiley and Sons, New York, 1978.
Landau, L. D. and Lifshitz, E. M., Quantum Mechanics, Non-relativistic Theory, Pergamon Press, London, 1959.
Lax, P. D. and Milgram, A. N., Parabolic Equations, Contributions to the Theory of Partial Differential Equations, Ann. of Math. Studies, No. 33 (1954), Princeton, 167-190.
Lions, J. L. and Stampacchia, G., Variational Inequalities, Comm. Pure Appl. Math. 20 (1967), 493-519.
Luenberger, D. G., Optimization by Vector Space Methods, John Wiley and Sons, New York, 1969.
Liusternik, L. A. and Sobolev, V. J., Elements of Functional Analysis (Third English Edition), Hindustan Publishing Co., New Delhi, 1974.
Mackey, G. W., The Mathematical Foundations of Quantum Mechanics, W. A. Benjamin, New York, 1963.
MacNeille, H. M., A Unified Theory of Integration, Proc. Nat. Acad. Sci. USA, Vol. 27 (1941), 71-76.
Merzbacher, E., Quantum Mechanics (Second Edition), John Wiley and Sons, New York, 1961.
Mikusiński, J., Bochner Integral, Birkhäuser-Verlag, Basel, 1978.
Myint-U, T. and Debnath, L., Partial Differential Equations for Scientists and Engineers (Third Edition), North-Holland, New York, 1987.
Neumann, J. V., Mathematical Foundations of Quantum Mechanics, Princeton University Press, Princeton, 1955.
Reed, M. and Simon, B., Methods of Modern Mathematical Physics, Volume 1, Functional Analysis, Academic Press, New York, 1972.
Riesz, F. and Sz.-Nagy, B., Functional Analysis (Second Edition), Frederick Ungar, New York, 1955.
Rivlin, T. J., An Introduction to the Approximation of Functions, Dover Publications, New York, 1969.
Roach, G. F., Green's Functions (Second Edition), Cambridge University Press, Cambridge, 1982.
Sattinger, D. H., Topics in Stability and Bifurcation Theory, Lecture Notes in Mathematics, Vol. 309 (1973), Springer-Verlag, New York.
Schechter, M., Modern Methods in Partial Differential Equations, McGraw-Hill, New York, 1977.
Schwartz, L., Théorie des distributions, Vols. I and II, Hermann et Cie, Paris, 1950, 1951.
Shilov, G. E., Generalized Functions and Partial Differential Equations, Gordon and Breach, New York, 1968.
Sobolev, S. L., Partial Differential Equations of Mathematical Physics, Pergamon Press, London, 1964.
Stakgold, I., Boundary Value Problems of Mathematical Physics, Macmillan, New York, 1968.
Taylor, A. E., Introduction to Functional Analysis, John Wiley and Sons, New York, 1958.
Tricomi, F. G., Integral Equations, Interscience, New York, 1957.
Yosida, K., Functional Analysis (Fourth Edition), Springer-Verlag, New York, 1974.
Young, L. C., Calculus of Variations and Optimal Control Theory, W. B. Saunders Company, Philadelphia, 1969.
Zemanian, A. H., Distribution Theory and Transform Analysis, McGraw-Hill, New York, 1965.
List of Symbols
Page numbers indicate the page where the symbol is further defined.

$\mathbf{N}$, $\mathbf{Q}$, $\mathbf{R}$, $\mathbf{R}^+$, $\mathbf{C}$, $\mathbf{F}$, $\mathbf{R}^N$, $\mathbf{C}^N$, $\mathscr{C}(\Omega)$, $\mathscr{C}^k(\Omega)$, $\mathscr{C}^\infty(\Omega)$, $\mathscr{D}(\Omega)$, $\mathscr{C}([a, b])$, $\mathscr{C}^k([a, b])$, $\mathscr{C}^\infty([a, b])$
dimension of E, 10
norm, 10
empty set
open ball, 14
closed ball, 14
sphere, 14
closure of S, 16
boundary of S
domain of L, 23
range of L, 23
null space of L, 23
graph of L, 23
space of bounded linear mappings from E₁ into E₂, 25
dual space of E, 27
support of f, 39
expansion of an integrable function f, 44
space of Lebesgue integrable functions on R, 45
space of equivalence classes of Lebesgue integrable functions on R, 52
convergence in norm, 53
almost everywhere, 54
convergence almost everywhere, 55
characteristic function of S
Lebesgue measure of S, 68
real part of z
imaginary part of z
space of square integrable functions on R, 74
space of Lebesgue integrable functions on R^N, 75
space of square integrable functions on R^N, 76
space of Lebesgue integrable functions on Ω
space of square integrable functions on Ω
space of Lebesgue integrable functions on [a, b]
space of square integrable functions on [a, b], 75
convolution of f and g, 79
inner product, 88
complex conjugate of z
orthogonal, 92
orthogonal complement, 117
direct sum
projection onto S, 166
$P^\perp$ complementary projection operator, 168
$I$ identity operator, 138
$A^*$ adjoint of $A$, 145
$A^{-1}$ inverse of an operator $A$, 155
$P_n(x)$ Legendre polynomials, 100, 254
$P_n^m(x)$ associated Legendre functions, 254
$H_n(x)$ Hermite polynomials, 102, 256
$T_n(x)$ Chebyshev polynomials, 254
$P_n^{\alpha,\beta}(x)$ Jacobi polynomials, 255
$C_n^\lambda(x)$ Gegenbauer polynomials, 255
$L_n(x)$ Laguerre polynomials, 255
$L_n^a(x)$ associated Laguerre functions, 256
$J_n(x)$ Bessel functions, 256
$L^2_\rho([a, b])$ space of square integrable functions with the weight function $\rho$ on $[a, b]$
$D^\alpha$ differential operator, 96
$H^m(\Omega)$, $H^2(\Omega)$, $H_0^m(\Omega)$, $W_p^m(\Omega)$ Sobolev spaces, 96
$x_n \xrightarrow{w} x$ weak convergence, 96
$W(x; u, v)$ Wronskian of $u$ and $v$, 260
$G(x, t)$ Green's function, 253
$\nabla^2 = \Delta$ Laplacian, 298
$\Gamma(x)$ Euler's gamma function
$\mathscr{D}$, $\mathscr{D}(\mathbf{R}^N)$ space of test functions, 285
$\mathscr{D}'$, $\mathscr{D}'(\mathbf{R}^N)$ space of distributions, 287
$\Delta A$ uncertainty of $A$, 359
$L_x, L_y, L_z$ orbital angular momentum operators, 353, 396
$M_x, M_y, M_z$ general angular momentum operators, 399
$\sigma_x, \sigma_y, \sigma_z$ Pauli's spin matrices, 398
$J_x, J_y, J_z$ total angular momentum operators, 403
$p_j$ generalized momentum, 399
$q_j$ generalized coordinates, 336
$H$ Hamiltonian, 339
$\{A, B\}$ Poisson bracket of two functions $A$ and $B$, 343
$[A, B]$ commutator of two operators $A$ and $B$, 350
$\langle x|\psi\rangle = \psi(x)$ state vector, 347
$\langle\psi|x\rangle = \overline{\psi(x)}$ complex conjugate of $\psi(x)$, 347
$\Psi(x, t)$ time-dependent state vector, 363
$A$ observable operator, 349
$h$ Planck's constant, 345
$\hbar$ universal constant, 345
$\langle A\rangle$ expectation value of an operator $A$, 355
$dT(x, h)$, $T'(x)h$ Gâteaux derivative, 412
$\nabla f$ gradient of a functional, 413
$dT_x$, $T'x$ Fréchet derivative, 416
$T''$ second Fréchet derivative, 420
first variation, 438
space of polynomials of degree at most $n$, 455
Index
A
Abel's formula, 260 Abel's integral equation, 247 Abel's problem, 432 Absolutely convergent series, 21 Abstract minimization problem, 443 Action, 367 Adjoint boundary conditions, 251 Adjoint of a differential operator, 250 Adjoint operator, 149, 204 Admissible set, 425 Admittance, 269 Angular momentum, 344, 353 Angular momentum operator, 396, 400, 403 Annihilation operator, 392 Anomalous Zeeman effect, 400 Anti-derivative of a distribution, 293 Anti-Hermitian operator, 155 Anti-linear functional, 124 Apollonius' identity, 128 Approximate eigenvalue, 186 Approximation theory, 453 Associated Legendre functions, 254, 376 Associated Legendre operator, 254
Asymptotically stable solution, 461 Atomic units, 372 Autonomous dynamical system, 461
B Banach fixed point theorem, 30 Banach space, 19 Basic representation, 38 Basis, 10 Beppo Levi, 59 Bessel functions, 256 Bessel operator, 256 Bessel's equality and inequality, 104 Best approximation, 453 Bifurcation, 465 diagram, 466, 468, 470 points, 466 Biharmonic equation, 300, 323 Bilinear concomitant, 251 Bilinear functional, 143 Born, 367 Bosons, 402 Bounded bilinear functional, 143
Bounded linear mapping, 24 Bounded operator, 138 Bounded quadratic form, 144
C
Contraction mapping theorem, 30, 225 Control function, 447 Convergence absolute, 21 almost everywhere, 55 in norm, 53 in normed space, 12 pointwise, 12 strong, 25, 97 of test function, 286 uniform, 12, 25 weak, 97 weak distributional, 289 Convergence almost everywhere, 55 Convergence in norm, 53 Convergence in normed space, 12 Convergence of test function, 286 Convex functions, 423 Convex sets, 118 Convolution, 79 Convolution theorem, 196 Correspondence principle, 353 Cost functional, 447 Creation operator, 392 D
d'Alembert solution, 318 de Broglie wave, 370 Degenerate eigenvalue, 178 Degree of degeneracy, 178 Densely defined operator, 204 Dense subsets, 17 Derivative Frechet, 416 Gateaux, 412 second Frechet, 420 Derivatives of distributions, 288 Differential operator, 139 Diffusion equation, 299, 316 Dimension, 10, 348 Dirac delta distribution, 288 Dirichlet integral, 303 Dirichlet problem, 310 Dispersion relation, 368 Distributional solution, 296 Distributions, 287, 294 Domain, 23 Domain of a differential operator, 250 Dual space, 27
E
Ehrenfest's theorem, 375, 383 Eigenfunction, 176, 253 Eigenvalue, 176, 253 Eigenvalue space, 178 Eigenvector, 176 Elliptic functional, 147 Elliptic operator, 310 Equality almost everywhere, 54 Equation Abel integral, 247 Bessel, 256 biharmonic, 300, 323 continuity, 366 diffusion, 299 Euler-Lagrange, 428, 430, 435 Fredholm integral, 224, 232, 233 Hamilton, 340 Hamilton-Jacobi, 367 Heisenberg, 386 Helmholtz, 299, 300, 310 Hermite, 257 homogeneous integral, 224 integral, 223 Klein-Gordon, 300 matrix Riccati differential, 451 Lagrange, 336, 338, 431 Laguerre, 255 Laplace, 298, 319, 320 Legendre, 254 Navier-Stokes, 322 Newton 334 non-homogeneous integral, 224 non-homogeneous wave, 299 nonlinear Fredholm integral, 233 ordinary differential, 248, 268 partial differential, 295 Poisson, 299, 305 Schrodinger, 300, 362, 375 state, 452 Sturm-Liouville, 257 telegrapher, 299 Volterra integral, 223, 224, 236, 237, 239, 246 vorticity transport, 322 wave, 299, 300 Equation of continuity, 366 Equations of motion, 345 Equilibrium solution, 461 Equivalence of norms, 13 Euclidean norm, 11
Euclidean space, 92 Euler-Lagrange equations, 428, 431, 435, 436, 439 Existence and uniqueness of solution, 228, 229, 232, 233 Expectation value, 355 Extension of operators, 204 Extremum, 425
F Fatou's lemma, 61 Fejer's kernel, 114 Fermat's principle, 431 Fermions, 402 Finite dimensional operator, 173 Finite dimensional vector space, 10 Fixed point, 29, 224 Fixed point theorem, 30 Formally self-adjoint differential operator, 251 Fourier coefficients, 117 Fourier series, 117 Fourier transform, 193, 198 Frechet derivative, 416 Fredholm alternative, 229 Fredholm alternative for self-adjoint compact operators, 229 Fredholm equations, 224, 232, 233, 274 Fredholm operator, 151 Friedrich's first inequality, 311 Fubini's theorem, 78 Function associated Legendre, 254 Bessel, 256 characteristic, 38 control, 447 convex, 423 Dirac delta, 269, 288 eigen-, 176, 253 Green's, 263, 298 Hamilton's, 339, 431 Hamilton's principal, 367 Heaviside, 288 input, 269 Lagrangian, 335 Lebesgue integrable, 43, 75 locally integrable complex valued, 74 measurable, 70 momentum density, 372 momentum wave, 371 null, 52
output, 269 Rademacher, 110 series of integrable, 50 smooth, 285 square integrable, 74, 76 state, 349, 447 step, 38, 75 tent, 83 test, 285 transfer, 271 Walsh, 111 wave, 349 weight, 253 Functional, 27 anti-linear, 124 bilinear, 143 bounded bilinear, 143 coercive, 147 conjugate linear, 124 cost, 447 elliptic, 147 linear, 27 positive bilinear, 143 quadratic, 441 sesquilinear, 143 strictly positive, 143 symmetric bilinear, 143 Function of an operator, 191 Function spaces, 5 Fundamental matrix, 448 Fundamental solution, 297 G
Gateaux derivative, 412 Gegenbauer polynomials, 132 Generalized coordinates, 336 Generalized force, 339 Generalized Fourier coefficients, 106 Generalized Fourier series, 106 Generalized momentum, 339 Generator, 345, 380 Geodesic, 437 Gradient, 418 Gradient of a functional, 413 Gram-Schmidt orthonormalization process, 103 Graph, 23, 208 Graph of an operator, 23, 208 Green's first identity, 301 Green's function, 263, 298 Green's second identity, 301
Ground state energy, 392, 393 Group velocity, 369 H
Hamilton's canonical equation, 340, 386 Hamilton's function, 339, 431 Hamilton's principal function, 367 Hamilton's variational principle, 342 Hamilton-Jacobi's equation, 367 Hamiltonian, 339, 353, 431 Hamiltonian operator, 362 Hammerstein operator, 418 Heaviside function, 288 Heisenberg commutation relation, 351, 354 Heisenberg operator, 385 Heisenberg picture, 378, 384 Heisenberg's equation of motion, 385 Heisenberg's uncertainty principle, 359 Helmholtz equation, 299, 300, 306, 308 Hermite operator, 256 Hermite polynomials, 102, 256 Hermitian operator, 150 Hilbert, 87 Hilbert space, 93 Hilbert space isomorphism, 126 Hilbert-Schmidt theorem, 187 Hilbert transform, 275 Holder's inequality, 7 Holonomic system, 336 Homogeneous Dirichlet problem, 298 Homogeneous integral equation, 224 Homogeneous Neumann problem, 298 Homogeneous Volterra equation, 239 I
Idempotent operator, 167 Identity operator, 138 Image, 23 Implicit function theorem, 470 Index of performance, 447 Infinite dimensional vector space, 10 Inner product, 88 Inner product space, 88 Input function, 269 Integral equations, 223 Integral of a step function, 39 Integral operator, 140 Integral over an interval, 62 Interaction picture, 378, 389
Intrinsic angular momentum, 400 Inverse differential operators, 263 Inverse Fourier transform, 201 Inverse image, 23 Inverse operator, 155 Invertible operator, 155 Isometric operator, 159 Iterated kernels, 240
J Jacobian matrix, 414 Jacobi's identity, 404 Jacobi's operator, 255 Jacobi's polynomials, 132, 255
Linear harmonic oscillator, 390, 394 Linear independence, 9 Linear mapping, 23 Linear momentum, 334, 353 Linear operator, 138 Linear transformation, 138 Lions, 445 Lions-Stampacchia theorem, 445 Lipschitz's condition, 228, 277 Locally integrable complex valued functions, 74 Locally integrable functions, 62, 76 Lowest state energy, 392 Lyapounov's theorem, 463 M
K Kernel, 223 Kinetic energy, 334, 353 Klein-Gordon equation, 300 L
L1-norm, 51 L2-norm, 75 Lagrange identity, 260 Lagrange's equations of motion, 336, 431 Lagrangian, 431 Lagrangian function, 335 Laguerre operator, 255 Laguerre polynomials, 130, 255 Laplace equation, 298, 319, 320 Laplace operator, 298, 319, 320 Law of conservation of energy, 341 Lax, 148 Lax-Milgram theorem, 148 Least-square approximation, 455 Lebesgue, 37 Lebesgue dominated convergence theorem, 60 Lebesgue integrable functions, 43, 75 Lebesgue integral, 43 Lebesgue integral for complex valued functions, 72 Lebesgue measure, 68 Legendre equation, 254, 376 Legendre operator, 254 Legendre polynomials, 100, 102, 254, 376 Linear combinations, 9 Linear dependence, 9 Linear functional, 27
MacNeille, 38 Magnetic quantum number, 337 Matrix Riccati equation, 451 Maxwellian distribution, 395 Mean value theorem, 415 Measurable functions, 70 Measurable sets, 68 Measure, 68 Measurement, 349 Method of successive approximation, 225, 234 Mikusinski, 38 Milgram, 148 Minkowski's inequality, 8 Momentum density function, 372 Momentum wave function, 371 Monotone convergence theorem, 59 Multiple eigenvalue, 178 Multiplication operator, 140 Multiplicity, 178 Multiplier, 140 N Navier-Stokes equation, 322 Neumann, 87 Neumann problem, 298 Neumann series, 227 Newton's equation, 334 Newton's second law of motion, 334, 388 Non-degenerate eigenvalue, 178 Non-homogeneous equation, 224 Non-homogeneous Volterra equation, 240 Non-homogeneous wave equation, 299
Nonlinear Fredholm equation, 233 Non-separable Hilbert space, 125 Norm, 10, 51, 90, 91 Euclidean, 11 L1, 51 L2, 75 lp, 11 strictly convex, 127 sup, 138 uniform, 12 Normal operator, 158 Normed space, 11 Norm in inner product space, 91 Norm of a bounded bilinear functional, 143 Norm of a bounded quadratic form, 144 Norm of uniform convergence, 12 Null function, 52 Null operator, 138 Null set, 54 Null space, 23
O Observable operators, 350 Observables, 345 Observation, 349 One dimensional Schrodinger equation, 394 One-sided shift operator, 159 Open balls, 14 Open sets, 15 Operator, 23, 138 adjoint, 149, 204 adjoint of a densely defined, 204 adjoint of a differential, 250 angular momentum, 396, 400, 403 annihilation, 392 anti-Hermitian, 155 associated Legendre, 254 Bessel, 256 bounded, 138 Chebyshev, 254 closed, 208 closed unbounded, 208 commuting, 141 compact, 171 complementary projection, 168 completely continuous, 171, 176 creation, 392 densely defined, 204 differential, 139 elliptic, 310
finite dimensional, 173 formally self-adjoint differential, 251 Fredholm, 151 Hamiltonian, 362 Hammerstein, 418 Heisenberg, 385 Hermite, 256 Hermitian, 150 idempotent, 167 identity, 138 integral, 140 inverse, 155 inverse differential, 263 invertible, 155 isometric, 159 Jacobi, 255 Laguerre, 255 Laplace, 298, 319, 320 Legendre, 254 linear, 138 linear harmonic oscillator, 390, 394 linear momentum, 334, 353 multiplication, 140 normal, 158 null, 138 observable, 350 one-sided shift, 159 orbital angular momentum, 396 orthogonality of a projection, 169 positive, 161 positive definite, 166 projection, 166 quantum, 353 Schrodinger, 385 self-adjoint, 150, 206 square root of a positive, 165 strictly positive, 166 symmetric, 206 time-evolution, 379 total Hamiltonian, 389 two-sided shift, 214 unbounded, 203 unitary, 160 Optimal control problems, 447 Optimal error, 454 Optimal solution, 454 Optimal trajectory, 448 Optimization problems, 424 Orbital angular momentum operators, 396 Orbital quantum number, 377 Ordinary differential equations, 248, 268
Orthogonal complement, 117 Orthogonal decomposition, 121 Orthogonality of projection operators, 169 Orthogonal systems, 99 Orthogonal projection, 120 Orthogonal vectors, 92 Orthonormal basis, 124 Orthonormal polynomials, 458 Orthonormal sequence, 99 Orthonormal systems, 99 Outcome of quantum measurement, 358 Output function, 269
P Parallelogram law, 91 Parseval relation of Fourier transforms, 199, 201 Parseval's formula, 108 Partial differential equation, 295 Particle momentum, 369 Pauli's spin matrices, 398 Periodic boundary condition, 249 Periodic Sturm-Liouville system, 258 Picard's existence theorem, 228 Plancherel theorem, 202 Planck, 345 Planck's constant, 345 Planck's simple harmonic oscillator, 394 Point spectrum, 177 Pointwise convergence, 12 Poisson equation, 299, 305 Poisson's bracket, 343, 386 Polarization identity, 128, 144 Pontrjagin maximum principle, 448 Position, 353 Positive bilinear functional, 143 Positive definite operator, 166 Positive operator, 161 Potential energy, 334, 353 Pre-Hilbert space, 88 Principle of linearized stability, 462 Principle of quantization, 354 Principle of superposition, 306 Principal quantum numbers, 377 Probability current density, 366 Probability density, 365 Probability flux, 366 Product of two operators, 141 Projection onto S, 121 Projection operator, 166 Proper subspace, 5
Pythagorean formula, 92, 104
Q Quadratic form, 143 Quadratic functional, 441 Quantization, 354 Quantum number, 393 Quantum operators, 353 R Rademacher function, 110 Range, 23 Real vector space, 4 Regular distribution, 287 Regular points, 177 Regular Sturm-Liouville systems, 257 Relative extrema, 425 Representer of a functional, 124 Resolution of a pulse, 271 Resolvent, 177, 235 Riemann integrable functions, 64 Riemann integral, 64 Riemann-Lebesgue lemma, 195 Riesz, 58, 123 Riesz representation theorem, 123 Robin problem, 298 Rodrigues formula, 377 Root-mean-square deviation, 355, 359
S Scalars, 3 Schrodinger picture, 378 Schrodinger's equation, 300, 362 Schwartz, 283 Schwarz's inequality, 90 Second Frechet derivative, 420 Self-adjoint and formally self-adjoint differential operator, 251 Self-adjoint operator, 150, 206 Separability, 125 Separable Hilbert spaces, 124 Separable kernel, 242 Separable spaces, 124 Separated boundary conditions, 249 Sequence spaces, 6 Series of integrable function, 50 Sesquilinear functional, 143 Set of measure zero, 68
Simple eigenvalue, 178 Singular distribution, 287 Singular Sturm-Liouville system, 257 Smooth function, 285 Snell's law, 432 Sobolev, 97 Sobolev space, 96 Solution asymptotically stable, 461 bifurcation, 466 classical, 295 distributional, 296 equilibrium, 461 stable, 461 unstable, 461 weak, 296 Space Banach space, 19 C, 3 C([a, b]), 89 CN, 5 C0(R), 95 Hm, 96 Hm(Ω) = W2m(Ω), 97 l2, 6, 89 lp, 6 L1(R), 45 L2(R), 74 L2(RN), 75 L2([a, b]), 89 R, 3 Euclidean, 92 finite dimensional vector space, 10 Hilbert, 93 infinite dimensional, 10 infinite sequence, 6 inner product, 88 non-separable Hilbert, 125 pre-Hilbert, 88 separable Hilbert, 124 Sobolev, 96 test function, 285, 294 vector, 4 Space spanned by S, 10 Spectral theorem for self-adjoint compact operators, 189 Spectral theorem for unbounded operators, 211 Spectrum, 177 Sphere, 14 Spherical harmonics, 377 Spherically symmetric potential, 375
Spin, 400 Square integrable functions, 74, 76 Square root of an operator, 165 Stability criterion, 463 Stable solution, 461 Stampacchia, 445 Standard deviation, 359 State equation, 452 State function, 349, 447 States, 345 State transition matrix, 449 State vector, 347 Stationary point, 425 Stationary state, 362 Step function, 38, 75 Strictly convex norm, 127 Strictly positive functional, 143 Strictly positive operator, 166 Strong convergence, 25, 97 Sturm-Liouville systems, 257 Subspace, 5 Successive approximation, 234 Summability kernel, 113 Support, 39 Symmetric bilinear functional, 143 Symmetric operator, 206 Synthesis of a pulse, 271
T
Tautochronous motion, 247 Telegrapher equation, 299 Tent function, 83 Test function space, 285 Theorem Banach fixed point, 30 closed graph, 208 compatibility, 407 contraction mapping, 30, 225 convolution, 196 Ehrenfest, 375, 383 Fubini's, 78 Hilbert-Schmidt, 187 implicit function, 470 Lax-Milgram, 148 Lebesgue dominated convergence, 60 Lions-Stampacchia, 445 Lyapounov, 463 mean value, 415 monotone convergence, 59 orthogonal projection, 120
Picard existence, 228 Plancherel, 202 Riesz representation, 123 spectral, 189, 211 virial, 408 Weierstrass approximation, 17 Time-dependent Schrodinger equation, 363 Time-dependent state vector, 347 Time-evolution equation, 381 Time-evolution operator, 379 Time-invariant, 447 Total angular momentum, 403 Total energy, 335, 353 Total Hamilton operator, 389 Transfer function, 271 Transition matrix, 449 Triangle inequality, 11, 91 Trigonometric Fourier series, 112 Two-sided shift operator, 214
U Unbounded operators, 203 Uncertainty, 359 Uncertainty principle, 360 Uniform convergence, 12, 25 Unitary operator, 160 Unitary space, 88 Universal constant, 346 Unstable solution, 461
V Variables, 345 Variational inequalities, 443 Variational problems, 311, 411 Vector space, 4 Vector subspace, 5 Virial theorem, 408 Volterra equation, 236, 246 Volterra equation of the first kind, 223, 246 Volterra equation of the second kind, 224, 237 Vorticity transport equation, 322
W Walsh functions, 111 Wave equation, 299, 300, 318 Wave function, 349 Wave-particle duality, 370 Weak convergence, 97 Weak distributional convergence, 289 Weak solution, 296, 311 Weierstrass approximation theorem, 17 Weighted average, 357 Weight function, 253 Wronskian, 260
Z Zeeman effect, 400