$(A^* e_j, e_k) = (e_j, A e_k) = \overline{(A e_k, e_j)} = \overline{a_{jk}}$. Therefore the operator $A$ is self-adjoint if $a_{ij} = \overline{a_{ji}}$. A matrix satisfying this condition is often called Hermitian.
Linear Operators on Hilbert Spaces
Example 4.4.2. Let $H$ be a separable infinite dimensional Hilbert space and let $\{e_1, e_2, e_3, \ldots\}$ be a complete orthonormal sequence in $H$. Let $A$ be a bounded operator on $H$ represented by an infinite matrix $(a_{ij})$; see Theorem 4.2.2. As in the finite dimensional case, the adjoint operator $A^*$ is represented by the infinite matrix $(\overline{a_{ji}})$. $A$ is self-adjoint if $a_{ij} = \overline{a_{ji}}$ for all $i, j \in \mathbb{N}$.
Example 4.4.3. Let $T$ be a Fredholm operator on $L^2([a, b])$ defined by
$$(Tx)(s) = \int_a^b K(s, t)\, x(t)\, dt,$$
where $K$ is a function defined on $[a, b] \times [a, b]$ such that
$$\int_a^b \int_a^b |K(s, t)|^2\, ds\, dt < \infty.$$
Note that the condition is satisfied if $K$ is continuous. We have
$$(Tx, y) = \int_a^b \int_a^b K(s, t)\, x(t)\, \overline{y(s)}\, ds\, dt = \int_a^b x(t)\, \overline{\left( \int_a^b \overline{K(s, t)}\, y(s)\, ds \right)}\, dt = \left( x,\ \int_a^b \overline{K(s, t)}\, y(s)\, ds \right).$$
This shows that
$$(T^* y)(s) = \int_a^b \overline{K(t, s)}\, y(t)\, dt.$$
Thus a Fredholm operator is self-adjoint if its kernel satisfies the equality $K(s, t) = \overline{K(t, s)}$.
Example 4.4.4. Let $A$ be the operator on $L^2([a, b])$ defined by $(Ax)(t) = t\, x(t)$. Since
$$(Ax, y) = \int_a^b t\, x(t)\, \overline{y(t)}\, dt = \int_a^b x(t)\, \overline{t\, y(t)}\, dt = (x, Ay),$$
$A$ is self-adjoint.
Theory
Example 4.4.5. Consider the operator $A$ defined on $L^2(\mathbb{R})$ by
$$(Ax)(t) = e^{-|t|} x(t).$$
This is also a bounded self-adjoint operator. Boundedness of $A$ can be shown as in Example 4.2.5. Moreover, we have
$$(Ax, y) = \int_{-\infty}^{\infty} e^{-|t|} x(t)\, \overline{y(t)}\, dt = \int_{-\infty}^{\infty} x(t)\, \overline{e^{-|t|} y(t)}\, dt = (x, Ay).$$
Thus $A$ is self-adjoint.

Example 4.4.6. Let […] $a > 0$ ($a \in \mathbb{R}$) such that $aA \le \mathbb{I}$.

Easy proofs are left as exercises.
Example 4.6.3. The product of two positive operators is not necessarily positive. Indeed, one can exhibit positive operators $A$ and $B$ on $\mathbb{R}^2$, given by $2 \times 2$ matrices, such that the product $AB$ is not positive.
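A concrete instance can be checked numerically. The matrices below are an illustrative choice made for this sketch (not necessarily the ones intended in the text): both are symmetric positive semidefinite, yet the quadratic form of their product takes a negative value.

```python
import numpy as np

# Two positive (semidefinite) operators on R^2; an illustrative choice,
# not taken from the text.
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # eigenvalues 0 and 2
B = np.array([[1.0, 0.0],
              [0.0, 0.0]])   # eigenvalues 1 and 0

# Both quadratic forms are non-negative...
assert np.all(np.linalg.eigvalsh(A) >= -1e-12)
assert np.all(np.linalg.eigvalsh(B) >= -1e-12)

# ...but (ABx, x) < 0 for x = (1, -2), so AB is not positive.
x = np.array([1.0, -2.0])
value = x @ (A @ B) @ x
print(value)  # -1.0
```

Note that $AB$ here is not even symmetric, which is exactly what fails when $A$ and $B$ do not commute.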
Theorem 4.6.3. The product of two commuting positive operators is a positive operator.

Proof. Let $A$ and $B$ be commuting positive operators (if $A = 0$ the assertion is trivial). Define
$$A_1 = \frac{A}{\|A\|}, \qquad A_{n+1} = A_n - A_n^2, \quad n = 1, 2, \ldots.$$
Note that the operators $A_n$ are self-adjoint and commuting. We will show, by induction, that
$$0 \le A_n \le \mathbb{I} \tag{4.6.1}$$
for all $n \in \mathbb{N}$. Clearly, (4.6.1) is satisfied for $n = 1$. Suppose now (4.6.1) holds for some $k \in \mathbb{N}$. Then
$$(A_k^2(\mathbb{I} - A_k)x, x) = ((\mathbb{I} - A_k)A_k x, A_k x) \ge 0$$
and
$$(A_k(\mathbb{I} - A_k)^2 x, x) = (A_k(\mathbb{I} - A_k)x, (\mathbb{I} - A_k)x) \ge 0,$$
which means $A_k^2(\mathbb{I} - A_k) \ge 0$ and $A_k(\mathbb{I} - A_k)^2 \ge 0$. Consequently
$$A_{k+1} = A_k - A_k^2 = A_k^2(\mathbb{I} - A_k) + A_k(\mathbb{I} - A_k)^2 \ge 0$$
and
$$\mathbb{I} - A_{k+1} = (\mathbb{I} - A_k) + A_k^2 \ge 0.$$
This shows that (4.6.1) holds for $k + 1$, and thus for all $n \in \mathbb{N}$, by induction.

We have
$$A_1 = A_1^2 + A_2 = A_1^2 + A_2^2 + A_3 = \cdots = \sum_{k=1}^{n} A_k^2 + A_{n+1},$$
and hence
$$\sum_{k=1}^{n} A_k^2 = A_1 - A_{n+1} \le A_1.$$
Therefore
$$\sum_{k=1}^{n} (A_k x, A_k x) \le (A_1 x, x).$$
This shows that the series $\sum_{n=1}^{\infty} \|A_n x\|^2$ converges and $\|A_n x\| \to 0$. Then
$$\left( \sum_{k=1}^{n} A_k^2 \right) x = A_1 x - A_{n+1} x \to A_1 x \quad \text{as } n \to \infty,$$
or, equivalently,
$$\sum_{n=1}^{\infty} A_n^2 x = A_1 x.$$
Since $B$ commutes with $A_n$ for all $n \in \mathbb{N}$, we have
$$(ABx, x) = \|A\|(B A_1 x, x) = \|A\| \sum_{n=1}^{\infty} (B A_n^2 x, x) = \|A\| \sum_{n=1}^{\infty} (B A_n x, A_n x) \ge 0.$$
This proves the theorem.

The following theorem is an interesting analog of a property of real numbers.
Theorem 4.6.4. Let $A_1 \le A_2 \le \cdots \le A_n \le \cdots$ be self-adjoint operators on a Hilbert space $H$ such that $A_n A_m = A_m A_n$ for all $m, n \in \mathbb{N}$. If $B$ is a self-adjoint operator on $H$ such that $A_n B = B A_n$ and $A_n \le B$ for all $n \in \mathbb{N}$, then the sequence $\{A_n\}$ converges strongly to a self-adjoint operator $A$, and $A_1 \le A \le B$.
Proof. Define $C_n = B - A_n$. The operators $C_n$ commute with each other, $C_n \ge 0$, and $C_m - C_n = A_n - A_m \ge 0$ for $n > m$. By Theorem 4.6.3, for $n > m$ the operators
$$C_m(C_m - C_n) = C_m^2 - C_m C_n \quad \text{and} \quad (C_m - C_n)C_n = C_m C_n - C_n^2$$
are positive. Hence
$$(C_m^2 x, x) \ge (C_m C_n x, x) \ge (C_n^2 x, x)$$
for every $x \in H$. Since, for an arbitrary fixed $x \in H$, $\{(C_n^2 x, x)\}$ is a nonincreasing sequence of non-negative numbers, it converges and thus
$$\lim_{m, n \to \infty} (C_m C_n x, x) = \lim_{n \to \infty} (C_n^2 x, x).$$
Hence
$$\|C_m x - C_n x\|^2 = (C_m^2 x, x) - 2 (C_m C_n x, x) + (C_n^2 x, x) \to 0$$
as $m, n \to \infty$. Therefore $\{C_n x\}$ is a Cauchy sequence for every $x \in H$. Consequently $\{C_n x\}$, and thus also $\{A_n x\}$, converges for every $x \in H$. It is easy to check that the operator $A$ defined by $Ax = \lim_{n \to \infty} A_n x$ is self-adjoint and $A_1 \le A \le B$.

Definition 4.6.2 (Square Root). A square root of a positive operator $A$ is a self-adjoint operator $B$ satisfying $B^2 = A$.
Theorem 4.6.5. Every positive operator $A$ has a unique positive square root $B$. Moreover, $B$ commutes with every operator commuting with $A$.

Proof. Let $A \ge 0$ and let $a > 0$ ($a \in \mathbb{R}$) be such that $a^2 A \le \mathbb{I}$. Define $T_0 = 0$ and
$$T_{n+1} = T_n + \tfrac{1}{2}\left( a^2 A - T_n^2 \right) \tag{4.6.2}$$
for $n = 0, 1, 2, \ldots$. Note that the operators $T_n$ are self-adjoint (as polynomials of $A$ with real coefficients) and positive. Moreover, they commute with every operator commuting with $A$. In particular, $T_n T_m = T_m T_n$ for all $m$ and $n$. For every $n$, we have
$$\mathbb{I} - T_{n+1} = \tfrac{1}{2}(\mathbb{I} - T_n)^2 + \tfrac{1}{2}(\mathbb{I} - a^2 A) \tag{4.6.3}$$
and
$$T_{n+1} - T_n = \tfrac{1}{2}\big( (\mathbb{I} - T_{n-1}) + (\mathbb{I} - T_n) \big)(T_n - T_{n-1}). \tag{4.6.4}$$
In view of (4.6.3), we have $T_n \le \mathbb{I}$ for all $n$. Moreover, $T_n \le T_{n+1}$ for all $n$. Indeed,
$$T_1 - T_0 = \tfrac{1}{2} a^2 A \ge 0,$$
and if $T_n - T_{n-1} \ge 0$, then $T_{n+1} - T_n \ge 0$, by (4.6.4). By Theorem 4.6.4, the sequence $\{T_n\}$ converges strongly to a positive self-adjoint operator $T$. Letting $n \to \infty$ in (4.6.2) yields $T = T + \tfrac{1}{2}(a^2 A - T^2)$, i.e.,
$$T^2 = a^2 A.$$
Denote $B = T/a$. Then $B^2 = A$. The operator $B$ is obviously positive. Since, for each $n \in \mathbb{N}$, $T_n$ commutes with every operator commuting with $A$, so do $T$ and $B$.
It remains to prove the uniqueness. Let $C$ be a positive operator such that $C^2 = A$. Since $C$ commutes with $A$, $C$ commutes with $B$. Let $x \in H$ and let $y_0 = (B - C)x$. Then
$$(B y_0, y_0) + (C y_0, y_0) = ((B + C) y_0, y_0) = ((B + C)(B - C)x, y_0) = ((B^2 - C^2)x, y_0) = 0.$$
Since $B$ and $C$ are positive, we have $(B y_0, y_0) = (C y_0, y_0) = 0$. If $D$ is a positive square root of $B$, then
$$\|D y_0\|^2 = (D^2 y_0, y_0) = (B y_0, y_0) = 0.$$
Hence $D y_0 = 0$ and also $B y_0 = D(D y_0) = 0$. Similarly $C y_0 = 0$. Finally,
$$\|Bx - Cx\|^2 = ((B - C)^2 x, x) = ((B - C) y_0, x) = 0$$
for arbitrary $x \in H$. This proves $B = C$, completing the proof of the theorem.
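The iteration (4.6.2) can be run numerically for a positive matrix. The sketch below is an illustration under one extra assumption: $a$ is chosen with $a^2 = 1/(2\|A\|)$, which guarantees $a^2 A \le \mathbb{I}$ with room to spare, so the iterates $T_n$ converge to $T$ with $T^2 = a^2 A$ and $B = T/a$ is the positive square root.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                  # positive definite

a = 1.0 / np.sqrt(2 * np.linalg.norm(A, 2))  # ensures a^2 A <= I strictly
T = np.zeros_like(A)                         # T_0 = 0
for _ in range(200):
    T = T + 0.5 * (a**2 * A - T @ T)         # T_{n+1} = T_n + (a^2 A - T_n^2)/2

B = T / a                                    # candidate positive square root
print(np.allclose(B @ B, A))                 # True
```

The scheme is only linearly convergent; it is chosen here because it mirrors the proof, not because it is the fastest way to compute a matrix square root.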
Definition 4.6.3 (Positive Definite Operator). A self-adjoint operator $A$ is called strictly positive or positive definite if $(Ax, x) > 0$ for all $x \in H$, $x \neq 0$.

[…]

$$\lambda_1 \overline{\lambda_2}\, (u_1, u_2) = (\lambda_1 u_1, \lambda_2 u_2) = (A u_1, A u_2) = (u_1, u_2).$$
Since $\lambda_1 \overline{\lambda_2} \neq 1$, we get $(u_1, u_2) = 0$, which proves that the eigenvectors $u_1$ and $u_2$ are orthogonal. The proof is complete.

Theorem 4.9.7. For every eigenvalue $\lambda$ of a bounded operator $A$ we have $|\lambda| \le \|A\|$.

Proof. Let $u$ be a non-zero eigenvector corresponding to $\lambda$. Since $Au = \lambda u$, we have $\|Au\| = \|\lambda u\|$, and thus
$$|\lambda|\, \|u\| = \|Au\| \le \|A\|\, \|u\|.$$
This implies $|\lambda| \le \|A\|$.
Remark. If the eigenvalues are considered as points in the complex plane, the above result implies that all the eigenvalues of a bounded operator $A$ lie inside the closed disk of radius $\|A\|$ centered at the origin.

Corollary 4.9.1. All eigenvalues of a bounded self-adjoint operator $A$ satisfy the inequality
$$|\lambda| \le \sup_{\|x\| \le 1} |(Ax, x)|. \tag{4.9.9}$$
Proof follows immediately from Theorem 4.4.5.

It is natural to ask whether the absolute value of some eigenvalue actually attains the value $\|A\|$. In general the answer is negative, but it is true for compact self-adjoint operators.
Theorem 4.9.8. If $A$ is a non-zero compact self-adjoint operator on a Hilbert space $H$, then it has an eigenvalue $\lambda$ equal to either $\|A\|$ or $-\|A\|$.

Proof. Let $\{u_n\}$ be a sequence of elements of $H$ such that $\|u_n\| = 1$ for all $n \in \mathbb{N}$ and
$$\|A u_n\| \to \|A\| \quad \text{as } n \to \infty. \tag{4.9.10}$$
Then
$$\begin{aligned}
\big\| A^2 u_n - \|A u_n\|^2 u_n \big\|^2 &= \big( A^2 u_n - \|A u_n\|^2 u_n,\ A^2 u_n - \|A u_n\|^2 u_n \big) \\
&= \|A^2 u_n\|^2 - 2 \|A u_n\|^2 (A^2 u_n, u_n) + \|A u_n\|^4 \|u_n\|^2 \\
&= \|A^2 u_n\|^2 - 2 \|A u_n\|^2 (A u_n, A u_n) + \|A u_n\|^4 \\
&= \|A^2 u_n\|^2 - \|A u_n\|^4 \\
&\le \|A\|^2 \|A u_n\|^2 - \|A u_n\|^4 = \|A u_n\|^2 \big( \|A\|^2 - \|A u_n\|^2 \big).
\end{aligned}$$
Since $\|A u_n\|$ converges to $\|A\|$, we obtain
$$\big\| A^2 u_n - \|A u_n\|^2 u_n \big\| \to 0 \quad \text{as } n \to \infty. \tag{4.9.11}$$
The operator $A^2$, being the product of two compact operators, is also compact. Hence there exists a subsequence $\{u_{p_n}\}$ of $\{u_n\}$ such that $\{A^2 u_{p_n}\}$ converges. Since $\|A\| \neq 0$, the limit can be written in the form $\|A\|^2 v$, $v \neq 0$. For every $n \in \mathbb{N}$ we have
$$\big\| \|A\|^2 v - \|A\|^2 u_{p_n} \big\| \le \big\| \|A\|^2 v - A^2 u_{p_n} \big\| + \big\| A^2 u_{p_n} - \|A u_{p_n}\|^2 u_{p_n} \big\| + \big\| \|A u_{p_n}\|^2 u_{p_n} - \|A\|^2 u_{p_n} \big\|.$$
Thus, by (4.9.10) and (4.9.11), we have
$$\|A\|^2 \|v - u_{p_n}\| \to 0 \quad \text{as } n \to \infty,$$
or $\|v - u_{p_n}\| \to 0$ as $n \to \infty$. This means that the sequence $\{u_{p_n}\}$ converges to $v$, and therefore
$$A^2 v = \|A\|^2 v.$$
The above equation can be written as
$$(A - \|A\| \mathbb{I})(A + \|A\| \mathbb{I}) v = 0.$$
If $w = (A + \|A\| \mathbb{I}) v \neq 0$, then $(A - \|A\| \mathbb{I}) w = 0$, and thus $\|A\|$ is an eigenvalue of $A$. On the other hand, if $w = 0$, then $-\|A\|$ is an eigenvalue of $A$. This completes the proof.

Corollary 4.9.2. If $A$ is a non-zero compact self-adjoint operator on a Hilbert space $H$, then there is a vector $w$ such that $\|w\| = 1$ and
$$|(Aw, w)| = \sup_{\|x\| \le 1} |(Ax, x)|.$$

Proof. Let $w$, $\|w\| = 1$, be an eigenvector corresponding to an eigenvalue $\lambda$ such that $|\lambda| = \|A\|$. Then
$$|(Aw, w)| = |(\lambda w, w)| = |\lambda|\, \|w\|^2 = |\lambda| = \|A\| = \sup_{\|x\| \le 1} |(Ax, x)|,$$
by Theorem 4.4.5.

Remarks. Theorem 4.9.8 guarantees the existence of at least one non-zero eigenvalue, but no more in general. The corollary gives a useful method for finding that eigenvalue by maximizing a certain quadratic expression. The following result is another example of a theorem describing spectral properties of an operator. We will not prove this result; the interested reader can find a proof in [E. Kreyszig, Introductory Functional Analysis with Applications, Wiley, 1978, Theorem 9.2-3].

Theorem 4.9.9. Let $A$ be a bounded self-adjoint operator on a Hilbert space $H$. Define
$$m = \inf_{\|x\| = 1} (Ax, x) \quad \text{and} \quad M = \sup_{\|x\| = 1} (Ax, x).$$
The spectrum of $A$ lies in the closed interval $[m, M]$. Moreover, $m$ and $M$ belong to the spectrum.

Theorem 4.9.10. The set of distinct non-zero eigenvalues $\{\lambda_n\}$ of a self-adjoint compact operator $A$ is either finite or satisfies $\lim_{n \to \infty} \lambda_n = 0$.
Proof. Suppose $A$ has infinitely many distinct eigenvalues $\lambda_n$, $n \in \mathbb{N}$. Let $u_n$, for $n \in \mathbb{N}$, be an eigenvector corresponding to $\lambda_n$ such that $\|u_n\| = 1$. By Theorem 4.9.6, $\{u_n\}$ is an orthonormal system. Moreover, by Theorem 4.8.7, we have
$$0 = \lim_{n \to \infty} \|A u_n\|^2 = \lim_{n \to \infty} (A u_n, A u_n) = \lim_{n \to \infty} (\lambda_n u_n, \lambda_n u_n) = \lim_{n \to \infty} \lambda_n^2 \|u_n\|^2 = \lim_{n \to \infty} \lambda_n^2.$$
This proves the theorem.

Example 4.9.4. We will find the eigenvalues and eigenfunctions of the operator $A$ on $L^2([0, 2\pi])$ defined by
$$(Au)(x) = \int_0^{2\pi} k(x - t)\, u(t)\, dt,$$
where $k$ is a periodic function with period $2\pi$, square integrable on $[0, 2\pi]$. As a trial solution we take
$$u_n(x) = e^{inx}$$
and note that
$$(A u_n)(x) = \int_0^{2\pi} k(x - t)\, e^{int}\, dt = e^{inx} \int_{x - 2\pi}^{x} k(s)\, e^{-ins}\, ds.$$
Thus
$$A u_n = \lambda_n u_n, \quad n \in \mathbb{Z},$$
where, by the periodicity of the integrand,
$$\lambda_n = \int_0^{2\pi} k(s)\, e^{-ins}\, ds.$$
The set of functions $\{u_n\}$, $n \in \mathbb{Z}$, is a complete orthogonal system in $L^2([0, 2\pi])$. Note that $A$ is self-adjoint if $k(x) = \overline{k(-x)}$ for all $x$, but the collection of eigenfunctions is complete even if $A$ is not self-adjoint.

Theorem 4.9.11. Let $\{P_n\}$ be an orthogonal sequence of projection operators on a Hilbert space $H$, and let $\{\lambda_n\}$ be a sequence of numbers such that $\lambda_n \to 0$ as $n \to \infty$. Then

(a) $\sum_{n=1}^{\infty} \lambda_n P_n$ converges;

(b) for each $n \in \mathbb{N}$, $\lambda_n$ is an eigenvalue of the operator $A = \sum_{n=1}^{\infty} \lambda_n P_n$, and the only other possible eigenvalue of $A$ is $0$;
(c) if all $\lambda_n$'s are real, then $A$ is self-adjoint;

(d) if all projections $P_n$ are finite dimensional, then $A$ is compact.

Proof. (a) Since the vectors $\lambda_n P_n x$ are orthogonal, for all $k \le m$ we have
$$\Big\| \sum_{n=k}^{m} \lambda_n P_n x \Big\|^2 = \sum_{n=k}^{m} \|\lambda_n P_n x\|^2 = \sum_{n=k}^{m} |\lambda_n|^2 \|P_n x\|^2.$$
Now, since $\lambda_n \to 0$ as $n \to \infty$, for every $\varepsilon > 0$ we have
$$\Big\| \sum_{n=k}^{m} \lambda_n P_n x \Big\|^2 \le \varepsilon^2 \sum_{n=k}^{m} \|P_n x\|^2 = \varepsilon^2 \Big\| \sum_{n=k}^{m} P_n x \Big\|^2 \le \varepsilon^2 \Big\| \sum_{n=k}^{m} P_n \Big\|^2 \|x\|^2 \tag{4.9.12}$$
for all sufficiently large $k$ and $m$. The sum $\sum_{n=k}^{m} P_n$, being a finite sum of orthogonal projection operators, is a projection operator and its operator norm is $1$. Thus (4.9.12) yields
$$\Big\| \sum_{n=k}^{m} \lambda_n P_n x \Big\| \le \varepsilon \|x\|$$
whenever $k$ and $m$ are sufficiently large. Thus the sequence of partial sums $\sum_{n=1}^{m} \lambda_n P_n$ is a Cauchy sequence, and by Theorem 1.6.5, the series converges to a bounded operator on $H$.

(b) Denote the range of $P_n$ by $\mathscr{R}(P_n)$ and let $n_0 \in \mathbb{N}$. If $u \in \mathscr{R}(P_{n_0})$, then $P_{n_0} u = u$ and $P_n u = 0$ for all $n \neq n_0$, because the $P_n$ are orthogonal. Thus $Au = \lambda_{n_0} u$, which shows that $\lambda_{n_0}$ is an eigenvalue of $A$. To prove that there are no other non-zero eigenvalues, suppose $u$ is an eigenvector corresponding to an eigenvalue $\lambda$. Set $v_n = P_n u$, $n = 1, 2, \ldots$, and let $w = Qu$, where $Q$ is the projection onto the orthogonal complement of $\mathscr{R}(A)$. Then
$$u = \sum_{n=1}^{\infty} v_n + w, \tag{4.9.13}$$
with $w \perp \mathscr{R}(P_n)$ for all $n \in \mathbb{N}$. Clearly,
$$Au = \sum_{n=1}^{\infty} \lambda_n v_n,$$
since $P_n w = 0$ and $A$ is continuous. Consequently, the eigenvalue equation $Au = \lambda u$ has the form
$$\sum_{n=1}^{\infty} \lambda_n v_n = \lambda \sum_{n=1}^{\infty} v_n + \lambda w,$$
or
$$\sum_{n=1}^{\infty} (\lambda - \lambda_n) v_n + \lambda w = 0. \tag{4.9.14}$$
Since all vectors in (4.9.14) are orthogonal, the sum vanishes only if every term vanishes. Hence $\lambda w = 0$, and for every $n \in \mathbb{N}$ either $\lambda = \lambda_n$ or $v_n = 0$. Finally, if $u$ in (4.9.13) is a non-zero eigenvector, then either $w \neq 0$ or $v_k \neq 0$ for some $k \in \mathbb{N}$. Therefore $\lambda = 0$ or $\lambda = \lambda_k$ for some $k \in \mathbb{N}$, by (4.9.14). This proves the assertion.

(c) Suppose all $\lambda_n$'s are real. Since orthogonal projections are self-adjoint operators, for any $x, y \in H$ we have
$$(Ax, y) = \sum_{n=1}^{\infty} (\lambda_n P_n x, y) = \sum_{n=1}^{\infty} \lambda_n (P_n x, y) = \sum_{n=1}^{\infty} \lambda_n (x, P_n y) = \sum_{n=1}^{\infty} (x, \lambda_n P_n y) = (x, Ay).$$

(d) $A$ is the limit of a uniformly convergent sequence of finite dimensional, hence compact, operators; therefore it is compact.

Definition 4.9.4 (Approximate Eigenvalue). Let $T$ be an operator on a Hilbert space $H$. A scalar $\lambda$ is called an approximate eigenvalue of $T$ if there exists a sequence of vectors $\{x_n\}$ such that $\|x_n\| = 1$ for all $n \in \mathbb{N}$ and $\|T x_n - \lambda x_n\| \to 0$ as $n \to \infty$.

Obviously, every eigenvalue is an approximate eigenvalue.

Example 4.9.5. Let $\{e_n\}$ be a complete orthonormal sequence in a Hilbert space $H$. Let $\{\lambda_n\}$ be a strictly decreasing sequence of scalars convergent to some $\lambda$. Define an operator on $H$ by
$$Tx = \sum_{n=1}^{\infty} \lambda_n (x, e_n) e_n.$$
It is easy to see that every $\lambda_n$ is an eigenvalue of $T$, but $\lambda$ is not. On the other hand,
$$\|T e_n - \lambda e_n\| = |\lambda_n - \lambda| \to 0$$
as $n \to \infty$. Thus $\lambda$ is an approximate eigenvalue of $T$. Note that the same is true if we just assume that $\lambda_n \to \lambda$ and $\lambda_n \neq \lambda$ for all $n \in \mathbb{N}$. For further properties of approximate eigenvalues see 4.13. Exercises at the end of this chapter.
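The situation in Example 4.9.5 can be imitated with a finite truncation of $T$. The sketch below is an illustration with the particular choices $\lambda = 2$ and $\lambda_n = \lambda + 1/n$ (assumptions made for this example): each $\lambda_n$ is an exact eigenvalue of the truncated operator, while the "late" basis vectors nearly satisfy the eigenvalue equation for $\lambda$ itself.

```python
import numpy as np

lam = 2.0
N = 500
lam_n = lam + 1.0 / np.arange(1, N + 1)   # strictly decreasing, -> lam

# Truncation of T in the orthonormal basis {e_n}: a diagonal matrix
T = np.diag(lam_n)

# ||T e_n - lam e_n|| = |lam_n - lam| = 1/n, so lam is approximated
e_N = np.zeros(N)
e_N[-1] = 1.0                              # the basis vector e_N
residual = np.linalg.norm(T @ e_N - lam * e_N)
print(residual)   # 1/N = 0.002
```

In the truncation $\lambda$ is still not an eigenvalue (every diagonal entry differs from $\lambda$), yet the residual can be made as small as desired by taking $N$ large, which is exactly the content of the definition.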
4.10. Spectral Decomposition

Let $H$ be a finite-dimensional Hilbert space, say $H = \mathbb{C}^N$. It is known from linear algebra that the eigenvectors of a self-adjoint operator on $H$ form an orthogonal basis of $H$. The following theorems generalize this result to infinite dimensional spaces.

Theorem 4.10.1 (Hilbert-Schmidt Theorem). For every self-adjoint compact operator $A$ on an infinite dimensional Hilbert space $H$ there exists an orthonormal system of eigenvectors $\{u_n\}$ corresponding to non-zero eigenvalues $\{\lambda_n\}$ such that every element $x \in H$ has a unique representation in the form
$$x = \sum_{n=1}^{\infty} \alpha_n u_n + v, \tag{4.10.1}$$
where $\alpha_n \in \mathbb{C}$ and $v$ satisfies the equation $Av = 0$.
Proof. By Theorem 4.9.8 and Corollary 4.9.2 there exists an eigenvalue $\lambda_1$ of $A$ such that
$$|\lambda_1| = \sup_{\|x\| \le 1} |(Ax, x)|.$$
Let $u_1$ be a normalized eigenvector corresponding to $\lambda_1$. We set
$$Q_1 = \{x \in H : x \perp u_1\},$$
i.e., $Q_1$ is the orthogonal complement of the set $\{u_1\}$. Thus $Q_1$ is a closed linear subspace of $H$. If $x \in Q_1$, then
$$(Ax, u_1) = (x, A u_1) = \lambda_1 (x, u_1) = 0,$$
which means that $x \in Q_1$ implies $Ax \in Q_1$. Therefore $A$ maps the Hilbert space $Q_1$ into itself. We can again apply Theorem 4.9.8 and Corollary 4.9.2 with $Q_1$ in place of $H$. This gives an eigenvalue $\lambda_2$ such that
$$|\lambda_2| = \sup \{ |(Ax, x)| : \|x\| \le 1,\ x \in Q_1 \}.$$
Let $u_2$ be a normalized eigenvector corresponding to $\lambda_2$. Clearly $u_1 \perp u_2$. Next we set
$$Q_2 = \{x \in Q_1 : x \perp u_2\}$$
and repeat the above argument. Having eigenvalues $\lambda_1, \ldots, \lambda_n$ and the corresponding normalized eigenvectors $u_1, \ldots, u_n$, we define
$$Q_n = \{x \in Q_{n-1} : x \perp u_n\}$$
and choose an eigenvalue $\lambda_{n+1}$ such that
$$|\lambda_{n+1}| = \sup \{ |(Ax, x)| : \|x\| \le 1,\ x \in Q_n \}. \tag{4.10.2}$$
For $u_{n+1}$ we choose a normalized eigenvector corresponding to $\lambda_{n+1}$.

This procedure can terminate after a finite number of steps. Namely, it can happen that there is a positive integer $k$ such that $(Ax, x) = 0$ for all $x \in Q_k$. Then every element $x$ of $H$ has a unique representation
$$x = \sum_{n=1}^{k} \alpha_n u_n + v, \quad v \in Q_k,$$
where $Av = 0$, and the theorem is proved in this case.

Now suppose that the described procedure yields an infinite sequence of eigenvalues $\{\lambda_n\}$ and eigenvectors $\{u_n\}$. Then $\{u_n\}$, as an orthonormal sequence, converges weakly to $0$. Consequently, by Theorem 4.8.8, the sequence $\{A u_n\}$ converges strongly to $0$. Hence
$$|\lambda_n| = \|\lambda_n u_n\| = \|A u_n\| \to 0.$$
Denote by $S$ the closed subspace spanned by the vectors $\{u_n\}$. By the Projection Theorem (Theorem 3.10.4), every $x \in H$ has a unique decomposition $x = u + v$, or
$$x = \sum_{n=1}^{\infty} \alpha_n u_n + v,$$
where $v \in S^\perp$. It remains to prove that $Av = 0$ for all $v \in S^\perp$. Let $v \in S^\perp$, $v \neq 0$. Define $w = v / \|v\|$. Then
$$(Av, v) = \|v\|^2 (Aw, w).$$
Since $w \in S^\perp \subset Q_n$ for every $n \in \mathbb{N}$, by (4.10.2) we have
$$|(Av, v)| = \|v\|^2 |(Aw, w)| \le \|v\|^2 \sup \{ |(Ax, x)| : \|x\| \le 1,\ x \in Q_n \} = \|v\|^2 |\lambda_{n+1}| \to 0.$$
This implies $(Av, v) = 0$ for every $v \in S^\perp$. Therefore, by Theorem 4.4.5, the norm of $A$ restricted to $S^\perp$ is $0$, and thus $Av = 0$ for all $v \in S^\perp$. This completes the proof.
Theorem 4.10.2 (Spectral Theorem for Self-Adjoint Compact Operators). Let $A$ be a self-adjoint compact operator on an infinite dimensional Hilbert space $H$. Then there exists in $H$ a complete orthonormal system (an orthonormal basis) $\{v_n\}$ consisting of eigenvectors of $A$. Moreover, for every $x \in H$,
$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, v_n) v_n, \tag{4.10.3}$$
where $\lambda_n$ is the eigenvalue corresponding to $v_n$.

Proof. Most of this theorem is already contained in Theorem 4.10.1. To obtain a complete orthonormal system $\{v_n\}$ we need to add to the system $\{u_n\}$ defined in the proof of Theorem 4.10.1 an arbitrary orthonormal basis of $S^\perp$. The eigenvalues corresponding to those vectors from $S^\perp$ all equal zero. Equality (4.10.3) follows from the continuity of $A$.
Theorem 4.10.3. For any two commuting self-adjoint compact operators $A$ and $B$ on a Hilbert space $H$, there exists a complete orthonormal system of common eigenvectors.

Proof. Let $\lambda$ be an eigenvalue of $A$ and let $S$ be the corresponding eigenspace. For any $x \in S$ we have
$$A(Bx) = B(Ax) = B(\lambda x) = \lambda Bx.$$
This means that $Bx$ is an eigenvector of $A$ corresponding to $\lambda$, provided $Bx \neq 0$. In any case, $Bx \in S$ and hence $B$ maps $S$ into itself. Since $B$ is a self-adjoint compact operator, by Theorem 4.10.2, $S$ has an orthonormal basis consisting of eigenvectors of $B$, but these vectors are also eigenvectors of $A$, because they belong to $S$. If we repeat the same with every eigenspace of $A$, then the union of all these eigenvectors is an orthonormal basis of $H$. This proves the theorem.

Theorem 4.10.4. Let $A$ be a self-adjoint compact operator on a Hilbert space $H$ with a complete orthonormal system of eigenvectors $\{v_n\}$ corresponding to eigenvalues $\{\lambda_n\}$. Let $P_n$ be the projection operator onto the space spanned by $v_n$. Then, for all $x \in H$,
$$Ax = \sum_{n=1}^{\infty} \lambda_n P_n x \tag{4.10.4}$$
and
$$A = \sum_{n=1}^{\infty} \lambda_n P_n. \tag{4.10.5}$$
Proof. From the Spectral Theorem (Theorem 4.10.2), we have
$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, v_n) v_n. \tag{4.10.6}$$
For every $k \in \mathbb{N}$, the projection operator $P_k$ onto the one dimensional subspace $S_k$ spanned by $v_k$ is given by
$$P_k x = (x, v_k) v_k.$$
Indeed, for every $x \in H$ we have
$$x = (x, v_k) v_k + \sum_{n \neq k} (x, v_n) v_n,$$
where $(x, v_k) v_k \in S_k$ and $\sum_{n \neq k} (x, v_n) v_n \perp S_k$. Thus $(x, v_k) v_k$ is the projection of $x$ onto $S_k$. Now (4.10.6) can be written as
$$Ax = \sum_{n=1}^{\infty} \lambda_n P_n x,$$
which proves (4.10.4). Hence, for all $x \in H$,
$$\left( \sum_{n=1}^{\infty} \lambda_n P_n \right) x = Ax,$$
which proves (4.10.5).

Note that the convergence of $\sum \lambda_n P_n$ is guaranteed by Theorem 4.9.11 and is quite different from the convergence of $\sum \lambda_n P_n x$.

Remarks. 1. Theorem 4.10.4 can be considered as another version of the Spectral Theorem. This version is important in the sense that it can be extended to non-compact operators. It is also useful because it leads to an elegant expression for powers and more general functions of an operator.

2. It follows from Theorem 4.10.4 that a self-adjoint compact operator is an infinite sum of very simple operators. One dimensional projection operators are not only the simplest self-adjoint compact operators, but they are also the fundamental ones, because any self-adjoint compact operator is a (possibly infinite) linear combination of them.
Let $A$, $\lambda_n$, and $P_n$ be as in Theorem 4.10.4. Then
$$A^2 x = \sum_{n=1}^{\infty} \lambda_n^2 P_n x,$$
because $A P_n x = \lambda_n P_n x$ for all $x \in H$. Similarly, for any $k \in \mathbb{N}$, we get
$$A^k = \sum_{n=1}^{\infty} \lambda_n^k P_n. \tag{4.10.7}$$
More generally, for any polynomial $p(t) = a_n t^n + \cdots + a_1 t$, we have
$$p(A) = \sum_{n=1}^{\infty} p(\lambda_n) P_n.$$
The constant term in $p$ must be zero, because otherwise the sequence $\{p(\lambda_n)\}$ would not converge to zero. In order to deal with polynomials with a non-zero constant term $a_0$, we have to add $a_0 \mathbb{I}$ to the series. Note that in such a case $p(A)$ is not a compact operator.

The above result can be generalized in the following way.

Definition 4.10.1 (Function of an Operator). Let $f$ be a real valued function on $\mathbb{R}$ such that $f(\lambda) \to 0$ as $\lambda \to 0$. For a self-adjoint compact operator $A = \sum_{n=1}^{\infty} \lambda_n P_n$ we define
$$f(A) = \sum_{n=1}^{\infty} f(\lambda_n) P_n. \tag{4.10.8}$$
Theorem 4.9.11 ensures the convergence of the series in (4.10.8), and that $f(A)$ is self-adjoint and compact.
Example 4.10.1. Let $A = \sum_{n=1}^{\infty} \lambda_n P_n$ be a self-adjoint compact operator such that all $\lambda_n$'s are non-negative. For any $\alpha > 0$ we can define $A^\alpha$ by
$$A^\alpha x = \sum_{n=1}^{\infty} \lambda_n^\alpha P_n x.$$
Note that in the case $\alpha = \frac{1}{2}$ the above definition agrees with Definition 4.6.2. Indeed, by (4.10.7), we have
$$\big( \sqrt{A} \big)^2 = \sum_{n=1}^{\infty} \big( \sqrt{\lambda_n} \big)^2 P_n = \sum_{n=1}^{\infty} \lambda_n P_n = A,$$
because all $\lambda_n$'s are non-negative.
Example 4.10.2. Let $A = \sum_{n=1}^{\infty} \lambda_n P_n$ be a self-adjoint compact operator. We can define the sine of $A$ by
$$\sin A = \sum_{n=1}^{\infty} (\sin \lambda_n) P_n.$$

The condition $f(\lambda) \to 0$ as $\lambda \to 0$ in Definition 4.10.1 can be replaced by boundedness of $f$ in a neighborhood of the origin. Indeed, if $A = \sum_{n=1}^{\infty} \lambda_n P_n$ and $P_n x = (x, v_n) v_n$, then for any $x \in H$ we have
$$f(A) x = \sum_{n=1}^{\infty} f(\lambda_n) (x, v_n) v_n,$$
where convergence of the series is justified by Theorem 3.8.3, because
$$|f(\lambda_n)(x, v_n)|^2 \le M |(x, v_n)|^2$$
for some constant $M$, and hence $\{f(\lambda_n)(x, v_n)\} \in l^2$. Clearly, in this case we cannot expect $f(A)$ to be a compact operator.
Theorem 4.10.5. If the eigenvectors $\{u_n\}$ of a self-adjoint operator $T$ on a Hilbert space $H$ form a complete orthonormal system in $H$ and all eigenvalues are positive (or non-negative), then $T$ is strictly positive (or positive).

Proof. Suppose $\{u_n\}$ is a complete orthonormal system of eigenvectors of $T$ corresponding to eigenvalues $\{\lambda_n\}$. For any non-zero vector $u = \sum_{n=1}^{\infty} a_n u_n \in H$ we have
$$(Tu, u) = \left( \sum_{n=1}^{\infty} \lambda_n a_n u_n,\ \sum_{m=1}^{\infty} a_m u_m \right) = \sum_{n=1}^{\infty} \lambda_n a_n \overline{a_n} = \sum_{n=1}^{\infty} \lambda_n |a_n|^2 \ge 0$$
if all eigenvalues are non-negative. If all $\lambda_n$'s are positive, then the last inequality becomes strict, since $a_n \neq 0$ for at least one $n$. This completes the proof.
4.11. The Fourier Transform

In this section we introduce the Fourier transform in $L^2(\mathbb{R})$ and discuss its basic properties. The definition of the transform in $L^2(\mathbb{R})$ is not trivial. The integral
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx$$
cannot be used as a definition of the Fourier transform in $L^2(\mathbb{R})$ because not all functions in $L^2(\mathbb{R})$ are integrable. It is, however, possible to extend the Fourier transform from $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. In the first part of this section we discuss properties of the Fourier transform in $L^1(\mathbb{R})$. Then we show that the extension onto $L^2(\mathbb{R})$ is possible and study properties of that extension.

Let $f$ be an integrable function on $\mathbb{R}$. Consider the integral
$$\int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx, \quad k \in \mathbb{R}. \tag{4.11.1}$$
Since the function $g(x) = e^{-ikx}$ is continuous and bounded, the product $e^{-ikx} f(x)$ is a locally integrable function for any $k \in \mathbb{R}$ (see Theorem 2.9.2). Moreover, since $|e^{-ikx}| \le 1$ for all $k, x \in \mathbb{R}$, we have
$$|e^{-ikx} f(x)| \le |f(x)|,$$
and thus, by Theorem 2.9.3, the integral (4.11.1) exists for all $k \in \mathbb{R}$.

Definition 4.11.1 (Fourier Transform in $L^1(\mathbb{R})$). Let $f \in L^1(\mathbb{R})$. The function $\hat{f}$ defined by
$$\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx \tag{4.11.2}$$
is called the Fourier transform of $f$.

In some books the Fourier transform is defined without the factor $1/\sqrt{2\pi}$ in the integral. Another variation is the definition without the "$-$" sign in the exponent. These details do not change the theory of Fourier transforms at all. Instead of "$\hat{f}$" the notation "$\mathscr{F}\{f(x)\}$" can also be used. The latter is especially convenient if instead of a letter "$f$" or "$g$" we want to use an expression describing a function, for example $\mathscr{F}\{e^{-x^2}\}$. We will use both symbols freely.
Example 4.11.1. (a) Let $a > 0$. Then […] The proof follows easily from Definition 4.11.1.
Theorem 4.11.6. If $f$ is a continuous piecewise differentiable function, $f, f' \in L^1(\mathbb{R})$, and $\lim_{|x| \to \infty} f(x) = 0$, then $\mathscr{F}\{f'\} = ik\, \mathscr{F}\{f\}$.

Proof. Simple integration by parts gives
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f'(x)\, e^{-ikx}\, dx = \frac{1}{\sqrt{2\pi}} \Big[ f(x)\, e^{-ikx} \Big]_{-\infty}^{\infty} + \frac{ik}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx = ik\, \hat{f}(k).$$

Corollary 4.11.1. If $f$ is a continuous piecewise $n$-times differentiable function, $f, f', \ldots, f^{(n)} \in L^1(\mathbb{R})$, and $\lim_{|x| \to \infty} f^{(k)}(x) = 0$ for $k = 0, \ldots, n-1$, then $\mathscr{F}\{f^{(n)}\} = (ik)^n\, \mathscr{F}\{f\}$.

Because of our definition of the Fourier transform it is convenient to redefine the convolution of two functions $f, g \in L^1(\mathbb{R})$ as follows:
$$(f * g)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x - u)\, g(u)\, du.$$
The main reason is the simplicity of the formula in the next theorem.

Theorem 4.11.7 (Convolution Theorem). Let $f, g \in L^1(\mathbb{R})$. Then $\mathscr{F}\{f * g\} = \mathscr{F}\{f\}\, \mathscr{F}\{g\}$.
Proof. Let $f, g \in L^1(\mathbb{R})$ and $h = f * g$. Then $h \in L^1(\mathbb{R})$, by Theorem 2.15.1, and we have
$$\begin{aligned}
\hat{h}(k) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} h(x)\, e^{-ikx}\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx} \int_{-\infty}^{\infty} f(x - u)\, g(u)\, du\, dx \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} g(u) \int_{-\infty}^{\infty} e^{-ikx} f(x - u)\, dx\, du \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} g(u) \int_{-\infty}^{\infty} e^{-ik(x + u)} f(x)\, dx\, du \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(u)\, e^{-iku}\, du \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx = \hat{g}(k)\, \hat{f}(k).
\end{aligned}$$
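A discrete analogue of the convolution theorem holds for the DFT, with circular convolution in place of the integral, and gives a quick numerical sanity check (a sketch with random data; the discrete transform carries no $1/\sqrt{2\pi}$ factor, so no extra constant appears):

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

# Circular convolution: h[x] = sum_u f[x - u] g[u], indices taken mod 64
h = np.array([sum(f[(x - u) % 64] * g[u] for u in range(64))
              for x in range(64)])

# The DFT turns circular convolution into pointwise multiplication
lhs = np.fft.fft(h)
rhs = np.fft.fft(f) * np.fft.fft(g)
print(np.allclose(lhs, rhs))   # True
```

This identity is also the standard way fast convolution is implemented in practice: transform, multiply pointwise, transform back.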
We will now discuss the extension of the Fourier transform onto $L^2(\mathbb{R})$. In the following theorem, and in the remaining part of this section, $\|\cdot\|_2$ denotes the norm in $L^2(\mathbb{R})$, i.e.,
$$\|f\|_2 = \sqrt{ \int_{-\infty}^{\infty} |f(x)|^2\, dx }.$$

Theorem 4.11.8. Let $f$ be a continuous function on $\mathbb{R}$ vanishing outside a bounded interval. Then $\hat{f} \in L^2(\mathbb{R})$ and
$$\|\hat{f}\|_2 = \|f\|_2.$$

Proof. Suppose first that $f$ vanishes outside the interval $[-\pi, \pi]$. Using Parseval's formula for the orthonormal sequence of functions on $[-\pi, \pi]$
$$\phi_n(x) = \frac{1}{\sqrt{2\pi}}\, e^{-inx}, \quad n = 0, \pm 1, \pm 2, \ldots,$$
we get
$$\|f\|_2^2 = \sum_{n=-\infty}^{\infty} |(f, \phi_n)|^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2.$$
Since the above equality holds also for $g(x) = e^{-i\xi x} f(x)$ in place of $f(x)$, and $\|f\|_2 = \|g\|_2$, we obtain
$$\|f\|_2^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n + \xi)|^2.$$
Integration of both sides with respect to $\xi$ from $0$ to $1$ yields
$$\|f\|_2^2 = \int_0^1 \sum_{n=-\infty}^{\infty} |\hat{f}(n + \xi)|^2\, d\xi = \int_{-\infty}^{\infty} |\hat{f}(k)|^2\, dk = \|\hat{f}\|_2^2.$$
If $f$ does not vanish outside $[-\pi, \pi]$, then we take a positive number $\lambda$ for which the function $g(x) = f(\lambda x)$ vanishes outside $[-\pi, \pi]$. Then
$$\hat{g}(k) = \frac{1}{\lambda}\, \hat{f}\!\left( \frac{k}{\lambda} \right),$$
and thus
$$\|f\|_2^2 = \lambda \|g\|_2^2 = \lambda \|\hat{g}\|_2^2 = \|\hat{f}\|_2^2.$$
The proof is complete.

The space of all continuous functions on $\mathbb{R}$ with compact support is dense in $L^2(\mathbb{R})$. Theorem 4.11.8 shows that the Fourier transform is a continuous mapping from that space into $L^2(\mathbb{R})$. Since the mapping is linear, it has a unique extension to a linear mapping from $L^2(\mathbb{R})$ into itself. This extension will be called the Fourier transform on $L^2(\mathbb{R})$.

Definition 4.11.2 (Fourier Transform in $L^2(\mathbb{R})$). Let $f \in L^2(\mathbb{R})$ and let $\{\phi_n\}$ be a sequence of continuous functions with compact support convergent to $f$ in $L^2(\mathbb{R})$, i.e., $\|f - \phi_n\|_2 \to 0$. The Fourier transform of $f$ is defined by
$$\hat{f} = \lim_{n \to \infty} \hat{\phi}_n, \tag{4.11.4}$$
where the limit is with respect to the norm in $L^2(\mathbb{R})$.

Theorem 4.11.8 guarantees that the limit exists and is independent of the particular sequence approximating $f$. It is important to remember that convergence in $L^2(\mathbb{R})$ does not imply pointwise convergence, and therefore the Fourier transform of a square integrable function is not defined at a point, unlike the Fourier transform of an integrable function. We can say that the Fourier transform of a square integrable function is defined almost everywhere. For this reason we cannot say that, if $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then the Fourier transform defined by (4.11.2) and the one defined by (4.11.4) are equal. To be precise, we should say that the function defined by (4.11.2) belongs to the equivalence class of square integrable functions defined by (4.11.4). In spite of this difference, we will use the same symbol to denote both transforms. It will not cause any misunderstanding.

The following theorem is an immediate consequence of Definition 4.11.2 and Theorem 4.11.8.

Theorem 4.11.9 (Parseval's Relation). If $f \in L^2(\mathbb{R})$, then
$$\|f\|_2 = \|\hat{f}\|_2.$$

Remark. In physical problems, the quantity $\|f\|_2^2$ is a measure of energy, and $|\hat{f}|^2$ represents the power spectrum of $f$.
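Parseval's relation has a direct discrete counterpart: for the unitary (`norm="ortho"`) DFT, the $\ell^2$ norm of a vector equals that of its transform. A quick numerical sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.standard_normal(1024)

# Unitary normalization (1/sqrt(N) in both directions), analogous to the
# symmetric 1/sqrt(2*pi) convention used in the text
f_hat = np.fft.fft(f, norm="ortho")

print(np.linalg.norm(f), np.linalg.norm(f_hat))  # equal norms
```

With the default (non-unitary) normalization of `np.fft.fft`, the two norms would instead differ by a factor of $\sqrt{N}$, which mirrors the remark above about conventions without the $1/\sqrt{2\pi}$ factor.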
Theorem 4.11.10. Let $f \in L^2(\mathbb{R})$. Then
$$\hat{f}(k) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{-ikx} f(x)\, dx, \tag{4.11.5}$$
where the convergence is with respect to the norm in $L^2(\mathbb{R})$.

Proof. For $n = 1, 2, 3, \ldots$, define
$$f_n(x) = \begin{cases} f(x) & \text{if } |x| < n, \\ 0 & \text{if } |x| \ge n. \end{cases}$$
Then $\|f - f_n\|_2 \to 0$, and thus $\|\hat{f} - \hat{f}_n\|_2 \to 0$ as $n \to \infty$.
Theorem 4.11.11. If $f, g \in L^2(\mathbb{R})$, then
$$\int_{-\infty}^{\infty} \hat{f}(x)\, g(x)\, dx = \int_{-\infty}^{\infty} f(x)\, \hat{g}(x)\, dx. \tag{4.11.6}$$

Proof. For $n = 1, 2, 3, \ldots$, define
$$f_n(x) = \begin{cases} f(x) & \text{if } |x| < n, \\ 0 & \text{if } |x| \ge n, \end{cases}
\qquad
g_n(x) = \begin{cases} g(x) & \text{if } |x| < n, \\ 0 & \text{if } |x| \ge n. \end{cases}$$
The function $e^{-ix\xi} g_n(x) f_m(\xi)$ is integrable over $\mathbb{R}^2$, and thus the Fubini theorem can be applied. Consequently
$$\int_{-\infty}^{\infty} \hat{f}_m(x)\, g_n(x)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-ix\xi} f_m(\xi)\, d\xi\, g_n(x)\, dx = \int_{-\infty}^{\infty} f_m(\xi)\, \hat{g}_n(\xi)\, d\xi.$$
Since $\|g - g_n\|_2 \to 0$ and $\|\hat{g} - \hat{g}_n\|_2 \to 0$, by letting $n \to \infty$ we obtain
$$\int_{-\infty}^{\infty} \hat{f}_m(x)\, g(x)\, dx = \int_{-\infty}^{\infty} f_m(x)\, \hat{g}(x)\, dx,$$
by the continuity of the inner product. For the same reason, by letting $m \to \infty$, we get
$$\int_{-\infty}^{\infty} \hat{f}(x)\, g(x)\, dx = \int_{-\infty}^{\infty} f(x)\, \hat{g}(x)\, dx,$$
completing the proof.

The following technical lemma will be useful in the proof of the important inversion theorem for the Fourier transform in $L^2(\mathbb{R})$.
completing the proof. The following technical lemma will be useful in the proof of the important inversion theorem for the Fourier transform in L 2 (R). Lemma 4.11.1.
Proof.
Let f
E
2
L (R) and let g
= fX
A
Then f =g.
From Theorems 4.11.9 and 4.11.11, and the equality g =
Cf. g)= cJ, g)= cJ,}) =IIi II~= IIIII ~-
X
f we obtain (4.11.7)
Hence also (f,
g)= llfll~.
(4.11.8)
Finally, by Parseval's equality,
llill~= llgll~= 11111~= 11!11~-
(4.11.9)
201
Linear Operators on Hilbert Spaces
Using (4.11. 7 -4.11.9) we get
This
II!- ill~= U- i.f- g)= 11111~- C!. §)- C!. §)+ 11§11~= o. shows that f = l
Theorem 4.11.12 (Inversion of Fourier Transforms in $L^2(\mathbb{R})$). Let $f \in L^2(\mathbb{R})$. Then
$$f(x) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx}\, \hat{f}(k)\, dk,$$
where the convergence is with respect to the norm in $L^2(\mathbb{R})$.

Proof. Let $f \in L^2(\mathbb{R})$. If $g = \overline{\hat{f}}$, then, by Lemma 4.11.1,
$$f(x) = \overline{\hat{g}}(x) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \overline{\int_{-n}^{n} e^{-ikx}\, g(k)\, dk} = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx}\, \overline{g(k)}\, dk = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx}\, \hat{f}(k)\, dk.$$

Corollary 4.11.2. If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then the equality
$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx}\, \hat{f}(k)\, dk \tag{4.11.10}$$
holds almost everywhere in $\mathbb{R}$.

The transform defined by (4.11.10) is called the inverse Fourier transform. One of the main reasons for introducing the factor $1/\sqrt{2\pi}$ in the definition of the Fourier transform is the symmetry of the transform and its inverse:
$$\mathscr{F}\{f(x)\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx, \qquad \mathscr{F}^{-1}\{f(k)\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} f(k)\, dk.$$
Theorem 4.11.13 (General Parseval's Relation). If $f, g \in L^2(\mathbb{R})$, then
$$\int_{-\infty}^{\infty} f(x)\, \overline{g(x)}\, dx = \int_{-\infty}^{\infty} \hat{f}(k)\, \overline{\hat{g}(k)}\, dk.$$

Proof. The polarization identity
$$(f, g) = \tfrac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + i \|f + ig\|^2 - i \|f - ig\|^2 \right)$$
implies that every isometry preserves the inner product. Since the Fourier transform is an isometry on $L^2(\mathbb{R})$, we have $(f, g) = (\hat{f}, \hat{g})$.

The following theorem summarizes the results of this section. It is known as the Plancherel Theorem.
Theorem 4.11.14 (Plancherel Theorem). For every $f \in L^2(\mathbb{R})$ there exists $\hat{f} \in L^2(\mathbb{R})$ such that:

(a) If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $\hat{f}(k) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx$.

(b) $\Big\| \hat{f}(k) - \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-n}^{n} e^{-ikx} f(x)\, dx \Big\|_2 \to 0$ and $\Big\| f(x) - \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-n}^{n} e^{ikx} \hat{f}(k)\, dk \Big\|_2 \to 0$ as $n \to \infty$.

(c) $\|f\|_2 = \|\hat{f}\|_2$.

(d) The mapping $f \mapsto \hat{f}$ is a Hilbert space isomorphism of $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$.

Proof. The only part of this theorem which remains to be proved is the fact that the Fourier transform is "onto". Let $f \in L^2(\mathbb{R})$ and define
$$h = \overline{f} \quad \text{and} \quad g = \overline{\hat{h}}.$$
Then, by Lemma 4.11.1 applied to $h$,
$$\overline{f} = h = \overline{\hat{g}},$$
and hence $f = \hat{g}$. This shows that every square integrable function is the Fourier transform of a square integrable function.

Theorem 4.11.15. The Fourier transform is a unitary operator on $L^2(\mathbb{R})$.

Proof. First note that
$$\overline{\mathscr{F}\{\overline{g}\}(k)} = \overline{ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx}\, \overline{g(x)}\, dx } = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx}\, g(x)\, dx = \mathscr{F}^{-1}\{g\}(k).$$
Now, using Theorem 4.11.11, we obtain
$$(\mathscr{F}\{f\}, g) = \int_{-\infty}^{\infty} \mathscr{F}\{f\}(x)\, \overline{g(x)}\, dx = \int_{-\infty}^{\infty} f(x)\, \mathscr{F}\{\overline{g}\}(x)\, dx = \int_{-\infty}^{\infty} f(x)\, \overline{\mathscr{F}^{-1}\{g\}(x)}\, dx = (f, \mathscr{F}^{-1}\{g\}).$$
This shows that $\mathscr{F}^{-1} = \mathscr{F}^*$, and thus $\mathscr{F}$ is unitary.
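The discrete counterpart of Theorem 4.11.15 is that the $N \times N$ DFT matrix with entries $e^{-2\pi i m n / N} / \sqrt{N}$ is unitary, so its conjugate transpose is its inverse. A short numerical check:

```python
import numpy as np

N = 8
m, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F = np.exp(-2j * np.pi * m * n / N) / np.sqrt(N)   # unitary DFT matrix

# F* F = I: the adjoint equals the inverse
print(np.allclose(F.conj().T @ F, np.eye(N)))      # True
```

Applying `F` to a vector agrees with `np.fft.fft(x, norm="ortho")`, which is how the unitary convention is exposed in NumPy.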
The Fourier transform can be defined for functions in $L^1(\mathbb{R}^N)$ by
$$\hat{f}(\mathbf{k}) = \frac{1}{(2\pi)^{N/2}} \int_{\mathbb{R}^N} e^{-i \mathbf{k} \cdot \mathbf{x}} f(\mathbf{x})\, d\mathbf{x},$$
where $\mathbf{k} = (k_1, \ldots, k_N)$, $\mathbf{x} = (x_1, \ldots, x_N)$, and $\mathbf{k} \cdot \mathbf{x} = k_1 x_1 + \cdots + k_N x_N$. The theory of the Fourier transform in $L^1(\mathbb{R}^N)$ is similar to the one dimensional case. Moreover, the extension to $L^2(\mathbb{R}^N)$ is possible and it has similar properties, including the Inversion Theorem and the Plancherel Theorem.
4.12. Unbounded Operators

Boundedness of an operator was an essential assumption in almost every theorem proved in this chapter, and the methods used were developed with boundedness or continuity in mind. However, in the most important applications of the theory of Hilbert spaces we often have to deal with operators which are not bounded. In this section we briefly discuss some basic problems, concepts, and methods in the theory of unbounded operators.

An operator $A$ defined in a Hilbert space $H$, i.e., with $\mathscr{D}(A) \subset H$, is called unbounded if it is not bounded. Therefore, to show that an operator $A$ is unbounded it suffices to find a sequence of elements $x_n \in H$ such that $\|x_n\| \le M$ (for some $M$ and all $n \in \mathbb{N}$) and $\|A x_n\| \to \infty$. Since for linear operators boundedness is equivalent to continuity, unboundedness is equivalent to discontinuity (at every point). Consequently, we can show that an operator $A$ is unbounded by finding a sequence $\{x_n\}$ convergent to $0$ such that the sequence $\{A x_n\}$ does not converge to $0$.

One of the most important unbounded operators is the differential operator, see Example 4.2.3. Other important unbounded operators arise in quantum mechanics and will be discussed in Chapter 7. In physical applications it is natural to assume that all eigenvalues are real. For this reason self-adjoint operators are of special interest.

It will be convenient to adopt the following convention: when we say "$A$ is an operator on a Hilbert space $H$" we mean that the domain of $A$ is the whole space $H$, and when we say "$A$ is an operator in a Hilbert space $H$" we mean that the domain of $A$ is a subset of $H$.

If the domain of a bounded operator $A$ is a proper subspace of a Hilbert space $H$, then $A$ can be extended to a bounded operator defined on the entire space $H$. More precisely, there exists a bounded operator $B$ defined on $H$, $\mathscr{D}(B) = H$, such that $Ax = Bx$ for every $x \in \mathscr{D}(A)$. Moreover, we can always find $B$ such that $\|B\| = \|A\|$; see 4.13. Exercises, (2). We may thus
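The criterion above can be made concrete for the differentiation operator $D = d/dt$ on $L^2([0, 2\pi])$. The sketch below is our own illustration (grid resolution and test functions are arbitrary choices): $x_n(t) = \sin nt$ has norm $\sqrt{\pi}$ for every $n$, while $\|D x_n\| = n\sqrt{\pi}$ grows without bound.

```python
import numpy as np

# Demonstrate unboundedness of D = d/dt on L^2([0, 2*pi]):
# the ratio ||D x_n|| / ||x_n|| equals n for x_n(t) = sin(n t).
t = np.linspace(0.0, 2 * np.pi, 200001)
dt = t[1] - t[0]

def l2_norm(y):
    return np.sqrt(np.sum(y**2) * dt)   # Riemann-sum approximation of the L^2 norm

ratios = [l2_norm(n * np.cos(n * t)) / l2_norm(np.sin(n * t))  # D sin(nt) = n cos(nt)
          for n in (1, 10, 100)]
```

The ratios come out close to $1, 10, 100$: no single constant $M$ can bound $\|Dx\|/\|x\|$.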
always assume that the domain of a bounded operator is the whole of $H$. In the case of unbounded operators this is impossible. For instance, the domain of the differential operator cannot be extended to all of $H$. On the other hand, it may still be possible to extend the domain of an unbounded operator in such a way that, although the domain of the extension is not the whole space, it has better properties. Extension of unbounded operators is one of the important problems of the theory.
Definition 4.12.1 (Extension of Operators). Let $A$ and $B$ be operators on a vector space $E$. If
$$\mathscr{D}(A) \subset \mathscr{D}(B) \qquad \text{and} \qquad Ax = Bx \ \text{for every } x \in \mathscr{D}(A),$$
then $B$ is called an extension of $A$, and we write $A \subset B$.

When performing typical operations on unbounded operators, we have to remember about the domains. For instance, the operator $A + B$ is defined for all $x \in \mathscr{D}(A) \cap \mathscr{D}(B)$, i.e., $\mathscr{D}(A + B) = \mathscr{D}(A) \cap \mathscr{D}(B)$. It may happen that $\mathscr{D}(A) \cap \mathscr{D}(B) = \{0\}$, and then the sum $A + B$ does not make sense. Similarly,
$$\mathscr{D}(AB) = \{x \in \mathscr{D}(B) : Bx \in \mathscr{D}(A)\}.$$
The usual properties need not hold. Although we have the equality $(A + B)C = AC + BC$, in general the inclusion $AB + AC \subset A(B + C)$ cannot be replaced by equality.
Definition 4.12.2 (Densely Defined Operator). An operator $A$ defined in a normed space $E$ is called densely defined if its domain is a dense subset of $E$, i.e., $\operatorname{cl} \mathscr{D}(A) = E$.

The differential operator $D = d/dx$ is densely defined in $L^2(\mathbb{R})$, because the space of differentiable square integrable functions is dense in $L^2(\mathbb{R})$.
Definition 4.12.3 (Adjoint of a Densely Defined Operator). Let $A$ be a densely defined operator in a Hilbert space $H$. Denote by $\mathscr{D}(A^*)$ the set of all $y \in H$ for which $(Ax, y)$ is a continuous functional on $\mathscr{D}(A)$. The adjoint $A^*$ of $A$ is the operator defined by
$$(Ax, y) = (x, A^* y) \qquad \text{for all } x \in \mathscr{D}(A) \text{ and } y \in \mathscr{D}(A^*).$$
In the above definition $A$ has to be densely defined in order to ensure the uniqueness of the adjoint $A^*$.
Theorem 4.12.1. Let $A$ and $B$ be densely defined operators in a Hilbert space $H$.

(a) If $A \subset B$, then $B^* \subset A^*$.
(b) If $\mathscr{D}(B^*)$ is dense in $H$, then $B \subset B^{**}$.

Proof. First note that $A \subset B$ implies
$$(Ax, y) = (x, B^* y) \qquad \text{for all } x \in \mathscr{D}(A) \text{ and all } y \in \mathscr{D}(B^*). \tag{4.12.1}$$
On the other hand, we have
$$(Ax, y) = (x, A^* y) \qquad \text{for all } x \in \mathscr{D}(A) \text{ and all } y \in \mathscr{D}(A^*). \tag{4.12.2}$$
Comparing (4.12.1) and (4.12.2), we conclude that $\mathscr{D}(B^*) \subset \mathscr{D}(A^*)$ and $A^*(y) = B^*(y)$ for all $y \in \mathscr{D}(B^*)$. This proves (a).

To prove (b), observe that the condition
$$(Bx, y) = (x, B^* y) \qquad \text{for all } x \in \mathscr{D}(B) \text{ and all } y \in \mathscr{D}(B^*)$$
can be rewritten as
$$(B^* y, x) = (y, Bx) \qquad \text{for all } y \in \mathscr{D}(B^*) \text{ and all } x \in \mathscr{D}(B). \tag{4.12.3}$$
Therefore, since $\mathscr{D}(B^*)$ is dense in $H$, $B^{**}$ exists and we have
$$(B^* y, x) = (y, B^{**} x) \qquad \text{for all } y \in \mathscr{D}(B^*) \text{ and all } x \in \mathscr{D}(B^{**}). \tag{4.12.4}$$
From (4.12.3) and (4.12.4) it follows that $\mathscr{D}(B) \subset \mathscr{D}(B^{**})$ and $B(x) = B^{**}(x)$ for any $x \in \mathscr{D}(B)$. The proof is complete.

Theorem 4.12.2. If $A$ is a one-to-one densely defined operator in a Hilbert space $H$ such that its inverse $A^{-1}$ is densely defined, then $A^*$ is also one-to-one and
$$(A^*)^{-1} = (A^{-1})^*. \tag{4.12.5}$$

Proof. Let $y \in \mathscr{D}(A^*)$. Then for every $x \in \mathscr{D}(A^{-1})$ we have $A^{-1} x \in \mathscr{D}(A)$ and hence
$$(A^{-1} x, A^* y) = (A A^{-1} x, y) = (x, y).$$
This means that $A^* y \in \mathscr{D}((A^{-1})^*)$ and
$$(A^{-1})^* A^* y = y. \tag{4.12.6}$$
Next, take an arbitrary $y \in \mathscr{D}((A^{-1})^*)$. Then, for each $x \in \mathscr{D}(A)$, we have $Ax \in \mathscr{D}(A^{-1})$. Hence
$$(Ax, (A^{-1})^* y) = (A^{-1} A x, y) = (x, y).$$
This shows that $(A^{-1})^* y \in \mathscr{D}(A^*)$ and
$$A^* (A^{-1})^* y = y. \tag{4.12.7}$$
Equality (4.12.5) follows from (4.12.6) and (4.12.7).
Theorem 4.12.3. If $A$, $B$, and $AB$ are densely defined operators in $H$, then $B^* A^* \subset (AB)^*$.

Proof. Suppose $x \in \mathscr{D}(AB)$ and $y \in \mathscr{D}(B^* A^*)$. Since $x \in \mathscr{D}(B)$ and $A^* y \in \mathscr{D}(B^*)$, it follows that
$$(Bx, A^* y) = (x, B^* A^* y).$$
On the other hand, since $Bx \in \mathscr{D}(A)$ and $y \in \mathscr{D}(A^*)$, we have
$$(A(Bx), y) = (Bx, A^* y).$$
Hence
$$((AB)x, y) = (x, B^*(A^* y)).$$
Since this holds for all $x \in \mathscr{D}(AB)$, we have $y \in \mathscr{D}((AB)^*)$ and $(AB)^* y = (B^* A^*) y$.
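For everywhere-defined bounded operators — for instance matrices on $\mathbb{C}^3$, our own finite-dimensional illustration — the inclusion of Theorem 4.12.3 becomes the familiar equality $(AB)^* = B^* A^*$, which is easy to check numerically:

```python
import numpy as np

# On C^3 every operator is bounded and everywhere defined, so the inclusion
# B*A* ⊂ (AB)* of Theorem 4.12.3 is the equality (AB)* = B*A*.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

adjoint = lambda M: M.conj().T         # matrix adjoint = conjugate transpose
lhs = adjoint(A @ B)
rhs = adjoint(B) @ adjoint(A)
```

In the unbounded case, by contrast, the two sides may have genuinely different domains, which is exactly what the inclusion records.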
Self-adjoint operators have already been discussed in Section 4.4. In that section, however, we limited our discussion to bounded operators. Without the boundedness condition the matter is more delicate.
Definition 4.12.4 (Self-Adjoint Operator). Let $A$ be a densely defined operator in a Hilbert space $H$. $A$ is called self-adjoint if $A = A^*$.

Remember that $A = A^*$ means that $\mathscr{D}(A^*) = \mathscr{D}(A)$ and $A(x) = A^*(x)$ for all $x \in \mathscr{D}(A)$. If $A$ is a bounded densely defined operator in $H$, then $A$ has a unique extension to a bounded operator on $H$, and then its domain as well as the domain of its adjoint is the whole space $H$. In the case of unbounded operators the situation is much more complicated. It is possible that a densely defined operator $A$ has an adjoint $A^*$ such that $A(x) = A^*(x)$ whenever $x \in \mathscr{D}(A) \cap \mathscr{D}(A^*)$, but $\mathscr{D}(A^*) \neq \mathscr{D}(A)$.

[...]

and the right-hand side, which contains the factor $1/(n-1)!$, tends to $0$ as $n \to \infty$. This shows $f(x) = 0$ for all $x \in [0, 1]$.

Theorem 5.5.3 (Non-Homogeneous Volterra Equation). The non-homogeneous Volterra equation
$$f(x) = \phi(x) + \lambda \int_a^x K(x, t) f(t)\,dt, \qquad a \le x \le b, \tag{5.5.9}$$
has a unique solution, for any $\lambda$, given by
$$f(x) = \phi(x) + \lambda \int_a^x \Gamma(x, t; \lambda)\,\phi(t)\,dt, \qquad \Gamma(x, t; \lambda) = \sum_{n=1}^{\infty} \lambda^{n-1} K_n(x, t).$$

[...]

The equation
$$\frac{d}{dx}\left[(1-x)^{\alpha+1}(1+x)^{\beta+1} \frac{du}{dx}\right] + \lambda (1-x)^{\alpha}(1+x)^{\beta} u = 0, \qquad -1 < x < 1, \quad \alpha, \beta > -1,$$
is called the Jacobi differential equation; here
$$p(x) = (1-x)^{\alpha+1}(1+x)^{\beta+1}, \qquad q(x) = 0, \qquad w(x) = (1-x)^{\alpha}(1+x)^{\beta}.$$
The functions $P_n^{(\alpha,\beta)}(x)$, called the Jacobi polynomials, are the eigenfunctions of the Jacobi equation. The eigenvalues are $\lambda_n = n(n + \alpha + \beta + 1)$. The Jacobi operator is defined by
$$Lu = -\frac{d}{dx}\left[(1-x)^{\alpha+1}(1+x)^{\beta+1} \frac{du}{dx}\right].$$
If $\alpha = \beta = 0$, the Jacobi equation becomes the Legendre equation. For $\alpha = \beta = -\tfrac{1}{2}$ it reduces to the Chebyshev equation. If $\alpha = \beta$, we obtain the Gegenbauer polynomials $C_n^{\nu}$ with $\nu = \alpha + \tfrac{1}{2}$.
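The eigenvalue relation can be verified in exact coefficient arithmetic for the special case $\alpha = \beta = 0$, where the Jacobi equation reduces to the Legendre equation $\frac{d}{dx}[(1-x^2)P_n'] + n(n+1)P_n = 0$ with eigenvalue $n(n+1) = n(n+\alpha+\beta+1)$. A minimal sketch (our own choice of $n$):

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial.legendre import leg2poly

# Check d/dx[(1 - x^2) P_n'] + n(n + 1) P_n = 0 in polynomial coefficients.
n = 5
c = leg2poly([0] * n + [1])            # power-basis coefficients of P_5
lhs = P.polyder(P.polymul([1.0, 0.0, -1.0], P.polyder(c)))   # d/dx[(1 - x^2) P_n']
residual = P.polyadd(lhs, n * (n + 1) * c)
```

The residual polynomial vanishes identically, confirming that $P_5$ is an eigenfunction with eigenvalue $5 \cdot 6 = 30$.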
Example 5.8.9 (Laguerre Operator and Laguerre Polynomials). The differential equation
$$e^{x} \frac{d}{dx}\left[x e^{-x} \frac{du}{dx}\right] + \lambda u = 0, \qquad 0 < x < \infty,$$
is called the Laguerre differential equation.

[...]

$a_1^2 + a_2^2 > 0$ and $b_1^2 + b_2^2 > 0$.

Definition 5.9.2 (Singular Sturm-Liouville System). Suppose $p$, $q$, and $w$ are functions defined on $[a, b]$ satisfying all the conditions in Definition 5.9.1 except that $p$ is only assumed to be positive in $(a, b)$ and vanishes at one or both end-points of the interval $[a, b]$. By the singular Sturm-Liouville system we mean the system consisting of the differential equation (5.9.1) in the open or semi-open interval with boundary conditions

(a) $u$ is bounded on $(a, b)$;
(b) if $p$ does not vanish at an end-point, $u$ satisfies a boundary condition of the form (5.9.2) at that end-point.

Example 5.9.1.
Consider the regular Sturm-Liouville system
$$u'' + \lambda u = 0, \qquad 0 \le x \le \pi, \qquad u(0) = u(\pi) = 0.$$
Applications
258
Suppose $\lambda < 0$ and let $\nu = \sqrt{|\lambda|}$. Then the general solution of the equation is
$$u(x) = A e^{\nu x} + B e^{-\nu x}.$$
This solution satisfies the given boundary conditions if and only if
$$A + B = 0, \qquad A e^{\nu \pi} + B e^{-\nu \pi} = 0.$$
Since $\nu > 0$, $e^{\nu \pi} \neq e^{-\nu \pi}$, and therefore the only solution is $A = B = 0$. This means that the system has no non-zero solutions if $\lambda < 0$. In other words, there are no negative eigenvalues. A similar argument shows that $\lambda = 0$ is not an eigenvalue. However, when $\lambda > 0$, the solutions of the equation are
$$u(x) = A \cos \sqrt{\lambda}\,x + B \sin \sqrt{\lambda}\,x.$$
The boundary conditions give
$$A = 0 \qquad \text{and} \qquad B \sin \sqrt{\lambda}\,\pi = 0.$$
Since $B = 0$ would yield the trivial solution, we must have $B \neq 0$ and $\sin \sqrt{\lambda}\,\pi = 0$. Hence the eigenvalues are $\lambda_n = n^2$, $n = 1, 2, \ldots$, and the eigenfunctions are
$$u_n(x) = \sin nx.$$
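The eigenvalues $\lambda_n = n^2$ can be reproduced numerically. A minimal sketch (the central-difference discretization and grid size are our own choices): replacing $-u''$ by the standard second-difference matrix on interior grid points gives a matrix whose smallest eigenvalues approach $1, 4, 9, \ldots$.

```python
import numpy as np

# Discretize u'' + lambda*u = 0, u(0) = u(pi) = 0 by central differences on
# N interior points; the eigenvalues of the second-difference matrix for -u''
# approximate lambda_n = n^2.
N = 400
h = np.pi / (N + 1)
T = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
lam = np.sort(np.linalg.eigvalsh(T))
```

The first few computed eigenvalues agree with $1, 4, 9$ to within the $O(h^2)$ discretization error.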
Note that $\lambda_n \to \infty$ as $n \to \infty$, unlike the case of self-adjoint compact operators, whose eigenvalues converge to $0$; see Theorem 4.9.10. Section 5.10, particularly Theorem 5.10.4, will explain this.

Another type of problem that often occurs in practice is the periodic Sturm-Liouville system:
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + (q(x) + \lambda w(x)) u = 0, \qquad a \le x \le b,$$
$$p(a) = p(b), \qquad u(a) = u(b), \qquad u'(a) = u'(b).$$
259
Applications to Integral and Differential Equations
Example 5.9.2. Find the eigenvalues and eigenfunctions of the following periodic Sturm-Liouville system:
$$u'' + \lambda u = 0, \qquad -\pi \le x \le \pi, \qquad u(-\pi) = u(\pi), \qquad u'(-\pi) = u'(\pi).$$
Note that here $p(x) = 1$ and hence $p(-\pi) = p(\pi)$. When $\lambda > 0$ the general solution of the equation is
$$u(x) = A \cos \sqrt{\lambda}\,x + B \sin \sqrt{\lambda}\,x.$$
Using the boundary conditions we get
$$2B \sin \sqrt{\lambda}\,\pi = 0, \qquad 2A \sqrt{\lambda} \sin \sqrt{\lambda}\,\pi = 0.$$
Thus, for non-trivial solutions, we must have
$$\sin \sqrt{\lambda}\,\pi = 0.$$
The equation is satisfied if $\lambda = n^2$, $n = 1, 2, 3, \ldots$. For every eigenvalue $\lambda_n = n^2$ we have two linearly independent solutions, $\cos nx$ and $\sin nx$. It can be readily shown that the system has no negative eigenvalues. However, $\lambda = 0$ is an eigenvalue and the corresponding eigenfunction is the constant function $u(x) = 1$. Thus the eigenvalues are
$$0, 1, 4, \ldots, n^2, \ldots$$
and the corresponding eigenfunctions are
$$1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx, \ldots.$$
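The double eigenvalues of the periodic system can also be seen numerically. A sketch under our own choice of discretization (a periodic second-difference matrix on a uniform grid): the eigenvalue $0$ appears once, while each $n^2$ appears as a near-exact pair, mirroring the $\cos nx$, $\sin nx$ degeneracy.

```python
import numpy as np

# Periodic finite-difference model of u'' + lambda*u = 0 on [-pi, pi]:
# wrap-around entries implement u(-pi) = u(pi), u'(-pi) = u'(pi).
N = 400
h = 2 * np.pi / N
T = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
T[0, -1] = T[-1, 0] = -1.0             # periodic coupling
lam = np.sort(np.linalg.eigvalsh(T / h**2))
```

Sorted eigenvalues begin $0, 1, 1, 4, 4, \ldots$ up to the $O(h^2)$ discretization error.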
Throughout the remainder of this section, $L$ will denote the differential operator in the Sturm-Liouville differential equation, i.e.,
$$Lu = \frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u.$$
For the regular Sturm-Liouville system, we denote by $\mathscr{D}(L)$ the domain of $L$, i.e., $\mathscr{D}(L)$ is the space of all complex valued functions $u$ defined on $[a, b]$ for which $u''$ belongs to $L^2([a, b])$ and which satisfy boundary conditions (5.9.2). We then have $L : \mathscr{D}(L) \to L^2([a, b])$.
For the singular Sturm-Liouville system we need only replace (5.9.2) by

(a) $u$ is bounded on $(a, b)$,
(b) $b_1 u(b) + b_2 u'(b) = 0$, where $b_1$ and $b_2$ are real constants such that $b_1^2 + b_2^2 > 0$.

Theorem 5.9.1 (Lagrange's Identity). For any $u, v \in \mathscr{D}(L)$,
$$u L v - v L u = \frac{d}{dx}\left[p\left(u \frac{dv}{dx} - v \frac{du}{dx}\right)\right]. \tag{5.9.3}$$

Proof. We have
$$u L v - v L u = u \frac{d}{dx}\left[p \frac{dv}{dx}\right] + quv - v \frac{d}{dx}\left[p \frac{du}{dx}\right] - quv = \frac{d}{dx}\left[p\left(u \frac{dv}{dx} - v \frac{du}{dx}\right)\right].$$

Theorem 5.9.2 (Abel's Formula).
If $u$ and $v$ are two solutions of
$$Lu + \lambda w u = 0 \tag{5.9.4}$$
in $[a, b]$, then $p(x) W(x; u, v) = \text{constant}$, where $W$ is the Wronskian:
$$W(x; u, v) = \det \begin{bmatrix} u(x) & u'(x) \\ v(x) & v'(x) \end{bmatrix}.$$

Proof. Since $u$ and $v$ are solutions of (5.9.4) we have
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + (q(x) + \lambda w(x)) u = 0, \qquad \frac{d}{dx}\left[p(x) \frac{dv}{dx}\right] + (q(x) + \lambda w(x)) v = 0.$$
Multiplying the first equation by $v$ and the second by $u$, and then subtracting, we obtain
$$u \frac{d}{dx}\left[p \frac{dv}{dx}\right] - v \frac{d}{dx}\left[p \frac{du}{dx}\right] = 0.$$
By integrating this equation from $a$ to $x$ we find
$$p(x)[u(x) v'(x) - u'(x) v(x)] = p(a)[u(a) v'(a) - u'(a) v(a)] = \text{constant}.$$
This is Abel's formula.

Theorem 5.9.3. Eigenfunctions of a regular Sturm-Liouville system are unique except for a constant factor.

Proof. Suppose $u$ and $v$ are eigenfunctions corresponding to the same eigenvalue $\lambda$. According to Abel's formula, we have $p(x) W(x; u, v) = \text{constant}$. Since $p > 0$, if $W(x; u, v)$ vanishes at a point in $[a, b]$, then it vanishes everywhere in $[a, b]$. From the boundary conditions we have
$$a_1 u(a) + a_2 u'(a) = 0, \qquad a_1 v(a) + a_2 v'(a) = 0.$$
Since $a_1$ and $a_2$ are not both zero, we get
$$W(a; u, v) = \det \begin{bmatrix} u(a) & u'(a) \\ v(a) & v'(a) \end{bmatrix} = 0.$$
Therefore $W(x; u, v) = 0$ for all $x \in [a, b]$, which proves linear dependence of $u$ and $v$.

Theorem 5.9.4. For any $u, v \in \mathscr{D}(L)$ we have
$$(Lu, v) = (u, Lv),$$
where $(\,\cdot\,, \cdot\,)$ denotes the inner product of $L^2([a, b])$. In other words, $L$ is a self-adjoint operator.
Proof. Since all constants involved in the boundary conditions of a Sturm-Liouville system are real, if $v \in \mathscr{D}(L)$, then $\bar{v} \in \mathscr{D}(L)$. Also, since $p$, $q$, and $w$ are real valued, $L\bar{v} = \overline{Lv}$. Consequently,
$$(Lu, v) - (u, Lv) = \int_a^b (\bar{v}\,Lu - u\,L\bar{v})\,dx = \left[p(u \bar{v}' - \bar{v} u')\right]_a^b, \tag{5.9.5}$$
by Lagrange's identity (5.9.3). We will show that the last term in the above equality vanishes, for both the regular and singular system. If $p(a) = 0$, the result follows immediately. If $p(a) > 0$, then $u$ and $\bar{v}$ satisfy boundary conditions of the form (5.9.2) at $x = a$. That is,
$$\begin{bmatrix} u(a) & u'(a) \\ \bar{v}(a) & \bar{v}'(a) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = 0.$$
Since $a_1$ and $a_2$ are not both zero, we have
$$u(a)\bar{v}'(a) - \bar{v}(a) u'(a) = 0.$$
A similar argument can be applied to the other end-point $x = b$, so that we conclude
$$\left[p(u\bar{v}' - \bar{v} u')\right]_a^b = 0.$$

Theorem 5.9.5.
Eigenvalues of a Sturm-Liouville system are real.

Proof. Let $\lambda$ be an eigenvalue of a Sturm-Liouville system and let $u$ be the corresponding eigenfunction. This means that $u \neq 0$ and $Lu = -\lambda w u$. Then
$$0 = (Lu, u) - (u, Lu) = (-\lambda w u, u) - (u, -\lambda w u) = (\bar{\lambda} - \lambda) \int_a^b w(x) |u(x)|^2\,dx.$$
Since $w(x) > 0$ in $[a, b]$ and $u \neq 0$, the integral is a positive number. Therefore $\bar{\lambda} = \lambda$, completing the proof.
Remark. This theorem states that all eigenvalues of a regular Sturm-Liouville system are real, but it does not guarantee that an eigenvalue exists. It is proved in Section 5.10 that a regular Sturm-Liouville system has an infinite sequence of eigenvalues.

Theorem 5.9.6. Eigenfunctions corresponding to distinct eigenvalues of a Sturm-Liouville system are orthogonal with respect to the inner product with the weight function $w(x)$.
Proof. Suppose $u_1$ and $u_2$ are eigenfunctions corresponding to eigenvalues $\lambda_1$ and $\lambda_2$, $\lambda_1 \neq \lambda_2$. Thus
$$L u_1 = -\lambda_1 w u_1 \qquad \text{and} \qquad L u_2 = -\lambda_2 w u_2.$$
Hence
$$u_2 L u_1 - u_1 L u_2 = (\lambda_2 - \lambda_1) w u_1 u_2. \tag{5.9.6}$$
By Theorem 5.9.1, we have
$$u_2 L u_1 - u_1 L u_2 = \frac{d}{dx}\left[p(u_2 u_1' - u_1 u_2')\right]. \tag{5.9.7}$$
Combining (5.9.6) and (5.9.7) and integrating from $a$ to $b$, we get
$$(\lambda_2 - \lambda_1) \int_a^b w(x) u_1(x) u_2(x)\,dx = \left[p(u_2 u_1' - u_1 u_2')\right]_a^b = 0,$$
the boundary term vanishing as in the proof of Theorem 5.9.4. Since $\lambda_1 \neq \lambda_2$, we conclude
$$\int_a^b w(x) u_1(x) u_2(x)\,dx = 0.$$
This completes the proof.
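For the system of Example 5.9.1 (weight $w = 1$, eigenfunctions $\sin nx$), the orthogonality relation can be checked directly by quadrature; the grid below is our own choice.

```python
import numpy as np

# Orthogonality of sin(nx), sin(mx) over [0, pi]: the integral vanishes for
# n != m and equals pi/2 for n = m.
x = np.linspace(0.0, np.pi, 100001)
dx = x[1] - x[0]

def inner(n, m):
    return np.sum(np.sin(n * x) * np.sin(m * x)) * dx

off = inner(2, 5)                      # distinct eigenvalues 4 and 25
diag = inner(3, 3)                     # same eigenfunction, norm squared
```

The cross term is numerically zero while the diagonal term reproduces $\pi/2$.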
5.10. Inverse Differential Operators and Green's Functions

A typical boundary value problem for an ordinary differential equation can be written in operator form as
$$Lu = f. \tag{5.10.1}$$
We seek a solution $u$ which satisfies this equation and the given boundary conditions. If $\mathscr{D}(L)$ is defined as the space of functions satisfying those boundary conditions, then the problem reduces to finding a solution of (5.10.1) in $\mathscr{D}(L)$. One way to approach the problem is by looking for the inverse operator $L^{-1}$. If it is possible to find $L^{-1}$, then the solution of (5.10.1) can be obtained as $u = L^{-1}(f)$. It turns out that in many important cases this is possible, and the inverse operator is an integral operator of the form
$$(L^{-1} f)(x) = \int_a^b G(x, t) f(t)\,dt.$$
The function $G$ is called the Green's function of the operator $L$. Existence of the Green's function and its determination is not a simple problem. We will examine the question more closely in the case of Sturm-Liouville systems.
Theorem 5.10.1. Suppose $\lambda = 0$ is not an eigenvalue of the following regular Sturm-Liouville system:
$$Lu = \frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u = f(x), \qquad a \le x \le b, \tag{5.10.2}$$
with the homogeneous boundary conditions
$$a_1 u(a) + a_2 u'(a) = 0, \tag{5.10.3}$$
$$b_1 u(b) + b_2 u'(b) = 0, \tag{5.10.4}$$
where $p$, $q$, and $w$ are continuous real valued functions on $[a, b]$, $p$ is positive in $[a, b]$, $p'(x)$ exists and is continuous in $[a, b]$, and $a_1, a_2, b_1, b_2$ are given real numbers such that $a_1^2 + a_2^2 > 0$ and $b_1^2 + b_2^2 > 0$. Then, for any $f \in \mathscr{C}([a, b])$, the system has a unique solution
$$u(x) = \int_a^b G(x, t) f(t)\,dt,$$
where $G(x, t)$ is the Green's function given by
$$G(x, t) = \begin{cases} \dfrac{u_1(t) u_2(x)}{p(x) W(x; u_1, u_2)} & \text{for } a \le t < x, \\[2mm] \dfrac{u_1(x) u_2(t)}{p(x) W(x; u_1, u_2)} & \text{for } x \le t \le b, \end{cases}$$
where $u_1$ and $u_2$ are non-zero solutions of the homogeneous equation
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u = 0$$
with boundary conditions (5.10.3) and (5.10.4), respectively, and
$$W(x; u_1, u_2) = u_1(x) u_2'(x) - u_1'(x) u_2(x)$$
is the Wronskian.

Proof. According to the theory of ordinary differential equations, the general solution of (5.10.2) is of the form
$$u = c_1 u_1 + c_2 u_2 + u_p, \tag{5.10.5}$$
where $c_1$ and $c_2$ are constants, $u_1$ and $u_2$ are two linearly independent solutions of the homogeneous equation $Lu = 0$, and $u_p$ is any particular solution of (5.10.2).
The particular solution $u_p$ can be found by the method of variation of parameters. Thus we look for a solution in the form
$$u_p = v_1 u_1 + v_2 u_2, \tag{5.10.6}$$
where $v_1$ and $v_2$ are functions to be determined. Since there are infinitely many pairs of functions $v_1$ and $v_2$ for which $u_p$ satisfies (5.10.2), we add a second condition:
$$v_1' u_1 + v_2' u_2 = 0. \tag{5.10.7}$$
We now have
$$u_p' = v_1 u_1' + v_2 u_2' \qquad \text{and} \qquad u_p'' = v_1 u_1'' + v_2 u_2'' + v_1' u_1' + v_2' u_2'.$$
Substituting into (5.10.2) we get
$$v_1(p u_1'' + p' u_1' + q u_1) + v_2(p u_2'' + p' u_2' + q u_2) + p(v_1' u_1' + v_2' u_2') = f.$$
Since $u_1$ and $u_2$ are solutions of the homogeneous equation, the first two terms vanish, so that the above result becomes
$$v_1' u_1' + v_2' u_2' = \frac{f}{p}. \tag{5.10.8}$$
Solving (5.10.7) and (5.10.8) for $v_1'$ and $v_2'$ we obtain
$$v_1' = -\frac{f u_2}{p\,W(x; u_1, u_2)}, \qquad v_2' = \frac{f u_1}{p\,W(x; u_1, u_2)}. \tag{5.10.9}$$
We will show that the Wronskian does not vanish at any point of $[a, b]$. Indeed, suppose that $W(x; u_1, u_2)$ vanishes at some $\xi \in [a, b]$. Then the system
$$\alpha u_1(\xi) + \beta u_2(\xi) = 0, \qquad \alpha u_1'(\xi) + \beta u_2'(\xi) = 0$$
has a non-trivial solution, with $\alpha$ and $\beta$ not both zero. Then the function $g = \alpha u_1 + \beta u_2$ is a solution of the initial value problem
$$\frac{d}{dx}\left[p(x) \frac{du}{dx}\right] + q(x) u = 0, \qquad g(\xi) = g'(\xi) = 0.$$
But we know that the above problem has only the trivial solution, and thus $g = 0$. This means that $u_1$ and $u_2$ are linearly dependent, i.e., $u_1 = \gamma u_2$ for some constant $\gamma$. Consequently, $u_1$ satisfies both boundary conditions (5.10.3) and (5.10.4), but this implies that $\lambda = 0$ is an eigenvalue of the system, contrary to the assumption. By Abel's formula (Theorem 5.9.2), $p(x) W(x; u_1, u_2)$ is a constant. Since $W$ does not vanish in $[a, b]$ and $p$ is assumed to be positive, the constant is not zero. Denote $c = [p(x) W(x; u_1, u_2)]^{-1}$.
Now, by integrating equalities (5.10.9) we get
$$v_1(x) = -\int c f(x) u_2(x)\,dx \qquad \text{and} \qquad v_2(x) = \int c f(x) u_1(x)\,dx,$$
and finally
$$u_p(x) = -c u_1(x) \int_b^x f(t) u_2(t)\,dt + c u_2(x) \int_a^x f(t) u_1(t)\,dt = \int_a^x c u_2(x) u_1(t) f(t)\,dt + \int_x^b c u_1(x) u_2(t) f(t)\,dt. \tag{5.10.10}$$
Consequently, if we define the Green's function as
$$G(x, t) = \begin{cases} c\,u_1(t) u_2(x) & \text{for } a \le t < x, \\ c\,u_1(x) u_2(t) & \text{for } x \le t \le b, \end{cases} \tag{5.10.11}$$
we can write
$$u_p(x) = \int_a^b G(x, t) f(t)\,dt,$$
provided the integral exists. This follows immediately from the continuity of $G$. The proof of continuity of $G$ is left as an exercise.
Denote by $T$ the integral operator defined in Theorem 5.10.1, i.e.,
$$(Tf)(x) = \int_a^b G(x, t) f(t)\,dt. \tag{5.10.12}$$
We are going to examine properties of $T$.

Theorem 5.10.2. The operator $T$ defined by (5.10.12) is a self-adjoint compact operator from $L^2([a, b])$ into $\mathscr{C}([a, b])$.
Proof. Compactness of integral operators has been discussed in Example 4.8.4. In Example 4.4.3 we proved that an integral operator of the form (5.10.12) is self-adjoint if $G(x, t) = \overline{G(t, x)}$. It is easy to see that in this case the condition is satisfied. Finally, continuity of $Tf$ follows from the continuity of $G$.

The operator $T$, as a compact self-adjoint operator, admits a spectral representation (see Theorem 4.10.2). The following two theorems describe the connection between eigenvalues and eigenfunctions of the regular Sturm-Liouville operator $L$ and the corresponding integral operator $T$.

Theorem 5.10.3. If $\lambda = 0$ is not an eigenvalue of the Sturm-Liouville system defined in Theorem 5.10.1, then $\lambda = 0$ is not an eigenvalue of the integral operator $T$ defined by (5.10.12).
Proof. Suppose $Tf = 0$. Then also
$$0 = (Tf)'(x) = \frac{d}{dx}\left[c u_1(x) \int_x^b f(t) u_2(t)\,dt + c u_2(x) \int_a^x f(t) u_1(t)\,dt\right]$$
$$= c\left(u_1'(x) \int_x^b f(t) u_2(t)\,dt - u_1(x) u_2(x) f(x) + u_2'(x) \int_a^x f(t) u_1(t)\,dt + u_2(x) u_1(x) f(x)\right)$$
$$= c\left(u_1'(x) \int_x^b f(t) u_2(t)\,dt + u_2'(x) \int_a^x f(t) u_1(t)\,dt\right).$$
Therefore, we have the following system of equations:
$$u_1(x) \int_x^b f(t) u_2(t)\,dt + u_2(x) \int_a^x f(t) u_1(t)\,dt = 0,$$
$$u_1'(x) \int_x^b f(t) u_2(t)\,dt + u_2'(x) \int_a^x f(t) u_1(t)\,dt = 0.$$
Since the determinant
$$\det \begin{bmatrix} u_1(x) & u_2(x) \\ u_1'(x) & u_2'(x) \end{bmatrix}$$
does not vanish at any point of $[a, b]$ (see the proof of Theorem 5.10.1), we conclude
$$\int_x^b f(t) u_2(t)\,dt = \int_a^x f(t) u_1(t)\,dt = 0$$
for all $x \in [a, b]$. This implies $f = 0$, and thus the equation $Tf = 0$ has only the trivial solution. The proof is complete.

Theorem 5.10.4. Under the assumptions of Theorem 5.10.1, $\lambda$ is an eigenvalue of $L$ if and only if $1/\lambda$ is an eigenvalue of $T$. Moreover, if $f$ is an eigenfunction of $L$ corresponding to the eigenvalue $\lambda$, then $f$ is an eigenfunction of $T$ corresponding to the eigenvalue $1/\lambda$.

Proof. Suppose $Lf = \lambda f$ for some non-zero $f$ in the domain of $L$. By Theorem 5.10.1, we have
$$f = T(\lambda f),$$
or equivalently, since $\lambda \neq 0$,
$$Tf = \frac{1}{\lambda} f.$$
This shows that $1/\lambda$ is an eigenvalue of $T$ and $f$ is the corresponding eigenfunction. Conversely, if $f$ is an eigenfunction of $T$ corresponding to $\lambda$, $f \neq 0$ and $\lambda \neq 0$, then
$$Tf = \lambda f,$$
and hence
$$f = L(Tf) = L(\lambda f) = \lambda L f.$$
Therefore $1/\lambda$ is an eigenvalue of $L$ and the corresponding eigenfunction is $f$.
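The eigenvalue correspondence of Theorem 5.10.4 can be observed numerically. In the sketch below we use our own example $Lu = u''$ on $[0, \pi]$ with $u(0) = u(\pi) = 0$ (eigenvalues $-n^2$): here $u_1 = t$, $u_2 = t - \pi$, $pW = \pi$, so $G(x,t) = t(x - \pi)/\pi$ for $t \le x$, and the discretized integral operator should have eigenvalues $-1/n^2$.

```python
import numpy as np

# Nystrom discretization of (Tf)(x) = Int G(x, t) f(t) dt on [0, pi].
N = 600
t = np.linspace(0.0, np.pi, N)
h = t[1] - t[0]
X, T_ = np.meshgrid(t, t, indexing="ij")
G = np.where(T_ <= X, T_ * (X - np.pi), X * (T_ - np.pi)) / np.pi
mu = np.sort(np.linalg.eigvalsh(G * h))   # G is symmetric in (x, t)
```

The most negative eigenvalues come out close to $-1, -1/4, -1/9$, the reciprocals of the Sturm-Liouville eigenvalues, while the remaining spectrum accumulates at $0$, as it must for a compact operator.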
5.11. Applications of Fourier Transforms to Ordinary Differential Equations and Integral Equations

In this section we discuss some examples of applications of the Fourier transform to ordinary differential equations and integral equations. Consider the $n$th order linear ordinary differential equation with constant coefficients
$$L(y) = f(x), \tag{5.11.1}$$
where $L$ is the $n$th order differential operator given by
$$L = a_n D^n + a_{n-1} D^{n-1} + \cdots + a_1 D + a_0, \tag{5.11.2}$$
where $a_n, a_{n-1}, \ldots, a_0$ are constants and $D = d/dx$.
Application of the Fourier transform to both sides of (5.11.1) gives
$$\left[a_n (ik)^n + a_{n-1} (ik)^{n-1} + \cdots + a_1 (ik) + a_0\right]\hat{y}(k) = \hat{f}(k),$$
or
$$p(ik)\,\hat{y}(k) = \hat{f}(k), \qquad \text{where } p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0.$$
Thus
$$\hat{y}(k) = \frac{\hat{f}(k)}{p(ik)} = \hat{f}(k)\,\hat{g}(k), \tag{5.11.3}$$
where
$$\hat{g}(k) = \frac{1}{p(ik)}.$$
Now the Convolution Theorem 4.11.7 gives the solution
$$y(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(\xi)\,g(x - \xi)\,d\xi, \tag{5.11.4}$$
provided $g(x) = \mathscr{F}^{-1}\{\hat{g}(k)\}$ is known explicitly.

In order to give a physical interpretation of the result, we consider the differential equation associated with a sudden impulse function $f(x) = \delta(x)$:
$$L(G) = \delta(x) \tag{5.11.5}$$
(for a rigorous discussion of the Dirac delta distribution $\delta$ see Section 6.2). Application of the Fourier transform to (5.11.5) yields the solution
$$G(x) = \mathscr{F}^{-1}\left\{\frac{1}{\sqrt{2\pi}}\,\hat{g}(k)\right\} = \frac{1}{\sqrt{2\pi}}\,g(x). \tag{5.11.6}$$
Now the solution (5.11.4) can be written as
$$y(x) = \int_{-\infty}^{\infty} f(\xi)\,G(x - \xi)\,d\xi. \tag{5.11.7}$$
Clearly, $G(x)$ behaves like a Green's function; that is, it is the response to a unit impulse. In any physical system, $f(x)$ is usually called the input function, while $y(x)$ is called the output, obtained by the superposition principle. The Fourier transform of $\sqrt{2\pi}\,G(x)$ is called the admittance $\hat{g}(k) = [p(ik)]^{-1}$. In order to determine the response to a given input, we first find the Fourier transform of the input, multiply the result by the admittance, and then apply the inverse Fourier transform to the product. We illustrate these ideas by a simple electrical circuit problem.
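The transform-divide-invert recipe is easy to carry out numerically. The sketch below applies it to $L = D^2 - 1$ (so $p(ik) = -k^2 - 1$ never vanishes); the test solution $y(x) = e^{-x^2}$ and the wide periodic FFT grid standing in for the real line are our own choices.

```python
import numpy as np

# Solve y'' - y = f spectrally: transform, divide by p(ik) = -k^2 - 1, invert.
# f is manufactured from the known solution y = exp(-x^2) so we can check it.
N, L = 2048, 20.0
x = np.linspace(-L, L, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=x[1] - x[0])

y_exact = np.exp(-x**2)
f = (4 * x**2 - 3) * y_exact           # y'' - y for y = exp(-x^2)

y = np.fft.ifft(np.fft.fft(f) / (-k**2 - 1)).real
err = np.max(np.abs(y - y_exact))
```

Because the data decay rapidly, the recovered $y$ agrees with the exact solution to near machine precision.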
Example 5.11.1. The electric current $I(t)$ in a circuit is governed by the equation
$$L \frac{dI}{dt} + R I = E(t), \tag{5.11.8}$$
where $L$ is the inductance, $R$ is the resistance, and $E(t)$ is the applied electromotive force. With $E(t) = E_0 e^{-|t|}$, application of the Fourier transform (with respect to $t$) to equation (5.11.8) gives
$$(ikL + R)\,\hat{I}(k) = E_0 \sqrt{\frac{2}{\pi}}\,\frac{1}{1 + k^2},$$
or
$$\hat{I}(k) = E_0 \sqrt{\frac{2}{\pi}}\,\frac{1}{(ikL + R)(1 + k^2)}.$$
The inverse Fourier transform yields
$$I(t) = \frac{E_0}{\pi} \int_{-\infty}^{\infty} \frac{e^{ikt}\,dk}{(ikL + R)(1 + k^2)}.$$
This integral can readily be evaluated by the theory of residues. For $t > 0$, closing the contour in the upper half-plane, we pick up the residues at $k = i$ and $k = iR/L$:
$$I(t) = \frac{E_0}{\pi} \cdot 2\pi i \left[\text{residue at } k = i + \text{residue at } k = \frac{iR}{L}\right] = E_0\left[\frac{e^{-t}}{R - L} + \frac{2L\,e^{-Rt/L}}{L^2 - R^2}\right]. \tag{5.11.9}$$
Similarly, for $t < 0$, we obtain
$$I(t) = -\frac{E_0}{\pi} \cdot 2\pi i\,[\text{residue at } k = -i] = \frac{E_0\,e^{t}}{L + R}.$$
At $t = 0$ the current is continuous; hence
$$I(0) = \lim_{t \to 0} I(t) = \frac{E_0}{R + L}. \tag{5.11.10}$$
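The residue result for $t > 0$ can be verified directly: the closed form should satisfy the circuit equation $L\,I' + R\,I = E_0 e^{-t}$ and match $I(0) = E_0/(R + L)$. A minimal check with arbitrary parameter values of our own choosing (any $R \neq L$ works):

```python
import numpy as np

# I(t) = E0*( e^{-t}/(R - L) + 2*L*e^{-R t/L}/(L^2 - R^2) ) for t > 0.
E0, Lind, R = 1.0, 2.0, 3.0            # arbitrary values with R != L

t = np.linspace(0.0, 5.0, 1001)
I = E0 * (np.exp(-t) / (R - Lind)
          + 2 * Lind * np.exp(-R * t / Lind) / (Lind**2 - R**2))
# Hand-differentiated closed form, so the ODE residual can be tested exactly.
Ip = E0 * (-np.exp(-t) / (R - Lind)
           - 2 * R * np.exp(-R * t / Lind) / (Lind**2 - R**2))
residual = Lind * Ip + R * I - E0 * np.exp(-t)
```

The residual vanishes to round-off, and $I(0) = E_0/(R + L)$ as required by continuity.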
Example 5.11.2 (Synthesis and Resolution of a Pulse; Physical Interpretation of Convolution). A time-dependent electric, optical, or electromagnetic pulse can be regarded as a superposition of plane waves of all real frequencies, so that the total pulse can be represented by the inverse Fourier transform
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega, \tag{5.11.11}$$
where the factor $1/2\pi$ is introduced because the angular frequency $\omega$ is related to the linear frequency $\nu$ by $\omega = 2\pi\nu$, and negative frequencies are introduced for mathematical convenience so that we can avoid dealing with the cosine and sine functions separately. Clearly, $F(\omega)$ can be represented by the Fourier transform of $f(t)$ as
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \tag{5.11.12}$$
This represents the resolution of the pulse $f(t)$ into its angular frequency components, and (5.11.11) gives a synthesis of the pulse from its individual components.

Consider a simple electrical device, such as an amplifier, with an input function $f(t)$ and an output function $g(t)$. For an input of a single frequency $\omega$, $f(t) = e^{i\omega t}$. The amplifier will change the amplitude and may also change the phase, so that the output can be expressed in terms of the input and an amplitude- and phase-modifying function $\Phi(\omega)$ as
$$g(t) = \Phi(\omega) f(t), \tag{5.11.13}$$
where $\Phi(\omega)$ is usually called the transfer function; it is, in general, a complex function of the real variable $\omega$. This function is generally independent of the presence or absence of any other frequency components. Thus the total output may be obtained by integrating over the entire input as modified by the amplifier:
$$g(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi(\omega) F(\omega)\,e^{i\omega t}\,d\omega. \tag{5.11.14}$$
Therefore, the total output $g(t)$ can readily be calculated from any given input $f(t)$ and known transfer function $\Phi(\omega)$. On the other hand, the transfer function is obviously characteristic of the amplifier and can, in general, be obtained as the Fourier transform of some function $\phi(t)$:
$$\Phi(\omega) = \int_{-\infty}^{\infty} \phi(t)\,e^{-i\omega t}\,dt. \tag{5.11.15}$$
The Convolution Theorem 4.11.7 allows us to rewrite (5.11.14) as
$$g(t) = \mathscr{F}^{-1}\{\Phi(\omega) F(\omega)\} = \int_{-\infty}^{\infty} f(\tau)\,\phi(t - \tau)\,d\tau. \tag{5.11.16}$$
Physically, this result represents the output (effect) as the integral superposition of the input (cause) function $f(t)$ modified by $\phi(t - \tau)$. Indeed, (5.11.16) is the most general mathematical representation of an output in terms of an input modified by the amplifier, where $t$ is the time variable. Assuming the principle of causality, that is, that every effect has a cause, we must require $\tau < t$. The principle of causality is imposed by requiring
$$\phi(t - \tau) = 0 \qquad \text{for } \tau > t. \tag{5.11.17}$$
Consequently, (5.11.16) reduces to the form
$$g(t) = \int_{-\infty}^{t} f(\tau)\,\phi(t - \tau)\,d\tau. \tag{5.11.18}$$
In order to determine the significance of $\phi(t)$, we use a sudden impulse function $f(\tau) = \delta(\tau)$, so that (5.11.18) becomes
$$g(t) = \int_{-\infty}^{t} \delta(\tau)\,\phi(t - \tau)\,d\tau = \phi(t) H(t). \tag{5.11.19}$$
This identifies $\phi(t)$ as the output corresponding to a unit impulse at $t = 0$, and the Fourier transform $\Phi(\omega)$ of $\phi(t)$ is given by
$$\Phi(\omega) = \int_{0}^{\infty} \phi(t)\,e^{-i\omega t}\,dt, \tag{5.11.20}$$
with $\phi(t) = 0$ for $t < 0$.

[...]

$$f(x) = \phi(x) + \lambda \int_0^{\pi} \Gamma(x, t; \lambda)\,\phi(t)\,dt,$$
where $\lambda$ is not an eigenvalue. Obtain the general solution, if it exists, for $\phi(x) = \sin x$.
(9) Show that the solution of the differential equation
$$\frac{d^2 f}{dx^2} + x f = 1, \qquad f(0) = f'(0) = 0,$$
satisfies the non-homogeneous Volterra equation
$$f(x) = \frac{x^2}{2} + \int_0^x t(t - x) f(t)\,dt.$$
(10) Transform the problems
$$\text{(a)} \quad \frac{d^2 f}{dx^2} + f = x, \qquad f(0) = 0, \quad f'(1) = 0,$$
$$\text{(b)} \quad \frac{d^2 f}{dx^2} + f = x, \qquad f(0) = 1, \quad f'(1) = 0,$$
into Fredholm integral equations.
(11) Discuss the solutions of the integral equation
$$f(x) = \phi(x) + \lambda \int_0^1 (x + t) f(t)\,dt.$$

(12) When do the following integral equations have solutions?
$$\text{(a)} \quad f(x) = \phi(x) + \lambda \int_0^1 (1 - 3xt) f(t)\,dt.$$
$$\text{(b)} \quad f(x) = \phi(x) + \lambda \int_0^{\pi} \sin(x + t) f(t)\,dt.$$
$$\text{(c)} \quad f(x) = \phi(x) + \lambda \int_0^1 x t\,f(t)\,dt.$$
$$\text{(d)} \quad f(x) = \phi(x) + \lambda \int_{-1}^{1} \sum_n P_n(x) P_n(t) f(t)\,dt, \quad \text{where } P_n \text{ is the } n\text{th degree Legendre polynomial}.$$
$$\text{(e)} \quad f(x) = x + \frac{1}{2} \int_{-1}^{1} (x + t) f(t)\,dt.$$
(13) Find the eigenvalues and eigenfunctions of the following integral equations:
$$\text{(a)} \quad f(x) = \lambda \int_0^{\pi} \cos(x - t) f(t)\,dt.$$
$$\text{(b)} \quad f(x) = \lambda \int_0^{1} (t - x) f(t)\,dt.$$
$$\text{(c)} \quad f(x) = \phi(x) + \lambda \int_0^{2\pi} \cos(x + t) f(t)\,dt.$$
(14) Solve the integral equations:
$$\text{(a)} \quad f(x) = \phi(x) + \lambda \int_0^1 t f(t)\,dt.$$
$$\text{(b)} \quad f(x) = x + \lambda \int_0^{1/2} f(t)\,dt.$$
$$\text{(c)} \quad f(x) = \frac{5x}{6} + \frac{1}{2} \int_0^1 x t\,f(t)\,dt.$$
$$\text{(d)} \quad f(x) = x + \int_0^1 (1 + xt) f(t)\,dt.$$
$$\text{(e)} \quad f(x) = e^x + \lambda \int_0^2 e^{x + t} f(t)\,dt.$$

(15) Use the separable kernel method to show that
$$f(x) = \lambda \int_0^{\pi} \cos x \sin t\,f(t)\,dt$$
has no solution except the trivial solution $f = 0$.
(16) Obtain the Neumann series solutions of the following equations:
$$\text{(a)} \quad f(x) = x + \frac{1}{2} \int_{-1}^{1} (t + x) f(t)\,dt.$$
$$\text{(b)} \quad f(x) = x + \int_0^x (t - x) f(t)\,dt.$$
$$\text{(c)} \quad f(x) = x - \int_0^x (t - x) f(t)\,dt.$$
$$\text{(d)} \quad f(x) = 1 - 2 \int_0^x t f(t)\,dt.$$
(17) If $Lu = u'' + \omega^2 u$, show that $L$ is formally self-adjoint and the concomitant is $J(u, v) = v u' - u v'$. Moreover, if $u$ is a solution of $Lu = 0$ and $v$ is a solution of $L^* v = 0$, then the concomitant of $u$ and $v$ is a constant.
(18) Let $L$ be a self-adjoint differential operator given by (5.8.15). If $u_1$ and $u_2$ are two solutions of $Lu = 0$, and $J(u_1, u_2) = 0$ for some $x$ for which $a_2(x) \neq 0$, then $u_1$ and $u_2$ are linearly dependent.

(19) Consider the differential operator $L$ with the boundary conditions
$$u'(0) = 0, \qquad u(1) = 0.$$
Show that $L$ is formally self-adjoint.
(20) Prove continuity of the Green's function defined in Theorem 5.10.1.

(21) Find eigenvalues and eigenfunctions of the following Sturm-Liouville system:
$$u'' + \lambda u = 0, \qquad 0 \le x \le \pi, \qquad u(0) = u'(\pi) = 0.$$

(22) Transform the Euler equation
$$x^2 u'' + x u' + \lambda u = 0, \qquad 1 \le x \le e,$$
with the boundary conditions $u(1) = u(e) = 0$ into the Sturm-Liouville system
$$\frac{d}{dx}\left[x \frac{du}{dx}\right] + \frac{\lambda}{x}\,u = 0, \qquad u(1) = u(e) = 0.$$
Find the eigenvalues and eigenfunctions.

(23) Prove that $\lambda = 0$ is not an eigenvalue of the system defined in Example 5.9.1.

(24) Show that the Sturm-Liouville operator $L = D p D + q$, $D = d/dx$, is positive if $p(x) > 0$ and $q(x) \ge 0$ for all $x \in [a, b]$.
(25) Show that the Sturm-Liouville operator $L$ in $L^2([a, b])$ given by
$$L = \frac{1}{r(x)}(D p D + q)$$
is not symmetric.

(26) Use the Fourier transform to solve the forced linear harmonic oscillator equation for $t > 0$, $\omega \neq n$, with
$$x(0{+}) = 0 = \dot{x}(0{+}).$$
Examine the case when $\omega = n$.
(27) Solve the problem discussed in Example 5.11.1 with $E(t) = E_0\,e^{-at} \sin \omega t\,H(t)$ and $I(0{+}) = I_0$.

(28) If there is a capacitor in the circuit discussed in Example 5.11.1, then the current $I(t)$ satisfies the integrodifferential equation
$$L \frac{dI}{dt} + R I + \frac{1}{C}\left[q_0 + \int_0^t I(\tau)\,d\tau\right] = E(t),$$
where $q_0$ is the initial charge on the capacitor, so that
$$q = q_0 + \int_0^t I(\tau)\,d\tau$$
is the charge and $dq/dt = I$. Solve this problem using the Fourier transform and the following conditions
fort 0. This completes the proof. In electrodynamics, the fundamental solution (6.3.53) has a well known interpretation. It is essentially the potential at the point x produced by a unit point charge at the point~· This is what can be expected from a physical point of view because 8 (x- ~) is the charge-density corresponding to a unit point charge at ~· The solution of ( 6.3.46) is u(x,y,z)= (
JR3
1 G(x,~)f(~)d~=( 47T JR3
fig'x-~'l{;) 11
dgd17d{
(6.3.54)
The integrand in (6.3.54) consists of the given charge distribution $f(x)$ at $x = \xi$ and the Green's function $G(x, \xi)$. Physically, $G(x, \xi) f(\xi)$ represents the resulting potential due to an elementary point charge, and the total potential due to a given charge distribution $f(x)$ is then obtained by the integral superposition of the resulting potentials. This is the so-called principle of superposition.

Example 6.3.15. The fundamental solution of the two dimensional Helmholtz equation ($-\infty < x, y < \infty$) [...]

[...] we seek solutions $u$ such that
$$\int_{\Omega} \nabla u \cdot \nabla \phi\,d\tau = \int_{\Omega} f \phi\,d\tau \tag{6.4.3}$$
for every $\phi \in \mathscr{D}(\Omega)$. This does not require any information on the second derivatives of $u$. On the other hand, if $f \notin \mathscr{C}(\Omega)$ the problem (6.4.1) does not have a classical solution. It is then necessary to generalize the notion of solution in an appropriate manner. If $f \in L^2(\Omega)$, Equation (6.4.3) makes sense if $\nabla u \in L^2(\Omega)$. If $u \in H_0^1(\Omega)$, where $H_0^1(\Omega)$ is the subspace of $H^1(\Omega)$ consisting of functions vanishing on $\partial\Omega$, and if the derivatives $\partial u / \partial x_k$ are considered in the generalized sense, then it follows from the definition of the Sobolev space that $\partial u / \partial x_k \in L^2(\Omega)$. Then if $u \in H_0^1(\Omega)$ and $u$ satisfies (6.4.3), it is a weak solution of (6.4.1).
Generalized Functions and Partial Differential Equations
Since H₀¹(Ω) is the closure of 𝒟(Ω), 𝒟(Ω) is a dense subspace of H₀¹(Ω). Therefore, solving Equation (6.4.3) is equivalent to finding u ∈ H₀¹(Ω) such that, for all φ ∈ 𝒟(Ω),
$$ (\nabla u, \nabla\phi) = (f, \phi), \tag{6.4.4} $$
where (·,·) is the inner product in L²(Ω): (φ, ψ) = ∫_Ω φψ. Equation (6.4.4) is known as the variational or weak formulation of the problem (6.4.1).
Theorem 6.4.1. Let Ω be a bounded open subset of ℝ^N and let f ∈ L²(Ω). Then there exists a unique weak solution u ∈ H₀¹(Ω) satisfying (6.4.4). Furthermore, u ∈ H₀¹(Ω) is a solution of (6.4.4) if and only if
$$ J(u) = \min_{v\in H_0^1(\Omega)} J(v), \tag{6.4.5} $$
where
$$ J(v) = \frac{1}{2}\int_\Omega \nabla v\cdot\nabla v\,d\tau - \int_\Omega fv\,d\tau. \tag{6.4.6} $$
Proof. In order to apply the Lax–Milgram Theorem 4.3.7, we set H = H₀¹(Ω) and, for u, v ∈ H₀¹(Ω),
$$ a(u, v) = \int_\Omega \nabla u\cdot\nabla v\,d\tau. \tag{6.4.7} $$
We first show that a(·,·) is coercive, that is, there exists a positive constant K such that
$$ a(u, u) \ge K\|u\|_1^2 $$
for all u ∈ H. This readily follows from Friedrichs' first inequality
$$ \int_\Omega |\nabla u|^2\,d\tau \ge \alpha\int_\Omega u^2\,d\tau, \qquad u\in H, \tag{6.4.8} $$
where α is a positive constant. Thus
$$ \int_\Omega |\nabla u|^2\,d\tau \ge \frac{1}{2}\int_\Omega |\nabla u|^2\,d\tau + \frac{\alpha}{2}\int_\Omega u^2\,d\tau \ge K\|u\|_1^2, \tag{6.4.9} $$
where K = min{1/2, α/2} and u ∈ H.
Applications
To prove the boundedness of a(·,·) we note that
$$ a(u, u) = \int_\Omega |\nabla u|^2\,d\tau \le \int_\Omega (|\nabla u|^2 + u^2)\,d\tau = \|u\|_1^2. \tag{6.4.10} $$
Thus a(·,·) is bounded, symmetric, and coercive. So, by the Lax–Milgram Theorem, there exists a unique weak solution of equation (6.4.4).

We next consider the Neumann boundary value problem
$$ -\nabla^2 u + bu = f \quad\text{in } \Omega, \tag{6.4.11a} $$
$$ \frac{\partial u}{\partial n} = 0 \quad\text{on } \partial\Omega, \tag{6.4.11b} $$
where Ω ⊂ ℝ^N is a bounded open set, n is the exterior unit normal to ∂Ω, and b is a non-negative constant. According to Green's first identity (6.3.24), if u is a classical solution, then u ∈ H¹(Ω) and it satisfies the equation
$$ \int_\Omega \nabla u\cdot\nabla v\,d\tau + \int_\Omega v\,\nabla^2 u\,d\tau = \int_{\partial\Omega} v\,\frac{\partial u}{\partial n}\,dS. $$
Or equivalently, by (6.4.11),
$$ \int_\Omega \nabla u\cdot\nabla v\,d\tau + \int_\Omega buv\,d\tau = \int_\Omega fv\,d\tau \tag{6.4.12} $$
for every v ∈ H¹(Ω). If f ∈ L²(Ω), then we define a weak solution of (6.4.11) as a u ∈ H¹(Ω) satisfying (6.4.12). Consider the bilinear form associated with the operator A = −∇² + b:
$$ a(u, v) = \int_\Omega \big[(\nabla u\cdot\nabla v) + buv\big]\,d\tau. \tag{6.4.13} $$
Clearly, a is a bilinear form on H¹(Ω) and
$$ a(u, v) = \int_\Omega (\nabla u\cdot\nabla v + buv)\,d\tau \le \max(1, b_1)\int_\Omega (\nabla u\cdot\nabla v + uv)\,d\tau = M(u, v) \le M\|u\|\,\|v\|, $$
where 0 < b ≤ b₁ and M = max(1, b₁), so a is continuous. On the other hand,
$$ a(u, u) = \int_\Omega (\nabla u\cdot\nabla u + bu^2)\,d\tau \ge \min(1, b_0)\,\|u\|^2, $$
where 0 < b₀ ≤ b. Therefore, a is a continuous and coercive bilinear form. Then, by the Lax–Milgram Theorem, there exists a unique solution u ∈ H¹(Ω) such that
$$ a(u, v) = (f, v) \tag{6.4.14} $$
for all v ∈ H¹(Ω). This u is called the weak solution of the equation Au = f; that is, u is the unique solution of the Neumann boundary value problem (6.4.11). Furthermore, the solution minimizes the functional
$$ J(v) = \frac{1}{2}\int_\Omega (\nabla v\cdot\nabla v + bv^2)\,d\tau - \int_\Omega fv\,d\tau. \tag{6.4.15} $$

Example 6.4.1.
Consider the boundary value problem
$$ -\nabla^2 u + a_0 u = f \quad\text{in } \Omega, \tag{6.4.16a} $$
$$ u = 0 \quad\text{on } \partial\Omega, \tag{6.4.16b} $$
where a₀ is a positive constant. Set Tu = −∇²u + a₀u. Define an inner product in H₀¹(Ω),
$$ (u, v) = \int_\Omega (u_xv_x + u_yv_y + uv)\,dx\,dy, \tag{6.4.17} $$
a bilinear form in H₀¹(Ω),
$$ a(u, v) = (v, Tu) = \int_\Omega v\,(-\nabla^2 u + a_0u)\,dx\,dy, \tag{6.4.18} $$
and a functional on H₀¹(Ω),
$$ I(v) = \int_\Omega fv\,dx\,dy. \tag{6.4.19} $$
A quadratic form for this problem can be defined in H₀¹(Ω) by
$$ J(u) = \frac{1}{2}a(u, u) - I(u) = \int_\Omega\Big[\frac{1}{2}\big\{(u_x^2 + u_y^2) + a_0u^2\big\} - fu\Big]\,dx\,dy. $$
The bilinear form a is symmetric, bounded, and positive definite. The boundedness follows from the Schwarz inequality:
$$ |a(u, v)| \le \Big(\int_\Omega (|u_x|^2 + |u_y|^2)\,dx\,dy\Big)^{1/2}\Big(\int_\Omega (|v_x|^2 + |v_y|^2)\,dx\,dy\Big)^{1/2} + a_0\Big(\int_\Omega |u|^2\,dx\,dy\Big)^{1/2}\Big(\int_\Omega |v|^2\,dx\,dy\Big)^{1/2} \le K\|u\|\,\|v\|, $$
where K = max(1, a₀).
The positive definiteness follows from (6.4.18) by setting u = v:
$$ a(u, u) = \int_\Omega (|\nabla u|^2 + a_0u^2)\,dx\,dy \ge \alpha\int_\Omega (|\nabla u|^2 + |u|^2)\,dx\,dy = \alpha\|u\|^2, $$
where α = min(1, a₀). Note that I(v) is bounded. Hence it follows from the Lax–Milgram Theorem that the problem a(u, v) = I(v) has a unique solution in H₀¹(Ω).

We can generalize the preceding result to cover the case of second order elliptic equations defined on an open bounded set Ω ⊂ ℝ^N with smooth boundary ∂Ω. We now consider the boundary value problem
$$ Tu = f \quad\text{in } \Omega\subset\mathbb{R}^N, \tag{6.4.20a} $$
$$ u = 0 \quad\text{on } \partial\Omega, \tag{6.4.20b} $$
where
$$ Tu = -\sum_{i,j=1}^{N}\frac{\partial}{\partial x_i}\Big[a_{ij}\frac{\partial u}{\partial x_j}\Big] + a_0u, $$
a_ij ∈ 𝒞¹(cl Ω) for 1 ≤ i, j ≤ N, a₀ ∈ 𝒞¹(cl Ω), and x = (x₁, ..., x_N) ∈ ℝ^N. The differential operator T is said to be in divergence form. It is called uniformly elliptic if the ellipticity condition
$$ \sum_{i,j=1}^{N} a_{ij}(x)\,\xi_i\xi_j \ge K|\boldsymbol{\xi}|^2 = K(\xi_1^2 + \cdots + \xi_N^2) \tag{6.4.21} $$
is satisfied for all ξ ∈ ℝ^N and all x ∈ Ω, where K is positive and independent of x and ξ. If f ∈ L²(Ω), a weak solution of (6.4.20) is a function u ∈ H₀¹(Ω) satisfying
$$ \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \int_\Omega a_0uv\,d\tau = \int_\Omega fv\,d\tau \tag{6.4.22} $$
for all v ∈ H₀¹(Ω). It can readily be verified that every classical solution is a weak solution. Conversely, every sufficiently smooth weak solution is a classical solution. We next define a bilinear form in H₀¹(Ω) by
$$ a(u, v) = \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \int_\Omega a_0uv\,d\tau \tag{6.4.23} $$
and the norm
$$ \|u\|_1 = \Big(\int_\Omega (|\nabla u|^2 + u^2)\,d\tau\Big)^{1/2}. \tag{6.4.24} $$
If a₀(x) ≥ 0 for all x ∈ Ω, then, in view of the ellipticity condition (6.4.21),
$$ a(u, u) = \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\,d\tau + \int_\Omega a_0u^2\,d\tau \ge \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\,d\tau \ge K\int_\Omega |\nabla u|^2\,d\tau. \tag{6.4.25} $$
It can be checked that the form a(u, v) is bounded in H₀¹(Ω), that is,
$$ |a(u, v)| \le M\|u\|_1\|v\|_1 \tag{6.4.26} $$
for some constant M and all u, v ∈ H₀¹(Ω). If a is symmetric, that is, a_ij = a_ji for all 1 ≤ i, j ≤ N, then by the Lax–Milgram Theorem there exists a unique solution u ∈ H₀¹(Ω) such that
$$ a(u, v) = (f, v) \tag{6.4.27} $$
for all v ∈ H₀¹(Ω). Consequently, u satisfies the equation (6.4.22). In other words, the unique solution u minimizes the functional
$$ J(v) = \frac{1}{2}\int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial v}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \frac{1}{2}\int_\Omega a_0v^2\,d\tau - \int_\Omega fv\,d\tau \tag{6.4.28} $$
on H₀¹(Ω). To define a weak solution through (6.4.22), it suffices to assume that a_ij and a₀ are bounded on Ω. Hence u is the weak solution of the equation Tu = f; that is, u is the unique weak solution of the elliptic boundary value problem (6.4.20).

More generally, we consider the following second order elliptic boundary value problem:
$$ Tu = f \quad\text{in } \Omega\subset\mathbb{R}^N, \qquad u = 0 \quad\text{on } \partial\Omega, \tag{6.4.29} $$
where
$$ Tu = -\sum_{i,j=1}^{N}\frac{\partial}{\partial x_i}\Big[a_{ij}\frac{\partial u}{\partial x_j}\Big] + \sum_{i=1}^{N} a_i\frac{\partial u}{\partial x_i} + a_0u, $$
the a_ij satisfy the ellipticity condition (6.4.21), and a_i ∈ 𝒞(cl Ω), 1 ≤ i ≤ N. A weak solution is a u ∈ H₀¹(Ω) satisfying
$$ a(u, v) = (f, v) \tag{6.4.30} $$
for every v ∈ H₀¹(Ω), where
$$ a(u, v) = \int_\Omega \sum_{i,j=1}^{N} a_{ij}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,d\tau + \int_\Omega \sum_{i=1}^{N} a_i\frac{\partial u}{\partial x_i}\,v\,d\tau + \int_\Omega a_0uv\,d\tau. \tag{6.4.31} $$
This bilinear form is not always symmetric. If it is bounded and coercive, then there exists a unique solution by the Lax–Milgram Theorem.
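The weak formulation a(u, v) = (f, v) is also the starting point of the Galerkin method. The sketch below (this editor's illustration, not from the book) assembles the standard piecewise-linear Galerkin system for the one dimensional model problem −u″ = f on (0, 1) with u(0) = u(1) = 0, where a(u, v) = ∫ u′v′ dx; the data f = π² sin πx is chosen so that the exact solution sin πx is known.

```python
import numpy as np

# Galerkin method for a(u, v) = (f, v), a(u, v) = ∫ u'v' dx on (0, 1),
# with u(0) = u(1) = 0, using piecewise-linear "hat" basis functions.
n = 100                      # number of interior nodes
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)

# Stiffness matrix a(φ_i, φ_j): tridiagonal with 2/h on the diagonal, -1/h off it.
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h

f = np.pi**2 * np.sin(np.pi * x)   # right-hand side; exact solution is sin(πx)
b = h * f                          # lumped load vector ∫ f φ_i ≈ h f(x_i)

u = np.linalg.solve(A, b)
err = np.max(np.abs(u - np.sin(np.pi * x)))
print(f"max nodal error: {err:.2e}")   # O(h²) convergence
```

The assembled system coincides with the classical three-point finite-difference scheme here; the point is that it was derived purely from the bilinear form, never using second derivatives of u.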
6.5. Examples of Applications of Fourier Transforms to Partial Differential Equations

Example 6.5.1 (One Dimensional Diffusion Equation with No Sources or Sinks). Consider the initial value problem for the one dimensional diffusion equation with no sources or sinks:
$$ u_t = Ku_{xx}, \qquad -\infty < x < \infty,\ t > 0, \tag{6.5.1} $$
where K is a constant, with the initial data
$$ u(x, 0) = f(x). \tag{6.5.2} $$
This kind of problem can often be solved by the use of the Fourier transform
$$ \tilde u(k, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ikx}\,u(x, t)\,dx. $$
When the Fourier transform is applied to (6.5.1) and (6.5.2) we obtain
$$ \tilde u_t = -Kk^2\tilde u, \qquad \tilde u(k, 0) = \tilde f(k). $$
The solution of the transformed system is
$$ \tilde u(k, t) = \tilde f(k)\,e^{-Kk^2t}. \tag{6.5.3} $$
The inverse Fourier transform gives the solution
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \tilde f(k)\,e^{ikx - Kk^2t}\,dk, $$
which is, by the Convolution Theorem 4.11.7,
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)\,g(x - \xi)\,d\xi, \tag{6.5.4} $$
where
$$ g(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx - Ktk^2}\,dk = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\Big[-Kt\Big(k - \frac{ix}{2Kt}\Big)^2 - \frac{x^2}{4Kt}\Big]\,dk = \frac{1}{\sqrt{2Kt}}\exp\Big(-\frac{x^2}{4Kt}\Big). $$
Thus the solution (6.5.4) becomes
$$ u(x, t) = \frac{1}{\sqrt{4\pi Kt}}\int_{-\infty}^{\infty} f(\xi)\exp\Big[-\frac{(x-\xi)^2}{4Kt}\Big]\,d\xi. \tag{6.5.5} $$
The integrand involved in the integral solution consists of the initial data f(x) and the Green's function G(x, t):
$$ G(x, t) = \frac{1}{\sqrt{4\pi Kt}}\exp\Big[-\frac{(x-\xi)^2}{4Kt}\Big]. \tag{6.5.6} $$
Since
$$ \lim_{t\to 0^+}\frac{1}{\sqrt{4\pi Kt}}\exp\Big[-\frac{(x-\xi)^2}{4Kt}\Big] = \delta(x - \xi), \tag{6.5.7} $$
if we let t → 0⁺ the solution becomes
u(x, 0) = f(x).

Consider now the initial value problem
$$ u_t = u_{xx} + u_{yy}, \qquad -\infty < x, y < \infty,\ t > 0, \tag{6.5.8} $$
$$ u(x, y, 0) = f(x, y). \tag{6.5.9} $$
The function
$$ g(x, y, t) = \frac{1}{4\pi t}\exp\Big[-\frac{x^2 + y^2}{4t}\Big] \tag{6.5.10} $$
satisfies the equation (6.5.8). From this we can construct the formal solution
$$ u(x, y, t) = \frac{1}{4\pi t}\int_{\mathbb{R}^2} f(\xi, \eta)\exp\Big[-\frac{(x-\xi)^2 + (y-\eta)^2}{4t}\Big]\,d\xi\,d\eta. \tag{6.5.11} $$
Similarly, a formal solution of the initial value problem for the three dimensional diffusion equation
$$ u_t = u_{xx} + u_{yy} + u_{zz}, \qquad -\infty < x, y, z < \infty,\ t > 0, \tag{6.5.12} $$
$$ u(x, y, z, 0) = f(x, y, z), \tag{6.5.13} $$
is
$$ u(x, y, z, t) = \frac{1}{(4\pi t)^{3/2}}\int_{\mathbb{R}^3} f(\boldsymbol{\xi})\exp\Big[-\frac{|\mathbf{x} - \boldsymbol{\xi}|^2}{4t}\Big]\,d\boldsymbol{\xi}, \tag{6.5.14} $$
where x = (x, y, z) and ξ = (ξ, η, ζ).
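The Green's-function solution (6.5.5) is easy to check numerically. The snippet below (this editor's sketch; the Gaussian data f(x) = e^{−x²}, K = 1, and the grid are illustrative choices) evaluates the integral by quadrature and compares it with the closed-form solution for Gaussian data, u(x, t) = (1 + 4t)^{−1/2} e^{−x²/(1+4t)}.

```python
import numpy as np

# Heat equation u_t = K u_xx via the Green's function (6.5.5):
# u(x, t) = (4πKt)^(-1/2) ∫ f(ξ) exp(-(x-ξ)²/(4Kt)) dξ.
K, t = 1.0, 0.3
xi = np.linspace(-20.0, 20.0, 4001)        # quadrature grid for ξ
dx = xi[1] - xi[0]
f = np.exp(-xi**2)                          # initial data (editor's choice)

def u_green(x):
    kernel = np.exp(-(x - xi)**2 / (4*K*t)) / np.sqrt(4*np.pi*K*t)
    return np.sum(f * kernel) * dx          # simple quadrature of (6.5.5)

for x in (0.0, 0.5, 1.0):
    exact = np.exp(-x**2 / (1 + 4*t)) / np.sqrt(1 + 4*t)
    print(x, u_green(x), exact)             # the two values agree closely
```

The closed form follows because the convolution of two Gaussians is again a Gaussian whose "variances" add, which is exactly what the kernel (6.5.6) does to Gaussian initial data.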
Example 6.5.2 (One Dimensional Wave Equation). Obtain the d'Alembert solution of the Cauchy problem for the one dimensional wave equation
$$ u_{tt} = c^2u_{xx}, \qquad -\infty < x < \infty,\ t > 0, \tag{6.5.15} $$
$$ u(x, 0) = f(x), \qquad u_t(x, 0) = g(x). \tag{6.5.16} $$
We apply the joint Fourier and Laplace transforms defined by
$$ \bar{\tilde u}(k, s) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ikx}\,dx\int_0^{\infty} e^{-st}\,u(x, t)\,dt. \tag{6.5.17} $$
The transformed Cauchy problem has the solution in the form
$$ \bar{\tilde u}(k, s) = \frac{s\tilde f(k) + \tilde g(k)}{s^2 + c^2k^2}. \tag{6.5.18} $$
The joint inverse transformation gives the solution
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx}\,\mathscr{L}^{-1}\Big\{\frac{s\tilde f(k) + \tilde g(k)}{s^2 + c^2k^2}\Big\}\,dk, \tag{6.5.19} $$
where $\mathscr{L}^{-1}$ is the inverse Laplace transform operator. Finally, we obtain
$$ u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx}\Big[\tilde f(k)\cos ckt + \frac{\tilde g(k)}{ck}\sin ckt\Big]\,dk $$
$$ = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{1}{2}\,e^{ikx}\big[e^{ickt} + e^{-ickt}\big]\tilde f(k)\,dk + \frac{1}{\sqrt{2\pi}}\,\frac{1}{2ic}\int_{-\infty}^{\infty}\frac{1}{k}\,e^{ikx}\big[e^{ickt} - e^{-ickt}\big]\tilde g(k)\,dk $$
$$ = \frac{1}{2}\big[f(x+ct) + f(x-ct)\big] + \frac{1}{\sqrt{2\pi}}\,\frac{1}{2c}\int_{-\infty}^{\infty}\tilde g(k)\,dk\int_{x-ct}^{x+ct} e^{ik\xi}\,d\xi $$
$$ = \frac{1}{2}\big[f(x+ct) + f(x-ct)\big] + \frac{1}{2c}\int_{x-ct}^{x+ct} g(\xi)\,d\xi. \tag{6.5.20} $$
This is the classical d'Alembert solution. It can be shown, by direct substitution, that it is the unique solution of the wave equation provided f is twice continuously differentiable and g is once differentiable. This essentially proves the existence of the d'Alembert solution. It can also be shown, by direct substitution, that the solution (6.5.20) is uniquely determined by the initial data. It is important to point out that the solution u depends only on the initial values at points between x- ct and x + ct and not at all on initial values outside this interval on the line t = 0. This interval is called the domain of dependence of the variables (x, t). Moreover, the solution depends continuously on the initial data, that is, the problem is well posed.
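The d'Alembert formula (6.5.20) can be verified numerically by finite differences; the sketch below (this editor's check, with f, g, and c chosen for illustration) confirms the wave equation and both initial conditions.

```python
import numpy as np

# d'Alembert solution (6.5.20) with illustrative data:
# f(x) = exp(-x²), g(x) = cos x (whose antiderivative is sin x), c = 2.
c = 2.0
f = lambda x: np.exp(-x**2)
G = lambda x: np.sin(x)                  # antiderivative of g(x) = cos x

def u(x, t):
    return 0.5*(f(x + c*t) + f(x - c*t)) + (G(x + c*t) - G(x - c*t))/(2*c)

x, t, h = 0.7, 0.9, 1e-4
u_tt = (u(x, t+h) - 2*u(x, t) + u(x, t-h)) / h**2
u_xx = (u(x+h, t) - 2*u(x, t) + u(x-h, t)) / h**2
print(abs(u_tt - c**2 * u_xx))           # small: u_tt = c² u_xx holds
print(abs(u(x, 0.0) - f(x)))             # 0: initial displacement f(x)
u_t0 = (u(x, h) - u(x, -h)) / (2*h)
print(abs(u_t0 - np.cos(x)))             # small: initial velocity g(x)
```

Varying x and t in this check also illustrates the domain of dependence: u(x, t) only samples the data on [x − ct, x + ct].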
Example 6.5.3 (Laplace's Equation in a Half-Plane). We consider the Dirichlet problem consisting of the Laplace equation
$$ u_{xx} + u_{yy} = 0, \qquad -\infty < x < \infty,\ y > 0. \tag{6.5.27} $$
Similarly, we can solve the Dirichlet problem for the three dimensional Laplace equation in the half-space:
$$ u_{xx} + u_{yy} + u_{zz} = 0, \qquad -\infty < x, y < \infty,\ z > 0. $$

Example 6.5.4. The potential is given by
$$ \phi(x, y) = \frac{U}{2\pi}\int_{-\infty}^{\infty}\frac{\sin ka}{k\,|k|}\,e^{ikx - |k|y}\,dk. \tag{6.5.37} $$
Thus the velocity component in the y direction is given by
$$ v = -\phi_y = \frac{U}{2\pi}\,\mathrm{Re}\int_{-\infty}^{\infty}\frac{\sin ka}{k}\,e^{ikx - |k|y}\,dk = \frac{U}{2\pi}\int_{-\infty}^{\infty}\frac{\sin ka}{k}\cos kx\;e^{-|k|y}\,dk = \frac{U}{4\pi}\int_{-\infty}^{\infty}\big\{\sin k(x+a) - \sin k(x-a)\big\}\,\frac{e^{-|k|y}}{k}\,dk, \tag{6.5.38} $$
where Re stands for the real part. Using the result
$$ \int_0^{\infty}\frac{\sin ak}{k}\,e^{-ky}\,dk = \frac{\pi}{2} - \tan^{-1}\frac{y}{a}, $$
the above solution for v becomes
$$ v = \frac{U}{2\pi}\Big[\tan^{-1}\frac{y}{x-a} - \tan^{-1}\frac{y}{x+a}\Big]. \tag{6.5.39} $$
Similarly, for the x-component of the velocity we obtain
$$ u = -\phi_x = -\frac{iU}{2\pi}\int_{-\infty}^{\infty}\frac{\sin ka}{|k|}\,e^{ikx - |k|y}\,dk = \frac{U}{2\pi}\ln\frac{r_2}{r_1}, \tag{6.5.40} $$
where r₁² = (x − a)² + y² and r₂² = (x + a)² + y². Introducing a complex potential w = φ + iψ, we obtain
$$ \frac{dw}{dz} = \frac{\partial\phi}{\partial x} - i\frac{\partial\phi}{\partial y} = -u + iv, \tag{6.5.41} $$
which can be written, by (6.5.38)–(6.5.40), in the form
$$ \frac{dw}{dz} = \frac{U}{2\pi}\Big[\ln\frac{r_1}{r_2} + i(\theta_1 - \theta_2)\Big] = \frac{U}{2\pi}\ln\frac{z-a}{z+a}, \tag{6.5.42} $$
where tan θ₁ = y/(x − a) and tan θ₂ = y/(x + a). Integrating (6.5.42) with respect to z gives the complex potential
$$ w = \frac{U}{2\pi}\big[2a + (z-a)\ln(z-a) - (z+a)\ln(z+a)\big]. \tag{6.5.43} $$
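Equations (6.5.39), (6.5.40), and (6.5.43) can be cross-checked: differentiating the complex potential w(z) of (6.5.43) numerically should reproduce dw/dz = −u + iv of (6.5.41). The snippet below (this editor's check; U, a, and the sample point are illustrative) does exactly that.

```python
import cmath
import math

# Cross-check of (6.5.39)-(6.5.43): numerical dw/dz should equal -u + iv.
U, a = 3.0, 1.0

def w(z):
    return U/(2*math.pi) * (2*a + (z - a)*cmath.log(z - a)
                                - (z + a)*cmath.log(z + a))

z = 0.8 + 1.3j                       # a point in the upper half plane
h = 1e-6
dw = (w(z + h) - w(z - h)) / (2*h)   # centered numerical derivative

x, y = z.real, z.imag
r1 = math.hypot(x - a, y)
r2 = math.hypot(x + a, y)
u = U/(2*math.pi) * math.log(r2/r1)                                # (6.5.40)
v = U/(2*math.pi) * (math.atan2(y, x - a) - math.atan2(y, x + a))  # (6.5.39)
print(abs(dw - (-u + 1j*v)))          # small: the three formulas agree
```

`atan2` is used for θ₁ and θ₂ so that the angles stay in the correct quadrants for points with x < ±a, matching the principal branch of the complex logarithm in the upper half plane.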
Example 6.5.5 (The Navier–Stokes Equation). The Navier–Stokes equation in a viscous fluid of constant density ρ and constant viscosity ν with no external forces is
$$ \frac{D\mathbf{u}}{Dt} = -\frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u}, \tag{6.5.44} $$
where u = (u, v, w) is the local Eulerian fluid velocity at the point x = (x, y, z) and time t, p(x, t) is the pressure, and
$$ \frac{D}{Dt} = \frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla \tag{6.5.45} $$
is the total derivative following the motion, which consists of an unsteady term and a convective term. We next introduce the vorticity vector ω = (ξ, η, ζ) = curl u in rectangular Cartesian coordinates:
$$ \xi = \frac{\partial w}{\partial y} - \frac{\partial v}{\partial z}, \qquad \eta = \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}, \qquad \zeta = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}. \tag{6.5.46a,b,c} $$
Using the vector identity
$$ \mathbf{u}\times\operatorname{curl}\mathbf{u} = \tfrac{1}{2}\nabla(\mathbf{u}\cdot\mathbf{u}) - \mathbf{u}\cdot\nabla\mathbf{u} \tag{6.5.47} $$
with q² = u · u, Equation (6.5.44) assumes the form
$$ \frac{\partial\mathbf{u}}{\partial t} - \mathbf{u}\times\boldsymbol{\omega} = -\nabla\Big(\frac{p}{\rho} + \frac{1}{2}q^2\Big) + \nu\nabla^2\mathbf{u}. \tag{6.5.48} $$
Taking the curl of both sides of this equation, the pressure term disappears, and hence we get
$$ \frac{\partial\boldsymbol{\omega}}{\partial t} = \operatorname{curl}(\mathbf{u}\times\boldsymbol{\omega}) + \nu\nabla^2\boldsymbol{\omega}, \tag{6.5.49} $$
in which the continuity equations ∇ · u = 0 and ∇ · ω = 0 are used. Equation (6.5.49) can also be written in the form
$$ \frac{D\boldsymbol{\omega}}{Dt} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \nu\nabla^2\boldsymbol{\omega}. \tag{6.5.50} $$
This equation (or its equivalent form (6.5.49)) is called the vorticity transport equation, and represents the rate of change of the vorticity ω, which
is represented by the three terms on the right-hand side of (6.5.49). The first term, u · ∇ω, is the familiar rate of change due to convection of fluid, in which the vorticity is non-uniform, past a given point. The second term, ω · ∇u, describes the stretching of vortex lines, and the last term, ν∇²ω, represents the rate of change of ω due to molecular diffusion of vorticity, in exactly the way that ν∇²u represents the contribution to the acceleration from the diffusion of velocity (or momentum). In the case of two dimensional flow, ω is everywhere normal to the plane of the flow, and ω · ∇u = 0. Equation (6.5.50) then reduces to the scalar equation
$$ \frac{D\zeta}{Dt} = \nu\nabla^2\zeta, \tag{6.5.51} $$
so that only convection and viscous conduction occur. In terms of the stream function ψ, where u = ψ_y and v = −ψ_x (and ζ = −∇²ψ) satisfy the continuity condition identically, Equation (6.5.49) assumes the form
$$ \Big(\frac{\partial}{\partial t} + \frac{\partial\psi}{\partial y}\frac{\partial}{\partial x} - \frac{\partial\psi}{\partial x}\frac{\partial}{\partial y}\Big)\nabla^2\psi = \nu\nabla^4\psi. \tag{6.5.52} $$
In the steady state ∂/∂t = 0, and if the velocity of the fluid is very small and the viscosity is very large, all terms on the left hand side of (6.5.52) can be neglected in the first approximation. Consequently, (6.5.52) reduces to the biharmonic equation
$$ \nabla^4\psi = 0. \tag{6.5.53} $$
We solve this biharmonic equation for the viscous fluid bounded by the plane y = 0, with the fluid introduced through a strip |x| < a.
(5) Construct a test function φ such that φ(x) = 1 for |x| ≤ 1 and φ(x) = 0 for |x| ≥ 2.

(6) Which of the following expressions define a distribution?
(a) $(f, \phi) = \sum_{n=1}^{m}\phi^{(n)}(0)$;
(b) $(f, \phi) = \sum_{n=1}^{m}\phi(x_n)$, where $x_1, \ldots, x_m \in \mathbb{R}$ are fixed;
(c) $(f, \phi) = \sum_{n=1}^{\infty}\phi^{(n)}(0)$;
(d) $(f, \phi) = \sum_{n=1}^{\infty}\phi(x_n)$, where $x_1, x_2, \ldots \in \mathbb{R}$ are fixed;
(e) $(f, \phi) = \sum_{n=1}^{m}\phi^{(n)}(x_n)$, where $x_1, \ldots, x_m \in \mathbb{R}$ are fixed;
(f) $(f, \phi) = (\phi(0))^2$;
(g) $(f, \phi) = \sup\phi$;
(h) $(f, \phi) = \int_{-\infty}^{\infty}|\phi(t)|\,dt$;
(i) $(f, \phi) = \int_0^1 \phi(t)\,dt$;
(j) $(f, \phi) = \sum_{n=1}^{\infty}\phi(x_n)$, where $\lim_{n\to\infty}x_n = 0$.

(7) Let φ_n → φ and ψ_n → ψ in 𝒟. Prove the following:
(a) aφ_n + bψ_n → aφ + bψ for any scalars a, b;
(b) fφ_n → fφ for any smooth function f defined on ℝ^N;
(c) φ_n ∘ A → φ ∘ A for any affine transformation A of ℝ^N onto ℝ^N;
(d) D^α φ_n → D^α φ for any multi-index α.
is a distribution.
(9) Find the nth distributional derivative of f(x) = lxl( 10) Let fn (x) =sin
nx. Show that fn ~ 0 in the distributional sense.
( 11) Let Un} be the sequence of functions on R defined by 0, fn(x)= n, { 0,
if x 1j2n.
Show that the sequence converges to the Dirac delta distribution.
328
Applications
( 12) Show that the sequence of Gaussian functions on R defined by n = 1, 2, ... ,
converges to the Dirac delta distribution.
( 13) Show that the sequence of functions on R defined by r
Jn(X
)
sm nx =--,
n = 1, 2, ... ,
7TX
converges to the Dirac delta distribution.
Cro
( 14) Let ¢ 0 E 0)(R) be a fixed test function such that ¢ 0 (x) dx = 1. Show that every test function ¢ E 0)(R) can be represented in the form 4> = Kcf>o+ cP1, where K is a constant and ¢ 1 is a test function such that Moreover, the representation is unique.
Cro ¢
1
(x) dx = 0.
( 15) The fundamental solution of the one dimensional diffusion equation satisfies the equation G,- KGxx = 8(x- na(t- r),
-co<xO.
Show that
n J 2
G(x, t;
g,
H ( t - T) [ (Xr)= v'4K7T(t-r) exp --4K'----(t....:-c..:...r_) ·
Hence obtain the solution of the non-homogeneous equation
-co< x 0.
u,- Kuxx = f(x, t),
( 16) Find the fundamental solution for the one dimensional diffusion equation -co<xO.
( 17) Apply the joint Fourier and Laplace transforms to obtain the Green's function for the wave equation 0
11 -
2
c Gxx
= 8(x)8(t),
G(x, 0) = G,(x, 0) = 0.
-co<xO,
329
Generalized Functions and Partial Differential Equations
(18) (a) Show that the fundamental solution G(x, g, t) for the Cauchy problem G,(x, 0) = 8(x- g),
G(x, 0) = 0, is
1 G(x, g, t) = - [H(x- g+ ct)- H(x- g- ct)]. 2c
(b) Use this fundamental solution to solve a more general wave problem
u11
=
c2 uxx,
-co< x 0,
u(x,O)=O,
u,(x,O)=g(x).
(19) Prove the existence of the weak solution of the Dirichlet boundary value problem u =0
on
an,
where c is a positive function of x and y. Show that the weak solution is given by
L
2
v(-\l u+cu) dr= Lfvdr,
where u, v E Hb(n).
(20) Show that the Dirichlet problem for the biharmonic operator t/u = f inn, f E L\n),
au an
u=-=0
where
nc
on
an,
RN, has a weak solution u E H~(n) given by
L
t.u t.v dr =
L
fv dr
for every v E
H~(n).
(21) Show that the boundary value problem -t.u+u=f
in RN,
u-?-0
/EL2 (RN),
as lxl-?-co
1
has a unique solution u E H (RN) such that
r
JRN
\1 u . \1 v dr +
r
JRN
uv dr =
r
JRN
fv dr
330
Applications
(22) Let n c RN be a bounded open set. Consider the Robin boundary value problem
/E L\0),
inn,
-tw+u=f
au an
on
-+au=O
an,
a>O.
Show that there exists a unique solution u E H6(0) such that for every v E Hb(O),
a ( u, v) = (f, v)
where a ( u, v) =
Jnr V' u . V' v dr + Jnr uv dr + a
f
uv dr,
u,
VE
Hb(O).
an
(23) Use the Fourier transform method to show that the solution of the telegrapher's problem U11 + au 1 + bu =
C
2
-co< X, t 0, u(x,O)=f(x), -co<xtJ.i is the work done in this displacement, then
Qj 8% = =
L (X; 8x; + Y; 8y; + Z; 8z;) """' ax; Y;-+Z;ay; az;) t>qj· L ( X;-+ aqj aqj aqj
Substituting this result into (7.2.17) we obtain
~ (:~)- : : = Qj.
(7.2.18)
These are called Lagrange's equations of motion for a holonomic dynamical system. A great advantage of these equations for the solution of dynamical problems is that forces which do no work, such as reactions at smooth hinges, do not appear. However, these are to be included in the usual equations of motion. An important modification of (7.2.18) can be made for conservative dynamical systems. In such cases,
av
Qj=- a-' qj
(7.2.19)
where V is the potential energy of the system and is a function of the generalized coordinates q~> q 2 , ••• , q3 n and possibly of the time t. Since V does not involve the velocities qh a V1aqj = 0. Thus Lagrange's equation (7.2.18) can be written in terms of Las
~ (;~)- ;~ = 0,
j=1,2, ... ,3n.
(7.2.20)
If the kinetic energy of the system Tis a homogeneous quadratic function of velocities qh then
2T=
(7.2.21)
339
Mathematical Foundations of Quantum Mechanics
Assuming that V does not involve the time t explicitly, it follows from (7.2.20) that
.!!_ (T+ V) = .!!_ (2T- L) = L [.!!_(tv a~)- ija~tv aL] 1 dt
dt
=
dt
aqj
aqj
L. [ cij ~ (:~)- cij ;~] = o.
atJ.i
(7.2.22)
This shows that T+ V =constant for a conservative system. In order to discuss the so called Hamilton equations of motion, we introduce the concepts of generalized momentum pj and generalized force Fj by aL
(7.2.23a)
p=-
aqj
1
and aL aqj.
(7.2.23b)
F=1
Consequently, the Lagrange equations (7.2.20) become aL d aqj = dt Pj =Ph
(7.2.24)
where p; and qj are usually called conjugate variables. In Cartesian coordinates this result reduces to the familiar equation for aL aT
p =-=-=mx.
x, ax ax
r
(7.2.25)
Any equations that will hold in any coordinate system are of the most interest and useful for applications. To develop such equations of motion, we now introduce a new function, called the Hamiltonian (or Hamilton's function) H, which is defined in terms of the Lagrangian Land two conjugate variables pj and tJ.i by H(p, q) =
LPAj- L,
(7.2.26)
j
where L= L(q;, 4;, t) is, in general, a function of q;, cj,, t and qj enters through the kinetic energy as a quadratic term. Equation (7.2.23a) will give P; as a linear function of cir This system of linear equations involving p, and cj, can be solved to determine ci; in terms of p,, and then the cj,s can in
340
Applications
principle be eliminated from (7.2.26). This essentially means that H can always be expressed as a function of Ph qj and t so that H = H(ph qh t). Thus """' aH """' aH aH dt. dH = L - dp + L - dq +apj 1 aqj 1 at
(7.2.27)
On the other hand, differentiation of H in (7.2.26) with respect to t gives
or dH = """' L p1 dq1 +
aL dq - """' aL aL L ri dp - LL - . dq - - dt. a% aqj at 1
"'l}
1
(7.2.29)
'
In view of (7.2.23a), this equation becomes dH
aL aL ="""' q· dp- """'- dn, - - dt. L L aqj at 1
1
(7.2.30)
"'l}
Evidently, the two expressions of dH given in (7.2.27) and (7.2.30) must be equal so that the coefficients of the corresponding differentials can be equated to obtain
(J;=-
aH apj,
(7.2.3la)
aL aH --aqj a tV '
(7.2.3lb)
aL aH --at at
(7.2.3lc)
Using the Lagrange equations (7.2.20) and (7.2.23a), the first two of the above equations become, for each j, dqj_ aH dt- apj' dpj
aH
dt=- aq/
(7.2.32a)
j = 1, 2, ... , 3 n.
(7.2.32b)
These are known as Hamilton's canonical equations of motion. They constitute a set of6n coupled first order equations that reflects the symmetry except for a negative sign. Thus the Hamilton equations are completely equivalent to Lagrange's equation (7.2.13) which represent a set of 3n
Mathematical Foundations of Quantum Mechanics
341
coupled second order differential equations. These equations possess a unique solution if the initial data are prescribed at some time t = t0 • In other words, Hamilton's equations completely determine the position and momentum at all times provided the initial data are given. This shows that the fundamental laws of classical mechanics are completely deterministic. The Lagrange-Hamilton theory can be employed to deduce (i) the law of conservation of energy, and (ii) that H is equal to the total energy. To derive (i), we assume that L, and therefore H, in (7.2.26) do not involve the time t explicitly. Consequently,
"""' ( pq+pn--n,--_ .. . . aL . aL q.. ) =L 1 1 1 "'l] aqj "'l] aqj 1 =
L (pA+ PAJ-PAJ-PA)=O.
This shows that H is constant. To prove the second property, we assume that the coordinate transformations (7.2.14) do not depend explicitly on time t. We note that Tin L = T-V is given by (7.2.16) where the coefficients ajk are symmetric functions of the generalized coordinates (];· On the other hand, Vis, in general, independent of qj and hence aL aT Pj = - . = - . = aqJ aqJ
3
L aJkqk. k~i "
Thus the Hamiltonian H becomes
=2T-L= T+ V Thus H is equal to the total energy. Since H was proved to be a constant, the sum of the kinetic and potential energies is constant. This is the celebrated law of conservation of energy. We consider a conservative holonomic system which is described by the generalized coordinates q 1 , q2 , ••• , qn. For any complete set of specified initial conditions each of the coordinates q; must be a single-valued function of time t. Thus we assume that these functions are known and have the form q 1 =q 1 (t),q 1 =q 2 (t), ... ,qn=q,.(t). Then these equations may be regarded as the parametric equations of a path in n-dimensional Euclidean
342
Applications
space and the motion of the system can be related to that of a point which moves along this path. Given the initial state, the motion in subsequent times is uniquely determined by Newton's laws of motion. Therefore, there exists a unique path in the n-dimensional Euclidean space for a given set of initial data. It is of interest to compare this path with another one in the n-dimensional space, the two paths having the same end-points and such that they are traversed in the same time r. Assuming that at any instant of time the difference between the positions of two points which trace out the two paths is infinitesimally small, we denote. the variation between the two paths at any instant by 8. Both paths have the same end-points so that 8q 1 (0) = 8q 1 ( r) = · · · = t>qn(O) = t>qn( r) = 0. However the variations 8q 1 (t), 8q 2 (t), ... , t>qn(t) do not vanish at any timet between 0 and r. Then it follows that
Thus
=
[L
-aL t)q.J •
aqj
J
,~T
-
[L
" 'aL . t)q.J
aqj
Jt=o
=
0'
since both the paths have the same end-points. Evidently
8[
(7.2.33)
Ldt=O.
This result is well known as Hamilton's variational principle provided the Lagrangian L= L(CJJ, qj). This principle was obtained from Newton's law of motion. Conversely, it can be shown that Newton's laws can be derived from Hamilton's principle if L = L( qj, q1 ). Clearly t>L=
[a~ &j1 +aL t>q,J. aq aq 1
1
343
Mathematical Foundations of Quantum Mechanics
By Hamilton's principle
aL ) dt 0=8 TLdt= IT 8Ldt=L IT (aL -. 8q1 +-8{jj o o a% a% Io
J [
aL aL 8{jj ]T ' d (aL) ='L T[ - -. 8{jj+-8qj dt+ 'L-. dt aqj aCJJ 8q1 o Io where the last result is obtained by integration by parts. Since 8q1 = 0 at t = 0 and t = r, the last term vanishes and the above expression gives
LIT [aL _.!!._(a~)] 8% dt = o o aqj dt aq; for all 8% and all r. Thus the integrand must vanish, which yields Lagrange's equation
:t (:~)-:~
=0.
(7.2.34)
Hence Newton's laws can be derived from these equations. Hamilton's principle shows that motion according to Newton's laws is distinguished from all other kinds of motion by having the property that the integral f L dt for any given time interval has a stationary value. Hence it is regarded as a fundamental principle of classical mechanics from which everything else can be derived.
Poisson's Brackets in Mechanics The equations of motion for any canonical function F(p;, q;, t) can be expressed, using Hamilton's equations (7.2.32ab ), as
f f (aFaq; aH _ aF aH) + aF ap; ap; aq; at
dF = (aF q;+ aF ]\) + aF dt i~l aq; ap; at =
;~ 1
aF ={F,H}+at,
(7.2.35)
where {F, H} is called the Poisson bracket of two functions F and H If the canonical function F does not explicitly depend on time t, then aFjat=O so that (7.2.35) becomes
dF dt={F, H}.
(7.2.36)
344
Applications
In addition, if {F, H} = 0, then F is a constant of the motion. In fact, (7.2.35) really includes the Hamilton equations which can be verified by setting F= p;, F= q; or F= H. It readily follows the definition of the Poisson bracket that (7.2.37a)
{q;,pJ= aij, {q;, qj}={p;,pj}=O,
(7.2.37bc)
where t>ij is the Kronecker delta notation. These are the fundamental Poisson brackets for the canonically conjugate variables p; and q;. Any relation involving Poisson's brackets must be invariant under a canonical transformation. This is often used as an alternative definition of a canonical transformation. It can also be verified that the components of the angular momentum L=rxp satisfy i,j, k
=
x, y, z in cyclic order
(7.2.38)
and (7.2.39) It also follows from the definition of the Poisson bracket that the derivative of a canonical function with respect to generalized coordinates qj is equal to the Poisson bracket of that function with the canonically conjugate momentum pj, that is,
(7.2.40) In particular, we obtain aF ax= {F, Px},
(7.2.41a)
aF ay ={F,py},
(7.2.41 b)
aF az ={F,pz},
(7.2.41c)
or equivalently, F(x + dx, y, z)
=
F(x, y, z) +{F, Px} dx,
(7.2.42a)
F(x, y + dy, z)
=
F(x, y, z) + {F, p,.} dy,
(7.2.42b)
F(x, y, z+ dz)
=
F(x, y, z) +{F, pj dz.
(7.2.42c)
345
Mathematical Foundations of Quantum Mechanics
Thus the canonical momenta Px, pY, Pz are called the generators of infinitesimal translations along the x, y, and z directions, respectively. In general, a mechanical description of a physical system requires the concepts of (i) variables or observables, (ii) states (iii) equations of motion. Physically measurable quantities are called observables. In classical mechanics, examples of variables or observables are position, momentum, angular momentum and energy which are the characteristics of a physical system. They can be measured experimentally and are represented by dynamical variables which are well defined functions of two canonically conjugate variables (generalized coordinates and generalized momenta). So the observables in classical mechanics are completely deterministic. There are states which describe values of the observables at given times. The state of a physical system at a time t = t0 > 0 is uniquely determined by the appropriate physical law and the initial state at t = 0. For example, the state of a system of n interacting particles is determined by assigning 3 n position coordinates and 3n velocity coordinates. Finally, there are equations of motion which determine how the values of the observables change in time. As mentioned above, Newton's equations, Lagrange's equations or Hamilton's equations are well known examples of equations of motion.
7.3. Basic Concepts and Postulates of Quantum Mechanics Classical physics breaks down at the levels of atoms and molecules. Historically, the first indication of a breakdown of classical ideas occurred in the rather complex phenomenon of the so called "black body radiation" which essentially deals with electromagnetic radiation in a container in equilibrium with its surroundings. In other words, the black body radiation is concerned with the thermodynamics of the exchange of energy between radiation and matter. According to principles of classical physics, this exchange of energy is assumed to be continuous in the sense that light of frequency v can give up any amount of energy on absorption, the exact amount in any particular case depending on the energy intensity of the light beam. Specifically, in 1900 Max Planck first postulated that the vibrating particles of matter are regarded to act as harmonic oscillators, and do not emit or absorb light continuously but instead only in discrete quantities. Mathematically, the radiation of frequency v can only exchange energy with matter in units of hv, where h is the Planck constant of numerical value h =21Th= 6.625 x 10- 27 erg sec= 4.14 x 10- 21
MeV sec
(7 .3.1)
346
Applications
and li is called the universal constant. Clearly, h has dimension (energy x time) of action which is a dynamical quantity in classical mechanics. Equivalently, Planck's quantum postulate can be stated by saying that radiation of frequency v behaves like a stream of photons of energy E = hv = liw,
(7.3.2)
which may be emitted or absorbed by matter where w = 21rv is the angular frequency. Clearly, Planck's constant h measures the degree of discreteness which was required to explain the energy distribution of the black body radiation. Thus the concept of discreteness is fundamental in quantum mechanics, but it is totally unacceptable in classical physics. Finally, it is important to point out that the Planck equation (7.3.2) is fairly general so that it can be applied to any quantum system as a fundamental relation between its energy E and the frequency v of an oscillation associated with the system. Also the failure of classical concepts when applied to the motion of electrons appeared most clearly in connection with the hydrogen atom. According to the Rutherford model, an atom can be considered as a negatively charged electron orbiting around a relatively massive, positively charged nucleus. With the neglect of radiation, this system is exactly similar to the motion of a planet round the sun, with gravitational attraction between the masses being replaced by the Coulomb attraction between the charges. The potential energy of the Coulomb attraction between the fixed nucleus charge +Ze and the electron of charge-e is V(r) = -Ze 2 jr. The hydrogen atom consists of two particles-the nucleus, a proton of mass mP and charge + e ( Z = 1), and an electron of mass me and charge -e. The nucleus is small and heavy (mp/me~2000) and the radius of the proton ~10- 3 times the atomic radius. According to the classical atomic theory of Rutherford, the attractive potential would cause the electron to orbit around the nucleus, and the orbiting electron constitutes a rapidly accelerating charge, which according to Maxwell's theory acts as a source of radiant energy. Thus the accelerated charged electron would continuously radiate energy, and in a matter of 10- 10 sec the electron should coalesce with the nucleus, causing the atom to collapse. 
On the other hand, the frequency of the emitted radiation is related to that of the electron in its orbit. As the electron radiates energy, this frequency, according to classical theory, must change rapidly but continuously, thus giving rise to radiation with a continuous range of frequencies. Thus the Rutherford atomic model has two important qualitative weaknesses: (i) The atom should be very unstable.
Mathematical Foundations of Quantum Mechanics
(ii) It should radiate energy over a continuous range of frequencies. Both of these results are totally contradicted by experiment. The original problem of quantum mechanics was to investigate the stability of atoms and molecules, and to explain the discrete frequency spectra of the radiation emitted by excited atoms. The remarkable success in predicting observed atomic and molecular spectra is one of the major triumphs of quantum mechanics. In this chapter, we present the basic principles of quantum mechanics as postulates which will then be used to discuss various consequences. No attempt will be made to derive or justify these postulates. Both the number and content of the basic postulates are to some extent a matter of individual choice. The postulates together with their consequences form a basic but limited theory of quantum mechanics. It has already been mentioned in previous sections that classical mechanics identifies the state of a physical system with the values of certain observables (for example, the position $x$ and the momentum $p$) of the system. On the other hand, quantum mechanics makes a very clear distinction between states and observables. So we begin with the first postulate concerning the state of a quantum system.

Postulate I (The State Vector). To every physical system in quantum mechanics there corresponds a separable Hilbert space over the complex number field. A state of the system is represented by a non-zero vector in this space, and every non-zero scalar multiple of a state vector represents the same state. Conversely, every non-zero vector in the Hilbert space represents a possible physical state of the system. The particular state vector to which the state of the system corresponds at time $t$ is denoted by $\Psi(x, t)$ and is called the time-dependent state vector of the system.
The state of a physical system is completely described by this state vector $\Psi(x, t)$ in the sense that almost all information about the system at time $t$ can be obtained from the vector $\Psi(x, t)$. Usually, a state vector is denoted by $\psi(x)$. In the Dirac notation, any general state vector $\psi(x)$ is written as
$$\psi(x) = \langle x \mid \psi\rangle \tag{7.3.3}$$
and its complex conjugate as
$$\overline{\psi(x)} = \langle \psi \mid x\rangle. \tag{7.3.4}$$
Applications
This postulate makes several assertions. First, all physical properties of the system are unchanged if the state vector is multiplied by a non-zero scalar. We can remove this arbitrariness by imposing the normalizing condition
$$\int |\psi(x)|^2\, dx = 1, \tag{7.3.5}$$
where the integral is taken over all admissible values of $x$, or equivalently
$$\int \langle\psi\mid x\rangle\langle x\mid\psi\rangle\, dx = \int |\psi(x)|^2\, dx = 1.$$

… where $\phi$ is the phase of the wave. This can also be expressed in the form (7.5.2), where
$$\psi(\mathbf{r}) = A(\mathbf{r})\exp[-i\phi(\mathbf{r})]. \tag{7.5.3}$$
According to the Planck equation ($E = h\nu$), if a quantum system has energy $E$, its state vector $|\Psi(\mathbf{r}, t)\rangle$ at time $t$ should contain a factor $\exp[-2\pi i\nu t] = \exp[-(iE/\hbar)t]$, so that
$$|\Psi(\mathbf{r}, t)\rangle = e^{-iEt/\hbar}\,|\Psi(\mathbf{r}, 0)\rangle. \tag{7.5.4}$$
Clearly, energy is an observable, and hence, for a system to have a definite energy $E$, it must be in an eigenstate of this observable. If this is the case, Equation (7.5.4) expresses the fact that the state vector at time $t$ differs from that at $t = 0$ only by a scalar factor, and so it describes the same physical state. For this reason an eigenstate of energy is called a stationary state of the system. The following postulate of quantum mechanics is concerned with the existence of the energy operator and the time development of a quantum system:

Postulate VI (a) (Hamiltonian Operator). For every physical system there exists a linear Hermitian operator $\hat{H}$, the so-called Hamiltonian operator, which represents the observable corresponding to the total energy of the system.

(b) (Schrödinger's Equation). If a physical system is not disturbed by any experiment, the Hamiltonian operator $\hat{H}$ determines the time development
of the state vector $\Psi(\mathbf{r}, t)$ of the system through the partial differential equation
$$i\hbar\,\frac{\partial \Psi}{\partial t} = \hat{H}\Psi(\mathbf{r}, t). \tag{7.5.5}$$
This is called the time-dependent Schrödinger equation, and represents the fundamental equation of motion in quantum mechanics. Its ultimate justification is that it leads to predictions which are in remarkable agreement with experimental findings. As with the equations of motion in classical mechanics, Equation (7.5.5) is completely deterministic in the sense that, given the state $\Psi(\mathbf{r}, t_0)$ at some time $t = t_0$, Equation (7.5.5) uniquely determines the state $\Psi(\mathbf{r}, t)$ at any other time $t$. The Hamiltonian operator $\hat{H}$ corresponds to the force in classical mechanics because the total energy includes the potential energy which gives the force in classical mechanics. So $\hat{H}$ is equivalent to the force field. For a single particle of mass $m$ moving in space, the classical state is described by the position and momentum vectors $(\mathbf{r}, \mathbf{p})$. If the particle is under the action of a force $\mathbf{F}(\mathbf{r})$ which is derived from a potential $V(\mathbf{r})$ so that $\mathbf{F} = -\nabla V$, the Hamiltonian function $H$ is
$$H(\mathbf{r}, \mathbf{p}) = T + V = \frac{p^2}{2m} + V(\mathbf{r}). \tag{7.5.6}$$
The Hamiltonian operator $\hat{H}$ corresponding to this classical function $H$ in quantum mechanics is obtained by replacing $\mathbf{r}$ and $\mathbf{p}$ with the operators $\hat{\mathbf{r}}$ and $\hat{\mathbf{p}} = -i\hbar\nabla$, respectively. Consequently, the Schrödinger equation (7.5.5) assumes the form
$$i\hbar\,\frac{\partial \Psi}{\partial t} = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\Psi = \hat{H}\Psi, \tag{7.5.7}$$
where
$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}) \tag{7.5.8}$$
and $\Psi(\mathbf{r}, t)$ belongs to the Hilbert space $L^2(\mathbb{R}^3)$. This postulate has three main consequences. First, it asserts that (7.5.4) can be derived as a solution of the Schrödinger equation (7.5.5). We assume that $\hat{H}$ has a purely discrete spectrum of eigenvalues, so that it has a complete set of eigenstates $|\psi_n(\mathbf{r})\rangle$ with corresponding eigenvalues $E_n$. Then $|\Psi(\mathbf{r}, t)\rangle$ can be expanded in terms of the complete eigenstates as
$$|\Psi(\mathbf{r}, t)\rangle = \sum_n a_n(t)\,|\psi_n(\mathbf{r})\rangle. \tag{7.5.9}$$
Substituting this into (7.5.5) and equating the coefficients of $|\psi_n\rangle$ yields
$$i\hbar\,\dot{a}_n(t) = E_n a_n(t). \tag{7.5.10}$$
Hence the solution of this equation is
$$a_n(t) = a_n(0)\,e^{-iE_n t/\hbar}, \tag{7.5.11}$$
where $a_n(0)$ is the initial value of $a_n(t)$ at $t = 0$. Thus the time-dependent solution is
$$|\Psi(\mathbf{r}, t)\rangle = \sum_n a_n(0)\,e^{-iE_n t/\hbar}\,|\psi_n(\mathbf{r})\rangle. \tag{7.5.12}$$
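The expansion (7.5.9)–(7.5.12) can be illustrated numerically in a finite-dimensional analogue. The following Python sketch (with $\hbar = 1$ and an arbitrary illustrative $2\times 2$ Hermitian matrix standing in for $\hat{H}$, neither of which comes from the text) expands an initial state in the eigenstates of $H$ and attaches the phase factors of (7.5.11):

```python
import numpy as np

# Sketch (hypothetical finite-dimensional analogue, hbar = 1): expand the
# initial state in the eigenstates of H and attach exp(-i*E_n*t) phases,
# as in (7.5.11)-(7.5.12).
def evolve(H, psi0, t):
    E, V = np.linalg.eigh(H)          # eigenvalues E_n, orthonormal eigenvectors
    a0 = V.conj().T @ psi0            # coefficients a_n(0) = <psi_n | Psi(0)>
    return V @ (np.exp(-1j * E * t) * a0)

H = np.array([[1.0, 0.5], [0.5, 2.0]])       # illustrative Hermitian matrix
psi0 = np.array([1.0, 0.0], dtype=complex)   # normalized initial state
psi_t = evolve(H, psi0, t=0.7)
```

Since the evolution only multiplies each coefficient by a unit-modulus phase, the norm of the state vector is preserved, consistent with the normalization (7.3.5).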
The second consequence of (7.5.5) is the de Broglie wave for free particles of definite momentum $p = \hbar k$, which is required to explain electron diffraction. The one-dimensional de Broglie wave is
$$\langle x \mid p\rangle = a\,e^{ikx} = a\,e^{ipx/\hbar}, \tag{7.5.13}$$
where $a$ is a constant. It is normalized by using the orthonormality condition for continuous eigenvalues, $\langle \alpha \mid \alpha'\rangle = \delta(\alpha - \alpha')$, where the simplest representation of the Dirac delta function is
$$\delta(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\alpha x}\, dx.$$
… where $\hat{H}^{(1)}(t)$ represents the time-dependent term that arises from the presence of an external field. In the absence of the latter term, the time-evolution operator is obtained from (7.6.11) in the form
$$\hat{U}_0(t, t_0) = \exp\left[-\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]. \tag{7.8.2}$$
Both the state vector $\Psi_I(t)$ and the operator $\hat{A}_I(t)$ depend on time $t$ and are defined by
$$\Psi_I(t) = \hat{U}_0^{-1}(t, t_0)\Psi_S(t) = \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\Psi_S(t), \tag{7.8.3}$$
$$\hat{A}_I(t) = \hat{U}_0^{-1}(t, t_0)\hat{A}_S\hat{U}_0(t, t_0) = \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\hat{A}_S\exp\left[-\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right], \tag{7.8.4}$$
where $\Psi_S(t)$ is the state vector and $\hat{A}_S$ is the operator in the Schrödinger picture, so that
$$i\hbar\,\frac{\partial \Psi_S}{\partial t} = \hat{H}(t)\Psi_S(t) = \left[\hat{H}^{(0)} + \hat{H}^{(1)}(t)\right]\Psi_S(t), \tag{7.8.5}$$
$$\frac{d\hat{A}_S}{dt} = \frac{\partial \hat{A}_S}{\partial t}. \tag{7.8.6}$$
It follows from (7.8.3) and (7.8.5) that
$$\begin{aligned}
i\hbar\,\frac{\partial \Psi_I}{\partial t} &= i\hbar\,\frac{\partial}{\partial t}\left\{\exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\Psi_S(t)\right\} \\
&= -\hat{H}^{(0)}\Psi_I + \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\left(i\hbar\,\frac{\partial \Psi_S}{\partial t}\right) \\
&= -\hat{H}^{(0)}\Psi_I + \exp\left[\frac{i}{\hbar}(t - t_0)\hat{H}^{(0)}\right]\left[\hat{H}^{(0)} + \hat{H}^{(1)}(t)\right]\Psi_S(t) \\
&= -\hat{H}^{(0)}\Psi_I + \hat{H}^{(0)}\Psi_I + \hat{U}_0^{-1}(t, t_0)\hat{H}^{(1)}(t)\hat{U}_0(t, t_0)\Psi_I(t) \\
&= \hat{H}_I^{(1)}(t)\Psi_I(t),
\end{aligned} \tag{7.8.7}$$
where
$$\hat{H}_I^{(1)}(t) = \hat{U}_0^{-1}(t, t_0)\hat{H}^{(1)}(t)\hat{U}_0(t, t_0). \tag{7.8.8}$$
On the other hand, it also follows from (7.8.4) and (7.8.6) that
$$\frac{d\hat{A}_I}{dt} = \frac{\partial \hat{A}_I}{\partial t} + \frac{1}{i\hbar}\left[\hat{A}_I, \hat{H}^{(0)}\right], \tag{7.8.9}$$
where
$$\hat{H}_I^{(0)} = \hat{U}_0^{-1}(t, t_0)\hat{H}^{(0)}\hat{U}_0(t, t_0) = \hat{H}^{(0)}. \tag{7.8.10}$$
These results show that the state vector $\Psi_I(t)$ in the interaction picture satisfies the Schrödinger equation (7.8.7) with the Hamiltonian $\hat{H}_I^{(1)}$, while the operator $\hat{A}_I(t)$ obeys the Heisenberg equation with the time-independent Hamiltonian $\hat{H}^{(0)}$.
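Because (7.8.8) is a unitary conjugation of $\hat{H}^{(1)}$ by $\hat{U}_0(t, t_0)$, the interaction-picture Hamiltonian stays Hermitian and keeps the same eigenvalues. A Python sketch (a hypothetical two-level system with $\hbar = 1$; the matrices are illustrative choices, not from the text):

```python
import numpy as np

# Sketch (hbar = 1): H0 is taken diagonal with eigenvalues E0, so that
# U0(t, t0) = exp[-i (t - t0) H0] is an elementwise exponential.
def U0(t, t0, E0):
    return np.diag(np.exp(-1j * (t - t0) * E0))

E0 = np.array([0.0, 1.0])                 # illustrative eigenvalues of H^(0)
H1 = np.array([[0.0, 0.3], [0.3, 0.0]])   # illustrative external-field term H^(1)
U = U0(t=2.0, t0=0.0, E0=E0)
H1_I = U.conj().T @ H1 @ U                # H_I^(1)(t), Eq. (7.8.8)
```

The conjugated matrix `H1_I` is again Hermitian with the same spectrum as `H1`, as unitary equivalence requires.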
7.9. The Linear Harmonic Oscillator

According to classical mechanics, a harmonic oscillator is a particle of mass $m$ moving under the action of a force $F = -m\omega^2 x$. The equation of motion is then
$$\ddot{x} + \omega^2 x = 0. \tag{7.9.1}$$
The solution of this equation with the initial conditions $x(0) = a$, $\dot{x}(0) = 0$ is
$$x = a\cos\omega t. \tag{7.9.2}$$
This represents an oscillatory motion of angular frequency $\omega$ and amplitude $a$. The potential is related to the force by $F = -\partial V/\partial x$, so that $V(x) = \frac{1}{2}m\omega^2 x^2$. The energy of the oscillatory motion is the potential energy when the particle is at the extreme position. Therefore the energy is
$$E = \tfrac{1}{2}m\omega^2 a^2. \tag{7.9.3}$$
Since the amplitude $a$ can have any non-negative value, the energy $E$ can have any value greater than or equal to zero. In other words, the energy forms a continuous spectrum. We next consider the quantum theory of such a system. The total energy of the system is represented by the Hamiltonian operator
$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2}m\omega^2\hat{x}^2. \tag{7.9.4}$$
It is convenient to introduce two operators $\hat{a}$ and $\hat{a}^*$ by
$$\hat{a} = \frac{1}{\sqrt{2m}}\left(\hat{p} - im\omega\hat{x}\right), \tag{7.9.5}$$
$$\hat{a}^* = \frac{1}{\sqrt{2m}}\left(\hat{p} + im\omega\hat{x}\right). \tag{7.9.6}$$
Since $\hat{x}$ and $\hat{p}$ are Hermitian operators, it follows that
$$\langle \hat{a}\psi_1, \psi_2\rangle = \langle \psi_1, \hat{a}^*\psi_2\rangle \neq \langle \psi_1, \hat{a}\psi_2\rangle$$
for any two wave functions $\psi_1$ and $\psi_2$. Thus the operators $\hat{a}$ and $\hat{a}^*$ are not Hermitian and hence they do not represent physical observables. However, $\hat{a}\hat{a}^*$ and $\hat{a}^*\hat{a}$ are Hermitian operators, because they can be represented as real functions of $\hat{H}$:
$$\hat{a}\hat{a}^* = \frac{\hat{p}^2}{2m} + \frac{m\omega^2}{2}\hat{x}^2 - \frac{i\omega}{2}[\hat{x}, \hat{p}] = \hat{H} + \frac{1}{2}\hbar\omega,$$
$$\hat{a}^*\hat{a} = \frac{\hat{p}^2}{2m} + \frac{m\omega^2}{2}\hat{x}^2 + \frac{i\omega}{2}[\hat{x}, \hat{p}] = \hat{H} - \frac{1}{2}\hbar\omega,$$
and hence $\hat{H}$ can be written in terms of $\hat{a}$ and $\hat{a}^*$ as
$$\hat{H} = \hat{a}^*\hat{a} + \tfrac{1}{2}\hbar\omega = \hat{a}\hat{a}^* - \tfrac{1}{2}\hbar\omega, \tag{7.9.7ab}$$
so that
$$[\hat{a}, \hat{a}^*] = \hbar\omega. \tag{7.9.8}$$
The eigenstate of energy $E_n$ is $|E_n\rangle$, and
$$\hat{H}|E_n\rangle = E_n|E_n\rangle. \tag{7.9.9}$$
Using (7.9.7ab), we rewrite (7.9.9) either as
$$\hat{a}^*\hat{a}\,|E_n\rangle = \left(E_n - \tfrac{1}{2}\hbar\omega\right)|E_n\rangle \tag{7.9.10}$$
or as
$$\hat{a}\hat{a}^*\,|E_n\rangle = \left(E_n + \tfrac{1}{2}\hbar\omega\right)|E_n\rangle. \tag{7.9.11}$$
Multiplying (7.9.10) by $\hat{a}$, we obtain
$$\hat{a}\hat{a}^*\,\hat{a}|E_n\rangle = \left(E_n - \tfrac{1}{2}\hbar\omega\right)\hat{a}|E_n\rangle. \tag{7.9.12}$$
Then either
$$\hat{a}|E_n\rangle = 0 \tag{7.9.13}$$
or, say,
$$\hat{a}|E_n\rangle = c\,|E_{n-1}\rangle \tag{7.9.14}$$
for some non-zero constant $c$. This result is used to rewrite (7.9.12) as
$$\hat{a}\hat{a}^*\,|E_{n-1}\rangle = \left(E_n - \hbar\omega + \tfrac{1}{2}\hbar\omega\right)|E_{n-1}\rangle. \tag{7.9.15}$$
This is identical with (7.9.11) for $E_{n-1}$, provided
$$E_{n-1} = E_n - \hbar\omega. \tag{7.9.16}$$
Thus, given any eigenvector $|E_n\rangle$, it is possible to generate a new eigenvector $|E_{n-1}\rangle$ by (7.9.14), unless $|E_n\rangle$ is the lowest state $|E_0\rangle$, in which case (7.9.13) is satisfied. It then follows from (7.9.10) for $n = 0$ that
$$E_0 = \tfrac{1}{2}\hbar\omega. \tag{7.9.17}$$
This determines the lowest (or ground) state energy. Clearly, it follows from (7.9.14) that $\hat{a}$ is the operator which annihilates energy in the system in quantum units of $\hbar\omega$, and it is called the annihilation operator. Similarly, multiplication of (7.9.11) by $\hat{a}^*$ gives
$$\hat{a}^*\hat{a}\,\hat{a}^*|E_n\rangle = \left(E_n + \tfrac{1}{2}\hbar\omega\right)\hat{a}^*|E_n\rangle. \tag{7.9.18}$$
Then either
$$\hat{a}^*|E_n\rangle = 0 \tag{7.9.19}$$
or, say,
$$\hat{a}^*|E_n\rangle = c'\,|E_{n+1}\rangle. \tag{7.9.20}$$
This result is used to rewrite (7.9.18) as
$$\hat{a}^*\hat{a}\,|E_{n+1}\rangle = \left(E_n + \hbar\omega - \tfrac{1}{2}\hbar\omega\right)|E_{n+1}\rangle. \tag{7.9.21}$$
This is (7.9.10) for $E_{n+1}$, provided
$$E_{n+1} = E_n + \hbar\omega. \tag{7.9.22}$$
It follows that, given any eigenstate $|E_n\rangle$, it is also possible to generate a new eigenvector $|E_{n+1}\rangle$ by (7.9.20), with the eigenvalue given by (7.9.22), unless $E_n$ is the highest energy level, in which case (7.9.19) is satisfied. But the potential is an increasing function of $|x|$ and hence there is no highest level; the creation of higher energy levels is always possible. Thus the operator $\hat{a}^*$ generates energy in the system in quantum units of $\hbar\omega$, and it is called the creation operator. It then follows from (7.9.17) and (7.9.22) that the general energy level is
$$E_n = \left(n + \tfrac{1}{2}\right)\hbar\omega, \qquad n = 0, 1, 2, \ldots. \tag{7.9.23}$$
This obviously represents a discrete set of energies. Thus, in quantum mechanics, a stationary state of the harmonic oscillator can assume only one of the values from the set $E_n$. The energy is thus quantized, and forms a discrete spectrum. According to classical mechanics, the energy forms a continuous spectrum; that is, all non-negative numbers are allowed for the energy of a simple harmonic oscillator. This shows a remarkable contrast between the results of the classical and quantum theories. The non-negative integer $n$ which characterizes the energy eigenvalues (and hence eigenfunctions) is called the quantum number. The value $n = 0$ corresponds to the minimum value of the quantum number, with energy
$$E_0 = \tfrac{1}{2}\hbar\omega. \tag{7.9.24}$$
This is called the lowest (or ground) state energy, which never vanishes, as the lowest possible classical energy would. The ground state energy $E_0$ is proportional to $\hbar$, representing a quantum phenomenon. The discrete energy spectrum is in perfect agreement with the quantization rules of the quantum theory. To determine the energy eigenfunctions $\psi_n$ belonging to $E_n$, it is convenient to write the annihilation and creation operators as $\hat{A} = \hat{a}/\sqrt{\hbar\omega}$ and $\hat{A}^* = \hat{a}^*/\sqrt{\hbar\omega}$ and replace $\hat{p}$ by $-i\hbar(\partial/\partial x)$, so that
$$\hat{A} = \frac{1}{\sqrt{2}}\left[\left(\frac{\hbar}{m\omega}\right)^{1/2}\frac{\partial}{\partial x} + \left(\frac{m\omega}{\hbar}\right)^{1/2}\hat{x}\right] = \frac{1}{\sqrt{2}}\left(\frac{\partial}{\partial \eta} + \eta\right), \tag{7.9.25}$$
$$\hat{A}^* = \frac{1}{\sqrt{2}}\left[-\left(\frac{\hbar}{m\omega}\right)^{1/2}\frac{\partial}{\partial x} + \left(\frac{m\omega}{\hbar}\right)^{1/2}\hat{x}\right] = \frac{1}{\sqrt{2}}\left(-\frac{\partial}{\partial \eta} + \eta\right), \tag{7.9.26}$$
where $\eta = (m\omega/\hbar)^{1/2}x$. Consequently,
$$\hat{A}\hat{A}^* = \frac{\hat{H}}{\hbar\omega} + \frac{1}{2}, \tag{7.9.27a}$$
$$\hat{A}^*\hat{A} = \frac{\hat{H}}{\hbar\omega} - \frac{1}{2}. \tag{7.9.27b}$$
Since $\psi_0$ is the eigenfunction corresponding to the lowest energy $E_0$,
$$\hat{A}\psi_0 = 0 \quad\text{or}\quad \left(\frac{\partial}{\partial \eta} + \eta\right)\psi_0 = 0. \tag{7.9.28}$$
Its normalized solution can be written as
$$\psi_0(\eta) = \pi^{-1/4}\,e^{-\eta^2/2}. \tag{7.9.29}$$
All other eigenfunctions $\psi_n$ can be calculated from $\psi_0$ by successive applications of the creation operator $\hat{A}^*$, and thus $\psi_n$ is proportional to $(\hat{A}^*)^n\psi_0$. We also note that
$$\hat{A}^*\psi_n = (n + 1)^{1/2}\,\psi_{n+1}, \tag{7.9.30}$$
so that if $\psi_n$ is normalized, so is $\psi_{n+1} = (n + 1)^{-1/2}\hat{A}^*\psi_n$. Thus, it turns out that
$$\psi_n = (n!)^{-1/2}(\hat{A}^*)^n\psi_0 = (2^n n!)^{-1/2}\,\pi^{-1/4}\left(-\frac{d}{d\eta} + \eta\right)^n \exp\left(-\frac{\eta^2}{2}\right). \tag{7.9.31}$$
This result can be simplified by using the operator identities
$$\left(-\frac{d}{d\eta} + \eta\right) = -e^{\eta^2/2}\,\frac{d}{d\eta}\,e^{-\eta^2/2}, \tag{7.9.32a}$$
$$\left(-\frac{d}{d\eta} + \eta\right)^n = (-1)^n\,e^{\eta^2/2}\,\frac{d^n}{d\eta^n}\,e^{-\eta^2/2}, \tag{7.9.32b}$$
so that the final form of $\psi_n$ is
$$\psi_n(\eta) = \left(2^n n!\sqrt{\pi}\right)^{-1/2} e^{-\eta^2/2}\,H_n(\eta), \tag{7.9.33}$$
$$H_n(\eta) = (-1)^n\,e^{\eta^2}\,\frac{d^n}{d\eta^n}\,e^{-\eta^2}, \qquad n = 0, 1, 2, \ldots, \tag{7.9.34}$$
where (7.9.34) defines $H_n(\eta)$, the Hermite polynomial of degree $n$.
Example 7.9.1 (The Schrödinger Equation Treatment of Planck's Simple Harmonic Oscillator). The quantum mechanical motion of the Planck oscillator is described by the one-dimensional Schrödinger equation
$$\frac{d^2\psi}{dx^2} + \frac{2M}{\hbar^2}\left(E - \frac{1}{2}M\omega^2 x^2\right)\psi = 0. \tag{7.9.35}$$
In terms of the constants
$$\beta = \frac{2ME}{\hbar^2}, \tag{7.9.36a}$$
$$\alpha = \frac{M\omega}{\hbar} > 0 \tag{7.9.36b}$$
and an independent variable $x' = x\sqrt{\alpha}$, Equation (7.9.35) becomes, dropping the prime,
$$\frac{d^2\psi}{dx^2} + \left(\frac{\beta}{\alpha} - x^2\right)\psi = 0. \tag{7.9.37}$$
The eigenfunctions of this equation are the Hermite orthogonal functions
$$\psi_n(x) = e^{-x^2/2}\,H_n(x) \tag{7.9.38}$$
with the corresponding eigenvalues
$$\frac{\beta}{\alpha} = 2n + 1, \tag{7.9.39}$$
where $H_n(x)$ is the Hermite polynomial of degree $n$. Substituting the values of $\alpha$ and $\beta$, it turns out that
$$E = E_n = \left(\frac{2n + 1}{2}\right)\omega\hbar, \qquad n = 0, 1, 2, \ldots. \tag{7.9.40}$$
Thus the energy quanta characteristic of the oscillator are the so-called half-integral multiples, that is, the odd multiples of $\tfrac{1}{2}\omega\hbar$. This result is remarkably the same as in the Heisenberg theory. In view of the properties of the Hermite polynomials
$$H_0(x) = 1, \qquad H_1(x) = 2x, \qquad H_2(x) = 4x^2 - 2,$$
it follows that the first eigenfunction $\psi_0(x)$ represents a Gaussian distribution curve, and the second eigenfunction $\psi_1(x)$ vanishes at the origin and corresponds to a Maxwellian distribution curve for positive $x$, continued towards negative values of $x$ so that it is an odd function of $x$. The third eigenfunction $\psi_2(x)$ is negative at the origin and has two symmetric zeros $\pm 1/\sqrt{2}$, and so on. Thus the geometrical shape of these eigenfunctions can easily be determined. It is also important to note that the roots of successive polynomials separate one another.
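The separation (interlacing) of the roots of successive Hermite polynomials can be verified numerically; a Python sketch using NumPy's Hermite module (physicists' convention, matching (7.9.34)):

```python
import numpy as np
from numpy.polynomial.hermite import hermroots

# Sketch: H_n has n real roots, and the roots of successive Hermite
# polynomials separate one another.
def roots_of_H(n):
    c = np.zeros(n + 1)
    c[n] = 1.0                        # coefficients of H_n in the Hermite basis
    return np.sort(hermroots(c))

r2, r3 = roots_of_H(2), roots_of_H(3)
# H_2(x) = 4x^2 - 2 has zeros at +-1/sqrt(2), each lying between
# consecutive zeros of H_3(x) = 8x^3 - 12x.
```

Each root of $H_2$ indeed lies strictly between consecutive roots of $H_3$, as the text asserts.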
7.10. Angular Momentum Operators

The orbital angular momentum operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$ have already been introduced in Section 7.3. It has been shown that they obey the commutation relations (7.3.23). Using the spherical polar coordinates $(r, \theta, \phi)$, which are related to the rectangular Cartesian coordinates $(x, y, z)$ by
$$x = r\sin\theta\cos\phi, \tag{7.10.1a}$$
$$y = r\sin\theta\sin\phi, \tag{7.10.1b}$$
$$z = r\cos\theta, \tag{7.10.1c}$$
combined with the chain rule for differentiation
$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial x}\frac{\partial}{\partial \theta} + \frac{\partial \phi}{\partial x}\frac{\partial}{\partial \phi}$$
and similar results for $\partial/\partial y$ and $\partial/\partial z$, the angular momentum operators can be expressed in angular variables:
$$\hat{L}_x = i\hbar\left(\sin\phi\,\frac{\partial}{\partial \theta} + \cot\theta\cos\phi\,\frac{\partial}{\partial \phi}\right), \tag{7.10.2a}$$
$$\hat{L}_y = i\hbar\left(-\cos\phi\,\frac{\partial}{\partial \theta} + \cot\theta\sin\phi\,\frac{\partial}{\partial \phi}\right), \tag{7.10.2b}$$
$$\hat{L}_z = -i\hbar\,\frac{\partial}{\partial \phi}, \tag{7.10.2c}$$
$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\,\frac{\partial}{\partial \theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial \phi^2}\right]. \tag{7.10.3}$$
From (7.10.2c) and (7.10.3) it is easy to check that
$$[\hat{L}^2, \hat{L}_z] = 0. \tag{7.10.4}$$
It also follows from (7.10.2c) and (7.5.70) that
$$\hat{L}_z\,Y_l^m(\theta, \phi) = (\hbar m)\,Y_l^m(\theta, \phi). \tag{7.10.5}$$
For any given value of $l$, the possible eigenvalues of the $z$-component of the angular momentum, $\hat{L}_z$, are
$$m = 0, \pm 1, \pm 2, \ldots, \pm l, \tag{7.10.6}$$
giving $(2l + 1)$ admissible values.
On the other hand, it can easily be checked with the aid of (7.10.3), (7.5.60) and (7.5.70) with $\lambda = l(l + 1)$ that
$$\hat{L}^2\,Y_l^m(\theta, \phi) = \left[\hbar^2 l(l + 1)\right]Y_l^m(\theta, \phi), \tag{7.10.7}$$
where $|m| \le l$, $l = 0, 1, 2, \ldots$. This shows that the eigenvalues of $\hat{L}^2$ are
$$\hbar^2 l(l + 1), \qquad l = 0, 1, 2, 3, \ldots. \tag{7.10.8}$$
Evidently, the spherical harmonics $Y_l^m(\theta, \phi)$ are the simultaneous eigenfunctions of $\hat{L}^2$ and $\hat{L}_z$. The eigenvalues of the total angular momentum $\hat{L}^2$ are $\hbar^2 l(l + 1)$, $l = 0, 1, 2, \ldots$, and those of $\hat{L}_z$ are $m\hbar$, $m = 0, \pm 1, \ldots, \pm l$. Thus a measurement of $\hat{L}^2$ can yield as its result only the values $0, 2\hbar^2, 6\hbar^2, 12\hbar^2, \ldots$. The total angular momentum states with $l$ values $0, 1, 2, 3, 4, \ldots$ are known, for historical reasons, as $S, P, D, F, G, \ldots$ states, respectively. Similarly, the measured values of $\hat{L}_z$ are only $0, \pm\hbar, \pm 2\hbar, \ldots$. Hence both $\hat{L}^2$ and $\hat{L}_z$ are quantized and can upon measurement only reveal one of the specified discrete values. It is convenient to define two operators $\hat{L}_+$ and $\hat{L}_-$ by
$$\hat{L}_+ = \hat{L}_x + i\hat{L}_y, \tag{7.10.9a}$$
$$\hat{L}_- = \hat{L}_x - i\hat{L}_y. \tag{7.10.9b}$$
Theorem 7.10.1. (a) $\hat{L}_+$ and $\hat{L}_-$ are non-Hermitian operators; (b) $\hat{L}_+\hat{L}_-$ and $\hat{L}_-\hat{L}_+$ are Hermitian.

Proof. Since $\hat{L}_x$ and $\hat{L}_y$ are Hermitian,
$$\langle \hat{L}_+\psi_1, \psi_2\rangle = \langle \psi_1, \hat{L}_-\psi_2\rangle \neq \langle \psi_1, \hat{L}_+\psi_2\rangle$$
for any two wave functions $\psi_1$ and $\psi_2$. Thus $\hat{L}_+$ and $\hat{L}_-$ are not Hermitian operators, and hence they do not represent observables. On the other hand,
$$\hat{L}_+\hat{L}_- = (\hat{L}_x + i\hat{L}_y)(\hat{L}_x - i\hat{L}_y) = \hat{L}_x^2 + \hat{L}_y^2 - i[\hat{L}_x, \hat{L}_y] = \hat{L}_x^2 + \hat{L}_y^2 + \hbar\hat{L}_z = \hat{L}^2 - \hat{L}_z(\hat{L}_z - \hbar). \tag{7.10.10}$$
Similarly,
$$\hat{L}_-\hat{L}_+ = \hat{L}^2 - \hat{L}_z(\hat{L}_z + \hbar). \tag{7.10.11}$$
Thus both $\hat{L}_+\hat{L}_-$ and $\hat{L}_-\hat{L}_+$ are expressed as real functions of $\hat{L}^2$ and $\hat{L}_z$. Hence they are Hermitian operators. This completes the proof.
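The identities (7.10.10)–(7.10.11) can be verified on the standard matrix representation of the angular momentum operators on a fixed $l$-eigenspace. A Python sketch (with $\hbar = 1$; the matrix elements used are the standard ones, stated here as an assumption rather than derived in the text):

```python
import numpy as np

# Sketch (hbar = 1): basis |l, m>, m = l, l-1, ..., -l, with the standard
# matrix element <l, m+1| L+ |l, m> = sqrt(l(l+1) - m(m+1)).
def ladder_matrices(l):
    m = np.arange(l, -l - 1, -1.0)
    Lz = np.diag(m)
    off = np.sqrt(l * (l + 1) - m[1:] * (m[1:] + 1))
    Lp = np.diag(off, k=1)            # raising operator L+
    return Lz, Lp, Lp.conj().T        # L- is the adjoint of L+

l = 2
dim = int(2 * l + 1)
Lz, Lp, Lm = ladder_matrices(l)
L2 = l * (l + 1) * np.eye(dim)        # L^2 restricted to this eigenspace
```

Both products $\hat{L}_+\hat{L}_-$ and $\hat{L}_-\hat{L}_+$ then agree entrywise with the right-hand sides of (7.10.10) and (7.10.11).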
Since the orbital angular momentum can only take on integer values, this result indicates the necessity for some generalization of this formalism. It is necessary to introduce matrix operators of size $n \times n$ defined by …

… Let $B_1 = \mathbb{R}^N$ and $B_2 = \mathbb{R}$, and let $x = (x_1, \ldots, x_N) \in B_1$ and $h = (h_1, \ldots, h_N) \in B_1$. If $f$ has continuous partial derivatives of order one, then the Gateaux differential of $f$ is
$$df(x, h) = \sum_{k=1}^{N} \frac{\partial f(x)}{\partial x_k}\,h_k. \tag{8.2.2}$$
For a fixed $x_0 \in B_1$, the Gateaux derivative at $x_0$,
$$f'(x_0), \tag{8.2.3}$$
is a bounded linear operator from $\mathbb{R}^N$ into $\mathbb{R}$. We can also write
$$f'(x_0)h = \left(\frac{\partial f(x_0)}{\partial x_1}, \ldots, \frac{\partial f(x_0)}{\partial x_N}\right)\cdot h,$$
where the coefficient vector is the gradient of $f$ at $x_0$, denoted by $\nabla f(x_0)$.

Example 8.2.2.
Let $B_1 = \mathbb{R}^N$ and $B_2 = \mathbb{R}^M$. Let $f = (f_1, \ldots, f_M): \mathbb{R}^N \to \mathbb{R}^M$ be Gateaux differentiable at some $x \in \mathbb{R}^N$. The Gateaux derivative $A$ can be identified with an $M \times N$ matrix $(a_{ij})$. If $h$ is the $j$th coordinate vector, $h = e_j = (0, \ldots, 1, \ldots, 0)$, then
$$\lim_{t\to 0} \frac{f(x + th) - f(x)}{t} = A(h)$$
implies
$$\lim_{t\to 0} \frac{f_i(x + te_j) - f_i(x)}{t} = a_{ij}$$
for every $i = 1, \ldots, M$ and $j = 1, \ldots, N$. This shows that the $f_i$'s have partial derivatives at $x$ and
$$\frac{\partial f_i(x)}{\partial x_j} = a_{ij}$$
for every $i = 1, \ldots, M$ and $j = 1, \ldots, N$. The Gateaux derivative of $f$ at $x$ has the matrix representation
$$f'(x) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_N} \\ \vdots & & \vdots \\ \dfrac{\partial f_M}{\partial x_1} & \cdots & \dfrac{\partial f_M}{\partial x_N} \end{pmatrix}. \tag{8.2.4}$$
This is called the Jacobian matrix of $f$ at $x$. Note that if $M = 1$, then the matrix reduces to a row vector, which is the case discussed in Example 8.2.1.

Example 8.2.3. Let $B = \mathscr{C}([a, b])$ be the normed space of real-valued continuous functions on $[a, b]$ with the norm defined by
$$\|x\| = \sup_{t\in[a, b]} |x(t)|.$$
Let $K(s, t)$ be a continuous real-valued function defined on $[a, b] \times [a, b]$, and let $g(t, x)$ be a continuous real-valued function on $[a, b] \times \mathbb{R}$ with continuous partial derivative $\partial g/\partial x$ on $[a, b] \times \mathbb{R}$. Define a mapping $f: B \to B$ by
$$f(x)(s) = \int_a^b K(s, t)\,g(t, x(t))\, dt. \tag{8.2.5}$$
Then
$$df(x, h) = \left[\frac{d}{d\alpha}\int_a^b K(s, t)\,g(t, x(t) + \alpha h(t))\, dt\right]_{\alpha = 0}.$$

Optimization Problems and Other Applications

Interchange of the order of differentiation and integration is permissible under the given assumption on $g$, and hence it follows that
$$df(x, h) = \int_a^b K(s, t)\left[\frac{\partial}{\partial x}\,g(t, x(t))\right]h(t)\, dt. \tag{8.2.6}$$
Thus, the Gateaux derivative of the integral operator (8.2.5) is the linear integral operator (8.2.6), and its kernel is $K(s, t)\,g_x(t, x)$.

Remark. The Gateaux differential is a generalization of the idea of the directional derivative familiar in finite dimensional spaces.
Theorem 8.2.2 (Mean Value Theorem). Suppose the functional $f$ has a Gateaux derivative $df(x, h)$ at every point $x \in B$. Then, for any two points $x, x + h \in B$, there exists a constant $\xi \in (0, 1)$ such that
$$f(x + h) - f(x) = df(x + \xi h, h). \tag{8.2.7}$$

Proof. Put $\varphi(t) = f(x + th)$. Then
$$\varphi'(t) = \lim_{s\to 0}\left[\frac{\varphi(t + s) - \varphi(t)}{s}\right] = \lim_{s\to 0}\left[\frac{f(x + th + sh) - f(x + th)}{s}\right] = df(x + th, h).$$
Application of the mean value theorem for functions of one variable to $\varphi$ yields
$$\varphi(1) - \varphi(0) = \varphi'(\xi)$$
for some $\xi \in (0, 1)$. Consequently,
$$f(x + h) - f(x) = df(x + \xi h, h).$$
This proves the theorem.
f
of a real variable is defined by
. f(x+h)-f(x) ' f (x)= 1tm , h~o h
(8.2.8)
provided the limit exists. This definition cannot be used in the case of mappings defined on a Banach space because h is then a vector, and division by a vector is meaningless. On the other hand, the division by a vector can be easily avoided by rewriting (8.2.8) as f(x +h) =f(x) + f'(x)h
+ hw(h ),
(8.2.9)
416
Applications
where w is a function (which depends on h) such that w( h)---?> 0 as h---?> 0. Equivalently, we can now say that f'(x) is the derivative off at x if
f(x +h)- f(x) = f'(x)h +(h ),
(8.2.10)
where ( h) = hw( h), and thus ( h)/ h---?> 0 as h---?> 0. The definition based on (8.2.10) can be generalized to include mappings from a Banach space into a Banach space. This leads to the concept of the Frechet differentiability and Frechet derivative.
Definition 8.2.3 (Fréchet Derivative). Let $x$ be a fixed point in a Banach space $B_1$. A continuous linear operator $A: B_1 \to B_2$ is called the Fréchet derivative of the operator $T: B_1 \to B_2$ at $x$ if
$$T(x + h) - T(x) = Ah + \varphi(x, h) \tag{8.2.11}$$
provided
$$\lim_{\|h\|\to 0}\frac{\|\varphi(x, h)\|}{\|h\|} = 0 \tag{8.2.12}$$
or, equivalently,
$$\lim_{\|h\|\to 0}\frac{\|T(x + h) - T(x) - Ah\|}{\|h\|} = 0. \tag{8.2.13}$$
The Fréchet derivative at $x$ will be denoted by $T'(x)$ or $dT(x)$. In the case of a real-valued function $f: \mathbb{R} \to \mathbb{R}$, the ordinary derivative at $x$ is a number representing the slope of the graph of the function at $x$. The Fréchet derivative of $f$ is not a number, but a linear operator from $\mathbb{R}$ into $\mathbb{R}$. The existence of the ordinary derivative $f'(x)$ implies the existence of the Fréchet derivative at $x$, and the comparison of (8.2.9) and (8.2.11) shows that $A$ is the operator which multiplies every $h \in \mathbb{R}$ by the number $f'(x)$. In elementary calculus, the tangent to a curve is the straight line giving the best approximation of the curve in the neighborhood of the point of tangency. Similarly, the Fréchet derivative of an operator $f$ can be interpreted as its best local linear approximation. We consider the change in $f$ when its argument changes from $x$ to $x + h$, and then approximate this change by a linear operator $A$ so that
$$f(x + h) = f(x) + Ah + e, \tag{8.2.14}$$
where $e$ is the error in the linear approximation. In general, $e$ has the same order of magnitude as $h$, except when $A$ is equal to the Fréchet derivative of $f$. In that case $e = o(h)$, so that $e$ is much smaller than $h$ as $h \to 0$. In this sense, the Fréchet derivative gives the best linear approximation of $f$ near $x$. Finally, if $A$ is a linear operator, then the derivative of $A$ is $A$ itself, and the best linear approximation of $A$ is $A$ itself.

Theorem 8.2.3. If a mapping has the Fréchet derivative at a point, then it has the Gateaux derivative at that point and both derivatives are equal.
Proof. Let $f: B_1 \to B_2$, and let $x \in B_1$. If $f$ has the Fréchet derivative at $x$, then
$$\lim_{\|h\|\to 0}\frac{\|f(x + h) - f(x) - Ah\|}{\|h\|} = 0$$
for some continuous linear operator $A: B_1 \to B_2$. In particular, for any fixed non-zero $h \in B_1$, we have
$$\lim_{t\to 0}\left\|\frac{f(x + th) - f(x)}{t} - Ah\right\| = \lim_{t\to 0}\frac{\|f(x + th) - f(x) - A(th)\|}{\|th\|}\,\|h\| = 0.$$
Thus, $A$ is the Gateaux derivative of $f$ at $x$.

Corollary 8.2.1.
If the Fréchet derivative exists, it is unique.

Proof. Suppose $A_1$ and $A_2$ are Fréchet derivatives of $f$ at some $x \in B_1$. Then $A_1$ and $A_2$ are the Gateaux derivatives of $f$ at $x$. Thus, $A_1 = A_2$, by Theorem 8.2.1.
r
K(x, t)f(t, u(t)) dt,
where K: [a, b] x [a, b]---?> Rand f: [a, b] x R-?> Rare given functions. Iff is sufficiently smooth, then T(u+h)(x)=
r
K(x, t)[f(t, u)+hfu(t, u)+th 2fuu(t, u)+· · ·] dt
= ( Tu) ( x) + Ah + o ( h ) , where the Frechet derivative A= T'(u) is T'( u)(h) =
tb
K (x, t)fu ( t, u( t))h( t) dt.
Thus, the Frechet derivative of T at u is the linear integral operator with the kernel K(x, t)fu(t, u(t)).
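The defining property (8.2.13), that the remainder shrinks faster than $\|h\|$, can be observed numerically for this operator. A Python sketch (the kernel $K(x, t) = e^{-xt}$ and $f(t, u) = \sin u$ are illustrative choices, and the integral is discretized on a grid):

```python
import numpy as np

# Sketch: for (Tu)(x) = ∫ K(x,t) f(t, u(t)) dt with the illustrative choices
# K(x, t) = exp(-x*t) and f(t, u) = sin(u), the claimed Frechet derivative
# has kernel K(x, t)*cos(u(t)); the remainder should shrink faster than ||h||.
ts = np.linspace(0.0, 1.0, 401)
dt = ts[1] - ts[0]
K = np.exp(-np.outer(ts, ts))         # K(x_i, t_j) on the grid

def T(u):
    return K @ (np.sin(u) * dt)

def dT(u, h):
    return K @ (np.cos(u) * h * dt)   # linear integral operator, kernel K*f_u

u = ts**2
h = np.cos(3 * ts)
ratios = []
for scale in (1e-2, 1e-3, 1e-4):
    hh = scale * h
    rem = T(u + hh) - T(u) - dT(u, hh)
    ratios.append(np.max(np.abs(rem)) / np.max(np.abs(hh)))
```

The ratio $\|{\rm remainder}\|/\|h\|$ decreases roughly in proportion to $\|h\|$, reflecting the quadratic term $\tfrac{1}{2}h^2 f_{uu}$ of the expansion above.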
Theorem 8.2.4. If an operator defined on an open subset of a Banach space has the Fréchet derivative at a point, then it is continuous at that point.

Proof. Let $\Omega$ be an open set in a Banach space $B_1$, and let $T$ be an operator from $\Omega$ into a Banach space $B_2$. Let $x \in \Omega$ and let $\varepsilon > 0$ be such that $x + h \in \Omega$ whenever $\|h\| < \varepsilon$. Then
$$\|T(x + h) - T(x)\| = \|Ah + \varphi(x, h)\| \to 0$$
as $\|h\| \to 0$. This proves that $T$ is continuous at $x$.
Much of the theory, results and methods of ordinary calculus can be easily generalized to Fréchet derivatives. For example, the usual rules for differentiation of the sum and product (in the case of functionals) of two or more functions apply to Fréchet derivatives. The mean value theorem, the implicit function theorem and Taylor series have satisfactory extensions. The interested reader is referred to Liusternik and Sobolev (1974). In the next theorem, we prove the chain rule for Fréchet derivatives.

Theorem 8.2.5 (Chain Rule). Let $B_1, B_2, B_3$ be real Banach spaces. If $g: B_1 \to B_2$ is Fréchet differentiable at some $x \in B_1$ and $f: B_2 \to B_3$ is Fréchet differentiable at $y = g(x) \in B_2$, then $\Phi = f \circ g$ is Fréchet differentiable at $x$ and
$$\Phi'(x) = f'(g(x))\,g'(x).$$

Proof. For $x, h \in B_1$, we have
$$\Phi(x + h) - \Phi(x) = f(g(x + h)) - f(g(x)) = f(g(x + h) - g(x) + g(x)) - f(y) = f(d + y) - f(y),$$
where $d = g(x + h) - g(x)$. Thus,
$$\|\Phi(x + h) - \Phi(x) - f'(y)d\| = o(\|d\|).$$
In view of $\|d - g'(x)h\| = o(\|h\|)$, we obtain
$$\|\Phi(x + h) - \Phi(x) - f'(y)g'(x)h\| = o(\|h\|) + o(\|d\|).$$
Since $g$ is continuous at $x$, by Theorem 8.2.4, we have $\|d\| \to 0$ as $\|h\| \to 0$ …

… $(\alpha, \beta > -1)$, Legendre polynomials $P_n(x)$ ($\alpha = \beta = 0$, $w(x) = 1$), and Chebyshev polynomials $T_n(x)$ ($\alpha = \beta = -\tfrac{1}{2}$, $w(x) = (1 - x^2)^{-1/2}$). Other orthogonal polynomials are also of interest and can be obtained from the Chebyshev polynomials $T_n(x)$, which satisfy the recurrence relation
$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)$$
with $T_0(x) = 1$ and $T_1(x) = x$. It follows from $T_n(x) = \cos n\theta$ ($n = 0, 1, 2, \ldots$), where $x = \cos\theta$, $0 \le \theta \le \pi$, that $T_n'(x) = n\sin n\theta/\sin\theta$. We then define the new polynomials $U_n(x)$ of degree at most $n$ by
$$U_n(x) = \frac{\sin(n + 1)\theta}{\sin\theta}, \qquad n = 0, 1, 2, \ldots, \tag{8.7.22}$$
where $x = \cos\theta$. These are called the Chebyshev polynomials of the second kind. It is easy to check that the polynomials (8.7.22) are orthogonal with respect to $w(x) = (1 - x^2)^{1/2}$ and hence are constant multiples of the Jacobi polynomials $P_n^{(1/2,\,1/2)}(x)$. Using L'Hôpital's rule, it follows that $U_n(1) = n + 1$, and then
$$P_n^{(1/2,\,1/2)}(1) = \frac{1\cdot 3\cdot 5\cdots(2n + 1)}{2^n(n + 1)!}\,U_n(1).$$
There are many identities connecting $T_n(x)$ and $U_n(x)$. Some of them are given as exercises.
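The trigonometric characterization (8.7.22) can be checked against the three-term recurrence. A Python sketch (it is a standard identity, assumed here rather than proved in the text, that the second-kind polynomials satisfy the same recurrence as $T_n$, with $U_0(x) = 1$ and $U_1(x) = 2x$):

```python
import numpy as np

# Sketch: verify U_n(cos t) = sin((n+1)t)/sin(t), Eq. (8.7.22), using the
# recurrence U_{n+1}(x) = 2x U_n(x) - U_{n-1}(x), U_0 = 1, U_1 = 2x
# (a standard identity, stated here as an assumption).
def U(n, x):
    u_prev, u = np.ones_like(x), 2.0 * x
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u = u, 2.0 * x * u - u_prev
    return u

theta = np.linspace(0.1, 3.0, 50)
x = np.cos(theta)
err = np.max(np.abs(U(4, x) - np.sin(5 * theta) / np.sin(theta)))
```

The value $U_n(1) = n + 1$ obtained above by L'Hôpital's rule also falls out of the recurrence at $x = 1$.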
8.8. Linear and Nonlinear Stability

We consider linear and nonlinear problems of stability and instability for differential systems. In dynamical systems, the state at any time $t$ can be represented by an element of a Banach (or Hilbert) space $E$. Suppose that the dynamics of a physical system are governed by the evolution equation
$$\frac{du}{dt} = F(\lambda, u, t), \tag{8.8.1}$$
where $\lambda \in \Lambda$ is a parameter, $\Lambda$ is a set of parameters (for instance $\Lambda = \mathbb{R}$), $u$ is a function of a real variable $t$ with values in $E$, and $F$ is a mapping from $\Lambda \times E \times \mathbb{R}$ into $E$.

Definition 8.8.1 (Autonomous Dynamical System). The dynamical system governed by (8.8.1) is called autonomous if the function $F$ does not depend explicitly upon $t$. For autonomous systems, (8.8.1) can be written in the form $du/dt = F(\lambda, u)$.

Definition 8.8.2 (Equilibrium Solution). If $F(\lambda_0, u_0) = 0$ for some $\lambda = \lambda_0$ and $u = u_0$, then $u_0$ is called an equilibrium solution.
Definition 8.8.3 (Stable, Unstable and Asymptotically Stable Solutions). Let $u_0$ be an equilibrium solution of Equation (8.8.1).
(a) $u_0$ is said to be stable if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $\|u(t) - u_0\| < \varepsilon$ for all $t > 0$ and all solutions $u(t)$ of (8.8.1) such that $\|u(0) - u_0\| < \delta$.
(b) $u_0$ is called unstable if it is not stable.
(c) $u_0$ is called asymptotically stable if it is stable and $\|u(t) - u_0\| \to 0$ as $t \to \infty$.

Example 8.8.1. Consider the scalar equation $\dot{x} = 0$. Every solution of this equation has the form $x = c$, where $c$ is a constant. Thus, every solution is stable but not asymptotically stable.

Example 8.8.2. Consider the system $du/dt = \lambda u$, $u(0) = u_0$, where $u(t)$ is real for each $t$ and $\lambda \in \mathbb{R}$. This equation has the equilibrium solution $u_0(t) = 0$. The general solution is $u(t) = u_0 e^{\lambda t}$. If $\lambda \le 0$, then the zero solution is stable. If $\lambda > 0$, the solution is unstable because $u(t) \to \infty$ as $t \to \infty$, no matter how small $u_0$ is.
Example 8.8.3. Consider the equation $\dot{x} = x^2$ with $x(0) = x_0$. The solution of this equation is obtained by separating the variables, and has the form
$$x(t) = \frac{x_0}{1 - x_0 t}.$$
The solution is not defined at $t = 1/x_0$; for $x_0 > 0$ it blows up in finite time. Thus $x(t) \equiv 0$ is a solution which is unstable.

Example 8.8.4.
Consider a linear autonomous system
$$\dot{u} = Lu + v, \tag{8.8.2}$$
where $u(t) \in E$ for each $t$, $L: E \to E$ is a linear operator which does not depend on $t$, and $v$ is a given element of $E$. Clearly, $u_0 \in E$ is an equilibrium solution of (8.8.2) if $Lu_0 = -v$. We suppose the solution of (8.8.2) is of the form $u(t) = u_0 + e^{\lambda t}w$, where $\lambda$ is a constant and $w \in E$. Clearly, $u(t)$ satisfies (8.8.2) provided
$$Lw = \lambda w.$$
This means that $\lambda$ is an eigenvalue of $L$ with eigenvector $w$. If the eigenvalue $\lambda$ has a positive real part and $w$ is a normalized eigenvector, then for any $\varepsilon > 0$, the function $u(t) = u_0 + \varepsilon w\,e^{\lambda t}$ is a solution of (8.8.2) such that $\|u(0) - u_0\| = \varepsilon$ and $\|u(t) - u_0\| \to \infty$ as $t \to \infty$. This shows that the equilibrium solution $u_0$ is unstable provided there is an eigenvalue with positive real part. This example leads to the "Principle of Linearized Stability," which can be described as follows: Consider a system of ordinary differential equations
$$\dot{u} = F(\lambda, u), \tag{8.8.3}$$
where $u = (u_1, u_2, \ldots, u_n)$, $F = (F_1, F_2, \ldots, F_n)$ and $\lambda$ is a parameter. Let $u_0$ be the equilibrium solution with $\lambda = \lambda_0$, so that $F(u_0, \lambda_0) = 0$. Suppose the solution of (8.8.3) can be written as $u(t) = v(t) + u_0$, where $v(t)$ is the perturbation from equilibrium. It follows from $\dot{u} = F(u, \lambda_0)$ that
$$\dot{v} = \dot{u} = F(v + u_0, \lambda_0) = F(u_0, \lambda_0) + \left[\frac{\partial F_i}{\partial u_j}\right](v) + O(\|v\|^2)$$
or
$$\dot{v} = Av + G(v), \tag{8.8.4}$$
where $A = \left[\partial F_i/\partial u_j\right]\big|_{(u_0, \lambda_0)}$ and $G(v) = O(\|v\|^2)$ represents a term such that
$$\|G(v)\| \le c\,\|v\|^2,$$
where $c$ is a constant. Neglecting the second term in (8.8.4), we obtain the linear equation
$$\dot{v} = Av. \tag{8.8.5}$$
The solution of this equation is
$$v(t) = e^{tA}\,v(0). \tag{8.8.6}$$
Clearly, all solutions of this equation decay if the spectrum of $A$ lies in the open left half plane, while some solutions of (8.8.6) may grow exponentially if $A$ has eigenvalues in the right half plane. In general, the second order term is negligible when the perturbations are small. This heuristic argument can be justified by Lyapunov's theorem:

Theorem 8.8.1 (Lyapunov's Theorem). If all eigenvalues of $A$ have negative real parts, then $u_0$ is a stable equilibrium solution of (8.8.3). If some eigenvalues of $A$ have positive real parts, then $u_0$ is an unstable solution.

A rigorous proof of this theorem is beyond the scope of this book; the reader is referred to Coddington and Levinson (1955). The following example shows that the weak inequality $\mathrm{Re}(\lambda) \le 0$ for all eigenvalues does not ensure stability.

Example 8.8.5.
Consider the equation $\dot{u} = Au$, where $u(t) \in \mathbf{R}^2$ and $A$ is the matrix operator

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

If $u_0$ is an equilibrium solution of this equation, then $Au_0 = 0$. Clearly, $u_0 = (a, 0)$ represents an equilibrium solution for any number $a$. The only eigenvalue of $A$ is zero. If we write $u = (x, y)$, then the given equation becomes $\dot{x} = y$ and $\dot{y} = 0$. Hence, the general solution is $y = m$, $x = mt + c$, where $m$ and $c$ are constants. For sufficiently small $m$ and $c$, the solution $u(t) = (mt + c, m)$ can be made arbitrarily close to $u_0 = (a, 0)$ at $t = 0$. But $\|u(t) - u_0\| \to \infty$ as $t \to \infty$ whenever $m \ne 0$. This shows that the equilibrium solution is unstable.
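Lyapunov's criterion is easy to apply numerically: compute the spectrum of $A$ and inspect the real parts. The sketch below is an illustration added here (not part of the original text); the nilpotent matrix is the one from Example 8.8.5, for which the criterion is silent.

```python
import numpy as np

def linearly_stable(A, tol=1e-12):
    """Classify the equilibrium of v' = Av by the spectrum of A.

    Returns "stable" if all eigenvalues have negative real part,
    "unstable" if some eigenvalue has positive real part, and
    "marginal" otherwise (Lyapunov's theorem is silent then).
    """
    re = np.real(np.linalg.eigvals(A))
    if np.all(re < -tol):
        return "stable"
    if np.any(re > tol):
        return "unstable"
    return "marginal"

stable_A = np.array([[-1.0, 3.0], [0.0, -2.0]])
nilpotent_A = np.array([[0.0, 1.0], [0.0, 0.0]])  # matrix of Example 8.8.5

print(linearly_stable(stable_A))     # stable
print(linearly_stable(nilpotent_A))  # marginal: eigenvalue 0, yet unstable
```

As the example shows, the marginal verdict is honest: the zero eigenvalue of the nilpotent matrix puts the equilibrium outside the reach of the theorem.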
Theorem 8.8.2 (Stability Criterion). If $A$ is a linear operator on a space $E$ and $A + A^*$ is negative semi-definite, that is, $\langle v, (A + A^*)v\rangle \le 0$ for all $v \in E$, then all equilibrium solutions of the equation

$$\dot{u} = Au + f \tag{8.8.7}$$

are stable, where $u$ is an element of a Hilbert space $E$ for each $t$, $A: E \to E$ is independent of $t$, and $f$ is a given element of $E$.
Proof. Suppose $u_0$ is an equilibrium solution of (8.8.7), that is, $Au_0 + f = 0$, and $u(t)$ is any other solution. If $v = u - u_0$, then $\dot{v} = Av$. Thus,

$$\frac{d}{dt}\|v\|^2 = \frac{d}{dt}\langle v, v\rangle = \langle\dot{v}, v\rangle + \langle v, \dot{v}\rangle = \langle Av, v\rangle + \langle v, Av\rangle = \langle v, (A + A^*)v\rangle.$$

If $A + A^*$ is negative semi-definite, then

$$\frac{d}{dt}\|v\|^2 \le 0.$$

This means that $\|v\|$ is a non-increasing function. Consequently, if $\|u(0) - u_0\| < \varepsilon$, then $\|u(t) - u_0\| < \varepsilon$ for all $t > 0$. This shows that all equilibrium solutions are stable.

We next consider the stability of a general nonlinear autonomous equation

$$\dot{u} = Nu. \tag{8.8.8}$$
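Before turning to the nonlinear case, the monotonicity of $\|v(t)\|$ asserted in the proof of Theorem 8.8.2 can be observed numerically. In the sketch below (an illustration; the matrix is an arbitrary choice with $A + A^*$ negative definite), the matrix exponential is approximated by a truncated series:

```python
import numpy as np

def expm(A, terms=40):
    """Truncated power series for the matrix exponential (adequate for small ||A||)."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    return E

A = np.array([[-1.0, 2.0], [0.0, -3.0]])
S = A + A.T                                     # A + A* for a real matrix
assert np.all(np.linalg.eigvalsh(S) <= 1e-12)   # negative semi-definite

v0 = np.array([1.0, 1.0])
norms = [np.linalg.norm(expm(t * A) @ v0) for t in np.linspace(0.0, 2.0, 21)]
assert all(b <= a + 1e-9 for a, b in zip(norms, norms[1:]))  # non-increasing
print("||v(t)|| is non-increasing, as Theorem 8.8.2 predicts")
```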
The question of stability of an equilibrium solution $u_0$ of (8.8.8) is concerned with the effects of small initial displacements of $u$ from $u_0$, and it only involves values of $u$ in a neighborhood of $u_0$. If $N$ is Fréchet differentiable, then the operator $N$ can be approximated by the linear operator $N'(u_0)$ in a neighborhood of $u_0$, and linear stability theory can be used. Hence,

$$Nu = Nu_0 + N'(u_0)(u - u_0) + o(u - u_0), \tag{8.8.9}$$

where $Nu_0 = 0$. Neglecting the term $o(u - u_0)$, Equation (8.8.8) becomes approximately

$$\dot{u} = N'(u_0)(u - u_0). \tag{8.8.10}$$
This equation may be called the linearized approximation of the nonlinear Equation (8.8.8). Its stability can be determined by the stability criteria discussed earlier. When $u$ is near $u_0$, (8.8.10) is the linearized approximation to (8.8.8), so it is natural to assume that the stability of the linearized equation determines that of the nonlinear equation. This principle is generally accepted as valid in the applied literature, and stability is determined formally by solving the associated linear eigenvalue problem. However, this general principle is not necessarily true, as the following counterexample shows.
Example 8.8.6. Consider the nonlinear equation $\dot{u} = u^3$, where $u(t) \in \mathbf{R}$ for each $t$. The equilibrium solution is $u_0 = 0$. The equation can be solved explicitly with the initial condition $u(0) = u_0$, and the solution is

$$u^2 = \frac{u_0^2}{1 - 2u_0^2 t},$$

which is not defined for $t = 1/2u_0^2$. Thus, $u_0$ is an unstable equilibrium. However, the linearized equation $\dot{u} = 0$ admits a stable solution. Thus, the stability of the linearized equation does not imply stability of the nonlinear equation. The difficulty associated with this example is that the linearized equation has eigenvalue $\lambda = 0$ (the critical case $\mathrm{Re}\,\lambda = 0$); the linearized system is only marginally stable. This means that an arbitrarily small perturbation can push the eigenvalue into the right half plane and make the system unstable. In other words, the eigenvalue zero corresponds to a constant solution of the linearized equation, and an arbitrarily small perturbation can change this constant solution and thus lead to instability. However, if all the eigenvalues of a linearized problem are negative, then its solutions tend to $u_0$ exponentially. The small perturbations involved in going from the linearized to the nonlinear problem cannot change exponential decay of $u - u_0$ into growth, so in this case the nonlinear problem will be stable.
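The blow-up in Example 8.8.6 can be made concrete with a short computation (the initial value below is an arbitrary illustrative choice):

```python
def u_squared(t, u0):
    """Exact solution of u' = u**3 with u(0) = u0:  u(t)**2 = u0**2/(1 - 2*u0**2*t)."""
    return u0 ** 2 / (1.0 - 2.0 * u0 ** 2 * t)

u0 = 0.1                        # an arbitrarily small initial displacement
t_blow = 1.0 / (2.0 * u0 ** 2)  # the solution ceases to exist at this time (about 50)
print(t_blow)
print(u_squared(0.99 * t_blow, u0))  # about 100 times u0**2 just before blow-up
```

However small $u_0$ is chosen, the solution leaves every neighborhood of the equilibrium in finite time, while the linearized equation $\dot u = 0$ keeps it constant.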
8.9. Bifurcation Theory

Bifurcation is a phenomenon involved in nonlinear problems and is closely associated with the loss of stability. We have seen in Section 8.8 that the stability of a dynamical system depends on whether the eigenvalues of the linearized operator have positive or negative real parts; these eigenvalues correspond to bifurcation points. We shall discuss bifurcation theory in terms of operator equations in a real Banach (or Hilbert) space. By a nonlinear eigenvalue problem we usually mean the problem of determining appropriate solutions of a nonlinear equation of the form

$$F(\lambda, u) = 0, \tag{8.9.1}$$

where $F: \mathbf{R} \times E \to B$ is a nonlinear operator, depending on the parameter $\lambda$, which operates on the unknown function or vector $u$, and $E$ and $B$ are real Banach (or Hilbert) spaces.
Bifurcation theory deals with the existence and behavior of solutions $u(\lambda)$ of Equation (8.9.1) as a function of the parameter $\lambda$. Of particular interest is the process of bifurcation (or branching), where a given solution of (8.9.1) splits into two or more solutions as $\lambda$ passes through a critical value $\lambda_0$, called a bifurcation point.

Definition 8.9.1 (Bifurcation Points). The solution of (8.9.1) is said to bifurcate from the solution $u_0(\lambda_0)$ at the value $\lambda = \lambda_0$ if the equation has at least two distinct solutions $u_1(\lambda)$ and $u_2(\lambda)$ such that both tend to $u_0 = u_0(\lambda_0)$ as $\lambda \to \lambda_0$. The points $(\lambda_0, u_0)$ satisfying Equation (8.9.1) are referred to as bifurcation (or branch) points if, in every neighborhood of $(\lambda_0, u_0)$, there exists a solution $(\lambda, u)$ different from $(\lambda_0, u_0)$.

The first problem of bifurcation theory is to determine the solution $u_0$ and the parameter $\lambda_0$ at which bifurcation occurs. The second problem is to find the number of solutions which bifurcate from $u_0(\lambda_0)$. The third problem is to study the behavior of these solutions for $\lambda$ near $\lambda_0$. To illustrate bifurcation, we consider the linear eigenvalue problem

$$Lu = \lambda u, \tag{8.9.2}$$

where $L$ is a linear operator acting on a function or a vector $u$ in some Banach space and $\lambda \in \mathbf{R}$. For every value of $\lambda$, (8.9.2) has the trivial solution $u = 0$ with norm $\|u\| = 0$. Suppose there is a sequence of eigenvalues $\lambda_1 < \lambda_2 < \lambda_3 < \cdots$ and corresponding normalized eigenfunctions $u_1, u_2, u_3, \dots$ such that

$$Lu_k = \lambda_k u_k, \qquad k = 1, 2, 3, \dots. \tag{8.9.3}$$

Then, for any real number $a$, the non-trivial solutions are $u = au_k$, $k = 1, 2, 3, \dots$, with norm $\|u\| = |a|$. The norms of both trivial and non-trivial solutions are shown graphically in Figure 8.1.
FIGURE 8.1. Bifurcation diagram.
Many examples of bifurcation phenomena occur in both differential and integral equations. One such example is the following.

Example 8.9.1. Consider a thin elastic rod with pinned ends lying in the $x$-$z$ plane. The shape of the rod is described by two functions $u(x)$ and $w(x)$, the dimensionless displacement functions in the $x$ and $z$ directions. The $x$-displacements of its end points are prescribed. The displacement functions $u(x)$ and $w(x)$ satisfy the following differential equations and boundary conditions:
$$\frac{d^2w}{dx^2} + \lambda w(x) = 0, \qquad 0 \le x \le 1, \tag{8.9.4}$$

$$\frac{du}{dx} + \frac{1}{2}\left(\frac{dw}{dx}\right)^2 = -\mu\lambda, \qquad 0 \le x \le 1, \tag{8.9.5}$$

$$w(0) = w(1) = 0, \qquad u(0) = -u(1) = \alpha > 0, \tag{8.9.6}$$
where the parameter $\lambda$ is proportional to the axial stress in the rod, the constant $\alpha$ in (8.9.6) is proportional to the prescribed end displacement and is referred to as the end-shortening, and $\mu$ is a positive physical constant.

Consider the linearized problem, in which the nonlinear term $w_x^2$ is absent. The solution of the linearized Equation (8.9.5) is

$$u(x) = \alpha(1 - 2x), \tag{8.9.7}$$

where $\alpha = \lambda\mu/2$. The solution of (8.9.4) and (8.9.6) is $w(x) = 0$ unless $\lambda$ is an eigenvalue $\lambda_n$ given by

$$\lambda_n = n^2\pi^2, \qquad n = 1, 2, 3, \dots. \tag{8.9.8}$$

In this case, $w$ is a multiple of the eigenfunction $w_n$ given by

$$w_n = A_n \sin n\pi x, \qquad n = 1, 2, 3, \dots, \tag{8.9.9}$$

where the $A_n$ are constants. From $\alpha = \frac{1}{2}\lambda\mu$ and $\lambda = \lambda_n = n^2\pi^2$, we conclude that if $\alpha = \alpha_n = \frac{1}{2}\mu\lambda_n$, then the rod buckles into a shape given by (8.9.7) and (8.9.9) with an undetermined amplitude $A_n$. The numbers $\alpha_n$ are called the critical end-shortenings. For $\alpha \ne \alpha_n$, $n = 1, 2, \dots$, the rod remains straight because the solution of (8.9.4) and (8.9.6) is

$$w(x) = 0. \tag{8.9.10}$$
We now consider the nonlinear problem (8.9.4)-(8.9.6). The solution of the problem is still given by (8.9.9) when $\lambda = \lambda_n$, and by (8.9.10) when $\lambda \ne \lambda_n$. To find $u(x)$ when $\lambda = \lambda_n$, we put (8.9.9) into (8.9.5) and integrate, using (8.9.6) at $x = 0$, to obtain

$$u(x) = u_n(x) = \alpha - \mu\lambda_n\left(1 + \frac{A_n^2}{4\mu}\right)x - \frac{n\pi A_n^2}{8}\sin 2n\pi x. \tag{8.9.11}$$

In view of the boundary condition $u(1) = -\alpha$, we obtain

$$\alpha = \alpha_n\left(1 + \frac{A_n^2}{4\mu}\right). \tag{8.9.12}$$
This is a relation between the end-shortening and the amplitude. The bifurcation diagrams for the thin rod are given in Figure 8.2. The diagram shows that, for $\alpha < \alpha_1$, the only solution is the trivial solution $w \equiv 0$. At $\alpha = \alpha_1$, the non-trivial solution $w_1 = A_1\sin\pi x$ bifurcates from the trivial solution and continues to exist for all $\alpha > \alpha_1$. The point $\alpha = \alpha_1$ is called the first bifurcation point, and the non-trivial solution is called the first bifurcation solution. For each $n$, non-trivial solutions of (8.9.12) for $A_n$ are possible if and only if $\alpha \ge \alpha_n$. The solutions bifurcate from the trivial (unbuckled) state $A_n = 0$ at $\alpha = \alpha_n$. Thus, the solution of the linearized problem determines the bifurcation points of the nonlinear problem. For any $\alpha$ in $\alpha_n \le \alpha \le \alpha_{n+1}$, there are $2n + 1$ solutions. For $\alpha < \alpha_1$, no buckling is possible. We also note from (8.9.12) that $d\alpha/dA_n = \alpha_n A_n/2\mu$. Hence, for a fixed amplitude $A$, the parabola in Figure 8.2 bifurcating from $\alpha_n$ has a steeper slope than that bifurcating from $\alpha_m$ if $m < n$. Clearly, these parabolas do not intersect. For any fixed value of $\alpha$, the bifurcation solutions can be classified by the values of the potential energy associated with them. We also observe
FIGURE 8.2. Bifurcation diagram for the thin rod.
that the potential energy is equal to the internal energy, since the displacements are specified at the ends of the rod. Consequently, the potential energy is proportional to the functional $V$ defined by

$$V(w) = \frac{1}{2}\int_0^1\left[w_{xx}^2 + \frac{1}{\mu}\left(u_x + \frac{1}{2}w_x^2\right)^2\right]dx. \tag{8.9.13}$$

In the unbuckled state, Equations (8.9.7) and (8.9.10) hold with $\alpha = \lambda\mu/2$, and the corresponding potential energy is

$$V_0 = \frac{2}{\mu}\alpha^2. \tag{8.9.14}$$

The potential energy $V_n$ of the buckled state given by (8.9.9) is obtained by substituting (8.9.8), (8.9.9) and (8.9.11) into (8.9.13), in the form

$$V_n = V_0 - \frac{2}{\mu}(\alpha - \alpha_n)^2. \tag{8.9.15}$$

Hence,

$$V_0 - V_n = \frac{2}{\mu}(\alpha - \alpha_n)^2 \ge 0, \tag{8.9.16}$$

$$\frac{\mu}{2}(V_n - V_m) = (\alpha_n - \alpha_m)[(\alpha - \alpha_n) + (\alpha - \alpha_m)] \ge 0, \qquad \alpha \ge \alpha_n \ge \alpha_m. \tag{8.9.17}$$
It follows from (8.9.16) and (8.9.17) that, for fixed $\alpha > \alpha_1$, the straight state has the largest energy and the branch originating from $\alpha_1$ has the smallest energy. For fixed $\alpha$ in the interval $\alpha_n \le \alpha \le \alpha_{n+1}$, the energies of the branches are ordered as $V_0 > V_n > V_{n-1} > \cdots > V_1$. The state of smallest energy has displacement function

$$w = A_1 w_1 = \pm 2\sqrt{\mu}\left(\frac{\alpha}{\alpha_1} - 1\right)^{1/2}\sin\pi x \qquad \text{for all } \alpha > \alpha_1. \tag{8.9.18}$$
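The relations $\lambda_n = n^2\pi^2$, $\alpha_n = \frac{1}{2}\mu\lambda_n$ and $A_n^2 = 4\mu(\alpha/\alpha_n - 1)$ from (8.9.8), (8.9.12) and (8.9.18) are easy to tabulate. The sketch below is an illustration added here; the value of $\mu$ is an arbitrary choice:

```python
import math

mu = 0.5  # illustrative value of the positive physical constant

def alpha_crit(n, mu):
    """Critical end-shortening alpha_n = mu * lambda_n / 2 with lambda_n = (n*pi)**2."""
    return 0.5 * mu * (n * math.pi) ** 2

def amplitude(alpha, n, mu):
    """Buckling amplitude from alpha = alpha_n * (1 + A_n**2 / (4*mu));
    the branch exists only for alpha >= alpha_n."""
    a_n = alpha_crit(n, mu)
    if alpha < a_n:
        return None
    return 2.0 * math.sqrt(mu * (alpha / a_n - 1.0))

alpha = alpha_crit(2, mu)       # at the second critical end-shortening
print(amplitude(alpha, 2, mu))  # 0.0: the n = 2 branch is just being born
print(amplitude(alpha, 1, mu))  # positive: the n = 1 branch has already buckled
print(amplitude(alpha, 3, mu))  # None: the n = 3 branch does not yet exist
```

This reproduces the counting in the text: between $\alpha_n$ and $\alpha_{n+1}$ the branches $1, \dots, n$ each contribute a pair $\pm A_k$, which together with the straight state gives $2n + 1$ solutions.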
Suppose the solutions of (8.9.1) represent equilibrium solutions of a dynamical system which evolves according to the time-dependent equation

$$u_t = F(\lambda, u), \tag{8.9.19}$$

where $u: \mathbf{R} \to E$ and $E$ is a Banach (or Hilbert) space. An equilibrium solution $u_0$ is stable if small perturbations from it remain close to $u_0$ as $t \to \infty$; $u_0$ is asymptotically stable if small perturbations tend to zero as $t \to \infty$ (see Section 8.8). When the parameter $\lambda$ changes, one solution may persist but become unstable as $\lambda$ passes a critical value $\lambda_0$, and it is at such a transition point that new solutions may bifurcate from the known solution.
One of the simplest nonlinear partial differential equations which exhibits the transition phenomena shown in Figure 8.3 is

$$u_t = \nabla^2 u + \lambda u - u^3 \quad \text{in } D, \tag{8.9.20}$$

$$u = 0 \quad \text{on } \partial D, \tag{8.9.21}$$
where $D$ is a smooth bounded domain in $\mathbf{R}^N$. The equilibrium states of (8.9.20) are given by solutions of the time-independent equation ($u_t \equiv 0$). One solution is obviously $u = 0$, which is valid for all $\lambda$; this solution becomes unstable at $\lambda = \lambda_1$, the first eigenvalue of the Laplacian on $D$: $\nabla^2 u_1 + \lambda_1 u_1 = 0$, $u_1 = 0$ on $\partial D$. For $\lambda > \lambda_1$, there are at least three solutions of the nonlinear equilibrium equation. The nature of the solution set in the neighborhood of $(\lambda_1, 0)$ is given in Figure 8.3; the new bifurcating solutions are stable. The Laplacian has a set of eigenvalues $\lambda_1 < \lambda_2 < \lambda_3 < \cdots$ which tend to infinity, and all of these eigenvalues are potential bifurcation points.

In the theory of calculus in Banach spaces, the following version of the Implicit Function Theorem is concerned with the existence, uniqueness and smoothness properties of the solution of Equation (8.9.1).

Theorem 8.9.1 (Implicit Function Theorem). Suppose $\Lambda$, $E$, $B$ are real Banach spaces and $F$ is a Fréchet differentiable mapping from a domain $D \subset \Lambda \times E$ to $B$. Assume $F(\lambda_0, u_0) = 0$ and the partial Fréchet derivative $F_u(\lambda_0, u_0)$ is an isomorphism from $E$ to $B$. Then, locally, for $\|\lambda - \lambda_0\|$ sufficiently small, there is a differentiable mapping $u(\lambda)$ from $\Lambda$ to $E$, with $(\lambda, u(\lambda)) \in D$, such that $F(\lambda, u(\lambda)) = 0$. Moreover, $(\lambda, u(\lambda))$ is the only solution of $F = 0$ in a sufficiently small neighborhood $D' \subset D$. If $F$ is $C^n$, then $u$ is $C^n$. If $\Lambda$, $E$, and $B$ are complex Banach spaces and $F$ is Fréchet differentiable, then $F$ is analytic and $u$ is analytic in $\lambda$.
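When the derivative $F_u$ is invertible along a branch, the theorem justifies numerical continuation: solve $F(\lambda, u) = 0$ by Newton's method at each $\lambda$, starting from the solution found at the previous $\lambda$. Below is a scalar sketch for the hypothetical model problem $F(\lambda, u) = \lambda u - u^3$ (an illustration, not an example from the text):

```python
def F(lam, u):
    return lam * u - u ** 3

def F_u(lam, u):
    return lam - 3.0 * u ** 2

def continue_branch(lam_values, u_start):
    """Follow a solution branch of F(lam, u) = 0 by Newton's method,
    using the previous solution as the predictor (valid while F_u != 0)."""
    u = u_start
    branch = []
    for lam in lam_values:
        for _ in range(50):  # Newton iteration at fixed lam
            u = u - F(lam, u) / F_u(lam, u)
        branch.append(u)
    return branch

lams = [0.5 + 0.1 * k for k in range(6)]          # stay away from lam = 0
branch = continue_branch(lams, u_start=0.7)       # near the branch u = sqrt(lam)
print(branch[-1])                                 # approximately 1.0 at lam = 1.0
```

On this branch $u = \sqrt{\lambda}$, so $F_u = \lambda - 3u^2 = -2\lambda \ne 0$ and the continuation is well defined; at $\lambda = 0$, where $F_u$ vanishes at $u = 0$, the trivial and nontrivial branches meet and the hypothesis of the theorem fails, which is exactly where bifurcation occurs.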
FIGURE 8.3. Bifurcation diagram where unstable solutions are represented by dashed lines.
The proof of the theorem is beyond the scope of this book; it can be carried out by a contraction mapping argument, and the result is adequate for most physical applications. The reader is referred to Sattinger (1973) and Dieudonné (1969) for a detailed discussion of proofs. Bifurcation phenomena typically accompany the transition to instability when a characteristic parameter crosses a critical value, and hence they play an important role in applications to mechanics. Indeed, mechanics is a rich source of bifurcation and instability phenomena, and the subject has always stimulated the rapid development of functional analysis.
8.10. Exercises
(1) Let $H_1$ and $H_2$ be real Hilbert spaces. Show that if $T$ is a bounded linear operator from $H_1$ into $H_2$, and $f$ is a real functional on $H_1$ defined by $f(x) = \|Tx - u\|^2$, where $u$ is a fixed vector in $H_2$, then $f$ has a Fréchet derivative at every point, given by

$$f'(x) = -2T^*u + 2T^*Tx,$$

where $T^*$ is the adjoint of $T$.
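The stated derivative can be checked against a central-difference quotient when $T$ is a matrix. In the sketch below, $T$, $u$ and $x$ are arbitrary illustrative choices:

```python
import numpy as np

T = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])  # a bounded operator R^2 -> R^3
u = np.array([1.0, -1.0, 2.0])

def f(x):
    return float(np.dot(T @ x - u, T @ x - u))   # f(x) = ||Tx - u||**2

def grad_f(x):
    return -2.0 * T.T @ u + 2.0 * T.T @ (T @ x)  # f'(x) = -2 T*u + 2 T*T x

x = np.array([0.3, 0.7])
h = 1e-6
fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
print(np.max(np.abs(fd - grad_f(x))))  # close to zero
```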
(2) Suppose $T: B_1 \to B_2$ is Fréchet differentiable on an open set $\Omega \subset B_1$. Show that if $x \in \Omega$ and $h \in B_1$ are such that $x + th \in \Omega$ for every $t \in [0, 1]$, then

$$\|T(x + h) - T(x)\| \le \|h\| \sup_{0 < \theta < 1}\|T'(x + \theta h)\|.$$

(13) (a) $\lambda_1 = \lambda_2 = 1/\pi$, $f(x) = c_1\cos x + c_2\sin x$.
(b) $\lambda_1 = i\sqrt{3}/2$, $f_1(x) = 1 - i\sqrt{3}\,x$; $\lambda_2 = -i\sqrt{3}/2$, $f_2(x) = 1 + i\sqrt{3}\,x$.
(14) (d) $f(x) = -2$.

(16) (a) $f(x) = \frac{1}{2}(3x + 1)$, (b) $f(x) = \sin x$, (c) $f(x) = \sinh x$, (d) $f(x) = e^{-x^2}$.

(19) $(Lu, v) = \int_0^1 v(x)\left(e^x\frac{d^2}{dx^2} + e^x\frac{d}{dx}\right)u(x)\,dx = \int_0^1 v(x)\left(e^x u'(x)\right)'\,dx = \int_0^1 u\,\overline{Lv}\,dx = (u, Lv)$.
(21) $u_n(x) = B_n \sin\left(\dfrac{2n - 1}{2}\right)x$, $n = 1, 2, \dots$.

(22) $\lambda_n = n^2\pi^2$, $u_n(x) = B_n\sin(n\pi\ln x)$, $n = 1, 2, \dots$.

(24) $(Tu, u) = ((DpD + q)u, u) = (DpDu, u) + q(u, u) = -p\|Du\|^2 + q\|u\|^2$.

(25) $(u, v) = \int_a^b u(x)v(x)r(x)\,dx$.

(27)
Hints and Answers to Selected Exercises
(31) (b) $\mathscr{F}\{Tu\} = \dfrac{2}{1 + k^2}\mathscr{F}\{u\}$. Since $\mathscr{F}$ is unitary, we have $\|\mathscr{F}\{Tu\}\| = \|Tu\| \le 2\|u\|$.
Hints and Answers to 6.6. Exercises
(1) Show first that if $f$ is a continuous function which is not identically zero, then there exists a $\phi \in \mathscr{C}_0^\infty(\mathbf{R}^N)$ with compact support such that $\int_{\mathbf{R}^N} f(x)\phi(x)\,dx \ne 0$.

(5) Use the function
$$f(t) = \begin{cases} 0, & t \le 0, \\ e^{-1/t^2}, & t > 0. \end{cases}$$
(6) (a), (b), (e), (i) Yes. (c), (f), (g), (h), (j) No. (d) No, if $\{x_n\}$ has a convergent subsequence.

(10) Use the Riemann-Lebesgue Lemma.

(12) Note that, for every $\varepsilon > 0$,

(13) Use the Riemann-Lebesgue Lemma to show that
$$\lim_{n\to\infty}\int_{-\alpha}^{\infty}\frac{\phi(x) - \phi(0)}{x}\sin nx\,dx = 0.$$

(17) $G(x, t) = \dfrac{1}{2c}H(t)\{H(x + ct) - H(x - ct)\} = \dfrac{1}{2c}H(ct - |x|)$.

(18) (b) $u(x, t) = \int_{-\infty}^{\infty} G(x, \xi, t)g(\xi)\,d\xi$, where $G(x, \xi, t)$ is obtained in 18(a). Since $G(x, \xi, t) = 1/2c$ if $x - ct < \xi < x + ct$ and 0 elsewhere, we have $u(x, t) = \dfrac{1}{2c}\int_{x - ct}^{x + ct} g(\xi)\,d\xi$.
(26)
$$u(x, y) = \frac{1}{2a}\sin\frac{\pi y}{a}\int_{-\infty}^{\infty}\frac{f(t)\,dt}{\cosh\frac{\pi}{a}(x - t) - \cos\frac{\pi y}{a}} + \frac{1}{2a}\sin\frac{\pi y}{a}\int_{-\infty}^{\infty}\frac{g(t)\,dt}{\cosh\frac{\pi}{a}(x - t) + \cos\frac{\pi y}{a}}.$$

(27)
$$u(x, y) = \sin\frac{\pi y}{2a}\int_{-\infty}^{\infty}\frac{f(t)\cosh\frac{\pi(x - t)}{2a}\,dt}{\cosh\frac{\pi}{a}(x - t) - \cos\frac{\pi y}{a}}.$$

(29)
$$\phi(x, y) = \frac{1}{2}\int_{-\infty}^{\infty}\frac{f_0(ak)}{k}\,e^{ikx - |k|y}\,dk.$$
Hints and Answers to 7.11. Exercises

(1) (a) $(d/dt)(\partial L/\partial\dot{x}_i) - \partial L/\partial x_i = 0$ implies $m\ddot{x}_i + kx_i = 0$. Multiply this equation by $\dot{x}_i$ and integrate to obtain $\frac{1}{2}m\dot{x}_i^2 + \frac{1}{2}kx_i^2 = \text{constant}$. (b) Use (7.10.1abc) and $L$ in 1(a), and show that it becomes the expression for $L$ in 1(b).

(2) $p_r = \dfrac{\partial L}{\partial\dot{r}} = m\dot{r}$, $p_\theta = \dfrac{\partial L}{\partial\dot{\theta}} = mr^2\dot{\theta}$, where $L = T - V = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) - V(r)$;
$$H = T + V = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) + V(r) = \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + V(r),$$
$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}, \qquad \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2}.$$
Then use $\dot{p}_r = -\partial H/\partial r$ and $\dot{p}_\theta = -\partial H/\partial\theta$.
(4) (i) $p = \dfrac{\partial T}{\partial\dot{x}} = m\dot{x}$, $\dfrac{\partial H}{\partial p} = \dfrac{p}{m} = \dot{x}$, $\dfrac{\partial H}{\partial x} = kx = -\dot{p}$, which implies $\ddot{x} = -(k/m)x$.

(ii) $p_r = \dfrac{\partial T}{\partial\dot{r}} = m\dot{r}$, $p_\theta = \dfrac{\partial T}{\partial\dot{\theta}} = mr^2\dot{\theta}$,
$$H = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + m\mu\left(\frac{1}{2a} - \frac{1}{r}\right),$$
$$\frac{\partial H}{\partial p_r} = \frac{p_r}{m} = \dot{r}, \qquad \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} = \dot{\theta}, \qquad \frac{\partial H}{\partial\theta} = 0 = -\dot{p}_\theta.$$
Thus, $\ddot{r} - r\dot{\theta}^2 = -\dfrac{\mu}{r^2}$ and $\dfrac{d}{dt}(r^2\dot{\theta}) = 0$.
(8) (iii), (iv) $[\hat{p}, x^2] = x[\hat{p}, x] + [\hat{p}, x]x = -x[x, \hat{p}] - [x, \hat{p}]x = -2i\hbar x$.

(11) Use (7.10.9ab). (12) Use (7.10.48).

(15)

(19)
(21) $(\psi, \psi) = \left(\sum_{n=1}^\infty(\psi_n, \psi)\psi_n,\ \sum_{k=1}^\infty(\psi_k, \psi)\psi_k\right) = \sum_{n,k=1}^\infty(\psi_n, \psi)^*(\psi_k, \psi)(\psi_n, \psi_k)$.

(22) (i) Use the fact that $A'$ is the difference of two Hermitian operators, and then show that $(\psi, A'\phi) = (A'\psi, \phi)$ for any $\psi$ and $\phi$. (ii) Use the fact that $A\langle B\rangle = \langle B\rangle A$, since $A$ is a linear operator and $\langle B\rangle$ is a scalar. (iii) $(A'\psi, A'\psi) = (\psi, (A')^2\psi) = (\psi, [A - \langle A\rangle]^2\psi)$.

$(x_1, x_2, x_3)$ lies on the sphere.

(12) Maximize $\int y\,dx$ subject to the condition $\int\sqrt{1 + (y')^2}\,dx = L$. In other words, maximize the functional $I_1(y) = \int\left[y(x) + \lambda\left(\sqrt{1 + (y'(x))^2} - L\right)\right]dx$. Answer: $(x - \alpha)^2 + (y - \beta)^2 = \lambda^2$, where $\alpha$, $\beta$ and $\lambda$ are constants.
(23) This polynomial is the real part of the binomial expansion of $(\cos\theta + i\sin\theta)^n$, where $x = \cos\theta$.

(29) Use repeated integration by parts to show
$$\int_{-1}^{1} D^n[(x^2 - 1)^n]\,x^m\,dx = 0, \qquad m = 0, 1, \dots, n - 1,$$
and then find the leading coefficient of $D^n[(x^2 - 1)^n]$.

(30) Use Rodrigues' formula and the binomial expansion of $(x^2 - 1)^n$.
(31) (b) $D^{n+1}[(x^2 - 1)^n] = D^n\{D[(x^2 - 1)^n]\} = 2n\,D^n[x(x^2 - 1)^{n-1}] = 2n\{x\,D^n[(x^2 - 1)^{n-1}] + n\,D^{n-1}[(x^2 - 1)^{n-1}]\}$, and then use Rodrigues' formula.

(33) (a) Use the recurrence relations 31(a) and 31(b). (b) Use 31(b) and 33(a).

(34) (a) Multiply the equality in 33(a) by $x$ and subtract from the equality in 31(b). (b) Square and add the equalities in 31(b) and 33(b).

(37) Transform into polar coordinates.

(41) (a) $w = A\sin nx$, $A[-n^2 - (\lambda - A^2)] = 0$. (b) $w = A\sin nx$, $A(\lambda - A^2n^2) = 0$.

(43) The term within the first bracket of the equation can be replaced by a constant $\mu$. Answer: $\mu = n^2\pi^2$, $w = A_n\sin n\pi x$, $A_n^2 = 4[(\lambda/n^2\pi^2) - 1]$.

(45) Note that the quantity in the square bracket is a constant and can be replaced by a constant $\alpha$. Then $\alpha = \alpha_n = n^2\pi^2$, $u = A\sin n\pi x$, $\lambda^2 - |A|^2 = n^2\pi^2$. Draw the bifurcation diagram.

(46) Square both sides of the equation and integrate from 0 to 1.
Bibliography
Balakrishnan, A. V., Applied Functional Analysis, Springer-Verlag, New York, 1976.
Balakrishnan, A. V., Introduction to Optimization Theory in a Hilbert Space, Springer-Verlag, New York, 1971.
Banach, S., Théorie des opérations linéaires, Chelsea, New York, 1955.
Berkovitz, L., Optimal Control Theory, Springer-Verlag, New York, 1975.
Cheney, E. W., Introduction to Approximation Theory, McGraw-Hill, New York, 1966.
Coddington, E. A. and Levinson, N., Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.
Curtain, R. F. and Pritchard, A. J., Functional Analysis in Modern Applied Mathematics, Academic Press, New York, 1977.
De Boor, C., Approximation Theory, Proceedings of Symposia in Applied Mathematics, Vol. 36, American Mathematical Society, Providence, 1986.
Dieudonné, J., Foundations of Modern Analysis, Academic Press, New York, 1969.
Dirac, P. A. M., The Principles of Quantum Mechanics (Fourth Edition), Oxford University Press, Oxford, 1958.
Dunford, N. and Schwartz, J. T., Linear Operators, Part I, General Theory, Interscience, New York, 1958.
Friedman, A., Foundations of Modern Analysis, Dover Publications, New York, 1982.
Garabedian, P. R., Partial Differential Equations, John Wiley and Sons, New York, 1964.
Glimm, J. and Jaffe, A., Quantum Physics (Second Edition), Springer-Verlag, New York, 1987.
Gould, S. H., Variational Methods for Eigenvalue Problems, Toronto University Press, Toronto, 1957.
Halmos, P. R., Measure Theory, Springer-Verlag, New York, 1974.
Hilbert, D., Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Leipzig, 1912.
Hutson, V. and Pym, J. S., Applications of Functional Analysis and Operator Theory, Academic Press, New York, 1980.
Iooss, G. and Joseph, D. D., Elementary Stability and Bifurcation Theory, Springer-Verlag, New York, 1981.
Jauch, J. M., Foundations of Quantum Mechanics, Addison-Wesley Publishing Company, Reading, Mass., 1968.
Jones, D. S., Generalized Functions, McGraw-Hill, New York, 1966.
Kantorovich, L. V. and Akilov, G. P., Functional Analysis in Normed Spaces, Pergamon Press, London, 1964.
Keller, J. B. and Antman, S., Bifurcation Theory and Nonlinear Eigenvalue Problems, W. A. Benjamin, New York, 1969.
Kolmogorov, A. N. and Fomin, S. V., Elements of the Theory of Functions and Functional Analysis, Vol. 1, Graylock Press, Rochester, New York, 1957; Vol. 2, Graylock Press, Albany, New York, 1961.
Kolmogorov, A. N. and Fomin, S. V., Introductory Real Analysis, Prentice-Hall, New York, 1970.
Kreyn, S. G., Functional Analysis, Foreign Technology Division WP-AFB, Ohio, 1967.
Kreyszig, E., Introductory Functional Analysis with Applications, John Wiley and Sons, New York, 1978.
Landau, L. D. and Lifshitz, E. M., Quantum Mechanics, Non-relativistic Theory, Pergamon Press, London, 1959.
Lax, P. D. and Milgram, A. N., Parabolic Equations, Contributions to the Theory of Partial Differential Equations, Ann. of Math. Studies, No. 33 (1954), Princeton, 167-190.
Lions, J. L. and Stampacchia, G., Variational Inequalities, Comm. Pure Appl. Math. 20 (1967), 493-519.
Luenberger, D. G., Optimization by Vector Space Methods, John Wiley and Sons, New York, 1969.
Liusternik, L. A. and Sobolev, V. J., Elements of Functional Analysis (Third English Edition), Hindustan Publishing Co., New Delhi, 1974.
Mackey, G. W., The Mathematical Foundations of Quantum Mechanics, W. A. Benjamin, New York, 1963.
MacNeille, H. M., A Unified Theory of Integration, Proc. Nat. Acad. Sci. USA, Vol. 27 (1941), 71-76.
Merzbacher, E., Quantum Mechanics (Second Edition), John Wiley and Sons, New York, 1961.
Mikusiński, J., Bochner Integral, Birkhäuser-Verlag, Basel, 1978.
Myint-U, T. and Debnath, L., Partial Differential Equations for Scientists and Engineers (Third Edition), North-Holland, New York, 1987.
Neumann, J. V., Mathematical Foundations of Quantum Mechanics, Princeton University Press, Princeton, 1955.
Reed, M. and Simon, B., Methods of Modern Mathematical Physics, Volume 1, Functional Analysis, Academic Press, New York, 1972.
Riesz, F. and Sz.-Nagy, B., Functional Analysis (Second Edition), Frederick Ungar, New York, 1955.
Rivlin, T. J., An Introduction to the Approximation of Functions, Dover Publications, New York, 1969.
Roach, G. F., Green's Functions (Second Edition), Cambridge University Press, Cambridge, 1982.
Sattinger, D. H., Topics in Stability and Bifurcation Theory, Lecture Notes in Mathematics, Vol. 309 (1973), Springer-Verlag, New York.
Schechter, M., Modern Methods in Partial Differential Equations, McGraw-Hill, New York, 1977.
Schwartz, L., Théorie des distributions, Vols. I and II, Hermann et Cie, Paris, 1950, 1951.
Shilov, G. E., Generalized Functions and Partial Differential Equations, Gordon and Breach, New York, 1968.
Sobolev, S. L., Partial Differential Equations of Mathematical Physics, Pergamon Press, London, 1964.
Stakgold, I., Boundary Value Problems of Mathematical Physics, Macmillan, New York, 1968.
Taylor, A. E., Introduction to Functional Analysis, John Wiley and Sons, New York, 1958.
Tricomi, F. G., Integral Equations, Interscience, New York, 1957.
Yosida, K., Functional Analysis (Fourth Edition), Springer-Verlag, New York, 1974.
Young, L. C., Calculus of Variations and Optimal Control Theory, W. B. Saunders Company, Philadelphia, 1969.
Zemanian, A. H., Distribution Theory and Transform Analysis, McGraw-Hill, New York, 1965.
List of Symbols
Page numbers indicate the page where the symbol is further defined.

$\mathbf{N}$, $\mathbf{Q}$, $\mathbf{R}$, $\mathbf{R}^+$, $\mathbf{C}$, $\mathbf{F}$, $\mathbf{R}^N$, $\mathbf{C}^N$, $\mathscr{C}(\Omega)$, $\mathscr{C}^k(\Omega)$, $\mathscr{C}^\infty(\Omega)$, $\mathscr{D}(\Omega)$, $\mathscr{C}([a, b])$, $\mathscr{C}^k([a, b])$, $\mathscr{C}^\infty([a, b])$
dimension of E, 10
norm, 10
empty set
open ball, 14
closed ball, 14
sphere, 14
closure of S, 16
boundary of S
domain of L, 23
range of L, 23
null space of L, 23
graph of L, 23
space of bounded linear mappings from E₁ into E₂, 25
dual space of E, 27
support of f, 39
expansion of an integrable function f, 44
space of Lebesgue integrable functions on R, 45
space of equivalence classes of Lebesgue integrable functions on R, 52
convergence in norm, 53
almost everywhere, 54
convergence almost everywhere, 55
characteristic function of S
Lebesgue measure of S, 68
real part of z
imaginary part of z
space of square integrable functions on R, 74
space of Lebesgue integrable functions on R^N, 75
space of square integrable functions on R^N, 76
space of Lebesgue integrable functions on Ω
space of square integrable functions on Ω
space of Lebesgue integrable functions on [a, b]
space of square integrable functions on [a, b], 75
convolution of f and g, 79
inner product, 88
complex conjugate of z
orthogonal, 92
orthogonal complement, 117
direct sum
projection onto S, 166
$P^\perp$ complementary projection operator, 168
$I$ identity operator, 138
$A^*$ adjoint of $A$, 145
$A^{-1}$ inverse of an operator $A$, 155
$P_n(x)$ Legendre polynomials, 100, 254
$P_n^m(x)$ associated Legendre functions, 254
$H_n(x)$ Hermite polynomials, 102, 256
$T_n(x)$ Chebyshev polynomials, 254
$P_n^{\alpha,\beta}(x)$ Jacobi polynomials, 255
$C_n^\lambda(x)$ Gegenbauer polynomials, 255
$L_n(x)$ Laguerre polynomials, 255
$L_n^a(x)$ associated Laguerre functions, 256
$J_n(x)$ Bessel functions, 256
$L^2_\rho([a, b])$ space of square integrable functions with the weight function $\rho$ on $[a, b]$
$D^\alpha$ differential operator, 96
$H^m(\Omega)$, $H^2(\Omega)$, $H_0^m(\Omega)$, $W_p^m(\Omega)$ Sobolev spaces, 96
$x_n \xrightarrow{w} x$ weak convergence, 96
$W(x; u, v)$ Wronskian of $u$ and $v$, 260
$G(x, t)$ Green's function, 253
$\nabla^2 = \Delta$ Laplacian, 298
$\Gamma(x)$ Euler's gamma function
$\mathscr{D}$, $\mathscr{D}(\mathbf{R}^N)$ space of test functions, 285
$\mathscr{D}'$, $\mathscr{D}'(\mathbf{R}^N)$ space of distributions, 287
$\Delta A$ uncertainty of $A$, 359
$L_x, L_y, L_z$ orbital angular momentum operators, 353, 396
$M_x, M_y, M_z$ general angular momentum operators, 399
$\sigma_x, \sigma_y, \sigma_z$ Pauli's spin matrices, 398
$J_x, J_y, J_z$ total angular momentum operators, 403
$p_j$ generalized momentum, 399
$q_j$ generalized coordinates, 336
$H$ Hamiltonian, 339
$\{A, B\}$ Poisson bracket of two functions $A$ and $B$, 343
$[A, B]$ commutator of two operators $A$ and $B$, 350
$\langle x|\psi\rangle = \psi(x)$ state vector, 347
$\langle\psi|x\rangle = \overline{\psi(x)}$ complex conjugate of $\psi(x)$, 347
$\Psi(x, t)$ time-dependent state vector, 363
$A$ observable operator, 349
$h$ Planck's constant, 345
$\hbar$ universal constant, 345
$\langle A\rangle$ expectation value of an operator $A$, 355
$dT(x, h)$, $T'(x)h$ Gâteaux derivative, 412
$\nabla f$ gradient of a functional, 413
$dT_x$, $T'x$ Fréchet derivative, 416
$T''$ second Fréchet derivative, 420
first variation, 438
space of polynomials of degree at most $n$, 455
Index
A
Abel's formula, 260 Abel's integral equation, 247 Abel's problem, 432 Absolutely convergent series, 21 Abstract minimization problem, 443 Action, 367 Adjoint boundary conditions, 251 Adjoint of a differential operator, 250 Adjoint operator, 149, 204 Admissible set, 425 Admittance, 269 Angular momentum, 344, 353 Angular momentum operator, 396, 400, 403 Annihilation operator, 392 Anomalous Zeeman effect, 400 Anti-derivative of a distribution, 293 Anti-Hermitian operator, 155 Anti-linear functional, 124 Apollonius' identity, 128 Approximate eigenvalue, 186 Approximation theory, 453 Associated Legendre functions, 254, 376 Associated Legendre operator, 254
Asymptotically stable solution, 461 Atomic units, 372 Autonomous dynamical system, 461
B Banach fixed point theorem, 30 Banach space, 19 Basic representation, 38 Basis, 10 Beppo Levi, 59 Bessel functions, 256 Bessel operator, 256 Bessel's equality and inequality, 104 Best approximation, 453 Bifurcation, 465 diagram, 466, 468, 470 points, 466 Biharmonic equation, 300, 323 Bilinear concomitant, 251 Bilinear functional, 143 Born, 367 Bosons, 402 Bounded bilinear functional, 143
Bounded linear mapping, 24 Bounded operator, 138 Bounded quadratic form, 144
C
Contraction mapping theorem, 30, 225 Control function, 447 Convergence absolute, 21 almost everywhere, 55 in norm, 53 in normed space, 12 pointwise, 12 strong, 25, 97 of test function, 286 uniform, 12, 25 weak, 97 weak distributional, 289 Convergence almost everywhere, 55 Convergence in norm, 53 Convergence in normed space, 12 Convergence of test function, 286 Convex functions, 423 Convex sets, 118 Convolution, 79 Convolution theorem, 196 Correspondence principle, 353 Cost functional, 447 Creation operator, 392 D
d'Alembert solution, 318 de Broglie wave, 370 Degenerate eigenvalue, 178 Degree of degeneracy, 178 Densely defined operator, 204 Dense subsets, 17 Derivative Frechet, 416 Gateaux, 412 second Frechet, 420 Derivatives of distributions, 288 Differential operator, 139 Diffusion equation, 299, 316 Dimension, 10, 348 Dirac delta distribution, 288 Dirichlet integral, 303 Dirichlet problem, 310 Dispersion relation, 368 Distributional solution, 296 Distributions, 287, 294 Domain, 23 Domain of a differential operator, 250 Dual space, 27
E
Ehrenfest's theorem, 375, 383 Eigenfunction, 176, 253 Eigenvalue, 176, 253 Eigenvalue space, 178 Eigenvector, 176 Elliptic functional, 147 Elliptic operator, 310 Equality almost everywhere, 54 Equation Abel integral, 247 Bessel, 256 biharmonic, 300, 323 continuity, 366 diffusion, 299 Euler-Lagrange, 428, 430, 435 Fredholm integral, 224, 232, 233 Hamilton, 340 Hamilton-Jacobi, 367 Heisenberg, 386 Helmholtz, 299, 300, 310 Hermite, 257 homogeneous integral, 224 integral, 223 Klein-Gordon, 300 matrix Riccati differential, 451 Lagrange, 336, 338, 431 Laguerre, 255 Laplace, 298, 319, 320 Legendre, 254 Navier-Stokes, 322 Newton 334 non-homogeneous integral, 224 non-homogeneous wave, 299 nonlinear Fredholm integral, 233 ordinary differential, 248, 268 partial differential, 295 Poisson, 299, 305 Schrodinger, 300, 362, 375 state, 452 Sturm-Liouville, 257 telegrapher, 299 Volterra integral, 223, 224, 236, 237, 239, 246 vorticity transport, 322 wave, 299, 300 Equation of continuity, 366 Equations of motion, 345 Equilibrium solution, 461 Equivalence of norms, 13 Euclidean norm, 11
Euclidean space, 92 Euler-Lagrange equations, 428, 431, 435, 436, 439 Existence and uniqueness of solution, 228, 229, 232, 233 Expectation value, 355 Extension of operators, 204 Extremum, 425
F Fatou's lemma, 61 Fejer's kernel, 114 Fermat's principle, 431 Fermions, 402 Finite dimensional operator, 173 Finite dimensional vector space, 10 Fixed point, 29, 224 Fixed point theorem, 30 Formally self-adjoint differential operator, 251 Fourier coefficients, 117 Fourier series, 117 Fourier transform, 193, 198 Frechet derivative, 416 Fredholm alternative, 229 Fredholm alternative for self-adjoint compact operators, 229 Fredholm equations, 224, 232, 233, 274 Fredholm operator, 151 Friedrich's first inequality, 311 Fubini's theorem, 78 Function associated Legendre, 254 Bessel, 256 characteristic, 38 control, 447 convex, 423 Dirac delta, 269, 288 eigen-, 176, 253 Green's, 263, 298 Hamilton's, 339, 431 Hamilton's principal, 367 Heaviside, 288 input, 269 Lagrangian, 335 Lebesgue integrable, 43, 75 locally integrable complex valued, 74 measurable, 70 momentum density, 372 momentum wave, 371 null, 52
output, 269 Rademacher, 110 series of integrable, 50 smooth, 285 square integrable, 74, 76 state, 349, 447 step, 38, 75 tent, 83 test, 285 transfer, 271 Walsh, 111 wave, 349 weight, 253 Functional, 27 anti-linear, 124 bilinear, 143 bounded bilinear, 143 coercive, 147 conjugate linear, 124 cost, 447 elliptic, 147 linear, 27 positive bilinear, 143 quadratic, 441 sesquilinear, 143 strictly positive, 143 symmetric bilinear, 143 Function of an operator, 191 Function spaces, 5 Fundamental matrix, 448 Fundamental solution, 297 G
Gateaux derivative, 412 Gegenbauer polynomials, 132 Generalized coordinates, 336 Generalized force, 339 Generalized Fourier coefficients, 106 Generalized Fourier series, 106 Generalized momentum, 339 Generator, 345, 380 Geodesic, 437 Gradient, 418 Gradient of a functional, 413 Gram-Schmidt orthonormalization process, 103 Graph, 23, 208 Graph of an operator, 23, 208 Green's first identity, 301 Green's function, 263, 298 Green's second identity, 301
Ground state energy, 392, 393 Group velocity, 369 H
Hamilton's canonical equation, 340, 386 Hamilton's function, 339, 431 Hamilton's principal function, 367 Hamilton's variational principle, 342 Hamilton-Jacobi's equation, 367 Hamiltonian, 339, 353, 431 Hamiltonian operator, 362 Hammerstein operator, 418 Heaviside function, 288 Heisenberg commutation relation, 351, 354 Heisenberg operator, 385 Heisenberg picture, 378, 384 Heisenberg's equation of motion, 385 Heisenberg's uncertainty principle, 359 Helmholtz equation, 299, 300, 306, 308 Hermite operator, 256 Hermite polynomials, 102, 256 Hermitian operator, 150 Hilbert, 87 Hilbert space, 93 Hilbert space isomorphism, 126 Hilbert-Schmidt theorem, 187 Hilbert transform, 275 Holder's inequality, 7 Holonomic system, 336 Homogeneous Dirichlet problem, 298 Homogeneous integral equation, 224 Homogeneous Neumann problem, 298 Homogeneous Volterra equation, 239 I
Idempotent operator, 167 Identity operator, 138 Image, 23 Implicit function theorem, 470 Index of performance, 447 Infinite dimensional vector space, 10 Inner product, 88 Inner product space, 88 Input function, 269 Integral equations, 223 Integral of a step function, 39 Integral operator, 140 Integral over an interval, 62 Interaction picture, 378, 389
Intrinsic angular momentum, 400 Inverse differential operators, 263 Inverse Fourier transform, 201 Inverse image, 23 Inverse operator, 155 Invertible operator, 155 Isometric operator, 159 Iterated kernels, 240
J Jacobian matrix, 414 Jacobi's identity, 404 Jacobi's operator, 255 Jacobi's polynomials, 132, 255
Linear harmonic oscillator, 390, 394 Linear independence, 9 Linear mapping, 23 Linear momentum, 334, 353 Linear operator, 138 Linear transformation, 138 Lions, 445 Lions-Stampacchia theorem, 445 Lipschitz's condition, 228, 277 Locally integrable complex valued functions, 74 Locally integrable functions, 62, 76 Lowest state energy, 392 Lyapounov's theorem, 463 M
K Kernel, 223 Kinetic energy, 334, 353 Klein-Gordon equation, 300 L
L1-norm, 51 L2-norm, 75 Lagrange identity, 260 Lagrange's equations of motion, 336, 431 Lagrangian, 431 Lagrangian function, 335 Laguerre operator, 255 Laguerre polynomials, 130, 255 Laplace equation, 298, 319, 320 Laplace operator, 298, 319, 320 Law of conservation of energy, 341 Lax, 148 Lax-Milgram theorem, 148 Least-square approximation, 455 Lebesgue, 37 Lebesgue dominated convergence theorem, 60 Lebesgue integrable functions, 43, 75 Lebesgue integral, 43 Lebesgue integral for complex valued functions, 72 Lebesgue measure, 68 Legendre equation, 254, 376 Legendre operator, 254 Legendre polynomials, 100, 102, 254, 376 Linear combinations, 9 Linear dependence, 9 Linear functional, 27
MacNeille, 38 Magnetic quantum number, 337 Matrix Riccati equation, 451 Maxwellian distribution, 395 Mean value theorem, 415 Measurable functions, 70 Measurable sets, 68 Measure, 68 Measurement, 349 Method of successive approximation, 225, 234 Mikusinski, 38 Milgram, 148 Minkowski's inequality, 8 Momentum density function, 372 Momentum wave function, 371 Monotone convergence theorem, 59 Multiple eigenvalue, 178 Multiplication operator, 140 Multiplicity, 178 Multiplier, 140 N Navier-Stokes equation, 322 Neumann, 87 Neumann problem, 298 Neumann series, 227 Newton's equation, 334 Newton's second law of motion, 334, 388 Non-degenerate eigenvalue, 178 Non-homogeneous equation, 224 Non-homogeneous Volterra equation, 240 Non-homogeneous wave equation, 299
Nonlinear Fredholm equation, 233 Non-separable Hilbert space, 125 Norm, 10, 51, 90, 91 Euclidean, 11 L1, 51 L2, 75 lp, 11 strictly convex, 127 sup, 138 uniform, 12 Normal operator, 158 Normed space, 11 Norm in inner product space, 91 Norm of a bounded bilinear functional, 143 Norm of a bounded quadratic form, 144 Norm of uniform convergence, 12 Null function, 52 Null operator, 138 Null set, 54 Null space, 23
O Observable operators, 350 Observables, 345 Observation, 349 One dimensional Schrodinger equation, 394 One-sided shift operator, 159 Open balls, 14 Open sets, 15 Operator, 23, 138 adjoint, 149, 204 adjoint of a densely defined, 204 adjoint of a differential, 250 angular momentum, 396, 400, 403 annihilation, 392 anti-Hermitian, 155 associated Legendre, 254 Bessel, 256 bounded, 138 Chebyshev, 254 closed, 208 closed unbounded, 208 commuting, 141 compact, 171 complementary projection, 168 completely continuous, 171, 176 creation, 392 densely defined, 204 differential, 139 elliptic, 310
finite dimensional, 173 formally self-adjoint differential, 251 Fredholm, 151 Hamiltonian, 362 Hammerstein, 418 Heisenberg, 385 Hermite, 256 Hermitian, 150 idempotent, 167 identity, 138 integral, 140 inverse, 155 inverse differential, 263 invertible, 155 isometric, 159 Jacobi, 255 Laguerre, 255 Laplace, 298, 319, 320 Legendre, 254 linear, 138 linear harmonic oscillator, 390, 394 linear momentum, 334, 353 multiplication, 140 normal, 158 null, 138 observable, 350 one-sided shift, 159 orbital angular momentum, 396 orthogonality of a projection, 169 positive, 161 positive definite, 166 projection, 166 quantum, 353 Schrodinger, 385 self-adjoint, 150, 206 square root of a positive, 165 strictly positive, 166 symmetric, 206 time-evolution, 379 total Hamiltonian, 389 two-sided shift, 214 unbounded, 203 unitary, 160 Optimal control problems, 447 Optimal error, 454 Optimal solution, 454 Optimal trajectory, 448 Optimization problems, 424 Orbital angular momentum operators, 396 Orbital quantum number, 377 Ordinary differential equations, 248, 268
Orthogonal complement, 117 Orthogonal decomposition, 121 Orthogonality of projection operators, 169 Orthogonal systems, 99 Orthogonal projection, 120 Orthogonal vectors, 92 Orthonormal basis, 124 Orthonormal polynomials, 458 Orthonormal sequence, 99 Orthonormal systems, 99 Outcome of quantum measurement, 358 Output function, 269
P Parallelogram law, 91 Parseval relation of Fourier transforms, 199, 201 Parseval's formula, 108 Partial differential equation, 295 Particle momentum, 369 Pauli's spin matrices, 398 Periodic boundary condition, 249 Periodic Sturm-Liouville system, 258 Picard's existence theorem, 228 Plancherel theorem, 202 Planck, 345 Planck's constant, 345 Planck's simple harmonic oscillator, 394 Point spectrum, 177 Pointwise convergence, 12 Poisson equation, 299, 305 Poisson's bracket, 343, 386 Polarization identity, 128, 144 Pontrjagin maximum principle, 448 Position, 353 Positive bilinear functional, 143 Positive definite operator, 166 Positive operator, 161 Potential energy, 334, 353 Pre-Hilbert space, 88 Principle of linearized stability, 462 Principle of quantization, 354 Principle of superposition, 306 Principal quantum numbers, 377 Probability current density, 366 Probability density, 365 Probability flux, 366 Product of two operators, 141 Projection onto S, 121 Projection operator, 166 Proper subspace, 5
Pythagorean formula, 92, 104
Q Quadratic form, 143 Quadratic functional, 441 Quantization, 354 Quantum number, 393 Quantum operators, 353 R Rademacher function, 110 Range, 23 Real vector space, 4 Regular distribution, 287 Regular points, 177 Regular Sturm-Liouville systems, 257 Relative extrema, 425 Representer of a functional, 124 Resolution of a pulse, 271 Resolvent, 177, 235 Riemann integrable functions, 64 Riemann integral, 64 Riemann-Lebesgue lemma, 195 Riesz, 58, 123 Riesz representation theorem, 123 Robin problem, 298 Rodrigues formula, 377 Root-mean-square deviation, 355, 359
S Scalars, 3 Schrodinger picture, 378 Schrodinger's equation, 300, 362 Schwartz, 283 Schwarz's inequality, 90 Second Frechet derivative, 420 Self-adjoint and formally self-adjoint differential operator, 251 Self-adjoint operator, 150, 206 Separability, 125 Separable Hilbert spaces, 124 Separable kernel, 242 Separable spaces, 124 Separated boundary conditions, 249 Sequence spaces, 6 Series of integrable function, 50 Sesquilinear functional, 143 Set of measure zero, 68
Simple eigenvalue, 178 Singular distribution, 287 Singular Sturm-Liouville system, 257 Smooth function, 285 Snell's law, 432 Sobolev, 97 Sobolev space, 96 Solution asymptotically stable, 461 bifurcation, 466 classical, 295 distributional, 296 equilibrium, 461 stable, 461 unstable, 461 weak, 296 Space Banach space, 19 C, 3 C([a, b]), 89 CN, 5 C0(R), 95 Hm, 96 Hm(Ω) = W2m(Ω), 97 l2, 6, 89 lp, 6 L1(R), 45 L2(R), 74 L2(RN), 75 L2([a, b]), 89 R, 3 Euclidean, 92 finite dimensional vector space, 10 Hilbert, 93 infinite dimensional, 10 infinite sequence, 6 inner product, 88 non-separable Hilbert, 125 pre-Hilbert, 88 separable Hilbert, 124 Sobolev, 96 test function, 285, 294 vector, 4 Space spanned by S, 10 Spectral theorem for self-adjoint compact operators, 189 Spectral theorem for unbounded operators, 211 Spectrum, 177 Sphere, 14 Spherical harmonics, 377 Spherically symmetric potential, 375
Spin, 400 Square integrable functions, 74, 76 Square root of an operator, 165 Stability criterion, 463 Stable solution, 461 Stampacchia, 445 Standard deviation, 359 State equation, 452 State function, 349, 447 States, 345 State transition matrix, 449 State vector, 347 Stationary point, 425 Stationary state, 362 Step function, 38, 75 Strictly convex norm, 127 Strictly positive functional, 143 Strictly positive operator, 166 Strong convergence, 25, 97 Sturm-Liouville systems, 257 Subspace, 5 Successive approximation, 234 Summability kernel, 113 Support, 39 Symmetric bilinear functional, 143 Symmetric operator, 206 Synthesis of a pulse, 271
T
Tautochronous motion, 247 Telegrapher equation, 299 Tent function, 83 Test function space, 285 Theorem Banach fixed point, 30 closed graph, 208 compatibility, 407 contraction mapping, 30, 225 convolution, 196 Ehrenfest, 375, 383 Fubini's, 78 Hilbert-Schmidt, 187 implicit function, 470 Lax-Milgram, 148 Lebesgue dominated convergence, 60 Lions-Stampacchia, 445 Lyapounov, 463 mean value, 415 monotone convergence, 59 orthogonal projection, 120
Picard existence, 228 Plancherel, 202 Riesz representation, 123 spectral, 189, 211 virial, 408 Weierstrass approximation, 17 Time-dependent Schrodinger equation, 363 Time-dependent state vector, 347 Time-evolution equation, 381 Time-evolution operator, 379 Time-invariant, 447 Total angular momentum, 403 Total energy, 335, 353 Total Hamilton operator, 389 Transfer function, 271 Transition matrix, 449 Triangle inequality, 11, 91 Trigonometric Fourier series, 112 Two-sided shift operator, 214
U Unbounded operators, 203 Uncertainty, 359 Uncertainty principle, 360 Uniform convergence, 12, 25 Unitary operator, 160 Unitary space, 88 Universal constant, 346 Unstable solution, 461
V Variables, 345 Variational inequalities, 443 Variational problems, 311, 411 Vector space, 4 Vector subspace, 5 Virial theorem, 408 Volterra equation, 236, 246 Volterra equation of the first kind, 223, 246 Volterra equation of the second kind, 224, 237 Vorticity transport equation, 322
W Walsh functions, 111 Wave equation, 299, 300, 318 Wave function, 349 Wave-particle duality, 370 Weak convergence, 97 Weak distributional convergence, 289 Weak solution, 296, 311 Weierstrass approximation theorem, 17 Weighted average, 357 Weight function, 253 Wronskian, 260
Z Zeeman effect, 400