Partial Differential Equations Analytical and Numerical Methods
Mark S. Gockenbach
Michigan Technological University
Houghton, Michigan

SIAM
Society for Industrial and Applied Mathematics
Philadelphia
Copyright © 2002 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688.

Library of Congress Cataloging-in-Publication Data

Gockenbach, Mark S.
Partial differential equations : analytical and numerical methods / Mark S. Gockenbach.
p. cm.
Includes bibliographical references and index.
ISBN 0-89871-518-0
1. Differential equations, Partial. I. Title.
QA377 .G63 2002
515'.353-dc21
2002029411

MATLAB is a registered trademark of The MathWorks, Inc.
Maple is a registered trademark of Waterloo Maple, Inc.
Mathematica is a registered trademark of Wolfram Research, Inc.
SIAM is a registered trademark.
Dedicated to my mother, Joy Gockenbach, and to the memory of my father, LeRoy Gockenbach.
Contents

Foreword
Preface
1 Classification of differential equations
2 Models in one dimension
    2.1 Heat flow in a bar; Fourier's law
        2.1.1 Boundary and initial conditions for the heat equation
        2.1.2 Steady-state heat flow
        2.1.3 Diffusion
    2.2 The hanging bar
        2.2.1 Boundary conditions for the hanging bar
    2.3 The wave equation for a vibrating string
    2.4 Suggestions for further reading
3 Essential linear algebra
    3.1 Linear systems as linear operator equations
    3.2 Existence and uniqueness of solutions to Ax = b
        3.2.1 Existence
        3.2.2 Uniqueness
        3.2.3 The Fredholm alternative
    3.3 Basis and dimension
    3.4 Orthogonal bases and projections
        3.4.1 The $L^2$ inner product
        3.4.2 The projection theorem
    3.5 Eigenvalues and eigenvectors of a symmetric matrix
        3.5.1 The transpose of a matrix and the dot product
        3.5.2 Special properties of symmetric matrices
        3.5.3 The spectral method for solving Ax = b
    3.6 Preview of methods for solving ODEs and PDEs
    3.7 Suggestions for further reading
4 Essential ordinary differential equations
    4.1 Converting a higher-order equation to a first-order system
    4.2 Solutions to some simple ODEs
        4.2.1 The general solution of a second-order homogeneous ODE with constant coefficients
        4.2.2 A special inhomogeneous second-order linear ODE
        4.2.3 First-order linear ODEs
    4.3 Linear systems with constant coefficients
        4.3.1 Homogeneous systems
        4.3.2 Inhomogeneous systems and variation of parameters
    4.4 Numerical methods for initial value problems
        4.4.1 Euler's method
        4.4.2 Improving on Euler's method: Runge-Kutta methods
        4.4.3 Numerical methods for systems of ODEs
        4.4.4 Automatic step control and Runge-Kutta-Fehlberg methods
    4.5 Stiff systems of ODEs
        4.5.1 A simple example of a stiff system
        4.5.2 The backward Euler method
    4.6 Green's functions
        4.6.1 The Green's function for a first-order linear ODE
        4.6.2 The Dirac delta function
        4.6.3 The Green's function for a second-order IVP
        4.6.4 Green's functions for PDEs
    4.7 Suggestions for further reading
5 Boundary value problems in statics
    5.1 The analogy between BVPs and linear algebraic systems
        5.1.1 A note about direct integration
    5.2 Introduction to the spectral method; eigenfunctions
        5.2.1 Eigenpairs of $-d^2/dx^2$ under Dirichlet conditions
        5.2.2 Representing functions in terms of eigenfunctions
        5.2.3 Eigenfunctions under other boundary conditions; other Fourier series
    5.3 Solving the BVP using Fourier series
        5.3.1 A special case
        5.3.2 The general case
        5.3.3 Other boundary conditions
        5.3.4 Inhomogeneous boundary conditions
        5.3.5 Summary
    5.4 Finite element methods for BVPs
        5.4.1 The principle of virtual work and the weak form of a BVP
        5.4.2 The equivalence of the strong and weak forms of the BVP
    5.5 The Galerkin method
    5.6 Piecewise polynomials and the finite element method
        5.6.1 Examples using piecewise linear finite elements
        5.6.2 Inhomogeneous Dirichlet conditions
    5.7 Green's functions for BVPs
        5.7.1 The Green's function and the inverse of a differential operator
    5.8 Suggestions for further reading
6 Heat flow and diffusion
    6.1 Fourier series methods for the heat equation
        6.1.1 The homogeneous heat equation
        6.1.2 Nondimensionalization
        6.1.3 The inhomogeneous heat equation
        6.1.4 Inhomogeneous boundary conditions
        6.1.5 Steady-state heat flow and diffusion
        6.1.6 Separation of variables
    6.2 Pure Neumann conditions and the Fourier cosine series
        6.2.1 One end insulated; mixed boundary conditions
        6.2.2 Both ends insulated; Neumann boundary conditions
        6.2.3 Pure Neumann conditions in a steady-state BVP
    6.3 Periodic boundary conditions and the full Fourier series
        6.3.1 Eigenpairs of $-d^2/dx^2$ under periodic boundary conditions
        6.3.2 Solving the BVP using the full Fourier series
        6.3.3 Solving the IBVP using the full Fourier series
    6.4 Finite element methods for the heat equation
        6.4.1 The method of lines for the heat equation
    6.5 Finite elements and Neumann conditions
        6.5.1 The weak form of a BVP with Neumann conditions
        6.5.2 Equivalence of the strong and weak forms of a BVP with Neumann conditions
        6.5.3 Piecewise linear finite elements with Neumann conditions
        6.5.4 Inhomogeneous Neumann conditions
        6.5.5 The finite element method for an IBVP with Neumann conditions
    6.6 Green's functions for the heat equation
        6.6.1 The Green's function for the one-dimensional heat equation under Dirichlet conditions
        6.6.2 Green's functions under other boundary conditions
    6.7 Suggestions for further reading
7 Waves
    7.1 The homogeneous wave equation without boundaries
    7.2 Fourier series methods for the wave equation
        7.2.1 Fourier series solutions of the homogeneous wave equation
        7.2.2 Fourier series solutions of the inhomogeneous wave equation
        7.2.3 Other boundary conditions
    7.3 Finite element methods for the wave equation
        7.3.1 The wave equation with Dirichlet conditions
        7.3.2 The wave equation under other boundary conditions
    7.4 Point sources and resonance
        7.4.1 The wave equation with a point source
        7.4.2 Another experiment leading to resonance
    7.5 Suggestions for further reading
8 Problems in multiple spatial dimensions
    8.1 Physical models in two or three spatial dimensions
        8.1.1 The divergence theorem
        8.1.2 The heat equation for a three-dimensional domain
        8.1.3 Boundary conditions for the three-dimensional heat equation
        8.1.4 The heat equation in a bar
        8.1.5 The heat equation in two dimensions
        8.1.6 The wave equation for a three-dimensional domain
        8.1.7 The wave equation in two dimensions
        8.1.8 Equilibrium problems and Laplace's equation
        8.1.9 Green's identities and the symmetry of the Laplacian
    8.2 Fourier series on a rectangular domain
        8.2.1 Dirichlet boundary conditions
        8.2.2 Solving a boundary value problem
        8.2.3 Time-dependent problems
        8.2.4 Other boundary conditions for the rectangle
        8.2.5 Neumann boundary conditions
        8.2.6 Dirichlet and Neumann problems for Laplace's equation
        8.2.7 Fourier series methods for a rectangular box in three dimensions
    8.3 Fourier series on a disk
        8.3.1 The Laplacian in polar coordinates
        8.3.2 Separation of variables in polar coordinates
        8.3.3 Bessel's equation
        8.3.4 Properties of the Bessel functions
        8.3.5 The eigenfunctions of the negative Laplacian on the disk
        8.3.6 Solving PDEs on a disk
    8.4 Finite elements in two dimensions
        8.4.1 The weak form of a BVP in multiple dimensions
        8.4.2 Galerkin's method
        8.4.3 Piecewise linear finite elements in two dimensions
        8.4.4 Finite elements and Neumann conditions
        8.4.5 Inhomogeneous boundary conditions
    8.5 Suggestions for further reading
9 More about Fourier series
    9.1 The complex Fourier series
        9.1.1 Complex inner products
        9.1.2 Orthogonality of the complex exponentials
        9.1.3 Representing functions with complex Fourier series
        9.1.4 The complex Fourier series of a real-valued function
    9.2 Fourier series and the FFT
        9.2.1 Using the trapezoidal rule to estimate Fourier coefficients
        9.2.2 The discrete Fourier transform
        9.2.3 A note about using packaged FFT routines
        9.2.4 Fast transforms and other boundary conditions; the discrete sine transform
        9.2.5 Computing the DST using the FFT
    9.3 Relationship of sine and cosine series to the full Fourier series
    9.4 Pointwise convergence of Fourier series
        9.4.1 Modes of convergence for sequences of functions
        9.4.2 Pointwise convergence of the complex Fourier series
    9.5 Uniform convergence of Fourier series
        9.5.1 Rate of decay of Fourier coefficients
        9.5.2 Uniform convergence
        9.5.3 A note about Gibbs's phenomenon
    9.6 Mean-square convergence of Fourier series
        9.6.1 The space $L^2(-\ell, \ell)$
        9.6.2 Mean-square convergence of Fourier series
        9.6.3 Cauchy sequences and completeness
    9.7 A note about general eigenvalue problems
    9.8 Suggestions for further reading
10 More about finite element methods
    10.1 Implementation of finite element methods
        10.1.1 Describing a triangulation
        10.1.2 Computing the stiffness matrix
        10.1.3 Computing the load vector
        10.1.4 Quadrature
    10.2 Solving sparse linear systems
        10.2.1 Gaussian elimination for dense systems
        10.2.2 Direct solution of banded systems
        10.2.3 Direct solution of general sparse systems
        10.2.4 Iterative solution of sparse linear systems
        10.2.5 The conjugate gradient algorithm
        10.2.6 Convergence of the CG algorithm
        10.2.7 Preconditioned CG
    10.3 An outline of the convergence theory for finite element methods
        10.3.1 The Sobolev space $H_0^1(\Omega)$
        10.3.2 Best approximation in the energy norm
        10.3.3 Approximation by piecewise polynomials
        10.3.4 Elliptic regularity and $L^2$ estimates
    10.4 Finite element methods for eigenvalue problems
    10.5 Suggestions for further reading
A Proof of Theorem 3.47
B Shifting the data in two dimensions
        B.0.1 Inhomogeneous Dirichlet conditions on a rectangle
        B.0.2 Inhomogeneous Neumann conditions on a rectangle
C Solutions to odd-numbered exercises
Bibliography
Index
Foreword

Newton and other seventeenth-century scientists introduced differential equations to describe the fundamental laws of physics. The intellectual and technological implications of this innovation are enormous: the differential equation description of physics is the foundation upon which all modern quantitative science and engineering rests. Differential equations describe relations between physical quantities (forces, masses, positions, etc.) and their rates of change with respect to space and time, i.e., their derivatives. To calculate the implications of these relations requires the "solution" or integration of the differential equation: since derivatives approximate the ratios of small changes, solution of differential equations amounts to the summing together of many small changes (arbitrarily many, in principle) to compute the effect of a large change.

Until roughly the middle of the twentieth century, this meant the derivation of formulas expressing solutions of differential equations as algebraic combinations of the elementary functions of calculus: polynomials, trigonometric functions, the logarithm and the exponential, and certain more exotic functions. Such formulas are feasible only for a limited selection of problems, and even then often involve infinite sums ("series") which can only be evaluated approximately, and whose evaluation was (until not so long ago) quite laborious. The digital revolution has changed all that. Fast computation, with inexpensive machines performing many millions of arithmetic operations per second, makes numerical methods for the solution of differential equations practical: these methods give useful approximate solutions to differential equations by literally adding up the effects of many microscopic changes to approximate the effect of a large change.

Numerical methods have been used to solve certain relatively "small" differential equations since the eighteenth century, notably those that occur in the computation of planetary orbits and other astronomical calculations. Systematic use of the computer has vastly expanded the range of differential equation models that can be coaxed into making predictions of scientific and engineering significance. Analytical techniques developed over the last three centuries to solve special problem classes have also attained new significance when viewed as numerical methods. Optimal design of ships and aircraft, prediction of underground fluid migration, synthesis of new drugs, and numerous other critical tasks all rely (sometimes tacitly) upon the modeling of complex phenomena by differential equations and the computational approximation of their solutions. Just as significantly, mathematical advances underlying computational techniques have fundamentally changed the ways in which scientists and engineers think about differential equations and their implications.
Until recently this sea change in theory and practice has enjoyed enjoyed little reflecteaching of of differential differential equations equations in undergraduate classes classes at tion in in the the teaching tion in undergraduate at universities. universities. While mention mention of of computer computer techniques techniques began began showing up in in textbooks published or or While showing up textbooks published revised in the 1970s, the view view of of the the subject propounded by by most textbooks would would revised in the 1970s, the subject propounded most textbooks have seemed conventional in the 1920s. The book you hold in your hands, along with aa few others published published in in recent recent years, years, notably notably Gil Introduction to to with few others Gil Strang's Strang's Introduction Applied Mathematics, Mathematics, represents represents aa new new approach differential equations Applied approach to to differential equations at at the the undergraduate level. level. It presents presents computation an integral integral part part of of undergraduate computation as as an of the the study study of differential equations. differential equations. It is not so so much that computational computational exercises must be part syllabus-this text of the syllabus—this text can can be used entirely entirely without any student student involvement in computation at taught that way would miss a great deal computation at all, though aa class taught that way deal of the the possible analysis and implementation possible impact. impact. Rather, Rather, the the concepts concepts underlying underlying the the analysis and implementation of of numerical methods assume an importance equal equal to to that of solutions solutions in terms of elementary functions. In series and and elementary In fact, many of these concepts are are equally effeceffective of the the series series expansion methods as This book book expansion methods as well. well. 
This tive in in explaining explaining the the workings workings of devotes considerable effort effort to "classical" methods, moddevotes considerable to these these "classical" methods, side side by by side side with with modern numerical numerical approaches the finite element method). method). The ern approaches (particularly (particularly the finite element The "classical" "classical" both aa means means to understand the the essential nature of of the series expansions expansions provide provide both series to understand essential nature the phenomena modeled by the equations, physical phenomena equations, and effective effective numerical methods for those special problems to which they apply. Perhaps surprisingly, some of the the most most important important concepts concepts in in the Perhaps surprisingly, some of the modern modern viewpoint on on differential differential equations equations are are algebraic: the ideas vector, vector vector space, viewpoint algebraic: the ideas of of vector, space, linear algebra algebra are are central, in the the development development of of and other components and other components of of linear central, even even in more conventional parts of the subject such as as series solutions. The The present present book book more conventional parts of the subject such series solutions. both theory theory and and computation, computation, just just as as uses linear unifying principle principle in in both uses linear algebra algebra as as aa unifying working scientists, engineers, and and mathematicians mathematicians do. do. working scientists, engineers, This book, book, along along with with aa number number of of others like it published in in recent recent years, years, difdifThis others like it published fers textbooks on differential differential equations in fers from earlier earlier undergraduate textbooks in yet yet another another respect. 
Especially in in the the middle of the last century, century, mathematical mathematical instrucrespect. Especially middle years years of the last instruction tion in in American universities tended to to relegate the physical context context for differential differential equations equations and other topics to the background. The "big three" differential differential equations of of science science and and engineering—the engineering-the Laplace, Laplace, wave, wave, and and heat heat equations, equations, to to which which tions the devoted—have appeared in many many texts the bulk bulk of of this this book book is is devoted-have appeared in texts with with at at most most aa curcursory nod nod to their physical physical origins meaning in in applications. applications. In In part, part, this this trend sory to their origins and and meaning trend reflected the the development the theory theory of equations as self-contained reflected development of of the of differential differential equations as aa self-contained arena of of mathematical mathematical research. research. This has been been extremely arena This development development has extremely fruitful, fruitful, and and is the the source of many many of new ideas ideas which which underlie underlie the the effectiveness effectiveness of of indeed indeed is source of of the the new modern numerical numerical methods. methods. However, However, it it has has also also led to generations modern led to generations of of textbooks textbooks which present present differential differential equations equations as as aa self-contained subject, at at most most distantly which self-contained subject, distantly related to the other intellectual disciplines disciplines in which differential equations play playaa related to the other intellectual in which differential equations crucial role. role. 
The present text, in contrast, includes physically and mathematically substantial derivations of each differential equation, often in several contexts, along with examples and homework problems which illustrate how differential equations really arise in science and engineering.

With the exception of a part of the chapter on ordinary differential equations which begins the book, this text concerns itself exclusively with linear problems, that is, problems for which sums and scalar multiples of solutions are also solutions, if not of the same problem then of a closely related problem. It might be argued that such a conventional choice is odd in a book which otherwise breaks with recent tradition in several important respects. Many if not most of the important problems which concern contemporary scientists and engineers are nonlinear. For example, almost all problems involving fluid flow and chemical reaction are nonlinear, and these phenomena and their simulation and analysis are of central concern to many modern engineering disciplines. While series solutions are restricted to linear problems, numerical methods are in principle just as applicable to nonlinear as to linear problems (though many technical challenges arise in making them work well). However, it is also true that not only are the classic linear differential equations (Laplace, wave, and heat) models for generic classes of physical phenomena (potentials, wave motion, and diffusion) to which many nonlinear processes belong, but also their solutions are the building blocks of methods for solving complex, nonlinear problems. Therefore the choice to concentrate on these linear problems seems very natural for a first course in differential equations.

The computational viewpoint in the theory and application of differential equations is very important to a large segment of SIAM membership, and indeed SIAM members have been in the forefront of its development. It is therefore gratifying that SIAM should play a leading role in bringing this important part of modern applied mathematics into the classroom by making the present volume available.

William W. Symes
Rice University
Preface

This introductory text on partial differential equations (PDEs) has several features that are not found in other texts at this level, including:

• equal emphasis on classical and modern techniques.
• the explicit use of the language and results of linear algebra.
• examples and exercises analyzing realistic experiments (with correct physical parameters and units).
• a recognition that mathematical software forms a part of the arsenal of both students and professional mathematicians.

In this preface, I will discuss these features and offer suggestions for getting the most out of the text.
Classical and modern techniques

Undergraduate courses on PDEs tend to focus on Fourier series methods and separation of variables. These techniques are still useful after two centuries because they offer a great deal of insight into those problems to which they apply. However, the subject of PDEs has driven much of the research in both pure and applied mathematics in the last century, and students ought to be exposed to some more modern techniques as well.

The limitation of the Fourier series technique is its restricted applicability: it can be used only for equations with constant coefficients and only on certain simple geometries. To complement the classical topic of Fourier series, I present the finite element method, a modern, powerful, and flexible approach to solving PDEs. Although many introductory texts include some discussion of finite elements (or finite differences, a competing computational methodology), the modern approach tends to receive less attention and a subordinate place in the exposition. In this text, I have put equal weight on Fourier series and finite elements.
Linear algebra

Both linear and nonlinear differential equations occur as models of physical phenomena of great importance in science and engineering. However, most introductory texts focus on linear equations, and mine is no exception. There are several reasons why this should be so. The study of PDEs is difficult, and it makes sense to begin with the simpler linear equations before moving on to the more difficult nonlinear equations. Moreover, linear equations are much better understood. Finally, much of what is known about nonlinear differential equations depends on the analysis of linear differential equations, so this material is prerequisite for moving on to nonlinear equations.

Because we focus on linear equations, linear algebra is extremely useful. Indeed, no discussion of Fourier series or finite element methods can be complete unless it puts the results in the proper linear algebraic framework. For example, both methods produce the best approximate solution from certain finite-dimensional subspaces, and the projection theorem is therefore central to both techniques. Symmetry is another key feature exploited by both methods.

While many texts de-emphasize the linear algebraic nature of the concepts and solution techniques, I have chosen to make it explicit. This decision, I believe, leads to a more cohesive course and a better preparation for future study. However, it presents certain challenges. Linear algebra does not seem to receive the attention it deserves in many engineering and science programs, and so many students will take a course based on this text without the "prerequisites." Therefore, I present a fairly complete overview of the necessary material in Chapter 3, Essential Linear Algebra.

Both faculty previewing this text and students taking a course from it will soon realize that there is too much material in Chapter 3 to cover thoroughly in the couple of weeks it can reasonably occupy in a semester course. From experience I know that conscientious students dislike moving so quickly through material that they cannot master it. However, one of the keys to using this text is to avoid getting bogged down in Chapter 3. Students should try to get from it the "big picture" and two essential ideas:

• How to compute a best approximation to a vector from a subspace, with and without an orthogonal basis (Section 3.4).
• How to solve a matrix-vector equation when the matrix is symmetric and its eigenvalues and eigenvectors are known (Section 3.5).
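For readers who like to see such ideas in computational form, the two ideas above can be sketched roughly as follows. This is only an illustrative Python sketch, not the presentation of Chapter 3: the function names (`dot`, `best_approximation`, `spectral_solve`) and the small test data are hypothetical, and only the simpler case of an orthogonal basis is shown.

```python
def dot(u, v):
    """Euclidean dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def best_approximation(v, orthogonal_basis):
    """Closest vector to v in the span of mutually orthogonal, nonzero vectors."""
    result = [0.0] * len(v)
    for u in orthogonal_basis:
        coeff = dot(v, u) / dot(u, u)  # component of v along u
        result = [r + coeff * ui for r, ui in zip(result, u)]
    return result


def spectral_solve(eigenpairs, b):
    """Solve A x = b for symmetric A from (eigenvalue, eigenvector) pairs.

    Assumes the eigenvectors are mutually orthogonal, span the whole space,
    and that no eigenvalue is zero.
    """
    x = [0.0] * len(b)
    for lam, u in eigenpairs:
        coeff = dot(b, u) / (lam * dot(u, u))  # expand b in eigenvectors, divide by eigenvalue
        x = [xi + coeff * ui for xi, ui in zip(x, u)]
    return x


# Projection of (1, 2, 3) onto the plane spanned by the first two coordinate axes.
projection = best_approximation([1.0, 2.0, 3.0], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# A = [[2, 1], [1, 2]] has eigenpairs (3, (1, 1)) and (1, (1, -1)).
solution = spectral_solve([(3.0, [1.0, 1.0]), (1.0, [1.0, -1.0])], [3.0, 0.0])

print(projection)
print(solution)
```

In this small example the projection is (1, 2, 0) and the solution is (2, -1), which can be confirmed by multiplying out A x. Neither routine forms or factors a matrix; each relies only on dot products with the orthogonal vectors.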
Having at least begun to grasp these ideas, students should move on to Chapter 4 even if some details are not clear. The concepts from linear algebra will become much clearer as they are used throughout the remainder of the text.¹ I have taught this course several times using this approach, and, although students often find it frustrating at the beginning, the results seem to be good.

¹ Also, Chapter 4 is much easier going than Chapter 3, a welcome contrast!

Realistic problems

The subject of PDEs is easier to grasp if one keeps in mind certain standard physical experiments modeled by the equations under consideration. I have used these models to introduce the equations and to aid in understanding their solutions. The models also show, of course, that the subject of PDEs is worth studying!

To make the applications as meaningful as possible, I have included many examples and exercises posed in terms of meaningful experiments with realistic physical parameters.
Software

There exists powerful mathematical software that can be used to illuminate the material presented in this book. Computer software is useful for at least three reasons:

• It removes the need to do tedious computations that are necessary to compute solutions. Just as a calculator eliminates the need to use a table and interpolation to compute a logarithm, a computer algebra system can eliminate the need to perform integration by parts several times in order to evaluate an integral. With the more mechanical obstacles removed, there is more time to focus on concepts.
• Problems that simply cannot be solved (in a reasonable time) by hand can often be done with the assistance of a computer. This allows for more interesting assignments.
• Graphical capabilities allow students to visualize the results of their computations, improving understanding and interpretation.

I expect students to use a software package such as MATLAB, Mathematica, or Maple to reproduce the examples from the text and to solve the exercises.

I prefer not to introduce a particular software package in the text itself, for at least two reasons. The explanation of the features and usage of the software can detract from the mathematics. Also, if the book is based on a particular software package, then it can be difficult to use with a different package. For these reasons, my text does not mention any software packages except in a few footnotes. However, since the use of software is, in my opinion, essential for a modern course, I have written tutorials for MATLAB, Mathematica, and Maple that explain the various capabilities of these programs that are relevant to this book. These tutorials appear on the accompanying CD.
Outline

The core material in this text is found in Chapters 5-7, which present Fourier series and finite element techniques for the three most important differential equations of mathematical physics: Laplace's equation, the heat equation, and the wave equation. Since the concepts themselves are hard enough, these chapters are restricted to problems in a single spatial dimension.

Several introductory chapters set the stage for this core. Chapter 1 briefly defines the basic terminology and notation that will be used in the text. Chapter 2 then derives the standard differential equations in one spatial dimension, in the process explaining the meaning of various physical parameters that appear in the equations and introducing the associated boundary conditions and initial conditions. Chapter 3, which has already been discussed above, presents the concepts and techniques from linear algebra that will be used in subsequent chapters. I want to reiterate that perhaps the most important key to using this text effectively is to move through Chapter 3 expeditiously. The rudimentary understanding that students obtain in going through Chapter 3 will grow as the concepts are used in the rest of the book.

Chapter 4 presents the background material on ordinary differential equations that is needed in later chapters. This chapter is much easier than the previous one, because much of the material is review for many students. Only the last two sections, on numerical methods and stiff systems, are likely to be new. Although the chapter is entitled Essential Ordinary Differential Equations, Section 4.3 is not formally prerequisite for the rest of the book. I included this material to give students a foundation for understanding stiff systems of ODEs (particularly, the stiff system arising from the heat equation). Similarly, Runge-Kutta schemes and automatic step control are not strictly needed. However, understanding a little about variable step size methods is useful if one tries to apply an "off-the-shelf" routine to a stiff system.

Chapter 8 extends the models and techniques developed in the first part of the book to two spatial dimensions (with some brief discussions of three dimensions).

The last two chapters provide a more in-depth treatment of Fourier series (Chapter 9) and finite elements (Chapter 10). In addition to the standard theory of Fourier series, Chapter 9 shows how to use the fast Fourier transform to efficiently compute Fourier series solutions of the PDEs, explains the relationships among the various types of Fourier series, and discusses the extent to which the Fourier series method can be extended to complicated geometries and equations with nonconstant coefficients. Sections 9.4-9.6 present a careful mathematical treatment of the convergence of Fourier series, and have a different flavor from the remainder of the book. In particular, they are less suited for an audience of science and engineering students, and have been included as a reference for the curious student.

Chapter 10 gives some advice on implementing finite element computations, discusses the solution of the resulting sparse linear systems, and briefly outlines the convergence theory for finite element methods. It also shows how to use finite elements to solve general eigenvalue problems. The tutorials on the accompanying CD include programs implementing two-dimensional finite element methods, as described in Section 10.1, in each of the supported software packages (MATLAB, Mathematica, and Maple). The sections on sparse systems and the convergence theory are both little more than outlines, pointing the students toward more advanced concepts. Both of these topics, of course, could easily justify a dedicated semester-long course, and I had no intention of going into detail. I hope that the material on implementation of finite elements (in Section 10.1) will encourage some students to experiment with two-dimensional calculations, which are already too tedious to carry out by hand. This sort of information seems to be lacking from most books accessible to students at this level.
Possible course outlines

In a one-semester course (42-45 class hours), I typically cover Chapters 1-7 and part of Chapter 8. I touch only lightly on the material concerning Green's functions and the Dirac delta function (Sections 4.6, 6.6, and 7.4.1), and sometimes omit Section 6.3, but cover the remainder of Chapters 1-7 carefully.

If an instructor wishes to cover a significant part of the material in Chapters 8-10, an obvious place to save time is in Chapter 4. I would suggest covering the needed material on ODEs on a "just-in-time" basis in the course of Chapters 5-7. This will definitely save time, since my presentation in Chapter 4 is more detailed than is really necessary. Chapter 2 can be given as a reading assignment, particularly for the intended audience of science and engineering students, who will typically be comfortable with the physical parameters appearing in the differential equations.
Acknowledgments

This book began when I was visiting Rice University in 1998-1999 and taught a course using the lecture notes of Professor William W. Symes. To satisfy my personal predilections, I rewrote the notes significantly, and for the convenience of myself and my students, I typeset them in the form of a book, which was the first version of this text. Although the final result bears, in some ways, little resemblance to Symes's original notes, I am indebted to him for the idea of recasting the undergraduate PDE course in more modern terms. His example was the inspiration for this project, and I benefited from his advice throughout the writing process.

I am also indebted to the students who have suffered through courses taught from early versions of this text. Many of them found errors, typographical and otherwise, that might otherwise have found their way into print.

I would like to thank Professors Gino Biondini, Yuji Kodoma, Robert Krasny, Yuan Lou, Fadil Santosa, and Paul Uhlig, all of whom read part or all of the text and offered helpful suggestions.

The various physical parameters used in the examples and exercises were derived (sometimes by interpolation) from tables in the CRC Handbook of Chemistry and Physics [35].

The graphs in this book were generated with MATLAB. For MATLAB product information, please contact:

The MathWorks, Inc.
3 Apple Hill Drive
Natick, MA 01760-2098 USA
Tel: 508-647-7000
Fax: 508-647-7101
E-mail: [email protected]
Web: www.mathworks.com
As mentioned above, the CD also supports the use of Mathematica and Maple. For Mathematica product information, contact:

Wolfram Research, Inc.
100 Trade Center Drive
Champaign, IL 61820-7237 USA
Tel: 800-965-3726
Fax: 217-398-0747
E-mail: [email protected]

For Maple product information, contact:

Waterloo Maple Inc.
57 Erb Street West
Waterloo, Ontario
Canada N2L 6C2
Tel: 800-267-6583
E-mail: [email protected]

Chapter 1

Classification of differential equations
Loosely speaking, a differential equation is an equation specifying a relation between the derivatives of a function or between one or more derivatives and the function itself. We will call the function appearing in such an equation the unknown function. We use this terminology because the typical task involving a differential equation, and the focus of this book, is to solve the differential equation, that is, to find a function whose derivatives are related as specified by the differential equation. In carrying out this task, everything else about the relation, other than the unknown function, is regarded as known. Any function satisfying the differential equation is called a solution. In other words, a solution of a differential equation is a function that, when substituted for the unknown function, causes the equation to be satisfied.

Differential equations fall into several natural and widely used categories:

1. Ordinary versus partial:
(a) If the unknown function has a single independent variable, say $t$, then the equation is an ordinary differential equation (ODE). In this case, only "ordinary" derivatives are involved. Examples of ODEs are

$$\frac{du}{dt} = 3u \tag{1.1}$$

and

$$a\frac{d^2u}{dt^2} + b\frac{du}{dt} + cu = f(t). \tag{1.2}$$

In the second example, $a$, $b$, and $c$ are regarded as known constants, and $f(t)$ as a known function of $t$. In both equations, the unknown is $u = u(t)$.
(b) If the unknown function has two or more independent variables, the equation is called a partial differential equation (PDE). Examples include

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{1.3}$$
and (1.4)
In (1.3), the unknown function is $u = u(x, y)$, while in (1.4), it is $u = u(x, t)$. In (1.4), $c$ is a known constant.

2. Order: The order of a differential equation is the order of the highest derivative appearing in the equation. Most differential equations arising in science and engineering are first or second order. Example (1.1) above is first order, while Examples (1.2), (1.3), and (1.4) are second order.
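To make the notion of a solution concrete, the following Python sketch substitutes a candidate function into (1.1) and another into (1.3) and evaluates how far each equation is from being satisfied, approximating the derivatives by central differences. The candidates $u(t) = 2e^{3t}$ and $u(x, y) = x^2 - y^2$, the helper names, and the step sizes are illustrative assumptions, not taken from the text; a small residual is numerical evidence, not a proof, that a function is a solution.

```python
import math


def derivative(f, t, h=1e-5):
    """Central-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def second_partial(f, x, y, axis, h=1e-4):
    """Central-difference approximation to a second partial derivative of f(x, y).

    axis = 0 differentiates twice in x; any other value differentiates twice in y.
    """
    if axis == 0:
        return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    return (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2


def u_ode(t):
    """Candidate solution of the first-order ODE (1.1), du/dt = 3u."""
    return 2.0 * math.exp(3.0 * t)


def u_pde(x, y):
    """Candidate solution of the second-order PDE (1.3), u_xx + u_yy = 0."""
    return x**2 - y**2


def ode_residual(t):
    """Left side minus right side of (1.1) for the candidate u_ode."""
    return derivative(u_ode, t) - 3.0 * u_ode(t)


def pde_residual(x, y):
    """Left side of (1.3) for the candidate u_pde."""
    return second_partial(u_pde, x, y, 0) + second_partial(u_pde, x, y, 1)


for t in (0.0, 0.5, 1.0):
    print(f"t = {t}: residual of (1.1) = {ode_residual(t):.2e}")
print(f"residual of (1.3) at (0.3, 0.7) = {pde_residual(0.3, 0.7):.2e}")
```

Both residuals come out close to zero, limited only by the finite-difference and rounding errors. Replacing `u_pde` by, say, $x^2 + y^2$ gives a residual near 4, so that function does not satisfy (1.3).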
3. Linear versus versus nonlinear: 3. Linear nonlinear:
(a) As the examples suggest, a differential equation has, on each side of the equals sign, algebraic expressions involving the unknown function and its derivatives, and possibly other functions and constants regarded as known. A differential equation is linear if those terms involving the unknown function contain only products of a single factor of the unknown function or one of its derivatives with other (known) constants or functions of the independent variables.² In linear differential equations, the unknown function and its derivatives do not appear raised to a power other than 1, or as the argument of a nonlinear function (like sin, exp, log, etc.).

For example, the general linear second-order ODE has the form

\[ a_2(t)\frac{d^2u}{dt^2} + a_1(t)\frac{du}{dt} + a_0(t)u = f(t). \]

Here we require that $a_2(t) \neq 0$ in order that the equation truly be of second order. Example (1.2) is a special case of this general equation. In a linear differential equation, the unknown or its derivatives can be multiplied by constants or by functions of the independent variable, but not by functions of the unknown.

As another example, the general linear second-order PDE in two independent variables is
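\[ a_{11}(x,y)\frac{\partial^2 u}{\partial x^2} + a_{12}(x,y)\frac{\partial^2 u}{\partial x\,\partial y} + a_{22}(x,y)\frac{\partial^2 u}{\partial y^2} + a_1(x,y)\frac{\partial u}{\partial x} + a_2(x,y)\frac{\partial u}{\partial y} + a_0(x,y)u = f(x,y). \]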
Examples (1.3) and (1.4) are of this form (although the independent variables are called $x$ and $t$ in (1.4), not $x$ and $y$). Not all of $a_{11}$, $a_{12}$, and $a_{22}$ can be zero in order for this equation to be second order.

A linear differential equation is homogeneous if the zero function is a solution. For example, $u(t) \equiv 0$ satisfies (1.1), $u(x,y) \equiv 0$ satisfies (1.3),
² In Section 3.1, we will give a more precise definition of linearity.
and $u(x,t) \equiv 0$ satisfies (1.4), so these are examples of homogeneous linear differential equations. A homogeneous linear equation has another important property: whenever $u$ and $v$ are solutions of the equation, so is $\alpha u + \beta v$ for all real numbers $\alpha$ and $\beta$. For example, suppose $u(t)$ and $v(t)$ are solutions of (1.1) (that is, $du/dt = 3u$ and $dv/dt = 3v$), and $w = \alpha u + \beta v$. Then

\[ \frac{dw}{dt} = \frac{d}{dt}\left[\alpha u + \beta v\right] = \alpha\frac{du}{dt} + \beta\frac{dv}{dt} = \alpha 3u + \beta 3v = 3(\alpha u + \beta v) = 3w. \]

Thus $w$ is also a solution of (1.1).

A linear differential equation which is not homogeneous is called inhomogeneous. For example
\[ \frac{du}{dt} = 3u + \sin(2\pi t) \tag{1.5} \]
is linear inhomogeneous, since $u(t) \equiv 0$ does not satisfy this equation. Example (1.2) might be homogeneous or not, depending on whether $f(t) \equiv 0$ or not.

It is always possible to group all terms involving the unknown function in a linear differential equation on the left-hand side and all terms not involving the unknown function on the right-hand side. For example, (1.5) is equivalent to
\[ \frac{du}{dt} - 3u = \sin(2\pi t). \tag{1.6} \]

"Equivalent to" in this context means that (1.5) and (1.6) have exactly the same solutions: if $u(t)$ solves one of these two equations, it solves the other as well. When the equation is written this way, with all of the terms involving the unknown on the left and those not involving the unknown on the right, homogeneity is easy to recognize: the equation is homogeneous if and only if the right-hand side is identically zero.

(b) Differential equations which are not linear are termed nonlinear. For example,

\[ \frac{d^2u}{dt^2} + u^2 = 0 \tag{1.7} \]

is nonlinear. This is clear, as the unknown function $u$ appears raised to the second power. Another way to see that (1.7) is nonlinear is as follows: if the equation were linear, it would be linear homogeneous, since the zero
function is a solution. However, suppose that $u(t)$ and $v(t)$ are nonzero solutions, so that

\[ \frac{d^2u}{dt^2} + u^2 = 0, \qquad \frac{d^2v}{dt^2} + v^2 = 0, \]
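and $w = u + v$. Then

\[
\begin{aligned}
\frac{d^2w}{dt^2} + w^2 &= \frac{d^2}{dt^2}\left[u + v\right] + (u + v)^2 \\
&= \frac{d^2u}{dt^2} + \frac{d^2v}{dt^2} + u^2 + 2uv + v^2 \\
&= \left(\frac{d^2u}{dt^2} + u^2\right) + \left(\frac{d^2v}{dt^2} + v^2\right) + 2uv \\
&= 2uv.
\end{aligned}
\]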
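The identity above can be spot-checked numerically. The sketch below is only an illustration: the particular functions $u(t) = -6/t^2$ and $v(t) = -6/(t+1)^2$ are not taken from the text (one can verify by hand that each satisfies (1.7) for $t > 0$), and the second derivative is approximated by a central difference, so the residuals are zero only up to discretization error.

```python
def u(t):
    # Assumed example solution of (1.7): u'' = -36/t^4 and u^2 = 36/t^4.
    return -6.0 / t**2

def v(t):
    # A second assumed solution of (1.7), obtained by shifting t.
    return -6.0 / (t + 1.0)**2

def w(t):
    # The sum of the two solutions.
    return u(t) + v(t)

def second_derivative(f, t, h=1e-3):
    # Central-difference approximation to f''(t).
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

def residual(f, t):
    # Left-hand side of (1.7), d^2 f/dt^2 + f^2, evaluated at t.
    return second_derivative(f, t) + f(t)**2

for t in (1.0, 2.0, 3.5):
    print(t, residual(u, t), residual(v, t), residual(w, t), 2.0 * u(t) * v(t))
```

At each sample point the residuals of `u` and `v` are close to zero, while the residual of `w` agrees with `2*u(t)*v(t)`, which is not zero.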
Thus $w$ does not satisfy the equation, so the equation must be nonlinear.

Both linear and nonlinear differential equations occur as models of physical phenomena of great importance in science and engineering. However, linear differential equations are much better understood, and most of what is known about nonlinear differential equations depends on the analysis of linear differential equations. This book will focus almost exclusively on the solution of linear differential equations.

4. Constant versus nonconstant coefficient: Linear differential equations like
have constant coefficients if the coefficients $m$, $c$, and $k$ are constants rather than (nonconstant) functions of the independent variable. The right-hand side $f(t)$ may depend on the independent variable; it is only the quantities which multiply the unknown function and its derivatives which must be constant, in order for the equation to have constant coefficients. Some techniques that are effective for constant-coefficient problems are very difficult to apply to problems with nonconstant coefficients.

As a convention, we will explicitly write the independent variable when a coefficient is a function rather than a constant. The only function in a differential equation for which we will omit the independent variable is the unknown function. Thus

\[ -\frac{d}{dx}\left[k(x)\frac{du}{dx}\right] = f(x) \]

is an ODE with a nonconstant coefficient, namely $k(x)$.
We will only apply the phrase "constant coefficients" to linear differential equations.
5. Scalar equation versus system of equations: A single differential equation in one unknown function will be referred to as a scalar equation (all of the examples we have seen to this point have been scalar equations). A system of differential equations consists of several equations for one or more unknown functions. Here is a system of three first-order linear, constant-coefficient ODEs for three unknown functions $x_1(t)$, $x_2(t)$, and $x_3(t)$:

\[
\begin{aligned}
\frac{dx_1}{dt} &= 2x_1 - x_2 + x_3, \\
\frac{dx_2}{dt} &= x_1 + x_2, \\
\frac{dx_3}{dt} &= -x_1 + x_2 - x_3.
\end{aligned}
\]
It is important to realize that, since a differential equation is an equation involving functions, a solution will satisfy the equation for all values of the independent variable(s), or at least all values in some restricted domain. The equation is to be interpreted as an identity, such as the familiar trigonometric identity

\[ \cos^2(t) + \sin^2(t) = 1. \tag{1.8} \]

To say that (1.8) is an identity is to say that it is satisfied for all values of the independent variable $t$. Similarly, $u(t) = e^{3t}$ is a solution of (1.1) because this function $u$ satisfies

\[ \frac{du}{dt}(t) = 3u(t) \]

for all values of $t$.

The careful reader will note the close analogy between the uses of many terms introduced in this section ("unknown function," "solution," "linear" vs. "nonlinear") and the uses of similar terms in discussing algebraic equations. The analogy between linear differential equations and linear algebraic systems is an important theme in this book.
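The statements in this section about solutions can be tested numerically at sample points. The following sketch is an illustration rather than part of the text: it approximates $du/dt$ by a central difference and checks that $u(t) = e^{3t}$ and a linear combination of solutions satisfy (1.1), while the zero function fails to satisfy (1.5). The second solution `v`, the coefficients passed to `combo`, and the sample points are arbitrary choices made here.

```python
import math

def derivative(f, t, h=1e-5):
    # Central-difference approximation to f'(t).
    return (f(t + h) - f(t - h)) / (2.0 * h)

def residual_1_1(f, t):
    # du/dt - 3u at t; this vanishes when f solves (1.1).
    return derivative(f, t) - 3.0 * f(t)

def residual_1_5(f, t):
    # du/dt - 3u - sin(2*pi*t) at t; this vanishes when f solves (1.5).
    return derivative(f, t) - 3.0 * f(t) - math.sin(2.0 * math.pi * t)

def u(t):
    return math.exp(3.0 * t)

def v(t):
    # Another solution of (1.1), chosen arbitrarily for the check.
    return -2.0 * math.exp(3.0 * t)

def combo(alpha, beta):
    # The linear combination alpha*u + beta*v.
    return lambda t: alpha * u(t) + beta * v(t)

def zero(t):
    return 0.0

w = combo(0.5, 4.0)
for t in (-1.0, 0.0, 0.5, 1.0):
    print(t, residual_1_1(u, t), residual_1_1(w, t))

# The zero function leaves a nonzero residual in (1.5), so (1.5) is inhomogeneous.
print(residual_1_5(zero, 0.25))
```

A check at finitely many points cannot prove that the equation holds as an identity; it can only fail to find a counterexample.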
Exercises

1. Classify each of the following differential equations according to the categories described in this chapter (ODE or PDE, linear or nonlinear, etc.):

(a) $\dfrac{dx}{dt} + tx = 0$

(b) $\dfrac{\partial u}{\partial t} - \dfrac{\partial^2 u}{\partial x^2} = (1 + t)\sin(x)$

(c)
2. Repeat Exercise 1 for the following equations:

(a) $\dfrac{\partial v}{\partial t} + 3\dfrac{\partial v}{\partial x} = x + 3t$

(b) $\dfrac{dx}{dt} + \dfrac{1}{t}x = 0$

(c) $\dfrac{\partial u}{\partial t} - \dfrac{\partial}{\partial x}\left[2u\dfrac{\partial u}{\partial x}\right] = 0$

3. Repeat Exercise 1 for the following equations:
(a) $\dfrac{d^2\theta}{dt^2} + \sin(\theta) = 0$

(b) $\dfrac{\partial u}{\partial t} - 2\dfrac{\partial u}{\partial x} = 1 + x$

(c)
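4. Repeat Exercise 1 for the following equations:

(a) $\dfrac{d^2x}{dt^2} - 2\dfrac{dx}{dt} + tx = 0$

(b) $\rho(x)c(x)\dfrac{\partial v}{\partial t} - \dfrac{\partial}{\partial x}\left(\kappa(x)\dfrac{\partial v}{\partial x}\right) = 0$

(c) $\dfrac{d^2x}{dt^2} + 2\dfrac{dx}{dt} + 3x = \sin(\pi t)$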
5. Determine whether each of the functions below is a solution of the corresponding differential equation in Exercise 1:
(a) x(t) = e!t (b) u(x,t) = tsin(x)
(c) $w(x, t) = t(1 - x)$
6. Determine whether each of the functions below is a solution of the corresponding differential equation in Exercise 2:

(a) $v(x, t) = tx$

(b) $x(t) = t$

(c) $u(x, t) = x^2 t$

7. Find a function $f(t)$ so that $u(t) = t\sin(t)$ is a solution of the ODE

\[ \frac{du}{dt} - \frac{1}{t}u = f(t), \quad t > 0. \]

Is there only one such function $f$? Why or why not?
8. Is Is there there aa constant constant f/ such such that that u(t) u(t] = = ee1t is of the is aa solution solution of the ODE ODE 8. d2 u dt 2
-
du 2 dt
+ 3u = f?
9. Suppose $u$ is a nonzero solution of a linear, homogeneous differential equation. What is another nonzero solution?

10. Suppose $u$ is a solution of

\[ a(t)\frac{d^2u}{dt^2} + b(t)\frac{du}{dt} + c(t)u = f(t) \tag{1.9} \]

and $v$ is a (nonzero) solution of

\[ a(t)\frac{d^2v}{dt^2} + b(t)\frac{dv}{dt} + c(t)v = 0. \]

Explain how to produce infinitely many different solutions of (1.9).
Chapter 2

Models in one dimension
In this chapter, we present several different and interesting physical processes that can be modeled by ODEs or PDEs. For now we restrict ourselves to phenomena that can be described (at least approximately) as occurring in a single spatial dimension: heat flow or mechanical vibration in a long, thin bar, vibration of a string, diffusion of chemicals in a pipe, and so forth. In Chapters 5-7, we will learn methods for solving the resulting differential equations.

In Chapter 8, we consider similar experiments occurring in multiple spatial dimensions.
2.1 Heat flow in a bar; Fourier's law
We begin by considering the distribution of heat energy (or, equivalently, of temperature) in a long, thin bar. We assume that the cross-sections of the bar are uniform, and that the temperature varies only in the longitudinal direction. In particular, we assume that the bar is perfectly insulated, except possibly at the ends, so that no heat escapes through the side. By making several simplifying assumptions, we derive a linear PDE whose solution is the temperature in the bar as a function of spatial position and time.

We show, when we treat multiple space dimensions in Chapter 8, that if the initial temperature is constant in each cross-section, and if any heat source depends only on the longitudinal coordinate, then all subsequent temperature distributions depend only on the longitudinal coordinate. Therefore, in this regard, there is no modeling error in adopting a one-dimensional model. (There is modeling error associated with some of the other assumptions we make.)

We will now derive the model, which is usually called the heat equation. We begin by defining a coordinate system (see Figure 2.1). The variable $x$ denotes position in the bar in the longitudinal direction, and we assume that one end of the bar is at $x = 0$. The length of the bar will be $\ell$, so the other end is at $x = \ell$. We denote by $A$ the area of a cross-section of the bar.

The temperature of the bar is determined by the amount of heat energy in
The fact that we do not know $E_0$ causes no difficulty, because we are going to model the change in the energy (and hence temperature).

Specifically, we examine the rate of change, with respect to time, of the heat energy in the part of the bar between $x$ and $x + \Delta x$. The total heat in $[x, x + \Delta x]$ is given by (2.2), so its rate of change is

\[ \frac{d}{dt}\int_x^{x+\Delta x} A\rho c\,u(s,t)\,ds \]

(the constant $E_0$ differentiates to zero). To work with this expression, we need the following result, which allows us to move the derivative past the integral sign:
Theorem 2.1. Let $F : (a,b) \times [c,d] \to \mathbf{R}$ and $\partial F/\partial x$ be continuous, and define $\phi : (a,b) \to \mathbf{R}$ by

\[ \phi(x) = \int_c^d F(x,y)\,dy. \]

Then $\phi$ is continuously differentiable, and

\[ \frac{d\phi}{dx}(x) = \int_c^d \frac{\partial F}{\partial x}(x,y)\,dy. \]

That is,

\[ \frac{d}{dx}\int_c^d F(x,y)\,dy = \int_c^d \frac{\partial F}{\partial x}(x,y)\,dy. \]
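As a numerical illustration of Theorem 2.1, the sketch below compares the two sides of the last identity for one smooth integrand. The choice $F(x,y) = \sin(xy)$ on $[c,d] = [0,1]$, the composite Simpson rule, and the central-difference step are all assumptions made for this example and are not taken from the text.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson rule on [a, b] with n (even) subintervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def F(x, y):
    # Assumed example integrand; F and dF/dx are continuous everywhere.
    return math.sin(x * y)

def dF_dx(x, y):
    # Partial derivative of F with respect to x.
    return y * math.cos(x * y)

c, d = 0.0, 1.0

def phi(x):
    # phi(x) = integral of F(x, y) over y in [c, d].
    return simpson(lambda y: F(x, y), c, d)

def dphi_dx_numeric(x, h=1e-5):
    # Left-hand side: differentiate the integral (central difference).
    return (phi(x + h) - phi(x - h)) / (2.0 * h)

def integral_of_dF_dx(x):
    # Right-hand side: integrate the partial derivative.
    return simpson(lambda y: dF_dx(x, y), c, d)

for x in (0.5, 1.3, 2.0):
    print(x, dphi_dx_numeric(x), integral_of_dF_dx(x))
```

For this integrand the two columns agree to many digits, as the theorem predicts; both also match the closed form $(x\sin x - 1 + \cos x)/x^2$ obtained by differentiating $\phi(x) = (1 - \cos x)/x$.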
By Theorem 2.1, we have

\[ \frac{d}{dt}\int_x^{x+\Delta x} A\rho c\,u(s,t)\,ds = \int_x^{x+\Delta x} A\rho c\,\frac{\partial u}{\partial t}(s,t)\,ds. \tag{2.3} \]
If we assume that there are no internal sources or sinks of heat, the heat contained in $[x, x + \Delta x]$ can change only because heat flows through the cross-sections at $x$ and $x + \Delta x$. The rate at which heat flows through the cross-section at $x$ is called the heat flux, and is denoted $q(x,t)$. It has units of energy per unit area per unit time, and is a signed quantity: if heat energy is flowing in the positive $x$-direction, then $q$ is positive.

The net heat entering $[x, x + \Delta x]$ at time $t$, through the two cross-sections, is

\[ A\left(q(x,t) - q(x + \Delta x, t)\right) = -\int_x^{x+\Delta x} A\,\frac{\partial q}{\partial x}(s,t)\,ds; \tag{2.4} \]
the last equation follows from the fundamental theorem of calculus. (See Figure 2.2.) Equating (2.3) and (2.4) yields

\[ \int_x^{x+\Delta x} A\rho c\,\frac{\partial u}{\partial t}(s,t)\,ds = -\int_x^{x+\Delta x} A\,\frac{\partial q}{\partial x}(s,t)\,ds \]

or

\[ \int_x^{x+\Delta x}\left\{A\rho c\,\frac{\partial u}{\partial t}(s,t) + A\,\frac{\partial q}{\partial x}(s,t)\right\}ds = 0. \]
Since this holds for all $x$ and $\Delta x$, it follows that the integrand must be zero:

\[ A\rho c\,\frac{\partial u}{\partial t}(x,t) + A\,\frac{\partial q}{\partial x}(x,t) = 0, \quad 0 < x < \ell. \tag{2.5} \]

(The key point here is that the integral above equals zero over every small interval. This is only possible if the integrand is itself zero. If the integrand were positive, say, at some point in $[0, \ell]$, and were continuous, then the integral over a small interval around that point would be positive.)
Figure 2.2. The heat flux into and out of a part of the bar.

Naturally, heat will flow from regions of higher temperature to regions of lower temperature; note that the sign of $\partial u/\partial x$ indicates whether temperature is increasing or decreasing as $x$ increases, and hence indicates the direction of heat flow. We now make a simplifying assumption, which is called Fourier's law of heat conduction: the heat flux $q$ is proportional to the temperature gradient, $\partial u/\partial x$. We call the magnitude of the constant of proportionality the thermal conductivity $\kappa$, and obtain the equation

\[ q(x,t) = -\kappa\,\frac{\partial u}{\partial x}(x,t) \tag{2.6} \]

(the units of $\kappa$ can be determined from the units of the heat flux and those of the temperature gradient; see Exercise 2.1.1). The negative sign is necessary so that heat flows from hot regions to cold. Substituting Fourier's law into the differential equation (2.5), we can eliminate $q$ and find a PDE for $u$:

\[ \rho c\,\frac{\partial u}{\partial t} - \kappa\,\frac{\partial^2 u}{\partial x^2} = 0, \quad 0 < x < \ell, \text{ for all } t. \tag{2.7} \]

(We have canceled a common factor of $A$.) Here we have assumed that $\kappa$ is constant, which would be true if the bar were homogeneous. It is possible that $\kappa$ depends on $x$ (in which case $\rho$ and $c$ probably do as well); then we obtain
\[ \rho(x)c(x)\,\frac{\partial u}{\partial t} - \frac{\partial}{\partial x}\left(\kappa(x)\,\frac{\partial u}{\partial x}\right) = 0, \quad 0 < x < \ell, \text{ for all } t. \]

We call (2.7) the heat equation.

If the bar contains internal sources (or sinks) of heat (such as chemical reactions that produce heat), we collect all such sources into a single source function
$f(x,t)$ (in units of heat energy per unit time per unit volume). Then the total heat added to $[x, x + \Delta x]$ during the time interval $[t, t + \Delta t]$ is

\[ \int_t^{t+\Delta t}\int_x^{x+\Delta x} A f(s,\tau)\,ds\,d\tau. \]

The time rate of change of this contribution to the total heat energy (at time $t$) is

\[ \int_x^{x+\Delta x} A f(s,t)\,ds. \tag{2.8} \]
The rate of change of total heat in $[x, x + \Delta x]$ is now given by the sum of (2.4) and (2.8), so we obtain, by the above reasoning, the inhomogeneous³ heat equation

\[ \rho c\,\frac{\partial u}{\partial t} - \kappa\,\frac{\partial^2 u}{\partial x^2} = f(x,t), \quad 0 < x < \ell, \text{ for all } t. \tag{2.9} \]

2.1.1 Boundary and initial conditions for the heat equation
The heat equation by itself is not a complete model of heat flow. We must know how heat flows through the ends of the bar, and we must know the temperature distribution in the bar at some initial time.

Two possible boundary conditions for heat flow in a bar correspond to perfect insulation and perfect thermal contact. If the ends of the bar are perfectly insulated, so that the heat flux across the ends is zero, then we have

\[ -\kappa\,\frac{\partial u}{\partial x}(0,t) = 0, \quad -\kappa\,\frac{\partial u}{\partial x}(\ell,t) = 0 \quad \text{for all } t \]

(no heat flows into the left end or out of the right end). On the other hand, if the ends of the bar are kept fixed at temperature zero⁴ (through perfect thermal contact with an ice bath, for instance), we obtain
\[ u(0,t) = u(\ell,t) = 0 \quad \text{for all } t. \]

Either of these boundary conditions can be inhomogeneous (that is, have a nonzero right-hand side), and we could, of course, have mixed conditions (one end insulated, the other held at fixed temperature).

A boundary condition that specifies the value of the solution is called a Dirichlet condition, while a condition that specifies the value of the derivative is called a Neumann condition. A problem with a Dirichlet condition at one end and a Neumann condition at the other is said to have mixed boundary conditions. As noted
³ The term homogeneous is used in two completely different ways in this section and throughout the book. A material can be (physically) homogeneous, which implies that the coefficients in the differential equations will be constants. On the other hand, a linear differential equation can be (mathematically) homogeneous, which means that the right-hand side is zero. These two uses of the word homogeneous are unrelated and potentially confusing, but the usage is standard and so the reader must understand from context the sense in which the word is used.

⁴ Since changing $u$ by an additive constant does not affect the differential equation, we can as well use Celsius as the temperature scale rather than Kelvin. (See Exercise 6.)
above, we will also use the terms homogeneous and inhomogeneous to refer to the boundary conditions. The condition $u(\ell) = 10$ is called an inhomogeneous Dirichlet condition, for example.

To completely determine the temperature as a function of time and space, we must know the temperature distribution at some initial time $t_0$. This is an initial condition:

\[ u(x, t_0) = \psi(x), \quad 0 < x < \ell. \]

Putting together the PDE, the boundary conditions, and the initial condition, we obtain an initial-boundary value problem (IBVP) for the heat equation. For example, with both ends of the bar held at fixed temperatures, we have the IBVP
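\[
\begin{aligned}
\rho c\,\frac{\partial u}{\partial t} - \kappa\,\frac{\partial^2 u}{\partial x^2} &= f(x,t), \quad 0 < x < \ell,\ t > t_0, \\
u(x, t_0) &= \psi(x), \quad 0 < x < \ell, \\
u(0,t) &= 0, \quad t > t_0, \\
u(\ell,t) &= 0, \quad t > t_0.
\end{aligned}
\tag{2.10}
\]

2.1.2 Steady-state heat flow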
An special case case modeled modeled by by the equation is is steady-state An important important special the heat heat equation steady-state heat heat flow—a flow--a situation in in which is constant constant with to time time (although (although not situation which the the temperature temperature is with respect respect to not necessarily with respect to this case, be necessarily with respect to space). space). In In this case, the the temperature temperature function function uu can can be thought of partial derivative thought of as as aa function function of of xx alone, alone, uu = — u(x), u(x), the the partial derivative with with respect respect to to tt is is zero, zero, and and the the differential differential equation equation becomes
J2u
-'" dx 2 = f(x), 0 < x < £. In the steady-state case, case, any any source source term must be of time In the steady-state term /f must be independent independent of time (otherwise, (otherwise, the equation equation could could not not possibly possibly be satisfied by by aa function function uu that is independent independent of the be satisfied that is of time). Boundary conditions have have the the same same meaning meaning as as in in the the time-dependent time-dependent case, case, time). Boundary conditions although the boundary although in in the the case case of of inhomogeneous inhomogeneous boundary boundary conditions, conditions, the boundary data data must must be constant. constant. On On the the other other hand, hand, it it obviously obviously does does not not make sense sense to to impose impose an temperature distribution. an initial initial condition condition on on aa steady-state steady-state temperature distribution. Collecting these these observations, observations, we see that that aa steady-state steady-state heat we see heat flow flow problem problem Collecting takes the form of of aa boundary boundary value value problem (BVP). For For example, example, if if the temperature takes the form problem (BVP). the temperature is fixed fixed at the two endpoints of the bar, we we have the following following Dirichlet problem: problem:
\[
\begin{aligned}
-\kappa \frac{d^2 u}{dx^2} &= f(x), \quad 0 < x < \ell, \\
u(0) &= 0, \\
u(\ell) &= 0.
\end{aligned}
\tag{2.11}
\]
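Because (2.11) is an ODE, it can be integrated twice in closed form when the source is simple. The following is a minimal sketch under stated assumptions: the source is a constant `f0`, and the end temperatures `left` and `right` are allowed to be nonzero constants (a generalization of the homogeneous data in (2.11)). The function name, parameter names, and sample values are chosen here for illustration only.

```python
def steady_state_temperature(x, kappa, length, f0=0.0, left=0.0, right=0.0):
    """Solve -kappa*u'' = f0 on (0, length) with u(0) = left, u(length) = right.

    Assumes a constant source f0. Integrating twice gives a quadratic in x;
    the two constants of integration are fixed by the boundary values.
    """
    # Straight line joining the two boundary temperatures (the f0 = 0 solution).
    linear = left + (right - left) * x / length
    # Correction due to the source; it vanishes at x = 0 and x = length.
    bump = f0 / (2.0 * kappa) * x * (length - x)
    return linear + bump


# Illustrative values: conductivity 1.5, length 100, ends held at 20 and 30.
midpoint = steady_state_temperature(50.0, kappa=1.5, length=100.0,
                                    left=20.0, right=30.0)
```

With `f0 = 0` the profile is a straight line between the two end temperatures; a positive constant source adds a parabolic bump that is largest at the middle of the bar.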
We remark that, when only one spatial dimension is taken into account, a steady-state (or equilibrium) problem results in an ODE rather than a PDE. Moreover, these problems, at least in their simplest form, can be solved directly by two integrations. Nevertheless, we will (in Chapter 5) devote a significant amount of effort toward developing methods for solving these ODEs, since the techniques can be generalized to multiple spatial dimensions, and also form the foundation for techniques for solving time-dependent problems.

Example 2.2. The thermal conductivity of an aluminum alloy is 1.5 W/(cm K). We will calculate the steady-state temperature of an aluminum bar of length 1 m (insulated along the sides) with its left end fixed at 20 degrees Celsius and its right end fixed at 30 degrees. If we write $u = u(x)$ for the steady-state temperature, then $u$ satisfies the BVP

\[
\begin{aligned}
-1.5 \frac{d^2 u}{dx^2} &= 0, \quad 0 < x < 100, \\
u(0) &= 20, \\
u(100) &= 30.
\end{aligned}
\]

(We changed the units of the length of the bar to centimeters, so as to be consistent with the units of the thermal conductivity.) The differential equation implies that $d^2u/dx^2$ is zero, so, integrating once, $du/dx$ is constant, say

\[ \frac{du}{dx}(x) = C_1, \quad 0 < x < 100. \]

[...] $D > 0$ such that, at time $t$, the chemical moves across the cross-section at $x$ at a rate of
\[ -D \frac{\partial u}{\partial x}(x,t). \]
The units of this mass flux are mass per area per time (for example, g/(cm$^2$ s)). Since the units of $\partial u/\partial x$ are mass/length$^4$, the diffusion coefficient $D$ must have units of area per time (for example, cm$^2$/s).

With this assumption, the equation modeling diffusion is derived exactly as was the heat equation, with a similar result. The total amount of the chemical contained in the part of the pipe between $x$ and $x + \Delta x$ is given by (2.12), so the rate at which this total mass is changing is

\[ \frac{\partial}{\partial t} \int_x^{x+\Delta x} A u(s,t)\, ds = \int_x^{x+\Delta x} A \frac{\partial u}{\partial t}(s,t)\, ds. \]

This same quantity can be computed from the fluxes at the two cross-sections at $x$ and $x + \Delta x$. At the left, mass is entering at a rate of
\[ -AD \frac{\partial u}{\partial x}(x,t), \]

while at the right it enters at a rate of

\[ AD \frac{\partial u}{\partial x}(x + \Delta x, t). \]
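We therefore have

\[
\begin{aligned}
\int_x^{x+\Delta x} A \frac{\partial u}{\partial t}(s,t)\, ds
&= -AD \frac{\partial u}{\partial x}(x,t) + AD \frac{\partial u}{\partial x}(x + \Delta x, t) \\
&= \int_x^{x+\Delta x} AD \frac{\partial^2 u}{\partial x^2}(s,t)\, ds.
\end{aligned}
\]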
The result is the diffusion equation:

\[ \frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2}, \quad 0 < x < \ell, \ t > t_0. \]

If the chemical is added to the interior of the pipe, this can be accounted for by a function $f(x,t)$, where mass is added to the part of the pipe between $x$ and $x + \Delta x$ at a rate of

\[ \int_x^{x+\Delta x} A f(s,t)\, ds \]

(mass per unit time; $f$ has units of mass per volume per time). We then obtain the inhomogeneous diffusion equation:

\[ \frac{\partial u}{\partial t} - D \frac{\partial^2 u}{\partial x^2} = f(x,t), \quad 0 < x < \ell, \ t > t_0. \]

$^6$ This constant varies with temperature and pressure; see the CRC Handbook of Chemistry and Physics [35], page 6-179.
Just as in the case of heat flow, we can consider steady-state diffusion. The result is the ODE

\[ -D \frac{d^2 u}{dx^2} = f(x), \quad 0 < x < \ell, \]

with appropriate boundary conditions. Boundary conditions for the diffusion equation are explored in the exercises.
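One way to see how the steady-state ODE relates to the time-dependent diffusion equation is to march the latter forward in time until the concentration stops changing. The sketch below is illustrative only and rests on several assumptions that are not part of the text: there is no source ($f = 0$), the concentrations at both ends are held fixed, and the PDE is approximated by a simple explicit finite-difference scheme. The function name, grid, and parameter values are hypothetical.

```python
def diffuse(u, D, dx, dt, steps):
    """Advance u_t = D*u_xx with an explicit finite-difference scheme.

    u holds concentrations at equally spaced points; the two end values
    are held fixed (Dirichlet conditions). Returns a new list.
    """
    r = D * dt / dx**2
    if r > 0.5:
        # The explicit scheme is unstable for larger time steps.
        raise ValueError("explicit scheme needs D*dt/dx**2 <= 0.5")
    u = list(u)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            # Discrete form of du/dt = D * d2u/dx2 at interior point i.
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return u


# Hypothetical data: pipe of length 1, D = 1, ends held at 0 and 1.
n = 10
dx = 1.0 / n
start = [0.0] * n + [1.0]
profile = diffuse(start, D=1.0, dx=dx, dt=0.004, steps=2000)
steady = [i * dx for i in range(n + 1)]  # straight line solving -D*u'' = 0
error = max(abs(p - s) for p, s in zip(profile, steady))
```

Under these assumptions the computed profile settles onto the straight line joining the two end concentrations, which is the solution of $-D\,d^2u/dx^2 = 0$ with the same boundary values.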
Exercises

1. Determine the units of the thermal conductivity $\kappa$ from (2.6).

2. In the CRC Handbook of Chemistry and Physics [35], there is a table labeled "Heat Capacity of Selected Solids," which "gives the molar heat capacity at constant pressure of representative metals ... as a function of temperature in the range 200 to 600 K" ([35], page 12-190). For example, the entry for iron is as follows:

   Temp. (K)        200    250    300    350    400    500    600
   c (J/mole·K)     21.59  23.74  25.15  26.28  27.39  29.70  32.05

   As this table indicates, the specific heat of a material depends on its temperature. How would the heat equation change if we did not ignore the dependence of the specific heat on temperature?

3. Verify that the integral in (2.1) has units of energy.

4. Suppose $u$ represents the temperature distribution in a homogeneous bar, as discussed in this section, and assume that both ends of the bar are perfectly insulated.
   (a) What is the IBVP modeling this situation?
   (b) Show (mathematically) that the total heat energy in the bar is constant with respect to time. (Of course, this is obvious from a physical point of view. The fact that the mathematical model implies that the total heat energy is constant is one confirmation that the model is not completely divorced from reality.)

5. Suppose we have a means of "pumping" heat energy into a bar through one of the ends. If we add $r$ Joules per second through the end at $x = \ell$, what would the corresponding boundary condition be?

6. In our derivation of the heat equation, we assumed that temperature was measured on the Kelvin scale. Explain what changes must be made (in the PDE, initial condition, or boundary conditions) to use degrees Celsius instead.
7. The thermal conductivity of iron is 0.802 W/(cm K). Consider an iron bar of length 1 m and radius 1 cm, with the lateral side completely insulated, and assume that the temperature of one end of the bar is held fixed at 20 degrees Celsius, while the temperature of the other end is held fixed at 30 degrees. Assume that no heat energy is added to or removed from the interior of the bar.
   (a) What is the (steady-state) temperature distribution in the bar?
   (b) At what rate is heat energy flowing through the bar?

8. (a) Show that the function

   \[ u(x,t) = e^{-\kappa \theta^2 t/(\rho c)} \sin(\theta x) \]

   is a solution to the homogeneous heat equation

   \[ \rho c \frac{\partial u}{\partial t} - \kappa \frac{\partial^2 u}{\partial x^2} = 0, \quad 0 < x < \ell, \ \text{for all } t. \]
   (b) What values of $\theta$ will cause $u$ to also satisfy homogeneous Dirichlet conditions at $x = 0$ and $x = \ell$?

9. In this exercise, we consider a boundary condition for a bar that may be more realistic than a simple Dirichlet condition. Assume that, as usual, the side of a bar is completely insulated and that the ends are placed in a bath maintained at constant temperature. Assume that the heat flows out of or into the ends in accordance with Newton's law of cooling: the heat flux is proportional to the difference in temperature between the end of the bar and the surrounding medium. What are the resulting boundary conditions?

10. Derive the heat equation from Newton's law of cooling (cf. the previous exercise) as follows: Divide the bar into a large number $n$ of equal pieces, each of length $\Delta x$. Approximate the temperature in the $i$th piece as a function $u_i(t)$ (thus assuming that the temperature in each piece is constant at each point in time). Write down a coupled system of ODEs for $u_1(t), u_2(t), \ldots, u_n(t)$ by applying Newton's law of cooling to each piece and its nearest neighbors. Assume that the bar is homogeneous, so that the material properties $\rho$, $c$, and $\kappa$ are constant. Take the limit as $\Delta x \to 0$, and show that the result is the heat equation.

11. Suppose a chemical is diffusing in a pipe, and both ends of the pipe are sealed. What are the appropriate boundary conditions for the diffusion equation? What initial conditions are required? Write down a complete IBVP for the diffusion equation under these conditions.

12. Suppose that a chemical contained in a pipe of length $\ell$ has an initial concentration distribution of $u(x,0) = \psi(x)$. At time zero, the ends of the pipe are sealed, and no mass is added to or removed from the interior of the pipe.
   (a) Write down the IBVP describing the diffusion of the chemical.
   (b) Show mathematically that the total mass of the chemical in the pipe is constant. (Derive this fact from the equations rather than from common sense.)
   (c) Describe the ultimate steady-state concentration.
   (d) Give a formula for the steady-state concentration in terms of $\psi$.

13. Suppose a pipe of length $\ell$ and radius $r$ joins two large reservoirs, each containing a (well-mixed) solution of the same chemical. Let the concentration in one reservoir be $u_0$ and in the other be $u_\ell$, and assume that $u(0,t) = u_0$ and $u(\ell,t) = u_\ell$.
   (a) Find the steady-state rate at which the chemical diffuses through the pipe (in units of mass per time).
   (b) How does this rate vary with the length $\ell$ and the radius $r$?

14. Consider the previous exercise, in which the chemical is carbon monoxide (CO) and the solution is CO in air. Suppose that $u_0 = 0.01$ and $u_\ell = 0.015$, and that the diffusion coefficient of CO in air is 0.208 cm$^2$/s. If the pipe is 1 m long and its radius is 2 cm, find the steady-state rate at which CO diffuses through the pipe (in units of mass per time).

15. Verify that Theorem 2.1 holds for $(a,b) \times (c,d) = (0,1) \times [0,1]$ and $F$ defined by $F(x,y) = \cos(xy)$.
2.2 The hanging bar
Suppose that a bar, with uniform cross-sectional area $A$ and length $\ell$, hangs vertically, and it stretches due to a force (perhaps gravity) acting upon it. We assume that the deformation occurs only in the vertical direction; this assumption is reasonable only if the bar is long and thin. Normal materials tend to contract horizontally when they are stretched vertically, but both this contraction and the coupling between horizontal and vertical motion are small compared to the elongation when the bar is thin (see Lin and Segel [36], Chapter 12).

With the assumption of purely vertical deformation, we can describe the movement of the bar in terms of a displacement function $u(x,t)$. Specifically, suppose that the top of the bar is fixed at $x = 0$, and let down be the positive $x$-direction. Let the cross-section of the bar originally at $x$ move to $x + u(x,t)$ at time $t$ (see Figure 2.4). We will derive a PDE describing the dynamics of the bar by applying Newton's second law of motion.
We assume that the bar is elastic, which means that the internal forces in the bar depend on the local relative change in length. The deformed length of the part of the bar originally between $x$ and $x + \Delta x$ (at time $t$) is

\[ (x + \Delta x + u(x + \Delta x, t)) - (x + u(x,t)) = \Delta x + u(x + \Delta x, t) - u(x,t). \]

Since the original length is $\Delta x$, the change in length of this part is

\[ u(x + \Delta x, t) - u(x,t), \]

and the relative change in length is

\[ \frac{u(x + \Delta x, t) - u(x,t)}{\Delta x} \approx \frac{\partial u}{\partial x}(x,t). \]
Figure 2.4. The hanging bar and its coordinate system.
This explains the definition of the strain, the local relative change in length, as the dimensionless quantity

\[ \frac{\partial u}{\partial x}(x,t). \]

As we noted above, to say that the bar is elastic is to say that the internal restoring force of the deformed bar depends only on the strain. We now make the further assumption that the deformations are small, and therefore that the internal forces are, to good approximation, proportional to the strain. This is equivalent to assuming that the bar is linearly elastic, or Hookean.

Under the assumption that the bar is Hookean, we can write an expression for the total internal force acting on $P$, the part of the bar between $x$ and $x + \Delta x$. We denote by $k(x)$ the stiffness of the bar at $x$, that is, the constant of proportionality in Hooke's law, with units of force per unit area. (In the engineering literature, $k$ is called the Young's modulus or the modulus of elasticity. Values for various materials can be found in reference books, such as [35].) Then the total internal force is
\[ A k(x + \Delta x) \frac{\partial u}{\partial x}(x + \Delta x, t) - A k(x) \frac{\partial u}{\partial x}(x,t). \tag{2.13} \]
The first term in this expression is the force exerted by the part of the bar below $x + \Delta x$ on $P$, and the second term is the force exerted by the part of the bar above $x$ on $P$. The signs are correct; if the strains are positive, then the bar has been stretched, and the internal restoring force is pulling down (the positive direction) at $x + \Delta x$ and up at $x$.
Now, (2.13) is equal (by the fundamental theorem of calculus) to

\[ \int_x^{x+\Delta x} A \frac{\partial}{\partial x}\left( k(s) \frac{\partial u}{\partial x}(s,t) \right) ds. \]

We now assume that all external forces are lumped into a body force given by a force density $f$ (which has units of force per unit volume). Then the total external force on $P$ (at time $t$) is

\[ \int_x^{x+\Delta x} f(s,t) A \, ds, \]

and the sum of the forces acting on part $P$ is

\[ \int_x^{x+\Delta x} A \frac{\partial}{\partial x}\left( k(s) \frac{\partial u}{\partial x}(s,t) \right) ds + \int_x^{x+\Delta x} A f(s,t)\, ds. \]
Newton's second law states that the total force acting on $P$ must equal the mass of $P$ times its acceleration. This law takes the form

\[ \int_x^{x+\Delta x} A \frac{\partial}{\partial x}\left( k(s) \frac{\partial u}{\partial x}(s,t) \right) ds + \int_x^{x+\Delta x} A f(s,t)\, ds = \int_x^{x+\Delta x} A \rho(s) \frac{\partial^2 u}{\partial t^2}(s,t)\, ds, \]

where $\rho(x)$ is the density of the bar at $x$ (in units of mass per volume). We can rewrite this as

\[ \int_x^{x+\Delta x} \left[ \rho(s) \frac{\partial^2 u}{\partial t^2}(s,t) - \frac{\partial}{\partial x}\left( k(s) \frac{\partial u}{\partial x}(s,t) \right) - f(s,t) \right] ds = 0 \]
(note how the factor of $A$ cancels). This integral must be zero for every $x \in [0, \ell)$ and every $\Delta x > 0$. It follows (by the reasoning introduced on page 12) that the integrand must be identically zero; this gives the equation

\[ \rho(x) \frac{\partial^2 u}{\partial t^2} - \frac{\partial}{\partial x}\left( k(x) \frac{\partial u}{\partial x} \right) - f(x,t) = 0, \]

or

\[ \rho(x) \frac{\partial^2 u}{\partial t^2} - \frac{\partial}{\partial x}\left( k(x) \frac{\partial u}{\partial x} \right) = f(x,t), \quad 0 < x < \ell, \ t > t_0. \tag{2.14} \]

The PDE (2.14) is called the wave equation.

If the bar is in equilibrium, then the displacement does not depend on $t$, and we can write $u = u(x)$. In this case, the acceleration $\partial^2 u/\partial t^2$ is zero, and the forcing function $f$ must also be independent of time. We then obtain the following ODE for the equilibrium displacement of the bar:
\[ -\frac{d}{dx}\left( k(x) \frac{du}{dx} \right) = f(x). \tag{2.15} \]
This is the same equation that governs steady-state heat flow! Just as in the case of steady-state heat flow, the resulting BVPs can be solved with two integrations (see Examples 2.2 and 2.3).
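As an illustration of the two-integration approach for (2.15), the sketch below treats one simple case. It assumes a homogeneous bar (constant stiffness $k$), a constant body force density `f0` (for instance, the weight per unit volume), a fixed top end, $u(0) = 0$, and a free bottom end, $du/dx(\ell) = 0$. These boundary conditions, the function names, and the sample numbers are assumptions made here for illustration. Integrating $-k\,u'' = f_0$ once and applying $u'(\ell) = 0$ gives $u'(x) = f_0(\ell - x)/k$; integrating again and applying $u(0) = 0$ gives $u(x) = f_0(\ell x - x^2/2)/k$.

```python
def hanging_bar_strain(x, k, length, f0):
    """Strain du/dx after the first integration, using u'(length) = 0."""
    return f0 * (length - x) / k


def hanging_bar_displacement(x, k, length, f0):
    """Displacement after the second integration, using u(0) = 0.

    Solves -k*u'' = f0 on (0, length) with u(0) = 0 and u'(length) = 0,
    assuming k and f0 are constants.
    """
    return f0 * (length * x - 0.5 * x * x) / k


# Hypothetical, unit-free sample values chosen only to exercise the formulas.
tip = hanging_bar_displacement(1.0, k=2.0, length=1.0, f0=4.0)
top_strain = hanging_bar_strain(0.0, k=2.0, length=1.0, f0=4.0)
```

In this case the strain is largest at the top of the bar, which carries the entire load below it, and falls to zero at the free end.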
If the bar is homogeneous, so that $\rho$ and $k$ are constants, these last two differential equations can be written as

\[ \rho \frac{\partial^2 u}{\partial t^2} - k \frac{\partial^2 u}{\partial x^2} = f(x,t) \]

and

\[ -k \frac{d^2 u}{dx^2} = f(x), \]

respectively.
2.2.1 Boundary conditions for the hanging bar
Equation (2.15) by itself does not determine a unique displacement; we need boundary conditions, as well as initial conditions if the problem is time-dependent. The statement of the problem explicitly gives us one boundary condition: $u(0) = 0$ (the top end of the bar cannot move). Moreover, we can deduce a second boundary condition from force balance at the other end of the bar. If the bottom of the bar is unsupported, then there is no contact force applied at $x = \ell$. On the other hand, the analysis that led to (2.13) shows that the part of the bar above $x = \ell$ (which is all of the bar) exerts an internal force of

\[ -A k(\ell) \frac{du}{dx}(\ell) \]

on the surface at $x = \ell$. Since there is nothing to balance this force, we must have

\[ -A k(\ell) \frac{du}{dx}(\ell) = 0, \]

or simply

\[ \frac{du}{dx}(\ell) = 0. \]

Since the wave equation involves the second time derivative of $u$, we need two initial conditions to uniquely determine the motion of the bar: the initial displacement and the initial velocity. We thus arrive at the following IBVP for the wave equation:

\[
\begin{aligned}
\rho(x) \frac{\partial^2 u}{\partial t^2} - \frac{\partial}{\partial x}\left( k(x) \frac{\partial u}{\partial x} \right) &= f(x,t), \quad 0 < x < \ell, \ t > t_0, \\
u(x, t_0) &= \psi(x), \quad 0 < x < \ell, \\
\frac{\partial u}{\partial t}(x, t_0) &= \gamma(x), \quad 0 < x < \ell, \\
u(0,t) &= 0, \quad t > t_0, \\
\frac{\partial u}{\partial x}(\ell, t) &= 0, \quad t > t_0.
\end{aligned}
\tag{2.16}
\]
The corresponding steady-state BVP (expressing mechanical equilibrium) is

\[
\begin{aligned}
-\frac{d}{dx}\left( k(x) \frac{du}{dx} \right) &= f(x), \quad 0 < x < \ell, \\
u(0) &= 0, \\
\frac{du}{dx}(\ell) &= 0.
\end{aligned}
\tag{2.17}
\]

There are several other sets of boundary conditions that might be of interest in connection with the differential equations (2.14) or (2.15). For example, if both ends of the bar are fixed (not allowed to move), we have the boundary conditions

\[ u(0) = 0, \quad u(\ell) = 0 \]

(recall that $u$ is the displacement, so the condition $u(\ell) = 0$ indicates that the cross-section at the end of the bar corresponding to $x = \ell$ does not move from its original position). If both ends of the bar are free, the corresponding boundary conditions are

\[ \frac{du}{dx}(0) = 0, \quad \frac{du}{dx}(\ell) = 0. \]

Any of the above boundary conditions can be inhomogeneous. For example, we could fix one end of the bar at $x = 0$ and stretch the other to $x = \ell + \Delta\ell$. This experiment corresponds to the boundary conditions $u(0) = 0$, $u(\ell) = \Delta\ell$. As another example, if one end of the bar (say $x = 0$) is fixed and a force $F$ is applied to the other end ($x = \ell$), then the applied force determines the value of $du/dx(\ell)$. Indeed, as indicated above, the restoring force of the bar on the $x = \ell$ cross-section is
\[ -A k(\ell) \frac{du}{dx}(\ell), \]

and this must balance the applied force $F$:

\[ -A k(\ell) \frac{du}{dx}(\ell) + F = 0. \]

This leads to the boundary condition

\[ \frac{du}{dx}(\ell) = \frac{F}{A k(\ell)}, \]

and the quantity $F/A$ has units of pressure (force per unit area). For mathematical purposes, it is simplest to write an inhomogeneous boundary condition of this type as

\[ \frac{du}{dx}(\ell) = c, \]

but for solving a practical problem, it is essential to recognize that

\[ c = \frac{F}{A k(\ell)}. \]
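The relation $c = F/(A k(\ell))$ is easy to misapply when the force, area, and stiffness are given in mixed units. The short sketch below computes the boundary value $c$ with every quantity first expressed in SI units. The function name and all numerical values are hypothetical, chosen only to illustrate the bookkeeping.

```python
import math


def neumann_data(force, area, stiffness):
    """Return c = F / (A * k), the prescribed strain du/dx at the loaded end.

    force in newtons, area in square meters, stiffness in pascals (N/m^2),
    so the result is dimensionless, as a strain must be.
    """
    return force / (area * stiffness)


# Hypothetical bar: circular cross-section of radius 1 cm, converted to meters.
radius = 0.01                   # m
area = math.pi * radius**2      # m^2
c = neumann_data(force=1.0e4, area=area, stiffness=100.0e9)
```

Because $F/A$ is a pressure and $k$ has the same units, the ratio is a pure number, consistent with the strain being dimensionless.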
Exercises 1. the following experiment: A A bar is hanging with the the top fixed 1. Consider Consider the following experiment: bar is hanging with top end end fixed at bar is is stretched by aa pressure pressure (force per unit at xx == 0, 0, and and the the bar stretched by (force per unit area) area) pp applied uniformly free (bottom) (bottom) end. What are are the the boundary applied uniformly to to the the free end. What boundary conditions conditions describing this situation? describing this situation? 2. Suppose that aa homogeneous homogeneous bar bar (that constant stiffness stiffness k) of 2. Suppose that (that is, is, aa bar bar with with constant A;) of length top end fixed at the bar bar is stretched to to aa length length £i has has its its top end fixed at xx = 0, 0, and and the is stretched length £1+A£. + Il£ by pressure pp applied bottom end. to be the positive positive by aa pressure applied to to the the bottom end. Take Take down down to be the x-direction. x-direction. (a) Explain Explain why why pp and and Il£ have the the same (a) A^ have same sign. sign. (b) Explain why why p and and \t Il£ cannot cannot both both be be chosen chosen arbitrarily arbitrarily (even (even subject subject (b) Explain to the the requirement requirement that that they they have have the the same both physical physical to same sign). sign). Give Give both and mathematical reasons. and mathematical reasons. (c) is specified. Find Il£ terms of of p, p, k, (c) Suppose Suppose pp is specified. Find A^ (in (in terms k, and and C). i). (d) is specified. specified. Find terms of Il£, fc, k, and (d) Suppose Suppose Il£ Al is Find pp (in (in terms of A£, and C). t}. 3. A certain certain type type of of stainless stiffness of of 195 (A Pascal (Pa) is 3. A stainless steel steel has has aa stiffness 195 GPa. GPa. (A (Pa) is the standard unit of pressure, or unit area. area. 
The Pascal is the standard unit of pressure, or force per unit area. The Pascal is a derived unit: one Pascal equals one Newton per square meter. The Newton is the standard unit of force: one Newton equals one kilogram meter per second squared. Finally, GPa is short for gigaPascal, or $10^9$ Pascals.)
(a) Explain in words (including units) what a stiffness of 195 GPa means.
(b) Suppose a pressure of 1 GPa is applied to the end of a homogeneous, circular cylindrical bar of this stainless steel, and that the other end is fixed. If the original length of the bar is 1 m and its radius is 1 cm, what will its length be, in the equilibrium state, after the pressure has been applied?
(c) Verify the result of 3b by formulating and solving the boundary value problem representing this experiment.
4. Consider a circular cylindrical bar, of length 1 m and radius 1 cm, made from an aluminum alloy with stiffness 70 GPa. If the top end of the bar ($x = 0$) is fixed, what total force must be applied to the other end ($x = 1$) to stretch the bar to a length of 1.01 m?
5. Write the wave equation for the bar of Exercise 3, given that the density of the stainless steel is 7.9 g/cm$^3$. (Warning: Use consistent units!) What must the units of the forcing function $f$ be? Verify that the two terms on the left side of the differential equation have the same units as $f$.
6. Suppose that a 1 m bar of the stainless steel described in Exercise 3, with density 7.9 g/cm$^3$, is supported at the bottom but free at the top. Let the cross-sectional area of the bar be 0.1 m$^2$. A weight of 1000 kg is placed on
top of the bar, exerting pressure on it via gravity (the gravitational constant is 9.8 m/s$^2$). The purpose of this problem is to compute and compare the effects on the bar of the mass on the top and the weight of the bar itself.
(a) Write down three BVPs:
i. First, take into account the weight of the bar (which means that gravity induces a body force), but ignore the mass on the top (so the top end of the bar is free; the boundary condition is a homogeneous Neumann condition).
ii. Next, take into account the mass on the top, but ignore the effect of the weight of the bar (so there is no body force).
iii. Last, take both effects into account.
(b) Explain why the third BVP can be solved by solving the first two and adding the results.
(c) Solve the first two BVPs by direct integration. Compare the two displacements. Which is more significant, the weight of the bar or the mass on top?
(d) How would the situation change if the cross-sectional area of the bar were changed to 0.2 m$^2$?
7. (a) Show that the function
$$u(x,t) = \cos(c\theta t)\,\sin(\theta x)$$
is a solution to the homogeneous wave equation
$$\frac{\partial^2 u}{\partial t^2} - c^2\,\frac{\partial^2 u}{\partial x^2} = 0.$$
(b) What values of $\theta$ will cause $u$ to also satisfy homogeneous Dirichlet conditions at $x = 0$ and $x = \ell$?
2.3 The wave equation for a vibrating string
We now present an argument that the wave equation (2.14) also describes the small transverse vibrations of an elastic string (such as a guitar string). In the course of the following derivation, we make several a priori unjustified assumptions which are significant enough that the end result ought to be viewed with some skepticism. However, a careful analysis leads to the same model (see the article by Antman [1]).
For simplicity, we will assume that the string in question is homogeneous, so that any material properties are constant throughout the string. We suppose that the string is stretched to length $\ell$ and that its two endpoints are not allowed to move. We further suppose that the string vibrates in the $xy$-plane, occupying the interval $[0, \ell]$ on the $x$-axis when at rest, and that the point at $(x, 0)$ in the
reference configuration moves to $(x, u(x,t))$ at time $t$. We are thus postulating that the motion of the string is entirely in the transverse direction (this is one of the severe assumptions that we mentioned in the previous paragraph). Granted this assumption, we now derive the differential equation satisfied by the displacement $u$.
Since, by assumption, a string does not resist bending, the internal restoring force of the string under tension is tangent to the string itself at every point. We will denote the magnitude of this restoring force by $T(x,t)$. In Figure 2.5, we display a part of the deformed string, corresponding to the part of the string between $x$ and $x + \Delta x$ in the reference configuration, together with the internal forces at the ends of this part, and their magnitudes. In the absence of any external forces, the sum of these internal forces must balance the mass times acceleration of this part of the string. To write down these equations, we must decompose the internal force into its horizontal and vertical components.
Figure 2.5. A part of the deformed string.
We write $\mathbf{n} = \mathbf{n}(x,t)$ for the force at the left endpoint and $\theta$ for the angle this force vector makes with the horizontal. We then have
$$n_1^2 + n_2^2 = T(x,t)^2,$$
with $n_1 = -T(x,t)\cos(\theta)$, $n_2 = -T(x,t)\sin(\theta)$. Assuming that $|\partial u/\partial x| \ll 1$ at every point, and noting that [...]
(2.18)
We recognize this as the homogeneous wave equation. It is usual to write this in the form
$$\frac{\partial^2 u}{\partial t^2} - c^2\,\frac{\partial^2 u}{\partial x^2} = 0, \quad 0 < x < \ell,\ t > 0,$$
where $c^2 = T/\rho$. The significance of the parameter $c$ will become clear in Chapter 7.
In the case that an external body force is applied to the string (in the vertical direction), the equation becomes
$$\frac{\partial^2 u}{\partial t^2} - c^2\,\frac{\partial^2 u}{\partial x^2} = f(x,t), \quad 0 < x < \ell,\ t > 0. \tag{2.19}$$
Exercise 1 asks the reader to determine the units of $f$.
The natural boundary conditions for the vibrating string, as suggested above, are homogeneous Dirichlet conditions:
$$u(0,t) = u(\ell,t) = 0, \quad t > 0.$$
One can also imagine that one or both ends of the string are allowed to move freely in the vertical direction (perhaps an end of the string slides along a frictionless pole). In this case, the appropriate boundary condition is a homogeneous Neumann condition (see Exercise 5).
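The homogeneous wave equation and the Dirichlet conditions can be spot-checked numerically for a specific function. The following is a minimal sketch, not a proof: it takes the standing wave $u(x,t) = \cos(c\theta t)\sin(\theta x)$ from the exercises of Section 2.2 and estimates $\partial^2 u/\partial t^2 - c^2\,\partial^2 u/\partial x^2$ by central differences. The values of `c` and `ell`, the choice `theta = pi / ell`, the sample points, and the step size `h` are all assumptions made only for illustration.

```python
import math

c = 2.0                  # hypothetical wave speed
ell = 1.0                # hypothetical string length
theta = math.pi / ell    # one choice for which sin(theta * ell) = 0

def u(x, t):
    """Standing wave u(x, t) = cos(c*theta*t) * sin(theta*x)."""
    return math.cos(c * theta * t) * math.sin(theta * x)

def residual(x, t, h=1e-3):
    """Central-difference estimate of u_tt - c^2 * u_xx at (x, t)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_tt - c**2 * u_xx

times = (0.0, 0.3, 1.1)
max_residual = max(abs(residual(x, t)) for x in (0.2, 0.5, 0.9) for t in times)
max_boundary = max(max(abs(u(0.0, t)), abs(u(ell, t))) for t in times)

# Both numbers should be small: the first is limited by the O(h^2)
# truncation error of the differences, the second by rounding in sin(pi).
print(max_residual, max_boundary)
```

A small residual at a handful of points is only evidence, of course; the analytical verification is still required.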
Exercises
1. What units must $f(x,t)$ have in (2.19)?
2. What are the units of the tension $T$ in the derivation of the wave equation for the string?
3. What are the units of the parameter $c$ in (2.18)?
4. Suppose the only external force applied to the string is the force due to gravity. What form does (2.19) take in this case? (Let $g$ be the acceleration due to gravity, and take $g$ to be constant.)
5. Explain why a homogeneous Neumann condition models an end of the string that is allowed to move freely in the vertical direction.
6. Suppose that an elastic string is fixed at both ends, as in this section, and it sags under the influence of an external force $f(x)$ ($f$ is constant with respect to time). What differential equation and side conditions does the equilibrium displacement of the string satisfy? Assume that $f$ is given in units of force per length.
2.4 Suggestions for further reading
If the reader wishes to learn more about the use of mathematical models in the sciences, an excellent place to start is the text by Lin and Segel [36], which combines modeling with the analysis of the models by a number of different analytical techniques. Lin and Segel cover the models that form the basis for this book, as well as many others.
A more advanced text, which focuses almost entirely on the derivation of differential equations from the basic principles of continuum mechanics, is Gurtin [21].
Chapter 3
Essential linear algebra
The solution techniques presented in this book can be described by analogy to techniques for solving
$$\mathbf{A}\mathbf{x} = \mathbf{b},$$
where $\mathbf{A}$ is an $n \times n$ matrix ($\mathbf{A} \in \mathbf{R}^{n \times n}$) and $\mathbf{x}$ and $\mathbf{b}$ are $n$-vectors ($\mathbf{x}, \mathbf{b} \in \mathbf{R}^n$). Recall that such a matrix-vector equation represents the following system of $n$ linear equations in the $n$ unknowns $x_1, x_2, \ldots, x_n$:
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
&\ \,\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}$$
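To make the correspondence between $\mathbf{A}\mathbf{x} = \mathbf{b}$ and the system of scalar equations concrete, here is a small sketch in plain Python. The helper `matvec` and the particular $2 \times 2$ matrix and vector are hypothetical, chosen only for illustration; each entry of the product is the left-hand side of one equation of the system.

```python
def matvec(A, x):
    """Return the product Ax; entry i is the left side of equation i."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Hypothetical 2 x 2 example, chosen only for illustration.
A = [[2.0, 1.0],
     [1.0, 3.0]]
x = [1.0, 2.0]

b = matvec(A, x)
print(b)  # [4.0, 7.0]: 2*1 + 1*2 = 4 and 1*1 + 3*2 = 7
```

Read in the other direction, solving the system means recovering `x` when only `A` and `b` are known.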
Before we discuss methods for solving differential equations, we review the fundamental facts about systems of linear (algebraic) equations.
3.1 Linear systems as linear operator equations
To fully appreciate the point of view taken in this book, it is necessary to understand the equation $\mathbf{A}\mathbf{x} = \mathbf{b}$ not just as a system of linear equations, but as a finite-dimensional linear operator equation. In other words, we must view the matrix $\mathbf{A}$ as defining an operator (or mapping, or simply function) from $\mathbf{R}^n$ to $\mathbf{R}^n$ via matrix multiplication: $\mathbf{A}$ maps $\mathbf{x} \in \mathbf{R}^n$ to $\mathbf{y} = \mathbf{A}\mathbf{x} \in \mathbf{R}^n$. (More generally, if $\mathbf{A}$ is not square, say $\mathbf{A} \in \mathbf{R}^{m \times n}$, then $\mathbf{A}$ defines a mapping from $\mathbf{R}^n$ to $\mathbf{R}^m$, since $\mathbf{A}\mathbf{x} \in \mathbf{R}^m$ for each $\mathbf{x} \in \mathbf{R}^n$.)
The following language is useful in discussing operator equations.
Definition 3.1. Let $X$ and $Y$ be sets. A function (operator, mapping) $f$ from $X$ to $Y$ is a rule for associating with each $x \in X$ a unique $y \in Y$, denoted $y = f(x)$. The set $X$ is called the domain of $f$, and the range of $f$ is the set
$$\mathcal{R}(f) = \{f(x) \in Y : x \in X\}.$$
We write $f : X \to Y$ ("$f$ maps $X$ into $Y$") to indicate that $f$ is a function from $X$ to $Y$.
The reader should recognize the difference between the range of a function $f : X \to Y$ and the set $Y$ (which is sometimes called the co-domain of $f$). The set $Y$ merely identifies the type of the output values $f(x)$; for example, if $Y = \mathbf{R}$, then every $f(x)$ is a real number. On the other hand, the range of $f$ is the set of elements of $Y$ that are actually attained by $f$. As a simple example, consider $f : \mathbf{R} \to \mathbf{R}$ defined by $f(x) = x^2$. The co-domain of $f$ is $\mathbf{R}$, but the range of $f$ consists of the set of nonnegative numbers: $\mathcal{R}(f) = [0, \infty)$. In many cases, it is quite difficult to determine the range of a function. The co-domain, on the other hand, must be specified as part of the definition of the function.
A set is just a collection of objects (elements); most useful sets have operations defined on their elements. The most important sets used in this book are vector spaces.
Definition 3.2. A vector space $V$ is a set on which two operations are defined, addition (if $\mathbf{u}, \mathbf{v} \in V$, then $\mathbf{u} + \mathbf{v} \in V$) and scalar multiplication (if $\mathbf{u} \in V$ and $\alpha$ is a scalar, then $\alpha\mathbf{u} \in V$). The elements of the vector space are called vectors. (In this book, the scalars are usually real numbers, and we assume this unless otherwise stated. Occasionally we use the set of complex numbers as the scalars. Vectors will always be denoted by lower case boldface letters.) The two operations must satisfy the following algebraic properties:
1. $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ for all $\mathbf{u}, \mathbf{v} \in V$.
2. $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$.
3. There is a zero vector $\mathbf{0}$ in $V$ with the property that $\mathbf{u} + \mathbf{0} = \mathbf{u}$ for all $\mathbf{u} \in V$.
4. For each $\mathbf{u} \in V$, there is a vector $-\mathbf{u} \in V$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.
5. $\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}$ for all $\mathbf{u}, \mathbf{v} \in V$ and for all scalars $\alpha$.
6. $(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}$ for all $\mathbf{u} \in V$ and for all scalars $\alpha, \beta$.
7. $\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}$ for all $\mathbf{u} \in V$ and for all scalars $\alpha, \beta$.
8. $1\mathbf{u} = \mathbf{u}$ for all $\mathbf{u} \in V$.
For every vector space considered in this book, the verification of these vector space properties is straightforward and will be taken for granted.
Example 3.3. The most common example of a vector space is (real) Euclidean $n$-space:
$$\mathbf{R}^n = \{(x_1, x_2, \ldots, x_n) : x_1, x_2, \ldots, x_n \in \mathbf{R}\}.$$
Vectors in $\mathbf{R}^n$ are usually written in column form,
$$\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix},$$
as it is convenient at times to think of $\mathbf{u} \in \mathbf{R}^n$ as an $n \times 1$ matrix. Addition and scalar multiplication are defined componentwise:
$$\mathbf{u} + \mathbf{v} = (u_1, u_2, \ldots, u_n) + (v_1, v_2, \ldots, v_n) = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n),$$
$$\alpha\mathbf{u} = \alpha(u_1, u_2, \ldots, u_n) = (\alpha u_1, \alpha u_2, \ldots, \alpha u_n).$$
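The componentwise definitions translate directly into code. The sketch below represents vectors in $\mathbf{R}^3$ as Python tuples; the function names `add` and `scale` and the sample vectors are hypothetical. It also spot-checks one of the vector space properties (the distributive law, property 5) for these particular values, which illustrates the property but does not prove it.

```python
def add(u, v):
    """Componentwise sum u + v of two vectors of equal length."""
    return tuple(u_i + v_i for u_i, v_i in zip(u, v))

def scale(alpha, u):
    """Componentwise scalar multiple alpha * u."""
    return tuple(alpha * u_i for u_i in u)

# Hypothetical sample data in R^3.
u = (1.0, -2.0, 3.0)
v = (0.5, 4.0, -1.0)
alpha = 2.0

print(add(u, v))        # (1.5, 2.0, 2.0)
print(scale(alpha, u))  # (2.0, -4.0, 6.0)

# Property 5: alpha(u + v) = alpha u + alpha v, checked for this data only.
lhs = scale(alpha, add(u, v))
rhs = add(scale(alpha, u), scale(alpha, v))
print(lhs == rhs)       # True
```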
Example 3.4. Apart from Euclidean $n$-space, the most common vector spaces are function spaces, that is, vector spaces in which the vectors are functions. Functions (with common domains) can be added together and multiplied by scalars, and the algebraic properties of a vector space are easily verified. Therefore, when defining a function space, one must only check that any desired properties of the functions are preserved by addition and scalar multiplication. Here are some important examples:
1. $C[a,b]$ is defined to be the set of all continuous, real-valued functions defined on the interval $[a,b]$. The sum of two continuous functions is also continuous, as is any scalar multiple of a continuous function. Therefore, $C[a,b]$ is a vector space.
2. $C^1[a,b]$ is defined to be the set of all real-valued, continuously differentiable functions defined on the interval $[a,b]$. (A function is continuously differentiable if its derivative exists and is continuous.) The sum of two continuously differentiable functions is also continuously differentiable, and the same is true for a scalar multiple of a continuously differentiable function. Therefore, $C^1[a,b]$ is a vector space.
3. For any positive integer $k$, $C^k[a,b]$ is the space of real-valued functions defined on $[a,b]$ that have $k$ continuous derivatives.
Many vector spaces that are encountered in practice are subspaces of other vector spaces.
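The idea in Example 3.4 that functions themselves can play the role of vectors may be easier to absorb with a small sketch. Here addition and scalar multiplication of functions are defined pointwise, so that the sum and the scalar multiple are again functions on the same domain. The names `add` and `scale` and the particular combination are hypothetical, for illustration only.

```python
import math

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(alpha, f):
    """Pointwise scalar multiple: (alpha f)(x) = alpha * f(x)."""
    return lambda x: alpha * f(x)

# The "vector" w = 2 sin - 3 exp is itself a function.
w = add(scale(2.0, math.sin), scale(-3.0, math.exp))

print(w(0.0))  # 2*sin(0) - 3*exp(0) = -3.0
```

The code says nothing about continuity or differentiability; those are the properties that must be checked separately when deciding whether a given set of functions is closed under these operations.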
Definition 3.5. Let $V$ be a vector space, and suppose $W$ is a subset of $V$ with the following properties:
1. The zero vector belongs to $W$.
2. Every linear combination of vectors in $W$ is also in $W$. That is, if $\mathbf{x}, \mathbf{y} \in W$ and $\alpha, \beta \in \mathbf{R}$, then $\alpha\mathbf{x} + \beta\mathbf{y} \in W$.
Then we call $W$ a subspace of $V$.
A subspace of a vector space is a vector space in its own right, as the reader can verify by checking that all the properties of a vector space are satisfied for a subspace.
Example 3.6. We define
$$C_D^2[a,b] = \{u \in C^2[a,b] : u(a) = u(b) = 0\}.$$
The set $C_D^2[a,b]$ is a subset of $C^2[a,b]$, and this subset contains the zero function (hopefully this is obvious to the reader). Also, if $u, v \in C_D^2[a,b]$, $\alpha, \beta \in \mathbf{R}$, and $w = \alpha u + \beta v$, then
• $w \in C^2[a,b]$ (since $C^2[a,b]$ is a vector space); and
• $w(a) = \alpha u(a) + \beta v(a) = \alpha \cdot 0 + \beta \cdot 0 = 0$, and similarly $w(b) = 0$.
Therefore, $w \in C_D^2[a,b]$, which shows that $C_D^2[a,b]$ is a subspace of $C^2[a,b]$.
Example 3.7. We define
$$C_N^2[a,b] = \left\{u \in C^2[a,b] : \frac{du}{dx}(a) = \frac{du}{dx}(b) = 0\right\}.$$
The set $C_N^2[a,b]$ is also a subset of $C^2[a,b]$, and it can be shown to be a subspace. Clearly the zero function belongs to $C_N^2[a,b]$. If $u, v \in C_N^2[a,b]$, $\alpha, \beta \in \mathbf{R}$, and $w = \alpha u + \beta v$, then
$$\frac{dw}{dx}(a) = \alpha\frac{du}{dx}(a) + \beta\frac{dv}{dx}(a) = \alpha \cdot 0 + \beta \cdot 0 = 0.$$
Similarly,
$$\frac{dw}{dx}(b) = 0,$$
and so $w \in C_N^2[a,b]$. This shows that $C_N^2[a,b]$ is a subspace of $C^2[a,b]$.
The previous two examples will be used throughout this book. The letters "D" and "N" stand for Dirichlet and Neumann, respectively (see Section 2.1, for example).
The following provides an important nonexample of a subspace.
Example 3.8. We define
$$W = \{u \in C^2[a,b] : u(a) = \gamma,\ u(b) = \delta\},$$
where $\gamma$ and $\delta$ are nonzero real numbers. Then, although $W$ is a subset of $C^2[a,b]$, it is not a subspace. For example, the zero function does not belong to $W$, since it does not satisfy the boundary conditions. Also, if $u, v \in W$ and $\alpha, \beta \in \mathbf{R}$, then, with $w = \alpha u + \beta v$, we have
$$w(a) = \alpha u(a) + \beta v(a) = \alpha\gamma + \beta\gamma = (\alpha + \beta)\gamma.$$
Thus $w(a)$ does not equal $\gamma$, except in the special case that $\alpha + \beta = 1$. Similarly, $w(b)$ does not satisfy the boundary condition at the right endpoint.
The concept of a vector space allows us to define linearity, which describes many simple processes and is indispensable in modeling and analysis.
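The contrast between Example 3.6 (homogeneous boundary conditions, a subspace) and Example 3.8 (inhomogeneous boundary conditions, not a subspace) can be seen numerically. The sketch below forms linear combinations of specific functions and evaluates them at the endpoints. The interval, the functions `u`, `v`, `p`, `q`, and the values of $\gamma$ and $\delta$ are hypothetical choices made only for illustration.

```python
import math

a, b = 0.0, 1.0  # hypothetical interval [a, b]

def combo(alpha, u, beta, v):
    """Return the function alpha*u + beta*v, defined pointwise."""
    return lambda x: alpha * u(x) + beta * v(x)

# Two functions that vanish at both endpoints (as in Example 3.6).
u = lambda x: x * (1.0 - x)
v = lambda x: math.sin(math.pi * x)
w = combo(2.0, u, -5.0, v)
print(w(a), w(b))   # both zero up to rounding: w vanishes at the endpoints too

# Two functions with u(a) = gamma, u(b) = delta (as in Example 3.8).
gamma, delta = 1.0, 2.0
p = lambda x: gamma + (delta - gamma) * x
q = lambda x: gamma + (delta - gamma) * x**2
r = combo(2.0, p, 3.0, q)
print(r(a), r(b))   # 5.0 10.0, that is (alpha + beta) times gamma and delta
```

Since $\alpha + \beta = 5 \neq 1$ here, the combination `r` takes the wrong boundary values, matching the computation in Example 3.8.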
Definition 3.9. Suppose $X$ and $Y$ are vector spaces, and $f : X \to Y$ is an operator (or function,$^7$ or mapping) with domain $X$ and range $Y$. Then $f$ is linear if and only if
$$f(\alpha\mathbf{x} + \beta\mathbf{z}) = \alpha f(\mathbf{x}) + \beta f(\mathbf{z}) \ \text{ for all } \alpha, \beta \in \mathbf{R},\ \mathbf{x}, \mathbf{z} \in X. \tag{3.1}$$
This condition can be expressed as the following two conditions, which together are equivalent to (3.1):
1. $f(\alpha\mathbf{x}) = \alpha f(\mathbf{x})$ for all $\mathbf{x} \in X$ and all $\alpha \in \mathbf{R}$;
2. $f(\mathbf{x} + \mathbf{z}) = f(\mathbf{x}) + f(\mathbf{z})$ for all $\mathbf{x}, \mathbf{z} \in X$.
A linear operator is thus a particularly simple kind of operator; its simplicity can be appreciated by comparing the property of linearity (e.g. $f(x + y) = f(x) + f(y)$) with common nonlinear operators: $\sqrt{x + y} \neq \sqrt{x} + \sqrt{y}$, $\sin(x + y) \neq \sin(x) + \sin(y)$, etc.
Example 3.10. The operator defined by a matrix $\mathbf{A} \in \mathbf{R}^{m \times n}$ via matrix-vector multiplication,
$$f(\mathbf{x}) = \mathbf{A}\mathbf{x},$$
is linear; the reader should verify this if necessary (see Exercise 7). Moreover, it can be shown that every linear operator mapping $\mathbf{R}^n$ into $\mathbf{R}^m$ can be represented by a matrix $\mathbf{A} \in \mathbf{R}^{m \times n}$ in this way (see Exercise 8). This explains why the study of (finite-dimensional) linear algebra is largely the study of matrices. In this book, matrices will be denoted by upper case boldface letters.
Example 3.11. To show that the sine function is not linear, we observe that
$$\sin\left(2 \cdot \frac{\pi}{2}\right) = \sin(\pi) = 0,$$
while
$$2\sin\left(\frac{\pi}{2}\right) = 2 \cdot 1 = 2.$$
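Condition (3.1) can be tested numerically for particular inputs. The sketch below checks it for a matrix operator (Example 3.10) and repeats the computation of Example 3.11 for the sine function. The helper `matvec`, the $3 \times 2$ matrix, the vectors, and the scalars are hypothetical; agreement for one set of inputs illustrates linearity but does not prove it, whereas a single failure (as for the sine) does disprove it.

```python
import math

def matvec(A, x):
    """Return the matrix-vector product Ax."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Hypothetical 3 x 2 matrix, so f(x) = Ax maps R^2 into R^3.
A = [[1.0, 2.0],
     [3.0, 4.0],
     [0.0, -1.0]]
x = [1.0, -1.0]
z = [2.0, 0.5]
alpha, beta = 2.0, -3.0

# Left and right sides of (3.1) for f(x) = Ax.
lhs = matvec(A, [alpha * x_i + beta * z_i for x_i, z_i in zip(x, z)])
rhs = [alpha * y_i + beta * w_i for y_i, w_i in zip(matvec(A, x), matvec(A, z))]
print(lhs, rhs)  # the two lists agree

# The sine function fails the scaling condition f(2x) = 2 f(x) at x = pi/2.
sin_of_double = math.sin(2.0 * (math.pi / 2.0))   # sin(pi), zero up to rounding
double_of_sin = 2.0 * math.sin(math.pi / 2.0)     # 2.0
print(sin_of_double, double_of_sin)
```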
$^7$The word "operator" is preferred over "function" in this context, because the vector spaces themselves are often spaces of functions.
Example 3.12. Differentiation defines an operator
$$\frac{d}{dx} : C^1[a,b] \to C[a,b],$$
and this operator is well known to be linear. For example,
$$\frac{d}{dx}\left[2\sin(x) - 3e^x\right] = 2\cos(x) - 3e^x,$$
since
$$\frac{d}{dx}\left[\sin(x)\right] = \cos(x), \quad \frac{d}{dx}\left[e^x\right] = e^x.$$
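The computation in Example 3.12 can be mirrored numerically. In the sketch below the derivative is approximated by a central difference, which is an assumption made for illustration and is not how the text defines differentiation. The point `x0` and step `h` are arbitrary choices. Differentiating the combination and combining the derivatives give the same value up to rounding, and both agree with $2\cos(x) - 3e^x$.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x); the error is O(h^2)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

f = lambda x: 2.0 * math.sin(x) - 3.0 * math.exp(x)
x0 = 0.7  # arbitrary sample point

combined = derivative(f, x0)                          # differentiate the combination
termwise = (2.0 * derivative(math.sin, x0)
            - 3.0 * derivative(math.exp, x0))         # combine the derivatives
exact = 2.0 * math.cos(x0) - 3.0 * math.exp(x0)       # formula from the example

print(combined, termwise, exact)
```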
In general, the $k$th derivative operator defines a linear operator mapping $C^k[a,b]$ into $C[a,b]$. This is why linearity is so important in the study of differential equations.
Since a matrix $\mathbf{A} \in \mathbf{R}^{n \times n}$ defines a linear operator from $\mathbf{R}^n$ to $\mathbf{R}^n$, the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$ can be regarded as a linear operator equation. From this point of view, the questions posed by the system are: Is there a vector $\mathbf{x} \in \mathbf{R}^n$ whose image under $\mathbf{A}$ is the given vector $\mathbf{b}$? If so, is there only one such vector $\mathbf{x}$? In the next section, we will explore these questions.
The point of view of a linear operator equation is also useful in discussing differential equations. For example, consider the steady-state heat flow problem
$$-\kappa\frac{d^2u}{dx^2} = f(x), \quad 0 < x < \ell, \qquad u(0) = 0, \quad u(\ell) = 0 \tag{3.2}$$
from Section 2.1. We define a differential operator $L_D : C_D^2[0,\ell] \to C[0,\ell]$ by
$$L_D u = -\kappa\frac{d^2u}{dx^2}.$$
Then the BVP (3.2) is equivalent to the operator equation
$$L_D u = f$$
(the reader should notice how the Dirichlet boundary conditions are enforced by the definition of the domain of $L_D$). This and similar examples will be discussed throughout this chapter and in detail in Section 5.1.
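One way to see the analogy between $L_D u = f$ and $\mathbf{A}\mathbf{x} = \mathbf{b}$ concretely is to replace the second derivative by a finite difference on a grid, which turns (3.2) into a tridiagonal linear system. This is a sketch under stated assumptions: finite differences are not the method developed in this section, and the function name `solve_dirichlet_bvp`, the grid size, and the test data ($\kappa = 1$, $f = 1$, $\ell = 1$) are hypothetical. The boundary conditions enter by simply omitting the endpoint unknowns, which echoes the way $L_D$ builds them into its domain.

```python
def solve_dirichlet_bvp(kappa, f, ell, n):
    """Approximate -kappa u'' = f on (0, ell) with u(0) = u(ell) = 0.

    Uses n interior grid points and the second difference
    (kappa / h^2) * (-u[i-1] + 2 u[i] - u[i+1]) = f(x_i).
    Returns the interior points and the approximate values of u there.
    """
    h = ell / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    diag = [2.0 * kappa / h**2] * n   # main diagonal of the matrix
    off = -kappa / h**2               # sub- and super-diagonal entries
    rhs = [f(x) for x in xs]

    # Forward elimination for a tridiagonal system (Thomas algorithm).
    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]

    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return xs, u

# Test data with known solution u(x) = x (1 - x) / 2.
xs, u = solve_dirichlet_bvp(kappa=1.0, f=lambda x: 1.0, ell=1.0, n=9)
max_error = max(abs(u_i - x_i * (1.0 - x_i) / 2.0) for x_i, u_i in zip(xs, u))
print(max_error)  # tiny: the second difference is exact for quadratics
```

For a general right-hand side the discrete solution is only an approximation; the agreement here is special to a quadratic exact solution.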
Exercises
1. In elementary algebra and calculus courses, it is often said that $f : \mathbf{R} \to \mathbf{R}$ is linear if and only if it has the form $f(x) = ax + b$, where $a$ and $b$ are constants. Does this agree with Definition 3.9? If not, what is the form of a linear function $f : \mathbf{R} \to \mathbf{R}$?
2. Show explicitly that $f : \mathbf{R} \to \mathbf{R}$ defined by $f(x) = \sqrt{x}$ is not linear.
3. For each of the following sets of functions, determine whether or not it is a vector space. (Define addition and scalar multiplication in the obvious way.) If it is not, state what property of a vector space fails to hold.
(a) $\{f \in C[0,1] : f(0) = 0\}$
(b) $\{f \in C[0,1] : f(0) = 1\}$
(c) $\left\{f \in C[0,1] : \int_0^1 f(x)\,dx = 0\right\}$
(d) $P_n$, the set of all polynomials of degree $n$ or less.
(e) The set of all polynomials of degree exactly $n$.
4. Prove that the differential operator $L : C^1[a,b] \to C[a,b]$ defined by
$$Lu = u\frac{du}{dx}$$
is not a linear operator.
5. Prove that the differential operator $L : C^1[a,b] \to C[a,b]$ defined by
$$Lu = \frac{du}{dx} + u^3$$
is not a linear operator.
6. Prove that the differential operator $M : C^2[a,b] \to C[a,b]$ [...]
[...] $\mathbf{R}^2 \to \mathbf{R}^2$ is [...] that there is $\mathbf{A} \in \mathbf{R}^{2 \times 2}$ such that $f$ is given by $f(\mathbf{x}) = \mathbf{A}\mathbf{x}$. Hint: Each $\mathbf{x} \in \mathbf{R}^2$ satisfies $\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2$, where
$$\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
Since $f$ is linear, we have
$$f(\mathbf{x}) = x_1 f(\mathbf{e}_1) + x_2 f(\mathbf{e}_2).$$
The desired matrix $\mathbf{A}$ can be expressed in terms of the vectors $f(\mathbf{e}_1)$, $f(\mathbf{e}_2)$.
(b) Now show that if $f : \mathbf{R}^n \to \mathbf{R}^m$ is linear, then there exists a matrix $\mathbf{A} \in \mathbf{R}^{m \times n}$ such that $f(\mathbf{x}) = \mathbf{A}\mathbf{x}$ for all $\mathbf{x} \in \mathbf{R}^n$.
3.2 Existence and uniqueness of solutions to Ax = b
We will now discuss the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$, where $\mathbf{A} \in \mathbf{R}^{n \times n}$ and $\mathbf{b} \in \mathbf{R}^n$, as a linear operator equation. We consider three fundamental questions:
1. Does a solution to the equation exist?
2. If a solution exists, is it unique?
3. If a unique solution exists, how can we compute it?
It turns out that the first two questions are intimately linked; the purpose of this section is to shed some light on these two questions and the connection between them. We will also briefly discuss how this point of view can be carried over to the case of a linear differential equation. The third question will be deferred to later in this chapter.
3.2.1 Existence
The existence of a solution to $Ax = b$ is equivalent to the condition that $b$ lie in $\mathcal{R}(A)$, the range of $A$. This begs the question: What sort of a set is $\mathcal{R}(A)$?

If $y, w \in \mathcal{R}(A)$, say $y = Ax$, $w = Az$, then
\[ \alpha y + \beta w = \alpha Ax + \beta Az = A(\alpha x + \beta z). \]
This shows that $\alpha y + \beta w \in \mathcal{R}(A)$. Moreover, the zero vector lies in $\mathcal{R}(A)$, since $A0 = 0$. It follows that $\mathcal{R}(A)$ is a subspace of $\mathbf{R}^n$ (possibly the entire space $\mathbf{R}^n$; every vector space is a subspace of itself). Every linear operator has this property; if $f : X \to Y$ is a linear operator, then $\mathcal{R}(f)$ is a subspace of the vector space $Y$. (The same need not be true for a nonlinear operator.)
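The closure argument above can be checked numerically on a small case. The following Python sketch is only an illustration: the matrix, the vectors, the scalars, and the helper names `matvec` and `combo` are arbitrary choices made here, not taken from the text. It forms $y = Ax$ and $w = Az$ and confirms that $\alpha y + \beta w$ equals $A(\alpha x + \beta z)$, so that the combination is again in the range.

```python
from fractions import Fraction


def matvec(A, x):
    """Return the product Ax, with A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def combo(alpha, u, beta, v):
    """Return the linear combination alpha*u + beta*v of two vectors."""
    return [alpha * ui + beta * vi for ui, vi in zip(u, v)]


# Arbitrary illustrative data (assumptions, not from the text).
A = [[1, 2], [2, 4]]
x = [3, -1]
z = [0, 5]
alpha, beta = Fraction(1, 2), Fraction(-3)

y = matvec(A, x)  # y lies in the range of A
w = matvec(A, z)  # w lies in the range of A

lhs = combo(alpha, y, beta, w)             # alpha*y + beta*w
rhs = matvec(A, combo(alpha, x, beta, z))  # A(alpha*x + beta*z)
print(lhs == rhs)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is not affected by rounding.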
The geometry of subspaces of $\mathbf{R}^n$ is particularly simple: the proper subspaces of $\mathbf{R}^n$ (i.e. those that are not the entire space) are lower-dimensional spaces: lines in $\mathbf{R}^2$, lines and planes in $\mathbf{R}^3$, and so forth (we cannot visualize these objects in dimensions greater than three, but we can understand them by analogy). Since every subspace must contain the zero vector, not every line in $\mathbf{R}^2$, for example, is a subspace, but only those passing through the origin are.

With this understanding of the geometry of $\mathbf{R}^n$, we obtain the following conclusions:

• If $\mathcal{R}(A) = \mathbf{R}^n$, then $Ax = b$ has a solution for each $b \in \mathbf{R}^n$. (This is a tautology.)
• If $\mathcal{R}(A) \neq \mathbf{R}^n$, then $Ax = b$ fails to have a solution for almost every $b \in \mathbf{R}^n$. This is because a lower-dimensional subspace comprises very little of $\mathbf{R}^n$ (think of a line contained in the plane or in three-dimensional space).

Example 3.13. We consider the equation $Ax = b$, where $A \in \mathbf{R}^{2 \times 2}$ is given by
\[ A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}. \]
For any $x \in \mathbf{R}^2$, we have
\[ Ax = \begin{bmatrix} x_1 + 2x_2 \\ 2x_1 + 4x_2 \end{bmatrix} = (x_1 + 2x_2) \begin{bmatrix} 1 \\ 2 \end{bmatrix}. \]
This calculation shows that every vector $b$ in the range of $A$ is a multiple of the vector
\[ \begin{bmatrix} 1 \\ 2 \end{bmatrix}. \]
Therefore, the subspace $\mathcal{R}(A)$ is a line in the plane $\mathbf{R}^2$ (see Figure 3.1). Since this line is a very small part of $\mathbf{R}^2$, the system $Ax = b$ fails to have a solution for almost every $b \in \mathbf{R}^2$.

As mentioned at the beginning of this chapter, there is a close analogy between linear (algebraic) systems and linear differential equations. The reader should think carefully about the similarities between the following example and the previous one.

Example 3.14. We define the linear differential operator $L_N : C_N^2[0,\ell] \to C[0,\ell]$ by
\[ L_N u = -\frac{d^2u}{dx^2}, \]
Figure 3.1. The range of the matrix A in Example 3.13.
where
\[ C_N^2[0,\ell] = \left\{ u \in C^2[0,\ell] \,:\, \frac{du}{dx}(0) = \frac{du}{dx}(\ell) = 0 \right\} \]
(as defined in the previous section). If $f \in \mathcal{R}(L_N)$, then there exists $u \in C_N^2[0,\ell]$ such that $L_N u = f$. It follows that
\[ \int_0^\ell f(x)\,dx = -\int_0^\ell \frac{d^2u}{dx^2}(x)\,dx = -\left[ \frac{du}{dx}(x) \right]_0^\ell = -\frac{du}{dx}(\ell) + \frac{du}{dx}(0) = 0. \]
This shows that $f \in C[0,\ell]$ cannot belong to the range of $L_N$ unless it satisfies the special condition
\[ \int_0^\ell f(x)\,dx = 0. \tag{3.3} \]
In fact, $\mathcal{R}(L_N)$ is the set of all such $f$, as the reader is asked to show in Exercise 12.

Because the space $C[0,\ell]$ is infinite-dimensional, we cannot visualize this situation (as we could in the previous example). However, the reader should appreciate that most functions $f \in C[0,\ell]$ do not satisfy the condition (3.3). (For example,
the reader is invited to write down a quadratic polynomial at random and call it $f(x)$. Chances are that this quadratic will not satisfy (3.3).) Therefore, the range of $L_N$ is only a small part of $C[0,\ell]$, and for most choices of $f \in C[0,\ell]$, there is no solution to $L_N u = f$.

Example 3.15. We now define $L_D : C_D^2[0,\ell] \to C[0,\ell]$ by
\[ L_D u = -\frac{d^2u}{dx^2}, \]
where
\[ C_D^2[0,\ell] = \left\{ u \in C^2[0,\ell] \,:\, u(0) = u(\ell) = 0 \right\}. \]
The reader should recall from the previous section that the BVP
\[ -\frac{d^2u}{dx^2} = f(x), \quad 0 < x < \ell, \]
\[ u(0) = 0, \]
\[ u(\ell) = 0 \]
can be written as the linear operator equation $L_D u = f$. We will show that $\mathcal{R}(L_D)$ is all of $C[0,\ell]$ by showing that we can solve $L_D u = f$ for every $f \in C[0,\ell]$. The idea is to integrate twice and use the boundary conditions to determine the constants of integration. We have
\[ \frac{d^2u}{dx^2}(x) = -f(x), \quad 0 < x < \ell, \]
so
\[ \frac{du}{dx}(x) = -\int_0^x f(s)\,ds + C_1, \quad 0 < x < \ell. \]
We write
\[ \frac{du}{dx}(x) = -F(x) + C_1, \quad 0 < x < \ell, \]
where
\[ F(x) = \int_0^x f(s)\,ds. \]
We then integrate to obtain
\[ u(x) = -\int_0^x F(z)\,dz + C_1 x + C_2 = -\int_0^x \int_0^z f(s)\,ds\,dz + C_1 x + C_2, \quad 0 < x < \ell. \]
The reader should notice the use of the dummy variables of integration $s$ and $z$. The first boundary condition, $u(0) = 0$, implies that $C_2 = 0$. We then have
\[ u(\ell) = -\int_0^\ell \int_0^z f(s)\,ds\,dz + C_1 \ell; \]
since $u(\ell) = 0$, we obtain
\[ C_1 = \frac{1}{\ell} \int_0^\ell \int_0^z f(s)\,ds\,dz \]
and so
\[ u(x) = -\int_0^x \int_0^z f(s)\,ds\,dz + \frac{x}{\ell} \int_0^\ell \int_0^z f(s)\,ds\,dz, \quad 0 < x < \ell. \]

Suppose $x$ and $z$ both satisfy $Ax = b$. Then
\[ Ax = Az \;\Rightarrow\; Ax - Az = 0 \;\Rightarrow\; A(x - z) = 0, \]
the last step following from the linearity of $A$. If $x \neq z$, then $w = x - z$ is a nonzero vector satisfying $Aw = 0$.

On the other hand, suppose $x$ is a solution to $Ax = b$ and $w$ is a nonzero vector satisfying $Aw = 0$. Then
\[ A(x + w) = Ax + Aw = b + 0 = b, \]
and in this case there cannot be a unique solution to $Ax = b$.

Because of the above observations, we define the null space of $A$ to be
\[ \mathcal{N}(A) = \{ x \in \mathbf{R}^n \,:\, Ax = 0 \}. \]
Since $A0 = 0$ always holds for a linear operator $A$, we always have $0 \in \mathcal{N}(A)$. Moreover, if $x, z \in \mathcal{N}(A)$ and $\alpha, \beta \in \mathbf{R}$, then
\[ Ax = 0, \quad Az = 0 \]
and so
\[ A(\alpha x + \beta z) = \alpha Ax + \beta Az = \alpha \cdot 0 + \beta \cdot 0 = 0. \]
Therefore, $\alpha x + \beta z \in \mathcal{N}(A)$. This shows that $\mathcal{N}(A)$ is a subspace of $\mathbf{R}^n$. If $0$ is the only vector in $\mathcal{N}(A)$, we say that $\mathcal{N}(A)$ is trivial.

Our observations above lead to the following conclusion: If $Ax = b$ has a solution, it is unique if and only if $\mathcal{N}(A)$ is trivial. Furthermore, nothing in the above discussion depends on $A$'s being a matrix operator; the same arguments can be made for any linear operator, such as a differential operator.
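The link between a nontrivial null space and non-uniqueness can be seen in a small computation. The sketch below is illustrative only: the singular matrix `A`, the right-hand side `b`, the particular solution `x`, the null vector `w`, and the helper `matvec` are all choices made for this example. It checks that $Aw = 0$ and that $x + \alpha w$ solves $Ax = b$ for several values of $\alpha$.

```python
def matvec(A, x):
    """Return the product Ax, with A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Illustrative data (assumptions): a singular 2x2 matrix whose second
# row is twice the first, and a right-hand side that lies in its range.
A = [[1, 2], [2, 4]]
b = [3, 6]
x = [1, 1]    # one solution: A x = [3, 6]
w = [2, -1]   # a nonzero null vector: A w = [0, 0]

# Every vector x + alpha*w is again a solution of A x = b.
solutions = [[xi + alpha * wi for xi, wi in zip(x, w)]
             for alpha in range(-2, 3)]

for s in solutions:
    print(s, matvec(A, s))
```

Each printed pair shows a different vector with the same image $b$, which is the situation described above when $\mathcal{N}(A)$ contains a nonzero vector.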
If $\mathcal{N}(A)$ is nontrivial and $Ax = b$ has a solution, then the equation has in fact infinitely many solutions. To see this, suppose $x \in \mathbf{R}^n$ satisfies $Ax = b$ and $w \in \mathcal{N}(A)$, $w \neq 0$. Then, for each $\alpha \in \mathbf{R}$, we have
\[ A(x + \alpha w) = Ax + \alpha Aw = Ax + \alpha 0 = Ax = b. \]
Since $x + \alpha w$ is different for each different choice of the real number $\alpha$, this shows that the equation has infinitely many solutions. Moreover, it easily follows that the set of all solutions to $Ax = b$ is, in this case,
\[ x + \mathcal{N}(A) = \{ x + w \,:\, w \in \mathcal{N}(A) \}. \]
Once again, the same properties hold for any linear operator equation.

Example 3.16. Let $A \in \mathbf{R}^{4 \times 4}$ be defined by
\[ A = \begin{bmatrix} 1 & 3 & -1 & 2 \\ 0 & 1 & 4 & 2 \\ 2 & 7 & 2 & 6 \\ 1 & 4 & 3 & 4 \end{bmatrix}. \]
Consider the equation $Ax = b$, where $b \in \mathbf{R}^4$ is arbitrary. Using the standard elimination algorithm,$^8$ the system $Ax = b$ can be shown to be equivalent to the system
\[ \begin{aligned} x_1 - 13x_3 - 4x_4 &= b_1 - 3b_2, \\ x_2 + 4x_3 + 2x_4 &= b_2, \\ 0 &= b_3 - b_2 - 2b_1, \\ 0 &= b_4 - b_2 - b_1. \end{aligned} \]
We see that the system is inconsistent unless the conditions
\[ b_3 - b_2 - 2b_1 = 0, \quad b_4 - b_2 - b_1 = 0 \tag{3.5} \]
hold. If these conditions are satisfied by $b$, then
\[ \begin{aligned} x_1 &= b_1 - 3b_2 + 13x_3 + 4x_4, \\ x_2 &= b_2 - 4x_3 - 2x_4, \end{aligned} \]
where $x_3$ and $x_4$ can take on any value. Setting $x_3 = s$ and $x_4 = t$, every vector of the form
\[ x = \begin{bmatrix} b_1 - 3b_2 \\ b_2 \\ 0 \\ 0 \end{bmatrix} + s \begin{bmatrix} 13 \\ -4 \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} 4 \\ -2 \\ 0 \\ 1 \end{bmatrix} \]

$^8$ We assume that the reader is familiar with Gaussian elimination, the standard row reduction algorithm for solving $Ax = b$. For a review, see any introductory text on linear algebra, such as the text by Lay [34].
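is a solution of the system. We have that
\[ \begin{bmatrix} b_1 - 3b_2 \\ b_2 \\ 0 \\ 0 \end{bmatrix} \]
is one solution of $Ax = b$, and
\[ \mathcal{N}(A) = \left\{ s \begin{bmatrix} 13 \\ -4 \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} 4 \\ -2 \\ 0 \\ 1 \end{bmatrix} \,:\, s, t \in \mathbf{R} \right\}. \]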
Example 3.17. We compute the null space of the operator $L_N$ defined in Example 3.14. If $L_N u = 0$, then $u$ satisfies the BVP
\[ -\frac{d^2u}{dx^2} = 0, \quad 0 < x < \ell, \]
\[ \frac{du}{dx}(0) = 0, \]
\[ \frac{du}{dx}(\ell) = 0. \]
10. Suppose $f : \mathbf{R}^2 \to \mathbf{R}^2$ is defined by
\[ f(x) = \begin{bmatrix} x_1^2 + x_2^2 \\ x_2 - x_1^2 \end{bmatrix} \]
and
b=[~].
(a) Show that $f(x) = b$ has exactly two solutions. (Hint: A graph is useful.)

(b) Show that the only solution to $f(x) = 0$ is $x = 0$. (Yet, as Part 10a shows, the solution to $f(x) = b$ is not unique.)

This example illustrates that the properties of linear systems do not necessarily carry over to nonlinear systems.

11. Let $D : C^1[a,b] \to C[a,b]$ be the differentiation operator:
\[ Df = \frac{df}{dx}. \]

(a) Show that the range of $D$ is all of $C[a,b]$.

(b) Part 11a is equivalent to saying that, for every $f \in C[a,b]$, the (differential) equation $Du = f$ has a solution. Is this solution unique? Why or why not?

12. Let $L_N$ be defined as in Example 3.14, and suppose $f \in C[0,\ell]$ satisfies
\[ \int_0^\ell f(x)\,dx = 0. \]
Show that $f \in \mathcal{R}(L_N)$.
3.3 Basis and dimension
The matrix-vector product $Ax$ is equivalent to a linear combination of the columns of the matrix $A$. If $A$ has columns $v_1, v_2, \ldots, v_n$, then
\[ Ax = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n. \]
The reader should write out a specific example if this is not clear (see Exercise 1). The quantities $x_1, x_2, \ldots, x_n$ are scalars and $v_1, v_2, \ldots, v_n$ are vectors; an expression such as $x_1 v_1 + x_2 v_2 + \cdots + x_n v_n$ is called a linear combination of the vectors $v_1, v_2, \ldots, v_n$ because the vectors are combined using the linear operations of addition and scalar multiplication.

When $A \in \mathbf{R}^{n \times n}$ is nonsingular, each $b \in \mathbf{R}^n$ can be written in a unique way as a linear combination of the columns of $A$ (that is, the equation $Ax = b$ has a unique solution). The following definition is related.

Definition 3.23. Let $V$ be a vector space, and suppose $v_1, v_2, \ldots, v_n$ are vectors in $V$ with the property that each $v \in V$ can be written in a unique way as a linear combination of $\{v_1, v_2, \ldots, v_n\}$. Then $\{v_1, v_2, \ldots, v_n\}$ is called a basis of $V$. Moreover, we say that $n$ is the dimension of $V$.

A vector space can have many different bases, but it can be shown that each contains the same number of vectors, so the concept of dimension is well-defined.

We now present several examples of bases.
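Before turning to the examples, the opening observation of this section (that $Ax$ is a linear combination of the columns of $A$) can be sketched in Python. The matrix and vector below, and the function names `matvec` and `column_combination`, are hypothetical choices made for illustration; the two functions compute the same vector in the two ways described.

```python
def matvec(A, x):
    """Row-oriented product: entry i is the dot product of row i with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def column_combination(A, x):
    """Column-oriented product: x_1*v_1 + ... + x_n*v_n, v_j the columns of A."""
    n_rows = len(A)
    result = [0] * n_rows
    for j, xj in enumerate(x):
        column = [A[i][j] for i in range(n_rows)]  # v_j, the j-th column
        result = [r + xj * c for r, c in zip(result, column)]
    return result


# Arbitrary illustrative data (assumptions, not from the text).
A = [[1, 2, 0],
     [0, 1, -1],
     [3, 0, 2]]
x = [2, -1, 4]

by_rows = matvec(A, x)
by_columns = column_combination(A, x)
print(by_rows, by_columns)
```

Both computations give the same vector; only the order in which the products $A_{ij} x_j$ are accumulated differs.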
Example 3.24. The standard basis for $\mathbf{R}^n$ is $\{e_1, e_2, \ldots, e_n\}$, where every entry of $e_j$ is zero except the $j$th, which is one. Then we obviously have, for any $x \in \mathbf{R}^n$,
\[ x = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n, \]
and it is not hard to see that this representation is unique. For example, for $x \in \mathbf{R}^3$,
\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]

Example 3.25. An alternate basis for $\mathbf{R}^3$ is $\{v_1, v_2, v_3\}$, where

It may not be obvious to the reader why one would want to use this basis instead of the much simpler basis $\{e_1, e_2, e_3\}$. However, it is easy to check that
\[ v_1 \cdot v_2 = 0, \quad v_1 \cdot v_3 = 0, \quad v_2 \cdot v_3 = 0, \]
and this property makes the basis $\{v_1, v_2, v_3\}$ almost as easy to use as $\{e_1, e_2, e_3\}$. We explore this topic in the next section.

Example 3.26. The set $P_n$ is the vector space of all polynomials of degree $n$ or less (see Exercise 3.1.3). The standard basis is $\{1, x, x^2, \ldots, x^n\}$. To see that this is indeed a basis, we first note that every polynomial $p \in P_n$ can be written as a linear combination of $1, x, x^2, \ldots, x^n$:
\[ p(x) = c_0 \cdot 1 + c_1 x + c_2 x^2 + \cdots + c_n x^n \]
(this is just the definition of polynomial of degree $n$). Showing that this representation is unique is a little subtle. If we also had
\[ p(x) = d_0 \cdot 1 + d_1 x + d_2 x^2 + \cdots + d_n x^n, \]
then subtraction would yield
\[ (c_0 - d_0) + (c_1 - d_1)x + \cdots + (c_n - d_n)x^n = 0 \]
for every $x$. However, a nonzero polynomial of degree $n$ can have at most $n$ roots, so it must be the case that $(c_0 - d_0) + (c_1 - d_1)x + \cdots + (c_n - d_n)x^n$ is the zero polynomial. That is, $c_0 = d_0$, $c_1 = d_1$, $\ldots$, $c_n = d_n$ must hold.

Example 3.27. An alternate basis for $P_2$ is
\[ \left\{ 1, \; x - \frac{1}{2}, \; x^2 - x + \frac{1}{6} \right\} \]
(the advantage of this basis will be discussed in Example 3.39 in the next section). To show that this is indeed a basis, we must show that, given any $p(x) = c_0 + c_1 x + c_2 x^2$, there is a unique choice of the scalars $a_0, a_1, a_2$ such that
\[ a_0 \cdot 1 + a_1 \left( x - \frac{1}{2} \right) + a_2 \left( x^2 - x + \frac{1}{6} \right) = c_0 + c_1 x + c_2 x^2. \]
This equation is equivalent to the three linear equations
\[ \begin{aligned} a_0 - \frac{1}{2} a_1 + \frac{1}{6} a_2 &= c_0, \\ a_1 - a_2 &= c_1, \\ a_2 &= c_2. \end{aligned} \]
The reader can easily verify that this system has a unique solution, regardless of the values of $c_0, c_1, c_2$.

Example 3.28. Yet another basis for $P_2$ is $\{L_1, L_2, L_3\}$, where
\[ L_1(x) = 2\left( x - \frac{1}{2} \right)(x - 1), \quad L_2(x) = -4x(x - 1), \quad L_3(x) = 2x\left( x - \frac{1}{2} \right). \]
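A quick way to get a feel for these three quadratics is to evaluate them at the points $0$, $1/2$, and $1$. The Python sketch below does so with exact rational arithmetic; the sample polynomial `p` and the test point are arbitrary choices made for illustration. It also checks, at that one point, that `p` agrees with the combination of $L_1, L_2, L_3$ weighted by the values of `p` at the three points.

```python
from fractions import Fraction

half = Fraction(1, 2)


def L1(x):
    return 2 * (x - half) * (x - 1)


def L2(x):
    return -4 * x * (x - 1)


def L3(x):
    return 2 * x * (x - half)


nodes = [Fraction(0), half, Fraction(1)]

# Row i holds the values of L_i at the three nodes.
table = [[L(xj) for xj in nodes] for L in (L1, L2, L3)]
for row in table:
    print([str(v) for v in row])


def p(x):
    """An arbitrary sample quadratic (an assumption made for this sketch)."""
    return 3 - 2 * x + 5 * x * x


# Compare p with the combination p(x_1) L_1 + p(x_2) L_2 + p(x_3) L_3
# at a point that is not one of the nodes.
t = Fraction(2)
combination = sum(p(xj) * L(t) for xj, L in zip(nodes, (L1, L2, L3)))
print(p(t), combination)
```

Each polynomial takes the value one at exactly one of the three points and zero at the other two, which is what makes the weights in the combination so easy to read off.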
If we write $x_1 = 0$, $x_2 = 1/2$, and $x_3 = 1$, then the property
\[ L_i(x_j) = \begin{cases} 1, & i = j, \\ 0, & i \neq j \end{cases} \tag{3.9} \]
holds. From this property, the properties of a basis can be verified (see Exercise 5).

There are two essential properties of a basis $\{v_1, v_2, \ldots, v_n\}$ of a vector space $V$. First, every vector in $V$ can be represented as a linear combination of the basis vectors. Second, this representation is unique. The following two definitions provide concise ways to express these two properties.

Definition 3.29. Let $V$ be a vector space, and suppose $\{v_1, v_2, \ldots, v_n\}$ is a collection of vectors in $V$. The span of $\{v_1, v_2, \ldots, v_n\}$ is the set of all linear combinations of these vectors:
\[ \mathrm{span}\{v_1, v_2, \ldots, v_n\} = \{ \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n \,:\, \alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbf{R} \}. \]

Thus, one of the properties of a basis $\{v_1, v_2, \ldots, v_n\}$ of a vector space $V$ is that
\[ V = \mathrm{span}\{v_1, v_2, \ldots, v_n\}. \]
The reader should also be aware that, for any vectors $v_1, v_2, \ldots, v_n$ in a vector space $V$, $\mathrm{span}\{v_1, v_2, \ldots, v_n\}$ is a subspace of $V$ (possibly the entire space $V$, as in the case of a basis).

Definition 3.30. A set of vectors $\{v_1, v_2, \ldots, v_n\}$ is called linearly independent if the only scalars $c_1, c_2, \ldots, c_n$ satisfying
\[ c_1 v_1 + c_2 v_2 + \cdots + c_n v_n = 0 \]
are $c_1 = c_2 = \cdots = c_n = 0$.

It can be shown that the uniqueness part of the definition of a basis is equivalent to the linear independence of the basis vectors. Therefore, a basis for a vector space is a linearly independent spanning set.

A third quality of a basis is the number of vectors in it, which is the dimension of the vector space. It can be shown that any two of these properties imply the third. That is, if $V$ has dimension $n$, then any two of the following statements about $\{v_1, v_2, \ldots, v_k\}$ imply the third:

• $k = n$;
• $\{v_1, v_2, \ldots, v_k\}$ is linearly independent;
• $\{v_1, v_2, \ldots, v_k\}$ spans $V$.
Thus if $\{v_1, v_2, \ldots, v_k\}$ is known to satisfy two of the above properties, then it is a basis for $V$.

Before leaving the topic of basis, we wish to remind the reader of the fact indicated in the opening paragraphs of this section, which is so fundamental that we express it formally as a theorem.

Theorem 3.31. Let $A$ be an $n \times n$ matrix. Then $A$ is nonsingular if and only if the columns of $A$ form a basis for $\mathbf{R}^n$.
Thus, when $A$ is nonsingular, its columns form a basis for $\mathbf{R}^n$, and solving $Ax = b$ is equivalent to finding the weights that express $b$ as a linear combination of this basis. This fact answers the following important question.$^9$ Suppose we have a basis $v_1, v_2, \ldots, v_n$ for $\mathbf{R}^n$ and a vector $b \in \mathbf{R}^n$. Then, of course, $b$ is a linear combination of the basis vectors. How do we find the weights in this linear combination? How expensive is it to do so (that is, how much work is required)?

To find the scalars $x_1, x_2, \ldots, x_n$ in the equation
\[ b = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n, \]
we define$^{10}$
\[ A = [v_1 | v_2 | \cdots | v_n] \]
and solve
\[ Ax = b \]
via Gaussian elimination. The expense of computing $x$ can be measured by counting the number of arithmetic operations (the number of additions, subtractions, multiplications, and divisions) required. The total number of operations required to solve $Ax = b$ is a polynomial in $n$, and it is convenient to report just the leading term in the polynomial, which can be shown to be
\[ \frac{2}{3} n^3 \]
(the lower-order terms are not very significant when $n$ is large). We usually express this by saying that the operation count is
\[ O\left( \frac{2}{3} n^3 \right) \]
("on the order of $(2/3)n^3$").

In the next section, we discuss a certain special type of basis for which it is much easier to express a vector in terms of the basis.

$^9$ If the importance of this question is not apparent to the reader at this point, it will be after he or she reads the next two sections.

$^{10}$ This notation means that $A$ is the matrix whose columns are the vectors $v_1, v_2, \ldots, v_n$.
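The procedure just described (assemble the basis vectors as the columns of $A$, then solve $Ax = b$ by Gaussian elimination) can be sketched as follows. This is a minimal illustration rather than a production solver: the function name `basis_weights`, the row-swap pivoting rule, the way operations are tallied, and the sample basis are all assumptions made here. Exact rational arithmetic is used so that the computed weights can be compared exactly.

```python
from fractions import Fraction


def basis_weights(vectors, b):
    """Return (x, ops): weights x with x_1*v_1 + ... + x_n*v_n = b, and a
    tally of the arithmetic operations used (row swaps are not counted)."""
    n = len(b)
    # Augmented matrix [A | b], where column j of A is vectors[j].
    M = [[Fraction(vectors[j][i]) for j in range(n)] + [Fraction(b[i])]
         for i in range(n)]
    ops = 0
    # Forward elimination with a row swap whenever the pivot is zero.
    for k in range(n):
        pivot = next((r for r in range(k, n) if M[r][k] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        M[k], M[pivot] = M[pivot], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            ops += 1                          # one division
            for c in range(k + 1, n + 1):
                M[r][c] -= factor * M[k][c]
                ops += 2                      # one multiply, one subtract
            M[r][k] = Fraction(0)
    # Back substitution.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = M[i][n]
        for c in range(i + 1, n):
            s -= M[i][c] * x[c]
            ops += 2
        x[i] = s / M[i][i]
        ops += 1
    return x, ops


# Hypothetical basis of R^3 and target vector, chosen for illustration.
v1, v2, v3 = [1, 1, 0], [0, 1, 1], [1, 0, 1]
b = [2, 3, 5]
weights, ops = basis_weights([v1, v2, v3], b)
print([str(w) for w in weights], ops)

# Rebuild b from the weights as a consistency check.
rebuilt = [sum(w * v[i] for w, v in zip(weights, (v1, v2, v3)))
           for i in range(3)]
print(rebuilt == b)
```

For small $n$ the tally includes the lower-order terms, so it should not be expected to equal the leading term $(2/3)n^3$; the leading term dominates only as $n$ grows.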
Exercises

1.
(a) Let A
~ -~ -~ [
-n, ~ -n x
[
Compute both $Ax$ and the corresponding linear combination of the columns of $A$, and verify that they are equal.

(b) Let $A \in \mathbf{R}^{n \times n}$ and $x \in \mathbf{R}^n$, and suppose the columns of $A$ are
\[ v_1, v_2, \ldots, v_n, \]
so that the $(i,j)$-entry of $A$ is $(v_j)_i$. Compute both $(Ax)_i$ and $(x_1 v_1 + x_2 v_2 + \cdots + x_n v_n)_i$, and verify that they are equal.

2. Is

a basis for $\mathbf{R}^3$? (Hint: As explained in the last paragraphs of this section, the three given vectors form a basis for $\mathbf{R}^3$ if and only if $Ax = b$ has a unique solution for every $b \in \mathbf{R}^3$, where $A$ is the $3 \times 3$ matrix whose columns are the three given vectors.)

3. Is

a basis for $\mathbf{R}^3$? (See the hint for the previous exercise.)

4. Show that
\[ \{ x^2 + 1, \; x + 1, \; x^2 - x + 1 \} \]
is a basis for $P_2$, the space of polynomials of degree 2 or less. (Hint: Verify directly that the definition holds.)

5. Show that $\{L_1, L_2, L_3\}$, defined in Example 3.28, is a basis for $P_2$. (Hint: Use (3.9) to show that
\[ p(x) = p(x_1) L_1(x) + p(x_2) L_2(x) + p(x_3) L_3(x) \]
holds for every $p \in P_2$.)
6. Let $V$ be the space of all continuous, complex-valued functions defined on the real line:
\[ V = \{ f : \mathbf{R} \to \mathbf{C} \,:\, f \text{ is continuous} \}. \]
Define $W$ to be the subspace of $V$ spanned by $e^{ix}$ and $e^{-ix}$, where $i = \sqrt{-1}$. Show that $\{\cos(x), \sin(x)\}$ is another basis for $W$. (Hint: Use Euler's formula:
\[ e^{i\theta} = \cos(\theta) + i \sin(\theta).) \]
3.4 Orthogonal bases and projections
At the end of the last section, we discussed the question of expressing a vector in terms of a given basis. This question is important for the following reason, which we can only describe in general terms at the moment: Many problems that are posed in vector spaces admit a special basis, in terms of which the problem is easy to solve. That is, for many problems, there exists a special basis with the property that if all vectors are expressed in terms of that basis, then a very simple calculation will produce the final solution. For this reason, it is important to be able to take a vector (perhaps expressed in terms of a standard basis) and express it in terms of a different basis. In the latter part of this section, we will study one type of problem for which it is advantageous to use a special basis, and we will discuss another such problem in the next section.

It is quite easy to express a vector in terms of a basis if that basis is orthogonal. We wish to describe the concept of an orthogonal basis and show some important examples. Before we can do so, we must introduce the idea of an inner product, which is a generalization of the Euclidean dot product.

The dot product plays a special role in the geometry of $\mathbf{R}^2$ and $\mathbf{R}^3$. The reason for this is the fact that two vectors $x, y$ in $\mathbf{R}^2$ or $\mathbf{R}^3$ are perpendicular if and only if
\[ x \cdot y = 0. \]
Indeed, one can show that
\[ x \cdot y = \|x\| \|y\| \cos(\theta), \]
where $\theta$ is the angle between the two vectors (see Figure 3.2).

From elementary Euclidean geometry, we know that, if $x$ and $y$ are perpendicular, then
\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 \]
(the Pythagorean theorem). Using the dot product, we can give a purely algebraic proof of the Pythagorean theorem. By definition,
\[ \|u\| = \sqrt{u \cdot u}, \]
so
\[ \begin{aligned} \|x + y\|^2 &= (x + y) \cdot (x + y) \\ &= x \cdot x + x \cdot y + y \cdot x + y \cdot y \\ &= x \cdot x + 2 x \cdot y + y \cdot y \\ &= \|x\|^2 + 2 x \cdot y + \|y\|^2. \end{aligned} \]
Figure 3.2. The angle between two vectors.

This calculation shows that

$$\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$$

holds if and only if $\mathbf{x} \cdot \mathbf{y} = 0$.

Seen this way, the Pythagorean theorem is an algebraic property that can be deduced in $\mathbf{R}^n$, $n > 3$, even though in those spaces we cannot visualize vectors or what it means for vectors to be perpendicular. We prefer to use the word orthogonal instead of perpendicular: Vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbf{R}^n$ are orthogonal if $\mathbf{x} \cdot \mathbf{y} = 0$.

In the course of solving differential equations, we deal with function spaces in addition to Euclidean spaces, and our methods are heavily dependent on the existence of an inner product, the analogue of the dot product in more general vector spaces. Here is the definition:
Definition 3.32. Let $V$ be a real vector space. A (real) inner product on $V$ is a function, usually denoted $(\cdot,\cdot)$ or $(\cdot,\cdot)_V$, taking two vectors from $V$ and producing a real number. This function must satisfy the following three properties:

1. $(\mathbf{u}, \mathbf{v}) = (\mathbf{v}, \mathbf{u})$ for all vectors $\mathbf{u}$ and $\mathbf{v}$;
2. $(\alpha\mathbf{u} + \beta\mathbf{v}, \mathbf{w}) = \alpha(\mathbf{u}, \mathbf{w}) + \beta(\mathbf{v}, \mathbf{w})$ and $(\mathbf{w}, \alpha\mathbf{u} + \beta\mathbf{v}) = \alpha(\mathbf{w}, \mathbf{u}) + \beta(\mathbf{w}, \mathbf{v})$ for all vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$, and all real numbers $\alpha$ and $\beta$;
3. $(\mathbf{u}, \mathbf{u}) \ge 0$ for all vectors $\mathbf{u}$, and $(\mathbf{u}, \mathbf{u}) = 0$ if and only if $\mathbf{u}$ is the zero vector.

It should be easy to check that these properties hold for the ordinary dot product on Euclidean $n$-space.

Given an inner product space (a vector space with an inner product), we define orthogonality just as in Euclidean space: two vectors are orthogonal if and only if their inner product is zero. It can then be shown that the Pythagorean theorem holds (see Exercise 3).
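The algebraic form of the Pythagorean theorem can be checked numerically in a space we cannot visualize. The following is a minimal sketch in plain Python; the two vectors in $\mathbf{R}^4$ are illustrative choices (not taken from the text), picked so that their dot product is zero, and the helper names `dot`, `norm`, and `add` are hypothetical.

```python
import math

def dot(u, v):
    """Euclidean dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Norm induced by the dot product: ||u|| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def add(u, v):
    """Componentwise sum of two vectors."""
    return [ui + vi for ui, vi in zip(u, v)]

# Illustrative vectors in R^4 (an assumption of this sketch); x . y = 0.
x = [1.0, 2.0, 0.0, -1.0]
y = [2.0, -1.0, 3.0, 0.0]

# Difference between ||x + y||^2 and ||x||^2 + ||y||^2.
pythagoras_gap = norm(add(x, y)) ** 2 - (norm(x) ** 2 + norm(y) ** 2)
print(dot(x, y), pythagoras_gap)
```

Because the vectors are orthogonal, the printed gap should vanish up to rounding error; for non-orthogonal vectors it would equal $2\,\mathbf{x} \cdot \mathbf{y}$.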
An orthogonal basis for an inner product space $V$ is a basis $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ with the property that

$$i \ne j \;\Rightarrow\; (\mathbf{v}_i, \mathbf{v}_j) = 0$$

(that is, every vector in the basis is orthogonal to every other vector in the basis).

We now demonstrate the first special property of an orthogonal basis. Suppose $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthogonal basis for an inner product space $V$ and $\mathbf{x}$ is any vector in $V$. Then there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that

$$\mathbf{x} = \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \cdots + \alpha_n\mathbf{v}_n. \tag{3.10}$$

To deduce the value of $\alpha_i$, we take the inner product of both sides of (3.10) with $\mathbf{v}_i$:

$$\begin{aligned}
(\mathbf{v}_i, \mathbf{x}) &= (\mathbf{v}_i, \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \cdots + \alpha_n\mathbf{v}_n) \\
&= \alpha_1(\mathbf{v}_i, \mathbf{v}_1) + \alpha_2(\mathbf{v}_i, \mathbf{v}_2) + \cdots + \alpha_n(\mathbf{v}_i, \mathbf{v}_n) \\
&= \alpha_i(\mathbf{v}_i, \mathbf{v}_i).
\end{aligned}$$

The last step follows from the fact that every inner product $(\mathbf{v}_i, \mathbf{v}_j)$ vanishes except $(\mathbf{v}_i, \mathbf{v}_i)$. We then obtain

$$\alpha_i = \frac{(\mathbf{v}_i, \mathbf{x})}{(\mathbf{v}_i, \mathbf{v}_i)}, \quad i = 1, 2, \ldots, n,$$

and so

$$\mathbf{x} = \frac{(\mathbf{v}_1, \mathbf{x})}{(\mathbf{v}_1, \mathbf{v}_1)}\mathbf{v}_1 + \frac{(\mathbf{v}_2, \mathbf{x})}{(\mathbf{v}_2, \mathbf{v}_2)}\mathbf{v}_2 + \cdots + \frac{(\mathbf{v}_n, \mathbf{x})}{(\mathbf{v}_n, \mathbf{v}_n)}\mathbf{v}_n. \tag{3.11}$$

This formula shows that it is easy to express a vector in terms of an orthogonal basis. Assuming that we compute $(\mathbf{v}_1, \mathbf{v}_1), (\mathbf{v}_2, \mathbf{v}_2), \ldots, (\mathbf{v}_n, \mathbf{v}_n)$ once and for all, it requires just $n$ inner products to find the weights in the linear combination. In the case of Euclidean $n$-vectors, a dot product requires $2n - 1$ arithmetic operations ($n$ multiplications and $n - 1$ additions), so the total cost is just

$$n(2n - 1) = 2n^2 - n.$$

If $n$ is large, this is much less costly than the $O(2n^3/3)$ operations required for a nonorthogonal basis. We also remark that if the basis is orthonormal (each basis vector is normalized to have length one), then (3.11) simplifies to

$$\mathbf{x} = (\mathbf{v}_1, \mathbf{x})\mathbf{v}_1 + (\mathbf{v}_2, \mathbf{x})\mathbf{v}_2 + \cdots + (\mathbf{v}_n, \mathbf{x})\mathbf{v}_n. \tag{3.12}$$
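As a small illustration of formula (3.11), the sketch below computes the weights $\alpha_i = (\mathbf{v}_i, \mathbf{x})/(\mathbf{v}_i, \mathbf{v}_i)$ for the Euclidean dot product and rebuilds the vector from them. The orthogonal basis of $\mathbf{R}^3$ and the vector `x` are hypothetical choices made for this example, and the function names are not from the text.

```python
def inner(u, v):
    """Euclidean inner product (the dot product)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def coefficients(basis, x):
    """Weights alpha_i = (v_i, x) / (v_i, v_i), as in formula (3.11).

    Valid only when the basis vectors are mutually orthogonal.
    """
    return [inner(v, x) / inner(v, v) for v in basis]

# Hypothetical orthogonal (not orthonormal) basis of R^3.
basis = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0], [1.0, -2.0, 1.0]]
x = [1.0, 2.0, 3.0]

alpha = coefficients(basis, x)
# Rebuild x as the linear combination sum_i alpha_i v_i.
rebuilt = [sum(a * v[k] for a, v in zip(alpha, basis)) for k in range(3)]
print(alpha, rebuilt)
```

No linear system is solved here: each weight costs one inner product and one division, which is the source of the $2n^2 - n$ operation count discussed above.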
Example 3.33. The basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $\mathbf{R}^3$, where

is orthonormal, as can be verified directly. If

then

3.4.1 The $L^2$ inner product
We have seen that functions can be regarded as vectors, at least in a formal sense: functions can be added together and multiplied by scalars. (See Example 3.4 in Section 3.1.) We will now show more directly that functions are not so different from Euclidean vectors. In the process, we show that a suitable inner product can be defined for functions.

Suppose we have a function $g \in C[a,b]$, that is, a continuous function defined on the interval $[a,b]$. By sampling $g$ on a grid, we can produce a vector that approximates the function $g$. Let $x_i = a + i\Delta x$, $\Delta x = (b - a)/N$, and define a vector $\mathbf{G} \in \mathbf{R}^N$ by

$$G_i = g(x_i), \quad i = 0, 1, \ldots, N-1.$$

Then $\mathbf{G}$ can be regarded as an approximation to $g$ (see Figure 3.3). Given another function $f(x)$ and the corresponding vector $\mathbf{F} \in \mathbf{R}^N$, we have

$$\mathbf{F} \cdot \mathbf{G} = \sum_{i=0}^{N-1} F_i G_i = \sum_{i=0}^{N-1} f(x_i) g(x_i).$$
Refining the discretization (increasing $N$) leads to a sampled function that obviously represents the original function more accurately. Therefore, we ask: What happens to $\mathbf{F} \cdot \mathbf{G}$ as $N \to \infty$? The dot product

$$\mathbf{F} \cdot \mathbf{G} = \sum_{i=0}^{N-1} f(x_i) g(x_i)$$

does not converge to any value as $N \to \infty$, but a simple modification induces convergence. We replace the ordinary dot product by the following scaled dot product, for which we introduce a new notation:

$$(\mathbf{F}, \mathbf{G}) = \sum_{i=0}^{N-1} F_i G_i \Delta x.$$
Figure 3.3. Approximating a function $g(x)$ by a vector $\mathbf{G} \in \mathbf{R}^N$.
Then, when $\mathbf{F}$ and $\mathbf{G}$ are sampled functions as above, we have

$$(\mathbf{F}, \mathbf{G}) = \sum_{i=0}^{N-1} f(x_i) g(x_i) \Delta x \to \int_a^b f(x) g(x)\,dx \quad \text{as } N \to \infty.$$

Based on this observation, we argue that a natural inner product $(\cdot,\cdot)$ on $C[a,b]$ is

$$(f, g) = \int_a^b f(x) g(x)\,dx. \tag{3.13}$$
Just as the dot product defines a norm on Euclidean $n$-space ($\|\mathbf{x}\| = \sqrt{\mathbf{x} \cdot \mathbf{x}}$), so the inner product (3.13) defines a norm for functions:

$$\|f\| = \sqrt{(f, f)} = \sqrt{\int_a^b |f(x)|^2\,dx}. \tag{3.14}$$
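The passage from the dot product of sampled vectors to the integral (3.13) can be observed numerically. The sketch below samples two functions on $[0,1]$ and prints both the ordinary and the scaled dot product for increasing $N$. The functions $f(x) = x$ and $g(x) = x^2$ are illustrative choices (their $L^2$ inner product on $[0,1]$ is $1/4$), and the helper names are hypothetical.

```python
def sample(f, a, b, N):
    """Sample f at x_i = a + i*dx, i = 0, ..., N-1, with dx = (b - a)/N."""
    dx = (b - a) / N
    return [f(a + i * dx) for i in range(N)], dx

def plain_dot(F, G):
    """Ordinary dot product F . G."""
    return sum(Fi * Gi for Fi, Gi in zip(F, G))

def scaled_dot(F, G, dx):
    """Scaled dot product (F, G) = sum_i F_i G_i dx."""
    return plain_dot(F, G) * dx

f = lambda x: x        # illustrative choice, not from the text
g = lambda x: x * x    # exact L2 inner product with f on [0, 1] is 1/4

results = {}
for N in (10, 100, 1000):
    F, dx = sample(f, 0.0, 1.0, N)
    G, _ = sample(g, 0.0, 1.0, N)
    results[N] = (plain_dot(F, G), scaled_dot(F, G, dx))
    print(N, results[N])
```

The first number in each pair grows roughly in proportion to $N$, while the second settles toward the value of the integral, which is the behavior the text describes.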
For completeness, we give the definition of norm. Norms measure the size or magnitude of vectors, and the definition is intended to describe abstractly the properties that any reasonable notion of size ought to have.

Definition 3.34. Let $V$ be a vector space. A norm on $V$ is a real-valued function with domain $V$, usually denoted by $\|\cdot\|$ or $\|\cdot\|_V$, and satisfying the following properties:

1. $\|\mathbf{v}\| \ge 0$ for all $\mathbf{v} \in V$ and $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$.
2. $\|\alpha\mathbf{v}\| = |\alpha| \, \|\mathbf{v}\|$ for all scalars $\alpha$ and all $\mathbf{v} \in V$.
3. $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ for all $\mathbf{u}, \mathbf{v} \in V$.

The last property is called the triangle inequality.
For Euclidean vectors in the plane, the triangle inequality expresses the fact that one side of a triangle cannot be longer than the sum of the other two sides.

The inner product defined by (3.13) is the so-called $L^2$ inner product.$^{11}$ Two functions in $C[a,b]$ are said to be orthogonal if $(f, g) = 0$. This condition does not have a direct geometric meaning, as the analogous condition does for Euclidean vectors in $\mathbf{R}^2$ or $\mathbf{R}^3$, but, as we argued above, orthogonality is still important algebraically.

When we measure norm in the $L^2$ sense, we say that functions $f$ and $g$ are close (for example, that $g$ is a good approximation to $f$) if

$$\|f - g\| = \sqrt{\int_a^b (f(x) - g(x))^2\,dx}$$

is small. This does not mean that $(f(x) - g(x))^2$ is small for every $x \in [a,b]$ ($(f(x) - g(x))^2$ can be large in places, as long as this difference is large only over very small intervals), but rather it implies that $(f(x) - g(x))^2$ is small on the average over the interval $[a,b]$. For this reason, we often use the term "mean-square" in referring to the $L^2$ norm (for example, we might say "$g$ is close to $f$ in the mean-square sense").
Example 3.35. If $f : [0,1] \to \mathbf{R}$ is defined by $f(x) = x(1 - x)$, then

$$\|f\| = \sqrt{\int_0^1 x^2 (1 - x)^2\,dx} = \frac{1}{\sqrt{30}} \doteq 0.1826.$$

With $g : [0,1] \to \mathbf{R}$ defined by

$$g(x) = \frac{8}{\pi^3} \sin(\pi x),$$

we have

$$\|f - g\| = \sqrt{\int_0^1 \left( x(1 - x) - \frac{8}{\pi^3} \sin(\pi x) \right)^2 dx} = \sqrt{\frac{\pi^6 - 960}{30\pi^6}} \doteq 0.006940.$$

These two functions differ by less than 4% in the mean-square sense (cf. Figure 3.4).
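The two norms in Example 3.35 can be checked by numerical integration. The following sketch uses a hand-written composite Simpson rule (a choice made for this illustration, not a method from the text) to approximate the integral in (3.14); the names `simpson` and `l2_norm` are hypothetical.

```python
import math

def simpson(h, a, b, n=1000):
    """Composite Simpson rule for the integral of h over [a, b]; n must be even."""
    dx = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * dx)
    return total * dx / 3

def l2_norm(h, a, b):
    """L2 norm of h on [a, b], with the integral approximated numerically."""
    return math.sqrt(simpson(lambda x: h(x) ** 2, a, b))

f = lambda x: x * (1 - x)
g = lambda x: 8 / math.pi ** 3 * math.sin(math.pi * x)

norm_f = l2_norm(f, 0.0, 1.0)
dist = l2_norm(lambda x: f(x) - g(x), 0.0, 1.0)
print(norm_f, dist, dist / norm_f)
```

The printed values should agree with the closed forms $1/\sqrt{30}$ and $\sqrt{(\pi^6 - 960)/(30\pi^6)}$ to many digits, and the last ratio is the relative mean-square difference referred to in the example.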
$^{11}$ The "L" refers to the French mathematician Lebesgue, and the "2" to the exponent in the formula for the $L^2$ norm of a function. The symbol $L^2$ is read "L-two."
Figure 3.4. The functions of Example 3.35.
3.4.2 The projection theorem

The projection theorem is about approximating a vector $\mathbf{v}$ in a vector space $V$ by a vector from a subspace $W$. More specifically, the projection theorem answers the question: Is there a vector $\mathbf{w} \in W$ closest to $\mathbf{v}$ (the best approximation to $\mathbf{v}$ from $W$), and if so, how can it be computed? Since this theorem is so important, and its proof is so informative, we will formally state and prove the theorem.

Theorem 3.36. Let $V$ be a vector space with inner product $(\cdot,\cdot)$, let $W$ be a finite-dimensional subspace of $V$, and let $\mathbf{v} \in V$.

1. There is a unique $\mathbf{u} \in W$ such that

$$\|\mathbf{v} - \mathbf{u}\| = \min_{\mathbf{w} \in W} \|\mathbf{v} - \mathbf{w}\|.$$

That is, there is a unique best approximation to $\mathbf{v}$ from $W$. We also call $\mathbf{u}$ the projection of $\mathbf{v}$ onto $W$, and write

$$\mathbf{u} = \operatorname{proj}_W \mathbf{v}.$$

2. A vector $\mathbf{u} \in W$ is the best approximation to $\mathbf{v}$ from $W$ if and only if

$$(\mathbf{v} - \mathbf{u}, \mathbf{z}) = 0 \text{ for all } \mathbf{z} \in W. \tag{3.15}$$

3. If $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is a basis for $W$, then

$$\operatorname{proj}_W \mathbf{v} = \sum_{i=1}^{n} x_i \mathbf{w}_i, \tag{3.16}$$

where

$$\mathbf{G}\mathbf{x} = \mathbf{b}, \quad G_{ij} = (\mathbf{w}_j, \mathbf{w}_i), \quad b_i = (\mathbf{w}_i, \mathbf{v}). \tag{3.17}$$

The equations represented by $\mathbf{G}\mathbf{x} = \mathbf{b}$ are called the normal equations, and the matrix $\mathbf{G}$ is called the Gram matrix.

4. If $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is an orthogonal basis for $W$, then the best approximation to $\mathbf{v}$ from $W$ is

$$\operatorname{proj}_W \mathbf{v} = \sum_{i=1}^{n} \frac{(\mathbf{w}_i, \mathbf{v})}{(\mathbf{w}_i, \mathbf{w}_i)} \mathbf{w}_i. \tag{3.18}$$

If the basis is orthonormal, this simplifies to

$$\operatorname{proj}_W \mathbf{v} = \sum_{i=1}^{n} (\mathbf{w}_i, \mathbf{v}) \mathbf{w}_i. \tag{3.19}$$
Proof. We will prove the second conclusion first. Suppose that $\mathbf{u} \in W$, and $\mathbf{z}$ is any other vector in $W$. Then, since $W$ is closed under addition and scalar multiplication, we have that $\mathbf{u} + t\mathbf{z} \in W$ for all real numbers $t$. On the other hand, every other vector $\mathbf{w}$ in $W$ can be written as $\mathbf{u} + t\mathbf{z}$ for some $\mathbf{z} \in W$ and some $t \in \mathbf{R}$ (just take $\mathbf{z} = \mathbf{w} - \mathbf{u}$ and $t = 1$). Therefore, $\mathbf{u} \in W$ is closest to $\mathbf{v}$ if and only if

$$\|\mathbf{v} - \mathbf{u}\| \le \|\mathbf{v} - (\mathbf{u} + t\mathbf{z})\| \text{ for all } \mathbf{z} \in W,\ t \in \mathbf{R}. \tag{3.20}$$

Since $\|\mathbf{x}\|^2 = (\mathbf{x}, \mathbf{x})$, this last inequality is equivalent to

$$\begin{aligned}
(\mathbf{v} - \mathbf{u}, \mathbf{v} - \mathbf{u}) &\le (\mathbf{v} - (\mathbf{u} + t\mathbf{z}), \mathbf{v} - (\mathbf{u} + t\mathbf{z})) \\
&= ((\mathbf{v} - \mathbf{u}) - t\mathbf{z}, (\mathbf{v} - \mathbf{u}) - t\mathbf{z}) \\
&= (\mathbf{v} - \mathbf{u}, \mathbf{v} - \mathbf{u}) - 2t(\mathbf{v} - \mathbf{u}, \mathbf{z}) + t^2(\mathbf{z}, \mathbf{z})
\end{aligned}$$

or to

$$t^2(\mathbf{z}, \mathbf{z}) - 2t(\mathbf{v} - \mathbf{u}, \mathbf{z}) \ge 0 \text{ for all } \mathbf{z} \in W,\ t \in \mathbf{R}.$$

If we regard $\mathbf{z}$ as fixed, then

$$t^2(\mathbf{z}, \mathbf{z}) - 2t(\mathbf{v} - \mathbf{u}, \mathbf{z})$$

is a simple quadratic in $t$, and the inequality holds for all $t$ if and only if $(\mathbf{v} - \mathbf{u}, \mathbf{z}) = 0$. It follows that the inequality holds for all $\mathbf{z}$ and all $t$ if and only if (3.15) holds. In addition, provided $\mathbf{z} \ne \mathbf{0}$, (3.20) holds as an equation only when $t = 0$ (since $(\mathbf{z}, \mathbf{z}) > 0$ for $\mathbf{z} \ne \mathbf{0}$). That is, if $\mathbf{w} \in W$ and $\mathbf{w} \ne \mathbf{u}$, then

$$\|\mathbf{v} - \mathbf{u}\| < \|\mathbf{v} - \mathbf{w}\|.$$

Thus, if the best approximation problem has a solution, it is unique.

We now prove the remaining conclusions to the theorem. Since $W$ is a finite-dimensional subspace, it has a basis $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$. A vector $\mathbf{u} \in W$ solves the best approximation problem if and only if (3.15) holds; moreover, it is straightforward to show that (3.15) is equivalent to

$$(\mathbf{v} - \mathbf{u}, \mathbf{w}_i) = 0 \text{ for } i = 1, 2, \ldots, n \tag{3.21}$$

(see Exercise 5). Any vector $\mathbf{u} \in W$ can be written as

$$\mathbf{u} = \sum_{j=1}^{n} x_j \mathbf{w}_j. \tag{3.22}$$

Thus, $\mathbf{u} \in W$ is a solution if and only if (3.22) holds and

$$\left( \mathbf{v} - \sum_{j=1}^{n} x_j \mathbf{w}_j,\ \mathbf{w}_i \right) = 0, \quad i = 1, 2, \ldots, n, \tag{3.23}$$

which simplifies to

$$\sum_{j=1}^{n} (\mathbf{w}_j, \mathbf{w}_i) x_j = (\mathbf{w}_i, \mathbf{v}), \quad i = 1, 2, \ldots, n. \tag{3.24}$$

If we define $\mathbf{G} \in \mathbf{R}^{n \times n}$ by $G_{ij} = (\mathbf{w}_j, \mathbf{w}_i)$ and $\mathbf{b} \in \mathbf{R}^n$ by $b_i = (\mathbf{w}_i, \mathbf{v})$, then (3.24) is equivalent to $\mathbf{G}\mathbf{x} = \mathbf{b}$.

It can be shown that $\mathbf{G}$ is nonsingular (see Exercise 6), so the unique best approximation to $\mathbf{v}$ from $W$ is given by (3.22), where $\mathbf{x}$ solves $\mathbf{G}\mathbf{x} = \mathbf{b}$.

If the basis is orthogonal, then $(\mathbf{w}_j, \mathbf{w}_i) = 0$ unless $j = i$. In this case, $\mathbf{G}$ is the diagonal matrix with diagonal entries

$$(\mathbf{w}_1, \mathbf{w}_1), (\mathbf{w}_2, \mathbf{w}_2), \ldots, (\mathbf{w}_n, \mathbf{w}_n),$$

and $\mathbf{G}\mathbf{x} = \mathbf{b}$ is equivalent to the $n$ simple equations

$$(\mathbf{w}_i, \mathbf{w}_i) x_i = (\mathbf{w}_i, \mathbf{v}), \quad i = 1, 2, \ldots, n,$$

that is, to

$$x_i = \frac{(\mathbf{v}, \mathbf{w}_i)}{(\mathbf{w}_i, \mathbf{w}_i)}, \quad i = 1, 2, \ldots, n.$$

This completes the proof. $\square$
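The construction in the proof translates directly into a computation: form the Gram matrix and right-hand side of (3.17), solve the normal equations, and assemble (3.16). The sketch below does this for Euclidean vectors with a small hand-written Gaussian elimination. The solver, the function names, and the sample vectors are all assumptions of this illustration, not part of the text.

```python
def inner(u, v):
    """Euclidean inner product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def solve(G, b):
    """Solve G x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(G, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def project(basis, v):
    """proj_W v = sum_i x_i w_i, where G x = b as in (3.17)."""
    n = len(basis)
    G = [[inner(basis[j], basis[i]) for j in range(n)] for i in range(n)]
    b = [inner(basis[i], v) for i in range(n)]
    x = solve(G, b)
    return [sum(x[i] * basis[i][k] for i in range(n)) for k in range(len(v))]

# Hypothetical data: project v onto the plane spanned by w1 and w2 in R^3.
w1, w2 = [1.0, 1.0, 0.0], [1.0, 0.0, 1.0]
v = [1.0, 2.0, 3.0]
u = project([w1, w2], v)
residual = [vk - uk for vk, uk in zip(v, u)]
print(u, inner(residual, w1), inner(residual, w2))
```

The two printed inner products check condition (3.21): the residual $\mathbf{v} - \mathbf{u}$ should be orthogonal to each basis vector, up to rounding error.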
We now present two examples of best approximation problems that commonly occur in scientific and computational practice.
Example 3.37. We assume that data points

$$\begin{aligned}
(x_1, y_1) &= (0.10, 1.7805), \\
(x_2, y_2) &= (0.30, 2.2285), \\
(x_3, y_3) &= (0.40, 2.3941), \\
(x_4, y_4) &= (0.75, 3.2226), \\
(x_5, y_5) &= (0.90, 3.5697)
\end{aligned}$$

have been collected in the laboratory, and there is a theoretical reason to believe that $y_i = a x_i + b$ ought to hold for some choices of $a, b \in \mathbf{R}$. Of course, due to measurement error, this relationship is unlikely to hold exactly for any choice of $a$ and $b$, but we would like to find $a$ and $b$ that come as close as possible to satisfying it. If we define

$$\mathbf{y} = \begin{bmatrix} 1.7805 \\ 2.2285 \\ 2.3941 \\ 3.2226 \\ 3.5697 \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} 0.10 \\ 0.30 \\ 0.40 \\ 0.75 \\ 0.90 \end{bmatrix}, \quad
\mathbf{e} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix},$$

then one way to pose this problem is to choose $a$ and $b$ so that $a\mathbf{x} + b\mathbf{e}$ is as close as possible to $\mathbf{y}$ in the Euclidean norm. That is, we find the best approximation to $\mathbf{y}$ from $W = \operatorname{span}\{\mathbf{x}, \mathbf{e}\}$.

The Gram matrix is

$$\mathbf{G} = \begin{bmatrix} \mathbf{x} \cdot \mathbf{x} & \mathbf{e} \cdot \mathbf{x} \\ \mathbf{x} \cdot \mathbf{e} & \mathbf{e} \cdot \mathbf{e} \end{bmatrix} = \begin{bmatrix} 1.6325 & 2.45 \\ 2.45 & 5 \end{bmatrix},$$

and the right-hand side of the normal equations is

$$\mathbf{b} = \begin{bmatrix} \mathbf{x} \cdot \mathbf{y} \\ \mathbf{e} \cdot \mathbf{y} \end{bmatrix} \doteq \begin{bmatrix} 7.4339 \\ 13.195 \end{bmatrix}.$$

Solving the normal equations yields

$$\begin{bmatrix} a \\ b \end{bmatrix} \doteq \begin{bmatrix} 2.2411 \\ 1.5409 \end{bmatrix}.$$

The resulting linear model is displayed, together with the original data, in Figure 3.5.
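The computation in Example 3.37 is small enough to reproduce in a few lines. The sketch below builds the Gram matrix and right-hand side from the data and solves the $2 \times 2$ normal equations by Cramer's rule, which is a convenience chosen for this illustration rather than a method prescribed by the text. The name `rhs` stands for the vector $\mathbf{b}$ of the normal equations, to avoid a clash with the intercept $b$.

```python
def dot(u, v):
    """Euclidean dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Data from Example 3.37.
x = [0.10, 0.30, 0.40, 0.75, 0.90]
y = [1.7805, 2.2285, 2.3941, 3.2226, 3.5697]
e = [1.0] * len(x)

# Gram matrix and right-hand side of the normal equations for W = span{x, e}.
G = [[dot(x, x), dot(e, x)],
     [dot(x, e), dot(e, e)]]
rhs = [dot(x, y), dot(e, y)]

# Solve the 2x2 system G [a, b]^T = rhs by Cramer's rule.
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
a = (rhs[0] * G[1][1] - G[0][1] * rhs[1]) / det
b = (G[0][0] * rhs[1] - G[1][0] * rhs[0]) / det
print(round(a, 4), round(b, 4))
```

The rounded slope and intercept should match the values reported in the example.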
Figure 3.5. The data from Example 3.37 and the approximate linear relationship.

Example 3.38. One advantage of working with polynomials instead of transcendental functions like $e^x$ is that polynomials are very easy to evaluate: only the basic arithmetic operations are required. For this reason, it is often desirable to approximate a more complicated function by a polynomial. Considering $f(x) = e^x$ as a function in $C[0,1]$, we find the best quadratic approximation, in the mean-square sense, to $f$. A basis for the space $\mathcal{P}_2$ of polynomials of degree 2 or less is $\{1, x, x^2\}$. It is easy to verify that the normal equations for the problem of finding the best approximation to $f$ from $\mathcal{P}_2$ are (in matrix-vector form)

$$\mathbf{G}\mathbf{a} = \mathbf{b},$$

where

$$\mathbf{G} = \begin{bmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} e - 1 \\ 1 \\ e - 2 \end{bmatrix}.$$

Since

$$\mathbf{G}^{-1}\mathbf{b} \doteq \begin{bmatrix} 1.013 \\ 0.8511 \\ 0.8392 \end{bmatrix},$$

the best quadratic approximation is

$$e^x \doteq p_2(x) = 1.013 + 0.8511x + 0.8392x^2.$$

The exponential function and the quadratic approximation are graphed in Figure 3.6.
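The normal equations of Example 3.38 can be assembled and solved directly. In the sketch below, the Gram matrix entries are the integrals $\int_0^1 x^{i+j}\,dx = 1/(i+j+1)$ (with $i, j$ counted from zero) and the right-hand side uses the closed-form values of $\int_0^1 x^i e^x\,dx$. The elimination routine is a hypothetical helper written for this illustration; it omits pivoting, which is adequate for this small symmetric positive definite system but is not a general-purpose solver.

```python
import math

def solve(G, b):
    """Gaussian elimination without pivoting (adequate for this small SPD system)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(G, b)]  # augmented matrix
    for k in range(n):
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            M[r] = [mr - m * mk for mr, mk in zip(M[r], M[k])]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

# G_ij = integral of x^(i+j) over [0, 1] for the basis {1, x, x^2} (0-based i, j).
G = [[1.0 / (i + j + 1) for j in range(3)] for i in range(3)]
# b_i = integral of x^i e^x over [0, 1], evaluated in closed form.
b = [math.e - 1.0, 1.0, math.e - 2.0]

a = solve(G, b)
p2 = lambda t: a[0] + a[1] * t + a[2] * t * t
print([round(c, 4) for c in a], p2(0.5), math.exp(0.5))
```

The rounded coefficients should agree with those in the example, and comparing `p2(0.5)` with `math.exp(0.5)` gives a quick sense of how close the quadratic is at the midpoint of the interval.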
Figure 3.6. The function $f(x) = e^x$ and the best quadratic approximation $y = p_2(x)$ (see Example 3.38).

Example 3.39. An orthonormal basis for $\mathcal{P}_2$ (on the interval $[0,1]$) is $\{q_1, q_2, q_3\}$, where

The best approximation $p_2$ to $e^x$ (which was calculated in the previous example) can be computed by the formula

$$p_2(x) = (q_1, f) q_1(x) + (q_2, f) q_2(x) + (q_3, f) q_3(x).$$

A direct calculation shows that the same result is obtained. (See Exercise 10.)

The concept of best approximation is central in this book, since the two main solution techniques we discuss, Fourier series and finite elements, both produce a best approximation to the true solution of a BVP.
Exercises

1. (a) Show that the basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ from Example 3.33 is an orthonormal basis for $\mathbf{R}^3$.

   (b) Express the vector

   as a linear combination of $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$.

2. Is the basis $\{L_1, L_2, L_3\}$ for $\mathcal{P}_2$ (on the interval $[0,1]$) given in Example 3.28 an orthogonal basis?

3. Let $V$ be an inner product space. Prove that $\mathbf{x}, \mathbf{y} \in V$ satisfy

$$\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$$

   if and only if $(\mathbf{x}, \mathbf{y}) = 0$.

4. Use the results of this section to show that any orthonormal set containing $n$ vectors in $\mathbf{R}^n$ is a basis for $\mathbf{R}^n$. (Hint: Since the dimension of $\mathbf{R}^n$ is $n$, it suffices to show either that the orthogonal set spans $\mathbf{R}^n$ or that it is linearly independent. Linear independence is probably easier.)

5. Let $W$ be a subspace of an inner product space $V$ and let $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ be a basis for $W$. Show that, for $\mathbf{y} \in V$,

$$(\mathbf{y}, \mathbf{z}) = 0 \text{ for all } \mathbf{z} \in W$$

   holds if and only if

$$(\mathbf{y}, \mathbf{w}_i) = 0, \quad i = 1, 2, \ldots, n$$

   holds.

6. Let $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ be a linearly independent set in an inner product space $V$, and define $\mathbf{G} \in \mathbf{R}^{n \times n}$ by

$$G_{ij} = (\mathbf{w}_j, \mathbf{w}_i), \quad i, j = 1, 2, \ldots, n.$$

   Prove that $\mathbf{G}$ is invertible. Hint: By the Fredholm alternative, it suffices to show that the only solution to $\mathbf{G}\mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$. Assume that $\mathbf{x} \in \mathbf{R}^n$ satisfies $\mathbf{G}\mathbf{x} = \mathbf{0}$. This implies that $\mathbf{x} \cdot \mathbf{G}\mathbf{x} = 0$. Show that $\mathbf{x} \cdot \mathbf{G}\mathbf{x} = (\mathbf{u}, \mathbf{u})$, where $\mathbf{u} \in V$ is given by

$$\mathbf{u} = \sum_{i=1}^{n} x_i \mathbf{w}_i,$$

   and show that $\mathbf{u}$ cannot be the zero vector unless $\mathbf{x} = \mathbf{0}$.

7. Consider the following data:

$$\begin{array}{cc}
x & y \\ \hline
1.0000 & 2.0087 \\
1.2500 & 2.4907 \\
1.4000 & 2.8363 \\
1.5000 & 2.9706 \\
1.9000 & 3.9092 \\
2.1000 & 4.1932 \\
2.2500 & 4.5057 \\
2.6000 & 5.2533 \\
2.9000 & 5.8030
\end{array}$$

   Find the relationship $y = mx + c$ that best fits these data. Plot the data and the best fit line.

8. Consider the following data:

$$\begin{array}{cc}
x & y \\ \hline
1.0000 & 1.2475 \\
1.2500 & 1.6366 \\
1.4000 & 1.9823 \\
1.5000 & 2.2243 \\
1.9000 & 3.4766 \\
2.1000 & 4.2301 \\
2.2500 & 4.8478 \\
2.6000 & 6.5129 \\
2.9000 & 8.2331
\end{array}$$

   Find the relationship $y = c_2 x^2 + c_1 x + c_0$ that best fits these data. If possible, plot the data and the best fit parabola.

9. Using the orthonormal basis $\{q_1, q_2, q_3\}$ for $\mathcal{P}_2$ given in Example 3.39, find the quadratic polynomial $p(x)$ that best approximates $g(x) = \sin(\pi x)$, in the mean-square sense, over the interval $[0,1]$. Produce a graph of $g$ and the quadratic approximation.

10. (a) Verify that $\{q_1, q_2, q_3\}$ in Example 3.39 is an …

3.5 Eigenvalues and eigenvectors of a symmetric matrix

$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x},\ \mathbf{x} \ne \mathbf{0} \;\Leftrightarrow\; \lambda\mathbf{x} - \mathbf{A}\mathbf{x} = \mathbf{0},\ \mathbf{x} \ne \mathbf{0} \;\Leftrightarrow\; (\lambda\mathbf{I} - \mathbf{A})\mathbf{x} = \mathbf{0},\ \mathbf{x} \ne \mathbf{0}.$$

This last condition is possible if and only if $\lambda\mathbf{I} - \mathbf{A}$ is a singular matrix; that is, if and only if

$$\det(\lambda\mathbf{I} - \mathbf{A}) = 0,$$

where $\det(\mathbf{B})$ is the determinant$^{12}$ of the square matrix $\mathbf{B}$. In principle, then, we can find the eigenvalues of $\mathbf{A}$ by solving the equation $\det(\lambda\mathbf{I} - \mathbf{A}) = 0$. It is not hard to show that $p_{\mathbf{A}}(\lambda) = \det(\lambda\mathbf{I} - \mathbf{A})$ is a polynomial of degree $n$ (the characteristic polynomial of $\mathbf{A}$), and so $\mathbf{A}$ has $n$ eigenvalues (counted according to multiplicity as roots of $p_{\mathbf{A}}(\lambda)$). Any or all of the eigenvalues can be complex, even if $\mathbf{A}$ has real entries, since a polynomial with real coefficients can have complex roots.
Example 3.41. Let

$$\mathbf{A} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}.$$

Then

$$\begin{aligned}
p_{\mathbf{A}}(\lambda) &= \det(\lambda\mathbf{I} - \mathbf{A}) \\
&= \begin{vmatrix} \lambda - 1 & -1 \\ -1 & \lambda - 2 \end{vmatrix} \\
&= (\lambda - 1)(\lambda - 2) - 1 \\
&= \lambda^2 - 3\lambda + 1.
\end{aligned}$$

Therefore, the eigenvalues are

$$\frac{3 \pm \sqrt{5}}{2}.$$

$^{12}$ We assume that the reader is familiar with the elementary properties of determinants (such as the computation of determinants of small matrices). The most important of these is that a matrix is singular if and only if its determinant is zero. Any introductory text on linear algebra, such as [34], can be consulted for details.
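For a $2 \times 2$ matrix, the characteristic polynomial is $\lambda^2 - \operatorname{tr}(\mathbf{A})\lambda + \det(\mathbf{A})$, so the eigenvalues follow from the quadratic formula. The sketch below applies this to the matrix of Example 3.41 (as read off from the determinant displayed there) and to a second, hypothetical real matrix whose eigenvalues are complex. The function name `eigenvalues_2x2` is an assumption of this illustration.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of p_A(lambda) = lambda^2 - tr(A) lambda + det(A)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)  # complex square root handles negative discriminants
    return (tr + disc) / 2, (tr - disc) / 2

A = [[1, 1], [1, 2]]    # matrix consistent with the determinant in Example 3.41
B = [[0, -1], [1, 0]]   # hypothetical real matrix with no real eigenvalues

lam_A = eigenvalues_2x2(A)
lam_B = eigenvalues_2x2(B)
print(lam_A, lam_B)
```

The first pair should agree with $(3 \pm \sqrt{5})/2$, while the second pair is purely imaginary, illustrating that a matrix with real entries can have complex eigenvalues.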
Example 3.42. Let

Then

$$p_{\mathbf{A}}(\lambda) = \det(\lambda\mathbf{I} - \mathbf{A}) = (\lambda - 1)^2 + 1 = \lambda^2 - 2\lambda + 2.$$

Therefore, the eigenvalues are

$$1 \pm i,$$

where $i = \sqrt{-1}$.

Example 3.43. Let

$$\mathbf{A} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}.$$

Then

$$p_{\mathbf{A}}(\lambda) = \det(\lambda\mathbf{I} - \mathbf{A}) = \begin{vmatrix} \lambda - 1 & -1 & -1 \\ 0 & \lambda - 1 & -1 \\ 0 & 0 & \lambda - 1 \end{vmatrix} = (\lambda - 1)^3.$$

Therefore, the only eigenvalue is 1, which is an eigenvalue of multiplicity 3. In this example, it is instructive to compute the eigenvectors. We must solve $(\lambda\mathbf{I} - \mathbf{A})\mathbf{x} = \mathbf{0}$ for $\lambda = 1$, that is, $(\mathbf{I} - \mathbf{A})\mathbf{x} = \mathbf{0}$. We have

$$\mathbf{I} - \mathbf{A} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 & -1 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix},$$

and a straightforward calculation shows that the solution space (the eigenspace) is

$$\{(\alpha, 0, 0) : \alpha \in \mathbf{R}\} = \operatorname{span}\{(1, 0, 0)\}.$$

Thus, in spite of the fact that the eigenvalue has multiplicity 3, there is only one linearly independent eigenvector corresponding to the eigenvalue.

The fact that the matrix in the previous example has only one eigenvector for an eigenvalue of multiplicity 3 is significant. It means that there is not a basis of $\mathbf{R}^3$ consisting of eigenvectors of $\mathbf{A}$.
3.5.1 The transpose of a matrix and the dot product
The theory of eigenvalues and eigenvectors is greatly simplified when a matrix $\mathbf{A} \in \mathbf{R}^{n \times n}$ is symmetric, that is, when $\mathbf{A}^T = \mathbf{A}$. In order to appreciate this, we need to understand the relationship of the transpose of a matrix to the Euclidean inner product (or dot product).

We consider a linear operator defined by $\mathbf{A} \in \mathbf{R}^{m \times n}$, and perform the following calculation for $\mathbf{u} \in \mathbf{R}^n$ and $\mathbf{v} \in \mathbf{R}^m$:

$$\begin{aligned}
(\mathbf{A}\mathbf{u}) \cdot \mathbf{v} &= \sum_{i=1}^{m} (\mathbf{A}\mathbf{u})_i v_i \\
&= \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} u_j v_i \\
&= \sum_{j=1}^{n} \sum_{i=1}^{m} A_{ij} v_i u_j \\
&= \sum_{j=1}^{n} (\mathbf{A}^T\mathbf{v})_j u_j \\
&= \mathbf{u} \cdot (\mathbf{A}^T\mathbf{v}).
\end{aligned}$$

(The above manipulations are just applications of the elementary properties of arithmetic, basically that numbers can be added in any order, and multiplication distributes over addition.) Thus the transpose $\mathbf{A}^T$ of $\mathbf{A} \in \mathbf{R}^{m \times n}$ satisfies the following fundamental property:

$$(\mathbf{A}\mathbf{u}) \cdot \mathbf{v} = \mathbf{u} \cdot (\mathbf{A}^T\mathbf{v}) \text{ for all } \mathbf{u} \in \mathbf{R}^n,\ \mathbf{v} \in \mathbf{R}^m. \tag{3.25}$$

For a symmetric matrix $\mathbf{A} \in \mathbf{R}^{n \times n}$, this simplifies to

$$(\mathbf{A}\mathbf{u}) \cdot \mathbf{v} = \mathbf{u} \cdot (\mathbf{A}\mathbf{v}) \text{ for all } \mathbf{u}, \mathbf{v} \in \mathbf{R}^n.$$
When we allow for complex scalars and vectors with complex entries, we must modify the dot product. If $x, y \in \mathbf{C}^n$, then we define
\[
x \cdot y = \sum_{i=1}^{n} x_i \overline{y_i}.
\]
The second vector in a dot product must thus be conjugated; this is necessary so that $x \cdot x$ will be real, allowing the norm to be defined. We will use the same notation for the dot product whether the vectors are in $\mathbf{R}^n$ or $\mathbf{C}^n$. The complex dot product has the following properties, which form the definition of an inner product on a complex vector space:

1. $x \cdot x \ge 0$ for all $x \in \mathbf{C}^n$, and $x \cdot x = 0$ if and only if $x = 0$.
2. $x \cdot y = \overline{y \cdot x}$ for all $x, y \in \mathbf{C}^n$.
3. $(\alpha x + \beta y) \cdot z = \alpha\, x \cdot z + \beta\, y \cdot z$ for all $x, y, z \in \mathbf{C}^n$, $\alpha, \beta \in \mathbf{C}$. Together with the second property, this implies that
\[
z \cdot (\alpha x + \beta y) = \overline{\alpha}\, z \cdot x + \overline{\beta}\, z \cdot y
\]
for all $x, y, z \in \mathbf{C}^n$, $\alpha, \beta \in \mathbf{C}$.

Complex vector spaces and inner products are discussed in more detail in Section 9.1.
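The identities of this subsection are easy to spot-check numerically. The sketch below is only an illustration: the matrix and vectors are arbitrary values chosen here, and `dot`, `matvec`, and `transpose` are hypothetical helper names, not anything defined in the text. It checks the transpose property $(Au) \cdot v = u \cdot (A^T v)$ for a real $2 \times 3$ matrix, and the first two inner product properties for a pair of complex vectors.

```python
def dot(x, y):
    """Dot product with the second vector conjugated (the real case is unchanged)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Arbitrary illustrative data: A is 2 x 3, u is in R^3, v is in R^2.
A = [[1, 2, 3],
     [4, 5, 6]]
u = [1, -1, 2]
v = [3, 0.5]

lhs = dot(matvec(A, u), v)             # (Au) . v
rhs = dot(u, matvec(transpose(A), v))  # u . (A^T v)
print(abs(lhs - rhs) < 1e-12)          # prints True

# Complex vectors: x . x is real and nonnegative, and x . y = conj(y . x).
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
print(dot(x, x))                       # imaginary part is zero
print(abs(dot(x, y) - dot(y, x).conjugate()) < 1e-12)  # prints True
```

Dropping the conjugate in `dot` would make `dot(x, x)` complex in general, which is why the second vector is conjugated.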
3.5.2 Special properties of symmetric matrices
We can now derive the special properties of the eigenvalues and eigenvectors of a symmetric matrix.

Theorem 3.44. If $A \in \mathbf{R}^{n \times n}$ is symmetric, then every eigenvalue of $A$ is real. Moreover, each eigenvalue corresponds to a real eigenvector.

Proof. Suppose $Ax = \lambda x$, $x \neq 0$, where for the moment we do not exclude the possibility that $\lambda$ and $x$ might be complex. Then
\[
(Ax) \cdot x = (\lambda x) \cdot x = \lambda (x \cdot x)
\]
and
\[
x \cdot (Ax) = x \cdot (\lambda x) = \overline{\lambda} (x \cdot x).
\]
But $(Ax) \cdot x = x \cdot (Ax)$ when $A$ is symmetric, so
\[
\lambda (x \cdot x) = \overline{\lambda} (x \cdot x).
\]
Since $x \cdot x \neq 0$, this yields
\[
\lambda = \overline{\lambda},
\]
which implies that $\lambda$ is real.

Let $x = u + iv$, where $u, v \in \mathbf{R}^n$. Then
\[
Ax = \lambda x \ \Rightarrow\ A(u + iv) = \lambda(u + iv) \ \Rightarrow\ Au + iAv = \lambda u + i\lambda v \ \Rightarrow\ Au = \lambda u,\ Av = \lambda v.
\]
Since $x \neq 0$, we must have either $u \neq 0$ or $v \neq 0$ (or both), so one of $u$, $v$ (or both) must be a real eigenvector of $A$ corresponding to $\lambda$. □

From this point on, we will only discuss eigenvalues and eigenvectors for real symmetric matrices. According to the last theorem, then, we will not need to use complex numbers or vectors.

Theorem 3.45. Let $A \in \mathbf{R}^{n \times n}$ be symmetric, and let $x_1$, $x_2$ be eigenvectors of $A$ corresponding to distinct eigenvalues $\lambda_1$, $\lambda_2$. Then $x_1$ and $x_2$ are orthogonal.
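Proof. We have
\[
(Ax_1) \cdot x_2 = x_1 \cdot (Ax_2).
\]
But
\[
(Ax_1) \cdot x_2 = (\lambda_1 x_1) \cdot x_2 = \lambda_1 (x_1 \cdot x_2)
\]
and
\[
x_1 \cdot (Ax_2) = x_1 \cdot (\lambda_2 x_2) = \lambda_2 (x_1 \cdot x_2).
\]
Therefore
\[
\lambda_1 (x_1 \cdot x_2) = \lambda_2 (x_1 \cdot x_2),
\]
and since $\lambda_1 \neq \lambda_2$, this implies that $x_1 \cdot x_2 = 0$. □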
Example 3.46. Consider
\[
A = \begin{bmatrix} 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \end{bmatrix}.
\]
A straightforward calculation shows that
\[
\det(\lambda I - A) = \lambda^3 - \lambda^2 = \lambda^2 (\lambda - 1),
\]
so that the eigenvalues of $A$ are 0, 0, 1. Another straightforward calculation shows that there are two linearly independent eigenvectors corresponding to $\lambda = 0$ (any two independent vectors whose components sum to zero, for example $(1, -1, 0)$ and $(1, 0, -1)$), and a single (independent) eigenvector for $\lambda = 1$, namely $(1, 1, 1)$.

We can verify by observation that the eigenvectors for $\lambda = 0$ are orthogonal to the eigenvector for $\lambda = 1$.

Here is another special property of symmetric matrices. Example 3.43 shows that this result is not true for nonsymmetric matrices.

Theorem 3.47. Let $A \in \mathbf{R}^{n \times n}$ be symmetric, and suppose $A$ has an eigenvalue $\mu$ of (algebraic) multiplicity $k$ (meaning that $\mu$ is a root of multiplicity $k$ of the characteristic polynomial of $A$). Then $A$ has $k$ linearly independent eigenvectors corresponding to $\mu$.
The proof of this theorem is rather involved and does not generalize to differential operators (as do the proofs given above). We therefore relegate it to Appendix A.

If $\mu$ is an eigenvalue of multiplicity $k$, as in the previous theorem, then we can choose the $k$ linearly independent eigenvectors corresponding to $\mu$ to be orthonormal.¹³ We thus obtain the following corollary.

Corollary 3.48. (The spectral theorem for symmetric matrices) Let $A \in \mathbf{R}^{n \times n}$ be symmetric. Then there is an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of $\mathbf{R}^n$ consisting of eigenvectors of $A$.
3.5.3 The spectral method for solving Ax = b
When $A \in \mathbf{R}^{n \times n}$ is symmetric and the eigenvalues and eigenvectors of $A$ are known, there is a simple method¹⁴ for solving $Ax = b$.

Let $A$ be symmetric with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ and orthonormal eigenvectors $u_1, u_2, \ldots, u_n$. For any $b \in \mathbf{R}^n$, we can write
\[
b = (u_1 \cdot b) u_1 + (u_2 \cdot b) u_2 + \cdots + (u_n \cdot b) u_n.
\]
We can also write
\[
x = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n.
\]
(Of course, $\alpha_i = u_i \cdot x$, but, as we are thinking of $x$ as the unknown, we cannot compute $\alpha_i$ from this formula.) We then have
\[
\begin{aligned}
Ax = b &\Rightarrow A(\alpha_1 u_1 + \cdots + \alpha_n u_n) = (u_1 \cdot b) u_1 + \cdots + (u_n \cdot b) u_n \\
&\Rightarrow \alpha_1 A u_1 + \cdots + \alpha_n A u_n = (u_1 \cdot b) u_1 + \cdots + (u_n \cdot b) u_n \\
&\Rightarrow \alpha_1 \lambda_1 u_1 + \cdots + \alpha_n \lambda_n u_n = (u_1 \cdot b) u_1 + \cdots + (u_n \cdot b) u_n.
\end{aligned}
\]
Since $b$ can have only one expansion in terms of the basis $\{u_1, \ldots, u_n\}$, we must have
\[
\alpha_i \lambda_i = (u_i \cdot b), \quad i = 1, 2, \ldots, n,
\]
that is,
\[
\alpha_i = \frac{u_i \cdot b}{\lambda_i}, \quad i = 1, 2, \ldots, n.
\]
Thus we obtain the solution
\[
x = \frac{u_1 \cdot b}{\lambda_1} u_1 + \frac{u_2 \cdot b}{\lambda_2} u_2 + \cdots + \frac{u_n \cdot b}{\lambda_n} u_n. \tag{3.26}
\]

¹³ It is always possible to replace any linearly independent set with an orthonormal set spanning the same subspace. The technique for doing this is called the Gram-Schmidt procedure; it is explained in elementary linear algebra texts such as [34].

¹⁴ This method is not normally taught in elementary linear algebra courses or books, because it is more difficult to find the eigenvalues and eigenvectors of $A$ than to just solve $Ax = b$ by other means.
We see that all of the eigenvalues of $A$ must be nonzero in order to apply this method, which is only sensible: if 0 is an eigenvalue of $A$, then $A$ is singular and $Ax = b$ either has no solution or has infinitely many solutions.

If we already have the eigenvalues and eigenvectors of $A$, then this method for solving $Ax = b$ is simpler and less expensive than the usual method of Gaussian elimination.

Example 3.49. Let
\[
A = \begin{bmatrix} 11 & -4 & -1 \\ -4 & 14 & -4 \\ -1 & -4 & 11 \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}.
\]
A direct calculation shows that the eigenvalues of $A$ are
\[
\lambda_1 = 6, \quad \lambda_2 = 12, \quad \lambda_3 = 18,
\]
and the corresponding (orthonormal) eigenvectors are
\[
u_1 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad u_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \quad u_3 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}.
\]
(According to Theorem 3.45, the eigenvectors are automatically orthogonal in this case; however, we had to ensure that each eigenvector was normalized.) The solution to $Ax = b$ is
\[
x = \frac{u_1 \cdot b}{6} u_1 + \frac{u_2 \cdot b}{12} u_2 + \frac{u_3 \cdot b}{18} u_3 = \begin{bmatrix} 11/54 \\ 7/27 \\ 11/54 \end{bmatrix}.
\]
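The spectral method is short enough to write out in full. The following is a sketch under stated assumptions, not a definitive implementation: the right-hand side $b = (1, 2, 1)$ and the signs of the eigenvectors are assumptions made here for the matrix of Example 3.49, and `spectral_solve` is a hypothetical name. The code first confirms that each assumed pair really is an eigenpair of $A$, then applies formula (3.26).

```python
from math import sqrt

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def spectral_solve(eigenvalues, eigenvectors, b):
    """x = sum_i ((u_i . b) / lambda_i) u_i, for orthonormal u_i and nonzero lambda_i."""
    x = [0.0] * len(b)
    for lam, u in zip(eigenvalues, eigenvectors):
        if lam == 0:
            raise ValueError("zero eigenvalue: A is singular")
        coeff = dot(u, b) / lam
        x = [xi + coeff * ui for xi, ui in zip(x, u)]
    return x

A = [[11, -4, -1],
     [-4, 14, -4],
     [-1, -4, 11]]
b = [1, 2, 1]  # assumed right-hand side
eigenvalues = [6, 12, 18]
eigenvectors = [
    [1 / sqrt(3), 1 / sqrt(3), 1 / sqrt(3)],
    [1 / sqrt(2), 0.0, -1 / sqrt(2)],
    [1 / sqrt(6), -2 / sqrt(6), 1 / sqrt(6)],
]

# Check that each (lambda_i, u_i) really is an eigenpair of A.
for lam, u in zip(eigenvalues, eigenvectors):
    assert all(abs(r - lam * ui) < 1e-12 for r, ui in zip(matvec(A, u), u))

x = spectral_solve(eigenvalues, eigenvectors, b)
residual = [r - bi for r, bi in zip(matvec(A, x), b)]
print(x)         # approximately [11/54, 7/27, 11/54]
print(residual)  # entries near zero, up to rounding
```

Once the eigenpairs are available, the solve costs only $n$ dot products and $n$ scaled vector additions, which is the source of the efficiency claim made in the text.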
Exercises

1. Let
\[
A = \begin{bmatrix} 164 & -48 \\ -48 & 136 \end{bmatrix}, \quad b = \begin{bmatrix} 116 \\ 88 \end{bmatrix}.
\]
Compute, by hand, the eigenvalues and eigenvectors of $A$, and use them to solve $Ax = b$ for $x$ (use the "spectral method").

2. Repeat Exercise 1 for
\[
A = \begin{bmatrix} 3 & -1 \\ -1 & 3 \end{bmatrix}, \quad b = \begin{bmatrix} 11 \\ -9 \end{bmatrix}.
\]
3. Repeat Exercise 1 for

4. Repeat Exercise 1 for
\[
A = \begin{bmatrix} 7 & -2 & 1 \\ -2 & 10 & -2 \\ 1 & -2 & 7 \end{bmatrix}, \quad b = \begin{bmatrix} 6 \\ 12 \\ 18 \end{bmatrix}.
\]
5. Let $A \in \mathbf{R}^{n \times n}$ be symmetric, and suppose the eigenvalues and (orthonormal) eigenvectors of $A$ are already known. How many arithmetic operations are required to solve $Ax = b$ using the spectral method?

6. A symmetric matrix $A \in \mathbf{R}^{n \times n}$ is called positive definite if
\[
x \neq 0 \ \Rightarrow\ (Ax) \cdot x > 0.
\]
Use the spectral theorem to show that $A$ is positive definite if and only if all of the eigenvalues of $A$ are positive.

7. Let $L$ be the $n \times n$ matrix defined by the condition that
\[
L_{ij} = \begin{cases} \dfrac{2}{h^2}, & i = j, \\[4pt] -\dfrac{1}{h^2}, & |i - j| = 1, \\[4pt] 0, & \text{otherwise}, \end{cases}
\]
where $h = 1/(n + 1)$. For example, with $n = 5$,
\[
L = \begin{bmatrix} 72 & -36 & 0 & 0 & 0 \\ -36 & 72 & -36 & 0 & 0 \\ 0 & -36 & 72 & -36 & 0 \\ 0 & 0 & -36 & 72 & -36 \\ 0 & 0 & 0 & -36 & 72 \end{bmatrix}.
\]

(a) For each $j = 1, 2, \ldots, n$, define the discrete sine wave $s^{(j)}$ of frequency $j$ by
\[
s^{(j)}_k = \sin(jk\pi h).
\]
Show that $s^{(j)}$ is an eigenvector of $L$, and find the corresponding eigenvalue $\lambda_j$. (Hint: Compute $Ls^{(j)}$ and apply the addition formula for the sine function.)

(b) What is the relationship between the frequency $j$ and the magnitude of $\lambda_j$?
(c) The discrete sine waves are orthogonal (since they are the eigenvectors of a symmetric matrix corresponding to distinct eigenvalues) and thus form an orthogonal basis for $\mathbf{R}^n$. Moreover, it can be shown that every $s^{(j)}$ has the same norm:
\[
\| s^{(j)} \| = \frac{1}{\sqrt{2h}}, \quad j = 1, 2, \ldots, n.
\]
Therefore, $\{\sqrt{2h}\, s^{(1)}, \sqrt{2h}\, s^{(2)}, \ldots, \sqrt{2h}\, s^{(n)}\}$ is an orthonormal basis for $\mathbf{R}^n$. We will call a vector $x \in \mathbf{R}^n$ smooth or rough depending on whether its components in the discrete sine wave basis are heavily weighted toward the low or high frequencies, respectively. Show that the solution $x$ of $Lx = b$ is smoother than $b$.
3.6 Preview of methods for solving ODEs and PDEs
The close formal relation between the concepts of linear differential equation and linear algebraic system suggests that ideas about linear operators might play a role in solving the former, as is the case for the latter. In order to make this parallel explicit, we will apply the machinery of linear algebra in the context of differential equations: view solutions as vectors in a vector space, identify the linear operators which define linear differential equations, and understand how facts such as the Fredholm alternative appear in the context of differential equations.

In the chapters to follow we will accomplish all of this. In the process, we will develop three general classes of methods for solving linear ODEs and PDEs, each one closely analogous to a method for solving linear algebraic systems:

1. The method of Fourier series: Differential operators, like $d^2/dx^2$, can have eigenvalues and eigenfunctions; for example
\[
\frac{d^2}{dx^2} \left[ \sin(\omega x) \right] = -\omega^2 \sin(\omega x).
\]
The method of Fourier series is a spectral method, using the eigenfunctions of the differential operator.

2. The method of Green's functions: A Green's function for a differential equation is the solution to a special form of the equation (just as $A^{-1}$ is the solution to $AB = I$) that allows one to immediately write down the solution to the equation (just as we could write down $x = A^{-1}b$).

3. The method of finite elements: This is a direct numerical method that can be used when (1) or (2) fails (or is intractable). It can be compared with Gaussian elimination, the standard direct numerical method for solving $Ax = b$. Like Gaussian elimination, finite element methods do not produce a formula for the solution; however, again like Gaussian elimination, they are broadly applicable.
In this book, we concentrate on Fourier series and finite element methods. We will also explain the idea of a Green's function and derive a few specific Green's functions that we will find useful.
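The eigenfunction relation quoted for the method of Fourier series, namely that $\sin(\omega x)$ is an eigenfunction of $d^2/dx^2$ with eigenvalue $-\omega^2$, can be illustrated numerically. The sketch below is an assumption-laden illustration only: the frequency, the sample point, and the centered finite-difference formula in `second_derivative` are choices made here, not taken from the text.

```python
from math import sin

def second_derivative(f, x, h=1e-4):
    """Centered finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

omega = 3.0  # arbitrary frequency chosen for illustration
x0 = 0.7     # arbitrary sample point

def f(x):
    return sin(omega * x)

approx = second_derivative(f, x0)    # numerical value of d^2/dx^2 [sin(omega x)] at x0
exact = -omega**2 * sin(omega * x0)  # eigenvalue times eigenfunction at x0
print(abs(approx - exact) < 1e-5)    # prints True
```

The two values agree to within the accuracy of the difference formula, which is all such a check can show; the identity itself follows from differentiating twice.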
3.7 Suggestions for further reading
A good introductory text, which assumes no prior knowledge of linear algebra and is written at a very accessible level, is Lay [34]. An alternative is Strang [44]; this book also assumes no background in linear algebra, but it is written at a somewhat more demanding level. It is noteworthy for its many insights into the applications of linear algebra and for its conversational tone. A more advanced text is Meyer [40].

Anyone seriously interested in applied mathematics must become familiar with the computational aspects of linear algebra. The text by Strang mentioned above includes material on the numerical aspects of the subject. There are also more specialized references. A good introductory text is Hager [23], while more advanced treatments include Demmel [13] and Trefethen and Bau [49]. An encyclopedic reference is Golub and Van Loan [19].
Chapter 4

Essential ordinary differential equations
In an ordinary differential equation (ODE), there is a single independent variable. Commonly ODEs model change over time, so the independent variable is $t$ (time). Our interest in ODEs derives from the following fact: both the Fourier series method and the finite element method reduce time-dependent PDEs to systems of ODEs. In the case of the Fourier series method, the system is completely decoupled, so the "system" is really just a sequence of scalar ODEs. In Section 4.2, we learn how to solve the scalar ODEs that arise in the Fourier series method.

The finite element method, on the other hand, results in coupled systems of ODEs. In Section 4.3, we discuss the solution of linear, coupled systems of first-order ODEs. Although we present an explicit solution technique in that section, the emphasis is really on the properties of the solutions, as the systems that arise in practice are destined to be solved by numerical rather than analytical means.

In Sections 4.4 and 4.5, we introduce some simple numerical methods that are adequate for our purposes.

We close this chapter by interpreting our simple solutions in terms of Green's functions. Although we do not emphasize the method of Green's functions in this book, we do explain the basic idea in Section 4.6.
4.1 Converting a higher-order equation to a first-order system
We begin our discussion of ODEs with a simple observation: It is always possible to convert a single ODE of order two or more to a system of first-order ODEs. We illustrate this on the following second-order equation:
\[
a \frac{d^2 u}{dt^2} + b \frac{du}{dt} + cu = f(t). \tag{4.1}
\]
We define
\[
x_1(t) = u(t), \quad x_2(t) = \frac{du}{dt}(t).
\]
Then we have
\[
\begin{aligned}
\frac{dx_1}{dt} &= x_2, \\
\frac{dx_2}{dt} &= \frac{d^2 u}{dt^2} = -\frac{c}{a} u - \frac{b}{a} \frac{du}{dt} + \frac{1}{a} f(t) = -\frac{c}{a} x_1 - \frac{b}{a} x_2 + \frac{1}{a} f(t).
\end{aligned}
\]
We can write this as a first-order system using matrix-vector notation:
\[
\frac{d\mathbf{x}}{dt} = A\mathbf{x} + \mathbf{f}(t), \quad A = \begin{bmatrix} 0 & 1 \\ -\dfrac{c}{a} & -\dfrac{b}{a} \end{bmatrix}, \quad \mathbf{f}(t) = \begin{bmatrix} 0 \\ \dfrac{1}{a} f(t) \end{bmatrix}. \tag{4.2}
\]
In (4.2), $\mathbf{x}$ is the vector-valued function
\[
\mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}.
\]
The same technique will convert any scalar equation to a first-order system.
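To see the conversion in action, the sketch below builds the right-hand side of the first-order system for $a\,u'' + b\,u' + c\,u = f(t)$ and integrates it numerically. Several things here are assumptions made for illustration and are not taken from the text: the names `make_system` and `rk4`, the use of a classical fourth-order Runge-Kutta step as the integrator, and the test problem $u'' + u = 0$, $u(0) = 1$, $u'(0) = 0$, whose exact solution is $\cos t$.

```python
from math import cos

def make_system(a, b, c, f):
    """Return g(t, x) = A x + f(t) for x = (x1, x2) = (u, du/dt)."""
    def g(t, x):
        x1, x2 = x
        return [x2, -(c / a) * x1 - (b / a) * x2 + f(t) / a]
    return g

def rk4(g, t0, x0, t_end, steps):
    """Classical fourth-order Runge-Kutta integration (an illustrative choice)."""
    h = (t_end - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = g(t, x)
        k2 = g(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = g(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = g(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (p + 2 * q + 2 * r + s)
             for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]
        t += h
    return x

# Assumed test problem: u'' + u = 0, u(0) = 1, u'(0) = 0, exact solution cos(t).
g = make_system(a=1.0, b=0.0, c=1.0, f=lambda t: 0.0)
u_end, du_end = rk4(g, 0.0, [1.0, 0.0], 1.0, 100)
error = abs(u_end - cos(1.0))
print(error < 1e-8)  # prints True
```

The integrator never sees the second-order equation; it only sees a first-order system, which is the practical benefit of the conversion.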
If the unknown is $u$ and the equation is order $m$, we define
\[
x_1 = u, \quad x_2 = \frac{du}{dt}, \quad \ldots, \quad x_m = \frac{d^{m-1} u}{dt^{m-1}}.
\]
The first $m - 1$ equations will be
\[
\frac{dx_1}{dt} = x_2, \quad \frac{dx_2}{dt} = x_3, \quad \ldots, \quad \frac{dx_{m-1}}{dt} = x_m,
\]
and the last equation will be the original ODE (expressed in the new variables). If the original scalar equation is linear, then the resulting system will also be linear, and it can be written in matrix-vector form if desired.

An $m$th-order ODE typically has infinitely many solutions; in fact, an $m$th-order linear ODE has an $m$-dimensional subspace of solutions. To narrow down a unique solution, $m$ auxiliary conditions are required. When the independent variable is time, the auxiliary conditions are initial conditions:
\[
u(t_0) = a_0, \quad \frac{du}{dt}(t_0) = a_1, \quad \ldots, \quad \frac{d^{m-1} u}{dt^{m-1}}(t_0) = a_{m-1}.
\]
These initial conditions translate immediately into initial conditions for the new unknowns $x_1, x_2, \ldots, x_m$:
\[
x_1(t_0) = a_0, \quad x_2(t_0) = a_1, \quad \ldots, \quad x_m(t_0) = a_{m-1}.
\]
We can apply a similar technique to convert an $m$th-order system of ODEs to a first-order system. For example, suppose $\mathbf{x}(t)$ is a vector-valued function, say $\mathbf{x}(t) \in \mathbf{R}^n$. Consider a second-order system:
\[
\frac{d^2 \mathbf{x}}{dt^2} = \mathbf{f}\left(t, \mathbf{x}, \frac{d\mathbf{x}}{dt}\right). \tag{4.3}
\]
We define
\[
\mathbf{y}(t) = \mathbf{x}(t), \quad \mathbf{z}(t) = \frac{d\mathbf{x}}{dt}(t).
\]
We then have
\[
\frac{d\mathbf{y}}{dt} = \mathbf{z}, \quad \frac{d\mathbf{z}}{dt} = \mathbf{f}(t, \mathbf{y}, \mathbf{z}).
\]
The original system (4.3) consists of $n$ second-order ODEs in $n$ unknowns (the components of $\mathbf{x}(t)$). We have rewritten this original system as $2n$ first-order ODEs in $2n$ unknowns (the $n$ components of $\mathbf{y}(t)$ and the $n$ components of $\mathbf{z}(t)$).

The fact that any ODE can be written as a first-order system has the following benefit: Any theory or algorithm developed for first-order systems of ODEs is automatically applicable to any ODE. This leads to a considerable simplification in the study of this subject.
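For a linear scalar equation of order $m$ with constant coefficients, the matrix of the resulting first-order system has a fixed pattern: ones on the superdiagonal (from $dx_i/dt = x_{i+1}$) and the scaled coefficients of the original equation in the last row. The sketch below builds that matrix. The function name `companion_system`, the coefficient ordering, and the third-order example are all assumptions introduced here for illustration.

```python
def companion_system(coeffs):
    """
    For a_m u^(m) + ... + a_1 u' + a_0 u = f(t), with coeffs = [a_0, ..., a_m],
    return (A, e) such that dx/dt = A x + (f(t) / a_m) e,
    where x = (u, u', ..., u^(m-1)).
    """
    m = len(coeffs) - 1
    if m < 1 or coeffs[m] == 0:
        raise ValueError("need order m >= 1 and a nonzero leading coefficient")
    a_m = coeffs[m]
    A = [[0.0] * m for _ in range(m)]
    for i in range(m - 1):
        A[i][i + 1] = 1.0               # dx_i/dt = x_{i+1}
    for j in range(m):
        A[m - 1][j] = -coeffs[j] / a_m  # last row: the original ODE solved for u^(m)
    e = [0.0] * (m - 1) + [1.0]         # the forcing term enters only the last equation
    return A, e

# Assumed example: u''' + 2 u'' - u' + 3 u = f(t), so coeffs = [3, -1, 2, 1].
A, e = companion_system([3.0, -1.0, 2.0, 1.0])
for row in A:
    print(row)
```

For $m = 2$ and coefficients $[c, b, a]$ this reproduces the matrix $A$ of (4.2).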
Exercises

1. Write the ODE
\[
2 \frac{d^2 u}{dx^2} + u = 0
\]
as a system of first-order ODEs.

2. Write the ODE
\[
\frac{d^2 u}{dx^2} + c \sin(u) = 0
\]
as a system of first-order ODEs.

3. Write the ODE
\[
\frac{d^4 u}{dt^4} - 2 \frac{d^2 u}{dt^2} + u = \sin(t)
\]
as a system of first-order ODEs.

4. Write the ODE
\[
\frac{d^2 u}{dt^2} - 2u \frac{du}{dt} + t^2 u = 0
\]
as a system of first-order ODEs.

5. Write the following system of second-order ODEs as a system of first-order ODEs:

[...]

Here $m_1$ and $m_2$ are constants, and $f_1$ and $f_2$ are real-valued functions of four variables.
4.2 Solutions to some simple ODEs

In this section, we show how to solve some simple first- and second-order ODEs that will arise later in the text.

4.2.1 The general solution of a second-order homogeneous ODE with constant coefficients
In the ensuing chapters, we will often encounter the second-order linear homogeneous ODE with constant coefficients,
\[
a \frac{d^2 u}{dt^2} + b \frac{du}{dt} + cu = 0, \tag{4.4}
\]
so we will present a simple method for computing its general solution. The method is based on the following idea: When faced with a differential equation, one can sometimes guess the general form of the solution and, by substituting this general form into the equation, determine the specific form.

In this case, we assume that the solution of (4.4) is of the form
\[
u(t) = e^{rt}.
\]
Substituting into (4.4) yields
\[
a r^2 e^{rt} + b r e^{rt} + c e^{rt} = 0;
\]
since the exponential is never zero, this equation holds if and only if
\[
a r^2 + b r + c = 0.
\]
This quadratic is called the characteristic polynomial of the ODE (4.4), and its roots are called the characteristic roots of the ODE.

The characteristic roots are given by the quadratic formula:
\[
r_1 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}, \quad r_2 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}.
\]
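The quadratic formula above translates directly into code. The following sketch computes the characteristic roots with complex arithmetic, so that a negative discriminant needs no special handling, and checks that the trial solution $u = e^{rt}$ satisfies the ODE. The function names and the sample coefficients ($u'' + 3u' + 2u = 0$) are assumptions chosen here for illustration.

```python
import cmath

def characteristic_roots(a, b, c):
    """Roots of a r^2 + b r + c = 0 via the quadratic formula (complex-safe)."""
    if a == 0:
        raise ValueError("a must be nonzero for a second-order equation")
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b - disc) / (2 * a), (-b + disc) / (2 * a)

def residual(a, b, c, r, t):
    """Value of a u'' + b u' + c u at time t for the trial solution u = exp(r t)."""
    u = cmath.exp(r * t)
    return a * r * r * u + b * r * u + c * u

# Assumed sample coefficients: u'' + 3 u' + 2 u = 0.
a, b, c = 1.0, 3.0, 2.0
r1, r2 = characteristic_roots(a, b, c)
print(r1, r2)                                   # the roots -2 and -1, as complex numbers
print(abs(residual(a, b, c, r1, 0.5)) < 1e-12)  # prints True
```

With these coefficients the discriminant is positive and both roots are real; the same function returns a complex conjugate pair when $b^2 - 4ac < 0$.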
We distinguish three cases.

1. The characteristic roots are real and unequal (i.e. $b^2 - 4ac > 0$). In this case, since the equation is linear, any linear combination of $e^{r_1 t}$ and $e^{r_2 t}$ is also a solution of (4.4). In fact, as we now show, every solution of (4.4) can be written as
\[
u(t) = c_1 e^{r_1 t} + c_2 e^{r_2 t} \tag{4.5}
\]
for some choice of $c_1$, $c_2$. In fact, we will show that, for any $k_1$, $k_2$, the initial value problem (IVP)
\[
a \frac{d^2 u}{dt^2} + b \frac{du}{dt} + cu = 0, \quad u(0) = k_1, \quad \frac{du}{dt}(0) = k_2 \tag{4.6}
\]
has a unique solution of the form (4.5). With $u$ given by (4.5), we have
\[
u(0) = c_1 + c_2, \quad \frac{du}{dt}(0) = r_1 c_1 + r_2 c_2,
\]
and we wish to choose $c_1$, $c_2$ to satisfy
\[
\begin{aligned}
c_1 + c_2 &= k_1, \\
r_1 c_1 + r_2 c_2 &= k_2,
\end{aligned}
\]
that is,
\[
\begin{bmatrix} 1 & 1 \\ r_1 & r_2 \end{bmatrix} \mathbf{c} = \mathbf{k}.
\]
The coefficient matrix in this equation is obviously nonsingular (since $r_1 \neq r_2$), and so there is a unique solution $c_1$, $c_2$ for each $k_1$, $k_2$.

Since every solution of (4.4) can be written in the form (4.5), we call (4.5) the general solution of (4.4).

2. The characteristic roots are complex (i.e. $b^2 - 4ac < 0$). In this case, the characteristic roots are also unequal; in fact, they form a complex conjugate pair. The analysis of the previous case applies, and we could write the general solution in the form (4.5) in this case as well. However, with $r_1$, [...]

[...] $(\lambda t) + c_2 e^{\mu t} \sin(\lambda t)$.

3. Suppose (4.4) has the single characteristic root $r = -b/(2a)$. Show that, for any $k_1$, $k_2$, there is a unique choice of $c_1$, $c_2$ such that the solution of (4.6) is
\[
c_1 e^{rt} + c_2 t e^{rt}.
\]
4. For each of the following IVPs, find the general solution of the ODE and use it to solve the IVP:

(a)
\[
\frac{d^2 u}{dt^2} + 2 \frac{du}{dt} + u = 0, \quad u(0) = 1, \quad \frac{du}{dt}(0) = 0
\]

(b)
\[
3 \frac{d^2 u}{dt^2} + 2 \frac{du}{dt} + u = 0, \quad u(0) = -1, \quad \frac{du}{dt}(0) = 3
\]

(c)
\[
2 \frac{d^2 u}{dt^2} + 2 \frac{du}{dt} + u = 0, \quad u(0) = 0, \quad \frac{du}{dt}(0) = 1
\]

(d)
\[
\frac{d^2 u}{dt^2} + 3 \frac{du}{dt} + u = 0, \quad u(0) = 1, \quad \frac{du}{dt}(0) = 2
\]
5. The following differential equations are accompanied by boundary conditions, that is, auxiliary conditions that refer to the boundary of a spatial domain rather than to an initial time. By using the general solution of the ODE, determine whether a nonzero solution to the boundary value problem (BVP) exists, and if so, whether the solution is unique.

(a) $\dfrac{d^2u}{dx^2} - 2u = 0$, $u(0) = 0$, $u(1) = 0$

(b) $\dfrac{d^2u}{dx^2} + 2u = 0$, $u(0) = 0$, $u(1) = 0$

(c) $\dfrac{d^2u}{dx^2} + \pi^2 u = 0$, $u(0) = 0$, $u(1) = 0$
6. Determine the values of $\lambda \in \mathbf{R}$ such that the BVP
\[ \frac{d^2u}{dx^2} + \lambda u = 0, \quad u(0) = 0, \quad u(1) = 0 \]
has a nonzero solution.
7. Prove directly (that is, by substituting $u$ into the differential equation and initial condition) that
\[ u(t) = e^{a(t-t_0)}u_0 + \int_{t_0}^{t} e^{a(t-s)} f(s)\,ds \]
solves
\[ \frac{du}{dt} - au = f(t), \quad u(t_0) = u_0. \]
8. Solve the following IVPs:

(a) $\dfrac{du}{dt} = 2u - 0.1$, $u(0) = 1.0$

(b) $\dfrac{du}{dt} = -2u + 0.1t$, $u(0) = 0$

(c) $\dfrac{du}{dt} + u = t$, $u(0) = 0$
9. Find the solution to the IVP
\[ \frac{d^2u}{dt^2} + 4u = 1, \quad u(0) = 0, \quad \frac{du}{dt}(0) = 0. \]
10. Find the solution to the IVP
\[ \frac{d^2u}{dt^2} + 9u = t, \quad u(0) = 0, \quad \frac{du}{dt}(0) = 0. \]
11. Find the solution to the IVP
\[ \frac{d^2u}{dt^2} + u = e^{-t}, \quad u(0) = 1, \quad \frac{du}{dt}(0) = 0. \]
12. Use the techniques of Section 4.2.1 to derive the solution to (4.8), and verify that you obtain (4.10). (Hint: One way to do this is to write the general solution of the ODE as $u(t) = c_1\cos(\theta t) + c_2\sin(\theta t)$ and then solve for $c_1$ and $c_2$. If you do this, you will then have to apply trigonometric identities to put the solution in the form given in (4.10). It is simpler to recognize at the beginning that the general solution could just as well be written as $u(t) = c_1\cos(\theta(t - t_0)) + c_2\sin(\theta(t - t_0))$.)
4.3 Linear systems with constant coefficients
We now consider a first-order linear system of ODEs with constant coefficients. Such a system can be written
\[
\begin{aligned}
\frac{dx_1}{dt} &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + f_1(t),\\
\frac{dx_2}{dt} &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + f_2(t),\\
&\ \ \vdots\\
\frac{dx_n}{dt} &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n + f_n(t).
\end{aligned}
\tag{4.15}
\]
As with a linear algebraic system, there is a great advantage in using matrix-vector notation. System (4.15) can be written as
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} + \mathbf{f}(t), \tag{4.16} \]
where
\[
\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix},\quad
\mathbf{x}(t) = \begin{bmatrix} x_1(t)\\ x_2(t)\\ \vdots\\ x_n(t) \end{bmatrix},\quad
\mathbf{f}(t) = \begin{bmatrix} f_1(t)\\ f_2(t)\\ \vdots\\ f_n(t) \end{bmatrix}.
\]
Although it is possible to develop a solution technique that is applicable for any matrix $\mathbf{A} \in \mathbf{R}^{n \times n}$, it is sufficient for our purposes to discuss the case in which there is a basis for $\mathbf{R}^n$ consisting of eigenvectors of $\mathbf{A}$, notably the case in which $\mathbf{A}$ is symmetric. In this case, we can develop a spectral method, much like the spectral method for solving $\mathbf{A}\mathbf{x} = \mathbf{b}$ in the case that $\mathbf{A}$ is symmetric.[15]

As we develop an explicit solution for (4.16), the reader should concentrate on the qualitative properties (of the solution) that are revealed. These properties turn out to be more important (at least in this book) than the formula for the solution.
4.3.1 Homogeneous systems
We begin with the homogeneous version of (4.16),
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x}. \tag{4.17} \]
We will look for solutions to (4.17) of the special form
\[ \mathbf{x}(t) = \alpha(t)\mathbf{u}, \]
where $\mathbf{u} \neq \mathbf{0}$ is a constant vector. We have
\[ \mathbf{x}(t) = \alpha(t)\mathbf{u},\quad \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} \ \Rightarrow\ \frac{d\alpha}{dt}\mathbf{u} = \alpha\mathbf{A}\mathbf{u}. \]
This last equation states that two vectors are equal, which means that $\mathbf{A}\mathbf{u}$ must be a multiple of $\mathbf{u}$ (otherwise, the two vectors would point in different directions, which is not possible if they are equal). Therefore, $\mathbf{u}$ must be an eigenvector of $\mathbf{A}$:
\[ \mathbf{A}\mathbf{u} = \lambda\mathbf{u}. \]
We then obtain
\[ \frac{d\alpha}{dt} = \lambda\alpha \ \Rightarrow\ \alpha(t) = Ce^{\lambda(t-t_0)}, \]
where $C$ is a constant. Therefore, if $\lambda$, $\mathbf{u}$ is any eigenvalue-eigenvector pair, then
\[ \mathbf{x}(t) = e^{\lambda(t-t_0)}\mathbf{u} \]
is a solution of (4.17).

Example 4.5. Let
$\mathbf{A}$ = [...]
Then the eigenvalues of $\mathbf{A}$ are $\lambda_1 = 0$, $\lambda_2 = -1$, and $\lambda_3 = -2$, and the corresponding eigenvectors are [...]
[15] See Section 3.5.3.
We therefore know three independent solutions of
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x}, \]
namely,
\[ \mathbf{x}_1(t) = \mathbf{u}_1,\quad \mathbf{x}_2(t) = e^{-(t-t_0)}\mathbf{u}_2,\quad \mathbf{x}_3(t) = e^{-2(t-t_0)}\mathbf{u}_3. \]
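If, as in the previous example, there is a basis for $\mathbf{R}^n$ consisting of eigenvectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of $\mathbf{A}$, with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then we can write the general solution of (4.17). Indeed, it suffices to show that we can solve
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x},\quad \mathbf{x}(t_0) = \mathbf{x}_0. \tag{4.18} \]
Since $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is a basis for $\mathbf{R}^n$, there exists a vector $\mathbf{c} \in \mathbf{R}^n$ such that
\[ \mathbf{x}_0 = c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n. \]
The solution to (4.18) is then
\[ \mathbf{x}(t) = c_1 e^{\lambda_1(t-t_0)}\mathbf{u}_1 + c_2 e^{\lambda_2(t-t_0)}\mathbf{u}_2 + \cdots + c_n e^{\lambda_n(t-t_0)}\mathbf{u}_n. \tag{4.19} \]
This is verified by computing $d\mathbf{x}/dt$:
\[
\begin{aligned}
\frac{d\mathbf{x}}{dt}(t) &= c_1\lambda_1 e^{\lambda_1(t-t_0)}\mathbf{u}_1 + c_2\lambda_2 e^{\lambda_2(t-t_0)}\mathbf{u}_2 + \cdots + c_n\lambda_n e^{\lambda_n(t-t_0)}\mathbf{u}_n\\
&= c_1 e^{\lambda_1(t-t_0)}\mathbf{A}\mathbf{u}_1 + c_2 e^{\lambda_2(t-t_0)}\mathbf{A}\mathbf{u}_2 + \cdots + c_n e^{\lambda_n(t-t_0)}\mathbf{A}\mathbf{u}_n\\
&= \mathbf{A}\left(c_1 e^{\lambda_1(t-t_0)}\mathbf{u}_1 + c_2 e^{\lambda_2(t-t_0)}\mathbf{u}_2 + \cdots + c_n e^{\lambda_n(t-t_0)}\mathbf{u}_n\right)\\
&= \mathbf{A}\mathbf{x}(t).
\end{aligned}
\]
We also have
\[ \mathbf{x}(t_0) = c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n = \mathbf{x}_0 \]
by construction.

Example 4.6. Let $\mathbf{A}$ be the matrix in Example 4.5, and let [...]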
Since the basis of eigenvectors $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ of $\mathbf{A}$ is orthonormal, as is easily checked, we have [...]

We can then immediately write down the solution to
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x},\quad \mathbf{x}(0) = \mathbf{x}_0. \]
It is [...]
If, as in the previous example, $\mathbf{A}$ is symmetric, then the eigenvectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ can be chosen to be orthonormal, in which case it is particularly simple to compute the coefficients $c_1, c_2, \ldots, c_n$ expressing $\mathbf{x}_0$ in terms of the eigenvectors. We obtain the following simple formula for the solution of (4.18):
\[ \mathbf{x}(t) = \sum_{i=1}^{n} (\mathbf{x}_0 \cdot \mathbf{u}_i)\, e^{\lambda_i(t-t_0)}\mathbf{u}_i. \]
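The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `spectral_solution` is hypothetical, the eigenpairs are supplied by hand rather than computed, and the test matrix $[[1, 1], [1, 1]]$ (eigenvalues 0 and 2, orthonormal eigenvectors $(1,-1)/\sqrt{2}$ and $(1,1)/\sqrt{2}$) together with the initial vector $(1, 0)$ are chosen here purely for convenience.

```python
import math

def spectral_solution(eigenpairs, x0, t, t0=0.0):
    """Evaluate x(t) = sum_i (x0 . u_i) exp(lam_i (t - t0)) u_i.

    eigenpairs is a list of (lam_i, u_i); the u_i are assumed to be
    orthonormal eigenvectors of a symmetric matrix (this is not checked).
    """
    x = [0.0] * len(x0)
    for lam, u in eigenpairs:
        c = sum(a * b for a, b in zip(x0, u))    # coefficient c_i = x0 . u_i
        g = c * math.exp(lam * (t - t0))         # weight of u_i at time t
        for k, uk in enumerate(u):
            x[k] += g * uk
    return x

# Illustrative data (an assumption of this sketch, not taken from Example 4.5).
s = 1.0 / math.sqrt(2.0)
pairs = [(0.0, [s, -s]), (2.0, [s, s])]
x0 = [1.0, 0.0]

x_at_1 = spectral_solution(pairs, x0, 1.0)
print(x_at_1)
```

For this data the formula reduces to $\mathbf{x}(t) = \tfrac12\,(1 + e^{2t},\ -1 + e^{2t})$, which gives a simple hand check of the printed values.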
We have derived the solution to (4.18) in the case that $\mathbf{A}$ has $n$ linearly independent eigenvectors. We remark that formula (4.19) might involve complex numbers if $\mathbf{A}$ is not symmetric. However, we have a complete and satisfactory solution in the case that $\mathbf{A}$ is symmetric: there are $n$ linearly independent eigenvectors, the eigenvalues are guaranteed to be real, and the basis of eigenvectors can be chosen to be orthonormal.

We now discuss the significance of solution (4.19). The interpretation is very simple: The solution to (4.18) is the sum of $n$ components, one in each eigendirection, and component $i$ is

• exponentially increasing (if $\lambda_i > 0$);
• exponentially decreasing (if $\lambda_i < 0$); [...]

[...] $\|\mathbf{x}(t)\| \to \infty$ as $t \to \infty$. Even if $k = 1$, an $(n-k)$-dimensional subspace of $\mathbf{R}^n$ is a very small part of $\mathbf{R}^n$ (comparable to a plane or a line in $\mathbf{R}^3$). Therefore, most initial vectors do not lie in this subspace.
Example 4.7. Let [...]

Then the eigenvalues of $\mathbf{A}$ are $\lambda_1 = 1$, $\lambda_2 = -1$, and $\lambda_3 = -2$, and the corresponding eigenvectors are [...]

The solution of (4.18) is [...] where [...]
The only initial values that lead to a solution $\mathbf{x}$ that does not grow without bound are
\[ \mathbf{x}_0 \in S = \operatorname{span}\{\mathbf{u}_2, \mathbf{u}_3\}. \]
But $S$ is a plane, which is a very small part of Euclidean 3-space. Thus almost every initial value leads to a solution that grows exponentially.

Another conclusion that we can draw is somewhat more subtle than those given above, but it will be important in Chapter 6.

4. If $\mathbf{A}$ has eigenvalues of very different magnitudes, then solutions of (4.18) have components whose magnitudes change at very different rates. Such solutions can be difficult to compute efficiently using numerical methods. We will discuss this point in more detail in Section 4.5.
4.3.2 Inhomogeneous systems and variation of parameters
We can now explain how to solve the inhomogeneous system
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} + \mathbf{f}(t), \tag{4.20} \]
again only considering the case in which there is a basis of $\mathbf{R}^n$ consisting of eigenvectors of $\mathbf{A}$. The method is a spectral method, and the reader may wish to review Section 3.5.3.

If $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is a basis for $\mathbf{R}^n$, then every vector in $\mathbf{R}^n$ can be written uniquely as a linear combination of these vectors. In particular, for each $t$, we can write $\mathbf{f}(t)$ as a linear combination of $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$:
\[ \mathbf{f}(t) = c_1(t)\mathbf{u}_1 + c_2(t)\mathbf{u}_2 + \cdots + c_n(t)\mathbf{u}_n. \]
Of course, since the vector $\mathbf{f}(t)$ depends on $t$, so do the weights $c_1(t), c_2(t), \ldots, c_n(t)$. These weights can be computed explicitly from $\mathbf{f}$, which of course is considered to be known.

We can also write the solution of (4.20) in terms of the basis vectors:
\[ \mathbf{x}(t) = a_1(t)\mathbf{u}_1 + a_2(t)\mathbf{u}_2 + \cdots + a_n(t)\mathbf{u}_n. \tag{4.21} \]
Since $\mathbf{x}(t)$ is unknown, so are the weights $a_1(t), a_2(t), \ldots, a_n(t)$. However, when these basis vectors are eigenvectors of $\mathbf{A}$, it is easy to solve for the unknown weights. Indeed, substituting (4.21) in place of $\mathbf{x}$ and using $\mathbf{A}\mathbf{u}_i = \lambda_i\mathbf{u}_i$ yields
\[
\begin{aligned}
\frac{d\mathbf{x}}{dt} - \mathbf{A}\mathbf{x}
&= \frac{d}{dt}\left[\sum_{i=1}^{n} a_i\mathbf{u}_i\right] - \mathbf{A}\left(\sum_{i=1}^{n} a_i\mathbf{u}_i\right)\\
&= \sum_{i=1}^{n} \frac{da_i}{dt}\mathbf{u}_i - \sum_{i=1}^{n} a_i\mathbf{A}\mathbf{u}_i\\
&= \sum_{i=1}^{n} \left\{\frac{da_i}{dt} - \lambda_i a_i\right\}\mathbf{u}_i.
\end{aligned}
\]
The ODE
\[ \frac{d\mathbf{x}}{dt} - \mathbf{A}\mathbf{x} = \mathbf{f}(t) \]
then implies that
\[ \sum_{i=1}^{n} \left\{\frac{da_i}{dt} - \lambda_i a_i\right\}\mathbf{u}_i = \sum_{i=1}^{n} c_i(t)\mathbf{u}_i, \]
which can only hold if
\[ \frac{da_i}{dt} - \lambda_i a_i = c_i(t),\quad i = 1, 2, \ldots, n. \tag{4.22} \]
Thus, by representing the solution in terms of the eigenvectors, we reduced the system of ODEs to $n$ scalar ODEs that can be solved independently.[16] The computation of the functions $c_1(t), c_2(t), \ldots, c_n(t)$ is simplest when $\mathbf{A}$ is symmetric, so that the eigenvectors can be chosen to be orthonormal. In that case,
\[ c_i(t) = \mathbf{f}(t) \cdot \mathbf{u}_i,\quad i = 1, 2, \ldots, n. \]
We will concentrate on this case henceforth.

The initial condition $\mathbf{x}(t_0) = \mathbf{x}_0$ provides initial values for the unknowns $a_1(t), a_2(t), \ldots, a_n(t)$. Indeed, we can write
\[ \mathbf{x}_0 = b_1\mathbf{u}_1 + b_2\mathbf{u}_2 + \cdots + b_n\mathbf{u}_n, \]
so $\mathbf{x}(t_0) = \mathbf{x}_0$ implies that
\[ \sum_{i=1}^{n} a_i(t_0)\mathbf{u}_i = \sum_{i=1}^{n} b_i\mathbf{u}_i, \]
or
\[ a_i(t_0) = b_i,\quad i = 1, 2, \ldots, n. \]
When the basis is orthonormal, the coefficients $b_1, b_2, \ldots, b_n$ can be computed in the usual way:
\[ b_i = \mathbf{x}_0 \cdot \mathbf{u}_i,\quad i = 1, 2, \ldots, n. \]

Example 4.8. Consider the IVP
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} + \mathbf{f}(t),\quad \mathbf{x}(0) = \mathbf{0}, \]
[16] This is just what happened in the spectral method for solving $\mathbf{A}\mathbf{x} = \mathbf{b}$: the system of $n$ simultaneous equations was reduced to $n$ independent equations.
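where
\[ \mathbf{A} = \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix},\quad \mathbf{f}(t) = \begin{bmatrix} \cos(t)\\ \sin(t) \end{bmatrix}. \]
The matrix $\mathbf{A}$ is symmetric, and an orthonormal basis for $\mathbf{R}^2$ consists of the vectors
\[ \mathbf{u}_1 = \begin{bmatrix} \frac{1}{\sqrt{2}}\\[2pt] -\frac{1}{\sqrt{2}} \end{bmatrix},\quad \mathbf{u}_2 = \begin{bmatrix} \frac{1}{\sqrt{2}}\\[2pt] \frac{1}{\sqrt{2}} \end{bmatrix}. \]
The corresponding eigenvalues are $\lambda_1 = 0$, $\lambda_2 = 2$. We now express $\mathbf{f}(t)$ in terms of the eigenvectors:
\[
\begin{aligned}
\mathbf{f}(t) &= c_1(t)\mathbf{u}_1 + c_2(t)\mathbf{u}_2,\\
c_1(t) &= \mathbf{f}(t) \cdot \mathbf{u}_1 = \frac{1}{\sqrt{2}}\left(\cos(t) - \sin(t)\right),\\
c_2(t) &= \mathbf{f}(t) \cdot \mathbf{u}_2 = \frac{1}{\sqrt{2}}\left(\cos(t) + \sin(t)\right).
\end{aligned}
\]
The solution is
\[ \mathbf{x}(t) = a_1(t)\mathbf{u}_1 + a_2(t)\mathbf{u}_2, \]
where
\[
\begin{aligned}
\frac{da_1}{dt} &= \frac{1}{\sqrt{2}}\left(\cos(t) - \sin(t)\right),\quad a_1(0) = 0,\\
\frac{da_2}{dt} &= 2a_2 + \frac{1}{\sqrt{2}}\left(\cos(t) + \sin(t)\right),\quad a_2(0) = 0.
\end{aligned}
\]
We obtain
\[ a_1(t) = \frac{1}{\sqrt{2}}\int_0^t \{\cos(s) - \sin(s)\}\,ds = \frac{1}{\sqrt{2}}\left(\sin(t) + \cos(t) - 1\right) \]
and
\[ a_2(t) = \frac{1}{\sqrt{2}}\int_0^t e^{2(t-s)}\{\cos(s) + \sin(s)\}\,ds = \frac{1}{5\sqrt{2}}\left(3e^{2t} - 3\cos(t) - \sin(t)\right). \]
Finally,
\[ \mathbf{x}(t) = a_1(t)\mathbf{u}_1 + a_2(t)\mathbf{u}_2 = \begin{bmatrix} \frac{1}{10}\left(2\cos(t) + 4\sin(t) + 3e^{2t} - 5\right)\\[4pt] \frac{1}{10}\left(-8\cos(t) - 6\sin(t) + 3e^{2t} + 5\right) \end{bmatrix}. \]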
The technique for solving (4.20) presented in this section, which we have described as a spectral method, is usually referred to as the method of variation of parameters.
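As a numerical cross-check of Example 4.8, the following sketch evaluates the weights $a_i(t) = \int_0^t e^{\lambda_i(t-s)} c_i(s)\,ds$ (the solution of (4.22) with $a_i(0) = 0$) by quadrature and compares the resulting $\mathbf{x}(t)$ with the closed-form answer. The helper names (`weight`, `x_numeric`, `x_closed_form`) and the choice of the composite midpoint rule with 2000 subintervals are assumptions made for this illustration, not part of the text.

```python
import math

S = 1.0 / math.sqrt(2.0)
U = [[S, -S], [S, S]]     # orthonormal eigenvectors u_1, u_2 of A = [[1, 1], [1, 1]]
LAM = [0.0, 2.0]          # corresponding eigenvalues

def f(t):
    return [math.cos(t), math.sin(t)]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def weight(i, t, n=2000):
    """Approximate a_i(t) = int_0^t exp(lam_i (t - s)) c_i(s) ds
    with the composite midpoint rule on n subintervals."""
    h = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += math.exp(LAM[i] * (t - s)) * dot(f(s), U[i])  # c_i(s) = f(s) . u_i
    return h * total

def x_numeric(t):
    a = [weight(i, t) for i in range(2)]
    return [a[0] * U[0][k] + a[1] * U[1][k] for k in range(2)]

def x_closed_form(t):
    e, c, s = math.exp(2.0 * t), math.cos(t), math.sin(t)
    return [(2 * c + 4 * s + 3 * e - 5) / 10.0,
            (-8 * c - 6 * s + 3 * e + 5) / 10.0]

for t in (0.5, 1.0, 2.0):
    print(t, x_numeric(t), x_closed_form(t))
```

The two columns should agree to several digits; the remaining difference is quadrature error, which shrinks as the number of subintervals grows.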
Exercises

1. Consider the IVP (4.18), where $\mathbf{A}$ is the matrix in Example 4.5. Find the solution for [...]

2. Consider the IVP (4.18), where $\mathbf{A}$ is the matrix in Example 4.5.

(a) Explain why, no matter what the value of $\mathbf{x}_0$, the solution $\mathbf{x}(t)$ converges to a constant vector as $t \to \infty$.

(b) Find all values of $\mathbf{x}_0$ such that the solution $\mathbf{x}$ is equal to a constant vector for all values of $t$.

3. Find the general solution to
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x}, \]
where $\mathbf{A}$ = [...]

4. Find the general solution to
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x}, \]
where [...]

5. Let $\mathbf{A}$ be the matrix in Exercise 3. Find all values of $\mathbf{x}_0$ such that the solution to (4.18) decays exponentially to zero.

6. Let $\mathbf{A}$ be the matrix in Exercise 4. Find all values of $\mathbf{x}_0$ such that the solution to (4.18) decays exponentially to zero.

7. Let $\mathbf{A}$ be the matrix of Example 4.5. Solve the IVP
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} + \mathbf{f}(t),\quad \mathbf{x}(0) = \mathbf{0}, \]
where $\mathbf{f}(t)$ = [...]
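8. Let $\mathbf{A}$ = [...]

Solve the IVP
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} + \mathbf{f}(t),\quad \mathbf{x}(0) = \mathbf{x}_0, \]
where
\[ \mathbf{f}(t) = \begin{bmatrix} \cos(t)\\ 1\\ \sin(t) \end{bmatrix},\quad \mathbf{x}_0 = \begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}. \]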
9. The following system has been proposed as a model of the population dynamics of two species of animals that compete for the same resource:
\[
\begin{aligned}
\frac{dx}{dt} &= ax - by,\quad x(0) = x_0,\\
\frac{dy}{dt} &= -cx + dy,\quad y(0) = y_0.
\end{aligned}
\]
Here $a, b, c, d$ are positive constants, $x(t)$ is the population of the first species at time $t$, and $y(t)$ is the corresponding population of the second species ($x$ and $y$ are measured in some convenient units, say thousands or millions of animals). The equations are easy to understand: either species increases (exponentially) if the other is not present, but, since the two species compete for the same resources, the presence of one species contributes negatively to the growth rate of the other.

(a) Solve the IVP with $b = c = 2$, $a = d = 1$, $x(0) = 2$, and $y(0) = 1$, and explain (in words) what happens to the populations of the two species in the long term.

(b) With the values of $a, b, c, d$ given in part 9a, is there an initial condition which will lead to a different (qualitative) outcome?

10. The purpose of this exercise is to derive the solution (4.11) of the IVP
\[ \frac{d^2u}{dt^2} + \theta^2 u = f(t),\quad u(t_0) = 0,\quad \frac{du}{dt}(t_0) = 0. \tag{4.23} \]
The solution
\[ u(t) = \frac{1}{\theta}\int_{t_0}^{t} \sin(\theta(t-s)) f(s)\,ds \]
can be found using the techniques of this section, although the computations are more difficult than any of the examples we have presented.
(a) Rewrite (4.23) in the form
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x} + \mathbf{F}(t),\quad \mathbf{x}(t_0) = \mathbf{0}. \]
Notice that the matrix $\mathbf{A}$ is not symmetric.

(b) Find the eigenvalues $\lambda_1, \lambda_2$ and eigenvectors $\mathbf{u}_1, \mathbf{u}_2$ of $\mathbf{A}$.

(c) Write the vector-valued function $\mathbf{F}(t)$ in the form
\[ \mathbf{F}(t) = c_1(t)\mathbf{u}_1 + c_2(t)\mathbf{u}_2. \]
Since the eigenvectors $\mathbf{u}_1$ and $\mathbf{u}_2$ are not orthogonal, this will require solving a $(2 \times 2)$ system of equations to find $c_1(t)$ and $c_2(t)$.

(d) Write the solution in the form
\[ \mathbf{x}(t) = a_1(t)\mathbf{u}_1 + a_2(t)\mathbf{u}_2, \]
and solve the scalar IVPs to get $a_1(t)$, $a_2(t)$.

(e) The desired solution is $u(t) = x_1(t)$. Show that the result is (4.11).
4.4 Numerical methods for initial value problems
So far in this chapter, we have discussed simple classes of ODEs, for which it is possible to produce an explicit formula for the solution. However, most differential equations cannot be solved in this sense. Moreover, sometimes the only formula that can be found is difficult to evaluate, involving integrals that cannot be computed in terms of elementary functions, eigenvalues and eigenvectors that cannot be found exactly, and so forth.

In cases like these, it may be that the only way to investigate the solution is to approximate it using a numerical method, which is simply an algorithm producing an approximate solution. We emphasize that the use of a numerical method always implies that the computed solution will be in error. It is essential to know something about the magnitude of this error; otherwise, one cannot use the solution with any confidence.

Most numerical methods for IVPs in ODEs are designed for first-order scalar equations. These can be applied, almost without change, to first-order systems, and hence to higher-order ODEs (after they have been converted to first-order systems). Therefore, we begin by discussing the general first-order scalar IVP, which is of the form
\[ \frac{du}{dt} = f(t, u),\quad u(t_0) = u_0. \tag{4.24} \]
We shall discuss time-stepping methods, which seek to find approximations to $u(t_1), u(t_2), \ldots, u(t_n)$, where $t_0 < t_1 < t_2 < \cdots < t_n$ define the grid. The quantities $t_1 - t_0, t_2 - t_1, \ldots, t_n - t_{n-1}$ are called the time steps.

The basic idea of time-stepping methods is based on the fundamental theorem of calculus, which implies that if
\[ \frac{du}{dt}(t) = f(t, u(t)), \]
then
\[ u(t_{i+1}) = u(t_i) + \int_{t_i}^{t_{i+1}} f(s, u(s))\,ds. \tag{4.25} \]
Now, we cannot (in general) hope to evaluate the integral in (4.25) analytically; however, any numerical method for approximating the value of a definite integral (such methods are often referred to as quadrature rules) can be adapted to form the basis of a numerical method for IVPs. By the way, equation (4.25) explains why the process of solving an IVP numerically is often referred to as integrating the ODE.
4.4.1 Euler's method
The simplest method of integrating an ODE is based on the left-endpoint rule:
\[ \int_a^b f(x)\,dx \doteq f(a)(b - a). \]
Applying this to (4.25), we obtain
\[ u(t_{i+1}) \doteq u(t_i) + (t_{i+1} - t_i) f(t_i, u(t_i)). \tag{4.26} \]
Of course, this is not a computable formula, because, except possibly on the first step ($i = 0$), we do not know $u(t_i)$ exactly. Instead, we have an approximation, $u_i \doteq u(t_i)$. Formula (4.26) suggests how to obtain an approximation $u_{i+1}$ to $u(t_{i+1})$:
\[ u_{i+1} = u_i + (t_{i+1} - t_i) f(t_i, u_i). \tag{4.27} \]
The reader should notice that (except for $i = 0$) there are two sources of error in the estimate $u_{i+1}$. First of all, there is the error inherent in using formula (4.26) to advance the integration by one time step. Second, there is the accumulated error due to the fact that we do not know $u(t_i)$ in (4.26), but rather only an approximation to it.

The method (4.27) is referred to as Euler's method.

Example 4.9. Consider the IVP
\[ \frac{du}{dt} = \frac{u}{1 + t^2},\quad u(0) = 1. \tag{4.28} \]
The exact solution is
\[ u(t) = e^{\tan^{-1} t}, \]
and we can use this formula to determine the errors in the computed approximation to $u(t)$. Applying Euler's method with the regular grid $t_i = i\Delta t$, $\Delta t = 10/n$, we have
\[ u_{i+1} = u_i + \Delta t\,\frac{u_i}{1 + t_i^2},\quad i = 0, 1, \ldots, n-1. \]
For example, with $u_0 = 1$ and $n = 100$, we obtain
\[ u_1 \approx 1.1000,\quad u_2 \approx 1.2089,\quad u_3 \approx 1.3252,\ \ldots. \]
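The first few Euler iterates quoted in Example 4.9 can be reproduced with a short script. This is a minimal sketch, assuming a regular grid; the function names `euler` and `rhs` are hypothetical and chosen only for this illustration.

```python
def euler(f, t0, u0, t_end, n):
    """Euler's method (4.27) on a regular grid with n steps.

    Returns the list [u_0, u_1, ..., u_n] of approximations."""
    dt = (t_end - t0) / n
    us = [u0]
    for i in range(n):
        t_i = t0 + i * dt
        us.append(us[-1] + dt * f(t_i, us[-1]))
    return us

def rhs(t, u):
    # Right-hand side of the IVP (4.28): du/dt = u / (1 + t^2)
    return u / (1.0 + t * t)

us = euler(rhs, 0.0, 1.0, 10.0, 100)
print([round(u, 4) for u in us[1:4]])   # [1.1, 1.2089, 1.3252]
```

Passing the right-hand side as a function keeps the stepping loop independent of the particular ODE, so the same routine can be reused for other scalar IVPs of the form (4.24).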
In Figure 4.1, we graph the exact solution and the approximations computed using Euler's method for $n = 10, 20, 40$. As we should expect, the approximation produced by Euler's method gets better as $\Delta t$ (the time step) decreases. In fact, Table 4.1, where we collect the errors in the approximations to $u(10)$, suggests that the global error (which comprises the total error after a number of steps) is $O(\Delta t)$. The symbol $O(\Delta t)$ ("big-oh of $\Delta t$") denotes a quantity that is proportional to or smaller than $\Delta t$ as $\Delta t \to 0$.

Figure 4.1. Euler's method applied to (4.28). [Plot of the exact solution and the approximations for n = 10, 20, 40 on the interval from t = 0 to t = 10.]
Proving that the order of Euler's method is really $O(\Delta t)$ is beyond the scope of this book, but we can easily sketch the essential ideas of the proof. It can be proved that the left-endpoint rule, applied to an interval of integration of length $\Delta t$, has an error that is $O(\Delta t^2)$. In integrating an IVP, we apply the left-endpoint rule repeatedly, in fact, $n = O(1/\Delta t)$ times. Adding up $1/\Delta t$ errors of order $\Delta t^2$ gives a total error of $O(\Delta t)$. (Proving that this heuristic reasoning is correct takes some rather involved but elementary analysis that can be found in most numerical analysis textbooks.)
n      Error in u(10)      (Error in u(10)) / Δt
10     -3.3384 · 10^-1     -3.3384 · 10^-1
20     -1.8507 · 10^-1     -3.7014 · 10^-1
40     -1.0054 · 10^-1     -4.0218 · 10^-1
80     -5.2790 · 10^-2     -4.2232 · 10^-1
160    -2.7111 · 10^-2     -4.3378 · 10^-1
320    -1.3748 · 10^-2     -4.3992 · 10^-1
640    -6.9235 · 10^-3     -4.4310 · 10^-1

Table 4.1. Global error in Euler's method for (4.28).
on Euler's method: Runge-Kutta methods Improving on
If method can If the the results results suggested suggested in in the the previous previous section section are are correct, correct, Euler's Euler's method can be used to to approximate approximate the the solution to an an IVP to any any desired desired accuracy accuracy by be used solution to IVP to by simply simply making tlt Although this this is might making At small small enough. enough. Although is true true (with (with certain certain restrictions), restrictions), we we might hope for that would as many many time hope for aa more more efficient efficient method-one method—one that would not not require require as time steps steps to achieve aa given given accuracy. accuracy. to achieve To method, we To improve improve Euler's Euler's method, we choose choose aa numerical numerical integration integration technique technique that that is more accurate than the is more accurate than the left-endpoint left-endpoint rule. rule. The The simplest simplest is is the the midpoint midpoint rule: rule:
lar
b
(a+b)
f(x) dx == (b - a)f -2- .
If tlt At = = bb -— aa (and (and the the integrand integrand f/ is is sufficiently sufficiently smooth), smooth), then the error error in in the the If then the midpoint ). Following the reasoning the previous we expect midpoint rule rule is is O(tlt O(A£33). Following the reasoning in in the previous section, section, we expect that the the corresponding that corresponding method method for for integrating integrating an an IVP IVP would would have have O(tlt O(A£22 )) global global error. error. It use the rule for It is is not not immediately immediately clear clear how how to to use the midpoint midpoint rule for quadrature quadrature to to integrate is integrate an an IVP. IVP. The The obvious obvious equation equation is
However, after reaching time $t_i$ in the integration, we have an approximation for $u(t_i)$, but none for $u(t_i + \Delta t/2)$. To use the midpoint rule, we must first generate an approximation for $u(t_i + \Delta t/2)$; the simplest way to do this is with an Euler step:

\[
u\!\left(t_i + \frac{\Delta t}{2}\right) \doteq u(t_i) + \frac{\Delta t}{2}\, f(t_i, u(t_i)).
\]

Putting this approximation together with the midpoint rule, and using the approximation $u_i \doteq u(t_i)$, we obtain the improved Euler method:

\[
u_{i+1} = u_i + \Delta t\, f\!\left(t_i + \frac{\Delta t}{2},\; u_i + \frac{\Delta t}{2}\, f(t_i, u_i)\right).
\]
    n     Error in u(10)       (Error in u(10)) / Δt^2
   10     -2.1629 · 10^-2      -2.1629 · 10^-2
   20     -1.2535 · 10^-2      -5.0140 · 10^-2
   40     -4.7603 · 10^-3      -7.6165 · 10^-2
   80     -1.4746 · 10^-3      -9.4376 · 10^-2
  160     -4.1061 · 10^-4      -10.512 · 10^-2
  320     -1.0835 · 10^-4      -11.095 · 10^-2
  640     -2.7828 · 10^-5      -11.398 · 10^-2

Table 4.2. Global error in the improved Euler method for (4.28).
Figure 4.2 shows the results of the improved Euler method, applied to (4.28) with $n = 10$, $n = 20$, and $n = 40$. The improvement in accuracy can be easily seen by comparing to Figure 4.1. We can see the $O(\Delta t^2)$ convergence in Table 4.2: when $\Delta t$ is divided by two (i.e. $n$ is doubled), the error is divided by approximately four.
Figure 4.2. Improved Euler method applied to (4.28). (Curves shown: the exact solution and the computed solutions for n = 10, n = 20, and n = 40.)
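The improved Euler method can be sketched in a few lines of Python. Since the IVP (4.28) is not restated in this section, the sketch below uses the test problem $du/dt = u + e^t$, $u(0) = 0$ from Exercise 1 (exact solution $u(t) = te^t$) as a stand-in. The names `improved_euler` and `f` are illustrative and do not come from the text.

```python
import math

def improved_euler(f, t0, u0, t_end, n):
    """Take n improved Euler steps from t0 to t_end; return the final estimate."""
    dt = (t_end - t0) / n
    t, u = t0, u0
    for i in range(n):
        # Euler half step: approximate u(t + dt/2)
        u_half = u + 0.5 * dt * f(t, u)
        # Midpoint rule, using the half-step approximation
        u = u + dt * f(t + 0.5 * dt, u_half)
        t = t0 + (i + 1) * dt
    return u

def f(t, u):
    # Right-hand side of the stand-in test problem (an assumption, not (4.28))
    return u + math.exp(t)

exact = 0.5 * math.exp(0.5)  # u(0.5) for the exact solution u(t) = t*exp(t)
errors = [abs(improved_euler(f, 0.0, 0.0, 0.5, n) - exact) for n in (10, 20, 40, 80)]
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors)
print(ratios)  # ratios near 4 are consistent with O(dt^2) convergence
```

Each doubling of `n` halves the step size, so error ratios near four mirror the behavior reported in Table 4.2.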
The improved Euler method is a simple example of a Runge-Kutta (RK) method. An (explicit) RK method takes the following form:

\[
\begin{aligned}
u_{i+1} &= u_i + \Delta t \sum_{j=1}^{m} \alpha_j k_j,\\
k_1 &= f(t_i, u_i),\\
k_2 &= f(t_i + \gamma_2 \Delta t,\; u_i + \beta_{21}\Delta t\, k_1),\\
k_3 &= f(t_i + \gamma_3 \Delta t,\; u_i + \beta_{31}\Delta t\, k_1 + \beta_{32}\Delta t\, k_2),\\
&\;\;\vdots\\
k_m &= f\Bigl(t_i + \gamma_m \Delta t,\; u_i + \sum_{\ell=1}^{m-1} \beta_{m\ell}\Delta t\, k_\ell\Bigr),
\end{aligned}
\tag{4.29}
\]

with certain restrictions on the values of the parameters $\alpha_j$, $\beta_{k\ell}$, $\gamma_j$ (e.g. $\alpha_1 + \cdots + \alpha_m = 1$). Although (4.29) looks complicated, it is not hard to understand the idea. A general form for a quadrature rule is

\[
\int_a^b f(x)\,dx \doteq (b-a) \sum_{j=1}^{m} w_j f(x_j),
\]

where $w_1, w_2, \ldots, w_m$ are the quadrature weights and $x_1, x_2, \ldots, x_m \in [a,b]$ are the quadrature nodes. In the formula for $u_{i+1}$ in (4.29), the weights are $\alpha_1, \alpha_2, \ldots, \alpha_m$, and the values $k_1, k_2, \ldots, k_m$ are estimates of $f(t, u(t))$ at $m$ nodes in the interval $[t_i, t_{i+1}]$.

We will not use the general formula (4.29), but the reader should appreciate the following point: there are many RK methods, obtained by choosing various values for the parameters in (4.29). This fact is used in designing algorithms that attempt to automatically control the error. We discuss this further in Section 4.4.4 below.

The most popular RK method is analogous to Simpson's rule for quadrature:
\[
\int_a^b f(x)\,dx \doteq \frac{b-a}{6}\left( f(a) + 4 f\!\left(\frac{a+b}{2}\right) + f(b) \right).
\]
Simpson's rule has an error of $O(\Delta t^5)$ ($\Delta t = b - a$), and the following related method for integrating an IVP has global error of $O(\Delta t^4)$:

\[
\begin{aligned}
u_{i+1} &= u_i + \frac{\Delta t}{6}\,(k_1 + 2k_2 + 2k_3 + k_4),\\
k_1 &= f(t_i, u_i),\\
k_2 &= f\!\left(t_i + \frac{\Delta t}{2},\; u_i + \frac{\Delta t}{2}\, k_1\right),\\
k_3 &= f\!\left(t_i + \frac{\Delta t}{2},\; u_i + \frac{\Delta t}{2}\, k_2\right),\\
k_4 &= f(t_{i+1},\; u_i + \Delta t\, k_3).
\end{aligned}
\tag{4.30}
\]
    n     Error in u(10)       (Error in u(10)) / Δt^4
   10     1.9941 · 10^-2       1.9941 · 10^-2
   20     1.0614 · 10^-3       1.6982 · 10^-2
   40     6.5959 · 10^-5       1.6885 · 10^-2
   80     4.0108 · 10^-6       1.6428 · 10^-2
  160     2.4500 · 10^-7       1.6057 · 10^-2
  320     1.5100 · 10^-8       1.5833 · 10^-2
  640     9.3654 · 10^-10      1.5712 · 10^-2

Table 4.3. Global error in the fourth-order Runge-Kutta method (RK4) for (4.28).
We call this the RK4 method. Figure 4.3 and Table 4.3 demonstrate the accuracy of RK4; dividing $\Delta t$ by two decreases the error by (approximately) a factor of 16. This is typical for $O(\Delta t^4)$ convergence.
Figure 4.3. Fourth-order Runge-Kutta method (RK4) applied to (4.28). (Curves shown: the exact solution and the computed solutions for n = 10, n = 20, and n = 40.)
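To make the general form (4.29) concrete, the following sketch implements one explicit RK step driven by the parameters $\alpha_j$, $\beta_{j\ell}$, $\gamma_j$, and then instantiates it for the improved Euler method and for RK4 as given in (4.30). The function names, the list-based storage of the parameters, and the test problem (again $du/dt = u + e^t$, $u(0) = 0$ from Exercise 1, used in place of (4.28)) are assumptions made for illustration.

```python
import math

def rk_step(f, t, u, dt, alpha, beta, gamma):
    """One step of the explicit RK form (4.29) for a scalar ODE.

    alpha[j]   : weight of stage j in the update
    gamma[j]   : time offset of stage j (gamma[0] = 0)
    beta[j][l] : coefficient of stage l (l < j) used to build stage j
    """
    k = []
    for j in range(len(alpha)):
        u_stage = u + dt * sum(beta[j][l] * k[l] for l in range(j))
        k.append(f(t + gamma[j] * dt, u_stage))
    return u + dt * sum(a * kj for a, kj in zip(alpha, k))

# Improved Euler written in the form (4.29), with m = 2
IMPROVED_EULER = ([0.0, 1.0], [[], [0.5]], [0.0, 0.5])
# RK4 from (4.30) written in the form (4.29), with m = 4
RK4 = ([1 / 6, 1 / 3, 1 / 3, 1 / 6],
       [[], [0.5], [0.0, 0.5], [0.0, 0.0, 1.0]],
       [0.0, 0.5, 0.5, 1.0])

def integrate(f, t0, u0, t_end, n, tableau):
    """Take n equal steps from t0 to t_end with the given parameters."""
    alpha, beta, gamma = tableau
    dt = (t_end - t0) / n
    t, u = t0, u0
    for i in range(n):
        u = rk_step(f, t, u, dt, alpha, beta, gamma)
        t = t0 + (i + 1) * dt
    return u

def f(t, u):
    # Stand-in test problem (an assumption, not (4.28))
    return u + math.exp(t)

exact = 0.5 * math.exp(0.5)

def error_ratio(tableau, n):
    """Ratio of the errors at t = 0.5 using n steps and 2n steps."""
    e1 = abs(integrate(f, 0.0, 0.0, 0.5, n, tableau) - exact)
    e2 = abs(integrate(f, 0.0, 0.0, 0.5, 2 * n, tableau) - exact)
    return e1 / e2

ratio_ie = error_ratio(IMPROVED_EULER, 10)
ratio_rk4 = error_ratio(RK4, 10)
print(ratio_ie)   # a value near 4 is consistent with O(dt^2)
print(ratio_rk4)  # a value near 16 is consistent with O(dt^4)
```

In both parameter sets the weights `alpha` sum to one, matching the restriction noted after (4.29).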
We must note here that the improvement in efficiency of these higher-order methods is not as dramatic as it might appear at first glance. For instance, comparing Tables 4.1 and 4.3, we notice that just 10 steps of RK4 give a smaller error than 160 steps of Euler's method. However, the improvement in efficiency is not a factor of 16, since 160 steps of Euler's method use 160 evaluations of $f(t,u)$, while 10 steps of RK4 use 40 evaluations of $f(t,u)$. In problems of realistic complexity, the evaluation of $f(t,u)$ is the most expensive part of the calculation (often $f$ is defined by a computer simulation rather than a simple algebraic formula), and so a more reasonable comparison of these results would conclude that RK4 is 4 times as efficient as Euler's method for this particular example and level of error. Higher-order methods tend to use enough fewer steps than lower-order methods that they are more efficient even though they require more work per step.
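The evaluation counts quoted above can be checked by wrapping the right-hand side in a counter. This is only a sketch: the helper `count_calls`, the solver functions, and the stand-in right-hand side $u + e^t$ (from Exercise 1) are illustrative assumptions, and only the number of calls matters here, not the computed values.

```python
import math

def count_calls(f):
    """Wrap f so that the number of evaluations is recorded in .calls."""
    def wrapped(t, u):
        wrapped.calls += 1
        return f(t, u)
    wrapped.calls = 0
    return wrapped

def euler(f, t0, u0, t_end, n):
    dt = (t_end - t0) / n
    t, u = t0, u0
    for i in range(n):
        u = u + dt * f(t, u)              # one evaluation per step
        t = t0 + (i + 1) * dt
    return u

def rk4(f, t0, u0, t_end, n):
    dt = (t_end - t0) / n
    t, u = t0, u0
    for i in range(n):
        k1 = f(t, u)                      # four evaluations per step
        k2 = f(t + dt / 2, u + dt / 2 * k1)
        k3 = f(t + dt / 2, u + dt / 2 * k2)
        k4 = f(t + dt, u + dt * k3)
        u = u + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t = t0 + (i + 1) * dt
    return u

def rhs(t, u):
    return u + math.exp(t)  # stand-in right-hand side (assumption)

f_euler = count_calls(rhs)
f_rk4 = count_calls(rhs)
euler(f_euler, 0.0, 0.0, 0.5, 160)
rk4(f_rk4, 0.0, 0.0, 0.5, 10)
print(f_euler.calls, f_rk4.calls)  # prints "160 40"
```

The ratio 160/40 = 4 is the factor used in the comparison above.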
4.4.3 Numerical methods for systems of ODEs
The numerical methods presented above can be applied directly to a system of ordinary differential equations. Indeed, when the system is written in vector form, the notation is virtually identical. We now present an example.
Example 4.10. Consider two species of animals that share a habitat, and suppose one species is prey to the other. Let $x_1(t)$ be the population of the predator species at time $t$, and let $x_2(t)$ be the population of the prey at the same time. The Lotka-Volterra predator-prey model for these two populations is

\[
\begin{aligned}
\frac{dx_1}{dt} &= e_2 e_1 x_1 x_2 - q x_1,\\
\frac{dx_2}{dt} &= r x_2 - e_1 x_1 x_2,
\end{aligned}
\]

where $e_1$, $e_2$, $q$, $r$ are positive constants. The parameters have the following interpretations:
• $e_1$ describes the attack rate of the predators (the rate at which the prey are killed by the predators is $e_1 x_1 x_2$);
• $e_2$ describes the growth rate of the predator population based on the number of prey killed (the efficiency of predators at converting prey to predators);
• $q$ is the rate at which the predators die;
• $r$ is the intrinsic growth rate of the prey population.

The equations describe the rate of population growth (or decline) of each species. The rate of change of the predator population is the difference between the rate of growth due to feeding on the prey and the rate of death. The rate of change of the prey population is the difference between the natural growth rate (which would govern in the absence of predators) and the death rate due to predation.

Define $f : \mathbf{R} \times \mathbf{R}^2 \to \mathbf{R}^2$ by
\[
f(t,x) = \begin{bmatrix} e_2 e_1 x_1 x_2 - q x_1 \\ r x_2 - e_1 x_1 x_2 \end{bmatrix}
\]

($f$ is actually independent of $t$; however, we include $t$ as an independent variable so that this example fits into the general form discussed above). Then, given initial values $x_{1,0}$ and $x_{2,0}$ of the predator and prey populations, respectively, the IVP of interest is

\[
\frac{dx}{dt} = f(t,x), \quad x(0) = x_0,
\tag{4.31}
\]

where

\[
x_0 = \begin{bmatrix} x_{1,0} \\ x_{2,0} \end{bmatrix}.
\]

Suppose

\[
e_1 = 0.01,\quad e_2 = 0.2,\quad r = 0.2,\quad q = 0.4,\quad x_{1,0} = 40,\quad x_{2,0} = 500.
\]
Using the RK4 method described above, we estimated the solution of (4.31) on the time interval [0,50]. The implementation is exactly as described in (4.30), except that now the various quantities ($k_j$, $u_i$, $f(t_i, u_i)$) are vectors. The time step used was $\Delta t = 0.05$ (for a total of 1000 steps). In Figure 4.4, we plot the two populations versus time; this graph suggests that both populations vary periodically.
Figure 4.4. The variation of the two populations with time (Example 4.10).

Another way to visualize the results is to graph $x_2$ versus $x_1$; such a graph is meaningful because, for an autonomous ODE (one in which $t$ does not appear explicitly), the curve $(x_1(t), x_2(t))$ is determined entirely by the initial value $x_0$ (see Exercise 5 for a precise formulation of this property). In Figure 4.5, we graph $x_2$ versus $x_1$. This curve is traversed in the clockwise direction (as can be determined by comparing with Figure 4.4). For example, when the prey population is large and the predator population is small (the upper left part of the curve), the predator population will begin to grow. As it does, the prey population begins to decrease (since more prey are eaten). Eventually, the prey population gets small enough that it cannot support a large number of predators (far right part of the curve), and the predator population decreases rapidly. When the predator population gets small, the prey population begins to grow, and the whole cycle begins again.
Figure 4.5. The variation of the two populations (Example 4.10). (Horizontal axis: predator population.)
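The computation in Example 4.10 can be sketched as follows, using the parameter values, initial values, and step size stated in the example. The code stores vectors as plain Python lists; the helper names are illustrative. As an additional check that is not part of the text, the sketch monitors the quantity $H = r\ln x_1 - e_1 x_1 + q\ln x_2 - e_2 e_1 x_2$, which is constant along exact solutions of this system (differentiate $H$ along a solution and substitute the two ODEs to see that the terms cancel), so a closed curve in the $(x_1, x_2)$ plane corresponds to a small drift in $H$.

```python
import math

E1, E2, R, Q = 0.01, 0.2, 0.2, 0.4  # parameter values from Example 4.10

def f(t, x):
    """Lotka-Volterra right-hand side; x[0] = predators, x[1] = prey."""
    x1, x2 = x
    return [E2 * E1 * x1 * x2 - Q * x1, R * x2 - E1 * x1 * x2]

def axpy(a, x, y):
    """Return y + a*x for vectors stored as lists."""
    return [yi + a * xi for xi, yi in zip(x, y)]

def rk4_system(f, t0, x0, dt, n_steps):
    """RK4 as in (4.30), with all quantities treated as vectors."""
    t, x = t0, list(x0)
    ts, xs = [t], [x]
    for i in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + dt / 2, axpy(dt / 2, k1, x))
        k3 = f(t + dt / 2, axpy(dt / 2, k2, x))
        k4 = f(t + dt, axpy(dt, k3, x))
        x = [xi + dt / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t = t0 + (i + 1) * dt
        ts.append(t)
        xs.append(x)
    return ts, xs

def invariant(x):
    """Quantity that is constant along exact solutions (check added here)."""
    x1, x2 = x
    return R * math.log(x1) - E1 * x1 + Q * math.log(x2) - E2 * E1 * x2

ts, xs = rk4_system(f, 0.0, [40.0, 500.0], 0.05, 1000)
drift = max(abs(invariant(x) - invariant(xs[0])) for x in xs)
print(ts[-1], xs[-1])
print(drift)  # a small value indicates the computed orbit stays on one closed curve
```

Plotting the first and second components of `xs` against `ts` gives a graph of the kind shown in Figure 4.4, and plotting the second component against the first gives the phase-plane view of Figure 4.5.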
4.4.4 Automatic step control and Runge-Kutta-Fehlberg methods
As the above examples show, it is possible to design numerical methods which give a predictable improvement in the error as the time step is decreased. However, these methods do not allow the user to choose the step size in order to attain a desired level of accuracy. For example, we know that decreasing $\Delta t$ from 0.1 to 0.05 will decrease the error in an $O(\Delta t^4)$ method by approximately a factor of 16, but this does not tell us that either step size will lead to a global error less than $10^{-3}$.

It would be desirable to have a method that, given a desired level of accuracy, could choose the step size in an algorithm so as to attain that accuracy. While no method guaranteed to do this is known, there is a heuristic technique that is usually successful in practice. The basic idea is quite simple. When an algorithm wishes to integrate from $t_n$ to $t_{n+1}$, it uses two different methods, one of order $\Delta t^k$ and another of order $\Delta t^{k+1}$. It then regards the more accurate estimate (resulting from the $O(\Delta t^{k+1})$ method) as the exact value, and compares it with the estimate from the lower-order method. If the difference is sufficiently small, then it is assumed that the step size is sufficiently small, and the step is accepted. (Moreover, if the difference is too small, then it is assumed that the step size is smaller than it needs to be, and it may be increased on the next step.) On the other hand, if the difference is too large, then it is assumed that the step is inaccurate. The step is rejected, and $\Delta t$, the step size, is reduced. This is called automatic step control, and it leads to an approximate solution computed on an irregular grid, since $\Delta t$ can vary from one step to the next. This technique is not guaranteed to produce a global error below the desired level, since the method only controls the local error (the error resulting from a single step of the method). However, the relationship between the local and global error is understood in principle, and this understanding leads to methods that usually achieve the desired accuracy.

A popular class of methods for automatic step control consists of the Runge-Kutta-Fehlberg (RKF) methods, which use two RK methods together to control the local error as described in the previous paragraph. The general form of RK methods, given in (4.29), shows that there are many possible RK formulas; indeed, for a given order $\Delta t^k$, there are many different RK methods that can be derived. This fact is used in the RKF methodology to choose pairs of formulas that evaluate $f$, the function defining the ODE, as few times as possible. For example, it can be proved that every RK method of order $\Delta t^5$ requires at least six evaluations of $f$. It is possible to choose six points (i.e. the values $t_i, t_i + \gamma_2 \Delta t, \ldots, t_i + \gamma_6 \Delta t$ in (4.29)) so that five of the points can be used in an $O(\Delta t^4)$ formula, and the six points together define an $O(\Delta t^5)$ formula. (One such pair of formulas is the basis of the popular RKF45 method.) This allows a very efficient implementation of automatic step control.
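The accept/reject logic described above can be sketched with a much simpler pair of methods than RKF45. The sketch below pairs Euler's method (order $\Delta t$) with the improved Euler method (order $\Delta t^2$), which share the evaluation $k_1$; it is not the RKF45 scheme, whose coefficients are not given in the text. The step-size update rule (safety factor 0.9, growth and shrink limits of 2 and 0.2), the tolerance, and the test problem from Exercise 1 are all assumptions chosen for illustration.

```python
import math

def adaptive_step_control(f, t0, u0, t_end, tol, dt0=0.1):
    """Integrate a scalar IVP with an Euler / improved Euler pair.

    The improved Euler value is treated as "exact" and compared with the
    Euler value; the step is accepted only if they differ by at most tol.
    """
    t, u, dt = t0, u0, dt0
    ts, us = [t], [u]
    rejected = 0
    while t_end - t > 1e-12:
        dt = min(dt, t_end - t)          # do not step past t_end
        k1 = f(t, u)
        k2 = f(t + dt / 2, u + dt / 2 * k1)
        low = u + dt * k1                # Euler estimate
        high = u + dt * k2               # improved Euler estimate
        err = abs(high - low)            # estimate of the local error
        if err <= tol:
            t, u = t + dt, high          # accept the step
            ts.append(t)
            us.append(u)
        else:
            rejected += 1                # reject: retry from the same t
        # Heuristic update (assumption): err behaves like C*dt^2 here
        factor = 0.9 * math.sqrt(tol / err) if err > 0 else 2.0
        dt *= min(2.0, max(0.2, factor))
    return ts, us, rejected

def f(t, u):
    return u + math.exp(t)  # test problem from Exercise 1 (assumption)

ts, us, rejected = adaptive_step_control(f, 0.0, 0.0, 0.5, tol=1e-6)
steps = [b - a for a, b in zip(ts, ts[1:])]
final_error = abs(us[-1] - 0.5 * math.exp(0.5))
print(len(steps), rejected, min(steps), max(steps))
print(final_error)
```

The list `steps` shows the irregular grid mentioned above: the step size is reduced after rejected steps and adjusted again as the solution changes. As the text cautions, the tolerance bounds only the estimated local error, so `final_error` is not guaranteed to be below `tol`.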
Exercises

1. The purpose of this exercise is to estimate the value of $u(0.5)$, where $u(t)$ is the solution of the IVP

\[
\frac{du}{dt} = u + e^t, \quad u(0) = 0.
\tag{4.32}
\]

The solution is $u(t) = te^t$, and so $u(0.5) = e^{1/2}/2 \doteq 0.82436$.

(a) Estimate $u(0.5)$ by taking 4 steps of Euler's method.
(b) Estimate $u(0.5)$ by taking 2 steps of the improved Euler's method.
(c) Estimate $u(0.5)$ by taking 1 step of the classical fourth-order RK method.

Which estimate is more accurate? How many times did each evaluate $f(t,u) = u + e^t$?

2. Reproduce the results of Example 4.10.
3.¹⁷ The system of ODEs

\[
\begin{aligned}
\frac{d^2x}{dt^2} &= x + 2\frac{dy}{dt} - \frac{\mu_2 (x + \mu_1)}{r_1(x,y)^3} - \frac{\mu_1 (x - \mu_2)}{r_2(x,y)^3},\\
\frac{d^2y}{dt^2} &= y - 2\frac{dx}{dt} - \frac{\mu_2\, y}{r_1(x,y)^3} - \frac{\mu_1\, y}{r_2(x,y)^3},
\end{aligned}
\tag{4.33}
\]

where

\[
r_1(x,y) = \sqrt{(x + \mu_1)^2 + y^2}, \quad
r_2(x,y) = \sqrt{(x - \mu_2)^2 + y^2},
\]

models the orbit of a satellite about two heavenly bodies, which we will assume to be the earth and the moon. In these equations, $(x(t), y(t))$ are the coordinates of the satellite at time $t$. The origin of the coordinate system is the center of mass of the earth-moon system, and the $x$-axis is the line through the centers of the earth and the moon. The center of the moon is at the point $(1 - \mu_1, 0)$ and the center of the earth is at $(-\mu_1, 0)$, where $\mu_1 = 1/82.45$ is the ratio of the mass of the moon to the mass of the earth. The unit of length is the distance from the center of the earth to the center of the moon. We write $\mu_2 = 1 - \mu_1$.

If the satellite satisfies the following initial conditions, then its orbit is known to be periodic with period $T = 6.19216933$:
\[
x(0) = 1.2, \quad \frac{dx}{dt}(0) = 0, \quad y(0) = 0, \quad \frac{dy}{dt}(0) = -1.04935751.
\]
(a) Convert the system (4.33) of second-order ODEs to a first-order system.
(b) Use a program that implements an adaptive (automatic step control) method¹⁸ (such as RKF45) to follow the orbit for one period. Plot the orbit in the plane, and make sure that the tolerance used by the adaptive algorithm is small enough that the orbit really appears in the plot to be periodic. Explain the variation in time steps with reference to the motion of the satellite.
(c) Assume that, in order to obtain comparable accuracy with a fixed step size, it is necessary to use the minimum step chosen by the adaptive algorithm. How many steps would be required by an algorithm with fixed step size? How many steps were used by the adaptive algorithm?¹⁹

¹⁷ Adapted from [16], pages 153-154.
¹⁸ Both MATLAB and Mathematica provide such routines, ode45 and NDSolve, respectively.
¹⁹ It is easy to determine the number of steps used by MATLAB's ode45 routine. It is possible but tricky to determine the number of steps used by Mathematica's NDSolve routine.
4. Let

\[
A = \frac{1}{5}\begin{bmatrix} 11 & 48 \\ 48 & 39 \end{bmatrix}.
\]

The problem is to produce a graph, on the interval [0,20], of the two components of the solution to

\[
\frac{du}{dt} = Au, \quad u(0) = u_0,
\tag{4.34}
\]
where
(a) Use the RK4 method with a step size of $\Delta t = 0.1$ to compute the solution. Graph both components on the interval [0,20].
(b) Use the techniques of the previous section to compute the exact solution. Graph both components on the interval [0,20].
(c) Explain the difference.
5. (a) Suppose $t_0 \in [a,b]$ and $f : \mathbf{R}^n \to \mathbf{R}^n$, $u : [a,b] \to \mathbf{R}^n$ satisfy

\[
\frac{du}{dt} = f(u), \quad u(t_0) = u_0.
\]

Show that $v : [a + (t_1 - t_0),\, b + (t_1 - t_0)] \to \mathbf{R}^n$ defined by

\[
v(t) = u(t - (t_1 - t_0))
\]

satisfies

\[
\frac{dv}{dt} = f(v), \quad v(t_1) = u_0.
\]

Moreover, show that the curves

\[
\{u(t) \,:\, t \in [a,b]\}
\quad\text{and}\quad
\{v(t) \,:\, t \in [a + (t_1 - t_0),\, b + (t_1 - t_0)]\}
\]

are the same.

(b) Show, by producing an explicit example (a scalar IVP will do), that the above property does not hold for a differential equation that depends explicitly on $t$ (that is, for a nonautonomous ODE).
6. Consider the IVP

\[
\frac{dx}{dt} = \cos(t)\,x, \quad x(0) = 1.
\]

Use both Euler's method and the improved Euler method to estimate $x(10)$, using step sizes 1, 1/2, 1/4, 1/8, and 1/16. Verify that the error in Euler's method is $O(\Delta t)$, while the error in the improved Euler method is $O(\Delta t^2)$.

7. Consider the IVP

\[
\frac{dx}{dt} = 1 + |x - 1|, \quad x(0) = 0.
\]

(a) Use the RK4 method with step sizes 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 to estimate $x(1/2)$ and verify that the error is $O(\Delta t^4)$.
(b) Use the RK4 method with step sizes 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 to estimate $x(2)$. Is the error $O(\Delta t^4)$ in this case as well? If not, speculate as to why it is not.

8. Let
A~
-31
1 o -11 2 0 1 -1 0 -1 -1 1 0 0 2 -1 0-2
l
[ -2
-3
.
The matrix $A$ has an eigenvalue $\lambda \doteq -0.187$; let $x$ be a corresponding eigenvector. The problem is to produce a graph on the interval [0,20] of the five components of the solution to

\[
\frac{du}{dt} = Au, \quad u(0) = x.
\]

(a) Solve the problem using the RK4 method (in floating point arithmetic) and graph the results.
(b) What is the exact solution?
(c) Why is there a discrepancy between the computed solution and the exact solution?
(d) Can this discrepancy be eliminated by decreasing the step size in the RK4 method?
4.5 Stiff systems of ODEs
The numerical methods described in the last section, while adequate for many IVPs, can be unnecessarily inefficient for some problems. We begin with a comparison of two IVPs, and the performance of a state-of-the-art automatic step-control algorithm on them.

Example 4.11. We consider the following two IVPs:

\[
\frac{dx}{dt} = A_1 x, \quad x(0) = x_0,
\tag{4.35}
\]

\[
\frac{dx}{dt} = A_2 x, \quad x(0) = x_0.
\tag{4.36}
\]

For both IVPs, the initial value is the same:

We will describe the matrices $A_1$ and $A_2$ by their spectral decompositions. The two matrices are symmetric, with the same eigenvectors but different eigenvalues. The eigenvectors are

while the eigenvalues of $A_1$ are $\lambda_1 = -1$, $\lambda_2 = -2$, $\lambda_3 = -3$ and the eigenvalues of $A_2$ are $\lambda_1 = -1$, $\lambda_2 = -10$, $\lambda_3 = -100$.

Since the initial value $x_0$ is an eigenvector of both matrices, corresponding to the eigenvalue $-1$ in both cases, the two IVPs have exactly the same solution:

We solved both systems using the MATLAB command ode45, which implements a state-of-the-art automatic step-control algorithm based on a fourth-fifth-order scheme.²⁰ The results are graphed in Figure 4.6, where, as expected, we see that the two computed solutions are the same (or at least very similar).

However, examining the results of the computations performed by ode45 reveals a surprise: the algorithm used only 52 steps for IVP (4.35) but 1124 steps for (4.36)!

²⁰ The routine ode45 was implemented by Shampine and Reichelt. See [42] for details.
Figure 4.6. The computed solutions to IVP (4.35) (top) and IVP (4.36) (bottom). (Each graph shows all three components of the solution; however, the exact solution, which is the same for the two IVPs, satisfies $x_1(t) = x_2(t) = x_3(t)$.)

A detailed explanation of these results is beyond the scope of this book, but we briefly describe the reason for this behavior. These comments are illustrated by an example below. Explicit time-stepping methods for ODEs, such as those that form the basis of ode45, have an associated stability region for the time step. This means that the expected $O(\Delta t^k)$ behavior of the global error is not observed unless $\Delta t$ is small enough to lie in the stability region. For larger values of $\Delta t$, the method is unstable and produces computed solutions that "blow up" as the integration proceeds. Moreover, this stability region is problem dependent and gets smaller as the eigenvalues of the matrix $A$ get large and negative.²¹
21 smaller as as the eigenvalues of large and If A has large eigenvalues, then system being has some If A has large negative negative eigenvalues, then the the system being modeled modeled has some transient the presence of the behavior transient behavior behavior that that quickly quickly dies dies out. out. It It is is the presence of the transient transient behavior 21 If 21 If
the of ODEs consideration is the system system of ODEs under under consideration is nonlinear, nonlinear, say say
= f(x,t), then it the eigenvalues the Jacobian Jacobian matrix / ax that that determine behavior. then it is is the eigenvalues of of the matrix 8f df/dx determine the the behavior. x
ODEs 4.5. Stiff systems systems of ODEs
117
that requires requires aa small small time time step—initially, step-initially, the the solution over aa very very small that solution is is changing changing over small time scale, the time time step model this this behavior behavior accurately. time scale, and and the step must must be be small small to to model accurately. If, the same time, there there are small negative negative eigenvalues, then the the system If, at at the same time, are small eigenvalues, then system also also models components out more Once the out, that die die out more slowly. slowly. Once the transients transients have have died died out, models components that one ought to able to increase the step to more slowly slowly varying varying to be be able to increase the time time step to model model the the more one ought components of the the solution. However, this this is is the the weakness weakness of explicit methods methods like like components of solution. However, of explicit the transient transient behavior behavior inherent Euler's method, RK4, Euler's method, RK4, and and others: others: the inherent in in the the system system haunts the numerical even when when the the transient transient components solution haunts the numerical method method even components of of the the solution have A time time step that ought to be be small enough to to accurately have died died out. out. A step that ought to small enough accurately follow follow the the slowly varying components produces instability in the numerical method and ruins slowly varying components produces instability in the numerical method and ruins the the computed computed solution. solution. 
A system with negative eigenvalues of widely different magnitudes (that is, a system that models components that decay at greatly different rates) is sometimes called stiff.22 Stiff problems require special numerical methods; we give examples below in Section 4.5.2. First, however, we discuss a simple example for which the ideas presented above can be easily understood.
4.5.1 A simple example of a stiff system
The IVP

$$\frac{du_1}{dt} = -10^7 u_1, \quad u_1(0) = 1,$$
$$\frac{du_2}{dt} = -u_2, \quad u_2(0) = 1 \tag{4.37}$$

has solution

$$u(t) = \begin{bmatrix} u_1(0)\,e^{-10^7 t} \\ u_2(0)\,e^{-t} \end{bmatrix},$$
and the system qualifies as stiff according to the description given above. We will apply Euler's method, since it is simple enough that we can completely understand its behavior. However, similar results would be obtained with RK4 or another explicit method.

To understand the behavior of Euler's method on (4.37), we write it out explicitly. We have

$$u^{(n+1)} = u^{(n)} + \Delta t\, f\left(t_n, u^{(n)}\right)$$

with

$$f(t,u) = \begin{bmatrix} -10^7 u_1 \\ -u_2 \end{bmatrix},$$

that is,

$$u_1^{(n+1)} = u_1^{(n)} - 10^7 \Delta t\, u_1^{(n)} = \left(1 - 10^7 \Delta t\right) u_1^{(n)},$$
$$u_2^{(n+1)} = u_2^{(n)} - \Delta t\, u_2^{(n)} = \left(1 - \Delta t\right) u_2^{(n)}.$$

22 Stiffness is a surprisingly subtle concept. The definition we follow here does not capture all of this subtlety; moreover, there is not a single, well-accepted definition of a stiff system. See [33], Section 6.2, for a discussion of the various definitions of stiffness.
It follows that

$$u_1^{(n)} = \left(1 - 10^7 \Delta t\right) u_1^{(n-1)} = \left(1 - 10^7 \Delta t\right)^2 u_1^{(n-2)} = \left(1 - 10^7 \Delta t\right)^3 u_1^{(n-3)} = \cdots = \left(1 - 10^7 \Delta t\right)^n u_1^{(0)} = \left(1 - 10^7 \Delta t\right)^n$$

and, similarly,

$$u_2^{(n)} = \left(1 - \Delta t\right)^n.$$

Clearly, the computation depends critically on

$$\left|1 - 10^7 \Delta t\right|, \quad \left|1 - \Delta t\right|.$$

If one of these quantities is larger than 1, the corresponding component will grow exponentially as the iteration progresses. At the very least, for stability (that is, to avoid spurious exponential growth), we need

$$\left|1 - 10^7 \Delta t\right| \le 1, \quad \left|1 - \Delta t\right| \le 1.$$
The first inequality determines the restriction on $\Delta t$, and a little algebra shows that we must have

$$0 < \Delta t \le 2 \cdot 10^{-7}.$$

Thus a very small time step is required for stability, and this restriction on $\Delta t$ is imposed by the transient behavior in the system.

As we show below, it is possible to avoid the need for overly small time steps in a stiff system; however, we must use implicit methods. Euler's method and the RK methods discussed in Section 4.4 are explicit, meaning that the value $u^{(n+1)}$ is defined by a formula involving only known quantities. In particular, only $u^{(n)}$ appears in the formula; in some explicit methods (multistep methods), $u^{(n-1)}, u^{(n-2)}, \ldots$ may appear in the formula, but $u^{(n+1)}$ itself does not.

On the other hand, an implicit method defines $u^{(n+1)}$ by a formula that involves $u^{(n+1)}$ itself (as well as $u^{(n)}$ and possibly earlier computed values of $u$). This means that, at each step of the iteration, an algebraic equation (possibly nonlinear) must be solved to find the value of $u^{(n+1)}$. In spite of this additional computational expense, implicit methods are useful because of their improved stability properties.
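The stability restriction derived for (4.37) can be observed directly. The following is a minimal Python sketch (the text itself works in MATLAB; the function name `euler_stiff` and the two trial step sizes are assumptions made only for illustration). It applies Euler's method with a step just below and just above the bound $2 \cdot 10^{-7}$.

```python
# Explicit Euler applied to the decoupled stiff system (4.37):
#   du1/dt = -1e7 * u1,  du2/dt = -u2,  u1(0) = u2(0) = 1.

def euler_stiff(dt, steps):
    """Return (u1, u2) after `steps` Euler steps of size dt."""
    u1, u2 = 1.0, 1.0
    for _ in range(steps):
        u1 = u1 + dt * (-1.0e7 * u1)   # equals (1 - 1e7*dt) * u1
        u2 = u2 + dt * (-u2)           # equals (1 - dt) * u2
    return u1, u2

# Step just inside the stability bound: the first component decays.
stable = euler_stiff(1.9e-7, 200)
# Step just outside the bound: the first component grows at every step.
unstable = euler_stiff(2.1e-7, 200)

print("dt = 1.9e-7:", stable)
print("dt = 2.1e-7:", unstable)
```

In both runs the second component barely changes, since only about $4 \cdot 10^{-5}$ time units have elapsed; the step size is dictated entirely by the fast component, even though that component is negligible in the exact solution almost immediately.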
4.5.2 The backward Euler method
The simplest implicit method is the backward Euler method. Recall that Euler's method for the IVP

$$\frac{du}{dt} = f(t,u), \quad u(t_0) = u_0$$

was derived from the equivalent formulation

$$u(t_{n+1}) = u(t_n) + \int_{t_n}^{t_{n+1}} f(s, u(s))\,ds$$

by applying the left-endpoint rule for quadrature:

$$\int_a^b h(x)\,dx \doteq h(a)(b - a).$$

The result is

$$u_{n+1} = u_n + \Delta t\, f(t_n, u_n)$$

(see Section 4.4.1). To obtain the backward Euler method, we apply the right-hand rule,

$$\int_a^b h(x)\,dx \doteq h(b)(b - a),$$

instead. The result is

$$u_{n+1} = u_n + \Delta t\, f(t_{n+1}, u_{n+1}). \tag{4.38}$$

This method is indeed implicit. It is necessary to solve the equation (4.38) for $u_{n+1}$. In the case of a nonlinear ODE (or a system of nonlinear ODEs), this may be difficult (requiring a numerical root-finding algorithm such as Newton's method). However, it is not difficult to implement the backward Euler method for a linear system. We illustrate this for the linear, constant-coefficient system:

$$\frac{dx}{dt} = Ax + f(t), \quad x(0) = x_0. \tag{4.39}$$

The backward Euler method takes the form

$$x_{n+1} = x_n + \Delta t\left(Ax_{n+1} + f(t_{n+1})\right)$$
$$\Rightarrow\ x_{n+1} - \Delta t A x_{n+1} = x_n + \Delta t\, f(t_{n+1})$$
$$\Rightarrow\ (I - \Delta t A)\,x_{n+1} = x_n + \Delta t\, f(t_{n+1})$$
$$\Rightarrow\ x_{n+1} = (I - \Delta t A)^{-1}\left(x_n + \Delta t\, f(t_{n+1})\right).$$
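The last formula translates almost line by line into code. The following Python sketch is one possible implementation under stated assumptions: the names `solve_linear` and `backward_euler` are hypothetical, matrices are plain lists of lists, and the linear system $(I - \Delta t A)x_{n+1} = x_n + \Delta t f(t_{n+1})$ is solved by Gaussian elimination at each step rather than by forming the inverse explicitly.

```python
import math

def solve_linear(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]   # augmented matrix [M | b]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))  # pivot row
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            m = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= m * aug[k][j]
    y = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y

def backward_euler(A, f, x0, t0, dt, steps):
    """Integrate dx/dt = A x + f(t), x(t0) = x0, by backward Euler."""
    n = len(x0)
    # For constant A and dt, the matrix I - dt*A is the same at every step.
    M = [[(1.0 if i == j else 0.0) - dt * A[i][j] for j in range(n)]
         for i in range(n)]
    x, t = list(x0), t0
    for _ in range(steps):
        t += dt
        ft = f(t)
        rhs = [x[i] + dt * ft[i] for i in range(n)]  # x_n + dt*f(t_{n+1})
        x = solve_linear(M, rhs)                     # (I - dt*A) x_{n+1} = rhs
    return x

# Usage on the stiff example (4.37), written as a linear system with f = 0.
A = [[-1.0e7, 0.0], [0.0, -1.0]]
x_end = backward_euler(A, lambda t: [0.0, 0.0], [1.0, 1.0], 0.0, 0.05, 20)
print(x_end, "exact second component:", math.exp(-1.0))
```

With $\Delta t = 0.05$, which is far larger than the explicit bound $2 \cdot 10^{-7}$, the computed solution stays bounded; the second component is only a rough approximation to $e^{-1}$, as expected from a first-order method with so few steps.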
Example 4.12. Applying the backward Euler method to the simple example (4.37) gives some insight into the difference between the implicit and explicit methods. We obtain the following iteration for the first component:

$$u_1^{(n+1)} = u_1^{(n)} - 10^7 \Delta t\, u_1^{(n+1)}$$
$$\Rightarrow\ \left(1 + 10^7 \Delta t\right) u_1^{(n+1)} = u_1^{(n)}$$
$$\Rightarrow\ u_1^{(n+1)} = \left(1 + 10^7 \Delta t\right)^{-1} u_1^{(n)}$$
$$\Rightarrow\ u_1^{(n)} = \left(1 + 10^7 \Delta t\right)^{-n}.$$
Similarly, for the second component, we obtain

$$u_2^{(n)} = \left(1 + \Delta t\right)^{-n}.$$

For any positive value of $\Delta t$, we have

$$\left|\left(1 + 10^7 \Delta t\right)^{-1}\right| < 1, \quad \left|\left(1 + \Delta t\right)^{-1}\right| < 1,$$

and so no instability can arise. The step size $\Delta t$ will still have to be small enough for the solution to be accurate, but we do not need to take $\Delta t$ excessively small to avoid spurious exponential growth.

Example 4.13. We will apply both Euler's method and the backward Euler method to the IVP

$$\frac{dx}{dt} = A_2 x, \quad x(0) = x_0, \tag{4.40}$$

where $A_2$ is the matrix from Example 4.11 and $x_0$ is the given initial vector. We integrate over the interval $[0,2]$ with a time step of $\Delta t = 0.05$ (40 steps) in both algorithms. The results are shown in Figure 4.7. Euler's method "blows up" with this time step, while the backward Euler method produces a reasonable approximation to the true solution.

Remark. There are higher-order methods designed for use with stiff ODEs, notably the trapezoidal method. It is also possible to develop automatic step-control methods for stiff systems.23
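The kind of comparison made in Example 4.13 can be sketched in a few lines of Python. The matrix and initial vector of that example are not reproduced here, so the sketch below substitutes stand-ins and should be read as an illustration only: the matrix is the stiff $2 \times 2$ matrix from Exercise 5 of this section (eigenvalues $-1$ and $-100$), and the initial vector $(1, 0)$ is a hypothetical choice. The step $\Delta t = 0.05$ and the interval $[0,2]$ match the example.

```python
import math

# Stand-in stiff matrix (from Exercise 5; eigenvalues -1 and -100).
A = [[-50.5, 49.5], [49.5, -50.5]]
x0 = [1.0, 0.0]          # hypothetical initial vector, chosen for illustration
dt, steps = 0.05, 40     # same step size and interval [0, 2] as Example 4.13

def matvec(M, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix by the cofactor formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

# Euler: x_{n+1} = (I + dt*A) x_n
E = [[1 + dt * A[0][0], dt * A[0][1]], [dt * A[1][0], 1 + dt * A[1][1]]]
# Backward Euler: x_{n+1} = (I - dt*A)^{-1} x_n
B = inverse_2x2([[1 - dt * A[0][0], -dt * A[0][1]],
                 [-dt * A[1][0], 1 - dt * A[1][1]]])

x_euler, x_backward = list(x0), list(x0)
for _ in range(steps):
    x_euler = matvec(E, x_euler)
    x_backward = matvec(B, x_backward)

# Exact solution built from the eigenpairs (-1, (1,1)) and (-100, (1,-1)).
T = dt * steps
x_exact = [0.5 * math.exp(-T) + 0.5 * math.exp(-100 * T),
           0.5 * math.exp(-T) - 0.5 * math.exp(-100 * T)]

print("Euler:         ", x_euler)
print("backward Euler:", x_backward)
print("exact:         ", x_exact)
```

For this stand-in problem, Euler's method multiplies the fast component by $1 - 100\,\Delta t = -4$ at every step, so it grows without bound, while the backward Euler method multiplies it by $1/6$.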
Figure 4.7. The computed solutions to IVP (4.40) using Euler's method (top) and the backward Euler method (bottom). (Each graph shows all three components of the solution.)

23 MATLAB includes such routines.

Exercises

1. Let

A = [

and consider the IVP

$$\frac{dx}{dt} = Ax, \quad x(0) = x_0.$$

(a) Find the exact solution $x(t)$.

(b) Estimate $x(1)$ using 10 steps of Euler's method and compute the norm of the error.

(c) Estimate $x(1)$ using 10 steps of the backward Euler method and compute the norm of the error.

2. Repeat Exercise 1 for
3. Let

$$A = \begin{bmatrix} -113 & -80 & -107 \\ -80 & -140 & -80 \\ -107 & -80 & -113 \end{bmatrix}.$$

Find the largest value of $\Delta t$ such that Euler's method is stable. (Use numerical experimentation.)
4. Let $f : \mathbf{R} \times \mathbf{R}^2 \to \mathbf{R}^2$ be defined by

The nonlinear IVP

$$\frac{dx}{dt} = f(t,x), \quad x(0) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

has solution

$$x(t) = \begin{bmatrix} te^{-t} \\ e^{-t} \end{bmatrix}.$$

(a) Show by example that Euler's method is unstable unless $\Delta t$ is chosen small enough. Determine (by trial and error) how small $\Delta t$ must be in order that Euler's method is stable.

(b) Apply the backward Euler method. Is it stable for all positive values of $\Delta t$?

5. Let
$$A = \begin{bmatrix} -50.5 & 49.5 \\ 49.5 & -50.5 \end{bmatrix}.$$

(a) Find the exact solution to

$$\frac{dx}{dt} = Ax,$$
x(O)
= [ ~ ].
(b) Determine experimentally (that is, by trial and error) how small $\Delta t$ must be in order that Euler's method behaves in a stable manner on this IVP.

(c) Euler's method takes the form

$$x_{i+1} = x_i + \Delta t A x_i, \quad i = 0, 1, 2, \ldots.$$

Let the eigenpairs of $A$ be $\lambda_1, u_1$ and $\lambda_2, u_2$. Show that Euler's method is equivalent to

$$y_1^{(i+1)} = (1 + \Delta t \lambda_1)\, y_1^{(i)},$$
$$y_2^{(i+1)} = (1 + \Delta t \lambda_2)\, y_2^{(i)},$$

where

$$y_1^{(i)} = x_i \cdot u_1, \quad y_2^{(i)} = x_i \cdot u_2.$$

Find explicit formulas for $y_1^{(i)}$, $y_2^{(i)}$, and derive an upper bound on $\Delta t$ that guarantees stability of Euler's method. Compare with the experimental result.
(d) Repeat with the backward Euler method in place of Euler's method.

6. Let $A \in \mathbf{R}^{n \times n}$ be symmetric with negative eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$.

(a) Consider Euler's method and the backward Euler method applied to the homogeneous linear system

$$\frac{dx}{dt} = Ax.$$

Show that Euler's method produces a sequence $x_0, x_1, x_2, \ldots$ where

$$x_i = (I + \Delta t A)^i x_0,$$

while the backward Euler method produces a sequence $x_0, x_1, x_2, \ldots$ where

$$x_i = (I - \Delta t A)^{-i} x_0.$$

(b) What are the eigenvalues of $I + \Delta t A$?

(c) What condition must $\Delta t$ satisfy in order that Euler's method be stable? (Hint: See the previous exercise.)

(d) What are the eigenvalues of $(I - \Delta t A)^{-1}$?

(e) Show that the backward Euler method is stable for all positive values of $\Delta t$.
4.6 Green's functions
The Green's function for an IVP, BVP, or IBVP has a fairly simple meaning, although the details can get complicated, particularly for PDEs. A homogeneous linear ODE always has the zero solution; nonzero solutions arise only when the initial conditions are not zero, or when the differential equation has a nonzero forcing function (that is, is inhomogeneous). The initial values and the forcing function can be called the data of the problem. The Green's function represents the contribution to the solution of the datum at each point in time.

In the remainder of this section, we explain this statement for two IVPs that we first considered in Section 4.2. We also explain how the Green's function can be interpreted as a special solution to the differential equation, and introduce the concept of the Dirac delta function. Finally, we comment on how the form of the Green's function will change when we consider PDEs.
4.6.1 The Green's function for a first-order linear ODE
The solution to

$$\frac{du}{dt} - au = f(t), \quad t > t_0,$$
$$u(t_0) = u_0, \tag{4.41}$$

as derived in Section 4.2, is

$$u(t) = e^{a(t - t_0)} u_0 + \int_{t_0}^{t} e^{a(t - s)} f(s)\,ds.$$

We define

$$G(t;s) = \begin{cases} e^{a(t - s)}, & t > s, \\ 0, & t < s. \end{cases} \tag{4.42}$$

Then we can write the formula for the solution $u$ as

$$u(t) = G(t;t_0)\,u_0 + \int_{t_0}^{\infty} G(t;s) f(s)\,ds. \tag{4.43}$$
du - - au dt u(to)
= 0' = 1.
t > to,
(4.44)
That is, $u(t) = G(t;t_0)$ is the population function resulting from an initial population of 1 (million) and no immigration (here consider only times $t$ after $t_0$). We can also recognize $u(t) = G(t;s)$ as the population function corresponding to a certain pattern of immigration, namely, when 1 million people immigrate at the instant $t = s$! Of course, this is an idealization, but there are many situations when we wish to model a phenomenon that takes place at an instant in time or at a single point in space (such as a point force in mechanics). We now explore this idea further.

We consider a country whose population is modeled by the differential equation

$$\frac{du}{dt} - au = f(t).$$

We assume that initially there are no people in the country ($u(t_0) = 0$), and 1 million people immigrate over the time interval $[s, s + \Delta t]$ (at a constant rate, during that interval, of $1/\Delta t$ people per year). We assume that $s > t_0$. The resulting population satisfies the IVP

$$\frac{du}{dt} - au = d_{\Delta t}(t - s), \quad t > t_0,$$
$$u(t_0) = 0,$$

where

$$d_{\Delta t}(t) = \begin{cases} \dfrac{1}{\Delta t}, & 0 < t < \Delta t, \\ 0, & t < 0 \text{ or } t > \Delta t. \end{cases}$$

The solution is

$$u(t) = \int_{t_0}^{\infty} G(t;\tau)\, d_{\Delta t}(\tau - s)\,d\tau = \frac{1}{\Delta t}\int_{s}^{s + \Delta t} G(t;\tau)\,d\tau.$$

This last expression is the average of $G(t;\tau)$ over the interval $[s, s + \Delta t]$. As we take $\Delta t$ smaller and smaller, we obtain

$$u(t) \to G(t;s) \text{ as } \Delta t \to 0.$$

But what forcing function do we obtain in the limit?

24 The data represented by the function $f$ has different units than that represented by $u_0$. Indeed, an examination of the differential equation shows that the units of $f$ are the units of $u_0$ divided by time. Therefore, $f$ is a rate, and it must be multiplied by the time "interval" $ds$ prior to being multiplied by $G(t;s)$.

25 This model of population growth is not a particularly good one, since it assumes that the growth rate remains constant over time. In reality, the growth rate changes due to changing birth rates, life spans, etc.
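The limit $u(t) \to G(t;s)$ can be observed numerically. In the Python sketch below, the average of $G(t;\tau)$ over $[s, s + \Delta t]$ is evaluated in closed form for a time $t \ge s + \Delta t$; the function names and the values $a = 0.02$, $s = 1$, $t = 5$ are hypothetical choices made only for illustration.

```python
import math

def green(t, s, a):
    """Causal Green's function (4.42)."""
    return math.exp(a * (t - s)) if t > s else 0.0

def pulse_response(t, s, dt, a):
    """Average of G(t; tau) over [s, s + dt], assuming t >= s + dt and a != 0.

    Closed form of (1/dt) * integral from s to s+dt of e^{a(t - tau)} d tau.
    """
    return (math.exp(a * (t - s)) - math.exp(a * (t - s - dt))) / (a * dt)

a, s, t = 0.02, 1.0, 5.0   # hypothetical growth rate and times
for dt in (1.0, 0.1, 0.01, 0.001):
    print(dt, pulse_response(t, s, dt, a), green(t, s, a))
```

As the immigration interval shrinks, the printed population at time $t$ approaches $G(t;s)$, the response to 1 million people arriving at the single instant $s$.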
4.6.2 The Dirac delta function
The function $d_{\Delta t}$ has the following properties, the first two of which are just the definition:

• $d_{\Delta t}(t) = 0$, $t \notin [0, \Delta t]$.

• $d_{\Delta t}(t) = 1/\Delta t$, $t \in [0, \Delta t]$.

• $\int_a^b d_{\Delta t}(t)\,dt = \int_0^{\Delta t} \frac{1}{\Delta t}\,dt = 1$ for all $\Delta t > 0$, provided $[0, \Delta t] \subset [a, b]$, and $\int_a^b d_{\Delta t}(t)\,dt = 0$ if $\Delta t < a$ or $0 > b$.

• If $g$ is a continuous function, and $[0, \Delta t] \subset [a, b]$, then

$$\int_a^b d_{\Delta t}(t)\,g(t)\,dt = \frac{1}{\Delta t}\int_0^{\Delta t} g(t)\,dt,$$

which is the average value of $g$ on the interval $[0, \Delta t]$.

Since the limit, as $\Delta t \to 0$, of the average value of $g$ over $[0, \Delta t]$ is $g(0)$