Linear Algebra
A Modern Introduction
Second Edition

David Poole
Trent University

THOMSON BROOKS/COLE
Australia • Canada • Mexico • Singapore • Spain • United Kingdom • United States
Linear Algebra: A Modern Introduction, Second Edition
David Poole

Executive Publisher: Curt Hinrichs
Executive Editor: Jennifer Laugier
Editor: John-Paul Ramin
Assistant Editor: Stacy Green
Editorial Assistant: Leata Holloway
Technology Project Manager: Earl Perry
Marketing Manager: Tom Ziolkowski
Marketing Assistant: Erin Mitchell
Advertising Project Manager: Bryan Vann
Project Manager, Editorial Production: Kelsey McGee
Art Director: Vernon Boes
Print Buyer: Judy Inouye
Permissions Editor: Joohee Lee
Production Service: Matrix Productions
Text Designer: John Rokusek
Photo Research: Sarah Evertson/Image Quest
Copy Editor: Connie Day
Illustration: Scientific Illustrators
Cover Design: Kim Rokusek
Cover Images: Getty Images
Cover Printing, Printing and Binding: Transcontinental Printing/Louisville
Compositor: Interactive Composition Corporation
© 2006 Thomson Brooks/Cole, a part of The Thomson Corporation. Thomson, the Star logo, and Brooks/Cole are trademarks used herein under license.

ALL RIGHTS RESERVED. No part of this work covered by the copyright hereon may be reproduced or used in any form or by any means without the written permission of the publisher.

Printed in Canada
2 3 4 5 6 7  09 08 07 06 05

For more information about our products, contact us at:
Thomson Learning Academic Resource Center
1-800-423-0563

For permission to use material from this text or product, submit a request online at http://www.thomsonrights.com. Any additional questions about permissions can be submitted by email to [email protected].

Library of Congress Control Number: 2004111976
ISBN 0-534-99845-3
International Student Edition: ISBN 0-534-40596-7 (Not for sale in the United States)

Thomson Higher Education
10 Davis Drive
Belmont, CA 94002-3098
USA

Asia (including India): Thomson Learning, 5 Shenton Way #01-01, UIC Building, Singapore 068808
Australia/New Zealand: Thomson Learning Australia, 102 Dodds Street, Southbank, Victoria 3006, Australia
Canada: Thomson Nelson, 1120 Birchmount Road, Toronto, Ontario M1K 5G4, Canada
UK/Europe/Middle East/Africa: Thomson Learning, High Holborn House, 50/51 Bedford Row, London WC1R 4LR, United Kingdom
Latin America: Thomson Learning, Seneca, 53, Colonia Polanco, 11560 México D.F., Mexico
Spain (including Portugal): Thomson Paraninfo, Calle Magallanes, 25, 28015 Madrid, Spain
For Mary, my standard basis
Contents

Preface  vii
To the Instructor  xvii
To the Student  xxiii

Chapter 1  Vectors  1
1.0  Introduction: The Racetrack Game  1
1.1  The Geometry and Algebra of Vectors  3
1.2  Length and Angle: The Dot Product  15
     Exploration: Vectors and Geometry  29
1.3  Lines and Planes  31
     Exploration: The Cross Product  45
1.4  Code Vectors and Modular Arithmetic  47
     Vignette: The Codabar System  55
     Chapter Review  56

Chapter 2  Systems of Linear Equations  58
2.0  Introduction: Triviality  58
2.1  Introduction to Systems of Linear Equations  59
     Exploration: Lies My Computer Told Me  66
2.2  Direct Methods for Solving Linear Systems  68
     Exploration: Partial Pivoting  86
     Exploration: Counting Operations: An Introduction to the Analysis of Algorithms  87
2.3  Spanning Sets and Linear Independence  90
2.4  Applications  101
     Allocation of Resources  101
     Balancing Chemical Equations  103
     Network Analysis  104
     Electrical Networks  106
     Finite Linear Games  109
     Vignette: The Global Positioning System  119
2.5  Iterative Methods for Solving Linear Systems  122
     Chapter Review  132

Chapter 3  Matrices  134
3.0  Introduction: Matrices in Action  134
3.1  Matrix Operations  136
3.2  Matrix Algebra  152
3.3  The Inverse of a Matrix
3.4  The LU Factorization
3.5  Subspaces, Basis, Dimension, and Rank  189
3.6  Introduction to Linear Transformations  209
     Vignette: Robotics
3.7  Applications
     Error-Correcting Codes  240
     Chapter Review  250

Chapter 4  Eigenvalues and Eigenvectors  252
4.0  Introduction: A Dynamical System on Graphs  252
4.1  Introduction to Eigenvalues and Eigenvectors  253
4.2  Determinants  262
     Exploration: Geometric Applications of Determinants  283
4.3  Eigenvalues and Eigenvectors of n × n Matrices  289
4.4  Similarity and Diagonalization  298
4.5  Iterative Methods for Computing Eigenvalues  308
4.6  Applications and the Perron-Frobenius Theorem  322
     Markov Chains  322
     Population Growth  327
     The Perron-Frobenius Theorem  329
     Linear Recurrence Relations  332
     Systems of Linear Differential Equations  337
     Discrete Linear Dynamical Systems  345
     Vignette: Ranking Sports Teams and Searching the Internet  353
     Chapter Review  361

Chapter 5  Orthogonality  363
5.0  Introduction: Shadows on a Wall  363
5.1  Orthogonality in Rⁿ  365
5.2  Orthogonal Complements and Orthogonal Projections  375
5.3  The Gram-Schmidt Process and the QR Factorization  385
     Exploration: The Modified QR Factorization  393
     Exploration: Approximating Eigenvalues with the QR Algorithm  395
5.4  Orthogonal Diagonalization of Symmetric Matrices  397
5.5  Applications  405
     Dual Codes  405
     Quadratic Forms  411
     Graphing Quadratic Equations  418
     Chapter Review  429

Chapter 6  Vector Spaces  431
6.0  Introduction: Fibonacci in (Vector) Space  431
6.1  Vector Spaces and Subspaces  433
6.2  Linear Independence, Basis, and Dimension  447
     Exploration: Magic Squares  464
6.3  Change of Basis  467
6.4  Linear Transformations  476
6.5  The Kernel and Range of a Linear Transformation  485
6.6  The Matrix of a Linear Transformation  501
     Exploration: Tilings, Lattices, and the Crystallographic Restriction  519
6.7  Applications  522
     Homogeneous Linear Differential Equations  522
     Linear Codes  529
     Chapter Review  536

Chapter 7  Distance and Approximation  538
7.0  Introduction: Taxicab Geometry  538
7.1  Inner Product Spaces  540
     Exploration: Vectors and Matrices with Complex Entries  552
     Exploration: Geometric Inequalities and Optimization Problems  556
7.2  Norms and Distance Functions  561
7.3  Least Squares Approximation  577
7.4  The Singular Value Decomposition  599
     Vignette: Digital Image Compression  616
7.5  Applications  619
     Approximation of Functions  619
     Error-Correcting Codes  626
     Chapter Review  631

Appendix A  Mathematical Notation and Methods of Proof  634
Appendix B  Mathematical Induction  643
Appendix C  Complex Numbers  650
Appendix D  Polynomials  661

Answers to Selected Odd-Numbered Exercises  671
Index  706
Preface

The last thing one knows when writing a book is what to put first.
— Blaise Pascal, Pensées, 1670

See The College Mathematics Journal 24 (1993), 41–46.
I am pleased with the warm response of both teachers and students to the first edition of Linear Algebra: A Modern Introduction. In this second edition, I have tried to preserve the approach and features that users found appealing, while incorporating many of their suggested improvements. I want students to see linear algebra as an exciting subject and to appreciate its tremendous usefulness. At the same time, I want to help them master the basic concepts and techniques of linear algebra that they will need in other courses, both in mathematics and in other disciplines. I also want students to appreciate the interplay of theoretical, applied, and numerical mathematics that pervades linear algebra.

The organization of the book reflects my conviction that linear algebra is primarily about vectors and that students need to see vectors first (in a concrete setting) in order to gain some geometric insight. Moreover, introducing vectors early allows students to see how systems of linear equations arise naturally from geometric problems. Matrices then arise equally naturally as coefficient matrices of linear systems and as agents of change (linear transformations). This sets the stage for eigenvectors and orthogonal projections, both of which are best understood geometrically. The arrows that appear on the cover reflect my conviction that geometric understanding should precede computational techniques.

I have tried to limit the number of theorems in the text. For the most part, results labeled as theorems either will be used later in the text or summarize preceding work. Interesting results that are not central to the book have been included as exercises or explorations. For example, the cross product of vectors is discussed only in explorations (in Chapters 1 and 4). Unlike most linear algebra textbooks, this book has no chapter on determinants. The essential results are all in Section 4.2, with other interesting material contained in an exploration. The book is, however, comprehensive for an introductory text. Wherever possible, I have included elementary and accessible proofs of theorems in order to avoid having to say, "The proof of this result is beyond the scope of this text." The result is, I hope, a work that is self-contained.

I have not been stingy with the applications: There are many more in the book than can be covered in a single course. However, it is important that students see the impressive range of problems to which linear algebra can be applied. I have included some modern material on coding theory that is not normally found in an introductory linear algebra text. There are also several impressive real-world applications of linear algebra, presented as self-contained "vignettes."
I hope that instructors will enjoy teaching from this book. More important, I hope that students using the book will come away with an appreciation of the beauty, power, and tremendous utility of linear algebra and that they will have fun along the way.
New in the Second Edition

The overall structure and style of Linear Algebra: A Modern Introduction remain the same in the second edition. However, there are many places where changes or additions have been made. Some of these changes are the result of my using the book in my own courses and realizing that certain explanations or examples could be made clearer, or that an extra exercise here and there would help students understand the material. I have paid attention to comments made by my own students, to the remarks of reviewers of the first edition, and to suggestions sent to me by many users. As always, my first priority is the students who will use the book. To this end, there is much new material designed to stimulate students' enjoyment of linear algebra and to help them review concepts and techniques after each chapter. The second edition of Linear Algebra: A Modern Introduction also comes with a much wider range of ancillary materials that benefit both students and instructors. Here is a summary of the new material:
• Each chapter now concludes with a review section. Key definitions and concepts are listed here, with page references, and each chapter has a set of review questions, including ten true/false questions designed to test conceptual understanding.
• I have added five real-world applications of linear algebra, four of which are new. These "vignettes" …
• Chapter 6 has a new introduction, "Fibonacci in (Vector) Space," which approaches the chapter's subject from a concrete point of view. I think the new introduction does a much better job of setting up the material on general vector spaces that is the subject of this chapter. I have retained "Magic Squares" as an exploration following Section 6.2.
• A new exploration, "Vectors and Matrices with Complex Entries," has been added after Section 7.1. Here you will find an introduction to Hermitian, unitary, and normal matrices that can be used by instructors whose course includes complex linear algebra, or simply as an enrichment activity in any course.
• In an effort to make the biographical and historical snapshots more diverse, I have added biographies of Olga Taussky-Todd, for her fundamental contributions to matrix theory, and Jessie MacWilliams, a pioneer in coding theory.
• There are over 300 new or revised exercises.
• I have made numerous small changes in wording to improve clarity.
• The numbering scheme for theorems, examples, figures, and tables has been improved to make them easier to find when they are cited elsewhere in the book.
• The supplementary material on technology that appeared in the first edition as Appendix E has been removed, updated, and placed on the CD that now accompanies the book.
• A full range of ancillary materials accompanies the second edition of Linear Algebra: A Modern Introduction. These supplements are described later in this preface.
Features

Clear Writing Style
The text is written in a simple, direct, conversational style. As much as possible, I have used "mathematical English" rather than relying excessively on mathematical notation. However, all proofs that are given are fully rigorous, and Appendix A contains an introduction to mathematical notation for those who wish to streamline their own writing. Concrete examples almost always precede theorems, which are then followed by further examples and applications. This flow from specific to general and back again is consistent throughout the book.
Key Concepts Introduced Early
Many students encounter difficulty in linear algebra when the course moves from the computational (solving systems of linear equations, manipulating vectors and matrices) to the theoretical (spanning sets, linear independence, subspaces, basis, and dimension). This book introduces all of the key concepts of linear algebra early, in a concrete setting, before revisiting them in full generality. Vector concepts such as dot product, length, orthogonality, and projection are first discussed in Chapter 1 in the concrete setting of R² and R³ before the more general notions of inner product, norm, and orthogonal projection appear in Chapters 5 and 7. Similarly, spanning sets and linear independence are given a concrete treatment in Chapter 2 prior to their generalization to vector spaces in Chapter 6. The fundamental concepts of subspace, basis, and dimension appear first in Chapter 3 when the row, column, and null spaces of a matrix are introduced; it is not until Chapter 6 that these ideas are given a general treatment.
Emphasis on Vectors and Geometry
In keeping with the philosophy that linear algebra is primarily about vectors, this book stresses geometric intuition. Accordingly, the first chapter is about vectors, and it develops many concepts that will appear repeatedly throughout the text. Concepts such as orthogonality, projection, and linear combination are all found in Chapter 1, as is a comprehensive treatment of lines and planes in R³ that provides essential insight into the solution of systems of linear equations. This emphasis on vectors, geometry, and visualization is found throughout the text. Linear transformations are introduced as matrix transformations in Chapter 3, with many geometric examples, before general linear transformations are covered in Chapter 6. In Chapter 4, eigenvalues are introduced with "eigenpictures" as a visual aid. The proof of Perron's Theorem is given first heuristically and then formally, in both cases using a geometric argument. The geometry of linear dynamical systems reinforces and summarizes the material on eigenvalues and eigenvectors. In Chapter 5, orthogonal projections, orthogonal complements of subspaces, and the Gram-Schmidt Process are all presented in the concrete setting of R³ before being generalized to Rⁿ and, in Chapter 7,
to inner product spaces. The nature of the singular value decomposition is also explained informally in Chapter 7 via a geometric argument. Of the more than 300 figures in the text, over 200 are devoted to fostering a geometric understanding of linear algebra.
Explorations
See pages 1, 134, 431, 538
See pages 29, 283, 464, 519, 552, 556
See pages 66, 86, 87, 393, 395
The introduction to each chapter is a guided exploration (Section 0) in which students are invited to discover, individually or in groups, some aspect of the upcoming chapter. For example, "The Racetrack Game" introduces vectors, "Matrices in Action" introduces matrix multiplication and linear transformations, "Fibonacci in (Vector) Space" touches on vector space concepts, and "Taxicab Geometry" sets up generalized norms and distance functions. Additional explorations found throughout the book include applications of vectors and determinants to geometry, an investigation of 3 × 3 magic squares, a study of symmetry via the tilings of M. C. Escher, an introduction to complex linear algebra, and optimization problems using geometric inequalities. There are also explorations that introduce important numerical considerations and the analysis of algorithms. Having students do some of these explorations is one way of encouraging them to become active learners and to give them "ownership" over a small part of the course.
Applications
See pages 55, 119, 214, 353, 616
The book contains an abundant selection of applications chosen from a broad range of disciplines, including mathematics, computer science, physics, chemistry, engineering, biology, psychology, geography, and sociology. Noteworthy among these is a strong treatment of coding theory, from error-detecting codes (such as International Standard Book Numbers) to sophisticated error-correcting codes (such as the Reed-Muller code that was used to transmit satellite photos from space). Additionally, there are five "vignettes" that briefly showcase some very modern applications of linear algebra: the Codabar system, the Global Positioning System (GPS), robotics, Internet search engines, and digital image compression.
Examples and Exercises
There are over 400 examples in this book, most worked in greater detail than is customary in an introductory linear algebra textbook. This level of detail is in keeping with the philosophy that students should want (and be able) to read a textbook. Accordingly, it is not intended that all of these examples be covered in class; many can be assigned for individual or group study, possibly as part of a project. Most examples have at least one counterpart exercise so that students can try out the skills covered in the example before exploring generalizations.

There are over 2000 exercises, more than in most textbooks at a similar level. Answers to most of the computational odd-numbered exercises can be found in the back of the book. Instructors will find an abundance of exercises from which to select homework assignments. (Suggestions are given in the Instructor's Guide.) The exercises in each section are graduated, progressing from the routine to the challenging. Exercises range from those intended for hand computation to those requiring the use of a calculator or computer algebra system, and from theoretical and numerical exercises to conceptual exercises. Many of the examples and exercises use actual data compiled from real-world situations. For example, there are problems on modeling the growth of caribou and seal populations, radiocarbon dating of the Stonehenge monument, and predicting major league baseball players' salaries. Working such problems reinforces the fact that linear algebra is a valuable tool for modeling real-life problems.

Additional exercises appear in the form of a review after each chapter. In each set, there are ten true/false questions designed to test conceptual understanding, followed by 19 computational and theoretical exercises that summarize the main concepts and techniques of that chapter.
Biographical Sketches and Etymological Notes
It is important that students learn something about the history of mathematics and the people who contributed to it. …
Margin Icons
The margins of the book contain several icons whose purpose is to alert the reader in various ways. Calculus is not a prerequisite for this book, but linear algebra has many interesting and important applications to calculus. The calculus icon denotes an example or exercise that requires calculus. (This material can be omitted if not everyone in the class has had at least one semester of calculus. Alternatively, these items can be assigned as projects.) The complex-number icon denotes an example or exercise involving complex numbers. (For students unfamiliar with complex numbers, Appendix C contains all the background material that is needed.) The CAS icon indicates that a computer algebra system …
Technology
This book can be used successfully whether or not students have access to technology. However, calculators with matrix capabilities and computer algebra systems are now
widely available and, properly used, can enrich the learning experience as well as help with tedious calculations. In this text, I take the point of view that students need to master all of the basic techniques of linear algebra by solving by hand examples that are not too computationally difficult. Technology may then be used (in whole or in part) to solve subsequent examples and applications and to apply techniques that rely on earlier ones. For example, when systems of linear equations are first introduced, detailed solutions are provided; later, solutions are simply given, and the reader is expected to verify them. This is a good place to use some form of technology. Likewise, when applications use data that make hand calculation impractical …
Finite and Numerical Linear Algebra
The text covers two aspects of linear algebra that are scarcely ever mentioned together: finite linear algebra and numerical linear algebra. By introducing modular arithmetic early, I have been able to make finite linear algebra (more properly, "linear algebra over finite fields," although I do not use that phrase) a recurring theme throughout the book. This approach provides access to the material on coding theory in Sections 1.4, 3.7, 5.5, 6.7, and 7.5. There is also an application to finite linear games in Section 2.4 that students really enjoy. In addition to being exposed to the applications of finite linear algebra, mathematics majors will benefit from seeing the material on finite fields, because they are likely to encounter it in such courses as discrete mathematics, abstract algebra, and number theory.

All students should be aware that in practice, it is impossible to arrive at exact solutions of large-scale problems in linear algebra. Exposure to some of the techniques of numerical linear algebra will provide an indication of how to obtain highly accurate approximate solutions. Some of the numerical topics included in the book are roundoff error and partial pivoting, iterative methods for solving linear systems and computing eigenvalues, the LU and QR factorizations, matrix norms and condition numbers, least squares approximation, and the singular value decomposition. The inclusion of numerical linear algebra also brings up some interesting and important issues that are completely absent from the theory of linear algebra, such as pivoting strategies, the condition of a linear system, and the convergence of iterative methods. This book not only raises these questions but also shows how one might approach them. Gerschgorin disks, matrix norms, and the singular values of a matrix, discussed in Chapters 4 and 7, are useful in this regard.
Appendices
Appendix A contains an overview of mathematical notation and methods of proof, and Appendix B discusses mathematical induction. All students will benefit from these sections, but those with a mathematically oriented major may wish to pay particular attention to them. Some of the examples in these appendices are uncommon (for instance, Example B.6 in Appendix B) and underscore the power of the methods. Appendix C is an introduction to complex numbers. For students familiar with these results, this appendix can serve as a useful reference; for others, this section contains everything they need to know for those parts of the text that use complex numbers. Appendix D is about polynomials. I have found that many students require a refresher about these facts. Most students will be unfamiliar with Descartes' Rule of Signs; it is used in Chapter 4 to explain the behavior of the eigenvalues of Leslie matrices. Exercises to accompany the four appendices can be found on the book's website.
Answers to Selected OddNumbered ExercIses Short answers to most of the odd num bered com putatIOnal exerCises are given at the end of the book. The Siru/eni Solution s Man ual and Study GUIde con tains detailed solutions to all of the oddnu m bered exercises, including the theoretical ones. The Complctc Solutions MmJlm/ contains detailed solu tions to all of the exercises.
Ancillaries
The following supplements are all available free of charge to instructors who adopt Linear Algebra: A Modern Introduction (Second Edition). The Student Solutions Manual and Study Guide can be purchased by students, either separately or shrink-wrapped with the textbook. The CD-ROM is packaged with the textbook, and the website has password-protected sections for students and instructors.
Student Solutions Manual and Study Guide by Robert Rogers (Bay State College), ISBN: 0-534-99858-5
Includes detailed solutions to all odd-numbered exercises and selected even-numbered exercises; section and chapter summaries of symbols, definitions, and theorems; and study tips and hints. Complex exercises are explored through a question-and-answer format designed to deepen understanding. Challenging and entertaining problems that further explore selected exercises are also included.
Complete Solutions Manual by Robert Rogers (Bay State College), ISBN: 0-534-99859-3
Full solutions to all exercises, including those in the explorations and chapter reviews.
Instructor's Guide by Douglas Shaw and Michael Prophet (University of Northern Iowa), ISBN: 0-534-99861-5
Includes camera-ready group work, teaching tips, interesting exam questions, examples and extra material for lectures, and other items designed to reduce the instructor's preparation time and make linear algebra class an exciting and interactive experience. For each section of the text, the Instructor's Guide includes suggested time and emphasis, points to stress, questions for discussion, lecture materials and examples, technology tips, student projects, group work with solutions, sample assignments, and suggested test questions.
Test Bank by Richard C. Pappas (Widener University), ISBN: 0-534-99860-7
Contains four tests for each chapter (one each in free-form, multiple-choice, true/false, and "mixed" formats) and four final exams, each in a different format. Answer keys are provided.
iLrn Testing, ISBN: 0-534-99862-3
Efficient and versatile, iLrn Testing is an Internet-ready, text-specific testing suite that enables instructors to customize exams and track student progress in an accessible, browser-based format. iLrn offers problems keyed to Linear Algebra: A Modern Introduction, allows free-response mathematics, and lets students work with real math notation in real time. The complete integration of the testing and course management components simplifies routine tasks. Results flow automatically to the instructor's grade book, and the instructor can easily communicate with individuals, sections, or an entire course.
CD-ROM, ISBN: 0-534-42289-6
Contains data sets for more than 800 problems in Maple, MATLAB, and Mathematica, as well as data sets for selected examples. Also contains CAS enhancements to the vignettes and explorations that appear in the text and includes manuals for using Maple, MATLAB, and Mathematica.

Website for Linear Algebra: A Modern Introduction, math.brookscole.com/poolelinearalgebra
Contains additional online support materials for students and instructors. Online teaching tips, practice exams, selected homework solutions, transparency masters, and exercises to accompany the book's appendices are available. Online versions of some of the material on the book's CD-ROM will also be found here.
Acknowledgments
The reviewers of the first edition contributed valuable and often insightful comments about the book. I am grateful for the time each of them took to do this. Their judgment and suggestions have contributed greatly to the second edition, whose reviewers contributed additional helpful suggestions.

Reviewers for the First Edition
Israel Koltracht, University of Connecticut
Arthur Robinson, George Washington University
Mo Tavakoli, Chaffey College

Reviewers for the Second Edition
Justin Travis Gray, Simon Fraser University
J. Douglas Faires, Youngstown State University
Yuval Flicker, Ohio State University
William Hager, University of Florida
Sasho Kalajdzievski, University of Manitoba
Dr. En-Bing Lin, University of Toledo
Dr. Asamoah Nkwanta, Morgan State University
Gleb Novitchkov, Penn State University
Ron Solomon, Ohio State University
Bruno Welfert, Arizona State University
I am indebted to a great many people who have, over the years, influenced my views about linear algebra and the teaching of mathematics in general. First, I would like to thank collectively the participants in the education and special linear algebra sessions at meetings of the Mathematical Association of America and the Canadian Mathematical Society. I have also learned much from participation in the Canadian Mathematics Education Study Group. I especially want to thank Ed Barbeau, Bill Higginson, Richard Hoshino, John Grant McLoughlin, Eric Muller, Morris Orzech, Bill Ralph, Pat Rogers, Peter Taylor, and Walter Whiteley, whose advice and inspiration contributed greatly to the philosophy and style of this book. Special thanks go to Jim Stewart for his ongoing support and advice. Joe Rotman and his lovely book A First Course in Abstract Algebra inspired the etymological notes in this book, and I relied heavily on Steven Schwartzman's The Words of Mathematics when compiling these notes. I thank Art Benjamin for introducing me to the Codabar system. My colleagues Marcus Pivato and Reem Yassawi provided useful information about dynamical systems. As always, I am grateful to my students for asking good questions and providing me with the feedback necessary to becoming a better teacher. Special thanks go to my teaching assistants Alex Chute, Nick F…
To the Instructor
"Would you tell me, please, which way I ought to go from here?"
"That depends a good deal on where you want to get to," said the Cat.
— Lewis Carroll, Alice's Adventures in Wonderland, 1865
This text was written with flexibility in mind. It is intended for use in a one- or two-semester course with 36 lectures per semester. The range of topics and applications makes it suitable for a variety of audiences and types of courses. However, there is more material in the book than can be covered in class, even in a two-semester course. After the following overview of the text are some brief suggestions for ways to use the book. The Instructor's Guide has more detailed suggestions, including teaching notes, recommended exercises, classroom activities and projects, and additional topics.
An Overview of the Text

Chapter 1: Vectors
See page 29
See page 45
See pages 52, 53, 55
The racetrack game in Section 1.0 serves to introduce vectors in an informal way. (It's also quite a lot of fun to play!) Vectors are then formally introduced from both an algebraic and a geometric point of view. The operations of addition and scalar multiplication and their properties are first developed in the concrete settings of R² and R³ before being generalized to Rⁿ. Section 1.2 defines the dot product of vectors and the related notions of length, angle, and orthogonality. The very important concept of (orthogonal) projection is developed here; it will reappear in Chapters 5 and 7. The exploration "Vectors and Geometry" shows how vector methods can be used to prove certain results in Euclidean geometry. Section 1.3 is a basic but thorough introduction to lines and planes in R² and R³. This section is crucial for understanding the geometric significance of the solution of linear systems in Chapter 2. Note that the cross product of vectors in R³ is left as an exploration. The chapter concludes with an introduction to codes, leading to a discussion of modular arithmetic and finite linear algebra. Most students will enjoy the application to the Universal Product Code (UPC) and International Standard Book Number (ISBN). The vignette on the Codabar system used in credit and bank cards is an excellent classroom presentation that can even be used to introduce Section 1.4.
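The check-digit schemes mentioned above come down to a single dot product in modular arithmetic. As an illustrative sketch (not taken from the text; the function name is my own), here is the standard ISBN-10 test: the vector of digits dotted with the weight vector (10, 9, …, 1) must be divisible by 11.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Validate an ISBN-10 via the weighted sum 10*d1 + 9*d2 + ... + 1*d10 (mod 11).

    A final 'X' stands for the digit 10; hyphens and spaces are ignored.
    """
    digits = [10 if ch.upper() == "X" else int(ch)
              for ch in isbn if ch.isdigit() or ch.upper() == "X"]
    if len(digits) != 10:
        return False
    # Dot product of the digit vector with the weight vector (10, 9, ..., 1)
    total = sum(w * d for w, d in zip(range(10, 0, -1), digits))
    return total % 11 == 0

# The ISBN of this edition, 0-534-99845-3, passes the check;
# changing any single digit breaks it.
print(isbn10_is_valid("0-534-99845-3"))   # True
print(isbn10_is_valid("0-534-99845-4"))   # False
```

Because 11 is prime, altering any single digit changes the sum by a nonzero amount mod 11, so every single-digit error is detected — the kind of property Section 1.4 develops.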
To the Instructor
Chapter 2, Systems of Linear Equations

See page 58
See pages 75, 203, 383, 490
See pages 66, 86, 87
The introduction to this chapter serves to illustrate that there is more than one way to think of the solution to a system of linear equations. Sections 2.1 and 2.2 develop the main computational tool for solving linear systems: row reduction of matrices (Gaussian and Gauss-Jordan elimination). Nearly all subsequent computational methods in the book depend on this. The Rank Theorem appears here for the first time; it shows up again, in more generality, in Chapters 3, 5, and 6. Section 2.3 is very important; it introduces the fundamental notions of spanning sets and linear independence of vectors. Do not rush through this material. Section 2.4 contains five applications from which instructors can choose depending on the time available and the interests of the class. The vignette on the Global Positioning System provides another application that students will enjoy. The iterative methods in Section 2.5 will be optional for many courses but are essential for a course with an applied/numerical focus. The three explorations in this chapter are related in that they all deal with aspects of the use of computers to solve linear systems. All students should at least be made aware of these issues.
Chapter 3, Matrices
See page 134
See pages 170, 204, 29
See page 214
See pages 228, 233
This chapter contains some of the most important ideas in the book. It is a long chapter, but the early material can be covered fairly quickly, with extra time allowed for the crucial material in Section 3.5. Section 3.0 is an exploration that introduces the notion of a linear transformation: the idea that matrices are not just static objects but rather a type of function, transforming vectors into other vectors. All of the basic facts about matrices, matrix operations, and their properties are found in the first two sections. The material on partitioned matrices and the multiple representations of the matrix product is worth stressing, because it is used repeatedly in subsequent sections. The Fundamental Theorem of Invertible Matrices in Section 3.3 is very important and will appear several more times as new characterizations of invertibility are presented. Section 3.4 discusses the very important LU factorization of a matrix. If this topic is not covered in class, it is worth assigning as a project or discussing in a workshop. The point of Section 3.5 is to present many of the key concepts of linear algebra (subspace, basis, dimension, and rank) in the concrete setting of matrices before students see them in full generality. Although the examples in this section are all familiar, it is important that students get used to the new terminology and, in particular, understand what the notion of a basis means. The geometric treatment of linear transformations in Section 3.6 is intended to smooth the transition to general linear transformations in Chapter 6. The example of a projection is particularly important because it will reappear in Chapter 5. The vignette on robotic arms is a concrete demonstration of composition of linear (and affine) transformations. There are four applications from which to choose in Section 3.7.
At least one of Markov chains or the Leslie model of population growth should be covered so that it can be used again in Chapter 4, where their behavior will be explained.
Chapter 4, Eigenvalues and Eigenvectors

See page 252
The introductory Section 4.0 presents an interesting dynamical system involving graphs. This exploration introduces the notion of an eigenvector and foreshadows the power method in Section 4.5. In keeping with the geometric emphasis of the book, Section 4.1 contains the novel feature of "eigenpictures" as a way of visualizing
See page 283

See page 353

See pages 394, 395

See pages 404, 411
the eigenvectors of 2 × 2 matrices. Determinants appear in Section 4.2, motivated by their use in finding the characteristic polynomials of small matrices. This "crash course" in determinants contains all the essential material students need, including an optional but elementary proof of the Laplace Expansion Theorem. The exploration "Geometric Applications of Determinants" makes a nice project that contains several interesting and useful results. (Alternatively, instructors who wish to give more detailed coverage to determinants may choose to cover some of this exploration in class.) The basic theory of eigenvalues and eigenvectors is found in Section 4.3, and Section 4.4 deals with the important topic of diagonalization. Example 4.29 on powers of matrices is worth covering in class. The power method and its variants, discussed in Section 4.5, are optional, but all students should be aware of the method, and an applied course should cover it in detail. Gerschgorin's Disk Theorem can be covered independently of the rest of Section 4.5. Markov chains and the Leslie model of population growth reappear in Section 4.6. Although the proof of Perron's Theorem is optional, the theorem itself (like the stronger Perron-Frobenius Theorem) should at least be mentioned because it explains why we should expect a unique positive eigenvalue with a corresponding positive eigenvector in these applications. The applications on recurrence relations and differential equations connect linear algebra to discrete mathematics and calculus, respectively. The matrix exponential can be covered if your class has a good calculus background. The final topic of discrete linear dynamical systems revisits and summarizes many of the ideas in Chapter 4, looking at them in a new, geometric light. Students will enjoy reading how eigenvectors can be used to help rank sports teams and websites.
This vignette can easily be extended to a project or enrichment activity.
Chapter 5, Orthogonality

The introductory exploration, "Shadows on a Wall," is mathematics at its best: it takes a known concept (projection of a vector onto another vector) and generalizes it in a useful way (projection of a vector onto a subspace, a plane), while uncovering some previously unobserved properties. Section 5.1 contains the basic results about orthogonal and orthonormal sets of vectors that will be used repeatedly from here on. In particular, orthogonal matrices should be stressed. In Section 5.2, two concepts from Chapter 1 are generalized: the orthogonal complement of a subspace and the orthogonal projection of a vector onto a subspace. The Orthogonal Decomposition Theorem is important here and helps to set up the Gram-Schmidt Process. Also note the quick proof of the Rank Theorem. The Gram-Schmidt Process is detailed in Section 5.3, along with the extremely important QR factorization. The two explorations that follow outline how the QR factorization is computed in practice and how it can be used to approximate eigenvalues. Section 5.4 on orthogonal diagonalization of (real) symmetric matrices is needed for the applications that follow. It also contains the Spectral Theorem, one of the highlights of the theory of linear algebra. The applications in Section 5.5 include dual codes, quadratic forms, and graphing quadratic equations. I always include at least the last of these in my course because it extends what students already know about conic sections.
Chapter 6, Vector Spaces

The Fibonacci sequence reappears in Section 6.0, although it is not important that students have seen it before (Section 4.6). The purpose of this exploration is to show that familiar vector space concepts (Section 3.5) can be used fruitfully in a new
See page 439
setting. Because all of the main ideas of vector spaces have already been introduced in Chapters 1-3, students should find Sections 6.1 and 6.2 fairly familiar. The emphasis here should be on using the vector space axioms to prove properties rather than relying on computational techniques. When discussing change of basis in Section 6.3, it is helpful to show students how to use the notation to remember how the construction works. Ultimately, the Gauss-Jordan method is the most efficient here. Sections 6.4 and 6.5 on linear transformations are important. The examples are related to previous results on matrices (and matrix transformations). In particular, it is important to stress that the kernel and range of a linear transformation generalize the null space and column space of a matrix. Section 6.6 puts forth the notion that (almost) all linear transformations are essentially matrix transformations. This builds on the information in Section 3.6, so students should not find it terribly surprising. However, the examples should be worked carefully. The connection between change of basis and similarity of matrices is noteworthy. The exploration "Tilings, Lattices, and the Crystallographic Restriction" is an impressive application of change of basis. The connection with the artwork of M. C. Escher makes it all the more interesting. The applications in Section 6.7 build on previous ones and can be included as time and interest permit.
Chapter 7, Distance and Approximation

See page 538
See page 552
See page 556
See page 616
Section 7.0 opens with the entertaining "Taxicab Geometry" exploration. Its purpose is to set up the material on generalized norms and distance functions (metrics) that follows. Inner product spaces are discussed in Section 7.1; the emphasis here should be on the examples and using the axioms. The exploration "Vectors and Matrices with Complex Entries" shows how the concepts of dot product, symmetric matrix, orthogonal matrix, and orthogonal diagonalization can be extended from real to complex vector spaces. The following exploration, "Geometric Inequalities and Optimization Problems," is one that students typically enjoy. (They will have fun seeing how many "calculus" problems can be solved without using calculus at all!) Section 7.2 covers generalized vector and matrix norms and shows how the condition number of a matrix is related to the notion of ill-conditioned linear systems explored in Chapter 2. Least squares approximation (Section 7.3) is an important application of linear algebra in many other disciplines. The Best Approximation Theorem and the Least Squares Theorem are important, but their proofs are intuitively clear. Spend time here on the examples; a few should suffice. Section 7.4 presents the singular value decomposition, one of the most impressive applications of linear algebra. If your course gets this far, you will be amply rewarded. Not only does the SVD tie together many notions discussed previously; it also affords some new (and quite powerful) applications. If MATLAB is available, the vignette on digital image compression is worth presenting; it is a visually impressive display of the power of linear algebra and a fitting culmination to the course. The further applications in Section 7.5 can be chosen according to the time available and the interests of the class.
How to Use the Book

Students find the book easy to read, so I usually have them read a section before I cover the material in class. That way, I can spend class time highlighting the most important concepts, dealing with topics students find difficult, working examples, and discussing
applications. I do not attempt to cover all of the material from the assigned reading in class. This approach enables me to keep the pace of the course fairly brisk, slowing down for those sections that students typically find challenging. In a two-semester course, it is possible to cover the entire book, including a reasonable selection of the applications.
A Basic Course

A course designed for mathematics majors and students from other disciplines is outlined below. This course does not mention general vector spaces at all (all concepts are treated in a concrete setting) and is very light on proofs. Still, it is a thorough introduction to linear algebra.

Section   Number of Lectures      Section   Number of Lectures
1.1       1                       3.6       1-2
1.2       1-1.5                   4.1       1
1.3       1-1.5                   4.2       2
2.1       0.5-1                   4.3       1
2.2       1-2                     4.4       1
2.3       1-2                     5.1       2
3.1       1-2                     5.2       1-1.5
3.2       1                       5.3       1-1.5
3.3       2                       5.4       0.5
3.5       2                       7.3       1-2

Total: 23-30 lectures
Because the students in a course such as this one represent a wide variety of disciplines, I would suggest using much of the remaining lecture time for applications. In my course, I do Section 1.4, which students really seem to like, and at least one application from each of Chapters 2-5. Other applications can be assigned as projects, along with as many of the explorations as desired. There is also sufficient lecture time available to cover some of the theory in detail.
A Course with a Computational Emphasis

For a course with a computational emphasis, the basic course outlined above can be supplemented with the sections of the text dealing with numerical linear algebra. In such a course, I would cover part or all of Sections 2.5, 3.4, 4.5, 5.3, 7.2, and 7.4, ending with the singular value decomposition. The explorations in Chapters 2 and 5 are particularly well suited to such a course, as are almost any of the applications.
A Course for Students Who Have Already Studied Some Linear Algebra

Some courses will be aimed at students who have already encountered the basic principles of linear algebra in other courses. For example, a college algebra course will often include an introduction to systems of linear equations, matrices, and determinants; a multivariable calculus course will almost certainly contain material on vectors, lines, and planes. For students who have seen such topics already, much early material can be omitted and replaced with a quick review. Depending on the background of the class, it may be possible to skim over the material in the basic course up to Section 3.3 in about six lectures. If the class has a significant number of mathematics majors (and especially if this is the only linear algebra course they will take), I would be sure to cover Sections 1.4, 6.1-6.5, 7.1, and 7.4 and as many applications as time permits. If the course has science majors (but not mathematics majors), I would cover Sections 1.4, 6.1, and 7.1 and a broader selection of applications, being sure to include the material on differential equations and approximation of functions. If computer science students or engineers are prominently represented, I would try to do as much of the material on codes and numerical linear algebra as I could.

There are many other types of courses that can successfully use this text. I hope that you find it useful for your course and that you enjoy using it.
To the Student

"Where shall I begin, please your Majesty?" he asked. "Begin at the beginning," the King said, gravely, "and go on till you come to the end: then stop."
- Lewis Carroll, Alice's Adventures in Wonderland, 1865
Linear algebra is an exciting subject. It is full of interesting results, applications to other disciplines, and connections to other areas of mathematics. The Student Solutions Manual and Study Guide contains detailed advice on how best to use this book; following are some general suggestions.

Linear algebra h

[1, 2] is the same as the set of all scalar multiples of [1, 2]
As you encounter new concepts, try to relate them to examples that you know. Write out proofs and solutions to exercises in a logical, connected way, using complete sentences. Read back what you have written to see whether it makes sense. Better yet, if you can, have a friend in the class read what you have written. If it doesn't make sense to another person, chances are that it doesn't make sense, period.

You will find that a calculator with matrix capabilities or a computer algebra system is useful. These tools can help you to check your own hand calculations and are indispensable for some problems involving tedious computations. Technology also enables you to explore aspects of linear algebra on your own. You can play "what if?"
games: What if I change one of the entries in this vector? What if this matrix is of a different size? Can I force the solution to be what I would like it to be by changing something? To signal places in the text or exercises where the use of technology is recommended, I have placed an icon in the margin. The CD that accompanies this book contains selected exercises from the book worked out using Maple, Mathematica, and MATLAB, as well as much additional advice about the use of technology in linear algebra.

You are about to embark on a journey through linear algebra. Think of this book as your travel guide. Are you ready? Let's go!
Vectors
Here they come pouring out of the blue,
Little arrows for me and for you.
- Albert Hammond and Mike Hazelwood, Little Arrows, Dutchess Music/BMI, 1968
1.0 Introduction: The Racetrack Game

Many measurable quantities, such as length, area, volume, mass, and temperature, can be completely described by specifying their magnitude. Other quantities, such as velocity, force, and acceleration, require both a magnitude and a direction for their description. These quantities are vectors. For example, wind velocity is a vector consisting of wind speed and direction, such as 10 kph southwest. Geometrically, vectors are often represented as arrows or directed line segments.

Although the idea of a vector was introduced in the 19th century, its usefulness in applications, particularly those in the physical sciences, was not realized until the 20th century. More recently, vectors have found applications in computer science, statistics, economics, and the life and social sciences. We will consider some of these many applications throughout this book.

This chapter introduces vectors and begins to consider some of their geometric and algebraic properties. We will also consider one nongeometric application where vectors are useful. We begin, though, with a simple game that introduces some of the key ideas. [You may even wish to play it with a friend during those (very rare) dull moments in linear algebra class.]

The game is played on graph paper. A track, with a starting line and a finish line, is drawn on the paper. The track can be of any length and shape, so long as it is wide enough to accommodate all of the players. For this example, we will have two players (let's call them Ann and Bert) who use different colored pens to represent their cars or bicycles or whatever they are going to race around the track. (Let's think of Ann and Bert as cyclists.)

Ann and Bert each begin by drawing a dot on the starting line at a grid point on the graph paper. They take turns moving to a new grid point, subject to the following rules:
1. Each new grid point and the line segment connecting it to the previous grid point must lie entirely within the track.
2. No two players may occupy the same grid point on the same turn. (This is the "no collisions" rule.)
3. Each new move is related to the previous move as follows: If a player moves a units horizontally and b units vertically on one move, then on the next move he or she must move between a - 1 and a + 1 units horizontally and between b - 1 and b + 1 units vertically. In other words, if the second move is c units horizontally and d units vertically, then |a - c| ≤ 1 and |b - d| ≤ 1. (This is the "acceleration/deceleration" rule.) Note that this rule forces the first move to be 1 unit vertically and/or 1 unit horizontally.
The Irish mathematician William Rowan Hamilton (1805-1865) used vector concepts in his study of complex numbers and their generalization, the quaternions.

A player who collides with another player or leaves the track is eliminated. The winner is the first player to cross the finish line. If more than one player crosses the finish line on the same turn, the one who goes farthest past the finish line is the winner. In the sample game shown in Figure 1.1, Ann was the winner. Bert accelerated too quickly and had difficulty negotiating the turn at the top of the track.

To understand rule 3, consider Ann's third and fourth moves. On her third move, she went 1 unit horizontally and 3 units vertically. On her fourth move, her options were to move 0 to 2 units horizontally and 2 to 4 units vertically. (Notice that some of these combinations would have placed her outside the track.) She chose to move 2 units in each direction.
Figure 1.1 A sample game of racetrack
Problem 1 Play a few games of racetrack.

Problem 2 Is it possible for Bert to win this race by choosing a different sequence of moves?

Problem 3 Use the notation [a, b] to denote a move that is a units horizontally and b units vertically. (Either a or b or both may be negative.) If move [3, 4] has just been made, draw on graph paper all the grid points that could possibly be reached on the next move.

Problem 4 What is the net effect of two successive moves? In other words, if you move [a, b] and then [c, d], how far horizontally and vertically will you have moved altogether?
Problem 5 Write out Ann's sequence of moves using the [a, b] notation. Suppose she begins at the origin (0, 0) on the coordinate axes. Explain how you can find the coordinates of the grid point corresponding to each of her moves without looking at the graph paper. If the axes were drawn differently, so that Ann's starting point was not the origin but the point (2, 3), what would the coordinates of her final point be?

Although simple, this game introduces several ideas that will be useful in our study of vectors. The next three sections consider vectors from geometric and algebraic viewpoints, beginning, as in the racetrack game, in the plane.
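The move rules also lend themselves to a quick computational check. The following is a minimal sketch in plain Python (the helper names `next_moves` and `net_effect` are our own, not the text's) of the acceleration/deceleration rule in rule 3 and the "net effect" question of Problem 4; it deliberately ignores the track boundary and the no-collisions rule.

```python
# A move (a, b) is a units horizontally and b units vertically.

def next_moves(a, b):
    """All moves (c, d) with |a - c| <= 1 and |b - d| <= 1
    (rule 3, ignoring the track and collision rules)."""
    return [(c, d) for c in (a - 1, a, a + 1) for d in (b - 1, b, b + 1)]

def net_effect(moves):
    """Total horizontal and vertical displacement of a sequence of moves."""
    return (sum(a for a, _ in moves), sum(b for _, b in moves))

# Problem 3: the nine moves that may follow the move [3, 4].
print(next_moves(3, 4))

# The first move follows an implicit "previous move" of [0, 0],
# so it is at most 1 unit horizontally and/or 1 unit vertically.
print(next_moves(0, 0))

# Problem 4: following [3, 4] by [2, 5] moves you [5, 9] altogether.
print(net_effect([(3, 4), (2, 5)]))
```

Accumulating `net_effect` from the starting dot is exactly the bookkeeping that Problem 5 asks for.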
1.1 The Geometry and Algebra of Vectors

Vectors in the Plane
The Cartesian plane is named after the French philosopher and mathematician René Descartes (1596-1650), whose introduction of coordinates allowed geometric problems to be handled using algebraic techniques.

The word vector comes from the Latin root meaning "to carry." A vector is formed when a point is displaced, or "carried off," a given distance in a given direction. Viewed another way, vectors "carry" two pieces of information: their length and their direction.

When writing vectors by hand, it is difficult to indicate boldface. Some people prefer to write v with an arrow over it for the vector denoted in print by v, but in most cases it is fine to use an ordinary lowercase v. It will usually be clear from the context when the letter denotes a vector.

The word component is derived from the Latin words co, meaning "together with," and ponere, meaning "to put." Thus, a vector is "put together" out of its components.
We begin by considering the Cartesian plane with the familiar x- and y-axes. A vector is a directed line segment that corresponds to a displacement from one point A to another point B; see Figure 1.2. The vector from A to B is denoted by AB; the point A is called its initial point, or tail, and the point B is called its terminal point, or head. Often, a vector is simply denoted by a single boldface, lowercase letter such as v.

The set of all points in the plane corresponds to the set of all vectors whose tails are at the origin O. To each point A, there corresponds the vector a = OA; to each vector a with tail at O, there corresponds its head A. (Vectors of this form are sometimes called position vectors.)

It is natural to represent such vectors using coordinates. For example, in Figure 1.3, A = (3, 2) and we write the vector a = OA = [3, 2] using square brackets. Similarly, the other vectors in Figure 1.3 are
b = [-1, 3] and c = [2, -1]

The individual coordinates (3 and 2 in the case of a) are called the components of the vector. A vector is sometimes said to be an ordered pair of real numbers. The order is important since, for example, [3, 2] ≠ [2, 3]. In general, two vectors are equal if and only if their corresponding components are equal. Thus, [x, y] = [1, 5] implies that x = 1 and y = 5.

Figure 1.3

R2 is pronounced "r two."

It is frequently convenient to use column vectors instead of (or in addition to) row vectors. Another representation of [3, 2] is the column vector with components 3 and 2. (The important point is that the components are ordered.) In later chapters, you will see that column vectors are somewhat better from a computational point of view; for now, try to get used to both representations.

It may occur to you that we cannot really draw the vector [0, 0] = OO from the origin to itself. Nevertheless, it is a perfectly good vector and has a special name: the zero vector. The zero vector is denoted by 0.

The set of all vectors with two components is denoted by R2 (where R denotes the set of real numbers from which the components of vectors in R2 are chosen). Thus, [1, 3.5], [√2, π], and [5/3, 4] are all in R2.

Thinking back to the racetrack game, let's try to connect all of these ideas to vectors whose tails are not at the origin. The etymological origin of the word vector in the verb "to carry" provides a clue. The vector [3, 2] may be interpreted as follows: Starting at the origin O, travel 3 units to the right, then 2 units up, finishing at P. The same displacement may be applied with other initial points. Figure 1.4 shows two equivalent displacements, represented by the vectors AB and CD.
Figure 1.4
When vectors are referred to by their coordinates, they are being considered analytically.
We define two vectors as equal if they have the same length and the same direction. Thus, AB = CD in Figure 1.4. (Even though they have different initial and terminal points, they represent the same displacement.) Geometrically, two vectors are equal if one can be obtained by sliding (or translating) the other parallel to itself until the two vectors coincide. In terms of components, in Figure 1.4 we have A = (3, 1) and B = (6, 3). Notice that the vector [3, 2] that records the displacement is just the difference of the respective components:
AB = [3, 2] = [6 - 3, 3 - 1]

Similarly,

CD = [-1 - (-4), 1 - (-1)] = [3, 2]

and thus AB = CD, as expected.

A vector such as OP with its tail at the origin is said to be in standard position. The foregoing discussion shows that every vector can be drawn as a vector in standard position. Conversely, a vector in standard position can be redrawn (by translation) so that its tail is at any point in the plane.
Example 1.1

If A = (-1, 2) and B = (3, 4), find AB and redraw it (a) in standard position and (b) with its tail at the point C = (2, -1).

Solution We compute AB = [3 - (-1), 4 - 2] = [4, 2]. If AB is then translated to CD, where C = (2, -1), then we must have D = (2 + 4, -1 + 2) = (6, 1). (See Figure 1.5.)
Figure 1.5
New Vectors from Old

As in the racetrack game, we often want to "follow" one vector by another. This leads to the notion of vector addition, the first basic vector operation.
If we follow u by v, we can visualize the total displacement as a third vector, denoted by u + v. In Figure 1.6, u = [1, 2] and v = [2, 2], so the net effect of following u by v is

[1 + 2, 2 + 2] = [3, 4]

which gives the vector u + v. In general, if u = [u1, u2] and v = [v1, v2], then their sum u + v is

u + v = [u1 + v1, u2 + v2]

It is helpful to visualize u + v geometrically. The following rule is the geometric version of the foregoing discussion.
Figure 1.6 Vector addition
The Head-to-Tail Rule

Given vectors u and v in R2, translate v so that its tail coincides with the head of u. The sum u + v of u and v is the vector from the tail of u to the head of v. (See Figure 1.7.)
Figure 1.7 The head-to-tail rule

By translating u and v parallel to themselves, we obtain a parallelogram, as shown in Figure 1.8. This parallelogram is called the parallelogram determined by u and v. It leads to an equivalent version of the head-to-tail rule for vectors in standard position.

Figure 1.8 The parallelogram determined by u and v
The Parallelogram Rule

Given vectors u and v in R2 (in standard position), their sum u + v is the vector in standard position along the diagonal of the parallelogram determined by u and v. (See Figure 1.9.)
Figure 1.9 The parallelogram rule
Example 1.2
If u = [3, -1] and v = [1, 4], compute and draw u + v.

Solution We compute u + v = [3 + 1, -1 + 4] = [4, 3]. This vector is drawn using the head-to-tail rule in Figure 1.10(a) and using the parallelogram rule in Figure 1.10(b).
Figure 1.10
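Component-wise addition is easy to express in code. The following sketch (plain Python; the function name `add` is our own choice, not the text's) reproduces the computation of Example 1.2 and the sum pictured in Figure 1.6.

```python
def add(u, v):
    """Component-wise sum of two vectors given as lists of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [ui + vi for ui, vi in zip(u, v)]

# Example 1.2: u = [3, -1], v = [1, 4]
print(add([3, -1], [1, 4]))  # [4, 3]

# The vectors of Figure 1.6: [1, 2] followed by [2, 2]
print(add([1, 2], [2, 2]))  # [3, 4]
```

The length check mirrors the definition: vector addition is defined only for vectors with the same number of components.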
The second basic vector operation is scalar multiplication. Given a vector v and a real number c, the scalar multiple cv is the vector obtained by multiplying each component of v by c. For example, 3[-2, 4] = [-6, 12]. In general,

cv = c[v1, v2] = [cv1, cv2]

Geometrically, cv is a "scaled" version of v.
Example 1.3

If v = [-2, 4], compute and draw 2v, (1/2)v, and -2v.

Solution We calculate as follows:

2v = [2(-2), 2(4)] = [-4, 8]
(1/2)v = [(1/2)(-2), (1/2)(4)] = [-1, 2]
-2v = [-2(-2), -2(4)] = [4, -8]

These vectors are shown in Figure 1.11.
Figure 1.11

Figure 1.12

Figure 1.13 Vector subtraction
The term scalar comes from the Latin word scala, meaning "ladder." The equally spaced rungs on a ladder suggest a scale, and in vector arithmetic, multiplication by a constant changes only the scale (or length) of a vector. Thus, constants became known as scalars.
Observe that cv has the same direction as v if c > 0 and the opposite direction if c < 0. We also see that cv is |c| times as long as v. For this reason, in the context of vectors, constants (that is, real numbers) are referred to as scalars. As Figure 1.12 shows, when translation of vectors is taken into account, two vectors are scalar multiples of each other if and only if they are parallel.

A special case of a scalar multiple is (-1)v, which is written as -v and is called the negative of v. We can use it to define vector subtraction: The difference of u and v is the vector u - v defined by

u - v = u + (-v)
Figure 1.13 shows that u - v corresponds to the "other" diagonal of the parallelogram determined by u and v.
Example 1.4

If u = [1, 2] and v = [-3, 1], then u - v = [1 - (-3), 2 - 1] = [4, 1].

Figure 1.14
The definition of subtraction in Example 1.4 also agrees with the way we calculate a vector such as AB. If the points A and B correspond to the vectors a and b in standard position, then AB = b - a, as shown in Figure 1.14. [Observe that the head-to-tail rule applied to this diagram gives the equation a + (b - a) = b. If we had accidentally drawn b - a with its head at A instead of at B, the diagram would have read b + (b - a) = a, which is clearly wrong!] More will be said about algebraic expressions involving vectors later in this section.
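Scalar multiplication and subtraction are just as mechanical as addition. This sketch (plain Python; the helper names `scale` and `subtract` are our own) reproduces Examples 1.3 and 1.4 and the rule AB = b - a applied to the points of Example 1.1.

```python
def scale(c, v):
    """Scalar multiple cv: multiply each component of v by c."""
    return [c * vi for vi in v]

def subtract(u, v):
    """u - v, computed as u + (-1)v, component by component."""
    return [ui - vi for ui, vi in zip(u, v)]

# Example 1.3: scalar multiples of v = [-2, 4]
v = [-2, 4]
print(scale(2, v))                # [-4, 8]
print(scale(0.5, v))              # [-1.0, 2.0]
print(scale(-2, v))               # [4, -8]

# Example 1.4: u - v for u = [1, 2], v = [-3, 1]
print(subtract([1, 2], [-3, 1]))  # [4, 1]

# Example 1.1: the displacement AB is b - a, for A = (-1, 2), B = (3, 4)
print(subtract([3, 4], [-1, 2]))  # [4, 2]
```

Note that (1/2)v is computed here with the float 0.5, so its components print as -1.0 and 2.0 rather than the exact integers of Example 1.3.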
Vectors in R3
Everything we have just done extends easily to three dimensions. The set of all ordered triples of real numbers is denoted by R3. Points and vectors are located using three mutually perpendicular coordinate axes that meet at the origin O. A point such as A = (1, 2, 3) can be located as follows: First travel 1 unit along the x-axis, then move 2 units parallel to the y-axis, and finally move 3 units parallel to the z-axis. The corresponding vector a = [1, 2, 3] is then OA, as shown in Figure 1.15.

Another way to visualize vector a in R3 is to construct a box whose six sides are determined by the three coordinate planes (the xy-, xz-, and yz-planes) and by three planes through the point (1, 2, 3) parallel to the coordinate planes. The vector [1, 2, 3] then corresponds to the diagonal from the origin to the opposite corner of the box (see Figure 1.16).
Section 1.1  The Geometry and Algebra of Vectors

Figure 1.15
Figure 1.16
The "componentwise" definitions of vector addition and scalar multiplication are extended to R³ in an obvious way.
Vectors in Rⁿ

In general, we define Rⁿ as the set of all ordered n-tuples of real numbers written as row or column vectors. Thus, a vector v in Rⁿ is of the form

v = [v₁, v₂, …, vₙ]
The individual entries of v are its components; vᵢ is called the ith component.
We extend the definitions of vector addition and scalar multiplication to Rⁿ in the obvious way: If u = [u₁, u₂, …, uₙ] and v = [v₁, v₂, …, vₙ], the ith component of u + v is uᵢ + vᵢ, and the ith component of cv is just cvᵢ.
Since in Rⁿ we can no longer draw pictures of vectors, it is important to be able to calculate with vectors. We must be careful not to assume that vector arithmetic will be similar to the arithmetic of real numbers. Often it is, and the algebraic calculations we do with vectors are similar to those we would do with scalars. But, in later sections, we will encounter situations where vector algebra is quite unlike our previous experience with real numbers. So it is important to verify any algebraic properties before attempting to use them.
One such property is commutativity of addition: u + v = v + u for vectors u and v. This is certainly true in R². Geometrically, the head-to-tail rule shows that both u + v and v + u are the main diagonals of the parallelogram determined by u and v. (The parallelogram rule also reflects this symmetry; see Figure 1.17.)
Note that Figure 1.17 is simply an illustration of the property u + v = v + u. It is not a proof, since it does not cover every possible case. For example, we must also include the cases where u = v, u = -v, and u = 0. (What would diagrams for these cases look like?) For this reason, an algebraic proof is needed. However, it is just as easy to give a proof that is valid in Rⁿ as to give one that is valid in R².
The following theorem summarizes the algebraic properties of vector addition and scalar multiplication in Rⁿ. The proofs follow from the corresponding properties of real numbers.
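The componentwise definitions above can be written directly in code. Here is a minimal sketch in Python (the helper names `add` and `scale` are ours, not the text's):

```python
# Componentwise vector addition and scalar multiplication in R^n.

def add(u, v):
    """Return u + v componentwise; u and v must have the same length."""
    assert len(u) == len(v)
    return [ui + vi for ui, vi in zip(u, v)]

def scale(c, v):
    """Return the scalar multiple cv."""
    return [c * vi for vi in v]

u = [1, 2, 3]
v = [4, 5, 6]

# Commutativity u + v = v + u, the property discussed above
print(add(u, v) == add(v, u))  # True
```

A check like this illustrates a property on sample vectors; it is the algebraic proof in Theorem 1.1 that establishes it for all vectors.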
Chapter 1  Vectors
Theorem 1.1  Algebraic Properties of Vectors in Rⁿ

Let u, v, and w be vectors in Rⁿ and let c and d be scalars. Then
a. u + v = v + u                  (Commutativity)
b. (u + v) + w = u + (v + w)      (Associativity)
c. u + 0 = u
d. u + (-u) = 0
e. c(u + v) = cu + cv             (Distributivity)
f. (c + d)u = cu + du             (Distributivity)
g. c(du) = (cd)u
h. 1u = u
Remark  The word theorem is derived from the Greek word theorema, which in turn comes from a word meaning "to look at." Thus, a theorem is based on the insights we have when we look at examples and extract from them properties that we try to prove hold in general. Similarly, when we understand something in mathematics, the proof of a theorem, for example, we often say "I see."
• Properties (c) and (d) together with the commutativity property (a) imply that 0 + u = u and -u + u = 0 as well.
• If we read the distributivity properties (e) and (f) from right to left, they say that we can factor a common scalar or a common vector from a sum.
Proof  We prove properties (a) and (b) and leave the proofs of the remaining properties as exercises. Let u = [u₁, u₂, …, uₙ], v = [v₁, v₂, …, vₙ], and w = [w₁, w₂, …, wₙ].

(a)
u + v = [u₁, u₂, …, uₙ] + [v₁, v₂, …, vₙ]
      = [u₁ + v₁, u₂ + v₂, …, uₙ + vₙ]
      = [v₁ + u₁, v₂ + u₂, …, vₙ + uₙ]
      = [v₁, v₂, …, vₙ] + [u₁, u₂, …, uₙ]
      = v + u

The second and fourth equalities are by the definition of vector addition, and the third equality is by the commutativity of addition of real numbers.
(b) Figure 1.18 illustrates associativity in R². Algebraically, we have
(u + v) + w = ([u₁, u₂, …, uₙ] + [v₁, v₂, …, vₙ]) + [w₁, w₂, …, wₙ]
            = [u₁ + v₁, u₂ + v₂, …, uₙ + vₙ] + [w₁, w₂, …, wₙ]
            = [(u₁ + v₁) + w₁, (u₂ + v₂) + w₂, …, (uₙ + vₙ) + wₙ]
            = [u₁ + (v₁ + w₁), u₂ + (v₂ + w₂), …, uₙ + (vₙ + wₙ)]
            = [u₁, u₂, …, uₙ] + [v₁ + w₁, v₂ + w₂, …, vₙ + wₙ]
            = u + (v + w)

Figure 1.18

The fourth equality is by the associativity of addition of real numbers. Note the careful use of parentheses.
By property (b) of Theorem 1.1, we may unambiguously write u + v + w without parentheses, since we may group the summands in whichever way we please. By (a), we may also rearrange the summands, for example, as w + u + v, if we choose. Likewise, sums of four or more vectors can be calculated without regard to order or grouping. In general, if v₁, v₂, …, vₖ are vectors in Rⁿ, we will write such sums without parentheses:

v₁ + v₂ + … + vₖ
The next example illustrates the use of Theorem 1.1 in performing algebraic calculations with vectors.
Example 1.5

Let a, b, and x denote vectors in Rⁿ.
(a) Simplify 3a + (5b - 2a) + 2(b - a).
(b) If 5x - a = 2(a + 2x), solve for x in terms of a.
Solution  We will give both solutions in detail, with reference to all of the properties in Theorem 1.1 that we use. It is good practice to justify all steps the first few times you do this type of calculation. Once you are comfortable with the vector properties, though, it is acceptable to leave out some of the intermediate steps to save time and space.
(a) We begin by inserting parentheses.
3a + (5b - 2a) + 2(b - a)
  = (3a + (5b - 2a)) + 2(b - a)
  = (3a + (-2a + 5b)) + (2b - 2a)     (a), (e)
  = ((3a + (-2a)) + 5b) + (2b - 2a)   (b)
  = ((3 + (-2))a + 5b) + (2b - 2a)    (f)
  = (1a + 5b) + (2b - 2a)
  = ((a + 5b) + 2b) - 2a              (h), (b)
  = (a + (5b + 2b)) - 2a              (b)
  = (a + (5 + 2)b) - 2a               (f)
  = (7b + a) - 2a                     (a)
  = 7b + (a - 2a)                     (b)
  = 7b + (1 - 2)a                     (f), (h)
  = 7b + (-1)a
  = 7b - a
You can see why we will agree to omit some of these steps! In practice, it is acceptable to simplify this sequence of steps as

3a + (5b - 2a) + 2(b - a) = 3a + 5b - 2a + 2b - 2a
                          = (3a - 2a - 2a) + (5b + 2b)
                          = -a + 7b

or even to do most of the calculation mentally.
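The simplification above can also be spot-checked numerically on sample vectors. A minimal sketch (the helper names `add`, `sub`, and `scale` are ours):

```python
def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(c, v):
    return [c * x for x in v]

a = [1.0, -2.0, 4.0]   # any sample vectors will do
b = [3.0, 0.5, -1.0]

# Left side: 3a + (5b - 2a) + 2(b - a); right side: the simplified 7b - a
lhs = add(add(scale(3, a), sub(scale(5, b), scale(2, a))), scale(2, sub(b, a)))
rhs = sub(scale(7, b), a)
print(lhs == rhs)  # True
```

Agreement on one pair of vectors is not a proof, of course; the step-by-step derivation is what shows the identity holds for all a and b.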
(b) In detail, we have

5x - a = 2(a + 2x)
5x - a = 2a + 2(2x)                        (e)
5x - a = 2a + (2·2)x                       (g)
5x - a = 2a + 4x
(5x - a) - 4x = (2a + 4x) - 4x
(-a + 5x) - 4x = 2a + (4x - 4x)            (a), (b)
-a + (5x - 4x) = 2a + 0                    (b), (d)
-a + (5 - 4)x = 2a                         (f), (c)
-a + (1)x = 2a
-a + x = 2a                                (h)
a + (-a + x) = a + 2a
(a + (-a)) + x = (1 + 2)a                  (b), (f)
0 + x = 3a                                 (d)
x = 3a                                     (c)

Again, in most cases we will omit most of these steps.
Linear Combinations and Coordinates

A vector that is a sum of scalar multiples of other vectors is said to be a linear combination of those vectors. The formal definition follows.
Definition  A vector v is a linear combination of vectors v₁, v₂, …, vₖ if there are scalars c₁, c₂, …, cₖ such that v = c₁v₁ + c₂v₂ + … + cₖvₖ. The scalars c₁, c₂, …, cₖ are called the coefficients of the linear combination.
Example 1.6

The vector [2, -2, -1] is a linear combination of the vectors [1, 0, -1], [2, -3, 1], and [5, -4, 0], since

3[1, 0, -1] + 2[2, -3, 1] - [5, -4, 0] = [2, -2, -1]
Remark  Determining whether a given vector is a linear combination of other vectors is a problem we will address in Chapter 2.
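A linear combination is easy to evaluate mechanically: multiply each vector by its coefficient and add componentwise. A sketch, assuming the vectors of Example 1.6 read as reconstructed here:

```python
# Check that [2, -2, -1] = 3*[1, 0, -1] + 2*[2, -3, 1] - [5, -4, 0]
v1 = [1, 0, -1]
v2 = [2, -3, 1]
v3 = [5, -4, 0]
coeffs = [3, 2, -1]

# ith component of the combination: sum of c * (ith component of each vector)
combo = [sum(c * vec[i] for c, vec in zip(coeffs, (v1, v2, v3))) for i in range(3)]
print(combo)  # [2, -2, -1]
```

Deciding whether such coefficients *exist* for a given target vector is the harder problem deferred to Chapter 2.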
In R², it is possible to depict linear combinations of two (nonparallel) vectors quite conveniently.
Example 1.7

Let u = [3, 1] and v = [1, 2]. We can use u and v to locate a new set of axes (in the same way that e₁ = [1, 0] and e₂ = [0, 1] locate the standard coordinate axes). We can use
" Flg.r. 1.19 these new axes to determine a coordinate grid that wtlllet us easIly locate lInear combinations of u and v. As Figure 1.19 shows, w can be located by starting at the origin and traveling  u follow'ed by 2v. That is, w ==  u
+
2v
We say that the coordinates of w with respect to u and v are -1 and 2. (Note that this is just another way of thinking of the coefficients of the linear combination.) It follows that

w = -u + 2v = -[3, 1] + 2[1, 2] = [-1, 3]

(Observe that -1 and 3 are the coordinates of w with respect to e₁ and e₂.)
Switching from the standard coordinate axes to alternative ones is a useful idea. It has applications in chemistry and geology, since molecular and crystalline structures often do not fall onto a rectangular grid. It is an idea that we will encounter repeatedly in this book.
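Finding the coordinates of a vector with respect to u and v amounts to solving a small linear system c₁u + c₂v = w. A sketch using Cramer's rule, assuming the example's vectors read u = [3, 1], v = [1, 2], w = [-1, 3]:

```python
u = (3, 1)
v = (1, 2)
w = (-1, 3)

# Solve c1*u + c2*v = w; det is nonzero because u and v are not parallel.
det = u[0] * v[1] - v[0] * u[1]
c1 = (w[0] * v[1] - v[0] * w[1]) / det
c2 = (u[0] * w[1] - w[0] * u[1]) / det
print(c1, c2)  # -1.0 2.0
```

The two numbers produced are exactly the coordinates of w with respect to u and v described above.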
Exercises 1.1

1. Draw the following vectors in standard position in R²:
(a) a = […]   (b) b = […]   (c) c = […]   (d) d = […]
2. Draw the vectors in Exercise 1 with their tails at the point (2, -3).
3. Draw the following vectors in standard position in R³:
(a) a = [0, 2, 0]   (b) b = [3, 2, 1]   (c) c = [1, -2, 1]   (d) d = [-1, -1, -2]
4. If the vectors in Exercise 3 are translated so that their heads are at the point (4, 5, 6), find the points that correspond to their tails.
5. For each of the following pairs of points, draw the vector AB. Then compute and redraw AB as a vector in standard position.
(a) A = (1, -1), B = (4, 2)   (b) A = (0, -2), B = (2, -1)
(c) A = (2, 3/2), B = (1/2, 3)   (d) A = (1/3, 1/3), B = (1/6, 1/2)
6. A hiker walks 4 km north and then 5 km northeast. Draw displacement vectors representing the hiker's trip and draw a vector that represents the hiker's net displacement from the starting point.
Exercises 7-10 refer to the vectors in Exercise 1. Compute the indicated vectors and also show how the results can be obtained geometrically.
7. a + b   8. b + c   9. d - c   10. a - d

Exercises 11 and 12 refer to the vectors in Exercise 3. Compute the indicated vectors.
11. 2a + 3c   12. 2c - 3b - d

13. Find the components of the vectors u, v, u + v, and u - v, where u and v are as shown in Figure 1.20.

Figure 1.20

14. In Figure 1.21, A, B, C, D, E, and F are the vertices of a regular hexagon centered at the origin. Express each of the following vectors in terms of a = OA and b = OB:
(a) AB   (b) BC   (c) AD
(d) CF   (e) AC   (f) BC + DE + FA

Figure 1.21

In Exercises 15 and 16, simplify the given vector expression. Indicate which properties in Theorem 1.1 you use.
15. 2(a - 3b) + 3(2b + a)
16. -3(a - c) + 2(a + 2b) + 3(c - b)

In Exercises 17 and 18, solve for the vector x in terms of the vectors a and b.
17. x - a = 2(x - 2a)
18. x + 2a - b = 3(x + a) - 2(2a - b)

In Exercises 19 and 20, draw the coordinate axes relative to u and v and locate w.
19. u = […], v = […], w = 2u + 3v
20. u = […], v = […], w = -u - 2v

In Exercises 21 and 22, draw the standard coordinate axes on the same diagram as the axes relative to u and v. Use these to find w as a linear combination of u and v.
21. u = […], v = […], w = […]
22. u = […], v = […], w = […]

23. Draw diagrams to illustrate properties (d) and (e) of Theorem 1.1.
24. Give algebraic proofs of properties (d) through (g) of Theorem 1.1.
Length and Angle: The Dot Product

It is quite easy to reformulate the familiar geometric concepts of length, distance, and angle in terms of vectors. Doing so will allow us to use these important and powerful ideas in settings more general than R² and R³. In subsequent chapters, these simple geometric tools will be used to solve a wide variety of problems arising in applications, even when there is no geometry apparent at all!
The Dot Product

The vector versions of length, distance, and angle can all be described using the notion of the dot product of two vectors.
Definition  If u = [u₁, u₂, …, uₙ] and v = [v₁, v₂, …, vₙ], then the dot product u · v of u and v is defined by

u · v = u₁v₁ + u₂v₂ + … + uₙvₙ

In words, u · v is the sum of the products of the corresponding components of u and v. It is important to note a couple of things about this "product" that we have just defined: First, u and v must have the same number of components. Second, the dot product u · v is a number, not another vector. (This is why u · v is sometimes called the scalar product of u and v.) The dot product of vectors in Rⁿ is a special and important case of the more general notion of inner product, which we will explore in Chapter 7.
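The definition translates directly into code; a minimal sketch (`dot` is our name for the function):

```python
def dot(u, v):
    """Sum of products of corresponding components; lengths must match."""
    assert len(u) == len(v)
    return sum(ui * vi for ui, vi in zip(u, v))

# The vectors of Example 1.8 below
print(dot([1, 2, -3], [-3, 5, 2]))  # 1
```

Note that the result is a single number, not a vector, exactly as the definition requires.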
Example 1.8

Compute u · v when u = [1, 2, -3] and v = [-3, 5, 2].

Solution  u · v = 1·(-3) + 2·5 + (-3)·2 = 1
Notice that if we had calculated v · u in Example 1.8, we would have computed

v · u = (-3)·1 + 5·2 + 2·(-3) = 1

That u · v = v · u in general is clear, since the individual products of the components commute. This commutativity property is one of the properties of the dot product that we will use repeatedly. The main properties of the dot product are summarized in Theorem 1.2.
Theorem 1.2  Let u, v, and w be vectors in Rⁿ and let c be a scalar. Then
a. u · v = v · u                    (Commutativity)
b. u · (v + w) = u · v + u · w      (Distributivity)
c. (cu) · v = c(u · v)
d. u · u ≥ 0, and u · u = 0 if and only if u = 0
Proof  We prove (a) and (c) and leave proof of the remaining properties for the exercises.
(a) Applying the definition of the dot product to u · v and v · u, we obtain

u · v = u₁v₁ + u₂v₂ + … + uₙvₙ
      = v₁u₁ + v₂u₂ + … + vₙuₙ
      = v · u

where the middle equality follows from the fact that multiplication of real numbers is commutative.
(c) Using the definitions of scalar multiplication and dot product, we have

(cu) · v = [cu₁, cu₂, …, cuₙ] · [v₁, v₂, …, vₙ]
         = cu₁v₁ + cu₂v₂ + … + cuₙvₙ
         = c(u₁v₁ + u₂v₂ + … + uₙvₙ)
         = c(u · v)
Remarks
• Property (b) can be read from right to left, in which case it says that we can factor out a common vector u from a sum of dot products. This property also has a "right-handed" analogue that follows from properties (b) and (a) together:

(v + w) · u = v · u + w · u

• Property (c) can be extended to give u · (cv) = c(u · v) (Exercise 52). This extended version of (c) essentially says that in taking a scalar multiple of a dot product of vectors, the scalar can first be combined with whichever vector is more convenient. For example,

(½[1, -3, 2]) · [6, -4, 0] = [1, -3, 2] · (½[6, -4, 0]) = [1, -3, 2] · [3, -2, 0] = 9

With this approach we avoid introducing fractions into the vectors, as the original grouping would have.
• The second part of (d) uses the logical connective if and only if. Appendix A discusses this phrase in more detail, but for the moment let us just note that the wording signals a double implication, namely,

if u = 0, then u · u = 0   and   if u · u = 0, then u = 0
Theorem 1.2 shows that aspects of the algebra of vectors resemble the algebra of numbers. The next example shows that we can sometimes find vector analogues of familiar identities.
Example 1.9

Prove that (u + v) · (u + v) = u · u + 2(u · v) + v · v for all vectors u and v in Rⁿ.

Solution
(u + v) · (u + v) = (u + v) · u + (u + v) · v
                  = u · u + v · u + u · v + v · v
                  = u · u + u · v + u · v + v · v
                  = u · u + 2(u · v) + v · v

(Identify the parts of Theorem 1.2 that were used at each step.)
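The identity of Example 1.9 can be spot-checked numerically on sample vectors; a minimal sketch (`dot` and `add` are our helper names):

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]

u = [1, -2, 3]
v = [4, 0, -1]

lhs = dot(add(u, v), add(u, v))                 # (u + v) . (u + v)
rhs = dot(u, u) + 2 * dot(u, v) + dot(v, v)     # u.u + 2(u.v) + v.v
print(lhs == rhs)  # True
```

The algebraic proof above is what guarantees agreement for every choice of u and v.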
To see how the dot product plays a role in the calculation of lengths, recall how lengths are computed in the plane. The Theorem of Pythagoras is all we need.
In R², the length of the vector v = [a, b] is the distance from the origin to the point (a, b), which, by Pythagoras' Theorem, is given by

‖v‖ = √(a² + b²)

as in Figure 1.22.

Figure 1.22

Observe that a² + b² = v · v. This leads to the following definition.
Definition  The length (or norm) of a vector v = [v₁, v₂, …, vₙ] in Rⁿ is the nonnegative scalar ‖v‖ defined by

‖v‖ = √(v · v) = √(v₁² + v₂² + … + vₙ²)

In words, the length of a vector is the square root of the sum of the squares of its components. Note that the square root of v · v is always defined, since v · v ≥ 0 by Theorem 1.2(d). Note also that the definition can be rewritten to give ‖v‖² = v · v, which will be useful in proving further properties of the dot product and lengths of vectors.
Theorem 1.3 lists some of the main properties of vector length.
Theorem 1.3  Let v be a vector in Rⁿ and let c be a scalar. Then
a. ‖v‖ = 0 if and only if v = 0
b. ‖cv‖ = |c| ‖v‖

Proof  Property (a) follows immediately from Theorem 1.2(d). To show (b), we have

‖cv‖² = (cv) · (cv) = c²(v · v) = c²‖v‖²

using Theorem 1.2(c). Taking square roots of both sides, using the fact that √(c²) = |c| for any real number c, gives the result.
A vector of length 1 is called a unit vector. In R², the set of all unit vectors can be identified with the unit circle, the circle of radius 1 centered at the origin (see Figure 1.23). Given any nonzero vector v, we can always find a unit vector in the same direction as v by dividing v by its own length (or, equivalently, multiplying by 1/‖v‖). We can show this algebraically by using property (b) of Theorem 1.3 above: If u = (1/‖v‖)v, then

‖u‖ = ‖(1/‖v‖)v‖ = |1/‖v‖| ‖v‖ = (1/‖v‖)‖v‖ = 1

and u is in the same direction as v, since 1/‖v‖ is a positive scalar. Finding a unit vector in the same direction is often referred to as normalizing a vector (see Figure 1.24).

Figure 1.23  Unit vectors in R²
Figure 1.24  Normalizing a vector

Example 1.11
In R², let e₁ = [1, 0] and e₂ = [0, 1]. Then e₁ and e₂ are unit vectors, since the sum of the squares of their components is 1 in each case. Similarly, in R³, we can construct unit vectors

e₁ = [1, 0, 0],  e₂ = [0, 1, 0],  and  e₃ = [0, 0, 1]
Figure 1.25  Standard unit vectors in R² and R³
Observe in Figure 1.25 that these vectors serve to locate the positive coordinate axes in R² and R³.
In general, in Rⁿ, we define unit vectors e₁, e₂, …, eₙ, where eᵢ has 1 in its ith component and zeros elsewhere. These vectors arise repeatedly in linear algebra and are called the standard unit vectors.
Example 1.12

Normalize the vector v = [2, -1, 3].

Solution  ‖v‖ = √(2² + (-1)² + 3²) = √14, so a unit vector in the same direction as v is given by

u = (1/‖v‖)v = (1/√14)[2, -1, 3] = [2/√14, -1/√14, 3/√14]
Since property (b) of Theorem 1.3 describes how length behaves with respect to scalar multiplication, natural curiosity suggests that we ask whether length and vector addition are compatible. It would be nice if we had an identity such as ‖u + v‖ = ‖u‖ + ‖v‖, but for almost any choice of vectors u and v this turns out to be false. [See Exercise 46(a).] However, all is not lost, for it turns out that if we replace the = sign by ≤, the resulting inequality is true. The proof of this famous and important result, the Triangle Inequality, relies on another important inequality, the Cauchy-Schwarz Inequality, which we will prove and discuss in more detail in Chapter 7.
Theorem 1.4  The Cauchy-Schwarz Inequality

For all vectors u and v in Rⁿ,

|u · v| ≤ ‖u‖ ‖v‖

See Exercises 65 and 66 for algebraic and geometric approaches to the proof of this inequality.
In R² or R³, where we can use geometry, it is clear from a diagram such as Figure 1.26 that ‖u + v‖ ≤ ‖u‖ + ‖v‖ for all vectors u and v. We now show that this is true more generally.

Figure 1.26  The Triangle Inequality
Theorem 1.5  The Triangle Inequality

For all vectors u and v in Rⁿ,

‖u + v‖ ≤ ‖u‖ + ‖v‖
Proof  Since both sides of the inequality are nonnegative, showing that the square of the left-hand side is less than or equal to the square of the right-hand side is equivalent to proving the theorem. (Why?) We compute

‖u + v‖² = (u + v) · (u + v)
         = u · u + 2(u · v) + v · v          [by Example 1.9]
         ≤ ‖u‖² + 2|u · v| + ‖v‖²
         ≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²            [by Cauchy-Schwarz]
         = (‖u‖ + ‖v‖)²

as required.
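Both inequalities are easy to exercise on random data; a minimal sketch (all helper names are ours, and the tolerance guards against floating-point rounding):

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

random.seed(1)
for _ in range(1000):
    u = [random.uniform(-10, 10) for _ in range(5)]
    v = [random.uniform(-10, 10) for _ in range(5)]
    assert abs(dot(u, v)) <= norm(u) * norm(v) + 1e-9        # Cauchy-Schwarz
    w = [a + b for a, b in zip(u, v)]
    assert norm(w) <= norm(u) + norm(v) + 1e-9               # Triangle Inequality
print("both inequalities hold on all samples")
```

Random testing is evidence, not proof; the argument above is what establishes the result in every Rⁿ.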
Distance

The distance between two vectors is the direct analogue of the distance between two points on the real number line or two points in the Cartesian plane. On the number line (Figure 1.27), the distance between the numbers a and b is given by |a - b|. (Taking the absolute value ensures that we do not need to know which of a or b is larger.) This distance is also equal to √((a - b)²), and its two-dimensional generalization is the familiar formula for the distance d between points (a₁, a₂) and (b₁, b₂), namely,

d = √((a₁ - b₁)² + (a₂ - b₂)²)

Figure 1.27  d = |a - b| = |-2 - 3| = 5

In terms of vectors, if a = [a₁, a₂] and b = [b₁, b₂], then d is just the length of a - b, as shown in Figure 1.28. This is the basis for the next definition.

Figure 1.28  d = √((a₁ - b₁)² + (a₂ - b₂)²) = ‖a - b‖
Definition  The distance d(u, v) between vectors u and v in Rⁿ is defined by

d(u, v) = ‖u - v‖
Example 1.13

Find the distance between u = [√2, 1, -1] and v = [0, 2, -2].

Solution  We compute u - v = [√2, -1, 1], so

d(u, v) = ‖u - v‖ = √((√2)² + (-1)² + 1²) = √4 = 2
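The definition d(u, v) = ‖u - v‖ gives a direct implementation; a sketch (`dist` is our name):

```python
import math

def dist(u, v):
    """d(u, v) = ||u - v||, the length of the difference vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# The vectors of Example 1.13
print(dist([math.sqrt(2), 1, -1], [0, 2, -2]))  # 2, up to rounding
```

Because √2 is stored inexactly, the printed value may differ from 2 in the last few digits, which is the usual floating-point caveat rather than an error in the formula.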
The dot product can also be used to calculate the angle between a pair of vectors. In R² or R³, the angle between the nonzero vectors u and v will refer to the angle θ determined by these vectors that satisfies 0° ≤ θ ≤ 180° (see Figure 1.29).

Figure 1.29  The angle between u and v

In Figure 1.30, consider the triangle with sides u, v, and u - v, where θ is the angle between u and v. Applying the law of cosines to this triangle yields

‖u - v‖² = ‖u‖² + ‖v‖² - 2‖u‖ ‖v‖ cos θ

Expanding the left-hand side and using ‖v‖² = v · v several times, we obtain

‖u‖² - 2(u · v) + ‖v‖² = ‖u‖² + ‖v‖² - 2‖u‖ ‖v‖ cos θ

which, after simplification, leaves us with u · v = ‖u‖ ‖v‖ cos θ. From this we obtain the following formula for the cosine of the angle θ between nonzero vectors u and v. We state it as a definition.

Definition  For nonzero vectors u and v in Rⁿ,

cos θ = (u · v) / (‖u‖ ‖v‖)
Example 1.14

Compute the angle between the vectors u = [2, 1, -2] and v = [1, 1, 1].

Solution  We calculate u · v = 2·1 + 1·1 + (-2)·1 = 1, ‖u‖ = √(2² + 1² + (-2)²) = √9 = 3, and ‖v‖ = √(1² + 1² + 1²) = √3. Therefore, cos θ = 1/(3√3), so θ = cos⁻¹(1/(3√3)) ≈ 1.377 radians, or 78.9°.
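The formula cos θ = (u · v)/(‖u‖ ‖v‖) combined with the inverse cosine function reproduces Example 1.14 numerically; a sketch (`angle` is our name):

```python
import math

def angle(u, v):
    """Angle between nonzero vectors u and v, in radians."""
    dot_uv = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.acos(dot_uv / (nu * nv))

theta = angle([2, 1, -2], [1, 1, 1])
print(round(theta, 3), round(math.degrees(theta), 1))  # 1.377 78.9
```

This is exactly the "calculator or computer" approach mentioned in the remarks that follow: for non-special angles we settle for an approximation via the inverse cosine.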
Example 1.15

Compute the angle between the diagonals on two adjacent faces of a cube.

Solution  The dimensions of the cube do not matter, so we will work with a cube with sides of length 1. Orient the cube relative to the coordinate axes in R³, as shown in Figure 1.31, and take the two side diagonals to be the vectors [1, 0, 1] and [0, 1, 1]. Then the angle θ between these vectors satisfies

cos θ = (1·0 + 0·1 + 1·1) / (√2 √2) = 1/2

from which it follows that the required angle is π/3 radians, or 60°.

Figure 1.31

(Actually, we don't need to do any calculations at all to get this answer. If we draw a third side diagonal joining the vertices at (1, 0, 1) and (0, 1, 1), we get an equilateral triangle, since all of the side diagonals are of equal length. The angle we want is one of the angles of this triangle and therefore measures 60°. Sometimes a little insight can save a lot of calculation; in this case, it gives a nice check on our work!)
Remarks
• As this discussion shows, we usually will have to settle for an approximation to the angle between two vectors. However, when the angle is one of the so-called special angles (0°, 30°, 45°, 60°, 90°, or an integer multiple of these), we should be able to recognize its cosine (Table 1.1) and thus give the corresponding angle exactly. In all
Table 1.1  Cosines of Special Angles

θ        0°          30°      45°            60°          90°
cos θ    √4/2 = 1    √3/2     √2/2 = 1/√2    √1/2 = 1/2   √0/2 = 0
other cases, we will use a calculator or computer to approximate the desired angle by means of the inverse cosine function.
• The derivation of the formula for the cosine of the angle between two vectors is valid only in R² or R³, since it depends on a geometric fact: the law of cosines. In Rⁿ, for n > 3, the formula can be taken as a definition instead. This makes sense, since the Cauchy-Schwarz Inequality implies that |u · v| / (‖u‖ ‖v‖) ≤ 1, so (u · v) / (‖u‖ ‖v‖) ranges from -1 to 1, just as the cosine function does.
Orthogonal Vectors

The word orthogonal is derived from the Greek words orthos, meaning "upright," and gonia, meaning "angle." Hence, orthogonal literally means "right-angled." The Latin equivalent is rectangular.
The concept of perpendicularity is fundamental to geometry. Anyone studying geometry quickly realizes the importance and usefulness of right angles. We now generalize the idea of perpendicularity to vectors in Rⁿ, where it is called orthogonality.
In R² or R³, two nonzero vectors u and v are perpendicular if the angle θ between them is a right angle, that is, if θ = π/2 radians, or 90°. Thus, (u · v) / (‖u‖ ‖v‖) = cos 90° = 0, and it follows that u · v = 0. This motivates the following definition.
Definition  Two vectors u and v in Rⁿ are orthogonal to each other if u · v = 0.

Since 0 · v = 0 for every vector v in Rⁿ, the zero vector is orthogonal to every vector.
Example 1.16

In R³, u = [1, 1, -2] and v = [3, 1, 2] are orthogonal, since u · v = 3 + 1 - 4 = 0.
Using the notion of orthogonality, we get an easy proof of Pythagoras' Theorem, valid in Rⁿ.
Theorem 1.6  Pythagoras' Theorem

For all vectors u and v in Rⁿ, ‖u + v‖² = ‖u‖² + ‖v‖² if and only if u and v are orthogonal.

Proof  From Example 1.9, we have ‖u + v‖² = ‖u‖² + 2(u · v) + ‖v‖² for all vectors u and v in Rⁿ. It follows immediately that ‖u + v‖² = ‖u‖² + ‖v‖² if and only if u · v = 0. See Figure 1.32.
Figure 1.32  Pythagoras' Theorem
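Pythagoras' Theorem is easy to verify on the orthogonal pair from Example 1.16; a minimal sketch (`dot` is our helper name):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = [1, 1, -2]            # the orthogonal pair of Example 1.16
v = [3, 1, 2]
w = [a + b for a, b in zip(u, v)]   # u + v

print(dot(u, v))                            # 0: u and v are orthogonal
print(dot(w, w) == dot(u, u) + dot(v, v))   # True: ||u+v||^2 = ||u||^2 + ||v||^2
```

Squared norms are used here (‖x‖² = x · x) so that the check stays in exact integer arithmetic.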
The concept of orthogonality is one of the most important and useful in linear algebra, and it often arises in surprising ways. Chapter 5 contains a detailed treatment of the topic, but we will encounter it many times before then. One problem in which it clearly plays a role is finding the distance from a point to a line, where "dropping a perpendicular" is a familiar step.
Projections

We now consider the problem of finding the distance from a point to a line in the context of vectors. As you will see, this technique leads to an important concept: the projection of a vector onto another vector.
As Figure 1.33 shows, the problem of finding the distance from a point B to a line ℓ (in R² or R³) reduces to the problem of finding the length of the perpendicular line segment PB or, equivalently, the length of the vector PB. If we choose a point A on ℓ, then, in the right-angled triangle ΔAPB, the other two vectors are the leg AP and the hypotenuse AB. AP is called the projection of AB onto the line ℓ. We will now look at this situation in terms of vectors.
Figure 1.33  The distance from a point to a line
Consider two nonzero vectors u and v. Let p be the vector obtained by dropping a perpendicular from the head of v onto u, and let θ be the angle between u and v, as shown in Figure 1.34. Then clearly p = ‖p‖ û, where û = (1/‖u‖)u is the unit vector in the direction of u. Moreover, elementary trigonometry gives ‖p‖ = ‖v‖ cos θ, and we know that cos θ = (u · v) / (‖u‖ ‖v‖). Thus, after substitution, we obtain

p = ‖v‖ ((u · v) / (‖u‖ ‖v‖)) (1/‖u‖)u = ((u · v)/‖u‖²)u = ((u · v)/(u · u))u

This is the formula we want, and it is the basis of the following definition for vectors in Rⁿ.
Definition  If u and v are vectors in Rⁿ and u ≠ 0, then the projection of v onto u is the vector proj_u(v) defined by

proj_u(v) = ((u · v)/(u · u))u

An alternative way to derive this formula is described in Exercise 57.
Section 1.2
,, ,, ,, '<1
proj g(V)
Length and Angle: The Dot Product
2:5
Re • • ,U
•
• U
fI,lr.1.35
Remarks
• The term projection comes from the idea of projecting an image onto a wall (with a slide projector, for example). Imagine a beam of light with rays parallel to each other and perpendicular to u shining down on v. The projection of v onto u is just the shadow cast, or projected, by v onto u.
• It may be helpful to think of proj_u(v) as a function with variable v. Then the variable v occurs only once on the right-hand side of the definition. Also, it is helpful to remember Figure 1.34, which reminds us that proj_u(v) is a scalar multiple of the vector u (not v).
• Although in our derivation of the definition of proj_u(v) we required v as well as u to be nonzero (why?), it is clear from the geometry that the projection of the zero vector onto u is 0. The definition is in agreement with this, since ((u · 0)/(u · u))u = 0u = 0.
• If the angle between u and v is obtuse, as in Figure 1.35, then proj_u(v) will be in the opposite direction from u; that is, proj_u(v) will be a negative scalar multiple of u.
• If u is a unit vector, then proj_u(v) = (u · v)u. (Why?)
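The projection formula is a two-line computation; a sketch (`dot` and `proj` are our names, and the sample pair assumes the first case of the example below reads u = [2, 1], v = [-1, 3]):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """proj_u(v) = ((u.v)/(u.u)) u, for u != 0."""
    c = dot(u, v) / dot(u, u)
    return [c * x for x in u]

print(proj([2, 1], [-1, 3]))  # [0.4, 0.2], i.e. [2/5, 1/5]
```

Note that the result is always a scalar multiple of u, never of v, just as the remarks emphasize.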
Example 1.17

Find the projection of v onto u in each case.
(a) u = [2, 1], v = [-1, 3]
(b) u = e₃, v = [1, 2, 3]
(c) u = [1/2, 1/2, 1/√2], v = [1, 2, 3]

Solution
(a) We compute u · v = [2, 1] · [-1, 3] = 1 and u · u = [2, 1] · [2, 1] = 5, so

proj_u(v) = ((u · v)/(u · u))u = (1/5)[2, 1] = [2/5, 1/5]

(b) Since e₃ is a unit vector,

proj_e₃(v) = (e₃ · v)e₃ = 3e₃ = [0, 0, 3]

(c) We see that ‖u‖ = √(1/4 + 1/4 + 1/2) = 1, so u is a unit vector. Thus,

proj_u(v) = (u · v)u = (1/2 + 1 + 3/√2)[1/2, 1/2, 1/√2] = (3(1 + √2)/2)[1/2, 1/2, 1/√2] = [3(1 + √2)/4, 3(1 + √2)/4, 3(1 + √2)/(2√2)]
In Exercises 1 6, fil/d u • v.
3. u =
\
,
2 ,v =
3

.  4. u =
\
3
5. " ~ [ \,

\12, v'J, 01, v
3.2  0.6 , v =
1.5
4.\
 0.2
 1.4
~ [4, 
30. Let A = (  3,2), B = ( 1, 0 ), and C = (4,6). Prove that a A BC is a rightangled tria ngle. 31. Let A = ( 1, 1,  I), R = (  3, 2,  2), and C= (2,2,  4). Prove that dARC is a right:mglcd triangle. .::::. 32. Find the angle between a diagonal of a cube and an adjacent edge. 33, A cube has fou r diagonals. Show that no two of them
\12, 0,  51
are perpendicular.
6. u = [ 1.1 2,  3.25, 2.07,  1.83],
111 Exercises 3439, find tile projeClioll of v 011to u. Draw (I sketch ill Exercises J4 ami 35.
v = [  2.29, l. 72, 4.33,  1.54]
In Exercises 7 12. jimi ~ u B for the given exercise. (HId give a un it vector 11/ tile (lireclIOU of u. 1M
7. Exercise I 10. ExerCise
8. E.xercise 2
9. Exercise 3
II . Exercise 5
::. 12. Exercise 6
36. u =
2
2/3  2/ 3
 2 37. u =
2
 1/ 3 III Exercises 13 16. find Jhe dis/(lllce d( u, v ) betweell u ali(I v ill the givell exercise. 13. Exercise I
_~ 38. u =
14. ExerCise 2
15. Exercise 3 ~ 16. Exercise 4
17. If u, v, and w are vectors in Rn, /1 ~ 2, and c is a scalar, explam why the follow'lIlg expresSIOns make no sense:
(.) I"'vl
(b) u · v + w
(c) u· (v·w)
(d) c· (u+w)
=
39. u =
\
\ 8. "
v
= [~]. v = [ :]
19.u =
3.0\
1.34
 0.33
4.25  1.66
 2
20. u = [4,3 ,  I ], v = [ I,  I, I]
(, )
~ 21. II = [0.9,2. 1, 1.2], v = [ 4. 5,2.6.  0.8 ]
22, u
= [1,  2,3,4 ], v = [ 3, I, 
1, I ]
23. u = \1,2, 3,4]. v = 15, 6.7 , 8] /11 Exercises 2429, find lilt allgle between U lind v . .
In
lir e
g,vell exerCIse.
24. Exercise 18
25. Exercise 19
~ 27, Exercise 21 ~ 28. Exercise 22 :
26. Exercise 20 29. Exercise 23
2
2.52 Figure 1.36 suggests two ways III whICh veclOrs may be used to compute the a rea of a triangle. The area A of
 \
\
 \
1.2
1.5
\
 I ,v =
,v =
[0.5] ,v = [2.\]
is acute, obtuse. or a right angle.
2
\
 \
III E.xercises /823, determme wlretlrer the allgle between 1\ IIlId
\
2 3
(b)
Fl•• ,. 1.16
Secllo n 1.2
the triangle in part (a) is given by 11 Iv  proj.(v)l, nnd pa rt (b) suggests the trigonometric form of the nrea of a triangle: A = 1I ullvisinO (We can use the identit ysinO "" V I cos1 8tonnd sinO. ) In Exercises 40 tmd 41 , compllte the arell of the triangle with
tilegil'cnl'crtices !Ising bOlh ltIellwds.
= (1, 1),8 = (2,2),C= (4,0) 41. A = (3,  I, 4), 8 = (4,  2, 6), C = (5,0,2) 40. A
In Exercises 42 and 43, find all values of the scalar k for which the two vectors are orthogonal.

42. u = …, v = …
43. u = …, v = …

44. Describe all vectors v = [x, y] that are orthogonal to u = ….
45. Describe all vectors v = [x, y] that are orthogonal to u = ….

46. Under what conditions are the following true for vectors u and v in R² or R³?
(a) ||u + v|| = ||u|| + ||v||

47. Prove Theorem 1.2(b).
48. Prove Theorem 1.2(d).

In Exercises 49-51, prove the stated property of distance between vectors.

49. d(u, v) = d(v, u) for all vectors u and v
50. d(u, w) ≤ d(u, v) + d(v, w) for all vectors u, v, and w
51. d(u, v) = 0 if and only if u = v

52. Prove that u · cv = c(u · v) for all vectors u and v in Rⁿ and all scalars c.

53. Prove that ||u - v|| ≥ ||u|| - ||v|| for all vectors u and v in Rⁿ. [Hint: Replace u by u - v in the Triangle Inequality.]

54. Suppose we know that u · v = u · w. Does it follow that v = w? If it does, give a proof that is valid in Rⁿ; otherwise, give a counterexample (that is, a specific set of vectors u, v, and w for which u · v = u · w but v ≠ w).

55. Prove that (u + v) · (u - v) = ||u||² - ||v||² for all vectors u and v in Rⁿ.

56. (a) Prove that ||u + v||² + ||u - v||² = 2||u||² + 2||v||² for all vectors u and v in Rⁿ.
(b) Draw a diagram showing u, v, u + v, and u - v in R² and use (a) to deduce a result about parallelograms.

57. Prove that u · v = (1/4)||u + v||² - (1/4)||u - v||² for all vectors u and v in Rⁿ.

58. (a) Prove that ||u + v|| = ||u - v|| if and only if u and v are orthogonal.
(b) Draw a diagram showing u, v, u + v, and u - v in R² and use (a) to deduce a result about parallelograms.

59. (a) Prove that u + v and u - v are orthogonal in Rⁿ if and only if ||u|| = ||v||. [Hint: See Exercise 47.]
(b) Draw a diagram showing u, v, u + v, and u - v in R² and use (a) to deduce a result about parallelograms.

60. If ||u|| = 2, ||v|| = √3, and u · v = 1, find ||u + v||.

61. Show that there are no vectors u and v such that ||u|| = 1, ||v|| = 2, and u · v = 3.

62. (a) Prove that if u is orthogonal to both v and w, then u is orthogonal to v + w.
(b) Prove that if u is orthogonal to both v and w, then u is orthogonal to sv + tw for all scalars s and t.

63. Prove that u is orthogonal to v - proj_u(v) for all vectors u and v in Rⁿ, where u ≠ 0.

64. (a) Prove that proj_u(proj_u(v)) = proj_u(v).
(b) Prove that proj_u(v - proj_u(v)) = 0.
(c) Explain (a) and (b) geometrically.
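The identities in Exercises 63 and 64 can be explored numerically before attempting the proofs. The sketch below (our own code, not the text's) checks them on randomly chosen vectors:

```python
import random

def dot(u, v): return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    # proj_u(v) = ((u.v)/(u.u)) u
    c = dot(u, v) / dot(u, u)
    return [c * a for a in u]

random.seed(1)
u = [random.uniform(-5, 5) for _ in range(3)]
v = [random.uniform(-5, 5) for _ in range(3)]

p = proj(u, v)
residual = [a - b for a, b in zip(v, p)]

# Exercise 63: u is orthogonal to v - proj_u(v)
assert abs(dot(u, residual)) < 1e-9
# Exercise 64(a): projecting twice changes nothing
assert all(abs(a - b) < 1e-9 for a, b in zip(proj(u, p), p))
# Exercise 64(b): proj_u(v - proj_u(v)) = 0
assert all(abs(a) < 1e-9 for a in proj(u, residual))
print("all projection identities hold on this sample")
```

A numerical check is of course no substitute for the proofs the exercises ask for, but it makes the geometric meaning of (a) and (b) concrete.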
65. The Cauchy-Schwarz Inequality |u · v| ≤ ||u|| ||v|| is equivalent to the inequality we get by squaring both sides: (u · v)² ≤ ||u||² ||v||².
(a) In R², with u = [u₁, u₂] and v = [v₁, v₂], this becomes

(u₁v₁ + u₂v₂)² ≤ (u₁² + u₂²)(v₁² + v₂²)

Prove this algebraically. [Hint: Subtract the left-hand side from the right-hand side and show that the difference must necessarily be nonnegative.]
(b) Prove the analogue of (a) in R³.

66. Another approach to the proof of the Cauchy-Schwarz Inequality is suggested by Figure 1.37, which shows that in R² or R³, ||proj_u(v)|| ≤ ||v||. Show that this is equivalent to the Cauchy-Schwarz Inequality.

[Figure 1.37: v - cu, cu, u.  Figure 1.38: proj_u(v), u]

67. Use the fact that proj_u(v) = cu for some scalar c, together with Figure 1.38, to find c and thereby derive the formula for proj_u(v).

68. Using mathematical induction, prove the following generalization of the Triangle Inequality:

||v₁ + v₂ + ··· + vₙ|| ≤ ||v₁|| + ||v₂|| + ··· + ||vₙ||

for all n ≥ 2.
Vectors and Geometry

Many results in plane Euclidean geometry can be proved using vector techniques. For example, in Example 1.17, we used vectors to prove Pythagoras' Theorem. In this exploration we will use vectors to develop proofs for some other theorems from Euclidean geometry. As an introduction to the notation and the basic approach, consider the following easy example.
Example 1.18

Give a vector description of the midpoint M of a line segment AB.

Solution  We first convert everything to vector notation. If O denotes the origin and P is a point, let p be the vector OP. In this situation, a = OA, b = OB, m = OM, and AB = OB - OA = b - a (Figure 1.39). Now, since M is the midpoint of AB, we have

m - a = AM = (1/2)AB = (1/2)(b - a)

so

m = a + (1/2)(b - a) = (1/2)(a + b)

[Figure 1.39: The midpoint of AB]
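The result of Example 1.18 translates directly into code. A minimal Python sketch (the function name is ours):

```python
def midpoint(a, b):
    # m = a + (1/2)(b - a) = (1/2)(a + b), componentwise
    return [(ai + bi) / 2 for ai, bi in zip(a, b)]

# works the same way in R^2, R^3, or any R^n
print(midpoint([1, 5, 0], [3, 1, 2]))  # [2.0, 3.0, 1.0]
```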
1. Give a vector description of the point P that is one-third of the way from A to B on the line segment AB. Generalize.

2. Prove that the line segment joining the midpoints of two sides of a triangle is parallel to the third side and half as long. (In vector notation, prove that PQ = (1/2)AB in Figure 1.40.)

[Figure 1.40]

3. Prove that the quadrilateral PQRS (Figure 1.41), whose vertices are the midpoints of the sides of an arbitrary quadrilateral ABCD, is a parallelogram.

4. A median of a triangle is a line segment from a vertex to the midpoint of the opposite side (Figure 1.42). Prove that the three medians of any triangle are concurrent (i.e., they have a common point of intersection) at a point G that is two-thirds of the distance from each vertex to the midpoint of the opposite side. (Hint: In Figure 1.43, show that the point that is two-thirds of the distance from A to P is given by (1/3)(a + b + c). Then show that (1/3)(a + b + c) is two-thirds of the distance from B to Q and two-thirds of the distance from C to R.) The point G in Figure 1.43 is called the centroid of the triangle.
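Problem 4's claim can be sanity-checked numerically before attempting the proof. The sketch below uses a triangle with coordinates chosen purely for illustration:

```python
def add(u, v): return [a + b for a, b in zip(u, v)]
def scale(c, u): return [c * x for x in u]

# an arbitrary triangle with vertices a, b, c
a, b, c = [0.0, 0.0], [4.0, 0.0], [1.0, 3.0]

g = scale(1 / 3, add(add(a, b), c))      # claimed centroid (1/3)(a + b + c)
p = scale(0.5, add(b, c))                # midpoint of the side opposite A

# the point two-thirds of the way from A to P: a + (2/3)(p - a)
two_thirds = add(a, scale(2 / 3, add(p, scale(-1, a))))

assert all(abs(x - y) < 1e-12 for x, y in zip(g, two_thirds))
print(g)
```

By symmetry of the algebra (a + (2/3)((b + c)/2 - a) simplifies to (a + b + c)/3), the same check succeeds starting from B or C, which is exactly what the hint asks you to show.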
[Figure 1.41.  Figure 1.42: A median.  Figure 1.43: The centroid]
5. An altitude of a triangle is a line segment from a vertex that is perpendicular to the opposite side (Figure 1.44). Prove that the three altitudes of a triangle are concurrent. (Hint: Let H be the point of intersection of the altitudes from A and B in Figure 1.45. Prove that CH is orthogonal to AB.) The point H in Figure 1.45 is called the orthocenter of the triangle.

6. A perpendicular bisector of a line segment is a line through the midpoint of the segment, perpendicular to the segment (Figure 1.46). Prove that the perpendicular bisectors of the three sides of a triangle are concurrent. (Hint: Let K be the point of intersection of the perpendicular bisectors of AC and BC in Figure 1.47. Prove that RK is orthogonal to AB.) The point K in Figure 1.47 is called the circumcenter of the triangle.
[Figure 1.44: An altitude.  Figure 1.45: The orthocenter.  Figure 1.46: A perpendicular bisector]
7. Let A and B be the endpoints of a diameter of a circle. If C is any point on the circle, prove that angle ACB is a right angle. (Hint: In Figure 1.48, let O be the center of the circle. Express everything in terms of a and c and show that AC is orthogonal to BC.)

8. Prove that the line segments joining the midpoints of opposite sides of a quadrilateral bisect each other (Figure 1.49).
[Figure 1.47: The circumcenter.  Figure 1.48.  Figure 1.49]
1.3 Lines and Planes

We are all familiar with the equation of a line in the Cartesian plane. We now want to consider lines in R² from a vector point of view. The insights we obtain from this approach will allow us to generalize to lines in R³ and then to planes in R³. Much of the linear algebra we will consider in later chapters has its origins in the simple geometry of lines and planes; the ability to visualize these and to think geometrically about a
problem will serve you well.
Lines in R² and R³
In the xy-plane, the general form of the equation of a line is ax + by = c. If b ≠ 0, then the equation can be rewritten as y = -(a/b)x + c/b, which has the form y = mx + k. [This is the slope-intercept form; m is the slope of the line, and the point with coordinates (0, k) is its y-intercept.] To get vectors into the picture, let's consider an example.
Example 1.19

The line ℓ with equation 2x + y = 0 is shown in Figure 1.50. It is a line with slope -2 passing through the origin. The left-hand side of the equation is in the form of a dot product; in fact, if we let n = [2, 1] and x = [x, y], then the equation becomes n · x = 0.
The vector n is perpendicular to the line; that is, it is orthogonal to any vector x that is parallel to the line (Figure 1.51), and it is called a normal vector to the line. The equation n · x = 0 is the normal form of the equation of ℓ.

(The Latin word norma refers to a carpenter's square, used for drawing right angles. Thus, a normal vector is one that is perpendicular to something else, usually a plane.)

[Figure 1.50: The line 2x + y = 0.  Figure 1.51: A normal vector n]

Another way to think about this line is to imagine a particle moving along the line. Suppose the particle is initially at the origin at time t = 0 and it moves along the line in such a way that its x-coordinate changes 1 unit per second. Then at t = 1 the particle is at (1, -2), at t = 1.5 it is at (1.5, -3), and, if we allow negative values of t (that is, we consider where the particle was in the past), at t = -2 it is (or was) at
(-2, 4). This movement is illustrated in Figure 1.52. In general, if x = t, then y = -2t, and we may write this relationship in vector form as

[x, y] = t[1, -2]

What is the significance of the vector d = [1, -2]? It is a particular vector parallel to ℓ, called a direction vector for the line. As shown in Figure 1.53, we may write the equation of ℓ as x = td. This is the vector form of the equation of the line. If the line does not pass through the origin, then we must modify things slightly.
[Figure 1.52.  Figure 1.53: A direction vector d]
Example 1.20

Consider the line ℓ with equation 2x + y = 5 (Figure 1.54). This is just the line from Example 1.19 shifted upward 5 units. It also has slope -2, but its y-intercept is the point (0, 5). It is clear that the vectors d and n from Example 1.19 are, respectively, a direction vector and a normal vector for this line too. Thus, n is orthogonal to every vector that is parallel to ℓ. The point P = (1, 3) is on ℓ. If X = (x, y) represents a general point on ℓ, then the vector PX = x - p is parallel to ℓ and n · (x - p) = 0 (see Figure 1.55). Simplified, we have n · x = n · p. As a check, we compute

n · x = [2, 1] · [x, y] = 2x + y    and    n · p = [2, 1] · [1, 3] = 5

Thus, the normal form n · x = n · p is just a different representation of the general form of the equation of the line. (Note that in Example 1.19, p was the zero vector, so n · p = 0 gave the right-hand side of the equation.)
[Figure 1.54: The line 2x + y = 5.  Figure 1.55: n · (x - p) = 0]
These results lead to the following definition.

Definition  The normal form of the equation of a line ℓ in R² is

n · (x - p) = 0    or    n · x = n · p

where p is a specific point on ℓ and n ≠ 0 is a normal vector for ℓ.

The general form of the equation of ℓ is ax + by = c, where n = [a, b] is a normal vector for ℓ.
Continuing with Example 1.20, let us now find the vector form of the equation of ℓ. Note that, for each choice of x, x - p must be parallel to (and thus a multiple of) the direction vector d. That is, x - p = td or x = p + td for some scalar t. In terms of components, we have

[x, y] = [1, 3] + t[1, -2]    (1)

or

x = 1 + t
y = 3 - 2t    (2)

Equation (1) is the vector form of the equation of ℓ, and the componentwise equations (2) are called parametric equations of the line. The variable t is called a parameter.

(The word parameter and the corresponding adjective parametric come from the Greek words para, meaning "alongside," and metron, meaning "measure." Mathematically speaking, a parameter is a variable in terms of which other variables are expressed: a new "measure" placed alongside the old.)

How does all of this generalize to R³? Observe that the vector and parametric forms of the equations of a line carry over perfectly. The notion of the slope of a line in R², which is difficult to generalize to three dimensions, is replaced by the more convenient notion of a direction vector, leading to the following definition.
Definition  The vector form of the equation of a line ℓ in R² or R³ is

x = p + td

where p is a specific point on ℓ and d ≠ 0 is a direction vector for ℓ.

The equations corresponding to the components of the vector form of the equation are called parametric equations of ℓ.
We will often abbreviate this terminology slightly, referring simply to the general, normal, vector, and parametric equations of a line or plane.
Example 1.21

Find vector and parametric equations of the line in R³ through the point P = (1, 2, -1), parallel to the vector d = [5, -1, 3].

Solution  The vector equation x = p + td is

[x, y, z] = [1, 2, -1] + t[5, -1, 3]

The parametric form is

x = 1 + 5t
y = 2 - t
z = -1 + 3t
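Vector equations like the one in Example 1.21 are easy to evaluate programmatically: each value of the parameter t yields one point on the line. A short sketch (our own helper name):

```python
def point_on_line(p, d, t):
    # Evaluate x = p + t*d, componentwise
    return [pi + t * di for pi, di in zip(p, d)]

p, d = [1, 2, -1], [5, -1, 3]   # data from Example 1.21
for t in [0, 1, 2]:
    print(t, point_on_line(p, d, t))
# t = 1 gives the point (6, 1, 2), which reappears in the Remarks below
```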
Remarks

• The vector and parametric forms of the equation of a given line ℓ are not unique; in fact, there are infinitely many, since we may use any point on ℓ to determine p and any direction vector for ℓ. However, all direction vectors are clearly multiples of each other. In Example 1.21, (6, 1, 2) is another point on the line (take t = 1), and [10, -2, 6] is another direction vector. Therefore,

[x, y, z] = [6, 1, 2] + s[10, -2, 6]
gives a different (but equivalent) vector equation for the line. The relationship between the two parameters s and t can be found by comparing the parametric equations: For a given point (x, y, z) on ℓ, we have
x = 1 + 5t = 6 + 10s
y = 2 - t = 1 - 2s
z = -1 + 3t = 2 + 6s

implying that

5t - 10s = 5
-t + 2s = -1
3t - 6s = 3

Each of these equations reduces to t = 1 + 2s.
• Intuitively, we know that a line is a one-dimensional object. The idea of "dimension" will be clarified in Chapters 3 and 6, but for the moment observe that this idea appears to agree with the fact that the vector form of the equation of a line requires one parameter.
Example 1.22

One often hears the expression "two points determine a line." Find a vector equation of the line ℓ in R³ determined by the points P = (-1, 5, 0) and Q = (2, 1, 1).

Solution  We may choose any point on ℓ for p, so we will use P (Q would also be fine). A convenient direction vector is d = PQ = [3, -4, 1] (or any scalar multiple of this). Thus, we obtain

x = p + td = [-1, 5, 0] + t[3, -4, 1]
Planes in R³

The next question we should ask ourselves is, How does the general form of the equation of a line generalize to R³? We might reasonably guess that if ax + by = c is the general form of the equation of a line in R², then ax + by + cz = d might represent a line in R³. In normal form, this equation would be n · x = n · p, where n is a normal vector to the line and p corresponds to a point on the line.

To see if this is a reasonable hypothesis, let's think about the special case of the equation ax + by + cz = 0. In normal form, it becomes n · x = 0, where n = [a, b, c]. However, the set of all vectors x that satisfy this equation is the set of all vectors orthogonal to n. As shown in Figure 1.56, vectors in infinitely many directions have this property, determining a family of parallel planes. So our guess was incorrect: It appears that ax + by + cz = d is the equation of a plane, not a line, in R³.

Let's make this finding more precise. Every plane 𝒫 in R³ can be determined by specifying a point p on 𝒫 and a nonzero vector n normal to 𝒫 (Figure 1.57). Thus, if x represents an arbitrary point on 𝒫, we have n · (x - p) = 0 or n · x = n · p. If n = [a, b, c] and x = [x, y, z], then, in terms of components, the equation becomes ax + by + cz = d (where d = n · p).

[Figure 1.56: n is orthogonal to infinitely many vectors.  Figure 1.57]

Definition  The normal form of the equation of a plane 𝒫 in R³ is

n · (x - p) = 0    or    n · x = n · p

where p is a specific point on 𝒫 and n ≠ 0 is a normal vector for 𝒫.

The general form of the equation of 𝒫 is ax + by + cz = d, where n = [a, b, c] is a normal vector for 𝒫.
Note that any scalar multiple of a normal vector for a plane is another normal vector.
Example 1.23

Find the normal and general forms of the equation of the plane that contains the point P = (6, 0, 1) and has normal vector n = [1, 2, 3].

Solution  With p = [6, 0, 1] and x = [x, y, z], we have n · p = 1·6 + 2·0 + 3·1 = 9, so the normal equation n · x = n · p becomes the general equation x + 2y + 3z = 9.
Geometrically, it is clear that parallel planes have the same normal vector(s). Thus, their general equations have left-hand sides that are multiples of each other. So, for example, 2x + 4y + 6z = 10 is the general equation of a plane that is parallel to the plane in Example 1.23, since we may rewrite the equation as x + 2y + 3z = 5, from which we see that the two planes have the same normal vector n. (Note that the planes do not coincide, since the right-hand sides of their equations are distinct.)
We may also express the equation of a plane in vector or parametric form. To do so, we observe that a plane can also be determined by specifying one of its points P (by the vector p) and two direction vectors u and v parallel to the plane (but not parallel to each other). As Figure 1.58 shows, given any point X in the plane (located by x), we can always find appropriate multiples su and tv of the direction vectors such that x - p = su + tv or x = p + su + tv. If we write this equation componentwise, we obtain parametric equations for the plane.

[Figure 1.58: x - p = su + tv]
Definition  The vector form of the equation of a plane 𝒫 in R³ is

x = p + su + tv

where p is a point on 𝒫 and u and v are direction vectors for 𝒫 (u and v are nonzero and parallel to 𝒫, but not parallel to each other).

The equations corresponding to the components of the vector form of the equation are called parametric equations of 𝒫.
Example 1.24
Find vector and parametric equations for the plane in Example 1.23.

Solution  We need to find two direction vectors. We have one point P = (6, 0, 1) in the plane; if we can find two other points Q and R in 𝒫, then the vectors PQ and PR can serve as direction vectors (unless by bad luck they happen to be parallel!). By trial and error, we observe that Q = (9, 0, 0) and R = (3, 3, 0) both satisfy the general equation x + 2y + 3z = 9 and so lie in the plane. Then we compute

u = PQ = q - p = [3, 0, -1]    and    v = PR = r - p = [-3, 3, -1]

which, since they are not scalar multiples of each other, will serve as direction vectors. Therefore, we have the vector equation of 𝒫,

[x, y, z] = [6, 0, 1] + s[3, 0, -1] + t[-3, 3, -1]

and the corresponding parametric equations,

x = 6 + 3s - 3t
y = 3t
z = 1 - s - t

[What would have happened had we chosen R = (0, 0, 3)?]

[Figure 1.59: Two normals determine a line.  Figure 1.60: The intersection of two planes is a line]
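The point-hunting step of Example 1.24 can be automated for planes whose coefficients are all nonzero, as in this example, by intersecting the plane with the coordinate axes. A sketch (the helper names are ours, not the text's):

```python
def plane_points(a, b, c, d):
    # Three points on ax + by + cz = d, found by setting two
    # variables to 0 at a time; assumes a, b, c are all nonzero.
    return [d / a, 0, 0], [0, d / b, 0], [0, 0, d / c]

def sub(u, v): return [x - y for x, y in zip(u, v)]

p, q, r = plane_points(1, 2, 3, 9)   # the plane x + 2y + 3z = 9
u, v = sub(q, p), sub(r, p)          # candidate direction vectors

# sanity check: both directions are orthogonal to n = [1, 2, 3]
n = [1, 2, 3]
assert sum(ni * ui for ni, ui in zip(n, u)) == 0
assert sum(ni * vi for ni, vi in zip(n, v)) == 0
print(p, u, v)
```

The three axis-intercept points are never collinear when a, b, and c are all nonzero, so the resulting u and v are not parallel, just as the example requires.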
Remarks

• A plane is a two-dimensional object, and its equation, in vector or parametric form, requires two parameters.

• As Figure 1.56 shows, given a point P and a nonzero vector n in R³, there are infinitely many lines through P with n as a normal vector. However, P and two nonparallel normal vectors n₁ and n₂ do serve to locate a line ℓ uniquely, since ℓ must then be the line through P that is perpendicular to the plane with equation x = p + sn₁ + tn₂ (Figure 1.59). Thus, a line in R³ can also be specified by a pair of equations

a₁x + b₁y + c₁z = d₁
a₂x + b₂y + c₂z = d₂

one corresponding to each normal vector. But since these equations correspond to a pair of nonparallel planes (why nonparallel?), this is just the description of a line as the intersection of two nonparallel planes (Figure 1.60). Algebraically, the line consists of all points (x, y, z) that simultaneously satisfy both equations. We will explore this concept further in Chapter 2 when we discuss the solution of systems of linear equations.

Tables 1.2 and 1.3 summarize the information presented so far about the equations of lines and planes. Observe once again that a single (general) equation describes a line in R² but a plane in R³. [In higher dimensions, an object (line, plane, etc.) determined by a single equation of this type is usually called a hyperplane.] The relationship among the
Table 1.2  Equations of Lines in R²
  Normal form: n · x = n · p
  General form: ax + by = c
  Vector form: x = p + td
  Parametric form: x = p₁ + td₁, y = p₂ + td₂

Table 1.3  Lines and Planes in R³
  Lines:
    Normal/general form: a₁x + b₁y + c₁z = d₁ and a₂x + b₂y + c₂z = d₂ (a pair of plane equations)
    Vector form: x = p + td
    Parametric form: x = p₁ + td₁, y = p₂ + td₂, z = p₃ + td₃
  Planes:
    Normal form: n · x = n · p
    General form: ax + by + cz = d
    Vector form: x = p + su + tv
    Parametric form: x = p₁ + su₁ + tv₁, y = p₂ + su₂ + tv₂, z = p₃ + su₃ + tv₃
dimension of the object, the number of equations required, and the dimension of the space is given by the "balancing formula":

(dimension of the object) + (number of general equations) = dimension of the space

The higher the dimension of the object, the fewer equations it needs. For example, a plane in R³ is two-dimensional, requires one general equation, and lives in a three-dimensional space: 2 + 1 = 3. A line in R³ is one-dimensional and so needs 3 - 1 = 2 equations. Note that the dimension of the object also agrees with the number of parameters in its vector or parametric form. Notions of "dimension" will be clarified in Chapters 3 and 6, but for the time being, these intuitive observations will serve us well.

We can now find the distance from a point to a line or a plane by combining the results of Section 1.2 with the results from this section.
Example 1.25

Find the distance from the point B = (1, 0, 2) to the line ℓ through the point A = (3, 1, 1) with direction vector d = [-1, 1, 0].

Solution  As we have already determined, we need to calculate the length of PB, where P is the point on ℓ at the foot of the perpendicular from B. If we label v = AB, then AP = proj_d(v) and PB = v - proj_d(v) (see Figure 1.61). We do the necessary calculations in several steps.

Step 1:  v = AB = b - a = [1, 0, 2] - [3, 1, 1] = [-2, -1, 1]
[Figure 1.61: d(B, ℓ) = ||v - proj_d(v)||]
Step 2:  The projection of v onto d is

proj_d(v) = ((d · v)/(d · d)) d
          = (((-1)(-2) + (1)(-1) + (0)(1)) / ((-1)² + 1² + 0²)) [-1, 1, 0]
          = (1/2)[-1, 1, 0] = [-1/2, 1/2, 0]

Step 3:  The vector we want is

v - proj_d(v) = [-2, -1, 1] - [-1/2, 1/2, 0] = [-3/2, -3/2, 1]

Step 4:  The distance d(B, ℓ) from B to ℓ is

||v - proj_d(v)|| = ||[-3/2, -3/2, 1]||

Using Theorem 1.3(b) to simplify the calculation, we have

||v - proj_d(v)|| = ||(1/2)[-3, -3, 2]|| = (1/2)√(9 + 9 + 4) = (1/2)√22

Note  In terms of our earlier notation, d(B, ℓ) = d(v, proj_d(v)).
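The four steps of Example 1.25 collapse into a few lines of Python. This is a sketch with our own helper names, following the same projection argument:

```python
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))
def sub(u, v): return [a - b for a, b in zip(u, v)]

def dist_point_line(b_pt, a_pt, d):
    # d(B, l) = ||v - proj_d(v)|| with v = AB, as in Example 1.25
    v = sub(b_pt, a_pt)
    c = dot(d, v) / dot(d, d)
    w = sub(v, [c * di for di in d])   # v - proj_d(v)
    return math.sqrt(dot(w, w))

# Example 1.25: B = (1, 0, 2), A = (3, 1, 1), d = [-1, 1, 0]
print(dist_point_line([1, 0, 2], [3, 1, 1], [-1, 1, 0]))
# sqrt(22)/2, about 2.345
```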
In the case where the line ℓ is in R² and its equation has the general form ax + by = c, the distance d(B, ℓ) from B = (x₀, y₀) is given by the formula

d(B, ℓ) = |ax₀ + by₀ - c| / √(a² + b²)    (3)

You are invited to prove this formula in Exercise 39.
Example 1.26

Find the distance from the point B = (1, 0, 2) to the plane 𝒫 whose general equation is x + y - z = 1.

Solution  In this case, we need to calculate the length of PB, where P is the point on 𝒫 at the foot of the perpendicular from B. As Figure 1.62 shows, if A is any point on 𝒫 and we situate the normal vector n = [1, 1, -1] of 𝒫 so that its tail is at A, then we need to find the length of the projection of AB onto n. Again we do the necessary calculations in steps.

[Figure 1.62: d(B, 𝒫) = ||proj_n(AB)||]

Step 1:  By trial and error, we find any point whose coordinates satisfy the equation x + y - z = 1. A = (1, 0, 0) will do.

Step 2:  Set

v = AB = b - a = [1, 0, 2] - [1, 0, 0] = [0, 0, 2]

Step 3:  The projection of v onto n is

proj_n(v) = ((n · v)/(n · n)) n
          = ((1·0 + 1·0 + (-1)·2) / (1² + 1² + (-1)²)) [1, 1, -1]
          = -(2/3)[1, 1, -1] = [-2/3, -2/3, 2/3]
Step 4:  The distance d(B, 𝒫) from B to 𝒫 is

||proj_n(v)|| = ||-(2/3)[1, 1, -1]|| = (2/3)√3

In general, the distance d(B, 𝒫) from the point B = (x₀, y₀, z₀) to the plane whose general equation is ax + by + cz = d is given by the formula

d(B, 𝒫) = |ax₀ + by₀ + cz₀ - d| / √(a² + b² + c²)    (4)

You will be asked to derive this formula in Exercise 40.
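Formula (4) is a one-liner in code. The sketch below (our own function name) checks it against the answer of Example 1.26:

```python
import math

def dist_point_plane(x0, y0, z0, a, b, c, d):
    # Formula (4): |a*x0 + b*y0 + c*z0 - d| / sqrt(a^2 + b^2 + c^2)
    return abs(a * x0 + b * y0 + c * z0 - d) / math.sqrt(a * a + b * b + c * c)

# Example 1.26: B = (1, 0, 2), plane x + y - z = 1
print(dist_point_plane(1, 0, 2, 1, 1, -1, 1))
# (2/3)*sqrt(3) = 2/sqrt(3), about 1.155
```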
Exercises 1.3

In Exercises 1 and 2, write the equation of the line passing through P with normal vector n in (a) normal form and (b) general form.

1. P = (0, 0), n = …
2. P = (1, 2), n = …

In Exercises 3-6, write the equation of the line passing through P with direction vector d in (a) vector form and (b) parametric form.

3. P = (1, 0), d = …
4. P = (-4, 4), d = …
5. P = (0, 0, 0), d = …
6. P = (3, 0, -2), d = …

In Exercises 7 and 8, write the equation of the plane passing through P with normal vector n in (a) normal form and (b) general form.

7. P = (0, 1, 0), n = [3, 2, 1]
8. P = (-3, 5, 1), n = …

In Exercises 9 and 10, write the equation of the plane passing through P with direction vectors u and v in (a) vector form and (b) parametric form.

9. P = (0, 0, 0), u = …, v = …
10. P = (6, -4, -3), u = …, v = …

In Exercises 11 and 12, give the vector equation of the line passing through P and Q.

11. P = (1, -2), Q = (3, 0)
12. P = (0, 1, 1), Q = (-2, 1, 3)

In Exercises 13 and 14, give the vector equation of the plane passing through P, Q, and R.

13. P = (1, 1, 1), Q = (4, 0, 2), R = (0, 1, 1)
14. P = (1, 0, 0), Q = (0, 1, 0), R = (0, 0, 1)

15. Find parametric equations and an equation in vector form for the lines in R² with the following equations:
(a) y = 3x - 1
(b) 3x + 2y = 5
16. Consider the vector equation x = p + t(q - p), where p and q correspond to distinct points P and Q in R² or R³.
(a) Show that this equation describes the line segment PQ as t varies from 0 to 1.
(b) For which value of t is x the midpoint of PQ, and what is x in this case?
(c) Find the midpoint of PQ when P = (2, -3) and Q = (0, 1).
(d) Find the midpoint of PQ when P = (1, 0, 1) and Q = (4, 1, -2).
(e) Find the two points that divide PQ in part (c) into three equal parts.
(f) Find the two points that divide PQ in part (d) into three equal parts.
17. Suggest a "vector proof" of the fact that, in R², two lines with slopes m₁ and m₂ are perpendicular if and only if m₁m₂ = -1.
18. The line ℓ passes through the point P = (1, -1, 1) and has direction vector d = [2, 3, -1]. For each of the following planes 𝒫, determine whether ℓ and 𝒫 are parallel, perpendicular, or neither:
(a) 2x + 3y - z = 1
(b) 4x - y + 5z = 0
(c) x - y - z = 3
(d) 4x + 6y - 2z = 0

19. The plane 𝒫₁ has the equation 4x - y + 5z = 2. For each of the planes 𝒫 in Exercise 18, determine whether 𝒫₁ and 𝒫 are parallel, perpendicular, or neither.

20. Find the vector form of the equation of the line in R² that passes through P = (2, -1) and is perpendicular to the line with general equation 2x - 3y = 1.

21. Find the vector form of the equation of the line in R² that passes through P = (2, -1) and is parallel to the line with general equation 2x - 3y = 1.

22. Find the vector form of the equation of the line in R³ that passes through P = (1, 0, 3) and is perpendicular to the plane with general equation x - 3y + 2z = 5.

23. Find the vector form of the equation of the line in R³ that passes through P = (-1, 0, 3) and is parallel to the line with parametric equations
x = 1 - t
y = 2 + 3t
z = -2 - t

24. Find the normal form of the equation of the plane that passes through P = (0, -2, 5) and is parallel to the plane with general equation 6x - y + 2z = 3.

25. A cube has vertices at the eight points (x, y, z), where each of x, y, and z is either 0 or 1. (See Figure 1.31.)
(a) Find the general equations of the planes that determine the six faces (sides) of the cube.
(b) Find the general equation of the plane that contains the diagonal from the origin to (1, 1, 1) and is perpendicular to the xy-plane.
(c) Find the general equation of the plane that contains the side diagonals referred to in Example 1.15.

26. Find the equation of the set of all points that are equidistant from the points P = (1, 0, -2) and Q = (5, 2, 4).

In Exercises 27 and 28, find the distance from the point Q to the line ℓ.

27. Q = (2, 2), ℓ with equation [x, y] = … + t…
28. Q = (0, 1, 0), ℓ with equation [x, y, z] = … + t…

In Exercises 29 and 30, find the distance from the point Q to the plane 𝒫.

29. Q = (2, 2, 2), 𝒫 with equation x + y - z = 0
30. Q = (0, 0, 0), 𝒫 with equation x - 2y + 2z = 1

Figure 1.63 suggests a way to use vectors to locate the point R on ℓ that is closest to Q.

31. Find the point R on ℓ that is closest to Q in Exercise 27.
32. Find the point R on ℓ that is closest to Q in Exercise 28.

[Figure 1.63: r = p + PR]
Figure 1.64 suggests a way to use vectors to locate the point R on 𝒫 that is closest to Q.

33. Find the point R on 𝒫 that is closest to Q in Exercise 29.
34. Find the point R on 𝒫 that is closest to Q in Exercise 30.

[Figure 1.64: r = p + PQ + QR]

If two nonparallel planes 𝒫₁ and 𝒫₂ have normal vectors n₁ and n₂ and θ is the angle between n₁ and n₂, then we define the angle between 𝒫₁ and 𝒫₂ to be either θ or 180° - θ, whichever is an acute angle (Figure 1.65).
Exercises 35 (II/(/ 36, filld the distall ce between tile {X/rallel lilies.
III Exercises 4344, find tlse acute mlgle between the pIa/Ie! with the given equat ;0115.
43.x+ y+ z = 0 and 2x + y  2z = 0 44. 3x  y+2 z=5 and x+4y  z = 2
111
35.
= [I] + s['] [x] y I 3
III Exercises 4546, show tlll/tihe pllllie and line with the given 1.'(llIatiol15 illlersecf, (lnd then find the aela/.' angle of intersectioll between them. 45. The plane given by x
36.
x Y
I
I
x
O+siandy
,
 \
I
z
o
I
\ +t \ 1
Z=
3
+
I
46. The plane given by 4x  Y 
In Exercises 37 011(/38, find the distance between the parallel planes. C 137. 2x + y  1%= 0 and 2x + y  2z =5
38.x+y + z =
J
,nd x + y +z= 3
39. Prove equation 3 o n page 40. 40. Prove equation 4 on page 4 J. 41. Prove that, in R ', the distance bet..."een parallel lines wit h equations n' x = c, and n· x = c1 is given by
[Figure 1.66: Projection onto a plane]

Exercises 47 and 48 explore one approach to the problem of finding the projection of a vector onto a plane. As Figure 1.66 shows, if 𝒫 is a plane through the origin in R³ with normal
vector n, and v is a vector in R³, then p = proj_𝒫(v) is a vector in 𝒫 such that v - cn = p for some scalar c.
47. Using the fact that n is orthogonal to every vector in 𝒫 (and hence to p), solve for c and thereby find an expression for p in terms of v and n.

48. Use the method of Exercise 47 to find the projection of v = [1, 0, -2] onto the planes with the following equations:
(a) x + y + z = 0
(b) 3x - y + z = 0
(c) x - 2z = 0
(d) 2x - 3y + z = 0
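Once Exercise 47 is solved, the resulting formula p = v - ((n · v)/(n · n))n is easy to implement. A sketch (our own function name), applied to part (a) of Exercise 48 with v = [1, 0, -2]:

```python
def proj_onto_plane(v, n):
    # Exercise 47's result: p = v - c*n with c = (n.v)/(n.n),
    # chosen so that n.(v - c*n) = 0
    c = sum(ni * vi for ni, vi in zip(n, v)) / sum(ni * ni for ni in n)
    return [vi - c * ni for vi, ni in zip(v, n)]

# Exercise 48(a): the plane x + y + z = 0 has normal n = [1, 1, 1]
p = proj_onto_plane([1, 0, -2], [1, 1, 1])
print(p)
# the result lies in the plane: its components sum to 0
assert abs(sum(p)) < 1e-12
```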
The Cross Product

It would be convenient if we could easily convert the vector form x = p + su + tv of the equation of a plane to the normal form n · x = n · p. What we need is a process that, given two nonparallel vectors u and v, produces a third vector n that is orthogonal to both u and v. One approach is to use a construction known as the cross product of vectors. It is valid only in R³ and is defined as follows:

Definition  The cross product of u = [u₁, u₂, u₃] and v = [v₁, v₂, v₃] is the vector u × v defined by

u × v = [u₂v₃ - u₃v₂, u₃v₁ - u₁v₃, u₁v₂ - u₂v₁]

A shortcut that can help you remember how to calculate the cross product of two vectors is illustrated below. Under each complete vector, write the first two components of that vector. Ignoring the two components on the top line, consider each block of four: Subtract the products of the components connected by dashed lines from the products of the components connected by solid lines. (It helps to notice that the first component of u × v has no 1s as subscripts, the second has no 2s, and the third has no 3s.)

u₂v₃ - u₃v₂    u₃v₁ - u₁v₃    u₁v₂ - u₂v₁
45
The following problems brietly explore the cross product. I. Compute u x v.
,
3
0 (a) u =
,
, v :::::
(0) u
=
,
flgur.1.61
=
.. 2
2 ,v =
3
(b ) u
2
, u X ,
,
(d ) u
=
, , , ,
•v =
2
,
, ,
0
3
, ,=
2
3
2. Show that c 1 X c 2 = c)' c1 X c j = c., and c j X e l = ez. 3. Using the definitiOn o f a cross p roduct, prove that u X v (as shown in Figure 1.67 ) is orthogonal to u and v. 4. Use the cross product to help find the no rmal form of the equation of the plane. o 3 (a) The plane passing through P = (l , 0,  2), pa rallel to u = I and v =  ]
,
2
(b) The plane passing through P = (0, -1, 1), Q = (2, 0, 2), and R = (1, 2, -1)

5. Prove the following properties of the cross product:
(a) v × u = -(u × v)
(b) u × 0 = 0
(c) u × u = 0
(d) u × kv = k(u × v)
(e) u × ku = 0
(f) u × (v + w) = u × v + u × w

6. Prove the following properties of the cross product:
(a) u·(v × w) = (u × v)·w
(b) u × (v × w) = (u·w)v - (u·v)w
(c) ‖u × v‖² = ‖u‖²‖v‖² - (u·v)²
7. Redo Problems 2 and 3, this time making use of Problems 5 and 6.

8. Let u and v be vectors in R^3 and let θ be the angle between u and v.
(a) Prove that ‖u × v‖ = ‖u‖‖v‖ sin θ. [Hint: Use Problem 6(c).]
(b) Prove that the area A of the triangle determined by u and v (as shown in Figure 1.68) is given by

A = ½‖u × v‖

Figure 1.68

(c) Use the result in part (b) to compute the area of the triangle with vertices A = (1, 2, 1), B = (2, 1, 0), and C = (5, -1, 3).
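As a computational companion to Problem 8(b), here is a Python sketch (ours, not from the text) that computes the area of a triangle from the formula A = ½‖u × v‖, where u and v are two edge vectors:

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def triangle_area(a, b, c):
    """Area = (1/2) * ||u x v||, where u = b - a and v = c - a."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    return 0.5 * math.sqrt(sum(x * x for x in cross(u, v)))

# The vertices from Problem 8(c):
print(triangle_area((1, 2, 1), (2, 1, 0), (5, -1, 3)))  # 0.5 * sqrt(62), about 3.937
```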
1.4 Code Vectors and Modular Arithmetic
The modern theory of codes originated with the work of the American mathematician and computer scientist Claude Shannon (1916-2001), whose 1937 thesis showed how algebra could play a role in the design and analysis of electrical circuits. Shannon would later be instrumental in the formation of the field of information theory and give the theoretical basis for what are now called error-correcting codes.
Throughout history, people have transmitted information using codes. Sometimes the intent is to disguise the message being sent, such as when each letter in a word is replaced by a different letter according to a substitution rule. Although fascinating, these secret codes, or ciphers, are not of concern here; they are the focus of the field of cryptography. Rather, we will concentrate on codes that are used when data must be transmitted electronically. A familiar example of such a code is Morse code, with its system of dots and dashes. The advent of digital computers in the 20th century led to the need to transmit massive amounts of data quickly and accurately. Computers are designed to encode data as sequences of 0s and 1s. Many recent technological advancements depend on codes, and we encounter them every day without being aware of them: satellite communications, compact disc players, the universal product codes (UPC) associated with the bar codes found on merchandise, and the international standard book numbers (ISBN) found on every book published today are but a few examples. In this section, we will use vectors to design codes for detecting errors that may occur in the transmission of data. In later chapters, we will construct codes that can not only detect but also correct errors. The vectors that arise in the study of codes are not the familiar vectors of R^n but vectors with only a finite number of choices for the components. These vectors depend on a different type of arithmetic, modular arithmetic, which will be introduced in this section and used throughout the book.
Binary Codes

Since computers represent data in terms of 0s and 1s (which can be interpreted as off/on, closed/open, false/true, or no/yes), we begin by considering binary codes, which consist of vectors each of whose components is either a 0 or a 1. In this setting, the usual rules of arithmetic must be modified, since the result of each calculation involving scalars must be a 0 or a 1. The modified rules for addition and multiplication are given below.
+ | 0  1        · | 0  1
--+------       --+------
0 | 0  1        0 | 0  0
1 | 1  0        1 | 0  1
The only curiosity here is the rule that 1 + 1 = 0. This is not as strange as it appears; if we replace 0 with the word "even" and 1 with the word "odd," these tables simply summarize the familiar parity rules for the addition and multiplication of even and odd integers. For example, 1 + 1 = 0 expresses the fact that the sum of two odd integers is an even integer. With these rules, our set of scalars {0, 1} is denoted by Z2 and is called the set of integers modulo 2.

Example 1.27
In Z2, 1 + 1 + 0 + 1 = 1 and 1 + 1 + 1 + 1 = 0. (These calculations illustrate the parity rules: The sum of three odds and an even is odd; the sum of four odds is even.)
We are using the term length differently from the way we used it in R^n. This should not be confusing, since there is no geometric notion of length for binary vectors.
With Z2 as our set of scalars, we now extend the above rules to vectors. The set of all n-tuples of 0s and 1s (with all arithmetic performed modulo 2) is denoted by Z2^n. The vectors in Z2^n are called binary vectors of length n.
Chapter 1  Vectors
Example 1.28

The vectors in Z2^2 are [0, 0], [0, 1], [1, 0], and [1, 1]. (How many vectors does Z2^n contain, in general?)

Example 1.29

Let u = [1, 1, 0, 1, 0] and v = [0, 1, 1, 1, 0] be two binary vectors of length 5. Find u·v.

Solution  The calculation of u·v takes place in Z2, so we have

u·v = 1·0 + 1·1 + 0·1 + 1·1 + 0·0 = 0 + 1 + 0 + 1 + 0 = 0
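The mod-2 dot product of Example 1.29 is easy to reproduce in code. A minimal Python sketch (ours, not the text's):

```python
def dot_mod2(u, v):
    """Dot product of two binary vectors, with all arithmetic in Z_2."""
    return sum(x * y for x, y in zip(u, v)) % 2

u = [1, 1, 0, 1, 0]
v = [0, 1, 1, 1, 0]
print(dot_mod2(u, v))  # 0, as in Example 1.29
```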
In practice, we have a message (consisting of words, numbers, or symbols) that we wish to transmit. We begin by encoding each "word" of the message as a binary vector.
Definition

A binary code is a set of binary vectors (of the same length) called code vectors. The process of converting a message into code vectors is called encoding, and the reverse process is called decoding.
As we will see, it is highly desirable that a code have other properties as well, such as the ability to spot when an error has occurred in the transmission of a code vector and, if possible, to suggest how to correct the error.
Error-Detecting Codes

Suppose that we have already encoded a message as a set of binary code vectors. We now want to send the binary code vectors across a channel (such as a radio transmitter, a telephone line, a fiber-optic cable, or a CD laser). Unfortunately, the channel may be "noisy" (because of electrical interference, competing signals, or dirt and scratches). As a result, errors may be introduced: Some of the 0s may be changed to 1s, and vice versa. How can we guard against this problem?
Example 1.30
We wish to encode and transmit a message consisting of one of the words up, down, left, or right. We decide to use the four vectors in Z2^2 as our binary code, as shown in Table 1.4. If the receiver has this table too and the encoded message is transmitted without error, decoding is trivial. However, let's suppose that a single error occurred. (By an error, we mean that one component of the code vector changed.) For example, suppose we sent the message "down" encoded as [0, 1] but an error occurred in the transmission of the first component and the 0 changed to a 1. The receiver would then see
Table 1.4

Message    Code
up         [0, 0]
down       [0, 1]
left       [1, 0]
right      [1, 1]
[1, 1] instead and decode the message as "right." (We will only concern ourselves with the case of single errors such as this one. In practice, it is usually assumed that the probability of multiple errors is negligibly small.) Even if the receiver knew (somehow) that a single error had occurred, he or she would not know whether the correct code vector was [0, 1] or [1, 0]. But suppose we sent the message using a code that was a subset of Z2^3, in other words, a binary code of length 3, as shown in Table 1.5.
Table 1.5

Message    Code
up         [0, 0, 0]
down       [0, 1, 1]
left       [1, 0, 1]
right      [1, 1, 0]
This code can detect any single error. For example, if "down" was sent as [0, 1, 1] and an error occurred in one component, the receiver would read either [1, 1, 1] or [0, 0, 1] or [0, 1, 0], none of which is a code vector. So the receiver would know that an error had occurred (but not where) and could ask that the encoded message be retransmitted. (Why wouldn't the receiver know where the error was?)
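The detection property just described can be checked exhaustively by machine. A small Python sketch (ours, not part of the text) flips each component of each code vector in Table 1.5 and confirms that the result is never a code vector:

```python
# The length-3 code of Table 1.5 (message -> code vector).
code = {"up": (0, 0, 0), "down": (0, 1, 1), "left": (1, 0, 1), "right": (1, 1, 0)}
codewords = set(code.values())

def flip(v, i):
    """Introduce a single error in component i."""
    w = list(v)
    w[i] ^= 1
    return tuple(w)

# Any single error turns a code vector into a non-code vector, so it is detected:
assert all(flip(v, i) not in codewords for v in codewords for i in range(3))
print("every single error is detected")
```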
The term parity comes from the Latin word par, meaning "equal" or "even." Two integers are said to have the same parity if they are both even or both odd.
The code in Table 1.5 is an example of an error-detecting code. Until the 1940s, this was the best that could be achieved. The advent of digital computers led to the development of codes that could correct as well as detect errors. We will consider these in Chapters 3, 6, and 7. The message to be transmitted may itself consist of binary vectors. In this case, a simple but useful error-detecting code is a parity check code, which is created by appending an extra component, called a check digit, to each vector so that the parity (the total number of 1s) is even.
Example 1.31
If the message to be sent is the binary vector [1, 0, 0, 1, 0, 1], which has an odd number of 1s, then the check digit will be 1 (in order to make the total number of 1s in the code vector even) and the code vector will be [1, 0, 0, 1, 0, 1, 1]. Note that a single error will be detected, since it will cause the parity of the code vector to change from even to odd. For example, if an error occurred in the third component, the code vector would be received as [1, 0, 1, 1, 0, 1, 1], whose parity is odd because it has five 1s.
Let's look at this concept a bit more formally. Suppose the message is the binary vector b = [b1, b2, ..., bn] in Z2^n. Then the parity check code vector is v = [b1, b2, ..., bn, d] in Z2^(n+1), where the check digit d is chosen so that

b1 + b2 + ... + bn + d = 0 in Z2

or, equivalently, so that

1·v = 0

where 1 = [1, 1, ..., 1], a vector whose every component is 1. The vector 1 is called a check vector. If vector v' is received and 1·v' = 1, then we can be certain that
an error has occurred. (Although we are not considering the possibility of more than one error, observe that this scheme will not detect an even number of errors.) Parity check codes are a special case of the more general check digit codes, which we will consider after first extending the foregoing ideas to more general settings.
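The parity check construction can be sketched in a few lines of Python (our illustration, not the text's):

```python
def parity_encode(b):
    """Append a check digit so the code vector has an even number of 1s."""
    return b + [sum(b) % 2]

def parity_ok(v):
    """1 . v = 0 in Z_2 exactly when the number of 1s is even."""
    return sum(v) % 2 == 0

v = parity_encode([1, 0, 0, 1, 0, 1])
print(v)             # [1, 0, 0, 1, 0, 1, 1], as in Example 1.31
corrupted = v[:]
corrupted[2] ^= 1    # a single error in the third component
print(parity_ok(v), parity_ok(corrupted))  # True False
```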
Modular Arithmetic

It is possible to generalize what we have just done for binary vectors to vectors whose components are taken from a finite set {0, 1, 2, ..., k} for k ≥ 2. To do so, we must first extend the idea of binary arithmetic.
Example 1.32
The integers modulo 3 consist of the set Z3 = {0, 1, 2} with addition and multiplication as given below:

+ | 0  1  2        · | 0  1  2
--+---------       --+---------
0 | 0  1  2        0 | 0  0  0
1 | 1  2  0        1 | 0  1  2
2 | 2  0  1        2 | 0  2  1
Observe that the result of each addition and multiplication belongs to the set {0, 1, 2}; we say that Z3 is closed with respect to the operations of addition and multiplication. It is perhaps easiest to think of this set in terms of a three-hour clock with 0, 1, and 2 on its face, as shown in Figure 1.69. The calculation 1 + 2 = 0 translates as follows: 2 hours after 1 o'clock, it is 0 o'clock. Just as 24:00 and 12:00 are the same on a 12-hour clock, so 3 and 0 are equivalent on this 3-hour clock. Likewise, all multiples of 3, positive and negative, are equivalent to 0 here; 1 is equivalent to any number that is 1 more than a multiple of 3 (such as -2, 4, and 7); and 2 is equivalent to any number that is 2 more than a multiple of 3 (such as -1, 5, and 8). We can visualize the number line as wrapping around a circle, as shown in Figure 1.70.
Figure 1.69  Arithmetic modulo 3
Example 1.33
Figure 1.70
To what is 3548 equivalent in Z3?

Solution  This is the same as asking where 3548 lies on our 3-hour clock. The key is to calculate how far this number is from the nearest (smaller) multiple of 3; that is,
we need to know the remainder when 3548 is divided by 3. By long division, we find that 3548 = 3·1182 + 2, so the remainder is 2. Therefore, 3548 is equivalent to 2 in Z3.
In courses in abstract algebra and number theory, which explore this concept in greater detail, the above equivalence is often written as 3548 = 2 (mod 3) or 3548 ≡ 2 (mod 3), where ≡ is read "is congruent to." We will not use this notation or terminology here.
Example 1.34
In Z3, calculate 2 + 2 + 1 + 2.

Solution 1  We use the same ideas as in Example 1.33. The ordinary sum is 2 + 2 + 1 + 2 = 7, which is 1 more than 6, so division by 3 leaves a remainder of 1. Thus, 2 + 2 + 1 + 2 = 1 in Z3.
Solution 2
A better way to perform this calculation is to do it step by step entirely in Z3.
2 + 2 + 1 + 2 = (2 + 2) + 1 + 2
             = 1 + 1 + 2
             = (1 + 1) + 2
             = 2 + 2
             = 1
Here we have used parentheses to group the terms we have chosen to combine. We could speed things up by simultaneously combining the first two and the last two terms:
(2 + 2) + (1 + 2) = 1 + 0 = 1
Repeated multiplication can be handled similarly. The idea is to use the addition and multiplication tables to reduce the result of each calculation to 0, 1, or 2. Extending these ideas to vectors is straightforward.
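In code, reduction modulo m is a single operator. A brief Python illustration (ours) of Examples 1.33 and 1.34:

```python
# In Python, a % m reduces a nonnegative integer to its representative
# in Z_m = {0, 1, ..., m-1}.
print(3548 % 3)              # 2, as in Example 1.33
print((2 + 2 + 1 + 2) % 3)   # 1, as in Example 1.34

# Reducing step by step after each operation gives the same result:
total = 0
for term in (2, 2, 1, 2):
    total = (total + term) % 3
print(total)                 # 1
```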
Example 1.35

In Z3^5, let u = [2, 2, 0, 1, 2] and v = [1, 2, 2, 2, 1]. Then

u·v = 2·1 + 2·2 + 0·2 + 1·2 + 2·1 = 2 + 1 + 0 + 2 + 2 = 1

Vectors in Z3^5 are referred to as ternary vectors of length 5.
Figure 1.71  Arithmetic modulo m
In general, we have the set Zm = {0, 1, 2, ..., m - 1} of integers modulo m (corresponding to an m-hour clock, as shown in Figure 1.71) and m-ary vectors of length n, denoted by Zm^n. Codes using m-ary vectors are called m-ary codes. The next example is a direct extension of Example 1.31 to ternary codes.
Example 1.36
Let b = [b1, b2, ..., bn] be a vector in Z3^n. Then a check digit code vector may be defined by v = [b1, b2, ..., bn, d] (in Z3^(n+1)), with the check digit d chosen so that 1·v = 0 (where the check vector 1 = [1, 1, ..., 1] is the vector of 1s in Z3^(n+1)); that is, the check digit satisfies

b1 + b2 + ... + bn + d = 0 in Z3

For example, consider the vector u = [2, 2, 0, 1, 2] from Example 1.35. The sum of its components is 2 + 2 + 0 + 1 + 2 = 1, so the check digit must be 2 (since 1 + 2 = 0). Therefore, the associated code vector is v = [2, 2, 0, 1, 2, 2].
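The check digit computation generalizes to any modulus. A Python sketch (ours; the function name is our own) reproducing Example 1.36:

```python
def check_digit(b, m):
    """Check digit d with b1 + ... + bn + d = 0 in Z_m (check vector 1)."""
    return (-sum(b)) % m

u = [2, 2, 0, 1, 2]
d = check_digit(u, 3)
print(d)          # 2, as in Example 1.36
print(u + [d])    # [2, 2, 0, 1, 2, 2]
```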
While simple check digit codes will detect single errors, it is often important to catch other common types of errors as well, such as the accidental interchange, or transposition, of two adjacent components. (For example, transposing the second and third components of v in Example 1.36 would result in the incorrect vector v' = [2, 0, 2, 1, 2, 2].) For such purposes, other types of check digit codes have been designed. Many of these simply replace the check vector 1 by some other carefully chosen vector c.
Example 1.37
Figure 1.72  A Universal Product Code
The Universal Product Code, or UPC (Figure 1.72), is a code associated with the bar codes found on many types of merchandise. The black and white bars that are scanned by a laser at a store's checkout counter correspond to a 10-ary vector u = [u1, u2, ..., u11, d] of length 12. The first 11 components form a vector in Z10^11 that gives manufacturer and product information; the last component d is a check digit chosen so that c·u = 0 in Z10, where the check vector c is the vector [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1]. That is, after rearranging,

3(u1 + u3 + u5 + u7 + u9 + u11) + (u2 + u4 + u6 + u8 + u10) + d = 0

where d is the check digit. In other words, the check digit is chosen so that the left-hand side of this expression is a multiple of 10. For the UPC shown in Figure 1.72, we can determine that the check digit is 6, performing all calculations in Z10:
c·u = 3·0 + 7 + 3·4 + 9 + 3·2 + 7 + 3·0 + 2 + 3·0 + 9 + 3·4 + d
    = 3(0 + 4 + 2 + 0 + 0 + 4) + (7 + 9 + 7 + 2 + 9) + d
    = 3(0) + 4 + d
    = 4 + d

The check digit d must be 6 to make the result of the calculation 0 in Z10. (Another way to think of the check digit in this example is that it is chosen so that c·u will be a multiple of 10.)

The Universal Product Code will detect all single errors and most transposition errors in adjacent components. To see this last point, suppose that the UPC in
Example 1.37 were incorrectly written as u' = [0, 7, 4, 2, 9, 7, 0, 2, 0, 9, 4, 6], with the fourth and fifth components transposed. When we applied the check vector, we would have c·u' = 4 ≠ 0 (verify this!), alerting us to the fact that there had been an error. (See Exercises 48 and 49.)
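The UPC rule c·u = 0 in Z10 is straightforward to implement. A Python sketch (ours, not the text's) that validates the correct UPC from Example 1.37 and rejects the transposed version:

```python
UPC_CHECK = [3, 1] * 6  # the check vector c = [3,1,3,1,3,1,3,1,3,1,3,1]

def upc_valid(u):
    """A 12-digit UPC u is valid when c . u = 0 in Z_10."""
    return sum(c * x for c, x in zip(UPC_CHECK, u)) % 10 == 0

good = [0, 7, 4, 9, 2, 7, 0, 2, 0, 9, 4, 6]
bad  = [0, 7, 4, 2, 9, 7, 0, 2, 0, 9, 4, 6]  # fourth and fifth digits transposed
print(upc_valid(good), upc_valid(bad))  # True False
```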
The International Standard Book Number (ISBN) code is another widely used check digit code. It is designed to detect more types of errors than the Universal Product Code and, consequently, is slightly more complicated. Yet the basic principle is the same. The code vector is a vector in Z11^10. The first nine components give country, publisher, and book information; the tenth component is the check digit. The ISBN for the book Calculus: Concepts and Contexts by James Stewart is 0-534-34450-X. It is recorded as the vector
b = [0, 5, 3, 4, 3, 4, 4, 5, 0, X]

where the check "digit" is the letter X. For the ISBN code, the check vector is the vector c = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1], and we require that c·b = 0 in Z11. Let's determine the check digit for the vector b in this example. We must compute

c·b = 10·0 + 9·5 + 8·3 + 7·4 + 6·3 + 5·4 + 4·4 + 3·5 + 2·0 + d

where d is the check digit. We begin by performing all of the multiplications in Z11. (For example, 9·5 = 1, since 45 is 1 more than the closest smaller multiple of 11, namely, 44. On an 11-hour clock, 45 o'clock is 1 o'clock.) The simplified sum is

0 + 1 + 2 + 6 + 7 + 9 + 5 + 4 + 0 + d

and adding in Z11 leaves us with 1 + d. The check digit d must now be chosen so that the final result is 0; therefore, in Z11, d = 10. (Equivalently, d must be chosen so that c·b will be a multiple of 11.) But since it is preferable that each component of an ISBN be a single digit, the Roman numeral X is used for 10 whenever it occurs as a check digit, as it does here.
The ISBN code will detect all single errors and adjacent transposition errors (see Exercises 52-54).
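A corresponding Python sketch (ours) for the ISBN-10 check digit, reproducing the Stewart example above:

```python
def isbn_check_digit(b9):
    """Check digit for a 9-digit ISBN body: apply c = [10, 9, ..., 2], then
    choose d so that c . b = 0 in Z_11; 10 is written as the Roman numeral X."""
    s = sum(c * x for c, x in zip(range(10, 1, -1), b9))
    d = (-s) % 11
    return "X" if d == 10 else str(d)

print(isbn_check_digit([0, 5, 3, 4, 3, 4, 4, 5, 0]))  # 'X', as for the Stewart ISBN
```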
In Exercises 1-4, u and v are binary vectors. Find u + v and u·v in each case.

1. u = [?], v = [?]
2. u = [?], v = [?]
3. u = [1, 0, 1, 1], v = [1, 1, 1, 1]
4. u = [1, 1, 0, 1, 0], v = [0, 1, 1, 1, 0]
5. Write out the addition and multiplication tables for Z4.
6. Write out the addition and multiplication tables for Z5.

In Exercises 7-19, perform the indicated calculations.

7. 2 + 2 + 2 in Z3
8. 2·2·2 in Z3
9. 2(2 + 1 + 2) in Z3
10. 3 + 1 + 2 + 3 in Z4
11. 2·3·2 in Z4
12. 3(3 + 3 + 2) in Z4
13. 2 + 1 + 2 + 2 + 1 in Z3, Z4, and Z5
14. (3 + 4)(3 + 2 + 4 + 2) in Z5
15. 8(6 + 4 + 3) in Z9
16. 2^100 in Z11
17. [2, 1, 2] + [2, 0, 1] in Z3^3
18. [2, 1, 2]·[2, 2, 1] in Z3^3
19. [2, 0, 3, 2]·([3, 1, 1, 2] + [3, 3, 2, 1]) in Z4^4 and in Z5^4

In Exercises 20-31, solve the given equation or indicate that there is no solution.

20. x + 3 = 2 in Z5
21. x + 5 = 1 in Z6
22. 2x = 1 in Z3
23. 2x = 1 in Z4
24. 2x = 1 in Z5
25. 3x = 4 in Z5
26. 3x = 4 in Z6
27. 6x = 5 in Z8
28. 8x = 9 in Z11
29. 2x + 3 = 2 in Z5
30. 4x + 5 = 2 in Z6
31. 6x + 3 = 1 in Z8

32. (a) For which values of a does x + a = 0 have a solution in Z5?
(b) For which values of a and b does x + a = b have a solution in Z6?
(c) For which values of a, b, and m does x + a = b have a solution in Zm?

33. (a) For which values of a does ax = 1 have a solution in Z5?
(b) For which values of a does ax = 1 have a solution in Z6?
(c) For which values of a and m does ax = 1 have a solution in Zm?

In Exercises 34 and 35, find the parity check code vector for the binary vector u.

34. u = [1, 0, 1, 1]
35. u = [1, 1, 0, 1, 1]

In Exercises 36-39, a parity check code vector v is given. Determine whether a single error could have occurred in the transmission of v.

36. v = [1, 0, 1, 0]
37. v = [1, 1, 1, 0, 1, 1]
38. v = [0, 1, 0, 1, 1, 1]
39. v = [1, 1, 0, 1, 0, 1, 1, 1]

Exercises 40-43 refer to check digit codes in which the check vector c is the vector 1 = [1, 1, ..., 1] of the appropriate length. In each case, find the check digit d that would be appended to the vector u.

40. u = [1, 2, 2, 2] in Z3^4
41. u = [3, 4, 2, 3] in Z5^4
42. u = [1, 5, 6, 4, 5] in Z7^5
43. u = [3, 0, 7, 5, 6, 8] in Z9^6

44. Prove that for any positive integers m and n, the check digit code in Zm^n with check vector c = 1 = [1, 1, ..., 1] will detect all single errors. (That is, prove that if vectors u and v in Zm^n differ in exactly one entry, then c·u ≠ c·v.)

In Exercises 45 and 46, find the check digit d in the given Universal Product Code.

45. [0, 5, 9, 4, 6, 4, 7, 0, 0, 2, 7, d]
46. [0, 1, 4, 0, 1, 4, 1, 8, 4, 1, 2, d]

47. Consider the UPC [0, 4, 6, 9, 5, 6, 1, 8, 2, 0, 1, 5].
(a) Show that this UPC cannot be correct.
(b) Assuming that a single error was made and that the incorrect digit is the 6 in the third entry, find the correct UPC.

48. Prove that the Universal Product Code will detect all single errors.

49. (a) Prove that if a transposition error is made in the second and third entries of the UPC [0, 7, 4, 9, 2, 7, 0, 2, 0, 9, 4, 6], the error will be detected.
(b) Show that there is a transposition involving two adjacent entries of the UPC in part (a) that would not be detected.
(c) In general, when will the Universal Product Code not detect a transposition error involving two adjacent entries?

In Exercises 50 and 51, find the check digit d in the given International Standard Book Number.

50. [0, 3, 8, 7, 9, 7, 9, 9, 3, d]
51. [0, 3, 9, 4, 7, 5, 6, 8, 2, d]

52. Consider the ISBN [0, 4, 4, 9, 5, 0, 8, 3, 5, 6].
(a) Show that this ISBN cannot be correct.
(b) Assuming that a single error was made and that the incorrect digit is the 5 in the fifth entry, find the correct ISBN.

53. (a) Prove that if a transposition error is made in the fourth and fifth entries of the ISBN [0, 6, 7, 9, 7, 6, 2, 9, 0, 6], the error will be detected.
(b) Prove that if a transposition error is made in any two adjacent entries of the ISBN in part (a), the error will be detected.
(c) Prove, in general, that the ISBN code will always detect a transposition error involving two adjacent entries.

54. Consider the ISBN [0, 8, 3, 7, 0, 9, 9, 0, 2, 6].
(a) Show that this ISBN cannot be correct.
(b) Assuming that the error was a transposition error involving two adjacent entries, find the correct ISBN.
(c) Give an example of an ISBN for which a transposition error involving two adjacent entries will be detected but will not be correctable.
The Codabar System

Every credit card number carries a check digit computed by the Codabar system. Suppose that the first 15 digits of your card are

5412 3456 7890 432

and that the check digit is d. This corresponds to the vector

x = [5, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 4, 3, 2, d] in Z10^16

The Codabar system uses the check vector c = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1], but instead of requiring that c·x = 0 in Z10, an extra calculation is added to increase the error-detecting capability of the code. Let h count the number of digits in odd positions that are greater than 4. In this example, these digits are 5, 5, 7, and 9, so h = 4. It is now required that c·x + h = 0 in Z10. Thus, in the example, we have, rearranging and working modulo 10,

c·x + h = (2·5 + 4 + 2·1 + 2 + 2·3 + 4 + 2·5 + 6 + 2·7 + 8 + 2·9 + 0 + 2·4 + 3 + 2·2 + d) + 4
        = 2(5 + 1 + 3 + 5 + 7 + 9 + 4 + 2) + (4 + 2 + 4 + 6 + 8 + 0 + 3 + d) + 4
        = 2(6) + 7 + d + 4
        = 3 + d

Thus, the check digit d for this card must be 7, so the result of the calculation is 0 in Z10.

The Codabar system is one of the most efficient error-detecting methods. It will detect all single-digit errors and most other common errors such as adjacent transposition errors.
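The Codabar calculation can also be automated. A Python sketch (ours, not the text's) of the rule c·x + h = 0 in Z10:

```python
CODABAR_CHECK = [2, 1] * 8  # c = [2,1,2,1,...,2,1] for a 16-digit card number

def codabar_ok(x):
    """c . x + h = 0 in Z_10, where h counts odd-position digits greater than 4."""
    h = sum(1 for d in x[0::2] if d > 4)  # x[0::2] are the odd positions (1st, 3rd, ...)
    return (sum(c * d for c, d in zip(CODABAR_CHECK, x)) + h) % 10 == 0

card = [5, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 4, 3, 2, 7]  # check digit 7, as above
print(codabar_ok(card))  # True
```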
Key Definitions and Concepts

algebraic properties of vectors, 10
angle between vectors, 21
Cauchy-Schwarz Inequality, 19
check digit code, 50
code vector, 48
cross product, 45
direction vector, 32
distance between vectors, 20
dot product, 15
equation of a line, 33
equation of a plane, 35-36
error-detecting code, 49
head-to-tail rule, 6
integers modulo m (Zm), 51
International Standard Book Number, 53
length (norm) of a vector, 17
linear combination of vectors, 12
normal vector, 31, 35
orthogonal vectors, 23
parallel vectors, 8
parallelogram rule, 6
projection of a vector onto a vector, 24
Pythagoras' Theorem, 23
scalar multiplication, 7
standard unit vectors, 19
Triangle Inequality, 19
unit vector, 18
Universal Product Code, 52
vector, 3
vector addition, 5
zero vector, 4
Review Questions

1. Mark each of the following statements true or false:
(a) For vectors u, v, and w in R^n, if u + w = v + w, then u = v.
(b) For vectors u, v, and w in R^n, if u·w = v·w, then u = v.
(c) For vectors u, v, and w in R^3, if u is orthogonal to v, and v is orthogonal to w, then u is orthogonal to w.
(d) In R^3, if a line l is parallel to a plane P, then a direction vector d for l is parallel to a normal vector n for P.
(e) In R^3, if a line l is perpendicular to a plane P, then a direction vector d for l is parallel to a normal vector n for P.
(f) In R^3, if two planes are not parallel, then they must intersect in a line.
(g) In R^3, if two lines are not parallel, then they must intersect in a point.
(h) If v is a binary vector such that v·v = 0, then v = 0.
(i) If at most one error has occurred, then the UPC [0, 4, 1, 7, 7, 1, 5, 2, 7, 0, 8, 2] is correct.
(j) If at most one error has occurred, then the ISBN [0, 5, 3, 2, 3, 4, 1, 7, 4, 8] is correct.

2. If u = [?], v = [?], and the vector 4u + v is drawn with its tail at the point (10, -10), find the coordinates of the point at the head of 4u + v.

3. If u = [?], v = [?], and 2x + u = 3(x - v), solve for x.

4. Let A, B, C, and D be the vertices of a square centered at the origin O, labeled in clockwise order. If a = OA and b = OB, find BC in terms of a and b.

5. Find the angle between the vectors [1, 1, 2] and [2, 1, 1].

6. Find the projection of v = [1, 1, 1] onto u = [1, -2, 2].

7. Find a unit vector in the xy-plane that is orthogonal to [1, 2, 3].

8. Find the general equation of the plane through the point (1, 1, 1) that is perpendicular to the line with parametric equations

x = 2 - t
y = 3 + 2t
z = -1 + t

9. Find the general equation of the plane through the point (3, 2, 5) that is parallel to the plane whose general equation is 2x + 3y - z = 0.

10. Find the general equation of the plane through the points A(1, 1, 0), B(1, 0, 1), and C(0, 1, 2).

11. Find the area of the triangle with vertices A(1, 1, 0), B(1, 0, 1), and C(0, 1, 2).

12. Find the midpoint of the line segment between A = (5, 1, -2) and B = (3, 7, 0).

13. Why are there no vectors u and v in R^n such that ‖u‖ = 2, ‖v‖ = 3, and u·v = 7?

14. Find the distance from the point (3, 2, 5) to the plane whose general equation is 2x + 3y - z = 0.

15. Find the distance from the point (3, 2, 5) to the line with parametric equations x = 1, y = 1 + t, z = 2 + t.

16. Compute 3 - (2 + 4)³(4 + 3)² in Z5.

17. If possible, solve 3(x + 2) = 5 in Z7.

18. If possible, solve 3(x + 2) = 5 in Z9.

19. Find the check digit d in the UPC [7, 3, 3, 9, 6, 1, 7, 0, 3, 1, 7, d].

20. Find the check digit d in the ISBN [0, 7, 6, 7, 0, 3, 9, 4, ?, d].
Chapter 2  Systems of Linear Equations

The world was full of equations. ... There must be ...

Anne Tyler, The Accidental Tourist, Alfred A. Knopf, 1985, p. 235
2.0 Introduction: Triviality

The word trivial is derived from the Latin root tri ("three") and the Latin word via ("road"). Thus, speaking literally, a triviality is a place where three roads meet. This common meeting point gives rise to the other, more familiar meaning of trivial: commonplace, ordinary, or insignificant. In medieval universities, the trivium consisted of the three "common" subjects (grammar, rhetoric, and logic) that were taught before the quadrivium (arithmetic, geometry, music, and astronomy). The "three roads" that made up the trivium were the beginning of the liberal arts.

In this section, we begin to examine systems of linear equations. The same system of equations can be viewed in three different, yet equally important, ways; these will be our three roads, all leading to the same solution. You will need to get used to this threefold way of viewing systems of linear equations, so that it becomes commonplace (trivial!) for you. The system of equations we are going to consider is
2x + y = 8
x - 3y = -3
Problem 1  Draw the two lines represented by these equations. What is their point of intersection?

Problem 2  Consider the vectors u = [2, 1] and v = [1, -3]. Draw the coordinate grid determined by u and v. (Hint: Lightly draw the standard coordinate grid first and use it as an aid in drawing the new one.)
Problem 3  On the uv-grid, find the coordinates of w = [8, -3].
Problem 4  Another way to state Problem 3 is to ask for the coefficients x and y for which xu + yv = w. Write out the two equations to which this vector equation is equivalent (one for each component). What do you observe?

Problem 5  Return now to the lines you drew for Problem 1. We will refer to the line whose equation is 2x + y = 8 as line 1 and the line whose equation is x - 3y = -3 as line 2. Plot the point (0, 0) on your graph from Problem 1 and label it P0. Draw a
horizontal line segment from P0 to line 1 and label this new point P1. Next draw a vertical line segment from P1 to line 2 and label this point P2. Now draw a horizontal line segment from P2 to line 1, obtaining point P3. Continue in this fashion, drawing vertical segments to line 2 followed by horizontal segments to line 1. What appears to be happening?

Problem 6  Using a calculator with two-decimal-place accuracy, find the (approximate) coordinates of the points P1, P2, P3, ..., P6. (You will find it helpful to first solve the first equation for x in terms of y and the second equation for y in terms of x.) Record your results in Table 2.1, writing the x- and y-coordinates of each point separately.

Table 2.1

Point    x    y
P0       0    0
P1
P2
P3
P4
P5
P6

The results of these problems show that the task of "solving" a system of linear equations may be viewed in several ways. Repeat the process described in the problems with the following systems of equations:
(a) 4x - 2y = 0        (b) 3x + 2y = 9        (c) x + y = 5        (d) x + 2y = 4
    x + 2y = 5             x + 3y = 10            x - y = 3            2x - y = 3
Are all of your observations from Problems 1-6 still valid for these examples? Note any similarities or differences. In this chapter, we will explore these ideas in more detail.
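The staircase construction of Problems 5 and 6 is easy to simulate. The Python sketch below (ours; it assumes, as in the problems, that horizontal moves land on line 1 and vertical moves land on line 2) computes P1 through P6 to two decimal places:

```python
# Starting at P0 = (0, 0): horizontal moves land on line 1 (2x + y = 8),
# vertical moves land on line 2 (x - 3y = -3).
x, y = 0.0, 0.0
points = []
for k in range(1, 7):
    if k % 2 == 1:
        x = (8 - y) / 2      # horizontal step to line 1: solve 2x + y = 8 for x
    else:
        y = (x + 3) / 3      # vertical step to line 2: solve x - 3y = -3 for y
    points.append((round(x, 2), round(y, 2)))
print(points)  # the points approach (3, 2), the intersection of the two lines
```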
2.1 Introduction to Systems of Linear Equations

Recall that the general equation of a line in R^2 is of the form

ax + by = c

and that the general equation of a plane in R^3 is of the form
ax + by + cz = d

Equations of this form are called linear equations.
Definition

A linear equation in the n variables x1, x2, ..., xn is an equation that can be written in the form

a1x1 + a2x2 + ... + anxn = b

where the coefficients a1, a2, ..., an and the constant term b are constants.
Example 2.1
The following equations are linear:

3x - 4y = -1
√2 x + π y - (sin π/5) z = 1
x1 + 5x2 = 3 - x3 + 2x4
3.2x1 - 0.01x3 = 4.6
Observe that the third equation is linear because it can be rewritten in the form x1 + 5x2 + x3 - 2x4 = 3. It is also important to note that, although in these examples (and in most applications) the coefficients and constant terms are real numbers, in some examples and applications they will be complex numbers or members of Zp for some prime number p.
The following equations are not linear:

x/y + z = 2        xy + 2z = 1
Thus, linear equations do not contain products, reciprocals, or other functions of the variables; the variables occur only to the first power and are multiplied only by constants. Pay particular attention to the fourth example in each list: Why is it that the fourth equation in the first list is linear but the fourth equation in the second list is not?

A solution of a linear equation a1x1 + a2x2 + ... + anxn = b is a vector [s1, s2, ..., sn] whose components satisfy the equation when we substitute x1 = s1, x2 = s2, ..., xn = sn.

Example 2.2
(a) [5, 4] is a solution of 3x - 4y = -1 because, when we substitute x = 5 and y = 4, the equation is satisfied: 3(5) - 4(4) = -1. [1, 1] is another solution. In general, the solutions simply correspond to the points on the line determined by the given equation. Thus, setting x = t and solving for y, we see that the complete set of solutions can be written in the parametric form [t, ¼ + ¾t]. (We could also set y equal to some parameter, say, s, and solve for x instead; the two parametric solutions would look different but would be equivalent. Try this.)

(b) The linear equation x1 - x2 + 2x3 = 3 has [3, 0, 0], [0, 1, 2], and [6, 1, -1] as specific solutions. The complete set of solutions corresponds to the set of points in the plane determined by the given equation. If we set x2 = s and x3 = t, then a parametric solution is given by [3 + s - 2t, s, t]. (Which values of s and t produce the three specific solutions above?)
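The parametric solution in part (b) can be spot-checked by machine. A small Python sketch (ours):

```python
# The parametric solution [3 + s - 2t, s, t] of x1 - x2 + 2x3 = 3 from
# Example 2.2(b), checked for a grid of parameter values:
for s in range(-2, 3):
    for t in range(-2, 3):
        x1, x2, x3 = 3 + s - 2 * t, s, t
        assert x1 - x2 + 2 * x3 == 3
print("all parameter choices satisfy the equation")
```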
A system of linear equations is a finite set of linear equations, each with the same variables. A solution of a system of linear equations is a vector that is simultaneously a solution of each equation in the system. The solution set of a system of linear equations is the set of all solutions of the system. We will refer to the process of finding the solution set of a system of linear equations as "solving the system."
Example 2.3
The system
2x −  y = 3
 x + 3y = 5
has [2, 1] as a solution, since it is a solution of both equations. On the other hand, [1, −1] is not a solution of the system, since it satisfies only the first equation.
Example 2.4
Solve the following systems of linear equations:
(a) x − y = 1        (b)  x −  y = 2        (c) x − y = 1
    x + y = 3            2x − 2y = 4            x − y = 3
Section 2.1 Introduction to Systems of Linear Equations
Solution (a) Adding the two equations together gives 2x = 4, so x = 2, from which we find that y = 1. A quick check confirms that [2, 1] is indeed a solution of both equations. That this is the only solution can be seen by observing that this solution corresponds to the (unique) point of intersection (2, 1) of the lines with equations x − y = 1 and x + y = 3, as shown in Figure 2.1(a). Thus, [2, 1] is a unique solution.
(b) The second equation in this system is just twice the first, so the solutions are the solutions of the first equation alone, namely, the points on the line x − y = 2. These can be represented parametrically as [2 + t, t]. Thus, this system has infinitely many solutions [Figure 2.1(b)].
(c) Two numbers x and y cannot simultaneously have a difference of 1 and 3. Hence, this system has no solutions. (A more algebraic approach might be to subtract the second equation from the first, yielding the equally absurd conclusion 0 = −2.) As Figure 2.1(c) shows, the lines for the equations are parallel in this case.
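The three possibilities can also be detected from the coefficients of a 2 × 2 system, using the quantity a₁b₂ − a₂b₁. The following is a hedged sketch (the function `classify` and its return labels are ours; it assumes each equation genuinely describes a line):

```python
def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Returns "unique", "infinite", or "none", mirroring the three
    possibilities illustrated in Example 2.4. Assumes each equation
    has at least one nonzero coefficient.
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        return "unique"            # lines intersect in one point
    # det == 0: the lines are parallel or identical
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinite"          # same line
    return "none"                  # distinct parallel lines

print(classify(1, -1, 1, 1, 1, 3))    # system (a): unique
print(classify(1, -1, 2, 2, -2, 4))   # system (b): infinite
print(classify(1, -1, 1, 1, -1, 3))   # system (c): none
```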
Figure 2.1

A system of linear equations is called consistent if it has at least one solution. A system with no solutions is called inconsistent. Even though they are small, the three systems in Example 2.4 illustrate the only three possibilities for the number of solutions of a system of linear equations with real coefficients. We will prove later that these same three possibilities hold for any system of linear equations over the real numbers.
A system of linear equations with real coefficients has either
(a) a unique solution (a consistent system) or
(b) infinitely many solutions (a consistent system) or
(c) no solutions (an inconsistent system).
Solving a System of Linear Equations

Two linear systems are called equivalent if they have the same solution set. For example,
x − y = 1                 x − y = 1
x + y = 3      and            y = 1
are equivalent, since they both have the unique solution [2, 1]. (Check this.)
Our approach to solving a system of linear equations is to transform the given system into an equivalent one that is easier to solve. The triangular pattern of the second example above (in which the second equation has one less variable than the first) is what we will aim for.
Example 2.5

Solve the system

x − y −  z =  2
     y + 3z =  5
         5z = 10
Solution Starting from the last equation and working backward, we find successively that z = 2, y = 5 − 3(2) = −1, and x = 2 + (−1) + 2 = 3. So the unique solution is [3, −1, 2].
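The working-backward procedure of Example 2.5 translates directly into code. The following is a sketch of back substitution for an upper triangular system with nonzero diagonal, using exact rational arithmetic (`back_substitute` is our own name):

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve U x = b for an n x n upper triangular U with nonzero diagonal,
    working from the last equation upward as in Example 2.5."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(b[i]) - s) / U[i][i]
    return x

# The triangular system of Example 2.5:
#   x - y -  z = 2
#       y + 3z = 5
#           5z = 10
U = [[1, -1, -1],
     [0,  1,  3],
     [0,  0,  5]]
b = [2, 5, 10]
print(back_substitute(U, b))   # -> [Fraction(3, 1), Fraction(-1, 1), Fraction(2, 1)]
```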
The procedure used to solve Example 2.5 is called back substitution. We now turn to the general strategy for transforming a given system into an equivalent one that can be solved easily by back substitution. This process will be described in greater detail in the next section; for now, we will simply observe it in action in a single example.
Example 2.6
Solve the system
 x −  y −  z =  2
3x − 3y + 2z = 16
2x −  y +  z =  9

Solution To transform this system into one that exhibits the triangular structure of Example 2.5, we first need to eliminate the variable x from equations 2 and 3.
The word matrix is derived from the Latin word mater, meaning "mother." When the suffix -ix is added, the meaning becomes "womb." Just as a womb surrounds a fetus, the brackets of a matrix surround its entries, and just as the womb gives rise to a baby, a matrix gives rise to certain types of functions called linear transformations. A matrix with m rows and n columns is called an m × n matrix (pronounced "m by n"). The plural of matrix is matrices, not "matrixes."
At the same time, we will perform the corresponding operations on the augmented matrix of the system,

[ 1  −1  −1 |  2 ]
[ 3  −3   2 | 16 ]
[ 2  −1   1 |  9 ]

where the first three columns contain the coefficients of the variables in order and the final column contains the constant terms.

 x −  y −  z =  2            [ 1  −1  −1 |  2 ]
3x − 3y + 2z = 16            [ 3  −3   2 | 16 ]
2x −  y +  z =  9            [ 2  −1   1 |  9 ]
Subtract 3 times the first equation from the second equation (subtract 3 times the first row from the second row):

 x − y − z =  2              [ 1  −1  −1 |  2 ]
        5z = 10              [ 0   0   5 | 10 ]
2x − y + z =  9              [ 2  −1   1 |  9 ]

Subtract 2 times the first equation from the third equation (subtract 2 times the first row from the third row):

x − y − z =  2               [ 1  −1  −1 |  2 ]
       5z = 10               [ 0   0   5 | 10 ]
   y + 3z =  5               [ 0   1   3 |  5 ]

Interchange equations 2 and 3 (interchange rows 2 and 3):

x − y − z =  2               [ 1  −1  −1 |  2 ]
   y + 3z =  5               [ 0   1   3 |  5 ]
       5z = 10               [ 0   0   5 | 10 ]
This is the same system that we solved using back substitution in Example 2.5, where we found that the solution was [3, −1, 2]. This is therefore also the solution to the system given in this example. Why? The calculations above show that any solution of the given system is also a solution of the final one. But since the steps we just performed are reversible, we could recover the original system, starting with the final system. (How?) So any solution of the final system is also a solution of the given one. Thus, the systems are equivalent (as are all of the ones obtained in the intermediate steps above). Moreover, we might just as well work with matrices instead of equations, since it is a simple matter to reinsert the variables before proceeding with the back substitution. (Working with matrices is the subject of the next section.)
Remark
Calculators with matrix capabilities and computer algebra systems can facilitate solving systems of linear equations, particularly when the systems are large or have coefficients that are not "nice," as is often the case in real-life applications. As always, though, you should do as many examples as you can with pencil and paper until you are comfortable with the techniques. Even if a calculator or CAS is called for, think about how you would do the calculations manually before doing anything. After you have an answer, be sure to think about whether it is reasonable. Do not be misled into thinking that technology will always give you the answer faster or more easily than calculating by hand. Sometimes it may not give you the answer at all! Roundoff errors associated with the floating-point arithmetic used by calculators and computers can cause serious problems and lead to wildly wrong answers to some problems. See Exploration: Lies My Computer Told Me for a glimpse of the problem. (You've been warned!)
In Exercises 1–6, determine which equations are linear equations in the variables x, y, and z. If any equation is not linear, explain why not.

1. x − πy + ∛5 z = 0                2. x² + y² + z² = 1
3. x⁻¹ + 7y + z = sin(π/9)          4. 2x − xy − 5z = 0
5. 3 cos x − 4y + z = √3            6. (cos 3)x − 4y + z = √3

In Exercises 7–10, find a linear equation that has the same solution set as the given equation (possibly with some restrictions on the variables).

7. 2x + y = 7 − 3y                  8. x²/x = y
9. 1/x + 1/y = 4/(xy)               10. log₁₀ x − log₁₀ y = 2

In Exercises 11–14, find the solution set of each equation.

12. 2x₁ + 3x₂ = 5                   13. x + 2y + 3z = 4
14. x₁ + 2x₂ + 3x₃ = 5

In Exercises 15–18, draw graphs corresponding to the given linear systems. Determine geometrically whether each system has a unique solution, infinitely many solutions, or no solution. Then solve each system algebraically to confirm your answer.

15.  x + y = 0                      16.  x − 2y = 7
    3x + y = 7                          2x +  y = 3
17. 3x − 6y = 3                     18.  0.10x − 0.05y =  0.20
    −x + 2y = 1                         −0.06x + 0.03y = −0.12

In Exercises 19–24, solve the given system by back substitution.

19. x − 2y = 1                      20. 2u − 3v = 5
         y = 3                               2v = 6
21. x − y + z =  0                  22. −5x₁ + 2x₂ = 0
       2y − z =  1                            4x₂ = 0
           3z = −1
23. x₁ + x₂ − x₃ − x₄ = 1           24. x − 3y + z = 5
         x₂ + x₃ + x₄ = 0                    y − 2z = 1
              x₃ − x₄ = 0
                   x₄ = 1

The systems in Exercises 25 and 26 exhibit a "lower triangular" pattern that makes them easy to solve by forward substitution. (We will encounter forward substitution again in Chapter 3.) Solve these systems.

Find the augmented matrices of the linear systems in Exercises 27–30.

27.  x − y = 0                      28. 2x₁ + 3x₂ −  x₃ = 1
    2x + y = 3                           x₁       +  x₃ = 0
                                        −x₁ + 2x₂ − 2x₃ = 0
29.  x + 5y = −1                    30.  a − 2b      +  d = 2
    −x +  y = −5                        −a +  b − c − 3d = 1
    2x + 4y =  4

In Exercises 31 and 32, find a system of linear equations that has the given matrix as its augmented matrix.

31. [ 0   1  1 | 1 ]                32. [ 1  −1  1  1  0 | 1 ]
    [ 1  −1  0 | 1 ]                    [ 0   2  0  3  1 | 2 ]
    [ 2  −1  1 | 1 ]                    [ 1  −1  4  2  3 | 0 ]

For Exercises 33–38, solve the linear systems in the given exercises.

33. Exercise 27        34. Exercise 28        35. Exercise 29
36. Exercise 30        37. Exercise 31        38. Exercise 32
39. (a) Find a system of two linear equations in the variables x and y whose solution set is given by the parametric equations x = t and y = 3 − 2t.
(b) Find another parametric solution to the system in part (a) in which the parameter is s and y = s.
40. (a) Find a system of two linear equations in the variables x₁, x₂, and x₃ whose solution set is given by the parametric equations x₁ = t, x₂ = 1 + t, and x₃ = 2 − t.
(b) Find another parametric solution to the system in part (a) in which the parameter is s and x₃ = s.
In Exercises 41–44, the systems of equations are nonlinear. Find substitutions (changes of variables) that convert each system into a linear system, and use this linear system to help solve the given system.

41. 2/x + 3/y = 0                   42. x² + 2y² = 6
    3/x + 4/y = 1                       x² −  y² = 3
43. tan x − 2 sin y         =  2    44. 2ᵃ + 2(3ᵇ) = 1
    tan x −   sin y + cos z =  2        3(2ᵃ) − 4(3ᵇ) = 1
              sin y − cos z = −1
Using your calculator or CAS, solve this system, rounding the result of every calculation to five significant digits.
3. Solve the system two more times, rounding first to four significant digits and then to three significant digits. What happens?
4. Clearly, a very small roundoff error (less than or equal to 0.00125) can result in very large errors in the solution. Explain why geometrically. (Think about the graphs of the various linear systems you solved in Problems 1–3.)
Systems such as the one you just worked with are called ill-conditioned. They are extremely sensitive to roundoff errors, and there is not much we can do about it. We will encounter ill-conditioned systems again in Chapters 3 and 7. Here is another example to experiment with:

4.552x + 7.083y = 1.931
1.731x + 2.693y = 2.001

Play around with various numbers of significant digits to see what happens, starting with eight significant digits (if you can).
Direct Methods for Solving Linear Systems

In this section we will look at a general, systematic procedure for solving a system of linear equations. This procedure is based on the idea of reducing the augmented matrix of the given system to a form that can then be solved by back substitution. The method is direct in the sense that it leads directly to the solution (if one exists) in a finite number of steps. In Section 2.5, we will consider some indirect methods that work in a completely different way.
Matrices and Echelon Form

There are two important matrices associated with a linear system. The coefficient matrix contains the coefficients of the variables, and the augmented matrix is the coefficient matrix augmented by an extra column containing the constant terms. For the system
2x +  y −  z = 3
 x       + 5z = 1
−x + 3y − 2z = 0

the coefficient matrix is

[  2  1  −1 ]
[  1  0   5 ]
[ −1  3  −2 ]

and the augmented matrix is

[  2  1  −1 | 3 ]
[  1  0   5 | 1 ]
[ −1  3  −2 | 0 ]
Note that if a variable is missing (as y is in the second equation), its coefficient 0 is entered in the appropriate position in the matrix. If we denote the coefficient matrix of a linear system by A and the column vector of constant terms by b, then the form of the augmented matrix is [A | b].

In solving a linear system, it will not always be possible to reduce the coefficient matrix to triangular form, as we did in Example 2.6. However, we can always achieve a staircase pattern in the nonzero entries of the final matrix.

The word echelon comes from the Latin word scala, meaning "ladder" or "stairs." The French word for "ladder," échelle, is also derived from this Latin base. A matrix in echelon form exhibits a staircase pattern.
Definition A matrix is in row echelon form if it satisfies the following properties:

1. Any rows consisting entirely of zeros are at the bottom.
2. In each nonzero row, the first nonzero entry (called the leading entry) is in a column to the left of any leading entries below it.
Section 2.2 Direct Methods for Solving Linear Systems
Note that these properties guarantee that the leading entries form a staircase pattern. In particular, in any column containing a leading entry, all entries below the leading entry are zero, as the following examples illustrate.
Example 2.7

The following matrices are in row echelon form:

[ 2   4  1 ]     [ 1  0  1 ]     [ 1  1  2  1 ]     [ 0  1  −1  1  2 ]
[ 0  −1  2 ]     [ 0  1  5 ]     [ 0  0  1  3 ]     [ 0  0   0  3  2 ]
[ 0   0  0 ]     [ 0  0  4 ]                        [ 0  0   0  0  5 ]
                                                    [ 0  0   0  0  0 ]
If a matrix in row echelon form is actually the augmented matrix of a linear system, the system is quite easy to solve by back substitution alone.
Example 2.8

Assuming that each of the matrices in Example 2.7 is an augmented matrix, write out the corresponding systems of linear equations and solve them.
Solution We first remind ourselves that the last column in an augmented matrix is the vector of constant terms. The first matrix then corresponds to the system

2x₁ + 4x₂ = 1
      −x₂ = 2

(Notice that we have dropped the last equation 0 = 0, or 0x₁ + 0x₂ = 0, which is clearly satisfied for any values of x₁ and x₂.) Back substitution gives x₂ = −2 and then 2x₁ = 1 − 4(−2) = 9, so x₁ = 9/2. The solution is [9/2, −2].

The second matrix has the corresponding system

x₁ = 1
x₂ = 5

The last equation represents 0x₁ + 0x₂ = 4, which clearly has no solutions. Therefore, the system has no solutions. Similarly, the system corresponding to the fourth matrix has no solutions. For the system corresponding to the third matrix, we have

x₁ + x₂ + 2x₃ = 1
           x₃ = 3

so x₁ = 1 − 2(3) − x₂ = −5 − x₂. There are infinitely many solutions, since we may assign x₂ any value t to get the parametric solution [−5 − t, t, 3].
Elementary Row Operations

We now describe the procedure by which any matrix can be reduced to a matrix in row echelon form. The allowable operations, called elementary row operations,
correspond to the operations that can be performed on a system of linear equations to transform it into an equivalent system.
Definition
The following elementary row operations can be performed on a matrix:

1. Interchange two rows.
2. Multiply a row by a nonzero constant.
3. Add a multiple of a row to another row.
Observe that dividing a row by a nonzero constant is implied in the above definition, since, for example, dividing a row by 2 is the same as multiplying it by 1/2. Similarly, subtracting a multiple of one row from another row is the same as adding a negative multiple of one row to another row. We will use the following shorthand notation for the three elementary row operations:
1. Rᵢ ↔ Rⱼ means interchange rows i and j.
2. kRᵢ means multiply row i by k.
3. Rᵢ + kRⱼ means add k times row j to row i (and replace row i with the result).
The process of applying elementary row operations to bring a matrix into row echelon form is called row reduction.
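These three operations, and the shorthand above, can be mirrored in code. The following is a sketch (the function names are ours), replayed on the augmented matrix of Example 2.6; exact integer arithmetic suffices here, and a caller can pass a `Fraction` as the scalar when exact division is needed:

```python
def swap(A, i, j):
    """R_i <-> R_j: interchange rows i and j (returns a new matrix)."""
    B = [row[:] for row in A]
    B[i], B[j] = B[j], B[i]
    return B

def scale(A, i, k):
    """k R_i: multiply row i by the nonzero constant k."""
    B = [row[:] for row in A]
    B[i] = [k * e for e in B[i]]
    return B

def add_multiple(A, i, j, k):
    """R_i + k R_j: add k times row j to row i."""
    B = [row[:] for row in A]
    B[i] = [a + k * b for a, b in zip(B[i], B[j])]
    return B

# Reproduce the steps of Example 2.6 on its augmented matrix:
A = [[1, -1, -1, 2], [3, -3, 2, 16], [2, -1, 1, 9]]
A = add_multiple(A, 1, 0, -3)   # R2 - 3R1
A = add_multiple(A, 2, 0, -2)   # R3 - 2R1
A = swap(A, 1, 2)               # R2 <-> R3
print(A)   # [[1, -1, -1, 2], [0, 1, 3, 5], [0, 0, 5, 10]]
```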
Example 2.9

Reduce the following matrix to row echelon form:
[  1  2  −4  −4  5 ]
[  2  4   0   0  2 ]
[  2  3   2   1  5 ]
[ −1  1   3   6  5 ]
Solution We work column by column, from left to right and from top to bottom. The strategy is to create a leading entry in a column and then use it to create zeros below it. The entry chosen to become a leading entry is called a pivot, and this phase of the process is called pivoting. Although not strictly necessary, it is often convenient to use the second elementary row operation to make each leading entry a 1. We begin by introducing zeros into the first column below the leading 1 in the first row:
R2 − 2R1, R3 − 2R1, R4 + R1:

[ 1   2  −4  −4   5 ]
[ 0   0   8   8  −8 ]
[ 0  −1  10   9  −5 ]
[ 0   3  −1   2  10 ]
The first column is now as we want it, so the next thing to do is to create a leading entry in the second row, aiming for the staircase pattern of echelon form.
In this case, we do this by interchanging rows. (We could also add row 3 or row 4 to row 2.)

R2 ↔ R3:

[ 1   2  −4  −4   5 ]
[ 0  −1  10   9  −5 ]
[ 0   0   8   8  −8 ]
[ 0   3  −1   2  10 ]
The pivot this time was −1. We now create a zero at the bottom of column 2, using the leading entry −1 in row 2:
R4 + 3R2:

[ 1   2  −4  −4   5 ]
[ 0  −1  10   9  −5 ]
[ 0   0   8   8  −8 ]
[ 0   0  29  29  −5 ]
Column 2 is now done. Noting that we already have a leading entry in column 3, we just pivot on the 8 to introduce a zero below it. This is easiest if we first divide row 3 by 8:
(1/8)R3:

[ 1   2  −4  −4   5 ]
[ 0  −1  10   9  −5 ]
[ 0   0   1   1  −1 ]
[ 0   0  29  29  −5 ]
Now we use the leading 1 in row 3 to create a zero below it:
R4 − 29R3:

[ 1   2  −4  −4   5 ]
[ 0  −1  10   9  −5 ]
[ 0   0   1   1  −1 ]
[ 0   0   0   0  24 ]

With this final step, we have reduced our matrix to row echelon form.
Remarks

• The row echelon form of a matrix is not unique. (Find a different row echelon form for the matrix in Example 2.9.)
• The leading entry in each row is used to create the zeros below it.
• The pivots are not necessarily the entries that are originally in the positions eventually occupied by the leading entries. In Example 2.9, the pivots were 1, −1, 8, and 24. The original matrix had 1, 4, 2, and 5 in those positions on the "staircase."
• Once we have pivoted and introduced zeros below the leading entry in a column, that column does not change. In other words, the row echelon form emerges from left to right, top to bottom.

Elementary row operations are reversible; that is, they can be "undone." Thus, if some elementary row operation converts A into B, there is also an elementary row operation that converts B into A. (See Exercises 15 and 16.)
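The whole pivoting procedure of Example 2.9 can be automated. The following is a sketch that reduces a matrix to one row echelon form, scaling each pivot to 1 for convenience (which, as the remarks note, is optional; `row_echelon` is our own name):

```python
from fractions import Fraction

def row_echelon(A):
    """Reduce a matrix to a row echelon form, scaling each pivot to 1.
    Returns a new matrix of Fractions; A itself is not modified."""
    R = [[Fraction(e) for e in row] for row in A]
    rows, cols = len(R), len(R[0])
    r = 0                                  # next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pivot is None:
            continue                       # no pivot in this column
        R[r], R[pivot] = R[pivot], R[r]    # interchange rows
        R[r] = [e / R[r][c] for e in R[r]] # scale so the leading entry is 1
        for i in range(r + 1, rows):       # create zeros below the pivot
            R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        r += 1
        if r == rows:
            break
    return R

# The matrix of Example 2.9:
A = [[ 1, 2, -4, -4,  5],
     [ 2, 4,  0,  0,  2],
     [ 2, 3,  2,  1,  5],
     [-1, 1,  3,  6,  5]]
for row in row_echelon(A):
    print([str(e) for e in row])
```

Because this routine scales every pivot, its output differs from the text's by row scalings, which is exactly the non-uniqueness the first remark describes.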
Definition
Matrices A and B are row equivalent if there is a sequence of elementary row operations that converts A into B.
The matrices in Example 2.9,

[  1  2  −4  −4  5 ]          [ 1   2  −4  −4   5 ]
[  2  4   0   0  2 ]   and    [ 0  −1  10   9  −5 ]
[  2  3   2   1  5 ]          [ 0   0   1   1  −1 ]
[ −1  1   3   6  5 ]          [ 0   0   0   0  24 ]

are row equivalent. In general, though, how can we tell whether two matrices are row equivalent?
Theorem 2.1
Matrices A and B are row equivalent if and only if they can be reduced to the same row echelon form.
Proof

If A and B are row equivalent, then further row operations will reduce B (and therefore A) to the (same) row echelon form. Conversely, if A and B have the same row echelon form R, then, via elementary row operations, we can convert A into R and B into R. Reversing the latter sequence of operations, we can convert R into B, and therefore the sequence A → R → B achieves the desired effect.
Remark

In practice, Theorem 2.1 is easiest to use if R is the reduced row echelon form of A and B, as defined on page 76. See Exercises 17 and 18.
Gaussian Elimination

When row reduction is applied to the augmented matrix of a system of linear equations, we create an equivalent system that can be solved by back substitution. The entire process is known as Gaussian elimination.
Gaussian Elimination
1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to row echelon form.
3. Using back substitution, solve the equivalent system that corresponds to the row-reduced matrix.
Remark
When performed by hand, step 2 of Gaussian elimination allows quite a bit of choice. Here are some useful guidelines:
(a) Locate the leftmost column that is not all zeros.
(b) Create a leading entry at the top of this column. (It will usually be easiest if you make this a leading 1. See Exercise 22.)
(c) Use the leading entry to create zeros below it.
(d) Cover up the row containing the leading entry, and go back to step (a) to repeat the procedure on the remaining submatrix. Stop when the entire matrix is in row echelon form.
Example 2.10

Solve the system

      2x₂ + 3x₃ =  8
2x₁ + 3x₂ +  x₃ =  5
 x₁ −  x₂ − 2x₃ = −5

Solution The augmented matrix is

[ 0   2   3 |  8 ]
[ 2   3   1 |  5 ]
[ 1  −1  −2 | −5 ]
We proceed to reduce this matrix to row echelon form, following the guidelines given for step 2 of the process. The first nonzero column is column 1. We begin by creating a leading entry at the top of this column; interchanging rows 1 and 3 is the best way to achieve this.

R1 ↔ R3:

[ 1  −1  −2 | −5 ]
[ 2   3   1 |  5 ]
[ 0   2   3 |  8 ]
We now create a second zero in the first column, using the leading 1:

R2 − 2R1:

[ 1  −1  −2 | −5 ]
[ 0   5   5 | 15 ]
[ 0   2   3 |  8 ]
Carl Friedrich Gauss (1777–1855) is generally considered to be one of the three greatest mathematicians of all time, along with Archimedes and Newton. He is often called the "prince of mathematicians," a nickname that he richly deserves. A child prodigy, Gauss reportedly could do arithmetic before he could talk. At the age of 3, he corrected an error in his father's calculations for the company payroll, and as a young student, he found the formula n(n + 1)/2 for the sum of the first n natural numbers. When he was 19, he proved that a 17-sided polygon could be constructed using only a straightedge and a compass, and at the age of 21, he proved, in his doctoral dissertation, that every polynomial of degree n with real or complex coefficients has exactly n zeros, counting multiple zeros: the Fundamental Theorem of Algebra. Gauss' 1801 publication Disquisitiones Arithmeticae is generally considered to be the foundation of modern number theory, but he made contributions to nearly every branch of mathematics as well as to statistics, physics, astronomy, and surveying. Gauss did not publish all of his findings, probably because he was too critical of his own work. He also did not like to teach and was often critical of other mathematicians, perhaps because he had already anticipated many of their results himself.
We now cover up the first row and repeat the procedure. The second column is the leftmost nonzero column of the remaining submatrix, and we create a leading 1 there by multiplying row 2 by 1/5:

(1/5)R2:

[ 1  −1  −2 | −5 ]
[ 0   1   1 |  3 ]
[ 0   2   3 |  8 ]
We now need another zero at the bottom of column 2:

R3 − 2R2:

[ 1  −1  −2 | −5 ]
[ 0   1   1 |  3 ]
[ 0   0   1 |  2 ]
The augmented matrix is now in row echelon form, and we move to step 3. The corresponding system is

x₁ − x₂ − 2x₃ = −5
      x₂ + x₃ =  3
           x₃ =  2

and back substitution gives x₃ = 2, then x₂ = 3 − x₃ = 3 − 2 = 1, and finally x₁ = −5 + x₂ + 2x₃ = −5 + 1 + 4 = 0. We write the solution in vector form as

[ 0 ]
[ 1 ]
[ 2 ]

(We are going to write the vector solutions of linear systems as column vectors from now on. The reason for this will become clear in Chapter 3.)
Example 2.11

Solve the system

 w −  x − y + 2z =  1
2w − 2x − y + 3z =  3
−w +  x − y      = −3

Solution The augmented matrix is

[  1  −1  −1  2 |  1 ]
[  2  −2  −1  3 |  3 ]
[ −1   1  −1  0 | −3 ]
which can be row reduced as follows:

R2 − 2R1, R3 + R1:

[ 1  −1  −1   2 |  1 ]
[ 0   0   1  −1 |  1 ]
[ 0   0  −2   2 | −2 ]

R3 + 2R2:

[ 1  −1  −1   2 | 1 ]
[ 0   0   1  −1 | 1 ]
[ 0   0   0   0 | 0 ]
The associated system is now

w − x − y + 2z = 1
         y −  z = 1

which has infinitely many solutions. There is more than one way to assign parameters, but we will proceed to use back substitution, writing the variables corresponding to the leading entries (the leading variables) in terms of the other variables (the free variables). In this case, the leading variables are w and y, and the free variables are x and z. Thus, y = 1 + z, and from this we obtain
w = 1 + x + y − 2z
  = 1 + x + (1 + z) − 2z
  = 2 + x − z

If we assign parameters x = s and z = t, the solution can be written in vector form as

[ w ]   [ 2 ]     [ 1 ]     [ −1 ]
[ x ] = [ 0 ] + s [ 1 ] + t [  0 ]
[ y ]   [ 1 ]     [ 0 ]     [  1 ]
[ z ]   [ 0 ]     [ 0 ]     [  1 ]
Example 2.11 highlights a very important property: In a consistent system, the free variables are just the variables that are not leading variables. Since the number of leading variables is the number of nonzero rows in the row echelon form of the coefficient matrix, we can predict the number of free variables (parameters) before we find the explicit solution using back substitution. In Chapter 3, we will prove that, although the row echelon form of a matrix is not unique, the number of nonzero rows is the same in all row echelon forms of a given matrix. Thus, it makes sense to give a name to this number.
Definition The rank of a matrix is the number of nonzero rows in its row echelon form.
We will denote the rank of a matrix A by rank(A). In Example 2.10, the rank of the coefficient matrix is 3, and in Example 2.11, the rank of the coefficient matrix is 2. The observations we have just made justify the following theorem, which we will prove in more generality in Chapters 3 and 6.
Theorem 2.2    The Rank Theorem

Let A be the coefficient matrix of a system of linear equations with n variables. If the system is consistent, then

number of free variables = n − rank(A)
16
Chapler 2 Systems of l mear Equations
Thus, in Example 2.10, we have 3 − 3 = 0 free variables (in other words, a unique solution), and in Example 2.11, we have 4 − 2 = 2 free variables, as we found.
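The rank, and with it the count of free variables, can be checked by machine: row reduce the coefficient matrix and count the nonzero rows. A sketch (`rank_of` is our own name):

```python
from fractions import Fraction

def rank_of(A):
    """Return the number of nonzero rows in a row echelon form of A."""
    R = [[Fraction(e) for e in row] for row in A]
    rows, cols = len(R), len(R[0])
    r = 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        for i in range(r + 1, rows):
            f = R[i][c] / R[r][c]
            R[i] = [a - f * b for a, b in zip(R[i], R[r])]
        r += 1
    return r

# Coefficient matrix of Example 2.10: rank 3, so 3 - 3 = 0 free variables.
print(rank_of([[0, 2, 3], [2, 3, 1], [1, -1, -2]]))               # 3
# Coefficient matrix of Example 2.11: rank 2, so 4 - 2 = 2 free variables.
print(rank_of([[1, -1, -1, 2], [2, -2, -1, 3], [-1, 1, -1, 0]]))  # 2
```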
Example 2.12

Solve the system

x₁ −  x₂ + 2x₃ =  3
x₁ + 2x₂ −  x₃ = −3
      2x₂ − 2x₃ =  1

Solution When we row reduce the augmented matrix, we have

[ 1  −1   2 |  3 ]
[ 1   2  −1 | −3 ]
[ 0   2  −2 |  1 ]

R2 − R1:

[ 1  −1   2 |  3 ]
[ 0   3  −3 | −6 ]
[ 0   2  −2 |  1 ]

(1/3)R2:

[ 1  −1   2 |  3 ]
[ 0   1  −1 | −2 ]
[ 0   2  −2 |  1 ]

R3 − 2R2:

[ 1  −1   2 |  3 ]
[ 0   1  −1 | −2 ]
[ 0   0   0 |  5 ]
leading to the impossible equation 0 = 5. (We could also have performed R3 − R2 as the second elementary row operation, which would have given us the same contradiction but a different row echelon form.) Thus, the system has no solutions; it is inconsistent.
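Inconsistency shows up mechanically as a row of the reduced augmented matrix that is zero in the coefficient columns but nonzero in the constant column; equivalently, rank(A) < rank([A | b]). A sketch of that test on the system of Example 2.12 (repeating the rank routine so the fragment is self-contained):

```python
from fractions import Fraction

def rank_of(A):
    """Number of nonzero rows in a row echelon form of A."""
    R = [[Fraction(e) for e in row] for row in A]
    rows, cols, r = len(R), len(R[0]), 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        for i in range(r + 1, rows):
            f = R[i][c] / R[r][c]
            R[i] = [a - f * b for a, b in zip(R[i], R[r])]
        r += 1
    return r

def is_consistent(A, b):
    """A system is consistent exactly when rank(A) = rank([A | b])."""
    aug = [row + [c] for row, c in zip(A, b)]
    return rank_of(A) == rank_of(aug)

# Example 2.12: x1 - x2 + 2x3 = 3, x1 + 2x2 - x3 = -3, 2x2 - 2x3 = 1
A = [[1, -1, 2], [1, 2, -1], [0, 2, -2]]
b = [3, -3, 1]
print(is_consistent(A, b))   # False: the system is inconsistent
```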
Gauss-Jordan Elimination

Wilhelm Jordan (1842–1899) was a German professor of geodesy whose contribution to solving linear systems was a systematic method of back substitution closely related to the method described here.
A modification of Gaussian elimination greatly simplifies the back substitution phase and is particularly helpful when calculations are being done by hand on a system with infinitely many solutions. This variant, known as Gauss-Jordan elimination, relies on reducing the augmented matrix even further.
Definition A matrix is in reduced row echelon form if it satisfies the following properties:

1. It is in row echelon form.
2. The leading entry in each nonzero row is a 1 (called a leading 1).
3. Each column containing a leading 1 has zeros everywhere else.
The following matrix is in reduced row echelon form:

[ 1  2  0  0  −3   1  0 ]
[ 0  0  1  0   4  −1  0 ]
[ 0  0  0  1   3  −2  0 ]
[ 0  0  0  0   0   0  1 ]
For 2 × 2 matrices, the possible reduced row echelon forms are

[ 1  0 ]    [ 1  * ]    [ 0  1 ]    [ 0  0 ]
[ 0  1 ]    [ 0  0 ]    [ 0  0 ]    [ 0  0 ]

where * can be any number. It is clear that after a matrix has been reduced to echelon form, further elementary row operations will bring it to reduced row echelon form. What is not clear (although intuition may suggest it) is that, unlike the row echelon form, the reduced row echelon form of a matrix is unique. In Gauss-Jordan elimination, we proceed as in Gaussian elimination but reduce the augmented matrix to reduced row echelon form.

For a short proof of this fact, see the article by Thomas Yuster, "The Reduced Row Echelon Form of a Matrix Is Unique: A Simple Proof," in the March 1984 issue of Mathematics Magazine (vol. 57, no. 2, pp. 93–94).

Gauss-Jordan Elimination

1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to reduced row echelon form.
3. If the resulting system is consistent, solve for the leading variables in terms of any remaining free variables.

Example 2.13

Solve the system in Example 2.11 by Gauss-Jordan elimination.
Solution The reduction proceeds as it did in Example 2.11 until we reach the echelon form:

[ 1  −1  −1   2 | 1 ]
[ 0   0   1  −1 | 1 ]
[ 0   0   0   0 | 0 ]

We now must create a zero above the leading 1 in the second row, third column. We do this by adding row 2 to row 1 to obtain

[ 1  −1   0   1 | 2 ]
[ 0   0   1  −1 | 1 ]
[ 0   0   0   0 | 0 ]

The system has now been reduced to

w − x     + z = 2
        y − z = 1
It is now much easier to solve for the leading variables:

w = 2 + x − z    and    y = 1 + z
If we assign parameters x = s and z = t as before, the solution can be written in vector form as

[ w ]   [ 2 ]     [ 1 ]     [ −1 ]
[ x ] = [ 0 ] + s [ 1 ] + t [  0 ]
[ y ]   [ 1 ]     [ 0 ]     [  1 ]
[ z ]   [ 0 ]     [ 0 ]     [  1 ]
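Gauss-Jordan elimination is a short extension of the earlier row reduction: after creating each leading 1, clear the entries above it as well as below. A sketch of reduction to reduced row echelon form, applied to the augmented matrix of Examples 2.11 and 2.13 (`rref` is our own name):

```python
from fractions import Fraction

def rref(A):
    """Return the reduced row echelon form of A (as Fractions)."""
    R = [[Fraction(e) for e in row] for row in A]
    rows, cols, r = len(R), len(R[0]), 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        R[r] = [e / R[r][c] for e in R[r]]      # make the leading entry 1
        for i in range(rows):                   # zeros above AND below it
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        r += 1
    return R

# Augmented matrix of Example 2.11 / 2.13:
A = [[1, -1, -1, 2, 1], [2, -2, -1, 3, 3], [-1, 1, -1, 0, -3]]
for row in rref(A):
    print([str(e) for e in row])
# Rows: [1, -1, 0, 1, 2], [0, 0, 1, -1, 1], [0, 0, 0, 0, 0],
# matching the reduced matrix found in Example 2.13.
```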
From a computational point of view, it is more efficient (in the sense that it requires fewer calculations) to first reduce the matrix to row echelon form and then, working from right to left, make each leading entry a 1 and create zeros above these leading 1s. However, for manual calculation, you will find it easier to just work from left to right and create the leading 1s and the zeros in their columns as you go.

Let's return to the geometry that brought us to this point. Just as systems of linear equations in two variables correspond to lines in R², so linear equations in three variables correspond to planes in R³. In fact, many questions about lines and planes can be answered by solving an appropriate linear system.
Example 2.14

Find the line of intersection of the planes x + 2y − z = 3 and 2x + 3y + z = 1.

Solution First, observe that there will be a line of intersection, since the normal vectors of the two planes, [1, 2, −1] and [2, 3, 1], are not parallel. The points that lie in the intersection of the two planes correspond to the points in the solution set of the system
 x + 2y − z = 3
2x + 3y + z = 1
Gauss-Jordan elimination applied to the augmented matrix yields

[ 1  2  −1 | 3 ]
[ 2  3   1 | 1 ]

R2 − 2R1:

[ 1   2  −1 |  3 ]
[ 0  −1   3 | −5 ]

−R2, then R1 − 2R2:

[ 1  0   5 | −7 ]
[ 0  1  −3 |  5 ]

Replacing variables, we have

x     + 5z = −7
   y − 3z =  5
We set the free variable z equal to a parameter t and thus obtain the parametric equations of the line of intersection of the two planes:
x = −7 − 5t
y =  5 + 3t
z =        t

Figure 2.2 The intersection of two planes
In vector form, the equation is

[ x ]   [ −7 ]     [ −5 ]
[ y ] = [  5 ] + t [  3 ]
[ z ]   [  0 ]     [  1 ]

See Figure 2.2.
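The result of Example 2.14 can be cross-checked another way: the direction vector of the line of intersection is parallel to the cross product of the two normal vectors, and any one solution of the system gives a point on the line. A sketch:

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

n1, n2 = [1, 2, -1], [2, 3, 1]     # normals of x + 2y - z = 3 and 2x + 3y + z = 1
d = cross(n1, n2)
print(d)   # [5, -3, -1], which is parallel to the direction [-5, 3, 1] found above

# The point (-7, 5, 0) found by Gauss-Jordan elimination lies on both planes:
x, y, z = -7, 5, 0
assert x + 2*y - z == 3
assert 2*x + 3*y + z == 1
```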
Example 2.15

Let p = [1, 0, −1], q = [0, 2, 1], u = [1, 1, 1], and v = [3, −1, −1]. Determine whether the lines x = p + tu and x = q + tv intersect and, if so, find their point of intersection.

Solution We need to be careful here. Although t has been used as the parameter in the equations of both lines, the lines are independent and therefore so are their parameters. Let's use a different parameter, say s, for the first line, so its equation becomes x = p + su. If the lines intersect, then we want to find an x = [x, y, z] that satisfies both equations simultaneously. That is, we want x = p + su = q + tv, or su − tv = q − p. Substituting the given p, q, u, and v, we obtain the equations

s − 3t = −1
s +  t =  2
s +  t =  2

whose solution is easily found to be s = 5/4, t = 3/4. The point of intersection is therefore

[ x ]   [  1 ]         [ 1 ]   [ 9/4 ]
[ y ] = [  0 ] + (5/4) [ 1 ] = [ 5/4 ]
[ z ]   [ −1 ]         [ 1 ]   [ 1/4 ]

See Figure 2.3. (Check that substituting t = 3/4 in the other equation gives the same point.)

Figure 2.3 Two intersecting lines
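The same solve-then-verify strategy works in code: solve two component equations for s and t by Cramer's rule, then test the third. A sketch; the specific vectors below are illustrative values consistent with this example's answer (the example's data is hard to read in this printing, so treat p, q, u, v here as assumptions):

```python
from fractions import Fraction

p, u = [1, 0, -1], [1, 1, 1]        # first line:  x = p + s u
q, v = [0, 2, 1], [3, -1, -1]       # second line: x = q + t v

# Solve s*u - t*v = q - p using the first two components (a 2x2 system).
d = [qi - pi for qi, pi in zip(q, p)]
det = Fraction(u[0] * (-v[1]) - u[1] * (-v[0]))
s = (d[0] * (-v[1]) - d[1] * (-v[0])) / det          # Cramer's rule
t = (u[0] * d[1] - u[1] * d[0]) / det

# The lines intersect iff the third component agrees too:
assert s * u[2] - t * v[2] == d[2], "the lines do not intersect"
point = [pi + s * ui for pi, ui in zip(p, u)]
print(s, t, point)   # 5/4 3/4 [Fraction(9, 4), Fraction(5, 4), Fraction(1, 4)]
```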
In R³, it is possible for two lines to intersect in a point, to be parallel, or to do neither. Nonparallel lines that do not intersect are called skew lines.
Homogeneous Systems
We have seen that every system of linear equations has either no solution, a unique solution, or infinitely many solutions. However, there is one type of system that always has at least one solution.
,
80   Chapter 2   Systems of Linear Equations
Definition   A system of linear equations is called homogeneous if the constant term in each equation is zero.

In other words, a homogeneous system has an augmented matrix of the form [A | 0]. The following system is homogeneous:

    2x + 3y -  z = 0
    -x + 5y + 2z = 0

Since a homogeneous system cannot have no solution (forgive the double negative!), it will have either a unique solution (namely, the zero, or trivial, solution) or infinitely many solutions. The next theorem says that the latter case must occur if the number of variables is greater than the number of equations.
Theorem 2.3   If [A | 0] is a homogeneous system of m linear equations with n variables, where m < n, then the system has infinitely many solutions.

Proof   Since the system has at least the zero solution, it is consistent. Also, rank(A) <= m (why?). By the Rank Theorem, we have

    number of free variables = n - rank(A) >= n - m > 0

So there is at least one free variable and, hence, infinitely many solutions.
Theorem 2.3 says nothing about the case where m >= n. Exercise 44 asks you to give examples to show that, in this case, there can be either a unique solution or infinitely many solutions.
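Theorem 2.3 can be illustrated with a small row-reduction routine. The matrix below is a made-up example with m = 2 equations in n = 4 variables; the code confirms that rank(A) <= m, so the count of free variables n - rank(A) is positive, as the theorem requires.

```python
from fractions import Fraction

def rref(M):
    """Row reduce a matrix (list of rows of Fractions); return (R, rank)."""
    R = [row[:] for row in M]
    rows, cols = len(R), len(R[0])
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if piv is None:
            continue
        R[r], R[piv] = R[piv], R[r]
        R[r] = [x / R[r][c] for x in R[r]]          # make the leading 1
        for i in range(rows):
            if i != r and R[i][c] != 0:             # sweep out the column
                R[i] = [x - R[i][c] * y for x, y in zip(R[i], R[r])]
        r += 1
        if r == rows:
            break
    return R, r

# A homogeneous system with m = 2 equations in n = 4 variables
# (illustrative coefficients, not from the text).
A = [[Fraction(x) for x in row] for row in ([1, 2, -1, 3], [2, 4, 1, 0])]
R, rank = rref(A)
free = 4 - rank
assert rank <= 2 and free > 0
print("rank =", rank, "free variables =", free)
```

Because there is at least one free variable, the zero solution extends to infinitely many solutions, exactly as Theorem 2.3 predicts.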
Linear Systems over Z_p

Thus far, all of the linear systems we have encountered have involved real numbers, and the solutions have accordingly been vectors in some R^n. We have seen how other number systems arise, notably Z_p. When p is a prime number, Z_p behaves in many respects like R; in particular, we can add, subtract, multiply, and divide (by nonzero numbers). Thus, we can also solve systems of linear equations when the variables and coefficients belong to some Z_p. In such instances, we refer to solving a system over Z_p.

R and Z_p are examples of fields. The set of rational numbers Q and the set of complex numbers C are other examples. Fields are covered in detail in courses in abstract algebra.

For example, the linear equation x1 + x2 + x3 = 1, when viewed as an equation over Z_2, has exactly four solutions:

    [x1, x2, x3] = [1, 0, 0], [0, 1, 0], [0, 0, 1], and [1, 1, 1]

(where the last solution arises because 1 + 1 = 0 in Z_2, so that 1 + 1 + 1 = 1).
When we view the equation x1 + x2 + x3 = 1 over Z_3, the solutions are

    [1, 0, 0], [0, 1, 0], [0, 0, 1], [2, 2, 0], [2, 0, 2], [0, 2, 2], [1, 1, 2], [1, 2, 1], and [2, 1, 1]

(Check these.) But we need not use trial-and-error methods; row reduction of augmented matrices works just as well over Z_p as over R.
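Because Z_p is finite, claims like these can be checked by brute force. A short Python sketch (illustrative, not part of the text) enumerates all triples:

```python
from itertools import product

def solutions_mod_p(p):
    """All (x1, x2, x3) with x1 + x2 + x3 = 1 (mod p), by enumeration."""
    return [v for v in product(range(p), repeat=3) if sum(v) % p == 1]

assert len(solutions_mod_p(2)) == 4   # the four solutions over Z_2
assert len(solutions_mod_p(3)) == 9   # the nine solutions over Z_3
print(solutions_mod_p(2))
```

Enumeration is only feasible because Z_p^3 has p^3 elements; for larger systems, row reduction mod p (next example) is the right tool.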
Example 2.16

Solve the following system of linear equations over Z_3:

    x1 + 2x2 +  x3 = 0
    x1       +  x3 = 2
         x2 + 2x3 = 1

Solution   The first thing to note in examples like this one is that subtraction and division are not needed; we can accomplish the same effects using addition and multiplication. (This, however, requires that we be working over Z_p, where p is a prime; see Exercise 60 at the end of this section and Exercise 33 in Section 1.4.) We row reduce the augmented matrix of the system, using calculations modulo 3:

    [1 2 1 | 0]   R2 + 2R1   [1 2 1 | 0]   R1 + R2     [1 0 1 | 2]   2R3,       [1 0 0 | 1]
    [1 0 1 | 2]   ------->   [0 1 0 | 2]   R3 + 2R2    [0 1 0 | 2]   R1 + 2R3   [0 1 0 | 2]
    [0 1 2 | 1]              [0 1 2 | 1]   -------->   [0 0 2 | 2]   ------->   [0 0 1 | 1]

Thus, the solution is x1 = 1, x2 = 2, x3 = 1.
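Row reduction modulo a prime is mechanical enough to code directly. The sketch below performs Gauss-Jordan elimination mod p, computing inverses via Fermat's little theorem (a^(p-2) = a^(-1) mod p for prime p and nonzero a), and reproduces the solution of Example 2.16. It assumes the system is square with a unique solution.

```python
def solve_mod_p(A, b, p):
    """Gauss-Jordan elimination mod a prime p; returns the unique solution.
    Division mod p uses Fermat inverses: a^(p-2) mod p."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for c in range(n):
        piv = next(i for i in range(c, n) if M[i][c] % p != 0)
        M[c], M[piv] = M[piv], M[c]                  # bring pivot up
        inv = pow(M[c][c], p - 2, p)                 # modular inverse
        M[c] = [(x * inv) % p for x in M[c]]         # make the leading 1
        for i in range(n):
            if i != c and M[i][c] % p != 0:          # sweep out the column
                M[i] = [(x - M[i][c] * y) % p for x, y in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

# The system of Example 2.16, over Z_3:
A = [[1, 2, 1], [1, 0, 1], [0, 1, 2]]
b = [0, 2, 1]
x = solve_mod_p(A, b, 3)
print(x)  # the solution x1 = 1, x2 = 2, x3 = 1
```

Note how this sidesteps subtraction and division entirely in spirit: every step is a multiplication and a reduction mod p, which is exactly why primality of p matters.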
Example 2.17

Solve the following system of linear equations over Z_2:

    x1 + x2 + x3 + x4 = 1
    x1 + x2           = 1
         x2 + x3      = 0
              x3 + x4 = 0
    x1           + x4 = 1
Solution   The row reduction proceeds as follows, with all calculations done modulo 2:

    [1 1 1 1 | 1]              [1 0 0 1 | 1]
    [1 1 0 0 | 1]              [0 1 0 1 | 0]
    [0 1 1 0 | 0]   -> ... ->  [0 0 1 1 | 0]
    [0 0 1 1 | 0]              [0 0 0 0 | 0]
    [1 0 0 1 | 1]              [0 0 0 0 | 0]

Therefore, we have

    x1           + x4 = 1
         x2      + x4 = 0
              x3 + x4 = 0

Setting the free variable x4 = t yields

    [x1]   [1 + t]   [1]     [1]
    [x2] = [  t  ] = [0] + t [1]
    [x3]   [  t  ]   [0]     [1]
    [x4]   [  t  ]   [0]     [1]

Since t can take on the two values 0 and 1, there are exactly two solutions:

    [1]       [0]
    [0]  and  [1]
    [0]       [1]
    [0]       [1]

For linear systems over Z_p, there can never be infinitely many solutions. (Why not?) Rather, when there is more than one solution, the number of solutions is finite and is a function of the number of free variables and p. (See Exercise 59.)
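The two-solution count agrees with the formula of Exercise 59: p^(number of free variables) = 2^1 = 2. A brute-force check over Z_2, using the reduced equations x1 + x4 = 1, x2 + x4 = 0, x3 + x4 = 0 found above:

```python
from itertools import product

# The reduced system of Example 2.17 over Z_2: each pair is
# (coefficient vector, right-hand side), all arithmetic mod 2.
eqs = [([1, 0, 0, 1], 1),
       ([0, 1, 0, 1], 0),
       ([0, 0, 1, 1], 0)]

sols = [v for v in product(range(2), repeat=4)
        if all(sum(a * x for a, x in zip(coeffs, v)) % 2 == rhs
               for coeffs, rhs in eqs)]
print(sols)  # exactly two solutions: one free variable, so 2^1 of them
```

With one free variable over Z_2 the solution set has exactly 2 elements, and enumeration finds precisely the two solutions displayed in the example.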
Exercises 2.2

In Exercises 1-8, determine whether the given matrix is in row echelon form. If it is, state whether it is also in reduced row echelon form. 0 0
3
I
0
I
3
0
0
0
3
5. 0 0 0 I
0
I
I.
3.
0 0
[~ I
7.
I
2.
~l  4 0 0 0 0
5
7 0
I
0
I
 I
4
In Exercises 17 and 18, show that the given matrices are row equivalent and find a sequence of elementary row operations that will convert A into B.
0
0
0
0 0
4.
6.
I
0 0 0
0 0
0
0
0
0
0
I
0
I
0
I
0
0
0
17. A ==
2 3
2
I
3
5
I
0
0
0
0
I
 I
0 0
I
I
0
I
0 0 0 0 0 0
3 0
8.
row «he/Oil form. 0
I
9. 0
I
I
I
I
I
3
5  2
II.
5
2 13.
3 2
4
10.
12.
4  2
 I
 I
 1
 3
 I
14.
 2  3
 4
 2
I
6
,  6 2
:]
,
 4
0
 I
10
9
5  5
I
I
 1
0
0
24
0
0 0
0
1
2
 4
 4
5
2 2
4
0
0
3
2
I
 I
I
3
6
2 5 5
0 ,B
 I
I
I
~
I
 I
3 2
5 2
I
0
Perfor m R! + R, and R, + R2• Now rows t and 2 are identical. Now perform Rl  R, to obtain a row o f zeros in the second row.
R1 + R2,   R2 - R1,   R1 + R2,   -R2
21. Students frequently perform the following type of calculation to introduce a zero into a matrix:
7
10  3
15. Reverse the elementary row operations used in Example 2.9 to show that we can convert
2
I
R!
1
I
I
3
20. What is the net effect of performing the following sequence of elementary row operations on a matrix (with at least two rows)?
[~ ~] [:
 I
2 0
19. What is wrong with the following "proof" that every matrix with at least two rows is row equivalent to a matrix with a zero row?
In Exercises 9-14, use elementary row operations to reduce the given matrix to (a) row echelon form and (b) reduced
0
:J. B =[~ ~ ]
[~
18.A=
I
,
16. In general, what is the elementary row operation that "undoes" each of the three elementary row operations R_i <-> R_j, kR_i, and R_i + kR_j?
However, 3R2 - 2R1 is not an elementary row operation. Why not? Show how to achieve the same result using elementary row operations. 22. Consider the matrix A =
into
[~ ~ ]. Show that any o f
the three types of elementary row operations can be used to create a leading 1 at the top of the first column. Which do you prefer and why?
23. What is the rank of each of the matrices in Exercises 1-8?
24. What are the possible reduced row echelon forms of 3 x 3 matrices?
In Exercises 25-34, solve the given system of equations using either Gaussian or Gauss-Jordan elimination.
26. x - y + z = 0   25. x1 + 2x2 - 3x3 = 9   -x + 3y + z = 5   2x1 - x2 + x3 = 0   x2 + x3 = 4
4x, 
27.
x,  3x z  2x, = 0 x,+ 2xz + xJ= O 2x,
+
4x1 + 6xJ
+ s= + s=
29. 2r 4r
+ 21V + 3x
28.
+
Y
3x  y
31V  X 31V  4x
= 0
7z = 2
+ 4z = + z=
0 I z= 2
+ y
J 7
 XI
+ 3x1  2x] + 4X4 =
31.
+
3x1
X, ! X1
+ X2 
ji, x,
+'jX2
}XI
2 =
+ Xs
22 = 3, ~
 vl
y + V2z=
I
+X+
2y
2
+z=
+
ky = I
kx+ y == 1 43. x
3z = 2
+ Y + kz
x+ky+
x+ y+ z= k 2x  y+4z= f(
==
1
z=
I
kx+ y+ z= 2
and
46. 4x +yz= O and
B
2x  y+ 4z=S
2xy+ 3z= 4
47. (a) Give an example of three planes that have a common line of intersection (Figure 2.4).
I
1
°
y+z "'=' x+ y  \ w+x +z= 2 c+ d ., 34.n + b + a + 2b + 3c+ 4(1 = n + 3b + 6c+ lOti = a+ 'Ib+ 10c + 20d E> w x 
+
45. 3x+2y+z = 1
= 1
 4x:s=
2x,
vly IV
=
42. x  2y
41. x
In Exercises 45 and 46, find the line of intersection of the given planes.
6X4
 3",
32. V2x+ y+
33.
8~
4x, XJ 

0
xJ2 ~ =J
2x,6x2+
40. kx + 2y = 3 2x  4y =  6
44. Give examples of homogeneous systems of m linear equations in n variables with m = n and with m > n that have (a) infinitely many solutions and (b) a unique solution.
2r+5s= 1 30.
In Exercises 40-43, for what value(s) of k, if any, will the systems have (a) no solution, (b) a unique solution, and (c) infinitely many solutions?
4 iO 20
35
Figure 2.4
In Exercises 35-38, determine by inspection (i.e., without performing any calculations) whether a linear system with the given augmented matrix has a unique solution, infinitely many solutions, or no solution. Justify your answers.
0 35.
37.
0 I
3 36. I
2 2
2
4
4 0
I
2 3
4
8 0 12 0
38. 6 7
4
3
7
7
I 2
0 I 0
3 I I I
I
2
5 9
6 10
3 7
"
5 7
0 3
(b) Give an example of three planes that intersect in pairs but have no common point of intersection (Figure 2.5).
I
I I  I
6 2
0
5 6 2 I 7 7
39. Show that if ad - bc != 0, then the system

    ax + by = r
    cx + dy = s

has a unique solution.
Figure 2.5
(c) Give an example of three planes, exactly two of which are parallel (Figure 2.6).
show that there are infinitely many vectors x = [x1, x2, x3] that simultaneously satisfy u . x = 0 and v . x = 0 and that all are multiples of u x v.
Figure 2.6

(d) Give an example of three planes that intersect in a single point (Figure 2.7).
lL
52. Let P =
I':Y, 
"J v~
" , VI 
III V,
lil Y: 
IIl Vt
2
1
0
1 •q =
1 •u =  1
0
 3 , andY = 1
0 6  1
Show that the lines x = p + su and x = q + tv are skew lines. Find vector equations of a pair of parallel planes, one containing each line.

In Exercises 53-58, solve the systems of linear equations over the indicated Z_p.

53. x + 2y = 1 over Z_3
    x +  y = 2

54. x + y     = 1 over Z_2
        y + z = 0
    x     + z = 1

Figure 2.7
x 48. P =
49. P 
3 1 .q = 0
 1
1
2 ,u =
2 ,v =
 1
0
0 ,v =
2 3
1
1
1
1 ,u =  1
2 50. Let P = 2 , u = I , and v = 1 . Describe 3  \ 0 all points Q = (a, b, c) such lhal the line through Q with direclion vecto r v intersects the line with equation x = p + SUo 1
1
51. Recall that the cross product of vectors u and v is a vector u x v that is orthogonal to both u and v. (See Exploration: The Cross Product in Chapter 1.) If
v=
56. 3x x
1
 1
0
1
= I over 'Z.,
55. x+ Y
2
"v: ",
I
+ 2y = lover l.J + 4y = I
57. 3x + 2y= I overl., x
58.
+ 4y
= 1
+ 4~
XI
x t + 2x!+ 4x, 2xt+ ~
+
"" I over Z", = 3
~= 1
+ 3x,
59. Prove the following corollary to the Rank Theorem: Let A be an m x n matrix with entries in Z_p. Any consistent system of linear equations with coefficient matrix A has exactly p^(n - rank(A)) solutions over Z_p.

60. When p is not prime, extra care is needed in solving a linear system (or, indeed, any equation) over Z_p. Using Gaussian elimination, solve the following system over Z_6. What complications arise?

    2x + 3y = 4
    4x + 3y = 2
Partial Pivoting

In Exploration: Lies My Computer Told Me following Section 2.1, we saw that ill-conditioned linear systems can cause trouble when roundoff error occurs. In this exploration, you will discover another way in which linear systems are sensitive to roundoff error and see that very small changes in the coefficients can lead to huge inaccuracies in the solution. Fortunately, there is something that can be done to minimize or even eliminate this problem (unlike the problem with ill-conditioned systems).

1. (a) Solve the single linear equation 0.00021x = 1 for x.
   (b) Suppose your calculator can carry only four significant digits. The equation will be rounded to 0.0002x = 1. Solve this equation.

   The difference between the answers in parts (a) and (b) can be thought of as the effect of an error of 0.00001 on the solution of the given equation.

2. Now extend this idea to a system of linear equations.
   (a) With Gaussian elimination, solve the linear system

       0.400x + 99.6y = 100
       75.3x - 45.3y = 30.0

   using three significant digits. Begin by pivoting on 0.400 and take each calculation to three significant digits. You should obtain the "solution" x = -1.00, y = 1.01. Check that the actual solution is x = 1.00, y = 1.00. This is a huge error, 200% in the x-value! Can you discover what caused it?
   (b) Solve the system in part (a) again, this time interchanging the two equations (or, equivalently, the two rows of its augmented matrix) and pivoting on 75.3. Again, take each calculation to three significant digits. What is the solution this time?

   The moral of the story is that, when using Gaussian or Gauss-Jordan elimination to obtain a numerical solution to a system of linear equations (i.e., a decimal approximation), you should choose the pivots with care. Specifically, at each pivoting step, choose from among all possible pivots in a column the entry with the largest absolute value. Use row interchanges to bring this element into the correct position and use it to create zeros where needed in the column. This strategy is known as partial pivoting.

3. Solve the following systems by Gaussian elimination, first without and then with partial pivoting. Take each calculation to three significant digits. (The exact solutions are given.)

   (a) 0.001x + 0.995y = 1.00
       10.2x + 1.00y = -50.0

       Exact solution: x = -5.00, y = 1.00

   (b) 10x - 7y = 7
       -3x + 2.09y + 6z = 3.91
       5x - y + 5z = 6

       Exact solution: x = 0.00, y = -1.00, z = 1.00
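The effect of three-significant-digit arithmetic can be simulated in Python. The helper `sig3` below rounds every intermediate result; the exact sequence of roundings in a hand computation may differ from the book's (so the garbage value of x need not be exactly -1.00), but the qualitative conclusion is the same: pivoting on 0.400 destroys the answer, while pivoting on 75.3 recovers x = 1.00, y = 1.00.

```python
def sig3(x):
    """Round x to three significant digits, mimicking a 3-digit machine."""
    return float(f"{x:.3g}")

def solve2(a11, a12, b1, a21, a22, b2):
    """Gaussian elimination on a 2x2 system, rounding every intermediate
    result to three significant digits (pivot on the first equation)."""
    m = sig3(a21 / a11)                 # multiplier
    a22p = sig3(a22 - sig3(m * a12))    # eliminated entry
    b2p = sig3(b2 - sig3(m * b1))
    y = sig3(b2p / a22p)
    x = sig3(sig3(b1 - sig3(a12 * y)) / a11)
    return x, y

# Without pivoting: pivot on 0.400 -> a wildly wrong x.
x_bad, y_bad = solve2(0.400, 99.6, 100.0, 75.3, -45.3, 30.0)
# With partial pivoting: interchange rows so the pivot is 75.3.
x_good, y_good = solve2(75.3, -45.3, 30.0, 0.400, 99.6, 100.0)
print("without pivoting:", x_bad, y_bad)
print("with pivoting:   ", x_good, y_good)
```

The failure mode is visible in the trace: dividing by the tiny pivot 0.400 produces the large multiplier 188, and the entry -45.3 is swamped by the rounded product 18700, so all information from the second equation is lost.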
Counting Operations: An Introduction to the Analysis of Algorithms

Abu Ja'far Muhammad ibn Musa al-Khwarizmi (c. 780-850) was an Arabic mathematician whose book Hisab al-jabr w'al-muqabalah (c. 825) described the use of Hindu-Arabic numerals and the rules of basic arithmetic. The second word of the book's title gives rise to the English word algebra, and the word algorithm is derived from al-Khwarizmi's name.

Gaussian and Gauss-Jordan elimination are examples of algorithms: systematic procedures designed to implement a particular task, in this case, the row reduction of the augmented matrix of a system of linear equations. Algorithms are particularly well suited to computer implementation, but not all algorithms are created equal. Apart from the speed, memory, and other attributes of the computer system on which they are running, some algorithms are faster than others. One measure of the so-called complexity of an algorithm (a measure of its efficiency, or ability to perform its task in a reasonable number of steps) is the number of basic operations it performs as a function of the number of variables.
1. Consider the augmented matrix
[A l b) =
2
4
6
3  I
9 1
6 12
 I
8 1
OJ
Count the number of operations required to bring [A | b] to the row echelon form
o
2 1
3 4  \ 0
o
0
[1
(By "operation" we mean a multiplication or a division.) Now count the number of operations needed to complete the back substitution phase of Gaussian elimination. Record the total number of operations.
2. Count the number of operations needed to perform Gauss-Jordan elimination, that is, to reduce [A | b] to its reduced row echelon form
0
0  1
o
1
0
1
o
0
1
1
(where the zeros are introduced into each column immediately after the leading 1 is created in that column). What do your answers suggest about the relative efficiency of the two algorithms?

We will now attempt to analyze the algorithms in a general, systematic way. Suppose the augmented matrix [A | b] arises from a linear system with n equations and n variables; thus, [A | b] is n x (n + 1):
    [A | b] = [ a11  a12  ...  a1n | b1 ]
              [ a21  a22  ...  a2n | b2 ]
              [  .    .          . |  . ]
              [ an1  an2  ...  ann | bn ]
We will assume that row interchanges are never needed and that we can always create a leading 1 from a pivot by dividing by the pivot.

3.
(a) Show that n operations are needed to create the first leading 1:

    [ a11 a12 ... a1n | b1 ]        [ 1 * ... * | * ]
    [  .               .   ]   ->   [ .           . ]
    [ an1 an2 ... ann | bn ]        [ .           . ]

(Why don't we need to count an operation for the creation of the leading 1 itself?) Now show that n operations are needed to obtain the first zero in column 1:

    [ 1 * ... * | * ]
    [ 0 * ... * | * ]
    [ .           . ]

(Why don't we need to count an operation for the creation of the zero itself?) When the first column has been "swept out," we have the matrix

    [ 1 * ... * | * ]
    [ 0 * ... * | * ]
    [ .           . ]
    [ 0 * ... * | * ]

Show that the total number of operations needed up to this point is n + (n - 1)n.
(b) Show that the total number of operations needed to reach the row echelon form

    [ 1 * ... * | * ]
    [ 0 1 ... * | * ]
    [ .       .   . ]
    [ 0 0 ... 1 | * ]

is

    [n + (n - 1)n] + [(n - 1) + (n - 2)(n - 1)] + [(n - 2) + (n - 3)(n - 2)] + ... + [2 + 1*2] + 1

which simplifies to n^2 + (n - 1)^2 + ... + 2^2 + 1^2 = n(n + 1)(2n + 1)/6.
(c) Show that the number of operations needed to complete the back substitution phase is

    (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2
(d) Using summation formulas for the sums in parts (b) and (c) (see Exercises 45 and 46 in Section 2.4 and Appendix B), show that the total number of operations, T(n), performed by Gaussian elimination is

    T(n) = (1/3)n^3 + n^2 - (1/3)n

Since every polynomial function is dominated by its leading term for large values of the variable, we see that T(n) is approximately n^3/3 for large values of n.
4. Show that Gauss-Jordan elimination has T(n) = n^3/2 total operations if we create zeros above and below the leading 1s as we go. (This shows that, for large systems of linear equations, Gaussian elimination is faster than this version of Gauss-Jordan elimination.)
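The operation counts in parts (a)-(d) can be cross-checked by instrumenting the algorithm. The loop structure below mirrors the counting argument (it tallies operations rather than doing any arithmetic) and agrees with T(n) = (1/3)n^3 + n^2 - (1/3)n:

```python
from fractions import Fraction

def gaussian_ops(n):
    """Count multiplications/divisions performed by Gaussian elimination
    (forward phase plus back substitution) on an n x (n+1) augmented
    matrix, assuming no row interchanges are needed."""
    ops = 0
    for k in range(n):                 # create the leading 1 in row k
        ops += (n - k)                 # divide the entries to its right
        for i in range(k + 1, n):      # sweep out column k below the pivot
            ops += (n - k)             # one multiplication per changed entry
    for k in range(n):                 # back substitution
        ops += k                       # k multiplications for row n - k
    return ops

def formula(n):
    """T(n) = (1/3)n^3 + n^2 - (1/3)n, evaluated exactly."""
    return Fraction(1, 3) * n**3 + n**2 - Fraction(1, 3) * n

for n in [1, 2, 3, 5, 10]:
    assert gaussian_ops(n) == formula(n), n
print("counts match T(n) = n^3/3 + n^2 - n/3")
```

The forward phase contributes (n - k)^2 operations at stage k, summing to n(n + 1)(2n + 1)/6 as in part (b), and back substitution contributes n(n - 1)/2 as in part (c).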
Spanning Sets and Linear Independence

The second of the three roads in our "trivium" is concerned with linear combinations of vectors. We have seen that we can view solving a system of linear equations as asking whether a certain vector is a linear combination of certain other vectors. We explore this idea in more detail in this section. It leads to some very important concepts, which we will encounter repeatedly in later chapters.

Spanning Sets of Vectors

We can now easily answer the question raised in Section 1.1: When is a given vector a linear combination of other given vectors?
Example 2.18

(a) Is the vector [1, 2, 3] a linear combination of the vectors [1, 0, 3] and [-1, 1, -3]?
(b) Is [2, 3, 4] a linear combination of the vectors [1, 0, 3] and [-1, 1, -3]?
Solution

(a) We want to find scalars x and y such that

    x[1, 0, 3] + y[-1, 1, -3] = [1, 2, 3]

Expanding, we obtain the system

     x -  y = 1
          y = 2
    3x - 3y = 3

whose augmented matrix is

    [1 -1 | 1]
    [0  1 | 2]
    [3 -3 | 3]

(Observe that the columns of the augmented matrix are just the given vectors; notice the order of the vectors, in particular, which vector is the constant vector.) The reduced echelon form of this matrix is

    [1 0 | 3]
    [0 1 | 2]
    [0 0 | 0]

(Verify this.) So the solution is x = 3, y = 2, and the corresponding linear combination is

    3[1, 0, 3] + 2[-1, 1, -3] = [1, 2, 3]
Section 2.3   Spanning Sets and Linear Independence   91
(b) Utilizing our observation in part (a), we obtain a linear system whose augmented matrix is

    [1 -1 | 2]
    [0  1 | 3]
    [3 -3 | 4]

which reduces to

    [1 0 |  5]
    [0 1 |  3]
    [0 0 | -2]

revealing that the system has no solution. Thus, in this case, [2, 3, 4] is not a linear combination of [1, 0, 3] and [-1, 1, -3].
The notion of a spanning set is intimately connected with the solution of linear systems. Look back at Example 2.18. There we saw that a system with augmented matrix [A | b] has a solution precisely when b is a linear combination of the columns of A. This is a general fact, summarized in the next theorem.

Theorem 2.4   A system of linear equations with augmented matrix [A | b] is consistent if and only if b is a linear combination of the columns of A.
Let's revisit Example 2.4, interpreting it in light of Theorem 2.4.

(a) The system

    x - y = 1
    x + y = 3

has the unique solution x = 2, y = 1. Thus,

    2[1, 1] + 1[-1, 1] = [1, 3]

See Figure 2.8(a).

(b) The system

     x -  y = 2
    2x - 2y = 4

has infinitely many solutions of the form x = 2 + t, y = t. This implies that

    (2 + t)[1, 2] + t[-1, -2] = [2, 4]

for all values of t. Geometrically, the vectors [1, 2], [-1, -2], and [2, 4] are all parallel and so all lie along the same line through the origin [see Figure 2.8(b)].
Figure 2.8

(c) The system

    x - y = 1
    x - y = 3
x y= 1 xy=3 has no solutions, so there arc no values of xand y that satisfy
In this case, [ : ] and [
=:]
an: parallel. but
[ ~] does not lie along the same line
thro ugh the o rigin [sec Figure 2.8(e)]. We will often be interested in the collection of allli nclIT combinations of a given sc i o f vectors.
Definition   If S = {v1, v2, ..., vk} is a set of vectors in R^n, then the set of all linear combinations of v1, v2, ..., vk is called the span of v1, v2, ..., vk and is denoted by span(v1, v2, ..., vk) or span(S). If span(S) = R^n, then S is called a spanning set for R^n.
Example 2.19   Show that R^2 = span([2, -1], [1, 3]).

Solution   We need to show that an arbitrary vector [a, b] can be written as a linear combination of [2, -1] and [1, 3]; that is, we must show that the equation x[2, -1] + y[1, 3] = [a, b] can always be solved for x and y (in terms of a and b), regardless of the values of a and b.
The augmented matrix is

    [ 2 1 | a]
    [-1 3 | b]

and row reduction produces

    [ 2 1 | a]   R1 <-> R2   [-1 3 | b]   R2 + 2R1   [-1 3 |      b]
    [-1 3 | b]   -------->   [ 2 1 | a]   ------->   [ 0 7 | a + 2b]

at which point it is clear that the system has a (unique) solution. (Why?) If we continue, we obtain

    [1 -3 | -b        ]        [1 0 | (3a - b)/7 ]
    [0  1 | (a + 2b)/7]   ->   [0 1 | (a + 2b)/7 ]

from which we see that x = (3a - b)/7 and y = (a + 2b)/7. Thus, for any choice of a and b, we have

    ((3a - b)/7)[2, -1] + ((a + 2b)/7)[1, 3] = [a, b]

(Check this.)
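The closed-form coefficients can be spot-checked numerically; the sketch below verifies the identity x[2, -1] + y[1, 3] = [a, b] for several choices of a and b:

```python
from fractions import Fraction

# Example 2.19 writes [a, b] as x[2, -1] + y[1, 3] with
# x = (3a - b)/7 and y = (a + 2b)/7; check the identity exactly.
def coeffs(a, b):
    x = Fraction(3 * a - b, 7)
    y = Fraction(a + 2 * b, 7)
    return x, y

for a, b in [(1, 0), (0, 1), (5, -4), (7, 7)]:
    x, y = coeffs(a, b)
    assert (2 * x + 1 * y, -1 * x + 3 * y) == (a, b)
print("every [a, b] is a combination of [2, -1] and [1, 3]")
```

Since the formulas work for arbitrary a and b, this is exactly the statement that the two vectors span R^2.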
Remark   It is also true that R^2 = span([2, -1], [1, 3], v) for any third vector v: if, given [a, b], we can find x and y such that

    x[2, -1] + y[1, 3] = [a, b]

then we also have

    x[2, -1] + y[1, 3] + 0v = [a, b]

In fact, any set of vectors that contains a spanning set for R^2 is also a spanning set for R^2 (see Exercise 20).

The next example is an important (easy) case of a spanning set. We will encounter versions of this example many times.
Example 2.20   Let e1, e2, and e3 be the standard unit vectors in R^3. Then for any vector [x, y, z] we have

    [x]     [1]     [0]     [0]
    [y] = x [0] + y [1] + z [0] = x e1 + y e2 + z e3
    [z]     [0]     [0]     [1]

Thus, R^3 = span(e1, e2, e3). You should have no difficulty seeing that, in general, R^n = span(e1, e2, ..., en).
When the span of a set o f vectors In a d escription of the vectors' span .
EKBmple 2.21
1
Find the span o f 0 and
3
R~
is n ot all of lR~, i\ is reasonable to ask for
 I
1 . (See Exam ple 2.18.)
3
Solution   Thinking geometrically, we can see that the set of all linear combinations of [1, 0, 3] and [-1, 1, -3] is just the plane through the origin with [1, 0, 3] and [-1, 1, -3] as direction vectors (Figure 2.9). The vector equation of this plane is

    [x]     [ 1]     [-1]
    [y] = s [ 0] + t [ 1]
    [z]     [ 3]     [-3]

which is just another way of saying that [x, y, z] is in the span of [1, 0, 3] and [-1, 1, -3].

Figure 2.9   Two nonparallel vectors span a plane

Suppose we want to obtain the general equation of this plane. There are several ways to proceed. One is to use the fact that the equation ax + by + cz = 0 must be satisfied by the points (1, 0, 3) and (-1, 1, -3) determined by the direction vectors. Substitution then leads to a system of equations in a, b, and c. (See Exercise 17.) Another method is to use the system of equations arising from the vector equation:

     s -  t = x
          t = y
    3s - 3t = z

If we row reduce the augmented matrix, we obtain

    [1 -1 | x]        [1 -1 | x     ]
    [0  1 | y]   ->   [0  1 | y     ]
    [3 -3 | z]        [0  0 | z - 3x]

Now we know that this system is consistent, since [x, y, z] is in the span of [1, 0, 3] and [-1, 1, -3] by assumption. So we must have z - 3x = 0 (or 3x - z = 0, in more standard form), giving us the general equation we seek.

Remark   A normal vector to the plane in this example is also given by the cross product

    [1, 0, 3] x [-1, 1, -3] = [-3, 0, 1]
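The cross-product computation in the Remark is easy to verify; the sketch below also checks orthogonality to both direction vectors:

```python
def cross(u, v):
    """Cross product in R^3; the result is orthogonal to both u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v = (1, 0, 3), (-1, 1, -3)
n = cross(u, v)
dot = lambda a, b: sum(x * y for x, y in zip(a, b))
assert dot(n, u) == 0 and dot(n, v) == 0
print("normal vector:", n)
```

The result (-3, 0, 1) is a scalar multiple of (3, 0, -1), the normal read off from the general equation 3x - z = 0, so the two methods agree.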
Linear Independence

In Example 2.18, we found that

    3[1, 0, 3] + 2[-1, 1, -3] = [1, 2, 3]

Let's abbreviate this equation as 3u + 2v = w. The vector w "depends" on u and v in the sense that it is a linear combination of them. We say that a set of vectors is linearly dependent if one of them can be written as a linear combination of the others. Note that we also have u = -(2/3)v + (1/3)w and v = -(3/2)u + (1/2)w. To get around the question of which vector to express in terms of the rest, the formal definition is stated as follows:
Definition   A set of vectors v1, v2, ..., vk is linearly dependent if there are scalars c1, c2, ..., ck, at least one of which is not zero, such that

    c1 v1 + c2 v2 + ... + ck vk = 0

A set of vectors that is not linearly dependent is called linearly independent.
Remarks

• In the definition of linear dependence, the requirement that at least one of the scalars c1, c2, ..., ck must be nonzero allows for the possibility that some may be zero. In the example above, u, v, and w are linearly dependent, since 3u + 2v - w = 0 and, in fact, all of the scalars are nonzero. On the other hand, if two vectors in a set are multiples of each other, say v2 = 2v1, then 2v1 - v2 + 0v3 + ... + 0vk = 0, so the set is linearly dependent, since at least one (in fact, two) of the scalars is nonzero. (Note that the actual dependence arises simply from the fact that the first two vectors are multiples.) (See Exercise 44.)

• Since 0v1 + 0v2 + ... + 0vk = 0 for any vectors v1, v2, ..., vk, linear dependence essentially says that the zero vector can be expressed as a nontrivial linear combination of v1, v2, ..., vk. Thus, linear independence means that the zero vector can be expressed as a linear combination of v1, v2, ..., vk only in the trivial way: c1 v1 + c2 v2 + ... + ck vk = 0 only if c1 = 0, c2 = 0, ..., ck = 0.
The relationship between the intuitive notion of dependence and the formal definition is given in the next theorem. Happily, the two notions are equivalent!

Theorem 2.5   Vectors v1, v2, ..., vm in R^n are linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the others.

Proof   If one of the vectors, say v1, is a linear combination of the others, then there are scalars c2, ..., cm such that v1 = c2 v2 + ... + cm vm. Rearranging, we obtain v1 - c2 v2 - ... - cm vm = 0, which implies that v1, v2, ..., vm are linearly dependent, since at least one of the scalars (namely, the coefficient 1 of v1) is nonzero.

Conversely, suppose that v1, v2, ..., vm are linearly dependent. Then there are scalars c1, c2, ..., cm, not all zero, such that c1 v1 + c2 v2 + ... + cm vm = 0. Suppose c1 != 0. Then c1 v1 = -c2 v2 - ... - cm vm, and we may multiply both sides by 1/c1 to obtain v1 as a linear combination of the other vectors.
Note   It may appear as if we are cheating a bit in this proof. After all, we cannot be sure that v1 is a linear combination of the other vectors, nor that c1 is nonzero. However, the argument is analogous for some other vector vi or for a different scalar cj. Alternatively, we can just relabel things so that they work out as in the above proof. In a situation like this, a mathematician might begin by saying, "without loss of generality, we may assume that v1 is a linear combination of the other vectors," and then proceed as above.
Example 2.22   Any set of vectors containing the zero vector is linearly dependent. For if 0, v2, ..., vm are in R^n, then we can find a nontrivial combination of the form c1 0 + c2 v2 + ... + cm vm = 0 by setting c1 = 1 and c2 = c3 = ... = cm = 0.
Example 2.23

Determine whether the following sets of vectors are linearly independent:

(a) two vectors in R^2, neither of which is a multiple of the other
(b) [1, 1, 0], [0, 1, 1], and [1, 0, 1]
(c) [1, -1, 0], [0, 1, -1], and [-1, 0, 1]
(d) [1, 2, 0], [1, 1, -1], and [1, 4, 2]
Solution   In answering any question of this type, it is a good idea to see if you can determine by inspection whether one vector is a linear combination of the others. A little thought may save a lot of computation!

(a) The only way two vectors can be linearly dependent is if one is a multiple of the other. (Why?) These two vectors are clearly not multiples, so they are linearly independent.

(b) There is no obvious dependence relation here, so we try to find scalars c1, c2, c3 such that

    c1[1, 1, 0] + c2[0, 1, 1] + c3[1, 0, 1] = [0, 0, 0]

The corresponding linear system is

    c1      + c3 = 0
    c1 + c2      = 0
         c2 + c3 = 0

and the augmented matrix is

    [1 0 1 | 0]
    [1 1 0 | 0]
    [0 1 1 | 0]

Once again, we make the fundamental observation that the columns of the coefficient matrix are just the vectors in question!

The reduced row echelon form is

    [1 0 0 | 0]
    [0 1 0 | 0]
    [0 0 1 | 0]

(check this), so c1 = 0, c2 = 0, c3 = 0. Thus, the given vectors are linearly independent.

(c) A little reflection reveals that

    [1, -1, 0] + [0, 1, -1] + [-1, 0, 1] = [0, 0, 0]
so the three vectors are linearly dependent. (Set up a linear system as in part (b) to check this algebraically.)

(d) Once again, we observe no obvious dependence, so we proceed directly to reduce a homogeneous linear system whose augmented matrix has as its columns the given vectors:

    [1  1 1 | 0]   R2 - 2R1   [1  1 1 | 0]   R3 - R2,   [1 0  3 | 0]
    [2  1 4 | 0]   ------->   [0 -1 2 | 0]   -R2,       [0 1 -2 | 0]
    [0 -1 2 | 0]              [0 -1 2 | 0]   R1 - R2    [0 0  0 | 0]
                                             ------->

If we let the scalars be c1, c2, and c3, we have

    c1      + 3c3 = 0
         c2 - 2c3 = 0

from which we see that the system has infinitely many solutions. In particular, there must be a nonzero solution, so the given vectors are linearly dependent. If we continue, we can describe these solutions exactly: c1 = -3c3 and c2 = 2c3. Thus, for any nonzero value of c3, we have the linear dependence relation

    -3c3 [1, 2, 0] + 2c3 [1, 1, -1] + c3 [1, 4, 2] = [0, 0, 0]

(Once again, check that this is correct.)
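The dependence relation found in part (d) is a one-line check (illustrative code, not part of the text):

```python
# Example 2.23(d): verify the dependence relation -3*v1 + 2*v2 + v3 = 0
# componentwise, taking c3 = 1.
v1, v2, v3 = (1, 2, 0), (1, 1, -1), (1, 4, 2)
combo = tuple(-3 * a + 2 * b + c for a, b, c in zip(v1, v2, v3))
assert combo == (0, 0, 0)
print("-3*v1 + 2*v2 + v3 =", combo)
```

Since the combination is nontrivial (the scalars -3, 2, 1 are not all zero) and equals the zero vector, the three vectors are linearly dependent.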
We summarize this procedure for testing for linear independence as a theorem.

Theorem 2.6   Let v1, v2, ..., vm be (column) vectors in R^n and let A be the n x m matrix [v1 v2 ... vm] with these vectors as its columns. Then v1, v2, ..., vm are linearly dependent if and only if the homogeneous linear system with augmented matrix [A | 0] has a nontrivial solution.

Proof   v1, v2, ..., vm are linearly dependent if and only if there are scalars c1, c2, ..., cm, not all zero, such that c1 v1 + c2 v2 + ... + cm vm = 0. By Theorem 2.4, this is equivalent to saying that the nonzero vector [c1, c2, ..., cm] is a solution of the system whose augmented matrix is [v1 v2 ... vm | 0].
Example 2.24

The standard unit vectors e1, e2, and e3 are linearly independent in R^3, since the system with augmented matrix [e1 e2 e3 | 0] is already in the reduced row echelon form

    [1 0 0 | 0]
    [0 1 0 | 0]
    [0 0 1 | 0]

and so clearly has only the trivial solution. In general, we see that e1, e2, ..., en will be linearly independent in R^n.
Performing elementary row operations on a matrix constructs linear combinations of the rows. We can use this fact to come up with another way to test vectors for linear independence.

Example 2.25

Consider the three vectors of Example 2.23(d) as row vectors:

    [1, 2, 0],  [1, 1, -1],  and  [1, 4, 2]

We construct a matrix with these vectors as its rows and proceed to reduce it to echelon form. Each time a row changes, we denote the new row by adding a prime symbol:

    [1 2  0]   R2' = R2 - R1   [1  2  0]   R3'' = R3' + 2R2'   [1  2  0]
    [1 1 -1]   R3' = R3 - R1   [0 -1 -1]   ---------------->   [0 -1 -1]
    [1 4  2]   ------------->  [0  2  2]                       [0  0  0]

From this we see that

    0 = R3'' = R3' + 2R2' = (R3 - R1) + 2(R2 - R1) = -3R1 + 2R2 + R3

or, in terms of the original vectors,

    -3[1, 2, 0] + 2[1, 1, -1] + [1, 4, 2] = [0, 0, 0]

[Notice that this approach corresponds to taking c3 = 1 in the solution to Example 2.23(d).]

Thus, the rows of a matrix will be linearly dependent if elementary row operations can be used to create a zero row. We summarize this finding as follows:
Theorem 2.7   Let v1, v2, ..., vm be (row) vectors in R^n and let A be the m x n matrix with these vectors as its rows. Then v1, v2, ..., vm are linearly dependent if and only if rank(A) < m.

Proof   Assume that v1, v2, ..., vm are linearly dependent. Then, by Theorem 2.5, at least one of the vectors can be written as a linear combination of the others. We relabel the vectors, if necessary, so that we can write vm = c1 v1 + c2 v2 + ... + c(m-1) v(m-1). Then the elementary row operations Rm - c1 R1, Rm - c2 R2, ..., Rm - c(m-1) R(m-1) applied to A will create a zero row in row m. Thus, rank(A) < m.

Conversely, assume that rank(A) < m. Then there is some sequence of row operations that will create a zero row. A successive substitution argument analogous to that used in Example 2.25 can be used to show that 0 is a nontrivial linear combination of v1, v2, ..., vm. Thus, v1, v2, ..., vm are linearly dependent.

In some situations, we can deduce that a set of vectors is linearly dependent without doing any work. One such situation is when the zero vector is in the set (as in Example 2.22). Another is when there are "too many" vectors in the set, as the next theorem shows.
Theorem 2.8   Any set of m vectors in R^n is linearly dependent if m > n.

Proof   Let v1, v2, ..., vm be (column) vectors in R^n and let A be the n x m matrix [v1 v2 ... vm] with these vectors as its columns. By Theorem 2.6, v1, v2, ..., vm are linearly dependent if and only if the homogeneous linear system with augmented matrix [A | 0] has a nontrivial solution. But, by Theorem 2.3, this will always be the case if A has more columns than rows, and that is the case here, since the number of columns m is greater than the number of rows n.
Example 2.26

The three given vectors in R^2 are linearly dependent, since there cannot be more than two linearly independent vectors in R^2. (Note that if we want to find the actual dependence relation among these three vectors, we must solve the homogeneous system whose coefficient matrix has the given vectors as columns. Do this!)
Exercises 2.3

In Exercises 1-6, determine if the vector v is a linear combination of the remaining vectors. [The vectors for Exercises 1-6 are given in the text.]

Chapter 2  Systems of Linear Equations
In Exercises 7 and 8, determine if the vector b is in the span of the columns of the matrix A. [The matrices and vectors for Exercises 7 and 8 are given in the text.]

9. Show that R^2 is the span of the two vectors given in the text.

10. Show that R^2 is the span of the two vectors given in the text.

11. Show that R^3 is the span of the three vectors given in the text. [Hint: We know that R^3 = span(e1, e2, e3).]

12. Show that R^3 is the span of the three vectors given in the text.

In Exercises 13-16, describe the span of the given vectors (a) geometrically and (b) algebraically. [The vectors for Exercises 13-16 are given in the text.]

17. The general equation of the plane that contains the points (1, 0, 3), (1, 1, -3), and the origin is of the form ax + by + cz = 0. Solve for a, b, and c.

18. Prove that u, v, and w are all in span(u, v, w).

19. Prove that u, v, and w are all in span(u, u + v, u + v + w).

20. (a) Prove that if u1, ..., um are vectors in R^n, S = {u1, u2, ..., uk}, and T = {u1, ..., uk, uk+1, ..., um}, then span(S) is contained in span(T). [Hint: Rephrase this question in terms of linear combinations.]
(b) Deduce that if R^n = span(S), then R^n = span(T) also.

21. (a) Suppose that vector w is a linear combination of vectors u1, ..., uk and that each ui is a linear combination of vectors v1, ..., vm. Prove that w is a linear combination of v1, ..., vm and therefore span(u1, ..., uk) is contained in span(v1, ..., vm).
(b) In part (a), suppose in addition that each vj is also a linear combination of u1, ..., uk. Prove that span(u1, ..., uk) = span(v1, ..., vm).
(c) Use the result of part (b) to prove the statement given in the text.

Use the method of Example 2.23 and Theorem 2.6 to determine if the sets of vectors in Exercises 22-31 are linearly independent. If, for any of these, the answer can be determined by inspection (i.e., without calculation), state why. For any sets that are linearly dependent, find a dependence relationship among the vectors. [The vectors for Exercises 22-31 are given in the text.]

In Exercises 32-41, determine if the sets of vectors in the given exercise are linearly independent by converting the vectors to row vectors and using the method of Example 2.25 and Theorem 2.7. For any sets that are linearly dependent, find a dependence relationship among the vectors.

32. Exercise 22
33. Exercise 23
34. Exercise 24
35. Exercise 25
36. Exercise 26
37. Exercise 27
38. Exercise 28
39. Exercise 29
40. Exercise 30
41. Exercise 31

42. (a) If the columns of an n×n matrix A are linearly independent as vectors in R^n, what is the rank of A? Explain.
(b) If the rows of an n×n matrix A are linearly independent as vectors in R^n, what is the rank of A? Explain.

43. (a) If vectors u, v, and w are linearly independent, will u + v, v + w, and u + w also be linearly independent? Justify your answer.
(b) If vectors u, v, and w are linearly independent, will u - v, v - w, and u - w also be linearly independent? Justify your answer.

44. Prove that two vectors are linearly dependent if and only if one is a scalar multiple of the other. [Hint: Separately consider the case where one of the vectors is 0.]

45. Give a "row vector proof" of Theorem 2.8.

46. Prove that every subset of a linearly independent set is linearly independent.

47. Suppose that S = {v1, ..., vk, v} is a set of vectors in some R^n and that v is a linear combination of v1, ..., vk. If S' = {v1, ..., vk}, prove that span(S) = span(S'). [Hint: Exercise 21(b) is helpful here.]

48. Let {v1, ..., vk} be a linearly independent set of vectors in R^n, and let v be a vector in R^n. Suppose that v = c1v1 + c2v2 + ... + ckvk with c1 ≠ 0. Prove that {v, v2, ..., vk} is linearly independent.
Applications

There are too many applications of systems of linear equations to do them justice in a single section. This section will introduce a few applications to illustrate the diverse settings in which they arise.
Allocation of Resources

A great many applications of systems of linear equations involve allocating limited resources subject to a set of constraints.
Example 2.27
A biologist has placed three strains of bacteria (denoted I, II, and III) in a test tube, where they will feed on three different food sources (A, B, and C). Each day 2300 units of A, 800 units of B, and 1500 units of C are placed in the test tube, and each bacterium consumes a certain number of units of each food per day, as shown in Table 2.2. How many bacteria of each strain can coexist in the test tube and consume all of the food?
Table 2.2

          Bacteria    Bacteria    Bacteria
          Strain I    Strain II   Strain III
Food A       2           2           4
Food B       1           2           0
Food C       1           3           1
Solution  Let x1, x2, and x3 be the numbers of bacteria of strains I, II, and III, respectively. Since each of the x1 bacteria of strain I consumes 2 units of A per day, strain I consumes a total of 2x1 units per day. Similarly, strains II and III consume a total of 2x2 and 4x3 units of food A daily. Since we want to use up all 2300 units of A, we have the equation

2x1 + 2x2 + 4x3 = 2300

Likewise, we obtain equations corresponding to the consumption of B and C:

x1 + 2x2        =  800
x1 + 3x2 +  x3  = 1500

Thus, we have a system of three linear equations in three variables. Row reduction of the corresponding augmented matrix gives

[2 2 4 | 2300]        [1 0 0 | 100]
[1 2 0 |  800]  --->  [0 1 0 | 350]
[1 3 1 | 1500]        [0 0 1 | 350]

Therefore, x1 = 100, x2 = 350, and x3 = 350. The biologist should place 100 bacteria of strain I and 350 of each of strains II and III in the test tube if she wants all the food to be consumed.
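The row reduction above can be double-checked numerically; a minimal sketch using NumPy (assumed available) with the data of Table 2.2:

```python
import numpy as np

# Coefficient matrix from Table 2.2: rows are foods A, B, C;
# columns are units consumed per bacterium of strains I, II, III.
A = np.array([[2.0, 2.0, 4.0],
              [1.0, 2.0, 0.0],
              [1.0, 3.0, 1.0]])
b = np.array([2300.0, 800.0, 1500.0])  # units of food supplied per day

x = np.linalg.solve(A, b)
print(x)  # [100. 350. 350.]
```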
Example 2.28
Repeat Example 2.27, using the data on daily consumption of food (units per day) shown in Table 2.3. Assume this time that 1500 units of A, 3000 units of B, and 4500 units of C are placed in the test tube each day.
Table 2.3

          Bacteria    Bacteria    Bacteria
          Strain I    Strain II   Strain III
Food A       1           1           1
Food B       1           2           3
Food C       1           3           5
Solution  Let x1, x2, and x3 again be the numbers of bacteria of each type. The augmented matrix for the resulting linear system and the corresponding reduced echelon form are

[1 1 1 | 1500]        [1 0 -1 |    0]
[1 2 3 | 3000]  --->  [0 1  2 | 1500]
[1 3 5 | 4500]        [0 0  0 |    0]

We see that in this case we have more than one solution, given by

x1      -  x3 =    0
     x2 + 2x3 = 1500

Letting x3 = t, we obtain x1 = t, x2 = 1500 - 2t, and x3 = t. In any applied problem, we must be careful to interpret solutions properly. Certainly the number of bacteria
cannot be negative. Therefore, t ≥ 0 and 1500 - 2t ≥ 0. The latter inequality implies that t ≤ 750, so we have 0 ≤ t ≤ 750. Presumably the number of bacteria must be a whole number, so there are exactly 751 values of t that satisfy the inequality. Thus, our 751 solutions are of the form

[x1]   [      t ]   [   0]     [ 1]
[x2] = [1500-2t ] = [1500] + t [-2]
[x3]   [      t ]   [   0]     [ 1]

one for each integer value of t such that 0 ≤ t ≤ 750. (So, although mathematically this system has infinitely many solutions, physically there are only finitely many.)
Balancing Chemical Equations

When a chemical reaction occurs, certain molecules (the reactants) combine to form new molecules (the products). A balanced chemical equation is an algebraic equation that gives the relative numbers of reactants and products in the reaction and has the same number of atoms of each type on the left- and right-hand sides. The equation is usually written with the reactants on the left, the products on the right, and an arrow in between to show the direction of the reaction.

For example, for the reaction in which hydrogen gas (H2) and oxygen (O2) combine to form water (H2O), a balanced chemical equation is

2H2 + O2 → 2H2O

indicating that two molecules of hydrogen combine with one molecule of oxygen to form two molecules of water. Observe that the equation is balanced, since there are four hydrogen atoms and two oxygen atoms on each side. Note that there will never be a unique balanced equation for a reaction, since any positive integer multiple of a balanced equation will also be balanced. For example, 6H2 + 3O2 → 6H2O is also balanced. Therefore, we usually look for the simplest balanced equation for a given reaction. While trial and error will often work in simple examples, the process of balancing chemical equations really involves solving a homogeneous system of linear equations, so we can use the techniques we have developed to remove the guesswork.
Example 2.29

The combustion of ammonia (NH3) in oxygen produces nitrogen (N2) and water. Find a balanced chemical equation for this reaction.

Solution  If we denote the numbers of molecules of ammonia, oxygen, nitrogen, and water by w, x, y, and z, respectively, then we are seeking an equation of the form

w NH3 + x O2 → y N2 + z H2O
Comparing the numbers of nitrogen, hydrogen, and oxygen atoms in the reactants and products, we obtain three linear equations:

Nitrogen:   w = 2y
Hydrogen:  3w = 2z
Oxygen:    2x = z

Rewriting these equations in standard form gives us a homogeneous system of three linear equations in four variables. [Notice that Theorem 2.3 guarantees that such a
system will have (infinitely many) nontrivial solutions.] We reduce the corresponding augmented matrix by Gauss-Jordan elimination:

 w      - 2y      = 0        [1 0 -2  0 | 0]        [1 0 0 -2/3 | 0]
3w           - 2z = 0        [3 0  0 -2 | 0]  --->  [0 1 0 -1/2 | 0]
     2x      -  z = 0        [0 2  0 -1 | 0]        [0 0 1 -1/3 | 0]

Thus, w = (2/3)z, x = (1/2)z, and y = (1/3)z. The smallest positive value of z that will produce integer values for all four variables is the least common denominator of the fractions 2/3, 1/2, and 1/3, namely 6, which gives w = 4, x = 3, y = 2, and z = 6. Therefore, the balanced chemical equation is

4NH3 + 3O2 → 2N2 + 6H2O
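The final step, clearing denominators with a least common multiple, can be sketched in Python; the fractions below are the values of w, x, and y in terms of z found above (a minimal sketch, assuming Python 3.9+ for `math.lcm`):

```python
from fractions import Fraction
from math import lcm

# From the reduced system for w NH3 + x O2 -> y N2 + z H2O:
#   w = 2/3 z,  x = 1/2 z,  y = 1/3 z  (z is the free variable)
w, x, y = Fraction(2, 3), Fraction(1, 2), Fraction(1, 3)

# Smallest positive integer z that clears all denominators:
z = lcm(w.denominator, x.denominator, y.denominator)  # = 6
coeffs = [int(w * z), int(x * z), int(y * z), z]
print(coeffs)  # [4, 3, 2, 6]  ->  4 NH3 + 3 O2 -> 2 N2 + 6 H2O
```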
Network Analysis

Many practical situations give rise to networks: transportation networks, communications networks, and economic networks, to name a few. Of particular interest are the possible flows through networks. For example, vehicles flow through a network of roads, information flows through a data network, and goods and services flow through an economic network. For us, a network will consist of a finite number of nodes (also called junctions or vertices) connected by a series of directed edges known as branches or arcs. Each branch will be labeled with a flow that represents the amount of some commodity that can flow along that branch in the indicated direction. The fundamental rule governing flow through a network is conservation of flow: at each node, the total flow in must equal the total flow out.

Figure 2.10 shows a portion of a network, with two branches entering a node and two leaving. The conservation of flow rule implies that the total incoming flow, f1 + f2 units, must match the total outgoing flow, 20 + 30 units. Thus, we have the linear equation f1 + f2 = 50 corresponding to this node. We can analyze the flow through an entire network by constructing such equations for each node.

[Figure 2.10  Flow at a node: f1 + f2 = 50]

Example 2.30

Describe the possible flows through the network of water pipes shown in Figure 2.11, where flow is measured in liters per minute.
Solution  At each node, we write out the equation that represents the conservation of flow there. We then rewrite each equation with the variables on the left and the constant on the right, to get a linear system in standard form.

Node A:  15 = f1 + f4           f1      + f4 = 15
Node B:  f1 = f2 + 10           f1 - f2      = 10
Node C:  f2 + f3 + 5 = 30            f2 + f3 = 25
Node D:  f4 + 20 = f3                f3 - f4 = 20
[Figure 2.11  A network of water pipes]

Using Gauss-Jordan elimination, we reduce the augmented matrix:
[1  0 0  1 | 15]        [1 0 0  1 | 15]
[1 -1 0  0 | 10]  --->  [0 1 0  1 |  5]
[0  1 1  0 | 25]        [0 0 1 -1 | 20]
[0  0 1 -1 | 20]        [0 0 0  0 |  0]
(Check this.) We see that there is one free variable, f4, so we have infinitely many solutions. Setting f4 = t and expressing the leading variables in terms of f4, we obtain

f1 = 15 - t
f2 =  5 - t
f3 = 20 + t
f4 =      t

These equations describe all possible flows and allow us to analyze the network. For example, we see that if we control the flow on branch AD so that t = 5 L/min, then the other flows are f1 = 10, f2 = 0, and f3 = 25. We can do even better: We can find the minimum and maximum possible flows on each branch. Each of the flows must be nonnegative. Examining the first and second equations in turn, we see that t ≤ 15 (otherwise f1 would be negative) and t ≤ 5 (otherwise f2 would be negative). The second of these inequalities is more restrictive than the first, so we must use it. The third equation contributes no further restrictions on our parameter t, so we have deduced that 0 ≤ t ≤ 5. Combining this result with the four equations, we see that

10 ≤ f1 ≤ 15
 0 ≤ f2 ≤ 5
20 ≤ f3 ≤ 25
 0 ≤ f4 ≤ 5

This is a complete description of the possible flows through this network.
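The parametric family of flows is easy to verify mechanically: for each admissible t, conservation of flow should hold at every node and every flow should be nonnegative. A small sketch (the node equations are those set up above):

```python
# Parametric solution of Example 2.30: f4 = t is the free variable.
def flows(t):
    f1, f2, f3, f4 = 15 - t, 5 - t, 20 + t, t
    return f1, f2, f3, f4

for t in range(0, 6):                 # integer samples of 0 <= t <= 5
    f1, f2, f3, f4 = flows(t)
    assert f1 + f4 == 15              # node A: 15 in, f1 + f4 out
    assert f1 == f2 + 10              # node B
    assert f2 + f3 + 5 == 30          # node C
    assert f4 + 20 == f3              # node D
    assert min(f1, f2, f3, f4) >= 0   # physically meaningful flows
print("all node equations balance for 0 <= t <= 5")
```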
Electrical Networks

Electrical networks are a specialized type of network providing information about power sources, such as batteries, and devices powered by these sources, such as light bulbs or motors. A power source "forces" a current of electrons to flow through the network, where it encounters various resistors, each of which requires that a certain amount of force be applied in order for the current to flow through it. The fundamental law of electricity is Ohm's law, which states exactly how much force E is needed to drive a current I through a resistor with resistance R.

Ohm's Law
force = resistance × current      E = RI

Force is measured in volts, resistance in ohms, and current in amperes (or amps, for short). Thus, in terms of these units, Ohm's law becomes "volts = ohms × amps," and it tells us what the "voltage drop" is when a current passes through a resistor; that is, how much voltage is used up.

Current flows out of the positive terminal of a battery and flows back into the negative terminal, traveling around one or more closed circuits in the process. In a diagram of an electrical network, a battery is represented by a pair of parallel bars (where the longer bar is the positive terminal) and a resistor by a zigzag line. The following two laws, whose discovery we owe to Kirchhoff, govern electrical networks. The first is a "conservation of flow" law at each node; the second is a "balancing of voltage" law around each circuit.

Kirchhoff's Laws
Current Law (nodes): The sum of the currents flowing into any node is equal to the sum of the currents flowing out of that node.
Voltage Law (circuits): The sum of the voltage drops around any circuit is equal to the total voltage around the circuit (provided by the batteries).

Figure 2.12 illustrates Kirchhoff's laws. In part (a), the current law gives I1 = I2 + I3 (or I1 - I2 - I3 = 0, as we will write it); part (b) gives 4I = 10, where we have used Ohm's law to compute the voltage drop 4I at the resistor. Using Kirchhoff's laws, we can set up a system of linear equations that will allow us to determine the currents in an electrical network.
Example 2.31
Determine the currents I1, I2, and I3 in the electrical network shown in Figure 2.13.
Solution  This network has two batteries and four resistors. Current I1 flows through the top branch BCA, current I2 flows across the middle branch AB, and current I3 flows through the bottom branch BDA. At node A, the current law gives I1 + I3 = I2, or

I1 - I2 + I3 = 0

(Observe that we get the same equation at node B.)
[Figure 2.12  Kirchhoff's laws: (a) I1 = I2 + I3; (b) 4I = 10, for a 10-volt battery and a 4-ohm resistor]

[Figure 2.13  A circuit with an 8-volt battery, a 16-volt battery, and resistors of 2, 2, 1, and 4 ohms]
Next we apply the voltage law for each circuit. For the circuit CABC, the voltage drops at the resistors are 2I1, I2, and 2I1. Thus, we have the equation

4I1 + I2 = 8

Similarly, for the circuit DABD, we obtain

I2 + 4I3 = 16

(Notice that there is actually a third circuit, CADBC, if we "go against the flow." In this case, we must treat the voltages and resistances on the "reversed" paths as negative. Doing so gives 2I1 + 2I1 - 4I3 = 8 - 16 = -8, or 4I1 - 4I3 = -8, which we observe is just the difference of the voltage equations for the other two circuits. Thus, we can omit this equation, as it contributes no new information. On the other hand, including it does no harm.) We now have a system of three linear equations in three variables:

 I1 - I2 +  I3 =  0
4I1 + I2       =  8
      I2 + 4I3 = 16
Gauss-Jordan elimination produces

[1 -1 1 |  0]        [1 0 0 | 1]
[4  1 0 |  8]  --->  [0 1 0 | 4]
[0  1 4 | 16]        [0 0 1 | 3]
Hence, the currents are I1 = 1 amp, I2 = 4 amps, and I3 = 3 amps.
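The same elimination can be checked numerically; a sketch assuming NumPy, with the three Kirchhoff equations set up above:

```python
import numpy as np

# Kirchhoff equations for Example 2.31 (unknowns I1, I2, I3):
A = np.array([[1.0, -1.0, 1.0],   # node A:      I1 - I2 + I3 = 0
              [4.0,  1.0, 0.0],   # circuit CABC: 4*I1 + I2  = 8
              [0.0,  1.0, 4.0]])  # circuit DABD: I2 + 4*I3  = 16
b = np.array([0.0, 8.0, 16.0])

I = np.linalg.solve(A, b)
print(I)  # [1. 4. 3.]  (amps)
```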
Remark  In some electrical networks, the currents may have fractional values or may even be negative. A negative value simply means that the current in the corresponding branch flows in the direction opposite that shown on the network diagram.
Example 2.32
The network shown in Figure 2.14 has a single power source A and five resistors. Find the currents I, I1, ..., I5. This is an example of what is known in electrical engineering as a Wheatstone bridge circuit.
[Figure 2.14  A bridge circuit: a 10-volt source at A, with resistors of 2, 1, 2, 1, and 2 ohms on the branches carrying I1, I2, I3, I4, and I5, respectively]
Solution  Kirchhoff's current law gives the following equations at the four nodes:

Node B:  I - I1 - I4 = 0
Node C:  I1 - I2 - I3 = 0
Node D:  I - I2 - I5 = 0
Node E:  I3 + I4 - I5 = 0

For the three basic circuits, the voltage law gives

Circuit ABEDA:  I4 + 2I5 = 10
Circuit BCEB:   2I1 + 2I3 - I4 = 0
Circuit CDEC:   I2 - 2I5 - 2I3 = 0

(Observe that branch DAB has no resistor and therefore no voltage drop; thus, there is no I term in the equation for circuit ABEDA. Note also that we had to change signs three times because we went "against the current." This poses no problem, since we will let the sign of the answer determine the direction of current flow.) We now have a system of seven equations in six variables. Row reduction gives
[1 -1  0  0 -1  0 |  0]        [1 0 0 0 0 0 |  7]
[0  1 -1 -1  0  0 |  0]        [0 1 0 0 0 0 |  3]
[1  0 -1  0  0 -1 |  0]        [0 0 1 0 0 0 |  4]
[0  0  0  1  1 -1 |  0]  --->  [0 0 0 1 0 0 | -1]
[0  0  0  0  1  2 | 10]        [0 0 0 0 1 0 |  4]
[0  2  0  2 -1  0 |  0]        [0 0 0 0 0 1 |  3]
[0  0  1 -2  0 -2 |  0]        [0 0 0 0 0 0 |  0]
(Use your calculator or CAS to check this.) Thus, the solution (in amps) is I = 7, I1 = I5 = 3, I2 = I4 = 4, and I3 = -1. The significance of the negative value here is that the current through branch CE is flowing in the direction opposite that marked on the diagram.
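Since the bridge system has seven equations in six unknowns, `numpy.linalg.solve` does not apply directly; a least-squares solver recovers the exact solution of this consistent overdetermined system (a sketch, assuming NumPy, with the rows as set up above):

```python
import numpy as np

# Columns: I, I1, I2, I3, I4, I5 (bridge circuit of Example 2.32).
A = np.array([[1, -1,  0,  0, -1,  0],   # node B
              [0,  1, -1, -1,  0,  0],   # node C
              [1,  0, -1,  0,  0, -1],   # node D
              [0,  0,  0,  1,  1, -1],   # node E
              [0,  0,  0,  0,  1,  2],   # circuit ABEDA
              [0,  2,  0,  2, -1,  0],   # circuit BCEB
              [0,  0,  1, -2,  0, -2]],  # circuit CDEC
             dtype=float)
b = np.array([0, 0, 0, 0, 10, 0, 0], dtype=float)

currents, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(currents, 6))  # [ 7.  3.  4. -1.  4.  3.]
```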
Remark  There is only one power source in this example, so the single 10-volt battery sends a current of 7 amps through the network. If we substitute these values into Ohm's law, E = RI, we get 10 = 7R, or R = 10/7. Thus, the entire network behaves as if there were a single 10/7-ohm resistor. This value is called the effective resistance (Reff) of the network.
Finite Linear Games

There are many situations in which we must consider a physical system that has only a finite number of states. Sometimes these states can be altered by applying certain processes, each of which produces finitely many outcomes. For example, a light bulb can be on or off, and a switch can change the state of the light bulb from on to off and vice versa. Digital systems that arise in computer science are often of this type. More frivolously, many computer games feature puzzles in which a certain device must be manipulated by various switches to produce a desired outcome. The finiteness of such situations is perfectly suited to analysis using modular arithmetic, and often linear systems over some Zp play a role. Problems involving this type of situation are often called finite linear games.
Example 2.33

A row of five lights is controlled by five switches. Each switch changes the state (on or off) of the light directly above it and the states of the lights immediately adjacent to the left and right. For example, if the first and third lights are on, as in Figure 2.15(a), then pushing switch A changes the state of the system to that shown in Figure 2.15(b). If we next push switch C, then the result is the state shown in Figure 2.15(c). Suppose that initially all the lights are off. Can we push the switches in some order so that only the first, third, and fifth lights will be on? Can we push the switches in some order so that only the first light will be on?
[Figure 2.15  Three states of the row of lights: (a), (b), and (c)]
Solution  The on/off nature of this problem suggests that binary notation will be helpful and that we should work with Z2. Accordingly, we represent the states of the five lights by a vector in Z2^5, where 0 represents off and 1 represents on. Thus, for example, the vector [0 1 1 0 0]^T corresponds to Figure 2.15(b).
We may also use vectors in Z2^5 to represent the action of each switch. If a switch changes the state of a light, the corresponding component is a 1; otherwise, it is 0. With this convention, the actions of the five switches are given by

a = [1 1 0 0 0]^T,  b = [1 1 1 0 0]^T,  c = [0 1 1 1 0]^T,  d = [0 0 1 1 1]^T,  e = [0 0 0 1 1]^T
The situation depicted in Figure 2.15(a) corresponds to the initial state

s = [1 0 1 0 0]^T

and pushing switch A produces the vector sum (in Z2^5)

s + a = [1 0 1 0 0]^T + [1 1 0 0 0]^T = [0 1 1 0 0]^T

Observe that this result agrees with Figure 2.15(b).

Starting with any initial configuration s, suppose we push the switches in the order A, C, D, A, C, B. This corresponds to the vector sum s + a + c + d + a + c + b. But in Z2^5, addition is commutative, so we have

s + a + c + d + a + c + b = s + 2a + b + 2c + d = s + b + d

where we have used the fact that 2 = 0 in Z2. Thus, we would achieve the same result by pushing only B and D, and the order does not matter. (Check that this is correct.) Hence, in this example, we do not need to push any switch more than once. So, to see if we can achieve a target configuration t starting from an initial configuration s, we need to determine whether there are scalars x1, ..., x5 in Z2 such that

s + x1a + x2b + x3c + x4d + x5e = t
In other words, we need to solve (if possible) the linear system over Z2 that corresponds to the vector equation

x1a + x2b + x3c + x4d + x5e = t - s

In this case, s = 0 and our first target configuration is

t = [1 0 1 0 1]^T
The augmented matrix of this system has the given vectors as columns:

[1 1 0 0 0 | 1]
[1 1 1 0 0 | 0]
[0 1 1 1 0 | 1]
[0 0 1 1 1 | 0]
[0 0 0 1 1 | 1]

We reduce it over Z2 to obtain
[1 0 0 0 1 | 0]
[0 1 0 0 1 | 1]
[0 0 1 0 0 | 1]
[0 0 0 1 1 | 1]
[0 0 0 0 0 | 0]
Thus, x5 is a free variable. Hence, there are exactly two solutions (corresponding to x5 = 0 and x5 = 1). Solving for the other variables in terms of x5, we obtain

x1 = x5
x2 = 1 + x5
x3 = 1
x4 = 1 + x5

So, when x5 = 0 and x5 = 1, we have the solutions

x = [0 1 1 1 0]^T   and   x = [1 0 1 0 1]^T

respectively. (Check that these both work.)
Similarly, in the second case, we have t = [1 0 0 0 0]^T. The augmented matrix reduces as follows:

[1 1 0 0 0 | 1]        [1 0 0 0 1 | 0]
[1 1 1 0 0 | 0]        [0 1 0 0 1 | 0]
[0 1 1 1 0 | 0]  --->  [0 0 1 0 0 | 0]
[0 0 1 1 1 | 0]        [0 0 0 1 1 | 0]
[0 0 0 1 1 | 0]        [0 0 0 0 0 | 1]

showing that there is no solution in this case; that is, it is impossible to start with all of the lights off and turn only the first light on.
Example 2.33 shows the power of linear algebra. Even though we might have found out by trial and error that there was no solution, checking all possible ways to push the switches would have been extremely tedious. We might also have missed the fact that no switch need ever be pushed more than once.
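The computation in Example 2.33 can be automated; here is a sketch of Gaussian elimination over Z2 (the function and variable names are ours, not the book's):

```python
# Gaussian elimination over Z2 for the five-lights puzzle of Example 2.33.
def solve_mod2(A, t):
    """Return one solution of A x = t over Z2, or None if inconsistent."""
    rows, cols = len(A), len(A[0])
    M = [A[i][:] + [t[i]] for i in range(rows)]  # augmented matrix
    pivots, r = [], 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c]:
                M[i] = [(u + v) % 2 for u, v in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(all(v == 0 for v in row[:-1]) and row[-1] for row in M):
        return None                 # a row reads 0 = 1: no solution
    x = [0] * cols                  # free variables set to 0
    for i, c in enumerate(pivots):
        x[c] = M[i][-1]
    return x

# Switch matrix: column j is the action of switch j (A through E).
A = [[1, 1, 0, 0, 0],
     [1, 1, 1, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 1, 1, 1],
     [0, 0, 0, 1, 1]]

print(solve_mod2(A, [1, 0, 1, 0, 1]))  # [0, 1, 1, 1, 0]: push B, C, D
print(solve_mod2(A, [1, 0, 0, 0, 0]))  # None: only light 1 on is impossible
```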
Example 2.34
Consider a row of only three lights, each of which can be off, light blue, or dark blue. Below the lights are three switches, A, B, and C, each of which changes the states of particular lights to the next state, in the order shown in Figure 2.16. Switch A changes the states of the first two lights, switch B all three lights, and switch C the last two lights. If all three lights are initially off, is it possible to push the switches in some order so that the lights are off, light blue, and dark blue, in that order (as in Figure 2.17)?
[Figure 2.16  The state cycle: off → light blue → dark blue → off]

[Figure 2.17  Target configuration: off, light blue, dark blue]
Solution  Whereas Example 2.33 involved Z2, this one clearly (is it clear?) involves Z3. Accordingly, the switches correspond to the vectors

a = [1 1 0]^T,  b = [1 1 1]^T,  c = [0 1 1]^T
in Z3^3, and the final configuration we are aiming for is t = [0 1 2]^T. (Off is 0, light blue is 1, and dark blue is 2.) We wish to find scalars x1, x2, x3 in Z3 such that

x1a + x2b + x3c = t

(where xi represents the number of times the ith switch is pushed). This equation gives rise to the augmented matrix [a b c | t], which reduces over Z3 as follows:

[1 1 0 | 0]        [1 0 0 | 2]
[1 1 1 | 1]  --->  [0 1 0 | 1]
[0 1 1 | 2]        [0 0 1 | 1]

Hence, there is a unique solution: x1 = 2, x2 = 1, x3 = 1. In other words, we must push switch A twice and the other two switches once each. (Check this.)
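With only three switches, each pushed 0, 1, or 2 times, Example 2.34 can also be checked by brute force over Z3; a small sketch:

```python
from itertools import product

# Brute-force check of Example 2.34 over Z3: try every (x1, x2, x3).
a, b, c = (1, 1, 0), (1, 1, 1), (0, 1, 1)   # switch vectors in Z3^3
t = (0, 1, 2)                               # target: off, light blue, dark blue

solutions = [x for x in product(range(3), repeat=3)
             if tuple((x[0] * a[i] + x[1] * b[i] + x[2] * c[i]) % 3
                      for i in range(3)) == t]
print(solutions)  # [(2, 1, 1)]: push A twice, B once, C once
```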
Exercises 2.4

Allocation of Resources

1. Suppose that, in Example 2.27, 400 units of food A, 600 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.4. How many bacteria of each strain can coexist in the test tube and consume all of the food?
Table 2.4

          Bacteria    Bacteria    Bacteria
          Strain I    Strain II   Strain III
Food A       1           2           0
Food B       2           1           1
Food C       1           1           2
2. Suppose that, in Example 2.27, 400 units of food A, 500 units of B, and 600 units of C are placed in the test

Table 2.5

          Bacteria    Bacteria    Bacteria
          Strain I    Strain II   Strain III
Food A       1           2           0
Food B       2           1           3
Food C       1           1           1
tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.5. How many bacteria of each strain can coexist in the test tube and consume all of the food?

3. A florist offers three sizes of flower arrangements containing roses, daisies, and chrysanthemums. Each small arrangement contains one rose, three daisies, and three chrysanthemums. Each medium arrangement contains two roses, four daisies, and six chrysanthemums. Each large arrangement contains four roses, eight daisies, and six chrysanthemums. One day, the florist noted that she used a total of 24 roses, 50 daisies, and 48 chrysanthemums in filling orders for these three types of arrangements. How many arrangements of each type did she make?

4. (a) In your pocket you have some nickels, dimes, and quarters. There are 20 coins altogether and exactly twice as many dimes as nickels. The total value of the coins is $3.00. Find the number of coins of each type.
(b) Find all possible combinations of 20 coins (nickels, dimes, and quarters) that will make exactly $3.00.

5. A coffee merchant sells three blends of coffee. A bag of the house blend contains 300 grams of Colombian beans and 200 grams of French roast beans. A bag of the special blend contains 200 grams of Colombian beans, 200 grams of Kenyan beans, and 100 grams of French roast beans. A bag of the gourmet blend
contains 100 grams of Colombian beans, 200 grams of Kenyan beans, and 200 grams of French roast beans. The merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 25 kilograms of French roast beans. If he wishes to use up all of the beans, how many bags of each type of blend can be made?

6. Redo Exercise 5, assuming that the house blend contains 300 grams of Colombian beans, 50 grams of Kenyan beans, and 150 grams of French roast beans and the gourmet blend contains 100 grams of Colombian beans, 350 grams of Kenyan beans, and 50 grams of French roast beans. This time the merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 15 kilograms of French roast beans. Suppose one bag of the house blend produces a profit of $0.50, one bag of the special blend produces a profit of $1.50, and one bag of the gourmet blend produces a profit of $2.00. How many bags of each type should the merchant prepare if he wants to use up all of the beans and maximize his profit? What is the maximum profit?

Balancing Chemical Equations

In Exercises 7-14, balance the chemical equation for each reaction.

7. FeS2 + O2 → Fe2O3 + SO2

8. CO2 + H2O → C6H12O6 + O2 (This reaction takes place when a green plant converts carbon dioxide and water to glucose and oxygen during photosynthesis.)

9. C4H10 + O2 → CO2 + H2O (This reaction occurs when butane, C4H10, burns in the presence of oxygen to form carbon dioxide and water.)

10. C7H6O2 + O2 → H2O + CO2

11. C5H11OH + O2 → H2O + CO2 (This equation represents the combustion of amyl alcohol.)

12. HClO4 + P4O10 → H3PO4 + Cl2O7

13. Na2CO3 + C + N2 → NaCN + CO

14. C2H2Cl4 + Ca(OH)2 → C2HCl3 + CaCl2 + H2O

Network Analysis

15. Figure 2.18 shows a network of water pipes with flows measured in liters per minute.
(a) Set up and solve a system of linear equations to find the possible flows.
(b) If the flow through AB is restricted to 5 L/min, what will the flows through the other two branches be?
(c) What are the minimum and maximum possible flows through each branch?
(d) We have been assuming that flow is always positive. What would negative flow mean, assuming we allowed it? Give an illustration for this example.

[Figure 2.18]

16. The downtown core of Gotham City consists of one-way streets, and the traffic flow has been measured at each intersection. For the city block shown in Figure 2.19, the numbers represent the average numbers of vehicles per minute entering and leaving intersections A, B, C, and D during business hours.
(a) Set up and solve a system of linear equations to find the possible flows f1, ..., f4.
(b) If traffic is regulated on CD so that the flow there is 10 vehicles per minute, what will the average flows on the other streets be?
(c) What are the minimum and maximum possible flows on each street?
(d) How would the solution change if all of the directions were reversed?

[Figure 2.19]
Section 2.4
17. A netwo rk of m igation ditches is shown in Figure 2.20, with flows measured in thousa nds o f liters per day. (a)
SCI
up and solve a system of lin ca r equations to find
the possible 110ws h ....
,is'
(b ) Suppose DC is closed. Wha t range of flow will need 10 be mamlai ned th ro ugh DB? (e) Fro m Figu re 2.20 it is deartha! DIJ cannO( be closed. (Why no t?) How does your solution in part (a) show
Electllcal "etwDI's For Exercises J9 (Iud 20, defemllne tile Ctlrrctt/s jor the gIven elecrricailletworks. I
19.
•
,
I
•
, 1 ohm
IIliS.
J,
(d) Front your solution in part (a), determine the minimu m and maxImum fl ows through DB.
i
c 8 volts
.,
100 ~
115
Applica ti ons
I,
•
A
•
B
1 oh m 4 ohms
A
I)
I,
•
•
•
I
\
I,
13 volls
:{'
20.
•
I
,
\
•
I
,
5 volts 1 ohm
"
C
c
J,
o
A
I,
•
•
B
2 ohms
fillare 2.21 4 ohms
18. (a) Set up and solve a system of linear equations to find the possible fl ows in the network shown in Figure 2.21. (b ) Is it possible for f. == 100 and i6= ISO? (Answer this questio n firs t wi th reference to your solutIOn III part (al and then direct ly from Figure 2.21. ) (el If h = 0, what will the ra nge of flo w l>c on ellch of the other b ranches?
150 !
lOot
[,
200 ~
[,t 4
c
I,t
f,t
/6
2 oh m ~
c
100 ~
/J
h
• ISO
•
r
E
0
21. (a) Find the cu rren ts I, 11" ' " I., in the bridge circuit . I·· 1Il " sure 2" . __ . (b ) Find the effective resistance of this network. (e) Crill you change the resistance III bra nch Be (bu t leave everything else unchanged ) so Ihal the current through branch CEbecomesO? I ohm
[Figure 2.22: bridge circuit with nodes A, B, C, D, E and currents I, I1, ..., I5; the values shown include 8 volts, 14 volts, 1 ohm, and 2 ohms.]
Chapter 2 Systems of Linear Equations
22. The networks in parts (a) and (b) of Figure 2.23 show two resistors coupled in series and in parallel, respectively. We wish to find a general formula for the effective resistance of each network; that is, find R_eff such that E = R_eff I.
24. (a) In Example 2.33, suppose the fourth light is initially on and the other four lights are off. Can we push the switches in some order so that only the second and fourth lights will be on? (b) Can we push the switches in some order so that only the second light will be on?
(a) Show that the effective resistance R_eff of a network with two resistors coupled in series [Figure 2.23(a)] is given by
R_eff = R1 + R2
25. In Example 2.33, describe all possible configurations of lights that can be obtained if we start with all the lights off.

26. (a) In Example 2.34, suppose that all of the lights are initially off. Show that it is possible to push the switches in some order so that the lights are off, dark blue, and light blue, in that order. (b) Show that it is possible to push the switches in some order so that the lights are light blue, off, and light blue, in that order. (c) Prove that any configuration of the three lights can be achieved.
(b) Show that the effective resistance R_eff of a network with two resistors coupled in parallel [Figure 2.23(b)] is given by
1/R_eff = 1/R1 + 1/R2
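The two target formulas of Exercise 22 (series resistances add; parallel resistances combine through reciprocals) can be sanity-checked numerically. A quick sketch of mine, not part of the text:

```python
# Effective resistance of two resistors, per the formulas of Exercise 22.
def r_series(r1, r2):
    # Series: the same current passes through both, so voltages add.
    return r1 + r2

def r_parallel(r1, r2):
    # Parallel: the same voltage drives both, so currents (1/R) add.
    return 1 / (1 / r1 + 1 / r2)

print(r_series(1, 4))    # 5
print(r_parallel(2, 2))  # 1.0
```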
27. Suppose the lights in Example 2.33 can be off, light blue, or dark blue and the switches work as described in Example 2.34. (That is, the switches control the same lights as in Example 2.33 but cycle through the colors as in Example 2.34.) Show that it is possible to start with all of the lights off and push the switches in some order so that the lights are dark blue, light blue, dark blue, light blue, and dark blue, in that order.
[Figure 2.23(a): a series circuit with source E, current I, and resistors R1 and R2.]
28. For Exercise 27, describe all possible configurations of lights that can be obtained, starting with all the lights off.
[Figure 2.23(b): a parallel circuit with source E, currents I, I1, I2, and resistors R1 and R2.]
29. Nine squares, each one either black or white, are arranged in a 3 × 3 grid. Figure 2.24 shows one possible arrangement. When touched, each square changes its own state and the states of some of its neighbors (black → white and white → black). Figure 2.25 shows how the state changes work. (Touching the square whose number is circled causes the states of the squares marked * to change.) The object of the game is to turn all nine squares black. [Exercises 29 and 30
Figure 2.23 Resistors in series and in parallel
Flnlle linear Games 23. (a) In Example 2.33, suppose all th e lights are initially off. Ca n we push the swi tches in some order so that only the second and four th lights will be on? (b) Can we push the switches in some order so that only the second light will be on?
Figure 2.24 The nine squares puzzle
[Figure 2.25: nine 3 × 3 grids, one for each square 1 through 9; in each grid the touched square's number is circled and the squares that change state are marked *.]
Figure 2.25 State changes for the nine squares puzzle
30. Consider a variation on the nine squares puzzle. The game is the same as that described in Exercise 29 except that there are three possible states for each square: white, grey, or black. The squares change as shown in Figure 2.25, but now the state changes follow the
Figure 2.26 The nine squares puzzle with more states
cycle white → grey → black → white. Show how the winning all-black configuration can be achieved from the initial configuration shown in Figure 2.26.
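Puzzles like Exercises 29 and 30 are finite linear games: each touch adds a fixed 0/1 vector to the board state, so solving the puzzle means solving a linear system over Z2. Here is a sketch of that idea; the plus-shaped touch pattern used below is an assumption for illustration only, not the actual pattern of Figure 2.25:

```python
# ASSUMED touch rule (plus shape): touching square i toggles i and its
# horizontal and vertical neighbors.  This is NOT Figure 2.25's pattern.
def touch_matrix(n=3):
    cols = []
    for r in range(n):
        for c in range(n):
            col = [0] * (n * n)
            for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    col[rr * n + cc] = 1
            cols.append(col)
    return cols

def solve(target):
    # Over Z_2 only the SET of touched squares matters, so try all 2^9 sets.
    cols = touch_matrix()
    for mask in range(2 ** 9):
        state = [0] * 9
        for i in range(9):
            if mask >> i & 1:
                state = [(s + c) % 2 for s, c in zip(state, cols[i])]
        if state == target:
            return [i for i in range(9) if mask >> i & 1]
    return None

print(solve([1] * 9))  # squares to touch to turn an all-white board all black
```

A Gaussian-elimination solution over Z2 would replace the brute-force search for larger boards; Exercise 30's three-state version works the same way over Z3.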
Miscellaneous Problems
[Figure 2.26: a 3 × 3 grid of numbered squares in assorted white, grey, and black states, the starting configuration for Exercise 30.]
In Exercises 31-47, set up and solve an appropriate system of linear equations to answer the questions.

31. Grace is three times as old as Hans, but in 5 years she will be twice as old as Hans is then. How old are they now?

32. The sum of Annie's, Bert's, and Chris's ages is 60. Annie is older than Bert by the same number of years that Bert is older than Chris. When Bert is as old as Annie is now, Annie will be three times as old as Chris is now. What are their ages?

The preceding two problems are typical of those found in popular books of mathematical puzzles. However, they have their origins in antiquity. A Babylonian clay tablet that survives from about 300 B.C. contains the following problem.

33. There are two fields whose total area is 1800 square yards. One field produces grain at the rate of 2/3 bushel per square yard; the other field produces grain at the rate of 1/2 bushel per square yard. If the total yield is 1100 bushels, what is the size of each field?

Over 2000 years ago, the Chinese developed methods for solving systems of linear equations, including a version of Gaussian elimination that did not become well known in Europe until the 19th century. (There is no evidence that Gauss was aware of the Chinese methods when he developed what we now call Gaussian elimination. However, it is clear that the Chinese knew the essence of the method, even though they did not justify its use.) The following problem is taken from the Chinese text Jiuzhang suanshu (Nine Chapters in the Mathematical Art), written during the early Han Dynasty, about 200 B.C.

34. There are three types of corn. Three bundles of the first type, two of the second, and one of the third make 39 measures. Two bundles of the first type, three of the second, and one of the third make 34 measures. And one bundle of the first type, two of the second, and three of the third make 26 measures. How many measures of corn are contained in one bundle of each type?

35. Describe all possible values of a, b, c, and d that will make each of the following a valid addition table. [Problems 35-38 are based on the article
"An Application of Matrix Theory" by Paul Glaister in The Mathematics Teacher, 85 (1992), pp. 220-223.]
(a)
+ | a  b
c | 2  3
d | 4  5

(b)
+ | a  b
c | 3  6
d | 4  5
36. What conditions on w, x, y, and z will guarantee that we can find a, b, c, and d so that the following is a valid addition table?

+ | a  b
c | w  x
d | y  z

37. Describe all possible values of a, b, c, d, e, and f that will make each of the following a valid addition table.

(a) [3 × 3 addition table with column headers a, b, c, row headers d, e, f, and entries including 3, 5, 2, 4, 1, 3]
(b) [3 × 3 addition table with column headers a, b, c, row headers d, e, f, and entries including 3, 4, 5, 4, 5, 6]

38. Generalizing Exercise 36, find conditions on the entries of a 3 × 3 addition table that will guarantee that we can solve for a, b, c, d, e, and f as above.

39. From elementary geometry we know that there is a unique straight line through any two points in a plane. Less well known is the fact that there is a unique parabola through any three noncollinear points in a plane. For each set of points below, find a parabola with an equation of the form y = ax² + bx + c that passes through the given points. (Sketch the resulting parabola to check the validity of your answer.)
(a) (0, 1), (−1, 4), and (2, 1)
(b) (−3, 1), (−2, 2), and (−1, 5)

40. Through any three noncollinear points there also passes a unique circle. Find the circles (whose general equations are of the form x² + y² + ax + by + c = 0) that pass through the sets of points in Exercise 39. (To check the validity of your answer, find the center and radius of each circle and draw a sketch.)

The process of adding rational functions (ratios of polynomials) by placing them over a common denominator is the analogue of adding rational numbers. The reverse process of taking a rational function apart by writing it as a sum of simpler rational functions is useful in several areas of mathematics; for example, it arises in calculus when we need to integrate a rational function and in discrete mathematics when we use generating functions to solve recurrence relations. The decomposition of a rational function as a sum of partial fractions leads to a system of linear equations. In Exercises 41-44, find the partial fraction decomposition of the given form. (The capital letters denote constants.)

41. (3x + 1)/(x² + 2x − 3) = A/(x − 1) + B/(x + 3)

42. (x² − 3x + 3)/(x³ + 2x² + x) = A/x + B/(x + 1) + C/(x + 1)²

43. (x − 1)/((x + 1)(x² + 1)(x² + 4)) = A/(x + 1) + (Bx + C)/(x² + 1) + (Dx + E)/(x² + 4)

44. (x² + x + 1)/(x(x − 1)(x² + x + 1)(x² + 1)³) = A/x + B/(x − 1) + (Cx + D)/(x² + x + 1) + (Ex + F)/(x² + 1) + (Gx + H)/(x² + 1)² + (Ix + J)/(x² + 1)³

Following are two useful formulas for the sums of powers of consecutive natural numbers:

1 + 2 + ··· + n = n(n + 1)/2

and

1² + 2² + ··· + n² = n(n + 1)(2n + 1)/6

The validity of these formulas for all values of n ≥ 1 (or even n ≥ 0) can be established using mathematical induction (see Appendix B). One way to make an educated guess as to what the formulas are, though, is to observe that we can rewrite the two formulas above as

n²/2 + n/2  and  n³/3 + n²/2 + n/6

respectively. This leads to the conjecture that the sum of pth powers of the first n natural numbers is a polynomial of degree p + 1 in the variable n.

45. Assuming that 1 + 2 + ··· + n = an² + bn + c, find a, b, and c by substituting three values for n and thereby obtaining a system of linear equations in a, b, and c.

46. Assume that 1² + 2² + ··· + n² = an³ + bn² + cn + d. Find a, b, c, and d. (Hint: It is legitimate to use n = 0. What is the left-hand side in that case?)

47. Show that 1³ + 2³ + ··· + n³ = (n(n + 1)/2)².
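The substitution method of Exercise 45 is itself a small linear system. A sketch of mine (using NumPy, which the text does not use) of that calculation:

```python
import numpy as np

# Substituting n = 1, 2, 3 into 1 + 2 + ... + n = a*n^2 + b*n + c gives a
# 3x3 linear system in the unknowns a, b, c.
ns = np.array([1, 2, 3])
sums = np.array([1, 3, 6])  # 1, 1 + 2, 1 + 2 + 3
A = np.column_stack([ns**2, ns, np.ones(3)])
a, b, c = np.linalg.solve(A, sums)
print(a, b, c)  # a = 0.5, b = 0.5, c = 0 up to roundoff, i.e., n(n + 1)/2
```

The same three-line setup with n = 0, 1, 2, 3 and cubes of n handles Exercise 46.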
The Global Positioning System
The Global Positioning System (GPS) is used in a variety of situations for determining geographical locations. The military, surveyors, airlines, shipping companies, and hikers all make use of it. GPS technology is becoming so commonplace that some automobiles, cellular phones, and various handheld devices are now equipped with it. The basic idea of GPS is a variant on three-dimensional triangulation: A point on Earth's surface is uniquely determined by knowing its distances from three other points. Here the point we wish to determine is the location of the GPS receiver, the other points are satellites, and the distances are computed using the travel times of radio signals from the satellites to the receiver. We will assume that Earth is a sphere on which we impose an xyz-coordinate system with Earth centered at the origin and with the positive z-axis running through the north pole and fixed relative to Earth. For simplicity, let's take one unit to be equal to the radius of Earth. Thus Earth's surface becomes the unit sphere with equation x² + y² + z² = 1. Time will be measured in hundredths of a second. GPS finds distances by knowing how long it takes a radio signal to get from one point to another. For this we need to know the speed of light, which is approximately equal to 0.47 Earth radii per hundredth of a second. Let's imagine that you are a hiker lost in the woods at point (x, y, z) at some time t. You don't know where you are, and furthermore, you have no watch, so you don't know what time it is. However, you have your GPS device, and it receives simultaneous signals from four satellites, giving their positions and times as shown in Table 2.6. (Distances are measured in Earth radii and time in hundredths of a second past midnight.)
This application is based on the article "An Underdetermined Linear System for GPS" by Dan Kalman in The College Mathematics Journal, 33 (2002), pp. 384-390.

For a more in-depth treatment of the ideas introduced here, see G. Strang and K. Borre, Linear Algebra, Geodesy, and GPS (Wellesley-Cambridge Press, MA, 1997).
Let (x, y, z) be your position, and let t be the time when the signals arrive. The goal is to solve for x, y, z, and t. Your distance from Satellite 1 can be computed as follows. The signal, traveling at a speed of 0.47 Earth radii/10⁻² sec, was sent at time 1.29 and arrived at time t, so it took t − 1.29 hundredths of a second to reach you. Distance equals velocity multiplied by (elapsed) time, so

d = 0.47(t − 1.29)
Table 2.6 Satellite Data

Satellite | Position           | Time
1         | (1.11, 2.55, 2.14) | 1.29
2         | (2.87, 0.00, 1.43) | 1.31
3         | (0.00, 1.08, 2.29) | 2.75
4         | (1.54, 1.01, 1.23) | 4.06
We can also express d in terms of (x, y, z) and the satellite's position (1.11, 2.55, 2.14) using the distance formula:

d = √((x − 1.11)² + (y − 2.55)² + (z − 2.14)²)

Combining these results leads to the equation

(x − 1.11)² + (y − 2.55)² + (z − 2.14)² = 0.47²(t − 1.29)²    (1)
Expanding, simplifying, and rearranging, we find that equation (1) becomes

2.22x + 5.10y + 4.28z − 0.57t = x² + y² + z² − 0.22t² + 11.95
Similarly, we can derive a corresponding equation for each of the other three satellites. We end up with a system of four equations in x, y, z, and t:

2.22x + 5.10y + 4.28z − 0.57t = x² + y² + z² − 0.22t² + 11.95
5.74x + 2.86z − 0.58t = x² + y² + z² − 0.22t² + 9.90
2.16y + 4.58z − 1.21t = x² + y² + z² − 0.22t² + 4.74
3.08x + 2.02y + 2.46z − 1.79t = x² + y² + z² − 0.22t² + 1.26
These are not linear equations, but the nonlinear terms are the same in each equation. If we subtract the first equation from each of the other three equations, we obtain a linear system:

3.52x − 5.10y − 1.42z − 0.01t = −2.05
−2.22x − 2.94y + 0.30z − 0.64t = −7.21
0.86x − 3.08y − 1.82z − 1.22t = −10.69
The augmented matrix

[ 3.52  −5.10  −1.42  −0.01 |  −2.05 ]
[−2.22  −2.94   0.30  −0.64 |  −7.21 ]
[ 0.86  −3.08  −1.82  −1.22 | −10.69 ]

row reduces to

[ 1  0  0  0.36 | 2.97 ]
[ 0  1  0  0.03 | 0.81 ]
[ 0  0  1  0.79 | 5.91 ]

from which we see that

x = 2.97 − 0.36t
y = 0.81 − 0.03t    (2)
z = 5.91 − 0.79t
with t free. Substituting these equations into (1), we obtain

(2.97 − 0.36t − 1.11)² + (0.81 − 0.03t − 2.55)² + (5.91 − 0.79t − 2.14)² = 0.47²(t − 1.29)²
which simplifies to the quadratic equation

0.54t² − 6.65t + 20.32 = 0

There are two solutions:

t = 6.74  and  t = 5.60
Substituting into (2), we find that the first solution corresponds to (x, y, z) = (0.55, 0.61, 0.56) and the second solution to (x, y, z) = (0.96, 0.65, 1.46). The second solution is clearly not on the unit sphere (Earth), so we reject it. The first solution produces x² + y² + z² = 0.99, so we are satisfied that, within acceptable roundoff error, we have located your coordinates as (0.55, 0.61, 0.56). In practice, GPS takes significantly more factors into account, such as the fact that Earth's surface is not exactly spherical, so additional refinements are needed involving such techniques as least squares approximation (see Chapter 7). In addition, the results of the GPS calculation are converted from rectangular (Cartesian) coordinates into latitude and longitude, an interesting exercise in itself and one involving yet other branches of mathematics.
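The whole computation, from Table 2.6 to the hiker's position, can be reproduced numerically. The following is a sketch of mine (using NumPy, which the text does not use) of the same subtract-then-substitute strategy:

```python
import numpy as np

# Satellite positions (Earth radii) and send times (hundredths of a second)
# from Table 2.6; c is the speed of light in these units.
sats = np.array([[1.11, 2.55, 2.14],
                 [2.87, 0.00, 1.43],
                 [0.00, 1.08, 2.29],
                 [1.54, 1.01, 1.23]])
times = np.array([1.29, 1.31, 2.75, 4.06])
c = 0.47

# Subtracting sphere equation 1 from spheres 2-4 cancels the common
# nonlinear terms, leaving three equations linear in x, y, z, t.
M = 2 * (sats[1:] - sats[0])              # 3x3 block multiplying (x, y, z)
at = -2 * c**2 * (times[1:] - times[0])   # coefficients of t
rhs = ((sats[1:]**2).sum(axis=1) - (sats[0]**2).sum()
       - c**2 * (times[1:]**2 - times[0]**2))

# Position as a function of the free variable t:  p(t) = p0 - k*t.
p0 = np.linalg.solve(M, rhs)
k = np.linalg.solve(M, at)

# Substituting p(t) into sphere 1 gives a quadratic in t; recover its
# coefficients by fitting three sample points exactly, then take roots.
def residual(t):
    p = p0 - k * t
    return (p - sats[0]) @ (p - sats[0]) - c**2 * (t - times[0])**2

coeffs = np.polyfit([0.0, 1.0, 2.0], [residual(v) for v in (0.0, 1.0, 2.0)], 2)
roots = np.roots(coeffs).real  # both roots are real here

# Keep the root whose position lies (approximately) on the unit sphere.
t = min(roots, key=lambda v: abs(np.linalg.norm(p0 - k * v) - 1))
print(t, p0 - k * t)  # near t = 6.7 and the text's position (0.55, 0.61, 0.56)
```

Because this uses the raw data rather than the text's two-decimal rounding, the numbers differ slightly from the worked values above, but the physical root and position agree.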
Section 2.5 Iterative Methods for Solving Linear Systems
Table 2.7

n  | 0 | 1     | 2     | 3     | 4     | 5     | 6
x1 | 0 | 0.714 | 0.914 | 0.976 | 0.993 | 0.998 | 0.999
x2 | 0 | 1.400 | 1.829 | 1.949 | 1.985 | 1.996 | 1.999
The successive vectors [x1, x2] are called iterates, so, for example, when n = 4, the fourth iterate is [0.993, 1.985]. We can see that the iterates in this example are approaching [1, 2], which is the exact solution of the given system. (Check this.) We say in this case that Jacobi's method converges.
Jacobi's method calculates the successive iterates in a two-variable system according to the criss-cross pattern shown in Table 2.8.
Table 2.8
[Criss-cross pattern: each new value of x1 is computed from the previous x2, and each new x2 from the previous x1.]
Before we consider Jacobi's method in the general case, we will look at a modification of it that often converges faster to the solution. The Gauss-Seidel method is the same as the Jacobi method except that we use each new value as soon as we can. So in our example, we begin by calculating x1 = (5 + 0)/7 = 5/7 ≈ 0.714 as before, but we now use this value of x1 to get the next value of x2:
x2 = (7 + 3 · (5/7))/5 ≈ 1.829

We then use this value of x2 to recalculate x1, and so on. The iterates this time are shown in Table 2.9. We observe that the Gauss-Seidel method has converged faster to the solution. The iterates this time are calculated according to the zigzag pattern shown in Table 2.10.
Table 2.9

n  | 0 | 1     | 2     | 3     | 4     | 5
x1 | 0 | 0.714 | 0.976 | 0.998 | 1.000 | 1.000
x2 | 0 | 1.829 | 1.985 | 1.999 | 2.000 | 2.000
Table 2.10
[Zigzag pattern: each new x2 uses the x1 just computed, and each new x1 uses the x2 just computed.]
The Gauss-Seidel method also has a nice geometric interpretation in the case of two variables. We can think of x1 and x2 as the coordinates of points in the plane. Our starting point is the point corresponding to our initial approximation, (0, 0). Our first calculation gives x1 = 5/7, so we move to the point (5/7, 0) ≈ (0.714, 0). Then we compute x2 = (7 + 3 · (5/7))/5 = 64/35 ≈ 1.829, which moves us to the point (5/7, 64/35) ≈ (0.714, 1.829). Continuing in this fashion, our calculations from the Gauss-Seidel method give rise to a sequence of points, each one differing from the preceding point in exactly one coordinate. If we plot the lines 7x1 − x2 = 5 and 3x1 − 5x2 = −7 corresponding to the two given equations, we find that the points calculated above fall alternately on the two lines, as shown in Figure 2.27. Moreover, they approach the point of intersection of the lines, which corresponds to the solution of the system of equations. This is what convergence means!
[Figure 2.27: the Gauss-Seidel points zigzag between the two lines, approaching their intersection at (1, 2).]
Figure 2.27 Converging iterates
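Both iterations for this example's system, 7x1 − x2 = 5 and 3x1 − 5x2 = −7, can be reproduced in a few lines. A sketch of mine, not the book's code:

```python
# Jacobi vs. Gauss-Seidel on 7*x1 - x2 = 5 and 3*x1 - 5*x2 = -7,
# whose exact solution is x1 = 1, x2 = 2.

def jacobi(x1, x2, n):
    for _ in range(n):
        # Jacobi: both updates use values from the previous iteration.
        x1, x2 = (5 + x2) / 7, (7 + 3 * x1) / 5
    return x1, x2

def gauss_seidel(x1, x2, n):
    for _ in range(n):
        x1 = (5 + x2) / 7
        x2 = (7 + 3 * x1) / 5  # uses the new x1 immediately
    return x1, x2

print(jacobi(0, 0, 6))        # approaches (1, 2); compare Table 2.7
print(gauss_seidel(0, 0, 5))  # approaches (1, 2) faster; compare Table 2.9
```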
The general cases of the two methods are analogous. Given a system of n linear equations in n variables,

a11x1 + a12x2 + ··· + a1nxn = b1
a21x1 + a22x2 + ··· + a2nxn = b2
⋮
an1x1 + an2x2 + ··· + annxn = bn    (2)

we solve the first equation for x1, the second for x2, and so on. Then, beginning with an initial approximation, we use these new equations to iteratively update each
variable. Jacobi's method uses all of the values at the kth iteration to compute the (k + 1)st iterate, whereas the Gauss-Seidel method always uses the most recent value of each variable in every calculation. Example 2.37 below illustrates the Gauss-Seidel method in a three-variable problem. At this point, you should have some questions and concerns about these iterative methods. (Do you?) Several come to mind: Must these methods converge? If not, when do they converge? If they converge, must they converge to the solution? The answer to the first question is no, as Example 2.36 illustrates.
Example 2.36
Apply the Gauss-Seidel method to the system

x1 − x2 = 1
2x1 + x2 = 5

with initial approximation [0, 0].

Solution  We rearrange the equations to get

x1 = 1 + x2
x2 = 5 − 2x1

The first few iterates are given in Table 2.11. (Check these.) The actual solution to the given system is [x1, x2] = [2, 1]. Clearly, the iterates in Table 2.11 are not approaching this point, as Figure 2.28 makes graphically clear; this is an example of divergence.
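The divergence is easy to see numerically. A small sketch of mine of the same iteration:

```python
# Gauss-Seidel on the rearranged equations of Example 2.36,
# x1 = 1 + x2 and x2 = 5 - 2*x1, starting from (0, 0).
def iterates(n):
    x1 = x2 = 0
    out = []
    for _ in range(n):
        x1 = 1 + x2        # uses the current x2
        x2 = 5 - 2 * x1    # uses the new x1 immediately
        out.append((x1, x2))
    return out

for point in iterates(5):
    print(point)
# (1, 3), (4, -3), (-2, 9), (10, -15), (-14, 33): ever farther from (2, 1)
```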
So when do these iterative methods converge? Unfortunately, the answer to this question is rather tricky. We will answer it completely in Chapter 7, but for now we will give a partial answer, without proof. Let A be the n × n matrix

A = [ a11  a12  ···  a1n
      a21  a22  ···  a2n
       ⋮    ⋮         ⋮
      an1  an2  ···  ann ]
We say that A is strictly diagonally dominant if

|a11| > |a12| + |a13| + ··· + |a1n|
|a22| > |a21| + |a23| + ··· + |a2n|
⋮
|ann| > |an1| + |an2| + ··· + |a(n,n−1)|

That is, the absolute value of each diagonal entry a11, a22, ..., ann is greater than the sum of the absolute values of the remaining entries in that row.
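The definition translates directly into a row-by-row check. A sketch of mine, applied to the coefficient matrices of this section's two running systems:

```python
# Strict diagonal dominance: each |a_ii| must exceed the sum of the
# absolute values of the other entries in row i.
def strictly_diagonally_dominant(A):
    ok = True
    for i, row in enumerate(A):
        off = sum(abs(a) for j, a in enumerate(row) if j != i)
        ok = ok and abs(row[i]) > off
    return ok

print(strictly_diagonally_dominant([[7, -1], [3, -5]]))  # True  (Example 2.35)
print(strictly_diagonally_dominant([[1, -1], [2, 1]]))   # False (Example 2.36)
```

Note the second matrix fails because |1| is not strictly greater than |−1| in the first row, matching the divergence just observed.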
Theorem 2.9  If a system of n linear equations in n variables has a strictly diagonally dominant coefficient matrix, then it has a unique solution and both the Jacobi and the Gauss-Seidel method converge to it.

Be warned! This theorem is a one-way implication. The fact that a system is not strictly diagonally dominant does not mean that the iterative methods diverge. They may or may not converge. (See Exercises 15-19.) Indeed, there are examples in which one of the methods converges and the other diverges. However, if either of these methods converges, then it must converge to the solution. It cannot converge to some other point.
Theorem 2.10  If the Jacobi or the Gauss-Seidel method converges for a system of n linear equations in n variables, then it must converge to the solution of the system.
Proof  We will illustrate the idea behind the proof by sketching it out for the case of Jacobi's method, using the system of equations in Example 2.35. The general proof is similar. Convergence means that from some iteration on, the values of the iterates remain the same. This means that x1 and x2 converge to r and s, respectively, as shown in Table 2.12. We must prove that x1 = r, x2 = s is the solution of the system of equations. In other words, at the (k + 1)st iteration, the values of x1 and x2 must stay the same as at
Table 2.12

n  | ··· | k | k + 1 | k + 2 | ···
x1 | ··· | r | r     | r     | ···
x2 | ··· | s | s     | s     | ···
the kth iteration. But the calculations give

x1 = (5 + x2)/7 = (5 + s)/7  and  x2 = (7 + 3x1)/5 = (7 + 3r)/5

Therefore, (5 + s)/7 = r and (7 + 3r)/5 = s. Rearranging, we see that

7r − s = 5
3r − 5s = −7

Thus, x1 = r, x2 = s satisfy the original equations, as required.
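The fixed-point equations 7r − s = 5 and 3r − 5s = −7 can also be solved directly to confirm that the limit is the solution (1, 2). A two-line check of mine using Cramer's rule:

```python
# Cramer's rule on 7r - s = 5, 3r - 5s = -7.
det = 7 * (-5) - (-1) * 3            # determinant of the coefficient matrix
r = (5 * (-5) - (-1) * (-7)) / det   # (-25 - 7) / -32 = 1.0
s = (7 * (-7) - 3 * 5) / det         # (-49 - 15) / -32 = 2.0
print(r, s)  # 1.0 2.0
```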
By now you may be wondering: If iterative methods don't always converge to the solution, what good are they? Why don't we just use Gaussian elimination? First, we have seen that Gaussian elimination is sensitive to roundoff errors, and this sensitivity can lead to inaccurate or even wildly wrong answers. Also, even if Gaussian elimination does not go astray, we cannot improve on a solution once we have found it. For example, if we use Gaussian elimination to calculate a solution to two decimal places, there is no way to obtain the solution to four decimal places except to start over again and work with increased accuracy. In contrast, we can achieve additional accuracy with iterative methods simply by doing more iterations. For large systems, particularly those with sparse coefficient matrices, iterative methods are much faster than direct methods when implemented on a computer. In many applications, the systems that arise are strictly diagonally dominant, and thus iterative methods are guaranteed to converge. The next example illustrates one such application.
Example 2.37
Suppose we heat each edge of a metal plate to a constant temperature, as shown in Figure 2.29.
[Figure 2.29: a metal plate whose edges are held at constant temperatures of 0°, 50°, and 100°.]
Figure 2.29 A heated metal plate
Eventually the temperature at the interior points will reach equilibrium, where the following property can be shown to hold:

The temperature at each interior point P on a plate is the average of the temperatures on the circumference of any circle centered at P inside the plate (Figure 2.30).

Figure 2.30
To apply this property in an actual example requires techniques from calculus. As an alternative, we can approximate the situation by overlaying the plate with a grid, or mesh, that has a finite number of interior points, as shown in Figure 2.31.

[Figure 2.31: the plate overlaid with a grid; the three interior points are labeled t1, t2, t3, and the boundary points are held at 0°, 50°, and 100°.]
Figure 2.31 The discrete version of the heated plate problem

The discrete analogue of the averaging property governing equilibrium temperatures is stated as follows:

The temperature at each interior point P is the average of the temperatures at the points adjacent to P.
For the example shown in Figure 2.31, there are three interior points, and each is adjacent to four other points. Let the equilibrium temperatures of the interior points be t1, t2, and t3, as shown. Then, by the temperature-averaging property, we have
t1 = (100 + 100 + t2 + 50)/4
t2 = (t1 + t3 + 0 + 50)/4    (3)
t3 = (100 + 100 + t2 + 0)/4

or, equivalently,

4t1 − t2 = 250
−t1 + 4t2 − t3 = 50
−t2 + 4t3 = 200
Notice that this system is strictly diagonally dominant. Notice also that equations (3) are in the form required for Jacobi or Gauss-Seidel iteration. With an initial approximation of t1 = 0, t2 = 0, t3 = 0, the Gauss-Seidel method gives the following iterates.
Iteration 1:
t1 = (100 + 100 + 0 + 50)/4 = 62.5
t2 = (62.5 + 0 + 0 + 50)/4 = 28.125
t3 = (100 + 100 + 28.125 + 0)/4 = 57.031

Iteration 2:
t1 = (100 + 100 + 28.125 + 50)/4 = 69.531
t2 = (69.531 + 57.031 + 0 + 50)/4 = 44.141
t3 = (100 + 100 + 44.141 + 0)/4 = 61.035
Continuing, we find the iterates listed in Table 2.13. We work with five-significant-digit accuracy and stop when two successive iterates agree within 0.001 in all variables. Thus, the equilibrium temperatures at the interior points are (to an accuracy of 0.001) t1 = 74.107, t2 = 46.429, and t3 = 61.607. (Check these calculations.) By using a finer grid (with more interior points), we can get as precise information as we like about the equilibrium temperatures at various points on the plate.
Table 2.13

n  | 0 | 1      | 2      | 3      | ··· | 7      | 8
t1 | 0 | 62.500 | 69.531 | 73.535 | ··· | 74.107 | 74.107
t2 | 0 | 28.125 | 44.141 | 46.143 | ··· | 46.429 | 46.429
t3 | 0 | 57.031 | 61.035 | 61.536 | ··· | 61.607 | 61.607
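The whole of Example 2.37 fits in a short loop. A sketch of mine (not the book's code) of the Gauss-Seidel iteration on system (3), with the same stopping rule:

```python
# Gauss-Seidel for the heated-plate system (3):
# 4*t1 - t2 = 250, -t1 + 4*t2 - t3 = 50, -t2 + 4*t3 = 200.
def plate_temperatures(tol=0.001):
    t1 = t2 = t3 = 0.0
    while True:
        n1 = (100 + 100 + t2 + 50) / 4   # average of t1's four neighbors
        n2 = (n1 + t3 + 0 + 50) / 4      # the new t1 is used immediately
        n3 = (100 + 100 + n2 + 0) / 4    # the new t2 is used immediately
        done = max(abs(n1 - t1), abs(n2 - t2), abs(n3 - t3)) < tol
        t1, t2, t3 = n1, n2, n3
        if done:
            return t1, t2, t3

print(plate_temperatures())  # approximately (74.107, 46.429, 61.607)
```

Shrinking `tol` yields the extra accuracy that, as noted above, Gaussian elimination cannot provide without starting over.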
CAS

In Exercises 1-6, apply Jacobi's method to the given system. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. In each case, compare your answer with the exact solution found using any direct method you like.

1. 7x1 − x2 = 6
   x1 − 5x2 = −4

2. 2x1 + x2 = 5
   x1 − x2 = 1

3. 4.5x1 − 0.5x2 = 1
   x1 − 3.5x2 = −1

4. 20x1 + x2 − x3 = 17
   x1 − 10x2 + x3 = 13
   −x1 + x2 + 10x3 = 18

5. 3x1 + x2 = 1
   x1 + 4x2 + x3 = 1
   x2 + 3x3 = 1

6. 3x1 − x2 = 1
   −x1 + 3x2 − x3 = 0
   −x2 + 3x3 − x4 = 1
   −x3 + 3x4 = 1

In Exercises 7-12, repeat the given exercise using the Gauss-Seidel method. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. Compare the number of iterations required by the Jacobi and Gauss-Seidel methods to reach such an approximate solution.

7. Exercise 1    8. Exercise 2    9. Exercise 3
10. Exercise 4   11. Exercise 5   12. Exercise 6

In Exercises 13 and 14, draw diagrams to illustrate the convergence of the Gauss-Seidel method with the given system.

13. The system in Exercise 1
14. The system in Exercise 2

In Exercises 15 and 16, compute the first four iterates, using the zero vector as the initial approximation, to show that the Gauss-Seidel method diverges. Then show that the equations can be rearranged to give a strictly diagonally dominant coefficient matrix, and apply the Gauss-Seidel method to obtain an approximate solution that is accurate to within 0.001.

15. x1 − 2x2 = 3
    3x1 + 2x2 = 1

16. x1 − 4x2 + 2x3 = 2
    2x2 + 4x3 = 1
    6x1 − x2 − 2x3 = 1

17. Draw a diagram to illustrate the divergence of the Gauss-Seidel method in Exercise 15.

In Exercises 18 and 19, the coefficient matrix is not strictly diagonally dominant, nor can the equations be rearranged to make it so. However, both the Jacobi and the Gauss-Seidel method converge anyway. Demonstrate that this is true of the Gauss-Seidel method, starting with the zero vector as the initial approximation and obtaining a solution that is accurate to within 0.01.

18. −4x1 + 5x2 = 14
    x1 − 3x2 = −7

19. 5x1 − 2x2 + 3x3 = −8
    x1 + 4x2 − 4x3 = 102
    −2x1 − 2x2 + 4x3 = 90

20. Continue performing iterations in Exercise 18 to obtain a solution that is accurate to within 0.001.

21. Continue performing iterations in Exercise 19 to obtain a solution that is accurate to within 0.001.

In Exercises 22-24, the metal plate has the constant temperatures shown on its boundaries. Find the equilibrium temperature at each of the indicated interior points by setting up a system of linear equations and applying either the Jacobi or the Gauss-Seidel method. Obtain a solution that is accurate to within 0.001.

22. [Plate diagram with interior points t1 and t2 and the boundary temperatures shown.]

23. [Plate diagram with interior points t1 and t2 and the boundary temperatures shown.]

24. [Plate diagram with interior points t1 through t4 and the boundary temperatures shown.]

In Exercises 25 and 26, we refine the grid used in Exercises 22 and 24 to obtain more accurate information about the equilibrium temperatures at interior points of the plates. Obtain solutions that are accurate to within 0.001, using either the Jacobi or the Gauss-Seidel method.

25. [Refined grid for the plate of Exercise 22.]

26. [Refined grid for the plate of Exercise 24, with interior points t1 through t16.]

Exercises 27 and 28 demonstrate that sometimes, if we are lucky, the form of an iterative problem may allow us to use a little insight to obtain an exact solution.

27. A narrow strip of paper 1 unit long is placed along a number line so that its ends are at 0 and 1. The paper is folded in half, right end over left, so that its ends are now at 0 and 1/2. Next, it is folded in half again, this time left end over right, so that its ends are at 1/4 and 1/2. Figure 2.32 shows this process. We continue folding the paper in half, alternating right-over-left and left-over-right. If we could continue indefinitely, it is clear that the ends of the paper would converge to a point. It is this point that we want to find.
(a) Let x1 correspond to the left-hand end of the paper and x2 to the right-hand end. Make a table with the first six values of [x1, x2] and plot the corresponding points on x1-x2 coordinate axes.
(b) Find two linear equations of the form x2 = ax1 + b and x1 = cx2 + d that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27 on page 124.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the point to which the ends of the paper are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers.

28. An ant is standing on a number line at point A. It walks halfway to point B and turns around. Then it walks halfway back to point A, turns around again, and walks halfway to point B. It continues to do this indefinitely. Let point A be at 0 and point B be at 1. The ant's walk is made up of a sequence of overlapping line segments. Let x1 record the positions of the left-hand endpoints of these segments and x2 their right-hand endpoints. (Thus, we begin with x1 = 0 and x2 = 1/2. Then we have x1 = 1/4 and x2 = 1/2, and so on.) Figure 2.33 shows the start of the ant's walk.
(a) Make a table with the first six values of [x1, x2] and plot the corresponding points on x1-x2 coordinate axes.
(b) Find two linear equations of the form x2 = ax1 + b and x1 = cx2 + d that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27 on page 124.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the values to which x1 and x2 are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers. Interpret your results.

[Figure 2.32: the strip of paper after each of the first few folds, with the endpoints marked on the number line.]
Figure 2.32 Folding a strip of paper

[Figure 2.33: the first few overlapping segments of the ant's walk between 0 and 1.]
Figure 2.33 The ant's walk
Key Definitions and Concepts

augmented matrix, 62; back substitution, 62; coefficient matrix, 68; consistent system, 61; convergence, 123-124; divergence, 125; elementary row operations, 70; free variable, 75; Gauss-Jordan elimination, 76; Gauss-Seidel method, 123; Gaussian elimination, 72; homogeneous system, 80; inconsistent system, 61; iterate, 123; Jacobi's method, 122; leading variable (leading 1), 75-76; linear equation, 59; linearly dependent vectors, 95; linearly independent vectors, 95; pivot, 70; rank of a matrix, 75; Rank Theorem, 75; reduced row echelon form, 76; row echelon form, 68; row equivalent matrices, 72; span of a set of vectors, 92; spanning set, 92; system of linear equations, 59
Review Questions
1. Mark each of the following statements true or false:
(a) Every system of linear equations has a solution.
(b) Every homogeneous system of linear equations has a solution.
(c) If a system of linear equations has more variables than equations, then it has infinitely many solutions.
(d) If a system of linear equations has more equations than variables, then it has no solution.
(e) Determining whether b is in span(a1, ..., an) is equivalent to determining whether the system [A | b] is consistent, where A = [a1 ... an].
(f) In R^3, span(u, v) is always a plane through the origin.
(g) In R^3, if nonzero vectors u and v are not parallel, then they are linearly independent.
(h) In R^3, if a set of vectors can be drawn head to tail, one after the other, so that a closed path (polygon) is formed, then the vectors are linearly dependent.
133
Chapter Review
(i) If a set of vectors has the property that no two vectors in the set are scalar multiples of one another, then the set of vectors is linearly independent.
(j) If there are more vectors in a set of vectors than the number of entries in each vector, then the set of vectors is linearly dependent.
2. Find the rank o f the mat rix.
] 3
2  ]
o
3
2
]
3
4
3
4
2
 3
o
 5
 I
6
2 2
11. Find the general equation of the plane spanned by
1
3
1 and
2
1
]
12. Determine whether u, v, and w below are linearly independent.
u
~
,v =
]
4. Solve the linear system
 ]
,v =
over Z_7,
6. Solve the linear sys tem
3x +2y = 1 x + 4y = 2
7. For what value(s) of k is the linear system with
2 I] inconsistent?
2k 1
9. Find the point of intersection of the following lines, if it exists.
Y
2 + , I 3 2
,
10. Determine whether 1
and
2 2
x
1
and
y
,
5 2  4
 I
+
~
 I
1
3
5 is in the span of
1
3
17. Show that if u and v are linearly independent vectors, then so are u + v and u - v.
18. Show that span(u, v) = span(u, u + v) for any vectors u and v.
19. In order for a linear system with augmented matrix [A | b] to be consistent, what must be true about the ranks of A and [A | b]?
1 1
 ]
w
16. What is the maximum rank of a 5×3 matrix? What is the minimum rank of a 5×3 matrix?
8. Find parametric equations for the line of intersection of the planes x + 2y + 3z = 4 and 5x + 6y + 7z = 8.
1
,
15. Let a1, a2, a3 be linearly dependent vectors in R^3, not all zero, and let A = [a1 a2 a3]. What are the possible values of the rank of A?
x
0 1
0
(a) The reduced row echelon form of A is I3.
(b) The rank of A is 3.
(c) The system [A | b] has a unique solution for any vector b in R^3.
(d) (a), (b), and (c) are all true.
(e) (a) and (b) are both true, but not (c).
2x+ 3y = 4 x + 2y = 3
k
1
w ~
14. Let a1, a2, a3 be linearly independent vectors in R^3, and let A = [a1 a2 a3]. Which of the following statements are true?
5. Solve the linear system
augmented matrix [ I
 2
1
0
3w + 8x - 18y + z = 35
w + 2x - 4y = 11
w + 3x - 7y + z = 10
9 are linearly
0
]
 I
,
= span{u, v, w) if:
,
0
1
Cbl u ~
2
1
0
x + y - 2z = 4
x + 3y - z = 7
2x + y - 5z = 7
I
3
]
Cal
3. Solve the linear system
,
]
13. Determine whether R'
3
1
20. Are the matrices
I
I
2 3  I and  1 4 1 row equivalent? Why or why not?
I
0
 1
I 0
I I
I 3
3  Matrices
We [Halmos and Kaplansky] share a philosophy about linear algebra: we think basis-free, we write basis-free, but when the chips are down we close the office door and compute with matrices like fury.
Irving Kaplansky, in Paul Halmos: Celebrating 50 Years of Mathematics, J. H. Ewing and F. W. Gehring, eds., Springer-Verlag, 1991, p. 88
3.0 Introduction: Matrices in Action
In this chapter, we will study matrices in their own right. We have already used matrices, in the form of augmented matrices, to record information about and to help streamline calculations involving systems of linear equations. Now you will see that matrices have algebraic properties of their own, which enable us to calculate with them, subject to the rules of matrix algebra. Furthermore, you will observe that matrices are not static objects, recording information and data; rather, they represent certain types of functions that "act" on vectors, transforming them into other vectors. These "matrix transformations" will begin to play a key role in our study of linear algebra and will shed new light on what you have already learned about vectors and systems of linear equations. Furthermore, matrices arise in many forms other than augmented matrices; we will explore some of the many applications of matrices at the end of this chapter.
In this section, we will consider a few simple examples to illustrate how matrices can transform vectors. In the process, you will get your first glimpse of "matrix arithmetic."
Consider the equations
y1 = x1 + 2x2
y2 = 3x2        (1)
We can view these equations as describing a transformation of the vector x = [x1; x2] into the vector y = [y1; y2]. If we denote the matrix of coefficients of the right-hand side by F, then
F = [1 2; 0 3]
and we can rewrite the transformation as
[y1; y2] = [1 2; 0 3][x1; x2]
or, more succinctly, y = Fx. (Think of this expression as analogous to the functional notation y = f(x) you are used to: x is the independent "variable" here, y is the dependent "variable," and F is the name of the "function.")
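Since this is a printed text, no code is implied; still, the transformation y = Fx is easy to check mechanically. Here is a minimal plain-Python sketch (the helper name mat_vec is ours, not the book's):

```python
def mat_vec(M, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

F = [[1, 2],
     [0, 3]]          # the coefficient matrix of equations (1)

y = mat_vec(F, [-2, 1])
print(y)  # [0, 3]: y1 = -2 + 2(1) = 0, y2 = 3(1) = 3
```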
Section 3.0  Introduction: Matrices in Action  135

Thus, if x = [-2; 1], then equations (1) give
y1 = -2 + 2(1) = 0
y2 = 3(1) = 3
We can write this expression as
y = [0; 3] = [1 2; 0 3][-2; 1]
Problem 1  Compute Fx for the following vectors x:
Problem 2  The heads of the four vectors x in Problem 1 locate the four corners of a square in the x1-x2 plane. Draw this square and label its corners A, B, C, and D, corresponding to parts (a), (b), (c), and (d) of Problem 1. On separate coordinate axes (labeled y1 and y2), draw the four points determined by Fx in Problem 1. Label these points A', B', C', and D'. Let's make the (reasonable) assumption that the line segment AB is transformed into the line segment A'B', and likewise for the other three sides of the square ABCD. What geometric figure is represented by A'B'C'D'?
Problem 3  The center of square ABCD is the origin O = [0; 0]. What is the center of A'B'C'D'? What algebraic calculation confirms this?
Now consider the equations
z1 = y1 - y2
z2 = -2y1        (2)
that transform a vector y = [y1; y2] into the vector z = [z1; z2]. We can abbreviate this transformation as z = Gy, where
G = [1 -1; -2 0]
Problem 4  We are going to find out how G transforms the figure A'B'C'D'. Compute Gy for each of the four vectors y that you computed in Problem 1. [That is, compute z = G(Fx). You may recognize this expression as being analogous to the composition of functions with which you are familiar.] Call the corresponding points A'', B'', C'', and D'', and sketch the figure A''B''C''D'' on z1-z2 coordinate axes.
Problem 5  By substituting equations (1) into equations (2), obtain equations for z1 and z2 in terms of x1 and x2. If we denote the matrix of these equations by H, then we have z = Hx. Since we also have z = GFx, it is reasonable to write
H = GF
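The relation H = GF can also be checked numerically. The sketch below uses F from equations (1) and takes G = [1 -1; -2 0], our best reading of the garbled printing of equations (2); the helper names are ours:

```python
def mat_mat(A, B):
    """Rows of A dotted with columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

F = [[1, 2], [0, 3]]
G = [[1, -1], [-2, 0]]
H = mat_mat(G, F)                                   # H = GF
x = [-2, 1]
assert mat_vec(H, x) == mat_vec(G, mat_vec(F, x))   # Hx equals G(Fx)
print(H)  # [[1, -1], [-2, -4]]
```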
Can you see how the entries of H are related to the entries of F and G?
Problem 6  Let's do the above process the other way around: First transform the square ABCD, using G, to obtain figure A*B*C*D*. Then transform the resulting figure, using F, to obtain A**B**C**D**. [Note: Don't worry about the "variables" y and z here. Simply substitute the coordinates of A, B, C, and D into equations (2) and then substitute the results into equations (1).] Are A**B**C**D** and A''B''C''D'' the same? What does this tell you about the order in which we perform the transformations F and G?

136  Chapter 3  Matrices

Problem 7  Repeat Problem 5 with general matrices
F = [f11 f12; f21 f22]  and  G = [g11 g12; g21 g22]
That is, if equations (1) and equations (2) have coefficients as specified by F and G, find the entries of H in terms of the entries of F and G. The result will be a formula for the "product" H = GF.
Problem 8  Repeat Problems 1-6 with the following matrices. (Your formula from Problem 7 may help to speed up the algebraic calculations.) Note any similarities or differences that you think are significant.
(a) F =
~
(0) F ~ ,
[ °1
[I
 1]G~[20] o' 0 3 I]G~ [ 2  1]
2 '
 I
I
(b) F
= [:
(d) F
~
[
~J.G = [~ :J I
2
2]G~[ 2 4 '
I
:]
Matrix Operations
Although we have already encountered matrices, we begin by stating a formal definition.
Definition
A matrix is a rectangular array of numbers called the entries, or elements, of the matrix.
Although numbers will usually be
chosen from the set R of real numbers, they may also be taken from the set C of complex numbers or from Zp' where p is prime.
Technically, there is a distinction between row/column matrices and vectors, but we will not belabor this distinction. We will, however, distinguish between row matrices/vectors and column matrices/vectors. This distinction is important, at the very least, for algebraic computations, as we will demonstrate.
The following are all examples of matrices:

[six matrices, of sizes 2×2, 2×3, 3×1, 1×4, 3×3, and 1×1]

The size of a matrix is a description of the numbers of rows and columns it has. A matrix is called m×n (pronounced "m by n") if it has m rows and n columns. Thus, the examples above are matrices of sizes 2×2, 2×3, 3×1, 1×4, 3×3, and 1×1, respectively. A 1×m matrix is called a row matrix (or row vector), and an n×1 matrix is called a column matrix (or column vector).
We use double-subscript notation to refer to the entries of a matrix A. The entry of A in row i and column j is denoted by a_ij. Thus, if A =
[I
I
[~
9 5
I
°
 :]
then a13 = 1 and a22 = 5. [The notation A_ij is sometimes used interchangeably with a_ij.] We can therefore compactly denote a matrix A by [a_ij] (or [a_ij]_(m×n) if it is important to specify the size of A, although the size will usually be clear from the context).
Section 3.1  Matrix Operations  137
With this notation, a general m×n matrix A has the form
A = [a11 a12 ... a1n; a21 a22 ... a2n; ...; am1 am2 ... amn]
If the columns of A are the vectors a1, a2, ..., an, then we may represent A as
A = [a1 a2 ... an]
If the rows of A are A1, A2, ..., Am, then we may represent A as
A = [A1; A2; ...; Am]
The diagonal entries of A are a11, a22, a33, ..., and if m = n (that is, if A has the same number of rows as columns), then A is called a square matrix. A square matrix whose nondiagonal entries are all zero is called a diagonal matrix. A diagonal matrix all of whose diagonal entries are the same is called a scalar matrix. If the scalar on the diagonal is 1, the scalar matrix is called an identity matrix. For example, let
2
 I
5 4
B

[34 5'I]
o o c= o 6 o , o o 2 3
D =
1 0 0 0 1 0
o
0
1
The diagonal entries of A are 2 and 4, but A is not square; B is a square matrix of size 2×2 with diagonal entries 3 and 5; C is a diagonal matrix; D is a 3×3 identity matrix. The n×n identity matrix is denoted by I_n (or simply I if its size is understood).
Since we can view matrices as generalizations of vectors (and, indeed, matrices can and should be thought of as being made up of both row and column vectors), many of the conventions and operations for vectors carry through (in an obvious way) to matrices.
Two matrices are equal if they have the same size and if their corresponding entries are equal. Thus, if A = [a_ij]_(m×n) and B = [b_ij]_(r×s), then A = B if and only if m = r and n = s and a_ij = b_ij for all i and j.

Example 3.1
Conside r the matrices
A = [:
!].
B=
[ ~ ~ ].
c
o
[! ;] 3
Neither A nor B can be equal to C (no matter what the values of x and y), since A and B are 2×2 matrices and C is 2×3. However, A = B if and only if a = 2, b = 0, c = 5, and d = 3.
Example 3,2
Consider the malri c~s
R = [1 4 3]  and  C = [1; 4; 3]
138  Chapter 3  Matrices
Despite the fact that R and C have the same entries in the same order, R ≠ C since R is 1×3 and C is 3×1. (If we read R and C aloud, they both sound the same: "one, four, three.") Thus, our distinction between row matrices/vectors and column matrices/vectors is an important one.
Matrix Addition and Scalar Multiplication
Generalizing from vector addition, we define matrix addition componentwise. If A = [a_ij] and B = [b_ij] are m×n matrices, their sum A + B is the m×n matrix obtained by adding the corresponding entries. Thus,
A + B = [a_ij + b_ij]
[We could equally well have defined A + B in terms of vector addition by specifying that each column (or row) of A + B is the sum of the corresponding columns (or rows) of A and B.] If A and B are not the same size, then A + B is not defined.
Example 3. 3
Let A = [
1
4
2 6
~].
B ~
1 ] [: 1 2 . 0
Then
A+B = b ut neither A
+
C no r B
[ ~
5 6
"d
C =
[~
:]
; ]
+ C is defi ned.
The componentwise definition of scalar multiplication will come as no surprise. If A is an m×n matrix and c is a scalar, then the scalar multiple cA is the m×n matrix obtained by multiplying each entry of A by c. More formally, we have
cA = [c a_ij]
[In terms of vectors, we could equivalently stipulate that each column (or row) of cA is c times the corresponding column (or row) of A.]
Example 3.4
For mat ri x A in Example 3.3 ,
2A = [
2
 4
8 12
l~l
!A= [_: ~
~l
and
( l)A =[ ~
 4  6
The matrix (-1)A is written as -A and called the negative of A. As with vectors, we can use this fact to define the difference of two matrices: If A and B are the same size, then
A - B = A + (-B)
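As a mechanical check of these componentwise definitions, here is a small plain-Python sketch; the matrices are illustrative samples of ours, not those of Example 3.3:

```python
def add(A, B):
    """Componentwise sum; only defined when the sizes match."""
    assert len(A) == len(B) and len(A[0]) == len(B[0])
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Scalar multiple cA: multiply every entry by c."""
    return [[c * a for a in row] for row in A]

A = [[1, 4], [-2, 6]]
B = [[3, 1], [0, 2]]
print(add(A, B))             # [[4, 5], [-2, 8]]
print(scale(2, A))           # [[2, 8], [-4, 12]]
print(add(A, scale(-1, B)))  # A - B = A + (-1)B -> [[-2, 3], [-2, 4]]
```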
Sect ion 3.1
111m pie 3.5
lJ1
Ma tnx Operatio ns
For matrices A and B in Example 3.3,
] [3 o A  B= [ I 4 0 I
2 6
5
3
A matrix all of whose entries are zero is called a zero matrix and denoted by O (or O_(m×n) if it is important to specify its size). It should be clear that if A is any matrix and O is the zero matrix of the same size, then
A + O = A = O + A  and  A - A = O = -A + A
Matrix Multiplication
Mathematicians are sometimes like Lewis Carroll's Humpty Dumpty: "When I use a word," Humpty Dumpty said, "it means just what I choose it to mean, neither more nor less" (from Through the Looking Glass).
The Introduction in Section 3.0 suggested that there is a "product" of matrices that is analogous to the composition of functions. We now make this notion more precise. The definition we are about to give generalizes what you should have discovered in Problems 5 and 7 in Section 3.0. Unlike the definitions of matrix addition and scalar multiplication, the definition of the product of two matrices is not a componentwise definition. Of course, there is nothing to stop us from defining a product of matrices in a componentwise fashion; unfortunately such a definition has few applications and is not as "natural" as the one we now give.
Definition  If A is an m×n matrix and B is an n×r matrix, then the product C = AB is an m×r matrix. The (i, j) entry of the product is computed as follows:
c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_in b_nj
Remarks
• Notice that A and B need not be the same size. However, the number of columns of A must be the same as the number of rows of B. If we write the sizes of A, B, and AB in order, we can see at a glance whether this requirement is satisfied: the two "inner" sizes must be the same, and the two "outer" sizes give the size of AB:
A      B    =   AB
(m×n) (n×r)    (m×r)
Chapter 3 Matrices
• The formula for the entries of the product looks like a dot product, and indeed it is. It says that the (i, j) entry of the matrix AB is the dot product of the ith row of A and the jth column of B.
Notice that, in the expression c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_in b_nj, the "outer subscripts" on each ab term in the sum are always i and j, whereas the "inner subscripts" always agree and increase from 1 to n. We see this pattern clearly if we write c_ij using summation notation:
c_ij = Σ_(k=1)^n a_ik b_kj

Example 3.6
Compute AB if
A = [1 3 -1; -2 -1 1]  and  B = [-4 0 3 -1; 5 -2 -1 1; -1 2 0 6]
Solution  Since A is 2×3 and B is 3×4, the product AB is defined and will be a 2×4 matrix. The first row of the product C = AB is computed by taking the dot product of the first row of A with each of the columns of B in turn. Thus,
c11 = 1(-4) + 3(5) + (-1)(-1) = 12
c12 = 1(0) + 3(-2) + (-1)(2) = -8
c13 = 1(3) + 3(-1) + (-1)(0) = 0
c14 = 1(-1) + 3(1) + (-1)(6) = -4
The second row of C is computed by taking the dot product of the second row of A with each of the columns of B in turn:
c21 = (-2)(-4) + (-1)(5) + (1)(-1) = 2
c22 = (-2)(0) + (-1)(-2) + (1)(2) = 4
c23 = (-2)(3) + (-1)(-1) + (1)(0) = -5
c24 = (-2)(-1) + (-1)(1) + (1)(6) = 7
Thus, the product matrix is given by
AB = [12 -8 0 -4; 2 4 -5 7]
(With a little practice, you should be able to do these calculations mentally without writing out all of the details as we have done here. For more complicated examples, a calculator with matrix capabilities or a computer algebra system is preferable.)
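The entry formula c_ij = Σ a_ik b_kj translates directly into a triple loop. As an illustration (not part of the text), this sketch reproduces the computation of Example 3.6:

```python
def mat_mul(A, B):
    """Product of an m x n and an n x r matrix via the dot-product formula."""
    n = len(B)                              # columns of A must equal rows of B
    assert all(len(row) == n for row in A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 3, -1],
     [-2, -1, 1]]
B = [[-4, 0, 3, -1],
     [5, -2, -1, 1],
     [-1, 2, 0, 6]]
print(mat_mul(A, B))  # [[12, -8, 0, -4], [2, 4, -5, 7]]
```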
Section 3.1  Matrix Operations  141
Before we go further, we will consider two examples that justify our chosen definition of matrix multiplication.
Example 3.7
Ann and Bert are planning to go shopping for fruit for the next week. They each want to buy some apples, oranges, and grapefruit, but in differing amounts. Table 3.1 lists what they intend to buy. There are two fruit markets nearby, Sam's and Theo's, and their prices are given in Table 3.2. How much will it cost Ann and Bert to do their shopping at each of the two markets?
Table 3.1
       Apples  Grapefruit  Oranges
Ann      6         3         10
Bert     4         8          5

Table 3.2
            Sam's   Theo's
Apple       $0.10   $0.15
Grapefruit  $0.40   $0.30
Orange      $0.10   $0.20

Solution
If Ann shops at Sam's, she will spend
6(0.10) + 3(0.40) + 10(0.10) = $2.80
If she shops at Theo's, she will spend
6(0.15) + 3(0.30) + 10(0.20) = $3.80
Bert will spend
4(0.10) + 8(0.40) + 5(0.10) = $4.10
at Sam's and
4(0.15) + 8(0.30) + 5(0.20) = $4.00
at Theo's. (Presumably, Ann will shop at Sam's while Bert goes to Theo's.)
The "dot product form" of these calculations suggests that matrix multiplication is at work here. If we organize the given information into a demand matrix D and a price matrix P, we have
D = [6 3 10; 4 8 5]  and  P = [0.10 0.15; 0.40 0.30; 0.10 0.20]
The calculations above are equivalent to computing the product
DP = [6 3 10; 4 8 5][0.10 0.15; 0.40 0.30; 0.10 0.20] = [2.80 3.80; 4.10 4.00]
Thus, the product matrix DP tells us how much each person's purchases will cost at each store (Table 3.3).

Table 3.3
       Sam's   Theo's
Ann    $2.80   $3.80
Bert   $4.10   $4.00
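The demand-times-price computation of Example 3.7 can be sketched as follows; this is an illustration only, with prices kept in cents so the arithmetic stays exact:

```python
D = [[6, 3, 10],   # Ann: apples, grapefruit, oranges
     [4, 8, 5]]    # Bert
P = [[10, 15],     # apple prices in cents: Sam's, Theo's
     [40, 30],     # grapefruit
     [10, 20]]     # orange

# rows of D dotted with columns of P; zip(*P) iterates over the columns of P
DP = [[sum(d * p for d, p in zip(row, col)) for col in zip(*P)] for row in D]
print(DP)  # [[280, 380], [410, 400]], i.e. $2.80, $3.80, $4.10, $4.00
```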
142  Chapter 3  Matrices
Example 3.8
Consider the linear system
x1 - 2x2 + 3x3 = 5
-x1 + 3x2 + x3 = 1
2x1 - x2 + 4x3 = 14        (1)
Observe that the left-hand side arises from the matrix product
[1 -2 3; -1 3 1; 2 -1 4][x1; x2; x3]
so the system (1) can be written as
[1 -2 3; -1 3 1; 2 -1 4][x1; x2; x3] = [5; 1; 14]
or Ax = b, where A is the coefficient matrix, x is the (column) vector of variables, and b is the (column) vector of constant terms.
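To make the form Ax = b concrete (a sketch of ours, not part of the text): elimination shows that x = (6, 2, 1) satisfies system (1), and multiplying A by this vector indeed reproduces b:

```python
A = [[1, -2, 3],
     [-1, 3, 1],
     [2, -1, 4]]
b = [5, 1, 14]

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

x = [6, 2, 1]                # one solution of the system, found by elimination
assert mat_vec(A, x) == b    # Ax = b holds
print("Ax =", mat_vec(A, x))
```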
You should have no difficulty seeing that every linear system can be written in the form Ax = b. In fact, the notation [A | b] for the augmented matrix of a linear system is just shorthand for the matrix equation Ax = b. This form will prove to be a tremendously useful way of expressing a system of linear equations, and we will exploit it often from here on. Combining this insight with Theorem 2.4, we see that Ax = b has a solution if and only if b is a linear combination of the columns of A.
There is another fact about matrix operations that will also prove to be quite useful: Multiplication of a matrix by a standard unit vector can be used to "pick out" or "reproduce" a column or row of a matrix. Let
A = [4 2 -1; 0 5 1]
and consider the products Ae3 and e2A, with the unit vectors e3 and e2 chosen so that the products make sense. Thus,
Ae3 = [4 2 -1; 0 5 1][0; 0; 1] = [-1; 1]
and
e2A = [0 1][4 2 -1; 0 5 1] = [0 5 1]
Notice that Ae3 gives us the third column of A and e2A gives us the second row of A. We record the general result as a theorem.
Theorem 3.1  Let A be an m×n matrix, e_i a 1×m standard unit vector, and e_j an n×1 standard unit vector. Then
a. e_i A is the ith row of A and
b. A e_j is the jth column of A.
Section 3.1  Matrix Operations  143
Proof  We prove (b) and leave proving (a) as Exercise 41. If a1, ..., an are the columns of A, then the product Ae_j can be written
Ae_j = 0a1 + 0a2 + ... + 1a_j + ... + 0a_n = a_j
We could also prove (b) by direct calculation:
Ae_j = [a1 a2 ... an][0; ...; 1; ...; 0] = a_j
since the 1 in e_j is the jth entry.
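Theorem 3.1 is easy to confirm numerically for the matrix A used above (an illustrative check; the helper code is ours):

```python
A = [[4, 2, -1],
     [0, 5, 1]]

e2 = [0, 1]        # 1 x 2 standard unit vector
e3 = [0, 0, 1]     # 3 x 1 standard unit vector

row = [sum(e * a for e, a in zip(e2, col)) for col in zip(*A)]  # e2 A
col = [sum(a * e for a, e in zip(r, e3)) for r in A]            # A e3

print(row)  # [0, 5, 1] -- the second row of A
print(col)  # [-1, 1]   -- the third column of A
```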
Partitioned Matrices
It will often be convenient to regard a matrix as being composed of a number of smaller submatrices. By introducing vertical and horizontal lines into a matrix, we can partition it into blocks. There is a natural way to partition many matrices, particularly those arising in certain applications. For example, consider the matrix
A = [1 0 0 2 -1; 0 1 0 1 3; 0 0 1 4 0; 0 0 0 1 7; 0 0 0 7 2]
It seems natural to partition A as
A = [I B; O C]
where I is the 3×3 identity matrix, B is 3×2, O is the 2×3 zero matrix, and C is 2×2. In this way, we can view A as a 2×2 matrix whose entries are themselves matrices.
When matrices are being multiplied, there is often an advantage to be gained by viewing them as partitioned matrices. Not only does this frequently reveal underlying structures, but it often speeds up computation, especially when the matrices are large and have many blocks of zeros. It turns out that the multiplication of partitioned matrices is just like ordinary matrix multiplication.
We begin by considering some special cases of partitioned matrices. Each gives rise to a different way of viewing the product of two matrices. Suppose A is m×n and B is n×r, so the product AB exists. If we partition B in terms of its column vectors, as B = [b1 : b2 : ... : br], then
AB = A[b1 : b2 : ... : br] = [Ab1 : Ab2 : ... : Abr]
144  Chapter 3  Matrices
This result is an immediate consequence of the definition of matrix multiplication. The form on the right is called the matrix-column representation of the product.
Example 3.9
If
A = [1 3 2; 0 -1 1]  and  B = [4 -1; 1 2; 3 0]
then
Ab1 = [1 3 2; 0 -1 1][4; 1; 3] = [13; 2]  and  Ab2 = [1 3 2; 0 -1 1][-1; 2; 0] = [5; -2]
Therefore, AB = [Ab1 : Ab2] = [13 5; 2 -2]. (Check by ordinary matrix multiplication.)
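The matrix-column representation says AB can be built one column at a time. A sketch checking this for the matrices of Example 3.9 (helper names ours):

```python
A = [[1, 3, 2],
     [0, -1, 1]]
B = [[4, -1],
     [1, 2],
     [3, 0]]

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

AB_cols = [mat_vec(A, col) for col in zip(*B)]  # Ab1, Ab2
AB = [list(row) for row in zip(*AB_cols)]       # reassemble the columns
print(AB)  # [[13, 5], [2, -2]]
```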
Remark  Observe that the matrix-column representation of AB allows us to write each column of AB as a linear combination of the columns of A with entries from B as the coefficients. For example,
Ab1 = [1 3 2; 0 -1 1][4; 1; 3] = 4[1; 0] + 1[3; -1] + 3[2; 1]
(See Exercises 23 and 26.)
Suppose A is m×n and B is n×r, so the product AB exists. If we partition A in terms of its row vectors, as
A = [A1; A2; ...; Am]
then
AB = [A1; A2; ...; Am]B = [A1B; A2B; ...; AmB]
Once again, this result is a direct consequence of the definition of matrix multiplication. The form on the right is called the row-matrix representation of the product.
Example 3.10
Using the row-matrix representation, compute AB for the matrices in Example 3.9.
Solution  We compute
A1B = [1 3 2][4 -1; 1 2; 3 0] = [13 5]
and
A2B = [0 -1 1][4 -1; 1 2; 3 0] = [2 -2]
Therefore, AB = [A1B; A2B] = [13 5; 2 -2], as before.
The definition of the matrix product AB uses the natural partition of A into rows and B into columns; this form might well be called the row-column representation of the product. We can also partition A into columns and B into rows; this form is called the column-row representation of the product. In this case, we have
A = [a1 : a2 : ... : an]  and  B = [B1; B2; ...; Bn]
so
AB = a1B1 + a2B2 + ... + anBn        (2)
Notice that the sum resembles a dot product expansion; the difference is that the individual terms are matrices, not scalars. Let's make sure that this makes sense. Each term a_iB_i is the product of an m×1 and a 1×r matrix. Thus, each a_iB_i is an m×r matrix, the same size as AB. The products a_iB_i are called outer products, and (2) is called the outer product expansion of AB.
Example 3.11
Compute the outer product expansion of AB for the matrices in Example 3.9.
Solution  We have
A = [a1 : a2 : a3] = [1 3 2; 0 -1 1]  and  B = [B1; B2; B3] = [4 -1; 1 2; 3 0]
The outer products are
a1B1 = [1; 0][4 -1] = [4 -1; 0 0],  a2B2 = [3; -1][1 2] = [3 6; -1 -2],  and  a3B3 = [2; 1][3 0] = [6 0; 3 0]
(Observe that computing each outer product is exactly like filling in a multiplication table.) Therefore, the outer product expansion of AB is
[4 -1; 0 0] + [3 6; -1 -2] + [6 0; 3 0] = [13 5; 2 -2] = AB
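The outer product expansion can likewise be checked mechanically; the sum of the outer products a_iB_i reproduces AB (an illustrative sketch of ours):

```python
A = [[1, 3, 2],
     [0, -1, 1]]
B = [[4, -1],
     [1, 2],
     [3, 0]]

def outer(col, row):
    """Outer product of an m x 1 column and a 1 x r row."""
    return [[c * r for r in row] for c in col]

terms = [outer(col, B[i]) for i, col in enumerate(zip(*A))]  # the a_i B_i
AB = [[sum(t[i][j] for t in terms) for j in range(len(B[0]))]
      for i in range(len(A))]
print(AB)  # [[13, 5], [2, -2]], agreeing with Examples 3.9 and 3.10
```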
We will make use of the outer product expansion in Chapters 5 and 7 when we discuss the Spectral Theorem and the singular value decomposition, respectively.
Each of the foregoing partitions is a special case of partitioning in general. A matrix A is said to be partitioned if horizontal and vertical lines have been introduced, subdividing A into submatrices called blocks. Partitioning allows A to be written as a matrix whose entries are its blocks. For example,
3 :• 1 2!• 1 • •
4
212
 1
8
,nd
• 1 ••• 1
•• 1  5 : J 1 .............•. +• ...•...•.. +• ... • •
1
3: OJ• 0 o i• 2
0
1 10
0;3
are partitioned matrices. They have the block structures
A = [A11 A12; A21 A22]  and  B = [B11 B12 B13; B21 B22 B23]
If two matrices are the same size and have been partitioned in the same way, it is clear that they can be added and multiplied by scalars block by block. Less obvious is the fact that, with suitable partitioning, matrices can be multiplied blockwise as well. The next example illustrates this process.
Example 3.12
Consider the matrices A and B above. If we ignore for the moment the fact that their entries are matrices, then A appears to be a 2×2 matrix and B a 2×3 matrix. Their product should thus be a 2×3 matrix given by
AB = [A11 A12; A21 A22][B11 B12 B13; B21 B22 B23]
   = [A11B11 + A12B21  A11B12 + A12B22  A11B13 + A12B23;
      A21B11 + A22B21  A21B12 + A22B22  A21B13 + A22B23]
But all of the products in this calculation are actually matrix products, so we need to make sure that they are all defined. A quick check reveals that this is indeed the case, since the numbers of columns in the blocks of A (3 and 2) match the numbers of rows in the blocks of B. The matrices A and B are said to be partitioned conformably for block multiplication.
Carrying out the calculations indicated gives us the product AB in partitioned form; for example, the block in the upper left corner is A11B11 + A12B21.
Section 3.1  Matrix Operations  147
(When some of the blocks are zero matrices or identity matrices, as is the case here, these calculations can be done quite quickly.) The calculations for the other five blocks of AB are similar. Check that the result is
o
2 :• I ••
2 1 2 • •
I •. 12
5i2 • 5  ..  5' . +! ...... 3 __3.. t ....9. ... I 7 :• 0 0 : 23 7
• • •
••
2 i O 0 : 20
..t
(Observe tha t the block in the upper left co rne r is Ihe rcsult o f o ur calculations above.)
(Observe that the block in the upper left corner is the result of our calculations above.) Check that you obtain the same answer by multiplying A by B in the usual way.
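Blockwise multiplication can be verified on any conformably partitioned matrices. The sketch below uses small illustrative 4×4 matrices split into 2×2 blocks (our own samples, not the matrices of Example 3.12) and checks the block formula against the ordinary product:

```python
def mm(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A11, A12 = [[1, 0], [0, 1]], [[2, -1], [1, 3]]
A21, A22 = [[0, 0], [0, 0]], [[4, 0], [0, 2]]
B11, B12 = [[1, 2], [0, 1]], [[0, 1], [1, 0]]
B21, B22 = [[1, 0], [2, 1]], [[3, 0], [0, 3]]

# blockwise: (AB)11 = A11 B11 + A12 B21, and so on
C11 = madd(mm(A11, B11), mm(A12, B21))
C12 = madd(mm(A11, B12), mm(A12, B22))
C21 = madd(mm(A21, B11), mm(A22, B21))
C22 = madd(mm(A21, B12), mm(A22, B22))

# assemble the full matrices and compare with the unpartitioned product
glue = lambda X, Y: [rx + ry for rx, ry in zip(X, Y)]
A = glue(A11, A12) + glue(A21, A22)
B = glue(B11, B12) + glue(B21, B22)
C = glue(C11, C12) + glue(C21, C22)
assert mm(A, B) == C
print("blockwise product agrees with the ordinary product")
```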
Matrix Powers
When A and B are two n×n matrices, their product AB will also be an n×n matrix. A special case occurs when A = B. It makes sense to define A^2 = AA and, in general, to define A^k as
A^k = AA...A  (k factors)
if k is a positive integer. Thus, A^1 = A, and it is convenient to define A^0 = I_n.
Before making too many assumptions, we should ask ourselves to what extent matrix powers behave like powers of real numbers. The following properties follow immediately from the definitions we have just given and are the matrix analogues of the corresponding properties for powers of real numbers.
If A is a square matrix and r and s are nonnegative integers, then
1. A^r A^s = A^(r+s)
2. (A^r)^s = A^(rs)
In Section 3.3, we will extend the definition and properties to include negative integer powers.
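Both power laws are easy to spot-check numerically (an illustration; mat_pow is our helper, defined by repeated multiplication exactly as in the definition above):

```python
def mm(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def mat_pow(A, k):
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    for _ in range(k):
        result = mm(result, A)
    return result

A = [[1, 1], [0, 1]]
r, s = 3, 4
assert mm(mat_pow(A, r), mat_pow(A, s)) == mat_pow(A, r + s)  # property 1
assert mat_pow(mat_pow(A, r), s) == mat_pow(A, r * s)         # property 2
print(mat_pow(A, 5))  # [[1, 5], [0, 1]]
```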
Example 3.13
(a) If A = [1 1; 1 1], then
A^2 = [1 1; 1 1][1 1; 1 1] = [2 2; 2 2]  and  A^3 = A^2 A = [2 2; 2 2][1 1; 1 1] = [4 4; 4 4]
and, in general,
A^n = [2^(n-1) 2^(n-1); 2^(n-1) 2^(n-1)]  for all n >= 1
The above statement can be proved by mathematical induction, since it is an infinite collection of statements, one for each natural number n. (Appendix B gives a brief review of mathematical induction.) The basis step is to prove that the formula holds for n = 1. In this case,
A^1 = [2^0 2^0; 2^0 2^0] = [1 1; 1 1] = A
as required. The induction hypothesis is to assume that
A^k = [2^(k-1) 2^(k-1); 2^(k-1) 2^(k-1)]
for some integer k >= 1. The induction step is to prove that the formula holds for n = k + 1. Using the definition of matrix powers and the induction hypothesis, we compute (since 2^(k-1) + 2^(k-1) = 2^k)
A^(k+1) = A^k A = [2^(k-1) 2^(k-1); 2^(k-1) 2^(k-1)][1 1; 1 1] = [2^k 2^k; 2^k 2^k]
Thus, the formula holds for all n >= 1 by the principle of mathematical induction.
(b) If B = [0 1; -1 0], then
B^2 = [0 1; -1 0][0 1; -1 0] = [-1 0; 0 -1]
Continuing, we find
B^3 = B^2 B = [-1 0; 0 -1][0 1; -1 0] = [0 -1; 1 0]
and
B^4 = B^3 B = [0 -1; 1 0][0 1; -1 0] = [1 0; 0 1]
Thus, B^5 = B, and the sequence of powers of B repeats in a cycle of four:
[0 1; -1 0], [-1 0; 0 -1], [0 -1; 1 0], [1 0; 0 1], [0 1; -1 0], ...
The Transpose of a Matrix
Thus far, all of the matrix operations we have defined are analogous to operations on real numbers, although they may not always behave in the same way. The next operation has no such analogue.
Section 3.1  Matrix Operations  149
Definition  The transpose of an m×n matrix A is the n×m matrix A^T obtained by interchanging the rows and columns of A. That is, the ith column of A^T is the ith row of A for all i.
Example 3.14
Let
A = [1 3 2; 5 0 1],  B = [a b; c d],  and  C = [5 -1 2]
Then their transposes are
A^T = [1 5; 3 0; 2 1],  B^T = [a c; b d],  and  C^T = [5; -1; 2]
The transpose is sometimes used to give an alternative definition of the dot product of two vectors in terms of matrix multiplication. If
u = [u1; u2; ...; un]  and  v = [v1; v2; ...; vn]
then
u . v = u1v1 + u2v2 + ... + unvn = [u1 u2 ... un][v1; v2; ...; vn] = u^T v
A useful alternative definition of the transpose is given componentwise:
(A^T)_ij = A_ji  for all i and j
In words, the entry in row i and column j of A^T is the same as the entry in row j and column i of A.
The transpose is also used to define a very important type of square matrix: a symmetric matrix.
Definition  A square matrix A is symmetric if A^T = A; that is, if A is equal to its own transpose.
Example 3.15
Let
A = [1 3 2; 3 5 0; 2 0 4]  and  B = [1 2; 3 4]
150  Chapter 3  Matrices
Then A is symmetric, since A^T = A; but B is not symmetric, since B^T = [1 3; 2 4] ≠ B.
A symmetric matrix has the property that it is its own "mirror image" across its main diagonal. Figure 3.1 illustrates this property for a 3×3 matrix. The corresponding shapes represent equal entries; the diagonal entries (those on the dashed line) are arbitrary. A componentwise definition of a symmetric matrix is also useful. It is simply the algebraic description of the "reflection" property:
A square matrix A is symmetric if and only if A_ij = A_ji for all i and j.

Figure 3.1  A symmetric matrix
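The componentwise characterization gives an immediate computational test for symmetry. The sketch below checks the matrix A of Example 3.15 and also illustrates the identity u . v = u^T v (the helper code is ours):

```python
def transpose(A):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 3, 2],
     [3, 5, 0],
     [2, 0, 4]]
assert transpose(A) == A      # symmetric: a_ij == a_ji for all i, j

B = [[1, 2], [3, 4]]
print(transpose(B))           # [[1, 3], [2, 4]] -- not equal to B

# the dot product computed as the matrix product u^T v
u, v = [1, 2, 3], [4, 5, 6]
print(sum(x * y for x, y in zip(u, v)))  # 32
```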
L" A
~
[
3
 1
~
D
~].
3]
[ 0 2
In Exercises 1-16,
B=
1 •
l~
 2 2
E ~ [4
compute the
:J. 2].
1 2
C ~
3 5
F ~
4 • 6
[:]
indicated matrices (if possible).
+ 2D
2. 3D  2A
3. B  C
4.B  CT
5. AB
6. BD
7. D
+
8. BTB
BC
10. F(DF)
9. E(AF)
11. FE
12. EF
13. B^T C^T - (CB)^T
14. DA - AD
15. A^3
16. (I2 - D)^2
17. Give an example of a nonzero 2×2 matrix A such that A^2 = O.
18. Let A =
[!
Find 2×2 matrices B and C such that AB = AC but B ≠ C.
19. A factory manufactures three products (doohickeys, gizmos, and widgets) and ships them to two warehouses for storage. The number of units of each product shipped to each warehouse is given by the matrix
A = [200 75; 150 100; 100 125]
(where a_ij is the number of units of product i sent to warehouse j and the products are taken in alphabetical order). The cost of shipping one unit of each product by truck is $1.50 per doohickey, $1.00 per gizmo, and $2.00 per widget. The corresponding unit costs to ship by train are $1.75, $1.50, and $1.00. Organize these costs into a matrix B and then use matrix multiplication to show how the factory can compare the cost of shipping its products to each of the two warehouses by truck and by train.
20. Referring to Exercise 19, suppose that the unit cost of distributing the products to stores is the same for each product but varies by warehouse because of the distances involved. It costs $0.75 to distribute one unit from warehouse 1 and $1.00 to distribute one unit from warehouse 2. Organize these costs into a matrix C and then use matrix multiplication to compute the total cost of distributing each product.
Section 3.1  Matrix Operations

In Exercises 21-22, write the given system of linear equations as a matrix equation of the form Ax = b.

21.  x1 - 2x2 + 3x3 = 0
    2x1 +  x2 - 5x3 = 4

In Exercises 23-28, let

A = [3  1]               B = [1 -1  0]
    [1  2]     and           [1  4  2]
    [0 -1]

23. Use the matrix-column representation of the product to write each column of AB as a linear combination of the columns of A.
24. Use the row-matrix representation of the product to write each row of AB as a linear combination of the rows of B.
25. Compute the outer product expansion of AB.
26. Use the matrix-column representation of the product to write each column of BA as a linear combination of the columns of B.
27. Use the row-matrix representation of the product to write each row of BA as a linear combination of the rows of A.
28. Compute the outer product expansion of BA.

In Exercises 29 and 30, assume that the product AB makes sense.

29. Prove that if the columns of B are linearly dependent, then so are the columns of AB.
30. Prove that if the rows of A are linearly dependent, then so are the rows of AB.

In Exercises 31-34, compute AB by block multiplication, using the indicated partitioning.

35. Let A = [ 0  1]
            [-1  0]
(a) Compute A^2, A^3, ..., A^7.
(b) What is A^1001? Why?

36. Let B = [1/√2  -1/√2]
            [1/√2   1/√2]
Find, with justification, B^1001.

37. Let A = [1  1]. Find a formula for A^n (n ≥ 1) and verify your formula using mathematical induction.
            [0  1]

38. Let A = [cos θ  -sin θ]
            [sin θ   cos θ]
(a) Show that A^2 = [cos 2θ  -sin 2θ]
                    [sin 2θ   cos 2θ]
(b) Prove, by mathematical induction, that

A^n = [cos nθ  -sin nθ]
      [sin nθ   cos nθ]    for n ≥ 1.

39. In each of the following, find the 4 × 4 matrix A = [a_ij] that satisfies the given condition:
(a) a_ij = (-1)^(i+j)        (b) a_ij = j
(c) a_ij = (i - 1)^j         (d) a_ij = sin((i + j - 1)π/4)

40. In each of the following, find the 6 × 6 matrix A = [a_ij] that satisfies the given condition:
(a) a_ij = i + j if i ≤ j, and a_ij = 0 if i > j
(c) a_ij = 1 if |i - j| ≤ 1, and a_ij = 0 if |i - j| > 1

41. Prove Theorem 3.1(a).

Chapter 3  Matrices
Matrix Algebra

In some ways, the arithmetic of matrices generalizes that of vectors. We do not expect any surprises with respect to addition and scalar multiplication, and indeed there are none. This will allow us to extend to matrices several concepts that we are already familiar with from our work with vectors. In particular, linear combinations, spanning sets, and linear independence carry over to matrices with no difficulty. However, matrices have other operations, such as matrix multiplication, that vectors do not possess. We should not expect matrix multiplication to behave like multiplication of real numbers unless we can prove that it does; in fact, it does not. In this section, we summarize and prove some of the main properties of matrix operations and begin to develop an algebra of matrices.
Properties of Addition and Scalar Multiplication

All of the algebraic properties of addition and scalar multiplication for vectors (Theorem 1.1) carry over to matrices. For completeness, we summarize these properties in the next theorem.
Theorem 3.2  Algebraic Properties of Matrix Addition and Scalar Multiplication

Let A, B, and C be matrices of the same size and let c and d be scalars. Then

a. A + B = B + A                  Commutativity
b. (A + B) + C = A + (B + C)      Associativity
c. A + O = A
d. A + (-A) = O
e. c(A + B) = cA + cB             Distributivity
f. (c + d)A = cA + dA             Distributivity
g. c(dA) = (cd)A
h. 1A = A
The proofs of these properties are direct analogues of the corresponding proofs of the vector properties and are left as exercises. Likewise, the comments following Theorem 1.1 are equally valid here, and you should have no difficulty using these properties to perform algebraic manipulations with matrices. (Review Example 1.5 and see Exercises 17 and 18 at the end of this section.)

The associativity property allows us to unambiguously combine scalar multiplication and addition without parentheses. If A, B, and C are matrices of the same size, then

(2A + 3B) - C = 2A + (3B - C)

and so we can simply write 2A + 3B - C. Generally, then, if A1, A2, ..., Ak are matrices of the same size and c1, c2, ..., ck are scalars, we may form the linear combination

c1 A1 + c2 A2 + ... + ck Ak

We will refer to c1, c2, ..., ck as the coefficients of the linear combination. We can now ask and answer questions about linear combinations of matrices.
Example 3.16

Let A1 = [ 0  1],  A2 = [1  0],  and  A3 = [1  1].
         [-1  0]        [0  1]             [1  1]

(a) Is B = [1  4] a linear combination of A1, A2, and A3?
           [2  1]

(b) Is C = [1  2] a linear combination of A1, A2, and A3?
           [3  4]
Solution  (a) We want to find scalars c1, c2, and c3 such that c1 A1 + c2 A2 + c3 A3 = B. Thus,

c1 [ 0  1] + c2 [1  0] + c3 [1  1] = [1  4]
   [-1  0]      [0  1]      [1  1]   [2  1]

The left-hand side of this equation can be rewritten as

[ c2 + c3    c1 + c3]
[-c1 + c3    c2 + c3]

Comparing entries and using the definition of matrix equality, we have four linear equations

     c2 + c3 = 1
c1      + c3 = 4
-c1     + c3 = 2
     c2 + c3 = 1

Gauss-Jordan elimination easily gives

[ 0  1  1 | 1]       [1  0  0 |  1]
[ 1  0  1 | 4]   →   [0  1  0 | -2]
[-1  0  1 | 2]       [0  0  1 |  3]
[ 0  1  1 | 1]       [0  0  0 |  0]

(check this!), so c1 = 1, c2 = -2, and c3 = 3. Thus, A1 - 2A2 + 3A3 = B, which can be easily checked.
(b) This time we want to solve

c1 [ 0  1] + c2 [1  0] + c3 [1  1] = [1  2]
   [-1  0]      [0  1]      [1  1]   [3  4]

Proceeding as in part (a), we obtain the linear system

     c2 + c3 = 1
c1      + c3 = 2
-c1     + c3 = 3
     c2 + c3 = 4

Row reduction gives

[ 0  1  1 | 1]            [ 0  1  1 | 1]
[ 1  0  1 | 2]  R4 - R1   [ 1  0  1 | 2]
[-1  0  1 | 3]     →      [-1  0  1 | 3]
[ 0  1  1 | 4]            [ 0  0  0 | 3]

We need go no further: The last row implies that there is no solution. Therefore, in this case, C is not a linear combination of A1, A2, and A3.
Remark  Observe that the columns of the augmented matrix contain the entries of the matrices we are given. If we read the entries of each matrix from left to right and top to bottom, we get the order in which the entries appear in the columns of the augmented matrix. For example, we read A1 as "0, 1, -1, 0," which corresponds to the first column of the augmented matrix. It is as if we simply "straightened out" the given matrices into column vectors. Thus, we would have ended up with exactly the same system of linear equations as in part (a) if we had asked:

Is  [1]                             [ 0]  [1]      [1]
    [4]  a linear combination of    [ 1], [0], and [1] ?
    [2]                             [-1]  [0]      [1]
    [1]                             [ 0]  [1]      [1]

We will encounter such parallels repeatedly from now on. In Chapter 6, we will explore them in more detail.

We can define the span of a set of matrices to be the set of all linear combinations of the matrices.
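The "straightening out" idea is easy to check by computation. A short Python sketch using the matrices of Example 3.16; the helper `lin_comb` is ours, not the book's:

```python
A1 = [[0, 1], [-1, 0]]
A2 = [[1, 0], [0, 1]]
A3 = [[1, 1], [1, 1]]
B  = [[1, 4], [2, 1]]

def lin_comb(coeffs, mats):
    """Entrywise linear combination c1*M1 + c2*M2 + ... of same-size matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(cols)]
            for i in range(rows)]

# The coefficients found by Gauss-Jordan elimination in part (a):
print(lin_comb([1, -2, 3], [A1, A2, A3]) == B)  # True
```

Solving for the coefficients is the same computation whether the objects are 2 × 2 matrices or the length-4 vectors obtained by straightening them out.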
Example 3.17

Describe the span of the matrices A1, A2, and A3 in Example 3.16.

Solution  One way to do this is simply to write out a general linear combination of A1, A2, and A3. Thus,

c1 A1 + c2 A2 + c3 A3 = c1 [ 0  1] + c2 [1  0] + c3 [1  1] = [ c2 + c3   c1 + c3]
                           [-1  0]      [0  1]      [1  1]   [-c1 + c3   c2 + c3]

(which is analogous to the parametric representation of a plane). But suppose we want to know when the matrix [w x; y z] is in span(A1, A2, A3). From the representation above, we know that it is when

[ c2 + c3   c1 + c3] = [w  x]
[-c1 + c3   c2 + c3]   [y  z]

for some choice of scalars c1, c2, c3. This gives rise to a system of linear equations whose left-hand side is exactly the same as in Example 3.16 but whose right-hand side
is general. The augmented matrix of this system is

[ 0  1  1 | w]
[ 1  0  1 | x]
[-1  0  1 | y]
[ 0  1  1 | z]

and row reduction produces

[ 0  1  1 | w]       [1  0  0 | (x - y)/2    ]
[ 1  0  1 | x]   →   [0  1  0 | w - (x + y)/2]
[-1  0  1 | y]       [0  0  1 | (x + y)/2    ]
[ 0  1  1 | z]       [0  0  0 | z - w        ]

(Check this carefully.) The only restriction comes from the last row, where clearly we must have z - w = 0, that is, w = z, in order to have a solution. Thus, the span of A1, A2, and A3 consists of all matrices [w x; y z] for which w = z. That is,

span(A1, A2, A3) = { [w  x] }
                   { [y  w] }

Note  If we had known this before attempting Example 3.16, we would have seen immediately that B = [1 4; 2 1] is a linear combination of A1, A2, and A3, since it has the necessary form (take w = 1, x = 4, and y = 2), but C = [1 2; 3 4] cannot be a linear combination of A1, A2, and A3, since it does not have the proper form (1 ≠ 4).

Linear independence also makes sense for matrices. We say that matrices A1, A2, ..., Ak of the same size are linearly independent if the only solution of the equation

c1 A1 + c2 A2 + ... + ck Ak = O                    (1)

is the trivial one: c1 = c2 = ... = ck = 0. If there are nontrivial coefficients that satisfy (1), then A1, A2, ..., Ak are called linearly dependent.
Example 3.18
Determ ine whether the matrices AI' A l • and A) in Example 3.16 ar linearly
independent.
Sollilion
We want to solve the equatio n matrices, we have
(I[ _~
CI A .
+
CzA2
~J + (l[~ ~] + 0[:
+
c, A) = 0. Writing out the
:J [~ ~J =
This time we get a homogeneous linear syslem whose left hand side is the same as in Examples 3.16 and 3. 17. (Are yo u slarting to spot a pallern yet?) The augmented mlllrix row reduces 10 give
o
1
I 0
I  \
0 0
1 0 J 0
o
1
I
0
,
\
o
o
\
o o
0 0
0 0 o 0 I 0 0 0
156
Chapter 3 Matrices
hus, CI = C:! == " "".,, and we conclude that the matrices AI' A z' and AJ arc lineady ndel2.ende nt.
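The "straightening out" trick also makes the independence check mechanical: the matrices are independent exactly when the coefficient matrix of straightened-out columns has full column rank. A small pure-Python sketch (the `rank` routine and all names are ours, not the book's):

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) via Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# "Straighten out" each 2x2 matrix of Example 3.16 into a column of a 4x3 matrix.
A1, A2, A3 = [0, 1, -1, 0], [1, 0, 0, 1], [1, 1, 1, 1]
coeff = [list(col) for col in zip(A1, A2, A3)]
print(rank(coeff))  # 3, so A1, A2, A3 are linearly independent
```

A rank of 3 means the homogeneous system above has only the trivial solution, in agreement with the row reduction in Example 3.18.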
Properties of Matrix Multiplication

Whenever we encounter a new operation, such as matrix multiplication, we must be careful not to assume too much about it. It would be nice if matrix multiplication behaved like multiplication of real numbers. Although in many respects it does, there are some significant differences.
Example 3.19

Consider the matrices

A = [ 2   4]    and    B = [1  2]
    [-1  -2]               [3  4]

Multiplying gives

AB = [ 2   4][1  2] = [14   20]
     [-1  -2][3  4]   [-7  -10]

and

BA = [1  2][ 2   4] = [0  0]
     [3  4][-1  -2]   [2  4]

Thus, AB ≠ BA. So, in contrast to multiplication of real numbers, matrix multiplication is not commutative: the order of the factors in a product matters!

It is easy to check that A^2 = [0 0; 0 0] (do so!). So, for matrices, the equation A^2 = O does not imply that A = O (unlike the situation for real numbers, where the equation x^2 = 0 has only x = 0 as a solution).
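Both failures, of commutativity and of cancellation, can be checked directly. A quick Python sketch using the nilpotent matrix A above (the helper `matmul` is ours):

```python
def matmul(A, B):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, 4], [-1, -2]]
B = [[1, 2], [3, 4]]
print(matmul(A, B))   # [[14, 20], [-7, -10]]
print(matmul(B, A))   # [[0, 0], [2, 4]]   -- so AB != BA
print(matmul(A, A))   # [[0, 0], [0, 0]]   -- A^2 = O even though A != O
```

Note that `zip(*B)` iterates over the columns of B, so each entry of the product is a dot product of a row of A with a column of B, exactly the definition of matrix multiplication.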
However gloomy things might appear after the last example, the situation is not really bad at all; you just need to get used to working with matrices and to constantly remind yourself that they are not numbers. The next theorem summarizes the main properties of matrix multiplication.
Theorem 3.3  Properties of Matrix Multiplication

Let A, B, and C be matrices (whose sizes are such that the indicated operations can be performed) and let k be a scalar. Then

a. A(BC) = (AB)C                    Associativity
b. A(B + C) = AB + AC               Left distributivity
c. (A + B)C = AC + BC               Right distributivity
d. k(AB) = (kA)B = A(kB)
e. I_m A = A = A I_n if A is m × n  Multiplicative identity
Proof  We prove (b) and half of (e). We defer the proof of property (a) until Section 3.6. The remaining properties are considered in the exercises.

(b) To prove A(B + C) = AB + AC, we let the rows of A be denoted by A_i and the columns of B and C by b_j and c_j. Then the jth column of B + C is b_j + c_j (since addition is defined componentwise), and thus

[A(B + C)]_ij = A_i · (b_j + c_j)
             = A_i · b_j + A_i · c_j
             = (AB)_ij + (AC)_ij
             = (AB + AC)_ij

Since this is true for all i and j, we must have A(B + C) = AB + AC.

(e) To prove A I_n = A, we note that the identity matrix I_n can be column-partitioned as

I_n = [e1 : e2 : ... : e_n]

where e_i is a standard unit vector. Therefore,

A I_n = [Ae1 : Ae2 : ... : Ae_n] = [a1 : a2 : ... : a_n] = A

by Theorem 3.1(b).

We can use these properties to further explore how closely matrix multiplication resembles multiplication of real numbers.
Example 3.20

If A and B are square matrices of the same size, is (A + B)^2 = A^2 + 2AB + B^2?

Solution  Using properties of matrix multiplication, we compute

(A + B)^2 = (A + B)(A + B)
          = (A + B)A + (A + B)B        By left distributivity
          = A^2 + BA + AB + B^2        By right distributivity

Therefore, (A + B)^2 = A^2 + 2AB + B^2 if and only if A^2 + BA + AB + B^2 = A^2 + 2AB + B^2. Subtracting A^2 and B^2 from both sides gives BA + AB = 2AB. Subtracting AB from both sides gives BA = AB. Thus, (A + B)^2 = A^2 + 2AB + B^2 if and only if A and B commute. (Can you give an example of such a pair of matrices? Can you find two matrices that do not satisfy this property?)
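The failure of the familiar identity for a non-commuting pair is easy to exhibit numerically. A Python sketch with a pair of matrices chosen by us (not from the book) for which AB ≠ BA:

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]          # AB != BA for this pair
S = add(A, B)
lhs = matmul(S, S)                                                 # (A+B)^2
rhs = add(add(matmul(A, A), scale(2, matmul(A, B))), matmul(B, B)) # A^2 + 2AB + B^2
print(lhs == rhs)  # False: the identity fails because AB != BA
```

Replacing 2AB by AB + BA in `rhs` makes the two sides agree, which is exactly the correct expansion derived in Example 3.20.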
Properties of the Transpose

Theorem 3.4  Properties of the Transpose

Let A and B be matrices (whose sizes are such that the indicated operations can be performed) and let k be a scalar. Then

a. (A^T)^T = A
b. (A + B)^T = A^T + B^T
c. (kA)^T = k(A^T)
d. (AB)^T = B^T A^T
e. (A^r)^T = (A^T)^r for all nonnegative integers r

Proof  Properties (a)-(c) are intuitively clear and straightforward to prove (see Exercise 30). Proving property (e) is a good exercise in mathematical induction (see Exercise 31). We will prove (d), since it is not what you might have expected. [Would you have suspected that (AB)^T = A^T B^T might be true?]

First, if A is m × n and B is n × r, then B^T is r × n and A^T is n × m. Thus, the product B^T A^T is defined and is r × m. Since AB is m × r, (AB)^T is r × m, and so (AB)^T and B^T A^T have the same size. We must now prove that their corresponding entries are equal.

We denote the ith row of a matrix X by row_i(X) and its jth column by col_j(X). Using these conventions, we see that

[(AB)^T]_ij = (AB)_ji = row_j(A) · col_i(B) = col_j(A^T) · row_i(B^T)
            = row_i(B^T) · col_j(A^T) = [B^T A^T]_ij

(Note that we have used the definition of matrix multiplication, the definition of the transpose, and the fact that the dot product is commutative.) Since i and j are arbitrary, this result implies that (AB)^T = B^T A^T.
Remark  Properties (b) and (d) of Theorem 3.4 can be generalized to sums and products of finitely many matrices:

(A1 + A2 + ... + Ak)^T = A1^T + A2^T + ... + Ak^T    and    (A1 A2 ... Ak)^T = Ak^T ... A2^T A1^T

assuming that the sizes of the matrices are such that all of the operations can be performed. You are asked to prove these facts by mathematical induction in Exercises 32 and 33.
Example 3.21

Let A = [1  2]    and    B = [-1  2]
        [3  4]               [ 3  0]
                             [ 1  4]

Then A^T = [1  3], so A + A^T = [2  5], a symmetric matrix.
           [2  4]               [5  8]

We have

B B^T = [-1  2][-1  3  1] = [ 5  -3   7]
        [ 3  0][ 2  0  4]   [-3   9   3]
        [ 1  4]             [ 7   3  17]

and

B^T B = [-1  3  1][-1  2] = [11   2]
        [ 2  0  4][ 3  0]   [ 2  20]
                  [ 1  4]

Thus, both B B^T and B^T B are symmetric, even though B is not even square! (Check that A A^T and A^T A are also symmetric.)
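The computation in Example 3.21 can be reproduced and checked in a few lines of Python (helper names are ours; B uses the entries shown above):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

B = [[-1, 2], [3, 0], [1, 4]]        # 3x2, not square
BBt = matmul(B, transpose(B))        # 3x3
BtB = matmul(transpose(B), B)        # 2x2
print(BtB)                                            # [[11, 2], [2, 20]]
print(BBt == transpose(BBt), BtB == transpose(BtB))   # True True
```

The symmetry of both products is no accident, as the next theorem shows.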
The next theorem says that the results of Example 3.21 are true in general.

Theorem 3.5

a. If A is a square matrix, then A + A^T is a symmetric matrix.
b. For any matrix A, A A^T and A^T A are symmetric matrices.

Proof  We prove (a) and leave proving (b) as Exercise 34. We simply check that

(A + A^T)^T = A^T + (A^T)^T = A^T + A = A + A^T

(using properties of the transpose and the commutativity of matrix addition). Thus, A + A^T is equal to its own transpose and so, by definition, is symmetric.
Exercises 3.2

In Exercises 1-4, solve the equation for X, given 2 × 2 matrices A and B.

1. X - 2A + 3B = O
2. 2X = A - B
3. 2(A + 2B) = 3X
4. 2(A - B + X) = 3(X - A)

In Exercises 5-8, write B as a linear combination of the other matrices, if possible.

In Exercises 9-12, find the general form of the span of the indicated matrices, as in Example 3.17.

9. span(A1, A2) in Exercise 5
10. span(A1, A2, A3) in Exercise 6
11. span(A1, A2, A3) in Exercise 7
12. span(A1, A2, A3, A4) in Exercise 8

In Exercises 13-16, determine whether the given matrices are linearly independent.

17. Prove Theorem 3.2(a)-(d).
18. Prove Theorem 3.2(e)-(h).
19. Prove Theorem 3.3(c).
20. Prove Theorem 3.3(d).
21. Prove the half of Theorem 3.3(e) that was not proved in the text.
22. Prove that, for square matrices A and B, AB = BA if and only if (A - B)(A + B) = A^2 - B^2.

In Exercises 23-25, if B = [a  b], find conditions on a, b, c, and d such that AB = BA.
                           [c  d]

23. A = [1  1]    24. A = [1   1]    25. A = [1  2]
        [0  1]            [1  -1]            [3  4]

26. Find conditions on a, b, c, and d such that B = [a b; c d] commutes with both [1 0; 0 0] and [0 0; 0 1].

27. Find conditions on a, b, c, and d such that B = [a b; c d] commutes with every 2 × 2 matrix.

28. Prove that if AB and BA are both defined, then AB and BA are both square matrices.

A square matrix is called upper triangular if all of the entries below the main diagonal are zero. Thus, the form of an upper triangular matrix is

[*  *  ...  *]
[0  *  ...  *]
[0  0  ...  *]
[0  0  ...  *]

where the entries marked * are arbitrary. A more formal definition of such a matrix A = [a_ij] is that a_ij = 0 if i > j.

29. Prove that the product of two upper triangular n × n matrices is upper triangular.
30. Prove Theorem 3.4(a)-(c).
31. Prove Theorem 3.4(e).
32. Using induction, prove that for all n ≥ 1, (A1 + A2 + ... + An)^T = A1^T + A2^T + ... + An^T.
33. Using induction, prove that for all n ≥ 1, (A1 A2 ... An)^T = An^T ... A2^T A1^T.
34. Prove Theorem 3.5(b).
35. (a) Prove that if A and B are symmetric n × n matrices, then so is A + B.
    (b) Prove that if A is a symmetric n × n matrix, then so is kA for any scalar k.
36. (a) Give an example to show that if A and B are symmetric n × n matrices, then AB need not be symmetric.
    (b) Prove that if A and B are symmetric n × n matrices, then AB is symmetric if and only if AB = BA.

A square matrix is called skew-symmetric if A^T = -A.

37. Which of the following matrices are skew-symmetric?

(a) [ 0  1]    (b) [0  1]    (c) [ 0   3  -1]    (d) [ 0   1  -2]
    [-1  0]        [1  0]        [-3   0   1]        [-1   0   5]
                                 [ 1  -2   0]        [ 2  -5   0]

38. Give a componentwise definition of a skew-symmetric matrix.
39. Prove that the main diagonal of a skew-symmetric matrix must consist entirely of zeros.
40. Prove that if A and B are skew-symmetric n × n matrices, then so is A + B.
41. If A and B are skew-symmetric 2 × 2 matrices, under what conditions is AB skew-symmetric?
42. Prove that if A is an n × n matrix, then A - A^T is skew-symmetric.
43. (a) Prove that any square matrix A can be written as the sum of a symmetric matrix and a skew-symmetric matrix. (Hint: Consider Theorem 3.5 and Exercise 42.)
    (b) Illustrate part (a) for the matrix

    A = [1  2  3]
        [4  5  6]
        [7  8  9]

The trace of an n × n matrix A = [a_ij] is the sum of the entries on its main diagonal and is denoted by tr(A). That is, tr(A) = a_11 + a_22 + ... + a_nn.

44. If A and B are n × n matrices, prove the following properties of the trace:
    (a) tr(A + B) = tr(A) + tr(B)
    (b) tr(kA) = k tr(A), where k is a scalar
45. Prove that if A and B are n × n matrices, then tr(AB) = tr(BA).
46. If A is any matrix, to what is tr(A A^T) equal?
47. Show that there are no 2 × 2 matrices A and B such that AB - BA = I_2.
The Inverse of a Matrix

In this section, we return to the matrix description Ax = b of a system of linear equations and look for ways to use matrix algebra to solve the system. By way of analogy, consider the equation ax = b, where a, b, and x represent real numbers and we want to solve for x. We can quickly figure out that we want x = b/a as the solution, but we must remind ourselves that this is true only if a ≠ 0. Proceeding more slowly, assuming that a ≠ 0, we will reach the solution by the following sequence of steps:

ax = b  ⟹  (1/a)(ax) = (1/a)b  ⟹  ((1/a)a)x = b/a  ⟹  1x = b/a  ⟹  x = b/a

(This example shows how much we do in our head and how many properties of arithmetic and algebra we take for granted!)

To imitate this procedure for the matrix equation Ax = b, what do we need? We need to find a matrix A' (analogous to 1/a) such that A'A = I, an identity matrix (analogous to 1). If such a matrix exists (analogous to the requirement that a ≠ 0), then we can do the following sequence of calculations:

Ax = b  ⟹  A'(Ax) = A'b  ⟹  (A'A)x = A'b  ⟹  Ix = A'b  ⟹  x = A'b

(Why would each of these steps be justified?)

Our goal in this section is to determine precisely when we can find such a matrix A'. In fact, we are going to insist on a bit more: We want not only A'A = I but also AA' = I. This requirement forces A and A' to be square matrices. (Why?)
Definition  If A is an n × n matrix, an inverse of A is an n × n matrix A' with the property that

AA' = I    and    A'A = I

where I = I_n is the n × n identity matrix. If such an A' exists, then A is called invertible.

Example 3.22

If A = [2  5], then A' = [ 3  -5] is an inverse of A, since
       [1  3]            [-1   2]

AA' = [2  5][ 3  -5] = [1  0]    and    A'A = [ 3  -5][2  5] = [1  0]
      [1  3][-1   2]   [0  1]                 [-1   2][1  3]   [0  1]
Example 3.23

Show that the following matrices are not invertible:

(a) O = [0  0]        (b) B = [1  2]
        [0  0]                [2  4]

Solution  (a) It is easy to see that the zero matrix O does not have an inverse. If it did, then there would be a matrix O' such that OO' = I = O'O. But the product of the zero matrix with any other matrix is the zero matrix, and so OO' could never equal the identity matrix I. (Notice that this proof makes no reference to the size of the matrices and so is true for n × n matrices in general.)

(b) Suppose B has an inverse B' = [w x; y z]. The equation BB' = I gives

[1  2][w  x] = [1  0]
[2  4][y  z]   [0  1]

from which we get the equations

 w + 2y      = 1
      x + 2z = 0
2w + 4y      = 0
     2x + 4z = 1

Subtracting twice the first equation from the third yields 0 = -2, which is clearly absurd. Thus, there is no solution. (Row reduction gives the same result but is not really needed here.) We deduce that no such matrix B' exists; that is, B is not invertible. (In fact, it does not even have an inverse that works on one side!)
Remarks
• Even though we have seen that matrix multiplication is not, in general, commutative, A' (if it exists) must satisfy A'A = AA'.
• The examples above raise two questions, which we will answer in this section: (1) How can we know when a matrix has an inverse? (2) If a matrix does have an inverse, how can we find it?
• We have not ruled out the possibility that a matrix A might have more than one inverse. The next theorem assures us that this cannot happen.

Theorem 3.6  If A is an invertible matrix, then its inverse is unique.

Proof  In mathematics, a standard way to show that there is just one of something is to show that there cannot be more than one. So, suppose that A has two inverses, say A' and A''. Then

AA' = I = A'A    and    AA'' = I = A''A

Thus,

A' = A'I = A'(AA'') = (A'A)A'' = IA'' = A''

Hence, A' = A'', and the inverse is unique.

Thanks to this theorem, we can now refer to the inverse of an invertible matrix. From now on, when A is invertible, we will denote its (unique) inverse by A^(-1) (pronounced "A inverse").
Warning  Do not be tempted to write A^(-1) = 1/A! There is no such operation as "division by a matrix." Even if there were, how on earth could we divide the scalar 1 by the matrix A? If you ever feel tempted to "divide" by a matrix, what you really want to do is multiply by its inverse.

We can now complete the analogy that we set up at the beginning of this section.
Theorem 3.7  If A is an invertible n × n matrix, then the system of linear equations given by Ax = b has the unique solution x = A^(-1) b for any b in R^n.

Proof  Theorem 3.7 essentially formalizes the observation we made at the beginning of this section. We will go through it again, a little more carefully this time. We are asked to prove two things: that Ax = b has a solution and that it has only one solution. (In mathematics, such a proof is called an "existence and uniqueness" proof.)

To show that a solution exists, we need only verify that x = A^(-1) b works. We check that

A(A^(-1) b) = (A A^(-1)) b = I b = b

So A^(-1) b satisfies the equation Ax = b, and hence there is at least this solution. To show that this solution is unique, suppose y is another solution. Then Ay = b, and multiplying both sides of the equation by A^(-1) on the left, we obtain the chain of implications

Ay = b  ⟹  A^(-1)(Ay) = A^(-1) b  ⟹  (A^(-1) A)y = A^(-1) b  ⟹  Iy = A^(-1) b  ⟹  y = A^(-1) b

Thus, y is the same solution as before, and therefore the solution is unique.
So, returning to the questions we raised in the Remarks before Theorem 3.6, how can we tell whether a matrix is invertible, and how can we find its inverse when it is invertible? We will give a general procedure shortly, but the situation for 2 × 2 matrices is sufficiently simple to warrant being singled out.

Theorem 3.8  If A = [a  b], then A is invertible if ad - bc ≠ 0, in which case
                    [c  d]

A^(-1) = 1/(ad - bc) [ d  -b]
                     [-c   a]

If ad - bc = 0, then A is not invertible.

The expression ad - bc is called the determinant of A, denoted det A. The formula for the inverse of [a b; c d] (when it exists) is thus 1/det A times the matrix obtained by interchanging the entries on the main diagonal and changing the signs of the other two entries. In addition to giving this formula, Theorem 3.8 says that a 2 × 2 matrix A is invertible if and only if det A ≠ 0. We will see in Chapter 4 that the determinant can be defined for all square matrices and that this result remains true, although there is no simple formula for the inverse of larger square matrices.
Proof  Suppose that det A = ad - bc ≠ 0. Then

[a  b][ d  -b] = [ad - bc   -ab + ba] = [ad - bc      0    ] = det A [1  0]
[c  d][-c   a]   [cd - dc   -cb + da]   [   0      ad - bc]          [0  1]

Similarly,

[ d  -b][a  b] = det A [1  0]
[-c   a][c  d]         [0  1]

Since det A ≠ 0, we can multiply both sides of each equation by 1/det A to obtain

[a  b](1/det A [ d  -b]) = [1  0]    and    (1/det A [ d  -b])[a  b] = [1  0]
[c  d](        [-c   a])   [0  1]           (        [-c   a])[c  d]   [0  1]

[Note that we have used property (d) of Theorem 3.3.] Thus, the matrix

1/det A [ d  -b]
        [-c   a]

satisfies the definition of an inverse, so A is invertible. Since the inverse of A is unique, by Theorem 3.6, we must have

A^(-1) = 1/det A [ d  -b]
                 [-c   a]

Conversely, assume that ad - bc = 0. We will consider separately the cases where a ≠ 0 and where a = 0. If a ≠ 0, then d = bc/a, so the matrix can be written as

A = [a  b] = [ a   b]
    [c  d]   [ka  kb]

where k = c/a. In other words, the second row of A is a multiple of the first. Referring to Example 3.23(b), we see that if A has an inverse [w x; y z], then

[ a   b][w  x] = [1  0]
[ka  kb][y  z]   [0  1]

and the corresponding system of linear equations

 aw +  by        = 1
       ax +  bz  = 0
kaw + kby        = 0
      kax + kbz  = 1

has no solution. (Why?)

If a = 0, then ad - bc = 0 implies that bc = 0, and therefore either b or c is 0. Thus, A is of the form

[0  0]    or    [0  b]
[c  d]          [0  d]

In the first case,

[0  0][w  x] = [0  0] ≠ [1  0]
[c  d][y  z]   [*  *]   [0  1]

Similarly, [0 b; 0 d] cannot have an inverse. (Verify this.) Consequently, if ad - bc = 0, then A is not invertible.
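The formula of Theorem 3.8 is a two-line program. A Python sketch using exact rational arithmetic (the function name is ours, not the book's):

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the determinant formula of Theorem 3.8."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None                      # not invertible
    k = Fraction(1, det)
    return [[k * d, -k * b], [-k * c, k * a]]

Ainv = inverse_2x2([[1, 2], [3, 4]])
print(Ainv == [[-2, 1], [Fraction(3, 2), Fraction(-1, 2)]])  # True
print(inverse_2x2([[12, -15], [4, -5]]))                     # None (det = 0)
```

The swap-and-negate pattern in the return statement is exactly the description given after the theorem: interchange the main-diagonal entries and change the signs of the other two.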
Example 3.24

Find the inverses of A = [1  2] and B = [12  -15], if they exist.
                         [3  4]         [ 4   -5]

Solution  We have det A = 1(4) - 2(3) = -2 ≠ 0, so A is invertible, with

A^(-1) = -1/2 [ 4  -2] = [ -2     1 ]
              [-3   1]   [3/2  -1/2]

(Check this.) On the other hand, det B = 12(-5) - (-15)(4) = 0, so B is not invertible.
Example 3.25

Use the inverse of the coefficient matrix to solve the linear system

 x + 2y =  3
3x + 4y = -2

Solution  The coefficient matrix is the matrix A = [1 2; 3 4], whose inverse we computed in Example 3.24. By Theorem 3.7, Ax = b has the unique solution x = A^(-1) b. Here we have b = [3; -2]; thus, the solution to the given system is

x = [ -2     1 ][ 3] = [ -8 ]
    [3/2  -1/2][-2]    [11/2]

Remark  Solving a linear system Ax = b via x = A^(-1) b would appear to be a good method. Unfortunately, except for 2 × 2 coefficient matrices and matrices with certain special forms, it is almost always faster to use Gaussian or Gauss-Jordan elimination to find the solution directly. (See Exercise 13.) Furthermore, the technique of Example 3.25 works only when the coefficient matrix is square and invertible, while elimination methods can always be applied.
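Example 3.25 can be replayed in code. A Python sketch, with the right-hand side read as b = (3, -2) and all helper names ours:

```python
from fractions import Fraction

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)        # assumed nonzero here
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 2], [3, 4]]
b = [3, -2]
x = matvec(inverse_2x2(A), b)            # x = A^(-1) b
print(x == [-8, Fraction(11, 2)])        # True
```

Substituting back, x + 2y = -8 + 11 = 3 and 3x + 4y = -24 + 22 = -2, so the computed vector does satisfy the system.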
Properties of Invertible Matrices

The following theorem records some of the most important properties of invertible matrices.

Theorem 3.9

a. If A is an invertible matrix, then A^(-1) is invertible and (A^(-1))^(-1) = A.
b. If A is an invertible matrix and c is a nonzero scalar, then cA is an invertible matrix and (cA)^(-1) = (1/c) A^(-1).
c. If A and B are invertible matrices of the same size, then AB is invertible and (AB)^(-1) = B^(-1) A^(-1).
d. If A is an invertible matrix, then A^T is invertible and (A^T)^(-1) = (A^(-1))^T.
e. If A is an invertible matrix, then A^n is invertible for all nonnegative integers n and (A^n)^(-1) = (A^(-1))^n.
Proof  We will prove properties (a), (c), and (e), leaving properties (b) and (d) to be proven in Exercises 14 and 15.

(a) To show that A^(-1) is invertible, we must argue that there is a matrix X such that A^(-1) X = I = X A^(-1). But A certainly satisfies these equations in place of X, so A^(-1) is invertible and A is an inverse of A^(-1). Since inverses are unique, this means that (A^(-1))^(-1) = A.

(c) Here we must show that there is a matrix X such that (AB)X = I = X(AB). The claim is that substituting B^(-1) A^(-1) for X works. We check that

(AB)(B^(-1) A^(-1)) = A(B B^(-1)) A^(-1) = A I A^(-1) = A A^(-1) = I

where we have used associativity to shift the parentheses. Similarly, (B^(-1) A^(-1))(AB) = I (check!), so AB is invertible and its inverse is B^(-1) A^(-1).

(e) The basic idea here is easy enough. For example, when n = 2, we have

A^2 (A^(-1))^2 = A A A^(-1) A^(-1) = A I A^(-1) = A A^(-1) = I

Similarly, (A^(-1))^2 A^2 = I. Thus, (A^(-1))^2 is the inverse of A^2. It is not difficult to see that a similar argument works for any higher integer value of n, but mathematical induction is the way to carry out the proof.

The basis step is when n = 0, in which case we are being asked to prove that A^0 is invertible and that (A^0)^(-1) = (A^(-1))^0. This is the same as showing that I is invertible and that I^(-1) = I, which is clearly true. (Why? See Exercise 16.)

Now we assume that the result is true when n = k, where k is a specific nonnegative integer. That is, the induction hypothesis is to assume that A^k is invertible and that (A^k)^(-1) = (A^(-1))^k. The induction step requires that we prove that A^(k+1) is invertible and that (A^(k+1))^(-1) = (A^(-1))^(k+1). Now we know from (c) that A^(k+1) = A^k A is invertible, since A and (by hypothesis) A^k are both invertible. Moreover,

(A^(-1))^(k+1) = (A^(-1))^k A^(-1)
              = (A^k)^(-1) A^(-1)      By the induction hypothesis
              = (A A^k)^(-1)           By property (c)
              = (A^(k+1))^(-1)

Therefore, A^n is invertible for all nonnegative integers n, and (A^n)^(-1) = (A^(-1))^n by the principle of mathematical induction.
Remarks
• While all of the properties of Theorem 3.9 are useful, (c) is the one you should highlight. It is perhaps the most important algebraic property of matrix inverses. It is also the one that is easiest to get wrong. In Exercise 17, you are asked to give a counterexample to show that, contrary to what we might like, (AB)^(-1) ≠ A^(-1) B^(-1) in general. The correct property, (AB)^(-1) = B^(-1) A^(-1), is sometimes called the socks-and-shoes rule, because, although we put our socks on before our shoes, we take them off in the reverse order.
• Property (c) generalizes to products of finitely many invertible matrices: If A1, A2, ..., An are invertible matrices of the same size, then A1 A2 ... An is invertible and

(A1 A2 ... An)^(-1) = An^(-1) ... A2^(-1) A1^(-1)

(See Exercise 18.) Thus, we can state: The inverse of a product of invertible matrices is the product of their inverses in the reverse order.
• Since, for real numbers, 1/(a + b) is not in general equal to 1/a + 1/b, we should not expect that, for square matrices, (A + B)^(-1) = A^(-1) + B^(-1) (and, indeed, this is not true in general; see Exercise 19). In fact, except for special matrices, there is no formula for (A + B)^(-1).
• Property (e) allows us to define negative integer powers of an invertible matrix: If A is an invertible matrix and n is a positive integer, then A^(-n) is defined by

A^(-n) = (A^(-1))^n = (A^n)^(-1)

With this definition, it can be shown that the rules for exponentiation, A^r A^s = A^(r+s) and (A^r)^s = A^(rs), hold for all integers r and s, provided A is invertible.

One use of the algebraic properties of matrices is to help solve equations involving matrices. The next example illustrates the process. Note that we must pay particular attention to the order of the matrices in the product.
Example 3.26
Solve the following matrix equation for X (assuming that the matrices involved are such that all of the indicated operations are defined):

A^(-1)(BX)^(-1) = (A^(-1)B^3)^2
Solution  There are many ways to proceed here. One solution is

A^(-1)(BX)^(-1) = (A^(-1)B^3)^2
⇒ ((BX)A)^(-1) = (A^(-1)B^3)^2
⇒ [((BX)A)^(-1)]^(-1) = [(A^(-1)B^3)^2]^(-1)
⇒ (BX)A = [(A^(-1)B^3)(A^(-1)B^3)]^(-1)
⇒ (BX)A = B^(-3)(A^(-1))^(-1)B^(-3)(A^(-1))^(-1)
⇒ BXA = B^(-3)AB^(-3)A
⇒ B^(-1)BXAA^(-1) = B^(-1)B^(-3)AB^(-3)AA^(-1)
⇒ IXI = B^(-4)AB^(-3)
⇒ X = B^(-4)AB^(-3)

(Can you justify each step?) Note the careful use of Theorem 3.9(c) and the expansion of (A^(-1)B^3)^2. We have also made liberal use of the associativity of matrix multiplication to simplify the placement (or elimination) of parentheses.
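One way to gain confidence in a symbolic manipulation like this is to test it numerically. The sketch below (with arbitrarily chosen, well-conditioned invertible matrices, not taken from the text) checks that X = B^(-4)AB^(-3) satisfies the original equation:

```python
import numpy as np

# Arbitrary well-conditioned invertible matrices; illustrative only.
A = np.array([[2.0, 1, 0], [1, 2, 1], [0, 1, 2]])
B = np.array([[1.0, 1, 0], [0, 1, 1], [0, 0, 1]])
inv = np.linalg.inv

Binv3 = np.linalg.matrix_power(inv(B), 3)
X = Binv3 @ inv(B) @ A @ Binv3                        # X = B^{-4} A B^{-3}

lhs = inv(A) @ inv(B @ X)                             # A^{-1} (BX)^{-1}
rhs = np.linalg.matrix_power(inv(A) @ B @ B @ B, 2)   # (A^{-1} B^3)^2
print(np.allclose(lhs, rhs))                          # True
```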
Elementary Matrices

We are going to use matrix multiplication to take a different perspective on the row reduction of matrices. In the process, you will discover many new and important insights into the nature of invertible matrices.
If

E = [ 1  0  0              A = [ 5  -1
      0  0  1     and            0   8
      0  1  0 ]                  3   7 ]

we find that

EA = [ 5  -1
       3   7
       0   8 ]
In other words, multiplying A by E (on the left) has the same effect as interchanging rows 2 and 3 of A. What is significant about E? It is simply the matrix we obtain by applying the same elementary row operation, R2 ↔ R3, to the identity matrix I3. It turns out that this always works.
Definition
An elementary matrix is any matrix that can be obtained by performing an elementary row operation on an identity matrix.
Since there are three types of elementary row operations, there are three corresponding types of elementary matrices. Here are some more elementary matrices.
Example 3.27
Let

E1 = [ 1  0  0  0       E2 = [ 0  0  0  1       E3 = [ 1   0  0  0
       0  3  0  0              0  1  0  0              0   1  0  0
       0  0  1  0              0  0  1  0              0   0  1  0
       0  0  0  1 ],           1  0  0  0 ],           0  -2  0  1 ]

Each of these matrices has been obtained from the identity matrix I4 by applying a single elementary row operation. The matrix E1 corresponds to 3R2, E2 to R1 ↔ R4, and E3 to R4 - 2R2. Observe that when we left-multiply a 4×n matrix by one of these elementary matrices, the corresponding elementary row operation is performed on the matrix. For example, if

A = [ a11  a12  a13
      a21  a22  a23
      a31  a32  a33
      a41  a42  a43 ]

then

E1A = [ a11   a12   a13         E2A = [ a41  a42  a43
        3a21  3a22  3a23                a21  a22  a23
        a31   a32   a33                 a31  a32  a33
        a41   a42   a43 ],              a11  a12  a13 ]

and

E3A = [ a11          a12          a13
        a21          a22          a23
        a31          a32          a33
        a41 - 2a21   a42 - 2a22   a43 - 2a23 ]
Example 3.27 and Exercises 24-30 should convince you that any elementary row operation on any matrix can be accomplished by left-multiplying by a suitable elementary matrix. We record this fact as a theorem, the proof of which is omitted.
Theorem 3.10

Let E be the elementary matrix obtained by performing an elementary row operation on I_m. If the same elementary row operation is performed on an m × n matrix A, the result is the same as the matrix EA.
From a computational point of view, it is not a good idea to use elementary matrices to perform elementary row operations: just do them directly. However, elementary matrices can provide some valuable insights into invertible matrices and the solution of systems of linear equations.

We have already observed that every elementary row operation can be "undone," or "reversed." This same observation applied to elementary matrices shows us that they are invertible.
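Theorem 3.10 above is easy to demonstrate in code: build E by applying a row operation to the identity, then compare EA with the same operation applied directly to A. The helper names and the sample matrix here are my own illustration, not from the text:

```python
import numpy as np

def elementary_swap(n, i, j):
    """Elementary matrix obtained from I_n by interchanging rows i and j."""
    E = np.eye(n)
    E[[i, j]] = E[[j, i]]
    return E

def elementary_add(n, i, j, k):
    """Elementary matrix obtained from I_n by the operation R_i <- R_i + k R_j."""
    E = np.eye(n)
    E[i, j] = k
    return E

A = np.array([[5.0, -1.0], [0.0, 8.0], [3.0, 7.0]])

E = elementary_swap(3, 1, 2)           # R2 <-> R3 (rows are 0-indexed here)
swapped = A.copy()
swapped[[1, 2]] = swapped[[2, 1]]      # the same operation done directly on A
print(np.array_equal(E @ A, swapped))  # True
```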
Example 3.28

Let

E1 = [ 1  0  0       E2 = [ 1  0  0       E3 = [  1  0  0
       0  0  1              0  4  0               0  1  0
       0  1  0 ],           0  0  1 ],           -2  0  1 ]

Then E1 corresponds to R2 ↔ R3, which is undone by doing R2 ↔ R3 again. Thus, E1^(-1) = E1. (Check by showing that E1² = E1E1 = I.) The matrix E2 comes from 4R2,
which is undone by performing (1/4)R2. Thus,

E2^(-1) = [ 1   0   0
            0  1/4  0
            0   0   1 ]
which can be easily checked. Finally, E3 corresponds to the elementary row operation R3 - 2R1, which can be undone by the elementary row operation R3 + 2R1. So, in this case,

E3^(-1) = [ 1  0  0
            0  1  0
            2  0  1 ]

(Again, it is easy to check this by confirming that the product of this matrix and E3, in both orders, is I.)
Notice that not only is each elementary matrix invertible, but its inverse is another elementary matrix of the same type. We record this finding as the next theorem.

Theorem 3.11
Each elementary matrix is invertible, and its inverse is an elementary matrix of the same type.
The Fundamental Theorem of Invertible Matrices

We are now in a position to prove one of the main results in this book: a set of equivalent characterizations of what it means for a matrix to be invertible. In a sense, much of linear algebra is connected to this theorem, either in the development of these characterizations or in their application. As you might expect, given this introduction, we will use this theorem a great deal. Make it your friend!

We refer to Theorem 3.12 as the first version of the Fundamental Theorem, since we will add to it in subsequent chapters. You are reminded that, when we say that a set of statements about a matrix A are equivalent, we mean that, for a given A, the statements are either all true or all false.
Theorem 3.12  The Fundamental Theorem of Invertible Matrices: Version 1

Let A be an n × n matrix. The following statements are equivalent:
a. A is invertible.
b. Ax = b has a unique solution for every b in R^n.
c. Ax = 0 has only the trivial solution.
d. The reduced row echelon form of A is I_n.
e. A is a product of elementary matrices.
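Several of these equivalences can be spot-checked numerically for a particular matrix. A minimal NumPy sketch (the matrix is an arbitrary invertible example of my own, not from the text):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 5.0]])   # invertible, since det A = -1

# (b): Ax = b has a (unique) solution for every b
b = np.array([1.0, 4.0])
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))              # True

# (c): Ax = 0 has only the trivial solution; equivalently, rank A = n
print(np.linalg.matrix_rank(A) == 2)      # True
```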
Proof
We will establish the theorem by proving the circular chain of implications

(a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (e) ⇒ (a)

(a) ⇒ (b)  We have already shown that if A is invertible, then Ax = b has the unique solution x = A^(-1)b for any b in R^n (Theorem 3.7).

(b) ⇒ (c)  Assume that Ax = b has a unique solution for any b in R^n. This implies, in particular, that Ax = 0 has a unique solution. But a homogeneous system Ax = 0 always has x = 0 as one solution. So in this case, x = 0 must be the solution.

(c) ⇒ (d)  Suppose that Ax = 0 has only the trivial solution. The corresponding system of equations is

a11x1 + a12x2 + ··· + a1nxn = 0
a21x1 + a22x2 + ··· + a2nxn = 0
  ⋮
an1x1 + an2x2 + ··· + annxn = 0

and we are assuming that its solution is

x1 = 0
x2 = 0
  ⋮
xn = 0
In other words, Gauss-Jordan elimination applied to the augmented matrix of the system gives

[A | 0] = [ a11  a12  ···  a1n | 0             [ 1  0  ···  0 | 0
            a21  a22  ···  a2n | 0      →        0  1  ···  0 | 0
             ⋮    ⋮          ⋮  | ⋮               ⋮  ⋮        ⋮ | ⋮
            an1  an2  ···  ann | 0 ]             0  0  ···  1 | 0 ]  =  [I_n | 0]

Thus, the reduced row echelon form of A is I_n.
(d) ⇒ (e)  If we assume that the reduced row echelon form of A is I_n, then A can be reduced to I_n using a finite sequence of elementary row operations. By Theorem 3.10, each one of these elementary row operations can be achieved by left-multiplying by an appropriate elementary matrix. If the appropriate sequence of elementary matrices is E1, E2, ..., Ek (in that order), then we have

Ek ··· E2E1A = I_n

According to Theorem 3.11, these elementary matrices are all invertible. Therefore, so is their product, and we have

A = (Ek ··· E2E1)^(-1)I_n = (Ek ··· E2E1)^(-1) = E1^(-1)E2^(-1) ··· Ek^(-1)

Again, each Ei^(-1) is another elementary matrix, by Theorem 3.11, so we have written A as a product of elementary matrices, as required.

(e) ⇒ (a)  If A is a product of elementary matrices, then A is invertible, since elementary matrices are invertible and products of invertible matrices are invertible.
Example 3.29

If possible, express A = [ 2  3
                           1  3 ]  as a product of elementary matrices.

Solution  We row reduce A as follows:

A = [ 2  3    R1 ↔ R2   [ 1  3    R2 - 2R1   [ 1   3    (-1/3)R2   [ 1  3    R1 - 3R2   [ 1  0
      1  3 ]    →         2  3 ]    →          0  -3 ]     →         0  1 ]    →          0  1 ]  =  I

Thus, the reduced row echelon form of A is the identity matrix, so the Fundamental Theorem assures us that A is invertible and can be written as a product of elementary matrices. We have E4E3E2E1A = I, where

E1 = [ 0  1      E2 = [  1  0      E3 = [ 1    0       E4 = [ 1  -3
       1  0 ],          -2  1 ],          0  -1/3 ],          0   1 ]

are the elementary matrices corresponding to the four elementary row operations used to reduce A to I. As in the proof of the theorem, we have

A = (E4E3E2E1)^(-1) = E1^(-1)E2^(-1)E3^(-1)E4^(-1) = [ 0  1   [ 1  0   [ 1   0   [ 1  3
                                                       1  0 ]   2  1 ]   0  -3 ]   0  1 ]

as required.
Remark
Because the sequence of elementary row operations that transforms A into I is not unique, neither is the representation of A as a product of elementary matrices. (Find a different way to express A as a product of elementary matrices.)
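A factorization of this kind can be checked numerically. The sketch below uses the elementary matrices of Example 3.29 as read here; treat the specific entries as an illustration:

```python
import numpy as np

# Elementary matrices read off from the row reduction in Example 3.29
# (entries as read here; regard the specific values as illustrative).
E1 = np.array([[0.0, 1.0], [1.0, 0.0]])          # R1 <-> R2
E2 = np.array([[1.0, 0.0], [-2.0, 1.0]])         # R2 - 2R1
E3 = np.array([[1.0, 0.0], [0.0, -1.0 / 3.0]])   # (-1/3)R2
E4 = np.array([[1.0, -3.0], [0.0, 1.0]])         # R1 - 3R2

A = np.array([[2.0, 3.0], [1.0, 3.0]])
inv = np.linalg.inv

print(np.allclose(E4 @ E3 @ E2 @ E1 @ A, np.eye(2)))          # True: E4 E3 E2 E1 A = I
print(np.allclose(inv(E1) @ inv(E2) @ inv(E3) @ inv(E4), A))  # True: A as a product of elementary matrices
```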
"Never bring a cannon on stage in Act I unless you intend to fire it by the last aC1."  Anton Chekhov
Theorem 3.13
The Fundamental Theorem is surprisingly powerful. To illustrate its power, we consider two of its consequences. The first is that, although the definition of an invertible matrix states that a matrix A is invertible if there is a matrix B such that both AB = I and BA = I are satisfied, we need only check one of these equations. Thus, we can cut our work in half!

Let A be a square matrix. If B is a square matrix such that either AB = I or BA = I, then A is invertible and B = A^(-1).
Proof  Suppose BA = I. Consider the equation Ax = 0. Left-multiplying by B, we have BAx = B0. This implies that x = Ix = 0. Thus, the system represented by Ax = 0 has the unique solution x = 0. From the equivalence of (c) and (a) in the Fundamental Theorem, we know that A is invertible. (That is, A^(-1) exists and satisfies AA^(-1) = I = A^(-1)A.) If we now right-multiply both sides of BA = I by A^(-1), we obtain

BAA^(-1) = IA^(-1)  ⇒  BI = A^(-1)  ⇒  B = A^(-1)

(The proof in the case of AB = I is left as Exercise 41.)
The next consequence of the Fundamental Theorem is the basis for an efficient method of computing the inverse of a matrix.
Theorem 3.14
Let A be a square matrix. If a sequence of elementary row operations reduces A to I, then the same sequence of elementary row operations transforms I into A^(-1).

Proof  If A is row equivalent to I, then we can achieve the reduction by left-multiplying by a sequence E1, E2, ..., Ek of elementary matrices. Therefore, we have Ek ··· E2E1A = I. Setting B = Ek ··· E2E1 gives BA = I. By Theorem 3.13, A is invertible and A^(-1) = B. Now applying the same sequence of elementary row operations to I is equivalent to left-multiplying I by Ek ··· E2E1 = B. The result is

Ek ··· E2E1 I = BI = B = A^(-1)

Thus, I is transformed into A^(-1) by the same sequence of elementary row operations.
The Gauss-Jordan Method for Computing the Inverse

We can perform row operations on A and I simultaneously by constructing a "super-augmented matrix" [A | I]. Theorem 3.14 shows that if A is row equivalent to I (which, by the Fundamental Theorem (d) ⇔ (a), means that A is invertible), then elementary row operations will yield

[A | I] → [I | A^(-1)]

If A cannot be reduced to I, then the Fundamental Theorem guarantees us that A is not invertible.

The procedure just described is simply Gauss-Jordan elimination performed on an n × 2n, instead of an n × (n + 1), augmented matrix. Another way to view this procedure is to look at the problem of finding A^(-1) as solving the matrix equation AX = I_n for an n × n matrix X. (This is sufficient, by the Fundamental Theorem, since a right inverse of A must be a two-sided inverse.) If we denote the columns of X by x1, ..., xn, then this matrix equation is equivalent to solving for the columns of X, one at a time. Since the columns of I_n are the standard unit vectors e1, ..., en, we thus have n systems of linear equations, all with coefficient matrix A:

Ax1 = e1,  ...,  Axn = en

Since the same sequence of row operations is needed to bring A to reduced row echelon form in each case, the augmented matrices for these systems, [A | e1], ..., [A | en], can be combined as

[A | e1 e2 ··· en] = [A | I_n]

We now apply row operations to try to reduce A to I_n, which, if successful, will simultaneously solve for the columns of A^(-1), transforming I_n into A^(-1). We illustrate this use of Gauss-Jordan elimination with three examples.
Example 3.30

Find the inverse of

A = [ 1  2  -1
      2  2   4
      1  3  -3 ]

if it exists.

Solution  Gauss-Jordan elimination produces

[A | I] = [ 1  2  -1 | 1  0  0
            2  2   4 | 0  1  0
            1  3  -3 | 0  0  1 ]

R2 - 2R1, R3 - R1 →
[ 1   2  -1 |  1  0  0
  0  -2   6 | -2  1  0
  0   1  -2 | -1  0  1 ]

R2 ↔ R3 →
[ 1   2  -1 |  1  0  0
  0   1  -2 | -1  0  1
  0  -2   6 | -2  1  0 ]

R3 + 2R2 →
[ 1  2  -1 |  1  0  0
  0  1  -2 | -1  0  1
  0  0   2 | -4  1  2 ]

(1/2)R3 →
[ 1  2  -1 |  1   0    0
  0  1  -2 | -1   0    1
  0  0   1 | -2  1/2   1 ]

R1 + R3, R2 + 2R3 →
[ 1  2  0 | -1  1/2  1
  0  1  0 | -5   1   3
  0  0  1 | -2  1/2  1 ]

R1 - 2R2 →
[ 1  0  0 |  9  -3/2  -5
  0  1  0 | -5    1    3
  0  0  1 | -2   1/2   1 ]

Therefore,

A^(-1) = [  9  -3/2  -5
           -5    1    3
           -2   1/2   1 ]

(You should always check that AA^(-1) = I by direct multiplication. By Theorem 3.13, we do not need to check that A^(-1)A = I too.)
Remark

Notice that we have used the variant of Gauss-Jordan elimination that first introduces all of the zeros below the leading 1s, from left to right and top to bottom, and then creates zeros above the leading 1s, from right to left and bottom to top. This approach saves on calculations, as we noted in Chapter 2, but you may find it easier, when working by hand, to create all of the zeros in each column as you go. The answer, of course, will be the same.
Example 3.31

Find the inverse of

A = [  2   1  -4
      -4  -1   6
      -2   2  -2 ]

if it exists.

Solution  We proceed as in Example 3.30, adjoining the identity matrix to A and then trying to manipulate [A | I] into [I | A^(-1)].

[A | I] = [  2   1  -4 | 1  0  0
            -4  -1   6 | 0  1  0
            -2   2  -2 | 0  0  1 ]

R2 + 2R1, R3 + R1 →
[ 2  1  -4 | 1  0  0
  0  1  -2 | 2  1  0
  0  3  -6 | 1  0  1 ]

R3 - 3R2 →
[ 2  1  -4 |  1   0  0
  0  1  -2 |  2   1  0
  0  0   0 | -5  -3  1 ]

At this point, we see that it is not possible to reduce A to I, since there is a row of zeros on the left-hand side of the augmented matrix. Consequently, A is not invertible.
As the next example illustrates, everything works the same way over Z_p, where p is prime.

Example 3.32

Find the inverse of

A = [ 2  2
      2  0 ]

if it exists, over Z_3.

Solution 1  We use the Gauss-Jordan method, remembering that all calculations are in Z_3.

[A | I] = [ 2  2 | 1  0
            2  0 | 0  1 ]

2R1 → [ 1  1 | 2  0
        2  0 | 0  1 ]

R2 + R1 → [ 1  1 | 2  0
            0  1 | 2  1 ]

R1 + 2R2 → [ 1  0 | 0  2
             0  1 | 2  1 ]

Thus, A^(-1) = [ 0  2
                 2  1 ]

and it is easy to check that, over Z_3, AA^(-1) = I.
Solution 2  Since A is a 2×2 matrix, we can also compute A^(-1) using the formula given in Theorem 3.8. The determinant of A is

det A = 2(0) - 2(2) = -4 = -1 = 2

in Z_3 (since 2 + 1 = 0). Thus, A^(-1) exists and is given by the formula in Theorem 3.8. We must be careful here, though, since the formula introduces the "fraction" 1/det A
and there are no fractions in Z_3. We must use multiplicative inverses rather than division. Instead of 1/det A = 1/2, we use 2^(-1); that is, we find the number x that satisfies the equation 2x = 1 in Z_3. It is easy to see that x = 2 is the solution we want: in Z_3, 2^(-1) = 2, since 2(2) = 1. The formula for A^(-1) now becomes

A^(-1) = 2^(-1) [  0  -2     = 2 [ 0  1     = [ 0  2
                  -2   2 ]         1  2 ]       2  1 ]

which agrees with our previous solution.
Exercises 3.3

In Exercises 1-10, find the inverse of the given matrix (if it exists) using Theorem 3.8.
[~ ~] 3. [~ : ] I.
5.
7. [ 0.5 9. [" h 10.
14. Prove Theorem 3.9( b ). 15. Prove Theorem 3.9(d).
Un  1.5
[
'/" l/ c
pronounced, and this explai ns why computer systems do not use one of these methods to solve linear systems.
v'v'22 ]
4.,]
8. [
2.4
2.54
8.128]
0.25
0.8
I/b]
1/ d ,where neither a, b, c, nor d is 0
In Exercises 11 and 12, solve the given system using the method of Example 3.25.
16. Prove that the n×n identity matrix I_n is invertible and that I_n^(-1) = I_n.
17. (a) Give a counterexample to show that (AB)^(-1) ≠ A^(-1)B^(-1) in general.
(b) Under what conditions on A and B is (AB)^(-1) = A^(-1)B^(-1)? Prove your assertion.
18. By induction, prove that if A1, A2, ..., An are invertible matrices of the same size, then the product A1A2 ··· An is invertible and (A1A2 ··· An)^(-1) = An^(-1) ··· A2^(-1)A1^(-1).
19. Give a counterexample to show that (A + B)^(-1) ≠ A^(-1) + B^(-1) in general.
[/I
11.2x + y = I 5x + 3y = 2 13. Let A =
G !J. b
12.
1 2xl + X:! = 2 Xl 
X;: =
l
(a) Find A^(-1) and use it to solve the three systems Ax = b1, Ax = b2, and Ax = b3.
(b) Solve all three systems at the same time by row reducing the augmented matrix [A | b1 b2 b3] using Gauss-Jordan elimination.
(c) Carefully count the total number of individual multiplications that you performed in (a) and in (b). You should discover that, even for this 2×2 example, one method uses fewer operations. For larger systems, the difference is even more
In Exercises 20-23, solve the given matrix equation for X. Simplify your answers as much as possible. (In the words of Albert Einstein, "Everything should be made as simple as possible, but not simpler.") Assume that all matrices are invertible.
20. XA² = A^(-1)
21. AXB = (BA)²
22. (A^(-1)X)^(-1) = A(B^(-2)A)^(-1)
23. ABXA^(-1)B^(-1) = I + A
In Exercises 24-30, let
I I I c ~
I I 2
2
I  I 2
I I
I I ,
B~
0
 I I ,  I
D ~
I I I
 I I 2
0
I ,  I
I 2  I  3  I 3 2 I  I
In each case, find an elementary matrix E that satisfies the given equation.
24. EA = B   25. EB = A   26. EA = C
27. EC = A   28. EC = D   29. ED = C
30. Is there an elementary matrix E such that EA = D? Why or why not?
47. Prove that if A and B are square matrices and AB is invertible, then both A and B are invertible.

In Exercises 48-63, use the Gauss-Jordan method to find the inverse of the given matrix (if it exists).
[~ ~] 33. [~ ~]
[~ ~] 34. [~ ~] 32.
31.
35.
37.
]
0
0
0
]
2
0
0
]
]
0
0 0
36.
,
0 0 , C'F 0
0
]
38.
2 52.
0
]
]
]
0
0 0
]
0
0
0 0
]
0
54.
, ,c *" 0 56.
39. A = [
]  I
0]
 2
I llS
40. A =
[~ ~ ]
58.
41. Prove Theorem 3.13 for the case of AB = I.
42. (a) Prove that if A is invertible and AB = O, then B = O.
(b) Give a counterexample to show that the result in part (a) may fail if A is not invertible.
43. (a) Prove that if A is invertible and BA = CA, then B = C.
(b) Give a counterexample to show that the result in part (a) may fail if A is not invertible.
44. A square matrix A is called idempotent if A² = A. (The word idempotent comes from the Latin idem, meaning "same," and potere, meaning "to have power." Thus, something that is idempotent has the "same power" when squared.)
(a) Find three idempotent 2×2 matrices.
(b) Prove that the only invertible idempotent n×n matrix is the identity matrix.
45. Show that if A is a square matrix that satisfies the equation A² - 2A + I = O, then A^(-1) = 2I - A.
59.
;]
51.
0
]
 ]
2
0
 ]
]
]
0
]
0
]
0
]
]
/, 0
prodllcts of elementary
49.
3 2
0
]
In Exercises 39 and 40, find a sequence of elementary matrices E1, E2, ..., Ek such that Ek ··· E2E1A = I. Use this sequence to write both A and A^(-1) as products of elementary matrices.
[ ~ ~]
50. [ 6 3
0 0
111
46. Prove that if a symmetric matrix is invertible, then its inverse is symmetric also.

In Exercises 31-38, find the inverse of the given elementary matrix.
0
" 0 ,
57.
d 0
0 0 20  40 0 o 0
0
]
0 0 0
0
0 0 0
3
]
]
0
0 0
]
0 0
0
]
,
;,] [: ]
 ]
2
3
]
2
2
3
 ]
a 0 0 55.
b
0 d
" 61. [~ !] over Z 5 63.
53.
[~ ~]
]
5
0
]
2
4 over l,7
3
6
]
]
0 0 2
" ]
 ]
"
]
 ]
0
]
]
]
0 2
]
0 3
60. [ 0]
62.
0
0  ]
: ] over Zl
2
]
]
]
0 2 over 1L3
0
2
]
Partitioning large square matrices can sometimes make their inverses easier to compute, particularly if the blocks have a nice form. In Exercises 64-68, verify by block multiplication that the inverse of a matrix, partitioned as shown, is as claimed. (Assume that all inverses exist as needed.)
Chapt~r
nl
3 Matrices
 (1  BC)  'B ] 1+ C(/ BC)  'B
In Exercises 69-72, partition the given matrix so that you can apply one of the formulas from Exercises 64-68, and then calculate the inverse using that formula.
69.
D 1
68.
A 8 ]' [P Q] [ C
D
=

(BO ' c t ' BO ' ] D 1 C(BD 1 C)  I BD 1
.
7 1.
Q = -PBD^(-1),  R = -D^(-1)CP,  and  S = D^(-1) + D^(-1)CPBD^(-1)
'
0
0
0 2
1
0
0
0 1 0 0 1
3 1 2
70. The matrix in Exercise 58
where P = (A - BD^(-1)C)^(-1),
R
1
0
0
1
0
0
1 0
0
1
1 0
1
1
0
1
0 72.
1
1
1
3  1 5
1
1
2
The LU Factorization

Just as it is natural (and illuminating) to factor a natural number into a product of other natural numbers (for example, 30 = 2 · 3 · 5), it is also frequently helpful to factor matrices as products of other matrices. Any representation of a matrix as a product of two or more other matrices is called a matrix factorization. For example,

[ 3  -1     [ 1  0 ] [ 3  -1
  9  -5 ] =   3  1     0  -2 ]
is a matrix factorization. Needless to say, some factorizations are more useful than others. In this section, we introduce a matrix factorization that arises in the solution of systems of linear equations by Gaussian elimination and is particularly well suited to computer implementation. In subsequent chapters we will encounter other, equally useful matrix factorizations. Indeed, the topic is a rich one, and entire books and courses have been devoted to it.

Consider a system of linear equations of the form Ax = b, where A is an n×n matrix. Our goal is to show that Gaussian elimination implicitly factors A into a product of matrices that then enable us to solve the given system (and any other system with the same coefficient matrix) easily. The following example illustrates the basic idea.
Example 3.33

Let

A = [  2   1  3
       4  -1  3
      -2   5  5 ]

Row reduction of A proceeds as follows:

A = [  2   1  3    R2 - 2R1   [ 2   1   3    R3 + 2R2   [ 2   1   3
       4  -1  3    R3 + R1      0  -3  -3      →          0  -3  -3
      -2   5  5 ]    →          0   6   8 ]               0   0   2 ]  =  U        (1)
The three elementary matrices E1, E2, E3 that accomplish this reduction of A to echelon form U are (in order):

E1 = [  1  0  0       E2 = [ 1  0  0       E3 = [ 1  0  0
       -2  1  0              0  1  0              0  1  0
        0  0  1 ],           1  0  1 ],           0  2  1 ]

Hence

E3E2E1A = U

Solving for A, we get

A = E1^(-1)E2^(-1)E3^(-1)U = [ 1  0  0    [  1  0  0    [ 1   0  0
                               2  1  0       0  1  0      0   1  0
                               0  0  1 ]    -1  0  1 ]    0  -2  1 ]  U

  = [  1   0  0
       2   1  0
      -1  -2  1 ]  U  =  LU

Thus, A can be factored as

A = LU
where U is an upper triangular matrix (see the exercises for Section 3.2), and L is unit lower triangular. That is, L has the form

The LU factorization was introduced in 1948 by the great English mathematician Alan M. Turing (1912-1954) in a paper entitled "Rounding-off Errors in Matrix Processes" (Quarterly Journal of Mechanics and Applied Mathematics, 1 (1948), pp. 287-308). During World War II, Turing was instrumental in cracking the German "Enigma" code. However, he is best known for his work in mathematical logic that laid the theoretical groundwork for the development of the digital computer and the modern field of artificial intelligence. The "Turing test" that he proposed in 1950 is still used as one of the benchmarks in addressing the question of whether a computer can be considered "intelligent."
L = [ 1  0  ···  0
      *  1  ···  0
      ⋮  ⋮   ⋱   ⋮
      *  *  ···  1 ]

with zeros above and 1s on the main diagonal.
The preceding example motivates the following definition.

Definition

Let A be a square matrix. A factorization of A as A = LU, where L is unit lower triangular and U is upper triangular, is called an LU factorization of A.
Remarks

• Observe that the matrix A in Example 3.33 had an LU factorization because no row interchanges were needed in the row reduction of A. Hence all of the elementary matrices that arose were unit lower triangular. Thus, L was guaranteed to be unit
lower triangular because inverses and products of unit lower triangular matrices are also unit lower triangular. (See Exercises 29 and 30.) If a zero had appeared in a pivot position at any step, we would have had to swap rows to get a nonzero pivot. This would have resulted in L no longer being unit lower triangular. We will comment further on this observation below. (Can you find a matrix for which row interchanges will be necessary?)

• The notion of an LU factorization can be generalized to nonsquare matrices by simply requiring U to be a matrix in row echelon form. (See Exercises 13 and 14.)

• Some books define an LU factorization of a square matrix A to be any factorization A = LU, where L is lower triangular and U is upper triangular.

The first remark above is essentially a proof of the following theorem.
Theorem 3.15
If A is a square matrix that can be reduced to row echelon form without using any row interchanges, then A has an LU factorization.
To see why the LU factorization is useful, consider a linear system Ax = b, where the coefficient matrix has an LU factorization A = LU. We can rewrite the system Ax = b as LUx = b or L(Ux) = b. If we now define y = Ux, then we can solve for x in two stages:

1. Solve Ly = b for y by forward substitution (see Exercises 25 and 26 in Section 2.1).
2. Solve Ux = y for x by back substitution.

Each of these linear systems is straightforward to solve because the coefficient matrices L and U are both triangular. The next example illustrates the method.
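The two-stage solve can be sketched as a short function (my own minimal implementation, not from the text), here applied to the L, U, and b that appear in Examples 3.33 and 3.34:

```python
import numpy as np

def lu_solve(L, U, b):
    """Solve Ax = b given A = LU: forward-substitute Ly = b, then back-substitute Ux = y."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                     # forward substitution, top to bottom
        y[i] = b[i] - L[i, :i] @ y[:i]     # L is unit lower triangular, so no division needed
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):         # back substitution, bottom to top
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# L, U, and b of Examples 3.33 and 3.34.
L = np.array([[1.0, 0, 0], [2, 1, 0], [-1, -2, 1]])
U = np.array([[2.0, 1, 3], [0, -3, -3], [0, 0, 2]])
b = np.array([1.0, -4, 9])
print(lu_solve(L, U, b))                   # solution x = [1/2, 3, -1]
```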
Example 3.34

Use an LU factorization of

A = [  2   1  3                                    [  1
       4  -1  3      to solve Ax = b, where b =      -4
      -2   5  5 ]                                     9 ]

Solution  In Example 3.33, we found that

A = [  1   0  0    [ 2   1   3
       2   1  0      0  -3  -3    =  LU
      -1  -2  1 ]    0   0   2 ]

As outlined above, to solve Ax = b (which is the same as L(Ux) = b), we first solve Ly = b for y = [y1; y2; y3]. This is just the linear system

 y1             =  1
2y1 +  y2       = -4
-y1 - 2y2 + y3  =  9

Forward substitution (that is, working from top to bottom) yields

y1 = 1,  y2 = -4 - 2y1 = -6,  y3 = 9 + y1 + 2y2 = -2

Thus y = [1; -6; -2], and we now solve Ux = y for x = [x1; x2; x3]. This linear system is

2x1 +  x2 + 3x3 =  1
      -3x2 - 3x3 = -6
             2x3 = -2

and back substitution quickly produces

x3 = -1,  -3x2 = -6 + 3x3 = -9 so that x2 = 3,  and 2x1 = 1 - x2 - 3x3 = 1 so that x1 = 1/2

Therefore, the solution to the given system Ax = b is

x = [ 1/2
       3
      -1 ]
An Easy Way to Find LU Factorizations

In Example 3.33, we computed the matrix L as a product of elementary matrices. Fortunately, L can be computed directly from the row reduction process without our needing to compute elementary matrices at all. Remember that we are assuming that A can be reduced to row echelon form without using any row interchanges. If this is the case, then the entire row reduction process can be done using only elementary row operations of the form Ri - kRj. (Why do we not need to use the remaining elementary row operation, multiplying a row by a nonzero scalar?) In the operation Ri - kRj, we will refer to the scalar k as the multiplier.

In Example 3.33, the elementary row operations that were used were, in order,

R2 - 2R1                    (multiplier = 2)
R3 + R1 = R3 - (-1)R1       (multiplier = -1)
R3 + 2R2 = R3 - (-2)R2      (multiplier = -2)

The multipliers are precisely the entries of L that are below its diagonal! Indeed,

L = [  1   0  0
       2   1  0
      -1  -2  1 ]

and ℓ21 = 2, ℓ31 = -1, and ℓ32 = -2. Notice that the elementary row operation Ri - kRj has its multiplier k placed in the (i, j) entry of L.
Example 3.35

Find an LU factorization of

A = [  3  1   3   -4
       6  4   8  -10
       3  2   5   -1
      -9  5  -2   -4 ]

Solution  Reducing A to row echelon form, we have

A   →(R2 - 2R1, R3 - R1, R4 - (-3)R1)   [ 3  1  3   -4
                                          0  2  2   -2
                                          0  1  2    3
                                          0  8  7  -16 ]

    →(R3 - (1/2)R2, R4 - 4R2)   [ 3  1   3  -4
                                  0  2   2  -2
                                  0  0   1   4
                                  0  0  -1  -8 ]

    →(R4 - (-1)R3)   [ 3  1  3  -4
                       0  2  2  -2
                       0  0  1   4
                       0  0  0  -4 ]  =  U

The first three multipliers are 2, 1, and -3, and these go into the subdiagonal entries of the first column of L. So, thus far,

L = [  1  0  0  0
       2  1  0  0
       1  *  1  0
      -3  *  *  1 ]

The next two multipliers are 1/2 and 4, so we continue to fill out L:

L = [  1   0   0  0
       2   1   0  0
       1  1/2  1  0
      -3   4   *  1 ]

The final multiplier, -1, replaces the last * in L to give

L = [  1   0    0  0
       2   1    0  0
       1  1/2   1  0
      -3   4   -1  1 ]

Thus, an LU factorization of A is

A = [  3  1   3   -4       [  1   0    0  0    [ 3  1  3  -4
       6  4   8  -10          2   1    0  0      0  2  2  -2
       3  2   5   -1    =     1  1/2   1  0      0  0  1   4
      -9  5  -2   -4 ]       -3   4   -1  1 ]    0  0  0  -4 ]  =  LU

as is easily checked.
Remarks

• In applying this method, it is important to note that the elementary row operations Ri - kRj must be performed from top to bottom within each column (using the diagonal entry as the pivot), and column by column from left to right. To illustrate what can go wrong if we do not obey these rules, consider the following row reduction:

A = [ 1  2  2    R3 - 2R2   [ 1  2  2    R2 - R1   [ 1   2   2
      1  1  1      →          1  1  1      →         0  -1  -1
      2  2  2 ]               0  0  0 ]              0   0   0 ]  =  U

This time the multipliers would be placed in L as follows: ℓ32 = 2, ℓ21 = 1. We would get

L = [ 1  0  0
      1  1  0
      0  2  1 ]
but A ≠ LU. (Check this! Find a correct LU factorization of A.)

• An alternative way to construct L is to observe that the multipliers can be obtained directly from the matrices obtained at the intermediate steps of the row reduction process. In Example 3.33, examine the pivots and the corresponding columns of the matrices that arise in the row reduction

A = [  2   1  3         A1 = [ 2   1   3         U = [ 2   1   3
       4  -1  3    →           0  -3  -3    →          0  -3  -3
      -2   5  5 ]              0   6   8 ]             0   0   2 ]

The first pivot is 2, which occurs in the first column of A. Dividing the entries of this column vector that are on or below the diagonal by the pivot produces

(1/2) [  2        [  1
         4    =      2
        -2 ]        -1 ]

The next pivot is -3, which occurs in the second column of A1. Dividing the entries of this column vector that are on or below the diagonal by the pivot, we obtain

(-1/3) [ -3    =  [  1
          6 ]       -2 ]

The final pivot (which we did not need to use) is 2, in the third column of U. Dividing the entries of this column vector that are on or below the diagonal by the pivot, we obtain

(1/2) [ 2 ]  =  [ 1 ]

If we place the resulting three column vectors side by side in a matrix, we have

[  1
   2   1
  -1  -2  1 ]

which is exactly L, once the above-diagonal entries are filled with zeros.
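The multiplier bookkeeping described above translates directly into code. This sketch (my own minimal implementation, assuming no row interchanges are needed) records each multiplier in the corresponding entry of L as the reduction proceeds:

```python
import numpy as np

def lu_no_pivot(A):
    """LU factorization without row interchanges: the multiplier k of each
    operation R_i - k R_j is stored in entry (i, j) of L, working down each
    column from left to right. Assumes no zero pivot is encountered."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    L = np.eye(n)
    for j in range(n - 1):                # column by column, left to right
        for i in range(j + 1, n):         # top to bottom within the column
            k = U[i, j] / U[j, j]         # the multiplier
            U[i] -= k * U[j]              # R_i - k R_j
            L[i, j] = k                   # multiplier goes into the (i, j) entry of L
    return L, U

A = np.array([[2.0, 1, 3], [4, -1, 3], [-2, 5, 5]])   # the matrix of Example 3.33
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))              # True
print(L)                                  # multipliers 2, -1, -2 below the diagonal
```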
In Chapter 2, we remarked that the row echelon form of a matrix is not unique. However, if an invertible matrix A has an LU factorization A = LU, then this factorization is unique.

Theorem 3.16

If A is an invertible matrix that has an LU factorization, then L and U are unique.
Proof  Suppose A = LU and A = L1U1 are two LU factorizations of A. Then LU = L1U1, where L and L1 are unit lower triangular and U and U1 are upper triangular. In fact, U and U1 are two (possibly different) row echelon forms of A. By Exercise 30, L1 is invertible. Because A is invertible, its reduced row echelon form is an identity matrix I by the Fundamental Theorem of Invertible Matrices. Hence U also row reduces to I (why?) and so U is invertible also. Therefore,

L1^(-1)(LU)U^(-1) = L1^(-1)(L1U1)U^(-1)

so

(L1^(-1)L)(UU^(-1)) = (L1^(-1)L1)(U1U^(-1))

Hence (L1^(-1)L)I = I(U1U^(-1)), so

L1^(-1)L = U1U^(-1)

But L1^(-1)L is unit lower triangular by Exercise 29, and U1U^(-1) is upper triangular (why?). It follows that L1^(-1)L = U1U^(-1) is both unit lower triangular and upper triangular. The only such matrix is the identity matrix, so L1^(-1)L = I and U1U^(-1) = I. It follows that L = L1 and U = U1, so the LU factorization of A is unique.
The P^T LU Factorization

We now explore the problem of adapting the LU factorization to handle cases where row interchanges are necessary during Gaussian elimination. Consider the matrix

A = [  1  2  -1
       3  6   2
      -1  1   4 ]

A straightforward row reduction produces

A  →  B = [ 1  2  -1
            0  0   5
            0  3   3 ]

which is not an upper triangular matrix. However, we can easily convert this into upper triangular form by swapping rows 2 and 3 of B to get

U = [ 1  2  -1
      0  3   3
      0  0   5 ]
Alternatively, we can swap rows 2 and 3 of A first. To this end, let P be the elementary matrix

P = [ 1  0  0
      0  0  1
      0  1  0 ]
corresponding to interchanging rows 2 and 3, and let E be the product of the elementary matrices that then reduce PA to U (so that E^(-1) = L is unit lower triangular). Thus EPA = U, so A = (EP)^(-1)U = P^(-1)E^(-1)U = P^(-1)LU. Now this handles only the case of a single row interchange. In general, P will be the product P = Pk ··· P2P1 of all the row interchange matrices P1, P2, ..., Pk (where P1 is performed first, and so on). Such a matrix P is called a permutation matrix. Observe that a permutation matrix arises from permuting the rows of an identity matrix in some order. For example, the following are all permutation matrices:
o o
[~ ~], ~
o
I
I 0 0
000 I 0 0
o o, I o o
0
I 0
I 0
Fortunately, the inverse o f a permutation mat ri x is casy to compu te; in fact, no calculatIOns arc need ed at all!
TMeore. 3.11
• If P isa perm utation matrix , thcn P
1
pT.
I _
Proof  We must show that P^T P = I. But the ith row of P^T is the same as the ith column of P, and these are both equal to the same standard unit vector e, because P is a permutation matrix. So

(P^T P)ii = (ith row of P^T)(ith column of P) = e^T e = e · e = 1

This shows that the diagonal entries of P^T P are all 1s. On the other hand, if j ≠ i, then the jth column of P is a different standard unit vector from e, say e'. Thus a typical off-diagonal entry of P^T P is given by

(P^T P)ij = (ith row of P^T)(jth column of P) = e^T e' = e · e' = 0

Hence P^T P is an identity matrix, as we wished to show.

Thus, in general, we can factor a square matrix A as A = P^-1 L U = P^T L U.
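Theorem 3.17 is easy to check numerically. The following sketch (assuming NumPy is available; the 4×4 matrix chosen is illustrative, not from the text) builds a permutation matrix by permuting the rows of an identity matrix and confirms that its transpose is its inverse.

```python
import numpy as np

# A permutation matrix: the rows of I4 taken in the order 2, 0, 3, 1.
P = np.eye(4)[[2, 0, 3, 1]]

# Theorem 3.17 in action: P^T P = I, so P^-1 = P^T (no computation needed).
assert np.array_equal(P.T @ P, np.eye(4))
assert np.allclose(np.linalg.inv(P), P.T)
```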
Definition  Let A be a square matrix. A factorization of A as A = P^T LU, where P is a permutation matrix, L is unit lower triangular, and U is upper triangular, is called a P^T LU factorization of A.
Example 3.36

Find a P^T LU factorization of

A = [ 0  0  6 ]
    [ 1  2  3 ]
    [ 2  1  4 ]

Solution  First we reduce A to row echelon form. Clearly, we need at least one row interchange.

A = [ 0  0  6 ]  R1 <-> R2  [ 1  2  3 ]  R3 - 2R1  [ 1   2   3 ]  R2 <-> R3  [ 1   2   3 ]
    [ 1  2  3 ]    ---->    [ 0  0  6 ]    ---->   [ 0   0   6 ]    ---->    [ 0  -3  -2 ]
    [ 2  1  4 ]             [ 2  1  4 ]            [ 0  -3  -2 ]             [ 0   0   6 ]
We have used two row interchanges (R1 <-> R2 and then R2 <-> R3), so the required permutation matrix is

P = P2 P1 = [ 1  0  0 ] [ 0  1  0 ]   [ 0  1  0 ]
            [ 0  0  1 ] [ 1  0  0 ] = [ 0  0  1 ]
            [ 0  1  0 ] [ 0  0  1 ]   [ 1  0  0 ]
We now find an LU factorization of PA.

PA = [ 0  1  0 ] [ 0  0  6 ]   [ 1  2  3 ]  R2 - 2R1  [ 1   2   3 ]
     [ 0  0  1 ] [ 1  2  3 ] = [ 2  1  4 ]    ---->   [ 0  -3  -2 ] = U
     [ 1  0  0 ] [ 2  1  4 ]   [ 0  0  6 ]            [ 0   0   6 ]

Hence l21 = 2, and so
A = P^T L U = [ 0  0  1 ] [ 1  0  0 ] [ 1   2   3 ]
              [ 1  0  0 ] [ 2  1  0 ] [ 0  -3  -2 ]
              [ 0  1  0 ] [ 0  0  1 ] [ 0   0   6 ]
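As a quick numerical check of Example 3.36 (an illustrative sketch, assuming NumPy is available), we can verify that the matrices found above really satisfy PA = LU, and hence A = P^T LU:

```python
import numpy as np

A = np.array([[0, 0, 6],
              [1, 2, 3],
              [2, 1, 4]], dtype=float)
P = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
L = np.array([[1, 0, 0],
              [2, 1, 0],
              [0, 0, 1]], dtype=float)
U = np.array([[1, 2, 3],
              [0, -3, -2],
              [0, 0, 6]], dtype=float)

assert np.allclose(P @ A, L @ U)      # PA = LU
assert np.allclose(A, P.T @ L @ U)    # so A = P^T L U
```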
The discussion above justifies the following theorem.

Theorem 3.18

Every square matrix has a P^T LU factorization.
Remark  Even for an invertible matrix, the P^T LU factorization is not unique. In Example 3.36, a single row interchange R1 <-> R3 also would have worked, leading to a different P. However, once P has been determined, L and U are unique.
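Computer systems compute such factorizations directly. Here is a minimal sketch using SciPy (assuming SciPy is available); note that `scipy.linalg.lu` performs partial pivoting, so the P, L, and U it returns may differ from a hand calculation such as Example 3.36, even though the product is the same matrix A.

```python
import numpy as np
from scipy.linalg import lu  # assumes SciPy is available

A = np.array([[0., 0., 6.],
              [1., 2., 3.],
              [2., 1., 4.]])

P, L, U = lu(A)   # SciPy's convention: A = P @ L @ U

assert np.allclose(A, P @ L @ U)
assert np.allclose(np.tril(L), L)      # L is lower triangular...
assert np.allclose(np.diag(L), 1.0)    # ...with 1s on the diagonal (unit)
assert np.allclose(np.triu(U), U)      # U is upper triangular
```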
Computational Considerations

If A is n×n, then the total number of operations (multiplications and divisions) required to solve a linear system Ax = b using an LU factorization of A is T(n) ≈ n^3/3, the same as is required for Gaussian elimination. (See the Exploration "Counting Operations" in Chapter 2.) This is hardly surprising, since the forward elimination phase produces the LU factorization in ≈ n^3/3 steps, whereas both forward and backward substitution require ≈ n^2/2 steps. Therefore, for large values of n, the n^3/3 term is dominant. From this point of view, then, Gaussian elimination and the LU factorization are equivalent. However, the LU factorization has other advantages:

• From a storage point of view, the LU factorization is very compact, because we can overwrite the entries of A with the entries of L and U as they are computed. In Example 3.33, we found that
A = [  2   1  3 ]   [  1   0  0 ] [ 2   1   3 ]
    [  4  -1  3 ] = [  2   1  0 ] [ 0  -3  -3 ]
    [ -2   5  5 ]   [ -1  -2  1 ] [ 0   0   2 ]

This can be stored as

[  2   1   3 ]
[  2  -3  -3 ]
[ -1  -2   2 ]
with the entries placed in the order (1,1), (1,2), (1,3), (2,1), (3,1), (2,2), (2,3), (3,2), (3,3). In other words, the subdiagonal entries of A are replaced by the corresponding multipliers. (Check that this works!)

• Once an LU factorization of A has been computed, it can be used to solve as many linear systems of the form Ax = b as we like. We just need to apply the method of Example 3.34, varying the vector b each time.

• For matrices with certain special forms, especially those with a large number of zeros (so-called "sparse" matrices) concentrated off the diagonal, there are methods that will simplify the computation of an LU factorization. In these cases, this method is faster than Gaussian elimination in solving Ax = b.

• For an invertible matrix A, an LU factorization of A can be used to find A^-1, if necessary. Moreover, this can be done in such a way that it simultaneously yields a factorization of A^-1. (See Exercises 15-18.)

If you have a CAS (such as MATLAB) that has the LU factorization built in, you may notice some differences between your hand calculations and the computer output.
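The "factor once, solve many times" advantage described above can be sketched as follows (assuming SciPy is available). `lu_factor` stores L and U compactly in a single array, exactly in the overwritten form discussed here, and `lu_solve` reuses that one factorization for each new right-hand side; the matrix is the one from Example 3.33.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve  # assumes SciPy is available

A = np.array([[2., 1., 3.],
              [4., -1., 3.],
              [-2., 5., 5.]])

lu, piv = lu_factor(A)   # factor once; lu holds L and U packed in one array

# Reuse the same factorization for several right-hand sides b.
for b in (np.array([1., 0., 0.]), np.array([0., 2., -1.])):
    x = lu_solve((lu, piv), b)
    assert np.allclose(A @ x, b)
```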
2  1 0
In Exercises 1-6, solve the system Ax = b using the given LU factorization of A.
: :
I] b _ [ 5] [: ~] = [ ~ ][  ~ 6' I  2] b = 2. A = :] [~ [ ~][~ , . [~ ] l. A =
3. A =
2
1
2
3  3
,
,
2 0 0 0 2 3
4. A =
 I X
2 0 0
,
1
0
0
 I
1
0
2
0 1
.,,
3
 2
1
X
 2
,  I
,
2
2
,
0
, b =
,•
s. A =
6. A =

0 1 0 0 1
,
~
5
, . b=
0
2
0
2
0 

j
,
 3 1 0 0 7 2  I 0 0  I 5 3 . b= 0 1 0 0 0 5
1 0 0 0 1 0
I 5 2 2
,
3
0
1
0
 2
5
 I
2
 2
1
3
X
0 0
0
,
1
3 3 2 9 8 9 5 3 0 1  3 3 5 2 . b=  I 0 2 0 0 1 0 0 6
1
1
,
1
0
0 0 0
1 3
1
5
0 !
2 0 0 0
X
1
1
•8 ,, , I
0
,
,
0
0 0
0
1
0
 2
1
In Exercises 7-12, find an LU factorization of the given matrix.

7.
I
3
n
8.
[!
~l
9.
1
2
4
5 6 7 9
8
11.
12.
3
10.
1
2
3
 1
2 0  1
6 6
3
0
6
7
2
 9
0
2  2
2 4
4
4
2  1
2
2
 1
4
0
4
3
4
4
1
21.
2 3
Generalize the definition of LU factorization to non-square matrices by simply requiring U to be a matrix in row echelon form. With this modification, find an LU factorization of the matrices in Exercises 13 and 14.
2
13. 0
0 3
3
1
0
0
0
5
1
14.
1
2
 2
 7
1 0
1
3
0 3 3 3
 1
8 5
6
1 2 2 0
For an invertible matrix with an LU factorization A = LU, both L and U will be invertible and A^-1 = U^-1 L^-1. In Exercises 15 and 16, find L^-1, U^-1, and A^-1 for the given matrix.

15. A in Exercise 1

16. A in Exercise 4
The inverse of a matrix can also be computed by solving several systems of equations using the method of Example 3.34. For an n×n matrix A, to find its inverse we need to solve AX = I for the n×n matrix X. Writing this equation as A[x1 x2 ··· xn] = [e1 e2 ··· en], using the matrix-column form of AX, we see that we need to solve n systems of linear equations: Ax1 = e1, Ax2 = e2, ..., Axn = en. Moreover, we can use the factorization A = LU to solve each one of these systems. In Exercises 17 and 18, use the approach just outlined to find A^-1 for the given matrix. Compare with the method of Exercises 15 and 16.

17. A in Exercise 1
1 0 0 0 1 0
0
18. A in Exercise 4
0 0
1
0 0 19.
7 5 8
6 9
In Exercises 19-22, write the given permutation matrix as a product of elementary (row interchange) matrices.
20.
1 0
0 0 1 0 0 0
0
0 1 1 0
0 1 0
0
0
0
1
0
0
0 0 0
0 1 0
0
0
0 0 1 0
0 1 0
0 0 1 0 22. 0 0 0 0 0 1
0 1
0 0 0 1 0
In Exercises 23-25, find a P^T LU factorization of the given matrix A.
23. A =
25. A =
0 1 4  1 2 1 1 3 3
0 0  1 1
24. A =
0 1
0
 1
1
 I
1
1
0 0
1
 1
0
1
1 3
2 1
1 I
2 2 1 0
3 2
1 1
26. Prove that there are exactly n! n×n permutation matrices.
In Exercises 27-28, solve the system Ax = b using the given factorization A = P^T LU. Because P P^T = I, the equation P^T LUx = b can be rewritten as LUx = Pb. This system can then be solved using the method of Example 3.34.

27. A =
1
2
3
1
1
 I
2
3
2
1
o
1
 I
1
o
0
x
28. A =
o
o
8 4 4
 1
2 =
I
0
1
o
1 0
0
0
1 0
o
1
! !
0
,,
3 5 1 2 o 3
0 I
5
o o
I
0
1
0
1
I
I
0
0
2
o
0 1 0
 I
1
4
1
2
16
x o
 1
1
 4
o
0
2
4
29. Prove that a product of unit lower triangular matrices is unit lower triangular.

30. Prove that every unit lower triangular matrix is invertible and that its inverse is also unit lower triangular.

An LDU factorization of a square matrix A is a factorization A = LDU, where L is a unit lower triangular matrix, D is a diagonal matrix, and U is a unit upper triangular matrix (upper triangular with 1s on its diagonal). In Exercises 31 and 32, find an LDU factorization of A.
31. A in Exercise 1
32. A in Exercise 4
33. If A is symmetric and invertible and has an LDU factorization, show that U = L^T.

34. If A is symmetric and invertible and A = LDL^T (with L unit lower triangular and D diagonal), prove that this factorization is unique. That is, prove that if we also have A = L1 D1 L1^T (with L1 unit lower triangular and D1 diagonal), then L = L1 and D = D1.
Subspaces, Basis, Dimension, and Rank

Figure 3.2
This section introduces perhaps the most important ideas in the entire book. We have already seen that there is an interplay between geometry and algebra: We can often use geometric intuition and reasoning to obtain algebraic results, and the power of algebra will often allow us to extend our findings well beyond the geometric settings in which they first arose.

In our study of vectors, we have already encountered all of the concepts in this section informally. Here, we will start to become more formal by giving definitions for the key ideas. As you'll see, the notion of a subspace is simply an algebraic generalization of the geometric examples of lines and planes through the origin. The fundamental concept of a basis for a subspace is then derived from the idea of direction vectors for such lines and planes. The concept of a basis will allow us to give a precise definition of dimension that agrees with an intuitive, geometric idea of the term, yet is flexible enough to allow generalization to other settings. You will also begin to see that these ideas shed more light on what you already know about matrices and the solution of systems of linear equations. In Chapter 6, we will encounter all of these fundamental ideas again, in more detail. Consider this section a "getting to know you" session.

A plane through the origin in R^3 "looks like" a copy of R^2. Intuitively, we would agree that they are both "two-dimensional." Pressed further, we might also say that any calculation that can be done with vectors in R^2 can also be done in a plane through the origin. In particular, we can add and take scalar multiples (and, more generally, form linear combinations) of vectors in such a plane, and the results are other vectors in the same plane. We say that, like R^2, a plane through the origin is closed with respect to the operations of addition and scalar multiplication. (See Figure 3.2.) But are the vectors in this plane two- or three-dimensional objects?

We might argue that they are three-dimensional because they live in R^3 and therefore have three components. On the other hand, they can be described as a linear combination of just two vectors (direction vectors for the plane) and so are two-dimensional objects living in a two-dimensional plane. The notion of a subspace is the key to resolving this conundrum.
Definition  A subspace of R^n is any collection S of vectors in R^n such that:

1. The zero vector 0 is in S.
2. If u and v are in S, then u + v is in S. (S is closed under addition.)
3. If u is in S and c is a scalar, then cu is in S. (S is closed under scalar multiplication.)

We could have combined properties (2) and (3) and required, equivalently, that S be closed under linear combinations: If u1, u2, ..., uk are in S and c1, c2, ..., ck are scalars, then c1u1 + c2u2 + ··· + ckuk is in S.
Example 3.37

Every line and plane through the origin in R^3 is a subspace of R^3. It should be clear geometrically that properties (1) through (3) are satisfied. Here is an algebraic proof in the case of a plane through the origin. You are asked to give the corresponding proof for a line in Exercise 9.

Let P be a plane through the origin with direction vectors v1 and v2. Hence, P = span(v1, v2). The zero vector 0 is in P, since 0 = 0v1 + 0v2. Now let

u = c1v1 + c2v2  and  v = d1v1 + d2v2

be two vectors in P. Then

u + v = (c1v1 + c2v2) + (d1v1 + d2v2) = (c1 + d1)v1 + (c2 + d2)v2

Thus, u + v is a linear combination of v1 and v2 and so is in P. Now let c be a scalar. Then

cu = c(c1v1 + c2v2) = (cc1)v1 + (cc2)v2

which shows that cu is also a linear combination of v1 and v2 and is therefore in P. We have shown that P satisfies properties (1) through (3) and hence is a subspace of R^3.
If you look carefully at the details of Example 3.37, you will notice that the fact that v1 and v2 were vectors in R^3 played no role at all in the verification of the properties. Thus, the algebraic method we used should generalize beyond R^3 and apply in situations where we can no longer visualize the geometry. It does! Moreover, the method of Example 3.37 can serve as a "template" in more general settings. When we generalize Example 3.37 to the span of an arbitrary set of vectors in any R^n, the result is important enough to be called a theorem.
Theorem 3.19

Let v1, v2, ..., vk be vectors in R^n. Then span(v1, v2, ..., vk) is a subspace of R^n.

Proof  Let S = span(v1, v2, ..., vk). To check property (1) of the definition, we simply observe that the zero vector 0 is in S, since 0 = 0v1 + 0v2 + ··· + 0vk. Now let

u = c1v1 + c2v2 + ··· + ckvk  and  v = d1v1 + d2v2 + ··· + dkvk

be two vectors in S. Then

u + v = (c1v1 + c2v2 + ··· + ckvk) + (d1v1 + d2v2 + ··· + dkvk)
      = (c1 + d1)v1 + (c2 + d2)v2 + ··· + (ck + dk)vk

Thus, u + v is a linear combination of v1, v2, ..., vk and so is in S. This verifies property (2). To show property (3), let c be a scalar. Then

cu = c(c1v1 + c2v2 + ··· + ckvk) = (cc1)v1 + (cc2)v2 + ··· + (cck)vk

which shows that cu is also a linear combination of v1, v2, ..., vk and is therefore in S. We have shown that S satisfies properties (1) through (3) and hence is a subspace of R^n.

We will refer to span(v1, v2, ..., vk) as the subspace spanned by v1, v2, ..., vk. We will often be able to save a lot of work by recognizing when Theorem 3.19 can be applied.
Example 3.38

Show that the set of all vectors [x, y, z]^T that satisfy the conditions x = 3y and z = -2y forms a subspace of R^3.

Solution  Substituting the two conditions into [x, y, z]^T yields

[ 3y ]       [  3 ]
[  y ]  =  y [  1 ]
[-2y ]       [ -2 ]

Since y is arbitrary, the given set of vectors is span([3, 1, -2]^T) and is thus a subspace of R^3, by Theorem 3.19.

Geometrically, the set of vectors in Example 3.38 represents the line through the origin in R^3 with direction vector [3, 1, -2]^T.
Example 3.39

Determine whether the set of all vectors [x, y, z]^T that satisfy the conditions x = 3y + 1 and z = -2y is a subspace of R^3.

Solution  This time, we have all vectors of the form [3y + 1, y, -2y]^T. The zero vector is not of this form. (Why not? Try solving [3y + 1, y, -2y]^T = [0, 0, 0]^T.) Hence, property (1) does not hold, so this set cannot be a subspace of R^3.
Example 3.40

Determine whether the set of all vectors [x, y]^T such that y = x^2 is a subspace of R^2.

Solution  These are the vectors of the form [x, x^2]^T; call this set S. This time 0 = [0, 0]^T belongs to S (take x = 0), so property (1) holds. Let u = [x1, x1^2]^T and v = [x2, x2^2]^T be in S. Then

u + v = [x1 + x2, x1^2 + x2^2]^T

which, in general, is not in S, since it does not have the correct form; that is, x1^2 + x2^2 ≠ (x1 + x2)^2. To be specific, we look for a counterexample. If

u = [1, 1]^T  and  v = [2, 4]^T

then both u and v are in S, but their sum u + v = [3, 5]^T is not in S, since 5 ≠ 3^2. Thus, property (2) fails, and S is not a subspace of R^2.
Remark  In order for a set S to be a subspace of some R^n, we must prove that properties (1) through (3) hold in general. However, for S to fail to be a subspace of R^n, it is enough to show that one of the three properties fails to hold. The easiest course is usually to find a single, specific counterexample to illustrate the failure of the property. Once you have done so, there is no need to consider the other properties.
Subspaces Associated with Matrices

A great many examples of subspaces arise in the context of matrices. We have already encountered the most important of these in Chapter 2; we now revisit them with the notion of a subspace in mind.
Definition  Let A be an m×n matrix.

1. The row space of A is the subspace row(A) of R^n spanned by the rows of A.
2. The column space of A is the subspace col(A) of R^m spanned by the columns of A.
Example 3.41

Consider the matrix

A = [ 1  -1 ]
    [ 0   1 ]
    [ 3  -3 ]

(a) Determine whether b = [1, 2, 3]^T is in the column space of A.
(b) Determine whether w = [4  5] is in the row space of A.
(c) Describe row(A) and col(A).
Solution

(a) By Theorem 2.4 and the discussion preceding it, b is a linear combination of the columns of A if and only if the linear system Ax = b is consistent. We row reduce the augmented matrix as follows:

[ 1  -1 | 1 ]        [ 1  -1 | 1 ]
[ 0   1 | 2 ]  --->  [ 0   1 | 2 ]
[ 3  -3 | 3 ]        [ 0   0 | 0 ]

Thus, the system is consistent (and, in fact, has a unique solution). Therefore, b is in col(A). (This example is just Example 2.18, phrased in the terminology of this section.)
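The consistency test in part (a) can be sketched numerically (assuming NumPy is available). For this overdetermined system we solve in the least-squares sense and check that the residual vanishes, which is exactly the statement that Ax = b is consistent and b lies in col(A).

```python
import numpy as np

A = np.array([[1., -1.],
              [0., 1.],
              [3., -3.]])
b = np.array([1., 2., 3.])

# Least-squares solve; if A @ x reproduces b exactly, the system is
# consistent, so b is a linear combination of the columns of A.
x, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(A @ x, b)      # consistent: b is in col(A)
assert np.allclose(x, [3., 2.])   # the unique solution of Ax = b
```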
(b) As we also saw in Section 2.3, elementary row operations simply create linear combinations of the rows of a matrix. That is, they produce vectors only in the row space of the matrix. If the vector w is in row(A), then w is a linear combination of the rows of A, so if we augment A by w as

[ A ]
[ w ]

it will be possible to apply elementary row operations to this augmented matrix to reduce it to the form

[ A' ]
[ 0  ]

(where A' is a row echelon form of A) using only elementary row operations of the form Ri + kRj, where i > j; in other words, working from top to bottom in each column. (Why?) In this example, we have

[ A ]   [ 1  -1 ]                  [ 1  -1 ]             [ 1  -1 ]
[ w ] = [ 0   1 ]   R3 - 3R1       [ 0   1 ]  R4 - 9R2   [ 0   1 ]
        [ 3  -3 ]   R4 - 4R1 --->  [ 0   0 ]    --->     [ 0   0 ]
        [ 4   5 ]                  [ 0   9 ]             [ 0   0 ]

Therefore, w is in row(A).

(c) For an arbitrary vector w = [x  y], the augmented matrix

[ A ]
[ w ]

can be reduced to

[ 1  0 ]
[ 0  1 ]
[ 0  0 ]
[ 0  0 ]

in a similar fashion. Therefore, every vector in R^2 is in row(A), and so row(A) = R^2. Finding col(A) is identical to solving Example 2.21, wherein we determined that it coincides with the plane (through the origin) in R^3 with equation 3x - z = 0. (We will discover other ways to answer this type of question shortly.)
Remark  We could also have answered part (b) and the first part of part (c) by observing that any question about the rows of A is the corresponding question about the columns of A^T. So, for example, w is in row(A) if and only if w^T is in col(A^T). This is true if and only if the system A^T x = w^T is consistent. We can now proceed as in part (a). (See Exercises 21-24.)

The observations we have made about the relationship between elementary row operations and the row space are summarized in the following theorem.
Theorem 3.20

Let B be any matrix that is row equivalent to a matrix A. Then row(B) = row(A).

Proof  The matrix A can be transformed into B by a sequence of row operations. Consequently, the rows of B are linear combinations of the rows of A; hence, linear combinations of the rows of B are linear combinations of the rows of A. (See Exercise 21 in Section 2.3.) It follows that row(B) ⊆ row(A). On the other hand, reversing these row operations transforms B into A. Therefore, the above argument shows that row(A) ⊆ row(B). Combining these results, we have row(A) = row(B).
There is another important subspace that we have already encountered: the set of solutions of a homogeneous system of linear equations. It is easy to prove that this subspace satisfies the three subspace properties.
Theorem 3.21

Let A be an m×n matrix and let N be the set of solutions of the homogeneous linear system Ax = 0. Then N is a subspace of R^n.

Proof  [Note that x must be a (column) vector in R^n in order for Ax to be defined and that 0 = 0m is the zero vector in R^m.] Since A 0n = 0m, 0n is in N. Now let u and v be in N. Then Au = 0 and Av = 0. It follows that

A(u + v) = Au + Av = 0 + 0 = 0
Hence, u + v is in N. Finally, for any scalar c,

A(cu) = c(Au) = c0 = 0

and therefore cu is also in N. It follows that N is a subspace of R^n.
Definition  Let A be an m×n matrix. The null space of A is the subspace of R^n consisting of solutions of the homogeneous linear system Ax = 0. It is denoted by null(A).

The fact that the null space of a matrix is a subspace allows us to prove what intuition and examples have led us to understand about the solutions of linear systems: They have either no solution, a unique solution, or infinitely many solutions.
Theorem 3.22

Let A be a matrix whose entries are real numbers. For any system of linear equations Ax = b, exactly one of the following is true:

a. There is no solution.
b. There is a unique solution.
c. There are infinitely many solutions.

At first glance, it is not entirely clear how we should proceed to prove this theorem. A little reflection should persuade you that what we are really being asked to prove is that if (a) and (b) are not true, then (c) is the only other possibility. That is, if there is more than one solution, then there cannot be just two or even finitely many; there must be infinitely many.
Proof  If the system Ax = b has either no solutions or exactly one solution, we are done. Assume, then, that there are at least two distinct solutions of Ax = b, say x1 and x2. Thus,

Ax1 = b  and  Ax2 = b

with x1 ≠ x2. It follows that

A(x1 - x2) = Ax1 - Ax2 = b - b = 0

Set x0 = x1 - x2. Then x0 ≠ 0 and Ax0 = 0. Hence, the null space of A is nontrivial, and since null(A) is closed under scalar multiplication, cx0 is in null(A) for every scalar c. Consequently, the null space of A contains infinitely many vectors (since it contains at least every vector of the form cx0, and there are infinitely many of these). Now, consider the (infinitely many) vectors of the form x1 + cx0, as c varies through the set of real numbers. We have

A(x1 + cx0) = Ax1 + cAx0 = b + c0 = b

Therefore, there are infinitely many solutions of the equation Ax = b.
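The construction in this proof can be illustrated numerically (a sketch assuming NumPy is available; the singular matrix, particular solution, and null vector below are illustrative choices, not from the text). Given one solution x1 and one nonzero null-space vector x0, every x1 + c x0 also solves the system:

```python
import numpy as np

A = np.array([[1., -1.],
              [2., -2.]])      # a singular matrix, so solutions are not unique
b = np.array([1., 2.])
x1 = np.array([1., 0.])        # one particular solution of Ax = b
x0 = np.array([1., 1.])        # a nonzero vector in null(A)

assert np.allclose(A @ x1, b)
assert np.allclose(A @ x0, 0)

# x1 + c*x0 solves Ax = b for every scalar c: infinitely many solutions.
for c in (-2.0, 0.5, 10.0):
    assert np.allclose(A @ (x1 + c * x0), b)
```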
Basis

We can extract a bit more from the intuitive idea that subspaces are generalizations of planes through the origin in R^3. A plane is spanned by any two vectors that are parallel to the plane but are not parallel to each other. In algebraic parlance, two such vectors span the plane and are linearly independent. Fewer than two vectors will not work; more than two vectors are not necessary. This is the essence of a basis for a subspace.
Definition  A basis for a subspace S of R^n is a set of vectors in S that

1. spans S and
2. is linearly independent.
Example 3.42

In Section 2.3 we saw that the standard unit vectors e1, e2, ..., en in R^n are linearly independent and span R^n. Therefore, they form a basis for R^n, called the standard basis.

Example 3.43

In Example 2.19, we showed that R^2 is spanned by the two vectors given there. Since those two vectors are also linearly independent (they are not multiples of each other), they form a basis for R^2.

A subspace can (and will) have more than one basis. For example, we have just seen that R^2 has the standard basis {e1, e2} as well as the basis of Example 3.43. However, we will prove shortly that the number of vectors in a basis for a given subspace will always be the same.
Example 3.44

Find a basis for S = span(u, v, w), where

u = [3, -1, 5]^T,  v = [2, 1, 3]^T,  and  w = [0, -5, 1]^T

Solution  The vectors u, v, and w already span S, so they will be a basis for S if they are also linearly independent. It is easy to determine that they are not; indeed, w = 2u - 3v. Therefore, we can ignore w, since any linear combinations involving u, v, and w can be rewritten to involve u and v alone. (Also see Exercise 47 in Section 2.3.) This implies that S = span(u, v, w) = span(u, v), and since u and v are certainly linearly independent (why?), they form a basis for S. (Geometrically, this means that u, v, and w all lie in the same plane, and u and v can serve as a set of direction vectors for this plane.)
Example 3.45
Find a basis for the row space of

A = [  1   1  3   1   6 ]
    [  2  -1  0   1  -1 ]
    [ -3   2  1  -2   1 ]
    [  4   1  6   1   3 ]

Solution  The reduced row echelon form of A is

R = [ 1  0  1  0  -1 ]
    [ 0  1  2  0   3 ]
    [ 0  0  0  1   4 ]
    [ 0  0  0  0   0 ]

By Theorem 3.20, row(A) = row(R), so it is enough to find a basis for the row space of R. But row(R) is clearly spanned by its nonzero rows, and it is easy to check that the staircase pattern forces the first three rows of R to be linearly independent. (This is a general fact, one that you will need to establish to prove Exercise 33.) Therefore, a basis for the row space of A is

{ [1  0  1  0  -1], [0  1  2  0  3], [0  0  0  1  4] }
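The reduction in Example 3.45 can be checked with a computer algebra system. A minimal sketch, assuming SymPy is available (its `rref` method returns the reduced row echelon form over the rationals, so no rounding occurs):

```python
from sympy import Matrix  # assumes SymPy is available

A = Matrix([[1, 1, 3, 1, 6],
            [2, -1, 0, 1, -1],
            [-3, 2, 1, -2, 1],
            [4, 1, 6, 1, 3]])

R, pivot_cols = A.rref()

# The nonzero rows of R form a basis for row(A), as in Example 3.45.
nonzero_rows = [list(R.row(i)) for i in range(R.rows) if any(R.row(i))]
assert nonzero_rows == [[1, 0, 1, 0, -1],
                        [0, 1, 2, 0, 3],
                        [0, 0, 0, 1, 4]]
```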
We can use the method of Example 3.45 to find a basis for the subspace spanned by a given set of vectors.

Example 3.46

Rework Example 3.44 using the method from Example 3.45.

Solution  We transpose u, v, and w to get row vectors and then form a matrix with these vectors as its rows:

B = [ 3  -1  5 ]
    [ 2   1  3 ]
    [ 0  -5  1 ]

Proceeding as in Example 3.45, we reduce B to its reduced row echelon form

[ 1  0   8/5 ]
[ 0  1  -1/5 ]
[ 0  0    0  ]

and use the nonzero row vectors as a basis for the row space. Since we started with column vectors, we must transpose again. Thus, a basis for span(u, v, w) is

{ [1, 0, 8/5]^T, [0, 1, -1/5]^T }
Remark  In fact, we do not need to go all the way to reduced row echelon form; row echelon form is far enough. If U is a row echelon form of A, then the nonzero row vectors of U will form a basis for row(A) (see Exercise 33). This approach has the advantage of (often) allowing us to avoid fractions. In Example 3.46, B can be reduced to

U = [ 3  -1  5 ]
    [ 0  -5  1 ]
    [ 0   0  0 ]

which gives us the basis

{ [3, -1, 5]^T, [0, -5, 1]^T }
for span(u, v, w).

Observe that the methods used in Example 3.44, Example 3.46, and the Remark above will generally produce different bases.

We now turn to the problem of finding a basis for the column space of a matrix A. One method is simply to transpose the matrix. The column vectors of A become the row vectors of A^T, and we can apply the method of Example 3.45 to find a basis for row(A^T). Transposing these vectors then gives us a basis for col(A). (You are asked to do this in Exercises 21-24.) This approach, however, requires performing a new set of row operations on A^T. Instead, we prefer to take an approach that allows us to use the row reduced form of A that we have already computed.

Recall that a product Ax of a matrix and a vector corresponds to a linear combination of the columns of A with the entries of x as coefficients. Thus, a nontrivial solution to Ax = 0 represents a dependence relation among the columns of A. Since elementary row operations do not affect the solution set, if A is row equivalent to R, the columns of A have the same dependence relationships as the columns of R. This important observation is the basis (no pun intended!) for the technique we now use to find a basis for col(A).
Example 3.47

Find a basis for the column space of the matrix from Example 3.45,

A = [  1   1  3   1   6 ]
    [  2  -1  0   1  -1 ]
    [ -3   2  1  -2   1 ]
    [  4   1  6   1   3 ]

Solution  Let ai denote a column vector of A and let ri denote a column vector of the reduced echelon form

R = [ 1  0  1  0  -1 ]
    [ 0  1  2  0   3 ]
    [ 0  0  0  1   4 ]
    [ 0  0  0  0   0 ]
We can quickly see by inspection that r3 = r1 + 2r2 and r5 = -r1 + 3r2 + 4r4. (Check that, as predicted, the corresponding column vectors of A satisfy the same dependence relations.) Thus, r3 and r5 contribute nothing to col(R). The remaining column vectors, r1, r2, and r4, are linearly independent, since they are just standard unit vectors. The corresponding statements are therefore true of the column vectors of A.

Thus, among the column vectors of A, we eliminate the dependent ones (a3 and a5), and the remaining ones will be linearly independent and hence form a basis for col(A). What is the fastest way to find this basis? Use the columns of A that correspond to the columns of R containing the leading 1s. A basis for col(A) is

{a1, a2, a4} = { [1, 2, -3, 4]^T, [1, -1, 2, 1]^T, [1, 1, -2, 1]^T }
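The pivot-column recipe of Example 3.47 can likewise be sketched with SymPy (assumed available). `rref` reports the pivot columns, and we take those columns from A itself, not from R:

```python
from sympy import Matrix  # assumes SymPy is available

A = Matrix([[1, 1, 3, 1, 6],
            [2, -1, 0, 1, -1],
            [-3, 2, 1, -2, 1],
            [4, 1, 6, 1, 3]])

_, pivot_cols = A.rref()
assert pivot_cols == (0, 1, 3)   # columns 1, 2, and 4 (SymPy counts from 0)

# The pivot columns of A itself form a basis for col(A).
basis = [list(A.col(j)) for j in pivot_cols]
assert basis == [[1, 2, -3, 4], [1, -1, 2, 1], [1, 1, -2, 1]]
```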
Warning  Elementary row operations change the column space! In our example, col(A) ≠ col(R), since every vector in col(R) has its fourth component equal to 0, but this is certainly not true of col(A). So we must go back to the original matrix A to get the column vectors for a basis of col(A). To be specific, in Example 3.47, r1, r2, and r4 do not form a basis for the column space of A.
Example 3.48

Find a basis for the null space of matrix A from Example 3.47.

Solution  There is really nothing new here except the terminology. We simply have to find and describe the solutions of the homogeneous system Ax = 0. We have already computed the reduced row echelon form R of A, so all that remains to be done in Gauss-Jordan elimination is to solve for the leading variables in terms of the free variables. The final augmented matrix is

[R | 0] = [ 1  0  1  0  -1 | 0 ]
          [ 0  1  2  0   3 | 0 ]
          [ 0  0  0  1   4 | 0 ]
          [ 0  0  0  0   0 | 0 ]

If

x = [x1, x2, x3, x4, x5]^T

then the leading 1s are in columns 1, 2, and 4, so we solve for x1, x2, and x4 in terms of the free variables x3 and x5. We get x1 = -x3 + x5, x2 = -2x3 - 3x5, and x4 = -4x5. Setting x3 = s and x5 = t, we obtain

    [ x1 ]   [  -s + t  ]     [ -1 ]     [  1 ]
    [ x2 ]   [ -2s - 3t ]     [ -2 ]     [ -3 ]
x = [ x3 ] = [     s    ] = s [  1 ] + t [  0 ] = s u + t v
    [ x4 ]   [   -4t    ]     [  0 ]     [ -4 ]
    [ x5 ]   [     t    ]     [  0 ]     [  1 ]

Thus, u and v span null(A), and since they are linearly independent, they form a basis for null(A).
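Example 3.48 can be checked directly (a sketch assuming SymPy is available; its `nullspace` method returns a basis for null(A)):

```python
from sympy import Matrix, zeros  # assumes SymPy is available

A = Matrix([[1, 1, 3, 1, 6],
            [2, -1, 0, 1, -1],
            [-3, 2, 1, -2, 1],
            [4, 1, 6, 1, 3]])

basis = A.nullspace()            # a list of basis vectors for null(A)
assert len(basis) == 2           # dim(null(A)) = 2, matching Example 3.48
for v in basis:
    assert A * v == zeros(4, 1)  # each basis vector really solves Ax = 0
```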
Following is a summary of the most effective procedure to use to find bases for the row space, the column space, and the null space of a matrix A.

1. Find the reduced row echelon form R of A.
2. Use the nonzero row vectors of R (containing the leading 1s) to form a basis for row(A).
3. Use the column vectors of A that correspond to the columns of R containing the leading 1s (the pivot columns) to form a basis for col(A).
4. Solve for the leading variables of Rx = 0 in terms of the free variables, set the free variables equal to parameters, substitute back into x, and write the result as a linear combination of f vectors (where f is the number of free variables). These f vectors form a basis for null(A).

If we do not need to find the null space, then it is faster to simply reduce A to row echelon form to find bases for the row and column spaces. Steps 2 and 3 above remain valid (with the substitution of the word "pivots" for "leading 1s").
Dimension and Rank

We have observed that although a subspace will have different bases, each basis has the same number of vectors. This fundamental fact will be of vital importance from here on in this book.

Theorem 3.23  The Basis Theorem

Let S be a subspace of R^n. Then any two bases for S have the same number of vectors.

As Sherlock Holmes noted, "When you have eliminated the impossible, whatever remains, however improbable, must be the truth" (from The Sign of Four by Sir Arthur Conan Doyle).
Proof  Let B = {u1, u2, ..., ur} and C = {v1, v2, ..., vs} be bases for S. We need to prove that r = s. We do so by showing that neither of the other two possibilities, r < s or r > s, can occur.

Suppose that r < s. We will show that this forces C to be a linearly dependent set of vectors. To this end, let

c1v1 + c2v2 + ··· + csvs = 0    (1)

Since B is a basis for S, we can write each vi as a linear combination of the elements uj:

v1 = a11u1 + a12u2 + ··· + a1rur
v2 = a21u1 + a22u2 + ··· + a2rur    (2)
...
vs = as1u1 + as2u2 + ··· + asrur

Substituting the equations (2) into equation (1) and regrouping, we have

(c1a11 + c2a21 + ··· + csas1)u1 + (c1a12 + c2a22 + ··· + csas2)u2 + ··· + (c1a1r + c2a2r + ··· + csasr)ur = 0

Now, since B is a basis, the uj's are linearly independent. So each of the expressions in parentheses must be zero:

c1a11 + c2a21 + ··· + csas1 = 0
c1a12 + c2a22 + ··· + csas2 = 0
...
c1a1r + c2a2r + ··· + csasr = 0

This is a homogeneous system of r linear equations in the s variables c1, c2, ..., cs. (The fact that the variables appear to the left of the coefficients makes no difference.) Since r < s, we know from Theorem 2.3 that there are infinitely many solutions. In particular, there is a nontrivial solution, giving a nontrivial dependence relation in equation (1). Thus, C is a linearly dependent set of vectors. But this finding contradicts the fact that C was given to be a basis, and hence linearly independent. We conclude that r < s is not possible.

Similarly (interchanging the roles of B and C), we find that r > s leads to a contradiction. Hence, we must have r = s, as desired.
Since all bases for a given subspace must have the same number of vectors, we can attach a name to this number.

Definition  If S is a subspace of R^n, then the number of vectors in a basis for S is called the dimension of S, denoted dim S.
Remark  The zero vector 0 by itself is always a subspace of R^n. (Why?) Yet any set containing the zero vector (and, in particular, {0}) is linearly dependent, so {0} cannot have a basis. We define dim {0} to be 0.

Example 3.49

Since the standard basis for R^n has n vectors, dim R^n = n. (Note that this result agrees with our intuitive understanding of dimension for n ≤ 3.)
(1IlIIple 3.50
In Examples 3.4 5 th rough 3.48, we fou nd tha t row (A) has a basis with three vectors, col(A) has a basis with three vectors, and null{A) has a basis with two vectors. Hence, dim (row( A» == 3, dim (col(A)) = 3, and dllll (null (A)) = 2.
A single example is not enough on ,"hieh to specul3te, but the fa ct tha t the row and column SpliCes in Example 3.50 have the same d imension is no accident. Nor is the faCI that the su m of dim (col(A)) and dim ( null (A)) is 5, the num ber of co lumns of A. We now prove that these relationships are lrue in general.
Chapter 3  Matrices
Theorem 3.24  The row and column spaces of a matrix A have the same dimension.

Proof  Let R be the reduced row echelon form of A. By Theorem 3.20, row(A) = row(R), so

    dim(row(A)) = dim(row(R)) = number of nonzero rows of R = number of leading 1s of R

Let this number be called r. Now col(A) ≠ col(R), but the columns of A and R have the same dependence relationships. Therefore, dim(col(A)) = dim(col(R)). Since there are r leading 1s, R has r columns that are standard unit vectors, e_1, e_2, ..., e_r. (These will be vectors in R^m if A and R are m × n matrices.) These r vectors are linearly independent, and the remaining columns of R are linear combinations of them. Thus, dim(col(R)) = r. It follows that dim(row(A)) = r = dim(col(A)), as we wished to prove.
The rank of a matrix was first defined in 1878 by Georg Frobenius (1849–1917), although he defined it using determinants and not as we have done here. (See Chapter 4.) Frobenius was a German mathematician who received his doctorate from and later taught at the University of Berlin. Best known for his contributions to group theory, Frobenius used matrices in his work on group representations.
Definition  The rank of a matrix A is the dimension of its row and column spaces and is denoted by rank(A).

For Example 3.50, we can thus write rank(A) = 3.
Remarks

• The preceding definition agrees with the more informal definition of rank that was introduced in Chapter 2. The advantage of our new definition is that it is much more flexible.

• The rank of a matrix simultaneously gives us information about linear dependence among the row vectors of the matrix and among its column vectors. In particular, it tells us the number of rows and columns that are linearly independent (and this number is the same in each case!).

Since the row vectors of A are the column vectors of A^T, Theorem 3.24 has the following immediate corollary.
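Because rank is just this common dimension, it can be spot-checked numerically. A minimal sketch using NumPy (the text itself works only by hand; the matrix below is a made-up example, not the A of Example 3.50):

```python
import numpy as np

# Hypothetical example (not the matrix A of Example 3.50).
A = np.array([[1, 2, 3],
              [2, 4, 6],   # = 2 * row 1, so the rows are dependent
              [0, 1, 1]])

# matrix_rank returns the common dimension of row(A) and col(A)
print(np.linalg.matrix_rank(A))   # -> 2
```

The same call reports the dimension whether we think of it as counting independent rows or independent columns, which is exactly the content of Theorem 3.24.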
Theorem 3.25  For any matrix A,

    rank(A^T) = rank(A)

Proof  We have

    rank(A^T) = dim(col(A^T)) = dim(row(A)) = rank(A)
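Theorem 3.25 is easy to confirm numerically. A small NumPy sketch on an arbitrary, randomly chosen matrix (purely illustrative, not from the text):

```python
import numpy as np

# Theorem 3.25 checked on an arbitrary (hypothetical) 3x5 matrix
rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(3, 5))

assert np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A)
print("rank(A^T) == rank(A)")
```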
Definition  The nullity of a matrix A is the dimension of its null space and is denoted by nullity(A).

In other words, nullity(A) is the dimension of the solution space of Ax = 0, which is the same as the number of free variables in the solution. We can now revisit the Rank Theorem (Theorem 2.2), rephrasing it in terms of our new definitions.
r"
Theore .. 3.26
"
The Rank Theorem
..
"
If A is an mX n matrix, then
rank(A) + nullity(A) =
/I
Proof Let R be the reduced row echelo n form o f A, and suppose that rank (A) = r. Then R has r leading Is, so there are rleading variables and n  r frcc variables in the solution to Ax ::: O. Since dim (null (A» = /I  r, we have rank{A)
+ nuility(A)
= r + (II  r) =
/1

Often, when we need to know the nullity of a matTIX, we do not need to know the actual solution of Ax = o. The Rank Theorem is extremely useful in such situations, as the following example ill ustrates.
Example 3.51  Find the nullity of each of the following matrices:

    M = [ 2  3 ]            N = [ 2  1  -2  -1 ]
        [ 1  5 ]    and         [ 4  4  -3   1 ]
        [ 4  7 ]                [ 2  7   1   8 ]
        [ 3  6 ]

Solution  Since the two columns of M are clearly linearly independent, rank(M) = 2. Thus, by the Rank Theorem, nullity(M) = 2 - rank(M) = 2 - 2 = 0.

There is no obvious dependence among the rows or columns of N, so we apply row operations to reduce it to

    [ 2  1  -2  -1 ]
    [ 0  2   1   3 ]
    [ 0  0   0   0 ]

We have reduced the matrix far enough (we do not need reduced row echelon form here, since we are not looking for a basis for the null space). We see that there are only two nonzero rows, so rank(N) = 2. Hence, nullity(N) = 4 - rank(N) = 4 - 2 = 2.
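The arithmetic of Example 3.51 can be replayed numerically. A NumPy sketch using the 3 × 4 matrix N of the example, with the nullity obtained from the Rank Theorem rather than from the actual solution set:

```python
import numpy as np

# The 3x4 matrix N of Example 3.51
N = np.array([[2, 1, -2, -1],
              [4, 4, -3,  1],
              [2, 7,  1,  8]])

n = N.shape[1]                         # number of columns
rank = np.linalg.matrix_rank(N)
nullity = n - rank                     # Rank Theorem: rank + nullity = n
print(rank, nullity)                   # -> 2 2
```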
The results of this section allow us to extend the Fundamental Theorem of Invertible Matrices (Theorem 3.12).
Theorem 3.27  The Fundamental Theorem of Invertible Matrices: Version 2

Let A be an n × n matrix. The following statements are equivalent:
a. A is invertible.
b. Ax = b has a unique solution for every b in R^n.
c. Ax = 0 has only the trivial solution.
d. The reduced row echelon form of A is I_n.
e. A is a product of elementary matrices.
f. rank(A) = n
g. nullity(A) = 0
h. The column vectors of A are linearly independent.
i. The column vectors of A span R^n.
j. The column vectors of A form a basis for R^n.
k. The row vectors of A are linearly independent.
l. The row vectors of A span R^n.
m. The row vectors of A form a basis for R^n.

The nullity of a matrix was defined in 1884 by James Joseph Sylvester (1814–1897), who was interested in invariants, properties of matrices that do not change under certain types of transformations. Born in England, Sylvester became the second president of the London Mathematical Society. In 1878, while teaching at Johns Hopkins University in Baltimore, he founded the American Journal of Mathematics, the first mathematical journal in the United States.

Proof  We have already established the equivalence of (a) through (e). It remains to be shown that statements (f) to (m) are equivalent to the first five statements.

(f) <=> (g)  Since rank(A) + nullity(A) = n when A is an n × n matrix, it follows from the Rank Theorem that rank(A) = n if and only if nullity(A) = 0.

(f) => (d) => (c) => (h)  If rank(A) = n, then the reduced row echelon form of A has n leading 1s and so is I_n. From (d) => (c) we know that Ax = 0 has only the trivial solution, which implies that the column vectors of A are linearly independent, since Ax is just a linear combination of the column vectors of A.

(h) => (i)  If the column vectors of A are linearly independent, then Ax = 0 has only the trivial solution. Thus, by (c) => (b), Ax = b has a unique solution for every b in R^n. This means that every vector b in R^n can be written as a linear combination of the column vectors of A, establishing (i).

(i) => (j)  If the column vectors of A span R^n, then col(A) = R^n by definition, so rank(A) = dim(col(A)) = n. This is (f), and we have already established that (f) => (h). We conclude that the column vectors of A are linearly independent and so form a basis for R^n, since, by assumption, they also span R^n.

(j) => (f)  If the column vectors of A form a basis for R^n, then, in particular, they are linearly independent. It follows that the reduced row echelon form of A contains n leading 1s, and thus rank(A) = n.

The discussion above shows that (f) => (d) => (c) => (h) => (i) => (j) => (f) <=> (g). Now recall that, by Theorem 3.25, rank(A^T) = rank(A), so what we have just proved gives us the corresponding results about the column vectors of A^T. These are then results about the row vectors of A, bringing (k), (l), and (m) into the network of equivalences and completing the proof.

Theorems such as the Fundamental Theorem are not merely of theoretical interest. They are tremendous labor-saving devices as well. The Fundamental Theorem has already allowed us to cut in half the work needed to check that two square matrices are inverses. It also simplifies the task of showing that certain sets of vectors are bases for R^n. Indeed, when we have a set of n vectors in R^n, that set will be a basis for R^n if either of the necessary properties of linear independence or spanning is true. The next example shows how easy the calculations can be.
Example 3.52  Show that the vectors

    [ 1 ]    [ -1 ]         [ 4 ]
    [ 2 ],   [  0 ],  and   [ 9 ]
    [ 3 ]    [  1 ]         [ 7 ]

form a basis for R^3.

Solution  According to the Fundamental Theorem, the vectors will form a basis for R^3 if and only if a matrix with these vectors as its columns (or rows) has rank 3. We perform just enough row operations to determine this:

    A = [ 1  -1  4 ]    →    [ 1  -1   4 ]
        [ 2   0  9 ]         [ 0   2   1 ]
        [ 3   1  7 ]         [ 0   0  -7 ]

We see that A has rank 3, so the given vectors are a basis for R^3 by the equivalence of (f) and (j).
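The rank test of Example 3.52 can also be spot-checked with NumPy (a sketch; the vectors are read off from the example):

```python
import numpy as np

# Columns are the three vectors of Example 3.52
A = np.array([[1, -1, 4],
              [2,  0, 9],
              [3,  1, 7]])

# By the Fundamental Theorem, they form a basis for R^3 iff rank(A) = 3
print(np.linalg.matrix_rank(A))        # -> 3
```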
The next theorem is an application of both the Rank Theorem and the Fundamental Theorem. We will require this result in Chapters 5 and 7.

Theorem 3.28  Let A be an m × n matrix. Then
a. rank(A^T A) = rank(A)
b. The n × n matrix A^T A is invertible if and only if rank(A) = n.
Proof  (a) Since A^T A is n × n, it has the same number of columns as A. The Rank Theorem then tells us that

    rank(A) + nullity(A) = n = rank(A^T A) + nullity(A^T A)

Hence, to show that rank(A) = rank(A^T A), it is enough to show that nullity(A) = nullity(A^T A). We will do so by establishing that the null spaces of A and A^T A are the same. To this end, let x be in null(A) so that Ax = 0. Then A^T Ax = A^T 0 = 0, and thus x is in null(A^T A). Conversely, let x be in null(A^T A). Then A^T Ax = 0, so x^T A^T Ax = x^T 0 = 0. But then

    (Ax) · (Ax) = (Ax)^T (Ax) = x^T A^T Ax = 0

and hence Ax = 0, by Theorem 1.2(d). Therefore, x is in null(A), so null(A) = null(A^T A), as required.

(b) By the Fundamental Theorem, the n × n matrix A^T A is invertible if and only if rank(A^T A) = n. But, by (a), this is so if and only if rank(A) = n.
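Theorem 3.28(a) lends itself to a quick numerical check. A NumPy sketch on a hypothetical 5 × 3 integer matrix (any matrix would do):

```python
import numpy as np

# Theorem 3.28(a) checked on a hypothetical 5x3 matrix
rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(5, 3)).astype(float)

assert np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A)
print("rank(A^T A) == rank(A)")
```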
Coordinates

We now return to one of the questions posed at the very beginning of this section: How should we view vectors in R^3 that live in a plane through the origin? Are they two-dimensional or three-dimensional? The notions of basis and dimension will help clarify things.

A plane through the origin is a two-dimensional subspace of R^3, with any set of two direction vectors serving as a basis. Basis vectors locate coordinate axes in the plane/subspace, in turn allowing us to view the plane as a "copy" of R^2. Before we illustrate this approach, we prove a theorem guaranteeing that "coordinates" that arise in this way are unique.
Theorem 3.29  Let S be a subspace of R^n and let B = {v_1, v_2, ..., v_k} be a basis for S. For every vector v in S, there is exactly one way to write v as a linear combination of the basis vectors in B:

    v = c_1 v_1 + c_2 v_2 + ... + c_k v_k

Proof  Since B is a basis, it spans S, so v can be written in at least one way as a linear combination of v_1, v_2, ..., v_k. Let one of these linear combinations be

    v = c_1 v_1 + c_2 v_2 + ... + c_k v_k

Our task is to show that this is the only way to write v as a linear combination of v_1, v_2, ..., v_k. To this end, suppose that we also have

    v = d_1 v_1 + d_2 v_2 + ... + d_k v_k

Then

    c_1 v_1 + c_2 v_2 + ... + c_k v_k = d_1 v_1 + d_2 v_2 + ... + d_k v_k

Rearranging (using properties of vector algebra), we obtain

    (c_1 - d_1) v_1 + (c_2 - d_2) v_2 + ... + (c_k - d_k) v_k = 0

Since B is a basis, v_1, v_2, ..., v_k are linearly independent. Therefore,

    c_1 - d_1 = 0,   c_2 - d_2 = 0,   ...,   c_k - d_k = 0

In other words, c_1 = d_1, c_2 = d_2, ..., c_k = d_k, and the two linear combinations are actually the same. Thus, there is exactly one way to write v as a linear combination of the basis vectors in B.
Definition  Let S be a subspace of R^n and let B = {v_1, v_2, ..., v_k} be a basis for S. Let v be a vector in S, and write v = c_1 v_1 + c_2 v_2 + ... + c_k v_k. Then c_1, c_2, ..., c_k are called the coordinates of v with respect to B, and the column vector

    [v]_B = [ c_1 ]
            [ c_2 ]
            [  ⋮  ]
            [ c_k ]

is called the coordinate vector of v with respect to B.

Example 3.53  Let E = {e_1, e_2, e_3} be the standard basis for R^3. Find the coordinate vector of

    v = [ 2 ]
        [ 7 ]
        [ 4 ]

with respect to E.
Solution  Since v = 2e_1 + 7e_2 + 4e_3,

    [v]_E = [ 2 ]
            [ 7 ]
            [ 4 ]

It should be clear that the coordinate vector of every (column) vector in R^n with respect to the standard basis is just the vector itself.
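For a non-standard basis, finding a coordinate vector amounts to solving the linear system Bc = v, where the columns of B are the basis vectors. A NumPy sketch with a hypothetical basis (not the one from the examples in the text):

```python
import numpy as np

# Hypothetical basis B = {b1, b2} for a plane S through the origin in R^3
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([-1.0, 0.0, 1.0])
B = np.column_stack([b1, b2])

w = 2 * b1 - 3 * b2                    # a vector known to lie in S
# Solving B c = w recovers the (unique) coordinate vector [w]_B
coords, *_ = np.linalg.lstsq(B, w, rcond=None)
print(np.round(coords))                # -> [ 2. -3.]
```

Theorem 3.29 is what guarantees the solution is unique when w lies in S and the columns of B are a basis.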
Example 3.54  In Example 3.44, we saw that

    u = [  3 ]      v = [ 0 ]            w = [  6 ]
        [ -1 ],         [ 1 ],   and         [ -5 ]
        [  5 ]          [ 3 ]               [  1 ]

are three vectors in the same subspace (plane through the origin) S of R^3 and that B = {u, v} is a basis for S. Since w = 2u - 3v, we have

    [w]_B = [  2 ]
            [ -3 ]

See Figure 3.3.

Figure 3.3  The coordinates of a vector with respect to a basis
I" Exercises 14, leI S be the col/cclio,! of vectors [ ;] i" RI thm satisfy tlJcgivcn properry. J" ench case, elliter prove Ihm 5 forms a slIbspace ojR! or give a colltJIcrexample to show t!tat It does /lot. J. x=O
2. ;{ 2. 0,), ?::: 0
3.y==2x
4. xy
2::
9. Prove that every line through the origin in R J is;J subspace of R'.
10. Suppose S consists of all points in R2 that are on the xaxis o r the ynxis (or both). (S IS called Ihe II niOIi of the ' .....0 axes.) Is 5 a subspace of R2? Why or ..... hy not?
0 X
In Exercises 5–8, let S be the collection of vectors [x; y; z] in R^3 that satisfy the given property. In each case, either prove that S forms a subspace of R^3 or give a counterexample to show that it does not.

5. x = y
6. z = 2x, y = 0
7. x - y + z = 1
8. |x - y| = |y - z|

In Exercises 11 and 12, determine whether b is in col(A) and whether w is in row(A), as in Example 3.41.

11. A = [ 1  0  -1 ]    b = [ 3 ]    w = [ -1  1  1 ]
        [ 1  1   1 ]        [ 2 ]

12. A = [ 1  1  -3 ]    b = [ 1 ]    w = [ 2  4  -5 ]
        [ 0  2   1 ]        [ 1 ]
        [ 1 -1  -4 ]        [ 0 ]
13. In Exercise 11, determine whether w is in row(A), using the method described in the Remark following Example 3.41.

31. Exercise 29

14. In Exercise 12, determine whether w is in row(A), using the method described in the Remark following Example 3.41.

34. Prove that if the columns of A are linearly independent, then they must form a basis for col(A).

15. If A is the matrix in Exercise 11, is v = [ -1 ]
                                             [  3 ]
                                             [  1 ]  in null(A)?
19. A 
1
0
1
0
1
 I
1
0
\
 \
 \
 4 0 2 \  2 \
2 20. A =
 \ \
2 2
3
4
4
35. Exercise 17        37. Exercise 19

39. If A is a 3 × 5 matrix, explain why the columns of A must be linearly dependent.

40. If A is a 4 × 2 matrix, explain why the rows of A must be linearly dependent.

41. If A is a 3 × 5 matrix, what are the possible values of nullity(A)?

42. If A is a 4 × 2 matrix, what are the possible values of nullity(A)?
:l
1
For Exercises 35–38, give the rank and the nullity of the matrices in the given exercises.

38. Exercise 20
In Exercises 17–20, give bases for row(A), col(A), and null(A).

17. A = [ 1  1 ]        18. A = [ 1  1 -3 ]
        [ 1 -4 ]                [ 0  2  1 ]
33. Prove that if R is a matrix in echelon form, then a basis for row(R) consists of the nonzero rows of R.

36. Exercise 18

16. If A is the matrix in Exercise 12, is v = [  7 ]
                                              [ -1 ]
                                              [  2 ]  in null(A)?

32. Exercise 30
In Exercises 43 and 44, find all possible values of rank(A) as a varies.
2
a
4.  2
2
\
In Exercises 21–24, find bases for row(A) and col(A) in the given exercises using A^T.

21. Exercise 17        22. Exercise 18        23. Exercise 19        24. Exercise 20

25. Explain carefully why your answers to Exercises 17 and 21 are both correct even though there appear to be differences.

26. Explain carefully why your answers to Exercises 18 and 22 are both correct even though there appear to be differences.
43. A =
2
a
a 44. A =
\
3 2
2 3
 \
2
 \
•
Answer Exercises 45–48 by considering the matrix with the given vectors as its columns.

45. Do
\
\
\  \
\
In Exercises 27–30, find a basis for the span of the given vectors.

form a basis for R^3?
• 0 • \
0
46. Do
0
\
\
5 •  3 ro rm a basis (or R3?
 \ •
3
\
\
vectors.  \
\
27.
 \
•
0
0 •
29. (2
3
30. (0
\
\ ). ( \
 2
28.
\
 I
 \
\
0
\
I). (3
\
2
• 2 • \ • \ 0
\
 \
0
\
0). (4
 4
 \
0), (2
\
47. Do
2
5
\)
48. Do
For Exercises 31 and 32, find bases for the spans of the vectors in the given exercises from among the vectors themselves.
\
\
0
\
\
0
\
\ • 0 • \ • \
0
\) \
\
\
\
\
0
 \
\
0 0
,
0 •  \
form a basis for R^4?
\
0 0
 \
0
 \ •
\
\
0
form a basis for R^4?
In Exercises 49 and 50, show that w is in span(B) and find the coordinate vector [w]_B.

49. B =

50. B =
I
I
2 ,
o
o
]
3
5
I ,
]
4
6
I
,w = I 3 4
In Exercises 51–54, compute the rank and nullity of the given matrices over the indicated Z_p.

51.
53.
54.
]
I
0
0
1
]
0
1 over Z 2 ]
I
3
I
2
3 0
l over Zs
I
o
o
4
52.
I 2
] I
2 2 over Zl
4
022
5
I
I
1
62. Prove that, for m × n matrices A and B, rank(A + B) ≤ rank(A) + rank(B).
63. Let A be an n × n matrix such that A² = O. Prove that rank(A) ≤ n/2. [Hint: Show that col(A) ⊆ null(A) and use the Rank Theorem.]

64. Let A be a skew-symmetric n × n matrix. (See the

55. If A is m × n, prove that every vector in null(A) is orthogonal to every vector in row(A).
56. If A and B are n × n matrices of rank n, prove that AB has rank n.

60. Prove that an m × n matrix A has rank 1 if and only if A can be written as the outer product uv^T of a vector u in R^m and a vector v in R^n.

59. (a) Prove that if U is invertible, then rank(UA) = rank(A). [Hint: A = U⁻¹(UA).]
    (b) Prove that if V is invertible, then rank(AV) = rank(A).

61. If an m × n matrix A has rank r, prove that A can be written as the sum of r matrices, each of which has rank 1. [Hint: Find a way to use Exercise 60.]
2 4 001 63 5 1 0 1
57. (a) Prove that rank(AB) ≤ rank(B). [Hint: Review Exercise 29 in Section 3.1.]
    (b) Give an example in which rank(AB) < rank(B).

58. (a) Prove that rank(AB) ≤ rank(A). [Hint: Review Exercise 30 in Section 3.1 or use transposes and Exercise 57(a).]
    (b) Give an example in which rank(AB) < rank(A).
6
2
,w =
exercises in Section 3.2.)
(a) Prove that x^T Ax = 0 for all x in R^n.
(b) Prove that I + A is invertible. [Hint: Show that null(I + A) = {0}.]
Introduction to Linear Transformations

In this section, we begin to explore one of the themes from the introduction to this chapter. There we saw that matrices can be used to transform vectors, acting as a type of "function" of the form w = T(v), where the independent variable v and the dependent variable w are vectors. We will make this notion more precise now and look at several examples of such matrix transformations, leading to the concept of a linear transformation, a powerful idea that we will encounter repeatedly from here on.

We begin by recalling some of the basic concepts associated with functions. You will be familiar with most of these ideas from other courses in which you encountered functions of the form f: R → R [such as f(x) = x²] that transform real numbers into real numbers. What is new here is that vectors are involved and we are interested only in functions that are "compatible" with the vector operations of addition and scalar multiplication.

Consider an example. Let

    A = [ 1  0 ]             v = [  1 ]
        [ 2 -1 ]    and          [ -1 ]
        [ 3  4 ]

Then

    Av = [ 1  0 ] [  1 ]   [  1 ]
         [ 2 -1 ] [ -1 ] = [  3 ]
         [ 3  4 ]          [ -1 ]

This shows that A transforms v into

    w = [  1 ]
        [  3 ]
        [ -1 ]
We can describe this transformation more generally. The matrix equation

    A [ x ]   [    x    ]
      [ y ] = [ 2x - y  ]
              [ 3x + 4y ]

defines, for each vector [x; y] in R^2, a transformed vector T_A([x; y]) in R^3. (Although technically sloppy, omitting the parentheses in definitions such as this one is a common convention that saves some writing. The description of T_A becomes

    T_A [ x ]   [    x    ]
        [ y ] = [ 2x - y  ]
                [ 3x + 4y ]

with this convention.)

With this example in mind, we now consider some terminology. A transformation (or mapping or function) T from R^n to R^m is a rule that assigns to each vector v in R^n a unique vector T(v) in R^m. The domain of T is R^n, and the codomain of T is R^m. We indicate this by writing T: R^n → R^m. For a vector v in the domain of T, the vector T(v) in the codomain is called the image of v under (the action of) T. The set of all possible images T(v) (as v varies throughout the domain of T) is called the range of T.

In our example, the domain of T_A is R^2 and its codomain is R^3, so we write T_A: R^2 → R^3. The image of v = [1; -1] is w = T_A(v) = [1; 3; -1]. What is the range of T_A?
It consists of all vectors in the codomain R^3 that are of the form

    [    x    ]     [ 1 ]     [  0 ]
    [ 2x - y  ] = x [ 2 ] + y [ -1 ]
    [ 3x + 4y ]     [ 3 ]     [  4 ]

which describes the set of all linear combinations of the column vectors

    [ 1 ]       [  0 ]
    [ 2 ]  and  [ -1 ]
    [ 3 ]       [  4 ]

of A. In other words, the range of T_A is the column space of A! (We will have more to say about this later; for now we'll simply note it as an interesting observation.) Geometrically, this shows that the range of T_A is the plane through the origin in R^3 with direction vectors given by the column vectors of A. Notice that the range of T_A is strictly smaller than the codomain of T_A.
Linear Transformations

The example T_A above is a special case of a more general type of transformation called a linear transformation. We will consider the general definition in Chapter 6, but the essence of it is that these are the transformations that "preserve" the vector operations of addition and scalar multiplication.

Definition  A transformation T: R^n → R^m is called a linear transformation if
1. T(u + v) = T(u) + T(v) for all u and v in R^n, and
2. T(cv) = cT(v) for all v in R^n and all scalars c.

Example 3.55  Consider once again the transformation T: R^2 → R^3 defined by

    T [ x ]   [    x    ]
      [ y ] = [ 2x - y  ]
              [ 3x + 4y ]

Let's check that T is a linear transformation. To verify (1), we let

    u = [ x_1 ]        v = [ x_2 ]
        [ y_1 ]  and       [ y_2 ]
Then

    T(u + v) = T [ x_1 + x_2 ]
                 [ y_1 + y_2 ]

             = [ x_1 + x_2                    ]
               [ 2(x_1 + x_2) - (y_1 + y_2)   ]
               [ 3(x_1 + x_2) + 4(y_1 + y_2)  ]

             = [ x_1 + x_2                    ]
               [ (2x_1 - y_1) + (2x_2 - y_2)  ]
               [ (3x_1 + 4y_1) + (3x_2 + 4y_2)]

             = [    x_1     ]   [    x_2     ]
               [ 2x_1 - y_1 ] + [ 2x_2 - y_2 ]
               [ 3x_1 + 4y_1]   [ 3x_2 + 4y_2]

             = T(u) + T(v)

To show (2), we let v = [x; y] and let c be a scalar. Then

    T(cv) = T [ cx ]   [      cx      ]   [     cx     ]     [    x    ]
              [ cy ] = [ 2(cx) - (cy) ] = [ c(2x - y)  ] = c [ 2x - y  ] = cT(v)
                       [ 3(cx) + 4(cy)]   [ c(3x + 4y) ]     [ 3x + 4y ]

Thus, T is a linear transformation.
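Properties (1) and (2) can also be spot-checked numerically. A minimal NumPy sketch for the T of this example, with arbitrarily chosen test vectors and scalar:

```python
import numpy as np

def T(v):
    # The transformation of Example 3.55
    x, y = v
    return np.array([x, 2 * x - y, 3 * x + 4 * y])

u = np.array([1.0, 2.0])
v = np.array([-3.0, 0.5])
c = 2.5

assert np.allclose(T(u + v), T(u) + T(v))   # property (1): additivity
assert np.allclose(T(c * v), c * T(v))      # property (2): homogeneity
print("T passes both linearity checks")
```

A numerical check on a few vectors is not a proof, of course; the algebra above is what establishes linearity for all inputs.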
Remark  The definition of a linear transformation can be streamlined by combining (1) and (2) as follows. T: R^n → R^m is a linear transformation if

    T(c_1 v_1 + c_2 v_2) = c_1 T(v_1) + c_2 T(v_2)

for all v_1, v_2 in R^n and scalars c_1, c_2. In Exercise 53, you will be asked to show that this statement is equivalent to the original definition. In practice, this equivalent formulation can save some writing; try it!

Although the linear transformation T in Example 3.55 originally arose as a matrix transformation T_A, it is a simple matter to recover the matrix A from the definition of T given in the example. We observe that

    [    x    ]     [ 1 ]     [  0 ]   [ 1   0 ] [ x ]
    [ 2x - y  ] = x [ 2 ] + y [ -1 ] = [ 2  -1 ] [ y ]
    [ 3x + 4y ]     [ 3 ]     [  4 ]   [ 3   4 ]

so T = T_A, where

    A = [ 1   0 ]
        [ 2  -1 ]
        [ 3   4 ]

(Notice that when the variables x and y are lined up, the matrix A is just their coefficient matrix.)

Recognizing that a transformation is a matrix transformation is important, since, as the next theorem shows, all matrix transformations are linear transformations.

Theorem 3.30  Let A be an m × n matrix. Then the matrix transformation T_A: R^n → R^m defined by

    T_A(v) = Av    (for v in R^n)

is a linear transformation.

Proof  Let u and v be vectors in R^n and let c be a scalar. Then

    T_A(u + v) = A(u + v) = Au + Av = T_A(u) + T_A(v)

and

    T_A(cv) = A(cv) = c(Av) = cT_A(v)

Hence, T_A is a linear transformation.

Example 3.56
Let F: R^2 → R^2 be the transformation that sends each point to its reflection in the x-axis. Show that F is a linear transformation.
Solution  From Figure 3.4, it is clear that F sends the point (x, y) to the point (x, -y). Thus, we may write

    F [ x ]   [  x ]
      [ y ] = [ -y ]

We could proceed to check that F is linear, as in Example 3.55 (this one is even easier to check!), but it is faster to observe that

    [  x ]   [ 1   0 ] [ x ]
    [ -y ] = [ 0  -1 ] [ y ]

Therefore,

    F [ x ]     [ x ]                 [ 1   0 ]
      [ y ] = A [ y ] ,  where A =    [ 0  -1 ]

so F is a matrix transformation. It now follows, by Theorem 3.30, that F is a linear transformation.

Figure 3.4  Reflection in the x-axis
Example 3.57  Let R: R^2 → R^2 be the transformation that rotates each point 90° counterclockwise about the origin. Show that R is a linear transformation.

Solution  As Figure 3.5 shows, R sends the point (x, y) to the point (-y, x). Thus, we have

    R [ x ]   [ -y ]   [ 0  -1 ] [ x ]
      [ y ] = [  x ] = [ 1   0 ] [ y ]

Hence, R is a matrix transformation and is therefore linear.

Figure 3.5  A 90° rotation
Observe that if we multiply a matrix by the standard basis vectors, we obtain the columns of the matrix. For example,

    [ a  b ] [ 1 ]   [ a ]        [ a  b ] [ 0 ]   [ b ]
    [ c  d ] [ 0 ] = [ c ]  and   [ c  d ] [ 1 ] = [ d ]
    [ e  f ]         [ e ]        [ e  f ]         [ f ]

We can use this observation to show that every linear transformation from R^n to R^m arises as a matrix transformation.
Theorem 3.31  Let T: R^n → R^m be a linear transformation. Then T is a matrix transformation. More specifically, T = T_A, where A is the m × n matrix

    A = [ T(e_1) | T(e_2) | ··· | T(e_n) ]

Proof  Let e_1, e_2, ..., e_n be the standard basis vectors in R^n and let x be a vector in R^n. We can write x = x_1 e_1 + x_2 e_2 + ... + x_n e_n (where the x_i's are the components of x). We also know that T(e_1), T(e_2), ..., T(e_n) are (column) vectors in R^m. Let A = [ T(e_1) | T(e_2) | ··· | T(e_n) ] be the m × n matrix with these vectors as its columns. Then

    T(x) = T(x_1 e_1 + x_2 e_2 + ... + x_n e_n)
         = x_1 T(e_1) + x_2 T(e_2) + ... + x_n T(e_n)
         = [ T(e_1) | T(e_2) | ··· | T(e_n) ] [ x_1 ]
                                              [ x_2 ]
                                              [  ⋮  ]
                                              [ x_n ]
         = Ax

as required.

The matrix A in Theorem 3.31 is called the standard matrix of the linear transformation T.
Example 3.58  Show that a rotation about the origin through an angle θ defines a linear transformation from R^2 to R^2 and find its standard matrix.

Solution  Let R_θ be the rotation. We will give a geometric argument to establish the fact that R_θ is linear. Let u and v be vectors in R^2. If they are not parallel, then Figure 3.6(a) shows the parallelogram rule that determines u + v. If we now apply R_θ, the entire parallelogram is rotated through the angle θ, as shown in Figure 3.6(b). But the diagonal of this parallelogram must be R_θ(u) + R_θ(v), again by the parallelogram rule. Hence, R_θ(u + v) = R_θ(u) + R_θ(v). (What happens if u and v are parallel?)

Figure 3.6
Similarly, if we apply R_θ to v and cv, we obtain R_θ(v) and R_θ(cv), as shown in Figure 3.7. But since the rotation does not affect lengths, we must then have R_θ(cv) = cR_θ(v), as required. (Draw diagrams for the cases 0 < c < 1, -1 < c < 0, and c < -1.)

Figure 3.7

Therefore, R_θ is a linear transformation. According to Theorem 3.31, we can find its matrix by determining its effect on the standard basis vectors e_1 and e_2 of R^2. Now, as Figure 3.8 shows,

    R_θ [ 1 ]   [ cos θ ]
        [ 0 ] = [ sin θ ]

We can find R_θ[0; 1] similarly, but it is faster to observe that R_θ[0; 1] must be perpendicular (counterclockwise) to R_θ[1; 0] (Figure 3.9) and so, by Example 3.57,

    R_θ [ 0 ]   [ -sin θ ]
        [ 1 ] = [  cos θ ]

Therefore, the standard matrix of R_θ is

    [ cos θ  -sin θ ]
    [ sin θ   cos θ ]

Figures 3.8 and 3.9
T he result of Example 3.58 can now be used to comp ute the effect of an y ro tation. For example, suppose we wish to rotate the point (2,  I) through 60" abo ut the origin. (The convention is that a positive angle corresponds to a coullIcrclockwise
21'
Cha pter 3
Matr ices
rotatio n, while a negative angle is clockwise.) Since cos 6()0 = we compute
Y
R6IJ
..... (2.
'] [,,,, 60" 60"][ ' ] = = [ I sin 60" ,;n cos 60"  I
 0 /2][ 2] [ 0 '/' /2 1/2  1
= [ (2
 I)
1/2and sin 6()0 = VJ/2,
+ 0)f2]
(20  1)/2
T hus, the image of the point (2,  I) under th is rotatio n is the poin t ((2 (2 V3  1)/2) '"" ( 1.87, 1.23), as shown in Figure 3.10.
Figure 3.10
+ VJ )/2,
A 60" rotation
Example 3.59  (a) Show that the transformation P: R^2 → R^2 that projects a point onto the x-axis is a linear transformation and find its standard matrix.
(b) More generally, if ℓ is a line through the origin in R^2, show that the transformation P_ℓ: R^2 → R^2 that projects a point onto ℓ is a linear transformation and find its standard matrix.

Solution  (a) As Figure 3.11 shows, P sends the point (x, y) to the point (x, 0). Thus,

    P [ x ]   [ x ]   [ 1  0 ] [ x ]
      [ y ] = [ 0 ] = [ 0  0 ] [ y ]

It follows that P is a matrix transformation (and hence a linear transformation) with standard matrix

    [ 1  0 ]
    [ 0  0 ]

Figure 3.11  A projection

(b) Let the line ℓ have direction vector d and let v be an arbitrary vector. Then P_ℓ(v) is given by proj_d(v), the projection of v onto d, which you'll recall from Section 1.2 has the formula

    proj_d(v) = ((d · v)/(d · d)) d

Thus, to show that P_ℓ is linear, we proceed as follows:

    P_ℓ(u + v) = ((d · (u + v))/(d · d)) d = ((d · u + d · v)/(d · d)) d
               = ((d · u)/(d · d)) d + ((d · v)/(d · d)) d = P_ℓ(u) + P_ℓ(v)

Similarly, P_ℓ(cv) = cP_ℓ(v) for any scalar c (Exercise 52). Hence, P_ℓ is a linear transformation.
To find the standard matrix of P_ℓ, we apply Theorem 3.31. If we let d = [d_1; d_2], then

    P_ℓ(e_1) = ((d · e_1)/(d · d)) d = (d_1/(d_1² + d_2²)) [ d_1 ]
                                                           [ d_2 ]

and

    P_ℓ(e_2) = ((d · e_2)/(d · d)) d = (d_2/(d_1² + d_2²)) [ d_1 ]
                                                           [ d_2 ]

Thus, the standard matrix of the projection is

    A = 1/(d_1² + d_2²) [ d_1²      d_1 d_2 ]
                        [ d_1 d_2   d_2²    ]

As a check, note that in part (a) we could take d = e_1 as a direction vector for the x-axis. Therefore, d_1 = 1 and d_2 = 0, and we obtain A = [1 0; 0 0], as before.
New Linear Transformations from Old

If T: R^m → R^n and S: R^n → R^p are linear transformations, then we may follow T by S to form the composition of the two transformations, denoted S ∘ T. Notice that, in order for S ∘ T to make sense, the codomain of T and the domain of S must match (in this case, they are both R^n) and the resulting composite transformation S ∘ T goes from the domain of T to the codomain of S (in this case, S ∘ T: R^m → R^p). Figure 3.12 shows schematically how this composition works. The formal definition of composition of transformations is taken directly from this figure and is the same as the corresponding definition of composition of ordinary functions:

    (S ∘ T)(v) = S(T(v))

Of course, we would like S ∘ T to be a linear transformation too, and happily we find that it is. We can demonstrate this by showing that S ∘ T satisfies the definition of a linear transformation (which we will do in Chapter 6), but, since for the time being we are assuming that linear transformations and matrix transformations are the same thing, it is enough to show that S ∘ T is a matrix transformation. That is what the next theorem establishes.

Figure 3.12  The composition of transformations
Theorem 3.32  Let T: R^m → R^n and S: R^n → R^p be linear transformations. Then S ∘ T: R^m → R^p is a linear transformation. Moreover, their standard matrices are related by

    [S ∘ T] = [S][T]

Proof  Let [S] = A and [T] = B. (Notice that A is p × n and B is n × m.) If v is a vector in R^m, then we simply compute

    (S ∘ T)(v) = S(T(v)) = S(Bv) = A(Bv) = (AB)v

(Notice here that the dimensions of A and B guarantee that the product AB makes sense.) Thus, we see that the effect of S ∘ T is to multiply vectors by AB, from which it follows immediately that S ∘ T is a matrix (hence, linear) transformation with [S ∘ T] = [S][T].

Isn't this a great result? Say it in words: "The matrix of the composite is the product of the matrices." What a lovely formula!
Example 3.60  Consider the linear transformation T: R^2 → R^3 from Example 3.55, defined by

    T [ x_1 ]   [      x_1     ]
      [ x_2 ] = [ 2x_1 - x_2   ]
                [ 3x_1 + 4x_2  ]

and the linear transformation S: R^3 → R^4 defined by

    S [ y_1 ]   [ 2y_1 + y_3        ]
      [ y_2 ] = [ 3y_2 - y_3        ]
      [ y_3 ]   [ y_1 - y_2         ]
                [ y_1 + y_2 + y_3   ]

Solution  We see that the standard matrices are

    [S] = [ 2  0  1 ]        [T] = [ 1  0 ]
          [ 0  3 -1 ]   and        [ 2 -1 ]
          [ 1 -1  0 ]              [ 3  4 ]
          [ 1  1  1 ]

so Theorem 3.32 gives

    [S ∘ T] = [S][T] = [ 2  0  1 ] [ 1  0 ]   [  5   4 ]
                       [ 0  3 -1 ] [ 2 -1 ] = [  3  -7 ]
                       [ 1 -1  0 ] [ 3  4 ]   [ -1   1 ]
                       [ 1  1  1 ]            [  6   3 ]

It follows that

    (S ∘ T) [ x_1 ]   [ 5x_1 + 4x_2  ]
            [ x_2 ] = [ 3x_1 - 7x_2  ]
                      [ -x_1 + x_2   ]
                      [ 6x_1 + 3x_2  ]
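Theorem 3.32 says this computation is just a matrix product. A NumPy sketch (the entries of [S] and [T] are as read from this copy of Example 3.60; any matrices of compatible sizes illustrate the same point):

```python
import numpy as np

T_mat = np.array([[1,  0],
                  [2, -1],
                  [3,  4]])            # [T], 3x2: T maps R^2 -> R^3
S_mat = np.array([[2,  0,  1],
                  [0,  3, -1],
                  [1, -1,  0],
                  [1,  1,  1]])        # [S], 4x3: S maps R^3 -> R^4

ST = S_mat @ T_mat                     # [S o T] = [S][T], a 4x2 matrix
print(ST)
# [[ 5  4]
#  [ 3 -7]
#  [-1  1]
#  [ 6  3]]
```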
(In Exercise 29, you will be asked to check this result by setting

    [ y_1 ]     [ x_1 ]   [      x_1     ]
    [ y_2 ] = T [ x_2 ] = [ 2x_1 - x_2   ]
    [ y_3 ]               [ 3x_1 + 4x_2  ]

and substituting these values into the definition of S, thereby calculating (S ∘ T)[x_1; x_2] directly.)

Example 3.61
Find the standard matrix of the transformation that first rotates a point 90° counterclockwise about the origin and then reflects the result in the x-axis.

Solution  The rotation R and the reflection F were discussed in Examples 3.57 and 3.56, respectively, where we found their standard matrices to be

    [R] = [ 0 -1 ]         [F] = [ 1   0 ]
          [ 1  0 ]   and         [ 0  -1 ]

It follows that the composition F ∘ R has for its matrix

    [F ∘ R] = [F][R] = [ 1   0 ] [ 0  -1 ]   [  0  -1 ]
                       [ 0  -1 ] [ 1   0 ] = [ -1   0 ]

(Check that this result is correct by considering the effect of F ∘ R on the standard basis vectors e_1 and e_2. Note the importance of the order of the transformations: R is performed before F, but we write F ∘ R. In this case, R ∘ F also makes sense. Is R ∘ F = F ∘ R?)
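The question about order can be answered directly with the matrices of Examples 3.56 and 3.57. A NumPy sketch:

```python
import numpy as np

R = np.array([[0, -1],
              [1,  0]])               # 90 degree counterclockwise rotation
F = np.array([[1,  0],
              [0, -1]])               # reflection in the x-axis

print(F @ R)                          # [F o R]: rotate first, then reflect
print(R @ F)                          # [R o F]: reflect first, then rotate
assert not np.array_equal(F @ R, R @ F)   # the order of composition matters
```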
Inverses oIl/near TranslorllaliOls Consider the effect of a 90" counterclockwise ro mlion abo ullhe origin fo llowed by a 90 0 clockwise rolation abou t the origin. C learly Ihis leaves every point in HI unchanged. If we denote these tra nsformations by R90 and R_90 ( remember that a negati ve angle m easure corresponds to clockwise directio n ), then we may express this as (R9!) <> R_\IO)( v) == v fo r every v in R2. Note thai , in this case, if we perform Ihe tra nsformations in the other order, we gel the sa me end result: (R _'K! <> R90 ) (v ) = v for every v in A2. Thus, R90 0 R ~90 (and R~ <JO <> R.,o too) is a linear transformation that leaves every vector in A Z uncha nged. Such a tra nsformation is called an identity transformation. Generally, we have one such transformation for every RW namely, I: R~ l> R~ such Ihat l t v) "" v for every v in R". (If it is important to keep track of the dimension of the space, we might write In for clarity.) So, with this notation, we have R90 <> R ~90 = I "" R_90 0 ~ A pair of transform ations that are related to each other in this way are called inverw transformations.
Definition  Let S and T be linear transformations from Rⁿ to Rⁿ. Then S and T are inverse transformations if S ∘ T = Iₙ and T ∘ S = Iₙ.
Chapter 3  Matrices
Since this definition is symmetric with respect to S and T, we will say that, when this situation occurs, S is the inverse of T and T is the inverse of S. Furthermore, we will say that S and T are invertible. In terms of matrices, we see immediately that if S and T are inverse transformations, then [S][T] = [S ∘ T] = [I] = I, where the last I is the identity matrix. (Why is the standard matrix of the identity transformation the identity matrix?) We must also have [T][S] = [T ∘ S] = [I] = I. This shows that [S] and [T] are inverse matrices. It shows something more: If a linear transformation T is invertible, then its standard matrix [T] must be invertible, and since matrix inverses are unique, this means that the inverse of T is also unique. Therefore, we can unambiguously use the notation T⁻¹ to refer to the inverse of T. Thus, we can rewrite the above equations as [T][T⁻¹] = I = [T⁻¹][T], showing that the matrix of T⁻¹ is the inverse matrix of [T]. We have just proved the following theorem.
Theorem 3.33  Let T: Rⁿ → Rⁿ be an invertible linear transformation. Then its standard matrix [T] is an invertible matrix, and

[T^{-1}] = [T]^{-1}
Remark  Say this one in words too: "The matrix of the inverse is the inverse of the matrix." Fabulous!
Example 3.62

Find the standard matrix of a 60° clockwise rotation about the origin in R².

Solution  Earlier we computed the matrix of a 60° counterclockwise rotation about the origin to be

[R_{60}] = \begin{bmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{bmatrix}

Since a 60° clockwise rotation is the inverse of a 60° counterclockwise rotation, we can apply Theorem 3.33 to obtain

[R_{-60}] = [R_{60}]^{-1} = \begin{bmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{bmatrix}^{-1} = \begin{bmatrix} 1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & 1/2 \end{bmatrix}

(Check the calculation of the matrix inverse. The fastest way is to use the 2×2 shortcut from Theorem 3.8. Also, check that the resulting matrix has the right effect on the standard basis in R² by drawing a diagram.)
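A quick numerical check of this example (a sketch, not from the text; the helper names are ours) confirms that the 2×2 shortcut inverse of the 60° rotation equals the rotation through −60°:

```python
import math

def rotation(theta):
    """Standard matrix of a rotation through theta radians (counterclockwise)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def inverse_2x2(M):
    """2x2 shortcut: inverse = (1/det) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

R60 = rotation(math.pi / 3)
inv = inverse_2x2(R60)
Rm60 = rotation(-math.pi / 3)

# The inverse of the rotation agrees entrywise with the opposite rotation.
assert all(abs(inv[i][j] - Rm60[i][j]) < 1e-12 for i in range(2) for j in range(2))
```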
Example 3.63

Determine whether projection onto the x-axis is an invertible transformation, and if it is, find its inverse.

Solution  The standard matrix of this projection P is \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, which is not invertible since its determinant is 0. Hence, P is not invertible either.
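The determinant test used here is easy to verify directly; the sketch below (our own helper, not the text's) computes the 2×2 determinant of the projection matrix:

```python
def det2(M):
    """Determinant of a 2x2 matrix ad - bc."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

P = [[1, 0], [0, 0]]   # standard matrix of projection onto the x-axis
print(det2(P))  # 0, so P is not invertible
```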
Section 3.6  Introduction to Linear Transformations
Figure 3.13 gives some idea why P in Example 3.63 is not invertible. The projection "collapses" R² onto the x-axis. For P to be invertible, we would have to have a way of "undoing" it, to recover the point (a, b) we started with. However, there are infinitely many candidates for the image of (a, 0) under such a hypothetical "inverse." Which one should we use? We cannot simply say that P⁻¹ must send (a, 0) to (a, b), since this cannot be a definition when we have no way of knowing what b should be. (See Exercise 42.)
Figure 3.13  Projections are not invertible

Associativity
Theorem 3.3(a) in Section 3.2 stated the associativity property for matrix multiplication: A(BC) = (AB)C. (If you didn't try to prove it then, do so now. Even with all matrices restricted to 2×2, you will get some feeling for the notational complexity involved in an "elementwise" proof, which should make you appreciate the proof we are about to give.) Our approach to the proof is via linear transformations. We have seen that every m×n matrix A gives rise to a linear transformation T_A: Rⁿ → Rᵐ; conversely, every linear transformation T: Rⁿ → Rᵐ has a corresponding m×n matrix [T]. The two correspondences are inversely related; that is, given A, [T_A] = A, and given T, T_{[T]} = T. Let R = T_A, S = T_B, and T = T_C. Then, by Theorem 3.32,

A(BC) = (AB)C \quad \text{if and only if} \quad R ∘ (S ∘ T) = (R ∘ S) ∘ T
We now prove the latter identity. Let x be in the domain of T (and hence in the domain of both R ∘ (S ∘ T) and (R ∘ S) ∘ T; why?). To prove that R ∘ (S ∘ T) = (R ∘ S) ∘ T, it is enough to prove that they have the same effect on x. By repeated application of the definition of composition, we have

(R ∘ (S ∘ T))(x) = R((S ∘ T)(x)) = R(S(T(x))) = (R ∘ S)(T(x)) = ((R ∘ S) ∘ T)(x)

as required. (Carefully check how the definition of composition has been used four times.) This section has served as an introduction to linear transformations. In Chapter 6, we will take another more detailed and more general look at these transformations. The exercises that follow also contain some additional explorations of this important concept.
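The associativity just proved can be illustrated on concrete matrices. This sketch (not from the text; the entries are arbitrary choices of ours) checks A(BC) = (AB)C for 2×2 matrices:

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 2]]
C = [[2, 1], [1, 3]]

# Associativity: grouping does not matter.
assert matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
```

Of course, a handful of examples is not a proof; the argument via composition of transformations above is what establishes the identity in general.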
Exercises 3.6

1. Let T_A: R² → R² be the matrix transformation corresponding to A = \begin{bmatrix} 2 & -1 \\ 3 & 4 \end{bmatrix}. Find T_A(u) and T_A(v).

2. Let T_A: R³ → R² be the matrix transformation corresponding to A = \begin{bmatrix} 4 & -2 & -1 \\ 0 & 1 & 3 \end{bmatrix}. Find T_A(u) and T_A(v), where u = \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix} and v = \begin{bmatrix} 5 \\ -1 \\ \vdots \end{bmatrix}.
In Exercises 3–6, prove that the given transformation is a linear transformation, using the definition (or the Remark following Example 3.55).

4. T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y + z \\ 2x + y - 3z \end{bmatrix}

5. T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + z \\ y + z \\ x + y \end{bmatrix}

6. T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + 2y \\ 3x - 4y \end{bmatrix}
In Exercises 7–10, give a counterexample to show that the given transformation is not a linear transformation.

7. T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} y \\ x^2 \end{bmatrix}

8. T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} |x| \\ |y| \end{bmatrix}

9. T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} xy \\ x + y \end{bmatrix}

10. T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + 1 \\ y + 1 \end{bmatrix}
In Exercises 11–14, find the standard matrix of the linear transformation in the given exercise.

11. Exercise 3    12. Exercise 4    13. Exercise 5    14. Exercise 6
In Exercises 15–18, show that the given transformation from R² to R² is linear by showing that it is a matrix transformation.

15. F reflects a vector in the y-axis.
16. R rotates a vector 45° counterclockwise about the origin.
17. D stretches a vector by a factor of 2 in the x-component and a factor of 3 in the y-component.
18. P projects a vector onto the line y = x.

19. The three types of elementary matrices give rise to five types of 2×2 matrices with one of the following forms:

\begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}

Describe geometrically the effect of the matrix transformations from R² to R² that correspond to these elementary matrices. Draw pictures to illustrate your answers.

In Exercises 20–25, find the standard matrix of the given linear transformation from R² to R².

20. Counterclockwise rotation through 120° about the origin
21. Clockwise rotation through 30° about the origin
22. Projection onto the line y = 2x
23. Projection onto the line y = −x
24. Reflection in the line y = x
25. Reflection in the line y = −x

26. Let ℓ be a line through the origin in R², P_ℓ the linear transformation that projects a vector onto ℓ, and F_ℓ the transformation that reflects a vector in ℓ.

(a) Draw diagrams to show that F_ℓ is linear.

(b) Figure 3.14 suggests a way to find the matrix of F_ℓ, using the fact that the diagonals of a parallelogram bisect each other. Prove that F_ℓ(x) = 2P_ℓ(x) − x, and use this result to show that the standard matrix of F_ℓ is

\frac{1}{d_1^2 + d_2^2} \begin{bmatrix} d_1^2 - d_2^2 & 2d_1 d_2 \\ 2d_1 d_2 & d_2^2 - d_1^2 \end{bmatrix}

(where the direction vector of ℓ is d = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}).

(c) If the angle between ℓ and the positive x-axis is θ, show that the matrix of F_ℓ is

\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}

Figure 3.14

In Exercises 27 and 28, apply part (b) or (c) of Exercise 26 to find the standard matrix of the transformation.

27. Reflection in the line y = 2x
28. Reflection in the line y = √3 x

29. Check the formula for S ∘ T in Example 3.60, by performing the suggested direct substitution.
In Exercises 30–35, verify Theorem 3.32 by finding the matrix of S ∘ T (a) by direct substitution and (b) by matrix multiplication of [S][T].

30. T\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_1 + x_2 \end{bmatrix}, \quad S\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}

31. T\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 - x_2 \\ x_1 + x_2 \end{bmatrix}, \quad S\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} y_1 + 3y_2 \\ 2y_1 + y_2 \\ y_1 - y_2 \end{bmatrix}

33. T\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 \\ -2x_1 + x_2 \end{bmatrix}, \quad S\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} y_1 - y_2 \\ y_1 + y_2 \end{bmatrix}

34. T\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 \\ x_1 - x_2 \end{bmatrix}, \quad S\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} y_1 + y_2 \\ -y_1 + y_2 \\ y_1 - y_2 \end{bmatrix}

In Exercises 36–39, find the standard matrix of the composite transformation from R² to R².

36. Counterclockwise rotation through 60°, followed by reflection in the line y = x
37. Reflection in the y-axis, followed by clockwise rotation through 30°
38. Clockwise rotation through 45°, followed by projection onto the y-axis, followed by clockwise rotation through 45°
39. Reflection in the line y = x, followed by counterclockwise rotation through 30°, followed by reflection in the line y = −x

In Exercises 40–43, use matrices to prove the given statements about transformations from R² to R².

40. If R_θ denotes a rotation (about the origin) through the angle θ, then R_α ∘ R_β = R_{α+β}.

41. If θ is the angle between lines ℓ and m (through the origin), then F_m ∘ F_ℓ = R_{2θ}. (See Exercise 26.)

42. (a) If P is a projection, then P ∘ P = P.
    (b) The matrix of a projection can never be invertible.

43. If ℓ, m, and n are three lines through the origin, then F_n ∘ F_m ∘ F_ℓ is also a reflection in a line through the origin.

44. Let T be a linear transformation from R² to R² (or from R³ to R³). Prove that T maps a straight line to a straight line or a point. (Hint: Use the vector form of the equation of a line.)

45. Let T be a linear transformation from R² to R² (or from R³ to R³). Prove that T maps parallel lines to parallel lines, a single line, a pair of points, or a single point.

In Exercises 46–51, let ABCD be the square with vertices (1, 1), (−1, 1), (−1, −1), and (1, −1). Use the results in Exercises 44 and 45 to find and draw the image of ABCD under the given transformation.

46. T in Exercise 3
47. D in Exercise 17
48. P in Exercise 18
49. The projection in Exercise 22
50. T in Exercise 31
51. The transformation in Exercise 37

52. Prove that P_ℓ(cv) = cP_ℓ(v) for any scalar c [Example 3.59(b)].

53. Prove that T: Rⁿ → Rᵐ is a linear transformation if and only if

T(c_1 v_1 + c_2 v_2) = c_1 T(v_1) + c_2 T(v_2)

for all v₁, v₂ in Rⁿ and scalars c₁, c₂.

54. Prove that (as noted at the beginning of this section) the range of a linear transformation T: Rⁿ → Rᵐ is the column space of its matrix [T].

55. If A is an invertible 2×2 matrix, what does the Fundamental Theorem of Invertible Matrices assert about the corresponding linear transformation T_A in light of Exercise 19?
Robotics

In 1981, the U.S. Space Shuttle Columbia blasted off equipped with a device called the Shuttle Remote Manipulator System (SRMS). This robotic arm, known as Canadarm, has proved to be a vital tool in all subsequent space shuttle missions, providing strong, yet precise and delicate handling of its payloads (see Figure 3.15).
Figure 3.15  Canadarm

Canadarm has been used to place satellites into their proper orbit and to retrieve malfunctioning ones for repair, and it has also performed critical repairs to the shuttle itself. Notably, the robotic arm was instrumental in the successful repair of the Hubble Space Telescope. Since 1998, Canadarm has played an important role in the assembly of the International Space Station.

A robotic arm consists of a series of links of fixed length connected at joints where they can rotate. Each link can therefore rotate in space, or (through the effect of the other links) be translated parallel to itself, or move by a combination (composition) of rotations and translations. Before we can design a mathematical model for a robotic arm, we need to understand how rotations and translations work in composition. To simplify matters, we will assume that our arm is in R². In Section 3.6, we saw that the matrix of a rotation R about the origin through an angle θ is

\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}

(Figure 3.16(a)). If v = \begin{bmatrix} a \\ b \end{bmatrix}, then a translation along v is the transformation

T(x) = x + v

or, equivalently,

T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + a \\ y + b \end{bmatrix}

(Figure 3.16(b)).
Figure 3.16  (a) Rotation  (b) Translation
Unfortunately, translation is not a linear transformation, because T(0) = v ≠ 0. However, there is a trick that will get us around this problem. We can represent the vector x = \begin{bmatrix} x \\ y \end{bmatrix} as the vector \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} in R³. This is called representing x in homogeneous coordinates. Then the matrix multiplication

\begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + a \\ y + b \\ 1 \end{bmatrix}
represents the translated vector T(x) in homogeneous coordinates. We can treat rotations in homogeneous coordinates too. The matrix multiplication

\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \\ 1 \end{bmatrix}
represents the rotated vector R(x) in homogeneous coordinates. The composition T ∘ R that gives the rotation R followed by the translation T is now represented by the product

\begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & a \\ \sin\theta & \cos\theta & b \\ 0 & 0 & 1 \end{bmatrix}
(Note that R ∘ T ≠ T ∘ R.) To model a robotic arm, we give each link its own coordinate system (called a frame) and examine how one link moves in relation to those to which it is directly connected. To be specific, we let the coordinate axes for the link Aᵢ be xᵢ and yᵢ, with the xᵢ-axis aligned with the link. The length of Aᵢ is denoted by aᵢ, and the angle between xᵢ and xᵢ₋₁ is denoted by θᵢ. The joint between Aᵢ and Aᵢ₋₁ is at the point (0, 0) relative to Aᵢ and (aᵢ₋₁, 0) relative to Aᵢ₋₁. Hence, relative to Aᵢ₋₁, the coordinate system for Aᵢ has been rotated through θᵢ and then translated along \begin{bmatrix} a_{i-1} \\ 0 \end{bmatrix} (Figure 3.17). This transformation is represented in homogeneous coordinates by the matrix

T_i = \begin{bmatrix} \cos\theta_i & -\sin\theta_i & a_{i-1} \\ \sin\theta_i & \cos\theta_i & 0 \\ 0 & 0 & 1 \end{bmatrix}
Figure 3.17
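The homogeneous-coordinate matrices above can be built and composed directly. This sketch (not from the text; `translate`, `rotate`, and `matmul` are our own helper names) also confirms the remark that T ∘ R ≠ R ∘ T:

```python
import math

def translate(a, b):
    """Homogeneous 3x3 matrix for translation along (a, b)."""
    return [[1, 0, a], [0, 1, b], [0, 0, 1]]

def rotate(theta):
    """Homogeneous 3x3 matrix for rotation through theta about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

T, R = translate(3, 1), rotate(math.pi / 2)
TR, RT = matmul(T, R), matmul(R, T)

# T o R keeps the translation column (a, b) intact, as in the product above;
# R o T rotates the translation column, so the two compositions differ.
assert [row[2] for row in TR] == [3, 1, 1]
assert TR != RT
```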
To give a specific example, consider Figure 3.18(a). It shows an arm with three links in which A₁ is in its initial position and each of the other two links has been rotated 45° from the previous link. We will take the length of each link to be 2 units. Figure 3.18(b) shows A₃ in its initial frame. The transformation

T_3 = \begin{bmatrix} \cos 45° & -\sin 45° & 2 \\ \sin 45° & \cos 45° & 0 \\ 0 & 0 & 1 \end{bmatrix}

causes a rotation of 45° and then a translation by 2 units. As shown in Figure 3.18(c), this places A₃ in its appropriate position relative to A₂'s frame. Next, the transformation

T_2 = \begin{bmatrix} \cos 45° & -\sin 45° & 2 \\ \sin 45° & \cos 45° & 0 \\ 0 & 0 & 1 \end{bmatrix}

is applied to the previous result. This places both A₃ and A₂ in their correct position relative to A₁, as shown in Figure 3.18(d). Normally, a third transformation T₁ (a rotation) would be applied to the previous result, but in our case, T₁ is the identity transformation because A₁ stays in its initial position. Typically, we want to know the coordinates of the end (the "hand") of the robotic arm, given the length and angle parameters; this is known as forward kinematics. Following the above sequence of calculations and referring to Figure 3.18, we see that we need to determine where the point (2, 0) ends up after T₃ and T₂ are applied. Thus,
Figure 3.18  (a) A three-link chain  (b) A₃ in its initial frame  (c) T₃ puts A₃ in A₂'s initial frame
the arm's hand is at

T_2 T_3 \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} & 2 \\ 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} & 2 \\ 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 + \sqrt{2} \\ 2 + \sqrt{2} \\ 1 \end{bmatrix}

which represents the point (2 + √2, 2 + √2) in homogeneous coordinates. It is easily checked from Figure 3.18(a) that this is correct. The methods used in this example generalize to robotic arms in three dimensions, although in R³ there are more degrees of freedom and hence more variables. The method of homogeneous coordinates is also useful in other applications, notably computer graphics.
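The forward-kinematics computation above is easy to carry out in code. This sketch (ours, not the text's; `link_transform` and `apply` are assumed helper names) reproduces the hand position of the three-link arm:

```python
import math

def link_transform(theta, a_prev):
    """Homogeneous matrix: rotate through theta, then translate along (a_prev, 0)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, a_prev], [s, c, 0], [0, 0, 1]]

def apply(M, v):
    """Apply a 3x3 homogeneous matrix to a vector [x, y, 1]."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

T3 = link_transform(math.radians(45), 2)
T2 = link_transform(math.radians(45), 2)

# The hand starts at (2, 0) in A3's frame; apply T3, then T2.
hand = apply(T2, apply(T3, [2, 0, 1]))

# hand is (2 + sqrt(2), 2 + sqrt(2)) in homogeneous coordinates.
assert abs(hand[0] - (2 + math.sqrt(2))) < 1e-9
assert abs(hand[1] - (2 + math.sqrt(2))) < 1e-9
```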
Section 3.7  Applications
Markov Chains

A market research team is conducting a controlled survey to determine people's preferences in toothpaste. The sample consists of 200 people, each of whom is asked to try two brands of toothpaste over a period of several months. Based on the responses to the survey, the research team compiles the following statistics about toothpaste preferences. Of those using Brand A in any month, 70% continue to use it the following month, while 30% switch to Brand B; of those using Brand B in any month, 80% continue to use it the following month, while 20% switch to Brand A. These findings are summarized in Figure 3.19, in which the percentages have been converted into decimals; we will think of them as probabilities.
Figure 3.19  [Transition diagram: A stays A with probability 0.70, A to B with probability 0.30, B stays B with probability 0.80, B to A with probability 0.20]

Andrei A. Markov (1856–1922) was a Russian mathematician who studied and later taught at the University of St. Petersburg. He was interested in number theory, analysis, and the theory of continued fractions, a recently developed field which Markov applied to probability theory. Markov was also interested in poetry, and one of the uses to which he put Markov chains was the analysis of patterns in poems and other literary texts.
Figure 3.19 is a simple example of a (finite) Markov chain. It represents an evolving process consisting of a finite number of states. At each step or point in time, the process may be in any one of the states; at the next step, the process can remain in its present state or switch to one of the other states. The state to which the process moves at the next step and the probability of its doing so depend only on the present state and not on the past history of the process. These probabilities are called transition probabilities and are assumed to be constants (that is, the probability of moving from state i to state j is always the same).
Example 3.64

In the toothpaste survey described above, there are just two states: using Brand A and using Brand B. The transition probabilities are those indicated in Figure 3.19. Suppose that, when the survey begins, 120 people are using Brand A and 80 people are using Brand B. How many people will be using each brand 1 month later? 2 months later?

Solution  The number of Brand A users after 1 month will be 70% of those initially using Brand A (those who remain loyal to Brand A) plus 20% of the Brand B users (those who switch from B to A):

0.70(120) + 0.20(80) = 100

Similarly, the number of Brand B users after 1 month will be a combination of those who switch to Brand B and those who continue to use it:

0.30(120) + 0.80(80) = 100
We can summarize these two equations in a single matrix equation:

\begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} \begin{bmatrix} 120 \\ 80 \end{bmatrix} = \begin{bmatrix} 100 \\ 100 \end{bmatrix}

Let's call the matrix P and label the vectors x_0 = \begin{bmatrix} 120 \\ 80 \end{bmatrix} and x_1 = \begin{bmatrix} 100 \\ 100 \end{bmatrix}. (Note that the components of each vector are the numbers of Brand A and Brand B users, in that order, after the number of months indicated by the subscript.) Thus, we have x₁ = Px₀. Extending the notation, let x_k be the vector whose components record the distribution of toothpaste users after k months. To determine the number of users of each brand after 2 months have elapsed, we simply apply the same reasoning, starting with x₁ instead of x₀. We obtain

x_2 = Px_1 = \begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} \begin{bmatrix} 100 \\ 100 \end{bmatrix} = \begin{bmatrix} 90 \\ 110 \end{bmatrix}

from which we see that there are now 90 Brand A users and 110 Brand B users.
The vectors x_k in Example 3.64 are called the state vectors of the Markov chain, and the matrix P is called its transition matrix. We have just seen that a Markov chain satisfies the relation

x_{k+1} = Px_k \quad \text{for } k = 0, 1, 2, \ldots

From this result it follows that we can compute an arbitrary state vector iteratively once we know x₀ and P. In other words, a Markov chain is completely determined by its transition probabilities and its initial state.
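The iteration x_{k+1} = Px_k is a one-line loop in code. This sketch (not from the text; `step` is our own helper) reproduces the numbers of Example 3.64:

```python
# Transition matrix of Example 3.64 (columns = present state, rows = next state).
P = [[0.70, 0.20],
     [0.30, 0.80]]

def step(P, x):
    """One step of the Markov chain: returns the product P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

x1 = step(P, [120, 80])   # state after 1 month
x2 = step(P, x1)          # state after 2 months

assert abs(x1[0] - 100) < 1e-9 and abs(x1[1] - 100) < 1e-9
assert abs(x2[0] - 90) < 1e-9 and abs(x2[1] - 110) < 1e-9
```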
Remarks

• Suppose, in Example 3.64, we wanted to keep track of not the actual numbers of toothpaste users but, rather, the relative numbers using each brand. We could convert the data into percentages or fractions by dividing by 200, the total number of users. Thus, we would start with

x_0 = \begin{bmatrix} 120/200 \\ 80/200 \end{bmatrix} = \begin{bmatrix} 0.60 \\ 0.40 \end{bmatrix}

to reflect the fact that, initially, the Brand A-Brand B split is 60%-40%. Check by direct calculation that Px_0 = \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, which can then be taken as x₁ (in agreement with the 50-50 split we computed above). Vectors such as these, with nonnegative components that add up to 1, are called probability vectors.

• Observe how the transition probabilities are arranged within the transition matrix P. We can think of the columns as being labeled with the present states and the rows as being labeled with the next states:

\text{Next} \begin{matrix} \\ A \\ B \end{matrix} \quad \overset{\textstyle \text{Present}}{\begin{matrix} A & B \\ \begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} \end{matrix}}
The word stochastic is derived from the Greek adjective stokhastikos, meaning "capable of aiming" (or guessing). It has come to be applied to anything that is governed by chance.
Note also that the columns of P are probability vectors; any square matrix with this property is called a stochastic matrix. We can realize the deterministic nature of Markov chains in another way. Note that we can write

x_2 = Px_1 = P(Px_0) = P^2 x_0
and, in general,

x_k = P^k x_0 \quad \text{for } k = 0, 1, 2, \ldots

This leads us to examine the powers of a transition matrix. In Example 3.64, we have
P^2 = \begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} \begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} = \begin{bmatrix} 0.55 & 0.30 \\ 0.45 & 0.70 \end{bmatrix}

Figure 3.20  [Tree diagram of the four possible state changes over 2 months]
What are we to make of the entries of this matrix? The first thing to observe is that P² is another stochastic matrix, since its columns sum to 1. (You are asked to prove this in Exercise 14.) Could it be that P² is also a transition matrix of some kind? Consider one of its entries, say (P²)₂₁ = 0.45. The tree diagram in Figure 3.20 clarifies where this entry came from. There are four possible state changes that can occur over 2 months, and these correspond to the four branches (or paths) of length 2 in the tree. Someone who initially is using Brand A can end up using Brand B 2 months later in two different ways (marked * in the figure): The person can continue to use A after 1 month and then switch to B (with probability 0.7(0.3) = 0.21), or the person can switch to B after 1 month and then stay with B (with probability 0.3(0.8) = 0.24). The sum of these probabilities gives an overall probability of 0.45. Observe that these calculations are exactly what we do when we compute (P²)₂₁. It follows that (P²)₂₁ = 0.45 represents the probability of moving from state 1 (Brand A) to state 2 (Brand B) in two transitions. (Note that the order of the subscripts is the reverse of what you might have guessed.) The argument can be generalized to show that

(P^k)_{ij} is the probability of moving from state j to state i in k transitions.
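Squaring P confirms this interpretation. The sketch below (ours, not the text's) computes P² and reads off the two-step probability discussed above:

```python
P = [[0.70, 0.20], [0.30, 0.80]]

# P squared, entry by entry.
P2 = [[sum(P[i][k] * P[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

# (P^2)_{21}: probability of moving from state 1 (Brand A) to state 2
# (Brand B) in two transitions, i.e. 0.7*0.3 + 0.3*0.8 = 0.21 + 0.24.
assert abs(P2[1][0] - 0.45) < 1e-12

# P^2 is still stochastic: each column sums to 1.
assert abs(P2[0][0] + P2[1][0] - 1) < 1e-12
assert abs(P2[0][1] + P2[1][1] - 1) < 1e-12
```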
In Example 3.64, what will happen to the distribution of toothpaste users in the long run? Let's work with probability vectors as state vectors. Continuing our calculations (rounding to three decimal places), we find

x_0 = \begin{bmatrix} 0.60 \\ 0.40 \end{bmatrix}, \quad x_1 = Px_0 = \begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} \begin{bmatrix} 0.60 \\ 0.40 \end{bmatrix} = \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, \quad x_2 = Px_1 = \begin{bmatrix} 0.45 \\ 0.55 \end{bmatrix},

x_3 = \begin{bmatrix} 0.425 \\ 0.575 \end{bmatrix}, \quad x_4 = \begin{bmatrix} 0.412 \\ 0.588 \end{bmatrix}, \quad x_5 = \begin{bmatrix} 0.406 \\ 0.594 \end{bmatrix}, \quad x_6 = \begin{bmatrix} 0.403 \\ 0.597 \end{bmatrix},

x_7 = \begin{bmatrix} 0.402 \\ 0.598 \end{bmatrix}, \quad x_8 = \begin{bmatrix} 0.401 \\ 0.599 \end{bmatrix}, \quad x_9 = \begin{bmatrix} 0.400 \\ 0.600 \end{bmatrix}, \quad x_{10} = \begin{bmatrix} 0.400 \\ 0.600 \end{bmatrix}
and so on. It appears that the state vectors approach (or converge to) the vector \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}, implying that eventually 40% of the toothpaste users in the survey will be using Brand A and 60% will be using Brand B. Indeed, it is easy to check that, once this distribution is reached, it will never change. We simply compute
\begin{bmatrix} 0.70 & 0.20 \\ 0.30 & 0.80 \end{bmatrix} \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix} = \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}

A state vector x with the property that Px = x is called a steady state vector. In Chapter 4, we will prove that every Markov chain has a unique steady state vector. For now, let's accept this as a fact and see how we can find such a vector without doing any iterations at all. We begin by rewriting the matrix equation Px = x as Px = Ix, which can in turn be rewritten as (I − P)x = 0. Now this is just a homogeneous system of linear equations with coefficient matrix I − P, so the augmented matrix is [I − P | 0]. In Example 3.64, we have

[I - P \mid 0] = \begin{bmatrix} 1 - 0.70 & -0.20 & 0 \\ -0.30 & 1 - 0.80 & 0 \end{bmatrix} = \begin{bmatrix} 0.30 & -0.20 & 0 \\ -0.30 & 0.20 & 0 \end{bmatrix}

which reduces to

\begin{bmatrix} 1 & -\frac{2}{3} & 0 \\ 0 & 0 & 0 \end{bmatrix}
So, if our steady state vector is x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, then x₂ = t is a free variable and the parametric solution is

x_1 = \frac{2}{3}t, \quad x_2 = t

If we require x to be a probability vector, then we must have

1 = x_1 + x_2 = \frac{2}{3}t + t = \frac{5}{3}t

Therefore, t = \frac{3}{5} = 0.6, so x₂ = 0.6 and x₁ = \frac{2}{5} = 0.4, giving x = \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}, in agreement with our iterative calculations above. (If we require x to contain the actual distribution, then in this example we must have x₁ + x₂ = 200, from which it follows that x = \begin{bmatrix} 80 \\ 120 \end{bmatrix}.)
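The same steady state can be found programmatically. This sketch (ours, not the text's) carries out the one-free-variable solution in exact rational arithmetic and verifies Px = x:

```python
from fractions import Fraction

# From (I - P)x = 0 we get 0.30 x1 = 0.20 x2, i.e. x1 = (2/3) x2.
t = Fraction(1)                      # free variable x2 = t
x1, x2 = Fraction(2, 3) * t, t
total = x1 + x2
x = [x1 / total, x2 / total]         # scale to a probability vector

print(x)  # [Fraction(2, 5), Fraction(3, 5)], i.e. (0.4, 0.6)

# Verify Px = x exactly.
P = [[Fraction(7, 10), Fraction(1, 5)],
     [Fraction(3, 10), Fraction(4, 5)]]
Px = [P[0][0] * x[0] + P[0][1] * x[1],
      P[1][0] * x[0] + P[1][1] * x[1]]
assert Px == x
```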
Example 3.65
A psychologist places a rat in a cage with three compartments, as shown in Figure 3.21. The rat has been trained to select a door at random whenever a bell is rung and to move through it into the next compartment.

(a) If the rat is initially in compartment 1, what is the probability that it will be in compartment 2 after the bell has rung twice? three times?

(b) In the long run, what proportion of its time will the rat spend in each compartment?

Solution
Figure 3.21  [The cage with three compartments]

Let P = [p_{ij}] be the transition matrix for this Markov chain. Then

p_{21} = p_{31} = \frac{1}{2}, \quad p_{12} = p_{13} = \frac{1}{3}, \quad p_{32} = p_{23} = \frac{2}{3}, \quad \text{and} \quad p_{11} = p_{22} = p_{33} = 0

(Why? Remember that p_{ij} is the probability of moving from j to i.) Therefore,

P = \begin{bmatrix} 0 & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{2}{3} \\ \frac{1}{2} & \frac{2}{3} & 0 \end{bmatrix}

and the initial state vector is

x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
(a) After one ring of the bell, we have

x_1 = Px_0 = \begin{bmatrix} 0 & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{2}{3} \\ \frac{1}{2} & \frac{2}{3} & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix}

Continuing (rounding to three decimal places), we find

x_2 = Px_1 = \begin{bmatrix} 0 & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{2}{3} \\ \frac{1}{2} & \frac{2}{3} & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} \\ \frac{1}{3} \\ \frac{1}{3} \end{bmatrix} \approx \begin{bmatrix} 0.333 \\ 0.333 \\ 0.333 \end{bmatrix}

and

x_3 = Px_2 = \begin{bmatrix} 0 & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{2}{3} \\ \frac{1}{2} & \frac{2}{3} & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{3} \\ \frac{1}{3} \\ \frac{1}{3} \end{bmatrix} = \begin{bmatrix} \frac{2}{9} \\ \frac{7}{18} \\ \frac{7}{18} \end{bmatrix} \approx \begin{bmatrix} 0.222 \\ 0.389 \\ 0.389 \end{bmatrix}

Therefore, after two rings, the probability that the rat is in compartment 2 is \frac{1}{3} ≈ 0.333, and after three rings, the probability that the rat is in compartment 2 is \frac{7}{18} ≈ 0.389. [Note that these questions could also be answered by computing (P²)₂₁ and (P³)₂₁.]
(b) This question is asking for the steady state vector x as a probability vector. As we saw above, x must be in the null space of I − P, so we proceed to solve the system

[I - P \mid 0] = \begin{bmatrix} 1 & -\frac{1}{3} & -\frac{1}{3} & 0 \\ -\frac{1}{2} & 1 & -\frac{2}{3} & 0 \\ -\frac{1}{2} & -\frac{2}{3} & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -\frac{2}{3} & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}

Hence, if x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, then x₃ = t is free, x₁ = \frac{2}{3}t, and x₂ = t. Since x must be a probability vector, we need 1 = x₁ + x₂ + x₃ = \frac{8}{3}t. Thus, t = \frac{3}{8} and

x = \begin{bmatrix} \frac{1}{4} \\ \frac{3}{8} \\ \frac{3}{8} \end{bmatrix}

which tells us that, in the long run, the rat spends \frac{1}{4} of its time in compartment 1 and \frac{3}{8} of its time in each of the other two compartments.
Population Growth

P. H. Leslie, "On the Use of Matrices in Certain Population Mathematics," Biometrika 33 (1945), pp. 183–212.

One of the most popular models of population growth is a matrix-based model, first introduced by P. H. Leslie in 1945. The Leslie model describes the growth of the female portion of a population, which is assumed to have a maximum lifespan. The females are divided into age classes, all of which span an equal number of years. Using data about the average birthrates and survival probabilities of each class, the model is then able to determine the growth of the population over time.

Example 3.66
A certain species of German beetle, the Vollmar-Wasserman beetle (or VW beetle, for short), lives for at most 3 years. We divide the female VW beetles into three age classes of 1 year each: youths (0–1 year), juveniles (1–2 years), and adults (2–3 years). The youths do not lay eggs; each juvenile produces an average of four female beetles; and each adult produces an average of three females. The survival rate for youths is 50% (that is, the probability of a youth's surviving to become a juvenile is 0.5), and the survival rate for juveniles is 25%. Suppose we begin with a population of 100 female VW beetles: 40 youths, 40 juveniles, and 20 adults. Predict the beetle population for each of the next 5 years.

Solution  After 1 year, the number of youths will be the number produced during that year:

40 × 4 + 20 × 3 = 220

The number of juveniles will simply be the number of youths that have survived:

40 × 0.5 = 20

Likewise, the number of adults will be the number of juveniles that have survived:

40 × 0.25 = 10
We can combine these into a single matrix equation

\begin{bmatrix} 0 & 4 & 3 \\ 0.5 & 0 & 0 \\ 0 & 0.25 & 0 \end{bmatrix} \begin{bmatrix} 40 \\ 40 \\ 20 \end{bmatrix} = \begin{bmatrix} 220 \\ 20 \\ 10 \end{bmatrix}

or Lx₀ = x₁, where x_0 = \begin{bmatrix} 40 \\ 40 \\ 20 \end{bmatrix} is the initial population distribution vector and x_1 = \begin{bmatrix} 220 \\ 20 \\ 10 \end{bmatrix} is the distribution after 1 year. We see that the structure of the equation is exactly the same as for Markov chains: x_{k+1} = Lx_k for k = 0, 1, 2, ... (although the interpretation is quite different). It follows that we can iteratively compute successive population distribution vectors. (It also follows that x_k = L^k x_0 for k = 0, 1, 2, ..., as for Markov chains, but we will not use this fact here.) We compute

x_2 = Lx_1 = \begin{bmatrix} 0 & 4 & 3 \\ 0.5 & 0 & 0 \\ 0 & 0.25 & 0 \end{bmatrix} \begin{bmatrix} 220 \\ 20 \\ 10 \end{bmatrix} = \begin{bmatrix} 110 \\ 110 \\ 5 \end{bmatrix}
x_3 = Lx_2 = \begin{bmatrix} 0 & 4 & 3 \\ 0.5 & 0 & 0 \\ 0 & 0.25 & 0 \end{bmatrix} \begin{bmatrix} 110 \\ 110 \\ 5 \end{bmatrix} = \begin{bmatrix} 455 \\ 55 \\ 27.5 \end{bmatrix}

x_4 = Lx_3 = \begin{bmatrix} 0 & 4 & 3 \\ 0.5 & 0 & 0 \\ 0 & 0.25 & 0 \end{bmatrix} \begin{bmatrix} 455 \\ 55 \\ 27.5 \end{bmatrix} = \begin{bmatrix} 302.5 \\ 227.5 \\ 13.75 \end{bmatrix}

x_5 = Lx_4 = \begin{bmatrix} 0 & 4 & 3 \\ 0.5 & 0 & 0 \\ 0 & 0.25 & 0 \end{bmatrix} \begin{bmatrix} 302.5 \\ 227.5 \\ 13.75 \end{bmatrix} = \begin{bmatrix} 951.2 \\ 151.2 \\ 56.88 \end{bmatrix}
Therefore, the model predicts that after 5 years there will be approximately 951 young female VW beetles, 151 juveniles, and 57 adults. (Note: You could argue that we should have rounded to the nearest integer at each step (for example, 28 adults after step 3), which would have affected the subsequent iterations. We elected not to do this, since the calculations are only approximations anyway and it is much easier to use a calculator or CAS if you do not round as you go.)
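The five iterations of Example 3.66 are easy to automate. This sketch (ours, not the text's; `step` is an assumed helper name) reproduces the year-5 figures before rounding:

```python
# Leslie matrix of Example 3.66: birth parameters in row 1,
# survival probabilities on the subdiagonal.
L = [[0,    4,    3],
     [0.5,  0,    0],
     [0,    0.25, 0]]

def step(L, x):
    """One year of the Leslie model: returns the product L x."""
    return [sum(L[i][j] * x[j] for j in range(3)) for i in range(3)]

x = [40, 40, 20]        # x0: youths, juveniles, adults
for _ in range(5):
    x = step(L, x)

# x5 = (951.25, 151.25, 56.875); the text rounds these to 951.2, 151.2, 56.88.
assert abs(x[0] - 951.25) < 1e-9
assert abs(x[1] - 151.25) < 1e-9
assert abs(x[2] - 56.875) < 1e-9
```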
The matrix L in Example 3.66 is called a Leslie matrix. In general, if we have a population with n age classes of equal duration, L will be an n × n matrix with the following structure:

L = \begin{bmatrix} b_1 & b_2 & b_3 & \cdots & b_{n-1} & b_n \\ s_1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & s_2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & s_3 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & s_{n-1} & 0 \end{bmatrix}

Here, b₁, b₂, ... are the birth parameters (bᵢ = the average number of females produced by each female in class i) and s₁, s₂, ... are the survival probabilities (sᵢ = the probability that a female in class i survives into class i + 1).
Figure 3.22  [Actual population in each age class (youths, juveniles, adults) over 10 years]
What are we to make of our calculations? Overall, the beetle population appears to be increasing, although there are some fluctuations, such as a decrease from 250 to 225 from year 1 to year 2. Figure 3.22 shows the change in the population in each of the three age classes and clearly shows the growth, with fluctuations. If, instead of plotting the actual population, we plot the relative population in each class, a different pattern emerges. To do this, we need to compute the fraction of the population in each age class in each year; that is, we need to divide each distribution vector by the sum of its components. For example, after 1 year, we have
\frac{1}{250} x_1 = \frac{1}{250} \begin{bmatrix} 220 \\ 20 \\ 10 \end{bmatrix} = \begin{bmatrix} 0.88 \\ 0.08 \\ 0.04 \end{bmatrix}
which tells us that 88% of the population consists of youths, 8% is juveniles, and 4% is adults. If we plot this type of data over time, we get a graph like the one in Figure 3.23, which shows clearly that the proportion of the population in each class is approaching a steady state. It turns out that the steady state vector in this example is

\begin{bmatrix} 0.72 \\ 0.24 \\ 0.04 \end{bmatrix}

That is, in the long run, 72% of the population will be youths, 24% juveniles, and 4% adults. (In other words, the population is distributed among the three age classes in the ratio 18 : 6 : 1.) We will see how to determine this ratio exactly in Chapter 4.
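The approach to this steady proportion can be observed numerically. This sketch (ours, not the text's) iterates the Leslie model many times and normalizes the result:

```python
L = [[0, 4, 3], [0.5, 0, 0], [0, 0.25, 0]]

x = [40, 40, 20]
for _ in range(200):   # enough iterations for the proportions to settle
    x = [sum(L[i][j] * x[j] for j in range(3)) for i in range(3)]

total = sum(x)
proportions = [v / total for v in x]

# The proportions approach 0.72 : 0.24 : 0.04, i.e. the ratio 18 : 6 : 1.
assert all(abs(p - q) < 1e-6 for p, q in zip(proportions, [0.72, 0.24, 0.04]))
```

The total population itself keeps growing (by a constant factor each year, as Chapter 4 will explain); only the relative class sizes stabilize.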
Graphs and Digraphs

There are many situations in which it is important to be able to model the interrelationships among a finite set of objects. For example, we might wish to describe various types of networks (roads connecting towns, airline routes connecting cities, communication links connecting satellites, etc.) or relationships among groups or individuals (friendship relationships in a society, predator-prey relationships in an ecosystem, dominance relationships in a sport, etc.). Graphs are ideally suited to modeling such networks and relationships, and it turns out that matrices are a useful tool in their study.
[Figure 3.23 plots Youths, Juveniles, and Adults as fractions of the population against time (in years); Figure 3.24 shows a graph on vertices A, B, C, and D drawn in two different ways.]
Figure 3.24  Two representations of the same graph
The term vertex (vertices is the plural) comes from the Latin verb vertere, which means "to turn." In the context of graphs (and geometry), a vertex is a corner: a point where an edge "turns" into a different edge.
Figure 3.23

A graph consists of a finite set of points (called vertices) and a finite set of edges, each of which connects two (not necessarily distinct) vertices. We say that two vertices are adjacent if they are the endpoints of an edge. Figure 3.24 shows an example of the same graph drawn in two different ways. The graphs are the "same" in the sense that all we care about are the adjacency relationships that identify the edges. We can record the essential information about a graph in a matrix and use matrix algebra to help us answer certain questions about the graph. This is particularly useful if the graphs are large, since computers can handle the calculations very quickly.
Definition  If G is a graph with n vertices, then its adjacency matrix is the n×n matrix A [or A(G)] defined by

a_ij = 1 if there is an edge between vertices i and j, and a_ij = 0 otherwise.
Figure 3.25 shows a graph and its associated adjacency matrix.

A = [0 1 1 1; 1 1 1 0; 1 1 0 0; 1 0 0 0]

Figure 3.25  A graph with adjacency matrix A
Section 3.7 Applications
Observe that the adjacency matrix of a graph is necessarily a symmetric matrix. (Why?) Notice also that a diagonal entry a_ii of A is zero unless there is a loop at vertex i. In some situations, a graph may have more than one edge between a pair of vertices. In such cases, it may make sense to modify the definition of the adjacency matrix so that a_ij equals the number of edges between vertices i and j.
We define a path in a graph to be a sequence of edges that allows us to travel from one vertex to another continuously. The length of a path is the number of edges it contains, and we will refer to a path with k edges as a k-path. For example, in the graph of Figure 3.25, v1v3v2v1 is a 3-path, and v4v1v2v2v1v3 is a 5-path. Notice that the first of these is closed (it begins and ends at the same vertex); such a path is called a circuit. The second uses the edge between v1 and v2 twice; a path that does not include the same edge more than once is called a simple path.
For the graph in Figure 3.25, we compute

A^2 = [3 2 1 0; 2 3 2 1; 1 2 2 1; 0 1 1 1]
What do the entries of A^2 represent? Look at the (2, 3) entry. From the definition of matrix multiplication, we know that

(A^2)_23 = a_21 a_13 + a_22 a_23 + a_23 a_33 + a_24 a_43

The only way this expression can result in a nonzero number is if at least one of the products a_2k a_k3 that make up the sum is nonzero. But a_2k a_k3 is nonzero if and only if both a_2k and a_k3 are nonzero, which means that there is an edge between v2 and vk as well as an edge between vk and v3. Thus, there will be a 2-path between vertices 2 and 3 (via vertex k). In our example, this happens for k = 1 and for k = 2, so

(A^2)_23 = a_21 a_13 + a_22 a_23 + a_23 a_33 + a_24 a_43 = 1·1 + 1·1 + 1·0 + 0·0 = 2

which tells us that there are two 2-paths between vertices 2 and 3. (Check to see that the remaining entries of A^2 correctly give 2-paths in the graph.) The argument we have just given can be generalized to yield the following result, whose proof we leave as Exercise 54.
If A is the adjacency matrix of a graph G, then the (i, j) entry of A^k is equal to the number of k-paths between vertices i and j.

Example 3.67
How many 3-paths are there between v1 and v2 in Figure 3.25?

Solution  We need the (1, 2) entry of A^3 = A^2 A, which is the dot product of row 1 of A^2 and column 2 of A. The calculation gives

(A^3)_12 = 3·1 + 2·1 + 1·1 + 0·0 = 6

so there are six 3-paths between vertices 1 and 2, which can be easily checked.
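The counts in this example are easy to verify by machine. A minimal sketch (using NumPy) with the adjacency matrix of the graph in Figure 3.25:

```python
import numpy as np

# Adjacency matrix of the graph in Figure 3.25 (note the loop at v2,
# which puts a 1 on the diagonal)
A = np.array([[0, 1, 1, 1],
              [1, 1, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])

A2 = A @ A    # (A^2)[i, j] = number of 2-paths between v_{i+1} and v_{j+1}
A3 = A2 @ A   # (A^3)[i, j] = number of 3-paths

print(A2[1, 2])  # 2-paths between v2 and v3 -> 2
print(A3[0, 1])  # 3-paths between v1 and v2 -> 6
```

The same two lines of matrix arithmetic answer any k-path question for this graph.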
In many applications that can be modeled by a graph, the vertices are ordered by some type of relation that imposes a direction on the edges. For example, directed edges might be used to represent one-way routes in a graph that models a transportation network or predator-prey relationships in a graph modeling an ecosystem. A graph with directed edges is called a digraph. Figure 3.26 shows an example. An easy modification to the definition of adjacency matrices allows us to use them with digraphs.
Figure 3.26  A digraph

Definition  If G is a digraph with n vertices, then its adjacency matrix is the n×n matrix A [or A(G)] defined by

a_ij = 1 if there is an edge from vertex i to vertex j, and a_ij = 0 otherwise.
Thus, the adjacency matrix for the digraph in Figure 3.26 is

A = [0 0 1 1; 1 0 1 0; 0 1 0 0; 0 0 1 0]
Not surprisingly, the adjacency matrix of a digraph is not symmetric in general. (When would it be?) You should have no difficulty seeing that A^k now contains the numbers of directed k-paths between vertices, where we insist that all edges along a path flow in the same direction. (See Exercise 54.) The next example gives an application of this idea.
Example 3.68
Five tennis players (Davenport, Graf, Hingis, Seles, and Williams) compete in a round-robin tournament in which each player plays every other player once. The digraph in Figure 3.27 summarizes the results. A directed edge from vertex i to vertex j means that player i defeated player j. (A digraph in which there is exactly one directed edge between every pair of vertices is called a tournament.) The adjacency matrix for the digraph in Figure 3.27 is
Figure 3.27  A tournament (vertices D, G, H, S, W)

A = [0 1 0 1 1; 0 0 1 1 1; 1 0 0 1 0; 0 0 0 0 1; 0 0 1 0 0]
where the order of the vertices (and hence the rows and columns of A) is determined alphabetically. Thus, Graf corresponds to row 2 and column 2, for example. Suppose we wish to rank the five players, based on the results of their matches. One way to do this might be to count the number of wins for each player. Observe that the number of wins each player had is just the sum of the entries in the
corresponding row; equivalently, the vector containing all the row sums is given by the product Aj, where
j = [1; 1; 1; 1; 1]

In our case, we have

Aj = [0 1 0 1 1; 0 0 1 1 1; 1 0 0 1 0; 0 0 0 0 1; 0 0 1 0 0][1; 1; 1; 1; 1] = [3; 3; 2; 1; 1]
which produces the following ranking:

First: Davenport, Graf (tie)
Second: Hingis
Third: Seles, Williams (tie)
Are the players who tied in this ranking equally strong? Davenport might argue that since she defeated Graf, she deserves first place. Seles would use the same type of argument to break the tie with Williams. However, Williams could argue that she has two "indirect" victories because she beat Hingis, who defeated two others; furthermore, she might note that Seles has only one indirect victory (over Williams, who then defeated Hingis). Since one player might not have defeated all the others with whom she ultimately ties, the notion of indirect wins seems more useful. Moreover, an indirect victory corresponds to a 2-path in the digraph, so we can use the square of the adjacency matrix. To compute both wins and indirect wins for each player, we need the row sums of the matrix A + A^2, which are given by
(A + A^2)j = [0 1 2 2 3; 1 0 2 2 2; 1 1 0 2 2; 0 0 1 0 1; 1 0 1 1 0][1; 1; 1; 1; 1] = [8; 7; 6; 2; 3]
Thus, we would rank the players as follows: Davenport, Graf, Hingis, Williams, Seles. Unfortunately, this approach is not guaranteed to break all ties.
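The ranking procedure (row sums of A, then row sums of A + A^2) can be sketched in a few lines of NumPy:

```python
import numpy as np

players = ["Davenport", "Graf", "Hingis", "Seles", "Williams"]

# Adjacency matrix of the tournament in Figure 3.27:
# A[i, j] = 1 if player i defeated player j (players in alphabetical order)
A = np.array([[0, 1, 0, 1, 1],
              [0, 0, 1, 1, 1],
              [1, 0, 0, 1, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 1, 0, 0]])

wins = A.sum(axis=1)               # row sums = Aj -> [3, 3, 2, 1, 1]
scores = (A + A @ A).sum(axis=1)   # wins + indirect wins -> [8, 7, 6, 2, 3]

ranking = sorted(zip(players, scores), key=lambda p: -p[1])
print(ranking)
```

Sorting by `scores` reproduces the ranking Davenport, Graf, Hingis, Williams, Seles.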
Error-Correcting Codes

Section 1.4 discussed examples of error-detecting codes. We turn now to the problem of designing codes that can correct as well as detect certain types of errors. Our message will be a vector x in Z_2^k for some k, and we will encode it by using a matrix transformation T: Z_2^k -> Z_2^n for some n > k. The vector T(x) will be called a code vector. A simple example will serve to illustrate the approach we will take, which is a generalization of the parity check vectors in Example 1.31.
Example 3.69
Suppose the message is a single binary digit: 0 or 1. If we encode the message by simply repeating it twice, then the code vectors are [0, 0] and [1, 1]. This code can detect single errors. For example, if we transmit [0, 0] and an error occurs in the first component, then [1, 0] is received and an error is detected, because this is not a legal code vector. However, the receiver cannot correct the error, since [1, 0] would also be the result of an error in the second component if [1, 1] had been transmitted. We can solve this problem by making the code vectors longer: repeating the message digit three times instead of two. Thus, 0 and 1 are encoded as [0, 0, 0] and [1, 1, 1], respectively. Now if a single error occurs, we can not only detect it but also correct it. For example, if [0, 1, 0] is received, then we know it must have been the result of a single error in the transmission of [0, 0, 0], since a single error in [1, 1, 1] could not have produced it.
Note that the code in Example 3.69 can be achieved by means of a matrix transformation, albeit a particularly trivial one. Let G = [1; 1; 1] and define T: Z_2 -> Z_2^3 by T(x) = Gx. (Here we are thinking of the elements of Z_2 as 1×1 matrices.) The matrix G is called a generator matrix for the code. To tell whether a received vector is a code vector, we perform not one but two parity checks. We require that the received vector c = [c_1; c_2; c_3] satisfy c_1 = c_2 = c_3. We can write these equations as a linear system over Z_2:

c_1 = c_2
c_1 = c_3

or

c_1 + c_2 = 0
c_1 + c_3 = 0        (1)

If we let P = [1 1 0; 1 0 1], then (1) is equivalent to Pc = 0. The matrix P is called a parity check matrix for the code. Observe that PG = [0; 0] = O.
To see how these matrices come into play in the correction of errors, suppose we send 1 as c = [1; 1; 1] but a single error causes it to be received as
c' = [1; 0; 1]. We compute

Pc' = [1 1 0; 1 0 1][1; 0; 1] = [1; 0]
so we know that c' cannot be a code vector. Where is the error? Notice that Pc' is the second column of the parity check matrix P; this tells us that the error is in the second component of c' (which we will prove in Theorem 3.34 below) and allows us to correct the error. (Of course, in this example we could find the error faster without using matrices, but the idea is a useful one.) To generalize the ideas in the last example, we make the following definitions.
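The parity check computation in this example is easy to mimic with matrices whose arithmetic is reduced mod 2. A sketch using the G and P above:

```python
import numpy as np

# Triple-repetition code from the example; all arithmetic is mod 2
G = np.array([[1], [1], [1]])     # generator matrix
P = np.array([[1, 1, 0],
              [1, 0, 1]])         # parity check matrix; PG = O mod 2

x = np.array([1])                 # message: the single bit 1
c = (G @ x) % 2                   # code vector [1, 1, 1]

c_err = c.copy()
c_err[1] ^= 1                     # a single error in the second component

syndrome = (P @ c_err) % 2        # [1, 0]
# The syndrome equals the column of P at the error position
error_pos = next(j for j in range(P.shape[1])
                 if np.array_equal(P[:, j], syndrome))
print(error_pos)                  # -> 1 (0-based: the second component)

c_err[error_pos] ^= 1             # flip that bit to correct the error
print(np.array_equal(c_err, c))   # -> True
```

The column-matching step is exactly the observation proved in Theorem 3.34 below.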
Definitions  If k < n, then any n×k matrix of the form G = [I_k; A], where A is an (n−k)×k matrix over Z_2, is called a standard generator matrix for an (n, k) binary code T: Z_2^k -> Z_2^n. Any (n−k)×n matrix of the form P = [B  I_{n−k}], where B is an (n−k)×k matrix over Z_2, is called a standard parity check matrix. The code is said to have length n and dimension k.
Here is what we need to know:
(a) When is G the standard generator matrix for an error-correcting binary code?
(b) Given G, how do we find an associated standard parity check matrix P?
It turns out that the answers are quite easy, as shown by the following theorem.
Theorem 3.34  If G = [I_k; A] is a standard generator matrix and P = [B  I_{n−k}] is a standard parity check matrix, then P is the parity check matrix associated with G if and only if A = B. The corresponding (n, k) binary code is (single) error-correcting if and only if the columns of P are nonzero and distinct.
Before we prove the theorem, let's consider another, less trivial example that illustrates it.

Example 3.70
Suppose we want to design an error-correcting code that uses three parity check equations. Since these equations give rise to the rows of P, we have n − k = 3 and k = n − 3. The message vectors come from Z_2^k, so we would like k (and therefore n) to be as large as possible in order that we may transmit as much information as possible. By Theorem 3.34, the n columns of P need to be nonzero and distinct, so the maximum occurs when they consist of all the 2^3 − 1 = 7 nonzero vectors of Z_2^{n−k} = Z_2^3. One such candidate is
P = [1 1 0 1 1 0 0; 1 0 1 1 0 1 0; 0 1 1 1 0 0 1]
Chapler 3
Malrices
This means that

A = [1 1 0 1; 1 0 1 1; 0 1 1 1]
and thus, by Theorem 3.34, a standard generator matrix for this code is
G = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1; 1 1 0 1; 1 0 1 1; 0 1 1 1]
As an example of how the generator matrix works, suppose we encode x = [0 1 0 1]^T to get the code vector

c = Gx = [0 1 0 1 0 1 0]^T
If this vector incurs an error in its third component, then c' = [0 1 1 1 0 1 0]^T is received, and we compute
Pc' = [1 1 0 1 1 0 0; 1 0 1 1 0 1 0; 0 1 1 1 0 0 1][0; 1; 1; 1; 0; 1; 0] = [0; 1; 1]

which we recognize as column 3 of P. Therefore, the error is in the third component of c', and by changing it we recover the correct code vector c. We also know that the first four components of a code vector are the original message vector, so in this case we decode c' to get the original x = [0 1 0 1]^T.
The code in Example 3.70 is called the (7, 4) Hamming code. Any binary code constructed in this fashion is called an (n, k) Hamming code. Observe that, by construction, an (n, k) Hamming code has n = 2^{n−k} − 1.
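Example 3.70 lends itself to a short implementation: encode with G, locate a single error by matching the syndrome Pc' against the columns of P, and read off the first four components. A sketch under the conventions of the example (the helper names `encode` and `decode` are ours, not the text's):

```python
import numpy as np

# Standard generator and parity check matrices of the (7, 4) Hamming code
# from Example 3.70: G = [I4; A], P = [A  I3], all arithmetic mod 2
A = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1]])
G = np.vstack([np.eye(4, dtype=int), A])
P = np.hstack([A, np.eye(3, dtype=int)])

def encode(x):
    return (G @ x) % 2

def decode(c_recv):
    """Correct at most one bit error, then return the first 4 components."""
    s = (P @ c_recv) % 2
    c = c_recv.copy()
    if s.any():                       # nonzero syndrome: find matching column
        for j in range(7):
            if np.array_equal(P[:, j], s):
                c[j] ^= 1             # flip the erroneous bit
                break
    return c[:4]

x = np.array([0, 1, 0, 1])
c = encode(x)                         # [0, 1, 0, 1, 0, 1, 0]
c[2] ^= 1                             # error in the third component
print(decode(c))                      # -> [0 1 0 1], the original message
```

Because the columns of P are nonzero and distinct, the syndrome identifies any single-bit error uniquely, exactly as Theorem 3.34 asserts.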
Proof of Theorem 3.34
(Throughout this proof, we denote by a_i the ith column of a matrix A.) With P and G as in the statement of the theorem, assume first that they are standard parity check and generator matrices for the same binary code. Therefore, for every x in Z_2^k, PGx = 0. In terms of block multiplication,

[B  I_{n−k}][I_k; A]x = 0    for all x in Z_2^k
Richard W. Hamming (1915–1998) received his Ph.D. in mathematics from the University of Illinois at Urbana-Champaign in 1942. His mathematical research interests were in the fields of differential equations and numerical analysis. From 1946 to 1976, he worked at Bell Labs, after which he joined the faculty at the U.S. Naval Postgraduate School in Monterey, California. In 1950, he published his fundamental paper on error-correcting codes, giving an explicit construction for the optimal codes Claude Shannon had proven theoretically possible in 1948.

Equivalently, for all x in Z_2^k we have

Bx + Ax = (B + A)x = 0    or    Bx = Ax
If we now take x = e_i, the ith standard basis vector in Z_2^k, we see that

b_i = Be_i = Ae_i = a_i    for all i
Therefore, B = A. Conversely, it is easy to check that if B = A, then PGx = 0 for every x in Z_2^k (see Exercise 74). To see that such a pair P and G gives an error-correcting code when the columns of P are nonzero and distinct, suppose that a single error occurs in the ith component of a transmitted code vector c, so that the received vector is c' = c + e_i in Z_2^n. Then

Pc' = P(c + e_i) = Pc + Pe_i = 0 + p_i = p_i
which pinpoints the error in the ith component. On the other hand, if p_i = 0, then an error in the ith component will not be detected (i.e., Pc' = 0), and if p_i = p_j, then we cannot determine whether an error occurred in the ith or the jth component (Exercise 75). The main ideas of this section are summarized below.
1. For n > k, an n×k matrix G and an (n−k)×n matrix P (with entries in Z_2) are a standard generator matrix and a standard parity check matrix, respectively, for an (n, k) binary code if and only if, in block form,

G = [I_k; A]    and    P = [A  I_{n−k}]

for some (n−k)×k matrix A over Z_2.
2. G encodes a message vector x in Z_2^k as a code vector c in Z_2^n via c = Gx.
3. G is error-correcting if and only if the columns of P are nonzero and distinct. A vector c' in Z_2^n is a code vector if and only if Pc' = 0. In this case, the corresponding message vector is the vector x in Z_2^k consisting of the first k components of c'. If Pc' ≠ 0, then c' is not a code vector and Pc' is one of the columns of P. If Pc' is the ith column of P, then the error is in the ith component of c' and we can recover the correct code vector (and hence the message) by changing this component.
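Since Z_2^7 has only 128 vectors, the summary's claims can be verified exhaustively for the (7, 4) Hamming code of Example 3.70. A sketch:

```python
import numpy as np
from itertools import product

# Matrices of the (7, 4) Hamming code from Example 3.70
A = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1]])
G = np.vstack([np.eye(4, dtype=int), A])   # standard generator matrix
P = np.hstack([A, np.eye(3, dtype=int)])   # standard parity check matrix

assert ((P @ G) % 2 == 0).all()            # point 1: PG = O over Z_2

# Point 2: the 16 messages give 16 distinct code vectors
codewords = {tuple((G @ np.array(x)) % 2) for x in product([0, 1], repeat=4)}
print(len(codewords))                      # -> 16

# Point 3: a vector c is a code vector if and only if Pc = 0
for c in product([0, 1], repeat=7):
    in_code = tuple(c) in codewords
    zero_syndrome = not ((P @ np.array(c)) % 2).any()
    assert in_code == zero_syndrome
print("all 128 vectors check out")
```

The exhaustive loop confirms that the code vectors are exactly the vectors with zero syndrome.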
Markov Chains
In Exercises 1–4, let P = [0.5 0.3; 0.5 0.7] be the transition matrix for a Markov chain with two states. Let x_0 = [0.5; 0.5] be the initial state vector for the population.

1. Compute x_1 and x_2.
2. What proportion of the state 1 population will be in state 2 after two steps?
3. What proportion of the state 2 population will be in state 2 after two steps?
4. Find the steady state vector.
In Exercises 5–8, let P = [1/2 1/3 0; 1/4 1/3 1; 1/4 1/3 0] be the transition matrix for a Markov chain with three states. Let x_0 = [120; 180; 90] be the initial state vector for the population.
5. Compute x_1 and x_2.
6. What proportion of the state 1 population will be in state 1 after two steps?
7. What proportion of the state 2 population will be in state 3 after two steps?
8. Find the steady state vector.
9. Suppose that the weather in a particular region behaves according to a Markov chain. Specifically, suppose that the probability that tomorrow will be a wet day is 0.662 if today is wet and 0.250 if today is dry. The probability that tomorrow will be a dry day is 0.750 if today is dry and 0.338 if today is wet. [This exercise is based on an actual study of rainfall in Tel Aviv over a 27-year period. See K. R. Gabriel and J. Neumann, "A Markov Chain Model for Daily Rainfall Occurrence at Tel Aviv," Quarterly Journal of the Royal Meteorological Society, 88 (1962), pp. 90–95.]
(a) Write down the transition matrix for this Markov chain.
(b) If Monday is a dry day, what is the probability that Wednesday will be wet?
(c) In the long run, what will the distribution of wet and dry days be?
10. Data have been accumulated on the heights of children relative to their parents. Suppose that the probabilities that a tall parent will have a tall, medium-height, or short child are 0.6, 0.2, and 0.2, respectively; the probabilities that a medium-height parent will have a tall, medium-height, or short child are 0.1, 0.7, and 0.2, respectively; and the probabilities that a short parent will have a tall, medium-height, or short child are 0.2, 0.4, and 0.4, respectively.
(a) Write down the transition matrix for this Markov chain.
(b) What is the probability that a short person will have a tall grandchild?
(c) If 20% of the current population is tall, 50% is of medium height, and 30% is short, what will the distribution be in three generations?
(d) If the data in part (c) do not change over time, what proportion of the population will be tall, of medium height, and short in the long run?
11. A study of piñon (pine) nut crops in the American Southwest from 1940 to 1947 hypothesized that nut production followed a Markov chain. [See D. H. Thomas, "A Computer Simulation Model of Great Basin Shoshonean Subsistence and Settlement Patterns," in D. L. Clarke, ed., Models in Archaeology (London: Methuen, 1972).] The data suggested that if one year's crop was good, then the probabilities that the following year's crop would be good, fair, or poor were 0.08, 0.07, and 0.85, respectively; if one year's crop was fair, then the probabilities that the following year's crop would be good, fair, or poor were 0.09, 0.11, and 0.80, respectively; if one year's crop was poor, then the probabilities that the following year's crop would be good, fair, or poor were 0.11, 0.05, and 0.84, respectively.
(a) Write down the transition matrix for this Markov chain.
(b) If the piñon nut crop was good in 1940, find the probabilities of a good crop in the years 1941 through 1945.
(c) In the long run, what proportion of the crops will be good, fair, and poor?

12. Robots have been programmed to traverse the maze shown in Figure 3.28 and at each junction randomly choose which way to go.
Figure 3.28

Population Growth

19. A population with three age classes has a Leslie matrix L = [1 1 3; 0.7 0 0; 0 0.5 0]. If the initial population vector is x_0 = [100; 100; 100], compute x_1, x_2, and x_3.
(a) Construct the transition matrix for the Markov chain that models this situation.
(b) Suppose we start with 15 robots at each junction. Find the steady state distribution of robots. (Assume that it takes each robot the same amount of time to travel between two adjacent junctions.)

20. A population with four age classes has a Leslie matrix

L = [0 1 2 5; 0.5 0 0 0; 0 0.7 0 0; 0 0 0.3 0]
13. Let j denote a row vector consisting entirely of 1s. Prove that a nonnegative matrix P is a stochastic matrix if and only if jP = j.
Suppose we want to know the average (or expected) number of steps it will take to go from state i to state j in a Markov chain. It can be shown that the following computation answers this question: Delete the jth row and the jth column of the transition matrix P to get a new matrix Q. (Keep the rows and columns of Q labeled as they were in P.) The expected number of steps from state i to state j is given by the sum of the entries in the column of (I − Q)^{-1} labeled i.
15. In Exercise 9, if Monday is a dry day, what is the expected number of days until a wet day?
16. In Exercise 10, what is the expected number of generations until a short person has a tall descendant?
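The recipe above translates directly into code. A sketch applying it to the two-state transition matrix P of Exercises 1–4 (the helper name `expected_steps` is ours); note that P here is column-stochastic, matching the text's convention:

```python
import numpy as np

# Column-stochastic transition matrix from Exercises 1-4
P = np.array([[0.5, 0.3],
              [0.5, 0.7]])

def expected_steps(P, i, j):
    """Expected number of steps to go from state i to state j (0-based)."""
    keep = [k for k in range(P.shape[0]) if k != j]
    Q = P[np.ix_(keep, keep)]                  # delete row j and column j
    N = np.linalg.inv(np.eye(len(keep)) - Q)   # (I - Q)^{-1}
    col = keep.index(i)                        # the column labeled i
    return N[:, col].sum()

print(expected_steps(P, 0, 1))  # expected steps from state 1 to state 2 -> 2.0
```

For this P the answer can be checked by hand: from state 1 the chain leaves for state 2 with probability 0.5 each step, so the expected wait is 1/0.5 = 2 steps.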
If the initial population vector is x_0 = [10; 10; 10; 10], compute x_1, x_2, and x_3.
21. A certain species with two age classes of 1 year's duration has a survival probability of 80% from class 1 to class 2. Empirical evidence shows that, on average, each female gives birth to five females per year. Thus, two possible Leslie matrices are
14. (a) Show that the product of two 2×2 stochastic matrices is also a stochastic matrix.
(b) Prove that the product of two n×n stochastic matrices is also a stochastic matrix.
L_1 = [0 5; 0.8 0]    and    L_2 = [5 0; 0.8 0]
(a) Starting with x_0 = [10; 10], compute x_1, ..., x_10 in each case.
(b) For each case, plot the relative size of each age class over time (as in Figure 3.23). What do your graphs suggest?

22. Suppose the Leslie matrix for the VW beetle is L = [0 0 20; 0.1 0 0; 0 0.5 0]. Starting with an arbitrary x_0, determine the behavior of this population.

23. Suppose the Leslie matrix for the VW beetle is L = [0 0 20; s 0 0; 0 0.5 0]. Investigate the effect of varying the survival probability s of the young beetles.
17. In Exercise 11, if the piñon nut crop is fair one year, what is the expected number of years until a good crop occurs?
18. In Exercise 12, starting from each of the other junctions, what is the expected number of moves until a robot reaches junction 4?
24. Woodland caribou are found primarily in the western provinces of Canada and the American Northwest. The average lifespan of a female is about 14 years. The birth and survival rates for each age bracket are given in Table 3.4, which shows that caribou cows do
not give birth at all during their first 2 years and give birth to about one calf per year during their middle years. The mortality rate for young calves is very high.
years 2000 and 2010. What do you conclude? (What assumptions does this model make, and how could it be improved?)
Graphs and Digraphs

In Exercises 25–28, determine the adjacency matrix of the given graph.
The numbers of woodland caribou reported in Jasper National Park in Alberta in 1990 are shown in Table 3.5. Using a CAS, predict the caribou population for 1992 and 1994. Then project the population for the
28.
Table 3.5  Woodland Caribou Population in Jasper National Park, 1990

Age (years):  0–2  2–4  4–6  6–8  8–10  10–12  12–14
Number:        10    2    8    5    12     0      1

Source: World Wildlife Fund Canada.
In Exercises 29–32, draw a graph that has the given adjacency matrix.

29.
0
1
1
1
0 1
1
0
1
1
0
0
0
1
1
1
1
0
0
0
0
1
0
1
1
0
0
0
1
1
1 0
30.
In Exercises 37–40, draw a digraph that has the given adjacency matrix.
0
0
I
I
0
0
0
0
I
I
0
0
I
I
0
I
I
0
0
I
0
0 0
0
31. I
0 0
0
I
I
0
I
0
0
0
I
0
0
I
I
0
0
0
I
I
I
0
0
I
0
0
I
0
0
0
I
0
I
I
0
0
I
I
I
0
0
0
I
0
0
I
0
I
0
I
I
0
0 0
I
0 0
In Exercises 33–36, determine the adjacency matrix of the
0
0
I
0
I
0
I
0
0
I
given digraph.
I
0
I
0
0
I
0
0
0
I
I
0 0
0
0
0 0
0
I
I
I
0
I
0
I
0
I
0
0
0
I
0
I
0 0
I
I
0
0
0
33.
32.
37.
39.
38.
40.
In Exercises 41–48, use powers of adjacency matrices to determine the number of paths of the specified length between the given vertices.
41. Exercise 30, length 2, v_1 and v_2

34.
42. Exercise 32, length 2, v_1 and v_2
43. Exercise 30, length 3, v_1 and v_3
44. Exercise 32, length 4, v_1 and v_2
45. Exercise 37, length 2, v_1 to v_3
46. Exercise 37, length 3, v_1 to v_4
47. Exercise 40, length 3, v_4 to v_1
48. Exercise 40, length 4, v_1 to v_4
49. Let A be the adjacency matrix of a graph G.
(a) If row i of A is all zeros, what does this imply about G?
(b) If column j of A is all zeros, what does this imply about G?
35.
50. Let A be the adjacency matrix of a digraph D.
(a) If row i of A^2 is all zeros, what does this imply about D?
(b) If column j of A^2 is all zeros, what does this imply about D?

36.

51. Figure 3.29 is the digraph of a tournament with six players, P_1 to P_6. Using adjacency matrices, rank the players first by determining wins only and then by using the notion of combined wins and indirect wins, as in Example 3.68.

52. Figure 3.30 is a digraph representing a food web in a small ecosystem. A directed edge from a to b indicates that a has b as a source of food. Construct the adjacency matrix A for this digraph and use it to answer the following questions.
(a) Which species has the most direct sources of food? How does A show this?
Figure 3.30  A food web (Plant, Rodent, Insect, Fish, Bird)
(b) Which species is a direct source of food for the most other species? How does A show this?
(c) If a eats b and b eats c, we say that a has c as an indirect source of food. How can we use A to determine which species has the most indirect food sources? Which species has the most direct and indirect food sources combined?
(d) Suppose that pollutants kill the plants in this food web, and we want to determine the effect this change will have on the ecosystem. Construct a new adjacency matrix A* from A by deleting the row and column corresponding to plants. Repeat parts (a) to (c) and determine which species are the most and least affected by the change.
(e) What will the long-term effect of the pollution be? What matrix calculations will show this?
53. Five people are all connected by e-mail. Whenever one of them hears a juicy piece of gossip, he or she passes it along by e-mailing it to someone else in the group according to Table 3.6.
(a) Draw the digraph that models this "gossip network" and find its adjacency matrix A.
(b) Define a step as the time it takes a person to e-mail everyone on his or her list. (Thus, in one step, gossip gets from Ann to both Carla and Ehaz.) If Bert hears a rumor, how many steps will it take for everyone else to hear the rumor? What matrix calculation reveals this?
(c) If Ann hears a rumor, how many steps will it take for everyone else to hear the rumor? What matrix calculation reveals this?
(d) In general, if A is the adjacency matrix of a digraph, how can we tell if vertex i is connected to vertex j by a path (of some length)? [The gossip network in this exercise is reminiscent of the notion of "six degrees of separation" (found in the play and film by that name), which suggests that any two people are connected by a path of acquaintances whose length is at most 6. The game "Six Degrees of Kevin Bacon" more frivolously asserts that all actors are connected to the actor Kevin Bacon in such a way.]

54. Let A be the adjacency matrix of a graph G.
(a) By induction, prove that for all n ≥ 1, the (i, j) entry of A^n is equal to the number of n-paths between vertices i and j.
(b) How do the statement and proof in part (a) have to be modified if G is a digraph?
55. If A is the adjacency matrix of a digraph G, what does the (i, j) entry of AA^T represent if i ≠ j?

A graph is called bipartite if its vertices can be subdivided into two sets U and V such that every edge has one endpoint in U and the other endpoint in V. For example, the graph in Exercise 28 is bipartite with U = {v_1, v_2, v_3} and V = {v_4, v_5}. In Exercises 56–59, determine whether a graph with the given adjacency matrix is bipartite.

56. The adjacency matrix in Exercise 29
57. The adjacency matrix in Exercise 32
whether an error has occurred and correctly decode c' to recover the original message vector x.
58. The adjacency matrix in Exercise 31
59.
1 0 1 1 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0 1 1 1 1 0 1 0 0 1 1 0 1 0 0 0 0
66. c' = [0 1 0 0 1 0 …]^T

67. c' = [1 1 0 0 1 1 0]^T

68. c' = [0 0 1 1 1 …]^T
(a) Find a standard parity check matrix for this code.
(b) Find a standard generator matrix.
(c) Apply Theorem 3.34 to explain why this code is not error-correcting.

70. Define a code Z_2^2 -> Z_2^4 using the standard generator
(b) Using the result in part (a), prove that a bipartite graph has no circuits of odd length.
G=
Error-Correcting Codes

61. Suppose we encode the four vectors in Z_2^2 by repeating each vector twice. Thus, we have

[0, 1] -> [0, 1, 0, 1]
1
Show that this code is not error-correcting.

62. Suppose we encode the binary digits 0 and 1 by repeating each digit five times. Thus,

0 -> [0, 0, 0, 0, 0]
1 -> [1, 1, 1, 1, 1]

Show that this code can correct double errors.
What is the result of encoding the messages in Exercises 63–65 using the (7, 4) Hamming code of Example 3.70?
1 1
o
1
I
0
o
1
1
1
matrL,{
[1, 1] -> [1, 1, 1, 1]
65. x =
0
71. Define a code Z_2^3 -> Z_2^6 using the standard generator
[1, 0] -> [1, 0, 1, 0]
0 1
1
(a) List all four code words.
(b) Find the associated standard parity check matrix for this code. Is this code (single) error-correcting?
[0, 0] -> [0, 0, 0, 0]
64. x =
Or
0 !, A = ......
63. x =
If
69. The parity check code in Example 1.31 is a code
60. (a) Prove that a gra ph IS bipartite If and only if its vertices can be labeled so that Its adjacency matri x can be partitioned as
1 1 0 0
...
1 1 1 1
When the (7, 4) Hamming code of Example 3.70 is used, suppose the messages c' in Exercises 66–68 are received. Apply the standard parity check matrix to c' to determine
G=
o o
o 1 o o o 1 1 o o 1
1
o
1
1
1
(a) List all eight code words.
(b) Find the associated standard parity check matrix for this code. Is this code (single) error-correcting?

72. Show that the code in Example 3.69 is a (3, 1) Hamming code.

73. Construct standard parity check and generator matrices for a (15, 11) Hamming code.
Key Definitions and Concepts

basis, 196; Basis Theorem, 200; column matrix (vector), 136; column space of a matrix, 193; composition of linear transformations, 217; coordinate vector with respect to a basis, 206; diagonal matrix, 137; dimension, 201; elementary matrix, 168; Fundamental Theorem of Invertible Matrices, 170; identity matrix, 137; inverse of a square matrix, 161; inverse of a linear transformation, 219; linear combination of matrices, 152; linear dependence/independence of matrices, 155; linear transformation, 211; LU factorization, 179; matrix, 136; matrix addition, 138; matrix factorization, 178; matrix multiplication, 139; matrix powers, 147; negative of a matrix, 138; null space of a matrix, 195; nullity of a matrix, 202; outer product, 145; partitioned matrices (block multiplication), 143, 146; permutation matrix, 185; properties of matrix algebra, 156, 157, 165; rank of a matrix, 202; Rank Theorem, 203; representations of matrix products, 144–146; row matrix (vector), 136; row space of a matrix, 193; scalar multiple of a matrix, 138; span of a set of matrices, 154; square matrix, 137; standard matrix of a linear transformation, 214; subspace, 190; symmetric matrix, 149; transpose of a matrix, 152; zero matrix, 139
Review Questions

1. Mark each of the following statements true or false:
(a) For any matrix A, both AA^T and A^T A are defined.
(b) If A and B are matrices such that AB = O and A ≠ O, then B = O.
(c) If A, B, and X are invertible matrices such that XA = B, then X = A^{-1}B.
(d) The inverse of an elementary matrix is an elementary matrix.
(e) The transpose of an elementary matrix is an elementary matrix.
(f) The product of two elementary matrices is an
(j) If T: R^4 -> R^5 is a linear transformation, then there is a 4×5 matrix A such that T(x) = Ax for all x in the domain of T.
In Exercises 2–7, let A
subspace of Rn. (h) Every plane in R:1 is a twodimensional subsp'lCe
orR'. (i) The transformation T: Rt ..... Rl defined by
ltx) =
 x
is a linear transformation.
[ ~ ~] (///(I 8 = [ ~
3
Compllte the indicated matrices, i/possible. 2. A 2B
3. ATfi'
5. ( BBT)I
6. ( BT8 ) 1
4. BTA  IB
7_ The o uter product expansion of AAT · .c h · 8.1 f Als amatnx suul t atA . , = [
elementary matrix. (g) If A is an //IX II matrix, then the null space of A is a
=
o
9. If A =
1 0
 I
2
3
 I
0
'
,
I
5 3
1// 2
3 2
']
4 ,fi ndA.
and X is a matrix such thaI
 3
0 ,findX. 2
16. If A is a square matrix whose rows add up to the zero vector, explain why A cannot be invertible.
17. Let A be an m × n matrix with linearly independent columns. Explain why AᵀA must be an invertible matrix. Must AAᵀ also be invertible? Explain.
18. Find a linear transformation T: ℝ² → ℝ² such that
19. Find the standard matrix of the linear transformation T: ℝ² → ℝ² that corresponds to a counterclockwise rotation of 45° about the origin followed by a projection onto the line y = −2x.
20. Suppose that T: ℝⁿ → ℝⁿ is a linear transformation and suppose that v is a vector such that T(v) ≠ 0 but T²(v) = 0 (where T² = T ∘ T). Prove that v and T(v) are linearly independent.
Chapter 4
Eigenvalues and Eigenvectors
4.0 Introduction: A Dynamical System on Graphs
Almost every combination of the adjectives proper, latent,
characteristic, eigen, and secular, with the nouns root, number, and value, has been used in the literature for what we call a proper value.
We saw in the last chapter that iterating matrix multiplication often produces interesting results. Both Markov chains and the Leslie model of population growth exhibit
Paul R. Halmos, Finite-Dimensional Vector Spaces (2nd edition), Van Nostrand, 1958, p. 102
steady states in certain situations. One of the goals of this chapter is to help you understand such behavior. First we will look at another iterative process, or dynamical system, that uses matrices. (In the problems that follow, you will find it helpful to use a CAS or a calculator with matrix capabilities to facilitate the computations.) Our example involves graphs (see Section 3.7). A complete graph is any graph in which every vertex is adjacent to every other vertex. If a complete graph has n vertices, it is denoted by Kₙ. For example, Figure 4.1 shows a representation of K₄.
Problem 1  Pick any vector x in ℝ⁴ with nonnegative entries and label the vertices of K₄ with the components of x, so that v₁ is labeled with x₁, and so on. Compute the adjacency matrix A of K₄ and relabel the vertices of the graph with the corresponding components of Ax. Try this for several vectors x and explain, in terms of the graph, how the new labels can be determined from the old labels.

Problem 2  Now iterate the process in Problem 1. That is, for a given choice of x, relabel the vertices as described above and then apply A again (and again, and again) until a pattern emerges. Since components of the vectors themselves will get quite large, we will scale them by dividing each vector by its largest component after each iteration. Thus, if a computation results in the vector
[4; 2; 1; 1]

we will replace it by

(1/4) [4; 2; 1; 1] = [1; 0.5; 0.25; 0.25]

Figure 4.1  K₄
Figure 4.2  The Petersen graph

Figure 4.3  C₅

Figure 4.4  K₃,₃
Note that this process guarantees that the largest component of each vector will now be 1. Do this for K₄, then K₃ and K₅. Use at least ten iterations and two-decimal-place accuracy. What appears to be happening?

Problem 3  You should have noticed that, in each case, the labeling vector is
approaching a certain vector (a steady state label!). Label the vertices of the complete graphs with this steady state vector and apply the adjacency matrix A one more time (without scaling). What is the relationship between the new labels and the old ones?

Problem 4  Make a conjecture about the general case Kₙ. What is the steady state label? What happens if we label Kₙ with the steady state vector and apply the adjacency matrix A without scaling?

Problem 5  The Petersen graph is shown in Figure 4.2. Repeat the process in Problems 1 through 3 with this graph.

We will now explore the process with some other classes of graphs to see if they behave the same way. The cycle Cₙ is the graph with n vertices arranged in a cyclic fashion. For example, C₅ is the graph shown in Figure 4.3.

Problem 6  Repeat the process of Problems 1 through 3 with cycles Cₙ for various odd values of n and make a conjecture about the general case.

Problem 7  Repeat Problem 6 with even values of n. What happens?

A graph is a complete bipartite graph (see Exercises 56–60 in Section 3.7) if its vertices can be partitioned into sets U and V such that every vertex in U is adjacent to every vertex in V, and vice versa. If U and V each have n vertices, then the graph is denoted by Kₙ,ₙ. For example, K₃,₃ is the graph in Figure 4.4.

Problem 8  Repeat the process of Problems 1 through 3 with complete bipartite graphs Kₙ,ₙ for various values of n. What happens?
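The relabel-and-scale iteration of Problems 1 through 3 is easy to experiment with on a computer. The short Python sketch below (using numpy as one possible stand-in for the CAS suggested above; the library choice is ours, not the text's) runs ten scaled iterations on K₄:

```python
import numpy as np

# Adjacency matrix of the complete graph K4: every vertex is adjacent
# to every other vertex, so A has 0s on the diagonal and 1s elsewhere.
A = np.ones((4, 4)) - np.eye(4)

x = np.array([4.0, 2.0, 1.0, 1.0])  # initial labels (any nonnegative vector)
for _ in range(10):
    x = A @ x          # relabel each vertex with the sum of its neighbors' labels
    x = x / x.max()    # scale so the largest label is 1

print(np.round(x, 2))  # -> [1. 1. 1. 1.]
```

For K₄ the labels settle on the all-ones vector, which is exactly the steady state label Problem 4 asks you to conjecture.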
By the end of this chapter, you will be in a position to explain the observations you have made in this Introduction.
4.1 Introduction to Eigenvalues and Eigenvectors
In Chapter 3, we encountered the notion of a steady state vector in the context of two applications: Markov chains and the Leslie model of population growth. For a Markov chain with transition matrix P, a steady state vector x had the property that Px = x; for a Leslie matrix L, a steady state vector was a population vector x satisfying Lx = rx, where r represented the steady state growth rate. For example, we saw that
[0.7 0.2; 0.3 0.8] [0.4; 0.6] = [0.4; 0.6]   and   [0 4 3; 0.5 0 0; 0 0.25 0] [18; 6; 1] = [27; 9; 1.5] = 1.5 [18; 6; 1]
The German adjective eigen means "own" or "characteristic of." Eigenvalues and eigenvectors are characteristic of a matrix in the sense that they contain important information about the nature of the matrix. The letter λ (lambda), the Greek equivalent of the English letter L, is used for eigenvalues
In this chapter, we investigate this phenomenon more generally. That is, for a square matrix A, we ask whether there exist nonzero vectors x such that Ax is just a scalar multiple of x. This is the eigenvalue problem, and it is one of the most central problems in linear algebra. It has applications throughout mathematics and in many other fields as well.
because at one time they were also known as latent values. The prefix eigen is pronounced "EYE-gun."
Definition  Let A be an n × n matrix. A scalar λ is called an eigenvalue of A if there is a nonzero vector x such that Ax = λx. Such a vector x is called an eigenvector of A corresponding to λ.
Example 4.1  Show that x = [1; 1] is an eigenvector of A = [3 1; 1 3] and find the corresponding eigenvalue.
Solution  We compute

Ax = [3 1; 1 3] [1; 1] = [4; 4] = 4 [1; 1] = 4x

from which it follows that x is an eigenvector of A corresponding to the eigenvalue 4.
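This check is easy to reproduce numerically. A minimal Python/numpy sketch (the code is ours, not the text's) verifies that Ax is 4x and that 4 is indeed one of the matrix's eigenvalues:

```python
import numpy as np

# Matrix and vector from Example 4.1
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 1.0])

Ax = A @ x
print(Ax)             # [4. 4.], i.e., 4 * x, so x is an eigenvector with eigenvalue 4

vals, vecs = np.linalg.eig(A)
print(np.sort(vals))  # [2. 4.]: the two eigenvalues of A
```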
Example 4.2  Show that 5 is an eigenvalue of A = [1 2; 4 3] and determine all eigenvectors corresponding to this eigenvalue.
Solution  We must show that there is a nonzero vector x such that Ax = 5x. But this equation is equivalent to the equation (A − 5I)x = 0, so we compute

A − 5I = [1 2; 4 3] − [5 0; 0 5] = [−4 2; 4 −2]
Since the columns of this matrix are clearly linearly dependent, the Fundamental Theorem of Invertible Matrices implies that its null space is nonzero. Thus, Ax = 5x has a nontrivial solution, so 5 is an eigenvalue of A. We find its eigenvectors by computing the null space:
[A − 5I | 0] = [−4 2 | 0; 4 −2 | 0] → [1 −1/2 | 0; 0 0 | 0]

Thus, if x = [x₁; x₂] is an eigenvector corresponding to the eigenvalue 5, it satisfies x₁ − (1/2)x₂ = 0, or x₁ = (1/2)x₂, so these eigenvectors are of the form

x = [(1/2)x₂; x₂] = x₂ [1/2; 1]

That is, they are the nonzero multiples of [1/2; 1] (or, equivalently, the nonzero multiples of [1; 2]).
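A quick numerical check of this calculation (Python/numpy; a sketch we are adding, assuming the matrix A = [1 2; 4 3] of Example 4.2):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])

v = np.array([1.0, 2.0])   # representative eigenvector for lambda = 5
print(A @ v)               # [5. 10.] = 5 * v

# A - 5I is singular, which is exactly what makes 5 an eigenvalue
print(np.linalg.det(A - 5 * np.eye(2)))   # 0.0 (up to rounding)
```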
The set of all eigenvectors corresponding to an eigenvalue λ of an n × n matrix A is just the set of nonzero vectors in the null space of A − λI. It follows that this set of eigenvectors, together with the zero vector in ℝⁿ, is the null space of A − λI.
Definition  Let A be an n × n matrix and let λ be an eigenvalue of A. The collection of all eigenvectors corresponding to λ, together with the zero vector, is called the eigenspace of λ and is denoted by E_λ.
Therefore, in Example 4.2, E₅ = { t [1; 2] }.
Example 4.3  Show that λ = 6 is an eigenvalue of

A = [7 1 −2; −3 3 6; 2 2 2]

and find a basis for its eigenspace.

Solution  As in Example 4.2, we compute the null space of A − 6I. Row reduction produces

A − 6I = [1 1 −2; −3 −3 6; 2 2 −4] → [1 1 −2; 0 0 0; 0 0 0]

from which we see that the null space of A − 6I is nonzero. Hence, 6 is an eigenvalue of A, and the eigenvectors corresponding to this eigenvalue satisfy x₁ + x₂ − 2x₃ = 0, or x₁ = −x₂ + 2x₃. It follows that

E₆ = { [−x₂ + 2x₃; x₂; x₃] } = { x₂ [−1; 1; 0] + x₃ [2; 0; 1] } = span( [−1; 1; 0], [2; 0; 1] )
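The basis for E₆ can be confirmed numerically. This Python/numpy sketch (ours, assuming the 3 × 3 matrix of Example 4.3) checks that both spanning vectors really are eigenvectors for λ = 6:

```python
import numpy as np

A = np.array([[ 7.0, 1.0, -2.0],
              [-3.0, 3.0,  6.0],
              [ 2.0, 2.0,  2.0]])

# The two spanning vectors of the eigenspace E_6
for v in (np.array([-1.0, 1.0, 0.0]), np.array([2.0, 0.0, 1.0])):
    print(A @ v, 6 * v)   # the two printed vectors agree, so v lies in E_6
```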
In ℝ², we can give a geometric interpretation of the notion of an eigenvector. The equation Ax = λx says that the vectors Ax and x are parallel. Thus, x is an eigenvector of A if and only if A transforms x into a parallel vector (or, equivalently, if and only if T_A(x) is parallel to x, where T_A is the matrix transformation corresponding to A).
Example 4.4  Find the eigenvalues and eigenvectors of A = [1 0; 0 −1] geometrically.

Solution  We recognize that A is the matrix of a reflection F in the x-axis (see Example 3.56). The only vectors that F maps parallel to themselves are vectors parallel to the y-axis (i.e., multiples of [0; 1]), which are reversed (eigenvalue −1), and vectors parallel to the x-axis (i.e., multiples of [1; 0]), which are sent to themselves (eigenvalue 1) (see Figure 4.5). Accordingly, λ = −1 and λ = 1 are the eigenvalues of A, and the corresponding eigenspaces are E₋₁ = span( [0; 1] ) and E₁ = span( [1; 0] ).
Figure 4.5  The eigenvectors of a reflection

Figure 4.6
Another way to think of eigenvectors geometrically is to draw x and Ax head-to-tail. Then x will be an eigenvector of A if and only if x and Ax are aligned in a straight line. In Figure 4.6, x is an eigenvector of A but y is not. If x is an eigenvector of A corresponding to the eigenvalue λ, then so is any nonzero multiple of x. So, if we want to search for eigenvectors geometrically, we need only consider the effect of A on unit vectors. Figure 4.7(a) shows what happens when we transform unit vectors with the matrix A = [3 1; 1 3] of Example 4.1 and display the results head-to-tail, as in Figure 4.6. We can see that the vector x = [1/√2; 1/√2] is an eigenvector, but we also notice that there appears to be an eigenvector in the second quadrant. Indeed, this is the case, and it turns out to be the vector [−1/√2; 1/√2].
Figure 4.7
In Figure 4.7(b), we see what happens when we use the matrix A = [⋯]. There are no eigenvectors at all! We now know how to find eigenvectors once we have the corresponding eigenvalues, and we have a geometric interpretation of them, but one question remains: How do we first find the eigenvalues of a given matrix? The key is the observation that λ is an eigenvalue of A if and only if the null space of A − λI is nontrivial. Recall from Section 3.3 that the determinant of a 2 × 2 matrix A = [a b; c d] is the expression det A = ad − bc, and A is invertible if and only if det A is nonzero. Furthermore, the Fundamental Theorem of Invertible Matrices guarantees that a matrix has a nontrivial null space if and only if it is noninvertible, hence if and only if its determinant is zero. Putting these facts together, we see that (for 2 × 2 matrices at least) λ is an eigenvalue of A if and only if det(A − λI) = 0. This fact characterizes eigenvalues, and we will soon generalize it to square matrices of arbitrary size. For the moment, though, let's see how to use it with 2 × 2 matrices.
Example 4.5  Find all of the eigenvalues and corresponding eigenvectors of the matrix A = [3 1; 1 3] from Example 4.1.

Solution  The preceding remarks show that we must find all solutions λ of the equation det(A − λI) = 0. Since

det(A − λI) = det [3 − λ  1; 1  3 − λ] = (3 − λ)(3 − λ) − 1 = λ² − 6λ + 8

we need to solve the quadratic equation λ² − 6λ + 8 = 0. The solutions to this equation are easily found to be λ = 4 and λ = 2. These are therefore the eigenvalues of A.
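The quadratic λ² − 6λ + 8 = 0 factors by hand, but a CAS solves it just as mechanically. A small numpy sketch (our illustration) feeds the coefficient list, highest degree first, to np.roots:

```python
import numpy as np

# Coefficients of lambda^2 - 6*lambda + 8, highest degree first
eigenvalues = np.roots([1, -6, 8])
print(np.sort(eigenvalues))   # [2. 4.], the eigenvalues found in Example 4.5
```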
To find the eigenvectors corresponding to the eigenvalue λ = 4, we compute the null space of A − 4I. We find

[A − 4I | 0] = [−1 1 | 0; 1 −1 | 0] → [1 −1 | 0; 0 0 | 0]

from which it follows that x = [x₁; x₂] is an eigenvector corresponding to λ = 4 if and only if x₁ − x₂ = 0, or x₁ = x₂. Hence, the eigenspace

E₄ = { [x₁; x₁] } = { x₁ [1; 1] } = span( [1; 1] )

Similarly, for λ = 2, we have

[A − 2I | 0] = [1 1 | 0; 1 1 | 0] → [1 1 | 0; 0 0 | 0]

so y = [y₁; y₂] is an eigenvector corresponding to λ = 2 if and only if y₁ + y₂ = 0, or y₁ = −y₂. Thus, the eigenspace

E₂ = { [−y₂; y₂] } = { y₂ [−1; 1] } = span( [−1; 1] )
Figure 4.8 shows graphically how the eigenvectors of A are transformed when multiplied by A: an eigenvector x in the eigenspace E₄ is transformed into 4x, and an eigenvector y in the eigenspace E₂ is transformed into 2y. As Figure 4.7(a) shows, the eigenvectors of A are the only vectors in ℝ² that are transformed into scalar multiples of themselves when multiplied by A.
Figure 4.8  How A transforms eigenvectors
Remark  You will recall that a polynomial equation with real coefficients (such as the quadratic equation in Example 4.5) need not have real roots; it may have complex roots. (See Appendix C.) It is also possible to compute eigenvalues and eigenvectors when the entries of a matrix come from ℤₚ, where p is prime. Thus, it is important to specify the setting we intend to work in before we set out to compute the eigenvalues of a matrix. However, unless otherwise specified, the eigenvalues of a matrix whose entries are real numbers will be assumed to be real as well.
Example 4.6  Interpret the matrix in Example 4.5 as a matrix over ℤ₃ and find its eigenvalues in that field.

Solution  The solution proceeds exactly as above, except that we work modulo 3. Hence, the quadratic equation λ² − 6λ + 8 = 0 becomes λ² + 2 = 0. This equation is the same as λ² = −2 = 1, giving λ = 1 and λ = −1 = 2 as the eigenvalues in ℤ₃. (Check that the same answer would be obtained by first reducing A modulo 3 to obtain [0 1; 1 0] and then working with this matrix.)

Example 4.7  Find the eigenvalues of A = [0 −1; 1 0] (a) over ℝ and (b) over the complex numbers ℂ.
Solution  We must solve the equation

0 = det(A − λI) = det [−λ  −1; 1  −λ] = λ² + 1

(a) Over ℝ, there are no solutions, so A has no real eigenvalues.
(b) Over ℂ, the solutions are λ = i and λ = −i. (See Appendix C.)

In the next section, we will extend the notion of determinant from 2 × 2 to n × n matrices, which in turn will allow us to find the eigenvalues of arbitrary square matrices. (In fact, this isn't quite true, but we will at least be able to find a polynomial equation that the eigenvalues of a given matrix must satisfy.)
In Exercises 1–6, show that v is an eigenvector of A and find the corresponding eigenvalue.
1. A = [⋯], v = [⋯]
2. A = [⋯], v = [⋯]
3. A = [⋯], v = [⋯]
4. A = [⋯], v = [⋯]
5. A = [⋯], v = [⋯]
6. A = [⋯], v = [⋯]
I" Exercises 7 12, sl/Ol\' ' hat A is nil eigenvalue of A alld jiml Olle eigerll'cc/or correspondillg /0 ,IIis eigenvalile.
7. A = [⋯], λ = 3
8. A = [⋯], λ = ⋯
9. A = [⋯], λ = ⋯
10. A = [⋯], λ = ⋯
11. A = [⋯], λ = ⋯
12. A = [⋯], λ = ⋯

In Exercises 19–22, the unit vectors x in ℝ² and their images Ax under the action of a 2 × 2 matrix A are drawn head-to-tail, as in Figure 4.7. Estimate the eigenvectors and eigenvalues of A from each "eigenpicture."
19.–22. (eigenpictures; see figures)
In Exercises 13–18, find the eigenvalues and eigenvectors of A geometrically.
13. A = [−1 0; 0 1] (reflection in the y-axis)
14. A = [0 1; 1 0] (reflection in the line y = x)
15. A = [1 0; 0 0] (projection onto the x-axis)
16. A = [⋯] (projection onto the line through the origin with direction vector [⋯])
17. A = [2 0; 0 3] (stretching by a factor of 2 horizontally and a factor of 3 vertically)
18. A = [0 −1; 1 0] (counterclockwise rotation of 90° about the origin)
In Exercises 31–34, find all of the eigenvalues of the matrix A over the indicated ℤₚ.
31. A = [⋯] over ℤ₃
32. A = [⋯] over ℤ₃
33. A = [⋯] over ℤ₅
34. A = [⋯] over ℤ₅
35. (a) Show that the eigenvalues of the 2 × 2 matrix A = [a b; c d] are the solutions of the quadratic equation λ² − tr(A)λ + det A = 0, where tr(A) is the trace of A. (See page 160.)
(b) Show that the eigenvalues of the matrix A in part (a) are

λ = (1/2) ( a + d ± √((a − d)² + 4bc) )

(c) Show that the trace and determinant of the matrix A in part (a) are given by

tr(A) = λ₁ + λ₂   and   det A = λ₁λ₂

where λ₁ and λ₂ are the eigenvalues of A.
36. Consider again the matrix A in Exercise 35. Give conditions on a, b, c, and d such that A has (a) two distinct real eigenvalues, (b) one real eigenvalue, and (c) no real eigenvalues.
In Exercises 23–26, use the method of Example 4.5 to find all of the eigenvalues of the matrix A. Give bases for each of the corresponding eigenspaces. Illustrate the eigenspaces and the effect of multiplying eigenvectors by A as in Figure 4.8.
23. A = [⋯]
24. A = [⋯]
25. A = [⋯]
26. A = [⋯]

In Exercises 27–30, find all of the eigenvalues of the matrix A over the complex numbers ℂ. Give bases for each of the corresponding eigenspaces.
27. A = [⋯]
28. A = [⋯]
29. A = [⋯]
30. A = [⋯]
37. Show that the eigenvalues of the upper triangular matrix

A = [a b; 0 d]

are λ = a and λ = d, and find the corresponding eigenspaces.
38. Let a and b be real numbers. Find the eigenvalues and corresponding eigenspaces of

A = [a b; −b a]

over the complex numbers.
4.2 Determinants

Historically, determinants preceded matrices, a curious fact in light of the way linear algebra is taught today, with matrices before determinants. Nevertheless, determinants arose independently of matrices in the solution of many practical problems, and the theory of determinants was well developed almost two centuries before matrices were deemed worthy of study in and of themselves. A snapshot of the history of determinants is presented at the end of this section.

Recall that the determinant of the 2 × 2 matrix A = [a₁₁ a₁₂; a₂₁ a₂₂] is

det A = a₁₁a₂₂ − a₁₂a₂₁
We first encountered this expression when we determined ways to compute the inverse of a matrix. In particular, we found that

A⁻¹ = (1 / (a₁₁a₂₂ − a₁₂a₂₁)) [a₂₂ −a₁₂; −a₂₁ a₁₁]

provided a₁₁a₂₂ − a₁₂a₂₁ ≠ 0.
The determinant of a matrix A is sometimes also denoted by |A|, so for the 2 × 2 matrix A = [a₁₁ a₁₂; a₂₁ a₂₂] we may also write

|a₁₁ a₁₂; a₂₁ a₂₂| = |A| = a₁₁a₂₂ − a₁₂a₂₁

Warning  This notation for the determinant is reminiscent of absolute value notation. It is easy to mistake |a₁₁ a₁₂; a₂₁ a₂₂|, the notation for a determinant, for [a₁₁ a₁₂; a₂₁ a₂₂], the notation for the matrix itself. Do not confuse these. Fortunately, it will usually be clear from the context which is intended.

We define the determinant of a 1 × 1 matrix A = [a] to be det A = |a| = a.
(Note that we really have to be careful with notation here: |a| does not denote the absolute value of a in this case.) How then should we define the determinant of a 3 × 3 matrix? If you ask your CAS for the inverse of
A = [a b c; d e f; g h i]

the answer will be equivalent to

A⁻¹ = (1/Δ) [ei − fh  ch − bi  bf − ce; fg − di  ai − cg  cd − af; dh − eg  bg − ah  ae − bd]

where Δ = aei − afh − bdi + bfg + cdh − ceg. Observe that

Δ = aei − afh − bdi + bfg + cdh − ceg = a(ei − fh) − b(di − fg) + c(dh − eg)
and that each of the entries in the matrix portion of A⁻¹ appears to be the determinant of a 2 × 2 submatrix of A. In fact, this is true, and it is the basis of the definition of the determinant of a 3 × 3 matrix. The definition is recursive in the sense that the determinant of a 3 × 3 matrix is defined in terms of determinants of 2 × 2 matrices.
Definition  Let A = [a₁₁ a₁₂ a₁₃; a₂₁ a₂₂ a₂₃; a₃₁ a₃₂ a₃₃]. Then the determinant of A is the scalar

det A = |A| = a₁₁ |a₂₂ a₂₃; a₃₂ a₃₃| − a₁₂ |a₂₁ a₂₃; a₃₁ a₃₃| + a₁₃ |a₂₁ a₂₂; a₃₁ a₃₂|    (1)
Notice that each of the 2 × 2 determinants is obtained by deleting the row and column of A that contain the entry the determinant is being multiplied by. For example, the first summand is a₁₁ multiplied by the determinant of the submatrix obtained by deleting row 1 and column 1. Notice also that the plus and minus signs alternate in equation (1). If we denote by Aᵢⱼ the submatrix of a matrix A obtained by deleting row i and column j, then we may abbreviate equation (1) as
det A = ∑ⱼ₌₁³ (−1)¹⁺ʲ a₁ⱼ det A₁ⱼ

For any square matrix A, det Aᵢⱼ is called the (i, j)-minor of A.
Example 4.8  Compute the determinant of

A = [5 −3 2; 1 0 2; 2 −1 3]

Solution  We compute

det A = 5 |0 2; −1 3| − (−3) |1 2; 2 3| + 2 |1 0; 2 −1|
      = 5(0 − (−2)) + 3(3 − 4) + 2(−1 − 0)
      = 5(2) + 3(−1) + 2(−1)
      = 5
With a little practice, you should find that you can easily work out 2 × 2 determinants in your head. Writing out the second line in the above solution is then unnecessary.
Another method for calculating the determinant of a 3 × 3 matrix is analogous to the method for calculating the determinant of a 2 × 2 matrix. Copy the first two columns of A to the right of the matrix and take the products of the elements on the six diagonals shown below. Attach plus signs to the products from the downward-sloping diagonals and attach minus signs to the products from the upward-sloping diagonals.    (2)

This method gives

det A = a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₂₁a₃₂ − a₃₁a₂₂a₁₃ − a₃₂a₂₃a₁₁ − a₃₃a₂₁a₁₂

In Exercise 19, you are asked to check that this result agrees with that from equation (1) for a 3 × 3 determinant.
Example 4.9  Calculate the determinant of the matrix in Example 4.8 using the method shown in (2).

Solution  We adjoin to A its first two columns and compute the six indicated products. Adding the three products at the bottom and subtracting the three products at the top gives

det A = 0 + (−12) + (−2) − 0 − (−10) − (−9) = 5

as before.
Warning  We are about to define determinants for arbitrary square matrices. However, there is no analogue of the method in Example 4.9 for larger matrices. It is valid only for 3 × 3 matrices.
Determinants of n × n Matrices

The definition of the determinant of a 3 × 3 matrix extends naturally to arbitrary square matrices.

Definition  Let A = [aᵢⱼ] be an n × n matrix, where n ≥ 2. Then the determinant of A is the scalar

det A = |A| = ∑ⱼ₌₁ⁿ (−1)¹⁺ʲ a₁ⱼ det A₁ⱼ    (3)
It is convenient to combine a minor with its plus or minus sign. To this end, we define the (i, j)-cofactor of A to be

Cᵢⱼ = (−1)ⁱ⁺ʲ det Aᵢⱼ

With this notation, definition (3) becomes

det A = ∑ⱼ₌₁ⁿ a₁ⱼC₁ⱼ    (4)
Exercise 20 asks you to check that this definition correctly gives the formula for the determinant of a 2 × 2 matrix when n = 2. Definition (4) is often referred to as cofactor expansion along the first row. It is an amazing fact that we get exactly the same result by expanding along any row (or even any column)! We summarize this fact as a theorem but defer the proof until the end of this section (since it is somewhat lengthy and would interrupt our discussion if we were to present it here).
Theorem 4.1  The Laplace Expansion Theorem

The determinant of an n × n matrix A = [aᵢⱼ], where n ≥ 2, can be computed as

det A = aᵢ₁Cᵢ₁ + aᵢ₂Cᵢ₂ + ··· + aᵢₙCᵢₙ = ∑ⱼ₌₁ⁿ aᵢⱼCᵢⱼ    (5)

(which is the cofactor expansion along the ith row) and also as

det A = a₁ⱼC₁ⱼ + a₂ⱼC₂ⱼ + ··· + aₙⱼCₙⱼ = ∑ᵢ₌₁ⁿ aᵢⱼCᵢⱼ

(the cofactor expansion along the jth column).
Since Cᵢⱼ = (−1)ⁱ⁺ʲ det Aᵢⱼ, each cofactor is plus or minus the corresponding minor, with the correct sign given by the term (−1)ⁱ⁺ʲ. A quick way to determine whether the sign is + or − is to remember that the signs form a "checkerboard" pattern:

[+ − + − ···; − + − + ···; + − + − ···; − + − + ···; ⋮]
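The Laplace Expansion Theorem yields a direct (though, as discussed later in this section, highly inefficient) recursive algorithm. A Python sketch of ours, expanding along the first row of any n × n matrix:

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # delete row 1 and column j+1 to form the minor
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        # (-1)**j implements the alternating "checkerboard" sign for row 1
        total += (-1) ** j * m[0][j] * det(minor)
    return total

# Check against the running 3x3 example of this section:
print(det([[5, -3, 2], [1, 0, 2], [2, -1, 3]]))   # 5
```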
Example 4.10  Compute the determinant of the matrix

A = [5 −3 2; 1 0 2; 2 −1 3]

by (a) cofactor expansion along the third row and (b) cofactor expansion along the second column.

Solution  (a) We compute

det A = 2 |−3 2; 0 2| − (−1) |5 2; 1 2| + 3 |5 −3; 1 0| = 2(−6) + 8 + 3(3) = 5

Pierre Simon Laplace (1749–1827) was born in Normandy, France, and was expected to become a clergyman until his mathematical talents were noticed at school. He made many important contributions to calculus, probability, and astronomy. He was an examiner of the young Napoleon Bonaparte at the Royal Artillery Corps and later, when Napoleon was in power, served briefly as Minister of the Interior and then Chancellor of the Senate. Laplace was granted the title of Count of the Empire in 1806 and received the title of Marquis de Laplace in 1817.
(b) In this case, we have

det A = −(−3) |1 2; 2 3| + 0 |5 2; 2 3| − (−1) |5 2; 1 2| = 3(−1) + 0 + 8 = 5
Notice that in part (b) of Example 4.10 we needed to do fewer calculations than in part (a) because we were expanding along a column that contained a zero entry, namely, a₂₂; therefore, we did not need to compute C₂₂. It follows that the Laplace Expansion Theorem is most useful when the matrix contains a row or column with lots of zeros, since, by choosing to expand along that row or column, we minimize the number of cofactors we need to compute.
Example 4.11  Compute the determinant of

A = [⋯]

Solution  First notice that column 3 of this 4 × 4 matrix has only one nonzero entry; we should therefore expand along this column. Next note that the +/− pattern assigns a minus sign to the entry a₂₃ = 2. Thus, we have

det A = a₁₃C₁₃ + a₂₃C₂₃ + a₃₃C₃₃ + a₄₃C₄₃ = 0(C₁₃) − 2 det A₂₃ + 0(C₃₃) + 0(C₄₃)

We now continue by expanding along the third row of the 3 × 3 determinant det A₂₃ (the third column would also be a good choice) to get

det A = −2(−2(−8) − 5) = −2(11) = −22

(Note that the +/− pattern for the 3 × 3 minor is not that of the original matrix but that of a 3 × 3 matrix in general.)
The Laplace expansion is particularly useful when the matrix is (upper or lower) triangular.
Example 4.12  Compute the determinant of

A = [2 3 1 0 4; 0 3 2 5 7; 0 0 1 6 0; 0 0 0 5 2; 0 0 0 0 −1]

Solution  We expand along the first column to get

det A = 2 |3 2 5 7; 0 1 6 0; 0 0 5 2; 0 0 0 −1|

(We have omitted all cofactors corresponding to zero entries.) Now we expand along the first column again:

det A = 2 · 3 |1 6 0; 0 5 2; 0 0 −1|

Continuing to expand along the first column, we complete the calculation:

det A = 2 · 3 · 1 · |5 2; 0 −1| = 2 · 3 · 1 · (5(−1) − 2 · 0) = 2 · 3 · 1 · 5 · (−1) = −30
Example 4.12 should convince you that the determinant of a triangular matrix is the product of its diagonal entries. You are asked to give a proof of this fact in Exercise 21. We record the result as a theorem.
Theorem 4.2  The determinant of a triangular matrix is the product of the entries on its main diagonal. Specifically, if A = [aᵢⱼ] is an n × n triangular matrix, then

det A = a₁₁a₂₂ ··· aₙₙ
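The triangular-determinant rule above is easy to sanity-check numerically. In this numpy sketch of ours (the particular matrix is just an illustration; any triangular matrix will do), the product of the diagonal entries agrees with the library's determinant:

```python
import numpy as np

# An upper triangular matrix with diagonal entries 2, 3, 1, 5, -1
A = np.array([[2.0, 3.0, 1.0, 0.0, 4.0],
              [0.0, 3.0, 2.0, 5.0, 7.0],
              [0.0, 0.0, 1.0, 6.0, 0.0],
              [0.0, 0.0, 0.0, 5.0, 2.0],
              [0.0, 0.0, 0.0, 0.0, -1.0]])

print(np.prod(np.diag(A)))          # -30.0, the product of the diagonal entries
print(round(np.linalg.det(A), 6))   # -30.0 as well
```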
In general (that is, unless the matrix is triangular or has some other special form), computing a determinant by cofactor expansion is not efficient. For example, the determinant of a 3 × 3 matrix has 6 = 3! summands. In general, the number of operations T(n) needed for cofactor expansion of an n × n determinant satisfies

T(n) = (n − 1)n! + n! − 1 > n!

Even the fastest of supercomputers cannot calculate the determinant of even a moderately large matrix by cofactor expansion.
Properties of Determinants

The most efficient way to compute determinants is to use row reduction. However, not every elementary row operation leaves the determinant of a matrix unchanged. The next theorem summarizes the main properties you need to understand in order to use row reduction effectively.
Theorem 4.3  Let A = [aᵢⱼ] be a square matrix.
a. If A has a zero row (column), then det A = 0.
b. If B is obtained by interchanging two rows (columns) of A, then det B = −det A.
c. If A has two identical rows (columns), then det A = 0.
d. If B is obtained by multiplying a row (column) of A by k, then det B = k det A.
e. If A, B, and C are identical except that the ith row (column) of C is the sum of the ith rows (columns) of A and B, then det C = det A + det B.
f. If B is obtained by adding a multiple of one row (column) of A to another row (column), then det B = det A.
Proof  We will prove (b) as Lemma 4.14 at the end of this section. The proofs of properties (a) and (f) are left as exercises. We will prove the remaining properties in terms of rows; the corresponding proofs for columns are analogous.
(c) If A has two identical rows, swap them to obtain the matrix B. Clearly, B = A, so det B = det A. On the other hand, by (b), det B = −det A. Therefore, det A = −det A, so det A = 0.

(d) Suppose row i of A is multiplied by k to produce B; that is, bᵢⱼ = kaᵢⱼ for j = 1, ..., n. Since the cofactors Cᵢⱼ of the elements in the ith rows of A and B are identical (why?), expanding along the ith row of B gives

det B = ∑ⱼ₌₁ⁿ bᵢⱼCᵢⱼ = ∑ⱼ₌₁ⁿ kaᵢⱼCᵢⱼ = k ∑ⱼ₌₁ⁿ aᵢⱼCᵢⱼ = k det A
(e) As in (d), the cofactors Cᵢⱼ of the elements in the ith rows of A, B, and C are identical. Moreover, cᵢⱼ = aᵢⱼ + bᵢⱼ for j = 1, ..., n. We expand along the ith row of C to obtain

det C = ∑ⱼ₌₁ⁿ cᵢⱼCᵢⱼ = ∑ⱼ₌₁ⁿ (aᵢⱼ + bᵢⱼ)Cᵢⱼ = ∑ⱼ₌₁ⁿ aᵢⱼCᵢⱼ + ∑ⱼ₌₁ⁿ bᵢⱼCᵢⱼ = det A + det B
Notice that properties (b), (d), and (f) are related to elementary row operations. Since the echelon form of a square matrix is necessarily upper triangular, we can combine these properties with Theorem 4.2 to calculate determinants efficiently. (See Exploration: Counting Operations in Chapter 2, which shows that row reduction of an n × n matrix uses on the order of n³ operations, far fewer than the n! needed for cofactor expansion.) The next examples illustrate the computation of determinants using row reduction.
Example 4.13  Compute det A if

(a) A = [2 3 −1; 0 5 3; −4 −6 2]   (b) A = [⋯]

Solution  (a) Using property (f) and then property (a), we have

det A = |2 3 −1; 0 5 3; −4 −6 2| = |2 3 −1; 0 5 3; 0 0 0| = 0

(here the operation R₃ + 2R₁, which does not change the determinant, produced a zero row).
(b) We reduce A to echelon form, keeping track of the effect of each elementary row operation on the determinant: interchanging two rows changes the sign (property (b)), factoring a scalar out of a row contributes that scalar as a factor (property (d)), and adding a multiple of one row to another leaves the determinant unchanged (property (f)). Multiplying the accumulated factors by the diagonal entries of the resulting triangular matrix gives

det A = −585
Remark  By Theorem 4.3, we can also use elementary column operations in the process of computing determinants, and we can "mix and match" elementary row and column operations. For example, in Example 4.13(a), we could have started by adding column 3 to column 1 to create a leading 1 in the upper left-hand corner. In fact, the method we used was faster, but in other examples column operations may speed up the calculations. Keep this in mind when you work determinants by hand.
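The row-reduction strategy of Theorem 4.3 and Example 4.13 can be coded directly: eliminate to upper triangular form, flip the sign for each row interchange, and multiply the diagonal at the end. A Python/numpy sketch of ours (with partial pivoting added for numerical stability):

```python
import numpy as np

def det_by_row_reduction(A):
    """Determinant via Gaussian elimination, tracking row swaps (Theorem 4.3)."""
    U = np.array(A, dtype=float)
    n = len(U)
    sign = 1.0
    for k in range(n):
        p = k + np.argmax(np.abs(U[k:, k]))     # partial pivoting for stability
        if np.isclose(U[p, k], 0.0):
            return 0.0                           # no pivot: determinant is 0
        if p != k:
            U[[k, p]] = U[[p, k]]
            sign = -sign                         # a row swap negates the determinant
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]   # property (f): det unchanged
    return sign * np.prod(np.diag(U))

print(det_by_row_reduction([[2, 3, -1], [0, 5, 3], [-4, -6, 2]]))   # 0.0, as in 4.13(a)
```

This runs in on the order of n³ operations, in contrast to the n!-sized cofactor expansion.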
Determinants of Elementary Matrices

Recall from Section 3.3 that an elementary matrix results from performing an elementary row operation on an identity matrix. Setting A = Iₙ in Theorem 4.3 yields the following theorem.
Theorem 4.4  Let E be an n × n elementary matrix.
a. If E results from interchanging two rows of Iₙ, then det E = −1.
b. If E results from multiplying one row of Iₙ by k, then det E = k.
c. If E results from adding a multiple of one row of Iₙ to another row, then det E = 1.
The word lemma is derived from the Greek verb lambanein, which means "to grasp." In mathematics, a lemma is a "helper theorem" that we "grasp hold of" and use to prove another, usually more important, theorem.
Proof  Since det I_n = 1, applying (b), (d), and (f) of Theorem 4.3 immediately gives (a), (b), and (c), respectively, of Theorem 4.4. Next, recall that multiplying a matrix B by an elementary matrix on the left performs the corresponding elementary row operation on B. We can therefore rephrase (b), (d), and (f) of Theorem 4.3 succinctly as the following lemma, the proof of which is straightforward and is left as Exercise 43.
Section 4.2  Determinants

Lemma 4.5
Let B be an n×n matrix and let E be an n×n elementary matrix. Then

det(EB) = (det E)(det B)
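Lemma 4.5 can be spot-checked numerically. In this sketch the matrices are my own, not the text's: det2 is the familiar ad − bc for 2×2 matrices, and the three E's are one elementary matrix of each type from Theorem 4.4.

```python
def det2(m):
    # 2x2 determinant: ad - bc
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = [[2, 3], [1, 4]]
swap  = [[0, 1], [1, 0]]   # interchange R1, R2:   det = -1 (Theorem 4.4a)
scale = [[5, 0], [0, 1]]   # R1 -> 5*R1:           det =  5 (Theorem 4.4b)
add   = [[1, 0], [7, 1]]   # R2 -> R2 + 7*R1:      det =  1 (Theorem 4.4c)

for E in (swap, scale, add):
    assert det2(matmul(E, B)) == det2(E) * det2(B)
print("det(EB) = (det E)(det B) for all three elementary types")
```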
We can use Lemma 4.5 to prove the main theorem of this section: a characterization of invertibility in terms of determinants.
Theorem 4.6
A square matrix A is invertible if and only if det A ≠ 0.
Proof  Let A be an n×n matrix and let R be the reduced row echelon form of A. We will show first that det A ≠ 0 if and only if det R ≠ 0. Let E1, E2, …, Er be the elementary matrices corresponding to the elementary row operations that reduce A to R. Then

Er ⋯ E2 E1 A = R

Taking determinants of both sides and repeatedly applying Lemma 4.5, we obtain

(det Er) ⋯ (det E2)(det E1)(det A) = det R

By Theorem 4.4, the determinants of all the elementary matrices are nonzero. We conclude that det A ≠ 0 if and only if det R ≠ 0. Now suppose that A is invertible. Then, by the Fundamental Theorem of Invertible Matrices, R = I_n, so det R = 1 ≠ 0. Hence, det A ≠ 0 also. Conversely, if det A ≠ 0, then det R ≠ 0, so R cannot contain a zero row, by Theorem 4.3(a). It follows that R must be I_n (why?), so A is invertible, by the Fundamental Theorem again.
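For 2×2 matrices, Theorem 4.6 can be seen directly: A is invertible exactly when ad − bc ≠ 0, in which case the familiar inverse formula applies. The matrices below are my own illustrations, not from the text.

```python
def inverse2(m):
    """Return the inverse of a 2x2 matrix, or None if det = 0
    (Theorem 4.6: no inverse exists in that case)."""
    a, b = m[0]
    c, d = m[1]
    det = a * d - b * c
    if det == 0:
        return None
    return [[d / det, -b / det], [-c / det, a / det]]

# rows proportional => det = 0 => not invertible
assert inverse2([[1, 2], [2, 4]]) is None
# det = 2*1 - 1*1 = 1 => invertible
Ainv = inverse2([[2, 1], [1, 1]])
print(Ainv)   # [[1.0, -1.0], [-1.0, 2.0]]
```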
Determinants and Matrix Operations

Let's now try to determine what relationship, if any, exists between determinants and some of the basic matrix operations. Specifically, we would like to find formulas for det(kA), det(A + B), det(AB), det(A⁻¹), and det(Aᵀ) in terms of det A and det B. Theorem 4.3(d) does not say that det(kA) = k det A. The correct relationship between scalar multiplication and determinants is given by the following theorem.
Theorem 4.7
If A is an n×n matrix, then

det(kA) = kⁿ det A

You are asked to give the proof of this theorem in Exercise 44. Unfortunately, there is no simple formula for det(A + B), and in general, det(A + B) ≠ det A + det B. (Find two 2×2 matrices that verify this.) It therefore comes as a pleasant surprise to find out that determinants are quite compatible with matrix multiplication. Indeed, we have the following nice formula due to Cauchy.
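The parenthetical challenge above, finding two 2×2 matrices with det(A + B) ≠ det A + det B, is met by almost any pair; here is one choice of my own, with A = B = I.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[1, 0], [0, 1]]
B = [[1, 0], [0, 1]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]   # A + B = 2I

# det(2I) = 4, but det A + det B = 1 + 1 = 2
print(det2(S), det2(A) + det2(B))   # 4 2
```

Note that det(2I) = 4 also confirms Theorem 4.7: det(kA) = k²·det A for 2×2 matrices.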
Augustin Louis Cauchy (1789–1857) was born in Paris and studied engineering but switched to mathematics because of poor health. A brilliant and prolific mathematician, he published over 700 papers, many on quite difficult problems. His name can be found on many theorems and definitions in differential equations, infinite series, probability theory, algebra, and physics. He is noted for introducing rigor into calculus, laying the foundation for the branch of mathematics known as analysis. Politically conservative, Cauchy was a royalist, and in 1830 he followed Charles X into exile. He returned to France in 1838 but did not return to his post at the Sorbonne until the university dropped its requirement that faculty swear an oath of loyalty to the new king.
Theorem 4.8
If A and B are n×n matrices, then

det(AB) = (det A)(det B)
Proof  We consider two cases: A invertible and A not invertible. If A is invertible, then, by the Fundamental Theorem of Invertible Matrices, it can be written as a product of elementary matrices, say

A = E1 E2 ⋯ Ek

Then AB = E1 E2 ⋯ Ek B, so k applications of Lemma 4.5 give

det(AB) = det(E1 E2 ⋯ Ek B) = (det E1)(det E2) ⋯ (det Ek)(det B)

Continuing to apply Lemma 4.5, we obtain

det(AB) = det(E1 E2 ⋯ Ek)(det B) = (det A)(det B)

If A is not invertible, then neither is AB, by Exercise 47 in Section 3.3. Thus, by Theorem 4.6, det A = 0 and det(AB) = 0. Consequently, det(AB) = (det A)(det B), since both sides are zero.
Example 4.14
Applying Theorem 4.8 to a pair of 2×2 matrices A and B, we find that det A = 4, det B = 3, and det(AB) = 12 = 4 · 3 = (det A)(det B), as claimed. (Check these assertions!)
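Example 4.14's specific entries are hard to make out in this copy, so the check below uses matrices of my own choosing with the same determinants (det A = 4, det B = 3) to confirm Theorem 4.8.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 0], [1, 2]]        # det A = 4
B = [[3, 0], [0, 1]]        # det B = 3
AB = matmul2(A, B)          # [[6, 0], [3, 2]]

assert det2(AB) == det2(A) * det2(B) == 12
print("det(AB) =", det2(AB))   # det(AB) = 12
```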
The next theorem gives a nice relationship between the determinant of an invertible matrix and the determinant of its inverse.
Theorem 4.9
If A is invertible, then

det(A⁻¹) = 1/(det A)

Proof  Since A is invertible, AA⁻¹ = I, so det(AA⁻¹) = det I = 1. Hence, (det A)(det A⁻¹) = 1, by Theorem 4.8, and since det A ≠ 0 (why?), dividing by det A yields the result.
Example 4.15
Verify Theorem 4.9 for the matrix A of Example 4.14.

Solution  We compute

det(A⁻¹) = 3/8 − 1/8 = 1/4 = 1/(det A)
Remark  The beauty of Theorem 4.9 is that sometimes we do not need to know what the inverse of a matrix is, but only that it exists, or to know what its determinant is. For the matrix A in the last two examples, once we know that det A = 4 ≠ 0, we immediately can deduce that A is invertible and that det A⁻¹ = 1/4 without actually computing A⁻¹.
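Theorem 4.9 is easy to confirm with exact arithmetic. The matrix below is my own choice with det A = 4, echoing the last two examples.

```python
from fractions import Fraction

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    a, b = m[0]
    c, d = m[1]
    det = Fraction(a * d - b * c)          # assumed nonzero here
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[3, 1], [2, 2]]                        # det A = 4
print(det2(inverse2(A)))                    # 1/4, i.e. 1/det A
```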
We now relate the determinant of a matrix A to that of its transpose Aᵀ. Since the rows of Aᵀ are just the columns of A, evaluating det Aᵀ by expanding along the first row is identical to evaluating det A by expanding along its first column, which the Laplace Expansion Theorem allows us to do. Thus, we have the following result.
Theorem 4.10
For any square matrix A,

det A = det Aᵀ
Gabriel Cramer (1704–1752) was a Swiss mathematician. The rule that bears his name was published in 1750, in his treatise Introduction to the Analysis of Algebraic Curves. As early as 1730, however, special cases of the formula were known to other mathematicians, including the Scotsman Colin Maclaurin (1698–1746), perhaps the greatest of the British mathematicians who were the "successors of Newton."
Cramer's Rule and the Adjoint

In this section, we derive two useful formulas relating determinants to the solution of linear systems and the inverse of a matrix. The first of these, Cramer's Rule, gives a formula for describing the solution of certain systems of n linear equations in n variables entirely in terms of determinants. While this result is of little practical use beyond 2×2 systems, it is of great theoretical importance.

We will need some new notation for this result and its proof. For an n×n matrix A and a vector b in Rⁿ, let A_i(b) denote the matrix obtained by replacing the ith column of A by b. That is,

A_i(b) = [a1 ⋯ b ⋯ an]
              ↑
           column i
Theorem 4.11 (Cramer's Rule)
Let A be an invertible n×n matrix and let b be a vector in Rⁿ. Then the unique solution x of the system Ax = b is given by

x_i = det(A_i(b)) / det A,   for i = 1, …, n
Proof  The columns of the identity matrix I = I_n are the standard unit vectors e1, e2, …, en. If Ax = b, then

A I_i(x) = A[e1 ⋯ x ⋯ en] = [Ae1 ⋯ Ax ⋯ Aen] = [a1 ⋯ b ⋯ an] = A_i(b)

Therefore, by Theorem 4.8,

(det A)(det I_i(x)) = det(A_i(b))

Now I_i(x) is the identity matrix with its ith column replaced by x. Its ith row is all zeros except for the entry x_i in the ith column, and deleting that row and column leaves I_{n-1}, so expanding along the ith row gives det I_i(x) = x_i. Thus, (det A) x_i = det(A_i(b)), and the result follows by dividing by det A (which is nonzero, since A is invertible).
Example 4.16
Use Cramer's Rule to solve the system

 x1 + 2x2 = 2
-x1 + 4x2 = 1

Solution  We compute

det A = | 1 2; -1 4 | = 6,   det(A1(b)) = | 2 2; 1 4 | = 6,   and   det(A2(b)) = | 1 2; -1 1 | = 3

By Cramer's Rule,

x1 = det(A1(b)) / det A = 6/6 = 1   and   x2 = det(A2(b)) / det A = 3/6 = 1/2
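Cramer's Rule translates directly into code. The sketch below is my own: it builds each A_i(b) by column replacement and divides determinants, solving the system of Example 4.16 exactly.

```python
from fractions import Fraction

def det(rows):
    # determinant by row reduction with exact fractions
    a = [[Fraction(x) for x in r] for r in rows]
    n, sign = len(a), 1
    for j in range(n):
        p = next((i for i in range(j, n) if a[i][j] != 0), None)
        if p is None:
            return Fraction(0)
        if p != j:
            a[j], a[p] = a[p], a[j]
            sign = -sign
        for i in range(j + 1, n):
            m = a[i][j] / a[j][j]
            a[i] = [x - m * y for x, y in zip(a[i], a[j])]
    r = Fraction(sign)
    for j in range(n):
        r *= a[j][j]
    return r

def cramer(A, b):
    d = det(A)                                  # must be nonzero
    def col_replaced(i):                        # the matrix A_i(b)
        return [[b[r] if c == i else A[r][c] for c in range(len(A))]
                for r in range(len(A))]
    return [det(col_replaced(i)) / d for i in range(len(A))]

# The system of Example 4.16:  x1 + 2*x2 = 2,  -x1 + 4*x2 = 1
print(cramer([[1, 2], [-1, 4]], [2, 1]))   # [Fraction(1, 1), Fraction(1, 2)]
```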
Remark  As noted above, Cramer's Rule is computationally inefficient for all but small systems of linear equations because it involves the calculation of many determinants. The effort expended to compute just one of these determinants, using even the most efficient method, would be better spent using Gaussian elimination to solve the system directly.

The final result of this section is a formula for the inverse of a matrix in terms of determinants. This formula was hinted at by the formula for the inverse of a 3×3 matrix, which was given without proof at the beginning of this section. Thus, we have come full circle. Let's discover the formula for ourselves.

Write the inverse as X = A⁻¹ with columns x1, …, xn, so that AX = I. The jth column of this equation reads Ax_j = e_j, and by Cramer's Rule,

x_ij = det(A_i(e_j)) / det A

However, A_i(e_j) is A with its ith column replaced by e_j; the only nonzero entry in that column is the 1 in row j, so expanding along the ith column gives

det(A_i(e_j)) = (-1)^(j+i) det A_ji = C_ji

which is the (j, i)-cofactor of A. It follows that x_ij = (1/det A) C_ji, so A⁻¹ = X = (1/det A)[C_ji] = (1/det A)[C_ij]ᵀ. In words, the inverse of A is the transpose of the matrix of cofactors of A, divided by the determinant of A. The matrix
[C_ij]ᵀ = | C11 C21 ⋯ Cn1 |
          | C12 C22 ⋯ Cn2 |
          |  ⋮   ⋮       ⋮  |
          | C1n C2n ⋯ Cnn |

is called the adjoint (or adjugate) of A and is denoted by adj A. The result we have just proved can be stated as follows.
Theorem 4.12
Let A be an invertible n×n matrix. Then

A⁻¹ = (1/det A) adj A
Example 4.17
Use the adjoint method to compute the inverse of

A = | 1  2 -1 |
    | 2  2  4 |
    | 1  3 -3 |

Solution  We compute det A = -2 and the nine cofactors

C11 = +| 2  4; 3 -3 | = -18    C12 = -| 2  4; 1 -3 | = 10    C13 = +| 2 2; 1 3 | =  4
C21 = -| 2 -1; 3 -3 | =   3    C22 = +| 1 -1; 1 -3 | = -2    C23 = -| 1 2; 1 3 | = -1
C31 = +| 2 -1; 2  4 | =  10    C32 = -| 1 -1; 2  4 | = -6    C33 = +| 1 2; 2 2 | = -2

The adjoint is the transpose of the matrix of cofactors, namely

adj A = | -18   3  10 |
        |  10  -2  -6 |
        |   4  -1  -2 |

Then

A⁻¹ = (1/det A) adj A = -(1/2) | -18   3  10 |   =   |  9  -3/2  -5 |
                               |  10  -2  -6 |       | -5    1    3 |
                               |   4  -1  -2 |       | -2   1/2   1 |

which is the same answer we obtained (with less work) in Example 3.30.
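The adjoint method of Theorem 4.12 can be sketched in code (this implementation is my own): build the matrix of cofactors, transpose it, and divide by det A, here applied to the matrix of Example 4.17.

```python
from fractions import Fraction

def det3(m):
    # cofactor expansion along the first row of a 3x3 matrix
    a, b, c = m[0]
    return (a * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - b * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + c * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def adjugate3(m):
    def minor(i, j):
        # 2x2 determinant of the submatrix with row i and column j deleted
        rows = [r for k, r in enumerate(m) if k != i]
        sub = [[x for l, x in enumerate(r) if l != j] for r in rows]
        return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]
    cof = [[(-1) ** (i + j) * minor(i, j) for j in range(3)]
           for i in range(3)]
    return [[cof[j][i] for j in range(3)] for i in range(3)]   # transpose

A = [[1, 2, -1], [2, 2, 4], [1, 3, -3]]        # the matrix of Example 4.17
d = det3(A)                                     # -2
Ainv = [[Fraction(x, d) for x in row] for row in adjugate3(A)]
print(Ainv[0])   # [Fraction(9, 1), Fraction(-3, 2), Fraction(-5, 1)]
```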
Proof of the Laplace Expansion Theorem

Unfortunately, there is no short, easy proof of the Laplace Expansion Theorem. The proof we give has the merit of being relatively straightforward. We break it down into several steps, the first of which is to prove that cofactor expansion along the first row of a matrix is the same as cofactor expansion along the first column.
Lemma 4.13
Let A be an n×n matrix. Then cofactor expansion along the first row of A gives the same result as cofactor expansion along the first column:

a11 C11 + a12 C12 + ⋯ + a1n C1n = a11 C11 + a21 C21 + ⋯ + an1 Cn1        (7)

Proof  We prove this lemma by induction on n. For n = 1, the result is trivial. Now assume that the result is true for (n−1)×(n−1) matrices; this is our induction hypothesis. Note that, by the definition of cofactor (or minor), the terms containing a11 are the same on both sides of equation (7), since each equals a11 det A11. It therefore suffices to compare, for i ≥ 2 and j ≥ 2, the terms containing a_i1 a_1j on each side.

On the right-hand side of equation (7), the factor a_i1 occurs in the ith summand, a_i1 C_i1 = a_i1 (−1)^(i+1) det A_i1. By the induction hypothesis, we may expand det A_i1 along its first row; the jth term in this expansion of det A_i1 is a_1j (−1)^(1+(j−1)) det A_{i1,1j}, where the notation A_{kl,rs} denotes the submatrix of A obtained by deleting rows k and l and columns r and s. Combining these, we see that the term containing a_i1 a_1j on the right-hand side of equation (7) is

a_i1 (−1)^(i+1) a_1j (−1)^j det A_{i1,1j} = (−1)^(i+j+1) a_i1 a_1j det A_{i1,1j}

What is the term containing a_i1 a_1j on the left-hand side of equation (7)? The factor a_1j occurs in the jth summand, a_1j C_1j = a_1j (−1)^(1+j) det A_1j. By the induction hypothesis, we can expand det A_1j along its first column; the ith term in this expansion of det A_1j is a_i1 (−1)^((i−1)+1) det A_{1i,j1}, so the term containing a_i1 a_1j on the left-hand side of equation (7) is

a_1j (−1)^(1+j) a_i1 (−1)^i det A_{1i,j1} = (−1)^(i+j+1) a_i1 a_1j det A_{1i,j1}

Since deleting rows 1 and i and columns 1 and j in either order produces the same submatrix, A_{i1,1j} = A_{1i,j1}, which establishes that the left- and right-hand sides of equation (7) are equivalent.
Next, we prove property (b) of Theorem 4.3.
Lemma 4.14
Let A be an n×n matrix and let B be obtained by interchanging any two rows (columns) of A. Then

det B = −det A
Proof  Once again, the proof is by induction on n. The result can be easily checked when n = 2, so assume that it is true for (n−1)×(n−1) matrices. We will prove that the result is true for n×n matrices.

First, we prove that it holds when two adjacent rows of A are interchanged, say rows r and r + 1. By Lemma 4.13, we can evaluate det B by cofactor expansion along its first column. The ith term in this expansion is (−1)^(i+1) b_i1 det B_i1. If i ≠ r and i ≠ r + 1, then b_i1 = a_i1 and B_i1 is an (n−1)×(n−1) submatrix that is identical to A_i1 except that two adjacent rows have been interchanged. Thus, by the induction hypothesis, det B_i1 = −det A_i1 if i ≠ r and i ≠ r + 1.

If i = r, then b_r1 = a_{r+1,1} and B_r1 = A_{r+1,1}. Therefore the rth summand in det B is

(−1)^(r+1) b_r1 det B_r1 = (−1)^(r+1) a_{r+1,1} det A_{r+1,1} = −(−1)^((r+1)+1) a_{r+1,1} det A_{r+1,1}

Similarly, if i = r + 1, then b_{r+1,1} = a_r1, B_{r+1,1} = A_r1, and the (r+1)st summand in det B is

(−1)^((r+1)+1) b_{r+1,1} det B_{r+1,1} = −(−1)^(r+1) a_r1 det A_r1

In other words, the rth and (r+1)st terms in the first-column cofactor expansion of det B are the negatives of the (r+1)st and rth terms, respectively, in the first-column cofactor expansion of det A. Substituting all of these results and using Lemma 4.13 again, we obtain

det B = Σ_{i=1}^{n} (−1)^(i+1) b_i1 det B_i1
      = Σ_{i≠r,r+1} (−1)^(i+1) a_i1 (−det A_i1) − (−1)^((r+1)+1) a_{r+1,1} det A_{r+1,1} − (−1)^(r+1) a_r1 det A_r1
      = −Σ_{i=1}^{n} (−1)^(i+1) a_i1 det A_i1
      = −det A

This proves the result for n×n matrices when adjacent rows are interchanged. To see that it holds for arbitrary row interchanges, we need only note that, for example, rows r and s, where r < s, can be swapped by performing 2(s − r) − 1 interchanges of adjacent rows (see Exercise 67). Since the number of interchanges is odd and each one changes the sign of the determinant, the net effect is a change of sign, as desired. The proof for column interchanges is analogous, except that we expand along row 1 instead of along column 1.

We can now prove the Laplace Expansion Theorem.
Proof of Theorem 4.1  Let B be the matrix obtained by moving row i of A to the top, using i − 1 interchanges of adjacent rows. By Lemma 4.14, det B = (−1)^(i−1) det A. But b_1j = a_ij and B_1j = A_ij for j = 1, …, n. Thus,

det A = (−1)^(i−1) det B = (−1)^(i−1) Σ_{j=1}^{n} (−1)^(1+j) b_1j det B_1j = Σ_{j=1}^{n} (−1)^(i+j) a_ij det A_ij

which gives the formula for cofactor expansion along row i. The proof for column expansion is similar, invoking Lemma 4.13 so that we can use column expansion instead of row expansion (see Exercise 68).
A Brief History of Determinants

As noted at the beginning of this section, the history of determinants predates that of matrices. Indeed, determinants were first introduced, independently, by Seki in 1683 and Leibniz in 1693. In 1748, determinants appeared in Maclaurin's Treatise on Algebra, which included a treatment of Cramer's Rule up to the 4×4 case. In 1750, Cramer himself proved the general case of his rule, applying it to curve fitting, and in 1772, Laplace gave a proof of his expansion theorem. The term determinant was not coined until 1801, when it was used by Gauss. Cauchy made the first use of determinants in the modern sense in 1812. Cauchy, in fact, was responsible for developing much of the early theory of determinants, including several important results that we have mentioned: the product rule for determinants, the characteristic polynomial, and the notion of a diagonalizable matrix. Determinants did not become widely known until 1841, when Jacobi popularized them, albeit in the context of functions of several variables, such as are encountered in a multivariable calculus course. (These types of determinants were called "Jacobians" by Sylvester around 1850, a term that is still used today.)
Gottfried Wilhelm von Leibniz (1646–1716) was born in Leipzig and studied law, theology, philosophy, and mathematics. He is probably best known for developing (with Newton, independently) the main ideas of differential and integral calculus. However, his contributions to other branches of mathematics are also impressive. He developed the notion of a determinant, knew versions of Cramer's Rule and the Laplace Expansion Theorem before others were given credit for them, and laid the foundation for matrix theory through work he did on quadratic forms. Leibniz also was the first to develop the binary system of arithmetic. He believed in the importance of good notation and, along with the familiar notation for derivatives and integrals, introduced a form of subscript notation for the coefficients of a linear system that is essentially the notation we use today.
Charles Lutwidge Dodgson (1832–1898) is much better known by his pen name, Lewis Carroll, under which he wrote Alice's Adventures in Wonderland and Through the Looking-Glass. He also wrote several mathematics books and collections of logic puzzles.
By the late 19th century, the theory of determinants had developed to the stage that entire books were devoted to it, including Dodgson's An Elementary Theory of Determinants in 1867 and Thomas Muir's monumental five-volume work, which appeared in the early 20th century. While their history is fascinating, today determinants are of theoretical more than practical interest. Cramer's Rule is a hopelessly inefficient method for solving systems of linear equations, and numerical methods have replaced any use of determinants in the computation of eigenvalues. Determinants are used, however, to give students an initial understanding of the characteristic polynomial (as in Sections 4.1 and 4.3).
Compute the determinants in Exercises 1–6 using cofactor expansion along the first row and along the first column.
0
3
1. 5
1
1
0
1
2
3.
 I
0
I
0
1
0
1
 I
2 3 3 I 1 2
5. 2 3
I
3
2
I 3
0
2.
1
1
1
0 2 1
4.
1 0
1 0
1
0
1
1
1
2 3 5 6
7. I 3
2 2 1 2 0 0
3
2
2
4
1
I 0
9.
a
1
I
8. 2
0  2
I 1
10.
0 0 15. 0
12. b 0
I 0
2 6 1 0 0 4 2 1
0 0
d
g h
0 b
,
,.
2
3
5
2 0
0 0 0
b
1
1
cosO
0 11. 0 a b a 0 b
13.
6. 4 7 8 9
3
1
1
Compute the determinants in Exercises 7–15 using cofactor expansion along any row or column that seems convenient.
4
a
, •
J
14.
1
0 2
sin O cos O
tanO sin O
sin O
cosO
a 0 d
, ,
0 0 0
3 2
 I
I
1
4
0
1
 3
2
In Exercises 16–18, compute the indicated 3×3 determinants using the method of Example 4.9.

16. The determinant in Exercise 6
17. The determinant in Exercise 8
18. The determinant in Exercise 11

Find the determinants in Exercises 35–40, assuming that

| a b c |
| d e f | = 4
| g h i |
2. 2b 2,
19. Verify that the method indicated in (2) agrees with equation (1) for a 3×3 determinant.
,
35. d
20. Verify that definition (4) agrees with the definition of a 2×2 determinant when n = 2.
21. Prove Theorem 4.2. (Hint: A proof by induction would be appropriate here.)
," 37. b , "g h ,. g
3.
/.
,
 b
36. 3d
,
3g
 /,
2,
2/ 2i
a + g b+h c + i 38. d / ,• h g
d
,
• , d
In Exercises 22–25, evaluate the given determinant using
39.
elementary row and/or column operations and Theorem 4.3 to reduce the matrix to row echelon form.
23. The determinant in Exercise 9
Exercise 14
In Exercises 26–34, use properties of determinants to evaluate the given determinant by inspection. Explain your reasoning.
26. 3
I 0
I 2
3 27. 0
2 2
2
o
0
I
28.05
2
3
4
]
29.
2 I
3  3
 4  2
\
5
2
I 0
3  2
164
5
4
I
o
2 0 0
32.
34.
o o o
0
I
I
0
0
0
I
\ 0
] 0
o ]
0
J
]
0 0
o
0
1
I
I
 3 0 0 33. 000
o
0
2f  3;
, •
42. Prove Theorem 4.3(f).
43. Prove Lemma 4.5.
44. Prove Theorem 4.7.
In Exercises 45 and 46, use Theorem 4.6 to find all values of k for which A is invertible.
k
k
4 31.  2
o o o
,
k
3
45. A=Ok + l l k  8 k ]
I 0  2 5 o 4
123 30.04\
10 0
b 2e  3h h
41. Prove Theorem 4.3(a).
46. A=
o
g
g
24. The determinant in Exercise 13
"
2i
a 40. 2d  3g
22. The determinant in Exercise 1
25. The determinant
2/
k!
o
k 2 k
0
k k
In Exercises 47–52, assume that A and B are n×n matrices with det A = 3 and det B = −2. Find the indicated determinants.
0 4
] 0
47. det(AB)
48. det(A²)
49. det(B⁻¹A)
50. det(2A)
51. det(3Bᵀ)
52. det(AAᵀ)
In Exercises 53–56, A and B are n×n matrices.

53. Prove that det(AB) = det(BA).
54. If B is invertible, prove that det(B⁻¹AB) = det(A).
55. If A is idempotent (that is, A² = A), find all possible values of det(A).
56. A square matrix A is called nilpotent if A^m = O for some m > 1. (The word nilpotent comes from the Latin nil, meaning "nothing," and potere, meaning
"to have power." A nilpotent matrix is thus one that becomes "nothing," that is, the zero matrix, when raised to some power.) Find all possible values of det(A) if A is nilpotent.
In Exercises 57–60, use Cramer's Rule to solve the given linear system.

57. x + y = 1
    x − y = 2

58. 2x − y = 5
    x + 3y = −1

59. 2x + y + 3z = 1
    y + z = 1

60. x + y − z = 1
    x + y + z = 2
    x − y = 3

In Exercises 61–64, use Theorem 4.12 to compute the inverse of the coefficient matrix for the given exercise.

61. Exercise 57        62. Exercise 58
63. Exercise 59        64. Exercise 60

65. If A is an invertible n×n matrix, show that adj A is also invertible and that

    (adj A)⁻¹ = (1/det A) A = adj(A⁻¹)

66. If A is an n×n matrix, prove that

    det(adj A) = (det A)^(n−1)

67. Verify that if r < s, then rows r and s of a matrix can be interchanged by performing 2(s − r) − 1 interchanges of adjacent rows.

68. Prove that the Laplace Expansion Theorem holds for column expansion along the jth column.

69. Let A be a square matrix that can be partitioned as

    A = | P  Q |
        | O  S |

    where P and S are square matrices. Such a matrix is said to be in block (upper) triangular form. Prove that

    det A = (det P)(det S)

    (Hint: Try a proof by induction on the number of rows of P.)

70. (a) Give an example to show that if A can be partitioned as

        A = | P  Q |
            | R  S |

    where P, Q, R, and S are all square, then it is not necessarily true that

    det A = (det P)(det S) − (det Q)(det R)

    (b) Assume that A is partitioned as in part (a) and that P is invertible. Let

        B = | P⁻¹     O |
            | −RP⁻¹   I |

    Compute det(BA) using Exercise 69 and use the result to show that

    det A = det P det(S − RP⁻¹Q)

    (The matrix S − RP⁻¹Q is called the Schur complement of P in A, after Issai Schur (1875–1941), who was born in Belarus but spent most of his life in Germany. He is known mainly for his fundamental work on the representation theory of groups, but he also worked in number theory, analysis, and other areas.)

    (c) Assume that A is partitioned as in part (a), that P is invertible, and that PR = RP. Prove that det A = det(PS − RQ).
Geometric Applications of Determinants

This exploration will reveal some of the amazing applications of determinants to geometry. In particular, we will see that determinants are closely related to area and volume formulas and can be used to produce the equations of lines, planes, and certain other curves. Most of these ideas arose when the theory of determinants was being developed as a subject in its own right.
The Cross Product

Recall from Exploration: The Cross Product in Chapter 1 that the cross product of u = [u1, u2, u3] and v = [v1, v2, v3] is the vector u × v defined by

u × v = [u2 v3 − u3 v2,  u3 v1 − u1 v3,  u1 v2 − u2 v1]

If we write this cross product as (u2 v3 − u3 v2) e1 − (u1 v3 − u3 v1) e2 + (u1 v2 − u2 v1) e3, where e1, e2, and e3 are the standard basis vectors, then we see that the form of this formula is

u × v = det | e1 u1 v1 |
            | e2 u2 v2 |
            | e3 u3 v3 |

if we expand along the first column. (This is not a proper determinant, of course, since e1, e2, and e3 are vectors, not scalars; however, it gives a useful way of remembering the somewhat awkward cross product formula. It also lets us use properties of determinants to verify some of the properties of the cross product.)

Now let's revisit some of the exercises from Chapter 1.
1. Use the determinant version of the cross product to compute u × v.

2. If u = [u1, u2, u3], v = [v1, v2, v3], and w = [w1, w2, w3], show that

u · (v × w) = det | u1 u2 u3 |
                  | v1 v2 v3 |
                  | w1 w2 w3 |
3. Use properties of determinants (and Problem 2 above, if necessary) to prove the given property of the cross product.

(a) v × u = −(u × v)
(b) u × 0 = 0
(c) u × u = 0
(d) u × kv = k(u × v)
(e) u × (v + w) = u × v + u × w
(f) u · (u × v) = 0 and v · (u × v) = 0
(g) u · (v × w) = (u × v) · w   (the triple scalar product identity)
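The determinant mnemonic for the cross product expands, along the first column, to the componentwise formula, which the sketch below implements; the vectors are my own choices, and property (f) of Problem 3 is checked numerically.

```python
def cross(u, v):
    # components read off from the cofactor expansion of the mnemonic
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u, v = [0, 1, 2], [3, -1, 1]
w = cross(u, v)
# property (f): u x v is orthogonal to both u and v
assert dot(u, w) == 0 and dot(v, w) == 0
print(w)   # [3, 6, -3]
```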
Area and Volume

We can now give a geometric interpretation of the determinants of 2×2 and 3×3 matrices. Recall that if u and v are vectors in R³, then the area A of the parallelogram determined by these vectors is given by A = ‖u × v‖. (See Exploration: The Cross Product in Chapter 1.)
4. Let u = [u1, u2] and v = [v1, v2]. Show that the area A of the parallelogram determined by u and v is given by the absolute value of the determinant

| u1 v1 |
| u2 v2 |

(Hint: Write u and v as [u1, u2, 0] and [v1, v2, 0] and compute ‖u × v‖.)

[Figure 4.9: the parallelogram determined by u = (a, b) and v = (c, d), inscribed in the rectangle with corners (0, 0) and (a + c, b + d)]

5. Derive the area formula in Problem 4 geometrically, using Figure 4.9 as a guide. (Hint: Subtract areas from the large rectangle until the parallelogram remains.) Where does the absolute value sign come from in this case?
[Figure 4.10: a parallelepiped with base determined by v and w, cross product v × w, and height h]

6. Find the area of the parallelogram determined by u and v.

Generalizing from Problems 4–6, consider a parallelepiped, a three-dimensional solid resembling a "slanted" brick, whose six faces are all parallelograms with opposite faces parallel and congruent (Figure 4.10). Its volume is given by the area of its base times its height.

7. Prove that the volume V of the parallelepiped determined by u, v, and w is given by the absolute value of the determinant of the 3×3 matrix [u v w] with u, v, and w as its columns. (Hint: From Figure 4.10 you can see that the height h can be expressed as h = ‖u‖ cos θ, where θ is the angle between u and v × w. Use this fact to show that V = |u · (v × w)| and apply the result of Problem 2.)

[Figure 4.11: a tetrahedron determined by u, v, and w]

8. Show that the volume V of the tetrahedron determined by u, v, and w (Figure 4.11) is given by

V = (1/6)|u · (v × w)|

(Hint: From geometry, we know that the volume of such a solid is V = (1/3)(area of the base)(height).)
Now let's view these geometric interpretations from a transformational point of view. Let A be a 2×2 matrix and let P be the parallelogram determined by the vectors u and v. We will consider the effect of the matrix transformation T_A on the area of P. Let T_A(P) denote the parallelogram determined by T_A(u) = Au and T_A(v) = Av.

9. Prove that the area of T_A(P) is given by |det A|(area of P).

10. Let A be a 3×3 matrix and let P be the parallelepiped determined by the vectors u, v, and w. Let T_A(P) denote the parallelepiped determined by T_A(u) = Au, T_A(v) = Av, and T_A(w) = Aw. Prove that the volume of T_A(P) is given by |det A|(volume of P).
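Problems 9 and 10 can be checked numerically. In this sketch (matrix and vectors my own), P is the unit square, so the area of T_A(P) should equal |det A|.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def apply(m, x):
    # the matrix transformation T_A(x) = Ax in R^2
    return [m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1]]

def parallelogram_area(u, v):
    # |u1*v2 - u2*v1|, the formula of Problem 4
    return abs(u[0] * v[1] - u[1] * v[0])

A = [[2, 1], [1, 3]]
u, v = [1, 0], [0, 1]                       # the unit square, area 1
area_before = parallelogram_area(u, v)
area_after = parallelogram_area(apply(A, u), apply(A, v))
print(area_before, area_after)   # 1 5  (and |det A| = 5)
```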
The preceding problems illustrate that the determinant of a matrix captures what the corresponding matrix transformation does to the area or volume of figures upon which the transformation acts. (Although we have considered only certain types of figures, the result is perfectly general and can be made rigorous. We will not do so here.)
Lines and Planes

Suppose we are given two distinct points (x1, y1) and (x2, y2) in the plane. There is a unique line passing through these points, and its equation is of the form

ax + by + c = 0

Since the two given points are on this line, their coordinates satisfy this equation. Thus,

ax1 + by1 + c = 0
ax2 + by2 + c = 0

The three equations together can be viewed as a system of linear equations in the variables a, b, and c. Since there is a nontrivial solution (i.e., the line exists), the coefficient matrix

| x  y  1 |
| x1 y1 1 |
| x2 y2 1 |

cannot be invertible, by the Fundamental Theorem of Invertible Matrices. Consequently, its determinant must be zero, by Theorem 4.6. Expanding this determinant gives the equation of the line.

The equation of the line through the points (x1, y1) and (x2, y2) is given by

| x  y  1 |
| x1 y1 1 | = 0
| x2 y2 1 |

11. Use the method described above to find the equation of the line through the given points.
(b) (1, 2) and (4, 3)

12. Prove that the three points (x1, y1), (x2, y2), and (x3, y3) are collinear (lie on the same line) if and only if

| x1 y1 1 |
| x2 y2 1 | = 0
| x3 y3 1 |
13. Show that the equation of the plane through the three noncollinear points (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3) is given by

| x  y  z  1 |
| x1 y1 z1 1 | = 0
| x2 y2 z2 1 |
| x3 y3 z3 1 |

What happens if the three points are collinear? (Hint: Explain what happens when row reduction is used to evaluate the determinant.)
14. Prove that the four points (x1, y1, z1), (x2, y2, z2), (x3, y3, z3), and (x4, y4, z4) are coplanar (lie in the same plane) if and only if

| x1 y1 z1 1 |
| x2 y2 z2 1 | = 0
| x3 y3 z3 1 |
| x4 y4 z4 1 |
Curve Fitting

When data arising from experimentation take the form of points (x, y) that can be plotted in the plane, it is often of interest to find a relationship between the variables x and y. Ideally, we would like to find a function whose graph passes through all of the points. Sometimes all we want is an approximation (see Section 7.3), but exact results are also possible in certain situations.

15. From Figure 4.12 it appears as though we may be able to find a parabola passing through the points A(−1, 10), B(0, 5), and C(3, 2). The equation of such a parabola is of the form y = a + bx + cx². By substituting the given points into this equation, set up a system of three linear equations in the variables a, b, and c. Without solving the system, use Theorem 4.6 to argue that it must have a unique solution. Then solve the system to find the equation of the parabola in Figure 4.12.

[Figure 4.12: the points A, B, and C and the parabola through them]

16. Use the method of Problem 15 to find the polynomials of degree at most 2 that pass through the following sets of points.
(a) A(1, 1), B(2, 4), C(3, 3)
(b) A(−1, −3), B(1, −1), C(3, 1)
17. Generalizing from Problems 15 and 16, suppose a1, a2, and a3 are distinct real numbers. For any real numbers b1, b2, and b3, we want to show that there is a unique quadratic with equation of the form y = a + bx + cx² passing through the points (a1, b1), (a2, b2), and (a3, b3). Do this by demonstrating that the coefficient matrix of the associated linear system has determinant

| 1 a1 a1² |
| 1 a2 a2² | = (a2 − a1)(a3 − a1)(a3 − a2)
| 1 a3 a3² |

which is necessarily nonzero. (Why?)

18. Let a1, a2, a3, and a4 be distinct real numbers. Show that

| 1 a1 a1² a1³ |
| 1 a2 a2² a2³ | = (a2 − a1)(a3 − a1)(a4 − a1)(a3 − a2)(a4 − a2)(a4 − a3) ≠ 0
| 1 a3 a3² a3³ |
| 1 a4 a4² a4³ |

For any real numbers b1, b2, b3, and b4, use this result to prove that there is a unique cubic with equation y = a + bx + cx² + dx³ passing through the four points (a1, b1), (a2, b2), (a3, b3), and (a4, b4). (Do not actually solve for a, b, c, and d.)
19. Let a1, a2, …, an be n real numbers. Prove that

| 1 a1 a1² ⋯ a1^(n−1) |
| 1 a2 a2² ⋯ a2^(n−1) |
| ⋮  ⋮   ⋮        ⋮    |  =  ∏_{1 ≤ i < j ≤ n} (aj − ai)
| 1 an an² ⋯ an^(n−1) |
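Problem 19's product formula (the Vandermonde determinant) can be verified numerically for a particular choice of numbers (mine, not the text's):

```python
from fractions import Fraction
from itertools import combinations

def det(rows):
    # determinant by row reduction with exact fractions
    a = [[Fraction(x) for x in r] for r in rows]
    n, sign = len(a), 1
    for j in range(n):
        p = next((i for i in range(j, n) if a[i][j] != 0), None)
        if p is None:
            return Fraction(0)
        if p != j:
            a[j], a[p] = a[p], a[j]
            sign = -sign
        for i in range(j + 1, n):
            m = a[i][j] / a[j][j]
            a[i] = [x - m * y for x, y in zip(a[i], a[j])]
    r = Fraction(sign)
    for j in range(n):
        r *= a[j][j]
    return r

nums = [2, 3, 5, 7]
# rows (1, a_i, a_i^2, a_i^3) for each a_i
V = [[ai ** k for k in range(len(nums))] for ai in nums]
prod = 1
for ai, aj in combinations(nums, 2):   # pairs with i < j
    prod *= aj - ai
print(det(V), prod)   # 240 240
```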
Section 4.3  Eigenvalues and Eigenvectors of n×n Matrices

Now that we have defined the determinant of an n×n matrix, we can continue our discussion of eigenvalues and eigenvectors in a general context. Recall from Section 4.1 that λ is an eigenvalue of A if and only if A − λI is noninvertible. By Theorem 4.6, this is true if and only if det(A − λI) = 0. To summarize:
The eigenvalues of a square matrix A are precisely the solutions λ of the equation

det(A − λI) = 0
When we expand det(A − λI), we get a polynomial in λ, called the characteristic polynomial of A. The equation det(A − λI) = 0 is called the characteristic equation of A. For example, if A = [ a  b ; c  d ], its characteristic polynomial is

    det(A − λI) = | a − λ     b   |  =  (a − λ)(d − λ) − bc  =  λ² − (a + d)λ + (ad − bc)
                  |   c     d − λ |
If A is n×n, its characteristic polynomial will be of degree n. According to the Fundamental Theorem of Algebra (see Appendix D), a polynomial of degree n with real or complex coefficients has at most n distinct roots. Applying this fact to the characteristic polynomial, we see that an n×n matrix with real or complex entries has at most n distinct eigenvalues.

Let's summarize the procedure we will follow (for now) to find the eigenvalues and eigenvectors (eigenspaces) of a matrix.
Let A be an n×n matrix.

1. Compute the characteristic polynomial det(A − λI) of A.
2. Find the eigenvalues of A by solving the characteristic equation det(A − λI) = 0 for λ.
3. For each eigenvalue λ, find the null space of the matrix A − λI. This is the eigenspace E_λ, the nonzero vectors of which are the eigenvectors of A corresponding to λ.
4. Find a basis for each eigenspace.
Example 4.18

Find the eigenvalues and the corresponding eigenspaces of

        [ 0    1   0 ]
    A = [ 0    0   1 ]
        [ 2   −5   4 ]
Chapter 4  Eigenvalues and Eigenvectors
Solution  We follow the procedure outlined above. The characteristic polynomial is

                  | −λ    1      0   |
    det(A − λI) = |  0   −λ      1   |
                  |  2   −5    4 − λ |

                = −λ | −λ     1   |  −  1 | 0     1   |
                     | −5   4 − λ |       | 2   4 − λ |

                = −λ(λ² − 4λ + 5) − (−2)
                = −λ³ + 4λ² − 5λ + 2

To find the eigenvalues, we need to solve the characteristic equation det(A − λI) = 0 for λ. The characteristic polynomial factors as −(λ − 1)²(λ − 2). (The Factor Theorem is helpful here; see Appendix D.) Thus, the characteristic equation is (λ − 1)²(λ − 2) = 0, which clearly has solutions λ = 1 and λ = 2. Since λ = 1 is a multiple root and λ = 2 is a simple root, let us label them λ₁ = λ₂ = 1 and λ₃ = 2. To find the eigenvectors corresponding to λ₁ = λ₂ = 1, we find the null space of
            [ −1    1   0 ]
    A − I = [  0   −1   1 ]
            [  2   −5   3 ]

Row reduction produces

                  [ −1    1   0   0 ]       [ 1   0   −1   0 ]
    [A − I | 0] = [  0   −1   1   0 ]  →    [ 0   1   −1   0 ]
                  [  2   −5   3   0 ]       [ 0   0    0   0 ]
(We knew in advance that we must get at least one zero row. Why?) Thus, x = [x₁, x₂, x₃]ᵀ is in the eigenspace E₁ if and only if x₁ − x₃ = 0 and x₂ − x₃ = 0. Setting the free variable x₃ = t, we see that x₁ = t and x₂ = t, from which it follows that

    E₁ = { t [1, 1, 1]ᵀ } = span( [1, 1, 1]ᵀ )
To find the eigenvectors corresponding to λ₃ = 2, we find the null space of A − 2I by row reduction:

                   [ −2    1   0   0 ]       [ 1   0   −1/4   0 ]
    [A − 2I | 0] = [  0   −2   1   0 ]  →    [ 0   1   −1/2   0 ]
                   [  2   −5   2   0 ]       [ 0   0     0    0 ]

So x = [x₁, x₂, x₃]ᵀ is in the eigenspace E₂ if and only if x₁ = ¼x₃ and x₂ = ½x₃. Setting the free variable x₃ = t, we have

    E₂ = { t [1/4, 1/2, 1]ᵀ } = span( [1/4, 1/2, 1]ᵀ ) = span( [1, 2, 4]ᵀ )
where we have cleared denominators in the basis by multiplying through by the least common denominator 4. (Why is this permissible?)
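As a quick numerical sanity check of this example (an illustration, not part of the text), one can confirm that the two basis vectors found above really satisfy Av = λv:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [0, 0, 1],
              [2, -5, 4]])   # the matrix of Example 4.18

v1 = np.array([1, 1, 1])     # basis vector for E_1
v3 = np.array([1, 2, 4])     # basis vector for E_2 (denominators cleared)

print(A @ v1)   # equals 1 * v1, i.e. [1 1 1]
print(A @ v3)   # equals 2 * v3, i.e. [2 4 8]
```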
Remark  Notice that in Example 4.18, A is a 3×3 matrix but has only two distinct eigenvalues. However, if we count multiplicities, A has exactly three eigenvalues (λ = 1 twice and λ = 2 once). This is what the Fundamental Theorem of Algebra guarantees. Let us define the algebraic multiplicity of an eigenvalue to be its multiplicity as a root of the characteristic equation. Thus, λ = 1 has algebraic multiplicity 2 and λ = 2 has algebraic multiplicity 1. Next notice that each eigenspace has a basis consisting of just one vector. In other words, dim E₁ = dim E₂ = 1. Let us define the geometric multiplicity of an eigenvalue λ to be dim E_λ, the dimension of its corresponding eigenspace. As you will see in Section 4.4, a comparison of these two notions of multiplicity is important.
Example 4.19

Find the eigenvalues and the corresponding eigenspaces of

        [ −1   0    1 ]
    A = [  3   0   −3 ]
        [  1   0   −1 ]

Solution  The characteristic equation is
                      | −1 − λ    0       1    |
    0 = det(A − λI) = |   3      −λ      −3    |
                      |   1       0    −1 − λ  |

                    = −λ | −1 − λ      1    |
                         |   1      −1 − λ  |

                    = −λ(λ² + 2λ) = −λ²(λ + 2)
Hence, the eigenvalues are λ₁ = λ₂ = 0 and λ₃ = −2. Thus, the eigenvalue 0 has algebraic multiplicity 2 and the eigenvalue −2 has algebraic multiplicity 1. For λ₁ = λ₂ = 0, we compute
                               [ −1   0    1   0 ]       [ 1   0   −1   0 ]
    [A − 0I | 0] = [A | 0] =   [  3   0   −3   0 ]  →    [ 0   0    0   0 ]
                               [  1   0   −1   0 ]       [ 0   0    0   0 ]
from which it follows that an eigenvector x = [x₁, x₂, x₃]ᵀ in E₀ satisfies x₁ = x₃. Therefore, both x₂ and x₃ are free. Setting x₂ = s and x₃ = t, we have

    E₀ = { s [0, 1, 0]ᵀ + t [1, 0, 1]ᵀ } = span( [0, 1, 0]ᵀ, [1, 0, 1]ᵀ )
For λ₃ = −2,

                                     [ 1   0    1   0 ]       [ 1   0    1   0 ]
    [A − (−2)I | 0] = [A + 2I | 0] = [ 3   2   −3   0 ]  →    [ 0   1   −3   0 ]
                                     [ 1   0    1   0 ]       [ 0   0    0   0 ]
so x₃ = t is free, x₁ = −x₃ = −t, and x₂ = 3x₃ = 3t. Consequently,

    E₋₂ = { t [−1, 3, 1]ᵀ } = span( [−1, 3, 1]ᵀ )
It follows that λ₁ = λ₂ = 0 has geometric multiplicity 2 and λ₃ = −2 has geometric multiplicity 1. (Note that the algebraic multiplicity equals the geometric multiplicity for each eigenvalue.)
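By the Rank Theorem, dim E_λ = n − rank(A − λI), so geometric multiplicities can also be checked by computer. Here is a sketch using NumPy (the helper function is an illustrative assumption, not part of the text), applied to the matrix of Example 4.19:

```python
import numpy as np

A = np.array([[-1, 0, 1],
              [3, 0, -3],
              [1, 0, -1]])   # the matrix of Example 4.19

def geometric_multiplicity(A, lam):
    # dim E_lam = n - rank(A - lam*I), by the Rank Theorem
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

print(geometric_multiplicity(A, 0))    # 2
print(geometric_multiplicity(A, -2))   # 1
```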
In some situations, the eigenvalues of a matrix are very easy to find. If A is a triangular matrix, then so is A − λI, and Theorem 4.2 says that det(A − λI) is just the product of the diagonal entries. This implies that the characteristic equation of a triangular matrix is

    (a₁₁ − λ)(a₂₂ − λ) ⋯ (aₙₙ − λ) = 0

from which it follows immediately that the eigenvalues are λ₁ = a₁₁, λ₂ = a₂₂, . . . , λₙ = aₙₙ. We summarize this result as a theorem and illustrate it with an example.
Theorem 4.15  The eigenvalues of a triangular matrix are the entries on its main diagonal.
Example 4.20

The eigenvalues of

        [  2   0   0    0 ]
    A = [ −1   1   0    0 ]
        [  3   0   3    0 ]
        [  5   7   4   −2 ]

are λ₁ = 2, λ₂ = 1, λ₃ = 3, λ₄ = −2, by Theorem 4.15. [Indeed, the characteristic polynomial is just (2 − λ)(1 − λ)(3 − λ)(−2 − λ).]
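Theorem 4.15 is easy to test numerically; the sketch below (an illustration using NumPy, not part of the text) confirms that the computed eigenvalues of the matrix in Example 4.20 are exactly its diagonal entries:

```python
import numpy as np

A = np.array([[2, 0, 0, 0],
              [-1, 1, 0, 0],
              [3, 0, 3, 0],
              [5, 7, 4, -2]])   # lower triangular: eigenvalues sit on the diagonal

eigenvalues = np.linalg.eigvals(A)
print(np.sort(eigenvalues.real).round(6))   # [-2.  1.  2.  3.]
```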
Note that diagonal matrices are a special case of Theorem 4.15. In fact, a diagonal matrix is both upper and lower triangular. Eigenvalues capture much important information about the behavior of a matrix. Once we know the eigenvalues of a matrix, we can deduce a great many things without doing any more work. The next theorem is one of the most important in this regard.
Theo". 4.16
A square matrix A is invertible if and only if 0 is
/lot
an eigenvalue of A.

.. •
Proof Let A be a square matrix. By Theorem 4.6, A is invertible if and only if
'*
'*
d el A O. But det A 0 IS equivalent to del (A  01) + 0, which says that 0 root of the characteristic equation of A (Le.,O is not an eigenvalue of A).
IS
not a
We can now extend the Fundamental Theorem of Invertible Matrices to include results we have proved in this chapter.
Theorem 4.17
The Fundamental Theorem of Invertible Matrices: Version 3

Let A be an n×n matrix. The following statements are equivalent:

a. A is invertible.
b. Ax = b has a unique solution for every b in ℝⁿ.
c. Ax = 0 has only the trivial solution.
d. The reduced row echelon form of A is Iₙ.
e. A is a product of elementary matrices.
f. rank(A) = n
g. nullity(A) = 0
h. The column vectors of A are linearly independent.
i. The column vectors of A span ℝⁿ.
j. The column vectors of A form a basis for ℝⁿ.
k. The row vectors of A are linearly independent.
l. The row vectors of A span ℝⁿ.
m. The row vectors of A form a basis for ℝⁿ.
n. det A ≠ 0
o. 0 is not an eigenvalue of A.
Proof  The equivalence (a) ⇔ (n) is Theorem 4.6, and we just proved (a) ⇔ (o) in Theorem 4.16.

There are nice formulas for the eigenvalues of the powers and inverses of a matrix.
Theorem 4.18

Let A be a square matrix with eigenvalue λ and corresponding eigenvector x.

a. For any positive integer n, λⁿ is an eigenvalue of Aⁿ with corresponding eigenvector x.
b. If A is invertible, then 1/λ is an eigenvalue of A⁻¹ with corresponding eigenvector x.
c. If A is invertible, then for any integer n, λⁿ is an eigenvalue of Aⁿ with corresponding eigenvector x.
Proof  We are given that Ax = λx.

(a) We proceed by induction on n. For n = 1, the result is just what has been given. Assume the result is true for n = k. That is, assume that, for some positive integer k, Aᵏx = λᵏx. We must now prove the result for n = k + 1. But

    Aᵏ⁺¹x = A(Aᵏx) = A(λᵏx)

by the induction hypothesis. Using property (d) of Theorem 3.3, we have

    A(λᵏx) = λᵏ(Ax) = λᵏ(λx) = λᵏ⁺¹x

Thus, Aᵏ⁺¹x = λᵏ⁺¹x, as required. By induction, the result is true for all integers n ≥ 1.

(b) You are asked to prove this property in Exercise 13.
(c) You are asked to prove this property in Exercise 14.
Example 4.21

Compute [ 0  1 ; 2  1 ]¹⁰ [ 5 ; 1 ].

Solution  Let A = [ 0  1 ; 2  1 ] and x = [ 5 ; 1 ]; then what we want to find is A¹⁰x. The eigenvalues of A are λ₁ = −1 and λ₂ = 2, with corresponding eigenvectors v₁ = [ 1 ; −1 ] and v₂ = [ 1 ; 2 ]. That is,

    Av₁ = −v₁   and   Av₂ = 2v₂

(Check this.) Since {v₁, v₂} forms a basis for ℝ² (why?), we can write x as a linear combination of v₁ and v₂. Indeed, as is easily checked, x = 3v₁ + 2v₂. Therefore, using Theorem 4.18(a), we have

    A¹⁰x = A¹⁰(3v₁ + 2v₂) = 3(A¹⁰v₁) + 2(A¹⁰v₂) = 3(λ₁¹⁰)v₁ + 2(λ₂¹⁰)v₂

         = 3(−1)¹⁰ [  1 ]  +  2(2¹⁰) [ 1 ]  =  [  3 + 2¹¹ ]  =  [ 2051 ]
                   [ −1 ]            [ 2 ]     [ −3 + 2¹² ]     [ 4093 ]

This is certainly a lot easier than computing A¹⁰ first; in fact, there are no matrix multiplications at all!
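The arithmetic of Example 4.21 can be confirmed by computer (an illustration, not part of the text): the eigenvalue shortcut and brute-force matrix powering give the same answer.

```python
import numpy as np

A = np.array([[0, 1],
              [2, 1]])
x = np.array([5, 1])

# Eigenvalue shortcut: x = 3*v1 + 2*v2, so A^10 x = 3*(-1)^10 v1 + 2*(2^10) v2
v1 = np.array([1, -1])
v2 = np.array([1, 2])
shortcut = 3 * (-1) ** 10 * v1 + 2 * 2 ** 10 * v2

# Brute force: compute A^10 and multiply
brute = np.linalg.matrix_power(A, 10) @ x

print(shortcut)   # [2051 4093]
print(brute)      # [2051 4093]
```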
When it can be used, the method of Example 4.21 is quite general. We summarize it as the following theorem, which you are asked to prove in Exercise 42.

Theorem 4.19

Suppose the n×n matrix A has eigenvectors v₁, v₂, . . . , vₘ with corresponding eigenvalues λ₁, λ₂, . . . , λₘ. If x is a vector in ℝⁿ that can be expressed as a linear combination of these eigenvectors, say,

    x = c₁v₁ + c₂v₂ + ⋯ + cₘvₘ

then, for any integer k,

    Aᵏx = c₁λ₁ᵏv₁ + c₂λ₂ᵏv₂ + ⋯ + cₘλₘᵏvₘ

Remark  The catch here is the "if" in the second sentence. There is absolutely no guarantee that such a linear combination is possible. The best possible situation would be if there were a basis of ℝⁿ consisting of eigenvectors of A; we will explore this possibility further in the next section. As a step in that direction, however, we have the following theorem, which states that eigenvectors corresponding to distinct eigenvalues are linearly independent.
Theorem 4.20

Let A be an n×n matrix and let λ₁, λ₂, . . . , λₘ be distinct eigenvalues of A with corresponding eigenvectors v₁, v₂, . . . , vₘ. Then v₁, v₂, . . . , vₘ are linearly independent.
Proof  The proof is indirect. We will assume that v₁, v₂, . . . , vₘ are linearly dependent and show that this assumption leads to a contradiction.

If v₁, v₂, . . . , vₘ are linearly dependent, then one of these vectors must be expressible as a linear combination of the previous ones. Let vₖ₊₁ be the first of the vectors vᵢ that can be so expressed. In other words, v₁, v₂, . . . , vₖ are linearly independent, but there are scalars c₁, c₂, . . . , cₖ such that

    vₖ₊₁ = c₁v₁ + c₂v₂ + ⋯ + cₖvₖ        (1)

Multiplying both sides of equation (1) by A from the left and using the fact that Avᵢ = λᵢvᵢ for each i, we have

    λₖ₊₁vₖ₊₁ = Avₖ₊₁ = A(c₁v₁ + c₂v₂ + ⋯ + cₖvₖ)
             = c₁Av₁ + c₂Av₂ + ⋯ + cₖAvₖ
             = c₁λ₁v₁ + c₂λ₂v₂ + ⋯ + cₖλₖvₖ        (2)

Now we multiply both sides of equation (1) by λₖ₊₁ to get

    λₖ₊₁vₖ₊₁ = c₁λₖ₊₁v₁ + c₂λₖ₊₁v₂ + ⋯ + cₖλₖ₊₁vₖ        (3)

When we subtract equation (3) from equation (2), we obtain

    0 = c₁(λ₁ − λₖ₊₁)v₁ + c₂(λ₂ − λₖ₊₁)v₂ + ⋯ + cₖ(λₖ − λₖ₊₁)vₖ

The linear independence of v₁, v₂, . . . , vₖ implies that

    c₁(λ₁ − λₖ₊₁) = c₂(λ₂ − λₖ₊₁) = ⋯ = cₖ(λₖ − λₖ₊₁) = 0

Since the eigenvalues λᵢ are all distinct, the terms in parentheses (λᵢ − λₖ₊₁), i = 1, . . . , k, are all nonzero. Hence, c₁ = c₂ = ⋯ = cₖ = 0. This implies that

    vₖ₊₁ = c₁v₁ + c₂v₂ + ⋯ + cₖvₖ = 0v₁ + 0v₂ + ⋯ + 0vₖ = 0

which is impossible, since the eigenvector vₖ₊₁ cannot be zero. Thus, we have a contradiction, which means that our assumption that v₁, v₂, . . . , vₘ are linearly dependent is false. It follows that v₁, v₂, . . . , vₘ must be linearly independent.
In Exercises 1–12, compute (a) the characteristic polynomial of A, (b) the eigenvalues of A, (c) a basis for each eigenspace of A, and (d) the algebraic and geometric multiplicity of each eigenvalue.
13. Prove Theorem 4.18(b).

14. Prove Theorem 4.18(c). [Hint: Combine the proofs of parts (a) and (b) and see the fourth Remark following Theorem 3.9 (p. 167).]

In Exercises 15 and 16, A is a 2×2 matrix with eigenvectors v₁ = [1, −1]ᵀ and v₂ = [1, 1]ᵀ corresponding to eigenvalues λ₁ = 1/2 and λ₂ = 2, respectively, and x = [ · , · ]ᵀ.

15. Find A¹⁰x.

16. Find Aᵏx. What happens as k becomes large (i.e., k → ∞)?

In Exercises 17 and 18, A is a 3×3 matrix with eigenvectors v₁ = [1, 0, 0]ᵀ, v₂ = [1, 1, 0]ᵀ, and v₃ = [1, 1, 1]ᵀ corresponding to eigenvalues λ₁ = −1/3, λ₂ = 1/3, and λ₃ = 1, respectively, and x = [ · , · , · ]ᵀ.

17. Find A¹⁰x.

18. Find Aᵏx. What happens as k becomes large (i.e., k → ∞)?

19. (a) Show that, for any square matrix A, Aᵀ and A have the same characteristic polynomial and hence the same eigenvalues.
(b) Give an example of a 2×2 matrix A for which Aᵀ and A have different eigenspaces.

20. Let A be a nilpotent matrix (that is, Aᵐ = O for some m > 1). Show that λ = 0 is the only eigenvalue of A.

21. Let A be an idempotent matrix (that is, A² = A). Show that λ = 0 and λ = 1 are the only possible eigenvalues of A.

22. If v is an eigenvector of A with corresponding eigenvalue λ and c is a scalar, show that v is an eigenvector of A − cI with corresponding eigenvalue λ − c.

23. (a) Find the eigenvalues and eigenspaces of

    A = [ ·  · ]
        [ ·  · ]

(b) Using Theorem 4.18 and Exercise 22, find the eigenvalues and eigenspaces of A⁻¹, A − 2I, and A + 2I.

24. Let A and B be n×n matrices with eigenvalues λ and μ, respectively.
(a) Give an example to show that λ + μ need not be an eigenvalue of A + B.
(b) Give an example to show that λμ need not be an eigenvalue of AB.
(c) Suppose λ and μ correspond to the same eigenvector x. Show that, in this case, λ + μ is an eigenvalue of A + B and λμ is an eigenvalue of AB.

25. If A and B are two row equivalent matrices, do they necessarily have the same eigenvalues? Either prove that they do or give a counterexample.

Let p(x) be the polynomial

    p(x) = xⁿ + a_{n−1}xⁿ⁻¹ + ⋯ + a₁x + a₀

The companion matrix of p(x) is the n×n matrix

           [ −a_{n−1}  −a_{n−2}  ⋯  −a₁  −a₀ ]
           [     1         0     ⋯    0    0 ]
    C(p) = [     0         1     ⋯    0    0 ]        (4)
           [     ⋮         ⋮          ⋮    ⋮ ]
           [     0         0     ⋯    1    0 ]

26. Find the companion matrix of p(x) = x² − 7x + 12 and then find the characteristic polynomial of C(p).

27. Find the companion matrix of p(x) = x³ + 3x² − 4x + 12 and then find the characteristic polynomial of C(p).

28. (a) Show that the companion matrix C(p) of p(x) = x² + ax + b has characteristic polynomial λ² + aλ + b.
(b) Show that if λ is an eigenvalue of the companion matrix C(p) in part (a), then [λ, 1]ᵀ is an eigenvector of C(p) corresponding to λ.

29. (a) Show that the companion matrix C(p) of p(x) = x³ + ax² + bx + c has characteristic polynomial −(λ³ + aλ² + bλ + c).
(b) Show that if λ is an eigenvalue of the companion matrix C(p) in part (a), then [λ², λ, 1]ᵀ is an eigenvector of C(p) corresponding to λ.
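A companion matrix is mechanical to build from equation (4). The sketch below (an illustration using NumPy, not part of the text; `companion` is an illustrative helper name) constructs C(p) for the polynomial p(x) = x² − 7x + 12 of Exercise 26 and checks that its characteristic polynomial recovers p:

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of x^n + a_{n-1}x^{n-1} + ... + a_1 x + a_0,
    where coeffs = [a_{n-1}, ..., a_1, a_0], following equation (4)."""
    n = len(coeffs)
    C = np.zeros((n, n))
    C[0, :] = -np.array(coeffs, dtype=float)   # first row: -a_{n-1}, ..., -a_0
    C[1:, :-1] = np.eye(n - 1)                 # subdiagonal of 1s
    return C

C = companion([-7, 12])                        # p(x) = x^2 - 7x + 12
print(np.poly(C).round(6))                     # [ 1. -7. 12.]  (coefficients of det(xI - C))
print(np.sort(np.linalg.eigvals(C).real).round(6))   # [3. 4.]  (the roots of p)
```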
30. Construct a nontriangular 2×2 matrix with eigenvalues 2 and 5. [Hint: Use Exercise 28.]

31. Construct a nontriangular 3×3 matrix with eigenvalues −2, 1, and 3. [Hint: Use Exercise 29.]

32. (a) Use mathematical induction to prove that, for n ≥ 2, the companion matrix C(p) of p(x) = xⁿ + a_{n−1}xⁿ⁻¹ + ⋯ + a₁x + a₀ has characteristic polynomial (−1)ⁿp(λ). [Hint: Expand by cofactors along the last column. You may find it helpful to introduce the polynomial q(x) = (p(x) − a₀)/x.]
(b) Show that if λ is an eigenvalue of the companion matrix C(p) in equation (4), then an eigenvector corresponding to λ is given by [λⁿ⁻¹, λⁿ⁻², . . . , λ, 1]ᵀ.

33. Verify the Cayley-Hamilton Theorem for

    A = [ ·  · ]
        [ ·  · ]

That is, find the characteristic polynomial c_A(λ) of A and show that c_A(A) = O.

34. Verify the Cayley-Hamilton Theorem for

    A = [ 1  1  0 ]
        [ ·  ·  · ]
        [ ·  ·  · ]

If p(x) = xⁿ + a_{n−1}xⁿ⁻¹ + ⋯ + a₁x + a₀ and A is a square matrix, we can define a square matrix p(A) by

    p(A) = Aⁿ + a_{n−1}Aⁿ⁻¹ + ⋯ + a₁A + a₀I

An important theorem in advanced linear algebra says that if c_A(λ) is the characteristic polynomial of the matrix A, then c_A(A) = O (in words, every matrix satisfies its characteristic equation). This is the celebrated Cayley-Hamilton Theorem, named after Arthur Cayley (1821–1895) and Sir William Rowan Hamilton (see page 2). Cayley proved this theorem in 1858. Hamilton discovered it, independently, in his work on quaternions, a generalization of the complex numbers.

The Cayley-Hamilton Theorem can be used to calculate powers and inverses of matrices. For example, if A is a 2×2 matrix with characteristic polynomial c_A(λ) = λ² + aλ + b, then A² + aA + bI = O, so

    A² = −aA − bI

and

    A³ = A·A² = A(−aA − bI) = −aA² − bA = −a(−aA − bI) − bA = (a² − b)A + abI

It is easy to see that by continuing in this fashion we can express any positive power of A as a linear combination of I and A. From A² + aA + bI = O, we also obtain A(A + aI) = −bI, so

    A⁻¹ = −(1/b)A − (a/b)I

provided b ≠ 0.
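The formulas in this margin note are easy to test numerically. The sketch below (an illustration, not part of the text; the sample matrix is an arbitrary choice, not one of the exercise matrices) checks A² = −aA − bI and A⁻¹ = −(1/b)A − (a/b)I:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # sample 2x2 matrix (an assumption, not from the text)

# For a 2x2 matrix, c_A(lambda) = lambda^2 + a*lambda + b
# with a = -tr(A) and b = det(A).
a = -np.trace(A)                # here a = -5
b = np.linalg.det(A)            # here b = -2
I = np.eye(2)

# Cayley-Hamilton: A^2 + aA + bI = O, so A^2 = -aA - bI
print(np.allclose(A @ A, -a * A - b * I))       # True

# and A(A + aI) = -bI gives A^{-1} = -(1/b)A - (a/b)I, provided b != 0
A_inv = -(1 / b) * A - (a / b) * I
print(np.allclose(A_inv, np.linalg.inv(A)))     # True
```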
35. For the matrix A in Exercise 33, use the Cayley-Hamilton Theorem to compute A², A³, and A⁴ by expressing each as a linear combination of I and A.

36. For the matrix A in Exercise 34, use the Cayley-Hamilton Theorem to compute A³ and A⁴ by expressing each as a linear combination of I, A, and A².

37. For the matrix A in Exercise 33, use the Cayley-Hamilton Theorem to compute A⁻¹ and A⁻² by expressing each as a linear combination of I and A.

38. For the matrix A in Exercise 34, use the Cayley-Hamilton Theorem to compute A⁻¹ and A⁻² by expressing each as a linear combination of I, A, and A².

39. Show that if the square matrix A can be partitioned as

    A = [ P  Q ]
        [ O  S ]

where P and S are square, then the eigenvalues of A are the eigenvalues of P together with the eigenvalues of S.
40. Let λ₁, λ₂, . . . , λₙ be a complete set of eigenvalues (repetitions included) of the n×n matrix A. Prove that

    det(A) = λ₁λ₂ ⋯ λₙ   and   tr(A) = λ₁ + λ₂ + ⋯ + λₙ

[Hint: The characteristic polynomial of A factors as

    det(A − λI) = (−1)ⁿ(λ − λ₁)(λ − λ₂) ⋯ (λ − λₙ)

Find the constant term and the coefficient of λⁿ⁻¹ on the left and right sides of this equation.]

41. Let A and B be n×n matrices. Prove that the sum of all the eigenvalues of A + B is the sum of all the eigenvalues of A and B individually. Prove that the product of all the eigenvalues of AB is the product of all the eigenvalues of A and B individually. (Compare this exercise with Exercise 24.)

42. Prove Theorem 4.19.
Similarity and Diagonalization

As you saw in the last section, triangular and diagonal matrices are nice in the sense that their eigenvalues are transparently displayed. It would be pleasant if we could relate a given square matrix to a triangular or diagonal one in such a way that they had exactly the same eigenvalues. Of course, we already know one procedure for converting a square matrix into triangular form, namely, Gaussian elimination. Unfortunately, this process does not preserve the eigenvalues of the matrix. In this section, we consider a different sort of transformation of a matrix that does behave well with respect to eigenvalues.
Similar Matrices

Definition  Let A and B be n×n matrices. We say that A is similar to B if there is an invertible n×n matrix P such that P⁻¹AP = B. If A is similar to B, we write A ∼ B.

Remarks
• If A ∼ B, we can write, equivalently, that A = PBP⁻¹ or AP = PB.
• Similarity is a relation on square matrices in the same sense that "less than or equal to" is a relation on the integers. Note that there is a direction (or order) implicit in the definition. Just as a ≤ b does not necessarily imply b ≤ a, we should not assume that A ∼ B implies B ∼ A. (In fact, this is true, as we will prove in the next theorem, but it does not follow immediately from the definition.)
• The matrix P depends on A and B. It is not unique for a given pair of similar matrices A and B. To see this, simply take A = B = I, in which case I ∼ I, since P⁻¹IP = I for any invertible matrix P.
Example 4.22

Let A = [ 1  2 ; 0  −1 ] and B = [ 1  0 ; −2  −1 ]. Then A ∼ B, since

    [ 1    2 ] [  2    1 ]   [ 0  −1 ]   [  2    1 ] [  1    0 ]
    [ 0   −1 ] [ −1   −1 ] = [ 1   1 ] = [ −1   −1 ] [ −2   −1 ]

Thus, AP = PB with P = [ 2  1 ; −1  −1 ]. (Note that it is not necessary to compute P⁻¹. See the first Remark above.)
Theorem 4.21

Let A, B, and C be n×n matrices.
a. A ∼ A.
b. If A ∼ B, then B ∼ A.
c. If A ∼ B and B ∼ C, then A ∼ C.

Proof  (a) This property follows from the fact that I⁻¹AI = A.
(b) If A ∼ B, then P⁻¹AP = B for some invertible matrix P. As noted in the first Remark above, this is equivalent to PBP⁻¹ = A. Setting Q = P⁻¹, we have Q⁻¹BQ = (P⁻¹)⁻¹BP⁻¹ = PBP⁻¹ = A. Therefore, by definition, B ∼ A.
(c) You are asked to prove property (c) in Exercise 30.
Remark  Any relation satisfying the three properties of Theorem 4.21 is called an equivalence relation. Equivalence relations arise frequently in mathematics, and objects that are related via an equivalence relation usually share important properties. We are about to see that this is true of similar matrices.
Theorem 4.22

Let A and B be n×n matrices with A ∼ B. Then

a. det A = det B
b. A is invertible if and only if B is invertible.
c. A and B have the same rank.
d. A and B have the same characteristic polynomial.
e. A and B have the same eigenvalues.

Proof  We prove (a) and (d) and leave the remaining properties as exercises. If A ∼ B, then P⁻¹AP = B for some invertible matrix P.

(a) Taking determinants of both sides, we have

    det B = det(P⁻¹AP) = (det P⁻¹)(det A)(det P) = (1/det P)(det A)(det P) = det A

(d) The characteristic polynomial of B is

    det(B − λI) = det(P⁻¹AP − λI)
                = det(P⁻¹AP − P⁻¹(λI)P)
                = det(P⁻¹(A − λI)P)
                = det(A − λI)

with the last step following as in (a). Thus, det(B − λI) = det(A − λI); that is, the characteristic polynomials of B and A are the same.
Remark  Two matrices may have properties (a) through (e) (and more) in common and yet still not be similar. For example, A = [ 1  0 ; 0  1 ] and B = [ 1  1 ; 0  1 ] both have determinant 1 and rank 2, are invertible, and have characteristic polynomial (1 − λ)² and eigenvalues λ₁ = λ₂ = 1. But A is not similar to B, since P⁻¹AP = P⁻¹IP = I ≠ B for any invertible matrix P.

Theorem 4.22 is most useful in showing that two matrices are not similar, since A and B cannot be similar if any of properties (a) through (e) fails.
Example 4.23

(a) A = [ 1  2 ; 2  1 ] and B = [ 2  1 ; 1  2 ] are not similar, since det A = −3 but det B = 3.

(b) A = [ 1  2 ; 3  2 ] and B = [ 2  1 ; 0  −2 ] are not similar, since the characteristic polynomial of A is λ² − 3λ − 4 while that of B is λ² − 4. (Check this.) Note that A and B do have the same determinant and rank, however.
Diagonalization

The best possible situation is when a square matrix is similar to a diagonal matrix. As you are about to see, whether a matrix is diagonalizable is closely related to the eigenvalues and eigenvectors of the matrix.

Definition  An n×n matrix A is diagonalizable if there is a diagonal matrix D such that A is similar to D; that is, if there is an invertible n×n matrix P such that P⁻¹AP = D.
Example 4.24

A = [ 1  2 ; 3  2 ] is diagonalizable, since if P = [ 2  1 ; 3  −1 ] and D = [ 4  0 ; 0  −1 ], then P⁻¹AP = D, as can be easily checked. (Actually, it is faster to check the equivalent statement AP = PD, since it does not require finding P⁻¹.)
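A diagonalization claim of this kind is quickest to verify via AP = PD, exactly as suggested. A numerical sketch (an illustration, not part of the text) for a 2×2 matrix with eigenvalues 4 and −1:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 2]])     # a 2x2 matrix with eigenvalues 4 and -1 (illustrative)
P = np.array([[2, 1],
              [3, -1]])    # columns are eigenvectors for 4 and -1
D = np.diag([4, -1])

# Checking AP = PD avoids computing P^{-1} ...
print((A @ P == P @ D).all())                     # True
# ... but P^{-1} A P = D holds as well:
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # True
```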
Example 4.24 begs the question of where the matrices P and D came from. Observe that the diagonal entries 4 and −1 of D are the eigenvalues of A, since they are the roots of its characteristic polynomial, which we found in Example 4.23(b). The origin of matrix P is less obvious, but, as we are about to demonstrate, its entries are obtained from the eigenvectors of A. Theorem 4.23 makes this connection precise.
Theorem 4.23

Let A be an n×n matrix. Then A is diagonalizable if and only if A has n linearly independent eigenvectors.

More precisely, there exist an invertible matrix P and a diagonal matrix D such that P⁻¹AP = D if and only if the columns of P are n linearly independent eigenvectors of A and the diagonal entries of D are the eigenvalues of A corresponding to the eigenvectors in P in the same order.
Proof  Suppose first that A is similar to the diagonal matrix D via P⁻¹AP = D or, equivalently, AP = PD. Let the columns of P be p₁, p₂, . . . , pₙ and let the diagonal entries of D be λ₁, λ₂, . . . , λₙ. Then

                                    [ λ₁   0   ⋯    0 ]
    A[p₁ p₂ ⋯ pₙ] = [p₁ p₂ ⋯ pₙ]    [  0  λ₂   ⋯    0 ]        (1)
                                    [  ⋮   ⋮         ⋮ ]
                                    [  0   0   ⋯   λₙ ]

or

    [Ap₁  Ap₂  ⋯  Apₙ] = [λ₁p₁  λ₂p₂  ⋯  λₙpₙ]        (2)

where the right-hand side is just the column-row representation of the product PD. Equating columns, we have

    Ap₁ = λ₁p₁,  Ap₂ = λ₂p₂,  . . . ,  Apₙ = λₙpₙ

which proves that the column vectors of P are eigenvectors of A whose corresponding eigenvalues are the diagonal entries of D in the same order. Since P is invertible, its columns are linearly independent, by the Fundamental Theorem of Invertible Matrices.

Conversely, if A has n linearly independent eigenvectors p₁, p₂, . . . , pₙ with corresponding eigenvalues λ₁, λ₂, . . . , λₙ, respectively, then

    Ap₁ = λ₁p₁,  Ap₂ = λ₂p₂,  . . . ,  Apₙ = λₙpₙ

This implies equation (2) above, which is equivalent to equation (1). Consequently, if we take P to be the n×n matrix with columns p₁, p₂, . . . , pₙ, then equation (1) becomes AP = PD. Since the columns of P are linearly independent, the Fundamental Theorem of Invertible Matrices implies that P is invertible, so P⁻¹AP = D; that is, A is diagonalizable.
Example 4.25

If possible, find a matrix P that diagonalizes

        [ 0    1   0 ]
    A = [ 0    0   1 ]
        [ 2   −5   4 ]

Solution  We studied this matrix in Example 4.18, where we discovered that it has eigenvalues λ₁ = λ₂ = 1 and λ₃ = 2. The eigenspaces have the following bases:

    For λ₁ = λ₂ = 1, E₁ has basis [1, 1, 1]ᵀ.
    For λ₃ = 2, E₂ has basis [1, 2, 4]ᵀ.

Since all other eigenvectors are just multiples of one of these two basis vectors, there cannot be three linearly independent eigenvectors. By Theorem 4.23, therefore, A is not diagonalizable.
Example 4.26

If possible, find a matrix P that diagonalizes

        [ −1   0    1 ]
    A = [  3   0   −3 ]
        [  1   0   −1 ]
Solution  This is the matrix of Example 4.19. There, we found that the eigenvalues of A are λ₁ = λ₂ = 0 and λ₃ = −2, with the following bases for the eigenspaces:

    For λ₁ = λ₂ = 0, E₀ has basis p₁ = [0, 1, 0]ᵀ and p₂ = [1, 0, 1]ᵀ.
    For λ₃ = −2, E₋₂ has basis p₃ = [−1, 3, 1]ᵀ.

It is straightforward to check that these three vectors are linearly independent. Thus, if we take

                       [ 0   1   −1 ]
    P = [p₁ p₂ p₃] =   [ 1   0    3 ]
                       [ 0   1    1 ]

then P is invertible. Furthermore,

              [ 0   0    0 ]
    P⁻¹AP =   [ 0   0    0 ]  = D
              [ 0   0   −2 ]

as can be easily checked. (If you are checking by hand, it is much easier to check the equivalent equation AP = PD.)
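A numerical check of this diagonalization (an illustration, not part of the text), again via the easier equation AP = PD:

```python
import numpy as np

A = np.array([[-1, 0, 1],
              [3, 0, -3],
              [1, 0, -1]])
P = np.array([[0, 1, -1],
              [1, 0, 3],
              [0, 1, 1]])      # columns p1, p2, p3 from Example 4.26
D = np.diag([0, 0, -2])

print((A @ P == P @ D).all())  # True: AP = PD, hence P^{-1}AP = D
```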
Remarks
• When there are enough eigenvectors, they can be placed into the columns of P in any order. However, the eigenvalues will come up on the diagonal of D in the same order as their corresponding eigenvectors in P. For example, if we had chosen

                       [ 0   −1   1 ]
    P = [p₁ p₃ p₂] =   [ 1    3   0 ]
                       [ 0    1   1 ]

then we would have found

              [ 0    0   0 ]
    P⁻¹AP =   [ 0   −2   0 ]
              [ 0    0   0 ]

• In Example 4.26, you were asked to check that the eigenvectors p₁, p₂, and p₃ were linearly independent. Was it necessary to check this? We knew that {p₁, p₂} was linearly independent, since it was a basis for the eigenspace E₀. We also knew that the sets {p₁, p₃} and {p₂, p₃} were linearly independent, by Theorem 4.20. But we could not conclude from this information that {p₁, p₂, p₃} was linearly independent. The next theorem, however, guarantees that linear independence is preserved when the bases of different eigenspaces are combined.
Theorem 4.24

Let A be an n×n matrix and let λ₁, λ₂, . . . , λₖ be distinct eigenvalues of A. If Bᵢ is a basis for the eigenspace E_{λᵢ}, then B = B₁ ∪ B₂ ∪ ⋯ ∪ Bₖ (i.e., the total collection of basis vectors for all of the eigenspaces) is linearly independent.
Proof  Let Bᵢ = {v_{i1}, v_{i2}, . . . , v_{i nᵢ}} for i = 1, . . . , k. We have to show that

    B = {v_{11}, . . . , v_{1 n₁}, v_{21}, . . . , v_{2 n₂}, . . . , v_{k1}, . . . , v_{k nₖ}}

is linearly independent. Suppose some nontrivial linear combination of these vectors is the zero vector, say,

    (c_{11}v_{11} + ⋯ + c_{1 n₁}v_{1 n₁}) + (c_{21}v_{21} + ⋯ + c_{2 n₂}v_{2 n₂}) + ⋯ + (c_{k1}v_{k1} + ⋯ + c_{k nₖ}v_{k nₖ}) = 0        (3)

Denoting the sums in parentheses by x₁, x₂, . . . , xₖ, we can write equation (3) as

    x₁ + x₂ + ⋯ + xₖ = 0        (4)
Now each xᵢ is in E_{λᵢ} (why?) and so either is an eigenvector corresponding to λᵢ or is 0. But, since the eigenvalues λᵢ are distinct, those xᵢ that are eigenvectors are linearly independent, by Theorem 4.20. Yet equation (4) would then be a linear dependence relationship; this is a contradiction. Hence each xᵢ = 0, and since each Bᵢ is linearly independent, equation (3) must be trivial; that is, all of its coefficients are zero. Therefore, B is linearly independent.

There is one case in which diagonalizability is automatic: an n×n matrix with n distinct eigenvalues.

Theorem 4.25

If A is an n×n matrix with n distinct eigenvalues, then A is diagonalizable.
Proof  Let v₁, v₂, . . . , vₙ be eigenvectors corresponding to the n distinct eigenvalues of A. (Why could there not be more than n such eigenvectors?) By Theorem 4.20, v₁, v₂, . . . , vₙ are linearly independent, so, by Theorem 4.23, A is diagonalizable.
Example 4.27

The matrix

        [ 2   3    7 ]
    A = [ 0   5    1 ]
        [ 0   0   −1 ]

has eigenvalues λ₁ = 2, λ₂ = 5, and λ₃ = −1, by Theorem 4.15. Since these are three distinct eigenvalues for a 3×3 matrix, A is diagonalizable, by Theorem 4.25. (If we actually require a matrix P such that P⁻¹AP is diagonal, we must still compute bases for the eigenspaces, as in Example 4.19 and Example 4.26 above.)
The final theorem of this section is an important result that characterizes diagonalizable matrices in terms of the two notions of multiplicity that were introduced following Example 4.18. It gives precise conditions under which an n×n matrix can be diagonalized, even when it has fewer than n eigenvalues, as in Example 4.26. We first prove a lemma that holds whether or not a matrix is diagonalizable.
Lemma 4.26

If A is an n×n matrix, then the geometric multiplicity of each eigenvalue is less than or equal to its algebraic multiplicity.
Proof  Suppose λ₁ is an eigenvalue of A with geometric multiplicity p; that is, dim E_{λ₁} = p. Specifically, let E_{λ₁} have basis B₁ = {v₁, v₂, . . . , vₚ}. Let Q be any invertible n×n matrix having v₁, v₂, . . . , vₚ as its first p columns, say,

    Q = [v₁ ⋯ vₚ  vₚ₊₁ ⋯ vₙ]

or, as a partitioned matrix,

    Q = [U | V]

Let

    Q⁻¹ = [ C ]
          [ D ]

where C is p×n. Since the columns of U are eigenvectors corresponding to λ₁, AU = λ₁U. We also have

    [ C ] [U | V] = Q⁻¹Q = I
    [ D ]

from which we obtain CU = Iₚ, CV = O, DU = O, and DV = I_{n−p}. Therefore,

    Q⁻¹AQ = [ C ] A[U | V] = [ CAU  CAV ] = [ λ₁CU  CAV ] = [ λ₁Iₚ  CAV ]
            [ D ]            [ DAU  DAV ]   [ λ₁DU  DAV ]   [   O   DAV ]

By Exercise 69 in Section 4.2, it follows that

    det(Q⁻¹AQ − λI) = (λ₁ − λ)ᵖ det(DAV − λI_{n−p})        (5)

But det(Q⁻¹AQ − λI) is the characteristic polynomial of Q⁻¹AQ, which is the same as the characteristic polynomial of A, by Theorem 4.22(d). Thus, equation (5) implies that the algebraic multiplicity of λ₁ is at least p, its geometric multiplicity.
Theorem 4.27

The Diagonalization Theorem

Let A be an n×n matrix whose distinct eigenvalues are λ₁, λ₂, . . . , λₖ. The following statements are equivalent:

a. A is diagonalizable.
b. The union B of the bases of the eigenspaces of A (as in Theorem 4.24) contains n vectors.
c. The algebraic multiplicity of each eigenvalue equals its geometric multiplicity.
Proof  (a) ⇒ (b) If A is diagonalizable, then it has n linearly independent eigenvectors, by Theorem 4.23. If nᵢ of these eigenvectors correspond to the eigenvalue λᵢ, then Bᵢ contains at least nᵢ vectors. (We already know that these nᵢ vectors are linearly independent; the only thing that might prevent them from being a basis for E_{λᵢ} is that they might not span it.) Thus, B contains at least n vectors. But, by Theorem 4.24, B is a linearly independent set in ℝⁿ; hence, it contains exactly n vectors.
Section 4.4  Similarity and Diagonalization
(b) ⇒ (c) Let the geometric multiplicity of λ_i be d_i = dim E_λᵢ and let the algebraic multiplicity of λ_i be m_i. By Lemma 4.26, d_i ≤ m_i for i = 1, ..., k. Now assume that property (b) holds. Then we also have
d₁ + d₂ + ··· + d_k = n
But m₁ + m₂ + ··· + m_k = n, since the sum of the algebraic multiplicities of the eigenvalues of A is just the degree of the characteristic polynomial of A, namely, n. It follows that d₁ + d₂ + ··· + d_k = m₁ + m₂ + ··· + m_k, which implies that
(m₁ − d₁) + (m₂ − d₂) + ··· + (m_k − d_k) = 0     (6)
Using Lemma 4.26 again, we know that m_i − d_i ≥ 0 for i = 1, ..., k, from which we can deduce that each summand in equation (6) is zero; that is, m_i = d_i for i = 1, ..., k.
(c) ⇒ (a) If the algebraic multiplicity m_i and the geometric multiplicity d_i are equal for each eigenvalue λ_i of A, then B has d₁ + d₂ + ··· + d_k = m₁ + m₂ + ··· + m_k = n vectors, which are linearly independent, by Theorem 4.24. Thus, these are n linearly independent eigenvectors of A, and A is diagonalizable, by Theorem 4.23.
Example 4.28
(a) The matrix A = [0 1 0; 0 0 1; 2 −5 4] from Example 4.18 has two distinct eigenvalues, λ₁ = λ₂ = 1 and λ₃ = 2. Since the eigenvalue λ₁ = λ₂ = 1 has algebraic multiplicity 2 but geometric multiplicity 1, A is not diagonalizable, by the Diagonalization Theorem. (See also Example 4.25.)
(b) The matrix A = [−1 0 1; 3 0 −3; 1 0 −1] from Example 4.19 also has two distinct eigenvalues, λ₁ = λ₂ = 0 and λ₃ = −2. The eigenvalue 0 has algebraic and geometric multiplicity 2, and the eigenvalue −2 has algebraic and geometric multiplicity 1. Thus, this matrix is diagonalizable, by the Diagonalization Theorem. (This agrees with our findings in Example 4.26.)
We conclude this section with an application of diagonalization to the computation of the powers of a matrix.
Example 4.29
Compute A¹⁰ if A = [0 1; 2 1].

Solution  In Example 4.21, we found that this matrix has eigenvalues λ₁ = −1 and λ₂ = 2, with corresponding eigenvectors v₁ = [1; −1] and v₂ = [1; 2]. It follows (from any one of a number of theorems in this section) that A is diagonalizable and P⁻¹AP = D, where
P = [v₁ v₂] = [1 1; −1 2]   and   D = [−1 0; 0 2]
Solving for A, we have A = PDP⁻¹, which makes it easy to find powers of A. We compute
A² = (PDP⁻¹)(PDP⁻¹) = PD(P⁻¹P)DP⁻¹ = PDIDP⁻¹ = PD²P⁻¹
and, generally, Aⁿ = PDⁿP⁻¹ for all n ≥ 1. (You should verify this by induction. Observe that this fact will be true for any diagonalizable matrix, not just the one in this example.) Since
Dⁿ = [−1 0; 0 2]ⁿ = [(−1)ⁿ 0; 0 2ⁿ]
we have
Aⁿ = PDⁿP⁻¹ = [1 1; −1 2][(−1)ⁿ 0; 0 2ⁿ][1 1; −1 2]⁻¹
   = (1/3)[2(−1)ⁿ + 2ⁿ   (−1)ⁿ⁺¹ + 2ⁿ;  2(−1)ⁿ⁺¹ + 2ⁿ⁺¹   (−1)ⁿ⁺² + 2ⁿ⁺¹]
Since we were only asked for A¹⁰, this is more than we needed. But now we can simply set n = 10 to find
A¹⁰ = (1/3)[2(−1)¹⁰ + 2¹⁰   (−1)¹¹ + 2¹⁰;  2(−1)¹¹ + 2¹¹   (−1)¹² + 2¹¹] = [342 341; 682 683]
Exercises 4.4

In Exercises 1–4, show that A and B are not similar matrices.

In Exercises 5–7, a diagonalization of the matrix A is given in the form P⁻¹AP = D. List the eigenvalues of A and bases for the corresponding eigenspaces.

In Exercises 8–15, determine whether A is diagonalizable and, if so, find an invertible matrix P and a diagonal matrix D such that P⁻¹AP = D.

In Exercises 16–23, use the method of Example 4.29 to compute the indicated power of the matrix.

In Exercises 24–29, find all (real) values of k for which A is diagonalizable.

30. Prove Theorem 4.21(c).
31. Prove Theorem 4.22(b).
32. Prove Theorem 4.22(c).
33. Prove Theorem 4.22(e).
34. If A and B are invertible matrices, show that AB and BA are similar.
35. Prove that if A and B are similar matrices, then tr(A) = tr(B). (Hint: Find a way to use Exercise 45 from Section 3.2.)

In general, it is difficult to show that two matrices are similar. However, if two similar matrices are diagonalizable, the task becomes easier. In Exercises 36–39, show that A and B are similar by showing that they are similar to the same diagonal matrix. Then find an invertible matrix P such that P⁻¹AP = B.

40. Prove that if A is similar to B, then Aᵀ is similar to Bᵀ.
41. Prove that if A is diagonalizable, so is Aᵀ.
42. Let A be an invertible matrix. Prove that if A is diagonalizable, so is A⁻¹.
43. Prove that if A is a diagonalizable matrix with only one eigenvalue λ, then A is of the form A = λI. (Such a matrix is called a scalar matrix.)
44. Let A and B be n×n matrices, each with n distinct eigenvalues. Prove that A and B have the same eigenvectors if and only if AB = BA.
45. Let A and B be similar matrices. Prove that the algebraic multiplicities of the eigenvalues of A and B are the same.
46. Let A and B be similar matrices. Prove that the geometric multiplicities of the eigenvalues of A and B are the same. (Hint: Show that, if B = P⁻¹AP, then every eigenvector of B is of the form P⁻¹v for some eigenvector v of A.)
47. Prove that if A is a diagonalizable matrix such that every eigenvalue of A is either 0 or 1, then A is idempotent (that is, A² = A).
48. Let A be a nilpotent matrix (that is, Aᵐ = O for some m > 1). Prove that if A is diagonalizable, then A must be the zero matrix.
49. Suppose that A is a 6×6 matrix with characteristic polynomial
c_A(λ) = (1 + λ)(1 − λ)²(2 − λ)³
(a) Prove that it is not possible to find three linearly independent vectors v₁, v₂, v₃ in R⁶ such that Av₁ = v₁, Av₂ = v₂, and Av₃ = v₃.
(b) If A is diagonalizable, what are the dimensions of the eigenspaces E₋₁, E₁, and E₂?
50. Let A = [a b; c d].
(a) Prove that A is diagonalizable if (a − d)² + 4bc > 0 and is not diagonalizable if (a − d)² + 4bc < 0.
(b) Find two examples to demonstrate that if (a − d)² + 4bc = 0, then A may or may not be diagonalizable.
lIeralive Melhods for CompuUng Eigenvalues
In 1824, the Norwegian mathematician Niels Henrik Abel (1802–1829) proved that the general fifth-degree (quintic) polynomial equation is not solvable by radicals: that is, there is no formula for its roots in terms of its coefficients that uses only the operations of addition, subtraction, multiplication, division, and taking nth roots. In a paper written in 1830 and published posthumously in 1846, the French mathematician Evariste Galois (1811–1832) gave a more complete theory that established conditions under which an arbitrary polynomial equation can be solved by radicals. Galois's work was instrumental in establishing the branch of algebra called group theory; his approach to polynomial equations is now known as Galois theory.
At this point, the only method we have for computing the eigenvalues of a matrix is to solve the characteristic equation. However, there are several problems with this method that render it impractical in all but small examples. The first problem is that it depends on the computation of a determinant, which is a very time-consuming process for large matrices. The second problem is that the characteristic equation is a polynomial equation, and there are no formulas for solving polynomial equations of degree higher than 4 (polynomials of degrees 2, 3, and 4 can be solved using the quadratic formula and its analogues). Thus, we are forced to approximate eigenvalues in most practical problems. Unfortunately, methods for approximating the roots of a polynomial are quite sensitive to roundoff error and are therefore unreliable. Instead, we bypass the characteristic polynomial altogether and take a different approach, approximating an eigenvector first and then using this eigenvector to find the corresponding eigenvalue. In this section, we will explore several variations on one such method that is based on a simple iterative technique.
The Power Method
The power method applies to an n×n matrix that has a dominant eigenvalue λ₁; that is, an eigenvalue that is larger in absolute value than all of the other eigenvalues. For example, if a matrix has eigenvalues −4, −3, 1, and 3, then −4 is the dominant eigenvalue, since 4 = |−4| > |−3| ≥ |3| ≥ |1|. On the other hand, a matrix with eigenvalues −4, −3, 3, and 4 has no dominant eigenvalue. The power method proceeds iteratively to produce a sequence of scalars that converges to λ₁ and a sequence of vectors that converges to the corresponding eigenvector v₁, the dominant eigenvector. For simplicity, we will assume that the matrix A is diagonalizable. The following theorem is the basis for the power method.
Theorem 4.28
Let A be an n×n diagonalizable matrix with dominant eigenvalue λ₁. Then there exists a nonzero vector x₀ such that the sequence of vectors x_k defined by
x₁ = Ax₀, x₂ = Ax₁, x₃ = Ax₂, ..., x_k = Ax_{k−1}, ...
approaches a dominant eigenvector of A.
Proof
We may assume that the eigenvalues of A have been labeled so that
|λ₁| > |λ₂| ≥ |λ₃| ≥ ··· ≥ |λₙ|
Let v₁, v₂, ..., vₙ be the corresponding eigenvectors. Since v₁, v₂, ..., vₙ are linearly independent (why?), they form a basis for Rⁿ. Consequently, we can write x₀ as a linear combination of these eigenvectors, say,
x₀ = c₁v₁ + c₂v₂ + ··· + cₙvₙ
Now x₁ = Ax₀, x₂ = Ax₁ = A(Ax₀) = A²x₀, x₃ = Ax₂ = A(A²x₀) = A³x₀, and, generally, x_k = A^k x₀ for k ≥ 1. As we saw in Example 4.21,
A^k x₀ = c₁λ₁^k v₁ + c₂λ₂^k v₂ + ··· + cₙλₙ^k vₙ
       = λ₁^k (c₁v₁ + c₂(λ₂/λ₁)^k v₂ + ··· + cₙ(λₙ/λ₁)^k vₙ)     (1)
where we have used the fact that λ₁ ≠ 0. The fact that λ₁ is the dominant eigenvalue means that each of the fractions λ₂/λ₁, λ₃/λ₁, ..., λₙ/λ₁ is less than 1 in absolute value. Thus,
(λ₂/λ₁)^k, (λ₃/λ₁)^k, ..., (λₙ/λ₁)^k
all go to zero as k → ∞. It follows that
x_k = A^k x₀ ≈ λ₁^k c₁v₁  as k → ∞     (2)
Now, since λ₁ ≠ 0 and v₁ ≠ 0, x_k is approaching a nonzero multiple of v₁ (that is, an eigenvector corresponding to λ₁) provided c₁ ≠ 0. (This is the required condition on the initial vector x₀: It must have a nonzero component c₁ in the direction of the dominant eigenvector v₁.)
Example 4.30
Approximate the dominant eigenvector of A = [1 1; 2 0] using the method of Theorem 4.28.

Solution  We will take x₀ = [1; 0] as the initial vector. Then
x₁ = Ax₀ = [1 1; 2 0][1; 0] = [1; 2]
x₂ = Ax₁ = [1 1; 2 0][1; 2] = [3; 2]
We continue in this fashion to obtain the values of x_k in Table 4.1.
Table 4.1
k     0       1       2       3       4        5        6        7        8
x_k   [1;0]   [1;2]   [3;2]   [5;6]   [11;10]  [21;22]  [43;42]  [85;86]  [171;170]
r_k           0.50    1.50    0.83    1.10     0.95     1.02     0.99     1.01
l_k           1.00    3.00    1.67    2.20     1.91     2.05     1.98     2.01
Figure 4.13
Figure 4.13 shows what is happening geometrically. We know that the eigenspace for the dominant eigenvector will have dimension 1. (Why? See Exercise 46.) Therefore, it is a line through the origin in R². The first few iterates x_k are shown along with the directions they determine. It appears as though the iterates are converging on the line whose direction vector is [1; 1]. To confirm that this is the dominant eigenvector we seek, we need only observe that the ratio r_k of the first to the second component of x_k gets very close to 1 as k increases. The second line in the body of Table 4.1 gives these values, and you can see clearly that r_k is indeed approaching 1. We deduce that a dominant eigenvector of A is [1; 1].

Once we have found a dominant eigenvector, how can we find the corresponding dominant eigenvalue? One approach is to observe that if x_k is approximately a dominant eigenvector of A for the dominant eigenvalue λ₁, then
x_{k+1} = Ax_k ≈ λ₁x_k
It follows that the ratio l_k of the first component of x_{k+1} to that of x_k will approach λ₁ as k increases. The third line of Table 4.1 gives these values, and you can see that they are approaching 2, which is the dominant eigenvalue.
There is a drawback to the method of Example 4.30: The components of the iterates x_k get very large very quickly and can cause significant roundoff errors. To avoid this drawback, we can multiply each iterate by some scalar that reduces the magnitude of its components. Since scalar multiples of the iterates x_k will still converge to a dominant eigenvector, this approach is acceptable. There are various ways to accomplish it. One is to normalize each x_k by dividing it by ||x_k|| (i.e., to make each iterate a unit vector). An easier method (and the one we will use) is to divide each x_k by the component with the maximum absolute value, so that the largest component is now 1. This method is called scaling. Thus, if m_k denotes the component of x_k with the maximum absolute value, we will replace x_k by y_k = (1/m_k)x_k.

We illustrate this approach with the calculations from Example 4.30. For x₀, there is nothing to do, since m₀ = 1. Hence,
y₀ = x₀ = [1; 0]
We then compute x₁ = [1; 2] as before, but now we scale with m₁ = 2 to get
y₁ = (1/2)[1; 2] = [0.5; 1]
Now the calculations change. We take
x₂ = Ay₁ = [1 1; 2 0][0.5; 1] = [1.5; 1]
and scale to get
y₂ = (1/1.5)[1.5; 1] = [1; 0.67]
The next few calculations are summarized in Table 4.2. You can now see clearly that the sequence of vectors y_k is converging to [1; 1], a dominant eigenvector. Moreover, the sequence of scalars m_k converges to the corresponding dominant eigenvalue λ₁ = 2.
Table 4.2
k     0      1        2         3         4            5         6            7         8
x_k   [1;0]  [1;2]    [1.5;1]   [1.67;2]  [1.83;1.67]  [1.91;2]  [1.95;1.91]  [1.98;2]  [1.99;1.98]
y_k   [1;0]  [0.5;1]  [1;0.67]  [0.83;1]  [1;0.91]     [0.95;1]  [1;0.98]     [0.99;1]  [1;0.99]
m_k          2        1.5       2         1.83         2         1.95         2         1.99
This method, called the power method, is summarized below.
The Power Method
Let A be a diagonalizable n×n matrix with a corresponding dominant eigenvalue λ₁.
1. Let x₀ = y₀ be any initial vector in Rⁿ whose largest component is 1.
2. Repeat the following steps for k = 1, 2, ...:
   (a) Compute x_k = Ay_{k−1}.
   (b) Let m_k be the component of x_k with the largest absolute value.
   (c) Set y_k = (1/m_k)x_k.
For most choices of x₀, m_k converges to the dominant eigenvalue λ₁ and y_k converges to a dominant eigenvector.
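The boxed algorithm translates directly into a few lines of code. A minimal sketch, assuming NumPy (the code is ours, not the text's), run on the matrix of Example 4.30:

```python
import numpy as np

# The power method with scaling, as summarized above (NumPy assumed).
def power_method(A, x0, num_iters=50):
    y = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        x = A @ y                        # step 2(a): x_k = A y_{k-1}
        m = x[np.argmax(np.abs(x))]      # step 2(b): component of largest absolute value
        y = x / m                        # step 2(c): scale so the largest component is 1
    return m, y                          # m -> lambda_1, y -> dominant eigenvector

A = np.array([[1.0, 1.0],
              [2.0, 0.0]])               # the matrix of Example 4.30
m, y = power_method(A, [1.0, 0.0])
print(round(m, 4), np.round(y, 4))       # 2.0 [1. 1.]
```

Note that m is the signed entry of largest magnitude, not its absolute value; this is what lets the method recover a negative dominant eigenvalue.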
Example 4.31
Use the power method to approximate the dominant eigenvalue and a dominant eigenvector of
A = [0 5 −6; −4 12 −12; −2 −2 10]

Solution  Taking as our initial vector
x₀ = [1; 1; 1]
we compute the entries in Table 4.3. You can see that the vectors y_k are approaching
[0.50; 1; −0.50]
and the scalars m_k are approaching 16. This suggests that they are, respectively, a dominant eigenvector and the dominant eigenvalue of A.

Table 4.3
k     0        1                2                     3                   4                   5                   6                   7
x_k   [1;1;1]  [−1;−4;6]        [−9.33;−19.33;11.67]  [8.62;17.31;−9.00]  [8.12;16.25;−8.20]  [8.03;16.05;−8.04]  [8.01;16.01;−8.01]  [8.00;16.00;−8.00]
y_k   [1;1;1]  [−0.17;−0.67;1]  [0.48;1;−0.60]        [0.50;1;−0.52]      [0.50;1;−0.50]      [0.50;1;−0.50]      [0.50;1;−0.50]      [0.50;1;−0.50]
m_k            6                −19.33                17.31               16.25               16.05               16.01               16.00

Remarks
• If the initial vector x₀ has a zero component in the direction of the dominant eigenvector v₁ (i.e., if c₁ = 0 in the proof of Theorem 4.28), then the power method will not converge to a dominant eigenvector. However, it is quite likely that during the calculation of the subsequent iterates, at some point roundoff error will produce an x_k with a nonzero component in the direction of v₁. The power method will then start to converge to a multiple of v₁. (This is one instance where roundoff errors actually help!)
• The power method still works when there is a repeated dominant eigenvalue, and even when the matrix is not diagonalizable, under certain conditions. Details may be found in most modern textbooks on numerical analysis. (See Exercises 21–24.)
• For some matrices the power method converges rapidly to a dominant eigenvector, while for others the convergence may be quite slow. A careful look at the proof of Theorem 4.28 reveals why. Since |λ₂/λ₁| ≥ |λ₃/λ₁| ≥ ··· ≥ |λₙ/λ₁|, if |λ₂/λ₁| is close to zero, then (λ₂/λ₁)^k, ..., (λₙ/λ₁)^k will all approach zero rapidly. Equation (2) then shows that x_k = A^k x₀ will approach λ₁^k c₁v₁ rapidly too. As an illustration, consider Example 4.31. The eigenvalues are 16, 4, and 2, so λ₂/λ₁ = 4/16 = 0.25. Since 0.25⁷ ≈ 0.00006, by the seventh iteration we should have close to four-decimal-place accuracy. This is exactly what we saw.
• There is an alternative way to estimate the dominant eigenvalue λ₁ of a matrix A in conjunction with the power method. First, observe that if Ax = λ₁x, then
((Ax)·x)/(x·x) = (λ₁x·x)/(x·x) = λ₁
The expression R(x) = ((Ax)·x)/(x·x) is called a Rayleigh quotient. As we compute the iterates x_k, the successive Rayleigh quotients R(x_k) should approach λ₁. In fact, for symmetric matrices, the Rayleigh quotient method is about twice as fast as the scaling factor method. (See Exercises 17–20.)

John William Strutt (1842–1919), Baron Rayleigh, was a British physicist who made major contributions to the fields of acoustics and optics. In 1871, he gave the first correct explanation of why the sky is blue, and in 1895, he discovered the inert gas argon, for which discovery he received the Nobel Prize in 1904. Rayleigh was president of the Royal Society from 1905 to 1908 and became chancellor of Cambridge University in 1908. He used Rayleigh quotients in an 1873 paper on vibrating systems and later in his book The Theory of Sound.
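The faster convergence of the Rayleigh quotient on a symmetric matrix can be seen in a few iterations. A sketch (ours, not the text's; NumPy and the 2×2 test matrix are assumptions) that prints both estimates side by side:

```python
import numpy as np

# Comparing the scaling estimate m_k with the Rayleigh quotient
# R(x) = ((Ax).x)/(x.x) on a symmetric matrix (NumPy assumed; code is ours).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # symmetric, with eigenvalues 3 and 1

y = np.array([1.0, 0.0])
for k in range(1, 6):
    x = A @ y
    m = x[np.argmax(np.abs(x))]          # ordinary scaling estimate
    R = (A @ x) @ x / (x @ x)            # Rayleigh quotient of the iterate
    y = x / m
    print(k, round(m, 4), round(R, 4))   # R homes in on 3 much faster than m
```

For this matrix the error in m shrinks by a factor of about λ₂/λ₁ = 1/3 per step, while the error in R shrinks by about (λ₂/λ₁)², matching the "twice as fast" remark above.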
The Shifted Power Method and the Inverse Power Method
The power method can help us approximate the dominant eigenvalue of a matrix, but what should we do if we want the other eigenvalues? Fortunately, there are several variations of the power method that can be applied. The shifted power method uses the observation that, if λ is an eigenvalue of A, then λ − α is an eigenvalue of A − αI for any scalar α (Exercise 22 in Section 4.3). Thus, if λ₁ is the dominant eigenvalue of A, the eigenvalues of A − λ₁I will be 0, λ₂ − λ₁, λ₃ − λ₁, ..., λₙ − λ₁. We can then apply the power method to compute λ₂ − λ₁, and from this value we can find λ₂. Repeating this process will allow us to compute all of the eigenvalues.
Example 4.32
Use the shifted power method to compute the second eigenvalue of the matrix A = [1 1; 2 0] from Example 4.30.

Solution  In Example 4.30, we found that λ₁ = 2. To find λ₂, we apply the power method to
A − 2I = [−1 1; 2 −2]
We take x₀ = [1; 0], but other choices will also work. The calculations are summarized in Table 4.4.
Table 4.4
k     0      1         2         3         4
x_k   [1;0]  [−1;2]    [1.5;−3]  [1.5;−3]  [1.5;−3]
y_k   [1;0]  [−0.5;1]  [−0.5;1]  [−0.5;1]  [−0.5;1]
m_k          2         −3        −3        −3

Our choice of x₀ has produced the eigenvalue −3 after only two iterations. Therefore, λ₂ − λ₁ = −3, so λ₂ = λ₁ − 3 = 2 − 3 = −1 is the second eigenvalue of A.
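Example 4.32 can be reproduced by running the power-method loop on the shifted matrix. A sketch, assuming NumPy (the code is ours, not the text's):

```python
import numpy as np

# The shifted power method of Example 4.32: run the power method on
# A - lambda1*I and undo the shift (NumPy assumed; code is ours).
A = np.array([[1.0, 1.0],
              [2.0, 0.0]])
lam1 = 2.0                               # dominant eigenvalue, from Example 4.30

B = A - lam1 * np.eye(2)                 # eigenvalues of B are 0 and lambda2 - lambda1
y = np.array([1.0, 0.0])
for _ in range(25):
    x = B @ y
    m = x[np.argmax(np.abs(x))]
    y = x / m

lam2 = lam1 + m                          # undo the shift
print(m, lam2)                           # -3.0 -1.0
```

As in the text, the iteration locks onto −3 after two steps, because the other eigenvalue of the shifted matrix is exactly 0.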
Recall from property (b) of Theorem 4.18 that if A is invertible with eigenvalue λ, then A⁻¹ has eigenvalue 1/λ. Therefore, if we apply the power method to A⁻¹, its dominant eigenvalue will be the reciprocal of the smallest (in magnitude) eigenvalue of A. To use this inverse power method, we follow the same steps as in the power method, except that in step 2(a) we compute x_k = A⁻¹y_{k−1}. (In practice, we don't actually compute A⁻¹ explicitly; instead, we solve the equivalent equation Ax_k = y_{k−1} for x_k using Gaussian elimination. This turns out to be faster.)
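The solve-instead-of-invert advice translates directly into code. A sketch, assuming NumPy (the code is ours, not the text's), on the matrix of Example 4.30:

```python
import numpy as np

# The inverse power method, solving A x_k = y_{k-1} at each step rather
# than forming A^{-1}, as the text recommends (NumPy assumed; code is ours).
A = np.array([[1.0, 1.0],
              [2.0, 0.0]])

y = np.array([1.0, 0.0])
for _ in range(60):
    x = np.linalg.solve(A, y)            # x_k = A^{-1} y_{k-1}, by elimination
    m = x[np.argmax(np.abs(x))]
    y = x / m

# m approximates the dominant eigenvalue of A^{-1}; its reciprocal is the
# eigenvalue of A that is smallest in magnitude.
print(round(1 / m, 6))                   # -1.0
```

In a serious implementation one would factor A once (for example with scipy.linalg.lu_factor and lu_solve) and reuse the factorization across iterations, since the matrix never changes.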
Example 4.33
Use the inverse power method to compute the second eigenvalue of the matrix A = [1 1; 2 0] from Example 4.30.

Solution  We start, as in Example 4.30, with x₀ = y₀ = [1; 0]. To solve Ax₁ = y₀, we use row reduction:
[A | y₀] = [1 1 | 1; 2 0 | 0] → [1 0 | 0; 0 1 | 1]
Hence x₁ = [0; 1], whose largest component is already 1, so y₁ = x₁. Solving Ax₂ = y₁ the same way,
[A | y₁] = [1 1 | 0; 2 0 | 1] → [1 0 | 0.5; 0 1 | −0.5]
Hence, x₂ = [0.5; −0.5] and, by scaling, we get y₂ = [1; −1]. Continuing, we get the values shown in Table 4.5, where the values m_k are converging to −1. Thus, the smallest eigenvalue of A is the reciprocal of −1 (which is also −1). This agrees with our previous finding in Example 4.32.
Table 4.5
k     0      1      2           3           4            5           6            7            8            9
x_k   [1;0]  [0;1]  [0.5;−0.5]  [−0.5;1.5]  [0.5;−0.83]  [0.5;−1.1]  [0.5;−0.95]  [0.5;−1.02]  [0.5;−0.99]  [0.5;−1.01]
y_k   [1;0]  [0;1]  [1;−1]      [−0.33;1]   [−0.6;1]     [−0.45;1]   [−0.52;1]    [−0.49;1]    [−0.51;1]    [−0.50;1]
m_k          1      0.5         1.5         −0.83        −1.1        −0.95        −1.02        −0.99        −1.01
The Shifted Inverse Power Method
The most versatile of the variants of the power method is one that combines the two just mentioned. It can be used to find an approximation for any eigenvalue, provided we have a close approximation to that eigenvalue. In other words, if a scalar α is given, the shifted inverse power method will find the eigenvalue λ of A that is closest to α. If λ is an eigenvalue of A and α ≠ λ, then A − αI is invertible if α is not an eigenvalue of A, and 1/(λ − α) is an eigenvalue of (A − αI)⁻¹. (See Exercise 45.) If α is close to λ, then 1/(λ − α) will be a dominant eigenvalue of (A − αI)⁻¹. In fact, if α is very close to λ, then 1/(λ − α) will be much bigger in magnitude than the next eigenvalue, so (as noted in the third Remark following Example 4.31) the convergence will be very rapid.
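Combining the shift with the solve step gives the method in a few lines. A sketch, assuming NumPy (the code is ours, not the text's), using the matrix and shift of the example that follows:

```python
import numpy as np

# The shifted inverse power method with alpha = 5, on the matrix of
# Example 4.34 (NumPy assumed; code is ours).
A = np.array([[ 0.0,  5.0,  -6.0],
              [-4.0, 12.0, -12.0],
              [-2.0, -2.0,  10.0]])
alpha = 5.0

B = A - alpha * np.eye(3)
y = np.ones(3)
for _ in range(60):
    x = np.linalg.solve(B, y)            # x_k = (A - alpha*I)^{-1} y_{k-1}
    m = x[np.argmax(np.abs(x))]
    y = x / m

lam = alpha + 1 / m                      # m -> 1/(lambda - alpha), so lambda = alpha + 1/m
print(round(lam, 6))                     # 4.0
```

Choosing α near the target eigenvalue makes the dominance ratio of (A − αI)⁻¹ small, which is exactly why the convergence is so fast.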
Example 4.34
Use the shifted inverse power method to approximate the eigenvalue of
A = [0 5 −6; −4 12 −12; −2 −2 10]
that is closest to 5.

Solution  Shifting, we have
A − 5I = [−5 5 −6; −4 7 −12; −2 −2 5]
Now we apply the inverse power method with
x₀ = y₀ = [1; 1; 1]
We solve (A − 5I)x₁ = y₀ for x₁:
[A − 5I | y₀] = [−5 5 −6 | 1; −4 7 −12 | 1; −2 −2 5 | 1] → [1 0 0 | −0.61; 0 1 0 | −0.88; 0 0 1 | −0.39]
This gives
x₁ = [−0.61; −0.88; −0.39],  m₁ = −0.88,  and  y₁ = (1/m₁)x₁ = (1/(−0.88))[−0.61; −0.88; −0.39] = [0.69; 1.00; 0.45]
We continue in this fashion to obtain the values in Table 4.6, from which we deduce that the eigenvalue of A closest to 5 is approximately 5 + 1/m₇ ≈ 5 + 1/(−1) = 4, which, in fact, is exact.

Table 4.6
k     0        1                    2                    3                    4                    5                    6                    7
x_k   [1;1;1]  [−0.61;−0.88;−0.39]  [−0.41;−0.69;−0.35]  [−0.47;−0.89;−0.44]  [−0.49;−0.95;−0.48]  [−0.50;−0.98;−0.49]  [−0.50;−0.99;−0.50]  [−0.50;−1.00;−0.50]
y_k   [1;1;1]  [0.69;1.00;0.45]     [0.59;1.00;0.51]     [0.53;1.00;0.50]     [0.51;1.00;0.50]     [0.50;1.00;0.50]     [0.50;1.00;0.50]     [0.50;1.00;0.50]
m_k            −0.88                −0.69                −0.89                −0.95                −0.98                −0.99                −1.00
The power method and its variants represent only one approach to the computation of eigenvalues. In Chapter 5, we will discuss another method based on the QR factorization of a matrix. For a more complete treatment of this topic, you can consult almost any textbook on numerical methods.
We owe this theorem to the Russian mathematician S. Gerschgorin, who stated it in 1931. It did not receive much attention until 1949, when it was resurrected by Olga Taussky-Todd in a note she published in the American Mathematical Monthly.
Gerschgorin's Theorem
In this section, we have discussed several variations on the power method for approximating the eigenvalues of a matrix. All of these methods are iterative, and the speed with which they converge depends on the choice of initial vector. If only we had some "inside information" about the location of the eigenvalues of a given matrix, then we could make a judicious choice of the initial vector and perhaps speed up the convergence of the iterative process. Fortunately, there is a way to estimate the location of the eigenvalues of any matrix. Gerschgorin's Disk Theorem states that the eigenvalues of a (real or complex) n×n matrix all lie inside the union of n circular disks in the complex plane.
Definition  Let A = [a_ij] be a (real or complex) n×n matrix, and let r_i denote the sum of the absolute values of the off-diagonal entries in the ith row of A; that is, r_i = Σ_{j≠i} |a_ij|. The ith Gerschgorin disk is the circular disk D_i in the complex plane with center a_ii and radius r_i. That is,
D_i = {z in C : |z − a_ii| ≤ r_i}
Olga Taussky-Todd (1906–1995) was born in Olmütz in the Austro-Hungarian Empire (now Olomouc in the Czech Republic). She received her doctorate in number theory from the University of Vienna in 1930. During World War II, she worked for the National Physical Laboratory in London, where she investigated the problem of flutter in the wings of supersonic aircraft. Although the problem involved differential equations, the stability of an aircraft depended on the eigenvalues of a related matrix. Taussky-Todd remembered Gerschgorin's Theorem from her graduate studies in Vienna and was able to use it to simplify the otherwise laborious computations needed to determine the eigenvalues relevant to the flutter problem. Taussky-Todd moved to the United States in 1947, and ten years later she became the first woman appointed to the California Institute of Technology. In her career, she produced over 200 publications and received numerous awards. She was instrumental in the development of the branch of mathematics now known as matrix theory.
Example 4.35
Sketch the Gerschgorin disks and the eigenvalues for the following matrices:
(a) A = [2 1; 2 −3]   (b) A = [1 3i; 2i 3]

Solution  (a) The two Gerschgorin disks are centered at 2 and −3 with radii 1 and 2, respectively. The characteristic polynomial of A is λ² + λ − 8, so the eigenvalues are
λ = (−1 ± √(1² − 4(−8)))/2 ≈ 2.37, −3.37
Figure 4.14 shows that the eigenvalues are contained within the two Gerschgorin disks.
(b) The two Gerschgorin disks are centered at 1 and 3 with radii |3i| = 3 and |2i| = 2, respectively. The characteristic polynomial of A is λ² − 4λ + 9, so the eigenvalues are
λ = (4 ± √((−4)² − 4(9)))/2 = 2 ± i√5 ≈ 2 + 2.24i, 2 − 2.24i
Figure 4.15 plots the location of the eigenvalues relative to the Gerschgorin disks.
Figure 4.14
Figure 4.15
As Example 4.35 suggests, the eigenvalues of a matrix are contained within its Gerschgorin disks. The next theorem verifies that this is so.

Theorem 4.29
Gerschgorin's Disk Theorem
Let A be an n×n (real or complex) matrix. Then every eigenvalue of A is contained within a Gerschgorin disk.

Proof  Let λ be an eigenvalue of A with corresponding eigenvector x. Let x_i be the entry of x with the largest absolute value (and hence nonzero; why?). Then Ax = λx, the ith row of which is
a_i1 x_1 + a_i2 x_2 + ··· + a_in x_n = λx_i
Rearranging, we have
(λ − a_ii)x_i = Σ_{j≠i} a_ij x_j,  so  λ − a_ii = (Σ_{j≠i} a_ij x_j)/x_i
because x_i ≠ 0. Taking absolute values and using properties of absolute value (see Appendix C), we obtain
|λ − a_ii| = |Σ_{j≠i} a_ij x_j|/|x_i| ≤ Σ_{j≠i} |a_ij||x_j|/|x_i| ≤ Σ_{j≠i} |a_ij| = r_i
because |x_j| ≤ |x_i| for j ≠ i. This establishes that the eigenvalue λ is contained within the Gerschgorin disk centered at a_ii with radius r_i.
Remarks
• There is a corresponding version of the preceding theorem for Gerschgorin disks whose radii are the sums of the off-diagonal entries in the ith column of A.
• It can be shown that if k of the Gerschgorin disks are disjoint from the other disks, then exactly k eigenvalues are contained within the union of these k disks. In particular, if a single disk is disjoint from the other disks, then it must contain exactly one eigenvalue of the matrix. Example 4.35(a) illustrates this.
• Note that in Example 4.35(a), 0 is not contained in a Gerschgorin disk; that is, 0 is not an eigenvalue of A. Hence, without any further computation, we can deduce that the matrix A is invertible, by Theorem 4.16. This observation is particularly useful when applied to larger matrices, because the Gerschgorin disks can be determined directly from the entries of the matrix.
Example 4.36
Consider the matrix
A = [2 1 0; 1/2 6 1/2; 2 0 8]
Gerschgorin's Theorem tells us that the eigenvalues of A are contained within three disks centered at 2, 6, and 8 with radii 1, 1, and 2, respectively. See Figure 4.16(a). Because the first disk is disjoint from the other two, it must contain exactly one eigenvalue, by the second remark after Theorem 4.29. Because the characteristic polynomial of A has real coefficients, if it has complex roots (i.e., eigenvalues of A), they must occur in conjugate pairs. (See Appendix D.) Hence there is a unique real eigenvalue between 1 and 3, and the union of the other two disks contains two (possibly complex) eigenvalues whose real parts lie between 5 and 10.
On the other hand, the first remark after Theorem 4.29 tells us that the same three eigenvalues of A are contained in disks centered at 2, 6, and 8 with radii 5/2, 1, and 1/2, respectively. See Figure 4.16(b). These disks are mutually disjoint, so each contains a single (and hence real) eigenvalue. Combining these results, we deduce that A has three real eigenvalues, one in each of the intervals [1, 3], [5, 7], and [7.5, 8.5]. (Compute the actual eigenvalues of A to verify this.)
Figure 4.16
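The closing invitation of Example 4.36 ("compute the actual eigenvalues to verify this") takes only a few lines. A sketch, assuming NumPy (the code is ours, not the text's):

```python
import numpy as np

# Checking Example 4.36: A has three real eigenvalues, one in each of
# the intervals [1, 3], [5, 7], and [7.5, 8.5] (NumPy assumed; ours).
A = np.array([[2.0, 1.0, 0.0],
              [0.5, 6.0, 0.5],
              [2.0, 0.0, 8.0]])

evals = np.sort_complex(np.linalg.eigvals(A))
assert np.allclose(evals.imag, 0.0)                  # all three eigenvalues are real
evals = evals.real
for lam, (lo, hi) in zip(evals, [(1, 3), (5, 7), (7.5, 8.5)]):
    assert lo <= lam <= hi                           # each lies in its predicted interval
print(np.round(evals, 3))                            # approximately 1.918, 6.0, 8.082
```

The disks located every eigenvalue to within a unit or so before any eigenvalue computation was done, which is exactly what makes them useful for seeding the iterative methods of this section.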
III Exercises 14, a matrix A is g"'en alollg with all iterate Xs, produced as ill Exmllple 4.30. (a) Use these data to al'l"oxilllate a domillallt eigenvector whose first componelll is I and a corresponding dOlllill(lfIt eigenvalue. (Use threedecimalplace accuracy.) (b) Compare your lIpproxilllate eigenvalue ill part (aJ witlr tire actual dominallt eigenvalue. 1 A = .
2. A =
3. A
=
12. A = [ 3.5 1.5 13. A =
9 4
14. A =
8 3 1
4] =[ 78"]
 3904 [: 4] I'xs = ['489 [~ ']
\ '~
2.0
0.5] [ 60.625 ] 3.0 ,xs = 239.500
III mrcises 58, a matrix A is givell alollg with all iterate x k, prOlillced using the power met/lod, as ill Example 4.31. (a) Approximate the dominant eigem'a /lI 1! alld elgellvector by campI/Illig the corresponding Ink and r,.. (b) Verify that YOIl hm'e approximated an eigenvalue and (In eigenvector of A by comparlllg Ay, with III kY k " 5. A =
[_! ~~J.
6. A =
[~ _~J. XIO = [~:!~~]
8. A =
=[', !lXo = [~ J. k = 6
[ 4443] ' '] [ 5 4 ' xS = 11109
4. A = [ 1.5
7.A =
I I. A
Xs = [
4 0 6  [ 3 r 6 0 4 \ \
~~: ~~~]
, x~=
1.5] y = [ ' ] k=6
 0.5
4
V
0'
8  4 ,Xu =
15
,
9
1 0 3 1 1 3
0
"
,Xjl =
1 1 ,k = 6 1
In Exercises 15 tmd 16, use the power method to approximate the domillallt eigenvalue ami eigellvector of A to twodecimalplace accuracy. CllOose any initial vector YOlllike (but keep the first Remark on page 312 in mind!) and apply
the method ulltiJ tile digit in the second ,lecimal place of the iterates stops changillg. 15. A =
4
1
3
0
2
0
,
,
2
16. A
0.00 1
19. Exercise 13
20. Exercise 14
10.000
 3, XIO =
2.9 14
o \
I
 1.207
In Exercises 914, use tire power mel/lad to "l'l'roxlmate the domimlllt elgellva/lte mrd eigenvector of A. Use tile givell illilial vector X O ' tile specified number oj iteratIons k, and threedec; 1//(11· place (lcm racy.
6
to. A = [ 8
= [ 'I] '
6 0 6
 6  2 12
Ray/elgh ijuotients are describetl ill tire fOllrth Remark 011 page 313, III Exercises 17 20, /0 see Irow t're RayleIgh ijllotlellt method approximates tire dominant eigelwaille 1II0re rapidly than the ordinnry power method, compllie the sllccessive RayleIgh ijllotiems R(x,l fori = I, "" ,kfor tire mMrix A ill the given exercise. 18. Exercise 12
3.4 15
Xo
=
12 2 6
17. Exercise 11
 2
" 3' ]
1 ,k = 5 1
10.000
2 \
9. A = [ 145
1
k= 5 24. A
=
1
1
Section 4.5  Iterative Methods for Computing Eigenvalues

In Exercises 25–28, the power method does not converge to the dominant eigenvalue and eigenvector. Verify this, using the given initial vector x_0. Compute the exact eigenvalues and eigenvectors and explain what is happening.
42. p(x) = x^3 − x − 3, α = 2
43. p(x) = x^3 − 2x^2 + 1, α = 0
44. p(x) = x^3 − 5x^2 + x + 1, α = 5
45. Let λ be an eigenvalue of A with corresponding eigenvector x. If α ≠ λ and α is not an eigenvalue of A, show that 1/(λ − α) is an eigenvalue of (A − αI)^−1 with corresponding eigenvector x. (Why must A − αI be invertible?)
[The matrices for Exercises 25–28 are not legible in this scan.]
29. Exercise 9    30. Exercise 10    31. Exercise 13    32. Exercise 14
In Exercises 33–36, apply the inverse power method to approximate, for the matrix A in the given exercise, the eigenvalue that is smallest in magnitude. Use the given initial vector x_0, k iterations, and three-decimal-place accuracy.

34. Exercise 10    36. Exercise 14
In Exercises 37–40, use the shifted inverse power method to approximate, for the matrix A in the given exercise, the eigenvalue closest to α.

37. Exercise 9, α = 0    39. Exercise 7, α = 5
In Exercises 29–32, apply the shifted power method to approximate the second eigenvalue of the matrix A in the given exercise. Use the given initial vector x_0, k iterations, and three-decimal-place accuracy.
33. Exercise 9    35. Exercise 7

In Exercises 47–50, draw the Gerschgorin disks for the given matrix.
46. If A has a dominant eigenvalue λ_1, prove that the eigenspace E_{λ_1} is one-dimensional.
38. Exercise 12, α = 0    40. Exercise 13, α = −2
Exercise 32 in Section 4.3 demonstrates that every polynomial is (plus or minus) the characteristic polynomial of its own companion matrix. Therefore, the roots of a polynomial p are the eigenvalues of C(p). Hence, we can use the methods of this section to approximate the roots of any polynomial when exact results are not readily available. In Exercises 41–44, apply the shifted inverse power method to the companion matrix C(p) of p to approximate the root of p closest to α to three decimal places.

41. p(x) = x^3 + 2x − 2, α = 0
[The matrices for Exercises 47–50, several of which have complex entries, are not legible in this scan.]
51. A square matrix is strictly diagonally dominant if the absolute value of each diagonal entry is greater than the sum of the absolute values of the remaining entries in that row. (See Section 2.5.) Use Gerschgorin's Disk Theorem to prove that a strictly diagonally dominant matrix must be invertible. [Hint: See the third remark after Theorem 4.29.]

52. If A is an n×n matrix, let ‖A‖ denote the maximum of the sums of the absolute values of the rows of A; that is, ‖A‖ = max_{1≤i≤n} ( Σ_{j=1}^{n} |a_ij| ). (See Section 7.2.) Prove that if λ is an eigenvalue of A, then |λ| ≤ ‖A‖.
53. Let λ be an eigenvalue of a stochastic matrix A (see Section 3.7). Prove that |λ| ≤ 1. [Hint: Apply Exercise 52 to A^T.]

54. Prove that the eigenvalues of the matrix A shown are all real, and locate each of these eigenvalues within a closed interval on the real line.

[The matrix for Exercise 54 is not legible in this scan.]
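The disks in Exercises 47–50 can be computed mechanically: row i gives a disk centered at a_ii with radius equal to the sum of the absolute values of the off-diagonal entries of that row. A sketch, using a made-up matrix rather than one from the exercises; the final check is Exercise 51 in miniature, since every disk of a strictly diagonally dominant matrix excludes 0:

```python
def gerschgorin_disks(A):
    """Return one (center, radius) pair per row of A."""
    disks = []
    for i, row in enumerate(A):
        radius = sum(abs(a) for j, a in enumerate(row) if j != i)
        disks.append((row[i], radius))
    return disks

A = [[4, 1, 0],
     [1, 6, 1],
     [0, 1, -2]]
for center, radius in gerschgorin_disks(A):
    print(f"disk: center {center}, radius {radius}")

# A is strictly diagonally dominant, so 0 lies in no disk: 0 is not an
# eigenvalue, and A is invertible (Gerschgorin's Disk Theorem).
print(all(abs(c) > r for c, r in gerschgorin_disks(A)))
```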
Chapter 4  Eigenvalues and Eigenvectors
Applications and the Perron-Frobenius Theorem

In this section, we will explore several applications of eigenvalues and eigenvectors. We begin by revisiting some applications from previous chapters.
Markov Chains

Section 3.7 introduced Markov chains and made several observations about the transition (stochastic) matrices associated with them. In particular, we observed that if P is the transition matrix of a Markov chain, then P has a steady state vector x. That is, there is a vector x such that Px = x. This is equivalent to saying that P has 1 as an eigenvalue. We are now in a position to prove this fact.
Theorem 4.30
If P is the n×n transition matrix of a Markov chain, then 1 is an eigenvalue of P.
Proof  Recall that every transition matrix is stochastic; hence, each of its columns sums to 1. Therefore, if j is a row vector consisting of n 1s, then jP = j. (See Exercise 13 in Section 3.7.) Taking transposes, we have

P^T j^T = (jP)^T = j^T

which implies that j^T is an eigenvector of P^T with corresponding eigenvalue 1. By Exercise 19 in Section 4.3, P and P^T have the same eigenvalues, so 1 is also an eigenvalue of P.

In fact, much more is true. For most transition matrices, every eigenvalue λ satisfies |λ| ≤ 1 and the eigenvalue 1 is dominant; that is, if λ ≠ 1, then |λ| < 1. We need the following two definitions: A matrix is called positive if all of its entries are positive, and a square matrix is called regular if some power of it is positive. For example,
A = [1 2; 3 4] is positive but B = [0 1; 1 1] is not. However, B is regular, since B^2 = [1 1; 1 2] is positive.

Theorem 4.31
Let P be an n×n transition matrix with eigenvalue λ.

a. |λ| ≤ 1
b. If P is regular and λ ≠ 1, then |λ| < 1.
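Theorem 4.30 (and part (a) above) is easy to spot-check numerically: for a stochastic P, each column sums to 1, so the row vector j of 1s satisfies jP = j, and det(P − I) = 0. A sketch using the 2×2 transition matrix of Example 3.64:

```python
P = [[0.7, 0.2],
     [0.3, 0.8]]

# jP = j: each column of P sums to 1.
col_sums = [sum(P[i][j] for i in range(2)) for j in range(2)]
print(col_sums)

# det(P - I) = 0 (up to floating-point roundoff), so 1 is an eigenvalue of P.
det = (P[0][0] - 1) * (P[1][1] - 1) - P[0][1] * P[1][0]
print(det)
```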
Proof  As in Theorem 4.30, the trick to proving this theorem is to use the fact that P^T has the same eigenvalues as P.
(a) Let x be an eigenvector of P^T corresponding to λ, and let x_k be the component of x with the largest absolute value m. Then |x_i| ≤ |x_k| = m for i = 1, 2, ..., n. Comparing the kth components of the equation P^T x = λx, we have

p_1k x_1 + p_2k x_2 + ... + p_nk x_n = λx_k

(Remember that the rows of P^T are the columns of P.) Taking absolute values, we obtain

|λ|m = |λ||x_k| = |λx_k| = |p_1k x_1 + p_2k x_2 + ... + p_nk x_n|
     ≤ |p_1k x_1| + |p_2k x_2| + ... + |p_nk x_n|
     = p_1k |x_1| + p_2k |x_2| + ... + p_nk |x_n|          (1)
     ≤ p_1k m + p_2k m + ... + p_nk m
     = (p_1k + p_2k + ... + p_nk)m = m

The first inequality follows from the Triangle Inequality in R, and the last equality comes from the fact that the rows of P^T sum to 1. Thus, |λ|m ≤ m. After dividing by m, we have |λ| ≤ 1, as desired.
(b) We will prove the equivalent implication: If |λ| = 1, then λ = 1. First, we show that it is true when P (and therefore P^T) is a positive matrix. If |λ| = 1, then all of the inequalities in equations (1) are actually equalities. In particular,

p_1k |x_1| + p_2k |x_2| + ... + p_nk |x_n| = p_1k m + p_2k m + ... + p_nk m

Equivalently,

p_1k (m − |x_1|) + p_2k (m − |x_2|) + ... + p_nk (m − |x_n|) = 0          (2)

Now, since P is positive, p_ik > 0 for i = 1, 2, ..., n. Also, m − |x_i| ≥ 0 for i = 1, 2, ..., n. Therefore, each summand in equation (2) must be zero, and this can happen only if |x_i| = m for i = 1, 2, ..., n. Furthermore, we get equality in the Triangle Inequality in R if and only if all of the summands are positive or all are negative; in other words, the p_ik x_i's all have the same sign. This implies that

x = [m, m, ..., m]^T = m j^T    or    x = [−m, −m, ..., −m]^T = −m j^T

where j is a row vector of n 1s, as in Theorem 4.30. Thus, in either case, the eigenspace of P^T corresponding to λ is E_λ = span(j^T). But, using the proof of Theorem 4.30, we see that j^T = P^T j^T = λ j^T, and, comparing components, we find that λ = 1. This handles the case where P is positive.

If P is regular, then some power of P is positive, say P^k. It follows that P^(k+1) must also be positive. (Why?) Since λ^k and λ^(k+1) are eigenvalues of P^k and P^(k+1), respectively, by Theorem 4.18, we have just proved that λ^k = λ^(k+1) = 1. Therefore, λ^k(λ − 1) = 0, which implies that λ = 1, since λ = 0 is impossible if |λ| = 1.

We can now explain some of the behavior of Markov chains that we observed in Chapter 3. In Example 3.64, we saw that for the transition matrix
P = [0.7 0.2; 0.3 0.8]

and initial state vector x_0 = [0.6; 0.4], the state vectors x_k converge to the vector x = [0.4; 0.6], a steady state vector for P (i.e., Px = x). We are going to prove that for regular
Markov chains, this always happens. Indeed, we will prove much more. Recall that the state vectors x_k satisfy x_k = P^k x_0. Let's investigate what happens to the powers P^k as k becomes large.
Example 4.37
The transition matrix P = [0.7 0.2; 0.3 0.8] has characteristic equation

0 = det(P − λI) = det[0.7 − λ, 0.2; 0.3, 0.8 − λ] = λ^2 − 1.5λ + 0.5 = (λ − 1)(λ − 0.5)

so its eigenvalues are λ_1 = 1 and λ_2 = 0.5. (Note that, thanks to Theorems 4.30 and 4.31, we knew in advance that 1 would be an eigenvalue and that the other eigenvalue would be less than 1 in absolute value. However, we still needed to compute λ_2.) The eigenspaces are

E_1 = span([2; 3])    and    E_0.5 = span([−1; 1])
So, taking Q = [2 −1; 3 1], we know that Q^−1 P Q = [1 0; 0 0.5] = D. From the method used in Example 4.29 in Section 4.4, we have

P^k = Q D^k Q^−1 = [2 −1; 3 1] [1 0; 0 0.5^k] (1/5)[1 1; −3 2]

Now, as k → ∞, (0.5)^k → 0, so

P^k → [2 −1; 3 1] [1 0; 0 0] (1/5)[1 1; −3 2] = [0.4 0.4; 0.6 0.6]
(Observe that the columns of this "limit matrix" are identical and each is a steady state vector for P.) Now let x_0 = [a; b] be any initial probability vector (i.e., a + b = 1). Then

x_k = P^k x_0 → [0.4 0.4; 0.6 0.6][a; b] = [0.4a + 0.4b; 0.6a + 0.6b] = [0.4; 0.6]

Not only does this explain what we saw in Example 3.64, it also tells us that the state vectors x_k will converge to the steady state vector x = [0.4; 0.6] for any choice of x_0!
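The limit computed in Example 4.37 can also be confirmed by brute force: repeatedly multiplying P by itself drives P^k toward the matrix with identical columns [0.4; 0.6]. A sketch:

```python
def mat_mul(A, B):
    # Multiply two square matrices stored as lists of rows.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.7, 0.2],
     [0.3, 0.8]]
Pk = P
for _ in range(49):        # compute P^50; (0.5)^50 is negligible
    Pk = mat_mul(Pk, P)

print([[round(e, 6) for e in row] for row in Pk])   # → [[0.4, 0.4], [0.6, 0.6]]
```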
There is nothing special about Example 4.37. The next theorem shows that this type of behavior always occurs with regular transition matrices. Before we can present the theorem, we need the following lemma.
Lemma 4.32

Let P be a regular n×n transition matrix. If P is diagonalizable, then the dominant eigenvalue λ_1 = 1 has algebraic multiplicity 1.
Proof  The eigenvalues of P and P^T are the same. From the proof of Theorem 4.31(b), λ_1 = 1 has geometric multiplicity 1 as an eigenvalue of P^T. Since P is diagonalizable, so is P^T, by Exercise 41 in Section 4.4. Therefore, the eigenvalue λ_1 = 1 has algebraic multiplicity 1, by the Diagonalization Theorem.
Theorem 4.33

Let P be a regular n×n transition matrix. Then as k → ∞, P^k approaches an n×n matrix L whose columns are identical, each equal to the same vector x. This vector x is a steady state probability vector for P.
See Finite Markov Chains by J. G. Kemeny and J. L. Snell (New York: Springer-Verlag, 1976).

Proof  To simplify the proof, we will consider only the case where P is diagonalizable. The theorem is true, however, without this assumption. We diagonalize P as Q^−1 P Q = D or, equivalently, P = Q D Q^−1, where
D = [λ_1 0 ... 0; 0 λ_2 ... 0; ... ; 0 0 ... λ_n]
From Theorems 4.30 and 4.31 we know that each eigenvalue λ_i either is 1 or satisfies |λ_i| < 1. Hence, as k → ∞, λ_i^k approaches 1 or 0 for i = 1, ..., n. It follows that D^k approaches a diagonal matrix, say D*, each of whose diagonal entries is 1 or 0. Thus, P^k = Q D^k Q^−1 approaches L = Q D* Q^−1. We write lim_{k→∞} P^k = L.
We are taking some liberties with the notion of a limit. Nevertheless, these steps should be intuitively clear. Rigorous proofs follow from the properties of limits, which you may have encountered in a calculus course. Rather than get sidetracked with a discussion of matrix limits, we will omit the proofs.
Observe that

PL = P lim_{k→∞} P^k = lim_{k→∞} P·P^k = lim_{k→∞} P^(k+1) = L

Therefore, each column of L is an eigenvector of P corresponding to λ_1 = 1. To see that each of these columns is a probability vector (i.e., L is a stochastic matrix), we need only observe that, if j is the row vector with n 1s, then

jL = j lim_{k→∞} P^k = lim_{k→∞} jP^k = lim_{k→∞} j = j
Now write e_i = c_1 v_1 + c_2 v_2 + ... + c_n v_n, where v_1, ..., v_n are the columns of Q (eigenvectors of P, with v_1 corresponding to λ_1 = 1), for scalars c_1, c_2, ..., c_n. Then, by the boxed comment following Example 4.21,

P^k e_i = c_1 λ_1^k v_1 + c_2 λ_2^k v_2 + ... + c_n λ_n^k v_n

By Lemma 4.32, λ_j ≠ 1 for j ≠ 1, so, by Theorem 4.31(b), |λ_j| < 1 for j ≠ 1. Hence, λ_j^k → 0 as k → ∞, for j ≠ 1. It follows that

Le_i = lim_{k→∞} P^k e_i = c_1 v_1
Cha pter 4
Eigenvalues and Eigenveclol'$
In other words, column i of L is an eigenvector corresponding to λ_1 = 1. But we have shown that the columns of L are probability vectors, so Le_i is the unique multiple x of v_1 whose components sum to 1. Since this is true for each column of L, it implies that all of the columns of L are identical, each equal to this vector x.
Remark  Since L is a stochastic matrix, we can interpret it as the long-range transition matrix of the Markov chain. That is, L_ij represents the probability of being in state i, having started from state j, if the transitions were to continue indefinitely. The fact that the columns of L are identical says that the starting state does not matter, as the next example illustrates.
Example 4.38

Recall the rat in a box from Example 3.65. The transition matrix was

P = [0 1/3 1/3; 1/2 0 2/3; 1/2 2/3 0]

We determined that the steady state probability vector was

x = [1/4; 3/8; 3/8]

Hence, the powers of P approach

L = [1/4 1/4 1/4; 3/8 3/8 3/8; 3/8 3/8 3/8] = [0.250 0.250 0.250; 0.375 0.375 0.375; 0.375 0.375 0.375]

from which we can see that the rat will eventually spend 25% of its time in compartment 1 and 37.5% of its time in each of the other two compartments.
We conclude our discussion of regular Markov chains by proving that the steady state vector x is independent of the initial state. The proof is easily adapted to cover the case of state vectors whose components sum to an arbitrary constant, say s. In the exercises, you are asked to prove some other properties of regular Markov chains.
Theorem 4.34

Let P be a regular n×n transition matrix, with x the steady state probability vector for P, as in Theorem 4.33. Then, for any initial probability vector x_0, the sequence of iterates x_k approaches x.
Proof  Let

x_0 = [x_1; x_2; ... ; x_n]

where x_1 + x_2 + ... + x_n = 1. Since x_k = P^k x_0, we must show that lim_{k→∞} P^k x_0 = x. Now, by Theorem 4.33, the long-range transition matrix is L = [x x ... x] and lim_{k→∞} P^k = L. Therefore,

lim_{k→∞} P^k x_0 = L x_0 = [x x ... x][x_1; x_2; ... ; x_n] = x_1 x + x_2 x + ... + x_n x = (x_1 + x_2 + ... + x_n) x = x
Population Growth

We return to the Leslie model of population growth, which we first explored in Section 3.7. In Example 3.66 in that section, we saw that for the Leslie matrix

L = [0 4 3; 0.5 0 0; 0 0.25 0]

iterates of the population vectors began to approach a multiple of the vector

x = [18; 6; 1]

In other words, the three age classes of this population eventually ended up in the ratio 18:6:1. Moreover, once this state is reached, it is stable, since the ratios for the following year are given by
Lx = [0 4 3; 0.5 0 0; 0 0.25 0][18; 6; 1] = [27; 9; 1.5] = 1.5x

and the components are still in the ratio 27:9:1.5 = 18:6:1. Observe that 1.5 represents the growth rate of this population when it has reached its steady state. We can now recognize that x is an eigenvector of L corresponding to the eigenvalue λ = 1.5. Thus, the steady state growth rate is a positive eigenvalue of L, and an eigenvector corresponding to this eigenvalue represents the relative sizes of the age classes when the steady state has been reached. We can compute these directly, without having to iterate as we did before.
Example 4.39

Find the steady state growth rate and the corresponding ratios between the age classes for the Leslie matrix L above.

Solution  We need to find all positive eigenvalues and corresponding eigenvectors of L. The characteristic polynomial of L is
det(L − λI) = det[−λ 4 3; 0.5 −λ 0; 0 0.25 −λ] = −λ^3 + 2λ + 0.375
so we must solve −λ^3 + 2λ + 0.375 = 0 or, equivalently, 8λ^3 − 16λ − 3 = 0. Factoring, we have

(2λ − 3)(4λ^2 + 6λ + 1) = 0

(See Appendix D.) Since the second factor has only the roots (−3 + √5)/4 ≈ −0.19 and (−3 − √5)/4 ≈ −1.31, the only positive root of this equation is λ = 3/2 = 1.5. The corresponding eigenvectors are in the null space of L − 1.5I, which we find by row reduction:
[L − 1.5I | 0] = [−1.5 4 3 | 0; 0.5 −1.5 0 | 0; 0 0.25 −1.5 | 0] → [1 0 −18 | 0; 0 1 −6 | 0; 0 0 0 | 0]
Thus, if x = [x_1; x_2; x_3] is an eigenvector corresponding to λ = 1.5, it satisfies x_1 = 18x_3 and x_2 = 6x_3. That is,

E_1.5 = span([18; 6; 1])

Hence, the steady state growth rate is 1.5, and when this rate has been reached, the age classes are in the ratio 18:6:1, as we saw before.
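The computations in Example 4.39 are easy to verify directly: L[18, 6, 1]^T should equal 1.5·[18, 6, 1]^T, and λ = 1.5 should be a root of the characteristic polynomial −λ^3 + 2λ + 0.375. A sketch:

```python
L = [[0.0, 4.0, 3.0],
     [0.5, 0.0, 0.0],
     [0.0, 0.25, 0.0]]
x = [18.0, 6.0, 1.0]

# Lx should be 1.5 * x = [27, 9, 1.5].
Lx = [sum(a * xi for a, xi in zip(row, x)) for row in L]
print(Lx)                            # → [27.0, 9.0, 1.5]

# lambda = 1.5 is a root of the characteristic polynomial (exactly, in
# this case, since 1.5, 3.375, and 0.375 are all exact binary fractions).
lam = 1.5
print(-lam**3 + 2 * lam + 0.375)     # → 0.0
```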
In Example 4.39, there was only one candidate for the steady state growth rate: the unique positive eigenvalue of L. But what would we have done if L had had more than one positive eigenvalue or none? We were also apparently fortunate that there was a corresponding eigenvector all of whose components were positive, which allowed us to relate these components to the size of the population. We can prove that this situation is not accidental; that is, every Leslie matrix has exactly one positive eigenvalue and a corresponding eigenvector with positive components. Recall that the form of a Leslie matrix is

L = [b_1 b_2 b_3 ... b_(n−1) b_n; s_1 0 0 ... 0 0; 0 s_2 0 ... 0 0; 0 0 s_3 ... 0 0; ... ; 0 0 0 ... s_(n−1) 0]          (3)
Since the entries s_j represent survival probabilities, we will assume that they are all nonzero (otherwise, the population would rapidly die out). We will also assume that at least one of the birth parameters b_i is nonzero (otherwise, there would be no births and, again, the population would die out). With these standing assumptions, we can now prove the assertion we made above as a theorem.
Theorem 4.35

Every Leslie matrix has a unique positive eigenvalue and a corresponding eigenvector with positive components.

Proof
Let L be as in equation (3). The characteristic polynomial of L is

c_L(λ) = det(L − λI) = (−1)^n (λ^n − b_1 λ^(n−1) − b_2 s_1 λ^(n−2) − b_3 s_1 s_2 λ^(n−3) − ... − b_n s_1 s_2 ... s_(n−1)) = (−1)^n f(λ)

(You are asked to prove this in Exercise 16.) The eigenvalues of L are therefore the roots of f(λ). Since at least one of the birth parameters b_i is positive and all of the survival probabilities s_j are positive, the coefficients of f(λ) change sign exactly once. By Descartes's Rule of Signs (Appendix D), therefore, f(λ) has exactly one positive root. Let us call it λ_1. By direct calculation, we can check that an eigenvector corresponding to λ_1 is

x_1 = [1; s_1/λ_1; s_1 s_2/λ_1^2; s_1 s_2 s_3/λ_1^3; ...]

(You are asked to prove this in Exercise 15.) Clearly, all of the components of x_1 are positive.
In fact, more is true. With the additional requirement that two consecutive birth parameters b_i and b_(i+1) are positive, it turns out that the unique positive eigenvalue λ_1 of L is dominant; that is, every other (real or complex) eigenvalue λ of L satisfies |λ| < λ_1. (It is beyond the scope of this book to prove this result, but a partial proof is outlined in Exercise 27 for readers who are familiar with the algebra of complex numbers.) This explains why we get convergence to a steady state vector when we iterate the population vectors: It is just the power method working for us!
The Perron-Frobenius Theorem

In the previous two applications, Markov chains and Leslie matrices, we saw that the eigenvalue of interest was positive and dominant. Moreover, there was a corresponding eigenvector with positive components. It turns out that a remarkable theorem guarantees that this will be the case for a large class of matrices. First, some notation: for matrices A and B of the same size, we write A > B if a_ij > b_ij for all i and j, and A ≥ B if a_ij ≥ b_ij for all i and j (and similarly for vectors). Thus, a positive vector x satisfies x > 0. Let us define |A| = [|a_ij|] to be the matrix of the absolute values of the entries of A.
Theorem 4.36  Perron's Theorem

Let A be a positive n×n matrix. Then A has a real eigenvalue λ_1 with the following properties:

a. λ_1 > 0
b. λ_1 has a corresponding positive eigenvector.
c. If λ is any other eigenvalue of A, then |λ| ≤ λ_1.
Intuitively, we can see why the first two statements should be true. Consider the case of a 2×2 positive matrix A. The corresponding matrix transformation maps the first quadrant of the plane properly into itself, since all components are positive. If we repeatedly allow A to act on the images we get, they necessarily converge toward some ray in the first quadrant (Figure 4.17). A direction vector for this ray will be a positive vector x, which must be mapped into some positive multiple of itself (say, λ_1), since A leaves the ray fixed. In other words, Ax = λ_1 x, with x and λ_1 both positive.
Proof  Observe that, for some nonzero vectors x, Ax ≥ λx for some scalar λ. When this happens, then A(kx) ≥ λ(kx) for all k > 0; thus, we need only consider unit vectors x. In Chapter 7, we will see that A maps the set of all unit vectors in R^n (the unit sphere) into a "generalized ellipsoid." So, as x ranges over the nonnegative vectors on this unit sphere, there will be a maximum value of λ such that Ax ≥ λx. (See Figure 4.18.) Denote this number by λ_1 and the corresponding unit vector by x_1.

[Figure 4.17]    [Figure 4.18]
We now show that Ax_1 = λ_1 x_1. If not, then Ax_1 > λ_1 x_1, and, applying A again, we obtain A(Ax_1) > A(λ_1 x_1) = λ_1(Ax_1), where the inequality is preserved, since A is positive. (See Exercise 40.) But then y = (1/‖Ax_1‖)Ax_1 is a unit vector that satisfies Ay > λ_1 y, so there will be some λ_2 > λ_1 such that Ay ≥ λ_2 y. This contradicts the fact that λ_1 was the maximum value with this property. Consequently, it must be the case that Ax_1 = λ_1 x_1; that is, λ_1 is an eigenvalue of A. Now A is positive and x_1 is nonnegative, so λ_1 x_1 = Ax_1 > 0. This means that λ_1 > 0 and x_1 > 0, which completes the proof of (a) and (b).

To prove (c), suppose λ is any other (real or complex) eigenvalue of A with corresponding eigenvector z. Then Az = λz, and, taking absolute values, we have

|λ||z| = |λz| = |Az| ≤ |A||z| = A|z|          (4)

where the middle inequality follows from the Triangle Inequality. (See Exercise 40.) Since |z| ≥ 0, the unit vector u in the direction of |z| is also nonnegative and satisfies Au ≥ |λ|u. By the maximality of λ_1 from the first part of this proof, we must have |λ| ≤ λ_1.

In fact, more is true. It turns out that λ_1 is dominant, so |λ| < λ_1 for any eigenvalue λ ≠ λ_1. It is also the case that λ_1 has algebraic, and hence geometric, multiplicity 1. We will not prove these facts.

Perron's Theorem can be generalized from positive to certain nonnegative matrices. Frobenius did so in 1912. The result requires a technical condition on the matrix. A square matrix A is called reducible if, subject to some permutation of the rows and the same permutation of the columns, A can be written in block form as

[B C; O D]
where B and D are square. Equivalently, A is reducible if there is some permutation matrix P such that

PAP^T = [B C; O D]
(See page 185.) For example, the matrix

A = [2 0 0 1 3; 4 2 1 5 5; 1 2 7 3 0; 6 0 0 2 1; 1 0 0 7 2]

is reducible, since interchanging rows 1 and 3 and then columns 1 and 3 produces

[7 2 | 1 3 0; 1 2 | 4 5 5; 0 0 | 2 1 3; 0 0 | 6 2 1; 0 0 | 1 7 2]

in which the lower-left 3×2 block is zero.
(This is just PAP^T, where

P = [0 0 1 0 0; 0 1 0 0 0; 1 0 0 0 0; 0 0 0 1 0; 0 0 0 0 1]
Check this!) A square matrix A that is not reducible is called irreducible. If A^k > O for some k, then A is called primitive. For example, every regular Markov chain has a primitive transition matrix, by definition. It is not hard to show that every primitive matrix is irreducible. (Do you see why? Try showing the contrapositive of this.)
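The permutation check just described is easy to automate: apply the same permutation to the rows and the columns and look for the zero block. A sketch, using the 5×5 matrix of the reducibility example above (the permutation swaps indices 1 and 3, i.e. 0 and 2 in zero-based indexing):

```python
def permute(A, p):
    # Apply the permutation p to both the rows and the columns of A,
    # computing P A P^T where P is the permutation matrix of p.
    n = len(A)
    return [[A[p[i]][p[j]] for j in range(n)] for i in range(n)]

A = [[2, 0, 0, 1, 3],
     [4, 2, 1, 5, 5],
     [1, 2, 7, 3, 0],
     [6, 0, 0, 2, 1],
     [1, 0, 0, 7, 2]]
p = [2, 1, 0, 3, 4]     # interchange rows/columns 1 and 3

B = permute(A, p)
# The lower-left 3x2 block of B is zero, exhibiting the form [B C; O D],
# so A is reducible.
print(all(B[i][j] == 0 for i in range(2, 5) for j in range(2)))
```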
Theorem 4.37  The Perron-Frobenius Theorem

Let A be an irreducible nonnegative n×n matrix. Then A has a real eigenvalue λ_1 with the following properties:

a. λ_1 > 0
b. λ_1 has a corresponding positive eigenvector.
c. If λ is any other eigenvalue of A, then |λ| ≤ λ_1. If A is primitive, then this inequality is strict.
d. If λ is an eigenvalue of A such that |λ| = λ_1, then λ is a (complex) root of the equation λ^n − λ_1^n = 0.
e. λ_1 has algebraic multiplicity 1.

See Matrix Analysis by R. A. Horn and C. R. Johnson (Cambridge, England: Cambridge University Press, 1985).
The interested reader can find a proof of the Perron-Frobenius Theorem in many texts on nonnegative matrices or matrix analysis. The eigenvalue λ_1 is often called the Perron root of A, and a corresponding probability eigenvector (which is necessarily unique) is called the Perron eigenvector of A.
Linear Recurrence Relations

The Fibonacci numbers are the numbers in the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, ..., where, after the first two terms, each new term is obtained by summing the two terms preceding it. If we denote the nth Fibonacci number by f_n, then this sequence is completely defined by the equations f_0 = 0, f_1 = 1, and, for n ≥ 2,

f_n = f_(n−1) + f_(n−2)

This last equation is an example of a linear recurrence relation. We will return to the Fibonacci numbers, but first we will consider linear recurrence relations somewhat more generally.
Definition  Let (x_n) = (x_0, x_1, x_2, ...) be a sequence of numbers that is defined as follows:

1. x_0 = a_0, x_1 = a_1, ..., x_(k−1) = a_(k−1), where a_0, a_1, ..., a_(k−1) are scalars.
2. For all n ≥ k, x_n = c_1 x_(n−1) + c_2 x_(n−2) + ... + c_k x_(n−k), where c_1, c_2, ..., c_k are scalars.

If c_k ≠ 0, the equation in (2) is called a linear recurrence relation of order k. The equations in (1) are referred to as the initial conditions of the recurrence.
Thus, the Fibonacci numbers satisfy a linear recurrence relation of order 2.
Remarks

• If, in order to define the nth term in a recurrence relation, we require the (n−k)th term but no term before it, then the recurrence relation has order k.
• The number of initial conditions is the order of the recurrence relation.
• It is not necessary that the first term of the sequence be called x_0. We could start at x_1 or anywhere else.
• It is possible to have even more general linear recurrence relations by allowing the coefficients c_i to be functions rather than scalars and by allowing an extra, isolated coefficient, which may also be a function. An example would be the recurrence

x_n = 2x_(n−1) − n^2 x_(n−2) + (1/n) x_(n−3) + n!

We will not consider such recurrences here.
Example 4.40

Consider the sequence (x_n) defined by the initial conditions x_1 = 1, x_2 = 5 and the recurrence relation x_n = 5x_(n−1) − 6x_(n−2) for n > 2. Write out the first five terms of this sequence.
Solution  We are given the first two terms. We use the recurrence relation to calculate the next three terms. We have

x_3 = 5x_2 − 6x_1 = 5·5 − 6·1 = 19
x_4 = 5x_3 − 6x_2 = 5·19 − 6·5 = 65
x_5 = 5x_4 − 6x_3 = 5·65 − 6·19 = 211

so the sequence begins 1, 5, 19, 65, 211, ....
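The computation in Example 4.40 is a one-line loop in code; a sketch:

```python
def recurrence_terms(n_terms):
    # x_1 = 1, x_2 = 5, and x_n = 5*x_{n-1} - 6*x_{n-2} for n > 2.
    terms = [1, 5]
    while len(terms) < n_terms:
        terms.append(5 * terms[-1] - 6 * terms[-2])
    return terms

print(recurrence_terms(5))   # → [1, 5, 19, 65, 211]
```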
Clearly, if we were interested in, say, the 100th term of the sequence in Example 4.40, then the approach used there would be rather tedious, since we would have to apply the recurrence relation 98 times. It would be nice if we could find an explicit formula for x_n as a function of n. We refer to finding such a formula as solving the recurrence relation. We will illustrate the process with the sequence from Example 4.40.

To begin, we rewrite the recurrence relation as a matrix equation. Let

A = [5 −6; 1 0]

and introduce vectors x_n = [x_n; x_(n−1)] for n ≥ 2. Thus, x_2 = [x_2; x_1] = [5; 1], x_3 = [x_3; x_2] = [19; 5], x_4 = [x_4; x_3] = [65; 19], and so on. Now observe that, for n > 2, we have

A x_(n−1) = [5 −6; 1 0][x_(n−1); x_(n−2)] = [5x_(n−1) − 6x_(n−2); x_(n−1)] = [x_n; x_(n−1)] = x_n

Notice that this is the same type of equation we encountered with Markov chains and Leslie matrices. As in those cases, we can write

x_n = A x_(n−1) = A^2 x_(n−2) = ... = A^(n−2) x_2
We now use the technique of Example 4.29 to compute the powers of A. The characteristic equation of A is

λ^2 − 5λ + 6 = 0

from which we find that the eigenvalues are λ_1 = 3 and λ_2 = 2. (Notice that the form of the characteristic equation follows that of the recurrence relation. If we write the recurrence as x_n − 5x_(n−1) + 6x_(n−2) = 0, it is apparent that the coefficients are exactly the same!) The corresponding eigenspaces are

E_3 = span([3; 1])    and    E_2 = span([2; 1])

Setting P = [3 2; 1 1], we know that P^−1 A P = D = [3 0; 0 2]. Then A = P D P^−1 and

A^k = P D^k P^−1 = [3 2; 1 1][3^k 0; 0 2^k][1 −2; −1 3] = [3^(k+1) − 2^(k+1), −2(3^(k+1)) + 3(2^(k+1)); 3^k − 2^k, −2(3^k) + 3(2^k)]

It now follows that

x_n = A^(n−2) x_2 = [3^(n−1) − 2^(n−1), −2(3^(n−1)) + 3(2^(n−1)); 3^(n−2) − 2^(n−2), −2(3^(n−2)) + 3(2^(n−2))][5; 1]

from which we read off the solution x_n = 3^n − 2^n. (To check our work, we could plug in n = 1, 2, ..., 5 to verify that this formula gives the same terms that we calculated using the recurrence relation. Try it!)

Observe that x_n is a linear combination of powers of the eigenvalues. This is necessarily the case as long as the eigenvalues are distinct, as Theorem 4.38(a) will make explicit. Using this observation, we can save ourselves some work. Once we have computed the eigenvalues λ_1 = 3 and λ_2 = 2, we can immediately write

x_n = c_1 3^n + c_2 2^n
where c_1 and c_2 are to be determined. Using the initial conditions, we have

1 = x_1 = c_1 3^1 + c_2 2^1 = 3c_1 + 2c_2    when n = 1

and

5 = x_2 = c_1 3^2 + c_2 2^2 = 9c_1 + 4c_2    when n = 2

We now solve the system

3c_1 + 2c_2 = 1
9c_1 + 4c_2 = 5

for c_1 and c_2 to obtain c_1 = 1 and c_2 = −1. Thus, x_n = 3^n − 2^n, as before. This is the method we will use in practice. We now illustrate its use to find an explicit formula for the Fibonacci numbers.
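The closed form just derived can be checked against the recurrence for many values of n at once; a sketch:

```python
def x_iter(n):
    # Iterate x_n = 5*x_{n-1} - 6*x_{n-2} with x_1 = 1, x_2 = 5.
    a, b = 1, 5
    for _ in range(n - 2):
        a, b = b, 5 * b - 6 * a
    return b if n >= 2 else a

# The iterated values agree with the solution x_n = 3^n - 2^n.
for n in range(1, 11):
    assert x_iter(n) == 3**n - 2**n
print(x_iter(10))   # → 3**10 - 2**10 = 58025
```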
Example 4.41

Solve the Fibonacci recurrence f_0 = 0, f_1 = 1, and f_n = f_(n−1) + f_(n−2) for n ≥ 2.

Solution  Writing the recurrence as f_n − f_(n−1) − f_(n−2) = 0, we see that the characteristic equation is λ^2 − λ − 1 = 0, so the eigenvalues are

λ_1 = (1 + √5)/2    and    λ_2 = (1 − √5)/2
It follows from the discussion above that the solution to the recurrence relation has the form

f_n = c_1 λ_1^n + c_2 λ_2^n

for some scalars c_1 and c_2. Using the initial conditions, we have

0 = f_0 = c_1 λ_1^0 + c_2 λ_2^0 = c_1 + c_2    and    1 = f_1 = c_1 λ_1 + c_2 λ_2

Solving for c_1 and c_2, we obtain c_1 = 1/√5 and c_2 = −1/√5. Hence, an explicit formula for the nth Fibonacci number is

f_n = (1/√5)((1 + √5)/2)^n − (1/√5)((1 − √5)/2)^n          (5)

Jacques Binet (1786–1856) made contributions to matrix theory, number theory, physics, and astronomy. He discovered the rule for matrix multiplication in 1812. Binet's formula for the Fibonacci numbers is actually due to Euler, who published it in 1765; however, it was forgotten until Binet published his version in 1843. Like Cauchy, Binet was a royalist, and he lost his university position when Charles X abdicated in 1830. He received many honors for his work, including his election, in 1843, to the Académie des Sciences.
Formula (5) is a remarkable formula, because it is defined in terms of the irrational number √5, yet the Fibonacci numbers are all integers! Try plugging in a few values of n to see how the √5 terms cancel out to leave the integer values f_n. Formula (5) is known as Binet's formula. The method we have just outlined works for any second order linear recurrence relation whose associated eigenvalues are all distinct. When there is a repeated eigenvalue, the technique must be modified, since the diagonalization method we used may no longer work. The next theorem summarizes both situations.
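Binet's formula can be tested numerically: despite the √5's, rounding the floating-point value recovers the integer Fibonacci numbers exactly for moderate n. A sketch:

```python
import math

def binet(n):
    # f_n = (1/sqrt 5)(phi^n - psi^n), phi = (1+sqrt 5)/2, psi = (1-sqrt 5)/2
    s5 = math.sqrt(5)
    phi, psi = (1 + s5) / 2, (1 - s5) / 2
    return round((phi**n - psi**n) / s5)

print([binet(n) for n in range(10)])   # → [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The rounding is needed only because of floating-point error; for large n (roughly n > 70 in double precision) the error exceeds 1/2 and an exact integer method should be used instead.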
Theorem 4.38

Let x_n = ax_(n−1) + bx_(n−2) be a recurrence relation that is satisfied by a sequence (x_n). Let λ_1 and λ_2 be the eigenvalues of the associated characteristic equation λ^2 − aλ − b = 0.

a. If λ_1 ≠ λ_2, then x_n = c_1 λ_1^n + c_2 λ_2^n for some scalars c_1 and c_2.
b. If λ_1 = λ_2 = λ, then x_n = c_1 λ^n + c_2 n λ^n for some scalars c_1 and c_2.

In either case, c_1 and c_2 can be determined using the initial conditions.
Proof  (a) Generalizing our discussion above, we can write the recurrence as x_n = A x_(n−1), where

x_n = [x_n; x_(n−1)]    and    A = [a b; 1 0]

Since A has distinct eigenvalues, it can be diagonalized. The rest of the details are left for Exercise 51.

(b) We will show that x_n = c_1 λ^n + c_2 n λ^n satisfies the recurrence relation x_n = ax_(n−1) + bx_(n−2) or, equivalently,

x_n − ax_(n−1) − bx_(n−2) = 0          (6)

if λ^2 − aλ − b = 0. Since

x_(n−1) = c_1 λ^(n−1) + c_2 (n−1) λ^(n−1)    and    x_(n−2) = c_1 λ^(n−2) + c_2 (n−2) λ^(n−2)

substitution into equation (6) yields

x_n − ax_(n−1) − bx_(n−2)
  = (c_1 λ^n + c_2 n λ^n) − a(c_1 λ^(n−1) + c_2 (n−1) λ^(n−1)) − b(c_1 λ^(n−2) + c_2 (n−2) λ^(n−2))
  = c_1(λ^n − aλ^(n−1) − bλ^(n−2)) + c_2(n λ^n − a(n−1) λ^(n−1) − b(n−2) λ^(n−2))
  = c_1 λ^(n−2)(λ^2 − aλ − b) + c_2 n λ^(n−2)(λ^2 − aλ − b) + c_2 λ^(n−2)(aλ + 2b)
  = c_1 λ^(n−2)(0) + c_2 n λ^(n−2)(0) + c_2 λ^(n−2)(aλ + 2b)
  = c_2 λ^(n−2)(aλ + 2b)

But since λ is a double root of λ^2 − aλ − b = 0, we must have a^2 + 4b = 0 and λ = a/2, using the quadratic formula. Consequently, aλ + 2b = a^2/2 + 2b = −4b/2 + 2b = 0, so x_n − ax_(n−1) − bx_(n−2) = 0, as required.
Suppose the initial conditions are x_0 = r and x_1 = s. Then, in either (a) or (b), there is a unique solution for c_1 and c_2. (See Exercise 52.)
Example 4.42

Solve the recurrence relation x_0 = 1, x_1 = 6, and x_n = 6x_(n−1) − 9x_(n−2) for n ≥ 2.

Solution  The characteristic equation is λ^2 − 6λ + 9 = 0, which has λ = 3 as a double root. By Theorem 4.38(b), we must have x_n = c_1 3^n + c_2 n 3^n = (c_1 + c_2 n)3^n. Since 1 = x_0 = c_1 and 6 = x_1 = (c_1 + c_2)·3, we find that c_2 = 1, so

x_n = (1 + n)3^n

The techniques outlined in Theorem 4.38 can be extended to higher order recurrence relations. We state, without proof, the general result.
Theorem 4.39  Let x_n = a_{m−1}x_{n−1} + a_{m−2}x_{n−2} + ··· + a0x_{n−m} be a recurrence relation of order m that is satisfied by a sequence (x_n). Suppose the associated characteristic polynomial

λ^m − a_{m−1}λ^{m−1} − a_{m−2}λ^{m−2} − ··· − a0

factors as (λ − λ1)^{m1}(λ − λ2)^{m2} ··· (λ − λk)^{mk}, where m1 + m2 + ··· + mk = m. Then x_n has the form

x_n = (c11λ1^n + c12nλ1^n + c13n²λ1^n + ··· + c1m1 n^{m1−1}λ1^n) + ··· + (ck1λk^n + ck2nλk^n + ck3n²λk^n + ··· + ckmk n^{mk−1}λk^n)
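The closed forms promised by Theorems 4.38 and 4.39 are easy to check numerically. The sketch below (plain Python; the helper name solve_recurrence is ours, not the book's) iterates the recurrence of Example 4.42 and compares it with (1 + n)3^n, and likewise checks Binet's formula for the Fibonacci numbers, here taken with the convention f0 = 0, f1 = 1.

```python
from math import sqrt

def solve_recurrence(x0, x1, a, b, n):
    """Iterate x_k = a*x_{k-1} + b*x_{k-2} up to index n."""
    xs = [x0, x1]
    for _ in range(2, n + 1):
        xs.append(a * xs[-1] + b * xs[-2])
    return xs

# Example 4.42: x_n = 6x_{n-1} - 9x_{n-2}, double eigenvalue 3,
# closed form x_n = (1 + n) 3^n
xs = solve_recurrence(1, 6, 6, -9, 10)
assert all(x == (1 + n) * 3**n for n, x in enumerate(xs))

# Binet's formula (distinct eigenvalues (1 ± sqrt(5))/2)
phi = (1 + sqrt(5)) / 2
psi = (1 - sqrt(5)) / 2
fib = solve_recurrence(0, 1, 1, 1, 20)
for n, f in enumerate(fib):
    assert round((phi**n - psi**n) / sqrt(5)) == f
```

The rounding in the last line is exactly the cancellation of the √5 terms remarked on above: the floating-point value is within 10^{-10} of an integer.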
Systems of Linear Differential Equations

In calculus, you learn that if x = x(t) is a differentiable function satisfying a differential equation of the form x′ = kx, where k is a constant, then the general solution is x = Ce^{kt}, where C is a constant. If an initial condition x(0) = x0 is specified, then, by substituting t = 0 in the general solution, we find that C = x0. Hence, the unique solution to the differential equation that satisfies the initial condition is

x = x0 e^{kt}

Suppose we have n differentiable functions of t, say x1, x2, ..., xn, that satisfy a system of differential equations

x1′ = a11x1 + a12x2 + ··· + a1nxn
x2′ = a21x1 + a22x2 + ··· + a2nxn
⋮
xn′ = an1x1 + an2x2 + ··· + annxn

We can write this system in matrix form as x′ = Ax, where

x(t) = [x1(t); x2(t); ⋮; xn(t)],  x′(t) = [x1′(t); x2′(t); ⋮; xn′(t)],  and  A = [ a11 a12 ··· a1n ]
                                                                              [ a21 a22 ··· a2n ]
                                                                              [  ⋮   ⋮        ⋮  ]
                                                                              [ an1 an2 ··· ann ]

Now we can use matrix methods to help us find the solution.
Chapter 4  Eigenvalues and Eigenvectors
First, we make a useful observation. Suppose we want to solve the following system of differential equations:

x1′ = 2x1
x2′ = 5x2

Each equation can be solved separately, as above, to give

x1 = C1e^{2t}
x2 = C2e^{5t}

where C1 and C2 are constants. Notice that, in matrix form, our equation x′ = Ax has a diagonal coefficient matrix

A = [2 0; 0 5]

and the eigenvalues 2 and 5 occur in the exponentials e^{2t} and e^{5t} of the solution. This suggests that, for an arbitrary system, we should start by diagonalizing the coefficient matrix, if possible.
Example 4.43  Solve the following system of differential equations:

x1′ = x1 + 2x2
x2′ = 3x1 + 2x2

Solution  Here the coefficient matrix is A = [1 2; 3 2], and we find that the eigenvalues are λ1 = 4 and λ2 = −1, with corresponding eigenvectors v1 = [2; 3] and v2 = [1; −1], respectively. Therefore, A is diagonalizable, and the matrix P that does the job is

P = [v1 v2] = [2 1; 3 −1]

We know that

P^{-1}AP = D = [4 0; 0 −1]

Let x = Py (so that x′ = Py′) and substitute these results into the original equation x′ = Ax to get Py′ = APy or, equivalently,

y′ = P^{-1}APy = Dy

This is just the system

y1′ = 4y1
y2′ = −y2

whose general solution is y1 = C1e^{4t}, y2 = C2e^{−t}, or y = [C1e^{4t}; C2e^{−t}].

To find x, we just compute

x = Py = [2 1; 3 −1][C1e^{4t}; C2e^{−t}] = [2C1e^{4t} + C2e^{−t}; 3C1e^{4t} − C2e^{−t}]

so x1 = 2C1e^{4t} + C2e^{−t} and x2 = 3C1e^{4t} − C2e^{−t}. (Check that these values satisfy the given system.)

Remark  Observe that we could also express the solution in Example 4.43 as

x = C1e^{4t}[2; 3] + C2e^{−t}[1; −1] = C1e^{4t}v1 + C2e^{−t}v2
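The diagonalization recipe just used can be sketched numerically. The fragment below (NumPy; variable names are ours) builds the general solution x(t) = c1 e^{λ1 t}v1 + c2 e^{λ2 t}v2 directly from an eigendecomposition of A and checks that x′ = Ax holds, using a centered finite difference to approximate the derivative.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])          # coefficient matrix from Example 4.43
evals, evecs = np.linalg.eig(A)      # eigenvalues 4 and -1 (in some order)

def x(t, c):
    # general solution: columns of evecs scaled by c_i e^{lambda_i t}
    return evecs @ (c * np.exp(evals * t))

# check x' = Ax at a sample point with a centered difference
c = np.array([1.0, -2.0])
t, h = 0.7, 1e-6
deriv = (x(t + h, c) - x(t - h, c)) / (2 * h)
assert np.allclose(deriv, A @ x(t, c), atol=1e-4)
```

Note that NumPy normalizes its eigenvectors, so the constants c here differ from the C1, C2 of the example by scalar factors; the solution set is the same.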
This technique generalizes easily to n×n systems where the coefficient matrix is diagonalizable. The next theorem, whose proof is left as an exercise, summarizes the situation.

Theorem 4.40  Let A be an n×n diagonalizable matrix and let P = [v1 v2 ··· vn] be such that

P^{-1}AP = [ λ1 0 ··· 0 ]
           [ 0 λ2 ··· 0 ]
           [ ⋮  ⋮  ⋱  ⋮ ]
           [ 0  0 ··· λn ]

Then the general solution to the system x′ = Ax is

x = C1e^{λ1t}v1 + C2e^{λ2t}v2 + ··· + Cne^{λnt}vn
The next example involves a biological model in which two species live in the same ecosystem. It is reasonable to assume that the growth rate of each species depends on the sizes of both populations. (Of course, there are other factors that govern growth, but we will keep our model simple by ignoring these.) If x1(t) and x2(t) denote the sizes of the two populations at time t, then x1′(t) and x2′(t) are their rates of growth at time t. Our model is of the form

x1′(t) = ax1(t) + bx2(t)
x2′(t) = cx1(t) + dx2(t)

where the coefficients a, b, c, and d depend on the conditions.
Example 4.44  Raccoons and squirrels inhabit the same ecosystem and compete with each other for food, water, and space. Let the raccoon and squirrel populations at time t years be given by r(t) and s(t), respectively. Suppose that initially there are 60 raccoons and 60 squirrels in the ecosystem, and that the populations are governed by the system x′ = Ax, where

x = x(t) = [r(t); s(t)]  and  A = [2.5 −1.0; −0.25 2.5]

Determine what happens to these two populations.

Solution  The eigenvalues of A are λ1 = 3 and λ2 = 2, with corresponding eigenvectors v1 = [−2; 1] and v2 = [2; 1]. By Theorem 4.40, the general solution to our system is

x(t) = C1e^{3t}v1 + C2e^{2t}v2 = C1e^{3t}[−2; 1] + C2e^{2t}[2; 1]    (7)

The initial population vector is x(0) = [r(0); s(0)] = [60; 60], so, setting t = 0 in equation (7), we have

C1[−2; 1] + C2[2; 1] = [60; 60]

Solving this equation, we find C1 = 15 and C2 = 45. Hence,

x(t) = 15e^{3t}[−2; 1] + 45e^{2t}[2; 1]

from which we find r(t) = −30e^{3t} + 90e^{2t} and s(t) = 15e^{3t} + 45e^{2t}. Figure 4.19 shows the graphs of these two functions, and you can see clearly that the raccoon population dies out after a little more than 1 year. (Can you determine exactly when it dies out?)
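The extinction time asked about above can be found exactly: setting −30e^{3t} + 90e^{2t} = 0 and dividing by 30e^{2t} gives e^t = 3, so t = ln 3 ≈ 1.0986 years. A short check confirms this.

```python
from math import exp, log

def r(t):
    # raccoon population from Example 4.44
    return -30 * exp(3 * t) + 90 * exp(2 * t)

t_star = log(3)          # solve -30 e^{3t} + 90 e^{2t} = 0  =>  e^t = 3
assert abs(r(t_star)) < 1e-9
assert r(t_star - 0.1) > 0 and r(t_star + 0.1) < 0   # sign change at t*
print(round(t_star, 4))  # 1.0986
```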
Figure 4.19  Raccoon and squirrel populations

We now consider a similar example, in which one species is a source of food for the other. Such a model is called a predator-prey model. Once again, our model will be drastically oversimplified in order to illustrate its main features.
Example 4.45  Robins and worms cohabit an ecosystem. The robins eat the worms, which are their only source of food. The robin and worm populations at time t years are denoted by r(t) and w(t), respectively, and the equations governing the growth of the two populations are

r′(t) = w(t) − 12
w′(t) = −r(t) + 10    (8)

If initially 6 robins and 20 worms occupy the ecosystem, determine the behavior of the two populations over time.
Solution  The first thing we notice about this example is the presence of the extra constants, −12 and 10, in the two equations. Fortunately, we can get rid of them with a simple change of variables. If we let r(t) = x(t) + 10 and w(t) = y(t) + 12, then r′(t) = x′(t) and w′(t) = y′(t). Substituting into equations (8), we have

x′(t) = y(t)
y′(t) = −x(t)    (9)

which is easier to work with. Equations (9) have the form x′ = Ax, where A = [0 1; −1 0]. Our new initial conditions are

x(0) = r(0) − 10 = 6 − 10 = −4  and  y(0) = w(0) − 12 = 20 − 12 = 8

so x(0) = [−4; 8].

Proceeding as in the last example, we find the eigenvalues and eigenvectors of A. The characteristic polynomial is λ² + 1, which has no real roots. What should we do? We have no choice but to use the complex roots, which are λ1 = i and λ2 = −i. The corresponding eigenvectors are also complex, namely v1 = [1; i] and v2 = [1; −i]. By Theorem 4.40, our solution has the form

x(t) = C1e^{it}v1 + C2e^{−it}v2 = C1e^{it}[1; i] + C2e^{−it}[1; −i]

From x(0) = [−4; 8], we get

C1[1; i] + C2[1; −i] = [−4; 8]

whose solution is C1 = −2 − 4i and C2 = −2 + 4i. So the solution to system (9) is

x(t) = (−2 − 4i)e^{it}[1; i] + (−2 + 4i)e^{−it}[1; −i]

What are we to make of this solution? Robins and worms inhabit a real world, yet our solution involves complex numbers! Fearlessly proceeding, we apply Euler's formula

e^{it} = cos t + i sin t
CALVIN AND HOBBES © 1988 Watterson. Reprinted with permission of UNIVERSAL PRESS SYNDICATE. All rights reserved.
(Appendix C) to get e^{−it} = cos(−t) + i sin(−t) = cos t − i sin t. Substituting, we have

x(t) = (−2 − 4i)(cos t + i sin t)[1; i] + (−2 + 4i)(cos t − i sin t)[1; −i]
= [ (−2 cos t + 4 sin t) + i(−4 cos t − 2 sin t) ; (4 cos t + 2 sin t) + i(−2 cos t + 4 sin t) ]
  + [ (−2 cos t + 4 sin t) + i(4 cos t + 2 sin t) ; (4 cos t + 2 sin t) + i(2 cos t − 4 sin t) ]
= [ −4 cos t + 8 sin t ; 8 cos t + 4 sin t ]

This gives x(t) = −4 cos t + 8 sin t and y(t) = 8 cos t + 4 sin t. Putting everything in terms of our original variables, we conclude that

r(t) = x(t) + 10 = −4 cos t + 8 sin t + 10
and
w(t) = y(t) + 12 = 8 cos t + 4 sin t + 12
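It is straightforward to verify numerically that these two functions really do satisfy system (8) and the initial conditions; the sketch below approximates r′ and w′ with centered differences.

```python
from math import cos, sin

def r(t):  return -4 * cos(t) + 8 * sin(t) + 10
def w(t):  return  8 * cos(t) + 4 * sin(t) + 12

h = 1e-6
for t in [0.0, 0.5, 1.3, 4.0]:
    r_prime = (r(t + h) - r(t - h)) / (2 * h)
    w_prime = (w(t + h) - w(t - h)) / (2 * h)
    assert abs(r_prime - (w(t) - 12)) < 1e-5      # r' = w - 12
    assert abs(w_prime - (-r(t) + 10)) < 1e-5     # w' = -r + 10
assert r(0) == 6 and w(0) == 20                   # initial conditions
```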
Figure 4.20  Robin and worm populations

Figure 4.21
So our solution is real after all! The graphs of r(t) and w(t) in Figure 4.20 show that the two populations oscillate periodically. As the robin population increases, the worm population starts to decrease, but as the robins' only food source diminishes, their numbers start to decline as well. As the predators disappear, the worm population begins to recover. As its food supply increases, so does the robin population, and the cycle repeats itself. This oscillation is typical of examples in which the eigenvalues are complex. Plotting robins, worms, and time on separate axes, as in Figure 4.21, clearly reveals the cyclic nature of the two populations.
We conclude this section by looking at what we have done from a different point of view. If x = x(t) is a differentiable function of t, then the general solution of the ordinary differential equation x′ = ax is x = ce^{at}, where c is a scalar. The systems of linear differential equations we have been considering have the form x′ = Ax, so if we simply plowed ahead without thinking, we might be tempted to deduce that the solution would be x = e^{At}c, where c is a vector. But what on earth could this mean? On the right-hand side, we have the number e raised to the power of a matrix. This appears to be nonsense, yet you will see that there is a way to make sense of it.

Let's start by considering the expression e^A. In calculus, you learn that the function e^x has a power series expansion

e^x = 1 + x + x²/2! + x³/3! + ···

that converges for every real number x. By analogy, let us define

e^A = I + A + A²/2! + A³/3! + ···

The right-hand side is defined in terms of powers of A, and it can be shown that it converges for any real matrix A. So e^A is a matrix, called the exponential of A. But how can we compute e^A or e^{At}? For diagonal matrices, it is easy.
Example 4.46  Compute e^{Dt} for D = [4 0; 0 −1].

Solution  From the definition, we have

e^{Dt} = I + Dt + (Dt)²/2! + (Dt)³/3! + ···
= [1 0; 0 1] + [4t 0; 0 −t] + (1/2!)[(4t)² 0; 0 (−t)²] + (1/3!)[(4t)³ 0; 0 (−t)³] + ···
= [1 + 4t + (4t)²/2! + (4t)³/3! + ···   0 ; 0   1 + (−t) + (−t)²/2! + (−t)³/3! + ···]
= [e^{4t} 0; 0 e^{−t}]

The matrix exponential is also nice if A is diagonalizable.
Example 4.47  Compute e^A for A = [1 2; 3 2].

Solution  In Example 4.43, we found the eigenvalues of A to be λ1 = 4 and λ2 = −1, with corresponding eigenvectors v1 = [2; 3] and v2 = [1; −1], respectively. Hence, with

P = [v1 v2] = [2 1; 3 −1]

we have P^{-1}AP = D = [4 0; 0 −1]. Since A = PDP^{-1}, we have A^k = PD^kP^{-1}, so

e^A = I + A + A²/2! + A³/3! + ···
= PIP^{-1} + PDP^{-1} + (1/2!)PD²P^{-1} + (1/3!)PD³P^{-1} + ···
= P(I + D + D²/2! + D³/3! + ···)P^{-1}
= Pe^D P^{-1}
= [2 1; 3 −1][e^4 0; 0 e^{-1}](1/5)[1 1; 3 −2]
= (1/5)[2e^4 + 3e^{-1}   2e^4 − 2e^{-1} ; 3e^4 − 3e^{-1}   3e^4 + 2e^{-1}]
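The computation in Example 4.47 can be replicated numerically. A minimal sketch (NumPy; helper names are ours) compares Pe^DP^{-1} with a truncated version of the defining power series and with the closed form just found.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])

# e^A via P e^D P^{-1}, as in Example 4.47
evals, P = np.linalg.eig(A)
expA_diag = P @ np.diag(np.exp(evals)) @ np.linalg.inv(P)

# e^A via the defining power series I + A + A^2/2! + ...
expA_series = np.zeros_like(A)
term = np.eye(2)
for k in range(1, 30):
    expA_series += term
    term = term @ A / k          # term becomes A^k / k!
assert np.allclose(expA_diag, expA_series)

# compare with the closed form (1/5)[[2e^4+3e^-1, 2e^4-2e^-1],
#                                    [3e^4-3e^-1, 3e^4+2e^-1]]
e4, em1 = np.exp(4), np.exp(-1)
closed = np.array([[2*e4 + 3*em1, 2*e4 - 2*em1],
                   [3*e4 - 3*em1, 3*e4 + 2*em1]]) / 5
assert np.allclose(expA_diag, closed)
```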
We are now in a position to show that our bold (and seemingly foolish) guess at an "exponential" solution of x′ = Ax was not so far off after all!
Theorem 4.41  Let A be an n×n diagonalizable matrix with eigenvalues λ1, λ2, ..., λn. Then the general solution to the system x′ = Ax is x = e^{At}c, where c is an arbitrary constant vector. If an initial condition x(0) is specified, then c = x(0).

Proof  Let P diagonalize A. Then A = PDP^{-1}, and, as in Example 4.47,

e^{At} = Pe^{Dt}P^{-1}

Hence, we need to check that x′ = Ax is satisfied by x = e^{At}c = Pe^{Dt}P^{-1}c. Now, everything is constant except for e^{Dt}, so

x′ = dx/dt = d/dt (Pe^{Dt}P^{-1}c) = P (d/dt (e^{Dt})) P^{-1}c    (10)

If

D = [ λ1 0 ··· 0 ]
    [ 0 λ2 ··· 0 ]
    [ ⋮  ⋮  ⋱  ⋮ ]
    [ 0  0 ··· λn ]

then

e^{Dt} = [ e^{λ1t} 0 ··· 0 ]
         [ 0 e^{λ2t} ··· 0 ]
         [ ⋮    ⋮    ⋱   ⋮ ]
         [ 0    0  ··· e^{λnt} ]

Taking derivatives, we have

d/dt (e^{Dt}) = [ λ1e^{λ1t} 0 ··· 0 ]
                [ 0 λ2e^{λ2t} ··· 0 ]
                [ ⋮     ⋮     ⋱   ⋮ ]
                [ 0     0  ··· λne^{λnt} ]

= [ λ1 0 ··· 0 ][ e^{λ1t} 0 ··· 0 ]
  [ 0 λ2 ··· 0 ][ 0 e^{λ2t} ··· 0 ]
  [ ⋮  ⋮  ⋱  ⋮ ][ ⋮    ⋮    ⋱   ⋮ ]
  [ 0  0 ··· λn ][ 0   0  ··· e^{λnt} ]

= De^{Dt}

Substituting this result into equation (10), we obtain

x′ = PDe^{Dt}P^{-1}c = PDP^{-1}Pe^{Dt}P^{-1}c = (PDP^{-1})(Pe^{Dt}P^{-1})c = Ae^{At}c = Ax
as required. The last statement follows easily from the fact that if x = x(t) = e^{At}c, then

x(0) = e^{A·0}c = e^{O}c = Ic = c

since e^{O} = I. (Why?)

In fact, Theorem 4.41 is true even if A is not diagonalizable, but we will not prove this. (For example, see Linear Algebra by S. H. Friedberg, A. J. Insel, and L. E. Spence (Englewood Cliffs, NJ: Prentice-Hall, 1979).) Computation of matrix exponentials for nondiagonalizable matrices requires the Jordan normal form of a matrix, a topic that may be found in more advanced linear algebra texts.

Ideally, this short digression has served to illustrate the power of mathematics to generalize and the value of creative thinking. Matrix exponentials turn out to be very important tools in many applications of linear algebra, both theoretical and applied.
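Theorem 4.41 can be illustrated on the system of Example 4.44. The sketch below is ours, and it assumes that a truncated power series is accurate enough for this small matrix and moderate t; it computes x(t) = e^{At}x(0) and compares the result with the eigenvector solution found earlier.

```python
import numpy as np

A = np.array([[2.5, -1.0],
              [-0.25, 2.5]])        # coefficient matrix from Example 4.44
x0 = np.array([60.0, 60.0])

def expm_At(A, t, terms=60):
    """Truncated series I + At + (At)^2/2! + ... (fine for small ||At||)."""
    E = np.eye(len(A))
    term = np.eye(len(A))
    for k in range(1, terms):
        term = term @ (A * t) / k
        E += term
    return E

t = 0.5
x_t = expm_At(A, t) @ x0            # Theorem 4.41: x(t) = e^{At} x(0)
expected = (15 * np.exp(3 * t) * np.array([-2.0, 1.0])
            + 45 * np.exp(2 * t) * np.array([2.0, 1.0]))
assert np.allclose(x_t, expected)
```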
Discrete Linear Dynamical Systems

We conclude this chapter as we began it: by looking at dynamical systems. Markov chains and the Leslie model of population growth are examples of discrete linear dynamical systems. Each can be described by a matrix equation of the form

x_{k+1} = Ax_k
where the vector x_k records the state of the system at "time" k and A is a square matrix. As we have seen, the long-term behavior of these systems is related to the eigenvalues and eigenvectors of the matrix A. The power method exploits the iterative nature of such dynamical systems to approximate eigenvalues and eigenvectors, and the Perron-Frobenius Theorem gives specialized information about the long-term behavior of a discrete linear dynamical system whose coefficient matrix A is nonnegative.

When A is a 2×2 matrix, we can describe the evolution of a dynamical system geometrically. The equation x_{k+1} = Ax_k is really an infinite collection of equations. Beginning with an initial vector x0, we have

x1 = Ax0
x2 = Ax1
x3 = Ax2
⋮

The set {x0, x1, x2, ...} is called a trajectory of the system. (For graphical purposes, we will identify each vector in a trajectory with its head so that we can plot it as a point.) Note that x_k = A^k x0.
Example 4.48  Let A = [0.5 0; 0 0.8]. For the dynamical system x_{k+1} = Ax_k, plot the first five points in the trajectories with the initial vector (a) x0 = [5; 0] and with the three further initial vectors marked (b), (c), and (d) in Figure 4.22.

Solution  (a) We compute

x1 = Ax0 = [2.5; 0],  x2 = Ax1 = [1.25; 0],  x3 = Ax2 = [0.625; 0],  x4 = Ax3 = [0.3125; 0]

These are plotted in Figure 4.22, and the points are connected to highlight the trajectory. Similar calculations produce the trajectories marked (b), (c), and (d) in Figure 4.22.
Figure 4.22
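Trajectories like these are easy to generate by iteration; the sketch below (the trajectory helper is ours) reproduces the points of part (a) and illustrates that every trajectory is attracted to the origin, since both eigenvalues have absolute value less than 1.

```python
import numpy as np

A = np.array([[0.5, 0.0],
              [0.0, 0.8]])

def trajectory(A, x0, steps):
    """Return [x0, Ax0, A^2 x0, ...] with `steps` iterations."""
    pts = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        pts.append(A @ pts[-1])
    return pts

pts = trajectory(A, [5, 0], 4)
assert np.allclose(pts[1], [2.5, 0]) and np.allclose(pts[4], [0.3125, 0])

# any starting point is pulled to the origin: (0.5)^k and (0.8)^k -> 0
far = trajectory(A, [7, -3], 200)[-1]
assert np.linalg.norm(far) < 1e-9
```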
In Example 4.48, every trajectory converges to 0. The origin is called an attractor in this case. We can understand why this is so from Theorem 4.19. The matrix A in Example 4.48 has eigenvectors [1; 0] and [0; 1] corresponding to its eigenvalues 0.5 and 0.8, respectively. (Check this.) Accordingly, for any initial vector

x0 = c1[1; 0] + c2[0; 1]

we have

x_k = c1(0.5)^k[1; 0] + c2(0.8)^k[0; 1]

Because both (0.5)^k and (0.8)^k approach zero as k gets large, x_k approaches 0 for any choice of x0. In addition, we know from Theorem 4.28 that because 0.8 is the dominant eigenvalue of A, x_k will approach a multiple of the corresponding eigenvector [0; 1] as long as c2 ≠ 0 (the coefficient of x0 corresponding to [0; 1]). In other words, all trajectories except those that begin on the x-axis (where c2 = 0) will approach the y-axis, as Figure 4.22 shows.
Example 4.49  Discuss the behavior of the dynamical system x_{k+1} = Ax_k corresponding to the matrix

A = [0.65 −0.15; −0.15 0.65]

Solution  The eigenvalues of A are 0.5 and 0.8, with corresponding eigenvectors [1; 1] and [−1; 1], respectively. (Check this.) Hence, for an initial vector x0 = c1[1; 1] + c2[−1; 1], we have

x_k = c1(0.5)^k[1; 1] + c2(0.8)^k[−1; 1]

Once again the origin is an attractor, because x_k approaches 0 for any choice of x0. If c2 ≠ 0, the trajectory will approach the line through the origin with direction vector [−1; 1]. Several such trajectories are shown in Figure 4.23. The vectors x0 where c2 = 0 are on the line through the origin with direction vector [1; 1], and the corresponding trajectory in this case follows this line into the origin.
Figure 4.23
Example 4.50  Discuss the behavior of the dynamical systems x_{k+1} = Ax_k corresponding to the following matrices:

(a) A = [4 1; 1 4]    (b) A = [1 0.5; 0.5 1]

Solution  (a) The eigenvalues of A are 5 and 3, with corresponding eigenvectors [1; 1] and [−1; 1], respectively. Hence, for an initial vector x0 = c1[1; 1] + c2[−1; 1], we have

x_k = c1 5^k [1; 1] + c2 3^k [−1; 1]

As k becomes large, so do both 5^k and 3^k. Hence, x_k tends away from the origin. Because the dominant eigenvalue 5 has corresponding eigenvector [1; 1], all trajectories for which c1 ≠ 0 will eventually end up in the first or the third quadrant. Trajectories with c1 = 0 start and stay on the line y = −x, whose direction vector is [−1; 1]. See Figure 4.24(a).

(b) In this example, the eigenvalues are 1.5 and 0.5, with corresponding eigenvectors [1; 1] and [−1; 1], respectively. Hence
x_k = c1(1.5)^k[1; 1] + c2(0.5)^k[−1; 1]

If c1 = 0, then x_k = c2(0.5)^k[−1; 1] → [0; 0] as k → ∞. But if c1 ≠ 0, then x_k grows large, and such trajectories asymptotically approach the line y = x. See Figure 4.24(b).

Figure 4.24
In Example 4.50(a), all points that start out near the origin become increasingly large in magnitude because |λ| > 1 for both eigenvalues; 0 is called a repeller. In Example 4.50(b), 0 is called a saddle point because the origin attracts points in some directions and repels points in other directions. In this case, |λ1| > 1 and |λ2| < 1. The next example shows what can happen when the eigenvalues of a real 2×2 matrix are complex (and hence conjugates of one another).
Example 4.51  Plot the trajectory beginning with the initial vector x0 shown in Figure 4.25 for the dynamical systems x_{k+1} = Ax_k corresponding to the following matrices:

(a) A = [0.5 −0.5; 0.5 0.5]    (b) A = [0.2 −1.2; 0.6 1.4]

Solution  The trajectories are shown in Figure 4.25(a) and (b), respectively. Note that (a) is a trajectory spiraling into the origin, whereas (b) appears to follow an elliptical orbit.
Figure 4.25
The following theorem explains the spiral behavior of the trajectory in Example 4.51(a).

Theorem 4.42  Let A = [a −b; b a]. The eigenvalues of A are λ = a ± bi, and if a and b are not both zero, then A can be factored as

A = [a −b; b a] = [r 0; 0 r][cos θ −sin θ; sin θ cos θ]

where r = |λ| = √(a² + b²) and θ is the principal argument of a + bi.

Proof  The eigenvalues of A are

λ = ½(2a ± √(−4b²)) = ½(2a ± 2|b|√(−1)) = a ± |b|i = a ± bi

by Exercise 35(b) in Section 4.1. Figure 4.26 displays a, b, r, and θ. It follows that

A = [a −b; b a] = r[a/r −b/r; b/r a/r] = [r 0; 0 r][cos θ −sin θ; sin θ cos θ]
Figure 4.26

Geometrically, Theorem 4.42 implies that when A = [a −b; b a] ≠ O, the linear transformation T(x) = Ax is the composition of a rotation

R = [cos θ −sin θ; sin θ cos θ]

through the angle θ followed by a scaling S = [r 0; 0 r] with factor r (Figure 4.27). In Example 4.51(a), the eigenvalues are λ = 0.5 ± 0.5i, so r = |λ| = √(1/2) ≈ 0.707 < 1, and hence the trajectories all spiral inward toward 0.

The next theorem shows that, in general, when a real 2×2 matrix has complex eigenvalues, it is similar to a matrix of the form [a −b; b a]. For a complex vector
Figure 4.27  A rotation followed by a scaling
x = [z; w] = [a + bi; c + di], we define the real part, Re x, and the imaginary part, Im x, of x to be

Re x = [Re z; Re w] = [a; c]  and  Im x = [Im z; Im w] = [b; d]

Theorem 4.43  Let A be a real 2×2 matrix with a complex eigenvalue λ = a − bi (where b ≠ 0) and corresponding eigenvector x. Then the matrix P = [Re x | Im x] is invertible and

A = P[a −b; b a]P^{-1}
Proof  Let x = u + vi so that Re x = u and Im x = v. From Ax = λx, we have

Ax = Au + Avi = λx = (a − bi)(u + vi) = au + avi − bui + bv = (au + bv) + (−bu + av)i

Equating real and imaginary parts, we obtain

Au = au + bv  and  Av = −bu + av

Now P = [u | v], so

P[a −b; b a] = [u | v][a −b; b a] = [au + bv | −bu + av] = [Au | Av] = A[u | v] = AP

To show that P is invertible, it is enough to show that u and v are linearly independent. If u and v were not linearly independent, then it would follow that v = ku for some (nonzero complex) scalar k, because neither u nor v is 0. Thus

x = u + vi = u + kui = (1 + ki)u

Now, because A is real, Ax = λx implies that

Ax̄ = A̅x̄ = λ̄x̄

so x̄ = u − vi is an eigenvector corresponding to the other eigenvalue λ̄ = a + bi. But

x̄ = ((1 + ki)u)‾ = (1 − ki)u

because u is a real vector. Hence, the eigenvectors x and x̄ of A are both nonzero multiples of u and therefore are multiples of one another. This is impossible because eigenvectors corresponding to distinct eigenvalues must be linearly independent, by Theorem 4.20. (This theorem is valid over the complex numbers as well as the real numbers.) This contradiction implies that u and v are linearly independent and hence P is invertible. It now follows that
A = P[a −b; b a]P^{-1}

Theorem 4.43 serves to explain Example 4.51(b). The eigenvalues of

A = [0.2 −1.2; 0.6 1.4]

are 0.8 ± 0.6i. For λ = 0.8 − 0.6i, a corresponding eigenvector is x = [−1 − i; 1]. From Theorem 4.43, it follows that for

P = [Re x | Im x] = [−1 −1; 1 0]  and  C = [0.8 −0.6; 0.6 0.8]

we have A = PCP^{-1} and P^{-1}AP = C.

For the given dynamical system x_{k+1} = Ax_k, we perform a change of variable. Let

x_k = Py_k  (or, equivalently, y_k = P^{-1}x_k)

Then

y_{k+1} = P^{-1}x_{k+1} = P^{-1}Ax_k = P^{-1}APy_k = Cy_k
Now C has the same eigenvalues as A (Why?) and |0.8 ± 0.6i| = 1. Thus the dynamical system y_{k+1} = Cy_k simply rotates the points in every trajectory in a circle about the origin, by Theorem 4.42. To determine a trajectory of the dynamical system in Example 4.51(b), we iteratively apply the linear transformation T(x) = Ax = PCP^{-1}x. The transformation can be thought of as the composition of a change of variable (x to y), followed by the rotation determined by C, followed by the reverse change of variable (y back to x). We will encounter this idea again in the application to graphing quadratic equations in Section 5.5 and, more generally, as "change of basis" in Section 6.3. In Exercise 96 of Section 5.5, you will show that the trajectory in Example 4.51(b) is indeed an ellipse, as it appears to be from Figure 4.25(b).

To summarize, then: If a real 2×2 matrix A has complex eigenvalues λ = a ± bi, then the trajectories of the dynamical system x_{k+1} = Ax_k spiral inward if |λ| < 1 (0 is a spiral attractor), spiral outward if |λ| > 1 (0 is a spiral repeller), and lie on a closed orbit if |λ| = 1 (0 is an orbital center).
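The factorizations of Theorems 4.42 and 4.43 can be verified numerically for the matrices of Example 4.51; the sketch below (NumPy) checks the scaling-rotation decomposition for part (a) and the similarity A = PCP^{-1} for part (b).

```python
import numpy as np

# Theorem 4.42: [[a, -b], [b, a]] = (scaling by r)(rotation by theta)
a, b = 0.5, 0.5                      # Example 4.51(a)
r = abs(complex(a, b))               # r = |a + bi| = sqrt(a^2 + b^2)
theta = np.angle(complex(a, b))      # principal argument of a + bi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.diag([r, r])
assert np.allclose(S @ R, [[a, -b], [b, a]])

# Theorem 4.43 on Example 4.51(b): A = P C P^{-1}
A = np.array([[0.2, -1.2],
              [0.6,  1.4]])
P = np.array([[-1.0, -1.0],
              [ 1.0,  0.0]])         # [Re x | Im x] for x = (-1 - i, 1)
C = np.array([[0.8, -0.6],
              [0.6,  0.8]])
assert np.allclose(P @ C @ np.linalg.inv(P), A)
```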
Ranking Sports Teams
In any competitive sports league, it is not necessarily a straightforward process to rank the players or teams. Counting wins and losses alone overlooks the possibility that one team may accumulate a large number of victories against weak teams, while another team may have fewer victories but all of them against strong teams. Which of these teams is better? How should we compare two teams that never play one another? Should points scored be taken into account? Points against? Despite these complexities, the ranking of athletes and sports teams has become a commonplace and much-anticipated feature in the media. For example, there are various annual rankings of U.S. college football and basketball teams, and golfers and tennis players are also ranked internationally.

There are many copyrighted schemes used to produce such rankings, but we can gain some insight into how to approach the problem by using the ideas from this chapter. To establish the basic idea, let's revisit Example 3.68. Five tennis players play one another in a round-robin tournament. Wins and losses are recorded in the form of a digraph in which a directed edge from i to j indicates that player i defeats player j. The corresponding adjacency matrix A therefore has a_ij = 1 if player i defeats player j and a_ij = 0 otherwise:

A = [ 0 1 0 1 1 ]
    [ 0 0 1 1 1 ]
    [ 1 0 0 1 0 ]
    [ 0 0 0 0 1 ]
    [ 0 0 1 0 0 ]
We would like to associate a ranking r_i with player i in such a way that r_i > r_j indicates that player i is ranked more highly than player j. For this purpose, let's require that the r_i's be probabilities (that is, 0 ≤ r_i ≤ 1 for all i, and r1 + r2 + r3 + r4 + r5 = 1) and then organize the rankings in a ranking vector

r = [r1; r2; r3; r4; r5]

Furthermore, let's insist that player i's ranking should be proportional to the sum of the rankings of the players defeated by player i. For example, player 1 defeated players 2, 4, and 5, so we want

r1 = α(r2 + r4 + r5)

where α is the constant of proportionality. Writing out similar equations for the other players produces the following system:

r1 = α(r2 + r4 + r5)
r2 = α(r3 + r4 + r5)
r3 = α(r1 + r4)
r4 = α r5
r5 = α r3
Observe that we can write this system in matrix form as r = αAr. Equivalently, we see that the ranking vector r must satisfy Ar = (1/α)r. In other words, r is an eigenvector of A corresponding to the eigenvalue 1/α. Furthermore, A is a primitive nonnegative matrix, so the Perron-Frobenius Theorem guarantees that there is a unique ranking vector r. In this example, the ranking vector turns out to be

r ≈ [0.29; 0.27; 0.22; 0.08; 0.14]

so we would rank the players in the order 1, 2, 3, 5, 4.

By modifying the matrix A, it is possible to take into account many of the complexities mentioned in the opening paragraph. However, this simple example has served to indicate one useful approach to the problem of ranking teams.

The same idea can be used to understand how an Internet search engine such as Google works. Older search engines used to return the results of a search unordered. Useful sites would often be buried among irrelevant ones. Much scrolling was often needed to uncover what you were looking for. By contrast, Google returns search results ordered according to their likely relevance. Thus, a method for ranking websites is needed.

Instead of teams playing one another, we now have websites linking to one another. We can once again use a digraph to model the situation, only now an edge from i to j indicates that website i links to (or refers to) website j. So whereas for the sports team digraph, incoming directed edges are bad (they indicate losses), for the Internet digraph, incoming directed edges are good (they indicate links from other sites). In
this setting, we want the ranking of website i to be proportional to the sum of the rankings of all the websites that link to i. Using the digraph on page 353 to represent just five websites, we have

r4 = α(r1 + r2 + r3)

for example. It is easy to see that we now want to use the transpose of the adjacency matrix of the digraph. Therefore, the ranking vector r must satisfy A^T r = (1/α)r and will thus be the Perron eigenvector of A^T. In this example, we obtain

A^T = [ 0 0 1 0 0 ]        and        r ≈ [0.14; 0.08; 0.22; 0.27; 0.29]
      [ 1 0 0 0 0 ]
      [ 0 1 0 0 1 ]
      [ 1 1 1 0 0 ]
      [ 1 1 0 1 0 ]

so a search that turns up these five sites would list them in the order 5, 4, 3, 1, 2. Google actually uses a variant of the method described here and computes the ranking vector via an iterative method very similar to the power method (Section 4.5).
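Both ranking vectors above can be reproduced with a short power-iteration sketch. The helper perron_ranking below is ours, and its normalization (entries summing to 1) mirrors the probability requirement placed on r; since A is primitive, the iteration converges to the Perron eigenvector.

```python
import numpy as np

# adjacency matrix of the round-robin digraph (player i beats j => A[i][j] = 1)
A = np.array([[0, 1, 0, 1, 1],
              [0, 0, 1, 1, 1],
              [1, 0, 0, 1, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 1, 0, 0]], dtype=float)

def perron_ranking(M, iters=500):
    """Power iteration, normalized so the entries sum to 1."""
    r = np.full(len(M), 1.0 / len(M))
    for _ in range(iters):
        r = M @ r
        r = r / r.sum()
    return r

r_players = perron_ranking(A)      # approx [0.29, 0.27, 0.22, 0.08, 0.14]
r_sites = perron_ranking(A.T)      # approx [0.14, 0.08, 0.22, 0.27, 0.29]
print(np.argsort(-r_players) + 1)  # [1 2 3 5 4]
print(np.argsort(-r_sites) + 1)    # [5 4 3 1 2]
```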
17. If all of the survival rates s_i are nonzero, let

P = [ 1  0   0  ···  0 ]
    [ 0  s1  0  ···  0 ]
    [ 0  0 s1s2 ···  0 ]
    [ ⋮  ⋮   ⋮   ⋱   ⋮ ]
    [ 0  0   0  ··· s1s2⋯s_{n−1} ]

4. [ 0.1 0 0.5 ]
   [ 0.5 1 0   ]
   [ 0.4 0 0.5 ]
Which of the stochastic matrices in Exercises 1–6 are regular?

Compute P^{-1}LP and use it to find the characteristic polynomial of L. (Hint: Refer to Exercise 32 in Section 4.3.)

18. Verify that an eigenvector of L corresponding to λ1 is
In Exercises 7–9, P is the transition matrix of a regular Markov chain. Find the long-range transition matrix L of P.

x1 = [1; s1/λ1; s1s2/λ1²; s1s2s3/λ1³; ⋮; s1s2⋯s_{n−1}/λ1^{n−1}]

9. P = [ 0.2 0.3 0.4 ]
       [ 0.6 0.1 0.4 ]
       [ 0.2 0.6 0.2 ]

10. Prove that the steady state probability vector of a regular Markov chain is unique. (Hint: Use Theorem 4.33 or Theorem 4.34.)

Population Growth

In Exercises 11–14, calculate the positive eigenvalue and a corresponding positive eigenvector of the Leslie matrix L.
11. [00.5 0']
12. [I05 01.5] L =
t =
074 13. L =
05
0
0
14. L =
15
3
~
0
0
15. If a Leslie matrix has a unique positive eigenvalue λ1, what is the significance for the population if λ1 > 1? λ1 < 1? λ1 = 1?

16. Verify that the characteristic polynomial of the Leslie matrix L in equation (3) is

c_L(λ) = (−1)^n(λ^n − b1λ^{n−1} − b2s1λ^{n−2} − b3s1s2λ^{n−3} − ··· − b_n s1s2⋯s_{n−1})

(Hint: Use mathematical induction and expand det(L − λI) along the last column.)

(Hint for Exercise 18: Combine Exercise 17 above with Exercise 32 in Section 4.3 and Exercise 46 in Section 4.4.)
In Exercises 19–21, compute the steady state growth rate of the population with the Leslie matrix L from the given exercise. Then use Exercise 18 to help find the corresponding distribution of the age classes.

19. Exercise 19 in Section 3.7
20. Exercise 20 in Section 3.7
21. Exercise 24 in Section 3.7

22. Many species of seal have suffered from commercial hunting. They have been killed for their skin, blubber, and meat. The fur trade, in particular, reduced some seal populations to the point of extinction. Today, the greatest threats to seal populations are decline of fish stocks due to overfishing, pollution, disturbance of habitat, entanglement in marine debris, and culling by fishery owners. Some seals have been declared endangered species; other species are carefully managed. Table 4.7 gives the birth and survival rates for the northern fur seal, divided into 2-year age classes. [The data are based on A. E. York and J. R. Hartley, "Pup Production Following Harvest of Female Northern Fur Seals," Canadian Journal of Fisheries and Aquatic Science, 38 (1981), pp. 84–90.]
Table 4.7

Age (years)   Birth Rate   Survival Rate
0-2           0.00         0.91
2-4           0.02         0.88
4-6           0.70         0.85
6-8           1.53         0.80
8-10          1.67         0.74
10-12         1.65         0.67
12-14         1.56         0.59
14-16         1.45         0.49
16-18         1.22         0.38
18-20         0.91         0.27
20-22         0.70         0.17
22-24         0.22         0.15
24-26         0.00         0.00

22. (a) Construct the Leslie matrix L for these data and compute the positive eigenvalue and a corresponding positive eigenvector.
(b) In the long run, what percentage of seals will be in each age class and what will the growth rate be?

Exercise 23 shows that the long-run behavior of a population can be determined directly from the entries of its Leslie matrix.

23. The net reproduction rate of a population is defined as

r = b_1 + b_2 s_1 + b_3 s_1 s_2 + ··· + b_n s_1 s_2 ··· s_(n-1)

where the b_i are the birth rates and the s_j are the survival rates for the population.
(a) Explain why r can be interpreted as the average number of daughters born to a single female over her lifetime.
(b) Show that r = 1 if and only if λ_1 = 1. (This represents zero population growth.) [Hint: Let

g(λ) = b_1/λ + b_2 s_1/λ² + ··· + b_n s_1 s_2 ··· s_(n-1)/λⁿ

Show that λ is an eigenvalue of L if and only if g(λ) = 1.]
(c) Assuming that there is a unique positive eigenvalue λ_1, show that r < 1 if and only if the population is decreasing and r > 1 if and only if the population is increasing.

A sustainable harvesting policy is a procedure that allows a certain fraction of a population (represented by a population distribution vector x) to be harvested so that the population returns to x after one time interval (where a time interval is the length of one age class). If h is the fraction of each age class that is harvested, then we can express the harvesting procedure mathematically as follows: If we start with a population vector x, after one time interval we have Lx; harvesting removes hLx, leaving

Lx - hLx = (1 - h)Lx

Sustainability requires that

(1 - h)Lx = x

24. If λ_1 is the unique positive eigenvalue of a Leslie matrix L and h is the sustainable harvest ratio, prove that h = 1 - 1/λ_1.

25. (a) Find the sustainable harvest ratio for the woodland caribou in Exercise 24 in Section 3.7.
(b) Using the data in Exercise 24 in Section 3.7, reduce the caribou herd according to your answer to part (a). Verify that the population returns to its original level after one time interval.

26. Find the sustainable harvest ratio for the seal in Exercise 22. (Conservationists have had to harvest seal populations when overfishing has reduced the available food supply to the point where the seals are in danger of starvation.)

27. Let L be a Leslie matrix with a unique positive eigenvalue λ_1. Show that if λ is any other (real or complex) eigenvalue of L, then |λ| ≤ λ_1. [Hint: Write λ = r(cos θ + i sin θ) and substitute it into the equation g(λ) = 1, as in part (b) of Exercise 23. Use De Moivre's Theorem and then take the real part of both sides. The Triangle Inequality should prove useful.]
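For Exercise 22, the dominant eigenvalue and eigenvector can be estimated with the power method of Section 4.5. The sketch below builds the Leslie matrix directly from the data of Table 4.7 in plain Python; the function names are illustrative choices, not notation from the text.

```python
# Power iteration for the dominant (Perron) eigenvalue of the Leslie matrix
# built from Table 4.7. The birth rates form the first row of L; the survival
# rates form its subdiagonal (the last age class, 24-26, has survival rate 0,
# so only the first 12 survival rates appear in the matrix).

birth = [0.00, 0.02, 0.70, 1.53, 1.67, 1.65, 1.56, 1.45,
         1.22, 0.91, 0.70, 0.22, 0.00]
survival = [0.91, 0.88, 0.85, 0.80, 0.74, 0.67,
            0.59, 0.49, 0.38, 0.27, 0.17, 0.15]

def leslie_step(x):
    """Apply the Leslie matrix L to an age-distribution vector x."""
    newborns = sum(b * xi for b, xi in zip(birth, x))
    survivors = [s * xi for s, xi in zip(survival, x)]  # classes shift down
    return [newborns] + survivors

def dominant_eigenpair(iterations=2000):
    """Power iteration: the scale factor converges to the positive
    eigenvalue and the normalized vector to the positive eigenvector."""
    x = [1.0] * len(birth)
    lam = 1.0
    for _ in range(iterations):
        y = leslie_step(x)
        lam = max(y)
        x = [yi / lam for yi in y]
    return lam, x
```

Since the net reproduction rate r of Exercise 23 is well above 1 for these data, the computed eigenvalue comes out greater than 1, i.e., a growing population per 2-year time step.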
Chapter 4  Eigenvalues and Eigenvectors
The Perron-Frobenius Theorem

In Exercises 28-31, find the Perron root and the corresponding Perron eigenvector of A.

28. A = [2 × 2 matrix; entries illegible in the scan]

29. A = [0 1 1]
        [1 0 1]
        [1 1 0]

30. A = [2 × 2 matrix; entries illegible in the scan]

31. A = [2 1 1]
        [1 1 0]
        [1 0 1]

It can be shown that a nonnegative n × n matrix A is irreducible if and only if (I + A)^(n-1) > O. In Exercises 32-35, use this criterion to determine whether the matrix A is irreducible. If A is reducible, find a permutation of its rows and columns that puts A into the block form

[B C]
[O D]

32.-35. [The 0-1 matrices A for these exercises are illegible in the scan.]

36. (a) If A is the adjacency matrix of a graph G, show that A is irreducible if and only if G is connected. (A graph is connected if there is a path between every pair of vertices.)
(b) Which of the graphs in Section 4.0 have an irreducible adjacency matrix? Which have a primitive adjacency matrix?

37. Let G be a bipartite graph with adjacency matrix A.
(a) Show that A is not primitive.
(b) Show that if λ is an eigenvalue of A, so is -λ. [Hint: Use Exercise 60 in Section 3.7 and partition an eigenvector for λ so that it is compatible with this partitioning of A. Use this partitioning to find an eigenvector for -λ.]

38. A graph is called k-regular if k edges meet at each vertex. Let G be a k-regular graph.
(a) Show that the adjacency matrix A of G has λ = k as an eigenvalue. [Hint: Adapt Theorem 4.30.]
(b) Show that if A is primitive, then the other eigenvalues are all less than k in absolute value. [Hint: Adapt Theorem 4.31.]

39. Explain the results of your exploration in Section 4.0 in light of Exercises 36-38 and this section.

40. Let A, B, C, and D be n × n matrices, x be in Rⁿ, and c be a scalar. Prove the following matrix inequalities:
(a) |cA| = |c||A|
(b) |A + B| ≤ |A| + |B|
(c) |Ax| ≤ |A||x|
(d) |AB| ≤ |A||B|
(e) If A ≥ B ≥ O and C ≥ D ≥ O, then AC ≥ BD ≥ O.

Linear Recurrence Relations

In Exercises 41-44, write out the first six terms of the sequence defined by the recurrence relation with the given initial conditions.

41. x_0 = 1, x_n = 2x_(n-1) for n ≥ 1
42. a_1 = 128, a_n = a_(n-1)/2 for n ≥ 2
43. y_0 = 0, y_1 = 1, y_n = y_(n-1) - y_(n-2) for n ≥ 2
44. b_0 = 1, b_1 = 1, b_n = 2b_(n-1) + b_(n-2) for n ≥ 2

In Exercises 45-50, solve the recurrence relation with the given initial conditions.

45. x_0 = 0, x_1 = 5, x_n = 3x_(n-1) + 4x_(n-2) for n ≥ 2
46. x_0 = 0, x_1 = 1, x_n = 4x_(n-1) - 3x_(n-2) for n ≥ 2
47. y_1 = 1, y_2 = 6, y_n = 4y_(n-1) - 4y_(n-2) for n ≥ 3
48. a_0 = 4, a_1 = 1, a_n = a_(n-1) - a_(n-2)/4 for n ≥ 2
49. b_0 = 0, b_1 = 1, b_n = 2b_(n-1) + 2b_(n-2) for n ≥ 2
50. The recurrence relation in Exercise 43. Show that your solution agrees with the answer to Exercise 43.

51. Complete the proof of Theorem 4.38(a) by showing that if the recurrence relation x_n = ax_(n-1) + bx_(n-2) has distinct eigenvalues λ_1 ≠ λ_2, then the solution will be of the form

x_n = c_1 λ_1ⁿ + c_2 λ_2ⁿ

[Hint: Show that the method of Example 4.40 works in general.]

52. Show that for any choice of initial conditions x_0 = r and x_1 = s, the scalars c_1 and c_2 can be found, as stated in Theorem 4.38(a) and (b).
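The method behind Theorem 4.38(a), which Exercises 45-52 practice, can be sketched in code for the distinct-real-root case. This is an illustrative helper, not the text's notation; it fits c_1 and c_2 to the initial conditions and returns the closed form x_n = c_1 λ_1ⁿ + c_2 λ_2ⁿ, checked here on Exercise 45 and on the Fibonacci numbers.

```python
import math

def solve_recurrence(a, b, x0, x1):
    """Closed form for x_n = a*x_(n-1) + b*x_(n-2).
    Handles only the case of distinct real roots of l^2 - a*l - b = 0."""
    disc = a * a + 4 * b
    assert disc > 0, "this sketch handles distinct real roots only"
    r1 = (a + math.sqrt(disc)) / 2
    r2 = (a - math.sqrt(disc)) / 2
    # Fit the constants: c1 + c2 = x0 and c1*r1 + c2*r2 = x1.
    c2 = (x1 - r1 * x0) / (r2 - r1)
    c1 = x0 - c2
    return lambda n: c1 * r1 ** n + c2 * r2 ** n

# Fibonacci: f_n = f_(n-1) + f_(n-2), f_0 = 0, f_1 = 1 (Binet's formula).
fib = solve_recurrence(1, 1, 0, 1)

# Exercise 45: roots 4 and -1 give x_n = 4^n - (-1)^n.
x45 = solve_recurrence(3, 4, 0, 5)
```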
53. The Fibonacci recurrence f_n = f_(n-1) + f_(n-2) has the associated matrix equation x_n = Ax_(n-1), where

A = [1 1]   and   x_n = [f_(n+1)]
    [1 0]             [f_n    ]

(a) With f_0 = 0 and f_1 = 1, use mathematical induction to prove that

Aⁿ = [f_(n+1)  f_n    ]
     [f_n      f_(n-1)]

for all n ≥ 1.
(b) Using part (a), prove that

f_(n+1) f_(n-1) - f_n² = (-1)ⁿ

for all n ≥ 1. [This is called Cassini's Identity, after the astronomer Giovanni Domenico Cassini (1625-1712). Cassini was born in Italy but, on the invitation of Louis XIV, moved in 1669 to France, where he became director of the Paris Observatory. He became a French citizen and adopted the French version of his name: Jean Dominique Cassini. Mathematics was one of his many interests other than astronomy. Cassini's Identity was published in 1680 in a paper submitted to the Royal Academy of Sciences in Paris.]
(c) An 8 × 8 checkerboard can be dissected as shown in Figure 4.28(a) and the pieces reassembled to form the 5 × 13 rectangle in Figure 4.28(b). The area of the square is 64 square units, but the rectangle's area is 65 square units! Where did the extra square come from? [Hint: What does this have to do with the Fibonacci sequence?]

Figure 4.28

54. You have a supply of three kinds of tiles: two kinds of 1 × 2 tiles and one kind of 1 × 1 tile, as shown in Figure 4.29.

Figure 4.29

Let t_n be the number of different ways to cover a 1 × n rectangle with these tiles. For example, Figure 4.30 shows that t_3 = 5.

Figure 4.30 The five ways to tile a 1 × 3 rectangle

(a) Find t_1, ..., t_5. (Does t_0 make any sense? If so, what is it?)
(b) Set up a second-order recurrence relation for t_n.
(c) Using t_1 and t_2 as the initial conditions, solve the recurrence relation in part (b). Check your answer against the data in part (a).

55. You have a supply of 1 × 2 dominoes with which to cover a 2 × n rectangle. Let d_n be the number of different ways to cover the rectangle. For example, Figure 4.31 shows that d_3 = 3.
(a) Find d_1, ..., d_5. (Does d_0 make any sense? If so, what is it?)
(b) Set up a second-order recurrence relation for d_n.
(c) Using d_1 and d_2 as the initial conditions, solve the recurrence relation in part (b). Check your answer against the data in part (a).
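The count d_3 = 3 shown in Figure 4.31 can be confirmed by exhaustive search. The sketch below always covers the first free cell with either a horizontal or a vertical domino, so each tiling is generated exactly once; it does not presuppose the recurrence that Exercise 55(b) asks you to find.

```python
def count_tilings(cells):
    """Count the ways to tile the given set of free cells (row, col)
    with 1 x 2 dominoes, by always covering the first free cell."""
    if not cells:
        return 1
    r, c = min(cells)                        # first free cell
    total = 0
    for other in ((r, c + 1), (r + 1, c)):   # horizontal, then vertical
        if other in cells:
            total += count_tilings(cells - {(r, c), other})
    return total

def d(n):
    """Number of domino tilings of a 2 x n rectangle (Exercise 55)."""
    return count_tilings(frozenset((r, c) for r in range(2) for c in range(n)))
```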
Figure 4.31 The three ways to cover a 2 × 3 rectangle with 1 × 2 dominoes

56. In Example 4.41, find eigenvectors v_1 and v_2 corresponding to

λ_1 = (1 + √5)/2   and   λ_2 = (1 - √5)/2

With x_k = [f_(k+1); f_k], verify formula (2) in Section 4.5. That is, show that, for some scalar c_1,

lim(k→∞) x_k/λ_1^k = c_1 v_1

Systems of Linear Differential Equations

In Exercises 57-62, find the general solution to the given system of differential equations. Then find the specific solution that satisfies the initial conditions. (Consider all functions to be functions of t.)

57. x′ = 2x + 3y, x(0) = 0
    y′ = x + 2y,  y(0) = 5

58. x′ = 2x - y,  x(0) = 1
    y′ = -x + 2y, y(0) = 1

59. x_1′ = x_1 + x_2, x_1(0) = 1
    x_2′ = x_1 - x_2, x_2(0) = 0

60. y_1′ = y_1 - y_2, y_1(0) = 1
    y_2′ = y_1 + y_2, y_2(0) = 1

61. x′ = y + z, x(0) = 1
    y′ = x + z, y(0) = 0
    z′ = x + y, z(0) = -1

62. x′ = x + z,  x(0) = 2
    y′ = x - 2y, y(0) = 3
    z′ = 3x + z, z(0) = 4

63. A scientist places two strains of bacteria, X and Y, in a petri dish. Initially, there are 400 of X and 500 of Y. The two bacteria compete for food and space but do not feed on each other. If x = x(t) and y = y(t) are the numbers of the strains at time t days, the growth rates of the two populations are given by the system

x′ = 1.2x - 0.2y
y′ = -0.2x + 1.5y

Determine what happens to these two populations.

64. Two species, X and Y, live in a symbiotic relationship. That is, neither species can survive on its own and each depends on the other for its survival. Initially, there are 15 of X and 10 of Y. If x = x(t) and y = y(t) are the sizes of the populations at time t months, the growth rates of the two populations are given by the system

x′ = -0.8x + 0.4y
y′ = 0.4x - 0.2y

(a) Determine what happens to these two populations by solving the system of differential equations.
(b) Explore the effect of changing the initial populations by letting x(0) = a and y(0) = b. Describe what happens to the populations in terms of a and b.

In Exercises 65 and 66, species X preys on species Y. The sizes of the populations are represented by x = x(t) and y = y(t). The growth rate of each population is governed by the system of differential equations x′ = Ax + b, where

x = [x]
    [y]

and b is a constant vector. Determine what happens to the two populations for the given A and b and initial conditions x(0). [First show that there are constants a and b such that the substitutions x = u + a and y = v + b convert the system into an equivalent one with no constant terms.]

65. [A, b, and x(0) are illegible in the scan.]
66. [A, b, and x(0) are illegible in the scan.]

67. Let x = x(t) be a twice-differentiable function and consider the second-order differential equation

x″ + ax′ + bx = 0    (11)

(a) Show that the change of variables y = x′ and z = x allows equation (11) to be written as a system of two linear differential equations in y and z.
(b) Show that the characteristic equation of the system in part (a) is λ² + aλ + b = 0.
Chapter Review
68. Show that there is a change of variables that converts the nth-order differential equation

x^(n) + a_(n-1) x^(n-1) + ··· + a_1 x′ + a_0 x = 0

into a system of n linear differential equations whose coefficient matrix is the companion matrix C(p) of the polynomial p(λ) = λⁿ + a_(n-1)λ^(n-1) + ··· + a_1 λ + a_0. [The notation x^(k) denotes the kth derivative of x. See Exercises 26-32 in Section 4.3 for the definition of a companion matrix.]

In Exercises 69 and 70, use Exercise 67 to find the general solution of the given equation.

69. x″ - 5x′ + 6x = 0
70. x″ + 4x′ + 3x = 0
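Exercises 67 and 69 fit together numerically: rewriting x″ - 5x′ + 6x = 0 with y = (x, x′) gives y′ = Ay with A = [0 1; -6 5], whose characteristic equation λ² - 5λ + 6 = 0 has roots 2 and 3, so x = c_1 e^(2t) + c_2 e^(3t). The Runge-Kutta integrator below is an illustrative sketch; with x(0) = 1 and x′(0) = 2 the constants work out to c_1 = 1, c_2 = 0, so the exact solution is x(t) = e^(2t).

```python
import math

def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f([yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def system(y):
    """y = (x, x'); the companion form of x'' = 5x' - 6x (Exercise 69)."""
    return [y[1], -6.0 * y[0] + 5.0 * y[1]]

def integrate(t_end=1.0, steps=1000):
    y, h = [1.0, 2.0], t_end / steps      # x(0) = 1, x'(0) = 2
    for _ in range(steps):
        y = rk4_step(system, y, h)
    return y[0]                            # x(t_end), approximately e^(2 t_end)
```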
In Exercises 71-74, solve the system of differential equations in the given exercise using Theorem 4.41.

71. Exercise 57    72. Exercise 58
73. Exercise 61    74. Exercise 62

Discrete Linear Dynamical Systems

In Exercises 75-82, consider the dynamical system x_(k+1) = Ax_k.
(a) Compute and plot x_0, x_1, x_2, x_3 for x_0 = [1; 1].
(b) Compute and plot x_0, x_1, x_2, x_3 for a second initial vector x_0 [entries illegible in the scan].
(c) Using eigenvalues and eigenvectors, classify the origin as an attractor, repeller, saddle point, or none of these.
(d) Sketch several typical trajectories of the system.

75.-82. [The matrices A for these exercises are illegible in the scan.]

In Exercises 83-86, the given matrix is of the form A = [a -b; b a]. In each case, A can be factored as the product of a scaling matrix and a rotation matrix. Find the scaling factor r and the angle θ of rotation. Sketch the first four points of the trajectory for the dynamical system x_(k+1) = Ax_k with x_0 = [1; 1] and classify the origin as a spiral attractor, spiral repeller, or orbital center.

83.-85. [The matrices A for these exercises are illegible in the scan.]

86. A = [√3/2  -1/2]
        [1/2   √3/2]

In Exercises 87-90, find an invertible matrix P and a matrix C of the form C = [a -b; b a] such that A = PCP⁻¹. Sketch the first six points of the trajectory for the dynamical system x_(k+1) = Ax_k with x_0 = [1; 1] and classify the origin as a spiral attractor, spiral repeller, or orbital center.

87. A = [0.1  -0.2]
        [0.1   0.3]

88.-90. [The matrices A for these exercises are illegible in the scan.]
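The classifications asked for in Exercises 75-90 depend only on the eigenvalue magnitudes of A, which for a 2 × 2 matrix follow from its trace and determinant. A sketch of that criterion; the test matrices used here are illustrative choices, not the exercises' matrices.

```python
import cmath

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix via trace and determinant
    (complex-safe, so rotation-like matrices are handled too)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def classify_origin(A):
    """Classify the origin for x_{k+1} = A x_k by eigenvalue magnitudes."""
    lo, hi = sorted(abs(l) for l in eigenvalues_2x2(A))
    if hi < 1:
        return "attractor"
    if lo > 1:
        return "repeller"
    if lo < 1 < hi:
        return "saddle point"
    return "none of these"
```

For a matrix of the form [a -b; b a] (Exercises 83-86), both eigenvalues have magnitude r = √(a² + b²), so the same test distinguishes spiral attractors (r < 1), spiral repellers (r > 1), and orbital centers (r = 1).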
Key Definitions and Concepts

adjoint of a matrix, 275
algebraic multiplicity of an eigenvalue, 291
characteristic equation, 289
characteristic polynomial, 289
cofactor expansion, 265
Cramer's Rule, 273-274
determinant, 262-264
diagonalizable matrix, 300
eigenvalue, 253
eigenvector, 253
eigenspace, 255
Fundamental Theorem of Invertible Matrices, 293
geometric multiplicity of an eigenvalue, 291
Gerschgorin's Disk Theorem, 318
Laplace Expansion Theorem, 265
power method (and its variants), 308-316
properties of determinants, 268-273
similar matrices, 298
Review Questions

1. Mark each of the following statements true or false:
(a) For all square matrices A, det(-A) = -det A.
(b) If A and B are n × n matrices, then det(AB) = det(BA).
(c) If A and B are n × n matrices whose columns are the same but in different orders, then det B = -det A.
(d) If A is invertible, then det(A⁻¹) = det Aᵀ.
(e) If 0 is the only eigenvalue of a square matrix A, then A is the zero matrix.
(f) Two eigenvectors corresponding to the same eigenvalue must be linearly dependent.
(g) If an n × n matrix has n distinct eigenvalues, then it must be diagonalizable.
(h) If an n × n matrix is diagonalizable, then it must have n distinct eigenvalues.
(i) Similar matrices have the same eigenvectors.
(j) If A and B are two n × n matrices with the same reduced row echelon form, then A is similar to B.
2. Let A = [3 × 3 matrix; entries illegible in the scan].
(a) Compute det A by cofactor expansion along any row or column.
(b) Compute det A by first reducing A to triangular form.

3. If

|a b c|
|d e f| = 3
|g h i|

find

|3d 2e -4f|
|3a 2b -4c|
|3g 2h -4i|

4. Let A and B be 4 × 4 matrices with det A = 2 and det B = -1/4. Find det C for the indicated matrix C:
(a) C = (AB)⁻¹
(b) C = A²B(3Aᵀ)
5. If A is a skew-symmetric n × n matrix and n is odd, prove that det A = 0.
6. Find all values of k for which the given 3 × 3 determinant (entries, involving k and k², illegible in the scan) equals 0.
In Questions 7 and 8, show that x is an eigenvector of A and find the corresponding eigenvalue.

7. [x and A are illegible in the scan.]
8. [x and A are illegible in the scan.]

9. Let A = [3 × 3 matrix; entries illegible in the scan].
(a) Find the characteristic polynomial of A.
(b) Find all of the eigenvalues of A.
(c) Find a basis for each of the eigenspaces of A.
(d) Determine whether A is diagonalizable. If A is not diagonalizable, explain why not. If A is diagonalizable, find an invertible matrix P and a diagonal matrix D such that P⁻¹AP = D.

10. If A is a 3 × 3 diagonalizable matrix with eigenvalues -2, 3, and 4, find det A.

11. If A is a 2 × 2 matrix with eigenvalues λ_1 = 1/2 and λ_2 = -1 and corresponding eigenvectors v_1 = [1; 1] and v_2 = [1; -1], find A⁵x for x = [vector illegible in the scan].
12. If A is a diagonalizable matrix and all of its eigenvalues satisfy |λ| < 1, prove that Aⁿ approaches the zero matrix as n gets large.
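Question 12 can be illustrated numerically. The matrix below is an illustrative upper-triangular choice, so its eigenvalues 0.5 and -0.25 sit on the diagonal; it is not a matrix taken from the text.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, p):
    """A^p by repeated multiplication, starting from the identity."""
    n = len(A)
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):
        P = mat_mul(P, A)
    return P

A = [[0.5, 1.0],
     [0.0, -0.25]]   # eigenvalues 0.5 and -0.25, both of magnitude < 1
```

By the 60th power every entry is smaller than 10⁻¹², illustrating that Aⁿ tends to the zero matrix.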
In Questions 13-15, determine, with reasons, whether A is similar to B. If A ~ B, give an invertible matrix P such that P⁻¹AP = B.

13. [A and B are 2 × 2 matrices; entries illegible in the scan.]
14. [A and B are 2 × 2 matrices; entries illegible in the scan.]
15. [A and B are 3 × 3 matrices; entries illegible in the scan.]

16. Let A = [2 × 2 matrix involving k; entries illegible in the scan]. Find all values of k for which:
(a) A has eigenvalues 3 and -1.
(b) A has an eigenvalue with algebraic multiplicity 2.
(c) A has no real eigenvalues.
17. If A³ = A, what are the possible eigenvalues of A?

18. If a square matrix A has two equal rows, why must A have 0 as one of its eigenvalues?

19. If x is an eigenvector of A with eigenvalue λ = 3, show that x is also an eigenvector of A² - 5A + 2I. What is the corresponding eigenvalue?

20. If A is similar to B with P⁻¹AP = B and x is an eigenvector of A, show that P⁻¹x is an eigenvector of B.
... that sprightly Scot of Scots, Douglas, that runs a-horseback up a hill perpendicular
William Shakespeare, Henry IV, Part I, Act II, Scene IV
5.0 Introduction: Shadows on a Wall

In this chapter, we will extend the notion of orthogonal projection that we encountered first in Chapter 1 and then again in Chapter 3. Until now, we have discussed only projection onto a single vector (or, equivalently, the one-dimensional subspace spanned by that vector). In this section, we will see if we can find the analogous formulas for projection onto a plane in R³. Figure 5.1 shows what happens, for example, when parallel light rays create a shadow on a wall. A similar process occurs when a three-dimensional object is displayed on a two-dimensional screen, such as a computer monitor. Later in this chapter, we will consider these ideas in full generality.

Figure 5.1 Shadows on a wall are projections

To begin, let's take another look at what we already know about projections. In Section 3.6, we showed that, in R², the standard matrix of a projection onto the line through the origin with direction vector d = [d_1; d_2] is

P = [d_1²/(d_1² + d_2²)     d_1 d_2/(d_1² + d_2²)]
    [d_1 d_2/(d_1² + d_2²)  d_2²/(d_1² + d_2²)  ]

Hence, the projection of the vector v onto this line is just Pv.

Problem 1 Show that P can be written in the equivalent form

P = [cos²θ        cos θ sin θ]
    [cos θ sin θ  sin²θ      ]

(What does θ represent here?)

Problem 2 Show that P can also be written in the form P = uuᵀ, where u is a unit vector in the direction of d.

Problem 3 Using Problem 2, find P and then find the projection of v = [3; -4] onto the lines with the following unit direction vectors: [the direction vectors are illegible in the scan]

Problem 4 Using the form P = uuᵀ, show that (a) Pᵀ = P (i.e., P is symmetric) and (b) P² = P (i.e., P is idempotent).
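Problems 2-4 can be spot-checked numerically. Here d = (3, 4) is an illustrative direction vector (not necessarily one of Problem 3's unit vectors), v = (3, -4) is the vector being projected, and the helper names are assumptions of this sketch.

```python
import math

def proj_matrix(d):
    """P = u u^T for the unit vector u in the direction of d (Problem 2)."""
    norm = math.hypot(d[0], d[1])
    u = [di / norm for di in d]
    return [[ui * uj for uj in u] for ui in u], u

def mat_vec(P, v):
    """Matrix-vector product Pv."""
    return [sum(p * vj for p, vj in zip(row, v)) for row in P]

P, u = proj_matrix([3.0, 4.0])
v = [3.0, -4.0]
pv = mat_vec(P, v)   # the projection of v onto the line through d
```

The test of symmetry and idempotency here is exactly Problem 4, and Pv agrees with the familiar formula (u · v)u.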
Chapter 5 Orthogonality
Problem 5 Explain why, if P is a 2 × 2 projection matrix, the line onto which it projects vectors is the column space of P.

Now we will move into R³ and consider projections onto planes through the origin. We will explore several approaches. Figure 5.2 shows one way to proceed. If 𝒫 is a plane through the origin in R³ with normal vector n, and if v is a vector in R³, then p = proj_𝒫(v) is a vector in 𝒫 such that v - cn = p for some scalar c.

Figure 5.2 Projection onto a plane
Problem 6 Using the fact that n is orthogonal to every vector in 𝒫, solve v - cn = p for c to find an expression for p in terms of v and n.

Problem 7 Use the method of Problem 6 to find the projection of

v = [1; 0; 2]

onto the planes with the following equations:
(a) x + y + z = 0   (b) x - 2z = 0   (c) 2x - 3y + z = 0
Another approach to the problem of finding the projection of a vector onto a plane is suggested by Figure 5.3. We can decompose the projection of v onto 𝒫 into the sum of its projections onto the direction vectors for 𝒫. This works only if the direction vectors are orthogonal unit vectors. Accordingly, let u_1 and u_2 be direction vectors for 𝒫 with the property that

u_1 · u_1 = u_2 · u_2 = 1   and   u_1 · u_2 = 0

Figure 5.3 p = p_1 + p_2
Section 5.1  Orthogonality in Rⁿ
By Problem 2, the projections of v onto u_1 and u_2 are

p_1 = (u_1 · v)u_1   and   p_2 = (u_2 · v)u_2

respectively. To show that p_1 + p_2 gives the projection of v onto 𝒫, we need to show that v - (p_1 + p_2) is orthogonal to 𝒫. It is enough to show that v - (p_1 + p_2) is orthogonal to both u_1 and u_2. (Why?)

Problem 8 Show that u_1 · (v - (p_1 + p_2)) = 0 and u_2 · (v - (p_1 + p_2)) = 0. [Hint: Use the alternative form of the dot product, x · y = xᵀy, together with the fact that u_1 and u_2 are orthogonal unit vectors.]

It follows from Problem 8 and the comments preceding it that the matrix of the projection onto the subspace 𝒫 of R³ spanned by the orthogonal unit vectors u_1 and u_2 is

P = u_1 u_1ᵀ + u_2 u_2ᵀ    (1)
Problem 9 Repeat Problem 7, using the formula for P given by equation (1). Use the same v, and use u_1 and u_2 as indicated below. (First, verify that u_1 and u_2 are orthogonal unit vectors in the given plane.)

(a) x + y + z = 0   (b) x - 2z = 0   (c) 2x - 3y + z = 0
[The given pairs u_1, u_2 are only partly legible in the scan; for (b) they are u_1 = [2/√5; 0; 1/√5] and u_2 = [0; 1; 0].]

Problem 10 Show that a projection matrix given by equation (1) satisfies properties (a) and (b) of Problem 4.

Problem 11 Show that the matrix P of a projection onto a plane in R³ can be expressed as P = AAᵀ for some 3 × 2 matrix A. [Hint: Show that equation (1) is an outer product expansion.]

Problem 12 Show that if P is the matrix of a projection onto a plane in R³, then rank(P) = 2.

In this chapter, we will look at the concepts of orthogonality and orthogonal projection in greater detail. We will see that the ideas introduced in this section can be generalized and that they have many important applications.
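Equation (1) and Problems 10-12 can be checked numerically on the plane x - 2z = 0 of Problem 7(b). The orthonormal pair u_1 = (2, 0, 1)/√5 and u_2 = (0, 1, 0) is an assumed choice of direction vectors for that plane.

```python
import math

s5 = math.sqrt(5.0)
u1 = [2 / s5, 0.0, 1 / s5]       # lies in x - 2z = 0, unit length
u2 = [0.0, 1.0, 0.0]             # lies in the plane, orthogonal to u1
n = [1.0, 0.0, -2.0]             # normal vector of the plane

# Equation (1): P = u1 u1^T + u2 u2^T
P = [[u1[i] * u1[j] + u2[i] * u2[j] for j in range(3)] for i in range(3)]
```

P is symmetric and idempotent (Problem 10), it sends the normal n to the zero vector, and its trace is 2; for a projection matrix the trace equals the rank, matching Problem 12.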
Orthogonality in Rⁿ

In this section, we will generalize the notion of orthogonality of vectors in Rⁿ from two vectors to sets of vectors. In doing so, we will see that two properties make the standard basis {e_1, e_2, ..., e_n} of Rⁿ easy to work with: First, any two distinct vectors
in the set are orthogonal. Second, each vector in the set is a unit vector. These two properties lead us to the notion of orthogonal bases and orthonormal bases, concepts that we will be able to apply fruitfully to a variety of applications.

Orthogonal and Orthonormal Sets of Vectors
Definition   A set of vectors {v_1, v_2, ..., v_k} in Rⁿ is called an orthogonal set if all pairs of distinct vectors in the set are orthogonal; that is, if

v_i · v_j = 0   whenever i ≠ j,   for i, j = 1, 2, ..., k

The standard basis {e_1, e_2, ..., e_n} of Rⁿ is an orthogonal set, as is any subset of it. As the first example illustrates, there are many other possibilities.
Example 5.1   Show that {v_1, v_2, v_3} is an orthogonal set in R³ if

v_1 = [2; 1; -1],   v_2 = [0; 1; 1],   v_3 = [1; -1; 1]

Solution   We must show that every pair of vectors from this set is orthogonal. This is true, since

v_1 · v_2 = 2(0) + 1(1) + (-1)(1) = 0
v_2 · v_3 = 0(1) + 1(-1) + (1)(1) = 0
v_1 · v_3 = 2(1) + 1(-1) + (-1)(1) = 0

Geometrically, the vectors in Example 5.1 are mutually perpendicular, as Figure 5.4 shows.

Figure 5.4 An orthogonal set of vectors
One of the main advantages of working with orthogonal sets of vectors is that they are necessarily linearly independent, as Theorem 5.1 shows.

Theorem 5.1   If {v_1, v_2, ..., v_k} is an orthogonal set of nonzero vectors in Rⁿ, then these vectors are linearly independent.

Proof   If c_1, ..., c_k are scalars such that c_1 v_1 + ··· + c_k v_k = 0, then

(c_1 v_1 + ··· + c_k v_k) · v_i = 0 · v_i = 0

or, equivalently,

c_1(v_1 · v_i) + ··· + c_i(v_i · v_i) + ··· + c_k(v_k · v_i) = 0    (1)

Since {v_1, v_2, ..., v_k} is an orthogonal set, all of the dot products in equation (1) are zero, except v_i · v_i. Thus, equation (1) reduces to

c_i(v_i · v_i) = 0

Now, v_i · v_i ≠ 0 because v_i ≠ 0 by hypothesis. So we must have c_i = 0. The fact that this is true for all i = 1, ..., k implies that {v_1, v_2, ..., v_k} is a linearly independent set.

Remark   Thanks to Theorem 5.1, we know that if a set of vectors is orthogonal, it is automatically linearly independent. For example, we can immediately deduce that the three vectors in Example 5.1 are linearly independent. Contrast this approach with the work needed to establish their linear independence directly!
Definition   An orthogonal basis for a subspace W of Rⁿ is a basis of W that is an orthogonal set.
Example 5.2   The vectors

v_1 = [2; 1; -1],   v_2 = [0; 1; 1],   v_3 = [1; -1; 1]

from Example 5.1 are orthogonal and, hence, linearly independent. Since any three linearly independent vectors in R³ form a basis for R³, by the Fundamental Theorem of Invertible Matrices, it follows that {v_1, v_2, v_3} is an orthogonal basis for R³.

Remark   In Example 5.2, suppose only the orthogonal vectors v_1 and v_2 were given and you were asked to find a third vector v_3 to make {v_1, v_2, v_3} an orthogonal basis for R³. One way to do this is to remember that in R³, the cross product of two vectors v_1 and v_2 is orthogonal to each of them. (See Exploration: The Cross Product in Chapter 1.) Hence we may take

v_3 = v_1 × v_2 = [2; 1; -1] × [0; 1; 1] = [2; -2; 2]

Note that the resulting vector is a multiple of the vector v_3 in Example 5.2, as it must be.
Example 5.3   Find an orthogonal basis for the subspace W of R³ given by

W = { [x; y; z] : x - y + 2z = 0 }

Solution   Section 5.3 gives a general procedure for problems of this sort. For now, we will find the orthogonal basis by brute force. The subspace W is a plane through the origin in R³. From the equation of the plane, we have x = y - 2z, so W consists of vectors of the form

[x; y; z] = [y - 2z; y; z] = y[1; 1; 0] + z[-2; 0; 1]

It follows that u = [1; 1; 0] and v = [-2; 0; 1] are a basis for W, but they are not orthogonal. It suffices to find another nonzero vector in W that is orthogonal to either one of these.

Suppose w = [x; y; z] is a vector in W that is orthogonal to u. Then x - y + 2z = 0, since w is in the plane W. Since u · w = 0, we also have x + y = 0. Solving the linear system

x - y + 2z = 0
x + y      = 0

we find that x = -z and y = z. (Check this.) Thus, any nonzero vector w of the form

w = [-z; z; z]

will do. To be specific, we could take w = [-1; 1; 1]. It is easy to check that {u, w} is an orthogonal set in W and, hence, an orthogonal basis for W, since dim W = 2.
Another advantage of working with an orthogonal basis is that the coordinates of a vector with respect to such a basis are easy to compute. Indeed, there is a formula for these coordinates, as the following theorem establishes.

Theorem 5.2   Let {v_1, v_2, ..., v_k} be an orthogonal basis for a subspace W of Rⁿ and let w be any vector in W. Then the unique scalars c_1, ..., c_k such that

w = c_1 v_1 + ··· + c_k v_k

are given by

c_i = (w · v_i)/(v_i · v_i)   for i = 1, ..., k

Proof   Since {v_1, v_2, ..., v_k} is a basis for W, we know that there are unique scalars c_1, ..., c_k such that w = c_1 v_1 + ··· + c_k v_k (from Theorem 3.29). To establish the formula for c_i, we take the dot product of this linear combination with v_i to obtain

w · v_i = (c_1 v_1 + ··· + c_k v_k) · v_i
        = c_1(v_1 · v_i) + ··· + c_i(v_i · v_i) + ··· + c_k(v_k · v_i)
        = c_i(v_i · v_i)

since v_j · v_i = 0 for j ≠ i. Since v_i ≠ 0, v_i · v_i ≠ 0. Dividing by v_i · v_i, we obtain the desired result.
Example 5.4
Find the coordinates of w = [1; 2; 3] with respect to the orthogonal basis B = {v_1, v_2, v_3} of Examples 5.1 and 5.2.

Solution   Using Theorem 5.2, we compute

c_1 = (w · v_1)/(v_1 · v_1) = (2 + 2 - 3)/(4 + 1 + 1) = 1/6
c_2 = (w · v_2)/(v_2 · v_2) = (0 + 2 + 3)/(0 + 1 + 1) = 5/2
c_3 = (w · v_3)/(v_3 · v_3) = (1 - 2 + 3)/(1 + 1 + 1) = 2/3

Thus,

w = c_1 v_1 + c_2 v_2 + c_3 v_3 = (1/6)v_1 + (5/2)v_2 + (2/3)v_3

(Check this.) With the notation introduced in Section 3.5, we can also write the above equation as

[w]_B = [1/6; 5/2; 2/3]
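Theorem 5.2 reduces coordinate-finding to dot products, as Example 5.4 shows. A short sketch with exact rational arithmetic, using the data of Examples 5.1 and 5.4:

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def coords(w, basis):
    """c_i = (w . v_i)/(v_i . v_i) for an orthogonal basis (Theorem 5.2)."""
    return [Fraction(dot(w, v), dot(v, v)) for v in basis]

basis = [[2, 1, -1], [0, 1, 1], [1, -1, 1]]   # Examples 5.1 and 5.2
w = [1, 2, 3]
c = coords(w, basis)                           # expect 1/6, 5/2, 2/3
```

Summing c_i v_i reconstructs w exactly, which is the "Check this" step of Example 5.4.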
Compare the procedure in Example 5.4 with the work required to find these coordinates directly, and you should start to appreciate the value of orthogonal bases. As noted at the beginning of this section, the other property of the standard basis in Rⁿ is that each standard basis vector is a unit vector. Combining this property with orthogonality, we have the following definition.
Definition   A set of vectors in Rⁿ is an orthonormal set if it is an orthogonal set of unit vectors. An orthonormal basis for a subspace W of Rⁿ is a basis of W that is an orthonormal set.

Remark   If S = {q_1, ..., q_k} is an orthonormal set of vectors, then q_i · q_j = 0 for i ≠ j and ||q_i|| = 1. The fact that each q_i is a unit vector is equivalent to q_i · q_i = 1. It follows that we can summarize the statement that S is orthonormal as

q_i · q_j = 0 if i ≠ j,   q_i · q_j = 1 if i = j
Example 5.5   Show that S = {q_1, q_2} is an orthonormal set in R³ if

q_1 = [1/√3; -1/√3; 1/√3]   and   q_2 = [1/√6; 2/√6; 1/√6]

Solution   We check that

q_1 · q_2 = 1/√18 - 2/√18 + 1/√18 = 0
q_1 · q_1 = 1/3 + 1/3 + 1/3 = 1
q_2 · q_2 = 1/6 + 4/6 + 1/6 = 1
If we have an orthogonal set, we can easily obtain an orthonormal set from it: We simply normalize each vector.

Example 5.6   Construct an orthonormal basis for R³ from the vectors in Example 5.1.

Solution   Since we already know that v_1, v_2, and v_3 are an orthogonal basis, we normalize them to get

q_1 = (1/||v_1||)v_1 = (1/√6)[2; 1; -1] = [2/√6; 1/√6; -1/√6]
q_2 = (1/||v_2||)v_2 = (1/√2)[0; 1; 1] = [0; 1/√2; 1/√2]
q_3 = (1/||v_3||)v_3 = (1/√3)[1; -1; 1] = [1/√3; -1/√3; 1/√3]
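The normalization step of Example 5.6 is mechanical; the sketch below normalizes the orthogonal set of Example 5.1, after which all pairwise dot products are 0 and all self dot products are 1.

```python
import math

def normalize(v):
    """Divide a vector by its length to get a unit vector."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

orthogonal = [[2, 1, -1], [0, 1, 1], [1, -1, 1]]   # Example 5.1
q = [normalize(v) for v in orthogonal]              # orthonormal set
```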
Since any orthonormal set of vectors is, in particular, orthogonal, it is linearly independent, by Theorem 5.1. If we have an orthonormal basis, Theorem 5.2 becomes even simpler.

Theorem 5.3   Let {q_1, q_2, ..., q_k} be an orthonormal basis for a subspace W of Rⁿ and let w be any vector in W. Then

w = (w · q_1)q_1 + (w · q_2)q_2 + ··· + (w · q_k)q_k

and this representation is unique.

Proof   Apply Theorem 5.2 and use the fact that q_i · q_i = 1 for i = 1, ..., k.
Orthogonal Matrices

Matrices whose columns form an orthonormal set arise frequently in applications, as you will see in Section 5.5. Such matrices have several attractive properties, which we now examine.

Theorem 5.4   The columns of an m × n matrix Q form an orthonormal set if and only if QᵀQ = I_n.

Proof
We need to show that

(QᵀQ)_ij = 0 if i ≠ j,   (QᵀQ)_ij = 1 if i = j

Let q_i denote the ith column of Q (and, hence, the ith row of Qᵀ). Since the (i, j) entry of QᵀQ is the dot product of the ith row of Qᵀ and the jth column of Q, it follows that

(QᵀQ)_ij = q_i · q_j    (2)

by the definition of matrix multiplication. Now the columns of Q form an orthonormal set if and only if

q_i · q_j = 0 if i ≠ j,   q_i · q_j = 1 if i = j

which, by equation (2), holds if and only if

(QᵀQ)_ij = 0 if i ≠ j,   (QᵀQ)_ij = 1 if i = j

This completes the proof.

Orthogonal matrix is an unfortunate bit of terminology. "Orthonormal matrix" would clearly be a better term, but it is not standard. Moreover, there is no term for a nonsquare matrix with orthonormal columns.

If the matrix Q in Theorem 5.4 is a square matrix, it has a special name.
Definition   An n × n matrix Q whose columns form an orthonormal set is called an orthogonal matrix.

The most important fact about orthogonal matrices is given by the next theorem.

Theorem 5.5   A square matrix Q is orthogonal if and only if Q⁻¹ = Qᵀ.

Proof   By Theorem 5.4, Q is orthogonal if and only if QᵀQ = I. This is true if and only if Q is invertible and Q⁻¹ = Qᵀ, by Theorem 3.13.

Example 5.7   Show that the following matrices are orthogonal and find their inverses:
A = [0 1 0]        and        B = [cos θ  -sin θ]
    [0 0 1]                       [sin θ   cos θ]
    [1 0 0]

Solution   The columns of A are just the standard basis vectors for R³, which are clearly orthonormal. Hence, A is orthogonal and

A⁻¹ = Aᵀ = [0 0 1]
           [1 0 0]
           [0 1 0]

For B, we check directly that

BᵀB = [cos θ   sin θ] [cos θ  -sin θ]
      [-sin θ  cos θ] [sin θ   cos θ]

    = [cos²θ + sin²θ                -cos θ sin θ + sin θ cos θ]
      [-sin θ cos θ + cos θ sin θ   sin²θ + cos²θ             ]

    = [1 0]
      [0 1]

Therefore, B is orthogonal, by Theorem 5.5, and

B⁻¹ = Bᵀ = [cos θ   sin θ]
           [-sin θ  cos θ]
The word isometry literally means "length-preserving," since it is derived from the Greek roots isos ("equal") and metron ("measure").

Remark   Matrix A in Example 5.7 is an example of a permutation matrix, a matrix obtained by permuting the columns of an identity matrix. In general, any n × n permutation matrix is orthogonal (see Exercise 25). Matrix B is the matrix of a rotation through the angle θ in R². Any rotation has the property that it is a length-preserving transformation (known as an isometry in geometry). The next theorem shows that every orthogonal matrix transformation is an isometry. Orthogonal matrices also preserve dot products. In fact, orthogonal matrices are characterized by either one of these properties.

Theorem 5.6   Let Q be an n × n matrix. The following statements are equivalent:
a. Q is orthogonal.
b. ||Qx|| = ||x|| for every x in Rⁿ.
c. Qx · Qy = x · y for every x and y in Rⁿ.
Proof  We will prove that (a) ⇒ (c) ⇒ (b) ⇒ (a). To do so, we will need to make use of the fact that if x and y are (column) vectors in R^n, then x · y = x^T y.

(a) ⇒ (c)  Assume that Q is orthogonal. Then Q^T Q = I, and we have

Qx · Qy = (Qx)^T Qy = x^T Q^T Q y = x^T I y = x^T y = x · y

(c) ⇒ (b)  Assume that Qx · Qy = x · y for every x and y in R^n. Then, taking y = x, we have Qx · Qx = x · x, so ||Qx|| = √(Qx · Qx) = √(x · x) = ||x||.

(b) ⇒ (a)  Assume that property (b) holds and let q_i denote the ith column of Q. Using Exercise 49 in Section 1.2 and property (b), we have

x · y = ¼(||x + y||² − ||x − y||²)
      = ¼(||Q(x + y)||² − ||Q(x − y)||²)
      = ¼(||Qx + Qy||² − ||Qx − Qy||²)
      = Qx · Qy

for all x and y in R^n. [This shows that (b) ⇒ (c).] Now if e_i is the ith standard basis vector, then q_i = Qe_i. Consequently,

q_i · q_j = Qe_i · Qe_j = e_i · e_j = { 1 if i = j
                                      { 0 if i ≠ j

Thus, the columns of Q form an orthonormal set, so Q is an orthogonal matrix.
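Theorem 5.6 can also be illustrated numerically. The following sketch is ours, not part of the text: it builds a random orthogonal Q from a QR factorization (any orthogonal matrix would do) and checks properties (b) and (c) on random vectors.

```python
import numpy as np

rng = np.random.default_rng(1)

# The QR factorization of a full-rank random matrix yields an orthogonal Q.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

x = rng.standard_normal(4)
y = rng.standard_normal(4)

# (b) Q preserves lengths: ||Qx|| = ||x||.
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))

# (c) Q preserves dot products: Qx . Qy = x . y.
assert np.isclose((Q @ x) @ (Q @ y), x @ y)
```

Of course, a finite number of random checks is evidence, not proof; the proof above is what establishes the equivalence in general.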
Section 5.1  Orthogonality in R^n  313
Looking at the orthogonal matrices A and B in Example 5.7, you may notice that not only do their columns form orthonormal sets, so do their rows. In fact, every orthogonal matrix has this property, as the next theorem shows.
Theorem 5.7

If Q is an orthogonal matrix, then its rows form an orthonormal set.
Proof  From Theorem 5.5, we know that Q^-1 = Q^T. Therefore,

(Q^T)^-1 = (Q^-1)^-1 = Q = (Q^T)^T

so Q^T is an orthogonal matrix. Thus, the columns of Q^T (which are just the rows of Q) form an orthonormal set.

The final theorem in this section lists some other properties of orthogonal matrices.
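In matrix terms, Theorem 5.7 says that Q Q^T = I whenever Q^T Q = I. A quick numerical check of ours (not from the text), using the permutation matrix of Example 5.7:

```python
import numpy as np

# The permutation matrix A of Example 5.7.
Q = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])

# Orthonormal columns: Q^T Q = I.
assert np.allclose(Q.T @ Q, np.eye(3))

# By Theorem 5.7 the rows are orthonormal too: Q Q^T = I.
assert np.allclose(Q @ Q.T, np.eye(3))
```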
Theorem 5.8

Let Q be an orthogonal matrix.
a. Q^-1 is orthogonal.
b. det Q = ±1
c. If λ is an eigenvalue of Q, then |λ| = 1.
d. If Q₁ and Q₂ are orthogonal n×n matrices, then so is Q₁Q₂.
"11 1 We will prove properl y (e) and leave the proofs of the rema inin g pro perties as exercises. (c) LeI A be an eige nval ue of 0 with corr esp ond ing eigenv~c lor v. Then
and , uSlh g The orem 5.6(b), we have
Ilvll
Since ; vl
'*
I Qvl1 0, this implies that IAI == I. =
I Avi
=
=
Qv "" Av,
IAlll vl1
Remark  Property (c) holds even for complex eigenvalues. The matrix

[ 0  −1
  1   0 ]

is orthogonal, with eigenvalues i and −i, both of which have absolute value 1.
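For the rotation matrix in the Remark, the determinant and eigenvalue properties of Theorem 5.8 are easy to confirm numerically. This sketch is ours, not part of the text:

```python
import numpy as np

# The matrix from the Remark: rotation through 90 degrees.
Q = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# (b) det Q = +-1; here det Q = 1, as for every rotation.
assert np.isclose(np.linalg.det(Q), 1.0)

# (c) The eigenvalues are the complex numbers i and -i;
#     both have absolute value 1.
eigvals = np.linalg.eigvals(Q)
assert np.allclose(np.abs(eigvals), 1.0)
assert np.allclose(sorted(eigvals.imag), [-1.0, 1.0])
```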
Exercises 5.1

In Exercises 1–6, determine which sets of vectors are orthogonal.

1.–6. (The vector entries for these exercises are not legible in this copy.)

In Exercises 7–10, show that the given vectors form an orthogonal basis for R^2 or R^3. Then use Theorem 5.2 to express w as a linear combination of these basis vectors. Give the coordinate vector [w]_B of w with respect to the basis B = {v₁, v₂} of R^2 or B = {v₁, v₂, v₃} of R^3.

7.–10. (The vector entries for these exercises are not legible in this copy.)

In Exercises 11–15, determine whether the given orthogonal set of vectors is orthonormal. If it is not, normalize the vectors to form an orthonormal set.

11.–15. (The vector entries for these exercises are not legible in this copy.)

In Exercises 16–21, determine whether the given matrix is orthogonal. If it is, find its inverse.

16.–21. (The matrix entries for these exercises are not legible in this copy.)

22. Prove Theorem 5.8(a).

23. Prove Theorem 5.8(b).

24. Prove Theorem 5.8(d).

25. Prove that every permutation matrix is orthogonal.

26. If Q is an orthogonal matrix, prove that any matrix obtained by rearranging the rows of Q is also orthogonal.

27. Let Q be an orthogonal 2×2 matrix and let x and y be vectors in R^2. If θ is the angle between x and y, prove that the angle between Qx and Qy is also θ. (This proves that the linear transformations defined by orthogonal matrices are angle-preserving in R^2, a fact that is true in general.)

28. (a) Prove that an orthogonal 2×2 matrix must have the form

[ a  −b ]    or    [ a   b ]
[ b   a ]          [ b  −a ]

where [a; b] is a unit vector.
(b) Using part (a), show that every orthogonal 2×2 matrix is of the form

[ cos θ  −sin θ ]    or    [ cos θ   sin θ ]
[ sin θ   cos θ ]          [ sin θ  −cos θ ]

where 0 ≤ θ < 2π.
(c) Show that every orthogonal 2×2 matrix corresponds to either a rotation or a reflection in R^2.
(d) Show that an orthogonal 2×2 matrix Q corresponds to a rotation in R^2 if det Q = 1 and a reflection in R^2 if det Q = −1.

In Exercises 29–32, use Exercise 28 to determine whether the given orthogonal matrix represents a rotation or a reflection. If it is a rotation, give the angle of rotation; if it is a reflection, give the line of reflection.

29. [ 1/√2  −1/√2
      1/√2   1/√2 ]

30. [ −1/2    √3/2
      −√3/2  −1/2 ]

31. [ −1/2  √3/2
       √3/2  1/2 ]

32. (The matrix entries for this exercise are not legible in this copy.)

33. Let A and B be n×n orthogonal matrices.
(a) Prove that A(A^T + B^T)B = A + B.
(b) Use part (a) to prove that, if det A + det B = 0, then A + B is not invertible.

34. Let x be a unit vector in R^n. Partition x as

x = [ x₁ ]
    [ y  ]

where y is in R^(n−1). Let

Q = [ x₁   −y^T                   ]
    [ y    I − (1/(1 + x₁)) y y^T ]

Prove that Q is orthogonal. (This procedure gives a quick method for finding an orthonormal basis for R^n with a prescribed first vector x, a construction that is frequently useful in applications.)

35. Prove that if an upper triangular matrix is orthogonal, then it must be a diagonal matrix.

36. Prove that if n > m, then there is no m×n matrix A such that ||Ax|| = ||x|| for all x in R^n.

37. Let B = {v₁, . . . , vₙ} be an orthonormal basis for R^n.
(a) Prove that, for any x and y in R^n,

x · y = (x · v₁)(y · v₁) + (x · v₂)(y · v₂) + · · · + (x · vₙ)(y · vₙ)

(This identity is called Parseval's Identity.)
(b) What does Parseval's Identity imply about the relationship between the dot products x · y and [x]_B · [y]_B?
Orthogonal Complements and Orthogonal Projections

In this section, we generalize two concepts that we encountered in Chapter 1. The notion of a normal vector to a plane will be extended to orthogonal complements, and the projection of one vector onto another will give rise to the concept of orthogonal projection onto a subspace.
W⊥ is pronounced "W perp."
Orthogonal Complements

A normal vector n to a plane is orthogonal to every vector in that plane. If the plane passes through the origin, then it is a subspace W of R^3, as is span(n). Hence, we have two subspaces of R^3 with the property that every vector of one is orthogonal to every vector of the other. This is the idea behind the following definition.
Definition  Let W be a subspace of R^n. We say that a vector v in R^n is orthogonal to W if v is orthogonal to every vector in W. The set of all vectors that are orthogonal to W is called the orthogonal complement of W, denoted W⊥. That is,

W⊥ = { v in R^n : v · w = 0 for all w in W }

Figure 5.5  W and W⊥

Example 5.8  If W is a plane through the origin in R^3 and ℓ is the line through the origin perpendicular to W (i.e., parallel to the normal vector to W), then every vector v on ℓ is orthogonal to every vector w in W; hence, ℓ = W⊥. Moreover, W consists precisely of those vectors w that are orthogonal to every v on ℓ; hence, we also have W = ℓ⊥. Figure 5.5 illustrates this situation.
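Concretely, for a plane through the origin in R^3 the orthogonal complement can be computed from a normal vector. The sketch below is ours, not part of the text; the spanning vectors are invented for illustration, and the normal is obtained as a cross product.

```python
import numpy as np

# W = span(w1, w2), a plane through the origin in R^3.
w1 = np.array([1.0, 0.0, 2.0])
w2 = np.array([0.0, 1.0, -1.0])

# A normal vector to the plane spans the orthogonal complement W-perp.
n = np.cross(w1, w2)

# n is orthogonal to the spanning vectors, and checking these suffices:
# any w in W is a linear combination of w1 and w2.
assert np.isclose(n @ w1, 0.0)
assert np.isclose(n @ w2, 0.0)
```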
In Example 5.8, the orthogonal complement of a subspace turned out to be another subspace. Also, the complement of the complement of a subspace was the original subspace. These properties are true in general and are proved as properties (a) and (b) of Theorem 5.9. Properties (c) and (d) will also be useful. (Recall that the intersection A ∩ B of sets A and B consists of their common elements. See Appendix A.)
Theorem 5.9

Let W be a subspace of R^n.
a. W⊥ is a subspace of R^n.
b. (W⊥)⊥ = W
c. W ∩ W⊥ = {0}
d. If W = span(w₁, . . . , wₖ), then v is in W⊥ if and only if v · wᵢ = 0 for all i = 1, . . . , k.
Proof  (a) Since 0 · w = 0 for all w in W, 0 is in W⊥. Let u and v be in W⊥ and let c be a scalar. Then

u · w = v · w = 0    for all w in W

Therefore, (u + v) · w = u · w + v · w = 0 + 0 = 0, so u + v is in W⊥. We also have

(cu) · w = c(u · w) = c(0) = 0

from which we see that cu is in W⊥. It follows that W⊥ is a subspace of R^n.
(b) We will prove this property as Corollary 5.12.
(c) You are asked to prove this property in Exercise 23.
(d) You are asked to prove this property in Exercise 24.
We can now express some fundamental relationships involving the subspaces associated with an m×n matrix.
Theorem 5.10

Let A be an m×n matrix. Then the orthogonal complement of the row space of A is the null space of A, and the orthogonal complement of the column space of A is the null space of A^T:

(row(A))⊥ = null(A)    and    (col(A))⊥ = null(A^T)
Proof  If x is a vector in R^n, then x is in (row(A))⊥ if and only if x is orthogonal to every row of A. But this is true if and only if Ax = 0, which is equivalent to x being in null(A), so we have established the first identity. To prove the second identity, we simply replace A by A^T and use the fact that row(A^T) = col(A).

Thus, an m×n matrix A has four subspaces: row(A), null(A), col(A), and null(A^T). The first two are orthogonal complements in R^n, and the last two are orthogonal
Figure 5.6  The four fundamental subspaces
complements in R^m. The m×n matrix A defines a linear transformation from R^n into R^m whose range is col(A). Moreover, this transformation sends null(A) to 0 in R^m. Figure 5.6 illustrates these ideas schematically. These four subspaces are called the fundamental subspaces of the m×n matrix A.
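These relationships are easy to probe numerically. The following sketch is ours, not part of the text: it computes a null-space basis of a small illustrative matrix from its singular value decomposition and confirms that every null-space vector is orthogonal to every row of A, as Theorem 5.10 asserts.

```python
import numpy as np

# A small rank-2 matrix, chosen only for illustration.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Right singular vectors with (numerically) zero singular values
# span null(A).
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T  # columns form a basis for null(A)

# Every null-space vector is annihilated by A, hence orthogonal to
# every row of A: (row(A))-perp = null(A).
assert np.allclose(A @ N, 0.0)

# The dimensions fit together: rank(A) + dim null(A) = n.
assert rank + N.shape[1] == A.shape[1]
```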
Example 5.9

Find bases for the four fundamental subspaces of

A = [  1   1  3   1   6
       2  −1  0   1  −1
      −3   2  1  −2   1
       4   1  6   1   3 ]

and verify Theorem 5.10.
Solution  In Examples 3.45, 3.47, and 3.48, we computed bases for the row space, column space, and null space of A. We found that row(A) = span(u₁, u₂, u₃), where

u₁ = [1, 0, 1, 0, −1], u₂ = [0, 1, 2, 0, 3], u₃ = [0, 0, 0, 1, 4]

Also, null(A) = span(x₁, x₂), where

x₁ = (−1, −2, 1, 0, 0)  and  x₂ = (1, −3, 0, −4, 1)

To show that (row(A))⊥ = null(A), it is enough to show that every uᵢ is orthogonal to each xⱼ, which is an easy exercise. (Why is this sufficient?)
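The orthogonality of the uᵢ and xⱼ can also be confirmed by machine. The sketch below is ours, not part of the text; it reproduces the matrix of Example 5.9 and the basis vectors computed in Examples 3.45 and 3.48, then checks the claims directly.

```python
import numpy as np

A = np.array([[1, 1, 3, 1, 6],
              [2, -1, 0, 1, -1],
              [-3, 2, 1, -2, 1],
              [4, 1, 6, 1, 3]], dtype=float)

# Row-space basis (rows of the reduced row echelon form of A).
u1 = np.array([1.0, 0.0, 1.0, 0.0, -1.0])
u2 = np.array([0.0, 1.0, 2.0, 0.0, 3.0])
u3 = np.array([0.0, 0.0, 0.0, 1.0, 4.0])

# Null-space basis.
x1 = np.array([-1.0, -2.0, 1.0, 0.0, 0.0])
x2 = np.array([1.0, -3.0, 0.0, -4.0, 1.0])

# x1 and x2 really lie in null(A) ...
assert np.allclose(A @ x1, 0.0)
assert np.allclose(A @ x2, 0.0)

# ... and each u_i is orthogonal to each x_j, so
# null(A) is contained in (row(A))-perp.
for u in (u1, u2, u3):
    for x in (x1, x2):
        assert np.isclose(u @ x, 0.0)
```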
The column space of A is col(A) = span(a₁, a₂, a₄), where

a₁ = [1, 2, −3, 4], a₂ = [1, −1, 2, 1], a₄ = [1, 1, −2, 1]

We still need to compute the null space of A^T. Row reduction produces