li A
rn
CIiOD
Second
David Poole Trent University
THOIVISON
__
~JI
_
BROOKS/COLE
Aust ralia. Canada. Mexico . ...
4120 downloads
6847 Views
106MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
li A
rn
CIiOD
Second
David Poole Trent University
THOIVISON
__
~JI
_
BROOKS/COLE
Aust ralia. Canada. Mexico . Singapore . Spa in United KHl gdom • United Slates
THOIVISON -~...
-.
BROOKS/COLE
Linear Algebra: A Modern Introduct ion Second Ed ition
Dm'la Poole Executive Publisher C urt Hinnchs Executive Edi tor: jennifer Laugier Ed itor: /ohn·Paul Ramin ASSIStant Edi tor: Stacy Gre!.'n Ed itonal Assistant: Leala Holloway T!.'chnology l)roject Managt'r: E,lrl Perry Marketing M,mager. Tom Ziolkowski Marketing Assistant: Enn M IIchell Advertising Project ~lanager: Bryan Vann ProjeCl Ma nager. Edllorial ProduClion: Kelsey McGee Art Director: Vernon Boes Prinl Buyer: Jud y Inouye
Permissions Ed,lor: loohtt lLe Production Service: Matrlx Producllons Text Designer: John Rokusek Photo Research: Sarah Ever t50n / lmage Quest Copy Editor: Conllle Day Il lust ration: SCientific Illustra tors Cover Design: Kim l~okusek Cover Images: Getty Images Cover Pnnllng, Prim ing and Binding: Transcontlllen Pnnllllgl L.ouisrv ille Composilor: Inte racti\'e Compositton Corporation
e
Asia (including India ) Thomson Learning 5 Shenton Way #01 -01 UIC Buildtng Singapore 068808
2006 Thomson Brooks/Cole, a parI of The Thomson Corporation Thomson, Ihe St~r logo, and Brooks/Cor.. arc tr~demarks llsed herein under license. ALL RI GHTS RESERVED. No part of this work covered by the
cop)'right hereon may be reproduced or used III any form or b)' tI
.'>« f'lIK/'j 6(i, 86, 87, .193, .W.'i
The introduction to each chapter is a guided exploration (Section 0) in which students arc invited to discover, individually or in groups, some aspect of the upcoming chapter. For example. "The Racetrack Game" introduces vectors." Mat rices in Action" introduces matrix multiplication and linear transformations, "Fibonacci in (Vector) Space" touches on vecto r space concepts, and "Taxicab Geometry" sets up generalized norms and distance functions. Additional explorat ions found throughout the book incl ude applications of vectors and determina nts to geometry, an investigation of 3X3 magic squares, a study of symm('! ry via the tilmgs of M. C. Escher, an int roduction to complex linear algebra, and optimization problems using geometric inequalities. There are also explorations that mtroduce import3nt numerical considerat ions and the analysis of algorithms. Having students do some of these explorations is one way of encouraging them 10 become ,\Ctlve learners and to give them "ownership" over a small pa Tt of the cou rse.
lppllcallons
Ser f"li,....j :U , 532
See paga 55. J 19, 2l4. .153, 616
The book contains an abundant selection of applications chosen from a broad range of disciplines, includ ing mathematics, compu ter science. physics, chemistry. engi neeri ng, biology, psychology, geography, and sociology. Notewo rthy among these is a strong treatmen t of coding theory, from error-detecting codes (such as International Standard Book Numbers) to sophisticated error-correcting codes (such liS the ReedMuller COOl' that was used to transmit satellite photos from space). Additionally, there are five "v ignettes" that brieny showcase some very modern applications of linea r algebra: the Codabar System. the Global Positioning System (G PS), robotics, Internet search engines, and digital image compression
Examples and Exercises There are over 400 examples 111 this book, most worked III greater detail than is customary in an mtroductory linear algebra textbook. This level of detail is in keeping With the philosophy that studen ts should wan t (and be able) to read a textbook. Accordingly. it is not intended that all of lh,'se exa mples be covered in class: mnny can be assigned for ind ividual or group study, possibly as part of u project. Most examples have at least one cou nterpart exercise so that students ca n tryout the skills covered in the example before exploring generalizations. There are over 2000 exercises, more than in most textbooks al a similar le\'el. Answers to most of the computational odd- numbered exerciscs can be fo und in the back of the book. Instructors will find :111 abundance of exercises from which to select homework assignments. (Suggestions are gIVen in the IIIS/mewr s Guide.) The exercises In each sect ion are graduated, progressing from the rou tine to the challenging. ExercISeS range from those intended for hand computation to those requiring the use of a calculator or computer algebra system, and from theoretical and numerieJI exercises to conceptual exercises. Many of the examples and exercises use actual da ta
compiled from real' \\'orld situations. For example, there ,lrot problem s on modeling the grmvth of caril>ou and seal populations, radiocarbon dating of the Stonehenge monument, and predicling major league baseball players' salaries. \I'/orking such problems reinfo rces the fact that linear algebra is a valuable tool for modeling real· hfe problems. Additional exercises appear in the form of a revi ew after each chapter. In each set, there are iO true/ false questions designed to test conceptual understanding, followed by 19 computational and theoretical exercises thai sum marize the main concepts and lechniques of that chivato and Reel11 Yassawi provided useful information about dynamical systems. As ah"ays, I am grateful to my students for asking good questions and providing me with the feedback necessary to becoming a better teacher. Special thanks go to my teachlllg assistants Alex Chute, NICk Ftale (0 length) of a vettor. Thus, constant became known as scalars.
"
u +( - v )
Observe that cv has the same direction as v Lf c > 0 and the opposite di rection if c < O. We also see that cv islc~ times as long as v. For this reason , III the context of veetors, constants (that is, real numbers) are referred to as sca/a N. As Figure 1.1 2 shows, when translation of vectors is taken into account, two vectors are scalar multiples of each other 1f and only if they are parallel. A special case of a scalar multiple is (- \ lv, which is written as - v and is called the negative ofv. We can usc it to define vector subtraction: The difference of u and v is the vector u - v definfd by II -
V
= U
+ (- v)
Figure 1.1 3 shows that II - v corresponds to the "o ther" d iagonal of the parallelogram determined by u a nd v.
EII.,'I 1.4 y
b- .
~ b
B
::..:.... .,
fl,.,. 1.1.
= [-
3, 11. then u - v = (I - (- 3), 2 - 11 = [4,
II.
--
A
•
If u = [I, 2 J and v
The definitio n of subtractIon in Example 1.4 also agrees with the way we calculate a vector such as AB. If the poin ts A and B correspond to the vectors a and b in standard posLtlo n, then All = b - a. as shown in Figure 1.14 .1 Observe thai the headto · ta il rule applied to this diagram gives the equation a + (b - a ) "" b. If we had accidentally drawn b - !l wLth its head at A Lllstead of at 8, the diagram would have read b + (b - a) = a, which is d early wrong! More will be said about algebraic exprl"SSions involvmg vectors later I II this section. I
Veelors
'I ~.
Everything we have just do nt txtends tasily to three dimensions. The set of all ordered triples of real numbers is denoted by RJ. Points and vectors are located using three mutually perpendicular coordinate axes that mC " " Il nJ , V = [VI' V2" ' " v"l , and w '" [ wI'
W2•··· > w..l.
(, j
+ [vl . v2, ••• • v" 1 = [Ill + VI' /11 + V1" ' " 11.. + vnl = [ VI + 111' V2+ 112" " , Vn + IInl = [ vI' V1" ' " vnl + [ U p 1/2" ' " U,,]
u + v = [ tl l ,
1/2" ' "
unl
= v+ u The second and fourth equalities are by the defi nitio n o f vecto r addition, and the third equality is by the comm utativity of addition of real n um bers. (b) Figure 1.18 ilIuSlTates associativity in R: 2• Algebraica lly, we have
eu+ v) + w =
([lip [ Il l
(u
+ v) + w
= u
+ ( v + w)
[ ( II I [ Ii i
, 11"1 + [ VI' V2, ••• , v"]) + [Wl' W" . . . , W"] 112 + V" ... , lin + v~ ] + [ WI' W " • • . , wn]
112""
+
Vj ,
+ +
VI)
(VI
+ +
WI' (U 2
WI )' 112
+
V2)
+ ( Vl
+ +
W 2• .•• ,
W l ), · · · ,
( II ~
+
II ~ +
V~)
+
Wn ]
n + \\In) ]
(V
w
, fig.,. 1.1'
= u
+ (V + W)
The fourth equality is by the associativity o f add itio n of real nu mbers. Note the careful use of parentheses.
Section 1.1
The GWllletry and Algebra of "wors
11
By property (b) o f Theorem 1.1, we may u Ilambiguously write u + v + W without parentheses, since we may group the summands In whichever way we please. By (a), we may also rearrange the summands-for example,as w + u + v- if we choose. Likewise, sums of fou r or more vectors can be calculated without rega rd to order or grouping. In general, if VI' v2" •• , vt are vectors in R ~, we will write such sums without parentheses:
The next eX3mple illustrates the use of Theorem 1.1 in performing algebraic calculations with vectors.
Let a, b, and x denote vectors in Rn. (a) Simplify 3a + (Sb - 2a) + 2(b - a). (b ) If Sx - a "" 2{a + 2x), solve for x in terms of a.
hI'tltl We will give both solutions in detail , with reference to all of the properties in Theorem 1.1 th3t we use. It is good practICe to justify all steps the first few times you do this type of calculation. O nce rou arc comfortable with the vector properties, though, it is ;\cccptable to leave out some of the intermediate steps to save time and space. (a) We begin by inserting parentheses.
3a + (Sb - 2. ) + 2(b - a) = (3a + (Sb - 2. )) + 2(b - a) = (3a + (-2a + Sb)) + (2b - 2a) = «3a + ( - 2a» + Sb) + (2b - 2a ) = «3 + (- 2»a + Sb) + (2b - 2a) = (I a + Sb) + (2 b - 2a ) = « a + Sb) + 2b) - 2a = (a + (Sb + 2b» - 2a =(. +(S+2)b) - 2a
(a ), (t )
(bl (fl (bl, (hi (bl (f)
=
(7b + a ) - 2a
(.1
=
7b + (a - 2a)
(bl
=
7b + (1 - 2)a
(fl, (h i
=7b +( - I). =
7b - a
You can sec why we wi ll agree to omit some or these steps! In practice, it is acceptable to si mplify this sequence of steps as 3a
+ (Sb - 2a ) + 2(b - a ) "" 3a + Sb - 23 + 2b - 23 = (3. - 2a - 2a)+(Sb +2b) "" - a
or even 10 do most of the calculation mentally.
+ 7b
12
Chapter I
Vel:IOrs
(b) In de tail, we have
Sx - a "'" 2(a + 2x)
(5x
~
5x - a =2a+ 2(2x)
(eJ
5x - a = 2a + (2 · 2)x 5x - a =2a + 4x
(g)
a) - 4x = (2a + 4x) - 4x
+ 5x) - 4x "" 2a + (4x - 4x) - a + (5x - 4x ) = 2a + 0 -a + (5 - 4)x = 2a
(- a
(a). (b )
(b), (d )
(O. (c)
- a +( I )x =2a
a + (- a + x) = a + 2a (a + (- a)) + x - (I + 2).
(h)
(b ),
O + x =3a
(d )
x = 3a
«)
m
Again . in most cases we will omit most of these steps.
111..1 CO •• llltlOII.I. Coo~II.IeS A vector that IS 11 sum of scalar multiples of other vectors IS said to ~ a linear combi· lIatloll of those vectors. The fo rmal definition follows.
•
lelilill..
A vector v is a Ii" ear combinatiotf of vecto rs vI' v" . .. , Vi if there are scala rs 'I' '.! •... , c1 such that v - ' IVI + ~V2 + .. . + 'tvJ: The scalars Cl' cll ••• , c~ are called the coeffide", s of the linear comb ination.
,
...
2 The vecto r - 2 is a linear combination of - I
3
I
2
0
+2 - 3
- ]
]
-
I
0 - I
5
2
, - 3 • and - 4 , since
5 -4 = 0
•
I
0
2 -2 - ]
-4
••••rk Determining whether a given veCior is a linear combimltion of other vectors is a problem we will address in Chapler 2.
In R2, it is possible to depict linear combinat io ns of two (nonpa rallel ) vectors quite conveniently.
Let u =
[~) and v = [ ~). We can use u and v to locate a new sel of axes (in the same
way that e , =
(~] and e1 = (~] locate the standa rd coordi nate axes). We can use
Section 1.1
The Geometry and Algebra of Vectors
1~
y
2,
w
v
"
.
---jC-+--+--+_ X
-" Flg.r. 1.19 these new axes to determine a coordinate grid that wtlllet us easIly locate lInear combinations of u and v. As Figure 1.19 shows, w can be located by starting at the origin and traveling - u follow'ed by 2v. That is, w == - u
+
2v
We say that the coordinates of w with respect to u and v are - I and 2. ( Note that this IS j ust another way of thinking of the coefficients of the linear combinatio n.) It fol lows that
(Observe that - 1 and 3 are the coordmates of w with respect to c 1 and c 2. )
SWItching from the standard coordi na te axes to alternative ones is a useful idea. It has applications in chemIstry and geology, Slllce molecular and crystalline st ructures often do not fall onto a rectangular grid. It is an idea that we will encounter re peatedly in this book.
J. Draw the following vectors in standard position in Iff:
la)
a ~ [~]
I')' ~[-: ]
Ib)
b~ [ :]
Id)d ~ [_ ~ l
4. If the vectors in Exercise 3 are translated so that their heads are at the point (4, 5, 6), find the po ints that correspond to thei r tails.
-
-
5. For each of the follOWing pairs o f points, draw the vector AB. Then compute and redraw AB as a vector In standard POSI t Ion.
2. Draw the vectors in Exercise I with their tails at the point (- 2, - 3).
la) A ~ II, - I),B ~ 14,2 ) Ib) A ~ 10, - 2), B ~ 12, - I)
3. Draw the following vectors in standard position in lR':
(e) A == (2, f) ,B =
l a ) a~ [ O,2,O [
Ib)b ~[ 3,2,1[
(c)e ==[ 1, - 2, 1)
(d)d ==[ - ], - ], - 2[
(d) A ==
(t,3)
d,D, B == (i, ~ )
14
Chapter I
Vectors
6. A hiker walks 4 km no rth and the n 5 km northeast. Draw displacement vectors representing the hike r's tTiP and draw a vector that rep resents the hike r's net displacem ent frOIll the staning point.
Exercises 7- 10 refer ro the vectors in Exercise J. Compllte the indimtcd vectors (/lui also show how the resl/lts a lii be obtained geometrimlly. 7. a
+b
-
-
(a) AB
-
(b) BC
AD (d) CF (e)
-
+c
(e) AC
lO.a - d
(f) Be
8. b
-.
9. d - c
Exercises II am/ 12 refer to the vectors ;,, Exercise 3. Compute tile indicated I'ectors. 12.2c -3b - d
11.2a+ 3c
13. Find the components of the vectors ti , v, u + v, and u - v, where u and v are as shown ill r igurc 1.20. )'
-
Express each of the followm g VCCIOTS in terms of a :::: OA and b = DB:
--->
+ DE +
~
FA
In exercises 15 and 16, simplify t/ie givell vector expression.
llidimte which properties j" T/leorem 1.1 yOlil/se. 15. 2(a - 3b)
+ 3(2b + a)
16. -3(a - c) + l {a + 2b ) + 3(c - b)
111 Exercises J 7 and 18, solve for the vector x ill terms of the veClnrs a and b.
1
17. x - a = 2( x - 2a)
•
18. x
60'
+ 2a
- b = 3(x
+ a)
- 2(2a - b )
Itl Exercises 19 alld 20, draw tile coordJ/late t/.Xes re/(lfive to u and" and /ocfllew.
- I
19. u = [ _ : ]. v = [ : ], w = 2u
20,:. =
- I
FI'lre 1.11 14. In rlgure 1.21, A, B, C, D, E, and Fare the verti ces of a . regular hexagon centered at the ongill. )'
-u -2v
1/1 ExerCises 21 tHld 22. draw tile stalldard CQOrtli,wte axes 011 {h e same diagram as tIre axes relative to u alld v. Use tlrese to find w as a lillctlT combinat ion of u and v .
21. . ~ [_:]., =[ :].w ~ [!]
22.u ~ [ - ~ ]. , = [ : ]. w~ [:]
1/
C
[- ~ J. v = [ ~J. w =
+ 3v
23. Draw diagrams 10 11IuSIrate properties (d) and (e) o f
A
-",
a E
",Ire 1.11
x
T heorem 1.1 .
24. Give algebraic proofs of propert ies (d ) th ro ugh (g) of Theorem 1.1. F
Section 1.2
Lengt h and Angle: The 001 Product
15
length and Angle: The Dot Product It is qUIte easy to reformulate the familiar geometric concepts of length. distance. and angle in terms of vectors. Doing so wdl allow us to use Ihese Important and powerful ideas in settings more general than R Z and IR' . In subsequent chapters, these simple geo met ric tools will be used to solve a wide variety of problems tlrl Slll g in applicatio ns even when there is no geometry apparent at all!
n. Dot Predict T he vecto r versions of lengt h, d istance, and angle ca n all be descnbed using the noti on of the dot prod uct 0 [ 1\"'0 veC lOrs.
Delililiol
If
", ". then the dot product U · v of u and v is defined by
In wo rds. u· v is the su m of the p roducts o f the correspond ing components of u and v. 1t is importan t to note a couple of thi ngs about this "product" that we have Just defined : First, u and ... must have the same number of com po nents. Second , the dot p roduct u' v is a /lu mber, no t anoth er vector. (T his is why u ' v is sometimes called the scaltlr producl o f u and ....) The dot prod uct of vectors in IRwis a special and importan t case of the mo re general notion of imler producl, which we Iyill explo re in Chapter 7.
EII.ple 1.1
-3
I
Compute u ·v when u =
2 and v =
-3 S.IIII ••
U · V
= ] · (- 3) + 2 · 5
5
2
+ (- 3) ' 2 =
]
Notice that if we had calculated v · u in Exam p le 1.8, we wo uld have com puted v ' u = (- 3) ' 1 + 5 · 2 + 2 ' (- 3)
=
]
That u ' v = v ' u in general is clear, since the individ ual products o f th e componen ts commute. This comm utativity propert y is o ne of the properties of the dot product that we will usc repeatedly. T he main properties of the d ot p roduct arc summarized in Theo rem 1.2.
.2
Let u , v, and w be vectors in IR" and let c be a scalar. Then a. b. c. d.
u' v = v ' u Commutati\·ity Dhlributlvit)" u · (v + w) = u'v + u'w (cu ) ·v = c( u·v)-..; u · u ~ O and u·u =Oifnndonlyifu=O
',..1
We p rove (a) and (c) and leave proof of the remaining properties fo r the exercises. (a ) Applying the definition of dot product to u . v and v' U, we obtam u .v =
II I VI
=
VI lli
+ +
II. V. Y. u .
+ ... + + ... +
Il~Vn Vnll n
where the middle equality fo llo\'IS fro m the fact that multiplicalion of real numbef5 is commutative. (c ) Using the defi n itio ns of scalar multipl icatio n and dot product, we ha\'e
(cu ) • v
= [CU ,. CU2' ..• , CII "1- [ VI> V2' ... ,
= ~
+ (U 2 V1 + ... + c(ll t VI + 112V2 + ... +
(IIIV,
vn]
CII"V"
II"Vn )
« u ·v)
•••arb • Property (b ) can be read from righ t to left, in which case It S,1YS that we can factor out a common vector u from a su m of dot products. Th iS prope rt y also has a " right-handed" analogue that follows from properties (b) and (a) together:
(v + w) · u ::
V ' U
+ w . u.
• Propert y (e) can be extended to give u' (cv) = c(u ' v) (Exercise 44). This extended version of (c) essentially says thaI in t'lking a scalar mult iple of;1 dot product o f vecto rs, the scalar can first be comb med with whic hever vector is more convenient. For example,
(j[- I, - 3,ZJ) · [6,-4,O)
~
[-I ,-3,Z) ' (j[6, - 4,0J)
~
[- I,-3,2) · [3, - Z,O) ~ 3
With this approach we avoid introducing fractions mto the vectors, as the o riginal groupi ng would have. • The second part of (d ) uses the logical connective If (wd only if Appendix A d iscusses this phrase in mo re detail, but for the m oment let us just note that the wording signals a dOl/ble implication- namely, ifu = O,t hen u'u :: 0 and
if u'u = O,thenu = 0
Theorem 1.2 shows that aspects of the algebra of \'e(Juality
.
See Exercises 55 nnd 56 for nlgebraic and geometric npproaches to the pmof o f this inequality. In Rl or IR J , where we G ill usc geom etry, it is dea r from a d iagra m such as Figure 1.26 that l u + vi < ~ u ll + I vl fo r all vectors u and v. We now show that thIs is true more generally.
,
~
ThlOrlm 1.5
The Triangle Inequality For all vectors u and v in R~.
20
Chapter I
Vectors
P"II Since both sides of the inequality arc nonnegative, showing thai the square of the left-hand side is less Ihan or equal to the square of the right-hand side IS equiva lent to proving the theo rem. (Why?) We compute
Il u + V~2
(u + v) ·(u + v)
""
= u·u + 2(u·v) + v'v
By EX31llpie 1.9
" iull' + 21u· vi + ~vr s lul1 2 + 21uU + IvII 2 = (lui + IvI )'
By Cauchy-Schwoarl
as requ ired.
DlSlance The d istance betw'cen two vectors is the direct analogue of the distance between IwO points on the real n umber line or IwO points in the Cartesian plane. On the number line (Figure 1.27 ), t he distance between the numbers a and b is given by ~. (Taking the absolute need to know which o f a o r b is larger.) This d istance is also eq ual to , and its two-dimensional generalization is points (II I' a2 ) and (bl' btl-na mely, the familia r fo rmulaIor the dis· nce (I
la-
d=Y
-b
'+
-11,)' t
/,
o
I i
I
o
-2 flglr'1 .21 d
= 10 -
Jn terms of vectors, if a
=.0
[
~
::]
I 4 I I . 3
= 1- 2 - 31= 5 and b "" [ : : ], then ti is just the length o f a - b.
as shown III Figu re 1.2B. This is the basis for the next definition. y
a- b
,
,, , ,. ,, "' , __ _ ________ JI,
l a~- h _
FIliI,. 1.Z1 (/ - v""(a-,'----b,.,")'~+-c(a-,-b,.,")l
DelialtloD
=
'--------~ x
la - hI
The distanced(u, v) between vectors u and v in Rn is defined b
(u. v)
=
lu .
i
' UW'"
S«t ion
1.2 Length and Angle: Thl' Dot f'roduct
Exlmple 1.13
o
Find the distance betwl'en u =
and v :::
I
- I
S,I.II..
21
2
- 2
v2 We com pute u - v =
-
I ,so
1
d(u . , ) ~ ~ u -
'i
s
\I( v2) ' + ( -
I )'
+
I' ~
V4
~ 2
The dol product can also be used to calc ulat e the ang le between a pair of vcctors. In 1!l2 orR), Ihe angle bet>.vee n Ihe non7.ero vect ors u and v will refer 10 the angle 0 determ ined by these vectors that S:ltisfi es 0 s () :S 180" (sec Plgure 1.29 ).
,
, u
"
• C\
•
u
u
fl,. f.l .2I The lIngie bctw('('n u lind v In Figure 1.30, con sider the tria ngle with side s u, v,and u - v, where () is the angle between u and v. Applyin g the law of cosi nes \0 Ih is triangle yields
U- '
•
Il u - vl1
U
FI"f' U'
lu92 + I 2 - 20ull cos 0 Expanding the left-hand side and usin g Ivf2= v · v seve ral time s, we obta in lui ' - 2(u" ) + lvi' ~ lu! ' + lvi' - '1Iu!II'1005 0 :::
which, after simplification, leaves li S with u' v = U uUvf cosO. From this we obtain the foll owing fo rmula for Ihe cosine of tile angle () between non zero ve, determ ine whether f and QI> are parallel, perpendicular, or neither: (a) 2x+ 3y - Z "" I (c) x - y - z = 3
(b) 4x - y + 5z = 0 (d ) 4x + 6y - 2z = 0
19. The plane @'] has the equation 4x - y + 5z = 2. For each of the planes q. in Exercise 18, de termine whether qp] and 'lJ' are parallel, perpendicular, or neither.
x 28. Q = (0, \ ,0), € with eq uatio n y
{-:J -2
1
\ +
,
I
1
0
3
In Exercises 29 and 30, find the distallce from tllf point Q to the phme ~ . 29, Q "" (2, 2, 2), ~ with equation x
+ y-
z= 0
30. Q = (0, 0,0), f!J' with equation x - 2y + 22 = 1
20. Find the vector fo rm of the equation of the 11Ile in 1R2 that passes thro ugh P = (2, - I ) and is perpendicular to the line with general equation 2x - 3y = 1.
Figure 1.63 suggests a way to use vectors to locate the point R Otl f that is closest to Q.
2\. Find the vecto r fo rm of the eq uatio n of the line in [R:2 that passes th rough P = (2, - \ ) and is parallel to the line with general equat ion 2x - 3y = 1.
32. Find the point Ron f t hat is closest to Q in Exercise 28.
3 1. Find the poin t Ron
e that is doses t to Q in Exercise 27.
Q
22. Find the vector fo rm of the equation of the line in IR J that passes through P "" (- \,0, 3) and is perpendicular to the plane with general equation x - 3y + 2z = 5.
e
23. Fmd the vector fo rm of the equation o f the line in R J that passes through P = ( - 1,0, 3) and is parallel to the lme with parametric equations
,
x = I - t Y "" 2 + 3t z= - 2- I 24. Find th e nor m al for m of the equation of the plane that passes thro ugh P = (0, - 2,5) and is parallel to the plane with general equatio n 6x - y + 22 == 3.
p
o flgur. 1.63
~
r = p
+ PR
Section 1.3
Figure 1.64 suggests a way to use vectors to locate the poim R on VI' tlrat is closest to Q.
Lines and Planes
43
the angle between W> I and qp 2 to be either 8 or 180" - 0, whichever is an acu te angle. (Figure 1.65)
Q
n,
,
c
,
o
flgare 1.64 r = p + PQ + OR
- -
180 - 8
figure 1.65
33. Find the point Ron g> that is closest to Q in Exercise 29. 34. Find the po int Ron 'll' that is closest to Q in Exercise 30.
Exercises 35 (II/(/ 36, filld the distall ce between tile {X/rallel lilies.
III Exercises 43-44, find tlse acute mlgle between the pIa/Ie! with the given equat ;0115.
43.x+ y+ z = 0 and 2x + y - 2z = 0 44. 3x - y+2 z=5 and x+4y - z = 2
111
35.
= [I] + s[-'] [x] y I 3
III Exercises 45-46, show tlll/tihe pllllie and line with the given 1.'(llIatiol15 illlersecf, (lnd then find the aela/.' angle of intersectioll between them. 45. The plane given by x
36.
x Y
I
I
x
O+siandy
,
- \
I
z
o
I
\ +t \ 1
Z=
3
+
I
46. The plane given by 4x - Y -
In Exercises 37 011(/38, find the distance between the parallel planes. C 137. 2x + y - 1%= 0 and 2x + y - 2z =5
38.x+y + z =
J
,nd x + y +z= 3
39. Prove equation 3 o n page 40. 40. Prove equation 4 on page 4 J. 41. Prove that, in R ', the distance bet..."een parallel lines wit h equations n' x = c, and n· x = c1 is given by
given by x =
Exercises 47-48 explore Olle approach 10 the problem of fillding the projection of a ,'ector onlO (/ pial/e. As Figllre 1.66 shows, if@> is a plalle throllgll the origin ill RJ with normal
n
en
till
p= \
I nil If two nonparallel plalles f!J> I alld 0>2 lrave lIormaI vectors " l al1d 11, mui 8 is tile angle /Jetween " l anti " 2, then we define
6 and the line
t
42. Prove that the dis tance between parallel planes with equations n· x = til and n' x = ti, is given by -
z
y = I +2t. Z = 2 + 31
ICI - ~ I ~ nil
I(il
0 and the line
givcn byx = 2 +
I y = I - 2t.
I
+ Y + 2z =
figure 1.66 Projection onto a pl(lllc
en
44
Chapler I
Veclors
vector n, ami v is a vector in Rl, then p = pro~{ v) is a vector ;11 r:I sllch that v - en = p for some scalar c.
onto the planes With the fo llowi ng equations: (a) x+ y+ z = O
(b) 3x - y+ z = O
47. Usi ng the fa ct that n is orlhogonal to every vector in ~ (and hence to p), solve for c ::and the reby fi nd an expressio n fo r p in terms of v and n.
(e) x - 2z = 0
(d ) 2x - 3y
48. Use the method of Exercise 43 to find the p rojection of v =
1 0
-2
+z= 0
I The Cross Product It would be convenient if we could easily convert the vector form x "" p + s u + tvor the equation of a plane to the normal for m n' x = n ' p. What we need is a process that, given two nonparallel vecto rs u and v, produces a third vecto r n that is orthogo nal to both u and v. One approach is to use a const ruction known as the cross product of vectors. Dilly Yldid In RJ , it is defined as follows:
Definition
The cross prOtlUCI of u =
U2
and v =
VI
is the vector u X v
defin ed by
U X II
=
IIl V) -
II J Vl
" l V, -
" I V)
U I Vl -
U2 Y ,
. A s hortcut that can help yo u rem ember how to cakut.lte the cross product of two lIectors is illustra ted below. Under each com p lete Yector, write the first two compo-
nents of that vector. Ignon ng the two components on the top line, consider each block o f four: Subtract the products of the components connected by dashed lines from the products o f the components connected by solid lines. (It helps to notice that the fi rst component of u X v has no Is as subscripts, the second has no 2s, and the third has no 35.)
IIl Vl -
II J V,
UJ V, -
II I VJ
Il , I'l -
1'2 VI
45
The following problems brietly explore the cross product. I. Compute u x v.
,
3
0 (a) u =
,
, v :::::
(0) u
=
,
flgur.1.61
=
-.-. 2
2 ,v =
3
(b ) u
2
-, u X ,
-,
(d ) u
=
-, , , ,
•v =
2
,
, ,
0
3
, ,=
2
3
2. Show that c 1 X c 2 = c)' c1 X c j = c., and c j X e l = ez. 3. Using the definitiOn o f a cross p roduct, prove that u X v (as shown in Figure 1.67 ) is orthogonal to u and v. 4. Use the cross product to help find the no rmal form of the equation of the plane. o 3 (a) The plane passing through P = (l , 0, - 2), pa rallel to u = I and v = - ]
,
2
(b) The plane passing through p = CO, - 1, l),Q = (2,0,2),aod R = (1,2, - 1) 5. Prove the following properties of the cross produ ct: ( a) v X u = - (u x v ) (b) u X 0 = 0 (c) u X u = 0 (d ) u X kv = k(u X v) (e) u X ku == 0 (0 u x (v + w) = u X v + u X w 6. Prove th e fo llowing properties of the cross product: (a ) u· (v X w) ::::: (u X v) ·w ( b ) u x (v X w ) "" (u ·w )v - (u -v)w (e) Illl x
V!l =
I U ~ 2 ~ v ll~ -
(u -vy
7. Redo Problem s 2 and 3, this time making use of Problems 5 and 6. 8. I.et u and v be vecto rs in RJ and let 0 be the angle between u and v. (a) Prove that lu x v ~ = l un vll sin O. t Hlnt: Usc Problem 6(c).J ( b) Prove that the arc.. A of the tri .. ngle de termined by u and v (as shown in Figure 1.68) is given by
,
A u
fll.,.1 .6I
= t llu x
vii
(c) Use the resul t in part (b) to compute the area o f the tria ngle with vertices A = (1 , 2, 1) , B = (2, I,O),and C = (5, - I, 3) .
Section 1.4
Code Vectors lmd Modular Anthmetlc
41
" ."~
Code Vectors and Modular Arithmetic
The modern theory of codes onglllated WIth the work o f the American mathematician and com puter scientist Claude Shannon ( 1916-2001 ). whose 1937 thesis showed how algebra could playa role in the design and analysis o f electncal clfcuits. Shan non would later be Instrumental in th e formatIon of the field of IIIformation tlreoryand gtve the theorctkal basis for what are now called errorcorreclmg codes.
Throughout hislory, people have transmitted informa tio n usi ng codes. Sometimes the intent is to disgUise the message being sen t, such as when each letter in a word is replaced by a different leiter acco rding \ 0 a substitu tio n rule. Although fascinating, these secret codes, or ciphers, ilre not o f concern here; they are the focus of the field of cryptography. Rather, we wi ll concentrate o n codes that are used when data m ust be transmitted electronically. A familiar example of such a code is Morse code, \~ i th its system of dots and dashes. The adven t of d igital computers In the 20th centu ry led to the need to tra nsmit massive amounts of data q uickly and accurately. Computers are designed to en code data as sequences of Os ilnd Is. Many recent tech nological advilncements depend on codes, and we encounter Ihem every d ay withoul being aware of them: satellite communications, compact disc players, the u niversal product codes (U PC) associated with the bar codes fo u nd o n merchandise, and the international standard book numbers (ISBN) found o n every book published today are but a few examples. In this sectIOn, we will use vectors to design codes for detecting errors that may occur in the transmission of data. In laler cha plers, we will construct codes that can not only detect but also correct erro rs. The vectors thaI arise in the study of codes are not the familia r vectors of R" but vectors with only a fi nite number of choices for the components. These veclo rs depend on a different type of arlthmetic-moduiar arithmetIc-which will be introduced in Ihis section and used throughout the book.
Binary COdes Since computers represen t d;lIa in terms o f Os and Is (which can be interpreted as off/on, closed/open, false/ t rue, o r no/yes), we begin by consideri ng biliary codes, which co nsist of vectors each of whose componenls is eilher a 0 Qf a \, In thls setting, the usual rliles of arit hmetIC must be modified, since the result of each calculation involving SOl lars must be a 0 or a I. The modifi ed rules for addition and multiplication are given below.
+
0
I
o
I
001
o
0
0
I
I
0
I
I
0
The only curiosity here is the rule that I + I = O. This is not as strange as it appears; If we replace 0 wit h the word ""even" and I with the word "odd," these tables simply sum marize the fami liar panty rules for the additio n and multiplicatIOn of even and odd integers. For example, I + I = 0 expresses the fact tha t the sum of IWO odd integers is an even lllteger. With these rules, a Uf set of scala rs 10, I } is denoted by Z2 and is called the set of integers modulo 2.
":t
In Z2' 1 + 1 + 0 + 1 = 1 and 1 + 1 + I + I = O. (Thesecakulalions ill ustrate the panty ,"I" Th, sum o f lh,oo odds ,"d , n eve" " odd; lh, sum of fout odds is
'S''"'
We are using the term kmgth differently from the way we used it in R". This should not be confusing, since there is no gromt'tric notion of length for binary vectors.
Wi th l, as OU f set o f scalars, we now extend the above rules to vectors. The SCI of all ,,-tuples of Os and Is (with all ari th metic performed m odu lo 2) is de noted by Zl' The vectors In Z~ are called binary vectors o/Iength n.
Cha pler I
V« lo rs
Example 1.28
The vectors in l~ arc [0 , 0 1, [0, I], [I, 0], and II, contain , in general?)
Exampll 1.29
Lei U = f 1, 1,0, I, OJ and v - 10, I. I, 1,01 be two brna ry veclorsoflen glh 5. Find U ' v.
II.
(How Illany vectors does
Z'l
Solution The calculation of u' v takes place in Zl' so we have u ·v = \·0+ \ . \ + 0·\ + \·1 + 0·0 = 0 + 1 +0+ 1+0
= 0
t
I In practice, we have a message (consisting of words, numbers, or symbols) that we wish to transmit. We begin by encod ing each "word" of Ihe message as a binary vecto r. III
Definition
A binary code is a set o f binary vecto rs (of the same length ) Gliled
code vectors. The process o f com'erring a message into code vectors is ca tled encoding, and the reverse process is called decodi"g. •
"=zz
A5 we will 5«, it is highly desirable that a code have other p ro~rti es as well, such as the ability to spot when an error has occu rred in the transmission o f a code vecto r and, if possible, to suggest how to correct the erro r.
Error-Ollecllng COdes Suppose that we have alread y encoded a message as a set of binary code vectors. We now want to send the binar y cod e vecto rs across a cluHlllei (such as a radio tra nsm itter, a telepho ne line, a fiber o ptic cable, or a CD laser). Unfortunatel y, the channel may be "noisy" (because o f electrical interference, competing Signals, or dirt and scratches). As a result, erro rs may be introduced: Some of the Os may ~ changed to Is, and vice versa. How can we guard agaInst this problem ?
hample 1.30
We wish to encode and transmit a message conSisting of one of the words up, do""., I"/t, or rigill. We decide to use the fo ur vectors in Z~ as our binar y code, as shown in Table 104. If the receiver has this table too and the encoded message is transmitted without e rro r, decod ing is trivial. However, let's suppose that a si ngle error occurred. (By an error, we mean that one component o r the code vec to r changed .) For example, suppose we sent the message "down" encoded as [0, I J but an error occurred in the transm ission o f the fi rst component and the 0 changed to a t. The receiver wo uld then sec
Tlble 1.4 Message Code '
up
[0.0)
down
left 11 . 0)
right ) 1,
tJ
Section 1.4
Code Vectors and Modular Anthmetlc
49
[1. II instead and decode the message as "right." (We will only concern ourselves wi th thc case of single errors such as this o ne. In practice. it is usually assumed that the probabil ity of multiple errors is negligibly small.) Even If the receiver knew (somehow) that a single error had occurred, he or she would not know whether Ihe cor rect code vector was (0, Jj or [ I, OJ. But suppose we sent the message usi ng a code that was a subset of Z~in other wo rds, a binary code of length 3, as shown in Table 1.5.
Tnlll.5 Message
Cod,
up
down
left
right
[O.O. OJ
10, I, J]
11,0, iJ
[ 1,1, 01
This code can detect any single error. For example, if "down" was sent as [0, I, J J and an error occurred in one component, the receiver would read either [I , I , 1 J o r !0, 0, I J or [0, I , 0], none of which is a code vector. So the receiver would know that an error had occurred (but not where) and could ask that the encoded message be retransmitted. (Why wouldn't the receiver know where the error was?)
The term pa rifY comes from th~ l ntin wo rd par, meaning "equnl or ~CVCI\:' Two inteser~ ar~ s.aid t~. have the sa me parity if they are both even or bQth odd,
The code ill Table 1.5 is an example of an error-detecting code. Until the 1940s, this was the best that could be achieved. The advent of digital computers led to the development of codes thllt could correct as well as detect erro rs. We will consider these in Chapters 3, 6, and 7. The message to be transmitted may itself consist of binary vectors. In th is case, a simple but useful error-detecting code is a parity d l/?ck code, which is created by ap-
pending an extra componellt---catied a check digit-to each vector so that the par ity (the total numberof Is) is even.
Exampla 1.31
If the messllge to be sent is the binary vector I I, 0, 0, 1,0, 11. which has an odd number of Is, then the check digit will be I (In o rder to make the total number of Is in the code vector even) and th e code vector will be [ 1, 0,0, 1, 0, I, I J. Note that a single error will be detected, since It will CllUse the panty of the code vecto r to change from even to odd. For exam ple, if an erro r occurred III the third compo nent. the code vector would be received as [ I, 0, I, 1, 0, I, I J, whose parity is odd because it has fi ve Is.
~ jI
Let's look at this concept a bit more formally. Suppose the message is the binary vector b = [bl' bl •• •• , hnJ in I.';. Then the parity check code vector is v = [ /' " b2 , . .. , l bn , d] in , where the check digit d is chosen so that
Zr
bl + h2 + ... + b" + d = 0
In
Zl
or, equivalently, so that I .v = 0
where I = [1, I, ... , I J, a vector whose every component is I. The vector 1 is cal led a check yector. If vector Vi IS received and I· Vi = I, then we can be certai n that
50
Chapter I
Vectors
an error has occurred. (Although we are not considering the possibility of more tha n one erro r, observe that th is schem e will not detect an even number of erro rs.) Parity check codes arc a special case of the more general check digit codes, which we will consider after first extend ing the forego mg ideas to more general seuings.
Modular Arithmetic I I is possible to generalize what we have just done fo r b inary vecto rs to vecto rs whose
components are taken from a finite set 10, 1,2, ... , kJ fo r k 2:: 2. To do so, we must fi rst extend the id ea of b inary arit hmetic.
EKample 1.32
The integers modulo 3 consist of the set Zl = {O, I, 2 ) I io n as given below:
+1 0
I
2
0
0
I
2
0
I
I
2 0
I
2
2 0
I
2
With
addition and multiplica -
0
I
2
0 0 0
0
0
I
2
2
O bserve that the result of each addition and m ultiplication belongs to the set 10, I , 2J; we say that Zl is closed wi th respect to the operatio ns of addi tion and multiplicatio n. It is perhaps easiest to think of this set in term s of a three-ho ur dock with 0 , I, and 2 on its face, as shown in Figure 1.69. The calculation 1 + 2 = 0 translates as fo llows: 2 hours afte r I o'dock, it is o o'dock. l us t as 24:00 and 12:00 are the same on a 12-hou r d ock, so 3 and 0 are eq uivalent on this 3- ho ur clock. Likewise, all mult iples of 3-positive and negativeare equivalent to 0 here; 1 is equi valent to any num ber tha t is I more than a multiple o f 3 (such as - 2, 4, and 7); and 2 is eq uivalent to any numbe r that is 2 mo re than a m ultiple of 3 (such as - 1,5, and 8). We can Vis ualize the n um ber line as wra pping a round a circle, as shown in Figure 1.70.
o . . . . -3.0.3 . .. .
2
.... 1,2.5 . . . .
. . . . - 2, 1.4.. ..
filer. 1.&9 Arit hmetic modulo 3
Example 1.33
Fllur. 1.10
To wh
,/' Exercises /9- 24, soll'C Ihe glvell syslem by back 5111,;;11 til t/Oll.
21.x -
20. 211 - 31' = 5 211 = 6
y + z= 0 2y - z = \ 3z = - \
23. x, + x 1
22.
, ,
0
- 5x + 2x = 0 4x - 0
x }- x 4 = 1 Xl +X} +X4 = O Xj - x4 = 0 -
x~ =
x, + 2X2 + 3x) =
,
1
5
- 3x - 4y + z = - iO
7
x+5y =- 1 - x+ y =- 5 2x + 4)' = 4
30.
a - 2b + d= 2 - a+ b -c- 3d = J
lias tile given matrix as its augmemed matrix.
+ 3x 2 = 5
31.
y= 3
- 3
III Exercises 3 I alld 3Z, fi"d II system of linear etjuatlOtls ,IIat
13. x + 2y + 3z= 4
19. x - 2y= 1
=
rind tile augll/ellled matriCes of the Imear systellls ill Exercises 27-30. 27. x - y = O 28. 2x, + 3X-.! - xJ = 1 x, + x3 = 0 2x + y = 3 - x, + 2X-.! - 2X3 = 0
iog ,o Y = 2
-
+ )'
24. x - 3)' + z = 5 y - 2%=- \
32.
0
I
I I
I
- I
0 I
2
- I
I I
I
- I
I
I
0
I
0 2 0
3
I 2
I
- I 4
2
3 0
For Exercises 33-38, solve the lincnr systems ill Ihe givcn exerCIses. 33. ExerCise 27
34. Exercise 28
35. Exercise 29
36. Exercise 30
37. Exercise 3 \
38. Exercise 32
39. (a) Find a system of two linear equatIOns III the variables x and y \"hose solution set is given by the paramet ric equations x = t and), = 3 - 2t. (b) Find another pa rametric solution to the system III pari (a) in wh ich the parameter is 5 and y = s. 40. (a) Fllld a system of two linear eq uatIOns 111 the variables x,, x 2,and x J whose solution set is given by the parametric equations x I = t, X 2 = \ + I, and x 3 =2 - t. (b) Find another parametric solu tion to the system in part (a) in which the parameter is 5 and xJ = s.
Section 2. L Introduction to Systems of Linear Equations
-
In Exercises 41-44, the systems of equations are lIonliliear. Find substitutiorlS (cha nges of variables) that convert each system 11110 a linear system cmd lise this linear system to help solve the given system. 2
3 41. - + -= 0 x y 3 4 -+-= 1 x y
42.
x? + 2y2 =
6
x 2 - i =3 43.tanx-2siny
2 tanx- siny + cosz = 2 siny - cos z = - \ 44. -2~ + 2(3 b) = I 3(2") - 4(3') = I
65
Using your calculator or CAS, solve this system, rounding the result of every calculation to five significant digits. 3. Solve the system two mort" times, rounding first to four significant digits and then 10 three significant digitS. What happens? 4. Clearly, a ve ry small roundoff error ( less than or equal to 0.(0125 ) call result in very large errors in the solution. Explam why geomet rically. (Think about the grap hs o f the various li near systems you solved in Problems 1-3.) Systems such as the one you just worked w ith are called iI/-conditioned. They are ext re mely sensllive to roundoff errors, and there is nOI m uch we can do about it. We will encoun ter ill-condi tioned systems agai n in Chapters 3 and 7. Here is a nother example to experi ment with:
45S2x + 7.083y "" 1.931 1.73 Ix + 2.693y = 2.00 1 Play around with various numbers o f significant d igi ts to see what happens. startll1g with eight significant digits (if yo u can).
•
"
68
Chapkr 2
Systems of Li near Equations
•
Direcl Melhods for Solving Linear SVSlems In this section we will look at a general , systematic procedure for solving a system of lmear equations. ThIs procedure IS based o n the Idea of red ucing the augmented matrix of the given system to a form that can then be solved by back substitution. The method is direct in the sense that it leads direct ly to the solution (if one ex ists) in a finite number of steps. In Section 2.5, we wi ll co nsider some ;lI(ilrect methods that work in a completely dIfferent way_
Matrices and Echelon lorm There are two important matriccs associatcd with a linear system. The coefficient matrix contams the coeffiCIents of the variables, an 0
So there is at least one free variable and, hence, infinitely many solutions.
Theorem 2.3 says noth ing about the case where m ~ II. Exercise 44 as ks you to give examples to show that, in this case, there can be either a unique solution or infi ni tely many solut ions.
Motl
Linear SlSlems over 7L, Thus far, all of the linear systems we have encountered have involved real numbers, and the solut ions have accordingly been vectors in some [R n. We have seen how other number systems arise--no tably,'Z ..Wd~n p isa prime number,Z.pbehaves in manyrespects like R; in particular, we can a ,5ul)tract, multiply, and d ivide (by no nzero num bers). Thus. we can also solve systems of linear equations when the variab les and coefficien ts belong to some z.p'In such instances, we refer to solving a system over Z,. For example. the linear eq uation X I + Al + X, = 1, when viewed as an equation over 1 2, has exactly fou r solutio ns: Rand Zpa re examples of fields. The set o f rational num~rs 0 and the set o f complex nu mbers Care
other examples. Fields are covered in detail in courses in abstract algebra.
x,
I
X,
0
x,
0
x, ,
x,
X,
0 I ,
x,
0
x,
(where the last so lution arises because I
X,
+I+
0 0 • ,nd I I "" I in Z. l)'
x,
I
x,
,
x,
I
Section 2.2
Direct Methods for Solving Linear Systems
.,
x, When we view the equation XI
+ :s + x, =
l over Z " the sol utio ns
Xl
are
x, 100202JJ2 0 , 1 ,0,2 , 2 , 0 , 1 , 2,1 00
1
02221
1
(Check these. ) But we need not use trial-and-erro r methods; row reductio n o f augmented matrices wo rks just as well over Zpas over R.
Solve the following system o f linear equations over l ,: XI XI
+2x,+ x.~= O +x,= 2 X:z + 2x, :::z 1
Solutlaa
The fi rst thing to note in examples like this one is that subtraction and d ivision are not needed; we can accomplish the same effects usi ng addition and multiplica tion. (Th is, however, requires tha t we be working over l ,. where p is a prim e; see ExerCise 60 at the cnd of this sectIOn and Exercise 33 in Section IA. ) We row red uce the augmented matrix of the system, using calculations modulo 3. 1
2
1 0
1
0
1 2
0
,
II .. .. ZII,
2 1 II, + II,
II, + 211,
Thus, the solution is X I = 1, X2 = 2. x,
Example 2.11
::
1
2
1 0
0
1
0 2
0
1
2 1
1
0
1 2
0
1 0 2
0
0
2 2
1
0 1
0
0 1
0 2
0
0
1 1
I.
Solve the following system of linear equations oyer 1 2:
x, +x1+x, + x, +x, X:z + x, Xj + +
x~
= 1
=1 =0 ~
:: 0
x~
:: 1
82
Chapter 2 Systems of Linear Eq uations
Solation
The row red uction proceeds as fo llows: I
I
I
I
I
I
I
0
0 0 I
I
I
0 0
I
0 I 0 0 I 0
0
I
H. + II,
,
/I. ~ H,
I
11. .... ,1. II, + 11, II, + II,
•
11,+ /1 .
,
/1,. + 11,
I
I
I
I
0 0 0 0
0 I 0 I
I
I 0
I I
0 0 I 0 0 0
I
0
0
I
0 0 0
I
I
0
I
0 0 I 0
I
I 0
0
0 0
0
0 0
I
0
I
0
I
0 0
0 0 0
0 0
I
I
0
0
0
I
I
I
I
I 0
0 0 0 0 0
T herefore, we have +~ = I
x,
+
x~
= 0
X3+ X4=O
Seui ng the free variable x. = t yields
x, x, x,
I + 1
I
1
'"
1
0 0 0
1
I +1
I I I
SlIlce t can take on the two values 0 and I, there are exactly two solutions: I
o
o
I
o o
and
I I
.1 .. 1"
For li near systems over lLp' there can never be infinitely many solutions. (\Vby not?) Rat her, when there is more than one solution, the nu mber of solutio ns IS fi nite and is a function of lhe number of free variables and p. (See Exercise 59.)
-
Direct Methods for Solvmg L.lnear S),Stt'I11S
Sei.:tio n 2.2
13
Exercises 2.2 In Exercises 1-8. determi"e whether the giwn matrix is ill row echelon form. If it I S, state w~,ether it IS also it, reduced row echelon for m. 0 0
3
I
0
I
3
0
0
0
3
5. 0 0 0 I
0
I
I.
3.
0 0
[~ I
7.
I
2.
~l - 4 0 0 0 0
5
7 0
I
0
I
- I
4
111 Exercises J 7 (llid I B, sl/ow that Ihe give" IIIMnC!!5 (l re row
0
0
0
equlI'a/e/lt {Hid find (I scqllf!llce of eicm cll wry ro w opera/iollS Ilwl will COll verl A lfl to 8.
0 0
4.
6.
I
0 0 0
0 0
0
0
0
0
0
I
0
I
0
I
0
0
0
17. A ==
2 3
2
I
3
5
I
0
0
0
0
I
- I
0 0
I
I
0
I
0 0 0 0 0 0
3 0
8.
row «he/Oil form. 0
I
9. 0
I
I
I
I
I
3
5 - 2
II.
5
2 13.
3 2
4
10.
12.
4 - 2
- I
- I
- 1
- 3
- I
14.
- 2 - 3
- 4
- 2
I
6
-, - 6 2
:]
-,
- 4
0
- I
10
9
5 - 5
I
I
- 1
0
0
24
0
0 0
0
1
2
- 4
- 4
5
2 2
4
0
0
3
2
I
- I
I
3
6
2 5 5
0 ,B
- I
I
I
~
I
- I
3 2
5 2
I
0
Perfor m R! + R, and R, + R2• Now rows t and 2 are identical. Now perform Rl - R, to obtain a row o f zeros in the second row.
+
RI • R, - R2• Rz + R, • - R,
21. St udents freque ntly perfo rm th e following type of cal culation 10 introd uce a zero intO :1 matrix:
7
10 - 3
15. Reverse the elementary row operatIOns used in Example 2.9 to show that we can convert
2
I
R!
1
I
I
3
20. What is the net e ffect of performi ng the fo llowing sequence o f elem en tar y row o perations o n a matrix (with at least two rows)?
[~ ~] [:
- I
2 0
19. What is wro ng w ith the fo llowing "proof" that every m atrix with at least two rows is row cqUlvalent to a matrix with a :.:ero row?
ill Exercises 9-1 4, use eielllel//(/fY row operatiolls to redllre Ihe gIVen matrix to (a) row echeloll f orm (md (b) retillced
0
:J. B =[~ -~ ]
[~
18.A=
I
,
16. In general, what IS the elementa ry raw o peratio n that " undoes" each of the three elementary row o perations R, +-+ Rf kR,~ and R, + kR,?
However, 3 Rz - 2R, IS flOl an elem entary row operatio n. Why not? Show how 10 achieve the same resul t using elem entar y row operations. 22. Consider the ma trix A =
into
[~ ~ ]. Show that any o f
Ihe th ree types o f elem entary row operations can be used to create a le:lding I at the top of the first co lumn. \Vh ich d o yo u p refer and why? 23. What is the rank o f each of the matrices in Exercises 1-8? 24. \Vhat are the possible reduced row echelon fo rms of 3 X 3 m atrices?
••
Chapter 2 Systems of Linear Equallons
In Exercises 25-34, solve the given system of eqllations IIsing either Gausswn or Gauss-Jordan elimination, 26. x- y+ z = O 25. X,+ 2x2-3x,= 9 - x + 3y + z = 5 2x, - X2+ x,= O X2 + X, = 4
4x, -
27.
x, - 3x z - 2x, = 0 -x,+ 2xz + xJ= O 2x,
+
4x1 + 6xJ
+ s= + s=
29. 2r 4r
+ 21V + 3x
28.
+
Y
3x - y
31V - X 31V - 4x
= 0
7z = 2
+ 4z = + z=
0 I z= 2
+ y-
J 7
- XI
+ 3x1 - 2x] + 4X4 =
31.
+
3x1
X, ! X1
+ X2 -
ji, x,
+'jX2
}XI
2 =
+ Xs
22 = 3, ~
- vl
-y + V2z=
I
+X+
2y
2
+z=
+
ky = I
kx+ y == 1 43. x
3z = 2
+ Y + kz
x+ky+
x+ y+ z= k 2x - y+4z= f(-
==
1
z=
I
kx+ y+ z=- 2
and
46. 4x +y-z= O and
B
2x - y+ 4z=S
2x-y+ 3z= 4
47. (3) Give an example of th ree planes that have a com mon line of intersection (Figure 2.4 ).
I
1
°
y+z "'=' x+ y - \ w+x +z= 2 c+ d ., 34.n + b + a + 2b + 3c+ 4(1 = n + 3b + 6c+ lOti = a+ 'Ib+ 10c + 20d E> w- x -
+
45. 3x+2y+z =- 1
= -1
- 4x:s=
2x,
vly IV
=
42. x - 2y
41. x
In Exeroses 45 a"d 46, fi "d the li"e of intersection of the give" planes.
6X4
- 3",
32. V2x+ y+
33.
8~
4x, XJ -
-
0
xJ-2 ~ =-J
2x,-6x2+
40. kx + 2y = 3 2x - 4y = - 6
44. Give examples of homogenCQus systems o f m linear equations in 11 variables with It! "" n and with //I > n that have (a) infinitely many solutions and (b ) a unique solution.
2r+5s=- 1 30.
III Exercises 40--43,for what value(s) of k, if any, will the systems hllve (a) I/O soilltion, (b) a unique solution, l/lld (c) illfilli rely II J(lIIY solll t lOllS ?
4 iO 20
35
flglrl
Exercises 35-38, (lelermi"e by impectioll (i.e" withollt performing any ca/clliatiolls) wlletller a /mear system with the given allgmented matrix has a IImque 5O/lItIOIl, infillitely many solutions, or flO solution, lllsllfy your answers.
2.'
In
0 35.
37.
0 I
3 36. I
-2 2
2
4
4 0
I
2 3
4
8 0 12 0
38. 6 7
4
3
7
7
I 2
0 I 0
3 I I I
I
2
5 9
6 10
3 7
"
5 7
0 -3
(b) Give an example of th ree planes that intersect in pairs but have no common point of intersec~on (Figure 2.5).
I
I I - I
-6 2
0
5 6 2 I 7 7
39. Show that if ad - be '1= 0, then the system
(u:+by= r
cx+ dy==s has a unique solution.
fl11U12 .5
Section 2.2
Direct Methods fo r Solving linear Systems
15
show that there are mfmitcly many vectors
(el GLve an example of three planes, exactly two of which arc parallel (Figure 2.6).
x, x '"
X,
x, that simultaneously sat isfy u ' x = 0 and v' x = 0 and that aU arc multiples of u Xv=
fl,.,.2 .• (d) Give an example of three planes that in tersect in single pomt (Figure 2.7).
lL
52. Let P =
I':Y, -
"J v~
" , VI -
III V,
lil Y: -
IIl Vt
2
1
0
1 •q =
1 •u = - 1
0
- 3 , andY = 1
0 6 - 1
Show that the Ji nes x = p + su ::and x = q + IV,\Te skew lines. Find vector equations of a pair of p::arallel planes, one contaimng each line. hi Exercises 53-58, soll'e tile systems oflmeaTequations over
tile indim/ed Zr 53. x + 2y = l over Z,
+ Y= 2 +Y = lover 'Z.2
x 54. x
y+ z=O
+ z=
x
fill" 2.1
y+ z = O +z= l
x 48. P =
49. P -
3 1 .q = 0
- 1
1
2 ,u =
2 ,v =
- 1
0
0 ,v =
2 3
1
1
1
1 ,u = - 1
2 50. Let P = 2 , u = I , and v = 1 . Describe 3 - \ 0 all points Q = (a, b, c) such lhal the line through Q with direclion vecto r v intersects the line with equation x = p + SUo 1
1
51. Recall that the cross prod uct of vecto rs u and v is a v«tor u X v that is orthogonal to both u and v. (See Exploration: The Cross Product in Chapter I.) If and
v=
56. 3x x
1
- 1
0
1
= I over 'Z.,
55. x+ Y
2
"v: ",
I
+ 2y = lover l.J + 4y = I
57. 3x + 2y= I overl., x
58.
+ 4y
= 1
+ 4~
XI
x t + 2x!+ 4x, 2xt+ ~
+
"" I over Z", = 3
~= 1
+ 3x,
"" 2 59. Prove the following corollary to the Rank Theorem: Let A be an m X "m::atrix with entries in Z,. Any consistent syslcm of linear equ::ations wi th coefficienl m:llrix A has exactly p" ,.".V.l sol utions over 7L r ''(I
60. When p is not prime, eXira care is needed in solvi ng a linear system (or, indeed, any equation) over l,. Using Gaussian e limination, solve the foll owing system over l,. What complicatio ns ::arise? 2x + 3y= 4 4x + 3y = 2
.~
...
,
~
.:;$: .
CAS
Partial Pivoting In Exploration: Lies My Computer Told Me following Section 2.1, we saw that ill conditio ned linear systems ca n cause tro uble when roundoff error occurs. In this exploration, you will d iscover another way in which linear systems are sensitive to rou ndoff error and see that very small changes in the coefficients can lead to huge maccuracles in the solution . Fortunately, there is somet hing that can be done to
mmtmize or even e1immate this problem (unlike the problem with tn-conditioned systems). I.
(a) Solve the single linear equation 0.OOO21x = 1 for x.
(b) Su ppose yo ur calculator can carry on ly fo ur significant digits. The equation will be ro unded to 0.OOO2x = I. Solve this equation. The difference between the answers in parts (a) and (b) can be thought of as the effect of an error of 0.0000 1 o n the solution o f the given equation. 2.
Now extend this Id ea to a system o f linear eq uations. (a) Wit h Gaussian elimination, solve the li n ear system 0.400x
+ 99.6y =
75.3x - 45.3y =
100 30.0
using three siglllficant digits. Begin by pivoting on 0.400 and take each calculation to th ree significant digits. You should obt:lin the "solution" x = - 1.00, Y = 1.01. Check that the actual solutjon is x = 1.00, Y = 1.00. Th iS is a huge error-200% 111 the xvalue! Can you discover wholt caused It? (b) Solve the system in part (a) again, this time interchanging the two equa tions (o r, equivalently, the two rows of its augmented matri x) and pivoting on 75.3. Again, take each calculation to three sign ificant digits. What is the solution this time? The moral o f the story is that, when uSing Gaussian or Gauss-Jo rdan dim ination to obtai n a numerical solution to a system of linear equations (i.e., a decimal approx imation), you should choose the pivots with care. Specifically, at each pivotll1g step, choose fro m alTlong ,,11 possible p ivots in a colum n the entry with the la rgest absolute value. Use row interchanges to bring thIS element into the correct posit ion and usc it to create zeros where needed in the column. This strategy is known as partial pivoti"g,
.6
3. Solve the foll owing systems by Gaussian elimlllation, first without and then with part ial pivoting. Take each calculation to th ree sign ificant digits. (The exact solutions arc given.)
+ O.995y == 1O.2x + I.OOy =
(il) O.OOlx
-
1.00
(b) lOx -
- 3x
- 50.0
7y
+ 2.09y + 6z
5x -
. [xJ [,.ooJ 1.00
Exact so]utlon: y =
= 7 = 3.91
y+ Sz = 6
x Exact Solullon: y
- I.DO
z
l.OO
0.00
I Counting Operations: An Introduction to the Analysis of Algorithms
Abu la'fur Muhammad ibn Musa al -Khwarizmi (c. 780-850) was an
Ambie mathematician whose book 1'liMb al-jabT w'l/ll/Ilujllualllll (c. 825) described the uSC of HinduArabiC nume.rals and Ihe rules of basic arithmehc. T he second word oflhe book's title gives rise 10 the English word a/gellm, ,md the word algorithm is dem/cd from
(11· Khwarizml's 113111e.
! ! •
G.l ussian and Gauss-Jorda n elimination arc examples of algorithms: systematic procedures deSigned to Implement a particular task-in this case, the row reduction of the augmented matrix of a system of linear equations. Algorithms arc particularly well sui ted to com pu ter implementation, but not all algo rithms are created equal. Apart from the speed, memory, and other atlributes of the com puter system o n which they are running, some algorithms are fast er than others. One measure of the so-called complexilyof an algori thm (a measure of its effi ciency, or ability to perform its task in a reason grey ~ black --1> wh ite. Sho\\' how the winning '1 1I -black configuration can be achieved from the initial configuratio n shown III rigure 2.26.
MI,eelianeou, Problems
6
5
2
111
•
3
@)
•
•
2
• 4 ~ • •
•
• = - 2YI
that t ransform a vector y =
[;J
[~]. We can abbreyiatc this
into the vecto r z =
tra nsformation as"Z = Gy, where
G=[ - 2' -'] 0 Prolllllll 4 We arc going to fi nd out how G transfor ms the figure A' B' C D' . Compute Gy for each o f the four ve aw " . ,and If m = n ( that IS, if A has the same nu mber of rows as columns), the n A is called a square mntrix. A square matrix whose nondiagonal entries a rc all zero IS called a tlingomd matrix. A diagonal matrix all of whose diagonal en tries a rc the same is called a scalar matrix. If the scalar o n the diagonal IS 1. the scalar mat ri x is called a n idw,ity matrix. For example, let A = [
2
- I
5 4
B
-
[34 5'I]
o o c= o 6 o , o o 2 3
D =
1 0 0 0 1 0
o
0
1
The diagonal enlries of A a re 2 and 4, but A is no t square; B is a square ma trix of Size 2 X 2 with diagonal entries 3 a nd 5, C is a diagonal ma trix; D is a 3 X3 identity ma tTix. The n X II identity ma trix is denoted by I~ (or simply I if its size is unde rslOod). Since we c.1 n view ma trices as generalizations of vectors (and. indeed, matrices can and sho uld be though t of as being made up of bot h row a nd column veclOrs), many of the conventions and o pe rations for vectors carry th rough ( m a n obvious way) to m atrices. Two ma trices arc equal if they have the same size a nd if their corresponding ent ries are equal. Th us. if A = [a'JJmxn and B (b91 ,xl' the n A Bif and only if //I "'" r llnd /I = s a nd tI,) = hI] for atll a nd j.
=
Example 3,1
=
Conside r the matrices
A = [:
!].
B=
[ ~ ~ ].
c-
o
[! ;] 3
Neither A no r B can be eq ual to C( no matter what the values of xand y), s ince A lInd Bare2 X2 malrices a nd C is2X3. However, A = Bifand on ly if ( I = 2, /, ;;; O,e - 5, and d "" 3.
Example 3,2
Consider the malri c~s
R =[ l
4
3J and
C=
1 4 3
138
Chapter 3
Matrices
D espite the fac t that Rand C have the same entries in the same order, R -=I- C since R is 1 X3 and C is 3X I. (If we read Rand Caloud , they both sound the same; "one, fou r, th ree.") Thus, o ur distinc tion between row matrices/vectors and column matrices! vecto rs is an importan t one.
Matrix Addition and Scalar Multiplication Generalizing from vector add ition, we defi ne mat rix addi tion compOl1el1twise. If A = [a,) and B = [b;) are mX tI mat rices, their sum A + B is the mX tI matrix obtained by adding the corresponding entries. Thus,
A
+ B = [11;j +
bij]
[We could equally well ha ve defined A + B in terms o f vector addition by specifying that each column (or row) of A + B is the sum of the correspo nding colum ns (or rows) of A and 8. [ If A and B a re no t the same size, the n A + B is not defined.
Example 3. 3
Let A = [
1
4
-2 6
~].
B ~
1 ] [-: -1 2 . 0
Then
A+B = b ut neither A
+
C no r B
[ -~
5 6
"d
C =
[~
:]
-; ]
+ C is defi ned.
The com ponen twise defi n ition of scalar multiplication will come as no surprise. If A is an m Xn matrix and c is a scalar, then the scalar multiple cA is the mXn matrix o btained by m ultiplyi ng each e ntry of A by c. More fo rmally, we have
[ In te rms o f vectors, we could equivalently stipulate that each column (or row) of cA is c times the corresponding colum n (or row) of A.I
Example 3.4
For mat ri x A in Example 3.3 ,
2A = [
2
- 4
8 12
l~l
!A= [_: ~
~l
and
(- l)A =[- ~
- 4 - 6
The matrix (- I)A is written as - A and called the negativeo f A. As with vectors, we can use this fact to defi ne the difference of two matrices; If A and B are the same size, then A - B ~ A +(- B)
Sect ion 3.1
111m pie 3.5
lJ1
Ma tnx Operatio ns
For matrices A and B in Example 3.3,
]- [-3 o A - B= [ I 4 0 I
-2 6
5
3
A matrix all of whose entries arc l eTO is called a zero matrix ;md denoted by 0 (or 0 "')(11 if it is imporlant to specify its size), It should be dear that if A IS any matrix and o is the 7£ ro matrix of the same size, then
A+O=A=Q+A . nd A - A = 0 = - A
+A
MaUll MUlllpllclliOD ~13t hcll1aticia ns
are sometimes like Lewis Carroll's Hu mpty Dumpty: "Wb en I use a w\l rd ,~ Hu mpty Dumpty said, "it means just what I choose it to me-anneither more nor JdoS ( from 11Iro11811 Illf Loobll8 GIIIU), M
The Introduction in Sect ion 3.0 suggested that there is a "product" of matrices that is analogous to the compo sition of fun ctions. We now make Ihis no tion morc precise. The defini tion we arc about to give generalizes what you should have discovered in Problems 5 and 7 in Section 3.0. Unl ike the definitions of matrix addition and sca lar multiplicauon, the defi nitio n o f th e product of IwO m,l\rices is not a componentwise definiti on . Of cou rse, there is nothing to stop u s from defin ing a product o f matrices in a componenlwl5e fas hion; unfortunately such a defini tion has fcw ap plica tions and is not as "natu ral" as the one we now give.
If A is an ", Xn matrix and 8 is an tlX r matriX', then the product an "'x r matrix. The (i, j) entry of the product is computed as follows:
II
.
U.,r•• Notice that A and B need not be the same size. However, the number of colulIIlI$ of A must be the same as the number o f rows of 8. If we write the sizes of A, and AIJ in order, we C;11l scc at a gl I.
cos 110
39. In each or the rollowing, flOd the 4X4 matrix A = (ll~ 1 that satisfies the given condition: ~ j
(a) ll., = (- 1)''"1
(b)
(c) Q., =(i - IY
(d ) a,, =sm
QIj =
j
. (U+ j - I)") 4
40. In each or the rollowlng, find the 6X6 matrl."( A = (Q 1
III &ercises 3 /- 34, compllie A B iJy Mock IIIII/liplicatioll, using tire i"tiict/ted ptrrlllio,,,,rg.
31. A
-
1 0 0
- I
• ••• •••
0
0
1 •• 0
... ,
0
0 •'
J
,
, B=
3
••• • •• •
, 1
0
- I 0 ................... 0 •: I
o 0
•
O! 1
9
that satisfies the given condition:
(a)
a.,
(c) a 'I
i+j = { 0
={I
0
ifi S j
if I
>j
(b)
if6 I
152
Chapter J
Mat rices
•
Matrix Algebra In some ways, the arith metic of milt riccs gcnerillizes that of vecto rs. We do not expect any surp rises with respect to addition and scala r multiplicatIon and mdeed there are none. Th is will allow us to extend to matrices several concepts that we are already familiar with from our wo rk with vectors. In particular, hnear combUlalions. spann ing sets, and linear independence carryover to matrices with no difficulty. However, matrices have o ther o perations, such as matrix multiplication, that vecto rs do not possess. We sh ould not expect matrix multiplication to behave like multip lICatIon of real n umbers unless we can prove that it does; in fact . it does no t. In this sect ion, we sum marize and prove some of the main proper ties of matrix o perations and begin to develop an algebra of matrices.
Properties 01 Addllion and Scalar Multiplication All of the algebraic properties of addition ;\nd scalar multiplication fo r vecto rs (Theorem 1.1 ) carry over to ma trices. For completen ess, we summa rize these properties in the next theorem.
Theorem 3.2
Algebraic Properties of Matrix Addition and Scalar Multiplication Let A, B, and C be mat rices of the same size and let cand (I be scalars. Then Commutati\'ity Associativity
a.A+B = B + A b. (A + 8) + C"" A + (B + C) c. A+O=A d. A +(- A)=O e. c(A + 8) = cA + cB f. (c + (I)A = cA + dA g. , I dA ) = ("I )A h. IA = A
DistributivilY Distributi vii y
•
The proofs of these p ro perties are duect analogu es of the correspo nd ing proofs of the vector p ro p~ rt i~s and are left as exefClses. LikeWise, the comments following Theorem 1.1 are equally valid here, and you shou ld have no difficulty usi ng these properties to perfo rm algebraic manipulations With matrices. (ReView Example I.S and see Exercises 17 and 18 at the end of this section.) The associat ivity property allows us to unam bigu ously combine scalar multiplication and addit ion without parentheses. If A. B, and C are matrices of the same size, then ( 2A
+ 38)
-
c = 2A + (38
- C)
a nd so we can simply write 2A + 3B - C. Generally, Ihen, if AI' A1• ... , At are matrices o f the same size and c p cz•...• c1 are scalars, we may form the linear combination clA I
+ c2 A1 + ... + etA,
We will refer to cl • '1 •... , c. as the coe{ficients o f the linear combinati on. We can now ask and answer questions about linear combinalio ns of matrices.
5«tion 3.2
Example 3.16
LetAI =[_~ ~J. A2 =[~ ~].alldAJ =[ :
153
Matrix Algebra
:].
(a ) 158 =
[ ~ ~]alinearcomblOationOfAI'Al. andAl?
(b) Is C =
[~ ~] a linear combination of AI' A
l•
and A,?
Solution (a) We want to find scalars ' I' S' and c, such tha t CIA I + A 21 and AJ COn-
.
slstsof all mat rices
["xl z ' y
.
for whLch w = z. That Ls, span (AI>AZ,A J ) =
{[wxl} y w .
.+ lIole
If we had known this before attempting Example 3. 16, we wo uld have seen
immediately that B =
[I2 4]is a linear combination of AI' A21and A
Ihe necessary form (take w II , x = 41andY = 2),butC=
[~
!]
J,
since it has
can not be alin-
ear combination o f AI' Al , and A, l since it does not have the proper form ( I 'fI:. 4). l inear independence also makes sense for mat rices. We say that mat rices AI' A l , ••• , At of the same size are linearly independent if the only solu tio n o f lhe equatio n
(\) is the trivl3l Olle: (. = ~ = .. = (l = O. If the re a re no ntrivial coefficients that satisfy (I ), then AI' A2•• •• , Ai are called linearly dependent.
Example 3.18
Determ ine whether the matrices AI' A l • and A) in Example 3.16 ar linearly
independent.
Sollilion
We want to solve the equatio n matrices, we have
(I[ _~
CI A .
+
CzA2
~J + (l[~ ~] + 0[:
+
c, A) = 0. Writing out the
:J [~ ~J =
This time we get a homogeneous linear syslem whose left -hand side is the same as in Examples 3.16 and 3. 17. (Are yo u slarting to spot a pallern yet?) The augmented mlllrix row reduces 10 give
o
1
I 0
I - \
0 0
1 0 J 0
o
1
I
0
,
\
o
o
\
o o
0 0
0 0 o 0 I 0 0 0
156
Chapter 3 Matrices
hus, CI = C:! == " "".,, and we conclude that the matrices AI' A z' and AJ arc lineady ndel2.ende nt.
Properties 01 MalliM Multiplication \N'henever we encounter a new operation, such as matrix multiplicatIOn, we must be careful not to assume too much about it. It would be nice If matrix multiplication behaved like multiplication of real numbers. Although in many respects it docs, there are some significant differences.
Example 3.19
Consider the matrices A ~
[ -~ -~]
,nd
B= [:
~]
Multiplying gives AB = [
2
- ]
-~][ ~
:]
[-: -~]
and
BA
[: ~][ -~ [:
-~]
~]
[rhus, AB"* BA. So, in contrast to m ultiplication of real numbers, matrix multiplication is /lot commutative-the order of the factors in a product matters! It is easy to check that A2 =
[~ ~]
(do so! ). So, for matrices, the equation
does not im ply that A = 0 (unlike the situation for real numbers, where the equation X- = 0 has only x = 0 as a solution ). A~
=0
However gloomy things might appear after the last example, the situation is not really bad at all- you just need to get used to worki ng with matrices and to constantly remind yourself that they are not numbers. The next theorem summarizes the main properties of matrix multiplication.
Theorem 3.3
Prdperties of Matrlx Multiplication Let A, B, and Cbe matrices (whose sizes are such that the indicated opera tions can be performed) and let k be a scalar. Then , . A(BCI
~
(ABIC
b. A(B + C) = AB + AC c. (A + B)C= AC + BC d. k(ABI ~ (kAIB ~ A(kBI e. {rn A = A = Aln if A is mX /J
Associativity Left dislribulivilY Right distribulivity ~tultiplicat ive
identity
Proof We prove (b) and half of (e). We defer the proof of property (a) until Section 3.6. The remaining properties are considered in the exercises.
•
Section 3.2
Ma tr ix Algebra
(b) To prove A ( B + C) = AB + AC, we let th e rows of A be denoted by Aj and the columns of Band Cby bl and c, Then the;th column o f B + C is bJ + cJ (since addilio n is defined component Wise), and thus [ A(B + C) ], ~ A. · (b,
+ c,)
= A" bJ+ A" cj ~ (A B)"
+ ( Ae),;
+ AC)" Since th is is true fo r all j and j. we must have A( B + C) = ~ (A B
AB
+ AC.
(el To prove Ai" = A, we note lhat the identity matrix I" can be column partitio ned as
I" ==
rei' c
l ' ... :
c,,]
where c, is a standard unit vector. Therefore,
AI" = [Acl : Ac, ' . . , ; Ae,, ] = [ a l : a~ :
... :a"]
~ A
by Theorem 3.1(b). We ca n usc these properties to further explo re how closely matrix multiplication resembles Illultiplicahon of real numbers.
(xample 3.20
If A and B are square matrices o f the S[«BX)A) - 'r ' ~ [(A - 'B' )' r ' => (BX)A ~ [(A - 'B' )(A-' B')t' ::::} (BX)A = B-J(A -J )- IB-\A- I)-I
=> 8XA = W~AB-JA => B-1BXAA - I = B- 1B- 3 AB- J AA - 1 => IXI = B- 4 AB- J I => X = 8- 4 AB- 3 (Can you justify each step?) Note the careful use of Theorem 3.9( c) and the expansion o f (A-1 8 3 ) 2. We have also made liberal use of the associativity of matrix multiplicatio n to simplify the placement (or el imination) of parentheses.
Elementary Matrices V.re are going to use matrix multiplication to take a different perspectIve o n the row reduction of mat rices. In the process, you will discover many new and important insights into the nature o f invertible matrices.
If I E ~
0
0 0
0 I
0
I
0
,nd
A ~
5 - I
0
8
3
7
we find that 5
7
8
3
-1
0
EA ~
In other words, multiplying A by E(on the left ) has the same effect as lllterchanging rows 2 and 3 o f A. What is significant about E? It is si mply the matnx we obtain by applying the same elementary row operation, R2 +-,). R3, to the identIty matrix 13, It turns out that this always works.
Definition
An elementary matrix is any matrix that can be obtained by per forming an elementary row operation on an identity matrix.
Since there are three types of elementary row operations, there are three corresponding types of elementary matrices. Here are some more elementary matrices.
Example 3.21
Let I
E]
=
0
0 0
0 3 0 0
0 0
0 0 1 0
0
1
, Ez =
0 0
0
1
1
1
0
0 0
0 0 , 0
0
0
0
I
I
,nd
E, ~
0 0 0
0
0 0 1 0 0 0 1 0 - 2 0 1
Section 3.}
The Inverse of a Matrix
169
Each of thcse matrices has been obtained from the identity matrix I. by applying a single elementary row operation. The matrix £1 corresponds to 3R1 , E, to RI +-+ R.p and ~ to R( - 2R 2• Observe that when we left-multiply a 4X II matrix by one of these elementary matrices, the corresponding elementary row operation is performed on the matrix. For example, if
a" a" a" a"
A~
then
E1A =
a"
al2
au
3""
3all
3al}
a" a"
a"
a"
EJA ;;
and
a.,
al2
au
a2l
a"
a" a"
a.2
a.,
• E2A =
""
a" a" a" a"
a" an
a'l
a"
au •
all
a" a"
a"
al!
au
a 21
au
a"
a"
an
an
a" - 2a21
a. 2 - 2a Z2
ao - 2a D
--t
Example 3.27 and Exercises 24-30 should convince you that tltlyelemen tary row operation on tilly matrIX can be accomplished by left-multiplying by a suitable elementary matrix, We record this fact as a theorem, the proof of which is omitted.
Theo,.. 3.10
,
L
Let E be the elementary matrix obtained by performing an elemcntdry row opcration on Tw' If the salllc clementary row operat iOll is performed on an fi X r m,lI rix A, the result is the S(Hne as the matrix fA
a•• .,.
From a compu tational poin t of view, it is not a good idea to use elementary matrices to perform elementary row operations-j ust do them direct ly. However, elementary mat rices C:1I1 provide some valuable IIlslghts 1Il10 invertible matrices and the solution of systems of linear equations. We have already observed that ewry elementary row operation can be "undone," or "reversed." Th is sa me observation applied to element,lfY matrices shows us that they are invertible.
Example 3.28
Let L
E,
Then £ 1-
~
0 0
0 0
0
I
0
L
L
,~=
0 0
0 4
0 0 • and
0
I
E) =
L
0
0
I
0 0
- 2 0
I
corresponds to Rz H RJ , which is undone by doi ng R2 H RJ agai n. Thus, 1 = £ 1' (Check by showing that Ei = EIE. = I.) The matrix Ez comes from 41l1, EI
Chapt ~r 3
110
Matrices
which is undone by perform ing ~ R2 ' Thus.
o o 1 o E,- · - o • o o I I
which can be easily checked. Finally. ~ corresponds to the elementary row o peration RJ - 2R" which can be undone by the elementary row opera tion R} + 2R .. So, in this case,
(Again, it is easy to check this by confirming that the product of this matrix and both o rd ers, is I.)
~,
in
Notice that not only is each elementary matrix invertible, but its inverse is another elementary matrix of the same type. We record this finding as the next theorem.
-
Theo". 3.11
Each elementary matrix is invertible, and its inverse is an elementary matrix of the same type.
T.e luUamenlal Theorem 01 Inverllble Mallicas Weare now in a position to p rove one of the main resul ts in this book- a set of equivalent characterizatio ns of what it means for a matrix to be invertible. In a sense, much o f line;l r algebra is connected to this theorem, either 10 the develo pment o f these characterizations or in their applicatio n. As you m ight expect, given this introduction, we will use this theorem a great deal. Make it yo ur fr iend! We refer to Theorem 3.12 as the first version of the Fundamental T heorem, since we will add to it in subsequent chapters. You are rem inded that, when we $l.IY that a set of statements about a matrix A are equivalent, we mean that , for a given A, the statements are either all true o r all fal se.
, The Fundamental Theorem of Invertible Matrices: Version I Let A be an a. b. c. d. e.
11 X n
matrix. The following statements are equivalent:
A is invenible. Ax = b has a unique solution for every b in IR,n. Ax = 0 has only the trivial solution. The reduced row echelon form of A is 1.. A is a product of elementary matrices.
SectiOn 33
Praal
111
The Inverse of a Matrix
We Will establish the theore m by proving the Ci rcular cham of implications
(a )::::} (b ) We have al ready shown that if A is invertible, then Ax = b has the unique solut ion x = A- Ib fo r any b in 1R"(Theorem 3. 7). (b) => (c) Ass ume that Ax = b has a unique solu tion for any b in [R". This implies, in particular, that Ax = 0 htls tl unique sol utIOn. But tl homogeneous system Ax = 0 alwtlys has x = 0 as olle solution. So in Ih is case, x = 0 must be tl.esolution. (c)::::} (d ) Suppose th tlt Ax = 0 has o nl y the tnvitll solution. The corresponding system of eq uations is (I ,
tX t
(l2tXt
+ +
(1124 (1 214
+ ... + + ... +
at..x" = (ll..x" =
0 0
and we are ass um ing that its solutio n is
x,
= 0 =0
x
"
=
0
In o the r words, Gauss-Jordan eliminatio n applied to the augmented matrix of the system gives
a" a" [AI OJ =
""
a 22
a",
anl
.. ,
ti , "
0
a,"
0
,
1 0 0
1
' , ,
.. ,
0
0
0
0
=
[/"I OJ
, ,
a""
0
Thus, the reduced row echelon form of A
0 IS
0
1 0
I".
(d ) =? (e) If we assume that the reduced row echelon for m o f A is I", then A can be reduced to I" usi ng a fi nite sequence of elemen tary row operations. By Theorem 3. 10, each one o f these elementary row operations COl n be achieved by left-multiplyi ng by an appro pria te elementary matrix. If thc app ropr iate sC{1uence of elem entary matrices is E., f l '" ., EI; (in tha t order), then we have
" "'k
''' ""2 "EA = I" 1
According to Theorem 3.11 , these elementary matrices are all invertible. Therefore, so is their p roduct. and we have
E) - II" = ( E1 ... £, 2 E'1 )- ' -- £, 1- ' E'-l 1... E-,. ' A -- (El .. , E21 Agai n, each E,-1 is anothe r elementary matrix, by Theorem 3. J I, so we have wriuen A as a product of elemen tary mat rices, as required. (e) =? (a) If A is a product of elementary matri ces, lhen A is invertible, since elementary matrices are invertible and products of inverti ble matrices are invert ib le.
111
Chapter 3 Matrices
Example 3.29
If possible, express A =
Solullon
[ ~ ~] as a product of elemen tary matrices.
We row reduce A as follo\~s:
A --
'[I 3]3 ,_, " ) [I2 !] KJ-l~ [~ -!J '_"1 [I 0] IRe, [I 0] = I ° -3 o
1
'
Th us, the reduced row echelon fo rm of A is the identity matrix, so the Fundamental Theorem assures us that A IS invert ible and can be written as a product of elementary matrices. We have E~EJ ~EIA = /, where
EI = [~ ~]. E2=[_~ ~]. E3=[~
:]. ~=[~ _~]
are the elementary matrices corresponding to the four elementary row operations used to reduce A to /. As in the proof of the theorem , we have E"2 E" E" _ [ 01 A = (E~}! E E EI ) . , -- E" 'I ') '4 -
~] [~ - : ] [~
as required.
Remark
Because the sequence of elementary row operations that transforms A into / i.~ not un l(lue, neither is the representation of A as a product of elementary matrices. (Find a d ifferent way to express A as a product of elementary matrices.)
"Never bring a cannon on stage in Act I unless you intend to fire it by the last aC1." - Anton Chekhov
Theorem 3.13
The Fundamental Theorem is surprisingly powerfu l. To ill ustrate its power, we consider two of ItS consequences. The nrst is that. although the defin ition of an invertible mat rix Slates that a matrix A is invertible if there is a mat rix B such that both AD = / and BA = / are satisfied, we need only check oneof these equatio ns. Thus, we can cut our work in half! Let A be a square matrix. If 8 isa square matrix such that either AD = lor BA = I, then A is invertible and B = A-I.
PlIof
Suppose BA = I. Consider the equation Ax = O. Left-multiplying by IJ, we have BAx = 80. This implies thatx = Ix = O. Thus, the system re presen ted by Ax = 0 has the unique solution x == O. From the eq uivalence of (c) and (a) in the Fundamental Theorem, we know that A is invertible. (That is, A- I exists and satisfies AA - I = I = A - I A.) If we now right -multiply both sides of BA = / by A- I, we obtain BAA-I = LA- I ~ BJ ::: A- I => B = A-I
(The proof III the case of AB
= I
is left as Exercise 4 L)
The next consequence of the Fundamental Theorem is the basis for an efficient method of computing the inverse of a matrix.
Section 3.3
Theorem 3.14
The Inverse of a Matrix
113
Let A bc a square matrix. If a sequence of elem entary row operations reduces A to /, then the same sequence o f elementary row op erations transforms 1 into A-I .
If A is row equivalent to t, thcn we can achieve the reduction by leftIll ulti plying by a sequence £1' Ez • ... , E1. of elem entary matrices. Therefore, we have Ek . .. ~EI A = I. Setting B = E, ... EzE I gives IJA = I. By Theorem 3. 13, A is invertible and A- I = B. Now applying the same sequcnce of elementary row olXrations to 1 is equivalent to left- multiplyi ng Iby El · ·· ElEI = 8. The result is
Proof
Ek ... ~Ell "" 131
= B = A-I
Thus, 1 is transform ed into A-I by the slime seq uence of elemcntary row opcrations.
The Gauss-Jordan Method lor Computing the Inverse We can perform row opcrations on A and I sim ultlillcously by constructi ng a "superaugmented ma tri x" [A l l]. Theorem 3. 14 shows that if A is row eq uivale nt to [ [which, by the I:: undamc ntal Theorem (d ) (a), means that A is invertible !, then elementary row operations Will yield
If A cannot be reduced to /, then the Fundamental Theorem guarantees us tha t A is not invertible. The procedu re just described IS simply Ga uss-Jordan elimilllltion performed o n an tIX27/, instead of an II X( n + 1), augmented matrix. Another way to view this procedure is to look at the problem of fi nd ing A- I as solVi ng the mat rix eq uation AX = I" for an n X II matrix X. (This is sufficie nt, by the Fundamental Theorem , since a right inverse o f A mus t be a two-sided inve rse. ) If we deno te the colum ns o f X by X I ' .•. , x n' then this matrix equation is equ ivalent to so lvlllg fo r the columns of X, one at a time. Since the col um ns of /" are the standard um t vectors f l ' ... , en' we th us have /I systems of linear equa tio ns, all with coeffiCIent matrix A:
Since the sam e sequence o f row o perations is needed to b rlllg A to red uced row echelo n form in each case, the augmented matr ices for these systems, [A I e d, ... , [A I e" i, ca n be co mbined as
(AI ' ,' , ... ' .]
~
[A I I.]
We now apply row operations to try 10 reduce A to I", which, if successful, will sim ul taneo usly solve for the columns o f A -I , transfo rming In into A - I. We illustrate this use of Ga uss- Jo rdan el iminatio n with three examples.
(Kample 3.30
FlIld the inve rse of A ~
if it exists.
1 2
- I
2 2 1 3
4
- 3
114
Chapler 3
Matrices
Solulloa
Gauss-Jordan elimination produces
2 2 2 1 3 1
[AI f ) = H. · 211, 11,- II,
R..- II. )
1
0
0
4 0 -3 0
1
0 1
0
2 -2
0
1
1
)
- I
0
- I
6 - 2
- 2 - I
1
2
- I
1
0
0 0
1
-t
1
- 3 1 - 2 - I
1
2
- I
0
1
0
0
1
2 0 - I
0
0 0 1
1
0
- 3 1 r - 2
,
0
_1
0
1
r
0 1 0 -s 0 0 1 - 2
,
1
1
1
3
,1
, -,
I 0 0 9 0 1 0 -s 0 0 1 - 2
II, -lll,
1
A-I
=
9
-1
- 5
-s
1
-2
1
3 1
,
1
-s
,1 ,-
-
Therefore,
0 0 1 0 0 1
1
3 1
(You should always check that AA- l = Tby di rect m ultiplicat ion. l3y T heorem 3.13, we do not need to check that A- , A = Ttoo.)
Remlr.
NOlice Ihal we have used the v3riant of Gauss-Jordan elimination th3t first introduces all of the zeros below the le3dmg Is, from left to right and to p to baltom, and then cre31es zeros above the leading Is, from right to left and bollom to top. This approach saves o n calculations, as we noted in Chapter 2, but yo u mOlYfind it easier, when working by hand, \0 create ,,/I o f the zeros in each column as yo u go. The answer, of cou rse, will be the same.
lxample 3.31
Find the inverse of 2 A '=
if it exists.
- 4 -2
1
- 4
- [ 6 2-2
SectIon 3.3 The In\'erst of a Malrix
115
SOI,II.. We proceed as in Example 3.30, adjoin ing the identity matrix to A and then trying to manipulate IA I tJ into II I A-I I. 2
[AI I}
~
I{_ 11{,
•••• II. ,_II
•
-.
]
-,
- \
- 2
2
-,
]
0
0
6 0
\
0
- 2 0
0
\
\
0
0
\
0
3
- 2 2 - 6 \
0
\
\
2
-\
]
0
0 0
]
-3
2
\
0 0
0 -5
-3
\
2
]
0
]
0
0
At this point, we see that it is not possible to reduce A to 1, SIIlCCthere is a row of zeros on Ih, Id l-h,nd sid, of Ih, ,msmenICd n,,"" . Co ''''q ''c j - in other wo rds, workjllgfrol1l lOp
to bottom ill each CO/lIIllIl. ('Vhy?) In th is example, we have
[: 1~
1 - I
fl. •.111,
1 -I
0 1 3 -3
R. - ~R ,
0
1
0
0
0
9
4
5
•
1 - 1
r" and r 4 do not form a basis for the column space of A .
Ixample 3.48
Find a baSIS for the null space of matrix A from Example 3.47. There is really nothing new here except the termi nology. We siml'iy have to find and describe the solutions of the homogeneous system Ax = O. We have al ready computed the reduced row echelon form R of A, so all that remains to be done in Gauss-Jordan elimination is to solve fo r the leading variables in terms of the free variables. The fi nal augmented matrix is
S01111101
[RIO] :
-1 0
1
0
1
0
0
1
2 0 0
° 1
4 0
0
0 0
0 0 0
0
) 0
If
x, X,
x:
X,
X. X,
then the leading Is are in columns 1, 2, and 4, so we solve for XI' Xz. and x. in terms of the free variables XJ and Xs. We get XI = - x) + Xs, Xz ::: - 2x, - 3X;. and x. = - 4Xs' Setting:s "" $ a nd ~ = t, we obta in
x:
+
X,
- $
X,
- 2$ -
x, x, X,
,
- 41
,
t
- 1
1
3t
- 2
- 3
=,
1
+
I
0
0
- 4
0
1
"" $ U
+
tv
Thus, U and v span null{A), and since they are 110early lOdependent, they form a basis (or null(A).
f 81
Chl'ptcr J
Malrices
Following is a sum mary of the most effective procedure to use to find bases fo r the row space, the column space, and the null space o f a matrix A.
I . ,:ind the reduced row echelon fo rm R of A. 2. Use the nonzero row vectors of R (containing the leading Is to arm a basi~ for row(A). 3. Use the column vecto rs of A that correspond to the columns of R containing the leading I s (the pivot colum ns) to form a basis for col (A ). 4. Solve fo r the leadi ng variables of Rx = 0 in terms of the frtt variables, set the free variables equal to parameters, substitute back into x, and write the result as a linear com binat ion o f f veclors (where f is the number of free variab These f vectors form a basis fo r nulJ (A).
If we do not need to find the null space, then it is faster to simply reduce A to row echelon form to find bases for the row and column spaces. Steps 2 and 3 above remain valid (with the substitutIOn of the word "pivots" for "leading Is").
DimenSion and Ranl We have observed that although a subspace will have different bases, each basis has the same number of vectors. Th is fu ndamental fact will be of vital im portance from here on in th iS book.
I
Theore. 3.23
"
T he Basis Theorem Let S be a subspace of vectors.
R ~.
Then any two bases for Shave th
umber of
P'"'
Sherlock Holmes Il(lted, ....."hen you have eliminated Ihe impossible, .... hate,·er remain~, hO"''f''WT improbable, muSI be the truth" (from The Sigll of Ft,"r by Sir Arth ur Conan Duyle).
Let B = lu" u2 ' ••• ' u,\ and e = Iv" v2' • •• ' v,) be bases for S. We need to pro ve Ihal r = s. We do so by showing that neither of the o ther 1\\'0 possibilities, r < s or r > s, can occu r. Suppose tha t r < s. '''le will show that this forces C to ~ a linearly dependent set of vectors. 10 this end, leI (I)
Since 6 .s a basis for S, we can write each v, as a linear combination of the elements u;
v. = ",.u, + " I2U 2 + ... + (I ),U , VI
=
" 2, U 1
+
(I 22 U2
+ ... +
(ll , U ,
Substituting the equations (2) into equation ( I ), we ob tai n
(2)
S«:tion 3 5
211
Subspaas, Basis, Dime nsion , and Rank
Regrouping, we have (cja ll
+
cZa l1
+ .,' + c,a'l)lll + (c1al l + 'la U + " . + c,a,l)UZ + ... + (cia.. + '2"1. + . + '," .. )U,
= 0
Now, since B is a basis. the u; s are linearly independent. So each orlhe expressions in parentheses must be zero: 'I tI li 'lal2
+ £1;1121 + ... + c,a,l = 0 + c2an + ... + e,n>! = 0 • •
This IS a homogeneous system of r linea r equations in the s va riables 'I> ' l " .•• c.. (The fact that the variables appear to the left of the coeffi cients makl"S no difference.) Since , < $, we know fro m Theorem 2.3 that there are infinitely many solutions. In particular, there is a llol1tnvia l sol utio n, giving a nOllt rivial dependence relation in eq ua tio n ( I). Thus, C is a linearly depe ndent set o f vectors. Bullhis findin g cont radicts the fact that C was given to be a basis, and hence linearl y independent. We condude that r < s is not possible. Sim ilarly (interchanging th e roles of B and C), we fi nd that r > 5 leads to a contradiction. Hence, we must have r = S. as desired.
Since all bases fo r a given subspace must have the same nu mber o f vectors, we can attach a name to this number.
DelialtioD
If S is a subspace of Din, then th e number of vectors in ,I basis for S
is called the dimeIJsioIJ o f 5, denoted dim
S.
",..r.
The zero vecto r 0 by itself is always a subspace o r R~. (Why?) Yet any sct cOlltnining the zero vector (and, in particular.IO}) is linearl y dependent. so {O) cannOI have a basis. We d efi ne dim {O} to be o.
Si nce the standard b3sis for Rn has /I vectors, dim IRn = II . (Note that th is result ag rees with ou r in tuitive unders tandin g of d imension fo r 1/ < 3.)
(1IlIIple 3.50
In Examples 3.4 5 th rough 3.48, we fou nd tha t row (A) has a basis with three vectors, col(A) has a basis with three vectors, and null{A) has a basis with two vectors. Hence, dim (row( A» == 3, dim (col(A)) = 3, and dllll (null (A)) = 2.
A single example is not enough on ,"hieh to specul3te, but the fa ct tha t the row and column SpliCes in Example 3.50 have the same d imension is no accident. Nor is the faCI that the su m of dim (col(A)) and dim ( null (A)) is 5, the num ber of co lumns of A. We now prove that these relationships are lrue in general.
2DZ
Chapter 3
Matrices
Theorem 3.24
= The row and col um n spaces of a mat rix A have the same dime nsion.
PrlOI
Let R be the reduced row echelon form of A. By Theorem 3.20, row(A)
r oweR), so
dim (row (A» = d im ( row(R»
= n umber of nonzero rows of R
= number ofleading Is of R Let this number be called r. Now col (Al "* col(R), but the columns of A and I~ have the same depende nce relationsh ips. Therefore, dim (col(A» = d im (col(R». Since the re are rleading Is, R has rcolumns that are standard unit vectors, e l '~"" , e". (These will be vec tors in Rm if A and Rare mX" matrices.) These r vectors are linea rly independent , and the remaining columns of R are linear combinations of them. Thus, dim (col(R» = r. It follows that d im ( row(A») = r = dim (col (A», as we wis hed to prove.
The rank of a matrix was first defined In 1878 by Georg J:robenius ( 1849- 1917), although he defined it using determinan ts and not as we have done here. (See Chapter 4.) Froberu us was a German malhematlCl'ln who r«tiyed his doctorate from and later taught at the University of Berlin, Best kn own for his wn tribut ions 10 gro up theory, Frobenius ust'd matrices in his work on group
representations.
Definition
T he rank of a mat rix A is the di mens ion of its row and column spaces and is de no ted by ra nk(A).
For Example 3.50, we can thus wri te rank(A )
= 3.
Rlm.,II. • The precedmg defi nition agrees wi th the m ore informal definitio n of rank that was introduced in Chapter 2. The ad va ntage of our new definition is that it is much m ore Oexible. • The rank of a matrix simultant:ously glV(:S us information abo ut linear dependence a mo ng t he row vectors of the matrix alld among its column vectors. In p a rticular, it tells us the number of rows and colum ns that a re linearl y independent (and this n umber is the sam e in each case! ). Since the row vectors of A a re the column vecto rs of AT, Theorem 3.24 has the foll owing immediate corollary.
Theorem 3.25
For any ma trix A,
mnk{A') = mnk(A ) PrOal
We have
mnk(A') = d;m«ol(A'» = dim(ww(A» = mnk(A)
Oel'nlll08
The nullity of a matrix A is the dimension of its /lull space and is denoted by nullity(A).
Se.:tion 3.5
203
Subspaces, Basis, Di mensIon , and Ran k
In other words, nullity(A) is the dimension of the solution space of Ax = 0, which is the same as the number of free variables in the solution. We can now revisit the Rank Theorem (Theorem 2.2), rephrasing it in terms of our new d efinitions.
r"
Theore .. 3.26
"
The Rank Theorem
..
"
If A is an mX n matrix, then
rank(A) + nullity(A) =
/I
Proof Let R be the reduced row echelo n form o f A, and suppose that rank (A) = r. Then R has r leading Is, so there are rleading variables and n - r frcc variables in the solution to Ax ::: O. Since dim (null (A» = /I - r, we have rank{A)
+ nuility(A)
= r + (II - r) =
/1
--
Often, when we need to know the nullity of a matTIX, we do not need to know the actual solution of Ax = o. The Rank Theorem is extremely useful in such situations, as the following example ill ustrates.
Elample 3.51
Fi nd the nullity o f each of the fo llowing matrices:
~N-
2
3
1
5
and
4 7 3 6 2 1 - 2 - I 4 4 -3 1 2 7 1 8
Solullon
Since the two columns o f M are clearly linearly indepe ndent , rank (M ) = 2. Thus, by the Rank Theorem, nu llity(M ) = 2 - rank( M ) = 2 - 2 = O. There is no obvious dependence among the rows or columns of N, so we appl y row operat ions to reduce it to 2
1
- 2
- 1
o
2
1
J
o
0
0
0
We have reduced the matrix far enough (we do not need reduced row echelon fo rm here, since we are not looking for a basis fo r the nul! space). We see thai there are on ly IwO nonzero rows, so rank(N ) ::: 2. Hence, nullity(N) = 4 - rank (N ) ::: 4 - 2 = 2.
-4
The results o f th is sectIon allow us to extend the FundamenlaJ Theorem of Invertible Mat rices (Theorem 3.1 2) .
2DC
Chapter 3 Matrices
Theorem 3.21
ow •
The Fundamental Theorem of Invertible Matrices: Version 2 Let A be an n X tI matrix. The fo llowing statements are equivalent:
a. b. c. d. e. f.
A is invertible. Ax = b has a unique sol ution for every b in R~. Ax = 0 has on ly the trivial solution. The reduced row echelon form of A is In' A is a p roduct of elementary matrices. rank (A ) = tI g. nullity(A) = 0 h. The colum n vecto rs of A are linearl y independent. i. The column vecto rs of A span R". j. The column vectors o f A form a basis for R". k. The row vecto rs of A are linearly independent. I. The row vecto rs of A span JR". Ill. The row vectors o f A form a basis for JR".
The nullity of (I !1l(l1r1X was defined in 1884 by Jal1\es Joseph Syive,tcr
PrOal
rIIII 4-HI97), who was interc5ted in
(0 - (d ) => (c) => (h) If rank(A) = tI, then the red uced row echelon fo rm of A has "leading Is and so is In" From (d) => (c) we know that Ax = 0 has on ly the triVIal solution, wh ich implies that the column vectors o f A are linearly independent. since Ax is jusl a linca r comblllatio n o f the column veClors o f A.
(h ) => (i) If the column vectors of A are linearly independen t. then Ax == 0 has only the triVIal solution. Thus, by (c) => (b), Ax = b has a unique solu tion for every bill IR". This means that every vector b in JR" can be written as a linea r combination of the column vecto rs of A. establishi ng (i). (i) => (j) If the colum n vectors of A span IR", the n col (A ) = Rn by definition, so rank (A) :::: dim (col (A» :::: tI. This is (f), and we have already established that (f) => (h ). We conclude that the column vectors o f A are linearly independent and so form a basis for IR:", since, by assumption, they also span IR". (j ) => ( f) If the column vectors of A form a basis for JR", then, in pa rticular, they are linearly independ ent. It follows that the reduced row echelon form of A contains /I lead ing Is, and thus rank(A) = II. The above dIscussion shows that ( f) => (d ) => (c) => (h) => (i) => (i ) => ( f) ¢> (g). Now recall that, by Theo rem 3.25, rank (AT ) :::: ra nk (A), so what we have just proved g ives us the correspond rng results about the column vectors of AT. These are then resulls about the rol\lvectors of A, bringing (k), (1), tandard matrices arc
[5] ~
, ,
- I
- I
0
0
[So 1'] : [5][ 1'] :
0 1 1
,nd [ 1'] :
,
1
,
so Theo rem 3 gives
,
1
0 3
0 3
, , - I
- I
0 1
1
,
- I
3
4
,
-,
3
4
0
5
4
3
-7
- I
1
6
3
0 :
It fo llows Ih31
(S o
T)[::] :
5
4
3
-7
-I 6
, [::] 3
:
5x1
+
4x2
3x1
-
7x!
-
Xl
6xl
+ X! + 3Xl
S«hon 3 6
Introduction to
Lin~ar Transformations
21.
(I n ExcrcJse 29, you will be asked 10 check this result by seu ing
y, y,
=
-1'[::] =
YJ
3x1
+ 4x2
and substilUling these val ues into the definition of 5, the reby calcul:lting (S d irectly.)
Example 3.61
0
T)
[x, ] XJ
..•-t
Fmd the standard matrix of the transforma tion that first rotates;! pom l 90° co un terclockwise abou t the origi n and then reflects the res ult in the x-axis.
Solallon
The ro tation R and the reAcetion F were discussed in Examples 3.57 and
356, respectively, where we found their standard matrices to be (R] =
[F] = [ ,
[~
-
~ ] and
0 ]. It follows that the composition F o R has for its matrix
o -,
[F' R]
~ [ e][ R] = ['o
0][0 - 1 ]
-']o =[ - 0-'] J
0
(Check that this result is correct by consldenng the effect of Fo R o n the st:mdard basis vectors c l and e J • NOie the importance of the order o f the transformat ions: R is performed before F, but we write F o R. In this case, R F also makes sense. Is R o F = F R?)
Inverses oIl/near TranslorllaliOls Consider the effect of a 90" counterclockwise ro mlion abo ullhe origin fo llowed by a 90 0 clockwise rolation abou t the origin. C learly Ihis leaves every point in HI unchanged. If we denote these tra nsformations by R90 and R_90 ( remember that a negati ve angle m easure corresponds to clockwise directio n ), then we may express this as (R9!) R_\IO)( v) == v fo r every v in R2. Note thai , in this case, if we perform Ihe tra nsformations in the other order, we gel the sa me end result: (R _'K! R90 ) (v ) = v for every v in A2. Thus, R90 0 R ~90 (and R~ <JO R.,o too) is a linear transformation that leaves every vector in A Z uncha nged. Such a tra nsformation is called an identity transformation. Generally, we have one such transformation for every RW- namely, I: R~ -l> R~ such Ihat l t v) "" v for every v in R". (If it is important to keep track of the dimension of the space, we might write In for clarity.) So, with this notation, we have R90 R ~90 = I "" R_90 0 ~ A pair of transform ations that are related to each other in this way are called inverw transformations.
Dennltloa
LeI S and Tbe linear transformations from R~ to A". Then 5 and T are inverse transformatioru if S o T = I" and T o 5 = I,..
ttl
Chapter J
Matrices
•••• r.
Since this definition is symmctric wit h respect to S and T. we will say that, when this situation occurs, S is the inverse of T and T is the inverse of S. Fur· thermore, we will say that 5 and T are invertible. In terms of matrices. we see imme(llatciy that if 5 and Ta re inverse transfonnatlOns, then [5 II TI :: 15 0 TJ "= II J = I. where the last I is the identity marrix. (Why is the standard matrix of the identity tr:lIlsformation the identit y matrix?) We must also have [TI [SI .,. I T o 51 = II] = I. This shows that IS] and I TI are inverse matrices. II shows something more: If a linear tr.ansformatlon T is invertible. then its standard matrix I TI must be invertible, and since matrix inverses are unique. this means that the inverse of Tis also unique. Therefore, we can unambiguously usc the notation r - I to refer to tile inverse of T. Th us, we can rewrite the above equations as ITIl T IJ = 1 = [ r- I JI '1'1, shOWing that the matrix of 'r - I IS the inverse matrix of I TI. We have just proved the follOWing theorem.
Theorem 3.33
,
Let T: R" __ [R" be an invertible linear transformation. Then its standard matrix I 'f) is an invertible matrix, and
-
•••• r.
Say this one in words 100: "The matrix of the inverse is the inverse of the matrix." Fabulous!
Ellmple 3.62
Find the standard ma trix of a 6Cf' clockwise rotat io n about the origin in 1R: 2•
SDlullDa Earlier we computed the ma trix of a 60" counterclockwise rOlal ion about the origin to be
1/2 - 0 / 2] [ R., ) - [ 0 /2 1/2 Since a 600 clockwise rotation is the inverse of a 60" counterclockwise rmation, we can apply Theorem 4 to obtain
- 0 /2] -' ~ [ 1/2 0 / 2] 1/2 - 0/2 1/2 (Check the calculation of the matrix inverse. The fastest way is to use the 2X2 short· cut from Theorem 3.S. AlSQ, check that the resulting matrix has the right effect on the standard basis in [Jl2 by drawing a diagram.)
Example 3.63
Determine whether projection onto the x-axis is an invert ible transformat ion, and if it is, find ils inverse.
Sol,II., The sta ndard ma trix of this projection P is [~ since its determin ant is O. Hence, P is not invertible either.
~ ] , wh ich is not invertible
Section 3.6
H•••
221
r.
Figure 3.13 gives some idea why Pin Example 3.63 is not invertible. The projection "collapses" 1R2 on to the x-axis. For P to be invertible. we would have to have a wa y of "undoing" it. to recover the point (fl. b) we started with. However, the re are infinitely ma ny candidates for the image of (fI, 0) under s uch a h ypo thetical " inverse." \-vhich o ne should we use? We can not simply say tha t r l must send (a. 0) to (a, b), since this cannot be a defimtion whe n we have no way of knowing what b should be. (See Exercise 42. )
,,«(/. , b) ~ (fI,b')
,
( ll.
Introd uction to Lmcar Transformations
0)
, ' (fI, b")
figure 3,13
Assoclallvilv
Projections arc nOl invertl ble
Theore m 3.3(a ) in Sectio n 3.2 stated the associativity property for matrix multiplication: A( Be ) = (A IJ) c. Of you didn't try to p rove it then, do so now. Even wi th all matrices restric ted 2 X2, yo u will get some feeling for the notat io nal complex ity involved in an "elementwise" proof. which should make you ap preciate the p roof we are abou t to give.) Our a pproach to the proof is via linear transformations. We have seen tha i every mX" matrix A gives rise to a linear transformation T", : Rn -+ jR'''; conversely, every linear tra nsfo rmation T: R" -+ R'" has a co rresponding m X II matrix [ T J. The two correspondences are inversely related; that is, given A, !Toll = A, and given T, TIT! = T. Let R = TA • 5 = TIi' and T = Tc Then , by T heorem 3.32,
A(BC)
~
(AB)e if,ndonlyif R .(S. T )
~
( R. S). T
We now prove the latter identity. Let x be in the domlli n of T (and hence in the do main o f both R 0 (S o T) and (R 0 5) 0 T- wh y?). '10 prove that R 0 (S o T) = (R 0 5) 0 T, it is enough to prove that they have the same effect on oX . By repeated applicat ion of the definition of composition, we have
(R 0 (s • T))( x)
~
~
R« S ' T)(x )) R(S( T(x)))
~ ( R ' S)(T( x )) ~ «R ' S) · T)( x )
required. (Carefull y check how the definitio n o f composi tion hlls been used four times.) This sectIOn has served as an introduction to linear transformations. In C hapler 6, we will take another m o re detailed and more general loo k at these transformati o ns. The exercises that follow also contilin some add itional exploratio ns of thiS important concept.
3S
Exercises 3.6 I. Let TA
:
.
]R2 -+
IR' be the matrix transfo rmatio n corre-
spondmg to A =
[2 -'j 3
4 . Find T",(u ) and T",(v),
2. Let TA : IR J -+ Rl be the matrix tra nsformation correspond ingtoA = [
4 - 2
0 I
- lj. Filld T",( U)a nd 3
I TA( v), where u
=
- I
2
0 and v =
5
- ,
!ZZ
Chapter 3
Matrices
III Exercises 3-6, prove that the given lrtills/ormation is u
linear trans/ormation, using the defillitloll (or the Remark /ollolYillg Example 3.55). X [
x
5. T Y
,
+
r]
4.T[;]
x -y
+,]
x- y [ 2x + y - 3z
x 6. 'J' y
-y x
+
2y 3x - 4y
x+ , y+ , x+y
III Exercises 7-10, give a counterexample /0 show that tIle
givell tram/orma/io" is Ill)( a linear traIlS/ormation.
T[;l [~l ""[;1 [xx:J 7.
r[;] ~ [:::1 10.T[;] ~ [;+ :] 8.
/11 Exercises J J-1-1, filld Ihe standard lIIa/rix of the lil/ellr mills/ormation in lire gil'cn exercise. LI. Exe rcise 3
12. Exercise 4
13. ExeTCIse 5
14. bercise 6
In Exercises 20-25, find tire sW lldard matrix 0/ lite gIven linear tmm/ormation from Rl to ]R2. 20. Cou n te rclockwise rotation t h rough 1200 abo u t the • •
origin
21. Clockw ise rot ation th ro ug h 30° abo u t lhe origin
y == 2x 23. Projection o n to the line y == - x 24. Refl ectio n in the li n e y = x 25. Rencction in the line y = -x 22. Project io n onto t h e line
e
26. Let be No te also that the columns of P are probability vectors; an y square m atrix with this p roperty is called a stochastic matrix. We can realize the d etermini stic nat ure of Markov chains in another way. Note lhat we can write
a nd, in general, Xl
=
piX,
for k
0, I, 2, ...
This leads us to exam ine the powers o f a transi tion matnx. In Examplc 3.64, we have
p! = [ 0.70 0.30
0.7
A 049
A
0.3 A
~ y B
fl'lre 3.21
B 0_21·
AOOO
~ BO.24.
0.20 ][ 0.70 080 0.30
0.20] 0.80
~ [ 0.55 0.30 ] 0.45
0.70
What are we to make of the elHries of th is mat rix? The first thing to observe is Ihal pl is ano ther stochastic matTIX, sincc its colum ns sum to 1. (You arc asked to prove th Is in Exercise 14. ) Could it bc that pl is also a transition matrix of some kind? Consider one of its entries say, (1'2)21 = 0.45. The tree d iagram in Figure 3.20 clarifies where Ihis ent ry came from . Thcrc are four possible state ch,mges that can occur ovcr 2 months, and these correspond to the four b ranches (or paths) o f length 2 in thc tree. Someone who illlt iall y is using Br;md i\ can end up using Brand B 2 m on ths b tcr in two d ifferen t ways (marked '" ill the figure ): The perSOll can con tinuc to use A after I month and then switch to 8 (wi th probability 0.7(0.3) = 0.21), o r thc person can switch to Barter 1 m onth and thell stay with B (w1lh probability 0.3(0.8) = 0.24 ). The sum of these probabilities givcs an overall probability of 0.45. O b serve that these calculations are extlctly wh:Jt we do whcn we compute ( P~)"I ' It fo llows that ( P2h = 0.4 5 represents the probabil ity of m oving from state I ( Brand A) to state 2 (I}rand Bl in two transitions. (Notc II1:1t the o rder of the subscripts is the rCI'erst' of what yo u migh t have guessed.) The argumcnt can be general ized to show tha t j
"
( pt)~ is the p robability of moving from state j to statc i in k transitions.
In Example 3.64, what will ha ppen to the d istribution of toothpaste users in the lo ng ru n? Let's wo rk wilh p robability veClors as state vecto rs. Continuing our calculations (ro unding to three deCImal places), we find
Xu
0.60] [
[0.50] [0. 4 5] x - Px - [0.70 0.20][0.50] 0.50 ' 0.30 0.80 05 0 = 055 ' 0.70 0.20][0.4 5] ~ [0.425] x ~ [0.412] x ~ [0 . 4 06] = [ 0575 0.588 ' 0594 ' 0.30 0.80 0.55 03] [0 . 4 02] [0. 4 01 ] [0.400] [0. 4 00] [0.4 0.597 ,x = 0.598' = 0.599 ' x\> = 0.600 ' x = 0.600 '
= 0.40 ' xI --
2 -
PXl
I -
'4
7
X8
5
10
Section 3.7
131
Applications
and so on. It appears that the state vectors approach (or COtlVugf: to) the vector [ 0.4] . 0.6 implying that eventually 40% of the toothpaste U2r5 in the survey will be using Brand A and 60% will be using Brand B.lndeed, it is easy to check that, o nce this distribution is reached, it will never change. We s imply compute
0.70 [ 0.30
0.20] [ 0.4] = [0.4 ] 0.80 0.6 0.6 A state vector x with the property that Px := X is ca lled a steady state vutor. In Chapler 4, we will prove that every Markov chain has a unique steady state vector. For now, let's aceepl this asa fact and see how we can find stich a vCClorwLthout doing any iterations al aiL We begin by rewriting the matrix equation Px = x as Px = Ix. whICh can in turn be rewritten as (I - P}x = O. Now this is just a homogeneous system of linear equations with coefficient matrix 1 - P, so the augmented matm is 11- P I 0). In Example 3.64, we have 1 - 0.70 ( / - P I 0) = [ - 0.30
- 0.20 0] [0.30 1 - 0.80 0 ., - 0.30
which reduces to
-0.20 0] 0.20 0
0]
[~ ° a -~
So, if our steady state vector is x ::: .
.
[XI]' then Xl is a free variable and the parametric X,
solutIOn IS
X, :::
j t,
X:2 = t
If we require x to be a probability vector, then we must have I = X,
Therefore, X:! = ' ''''
i=
0.6 and
X,
+ Xl =
=
it + t =
i "" 0.4, so x =
~t
[0.4 ]. in agreement with our 0.6
iterative calculations above. ( If we require x to contain the aCt/l(ll distribution, then in this example we must have x, + Xz ::: 200, from which it follows that x "" [ I
Example 3.65
~~
l)
A psychologist places a rat in a cage with three compart ments, as shown In Figure 3.21. The rat has been trained to select a door at random whenever a bell is rung and to move through it into the next compa rtment. (a) If the rat is initially in compartment I, what is the probability that it will be in compartment 2 after the bell has rung twice? three times? (b) In the long run, what proportion of its time will the rat spend in each compartment? SOllllloD
Let P =
I~ I "" PJI =
II}ql be the transition matrix for this Markov chain . Then
L Pu
::: Pu = j , P12 = P2J =
J. and
Pl1 = P1:I. = Pn = 0
232
Chapter 3
Matrices
flll"l 3.21 ( Why? Remember th;!t P'I is the prob,lbility of moving fro m j to i.) Therefore,
0
I) =
, j
, j
0 ,
, •,
!
,,
•, 0
and the initial sl;lte vector is I x,~
0 0
(a) After one ring of the bell. we ha\'e
,! 1,, !, 0 , 1, , , 0
0
x, = PXo =
0
I
0 0
,1 1 ,
~
0 0.5 0.5
Continu ing (rou nding to three decimal places), we find
, , o , o , , 1 , o , ,, 1 , 1, o ,
,
1
~
1
0333
I I
0333
1
• .L
0.222
"
0.389
.L
0.389
0333
a nd
o 1 !,, ,1 j x} = Px~ = , o , ,1,
l
,1
o
,
~
'"
1
Therefore, after two rings, the probability that the rat is in compartment 2 is = 0333, and after three ri ngs, the probability that the rat is in com partment 2 is 0389. I Note t hat these questions could also be answered by computing ( p lhl and (JV h .j
? ....
Section 3.7
Applications
2:33
(b) This questio n is aski ng for the steady stale vector x as a prooobility vector. As we saw above, x must be in the null space of I - P, so we proceed to solve the system
,, _1, 0 , 1 1 _, -"2 0 _ 1, ,, 1 0 1
[/ - PI Oj -
•
1
0
_., 0
0
1
- I 0
0
0
0 0
x, Hence, if x =
X~ , then
X3
x, bility vector, we need I
C XI
= t is free and
XI
=
i i, X2 =
t. Sm ce x must be a proba-
+:s +:s "" ~ t. Thus, l '" i and
,• ,• 1
•
which tells us that, in the long ru n, the rat spends ! of its tim e in compart ment 1 and 1 o f its time in each of the other two compartments.
Populallon Growlft P. H.. Leslie, "On the U!;C (,f Matrices in G:rtain I\' pulation MathematiG." Biorll,·tri!.tJ J) ( 1945), pp. 18J-lll.
Example 3.66
One o f the most popular models of population growt h is a m atTix-based model, fi rst in trod uced by i'. H. Leslie in 1945. The Leslie "Iodel describes the growth of the female portion of a population, \vhich is assu med to have a maxi mum hfespan. The females arc divided into age classes, all o f which span an eqU:11 number of years. Using data abou t the average birthrates and survi val probabilities of each class, the model is then able 10 determme the growth of the popul:Hlon over lime.
A certai n species of German beetle, the Vollmar- Wasserman beetle (or VW beetle, for sho rt), lives for at most 3 years. We divide the female VW beetles into th ree age classes of I year each: yo uths (0- 1 year),juveniles ( l-2 years), and adults (2-3 yea rs). The youths do not lay eggs; each juvenile produces an ave ragc of four female beetles; and each ad ult produces an average of three fe males. The survival r:lle for }'ouths is 50% (that is, the probability of a youth's survi ving to become a juvenile is .5), and the survival rate for juven iles is 25%. Suppose we begin wi th a pop ulation of 100 fe male VW beetles: 40 youths, 40 juven iles, and 20 adults. Predi ct the beetle populatio n for each of the next 5 years.
5alall01 After I year, the number of yout hs will be the number prod uced d uring tha t year: 4O X 4 +20X3=220 The num ber of juvcniles will simply be the nu mber of youths that have survived: 4Q x 0.5 = 20 Likewise, the number of ad ults will be the nu mber o f juveniles that have su rvived : 40 X 0.25 = 10
We can combine the~ into a single matrix. eq uatio n
o
4 0 0.25
0.5
o
3 0 0
40 40
220 20
=
20
10
40 o r Lxo xI' where Xo = 40 is the initial population d istribution vector and XI
=
=
220 20
20
10 is the distribution after I year. We see that the struct u re of the equation is exactly the same as for Ma rkov chains: X i ... 1 :::: Lx. fo r k = 0, I, 2, ... (although the in terpretation is qUite different). It follows that we can iteratively compute successive population distribution vectors. (It also fo llows that x~ = L*x o for k = 0, 1,2, .. . ,as for Markov chai ns, bu t we wi ll not use thIs fact here.) We compute o 4 3 220 110 20 = 110 Xl = LX I = 0.5 0 0
o o X3
= Lx, =
0.25
0
10
5
0.5
4 0
3 0
110 110
o
0.25
0
5
455 55 27.S
4 0.5 0
3 0
455
o o
0.25
0
4 0 0.25
3 0 0
27.5 302.5 227.5 13.75
o x4=Lx)=
Xs =
L~
=
0.5 o
::::
302.5
55
227.5 13.75 95 1.2 151.2 56.88
Therefore. the model predicts that after 5 yea rs there will be approximately 951 you ng fe male VW beetles. 151 juveniles. and 57 ad ults. ( N ote: You could argue that we should have rounded to the nearest in teger at each step-for example, 28 ad ults after step 3-wluch would have affected the subsequent iterations. We elected Itot to do th is, since the calculations are o n ly approximations anyway and it is much easier to u~ a calculator or CAS if you do not round as yo u go.)
The matrix L in Example 3.66 is called a u$lie matrix. In general. if we have a p opulation with" age classes of equal duration. L will be an " X II matrix with the following structure: b, b, b, .. . $1 L ~
o o
0
0
52
0
0
S.l
000
... ...
5~_1
0
Here. b l • hz• ... are the birt/! parameters (b, = the average numbers of females prod uced by each female in class i) and 51' s,;, ... are the mrvival probabilities (5, = the p robabil ity that a fe male in class i survives into class j + 1).
Section 3.7
4000
235
Applications
Youlhs
= o 3000
1000
o k::;:::~ ;;;;;;;;:::;::. -..-=4==.= :::::;:=A~d",..:I"_ o
2
4 6 Timc (10 years)
10
flgtue 3.ZZ Wh at arc we to ma ke of o u r cnlcu lat ions? Overa ll , thc beetle pop ulntion appears to be Incrensing, nlthough there nrc some fluctua tion s, such as a decrease from 250 to 225 fro m year \ to yea r 2. Figure 3.22 shows the change in the population in each of the th ree age classes a nd dearly shows the growth, wi th fluctuations. If, ins tead of plotting the actl/al population, we plot the relative population in each class, a diffe rent paltern em erges. 10 do this, we need to compute the fmc lio n of the population in each age class in each yea r; that is, we need to divide each distributio n vector by the sum o f its comp<ments. For example, after I year, we have 1
- x 250 I
1 250
220
0.88
20 to
0.08 0.04
which tells us lha t 88% of the population con sists of you ths, 8% is juveniles, and 4% IS adults. If we plot this type of data over time, we get a graph like the one in Figure 3.23, whICh s hows clearl y that the propo r tio n of the populat ion in each class IS app roachi ng a s teady state. II t urns o ut that the steady slate vector in th is example is 0.72
0.24 0.04 That is, in the long run, 72% of Ihe populatio n will be you ths, 24% juveni les, and 4% adults. (I n o ther words. the population is distributed nm ong the th ree age classes in the ratio 18: 6 : I .) We will see how to dete rmine this ratio exactly in Chapter 4.
Graphs and Digraphs There are many s itualions III which I t IS important to be able to model the interrela tionships amo ng a finite set of objects. For example, we might WIsh 10 desc ri be various Iypes of networks (roads connecti ng low ns, airline routes connecting CIties, cormnulllcalion llllks connecting satellites, CIC. ) or rela tionships amo ng gro ups or individ uals (friendship rehllionships in a sociClY. predator- prey relatio nships III an ecosyste m , dominance relationships in a sport, etc. ). Graphs are ideally sui ted to modeling such ne tworks and relmionships, and it turns out that ma trices are a useful tool in their study.
23&
Cha pler 3
MalTices
09 0.8 0.7 A
B
c
.--0
.
-
Yoolh ~
0.6
0
8.
0.5
~
-• & " 0
c
C
D
0.4 0.3
Ju ven ile ..
0.2
0.1
Adults
o~~~~~~~~~===-~
A'- --.'---C!
o
5
10 TlIne (in years)
D
figure 3.24 Two rcpustntililo ns o( the same graph
The term VCrlex (\'('rl;c('s is the plural) comes (rom the La tin verb verrett, which means "to turn." In th e co ntext of gr 'l phs (and gt"{)metry ), (1 vertex iS:I corncr-a point wher(' an edge "turns" into a diff(' rent edge.
IS
20
flgur. 3.23 A graph consists of a finit e set of points (called vertices) and a fi nite set of edges, each of which connects two (not necessarily distinct ) vertices. We say that two vertices are adjacent if they are the endpoin ts of an ed ge. Figure 3.24 shows an example n of the same graph drawn in two different ways. The g raphs are the "sam e in the sense that all we care about a rc the 3djacency relationships tha t identify the edges. We can record th e essential in form3lion abo ut a graph in a matrix and use matrix algebra to help us answer certai n questions about the graph. This is particularly useful if the graphs are large, since computers can handle the calculations very quickl y.
Definition
If G i. a graph wit h IlXn matrix A [or A(G)I d efined by
a , "= { I 'I 0
/I
vertices, then its adjacency matrix is the
if there is an edge betwee.n vertices i and j otherwise
Figure 3.25 shows a graph and its associated adjacency matrix.
",
,',• A-
1
1
1
1
1 0
I
1 0
1 0
'.flaur. 3.25
'J
A gmph with adjacency matrix A
1
0
0
0
0
Section 3.7
Applications
231
ft ••• ,.
O bserve that the: adjacency matrix of a graph is necessarily a symmetric matrix. (Why?) Notice also that a diagonal entry a" of A is zero unless there is a loop at vertex i. In som e situations, :'I grap h may have more than one edge between a paito f vertices. In such cases, It may make sense to modify the definition of the adjacency matrix so that (lj) eq uals the I1Ilmber o f edges hctwecrl vertICes i and ). We defi ne a patl. in a graph to be a sequenccor edges that allows us to travel from o ne vertex to another conti nuo usly. The length of a path IS the num ber of edges it contains, and we will refer to a path with k edges as a k-ptlth. For example. in the graph of Figure 3.25, VI v1 v2v. is a 3-path, and V~V I V2V1 Vt Vj is a 5-palh. Notice that the first o f these is closed (it begins and ends at the same vertex); such :, p:lIh is called a circui•. The second uses Ihe edge between 1'1 a nd 1'21WICe; a palh th 1. (The word lIilpotl!l!I comes from the Latin nil, meaning "nothing," and potere, meaning
Z8Z
Ch~plI.: r
,I
Eige nvalues and E3igcllv(.'Ctors
"to have power." A nilpotent matrix is thus one thai bemrnes "nothing"---that is. the zero matrix-when raised to some power.) Find all possible values of det (A) jf A is nilpotent.
where P and S are square matrices. Such a matrix is said to I>e in block (upper) triangular form. Prove that det A = (det P)(det S) ( H ml: Try a proof by inductio n o n the nu mber of
111 Exercises 57-60, lise Cramer's Rule 10 solve Ihe given linear sy$tem. 57.x+ y = I
58. 2x -
5
Y""
70. (a) Give an example to show that i( A can be partitioned as
x+3y - - 1
x-y = 2 59. 2x + Y + 3z - I
6O. x+y -z= 1
Y + z- I
x+y +z= 2
,- I
x-y
=3
III Exercises 6 1-64, lise Theorem 4. J2 to compllte tile inverse of tile coeffiClellt mtllrix for tire gIVtII exercise. 6l. Exercise 57
62. Exercise 58
63. Exercise 59
64. Exercise 60
65.
rows of P. )
If A is an Invertible /IX n malrix, show that adj A is also invertible and that
1 A = adj (A- I) det A 66. I( A is an /I X II matrix, prove that (adj A) -I =
det(adj A) "" (det A)" - l 67. Venfy that i( r < s. then rows rand s o( a malflx can be interchanged by performing 2( s - r) - I mterchanges of adjacent rows. 68. Prove that the Laplace Expansion Theorem holds (or column expansion along the jlh column. 69. Let A be a square malrix that can be partitioned as A =
[6+~l
A =
[~'iil
where 11, Q, R, and 5 are all sq uare, then it is nOl necessarily Irue that
dOl A = (dct P)(det S) - (dOl Q)(dct R) (b) Assumc that A is partitioned as in part (a) and that P is invertible. LeI B
=:
[:~~~~·i +f]
Compute det ( BA ) usi ng Exercise 69 and use thc result to show that det A "" det Pdet(S - RJTIQ) (The matrix S - RP- IQ is called the ScilUr complemetl' o( PIn A, afte r Issai Schur ( 1875- 1941 ), who was born in Belarus bU I spent most of his li(e in Germany. He is known mamly (or his fu ndamental work on the represen tatio n theory of gro ups, but he al$O worked in number theory, analysis, and other areas.) (c) Assume Ihal A is partitioned as in part (a), that P is invertible, and t hat PR = RP. Prove Ihat det A = det(PS - RQ)
.- ,
~
, - .. ".""
~
~
~
_~I
... -:;:--
.
Geometric Applications of Determinants This exploration will reveal some of the amazing applications of determinants to geometry. In particular, we will see that determinants arc closely related 10 area and volume form ulas and can be used to produce the equations of lines, planes. and certain other curves. Most of these ideas arose when the theory of determinants was being developed as a subject in its own right.
Tbe Cross Product Recall fro m Exploration: The Cross Product in Chapter I that the cross product u, of U = liZ and v = V2 is the vector u X v defined by
U X Y =
112V, -
l'l Vl
II J V1 -
Il IV}
II1 V2 -
11 2 V ,
If we write this cross product as ( 112v, - II) v2)e I - ( II I v} - Il, VI )e 2 + (u] v2 - IIz V. )c}' where (' I' c 1' and c, are the stimdard basis veclOrs, then we see that the form of this fo rmula is U X V = det
CI
III
VI
C2
112
V1
eJ
IlJ
Vj
If wc expand along thc first column . (This is not a proper determinan t, of course, since e p c2• and eJ are vectors, not scalars; howcver, it gives a useful way of rcmcml>er ing the somewhat a\. . kward cross product formula. It aLso lets us use properues of determinants to ver ify some of thc properties o f the cross product.) Now let's revisit some of the exercises from Chapter I.
283
I.
Use the determinant version of the cross product to com pute u 0
(a) u =
1 ,v = 1
2
-I
2 - 4
2 •y = 3
(,) u =
2.
3
(b) u =
1
(d) u =
-6
1
1 1
• y=
1
2 3
W,
",
, V ""
- I • y= 1
v.
0
2
Y,
'" '" '"
If u :::
3 - I
X
, and w =
w, ,show that
Y,
W,
", u· (v
X
w) = del
",
",
v, v, Y,
w, w, W,
3. Usc propert ies of d eterminants (and Problem 2 above, jf necessary) to prove the given property o f the cross product.
(b) u X 0 = 0 (d) u X k.v = k(u X v) (c) u x u = O (,) u X (v + w) = u X v + u X w (f ) u' (u X v ) = 0 and v' (u X v ) = 0 (g) u · (v x w) = (u X v) · w (t he Iriplescafarproduct idetltity )
(a) v X u ::: -( u X v)
Area and Volume We can now give a geometr ic interp retillion of the deter m inan ts o f 2x2 and 3X3 matrices. Recall that if u and v arc vectors in R l, then the tHCa A o f the parallelogram d eterm ined by these vectors IS given by A = II u X vii. (Sec Explo ration : The C ross Product in Chapler I.) 4.
Let u
=
["'J
and v =
vl
li!
termined by u and v is given by
b+ d d
A = del
~ UI
k ed)
( Hint: Write u and v as
(a. b)
!i2
o a a+c
figure 4.9
284
["J
. Show that the area A of the parallelogram de-
x
[''""
VI
and
V< .)
0
5. Derive the area fo rmula in Problem 4 geomet rically. usi ng Figure4.9 asa gUide. (Hint: Subtract areas fro m the large rectangle until the parallelogram remains.) Where does the absolute value sign come from in this case?
v X ""
h
,
-
FIGura 4.10
6.
Find the area of the parallelogram determined by u and v.
Generalizing from Problems 4-6, consider a parallelepiped. a three-dimensional solid resembling a "slanted" brick, whose six faces are all parallelograms with opposite faces parallel and congruent (Figure 4.10). Its volume is given by the area of its base tImes its height. 7. Prove that the volume Vof the pa rallelepiped determined by u, v, and w is given by the absolute value of the determinan t of the 3X3 matrix [u v w) with u, v, and w as its columns. [Him: Fro m Figure 4.10 you can see that the height h can be expressed as II = I lul ~os(}, where 0 is the angle between u and v X w. Use this fact to show that V :::: lu· (v X w)l and apply the result of Problem 2.1 8.
Flgur, 4.11
Show that the volume V of the tet rahed ron determined by u, v, and w
(figure 4. 11) is given by V = Hu· (v x w)1 [Him: From geometry, we know that the volume of such a solid is V = base) (heigh t).]
j (area of the
Now let's view these geometric interpretations from a transformational point of vIew. Let A be a 2 X2 matrix and let P be the paral lelogram determ ined by the vectors u and v. We will consider the effect of the matr ix transformation .,~ on the area of P. Let T.... (P) denote the parallelogram determined by T....(u) = Au and T....(v) = Av. 9.
Prove that the area of 1~( P) is given by Idet AI(area of 1').
10. Let A be 3 3X3 matrix and let P be the parallelepIped determined by the vectors u, v, and w. Let T",(P) denote the parallelepiped determined by T",(u):::: Au. T",{ v ) :::: Av, and 'J ~( w) = Aw. Prove that the volume of T",{P) is given by Idet AI (volume of P).
•
The preceding problems illustrate that the determinan t of a matrix captures what the corresponding matrix transformatio n does to the area or volume of figu res upon which the transformation acts. (Although we have considered o nly certain types of figures, the result is perfectly general and c.m be made rigorous. We will not do so here.)
285
LInes and Planes Suppose we are given two distinct poin ts (x .. r,) and ( ~, Y2) in the plane. There is a u nique line passing through these points, and its equation is of the form
ax + by +c=O Si nce the two given points are on this line, their coordinates satisfy this equation. Thus,
ax,+ by,+c -= O aX2 + bY2 +C - Q The three equations together can be viewed as a system of linear equations in the va riables (I, b, and c. Since there is a nont rivial solution (Le., the line exists), the coeffi cien t matrix
cannot be invertible, by the Fundamental Theorem of Invertible Mat rices. Consequently, its determinan t must be zero, by Theorem 4.6. Expa nding Ih is determinan t gives the equatio n of the line.
• The equation of the line thro ugh the points (x,. y,) ilnd (~. Y2) is given by
Y
x xl
yl
1 l= O
Xl
Yl
I
II. Use the method descri bed above to fi nd the equation of the line through the given po ints.
(b)( I.2) 'nd (4.3) 12. Prove that the th ree poi nts ( xl' YI)' (-\I. Yl )' and (x,. y,) are collinear (lie o n the same line) if and only if Xl
YI
1
x~
y~
l=O
xJ
Yl
1
13. Show that the equation of the plane through the three noncollinear poi nts ( XI' YI' %1 )' (~, Y2' ~), and (~, Yl' %3) is given by
, 1 1 " 1=0 "" 1 What ha ppens if the three po ints are coll inear? [Him: Explain what ha ppens when x x, x, x,
y y, y, y,
row red uction is used to evaluate the determi nant.] 286
14. Prove that the four points (x" r .. ZI)' ( ~'Y2' Zz), (.\3' Y3' Z), and (x.> Y4'Z.) 3rc coplanar (lie in the same plane) if and only if
x, y, " x, y, Z, x, y, z, y.
X.
"
1 1 1 1
= 0
Curve fitting When data arising from experimentatio n take the form of poi nts (x. y) that can be plotted in Ihc plane, it is often of interest to find a relationship between the variables x and y. Ideally, we would like to find a funct Ion whose graph passes through all of the
12
points. Sometimes all we wa nt is an approxim a tio n (see Section 7.3), but A
CX 3Ct
results
are also possible in certain situations. 15. From Figure 4.12 it appears as though we may be able to find a parabola passing through the points A( -] , 10), B(O, 5), .tnd C(3, 2). The eq uation of such a parabola is o f the form y = a + bx + cx 2, By s ubstitu ting the given pOll1ts into this equation, set up a system of three linear equations in the variables a, h, and c. Withollt
6
IJ
so/vmg the system, uSC T heore m 4.6 to argue t ha t it must have a umque solution.
4
Then solve the system to find the equation o f the parabola in Figure 4. 12. 16. Use the me thod of Problem J 5 to find the polynomials o f degree at most 2 that pass through the following se ts of po mts.
c -2
flgur. 4.12
2
4
(,) A( I. -I ). 8(2, 4), C(J. J)
6
(b) A(- I . - J ).8(I.-I ). C(J. I)
17. Ge neralizing from Problems 15 and 16, suppose al' a2 , a nd a) a rc distinct real numbers. Fo r any real nu m bers hI' h1, a nd h3' we wa nt to show tha t there is a unique quadratic with equation of the for m y = a + hx + a? passing through the points (a l • btl , (a2, b2 ), and (a 3, h3)' Do th is by demonstrating that the coefficient matrix of the associated linea r system has the determinant ~
I
a, a2
(Ii =
1
IlJ
tt,
1
(a) - (l ,)(a3 - (l ,)(aJ
-
a2)
which is necessarily nonzero. (Why?) Let a" rlz. a3, and
18.
1 1
-
"a, aiai a:ai
1
a,
1
'.
ai al ai a:
fl.
be distinct real num bers. Show that
= (~ - {1,)(a3 - al )(tI.! - Ql )(a.l - (~) (a4 - ( 2)(a4 - a3)
'* 0
For a ny real num bers b" bl , b~, and b4 , usc this res ult to prove tlla t there isa u nique cubic WIth equatton y = a + bx + ex l + dx' passing through the fo ur points ( {I I' bl ), (a 2, b2), (a3, h3), a nd ( a~, b. ). ( Do not actually solve for a, b, c, and d. )
28'
I 9.
Let a l • a2•••
• , Q~
I I I I
be
a, a, a, a"
/I
real numbers. Prove that
a:
• ••
a;r-I
al
•• •
~- l
•• •
(t- J
a:
'"
,
- IT
(aJ - (I,)
ls;i < I "; ~
(/~-I
"
where n I '" iv'" are linea rly dependen t is false. It follows that Y. , V1'. ' " V'" must be linearly independent.
/n exercISes /- 12, compute (a) the characteristic polynomial of A, (b) the eigenvalues ofA, (c) a basIS for eacl! eigenspace of A, lIlid (d) tile algebmic alld geometric multiplicity ofeach eIgenvalue. I. A=[
3. A =
5. A =
I
- 2
~l
2.A =[
~l
2
-\
0
\
0
0
\
3
\
\
0 0 3
0 0 0
\
- 2
\
2
- \
0 0
- 2
\
4. A = 0
\
0
3
\
\
- \
\
0
\
\
6. A =
3 2
-, 0
10. A
\
\
- \
\
,-
11. A =
- \
0
2
0
-\
- \
\
\
0
0 \ 4 0 0 3 0 0 0
5
2
\
0
0
- \
0 0 0 0
\
\
\
\
4
0
0
9.A =
3
- \
8. A =
2 2
\
\
2
7. A =
\
\
0 0 0 0
\
\
0 2 3 - \ 0 4
=
\
\
2
296
Chapter 4
Eigenvalues and Eigenvectors
I
0
12. A :::
4 0 0 4 0 0 0 0
I
I
I
2 0
3
(b) Using Theorem 4. 18 and Exercise 22, find the eIgenvalues and eigenspaces of A-I , A - 2/, and A + 21. 24. Let A and Bbe n X II matrices with eigenvalues A and J.L, res pectively. (a ) Give an example to show that A + J.L need not be
13. Prove Theorem 4.1 8(b). 14. Prove Theorem 4.18(c}. [Hint: Combine the proofs of parts (a) and (b) and see the fo urth Remark follow ing
Theorem 3.9 (p. 167}.!
I" Exercises IS and 16, A tors V ,
::: [ _ : ]
ami V2
IS
a 2X2 matrix with eigenvec-
::: [ :
J corresponcling to elgenvailles
!
AI ::: and Al = 2, respectively, and x :::
[~].
15. Find Al(lx.
an eigenvalue of A + B. (b) Give an example to show thtH AJ.L need not be an eigenvalue of AB. (c) Suppose A and J.L cor respond to the sallie eigenvector x. Show that, in t his case, A + JJ. is an eigenvalue of A + Hand AJ.L is an eigenvalue of AB. 25. If A and Bare IWO row equivalent matrices, do they necessarily have the S;lme eigenvillues? Ei ther prove Ihat they do or give a counterexample.
Let p(x) be tile poiynom;tli
16. Find AlX. What happens as k becomes large (i.e., k -+ oo)? II! exerCISes 17 alld 18, A is a 3X3 matrix with eigerlvectors I I I V, = 0 ' V2 = 1 ,am/v)::: I corresponding to eigellI o o vailles A, = - i, A2 ::: and A) = 1, respectivel)\ and
1,
p(x} = X' + a"_Ix"-1 + ... + a,x +
Tile companion matrix ofp(x) is Ille /I X" matrix - a" _1
- 11" - 2
I
0
0
I
0
0
0
0
C(p)-
2
x=
I
ao
-
(I ,
0
- "0 0 (4 )
0
• ••
I
0 0
26. Find the companion matrix of p(x) = x 2 - 7x + 12 and then find the characteristic polynomial of C( pl.
2
17. Find A 10x. 18. Find Akx. What happens as k becomes large (Le., k '"""" oo)? 19. (a) Show that, for any sq uare matri:x A, AT and A have the same characteristic polynomial and hence the same eigenvalues. (b) Give an example of a 2X2 matrix A fo r which AT and A have different eigenspaces. 20. Let A be a nilpotent matrix (that is, Am = a fo r some II! > I). Sho\'I' that A ::: 0 is the only eigenvalue of A. 21. letA bean idempotent matrix (that Is, A! = A).Showthat A = 0and A = I are the only possible eigenvalues of A. 22. If V is an eIgenvector of A with corresponding eigenvalue A and c IS a scalar, show Ihat v is an eigenvector of A - cI with co rrespondi ng eigenvalue A - c. 23. (a) Find the eIgenva lues and eigenspaces of A=
[~ ~]
27. Find the companio n ma trix of p(x) = xl + 3x2 4x + 12 and then find the characteristic polynomial
of C( pI. 28. (a) Show that the companion matrix C( p ) of p(x) ::: xl + ax + b has characteristic polynomial A2 + aA
+
b.
(b) Show that if A is an eIgenvalue of the companion
~
matrix C( p) in part (a), then [ ] is an eigenvector of C( p) corresponding to A. 29. (a) Show that the companion matrix C( p) of p(x) ::: Xl + ax 2 + bx + c has characteristic polynomial _( A' + aA 2 + bA + c). (b) Show that If Aisan eigenval ue of the compan ion
A' matrix C( I'} in part (a), then
A isan eigenvector I
of C( p) corresponding to A.
Section 4.3
30. Construct a tloili riangular 2x 2 matrix wit h eigenvalues 2 and 5. (Hint: Usc Exercise 28.)
33. Verify the Cayley- Hamilton Theorcm fo r A =
2, the companion matrix C( p) of p(x) "'" x" + tl.. I x "~ I + ... + a , x + ao has characteristic polynomial (- \ )"p (A). 1HlMt: Expand by (ofacton along the last colum n. You may find it helpfu l to introduce the polynomial q (x) = ( p(x) - '\I)/x.1 (b) Show that if A IS an eigenvalue of the compan ion matrix C( p } in equation (4), then an eigenvector corresponding to A is given by II
~
- I
0
1
- 2
I
0
powers mId inverses of flit/trices. I-o r example, if A is a 2X 2 matrix with ,/ramaer;st j, poiynomitll ' ,,( A) ::< A2 + aA + b, thenA 2 + aA + bl= O,so
+
A
~
I
~
a" _ I A~-1
+ . + a,A + aoI
Au imporlrlllt theorel/llll (Idwlllced li" eM alge/ml says 111(11 if c,.(.A) IS the ciltlftlc/eriS/1Cpolynomial of the matrix A. lilen CA( A) = (itl words, every matrix satisfies its characterislic equotioll). This IS the celebrated Cayley-Hamilloll r 1leorem, mll1leti after Arthur Cayley ( 182 1- 1895) Ilud SI( WjJ{iam Rowan Hamiltotl (sec page 2). Cayley proved this tllt.'orem ill 1858. HamiltOll dlscoveretl iI, illdepemle1lf/); ill IllS work all quaterniorlS, (I gellemlizat;oll of tile complex nllmbers.
- aA - bl
AI = AA2 = A( - (11\ - bf) = - aA 2
A~
~].
'n,e Cayley- Hamilton TI ,corem can be used to ca1cultJfe
and
I\A ) "'"
-
34. Verify the Cayley-Hamilton Theorcm for A :: I I 0
A 2::11:
If p{x) = x" + tl..~ I X "- 1 + ... + alx + ao and A is a square matrix, we alii define tl sqllflrc rnatflX P(A) by
[~
That is, find the characteristic polynomial c,,( A) of A and show that cA(A) = 0.
3 1. Const ruct a tlont riangular 3 X3 matrix with eigenvalues - 2, I,and 3. ( H int: Use ExerclSoo
'
A1x l, and, applying A agai n, we obtain A(Ax l) > A(Alx l ) = A1(Ax I) where the inequality is preserved, since A IS positive. (See Exercise 40.) But then y = ( 1/II Axllj)Ax l is a unit vector that satisfi es Ay > Aly, so there will be so me A. > AI such that Ay 2: A2y . This contradicts the fact tha t AI was the maxi mum val ue wit h th is property. Consequently, it must be the case that A X I = Alx l; thai is, AI 's an eigenvalue of A. Now A is positive and X I is positive, so A,x l = Ax, > O. This means that AI > 0 and XI> 0, which completes the proof of (a) a nd (b). To prove (c). suppose A is any other (real or complex ) eigenvalue of A with co rrespondlllg eigenvector z. Then Az == Az, and, taking absolute values, we have (4)
where the middle inequality fo llows [rom the Triangle Ineq ual ity. (See Exercise 40.) Since jzI > 0, the unit vector u III the d ireCtIon of Izi is also positive and sa tisfies Au :;> IAlu. By the maximality of AI from the first part of thiSproof, we must have IAI:$: A,. In fact, more is true. It turns out that A, is dominant, so IAI < A, for any eigenvalue A AI. It is also the case thai AI has algebraic, and hence geometric, mult iplici ty L We will not prove these facls. Perron's Theorem can be generalized from positIVe to certain nonnegative matrices. Frobeni us d id so in 191 2. The resuit requires a technical condition o n the malrlx. A S(luare matrix A is called reducible if, subject 10 some permutation of the rows and the same permutation of the columns. A can be written It1 I>lock form as
"*
where Band D arc square. Equivalently, A is reducible matrix Psuch that
,r there is some permutatio n
(See page 185.) For eX:lm ple, the mat rix 2 4
A=
I
6 I
0 2 2 0 0
0
I
3
I
5
5
7 0 0
3
0
2
I
7
2
is reducible, since jnterchangmg rows I and 3 and then col umns I and 3 produces 72 i l 30 • 2 i •••. 4 ...... 5 -----5 -_I.. _--_.+ o O •i 2 I 3 o Oj 6 2 I o O i l 72
,
332
Chapter 4 EIgenvalue; and Eigenvectors
(This is just PApT, where
p-
0 0
0
I
I
I
0
0
0
0
0
0 0 0 0
0 0 0 I
0 0 0 0
0
I
Check Ihis!) A square matrix A that is not reducible is called irreducible. If Al > 0 for some k, then A is called primitive. For example, every regular Markov chain has a primitive transition matrix, by definition. It IS not hard to show that every prtmitive matrix is irreducible. (Do you see why? Try showi ng the cont rapositive of this.)
Theora. 4.31
The Perron-Frobenius Theorem Let A be an irreducible nonnegative
nX n
matrix. Then A has a real eigenvalue Al
with the following properties: a. Al > 0 b. Al has a corresponding positive eigenvector. c. If A is any other eigenvalue of A, then .A !SO AI' If A is primitive, then this inequality is strict. d. If A is an eigenvalue of A such that A = AI' then A is a (complex) root o f the equa tion An - A ~ = O. c. Al has algebraic multiplicity I .
S« Matrix Alwlysis by R. A. Horn and C. R. Johnson (Cambridge,
England: Cambridge Uruve~ity Pre$$, 1985).
The interested reader can filld a proof of the Perron-Froheni us Theorem in many texts on nonnegative matrices or matrix analysis. The eigenvalue AI is often calted the Perron root of A, and a corresponding probability eigenvector (which is necessarily unique) is called the Perron eigenvector of A.
linear Recarrence Relations The Fibonacci numbers are the numbers in the sequence 0, 1, 1. 2, 3, 5, 8, 13, 21 , ... , where, after the fi rSI two terms, each new term is obtained by summing the two terms preceding it. If we denote the nth Fibonacci number by f.. then this sequence is completely defined by the equations fo = 0, It = 1, and. for n 2. 2,
This last equation is an example of a linea r recurrence relation. We will return to the Fibonacci numbers, but first we will consider linear recurrence relations somewhat more generally.
Section 4.6
Applicatio ns and t he Perron-Frobenius Theorem
an
I.eonardo of PiS where
'
n
X. [ X,, _ I
=
1
and
A =
[~ ~]
Since A has d istinct eigenva lues, it can be di3gon3lizcd . The rest of the de tails are left fo r ExeTCIse 51. (b ) \Ve will show that x" = CI An + Cl IIA" satisfies the recurrence relation x" = aX.. _1 + bx.._1 or, equivalentl y, (6)
if A2 - tlA - b = O. Since X .. _ 1
=
, ,,-I + " (1/
CIA
x,, - 2 =
and
-
CI A "- ~
+ C2(II - 2) A,,- 2
substitution into equa tion (6) yields
x" - aX,,_1 - bx.._2 = ( cI A ~ + " IIA") - (/(CIA"- I + ,,(II - I ) A,,- I) - b( cI A,,- 2 + ~ (II - 2) A,,- 2) (I( An
-
aA"- 1 - !JA"-I)
+
~(/l A " -
a( /1 - I ) A"- I
- b( n - 2) , ··- ' )
= ( IA"- 2(A2 - aA - IJ) + C211A,,- 2(A2 - aA - b) + ~ A" - 2(aA + 2b) = cI A,,-2(0) + " I1A,,- 2(0) + = c1A"- 1( aA
~A " - 2 (aA
+ 2b)
+ 2b)
=
=
But since A is a double root o f ,\2 - (IA - b 0, we m ust have Ql + 4b = 0 and A a/2, using the quad ratic (ormula. Consequently, aA + 2b = cr/2 + 2b = - 4b/ 2 + 21J = 0, so
SeCtio n 4.6
331
Apphcatio ns and the Perro n-Frobenius Theorem
Suppose the in itial conditions are XV = r and x, "" s. Then, in either (a) or (b ) there is a unique soluti on for and '1- (Sec Exercise 52. )
'I
Ixample 4.42
Solve Ihe recurrence relatio n XV = I. x, = 6, and
x~
= 6x.. _, - 9xn_l fo r n 2: 2.
The characteristic equation is,\2 - 6A + 9 = 0, which has A = 3 as a dou ble root. By Theorem 4.38(b), we must have X n == e13" + ':zu3" = ( e. + ~ 1I ) 3 ". Since I = XV = c1 tl nd 6 = X I = ( ' I + ez)3, we fi nd that '2:: I,SO
SOlllilon
x" = ( I + /1)3" The techniques outlmed in Theorem 4.38 can be extend ed to higher o rder recurrence relations. We slale, without proof, the general result.
Theorem 4.39
Let x" = a ," _ \x~_ 1 + a .. _2x~_2 + "'" + ~x" '" be a recurrence relatio n of order III that is sa tisfied by a sequence (XII) ' Suppose the (lssoci:lIed characteristic polyno mial
' " - a", _ I A, ,,,-I_ a",_1ft• ...-2_ ... _
•
A
factors as (A - A1)"" (A - A2)"'; '" (A - AA)"", where 1111 + Then x~ has the form X,, :::: (cll A ~ + c12 nA ~ + c13 u2A7 + ... + cl",n",,-I An + ...
m.,:'"'.~..,F'mL
::
111.
+ (Ckl Aj; + cu /lAi: + cul12AI' + ... + Ckm/,m" -IAl)
SYSlemS 01 linear D111erenllaIIQualions In calculus, you learn that if x = x ( t) is a diffe rentiable fu nction satisfyi ng a differential equation of the fo rm x' :::: h, where k is a constant, then the genenll solut ion is x = ee b , where C is a constant, If an initial cond ition x (O) = ~ is specifi ed, then, by substitut ing I = 0 in the general solution, we fi nd that C = ~. Hence, the uniq ue solution to the differential equation that s(ltisfi es the ini tial conditio n is
Suppose we have n differen tiable fun ctio ns of I-say, x" X:z, .. . I x,,- that sallsfy a system of differential equations
x; =
a l 1x 1
+
+ ... +
" l n X ..
xi =
(l2 I X .
+ (ln X 2 + ... +
(l2" X ..
{11 2Xi
We C(l 1l wflle this system in matrix for m as x'
x(I)
~
XI( t) x,( I) •
x,,( I)
X'( I)
~
x;( I) . I for both eigenvalues; 0 is called a "peller. In Example 4.50(b), 0 is called a saddle point beca use the o rigin attracts points in some directions and repels points in othe r directiOns. In this case, lAd < I and IAll > I. The next example shows what can happen when the eigenvalues of a real 2X2 matrix are complex (and hence conjugates of one another).
(Kample 4.51
Plot the trajectory beginning with "]-[' II
0lr ['~'O sme
0
- sinB] cos8
Va2 + il and 0 is the principal argument of II + bi.
where r = IAI =
Prool
a
The eigenvalues of A are
A = H2(1
± V 4(- il))
= H2 (1
±
2Wv'=t) =
by Exercise 35(b) in Section 4. 1. Figure 4.26 d isplays a
-b] = ,[ai' -bl']: [, bl r aI r II
•
lie.".
1m
u + bi
0
0]['~'8 r
smO
+
II
± Ibli =
(I
± /"
b" r, and 8. It follows that
-,'n8] cos O
Geometrically, Theorem 4.42 impli es that when A = [ ;:
- ~] '*
the linear transformat ion T(x ) "" Ax is the composition of a rota tion R
0, :EZ
[ c~sO 5mO
- SinO] through the angle 0 followed by a scaling 5 = [ , 0 ] with facto r r cosO 0 r ( Figure 4.27). In Exam ple 4.5 1(a), the cigenvalues arc A = 0.5 ± 0.5/ so r - IAI = b
V111 =
0.707 < I , and hence the trajecto ries all spiral inwards toward O. The next theorem shows that, ;n general, when a real 2 X2 matrix has complex
eigenvalues, it is similar to a matrix of the form [ : -iL--~~--L---+R'
"
Flglr.4.2&
-~]. Fo r a complex vecto r
Section 4.6 Applications and the Perron-Frobenius Theorem
351
y
*..
Scali ng ,
,
Ax = SRx
Rx ' .~ Rotation
,,
,
,
~::::.-------_x
fllluft 4.21 A rotation followed by a sc:ating
we define the real part, Re x, and the Imaginary part, 1m x, of x to be Re x -
-
Theorem 4.43
[a]b --
[R "]
1m x -
Rew
[,] ['mz] tmw d
-
Let A be a real2X2 matrix with a complex eigenvalue,\ = a - hi (where h =F- 0) and corresponding eigenvector x. Then the matrix P = [ He x Im.x J is invertible and
Proof
Let x = u
Au
+ Avi =
+
vi so that Re x = u and 1m x :: v. From Ax = Ax, we have
Ax = Ax
= ::
(a - bi)(u + vi) flU + avi - bul + bv
=
(au + bv) + (- bu + av)i
Equating real and Imaginary parts, we obtain Au = au + bv and
Av = - bu + av
NowP = [u j v},so
a [
P b
-b]a ~ [u l vJ [ab -!]
=
[au + bv I - bu + av J = [Au ! Av] = A[ u I vJ
= AP
To show that P is invertible, It is enough to show that u and v are linearly independent. If u and v were not linearly independent, then it would fo llow that v = ku for some (nonzero complex) scalar k, because neit her u nor v is O. Thus x = u + vi = u + leui = (I + ki) u Now, because A is real, Ax = Ax implies that
- = Ax=..\x=Ai Ax=Ax
so X But
::: U -
VI
is an eigenvector corresponding to the other eigenvalue A = a x = (I
+ ki)u
=
(I - ki) u
+ bi.
Eigenvalues and Eigenvectors
because u is a real vector. Hence, the eigenvectors x and x of A aTe bo th nonzero multi ples of u and therefore are m ultiples o f one anothe r. Th iS IS impossible because eigenvectors corresponding to d istinct eigenvalues must be linearly independent by Theorem 4.20. (This theorem is valid over the complex num bers as well as the real numbers.) This con trad iction implies that u and v are linearly independent and hence Pis invertible. It now follows that
-b]p-' " Theorem 4.43 serves to explain
A == [ 0.2 0.6
-1.2 ] 1.4
Example 4.51 (b ). The eigenvalues o f
are 0.8 ± 0.6;. Por A == 0.8 - 0.6;, a corresponding eigenvector is
From Theorem 4.43, It fo llows that for P == have
A =
pcr l
For the given dynamICal system Xl ==
PYJ;
Xh I
[
- I
- I] and
°
C ==
[0.8 0.6
-0.0.86 ]
, we
and P- 1AP = C
== Ax " we p erform a change of van able. Let
(or, equ ivalently, Yt == p-I X. )
Then
so Yt-t l == Xk+l = Ax. == P- 1APYk = CYIo
Now C has the s.1 me eigenvaJ ues as A ( Why?) and 10.8 ± 0.6;1 == 1. Thus the dynamical system Yk-t l == Cy. simply rotates the points in every trajecto ry in a circle abo ut the origin by Theorem 4.42. To determine a trajectory of lhe dynamical sys tem in Example 4.5 1(b ), we itera tively apply the linear transformation T(x) = Ax == PC p- Ix. The transformation c:m be though t o f as the composition of a change o f vamble (x to y), followed by the ro tation determined by C, followed by the reve rse change of variable (y back to x). \Ve wi ll encounter this idea again in the application to g raphing quadratic equations in Sectio n 5.5 and, more generally, as "ch ange of baSIS" in Section 6.3. In Exercise 96 of Section 5.5, yo u will show that the trajectory in Exam ple 4.5 1(b ) is indeed an ellipse, as it appears to be from Figure 4.25(b). To summarize then: If a real 2X2 m at rix A has com plex eigenvalues A = a ± hi, th en the trajectories o f the dynamical system Xk '.1 = AXk spiral inward if IAI < 1 (0 is a spiral attractor), sp iral outward if IAI > I (0 IS a spiral repeller), and lie on a closed orbit if IAI = 1 (O is an orbital cwter).
I g Sports Tea
In any co mpetitive spo rts league, it is not necessarily a straightfo rward p rocess \0 rank the players or teams. Counting wins and losses alone overlooks the possibility that one team may accumulate a large n um ber of victories against weak teams, wh ile another learn may have fewer victories but all of them agai nst st ro ng teams. Which of these teams is better? How should we co mpare two teams that never play o ne another? Should poin ts scored be taken into account? Points against? Desp ite these complexi ties, Ihe ranking of at hletes and sports teams has become a commonplace and much-anticipated feature in the media. For example, there are various an nual rankings of U.S. college football and basketball teams, and golfers and tennis players are also ranked internationally. There are many copyrighted schemes used to p roduce such rankings, but we can gai n some insight into how to app roach the problem by using the ideas from this chapter. To establish the basic idea, let's revisit Example 3.68. Five tennis players play o nc another in a round-robin tournament. Wins and losses arc recorded in the form of a digraph in which a directed ed ge from ito jindICates that player i defcats player j. The correspo nding adjacency matrix A therefore h.ls a,j = 1 [fplayer i defeats player j and has n ij = 0 otherwise. I
5
2 A~
4
0
I
0 I
0 0
0 0
0 0
0 I
0 0 I
I
I
I
I
I
0
0 I
0
0
3
f,
We would like to associate a ranking r, with player I in such a way that r, > mdicates that player i is ranked more highly than p layer j. Fo r this p urpose, let's require that the r;'s be p robab ilities (that is. O:s r, < I fo r all i, and r l + ' 2 + ') + '4 + '5 = I) and then orgamze the ran kings in a rallklllg vecto,
• •
Furthermore, Jet's insist that player j's ranking should be proportional to the sum of the rankings o f the players defeated by player i. For exa mple, player I defeated players
353
2, 4, and 5, so we want ' t = a(rl
+ '4 + '$)
where a is the constant of proportionality. Writing out si milar equations for the other players produces the following system:
' 1 = cr(rl + r~ + ' s) '1 TJ
+ '. + 's) = o(rl + (4) = a(r,
Observe tha t we can \\' ri te this system 1Il 1llatrix form as
'. " "
'.
"
~
0
I
0 I 0 0
0 0 0 0
0 I 0 0 I
I
I
I
I
I
0 I
0
0 0
'. " "
0'
r = (fAr
'.
"
I
Equivalently, we see that the ranking vector r must satisfy Ar = - r. ln other words, r is an eigenvector corresponding 10 the matrix At
a
Furthermore, A is a primitive nonnegative matrix, so the Pcrron-Frobemll.s Theorem guarantees that there is a IIniqlll! ranking vector r . In this example. the ranking vector turns out to be 0.29 0.27 ,
~
0.22 0.08 0. 14
so \ \ 'C would rank the players in the o rder 1,2,3,5,4. By modifying the matrix A, it is possible to take into accoun t many of the corn-
plexlt il-s mentioned In the opening paragraph. However, th is sJln ple example has served to indicate olle useful approach to the problem of ranking teams. The same idea can be lIscd to understand how an In lernet search engine such as Google works. Older sea Tch engi nes used to relurn the resuhs of a search wlOrderetl. Useful sites would often be buried among irrelevant ones. Much scrolling was oflen needed to uncover whal you were looking for. By con trast, Google returns sea rch results ordered according 10 thei r likely relevance. Thus, a method for ranking websites is needed. Instead of tea ms playing one another, we now have websites hnking to one anot her. We can once again use a digraph to model the situation, only nOI" an edge from ito j indicates that I"ebslle i Imks to (or refers to) website j. So whereas for the sports team digraph, incommg dIrected edges are b.,d (they indicate losses), for the Inlernel d igraph. incoming directed edges are good (they indicate links from other sites). In 354
this setti ng, we wan t the: ranking of websi te i to be proportional to sum of the: rankings of all the websites that link to i. Using the digraph on page 353 10 represent jusl five websi les, we have r~
"" a(r, + r2
+
rl)
fo r example. II is easy to see Ihat we now want to use the trallsposc of the adjacency I matrix o f the d igraph. Therefo re, the ranking vecto r r mus t satisfy ATr "" - r and will thus be the Pe:rron eigenvector of AT. In this example, we obtain 0
I
0
0
0. 14
0
0
0
0
0 I
0.08
0
0 I
I
I
I
0
0
0.27
I
I
0
I
0
0.29
0 I
AT =
'nd
,
~
a
0.22
so a search tha t turns up these fi ve sites would list them in the o rder 5, 4, 3, 1,2. Google actually uses a va riant o f the m ethod desc ribed here and computes the ran king vector via an iterative method very similar to the power m ethod (Section 4.5).
355
356
Chapt('r 4
Eigen".;llut'S and EigeTl\'ectOrs
17. If all of the sur".ival rates s, are nonzero, let
l. [~~] 3.
5.
U~]
4.
0.1
0
0.5
0. 5
I
0
OA 0 0.5
p =
• !
0
I
0
0
0
I
0
•
0.5
I
0
6. 0.5
0
I
0
0
0
0
0
0
"0
0 5152
0 0
o
0
o
0
2. [~ i] j
0
I
Wlriell of the stoc/UI$tie /1lcHricl:s ill Bxercises 1-6 (lfe re~ul(lr?
. .. • ••
Compute p - I LPand usc it to fi nd the characterist ic polynomial of L. ( Him: Refer 10 Exercise 32 in Section 4.3.) 18. Verify that an I."rgenvC(lor of L correspond ing to A, is I
III Exerci$c$ 7-9, P is the trmlS/tlOli maf";x of(I reglilM Markov cll"ill. Filld the /o/lg rallge lrilllsitiofllll(l/rlX L of P. 8. P ""
7. P ""
sdA,
., .
1
1
1
StsJ A i
1
!
1
SI~$J A t
2
1
J
o ! ! 0.2 0.3 0.4 9. P = 0.6 0.1 0.4 0.2 0.6 0.2 10. Prove that the steady state probabili ty vector o f a regular Markov chain is unique. ( Hint: Use Theorem 4.33 or Theorem 4.34. )
P8,1I111I •• Gr.wlll In Exerci$es //- /4, calm/ale the positive elgellvalue tII,(1 a correspolUlmg lwsiuve eigellvector of ti,e Les/re matrix L.
11. [00.5 0']
12. [I05 01.5] L =
t =
074 13. L =
05
0
0
14. L =
15
3
~
0
0
00.50 o i o 15. If a Leslie matrix has a unique positive eigenvalue A,. \"hat IS the significance fo r the populalion if A, > I? A1 < I?A 1 = 1? 16. Verify that the charaCieristic polynomIal of the Leslie matrix L in equation ( 3) is
CtP.) = (-I )"(A~ - b,A~ - 1 - b2$I A ~-z - bJ$,~ A "-J - ... - b,.s t~" 5,._,) (/'Ji"t: Usc mathematical induction and expand dc t( L - AI) along the last column. )
( Hin!: Combine Exercise 17 above with Exercise 32 in Sect Ion 4.3 and Exercise 46 in Sect Ion 4.4). ~
In Exercises 19-21, complltc the steady stale growth rare of Ihe popu/arioll with the I.e$/Ie nwtrix L from tire given exercise. Then use Exercise 18 10 help find the corresponding di$triburion of tile age classe$. 19. Exercise 19 in Section 3.7 20. Exercise 20 in Section 3.7
21. Exercise 24 in Section 3.7
_22.
Many speci~$ of seal have suff~red from commercial hunting. They have ~I."n killed for their skin , blubber, and meat. The fu r trade, in particular, reduced some seal populations to th~ point of extinctIon. Today, the greatest threats to seal populations arc dC(li ne of fish stocks due to overfi shi n g, poll ution, distu rbance o f habitat, entanglement in marine debris, and culling by fishery owners. Some seals have been declared endangered species; other species arc carefully managed. l able 4.7 gives the birth and survival rates for the northern fur seal, divided into l-year age d asses. [The data arc based o n A. E. York and J. R. Ha rtley, ~ Pup Production Fol lowing Ilarvest of Female Northern Fur Seals," Camu/iall JOII,,/III of FIsheries and Aquatic Science, 38 ( 1981). pp. 84- 90.]
Secllon 4.6
ApplICations and the Perron-Frobenius Theorem
35J
(b) Show tha t r = I if and on ly if At = I. (Th is represen ts zero populatioll growth.) I Hint: Lei
Show that A is an eigenvalue of L if and only if
g(A} =
I.J
(c) Ass uming that there is a unique positive eigenvalue A" show that r < I if and o nly if the population is dec reasi ng and r > I if a nd onl y if the population IS increasing.
A sustamable harvesting policy is a proce(llIre that al/ows a
certaill fract ion of a population (represemed by a papulatioll distnbution vector x ) 10 be Iwrvested so fllat the population retllrns to x after one time imerval (where a time IIItervtll is the length of aile: age class). If II is the fTC/ ction of each age class that is harvested, then we can express (ile harvesting procedure mathema tically as fo llows: if we start willt a poPlllation vector x, after aile time interval we have Lx ; hllrve5tmg rell/oves hLx, leaving
Table 4.1 Age (years)
Birth Rate
Survival Rate
-2
0.00 0.02
0.91
0.88
0.70 1.53 1.67 1.65 1.56 1.45 1.22 0.91 0.70 0.22 0.00
0.85 0.80 0.74 0.67 0.59 0.49 0.38 0.27 0.17 0.1 5 0.00
2-4
4-6 6-8 8--10
10-12 12- 14 14- 16 16- 18 18-20
20-22 22- 24 24-26
Lx - ilL" ==
Slistainability requires that
-
CMl
(a) Construct the Leslie matrix L for these data and compute the positive eigenvalue a nd a corresponding positIVe eigenvector. (b) In the long run. what percentage of seals will be in each age class and what will the growth rate be?
Exercise 23 shows that the lang-run behavior ofa population can be determined directly from the entries of its Leslie /nrHrix. 23. The net reproductIon rate of a population is d efined as
(1 - h)Lx.
24. If .-\1is the unique positive eigenvalue of a Leslie ma tn x Land h is the sustainable harves t ratio, prove tha t II == 1 - I / AI. 25. (a) Find the s usta inable harvest ratio fo r the wood land caribou in Exercise 24 in Section 3.7. (b ) Usi ng the data in Exercise 24 in Section 3.7, red uce the caribou herd according to your answer to part (a). Verify that the population re tu rns to its original level after o ne time interval. 26. Find the sus taina ble harvest ratio for the seal in Exercise 22. (ConservatioOlsts have had to ha rvest seal populations when overfishing has reduced the available food su pply to the point whe re the seals are in da nger o f starva tion.)
.. .. 7 27. Le t L be a Leslie ma trix: with a unique positive eigen val ue At. Show that if A is any othe r (real or complex) eigenvalue of L, then IAI
2
3X,,_2 for /I i?:: 2
=
4Yn_1 - 4Y,,_2 for 1/ i?:: 3
= 4, " I = I, a" =
a,,_1 - a,,_z/4 for II i?:: 2
bo = 0, bl
= I, b" = 2b n _ 1
+ 2b,,_2 for"
2:
2
50. The recu rrence relation in Exercise 43. Show that your solut ion agrees with the answer to Exercise 43, 5 1. Complete the proof of Theorem 4.38(a ) by showing that jf the recurrence relation x" = ax"_ 1 + bX,,_2has distlilct eigenvalues Al '\2' then the solution will be
'*
of the form
( Hilll: Show that the method of Example 4.40 wo rks in general.)
52. Show that for any choice of mltial conditio ns Xu = r and x, = S, the scalars c i and C:! can be fo und, as stated jn Theorem 4,38(a) and (b).
Section 4.6
Applications and the Perron-Frobenius Theorem
T he area of the square is 64 square u mls, but the rectangle's area is 65 square u nits! Where did the extra square coille fro m? (Him: What does this have to do wi th the Fibonacci sequence?)
53. The Fibonacci recurrence f" = /,,- 1+ /,,-2 has the associated matrix equation x ~ = Ax n _ p where
(a) With fo = 0 and f.. = I, use mathematical ind uction to prove that
A"
~ [f"+' f.
f,
/~- I
359
54. You have a supply of three kmds of tiles: two ki nds of 1 X2 tiles and one kind of I X 1 tile, as shown in Figure 4.29.
1 figure 4.29
for ,11111
y(O) x1 (0)
= I ===
60. '" ::;: YI - Y2, Yl::;: YI + )'l '
y-
y'=x+ z'=x+y,
62. x'::;: x + y' = x - 2y
z' = 3x+
x(0) -
z,
:], b- [=:~]. X(O) -[~~] -1
1
y(O) - o ,(0) - - 1
+ z,
to tire two populations for the given A andb lind Illitial con di tIOns x(O). (Flfst show t/ral there are cOll stants (/ and b such that rlre substitutions x = u + a ami y = v + b convert the system i llto an equivaiem aile with 110 CO/lSIan/ terms.)
66. A ::;: [ -1
Y2(0) == I z,
" "" [;] and b is a constant vector. Determine wlwt happens
I
YI(O) == I ~
Exercises 65 and 66, species X preys 011 species Y. The sizes of tile populations /Ire represented by x = x( t) a"d y = yet) . "f/ie growtlr rate of etlch population is govemed by tire system of dijferelllia/ equ(rtiolls ,,' ::;:; Ax + b, where III
65_ A =
X2. X2(0) "" 0
= XI -
61. x'
x(0) - 1
O.4x - 0.2y
Determine what happens to these two populations.
C.V.
Sut,.s olll.,.r DIII,rllll.1 Eallllols
57. x' y'
= -O.8x + OAy
y' -
. x,
hm k = A,
1.5y
64. Two species, X and Y, live in a symbIOtic relationship. That is. neither species can survive on its own and each depends o n the other for its survival. Imtially, there are 15 of X and 10 ofY. lf x = x(t) and y = yet) arc the sizes of the populations at time t months. the growth rates of the two populatio ns are given by the system x'
Moo
+
(a) Determine what happens to these two populations by solving the system of differential equations. (b) Explore the effect of changing the in itial popu lations by letting x(0) = a and ){O) = b. Describe what happens 10 the popuJations in terms of a and b.
figure 4.31
I +VS
I.2x - 0.2y
x(0) - 2 y(0) - 3 z(O) - 4
63. A scientist places two strains of bacteria, X and Y, in a petri dish. Initially, there arc 400 of X and 500 ofY.
67. Let x = xU) be a twice-differen tiable function and consider the second order differenllnl equatioll x"+ax'+ bx=O
(II)
(a) Show that the change of variables y = x' and z ::;: x allows equation (i l ) to be written as a system of two linear differential equations .n yand z. (b) Show that the characteristic equation of the system in pari (a) is ,\2 + aA + b::;: O.
Chapler Review
68. Show that there is a change of va riables that converts the 11th order differcntitll equation } n)
+
a~_I}"- I )
+ ... + (llx' +
('0 == 0
into a system of nlinca r differential equations whose coefficient ma trix IS th e companion mat rix C{ p) of the polynomial peA) = An + a.. _IA,,-1 + .. . + al A + flo. IThe notation x fl l denotes the kth deriva tive of x. See Exercises 26-32 in Section 4.3 for the deflnltlon of a companio n matrix.] III Exercises 69 alld 70, use Exercise 67 to fifU/ the general
solution of the give" eql/{ltiofl. 69. x" - Sx' + 6x = 0 70. x" + 4x' +3x=O
L.5
-,
79. A = [
0.'
80. A = [ 0.5
0.2 81. A = [ - 0.2
0.4] 0.8
82. A =[ O
1.2
3&1
0.9]
0.5
-1.5 ] 3.6
1/1 Exercises 83-86, the givell lIlatrix is of the form
a
A = [b
-b]a ' I" each case,
A call be factored liS the
product of a scaling matrix alld a rotatiot! matrix. Fmd the scaling/actor r and the allgle 0 of rotatiOIl. Sketch the first fo ur PO;fHS of the trajectory for the dYllamical system Xu r "" Axt
with Xo = [ : ] "nd classify tlle origm (IS a spIral
III Exercises 71-74, wIve tile system of differential equations m the given exerCIse using TI/Core", 4 41.
atrractor, spiral repeller, or orVital center.
71. Exercise 57
72. Exercise 58
83. A :: [:
73. Exercise 61
74. Exercise 62
84. A =
-: ]
v'3,3]
Dlscr •• e Linear DJnamlCal SVS ....II
[- °05 0.5] 0
_ [-v'3/2 - '/2 ] 86. A '/2 -v'3/2
III Exercises 75-82, cOllsider the dynamical System In ExerCIses 87-90, find all illvertible matrix P and a
" h i = AXt ·
(a) Compute ami plot Xo, " I '
X 2, X J
(b) Compute aud /llot Xo-> X I> x2, X l for Xo =
[' 3']
-!]
sl/cli (lwI A = PCp - I.
Sketch the first six points of t/ie traJlxtory for tire dynamical
[~].
system Xk+ I =
(c) Usiug eigenvailles alld eigellvectors, classify the origin as (HI anTactor, repeller, s(I(ldle poillt, or Hone of these. (tl) Sketch several typicnl trajectories of the system. 75. A=0
[~
matrix C of the form C =
for Xo = [:].
[0.5 -0.5] 0.5
76.A =O
- 4 78.A = [ I
Ax~ with "tl =
a spirol (ltlraC/or, spiral repeller, or orbital cellter.
1 22 0]
87.A == [ 0.1 - 002 ] 0.1 0.3
88.A = [
89. A=[: -~ ]
90.A== [~~]
..
~-
R
[:] (llIti cltlssify the origill (IS
~
- ,"
.....
-
-
',~,
.
Ker Dellnlllons and adjoint of a matrix, 275 algebraiC multiplicity of an eigenvalue, 291 cha racterist ic equation, 289 cha ractenstic polynomial, 289 cofactor expansion, 265 Cramer's Rule, 273-274 determinant, 262-264
diago nali7..able matrix, 300 eigenvalue, 253 eIgenvector, 253 eigenspace, 255 Fundamental Theorem of Invertible Matrices, 293 geomet ric multiplicity of an eigenvalue, 291
Gerschgorin's Disk Theorem, 318 Laplace Expansion Theorem, 265 power method (and its variants), 308- 316 properties of determmants, 268-273 similar matrices, 298
362
Chapter 4
Eigenvalues and Eigenvectors
Review Questions I. Mark each of the following statements true or false:
(a) For all square matrices A, d et( - A) = - de t A. (b ) If A and B are 11 x II matrices, then det(AB) = d et ( BA). (c) If A and B are nXn matrices whose columns are the same but III different o rders, then det B = - det A. (d) If A is invertible, then det(A- I) = d et AT. (e) If 0 is the only eIgenvalue of a square matrix A, then A is the zero matrix. (0 Two eigenvecto rs co rresponding to the same eigenvalue must be linearly dependent. (g) If an n X PI matrix has n distinct eigenval ues, then it m ust be diagonalizable. (h ) If an "X II matrix is diagonalizable, then it must have 11 distinct eigenvalues. (i) Similar matrices have the sam e eigenvectors. (j) If A and B are two 11 X n matrices with the sam e red uced row echelon form, then A is SImilar to B.
2. LetA =
I )
5
3 7
7
5 9
11
(a) Compu te det A by cofactor expansion alo ng any
row or column. (b) Comp ute de t A by fi rst reducing A to triangula r form.
3d 2, - 4f f " b , 3. If d e f = 3, find 3(1 2b - 4c c gir l 3g 211 - 4i j 4. Let A and B be 4 X4 mat rices with d et A = 2 and de t 8 = - i. Find d et C for the indicated mat rix C: (a) C = (A8) - '
(b) C= A'B(3A')
5. If A is a skew-symmetric /I X PI matri x and
n is odd,
prove that det A = O.
6. Fi nd all values of k for wh ich
1
- 1
I 2
I 4
2
k = O. Ii'
111 Questlolls 7 and 8, sholll that x is all eigcllvector of A and
filld the corrcspolldillg eigellvalile.
7. X = [aA =[~ 8. x =
3 - I ,A =
2
10
- 6
3
3
4
- 3
o
0 -2
9. Let A =
(a) Find the characteristic polyno mial of A. (b) Find all of the eigenvalues of A.
(e) Find a basis for each of the eigenspaces of A. (d) Determine whethe r A is diagonalizable. If A is not diagona lizable, explain why not. If A is d iagonallzable, find an invert ible matrix P and a d iagonal matrix Dsuch tha t P- IA P "" D.
10. If A is a 3X3 d iago nalizable m at rix wit h eigenvalues - 2,3, and 4, fi nd det A. 11. If A is a 2X2 mat rix with eigenvalues Al =
and correspo nding eigenvectors V I
find A-S[ ~].
t,Al ""
- ],
=[:].v,=[_:].
12. If A is a diago nalizable matm and all of its eigenval ues satisfy ~ A~ < I, prove that A" approaches the- zero matrix as n gets large.
111 Questions 1~ 1 5, determine, with reasons, whether A IS similar to B. If A - B, give an invertible matrix P such tlrat p- 1AP=B. 13. A = 14. A =
15. A =
[~ ~]. B =[~ ~] [~ ~].B= [~ ~] 1
1
0
1
I
0
0
I
1
, B= 0
1
0
0
0
I
0
0
1
16. Let A =
[~ ~]. Find all values of k fo r which:
(a) A has eigenvalues 3 and - \. (b) A has an eigenvalue with algebraic multiplicity 2. (e) A has no feal eigenval ues.
17. If A3 = A, what are the possible eigenvalues of A? 18. If a square matrix A has two equal rows, why must A
have 0 as one of its eigenvalues?
~l \3 - 5
- 5
19. If x is an eigenvector o f A with eigenvalue A = 3, show
- 60
- 45
18 - 40
15
-32
that x IS also an eigenvector of A l the correspond ing eige nvalue?
-
SA
+
21. What is
20. If A is similar to B with P- IAP = B and x is an eigenvector of A, show that p - IX is an eigenvecto r of B.
rlho
.. . that sprightly Scot of S(Q/S, Dol/glaJ, /IUlt rims a-horseback up 1/ 11;1/ perpmdiClilar- William Shakespeare Hrury IV; Pari I Act II, $etne JV
Ii
5.0 Introduction: Shadows on a In this chapter, we will ('xtend the notion of orthogonal projection thaI we encountered fi rst in Chapter I and then again It1 Chapter 3. Unti l now, we have d iscussed only p roje.:tio n onto a si ngle vector (or, equivalently, the o ne-d imensional subspace span ned by that vector), In this section, \\'e will see ,f we can find Ihe analogous formulas for projection on to a p lane in [R3. Figure 5.1 shows what havpens, fo r example, when parallel light rays create a shadow on a wall. A similar process occurs when a three-di mensional object is displayed on a two-dimension'll screen, such as a computer mon itor. Later in this chapter, we will consider these ideas in full generalit y. To begm, let's take another look a t what we already know about project ions. In SWion 3.6, we showed that, 111 R2 , the standard matrix of a projwion o nlo the line thro ugh thc origin with direction vector d = [",] ;,
",
dldl/( d{ + df)] (Iil(d{ + dD Hence, the projection of the vector v onto thiS rille is just Pv.
Problem 1 Show that Pcan be written in the equivalent fo rm
flgur. 5.1 Shadows on II wall are project1ons
cos' O [ cos O sin 6
COS (;I SinO ] Sll1
2
0
(What does 0 represent here?) Prolll,.2 Sho," that l)can also be written in the fo rm p :::: uu T, where u IS a unit vector in the d irection of d. [ 3] Problem 3 Using Problem 2, find P and then find the proje
The orem 5.1
•
If Q is an orth ogo nal matrix, then its rows form
,,"1
From The orem 5.5, we know that Q- I ;; QT. Therefo re,
(Q') , = (Q-')"
=
Q = (Q')'
so Qr is an orth ogo nal mat rix. Thus, the colu mns of QT_w hich are just the rows of Q-f orm an orth ono rma l SCI. The fi nal theo rem in th is sect ion lists som e o ther prop erties of ort hogonal mat rices.
Theorem 5.8
Let
0 be an orthogonal mal ri.'(,
a. Q- I is orth ogo nal.
b. det O =:t l c. If A is an eige nva lue of Q, then ,A = I. d. If QI and O2 arc orth ogo nal ,. X" matrices. then so is
OIQ,. •
"11 1 We will prove properl y (e) and leave the proofs of the rema inin g pro perties as exercises. (c) LeI A be an eige nval ue of 0 with corr esp ond ing eigenv~c lor v. Then
and , uSlh g The orem 5.6(b), we have
Ilvll
Since ; vl
'*
I Qvl1 0, this implies that IAI == I. =
I Avi
=
=
Qv "" Av,
IAlll vl1
[0 -0\]
.,••,.
Property (e) hold s even for com plex eige nvalues. The mat rix 1 is orth ogona l with eigenva lues i and - i, both of whi ch have absolu te valu e I.
Exercises 5.1 /11 Exercises /-6, de/erm ine widell sets of veClQ r$ {Ire ortl wgo nal. - 3 2 \ - \ 4 2 I. \ • 4 • - \ 2. 2 \ 2
3.
•
2
\
3
- \
\ - \
•
2 2 • - 2 \
•
-5
2
4
5 4.
0 \
3 • -2 • \
5.
\
2 3 \
- \
6.
2 3 -\
•
- 2
- 4
\
-6
- \
•
2 7
4
0
\
0
\
- \
0
- \
\
0
- \
\
\
\
\
\
0
2
•
Chapter 5
314
Orthogonality
In Exercises 7- /0, show t}u/f the given vectors fOrtll W I orthogonal basis for (R2 or (RJ. Theil usc 'nleorem 5.2 to express w as a linear combination of these /msis vectors. Give the coordil1ate vector [wJ8of w with respect to the basis 6 = {VI' VI } ofR l or6 = {VI' V 2' Vl } of R l •
7.v,~ [_;l v,~ [:lw ~ [-:l
8.
0
I
2
=
, V2
=
I
;w =
I
I
I
I
I
I
I
I
I
;w =
- 2
0
I
2 3
In Exercises // - 15, determille w/letller the givell orthogonal set of vectors is orthonormal. If it is IIot, normalize the vectors to fo ml all orthonormal set. II.
[l]. [-!] ,!,
j
13.
o
1/2 1/ 2 , - 1/2 1/ 2
[l]. [-ll •,,
2 -~
o
,
I
,• . , ' l
15.
12
14.
,
_1
j
•
,• • 1 ,
26. If Q IS an orthogonal matriX, prove that any matrix obtained by rearranging the rows of Q is also orthogonal. 27. Let Q be an orthogonal 2X2 mat ri x and let x and y be vecto rs in RI . If 0 is the angle between x and y, prove that th e angle between Qx and Qy IS also 8. (This proves that the linear transformations defi ned by o rt hogonal m atrices arc Grlgle-presavillg in Gt 2, a fact that is true in generaL) 28. (a) Prove tha t an orthogonal2X2 matrrx m ust ha ....e the form
,, _1 •, • -,,
where [ : ] is a unit vector.
j o o v'3/ 2 o V6/3 - v'3/6 1/ V6 ' v'3/6 ' 1/0. 1/ 0. - 1/V6 - v'3/6
(b) Using part (a), show that every o rthogonal 2 x 2 matrix is of the fo rm COS [
orthogonal. If It is, find its inverse.
oil ,,
19.
cosOsin 8 COSI O
- cos ()
. '0 - sm-
sin 0
-cos 8 si n 0
Sin 0
0
cosO
,,
20.
,,
1 , , , , • _., 0 1 1
18.
[ 1/ 0. 17. - 1/ 0.
,1 _1 ,, ,
,, 1, •, ,1, j _ 1, , _.1, •, •, , j
0
sin 0
cos O
-SinO ]
[ si n 0
cos 0
0]
,in - cas 8
where 0 < 0 < 27t_ (e) Show that every orthogo nal2X2 matrix corre· sponds to either a rotatio n or a reflec tion in R2. (d ) Show that an orthogonal 2X2 matrix Qcorresponds to a rotatio n in [R2 if det Q = 1 and a reflection in IR:l if de t Q = - I.
III Exercises 16-21, determine whether rlzegiven matrix is
16. [ 0I
2/ 3 1/0. 1/V6 - 2/3 1/0. - 1/ Y6 0 1/3 1/ 0.
25, Prove that every permutation matrix is orthogonal.
- I
- I , Vj =
0
1/ V6
24. Prove Theorem 5.8{d ).
- I I , v2 =
0
0
23. Prove Theorem S.8{b).
I , vJ
21.
0
0
22_ Prove Theorem 5.8(a ).
v,~ [;]. v,~ [-~ l w ~ [:l I
I
1/ 0.] 1/ 0.
III Exercises 29-32, lise Exercise 28 to determine whetltcr the given ortllOgoll(// matrix represents (/ rotalion or a refleetioll. If il is a rotatlOl/, give the angle of ro/at ioll; ifit is (/ reflee· lion, give the lillc of reflectjoll 29.
1/ 0. - 1/ 0.] [ 1/ 0. 1/ 0.
- 1/ 2 V3/2 ] 31. [ v'3/ 2 1/2
[ - 1/ 2 v'3/ 2] 30. _ v'3/ 2 - 1/ 2
-,
32.
[-i
Section 5.2
33, Let A and B be /IX /I orthogonal matrices. (a) Prove that A(AT + 81)8 == A + 8. (b) Use part (a) to prove th at, if det A + del B = 0, then A + B is not invertible. 34. Let x be a unit vcctor in ROO. Partition x as
x,
......
x,
x-
-[~ l
115
Orthogonal Complements and Orthogonal Projections
wi th a prescribed fi rst vector x, a construction that is frequently useful in applications.) 35. Prove that if an upper triangular matrix is orthogonal, then it must be a diagonal matrix. 36. Prove that if 1/> III, then there IS no mX'1 matrix A such that IAxI == Ixl for all x in n-. 37. Let 8 = {VI' . .. , v~} bean orthonormal basis for R~. (a) Prove that, for any x and y in R~,
x"
s'y = (X 'YI)(Y ' V1 )
+
(x'Y!)(Y'V2)
Let
+ ...
+ Ix· v,,)ly· v,,) :!I.i..............! ~............ _
, ( I )yyT
y j 1:
I - XI
(This identity is called Parseval's ldetllily.) (b) What does Parseval's Identity imply about the relationshIp betw~ n the dot products X· yand
Prove that Q IS orthogonal. (This proced ure gives a qUIck method for finding an orthonormal basis for R~
Ix18 ' [Yl.6?
OrthogOnal Complements and Orthogonal proJections In this sect ion, we generali7.e two concepts that we encountered in Chapter I. The no· tion of a normal vector to a plane Will be extended to orthogonal complements, and the projection of one vector onto another will give nse to the concept of orthogonal projecllon onto a subspace.
IV! is pronounced ~ IV perp.~
Orthogonal Complellen\s A normal vecto r n to 11 plane IS orthogonal 10 every veclor in that plane. If the plane passes through the origin, then it is a subspace 'V of (Rl , as is span (n ). Hence, we have two subspaccs of Rl with the property that every vector of one is orthogonal to every vector of the other. This is the idea behind the foll owing definition.
,
•
IV
!k
Let \Vbe a subspace of R-. We say that a vector v in R is orthogonal to W If Y is ort hogonal to ewry vector in \V. The set of all vectors that are orthogonal to W is called the orthogonal complement 0/ \V, denoted \V"'. That is,
e
WI == { vi nR ~ : v'w = 0
flglrl5 .5 \VI and IV '"
s
Definition
,.. e'"
,
fora Jlwi n IV }
,
e1
EKlmple 5.8
e
If W is a plane through the origin in R l and is the line through the ongin perpen· dicular to W ( i.e., paral lcl to the normal vector 10 \V), then every \'ector von is orthogonal to every vector w in W; hence, = W .l.. Moreover, \V consislS precise/y of those vectors w that are orthogonal to every v on C; hence, \,>,e also have \-V = f J. . Figure 5.5 Illustrates this situation.
e
e
In Example 5.8, the orthogonal complement of a subspace turned out to be another subspace. Also, the complement of the complement of a subspace was the original subspace. These properties are true in general and are proved as properties (a) and (b) of Theorem 5.9. Properties (c) and (d) will also be useful. (Recall that the intersection A ∩ B of sets A and B consists of their common elements. See Appendix A.)

Theorem 5.9
Let W be a subspace of R^n.
a. W⊥ is a subspace of R^n.
b. (W⊥)⊥ = W
c. W ∩ W⊥ = {0}
d. If W = span(w1, ..., wk), then v is in W⊥ if and only if v · wi = 0 for all i = 1, ..., k.

Proof  (a) Since 0 · w = 0 for all w in W, 0 is in W⊥. Let u and v be in W⊥ and let c be a scalar. Then

u · w = v · w = 0  for all w in W

Therefore, (u + v) · w = u · w + v · w = 0 + 0 = 0, so u + v is in W⊥. We also have

(cu) · w = c(u · w) = c(0) = 0

from which we see that cu is in W⊥. It follows that W⊥ is a subspace of R^n.
(b) We will prove this property as Corollary 5.12.
(c) You are asked to prove this property in Exercise 23.
(d) You are asked to prove this property in Exercise 24.

We can now express some fundamental relationships involving the subspaces associated with an m × n matrix.

Theorem 5.10
Let A be an m × n matrix. Then the orthogonal complement of the row space of A is the null space of A, and the orthogonal complement of the column space of A is the null space of A^T:

(row(A))⊥ = null(A)  and  (col(A))⊥ = null(A^T)

Proof  If x is a vector in R^n, then x is in (row(A))⊥ if and only if x is orthogonal to every row of A. But this is true if and only if Ax = 0, which is equivalent to x being in null(A), so we have established the first identity. To prove the second identity, we simply replace A by A^T and use the fact that row(A^T) = col(A).

Thus, an m × n matrix A has four subspaces: row(A), null(A), col(A), and null(A^T). The first two are orthogonal complements in R^n, and the last two are orthogonal complements in R^m. The m × n matrix A defines a linear transformation from R^n into R^m whose range is col(A). Moreover, this transformation sends null(A) to 0 in R^m. Figure 5.6 illustrates these ideas schematically. These four subspaces are called the fundamental subspaces of the m × n matrix A.

Figure 5.6  The four fundamental subspaces
Example 5.9
Find bases for the four fundamental subspaces of

A = [ 1   1  3   1   6]
    [ 2  −1  0   1  −1]
    [−3   2  1  −2   1]
    [ 4   1  6   1   3]

Solution  Row reduction gives row(A) = span(u1, u2, u3), where

u1 = [1  0  1  0  −1],  u2 = [0  1  2  0  3],  u3 = [0  0  0  1  4]

Also, null(A) = span(x1, x2), where

x1 = [−1, −2, 1, 0, 0]^T  and  x2 = [1, −3, 0, −4, 1]^T

To show that (row(A))⊥ = null(A), it is enough to show that every ui is orthogonal to each xj, which is an easy exercise. (Why is this sufficient?)

The column space of A is col(A) = span(a1, a2, a4), where

a1 = [1, 2, −3, 4]^T,  a2 = [1, −1, 2, 1]^T,  a4 = [1, 1, −2, 1]^T

We still need to compute the null space of A^T. Row reduction produces

[A^T | 0] = [1   2  −3  4 | 0]       [1  0  0  1 | 0]
            [1  −1   2  1 | 0]       [0  1  0  6 | 0]
            [3   0   1  6 | 0]  -->  [0  0  1  3 | 0]
            [1   1  −2  1 | 0]       [0  0  0  0 | 0]
            [6  −1   1  3 | 0]       [0  0  0  0 | 0]

So, if y is in the null space of A^T, then y1 = −y4, y2 = −6y4, and y3 = −3y4. It follows that

null(A^T) = {[−y4, −6y4, −3y4, y4]^T} = span([−1, −6, −3, 1]^T)

and it is easy to check that this vector is orthogonal to a1, a2, and a4.

The method of Example 5.9 is easily adapted to other situations.
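For readers who want to experiment, here is a minimal computational sketch (using SymPy with exact arithmetic; not part of the text) that reproduces the bases of Example 5.9 and the orthogonality relations of Theorem 5.10:

```python
from sympy import Matrix

A = Matrix([[ 1,  1, 3,  1,  6],
            [ 2, -1, 0,  1, -1],
            [-3,  2, 1, -2,  1],
            [ 4,  1, 6,  1,  3]])

row_basis  = A.rowspace()      # basis for row(A)
null_basis = A.nullspace()     # basis for null(A) = (row(A))-perp
col_basis  = A.columnspace()   # basis for col(A)
left_null  = A.T.nullspace()   # basis for null(A^T) = (col(A))-perp

# Orthogonality checks from Theorem 5.10: all dot products are zero.
assert all(u.dot(x) == 0 for u in row_basis for x in null_basis)
assert all(a.dot(y) == 0 for a in col_basis for y in left_null)
```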
Example 5.10
Let W be the subspace of R⁵ spanned by

w1 = [1, −3, 5, 0, 5]^T,  w2 = [−1, 1, 2, −2, 3]^T,  w3 = [0, −1, 4, −1, 5]^T

Find a basis for W⊥.

Solution  The subspace W spanned by w1, w2, and w3 is the same as the column space of

A = [ 1  −1   0]
    [−3   1  −1]
    [ 5   2   4]
    [ 0  −2  −1]
    [ 5   3   5]

Therefore, by Theorem 5.10, W⊥ = (col(A))⊥ = null(A^T), and we may proceed as in the previous example. We compute

[A^T | 0] = [ 1  −3  5   0  5 | 0]       [1  0  0  3  4 | 0]
            [−1   1  2  −2  3 | 0]  -->  [0  1  0  1  3 | 0]
            [ 0  −1  4  −1  5 | 0]       [0  0  1  0  2 | 0]

Hence, y is in W⊥ if and only if y1 = −3y4 − 4y5, y2 = −y4 − 3y5, and y3 = −2y5. It follows that

W⊥ = {[−3y4 − 4y5, −y4 − 3y5, −2y5, y4, y5]^T} = span([−3, −1, 0, 1, 0]^T, [−4, −3, −2, 0, 1]^T)

and these two vectors form a basis for W⊥.

Orthogonal Projections

Recall that, in R², the projection of a vector v onto a nonzero vector u is given by

proj_u(v) = ((u · v)/(u · u)) u

Furthermore, the vector perp_u(v) = v − proj_u(v) is orthogonal to proj_u(v), and we can decompose v as

v = proj_u(v) + perp_u(v)

as shown in Figure 5.7. If we let W = span(u), then w = proj_u(v) is in W and w⊥ = perp_u(v) is in W⊥. We therefore have a way of "decomposing" v into the sum of two vectors, one from W and the other orthogonal to W, namely v = w + w⊥. We now generalize this idea to R^n.
Definition
Let W be a subspace of R^n and let {u1, ..., uk} be an orthogonal basis for W. For any vector v in R^n, the orthogonal projection of v onto W is defined as

proj_W(v) = ((u1 · v)/(u1 · u1))u1 + ... + ((uk · v)/(uk · uk))uk

The component of v orthogonal to W is the vector

perp_W(v) = v − proj_W(v)

Each summand in the definition of proj_W(v) is also a projection onto a single vector (or, equivalently, onto the one-dimensional subspace spanned by it, in our previous sense). Therefore, with the notation of the preceding definition, we can write

proj_W(v) = proj_{u1}(v) + ... + proj_{uk}(v)

Section 5.5  Applications

... and the coefficients of the cross-product terms xixj are split between aij and aji. This gives

A = [   2   3  −3/2]
    [   3  −1     0]
    [−3/2   0     5]

so

f(x1, x2, x3) = [x1  x2  x3] [   2   3  −3/2] [x1]
                             [   3  −1     0] [x2]
                             [−3/2   0     5] [x3]

as you can easily check.
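As a quick machine check of the matrix above, the following minimal SymPy sketch expands x^T Ax and displays the resulting quadratic form (the entries of A are as reconstructed above; the sketch is not from the text):

```python
from sympy import Matrix, Rational, symbols, expand

x1, x2, x3 = symbols('x1 x2 x3')
x = Matrix([x1, x2, x3])
A = Matrix([[2, 3, Rational(-3, 2)],
            [3, -1, 0],
            [Rational(-3, 2), 0, 5]])

# Each cross-product coefficient of the form is split between a_ij and a_ji.
print(expand((x.T * A * x)[0]))
# 2*x1**2 + 6*x1*x2 - 3*x1*x3 - x2**2 + 5*x3**2
```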
In the case of a quadratic form f(x, y) in two variables, the graph of z = f(x, y) is a surface in R³. Some examples are shown in Figure 5.12. Observe that the effect of holding x or y constant is to take a cross section of the graph parallel to the yz or xz planes, respectively. For the graphs in Figure 5.12, all of these cross sections are easy to identify. For example, in Figure 5.12(a), the cross sections we get by holding x or y constant are all parabolas opening upward, so f(x, y) ≥ 0 for all values of x and y. In Figure 5.12(c), holding x constant gives parabolas opening downward and holding y constant gives parabolas opening upward, producing a saddle point.

Figure 5.12  Graphs of quadratic forms f(x, y)

What makes this type of analysis quite easy is the fact that these quadratic forms have no cross-product terms. The matrix associated with such a quadratic form is a diagonal matrix. For example, the form f(x, y) = ax² + cy² has the diagonal matrix [a  0; 0  c].

In general, the matrix of a quadratic form is a symmetric matrix, and we saw in Section 5.4 that such matrices can always be orthogonally diagonalized. The Principal Axes Theorem exploits this fact: if Q^T AQ = D is diagonal, then the change of variable x = Qy transforms the quadratic form x^T Ax into y^T Dy, which has no cross-product terms. If the eigenvalues of A are λ1, ..., λn and y = [y1 ... yn]^T, then

x^T Ax = y^T Dy = λ1y1² + λ2y2² + ... + λnyn²
Example 5.27
Find a change of variable that transforms the quadratic form

f(x1, x2) = 5x1² + 4x1x2 + 2x2²

into one with no cross-product terms.

Solution  The matrix of f is

A = [5  2]
    [2  2]

with eigenvalues λ1 = 6 and λ2 = 1. Corresponding unit eigenvectors are

q1 = [2/√5, 1/√5]^T  and  q2 = [1/√5, −2/√5]^T

(Check this.) If we set

Q = [2/√5   1/√5]   and   D = [6  0]
    [1/√5  −2/√5]             [0  1]

then Q^T AQ = D. The change of variable x = Qy, where x = [x1, x2]^T and y = [y1, y2]^T, converts f into

f(y) = f(y1, y2) = [y1  y2] [6  0] [y1] = 6y1² + y2²
                            [0  1] [y2]

The original quadratic form x^T Ax and the new one y^T Dy (referred to in the Principal Axes Theorem) are equal in the following sense. In Example 5.27, suppose we want to evaluate f(x) = x^T Ax at x = [−1, 3]^T. We have

f(−1, 3) = 5(−1)² + 4(−1)(3) + 2(3)² = 11

In terms of the new variables,

y = [y1] = Q^T x = [2/√5   1/√5] [−1] = [ 1/√5]
    [y2]           [1/√5  −2/√5] [ 3]   [−7/√5]

so

f(y1, y2) = 6y1² + y2² = 6(1/√5)² + (−7/√5)² = 55/5 = 11

exactly as before.

The Principal Axes Theorem has some interesting and important consequences. We will consider two of these. The first relates to the possible values that a quadratic form can take on.
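The computation in Example 5.27 is easy to reproduce numerically. A minimal sketch with NumPy (not from the text):

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 2.0]])
eigvals, Q = np.linalg.eigh(A)   # columns of Q are orthonormal eigenvectors

x = np.array([-1.0, 3.0])
y = Q.T @ x                      # the change of variable y = Q^T x

print(x @ A @ x)                                        # 11.0
print(sum(lam * yi**2 for lam, yi in zip(eigvals, y)))  # 11.0
```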
Definition
A quadratic form f(x) = x^T Ax is classified as one of the following:
1. positive definite if f(x) > 0 for all x ≠ 0.
2. positive semidefinite if f(x) ≥ 0 for all x.
3. negative definite if f(x) < 0 for all x ≠ 0.
4. negative semidefinite if f(x) ≤ 0 for all x.
5. indefinite if f(x) takes on both positive and negative values.

A symmetric matrix A is called positive definite, positive semidefinite, negative definite, negative semidefinite, or indefinite if the associated quadratic form f(x) = x^T Ax has the corresponding property.

The quadratic forms in parts (a), (b), (c), and (d) of Figure 5.12 are positive definite, negative definite, indefinite, and positive semidefinite, respectively. The Principal Axes Theorem makes it easy to tell if a quadratic form has one of these properties.

Theorem 5.24
Let A be an n × n symmetric matrix. The quadratic form f(x) = x^T Ax is
a. positive definite if and only if all of the eigenvalues of A are positive.
b. positive semidefinite if and only if all of the eigenvalues of A are nonnegative.
c. negative definite if and only if all of the eigenvalues of A are negative.
d. negative semidefinite if and only if all of the eigenvalues of A are nonpositive.
e. indefinite if and only if A has both positive and negative eigenvalues.
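Theorem 5.24 translates directly into a small classification routine. A minimal sketch (NumPy; the tolerance is an implementation choice, not part of the theorem):

```python
import numpy as np

def classify(A, tol=1e-10):
    lam = np.linalg.eigvalsh(A)        # eigenvalues of a symmetric matrix
    if np.all(lam > tol):    return "positive definite"
    if np.all(lam >= -tol):  return "positive semidefinite"
    if np.all(lam < -tol):   return "negative definite"
    if np.all(lam <= tol):   return "negative semidefinite"
    return "indefinite"

print(classify(np.array([[5.0, 2.0], [2.0, 2.0]])))   # positive definite
```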
... (c) a pair of straight lines or an imaginary conic if k ... 0, and prove that if x lies on this ellipse, so does Ax.]
Key Definitions and Concepts

fundamental subspaces of a matrix, 377
Gram-Schmidt Process, 386
orthogonal basis, 366
orthogonal complement of a subspace, 375
orthogonal matrix, 371
orthogonal projection, 379
orthogonal set of vectors, 366
Orthogonal Decomposition Theorem, 381
orthogonally diagonalizable matrix, 397
orthonormal basis, 369
orthonormal set of vectors, 369
properties of orthogonal matrices, 373
QR factorization, 390
Rank Theorem, 383
spectral decomposition, 402
Spectral Theorem, 400

Review Questions

1. Mark each of the following statements true or false:
(a) Every orthonormal set of vectors is linearly independent.
(b) Every nonzero subspace of R^n has an orthogonal basis.
(c) If A is a square matrix with orthonormal rows, then A is an orthogonal matrix.
(d) Every orthogonal matrix is invertible.
(e) If A is a matrix with det A = 1, then A is an orthogonal matrix.
(f) If A is an m × n matrix such that (row(A))⊥ = R^n, then A must be the zero matrix.
(g) If W is a subspace of R^n and v is a vector in R^n such that proj_W(v) = 0, then v must be the zero vector.
(h) If A is a symmetric, orthogonal matrix, then A² = I.
(i) Every orthogonally diagonalizable matrix is invertible.
(j) Given any n real numbers λ1, ..., λn, there exists a symmetric n × n matrix with λ1, ..., λn as its eigenvalues.
2. Find all values of a and b such that

[1]   [a]   [ 4]
[2],  [1],  [ 1]
[3]   [b]   [−2]

is an orthogonal set of vectors.

3. Find the coordinate vector [v]_B of v = [ ..., −3, 2]^T with respect to the orthogonal basis B = { ... } of R³.

4. The coordinate vector of a vector v with respect to an orthonormal basis B = {v1, v2} of R² is [v]_B = [1/2, ...]^T. If v1 = [3/5, 4/5]^T, find all possible vectors v.
5. Show that

[6/7   −1/√5    4/(7√5)]
[2/7     0    −15/(7√5)]
[3/7    2/√5    2/(7√5)]

is an orthogonal matrix.

6. If [1/2  a; b  c] is an orthogonal matrix, find all possible values of a, b, and c.

7. If Q is an orthogonal n × n matrix and {v1, ..., vk} is an orthonormal set in R^n, prove that {Qv1, ..., Qvk} is an orthonormal set.
8. If Q is an n × n matrix such that the angles ∠(Qx, Qy) and ∠(x, y) are equal for all vectors x and y in R^n, prove that Q is an orthogonal matrix.

In Questions 9-12, find a basis for W⊥.

9. W is the line in R² with general equation 2x − 5y = 0.

10. W is the line in R³ with parametric equations x = t, y = 2t, z = ...

11. W = span([1, 0, −1]^T, [1, 1, 1]^T)

12. W = span( ... )

13. Find bases for each of the four fundamental subspaces of A = [ ... ].

14. Find the orthogonal decomposition of v = [ ... ] with respect to W = span( ... ).

15. (a) Apply the Gram-Schmidt Process to x1 = ..., x2 = ..., x3 = ... to find an orthogonal basis for W = span{x1, x2, x3}.
(b) Use the result of part (a) to find a QR factorization of A = [ ... ].

16. Find an orthogonal basis for R⁴ that contains the vectors ...

17. Find an orthogonal basis for the subspace W = {[x1, x2, x3, x4]^T : x1 + x2 + x3 + x4 = 0} of R⁴.

18. Let A = [ ... ].
(a) Orthogonally diagonalize A.
(b) Give the spectral decomposition of A.

19. Find a symmetric matrix with eigenvalues λ1 = λ2 = 1, λ3 = −2 and eigenspaces

E1 = span( ... ),  E−2 = span( ... )

20. If {v1, v2, ..., vn} is an orthonormal basis for R^n and

A = c1v1v1^T + c2v2v2^T + ... + cnvnvn^T

prove that A is a symmetric matrix with eigenvalues c1, c2, ..., cn and corresponding eigenvectors v1, v2, ..., vn.
Vector Spaces

Algebra is generous; she often gives more than is asked of her.
- Jean le Rond d'Alembert (1717-1783), in Carl B. Boyer, A History of Mathematics (Wiley, 1968), p. 481

6.0 Introduction: Fibonacci in (Vector) Space

The Fibonacci sequence was introduced in Section 4.6. It is the sequence

0, 1, 1, 2, 3, 5, 8, 13, ...

of nonnegative integers with the property that after the first two terms, each term is the sum of the two terms preceding it. Thus 0 + 1 = 1, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5, and so on. If we denote the terms of the Fibonacci sequence by f0, f1, f2, ..., then the entire sequence is completely determined by specifying that

f0 = 0,  f1 = 1  and  fn = fn−1 + fn−2  for n ≥ 2

By analogy with vector notation, let's write a sequence x0, x1, x2, x3, ... as

x = [x0, x1, x2, x3, ...)

The Fibonacci sequence then becomes

f = [f0, f1, f2, f3, ...) = [0, 1, 1, 2, ...)

We now generalize this notion.

Definition
A Fibonacci-type sequence is any sequence x = [x0, x1, x2, x3, ...) such that x0 and x1 are real numbers and xn = xn−1 + xn−2 for n ≥ 2.

For example, [1, √2, 1 + √2, 1 + 2√2, 2 + 3√2, ...) is a Fibonacci-type sequence.

Problem 1  Write down the first five terms of three more Fibonacci-type sequences.

By analogy with vectors again, let's define the sum of two sequences x = [x0, x1, x2, ...) and y = [y0, y1, y2, ...) to be the sequence

x + y = [x0 + y0, x1 + y1, x2 + y2, ...)

If c is a scalar, we can likewise define the scalar multiple of a sequence by

cx = [cx0, cx1, cx2, ...)

Problem 2  (a) Using your examples from Problem 1 or other examples, compute the sums of various pairs of Fibonacci-type sequences. Do the resulting sequences appear to be Fibonacci-type?
(b) Compute various scalar multiples of your Fibonacci-type sequences from Problem 1. Do the resulting sequences appear to be Fibonacci-type?

Problem 3  (a) Prove that if x and y are Fibonacci-type sequences, then so is x + y.
(b) Prove that if x is a Fibonacci-type sequence and c is a scalar, then cx is also a Fibonacci-type sequence.

Let's denote the set of all Fibonacci-type sequences by Fib. Problem 3 shows that, like R^n, Fib is closed under addition and scalar multiplication. The next exercises show that Fib has much more in common with R^n.

Problem 4  Review the algebraic properties of vectors in Theorem 1.1. Does Fib satisfy all of these properties? What Fibonacci-type sequence plays the role of 0? For a Fibonacci-type sequence x, what is −x? Is −x also a Fibonacci-type sequence?

Problem 5  In R^n, we have the standard basis vectors e1, e2, ..., en. The Fibonacci sequence f = [0, 1, 1, 2, ...) can be thought of as the analogue of e2 because its first two terms are 0 and 1. What sequence e in Fib plays the role of e1? What about e3, e4, ...? Do these vectors have analogues in Fib?

Problem 6  Let x = [x0, x1, x2, ...) be a Fibonacci-type sequence. Show that x is a linear combination of e and f.

Problem 7  Show that e and f are linearly independent. (That is, show that if ce + df = 0, then c = d = 0.)

Problem 8  Given your answers to Problems 6 and 7, what would be a sensible value to assign to the "dimension" of Fib? Why?

Problem 9  Are there any geometric sequences in Fib? That is, if [1, r, r², r³, ...) is a Fibonacci-type sequence, what are the possible values of r?

Problem 10  Find a "basis" for Fib consisting of geometric Fibonacci-type sequences.

Problem 11  Using your answer to Problem 10, give an alternative derivation of Binet's formula [formula (5) in Section 4.6]:

fn = (1/√5)((1 + √5)/2)^n − (1/√5)((1 − √5)/2)^n

for the terms of the Fibonacci sequence f = [f0, f1, f2, ...). (Hint: Express f in terms of the basis from Problem 10.)

The Lucas sequence, named after Edouard Lucas (see page 333), is the Fibonacci-type sequence

l = [l0, l1, l2, l3, ...) = [2, 1, 3, 4, ...)

Problem 12  Use the basis from Problem 10 to find an analogue of Binet's formula for the nth term ln of the Lucas sequence.

Problem 13  Prove that the Fibonacci and Lucas sequences are related by the identity

fn−1 + fn+1 = ln  for n ≥ 1

[Hint: The Fibonacci-type sequences [1, 1, 2, 3, ...) and [1, 0, 1, 1, ...) form a basis for Fib. (Why?)]

In this Introduction, we have seen that the collection Fib of all Fibonacci-type sequences behaves in many respects like R², even though the "vectors" are actually infinite sequences. This useful analogy leads to the general notion of a vector space that is the subject of this chapter.
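The closure properties of Problem 3 and Binet's formula from Problem 11 can both be checked experimentally. A minimal sketch (plain Python; not part of the exploration):

```python
from math import sqrt

def fib_type(x0, x1, n=10):
    """First n terms of the Fibonacci-type sequence starting x0, x1."""
    seq = [x0, x1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq

f = fib_type(0, 1)                  # Fibonacci
l = fib_type(2, 1)                  # Lucas
s = [a + b for a, b in zip(f, l)]   # termwise sum

# The sum is again Fibonacci-type, as Problem 3(a) asks you to prove.
assert all(s[n] == s[n - 1] + s[n - 2] for n in range(2, len(s)))

# Binet's formula (Problem 11) reproduces the Fibonacci terms.
phi, psi = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
assert [round((phi**n - psi**n) / sqrt(5)) for n in range(10)] == f
```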
Vector Spaces and Subspaces

In Chapters 1 and 3, we saw that the algebra of vectors and the algebra of matrices are similar in many respects. In particular, we can add both vectors and matrices, and we can multiply both by scalars. The properties that result from these two operations (Theorem 1.1 and Theorem 3.2) are identical in both settings. In this section, we use these properties to define generalized "vectors" that arise in a wide variety of examples. By proving general theorems about these "vectors," we will therefore simultaneously be proving results about all of these examples. This is the real power of algebra: its ability to take properties from a concrete setting, like R^n, and abstract them into a general setting.

Definition
Let V be a set on which two operations, called addition and scalar multiplication, have been defined. If u and v are in V, the sum of u and v is denoted by u + v, and if c is a scalar, the scalar multiple of u by c is denoted by cu. If the following axioms hold for all u, v, and w in V and for all scalars c and d, then V is called a vector space and its elements are called vectors.

1. u + v is in V.  (Closure under addition)
2. u + v = v + u  (Commutativity)
3. (u + v) + w = u + (v + w)  (Associativity)
4. There exists an element 0 in V, called a zero vector, such that u + 0 = u.
5. For each u in V, there is an element −u in V such that u + (−u) = 0.
6. cu is in V.  (Closure under scalar multiplication)
7. c(u + v) = cu + cv  (Distributivity)
8. (c + d)u = cu + du  (Distributivity)
9. c(du) = (cd)u
10. 1u = u

The German mathematician Hermann Grassmann (1809-1877) is generally credited with first introducing the idea of a vector space (although he did not call it that) in 1844. Unfortunately, his work was very difficult to read and did not receive the attention it deserved. One person who did study it was the Italian mathematician Giuseppe Peano (1858-1932). In his 1888 book Calcolo Geometrico, Peano clarified Grassmann's earlier work and laid down the axioms for a vector space as we know them today. Peano's book is also remarkable for introducing operations on sets. His notations ∪, ∩, and ∈ (for "union," "intersection," and "is an element of") are the ones we still use, although they were not immediately accepted by other mathematicians. Peano's axiomatic definition of a vector space also had very little influence for many years. Acceptance came in 1918, after Hermann Weyl (1885-1955) repeated it in his book Space, Time, Matter, an introduction to Einstein's general theory of relativity.

Remarks
• By "scalars" we will usually mean the real numbers. Accordingly, we should refer to V as a real vector space (or a vector space over the real numbers). It is also possible for scalars to be complex numbers or to belong to Zp, where p is prime. In these cases, V is called a complex vector space or a vector space over Zp, respectively. Most of our examples will be real vector spaces, so we will usually omit the adjective "real." If something is referred to as a "vector space," assume that we are working over the real number system. In fact, the scalars can be chosen from any number system in which, roughly speaking, we can add, subtract, multiply, and divide according to the usual laws of arithmetic. In abstract algebra, such a number system is called a field.
• The definition of a vector space does not specify what the set V consists of. Neither does it specify what the operations called "addition" and "scalar multiplication" look like. Often, they will be familiar, but they need not be. See Example 6.6 below and Exercises 5-7.
We will now look at several examples of vector spaces. In each case, we need to specify the set V and the operations of addition and scalar multiplication and to verify axioms 1 through 10. We need to pay particular attention to axioms 1 and 6 (closure), axiom 4 (the existence of a zero vector in V), and axiom 5 (each vector in V must have a negative in V).

Example 6.1
For any n ≥ 1, R^n is a vector space with the usual operations of addition and scalar multiplication. Axioms 1 and 6 follow from the definitions of these operations, and the remaining axioms follow from Theorem 1.1.

Example 6.2
The set of all 2×3 matrices is a vector space with the usual operations of matrix addition and matrix scalar multiplication. Here the "vectors" are actually matrices. We know that the sum of two 2×3 matrices is also a 2×3 matrix and that multiplying a 2×3 matrix by a scalar gives another 2×3 matrix; hence, we have closure. The remaining axioms follow from Theorem 3.2. In particular, the zero vector 0 is the 2×3 zero matrix, and the negative of a 2×3 matrix A is just the 2×3 matrix −A.
There is nothing special about 2×3 matrices. For any positive integers m and n, the set of all m × n matrices forms a vector space with the usual operations of matrix addition and matrix scalar multiplication. This vector space is denoted Mmn.
Example 6.3
Let P2 denote the set of all polynomials of degree 2 or less with real coefficients. Define addition and scalar multiplication in the usual way. (See Appendix D.) If

p(x) = a0 + a1x + a2x²  and  q(x) = b0 + b1x + b2x²

are in P2, then

p(x) + q(x) = (a0 + b0) + (a1 + b1)x + (a2 + b2)x²

has degree at most 2 and so is in P2. If c is a scalar, then

cp(x) = ca0 + ca1x + ca2x²

is also in P2. This verifies axioms 1 and 6.
The zero vector 0 is the zero polynomial, that is, the polynomial all of whose coefficients are zero. The negative of a polynomial p(x) = a0 + a1x + a2x² is the polynomial −p(x) = −a0 − a1x − a2x². It is now easy to verify the remaining axioms. We will check axiom 2 and leave the others for Exercise 12. With p(x) and q(x) as above, we have

p(x) + q(x) = (a0 + a1x + a2x²) + (b0 + b1x + b2x²)
            = (a0 + b0) + (a1 + b1)x + (a2 + b2)x²
            = (b0 + a0) + (b1 + a1)x + (b2 + a2)x²
            = (b0 + b1x + b2x²) + (a0 + a1x + a2x²)
            = q(x) + p(x)

where the third equality follows from the fact that addition of real numbers is commutative.

In general, for any fixed n ≥ 0, the set Pn of all polynomials of degree less than or equal to n is a vector space, as is the set P of all polynomials.
Example 6.4
Let F denote the set of all real-valued functions defined on the real line. If f and g are two such functions and c is a scalar, then f + g and cf are defined by

(f + g)(x) = f(x) + g(x)  and  (cf)(x) = cf(x)

In other words, the value of f + g at x is obtained by adding together the values of f and g at x [Figure 6.1(a)]. Similarly, the value of cf at x is just the value of f at x multiplied by the scalar c [Figure 6.1(b)]. The zero vector in F is the constant function f0 that is identically zero; that is, f0(x) = 0 for all x. The negative of a function f is the function −f defined by (−f)(x) = −f(x) [Figure 6.1(c)]. Axioms 1 and 6 are obviously true. Verification of the remaining axioms is left as Exercise 13. Thus, F is a vector space.

Figure 6.1  The graphs of (a) f, g, and f + g, (b) f, 2f, and −3f, and (c) f and −f
In Example 6.4, we could also have considered only those functions defined on some closed interval [a, b] of the real line. This approach also produces a vector space, denoted by F[a, b].
Example 6.5
The set Z of integers with the usual operations is not a vector space. To demonstrate this, it is enough to find that one of the ten axioms fails and to give a specific instance in which it fails (a counterexample). In this case, we find that we do not have closure under scalar multiplication. For example, the multiple of the integer 2 by the scalar 1/3 is (1/3)(2) = 2/3, which is not an integer. Thus, it is not true that cx is in Z for every x in Z and every scalar c (i.e., axiom 6 fails).
Example 6.6
Let V = R² with the usual definition of addition but the following definition of scalar multiplication:

c[x] = [cx]
 [y]   [ 0]

Then, for example,

1[1] = [1] ≠ [1]
 [1]   [0]   [1]

so axiom 10 fails. [In fact, the other nine axioms are all true (check this), but we do not need to look into them because V has already failed to be a vector space. This example shows the value of looking ahead, rather than working through the list of axioms in the order in which they have been given.]
Example 6.7
Let C² denote the set of all ordered pairs of complex numbers. Define addition and scalar multiplication as in R², except here the scalars are complex numbers. For example,

[1 + i ] + [−3 + 2i] = [−2 + 3i]
[2 − 3i]   [   4   ]   [ 6 − 3i]

and

(1 − i)[1 + i ] = [(1 − i)(1 + i) ] = [   2    ]
       [2 − 3i]   [(1 − i)(2 − 3i)]   [−1 − 5i ]

Using properties of the complex numbers, it is straightforward to check that all ten axioms hold. Therefore, C² is a complex vector space.

In general, C^n is a complex vector space for all n ≥ 1.

Example 6.8
If p is prime, the set Zp^n (with the usual definitions of addition and multiplication by scalars from Zp) is a vector space over Zp for all n ≥ 1.
Before we consider further examples, we state a theorem that contains some useful properties of vector spaces. It is important to note that, by proving this theorem for vector spaces in general, we are actually proving it for every specific vector space.

Theorem 6.1
Let V be a vector space, u a vector in V, and c a scalar.
a. 0u = 0
b. c0 = 0
c. (−1)u = −u
d. If cu = 0, then c = 0 or u = 0.
Proof  We prove properties (b) and (d) and leave the proofs of the remaining properties as exercises.
(b) We have

c0 = c(0 + 0) = c0 + c0

by vector space axioms 4 and 7. Adding the negative of c0 to both sides produces

c0 + (−c0) = (c0 + c0) + (−c0)

which implies

0 = c0 + (c0 + (−c0))   (by axioms 5 and 3)
  = c0 + 0              (by axiom 5)
  = c0                  (by axiom 4)

(d) Suppose cu = 0. To show that either c = 0 or u = 0, let's assume that c ≠ 0. (If c = 0, there is nothing to prove.) Then, since c ≠ 0, its reciprocal 1/c is defined, and

u = 1u                  (by axiom 10)
  = ((1/c)c)u
  = (1/c)(cu)           (by axiom 9)
  = (1/c)0
  = 0                   (by property (b))

We will write u − v for u + (−v), thereby defining subtraction of vectors. We will also exploit the associativity property of addition to unambiguously write u + v + w for the sum of three vectors and, more generally,

c1v1 + c2v2 + ... + ckvk

for a linear combination of vectors.

Subspaces

We have seen that, in R^n, it is possible for one vector space to sit inside another one, giving rise to the notion of a subspace. For example, a plane through the origin is a subspace of R³. We now extend this concept to general vector spaces.
Definition
A subset W of a vector space V is called a subspace of V if W is itself a vector space with the same scalars, addition, and scalar multiplication as V.

As in R^n, checking to see whether a subset W of a vector space V is a subspace of V involves testing only two of the ten vector space axioms. We prove this observation as a theorem.

Theorem 6.2
Let V be a vector space and let W be a nonempty subset of V. Then W is a subspace of V if and only if the following conditions hold:
a. If u and v are in W, then u + v is in W.
b. If u is in W and c is a scalar, then cu is in W.

Proof  Assume that W is a subspace of V. Then W satisfies vector space axioms 1 to 10. In particular, axiom 1 is condition (a) and axiom 6 is condition (b).
Conversely, assume that W is a subset of a vector space V, satisfying conditions (a) and (b). By hypothesis, axioms 1 and 6 hold. Axioms 2, 3, 7, 8, 9, and 10 hold in W because they are true for all vectors in V and thus are true in particular for those vectors in W. (We say that W inherits these properties from V.) This leaves axioms 4 and 5 to be checked. Since W is nonempty, it contains at least one vector u. Then condition (b) and Theorem 6.1(a) imply that 0u = 0 is also in W. This is axiom 4. If u is in W, then, by taking c = −1 in condition (b), we have that −u = (−1)u is also in W, using Theorem 6.1(c).

Remark  Since Theorem 6.2 generalizes the notion of a subspace from the context of R^n to general vector spaces, all of the subspaces of R^n that we encountered in Chapter 3 are subspaces of R^n in the current context. In particular, lines and planes through the origin are subspaces of R³.

Example 6.9
We have already shown that the set Pn of all polynomials with degree at most n is a vector space. Hence, Pn is a subspace of the vector space P of all polynomials.

Example 6.10
Let W be the set of symmetric n × n matrices. Show that W is a subspace of Mnn.

Solution  Clearly, W is nonempty, so we need only check conditions (a) and (b) in Theorem 6.2. Let A and B be in W and let c be a scalar. Then A^T = A and B^T = B, from which it follows that

(A + B)^T = A^T + B^T = A + B

Therefore, A + B is symmetric and, hence, is in W. Similarly,

(cA)^T = cA^T = cA

so cA is symmetric and, thus, is in W. We have shown that W is closed under addition and scalar multiplication. Therefore, it is a subspace of Mnn, by Theorem 6.2.
Example 6.11
Let C be the set of all continuous real-valued functions defined on R and let D be the set of all differentiable real-valued functions defined on R. Show that C and D are subspaces of F. ...

... R⁴, P3, and M22. Typical elements of these vector spaces are, respectively,

v = [a]     p(x) = a + bx + cx² + dx³     A = [a  b]
    [b]                                       [c  d]
    [c]
    [d]

In the words of Yogi Berra, "It's deja vu all over again."

Any calculations involving the vector space operations of addition and scalar multiplication are essentially the same in all three settings. To highlight the similarities, in the next example we will perform the necessary steps in the three vector spaces side by side.

Example 6.13
(a) Show that the set W of all vectors of the form [a, b, −b, a]^T is a subspace of R⁴.
(b) Show that the set W of all polynomials of the form a + bx − bx² + ax³ is a subspace of P3.
(c) Show that the set W of all matrices of the form [a  b; −b  a] is a subspace of M22.
Solution  (a) W is nonempty because it contains the zero vector 0. (Take a = b = 0.) Let u and v be in W, say

u = [a, b, −b, a]^T  and  v = [c, d, −d, c]^T

Then

u + v = [a + c, b + d, −(b + d), a + c]^T

so u + v is also in W (because it has the right form). Similarly, if k is a scalar, then

ku = [ka, kb, −kb, ka]^T

so ku is in W. Thus, W is a nonempty subset of R⁴ that is closed under addition and scalar multiplication. Therefore, W is a subspace of R⁴, by Theorem 6.2.

(b) W is nonempty because it contains the zero polynomial. (Take a = b = 0.) Let p(x) and q(x) be in W, say

p(x) = a + bx − bx² + ax³  and  q(x) = c + dx − dx² + cx³

Then

p(x) + q(x) = (a + c) + (b + d)x − (b + d)x² + (a + c)x³

so p(x) + q(x) is also in W (because it has the right form). Similarly, if k is a scalar, then

kp(x) = ka + kbx − kbx² + kax³

so kp(x) is in W. Thus, W is a nonempty subset of P3 that is closed under addition and scalar multiplication. Therefore, W is a subspace of P3, by Theorem 6.2.

(c) W is nonempty because it contains the zero matrix O. (Take a = b = 0.) Let A and B be in W, say

A = [ a  b]  and  B = [ c  d]
    [−b  a]           [−d  c]

Then

A + B = [ a + c      b + d]
        [−(b + d)    a + c]

so A + B is also in W (because it has the right form). Similarly, if k is a scalar, then

kA = [ ka  kb]
     [−kb  ka]

so kA is in W. Thus, W is a nonempty subset of M22 that is closed under addition and scalar multiplication. Therefore, W is a subspace of M22, by Theorem 6.2.
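The parallelism in Example 6.13 is visible in code as well: in all three settings an element of W is determined by the same pair (a, b). A minimal sketch (NumPy; the function names are ours, not the text's):

```python
import numpy as np

def vec(a, b):  return np.array([a, b, -b, a])       # W inside R^4
def poly(a, b): return (a, b, -b, a)                 # coefficients of a + bx - bx^2 + ax^3
def mat(a, b):  return np.array([[a, b], [-b, a]])   # W inside M_22

# Closure under addition and scalar multiplication: the results again
# have the form determined by a single pair (a, b).
print(vec(1, 2) + vec(3, -1))   # [4  1 -1  4]  =  vec(4, 1)
print(3 * mat(1, 2))            # [[3 6] [-6 3]]  =  mat(3, 6)
print(poly(3, 6))               # the matching polynomial coefficients
```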
Example 6.13 shows that it is often possible to relate examples that, on the surface, appear to have nothing in common. Consequently, we can apply our knowledge of R^n to polynomials, matrices, and other examples. We will encounter this idea several times in this chapter and will make it precise ...

... In P2, determine whether r(x) = 1 − 4x + 6x² is in span(p(x), q(x)), where

p(x) = 1 − x + x²  and  q(x) = 2 + x − 3x²

Solution  We are looking for scalars c and d such that cp(x) + dq(x) = r(x). This means that

c(1 − x + x²) + d(2 + x − 3x²) = 1 − 4x + 6x²

Regrouping according to powers of x, we have

(c + 2d) + (−c + d)x + (c − 3d)x² = 1 − 4x + 6x²

Equating the coefficients of like powers of x gives

 c + 2d =  1
−c +  d = −4
 c − 3d =  6

which is easily solved to give c = 3 and d = −1. Therefore, r(x) = 3p(x) − q(x), so r(x) is in span(p(x), q(x)). (Check this.)
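The same coefficients can be found by machine. A minimal SymPy sketch (not from the text):

```python
from sympy import symbols, Poly, solve

x, c, d = symbols('x c d')
p = 1 - x + x**2
q = 2 + x - 3*x**2
r = 1 - 4*x + 6*x**2

# cp(x) + dq(x) - r(x) must be the zero polynomial,
# so every coefficient must vanish.
eqs = Poly(c*p + d*q - r, x).all_coeffs()
print(solve(eqs, [c, d]))   # {c: 3, d: -1}
```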
Example 6.20
In F, determine whether sin 2x is in span(sin x, cos x).

Solution  We set c sin x + d cos x = sin 2x and try to determine c and d so that this equation is true. Since these are functions, the equation must be true for all values of x. Setting x = 0, we have

c sin 0 + d cos 0 = sin 0  or  c(0) + d(1) = 0

from which we see that d = 0. Setting x = π/2, we get

c sin(π/2) + d cos(π/2) = sin(π)  or  c(1) + d(0) = 0

giving c = 0. But this implies that sin 2x = 0(sin x) + 0(cos x) = 0 for all x, which is absurd, since sin 2x is not the zero function. We conclude that sin 2x is not in span(sin x, cos x).

Remark  It is true that sin 2x can be written in terms of sin x and cos x. For example, we have the double angle formula sin 2x = 2 sin x cos x. However, this is not a linear combination.
Example 6.21
In M22, describe the span of

A = [1  0],  B = [1  0],  C = [0  1]
    [0  1]      [0  0]      [1  0]

Solution  Every linear combination of A, B, and C is of the form

cA + dB + eC = c[1  0] + d[1  0] + e[0  1] = [c + d   e]
                [0  1]    [0  0]    [1  0]   [  e     c]

...

... Then, since W is closed under addition and scalar multiplication, it contains every linear combination c1v1 + c2v2 + ... + ckvk of v1, v2, ..., vk. Therefore, span(v1, v2, ..., vk) is contained in W.
Exercises 6.1

In Exercises 1-11, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a vector space. If it is not, list all of the axioms that fail to hold.

1. The set of all vectors in R² of the form [ ... ], with the usual vector addition and scalar multiplication
2. The set of all vectors [x, y]^T in R² with x ≥ 0, y ≥ 0 (i.e., the first quadrant), with the usual vector addition and scalar multiplication
3. The set of all vectors [x, y]^T in R² with xy ≥ 0 (i.e., the union of the first and third quadrants), with the usual vector addition and scalar multiplication
4. The set of all vectors [x, y]^T in R² with x ≥ y, with the usual vector addition and scalar multiplication
5. R², with the usual addition but scalar multiplication defined by c[x, y]^T = [ ... ]
6. R², with the usual scalar multiplication but addition defined by [x1, y1]^T + [x2, y2]^T = [x1 + x2 + 1, y1 + y2 + 1]^T
7. The set of all positive real numbers, with addition ⊕ defined by x ⊕ y = xy and scalar multiplication ⊙ defined by c ⊙ x = x^c
8. The set of all rational numbers, with the usual addition and multiplication
9. The set of all upper triangular 2×2 matrices, with the usual matrix addition and scalar multiplication
10. The set of all 2×2 matrices of the form [a  b; c  d], where ad = 0, with the usual matrix addition and scalar multiplication
11. The set of all skew-symmetric n × n matrices, with the usual matrix addition and scalar multiplication (see Exercises 3.2)
12. Finish verifying that P2 is a vector space (see Example 6.3).
13. Finish verifying that F is a vector space (see Example 6.4).

In Exercises 14-17, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a complex vector space. If it is not, list all of the axioms that fail to hold.

14. The set of all vectors in C² of the form [ ... ], with the usual vector addition and scalar multiplication
15. The set Mmn(C) of all m × n complex matrices, with the usual matrix addition and scalar multiplication
16. The set C², with the usual vector addition but scalar multiplication defined by c[z1, z2]^T = [ ... ]
17. R^n, with the usual vector addition and scalar multiplication

In Exercises 18-21, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a vector space over the indicated Zp. If it is not, list all of the axioms that fail to hold.

18. The set of all vectors in Z2^n with an even number of 1s, over Z2 with the usual vector addition and scalar multiplication
19. The set of all vectors in Z2^n with an odd number of 1s, over Z2 with the usual vector addition and scalar multiplication
20. The set Mmn(Zp) of all m × n matrices with entries from Zp, over Zp with the usual matrix addition and scalar multiplication
21. Z6, over Z3 with the usual addition and multiplication (Think this one through carefully!)
22. Prove Theorem 6.1(a).
23. Prove Theorem 6.1(c).

In Exercises 24-45, use Theorem 6.2 to determine whether W is a subspace of V.

24. V = R³, W = { ... }
25. V = R³, W = { ... }
26. V = R³, W = {[a, b, a + b + 1]^T}
27. V = R³, W = { ... }
28. V = M22, W = { ... }
29. V = M22, W = {[a  b; c  d] : ad ≥ bc}
30. V = Mnn, W = {A in Mnn : det A = 1}
31. V = Mnn, W is the set of diagonal n × n matrices
32. V = Mnn, W is the set of idempotent n × n matrices
33. V = Mnn, W = {A in Mnn : AB = BA}, where B is a given (fixed) matrix
34. V = P2, W = {bx + d}
35. V = P2, W = {a + bx + cx² : a + b + c = 0}
36. V = P2, W = {a + bx + cx² : abc = 0}
37. V = P, W is the set of all polynomials of degree 3
38. V = F, W = {f in F : f(−x) = f(x)}
39. V = F, W = {f in F : f(−x) = −f(x)}
40. V = F, W = {f in F : f(0) = 1}
41. V = F, W = {f in F : f(0) = 0}
42. V = F, W is the set of all integrable functions
43. V = D, W = {f in D : f′(x) ≥ 0 for all x}
44. V = F, W = C^(2), the set of all functions with continuous second derivatives
45. V = F, W = {f in F : lim f(x) = ∞ as x → ∞}
46. Let V be a vector space with subspaces U and W. Prove that U ∩ W is a subspace of V.
47. Let V be a vector space with subspaces U and W. Give an example with V = R² to show that U ∪ W need not be a subspace of V.
48. Let V be a vector space with subspaces U and W. Define the sum of U and W to be

U + W = {u + w : u is in U, w is in W}

(a) If V = R³, U is the x-axis, and W is the y-axis, what is U + W?
(b) If U and W are subspaces of a vector space V, prove that U + W is a subspace of V.
49. If U and V are vector spaces, define the Cartesian product of U and V to be

U × V = {(u, v) : u is in U and v is in V}

Prove that U × V is a vector space.
50. Let W be a subspace of a vector space V. Prove that Δ = {(w, w) : w is in W} is a subspace of V × V.

In Exercises 51 and 52, let A = [ ... ] and B = [ ... ]. Determine whether C is in span(A, B).

51. C = [ ... ]
52. C = [ ... ]

In Exercises 53 and 54, let p(x) = 1 − 2x, q(x) = x − x², and r(x) = −2 + 3x + x². Determine whether s(x) is in span(p(x), q(x), r(x)).

53. s(x) = 3 − 5x − x²
54. s(x) = 1 + x + x²

In Exercises 55-58, let f(x) = sin²x and g(x) = cos²x. Determine whether h(x) is in span(f(x), g(x)).

55. h(x) = 1
56. h(x) = cos 2x
57. h(x) = sin 2x
58. h(x) = sin x

59. Is M22 spanned by [ ... ], [ ... ], [ ... ], [ ... ]?
60. Is M22 spanned by [ ... ], [ ... ], [ ... ], [ ... ]?
61. Is P2 spanned by 1 + x, x + x², 1 + x²?
62. Is P2 spanned by 1 + x + 2x², 2 + x + 2x², −1 + x + 2x²?
63. Prove that every vector space has a unique zero vector.
64. Prove that for every vector v in a vector space V, there is a unique v′ in V such that v + v′ = 0.
Linear Independence, Basis, and Dimension

In this section, we extend the notions of linear independence, basis, and dimension to general vector spaces, generalizing the results of Sections 2.3 and 3.5. In most cases, the proofs of the theorems carry over; we simply replace R^n by the vector space V.

Linear Independence

Definition
A set of vectors {v1, v2, ..., vk} in a vector space V is linearly dependent if there are scalars c1, c2, ..., ck, at least one of which is not zero, such that

c1v1 + c2v2 + ... + ckvk = 0

A set of vectors that is not linearly dependent is said to be linearly independent.

As in R^n, {v1, v2, ..., vk} is linearly independent in a vector space V if and only if

c1v1 + c2v2 + ... + ckvk = 0  implies  c1 = 0, c2 = 0, ..., ck = 0

We also have the following useful alternative formulation of linear dependence.

Theorem 6.4
A set of vectors {v1, v2, ..., vk} in a vector space V is linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the others.

Proof  The proof is identical to that of Theorem 2.5.
As special cases, we have the following standard bases:

• {e1, e2, ..., en} is a basis for R^n.
• {1, x, x², ..., x^n} is a basis for Pn, called the standard basis for Pn.

...

Example 6.32
Show that B = {1 + x, x + x², 1 + x²} is a basis for P2.
Solution  We have already shown that B is linearly independent, in Example 6.26. To show that B spans P2, let a + bx + cx² be an arbitrary polynomial in P2. We must show that there are scalars c1, c2, and c3 such that

c1(1 + x) + c2(x + x²) + c3(1 + x²) = a + bx + cx²

or, equivalently,

(c1 + c3) + (c1 + c2)x + (c2 + c3)x² = a + bx + cx²

Equating coefficients of like powers of x, we obtain the linear system

c1      + c3 = a
c1 + c2      = b
     c2 + c3 = c

which has a solution, since the coefficient matrix

[1  0  1]
[1  1  0]
[0  1  1]

has rank 3 and, hence, is invertible. (We do not need to know what the solution is; we only need to know that it exists.) Therefore, B is a basis for P2.

Remark  Observe that the matrix

[1  0  1]
[1  1  0]
[0  1  1]

is the key to Example 6.32. We can immediately obtain it using the correspondence between P2 and R³, as indicated in the Remark following Example 6.26.
Example 6.33
Show that B = {1, x, x², ...} is a basis for P.

Solution  In Example 6.28, we saw that B is linearly independent. It also spans P, since clearly every polynomial is a linear combination of (finitely many) powers of x.

Example 6.34
Find bases for the three vector spaces in Example 6.13:

(a) W1 = {[a, b, −b, a]^T}   (b) W2 = {a + bx − bx² + ax³}   (c) W3 = {[a  b; −b  a]}
Solution  Once again, we will work the three examples side by side to highlight the similarities among them. In a strong sense, they are all the same example, but it will take us until Section 6.5 to make this idea perfectly precise.

(a) Since

[a, b, −b, a]^T = a[1, 0, 0, 1]^T + b[0, 1, −1, 0]^T

we have W1 = span(u, v), where

u = [1, 0, 0, 1]^T  and  v = [0, 1, −1, 0]^T

Since {u, v} is clearly linearly independent, it is also a basis for W1.

(b) Since

a + bx − bx² + ax³ = a(1 + x³) + b(x − x²)

we have W2 = span(u(x), v(x)), where

u(x) = 1 + x³  and  v(x) = x − x²

Since {u(x), v(x)} is clearly linearly independent, it is also a basis for W2.

(c) Since

[ a  b] = a[1  0] + b[ 0  1]
[−b  a]    [0  1]    [−1  0]

we have W3 = span(U, V), where

U = [1  0]  and  V = [ 0  1]
    [0  1]           [−1  0]

Since {U, V} is clearly linearly independent, it is also a basis for W3.
Coordinates

Section 3.5 introduced the idea of the coordinates of a vector with respect to a basis for subspaces of R^n. We now extend this concept to arbitrary vector spaces.

Theorem 6.5
Let V be a vector space and let B be a basis for V. For every vector v in V, there is exactly one way to write v as a linear combination of the basis vectors in B.

Proof  The proof is the same as the proof of Theorem 3.29. It works even if the basis B is infinite, since linear combinations are, by definition, finite.

The converse of Theorem 6.5 is also true. That is, if B is a set of vectors in a vector space V with the property that every vector in V can be written uniquely as a linear combination of the vectors in B, then B is a basis for V (see Exercise 30). In this sense, the unique representation property characterizes a basis. Since the representation of a vector with respect to a basis is unique, the next definition makes sense.

Definition
Let B = {v1, v2, ..., vn} be a basis for a vector space V. Let v be a vector in V, and write v = c1v1 + c2v2 + ... + cnvn. Then c1, c2, ..., cn are called the coordinates of v with respect to B, and the column vector

[v]_B = [c1, c2, ..., cn]^T

is called the coordinate vector of v with respect to B.

Observe that if the basis B of V has n vectors, then [v]_B is a (column) vector in R^n.
Example 6.35
Find the coordinate vector [p(x)]_B of p(x) = 2 − 3x + 5x² with respect to the standard basis B = {1, x, x²} of P2.

Solution  The polynomial p(x) is already a linear combination of 1, x, and x², so

[p(x)]_B = [2, −3, 5]^T

This is the correspondence between P2 and R³ that we remarked on after Example 6.26, and it can easily be generalized to show that the coordinate vector of a polynomial

p(x) = a0 + a1x + a2x² + ... + anx^n  in Pn

with respect to the standard basis B = {1, x, x², ..., x^n} is just the vector

[p(x)]_B = [a0, a1, ..., an]^T  in R^(n+1)

Remark  The order in which the basis vectors appear in B affects the order of the entries in a coordinate vector. For example, in Example 6.35, assume that the standard basis vectors are ordered as B′ = {x², x, 1}. Then the coordinate vector of p(x) = 2 − 3x + 5x² with respect to B′ is

[p(x)]_B′ = [5, −3, 2]^T

Example 6.36
Find the coordinate vector [A]_B of A = [2  −1; 4  3] with respect to the standard basis B = {E11, E12, E21, E22} of M22.

Solution  Since

A = 2E11 − E12 + 4E21 + 3E22

we have

[A]_B = [2, −1, 4, 3]^T

This is the correspondence between M22 and R⁴ that we noted before the introduction to Example 6.13. It too can easily be generalized to give a correspondence between Mmn and R^mn.
Example 6.37
Find the coordinate vector [p(x)]_C of p(x) = 1 + 2x − x² with respect to the basis C = {1 + x, x + x², 1 + x²} of P2.

Solution  We need to find c1, c2, and c3 such that

c1(1 + x) + c2(x + x²) + c3(1 + x²) = 1 + 2x − x²

or, equivalently,

(c1 + c3) + (c1 + c2)x + (c2 + c3)x² = 1 + 2x − x²

As in Example 6.32, this means we need to solve the system

c1      + c3 =  1
c1 + c2      =  2
     c2 + c3 = −1

whose solution is found to be c1 = 2, c2 = 0, c3 = −1. Therefore,

[p(x)]_C = [2, 0, −1]^T

(Since this result says that p(x) = 2(1 + x) − (1 + x²), it is easy to check that it is correct.)
The next theorem shows that the process of forming coordinate vectors is compatible with the vector space operations of addition and scalar multiplication.

Theorem 6.6
Let B = {v1, v2, ..., vn} be a basis for a vector space V. Let u and v be vectors in V and let c be a scalar. Then
a. [u + v]_B = [u]_B + [v]_B
b. [cu]_B = c[u]_B

Proof  We begin by writing u and v in terms of the basis vectors, say, as

u = c1v1 + c2v2 + ... + cnvn  and  v = d1v1 + d2v2 + ... + dnvn

Then, using vector space properties, we have

u + v = (c1 + d1)v1 + (c2 + d2)v2 + ... + (cn + dn)vn

and

cu = (cc1)v1 + (cc2)v2 + ... + (ccn)vn

so

[u + v]_B = [c1 + d1, c2 + d2, ..., cn + dn]^T = [u]_B + [v]_B

and

[cu]_B = [cc1, cc2, ..., ccn]^T = c[u]_B
An easy corollary to Theorem 6.6 states that coordinate vectors preserve linear combinations:

[c1u1 + c2u2 + ... + ckuk]_B = c1[u1]_B + c2[u2]_B + ... + ck[uk]_B   (1)

You are asked to prove this corollary in Exercise 31. The most useful aspect of coordinate vectors is that they let us transfer questions about linear independence and spanning in an abstract vector space V to the familiar setting of R^n. ...
Exercises 6.2

In Exercises 1-4, test the sets of matrices for linear independence in M22. For those that are linearly dependent, express one of the matrices as a linear combination of the others.

1. { ... }
2. { ... }
3. { ... }
4. { ... }

In Exercises 5-9, test the sets of polynomials for linear independence. For those that are linearly dependent, express one of the polynomials as a linear combination of the others.

5. {x, 1 + x} in P1
6. {1 + x, 1 + x², 1 − x + x²} in P2
7. {x, 2x − x², 3x + 2x²} in P2
8. {2x, x − x², 1 + x³, 2 − x² + x³} in P3
9. {1 − 2x, 3x + x² − x³, 1 + x² + 2x³, 3 + 2x + 3x³} in P3

In Exercises 10-14, test the sets of functions for linear independence in F. For those that are linearly dependent, express one of the functions as a linear combination of the others.

10. {1, sin x, cos x}
11. {1, sin²x, cos²x}
12. {e^x, e^(−x)}
13. {1, ln(2x), ln(x²)}
14. {sin x, sin 2x, sin 3x}

15. If f and g are in C^(1), the vector space of all functions with continuous derivatives, then the determinant

W(x) = | f(x)   g(x) |
       | f′(x)  g′(x)|

is called the Wronskian of f and g [named after the Polish-French mathematician Jozef Maria Hoene-Wronski (1776-1853), who worked on the theory of determinants and the philosophy of mathematics]. Show that f and g are linearly independent if their Wronskian is not identically zero [that is, if there is some x such that W(x) ≠ 0].

16. In general, the Wronskian of f1, ..., fn in C^(n−1) is the determinant

W(x) = | f1(x)         ...  fn(x)        |
       | f1′(x)        ...  fn′(x)       |
       |  ...                ...         |
       | f1^(n−1)(x)   ...  fn^(n−1)(x)  |

and f1, ..., fn are linearly independent, provided W(x) is not identically zero. Repeat Exercises 10-14 using the Wronskian test.

17. Let {u, v, w} be a linearly independent set of vectors in a vector space V.
(a) Is {u + v, v + w, u + w} linearly independent? Either prove that it is or give a counterexample to show that it is not.
(b) Is {u − v, v − w, u − w} linearly independent? Either prove that it is or give a counterexample to show that it is not.

In Exercises 18-25, determine whether the set B is a basis for the vector space V.

18. V = M22, B = { ... }
19. V = M22, B = { ... }
20. V = M22, B = { ... }
21. V = M22, B = { ... }
22. V = P2, B = {x, 1 + x, x − x²}
23. V = P2, B = {1 − x, 1 − x², x − x²}
24. V = P2, B = {1, 1 + 2x + 3x²}
25. V = P2, B = {1, 2 − x, 3 − x², x + 2x²}

26. Find the coordinate vector of A = [ ... ] with respect to the basis B = {E11, E12, E21, E22} of M22.
27. Find the coordinate vector of A = [ ... ] with respect to the basis { ... } of M22.

...

32. Let B be a basis for a vector space V. Prove that if {[u1]_B, ..., [uk]_B} is linearly independent in R^n, then {u1, ..., uk} is linearly independent in V.
33. Let {u1, ..., um} be a set of vectors in an n-dimensional vector space V and let B be a basis for V. Let S = {[u1]_B, ..., [um]_B} be the set of coordinate vectors of {u1, ..., um} with respect to B. Prove that span(u1, ..., um) = V if and only if span(S) = R^n.

In Exercises 34-39, find the dimension of the vector space V and give a basis for V.

34. V = {p(x) in P2 : p(0) = 0}
35. V = {p(x) in P2 : p(1) = 0}
36. V = {p(x) in P2 : xp′(x) = p(x)}
37. V = {A in M22 : A is upper triangular}
38. V = {A in M22 : A is skew-symmetric}
39. V = {A in M22 : AB = BA}, where B = [ ... ]

40. Find a formula for the dimension of the vector space of symmetric n × n matrices.
41. Find a formula for the dimension of the vector space of skew-symmetric n × n matrices.
42. Let U and W be subspaces of a finite-dimensional vector space V. Prove Grassmann's Identity:

dim(U + W) = dim U + dim W − dim(U ∩ W)

[Hint: The subspace U + W is defined in Exercise 48 of Section 6.1. Let B = {v1, ..., vk} be a basis for U ∩ W. Extend B to a basis C of U and a basis D of W. Prove that C ∪ D is a basis for U + W.]
43. Let U and V be finite-dimensional vector spaces.
(a) Find a formula for dim(U × V) in terms of dim U and dim V. ...
44. ... Prove that P is infinite-dimensional. [Hint: Suppose it has a finite basis. Show that there is some polynomial that is not a linear combination of this basis.]
45. Extend {1 + x, 1 + x + x²} to a basis for P2.
46. Extend { ... } to a basis for M22.
47. Extend { ... } to a basis for M22.
48. Extend { ... } to a basis for the vector space of symmetric 2×2 matrices.
49. Find a basis for span(1, 1 + x, 2x) in P1.
50. Find a basis for span(1 − 2x, 2x − x², 1 − x², 1 + x²) in P2.
51. Find a basis for span(1 − x, x − x², 1 − x², 1 − 2x + x²) in P2.
52. Find a basis for span( ... ) in M22.
53. Find a basis for span(sin²x, cos²x, cos 2x) in F.
54. Let S = {v1, ..., vn} be a linearly independent set in a vector space V. Show that if v is a vector in V that is not in span(S), then S′ = {v1, ..., vn, v} is still linearly independent.
55. Let S = {v1, ..., vn} be a spanning set for a vector space V. Show that if vn is in span(v1, ..., vn−1), then S′ = {v1, ..., vn−1} is still a spanning set for V.
56. Prove Theorem 6.10(f).
57. Let {v1, ..., vn} be a basis for a vector space V and let c1, ..., cn be nonzero scalars. Prove that {c1v1, ..., cnvn} is also a basis for V.
58. Let {v1, ..., vn} be a basis for a vector space V. Prove that

{v1, v1 + v2, v1 + v2 + v3, ..., v1 + ... + vn}

is also a basis for V.

Let a0, a1, ..., an be n + 1 distinct real numbers. Define polynomials p0(x), p1(x), ..., pn(x) by

p_i(x) = [(x − a0) ... (x − a_(i−1))(x − a_(i+1)) ... (x − an)] / [(a_i − a0) ... (a_i − a_(i−1))(a_i − a_(i+1)) ... (a_i − an)]

These are called the Lagrange polynomials associated with a0, a1, ..., an. [Joseph-Louis Lagrange (1736-1813) was born in Italy but spent most of his life in Germany and France. He made important contributions to such fields as number theory, algebra, astronomy, mechanics, and the calculus of variations. In 1773, Lagrange was the first to give the volume interpretation of a determinant (see Chapter 4).]

59. (a) Compute the Lagrange polynomials associated with a0 = 1, a1 = 2, a2 = 3.
(b) Show, in general, that

p_i(a_j) = 0 if i ≠ j,  p_i(a_j) = 1 if i = j

60. (a) Prove that the set B = {p0(x), p1(x), ..., pn(x)} of Lagrange polynomials is linearly independent in Pn. [Hint: Set c0p0(x) + ... + cnpn(x) = 0 and use Exercise 59(b).]
(b) Deduce that B is a basis for Pn.
61. If q(x) is an arbitrary polynomial in Pn, it follows from Exercise 60(b) that

q(x) = c0p0(x) + ... + cnpn(x)   (1)

for some scalars c0, ..., cn.
(a) Show that c_i = q(a_i) for i = 0, ..., n, and deduce that q(x) = q(a0)p0(x) + ... + q(an)pn(x) is the unique representation of q(x) with respect to the basis B.
(b) Show that for any n + 1 points (a0, c0), (a1, c1), ..., (an, cn) with distinct first components, the function q(x) defined by equation (1) is the unique polynomial of degree at most n that passes through all of the points. This formula is known as the Lagrange interpolation formula. (Compare this formula with Problem 19 in Exploration: Geometric Applications of Determinants in Chapter 4.)
(c) Use the Lagrange interpolation formula to find the polynomial of degree at most 2 that passes through the points
(i) (1, 6), (2, −1), and (3, −2)
(ii) (−1, 10), (0, 5), and (3, 2)
62. Use the Lagrange interpolation formula to show that if a polynomial in Pn has n + 1 zeros, then it must be the zero polynomial.
63. Find a formula for the number of invertible matrices in Mnn(Zp). [Hint: This is the same as determining the number of different bases for Zp^n. (Why?) Count the number of ways to construct a basis for Zp^n, one vector at a time.]
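The Lagrange interpolation formula of Exercises 59-61 is easy to experiment with. A minimal sketch (plain Python; not part of the exercise set):

```python
def lagrange(points):
    """Return the interpolating function q(x) through the given (a_i, c_i)."""
    def basis(i, x):   # the ith Lagrange basis polynomial, evaluated at x
        num = den = 1.0
        for j, (aj, _) in enumerate(points):
            if j != i:
                num *= x - aj
                den *= points[i][0] - aj
        return num / den
    return lambda x: sum(ci * basis(i, x) for i, (_, ci) in enumerate(points))

q = lagrange([(1, 6), (2, -1), (3, -2)])   # Exercise 61(c)(i)
print([q(t) for t in (1, 2, 3)])           # [6.0, -1.0, -2.0]
```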
Magic Squares

The engraving shown on page 465 is Albrecht Durer's Melancholia I (1514). Among the many mathematical artifacts in this engraving is the chart of numbers that hangs on the wall in the upper right-hand corner. (It is enlarged in the detail shown.) Such an array of numbers is known as a magic square. We can think of it as a 4×4 matrix

[16   3   2  13]
[ 5  10  11   8]
[ 9   6   7  12]
[ 4  15  14   1]

Observe that the numbers in each row, in each column, and in both diagonals have the same sum: 34. Observe further that the entries are the integers 1, 2, ..., 16. (Note that Durer cleverly placed the 15 and 14 adjacent to each other in the last row, giving the date of the engraving.) These observations lead to the following definition.

Definition
An n × n matrix M is called a magic square if the sum of the entries is the same in each row, each column, and both diagonals. This common sum is called the weight of M, denoted wt(M). If M is an n × n magic square that contains each of the entries 1, 2, ..., n² exactly once, then M is called a classical magic square.

1. If M is a classical n × n magic square, show that

wt(M) = n(n² + 1)/2

(Hint: Use Exercise 45 in Section 2.4.)

2. Find a classical 3×3 magic square. Find a different one. Are your two examples related in any way?
... by definition. It follows that P = P_{C←B}.
(c) Since {u1, ..., un} is linearly independent in V, the set {[u1]_C, ..., [un]_C} is linearly independent in R^n, by Theorem 6.7. Hence, P_{C←B} = [[u1]_C ... [un]_C] is invertible, by the Fundamental Theorem. For all x in V, we have P_{C←B}[x]_B = [x]_C. Solving for [x]_B, we find that

[x]_B = (P_{C←B})⁻¹[x]_C

for all x in V. Therefore, (P_{C←B})⁻¹ is a matrix that changes bases from C to B. Thus, by the uniqueness property (b), we must have (P_{C←B})⁻¹ = P_{B←C}.
Remarks • You may find it helpful to thin k of change of basis as a transformation (indeed, it is a linear transformation) fro m IR" to itself that simply switches from one coordinate system to another. The transformation corresponding to PC+- B accepts [ X)B as inpu t and returns [xlc as out put; (PC+-S)-l = PB--f! does just the opposite. Figure 6.5 gives a schematic representation of the process.
x
•
I Ie /
"--.. I I.
/ ' Multipliclltion ~
by PC- Ii
R" ngure 6.5 Change of b aS IS
•
•
• R" M uhipl icaLion by
i'u-c "'" (PC-U) - I
• The columns of Pe_ 8 are the coordinate vectors of one basis with respect to the other basis. To remember which basis is which, thlllk of the notation C f - 8 as sayi ng "8 in terms of C." It is also helpful to remember that pc_u[xlu is ,\ linear combination of the columns of Peo- e. Bu t si nce the result of this combination is [xl" the columns of PC_ B must themselves be coordinate vectors wi th respect to C.
Example 6.46
Find the change-of-basis matrices PC_I> and Ptlo-C for the bases 8 :::: {I, x, x 2} and C "" {I + X, x + X l , I + X l) of'lJ 2• Then find the coord inate vector of I'(x) = I + 2x - x 2 with respect to C.
Solullon
Changing to a standard basis is easy. so we fi nd PlJ-c firs t. Observe Ihat the coordinate vectors for C m terms of Bare
o
I
[l+x jB =
I .
o
[X+ Xl ]B =
I , I
I
[ 1 + x 1],6
=
0 I
Section 6.3
Change of lJasis
.U
(Look back at the Rema rk followi ng Example 6.26.) It fo llows that I
PB-c=
0
I
II
O
o
I
I
To find PCo-li. we could express each vector in B as a linear combination of the vectors in C (do this), but it is much casier 10 use the fac l that PC_ B = ( PI:I-C) - I, by Theo rem 6.12(c). We fi nd that
, 1, _1, _1 , 1, ,1 ,1 _ 1, 1, 1
(I'a ...d
PCo- B -
-I
=
It now fo llows that
,1 1,, -~ _1, , 1, _1, 1, I
I
2 - I
2 0 - I
which agrees with Exam ple 6.37.
Rlmlrk
If we do no t need PCo- Bexplicitly, we can find [p(x) ]c fro m [p(x) ]s and PB-c using Gaussi:lll elimination. Row reductIOn produces
(Sec the next section o n using Gauss-Jordan eliminatio n.) It is worth re~ating the observation in Example 6.46: C hanging to a standard basis is easy. If t is the standard basis for a vector space Vand 8 is any other basis, then the columns of Pt _ s are the coordmate vectors of 8 with respect to [ . and these arc usually "visible." We make use of this observation again in the next example.
In M w let 8 be Ihe basis IEII' E21' E12, En ! and leI C be the basis lA, B, C, Dl, where
A-
[10]
00 '
8 -
[II] 00 '
C-
[I ~]. I
D=[ :
Find the change-of-basis matrix PC-I:> and verify that [ Xlc
[; !l
z:::
:]
pc_J X]s
for X =
.12
Chapter 6
Vector Spaces
Solulltn 1 To solve this problem d irectly. we must fi nd the coordinate vectors of B with respect to C. This involves solvi ng four linear com bmation problems of the form X = aA + bB + cC + dD. when' X is in B and we mu st find a, b. c, and d. However, he re we are lucky, since we can fi nd the required coeffi cients by inspection. Clearly. Ell = A, ~ I = - B+ C, EI2 = - A + B,and ~2 = -C+ D. Thus.
[ EI1 ]C =
If X =
[~
0 , 0 0
- I
0 - I
1
1
, [E"k = 1
[E, ,]c=
0
[E,, ]c =
,
0 0
0 0
1
0
- I
0
- I
1
0 0
1
0 0
0
- I 0 0
- I 1
! ]. then 1
3
2 4
and
PC_8[XJ.
=
1
0
-I
0
1
0
- I
1
0
0 0
1
0 0
- I
3 2 4
0
1
=
- I -I - I 4
T h is is the coordinate vector with respect to C of the malrix
-A - 8 -C+ 4D= - [ ~ ~] - [~ ~) -[ : ~]+ 4[ : =
[~
!]
:]
= X
as it sh ould be.
Solullon 2 We can compute PC is Ihe ch and compare YOllr answer with tile one JOUlld In part (a).
3. x =
1
1
o
0 ,6 =
0 , 0
1 , 0
-1 1 c~
1
,
1
0
0
1
, 0
1
1
3
4.x =
I ,8
5 c ~
1
I ,
o,
0
0
1
0
0
I ,
I , 0
0
1
Pco- B
I
in
~j
1
=[_: -~ J
16. Let Band C be bases forQJ> 2. If 6 = {x, I + X, I - x + xl} and t he change-of-basis matrix from B to C is 100 021 - I
I
I
find C. Xl,
X + x 2},
C= {I, 1 + x,xl}in QJ>2
III calmlus, you leam that a Taylor polynomial of degree n
In Exercises 9 and 10, follow the instructiolls for Exercises 1-4 IIsing A instead of x.
~ {[~
{ [ ~], [ ~]} and
find 6 .
8.p(x) = 4 - 2x- x 2,6= {x, 1+
C
III Exercises 11 and 12,jollow the Im tructions for Exercises 1-4 IIsingf(x) ills/ead of x. 11. f(x) = 2 sin x - 3 cos x, 6 = {sin x + cos X, cos xl, C = {sin x, cos x} in span(sin x, cos x) 12. f(x) = sin x, B = {sin x + cos X, cosx},C = {cosx sin x, sin x + cos x} in span (sin x, cos x)
the change-of-basis matrix from B to C is
,
C={I,x,x Z}in~2
o
~ {[~ ~]. [~ C ~ {[~ :J. [ :
B
15. U=I Band C be bases for R'. If C =
III Exercises~. follow the imtructions for Exercises 1-4 IIsing p(x) instead of x. 5. p(x) = 2 - x,6 = {1,x},C = {x, I + x} in 9J'1 6.p(x) = I +3x,6 = {l +x,1 - x},C={2x,4}in Q/'1 2 2 7 . p(x) = I + x , 6 = {I + x + X l, X + X Z, x },
9. A = [ 4
:].
14. Repeat Exercise 13 with 0 = 135°.
in W
o
1
,
1
0 ~
415
13. Rotate the xy-axes in the plane counterclockwise through an angle 8 = 60° to obtain new x' y' -axes. Usc the methods of this section to find (a) the x' y' -coordinates of the point whose xy-coordinates are (3, 2) and (b) the xy-coordinatcs of the point whose x' y' -coordina tes are (4, - 4).
0
o
IO. A=[ :
Change of Basis
about a is a polYllomial of the form
2 ], 6 = the standard basis,
ao + aj(x - a) + a,(x - a)l + ... + a,,(x where an *" o. In other words, it is a polynomial/hat has
-:J. [:
beell expanded ill terms of powers of x - a illstead of powers of x. Taylor polYllomials are very useful for approximating futlc/ions that are "well behaved" lIear x = a.
- 1
~J. [~
:J. [~
~]} in M"
p(x)
=
at
Chapter 6 Vee/or Spaces
416
The set 8 = {I, x - a, (x - af .. ., (x - a)If} is a basis for9P nfor(IIIY real /IIIII/ ber (/. (Do YOI I see a quick way to show tills? Try uSing Throrem 6.7.) TIIis fact allows us to lise the techniqlles of tllis section (0 rewrite a polYllomial as a 1(lylor poiYllolllwl abollt a given a.
be bases for a timte-dlmensional vecfO r space V. Prove that
21 . Let 8. C, and V
22. Le t V be an II-dimensional ve .. ---+ '!J'1t and T: ~ It ---+ IlJ> It by
S(p(x)) - p(x + 1) ,nd
T(I~ x))
- p'(x)
Find (S 0 T)(p(x» and (To S)(p(x)}. ( H mt: Remember the Cham Rule.) ~28. Defin e linear transfo rmation s 5:
'lP .. ---+ 'lP and It
T : IlJ> It ---+ IlJ> It by
S(p(x)) - p(x + 1) ,nd F; nd (5 0
T)~x))
"i,p(x)) - xp'(x)
, nd (T o 5)(p(x)).
y
y
3x
x- y - 3x+4y
y
and T :1R1---+1R2
1
30. S: Q/' I ---+ eJ> I d efined by S( a + bx) = ( - 4a and T:
=
3
The kernel of DIS the set of all constant polynomials: ker( D) = la : (I in RI in Rl. Hence, III is a basis fo r ker (D), so
=
Ia ' I : a
nullity(D) = dim(ker{D)) = I
Example 6.65
rind the rank and the null ity of the linear transformation S: '3'1"" R defined by
~ i'P(X)dX ,
S1 p(x))
SOluiloR
r rom Example 6.6 1. ra nge(S) kereS) =
=
Rand rank(S) = dim R = I. Also,
{ - ~ + bx :bin R}
= { b( - ~
+ x): bin IR} = span{ -~ + x) so {-1 + x} isa basis fo r ker(S) . Therefore, null ity(S) = dim(ker (S» = 1.
Example 6.66
Find the nm k and the nullity of the linear transfor mation T: M 22 .." M n defi ned by T(A) = AT.
SoJulion
In Example 6.62, we fo und that range(T) = Mu and ker( T ) = 101. Hence, rank( T)
=
di m
M 22
= 4
and
nulli ty{T) = dim{O}
=
0
In Chapter 3, we saw that the rank and nul lity of an //IX 11 matrix A are related by the formula rank(A) + nutlity(A l = II. This is the Rank Theorem (Theorem 3.26). Since the matrix transformation T = TAhas An as Its domain, we could rewrite the relationship as rank(A) + nUIliry(A) = dim R" This version of the Rank Theorem extends very nicely to general hnear transformations, as you Gill see from the last three examples: rankeD) + nUllity(D) = 3 + I = 4 = dim W' j
Example 6.64
rank(S) + nullity(S) = I + I = 2 = dim W' I
Example 6.65
rank(T) + nulllty( T)
Example 6.66
=
4 + 0 = 4 = dim M n
49G
Chapter 6
Vector Spaces
Theorem 6.19
The Rank Theorem Let T: V ----+ Wbe a linear transformation from a fi n ite·dimensional vector space Vinlo a vector space W. Then rank( T)
+ nUllity(T)
= d im V
In the next section, you will see how to adapt the proof of Theorem 3.26 to prove this version of the result. For now, we give an alternative proof that does not use matrices.
n and let IvI' . . . , v k} be a basis for ker( T)(so that nullity( T) = dim(ker ( T» = k). Since {v l" ' " vd is a linearly independent set, it can be extended to a basis for V, by Theorem 6.28. Let B = {VI' ... , vi' vHI " .. , v n} be such a basis. If we can show that the set C = IT(v k+ , ), . . • , T(v n ») is a basis fo r rangeCT) , then we will have rank( T) = dim(rangc(T» = /I - k and thus
Prool Let dim
V=
rank(T)
+ nullity(T)
= k + (n -
k)
= n = dim V
as required. Certainly C is contained in the range of T. To show that C spans the range of T, let T(v) be a vector in the range of T. Then v is in V, and since 13 is a basis for V, we can fi nd scalars , en such that
'I" ..
Since v I" .. , v k are in the kernel of T, we have T(v, )
= ... ""
T(vt ) = 0, so
T( v) "" T('lv J + ... + 'tVt + CUJ Vk+1 + ... + ' wvw) = c,T(v, ) + .. + 'iT(Vt) + Ct+IT(Vt+l ) + ... + ,"T(v") = ' HI T(Vl+l) + ... + c" T(v w) This shows that the range of Tis spanned by C. To show that C is linearly independent, suppose that there are scalars 't , I" .. , ' " such that 'l +I T(Vl+ l )
+ ... + cn T(v n)
=
0
Then T(ck+IVl+1 + ... + 'nvnl = 0, which means that 'H IVk+1 + ... + '"v" is in the kernel of T and IS, hence, expressible as a linear combination of the basis vectors V I " ' " Vi of ker( T)-say,
Bul now a nd the linear independence of B forces ' I = ... = ' n = O. In particular, C1+1 = .. . "" ' " = 0, whICh means C is linearly independent. We have shown that C is a basis for the range of T,sa, by our com ments above, the proof is complete. We have verified the Rank Theorem for Examples 6.64 , 6.65, and 6.66. In practice, this theorem allows us to fi nd the rank and nullity of a linear transfo rmation with on ly half the work. The following examples illustrate the process.
&ction 6.5 The Kernel and Range of II Linear Transformation
(KImple 6.6J
Find the rank 3nd nullity of the line3r transformation T: fjl2 ----i' 1'(p(x» ". xp(x). (Check that Treally is linear.)
Solution
491
W>j defined by
In detail, we h3ve
T(a
+ bx + ex l ) = ax + bx 2 +
ex J
It follows that
ker(T) = {a + bx + cx 2 : T(a + /IX + cx l ) = O} = {a + bx + cx 2 : ax + bx 1 + ex ) = O}
{a + bx + ex 2 : a = /, - e = O} = (0) dim (ker( T» = O. The Rank Theorem implies th3t =
so we have nullity(T)
=
rank( T ) = dim ~2
-
nullity(T) = 3 - 0 = 3
1I,.,rll In Example 6.67, it would be just as easy to find the rank of T first, since Ix.;, KI is easily secn to be a basis for the range of T. Usually, though, one of the two (the rank or the nu llity of a linear transformation) will be easier to compute; the Rank Theorem can then be used to fi nd the ot her. Wi th practice, you will become belter at knowi ng which way to proceed.
(KImple 6.68
Let W be the vector space of all symmetric 2 X2 matrices. Define a linear transformation T : W----i' QIl 1 by
1: ~]=(a-b)+(b-c)x+(c-a)xl
(Che (r'. T )(v ) = 0 =>1(v)= O ::::} v = 0
which establishes that ke r( T ) = {o}. Therefore. T is one-Io-one, by Theorem 6.20. To show th::1t Tis onto, let w be in Wand let v = T -I(w ). Then
T( v ) = T(r '(w» = (T . r ')(w) = I(w)
= w wh ich shows that w is the image of v under T. Since v is in V. this shows that Ti s on to. Conversely, assume that T is one-to -one and on to. Th is means that nullity( T ) = 0 and Tank( n = d im W. We need to show that there exists a linear transformation T': W-+ Vsuch that Y' 0 T = I" and T o T' = 1w. lei w be in W. Since Tis on to, there exists some vector v in Vsuch Ihal T(v) = w. There is only one such vector v, since, ifv' is another vector in V such that T(v ' ) = w, then T(v ) = T(v ') ; the fact thaI Tis on e-Io -One then implies that v = v'. It therefore makes sense to define a ma pping T' : W ~ Vby setting T'( w ) = v. It follows that
(T" a nd
T) (v ) = T'(T(v » = T' (w) = v
(T . '[") (w) = T(T' (w » = T(v ) = w
then follows that T' 0 T = Iv and T o T' = 1w' Now we must show that T' is a IiI/ear tra nsformation. To this end, let w, and W2 be in W and leI ci and '
]t
v,
1" (C I W I
+
C:/w l ) = r (c, T(v l )
r (T(c,v l
I(c,v,
+ !1: T(v2 + CzV2»
+ 'tv,)
Conseq uen tly, y ' is li near, so, by Theorem 6.1 7,
r =
T - I•
»
Section 6.5 The Kernd and Range ofa Lmear Transformation
The WQrds ;w/llorpl,ism an,d~ i5()/llorpir;c are derived from tile Cretk words isoJ., meaning -equal, and morph, meaning "shape.~ Thus, figuratively speaking, isomorphic ~~tor ~pa(es h;we "equal shapes....
411
Isomorphisms 01 VeCIOr Spacls Wc now arc in a position to describe, spaces to be "essentially the sam e."
10
concrete terms, what it means fo r two vecto r
Ii
Oilluitioa
A linear transformation T: V -+ Wis called an isomorphism jf it
is o ne-la-one and on to. If Vand Ware two vector spaces such that the re is an isomorphism from V to W, then we say that V is isomorphic to \V and write V ::: 1\'. .
Example 6.12
Show that eP~_. and IR " are isomo rph ic.
Solullol
The process of fo rmmg the coordinate vecto r of a polynomial provides us wi th one possible isomorphism (as we observed alrc:ldy in Seclion 6.2, although we did not use the term iso m orp}IIS/li there). Specifically, define T : ~ ,,_ I -+ R" by T(p( x))'" [ ~ x)][, whe re t: = { I, X, • .. ,r'} is the standard baSIS for~,,_ •. ThaI is,
a,
Theo rem 6.6 shows thaI T is a linear transfo rmation . If p( x) = "0 1l,._ lx"-' is in the kernel of T, then
+
a,x
+ ...
o
o Hence, (10 = Il, = ... = a~ _ 1 = 0, so p(x) = O. Therefore. kerO') = 10 1. and Tis oneto-one. Since dim ?PIf_ I = dim R~ = fI, T is also onlO. by Theorem 6.2 1. Thus. T is an isomo rphism , and IJP ~_ I -= R".
Example 6.13
Show that M m " and R ..... are isomorphic. Once again, the coordinate mapping from M m" to Rm. (as in Example 6. 36) is an isomorphism . The details o f the proof are left as an e xeTCI s~ .
SOliliol
In fu ct. the easiest way to tell if two vector spaces are isomorphic is simply to check thei r dimensions. as the nexltheorem shows.
498
Cha pt~r6
Vector Spa(e5
r'
•
•
3
leI Vand Wbe two finite-dimensional vector spaces. Then Vis isomorphic 10 W if and only if dim V = dim W. 7"
57
!
I
PrOD' Let" = dim V. If V is isomorphic to W, then there is an isomo rphism T: V-+ W. Since Tis one-to-o ne, nullity(T) = O. The Rank Theorem then implies that
rank( T) ::: dim V - nullity( T) "" n - 0 =
11
Therefore, the range of T is an n·dimensional subspace of W. But, since 1'is on to, W "" range( T), so dim W = fI, as we wished to show. Conversely, assume that V and W have the same dimension, 11. Let B '" {V I ' " ., v~} be a basis for Vand lei C '" jw p . •• , wnl be a basis for W. We will define a linear transformation T: V -+ Wand then show th at Tis one-Io-one and o nto. An ar bitrary vector v in V can be written uniquely as a linear combination of the vectors ill the basis B-say,
We define T by
+,,·+c w T{ v ) =cI wI n" It is straightforward to check that Tis linear. (Do so.) 'Ib sec that Tis o ne-Io-one, suppose v is in the kernel o f T. Then
and the linear independence o f C forces 'I '" . . . = ' " = O. But then
v "'cIvI +·,,+,,," v = 0 so ker (T) "" IO}, meaning thaI Tis one-to -one. Si nce dim V = dim W, Tis also on to, by Theorem 6.21. Therefore, T is an isomorphism, a nd V .. W.
hample 6.14
Show that R" and 'lP" a re not isomorphic. Since dim Rn = Theorem 6.25.
S.IIU"
hamPl1 6.15
11
"*
n + I = d im ~ ", IR:" and (jp" are not isomorphic. by
Let Wbe the vector space of all symmetric 2X 2 matrices. Show that W is isomorphic to R).
SDlull"l In Example 6.42. we showed that dim W = 3. Hence, dim W = dim Rl, so W~ R l , by Theorem 6.25. (There is an obvious candidate for an Isomorphism T: W~ RJ. Whatisit ?)
Section 6.5
The Kernel and Range of a Linear Transformatio n
tl9
1I'.'fll Our examples have all been re(l/ vector spaces, but the theorems we have proved arc true for vector spaces over the complex numbers C or Zr' where p is prime. For example, the vector space M n (Z2) of a1l2X2 matrices with entries from 1:.2 has dimcnsio n 4 as a vector space over 1.2' a nd hence M n(Zz) iii Z~.
Exercises 6.5 (a) Which, if any, of the fo llowing polynomials are in kcr( T)? (i) 2 (ii) X l (iii) 1 - x (b) Which, if any, of the polynomia ls in parI (a ) are in
1. Let T: Mn ~ M22 be the linear transformation
defined by
range(T)? (c) Describe ker( T) and range(T).
(a) Wh ich, If any, of the following matrices are in ker( T)?
(i) [
I - I
'] 3
(ii)
[~ ~]
(iii)
[3o 0] -3
(b) Which. if any, of the matrices in part (a ) are in range( T)? (cl Describe ker ( T) and ra nge( T) . 2. Let T: Mn ~ IR be the linea r transforma tion defi ned by 'I\A) ~ " (A), (a) Which. if any, of the followmg matrices are in ker( T)? (i) [_ : : ]
(ii)
[~~]
(iii)
[ ~ _~]
(b) Which, if any, of the following scalars arc in range( T)? (i) 0
(ii) - 2
III Exercises 5-8, filld bases for the keme/ (///(/ mnge of the linear tralls!ormatlOn s T ill the indicated exercises. In each case, state the IIl1/1ity CHId r(wk ofT alld verify tile Rank Th eorem. 5. Exerose 1
6. Exercise 2
7. Exercise 3
8. Exercise 4
III Exercises 9-14,fHlci either the 1I11!lJty or the milk ofT and then li se the Rall k Theorem to find the other. 9. 10,
B~[
3, Let T: '1J>2 ~ [R2 be the linear t.ransformation defined by
+
bx
+ a! }
=
c ["b -+ b]
B ~[
[~l
- I
-I] 1
1-I]
- I
1
~ 13.
(ii) x - x 2
(iii) I
+x-
x2
(b) Which, if any, of thc following vecto rs are in range( T)? (i)
1
12. T: M22 ~ MZ2 defined by T(A} = AB - BA, \",here
(a) Which, if any, of the following polynomials arc in ker(T)? (i ) 1 + x
T ~,-+R' d'finedbyT(p(x» ~ [~~~n
11. T: M22 ~ M12 defined by 'It A) "" AB, where
(iii) I/ Vi
(cl Describe ker(T) and range( T).
'{{a
T:Mu _ W definedbYT[~ ~] = [~ = ;]
(ii)
[~l
(iii)
[~l
(c) Describe ker(T) and range(T). 4. Let T: 'lP 1 -+ 'lP 1 be the linear t ransformatIon defined by T(p(x» ~ xp'(x),
T : 1J>2_IR defi ned by T(p(x)) = p'(O) 14. T:MJJ ~MjJ de fi nedby 'J1A} = A - AT
Itt Exercises 15-20, determit1e wlletller I/Ie linear transfor-
mallOn T is (a) aile-la-one alld (b) 011/0. 15. T: 1H z _ [Rl defined by T [ x] = Y
[2X - Y] x + 2y
588
Cha pter 6 Vector Sp:lces
ee[0, 2]. 32, Show that (€[a, bJ - C€[ c, d] fo r all a < ba nd c < d. 31. Sho w that C(6[ 0, I J -
x - 2y 3x
+Y
x +y 2a - b
a + b - 3c
18. H I" --+11' dcfi ncd by 'I11'(x» = n
19. T : llt~ M22 d efi n edby T U
,
[;~~n
a+b+ [ b - 2c
C
2']
,
(a) Prove that if 5 and T are both o ne-to -one, so IS SoT. (b ) Prove that if 5 and T are both on to, so is SoT.
34. Let 5: V -+ Wand T: U --'" V be linear tra nsformations.
__ ["" +bb bb -+ ,,]
(a) Prove that If 5 0 T is o ne-to -one, so is T (b ) Prove that If S o T is onto, so is S. 35. Let T: V --'" W be a linear tra nsfo rmatio n between two fi nite-d im ensional vector spaces.
a 20. T: R J --'" W defi ned by T b
33. Let 5: V ~ W and T : U --'" V be linear transfo rm ations.
=
b, where W is the vector space o f a- ,
all symmet ric 2 X2 matrices
(a) Prove that if di m V < dim W, then Tcan not be onto. (b) Prove that if dim V> d im W, then 1'can not be one- to-o ne. 36, Let no, (1, •••• , a" be n + I distinct real n umbe r~. Defin e T: W'" -+ IR:"-t l b y
In Exercises 2/-26, determine whether Vand Wa re
T(p(x) =
isomorphIC. If they are, gIVe an explicit IsomQrphism T: V~ W. 22. V = 53 (sym metric 3 X 3 matrices) , W = UJ (upper t riangu lar 3 X 3 mat rices)
53 (skcw-
24. V = !j>,. IV = (P(x) in !j>" P(O ) = 0) •• 101
25. V = C, W = R2 26. V = {A in M" , u(A) = 0). W =
Il'
27. Show that T:~n ~ ~n defi ned by T{p(x» = p(x) p'(x) is an isomorphism. 28. Show that T:'lP n --'" 'lP n d efined by is an isomorphism. 29.
+
T(p(x» = p(x - 2)
~how that T:~n--"' 'lPn defined by T(p(x»)
=
xnp(; )
IS an Isomorphism. 30. (a) Show that (£[0, I ] - '{; [2, 3]. [ Hint: Define T: C€ (0, 11--'" C€ [2,3] by letti ng T(f) be the functio n whose value at x IS (T(f))(x) = f(x - 2) for x in
[2.3[.[ (b ) Show that l --+ \jll be the d ifferen tial opera tor D( p(x)) "" p'(x). Let B = {I, oX. x ' , Xl } and C "" !1, x, xl) h e bases for ~3 and '!P2' respectively. (a) Find the matrix A of D with respect to Band C.
(b) r ind the matrix A' of D With respect to B' and C, where B' = {x', Xl,
+
ec) Usi ng pa rt (a), compute 0(5 - x Theorem 6.26.
SolulioD Fi rst note that D(a ple 6.60.)
+
bx
2K) and D(a
+ a! +
+
dx}) = b
bx
+
+
2ex
cx 2
+
X,
I}.
+ dx' ) to verify
3dx 2• (See Exam-
(a) Since the images of the basis B unde r Dare D( I) = 0, D(x) = I, D( xZ) = 2x, and D(x J ) = 3Xl , their coordinate vecto rs with respect to Care
[D( ll ]c ~
o
I
0 , [ D(x1 ]c ~
0,
o
o
0
[D(x') ],
~
2,
0
[D(x' )],
~
o
Consequently,
A ~ [D],_.
~
[[ D(1l], I [D(xllc i [ D(.'l ], I [D(x')] , ]
o =
0
1 0
0 2
0 0
o
0
0
3
( b) Since the basis 8 ' is just 8 in the reverse order, we see that
A' ~ [D], _G' ~ [[ D(x'l ],
I [ D(.') ]c I [ D(x1]c I [ D( ll ]cl
o o
0
I
0
2
0
0
3
0
0
0
0 3
Section 6.6 The Matrix of a Linear Transformati on
505
(This shows that the order o f the vectors in the bases Band C affects the matrix of a transformation with respect to these bases.)
(c) First we compute 0(5 - x
+ 2Xl)
=
- 1 + 6x 2 di rectly, getting the coordinate
vector - \
[D{S - x + 2xJ)]c = [ - 1 + 6x 2 ]c
0
=
6 On the other hand,
; - \
o 2
A[S - x+ 2X)J8 =
which
0
\
0
0
0
2 0
0
0
0
5
0
- \
- \
=
0
3
= ( D(5 - x + 2x')Je
0 6
2
'g"" w"h Thcoc,m 6.26. We I"y, proof of \h, gen",1 G'''' " , n
'''': t
Sin ce the linear tra nsformat ion in Example 6.77 is easy to usc di rectly, there is rc· ally no advantage to using the matrix of this transfo rmation 10 do calculat ions. However, in other cxamples-cspecially large o nes-the matrix app roach may be simpler, as it is very well-suited to compu ter impleme ntation. Example 6.78 illustrates the basic idea behind this indirect approach.
Example 6.18
Let T : ~l ---+ '!f2 be the linear transformat ion defined by
'f(p(x» -
P(2x - \)
(a) Find the mat rix of Twith respect to E = {I. x, xl}. (b ) Compute T (3
SoluUon
+ 2x
- x 2 ) indlfectly, using part (a).
(a) Wc sce thaI
T(\ )
= I,
= 2x -
T(x)
I,
T(x 2 ) = (2x - 1)2 = I - 4x + 4x~
so the coordina te vecto rs arc \
(T(I)J, =
- \
I
0 • ( T(x)J, =
2 • ( T(x' )J, -
o
o
- 4 4
Therefore,
(T],
= ([T( I)], I (T(x) J, I (,[,(x')J,] =
\
- \
\
0
2
- 4
o
o
4
51&
Chapter 6
Vector Spaces
(b ) We apply Th eorem 6.26 as follows: The coordinate vector of p(x) = 3 with respect to [: is
+ 2x - xl
3
[p(xl ], =
2 - ]
Therefore, by Theorem 6.26,
[T(3 + 2x - x' l ], = [ T(p(x)) ], = [ T],[p(xl ], = It follows that 7(3 + 2x computing T{3 + 2x- x 2 )
I
- I
I
3
0
2
- 4
2
o
0
4
- I
0
=
8 - 4
+ 8-x - 4_x 2 = 8x - 4x 2. (Verify this
Xi )
= 0- 1
= 3
+ 2(2x -
by
I) - (2x - I)l directly.)
The matrIX of a linear tra nsformation can sometimes be used in surpn smg ways. Example 6.79 shows its applicn tion to a traditional ca lculus problem.
Example 6.19
~
Let Qb be the vector space o f all diffe rentiable funct ions. ConSider the subspace IV of Qb given by IV ::: spa n(e JX , xe)K, x 2e3K ). Since the set 13 = f e-'x, xc' ''' x 2e IS linearly independent (why?), it is a basis for IV.
'x }
(a) Show that the d ifferenti al operator D ma ps IV into itself. (b) Find the ma trix of D wi th respect to 13.
(c) Compute the derivative o f Sel X + 2xelK- xle lx ind irectly, using Theorem 6.26, ilnd verify II using part (a).
50lulloo
(.1) Applying D 10 a general element of W, we see that
D(aelx + bxelx + cx 2e 'f=ars, on the surface, to be a calculus proble m. We will explore this idea further in Example 6.83.
Example 6.80
Let V be an ,,·dimensional vector space and let I be tne identity transformation on V. What is the matrix of I with respc
X,
nal matrix.
Solution
(a) In Example 6.78, we found that the matrix of T wit h respect to the standard basis £ = {I, X, Xl} is
1 [71,~
- \
0
1
2-' 0 ,
o The change-of-ba sis ma trix fro m B to £ is 1
P = PCo-- 8
==
1 0
1
-I
0
o
0
1
It foll ows thai the matrix of Twl1h respect 10 {3 is
[ T).
~
~
~
!"[ TV
,, 1, , _1,
1
0
1
- I
1
1
0
2 0
-4
1
0
0 1
4
0
0
1 0 -I 2
0
-,,,
0
0
1 0 - I 0
0
1
,
4
(Check this.)
(b ) The eigenvalues of [Tlc are 1,2, and 4 (why?), so we kn ow from Theorem 4.25 that ['[1,.: is diagonalizable. Eigenvectors corresponding to these eigenvalues are I
- 1
0,
1,
o
0
1
-2
1
respectively. Therefore, setti ng 1 p ~
nc
- I
1
0
1 -2
o
o
1
and
D ==
1 0 0 020 o 0 ,
p = D. Furthermore, P is the change-of-basis matrix from a basis we h[lve p- I[ C to £, and the col um ns o f P are th us the coordinate vectors of C in terms o f E. It follows that
C == { I, - I and l1') c == D.
+ x, 1 - 2x + Xl}
516
Chapter 6 Vector Spaces
The preceding ideas can be generalized to relate the matrices [1]c0-6 and [TJeo-B' of a linear transformation T : V --+ IV, where 8 and 6' are bases for Va nd C and C' a re bases fo r W. (Sec Exercise 44. ) We conclude this section by revisiting the Fundamental Theorem of Invertible Matrices and incorpo rating some results fro m this chapter. 1&.
Theora. 6.30
T he Fundamental Theorem of Invertible Matrices: Version 4 Let A be an /IX II matrix and let T: V --+ IV be a linear transformation whose mat rix [J]co-u with respect to bases 6 and C of V and W, res pectively, is A. The following statements are equivalent: a. A is invertible. b. Ax :::: b has a unique solution for every b in R". c. Ax :::: 0 has o nly the trivial solution. d. The reduced row echelon for m of A is 1..e. A is a prod uct of elemen tary matrices. f. rank(A) = /I g. nullity(A) = 0 h. The column vectors of A are li nearly independent. i. The column vectors of A span IR". j . The colum n vectors of A fo rm a basis for H". k. The row vectors of A are linearly independen t. I. The row vectors of A span IR ". m . The row vectors of A form a basis for R". n . det A,#O o. 0 is not an eigenvalue of A. p. T is invertible. q. Tis one·to· o ne. r. T is onto. s. ker(T)::::! OI t . range( T) :::: W
Prool The equ ivalence (q) ¢:;> (s) is Th eorem 6.20, and (r) ¢:;> (t) is the defini tion of o nto. Since A is /l Xn, we must have dim V :::: dim W = n. From Theorems 6.21 and 6.24, we get (p) ¢:;> (q) ¢:;> ( r). Finally, we connect the last fi ve statemen ts to the others by Theorem 6.28, which implies that (a ) ~ (p).
In Exercises 1- 12,jilld the matrix [Tb_6 of the linear transformation T: V --+ IV with respect to the bases 6 alld C of V and W, respectively. Verify Theorem 6.26 for the vector v by computing T( v) directly and using the theorem. I. T:
bx) = b - ax, B = C = {I. x}, v = p(x} = 4 + 2x
2. T : 9/'1 _ 9/'1 defmed by T(a + bx) = b - ax, 6 = 11 + X, 1 - x},C = {l,x}, v :::: p(x) :::: 4 + 2x 3. T: 'lJ> Z--+!J>l defm ed by T ( p{x) = p{x + 2), B = {i. >; x' }. C = {I. x + 2. (x + 2)' }. v =
p(x ) = a + bx + cx 2
Section 6.6
4. T:~ ! --+!J1 defined by T( p(x» = p{x + 2),
B - {I, x + 2, (x + 2)'), e - {I, x, x' }. v "" p(x) "" a
+
bx
2
(a) Show that the d ifferential operator D maps Winto itself. (b ) Find the mal rixof Dwith respect to B = {fL~l}' V =
511
1l. 14. Consider the subspace W o f 2b, given by W - span (C'", e- 1., "),
+d
5. T ;@>l --+ R d efinedbyT(p(x» =
The MatriJ( of a Linea r TransformatIon
A = [:
~]
13. Consider the subspace Wof 2'b. given by W = span (slll X, cos x). (a) Show that the differential operator D maps IV into itself. (b) Find the matrix of 0 with respect to B = {sin x, cos x}. (c) Compute the d erivative of fix) "" 3 sin x - 5 cos x indireclly. using Theorem 6.26, and verify that it agrees with r ex) as computed directly.
18. T: ~' -+ ~l definedbyT(P(x» = p(x+ I), S:~l-+~ldefi n edbyS{p(x» - p(x+ I),
B: {I, x},e - V - {I,x,x'}
III Exercises 19-26, determine wller/ler tire lillear trfillsformatioll T is invertible by considering its matrix witll respect to the standard bases. 1f T is invertible. lise Theorem 6.28 and tile metlrod of Example 6.82 /0 fiml T- ' . 19. Tin Exercise 1
20. Tin Exercise 5
21. Tin ExeTClse3 22. T: (jJ> I -+ '!P 2 defi ned by T(p(x» = p' (xl . T: ~l-+'!Pl defined by T(p(x»
= p(x) + p' (x)
Chapler 6 Vector Spaces
51.
24. T: Mn --+ MZ2 defi ned by T(A ) = AU, when!:
it to compute the o rthogonal projection of v o nto W, where 3 v -
25. T in Exercise II
2
26. T In Exercise 12
C()1l1pare your answer with Example 5.11. I Him: Find an orthogonal decomposition oflVas W = w + W1. using an o rthogonal baSIs for W. See Example 5.3.1 39. Let T: V--+- Wbe a li near transform.llion between finite- dimensional vecto r spaces and let Band C be bases fo r Vand W, respectively. Show that the matnx of Twith respect to Ba nd C is un ique. That is, If A IS a matrix such that A[ v]o = [ T(v)] c for all v in V, then A = [1'Jc_B' {Hi"t: Find values of v that will show this. one column at a time.]
~ 11, ExerCIses 27-30, use the method of Example 6 83 to eVal'lnte the given inlrgml.
f 28. f
27.
(si n x - 3 cos x )llx (See Exercise 13.) Se- z.. (ix (Sce Exercise 14. )
29.
J (;S cos x -
30.
f
2i-" si n x) dx(See Exercise 15.)
(xcos x + xsin x ) dx(See Exercise 16.)
III Exercises 31-36, a lillear Ir(msfor/1lf1liml T: V--+ Vis give" If possil1le, find a 1n,sis C for V S'ld, IIJat the matrix [T1- ofT will, respect 10 C is dlagollal.
'1. T: R2 -t Rl definedbyJal ~ [ - 4b 1 Jl a +5b
33.
1:]
=
(tl
41. Show that rank(T) = rank(A).
[: +~]
T :gJI, --+ ~, d e finedbyT( a + bx) = (4a
42. If V = Wand lJ = C, show that Tis diagonalizable if and only if A is diagonalizable. + 2b)
+
+ 3b)x
34. T: @I, --+ ~ Zdcfined by T(p(x )) = p{x + I) l.&.35. T :@I,--+gJIl defined by T(p(x» = p(x) + xp'(x) 36. T : t~\ --+ ~2 definedby T(p(x)) = p(3x+ 2) 37. Let
ebe the line thro ugh the o n gin in R' with dire
~
G1
~
0 x
flaur • •.21
-" p( - x) ~ p(x))
III QrlestiollS 11- 13, determine whether T is a linear trmlsformation. II . T:
R2 --+ JRl d efined by "/1x) = yxTy, where y =
1 [2 ]
12. T : Mnn --+ M nn defined by T( A ) = A TA 13. T: Ql'n -+9Pndefi ned by T(p(x)) = p(2x- 1)
14. If T: W' l --+ M21 is a linear transfo rmation such that
T(I) ~[ ~ nT(I +X)~ [~ T( J + x + x 2 ) =
0 -I] [ I
:],nd
O. fi nd T(5 - 3x + 2x2 ).
15. Find the null ity of th e linea r transformation T: M m , --+ IR defin ed by T(A) = tr(A). 16. Let W be the vector space of upper triangular 2 X2
m atrices.
=a+ c=b+d } 4. V ~
I O. Find the change-of- basis mat rices Pc..../) and PB.....c with respect to the bases B = I 1, 1 + x, 1 + x + x"} and C = Ii + x,x+ xl, I + xl} of~ 2 '
(a) Find a linear transformation T : Mn --+ M21 such that ker(T) = W. (b ) Find a linear transform ation T : M12 --+ Mn s uch tha t range( T ) = W.
17. Find the matrix I T lc....a o f the linear transformation T in Question 14 with respect to the standa rd bases B = {I, x, Xl) ofQJl 2 and C = {Ell' E12 , ~ L> !;,2} of M n· 18. Let 5 = {VI' ... , v n} be a set of vectors in a vector space V with the property that every vecto r in V can be written as a linear combination of V I' .•. , V n in exactly one way. Prove that 5 is a basis for V. 19. If T: U --+ V and 5: V --+ Ware li nea r transformations such that ra n ge(T) C ker(S), wha t can be deduced about So T? 20. Let T: V --+ V be a linear transformatio n, and let {VI' ... , v n } be a basis fo r V such tha t {T(v , ), ... , T(v n )} is also a basis fo r V. Prove that Tis inve rtible.
iSlanc I
A stralght/ine may be the shortest dislmlce betwun two points, but il
is by no lIlea/lS the most . . mteresrmg. -Doctor Who In ''The Time Monster" By Robert Sloman BBC,1972
A/though Illis may seem a pnradox, all exact sCIence is dominated by the idea of approximation. - Bertrand Russell In W. H. Auden and L. Kronenberger, eds. Tile Vikillg Book of Aphorisms Viking, 1962, p. 263
B
A
Fluurll.1 Taxicab distance
538
Ii
1.0 Introduction: Taxicab Geometrll We live in a three-dime nsional Euclid ean wo rld , and therefore, concepts fro m Euclidean geometry govern our way of looking at the world. In particular, imagine stopping people on the street and asking them to fill in the blank in the following ." They will a lmost sen tence: "The shortest distance between two points is a certainly respond \vith "straigh t line." There a re, however, other equall y sensible and intuitive notions of d istance. By allowing ourselves to think of "distance" in a more flexible way, we will open the door to the possibility o f having a "distance" between polynomials, funct ions, mat rices, and many other objects that arise in li near algebra. In this section, you will dIscover a type of "distance" that is every bit as real as the straight-line distance you are used to from Euchdean geometry (the one that is a consequence of Pythagoras' Theorem). As you'll see, this new type o f "distance" still behaves in som e fam iliar ways. Suppose you are standing at an mtersection lt1 a city, trying to get 10 a restaurant a l anolher intersection . If you ask someone how far il is to the restaurant, that person is unlikely to measure distance "as the c row flies " (i .e., usmg the Euclidean version of distance). Instead, the response will be someth ing like " It's five blocks away." Since thIS is the way taxicab drivers measure dis tance, we will refer to this notion of "dIstance" as taxicab distance, Figure 7.1 shows a n exam ple of taxicab d istance. The sho rtest path from A to B req uires traversing the Sides of five city blocks. Notice that although there is more than one route from A to B, all shortest ro utes requ ire th ree horizontal moves and two ve n ical moves, where a "move" corresponds to the SIde of one city block. (How many shortest routes are there from A to B?) Therefore, the taxicab distance from A to B is 5. Idealizmg thIS situation, we will assume that all blocks a re unit squares, and we WIll use the notatIon d,(A, B) for the taxicab distance from A to B.
Problell 1 Find the taxicab distance between the followrng pairs of points:
(,) ( 1,2)ood(S,5)
(b) (2,4),nd (3, - 2)
(e) (0,0) ,nd (- 4, - 3)
(d) (- 2,3) ,nd (I, 3)
(e) (I, D and{ -L D
(f) (2.5,4 .6)and(3 . 1,1.5)
Section 7.0
Introduction: Taxicab Geometry
539
Proble. 2 Which of the following is the correct formula for the taxicab distance d ,(A, 8 ) between A = (a l • a2) and B = ( hI> b2)?
(,) d,(A, B) ~ (a, - b,) + (a, - b,) (b) d,(A, B) ~ (la,1- lb,l) + (I.,I- lb,l) «) d,(A, B) ~ la, - & ,1 + la, - b,1 We can d efi ne the taxicab " orm of a ve2' (For example, if p(x) = I - 5x + 6 + 2x then (p(x), q(x» = 1· 6 + (- 5) . 2 + 3 · ( - 1) "" -7.)
r,
3r and q(x)
E:
SOlutiOD Since (jp 2 is isomorphic to H), we need only show that the dot product in R~ is an inner product, which we have already established.
Section 7.1
Example 1.5
Inner Product Spaces
543
Let f and g be in «5 [a, h] , the vector space of all continuous fun ctions on the closed interval [a, bJ. Show that
(f, g) defines an inner product on '€ [a,
Solution
We have
(f, g)
~
r
[( x)g(x) dx
•
hI.
r
~
~
[(x)g(x) dx
•
r
g(x)[(x ) dx
~ (g,f)
•
Also, if Ii is in '€ la, hI , then
(f, g + h) =
r r
f(x)(g(x) + h(x)) dx
•
([(x)g(x) + [(x)h(x)) dx
•
r
[(x)g(x) dx +
•
= (f,g) + (f, h) If c is a scalar, then
(of, g)
~
r
r•
[(x)h( x) dx
,[(x)g(x) dx
•
~
,r
[(x)g(x) dx
•
~
Finally,if, f) =
f
clj. g)
b(f(X»2 dx 2: 0, and it follows from a theorem of calculus that, since f
• is continuous.lj.f) =
r
(f(X» l dx = 0 if and on ly if f is the zero fun ction . T he refore,
•
(f, g) is an in ner product o n '€ [n, hI.
Example 7.5 also defines an inner product o n any subspaceoftfl. [a, bJ. For example, we could res trict our attent ion to polynom ials defined o n the interval [a, bJ. Suppose we consider '!J> [0, II> the vector space o f all polynomials o n the interval [0, 1J . Then, using the inner product of Example 7.5, we have
{x 2,1 +
~ = (x2(1 + x) dx = ( X2+ x') dx o
=
0
[Y! + x4jl = .!.+.!.=:~ 3
4
(I
3
4
12
544
Chapter 7
Distance and Ap proxi mation
Properties ollner Produc.s The following theorem summarizes some additional properties that follow from the definition of inner product.
I
Theare. 1.1
Let u, v, and w be vectors in 3n inner product s.E.3ce Vand Jet c be a scalar: a. (u+v,w) = (U,W/+ {V,W) b. (u, cv) = C(U,VI c. lu , O)~IO,v)~ O
i
We prove property (a) , leaving the proof of properties (b) and (c) as Exercises 23 and 24. Referring to the definition of inner product, we have
ProOf
(u + v, w) = (w, u +
VI
By (1)
"" (w, ul + (w, VI
By (2)
= (u, W I + {v, WI
By (I)
lengtl, Distance, aDd OrtbOgOD8111V In an inne r product space, we can defi ne the length of a vector, distance between vec\ors, and orthogonal vec!ors,just as we did in Section 1.2. We sim ply have to replace every usc of the dOl product u . v by the more general inner product (u, v). Ii
DeHnlllon
Let u and v be vectors in an inner product s ace
I. The length (or "orm ) of v is ~ v~ = V (v, v). 2. The distance between u and v is d( u, v) = IIu - v~. 3. u and v are orthogonal if (u, VI = o.
Note that I v~ is always defin ed, since (v, v) 2 0 by t he definition of inner product, so we can take the square root of this nonnegative quan tity. As in Rn, a vector of length I is called a u,lit vector. The unit sphere in V IS the set S of all untt vectors in V.
Example 1.6
ConsIder the inner product on <eto, IJ given in Example 75. If f(x) = x and g(x) = 3x - 2, find
(,) II!I
(b) d (f,g)
Soll11al
(a ) We find that If, f)~
'" Ifl
«) If,g)
~ VIf,f) ~ 1/ V3
( ' p{x ) dx = ( IX2dX= xJjl =..!. Jo Jo 3 0 3
Section 7.1
(b) $; nce d(f, g) ~
U-
if -
5_5
~I ~ VI] ~
[ (x ) - 8(x) we ha ve
Inner Product Spaces
g,f - g) , nd x - (3x - 2) ~ 2 - 2x
g,f - g):: ((f(X) - g(x}f dx
=
,
r,
~
2( 1 - x)
4(1 - 2x + Xl ) dx
:: 4[X _X2+fL : Combining these fact s, we see that d (f,g ) :: (c) We compute
(f,g) =
r,
f(x)g(x) dx =
r,
V4fj = 2/v'3..
x(3x - 2) dx = ((3X 2 - 2x) dx = (x 3
,
-
X2J~ =
0
Thus,f and g are o rthogonal. It is impo rtan t to rem ember that the "d ist ance" between f and g in Example 7.6 does 1I0t refer to any m easurem ent related to the graphs of these functions. Neither does the fac t that [and g aTe orthogonal mean that their graphs intersect at right angles. We aTe sim ply appl ying the defimtion of a particular inner p roduct. However, in d oing so, we shou ld b e guided by the corresponding not ions in R2 and 1R1, where the inner product is the do t product. The geometr y of Euclidean space ca n still guide us here, even though we cannot visualize things in the same way.
EKample 1.1
Using the inner p roduct o n R2 defi ned in Example 7.2, draw a sketch of the unit sphere (circle).
SDlnllon If X = [ ; ], then (x, x) = 2xl + 3r·. Since the uni t sphere {circle} consists of all
x such that Ixll 1=
= I,wehave
I x~ = V{x, x) = \12x!- + 31
or
2X 2
+ 3y2 =
Th is is the equation of an ellipse, and its gra ph is shown in Figure 7.2.
y 1
y'j
-+- -HH-+-+-+- -I-l-l- -I-+-I-l-~x 1 1 -v'2 \ii
ngure 1.2 A unit circle thaI is an ellipse
I
546
Chapter 7
Distance and Approximation
We will discuss properties o f length. d istance, and o rthogonality in the next section and in the exercises. One result tha t we will need III this section is the general ized version of Pythagoras' Theorem. which e" tends Theorem J .6. ,
, e
i
Pythagoras' Theorem let u and v be vectors in an in ner product space V. Then u and v and only if
Proof
As you will be asked to prove
lu + v ~2
-
Exercise 32, we have
+ v, u +
(u
11 foUows immediately th:lI l u
In
+ vl1 1 =
v) = l u l 2
l ul2
+
2(u , v) + ~ v112
+ Ilvll if and o nl y if(u , v} ::: O.
on.ogonal Prolecllons and Ihe Gram-Scbmldl Process In C hapter 5, we disc ussed orthogonality in R~. Most of this material generalizes nicely to general inner product spaces. Fo r example. an orthogonal set of vectors in an inner product space V is a sct {V I' ... , vll o f vectors from V such that (v•• v) = 0 whenever v, :1= v}' An orthonormal sd of vectors is then an o rthogonal set of unit vecto rs. An orthogonal btuis for a s ubspace Wof V is just a basis fo r W tha t is an orthogonal set; sim ilarly, an orthonormal basis for a subspace Wof V is a basis for W that is an o rt ho norm al set. In IR". the G ram-Schmidt Process (Theorem 5.15) shows that every subspace has an o rt hogonal basis. We can mimic the construction of the Gram -Schmidt Process to show that every fin ite-dimensional subspace of an inner product space has an orthogo nal basis-all We need to do is replace the do t product by the more general inner product. We illustrate this approach wit h an example. (Compare the steps he re with those in Example 5.1 3.)
~
Example 1.8
Construct an o rthogonal basis for ~2 with respect to the inner product
(f, g) -
r-,
f(x)g(x) dx
by applying the Gram -Schmidt Process to the basis
SallUIi
l..etx l = I, x l = x, andx , =
\1. x. xl \.
xl. We begin
by sctting
VI
=
Xl
=
I.
Next we
comp ute
(VI, VI) =
II
dx = X]I
-I
- I
= 2
and
(V I' X ~ =
' xdx = X-,], =0 J 2 _I
-1
Se define {A, B) = det(AB).
[u,Jand
~ vf = ~~ u
are orthogonal.
17. In 0'" defin' (p(x), q(x)) ~ p( I)q( I).
III Exercises 19 and 20, (u,
+
34. (u , v) = i llu
16. In 0'" d'fine(p(x), q(x)) ~ p(O)q(O).
where u =
30. Show that, in an inner product space, there can not be unitvectorsuandvwith(u,v) < - I.
Lix) =
2n - \
n
xL"_ I(x) -
11 -
n
Section 7.1
for all ,,