Graduate Texts in Mathematics
Linear Algebra
Volume 23 1975
Springer
Graduate Texts in Mathematics 23
Editorial Bo...
1782 downloads
7235 Views
5MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Graduate Texts in Mathematics
Linear Algebra
Volume 23 1975
Springer
Graduate Texts in Mathematics 23
Editorial Board: F. W. Gehring P. R. Halmos (Managing Editor) C. C. Moore
Werner Greub
Linear Algebra Fourth Edition
Springer-Verlag
New York Heidelberg Berlin
Werner Greub University of Toronto Department of Mathematics Toronto M5S 1A1 Canada
Managing Editor P. R. Halmos Indiana University Department of Mathematics Swain Hall East Bloomington, Indiana 47401
Editors F. W. Gehring
C. C. Moore
University of Michigan Department of Mathematics Ann Arbor, Michigan 48104
University of California at Berkeley Department of Mathematics Berkeley, California 94720
AMS Subject Classifications 15-01, 15A03, 15A06, 15A18, 15A21, 16-01 Library
Greub,
of Congress Cataloging in Publication Data Werner Hildbert, 1925—
algebra. (Graduate texts in mathematics; v. 23) Bibliography: p. 445 1. Algebras, Linear. 1. Title. II. Series. QA184.G7313 1974 512'.S 74-13868 Linear
All rights
reserved.
No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag. © 1975 by Springer-Verlag New York Inc. Softcover reprint of the hardcover 4th edition 1975 ISBN 978-1-4684-9448-8 DOl 10.1007/978-1-4684-9446-4
ISBN 978-1-4684-9446-4 (eBook)
To Roif Nevanlinna
Preface to the fourth edition This textbook gives a detailed and comprehensive presentation of linear algebra based on an axiomatic treatment of linear spaces. For this fourth edition some new material has been added to the text, for instance, the intrinsic treatment of the classical adjoint of a linear transformation in Chapter IV, as well as the discussion of quaternions and the classification of associative division algebras in Chapter VII. Chapters XII and
XIII have been substantially rewritten for the sake of clarity, but the contents remain basically the same as before. Finally, a number of problems covering new topics — e.g. complex structures, Caylay numbers and symplectic spaces — have been added.
I should like to thank Mr. M. L. Johnson who made many useful suggestions for the problems in the third edition. I am also grateful to my colleague S. Halperin who assisted in the revision of Chapters XII and XIII and to Mr. F. Gomez who helped to prepare the subject index. Finally, I have to express my deep gratitude to my colleague J. R. Vanstone who worked closely with me in the preparation of all the revisions and additions and who generously helped with the proof reading.
Toronto, February 1975
WERNER H. GREUB
Preface to the third edition The major change between the second and third edition is the separation
of linear and multilinear algebra into two different volumes as well as the incorporation of a great deal of new material. However, the essential
character of the book remains the same; in other words, the entire presentation continues to be based on an axiomatic treatment of vector spaces.
In this first volume the restriction to finite dimensional vector spaces has been eliminated except for those results which do not hold in the infinite dimensional case. The restriction of the coefficient field to the real and complex numbers has also been removed and except for chapters VII to XI, § 5 of chapter I and § 8, chapter IV we allow any coefficient
field of characteristic zero. In fact, many of the theorems are valid for modules over a commutative ring. Finally, a large number of problems of different degree of difficulty has been added.
Chapter I deals with the general properties of a vector space. The topology of a real vector space of finite dimension is axiomatically characterized in an additional paragraph. In chapter lIthe sections on exact sequences, direct decompositions and duality have been greatly expanded. Oriented vector spaces have been incorporated into chapter IV and so chapter V of the second edition has disappeared. Chapter V (algebras) and VI (gradations and homology)
are completely new and introduce the reader to the basic concepts associated with these fields. The second volume will depend heavily on some of the material developed in these two chapters.
Chapters X (Inner product spaces) XI (Linear mappings of inner product spaces) XII (Symmetric bilinear functions) XIII (Quadrics) and XIV (Unitary spaces) of the second edition have been renumbered but remain otherwise essentially unchanged. Chapter XII (Polynomial algebra) is again completely new and developes all the standard material about polynomials in one indeterminate. Most of this is applied in chapter XIII (Theory of a linear transformation). This last chapter is a very much expanded version of chapter XV of the
second edition. Of particular importance is the generalization of the
Preface to the third edition
results in the second edition to vector spaces over an arbitrary coefficient field of characteristic zero. This has been accomplished without reversion
to the cumbersome calculations of the first edition. Furthermore the concept of a semisimple transformation is introduced and treated in some depth.
One additional change has been made: some of the paragraphs or sections have been starred. The rest of the book can be read without reference to this material.
Last but certainly not least, I have to express my sincerest thanks to everyone who has helped in the preparation of this edition. First of all I am particularly indebted to Mr. S. HALPERIN who made a great number of valuable suggestions for improvements. Large parts of the book, in particular chapters XII and XIII are his own work. My warm thanks also go to Mr. L. YONKER, Mr. G. PEDERZOLI and Mr. J. SCHERK who did the proofreading. Furthermore I am grateful to Mrs. V. PEDERZOLI
and to Miss M. PETTINGER for their assistance in the preparation of the
manuscript. Finally I would like to express my thanks to professor K. BLEULER for providing an agreeable milieu in which to work and to the publishers for their patience and cooperation.
Toronto, December 1966
WERNER H. GREUB
Preface to the second edition Besides the very obvious change from German to English, the second
edition of this book contains many additions as well as a great many other changes. It might even be called a new book altogether were it not
for the fact that the essential character of the book has remained the same; in other words, the entire presentation continues to be based on an axiomatic treatment of linear spaces. In this second edition, the thorough-going restriction to linear spaces of finite dimension has been removed. Another complete change is the
restriction to linear spaces with real or complex coefficients, thereby removing a number of relatively involved discussions which did not really contribute substantially to the subject. On p. 6 there is a list of those chapters in which the presentation can be transferred directly to spaces over an arbitrary coefficient field.
Chapter I deals with the general properties of a linear space. Those concepts which are only valid for finitely many dimensions are discussed in a special paragraph.
Chapter II now covers only linear transformations while the treatment of matrices has been delegated to a new chapter, chapter III. The discussion of dual spaces has been changed; dual spaces are now introduced abstractly and the connection with the space of linear functions is not established until later. Chapters IV and V, dealing with determinants and orientation respectively, do not contain substantial changes. Brief reference should
be made here to the new paragraph in chapter IV on the trace of an endomorphism — a concept which is used quite consistently throughout the book from that time on. Special emphasis is given to tensors. The original chapter on Multilinear Algebra is now spread over four chapters: Multilinear Mappings (Ch. VI), Tensor Algebra (Ch. VII), Exterior Algebra (Ch. VIII) and
Duality in Exterior Algebra (Ch. IX). The chapter on multilinear mappings consists now primarily of an introduction to the theory of the tensor-product. In chapter VII the notion of vector-valued tensors has
been introduced and used to define the contraction. Furthermore, a
XII
Preface to the second edition
treatment of the transformation of tensors under linear mappings has been added. In Chapter VIII the antisymmetry-operator is studied in greater detail and the concept of the skew-symmetric power is introduced. The dual product (Ch. IX) is generalized to mixed tensors. A special paragraph in this chapter covers the skew-symmetric powers of the unit tensor and shows their significance in the characteristic polynomial. The paragraph "Adjoint Tensors" provides a number of applications of the duality theory to certain tensors arising from an endomorphism of the underlying space.
There are no essential changes in Chapter X (Inner product spaces) except for the addition of a short new paragraph on normed linear spaces.
In the next chapter, on linear mappings of inner product spaces, the orthogonal projections 3) and the skew mappings 4) are discussed in greater detail. Furthermore, a paragraph on differentiable families of automorphisms has been added here. Chapter XII (Symmetric Bilinear Functions) contains a new paragraph dealing with Lorentz-transformations. Whereas the discussion of quadrics in the first edition was limited to quadrics with centers, the second edition covers this topic in full. The chapter on unitary spaces has been changed to include a more thorough-going presentation of unitary transformations of the complex plane and their relation to the algebra of quaternions. The restriction to linear spaces with complex or real coefficients has of course greatly simplified the construction of irreducible subspaces in chapter XV. Another essential simplification of this construction was achieved by the simultaneous consideration of the dual mapping. A final paragraph with applications to Lorentz-transformation has been added to this concluding chapter. Many other minor changes have been incorporated — not least of which are the many additional problems now accompanying each paragraph.
Last, but certainly not least, I have to express my sincerest thanks to everyone who has helped me in the preparation of this second edition. First of all, I am particularly indebted to CORNELIE J. RHEINBOLDT
who assisted in the entire translating and editing work and to Dr. WERNER C. RHEINBOLDT who cooperated in this task and who also made a number of valuable suggestions for improvements, especially in the chapters on linear transformations and matrices. My warm thanks
also go to Dr. H. BOLDER of the Royal Dutch/Shell Laboratory at Amsterdam for his criticism on the chapter on tensor-products and to Dr. H. H. KELLER who read the entire manuscript and offered many
Preface to the second edition
XIII
important suggestions. Furthermore, I am grateful to Mr. GI0RGI0 PEDERZOLI who helped to read the proofs of the entire work and who collected a number of new problems and to Mr. KHADJA NESAMUDDIN KHAN for his assistance in preparing the manuscript.
Finally I would like to express my thanks to the publishers for their patience and cooperation during the preparation of this edition. Toronto, April 1963
WERNER H. GREUB
Contents Chapter 0. Prerequisites
Chapter I. Vector spaces § 1. Vector spaces § 2. Linear mappings § 3. Subspaces and factor spaces § 4. Dimension § 5. The topology of a real finite dimensional vector space
37
Chapter II. Linear mappings § 1. Basic properties § 2. Operations with linear mappings § 3. Linear isomorphisms § 4. Direct sum of vector spaces § 5. Dual vector spaces § 6. Finite dimensional vector spaces
41 41
Chapter III. Matrices § 1. Matrices and systems of linear equations § 2. Multiplication of matrices § 3. Basis transformation § 4. Elementary transformations
83 83 89 92 95
Chapter IV. Determinants § 1. Determinant functions § 2. The determinant of a linear transformation § 3. The determinant of a matrix § 4. Dual determinant functions § 5. The adjoint matrix § 6. The characteristic polynomial §7. The trace § 8. Oriented vector spaces Chapter V. Algebras § 1. Basic properties
51
55 56
63 76
104 109 112 114 120 126 131
Change of coefficient field of a vector space
144 144 158 163
Chapter VI. Gradations and homology § 1. G-graded vector spaces § 2. G-graded algebras § 3. Differential spaces and differential algebras.
167 167 174 178
Chapter VII. Inner product spaces § 1. The inner product § 2. Orthonormal bases
186 186
§2. Ideals § 3.
191
Contents
XVI
Normed determinant functions Duality in an inner product space Normed vector spaces § 6. The algebra of quaternions Chapter VIII. Linear mappings of inner product spaces § 1. The adjoint mapping § 2. Selfadjoint mappings § 3. Orthogonal projections § 4. Skew mappings § 5. Isometric mappings
195
§ 3. § 4. § 5.
§ 6.
Rotations of Euclidean spaces of dimension 2, 3and4
§ 7.
Differentiable families of linear automorphisms.
202 205
208 216
216 221
226 229 232
237 249
Chapter IX. Symmetric bilinear functions § 1. Bilinear and quadratic functions § 2. The decomposition of E § 3. Pairs of symmetric bilinear functions § 4. Pseudo-Euclidean spaces § 5. Linear mappings of Pseudo-Euclidean spaces.
261
Chapter X. Quadrics § 1.
296 296
§
301
261 265
272
.
§
281 288
Affine spaces 2. Quadrics in the affine space 3. Affine equivalence of quadrics
§ 4.
310 316
Quadrics in the Euclidean space
Chapter XI. Unitary spaces § 1. Hermitian functions § 2. Unitary spaces § 3. Linear mappings of unitary spaces § 4. Unitary mappings of the complex plane § 5. Application to Lorentz-transformations
325 325 327 334 340
Chapter XII. Polynomial algebra . § 1. Basic properties § 2. Ideals and divisibility § 3. Factor algebras § 4. The structure of factor algebras.
351 351 357
Chapter XIII. Theory of a linear transformation § 1. Polynomials in a linear transformation § 2. § 3.
345
366 369
.
Generalized eigenspaces Cyclic spaces
.
383 390
397
§ 4. Irreducible spaces
402
Application of cyclic spaces Nilpotent and semisimple transformations 7. Applications to inner product spaces .
§ 5.
§ 6. §
383
.
.
415 425 436
Bibliography
445
Subject Index
447
Interdependence of Chapters Vector spaces F
Linear mappings J
Matrices
Determinants
Inner product spaces
Gradations and homology
Linear mappings of inner product spaces
Symmetric bilinear functions
r
Unitary spaces
Algebras
Polynomial algebra
Quadrics
Theory of a linear transformation
Chapter 0
Prerequisites 0.1. Sets. The reader is expected to be familiar with naive set theory
up to the level of the first half of [11]. In general we shall adopt the notations and definitions of that book; however, we make two exceptions. First, the word function will in this book have a very restricted meaning, and what Halmos calls a function, we shall call a mapping or a set mapping. Second, we follow Bourbaki and call mappings that are one-to-one (onto, one-to-one and onto) injective (surjective, bijective). 0.2. Topology. Except for § 5 chap. I, § 8, Chap. IV and parts of chapters VII to IX we make no use at all of topology. For these parts of the book the reader should be familiar with elementary point set topology as found in the first part of [16]. 0.3. Groups. A group is a set G, together with a binary law of composition x G—+G which satisfies the following axioms (x, y) will be denoted by xy): 1. Associativity: (xy)z = x (yz) 2. Identity: There exists an element e, called the identity such that
xe = ex = x. 3. To each element xeG corresponds a second element
xx' =
such that
= e.
The identity element of a group is uniquely determined and each element has a unique inverse. We also have the relation
(xy)' As an example consider the set of all permutations of the set { 1. n} and define the product of two permutations a, t by . .
(crt)i=a(ti) In this way
i=1...n.
becomes a group, called the group of permutations of n objects. The identity element of is the identity permutation.
I
Greub. linear Algebra
Chapter 0. Prerequisites
Let G and H be two groups. Then a mapping G —*
H
is called a homomorphism if
X,yEG.
=
A homomorphism which is injective (resp. surjective, bijective) is called a monomorphism (resp. epimorphism, isomorphism). The inverse mapping of an isomorphism is clearly again an isomorphism. A subgroup H of a group G is a subset H such that with any two elementsyeHand zeH the productyz is contained in Hand that the inverse of every element of H is again in H. Then the restriction of p to the subset Hx H makes H into a group. A group G is called commutative or abelian if for each x, yeG xy=yx.
In an abelian group one often writes x+y instead of xy and calls x+y the sum of x and y. Then the unit element is denoted by 0. As an example consider the set of integers and define addition in the usual way. 0.4. Factor groups of commutative groups.* Let G be a commutative
group and consider a subgroup H. Then H determines an equivalence relation in G given by x
x'
if and only if
x—
x' e H.
The corresponding equivalence classes are the sets {H+x} and are called the cosets of H in G. Every element xeG is contained in precisely one coset ?. The set G/H of these cosets is called the factor set of G by H and the surjective mapping G
G/H
defined by XEX
called the canonical projection of G onto G/H. The set G/H can be made into a group in precisely one way such that the canonical projection becomes a homomorphism; i.e., is
it(x + y) = irx + 7ry. To define the addition in G/H let
e G/H, 9 e G/H be arbitrary and choose
xeG and yeG such that and
my=9.
*) This concept can be generalized to non-commutative groups.
Chapter 0. Prerequisites
Then the element it (x +y) depends only on
two other elements satisfying whence
3
and 9. In
and icy' =9
we
fact, if x', y' are
have
x'—xeH and y'—yeH (x' + y') — (x
+ y)eH
and so ir(x'+y')—ic(x+y). Hence, it makes sense to define the sum by
is easy to verify that the above sum satisfies the group axioms. Relation (0.1) is an immediate consequence of the definition of the sum in G/H. Finally, since it is a surjective map, the addition in G/H is uniquely deterIt
mined by (0.1). The group G/H is called the factor group of G with respect to the subgroup H. Its unit element is the set H. 0.5. Fields. Afield is a set F on which two binary laws of composition, called respectively addition and multiplication, are defined such that 1. F is a commutative group with respect to the addition. 2. The set F — {0} is a commutative group with respect to the multiplication.
3. Addition and multiplication are connected by the distributive
law,
The rational numbers the real numbers 11 and the complex numbers C are fields with respect to the usual operations, as will be assumed without proof. A homomorphism p:F—÷F' between two fields is a mapping that preserves addition and multiplication. A subset A F of a field which is closed under addition, multiplication and the taking of inverses is called a subfield. If A is a subfield of F, F is called an extension field of A. Given a field F we define for every positive integer k the element ke (c unit element of F) by
The field F is said to have characteristic zero if ke + 0 for every positive integer k. If F has characteristic zero it follows that kc+k'c whenever k+k'. Hence, a field of characteristic zero is an infinite set. Throughout this book it will be assumed without explicit mention that all fields are of characteristic zero.
Chapter 0. Prerequisites
4
For more details on groups and fields the reader is referred to [29]. 0.6. Partial order. Let cl be a set and assume that for some pairs X. Y (XE.ci. YE.cl) a relation, denoted by X Y, is defined which satisfies the following conditions: (i) XX for every Xe.cl (Reflexivity) (ii) if X Y and Y X then X = Y (Antisymmetry) (iii) If X Y and Y Z, then X Z (Transitivity). Then is called a partial order in d.
A homomorphism of partial/v ordered sets is a map go:
such
that goX go Y whenever X Y
Clearly a subset of a partially ordered set is again partially ordered.
Let d be a partially ordered set and suppose A Ed is an element such that the relation A X implies that A = X. Then A is called a maximal element of ci. A partial ordered set need not have a maximal element. A partially ordered set is called linear/v ordered or a chain if for every
pairX, YeitherXYor be a subset of the partially ordered set ci. Then an element if X A for every XEc11. A ed is called an upper hound for In this book we shall assume the following axiom: A partially ordered set in which every chain has an upper bound, contains a maximal element. This axiom is known as Zorn's lemma, and is equivalent to the axiom Let
of choice (cf. [11]).
d be a 0.7. Lattices. Let d be a partially ordered set and let subset. An element Aed is called a least upper hound (l.u.b.) for d1 if 1) A is an upper bound for cl1. 2) If X is any upper bound, then A X. It follows from (ii) that if a l.u.b. for d1 exists, then it is unique. In a similar way, lower bounds and the greatest lower bound (g.1.b.) for a subset of d are defined. A partially ordered set ci is called a lattice, if for any two elements X, Y the subset {X, Y} has a 1.u.b. and a g.l.b. They are denoted by X v Y and X A Y. It is easily checked that any finite subset (X1, ..., Xr) of a lattice
has a 1.u.b. and a g.l.b. They are denoted by V
and A X,.
As an example of a lattice, consider the collection of subsets of a given set, X, ordered by inclusion. If U, V are any two subsets, then
UAV=UflV and UvV=UUV.
Chapter 1
Vector Spaces § 1.
Vector spaces
1.1. Definition. A vector (linear) space, E, over the field F is a set of elements x, y, ... called vectors with the following algebraic structure:
I. E is an additive group; that is, there is a fixed mapping Ex E—+E denoted by (1.1)
and satisfying the following axioms: 1.1. (x +y) +z = x + (y + z) (associative law)
1.2. x+y=y+x (commutative law) 1.3. there exists a zero-vector 0; i.e., a vector such that x+0= O+x=x for every xeE. 1.4. To every vector x there is a vector —x such that x+(—x)=O.
II. There is a fixed mapping F x
denoted by
(2,x)—÷Ax
(1.2)
and satisfying the axioms: 11.1. (associative law) 11.2.
A(x + y) = Ax + Ay (distributive laws) 11.3.
unit element of F)
(The reader should note that in the left hand side of the first distributive law, + denotes the addition in F while in the right hand side, + denotes the addition in E. In the sequel, the name addition and the symbol + will continue to be used for both operations, but it will always be clear from the context which one is meant). F is called the coefficient field of the vector space E, and the elements of F are called scalars. Thus the mapping
Chapter 1. Vector spaces
(1.2) defines a multiplication of vectors by scalars, and so it is called scalar multiplication.
If the coefficient field 1' is the field l1 of real numbers (the field C of complex numbers), then £ is called a real (complex) vector space. For the rest of this paragraph all vector spaces are defined over a fixed, but arbitrarily chosen field F of characteristic 0. If {x1 ,...,x,1} is a finite family of vectors in E, the sum x1 + will often be denoted
Now we shall establish some elementary properties of vector spaces. It follows from an easy induction argument on a that the distributive laws hold for any finite number of terms,
A) x .
Ax
holds if and only if
=
=
0
2=0 or x=0.
Proof Substitution of pc=O in the first distributive law yields Ax = Ax + Ox
whence Ox =0. Similarly, the second distributive law shows that
20 = 0. Conversely, suppose that Ax=O and assume that 2+0. Then the associative law 11.1 gives that
lx = (A1A)x = A1(Ax) = Ab0 = 0 and hence axiom 11.3 implies that x=O. The first distributive law gives for Ax + (— A)x =
(A
—
—A
A)x =
whence (— A)
x = — Ax.
0
§ 1. Vector spaces
In the same way the formula 2(— x) = — 2x is
proved.
1.2. Examples. 1. Consider the set F "=
of n-tuples
and define addition and scalar multiplication by I
+
1
+
=
1,
+
and
Then the associativity and commutativity of addition follows at once from the associativity and commutativity of addition in F. The zero vector is the n-tuple (0, ..., 0) and the inverse of is the n-tuple ..., Consequently, addition as defined above makes the set ..., F" into an additive group. The scalar multiplication satisfies I1.i, 11.2, and 11.3, as is equally easily checked, and so these two operations make F" into a vector space. This vector space is called the n-space over F. In particular, F is a vector space over itself in which scalar multiplication coincides with the field multiplication.
2. Let C be the set of all continuous real-valued functions, f, in the
interval I:Ot 1, R.
1ff, g are two continuous functions, then the function f+g defined by (1 + g) (t) is
= f (t) + g (t)
again continuous. Moreover, for any real number 2, the function 2f
defined by
(1f)(t) = 2.1(t) is
continuous as well. It is clear that the mappings (1' g) —p
f +g
and
(2,1)
2]
the systems of axioms 1. and 11. and so C becomes a real vector space. The zero vector is the function 0 defined by satisfy
0(t) =
0
Chapter 1. Vector spaces
and the vector —f is the function given by
(- f)(t) = - f(t). Instead of the continuous functions we could equally well have considered the set of k-times differentiable functions, or the set of analytic functions.
3. Let X be an arbitrary set and E be a vector space. Consider all and define the sum of two mappings f and g as the xeX (f + g)(x) = f(x) + g(x) and the mapping )f by
mappings J: mapping
(i.[)(x) =
XEX.
Under these operations the set of all mappings f:
becomes a vector space, which will be denoted by (X; E). The zero vector of (X; E) is the function f defined by f(x)=0, XEX. 1.3. Linear combinations. Suppose E is a vector space and x1 are vectors in E. Then a vector XEE is called a linear combination of the vectors x1 if it can be written in the form x=
be any family of vectors. Then a vector More generally, let if there is a family xEE is called a linear combination of the vectors only finitely many different from zero, such that of scalars, x= where the summation is extended over those We shall simply write =
for which
cLEA
and it is to be understood that only finitely many are different from zero. In particular, by setting =0 for each we obtain that the 0-vector is a linear combination of every family. It is clear from the definition that if x is a linear combination of the family {xj then x is a linear combination of a finite subfamily. Suppose now that x is a linear combination of vectors cteA
x= rzeA
and assume further that each x2 is a linear combination of vectors
§ 1. Vector spaces
9
JJeB2,
=
e F. p
Then the second distributive law yields X
=
x2
r
=
Yap'
=
hence x is a linear combination of the vectors A subset ScEis called a system of generators for Eif every vector xeE is a linear combination of vectors of S. The whole space E is clearly a system of generators. Now suppose that S is a system of generators for E and that every vector of S is a linear combination of vectors of a subset Tc: S. Then it follows from the above discussion that T is also a system of generators for E. 1.4. Linear dependence. Let be a given family of vectors. Then a non-trivial linear combination of the vectors is a linear combination where at least one scalar is different from zero. The family {xa} and
is called linearly dependent if there exists a non-trivial linear combination of the that is, if there exists a system of scalars such that (1.3)
and at least one + 0. It follows from the above definition that if a subfamily of the family is linearly dependent, then so is the full family. An equation of the form (1.3) is called a non-trivial linear relation. A family consisting of one vector x is linearly dependent if and only if x 0. In fact, the relation
1.0=0 that the zero vector is linearly dependent. Conversely, if the vector x is linearly dependent we have that 2x=O where 2+0. Then Proposition I implies that x=0. It follows from the above remarks that every family containing the zero vector is linearly dependent. shows
is linearly dependent if and Proposition II: A family of vectors is a linear combination of the vectors x2, cx+fl. only if for some fleA, Proof: Suppose that for some fleA, xp
Chapter 1. Vector spaces
10
Then setting
—1
we obtain
=
o
and hence the vectors x2 are linearly dependent.
Conversely, assume that
=
0
and that
for some fleA. Then multiplying by view of II.! and 11.2 0= +
we obtain in
2*/i
()ji)_l)2x
xp=— a*f1
Corollary: Two vectors x, y are linearly dependent if and only ify=Ax (or x==)Ly) for some AcT. 1.5. Linear independence. A family of vectors (x2)26A is called linearly
independent if it is not linearly dependent; i.e., the vectors x2 are linearly independent if and only if the equation
= implies that A2=O for each
0
It is clear that every subfamily of a line-
arly independent family of vectors is again linearly independent. If flEA,
is a linearly independent family, then for any two distinct indices is injective. and so the map
Proposition III: A family (x2)2EA of vectors is linearly independent if and only if every vector x can be written in at most one way as a linear combination of the x2 i.e., if and only if for each linear combination
X>22X2
(1.4)
the scalars A2 are uniquely determined by x. Proof. Suppose first that the scalars A2 in (1.4) are uniquely determined by x. Then in particular for x=0, the only scalars A2 such that =0
are the scalars A2=O. Hence, the vectors x2 are linearly independent. Con-
§ 1. Vector spaces
versely,
suppose that the
11
are linearly independent and consider the
relations Then
=0
—
whence in view of the linear independence of the i.e., 1.6. Basis. A family of vectors
in E is called a basis of E if it is simultaneously a system of generators and linearly independent. In view of Proposition III and the definition of a system of generators, we have that A is a basis if and only if every vector xeE can be written in precisely one way as
The scalars
are called the components of x with respect to the basis
As an example, consider the n-space, ['fl, over F defined in example 1, sec. 1.2. It is easily verified that the vectors
i=1...n
x1=(O,...,O,l,O...O)
form a basis for F". We shall prove that every non-trivial vector space has a basis. For
the sake of simplicity we consider first vector spaces which admit a finite system of generators. Proposition IV: (i) Every finitely generated non-trivial vector space has a finite basis (ii) Suppose that S=(x1 ,...,Xm) is a finite system of generators of E and that the subset R c S given by R = (x1 Xr) (r m) consists of
linearly independent vectors. Then there is a basis,
of E such that
RcTcS. Proof: (i) Let x1 the vectors x1
x,, be a minimal system of generators of E. Then are linearly independent. In fact, assume a relation
=
0.
Chapter I. Vector spaces
12
If
= 0 for some 1, it follows that
;eT and so the vectors
(1.5)
generate E. This contradicts the minimality
of ii. (ii) We proceed by induction on ,i
(nr). If n=r then there is nothing to prove. Assume now that the assertion is correct for ii — I. Consider the vector space, F, generated by the vectors x1 Xr+i Then by induction, F has a basis of the form X1
Xr,
where V1ES
Ys
1
(j = 1
Now consider the vector x,1. If the vectors x1
). 1
x, are
linearly independent, then they form a basis of E which has the desired property. Otherwise there is a non-trivial relation
= 0.
+ Since the vectors x1
are
that y+O. Thus
linearly independent, it follows
r
x,7= Q=l
71 generate E. Since they are linearly
Hence the vectors x1
independent, they form a basis. Now consider the general case. Theorem I: Let E be a non-trivial vector space. Suppose S is a system
of generators and that R is a family of linearly independent vectors in E such that R S. Then there exists a basis, of E such that R S. Proof Consider the collection .W(R, S) of all subsets, X, of E such that
1) RcXcS 2) the vectors of X are linearly independent. The a partial order is defined in d(R, 5) by inclusion (cf. sec. 0.6). We show that every chain, in .&(R, 5) has a maximal element A. In fact, R
A
set A =
We
have to show that A ed(R, S). Clearly,
S. Now assume that x5,eA.
(1.6)
§ 1. Vector spaces
Then, for each assume that
i,
for some
is a chain, we may
Since
(i=1...n).
(1.7)
are linearly independent it follows that Since the vectors of (v = ... n) whence A ed(R, S). Now Zorn's lemma (cf. sec. 0.6) implies that there is a maximal element, L in d(R, S). Then R c: Tc:S and the vectors of Tare linearly 1
independent. To show that T is a system of generators, let xeE be arbitrary. Then the vectors of T U x are linearly dependent because otherwise it would follow that x U Ted(R, S) which contradicts the maximality of T Hence there is a non-trivial relation ,tVET, Since
the vectors of T are linearly independent, it follows that
whence
xv.
This equation shows that T generates E and so it is a basis of E.
Corollary I: Every system of generators of E contains a basis. In particular, every non-trivial vector space has a basis.
Corollary II: Every family of linearly independent vectors of E can be extended to a basis. 1.7. The free vector space over a set. Let X be an arbitrary set and such that f(x)+0 only for finitely many XEX. consider all maps f:
Denote the set of these maps by C(X). Then, if fe C(X), ge C(X) are scalars, 2f+ is again contained in C(X). As in example 3, sec. 1.2, we make C(X) into a vector space. For any aeX denote by the map given by and ).,
x=a
51
Then the vectors (aEX) form a basis of C(X). In fact, let fe C(X) be the (finitely many) distinct points be given and let a1 a, such that [(a) 0. Then we have
f where
= [(a)
i=1
(1 =
1
n)
Chapter I. Vector spaces
14
and so the element a relation
(aEX) generate C(X). On the other hand, assume
= 0. j=1
Then we have for each 1(1= ... ii) 1
0
1
=
... a). This shows that the vectors
(aeX) are
linearly independent and hence they form a basis of C(X). Now consider the inclusion map X—*C(X) given by
ix(a)=Ja
aEX.
This map clearly defines a bijection between X and the basis vectors
of C(X). If we identify each element aeX with the corresponding map then X becomes a basis of C(X). C(X) is called the free vector space over X or the vector space generated by X.
Problems 1. Show that axiom 11.3 can be replaced by the following one: The equation 2x=O holds only if 2=0 or x=0. 2. Given a system of linearly independent vectors (x1, ..., xe,), prove that the system (x1, .. i4j with arbitrary 2 is again linearly independent. 3. Show that the set of all solutions of the homogeneous linear differential equation d2y dt2
+
dy + qy = 0 dt
p and q are fixed functions of t, is a vector space. 4. Which of the following sets of functions are linearly dependent in the vector space of Example 2? where
a)f1=3t; f2=t+5; f3=2t2; b)f1=(t+1)2;f2=t2—1; f3=2t2+2t—3 f2=et; f3=e_t c) f1=1; d)f1=t2; e)
f1=1—t;
f2=t;
f3=1
f2=t(1—t); f3=1—t2.
§ 1. Vector spaces
5. Let E be a real linear space. Consider the set E x E of ordered pairs
(x, y) with xeE and yeE. Show that the set Ex E becomes a complex vector space under the operations: (x1,y1) + (x2,y2) =
(x1
+ x2,y1 + Y2)
and + i fJ)(x, y) =
x—
,8
y, y + J3 x)
/3 real numbers).
6. Which of the following sets of vectors in (a generating set, a basis)?
linearly independent,
a) (1, 1, 1, 1), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1) b) (1, 0, 0, 0), (2, 0, 0, 0) c) (17, 39, 25, 10), (13, 12, 99, 4), (16, 1,0,0) 0, 0, d) (1, 1, 0, 0), (0, 0, 1, 1), (0, 4, 4, 1),
Extend the linearly independent sets to bases. 7. Are the vectors x1 (1, 0, 1); x2 = (i, 1, 0), x3 = (1, 2, 1 +1) linearly independent in C3? Express x = (1, 2, 3) and y = (1, 1, 1) as linear combibinations of x1, x2, x3. is defined by a map 8. Recall that an n-tuple given by
(i=1...n). Show that the vector spaces C{l...n} and F" are equal. Show further that the basis J defined in sec. 1.7 coincides with the basis x defined in sec. 1 .6.
9. Let S be any set and consider the set of maps
f: S
F"
such thatf(x)=0 for all but finitely many xeS. in a manner similar to that of sec. 1.7, make this set into a vector space (denoted by C(S, F')). Construct a basis for this vector space. be a basis for a vector space E and consider a vector 10. Let
a= that for some fleA. again a basis for E. Suppose
Show that the vectors
form
Chapter I. Vector spaces
11. Prove the following exchange theorem of Steinitz: Let be a a system of linearly independent vectors. basis of E and Then it is possible to exchange certain p of the vectors by the vectors a such that the new system is again a basis of F. Hint: Use problem 10. 12. Consider the set of polynomial functionsf:
f (x) =
xi.
Make this set into a vector space as in Example 3, and construct a natural basis. § 2.
Linear mappings
In this paragraph, all vector spaces are defined over a fixed but arbitrarily chosen field F of characteristic zero.
1.8. Definition. Suppose that E and F are vector spaces, and let (p:
be a set mapping. Then 'p will be called a linear mapping if
'p(x+y)='px+'py x,yeE
(1.8)
and
)LEF,xEE
= 2çox
(1.9)
(Recall that condition (1.8) states that 'p is a homomorphism between abelian groups). If F=F then 'p is called a linear function in E. Conditions (1.8) and (1.9) are clearly equivalent to the condition =
'p
'p
and so a linear mapping is a mapping which preserves linear combinations. From (1.8) we obtain that for every linear mapping, 'p, 'p 0
whence 'p (0)
=
'p (0
+0) =
'p (0)
+ 'p (0)
0. Suppose now that
=0 is a linear relation among the vectors 'p x1
=
'p
)!
Then we have
=
'p0
=0
§ 2. Linear mappings
whence
= 0.
17
(1.11)
Conversely, assume that ço:E—*F is a set map such that (1.11) holds whenever (1.10) holds. Then for any x, yeE and )LEF set
u=x+y and v=2x. Since
u—x—y=0 and v—2x=0
it follows that x
(x + y) —
—
y= 0
and ç (2 x) —2 (p x =
0
and hence p is a linear mapping. This shows that linear mappings are precisely the set mappings which preserve linear relations. In particular, it follows that if x1 . ..v,. are linearly dependent, then so If X1 .• . are linearly independent, it does not, are the vectors çox1 however, follow that the vectors cOXi...cOXr are linearly independent. In fact, the zero mapping defined by (px=0,xEE,is clearly a linear mapping which maps every family of vectors into the linearly dependent set (0). A bijective linear mapping (p: E3F is called a linear isomorphism and will be denoted by p: Given a linear isomorphism (p: the set mapping (p 1: E+—F. It is easy to verify that p again satisfies the .
. . .
conditions (1.8) and (1.9) and so it is a linear mapping. p' is bijective and hence a linear isomorphism. It is called the inverse isomorphism of (p. Two vector spaces E and F are called isomorphic if there exists a linear
isomorphism from E onto F. is called a linear transformation of E. A A linear mapping (p bijective linear transformation will be called a linear automorphism of E. 1.9. Examples: 1. Let E=F" and define (p:E—+E by =
+
satisfies the conditions (1.8) and (1.9) and hence it is a linear transformation of E. 2. Given a set S and a vector space E consider the vector space (S; E) be the mapping given by defined in Example 3, sec. 1.2. Let (S;
Then
(pf=f(a) fe(S;E) where aES is a fixed element. Then (p is 2
Greub. Linear Atgebra
a linear mapping.
Chapter
18
1.
Vector spaces
where )LeT is a be the mapping defined by 3. Let fixed element. Then p is a linear transformation. In particular, the ideaWv map i:E—+E, ix=x, is a linear transformation. 1.10. Composition. Let p: E—*F and i/i: F—*G be two linear mappings. Then the composition of ço and /i0ço:E—*G
is defined by XEE. The relation
x) =
(/9
ço x1)
k/i
= p is a linear mapping of E into G. k/Jo p will often be denoted simply by k/1p. If p is a linear transformation in E, then we denote (/9 o (p by (p2. More generally, the linear transformation p ... p is denoted by
shows that k/Jo
i. A linear transformation, p, satisfying (p2 = i is called an involution in E. 1.11. Generators and basis. S—*F Proposition I: Suppose S is a system of generators for E and is a set map (Fa second vector space). Then Po can be extended in at most one way to a linear mapping We extend the definition to the case k=O by setting (p°=
F.
(p: E
A necessary and sufficient condition for the existence of such an extension is that 0
whenever
= 0.
Proof If (p X1
is
an extension of
we
have for each finite set of vectors
ES
=
=
Since the set S generates E it follows from this relation that is (/9 uniquely determined by (po Moreover, if
XES
§ 2. Linear mappings
it follows that =
=
=
cOO
=
0
and so condition (1.12) is necessary. Conversely, assume that (1.12) is satisfied. Then define
by (1.13)
To prove that
is
a well defined map assume that
Then =o
—
whence in view of (1.12)
0
—
and so
= The linearity of p follows immediately from the definition, and it is clear that p extends cOo.
Proposition II: Let be a basis of E and cOo: be a set can be extended in a unique way to a linear mapping map. Then (p:
F.
Proof. The uniqueness follows from proposition I. To prove the existence of p consider a relation = 0. Since the vectors
are linearly independent it follows that each
whence
= 0.
Now proposition I shows that
can be extended to a linear mapping
Corollary: Let S be a linearly independent subset of E and be a set map. Then cOo can be extended to a linear mapping
Chapter I. Vector spaces
20
Proof. Let T be a basis of F containing S (cf. sec. 1.6). Extend c°o in an T—+F. Then k/I0 may be extended to a linear arbitrary way to a set map mapping t/i:E—+F and it is clear that k/i extends
Now let be a surjective linear map, and suppose that S is a system of generators for E. Then the set = is a system of generators for F. In fact, since çü is surjective, every vector yEF can be written as = x
for some xeE. Since S generates E there are vectors such that x=
and scalars
whence
y=
=
This shows that every vector yEF is a linear combination of vectors in
'p (5) and hence 'p (5) is a system of generators for(5). 'p Next, suppose that p: E—÷ F is injective and that S is a linearly independent subset of E. Then 'p (5) is a linearly independent subset of F. in
fact the relation implies that
Since 'p
is
= 0,
= 0.
injective we obtain
=
0
whence, in view of the linear independence of the vectors 'p (5) is a linearly independent set. In particular, if çø: E—+F is a linear isomorphism and for E, then is a basis for F.
A! =0. Hence
is a basis
Proposition III: Let ço:E—+Fbe a linear mapping and be a basis of E. Then 'p is a linear isomorphism if and only if the vectors = form a basis for F. Proof. If 'p is a linear isomorphism then the vectors form a linearly independent system of generators for F. Hence they are a basis. Converse-
§ 2. Linear mappings
ly, assume that the vectors
21
form a basis of F. Then we have for every
yEF y= and so
is
=
surjective.
Now assume that jf Xci.
Then it follows that o=
—
= Since the vectors each and so
—
are linearly independent, we obtain that
for
1?
It follows that 'p is injective, and hence a linear isomorphism.
Problems 1. Consider the vector space of all real valued continuous functions defined in the interval a t b. Show that the mapping 'p given by (p:X(t)—* is
tX(t)
linear.
2. Which of the following mappings of F4 into itself are linear transformations?
-
a) b)
+
c)
+
+
Let E be a vector space over F, and letf1 . given by E. Show that the mapping p: 3.
.
be linear functions in
= (ft(x), ...,fr(X)) is
linear.
4. Suppose
is a linear map, and write 'p x = (f1 (x), .. , fr (x)).
Show
that the
are linear functions in E.
Chapter I. Vector spaces
22
5. The unirersal property of C(X). Let X be any set and consider the free vector space, C(X). generated by X (cf. sec. 1 .7).
(i) Show that if j: then there is a unique linear map p:
X into a vector space F such that (p0
where X—*C(X) is the inclusion map. (ii) Let Y be a set map. Show that there is a unique linear C(X)—*C(Y) such that the diagram map
tX1.
C(X)f'4 C(Y)
is a second set map prove the composition
commutes. If [3: formula
(fib
=
(iii) Let E be a vector space. Forget the linear structure of E and form
the space C(E). Show that there is a unique linear map mE:
C(E)—÷E
such that mE ° 1E
(iv) Let E and F be vector spaces and let E—*F be a map between the underlying sets. Show that is a linear map if and only if ltFO(p*_ (POThE (v) Denote by N(E) the subspace of C(E) generated by the elements of the form a, beE, /IEF — (cf.
part (iii)). Show that
kermE = N(E).
6.Let v=O
be a fixed polynomial and letf be any linear function in a vector space E. Define a function P(f): E-÷F by
P(f)x =
(x)".
Find necessary and sufficient conditions on P that P(f) function. § 3.
be
again a linear
Subspaces and factor spaces
In this paragraph, all vector spaces are defined over a fixed, but arbitrarily chosen field F of characteristic 0.
§ 3. Subspaces
and factor spaces
23
1.12. Subspaces. Let E be a vector space over the field F. A non-empty subset, E1, of E is called a subspace if for each x, yEE1 and every scalar
x+yEE1
(1.14)
and
Equivalently, a subspace is a subset of E such that Ax + /IyEE1
whenever x, yeE1. In particular, the whole space E and the subset (0) consisting of the zero vector only are subspaces. Every subspace E1 E contains the zero vector. In fact, if x1 e E1 is an arbitrary vector we have that 0= x1 — x1 E E1. A subspace E1 of E inherits the structure of a vector space from E. Now consider the injective map i: E1 —* E defined by
ix=x,
xeE1.
In view of the definition of the linear operations in E1 i is a linear mapping, called the canonical injection of E1 into E. Since i is injective it follows from (sec. 1.11) that a family of vectors in E1 is linearly independent (dependent) if and only if it is linearly independent (dependent) in E. Next let S be any non-empty subset of E and denote by E5 the set of linear combinations of vectors in S. Then any linear combination of vecis a linear combination of vectors in S (cf. sec. 1.3) and hence tors in Thus is a subspace of E, called the subspace generated it belongs to by 5, or the linear closure of S. In particular, if the set S is Clearly, S is a system of generators for linearly independent, then S is a basis of We notice that = S if and only if S is a subspace itself. 1.13. Intersections and sums. Let E1 and E2 be subspaces of E and consider the intersection E1 n E2 of the sets E1 and E2. Then E1 fl E2 is again a subspace of E. In fact, since OeE1 and OeE2 we have OEE1 fl E2 and so E1 fl E2 is not empty. Moreover, it is clear that the set E1 n E2 satisfies again conditions (1.14) and (1.15) and so it is a subspace of E. E1 n E2 is called the intersection of the subspaces E1 and E2. Clearly, E1 fl E2 is a subspace of E1 and a subspace of E2. The sum of two subspaces E1 and E2 is defined as the set of all vectors of the form x1eE1,x2EE2 (1.16) x=x1 +x2,
Chapter 1. Vector spaces
24
and is denoted by E1 +122. Again it is easy to verify that E1 +E2 is a subspace of E. Clearly E1 + E2 contains E1 and E2 as subspaces. A vector x of E1 + E2 can generally be written in several ways in the form (1.16). Given two such decompositions
x=x1+x2 and it follows that x1
—
=
x'2 —
x2.
Hence, the vector z = x1
—
fl E2. Conversely, let x=x1 +x2, x1 eE1, x2eE2 be a decomposition of x and z be an arbitrary vector of E1 fl E2. Then the vectors
is contained in the intersection
=
— zEE1
=
and
x2
+ zEE2
form again a decomposition of x. It follows from this remark that the decomposition (1.16) of a vector xeE1 +E2 is uniquely determined if and only if E1 n E2 0. In this case E1 + E2 is called the (internal) direct sum of E1 and 122 and is denoted by Now let S1 and be systems of generators for E1 and E2. Then clearly is a system of generators for E1 +E2. If T1 and T2 are respectively S1 U
bases for E1 and E2 and the sum is direct, E1 fl 122=0, then T1 U T2 is a To prove that the set T1 U T2 is linearly independent, basis for suppose that = 0, + Then
=
=
fl
—
0
whence and
Now the x1 are linearly independent, and so )J=O. Similarly it follows
that Suppose that
E=E1$E2
(1.17)
is a decomposition of E as a direct sum of subspaces and let F be an arbitrary subspace of E. Then it is not in general true that
F=F
n E1
Fn
E2
§ 3. Subspaces
and factor spaces
the example below will show. However, if E1 In fact, it is clear that
as
25
then (1.18) holds. (1.19)
On the other hand, it'
y=x1 +x2
x1eE1,x2eE2
is the decomposition of any vector yEF, then
x1eE1=Ffl E1, x2=y—x1eFfl E2. It
follows that
FcFfl
E2.
(1.20)
The relations (1.19) and (1.20) imply (1.18). Example 1: Let E be a vector space with a basis e1, e2. Define E1, E2 and F as the subspaces generated by e1, e2 and e1 + e2 respectively. Then
E =E1 while on the other hand
Fn
E1
=F
n E2
= 0.
Hence
F+Fn
E2.
1.14. Arbitrary families of subspaces. Next consider an arbitrary family of subspaces E, Then the intersection flEa is again a subspace of E. The sum EE1 is defined as the set of all vectors which can be written as finite sums,
(1.21)
and is a subspace of E as well. If for every
then each vector of the sum
can be uniquely represented in the form
(1.21). In this case the space
is called the (internal) direct sum of the
and is denoted by
subspaces
If
is a system of generators for
then the set
If the sum of the
is direct and
generators for then
is a basis for
is a system of
is a basis of
Chapter 1. Vector spaces
26
Example 2. Let by Then
be a basis of E and E2 be the subspace generated
E= Suppose (1.22)
is a direct sum of subspaces. Then we have the canonical injections We define the canonical projections
determined by
= where
It is clear that the are surjective linear mappings. Moreover, it is easily verified that the following relations hold: c'
XEE.
1.15. Complementary subspaces. An important property of vector spaces is given in the
Proposition I: If E1 is a subspace of E, then there exists a second subspace E2 such that E = E1 E2. E2 is called a complementary subspace for E1 in E. Proof We may assume that E1 E and E1 (0) since the proposition is trivial in these cases. Let (xx) be a basis of E1 and extend it with vectors to form a basis of E (cf. Corollary TI to Theorem I, sec. 1.6). Let E2 Then be the subspace of E generated by the vectors
E=
E1
E2.
In fact, since (xx) U (yp) is a system of generators for E, we have that
E=E1+E2.
(1.23)
On the other hand, if XEE1 fl E2, then we may write
x=
x=
Yfi II
and factor spaces
§ 3. Subspaces
27
whence
= 0.
— 13
Now since the set (x2) U
(y13) is
linearly independent, we obtain and
whence x=0. It follows that E1 fl E2=O and so the decomposition (1.23) is direct. As an immediate consequence of the proposition we have Corollary I. Let E1 be a subspace of E and : E1 —* F a linear mapping (Fa second vector space). Then ço1 may be extended (in several ways) to a linear map (p: E—*F.
Proof Let E2 be a complementary subspace for E1 in E,
E=E1Ej3E2
(1.24)
and define p by çox =
where
x—y+z is the decomposition of x determined by (1.24). Then 'p
=
+
'p
=
+
=
= and so q, is linear. It is trivial that 'p extends 'pr.
As a special example we have:
Corollary II: Let E1 be a subspace of E. Then there exists a surjective linear map —*
E1
such that
çox=x
xEE1.
Chapter I. Vector spaces
28
Proof Simply extend the identity map i : E1 —÷E1 to a linear map (p:E-*E1. 1.16.
Factor spaces. Suppose E1 is a subspace of the vector space E.
Two vectors XE E and x' e E are called equivalent mod E1 if x' — XE E1. it
is easy to verify that this relation is reflexive, symmetric and transitive and hence is indeed an equivalence relation. (The equivalence classes are the cosets of the additive subgroup E1 in E(cf. sec. 0.4)). Let E/E1 denote the set of the equivalence classes so obtained and let be
the set mapping given by
7rXX, where
XEE
is the equivalence class containing x. Clearly m is a surjective
map.
Proposition II: There exists precisely one linear structure in E/E1 such that m is a linear mapping. Proof: Assume that E/E1 is made into a vector space such that m is a linear mapping. Then the equations
m(x + y) =
mx
+ my
and m (2 x) =
2
mx
show that the linear operations in E/E1 are uniquely determined by the linear operations in E. it remains to be shown that a linear structure can be defined in E/E1 such that m becomes a linear mapping. Let i and 9 be two arbitrary elements of E/E1 and choose vectors XEE and yEE such that
my=9. Then the class m (x + y) depends only on
and 9. Assume for instance that
x'EE is another vector such that mx'=i. Then mx' = mx and hence we may write
x'=x+z,
zEE1.
it follows that
x' + y = (x + y) + z whence
m(x' + y)= m(x + y).
§ 3. Subspaces and factor spaces
We now define the sum of the elements where
29
and 9eE/E1 by
and 9=7ry.
(1.25)
It is easy to verify that E/E1 becomes an abelian group under this operation and that the class = E1 is the zero-element. Now let be an arbitrary element and 2eF be a scalar. Choose xeE such that Then a similar argument shows that the class ir(Ax) depends only on (and not on the choice of the vector x). We now define the scalar multiplication in E/E1 by where
(1.26)
Again it is easy to verify that the multiplication satisfies axioms 11.1—11.3
and so E/E1 is made into a vector space. It follows immediately from (1.25) and (1.26) that
ir(x+y)=itx+iry
x,yeE
= 2itx i.e., iv is a linear mapping.
The vector space E/E1 obtained in this way is called the factor space of E with respect to the subspace E1. The linear mapping iv is called the canonical projection of E onto E1. If E1 = E, then the factor space reduces to the vector 0. On the other hand, if E1 =0, two vectors xeE and yeE are equivalent mod E1 if and only if y = x. Thus the elements of E/(0) are the singleton sets {x} where x is any element of E, and iv is the linear isomorphism Consequently we identify E and E/(0). 1.17. Linear dependence mod a subspace.. Let E1 be a subspace of E, and suppose that (x2) is a family of vectors in E. Then the will be called linearly dependent mod E1 if there are scalars not all zero, such that
If the
are not linearly dependent mod E1 they will be called linearly
independent mod E1. Now consider the canonical projection
It follows immediately from the definition that the vectors dependent (independent) mod E1 if and only if the vectors dependent (independent) in E/E1.
are linearly are linearly
Chapter 1. Vector spaces
1.18. Basis of a factor space. Suppose that U (:fl) is a basis of E form a basis of E1. Then the vectors 7r:p form a such that the vectors basis of E/E1. To prove this let E2 be the subspace of E generated by the vectors zp . Then
(1.27)
Now consider the linear mapping
defined by
goz=rrz Then
ZEE2.
is surjective. In fact, let
be an arbitrary vector. Since
it:E—*E/E1 is surjective we can write
x—irx,
xEE.
In view of (1.27) the vector x can be decomposed in the form
x=y+z
yEE1,ZEE2.
(1.28)
Equation (1.28) yields
= irx = icy + Trz = and so çø is surjective. To show that çø is injective assume that
çoz=çoz'
=
z,z'eE2.
Then —
z) = p(z' — z) = 0
and hence z'—zeE1. On the other hand we have that z'—zeE2 and thus
z'—zeE1 fl is a linear isomorphism and now PropoIt follows that sition Ill of sec. 1.11 shows that the vectors irzp form a basis of E/E1.
Problems 1. Let be an arbitrary vector in F3. Which of the following subsets are subspaces? a) all vectors with b) all vectors with =0 c) all vectors with d) all vectors with = F,,, Fd generated by the sets of problem 1, 2. Find the subspaces and construct bases for these subspaces.
§ 3. Subspaces and factor spaces
31
3. Construct bases for the factor spaces determined by the subspaces of problem 2.
4. Find complementary spaces for the subspaces of problem 2, and construct bases for these complementary spaces. Show that there exists more than one complementary space for each given subspace. 5. Show that
a) 13=Fa+Fb b)
13=Fb+FC
c) 13=Fa+Fc Find the intersections Fa fl Fb, Fb fl
Fa n
and decide in which cases
the sums above are direct.
6. Let S be an arbitrary subset of E and let be its linear closure. is the intersection of all subspaces of E containing S. Show that 7. Assume a direct composition Show that in each class of E with respect to E1 (i.e. in each coset there is exactly one vector of E2. 8. Let E be a plane and let E1 be a straight line through the origin. What is the geometrical meaning of the equivalence classes with respect to E1. Give a geometrical interpretation of the fact that x
y
x' + i.
Suppose S is a set of linearly independent vectors in E, and suppose T is a basis of E. Prove that there is a subset of T which, together with S, is again a basis of E. and E_ defined 10. Let w be an involution in E. Show that the sets 9.
by
= {xeE; wx = x}, are
E_ = {xeE; wx = — x}
subspaces of E and that
E=
E..
11. Let E1, E2 be subspaces of E. Show that E1 + E2 is the linear closure of E1 u E2. Prove that
E1 + E2 =
E1
U E2
if and only if
E1DE2 or E2DE1. 12. Find subspaces E1, E2, E3 of r3 such that
i) En E1=0 (i+j) ii) E1 + E2 + E3 = p3 iii) the sum in ii) is not direct.
Chapter I. Vector spaces
32
§ 4.
Dimension
In this paragraph all vector spaces are defined over a fixed, but arbitrarily chosen field F of characteristic 0. 1.19. Finitely generated vector spaces. Suppose E is a finitely generated vector space, and consider a surjective linear mapping (p:E—+F. Then F is a system of generators for is finitely generated as well. In fact, if generate F. In particular, the factor space E, then the vectors çox1, ..., of a finitely generated space with respect to any subspace is finitely generated. Now consider a subspace E1 of E. In view of Cor. II to Proposition I, sec. 1.15 there exists a surjective linear mapping ço:E—÷E1. It follows that E1 is finitely generated.
1.20. Dimension. Recall that every system of generators of a nontrivial vector space contains a basis. It follows that a finitely generated non-trivial vector space has a finite basis. In the following it will be shown that in this case every basis of E consists of the same number of vectors.
This number will be called the dimension of E and will be denoted by dim E. E will be called a finite-dimensional vector space. We extend the definition to the case E=(0) by assigning the dimension 0 to the space (0). If E does not have finite dimension it will be called an infinite-dimensional vector space.
Proposition 1. Suppose a vector space has a basis of 11 vectors. Then every family of (n+ 1) vectors is linearly dependent. Consequently, ii is the maximum number of linearly independent vectors in E and hence every basis of E consists of n vectors. Proof: We proceed by induction on ii. Consider first the case n= 1 and let a be a basis vector of E. Then if x 0 and y 0 are two arbitrary vectors we have that whence
and y=pa, ii+O —
= 0.
Thus the vectors x and y are linearly dependent. Now assume by induction that the proposition holds for every vector space having a basis of 1 vectors. = 1. ii) be a basis of E and a family of ii + I vectors. We may assume that + 0 because
Let F be a vector space, and let alL x1 . .
. .
1
§ 4. Dimension
33
otherwise it would follow immediately that the vectors x1 . 1 were linearly dependent. Consider the factor space E1 =E/(xn+ i) and the canonical projection
it: E —÷ E/(xn+i) Since the system where (xn+ i) denotes the subspace generated by generates E1 it contains a basis of E1 (cf. Cor. Ito Theorem I, a1, ...,
sec. 1.6). On the other hand the equation Xn+i
implies that
vi
AV
=0
and so the vectors (a1, ..., an) are linearly dependent. It follows that E1
has a basis consisting of less than n vectors. Hence, by the induction hypothesis, the vectors are linearly dependent. Consequently, there exists a non-trivial relation
=0 and so
=
Xn+i.
This formula shows that the vectors Xi...Xn+i are linearly dependent and
closes the induction.
Example: Since the space vectors it follows that
n
(cf. Example 1, sec. 1.2) has a basis of n
= n.
Proposition II: Two finite dimensional vector spaces E and F are isomorphic if and only if they have the same dimension. Proof: Let be an isomorphism. Then it follows from Proposition III, sec. 1.11 that p maps a basis of E injectively onto a basis of F and so dim E=dim F. Conversely, assume that dim E=dim F—n and let and = 1.. .n) be bases of E and F respectively. According to Propo-
sition II, sec. 1.11 there exists a linear mapping
such that
1...n). Then p maps the basis onto the basis it is a linear isomorphism by Proposition Ill, sec. 1.11.
and hence
3
(Ireub. Lineur Algebra
Chapter I. Vector spaces
34
1.21. Subspaces and factor spaces. Let E1 be a subspace of the n-dimen-
sional vector space E. Then E1 is finitely generated and so it has finite dimension in. Let Xi...Xm be a basis of E1. Then the vectors xi...xm are linearly independent in E and so Cor. II to Theorem I, sec. 1.6 implies that the vectors may be extended to a basis of E. Hence
dimE1 dimE.
(1.29)
if equality holds, then the vectors xi...xm form a basis of E and it follows that E1 = E. Now it will be shown that
dimE=dimE1 +dimE/E1.
(1.30)
If E1 =(0) or E1 =E (1.30) is trivial and so we may assume that E1 is a proper non-trivial subspace of E,
0< dimE1 xEE,yEF.
Dual spaces. Suppose E*, E is a pair of vector spaces, and assume that a fixed non-degenerate bilinear function, , in E* x E is defined. Then E and E* will be called dual with respect to the bilinear function x>, is called the scalar product of and x, The scalar and the bilinear function is called a scalar product between E* and E. 2.22.
Examples.
1
Let E= E* = F and define a mapping
by
2,peF.
=2ji
Clearly is a non-degenerate bilinear function, and hence F can be regarded as a self-dual space.
2. Let E= E* = F" and consider the bilinear mapping defined by = 1=1 Greub. Li near AJgcbra
Chapter II. Linear mappings
where
x=
...,
It is easy to verify that the bilinear mapping is non-degenerate and is dual to itself.
hence
3. Let E be any vector space and E* = L (E) the space of linear functions in E. Define a bilinear mapping by
=f(x),
JEL(E),XEE.
Sincef(x)=O for each XEE if and only iff=O, it follows that NL(E)=O. On the other hand, let aeE be a non-zero vector and E1 be the onedimensional subspace of E generated by a. Then a linear function g is defined in E1 by g(x)=)L where In view of sec. 1.15, g can be extended to a linear function f in E. Then
a> =
f (a) =
g
1 +0.
(a)
It follows that NE = 0 and hence the bilinear function is non-degenerate.
This example is of particular importance because of the following
Proposition I: Let E*, E be a pair of vector spaces which are dual with respect to a scalar product Then an injective linear map & E*_*L(E) is defined by =
E E*, XE E.
x>
Proof? Fix a vector a*EE*. Then a linear function,
ja*k) = Since
is
bilinear,
X>
XE E.
(2.39)
is defined in E by (2.40)
depends linearly on a*. Now define cP by setting (2.41)
To show that 1i is injective, assume that i(ci*)=O for some a*EE*. Then
X>=0 for every XEE. Since
is
non-degenerate. it follows
that a*=0. Note: It will be shown in sec. 2.33 that P is surjective (and hence a linear isomorphism) if E has finite dimension.
§ 5. Dual vector spaces
67
e E * and xe E are 2.23. Orthogonal complements. Two vectors called orthogonal if <xe, x> = 0. Now let E1 be a subspace of E. Then the of E*. vectors of E* which are orthogonal to E1 form a subspace is called the orthogonal complement of E1. In the same way every subspace
determines an orthogonal complement (Er)' E. The fact that the bilinear function is non-degenerate can now be expressed by the relations
E'=O and (E*)±=O. It follows immediately from the definition that for every subspace E1
E
(2.42)
E1
Suppose next that E*, E are a pair of dual spaces and that F is a sub-
space of E. Then a scalar product is induced in the pair E*/F', F by VEF,
where is a representative of the class In fact, let be the restriction of the scalar product to E* x F. Then the nullspaces of cli are given by
and NF=O. Now our result follows immediately from sec. 2.21.
More generally, suppose Fc E and H* E* are any subspaces. Then a scalar product in the pair H*/H* fl F', F/Fn (H*)', is determined by
= as
a similar argument will show.
2.24. Dual mappings. Suppose that E, E* and F, F* are two pairs of dual spaces and p:E—÷F, are called dual if 'p and
are linear mappings. The mappings
= exists at most one dual mapping.
To a given linear mapping
If
and
are dual to 'p we have that =
=
and
whence —
=
0
XEE,Y*EF.
Chapter II. Linear mappings
68
This implies, in view of the duality of E and E*, that
=
whence
an example of dual mappings consider the dual pairs E*, E and is a subspace of E (cf. sec. 2.23) and let it be the canonical projection of E* onto E*/EjL, As
E*/EjL, E1 where E1
E*. Then the canonical injection
i:E1
is dual to it. In fact, if xeE1, and y*EE* are arbitrary, we have
= =
=
and thus it
= i*.
2.25. Operations with dual mappings. Assume that E*, E and F*, F are two pairs of dual vector spaces. Assume further that p: E—*F and i/i:E-+F are linear mappings and that there exist dual mappings p*: and ,/,*:E*÷_F*. Then there are mappings dual to and 2p and these dual mappings are given by = (p* + 1/1*
(p +
(2.43)
and (2.44)
(2.43) follows from the relation + VJ*)y*,x> =
+ +
—
=
+
and (2.44) is proved in the same way. Now let G, G* be a third pair of F* +— G * be a pair of dual mappings. Then dual spaces and let x: F—* G, the dual mapping of x p exists, and is given by :
(xoco)* =
In fact, if z*EG* and xeE are arbitrary vectors we have that <x*z*,cox> = .
§ 5. Dual vector spaces
69
For the identity map we have clearly —1 —
\*
Now assume that p: E—+F has a left inverse 'Pi : F—+E, (2.45) and that the dual mappings
and
exist. Then we
obtain from (2.45) that = (1E)*
(2.46)
In view of sec. 2.11 the relations (2.45) and (2.46) are equivalent to p injective,
surjective
injective,
(p* surjective.
and
In particular, if p and
are inverse linear isomorphisms, then so are (p* and 'pt. 2.26. Kernel and image space. Let p : E—*F and (p*: be a pair of dual mappings. In this section we shall study the relations between the subspaces
kerpcE, ImqcF
and ker p*
F
Im (p* c E*.
First we establish the formulae
kerp* = (lmço)'
(2.47)
kerp =(Imp*)±.
(2.48)
In fact, for any two vectors y*eker p*, pxelm çü we have = =
0
and hence the subspaces ker and Im 'p are orthogonal, ker Now let y*E(Im p)' be any vector. Then for every xeE EJ. j*i
Chapter II. Linear mappings
74
is again non-
Moreover, the restriction of the scalar product to E1, degenerate, and
Proposition V has the following converse: Proposition VI. Let E1 c E be any subspace, and let subspace dual to E1 such that
L (E) be a
= Then (2.56)
and
L(E) =
(2.57)
Proof: We have
(E1 + Er)' =
n
(Er)' =
n
=0
whence
E=O'=(E1
(2.58)
On the other hand, since E1 and E1 n
are dual, it follows that =0
which together with (2.58) proves (2.56). (2.57) follows from Proposition V and (2.56).
Problems 1. Given two pairs of dual spaces E*, E and F*, F prove that the spaces
and EG3F are dual with respect to the bilinear function = <x*, x> + •
2. Consider two subspaces E1 and E2 of E. Establish the relation (E1
n
E
consider the mapping ).:
fined by
Prove that is injective. Show that
de-
aEE, jeL(E). is surjective if and only if dim E
= Replacing
by
in this relation we obtain the formula =
which shows that the components of a vector XEE with respect to a basis
of E are the scalar products of x with the dual basis vectors. Proposition II: Let F be a subspace of E and consider the orthogonal complement F'. Then dimF + dimF' = dimE. (2.63)
Proof: Consider the factor space E*/FH In view of sec. 2.23, is dual to F which implies (2.63). Proposition III: Let E, E* be a pair of dual vector spaces and consider a bilinear function cP: E* x E—÷F. Then there exists precisely one linear transformation such that
= <x*,px>
x*EE*,xEE.
§ 6. Finite dimensional vector spaces
79
Let XEE be a fixed vector and consider the linear E* defined by =
in
In view of proposition I there is precisely one vector çoxeE such that
= The two above equations yield x> = cp(x*, x)
E
E*, xe E
and so a mapping (p : E—÷E is defined. The linearity of çø follows immediately from the bilinearity of and Suppose now that are two
linear transformations of F such that cli(x*,x) =
x>
and
cli(x*,x) =
(p2x>
Then we have that
<X, (pi — X> = 0 whence Proposition III establishes a canonical linear isomorphism between the
spaces B(E*, E) and L(E; F),
L(E; E). Here B(E*, F) is the space of bilinear functions &E* x dition and scalar multiplication defined by (tIi1 + tli2)(x*,x) =
and
(2 k)(x*, x) =
with ad-
+ cP2(x*,x) çji (x*, x).
2.34. The rank of a linear mapping. Let (p : E—÷F be a linear mapping of finite dimensional vector spaces. Then ker cp c E and Tm (p F have finite dimension as well. We define the rank of (p as the dimension of Tm (p
r((p) = dim Im (p. In view of the induced linear isomorphism Im(p
we have at once r((p) + dim ker(p = dimE.
(2.64)
(p is called regular if it is injective. (2.64) implies that (p is regular if and only if r ((p) = dim F.
Chapter 11. Linear mappings
In the special case dim E= dim F (and hence in particular in the case of a linear transformation) we have that is regular if and only if it is surjective. 2.35. Dual mappings. Let E*, E and F*, F be dual pairs and çø: be a linear mapping. Since E* is canonically isomorphic to the space L (E) there exists a dual mapping Hence we have the relations (cf. sec. 2.281
Im
= (ker
and (ker(p)L.
The first relation states that the equation = has
a solution for a given yEF if and only if y satisfies the condition
<x*,y>=O forevery
x*ekerp*.
The second relation implies that dual mappings have the same rank. In fact, from (2.63) we have that
dimlmp* = dim(kerp)' = dimE — dimkerço = dimlmp whence r(co*)
= r(p).
(2.65)
Problems (In problems 1—10 it will be assumed that all vector spaces have finite dimension).
1. Let E, E* be a pair of dual vector spaces, and let E1, E2 be subspaces of E. Prove that (E1 n
Hint: Use problem 2, § 5. 2. Given subspaces Uc E and V* c E* prove that
dim(U' n V*) + dim U = dim(U
n
+ dim
3. Let E, E* be a pair of non-trivial dual vector spaces and let 'p : be a linear mapping such that 'p = (z*) - 'p for every linear auto1
morphism t of E. Prove that 'p =0. Conclude that there exists no linear mapping which transforms every basis of E into its dual basis.
§ 6. Finite dimensional vector spaces
(v = 1. . .n) of E, E* show that the
4. Given a pair of dual bases
••,
bases (x*' + again dual. v2
...,
and (x1,
are
5. Let E, F, G be three vector spaces. Given two linear mappings prove that
ço:E—*F and
and
6. Let E be a vector space of dimension n and consider a system of n linear transformations such that
(i,j=1...n). a) Show that every
has rank 1 a second system with the same property, prove that there exists a linear automorphism t of E such that
b) If
=
t.
a
— 1
7. Given two linear mappings p:
r(q) -
and
: E—>F prove that
r(ço) +
+
Show that the dimensions of the spaces M' (ço), are given by
8. § 2
dimM'(p) dim
(dim F —
(p) in problem 3,
r(p))dimE
((p) = dim ker çø dim F.
9. Show that the mapping
defines a linear isomorphism,
cP:L(E;F)3 L(F*;E*). 10. Prove that c1'M'(ço) =
and = Ml(ço*) where
the notation is defined in problems 8 and 9. Hence obtain the
formula
11. Let p:
r(ço) = r(p*).
be a linear mapping (E, F possibly of infinite dimension). Prove that Im ço has finite dimension if and only if ker 'p has finite codimension (recall that the codimension of a subspace is the dimension 6
Greub, Lineur Algebra
Chapter II. Linear mappings
of the corresponding factor space), and that in this case
codimkerp = dimlmp. 12. Let E and F be vector spaces of finite dimension and consider a bilinear function P in Ex F. Prove that dim E — dim NE = dim F — dim NF
where NE and NF denote the null spaces of P. Conclude that if dim E=dim F. then NE=O if and only if
Chapter III
Matrices In this chapter all vector spaces will be defined over afixed, but arbitrarily chosen field F of characteristic 0.
§ 1. Matrices and systems of linear equations 3.1. Definition. A rectangular array A—_(
(3.1)
)
is called a matrix of n rows and m columns or, in of nm scalars an n x m-matrix. The scalars are called the entries or the elements of the matrix A. The rows (v—_1...n) and therefore are called the
can be considered as vectors of the space row-vectors of A. Similarly, the columns —
...
considered as vectors of the space
(it = 1 ... m)
are called the column-vectors of A.
Interchanging rows and columns we obtain from A the transposed matrix
A*=(
).
(3.2)
In the following, matrices will rarely be written down explicitly as in This notation has the (3.1) but rather be abbreviated in the form A = disadvantage of not identifying which index counts the rows and which the columns. It has to be mentioned in this connection that it would be very undesirable — as we shall see — to agree once and for all to always let
Chapter III. Matrices
the subscript count the rows, etc. If the above abbreviation is used, it will be stated explicitly which index indicates the rows. 3.2. The matrix of a linear mapping. Consider two linear spaces E and With the aid of F of dimensions ii and rn and a linear mapping p: (v 1. . n) and bases = 1. in) in E and in F respectively, every can be written as a linear combination of the vectors vector (1u:=l...rn), . .
.
(v
1... n).
(3.3)
where v In this way, the mapping p determines an n x rn-matrix counts the rows and ,u counts the columns. This matrix will be denoted by M(p, or simply by M(tp) if no ambiguity is possible. Conversely, every n x in-matrix determines a linear mapping p:E-+F by the equations (3.3). Thus, the operator M('p)
M: 'p
defines a one-to-one correspondence between all linear mappings ço:E—*F and all n x rn-matrices.
3.3. The matrix of the dual mapping. Let E* and F* be dual spaces of E and F, respectively, and p: E-+F, a pair of dual mappings. Consider two pairs of dual bases X*V, (v = 1. n) and (/2 1.. . in) of E*, E and F*, F, respectively. We shall show that the two corresponding matrices and M('p*) (relative to these bases) are transposed, i.e., that M('p*) = M((p)*. (3.4) :
. .
The matrices M('p) and M('p*) are defined by the representations 'p
=
and
=
Note here that the subscript v indicates in the first formula the rows of
the matrix and
and in the second the columns of the matrix in the relation
Substituting
=
(3.5)
=
(3.6)
we obtain Now (p
=
YK> =
(3.7)
§ 1. Matrices and systems of linear equations
and
=
(3.8)
The relations (3.6), (3.7) and (3.8) then yield */1_ — V• JL
V
— as stated before — that the subscript v indicates rows of and columns of (ce) we obtain the desired equation (3.4). 3.4. Rank of a matrix. Consider an n x ni-matrix A. Denote by r1 and by r2 the maximal number of linearly independent row-vectors and co-
Observing
lumn-vectors, respectively. It will be shown that r1 =r2. To prove this let E and F be two linear spaces of dimensions n and iii. Choose a basis x. (v 1.. .n) and y,2 (p = 1.. rn) in E and in F and define the linear mapping p: F by (/1X% = Besides
'p, consider the isomorphism f3:
F
F"'
defined by where
y= Then
j3 o p
is a linear mapping of E into F"'. From the definition of fi it into the v-th row-vector,
follows that f3 a p maps
=
/3
Consequently,
the rank of flap
is
equal to the maximal number r1 of
linearly independent row-vectors. Since /3 is a linear isomorphism, has the same rank as p and hence r1 is equal to the rank r of p. Replacing çø by we see that the maximal number r2 of linearly independent column-vectors is equal to the rank of ço*. But (p* has the same rank as ço and thus r1 = r2 = r. The number r is called the rank of the matrix A. 3.5. Systems of linear equations. Matrices play an important role in the discussion of systems of linear equations in afield. Such a system
=
(p = 1... in)
(3.9)
Chapter III. Matrices
of in equations with ii unknowns is called inhomogeneous if at least one is different from zero. Otherwise it is called homogeneous. From the results of Chapter II it is easy to obtain theorems about the existence and uniqueness of solutions of the system (3.9). Let E and F be two linear spaces of dimensions ii and in. Choose a basis (v = I .n) of .
E as well as a basis
(p =
1
. . .
in)
.
of F and define the linear mapping
F by
(p:
= Consider two vectors (3.10)
and
(3.11)
Then (3.12)
Comparing the representations (3.9) and (3. 12) we see that the system (3.9) is equivalent to the vector-equation X
= j.
Consequently, the system (3.9) has a solution if and only if the vector y
is contained in the image-space Im tp. Moreover, this solution is uniquely determined if and only if the kernel of ço consists only of the zero-vector. 3.6. The homogeneous system. Consider the homogeneous system =
0
(p
1... in).
(3.13)
is a From the foregoing discussion it is immediately clear that solution of this system if and only if the vector x defined by (3.10) is contained in the kernel ker p of the linear mapping p. In sec. 2.34 we have shown that the dimension of ker 'p equals n—r where r denotes the rank of 'p. Since the rank of 'p is equal to the rank of the matrix we therefore obtain the following theorem:
A homogeneous system of m equations with n unknowns whose coefficient-
matrix is of rank r has exactly n — r linearly independent solutions. In the special case that the number in of equations is less than the number n of unknowns we have ii— r n — in 1. Hence the theorem asserts that the system (3. 13) always has non-trivial solutions if ni is less than n.
§ I. Matrices and systems of linear equations
87
3.7. The alternative-theorem. Let us assume that the number of equations is equal to the number of unknowns,
=
= 1... n).
(3.14)
Besides (3.14) consider the so-called "corresponding" homogeneous system =0 (3.15) (ii = 1... n).
The mapping introduced in sec. 3.5 is now a linear mapping of the ndimensional space E into a space of the same dimension. Hence we may apply the result of sec. 2.34 and obtain the following alternative-theorem: If the homogeneous system possesses only the trivial solution (0.. .0), the inhomogeneous system has a solution
every choice of the right-
..
hand side. If the homogeneous system has non-trivial solutions, then the inhoinogeneous one is not so/cable for eterv choice of the ;7v(v = . ii). From the last statement of section 3.5 it follows immediately that in the first case the solution of (3.14) is uniquely determined while in the second case the system (3.14) has — if it is solvable at all — infinitely many solutions. 1
.
3.8. The main-theorem. We now proceed to the general case of an arbitrary system
=
(1u
= 1 ... m)
(3.16)
of m linear equations in n unknowns. As stated before, this system has a solution if and only if the vector
y= is
contained in the image-space Im ço. In sec. 2.35 it has been shown that
the space Im p is the orthogonal complement of the kernel of the dual mapping In other words, the system (3.16) is solvable if and only if the right-hand side (p=l...m) satisfies the conditions (3.17)
for all solutions
=I
.. ni)
of the system
(v=1...n). We formulate this result in the following
(3.18)
Chapter III. Matrices
Main-theorem: An inhoniogeneous system of n equations in in unknowns has a solution if and only if every solution 1. in) of the transposed homogeneous system (3.18) satisfies the orthogonality-relation (3.17). . .
Problems 1. Find the matrices corresponding to the following mappings:
a) px=O. b) çox=x. c) px=2x. d)
x=
where
(v =
1,
..., n) is a given basis and in n is a
given number and
2. Consider a system of two equations in n unknowns
Find the solutions of the corresponding transposed homogeneous system.
3. Prove the following statement: The general solution of the inhomogeneous system is equal to the sum of any particular solution of this system and the general solution of the corresponding homogeneous system. 4. Let and be two bases of E and A be the matrix of the basistransformation Define the automorphism c' of E by Prove that A is the matrix of as well with respect to the basis as with respect to the basis 5. Show that a necessary and sufficient condition for the n x n-matrix A= to have rank 1 is that there exist elements and ..., j31 •.., f3fl such that f32 = (v = 1,2, ...,n; = 1,2, ...,n). if A +0, show that the elements cc and 13P are uniquely determined up to = 1. constant factors ). and p respectively, where as 6. Given a basis a linear space E, define the mapping p : p
=
Find the matrix of the dual mapping relative to the dual basis. 7. Verify that the system of three equations:
+ + = 3, —11—4' = 4, + 3ij + 34' = 1
§ 2. Multiplication of matrices
89
has no solution. Find a solution of the transposed homogeneous system which is not orthogonal to the vector (3, 4, 1). Replace the number 1 on the right-hand side of the third equation in such a way that the resulting system is solvable. 8. Let an inhomogeneous system of linear equations be given,
(ii=1,...,in). augmented matrix of the system is defined as the in x (n+ 1)-matrix ..., çi"). Prove that obtained from the matrix by adding the column the above system has a solution if and only if the augmented matrix has the same rank as the matrix (cd). The
§ 2.
Multiplication of matrices
3.9. The linear space of the n x m matrices. Consider the space L(E; F) of all n x rn-matrices. of all linear mappings ço:E—÷F and the set Once bases have been chosen in E and in F there is a 1-1 correspondence between the mappings p: E—÷ F and the n x rn-matrices defined by (3.19)
This correspondence suggests defining a linear structure in the set MnXrn such that the mapping (3.19) becomes an isomorphism. We define the sum of two n x rn-matrices and
A= as
B
the n x rn-matrix
A+B=
+
and the product of a scalar 2 and a matrix A as the matrix 2A
is immediately apparent that with these operations the set is a linear space. The zero-vector in this linear space is the matrix which has only zero-entries. Furthermore, it follows from the above definitions that It
i.e., that the mapping (3.19) defines an isomorphism between L (E; F) and the space X
Chapter III. Matrices
3.10. Product of matrices. Assume that and
are linear mappings between three linear spaces E, F, G of dimensions ii, in and I, respectively. Then /i is a linear mapping of E into G. Select in) and a basis (v I .. ii), (p = .. .1) in each of the three spaces. Then the mappings and i/i determine two matrices and by the relations = 1
. . .
1
and
These two equations yield = Consequently, the matrix of the mapping
ço
relative to the bases x%, and
is given by
(3.20)
The n x 1-matrix (3.20) is called the product of the ii x rn-matrix A = and the rn x /-matrix B= and is denoted by A B. It follows immediately from this definition that
= M(tp)M(t/i).
(3.21)
Note that the matrix M p) of the product-mapping p is the product of the matrices M(ço) and in reversed order of the factors. It follows immediately from (3.21) and the formulas of sec. 2.16 that the matrix-multiplication has the following properties:
A(.)LB1 +pB2)=).AB1 +pAB2 (AB)C = A(BC) (AB)* = B*A*. 3.11. Automorphisms and regular matrices. An ii x n-matrix A is called regular if it has the maximal rank /1. Let 'p be an automorphism of the n-dimensional linear space E and A = M ('p) the corresponding /1 x nmatrix relative to a basis (v = .. n). By the result of section 3.4 the rank of 'p is equal to the rank of the matrix A. Consequently, the matrix A is regular. Conversely, every linear transformation p: E—* E having a regular matrix is an automorphism. 1
.
§ 2. Multiplication of matrices
To every regular matrix A
A' such that
there
exists an in verse matrix, i.e., a matrix
AA' =
=
.1,
where J denotes the unit matrix whose entries are In fact, let cp be the E automorphism of E such that M((p)=A and let be the inverse automorphism. Then = 1
whence
M(p)M(q,)' =
o(p)
= M(i)
J
and
M(ço')M(p)= These
M(t)= J.
equations show that the matrix
A' = is the inverse of the matrix A.
Problems 1. Verify the following properties: a) b)
(L4)*=2A*.
c) 2. A square-matrix is called upper (lower) triangular if all the elements below (above) the main diagonal are zero. Prove that sum and product of triangular matrices are again triangular.
3. Let p be linear transformation such that 'p2 = 'p. Show that there exists a basis in which 'p is represented by a matrix of the form: O...O
1 1
.
rn
O...O
1
o
0
n—rn 0
o n
Chapter III. Matrices
92
4. Denote by A13 the matrix having the entry I at the place (i,j) and zero elsewhere. Verify the formula =
Prove that the matrices form a basis of the space M"
X
Basis-transformation
§ 3.
3.12. Definition. Consider two bases and E. Then every vector (v = 1. ii) can be written as
l...n) of the space
. .
=
(3.22)
Similarly, (3.23)
The two n x n-matrices defined by (3.22) and (3.23) are inverse to each other. In fact, combining (3.22) and (3.23) we obtain — xi,— A
This is equivalent to
5:A = 0
— A
p
and hence it implies that
In a similar way the relations
are proved. Thus, any two bases of E are connected by a pair of inverse matrices. Conversely, given a basis (v = 1. n) and a regular n x n-matrix another basis can be obtained by = . .
To show that the vectors
are linearly independent, assume that
= 0. Then
=0
§ 3. Basis-transformation
and
93
hence, in view of the linear independence of the vectors
Multiplication with the inverse matrix
yields
(K=1...n). 3.13. Transformation of the dual basis. Let E* be a dual space of E, the dual of the basis (v I.. .n). Then the dual basis of and = where
is
(3.24)
a regular n x n-matrix. Relations (3.23) and (3.24) yield
;>.
x%) =
(3.25)
Now xi,> =
and
=
Substituting this in (3.25) we obtain I-'v
This shows that the matrix of the basis-transformation is the inverse of the matrix of the transformation The two basis-transformations and = (3.26) = are called contragradient to each other. The relations (3.26) permit the derivation of the transformation-law
for the components of a vector xeE under the basis-transformation Decomposing x relative to the bases
and
we obtain
and
From the two above equations we obtain in view of (3.26). =
=
(3.27)
Comparing (3.27) with the second equation (3.26) we see that the components of a vector are transformed exactly in the same way as the vectors of the dual basis.
Chapter III. Matrices
94
3.14. The transformation of the matrix of a linear mapping. In this section it will be investigated how the matrix of a linear mapping (p: E—÷ F is changed under a basis-transformation in E as well as in F. Let M(p; xi,, and M (p; be the ii x in-matrices of çø relative = = to the bases xv,, (v = .. in), respectively. Then and .1?, p = 1
.
1
(v=1...n).
and
Introducing the matrices A=
and
of the basis-transformations we then have the relations
-
(3.28)
B= and their inverse matrices,
and
=
=
=
=
(3.29) YK.
Equations (3.28) and (3.29) yield
and we obtain the following relation between the matrices =
and (n): (3.30)
Using capital letters for the matrices we can write the transformation formula (3.30) in the form = A M (p;
M ((p;
It shows that all possible matrices of the mapping p are obtained from a particular matrix by left-multiplication with a regular 11 x n-matrix and right-multiplication with a regular in x in-matrix.
Problems 1. Let f be a function defined in the set of all ii x n-matrices such that
f(TAT1)=f(A) for every regular matrix T. Define the function F in the space L (E; E) by
F (q) =
f (M (ço;
xv))
§ 4. Elementary transformations
95
where E is an n-dimensional linear space and (v = 1.. . n) is a basis of E. Prove that the function F does not depend on the choice of the basis is a linear transformation E—+E having the same 2. Assume that matrix relative to every basis (v = 1. n). Prove that p = 2i where 2 is a scalar. 3. Given the basis transformation . .
= X2
—
x2
—
x3
— X2
X3 = 2x2 + x3
find all the vectors which have the same components with respect to the bases x1, and
1,
§ 4.
2, 3).
Elementary transformations
3.15. Definition. Consider a linear mapping p: Then there exists a basis ..., n)ofE and a basis ..., n) ofFsuch that the corresponding matrix of 'p has the following normal-form:
r (3.31)
b
.
.
where r is the rank of 'p. In fact, let a basis of E such form a basis of the kernel. Then the vectors that the vectors ar+ bQ 'p aQ = 1, ..., r) are linearly independent and hence this system can be extended to a basis (b1, ..., bm) of F. It follows from the construction and that the matrix of 'p has the form (3.31). of the bases 1, ..., n) and 1, ..., m) be two arbitrary bases of Now let E and F. It will be shown that the corresponding matrix M('p; can be converted into the normal-form (3.31) by a number of elementary basis-transformations. These transformations are:
Chapter III. Matrices
(1.1.) Interchange of two vectors and x1(i+j). (1.2.) Interchange of two vectors Yk and Yi (k 1). (11.1.) Adding to a vector an arbitrary multiple of a vector (11.2.) Adding to a vector Yk an arbitrary multiple of a vector Yi (1+ k). it is easy to see that the four above transformations have the following effect on the matrix M((p): (1.1.) Interchange of the rows I andj. (1.2.) Interchange of the columns k and 1. (11.1.) Replacement of the row-vector by (11.2.) Replacement of the column-vector bk by It remains to be shown that every n x rn-matrix can be converted into the normal form (3.31) by a sequence of these elementary matrix-transformations and the operations be the given ii x rn-matrix. 3.16. Reduction to the normal-form. Let otherwise the It is no restriction to assume that at least one matrix is already in the normal-form. By the operations (1.1.) and (1.2.) this element can be moved to the place (1, 1). Then 0 and it is no restriction to assume that = 1. Now, by adding proper multiples of the first row to the other rows we can obtain a matrix whose first column consists of zeros except for Next, by adding certain multiples of the first column to the other columns this matrix can be converted into the form
o...o
1
0*
*
(3.32)
0* if all the elements
*
(v = 2.. .n, R = 2.. rn) are zero, (3.32) is the normal-
form. Otherwise there is an element This can be moved to the place (2,2) by the operations (1.1. and (1.2.). Hereby the first row and the first column are not changed. Dividing the second row by and applying the operations (11.1.) and (11.2.) we can obtain a matrix of the form 1
0
0
1
0 *
00*
*
In this way the original matrix is ultimately converted into the form (3.31.).
§ 4. Elementary transformations
97
3.17. The Gaussian elimination. The technique described in sec. 3.16 can be used to solve a system of linear equations by successive elimination. Let (3.33)
be a system of in linear equations in n unknowns. Before starting the elimination we perform the following reductions: If all coefficients in a certain row, say in the i-th row, are zero, consider the corresponding number on the right hand-side. If the i-th equation contains a contradiction and the system (3.33) has no solution. If the i-th equation is an identity and can be omitted. Hence, we can assume that at least one coefficient in every equation is different from zero. Rearranging the unknowns we can achieve that 40. Multiplying the first equation by — and adding it to the p-th equation we obtain a system of the form 1
+
= (3.34)
which is equivalent to the system (3.33).
Now apply the above reduction to the (rn—. 1) last equations of the system (3.34). If one of these equations contains a contradiction, the system (3.34) has no solutions. Then the equivalent system (3.33) does not have from the a solution either. Otherwise eliminate the next unknown, say reduced system.
Continue this process until either a contradiction arises at a certain step or until no equations are left after the reduction. In the first case, (3.33) does not have a solution. In the second case we finally obtain a triangular system 132+0 (3.35)
which is equivalent to the original system*).
*) If no equations are left after the reduction, then every n-tupie solution of (3.33). 7
Greub. Linear Algebra
...
is a
ChaI)ter I II. \4atrices
The system (3.35) can be solved in a step by step manner beginning
with (3.36)
1
—
Inserting (3.36) into the first (r— 1) equations we can reduce the system to a triangular one of r— equations. Continuing this way we finally obtain the solution of (3.33) in the form 1
(v=1...r)
JLr+ 1
In) are arbitrary parameters.
where the
Problems 1. Two n x ni-matrices C and C' are called equivalent if there exists a regular n x n-matrix A and a regular in x rn-matrix B such that C' = A CB. Prove that two matrices are equivalent if and only if they have the same rank. 2. Apply the Gauss elimination to the following systems: a) —
= 2.
+
b)
2,71+ 172+4173+ 3,71+4,72+ 2,7' + + 5,73 c)
+
2c'+c2
—
= 1,
= 3.
Chapter IV
Determinants In this chapter, except for the last paragraph, all vector spaces will be defined over afixed but arbitrarily chosen field F of characteristic 0.
§ 1. Determinant functions 4.1. Even and odd permutations. Let X be an arbitrary set and denote Let for each 1 by X" the set of ordered p-tuples (x1 Y
be a map from
to a second set, Y. Then every permutation
a new map
acP: X"-* Y
defined by
= It follows immediately from the definitions that (4.1)
and (4.2)
where i is the identity permutation. Now let X = Y= ZL and define cli by cP(x1
=
fl
—
•
x1 E ZL.
i<j
It is easily checked that for every ac/I
=
c,7
where C,7= ± 1. Formulae (4.1) and (4.2) imply that
= and El
= 1.
.
100
IV.
Thus t;
is
a homomorphism from
Determinants
to the multiplicative group — 1. + I }. = (respectively
A permutation a is called eten (respectively odd) if c(7
1
—I).
Since for a transposition i. r= — 1, the transpositions are odd permutations. 4.2. p-linear maps. Let E and F be vector spaces. A p-linear map from E to F is a map 1: E"—+F which is linear with respect to each argument; i.e.. cP(x1
x1,)
F
=)Mx1
A p-linear map from E to F is called a p-linear function in E. As an be linear functions in E and define 1i by example, let j1 P(x1
...
=
A p-linear map k: mutation a
E E.
is called skew symmetric, if for every per-
=.
that is,
= r7P(x1
Every p-linear map II', given by
XJ))
E"—*F determines a skew symmetric p-linear map, = cp. a
In fact, let 'r be an arbitrary permutation. Then formula (4.1) yields
=
a)
a
= W
and so
is skew symmetric.
Proposition I: Let 1i be a p-linear map from E to F. Then the following conditions are equivalent: (i) is skew symmetric. (ii) P(x1 whenever x1 for some pair i4j. (iii) k(x1 whenever the vectors x1 ,...,x1, are linearly dependent. is skew symmetric and that x=xj Assume that Proof:
§ 1. Determinant functions
101
Denote by -r the transposition interchanging i andj. Then, since x1 = xj. (rP)(x1
xv).
On the other hand, since ct is skew symmetric,
=
(-r1)(x1
—
These relations imply that
and so, since F has characteristic zero, cli(x1
Conversely, assume that 1i satisfies (ii). To show that 1i is skew symmetric, fix a pair i, j (i<j) and set W(x, y) = 'Ji(x1
where the vectors
4 1,/) are fixed. Then
W(x,x)=O
XEE.
It follows that W(x, v) +
x) =
+ i, x + v) —
y) = 0
x) —
whence x)
—
E.
x,
¶1'(x, y)
This relation shows that
icP= for any transposition. Since every permutation is a product of trans-
positions, it follows that
i.e..
is skew symmetric.
Assume that satisfies (ii) and let x1 dependent vectors. We may assume that xi,.
xp
Then, by (ii),
be linearly
-
xp1.
=0
Chapter IV. Determinants
102
and so P satisfies (iii). Obviously. (iii) implies (ii) and so the proof is corn plete. CoroIlar%': If dim E=n. then every skew symmetric p-linear map P from E to F is zero if p>n.
Proposition II: Assume that E is an n-dimensional vector space and let P be a skew symmetric n-linear map from E to a vector space F. Then P is completely determined by its value on a basis of E. In particular, if P vanishes on a basis, then = 0. a, of E and write Proof: In fact, choose a basis a1
x)=
(,.= 1 ...
n).
Then çJ5
(x1
.. .
v,,) =
=
1)(a,
"P
1)
=
1)
.
a,,).
Since the first factor does not depend on P, the proposition follows. 4.3. Determinant functions. A determinant function in an n-dimensional vector space E is a skew symmetric n-linear function from E to F. In every n-dimensional vector space E (n 1) there are determinant functions which are not identically zero. In fact, choose a basis, f of the space L(E) and define the n-linear function P by
=
P(x1. ....
...
E E.
is the dual basis in E,
Then. if a1
(4.3)
Now set 4
Then A is a skew symmetric. Moreover, relation (4.3) yields A(a1
and so A
0.
=
§ 1. Determinant functions
103
Proposition III: Let E be an n-dimensional vector space and fix a nonzero determinant function 4 in E. Then every skew symmetric n-linear map ço from E to a vector space F determines a unique vector b E F such that = zl(x1
(p(x1
b.
(4.4)
of E so that zl (a1 = Then the n-linear map i/i: given by
Proof: Choose a basis a1
b=
p(a1
a,1).
= 4(x1
.
1
and set (4.5)
b
agrees with on this basis.Thus, by Proposition II. ,/i = p, i.e., = 4 Clearly, the vector b is uniquely determined by relation (4.4).
.
b.
Corollary: Let zl be a non-zero determinant function in E. Then every determinant function is a scalar multiple of 4.
Proposition IV: Let 4 be a determinant function in E (dim E=n). Then, the following identity holds: XEE, (4.6)
If the vectors x1 are linearly dependent a simple calculation shows that the left hand side of the above equation is zero. On the other hand, by sec. 4.2, 4(x1 Thus we may assume x,, form a basis of E. Then, writing
that the vectors x1
x we
have x1
=
(—
1
4 (xi,, x1
=4(x1
x1,
(4.7)
xv). x.
Problem Let E*, E be a pair of dual spaces and 4+0 be a determinant function in E. Define the function 4* of n vectors in E* as follows: Ifthevectorsx*v(v = .n)arelinearlydependent,then4* (x* 1•• .x*'l)= 0. 1
. .
Chapter I V. Detcrtiiinants
104
Ifthe vectors
(v 1 where
=
.n)are linearly independent, then zi * (x*
1
1
n) is the dual basis. Prove that /1* is a deter-
I
minant function in E*. § 2.
The determinant of a linear transformation
4.4. Definition. Let p be a linear transformation of the n-dimensional linear space E. To define the determinant of (p choose a non-trivial determinant function z1. Then the function defined by /14,
=
(x1 ...
A
x1
...
obviously is again a determinant function. Hence, by the uniquenesstheorem of section 4.3, /14, =
LI,
is a scalar. This scalar does not depend on the choice of A. In fact, if A' is another non-trivial determinant function, then zI'=2A and where
consequently Thus, the scalar
A ,,
=
2
=
is uniquely determined by the transformation p. It is
called the determinant of (p and it will be denoted by det (p. So we have the
following equation of definition: /14,
where A is an arbitrary non-trivial determinant function. In a less condensed form this equation reads = det (p A (x1 ...
A (p X1 ... p
(4.8)
In particular, if (p = 2i, then /14,
=
and hence
det(2i) = It follows from the above equation that the determinant of the identitymap is 1 and the determinant of the zero-map is zero. 4.5. Properties of the determinant. A linear transformation is regular if and only if its determinant is different from zero. To prove this, select a basis (v = 1.. .n) of E. Then A (p e1 ...
p
= det (p A (e1 ... en).
(4.9)
§ 2. The determinant of a linear transformation
If p is regular, the vectors p (v =
1.
. .
105
are linearly independent; hence
n)
(4.10)
Relations (4.9) and (4.10) imply that
+ 0.
det
+ 0. Then it follows from (4.9) that
Conversely, assume that det
Hence the vectors p (v = 1. . n) are linearly independent and Consider two linear transformations and i,Ii of E. Then
is regular.
.
det
det
In particular, if is a linear automorphism and p1 is the inverse automorphism, we obtain
detq1 det(p = deti =
1.
4.6. The classical adjoint. Let E be an n-dimensional vector space and let zi +0 be a determinant function in E. Let çoeL(E; E). Then an n-linear map E)
is given by
v,)x =
px1
j=
E.
.
1
It is easy to check that Ji is skew symmetric. Thus, by Proposition 11,
there is a unique linear transformation, ad(p), of E such that cP(x1
= z1(x1
ad(p)
E;
E
E.
This equation shows that the element ad(p)EL(E; E) is independent of the choice of A and hence it is uniquely determined by 'p. It is called the classical adjoint of'p.
Chapter IV. 1)eterminaiits
106
Proposition V: The classical adjoint satisfies the relations
ad(ço)o(p=1 •detq and i
. det(p.
Proof: Replacing x by pX in (4.14) we obtain x,) x1 = 4(x1
x,
IP
,...,x,) ad
x).
j=1
Now observe that, in view of the definition of the determinant and
Proposition IV.
P1
=det(p j= 1
=detq)
x1EE, XEE.
x
.
Composing these two relations yields
4(x1 ,...,x,) ad (p((px) = det
.
zl(x1 ,...,x,) x .
and so the first relation of the proposition follows. The second relation is established by applying p to (4.12) and using the identity in Proposition IV, with x replaced by (i= ... a). 1
Corollary: If go is a linear transformation with det an inverse and the inverse is given by go1
then go has
det go
Thus a linear transformation of E is a linear isomorphism if and only if its determinant is non-zero. 4.7. Stable subspaces. Let go: E—*E be a linear transformation and assume that E is the direct sum of two stable subspaces,
E=
E1
$E2.
Then linear transformations and
go2:E2—*E2
§ 2. The determinant of a linear transformation
107
are induced by (p. It will be shown that
det(p = detp1detq2.
Define the transformations
and
E—÷E
by
mE1 i
Then (P
and so det(p =
Hence it is sufficient to prove that det
=
det(p1
and
detk//2 = dettp2.
0 be a determinant function in E and h1 ... the function /11, defined by
Let zl
...
be a
basis of E2. Then
p=dimE,
... hqh
is a non-trivial determinant function in E1. Hence
= detp1A1(x1
A1 (Pi x1 ...
On the other hand we obtain from (4.14) ... xi,, b1 ... A1(x1 ... xi,). A (x1
=
bq)
These relations yield det(p1
The second formula (4.13) is proved in the same way.
Problems 1. Consider the linear transformation p:
(v=1...n) where
(v =
1.
.
.n)
is a basis of E. Show that
(4.13)
defined by
(4.14)
I \'.
Deicriiii
2. Let be a linear transformation and assume that E1 is a and stable subspace. Consider the induced transformations Prove that
detp = 3. Let x:E-+F be a linear isomorphism and p be a linear transformation of E. Prove that = detp. 4. Let E be a vector space of dimension ii and consider the space L(E; E) of linear transformations. a) Assume that F is a function in L (E; F) satisfying
F can be written in the form
F(p)= f(detq) wheref:F—+F is a mapping such that
=f(2)f(p). b) Suppose that F satisfies the additional condition that Then, if E is a real vector space,
= detp or F(p) = and if E is a complex vector space F(ço) = detçc.
Hint for part a): Let (i== 1.. .n) be a basis for E and define the transformations and by
v+i v=t and
v=i
.
Show first that =I and that F(ço1)
is
independent of i.
i,j=1...n
§ 3. The determinant of a matrix
4. Let E be a vector space with a countable basis and assume that a function F is given in L (E; E) which satisfies the conditions of problem 3a). Prove that
peL(E;E).
Hint: Construct an injective mapping that
such
and a surjective mapping i/i
=0.
5. Let w be a linear transformation of E such that w2 = i. Show that where r is the rank of the map i — w. — 6. Let j be a linear transformation of E such that = — i. Show that then the dimension of E must be even. Prove that det j = 1. 7. Prove the following properties of the classical adjoint: det w = (
1
(i)
(ii) det (iii) If p has rank n—i, then Im ad((p)=ker cm (iv) If p has rank n—2, then ad(p)=0. (v) 8.
ad(ad p)=(det If n=2 show that
ad(p)=i . trp—p. § 3.
The determinant of a matrix
4.8. Definition. Let ço be a linear transformation of E and sponding matrix relative to a basis (v = 1. . n). Then .
=
'p
Substituting
in (4.8) we obtain A ('p
= det 'p A (e1 ...
... 'p
The left-hand side of this equation can be written as A ('p e1, ... 'p
= =
We thus obtain
A
ep,
r(1)
... c(n)
.A(e ...
en).
the corre-
Chapter IV. Determinants
1 1()
This formula shows how the determinant of is expressed in terms of the corresponding matrix. We now define the determinant of an ii x n-matrix A by
f(n)
detA =
(4.16)
Then equation (4.15) can be written as det(p =
(4.17)
Now let A and B be two n x n-matrices. Then det (A B) = det A det B.
(4.18)
In fact, let E be an n-dimensional vector space and define the linear transformations 'p and of E such that (with respect to a given basis)
M(q)=A and M ('p) M (i/i) = det M p) = det (i/i p) = det M ('p) det M (i/i) = det A det B.
= det 'p det
Formula (4.18) yields for two inverse matrices
detAdet(A1) = detJ =
1
(J unit-matrix)
showing that
det(A1) = (detA)'. Finally note that if an (n x ;i)-matrix A is of the form
A = det
det A2
(4.19)
as follows from sec. 4.7. 4.9. The determinant considered as a function of the rows. If the rows
of the matrix A are considered as vectors of the space = . .. the determinant det A appears as a function of the n vectors (v = n). To investigate this function define a linear transformation 'p of by 1
(v=1...n)
. . .
§ 3. The determinant of a matrix where the vectors
111
are the n-tuples
(v=1...n). Now let A be the deterThen A is the matrix of relative to the basis minant function in r which assumes the value one at the basis (v = 1.. .n),
A
A
A
det
A (a1 ... an).
A
(4.20)
This formula shows that the determinant of A considered as a function of the row-vectors has the following properties: 1. The determinant is linear with respect to every row-vector. 2. If two row-vectors are interchanged the determinant changes the sign.
3. The determinant does not change if to a row-vector a multiple of another row-vector is added. 4. The determinant is different from zero if and only if the row-vectors are linearly independent. An argument similar to the one above shows that
where the bV are the column-vectors of A. It follows that the properties 1—4 remain true if the determinant of A is considered as a function of the column-vectors.
Problems 1. Let A = (cd) be a matrix such that
= 0 if v ). x1 ...
Then it follows from the properties of the determinant of a matrix that Q is linear with respect to each argument. Moreover, Q is skew symmetric and with respect to the vectors x1 (i= I . .n). with respect to the vectors Hence the uniqueness theorem (sec. 4.3) implies that Q can be written as .
x*fl; x1 ...
= cP(x*' ... x*fl)A(x1 ...
(4.22)
§ 4. Dual determinant functions
depends only on the vectors X*i. Replacing the basis e of E we obtain
where
Q(x*l, ...
e1
...
=
...
in (4.21) by a
... en).
This relation shows that is linear with respect to every argument and skew symmetric. Applying the uniqueness theorem again we find that fJEF.
(4.23)
= f3A*(x*l ... x*fl)A(x1 ...
(4.24)
Combining (4.22) and (4.23) we obtain x1
...
Now let e*i, e.(i= 1...n) be a pair of dual bases. Then (4.24) yields 1
f3A*(e*' ...
...
(4.25)
and so we obtain the relation (4.21). Multiplying (4.24) by The determinant functions A* and A are called dual if the factor in (4.21) is equal to 1; i.e., 1
= det(<x*i,xj>).
(4.26)
To every determinant function A +0 in E there exists precisely one dual determinant function A* in E*. In fact, let be an arbitrary determinant function in E* and set where is the scalar in (4.21). Then A* and A are dual. To prove the uniqueness, assume that and are dual determinant functions to zi. Then we have that x*n)
—
...
...
=0
whence
4.11. The determinant of dual transformations. Let ço: E—3 E and (p*:E*4E* be two dual linear transformations. Then det
= det çø.
(4.27)
To prove this, let A*, A be a pair of dual determinant functions in E* and E. Then we have in view of (4.26) x*n)A(xi ...
= det(<x*i,xj>).
This relation yields ... 8
Greub. Linear Algebra
= det()
Chapter IV. Determinants
and =
...
Since
(i,j = ... n)
= <X*i, (p x1>
1
it follows that = A*(x*l ... x*n)A(cox1 ... (4.28)
But
A*(p*x*l,
and
q,*X*fl) = detco*zl*(x*l ...
= det
A((p X1 ... (p
A (x1 ...
and so we obtain from (4.28) that (det (p* — det co)zl* (x*' ... x*'l)A (x1 ...
=0
whence (4.27)
The above result implies that transposed n x n-matrices have the same determinant. In fact, let A be an n x n-matrix and let (p be a linear transformation of an n-dimensional vector space such that (with respect to a given basis) M(ço)=A. Then it follows that detA* = detM(cp)* = detM((p*) = = det (p* = det (p = det M ((p) = det A.
Problems 1. Show that the determinant-functions, A, 4* of § 1, problem 1 are dual. 2. Using the expansion formula (4.16) prove that,
detA* = detA. §
5. The
adjoint matrix
4.12. Definition. Let (p:E—*E(dim E=ii) be a linear transformation and let ad((p) be the adjoint transformation (cf. sec. 4.6). We shall express the matrix of ad ((p) in terms of the matrix of It is called the adjoint of the matrix of (p. Let a1
a,, be a basis of E such that 4(a1 ad(p)a1 =
I and write
adjoint matrix
§ 5. The
(v= I ...
Then, setting
a)
L15
and x=a1 in the identity (4.14) we obtain ...,
zi(a1.
= zl(a1
whence
=
(i,j= 1 ... ii).
a1,
This equation shows that = det
where
E—*E is the linear transformation given by =
(v
Since the matrix of
i)
(i,j = ...
a, =
1
a)
with respect to the basis a1
-'
a is
given by
+1
1
I
1 1
f—I
i
i±1
f-fl
•..
...
a
CL1-f1
+1•••
1
we have
= det is called the cofactor of Deilnition: The determinant of denoted by Thus we can write
and is
(i,j= la); i.e.. the adjoint matrix is the transpose of the matrix of cofactors. 4.13. Cramer's formula. In view of the result above we obtain from Proposition V. sec. 4.6, the relations =
. detA
(4.36)
and (4.37)
Chapter IV. Determinants
Setting k=i in (4.36) we obtain the expansion formula by cofactors.
(i= I
detA
and define a matrix
Now assume that detA
=
Then these equations yield
(4.38)
n). by setting
detA
is the inverse of the matrix (xi). showing that the matrix From (4.36) we obtain Cramers solution formula of a system of n linear equations in a unknowns whose determinant is non-zero. In fact, let
=
=
a
(4.39)
k=1
be such a system. Then we have, in view of (4.36).
whence
=
=
This formula expresses the (unique) solution of (4.39) in terms of the
cofactors of the matrix A and the determinant of A. Given an (a x ii)-matrix denote for each 4.14. The submatrices pair (Lj) (i= 1 .. a, j = I ... a) by SI the (a — 1) x (a — 1)—matrix obtained from A by deleting the i-th row and the j-th column. We shall show that .
cof
=(—
1)1
+
det
(i, I = 1 ... a).
(4.40)
In fact, by (i—I) interchanges of rows and (j— I) interchanges of columns we can transform 1
into the matrix (cf. sec. 4.12)
§ 5. The
Thus
adjoint matrix
= det
=(
117
—
On the other hand we have, in view of the expansion formula (4.38), with 1= 1
detBl =
and so (4.40) follows. 4.15. Expansion by cofactors. From the relations (4.35) and (4.30) we
obtain the expansion-formula of the determinant with respect to the 1th row detA = (1 = 1... ii). (4.41) By this formula the evaluation of the determinant of n rows is reduced to the evaluation of n determinants of n— 1 rows. In the same way the expansion-formula with respect to the Jth column is proved: + det A = ( det S/ (4.42) (j = 1 ... n).
4.16. Minors. Let A =
be a given ii x in-matrix. For every system
of indices and
the submatrix of A, consisting of the rows 1l'k and the is called a minor of order k of the matrix A. It will be shown that in a matrix of rank r there is always a minor of order r which is different from zero, whereas all minors of order be a minor of order k > r. Then the row-vectors k > r are zero. Let of A are linearly dependent. This implies that the rows of the a,1.. are also linearly dependent and thus the determinant must matrix be zero. It remains to be shown that there is a minor of order r which is different denote by
columns /1•Jk• The determinant of
from zero. Since A has rank r, there are r linearly independent rowvectors a, ..
a. The submatrix consisting of these row-vectors has again
the rank r. Therefore it must contain r linearly independent columnvectors h' ... 1* (cf. sec. 3.4). Consider the matrix Its columnvectors are linearly independent, whence
detA/'j"+O. If A is a square-matrix, the minors det
are called the principal minors of order k.
Chapter IV. Determinants
Problems 1. Prove the Laplace expansioii formula for the determinant of an 1) be a fixed integer. Then
(ii x n)-matrix: Let p (1 det A =
det
v1,) det A"
c (v1
where
= B'"
+
and
= 2.
(ii)
— 1)'
1
h, be two bases of a vector space E and a, and h1 1) be given. Show that the vectors can be reordered
Let a1
let p(l such that (i)
(v,—i) (
hr,, is a basis of E, is also a basis of E.
a1
h,,
3. Compute the inverse of the following matrices. 11
1
1
1
—1
1
1
—1 —1 1
—1 1
21 21 B= 2
4. Show that a)
xaa...a
a x a...a det
:
:
a...ax
= [x + (n —
1)
a] (x —
— 1
§ 5. The acijoint matrix
and that
r1
b)
...
1
1
22
det
... 2
1 1
—1
1
—1
Show
(n>2)
that
z11=x1; z12=x1x2+1. 6. Verify the following formula for a quasi-triangular
x11...x1 p
O....O
determinant:
xp+1p+1....xp+ln
o....o I
=
det
xnl 7.
xnn
det
xnp+1
Prove that the operation
properties: a) ad (A B)= ad B ad A. b) det ad A = (det c) ad ad A=(det
A =(det
xnn
_.)
1)2
(cf. sec.4.12) has the following
Chapter IV. Determinants
12()
§ 6.
The characteristic polynomial
4.17. Eigenvectors. Consider a linear transformation 'p of an n-dimensional linear space E. A vector a+O of E is called an eigenvector of 'p if 'p a =
2
a.
The scalar 2 is called the corresponding eigen value. A linear transformation (p need not have eigenvectors. As an example let E be a real linear space of two dimensions and define 'p by
(px1=x2 çox2=—x1 the vectors x1 and x2 form a basis of E. This mapping does not have eigenvectors. in fact, assume that where
a=
x1
+
is an eigenvector. Then pa=2a and hence These
equations yield
whence
+
=
and
4.18. The characteristic equation. Assume that a is an eigenvector of 'p and that 2 is the corresponding eigenvalue. Then
'pa=2a,
a+O.
This equation can be written as ('p — 2i)a = 0
(4.43)
showing that p—21 is not regular. This implies that det('p — 2i) = 0.
(4.44)
Hence, every eigenvalue of 'p satisfies the equation (4.44). Conversely, assume that 2 is a solution of the equation (4.44). Then 'p—2i is not regular. Consequently there is a vector a+O such that ('p — 2i)a = 0,
whence 'pa=2a. Thus, the eigenvalues of'p are the solutions of the equation (4.44). This equation is called the characteristic equation of the linear transformation 'p.
§ 6. The characteristic polynomial
121
4.19. The characteristic polynomial. To obtain a more explicit expression for the characteristic equation choose a determinant function A +0 in E. Then
=
A(px1 — Ax1 ... çox, —
=
1
det(ço
—
Ai)A(x1
... n).
(4.45)
Expanding the left hand-side we obtain a sum of
terms of the form
A(z1 ...
Denote by the sum of all terms in which p arguments are equal to and il—p arguments are equal to Collect in each term of Si,, the indices (v1 < ,
Chapter IV. Determinants
12g
where e*%' (v = 1 .. ii)
is the dual basis of
we can rewrite equation
(4.60) as tr çø =
(4.61)
<e*1, ço e>.
Formula (4.60) shows that the trace of a linear transformation is equal to the sum of all entries in the main-diagonal of the corresponding matrix. For any ii x n-matrix A this sum is called the trace of A and will be denoted by tr A, (4.62)
Now equation (4.60) can be written in the form
tr(p = trM(p).
4.25. The duality of L(E; F) and L(F; E). Now consider two linear spaces Eand Fand the spaces L(E; F) and L(F; E) of all linear mappings With the help of the trace a scalar product can be p:E—÷F and introduced in these spaces in the following way: q7EL(E; F),
=
E).
(4.63)
The function defined by (4.63) is obviously bilinear. Now assume that (4.64)
for a fixed mapping coeL(E; F) and all linear mappings /ieL(F; E). It has to be shown that this implies that ço=O. Assume that 'p+O. Then Extend the vector b1 =pa to there exists a vector aEE such that a basis (b1 .
.
.bm)
of F and define the linear mapping i/i: F—>E by
(p=2...in).
Then
=0
= b1,
= 2... in),
whence
=
=
= 1.
This is in contradiction with (4.64). Interchanging E and F we see that the relation
for a fixed mapping /ieL(F; E) and all mappings peL(E; F) implies that Hence, a scalar-product is defined in L(E; F) and L(F; E) by (4.63).
§7. The trace
129
Problems 1. Show that the characteristic polynomial of a linear transformation of a 2-dimensional linear space can be written as
f()) = )2 Verify that every such
satisfies (p2 —
— 2trço + detçD.
its characteristic equation,
(ptr(p + ldet(p = 0.
2. Given three linear transformations 'p, i/i,
of E show that
tr(yoç/ioço) + in general. 3. Show that the trace of a projection operator ir:E—+E1 II sec. 2.19) is equal to the dimension of Im it.
(see
Chapter
4. Consider two pairs of dual spaces E*, E and F*, F. Prove that the spaces L (E; F) and L (E*; F*) are dual with respect to the scalar-product defined by = F*). (pEL(E; F)
5. Letf be a linear function in the space L(E; E). Show thatf can be written as
f((p) = is a fixed linear transformation in E. Prove that is uniquely determined by f 6. Assume thatf is a linear function in the space L(E; E) such that where
Prove that
f(ço)=2tr(p where 2 is a scalar. 7. Let and be two linear transformations of E. Consider the sum
i*j
A (x1 ... (p Xi ... cli x1...
where A+O is a determinant function in E. This sum is again a determinant function and hence it can be written as i4j 9
A (x1
Linear Algebra
(p
...
cli
x1 ...
= B (p, cl') A (x1 ...
Chapter IV. Determinants
I 3()
By the above relation a bilinear function B is defined in the space L (E; E). Prove: a) B(p, p tr b) IB((p, (p)=(— where is the coefficient of 2P12 in the characteristic polynomial of çø. c)
=
—
8. Consider two ii x n-matrices A and B. Prove the relation tr(A B) = tr(BA)
a) by direct computation.
b) using the relation tr p=tr M(p). 9. If çø and are two linear transformations of a 2-dimensional linear space prove the relation +
=
+
+
—
10. Let A:L(E; E)—÷L(E; E) be a linear transformation such that
A
a 2-dimensional vector space and p be a linear transformation of E. Prove that ço satisfies the equation p2= —2i, 2>0 if and
only if
detço>0 and trp=0. 12. Let ço:E1—*E1 and
be linear transformations. Consider
=
Prove that tr ço=tr +tr P2• be a linear transformation and assume that there is a 13. Let into subspaces such that decomposition (1= l...r). Prove that tr q=0. be linear maps between finite dimen14. Let p: and /i: a sional vector spaces. Show that tr ((p /1) = )-th I 5. Show that the trace of ad ((p) (cf. sec. 4.6) is the negative (ii characteristic coefficient of 1
§ 8. Oriented vector spaces
§ 8.
131
Oriented vector spaces
In this paragraph E will be a real vector space of dimension n 1. 4.26. Orientation by a determinant function. Let A1 + 0 and A2 + 0 be two determinant functions in E. Then A2=2A1 where 2+0 is a real number. Hence we can introduce an equivalence relation in the set of all determinant functions A +0 as follows: if
2>0.
it is easy to verify that this is indeed an equivalence. Hence a decomposition of all determinant functions A +0 into two equivalence classes is induced. Each of these classes is called an orientation of E. If (A) is an orientation and A e (A) we shall say that A represents the given orientation. Since there are two equivalence classes of determinant functions the vec-
tor space E can be oriented in two different ways. A basis (v = n) of an oriented vector space is called positive if 1 . . .
where A is a representing determinant function. If (e1 .. is a positive basis and a is a permutation of the numbers (l...n) then the basis (eat!)... ea(fl)) is positive if and only if the permutation a is even. Suppose now that E* is a dual space of E and that an orientation is defined in E. Then the dual determinant function (cf. sec. 4.10) determines an orientation in E*. it is clear that this orientation depends only on the orientation of E. Hence, an orientation in E* is induced by the orientation of E. .
4.27. Orientation preserving linear mappings. Let E and F be two oriented vector spaces of the same dimension n and be a linear isomorphism. Given two representing determinant functions AE and AF defined by in E and F consider the function x1, ...,
= Clearly
is again a determinant function in E and hence we have that =
2+0 is a real number. The sign of 2 depends only on p and on the given orientations (and not on the choice of the representing determinant functions). The linear isomorphism çø is called orientation preserving if where
Chapter IV. Determinants
132
).>O. The above argument shows that, given a linear isomorphism and an orientation in E, then there exists precisely one orientation in F such that
preserves the orientation. This orientation will be called the orientation induced by ço. Now let be a linear automorphisni; i.e., F=E. Then we have AF=AF and hence it follows that = det(p.
This relation shows that a linear automorphism
preserves the
orientation if and only if det ço >0. As an example consider the mapping ço= —i. Since det(— i) = (— 1)'
it follows that ii is even.
preserves the orientation if and only if the dimension of
4.28. Factor spaces. Let E be an orientated vector space and F be an oriented subspace. Then an orientation is induced in the factor space E/F in the following way: Let A be a representing determinant function
in E and a1 .
.
be a positive basis of F. Then the function A (a1 ... ap,
depends only on the classes are equivalent mod F. Then
+
EE
1
in fact, assume for instance that
and
)p+1 = Xp+i +
and we obtain A (a1 ...
a
•..
A
=
A
(a1 ...
of (n—p) vectors in E1'F is well defined by
=
...
(4.65)
it is clear that 21 is linear with respect to every argument and skew symmetric. Hence A is a determinant function in ElF. It will now be shown that the orientation defined in E/F by A depends only on the orientations of E and F. Clearly, if A' is another representing determinant function in
§ 8. Oriented vector spaces
133
E we have that A'
2A, 2>0 and hence A' = 221. Now let another positive basis of F. Then we have that
>0
= whence A (a'1 ...
a,
.. .a,) be
=
1
A (a1 ...
1
It follows that the function A' obtained from the basis a'1 . . .a, is a positive multiple of the function A obtained from the basis a1... a direct decomposition
E=E1E13E2
(4.66)
and assume that orientations are defined in E1 and E2. Then an orientation is induced in E as follows: Let l...p) and l...q) be positive bases of E1 and E2 respectively. Then choose the orientation of E such that the basis a1 . b1 . . .bq is positive. To prove that this orientation depends only on the orientations of E1 and E2 let (i= I .. .p) and (1= 1... q) be two other positive bases of E1 and E2. Consider the linear transformations p:E1—+E1 and defined by .
(i=1...p) Then the transformation basis (a1..
b1
. .
(j=1...q).
and
carries the basis
bi...bq) into the
Since det ço >0 and det i/i >0 it follows from sec. 4.7
.bq).
that
>0
det(p
hi...hq) is again a positive basis of E. and hence Suppose now that in the direct decomposition (4.66) orientations are given in Eand E1. Then an orientation is induced in E2. In fact, consider the projection it:E—÷E2 defined by the decomposition (4.66). It induces an isomorphism
p:E/E1 4E2. In view of sec. 4.28 an orientation in E/E1 is determined by the orientations of E and E1. Hence an orientation is induced in E2 by p. To describe this orientation explicitly let A be a representing determinant function in E and e1
ep be a positive basis of E1. Then formula (4.65)
implies that the induced orientation in E2 is represented by the determinant function A2
...
=
A (e1
...
... yb),
(4.67)
Chapter IV. Determinants
134
Now let ..., be a positive basis of E7 with respect to the induced orientation. Then we have
and hence formula (4.67) implies that 4(e1 ... It follows that the basis e1 ...
of E is positive. In other words,
1
the orientation induced in E by E1 and 122 coincides with the original orientation. The space 122 in turn induces an orientation in E1. It will be shown that this orientation coincides with the original orientation of E1 if and only if p (n —p) is even. The induced orientation of E1 is represented by the determinant-function A1(x1 ...
=
...
...
(4.68)
where eA(2—p+ 1, ...n) is a positive basis of E2. Substituting (v
1.. .n) in equation (4.68) we find that
A1(e1 ...
=
But e)(i.=p+ I
...
...
= (— 1)
...
en).
(4.69)
n) is a positive basis of E2 whence
...
(4.70)
It follows from (4.69) and (4.70) that A1(e1 ...e")
c>
0
if p(n — p) is even if p(n—p)is odd.
(4.71)
is positive with respect to the original orienSince the basis of tation, relation (4.71) shows that the induced orientation coincides with the original orientation if and only if p(n—p) is even. 4.30. Example. Consider a 2-dimensional linear space 12. Given a basis (e1, e2) we choose the orientation of E in which the basis e1, e2 is positive. Then the determinant function A, defined by
zl(e1,e2) =
1
1,2) generrepresents this orientation. Now consider the subspace 1,2) with the orientation defined by Then E1 induces in ated by 122 the given orientation, but E2 induces in E1 the inverse orientation.
§ 8. Oriented vector spaces
135
In fact, defining the determinant-functions A1 and A1 (x)
A (e2,
x)
xe E1,
A2 (x) = A
and
A2 in E1 and in E2 by (e1,
x)
x e E2
we find that
4.31.
A2(e2) = A(e1, e2) =
1
Intersections. Let E1
and E2 be
and
A1 (e1) =
A(e2,
e1) = —
two subspaces of E
E = E1 + E2
1.
such
that (4.72)
and assume that orientations are given in E1, E2
and E. It
will be shown
that then an orientation is induced in the intersection E12=E1 n
E2.
Setting
dimE1 = p,
dimE2 = q,
dimE12 = r
we obtain from (4.72) and (1.32) that
r= p + q — n. Now consider the isomorphisms ço:E/E1 and Vi:E/E2
E1/E12.
Since orientations are induced in E/E1 and E/E2 these isomorphisms determine orientations in E2/E12 and in E1/E12 respectively. Now choose two positive bases +1• a,, and hr . bq in E1/E12 and E2/E12 respecand bJEE2 be vectors such that tively and let .
and
where it1 and
it2
denote the canonical projections and
ir2:E2—+E2/E12.
Now define the function A12 by A12 (z1 ... Zr) = A (Z1 ... Zr, ar+ 1
a,,, br+ 1 .. bq).
(4.73)
In a similar way as in sec. 4.30 it is shown that the orientation defined in depends only on the orientations of E1, and E (and not on E12 by the choice of the vectors and ba). Hence an orientation is induced in E12.
Chapter IV. Determinants
Interchanging E1 and E2 in (4.73) we obtain
(:i
•..
(:i
=A
r,
br+ 1
bq,
1
ar).
(4.74)
Hence it follows that =
(q-r)4
1)(P_r)
(n-q)4
= (_
(475)
Now consider the special case of a direct decomposition. Then p+q=n and E12=(0). The function /112 reduces to the scalar (4.76)
Moreover the number—2 depends
It follows from (4.76) that
only on the orientations of E1, E2 and E. It is called the intersection number
of the oriented subspaces E1 and E2. From (4.75) we obtain the relation =(
= 1.. ii) be two bases of E. Basis deformation. Let and is called deformable into the basis if there exist n Then the basis continuous mappings 4.32.
t t1
t
satisfying the conditions 1. (t0) = and (t1) = (t) (v = 2. The vectors 1
The
.
.11) are
linearly independent for every fixed t.
deformability of two bases is obviously an equivalence relation.
Hence, the set of all bases of E is decomposed into classes of deformable bases. We shall now prove that there are precisely two such classes. This is a consequence of the following Theorem: Two bases and (v = 1.. ./1) are deformable into each other if and only if the linear transformation p: E defined by 'p = has positive determinant.
Proof: Let 4+0 be an arbitrary determinant function. Then formula lii) are 4.17 together with the observation that the components continuous functions of shows that the mapping Ex x E—*R defined by A is continuous. into the is a deformation of the basis Now assume that basis Consider the real valued function (t) ...
§ 8. Oriented vector spaces
137
The continuity of the function 4 and the mappings
the function Ji
is
implies that
continuous. Furthermore,
ck(t)+0 the vectors (t) (v = I .n) are linearly independent. Thus the function cP assumes the same sign at t=t0 and at t=t1. But because
. .
= 4(pa1 ...
= A(b1
= det(pA(a1 ...
=
whence
detço >0 and so the first part of the theorem is proved. 4.33. Conversely, assume that the linear transformation a deformation assume first that the vector n-tuple (a1 ...
...
(4.77)
1). Then consider the de-
is linearly independent for every 1(1 composition = By the above assumption the vectors (a1..
are linearly independ-
ent, whence /3" + 0. Define the number c,, by
1+1 if if
L—1
It will be shown that the n mappings
Ix,(t)=a.(v=1...n—1) = define
(1 —
+
(0O.
(4.79)
and by the result of sec. 4.32 Relations (4.78), and (4.79) imply that > 0.
det But whence
+1. Thus, the number of equal to — is even. Rearranging the vectors (v = 1.. n) we can achieve that 1
.
5—i
vl+i
(v=1...2p)
(v=2p+1...n).
§ 8. Oriented vector spaces
139
Then a deformation b1
...
e,,
-÷ (b1 ...
is defined by the mappings
+
=— =—
(v — —
1
—
(v=2p+1...n) 4.34. The case remains to be considered that not all the vector n-tuples (4.77) are linearly independent. Let A + 0 be a determinant function. The linear independence of the vectors (v = 1 . n) implies that . .
Since zl is a continuous function, there exists a spherical neighbourhood that Ua
(v= 1...n).
if XvEUa
Choose a vector a'1 e Uaj which is not contained in the (n — 1)-dimensional subspace generated by the vectors (b2. . Then the vectors (a'1, b2.. are linearly independent. Next, choose a vector a'2 E Ua2 which is not con-
tained in the (n— 1)-dimensional subspace generated by the vectors (a'1, b2. Then the vectors (a'1, a'2, b3. are linearly independent. Going on this way we finally obtain a system of n vectors (v = 1.. n) .
.
.
such that every n-tuple ...
(a'1
is
linearly independent. Since
Hence the vectors
(1= 1 ... ;i)
it follows that
(v = 1. . n) form a basis of E. The n mappings .
define a deformation (4.80)
In fact,
(t)(0 t
1) is contained in Uav whence A
(t) ...
(t)) + 0
(0
t
This implies the linear independence of the vectors
1).
(t)(v = 1.. .n).
140
Chapter IV. Determinants
By the result of sec. 4.33 there exists a deformation ...
... ba).
(4.81)
The two deformations (4.80) and (4.81) yield a deformation (a1 ...
... ba).
This completes the proof of the theorem in sec. 4.32. 4.35. Basis deformation in an oriented linear space. If an orientation is given in the linear space E, the theorem of sec. 4.32 can be formulated as and (v = I n) can be deformed into each other follows: Two bases if and only if they are both positive or both negative with respect to the given orientation. In fact, the linear transformation . .
p:
.
(v = I ... a)
—*
has positive determinant if and only if the bases and b%, (v = I .. are both positive or both negative. Thus the two classes of deformable bases consist of all positive bases and all negative bases. 4.36. Complex vector spaces. The existence of two orientations in a
real linear space is based upon the fact that every real number 2t0 is either positive or negative. Therefore it is not possible to distinguish two orientations of a complex linear space. In this context the question arises whether any two bases of a complex linear space can be deformed into each other. It will be shown that this is indeed always possible. Consider two bases and (v = a) of the complex linear space E. As in sec. 4.33 we can assume that the vector n-tuples 1
(a1...
. .
...
are linearly independent for every 1(1 1). It follows from the above assumption that the coefficient in the decomposition = is
different from zero. The complex number
Now choose a positive continuous function
r(0) =
1,
r(I) = r
can
be written as
1) such that (4.82)
§ 8. Oriented vector spaces
and a continuous function
1) such that
(t)(O t
(4.83)
Define mappings
(t),
(0
t 1)
(v =
=
by 1 ...
n — 1) )
and (t)
=t
Then the vectors (t) (v = fact, assume a relation
1 ..
(4.84)
1.
t
flV
+ r(t) .
n)
are linearly independent for every t. In
0. Then
+
r (t)
+
=
0
whence
(v==1...n—1) and = 0.
1, the last equation implies that )J' = 0. r (t) + 0 for 0 t (n— 1) equations reduce to in— 1). It follows from (4.84), (4.82) and (4.83) that
Since
Hence the
first
=
and
=
Thus the mappings (4.84) define a deformation (a1 ...
—÷
Continuing this way we obtain after 1...n). into the basis
(a1 n
...
ba).
steps a deformation of the basis
Problems 1. Let E be an oriented n-dimensional linear space and
(v = 1.. .11) be
the subspace generated by the vectors a positive basis; denote by is positive with respect Prove that the basis (x1 to
the orientation induced in
by the vector
l)i —
1
xL.
Chapter IV. Determinants
142
2. Let E be an oriented vector space of dimension 2 and let a1, a2 be two linearly independent vectors. Consider the 1-dimensional subspaces E1 and E2 generated by a1 and a2 and define orientations in E. such that the bases a are positive (i= 1,2). Show that the intersection number of E1 and E2 is + 1 if and only if the basis a1, a2 of E is positive.
3. Let E be a vector space of dimension 4 and assume that (v
= ... 1
4)
is a basis of E. Consider the following quadruples of vectors:
I. e1+e2,e1+e2+e3,e1+e2+e3+e4,e1—e2+e4 II. e1+2e3, e2+e4, e2—e1+e4, e2
III.
e2+e4, e3+e2, e2—e1
1V.
e1+e2—e3, e2—e4, e3+e2, e2—e1
V. e1—3e3, e2+e4, e2—e1—e4, e2. a) Verify that each quadruple is a basis of E and decide for each pair of bases if they determine the same orientation of E. b) If for any pair of bases, the two bases determine the same orientation, construct an explicit deformation. c) Consider E as a subspace of a 5-dimensional vector space E and assume that (v = 1, . .5) is a basis of E. Extend each of the bases above to a basis of which determines the same orientation as the basis (v= 1, ..., 5). Construct the corresponding deformations explicitly. 4. Let E be an oriented vector space and let E1, E2 be two oriented subspaces such that E== E1 + E2. Consider the intersection E1 fl E2 together with the induced orientation. Given a positive basis (c1, ..., Cr) of extend it to a positive basis (c1, ..., Cr, ar+ E1 fl a positive basis (C1, ..., Cr, br+ i, ..., bq) of E2. Prove that then (C1, ..., Cr, ar+ 1' ..., ap, br+ i, ..., bq) is a positive basis of E. .
5. Linear isotopices. Let E be an n-dimensional real vector space. will be called linearly Two linear automorphisms p: and i/i: (I the closed unit isotopic if there is a continuous map 1: I x and such that for each interval) such that x)=p(x), x) is a linear automorphism. tEl the map given by (i) Show that two automorphisms of E are linearly isotopic if and only if their determinants have the same sign. (ii) Let j: E—+E be a linear map such that = — Show that j is linearly isotopic to the identity map and conclude that detj= 1. 6. Complex structures. A complex structure in real vector space E is a linear transformation j: E—+E satisfying j2 = — 1.
§8. Oriented vector spaces
143
(i) Show that a complex structure exists in an n-dimensional vector space if and only if ii is even. (ii) Let j be a complex structure in E where dimE=2n. Let 4 be a determinant function. Show that for (v= 1 ... 11) either 4(x1
or 4(x1 ,...,x,1,jx1 The natural orientation of (E, J) is
defined to be the orientation re-
presented by a non-zero determinant function satisfying the first of these.
Conclude that the underlying real vector space of a complex space carries a natural orientation. (iii) Let be a vector space with complex structure and consider the complex structure (E, —j). Show that the natural orientations of and (E, —j) coincide if and only if n is even (dimE=2n).
Chapter V
Algebras In paragraphs one and two all vector spaces are defined over a fixed, but arbitrarily chosen field F of characteristic 0.
§ 1. Basic properties 5.1. Definition: An algebra, A, is a vector space together with a mapping A x such that the conditions (M1) and (M2) below both hold. The image of two vectors xeA, yeA, under this mapping is called the product of x and y and will be denoted by xy. The mapping A x A-+A is required to satisfy: (M1) (M2)
(2x1 + jix2)y = 2(x1 y) + ,u(x2y) + 11y2)
2(xy1) + ji(xy2).
As an immediate consequence of the definition we have that
0x = x0 0. Suppose B is a second algebra. Then a linear mapping p : a homomorphism (of algebras) if çø preserves products; i.e.,
p(xy)=pxpy.
is called (5.1)
A homomorphism that is injective (resp. surjective, bijective) is called a monomorphism (resp. epimorphism, isomorphism). If B=A, p is called an endomorphism. Note: To distinguish between mappings of vector spaces and mappings of algebras, we reserve the word linear mapping for a mapping between vector spaces satisfying (1.8), (1.9) and homomorphism for a linear mapping between algebras which satisfies (5.1). Let A be a given algebra and let U, V be two subsets of A. We denote by U V, the set
§ 1. Basic properties
145
Every vector aeA induces a linear mapping p(a):A—*A defined by
1u(a)x=ax
(5.2)
4u (a) is called the multiplication operator determined by a.
An algebra A is called associative if
x(yz)=(xy)z and commutative if
xy=yx
x,y,zEA
x,yeA.
From every algebra A we can obtain a second algebra (x y)OPP
by defining
=yx
AON' is called the algebra opposite to A. It is clear that if A is associative
then so is A°". If A is commutative we have A°"=A. If A is an associative algebra, a subset Sc: A is called a system of generators of A if each vector xeA is a linear combination of products of elements in 5, 1... x= e S, 1... Vp ... (v)
A unit element (or identity) in an algebra is an element e such that for every x
xe=ex=x.
(5.3)
if A has a unit element, then it is unique. In fact, if e and e' are unit elements, we obtain from (5.3) e = e e' = e'. Let A be an algebra with unit element eA and ço be an epimorphism of A onto a second algebra B. Then eB = eA is the unit element of B. In fact,
if yEB is arbitrary, there exists an element xeA such that y= çox. This gives
yeB =
=
=
= y.
In the same way it is shown that An algebra with unit element is called a division algebra, if to every element a 0 there is an element a' such that a a — = a — = c. 1
5.2. Examples: 1. Consider the space L(E; E) of all linear transformations of a vector space E. Define the product of two transformations by 10
= i/i o Greub Linear Algebra
Chapter V. Algebras
146
The relations (2.17) imply that the mapping (p, i)—*çlip satisfies (M1) and (M2) and hence L(E; E) is made into an algebra. L(E; E) together with this multiplication is called the algebra of linear transformations of E and is denoted by A (E; E). The identity transformation i acts as unit element in A (E; E). It follows from (2.14) that the algebra A (E; E) is associative.
However, it is not commutative if dim E 2. In fact, write E= (x1) and (x2) are the one-dimensional subspaces generated by two linearly independent vectors x1 and x2, and F is a complementary subspace. Define linear transformations 'p and by
'px1=O,
(px2=x1,
py=O,yeF
ç/ix2=O;
çliy=O,yeF.
and Then
= 'p0 = 0 while (pX2
=
çlix1
= x2
whence
Suppose now that A is an associative algebra and consider the linear mapping defined by
p(a)x=ax.
(5.4)
Then we have that
p(a b)x = a bx = p(a)p(b)x whence
p(ab) = p(a)p(b). This
relation shows that p is a homomorphism of A
into A (A; A).
M X be the vector space of (n x mi)-matrices for a given integer ii and define the product of two (mi x n)-matrices by formula Example 2: Let
(3.20). Then it follows from the results of sec. 3.10 that the space is made into an associative algebra under this multiplication with the unit matrix J as unit element. Now consider a vector space E of dimension n with a distinguished basis (v = 1. n). Then every linear transforE determines a matrix M ('p). The correspondence 'p —+ M(co) mation 'p . .
§ 1. Basic properties
determines a linear isomorphism of A (E; E) onto
147 X
In view of sec.
3.10 we have that
M is an isomorphism of the algebra A (E; E) onto the opposite algebra (M" X n)0PP. Example 3: Suppose F1 cF is a subfield. Then F is an algebra over F1. We show first that F is a vector space over F1. In fact, consider the mapping F1 x F—+F defined by
(2,x)-÷2x, It satisfies the relations
2eF1,xEF.
+ x =2x+ x 2(x + y) = Ax + 2y (24u)x = 2(1ux)
lx = x where 2,
x, yeF. Thus F is a vector space over F by (x, y) x y (field multiplication).
Then M1 and M2 follow from the distribution laws for field multiplication.
Hence F is an associative commutative algebra over F1 with 1 as unit element.
Example 4: Let cr be the vector space of functions of a real variable t which have derivatives up to order r. Defining the product by
(1 g) (t) = f (t) g (t) we obtain an associative and commutative algebra in which the function 1(t) = 1 acts as unit element. 5.3. Subalgebras and ideals. A subalgebra, A1, of an algebra A is a linear subspace which is closed under the multiplication in A; that is, if x and y are arbitrary elements of A1, then xyeA1. Thus A inherits the structure of an algebra from A. It is clear that a subalgebra of an associative (commutative) algebra is itself associative (commutative). Let S be a subset of A, and suppose that A is associative. Then the subspace B A generated (linearly) by elements of the form
is
SEES Si...Sr, clearly a subalgebra of A, called the subalgebra generated by S. It is
easily verified that
where the 10*
B= A containing S.
Chapter V. Algebras
A right (left) ideal in an algebra A is a subspace I such that for every xEI, and every yEA, xyEI(yxEI). A subspace that is both a right and left ideal is called a tivt'o-sided ideal, or simply an ideal in A. Clearly, every
right (left) ideal is a subalgebra. As an example of an ideal, consider the subspace A2 (linearly generated by the products xy). A2 is clearly an ideal and is called the derived algebra. The ideal I generated by a set S is the intersection of all ideals containing S. If A is associative, I is the subspace of A generated (linearly) by elements of the form
s,as,sa
sES,aEA.
In particular every single clement a generates an ideal principal ideal generated by a.
4
is
called the
Example 5: Suppose A is an algebra with unit element e, and let (p:F-÷A be the linear mapping defined by ço2 = Ae.
Considering F as an algebra over itself we have that =
e
=
e)(R e) =
then ).e=0 whence p is a homomorphism. Moreover, if )L=0. It follows that p is a monomorphism. Consequently we may identify F with its image under p. Then F becomes a subalgebra of A and scalar multiplication coincides with algebra multiplication. In fact, if). is any scalar, then = (2e).a = = Example
6: Given an element a of an associative algebra consider
the set, Na, of all elements xEA such that ax =0. If xENa then we have for every VEA
and so xyENa. This shows that Na is a right ideal in A. It is called the right annihilator of a. Similarly the left annihilator of a is defined. 5.4. Factor algebras. Let A be an algebra and B be an arbitrary subspace of A. Consider the canonical projection
It will be shown that A/B admits a multiplication such that ir is a homomorphism if and only if B is an ideal in A. Assume first that there exists such a multiplication in A/B. Then for
§ 1. Basic properties
149
every xeA, yeB, we have
ir(xy) =
irxiry =
it follows that and so B must be an ideal. Conversely, assume B is an ideal. Then define the multiplication in
A/B by
= ir(xy)
(5.5)
where x and y are any representatives of
and 7 respectively.
It has to be shown that the above product does not depend on the choice of x and y. Let x' and y' be two other elements such that it x' = and iry' =7. Then
x'—xeB and y'—yeB. Hence we can write
x'==x+b, It
beB and y'—y+c,
ceB.
follows that
x'y' — xy = by + xc + bceB and so
rr(x'y') = it(xy). The multiplication in A/B clearly satisfies (M1) and (M2) as follows from the linearity of it. Finally, rewriting (5.5) in the form
it (x y) = it x iry
we see that it is a homomorphism and that the multiplication in A/B is uniquely determined by the requirement that it be a homomorphism. The vector space A/B together with the multiplication (5.5) is called the
factor algebra of A with respect to the ideal B. It is clear that if A is associative (commutative) then so is A/B. If A has a unit element e then ë= ire is the unit element of the algebra A/B. 5.5. Homomorphisms. Suppose A and B are algebras and is
a homomorphism. Then the kernel of p is an ideal in A. In fact, if xeker ço and yEA
are
arbitrary we have that
p(xy) =
=
=
0
whence xyEker qi. In the same way it follows that yxEker tp. Next consider the subspace Im pcB. Since for every two elements x, yeA = qi(xy)elmq
it follows that Im 'p is a subalgebra of B.
Chapter V. Algebras
Now let
be the induced injective linear mapping. Then we have the commutative d iagrani
A/kerço
and since
is
a homomorphism, it follows that = = = =
'p
(x y) (x). 'p (y)
is a homomorphism and hence a monomorphism. In particular, the induced mapping This relation shows that
4 Imço is an isomorphism. Finally, assume that C is a third algebra, and let be a homomorphism. Then the composition ti0 p: A —+ C is again a homomorphism. In fact, we have 'ps. = = =
Let 'p:A—*B be any homomorphism of associative algebras and S be by a system of generators for A. Then 'p determines a set map (po:
'p0x=çox,
xeS. In fact, if
The homomorphism 'p is completely determined by
x=
...
E S,
(v)
is
an arbitrary element we have that
... 'poxv
= (v)
I
Vp
e I'
§ 1. Basic properties
151
be an arbitrary set map. Then Po can be Proposition I: Let can be extended to a homomorphism ço:A—+B if and only if ...
whenever
0
po
...
= 0. (5.6)
(v)
(v)
Proof: It is clear that the above condition is necessary. Conversely, assume that (5.6) is satisfied. Then define a mapping ço:A—÷B by
= (v)
...
(5.7)
(v)
To show that çü is, in fact, well defined we notice that if = ... •.. Ypq
(v)
(p)
then —
Ypq
= 0.
(ji)
(v)
In view of (5.6) ...
...
—
(v)
0
(p)
and so
...
It follows from (5.7) that
px=ip0x xeS p(2x + jiy) = 2px + and
'p (x y) = 'p
'p y
and hence 'p is a homomorphism. Now suppose that
be a linear map such
is a basis for A and let
= for each x, j3. Then 'p is a homomorphism, as follows from the relation 'p (x y) =
'p
ep)}
e2) (> p
2,fl
=
'p (ep)) =
'p p
'p
(x) 'p (y).
Chapter V. Algebras
152
5.6. Derivations. A linear mapping 0: A called a derivation if
A of an algebra into itself is
X,yEA.
(5.8)
and As an example let A be the algebra of Cr-functions f: define the mapping 0 by 0:f—+f' wheref' denotes the derivative off. Then the elementary rules of calculus imply that 0 is a derivation. If A has a unit element e it follows from (5.8) that Oe =
Oe
+ 0e
whence Oe=—O. A derivation is completely determined by its action on a
system of generators of A, as follows from an argument similar to that used to prove the same result for homomorphisms. Moreover, if O:A—*A
is a linear map such that
=
+
is a basis for A, then 0 is a derivation in A. For every derivation 0 we have the Leibniz formula
where
on(xy)
(5.9)
= In fact, for n = 1, (5.9) coincides with (5.8). Suppose now by induction that (5.9) holds for some n. Then +
1(x y) =
o on
(x y) or+
-r y +
= x.On+ly +
+
or x
-r+
=
= x.On+ly + n+ 1
and so the induction is closed.
Orx.On+l_ry +
(n+1) Orx.On_r+ly +
§ 1. Basic properties
153
The image of a derivation 0 in A is of course a subspace of A, but it is in general not a subalgebra. Similarly, the kernel is a subalgebra, but it is not, in general, an ideal. To see that ker 0 is a subalgebra, we notice that for any two elements x, yeker 0
0(xy) =
0
whence xy E ker 0.
it follows immediately from (5.8) that a linear combination of derivations 01:A—*A is again a derivation in A. But the product of two derivations 01, 02 satisfies
(0102)(xy) = 01(02x.y + x.02y) (5.10)
and so is, in general, not a derivation. However, the commutator [01,02] = 0102—0201 is again a derivation, as follows at once from (5.10). 5.7. ip-derivations. Let A and B be algebras and 'p : A B be a fixed homomorphism. Then a linear mapping 0: A B is called a 'p-derivation if
0(xy)=
x,yeA.
+
denotes In particular, all derivations in A are i-derivations where i : A the identity map. As an example of a 'p-derivation, let A be the algebra of Cm-functions f: R—* and let B= 11. Define the homomorphism cp to be the evaluation homomorphism : f -÷ (0) and the mapping 0 by
f
0:f Then it follows that
0(fg) = (1 g)'(O) = = 0
f' (0) g (0) + f (0) g' (0)
a 'p-derivation.
More generally, if °A is any derivation in A, then derivation. In fact, 0(xy) = çoOA(xy) + xOAv) = = 'p
+
'p x
0 y.
Similarly, if °B is a derivation in B, then °B° 'p
is
a 'p-derivation.
is
a ço-
Chapter V. Algebras
154
5.8. Antiderivations.
Recall that an involution in a linear space is a
linear transformation whose square is the identity. Similarly we define an involution w in an algebra A to be an endomorphisin of A whose square is the identity map. Clearly the identity map of A is an involution. If A has a unit element e it follows from sec. 5.1 that we=e. Now let w be a fixed involution in A. A linear transformation Q:A—*A will be called an antiderivation with respect to w if it satisfies the relation
Q(xj) =
+
wxQy.
(5.11)
In particular, a derivation is an antiderivation with respect to the involution i. As in the case of a derivation it is easy to show that an antiderivation is determined by its action on a system of generators for A and that ker Q is a subalgebra of A. Moreover, if A has a unit element e, then Qe=O. It also follows easily that any linear combination of antiderivations with respect to a fixed involution w is again an antiderivation with respect to w. Suppose next that Q1 and Q2 are antiderivations in A with respect to 0w1. Then w1 oW2 the involutions w1 and w2 and assume that w1 0w2 is again an involution. The relations
(Q122)(xy)= Q1(Q2xy + w2xQ2y)=
+
and
(Q2Q1)(xy)=
+ w1xQ1y)= Q2Q1xy + + w2Q1xQ2y + Q2w1xQ1y + w2w1xQ2Q1y
yield
(Q1Q2 ± Q2 ± Q2w1)xQ1 y + +(Q1w2 ± w2Q1)xQ2y + W1W2X.(Q1Q2 ± Q2Q1)y. Q2
(5.12)
Now consider the following special cases: 1. w1 Q2 = Q2 w1 and w2 Q1 = Q1 (this is trivially true if w1 = ± i and Q2 —Q2 is an antiderivation = ± i). Then the relation shows that with respect to the involution W1 w2. In particular, if Q is an antiderivation
with respect to w and 0 is a derivation such that 0)0=0W, then OQ—Q0 is again an antiderivation with respect to oi. 2. w1Q2=—Q2W1andw2Q1=—Q1W2.ThenQ1Q2+Q2Q1isanantiderivation with respect to the involution W1 0)2.
§ 1. Basic properties
155
Now let Q1 and Q2 be two antiderivations with respect to the same involution w such that
(i=1,2). Then it follows that Q1 Q2 + Q2 Q1 is a derivation. In particular, if Q is any antiderivation such that wQ = - Qw then Q2 is a derivation. Finally, let B be a second algebra, and let cp: A —÷ B be a homomorphism. Assume that WA is an involution in A. Then a p-antiderivation with respect to WA is a linear mapping Q:A-+B satisfying (5.13) If WB is
an involution in B such that WA
WB 'P
then equation (5.13) can be rewritten in the form
Q(xy) =
+
Problems 1. Let A be an arbitrary algebra and consider the set C(A) of elements aEA that commute with every element in A. Show that C(A) is a subspace
of A. If A is associative, prove that C(A) is a subalgebra of A. C(A) is called the centre of A. 2. If A is any algebra and 0 is a derivation in A, prove that C(A) and the derived algebra are stable under 0. 3. Construct an explicit example to prove that the sum of two endomorphisms is in general not an endomorphism. 4. Suppose 'p A —÷ B is a homomorphism of algebras and let ). + 0, 1 be
an arbitrarily chosen scalar. Prove that 2p is a homomorphism if and only if the derived algebra is contained in ker 'p. 5. Let C1 and C denote respectively the algebras of continuously differentiable and continuous functionsf: (cf. Example 4). Consider the linear mapping d: C' C
given by df=f' wheref' is the derivative off.
Chapter V. Algebras
a) Prove that this is an i-derivation where i: C1—*C denotes the canonical injection. b) Show that d is surjective and construct a right inverse for d.
c) Prove that d cannot be extended to a derivation in the algebra C. 6. Suppose A is an associative commutative algebra and 0 is a derivation in A. Prove that Ox" = px"' 0(x). 7. Suppose that 0 is a derivation in an associative commutative algebra A with identity e and assume that xeA is invertible; i.e.; there exists an element x ' such that
Prove that x" (p 1) is invertible and that
(x")' —(x')" Denoting the inverse of x" by
show that for every derivation 0
0(x") = — px"1 0(x). 8. Let L be an algebra in which the product of two elements x, y is denoted by [x, y]. Assume that [x, y] + [y, x] = z], x] + y], z] +
0
(skew symmetry)
x], y] = 0 (Jacobi identity)
Then L is called a Lie algebra. Let Ad (a) be the multiplication operator in the Lie algebra L. Prove that Ad (a) is a derivation.
9. Let A be an associative algebra with product xy. Show that the multiplication (x, y)—÷[x, y] where
[x,y]
xy — yx
makes A into a Lie algebra. 10. Let A be any algebra and consider the space D (A) of derivations in A. Define a multiplication in D(A) by setting [01,02] = a)
0102 —0201.
Prove that D (A) is a Lie algebra.
b) Assume that A is a Lie algebra itself and consider the mapping given by Show that 'p is a homomorphism of Lie algebras. Determine the kernel of p.
§ 1. Basic properties
11. If L is a Lie algebra and I is an ideal in A, prove that the algebra L/I is again a Lie algebra. 12. Let E be a finite dimensional vector space. Show that the mapping 1: A (E; E) —÷ A (E*; E*)OPP
is an isomorphism of algebras. given by 13. Let A be any algebra with identity and consider the multiplication
operator
ji: A
A (A; A).
Show that p is a monomorphism. If A = L (E; E) show that by a suitable restriction of p a monomorphism
(i= 1
E be an n-dimensional vector space. Show that each basis n) of L(E; E) such that n) of E determines a basis 1
(i) QijOlk = (ii)
i.
Conversely, given n2 linear transformations of E satisfying i) and ii), prove that they form a basis of L(E; E) and are induced by a basis of E. Show that two bases e1 and of E determine the same basis of L (E; E) if and only if e = A 2 eF. 15. Define an equivalence relation in the set of all linearly independent E), in the following way: n2-tuples (p1...pn2), ...
('P1 ...
if and only if there exists an element =
such that (v
= ... 1
,12)
Prove that 2EV
only if 2=1. 16. Prove that the bases of L(E; E) defined in problem 14 form an equivalence class under the equivalence relation of problem 15. Use this to show that every non-zero endomorph ism cP: A (E; E)—*A (E; E) is an inner automorphism; i.e., there exists a fixed linear automorphism of E such that 1
17. Let A be an associative algebra, and let L denote the corresponding
Chapter V. Algebras
158
is Lie algebra (cf. problem 9). Show that a linear mapping derivation in A only if it is a derivation in L. 18. Let E be a finite dimensional vector space and consider the mapping E)—*A(E; E) defined by
= — Prove that is a derivation. Conversely, prove that every derivation in A (E; E) is of this form. Hint. Use problem 14. § 2.
Ideals
5.9. The lattice of ideals. Let A be an algebra, and consider the set 1 of ideals in A. We order this set by inclusion; i.e., if and '2 are ideals if and only if in A, then we write '1 The relation is clearly a partial order in 1 (cf. sec. 0.6). Now let and '2 be ideals in A. Then it is easily cheeked that + and '1 n '2 are again ideals, and are in
fact the least upper bound and the greatest lower bound of '1 and '2 Hence, the relation induces in 5 the structure of a lattice. 5.10. Nilpotent ideals. Let A be an associative algebra. Then an element aEA will be called ni/potent if for some k, (5.14)
The least k for which (5.14) holds is called the degree of ni/potency of a. An ideal I will be called nilpotent if for some k,
,k0
(5.15)
The least k for which (5.15) holds is called the degree of ni/potency of I and will be denoted by deg I. 5.11.* Radicals. Let A be an associative commutative algebra. Then the nilpotent elements of A form an ideal. In fact, if x and y are nilpotent of degree p and q respectively we have that p+q
+
p+q
p
= i=O
and
(xy)" = x"y" = 0.
§ 2. Ideals
The ideal consisting of the nilpotent elements is called the radical of A and will be denoted by rad A. (The definition of radical can be generalized to the non-commutative case; the theory is then much more difficult and belongs to the theory of rings and algebras. The reader is referred to [14]).
It is clear that rad (rad A) = rad A.
The factor algebra A/rad A contains no non-zero nilpotent elements. To prove this assume that A is an element such that for some k. Then x"erad A and hence the definition of rad A yields / such that = (x")' = 0. It follows that xErad A whence by the formula
The above result can be expressed
rad(A/radA) = 0.
Now assume that the algebra A has dimension n. Then rad A is a nilpotent ideal, and (5.16) deg(radA) dim(radA) + 1 n + 1. For the proof, we choose a basis e1, ..., er of rad A. Then each is nilpotent. Let k = max (deg e1), and consider the ideal (rad An arbitrary element in this ideal is a sum of elements of the form where
k
so
...
=0. This shows that
=0 and so rad A is nilpotent. Now let s be the degree of nilpotency of rad A, and suppose that for some m<s, (rad A)tm = (rad A)m+
(5.17)
Then we obtain by induction that (radA)tm =
= (radA)m+2 =...= (radA)s = 0
which is a contradiction. Hence (5.17) is false and so in particular dim(rad A)tm> dim (radA)m+l,
m <s.
Chapter V. Algebras
It follows at once that s — cannot be greater than the dimension of rad A, which proves (5.16). As a corollary, we notice that for any nilpotent element xEA, its degree of nilpotency is less than or equal to ii+ I, 1
degx
n + 1.
5.12.* Simple algebras. An algebra A is called simple if it has no proper
non-trivial ideals and if A2 +0. As an example consider a field F as an algebra over a subfield F1. Let 1+0 be an ideal in F. If x is a non-zero element of I, then 1
and it follows that
F=
cJ
whence F = I. Since 12+0, F is simple. As a second example consider the algebra A (E; E) where E is a vector space of dimension n. Suppose I is a non-trivial ideal in A (E; E) and let p+O be an arbitrary element of I. Then there exists a vector aeE such
that pa+0. Now define the linear transformations
by
i,k=1...n where e (i= that
.n) is a basis of E. Choose linear transformations
such
i=1...n.
(E; E) be arbitrary and c4 let be the matrix of Let the basis e1. Then = = =
with respect to
whence 'I'
=
It follows that i/iel and so I_—A(E; E). Since, (clearly) A (E; E)2 +0, A (E; E) is a simple algebra. Theorem I: Let A be a simple commutative associative algebra. Then A is a division algebra. Proof: We show first that A has a unit element. Since A2+0, there is an element aEA such that Ia+0• Since A is simple, Thus there is an element eEA such that C!• e = a.
§2. Ideals
161
Next we show that e2 = e. In fact, equation (5.18) implies that a(e2 — e)
= (ae)e — ae = ae
—
ae =0
and so e2—e is contained in the annihilator of a, e2—eeN0. Since N0 is an ideal and Na*A it follows that Na=0 whence e2=e.
Now let xeA be any element. Since a=aeele we have Ie#0 and 50 IeA. Thus we can write x = er
for some ye A.
It follows that ex=e2v=ev=x and so e is the unit element of A. Thus every element xeA satisfies x=x. e; i.e., xel,. In particular, * 0 if .v * 0. Hence, if x + 0. = A and so there is an element x such
that xx"'=x'x=e: i.e., A is a division algebra.
5.13.* Totally reducible algebras. An algebra A is called totally recluclb/c if to every ideal I there is a complementary ideal I', A=
1$!'.
Every ideal I in a totally reducible algebra is itself a totally reducible algebra. In fact, let I' be a complementary ideal. Then
1.1' ci fl I' =
0.
Consequently, if J is an ideal in I, we have
J
an
ideal in A. Let J' be a complementary ideal in A, A
Intersecting with I and observing that id we obtain (cf. sec. 1.13) I It follows that us again totally reducible. An algebra, A, is called irreducible if it cannot be written as the direct
sum of two non-trivial ideals. 5.14.* Semisimple algebras. In this section A will denote an associative commutative algebra. A will be called semisimple if it is totally reducible and if for every non-zero, I, ideal j2 +0. 1
(reeb.
near AIgr±ra
Chapter V. Algebras
162
Proposition I: If A is totally reducible, then A is the direct sum of its radical and a semisimple ideal. The square of the radical is zero. Proof: Let B denote a complementary ideal for rad A,
A = rad A
B.
B contains no non-zero nilpotent eleIt follows from sec. (5.13) that B is totally reducible ments and so B2 and hence B is semisimple. To show that the square of rad A is zero, let k be the degree of nilan ideal in rad A, and potency of rad A, (rad A)k=0. Then (rad so there exists a complementary ideal J, Since
EEJ = radA.
Now we have the relations =
=
0
=0 n J=0.
jk-1 These relations yield (rad
0
A
and only if A is totally reducible and
rad A=0.
Problems 1. Suppose that
are ideals in an algebra A. Prove that
+ '2)/Il
'2/(11
'2).
Show that the algebra C' defined in Example 4, § 1, has no nilpotent elements +0. 111. Show that the oper3. Consider the set S of step functionsf: [0, ations 2.
(1 +g)(t)=f(t)+g(t) (Af)(t) =
(t)
(fg)(t) =f(t)g(t)
§ 3. Change of coefficient field of a vector space
163
make S into a commutative associative algebra with identity. (A function f: [0,1] —÷ is called a step function if there exists a decomposition of the
unit interval,
0=toJEk Then by setting
—1)
(ke7L).
we make Einto agraded space, and when-
ever we refer to the graded space E_—>Ek, we shall mean this particular k=O
gradation. A gradation of E such that Ek = 0, k — gradation.
1,
is called a positive
Chapter VI. Gradations and homology
Now let E be a G-graded space, E=>Ek,
and
consider a subspace
kEG
FcEsuchthat
F=
keG
Then a G-gradation is induced in F by assigning the degree k to the vec-
tors of Fn Ek. F together with its induced gradation is called a G-graded subspace of E. Suppose next that is a family of G-graded spaces indexed by a set I and let E be the direct sum of the EA. Then a G-gradation is induced in
Eby E=
where
Ek =
kEG
).eI
This follows from the relation Ad
kEG )et
AEI keG
keG
6.2. Linear mappings of G-graded spaces. Let E and F be two G-graded spaces and let E—3F be a linear map. The map p is called homogeneous if there exists a fixed element kEG such that
jeG
(6.2)
k is called the degree of the homogeneous mapping p. The kernel of a homogeneous mapping is a graded subspace of E. In fact, if
E and F induced by the gradations of E and F it follows from (6.2) that cJk+JO(P—(POQJ.
(6.3)
Relation (6.3) implies that ker 'p is stable under the projection operators and hence ker 'p is a G-graded subspace of E. Similarly, the image of 'p is a G-graded subspace of F. Now let E be a G-graded vector space, F be an arbitrary vector space (without gradation) and suppose that 'p:E—*Fis a linear map of Eonto F such that ker 'p is a graded subspace of E. Then there is a uniquely determined G-gradation in F such that 'p is homogeneous of degree zero. The G-gradation of F is given explicitly by jeG where
= 'p (Es).
(6.4)
§ 1. 6-graded vector spaces
169
To show that (6.4) defines a G-gradation in F, we notice first that since ço is onto,
F=
=
E=
To prove that the decomposition (6.4) is direct assume that where Since lows that ço
every can be written in the form = 0 whence
It fol-
Since ker p is a graded subspace of E we obtain
for eachj Thus the decomposition (6.4) is direct and hence it defines a G-gradation in F. Clearly, the mapping p is homogeneous of degree zero with respect to the induced gradation. Finally, it is clear that any G-gradation of F such that ço is homogeneous of degree zero must assign to the elements of the degree]. In view of the decomposition (6.4) it follows that this G-gradation is uniquely determined by the requirement that 'p be homogeneous of degree zero. This result implies in particular that there is a unique G-gradation determined in the factor space of E with respect to a G-graded subspace such that the canonical projection is homogeneous of degree zero. Such a factor space, together with its G-gradation, is called a G-graded factor whence
space of E.
Now let E= that
Ek and F= kEG
Fk be two G-graded spaces, and suppose keG
is a linear mapping homogeneous of degree 1. Denote by Pk the restriction of 'p to Ek, Ek —*
Fk+l.
Then clearly keG
It follows that 'p is injective (surjective, bijective) if and only if each c°k injective (surjective, bijective).
Chapter VI. Gradations and homology
6.3. Gradations with respect to several groups. Suppose that t: G
into another abelian group G'. Then a G-
gradation of E induces a G'-gradation of E by where
(6.5) t(2)=fl
/3
To prove this we note first that E= since
X
The directness of the decomposition (6.5) follows from the
fact that every space decomposition
is a sum of certain subspaces
and that the
is direct.
A Gp-gi'adation is a gradation with respect to the group
GEE... p
If G = we refer simply to a p-gradation of E. Given a G p-gradation in E consider the homomorphism
i:
G
given by
The induced (simple) G-gradation of E is given by (6.6)
j
The G-gradation (6.6) of E is called the (simple) G-gradation induced by the given G p-gradation. Finally, suppose E is a vector space, and assume that
E=
E=
jeG
keIi
Fk
(6.7)
define G and H-gradations in E. Then the gradations will be called coinpatible if E= n Fk. j,
k
if the two gradations given by (6.7) are compatible, they determine a in E by the assignment n Fk.
§ 1. G-graded vector spaces
Conversely, suppose a
171
in E is given
E= k
Then compatible G and H-gradations of E are defined by jeG
keH
and
E determined by these G and Hgradations is given by
jEG,kEH.
EJ,k=EfflFk
6.4. The Poincaré series. A gradation E== >.Ek of a vector space E is called almost finite if the dimension of every space Ek is finite. To every almost finite positive gradation we assign the formal series PE(t) = (t) is called the Poincaré series of the graded space E. If the dimension of E is finite, then (t) is a polynomial and
PE(l) = dimE. The direct sum of two almost finite positively graded spaces E and F is again an almost finite positively graded space, and its Poincaré series is given by = PE(t) + PF(t). Two almost finite positively graded spaces E and F are connected by a homogeneous linear isomorphism of degree 0 if and only if = In fact, suppose is such a homogeneous linear isomorphism of degree 0. Writing k=O
(cf. sec. 6.2) we obtain that each c°k is a linear isomorphism, Ek 4Fk.
Chapter VI. Gradations and homology
172
Hence
(k=O,1,...)
dimEk=dimFk
(6.8)
and so
= Conversely, assume that are linear isomorphisms
Then (6.8) must hold, and thus there (6.9)
(pk.Ek—*Fk.
Since
we
can construct the linear mapping
= k=O
E
F
which is clearly homogeneous of degree zero. Moreover, in view of sec. (6.2) it follows from (6.9) that ço is a linear isomorphism. 6.5. Dual G-graded spaces. Suppose E= > Ek and F= Fk are two GkE G graded vector spaces, and assume that k G ço:E x
is a bilinear function. Then we say that cv respects the G-gradations of E
and Fif (6.10)
for each pair of distinct degrees, k +j. Every bilinear function cv: E x F—÷F which respects G-gradations determines bilinear functions pk: Ek x (keG) by
(PIEkXFk=(Pk,
keG.
(6.11)
Conversely, if any bilinear functions (pk: Ek x are given, then a unique bilinear function p:Ex F-+F which respects G-gradations is determined by (6.10) and (6.11). In particular, it follows that ço is non-degenerate if and only if each is non-degenerate. Thus a scalar product which respects G-gradations determines a scalar product between each pair (Ek, Fk), and conversely if a scalar product is defined between each pair (Ek, Fk) then the given scalar product can be extended in a unique way to a G-gradation-respecting scalar product between E and F. E and F, together with a G-gradation-respecting scalar product, will be called dual G-graded spaces.
Now suppose that E and F are dual almost finite G-graded spaces.
§ 1. G-graded vector spaces
173
Then Ek and Fk are dual, and so
dimEk=dimFk In particular, if G
kEG.
and the gradations of E and F are positive, we have
=
Problems 1. Let çø : E—*F be an injective linear mapping. Assume that F is a Ggraded vector space and that Im ço is a G-graded subspace of F. Prove that there is a unique G-gradation in E so that p becomes homogeneous of degree zero. 2. Prove that every G-graded subspace of a G-graded vector space has a complementary G-graded subspace. 3. Let E, F be G-graded vector spaces and suppose that E1 E, F1 F are G-graded subspaces. Let p: E—+F be a linear mapping homogeneous of degree k. Assume that p can be restricted to E1, F1 to obtain a linear mapping p1 :E1—*F1 and an induced mapping
Prove that q1 and are homogeneous of degree k. 4. If cp, E, F are as in problem 3, prove that if has a left (right) inverse, then a left (right) homogeneous inverse of must exist. What are the possible degrees of such a homogeneous left (right) inverse mapping? 5. Let E1, E2, 123 be G-graded vector spaces. Suppose that p: E1 and i/i:E1—+E3 are linear mappings, homogeneous of degree k and 1 respectively. Assume that can be factored over 'p. Prove that cli can be factored over 'p with a homogeneous linear mapping x: and determine the degree of x. Hint: See problem 5, chap. II, § 1.
6. Let E, E* and F, F* be two pairs of dual G-graded vector spaces. Assume that and 'p is homogeneous of degree k, prove that 'p* homogeneous of degree k. and 7. Let E be an almost finite graded space. Suppose that are G-graded spaces each of which is dual to the G-graded space E. Construct
is
Chapter VI. Gradations and homology
174
a homogeneous linear isomorphism of degree zero
such that
=
xeE.
8. Let F, E* be a pair of almost finite dual G-graded spaces. Let F be a G-graded subspace of E. Prove that F' is a G-graded subspace of E*
and that (F')'=F. 9. Suppose E, E*, F are as in problem 8. Let F1 be a complementary G-graded subspace for F in E (cf. problem 2). Prove that
E* =F'f13F1' and that F, F and F1, F' are two pairs of dual G-graded spaces. 10. Suppose E, E* and F, F* are two pairs of almost finite dual G1L
graded vector spaces, and let p:E—*Fbe a linear mapping homogeneous of degree k. Prove that çQ* exists.
11. Suppose F, E* is a pair of almost finite dual G-graded vector spaces. Let be a basis of F consisting of the set union of bases for the in E* exists. homogeneous subspaces of E. Prove that a dual basis 12. Let F and F be two G-graded vector spaces and p: E—+ F be a homogeneous linear mapping of degree k. Assume further that a homomorphism w:G—*H is given. Prove that p is homogeneous of degree w(k) with erspect to the induced H-gradation. § 2.
G-graded algebras
6.6. G-graded algebras. Let A be an algebra and suppose that a Ggradation A =
is defined in the vector space A. Then A is called a kEG
G-graded algebra if for every two homogeneous elements x and y, xy is
homogeneous, and
deg(xy) = degx + degy. Suppose that A = keG
(6.12)
is a graded algebra with identity element e.
Then e is homogeneous of degree 0. In fact, writing
e=
ek
keG
ekeAk
§ 2. G-graded algebras
175
we obtain for each xeA that
x is homogeneous of degree 1 kEG
whence
xek—O for k+O and so
xe0=x.
(6.13)
Since (6.13) holds for each homogeneous vector x it follows that e0 is a right identity for A, whence e = e e0
= e0.
Thus eeA0 and so it is homogeneous of degree 0.
It is clear that every subalgebra of A that is simultaneously a G-graded subspace, is a G-graded algebra. Such subalgebras are called G-graded subalgebras.
Now suppose that 1c A is a G-graded ideal. Then the factor algebra A/I has a natural G-gradation as a linear space (cf. sec. 6.2) such that the canonical projection ir:A—A/I is homogeneous of degree zero. Hence if and 9 are any two homogeneous elements in A/I we have .9 =
and so
is
deg
= 7t (x y)
homogeneous. Moreover, = deg(x y) = deg x + deg y = deg
+ deg9.
Consequently, A/I is a G-graded algebra. More generally, if B is a second algebra without gradation, and p : A—÷B is an epimorphism whose kernel is a G-graded ideal in A, then the induced G-gradation (cf. sec. 6.2) makes B into a G-graded algebra. Now let A and B be G-graded algebras, and assume that p:A—+B is a
homogeneous homomorphism of degree k. Then ker 'p is a G-graded ideal in A and Im 'p is a G-graded subalgebra of B. G' is a homoSuppose next that A is a G-graded algebra, and r: morphism, G' being a second abelian group. Then it is easily checked that the induced G'-gradation of A makes A into a G'-graded algebra.
Chapter VI. Gradations and homology
176
The reader should also verify that if E is simultaneously a G- and an H-graded algebra such that the gradations are compatible, then the inalgebra. of A makes A into a duced A graded algebra A is called anticommutative if for every two homogeneous elements x and y = if x and y are two homogeneous elements in an associative anticommux y comtative graded algebra such that deg mute, and so we obtain the binomial formula n
yfl - i.
(x +
= an involution w is defined by
In every graded algebra A =
1)kx, wx = (6.14) XEAk. In fact, if XEAk and yeA1 are two homogeneous elements we have
w(xy) =
1)k+lxy = (_
1)1y
=
w preserves products. It follows immediately from (6.14) that = i and so w is an involution. w will be called the canonical involution of the graded algebra A. A homogeneous antiderivation with respect to the canonical involution (6.14) will simply be called an antiderivation in the graded algebra A. It satisfies the relation
Q(xy)=
+(— 1)kxQy,
xeAk,yeA.
If Q1 and Q2 are antiderivations of odd degree then Q1 Q2 + Q2 Q1 is a
derivation. If Q is an antiderivation of odd degree and 0 is a derivation then QO —OQ is an antiderivation (cf. sec. 5.8). Now assume that A is an associative anticommutative graded algebra
and let hEA be a fixed element of odd (even) degree. Then, if Q is a homogeneous antiderivation, p (h) Q is a homogeneous derivation (antiderivation) and if 0 is a homogeneous derivation, p(h)O is a homogeneous antiderivation (derivation) as is easily checked.
Problems 1. Let A be a G-graded algebra and suppose x is an invertible element homogeneous of degree k (cf. problem 7, chap. V, § 1). Prove that 1
§ 2. G-graded algebras
177
is homogeneous and calculate the degree. Conclude that if A is a positively graded algebra, then kz=O. 2. Suppose that A is a graded algebra without zero divisors. Prove that every invertible element is homogeneous of degree zero.
3. Let E, F be G-graded vector spaces. Show that the vector space LG(E; F) generated by the homogeneous linear mappings ço:E-+F is a subspace of L(E; F). Define a natural G-gradation in this subspace such that an element cOELG(E; F) is homogeneous if and only if it is a homogeneous linear mapping. 4. Prove that the G-graded space LG (E; E) (E is a G-graded vector space) is a subalgebra of A (E; E). Prove that the G-gradation makes LG (E; E) into a G-graded algebra (which is denoted by AG (E; E)). 5. Let E be a positively graded vector space. Show that an injective (surjective) linear mapping E) has degree Conclude that a homogeneous linear automorphism of E has degree zero. 6. Let A be a positively graded algebra. Show that the subset Ak of A consisting of the linear combinations of homogeneous elements of degree
k is an ideal. 7. Let E, E* be a pair of almost finite dual G-graded vector spaces. Construct an isomorphism of algebras: &AG(E;E)4 AG(E*;E*)0PP. Hint: See problem 12, chap. V, § 1. Show that there is a natural G-gradation in AG(E*; such that 1i is homogeneous of degree zero. 8. Consider the G-graded space LG (E; E) (E is a G-graded vector space). Assign a new gradation to LG (E; E) by setting
degp =
— degp
whenever PELG(E; E) is a homogeneous element. Show that with this new gradation LG (E; E) is again a G-graded space and AG (F; E) is a
G-graded algebra. To avoid confusion, we denote these objects by LG(E; E) and AG(E; E). Prove that the scalar product between LG (E;
E) and EG (E; E) defined
by
= makes these spaces into dual G-graded vector spaces. 12
Linear Algebra
Chapter VI. Gradations and homology
178
9. Let
be a graded algebra and consider the linear mapping
O:A—*A defined by
Ox = px
Show that 0 is a derivation. §
3•* Differential spaces and differential algebras
6.7. Differential spaces. A differential operator in a vector space E is a linear mapping E—5 E such that = 0. The vectors of ker = Z(E) are called boundaries. it are called cycles and the vectors of Im follows from 132=0 that B(E)cZ(E). The factor space
H(E) = Z(E)/B(E) is called the homology space of E with respect to the differential operator 13. A vector space E together with afixed differential operator 13E' is called a differential space. A linear mapping of a differential space (E, 13E) into a differential space (F, 13F) is called a homomorphism (of differential spaces) if (6.15)
CF0P_(P0GE
It follows from (6.15) that p maps Z(E) into Z(F) and B(E) into B(F). Hence a linear mapping p is an isomorphism of differential spaces and is the linear inverse iso1
morph ism, then by applying p - on the left and right of (6.15) we obtain 1
—1
°
= CEO çø
and so p1 is an isomorphism of differential spaces as well. If 1/1 is a homomorphism of(F, 13k) into a third differential space (G, we
have clearly
In particular, if ço is an isomorphism of E onto F and p isomorphism we have =I =
1
and
= Consequently,
= 1.
is an isomorphism of H(E) onto H(F).
is
the inverse
§ 3. Differential spaces and differential algebras
6.8. The exact triangle. An exact sequence of differential spaces is an exact sequence (6.16)
where (F, aF), (E, and (G, are differential spaces, and 'p, are homomorphisms. Suppose we are given an exact sequence of differential spaces then the sequence H(E) H(G) o is exact at H(E). In fact, clearly, = 0 and so Tm * c ker Conversely, let and choose an element yEZ(E) which repre-
sents /3. Then, since z1EG.
is surjective, there is an element y1 E E such that t/iy1 = z1. It follows that — CGZ1 = 0. — CEll) = = — 6G Since
Hence, by exactness at E,
for some XEF. Applying
we
obtain CE(PX = GEl
=0
whence
Since p is injective, this implies that i.e., XEZ(F). Thus x represents an element It follows from the definitions that (p*
= /3
and so
It should be observed that the induced homology sequence is not short exact. However, there is a linear map, y: H(G)—*H(F) which makes the triangle
H(F)-*H(E)
(6.17)
A
H(G)
exact. It is defined as follows: Let sentative of Choose VEE such that =
and let ZEZ(G) be a repreThen = GG
Z
=0
Chapter VI. Gradations and homology
and so there is an element XEF such that = Since
=
=
=
0
and since is injective, it follows that IFX=O and so x represents an element x of H(F). It is straightforward to check that is independent of all choices and so a linear map y: is defined by It is not difficult to verify that this linear map makes the triangle (6.17) exact at 11(G) and H(F). / is called the connecting homomorphism for the short exact sequence (6.16). 6.9. Dual differential spaces. Suppose (E, is a differential space and consider the dual space E* = L(E). Let a* he the dual map of Then for all xEE, X*EE* we have = <x*,0> =
=
0
whence 3*a*x*_o i.e., (o*)2 = 0.
Thus (E*,
is again a differential space. It is called the dual differential
space.
= Z (E*) are called cocycles (for E) and the vectors The vectors of ker of B(E*) are called coboundaries (for E). The factor space
H(E*) = Z(E*)/B(E*) is called the cohoinology space for E. it will now be shown that the scalar product between E and E* determines a scalar product between the homology and cohomology spaces. In view of sec. 2.26 and 2.28 we have the relations
Z(E*)=B(E)L,
Z(E)=B(E*)L
(6.18)
B(E*) = Z(E)-'-,
B(E) = Z(E*)--.
(6.19)
We can now construct a scalar product between H(E) and H(E*). Consider the restriction of the scalar product between E and E* to Z(E) x Z(E*), Z(E) x Then since, in view of (6.18) and (6.19),
z(E*)±
n
Z(E) = B(E)
n
Z(E) = B(E)
§ 3. Differential spaces and differential algebras
and n
181
B(E*) n z(E*) =
z(E*)
the equation
=
a scalar product between H(E) and H(E*) (cf. sec. 2.23). (E*, Finally, suppose (E, and (F, SF)' (F*, CF*) are two pairs of dual differential spaces. Let
defines
ço:E-*F be a homomorphism of differential spaces, with dual map (p*: Then dualizing (6.15) we obtain *
and so
induced
°
*_ F
* * E0(P
is a homomorphism of differential spaces. It is clear that the mappings
and
H(F*)
: H(E*)
are again dual with respect to the induced scalar products; i.e., I
*\,'# — —I
\*
6.10. G-graded differential spaces. Let E be a G-graded space, E= peG
and consider a differential operator in E that is homogeneous of some degree k. Then a gradation is induced in Z(E) and B(E) by
Z(E)=
Z,,(E)
and B(E)=
peG
and
where
peG
(cf. sec. 6.2).
Now consider the canonical projection
Since rr is an onto map and the kernel of is a graded subspace of Z (E), a G-gradation is induced in the homology space H(E) by
H(E)=
where
peG
Chapter VI. Gradations and homology
The Now consider the subspaces and factor space is called the p-ill homology space of the graded In differential space E. It is canonically isomorphic to the space fact, if then denotes the restriction of it to the spaces
is an onto map and the kernel of =
n
is given by
kent =
n
B(E) =
induces a linear isomorphism of if E is a graded space and dim is finite we write
Hence,
onto
=
dim
is called the p-th Betti number of the graded differential space (E,
If E is an almost finite positively graded space, then clearly, so is H(E). The Poincaré series for H(E) is given by P
D
H(E)
P=o
6.11. Dual G-graded differential spaces. Suppose (E= (E* = keG
E,
Ek, c5)
and
kEG is
a pair of dual G-graded differential spaces. Then if 0E
is homogeneous of degree 1, we have that
E1
E1 +
and hence
= = 0 unless
it follows that
fl
j+i—l
is homogeneous of degree —1. and so Now consider the induced G-gradations in the homology spaces
The induced scalar product is given by (6.22)
=
Since
and
§ 3. Differential spaces and differential algebras
183
are homogeneous of degree zero, it follows that the scalar product (6.22) respects the gradations. Hence H(E) and H(E*) are again dual G-graded vector spaces. In particular, if G = the p-th homology and cohomology spaces of E are dual. If has finite dimension we obtain that dim
=
dim
6.12. Differential algebras. Suppose that A is an algebra and that 13 is a differential operator in the vector space A. Assume further that an involution w of the algebra A is given such that 13w+w13=O, and that 13 is an antiderivation with respect to i.e., that (6.23)
Then (A, 13) is called a differential algebra.
It follows from (6.23) that the subspace Z(A) is a subalgebra of A. Further, the subspace B(A) is an ideal in the algebra Z(A). In fact, if and 13xeB(A) we have
yeZ(A)
13(xy) = 13xy
and
13(wy.x)= w2 y13x +
y13x
yeZ(A)
whence 13x .yeB(A) 13xeB(A). Hence, a multiplication is induced in the homology space H(A). The space H(A) together with this multiplication is called the homology algebra of the differential algebra (A, 13). The multiplication in H(A) is given by
irz1 nz2 =
z1,z2eZ(A)
m(z1 z2)
where ir:Z(A)—÷H(A) denotes the canonical projection. If A is associative (commutative) then so is H(A). Let (A, 8A) and (B, 0B) be differential algebras. Then a homomorphism A -+B is called a homomorphism of d(fferential algebras if 'P 13A =
13B
It follows easily that the induced mapping : is a homomorphism of homology algebras. Suppose now A is a graded algebra and that 13 and a) are both homogeneous, w of degree zero. The A is called a graded differential algebra. Consider the induced gradation in H(A). Since the canonical projection m:Z(A)—÷H(A) is a homogeneous map of degree zero it follows that H(A) is a graded algebra.
Chapter VI. Gradations and homology
If A is an anticommutative graded algebra, then so is H(A) as follows from the fact that rr is a homogeneous epimorphism of degree zero.
Problems 1. Let (E, 0k), (F, 02) be two differential spaces and define the differby ential operator 8 in 8= Prove that
2. Given a differential space (E, 0), consider a differential subspace; i.e., a subspace E1 that is stable under 0. Assume that Q:E—÷E1 is a linear mapping such that i) ii)
VEE1
iii) QX — XEB(E)
XEZ(E).
Prove that the induced mapping
H(E1) is a linear isomorphism. 3. Let (F, 0) be a differential space. A homotopy operator in E is a linear transformation h:E—*E such that
hO + Oh = i.
Show that a homotopy operator exists in E if and only if H(E)==O. 4. Let (E, be and (F, OF) be two differential spaces and let =1/1* if and only if homomorphisms of differential spaces. Prove that there exists a linear mapping such that liCE + CFh =
—
is called a honiotopy operator connecting çø and //. Show that problem 3 is a special case of problem 4. 5. Let 81, 02 be differential operators in Ewhich commute, 8102=8281. a) Prove that 01 02 is a differential operator in E. b) Let B1, B2, B be the boundaries with respect to 82 and 82. Prove that 02(B1) = 81(B2) = B. h
§ 3. Differential spaces and differential algebras
c) Let Z1, Z2, Z that
Z1 + Z2
be the cycles with respect to 02 and Z. Establish natural linear isomorphisms
Z/Z1 4B1 n
Z2
and
Z/Z2
185
Show
4B2 fl Z1.
d) Establish a natural linear isomorphism (B1 n
n
Z1)/02(Z1)
and then show that each of these spaces is linearly isomorphic to Z/(Z1 + Z2). induces a differential operator in Z2. Let P1 denote e) Show that the corresponding homology space. Assume now that Z=Z1 +Z2 and prove that P1 can be identified with a subspace of the homology space H1 =Z1/B1. State and prove a similar result for 02. f) Show that the results a) to e) remain true if 0102 02 be differential operators in E such that 0102= are differential operators in E. a) Prove that 01+02 and b) With the notation of problem 5, assume that
B=B1nB2 and Z=Z1+Z2. Prove that the homology space is linearly isomorphic to
of each of the differential operators in a)
Z1 n Z2/(Z1 (This
n
B2 + Z2 n B1).
is essentially the Künneth theorem of sec. 2.10
7.
formula. Let E =
be
volume II.)
a finite dimensional
graded
differential space and assume that 0 is homogeneous of degree — be a homomorphism of differential spaces, (1=0 in E0). Let p: homogeneous of degree zero. Denote the restrictions of p (respectively by ço1, (respectively Prove the (respectively to Lefschetz formula —
—
p=O
Conclude
=
p=O
that —
(EP)
p=O
and PH(E). (Euler-Poincaré formula). Express this formula in terms of The number given in (EP). is called the Euler-Poincaré characteristic
of E.
Chapter VII
Inner product spaces In this chapter all vector spaces are assumed to be real vector spaces
§ 1. The inner product 7.1. Definition. An inner product in a real vector space E is a bilinear function (,) having the following properties: 1. Symmetry: (x, y)=(y, x).
2. Positive definiteness: (x, x)O, and (x, x)=O only for the vector x==O.
A vector space in which an inner product is defined is called an inner product space. An inner product space of finite dimension is also called a Euclidean space. The norm xl of a vector xEE is defined as the positive square-root
A unit vector is a vector with the norm 1. The set of all unit vectors is called the unit-sphere. It follows from the bilinearity and symmetry of the inner product that x + y12 = 1x12 + 2(x,y)+ JyV
whence
(x,y) = 4(lx +
y12
-
This equation shows that the inner product can be expressed in terms of the norm. The restriction of the bilinear function (,) to a subspace E1 E has again properties 1 and 2 and hence every subspace of an inner product space is itself an inner product space. The bilinear function (,)is non-degenerate. In fact, assume that (a,y)=O
for a fixed vector aeE and every vector yeE. Setting y=a we obtain (a, a) =0 whence a =0. It follows that an inner product space is dual to itself.
7.2.
§ 1. The inner product
187
Examples. 1. In the real number-space
standard inner pro-
duct is defined by
(x,y) where
x =
and
...
...
y =
2. Let E be an n-dimensional real vector space and basis of E. Then an inner product can be defined by
(v =
1.
.
.n) be a
(x,y) = where
3.
Consider the space C of all continuous functions f in the interval 1 and define the inner product by
(f,g) = Jf(t)g(t)dt. 7.3. Orthogonality. Two vectors xeE and yeE are said to be orthogonal if (x, y)=O. The definiteness implies that only the zero-vector is orthogonal to itself. A system of p vectors +0 in which any two vectors are orthogonal, is linearly independent. In fact, the rex. and lation =0 yields x,1) =
0
=1
...
p)
whence
Two subspaces E1 E and E2 c E are called orthogonal, denoted as E1 ±E2, if any two vectors X1EE1 and x2eE2 are orthogonal.
7.4. The Schwarz-inequality. Let x and y be two arbitrary vectors of the inner product space E. Then the Schwarz-ine quality asserts that (x,y)2
1x121y12
(7.1)
and that equality holds if and only if the vectors are linearly dependent. To prove this consider the function Ix
+
Chapter VII. Inner product spaces
of the real variable 2. The definiteness of the inner product implies that
Expanding the norm we obtain )2
+ 22(x, y) + x12
0.
Hence the discriminant of the above quadratic expression must be negative or zero,
(x,r)2 < Now assume that equality holds in (7.1). Then the discriminant of the quadratic equation + 22(x,y) + lxi2 = 0
221yi2
is zero.*) Hence equation (7.2) has a real solution
(7.2)
It follows that
i2oy + XV = 0, whence 2oY + X = 0.
Thus, the vectors x and y are linearly dependent. 7.5. Angles. Given two vectors x + 0 and y 0, the Schwarz-inequality implies that (x,y) —
1
- lxi -1.
Consequently, there exists exactly one real number w (0
(x y)
cosw=—
w
m) such that (7.3)
.
lxi
The number w is called the angle between the vectors x and y. The sym-
metry of the inner product implies that the angle is symmetric with respect to x and y. If the vectors x and y are orthogonal, it follows that cos co =0, whence U) =
—
2
Now assume that the vectors x and y are linearly dependent, y=2x, Then
2
1+1
121
L—1
if 2>0 if 20
(ic
if 20. The equation
Ix+yi=ixi+iyi implies that lxi
2
+ 2(x,y)+
fyi2
= lxi 2 + 2fxifyf +
whence
(x, y) = lxi fyi.
(7.5)
Thus, the vectors x and y must be linearly dependent,
x=2y. Equations (7.5) and (7.6) yield 2=121, whence 20. Conversely, assume that x = 2y, where 2 0. Then fx
+ yf = (2 + 1)yI = (2 + 1)fyf = 2fyf + iyf = fxf + fyj.
(7.6)
Chapter VII. Inner product spaces
Given three vectors x, y, z, the triangle-inequality can be written in the form
x-yl Ix-zI +lz-yi.
(7.7)
As a generalization of (7.7), we prove the Ptolemy-inequality
ix-yllzl ly-zHxl +lz-xilyi.
(7.8)
Relation (7.8) is trivial if one of the three vectors is zero. Hence we
may assume that x+O, y+O and z+O. Define the vectors x', y' and z' by x
X—2, Y lxi
y
,
z Z—2. ,
2'
Izi
Then
lv'—y'l2=
k—12 xi2 in2'
1
lxl2
lxl2 lyi2
lyl2
Applying the inequality (7.7) to the vectors x', y' and z' lxi
=
we
obtain
Izi lxi'
Izi
whence (7.8).
7.7. The Riesz theorem. Let E be an inner product space of dimension n and consider the space L(E) of linear functions. Then the spaces L(E) and E are dual with respect to the bilinear function defined by
On the other hand, E is dual to itself with respect to the inner product. Hence Corollary II to Proposition I sec. 2.33 implies that there is a linear isomorphism a—*fa of E onto L(E) such that fa(Y) = (a,y). In
other words, every linear function f in E can be written in the form
f(y) and
= (a,y)
the vector aEE is uniquely determined byf(Riesz theorem).
Problems 1. For x =
1
(x,y) satisfies
show that the bilinear function
in
and y = —
the properties listed in sec. 7.1.
—
+
§ 2. Orthonormal bases
191
2. Consider the space S of all infinite sequences that
Show that
•••) such
converges and that the bilinear function (x,
is an inner product. 3. Consider three distinct vectors x40, y+O and z40. Prove that the equation lxi + iz - xi fyi ix izi = iY
-
-
holds if and only if the four points x, y, z, 0 are contained on a circle such
that the pairs x, y and z, 0 separate each other. 4. Consider two inner product spaces E1 and E2. Prove that an inner product is defined in the direct sum E1 EEE2 by x1,y1 eE1,
((x1,x2),(y1,y2)) = (x1,y1) + (x2,y2) 5.
x2,y2eE2.
Given a subspace E1 of a finite dimensional inner product space E,
consider the factor space E/E1. Prove that every equivalence class con-
tains exactly one vector which is orthogonal to E1. § 2.
Orthonormal bases
7.8. Definition. Let E be an n-dimensional inner product space and (v = .n) be a basis of E. Then the bilinear function (,) determines a symmetric matrix (v,p = 1... n). (7.9) 1 . .
The inner product of two vectors x= can
and
be written as
(x,y) =
if1(x
y=
x) =
(7.10)
and hence it appears as a bilinear form with the coefficient-matrix The basis (v = .ii) (v = .11) is called orthonornial, if the vectors are mutually orthogonal and have the norm 1, 1
1
. .
=
. .
(7.11)
Chapter VII. Inner product spaces
192
Then formula (7.10) reduces to
(x,y) and in the case y = x ixi2
The substitution
(7.12)
=
in (7.12) yields
=
(p
= 1... n).
(7.13)
Now assume that x+0, and denote by the angle between the vectors x and Formulas (7.3) and (7.13) imply that =
(p =
1
n).
(7.14)
lxi
If x is a unit-vector (7.14) reduces to cos
(p = I ... n).
=
(7.15)
These equations show that the components of a unit-vector x relative to an orthonormal basis are equal to the cosines of the angles between x and the basisvectors 7.9. The Schmidt-orthogonalization. In this section it will be shown that an orthonormal basis can be constructed in every inner product space of finite dimension. Let (v = ii) be an arbitrary basis of E. Starting out from this basis a new basis (v = I .. n) will be constructed whose vectors are mutually orthogonal. Let b1 = a1. Then put b2 = a2 + 2b1 1 . .
.
and determine the scalar A such that (h1, h2)=0. This yields
(a2,b1) + A(b1,b1) = 0. Since b1 40, this equation can be solved with respect to A. The vector b2 thus obtained is different from zero because otherwise a1 and a2 would be linearly dependent.
To obtain b3 set
b3=a3 -i-pb1 +-vb2
and determine the scalars p and v such that (b1,b3) =
0
and
(b2,b3) = 0.
§ 2. Orthonormal bases
193
This yields
b1) = 0
(a3, b1) +
and (a3, b2) + v(b2,b2)
0.
Since b1 + 0 and b2 + 0, these equations can be solved with respect to ji and v. The linear independence of the vectors a1, a2, a3 implies that b3 +0. Continuing this way we finally obtain a system of n vectors
such that
+ 0 (v = 1.. .n)
(v+ji).
It follows from the criterion in sec. 7.3, that the vectors are linearly independent and hence they form a basis of E. Consequently the vectors (v
=
= 1 ... n)
form an orthonormal basis. 7.10. Orthogonal transformations. Consider two orthogonal bases and (v = 1.. .n) of E. Denote by the matrix of the basis-transformation (7.16)
The relations and imply that (7.17)
This equation shows that the product of the matrix and the transposed matrix is equal to the unit-matrix. In other words, the transposed matrix coincides with the inverse matrix. A matrix of this kind is called orthogonal.
Hence, two orthonormal bases are related by an orthogonal matrix. Conversely, given an orthonormal basis (v 1. . n) and an orthogonal the basis defined by (7.16) is again orthonormal. n x n-matrix .
7.11. Orthogonal complement. Let E be an inner product space (of finite or infinite dimension) and E1 be a subspace of E. Denote by set of all vectors which are orthogonal to E1. Obviously,
E only.
of the zero-vector
is called the orthogonal complement of E1. If E has finite dimen-
sion, then we have that
dimE1 +dimEiL= dimE 13
the
Greub. Linear Algebra
Chapter VII. Inner product spaces
and hence E1
n E iL
=
0
implies that (7.18)
Select an orthonormal basis and a vector y= of
E1 consider
rn) of E1. Given a vector XEE
1
the difference z
=x
—
y.
Then (z,
= (x,
—
= (x,
(y,
if and only if
This equation shows that z is contained in (R
—
= 1... rn).
We thus obtain the decomposition
x=p+h
(7.19)
where
p is called the orthogonal projection of x onto E1.
Passing over to the norm in the decomposition (7.19) we obtain the relation ix12
= pt2 + hI2.
(7.20)
Formula (7.20) yields Bessel's-inequality lxi
showing that the norm of the projection never exceeds the norm of x. The
equality holds if and only if h=0, i.e. if and only if xeE1. The number ihi is called the distance of x from the subspace E1.
Problems 1. Starting from the basis a1 = (1,0,1) a2 = (2,1,
—
3)
a3 = (— 1,1,0)
construct an orthonormal basis by the Schmidtof the number-space orthogonalization process.
§ 3. Normed determinant functions
195
2. Let E be an inner product space and consider E as dual to itself. Prove that the orthonormal bases are precisely the bases which are dual to themselves.
3. Given an inner product space E and a subspace E, of finite dimension consider a decomposition
x=x1+x2 and the projection
x,eE,
x=p+h
Prove that 1X21
Ihi
that equality is assumed only if x, =p and x2 = h. 4. Let C be the space of all continuous functions in the interval 0 t 1 with the inner product defined as in sec. 7.2. If C' denotes the subspace of all continuously differentiable functions, show that (C1)' = 0. 5. Consider a subspace E1 of E. Assume an orthogonal decomposition and
E,=F1$G, F1±G,. Establish the relations
F,' =
I G1
and
IF1.
=
6. Let F3 be the space of all polynomials of degree inner product of two polynomials as follows: (P,Q)
2. Define the
= f P(t)Q(t)dt.
The vectors 1, t, t2 form a basis in F3. Orthogonalize and orthonormalize this basis. Generalize the result for the case of the space P of polynomials of degree § 3.
Normed determinant functions
7.12. Definition. Let E be an n-dimensional inner product space and A0+O be a determinant function in E. Since E is dual to itself we have in view of (4.21) z10 (x,, ...
where
...
is a real constant. Setting
= det (x1,
e E,
where
eE
is an orthonormal
Chapter VII. Inner product spaces
basis we obtain
= z10(e1 ...
and so the constant
(7.21)
is positive. Now define a determinant function A
by A
(7.22)
=±
Then we have
A (x1 ...
=
(Yi ...
(7.23)
y1).
A determinant function in an inner product space which satisfies (7.23) is called a normed determinant function. It follows from (7.22) that there are precisely two normed determinant functions A and —A in E. Now assume that an orientation is defined in E. Then one of the functions A and —A represents the orientation. Consequently, in an oriented inner product space there exists exactly one normed determinant function representing the given orientation.
7.13. Angles in an oriented plane. With the help of a normed determinant-function it is possible to attach a sign to the angle between two vectors of a 2-dimensional oriented inner product space. Consider the normed determinant function A which represents the given orientation. Then the identity (7.23) yields 1x12
(7.24)
1y12 — (x, y)2 = A (x, y)2.
Now assume that x+O andy+O. Dividing (7.24) by 1x121y12 we obtain the relation + 1x121y12
1x121y12
Consequently, there exists exactly one real number 0 mod 2m such that
cosO =
(x,y)
.
and
sin0 =
lxi
A(x,y) .
lxi
(7.25)
This number is called the oriented angle between x and y.
if the orientation is changed, A has to be replaced by —A, and hence 0 changes into —0. Furthermore it follows from (7.25) that 0 changes sign if the vectors x and y are interchanged and that 0(x,
—
y)
= 0(x, y) +
mod 2it.
§ 3. Normed determinant functions
197
1.. .p) in an inner 7.14. The Gram determinant. Given p vectors product space E, the Gram determinant G (x1 . x,,) is defined by . .
= det
G (x1 ...
It will be shown that
\ (xi,, x1) ... (xe,
G(x1 ...
(7.26)
).
(
0
(7.27)
and that equality holds if and only if the vectors (x1 . .x,) are linearly dependent. in the case p = 2 (7.27) reduces to the Schwarz-inequality. To prove (7.27), assume first that the vectors (v 1.. .p) are linearly .
dependent. Then the rows of the matrix (7.26) are also linearly dependent whence
if the vectors (v = .. .p) are linearly independent, they generate a p-dimensional subspace E1 of E. E1 is again an inner product space. Denote by A1 a normed determinant function in E1. Then it follows from (7.23) that G(x1 ... = A1(x1 ... 1
The linear independence of the vectors A1 (x1 .. . x,,) +0,
(v = .. .p) 1
implies that
whence
7.15. The volume of a parallelepiped. Let p linearly independent vectors be given in E. The set
(v=1...p)
(7.28)
is called the p-dimensional parallelepiped spanned by the vectors (v = 1.. .p). The volume V(a1 ..
of the parallelepiped is defined by (7.29)
where A1 is a normed determinant function in the subspace generated by the vectors (v = .p). In view of the identity (7.23) formula (7.29) can be written as 1 .
.
V(a1 ...
= det
(
a1) ... (ar,
).
(7.30)
Chapter VII. Inner product spaces
198
In the case p = 2 the above formula yields V(ai,a2)2=1a1121a212—(a1,a2)2=1a1121a212sin20,
(7.31)
where 0 denotes the angle between a1 and a2. Taking the square-root on
both sides of (7.31), we obtain the well-known formula for the area of a parallelogram: V(a1,a2) = 1a11 1a21 sin
p) and de-
Going back to the general case, select an integer i (1 i compose a1 in the form
a, =
v*i
+
=
where
(v + i).
0
(7.32)
Then (7.29) can be written as V(a1 ...
=
A1 (a1
...
...
Employing the identity (7.23) and observing that (hi,
= 0 (v +1) we
obtain *) (a1, a1) ... (a1, a1) ... (a
V(a1 ...
det
(7.33)
...
The determinant in this equation represents the square of the volume of the(p — 1)-dimensional parallelepiped generated by the vectors (a1 . .
a1..
We thus obtain the formula V(a1 ... a',) = V(a1
showing that the volume V(a1 .
(1 .
is
i
p)
the product of the volume
of the "base" and the corresponding height. 7.16. The cross product. Let E be an oriented 3-dimensional Euclidean space and A be the normed determinant function which represents the orientation. Given two vectors xeE and yEE consider the linear function f defined by (7.34) f (z) = A (x, y, z). V(a1 . . .á,. .
In view of the Riesz-theorem there exists precisely one vector uEE such that (7.35) f (z) = (u, z). *) The symbol at indicates that the vector at is deleted.
§ 3. Normed determinant functions
199
The vector u is called the cross product of x and y and is denoted by x x y. Relations (7.34) and (7.35) yield
(x xy,z)=A(x,y,z).
(7.36)
It follows from the linearity of 4 in x and y that the cross product is distributive (Ax1 + px2) x y = Ax1 x y + px2 x x x (AY1 + 11Y2) = AX X Yi + PX X
y
and hence it defines an algebra in E. The reader should observe that the
cross product depends on the orientation of E. If the orientation is reversed then the cross product changes its sign. From the skew symmetry of 4 we obtain that
x xy=—y xx. Setting
z=x
in (7.36) we obtain that
(x x y,x) = 0.
Similarly it follows that
(x xy,y)=O and so the cross product is orthogonal to both factors. It will now be shown that x x y + 0 if and only if x and y are linearly independent. In fact, if y—Ax, it follows immediately from the skew symmetry that x x y =0. Conversely, assume that the vectors x and y are linearly independent. Then choose a vector zEE such that the vectors x, y, z form a basis of E. It follows from (7.36) that (x
x y,z) =
zl(x,y,z) + 0
whence x x y +0.
Formula (7.36) yields for z=xxy
A(x,y,x x y) = Ix x
(7.37)
y12.
If x
and y are linearly independent it follows from (7.37) that 4 (x, y, x x y) >0 and so the basis x, y, x x y is positive with respect to the given orientation. Finally the identity
(x1 x x2,y1 x j'2) = (x1,y1)(x2,y2)
—
(x1,y2)(x2,y1)
will be proved. We may assume that the vectors x1,
x2
(7.38)
are linearly inde-
Chapter VII. Inner product spaces
200
pendent because otherwise both sides of (7.38) are zero. Multiplying the relations A(x1,x2,x3) = (x1 x x2,x3) and x A(y1,y2,i'3) = we obtain in view of (7.23) (x1 x x2, x3)(v1
13) =
Setting Y3 = x1 x x2 and expanding the determinant on the right hand side
by the last row we obtain that
(x1 x x2,x3)(y1 x
x x2) =
(x1 x x2, x3) [(x1, v1)(x2, v2) — (x1,
y2)(x2, v1)].
(7.39)
Since x1 and x2 are linearly independent we have that x1 x x2 + 0 and hence x3 can be chosen such that (x1 x x2, x3)+0. Hence formula (7.39) implies (7.38).
Formula (7.38) yields for x1=y1=x and x2=y2=y (7.40)
x
If 0(0 0 the form
m) denotes the angle between x and y we can rewrite (7.40) in
= xl yI sin 0
Ix x
x + 0, y + 0.
Now we establish the triple identity
xx (1'x z)=(x,z)y—(x,y)z. In fact, let UEE
be
(7.41)
arbitrary. Then formulae (7.36) and (7.38) yield
(xx ft x z),u) = A(x,y x z,u)= — 4(y x z,x,u) = — ft x z, x x u) = — (,x)(z, u) + (y, u)(z, x)
Since ueE is arbitrary, (7.41) follows. From (7.41) we obtain the Jacobi identity
xx(y xz)+vx(z x x)+:x(x xy)=0. Finally note that if e1, e2, e3 is a positive orthonormal basis of E we have the relations X e2
=
C2 X Cf3 = e1,
Cf3
X
= C2.
§ 3. Normed determinant functions
201
Problems 1. Given a vector a +0 determine the locus of all vectors x such that x—a is orthogonal to x+a. 2. Prove that the cross product defines a Lie-algebra in a 3-dimensional inner product space (cf. problem 8, Chap. V, § 1). 3. Let e be a given unit vector of an n-dimensional inner product space E and E1 be the orthogonal complement of e. Show that the distance of a vector xeE from the subspace E1 is given by
d= (x,e)J. 4. Prove that the area of the parallelogram generated by the vectors x1 and x2 is given by A = — a)(s — b)(s — c), where
a=1x11,
b=1x21, c=1x2—xiI, s=4(a+b+c).
5. Let a +0 and b be two given vectors of an oriented 3-space. Prove that the equation x x a = b has a solution if and only if (a, b) = 0. If this condition is satisfied and x0 is a particular solution, show that the general solution is x0+2a. 6. Consider an oriented inner product space of dimension 2. Given two positive orthonormal bases (e1, e2) and (e1, e2), prove that = e1 cos w — e2 sin co = e1 sin co + e2 cos co. where co is the oriented angle between e1 and ë1.
7. Let a1 and a2 be two linearly independent vectors of an oriented Euclidean 3-space and F be the plane generated by a1 and a2. Introduce an orientation in F such that the basis a1, a2 is positive. Prove that the angle between two vectors and
is determined by the equations —
cosO
=
lxi lyl
and sinO =
lxi lyl
x a21(— ic O (x, y)
Q(x,y)=
if x+y (x, z) + (z, y)
(triangle inequality)
Q(y,x).
is a metric in E and so it defines a topology in E, called the norm topology. It follows from N2 and N3 that the linear operations are continuous in this topology and so E becomes a topological vector space. 7.20. Examples. 1. Every inner product space is a normed linear space with the norm defined by Hence
iixii
2.
Let C be the linear space of all continuous functions fin the interval 1. Then a norm is defined in C by
= max Ot 1
f(t)l.
Conditions N1 and N3 are obviously satisfied. To prove N2 observe
that
1(t) + g(t)l
If (t)I + ig(t)i
ill ii +
whence
ui +
ii + iigii.
,
(0
t 1)
Chapter VII. inner product spaces
3. Consider an n-dimensional (real or complex) linear space E and let (v 1 .. .n) be a basis of E. Define the norm of a vector
= by 7.21. Bounded linear transformations. A linear transformation p: of a normed space is called bounded if there exists a number M such that
xEE.
(7.51)
It is easily verified that a linear transformation is bounded if and only if it is continuous. It follows from N2 and N3 that a linear combination of bounded transformations is again bounded. Hence, the set B(E; E) of all bounded linear transformations is a subspace of L (E; E). Let be a bounded linear transformation. Then the set = 1 is bounded. Its least upper bound will be denoted by II'pII
= sup
(7.52)
lixil = 1
It follows from (7.52) that
xeE. Now it will be shown that the function (p—* thus obtained is indeed a norm-function in B(E; E). Conditions N1 and N3 are obviously satisfied. To prove N2 let ço and be two bounded linear transformations. Then
xeE and consequently
The norm-function
has the following additional property:
2. Then dim F 2. Hence there are unit vectors e1EF, C2EF such that (e1, e2) = 0.
It follows that e1 e2 + e2 e1 = — 2(e1, e2) e = 0.
Now set e3 =
Then we have
2_
,
e3 — e1 e2 c1
e1
e2.
2 2_ ,— — — e1 e2 —
whence e3EF. Moreover, e1 e3 =
— e2,
e2 e3 =
These relations imply that (e3, e3)= 1 and
(e1, e3) = (e2, e3) = 0.
Thus the vectors e1, e2, e3 are orthonormal.
e1.
e
Chapter VII. Inner product spaces
214
In particular. it follows that dim Now we show that the vectors e1. c2. e3 span F. In fact, let :eF. We may assume that :2 = e. Then —
= :e1 e7 — e1e2:
= [— 2(:, e1) e — e1 :] C2 — C1 [— 2(:. e2) — = e2] = —2(:, e1)e2 + 2(:, e2)e1. On
the other hand, _C3
+
e3
= —2(:, e3)e.
Adding these equations we obtain
= —(, e1) e2 + (-.
—
(:, e3) e
whence
=
(, e1) C1 + L e2) e2 + (,
C3.
This shows that: is a linear combination of e1, e2 and e3. Hence dim F= 3
and so dim A = 4. Finally, we show that A is the quaternion algebra. Let A be the trilinear function in F defined by A(x, y, ) = (V'(x,
p),)
x. y,
:EF.
(7.68)
We show that A is skew symmetric. Clearly, A(x, x, =)=O. On the other hand, formula (7.67) yields A(x,
y)
= (W(x. v), i') =
—
Thus A is a determinant in F. Since A(e1, e2, e3) =
— C2
e3) = (e3, e3) =
1,
A is a norined determinant function in the Euclidean space F. In particular, A specifies an orientation and so the cross product is defined in F. Now relation (7.68) implies that v) = x x v
x, VEF.
Combining (7.66) and (7.69) yields
xi'=—(x,v)e+xxy
X,I'EF.
Since on the other hand,
xe=ex=x and A =
XEA
it follows that A is the algebra of quaternions.
(7.69)
§ 6. The algebra of quaternions
215
Remark: It is a deep theorem of Adams that if one does not require associativity there is only one additional finite dimensional division
algebra over
It is the algebra of Cavley numbers defined in Chap. XI, § 2,
problem 6.
Problems 1. Let a 0 be a quaternion which is not a negative multiple of e. Show that the equation x2 = a has exactly two solutions. (1= 1, 2, 3) are three vectors in E1 prove the identity 2. If Y2
Y3 = — 4(Yl' Y2' y3) e — (y1, y2) y3 + (y1, y3) y2 — (y2,
y1.
3. Let p be a fixed quaternion and consider the linear transformation p given by p x = px. Show that the characteristic polynomial of 'p reads
f(t) = (t2 - 2(p, e) t + p12)2. Conclude that p has no real eigenvalues unless p is a multiple of e.
Chapter VIII
Linear mappings of inner product spaces In this chapter all linear spaces are assumed to be real and to have finite dimension
§ 1. The adjoint mapping 8.1. Definition. Consider two inner product spaces Eand Fand assume that a linear mapping p: E—*Fis given. If E* and F* are two linear spaces
dual to E and F respectively, the mapping ço induces a dual mapping The mappings p and are related by =
xe E, y* e F*.
(8.1)
Since inner products are defined in E and in F, these linear spaces can be considered as dual to themselves. Then the dual mapping is a linear mapping of F into E. This mapping is called the adjoint mapping of p and will be denoted by Replacing the scalar product by the inner product in (8.1) we obtain the relation
(px,y)=(x,çzy) xEE,yeF.
(8.2)
In this way every linear mapping p of an inner product space E into an inner product space F determines a linear mapping of F into E. The adjoint mapping of is again p. In fact, the mappings and are related by (8.3)
(y,
Equations (8.2) and (8.3) yield
xeE,yeF = ço. Hence, the relation between a linear mapping and the adjoint mapping is symmetric. As it has been shown in sec. 2.35 the subspaces Im çø and ker are whence
§ 1. The adjoint mapping
217
orthogonal complements. We thus obtain the orthogonal decomposition (8.4)
8.2. The relation between the matrices. Employing two bases (v = 1.. .n) and = 1.. .m) of E and of F, we obtain from the mappings and and two matrices defined by the equations
= and
Substituting
and
in (8.2) we obtain the relation (8.5)
= Introducing the components = (xv, xA)
and
=
YK)
of the metric tensors we can write the relation (8.5) as
Multiplication by the inverse matrix gVQ yields the formula (8.6)
Now assume that the bases normal, =
(v =
1. . .n)
and YL (ji = 1. . . m) are ortho-
=
Then formula (8.6) reduces to
= This relation shows that with respect to orthonormal bases, the matrices of adjoint mappings are transposed to each other. 8.3. The adjoint linear transformation. Let us now consider the case that F= E. Then to every linear transformation 'p of E corresponds an adjoint transformation Since is dual to 'p relative to the inner pro*) The
subscript indicates the row.
218
Chapter VIII. Linear mappings of inner product spaces
duct, it follows that det
= det
and
tr
The adjoint mapping of the product i/is l/ioço
The matrices of
tr ço. is
given by
=
and çø relative to an orthonormal basis are transposes
of each other.
Suppose now that e and ë are eigenvectors of p and
respectively.
Then we have that
çoe=2e and whence in view of (8.2) —
2) (e, e)
= 0.
It follows that (e, e) = 0 whenever 2; that is, any two eigenvectors of p and whose eigenvalues are different are orthogonal. 8.4. The relation between linear transformations and bilinear functions. Given a linear transformation ço:E—*E consider the bilinear function
1i(x,y) = (px,y).
(8.7)
The correspondence p —* c/i defines a linear mapping
Q:L(E; E) denotes the space of bilinear functions in Ex E. It will be shown that this linear mapping is a linear isomorphism of L(E; E) onto B(E, E). To prove that is regular, assume that a certain ço determines the zero-function. Then ('p x, y) =0 for every x e E and every ye E, whence 'p = 0.
It remains to be shown that is a mapping onto B(E, E). Given a bilinear function c/i, choose a fixed vector xeE and consider the linear funcdefined by = cIl(x,y). By the Riesz-theorem (cf. sec. 7.7) this function can be written in the form
L(y) = (x',y) where
the vector x'eE is uniquely determined by x.
§ 1. The adjoint mapping
219
Define a linear transformation ço:E—÷E by
X = X'. Then
xeE,yeE. Thus, there is a one-to-one correspondence between the linear transformations of E and the bilinear functions in E. In particular, the identitymap corresponds to the bilinear function defined by the inner product. Let çji be the bilinear function which corresponds to the adjoint transformation. Then
=(x,py) =
=
((py,x)
This equation shows that the bilinear functions cb and çji from each other by interchanging the arguments. 8.5. Normal transformations. A linear transformation p : normal, if
are
obtained is called (8.9)
The above condition is equivalent to
x,yeE.
(8.10)
In fact, assume that çø is normal. Then
(px,py) =
=(x,p
(x,
=
Conversely, condition (8.10) implies that (y,
=(py,px) =
= (y,p
whence (8.9).
Formula (8.10) is equivalent to xeE. This relation implies that the kernels of p and
coincide,
kerq = Hence, the orthogonal decomposition (8.4) can be written in the form (8.11)
Relation (8.11) implies that the restriction of 'p to Im 'p is regular. Hence, has the same rank as 'p. The same argument shows that all the trans-
formations p"(k=2, 3...) have the same rank as 'p.
Chapter VIII. Linear mappings of inner product spaces
220
It is easy to verify that if ço is a normal transformation then so is p —Ai, Hence it follows that
Ae 11.
ker(q — 2z)=
—
2i).
In other words, çø and have the same eigenvectors. Now the result at the end of sec. 8.3. implies that every two eigenvectors of a normal transformation whose eigenvalues are different must be orthogonal. be a linear transformation and assume that an orthogonal Let ço decomposition the restriction Then p is normal if and only if the subspaces are stable under and the transformations are normal. In fact, assume that p is normal and let be arbitrary. Then we
is given such that the subspaces E1 are stable. Denote by
of p to
have for every
= 0.
=
This implies that The normality of
Thus E, is stable under j+ whence follows immediately from the relation
Conversely, assume that
is stable under
and
that
is normal. Then
we have for every vector that 1px12 =
and
=
=
Vpjxjl2 =
=
so p is normal.
Problems 1. Consider two inner product spaces E and F. Prove that an inner product is defined in the space L(E; F) by
Derive the inequality
and show that equality holds only if
§ 2. Selfadjoint mappings
221
be the adjoint transbe a linear transformation and 2. Let (p: formation. Prove that ifFc:Eis stable under (p, then F' is stable under 3. Prove that the matrix of a normal transformation of a 2-dimensional space with respect to an orthonormal basis has the form fl\
/
)
§ 2.
11
or
Selfadjoint mappings
8.6. Eigenvalue problem. A linear transformation p: E—+E is called p or equivalently
selfadjoint if =
x,yeE.
(cox,y)=(x,coy)
The above equation implies that the matrix of a selfadjoint transformation relative to an orthonormal basis is symmetric. If p: E is a stable subspace
then the orthogonal complement F' is stable as well. In fact, let ZEF' be any vector. Then we have for every yeF
(coz,y) =(z,çoy)= 0 whence pzEF'. It is the aim of this paragraph to show that a selfadjoint transformation of an n-dimensional inner product space E has n eigenvectors which are mutually orthogonal. Define the function F by
F(x)=
(x,px)
x+O.
(x,x)
(8.12)
This function is defined for all vectors x +0. As a quotient of continuous functions, F is also continuous. Moreover, F is homogeneous of degree zero, i.e.
F(2x)=F(x)
(2+0).
(8.13)
Consider the function F on the unit sphere lxi = 1. Since the unit sphere is a bounded and closed subset of E, F assumes a minimum on the sphere lxi = 1. Let e1 be a unit vector such that
F(e1) F(x)
(8.14)
222
Chapter VIII. Linear mappings of inner product spaces
for all vectors
=
1.
Relations (8.13) and (8.14) imply that
F(e1) F(x)
(8.15)
for all vectors x+O. in fact, if x*O is an arbitrary vector, consider the corresponding unit-vector e. Then x=JxIe, whence in view of (8.13)
F(x) = F(e) F(e1). Now it will be shown that e1 is an eigenvector of çø. Let y be an arbitrary vector and define the functionf by
f(t) =
F(e1
+ tj).
(8.16)
Then it follows from (8.15) thatf assumes a minimum at t=O, whence f' (O)=O. inserting the expression (8.12) into (8.16) we can write (e1 +
ty,e1 + ty) Ditlèrentiating this function at t=O we obtain
f'(O) =
+
—
(8.17)
Since ço is selfadjoint,
= and hence equation (8.17) can be written as
f'(O) = 2((pe1,y) —
2(e1,çoe1)(e1,y).
(8.18)
We thus obtain (8.19)
for every vector yEE. This implies that pe1
i.e. e1 is an eigenvector of 'p and the corresponding eigenvalue is = (e1,'pe1).
8.7. Representation in diagonal form. Once an eigenvector of 'p has been constructed it is easy to find a system of ii orthogonal eigenvectors. In fact, consider the 1-dimensional subspace (e1) generated by e1. Then (e1) is stable under 'p and hence so is the orthogonal complement E1 of
§ 2. Selfadjoint mappings
223
Clearly the induced linear transformation is again selfadjoint and hence the above construction can be applied to E1. Hence, there exists an eigenvector e2 such that (e1, e2)=0. Continuing this way we finally obtain a system of n eigenvectors (v = 1.. .n) such that (e1).
e,4) =
form an orthonormal basis of E. In this basis the mapping p has the form The eigenvectors
(8.20)
denotes the eigenvalue of These equations show that the matrix of a selfadjoint mapping has diagonal form if the eigenvectors are used as a basis. 8.8. The eigenvector-spaces. If 2 is an eigenvalue of 'p, the corresponding eigen-space E(2) is the set of all vectors x satisfying the equation px=2x. Two eigen-spaces E(2) and E(2') corresponding to different eigenvalues are orthogonal. In fact, assume that where
çø e = 2 e
p e' = 2' e'.
and
Then (e', 'p e) = 2 (e, e') and
(e, 'p e')
=
2' (e,
e').
Subtracting these equations we obtain
(2'— 2)(e,e') = 0, whence (e, e') = 0 if 1' + 2. Denote by (v = . .r) the different eigenvalues of 'p. Then every two eigenspaces and are orthogonal. Since every vector xeE can be written as a linear combination of eigenvectors it follows that the direct sum of the spaces is E. We thus obtain the orthogonal decomposition 1
.
(8.21)
Let
be the transformation induced by 'p in ço,x
Then
xeE(23.
=
This implies that the characteristic polynomial of
det
—
2i) = (2
—
2)ki
(1 =
is given by 1
... r)
(8.22)
where k. is the dimension of E(21). It follows from (8.21), and (8.22)
224
Chapter VIII. Linear mappings of inner product spaces
that the characteristic polynomial of det((p —)
1)
is equal to the product
— 2)k1
=
(2r
—
(8.23)
The representation 8.23 shows that the characteristic polynomial of a selfadjoint transformation has n real zeros, if every zero is counted with its multiplicity. As another consequence of (8.23) we note that the dimenin sion of the eigen-space is equal to the multiplicity of the zero the characteristic polynomial. 8.9. The characteristic polynomial of a symmetric matrix. The above has n real eigenresult implies that a symmetric n x n-matrix A = values. In fact, consider the transformation =
(v =
1
... n)
(v = 1.. .n) is an orthonormal basis of E. Then çø is selfadjoint and hence the characteristic polynomial of p has the form (8.23). At the same time we know that
where
det ('p — 2 i)
= det (A — 2 J).
(8.24)
Equations (8.23) and (8.24) yield det(A — 2J) =
— 2)k1 ...
—
8.10. Eigenvectors of bilinear functions. in sec. 8.4 a one-to-one correspondence between all the bilinear functions qi in E and all the linear transformations 'p E-+E has been established. A bilinear function and the corresponding transformation 'p are related by the equation :
cli(x,y)=(çox,y)
x,yeE.
Using this relation, we define eigenvectors and eigenvalues of a bilinear function to be the eigenvectors and eigenvalues of the corresponding transformation. Let e be an eigenvector of and 2 be the corresponding eigenvalue. Then çJi
(e,
y) =
('p e,
y) =
2 (e,
y)
(8.25)
for every vector yeE. Now assume that the bilinear function k is symmetric,
cli(x,y)= cli(y,x). Then the corresponding transformation 'p
is
selfadjoint. Consequently,
§ 2. Selfadjoint mappings
225
there exists an orthonormal system of ii eigenvectors
(v = 1... n).
=
(8.26)
This implies that cP
2v
Hence, to every symmetric bilinear function in E there exists an orthonormal basis of E in which the matrix of P has diagonal-form.
Problems 1. Prove by direct computation that a symmetric 2 x 2-matrix has only real eigenvalues. 2. Compute the eigenvalues of the matrix
-1 —1 —2 2
L 3.
Find the eigenvalues of the bilinear function v*1L
4. Prove that the product of two selfadjoint transformations p and i,1i is selfadjoint if and only if cli p = 'p 0k/i. 5. A selfadjoint transformation 'p is called positive, if
(x,çox) 0 for every xEE. Given a positive selfadjoint transformation ço, prove that there exists exactly one positive selfadjoint transformation i/i such that
6. Given a selfadjoint mapping çø, consider a vector be(ker p)'. Prove that there exists exactly one vector aEker p' such that çoa=b. 7. Let 'p be a selfadjoint mapping and let 1 ...n) be a system of n orthonormal eigenvectors. Define the mapping 'pA by 'pA = 'p — 2t
where 2
is
a real parameter. Prove that PA
provided that 2 15
Greub. Linear Algebra
is
not an eigenvalue of 'p.
xeE.
Chapter VIII. Linear mappings of inner product spaces
226
8. Let p be a linear transformation of a real n-dimensional linear space E. Show that an inner product can be introduced in E such that becomes a selfadjoint mapping if and only if has n linearly independent eigenvectors. 9. Let be a linear transformation of E and the adjoint map. Denote the norm of which is induced by the Euclidean norm of E (cf. by sec. (7.19)). Prove that = where 2 is the largest eigenvalue of the selfadjoint mapping
çø.
10. Let ço be any linear transformation of an inner product space E. Prove that ço is a positive self-adjoint mapping. Prove that
if and only if ço is regular.
11. Prove that a regular linear transformation 'p of a Euclidean space can be uniquely written in the form (p = 0o1
is a positive selfadjoint transformation and z is a rotation. Hint: Use problems 5 and 10. (This is essentially the unitary trick of
where
Weyl).
§ 3.
Orthogonal projections
8.11. Definition. A linear transformation rc:E—÷E of an inner product space is called an orthogonal projection if it is selfadjoint and satisfies the condition it2 = it. For every orthogonal projection we have the orthogonal decomposition
E = ker it
Im it
and the restriction of it to Im it is the identity. Clearly every orthogonal
projection is normal. Conversely, a normal transformation 'p which satis= = 'p is an orthogonal projection. In fact, since fies the relation we can write
x='px+x1
x1eker'p.
Since 'p is normal we have that ker 'p = ker
and so it follows that
x1eker Hence
we obtain for an arbitrary vector yEE
(x,'py)=('px,'py) +(x1,q,y) =(px,'py)
= (px,'py)
§ 3. Orthogonal projections
227
whence
(x,(py) =
(y,çox).
It follows that p is selfadjoint. To every subspace E1 E there exists precisely one orthogonal projection it such that Im it=E1. It is clear that iris uniquely determined byE1. and define it by To obtain it consider the orthogonal complement iry = y,yeE1;
irz =
Then it is easy to verify that it2 = and = Consider two subspaces E1 and E2 of E and the corresponding orthogonal projections it 1: E—+ E1 and it2 E—3 E2. It will be shown that it2 it1 =0 if and only if E1 and E2 are orthogonal to each other. Assume first that =0. ConE1 ±E2. Then it1xEE± for every vector xeE, whence it2 versely, the equation =0 implies that it1 xeE± for every vector xeE, whence E1EE±. 8.12. Sum of two projections. The sum of two projections it1 and it2:E-+E2 is again a projection if and only if the subspaces E1 and E2 are orthogonal. Assume first that E1 ±E2 and consider the transformation Then it.
it
:
it(x1+x2)=itx1+irx2=x1+x2 Hence, it reduces
x1eE1,x2eE2.
to the identity-map in the sum E1$E2. On the other
hand,
itx=0
if
is the orthogonal complement of the sum n it is the projection of E onto Conversely, assume that it1 + it2 is a projection. Then But
(it1 + it2)2 =
it1
and hence
+ it2,
whence it1 This
it2 + it2 it1 =
0.
(8.27)
equation implies that
it1oit2oit1+it2oit1=0
(8.28)
=0.
(8.29)
and
it1oit2 + Adding
(8.28) and (8.29) and using (8.27) we obtain
=0.
(8.30)
Chapter VIII. Linear mappings of inner product spaces
The equations (8.30) and (8.28) yield
0.
7C2oit1
This implies that E1 I E2, as has been shown at the end of sec. 8.11. 8.13. Difference of two projections. The difference it1 — it2 of two proand it2 : E-+E2 is a projection if and only if E2 is a subjections it1 : space of E1. To prove this, consider the mapping
= t —(itt —it2)=(i —it1)+ it2. it follows that p is a projection if Since i —itt is the projection If this condition is fuland only if i.e., if and only if E1 This implies that filled, p is the projection onto the subspace —it2 = i —p is the projection onto the subspace n
This subspace is the orthogonal complement of E2 relative to E1.
8.14. Product of two projections. The product of two projections it1 : E—+E1 and it2 E—*E2 is an orthogonal projection if and only if the projections commute. Assume first that it2 it1 = it1 it2. Then it2 it1 x = it2 x = x for every vector
XE E1 fl E2.
(8.31)
On the other hand, it2 it1 reduces to the zero-map in the subspace (E1 n E2)' =
+ E±. In fact, consider a vector X=
+ X±
Then it2it1X =
=
+
+ it1
= 0.
Equations (8.31) and show that it2 0it1 is the projection Conversely, if it2 cit1 is a projection, it follows that it2
=
7t1 OTt2
(8.32) n E2.
= it1
Problems 1. Prove that a subspace JcE if and only if
J=Jn
is
stable under the projection it:E—+E1
§ 4. Skew mappings
229
2. Prove that two projections it1 : E—+E1 and it2 : E—+E2 commute if and only if
E1 +E2 =E1 n E2 +E1 fl
E at a subspace E1 is defined by
ox = where
x=p+h(peE1,
tion it:
p
—
h
Show that the reflection
and the projec-
are related by
= 2ir — 4. Consider linear transformation of a real linear space E. Prove that an inner product can be introduced in E such that p becomes an orthogonal projection if and only if p2 = 5. Given a selfadjoint mapping p of E, consider the distinct eigenvalues denotes and the corresponding eigenspaces 1 ...r). If the orthogonal projection prove the relations:
a)
(i+j).
b) c)
§ 4. Skew mappings
8.15. Definition. A linear transformation in E is called skew if —!i. The above condition is equivalent to the relation
x,yeE.
(8.33)
It follows from (8.33) that the matrix of a skew mapping relative to an orthonormal basis is skew symmetric. Substitution of y=x in (8.33) yields the equation
(x,t/ix)=O
xeE
(8.34)
showing that every vector is orthogonal to its image-vector. Conversely, a transformation having this property is skew. In fact, replacing x by x+y in (8.34) we obtain = 0, (x + + whence (y,
x) + (x, t/i v) = 0.
Chapter VIII. Linear mappings of inner product spaces
23()
It follows from (8.34) that a skew mapping can only have the eigenvalue I;. =0. The relation /i= —p/i implies that
tn/i =
0
and
den/i = The last equation shows that
(—
detk/J = 0
if the dimension of E is odd. More general, it will now be shown that the rank of a skew transformation is always even. Since every skew mapping is normal (see sec. 8.5) the image space is the orthogonal complement of the kernel. Consequently, the induced transformation : Im k/i—*Im is regular. Since is again skew, it follows that the dimension of Im must be even. It follows from this result that the rank of a skew-symmetric matrix is always even. 8.16. The normal-form of a skew symmetric matrix. In this section it will be shown that to every skew mapping cli an orthonormal basis (v = 1 n) can be constructed in which the matrix of i/i has the form .
. .
0
K1
— K1
0
(8.35) —
0
=
Consider the mapping Then 8.7, there exists an orthonormal basis 'p
=
=
According to the result of sec. (v = 1 ii) in which 'p has the form . . .
(v = I ... ii)
All the eigenvalues L, are negative or zero. In fact, the equation
pe=i.e
eJ=l
§ 4. Skew mappings
231
implies that 2
= (e,(pe) =
=
—
0.
(t/ie,çl'e)
has the same rank as i/i, the rank of Since the rank of is even and must be even. Consequently, the number of negative eigenvalues is even and we can enumerate the vectors (v = 1 . n) such that (p
. .
(v=2p+ in).
(v= l...2p) and Define the orthonormal basis
(v = 1. . ii) by .
(v=1...p) and
(v=2p+1...n).
In this basis the matrix of
has the form (8.35).
Problems 1. Show that every skew mapping (p of a 2-dimensional inner product space satisfies the relation (p
(p
(x, y).
2. Skew transfbrmations of 3-space. Let E be an oriented Euclidean 3-space.
(i) Show that every vector aEE determines a skew transformation of E given by (ii) Show that
x x.
a, bEE. = — (iii) Show that every skew map p: E—÷E determines a unique vector aEE such that given by Hint: Consider the skew 3-linear map P: E x E x
P(x, v, :)=(px, v) :+(pv. :) .v+(p:, x) y and choose aEE to be the vector determined by v. :) = zl(x, v, :)a
sec. 4.3, Proposition II). (iv) Let e1. e2, e3 be a positive orthonorma! basis of E. Show that the vector a of part (iii) is given by (cf.
a= where
C1 +
C2 +
C3
is the matrix of (p with respect to this basis.
Chapter VIII. Linear mappings of inner product spaces
232
3. Assume that p4O and are two skew mappings of the 3-space having the same kernel. Prove that i/i = ). çø where is a scalar. 4. Applying the result of problem 3 to the mappings çox = (a1 x a2) x x and
— a1(a2,x)
x= prove
the formula (a1 x a2) x a3 = a2(a1,a3) — a1(a2,a3).
E—*E satisfies the relation that a linear transformation if and only if p is selfadjoint or skew. 6. Show that every skew symmetric bilinear function P in an oriented 5. Prove
3-space E can be represented in the form
cP(x,y)=(x xy,a) and that the vector a is uniquely determined by 'P. 7. Prove that the product of a selfadjoint mapping and a skew mapping has trace zero. 8. Prove that the characteristic polynomial of a skew mapping satisfies the equation
x(- = (-
From this relation derive that the coefficient 0f is zero for every odd v. 9. Let çø be a linear transformation of a real linear space E. Prove that an inner product can be defined in E such that ço becomes a skew mapping if and only if the following conditions are satisfied: 1. The space E can be decomposed into ker 'p and stable planes. 2. The mappings which are induced in these planes have positive determinant and trace zero. 10. Given a skew symmetric 4 x 4-matrix A = verify the identity
detA = §
5.
+
+
Isometric mappings
8.17. Definition. Consider two inner product spaces E and F. A linear mapping 'p:E—*F is called isometric if the inner product is preserved under
('px1,'px2)=(x1,x2) x1,x2eE. Setting x1 =
= x we find
kpxl=IxI xeE.
§ 5. Isometric mappings
Conversely, the above relation implies that
=
+
x2)12
is
233
isometric. In fact,
—
—
= x1 + x2f2 — 1x112 — 1x212 = 2(x1,x2).
an isometric mapping preserves the norm it is always injective. We assume in the following that the spaces E and F have the same dimension. Then every isometric mapping ço:E—÷Fis a linear isomorphism of E onto F and hence there exists an inverse isomorphism p 1: F—FE. The isometry of implies that Since
((px,y)=(x,p'y)
xeE,yeF,
whence (8.36)
Conversely every linear isomorphism çü satisfying the equation (8.36) is isometric. In fact,
= (x1,x2).
=
(x1,
The image of an orthonormal basis
(v =
1. . . n)
of E under an isometric
mapping is an orthonormal basis of F. Conversely, a linear mapping which sends an orthonormal basis of E into an orthonormal basis (v = 1.. .n) of F is isometric. To prove this, consider two vectors
and then
and whence
(pxl,px2) =
=
=
=
It follows from this remark that an isometric mapping can be defined between any two inner product spaces E and F of the same dimension: Select orthonormal bases and (v = 1. n) in E and in F respectively . .
and define ço by
8.18. The condition for the matrix. Assume that an isometric mapping lii) we obtain from p:E—*Fis given. Employing two bases and ço an n x n-matrix by the equations =
234
Chapter VIII. Linear mappings of inner product spaces
Then the equations ((p
(p aM) =
aM)
can be written as (b), bK) = (as,, aM).
Introducing the matrices
=
and
(au,
h)K
(b1, bK)
we obtain the relation
=
(8.37)
Conversely, (8.37) implies that the inner products of the basis vectors are preserved under (p and hence that (p is an isometric mapping. If the bases and are orthonormal,
=
=
,
relation (8.37) reduces to
= showing that the matrix of an isometric mapping relative to orthonormal
bases is orthogonal. 8.19. Rotations. A rotation of an inner product space E is an isometric mapping of E into itself. Formula (8.36) implies that
(detp)2 =
1
showing that the determinant of a rotation is ± 1. A rotation is called proper if det (p = + 1 and improper if det (p = — 1. imEvery eigenvalue of a rotation is ± 1. In fact, the equation plies that let = let, whence )= ± 1. A rotation need not have eigenvecvectors as can already be seen in the plane (cf. sec. 4, 17). Suppose now that the dimension of E is odd and let be a proper rotation. Then it follows from sec. 4.20 that has at least one positive eigenvalue ).. On the other hand we have that ). = ± 1 whence 2= 1. Hence, every proper rotation of an odd-dimensional space has the eigenvalue 1. The corresponding eigenvector e satisfies the equation (pe=e; that is, e A similar argument shows that to every imremains invariant under proper rotation of an odd-dimensional space there exists a vector e such that pe= —e. If the dimension of E is even, nothing can be said about
§ 5. Isometric mappings
235
the existence of eigenvalues for a proper rotation. However, to every im-
proper rotation, there is at least one invariant vector and at least one vector e such that çoe= —e (cf. sec. 4.20). Let ço:E-+E be a rotation and assume that FcE is a stable subspace. Then the orthogonal complement F' is stable as well. In fact, if z is arbitrary we have for every yeF (coz,y) =
=
0
whence çozeF'.
The product of two rotations is obviously again a rotation and the inverse of a rotation is also a rotation. In other words, the set of all rotations of an n-dimensional inner product space forms a group, called the general orthogonal group. The relation det (p2 ° pi) = det c°2 det q1
implies that the set of all proper rotations forms a subgroup, the special orthogonal group.
A linear transformation of the form 2 ço where 2 0 and p is a proper rotation is called homothetic. 8.20. Decomposition into stable planes and straight lines. With the aid of the results of § 2 it will now be shown that for every rotation 'p there exists an orthogonal decomposition of E into stable subspaces of dimension 1 and 2. Denote by E1 and E2 the eigenspaces which correspond to the eigenvalues 2 + 1 and 2 = — 1 respectively. Then E1 is orthogonal to E2. In fact, let x1 eE1 and x2eE2 be two arbitrary vectors. Then
'px1=x1 and 'px2=—x2. These
equations yield
(x1,x2) = — (x1,x2), whence (x1, x2)=0. It follows from sec. 8.19 that the subspace F= (E1 is again stable under 'p. Moreover, F does not contain an eigenvector of 'p and hence F has even dimension. Now consider the selfadjoint mapping
of F. The result of sec. 8.6 assures that there exists an eigenvector e of i/i. If 2 denotes the corresponding eigenvalue we have the relation çoe
+
'p' e = 2e.
Chapter VIII. Linear mappings of inner product spaces
236
Applying p we obtain ço2e=)Lçce—e.
(8.38)
Since there are no eigenvectors of çø in F the vectors e and e are linearly independent and hence they generate a plane F1. Equation (8.38) shows
that this plane is stable under 'p. The induced mapping is a proper rotation (otherwise there would be eigenvectors in F1). F is again stable The orthogonal complement under and hence the same construction can be applied to Fj'-. Continuing in this way we finally obtain an orthogonal decomposition ofF into mutually orthogonal stable planes. Now select orthonormal bases in E1, 122 and in every stable plane. These bases determine together an orthonormal basis of E. In this basis the matrix of 'p has the form
cosO1
sin 01
—sin01
cos01
±1
—
where
(i =
1.
. .
k) denotes
(v =
1
...p)
2k=n—p
COSOk
sinOk
sin °k
C05 °k
the corresponding rotation angle (cf. sec. 8.21).
Problems 1. Given a skew transformation k/I of E, prove that
is a rotation and that — is not an eigenvalue of 'p. Conversely, if 'p is a rotation, not having the eigenvalue — I prove that =('p — i)o('p + 1)-i is a skew mapping. 2. Let 'p be a regular linear transformation of a Euclidean space E such that ('p x, coy) = 0 whenever (x, y) = 0. Prove that 'p is of the form 'p = )L t, ).+O where z is a rotation. 1
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
237
3. Assume two inner products qi and 'P in E such that all oriented angles with respect to and 'I' coincide. Prove that 'P(x, y) where 2>0 is a constant. 4. Prove that every normal non-selfadjoint transformation of a plane is homothetic. 5. Let be a mapping of the inner product space E into itself such that (pO=0 and çox — (P11 = lx — x. VEE.
Prove that
is then linear.
there exists a continuous 6. Prove that to every proper rotation family (p(0t I) of rotations such that and (P1 = 7. Let be a linear automorphism of an n-dimensional real linear space E. Show that an inner product can be defined in E such that p becomes a rotation if and only if the following conditions are fulfilled:
(i) The space E
can
be decomposed into stable planes and stable
straight lines. (ii) Every stable straight line remains pointwise fixed or is reflected at the point 0.
(iii) In every irreducible invariant plane a linear automorphism induced such that and tn/il 2. Consider a proper rotation t which commutes with all proper rotations. Prove that = i where c = if ii is odd and = ± if n is even. 1
§ 6.
1
Rotations of Euclidean spaces of dimension 2,3 and 4
8.21. Proper rotations of the plane. Let E be an oriented Euclidean plane and let zl denote the normed positive determinant function in E. is determined by the equation Then a linear transformation j: zi (x, i') = (j
i)
x, v E.
Chapter VIII. Linear mappings of inner product spaces
238
The transformation / has the following properties:
I) (jx,v)+(x,jv)=O (jx,jv)=(x,v)
2)
3)
4)
detj=1.
In fact, 1) follows directly from the definition. To verify 2) observe that the identity (7.24), sec. 7.13, implies that (x, y)2 + (/x,v)2 y
= IxV
x we obtain, in view of 1),
(jx, jx)2 = j is injective (as follows from the definition) we obtain
= imply that
Now the relations .1= —.1 and
=—
Finally, we
obtain from 1), 2), 3) and from the definition of /
4(jx,iv) = (/2 x, iv) = — (x,jy) = (jx, v) = zl(x,
y)
whence
det j =
1.
The transformation / is called the canonical complex structure of the oriented Euclidean plane E. Next, let p be any proper rotation of E. Fix a non-zero vector x and denote by 0 the oriented angle between x and p x (cf. sec. 7.1 3). It is determined mod 2 it by the equation
px = x cos 0 +j(x) sin 0.
(8.39)
We shall show that cos 0 = tr 'p
and
sin 0 = 4 tr(jo p).
(8.40)
In fact, by definition of the trace we have
zl(p x, y) + 4(x, py) = 4(x, y). tr 'p. Inserting (8.39) into (8.41) and using properties 2) and 3) of j
(8.41)
we
obtain
2cos0 .A(x, v)=zl(x,y) . trqi and so the first relation (8.40) follows. The second relation is proved in a similar way.
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
239
Relations (8.40) show that the angle (9 is independent of x. It is called the rotation angle of and is denoted by e((p). Now (8.39) reads
cosO((p)+j(x) sinê((p)
XEE.
In particular, we have
O(i)=0,
/ix=x
&(—i)=m, cos
sin
a second proper rotation with rotation angle shows that is
a
simple calculation
= x cos(&(p)+ (9(h)) +j(x) sin((9(p)+ Thus any two proper rotations commute and the rotation angle for their
product is given by (p) = (9((p) +
(mod 2it).
(8.42)
Finally observe that if e1, e2 is a positive orthonormal basis of E then we have the relations (p e1 e1 cos O(p) + e2 sin O((p) (p e2 = — e1 sin O(p) + e2 cos O(p).
Remark: If E is a non-oriented plane we can still assign a rotation angle to a proper rotation (p. It is defined by the equation
cos 9(p) = tr p
(0
it).
(8.43)
Observe that in this case (9(p) is always between 0 and it whereas in the oriented case 9(p) can be normalized between —it and it. 8.22. Proper rotations of 3-space. Consider a proper rotation p of a 3-dimensional inner product space E. As has been shown in sec. 8.19,
there exists a 1-dimensional subspace E1 of E whose vectors remain fixed. If p is different from the identity-map there are no other invariant vectors
(an invariant vector is a vector xeE such that (px=x). In fact, assume that a and b are two linearly independent invariant vectors. Let c(c+0) be a vector which is orthogonal to a and to b. Then (pc—2c where 2= ± 1. Now the equation det (p= implies that 2= + 1 showing that (p is the identity. In the following discussion it is assumed that +1. Then the invariant vectors generate a i-dimensional subspace E1 called the axis of (p. 1
240
Chapter VIII. Linear mappings of inner product spaces
To determine the axis of a given rotation (p consider the skew mapping (8.44)
and introduce an orientation in E. Then
can
t/ix=u xx
be written in the form
UEE
(8.45)
(cf. problem 2, § 4). The vector u which is uniquely determined by (p iS
called the rotation-vector. The rotation vector is contained in the axis of (p. In fact, let a+O be a vector on the axis. Then equations (8.45) and (8.44) yield (8.46)
showing that u is a multiple of a. Hence (8.45) can be used to find the rotation axis provided that the rotation vector is different from zero. This exceptional case occurs if and only if = i.e. if and only if Then (p has the eigenvalues 1, —1 and — 1. In other words, is (p = (p a reflection at the rotation axis. 8.23. The rotation angle. Consider the plane F which is orthogonal to E1. Then transforms F into itself and the induced rotation is again proper. Denote by 0 the rotation angle of (p1 (cf. remark at the end of see 8.21). Then, in view of (8.43),
cosO =
Observing that trço =
+1
trço1
we obtain the formula cosO = 3-(trp
1).
To find a formula for sin 0 consider the orientation of Fwhicl' is induced by E and by the vector u (cf. sec. 4.29)*). This orientation is represented by the normed determinant function
A1(y,z)= lul
where A is the normed determinant function representing the orientation
of E. Then formula (7.25) yields
sinO=A1(y,(py)='A(u,y,(py) lul
*) It is assumed that u
0.
(8.47)
§ 6. Rotations of Euclidean spaces of dimension 2. 3 and 4
241
where y is an arbitrary unit vector of F. Now
A(u,y,çoy)= = 4(u,co' y,y)= — A(u,y,ço1y) and hence equation (8.47) can be written as (8.48) lul
By adding formulae (8.47) and (8.48) we obtain (8.49)
lxi
21u1
Inserting the expression (8.45) in (8.49), we thus obtain
x y)=
sin0 = lul Since
x lul
(8.50)
y is a unit-vector orcnogonal to u, it follows that iu
xyl=iuilyl=lui
and hence (8.50) yields the formula
sin0=iui. This equation shows that sin 0 is positive and hence that 00 be an arbitrary number. It follows from the convergence of the series
MV —i-- (t1
_t0)V that there exists an integer N such that
n+p MV v!
(t1 — to)V
N and p 1.
The inequalities (8.60) and (8.61) yield —
p
1.
(8.61)
Chapter VIII. Linear mappings of inner product spaces
These relations show that the sequence the interval
is uniformly convergent in (p(t).
In view of the uniform convergence, equation (8.56) implies that
t t1).
= i+
(8.62)
As a uniform limit of continuous curves the curve p (t) is itself continuous. Hence, the right hand-side of (8.62) is differentiable and so p (t) must be differentiable. Differentiating (8.62) we obtain the relation
= //(t)op(t) showing that 'p (t) satisfies the differential equation (8.54). The equation
i is an immediate consequence of (8.62). 8.28. The determinant of It remains to be shown that the mappings 'p (t) are linear automorphisms. This will be done by proving the formula
'p (ta) =
f tri/i(t)dt
det'p(t) = eto Let A
(8.63)
0 be a determinant function in E. Then A ('p (t) x1 ... 'p (t)
= det 'p (t) A
(x1 ...
E E.
Differentiating this equation and using the differential equation (8.54) we obtain ... (8.64)
= Observing that (ço (t)x1 ...
=
...
(co(t)x1 ...
= tr /i (t) det 'p (t) A
(x1 ...
we obtain from (8.64) the differential equation det 'p (t) = tr
(t) det 'p (t)
(8.65)
families of linear automorphisms
§ 7. Differentiable
253
for the function det p (t). Integrating this differential equation and observing the initial condition det (t0) = det i = 1 we find (8.63). 8.29. Uniqueness of the solution. Assume that (t) and p2(t) are two solutions of the differential equation (8.54) with the initial condition = z. Consider the difference p
p(t) = p2(t) — The curve 'p (t) is again a solution of the differential equation (8.54) and it satisfies the initial condition 'p (t0) = 0. This implies the inequality
=
(8.66) to
to
Now define the function F by
F(t)=
(8.67)
Then (8.66) implies the relation
E(t) MF(t). Multiplying by e_tM we obtain E(t)e_tM — Me_tMF(t)
0,
whence d
dt
(F(t)e_tM)
0. —
Integrating this inequality and observing that F(t0) = 0, we obtain
F(t)e_tM 0
for all vectors x +0. As has been shown in sec. 7.4, a positive definite bilinear function satisfies the Schwarz-inequality
cP(x,y)2 cIl(x)cP(y)
X,yEE.
Equality holds if and only if the vectors x and y are linearly dependent. A positive definite function çJl is non-degenerate. If cll(x)O for all vectors xeE, but cli(x)=O for some vectors x+0, the function cli is called positive semidefinite. The Schwarz inequality is still valid for a semidefinite function. But now equality may hold without the vectors x and y being linearly dependent. A semidefinite function is al-
Chapter 1X. Symmetric bilinear functions
ways degenerate. In fact, consider a vector x0 0 such that di (x0) =0. Then the Schwarz inequality implies that cP (x0, y)2 cP (x0) cP (y) = 0
whence dl (x0, y) =0 for all vectors y. In the same way negative definite and negative semidefinite bilinear functions are defined. The bilinear function cP is called indefinite if the function (x) assumes positive and negative values. An indefinite function may be degenerate or non-degenerate. 9.6. The decomposition of E. Let a non-degenerate indefinite bilinear function dl be given in the n-dimensional space E. It will be shown that E such that the space Ecan be decomposed into two subspaces cli is positive definite in and is negative definite in E. Since cli is indefinite, there is a non-trivial subspace of E in which cli is positive definite. For instance, every vector a for which cIl(a)>0 generates such a subspace. Let be a subspace of maximal dimension such that dl is positive with redefinite in Et Consider the orthogonal complement E of spect to the scalar product defined by c/I. Since cli is positive definite in E the intersection E consists only of the zero-vector. At the same time we have the relation (cf. Proposition II, sec. 233)
dimE = dimE. This yields the direct decomposition E =
Now it will be shown that cli is negative definite in E. Given a vector z +0 of E -, consider the subspace E1 generated by E + and z. Every vector of this subspace can be written as
This implies that cli (x) =
cli
(y) +
dl (z).
(9.19)
Now assume that li(z)>0. Then equation (9.19) shows that cli is positive definite in the subspace E1 which is a contradiction to the maximumproperty of E Consequently,
dl(z) 0
for all vectors
zeE
§ 2. The decomposition of E i.e.,
is
267
negative semidefInite in E. Using the Schwarz inequality
z1eE,zeE
(9.20)
we can prove that cP is even negative definite in E. Assume that cb(z1)=0 for a vector z1eE. Then the inequality (9.20) yields
1(z1,z) = 0
for all vectors zeE. At the same time we know that =0
for all vectors yeEt These two equations imply that 'k(z1,x)—_ 0
for all vectors xeE, whence z1 =0. 9.7. The decomposition in the degenerate case. If the bilinear function 'P is degenerate, select a subspace E1 complementary to the nulispace E0, E = E0 EBE1. Then 'P is non-degenerate in E1. In fact, assume that
'P(x1,y1)= 0 for a fixed vector x1 e E1 and all vectors Yi e E1. Consider an arbitrary vector yeE. This vector can be written as
YYo+Yi
y0eE0,y1eE1
whence
'P(x1,y) = 'P(x1,y0) + 'P(x1,y1) = 0.
(9.21)
This equation shows that x1 is contained in E1 and hence it is contained in the intersection E0 n E1. This implies that x1 =0. Now the construction of sec. 9.6 can be applied to the subspace E1. We thus obtain altogether a direct decomposition (9.22)
of E such that 'P is positive definite in and negative definite in E. 9.8. Diagonalization of the matrix. Let (x1 .x5) be a basis of E which is orthonormal with respect to 'P, (Xs+i...Xr) be a basis of E which is . .
Chapter IX. Symmetric bilinear functions
orthonormal with respect to —P, and (Xr+i...Xn) be an arbitrary basis of Then + l(v= is) ç — I (v = s + I ... r) where = (xv, = (
O(v=r+1...n)
then form a basis of E in which the matrix of Ji has the following diagonal-form: The vectors (x1 .
. .
0
9.9. The index. It is clear from the above construction that there are infinitely many different decompositions of the form (9.22). However, the dimensions of E + and E - are uniquely determined by the bilinear function b. To prove this, consider two decompositions (9.23)
and (9.24)
such that P is positive definite in
and
and negative definite in
Ej and E. This implies that n
whence dim
n.
(9.25)
+dimEj +dimE0=n.
(9.26)
+ dim Ej + dim E0
Comparing the dimensions in (9.23) we find
Equations (9.25) and (9.26) yield
§ 2. The decomposition of E
Interchanging
and
269
we obtain
whence
=
Consequently, the dimension of is uniquely determined by cu. This number is called the index of the bilinear function cui and the number —dim E =2s—r is called the signature of cP. dim Now suppose that (v = 1.. .n) is a basis of E in which the quadratic function cli has diagonal form 11(x) = and assume that
and Then p is the index of cli. In fact, the vectors (v = 1. . .p) generate a subspace of maximal dimension in which cli is positive definite. From the above result we obtain Sylvester's law of inertia which asserts
that the number of positive coefficients is the same for every diagonal form. 9.10. The rank and the index of a symmetric bilinear function can be determined explicitly from the corresponding quadratic form cil(x) =
We can exclude the case cu =0. Then at least one coefficient from zero. If apply the substitution
= Then
where say
=
+
is different
—
=
+0 and
0. Thus, we may assume that at least one coefficient is different from zero. Then çJl (x) can be written as
2"
X11
)
The substitution ill =
+
in (v=2...n)
Chapter IX. Symmetric bilinear functions
yields
(x) =
(p1)2 1
+
(9.27)
The sum in (9.27) is a symmetric bilinear form in (n — 1) variables and hence the same reduction can be applied to this sum. Continuing this way we finally obtain an expression of the form = Rearranging the variables we can achieve that
)V>O (v=l...s) (v=s+1...r) (v=r+l...n). Then r is the rank and s is the index of P.
Problems 1. Let cP + 0 be a given quadratic function. Prove that cu can be written in the form
±1 wheref is a linear function, if and only if the corresponding bilinear function has rank 1. 2. Given a non-degenerate symmetric bilinear form cu in E, let J be a subspace of maximal dimension such that cui(x, x)=0 for every xeJ. Prove that dim J = mm (s, n — s).
Hint: Introduce two dual spaces E* and F* and linear mappings and defined by
cli(x,y)= . in the space L(E; E) by
3. Define the bilinear function
= Let S(E; E) be the space of all selfadjoint mappings and A (E; E) be the space of all skew mappings with respect to a positive definite inner product. Prove:
§ 2. The decomposition of E
271
a) cb(p, (p)>O for every p+O in S(E; E), b) 'P(p, (p)O for all vectors x #0. Construct a basis of E in which the matrix of çji has the form 1
K1
—K1 1
1
1
1, Hint: Decompose CP in the form
=
+
where
cP1(x,y)=
+ CP(y,x))
and P2 (x, y) =
(cP (x, y)
— 1i
(y, x)).
Chapter IX. Symmetric bilinear functions
272
8. Let E be a 2-dimensional vector space, and consider the 4-dimensional space L(E; E). Prove that there exists a 3-dimensional subspace Fc:L(E; E) and a symmetric bilinear function in F such that the nilpotent transformations (cf. problem 7, Chap. IV, § 6) are precisely the transformations t satisfying (In other words, the nilpotent transformations form a cone in F). § 3.
Pairs of symmetric bilinear functions
9.11. In this paragraph we shall investigate the question under which
conditions two symmetric bilinear functions cP and W can be simultaneously reduced to diagonal form. To obtain a first criterion we consider the case that one of the bilinear functions, say W, is non-degenerate. Then the vector space E is self-dual with respect to W and hence there exists a linear transformation satisfying
x,yeE (cf. Prop. Ill, sec. 2.33). Suppose now that x1 and x2 are eigenvectors of p such that the corresponding cigenvalues and 22 are different. Then we have that = ¶t'(x1,x2) and
cP(x2,x1) = 22 ¶P(x2,x1)
whence in view of the symmetry of cP and W —
Since
22) W(x1,x2)
0.
it follows that ¶I'(x1,x2)=O and hence k(x1, x2)=O.
Proposition: Assume that Y is non-degenerate. Then P and
are
simultaneously diagonalizable if and only if the linear transformation 'p has n linearly independent eigenvectors. If 'p has /1 linearly independent eigenvectors consider the distinct eigenvalues of 'p. Then it follows that . .
E=E1EE...EEE. where E, is the eigenspace of V'
(x,
=
Then we have for 0
and
CP (xe,
and
1jj
= 0.
Now choose a basis in each space E1 such that ¶1' has diagonal form (cf.
§ 3. Pairs of symmetric bilinear functions sec.
273
9.8). Since
= it follows that has also diagonal form in this basis. Combining all these bases of the E such that and W have diagonal form. Conversely, let e (i = 1.. .n) be a basis of E such that (ei, = 0 and W (ei, = 0 if i +j. Then we have that
i+j. This equation shows that the vector (p e1 is contained in the orthogonal complement (with respect to the scalar product defined by of the subspace v + i. But Fj is the 1-dimensional generated by the vectors subspace generated by e, =2 e1. In other words, the are eigenvectors of (p. As an example let E be a plane with basis a, b and consider the bilinear functions W given by
P(a,b)=O,
P(b,b)=—1
and
W(a,a)= 0,
0.
1,
It is easy to verify that then the linear transformation p is given by
(pa=b, pb=—a. Since the characteristic polynomial of p is 22+ 1 it follows that (p has no eigenvectors. Hence, the bilinear functions and 'P are not simultaneously diagonalizable. Theorem: Let E be a vector space of dimension n 3 and let and 'I' be two symmetric bilinear functions such that
D(x)2-E-W(x)2+O
if x+O.
Then çji and 'P are simultaneously diagonalizable. Before giving the proof we comment that the theorem is not correct for dimension 2 as the example above shows. 9.12. To prove the above theorem we employ a similar method as in sec. 8.6. If one of the functions rIi and 'P,say 'P, is positive definite the desired basis-vectors are those for which the function
F (x) = 18
Greub. Linear Algebra
P (x)
x
+ 0.
(9.28)
274
Chapter IX. Symmetric bilinear functions
assumes a relative minimum. However, if is indefinite, the denominator and hence the in (9.28) assumes the value zero for certain vectors
function F is no longer defined in the entire space x+0. The method of sec. 8.6 can still be carried over to the present case if the function F is replaced by the function
arc tanF(x).
(9.29)
To avoid difficulties arising from the fact that the function arc tan is not single-valued, we shall write the function as a line-integral. At this point the hypothesis n 3 will be essential *). Let E be the deleted space x 0 and x = x (t)(0 t 1) be a differentiable curve in E. Consider the line-integral
=
(9.30)
+
J
0
taken along the curve x (t). First of all it will be shown that the integral J
depends only on the initial point x0=x(O) and the endpoint x=x(l) of the curve x(t). For this purpose define the following mapping of E into the complex w-plane:
w(x)= cP(x)+ I W(x). The image of the curve xQ) under this mapping is the curve w (t) =
cP
(x (t)) + I 'I' (x (t))
in the w-plane. The hypothesis 1i
(0 t 1)
(9.31)
+ 'P (x)2 + 0 implies that the curve
w(t)(0t 1) does not go through the point w=0. The integral (9.30) can now be written as 1
2J u +v dt 2
2
0
where the integration is taken along the curve (9.31). Now let 0(t) be an angle-function for the curve w(t) i.e. a continuous function of t such that v(t) u(t) (9.32) and sin 0(t) = cos0(t) = .
w(t)I
*) The proof given is due to JOHN MILNOR.
Jw(t)I
§ 3. Pairs of symmetric bilinear functions
275
(cf. fig. 2) *). It follows from the differentiability of the curve w (t) that
the angle-function 0 is differentiable and we thus obtain from (9.32) —uiwi +
d wi (9.33)
iwi
and
d wi Iwi
Fig. 2
(9.34) by
0
0 and adding these equa-
tions we find that
ui'—üv
0=-
-
+
Integration from t = 0 to t = gives 1
liv 2dt=O(l)—0(0) +V
ut) — U
2
showing that the integral J is equal to the change of the angle-function 0 along the curve w (t), (9.35) 2J = 0(1) — 0(0).
Now consider another differentiable curve x = (t)(0 t 1) in E with the initial point x0 and the endpoint x and denote by Jthe integral (9.30) Then formula (9.35) yields taken along the curve
2J= 0(1) — 0(0)
(9.36)
where U is an angle-function for the curve (t) =
cP
(t)) + i
(t))
Since the curves w (t) and (t)(0 t the same endpoint it follows that O(O)—O(O)=2k0m
and
(0 t 1).
1) have the same initial point and
O(l)—0(l)=2k1it
(9.37)
*) For more details cf. P.S. A1ExANDRoV. Combinatorial Topology, Vol. I. chapter II. § 2.
Chapter IX. Symmetric bilinear functions
where k0 and k1 are integers. Equations (9.35), (9.36) and (9.37) show that the difference J—i is a multiple of 7t,
J—J = km.
it remains to be shown that k=0. The hypothesis n3 implies that the space E is simply connected. in other words, there exists a continuous 1 into E such that mapping x = x (t, t) of the square 0 t 1, 0 t
x(t,O)=x(t), and
x(O,t)=x0,
x(1,'r)=x
The mapping x(t, t) can even be assumed to be differentiable. Then, for every fixed 'r, we can form the integral (9.30) along the curve
x(t,z) This integral is a continuous function J(t) of i. At the same time we know that the difference —J is a multiple of m, J(t) — J = mk('r).
Hence, k (z)
(9.38)
a continuous integer-valued function in the interval Ot I and thus k(t) must be a constant. Since k(0)=0 it follows that is
1). Now equation (9.38) yields
J(x)=J Inserting t = 1 we obtain the relation
i=J showing that the integral (9.30) is indeed independent of the curve x (t). 9.13. The function F. We now can define a single-valued function F in the space E by
F(x) =
r (x) I
J
-
W (x
cP(x)2
-
(x
+ W(x)2
W
(x)
dt
(9.39)
where the integration is taken along an arbitrary differentiable curve x(t) in E leading from x0 to x. The function F is homogeneous of degree zero,
F(2 x) = F(x),
> 0.
(9.40)
§ 3. Pairs of symmetric bilinear functions
277
To prove this, observe that F(itx)—F(x)=
— + W(x)2
I
j
—-dt.
Choosing the straight segment
as path of integration we find that
=
(1
—
whence
= 0.
—
This implies the equation (9.40). 9.14. The construction of eigenvectors. From now on our proof will follow the same lines as in sec. 8.6. We consider first the case that is non-
degenerate. Introduce a positive definite inner product in E. Then the continuous function F assumes a minimum on the sphere lxi = 1. Let e1 be a vector on lxi = I such that
F(e1) F(x) for all vectors lxi = 1. Then the homogeneity of F implies that
F(e1) F(x) for all vectors x +0. Consequently, the function
f(t) =
F(e1
+ ty),
where y is an arbitrary vector, assumes a minimum at t==0, whence
f'(O)=O.
(9.41)
Carrying out the differentiation we find that (0) — —
Equations
'P(e',y)
y)
P(e1)2 + ¶P(e1)2
(9 42)
(9.41) and (9.42) imply that
cl(e1,y)
k11(e1)
— P(e1) ¶I'(e1,y) = 0
(9.43)
for all vectors yeE. In this equation W(e1)+0. In fact, assume that
_____
278
Chapter IX. Symmetric bilinear functions
W(e1)=0. Then and hence equation (9.43) yields ¶I'(e1, y)=0 for all vectors yEE. This is a contradiction to our assumption that W is non-degenerate. Define the number by = then equation (9.43) can be written as VEE.
9.15. Now consider the subspace E1 defined by the equation
V'(e1,z) = 0.
Since W is non-degenerate, E1 has the dimension n — I. Moreover, the restriction of 'P to E1 is again non-degenerate: Assume that z1 is a vector of E1 such that
'P(z1,z)=O
(9.44)
for all vectors zeE1. Equation (9.44) implies that (9.45)
for every vector xEE because x can be decomposed in the form zEE1.
=0, and so 'P is non-degenerate in E1. Therefore, the construction of sec. 9.14 can be applied to E1. We thus obtain a vector e2eE1 with the property that Now it follows from (9.45) that
b (e2, z) =
22
'P (e2, z)
for every vector z
E E1
(9.46)
where — 2
1(e2) W(e2)
Equation (9.46) implies that
=
22
¶I'(e2,y)
for every vector yeE; in fact, y can be decomposed in the form ZEE1
(9.47)
§ 3. Pairs of symmetric bilinear functions
and
279
we thus obtain
+ P(e2,z) =
= =
+
(9.48)
=
W(e1,e2) +
and =
W(e2,z).
+
(9.49)
Equations (9.46), (9.48) and (9.49) yield (9.47).
Continuing this construction we obtain after n steps a system of n vectors
subject to the following conditions:
;') =
P
(9.50)
E
(er, v)
W
4 0
= Rearranging the vectors
+ ii).
0
and multiplying them with appropriate scalars
we can achieve that = where
that
s
ç+i(v=i...s) — 1(v
(9.51)
= s + 1 ...n)
denotes the signature of W. It follows from the above relations
the vectors
Inserting
form a basis of E. in the first equation
(9.50)
we
find
(9.52)
= Equations (9.51) and (9.52) show that the bilinear functions diagonal form in the basis 1.. .n).
and 'P have
9.16. The degenerate case. The degenerate case now remains to be con-
We may assume that W+O. Then it will be shown that there exists a scalar is non-desuch that the bilinear function sidered.
generate Let E* be a dual space of E and consider the linear mappings
and defined
by the equations
cP(x,y)=
and
Then Im
To prove this relation, let
fl
p (ker fl
= 0. be any vector. Then
(9.53)
=px
Chapter IX. Symmetric bilinear functions
for some xe ker i/i. Hence (9.54)
=0
¶P(x) =
and
=
x,
=
x> =0
(9.55)
and
because
and hence that = ço x = 0. x Now let I .n) be a basis of E such that the vectors (Xr+ i. form a basis of ker i/i. Employing a determinant-function zl +0 in E we obtain zl(gox1 + + = + ... (PXr + . .
.
The expansion of this expression yields a polynomial f(2) starting with
the term is not identically zero. This follows from the relation (9.53) and the fact that the r vectors = ...r) and the (n —r) in) are linearly independent. (a=r+ vectors (pxae(p(ker i/i) implies Hence, f is a polynomial of degree r. Our assumption that r 1. Consequently, a number can be chosen such is non-degenerate (cf. sec. 9.4). Then By the previous theorem there exists a basis (v = .. .n) of E in which the bilinear functions P and P+)LW both have diagonal form. Then the The coefficient of
1
1
functions
and ¶1' have also diagonal form in this basis.
Problems 1. Let 'k and 'I' be two symmetric bilinear functions in E. Prove that the condition P(x)2 + W(x)2 > 0, x+0 is equivalent to the following one: There exist real numbers ). and ji such that (x) + /2 W (x) > 0 for every x +0. and B = 2. Let A = be two symmetric n x n-matrices and assume that the equations and
§4. Pseudo-Euclidean spaces
together imply that
(v=
1
281
n). Prove that the polynomial
+ AB)
is of degree r and has r real roots where r is the rank of B. § 4.
Pseudo-Euclidean spaces
9.17. Definition. A pseudo-Euclidean space is a real linear space in which a non-degenerate indefinite bilinear function is given. As in the positive definite case, this bilinear function is called the inner product and
is denoted by (,).The index of the inner product is called the index of the pseudo-Euclidean space. Since the inner product is indefinite, the number (x, x) can be positive, negative or zero, depending on the vector x. A vector x+0 is called space-like, if (x, x) >0 time-like, if (x, x) O.
(9.66)
Relations (9.65) and (9.66) imply that there exists exactly one real number 0 ( — oo (yr)
whence
Inner multiplication of (9.69) by z yields
(z,z1) = A(z,z) + 0. Next, consider a light-vector 1. Then
l=Az+y
(z,y)=0
and
A2 =(y,y) >0. These two relations imply that
(!,z) = A(z,z) + 0 which proves (2). Now let and '2 decompositions
be
two orthogonal light-vectors. Then we have the
=21z+y1 and
12
22z+y2,
whence —
A2 + (Y1'Y2) = 0.
(9.70)
Observing that
= (Yi'Yi) and
= (Y2'Y2)
we obtain from (9.71) the equation (Yl,Yl)(Y2'Y2) = (Y1'Y2)2.
(9.71)
Chapter IX. Symmetric bilinear functions
286
The vectors Yi and Y2 are contained in the orthogonal complement of z. In this space the inner product is positive definite and hence equation. Inserting (9.71) implies that Yt and Y2 are linearly dependent, Y2 whence 12=2'l. this into (9.70) we find Finally, let / be a light-vector and E1 be the orthogonal complement of 1. It follows from property (2) that E1 does not contain time-like vectors. In other words, the inner product is positive semidefinite in E1. To find the null-space of the inner product, assume that Yi is a vector of E1 such that (Yi'Y) = 0 for all vectors yeE1.
This implies that (Yi' Yi)=O showing that Yi is a light-vector. Now it follows from property (3) that Yi is a multiple of 1. Consequently, the null-space of the inner product in E1 is generated by 1. 9.22. Fore-cone and past-cone. As another consequence of the properties established in the last section it will now be shown that the set of all T (cf. fig. 4.) To time-like vectors consists of two disjoint sectors this purpose we define an equivalence relation in the set T of all time-like
vectors in the following way:
if (:1,z2)
and equation (9.82) shows that (x, q,x) >0 for every space-like vector x. Using formulae (9.67) we obtain the equations 0
cosh ü =
(2 +
and sinh
=1
(9.83) —
where 0 denotes the pseudo-Euclidean angle between the vectors x and Now consider the orthonormal basis of E which is determined by the vectors
=
— 12)
and
e2
+ 12).
Then equations (9.80) yield
—
+ 2(A
thus obtain the following representation of p, which corresponds to the representation of a Euclidean rotation given in sec. 8.21: We
0 + e2 sinh 0 = e1 sinh 0 + e2 cosh 0.
p e1 = e1 cosh
q, e2 19*
292
Chapter IX. Symmetric bilinear functions
9.27. Lorentz-transformations. A 4-dimensional pseudo-Euclidean space with index 3 is called Minkowski-space. A Lorentz-transformation is
a rotation of the Minkowski-space. The purpose of this section is to show that a proper Lorentz-transforrnation ço possesses always at least one eigen vector on the We may restrict ourselves to Lorentz-
transformations which do not interchange fore-cone and past-cone because this can be achieved by multiplication with — I. These transformations are called orthochroneous. First of all we observe that a lightvector 1 is an eigenvector of ço if and only if (1, col) = 0. In fact, the equation çol=2l yields (l,(p/) = 2(1,1) = 0.
Conversely, assume that 1 is a light-vector with the property that (1, pl) = 0. Then it follows from sec. (9.21) property (3) that the vectors 1 and çol are linearly dependent. In other words, 1 is an eigenvector of ço. Now consider the selfadjoint mapping (9.84) Then
=(x,gox) xeE.
= j-(x,(px)+
(9.85)
It follows from the above remark and from (9.85) that a light-vector 1 is an eigenvector of p if and only if (I, = 0. We now preceed indirectly and assume that ço does not have an eigenvector on the lightcone. Then (x, = (x, cox) 0 for all light-vectors and hence we can apply the result of sec. 9.24 to the mapping /i: There exist four eigenvectors (v = 1.. .4) such that
(+
=
=
1(v = 1,2,3) 1
(v
= 4).
Let us denote the time-like eigenvector e4 by e and the corresponding eigenvalue by 2. Then and hence it follows from (9.84) that ço2e = 22çIie
—
e.
Next, we wish to construct a time-like eigenvector of the mapping 'p. If çoe is a multiple of e, e is such a vector. We thus may assume that the vectors e and çoe are linearly independent. Then these two vectors generate *) Observe that a proper Euclidean rotation of a 4-dimensional space need not have
eig envectors.
§ 5. Linear mappings of pseudo-Euclidean spaces
a
293
plane F which is invariant under ço. This plane intersects the light-cone
in two straight lines. Since the plane F and the light-cone are both invariant under these two lines are either interchanged or transformed into themselves. In the second case we have two eigenvectors of p on the light-cone, in contradiction to our assumption. In the first case select two generating vectors and '2 on these lines such that (/i, /2)= 1. Then and
The condition (911,(p12) = (11,12)
implies that cxf3= 1. Now consider the vector z Z
= 11 + CL!2.
Then
=
+ cx/311
cL!2
+
=
z
To show that z is timelike observe that
(z,z) = and that the vector t=11
is
2ci(11,12)
time-like. Moreover,
(t, q,t) .
Inserting this into (10.7) we obtain
+ = 0.
—
Hence, the equation of Q assumes the form 'P(x —
x0)= 0.
A quadric of this kind is called a cone li'ith the vertex x0. For the sake of
§ 2. Quadrics in the affine space
303
simplicity, cones will be excluded in the following discussion. In other
words, it will be assumed that
p x + a* + 0 for all points x e Q.
(10.8)
10.8. Tangent-space. Consider a fixed point x0eQ. It follows from condition (10.8) that the orthogonal complement of the vector (pxo+a* is of E. This subspace is called the an (n — 1)-dimensional subspace if and tangent-space of Q at the point x0. A vectoryEEis contained in only if (10.9) 0.
andf equation (10.9) can be written as
In terms of the functions
(xe, y) + f (y)
0.
(10.10)
affine subspace which is determined by the point x0 and the tangent-space is called the tangent-hyperplane of Q at x0. It consists of all points The (n — 1)-dimensional
x=x0+y Inserting y=x—x0 into equation (10.10) we obtain
cli(x0,x—x0)+f(x—x0)=0.
(10.11)
Observing that cP(x0)+
we can write equation (10.11) of the tangent-hyperplane in the form
+ x)=
(10.12)
To obtain a geometric picture of the tangent-space, consider a 2dimensional plane (10.13)
through x0 where a and b are two linearly independent vectors. Inserting (10.13) into equation (10.5) we obtain the relation + +
+
a) + f (a)) +
+
b) + 1(b)) =
0
(10.14)
showing that the plane F intersects Q in a conic Define a linear function g by setting (10.15) g(x) = x) + .1(x)).
Chapter X. Quadrics
304
Now the equation of the conic y can be
then. in view of (10.8), written in the form
+ ijg(b) = 0.
b) + ij2'P(b) +
+
(10.16)
Now assume that the vectors a and h are chosen such that g(a) and g(h) are not both equal to zero. Then the conic has a unique tangent at the and this tangent is generated by the vector point
t=—g(b)a +g(a)b.
(10.17)
this follows from the The vector t is contained in the tangent-space equation g(t) = — g(b)g(a) + g(a)g(b) = 0.
can be obtained in this way. Every vector y+O of the tangent-space in fact, let a be a vector such that g (a) = and consider the plane through x0 spanned by a andy. Then equation (10.17) yields 1
t=—g(y)a +g(a)y=y
(10.18)
showing that y is the tangent-vector of the intersection Q fl F at the point
Note: If g(a)=0 and g(b)=0 equation (10.16) reduces to + ?J2cIi(b) = 0.
+
Then the intersection of Q and F consists of a) two straight lines intersecting at x0, if cli(a,b)2 — b)
the point x0 only, if 1(a, b)2 — cP(a)cP(b)
we see that contains the null-space of cli. every tangent-space The equation of the tangent-hyperplane of Q at x0 is given by cli (x0,y) = f3.
(10.34)
It follows from (10.34) that a center of Q is never contained in a tangenthyperplane. 1 Dividing (10.33) by ,13 and replacing the quadratic function cli by cli we can write the equation of Q in the normal-form cll(x) = 1.
Now select a basis
(10.35)
(v = 1.. .n) of E such that
+ 1(v = =
is)
= — I (v = s + 1... r)
0(v=r+1...n)
(10.36)
Chapter X. Quadrics
308
where r denotes the rank and s denotes the index of form (10.36) can be written as
Then the normal-
1.
(10.37)
10.12. Normal-form of a quadric without center. Now consider a quadnc Q without a center. If a point of Q is chosen as origin the constant in (10.5) becomes zero and the equation of Q reads
ck(x) + 2 = 0.
(10.38)
By multiplying equation (10.38) with — if necessary we can achieve that 2s r. In other words, we can assume that the signature of çJi is not 1
negative
To reduce equation (10.38) to a normal form consider the tangentspace T0 at the origin. Equation (10.9) shows that T0 is the orthogonal complement of a*. Hence, a* is contained in the orthogonal complement On the other hand, a* is not contained in the orthogonal complement K' (K=ker p) because otherwise Q would have a center (cf. sec. 10.10). show that T01cf K'. Taking the orthoThe relations a*ET0' and gonal complement we obtain the relation T0 K showing that there exists a vector aeK which is not contained in T0 (cf. fig. 5). Then +0 and hence we may assume that = 1. Now T0 has dimension n— 1 and hence every vector
xEE can be written in the form
yeT0.
(10.39)
Inserting (10.39) into equation (10.38) we obtain
2
—
= 0.
Now the uniqueness theorem yields
= 2b2('rx) where 2+0 is a constant. This relation implies that the bilinear functions have the same rank r and that cP1 and or =r depending
on whether 2>0 or 2.
Now set a=K1(h). 11.6. INormed determinant functions. A nouined (leteulni000t Junction in a unitary space is a determinant function, zl, which satisfies x1eE.
4k1
(11.13)
We shall show that normed determinant functions always exist and and 47 are connected by that two normed determinant functions. the relation 47 where r is a complex number of magnitude I. In fact, let be any determinant function in E. Then a determinant in E is given by function.
\.*fl)_A(k._I\.*l
Since E and E
are
dual with respect to the scalar product for-
mula (4.21) yields
4o(xi where
is a complex constant. Setting =
we
obtain x,)
Now set It follows that
where
(v= 1 ... n) 40(e1
is an orthonormal basis of E. =
y
§2. Unitary spaces and so
331
is real and positive. Let ). be any complex number satisfying and define a new determinant function, 4, by setting 4 I,.
Then (11.14) yields the relation
4
,...,x,,)
v,,) = det ((xi, vi))
4(v1
showing that zi is a normed determinant function. Finally, assume that z11 and are normed determinant functions in E. 1. Then formula (11.13) implies that 42=r41,
11.7. Real forms. Let E be an n-dimensional complex vector space and consider the underlying 2 n-dimensional real vector space E is a subspace, Fc: such that F is a real form of E, then every vector zeE can be uniquely written in the form z=x+iy. x. yeF.
Every complex space has real forms. In fact, let z1 ... z,7 be a basis of E and define F to be the subspace of consisting of the vectors X
E
=
Then clearly, F is a real form of E. A conjugation in a complex vector space is an satisfying of
iz =
involution,
—
Every real form F of E determines a conjugation given by x + ivF—*x —
iy
x.
yeF.
is a conjugation in E,then the vectors z which satisfy = z determine a real form F of E as is easily checked. The vectors of F are called real (with respect to the conjugation). Conversely, if
Problems 1. Prove that the Gram determinant
x1) ... (xv,
Chapter XI. Unitary spaces
332
of p vectors of a unitary space is real and non negative. Show that if and only if the vectors
are linearly dependent.
2. Let E be a unitary space. (i) Show that there are conjugations in E satisfying (ii) Given such a conjugation show that the Hermitian inner product of E defines a Euclidean inner product in the corresponding real subspace.
3. Let F be a real form of a complex vector space E and assume that a positive definite inner product is defined in F. Show that a Hermitian inner product is given in E by + i((x1.
:2) = ((x1. x2) + 2)
a unit quaternion / (cf. sec. 7.23) such that (I. e)=O.
Use / to make the space of quaternions into a 2-dimensional complex vector space. Show that this is not an algebra over C. 5. The complex cross product. Let E be a 3-dimensional unitary space and choose a normed determinant function /1. (i) Show that a skew symmetric bilinear map E x E E (over the reals) is determined by the equation
x, v. :eE. (ii) Prove the relations
(ix)xv= xx(iv)= —i(xxv)
(xxi',x)=(xxv,v)=O (x1 x x,.
X 12) = (yl. x1)(v2. x7) x
=
—
x1)(11. x2)
(x1. x2H2
6. Cavlev iiumhers. Let E be a 4-dimensional Euclidean space and E (cf. Example II, sec 11.4). Choose a and let E1 denote the orthogonal complement of e. unit vector Choose a normed determinant function 4 in E1 and let x be the corresponding cross product (cf. problem 5). Consider as a real 8-dimenlet
§2. Unitary spaces
333
sional vector space F and define a bilinear map x, y x y = —(x, y)e +
and
xx v
x i by setting
x. tEE1 tEC. VEE1 xeE1
x(ie)=/x
Show that this bilinear map makes F into a (non-associative) division algebra over the reals. Verify the formulae x =(xv)y and x2 y=x(xy). 7. Symp/ectic spaces. Let E be an rn-dimensional vector space over a field F. A symplectic inner product in E is a non-degenerate skew symmetric bilinear function in E. (i) Show that a symplectic inner product exists in E if and only if in is even, rn=2n. (ii) A basis a1 a,,. b1 h, of a symplectic vector space is called syinpiectic. if K
a1>
=0
=
Kb,.
0
=
Show that every symplectic space has a symplectic basis. (iii) A linear transformation p: E—÷E is called symplectic. if it satisfies = <x, v>
x. yeE.
If (P and W are symplectic inner products in E, show that there is a linear automorphism of E such that x. yeE. 8. Symplectic ad joint transJorrnations. i) If p: E —* E is a linear transformation of a symplectic space the syrnp/ectic adjoint transformation. is defined by the equation = <x. 10Y>
—(p) then (respectively (respectively skew syniplectic).
X.
VEE.
is called svinp/ectic se//ac/joint
(i) Show that = p. Conclude that every linear transformation ço of E where is symplectic can be uniquely written in the form selfadjoint and p2 is skew symplectic. (ii) Show that the matrix of a skew symplectic transformation with respect to a symplectic basis has the form (A B
Chapter XI. Unitary spaces
334
where A. B. C, D are square (n x ii)-mat rices satisfying
D=_A*.
C*=C.
B*=B.
Conclude that the dimensions of the spaces of symplectic selfadjoint (respectively skew symplectic) transformations are given by = ii(2ii
= ii(2,i
— 1)
—1.— 1).
(iii) Show that the symplectic adjoint of a linear transformation p of a 2-dimensional space is given by =
i
trço —
çø.
(iv) Let E be an n-dimensional unitary space with underlying real vector space Show that a positive definite symmetric inner product and a symplectic inner product is defined in by (x. j') = Re (x. j') and
x.
<x, i'> = Im (x, v)
Establish the relations <x, y> = — (ix.
and
v)
= (x.v).
Conclude that the transformation symplectic.
§
is both symplectic and skew
3. Linear mappings of unitary spaces
11.8. The adjoint mapping. Let E and F be two unitary spaces and (p: E-÷ F a linear mapping of E into F. As in the real case we can associate
of F into E. A conjugation in a unitary with ço an adjoint mapping of the underlying real vector space space is a linear automorphism
such that ix=
and
fl=(x,v).
If
is a conjugation in E,
then a non-degenerate complex bilinear function is defined by <x,y>= (x, and be conjugations in E and in F, respecx, yEE. Let tively. Then E and F are dual to themselves and hence 'p determines a dual mapping 'p*: by the relation = <x,'p*y>.
(11.16)
§ 3. Linear
mappings of unitary spaces
335
Replacing y by 9 in (11.16) we obtain (11.17)
Observing the relation between inner product aid scalar product we can rewrite (11.17) in the form
(çox,y) =(x,co*9). Now define the mapping
E+-F by (11.19)
Then relation (11. 18) reads
xeE, yEF. The mapping assume that (11.20). Then
(11.20)
does not depend on the conjugations in E and F. In fact, and are two linear mappings of F into E satisfying
0.
—
This equation holds for every fixed yeF and all vectors xeE and hence it implies that
=
The mapping
is called the adjoint of the mapping
It follows from equation (11 .18) that the relations
and hold for any two linear mappings and for every complex coefficient 2. Equation (11.20) implies that the matrices of 'p and relative to two orthonormal bases of E and F are connected by the relation
Now consider the case E= F. Then the determinants of 'p and are complex conjugates. To prove this, let A +0 be a determinant function in E and A be the conjugate determinant function. Then it follows from the definition of A that = det 'p A
...
) = det 'p
This equation implies that deUp = det'p.
A (x1
...
___________
336
Chapter XI. Unitary spaces
If (p is replaced by p —Ai, where 2 (11.21) yields det
—2
is
a complex parameter, relation
i) = det ((p — 2 i).
Expanding both sides with respect to 2 we obtain ;n—V
=
This equation shows that corresponding coefficients in the characteristic polynomials of (p and are complex conjugates. In particular,
= trço.
11.9. The inner product in the space L(E; E). Consider the space L (E; E). An inner product can be introduced in this space by
=
(11.22)
It follows immediately from (Il .22) that the function (p, i/i) is sesquilinear. Interchange of and yields
To prove that the Hermitian function (11.22) is positive definite let (v= in) be an orthonormal basis. Then and where
(11.23)
Equations (11.23) yield (p çoe%, =
whence
=
=
=
This formula shows that ((p, p)> 0 for every transformation (p + 0. 11.10. Normal mappings. A linear transformation (p : E—+ E is called normal, if the mappings (p and commute, (11.24)
In the same way as for a real inner product (cf. sec. 8.5) it is shown
§ 3. Linear mappings of unitary spaces
337
that the condition (11.24) is equivalent to
xeE.
It follows from (11.25) that the kernels of (p,
(11.25)
coincide, ker p = ker
and
We thus obtain the direct decomposition
E = ker (p
(11.26)
Tm (p.
The relation ker (p = ker implies that the mappings (p and have the same eigenvectors and that the corresponding eigenvalues are complex conjugates. In fact, assume that e is an eigenvector of (p and that is the corresponding eigenvalue, çoe = 2e. Then e is contained in the kernel of (p —2i. Since the mapping (p —2i is
again normal, e must also be contained in the kernel of —L, i. e.
qe—)e. In sec. 8.7 we have seen that a selfadjoint linear transformation of a real inner product space of dimension n always possesses n eigenvectors which are mutually orthogonal. Now it will be shown that in a complex space the same assertion holds even for normal mapping. Consider the characteristic polynomial of (p. According to the fundamental theorem Then of algebra this polynomial must have a zero is an eigenvalue of Let e1 be a corresponding eigenvector and E1 the orthogonal complement of e1. The space E1 is stable under (p. In fact, let y be an arbitrary vector of E1. Then ((py,e1) = (y,
=
= 2(y,e1) =
0
and hence is contained in E1. The induced mapping is obviously again normal and hence there exists an eigenvector e2 in E1. Continuing this way we finally obtain n eigenvectors (v = 1. n) which are mutually orthogonal, . .
(eV,e,L)=O
(v+ii).
If these vectors are normed to length one, they form an orthonormal basis of E. Relative to this basis the matrix of has diagonal form with the eigenvalues in the main-diagonal, (pep = 22
Greub. Linear Algebra
(v = 1... n).
(11.27)
Chapter XI. Unitary spaces
11.11. Selfadjoint and skew mappings. Let ço be a selfadjoint linear
transformation of E; i.e., a mapping such that
= (p. Then relation
(11.20) yields
(çox,y)=(x,coy)
X,yEE.
Replacing y by x we obtain
((px,x) = (x,px)
=(qx,x)
showing that ((px, x) is real for every vector xEE. This implies that all eigenvalues of a selfadjoint transformation are real. In fact, let e be an eigenvector and )L be the corresponding eigenvalue. Then çoe=Ae, whence
(pe,e) = 2(e,e). Since (çoe, e) and (e, e) + 0 are real, ). must be real.
Every selfadjoint mapping is obviously normal and hence there exists a system of n orthonormal eigenvectors. Relative to this system the matrix of p has the form (11.27) where all numbers are real.
The matrix of a selfadjoint mapping relative to an orthonormal basis is Hermitian. A linear transformation p of E is called skew if p= —(p. In a unitary
space there is no essential difference between selfadjoint and skew mappings. In fact, the relation iço
shows that multiplication by i associates with every selfadjoint mapping a skew mapping and conversely. 11.12. Unitary mappings. A unitary mapping is a linear transformation of E which preserves the inner product,
(çox,çoy)=(x,y)
x,yeE.
(11.28)
Relation (11.28) implies that
xEE showing that every unitary mapping is regular and hence it is an automorphism of E. If equation (11 .28) is written in the form
((px,y) =(x,(p1 y) it shows that the inverse mapping of (p is equal to the adjoint mapping, (11.29)
§ 3. Linear mappings of unitary spaces
339
Passing over to the determinants we obtain 1
whence
= 1. Every eigenvalue of a unitary mapping has norm 1. In fact, the equation çoe = 2e
yields = 121 el
whence 121=1.
Equation (11.29) shows that a unitary map is normal. Hence, there exists an orthonormal basis l...n) such that
= where the
are
(v = 1... n)
complex numbers of absolute value one.
Problems 1. Given a linear transformation p: E-÷E show that the bilinear function ck defined by
P(x,y)=(px,y)
is sesquilinear. Conversely, prove that every sesquilinear function can be obtained in this way. Prove that the adjoint transformation determines the Hermitian conjugate function.
2. Show that the set of selfadjoint transformations is a real vector space of dimension n2. 3. Let p be a linear transformation of a complex vector space E. a) Prove that a positive definite inner product can be introduced in E
such that p becomes a normal mapping if and only if ço has n linearly independent eigenvectors. b) Prove that a positive definite inner product can be introduced such
that p is i) selfadjoint ii) skew iii) unitary
if and only if in addition the following conditions are fulfilled in corresponding order: i) all eigenvalues of çø are real ii) all eigenvalues of 'p are imaginary or zero iii) all eigenvalues have absolute value 1. 22*
Chapter XI. Unitary spaces
4. Denote by S(E) the space of all selfadjoint mappings and by A (E) the space of all skew mappings of the unitary space E. Prove that a multiplication is defined in S(E) and A (E) by and
=
—
respectively and that these spaces become Lie algebras under the above multiplications. Show that S(E) and A(E) are real forms of the complex space L(E; E) (cf. problem 8, § 1, chap. I and sec. 11.7). §
4•* Unitary mappings of the complex plane
11.13. Definition. In this paragraph we will study the unitary mappings of a 2-dimensional unitary space C in further detail. Let -r be a unitary mapping of C. Employing an orthonormal basis e1, e-, we can represent the mapping i in the form (11.30) ie1 =c,e1 ±f3e2 're2 = c(— f3e1 + e
are complex numbers subject to the conditions =
1cx12 + 11312
and
= 1.
These equations show that
dett =
c.
We are particularly interested in the unitary mappings with the determinant + 1. For every such mapping equations (11.30) reduce to
te1=cxe1+$e2
2 cx
+
—
This implies that
-r'e1
—/3e2
T'e2=13e1 +cxe2. Adding the above relations in the corresponding order we find that (i +
= (cx +
t+
=
t' = itri.
(11.31)
§ 4. Unitary mappings of the complex plane
341
Formula (11.31) implies that
(z,rz) + (z,x' z) = zl2trt for every vector zeC. Observing that
(z,'r' z)
(tz,z)
(z,tz)
thus obtain the relation
we
2Re(z,rz)=Izl2trx
zEC
(11.32)
showing that if det t = 1. then the real part of the inner product (:.-r:) depends only on the norm of:. (11.32) is the complex analogue of the relation (8.42) for a proper rotation of the real plane. We finally note that the set of all unitary mappings with the determinant + 1 forms a subgroup of the group of all unitary mappings. 11.14. The algebra Q. Let Q denote the set of complex linear transformations of the complex plane C which satisfy
q3+ço=ltrço. It
(11.33)
is easy to see that these transformations form a real 4-dimensional
vector space, Q. Since (11.33) is equivalent to ç75 = ad p (cf. sec. 4.6 and problem 8, § 2, chapter IV) it follows that Q is an algebra under composition. Moreover, we have the relation pEQ.
(11.34)
Define a positive definite inner product in Q by setting (cf. sec. 11.9)
Then we have
(p.p)=detp.
çoeQ
(11.35)
and (11.36)
Now we shall establish an isomorphism between the algebra Q and the algebra I-fl of quaternions defined in sec. 7.23. Consider the subspace Q1 of Q consisting of those elements which satisfy or equivalently (p. i)=O. Then Q1 is a 3-dimensional Euclidean subspace of Q.
Chapter XI. Unitary spaces
342
and thus (11.34) yields
For q7eQ1 we have, in view of(11.33).
=—detq 1=— (p,p)i
eQ1.
This relation implies that =
+
'
— ((p,
(11.37)
Next, observe that for çoEQ1 and /JEQ1
((pop whence
°
—
i)=
—
Thus a bilinear map. x, is defined in Q1 by
°
(p x
(11.38)
It satisfies ((p
and
x
(q x
(11.39)
Moreover, equations (11.37) and (11.38) yield (11.40)
It follows that (11.41)
Finally, let /1 be the trilinear function in Q1 given by L1((p,
In view of (11.40) we can write
= (p x
L1((p,
(11.42)
This relation together with (11.39) implies that I is skew symmetric and hence a determinant function in Q1. Moreover, setting x = (p x yields, in view of (11.41), X
.
Thus A is a uoi';ned determinant function in the Euclidean space Q1. Hence it follows from (11.42) that the bilinear function x is the cross product in Q1 if Q1 is given the orientation determined by I (cf. sec. 7.16). Now equation (11.40) shows that Q is isomorphic to the algebra 1-fl of quaternions (cf. sec. 7.23).
§ 4. Unitary mappings of the complex plane
343
Remark: In view of equation (11.35), Q is an associative division algebra of dimension 4 over 11. Therefore it must be isomorphic to
I-fl
(cf. sec. 7.26). The above argument makes this isomorphism explicit. 11.15. The multiplication in C. Select a fixed unit-vector a in C. Then
to every vector :EC there exists a unique mapping
such that
This mapping is determined by the equations
a = a + f3 b = — fla + b
is a unit-vector orthogonal to a and
z = ca + fib. The correspondence
obviously satisfies the relation
+
=
for any two real numbers 2 and p. Hence, it defines a linear mapping of
C onto the linear space Q, if C is considered as a 4-dimensional real linear space. This suggests defining a multiplication among the vectors zEC in the following way: Z1Z2 = (11.43) Then (Pa1 and z1z2eC. (11.44) = In fact, the two mappings
Z2
and
°
are both contained in Q and
a into the same vector. Relation (11.44) shows that the correspondence preserves products. Consequently, the space C besend
comes a (real) division-algebra under the multiplication (11.43) and this algebra is isomorphic to the algebra of quaternions. Equation (11.31) implies that = 2(p, i)a z+ (11.45) for every unit-vector z. In fact, if z is a unit-vector then is a unitary mapping with determinant 1 and thus (11.31) and (11.36) yield z
+
= çoa +
=
+
a=
=
Finally, it will be shown that the inner products in C and Q are connected by the relation Re (z1, z2) =
(p.,
p:2).
(11.46)
Chapter XI. Unitary spaces
To prove this we may again assume that and :2 are unit-vectors. Then and are unitary mappings and we can write
pa,a) = ((p,-!a,a).
=
(:1,:2) =
(11.47)
is also unitary formula (11.32) yields
Since
a, a) = =
tr (p2_i
—1
= -3
= -3
(11.48)
=
Relations (11.47) and (11.48) imply (11.46).
Problems 1. Assume that an orthonormal basis is chosen in C. Prove that the transformations which correspond to the matrices (1
(i
O\ 1)
( 0-i
(O-l\
O\
o)
0
form an orthonormal basis of Q. 2. Show that a transformation çOEQ is skew if and only if tr p=O. 3. Prove that a transformation çOEQ satisfies the equation
=
—
if and only if
detp=1
and
trço=O.
4. Verify the formula (z1 z, z2 z) = (z1, z2)
z e C.
5. Let E be the underlying oriented 4-dimensional vector space of the complex plane C and define a positive definite inner product in E by
(x, J')E = Re(x, v).
Let a. h be an orthonormal basis of C such that a is the unit element for the multiplication defined in sec. 11.15. Show that the vectors = a.
e1 = ia_
c7 =
h,
c3 = ib
form an orthonormal basis of E and that, if 4 is the normed positive determinant function in E, = — I.
§ 5. Application to Lorentz-transformation
345
Let E1 denote the orthogonal complement of e in E (cf. problem 5). t/i be a skew Hermitian transformation of C with tn/i=0. a) Show that there is a unique vector peE1 such that x. xeE. 6.
Let
b)If / is the matrix of with respect to the orthonormal basis a. b in problem 5, show that p= + fi e2 + ; e3. §
5•* Application to Lorentz-transformations
11.16. Selfadjoint linear transformations of the complex plane. Consider the set S of all selfadjoint mappings a of the complex plane C. S is a real 4-dimensional linear space. In this space introduce a real inner product by (11.69)
This inner product is indefinite and has index 3. To prove this we note first that — (trcr)2) = — deta = (11.70) and
=—4tra.
(11.71)
Now select an orthonormal basis z1,z2 of C and consider the transformations (j = 1,2,3) which correspond to the Pauli-matrices
/1
0\ —i)
i\
/0
/0
o)
—i 0
Then it follows from (11.69) that = 1. In view of (11.70) and (11.71) these conditions are equivalent to
trw=0 and detw=—l. Therefore WoW
1.
By hypothesis, T preserves fore-cone and past-cone. Hence Ti can be written as (cf. sec. 9.26)
Ti = i cosh 0 + wsinh0.
(11.91)
Let cx be the selfadjoint transformation defined by
=
i cosh
0
+ w sinh —.
(11.92)
Then
Ti=
0
= i cosh2 + —
2 w cosh
2
=
i cosh 0
0
0
2
2
- sinh - +
0
w w sinh2 — 2
(11.93)
+ w sinh 0.
Comparing (11.91) and (11.93) we see that
Ti = Tai. This equation shows that the transformation T' ° T leaves vector i invariant. As it has been shown already there exists a linear transformation f3e Q of determinant 1 such that T=
Chapter XI. Unitary spaces
Hence,
T= It remains to be proved that cz has determinant 1. But this follows from (11.70), (11.92) and (11.88): det
= — sinh2 - = 1.
Problems 1) Let be the linear transformation of a complex plane defined by the matrix
(
1
21 3
Find the real 4 x 4 matrix which corresponds to the Lorentz-transformation with respect to the basis i, a3 (cf. sec. 11.16). 2) A fractional linear transformation of the complex plane formation of the form
T(:) =
c:+d
ad— hc
=
is a trans-
1
where a. b. c. d arc complex numbers. (i) Show that the fractional linear transformations form a group.
(ii) Show that this group is isomorphic to the group of proper orthochroneous Lorentz transformations.
Chapter XII
Polynomial Algebras in this chapter F denotes a fIeld of characteristic zero. § 1. Basic properties In this paragraph we shall define the polynomial algebra over F and establish its elementary properties. Some of the work done here is
simply a specialization of the more general results of Chapter VII. Volume II. and is included here so that the results of the following chapter will be accessible to the reader who has not seen the volume on multilinear algebra. 12.1. Polynomials over F. A polynomial over F is an infinite sequence
are different from zero. The sum and such that only finitely many the product of two polynomials =
...)
and
g=
...)
are defined by
jg =
...)
where i-t-j=k
These operations make the set of all polynomials over F into a commutative associative algebra. It is called the polynomial algebra over F and is denoted by ['[t]. The unit element, 1, of F[t] is the sequence (1.0...). It is easy to check that the map i: F—+F[t] given by
is an injective homomorphism of algebras.
Thus we may identify F with its image under 1. The polynomial (0, 1, 0 ...) will be denoted by
t=(0.l,0...).
Chapter XII. Polynomial algebra
It follows that
tk=(0. (11.0...)
k =0.1....
Thus every polynomial I can be written in the form (12.1)
where only finitely many are different from zero. Since the elements (k=0. 1. ...) are linearly independent, the representation (12.1) is unique. Thus these elements form a basis of the vector space T[t]. Now let .1
0
=
be a non-zero polynomial. Then ; is called the /eadi;ig ('oeff!cwIlt of f The leading coefficient of is the product of the leading (ff0, coefficients off and g. This implies that fg 0 whenever / 0 and g 0. A polynomial with leading coefficient is called monk. On the other hand, for every polynomial the element I
is called the sea/a, term. It is easily checked that the map is a surjective homomorphism. given by Consider a non-zero polynomial k=O
The number ii is called the degree of
I.
If g is a second non zero
polynomial, then clearly
deg(f + g) max(degf,degg) and
deg(fg) = degf + degg.
(12.2)
A polynomial of the form ;t' (; +0) is called a monomial of degree ii. be the space of the monomials of degree ii together with the Let zero element. Then clearly
F[t] and by assigning the degree a to the elements of
we make r[t]
into a graded algebra as follows from (12.2). The homogeneous elements
§ 1. Basic properties
353
of degree n with respect to this gradation are precisely the monomials of degree a.
However, the structure of F[t] as a graded algebra does not play a role in the subsequent theory. Consequently we shall consider simply its structure as an algebra. 12.2. The homomorphism Let A be any associative algebra with unit element e and choose a fixed element aEA. Then the map 1
e, t
a
can be extended in a unique way to an algebra homomorphism
The uniqueness follows immediately from the fact that the elements 1 and t generate the algebra F[t]. To prove existence, simply set
=
a".
It follows easily that k is an algebra homomorphism. The image of T[t] under J) will be denoted by T(a). It is clearly the subalgebra of A generated by e and a. Its elements are called polynomials in a, and are denoted by J(a). The homomorphism cP:f—÷f(a) induces a monomorphism cP:
In particular, if A is generated by e and a then cP (and hence P) are surjective and so P is an isomorphism
P: F[t]/kerP4A in this case.
Example: Let AeT[t] and let
be a polynomial. Then we
have
f(g) =
ti'.
In particular. the scalar term of f(g) is given by =
=J(/30).
Thus we can write
where g: 23
Greub. Linear Algebra
is the homomorphism defined in sec. 12.1.
Chapter XII. Polynomial algebra
12.3. Differentiation. Consider the linear mapping
d:E[t]—*F[t] defined by
dt"—pt"' dl =0. Then we have for p,q1 = = (p + = + t"q = + =
d
d
+
d
(12.3)
It is easily checked that (12.3) continues to hold for p=O or q=0. Since the polynomials form a basis of T[t] (ef. sec. 12.1) it follows from (12.3) that the mapping d is a derivation in the algebra F[t]. d is called the d(fferentiation map in F[t], and is the unique derivation which maps t into 1. It follows from the definition of d that d lowers the degree of a polynomial by 1. In particular we have ker d = F
and so the relations
df = dg and
are equivalent. The polynomial df will be denoted byf'. The chain rule states that for any two polynomials f and g,
(f(g))'=f'(g)g'. For the proof we comment first that
dgk=kgk_ldg dg° =
(12.4)
0
which follows easily from an induction argument. Now let
I= Then
I (g) =
gk
§ 1. Basic properties
355
and hence formula (12.4) yields
(f(g))' =
1) is called the r-th derivative of f and is The polynomial d7 We extend the notation to the case r=O by usually denoted by if and only setting f(0)=f It follows from the definition that if r exceeds the degree of j 12.4. Taylor's formula. Let A be an associative commutative algebra over F with unit element e. Recall from sec. 12.2 that if jET[t] and aeA then an element [(a)eA is determined by J(a) = Tailor's fhrmula states that for aEA, beA fl
j(a+h)= >J——--,-—-h". p=o
(12.5)
P•
Since the relation (12.5) is linear in j it is sufficient to consider the case f= t". Then we have by the binomial formula f(a + h) =
(a
=
+
p==o
P
...
(a
p
+ 1)
...
(a — p
+ 1)
On the other hand. = n(n
— 1)
—
whence (a)
= a (a —
1)
and so we obtain (12.5).
Problems defined by 1. Consider the mapping F[t] x a) Show that this mapping does not make the space F[t] into an algebra.
b) Show that the mapping is associative and has a left and right identity.
Chapter XII. Polynomial algebra
c) Show that the mapping is not commutative. d) Prove that the mapping obeys the left cancellation law but not the right cancellation law; i.e., (g) =f2(g) implies that f1 =f2 but f(g1) f(g2) does not imply that g1 =g2. 2. Construct a linear mapping
r[t] such that
do$=i. Prove that if such that
and
$2 are
two such mappings, then there is a fixed scalar,
-$2)f In particular, prove that there is a unique homogeneous linear mapping $
of the graded space F [t] into itself and calculate its degree. $ integration operator. 3. Consider the homomorphism Q:F[t]—J' defined by
is
called the
=
Q
Show that
=f(O). Prove that if $ is the integration operator in F[z'], then
=i Use
—
this relation to obtain the formula for integration by parts: $1 g'
4. What is the Poincaré series for the graded space F[t]? 5. Show that if ThF[t]—+F[t] is a non-trivial homogeneous antiderivation in the graded algebra F[t] with respect to the canonical involution, and then
pl.
6. Calculate Taylor's expansion
f (g + h) = I (g) + hf'(g) +.. for the following cases and so verify that it holds in each case:
a) f=t2—t+l,
g=t3+2t,
h=t—5
b)f=t2+l, g=t3+t—1, h=—t+l c)f=3t2+2t+5, g=l, h=—l d)f=t3—t2+t—l,
g=t,
h=t2—t+1
§ 2. Ideals and divisibility
357
7. For the polynomials in problem 6 verify the chain rule
[1(g)]' explicitly. Express the polynomialf(g(h))' in terms of the derivatives off, g and h and calculate [f(g(h))]' explicitly for the polynomials of problem 6.
§ 2.
Ideals and divisibility
12.5. Ideals in F Iti. In this section it will be shown that every ideal in
the algebra F[t] is a principal ideal (cf. sec. 5.3). We first prove the following
Lemma I: (Euclid algorithm): Let [+0 and g+0 be two polynomials. Then there exist polynomials q and r such that
f = gq + and deg r<deg g or r=0. Proof: Let degf= n and deg g = in. If in > n we write
f =gO+f and the lemma is proved. Now consider the case Without loss of generality we may assume thatf and g are monic polynomials. Then we have
f=
+
degf1 0.
(13.11)
Setting g2=gf1 we have degg2<degp and so p is not a divisor of g2. it follows that g2 cannot annihilate all the vectors of E; i.e., there is a vector xeE such that (13.12)
Let y =f1 (p)x. Then we obtain from (13.11) and (13.12) that
f(ço)y =f(p)f1(p)x = p(p)x =
0
while
= g(p)f1(p)x = g2(p)x
0.
§ I. Polynomials in a linear transformation
Thus
yeK(f), but
387
and so K(g) is a proper subspace of K(f).
Corollary I: Letf be a non-zero polynomial. Then
K(f)=O if and only iff and
(13.13)
are relatively prime.
1ff and /2 are relatively prime, then Corollary I to Proposition II gives
K(f)=K(f)n E=K(f)n K(/2)=O. Conversely suppose (13.13) holds, and let d be the greatest common divisor off and p. Then l/d and d/p but
K(d)=K(f)fl K(p)=O=K(1). It follows from Proposition III that 1 cannot be a proper divisor of d; hence 1 andf and p are relatively prime. Corollary II: Let f be any non-zero monic polynomial that divides p, and let ço1 denote the restriction of çü to K(f). Let pj. denote the minimum polynomial of p1. Then (13.14) pf=f.
Proof: We have from the definitions thatf(p1)=O, and hence ps/f. It K(f) and since K(pf) K(f) we obtain
follows that
K(p1) =
On the other hand,fjp and cannot be a proper divisor of Proposition IV: Let be
K(f).
Now Proposition III implies that,u1 which yields (13.14).
pf1...fr
a decomposition of p into relatively prime factors. Then
Moreover, if co denotes the restriction of q, to K(f,) and then polynomial of
pi =fi.
25*
is the minimum
Chapter XIII. Theory of a linear transformation
Proof: The proposition is an immediate consequence of Corollary II to Proposition 11 and Corollary 11 to Proposition 111. such 13.3. Eigenvalues. Recall that an eigenvalue of is a scalar that
(px=/x
(13.15)
for some non-zero vector XEE. and that x is called an eigenvector corresponding to the eigenvalue ).. (13.1 5) is clearly equivalent to
K(f)
0
(13.16)
where / is the polynomial t In view of Corollary I to Proposition III. (13.16) is equivalent to requiring that / and p have a non-scalar common divisor. Siiice deg/ I. this is the same as requiring that Thus the eigenvalues of p are precisely the distinct roots of p. Now consider the characteristic polynomial of çø, x
=
are the characteristic coefficients of p defined in sec. 4.19. The corresponding polynomial function is then given by where the
det ((p — )L i):
x
it follows from the definition that
dimE = Recall that the distinct roots of are precisely the eigenvalues of (cf. sec. 4.20). Hence the distinct roots of the characteristic polynomial coincide with those of the minimum polynominal. In sec. 13.20 it will be shown that the minimum polynomial is even a divisor of the characterisic polynomial.
Problems 1. Calculate the minimum polynomials for the following linear transformations: a) ço=).t b) (p is a projection operator c) (p is an involution
§ I. Polynomials in a linear transformation d) e)
389
'P is a differential operator
is a (proper or improper) rotation of a Euclidean plane or of
Euclidean 3-space 2. Given an example of linear transformations (p, t/i:E—*E such that 0i/i do not have the same minimum polynomial. o 'P and where (1= 1,2) are 3. Suppose E—E1E13E2 and linear transformations. Let p, 111, P2 be the minimum polynomials of (p, and 'P2. Prove that p is the least common multiple of P1 and P2. 4. More generally, suppose E1, E2 c E are stable under 'P and E= and 'P12:E1fl E2 be the E1+E2. Let q,1:E1—+E1, of'P and suppose that they have minimum restrictions polynomials P1,
is the least common multiple of Pt and P2• where v is the greatest common divisor of c) Give an example showing that in general P12 + 5. Suppose E1 E is a subspace stable under 'P. Let and minimum polynomials of p
b)
and
P2.
and 1ü be the
and
Prove that let v be the least common multiple of Pi and Construct an example where v=p+üp1 and an example where v+p= Finally construct an example where
6. Show that the minimal polynomial p of a linear transformation 'P can be constructed in the following way: Select an arbitrary vector x1eE and determine the smallest integer k1, such that the vectors (PVX1 (v= 0.. .k1) are linearly dependent,
= 0. Define a polynomial ft by k1
11
v0
tv.
If the vectors (PVXI (v=O...k1) do not generate the space E select a vector x2 which is not a linear combination of these vectors and apply the same
construction to x2. Let f2 be the corresponding polynomial. Continue this procedure until the whole space E is exhausted. Then p is the least common multiple of the polynomialsfQ. 7. Construct
the minimum and characteristic polynomials for the
following linear transformations of R4. Verify in each case that p divides
Chapter XIII. Theory of a linear transformation
390
a) b)
= =
c)
=
+ + + —
d)
+ —
+
+
+
+
—
—
—
+
— —
8. Let p be a rotation of an inner product space. Prove that the coefficients
of the minimum polynomial satisfy the relations
k=degp,v=0...k ± 1 depending on whether the rotation is proper or improper. 9. Show that the minimum polynomial of a selfadjoint transformation of a unitary space has real coefficients. 10. Assume that a conjugation z—*2 is defined in the complex vector space E (cf. sec. 11.7). Let ço:E—÷E be a linear transformation such that Prove that the minimum polynomial of p has real coefficients. 11. Show that the set of stable subspaces of E under a linear transformation is a lattice with respect to inclusion. Establish a lattice homomorphism of this lattice onto the lattice of ideals in F((p). 12. Given a regular linear transformation show that is a polywhere
nomial in çø.
13. Suppose p e L(E, E) is regular. Assume that for every //e L(E: E) Prove that J= I for some k. If/k is the leasi. integer (some
such that )'= I, prove that the minimum polynomial,
of p can be
written
=
§ 2.
Generalized eigenspaces
13.4. Generalized eigenspaces. Let
=
...
ft irreducible
(13.17)
be the decomposition of p into its prime factors (cf. sec. 12.9). Then the spaces
E. = K are called the generalized eigenspaces of p. It follows from sec. 13.2 that are stable under 'p. Moreover, Proposition IV, sec. 13.2 implies the
§ 2. Generalized eigenspaces
391
that (13j8) and
=
jk
where p, denotes the minimum polynomial of the restriction of p to E1. In particular, dim Now suppose 2 is an eigenvalue for 'p. Then t—2Jp, and so for some i t
Hence the eigenspaces of
—2.
are precisely the spaces
K(f) where the J, are the those polynomials in the decomposition (13.17) which are linear.
13.5. The projection operators. Let the projection operators in E associated with the decomposition (13.18) be denoted by ira. It will be shown that the mappings are polynomials in p,
i=1,...,r. If r= 1, it1 = t and the assertion is trivial. Suppose now that r> 1 and define polynomials by —
g
are relatively prime, and hence there
exist polynomials h1 such that (13.19)
On the other hand, it follows from Corollary II, Proposition II, sec. 13.2 that
K(g) =
j*i
and so, in particular, =
0
xe
j*i
Now let xeE be an arbitrary vector, and let x1EE1
(13.20)
392
Chapter XIII. Theory of a linear transformation
be the decomposition of x determined by (13.18). Then (13.19) and (13.20) yield the relation x=
=
x =
h (go) g1 (go)
h. (go) g. ((p) x1
whence x1
=
I = 1,
...,r.
(13.21)
It follows at once from (13.20) and (13.21) that
=
I
=
1,
...,r
which completes the proof. 13.6. Arbitrary stable subspaces. Let Fc: E be any stable subspace. Then
F=
(13.22)
n
where the are the generalized eigenspaces of go. In fact, since the projection operators are polynomials in go, it follows that F is stable under each 7C1.
Now we have for each xeF that X
=1X=
X
and fl
It follows that
E1,
whence E1.
inclusion in the other direction is obvious, (13.22) is established. 13.7. The Fitting decomposition. Suppose F0 is the generalized eigenspace of go corresponding to the irreducible polynomial t (if t does not divide then of course F0=0). Let F1 be the direct sum of the remaining generalized eigenspaces. The decomposition Since
E=
F0
F1
is called the Fitting decomposition of E with respect to go. F0 and F1 are called respectively the Fitting-null component and the Fitting-one component of E.
§ 2. Generalized eigenspaces
393
Clearly F0 and F1 are stable subspaces. Moreover it follows from the and F1, then definitions thatif 'p is ni/potent; i.e.,
forsomel>O
while Pi is a linear isomorphism. Finally, we remark that the corresponding projection operators are polynomials in (p, since they are sums of the projection operators defined in sec. 13.5. 13.8. Dual mappings. Let E* be a space dual to E and let E*
çp* E*
be the linear transformation dual to 'p. Then if f is any polynomial, it follows from sec. 2.25 that This
=
implies that f(p*) = 0 if and only if f('p) 0. In particular, the
minimum polynomials of 'p and
coincide.
Now suppose that F is any stable subspace of E. Then F' is stable under co*. In fact, if yEF and y*EF' are arbitrarily chosen, we have =
=
'p
0
whence 'p*y*EFI. This proves that F' is stable. Thus 'p* induces a linear transformation ((p*)I.
E*/FI
E*/FI.
On the other hand, let and ('p*)1 are be the restriction of 'p to F. It will now be shown that dual with respect to the induced scalar product between F and E*/FI is a representative of (cf. sec. 2.24). In fact, if yEF is any vector and an arbitrary vektor e E*/FI, then EJ)l
j*i
defined by
i=1,...,r.
Then (13.24)
§ 2. Generalized eigenspaces
are stable under
Then, as shown above, the
395
and (13.25)
It will now be shown that (13.26)
Let y*EF,* be arbitrarily chosen. Then for each
=
=
we have
= 0.
In view of the duality between EL and F1*, this implies thatf/'1(cp*)y* =0;
y*EE* This establishes (13.26). Now a comparison of the decompositions (13.23) and (13.25) yields (13.24).
Problems 1. Show that the minimum polynomial of p is completely reducible (i.e. all prime factors are of degree 1) if and only if every stable subspace contains an eigenvector.
2. Suppose that the minimum polynomial p of ço is completely reducible. Construct a basis of E with respect to which the matrix of ço is lower triangular; i.e., the matrix has the form 0
*
Hint: Use problem 1.
3. Let E be an n-dimensional real vector space and p: E—*E be a regular linear transformation. Show that çø can be written p=c01p2 where every eigenvalue of p1 is positive and every eigenvalue of .p2 is negative.
4. Use problem 3 to derive a simple proof of the basis deformation theorem of sec. 4.32. 5. Let ço:E-*E be a linear transformation and consider the subspaces F0 and F1 defined by F0 =
and
F1 =
fl
Chapter XIII. Theory of a linear transformation
396
a) Show that F0 = U ker
ço1.
1
b)
Show that
c) Prove that F0 and F1 are stable under
and that the restrictions are respectively nilpotent and regular. and cond) Prove that c) characterizes the decomposition
(p0:F0-3F0 and ço1
clude that F0 and F1 are respectively the Fitting null and the Fitting 1-component of E.
6. Consider the linear transformations of problem 7, § 1. For each transformation a) Construct the decomposition of into the generalized eigenspaces. b) Determine the eigenspaces. c) Calculate explicitly polynomials g, such that the g.(p) are the pro-
jection operators in E corresponding to the generalized eigenspaces. Verify by explicit consideration of the vectors that the (ço) are in fact the projection operators. d) Determine the Fitting decomposition of E. 7. Let E= be the decomposition of E into generalized eigenspaces
of p, and let
be the corresponding projection operators. Show that
there exist unique polynomials
(p) = Conclude that the polynomials
such that and
deg
deg
depend only on p. E*
8. Let E* be dual to E and
E* into generalized eigenspaces of co* prove that
=
are the corresponding projection operators and the g, are defined in problem 7. Use this result to show that and are dual and to obtain formula
where the
(13.24).
9. Let FcE be stable under 'p and consider the induced mappings E/F—+ E/F. Let E= be the decomposition of F into
c°F :F—÷F and
be the canonical injection and generalized eigenspaces of 'p. Let Q:E—*E/Fbe the canonical projection. a) Show that the decomposition of F into generalized eigenspaces is given by
F=
where
F
fl
§ 3. Cyclic spaces
397
Show that the decomposition of E/F into generalized eigenspaces of is given by E/F = (ElF)1 where (E/F), = (E1). b)
Conclude that
determines a linear isomorphism (E/F),.
E1/F,
c) If denote the projection operators in E, F and E/F associated with the decompositions, prove that the diagrams
and F
are commutative, and that
are the unique linear mappings for
which this is the case. Conclude that if g are the polynomials of problem 7,then F g1((p) and = —
10. Suppose that the minimum polynomial ji of çü is completely reducible. a) By considering first the case p = (t —
dimE.
degp
b) With the aid of a) prove that of p. § 3.
prove that
x the characteristic polynomial
Cyclic spaces
13.9. The map 0a• Fix a vector aEE and consider the linear map ar,:
given by a
fer[t].
Let K1, denote the kernel of ar,. It follows from the definition that feK,, Ka, where p is if and only if aEK(J). K,, is an ideal in T[t]. Clearly the minimum polynomial of p. Proposition 1: There exists a rector a E E such that K,, = I,, Proof: Consider first the case that p is of the form
p=
k 1.
/ irreducible.
Chapter XIII. Theory of a linear transformation
398
Then there is a vector tIE E such that (13.27)
0.
Suppose now that hEKa and let g be the greatest common divisor of h and p. Since aeK(fl, Corollary Ito Proposition I, sec. 13.2 yields
ueK(g).
(13.28)
Since g, p it follows that g =1' where / k. Hence 1 (p) a = () and relation (13.27) implies that /=k. Thus K(g)=K0. Now it follows from (13.28) that
In the general case decompose p in the form =
J irreducible
...
and let (i= 1 i) be be the corresponding decomposition of E. Let is the induced transformation. Then the minimum polynomial of given by (cf. sec. 13.4) = r) (1= 1
Thus, by the first part of the proof, there are vectors K,, =
I,,
(i =
I
such that
. . .
Now set
(1(l1 Assume that jEK11. Then f(p) a=0 whence
= 0.
Since f(p)
it follows that
f(p) whence fe Ka (i =
1
=0
(1 =
I
... r). Thus •f C
and so tel0. This shows that
•'' fl
=
III
whence K(,=I,L.
13.10. Cyclic spaces. The vector space E is called crc/ic (with respect
to the linear transformation p) if there exists a vector acE such that is surjective. Every such vector is called a geiiclaloI of E. the map In fact, let I cK and let XEE. Then, Ifa is a generator of E. then
§ 3. Cyclic spaces
399
for some geT[t], x=g(p)a. It follows that
=f(p)g(p)a = whence
f(p)a = g(p)O = 0
and so tel1.
Proposition II: If E is cyclic, then
degp = dimE.
Proof Let a be a generator of E. Then, since Ka =
0a induces an
isomorphism E.
It follows that (cf. Proposition I, sec. 12.11)
= degp.
dim
Proposition III: Let a be a generator of E and let deg p = m. Then the vectors
a,p(a)
form a basis of E.
Proof. Let / he the largest integer such that the vectors a, p(a)
(13.29)
are linearly independent. Then these vectors generate E. In fact, every
vector xeE can be written in the form x=f(p)a where I =
is a
polynomial. It follows that i—i
k
X
=
(a) = V=
p' (a). 1)
1)
Thus the vectors (13.29) form a basis of E. Now Proposition II implies that I=ni. Proposition IV: The space E is cyclic if and only if there exists a basis (v=O... n—i) of E such that =
(v = 0 ... ii —2).
Proof: If E is cyclic with generator a set a=a and apply Proposiu—I) is a basis satisfying tion III. Conversely. assume that
the conditions above. Let v= e I'[t] by
be an arbitrary vector and define V
Chapter XIII. Theory of a linear transformation
400
Then
=x as is easily checked and so E is cyclic. (v = 0 ii — I be a basis as in the Proposition CoioI/urr: Let above. Then the minimum polynomial of is given by )
= where the
are
—
determined by
Proof: It is easily checked that p(p)=O and so p is a multiple of the minimum polynomial of p. On the other hand. by Proposition II. the minimum polynomial of has degree ii and thus it must coincide with p. 13.11. Cyclic subspaces. A stable subspace is called a crc/ic siihspacc if it is cyclic with respect to the induced transformation. Every vector UEE determines a cyclic subspace. namely the subspace = Im Proposition V: There exists a cyclic subspace whose dimension is equal to degp. Proof: In view of Proposition I there is a vector u e F such that ker = Then induces an isomorphism E11.
It follows that
dimE, = diml'[t]
'R
= degp.
Throic;,, I: The degree of the minimum polynomial satisfies
degp dimE. Equality holds if and only if E is cyclic. Moreover, if F is any cyclic subspace of E. then
dimF degp.
Proof: Proposition V implies that deg p dimE. If E is cyclic. equality
holds (cf. Proposition III). Conversely. assume that degp=dimE. By Proposition V there exists a cyclic subspace with dim F=degp. It follows that F=E and so E is cyclic.
§ 3. Cyclic spaces and irreducible spaces
401
Finally, let Fc:E be any cyclic subspace and let v denote the minimum polynomial of the induced transformation. Then, as we have seen above, dim F = deg v. But v/p (cf. sec. 13.2) and so we obtain dim F deg p.
Corollary: Let F E be any cyclic subspace, and let v denote the minimum polynomial of the linear transformation induced in F by (p. Then
v=u if and only if
dimF = degp. Proof: From the theorem we have
dimF =
degv
while according to sec. 13.2 v divides p. Hence v=p if and only if deg v=deg ,i; i.e., if and only if
dimF = degp. 13.12. Decomposition of E into cyclic subspaces. Theorem II: There exists a decomposition of E into a direct sum of cyclic subspaces. Proof: The theorem is an immediate consequence (with the aid of an induction argument on the dimension of E) of the following lemma.
E such that
Lemma I. Let dimEa = degp.
Then there is a complementary stable subspace, Fc E, E = Ea
F.
Proof: Let Ea
denote the restriction of (p to Ea, and let (p*:E*
E*
the linear transformation in E* dual to (p. Then (cf. sec. 13.8) stable under (p*, and the induced linear transformation be
*
*
J_
*
(pa:E lEa *—E lEa 26
Greub. Linear Algebra
is
Chapter XIII. Theory of a linear transformations
402
is dual to
with respect to the induced scalar product between Ea and
The corollary to Theorem I, sec. 13.11 implies that the minimum polynomial Of 'Pa is again p. Hence (cf. sec. 13.8) the minimum polynomial
are dual, so that
of p is p. But Ea and
= dimEa = degp.
dim
is cyclic with respect to çü.
Thus Theorem I, sec. 13.11 implies that Now let
be the projection and choose u*eE* so that the element generates E*/E±. Then, by Proposition III, the vectors (p = 0... in — 1) form a basis of Hence the vectors
(p* V
(p = I ... in — I) are linearly independent.
Now consider the cyclic subspace Since I) it follows that On the other hand. Theorem I. sec. 13.11. implies that dim Hence dim
= (p = 0... is injective. Thus 0
Finally, since 7r* TE to
dim
+ dim
=
in —
= m. I
)
it follows that the restriction of But
in + (ii — in)
=
ii
(n =dim E)
and thus we have the direct decomposition E* =
of E* into stable subspaces. Taking orthogonal complements we obtain the direct decomposition
E=
E into stable subspaces which completes the proof.
§ 4.
Irreducible spaces
13.13. Definition. A vector space E is called
!I1(Iecoin/)os(Ib/e or
ii'i'cthicible with respect to a linear transformation 'P. if it can not be expressed as a direct sum of two proper stable subspaces. A stable
§4. Irreducible spaces
403
subspace F E is called irreducible if it is irreducible with respect to the linear transformation induced by (p. Proposition I: E is the direct sum of iireducihle subspaces. Proof: Let
E into stable subspaces such that s is maximized
(this is clearly possible. since for all decompositions we have sdim E). Then the spaces are irreducible. In fact, assume that for some i, =
is a decomposition of
dim
Ft"
> 0, dim
>0
into stable subspaces. Then
E=
F Fe" j*i is a decomposition of E into (s+ I) stable subspaces, which contradicts the maximality of s. An irreducible space E is always cyclic. In fact, by Theorem II, sec. 13.12, E is the direct sum of cyclic subspaces.
E=
If E is indecomposable. it follows that 1=and so E is cyclic. On the other hand, a cyclic space is not necessarily indecomposable. In fact, let E be a 2-dimensional vector space with basis a. h and define p: E—+E by setting (pa=h and (ph =0. Then (p is cyclic (cf. Proposition IV, sec. 13.10). On the other hand, E is the direct sum of the stable subspaces generated by a + b and a — b. 1
Theorem I: E is irreducible if and only if i)
p=fk,
f irreducible;
ii) E is cyclic.
Proof. Suppose E is irreducible. Then, by the remark above. E is E into the cyclic. Moreover, if generalized eigenspaces (cf. sec. 13.4), it follows that r= 1 and so
(f irreducible). Conversely, suppose that (i) and (ii) hold. Let E=
E1
E2
Chapter XIII. Theory of a linear transformations
404
be any decomposition of E into stable subspaces. Denote by and c02 the linear transformations induced in F1 and E, by ço, and let and 112 be the minimum polynomials of and q2. Then sec. 13.2 implies that Hence, we obtain from (i) that p and 111 =
k2 k.
=
(13.33)
Without loss of generality we may assume that k1 k2. Then xEE1
or
XEE7
and so
0.
It follows that
whence k1 k. In view of (13.33) we obtain k1 =k
i.e., It
= Iti•
Now Theorem I, sec. 13.10 yields that
dimE1 degp.
(13.34)
On the other hand, since E is cyclic, the same Theorem implies that
dimE = degjt.
(13.35)
Relations (13.34) and (13.35) give that
dimE = dimE1. Thus E= F1 and F2 = 0. It follows that E is irreducible.
Corollary I. Any decomposition of F into a direct sum of irreducible subspaces is simultaneously a decomposition into cyclic subspaces.
Then a stable subspace Corollary II. Suppose that of F is cyclic if and only if it is irreducible. 13.14. The Jordan canonical matrix. Suppose that F irreducible with respect to p. Then it follows from Theorem I sec. 13.13 that F is cyclic and that the minimum polynomial of ço has the form
p_fk
kl
(13.36)
where f is an irreducible polynomial. Let e be a generator of E and
§4. Irreducible spaces
405
consider the vectors (13.37)
it will be shown that these form a basis of E. Since
dimE =
deg,u
= pk
it is sufficient to show that the vectors (13.37) generate E. Let Fc:E be the subspace generated by the vectors (13.37). Writing
we obtain that
i==1,...,k j=1,...,p—1 e
= f(çp)1_i ço"e =
p—i —
i1,...,k1 These
equations show that the subspace F is stable under 'p. Moreover,
e = a11 e F. On the other hand, since E is cyclic, E is the smallest stable
subspace containing e. It follows that F—E. Now consider the basis ...
a11,
...;akl ...
of E. The matrix of 'p relative to this basis has the form 0 1
(13.38)
0
406
Chapter XIII. Theory of a linear transformation
are all equal, and given by
where the submatrices
01 0
0 1
0•.
j=1,...,k. 0
1
The matrix (13.38) is called a Jordan canonical matrix of the irreducible transformation p. Next let ço be an arbitrary linear transformation. In view of sec. 13.12 there exists a decomposition of E into irreducible subspaces. Choose a basis in every subspace relative to which the induced transformation has the Jordan canonical form. Combining these bases we obtain a basis of E. In this basis the matrix of p consists of submatrices of the form (13.38)
following each other along the main diagonal. This matrix is called a Jordan canonical matrix of çø. 13.15. Completely reducible minimum polynomials. Suppose now that E is an irreducible space and that the minimum polynomial is completely
reducible (p= 1); i.e. that
p = (t
—
2)k.
It follows that the
are (1 x 1)-matrices given by Jordan canonical matrix of ço is given by
21
Hence the
0
2 1. (13.39)
..1 0
In particular, if E is a complex vector space (or more generally a vector space over an algebraically closed field) which is irreducible with respect to 'p, then the Jordan canonical matrix of çü has the form (13.39). 13.16. Real vector spaces. Next, let E be a real vector space which is irreducible with respect to ço. Then the polynomialf in (13.36) has one of the two forms f=t—2 2ER or
§4. Irreducible spaces
407
In the first case the Jordan canonical matrix of p has the form (13.39). In the second case the are 2 x 2-matrices given by 0
Hence
1
the Jordan canonical matrix of 0
has the form
1
-fl
1
- -
1
H 13.17. The number of irreducible subspaces. It is clear from the construction in sec. 13.12 that a vector space can be decomposed in several ways into irreducible subspaces. However, the number of irreducible subspaces of any given dimension is uniquely determined by as will be shown in this section. Consider first the case that the minimum polynomial of 'p has the form
p=f
degf=p
wheref is irreducible. Assume that a decomposition of E into irreducible subspaces is given. The dimension of every such subspace is of the form pK(l as follows from sec. 13.12. Denote by FK the direct sum of the irreducible subspaces of dimension pic and denote by NK the number of the subspaces Then we have
E=
FK.
Comparing the dimensions in (13.40) we obtain the equation = dim E=n.
(13.40)
Chapter XIII. Theory of a linear transformation
405
Now consider the transformation
it follows from (13.40) that
Since the subspaces FK are stable under
(13.41)
By the definition of
we have FK
dim
=
whence
=
Since the dimension of each follows that
decreases by p under k/I (cf. sec. 1 3. 14) it
dimk//Fh =
—
(13.42)
1)NK.
Equations (13.41) and (13.42) yield (r(k/I) = rank t/i)
K=2
Repeating the above argument we obtain the equations j=1,...,k.
Kj+
1
Replacing] byj+ I and]—! respectively we find that
(K—j—l)NK=p (13.43)
and —j + 1)NK =
=
—j)NK +
pINK.
(13.44)
Adding (13.43) and (13.44) we obtain
+
= 2p
(K
=2p Kj+
1
—j)NK +
+
+
§4. Irreducible spaces
409
whence
=
+
j
—
= 1,...,k.
This equation shows that the numbers N1 are uniquely determined by the
(j= l,...,k).
ranks of the transformations In particular.
1.(i/1k_l)>
Nk=
I
if]> k. Thus the degree of is pk where k is the largest integer k and such that Nk>O. In the general case consider the decomposition of E into the generalized eigenspaces E= a a direct sum of irreducible subspaces. Then E every irreducible subspace F, is contained in some E (cf. Theorem I, sec. 13.13). Hence the decomposition determines a decomposition of
each E, into irreducible subspaces. Moreover, it is clear that these subspaces are irreducible with respect to the induced transformation E. Hence the number of irreducible subspaces in E1 of a given dimension is determined by and thus by p. It follows that the number of spaces F, of a given dimension depends only on (p. 13.18. Conjugate linear transformations. Let E and F be u-dimensional vector spaces. Two linear transformations p: E—*E and i/i: F—÷F are called conjugate if there is a linear isomorphism such that (pt: E1
= Proposition II: Two linear transformations (p and u/i
are
conjugate if
and only if they satisfy the following conditions: (i)
The minimum polynomials of (p and
factors /1
u/i
have
the same prime
Ir
(ii)
Proof: If (p and u/i
1=1 ... are
I.
conjugate the conditions above are clearly
satisfied.
To prove the converse we distinguish two cases.
Chapter XIII. Theory of a linear transformation
410
Casc I: The minimum polynomials of p and /i are of the form Pq)
.1
=
and
f irreducible.
.11.
Decompose E and F into the irreducible subspaces I
i=-1 j=1
The numbers =
.\ (/1)
F=>i=1 j=1
and N1(l/J) are given by
-
+
p=degf
and
= (cf.
-
+
sec. 13.17). Thus (ii) implies that
il. Since
i>k and
I>!
it follows that k=l. In view of Theorem I, sec. 13.13, the spaces El and F/ are cyclic. Thus Lemma I below yields an isomorphism such that El
= These
isomorphisms determine an isomorphism 1/I
Case II: and ized eigenspaces
=
0
such that
0
are arbitrary. Decompose E and F into the general-
and set
(i= I ... i). Then
restricts to a linear isomorphism in the subspace
§4. Irreducible spaces
Moreover, for
411
is zero in E,. Thus dimE —
=
= 1 ...
dimF —
=
=
Similarly.
... r.
i
Now (ii) implies that
1=1 ...r. Let and denote the respectively. The minimum polynomials of
restrictions of 'p and are of the form
and
i=l ... r and Since
+ dimE —
and
= r(f
j
+ dim F — dim
1
—
it follows from (ii) that l
(j(y) = Thus the pair
satisfies
...
the hypotheses of the theorem and so we
are reduced to case I.
Lemma I: Let p: E E and /i: F —* F (dim E = dim F = ii) be linear transformations having the same minimum polynomial. Then, if E and F are cyclic, 'p and are conjugate. Proof: Choose generators a and b of E and F. Then the vectors 01='pi(a) and
ii— I) forma basis ofEand F respectively.
Now set a,='p(a) and h= t/i'(h). Then, for some and
a,, = 1=
h=
()
h1. 1=
()
Moreover, the minimum polynomials of 'p and =
t'
and
—
1=)
=
t'
are given by —
1=0
Chapter XIII. Theory of' a linear transformations
412
(cf. the corollary of Proposition IV, sec. 1 3.10). Since 1 = 0 ...
=
it follows that
I).
n
by setting
define an isomorphism =
=
/ = 0 ...
h1
n —
Then we have
= (a,) = These
= h,. j=()
j=()
equations imply that
j=0...ii— I. This shows that Anticipating the result of sec. 13.20 we also have Corollary I: Two linear transformations are conjugate if and only if they have the same minimum polynomial and satisfy condition (ii) of Proposition II.
Corollary II: Two linear transformations are conjugate if and only if they have the same characteristic polynomial (cf. sec. 4.19) and satisfy (ii).
Proof: Let J be the common characteristic polynomial of p and /i. Write
=
. .
=
...
.
(f irreducible).
and = .1111
Set
the generalized eigenspaces for direct decompositions
and
fr
13(1=1 ... r) denote 1= 1 ... r; let and t/i respectively. Then we have the
§4. Irreducible spaces
it follows that
Since
413
(1=1 ... r). Similarly.
whence
(i=l...r). Thus we may assume that (g irreducible). Then (kni). ,=g'
I
is
of the form f=g" and
the corollary
follows immediately from the proposition.
Corollaiv III: Every linear transformation is conjugate to its dual.
Problems 1.
(i)
Let P: A(E; E) be a non-zero algebra endomorphism. Show that for any projection operator it.
r(P(m))=i(rr), (ii) Use (i) to prove that 2. Show that every non-zero endomorphism of the algebra A (E; E) is an inner automorphism. Hint: Use Problem I and Proposition II. sec. 13.18. 3. Construct a decomposition of into irreducible subspaces with respect to the linear transformations of problem 7, § 1. Hence obtain the Jordan canonical matrices. 4. Which of the linear transformations of problem 7, § make into a cyclic space? For each such transformation find a generator. For which of these transformations is irreducible? 5. Let E1 cE be a stable subspace under p and consider the induced mappings p1: E1 E1 and E/E1 E/E1. Let p, 1u1, ji be the corresponding minimum polynomials. a) Prove that E is cyclic if and only if 1
i) E1 is cyclic ii) E/E1 is cyclic
iii) p=u1ü. In particular conclude that every subspace of a cyclic space is again cyclic.
b) Construct examples in which conditions i), ii), iii) respectively fail, while the remaining two continue to hold. Hint: Use problem 5, § 1. 6. Let j=1 are
E into subspaces and suppose linear transformations with minimum polynomials
Chapter XIII. Theory of a linear transformations
414
Define a linear transformation
of E by
a) Prove that E is cyclic if and only if each relatively prime.
is cyclic and the
are
b) Conclude that if E is cyclic, then each F3 is a sum of generalized eigenspaces for çø.
c) Prove that if E is cyclic and
a generates Eifand only if generates 1. ..s). 7. Suppose Fc: E is stable under p and let (pF:F-+F (minimum polynomial jib.) and (minimum polynomial ji) be the induced transformations. Show that E is irreducible if and only if i) E/F is irreducible. ii) F is irreducible. iii) wheref is an irreducible polynomial. 8. Suppose that Eis irreducible with respect to çø. Letf (f irreducible)
be the minimum polynomial of ço. a) Prove that the k subspaces K(f) stable subspaces of E b) Conclude that lmf(co)K =
K(fk_l) are the only non-trivial
K
k.
9. Find necessary and sufficient conditions that E have no nontrivial stable subspaces. 10. Let be a differential operator in E.
a) Show that in any decomposition of E into irreducible subspaces, each subspace has dimension I or 2. b) Let N1 be the number of/-dimensional irreducible subspaces in the above decomposition (j= 1, 2). Show that
N1-f-2N,=dimE and N1=dimH(E). c) Using part b) prove that two differential operators in E are conjugate if and only if the corresponding homology spaces coincide. Hint: Use Proposition II, sec. 13.18.
11. Show that two linear transformations of a 3-dimensional vector space are conjugate if and only if they have the same minimum polynomial.
§ 5. Applications of cyclic spaces
12. Let be a linear transformation. Show that there exists a (not necessarily unique) multiplication in E such that i) E is an associative commutative algebra ii) E contains a subalgebra A isomorphic to F((p)
iii) If &F(p)4A is the isomorphism, then
13. Let F be irreducible (and hence cyclic) with respect to p. Show that the set S of generators of the cyclic space E is not a subspace. Construct a subspace F such that S is in I — correspondence with the non-zero elements of E/F. 1
14. Let
be
a linear transformation of a real vector space having can not be written in the
distinct eigenvalues, all negative. Show that form § 5. Applications
of cyclic spaces
In this paragraph we shall apply the theory developed in the preceding paragraph to obtain three important, independent theorems. 13.19. Generalized eigenspaces. Direct sums of the generalized eigenspaces of 'p are characterized by the following
Theorem I: Let (13.45)
be any decomposition of E into a direct sum of stable subspaces. Then the following three conditions are equivalent: (i) Each is a direct sum of some of the generalized eigenspaces E, of 'p. (ii) The projection operators
in E associated with the decomposition
(13.45) are polynomials in 'p. (iii) Every stable subspace UcE satisfies U=
U n
Proof: Suppose that (i) holds. Then the projection operators are sums of the projection operators associated with the decomposition of F into generalized eigenspaces, and so it follows from sec. 13.5 that they are polynomials in 'p. Thus (i) implies (ii).
416
Chapter XIII. Theory of a linear transformation
Now suppose that (ii) holds, and let Uc E be any stable subspace. Then i, we have
since
Uc>o1U. Since Uis stable underQ1 it follows that Q1Uc: Un U c:
U
fl
whence
F1.
The inclusion in the other direction is obvious. Thus (ii) implies (iii). Finally, suppose that (iii) holds. To show that (i) must also hold we first prove
Lemma I: Suppose that (iii) holds, and let
E into generalized eigenspaces. Then to every i, such that (1= 1, ...,r) there corresponds precisely one integer], (1 E1 n
Proof of lemma I: Suppose first that s). Then from (iii) we obtain every ./(1
E. =
E, n
=0 for a fixed I and for
fl
=
0
j= 1
which is clearly false (cf. sec. 13.4). Hence there is at least one] such that
To prove that there is at most one] (for any fixed 1) such that E1 n we shall assume that for some f,]1,]2, fl
+0
and E1 n
0,
+0
and derive a contradiction. Without loss of generality we may assume that
E1flF1+0 and E1flF2+0. Choose two non-zero vectors
y1eE1flF1 and y2EE1flF2. Then since Yi' y2eE1 we have that =
§ 5. Applications of cyclic spaces
417
J irreducible, is the decomposition of p. Let and '2 the least integers such that =0 and =0. We may assume that = '2 we simply replace Yt by the vector where ji
. .
be
Then
f
= 0 and
11
+ 0 for k
k
whence k=l== 1. Hencef(p)=O. is simultaneously nilpotent and semisimple, then Corollary I: If (p = 0.
The major result on semisimple transformations obtained in this section is the following criterion:
Theorem I: A linear transformation is semisimple if and only if its minimum polynomial is the product of relatively prime irreducible polynomials (or equivalently, if the polynomials p and ji' are relatively prime).
Remark: Theorem I shows that a linear transformation p is semisimple if and only if it is a semisimple element of the algebra F(p) (cf. sec. 12.17).
Proof: Suppose (p
=
j1ki
is
...
semisimple. Consider the decompositions
1 irreducible and relatively prime
of the minimum polynomial, and set
ffifr
Then
f(p) is nilpotent, and hence by Proposition I of this section,
f(p)=O. It follows that
by definition, we have
This proves the only if part of the theorem.
To prove the second part of the theorem we consider first the special case that the minimum polynomial, μ, of φ is irreducible. To show that φ is semisimple consider the subalgebra, Γ(φ), of A(E; E) generated by φ and ι. Since μ is irreducible, Γ(φ) is a field (cf. sec. 12.14). Γ(φ) contains Γ and hence is an extension field of Γ, and E may be considered as a vector space over Γ(φ) (cf. § 3, Chapt. V). Since a subspace of the Γ-vector space E is stable under φ if and only if it is stable under every transformation of Γ(φ), it follows that the stable subspaces of E are precisely the Γ(φ)-subspaces of the space E. Since every subspace of a vector space has a complementary subspace, it follows that φ is semisimple.
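For example, if Γ = ℝ and μ = t² + 1, then Γ(φ) = {αι + βφ | α, β ∈ ℝ} is a field isomorphic to ℂ, and when E is viewed as a vector space over Γ(φ) the stable subspaces of φ are exactly the complex subspaces; E becomes a complex space of half the real dimension.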
Now consider the general case

    μ = f₁ ⋯ f_r,   f_i irreducible and relatively prime.

Then we have the decomposition

    E = E₁ ⊕ ⋯ ⊕ E_r

of E into generalized eigenspaces. Since the minimum polynomial of the induced transformation φ_i: E_i → E_i is precisely f_i (cf. sec. 13.4), it follows from the above result that φ_i is semisimple. Now let F ⊂ E be a stable subspace. Then we have, in view of sec. 13.6,

    F = Σ_i (F ∩ E_i).

F ∩ E_i is a stable subspace of E_i, and hence there exists a stable complementary subspace H_i:

    E_i = (F ∩ E_i) ⊕ H_i.

These equations yield

    E = F ⊕ H,   H = H₁ ⊕ ⋯ ⊕ H_r.

Since H is a stable subspace of E, it follows that φ is semisimple.

Corollary I: Let φ be any linear transformation and assume that

    E = F₁ ⊕ ⋯ ⊕ F_s

is a decomposition of E into stable subspaces such that the induced transformations φ_i: F_i → F_i are semisimple. Then φ is semisimple.

Proof: Let μ_i be the minimum polynomial of the induced transformation φ_i. Since φ_i is semisimple, each μ_i is a product of relatively prime irreducible polynomials. Hence the least common multiple, f, of the μ_i is again a product of such polynomials. But f(φ) annihilates E, and hence the minimum polynomial, μ, of φ divides f. It follows that μ is a product of relatively prime irreducible polynomials. Now Theorem I implies that φ is semisimple.

Proposition II: Let Δ ⊂ Γ be a subfield, and assume that E, considered as a Δ-vector space, has finite dimension. Then every (Γ-linear) transformation, φ, of E which is semisimple as a Δ-linear transformation is semisimple considered as a Γ-linear transformation.
Proof: Let μ_Δ be the minimum polynomial of φ considered as a Δ-linear transformation. It follows from Theorem I that μ_Δ and μ_Δ′ are relatively prime. Hence there are polynomials q, r ∈ Δ[t] such that

    q μ_Δ + r μ_Δ′ = 1.    (13.55)

On the other hand, every polynomial over Δ[t] may be considered as a polynomial in Γ[t]. Since μ_Δ(φ) = 0 we have μ_Γ | μ_Δ, where μ_Γ denotes the minimum polynomial of the (Γ-linear) transformation φ. Hence we may write μ_Δ = μ_Γ h for some h ∈ Γ[t], and so

    μ_Δ′ = μ_Γ′ h + μ_Γ h′.    (13.56)

Combining (13.55) and (13.56) we obtain

    q h μ_Γ + h′ r μ_Γ + h r μ_Γ′ = 1,

whence

    (q h + h′ r) μ_Γ + (h r) μ_Γ′ = 1.

This relation shows that the polynomials μ_Γ and μ_Γ′ are relatively prime. Now Theorem I implies that the Γ-linear transformation φ is semisimple.
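As a concrete instance of the proof, take Δ = ℝ ⊂ Γ = ℂ and let φ be multiplication by i on a finite dimensional complex vector space E. Considered as an ℝ-linear transformation, φ has minimum polynomial μ_Δ = t² + 1; μ_Δ and μ_Δ′ = 2t are relatively prime, and so φ is semisimple over ℝ. Here μ_Γ = t − i, and μ_Δ = μ_Γ h with h = t + i, exactly as in (13.56).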
Theorem II: Every linear transformation φ can be written in the form

    φ = φ_S + φ_N,   φ_S φ_N = φ_N φ_S,

where φ_S is semisimple and φ_N is nilpotent. The minimum polynomials of φ_S and φ_N are given by

    μ_S = f₁ ⋯ f_r   and   μ_N = t^k,   k = max(k₁, …, k_r).

Moreover, if φ = ψ_S + ψ_N is any decomposition of φ into a semisimple and a nilpotent transformation such that ψ_S ψ_N = ψ_N ψ_S, then ψ_S = φ_S and ψ_N = φ_N.

Proof: For the existence apply Theorem I and Theorem IV, sec. 12.16, with A = Γ(φ).
To prove the uniqueness let

    φ = ψ_S + ψ_N

be any decomposition of φ into a semisimple and a nilpotent transformation such that ψ_S commutes with ψ_N. Then the subalgebra of A(E; E) generated by ψ_S, ψ_N and ι is commutative and contains φ. Now apply the uniqueness part of Theorem IV, sec. 12.16.
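By way of illustration, let φ be the transformation of Γ² with matrix

    ⎛λ 1⎞
    ⎝0 λ⎠.

Its minimum polynomial is μ = (t − λ)², so that r = 1, f₁ = t − λ and k₁ = 2. Theorem II gives φ_S = λι and φ_N = φ − λι: φ_S is semisimple with minimum polynomial t − λ, φ_N is nilpotent with minimum polynomial t² (k = max k_i = 2), and the two transformations clearly commute.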
13.25. The Jordan normal form of a semisimple transformation. Suppose that E is irreducible with respect to a semisimple transformation φ. Then it follows from sec. 13.13 and Theorem I, sec. 13.24, that the minimum polynomial of φ has the form

    μ = f,

where f is irreducible. Hence the Jordan canonical matrix of φ has the form (cf. sec. 13.15)

    ⎛0 0 ⋯ 0 −α₀    ⎞
    ⎜1 0 ⋯ 0 −α₁    ⎟
    ⎜0 1 ⋯ 0 −α₂    ⎟    (13.57)
    ⎜⋮ ⋮    ⋮  ⋮    ⎟
    ⎝0 0 ⋯ 1 −α_{p−1}⎠

the companion matrix of f = t^p + α_{p−1} t^{p−1} + ⋯ + α₀, where p = deg f.

It follows that the Jordan canonical matrix of an arbitrary semisimple transformation consists of submatrices of the form (13.57) following each other along the main diagonal.

Now consider the special case that E is irreducible with respect to a semisimple transformation whose minimum polynomial is completely reducible. Then p = 1, and hence E has dimension 1. It follows that if φ is a semisimple transformation with completely reducible minimum polynomial, then E is the direct sum of stable subspaces of dimension 1; i.e., E has a basis of eigenvectors. The matrix of φ with respect to this basis is of the form

    ⎛λ₁          ⎞
    ⎜   λ₂       ⎟    (13.58)
    ⎜      ⋱     ⎟
    ⎝         λₙ ⎠
where the λ_ν are the (not necessarily distinct) eigenvalues of φ. A linear transformation with a matrix of the form (13.58) is called diagonalizable. Thus semisimple linear transformations with completely reducible minimum polynomial are diagonalizable.

Finally, let φ be a semisimple transformation of a real vector space E. Then a similar argument shows that E is the direct sum of irreducible stable subspaces of dimension 1 or 2.
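For example, a rotation of the Euclidean plane through an angle θ (0 < θ < π), with matrix

    ⎛cos θ  −sin θ⎞
    ⎝sin θ   cos θ⎠,

has the irreducible minimum polynomial t² − 2 cos θ · t + 1 and is therefore semisimple, although E itself is irreducible of dimension 2 and the rotation is not diagonalizable over ℝ. Over ℂ the same matrix is diagonalizable, with eigenvalues e^{iθ} and e^{−iθ}.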
13.26.* The commutant of a semisimple transformation.

Theorem III: The commutant C(φ) of a semisimple transformation φ is a direct sum of ideals (in the algebra C(φ)), each of which is isomorphic to the full algebra of linear transformations of a vector space over an extension field of Γ.

Proof: Let

    E = E₁ ⊕ ⋯ ⊕ E_r    (13.59)

be the decomposition of E into the generalized eigenspaces. It follows from sec. 13.21 that the eigenspaces E_j are stable under every transformation ψ ∈ C(φ). Now let C_j be the subspace consisting of all transformations ψ ∈ C(φ) such that

    ψ E_k = 0,   k ≠ j.

Since each E_k is stable under each element of C(φ), it follows that C_j is an ideal in the algebra C(φ). As an immediate consequence of the definition, we have

    C_j ∩ Σ_{k≠j} C_k = 0,   j = 1, …, r.    (13.60)
Now let ψ ∈ C(φ) be arbitrary and consider the projection operators π_j associated with the decomposition (13.59). Then

    ψ = Σ_j ψ_j,    (13.61)

where

    ψ_j = ψ π_j.    (13.62)

It follows from (13.62) that ψ_j ∈ C_j. Hence formulae (13.60) and (13.61) imply that

    C(φ) = C₁ ⊕ ⋯ ⊕ C_r.

It is clear that C_j ≅ C(φ_j), where φ_j is the restriction of φ to E_j
and the isomorphism is obtained by restricting a transformation ψ ∈ C_j to E_j.

Now consider the transformations φ_j induced by φ. Since the minimum polynomial f_j of φ_j is irreducible, it follows that E_j may be considered as a vector space over the extension field Γ(φ_j), and we obtain from chap. V, § 3, that C(φ_j) is precisely the algebra of all Γ(φ_j)-linear transformations of E_j.
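As a concrete case of Theorem III, let φ be the rotation of the Euclidean plane through a right angle, with matrix

    J = ⎛0 −1⎞
        ⎝1  0⎠.

Here r = 1, and a direct computation shows that C(φ) consists precisely of the transformations αι + βφ; thus C(φ) ≅ ℂ, the full algebra of linear transformations of the one-dimensional vector space E over the extension field Γ(φ) ≅ ℂ.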
13.27. Semisimple sets of linear transformations. A set {φ_α} of linear transformations of E will be called semisimple if to every subspace F₁ ⊂ E which is stable under each φ_α there exists a complementary subspace F₂ which is stable under each φ_α.

Suppose now that {φ_α} is any set of linear transformations, and let A be the subalgebra of A(E; E) generated by the φ_α. Then clearly a subspace F ⊂ E is stable under each φ_α if and only if it is stable under each ψ ∈ A. In particular, the set {φ_α} is semisimple if and only if the algebra A is semisimple.

Theorem IV: Let {φ_α} be a set of commuting semisimple transformations. Then {φ_α} is a semisimple set.

Proof: We first consider the case of a finite set φ₁, …, φ_s of transformations and proceed by induction on s. If s = 1 the theorem is trivial. Suppose now it holds for s − 1, and assume for the moment that the minimum polynomial of φ₁ is irreducible. Then E may be considered as a Γ(φ₁)-vector space. Since the φ_i (i = 2, …, s) commute with φ₁, they may be considered as Γ(φ₁)-linear transformations (cf. Chap. V, § 3). Moreover, Proposition II, sec. 13.24, implies that the φ_i, considered as Γ(φ₁)-linear transformations, are again semisimple. Now let F₁ ⊂ E be any subspace stable under the φ_i (i = 1, …, s). Then since F₁ is stable under φ₁, it is a Γ(φ₁)-subspace of E. Hence, by the induction hypothesis, there exists a Γ(φ₁)-subspace of E, F₂, which is stable under φ₂, …, φ_s and such that

    E = F₁ ⊕ F₂.

Since F₂ is a Γ(φ₁)-subspace, it is also stable under φ₁, and so it is a stable subspace complementary to F₁.

Now let the minimum polynomial, μ₁, of φ₁ be arbitrary. Since φ₁ is semisimple we have

    μ₁ = f₁ ⋯ f_r,   f_i irreducible and relatively prime.
Let

    E = E₁ ⊕ ⋯ ⊕ E_r

be the decomposition of E into the generalized eigenspaces of φ₁. Now assume that F₁ ⊂ E is a subspace stable under each φ_i (i = 1, …, s). According to sec. 13.21 each E_j is stable under each φ_i. It follows that the subspaces F₁ ∩ E_j are also stable under every φ_i. Moreover, the restrictions of the φ_i to each E_j are again semisimple (cf. sec. 13.24), and in particular the restriction of φ₁ to E_j has as minimum polynomial the irreducible polynomial f_j. Thus it follows from the first part of the proof that the restrictions of the φ_i to E_j form a semisimple set, and hence there exist subspaces P_j which are stable under each φ_i and which satisfy

    E_j = (F₁ ∩ E_j) ⊕ P_j,   j = 1, …, r.

Setting

    F₂ = P₁ ⊕ ⋯ ⊕ P_r,

we have that F₂ is stable under each φ_i and that

    E = F₁ ⊕ F₂.
This closes the induction, and completes the proof for the case that the φ_α form a finite set.

If the set is infinite, consider the subalgebra A ⊂ A(E; E) generated by the φ_α. Then A is a commutative algebra, and hence every subset of A consists of commuting transformations. In view of the discussion at the beginning of this section it is sufficient to construct a semisimple system of generators for A. But A is finite dimensional and so has a finite system of generators, which may be chosen from among the φ_α. Hence the theorem is reduced to the case of a finite set.
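As a simple instance of the theorem, consider in Γ² the commuting semisimple transformations given by the matrices diag(λ₁, λ₂) and diag(μ₁, μ₂). If λ₁ ≠ λ₂, the subspaces stable under both transformations are 0, the two coordinate axes, and E, and each of these clearly has a stable complement; the pair is thus a semisimple set.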
Theorem IV has the following converse:

Theorem V: Suppose A ⊂ A(E; E) is a commutative semisimple set. Then for each φ ∈ A, φ is a semisimple transformation.

Proof: Let φ ∈ A be arbitrary and consider the decomposition

    μ = f₁^{k₁} ⋯ f_r^{k_r}

of its minimum polynomial μ. Define a polynomial, g, by

    g = f₁ ⋯ f_r.

Since the set A is commutative, K(g) is stable under every ψ ∈ A. Hence
there exists a subspace E₁ ⊂ E which is stable under every ψ ∈ A such that

    E = K(g) ⊕ E₁.    (13.63)

Now let

    h = f₁^{k₁−1} ⋯ f_r^{k_r−1}.

Then we have h(φ)E ⊂ K(g). On the other hand, since E₁ is stable under h(φ), h(φ)E₁ ⊂ E₁, whence

    h(φ)E₁ ⊂ K(g) ∩ E₁ = 0.

It follows that E₁ ⊂ K(h). Now consider the polynomial

    p = Π_i f_i^{m_i},   where m_i = max(k_i − 1, 1).

Since k_i − 1 ≤ m_i it follows that

    p(φ)x = 0,   x ∈ E₁,    (13.64)

and from 1 ≤ m_i we obtain

    p(φ)x = 0,   x ∈ K(g).    (13.65)

In view of (13.63), formulae (13.64) and (13.65) imply that p(φ) = 0, and so μ divides p. Now it follows that

    k_i ≤ max(k_i − 1, 1),   i = 1, …, r,

whence

    k_i = 1,   i = 1, …, r.

Hence φ is semisimple.

Corollary: If φ and ψ are two commuting semisimple transformations, then φ + ψ and ψφ are again semisimple.
Proof: Consider the subalgebra A ⊂ A(E; E) generated by φ and ψ. Then Theorem IV implies that A is a semisimple set. Hence it follows from Theorem V that φ + ψ and ψφ are semisimple.
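The hypothesis that φ and ψ commute cannot be dropped. For instance, the transformations of Γ² with matrices

    A = ⎛1 1⎞        B = ⎛0 0⎞
        ⎝0 0⎠            ⎝0 1⎠

are each semisimple (each has the minimum polynomial t(t − 1), a product of distinct linear factors), but AB ≠ BA, and their sum

    A + B = ⎛1 1⎞
            ⎝0 1⎠

has minimum polynomial (t − 1)² and hence is not semisimple.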
Problems

1. Let φ be nilpotent, and let N_p denote the number of subspaces of dimension p in a decomposition of E into irreducible subspaces. Prove that

    dim ker φ = Σ_p N_p.
2. Let φ be nilpotent of degree k in a 6-dimensional vector space E. For each k (1 ≤ k ≤ 6) determine the possible ranks of φ, and show that k and r(φ) determine the numbers N_p (cf. problem 1) explicitly. Conclude that two nilpotent transformations φ and ψ of E are conjugate if and only if k(φ) = k(ψ) and r(φ) = r(ψ).

3. Suppose φ is nilpotent and let φ*: E* ← E* be the dual mapping. Assume that E is cyclic with respect to φ and that φ is of degree k. Let a be a generator of E. Prove that E* is cyclic with respect to φ* and that φ* is of degree k. Let a* ∈ E* be any vector. Show that ⟨a*, φ^{k−1} a⟩ ≠ 0 if and only if a* is a generator of E*.

4. Prove that a linear transformation φ with minimum polynomial μ is diagonalizable if and only if
i) μ is completely reducible,
ii) φ is semisimple.
Show that i) and ii) are equivalent to

    μ = (t − λ₁) ⋯ (t − λ_r),

where the λ_i are distinct scalars.

5. a) Prove that two commuting diagonalizable transformations are simultaneously diagonalizable; i.e., there exists a basis of E with respect to which both matrices are diagonal.
b) Use a) to prove that if φ and ψ are commuting semisimple transformations of a complex space, then φ + ψ and φψ are semisimple.

6. Let φ be a linear transformation of a complex space E. Let E = E₁ ⊕ ⋯ ⊕ E_r be the decomposition of E into generalized eigenspaces, and let π_i be the corresponding projection operators. Assume that the minimum polynomial of the induced transformation φ_i is (t − λ_i)^{k_i}. Prove that the semisimple part of φ is given by

    φ_S = Σ_i λ_i π_i.

7. Let E be a complex vector space and φ be a linear transformation with eigenvalues λ_ν (ν = 1, …, n), not necessarily distinct (n = dim E). Given an arbitrary polynomial f, prove directly that the linear transformation f(φ) has the eigenvalues f(λ_ν) (ν = 1, …, n).
8. Give an example of a semisimple set of linear transformations which contains transformations that are not semisimple.

9. Let A be an algebra of commuting linear transformations in a complex space E.
a) Construct a decomposition E = E₁ ⊕ ⋯ ⊕ E_r such that for any φ ∈ A each E_i is stable under φ and the minimum polynomial of the induced transformation φ_i is of the form (t − λ_i(φ))^{k_i}.
b) Show that the mapping A → ℂ given by φ ↦ λ_i(φ), φ ∈ A, is a linear function in A. Prove that λ_i preserves products and so is a homomorphism.
c) Show that the nilpotent transformations in A form an ideal which is precisely rad A (cf. chap. V, § 2). Consider the subspace T of L(A) generated by the λ_i. Prove that rad A = T^⊥.
d) Prove that the semisimple transformations in A form a subalgebra, A_S. Consider the linear functions in A_S obtained by restricting the λ_i to A_S. Show that they generate (linearly) the dual space L(A_S). Prove that the restriction mapping T → L(A_S) is a linear isomorphism.
10. Assume that E is a complex vector space of dimension n. Prove that every commutative algebra of semisimple transformations is contained in an n-dimensional commutative algebra of semisimple transformations.

11. Calculate the semisimple and nilpotent parts of the linear transformations of problem 7, § 1.

§ 7. Applications to inner product spaces
In this concluding paragraph we shall apply our general decomposition theorems to inner product spaces. Decompositions of an inner product space into irreducible subspaces with respect to selfadjoint mappings, skew mappings and isometries have already been constructed in chap. VIII. Generalizing these results we shall now construct a decomposition for a normal transformation. Since a complex linear space is fully reducible with respect to a normal endomorphism (cf. sec. 11.10), we can restrict ourselves to real inner product spaces.
13.28. Normal transformations. Let E be an inner product space and φ: E → E be a normal transformation (cf. sec. 8.5). It is clear that every polynomial in φ is again normal. Moreover, since the rank of φ^k (k = 2, 3, …) is equal to the rank of φ, it follows that φ is nilpotent only if φ = 0. Now consider the decomposition of the minimum polynomial into its prime factors,

    μ = f₁^{k₁} ⋯ f_r^{k_r},    (13.66)

and the corresponding decomposition of E into the generalized eigenspaces,

    E = E₁ ⊕ ⋯ ⊕ E_r.    (13.67)

Since the projection operators π_i associated with the decomposition (13.67) are polynomials in φ, they are normal. On the other hand, π_i² = π_i, and so it follows from sec. 8.11 that the π_i are selfadjoint. Now let x_i ∈ E_i and x_j ∈ E_j be arbitrary. Then

    (x_i, x_j) = (π_i x_i, π_j x_j) = (x_i, π_i π_j x_j) = 0,   i ≠ j;

i.e., the decomposition (13.67) is orthogonal.

Now consider the induced transformations φ_i: E_i → E_i. It follows from sec. 8.5 that the φ_i are again normal, and hence so are the transformations f_i(φ_i). On the other hand, f_i(φ_i) is nilpotent. It follows that f_i(φ_i) = 0, and hence all the exponents in (13.66) are equal to 1. Now Theorem I of sec. 13.24 implies that a normal transformation is semisimple.

Theorem I: Let E be an inner product space. Then a linear transformation φ is normal if and only if
i) the generalized eigenspaces are mutually orthogonal;
ii) the restrictions φ_i are homothetic (cf. sec. 8.19).

Proof: Let φ be a normal transformation. It has been shown already that the spaces E_i are mutually orthogonal. Now consider the minimum polynomial, f_i, of the induced transformation φ_i. Since f_i is irreducible over ℝ, it follows that either

    f_i = t − λ_i    (13.68)

or

    f_i = t² + α_i t + β_i,   α_i² − 4β_i < 0.    (13.69)

In the first case we have that φ_i = λ_i ι. Now consider the case (13.69).