\[
\varphi_{ij} = \sum_{m=0}^{j} \frac{1}{m!}\,F_{\lambda_0}^{(m)}(\lambda_0)\,\psi_{i,j-m}, \qquad j = 0, \ldots, \kappa_i - 1, \quad i = 1, \ldots, r.
\]
Indeed, this follows from Proposition 1.11 and the definition of a canonical set of Jordan chains, taking into consideration that φ_{i0} = F_{λ_0}(λ_0)ψ_{i0}, i = 1, ..., r, and therefore ψ_{10}, ..., ψ_{r0} are linearly independent if and only if
φ_{10}, ..., φ_{r0} are linearly independent. In particular, the lengths of the Jordan chains in a canonical set corresponding to λ_0 for L(λ) and D_{λ_0}(λ) are the same, and Proposition 1.13 follows. □

Corollary 1.14. The sum Σ_{i=1}^{r} κ_i of the lengths of the Jordan chains in a canonical set corresponding to an eigenvalue λ_0 of a monic matrix polynomial L(λ) coincides with the multiplicity of λ_0 as a zero of det L(λ).
The next proposition shows that a canonical system plays the role of a basis in the set of all Jordan chains of L(λ) corresponding to a given eigenvalue λ_0. Let μ be the length of the longest possible Jordan chain of L(λ) corresponding to λ_0. It will be convenient to introduce the subspace 𝒩 ⊂ ℂ^{nμ} consisting of all sequences (y_0, ..., y_{μ−1}) of n-dimensional vectors such that
\[
\begin{bmatrix}
L(\lambda_0) & 0 & \cdots & 0 \\
\frac{1}{1!}L'(\lambda_0) & L(\lambda_0) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\frac{1}{(\mu-1)!}L^{(\mu-1)}(\lambda_0) & \frac{1}{(\mu-2)!}L^{(\mu-2)}(\lambda_0) & \cdots & L(\lambda_0)
\end{bmatrix}
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{\mu-1} \end{bmatrix} = 0.
\tag{1.43}
\]
We have already mentioned that 𝒩 consists of the Jordan chains of L(λ) corresponding to λ_0, after we drop the initial zero vectors (if any) in the sequence (y_0, ..., y_{μ−1}) ∈ 𝒩.

Proposition 1.15.
Let
\[
\varphi_{i0}, \ldots, \varphi_{i,\mu_i-1}, \qquad i = 1, \ldots, s,
\tag{1.44}
\]
be a set of Jordan chains of a monic matrix polynomial L(λ) corresponding to λ_0. Then the following conditions are equivalent:

(i) the set (1.44) is canonical;
(ii) the eigenvectors φ_{10}, ..., φ_{s0} are linearly independent and Σ_{i=1}^{s} μ_i = σ, the multiplicity of λ_0 as a zero of det L(λ);
(iii) the set of sequences
\[
y_{ij} = (0, \ldots, 0, \varphi_{i0}, \ldots, \varphi_{ij}), \qquad j = 0, \ldots, \mu_i - 1, \quad i = 1, \ldots, s,
\tag{1.45}
\]
where the number of zero vectors preceding φ_{i0} in y_{ij} is μ − (j + 1), forms a basis in 𝒩.
Proof. Again, we shall use the reduction to a local Smith form. For brevity, write ℒ for the matrix appearing on the left of (1.43). Similarly, 𝒜 and ℬ will denote the corresponding matrices formed from the matrix polynomials
A(λ) and B(λ) at λ_0, respectively. First let us make the following observation: let A(λ) and B(λ) be matrix polynomials nonsingular at λ = λ_0. Denote by 𝒩̃ ⊂ ℂ^{nμ} the subspace defined by formula (1.43) where L(λ) is replaced by L̃(λ) = A(λ)L(λ)B(λ). Then

𝒩 = ℬ𝒩̃.   (1.46)
(Note that according to Proposition 1.11 the length μ of the longest Jordan chain corresponding to λ_0 is the same for L(λ) and L̃(λ).) Indeed, (1.46) follows from the formula ℒ̃ = 𝒜ℒℬ, where ℒ̃ is the matrix on the left of (1.43) with the blocks (1/j!)L^{(j)}(λ_0) replaced by (1/j!)L̃^{(j)}(λ_0), j = 0, 1, ..., and 𝒜, ℬ are formed in the same way from A(λ) and B(λ).
Let now col(0, 0, ..., x) ∈ Im S, where

S = Im[Δ(F_0, F_1, ..., F_{μ−1}) · Δ(L_1, L_2, ..., L_s)],

so

col(0, 0, ..., x) = Δ(F_0, F_1, ..., F_{μ−1}) · Δ(L_1, L_2, ..., L_s) col(z_1, ..., z_s)

for some n-dimensional vectors z_1, ..., z_s. Multiplying this equality from the left by Δ(F_0, F_1, ..., F_{μ−1}), where F_j = (1/j!)F^{(j)}(λ_0), ...

... It is convenient for us to write this canonical form in terms of pairs of matrices (X_i, J_i), i = 1, ..., where X_i, Y_i ∈ ℂ^n. For example, a pair of linear transformations (X, T), where X: ℂ^{nl} → ℂ^n and T: ℂ^{nl} → ℂ^{nl}, is a standard pair for the operator polynomial L(λ) if col(XT^i)_{i=0}^{l−1} is an invertible linear transformation and
\[
\sum_{i=0}^{l-1} A_i X T^i + X T^l = 0.
\]
Theorem 1.25 in the framework of operator polynomials will sound as follows: any two standard pairs (X, T) and (X', T') of the operator polynomial L(λ) are similar, i.e., there exists an invertible linear transformation S: ℂ^{nl} → ℂ^{nl} such that X' = XS and T' = S^{−1}TS. In the sequel we shall often give the definitions and statements for matrix polynomials only, bearing in mind that it is a trivial exercise to carry them over to the operator polynomial setting.
Comments

A good source for the theory of linearization of analytic operator functions, for the history, and for further references is [28]. Further developments appear in [7, 66a, 66b]. The contents of Section 1.3 (solution of the inverse problem for linearization) originated in [3a]. The notion of a Jordan chain is basic. For polynomials of the type Iλ − A, where A is a square matrix, this notion is well known from linear algebra (see, for instance, [22]). One of the earliest systematic uses of these chains of generalized eigenvectors is by Keldysh [49a, 49b]; hence the name Keldysh chains often found in the literature. Keldysh's analysis is in the context of infinite-dimensional spaces. A detailed exposition of his results can be found in [32b, Chapter V]. Further development of the theory of chains in the infinite-dimensional case for analytic and meromorphic functions appears in [60, 24, 38]; see also [5]. Proposition 1.17 appears in [24]. The last three sections are taken from the authors' papers [34a, 34b]. Properties of standard pairs (in the context of operator polynomials) are used in [76a] to study certain problems in partial differential equations.
Chapter 2
Representation of Monic Matrix Polynomials
In Chapter 1, a language and formalism have been developed for the full description of eigenvalues, eigenvectors, and Jordan chains of matrix polynomials. In this chapter, triples of matrices will be introduced which determine completely all the spectral information about a matrix polynomial. It will then be shown how these triples can be used to solve the inverse problem, namely, given the spectral data to determine the coefficient matrices of the polynomial. Results of this (and a related) kind are given by the representation theorems. They will lead to important applications to constant coefficient differential and difference equations. In essence, these applications yield closed form solutions to boundary and initial value problems in terms of the spectral properties of an underlying matrix polynomial.
2.1. Standard and Jordan Triples
Let L(λ) = Iλ^l + Σ_{j=0}^{l−1} A_jλ^j be a monic matrix polynomial with standard pair (X, T). By the definition of such a pair (Section 1.10), a third matrix Y of size
nl × n can be defined by

\[
Y = \begin{bmatrix} X \\ XT \\ \vdots \\ XT^{l-1} \end{bmatrix}^{-1}
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ I \end{bmatrix}.
\tag{2.1}
\]
Then (X, T, Y) is called a standard triple for L(λ). If T = J is a Jordan matrix (so that the standard pair (X, T) is Jordan), the triple (X, T, Y) will be called Jordan. From Theorem 1.25 and the definition of a standard triple it follows immediately that any two standard triples (X, T, Y) and (X', T', Y') are similar, i.e., for some nl × nl invertible matrix S the relations

X' = XS,   T' = S^{−1}TS,   Y' = S^{−1}Y   (2.2)
hold. The converse statement is also true: if (X, T, Y) is a standard triple of L(λ), and (X', T', Y') is a triple of matrices, with the sizes of X', T', and Y' equal to n × nl, nl × nl, and nl × n, respectively, such that (2.2) holds, then (X', T', Y') is a standard triple for L(λ). This property is very useful since it allows us to reduce proofs to cases in which the standard triple is chosen to be especially simple. For example, one may choose the triple (based on the companion matrix C_1)
\[
X = [\,I \;\; 0 \;\; \cdots \;\; 0\,], \qquad T = C_1, \qquad Y = \operatorname{col}(0, \ldots, 0, I).
\tag{2.3}
\]
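Numerically, the triple (2.3) is easy to check against the resolvent form L(λ)^{−1} = X(Iλ − T)^{−1}Y used throughout this chapter. The following sketch (the 2 × 2 quadratic coefficients are our own illustrative data, not from the text) builds C_1 and verifies the identity at a sample point:

```python
import numpy as np

def companion(A_coeffs):
    """First companion matrix C1 of L(lam) = I lam^l + sum_j A_j lam^j,
    where A_coeffs = [A_0, ..., A_{l-1}], each n x n."""
    l = len(A_coeffs)
    n = A_coeffs[0].shape[0]
    C = np.zeros((n * l, n * l))
    for i in range(l - 1):                       # superdiagonal identity blocks
        C[i*n:(i+1)*n, (i+1)*n:(i+2)*n] = np.eye(n)
    for j in range(l):                           # last block row: -A_0 ... -A_{l-1}
        C[(l-1)*n:, j*n:(j+1)*n] = -A_coeffs[j]
    return C

# toy quadratic L(lam) = I lam^2 + A1 lam + A0, n = 2
A0 = np.array([[2., 1.], [0., 3.]])
A1 = np.array([[1., 0.], [1., 1.]])
n, l = 2, 2
T = companion([A0, A1])                          # T = C_1
X = np.hstack([np.eye(n), np.zeros((n, n))])     # X = [I 0]
Y = np.vstack([np.zeros((n, n)), np.eye(n)])     # Y = col(0, I)

lam = 1.7                                        # a point outside the spectrum
L = lam**2 * np.eye(n) + lam * A1 + A0
assert np.allclose(X @ np.linalg.inv(lam * np.eye(n*l) - T) @ Y,
                   np.linalg.inv(L))
```

The check works at any λ where both L(λ) and Iλ − C_1 are invertible, since det L(λ) = det(Iλ − C_1).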
In the next sections we shall develop representations of L(λ) via its standard triples, as well as some explicit formulas (in terms of the spectral triple of L(λ)) for the solution of initial and boundary value problems for differential equations with constant coefficients. Here we shall establish some simple but useful properties of standard triples. First recall the second companion matrix C_2 of L(λ), defined by
\[
C_2 = \begin{bmatrix}
0 & 0 & \cdots & 0 & -A_0 \\
I & 0 & \cdots & 0 & -A_1 \\
0 & I & \cdots & 0 & -A_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & I & -A_{l-1}
\end{bmatrix}.
\]
The following equality is verified by direct multiplication:

C_2 = BC_1B^{−1},   (2.4)
where

\[
B = \begin{bmatrix}
A_1 & A_2 & \cdots & A_{l-1} & I \\
A_2 & A_3 & \cdots & I & 0 \\
\vdots & \vdots & ⋰ & & \vdots \\
A_{l-1} & I & & & \\
I & 0 & \cdots & & 0
\end{bmatrix}
\tag{2.5}
\]
is an invertible nl × nl matrix. In particular, C_2 is, like C_1, a linearization of L(λ) (refer to Proposition 1.3). Define now an nl × nl invertible matrix R by the equality R = (BQ)^{−1}, with B defined by (2.5) and Q = col(XT^i)_{i=0}^{l−1}, or, what is equivalent,
RBQ = I.   (2.6)
We shall refer to this as the biorthogonality condition for R and Q. We now have C_2BQ = BC_1Q = BQT, whence

RC_2 = TR.   (2.7)
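Equalities (2.4), (2.6), and (2.7) can be confirmed directly for l = 2, where (2.5) reduces to B = [A_1 I; I 0]. A small numpy sketch with illustrative coefficients of our own choosing:

```python
import numpy as np

A0 = np.array([[2., 1.], [0., 3.]])
A1 = np.array([[1., 0.], [1., 1.]])
n = 2
I, Z = np.eye(n), np.zeros((n, n))

C1 = np.block([[Z, I], [-A0, -A1]])              # first companion matrix
C2 = np.block([[Z, -A0], [I, -A1]])              # second companion matrix
B  = np.block([[A1, I], [I, Z]])                 # the matrix (2.5) for l = 2

assert np.allclose(C2, B @ C1 @ np.linalg.inv(B))   # equality (2.4)

X = np.hstack([I, Z])                            # companion standard pair
Q = np.vstack([X, X @ C1])                       # Q = col(X T^i); here Q = I
R = np.linalg.inv(B @ Q)
assert np.allclose(R @ B @ Q, np.eye(2 * n))     # biorthogonality (2.6)
assert np.allclose(R @ C2, C1 @ R)               # intertwining (2.7)
```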
Now represent R as a block row [R_1 ··· R_l], where each R_i is an nl × n matrix. First observe that R_1 = Y, with Y defined by (2.1). Indeed,

R = (BQ)^{−1} = Q^{−1}B^{−1},   (2.8)
where the matrix B^{−1} has the following form (as can be checked directly):

\[
B^{-1} = \begin{bmatrix}
0 & \cdots & 0 & B_0 \\
\vdots & ⋰ & B_0 & B_1 \\
0 & ⋰ & ⋰ & \vdots \\
B_0 & B_1 & \cdots & B_{l-1}
\end{bmatrix},
\tag{2.9}
\]

with B_0 = I and B_1, ..., B_{l−1} defined recursively by

B_{r+1} = −Σ_{j=0}^{r} A_{l−1−j}B_{r−j},   r = 0, ..., l − 2.
Substituting (2.9) into (2.8) yields R_1 = Y by definition. Further, (2.7) gives (taking into account the structure of C_2)

R_i = T^{i−1}Y   for i = 1, ..., l,

and

YA_0 + TYA_1 + ··· + T^{l−1}YA_{l−1} + T^lY = 0.   (2.10)
We have, in particular, that for any standard triple (X, T, Y) the nl × nl matrix row(T^iY)_{i=0}^{l−1} = [Y TY ··· T^{l−1}Y] is invertible. Note the following useful equalities:

\[
\begin{bmatrix} X \\ XT \\ \vdots \\ XT^{l-1} \end{bmatrix}
[\,Y \;\; TY \;\; \cdots \;\; T^{l-1}Y\,] = B^{-1},
\tag{2.11}
\]

where B^{−1} is given by (2.9). In particular,

XT^iY = 0 for i = 0, ..., l − 2,   and   XT^{l−1}Y = I.   (2.12)
We summarize the main information observed above in the following statement.

Proposition 2.1. If (X, T, Y) is a standard triple of L(λ), then:

(i) row(T^jY)_{j=0}^{l−1} = [Y TY ··· T^{l−1}Y] is an nl × nl nonsingular matrix;
(ii) X is uniquely defined by T and Y: X = [0 ··· 0 I]·[row(T^jY)_{j=0}^{l−1}]^{−1};
(iii) YA_0 + TYA_1 + ··· + T^{l−1}YA_{l−1} + T^lY = 0.

Proof. Parts (i) and (iii) were proved above. Part (ii) is just equalities (2.12) written in a different form. □
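The moment conditions (2.12) and Proposition 2.1(iii) are easy to test for the companion standard triple. A sketch with randomly generated coefficients (our own toy data, not from the text):

```python
import numpy as np

# toy monic L(lam) = I lam^3 + A2 lam^2 + A1 lam + A0, n = 2
rng = np.random.default_rng(0)
n, l = 2, 3
A = [rng.standard_normal((n, n)) for _ in range(l)]   # A_0, A_1, A_2

# companion standard triple: X = [I 0 0], T = C1, Y = col(0, 0, I)
T = np.zeros((n*l, n*l))
for i in range(l - 1):
    T[i*n:(i+1)*n, (i+1)*n:(i+2)*n] = np.eye(n)
for j in range(l):
    T[(l-1)*n:, j*n:(j+1)*n] = -A[j]
X = np.hstack([np.eye(n)] + [np.zeros((n, n))] * (l - 1))
Y = np.vstack([np.zeros((n, n))] * (l - 1) + [np.eye(n)])

# the moment conditions (2.12)
for i in range(l - 1):
    assert np.allclose(X @ np.linalg.matrix_power(T, i) @ Y, 0)
assert np.allclose(X @ np.linalg.matrix_power(T, l - 1) @ Y, np.eye(n))

# Proposition 2.1(iii): Y A0 + T Y A1 + T^2 Y A2 + T^3 Y = 0
S = sum(np.linalg.matrix_power(T, j) @ Y @ A[j] for j in range(l))
S = S + np.linalg.matrix_power(T, l) @ Y
assert np.allclose(S, 0)
```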
For the purpose of further reference we define the notion of a left standard pair, which is dual to the notion of a standard pair. A pair of matrices (T, Y), where T is nl × nl and Y is nl × n, is called a left standard pair for the monic matrix polynomial L(λ) if conditions (i) and (iii) of Proposition 2.1 hold. An
equivalent definition is: (T, Y) is a left standard pair of L(λ) if (X, T, Y) is a standard triple of L(λ), where X is defined by (ii) of Proposition 2.1. Note that (T, Y) is a left standard pair for L(λ) iff (Y^T, T^T) is a (usual) standard pair for L^T(λ), so every statement on standard pairs has its dual statement for left standard pairs. For this reason we shall use left standard pairs only occasionally and omit proofs of the dual statements, once the proof of the original statement for a standard pair is supplied. We remark here that the notion of the second companion matrix C_2 allows us to define another simple standard triple (X_0, T_0, Y_0), which is in a certain sense dual to (2.3), as

X_0 = [0 ··· 0 I],   T_0 = C_2,   Y_0 = col(I, 0, ..., 0).   (2.13)
It is easy to check that the triple (X_0, T_0, Y_0) is similar to (2.3):

X_0 = [I 0 ··· 0]B^{−1},   C_2 = BC_1B^{−1},
Y_0 = B col(0, ..., 0, I).

We shall reduce the proof to the case when ν = nl. To this end we shall appeal to some well-known notions and results in realization theory for linear systems, which will be stated here without proofs. Let W(λ) be a rational matrix function. For a given λ_0 ∈ ℂ define the local degree δ(W; λ_0) (cf. [3c, Chapter IV]):
\[
\delta(W; \lambda_0) = \operatorname{rank}
\begin{bmatrix}
W_{-q} & W_{-q+1} & \cdots & W_{-1} \\
0 & W_{-q} & \cdots & W_{-2} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & W_{-q}
\end{bmatrix},
\]

where W(λ) = Σ_{j=−q}^{∞} (λ − λ_0)^jW_j is the Laurent expansion of W(λ) in a neighborhood of λ_0. Define also δ(W; ∞) = δ(W̃; 0), where W̃(λ) = W(λ^{−1}). Evidently, δ(W; λ_0) is nonzero only for finitely many complex numbers λ_0. Put

\[
\delta(W) = \sum_{\lambda \in \mathbb{C} \cup \{\infty\}} \delta(W; \lambda).
\]
The number δ(W) is called the McMillan degree of W(λ) (see [3c, 80]). For rational functions W(λ) which are analytic at infinity, the following equality holds (see [81] and [3c, Section 4.2]):

\[
\delta(W) = \operatorname{rank}
\begin{bmatrix}
D_1 & D_2 & \cdots & D_m \\
D_2 & D_3 & \cdots & D_{m+1} \\
\vdots & & & \vdots \\
D_m & D_{m+1} & \cdots & D_{2m-1}
\end{bmatrix},
\tag{2.33}
\]

where the D_i are the coefficients of the Taylor series of W(λ) at infinity, W(λ) = Σ_{i=0}^{∞} D_iλ^{−i}, and m is a sufficiently large integer. In our case F(λ) = Q(Iλ − T)^{−1}R, where (Q, T, R) satisfies (2.28) and (2.29). In particular, equalities (2.29) and (2.33) ensure that δ(F) = nl. By a well-known result in realization theory (see, for instance, [15a, Theorem 4.4] or [14, Section 4.4]), there exists a triple of matrices (Q_0, T_0, R_0) of sizes n × nl, nl × nl, nl × n, respectively, such that Q_0(Iλ − T_0)^{−1}R_0 = Q(Iλ − T)^{−1}R. Now use (2.32) for (Q_0, T_0, R_0) in place of (Q, T, R) to prove that (Q_0, T_0, R_0) is in fact a standard triple of some monic matrix polynomial of degree l. □
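Formula (2.33) can be illustrated by taking W(λ) = L(λ)^{−1}: its Taylor coefficients at infinity are the Markov parameters D_0 = 0, D_i = XT^{i−1}Y, and the Hankel rank then comes out to nl. A sketch (toy data of our own):

```python
import numpy as np

# W(lam) = L(lam)^{-1} for a toy monic L of degree l, size n
rng = np.random.default_rng(1)
n, l = 2, 2
A = [rng.standard_normal((n, n)) for _ in range(l)]
T = np.block([[np.zeros((n, n)), np.eye(n)], [-A[0], -A[1]]])   # C_1
X = np.hstack([np.eye(n), np.zeros((n, n))])
Y = np.vstack([np.zeros((n, n)), np.eye(n)])

m = 2 * l                         # "sufficiently large" m
# Markov parameters D_1, D_2, ..., D_{2m-1}; D_i = X T^{i-1} Y
D = [X @ np.linalg.matrix_power(T, i - 1) @ Y for i in range(1, 2 * m)]
# block Hankel matrix [D_{i+j-1}]_{i,j=1}^m
H = np.block([[D[i + j] for j in range(m)] for i in range(m)])
assert np.linalg.matrix_rank(H) == n * l     # McMillan degree of L^{-1} is nl
```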
2.4. Initial Value Problems and Two-Point Boundary Value Problems
In this section we shall obtain explicit formulas for the solutions of the initial value problem and a boundary value problem associated with the differential equation
\[
L\!\left(\frac{d}{dt}\right)x(t) = \frac{d^l x(t)}{dt^l} + \sum_{j=0}^{l-1} A_j \frac{d^j x(t)}{dt^j} = f(t), \qquad t \in [a, b],
\tag{2.34}
\]
where L(λ) = Iλ^l + Σ_{j=0}^{l−1} A_jλ^j is a monic matrix polynomial, f(t) is a known vector function on [a, b] (which will be supposed piecewise continuous), and x(t) is a vector function to be found. The results will be described in terms of a standard triple (X, T, Y) of L(λ). The next result is deduced easily from Theorem 1.5 using the fact that all standard triples of L(λ) are similar to each other.

Theorem 2.9. The general solution of equation (2.34) is given by the formula

\[
x(t) = Xe^{tT}c + \int_a^t Xe^{(t-s)T}Yf(s)\,ds, \qquad t \in [a, b],
\tag{2.35}
\]

where (X, T, Y) is a standard triple of L(λ) and c ∈ ℂ^{nl} is arbitrary. In particular, the general solution of the homogeneous equation

L(d/dt)x(t) = 0   (2.36)

is given by the formula x(t) = Xe^{tT}c, c ∈ ℂ^{nl}.
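The homogeneous formula x(t) = Xe^{tT}c can be checked numerically: the identity A_0X + A_1XT + XT² = 0 for a standard pair forces every such x(t) to satisfy (2.36). A sketch with our own toy quadratic polynomial (the small Taylor-series matrix exponential is only for self-containment):

```python
import numpy as np

def expm(M, terms=40):
    """Taylor-series matrix exponential; adequate for the small norms used here."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ (M / k)
        out = out + term
    return out

# toy L(lam) = I lam^2 + A1 lam + A0 with its companion standard pair
A0 = np.array([[2., 1.], [0., 3.]])
A1 = np.array([[1., 0.], [1., 1.]])
n = 2
T = np.block([[np.zeros((n, n)), np.eye(n)], [-A0, -A1]])
X = np.hstack([np.eye(n), np.zeros((n, n))])

c = np.array([1., -2., 0.5, 3.])         # arbitrary c in C^{nl}
t = 0.3
E = expm(t * T)
x   = X @ E @ c                          # x(t)   = X e^{tT} c
dx  = X @ T @ E @ c                      # x'(t)  = X T e^{tT} c
ddx = X @ T @ T @ E @ c                  # x''(t) = X T^2 e^{tT} c
assert np.allclose(ddx + A1 @ dx + A0 @ x, 0, atol=1e-8)
```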
Consider now the initial value problem: find a function x(t) such that (2.34) holds and x ... + Ju^{(2)} = f, with the boundary conditions u(a) = u(b) = 0. These boundary conditions are obtained by defining the 2n × 2n matrices
2.5. Complete Pairs and Second-Order Differential Equations
where the blocks are all n x n. Insisting that the homogeneous problem have only the trivial solution then implies that the matrix (2.47) appearing in (2.43) is nonsingular. In this case we have V(Go) =
-M[
Xl ]eit
J 2 , respectively, and let S 1 =
We show first that if S_1, S_2 is a complete pair, then (X, J) is a Jordan pair for L(λ). We have
So the nonsingularity of the first matrix follows from that of S_2 − S_1. Then S_i² + A_1S_i + A_0 = 0, for i = 1, 2, implies X_iJ_i² + A_1X_iJ_i + A_0X_i = 0 for each i, and hence XJ² + A_1XJ + A_0X = 0. Hence (by Theorem 1.23) (X, J) is a Jordan pair for L(λ) and, in the terminology of the preceding section, J_1 and J_2 determine an admissible splitting of J. To prove part (a) of the theorem observe that
u(t) = [e^{S_1t}  e^{S_2t}] col(c_1, c_2) = [X_1e^{J_1t}  X_2e^{J_2t}] col(c̃_1, c̃_2) = Xe^{Jt}c̃ ...

... by the following equalities: s^{(4)}_{β_i′,β_i} = c_{ii} for i = 1, ..., k; s^{(4)}_{mm} = 1 for m ∉ {β_1′, ..., β_k′}, where β_i′ = β_k + β_i; s^{(4)}_{ij} = 0 otherwise. Then S_4T_3S_4^{−1} = triang[J, W_4, J^T + Z_4], where the (β_i, β_j) entries of W_4 form the unit matrix I, and all other entries are zeros; the α × α matrix Z_4 can contain nonzero entries only in the places (β_i, β_j − 1) for i < j and such that α_i > 1. Again, it is easy to see that S_4T_3S_4^{−1} is similar to T_4 = triang[J, W_4, J^T]. Evidently, the degrees of the elementary divisors of T_4 are 2α_1, ..., 2α_k. So the same is true for T. □
In concrete examples it is possible to find the Jordan form of T by considering the matrix

\[
\begin{bmatrix} T_1 & R_{12} \\ 0 & T_2 \end{bmatrix},
\]

as we shall now show by example.
Example 3.1. Let L_1(λ) be as in Example 2.2. Using the computations in Example 2.2 and Theorem 3.2, we find a linearization T for the product L(λ) = (L_1(λ))²:

\[
T = \begin{bmatrix} J & YX \\ 0 & J \end{bmatrix},
\]

where
X=[~
0
-fi+
fi-2 fi
I
+I
fi+2] 0 ,
0 0 -1
Hfi + 2) Y= ¥-fi-1) ¥-fi + 2) ¥-fi + 1)
1 0 0 1
4
0 0
J=
'
0
0
-1 0 -1
1
-4
By examining the structure of T we find the partial multiplicities of L(λ) at λ_0 = 1. To this end we compute the 4 × 4 submatrix T_0 of T formed by the 3rd, 4th, 9th, and 10th rows and columns (the partial multiplicities of L(λ) at λ_0 = 1 coincide with the degrees of the elementary divisors of Iλ − T_0). We have
\[
T_0 = \begin{bmatrix}
1 & 1 & -\tfrac{1}{2}\sqrt{2} & \tfrac{1}{4}(2\sqrt{2}-3) \\
0 & 1 & \tfrac{1}{2} & -\tfrac{1}{4} \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{bmatrix}.
\]

Since rank(T_0 − I) = 3, it is clear that the single partial multiplicity of L(λ) at λ_0 = 1 is 4. □
3.2. Division Process
In this section we shall describe the division of monic matrix polynomials in terms of their standard triples. We start with some remarks concerning the division of matrix polynomials in general. Let M(λ) = Σ_{j=0}^{l} M_jλ^j and N(λ) = Σ_{j=0}^{k} N_jλ^j be matrix polynomials of size n × n (not necessarily monic). By the division of these matrix polynomials we understand a representation in the form

M(λ) = Q(λ)N(λ) + R(λ),   (3.4)
where Q(λ) (the quotient) and R(λ) (the remainder) are n × n matrix polynomials and either the degree of R(λ) is less than the degree of N(λ) or R(λ) is zero. This definition generalizes the division of scalar polynomials, which is always possible and for which the quotient and remainder are uniquely defined. However, in the matricial case it is very important to point out two factors which do not appear in the scalar case:

(1) Because of the noncommutativity, we have to distinguish between representation (3.4) (which will be referred to as right division) and the representation

M(λ) = N(λ)Q_1(λ) + R_1(λ)   (3.5)

for some matrix polynomials Q_1(λ) and R_1(λ), where the degree of R_1(λ) is less than the degree of N(λ), or R_1(λ) is the zero polynomial. Representation (3.5) will be referred to as left division. In general, Q(λ) ≠ Q_1(λ) and R(λ) ≠ R_1(λ), so we distinguish between right (Q(λ)) and left (Q_1(λ)) quotients and between right (R(λ)) and left (R_1(λ)) remainders.

(2) The division is not always possible. The simplest example of this situation appears if we take M(λ) = M_0 a constant nonsingular matrix and N(λ) = N_0 a constant nonzero singular matrix. If the division is possible, then (since the degree of N(λ) is 0) the remainders R(λ) and R_1(λ) must be zero, and then (3.4) and (3.5) take the forms

M_0 = Q(λ)N_0,   M_0 = N_0Q_1(λ),   (3.6)

respectively. But in view of the invertibility of M_0 and the singularity of N_0, neither equality in (3.6) can be satisfied for any Q(λ) or Q_1(λ), so the division is impossible. However, in the important case when N(λ) (the divisor) is monic, the situation more closely resembles the familiar case of scalar polynomials, as the following proposition shows.
Proposition 3.5. If the divisor N(λ) is monic, then the right (or left) division is always possible, and the right (or left) quotient and remainder are uniquely determined.

Proof. We shall prove Proposition 3.5 for right division; for left division the proof is similar. Let us prove first the possibility of division, i.e., that there exist Q(λ) and R(λ) (deg R(λ) < deg N(λ)) such that (3.4) holds. We can suppose l ≥ k; otherwise take Q(λ) = 0 and R(λ) = M(λ). Let N(λ) = Iλ^k + Σ_{j=0}^{k−1} N_jλ^j, and write

M(λ) = (Σ_{j=0}^{l−k} Q_jλ^j)N(λ) + Σ_{j=0}^{k−1} R_jλ^j   (3.7)
with indeterminate coefficients Q_j and R_j. Now compare the coefficients of λ^l, λ^{l−1}, ..., λ^k on both sides:

\[
\begin{aligned}
M_l &= Q_{l-k}, \\
M_{l-1} &= Q_{l-k-1} + Q_{l-k}N_{k-1}, \\
&\;\;\vdots \\
M_k &= Q_0 + \sum_{i=0}^{k-1} Q_{k-i}N_i.
\end{aligned}
\tag{3.8}
\]

From these equalities we find, in succession, Q_{l−k}, Q_{l−k−1}, ..., Q_0; i.e., we have found Q(λ) = Σ_{j=0}^{l−k} λ^jQ_j. To satisfy (3.7), define also

R_j = M_j − Σ_{q=0}^{j} Q_qN_{j−q}   for j = 0, ..., k − 1.   (3.9)
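The recursions (3.8) and formula (3.9) translate directly into an algorithm. A sketch (the helper name right_divide and the toy coefficients are ours, not the book's):

```python
import numpy as np

def right_divide(M, N):
    """Right division M(lam) = Q(lam) N(lam) + R(lam) for a *monic* divisor N.
    M = [M_0, ..., M_l] and N = [N_0, ..., N_{k-1}, I] are lists of n x n
    coefficient matrices.  Implements the recursions (3.8) and formula (3.9)."""
    l, k = len(M) - 1, len(N) - 1
    n = M[0].shape[0]
    Q = [np.zeros((n, n)) for _ in range(l - k + 1)]
    for j in range(l - k, -1, -1):              # (3.8): top coefficient downward
        Q[j] = M[j + k].copy()
        for q in range(j + 1, l - k + 1):
            if 0 <= j + k - q <= k - 1:
                Q[j] -= Q[q] @ N[j + k - q]
    R = [M[j] - sum(Q[q] @ N[j - q] for q in range(0, min(j, l - k) + 1))
         for j in range(k)]                     # (3.9): remainder coefficients
    return Q, R

rng = np.random.default_rng(2)
n = 2
M = [rng.standard_normal((n, n)) for _ in range(5)]                 # degree 4
N = [rng.standard_normal((n, n)) for _ in range(2)] + [np.eye(n)]   # monic, degree 2
Q, R = right_divide(M, N)

# coefficient-wise check that M = Q N + R
prod = [np.zeros((n, n)) for _ in range(5)]
for q in range(len(Q)):
    for j in range(len(N)):
        prod[q + j] = prod[q + j] + Q[q] @ N[j]
for j in range(5):
    rem = R[j] if j < len(R) else np.zeros((n, n))
    assert np.allclose(M[j], prod[j] + rem)
```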
The uniqueness of the quotient and remainder is easy to prove. Suppose

M(λ) = Q(λ)N(λ) + R(λ) = Q̃(λ)N(λ) + R̃(λ),

where the degrees of both R(λ) and R̃(λ) do not exceed k − 1. Then

(Q(λ) − Q̃(λ))N(λ) + R(λ) − R̃(λ) = 0.

Since N(λ) is monic of degree k, it follows that Q(λ) − Q̃(λ) = 0, and then also R(λ) − R̃(λ) = 0. □

An interesting particular case of Proposition 3.5 occurs when the divisor N(λ) is linear: N(λ) = Iλ − X. In this case we have extensions of the remainder theorem:

Corollary 3.6. Let

Σ_{j=0}^{l} M_jλ^j = Q(λ)(Iλ − X) + R = (Iλ − X)Q_1(λ) + R_1.

Then R = Σ_{j=0}^{l} M_jX^j and R_1 = Σ_{j=0}^{l} X^jM_j.

Proof. Use formulas (3.8) and (3.9). It turns out that

Q_{i−1} = Σ_{j=i}^{l} M_jX^{j−i},   i = 1, ..., l.

According to (3.9),

R = M_0 + Q_0X = Σ_{j=0}^{l} M_jX^j.

For R_1 the proof is analogous. □
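Corollary 3.6 can be checked by running the Horner-style recursion Q_{i−1} = M_i + Q_iX to which (3.8) reduces when k = 1, with toy data of our own:

```python
import numpy as np

rng = np.random.default_rng(3)
n, l = 2, 3
M = [rng.standard_normal((n, n)) for _ in range(l + 1)]   # M(lam) = sum M_j lam^j
Xm = rng.standard_normal((n, n))                          # divisor I lam - Xm

# recursion from (3.8) with k = 1, N_0 = -Xm:  Q_{l-1} = M_l, Q_{i-1} = M_i + Q_i Xm
Qc = [None] * l
Qc[l - 1] = M[l]
for i in range(l - 1, 0, -1):
    Qc[i - 1] = M[i] + Qc[i] @ Xm
R = M[0] + Qc[0] @ Xm                                     # (3.9) gives R = M_0 + Q_0 X

# generalized remainder theorem: R = sum_j M_j X^j
assert np.allclose(R, sum(M[j] @ np.linalg.matrix_power(Xm, j)
                          for j in range(l + 1)))
```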
We go back now to the division of monic matrix polynomials. It follows from Proposition 3.5 that the division of monic matrix polynomials is always possible and uniquely determined. We now express the left and right quotients and remainders in terms of standard triples. This is the first important step toward the application of the spectral methods of this book to factorization problems.

Theorem 3.7. Let L(λ) = Iλ^l + Σ_{i=0}^{l−1} A_iλ^i be a monic matrix polynomial of degree l, and let

L_1(λ) = Iλ^k − X_1T_1^k(V_1 + V_2λ + ··· + V_kλ^{k−1})

be a monic matrix polynomial of degree k ≤ l in the right canonical form, where (X_1, T_1) is a standard pair of L_1(λ), and [V_1 ··· V_k] = [col(X_1T_1^i)_{i=0}^{k−1}]^{−1}. Then

L(λ) = Q(λ)L_1(λ) + R(λ),   (3.10)

where

Q(λ) = Σ_{p=0}^{l−k} λ^p (Σ_{i=p+1}^{l} A_iX_1T_1^{i−1−p}V_k)   (3.11)

and

R(λ) = Σ_{j=1}^{k} λ^{j−1} (Σ_{i=0}^{l} A_iX_1T_1^i)V_j,   (3.12)

where we put A_l = I.
Before we start to prove Theorem 3.7, let us write down the following important corollary. We say that L_1(λ) is a right divisor of L(λ) if

L(λ) = Q(λ)L_1(λ),

i.e., the right remainder R(λ) is zero.
Corollary 3.8. L_1(λ) is a right divisor of L(λ) if and only if, for a standard pair (X_1, T_1) of L_1(λ), the equality

A_0X_1 + A_1X_1T_1 + ··· + A_{l−1}X_1T_1^{l−1} + X_1T_1^l = 0   (3.13)

holds.

Proof. If (3.13) holds, then according to (3.12), R(λ) = 0 and L_1(λ) is a right divisor of L(λ). Conversely, if R(λ) = 0, then by (3.12)

(Σ_{i=0}^{l} A_iX_1T_1^i)V_j = 0   for j = 1, ..., k.

Since [V_1 ··· V_k] is nonsingular, this means that (3.13) holds. □
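Corollary 3.8 is easy to verify for a product L = QL_1 built by hand: with the companion standard pair of L_1, the left-hand side of (3.13) vanishes. A sketch with our own illustrative coefficients:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2
# right divisor L1(lam) = I lam^2 + B1 lam + B0, with its companion standard pair
B0, B1 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
T1 = np.block([[np.zeros((n, n)), np.eye(n)], [-B0, -B1]])
X1 = np.hstack([np.eye(n), np.zeros((n, n))])

# build L = Q * L1 with monic quotient Q(lam) = I lam + Q0, so l = 3:
# (lam + Q0)(lam^2 + B1 lam + B0) = lam^3 + (B1+Q0) lam^2 + (B0+Q0 B1) lam + Q0 B0
Q0 = rng.standard_normal((n, n))
A = [Q0 @ B0, B0 + Q0 @ B1, B1 + Q0, np.eye(n)]   # A_0, A_1, A_2, A_3 = I

# criterion (3.13): sum_j A_j X1 T1^j = 0
S = sum(A[j] @ X1 @ np.linalg.matrix_power(T1, j) for j in range(4))
assert np.allclose(S, 0)
```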
Proof of Theorem 3.7. We shall establish first some additional properties of standard triples for the monic polynomials L(λ) and L_1(λ). Define

G_{αβ} = X_1T_1^αV_β,   1 ≤ β ≤ k,   α = 0, 1, 2, ....   (3.14)

Then for each i, 1 ≤ i ≤ k, and each α we have

G_{α+1,i} = G_{α,i−1} + G_{α,k}G_{k,i}   (3.15)

(and we set G_{β,0} = 0, β = 0, 1, ...). Indeed, from the definition of the V_i we deduce
\[
I = [\,V_1 \;\; V_2 \;\; \cdots \;\; V_k\,]
\begin{bmatrix} X_1 \\ X_1T_1 \\ \vdots \\ X_1T_1^{k-1} \end{bmatrix}.
\tag{3.16}
\]

Consider the composition with X_1T_1^α on the left and T_1[V_1 ··· V_k] on the right to obtain

\[
[\,G_{\alpha+1,1} \;\; \cdots \;\; G_{\alpha+1,k}\,]
= [\,G_{\alpha 1} \;\; \cdots \;\; G_{\alpha k}\,]
\begin{bmatrix}
G_{11} & \cdots & G_{1k} \\
\vdots & & \vdots \\
G_{k1} & \cdots & G_{kk}
\end{bmatrix}.
\tag{3.17}
\]

Reversing the order of the factors on the right of (3.16), we see that G_{ij} = δ_{i,j−1}I for i = 1, 2, ..., k − 1 and j = 1, 2, ..., k. The conclusion (3.15) then follows immediately. Now we shall check the following equalities:
V_j = Σ_{m=0}^{k−j} T_1^mV_kB_{j+m},   j = 1, ..., k,   (3.18)
where the B_p are the coefficients of L_1(λ): L_1(λ) = Σ_{i=0}^{k} B_iλ^i with B_k = I. Since the matrix col(X_1T_1^i)_{i=0}^{k−1} is nonsingular, it is sufficient to check that

X_1T_1^iV_j = X_1T_1^i Σ_{m=0}^{k−j} T_1^mV_kB_{j+m},   j = 1, ..., k,   (3.19)

for i = 0, 1, .... Indeed, using the matrices G_{ij} and the right canonical form of L_1(λ), rewrite the right-hand side of (3.19) as follows (where the first equality follows from (3.15)):

\[
G_{i+k-j,k} - \sum_{m=0}^{k-j-1} G_{i+m,k}G_{k,j+m+1}
= G_{i+k-j,k} - \sum_{m=0}^{k-j-1} \bigl(G_{i+m+1,j+m+1} - G_{i+m,j+m}\bigr)
= G_{i+k-j,k} - \bigl(G_{i+k-j,k} - G_{ij}\bigr) = G_{ij},
\]

and (3.18) is proved.
Define also the following matrix polynomials: L_{1j}(λ) = B_j + B_{j+1}λ + ··· + B_kλ^{k−j}, j = 0, 1, ..., k. In particular, L_{1,0}(λ) = L_1(λ) and L_{1,k}(λ) = I. We shall need the following property of the polynomials L_{1j}(λ):

V_kL_1(λ) = (Iλ − T_1)(Σ_{i=0}^{k−1} T_1^iV_kL_{1,i+1}(λ)).   (3.20)
Indeed, since λL_{1,j+1}(λ) = L_{1,j}(λ) − B_j (j = 0, ..., k − 1), the right-hand side of (3.20) equals

\[
\sum_{j=0}^{k-1} T_1^jV_k\bigl(L_{1j}(\lambda) - B_j\bigr) - \sum_{j=0}^{k-1} T_1^{j+1}V_kL_{1,j+1}(\lambda)
= -\sum_{j=0}^{k} T_1^jV_kB_j + V_kL_{10}(\lambda).
\]
But since (X_1, T_1, V_k) is a standard triple of L_1(λ), in view of (2.10), Σ_{j=0}^{k} T_1^jV_kB_j = 0, and (3.20) follows.

We are now ready to prove that the difference L(λ) − R(λ) is divisible by L_1(λ). Indeed, using (3.18) and then (3.20),

\[
R(\lambda)(L_1(\lambda))^{-1}
= \Bigl(\sum_{i=0}^{l} A_iX_1T_1^i\Bigr)\cdot\Bigl(\sum_{j=0}^{k-1} T_1^jV_kL_{1,j+1}(\lambda)\Bigr)\cdot L_1^{-1}(\lambda)
= \sum_{i=0}^{l} A_iX_1T_1^i(I\lambda - T_1)^{-1}V_k.
\]
Thus,

\[
L(\lambda)(L_1(\lambda))^{-1} = \sum_{i=0}^{l} A_i\lambda^i \cdot X_1(I\lambda - T_1)^{-1}V_k
\]

(and here we use the resolvent form of L_1(λ)). So

\[
L(\lambda)(L_1(\lambda))^{-1} - R(\lambda)(L_1(\lambda))^{-1}
= \sum_{i=0}^{l} A_iX_1(I\lambda^i - T_1^i)(I\lambda - T_1)^{-1}V_k.
\]
Using the equality Iλ^i − T_1^i = (Σ_{p=0}^{i−1} λ^pT_1^{i−1−p})(λI − T_1) (i > 0), we obtain

\[
L(\lambda)(L_1(\lambda))^{-1} - R(\lambda)(L_1(\lambda))^{-1}
= \sum_{i=1}^{l} A_iX_1 \sum_{p=0}^{i-1} \lambda^pT_1^{i-1-p}V_k
= \sum_{p=0}^{l-1} \lambda^p \sum_{i=p+1}^{l} A_iX_1T_1^{i-1-p}V_k.
\tag{3.21}
\]

Because of the biorthogonality relations X_1T_1^jV_k = 0 for j = 0, ..., k − 2, all the terms in (3.21) with p = l − k + 1, ..., l − 1 vanish, and formula (3.11) follows. □
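Formulas (3.11) and (3.12) can be exercised numerically: for the companion standard pair of L_1 the matrix col(X_1T_1^i)_{i=0}^{k−1} is the identity, so the V_j are just the block columns of I. The sketch below (our own toy data) forms Q and R from the formulas and checks (3.10) at a sample point:

```python
import numpy as np

rng = np.random.default_rng(5)
n, l, k = 2, 3, 2
A = [rng.standard_normal((n, n)) for _ in range(l)] + [np.eye(n)]   # L, A_l = I
B = [rng.standard_normal((n, n)) for _ in range(k)] + [np.eye(n)]   # L1, monic

# companion standard pair of L1: col(X1 T1^i)_{i=0}^{k-1} = I
T1 = np.block([[np.zeros((n, n)), np.eye(n)], [-B[0], -B[1]]])
X1 = np.hstack([np.eye(n), np.zeros((n, n))])
V = [np.vstack([np.eye(n), np.zeros((n, n))]),
     np.vstack([np.zeros((n, n)), np.eye(n)])]      # V_1, V_2; V_k = V[-1]

P = lambda i: np.linalg.matrix_power(T1, i)
S = sum(A[i] @ X1 @ P(i) for i in range(l + 1))
Rc = [S @ V[j] for j in range(k)]                   # remainder coefficients, (3.12)
Qc = [sum(A[i] @ X1 @ P(i - 1 - p) @ V[-1] for i in range(p + 1, l + 1))
      for p in range(l - k + 1)]                    # quotient coefficients, (3.11)

ev = lambda C, lam: sum(C[j] * lam**j for j in range(len(C)))
lam = 0.9
assert np.allclose(ev(A, lam), ev(Qc, lam) @ ev(B, lam) + ev(Rc, lam))
```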
In terms of the matrices G_{αβ} introduced in the proof of Theorem 3.7, the condition that the coefficients of the remainder polynomial (3.12) are zeros (which means that L_1(λ) is a right divisor of L(λ)) can be conveniently expressed in matrix form as follows.

Corollary 3.9. L_1(λ) is a right divisor of L(λ) if and only if

\[
[\,G_{l1} \;\; G_{l2} \;\; \cdots \;\; G_{lk}\,]
= -[\,A_0 \;\; A_1 \;\; \cdots \;\; A_{l-1}\,]
\begin{bmatrix}
G_{01} & G_{02} & \cdots & G_{0k} \\
G_{11} & G_{12} & \cdots & G_{1k} \\
\vdots & \vdots & & \vdots \\
G_{l-1,1} & G_{l-1,2} & \cdots & G_{l-1,k}
\end{bmatrix},
\]
where the G_{αβ} are defined by (3.14).

The dual results for left division are as follows.

Theorem 3.10. Let L(λ) be as in Theorem 3.7, and let

L_1(λ) = λ^kI − (W_1 + W_2λ + ··· + W_kλ^{k−1})T_1^kY_1

be a monic matrix polynomial of degree k in its left canonical form (2.15) (so (T_1, Y_1) is a left standard pair of L_1(λ) and col(W_i)_{i=1}^{k} = [Y_1 T_1Y_1 ··· T_1^{k−1}Y_1]^{−1}). Then

L(λ) = L_1(λ)Q_1(λ) + R_1(λ),

where

Q_1(λ) = Σ_{p=0}^{l−k} λ^p (Σ_{i=p+1}^{l} W_kT_1^{i−1−p}Y_1A_i)

and

R_1(λ) = Σ_{j=1}^{k} λ^{j−1} W_j(Σ_{i=0}^{l} T_1^iY_1A_i),

where A_l = I.

Theorem 3.10 can be established either by a parallel line of argument from the proof of Theorem 3.7, or by applying Theorem 3.7 to the transposed matrix polynomials L^T(λ) and L_1^T(λ) (evidently the left division

L(λ) = L_1(λ)Q_1(λ) + R_1(λ)

of L(λ) by L_1(λ) gives rise to the right division

L^T(λ) = Q_1^T(λ)L_1^T(λ) + R_1^T(λ)

of the transposed matrix polynomials).
The following definition is now to be expected: a monic matrix polynomial L_1(λ) is a left divisor of a monic matrix polynomial L(λ) if L(λ) = L_1(λ)Q_1(λ) for some matrix polynomial Q_1(λ) (which is necessarily monic).

Corollary 3.11. L_1(λ) is a left divisor of L(λ) if and only if, for a left standard pair (T_1, Y_1) of L_1(λ), the equality

Σ_{j=0}^{l−1} T_1^jY_1A_j + T_1^lY_1 = 0

holds.

This corollary follows from Theorem 3.10 in the same way as Corollary 3.8 follows from Theorem 3.7.
3.3. Characterization of Divisors and Supporting Subspaces
In this section we begin to study the monic divisors (right and left) of a monic matrix polynomial L(λ) = Iλ^l + Σ_{i=0}^{l−1} A_iλ^i. The starting points for our study are Corollaries 3.8 and 3.11. As usual, we shall formulate and prove our results for right divisors, while the dual results for left divisors will be stated without proof.

The main result here (Theorem 3.12) provides a geometric characterization of monic right divisors. A hint for such a characterization is already contained in the multiplication theorem (Theorem 3.2). Let L(λ) = L_2(λ)L_1(λ) be a product of two monic matrix polynomials L_1(λ) and L_2(λ), and let (X, T, Y) be a standard triple of L(λ) such that

\[
X = [\,X_1 \;\; 0\,], \qquad
T = \begin{bmatrix} T_1 & Y_1X_2 \\ 0 & T_2 \end{bmatrix}, \qquad
Y = \begin{bmatrix} 0 \\ Y_2 \end{bmatrix},
\tag{3.22}
\]
and (X_i, T_i, Y_i) (i = 1, 2) is a standard triple for L_i(λ) (cf. Theorem 3.2). The form of T in (3.22) suggests the existence of a T-invariant subspace: namely, the subspace ℳ spanned by the first nk unit coordinate vectors in ℂ^{nl} (here l (resp. k) is the degree of L(λ) (resp. L_1(λ))). This subspace ℳ can be attached to the polynomial L_1(λ), which is considered as a right divisor of L(λ), and this correspondence between subspaces and divisors turns out to be one-to-one. It is described in the following theorem.

Theorem 3.12. Let L(λ) be a monic matrix polynomial of degree l with standard pair (X, T). Then for every nk-dimensional T-invariant subspace ℳ ⊂ ℂ^{nl} such that the restriction col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1} is invertible, there exists a unique monic right divisor L_1(λ) of L(λ) of degree k such that its standard pair is similar to (X|_ℳ, T|_ℳ).
Conversely, for every monic right divisor L_1(λ) of L(λ) of degree k with standard pair (X_1, T_1), the subspace

ℳ = Im([col(XT^i)_{i=0}^{l−1}]^{−1}[col(X_1T_1^i)_{i=0}^{l−1}])   (3.23)

is T-invariant, dim ℳ = nk, col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1} is invertible, and (X|_ℳ, T|_ℳ) is similar to (X_1, T_1).
Here X|_ℳ (resp. T|_ℳ) is considered as a linear transformation ℳ → ℂ^n (resp. ℳ → ℳ). Alternatively, one can think of X|_ℳ (resp. T|_ℳ) as an n × nk (resp. nk × nk) matrix, by choosing a basis in ℳ (and the standard orthonormal basis in ℂ^n for the representation of X|_ℳ).

Proof. Let ℳ ⊂ ℂ^{nl} be an nk-dimensional T-invariant subspace such that col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1} is invertible. Construct the monic matrix polynomial L_1(λ) with standard pair (X|_ℳ, T|_ℳ) (cf. (2.14)):

L_1(λ) = Iλ^k − X|_ℳ(T|_ℳ)^k(V_1 + V_2λ + ··· + V_kλ^{k−1}),

where [V_1 ··· V_k] = [col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1}]^{−1}. Appeal to Corollary 3.8 (bearing in mind the equality

A_0X + A_1XT + ··· + A_{l−1}XT^{l−1} + XT^l = 0,
where the A_i are the coefficients of L(λ)) to deduce that L_1(λ) is a right divisor of L(λ).

Conversely, let L_1(λ) be a monic right divisor of L(λ) of degree k with standard pair (X_1, T_1). Then Corollary 3.8 implies

C_1 col(X_1T_1^i)_{i=0}^{l−1} = col(X_1T_1^i)_{i=0}^{l−1}T_1,   (3.24)
where C_1 is the first companion matrix of L(λ). Also

C_1 col(XT^i)_{i=0}^{l−1} = col(XT^i)_{i=0}^{l−1}T.   (3.25)
Eliminating C_1 from (3.24) and (3.25), we obtain

T[col(XT^i)_{i=0}^{l−1}]^{−1}[col(X_1T_1^i)_{i=0}^{l−1}] = [col(XT^i)_{i=0}^{l−1}]^{−1}[col(X_1T_1^i)_{i=0}^{l−1}]T_1.   (3.26)

This equality readily implies that the subspace ℳ given by (3.23) is T-invariant. Moreover, it is easily seen that the columns of [col(XT^i)_{i=0}^{l−1}]^{−1}[col(X_1T_1^i)_{i=0}^{l−1}] are linearly independent; equality (3.26) implies that in the basis of ℳ formed by these columns, T|_ℳ is represented by the matrix T_1. Further,

X[col(XT^i)_{i=0}^{l−1}]^{−1}[col(X_1T_1^i)_{i=0}^{l−1}] = X_1,
so X|_ℳ is represented in the same basis of ℳ by the matrix X_1. Now it is clear that (X|_ℳ, T|_ℳ) is similar to (X_1, T_1). Finally, the invertibility of col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1} follows from this similarity and the invertibility of col(X_1T_1^i)_{i=0}^{k−1}. □

Note that the subspace ℳ defined by (3.23) does not depend on the choice of the standard pair (X_1, T_1) of L_1(λ), because Im col(X_1T_1^i)_{i=0}^{l−1} = Im col(X_1S·(S^{−1}T_1S)^i)_{i=0}^{l−1} for any invertible matrix S. Thus, for every monic right divisor L_1(λ) of L(λ) of degree k we have constructed an nk-dimensional subspace ℳ, which will be called the supporting subspace of L_1(λ). As (3.23) shows, the supporting subspace does depend on the standard pair (X, T); but once the standard pair (X, T) is fixed, the supporting subspace depends only on the divisor L_1(λ). If we wish to stress the dependence of ℳ on (X, T) as well (not only on L_1(λ)), we shall speak in terms of a supporting subspace relative to the standard pair
(X, T). Note that (for fixed (X, T)) the supporting subspace ℳ of a monic right divisor L_1(λ) is uniquely defined by the property that (X|_ℳ, T|_ℳ) is a standard pair of L_1(λ). This follows from (3.23) if one uses (X|_ℳ, T|_ℳ) in place of (X_1, T_1). So Theorem 3.12 gives a one-to-one correspondence between the right monic divisors of L(λ) of degree k and the T-invariant subspaces ℳ ⊂ ℂ^{nl} such that dim ℳ = nk and col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1} is invertible, which are in fact the supporting subspaces of the right divisors. Thus, Theorem 3.12 provides a description of an algebraic relation (divisibility of monic polynomials) in the geometric language of supporting subspaces. By the property of a standard pair (Theorem 2.4) it follows also that for a monic right divisor L_1(λ) with supporting subspace ℳ the formula
L_1(λ) = Iλ^k − X|_ℳ(T|_ℳ)^k(V_1 + V_2λ + ··· + V_kλ^{k−1})   (3.27)

holds, where [V_1 ··· V_k] = [col(X|_ℳ(T|_ℳ)^i)_{i=0}^{k−1}]^{−1}.

If the pair (X, T) coincides with the companion standard pair (P_1, C_1) (see Theorem 1.24) of L(λ), Theorem 3.12 can be stated in the following form:

Corollary 3.13. A subspace ℳ ⊂ ℂ^{nl} is a supporting subspace (relative to the companion standard pair (P_1, C_1)) for some monic divisor L_1(λ) of L(λ) of degree k if and only if the following conditions hold:

(i) dim ℳ = nk;
(ii) ℳ is C_1-invariant;
(iii) the n(l − k)-dimensional subspace of ℂ^{nl} spanned by all vectors with the first nk coordinates zero is a direct complement to ℳ.
3.3. CHARACTERIZATION OF DIVISORS AND SUPPORTING SUBSPACES
In this case the divisor $L_1(\lambda)$ is uniquely defined by the subspace $\mathcal M$ and is given by (3.27) with $X = P_1$ and $T = C_1$. In order to deduce Corollary 3.13 from Theorem 3.12, observe that $\operatorname{col}(P_1 C_1^i)_{i=0}^{k-1}$ has the form $[I \;\; *]$, where $I$ is the $nk \times nk$ unit matrix. Therefore condition (iii) is equivalent to the invertibility of $\operatorname{col}(P_1|_{\mathcal M}(C_1|_{\mathcal M})^i)_{i=0}^{k-1}$.
We point out the following important property of the supporting subspace, which follows immediately from Theorem 3.12.

Corollary 3.14. Let $L_1(\lambda)$ be a monic divisor of $L(\lambda)$ with supporting subspace $\mathcal M$. Then $\sigma(L_1) = \sigma(J|_{\mathcal M})$; moreover, the elementary divisors of $L_1(\lambda)$ coincide with the elementary divisors of $I\lambda - J|_{\mathcal M}$.
Theorem 3.12 is especially convenient when the standard pair $(X, T)$ involved is in fact a Jordan pair, because then it is possible to obtain a deeper insight into the spectral properties of the divisors (when compared to the spectral properties of the polynomial $L(\lambda)$ itself). The theorem also shows the importance of the structure of the invariant subspaces of $T$ for the description of monic divisors. In the following section we shall illustrate, in a simple example, how to use the description of $T$-invariant subspaces to find all right monic divisors of second degree.

In the scalar case ($n = 1$), which is of course familiar, the analysis based on Theorem 3.12 leads to the very well-known statement on divisibility of scalar polynomials, as it should. Indeed, consider the scalar polynomial $L(\lambda) = \prod_{i=1}^{p}(\lambda - \lambda_i)^{\alpha_i}$, where $\lambda_1, \ldots, \lambda_p$ are different complex numbers and the $\alpha_i$ are positive integers. As we have already seen (Proposition 1.18), a Jordan pair $(X, J)$ of $L(\lambda)$ can be chosen as follows: $X = [X_1 \;\cdots\; X_p]$, $J = J_1 \oplus \cdots \oplus J_p$, where $X_i = [1\;0\;\cdots\;0]$ is a $1 \times \alpha_i$ row and $J_i$ is an $\alpha_i \times \alpha_i$ Jordan block with eigenvalue $\lambda_i$. Every $J$-invariant subspace $\mathcal M$ has the following structure: $\mathcal M = \mathcal M_1 \oplus \cdots \oplus \mathcal M_p$, where $\mathcal M_i \subset \mathbb C^{\alpha_i}$ is spanned by the first $\beta_i$ coordinate unit vectors, $i = 1, \ldots, p$, and $\beta_i$ is an arbitrary nonnegative integer not exceeding $\alpha_i$. It is easily seen that each such $J$-invariant subspace $\mathcal M$ is supporting for some monic right divisor $L_{\mathcal M}(\lambda)$ of $L(\lambda)$ of degree $\beta = \beta_1 + \cdots + \beta_p$: the pair $(X|_{\mathcal M}, J|_{\mathcal M}) = ([X_{1,\mathcal M} \;\cdots\; X_{p,\mathcal M}],\; J_{1,\mathcal M} \oplus \cdots \oplus J_{p,\mathcal M})$, where $X_{i,\mathcal M} = [1\;0\;\cdots\;0]$ is of size $1 \times \beta_i$ and $J_{i,\mathcal M}$ is the $\beta_i \times \beta_i$ Jordan block with eigenvalue $\lambda_i$, is a Jordan pair for the polynomial $\prod_{i=1}^{p}(\lambda - \lambda_i)^{\beta_i}$, and therefore $\operatorname{col}(X|_{\mathcal M}(J|_{\mathcal M})^i)_{i=0}^{\beta-1}$
3. MULTIPLICATION AND DIVISIBILITY
is invertible. In this way we also see that the divisor with the supporting subspace $\mathcal M$ is just $\prod_{i=1}^{p}(\lambda - \lambda_i)^{\beta_i}$. So, as expected from the very beginning, all the divisors of $L(\lambda)$ are of the form $\prod_{i=1}^{p}(\lambda - \lambda_i)^{\beta_i}$, where $0 \le \beta_i \le \alpha_i$, $i = 1, \ldots, p$.

For two divisors of $L(\lambda)$ it may happen that one of them is in turn a divisor of the other. In terms of supporting subspaces such a relationship means nothing more than inclusion, as the following corollary shows.
Corollary 3.15. Let $L_{11}(\lambda)$ and $L_{12}(\lambda)$ be monic right divisors of $L(\lambda)$. Then $L_{11}(\lambda)$ is a right divisor of $L_{12}(\lambda)$ if and only if, for the supporting subspaces $\mathcal M_1$ and $\mathcal M_2$ of $L_{11}(\lambda)$ and $L_{12}(\lambda)$, respectively, the relation $\mathcal M_1 \subset \mathcal M_2$ holds.

Proof. Let $(X, T)$ be the standard pair of $L(\lambda)$ relative to which the supporting subspaces $\mathcal M_1$ and $\mathcal M_2$ are defined. Then, by Theorem 3.12, $(X|_{\mathcal M_i}, T|_{\mathcal M_i})$ ($i = 1, 2$) is a standard pair of $L_{1i}(\lambda)$. If $\mathcal M_1 \subset \mathcal M_2$, then, by Theorem 3.12 (applied to $L_{12}(\lambda)$ in place of $L(\lambda)$), $L_{11}(\lambda)$ is a right divisor of $L_{12}(\lambda)$. Suppose now $L_{11}(\lambda)$ is a right divisor of $L_{12}(\lambda)$. Then, by Theorem 3.12, there exists a supporting subspace $\mathcal M_{12} \subset \mathcal M_2$ of $L_{11}(\lambda)$ as a right divisor of $L_{12}(\lambda)$, so that $(X|_{\mathcal M_{12}}, T|_{\mathcal M_{12}})$ is a standard pair of $L_{11}(\lambda)$. But then clearly $\mathcal M_{12}$ is a supporting subspace of $L_{11}(\lambda)$ as a divisor of $L(\lambda)$. Since the supporting subspace is unique, it follows that $\mathcal M_1 = \mathcal M_{12} \subset \mathcal M_2$. $\square$

It is possible to deduce results analogous to Theorem 3.12 and Corollaries 3.14 and 3.15 for left divisors of $L(\lambda)$ (by using left standard pairs and Corollary 3.11). However, it will be more convenient for us to obtain the description for left divisors in terms of the description of quotients in Section 3.5.
3.4. Example

Let
$$L(\lambda) = \begin{bmatrix} \lambda(\lambda-1)^2 & b\lambda \\ 0 & \lambda^2(\lambda-2) \end{bmatrix}, \qquad b \in \mathbb C.$$
Clearly, $\sigma(L) = \{0, 1, 2\}$. As a Jordan pair $(X, J)$ of $L(\lambda)$ we can take
$$X = \begin{bmatrix} -b & 0 & 1 & 1 & 0 & -b \\ 1 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \qquad
J = \begin{bmatrix} 0&1&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&1&1&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&2 \end{bmatrix}.$$
So, for instance, $\begin{bmatrix}-b\\1\end{bmatrix}$, $\begin{bmatrix}0\\0\end{bmatrix}$ and $\begin{bmatrix}1\\0\end{bmatrix}$ form a canonical set of Jordan chains of $L(\lambda)$ corresponding to the eigenvalue $\lambda_0 = 0$.

Let us find all the monic right divisors of $L(\lambda)$ of degree 2. According to Theorem 3.12 this means that we are to find all four-dimensional $J$-invariant subspaces $\mathcal M$ such that $\begin{bmatrix}X\\XJ\end{bmatrix}\Big|_{\mathcal M}$ is invertible. Computation shows that
$$\begin{bmatrix}X\\XJ\end{bmatrix} = \begin{bmatrix} -b & 0 & 1 & 1 & 0 & -b \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & -b & 0 & 1 & 1 & -2b \\ 0 & 1 & 0 & 0 & 0 & 2 \end{bmatrix}.$$
We shall first find all four-dimensional $J$-invariant subspaces $\mathcal M$ and then check the invertibility condition. Clearly, $\mathcal M = \mathcal M_0 + \mathcal M_1 + \mathcal M_2$, where $\mathcal M_i$ is $J$-invariant and $\sigma(J|_{\mathcal M_i}) = \{i\}$, $i = 0, 1, 2$. Since $\dim \mathcal M_0 \le 3$, $\dim \mathcal M_1 \le 2$, and $\dim \mathcal M_2 \le 1$, we obtain the following five possible cases:

Case   $\dim\mathcal M_0$   $\dim\mathcal M_1$   $\dim\mathcal M_2$
 1            3                    1                    0
 2            3                    0                    1
 3            2                    2                    0
 4            2                    1                    1
 5            1                    2                    1
Case 1 gives rise to a single $J$-invariant subspace $\mathcal M_1 = \{(x_1, x_2, x_3, x_4, 0, 0)^T : x_j \in \mathbb C,\; j = 1, 2, 3, 4\}$. Choosing the standard basis in $\mathcal M_1$, we obtain
$$\begin{bmatrix}X\\XJ\end{bmatrix}\bigg|_{\mathcal M_1} = \begin{bmatrix} -b & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -b & 0 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix},$$
which is nonsingular for all $b \in \mathbb C$. So $\mathcal M_1$ is a supporting subspace for the right monic divisor $L_{\mathcal M_1}(\lambda)$ of $L(\lambda)$ given by
$$L_{\mathcal M_1}(\lambda) = I\lambda^2 - X|_{\mathcal M_1}(J|_{\mathcal M_1})^2\,(V_1^{(1)} + V_2^{(1)}\lambda), \tag{3.28}$$
where
$$[V_1^{(1)} \;\; V_2^{(1)}] = \Bigl[\operatorname{col}\bigl(X|_{\mathcal M_1}(J|_{\mathcal M_1})^i\bigr)_{i=0}^{1}\Bigr]^{-1} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & b & -1 & -b \\ 0 & 0 & 1 & b \end{bmatrix}.$$
Substituting in (3.28) also $X|_{\mathcal M_1}(J|_{\mathcal M_1})^2 = \begin{bmatrix} 0&0&0&1 \\ 0&0&0&0 \end{bmatrix}$, we obtain
$$L_{\mathcal M_1}(\lambda) = \begin{bmatrix} \lambda^2 - \lambda & -b\lambda \\ 0 & \lambda^2 \end{bmatrix}.$$
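The divisor just obtained can be checked numerically. The sketch below is not from the text: the quotient matrix `L2` is a hypothetical candidate (obtained by dividing $L$ by $L_{\mathcal M_1}$), and the check simply verifies the factorization $L = L_2 L_{\mathcal M_1}$ at random points:

```python
import numpy as np

def L(lam, b):
    # L(λ) from the example above
    return np.array([[lam * (lam - 1)**2, b * lam],
                     [0.0, lam**2 * (lam - 2)]])

def L1(lam, b):
    # Right divisor with supporting subspace M_1
    return np.array([[lam**2 - lam, -b * lam],
                     [0.0, lam**2]])

def L2(lam, b):
    # Candidate quotient (hypothetical; found by long division)
    return np.array([[lam - 1.0, b],
                     [0.0, lam - 2.0]])

rng = np.random.default_rng(0)
b = 3.7
for lam in rng.standard_normal(5) + 1j * rng.standard_normal(5):
    assert np.allclose(L(lam, b), L2(lam, b) @ L1(lam, b))
print("L = L2 * L1 holds at 5 random points")
```

Since both sides are polynomials of degree 3 in each entry, agreement at more than four points already forces the identity.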
Case 2 also gives rise to a single $J$-invariant subspace $\mathcal M_2 = \{(x_1, x_2, x_3, 0, 0, x_6)^T : x_j \in \mathbb C,\; j = 1, 2, 3, 6\}$. But in this case
$$\begin{bmatrix}X\\XJ\end{bmatrix}\bigg|_{\mathcal M_2} = \begin{bmatrix} -b & 0 & 1 & -b \\ 1 & 0 & 0 & 1 \\ 0 & -b & 0 & -2b \\ 0 & 1 & 0 & 2 \end{bmatrix}$$
(written in the standard basis) is singular, so $\mathcal M_2$ is not a supporting subspace for any monic divisor of $L(\lambda)$ of degree 2.

Consider now case 3. Here we have a $J$-invariant subspace $\mathcal M_3 = \{(x_1, 0, x_3, x_4, x_5, 0)^T : x_j \in \mathbb C,\; j = 1, 3, 4, 5\}$ and a family of $J$-invariant subspaces
$$\mathcal M_3(a) = \operatorname{Span}\{(1\;0\;0\;0\;0\;0)^T,\; (0\;1\;a\;0\;0\;0)^T,\; (0\;0\;0\;1\;0\;0)^T,\; (0\;0\;0\;0\;1\;0)^T\}, \tag{3.29}$$
where $a \in \mathbb C$ is arbitrary. For $\mathcal M_3$ we have
$$\begin{bmatrix}X\\XJ\end{bmatrix}\bigg|_{\mathcal M_3} = \begin{bmatrix} -b & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
which is singular, so there are no monic divisors with supporting subspace $\mathcal M_3$; for the family $\mathcal M_3(a)$, in the basis which appears in the right-hand side of (3.29),
$$\operatorname{col}\bigl(X|_{\mathcal M_3(a)}(J|_{\mathcal M_3(a)})^i\bigr)_{i=0}^{1} = \begin{bmatrix} -b & a & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & -b & 1 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix},$$
which is nonsingular for every $a \in \mathbb C$. The monic divisor $L_{\mathcal M_3(a)}(\lambda)$ with the supporting subspace $\mathcal M_3(a)$ is
$$L_{\mathcal M_3(a)}(\lambda) = I\lambda^2 - X|_{\mathcal M_3(a)}\,(J|_{\mathcal M_3(a)})^2\,(V_1^{(3)} + V_2^{(3)}\lambda),$$
where $[V_1^{(3)} \;\; V_2^{(3)}] = [\operatorname{col}(X|_{\mathcal M_3(a)}(J|_{\mathcal M_3(a)})^i)_{i=0}^{1}]^{-1}$, and computation (using $X|_{\mathcal M_3(a)}(J|_{\mathcal M_3(a)})^2 = \begin{bmatrix} 0&0&1&2 \\ 0&0&0&0 \end{bmatrix}$) shows that
$$L_{\mathcal M_3(a)}(\lambda) = \begin{bmatrix} (\lambda-1)^2 & b - (a+2b)\lambda \\ 0 & \lambda^2 \end{bmatrix}.$$
Case 4. Again, we have a single subspace $\mathcal M_4 = \{(x_1, 0, x_3, x_4, 0, x_6)^T : x_j \in \mathbb C,\; j = 1, 3, 4, 6\}$ and a family
$$\mathcal M_4(a) = \operatorname{Span}\{(1\;0\;0\;0\;0\;0)^T,\; (0\;1\;a\;0\;0\;0)^T,\; (0\;0\;0\;1\;0\;0)^T,\; (0\;0\;0\;0\;0\;1)^T\}.$$
Now
$$\begin{bmatrix}X\\XJ\end{bmatrix}\bigg|_{\mathcal M_4} = \begin{bmatrix} -b & 1 & 1 & -b \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & -2b \\ 0 & 0 & 0 & 2 \end{bmatrix},$$
which is nonsingular, and the monic divisor $L_{\mathcal M_4}(\lambda)$ with the supporting subspace $\mathcal M_4$ is
$$L_{\mathcal M_4}(\lambda) = \begin{bmatrix} \lambda^2 - \lambda & b\lambda \\ 0 & \lambda^2 - 2\lambda \end{bmatrix}.$$
We pass now to the family $\mathcal M_4(a)$. In the basis of the spanning vectors above,
$$\operatorname{col}\bigl(X|_{\mathcal M_4(a)}(J|_{\mathcal M_4(a)})^i\bigr)_{i=0}^{1} = \begin{bmatrix} -b & a & 1 & -b \\ 1 & 0 & 0 & 1 \\ 0 & -b & 1 & -2b \\ 0 & 1 & 0 & 2 \end{bmatrix}.$$
This matrix is nonsingular if and only if $a \ne 0$. So for every complex value of $a$, except zero, the subspace $\mathcal M_4(a)$ is supporting for a monic right divisor $L_{\mathcal M_4(a)}(\lambda)$ of $L(\lambda)$.

form a nonsingular matrix. For each $i$ from $n + 1$ to $ln$ we obtain such a matrix and, together with the submatrix of the first $n$ columns of $X$, we obtain $n(l - 1) + 1$ nonsingular $n \times n$ submatrices of $X$. $\square$

Comments
The presentation in this chapter is based on the authors' papers [34a, 34b]. The results on divisibility and multiplication of matrix polynomials are essentially of algebraic character, and can be obtained in the same way for matrix polynomials over an algebraically closed field (not necessarily $\mathbb C$), and even for any field if one confines attention to standard triples (excluding Jordan triples). See also [20]. These results can also be extended to operator polynomials (see [34c]). Other approaches to divisibility of matrix (and operator) polynomials via supporting subspaces are found in [46, 56d]. A theorem showing (in the infinite-dimensional case) the connection between monic divisors and special invariant subspaces of the companion matrix was obtained earlier in [56d]. Some topological properties of the set of divisors of a monic matrix polynomial are considered in [70e].
3.7. DECOMPOSITION INTO A PRODUCT OF LINEAR FACTORS
Additional information concerning the problem of computing partial multiplicities of the product of two matrix polynomials (this problem was mentioned in Section 3.1) can be found in [27, 71a, 75, 76b]. Our Section 3.3 incorporates some simplifications of the arguments used in [34a], which are based on [73]. The notion of a supporting projector (Section 3.5) originated in [3a]. This idea was further developed in the framework of rational matrix and operator functions, giving rise to a factorization theory for such functions (see [4, 3c]). Theorem 3.20 (which holds also in the case of operator polynomials acting in infinite-dimensional space), as well as Lemma 3.4, is proved in [34f]. Theorem 3.21 is proved by other means in [62b]. Another approach to the theory of monic matrix polynomials, which is similar to the theory of characteristic operator functions, is developed in [3b]. The main results on representations and divisibility are obtained there.
Chapter 4

Spectral Divisors and Canonical Factorization

We consider here the important special case of factorization of a monic matrix polynomial $L(\lambda) = L_2(\lambda)L_1(\lambda)$ in which $L_1(\lambda)$ and $L_2(\lambda)$ are monic polynomials with disjoint spectra. Divisors with this property will be called spectral.

Criteria for the existence of spectral divisors, as well as explicit formulas for them, are given in terms of contour integrals. These results are then applied to the matrix form of Bernoulli's algorithm for the solution of polynomial equations, and to a problem of the stability of solutions of differential equations. One of the important applications of the results on spectral divisors is to canonical factorization, which plays a decisive role in the inversion of block Toeplitz matrices. These applications are included in Sections 4.5 and 4.6. Section 4.7 is devoted to the more general notion of Wiener–Hopf factorization for monic matrix polynomials.

4.1. Spectral Divisors
Much of the difficulty in the study of factorization of matrix-valued functions arises from the need to consider a "splitting" of the spectrum, i.e., the
situation in which $L(\lambda) = L_2(\lambda)L_1(\lambda)$ and $L_1$, $L_2$, and $L$ all have common eigenvalues. A relatively simple, but important, class of right divisors of a matrix polynomial $L(\lambda)$ consists of those which have no points of spectrum in common with the quotient they generate. Such divisors are described as "spectral" and are the subject of this section.

The point $\lambda_0 \in \mathbb C$ is a regular point of a monic matrix polynomial $L$ if $L(\lambda_0)$ is nonsingular. Let $\Gamma$ be a contour in $\mathbb C$ consisting of regular points of $L$. A monic right divisor $L_1$ of $L$ is a $\Gamma$-spectral right divisor if $L = L_2L_1$ and $\sigma(L_1)$, $\sigma(L_2)$ are inside and outside $\Gamma$, respectively.

Theorem 4.1. If $L$ is a monic matrix polynomial with linearization $T$ and $L_1$ is a $\Gamma$-spectral right divisor of $L$, then the supporting subspace of $L_1$ with respect to $T$ is the image of the Riesz projector $R_\Gamma$ corresponding to $T$ and $\Gamma$:
$$R_\Gamma = \frac{1}{2\pi i}\oint_\Gamma (I\lambda - T)^{-1}\,d\lambda. \tag{4.1}$$
Proof. Let $\mathcal M_\Gamma$ be the $T$-invariant supporting subspace of $L_1$, and let $\mathcal M_\Gamma'$ be some direct complement to $\mathcal M_\Gamma$ in $\mathbb C^{nl}$ ($l$ is the degree of $L(\lambda)$). By Corollary 3.19 and the definition of a $\Gamma$-spectral divisor, $\sigma(T|_{\mathcal M_\Gamma})$ is inside $\Gamma$ and $\sigma(PT|_{\mathcal M_\Gamma'})$ is outside $\Gamma$, where $P$ is the projector on $\mathcal M_\Gamma'$ along $\mathcal M_\Gamma$. Write the matrix $T$ with respect to the decomposition $\mathbb C^{nl} = \mathcal M_\Gamma \dotplus \mathcal M_\Gamma'$:
$$T = \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{bmatrix}$$
(so that $T_{11} = T|_{\mathcal M_\Gamma}$, $T_{12} = (I - P)T|_{\mathcal M_\Gamma'}$, $T_{22} = PT|_{\mathcal M_\Gamma'}$, where $P$ and $I - P$ are considered as linear transformations into $\mathcal M_\Gamma'$ and $\mathcal M_\Gamma$, respectively). Then
$$(I\lambda - T)^{-1} = \begin{bmatrix} (I\lambda - T_{11})^{-1} & * \\ 0 & (I\lambda - T_{22})^{-1} \end{bmatrix}$$
and
$$R_\Gamma = \frac{1}{2\pi i}\oint_\Gamma (I\lambda - T)^{-1}\,d\lambda = \begin{bmatrix} I & * \\ 0 & 0 \end{bmatrix},$$
and so $\operatorname{Im} R_\Gamma = \mathcal M_\Gamma$. $\square$
Note that Theorem 4.1 ensures the uniqueness of a $\Gamma$-spectral right divisor, if one exists. We now give necessary and sufficient conditions for the existence of monic $\Gamma$-spectral divisors of degree $k$ of a given monic matrix polynomial $L$ of degree $l$. One necessary condition is evident: that $\det L(\lambda)$ has exactly $nk$ zeros inside $\Gamma$ (counting multiplicities). Indeed, the equality $L = L_2L_1$, where
$L_1(\lambda)$ is a monic matrix polynomial of degree $k$ such that $\sigma(L_1)$ is inside $\Gamma$ and $\sigma(L_2)$ is outside $\Gamma$, leads to the equality $\det L = \det L_2 \cdot \det L_1$, and therefore $\det L(\lambda)$ has exactly $nk$ zeros inside $\Gamma$ (counting multiplicities), which coincide with the zeros of $\det L_1(\lambda)$. For the scalar case ($n = 1$) this necessary condition is also sufficient. The situation is completely different in the matrix case:

EXAMPLE 4.1. Let
$$L(\lambda) = \begin{bmatrix} \lambda^2 & 0 \\ 0 & (\lambda - 1)^2 \end{bmatrix},$$
and let $\Gamma$ be a contour such that 0 is inside $\Gamma$ and 1 is outside $\Gamma$. Then $\det L(\lambda)$ has two zeros (both equal to 0) inside $\Gamma$. Nevertheless, $L(\lambda)$ has no $\Gamma$-spectral monic right divisor of degree 1. Indeed,
$$X = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad J = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
is a Jordan pair of $L(\lambda)$. The Riesz projector is
$$R_\Gamma = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
and $\operatorname{Im} R_\Gamma = \operatorname{Span}\{(1\;0\;0\;0)^T, (0\;1\;0\;0)^T\}$. But $X|_{\operatorname{Im} R_\Gamma}$ is not invertible, so $\operatorname{Im} R_\Gamma$ is not a supporting subspace. By Theorem 4.1 there is no $\Gamma$-spectral monic right divisor of $L(\lambda)$. $\square$
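The failure in Example 4.1 can be reproduced numerically. The sketch below is illustrative only (quadrature size and circle radius are our assumptions): it evaluates the Riesz projector of the Jordan matrix $J$ by applying the trapezoid rule to (4.1) on the circle $|\lambda| = \tfrac12$, and then checks that $X$ restricted to $\operatorname{Im} R_\Gamma$ is singular.

```python
import numpy as np

# Jordan pair of Example 4.1: J = J_2(0) ⊕ J_2(1)
J = np.array([[0., 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]])
X = np.array([[1., 0, 0, 0],
              [0, 0, 1, 0]])

# R_Γ = (1/2πi) ∮ (Iλ − J)^{-1} dλ over |λ| = 1/2: with λ = r e^{iθ} and
# dλ = iλ dθ, the trapezoid rule reduces to an average of λ(Iλ − J)^{-1}.
N, r = 256, 0.5
R = np.zeros((4, 4), dtype=complex)
for lam in r * np.exp(2j * np.pi * np.arange(N) / N):
    R += lam * np.linalg.inv(lam * np.eye(4) - J)
R /= N

print(np.round(R.real, 6))                  # the projector diag(1, 1, 0, 0)
X_restricted = X[:, :2]                     # X on Im R_Γ = span{e1, e2}
print(np.linalg.matrix_rank(X_restricted))  # rank 1 < 2: not a supporting subspace
```

The trapezoid rule converges geometrically here because the resolvent is analytic in an annulus around the contour.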
So it is necessary to impose extra conditions in order to obtain a criterion for the existence of $\Gamma$-spectral divisors. This is done in the next theorem.

Theorem 4.2. Let $L$ be a monic matrix polynomial and $\Gamma$ a contour consisting of regular points of $L$, having exactly $nk$ eigenvalues of $L$ (counted according to multiplicities) inside $\Gamma$. Then $L$ has a $\Gamma$-spectral right divisor if and only if the $nk \times nl$ matrix
$$M_{k,l} = \frac{1}{2\pi i}\oint_\Gamma \begin{bmatrix} L^{-1}(\lambda) & \lambda L^{-1}(\lambda) & \cdots & \lambda^{l-1}L^{-1}(\lambda) \\ \lambda L^{-1}(\lambda) & \lambda^2 L^{-1}(\lambda) & \cdots & \lambda^{l}L^{-1}(\lambda) \\ \vdots & \vdots & & \vdots \\ \lambda^{k-1}L^{-1}(\lambda) & \lambda^{k}L^{-1}(\lambda) & \cdots & \lambda^{k+l-2}L^{-1}(\lambda) \end{bmatrix} d\lambda \tag{4.2}$$
has rank $kn$. If this condition is satisfied, then the $\Gamma$-spectral right divisor $L_1(\lambda) = I\lambda^k + \sum_{j=0}^{k-1} L_{1j}\lambda^j$ is given by the formula
$$[L_{10} \;\cdots\; L_{1,k-1}] = -\frac{1}{2\pi i}\oint_\Gamma [\lambda^k L^{-1}(\lambda) \;\cdots\; \lambda^{k+l-1}L^{-1}(\lambda)]\,d\lambda \cdot M_{k,l}^{I}, \tag{4.3}$$
where $M_{k,l}^{I}$ is any right inverse of $M_{k,l}$.
Proof. Let $X$, $T$, $Y$ be a standard triple for $L$. Define $R_\Gamma$ as in (4.1), and let $\mathcal M_1 = \operatorname{Im} R_\Gamma$, $\mathcal M_2 = \operatorname{Im}(I - R_\Gamma)$. Then $T = T_1 \oplus T_2$, where $T_1$ and $T_2$ are the restrictions of $T$ to $\mathcal M_1$ and $\mathcal M_2$, respectively. In addition, $\sigma(T_1)$ and $\sigma(T_2)$ are inside and outside $\Gamma$, respectively. Define $X_1 = X|_{\mathcal M_1}$, $X_2 = X|_{\mathcal M_2}$, and $Y_1 = R_\Gamma Y$, $Y_2 = (I - R_\Gamma)Y$. Then, using the resolvent form of $L$ (Theorem 2.4), write
$$L^{-1}(\lambda) = X(I\lambda - T)^{-1}Y = X_1(I\lambda - T_1)^{-1}Y_1 + X_2(I\lambda - T_2)^{-1}Y_2.$$
Since $(I\lambda - T_2)^{-1}$ is analytic inside $\Gamma$, for $i = 0, 1, 2, \ldots$ we have
$$\frac{1}{2\pi i}\oint_\Gamma \lambda^i L^{-1}(\lambda)\,d\lambda = \frac{1}{2\pi i}\oint_\Gamma \lambda^i X_1(I\lambda - T_1)^{-1}Y_1\,d\lambda = X_1T_1^iY_1, \tag{4.4}$$
where the last equality follows from the fact that $\sigma(T_1)$ is inside $\Gamma$ (see Section S1.8). Thus,
$$M_{k,l} = \operatorname{col}(X_1T_1^i)_{i=0}^{k-1} \cdot \operatorname{row}(T_1^iY_1)_{i=0}^{l-1}, \tag{4.5}$$
and since the rows of the matrix $\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}$ are also rows of the nonsingular matrix $\operatorname{row}(T^iY)_{i=0}^{l-1}$, the former matrix is right invertible, i.e., $\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}\cdot S = I$ for some linear transformation $S\colon \mathcal M_1 \to \mathbb C^{nl}$. Now it is clear that $M_{k,l}$ has rank $kn$ only if the same is true of the left factor in (4.5), i.e., the left factor is nonsingular. This implies that $\mathcal M_1$ is a supporting subspace for $L$, and by Theorem 3.12, $L$ has a right divisor $L_1$ of degree $k$. Furthermore, $X_1$, $T_1$ determine a standard pair for $L_1$, and by the construction $\sigma(L_1)$ coincides with $\sigma(T_1)$ and the part of $\sigma(L)$ inside $\Gamma$. Finally, we write $L = L_2L_1$, and by comparing the degrees of $\det L$, $\det L_1$, and $\det L_2$, we deduce that $\sigma(L_2)$ is outside $\Gamma$. Thus, $L_1$ is a $\Gamma$-spectral right divisor.

For the converse, if $L_1$ is a $\Gamma$-spectral right divisor, then by Theorem 4.1 the subspace $\mathcal M = \operatorname{Im} R_\Gamma \subset \mathbb C^{nl}$ is the supporting subspace of $L$ associated with $L_1$, and the left factor in (4.5) is nonsingular. Since the right factor is right invertible, it follows that $M_{k,l}$ has rank $kn$.

It remains to prove formula (4.3). Since the supporting subspace for $L_1(\lambda)$ is $\operatorname{Im} R_\Gamma$, we have to prove only that
$$X_1T_1^k\cdot\bigl(\operatorname{col}(X_1T_1^i)_{i=0}^{k-1}\bigr)^{-1} = \frac{1}{2\pi i}\oint_\Gamma [\lambda^k L^{-1}(\lambda) \;\cdots\; \lambda^{k+l-1}L^{-1}(\lambda)]\,d\lambda \cdot M_{k,l}^{I} \tag{4.6}$$
(see Theorem 3.12). Note that $M_{k,l}^{I} = [\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}]^{I}\cdot[\operatorname{col}(X_1T_1^i)_{i=0}^{k-1}]^{-1}$, where $[\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}]^{I}$ is some right inverse of $\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}$. Using this remark, together with (4.4), the equality (4.6) becomes
$$X_1T_1^k\cdot\bigl(\operatorname{col}(X_1T_1^i)_{i=0}^{k-1}\bigr)^{-1} = [X_1T_1^kY_1 \;\cdots\; X_1T_1^{k+l-1}Y_1]\cdot[\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}]^{I}\cdot\bigl(\operatorname{col}(X_1T_1^i)_{i=0}^{k-1}\bigr)^{-1},$$
which is evidently true. $\square$
The result of Theorem 4.2 can also be written in the following form, which is sometimes more convenient.

Theorem 4.3. Let $L$ be a monic matrix polynomial, and let $\Gamma$ be a contour consisting of regular points of $L$. Then $L$ has a $\Gamma$-spectral right divisor of degree $k$ if and only if
$$\operatorname{rank} M_{k,l} = \operatorname{rank} M_{k+1,l} = nk. \tag{4.7}$$

Proof. We shall use the notation introduced in the proof of Theorem 4.2. Suppose that $L$ has a $\Gamma$-spectral right divisor of degree $k$. Then $\operatorname{col}(X_1T_1^i)_{i=0}^{k-1}$ is invertible, and $T_1$, $X_1$ are $nk \times nk$ and $n \times nk$ matrices, respectively. Since $\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}$ is right invertible, equality (4.5) ensures that $\operatorname{rank} M_{k,l} = nk$. The same equality (4.5) (applied to $M_{k+1,l}$) shows that $\operatorname{rank} M_{k+1,l} \le \{\text{size of } T_1\} = nk$. But in any case $\operatorname{rank} M_{k+1,l} \ge \operatorname{rank} M_{k,l}$; so in fact $\operatorname{rank} M_{k+1,l} = nk$.
Suppose now (4.7) holds. By (4.5) and the right invertibility of $\operatorname{row}(T_1^iY_1)_{i=0}^{l-1}$ we easily obtain
$$\operatorname{rank} \operatorname{col}(X_1T_1^i)_{i=0}^{k-1} = \operatorname{rank} \operatorname{col}(X_1T_1^i)_{i=0}^{k} = nk. \tag{4.8}$$
Thus, $\operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{k-1} = \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{k}$. It follows then that
$$\operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{k-1} = \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{p-1} \tag{4.9}$$
for every $p \ge k$. Indeed, let us prove this assertion by induction on $p$. For $p = k$ it has been established already; assume it holds for $p = p_0 - 1$, $p_0 > k + 1$, and let $x \in \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{k-1}$. By the induction hypothesis, $x \in \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{p_0-2}$. But also $X_1T_1^i(T_1x) = X_1T_1^{i+1}x = 0$ for $i = 0, \ldots, k-1$ (by (4.8)), so
$$T_1x \in \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{k-1},$$
and again by the induction hypothesis, $T_1x \in \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{p_0-2}$, i.e., $X_1T_1^ix = 0$ for $i = 1, \ldots, p_0 - 1$. It follows that $x \in \operatorname{Ker} \operatorname{col}(X_1T_1^i)_{i=0}^{p_0-1}$, and (4.9) follows for $p = p_0$.

Applying (4.9) with $p = l$ and using (4.8), we obtain that $\operatorname{rank} \operatorname{col}(X_1T_1^i)_{i=0}^{l-1} = nk$. But $\operatorname{col}(X_1T_1^i)_{i=0}^{l-1}$ is left invertible (as all its columns are also columns of the nonsingular matrix $\operatorname{col}(XT^i)_{i=0}^{l-1}$), so its rank coincides with the number of its columns. Thus, $T_1$ is of size $nk \times nk$; in other words, $\det L(\lambda)$ has exactly $nk$ roots inside $\Gamma$ (counting multiplicities). Now we can apply Theorem 4.2 to deduce the existence of a $\Gamma$-spectral right monic divisor of degree $k$. $\square$

If, as above, $\Gamma$ is a contour consisting of regular points of $L$, then a monic left divisor $L_2$ of $L$ is a $\Gamma$-spectral left divisor if $L = L_2L_1$ and the spectra of $L_2$ and $L_1$ are inside and outside $\Gamma$, respectively. It is apparent that, in this case, $L_1$ is a $\Gamma_1$-spectral right divisor, where the contour $\Gamma_1$ contains in its interior exactly those points of $\sigma(L)$ which are outside $\Gamma$. Similarly, if $L_1$ is a $\Gamma$-spectral right divisor and $L = L_2L_1$, then $L_2$ is a $\Gamma_1$-spectral left divisor. Thus, in principle, one may characterize the existence of $\Gamma$-spectral left divisors by using the last theorem and a complementary contour $\Gamma_1$. We shall show by example that a $\Gamma$-spectral divisor may exist from one side but not the other.
The next theorem is the dual of Theorem 4.2. It is proved for instance by applying Theorem 4.2 to the transposed matrix polynomial LT(A.) and its right r-spectral divisor.
Theorem 4.4. Under the hypotheses of Theorem 4.2, $L$ has a $\Gamma$-spectral left divisor if and only if the matrix
$$M_{l,k} = \frac{1}{2\pi i}\oint_\Gamma \begin{bmatrix} L^{-1}(\lambda) & \cdots & \lambda^{k-1}L^{-1}(\lambda) \\ \vdots & & \vdots \\ \lambda^{l-1}L^{-1}(\lambda) & \cdots & \lambda^{k+l-2}L^{-1}(\lambda) \end{bmatrix} d\lambda$$
has rank $kn$. In this case the $\Gamma$-spectral left divisor $L_2(\lambda) = I\lambda^k + \sum_{j=0}^{k-1}L_{2j}\lambda^j$ is given by the formula
$$\begin{bmatrix} L_{20} \\ \vdots \\ L_{2,k-1} \end{bmatrix} = -M_{l,k}^{I}\cdot\frac{1}{2\pi i}\oint_\Gamma \begin{bmatrix} \lambda^k L^{-1}(\lambda) \\ \vdots \\ \lambda^{k+l-1}L^{-1}(\lambda) \end{bmatrix} d\lambda, \tag{4.10}$$
where $M_{l,k}^{I}$ is any left inverse of $M_{l,k}$.

The dual result for Theorem 4.3 is the following:
Theorem 4.5. Let $L(\lambda)$ be a monic matrix polynomial and $\Gamma$ be a contour consisting of regular points of $L$. Then $L$ has a $\Gamma$-spectral left divisor of degree $k$ if and only if $\operatorname{rank} M_{l,k} = \operatorname{rank} M_{l,k+1} = nk$.

Note that the left-hand sides in formulas (4.3) and (4.10) do not depend on the choice of $M_{k,l}^{I}$ and $M_{l,k}^{I}$, respectively (which are not unique in general). We could anticipate this property bearing in mind that a (right) monic divisor is uniquely determined by its supporting subspace, and for a $\Gamma$-spectral divisor such a supporting subspace is the image of a Riesz projector, which is fixed by $\Gamma$ and the choice of standard triple for $L(\lambda)$.

Another way to write the conditions of Theorems 4.2 and 4.4 is by using finite sections of block Toeplitz matrices. For a continuous $n \times n$ matrix-valued function $H(\lambda)$ on $\Gamma$, such that $\det H(\lambda) \ne 0$ for every $\lambda \in \Gamma$, define $D_j = (2\pi i)^{-1}\oint_\Gamma \lambda^{-j-1}H(\lambda)\,d\lambda$. It is clear that if $\Gamma$ is the unit circle then the $D_j$ are the Fourier coefficients of $H(\lambda)$:
$$H(\lambda) = \sum_{j=-\infty}^{\infty} \lambda^j D_j \tag{4.11}$$
(the series on the right-hand side converges uniformly and absolutely to $H(\lambda)$ under additional requirements; for instance, when $H(\lambda)$ is a rational matrix function). For a triple of integers $\alpha \ge \beta \ge \gamma$ we shall define $T(H_\Gamma; \alpha, \beta, \gamma)$ as
the following block Toeplitz matrix:
$$T(H_\Gamma; \alpha, \beta, \gamma) = \begin{bmatrix} D_\beta & D_{\beta-1} & \cdots & D_\gamma \\ D_{\beta+1} & D_\beta & \cdots & D_{\gamma+1} \\ \vdots & \vdots & & \vdots \\ D_\alpha & D_{\alpha-1} & \cdots & D_{\alpha-\beta+\gamma} \end{bmatrix}.$$
We mention that $T(H_\Gamma; \alpha, \alpha, \gamma) = [D_\alpha \;\; D_{\alpha-1} \;\; \cdots \;\; D_\gamma]$ is a block row, and
$$T(H_\Gamma; \alpha, \beta, \beta) = \begin{bmatrix} D_\beta \\ D_{\beta+1} \\ \vdots \\ D_\alpha \end{bmatrix}$$
is a block column.

Theorem 4.2'. Let $L(\lambda)$ and $\Gamma$ be as in Theorem 4.2. Then $L(\lambda)$ has a $\Gamma$-spectral right divisor $L_1(\lambda) = I\lambda^k + \sum_{j=0}^{k-1}L_{1j}\lambda^j$ if and only if $\operatorname{rank} T(L_\Gamma^{-1}; -1, -k, -k-l+1) = kn$. In this case the coefficients of $L_1$ are given by the formula
$$[L_{1,k-1} \;\; L_{1,k-2} \;\; \cdots \;\; L_{10}] = -T(L_\Gamma^{-1}; -k-1, -k-1, -k-l)\cdot\bigl(T(L_\Gamma^{-1}; -1, -k, -k-l+1)\bigr)^{I}, \tag{4.12}$$
where $\bigl(T(L_\Gamma^{-1}; -1, -k, -k-l+1)\bigr)^{I}$ is any right inverse of $T(L_\Gamma^{-1}; -1, -k, -k-l+1)$.

Note that the order of the coefficients $L_{1j}$ in formula (4.12) is opposite to their order in formula (4.3).

Theorem 4.4'. Let $L(\lambda)$ and $\Gamma$ be as in Theorem 4.4. Then $L(\lambda)$ has a $\Gamma$-spectral left divisor $L_2(\lambda) = I\lambda^k + \sum_{j=0}^{k-1}L_{2j}\lambda^j$ if and only if $\operatorname{rank} T(L_\Gamma^{-1}; -1, -l, -k-l+1) = kn$. In this case the coefficients of $L_2$ are given by the formula
$$\begin{bmatrix} L_{20} \\ L_{21} \\ \vdots \\ L_{2,k-1} \end{bmatrix} = -\bigl(T(L_\Gamma^{-1}; -1, -l, -k-l+1)\bigr)^{I}\cdot T(L_\Gamma^{-1}; -k-1, -k-l, -k-l), \tag{4.13}$$
where $\bigl(T(L_\Gamma^{-1}; -1, -l, -k-l+1)\bigr)^{I}$ is any left inverse of $T(L_\Gamma^{-1}; -1, -l, -k-l+1)$.

Of course, Theorems 4.3 and 4.5 can also be reformulated in terms of the matrices $T(L_\Gamma^{-1}; \alpha, \beta, \gamma)$.
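These Toeplitz sections are easy to form once the coefficients $D_j$ are available. The sketch below is illustrative only: the function `H` is an ad hoc example (not from the text), and the quadrature parameters are assumptions.

```python
import numpy as np

def D(j, H, N=512):
    # D_j = (1/2πi) ∮ λ^{-j-1} H(λ) dλ over the unit circle (trapezoid rule)
    acc = np.zeros((2, 2), dtype=complex)
    for lam in np.exp(2j * np.pi * np.arange(N) / N):
        acc += lam**(-j) * H(lam)
    return acc / N

def T(H, alpha, beta, gamma):
    # Block Toeplitz section T(H_Γ; α, β, γ): first block row D_β … D_γ,
    # first block column D_β … D_α
    return np.vstack([np.hstack([D(beta + i - j, H) for j in range(beta - gamma + 1)])
                      for i in range(alpha - beta + 1)])

# Example: H = L^{-1} for L(λ) = diag(λ − 0.5, λ − 2); the pole 0.5 lies
# inside the unit circle, the pole 2 outside.
H = lambda lam: np.array([[1 / (lam - 0.5), 0], [0, 1 / (lam - 2.0)]])
print(np.round(D(-1, H).real, 6))        # residues inside the circle: diag(1, 0)
print(T(H, -1, -1, -2).shape)            # a block row: (2, 4)
```

The degenerate cases $T(H_\Gamma; \alpha, \alpha, \gamma)$ and $T(H_\Gamma; \alpha, \beta, \beta)$ come out of the same function as a single block row or block column.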
We conclude this section with two examples.

EXAMPLE 4.2. Let
$$L(\lambda) = \begin{bmatrix} \lambda^3 - 42\lambda^2 & 300\lambda^2 - 301\lambda + 42 \\ -6\lambda^2 & \lambda^3 + 42\lambda^2 - 43\lambda + 6 \end{bmatrix}.$$
Then $L$ has eigenvalues 0 (with multiplicity three), 1, 2, and $-3$. Let $\Gamma$ be the circle in the complex plane with center at the origin and radius $\tfrac12$. We use a Jordan triple $X$, $J$, $Y$. It is found that we may choose
$$J_1 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad Y_1 = \begin{bmatrix} -7 & 48 \\ 1 & -7 \\ -6 & 42 \end{bmatrix},$$
and we have
$$\frac{1}{2\pi i}\oint_\Gamma L^{-1}(\lambda)\,d\lambda = X_1Y_1 = \begin{bmatrix} -7 & 48 \\ 1 & -7 \end{bmatrix},$$
which is invertible. In spite of this, there is no $\Gamma$-spectral right divisor. This is because there are, in effect, three eigenvalues of $L$ inside $\Gamma$, and though $M_{1,3}$ of Theorem 4.2 has full rank, the first hypothesis of the theorem is not satisfied. Note that the columns of $X_1$ are defined by a Jordan chain, and by Theorem 3.12 the linear dependence of the first two vectors shows that there is no right divisor of degree one. Similar remarks apply to the search for a $\Gamma$-spectral left divisor. $\square$
EXAMPLE 4.3. Let $a_1$, $a_2$, $a_3$, $a_4$ be distinct complex numbers, and let $L(\lambda)$ be a monic $2 \times 2$ matrix polynomial of degree 2 whose spectrum consists of the four points $a_1$, $a_2$, $a_3$, $a_4$. Let $\Gamma$ be a contour with $a_1$ and $a_2$ inside $\Gamma$ and $a_3$ and $a_4$ outside $\Gamma$. If we choose
$$X_1 = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \qquad J_1 = \begin{bmatrix} a_1 & 0 \\ 0 & a_2 \end{bmatrix},$$
then it is found that
$$X_1Y_1 = \begin{bmatrix} 0 & \beta(a_1 - a_2)^{-1} \\ 0 & 0 \end{bmatrix}, \qquad X_1J_1Y_1 = \begin{bmatrix} 1 & a_2\beta(a_1 - a_2)^{-1} \\ 0 & 0 \end{bmatrix},$$
where $\beta = (a_2 - a_3)(a_2 - a_4)^{-1}(a_2 - a_1)$. Then, since $M_{1,2} = [X_1Y_1 \;\; X_1J_1Y_1]$ and
$$M_{2,1} = \begin{bmatrix} X_1Y_1 \\ X_1J_1Y_1 \end{bmatrix},$$
it is seen that $M_{1,2}$ has rank 1 and $M_{2,1}$ has rank 2. Thus there exists a $\Gamma$-spectral left divisor but no $\Gamma$-spectral right divisor. $\square$
4.2. Linear Divisors and Matrix Equations

Consider the unilateral matrix equation
$$Y^l + \sum_{j=0}^{l-1} A_j Y^j = 0, \tag{4.14}$$
where the $A_j$ ($j = 0, \ldots, l-1$) are given $n \times n$ matrices and $Y$ is an $n \times n$ matrix to be found. Solutions of (4.14) are closely related to the right linear divisors of the monic matrix polynomial $L(\lambda) = I\lambda^l + \sum_{j=0}^{l-1}A_j\lambda^j$. Namely, it is easy to verify (by a straightforward computation, or using Corollary 3.6, for instance) that $Y$ is a solution of (4.14) if and only if $I\lambda - Y$ is a right divisor of $L(\lambda)$. Therefore, we can apply our results on divisibility of matrix polynomials to obtain some information on the solutions of (4.14).

First we give the following criterion for the existence of linear divisors (we have observed in Section 3.7 that, in contrast to the scalar case, not every monic matrix polynomial has a linear divisor).

Theorem 4.6. The monic matrix polynomial $L(\lambda)$ has a right linear divisor $I\lambda - X$ if and only if there exists an invariant subspace $\mathcal M$ of the first companion matrix $C_1$ of $L(\lambda)$ of the form
$$\mathcal M = \operatorname{Im}\begin{bmatrix} I \\ X \\ X^2 \\ \vdots \\ X^{l-1} \end{bmatrix}.$$

Proof. Let $\mathcal M$ be the supporting subspace of $I\lambda - X$ relative to the companion standard pair $(P_1, C_1)$. By Corollary 3.13(iii) we can write $\mathcal M = \operatorname{Im}[\operatorname{col}(X_i)_{i=0}^{l-1}]$ for some $n \times n$ matrices $X_0 = I$ and $X_1, \ldots, X_{l-1}$. But the $C_1$-invariance of $\mathcal M$ implies that $X_i = X_1^i$, $i = 1, \ldots, l-1$. Finally, formula (3.27) shows that the right divisor of $L(\lambda)$ corresponding to $\mathcal M$ is just $I\lambda - X_1$, so in fact $X_1 = X$. $\square$

In terms of the solutions of Eq. (4.14), and relative to any standard pair of $L(\lambda)$ (not necessarily the companion pair as in Theorem 4.6), the criterion of Theorem 4.6 can be formulated as follows.

Corollary 4.7. A matrix $Y$ is a solution of (4.14) if and only if $Y$ has the form
$$Y = X|_{\mathcal M}\cdot T|_{\mathcal M}\cdot(X|_{\mathcal M})^{-1},$$
where $(X, T)$ is a standard pair of the monic matrix polynomial $L(\lambda)$ and $\mathcal M$ is a $T$-invariant subspace such that $X|_{\mathcal M}$ is invertible.
To prove Corollary 4.7, recall that Eq. (4.14) is satisfied if and only if $I\lambda - Y$ is a right divisor of $L(\lambda)$, and then apply Theorem 3.12.

A matrix $Y$ such that equality (4.14) holds will be called a (right) solvent of $L(\lambda)$. A solvent $S$ is said to be a dominant solvent if every eigenvalue of $S$ exceeds in absolute value every eigenvalue of the quotient $L(\lambda)(I\lambda - S)^{-1}$. Clearly, in this case, $I\lambda - S$ is a spectral divisor of $L(\lambda)$. The following result provides an algorithm for computation of a dominant solvent, which can be regarded as a generalization of Bernoulli's method for computation of the zero of a scalar polynomial with largest absolute value (see [42]).
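Corollary 4.7 suggests a direct numerical construction of solvents: pick $n$ eigenvectors of the companion matrix so that the top $n \times n$ block is invertible. The sketch below is not from the text; the coefficient matrices are an assumed quadratic example with a known dominant solvent, and the two largest eigenvalues of the companion matrix are selected.

```python
import numpy as np

# L(λ) = Iλ² + A1 λ + A0, which factors as
# L(λ) = [[λ+1, −1], [1, λ−1]] · (Iλ − S)  with  S = [[1, −1], [0, −1]].
A1 = np.array([[0., 0.], [1., 0.]])
A0 = np.array([[-1., 0.], [-1., 0.]])

# First companion matrix C1; the standard pair is (X, T) = ([I 0], C1).
C = np.block([[np.zeros((2, 2)), np.eye(2)], [-A0, -A1]])
w, V = np.linalg.eig(C)

# Take M spanned by the eigenvectors of the two largest eigenvalues (±1);
# Corollary 4.7 then gives Y = X|_M · T|_M · (X|_M)^{-1}, where X|_M is the
# top 2×2 block of the selected eigenvectors.
idx = np.argsort(-np.abs(w))[:2]
Xm = V[:2, idx]
Y = (Xm @ np.diag(w[idx]) @ np.linalg.inv(Xm)).real

print(np.round(Y, 8))                        # the solvent [[1, −1], [0, −1]]
print(np.linalg.norm(Y @ Y + A1 @ Y + A0))   # residual ≈ 0
```

Column scalings of the eigenvectors cancel in the similarity, so no normalization is needed.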
Theorem 4.8. Let $L(\lambda) = I\lambda^l + \sum_{i=0}^{l-1}\lambda^iA_i$ be a monic matrix polynomial of degree $l$. Assume that $L(\lambda)$ has a dominant solvent $S$, and that the transposed matrix polynomial $L^T(\lambda)$ also has a dominant solvent. Let $\{U_r\}_{r=1}^{\infty}$ be the solution of the system
$$U_{r+l} + A_{l-1}U_{r+l-1} + \cdots + A_0U_r = 0, \qquad r = 1, 2, \ldots, \tag{4.15}$$
where $\{U_r\}_{r=1}^{\infty}$ is a sequence of $n \times n$ matrices to be found, determined by the initial conditions $U_1 = \cdots = U_{l-1} = 0$, $U_l = I$. Then $U_{r+1}U_r^{-1}$ exists for large $r$ and $U_{r+1}U_r^{-1} \to S$ as $r \to \infty$.
We shall need the following lemma for the proof of Theorem 4.8.

Lemma 4.9. Let $W_1$ and $W_2$ be square matrices (not necessarily of the same size) such that
$$\sup\{|\lambda| : \lambda \in \sigma(W_2)\} < \inf\{|\lambda| : \lambda \in \sigma(W_1)\}. \tag{4.16}$$
Then $W_1$ is nonsingular and
$$\lim_{m\to\infty} \|W_2^m\|\cdot\|W_1^{-m}\| = 0. \tag{4.17}$$

Proof. Without loss of generality we may assume that $W_1$ and $W_2$ are in Jordan form; write $W_i = K_i + N_i$, $i = 1, 2$, where $K_i$ is a diagonal matrix, $N_i$ is a nilpotent matrix (i.e., such that $N_i^{r_i} = 0$ for some positive integer $r_i$), and $K_iN_i = N_iK_i$. Observe that $\sigma(K_i) = \sigma(W_i)$, and condition (4.16) ensures that $W_1$ (and therefore also $K_1$) is nonsingular. Further, $\|K_1^{-1}\| = [\inf\{|\lambda| : \lambda \in \sigma(W_1)\}]^{-1}$ and $\|K_2\| = \sup\{|\lambda| : \lambda \in \sigma(W_2)\}$. Thus (again by (4.16))
$$\|K_2^m\|\cdot\|K_1^{-m}\| \le \gamma^m, \qquad 0 < \gamma < 1, \quad m = 1, 2, \ldots. \tag{4.18}$$
Let $r_1$ be so large that $N_1^{r_1} = 0$, and put $b_1 = \max_{0 \le k \le r_1 - 1}\|K_1^{-k}N_1^k\|$. Using the combinatorial identities
$$\sum_{k=0}^{p}(-1)^k\binom{n}{k}\binom{n+p-k-1}{p-k} = \begin{cases} 0 & \text{for } p > 0, \\ 1 & \text{for } p = 0, \end{cases}$$
it is easy to check that
$$W_1^{-m} = (K_1 + N_1)^{-m} = \sum_{k=0}^{r_1-1}(-1)^k\binom{m+k-1}{k}K_1^{-m-k}N_1^k.$$
So
$$\|W_1^{-m}\| \le \|K_1^{-m}\|\cdot\left[\sum_{k=0}^{r_1-1}\binom{m+k-1}{k}b_1^k\right] \le \|K_1^{-m}\|\,q_1(m),$$
where $q_1(m)$ is a polynomial in $m$. In the same way one obtains $\|W_2^m\| \le \|K_2^m\|\,q_2(m)$ for some polynomial $q_2(m)$, and (4.17) now follows from (4.18), since $\gamma^m q_1(m)q_2(m) \to 0$ as $m \to \infty$. $\square$
Proof of Theorem 4.8. Let $(X, J, Y)$ be a Jordan triple for $L(\lambda)$. The existence of a dominant solvent allows us to choose $(X, J, Y)$ in such a way that the following partition holds:
$$X = [X_s \;\; X_t], \qquad J = \begin{bmatrix} J_s & 0 \\ 0 & J_t \end{bmatrix}, \qquad Y = \begin{bmatrix} Y_s \\ Y_t \end{bmatrix},$$
where $\sup\{|\lambda| : \lambda \in \sigma(J_t)\} < \inf\{|\lambda| : \lambda \in \sigma(J_s)\}$, and the partitions of $X$ and $Y$ are consistent with the partition of $J$. Here $J_s$ is the part of $J$ corresponding to the dominant solvent $S$, so the size of $J_s$ is $n \times n$. Moreover, $X_s$ is a nonsingular $n \times n$ matrix (because $I\lambda - S$ is a spectral divisor of $L(\lambda)$), and $X_s$ is the restriction of $X$ to the supporting subspace corresponding to this divisor (cf. Theorems 4.1 and 3.12). Since $L^T(\lambda)$ also has a dominant solvent, and $(Y^T, J^T, X^T)$ is a standard triple of $L^T(\lambda)$, by an analogous argument we obtain that $Y_s$ is also nonsingular.
Let $\{U_r\}_{r=1}^{\infty}$ be the solution of (4.15) determined by the initial conditions $U_1 = \cdots = U_{l-1} = 0$, $U_l = I$. Then it is easy to see that
$$U_r = XJ^{r-1}Y, \qquad r = 1, 2, \ldots.$$
Indeed, we know from the general form (formula (2.57)) of the solution of (4.15) that $U_r = XJ^{r-1}Z$, $r = 1, 2, \ldots$, for some $nl \times n$ matrix $Z$. The initial conditions, together with the invertibility of $\operatorname{col}(XJ^i)_{i=0}^{l-1}$, ensure that in fact $Z = Y$.

Now $U_r = XJ^{r-1}Y = X_sJ_s^{r-1}Y_s + X_tJ_t^{r-1}Y_t$. Write $M = X_sY_s$ and $E_r = X_tJ_t^rY_t$, $r = 1, 2, \ldots$, so that (recall that $S = X_sJ_sX_s^{-1}$)
$$U_r = (S^{r-1} + E_{r-1}M^{-1})M = (I + E_{r-1}M^{-1}S^{-r+1})S^{r-1}M.$$
Now the fact that $S$ is a dominant solvent, together with Lemma 4.9, implies that $E_{r-1}M^{-1}S^{-r+1} \to 0$ in norm as $r \to \infty$ (indeed,
$$\|E_{r-1}M^{-1}S^{-r+1}\| \le \|X_t\|\cdot\|J_t^{r-1}\|\cdot\|Y_t\|\cdot\|M^{-1}\|\cdot\|S^{-(r-1)}\| \to 0$$
by Lemma 4.9). So for large enough $r$, $U_r$ will be nonsingular. Furthermore, when this is the case,
$$U_{r+1}U_r^{-1} = (I + E_rM^{-1}S^{-r})\,S\,(I + E_{r-1}M^{-1}S^{-r+1})^{-1},$$
and it is clear, by use of the same lemma, that $U_{r+1}U_r^{-1} \to S$ as $r \to \infty$. $\square$
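Theorem 4.8 translates directly into code. The sketch below is illustrative (the helper name and the test polynomial are ours, not from the text): the example is built from a pair of commuting factors so that both $L$ and $L^T$ are guaranteed to have dominant solvents, and the iterates $U_{r+1}U_r^{-1}$ converge to $S$.

```python
import numpy as np

def bernoulli_dominant_solvent(A, steps=60):
    """Iterate (4.15) with U_1 = ... = U_{l-1} = 0, U_l = I and return
    U_{r+1} U_r^{-1} after `steps` iterations (A = [A_0, ..., A_{l-1}])."""
    l, n = len(A), A[0].shape[0]
    U = [np.zeros((n, n)) for _ in range(l - 1)] + [np.eye(n)]
    for _ in range(steps):
        nxt = -sum(A[j] @ U[j] for j in range(l))   # U_{r+l} = −Σ_j A_j U_{r+j}
        U = U[1:] + [nxt]
    return U[-1] @ np.linalg.inv(U[-2])

# L(λ) = (Iλ − B)(Iλ − S) with B = S − 2I, so B and S commute and the same
# polynomial also factors as (Iλ − S)(Iλ − B); hence L^T has the dominant
# right solvent S^T.  Here σ(S) = {3, 4} dominates σ(B) = {1, 2}.
S = np.array([[3., 1.], [0., 4.]])
B = S - 2 * np.eye(2)
A1, A0 = -(B + S), B @ S
print(bernoulli_dominant_solvent([A0, A1]))   # converges to S
```

The convergence rate is governed by the ratio of the largest non-dominant to the smallest dominant eigenvalue modulus, here $(2/3)^r$.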
The hypothesis in Theorem 4.8 that $L^T(\lambda)$ should also have a dominant solvent may look unnatural, but the following example shows that it is generally necessary.

EXAMPLE 4.4. Let
$$L(\lambda) = \begin{bmatrix} \lambda^2 - 1 & 0 \\ \lambda - 1 & \lambda^2 \end{bmatrix},$$
with eigenvalues $1$, $-1$, $0$, $0$. We construct a right spectral divisor with spectrum $\{1, -1\}$. The following matrices form a canonical triple for $L$:
$$X = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 \end{bmatrix}, \qquad J = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad Y = \begin{bmatrix} \tfrac12 & 0 \\ -\tfrac12 & 0 \\ 1 & 0 \\ -1 & 1 \end{bmatrix},$$
and a dominant solvent is defined by
$$S = X_1J_1X_1^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & -1 \end{bmatrix}.$$
Indeed, we have
$$L(\lambda) = \begin{bmatrix} \lambda + 1 & -1 \\ 1 & \lambda - 1 \end{bmatrix}\begin{bmatrix} \lambda - 1 & 1 \\ 0 & \lambda + 1 \end{bmatrix}.$$
However, since the corresponding matrix $Y_1$ is singular, Bernoulli's method breaks down. With initial values $U_0 = U_1 = 0$, $U_2 = I$ it is found that
$$U_3 = \begin{bmatrix} 0 & 0 \\ -1 & 0 \end{bmatrix}, \qquad U_4 = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}, \qquad U_{r+2} = U_r \quad (r \ge 3),$$
so that the sequence $\{U_r\}_{r=0}^{\infty}$ oscillates and (for $r > 2$) takes only singular values. $\square$
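The breakdown in Example 4.4 is easy to reproduce by running the recursion (4.15) directly (an illustrative sketch, not part of the text):

```python
import numpy as np

# L(λ) = [[λ²−1, 0], [λ−1, λ²]] of Example 4.4
A1 = np.array([[0., 0.], [1., 0.]])
A0 = np.array([[-1., 0.], [-1., 0.]])

U_prev, U = np.zeros((2, 2)), np.eye(2)       # U_1 = 0, U_2 = I
seq = []
for _ in range(6):
    U_prev, U = U, -A1 @ U - A0 @ U_prev      # U_{r+2} = −A1 U_{r+1} − A0 U_r
    seq.append(U)

for Ur in seq:
    print(Ur.ravel(), "det =", np.linalg.det(Ur))
# The iterates alternate between [[0,0],[−1,0]] and [[1,0],[1,0]], all
# singular, so U_{r+1} U_r^{-1} never exists and Bernoulli's method fails.
```
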
An important case where the hypothesis concerning $L^T(\lambda)$ is redundant is that in which $L$ is self-adjoint (refer to Chapter 10). In this case $L(\lambda) = L^*(\lambda)$ and, if $L(\lambda) = Q(\lambda)(I\lambda - S)$, it follows that $L^T(\lambda) = \overline{Q}(\lambda)(I\lambda - \overline{S})$ and, if $S$ is dominant for $L$, so is $\overline{S}$ for $L^T$.

4.3. Stable and Exponentially Growing Solutions of Differential Equations
Let $L(\lambda) = I\lambda^l + \sum_{j=0}^{l-1}A_j\lambda^j$ be a monic matrix polynomial, and let
$$\frac{d^l x(t)}{dt^l} + \sum_{j=0}^{l-1} A_j\frac{d^j x(t)}{dt^j} = 0, \qquad t \in [a, \infty) \tag{4.21}$$
be the corresponding homogeneous differential equation. According to Theorem 2.9, the general solution of (4.21) is given by the formula
$$x(t) = Xe^{tJ}c,$$
where $(X, J)$ is a Jordan pair of $L(\lambda)$, and $c \in \mathbb C^{nl}$. Assume now that $L(\lambda)$ has no eigenvalues on the imaginary axis. In this case we can write
$$X = [X_1 \;\; X_2], \qquad J = \begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix},$$
where $\sigma(J_1)$ (resp. $\sigma(J_2)$) lies in the open left (resp. right) half-plane. Accordingly,
$$x(t) = x_1(t) + x_2(t), \tag{4.22}$$
where $x_1(t) = X_1e^{tJ_1}c_1$, $x_2(t) = X_2e^{tJ_2}c_2$, and $c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$. Observe that $\lim_{t\to\infty}\|x_1(t)\| = 0$ and, if $c_2 \ne 0$, $\lim_{t\to\infty}\|x_2(t)\| = \infty$. Moreover, $x_2(t)$ is
exponentially growing. More precisely, let μ (resp. ν) be the largest (resp. smallest) real part of an eigenvalue of L(λ) among all eigenvalues with positive real part; then ‖e^{−μt}x₂(t)‖ grows no faster than t^p for some positive integer p, but lim_{t→∞} ‖e^{(ε−ν)t}x₂(t)‖ = ∞ for every ε > 0 (unless c₂ = 0). Equality (4.22) shows that every solution is a sum of a stable solution (i.e., one which tends to 0 as t → ∞) and an exponentially growing solution, and such a sum is uniquely determined by x(t). The following question is of interest: when do initial conditions (i.e., vectors x^{(j)}(a), j = 0, ..., k − 1, for some k) determine a unique stable solution x(t) of (4.21), i.e., one for which x₂(t) ≡ 0 in (4.22)? It turns out that the answer is closely related to the existence of a right spectral divisor of L(λ).
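Numerically, the splitting (4.22) can be reproduced from a linearization. The sketch below is our own illustration, not from the text: it builds the companion matrix of L(λ) and, as a simplifying assumption, uses an eigenvalue decomposition in place of a full Jordan pair (so it assumes the companion matrix is diagonalizable).

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of the monic polynomial I*lam^l + sum_j A_j*lam^j,
    given coeffs = [A_0, ..., A_{l-1}]."""
    l = len(coeffs)
    n = coeffs[0].shape[0]
    C = np.zeros((n * l, n * l), dtype=complex)
    C[:-n, n:] = np.eye(n * (l - 1))          # shift structure
    for j, Aj in enumerate(coeffs):
        C[-n:, j * n:(j + 1) * n] = -Aj       # last block row: -A_0, ..., -A_{l-1}
    return C

# L(lam) = I*lam^2 + A1*lam + A0, chosen with no purely imaginary eigenvalues
A0 = np.array([[2.0, 0.0], [0.0, -3.0]])
A1 = np.array([[3.0, 0.0], [0.0, 2.0]])
C = companion([A0, A1])
w, V = np.linalg.eig(C)       # eigenvalues w, eigenvectors V (assumed diagonalizable)
X = V[:2, :]                  # "X" part of the pair (X, J) with J = diag(w)

def solution(t, c):
    """General solution x(t) = X exp(tJ) c of the ODE (4.21)."""
    return X @ (np.exp(t * w) * c)

# split by the sign of Re(lambda): x = x1 (stable) + x2 (growing)
stable = w.real < 0
c = np.ones(4, dtype=complex)
x1 = lambda t: X[:, stable] @ (np.exp(t * w[stable]) * c[stable])
x2 = lambda t: X[:, ~stable] @ (np.exp(t * w[~stable]) * c[~stable])
```

Here x₁(t) decays and x₂(t) grows exponentially, and x(t) = x₁(t) + x₂(t) recovers the decomposition (4.22).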
Theorem 4.10. Let L(λ) be a monic matrix polynomial such that σ(L) does not intersect the imaginary axis. Then for every set of k vectors x₀(a), ..., x₀^{(k−1)}(a) there exists a unique stable solution x(t) of (4.21) such that x^{(j)}(a) = x₀^{(j)}(a), j = 0, ..., k − 1, if and only if the matrix polynomial L(λ) has a monic right divisor L₁(λ) of degree k such that σ(L₁) lies in the open left half-plane.
Proof. Assume first that L₁(λ) is a monic right divisor of L(λ) of degree k such that σ(L₁) lies in the open left half-plane. Then a fundamental result of the theory of differential equations states that for every set of k vectors x₀(a), ..., x₀^{(k−1)}(a) there exists a unique solution x₁(t) of the differential equation L₁(d/dt)x₁(t) = 0 such that

    x₁^{(j)}(a) = x₀^{(j)}(a),    j = 0, ..., k − 1.    (4.23)

Of course, x₁(t) is also a solution of (4.21) and, because σ(L₁) is in the open left half-plane,

    lim_{t→∞} x₁(t) = 0.    (4.24)
We show now that x₁(t) is the unique solution of (4.21) such that (4.23) and (4.24) are satisfied. Indeed, let x̃₁(t) be another such solution. Using the decomposition (4.22) we obtain x₁(t) = X₁e^{tJ₁}c₁ and x̃₁(t) = X₁e^{tJ₁}c̃₁ for some c₁, c̃₁. Now [x₁(t) − x̃₁(t)]^{(j)}(a) = 0 for j = 0, ..., k − 1 implies that

    [ X₁
      X₁J₁
      ⋮
      X₁J₁^{k−1} ] e^{aJ₁}(c₁ − c̃₁) = 0.    (4.25)

Since (X₁, J₁) is a Jordan pair of L₁(λ) (see Theorems 3.12 and 4.1), the matrix col(X₁J₁^i)_{i=0}^{k−1} is nonsingular, and by (4.25), c₁ = c̃₁, i.e., x₁(t) = x̃₁(t).
Suppose now that for every set of k vectors x₀(a), ..., x₀^{(k−1)}(a) there exists a unique stable solution x(t) such that x^{(j)}(a) = x₀^{(j)}(a), j = 0, ..., k − 1. Write x(t) = X₁e^{tJ₁}c₁ for some c₁; then

    col(X₁J₁^j)_{j=0}^{k−1} e^{aJ₁} c₁ = col(x₀^{(j)}(a))_{j=0}^{k−1}.    (4.26)

For every right-hand side of (4.26) there exists a unique c₁ such that (4.26) holds. This means that col(X₁J₁^j)_{j=0}^{k−1} is square and nonsingular, and the existence of a right monic divisor L₁(λ) of degree k with σ(L₁) in the open left half-plane follows from Theorem 3.12. □

4.4. Left and Right Spectral Divisors
Example 4.3 indicates that there remains the interesting question of when a set of kn eigenvalues, together with a surrounding contour Γ, determines both a Γ-spectral right divisor and a Γ-spectral left divisor. Two results in this direction are presented.

Theorem 4.11. Let L be a monic matrix polynomial and Γ a contour consisting of regular points of L having exactly kn eigenvalues of L (counted according to multiplicities) inside Γ. Then L has both a Γ-spectral right divisor and a Γ-spectral left divisor if and only if the nk × nk matrix M_{k,k} defined by

    M_{k,k} = (2πi)^{−1} ∮_Γ [ L^{−1}(λ)          ⋯  λ^{k−1}L^{−1}(λ)
                               ⋮                       ⋮
                               λ^{k−1}L^{−1}(λ)   ⋯  λ^{2k−2}L^{−1}(λ) ] dλ    (4.27)
is nonsingular. If this condition is satisfied, then the Γ-spectral right (resp. left) divisor L₁(λ) = Iλ^k + Σ_{j=0}^{k−1} L_{1j}λ^j (resp. L₂(λ) = Iλ^k + Σ_{j=0}^{k−1} L_{2j}λ^j) is given by the formula

    [L_{10}  ⋯  L_{1,k−1}] = −(2πi)^{−1} ∮_Γ [λ^k L^{−1}(λ)  ⋯  λ^{2k−1}L^{−1}(λ)] dλ · M_{k,k}^{−1}

(resp.

    [ L_{20}
      ⋮
      L_{2,k−1} ] = −M_{k,k}^{−1} · (2πi)^{−1} ∮_Γ [ λ^k L^{−1}(λ)
                                                     ⋮
                                                     λ^{2k−1}L^{−1}(λ) ] dλ ).
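The contour integrals in (4.27) and in the divisor formulas can be approximated by the trapezoid rule on Γ. The following sketch is our own illustration for the simplest case n = 1, k = 1 with Γ the unit circle; the helper `moment` and the sample polynomial are assumptions, not part of the text.

```python
import numpy as np

def moment(Lval, j, N=4096):
    """(1/2πi) ∮_{|λ|=1} λ^j L^{-1}(λ) dλ by the trapezoid rule,
    which is spectrally accurate for analytic integrands on the circle."""
    th = 2 * np.pi * np.arange(N) / N
    lam = np.exp(1j * th)
    # on the unit circle, (1/2πi)∮ f(λ) dλ equals the mean of f(λ)·λ over θ
    vals = [lam[m] ** (j + 1) * np.linalg.inv(Lval(lam[m])) for m in range(N)]
    return np.mean(vals, axis=0)

# n = 1: L(λ) = (λ - a)(λ - b) with one zero inside Γ and one outside
a, b = 0.5, 2.0
L = lambda lam: np.array([[(lam - a) * (lam - b)]])

M = moment(L, 0)                        # M_{1,1} for k = 1
L10 = -moment(L, 1) @ np.linalg.inv(M)  # coefficient of the divisor λ + L10
```

The computed right divisor is L₁(λ) = λ − a, which captures exactly the eigenvalue of L inside Γ.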
Proof. Let X, T, Y be a standard triple for L, and define X₁, T₁, Y₁ and X₂, T₂, Y₂ as in the proof of Theorem 4.2. Then, as in the proof of that theorem, we obtain

    M_{k,k} = col(X₁T₁^i)_{i=0}^{k−1} · row(T₁^jY₁)_{j=0}^{k−1}.    (4.28)
Then nonsingularity of M_{k,k} implies the nonsingularity of both factors and, as in the proof of Theorem 4.2 (resp. Theorem 4.4), we deduce the existence of a Γ-spectral right (left) divisor. Conversely, the existence of both divisors implies the nonsingularity of both factors on the right of (4.28), and hence the nonsingularity of M_{k,k}. The formulas for the Γ-spectral divisors are verified in the same way as in the proof of Theorem 4.2. □

In terms of the matrices T(L_Γ^{−1}; α, β, γ) introduced in Section 4.1, Theorem 4.11 can be restated as follows.

Theorem 4.11'. Let L and Γ be as in Theorem 4.11. Then L has right and left Γ-spectral divisors L₁(λ) = Iλ^k + Σ_{j=0}^{k−1} L_{1j}λ^j and L₂(λ) = Iλ^k + Σ_{j=0}^{k−1} L_{2j}λ^j, respectively, if and only if det T(L_Γ^{−1}; −1, −k, −2k + 1) ≠ 0. In this case the coefficients of L₁(λ) and L₂(λ) are given by the formulas

    [L_{1,k−1}  L_{1,k−2}  ⋯  L_{10}] = −T(L_Γ^{−1}; −k − 1, −k − 1, −2k) · (T(L_Γ^{−1}; −1, −k, −2k + 1))^{−1},

    [ L_{2,k−1}
      ⋮
      L_{20} ] = −(T(L_Γ^{−1}; −1, −k, −2k + 1))^{−1} · T(L_Γ^{−1}; −k − 1, −2k, −2k).
In the next theorem we relax the explicit assumption that Γ contains exactly the right number of eigenvalues. It turns out that this is implicit in the assumptions concerning the choice of l and k.

Theorem 4.12. Let L be a monic matrix polynomial of degree l ≤ 2k, and let Γ be a contour of regular points of L. If det M_{k,k} ≠ 0 (M_{k,k} defined by (4.27)), then there exist a Γ-spectral right divisor and a Γ-spectral left divisor.
Proof. We prove Theorem 4.12 only for the case l = 2k; for the general case l < 2k we refer to [36b]. Suppose that Γ contains exactly p eigenvalues of L in its interior. As in the preceding proofs, we may then obtain a factorization of M_{k,k} in the form (4.28), where X₁ = X|_ℳ, T₁ = T|_ℳ, and ℳ is the p-dimensional invariant subspace associated with the eigenvalues inside Γ. The right and left factors in (4.28) are then linear transformations from ℂ^{nk} to ℳ and from ℳ to ℂ^{nk}, respectively. The nonsingularity of M_{k,k} therefore implies p = dim ℳ ≥ dim ℂ^{nk} = kn.
With Y₁, X₂, T₂, Y₂ defined as in Theorem 4.2, and using the biorthogonality condition (2.14), we have

    0 = [ XY         XTY  ⋯  XT^{k−1}Y
          ⋮                    ⋮
          XT^{k−1}Y  ⋯       XT^{2k−2}Y ]
      = M_{k,k} + [ X₂
                    ⋮
                    X₂T₂^{k−1} ] [Y₂  ⋯  T₂^{k−1}Y₂].

Thus, the last matrix product is invertible, and we can apply the argument used in the first paragraph to deduce that ln − p ≥ kn. Hence p = kn, and the theorem follows from Theorem 4.11. □

A closed rectifiable contour Ξ, lying outside Γ together with its interior, is said to be complementary to Γ (relative to L) if Ξ contains in its interior exactly those spectral points of L(λ) which are outside Γ. Observe that L has right and left Γ-spectral monic divisors of degree k if and only if L has left and right Ξ-spectral monic divisors of degree l − k. This observation allows us to use Theorem 4.12 for Ξ as well as for Γ. As a consequence, we obtain, for example, the following interesting fact.

Corollary 4.13. Let L and Γ be as in Theorem 4.12, and suppose l = 2k. Then M_{k,k} is nonsingular if and only if M^Ξ_{k,k} is nonsingular, where Ξ is a contour complementary to Γ and M^Ξ_{k,k} is defined by (4.27) with Γ replaced by Ξ.
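For n = 1 the count of zeros inside Γ and inside a complementary contour Ξ can be checked directly; a small sketch (the sample polynomial is ours), with Γ the unit circle and Ξ any contour enclosing the remaining zeros:

```python
import numpy as np

# p(λ) of degree l = 4: two zeros inside the unit circle, two outside
p = np.poly([0.3, -0.6, 1.5, 2.5])          # coefficients, highest power first
roots = np.roots(p)
k = int(np.sum(np.abs(roots) < 1))          # zeros inside Γ
l_minus_k = int(np.sum(np.abs(roots) > 1))  # zeros inside a complementary contour Ξ
```

The two counts always sum to the degree l, which is the scalar content of the Γ/Ξ duality above.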
For n = 1 the results of this section can be stated as follows.

Corollary 4.14. Let p(λ) = Σ_{j=0}^{l} a_jλ^j be a scalar polynomial with a_l ≠ 0 and p(λ) ≠ 0 for λ ∈ Γ. Then an integer k is the number of zeros of p(λ) (counting multiplicities) inside Γ if and only if

    rank M_{k,k} = k    for  k ≥ l/2,

or

    rank M^Ξ_{l−k,l−k} = l − k    for  k ≤ l/2,

where Ξ is a complementary contour to Γ (relative to p(λ)).

4.5. Canonical Factorization
Let Γ be a rectifiable simple closed contour in the complex plane ℂ bounding the domain F⁺. Denote by F⁻ the complement of F⁺ ∪ Γ in ℂ ∪ {∞}. We shall always assume that 0 ∈ F⁺ (and ∞ ∈ F⁻). The most important example of such a contour is the unit circle Γ₀ = {λ | |λ| = 1}. Denote by C⁺(Γ) (resp. C⁻(Γ)) the class of all n × n matrix-valued functions G(λ), λ ∈ Γ ∪ F⁺ (resp. λ ∈ Γ ∪ F⁻), such that G(λ) is analytic in F⁺ (resp. F⁻) and continuous in F⁺ ∪ Γ (resp. F⁻ ∪ Γ). A continuous and invertible n × n matrix function H(λ) (λ ∈ Γ) is said to admit a right canonical factorization (relative to Γ) if

    H(λ) = H₊(λ)H₋(λ),    λ ∈ Γ,    (4.29)

where H₊^{±1}(λ) ∈ C⁺(Γ) and H₋^{±1}(λ) ∈ C⁻(Γ). Interchanging the places of H₊(λ) and H₋(λ) in (4.29), we obtain a left canonical factorization of H. Observe that, by taking transposes in (4.29), a right canonical factorization of H(λ) determines a left canonical factorization of (H(λ))^T. This simple fact allows us to deal in the sequel mostly with right canonical factorization, bearing in mind that analogous results hold also for left factorization. In contrast, the following simple example shows that the existence of a canonical factorization from one side does not in general imply the existence of a canonical factorization from the other side.

Example 4.5. Let

    H(λ) = [ λ^{−1}  1
             0       λ ],

and let Γ = Γ₀ be the unit circle. Then H(λ) = H₊(λ)H₋(λ) is a right canonical factorization of H(λ), with

    H₊(λ) = [ 0   1 ]        and        H₋(λ) = [ 1       0 ]
            [ −1  λ ]                            [ λ^{−1}  1 ].

On the other hand, H(λ) does not admit a left canonical factorization. □
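The right canonical factorization of H(λ) = [λ^{−1} 1; 0 λ] can be verified pointwise on the unit circle. The explicit factors below are one admissible choice (our reconstruction; canonical factors are unique only up to a constant invertible factor):

```python
import numpy as np

H      = lambda lam: np.array([[1 / lam, 1], [0, lam]])
Hplus  = lambda lam: np.array([[0, 1], [-1, lam]])     # H+ and its inverse are polynomial
Hminus = lambda lam: np.array([[1, 0], [1 / lam, 1]])  # H- and its inverse are analytic at infinity

# check H = H+ · H- at sample points of the unit circle
for th in np.linspace(0.0, 2 * np.pi, 9):
    lam = np.exp(1j * th)
    assert np.allclose(Hplus(lam) @ Hminus(lam), H(lam))
```

Here H₊^{±1} is entire (hence in C⁺(Γ₀)) and H₋^{±1} is analytic away from 0 and tends to I at ∞ (hence in C⁻(Γ₀)), so the factorization conditions are met.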
In this section we shall consider canonical factorizations, with respect to a rectifiable simple contour Γ, of rational matrix functions of the type R(λ) = Σ_{j=−r}^{s} A_jλ^j. We restrict our presentation to the case that R(λ) is monic, i.e., A_s = I. This assumption is made in order to use the results of Sections 4.1 and 4.4, where only monic matrix polynomials are considered; but there is nothing essential in this assumption, and most of the results presented below remain valid without the requirement that A_s = I (cf. [36c, 36d, 73]).
We introduce some notation. Denote by GL(ℂⁿ) the class of all nonsingular n × n matrices (with complex entries). For a continuous n × n matrix-valued function H(λ): Γ → GL(ℂⁿ) define

    D_j = (2πi)^{−1} ∮_Γ λ^{−j−1} H(λ) dλ,

and define T(H_Γ; α, β, γ) to be the following block Toeplitz matrix:

    T(H_Γ; α, β, γ) = [ D_β      D_{β−1}  ⋯  D_γ
                        D_{β+1}  D_β      ⋯  D_{γ+1}
                        ⋮                     ⋮
                        D_α      D_{α−1}  ⋯  D_{γ+α−β} ].    (4.30)
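Assuming Γ is the unit circle Γ₀, the moments D_j and the sections T(H_Γ; α, β, γ) can be assembled as follows. This is a sketch; the helper names and the block convention ((i, k) block equal to D_{β+i−k}, matching (4.30) as written above) are ours.

```python
import numpy as np

def D(H, j, N=2048):
    """D_j = (2πi)^{-1} ∮_{Γ0} λ^{-j-1} H(λ) dλ over the unit circle,
    via the trapezoid rule: dλ = iλ dθ reduces it to the mean of λ^{-j} H(λ)."""
    th = 2 * np.pi * np.arange(N) / N
    lam = np.exp(1j * th)
    return np.mean([lam[m] ** (-j) * H(lam[m]) for m in range(N)], axis=0)

def T(H, alpha, beta, gamma):
    """Block Toeplitz section T(H; α, β, γ) with (i, k) block D_{β+i-k}."""
    rows, cols = alpha - beta + 1, beta - gamma + 1
    return np.block([[D(H, beta + i - k) for k in range(cols)]
                     for i in range(rows)])
```

For a quick sanity check with the scalar function H(λ) = λ, the only nonzero moment is D₁ = 1.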
We have already used the matrices T(H_Γ; α, β, γ) in Sections 4.1 and 4.4. For convenience, we shall omit the subscript Γ whenever the integration is along Γ and there is no danger of misunderstanding. The next theorem provides the characterization of right factorization in terms of some finite sections of the infinite Toeplitz matrix T(·; ∞, 0, −∞).

Theorem 4.15. A monic rational matrix function R(λ) = Iλ^s + Σ_{j=−r}^{s−1} A_jλ^j: Γ → GL(ℂⁿ) admits a right canonical factorization if and only if

    rank T(R_Γ^{−1}; r − 1, 0, −r − s + 1) = rank T(R_Γ^{−1}; r − 1, −1, −r − s) = rn.    (4.31)

Proof. We first note that the canonical factorization problem can be reduced to the problem of existence of Γ-spectral divisors for monic polynomials. Indeed, let R(λ) be a monic rational function as above and introduce the monic matrix polynomial M(λ) = λ^rR(λ). Then a right canonical factorization R(λ) = W₊(λ)W₋(λ) of R(λ) readily implies the existence of the right monic Γ-spectral divisor λ^rW₋(λ) of degree r for the polynomial M(λ). Conversely, if the polynomial M(λ) has a right monic Γ-spectral divisor N₁(λ) of degree r and M(λ) = N₂(λ)N₁(λ), then the equality R(λ) = N₂(λ) · λ^{−r}N₁(λ) yields a right canonical factorization of R(λ). Now we can apply the results of Section 4.1 in the investigation of the canonical factorization. Indeed, it follows from Theorem 4.3 that the polynomial M(λ) has a Γ-spectral right divisor if and only if the conditions of Theorem 4.15 are satisfied (here we use the following observation: T(R^{−1}; α, β, γ) = T(M^{−1}; α + r, β + r, γ + r)). □
The proof of the following theorem is similar to the proof of Theorem 4.15 (or one can obtain Theorem 4.16 by applying Theorem 4.15 to the transposed rational function).

Theorem 4.16. A monic rational function R(λ) = Iλ^s + Σ_{j=−r}^{s−1} A_jλ^j: Γ → GL(ℂⁿ) admits a left canonical factorization if and only if

    rank T(R^{−1}; r − 1, −s, −r − s + 1) = rank T(R^{−1}; r − 1, −s, −r − s) = rn.    (4.32)
In the particular case n = 1, Theorem 4.15 implies the following characterization of the winding number of a scalar polynomial. The winding number (with respect to Γ) of a scalar polynomial p(λ) which does not vanish on Γ is defined as the increment of (1/2π)[arg p(λ)] when λ runs through Γ in the positive direction. It follows from the argument principle ([63], Vol. 2, Chapter 2) that the winding number of p(λ) coincides with the number of zeros of p(λ) inside Γ (with multiplicities). Let us denote by Ξ a closed simple rectifiable contour such that Ξ lies in F⁻ together with its interior, and p(λ) ≠ 0 for all points λ ∈ F⁻\{∞} lying outside Ξ.

Corollary 4.17. Let p(λ) = Σ_{j=0}^{l} a_jλ^j be a scalar polynomial with a_l ≠ 0 and p(λ) ≠ 0 for λ ∈ Γ. An integer r is the winding number of p(λ) if and only if the following condition holds:

    rank T(p^{−1}; −1, −r, −l − r + 1) = rank T(p^{−1}; −1, −r − 1, −l − r) = r.

This condition is equivalent to the following one:

    rank T(p_Ξ^{−1}; −1, −s, −l − s + 1) = rank T(p_Ξ^{−1}; −1, −s − 1, −l − s) = s,

where s = l − r.
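The winding number can also be computed directly from its definition by accumulating the increments of arg p(λ) along the unit circle (a sketch; the function name is ours):

```python
import numpy as np

def winding_number(p, N=20000):
    """Increment of (1/2π) arg p(λ) as λ traverses the unit circle once,
    computed by summing small phase differences (argument principle)."""
    th = np.linspace(0, 2 * np.pi, N, endpoint=False)
    vals = np.polyval(p, np.exp(1j * th))
    dphase = np.angle(vals[np.r_[1:N, 0]] / vals)   # phase step between samples
    return int(round(dphase.sum() / (2 * np.pi)))

# p(λ) = (λ - 0.5)(λ - 0.2)(λ - 3): two zeros inside the unit circle
p = np.poly([0.5, 0.2, 3.0])
r = winding_number(p)
```

With N large each phase step stays below π, so no branch of the argument is lost and the total increment is exactly 2π times the number of zeros inside Γ.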
The proof of Corollary 4.17 is based on the easily verified fact that r is the winding number of p(λ) if and only if the rational function λ^{−r}p(λ) admits a one-sided canonical factorization (because of commutativity, both canonical factorizations coincide). The last assertion of Corollary 4.17 reflects the fact that the contour Ξ contains in its interior all the zeros of p(λ) which are outside Γ, so p(λ) has exactly s zeros (counting multiplicities) inside Ξ if and only if it has exactly r zeros (counting multiplicities) inside Γ. Now we shall write down explicitly the factors of the canonical factorization. The formulas will involve one-sided inverses for operators of the form T(H; α, β, γ); the superscript I will indicate the appropriate one-sided inverse of such an operator (refer to Chapter S3). We shall also use the notation introduced above. The formulas for the spectral divisors and corresponding quotients obviously imply formulas for the factorization factors. Indeed, let
R(λ) = Iλ^s + Σ_{j=−r}^{s−1} A_jλ^j: Γ → GL(ℂⁿ) be a monic rational function, and let W₊(λ) = Iλ^s + Σ_{j=0}^{s−1} W_j⁺λ^j and W₋(λ) = I + Σ_{j=1}^{r} W⁻_{r−j}λ^{−j} be the factors of the right canonical factorization of R(λ): R(λ) = W₊(λ)W₋(λ). Then the polynomial λ^rW₋(λ) is a right monic Γ-spectral divisor of the polynomial M(λ) = λ^rR(λ), and formulas (4.12) and (4.13) apply. It follows that

    [W⁻_{r−1}  W⁻_{r−2}  ⋯  W⁻_0] = −T(R_Γ^{−1}; −1, −1, −r − s) · (T(R_Γ^{−1}; r − 1, 0, −r − s + 1))^I,    (4.33)

    col(W_j⁺)_{j=0}^{s−1} = −(T(R_Ξ^{−1}; r − 1, −s, −2s + 1))^I · T(R_Ξ^{−1}; r − s − 1, −2s, −2s).    (4.34)

Let us write down also the formulas for left canonical factorization: let A₊(λ) = Iλ^s + Σ_{j=0}^{s−1} A_j⁺λ^j and A₋(λ) = I + Σ_{j=1}^{r} A⁻_{r−j}λ^{−j} be the factors of a left canonical factorization of R(λ) = Iλ^s + Σ_{j=−r}^{s−1} A_jλ^j: Γ → GL(ℂⁿ), i.e., R(λ) = A₋(λ)A₊(λ). Then

    col(A_j⁺)_{j=0}^{s−1} = −(T(R_Γ^{−1}; r − 1, −s, −r − s + 1))^I · T(R_Γ^{−1}; −1, −r − s, −r − s),    (4.35)

    [A⁻_{r−1}  A⁻_{r−2}  ⋯  A⁻_0] = −T(R_Ξ^{−1}; −1, −1, −r − s) · (T(R_Ξ^{−1}; r − 1, 0, −r − s + 1))^I.    (4.36)
Alternative formulas for identifying factors in a canonical factorization, which do not use formulas (4.12) and (4.13), can also be provided. To this end, we shall establish the following formula for the coefficients of the right monic Γ-spectral divisor N(λ) of degree r of a monic matrix polynomial M(λ) of degree l:

    [Ñ_r  Ñ_{r−1}  ⋯  Ñ_0] = [I  0  ⋯  0] · (T(M^{−1}; 0, −r, −r − l))^I,    (4.37)

where Ñ_j = Y₀N_j (j = 0, 1, ..., r − 1) and Ñ_r = Y₀. Here the N_j are the coefficients of N(λ): N(λ) = Iλ^r + Σ_{j=0}^{r−1} N_jλ^j; and Y₀ is the lower coefficient of the quotient M(λ)N^{−1}(λ): Y₀ = [M(λ)N^{−1}(λ)]|_{λ=0}. It is clear that the matrix function Y(λ) = Y₀N(λ)M^{−1}(λ) is analytic in F⁺ and Y(0) = I. Then

    (2πi)^{−1} ∮_Γ λ^{−1} Y₀N(λ)M^{−1}(λ) dλ = I,    (4.38)

and

    (2πi)^{−1} ∮_Γ λ^{j} Y₀N(λ)M^{−1}(λ) dλ = 0    for j = 0, 1, ....    (4.39)

Indeed, (4.39) is a consequence of the analyticity of the functions λ^jY₀N(λ)M^{−1}(λ) in F⁺ for j = 0, 1, .... Formula (4.38) follows from the residue theorem, since λ₀ = 0 is the unique pole of the function λ^{−1}Y₀N(λ)M^{−1}(λ) in F⁺. Combining (4.38) with the first l equalities from (4.39), we easily obtain the following relationship:

    [Ñ_r  Ñ_{r−1}  ⋯  Ñ_0] · T(M_Γ^{−1}; 0, −r, −r − l) = [I  0  ⋯  0],
which implies (4.37) immediately. A similar formula can be deduced for the left Γ-spectral divisor.

The factorization L = L₂L₁ is called stable if for every ε > 0 there exists δ > 0 with the following property: if L′ ∈ 𝔐_l and σ_l(L′, L) < δ, then L′ admits a factorization L′ = L₂′L₁′ such that L₁′ ∈ 𝔐_r, L₂′ ∈ 𝔐_q and

    σ_r(L₁′, L₁) < ε,    σ_q(L₂′, L₂) < ε.
The aim of this section is to characterize stability of a factorization in terms of the supporting subspace corresponding to it. Theorem 5.5 states that any spectral divisor is stable. However, the next theorems show that there also exist stable divisors which are not spectral. First, we shall relate stable factorization to the notion of stable invariant subspaces (see Section S4.5).
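The gap θ(·, ·) between subspaces used throughout this chapter can be computed, for subspaces given by spanning matrices, as the norm of the difference of the orthogonal projectors. This is a standard, equivalent convention; the helpers below are our own sketch.

```python
import numpy as np

def proj(B):
    """Orthogonal projector onto the column span of B."""
    Q, _ = np.linalg.qr(B)
    return Q @ Q.conj().T

def gap(B1, B2):
    """Gap θ(span B1, span B2) as the spectral norm ||P1 - P2||
    (for subspaces of equal dimension this equals the sine of the
    largest principal angle between them)."""
    return np.linalg.norm(proj(B1) - proj(B2), 2)
```

For instance, two lines in ℂ² at angle φ have gap sin φ, and identical subspaces have gap 0.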
Theorem 5.7. Let L, L₁, and L₂ be monic n × n matrix polynomials and assume L = L₂L₁. This factorization is stable if and only if the corresponding supporting subspace (for the companion matrix C_L of L) is stable.

Proof. Let l be the degree of L and r that of L₁, and put C = C_L. Denote the supporting subspace corresponding to the factorization L = L₂L₁ by ℳ. If ℳ is stable for C then, using Theorem 5.2, one shows that L = L₂L₁ is a stable factorization. Now conversely, suppose the factorization is stable, but ℳ is not. Then there exist ε > 0 and a sequence of matrices {C_m} converging to C such that, for all 𝒱 ∈ Ω_m,

    θ(𝒱, ℳ) ≥ ε,    m = 1, 2, ....    (5.7)
Here Ω_m denotes the collection of all invariant subspaces of C_m. Put Q = row(δ_{i1}I)_{i=1}^{l} = [I 0 ⋯ 0] and

    S_m = col(QC_m^{i−1})_{i=1}^{l},    m = 1, 2, ....

Then {S_m} converges to col(QC^{i−1})_{i=1}^{l}, which is equal to the unit nl × nl matrix. So without loss of generality we may assume that S_m is nonsingular for all m, say with inverse S_m^{−1} = row(U_{mi})_{i=1}^{l}. Note that

    U_{mi} → col(δ_{ji}I)_{j=1}^{l},    i = 1, ..., l.    (5.8)

A straightforward calculation shows that S_mC_mS_m^{−1} is the companion matrix associated with a monic matrix polynomial, L_m say. From (5.8) and the fact that C_m → C it follows that σ_l(L_m, L) → 0. But then we may assume that for all m the polynomial L_m admits a factorization L_m = L_{m2}L_{m1} with L_{m1} ∈ 𝔐_r, L_{m2} ∈ 𝔐_{l−r}, and σ_r(L_{m1}, L₁) → 0, σ_{l−r}(L_{m2}, L₂) → 0.
154
5. PERTURBATION AND STABILITY OF DIVISORS
Let ℳ_m be the supporting subspace corresponding to the factorization L_m = L_{m2}L_{m1}. By Theorem 5.2 we have θ(ℳ_m, ℳ) → 0. Put 𝒱_m = S_m^{−1}ℳ_m. Then 𝒱_m is an invariant subspace for C_m; in other words, 𝒱_m ∈ Ω_m. Moreover, it follows from S_m → I that θ(𝒱_m, ℳ_m) → 0. (This can be verified easily by using, for example, equality (S4.12).) But then θ(𝒱_m, ℳ) → 0. This contradicts (5.7), and the proof is complete. □

We can now formulate the following criterion for stable factorization.
Theorem 5.8. Let L, L₁, and L₂ be monic n × n matrix polynomials and assume L = L₂L₁. This factorization is stable if and only if for each common eigenvalue λ₀ of L₁ and L₂ we have dim Ker L(λ₀) = 1.

In particular, it follows from Theorem 5.8 that for a spectral divisor L₁ of L the factorization L = L₂L₁ is stable. This fact can also be deduced from Theorems 5.5 and 5.2. The proof of Theorem 5.8 is based on the following lemma.
Lemma 5.9. Let

    A = [ A₁  A₀
          0   A₂ ]

be a linear transformation from ℂ^m into ℂ^m, written in matrix form with respect to the decomposition ℂ^m = ℂ^{m₁} ⊕ ℂ^{m₂} (m₁ + m₂ = m). Then ℂ^{m₁} is a stable invariant subspace for A if and only if for each common eigenvalue λ₀ of A₁ and A₂ the condition dim Ker(λ₀I − A) = 1 is satisfied.

Proof. It is clear that ℂ^{m₁} is an invariant subspace for A. We know from Theorem S4.9 that ℂ^{m₁} is stable if and only if for each Riesz projector P of A corresponding to an eigenvalue λ₀ with dim Ker(λ₀I − A) ≥ 2, we have Pℂ^{m₁} = {0} or Pℂ^{m₁} = Im P. Let P be the Riesz projector of A corresponding to an arbitrary eigenvalue λ₀. Also, for i = 1, 2, let P_i be the Riesz projector associated with A_i and λ₀. Then, for ε positive and sufficiently small,

    P = [ P₁  (2πi)^{−1} ∮_{|λ−λ₀|=ε} (Iλ − A₁)^{−1}A₀(Iλ − A₂)^{−1} dλ
          0   P₂ ].

Observe that the Laurent expansion of (Iλ − A_i)^{−1} (i = 1, 2) at λ₀ has the form

    (Iλ − A_i)^{−1} = Σ_{j=−q}^{−1} (λ − λ₀)^j P_iQ_{ij}P_i + ⋯,    i = 1, 2,    (5.9)
where the Q_{ij} are certain linear transformations of Im P_i into itself, and the ellipsis on the right-hand side of (5.9) represents a series in nonnegative powers of (λ − λ₀). From (5.9) one sees that P has the form

    P = [ P₁  P₁Q₁ + Q₂P₂
          0   P₂ ],

where Q₁ and Q₂ are certain linear transformations acting from ℂ^{m₂} into ℂ^{m₁}. It follows that Pℂ^{m₁} ≠ {0} and Pℂ^{m₁} ≠ Im P if and only if λ₀ ∈ σ(A₁) ∩ σ(A₂). Now appeal to Theorem S4.9 (see the first paragraph of the proof) to finish the proof of Lemma 5.9. □
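The Riesz projector used in the proof can be evaluated numerically by discretizing the contour integral. A sketch (the helper is ours) for a block upper triangular A with σ(A₁) = {1} and σ(A₂) = {2}:

```python
import numpy as np

def riesz_projector(A, z0, r, N=2000):
    """P = (2πi)^{-1} ∮_{|λ-z0|=r} (Iλ - A)^{-1} dλ via the trapezoid rule.
    With λ = z0 + r e^{iθ}, dλ = i(λ - z0) dθ, so the integral is the
    mean of (λ - z0)·(Iλ - A)^{-1} over θ."""
    n = A.shape[0]
    th = 2 * np.pi * np.arange(N) / N
    lam = z0 + r * np.exp(1j * th)
    vals = [np.linalg.inv(lam[m] * np.eye(n) - A) * (lam[m] - z0)
            for m in range(N)]
    return np.mean(vals, axis=0)

# block upper triangular A with A1 = [1], A0 = [5], A2 = [2]
A = np.array([[1.0, 5.0],
              [0.0, 2.0]])
P = riesz_projector(A, 1.0, 0.5)   # projector for the eigenvalue λ0 = 1
```

Since λ₀ = 1 is not a common eigenvalue of A₁ and A₂, here Pℂ^{m₁} = Im P, in line with the lemma.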
Proof of Theorem 5.8. Let ℳ be the supporting subspace corresponding to the factorization L = L₂L₁. From Theorem 5.7 we know that this factorization is stable if and only if ℳ is a stable invariant subspace for C_L. Let l be the degree of L, let r be the degree of L₁, and let

    𝒩_r = {(x₁, ..., x_l) ∈ ℂ^{nl} | x₁ = ⋯ = x_r = 0}.

Then ℂ^{nl} = ℳ ⊕ 𝒩_r. With respect to this decomposition we write C_L in the form

    C_L = [ C₁  C₀
            0   C₂ ].

From Corollaries 3.14 and 3.19 it is known that σ(L₁) = σ(C₁) and σ(L₂) = σ(C₂). The desired result is now obtained by applying Lemma 5.9.
□

5.4. Global Analytic Perturbations: Preliminaries

In Sections 5.1 and 5.2 direct advantage was taken of the explicit dependence of the companion matrix C_L of a monic matrix polynomial L(λ) on the coefficients of L(λ). Thus the appropriate standard pair for that analysis was ([I 0 ⋯ 0], C_L). However, this standard pair is used at the cost of leaving the relationship of divisors with invariant subspaces obscure and, in particular, of giving no direct line of attack on the continuation of divisors from a point to neighboring points. In this section we shall need more detailed information on the behavior of supporting subspaces and, for this purpose, Jordan pairs X_μ, J_μ will be used. Here the linearization J_μ is relatively simple and its invariant subspaces are easy to describe.

Let A₀(μ), ..., A_{l−1}(μ) be analytic functions on a connected domain Ω in the complex plane, taking values in the linear space of complex n × n matrices. Consider the matrix-valued function

    L(λ, μ) = Iλ^l + Σ_{j=0}^{l−1} A_j(μ)λ^j.    (5.10)
Construct for each μ ∈ Ω a Jordan pair (X_μ, J_μ) for L(λ, μ). The matrix J_μ is in Jordan normal form and will be supposed to have r Jordan blocks J_μ^{(i)} of size q_i, i = 1, ..., r, with associated eigenvalues λ_i(μ) (not necessarily distinct). In general, r and the q_i depend on μ. We write J_μ = diag[J_μ^{(1)}, ..., J_μ^{(r)}] and partition X_μ accordingly as X_μ = [X_μ^{(1)}, ..., X_μ^{(r)}].

Proposition 5.10. There is an, at most, countable set S₁ ⊂ Ω with no limit points in Ω such that:

(a) the integers r and q₁, ..., q_r are constant on Ω\S₁;
(b) the eigenvalues λ_i(μ) and the corresponding blocks X_μ^{(i)}, i = 1, ..., r, of X_μ are analytic functions of μ on Ω\S₁ which may also be branches of analytic functions having algebraic branch points at some points of S₁.

The set S₁ is associated with Baumgärtel's Hypospaltpunkte and consists of the points of discontinuity of J_μ in Ω as well as all branch points associated with the eigenvalues.
Let λ̃_j(μ), j = 1, 2, ..., t, denote all the distinct eigenvalue functions defined on Ω\S₁ (which are analytic in view of Proposition 5.10), and let S₂ = {μ ∈ Ω\S₁ | λ̃_i(μ) = λ̃_j(μ) for some i ≠ j}. The Jordan matrix J_μ then has the same set of invariant subspaces for every μ ∈ Ω\(S₁ ∪ S₂). (To check this, use Proposition S4.4 and the fact that, for α, β ∈ ℂ, the matrices J_μ − αI and J_μ − βI have the same invariant subspaces.) The set S₂ consists of multiple points (mehrfache Punkte) in the terminology of Baumgärtel. The set S₁ ∪ S₂ is described as the exceptional set of L(λ, μ) in Ω and is, at most, countable (however, see Theorem 5.11 below), having its limit points (if any) on the boundary of Ω.

Example 5.1. Let L(λ, μ) = λ² + μλ + ω² for some constant ω, and Ω = ℂ. Then

    J_{±2ω} = [ ∓ω  1
                0   ∓ω ]

and, for μ ≠ ±2ω, J_μ = diag[λ₁(μ), λ₂(μ)], where λ_{1,2} are the zeros of λ² + μλ + ω². Here S₁ = {2ω, −2ω}, S₂ = ∅. □
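The exceptional points of this example are exactly where the discriminant μ² − 4ω² vanishes; a quick numerical check (ω = 1.5 chosen arbitrarily for the illustration):

```python
import numpy as np

omega = 1.5

def eigs(mu):
    """Zeros of λ² + μλ + ω², i.e., the eigenvalues of L(λ, μ) for n = 1."""
    return np.roots([1.0, mu, omega ** 2])

# at μ = 2ω the discriminant μ² - 4ω² vanishes: double eigenvalue -ω
l1, l2 = eigs(2 * omega)
```

For any μ off {±2ω} the two eigenvalues are distinct, matching S₁ = {2ω, −2ω}.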
Example 5.2. Let

    L(λ, μ) = [ (λ − 1)(λ − μ)   μ
                0                (λ − 2)(λ − μ²) ]

and Ω = ℂ. The canonical Jordan matrix can be found for every μ ∈ ℂ, and hence the sets S₁ and S₂. For μ ∉ {0, 2, ±1, ±√2},

    J_μ = diag{μ, μ², 1, 2}.

Then

    J₀ = diag[0, 0, 1, 2],
    J₁ = diag{[1 1; 0 1], 1, 2},
    J₋₁ = diag{[1 1; 0 1], −1, 2},
    J₂ = diag{[2 1; 0 2], 1, 4},
    J_{±√2} = diag{[2 1; 0 2], 1, ±√2}.

It follows that S₁ = {±1, 2, ±√2} and S₂ = {0}. □
5.5. Polynomial Dependence

In the above examples it turns out that the exceptional set S₁ ∪ S₂ is finite. The following result shows that this will always be the case provided the coefficients of L(λ, μ) are polynomials in μ.

Theorem 5.11. Let L(λ, μ) be given by (5.10) and suppose that the A_i(μ) are matrix polynomials in μ (Ω = ℂ), i = 0, ..., l − 1. Then the set of exceptional points of L(λ, μ) is finite.
Proof. Passing to the linearization (with companion matrix) we can suppose L(λ, μ) is linear: L(λ, μ) = Iλ − C(μ). Consider the scalar polynomial

    f(λ, μ) = det(Iλ − C(μ)) = λⁿ + Σ_{j=0}^{n−1} a_j(μ)λʲ,    (5.11)

where n is the size of C(μ) and the a_j(μ) are polynomials in μ. The equation

    f(λ, μ) = 0    (5.12)

determines n zeros λ₁(μ), ..., λ_n(μ). Then for μ in ℂ\S₃ (where S₃ is some finite set) the number of distinct zeros of (5.12) is constant. Indeed, consider the following matrix of size (2n − 1) × (2n − 1), in which a₀, a₁, ..., a_{n−1} depend on μ:
    M(μ) = [ 1  a_{n−1}  a_{n−2}  ⋯  a₀       0   ⋯  0
             0  1        a_{n−1}  ⋯  a₁       a₀  ⋯  0
             ⋮                                       ⋮
             0  ⋯        0   1   a_{n−1}  ⋯         a₀
             n  (n−1)a_{n−1}  (n−2)a_{n−2}  ⋯  a₁   0  ⋯  0
             0  n        (n−1)a_{n−1}  ⋯       a₁   0  ⋯  0
             ⋮                                       ⋮
             0  ⋯        0   n   (n−1)a_{n−1}  ⋯    a₁ ]    (5.13)

(the first n − 1 rows carry the coefficients of f, the last n rows those of ∂f/∂λ). The matrix M(μ) is just the resultant matrix of f(λ, μ) and ∂f(λ, μ)/∂λ, considered as polynomials in λ with parameter μ, and det M(μ) is the resultant determinant of f(λ, μ) and ∂f(λ, μ)/∂λ. (See, for instance, [78, Chapter XII] for
resultants and their basic properties.) We shall need the following property of M(μ):

    rank M(μ) = 2n − 1 − ν(μ),

where ν(μ) is the number of common zeros (counting multiplicities) of f(λ, μ) and ∂f(λ, μ)/∂λ (see [26]). On the other hand, n = ν(μ) + σ(μ), where σ(μ) is the number of distinct zeros of (5.12); so

    rank M(μ) = n − 1 + σ(μ).    (5.14)

Since C(μ) is a polynomial in μ, we have

    rank M(μ) = max_{μ∈ℂ} rank M(μ)    for    μ ∈ ℂ\S₃,    (5.15)
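The rank relation (5.14) can be checked experimentally by forming the Sylvester matrix of f and ∂f/∂λ for sample polynomials (helpers ours; coefficient vectors are highest-order first, as in numpy):

```python
import numpy as np

def sylvester(p, q):
    """Sylvester (resultant) matrix of polynomials p (degree n) and
    q (degree m), coefficients highest-order first; size (n+m) x (n+m)."""
    n, m = len(p) - 1, len(q) - 1
    S = np.zeros((n + m, n + m))
    for i in range(m):                 # m shifted rows of p's coefficients
        S[i, i:i + n + 1] = p
    for i in range(n):                 # n shifted rows of q's coefficients
        S[m + i, i:i + m + 1] = q
    return S

def rank_M(p):
    """rank of M = Syl(f, df/dλ), which should equal n - 1 + σ,
    with σ the number of distinct zeros of f."""
    return np.linalg.matrix_rank(sylvester(p, np.polyder(p)))

f_simple = np.poly([1.0, 2.0, 3.0])   # n = 3, three distinct zeros
f_double = np.poly([1.0, 1.0, 3.0])   # a double zero: two distinct zeros
```

For f_simple the matrix has full rank 5 = 3 − 1 + 3; for f_double the rank drops to 4 = 3 − 1 + 2.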
where S₃ is a finite set. Indeed, let μ₀ ∈ ℂ be such that

    rank M(μ₀) = max_{μ∈ℂ} rank M(μ) =: r̃,

and let M₀(μ) be an r̃ × r̃ submatrix of M(μ) such that det M₀(μ₀) ≠ 0. Now det M₀(μ) is a nonzero polynomial in μ, and therefore has only finitely many zeros. Since rank M(μ) = r̃ provided det M₀(μ) ≠ 0, equality (5.15) follows. By (5.14) the number of distinct zeros of (5.12) is constant for μ ∈ ℂ\S₃. Moreover, σ(μ₁) < σ(μ₂) for μ₁ ∈ S₃ and μ₂ ∈ ℂ\S₃.

Consider now the decomposition of f(λ, μ) into a product of irreducible polynomials:

    f(λ, μ) = f₁(λ, μ) ⋯ f_m(λ, μ),    (5.16)

where the f_i(λ, μ) are polynomials in λ whose coefficients are polynomials in μ and whose leading coefficient is equal to 1; moreover, the f_i(λ, μ) are irreducible. Consider one of the factors in (5.16), say f₁(λ, μ). It is easily seen from the choice of S₃ that the number of distinct zeros of f₁(λ, μ) is constant for μ ∈ ℂ\S₃. Let λ₁(μ), ..., λ_s(μ), μ ∈ ℂ\S₃, be all the distinct zeros of the equation f₁(λ, μ) = 0. For given λ_i(μ), i = 1, ..., s, and for given j = 1, 2, ..., denote
    ν_{ij}(μ) = rank(λ_i(μ)I − C(μ))^j.

Let us prove that ν_{ij}(μ) is constant for μ ∈ (ℂ\S₃)\S_{ij}, where S_{ij} is a finite set. Indeed, let

    ν_{ij} = max_{μ∈ℂ\S₃} rank(λ_i(μ)I − C(μ))^j,
and let μ₀ = μ₀(i, j) ∈ ℂ\S₃ be such that ν_{ij}(μ₀) = ν_{ij}. Let A_{ij}(λ, μ) be a square submatrix of (Iλ − C(μ))^j (of size ν_{ij} × ν_{ij}) such that det A_{ij}(λ_i(μ₀), μ₀) ≠ 0. We claim that

    det A_{ij}(λ_k(μ), μ) ≢ 0    (5.17)

for k = 1, 2, ..., s (so in fact ν_{ij} does not depend on i). To check (5.17) we need the following properties of the zeros λ₁(μ), ..., λ_s(μ): (1) the λ_i(μ) are analytic in ℂ\S₃; (2) λ₁(μ), ..., λ_s(μ) are the branches of the same multiple-valued analytic function in ℂ\S₃; i.e., for every μ ∈ ℂ\S₃ and for every pair λ_i(μ), λ_j(μ) of zeros of f₁(λ, μ) there exists a closed rectifiable contour Γ ⊂ ℂ\S₃ with initial and terminal point μ such that, after one complete turn along Γ, the branch λ_i(μ) becomes λ_j(μ). Property (1) follows from Proposition 5.10(b), taking into account that the number of distinct zeros of f₁(λ, μ) is constant in ℂ\S₃. Property (2) follows from the irreducibility of f₁(λ, μ) (this is a well-known fact; see, for instance, [63, Vol. III, Theorem 8.22]). Now suppose (5.17) does not hold for some k, so that det A_{ij}(λ_k(μ), μ) ≡ 0. Let Γ be a contour in ℂ\S₃ such that μ₀ ∈ Γ and, after one complete turn (with initial and terminal point μ₀) along Γ, the branch λ_k(μ) becomes λ_i(μ). Then by analytic continuation along Γ we obtain det A_{ij}(λ_i(μ₀), μ₀) = 0, a contradiction with the choice of μ₀. So (5.17) holds and, in particular, there exists μ₀′ = μ₀′(i, j) ∈ ℂ\S₃ such that

    det A_{ij}(λ_k(μ₀′), μ₀′) ≠ 0,    k = 1, ..., s.    (5.18)

(For instance, μ₀′ can be chosen in a neighborhood of μ₀.) Consider now the system of two scalar polynomial equations

    f₁(λ, μ) = 0,    det A_{ij}(λ, μ) = 0.    (5.19)
(Here i and j are assumed to be fixed as above.) According to (5.18), for μ = μ₀′ this system has no solution. Therefore the resultant R_{ij}(μ) of f₁(λ, μ) and det A_{ij}(λ, μ) is not identically zero (we use here the following property of R_{ij}(μ) (see [78, Chapter XII]): system (5.19) has a common solution λ for fixed μ if and only if R_{ij}(μ) = 0). Since R_{ij}(μ) is a polynomial in μ, it has only a finite set S_{ij} of zeros. By the property of the resultant mentioned above,

    ν_{ij}(μ) = ν_{ij}    for    μ ∈ (ℂ\S₃)\S_{ij},

as claimed. Now let T₁ = ∪_{i=1}^{s} ∪_{j=1}^{n} S_{ij}, and let T = ∪_{h=1}^{m} T_h, where T₂, ..., T_m are constructed analogously for the irreducible factors f₂(λ, μ), ..., f_m(λ, μ), respectively, in (5.16). Clearly T is a finite set. We claim that the exceptional set S₁ ∪ S₂ is contained in S₃ ∪ T (thereby proving Theorem 5.11). Indeed,
from the construction of S₃ and T it follows that the number of distinct eigenvalues of Iλ − C(μ) is constant for μ ∈ ℂ\(S₃ ∪ T), and for every eigenvalue λ₀(μ) (which is analytic in ℂ\(S₃ ∪ T)) of Iλ − C(μ),

    r(μ; j) := rank(λ₀(μ)I − C(μ))^j

is constant for μ ∈ ℂ\(S₃ ∪ T), j = 0, 1, .... The sizes of the Jordan blocks in the Jordan normal form of C(μ) corresponding to λ₀(μ) are completely determined by the numbers r(μ; j); namely,

    γ_j(μ) = r(μ; j − 1) − r(μ; j),    j = 1, ..., n,
where γ_j(μ) is the number of Jordan blocks corresponding to λ_0(μ) whose size is not less than j, j = 1, ..., n (so γ_1(μ) is just the number of Jordan blocks). Equality (5.20) is easily observed for the Jordan normal form of C(μ); then it clearly holds for C(μ) itself. Since the r(μ; j) are constant for μ ∈ ℂ\(S_3 ∪ T), (5.20) ensures that the Jordan structure of C(μ) corresponding to λ_0(μ) is also constant for μ ∈ ℂ\(S_3 ∪ T). Consequently, the exceptional set of Iλ − C(μ) is disjoint from S_3 ∪ T. □

Let us estimate the number of exceptional points for L(λ, μ) as in Theorem 5.11, assuming for simplicity that det L(λ, μ) is an irreducible polynomial. (We shall not aim for the best possible estimate.) Let m be the maximal degree of the matrix polynomials A_0(μ), ..., A_{l-1}(μ). Then the degrees of the coefficients a_i(μ) of f(λ, μ) = det(Iλ − C(μ)), where C(μ) is the companion matrix of L(λ, μ), do not exceed mn. Consequently, the degree of any minor (not identically zero) of M(μ) (given by formula (5.13)) does not exceed (2nl − 1)mn. So the set S_3 (introduced in the proof of Theorem 5.11) contains not more than (2nl − 1)mn elements. Further, the left-hand sides of Eq. (5.19) are polynomials whose degrees (as polynomials in λ) are nl (for f_1(λ, μ) = f(λ, μ)) and not greater than nlj (for det A_ij(λ, μ)). Thus the resultant matrix of f(λ, μ) and det A_ij(λ, μ) has size not greater than nl + nlj = nl(j + 1). Further, the degrees of the coefficients of f(λ, μ) (as polynomials in μ) do not exceed mn, and the degrees of the coefficients of det A_ij(λ, μ) (as polynomials in μ) do not exceed mnj. Thus the resultant R_ij(μ) of f(λ, μ) and det A_ij(λ, μ) (which is equal to the determinant of the resultant matrix) has degree less than or equal to lmn²j(j + 1). Hence the number of elements in S_ij (in the proof of Theorem 5.11) does not exceed lmn²j(j + 1). Finally, the set

    S_3 ∪ ⋃_{i=1}^{nl} ⋃_{j=1}^{nl} S_ij

consists of not more than

    (2nl − 1)mn + n³l²m Σ_{j=1}^{nl} (j + 1)j = (2nl − 1)mn + n³l²m(⅓n³l³ + n²l² + ⅔nl) = mn(⅓p⁵ + p⁴ + ⅔p³ + 2p − 1)

points, where p = nl. Thus, for L(λ, μ) as in Theorem 5.11, the number of exceptional points
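The resultant computation underlying this estimate is elementary to set up. The sketch below is an illustration only (the two scalar polynomials are invented): it builds the Sylvester resultant matrix of two polynomials and checks that its determinant vanishes precisely when they have a common zero, which is how the sets S_ij detect common zeros of f(λ, μ) and det A_ij(λ, μ) for a fixed μ.

```python
import numpy as np

def sylvester(p, q):
    """Sylvester (resultant) matrix of polynomials p, q.

    p, q are coefficient lists, highest degree first; the matrix has
    size deg(p) + deg(q), and its determinant is the resultant."""
    m, n = len(p) - 1, len(q) - 1
    S = np.zeros((m + n, m + n))
    for i in range(n):                 # n shifted copies of p
        S[i, i:i + m + 1] = p
    for i in range(m):                 # m shifted copies of q
        S[n + i, i:i + n + 1] = q
    return S

# p = (x - 1)(x - 2), q = (x - 2)(x + 3): common root x = 2
p = [1.0, -3.0, 2.0]
q = [1.0, 1.0, -6.0]
print(np.linalg.det(sylvester(p, q)))    # ~ 0: a common zero exists

# q2 = (x - 5)(x + 3): no common root, so the resultant is nonzero
q2 = [1.0, -2.0, -15.0]
print(np.linalg.det(sylvester(p, q2)))   # 240 in exact arithmetic
```

This is exactly the mechanism by which R_ij(μ) ≢ 0 can vanish only at finitely many μ, bounding the exceptional set.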
5. PERTURBATION AND STABILITY OF DIVISORS
does not exceed mn(⅓p⁵ + p⁴ + ⅔p³ + 2p − 1), where m is the maximal degree of the matrix polynomials A_i(μ), i = 0, ..., l − 1, and p = nl. Of course, this estimate is quite rough.

5.6. Analytic Divisors
As in Eq. (5.10), let L(λ, μ) be defined for all (λ, μ) ∈ ℂ × Ω. Suppose that for some μ_0 ∈ Ω the monic matrix polynomial L(λ, μ_0) has a right divisor L_1(λ). The possibility of extending L_1(λ) to a continuous (or an analytic) family of right divisors L_1(λ, μ) of L(λ, μ) is to be investigated. It turns out that this may not be possible, in which case L_1(λ) will be described as an isolated divisor, and that this can occur only if μ_0 is in the exceptional set of L in Ω. In contrast, we have the following theorem.

Theorem 5.12. If μ_0 ∈ Ω\(S_1 ∪ S_2), then every monic right divisor L_1(λ) of L(λ, μ_0) can be extended to an analytic family L_1(λ, μ) of monic right divisors of L(λ, μ) in the domain Ω\(S_1 ∪ S_3), where S_3 is an at most countable subset of isolated points of Ω\S_1, and L_1(λ, μ) has poles or removable singularities at the points of S_3.
Proof. Let ℳ be the supporting subspace of the divisor L_1(λ) of L(λ, μ_0) with respect to the Jordan matrix J_{μ_0}. Then, if L_1 has degree k, define

It follows from the definition of a supporting subspace that Q_k(μ_0) is an invertible linear transformation. Since (by Proposition 5.10) Q_k(μ) is analytic on Ω\S_1, it follows that Q_k(μ) is invertible on a domain (Ω\S_1)\S_3, where S_3 = {μ ∈ Ω\S_1 : det Q_k(μ) = 0} is an at most countable subset of isolated points of Ω\S_1. Furthermore, the J_{μ_0}-invariant subspace ℳ is also invariant for every J_μ with μ ∈ Ω\S_1. Indeed, we have seen in Section 5.4 that this holds for μ ∈ Ω\(S_1 ∪ S_2). When μ ∈ S_2\S_1, use the continuity of J_μ. Hence by Theorem 3.12 there exists a family of right divisors L_1(λ, μ) of L(λ, μ) for each μ ∈ Ω\(S_1 ∪ S_3), each divisor having the same supporting subspace ℳ with respect to J_μ. By formula (3.27), it follows that this family is analytic in Ω\(S_1 ∪ S_3). An explicit representation of L_1(λ, μ) is obtained in the following way (cf. formula (3.27)). Take a fixed basis in ℳ and for each μ in Ω\S_1 represent Q_k(μ) as a matrix defined with respect to this basis and the standard basis for ℂ^{kn}. Then, for μ ∉ S_3, define the n × n matrix-valued functions W_1(μ), ..., W_k(μ) by
where R is the matrix (independent of μ) representing the embedding of ℳ into ℂ^{ln}. The divisor L_1 has the form
The nature of the singularities of L_1(λ, μ) is apparent from this representation. □

An important special case of the theorem is:

Corollary 5.13. If det L(λ, μ) has ln distinct zeros for every μ ∈ Ω, and if μ_0 ∈ Ω, then every monic right divisor of L(λ, μ_0) can be extended to a family of monic right divisors for L(λ, μ) which is analytic on Ω\S_3.

Proof. Under these hypotheses S_1 and S_2 are empty and the conclusion follows. □

EXAMPLE 5.3. Let
Then a Jordan matrix for L does not depend on μ, i.e., for every μ ∈ Ω,
J~ = J = diag{[~ ~l [~
:]}
We have S_1 and S_2 both empty and
X
~
=[I0 )010! 0OJ·
The subspace ℳ spanned by the first two component vectors e_1 and e_2 is invariant under J. Thus, for μ ≠ 0, ℳ is a supporting subspace for L(λ, μ). The corresponding divisor is L_1(λ, μ) = Iλ − X_μ J (X_μ|_ℳ)^{-1}, and computation shows that S_3 = {0}. □
Corollary 5.14. If the divisor L_1(λ) of Theorem 5.12 is, in addition, a Γ-spectral divisor for some closed contour Γ, then it has a unique analytic extension to a family L_1(λ, μ) of monic right divisors defined on Ω\(S_1 ∪ S_3).
Proof. Let L_1(λ, μ) be an extension of L_1(λ); existence is guaranteed by Theorem 5.12. By Lemma 5.4, for μ ∈ Ω close to μ_0, the divisor L_1(λ, μ) is Γ-spectral. In particular, L_1(λ, μ) is uniquely defined in a neighborhood of μ_0. Because of analyticity, L_1(λ, μ) is the unique extension of L_1(λ) throughout Ω\(S_1 ∪ S_3). □
5.7. Isolated and Nonisolated Divisors

As before, let L(λ, μ) be a monic matrix polynomial of degree l with coefficients depending analytically on μ in Ω, and let L_1(λ) be a monic right divisor of degree k of L(λ, μ_0), μ_0 ∈ Ω. The divisor L_1(λ) is said to be isolated if there is a neighborhood U(μ_0) of μ_0 such that L(λ, μ) has no family of monic right divisors L_1(λ, μ) of degree k which (a) depends continuously on μ in U(μ_0) and (b) has the property that lim_{μ→μ_0} L_1(λ, μ) = L_1(λ). Theorem 5.12 shows that if, in the definition, we have μ_0 ∈ Ω\(S_1 ∪ S_2), then monic right divisors cannot be isolated. We demonstrate the existence of isolated divisors by means of an example.

EXAMPLE 5.4. Let C(μ) be any matrix depending analytically on μ in a domain Ω with the property that for μ = μ_0 ∈ Ω, C(μ

respectively, of L(λ) corresponding to λ_0, and therefore they are linearly independent, again by Proposition 1.15. Now it is clear that (X̃_{λ_0}, J̃_{λ_0}^{-1}), where J̃_{λ_0} is the part of J̃_F corresponding to the eigenvalue λ_0 (≠ 0) of J̃_F and X̃_{λ_0} is the corresponding part of X̃_F, is the part of a standard pair of L̃_1(λ) corresponding to λ_0^{-1}. Recall that by definition (X_∞, J̃_∞) is the part of a Jordan pair of L̃_1(λ) corresponding to 0. So indeed (X̃, J̃) is a standard pair of L̃_1(λ). In particular, col(X̃J̃^i)_{i=0}^{l-1} = col(X̃_F J̃_F^i, X̃_∞ J̃_∞^i)_{i=0}^{l-1} is nonsingular (cf. Section 1.9). Since we can choose X̃_F = X_F, J̃_F = J_F + αI, and the pair (X_∞, J_∞(I + αJ_∞)^{-1}) is similar to (X̃_∞, J̃_∞), the matrix

    M_{l-1} ≝ col(X_F(J_F + αI)^i, X_∞ J_∞^{l-1-i}(I + αJ_∞)^i)_{i=0}^{l-1}

is also nonsingular. Finally, observe that
    M_{l-1} = [ I           0              0    ···  0 ]
              [ αI          I              0    ···  0 ]
              [ α²I         2αI            I    ···  0 ]
              [ ⋮                                    ⋮ ]
              [ α^{l-1}I    (l−1)α^{l-2}I       ···  I ]  S_{l-1},    (7.14)

where the (i, j) block entry of the first factor is (i choose j)α^{i-j}I for 0 ≤ j ≤ i ≤ l − 1 (and zero otherwise), and S_{l-1} is given by (7.5). (This equality is checked by a straightforward computation.) So S_{l-1} is also invertible, and Theorem 7.3 is proved. □

7.4. Properties of Decomposable Pairs
This section is of an auxiliary character. We display here some simple properties of a decomposable pair, some of which will be used to prove the main results in the next two sections.

Proposition 7.4. Let (X, T) = ([X_1 X_2], T_1 ⊕ T_2) be a decomposable pair of degree l, and let S_{l-2} = col(X_1 T_1^i, X_2 T_2^{l-2-i})_{i=0}^{l-2}. Then
(a) S_{l-2} has full rank (equal to n(l − 1));
(b) the nl × nl matrix

    P = (I ⊕ T_2) S_{l-1}^{-1} [I; 0] S_{l-2},    (7.15)

where S_{l-1} is defined by (7.4), is a projector with Ker P = Ker S_{l-2};
(c) we have

    (I − P)(I ⊕ T_2) S_{l-1}^{-1} = (I ⊕ T_2) S_{l-1}^{-1} [0 0; 0 I_n].    (7.16)
7. SPECTRAL PROPERTIES AND REPRESENTATIONS

Proof.
By forming the products indicated it is easily checked that

    [ Iλ  −I   0  ···   0 ]
    [  0  Iλ  −I  ···   0 ]  S_{l-1} = S_{l-2} T(λ),    (7.17)
    [  ⋮                ⋮ ]
    [  0  ···   0  Iλ  −I ]

where T(λ) = (Iλ − T_1) ⊕ (T_2λ − I). Then put λ = λ_0 in (7.17), where λ_0 is such that T(λ_0) is nonsingular, and use the nonsingularity of S_{l-1} to deduce that S_{l-2} has full rank. Dividing (7.17) by λ and letting λ → ∞, we obtain

    S_{l-2}(I ⊕ T_2) = [I_{n(l-1)}  0] S_{l-1}.    (7.18)

Now, using (7.18),
    P² = (I ⊕ T_2)S_{l-1}^{-1} [I; 0] S_{l-2} (I ⊕ T_2)S_{l-1}^{-1} [I; 0] S_{l-2}
       = (I ⊕ T_2)S_{l-1}^{-1} [I; 0] [I  0] [I; 0] S_{l-2}
       = (I ⊕ T_2)S_{l-1}^{-1} [I; 0] S_{l-2} = P,

so P is a projector. Formula (7.15) shows that

    Ker P ⊃ Ker S_{l-2}.    (7.19)

On the other hand,

    S_{l-2} P = S_{l-2}(I ⊕ T_2)S_{l-1}^{-1} [I; 0] S_{l-2} = S_{l-2}

in view of (7.18), so rank P ≥ rank S_{l-2} = n(l − 1). Combining with (7.19), we obtain Ker P = Ker S_{l-2} as required. Finally, to check (7.16), use the following steps:

    P(I ⊕ T_2)S_{l-1}^{-1} = (I ⊕ T_2)S_{l-1}^{-1} [I; 0] {S_{l-2}(I ⊕ T_2)S_{l-1}^{-1}}
        = (I ⊕ T_2)S_{l-1}^{-1} [I; 0][I  0] = (I ⊕ T_2)S_{l-1}^{-1} [I 0; 0 0],

where the latter equality follows from (7.18). □
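Proposition 7.4(b) can be checked numerically on a small decomposable pair. The data below (X_1 = X_2 = I, T_1 = 0, T_2 = aI with n = 2, l = 2, a = 0.5) are invented for illustration; the sketch forms S_{l-1}, S_{l-2}, and P as in (7.15), and verifies that P is a projector of rank n(l − 1).

```python
import numpy as np

n, a = 2, 0.5
I = np.eye(n)
X1 = X2 = I
T1 = np.zeros((n, n))
T2 = a * I

# S_{l-1} = col(X1 T1^i, X2 T2^{l-1-i}), i = 0, 1   (l = 2, size nl x nl)
S1 = np.block([[X1, X2 @ T2], [X1 @ T1, X2]])
# S_{l-2} = col(X1 T1^i, X2 T2^{l-2-i}), i = 0      (size n(l-1) x nl)
S0 = np.hstack([X1, X2])

E = np.vstack([np.eye(n), np.zeros((n, n))])         # [I; 0], nl x n(l-1)
IT2 = np.block([[I, np.zeros((n, n))], [np.zeros((n, n)), T2]])  # I (+) T2
P = IT2 @ np.linalg.solve(S1, E @ S0)                # formula (7.15)

assert np.allclose(P @ P, P)                         # P is a projector
assert np.linalg.matrix_rank(P) == n * (2 - 1)       # rank n(l-1)
print(P.round(3))
```

The same script can be rerun with any other data satisfying the decomposability conditions; only the invertibility of S_{l-1} is needed.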
The following technical property of decomposable pairs for matrix polynomials will also be useful.
Proposition 7.5. Let ([X_1 X_2], T_1 ⊕ T_2) be a decomposable pair for the matrix polynomial L(λ) = Σ_{i=0}^{l} A_iλ^i. Then

    V P = 0,    (7.20)

where P is given by (7.15), and

    V = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}].

Proof. Since

    Σ_{i=0}^{l} A_i X_2 T_2^{l-i} = 0    and    [X_1 T_1^{l-1}, X_2] = [0 ··· 0 I] S_{l-1},

we have

    V P = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}] S_{l-1}^{-1} [I; 0] S_{l-2}
        = A_l [X_1 T_1^{l-1}, X_2] S_{l-1}^{-1} [I; 0] S_{l-2} = A_l [0 ··· 0 I][I; 0] S_{l-2} = 0.    □
We conclude this section with some remarks concerning similarity of decomposable pairs. First, decomposable pairs ([X_1 X_2], T_1 ⊕ T_2) and ([X̃_1 X̃_2], T̃_1 ⊕ T̃_2) with the same parameter m are said to be similar if for some nonsingular matrices Q_1 and Q_2 of sizes m × m and (nl − m) × (nl − m), respectively, the relations

    X̃_i = X_i Q_i,    T̃_i = Q_i^{-1} T_i Q_i,    i = 1, 2,

hold. Clearly, if (X, T) is a decomposable pair of a matrix polynomial L(λ) (of degree l), then so is every decomposable pair which is similar to (X, T). The converse statement, that any two decomposable pairs of L(λ) with the same parameter are similar, is more delicate. We prove that this is indeed the case if we impose certain spectral conditions on T_1 and T_2, as follows. Let L(λ) = Σ_{i=0}^{l} A_iλ^i be a matrix polynomial with decomposable pairs (7.21) and the same parameter. Let T_1 and T̃_1 be nonsingular and suppose that the following conditions hold:

    σ(T_i) = σ(T̃_i),    i = 1, 2,    (7.22)

    σ(T_1^{-1}) ∩ σ(T_2) = σ(T̃_1^{-1}) ∩ σ(T̃_2) = ∅.    (7.23)

Then (X, T) and (X̃, T̃) are similar.
Let us prove this statement. We have

    [A_0  A_1  ···  A_l] col(X_1 T_1^i, X_2 T_2^{l-i})_{i=0}^{l} = 0.    (7.24)

Since S_{l-1} and T_1 are nonsingular, Im A_0 ⊃ Im[A_1 ··· A_l], and therefore A_0 must also be nonsingular (otherwise for all λ ∈ ℂ,

    Im L(λ) ⊂ Im[A_0  A_1  ···  A_l] ≠ ℂ^n,

a contradiction with the regularity of L(λ)). Consider the monic matrix polynomial L̂(λ) = A_0^{-1} λ^l L(λ^{-1}). Equality (7.24) shows that the standard pair

    ([I  0  ···  0],  [ 0  I  0  ···  0 ;  0  0  I  ···  0 ;  ⋮ ;  0  0  0  ···  I ;  −A_0^{-1}A_l  −A_0^{-1}A_{l-1}  ···  −A_0^{-1}A_1 ])

of L̂(λ) is similar to ([X_1 X_2], T_1^{-1} ⊕ T_2), with similarity matrix col(X_1 T_1^{-i}, X_2 T_2^i)_{i=0}^{l-1} (this matrix is nonsingular in view of the nonsingularity of S_{l-1}). So ([X_1 X_2], T_1^{-1} ⊕ T_2) is a standard pair of L̂(λ). Analogous arguments show that ([X̃_1 X̃_2], T̃_1^{-1} ⊕ T̃_2) is also a standard pair of L̂(λ). But we know that any two standard pairs of a monic matrix polynomial are similar; so T_1^{-1} ⊕ T_2 and T̃_1^{-1} ⊕ T̃_2 are similar. Together with the spectral conditions (7.22) and (7.23) this implies the similarity of T_i and T̃_i, i = 1, 2. Now, using the fact that in a standard triple of L̂(λ) the part corresponding to each eigenvalue is determined uniquely up to similarity, we deduce that (X, T) and (X̃, T̃) are similar, as claimed.

Note that the nonsingularity condition on T_1 and T̃_1 in the above statement may be dropped if, instead of (7.23), one requires that

    σ((T_1 + aI)^{-1}) ∩ σ(T_2(I + aT_2)^{-1}) = σ((T̃_1 + aI)^{-1}) ∩ σ(T̃_2(I + aT̃_2)^{-1}) = ∅,    (7.25)

for some a ∈ ℂ such that the inverse matrices in (7.25) exist. This case is easily reduced to the case considered above, bearing in mind that ([X_1 X_2], (T_1 + aI) ⊕ T_2(I + aT_2)^{-1})
is a decomposable pair for L(λ − a), provided ([X_1 X_2], T_1 ⊕ T_2) is a decomposable pair for L(λ). (This can be checked by a straightforward calculation using (7.14).) Finally, it is easy to check that similarity of T_2(I + aT_2)^{-1} and T̃_2(I + aT̃_2)^{-1} implies (in fact, is equivalent to) the similarity of T_2 and T̃_2.

7.5. Decomposable Linearization and a Resolvent Form
We show now that a decomposable pair for a matrix polynomial L(λ), introduced in Section 7.4, determines a linearization of L(λ), as follows.

Theorem 7.6. Let L(λ) = Σ_{i=0}^{l} A_iλ^i be a regular matrix polynomial, and let ([X_1 X_2], T_1 ⊕ T_2) be its decomposable pair. Then T(λ) = (Iλ − T_1) ⊕ (T_2λ − I) is a linearization of L(λ). Moreover,

    C_L(λ) S_{l-1} = [S_{l-2}; V] T(λ),    (7.26)

where V = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}] and

    C_L(λ) = λ diag(I, ..., I, A_l) + [ 0    −I    0   ···   0 ]
                                      [ 0     0   −I   ···   0 ]
                                      [ ⋮                    ⋮ ]
                                      [ 0     0    0   ···  −I ]
                                      [ A_0  A_1  A_2  ···  A_{l-1} ]

is the companion polynomial of L(λ).
Proof. The equality of the first n(l − 1) rows of (7.26) coincides with (7.17). The equality in the last n rows of (7.26) follows from the definition of a decomposable pair for a matrix polynomial (property (iii)). The matrix [S_{l-2}; V] is nonsingular, because otherwise (7.26) would imply that det C_L(λ) ≡ 0, which is impossible since C_L(λ) is a linearization of L(λ) (see Section 7.2) and L(λ) is regular. Consequently, (7.26) implies that T(λ) is a linearization of L(λ) together with C_L(λ). □

The linearization T(λ) introduced in Theorem 7.6 will be called a decomposable linearization of L(λ) (corresponding to the decomposable pair ([X_1 X_2], T_1 ⊕ T_2) of L(λ)). Using a decomposable linearization, we now construct a resolvent form for a regular matrix polynomial.

Theorem 7.7. Let L(λ) be a matrix polynomial with decomposable pair ([X_1 X_2], T_1 ⊕ T_2) and corresponding decomposable linearization T(λ). Put
7.
196
SPECTRAL PROPERTIES AND REPRESENTATIONS
    V = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}],
    S_{l-2} = col(X_1 T_1^i, X_2 T_2^{l-2-i})_{i=0}^{l-2},

and T(λ) = (Iλ − T_1) ⊕ (T_2λ − I). Then

    L^{-1}(λ) = [X_1  X_2 T_2^{l-1}] T(λ)^{-1} [S_{l-2}; V]^{-1} [0  ···  0  I]^T.    (7.27)

Observe that the matrix [S_{l-2}; V] in Theorem 7.7 is nonsingular, as we have seen in the proof of Theorem 7.6.

Proof.
We shall use the equality (7.2), which can be rewritten in the form

    diag[L^{-1}(λ), I, ..., I] = D(λ) C_L(λ)^{-1} B^{-1}(λ),

where

    B(λ) = [ B_1(λ)  B_2(λ)  ···  B_{l-1}(λ)  I ]
           [  −I       0     ···      0      0 ]
           [   0      −I     ···      0      0 ]
           [   ⋮                      ⋮      ⋮ ]
           [   0       0     ···     −I      0 ]    (7.28)

with some matrix polynomials B_i(λ), i = 1, ..., l − 1, and

    D(λ) = [  I    0    0   ···   0 ]
           [ −λI   I    0   ···   0 ]
           [  0   −λI   I   ···   0 ]
           [  ⋮                   ⋮ ]
           [  0    0   ···  −λI   I ].    (7.29)

Multiplying this relation by [I  0  ···  0] from the left and by [I  0  ···  0]^T from the right, and taking into account the form of B(λ) and D(λ) given by (7.28) and (7.29), respectively, we obtain

    L^{-1}(λ) = [I  0  ···  0] C_L(λ)^{-1} [0  ···  0  I]^T.

Using (7.26) it follows that
Now

    [I  0  ···  0] S_{l-1} T(λ)^{-1} [S_{l-2}; V]^{-1} = [X_1  X_2 T_2^{l-1}] T(λ)^{-1} [S_{l-2}; V]^{-1}
        = [X_1  X_2](I ⊕ T_2^{l-1}) T(λ)^{-1} [S_{l-2}; V]^{-1},

and (7.27) follows. □
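The resolvent formula (7.27) can be tested numerically. The decomposable pair below (X_1 = X_2 = I, T_1 = 0, T_2 = aI, with n = 2, l = 2) is an invented toy example whose associated polynomial is L(λ) = (λ − aλ²)I (cf. Example 7.2 in the next section), so L(λ)^{-1} is known in closed form and can be compared with the right-hand side of (7.27) at a sample point.

```python
import numpy as np

n, a, lam = 2, 0.5, 1.7                 # sample evaluation point (invented)
I, Z = np.eye(n), np.zeros((n, n))
X1 = X2 = I
T1, T2 = Z, a * I
A = [Z, I, -a * I]                      # L(lam) = lam*I - a*lam^2*I

# decomposable linearization T(lam) = (I lam - T1) (+) (T2 lam - I)
Tlam = np.block([[lam * I - T1, Z], [Z, T2 * lam - I]])

S0 = np.hstack([X1, X2])                                    # S_{l-2}
V = np.hstack([A[2] @ X1 @ T1, -(A[0] @ X2 @ T2 + A[1] @ X2)])
SV = np.vstack([S0, V])                                     # [S_{l-2}; V]

e = np.vstack([Z, I])                                       # [0 ... 0 I]^T
rhs = np.hstack([X1, X2 @ T2]) @ np.linalg.solve(Tlam, np.linalg.solve(SV, e))

Linv = np.linalg.inv(lam * I - a * lam**2 * I)              # known inverse
assert np.allclose(rhs, Linv)
print(rhs)
```

Note that [S_{l-2}; V] is indeed nonsingular here, as the theorem requires.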
7.6. Representation and the Inverse Problem

We consider now the inverse problem, i.e., given a decomposable pair (X, T) of degree l, find all regular matrix polynomials (of degree l) which have (X, T) as their decomposable pair. The following theorem provides a description of all such matrix polynomials.

Theorem 7.8. Let (X, T) = ([X_1 X_2], T_1 ⊕ T_2) be a decomposable pair of degree l, and let S_{l-2} = col(X_1 T_1^i, X_2 T_2^{l-2-i})_{i=0}^{l-2}. Then for every n × nl matrix V such that the matrix [S_{l-2}; V] is nonsingular, the matrix polynomial

    L(λ) = V(I − P)[(Iλ − T_1) ⊕ (T_2λ − I)](U_0 + U_1λ + ··· + U_{l-1}λ^{l-1}),    (7.30)

where P = (I ⊕ T_2)[col(X_1 T_1^i, X_2 T_2^{l-1-i})_{i=0}^{l-1}]^{-1}[I; 0]S_{l-2} and

    [U_0  U_1  ···  U_{l-1}] = [col(X_1 T_1^i, X_2 T_2^{l-1-i})_{i=0}^{l-1}]^{-1},

has (X, T) as its decomposable pair. Conversely, if L(λ) = Σ_{i=0}^{l} A_iλ^i has (X, T) as its decomposable pair, then L(λ) admits the representation (7.30) with

    V = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}].    (7.31)

For future reference, let us write formula (7.30) explicitly, with V given by (7.31):

    L(λ) = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}][(Iλ − T_1) ⊕ (T_2λ − I)](U_0 + U_1λ + ··· + U_{l-1}λ^{l-1}).    (7.32)
Proof. Let V be an n × nl matrix such that [S_{l-2}; V] is nonsingular, and let L(λ) = Σ_{i=0}^{l} A_iλ^i be defined by (7.30). We show that (X, T) is a decomposable pair of L(λ). Define S_{l-1} and P by (7.4) and (7.15), respectively, and put W = V(I − P). Since P is a projector,

    W P = 0.    (7.33)

Besides, if [S_{l-2}; W]x = 0, then S_{l-2}x = 0, so x ∈ Ker P (by Proposition 7.4(b)) and Wx = Vx; hence [S_{l-2}; V]x = 0 and x = 0. Thus [S_{l-2}; W] is nonsingular. Using (7.33) and Proposition 7.4(c), we obtain

    W(I ⊕ T_2)S_{l-1}^{-1} = W(I − P)(I ⊕ T_2)S_{l-1}^{-1} = W(I ⊕ T_2)S_{l-1}^{-1}[0 0; 0 I_n] = [0  ···  0  A_l];    (7.34)

and therefore, by the definition of L(λ),

    W(T_1 ⊕ I)S_{l-1}^{-1} = −[A_0  A_1  ···  A_{l-1}].    (7.35)

Combining (7.34) and (7.35), we obtain

    C_L(λ)S_{l-1} = [S_{l-2}; W] T(λ),    (7.36)

where C_L(λ) is the companion polynomial of L(λ) and T(λ) = (Iλ − T_1) ⊕ (T_2λ − I). Comparison with (7.26) shows that in fact

    W = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}].    (7.37)
The bottom row of the equation obtained by substituting (7.37) in (7.36) takes the form

    [A_0  A_1  ···  A_{l-2}  A_{l-1} + A_lλ] col(X_1 T_1^i, X_2 T_2^{l-1-i})_{i=0}^{l-1}
        = [A_l X_1 T_1^{l-1},  −Σ_{i=0}^{l-1} A_i X_2 T_2^{l-1-i}][(Iλ − T_1) ⊕ (T_2λ − I)].
This implies immediately that

    Σ_{i=0}^{l} A_i X_1 T_1^i = 0,    Σ_{i=0}^{l} A_i X_2 T_2^{l-i} = 0,

i.e., (X, T) is a decomposable pair for L(λ).

We now prove the converse statement. Let L(λ) = Σ_{i=0}^{l} A_iλ^i be a regular matrix polynomial with decomposable pair (X, T). First observe that, using (7.17), we may deduce

    S_{l-2} T(λ) S_{l-1}^{-1} col(Iλ^i)_{i=0}^{l-1} = 0.    (7.38)
Then it follows from (7.2) that

    L(λ) ⊕ I_{n(l-1)} = [ *  ···  *  I_n ; −I_{n(l-1)}  0 ] C_L(λ) [ col(Iλ^i)_{i=0}^{l-1}  * ],

where the star denotes a matrix entry of no significance. Hence

    L(λ) = [*  ···  *  I_n] C_L(λ) col(Iλ^i)_{i=0}^{l-1}.    (7.39)
Now use Theorem 7.6 (Eq. (7.26)) to substitute for C_L(λ) and write

    L(λ) = [*  ···  *  I_n][S_{l-2}; V] T(λ) S_{l-1}^{-1} col(Iλ^i)_{i=0}^{l-1}
         = [*  ···  *] S_{l-2} T(λ) S_{l-1}^{-1} col(Iλ^i)_{i=0}^{l-1} + V T(λ) S_{l-1}^{-1} col(Iλ^i)_{i=0}^{l-1}.

But the first term on the right is zero by (7.38), and the conclusion (7.30) follows since V P = 0 by Proposition 7.5. □

Note that the right canonical form of a monic matrix polynomial (Theorem 2.4) can be easily obtained as a particular case of representation (7.32). Indeed, let L(λ) = Iλ^l + Σ_{i=0}^{l-1} A_iλ^i, and let (X, T) = (X_F, J_F) be its finite Jordan pair, which is at the same time its decomposable pair (Theorem 7.3). Then formula (7.32) gives
    L(λ) = XT^{l-1} T(λ)(Σ_{i=0}^{l-1} U_iλ^i) = XT^{l-1}(Iλ − T)(Σ_{i=0}^{l-1} U_iλ^i),

which coincides with (2.14).
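The monic special case is easy to verify with the companion standard pair, where X = [I 0 ··· 0], T is the companion matrix, and col(XT^i)_{i=0}^{l-1} is the identity. The sketch below (coefficients invented) checks the identity L(λ) = XT^{l-1}(Iλ − T)(Σ_{i=0}^{l-1} U_iλ^i) at a sample point.

```python
import numpy as np

n, l, lam = 2, 2, 0.9                      # sample evaluation point (invented)
I, Z = np.eye(n), np.zeros((n, n))
A0 = np.array([[1.0, 2.0], [0.0, 3.0]])
A1 = np.array([[0.0, 1.0], [1.0, 1.0]])    # L(lam) = I lam^2 + A1 lam + A0

X = np.hstack([I, Z])                      # X = [I 0]
T = np.block([[Z, I], [-A0, -A1]])         # companion matrix

Q = np.vstack([X, X @ T])                  # col(X T^i), i = 0, 1
U = np.linalg.inv(Q)                       # [U_0  U_1] = Q^{-1}
poly_U = U[:, :n] + lam * U[:, n:]         # U_0 + U_1 lam

L_from_pair = X @ T @ (lam * np.eye(n * l) - T) @ poly_U   # XT^{l-1}(I lam - T)(sum U_i lam^i)
L_direct = lam**2 * I + lam * A1 + A0
assert np.allclose(L_from_pair, L_direct)
print(L_from_pair)
```

Here Q is the identity, so the U_i are just block coordinate columns; the check is nontrivial for any other standard pair similar to this one.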
Theorem 7.8 shows, in particular, that for each decomposable pair (X, T) there are associated matrix polynomials L(λ), and all of them are given by formula (7.30). It turns out that such an L(λ) is essentially unique: namely, if L(λ) and L̃(λ) are regular matrix polynomials of degree l with the same decomposable pair (X, T), then

    L̃(λ) = Q L(λ)    (7.40)

for some constant nonsingular matrix Q. (Note that the converse is trivial: if L(λ) and L̃(λ) are related as in (7.40), then they have the same decomposable pairs.) More exactly, the following result holds.
Theorem 7.9. Let (X, T) be a decomposable pair as in Theorem 7.8, and let V_1 and V_2 be n × nl matrices such that [S_{l-2}; V_i] is nonsingular, i = 1, 2. Then

    L_{V_1}(λ) = Q L_{V_2}(λ),

where L_{V_i}(λ) is given by (7.30) with V = V_i, and

    Q = (V_1|_{Ker S_{l-2}}) · (V_2|_{Ker S_{l-2}})^{-1}.    (7.41)

Note that, in view of the nonsingularity of [S_{l-2}; V_i], the restrictions V_i|_{Ker S_{l-2}}, i = 1, 2, are invertible; consequently, Q is a nonsingular matrix.
Proof. We have to check only that

    Q V_2(I − P) = V_1(I − P).    (7.42)

But both sides of (7.42) are equal to zero when restricted to Im P. On the other hand, in view of Proposition 7.4(c) and (7.41), both sides of (7.42) are equal when restricted to Ker P. Since ℂ^{nl} is a direct sum of Ker P and Im P, (7.42) follows. □

We conclude this section with an illustrative example.

EXAMPLE 7.2. Let X_1 = X_2 = I, T_1 = 0, T_2 = aI, a ≠ 0 (all matrices are 2 × 2). Then ([X_1 X_2], T_1 ⊕ T_2) is a decomposable pair (where l = 2). According to Theorem 7.8, all matrix polynomials L(λ) for which ([X_1 X_2], T_1 ⊕ T_2) is their decomposable pair are given by (7.30). A computation shows that

    L(λ) = (V_1 − V_2)(λ − aλ²),    (7.43)

where V_1 and V_2 are any 2 × 2 matrices such that [I I; V_1 V_2] is nonsingular. Of course, one can rewrite (7.43) in the form L(λ) = V(λ − aλ²), where V is any nonsingular 2 × 2 matrix, as it should be by Theorem 7.9. □
7.7. Divisibility of Matrix Polynomials

In this section we shall characterize divisibility of matrix polynomials in terms of their spectral data at the finite points of the spectrum, i.e., the data contained in a finite Jordan pair. This characterization is given by the following theorem.

Theorem 7.10. Let B(λ) and L(λ) be regular matrix polynomials. Assume B(λ) = Σ_{j=0}^{m} B_jλ^j, B_m ≠ 0, and let (X_F, J_F) be the finite Jordan pair of L(λ). Then L(λ) is a right divisor of B(λ), i.e., B(λ)L(λ)^{-1} is also a polynomial, if and only if

    Σ_{j=0}^{m} B_j X_F J_F^j = 0.    (7.44)
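Criterion (7.44) is straightforward to test when a finite Jordan pair is available. In the sketch below (all matrices invented), L(λ) = Iλ − A with A diagonalizable, so (X_F, J_F) can be taken as (S, J) from A = SJS^{-1}; for B(λ) = (Iλ + C)(Iλ − A) the sum (7.44) vanishes, while for a generic B(λ) it does not.

```python
import numpy as np

# invented data: A diagonalizable, so (X_F, J_F) = (S, J) with A = S J S^{-1}
S = np.array([[1.0, 1.0], [0.0, 1.0]])
J = np.diag([2.0, -1.0])
A = S @ J @ np.linalg.inv(S)

C = np.array([[0.0, 1.0], [3.0, 1.0]])
I = np.eye(2)

# B(lam) = (I lam + C)(I lam - A) = I lam^2 + (C - A) lam - C A
B = [-C @ A, C - A, I]

crit = sum(Bj @ S @ np.linalg.matrix_power(J, j) for j, Bj in enumerate(B))
assert np.allclose(crit, 0)        # (7.44) holds: I lam - A right-divides B

B_bad = [I, C, I]                  # a generic B(lam) = I lam^2 + C lam + I
crit_bad = sum(Bj @ S @ np.linalg.matrix_power(J, j) for j, Bj in enumerate(B_bad))
assert not np.allclose(crit_bad, 0)
print(crit)
```

For L of degree one the sum Σ B_j X_F J_F^j is just "B evaluated at A from the right", which makes the criterion transparent in this special case.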
We shall need some preparation for the proof of Theorem 7.10. First, without loss of generality we can (and will) assume that L(λ) is comonic, i.e., L(0) = I. Indeed, a matrix polynomial L(λ) is a right divisor of a matrix polynomial B(λ) if and only if the comonic polynomial L^{-1}(α)L(λ + α) is a right divisor of the comonic polynomial B^{-1}(α)B(λ + α). Here α ∈ ℂ is such that both L(α) and B(α) are nonsingular; the existence of such an α is ensured by the regularity conditions det B(λ) ≢ 0 and det L(λ) ≢ 0. Observe also that (X_F, J_F) is a finite Jordan pair of L(λ) if and only if (X_F, J_F − αI) is a finite Jordan pair of L^{-1}(α)L(λ + α). This fact may be verified easily using Theorem 7.1.

Second, we describe the process of division of B(λ) by L(λ) (assuming L(λ) is comonic). Let l be the degree of L(λ), and let (X_∞, J_∞) be the infinite Jordan pair of L(λ). Put

    X = [X_F  X_∞],    J = J_F^{-1} ⊕ J_∞.    (7.45)

We shall use the fact that (X, J) is a standard pair for the monic matrix polynomial L̂(λ) = λ^l L(λ^{-1}) (see Theorem 7.15 below). In particular, col(XJ^{i-1})_{i=1}^{l} is nonsingular. For α ≥ 0 and 1 ≤ β ≤ l set F_{αβ} = XJ^α Z_β, where

    [Z_1  Z_2  ···  Z_l] = [col(XJ^{i-1})_{i=1}^{l}]^{-1}.    (7.46)

Further, for each α ≥ 0 and β ≤ 0 or β > l, put F_{αβ} = 0. With this choice of F_{αβ} the following formulas hold:

    L(λ) = I − Σ_{j=1}^{∞} λ^j F_{l, l+1-j},    (7.47)

and

    (7.48)
Indeed, formula (7.47) is an immediate consequence of the right canonical form for L̂(λ) using its standard pair (X, J). Formula (7.48) coincides (up to notation) with (3.15).

The process of division of B(λ) by L(λ) can now be described as follows:

    B(λ) − Q_k(λ)L(λ) = R_k(λ),    k = 1, 2, ...,    (7.49)

where

    Q_k(λ) = Σ_{j=0}^{k-1} λ^j (B_j + Σ_{i=0}^{j-1} B_i F_{l+j-1-i, l}),
    R_k(λ) = Σ_{j=k}^{∞} λ^j (B_j + Σ_{i=0}^{k-1} B_i F_{l+k-1-i, l+k-j}),    (7.50)

and B_j = 0 for j > m. We verify (7.49) using induction on k. For k = 1, (7.49) is trivial. Assume that (7.49) holds for some k ≥ 1. Then, using the recursion (7.48), one sees that

    B(λ) − Q_{k+1}(λ)L(λ) = R_k(λ) − λ^k(B_k + Σ_{i=0}^{k-1} B_i F_{l+k-1-i, l})L(λ) = R_{k+1}(λ),

and the induction is complete. We say that a matrix polynomial R(λ) has codegree k if R^{(i)}(0) = 0, i = 0, ..., k − 1, and R^{(k)}(0) ≠ 0. Taking into account formula (7.50), we see that L(λ) is a right divisor of B(λ) if and only if for some k ≥ 1

    Σ_{i=0}^{m} B_i X J^{l+k-1-i} = 0.    (7.51)

Now suppose that L(λ) is a right divisor of B(λ). Then there exists k ≥ 1 such that (7.51) holds. Because of formula (7.45) this implies

    Σ_{i=0}^{m} B_i X_F J_F^{−(l+k−1−i)} = 0.

Multiplying the left- and right-hand sides of this identity by J_F^{l+k-1} yields the desired formula (7.44). Conversely, suppose that (7.44) holds. Let ν be a positive integer such that J_∞^ν = 0. Choose k ≥ 1 such that l + k ≥ m + ν. Multiplying the left- and right-hand sides of (7.44) by J_F^{−(l+k−1)} gives

    Σ_{i=0}^{m} B_i X_F J_F^{i−l−k+1}

u^{(i)}(t) = X_1 T_1^i e^{T_1 t} x = 0 for i = 1, ..., l − 1. So col(X_1 T_1^i)_{i=0}^{l-1} e^{T_1 t} x = 0, and since the columns of col(X_1 T_1^i)_{i=0}^{l-1} are linearly independent (refer to the definition of a decomposable pair in Section 7.3), we obtain e^{T_1 t} x = 0 and x = 0. □
The following theorem is a more detailed version of Theorem 8.1 for the case of comonic matrix polynomials. Theorem 8.2. Let L(A.) be a comonic matrix polynomial of degree l, and let (X F• JF) and (X 00 , J cxJ be its finite and infinite Jordan pairs, respectively. Define Z to be the nl x n matrix given by
[col(XJ;- 1 )l= 1
r
1
= [* · · · * Z],
where X = [X F X 00 ] , J = Ji 1 tf) J 00 • Make a partition Z = [~:J corresponding to the partition X= [XF X ool Assume that the q;n_valued function f(t) is v-times continuously differentiable, where v is the least nonnegative integer such that J"oo = 0. Then the general solution of (8.1) is an 1-times continuously differentiable function of the form v-1
u(t) = XFeJFtx
+ LX
00
J',;; 1+kZ 00 j(t)
k=O
- XFJi 1, take x E As. Then (by (9.2) for r = 1), x E./Its+ 1 . In particular, this means that col(XTi)f= 1 x = 0, i.e., Tx EA•. Again, by (9.2) for r = 1, Tx E As+ 1 and, consequently, T 2 x E As. Continuing this process, we obtain T'x E As for r = 1, 2,, .. , and (9.2) follows. Two admissible pairs (X, T) and (X 1 , T1 ) are said to be similar if there exists a nonsingular matrix S such that XS =X 1 and T = ST1 S- 1 . In other words,(X, T)and(X 1 , T1 )aresimilarif(X, T)isanextensionof(X 1 , T1 )and (X 1 , T1 ) is an extension of (X, T). We recall the necessary and sufficient condition for extension of two admissible pairs (X, T) and (X 1 , T1 ) given by Lemma 7.12 (see Section 7.7 for the definition of extension of admissible pairs): namely, if Ker(X, T) = {0} and Ker(X 1 , T1 ) = {0}, then (X, T) is an extension of (X 1 , T1 ) if and only if (9.3)
9.1. COMMON EXTENSIONS OF ADMISSIBLE PAIRS
for some k ≥ max{ind(X, T), ind(X_1, T_1)}. If this is the case, then (9.3) holds for all integers k ≥ 0.

Let (X_1, T_1), ..., (X_r, T_r) be admissible pairs. The admissible pair (X, T) is said to be a common extension of (X_1, T_1), ..., (X_r, T_r) if (X, T) is an extension of each (X_j, T_j), j = 1, ..., r. We call (X_0, T_0) a least common extension of (X_1, T_1), ..., (X_r, T_r) if (X_0, T_0) is a common extension of (X_1, T_1), ..., (X_r, T_r), and any common extension of (X_1, T_1), ..., (X_r, T_r) is an extension of (X_0, T_0).

Theorem 9.1. Let (X_1, T_1), ..., (X_r, T_r) be admissible pairs of orders p_1, ..., p_r, respectively, and suppose that Ker(X_j, T_j) = {0} for 1 ≤ j ≤ r. Then up to similarity there exists a unique least common extension of (X_1, T_1), ..., (X_r, T_r). Put X = [X_1 ··· X_r], T = T_1 ⊕ ··· ⊕ T_r, and p = p_1 + ··· + p_r, and let P be a projector of ℂ^p along Ker(X, T) (i.e., Ker P = Ker(X, T)). Then one such least common extension is given by

    (X|_{Im P}, P T|_{Im P}).    (9.4)
Proof. Let X_0 be the first term in (9.4) and T_0 the second. By choosing some basis in Im P we can interpret (X_0, T_0) as an admissible pair of matrices. As Ker(X, T) is invariant for T and Xu = 0 for u ∈ Ker(X, T), one sees that Ker(X_0, T_0) = {0} and

    Im[col(X_0 T_0^{i-1})_{i=1}^{m}] = Im[col(X T^{i-1})_{i=1}^{m}],    m ≥ 1.    (9.5)

From the definitions of X and T it is clear that

    Im[col(X T^{i-1})_{i=1}^{m}] = Im[col(X_1 T_1^{i-1})_{i=1}^{m}] + ··· + Im[col(X_r T_r^{i-1})_{i=1}^{m}],    m ≥ 1.    (9.6)

But then we apply the criterion (9.3) to show that (X_0, T_0) is a common extension of (X_1, T_1), ..., (X_r, T_r).

Now assume that (Y, R) is a common extension of (X_1, T_1), ..., (X_r, T_r). Then (cf. (9.3)) for 1 ≤ j ≤ r we have

    Im[col(Y R^{i-1})_{i=1}^{m}] ⊃ Im[col(X_j T_j^{i-1})_{i=1}^{m}],    m ≥ 1.

Together with (9.5) and (9.6) this implies that

    Im[col(Y R^{i-1})_{i=1}^{m}] ⊃ Im[col(X_0 T_0^{i-1})_{i=1}^{m}],    m ≥ 1.

But then we can apply (9.3) again to show that (Y, R) is an extension of (X_0, T_0). It follows that (X_0, T_0) is a least common extension of (X_1, T_1), ..., (X_r, T_r).

Let (X̃_0, T̃_0) be another least common extension of (X_1, T_1), ..., (X_r, T_r). Then (X̃_0, T̃_0) is an extension of (X_0, T_0) and conversely (X_0, T_0) is an extension of (X̃_0, T̃_0). Hence both pairs are similar. □
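The construction of Theorem 9.1 is directly computable: Ker(X, T) is the null space of the stacked matrix col(XT^{i-1}), a projector along it can be taken orthogonal, and the least common extension is the compressed pair (9.4). The sketch below (pairs invented) carries this out for two admissible pairs and checks the extension criterion (9.3).

```python
import numpy as np

def col(X, T, m):
    """Stack X T^i for i = 0..m-1."""
    rows, P = [], np.eye(T.shape[0])
    for _ in range(m):
        rows.append(X @ P)
        P = T @ P
    return np.vstack(rows)

# invented data: two admissible pairs with Ker(X_j, T_j) = {0}
X1, T1 = np.array([[1.0, 0.0]]), np.array([[1.0, 1.0], [0.0, 1.0]])
X2, T2 = np.array([[1.0, 0.0]]), np.array([[1.0, 1.0], [0.0, 1.0]])

X = np.hstack([X1, X2])
T = np.block([[T1, np.zeros((2, 2))], [np.zeros((2, 2)), T2]])
p = 4

K = col(X, T, p)                       # Ker(X, T) = null space of K
_, s, Vt = np.linalg.svd(K)            # Q spans the row space of K,
d = int((s > 1e-10).sum())             # i.e. a complement of Ker(X, T)
Q = Vt[:d].T

X0, T0 = X @ Q, Q.T @ T @ Q            # the pair (X|ImP, PT|ImP) in basis Q

def contains(M, N):                    # column space of N inside that of M?
    return np.linalg.matrix_rank(np.hstack([M, N])) == np.linalg.matrix_rank(M)

m = p
assert contains(col(X0, T0, m), col(X1, T1, m))    # extends (X1, T1)
assert contains(col(X0, T0, m), col(X2, T2, m))    # extends (X2, T2)
assert np.linalg.matrix_rank(col(X0, T0, m)) == d  # least: image matches (9.5)
print(d, T0.round(3))
```

Here the orthogonal projector onto the row space of K plays the role of P; any other projector along Ker(X, T) gives a similar pair.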
9. L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Corollary 9.2. Let (X_1, T_1), ..., (X_r, T_r) be as in Theorem 9.1, and let (X_0, T_0) be a least common extension of (X_1, T_1), ..., (X_r, T_r). Then Ker(X_0, T_0) = {0} and

    σ(T_0) ⊂ ⋃_{i=1}^{r} σ(T_i).    (9.7)

Proof. The equality Ker(X_0, T_0) = {0} is an immediate consequence of formula (9.4). To verify (9.7) observe that (in the notation of Theorem 9.1) Ker(X, T) is a T-invariant subspace and T has the following form with respect to the decomposition ℂ^p = Ker(X, T) ∔ Im P:

    T = [ T|_{Ker(X,T)}      *       ]
        [       0        P T|_{Im P} ].

So

    σ(T_0) = σ(P T|_{Im P}) ⊂ σ(T) = ⋃_{i=1}^{r} σ(T_i),

and (9.7) follows. □
The first statement of Theorem 9.1 holds true for any finite set of admissible pairs (X_1, T_1), ..., (X_r, T_r) (not necessarily satisfying the condition Ker(X_j, T_j) = {0}, 1 ≤ j ≤ r).

Lemma 9.4. Let (X_1, T_1), ..., (X_r, T_r) be admissible pairs of orders p_1, ..., p_r with Ker(X_j, T_j) = {0}, j = 1, ..., r, and let 𝒦 be the linear space of all (φ_1, φ_2, ..., φ_r) ∈ ℂ^p (p = p_1 + p_2 + ··· + p_r) such that

    X_1 T_1^α φ_1 = X_j T_j^α φ_j,    j = 2, ..., r,    α ≥ 0.

Then 𝒦 = {(φ, S_2φ, ..., S_rφ) | φ ∈ ℳ}, where ℳ is the largest T_1-invariant subspace of ℂ^{p_1} such that for every j = 2, ..., r there exists a linear transformation S_j: ℳ → ℂ^{p_j} with the property that

    X_1|_ℳ = X_j S_j,    S_j T_1|_ℳ = T_j S_j    (j = 2, ..., r).    (9.10)

Proof. Note that 𝒦 is a linear space invariant under T = T_1 ⊕ ··· ⊕ T_r. Put

    ℳ = {φ_1 ∈ ℂ^{p_1} | there exist φ_j ∈ ℂ^{p_j} (2 ≤ j ≤ r) with (φ_1, φ_2, ..., φ_r) ∈ 𝒦}.

Take (φ_1, φ_2, ..., φ_r) and (φ_1, φ̃_2, ..., φ̃_r) in 𝒦. Then (0, φ_2 − φ̃_2, ..., φ_r − φ̃_r) ∈ 𝒦, and hence for 2 ≤ j ≤ r we have X_j T_j^α(φ_j − φ̃_j) = 0, α ≥ 0. As Ker(X_j, T_j) = {0} for each j, we have φ_j = φ̃_j. So each (φ_1, φ_2, ..., φ_r) ∈ 𝒦 may be written as

    (φ_1, φ_2, ..., φ_r) = (φ_1, S_2φ_1, ..., S_rφ_1),

where S_j is a map from ℳ into ℂ^{p_j} (2 ≤ j ≤ r). In other words,

    𝒦 = {(φ, S_2φ, ..., S_rφ) | φ ∈ ℳ}.    (9.11)

As 𝒦 is a linear space, the maps S_2, ..., S_r are linear transformations. Since 𝒦 is invariant for T_1 ⊕ ··· ⊕ T_r, the space ℳ is invariant for T_1 and

    S_j T_1|_ℳ = T_j S_j,    j = 2, ..., r.
Further, from the definition of 𝒦 and the identity (9.11) it is clear that X_j S_j = X_1|_ℳ. It remains to prove that ℳ is the largest subspace of ℂ^{p_1} with the desired properties. Suppose that ℳ_0 is a T_1-invariant subspace of ℂ^{p_1} and let S_j^0: ℳ_0 → ℂ^{p_j} (j = 2, ..., r) be linear transformations such that

    X_1|_{ℳ_0} = X_j S_j^0,    S_j^0 T_1|_{ℳ_0} = T_j S_j^0.    (9.12)

Then S_j^0 T_1^α φ = T_j^α S_j^0 φ for each φ ∈ ℳ_0 and α ≥ 1. This together with the first identity in (9.12) shows that

    (φ, S_2^0 φ, ..., S_r^0 φ) ∈ 𝒦    (φ ∈ ℳ_0).

But then ℳ_0 ⊂ ℳ, and the proof is complete. □
The linear transformations S_j (2 ≤ j ≤ r) in the previous lemma are injective. Indeed, suppose that S_jφ = 0 for some φ ∈ ℳ. Then (9.10) implies that φ ∈ Ker(X_1, T_1). But Ker(X_1, T_1) = {0}, and hence φ = 0.

We can now prove an existence theorem for a greatest common restriction under the hypotheses used in Theorem 9.1 to obtain the corresponding result for a least common extension.

Theorem 9.5. Let (X_1, T_1), ..., (X_r, T_r) be admissible pairs of orders p_1, ..., p_r, respectively, and suppose that Ker(X_j, T_j) = {0} for j = 1, ..., r. Then up to similarity there exists a unique greatest common restriction of (X_1, T_1), ..., (X_r, T_r).
If the subspace ℳ is defined as in Lemma 9.4, then one such greatest common restriction is given by

    (X_1|_ℳ, T_1|_ℳ).
Proof. Put X_0 = X_1|_ℳ and T_0 = T_1|_ℳ. By choosing some basis in ℳ we can interpret (X_0, T_0) as an admissible pair of matrices. By definition (X_0, T_0) is a restriction of (X_1, T_1). As the linear transformations S_2, ..., S_r in Lemma 9.4 are injective, we see from formula (9.12) that (X_0, T_0) is a restriction of each (X_j, T_j), 2 ≤ j ≤ r. Thus (X_0, T_0) is a common restriction of (X_1, T_1), ..., (X_r, T_r).

Next, assume that (Y, R) is a common restriction of (X_1, T_1), ..., (X_r, T_r). Let q be the order of the pair (Y, R). Then there exist injective linear transformations G_j: ℂ^q → ℂ^{p_j}, j = 1, ..., r, such that

    Y = X_j G_j,    G_j R = T_j G_j,    j = 1, ..., r.

It follows that for each φ ∈ ℂ^q we have

    X_1 T_1^α G_1 φ = Y R^α φ = X_j T_j^α G_j φ,    α ≥ 0.
9.2. COMMON RESTRICTIONS OF ADMISSIBLE PAIRS
Hence G_1φ ∈ ℳ, and thus Y = X_0 G_1 and G_1 R = T_0 G_1. But this implies that (X_0, T_0) is an extension of (Y, R). So we have proved that (X_0, T_0) is a greatest common restriction of (X_1, T_1), ..., (X_r, T_r).

Finally, let (X̃_0, T̃_0) be another greatest common restriction of (X_1, T_1), ..., (X_r, T_r). Then (X̃_0, T̃_0) is an extension of (X_0, T_0) and conversely (X_0, T_0) is an extension of (X̃_0, T̃_0). Hence both pairs are similar. □

As in the case of least common extensions, Theorem 9.5 remains true, with appropriate modification, if one drops the condition that Ker(X_j, T_j) = {0} for j = 1, ..., r. To see this, one first replaces the pair (X_j, T_j) by

    (X_j|_{Im P_j}, P_j T_j|_{Im P_j}),    j = 1, ..., r,    (9.13)
where Pi is a projector of ([Pi along Ker(Xi, 1]). Let (X 0 , T0 ) be a greatest common restriction of the pairs (9.13), and put X= [X 0 0 1] and T = T0 EB 02 , where 0 1 is the n x t zero-matrix and 02 the t x t zero-matrix, t = mini dim Ker(Xi, 1]). Then the pair (X, T) is a greatest common restriction of (X 1 , T1 ), ... , (X" T,.). We conclude this section with another characterization of the greatest common restriction. As before, let (X 1 , T1 ), ••• , (X" T,.) be admissible pairs of orders p 1 , ••• , Pn respectively. For each positive integer s let .%. be the linear space of all (cp 1 , cp 2 , ... , cp,) E flP, (p = p 1 + · · · + p,) such that
X₁T₁^α φ₁ = X₂T₂^α φ₂ = ⋯ = X_rT_r^α φ_r,   α = 0, …, s − 1.

Obviously, ∩_{s=1}^∞ 𝒩_s = 𝒩, where 𝒩 is the linear space defined in Lemma 9.4. The subspaces 𝒩₁, 𝒩₂, … form a descending sequence in ℂ^p, and hence there exists a positive integer q such that 𝒩_q = 𝒩_{q+1} = ⋯. The least q with this property will be denoted by q{(X_j, T_j)}_{j=1}^r.

Theorem 9.6. Let (X₀, T₀), (X₁, T₁), …, (X_r, T_r) be admissible pairs, and suppose that Ker(X_j, T_j) = {0} for j = 0, 1, …, r. Then (X₀, T₀) is a greatest common restriction of (X₁, T₁), …, (X_r, T_r) if and only if

Im[col(X₀T₀^{i−1})_{i=1}^m] = ∩_{j=1}^r Im[col(X_jT_j^{i−1})_{i=1}^m]   (9.14)

for each m ≥ q{(X_j, T_j)}_{j=1}^r.

Proof. Let (Y₀, R₀) be the greatest common restriction of (X₁, T₁), …, (X_r, T_r) as defined in the second part of Theorem 9.5. Take a fixed m ≥ q{(X_j, T_j)}_{j=1}^r, and let
col(φ_i)_{i=1}^m ∈ ∩_{j=1}^r Im[col(X_jT_j^{i−1})_{i=1}^m].
9. L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Then there exist ψ_j ∈ ℂ^{p_j} (j = 1, …, r) such that φ_i = X_jT_j^{i−1}ψ_j for 1 ≤ i ≤ m and 1 ≤ j ≤ r. It follows that (ψ₁, …, ψ_r) ∈ 𝒩_m. But 𝒩_m = 𝒩, so ψ₁ ∈ 𝓜 and ψ_j = S_jψ₁, j = 2, …, r (cf. Lemma 9.4). But then

φ_i = X_jT_j^{i−1}S_jψ₁ = X₁T₁^{i−1}ψ₁ = Y₀R₀^{i−1}ψ₁.
It follows that

∩_{j=1}^r Im[col(X_jT_j^{i−1})_{i=1}^m] ⊂ Im[col(Y₀R₀^{i−1})_{i=1}^m].
As (Y₀, R₀) is a common restriction of (X₁, T₁), …, (X_r, T_r), the reverse inclusion is trivially true. So we have proved (9.14) for X₀ = Y₀ and T₀ = R₀. Since all greatest common restrictions of (X₁, T₁), …, (X_r, T_r) are similar, we see that (9.14) has been proved in general.
Conversely, assume that (9.14) holds for each m ≥ q = q{(X_j, T_j)}_{j=1}^r. Let (Y₀, R₀) be as in the first part of the proof. Then for each m ≥ q we have

Im[col(X₀T₀^{i−1})_{i=1}^m] = Im[col(Y₀R₀^{i−1})_{i=1}^m].

By (9.3) this implies that (X₀, T₀) and (Y₀, R₀) are similar, and hence (X₀, T₀) is a greatest common restriction of (X₁, T₁), …, (X_r, T_r). □
The following remark on greatest common restrictions of admissible pairs will be useful.
Remark 9.7. Let (X₁, T₁), …, (X_r, T_r) be admissible pairs with Ker(X_j, T_j) = {0} for j = 1, …, r, and let (X₀, T₀) be a greatest common restriction of (X₁, T₁), …, (X_r, T_r). As the proof of Theorem 9.6 shows, q{(X_j, T_j)}_{j=1}^r is the smallest integer m ≥ 1 such that the equality

Im[col(X₀T₀^{i−1})_{i=1}^m] = ∩_{j=1}^r Im[col(X_jT_j^{i−1})_{i=1}^m]

holds. Further,

dim( ∩_{j=1}^r Im[col(X_jT_j^{i−1})_{i=1}^m] ) = d₀   (9.15)

for every m ≥ q{(X_j, T_j)}_{j=1}^r, where d₀ is the size of T₀. Indeed, in view of formula (9.14), we have to show that the index of stabilization ind(X₀, T₀) does not exceed q{(X_j, T_j)}_{j=1}^r. But this is evident, because ind(X₀, T₀) ≤ ind(X₁, T₁) ≤ q{(X_j, T_j)}_{j=1}^r, as one checks easily using the definitions.
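The index of stabilization used in this remark is straightforward to compute numerically: it is the least m at which the rank of col(XT^{i−1})_{i=1}^m stops growing. The following NumPy sketch is our own illustration (the function name and tolerance are not from the text):

```python
import numpy as np

def ind(X, T, tol=1e-10):
    """Least m with rank col(X T^i)_{i=0}^{m-1} = rank col(X T^i)_{i=0}^{m}."""
    S = X.copy()
    prev = np.linalg.matrix_rank(S, tol=tol)
    XT = X @ T
    for m in range(1, T.shape[0] + 2):
        S = np.vstack([S, XT])      # append the next block row X T^m
        r = np.linalg.matrix_rank(S, tol=tol)
        if r == prev:               # rank (hence image) has stabilized
            return m
        prev, XT = r, XT @ T
    return T.shape[0] + 1

# X picks off the first coordinate, T is a nilpotent 2x2 Jordan block:
X = np.array([[1.0, 0.0]])
T = np.array([[0.0, 1.0], [0.0, 0.0]])
print(ind(X, T))  # 2
```

Since the subspaces Im col(XT^{i−1})_{i=1}^m can only grow with m inside a fixed finite-dimensional space, the loop is guaranteed to terminate by m = p + 1, where p is the order of the pair.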
9.3. Construction of l.c.m. and g.c.d. via Spectral Data
Let L₁(λ), …, L_s(λ) be comonic matrix polynomials with finite Jordan pairs (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}), respectively. To construct an l.c.m. and a g.c.d. of L₁(λ), …, L_s(λ) via the pairs (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}), we shall use extensively the notions and results presented in Sections 9.1 and 9.2. Observe that (X_{iF}, J_{iF}) is an admissible pair of order p_i = degree(det L_i(λ)) and kernel zero:

∩_{j=0}^{l_i−1} Ker(X_{iF}J_{iF}^j) = {0},

where l_i is the degree of L_i(λ) (this fact follows, for instance, from Theorem 7.15). Let (X_r, T_r) and (X_e, T_e) be a greatest common restriction and a least common extension, respectively, of the admissible pairs (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}). By Corollary 9.2, the admissible pairs (X_r, T_r) and (X_e, T_e) also have kernel zero, and T_r, T_e are nonsingular matrices. Let

l_r = ind(X_r, T_r),   l_e = ind(X_e, T_e).

Theorem 9.8. The comonic matrix polynomial
M(λ) = I − X_eT_e^{−l_e}(V₁λ^{l_e} + V₂λ^{l_e−1} + ⋯ + V_{l_e}λ),   (9.16)

where [V₁ V₂ ⋯ V_{l_e}] is a special left inverse of col(X_eT_e^{−j})_{j=0}^{l_e−1}, is an l.c.m. of minimal possible degree of L₁(λ), …, L_s(λ). Any other l.c.m. of L₁(λ), …, L_s(λ) has the form U(λ)M(λ), where U(λ) is an arbitrary matrix polynomial with det U(λ) ≡ const ≠ 0.
Proof. Let us prove first that a comonic matrix polynomial N(λ) is an l.c.m. of L₁(λ), …, L_s(λ) if and only if the finite Jordan pair (X_{NF}, J_{NF}) of N(λ) is a least common extension of (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}). Indeed, assume that N(λ) is an l.c.m. of L₁(λ), …, L_s(λ). By Theorem 7.13, (X_{NF}, J_{NF}) is a common extension of (X_{iF}, J_{iF}), i = 1, …, s. Let (X, T) be an extension of (X_{NF}, J_{NF}), and let Ñ(λ) be a matrix polynomial (not necessarily comonic) whose finite Jordan pair is similar to (X, T) (Ñ(λ) can be constructed using Theorem 7.16; if T is singular, an appropriate shift of the argument λ is also needed). By Theorem 7.13, Ñ(λ) is a common multiple of L₁(λ), …, L_s(λ), and, therefore, N(λ) is a right divisor of Ñ(λ). Applying Theorem 7.13 once more we see that (X, T) is an extension of (X_{NF}, J_{NF}). Thus, (X_{NF}, J_{NF}) is a least common extension of (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}). These arguments can be reversed to show that if (X_{NF}, J_{NF}) is a least common extension of (X_{iF}, J_{iF}), i = 1, …, s, then N(λ) is an l.c.m. of L₁(λ), …, L_s(λ).
After the assertion of the preceding paragraph is verified, Theorem 9.8 (apart from the statement about the uniqueness of an l.c.m.) follows from Theorem 7.16.
We prove the uniqueness assertion. Observe that, from the definition of an l.c.m., if N₁(λ) is an l.c.m. of L₁(λ), …, L_s(λ), then N₁(λ) and M(λ) (where M(λ) is given by (9.16)) are right divisors of each other. So

N₁(λ) = U₁(λ)M(λ),   M(λ) = U₂(λ)N₁(λ)

for some matrix polynomials U₁(λ) and U₂(λ). These equalities imply, in particular, that U₁(λ)U₂(λ) = I for every λ ∉ σ(N₁). By continuity, U₁(λ)U₂(λ) = I for all λ ∈ ℂ, and therefore both matrix polynomials U₁(λ) and U₂(λ) have constant nonzero determinant. So we have proved that any l.c.m. of L₁(λ), …, L_s(λ) has the form

U(λ)M(λ),   (9.17)

where U(λ) is a matrix polynomial with empty spectrum. Conversely, it is easily seen that every matrix polynomial of the form (9.17) is an l.c.m. of L₁(λ), …, L_s(λ). □
The proof of the following theorem is analogous to the proof of Theorem 9.8 and therefore is omitted.

Theorem 9.9.
The comonic matrix polynomial

D(λ) = I − X_rT_r^{−l_r}(W₁λ^{l_r} + W₂λ^{l_r−1} + ⋯ + W_{l_r}λ),

where [W₁ W₂ ⋯ W_{l_r}] is a special left inverse of col(X_rT_r^{−j})_{j=0}^{l_r−1}, is a g.c.d. of minimal possible degree of L₁(λ), …, L_s(λ). Any other g.c.d. of L₁(λ), …, L_s(λ) is given by the formula U(λ)D(λ), where U(λ) is an arbitrary matrix polynomial without eigenvalues.

9.4. Vandermonde Matrix and Least Common Multiples
In the preceding section we constructed an l.c.m. and a g.c.d. of comonic matrix polynomials L₁(λ), …, L_s(λ) in terms of their finite Jordan pairs. However, the usage of finite Jordan pairs is not always convenient (in particular, note that the Jordan structure of a matrix polynomial is generally unstable under small perturbations). So it is desirable to construct an l.c.m. and a g.c.d. directly in terms of the coefficients of L₁(λ), …, L_s(λ). For the greatest common divisor this will be done in the next section; for the least common multiple we shall do it here.
We need the notion of the Vandermonde matrix for the system L₁(λ), …, L_s(λ) of comonic matrix polynomials, which will be introduced now. Let p_i be the degree of L_i(λ), and let (X_i, T_i) = ([X_{iF} X_{i∞}], J_{iF}^{−1} ⊕ J_{i∞}) be a comonic Jordan pair of L_i(λ), i = 1, …, s. According to Theorem 7.15, (X_i, T_i) is a
standard pair for the monic matrix polynomial L̃_i(λ) = λ^{p_i}L_i(λ^{−1}). In particular, col(X_iT_i^j)_{j=0}^{p_i−1} is nonsingular; denote

U_j = [col(X_jT_j^{i−1})_{i=1}^{p_j}]^{−1},   j = 1, …, s,

and define the following mn × pn matrix, where p = Σ_{j=1}^s p_j and m is an arbitrary positive integer:

V_m(L₁, …, L_s) = [ X₁U₁            X₂U₂            ⋯   X_sU_s
                    X₁T₁U₁          X₂T₂U₂          ⋯   X_sT_sU_s
                    ⋮               ⋮                    ⋮
                    X₁T₁^{m−1}U₁    X₂T₂^{m−1}U₂    ⋯   X_sT_s^{m−1}U_s ].
The matrix V"'(L 1, ... , L.) will be called the comonic Vandermonde matrix of L1(1l), ... , Ls(/1.). We point out that V,.(L 1, ... , Ls) does not depend on the choice of the comonic Jordan pairs (X;, I;) and can be expressed directly in terms of the coefficients of L 1 (1l), ... , Ls(.A.). In fact, let L 1 (.A.) =I+ Ap,- 1 .A. + Ap,- 2 11. 2 + ulpJ, where ulj is ann X n ... + Ao).Pl, and let u1 = [U11 u12 S: p 1 ) are given by the followf3 S: (1 p matrix. Then the expressions X 1 TfU 1 ing formulas (see Proposition 2.3): if m = 0, ... , p 1 if m = 0, ... , p 1
XITTU1p=
-
-
1 and 1 and
f3 i= m + f3 = m +
1 1 (9.18)
and in general form > p 1 ,
X 1TfU 1p =
~~~ [J ;,+-~iq=kj~(-Ap,-)] 1
·(-AfJ+k-m+p,-1) where, by definition, A;
+ (-Ap-m+p,-1),
(9.19)
= 0 for i s; 0 and
q
TI(-Ap,-;) = (-Ap, ;)(-Ap,--;)···(-Ap,-;J j= 1 To formulate the next theorem we need some further concepts. For a given family L 1, ... , Ls of co monic matrix polynomials with infinite Jordan pairs (X; 00 , 1; 00 ), i = 1, ... , s, denote by voc(L 1 , ... , Ls) the minimal integer j such that H:o = 0 for i = 1, ... , s. For integer i s; j, denote by S;/L 1 , ... , L.) the lower (j - i + 1) matrix blocks of the comonic Vandermende matrix Vj(L 1 , ••• , L.): Su{L1, ... , Ls) =
[
XJi~-~u~
:
XJt 1 UI
s s s xJi-lul . .
XJ{- 1 Us
If i = j, we shall write S_i(L₁, …, L_s) in place of S_{ii}(L₁, …, L_s). In Theorem 9.10 below we shall also use the notion of a special generalized inverse (see Section 7.11 for its definition).

Theorem 9.10. Let L₁(λ), …, L_s(λ) be a family of comonic matrix polynomials, and let ν ≥ ν_∞(L₁, …, L_s) be an arbitrary integer. Then the minimal possible degree m of a least common multiple of L₁(λ), …, L_s(λ) is

m = min{ j ≥ 1 | Ker S_{ν+1,ν+j}(L₁, …, L_s) = Ker S_{ν+1,ν+j+1}(L₁, …, L_s) }   (9.20)

and does not depend on the choice of ν. One of the least common multiples of L₁(λ), …, L_s(λ) of the minimal degree m is given by the formula

M(λ) = I − S_{ν+m+1,ν+m+1}(L₁, …, L_s)·[W₁λ^m + W₂λ^{m−1} + ⋯ + W_mλ],   (9.21)

where [W₁ W₂ ⋯ W_m] is the special generalized inverse of S_{ν+1,ν+m}(L₁, …, L_s).
Proof. Let (X_i, T_i) = ([X_{iF} X_{i∞}], J_{iF}^{−1} ⊕ J_{i∞}) be a comonic Jordan pair of L_i(λ), i = 1, …, s; let U_i = [col(X_iT_i^j)_{j=0}^{p_i−1}]^{−1}, where p_i is the degree of L_i(λ). Then for ν ≥ ν_∞(L₁, …, L_s) we have

S_{ν+1,ν+j}(L₁, …, L_s) = col([X_{1F}J_{1F}^{−k}  0  X_{2F}J_{2F}^{−k}  0  ⋯  X_{sF}J_{sF}^{−k}  0])_{k=0}^{j−1} · diag[(J_{iF}^{−ν} ⊕ I)U_i]_{i=1}^s.   (9.22)

Since the matrix diag[(J_{iF}^{−ν} ⊕ I)U_i]_{i=1}^s is nonsingular, the integer m (defined by (9.20)) indeed does not depend on ν ≥ ν_∞(L₁, …, L_s).
Let (X_e, T_e) be the least common extension of the pairs (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}). In view of Theorem 9.8 it is sufficient to show that m = ind(X_e, T_e) and

X_eT_e^{−m}[V₁ V₂ ⋯ V_m] = S_{ν+m+1}(L₁, …, L_s)[W₁ W₂ ⋯ W_m].   (9.23)
From the equality (9.22) it follows that m coincides with the least integer j such that

Ker col([X_{1F}J_{1F}^{−k}, X_{2F}J_{2F}^{−k}, …, X_{sF}J_{sF}^{−k}])_{k=0}^{j−1} = Ker col([X_{1F}J_{1F}^{−k}, X_{2F}J_{2F}^{−k}, …, X_{sF}J_{sF}^{−k}])_{k=0}^{j}.

Using the construction of a least common extension given in Theorem 9.1, we find that indeed m = ind(X_e, T_e). Further, we have (by analogy with (9.22))

S_{ν+m+1} = [X_{1F}J_{1F}^{−m}  0  X_{2F}J_{2F}^{−m}  0  ⋯  X_{sF}J_{sF}^{−m}  0] · diag[(J_{iF}^{−ν} ⊕ I)U_i]_{i=1}^s,

and taking into account (9.22) it is easy to see that the right-hand side of (9.23) is equal to

[X_{1F}J_{1F}^{−m}  X_{2F}J_{2F}^{−m}  ⋯  X_{sF}J_{sF}^{−m}][W₁′ W₂′ ⋯ W_m′],   (9.24)

where [W₁′ W₂′ ⋯ W_m′] is a special generalized inverse of

col([X_{1F}J_{1F}^{−j}, X_{2F}J_{2F}^{−j}, …, X_{sF}J_{sF}^{−j}])_{j=0}^{m−1}.
X= [X1F
xsFJ
...
T
=
diag[J~J, ... , Js"F 1],
and P is a projector such that Ker P = Ker(X, T)
(=
.n xri)· Ker
•=0
The matrices X and T have the following form with respect to the decomposition flP = Ker P ImP, where p = p 1 + p2 + · · · + Ps:
+
x
= [O
x.J,
It follows that XT; = [0 X • T~], i = 0, 1, .... Now from the definition of a special generalized inverse we obtain that the left-hand side of (9.23) is equal to (9.24). So the equality (9.23) follows. D
The integer ν_∞(L₁, …, L_s) which appears in Theorem 9.10 can be easily estimated:

ν_∞(L₁, …, L_s) ≤ min{ j ≥ 1 | Ker V_j(L₁, …, L_s) = Ker V_{j+1}(L₁, …, L_s) }.   (9.25)
Indeed, for the integer ν = ν_∞(L₁, …, L_s) we have J_{i∞}^ν = 0 and J_{i∞}^{ν−1} ≠ 0 for some i. Then also X_{i∞}J_{i∞}^{ν−1} ≠ 0, and there exists a vector x in the domain of definition of J_{i∞} such that X_{i∞}J_{i∞}^{ν−1}x ≠ 0, X_{i∞}J_{i∞}^ν x = 0. Therefore,

Ker V_ν(L₁, …, L_s) ⊄ Ker S_{ν+1}(L₁, …, L_s),

and (9.25) follows.

9.5. Common Multiples for Monic Polynomials
Consider the following problem: given monic matrix polynomials L₁(λ), …, L_s(λ), construct a monic matrix polynomial L(λ) which is a common multiple of L₁(λ), …, L_s(λ) and is, in a certain sense, the smallest possible. Because of the monicity requirement for L(λ), in general one cannot demand that L(λ) be an l.c.m. of L₁(λ), …, L_s(λ) (an l.c.m. of L₁(λ), …, L_s(λ) may never be monic). Instead we shall require that L(λ) have the smallest possible degree.
To solve this problem, we shall use the notion of the Vandermonde matrix. However, in the case of monic polynomials it is more convenient to use standard pairs (instead of the comonic Jordan pairs used in the case of comonic polynomials). We arrive at the following definition: let L₁(λ), …, L_s(λ) be monic matrix polynomials with degrees p₁, …, p_s and Jordan pairs (X₁, T₁), …, (X_s, T_s), respectively. For an arbitrary integer m ≥ 1 put

W_m(L₁, …, L_s) = [ X₁U₁            X₂U₂            ⋯   X_sU_s
                    X₁T₁U₁          X₂T₂U₂          ⋯   X_sT_sU_s
                    ⋮               ⋮                    ⋮
                    X₁T₁^{m−1}U₁    X₂T₂^{m−1}U₂    ⋯   X_sT_s^{m−1}U_s ],
where U_j = [col(X_jT_j^{i−1})_{i=1}^{p_j}]^{−1}. Call W_m(L₁, …, L_s) the monic Vandermonde matrix of L₁(λ), …, L_s(λ). From formulas (9.18) and (9.19) (where L₁(λ) = Σ_{i=0}^{p₁} A_iλ^i) it is clear that W_m(L₁, …, L_s) can be expressed in terms of the coefficients of L₁(λ), …, L_s(λ) and, in particular, does not depend on the choice of the standard pairs. For simplicity of notation we write W_m for W_m(L₁, …, L_s) in the next theorem.

Theorem 9.11. Let L₁(λ), …, L_s(λ) be monic matrix polynomials, and let

r = min{ m ≥ 1 | Ker W_m = Ker W_{m+1} }.

Then there exists a monic common multiple L(λ) of L₁(λ), …, L_s(λ) of degree r. One such monic common multiple is given by the formula

L₀(λ) = Iλ^r − S_{r+1}|_{Im P}·(V₁ + V₂λ + ⋯ + V_rλ^{r−1}),   (9.26)
where P is a projector along Ker W_r, [V₁, V₂, …, V_r] is some left inverse for W_r|_{Im P}, and S_{r+1} is formed by the lower n rows of W_{r+1}. Conversely, if L(λ) is a monic common multiple of L₁(λ), …, L_s(λ), then deg L ≥ r. Observe that a monic common multiple of minimal degree is not unique in general.
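The minimal degree r in Theorem 9.11 is defined purely by kernel stabilization, so it can be probed numerically. The sketch below is our own, not the book's: it treats the linear case L_i(λ) = Iλ − X_i (the situation of Corollary 9.12 below), finds r by rank stabilization of the stacked powers, and solves for the coefficients by least squares rather than through the projector P and a left inverse:

```python
import numpy as np

def common_annihilator(mats, tol=1e-9):
    """Minimal l and n-by-n coefficients A_0..A_{l-1} with
    X_i^l + sum_j A_j X_i^j = 0 for every X_i in mats."""
    n = mats[0].shape[0]

    def K(m):  # col([X_1^k ... X_s^k])_{k=0}^{m-1}, an (m n) x (s n) matrix
        return np.vstack([np.hstack([np.linalg.matrix_power(X, k) for X in mats])
                          for k in range(m)])

    l = 1
    while np.linalg.matrix_rank(K(l), tol=tol) < np.linalg.matrix_rank(K(l + 1), tol=tol):
        l += 1                      # kernel of K(m) has not stabilized yet

    # solve [A_0 ... A_{l-1}] K(l) = -[X_1^l ... X_s^l] in the least-squares sense
    rhs = -np.hstack([np.linalg.matrix_power(X, l) for X in mats])
    Y, *_ = np.linalg.lstsq(K(l).T, rhs.T, rcond=None)
    A = Y.T
    return l, [A[:, j * n:(j + 1) * n] for j in range(l)]

X1, X2 = np.diag([1.0, 2.0]), np.diag([3.0, 4.0])
l, As = common_annihilator([X1, X2])
print(l)  # 2
for X in (X1, X2):
    R = np.linalg.matrix_power(X, l) + sum(As[j] @ np.linalg.matrix_power(X, j)
                                           for j in range(l))
    assert np.allclose(R, 0)
```

Note that stabilization occurs already at m = 2 even though the four eigenvalues 1, 2, 3, 4 are distinct: matrix coefficients A_j give more freedom than a scalar annihilating polynomial, which would need degree 4 here.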
Proof. Since U₁, …, U_s are nonsingular,

r = ind([X₁ ⋯ X_s], diag[T₁, …, T_s]).

Let (X₀, T₀) be the least common extension of (X₁, T₁), …, (X_s, T_s) (Section 9.1). Then Ker(X₀, T₀) = {0} and ind(X₀, T₀) = r (see Theorem 9.1). Let L(λ) be a monic matrix polynomial associated with the r-independent admissible pair (X₀, T₀), i.e.,

L(λ) = Iλ^r − X₀T₀^r(V₁ + V₂λ + ⋯ + V_rλ^{r−1}),   (9.27)

where [V₁, V₂, …, V_r] is a left inverse of col(X₀T₀^i)_{i=0}^{r−1}. By Theorem 6.2, a standard pair of L(λ) is an extension of (X₀, T₀) and, consequently, it is also an extension of each (X_i, T_i), i = 1, …, s. By Theorem 7.13, L(λ) is a common multiple of L₁(λ), …, L_s(λ). In fact, formula (9.27) coincides with (9.26). This can be checked easily using the description of a least common extension given in Theorem 9.1.
Conversely, let L(λ) be a monic common multiple of L₁(λ), …, L_s(λ) of degree l. Then (Theorem 7.13) a standard pair (X, T) of L(λ) is a common extension of (X₁, T₁), …, (X_s, T_s). In particular,

ind(X, T) ≥ ind(X₀, T₀) = r.
(As before, (X₀, T₀) stands for a least common extension of (X₁, T₁), …, (X_s, T_s).) On the other hand, nonsingularity of col(XT^i)_{i=0}^{l−1} implies ind(X, T) = l; so l ≥ r as claimed. □
Theorem 9.11 becomes especially simple in the case of linear polynomials L_i(λ) = Iλ − X_i, i = 1, …, s. In this case

W_m(L₁, …, L_s) = col([X₁^k, X₂^k, …, X_s^k])_{k=0}^{m−1}.

Thus:

Corollary 9.12. Let X₁, …, X_s be n × n matrices. Then there exist n × n matrices A₀, …, A_{l−1} with the property

X_i^l + Σ_{j=0}^{l−1} A_jX_i^j = 0,   i = 1, …, s,   (9.28)
if and only if the integer l satisfies the inequality

l ≥ min{ m ≥ 1 | Ker col([X₁^k, X₂^k, …, X_s^k])_{k=0}^{m−1} = Ker col([X₁^k, X₂^k, …, X_s^k])_{k=0}^{m} }.   (9.29)

If (9.29) is satisfied, one of the possible choices for A₀, …, A_{l−1} such that (9.28) holds is given by the formula

[A₀  A₁  ⋯  A_{l−1}] = −[X₁^l, X₂^l, …, X_s^l]|_{Im P}·[V₁  V₂  ⋯  V_l],

where P is a projector along Ker col([X₁^k, X₂^k, …, X_s^k])_{k=0}^{l−1}, and [V₁  V₂  ⋯  V_l] is some left inverse for col([X₁^k, X₂^k, …, X_s^k])_{k=0}^{l−1}|_{Im P}.

9.6. Resultant Matrices and Greatest Common Divisors
In order to describe the construction of a g.c.d. for a finite family of matrix polynomials, we shall use resultant matrices, which are described below. The notion of a resultant matrix R(a, b) for a pair of scalar polynomials a(λ) = Σ_{i=0}^r a_iλ^i and b(λ) = Σ_{i=0}^k b_iλ^i (a_i, b_i ∈ ℂ) is well known:

R(a, b) = [ a₀  a₁  ⋯  a_r  0    ⋯  0
            0   a₀  a₁  ⋯   a_r  ⋯  0
            ⋮                       ⋮
            0   ⋯   0   a₀  a₁   ⋯  a_r
            b₀  b₁  ⋯  b_k  0    ⋯  0
            0   b₀  b₁  ⋯   b_k  ⋯  0
            ⋮                       ⋮
            0   ⋯   0   b₀  b₁   ⋯  b_k ].   (9.30)

By analogy, for an n × n matrix polynomial L(λ) = Σ_{k=0}^p A_kλ^k, the block Toeplitz matrices

Rq(L) = [ A₀  A₁  ⋯  A_p  0    ⋯  0
          0   A₀  A₁  ⋯   A_p  ⋯  0
          ⋮                       ⋮
          0   ⋯   0   A₀  A₁   ⋯  A_p ]   (q block rows, q > p)

are called resultant matrices of the polynomial L(λ).
For a family of matrix polynomials L_j(λ) = Σ_{k=0}^{p_j} A_{kj}λ^k (A_{kj} is an n × n matrix, k = 0, 1, …, p_j, j = 1, 2, …, s) we define the resultant matrices as

Rq(L₁, L₂, …, L_s) = [ Rq(L₁)
                        Rq(L₂)
                        ⋮
                        Rq(L_s) ]   ( q > max_{1≤j≤s} p_j ).   (9.31)

This definition is justified by the fact that the matrices Rq play a role which is analogous to that of the resultant matrix for two scalar polynomials, as we shall see in the next theorem.
Let L₁, …, L_s be comonic matrix polynomials with comonic Jordan pairs (X₁, T₁), …, (X_s, T_s), respectively. Let m_j be the degree of det L_j, j = 1, …, s, and let m = m₁ + ⋯ + m_s. For every positive integer q define the subspace 𝒩_q ⊂ ℂ^m. As the subspaces 𝒩₁, 𝒩₂, … form a descending sequence in ℂ^m, there exists a positive integer q₀ such that 𝒩_{q₀} = 𝒩_{q₀+1} = ⋯. The least integer q₀ with this property will be denoted by q(L₁, …, L_s). It is easily seen that this definition of q(L₁, …, L_s) does not depend on the choice of (X₁, T₁), …, (X_s, T_s).
The following result provides a description of the kernels of resultant matrices. This description will serve as a basis for the construction of a g.c.d. for a family of matrix polynomials (see Theorem 9.15).
Theorem 9.13. Let L₁, …, L_s be comonic matrix polynomials with comonic Jordan pairs (X_i, T_i) = ([X_{iF} X_{i∞}], J_{iF}^{−1} ⊕ J_{i∞}), i = 1, …, s, respectively. Let d₀ be the maximal degree of the polynomials L₁, …, L_s (so that Rq(L₁, …, L_s) is defined for q > d₀). Then q₀ ≝ max{q(L₁, …, L_s), d₀ + 1} is the minimal integer q > d₀ such that (9.32) holds,
and for every integer q ≥ q₀ the following formula holds:

Ker Rq(L₁, …, L_s) = Im col(X_FJ_F^{q−j})_{j=1}^q + Im col(X_∞J_∞^{q−j})_{j=1}^q,   (9.33)

where (X_F, J_F) (resp. (X_∞, J_∞)) is the greatest common restriction of the pairs (X_{1F}, J_{1F}), …, (X_{sF}, J_{sF}) (resp. (X_{1∞}, J_{1∞}), …, (X_{s∞}, J_{s∞})).
The number q(L₁, …, L_s) can be estimated in terms of the degrees p₁, …, p_s of L₁(λ), …, L_s(λ), respectively:

q(L₁, …, L_s) ≤ n · min_{1≤j≤s} p_j + max_{1≤j≤s} p_j.   (9.34)

For the proof of this estimate see [29b]. So (9.33) holds, in particular, for every q ≥ n · min_{1≤j≤s} p_j + max_{1≤j≤s} p_j.
Theorem 9.13 holds also in the case when L₁(λ), …, L_s(λ) are merely regular matrix polynomials. For a regular matrix polynomial L_i(λ), let (X_{i∞}, J_{i∞}) be an infinite Jordan pair for the comonic matrix polynomial L_i^{−1}(a)L_i(λ + a) (a ∉ σ(L_i)). The pair (X_{i∞}, J_{i∞}) = (X_{i∞}(a), J_{i∞}(a)) may depend on the choice of a; i.e., it can happen that the pairs (X_{i∞}(a), J_{i∞}(a)) and (X_{i∞}(b), J_{i∞}(b)) are not similar for a ≠ b, a, b ∉ σ(L_i). However, the subspace Im col(X_{i∞}(a)(J_{i∞}(a))^{q−j})_{j=1}^q does not depend on the choice of a. In view of Theorem 9.6, the subspace Im col(X_∞J_∞^{q−j})_{j=1}^q also does not depend on the choices of a_i ∉ σ(L_i), i = 1, …, s, where (X_∞, J_∞) is the greatest common restriction of (X_{i∞}(a_i), J_{i∞}(a_i)), i = 1, …, s. So formula (9.33) makes sense also in the case when L₁(λ), …, L_s(λ) are regular matrix polynomials; moreover, it remains true in this case. For additional information and proofs of the facts presented in this paragraph, see [29b].
For the proof of Theorem 9.13 we need the following lemma.

Lemma 9.14. Let L(λ) = I + Σ_{j=1}^{l} A_jλ^j be a comonic n × n matrix polynomial, and let (X, T) be a comonic Jordan pair for L. Then for q > l we have

Ker Rq(L) = Im col(XT^{q−i})_{i=1}^{q}.

Proof.
Put F_{αβ} = XT^αZ_β (α ≥ 0, 1 ≤ β ≤ l), where

[Z₁  Z₂  ⋯  Z_l] = [col(XT^{i−1})_{i=1}^{l}]^{−1}.

Then (see Theorem 7.15 and (2.14)) Rq(L) is the block Toeplitz matrix whose successive block rows are shifted copies of

[I  −F_{ll}  −F_{l,l−1}  ⋯  −F_{l1}  0  ⋯  0].

Introduce a suitable invertible block triangular matrix S, with identity blocks on the main diagonal and off-diagonal blocks built from the F_{αβ}. Using Theorem 7.15 and equalities (3.15), one deduces that S·Rq(L) takes a block triangular form from which the equality Ker Rq(L) = Im col(XT^{q−i})_{i=1}^{q} can be read off. □
( x₁, Σ_{j=1}^{l} (1/j!) L^{(j)}(λ₀) y_{r+1−j} ) = (x̂₁, B ŷ_r),   (10.51)

where B is given by (10.3).
10.5. SIGN CHARACTERISTIC OF A SELF-ADJOINT MATRIX POLYNOMIAL
The proof is a combinatorial one. Substituting

x̂₁ = col[λ₀^α x₁]_{α=0}^{l−1},   ŷ_r = col[ Σ_{i=0}^{α} (α choose i) λ₀^{α−i} y_{r−i} ]_{α=0}^{l−1}

(by definition y_p = 0 for p ≤ 0) in the expression (x̂₁, B ŷ_r), we deduce that (10.51) holds; the computation rests on evaluating the double sums

Σ_{k=1}^{l} Σ_{p=i+k}^{l} (p − k choose i) λ₀^{p−i−1} A_p.
… = [−1] ⊕ [ 0  1
             1  0 ].

11. FACTORIZATION OF SELF-ADJOINT MATRIX POLYNOMIALS

We then have 𝒜₁ = ⋯ + Im Q, a B-neutral subspace, and Q_c = Im[⋯]; the matrix S₂ = col[SK^i]_{i=0}^{1} and its inverse S₂^{−1} are computed explicitly. Computing SK²S^{−1} it is then found that

L₁(λ) = [ λ² − iλ − 1    −iλ
          0              λ² − iλ − 1 ].

It is easily verified that L(λ) = L₁*(λ)L₁(λ).
The special cases l = 2, 3 arise very frequently in practice and merit special attention. When l = 2, Theorem 11.2 implies that the quadratic bundle (with self-adjoint coefficients) can always be factored in the form

Iλ² + A₁λ + A₀ = (Iλ − Y)(Iλ − Z).   (11.15)

If the additional hypothesis of Corollary 11.6 holds, then there is such a factorization with Y = Z*. Further discussion of problems with l = 2 will be found in Chapter 13.
When l = 3 there are always at least n real eigenvalues, and there are matrices Z, B₁, B₂ for which

L(λ) = (Iλ² + B₁λ + B₂)(Iλ − Z).

If there are precisely n real eigenvalues, then it follows from Corollary 11.5 that there are matrices Z, M with M* = M and

L(λ) = (Iλ − Z*)(Iλ + M)(Iλ − Z).
Furthermore, σ(Iλ − Z) is nonreal, σ(Iλ + M) is just the real part of σ(L(λ)), and the real eigenvalues necessarily have all their elementary divisors linear.

11.4. DISCUSSION AND FURTHER DEDUCTIONS

EXAMPLE 11.2. Let l = 2 and suppose that each real eigenvalue has multiplicity one. Then the following factorizations (11.15) are possible:

L(λ) = (Iλ − Y_i)(Iλ − Z_i),   i = 1, 2,

in which Iλ − Z₁ has a supporting subspace which is maximal B-nonnegative and that of Iλ − Z₂ is maximal B-nonpositive, while σ(L) is the disjoint union of the spectra of Iλ − Z₁ and Iλ − Z₂. In this case, it follows from Theorem 2.15 that Z₁, Z₂ form a complete pair for L.
Comments
The presentation of this chapter is based on [34d, 34f, 34g]. Theorem 11.2 was first proved in [56d], and Corollary 11.5 is Theorem 8 in [56c]. Theorem 11.4 is new. The notion of "kinds" of eigenvectors corresponding to real eigenvalues (introduced in Section 11.2) is basic and appears in many instances. See, for example, papers [17, 50, 51]. In [69] this idea appears in the framework of analytic operator-valued functions.
Chapter 12
Further Analysis of the Sign Characteristic
In this chapter we develop further the analysis of the sign characteristic of a monic self-adjoint matrix polynomial L(λ), begun in Chapter 10. Here we shall represent the sign characteristic as a local property. This idea eventually leads to a description of the sign characteristic in terms of eigenvalues of L(λ₀) (as a constant matrix) for each fixed λ₀ ∈ ℝ (Theorem 12.5). As an application, a description of nonnegative matrix polynomials is given in Section 12.4.

12.1. Localization of the Sign Characteristic
The idea of the sign characteristic of a self-adjoint matrix polynomial was introduced as a global notion in Chapter 10 via self-adjoint triples. In this section we give another description of the sign characteristic which ultimately shows that it can be defined locally.

Theorem 12.1. Let L₁(λ) and L₂(λ) be two monic self-adjoint matrix polynomials. If λ₀ ∈ σ(L₁) is real and

L₁^{(i)}(λ₀) = L₂^{(i)}(λ₀),   i = 0, 1, …, γ,

where γ is the maximal length of Jordan chains of L₁(λ) (and then also of L₂(λ)) corresponding to λ₀, then the sign characteristics of L₁(λ) and L₂(λ) at λ₀ are the same.
By the sign characteristic of a matrix polynomial at an eigenvalue we mean the set of signs in the sign characteristic corresponding to the Jordan blocks with this eigenvalue. It is clear that Theorem 12.1 defines the sign characteristic at λ₀ as a local property of the self-adjoint matrix polynomial. This result will be an immediate consequence of the description of the sign characteristic to be given in Theorem 12.2 below.
Let L(λ) be a monic self-adjoint matrix polynomial, and let λ₀ be a real eigenvalue of L(λ). For x ∈ Ker L(λ₀)\{0} denote by ν(x) the maximal length of a Jordan chain of L(λ) beginning with the eigenvector x of λ₀. Let Ψ_i, i = 1, …, γ (γ = max{ν(x) | x ∈ Ker L(λ₀)\{0}}), be the subspace in Ker L(λ₀) spanned by all x with ν(x) ≥ i.

Theorem 12.2. For i = 1, …, γ, let λ₀ be a real eigenvalue of L and

f_i(x, y) = ( x, Σ_{j=1}^{i} (1/j!) L^{(j)}(λ₀) y^{(j−1)} ),

where y = y^{(0)}, y^{(1)}, …, y^{(i−1)} is a Jordan chain of L(λ) corresponding to λ₀ (see, for instance, (1.67)). Recall the equivalence E(λ)(Iλ − C₁)F(λ) = L(λ) ⊕ I_{n(l−1)}.

μ_j(λ) > 0 for real λ ∉ σ(L), j = 1, …, n. Let λ₀ ∈ σ(L). Then clearly the first nonzero derivative
μ_j^{(α_j)}(λ₀), j = 1, …, n, is positive. So the signs are +1 in view of Theorem 12.5.
(v) ⇒ (i). It is sufficient to show that μ_j(λ) ≥ 0 for real λ, j = 1, …, n. But this can easily be deduced by using Theorem 12.5 again.
To complete the proof of Theorem 12.8 it remains to prove the implication (iv) ⇒ (iii). First let us check that the degree l of L(λ) is even. Indeed, otherwise, in view of Theorem 10.4, L(λ) would have at least one odd partial multiplicity corresponding to a real eigenvalue, which is impossible in view of (iv). Now let L₁(λ) be the right divisor of L(λ) constructed in Theorem 11.2, such that its c-set lies in the open upper half-plane. From the proof of Theorem 11.2 it is seen that the supporting subspace of L₁ is B-neutral (because the partial multiplicities are even). By Theorem 11.1,

L = L₁*L₁,

so (iii) holds (with M = L₁). □
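One direction of the equivalence just proved, that a factorization L(λ) = M*(λ)M(λ) forces L(λ) ≥ 0 for all real λ, is easy to confirm numerically: for real λ₀ one has M*(λ₀) = M(λ₀)*, so L(λ₀) is a Gram matrix. A quick NumPy experiment with a randomly generated monic M (our own construction, not an example from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 3, 2
# monic M(lam) = A_0 + A_1 lam + I lam^2 with random complex coefficients
A = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
     for _ in range(l)] + [np.eye(n)]

M = lambda lam: sum(A[j] * lam**j for j in range(l + 1))
Mstar = lambda lam: sum(A[j].conj().T * lam**j for j in range(l + 1))  # M*(lam)

for lam0 in np.linspace(-5.0, 5.0, 11):
    Lval = Mstar(lam0) @ M(lam0)      # = M(lam0)^* M(lam0), a Gram matrix
    assert np.linalg.eigvalsh((Lval + Lval.conj().T) / 2).min() > -1e-9
print("L(lam) >= 0 on the real axis")
```

The symmetrization before `eigvalsh` only guards against roundoff; analytically M*(λ₀)M(λ₀) is already Hermitian for real λ₀.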
We can say more about the partial multiplicities of L(λ) = M*(λ)M(λ).

Theorem 12.9. Let M(λ) be a monic matrix polynomial. Let λ₀ ∈ σ(M) be real, and let α₁ ≤ ⋯ ≤ α_r be the partial multiplicities of M(λ) at λ₀. Then the partial multiplicities of L(λ) = M*(λ)M(λ) at λ₀ are 2α₁, …, 2α_r.

Proof. Let (X, J, Y) be a Jordan triple for M. By Theorem 2.2, (Y*, J*, X*) is a standard triple for M*(λ), and in view of the multiplication theorem (Theorem 3.1) the triple

( [X  0],   [ J   YY*
              0   J* ],   [ 0
                            X* ] )

is a standard triple of L(λ). Let J₀ be the part of J corresponding to λ₀, and let Y₀ be the corresponding part of Y. We have to show that the elementary divisors of

[ J₀   Y₀Y₀*
  0    J₀* ]

at λ₀ have degrees 2α₁, …, 2α_r. Note that the rows of Y₀ corresponding to some Jordan block in J₀, taken in the reverse order, form a left Jordan chain of M(λ) (or, what is the same, their transposes form a usual Jordan chain of M^T(λ) corresponding to λ₀). Let y₁, …, y_k be the left eigenvectors of M(λ) (corresponding to λ₀). Then the desired result will follow from Lemma 3.4 if we show first that the matrix A = col[y_i]_{i=1}^k · row[y_i*]_{i=1}^k is nonsingular and can be decomposed into a product of lower and upper triangular matrices.
Since (Ax, x) ≥ 0 for all x ∈ ℂ^k and y₁*, …, y_k* are linearly independent, the matrix A is positive definite. It is well known (and easily proved by induction on the size of A) that such a matrix can be represented as a product A₁A₂ of a lower triangular matrix A₁ and an upper triangular matrix A₂
(the Cholesky factorization). So indeed we can apply Lemma 3.4 to complete the proof of Theorem 12.9. □
We remark that Theorem 12.9 holds for regular matrix polynomials M(λ) as well. The proof in this case can be reduced to Theorem 12.9 by considering the monic matrix polynomial M̃(λ) = λ^l M(λ^{−1} + a)M(a)^{−1}, where l is the degree of M(λ) and a ∉ σ(M).

Comments
The results presented in Sections 12.1-12.4 are from [34f, 34g], and we refer to these papers for additional information. Factorizations of matrix polynomials of type (12.14) are well known in linear systems theory; see, for instance, [15a, 34h, 45], where the more general case of factorization of a self-adjoint polynomial L(λ) with constant signature for all real λ is considered.
Chapter 13
Quadratic Self-Adjoint Polynomials
This is a short chapter in which we illustrate the concepts of spectral theory for self-adjoint matrix polynomials in the case of matrix polynomials of the type

L(λ) = Iλ² + Bλ + C,   (13.1)

where B and C are positive definite matrices. Such polynomials occur in the theory of damped oscillatory systems, which are governed by the system of equations

L(d/dt)x = d²x/dt² + B dx/dt + Cx = f.   (13.2)

Note that the numerical range and, in particular, all the eigenvalues of L(λ) lie in the open left half-plane. Indeed, let

(L(λ)f, f) = λ²(f, f) + λ(Bf, f) + (Cf, f) = 0

for some λ ∈ ℂ and f ≠ 0. Then

λ = [ −(Bf, f) ± √( (Bf, f)² − 4(f, f)(Cf, f) ) ] / (2(f, f)),   (13.3)

and since (Bf, f) > 0, (Cf, f) > 0, the real part of λ is negative. This fact reflects dissipation of energy in damped oscillatory systems.
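The left-half-plane claim can be confirmed numerically by passing to a companion linearization of L(λ). The sketch below (our own sanity check with randomly generated positive definite B and C, not part of the text) uses the standard companion matrix of the monic quadratic:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
G = rng.standard_normal((n, n)); B = G @ G.T + np.eye(n)  # positive definite
H = rng.standard_normal((n, n)); C = H @ H.T + np.eye(n)  # positive definite

# eigenvalues of L(lam) = I lam^2 + B lam + C are those of the companion matrix
comp = np.block([[np.zeros((n, n)), np.eye(n)],
                 [-C,               -B      ]])
eigs = np.linalg.eigvals(comp)
print(eigs.real.max() < 0)  # True: the whole spectrum lies in Re(lam) < 0
```

The companion matrix here is the standard first linearization: det(Iλ² + Bλ + C) = 0 exactly when λ is an eigenvalue of `comp`.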
13.1. Overdamped Case
We consider first the case when the system governed by Eqs. (13.2) is overdamped, i.e., the inequality

(Bf, f)² − 4(f, f)(Cf, f) > 0   (13.4)

holds for every f ∈ ℂⁿ\{0}. Equation (13.3) shows that, in the overdamped case, the numerical range of L(λ) is real and, consequently, so are all the eigenvalues of L(λ) (refer to Section 10.6). Note that from the point of view of applications overdamped systems are not very interesting. However, this case is relatively simple, and a fairly complete description of the spectral properties of the quadratic self-adjoint polynomial (13.1) is available, as the following theorem shows.

Theorem 13.1.
Let system (13.2) be overdamped. Then

(i) all the eigenvalues of L(λ) are real and nonpositive;
(ii) all the elementary divisors of L(λ) are linear;
(iii) there exists a negative number q such that n eigenvalues λ₁^{(1)}, …, λ_n^{(1)} are less than q and n eigenvalues λ₁^{(2)}, …, λ_n^{(2)} are greater than q;
(iv) for i = 1 and 2, the sign of λ_j^{(i)} (j = 1, …, n) in the sign characteristic of L(λ) is (−1)^i;
(v) the eigenvectors u₁^{(i)}, …, u_n^{(i)} of L(λ), corresponding to the eigenvalues λ₁^{(i)}, …, λ_n^{(i)}, respectively, are linearly independent for i = 1, 2.
Proof. Property (i) has been verified already. Let us prove (ii). If (ii) is false, then according to Theorem 12.2 there exist an eigenvalue λ₀ of L(λ) and a corresponding eigenvector f₀ such that

(L′(λ₀)f₀, f₀) = 2λ₀(f₀, f₀) + (Bf₀, f₀) = 0,

so that λ₀ is a double zero of the scalar quadratic λ²(f₀, f₀) + λ(Bf₀, f₀) + (Cf₀, f₀); this contradicts the overdamping condition (13.4).
For x ∈ ℂⁿ\{0} denote by p₁(x) ≤ p₂(x) the two (real) zeros of λ²(x, x) + λ(Bx, x) + (Cx, x), as in (13.3). We claim that

p₁(y) < λ_j^{(2)},   y ∈ ℂⁿ\{0},   j = 1, …, n.   (13.7)

Suppose the contrary; then for some j we have

p₁(y) ≥ λ_j^{(2)} = p₂(x),   (13.8)

where x is some eigenvector corresponding to λ_j^{(2)}, and y ∈ ℂⁿ\{0}. Clearly, p₂(y) > p₁(y), so that p₂(y) ≠ p₂(x), and since p₂(ax) = p₂(x) for every a ∈ ℂ\{0}, we deduce that x and y are linearly independent. Define

v = μy + (1 − μ)x,

where μ is real. Then (as L(λ_j^{(2)})x = 0)

(λ_j^{(2)})²(v, v) + λ_j^{(2)}(Bv, v) + (Cv, v) = μ²{(λ_j^{(2)})²(y, y) + λ_j^{(2)}(By, y) + (Cy, y)}.   (13.9)

However, p₁(y) ≥ λ_j^{(2)} implies that the right-hand side of (13.9) is nonnegative and, consequently,

(λ_j^{(2)})²(v, v) + λ_j^{(2)}(Bv, v) + (Cv, v) ≥ 0   (13.10)
13.1.
307
OVERDAMPED CASE
for all Jl· Define g(Jl) = 2il~2 l(v, v) + (Bv, v); then g(l) < 0 by (13.8) and g(O) > 0 by (13.5) (with i = 2 and uT = x; so the sign + appears in (13.5)). Furthermore, g(Jl) is a continuous function of Jl, and hence there exists a Jlo E (0, 1) such that g(J1 0 ) = 2i!.Yl(v 0 , v0 ) + (Bv 0 , u0 ) = 0, z: 0 = v(J1 0 ). Together with (13.10) this implies (Bt• 0 , v0 ? : :; 4(v0 , v0 )(Cv0 , t•0 ); since the system is assumed to be overdamped, this is possible only if v0 = 0. This contradicts the fact that x andy are linearly independent. So (13.7) is proved. Since ilPl
E
{p 1(x)lx
E ~n\{0}},
inequalities (13.6) follow immediately from (13.7).
D
Observe that using Theorem 12.5 one can easily deduce that max{λ₁⁽¹⁾, ..., λₙ⁽¹⁾} < λ₀ < min{λ₁⁽²⁾, ..., λₙ⁽²⁾} if and only if L(λ₀) is negative definite (assuming of course that the system is overdamped). In particular, the set {λ ∈ ℝ | L(λ) is negative definite} is not void, and the number q satisfies the requirements of Theorem 13.1 if and only if L(q) is negative definite.

We shall now deduce general formulas for the solutions of the homogeneous equation corresponding to (13.2), as well as for the two-point boundary value problem.

Theorem 13.2. Let system (13.2) be overdamped, and let λⱼ⁽ⁱ⁾ be as in Theorem 13.1. Let Γ₊ (resp. Γ₋) be a contour in the complex plane containing no points of σ(L), and such that λ₁⁽²⁾, ..., λₙ⁽²⁾ (resp. λ₁⁽¹⁾, ..., λₙ⁽¹⁾) are the only eigenvalues of L(λ) lying inside Γ₊ (resp. Γ₋). Then

(i) the matrices ∫_{Γ₊} L⁻¹(λ) dλ and ∫_{Γ₋} L⁻¹(λ) dλ are nonsingular;
(ii) a general solution of L(d/dt)x = 0 has the form

x(t) = e^{tX₊}c₁ + e^{tX₋}c₂

for some c₁, c₂ ∈ ℂⁿ, where

X± = ( ∫_{Γ±} L⁻¹(λ) dλ )⁻¹ ∫_{Γ±} λL⁻¹(λ) dλ;        (13.11)

(iii) the matrices X₊ and X₋ form a complete pair of solutions of the matrix equation X² + BX + C = 0, i.e., X₊ − X₋ is nonsingular;
(iv) if b − a > 0 is large enough, then the matrix e^{X₊(b−a)} − e^{X₋(b−a)} is nonsingular;
(v) for b − a > 0 large enough, the two-point boundary value problem

L(d/dt)x(t) = f(t),   x(a) = x(b) = 0
has the unique solution

x(t) = ∫ₐᵇ G₀(t, τ)f(τ) dτ,

where G₀(t, τ) is a Green's function expressed in terms of e^{X₊(t−τ)}, e^{X₋(t−τ)}, and the matrix (e^{X₊(b−a)} − e^{X₋(b−a)})⁻¹.

13.2. Weakly Damped Case

Suppose now that the system (13.2) is weakly damped, i.e.,

(Bf, f)² − 4(f, f)(Cf, f) < 0

for every f ∈ ℂⁿ\{0}. Then (L(λ)f, f) > 0 for every real λ and every f ≠ 0; in particular, the polynomial L(λ) is nonnegative and has no real eigenvalues. According to Theorem 12.8, L(λ) therefore admits the factorization

L(λ) = (Iλ − Z*)(Iλ − Z),

where σ(Z) coincides with a maximal c-set S of eigenvalues of L(λ) chosen in advance. (Recall that a set S of eigenvalues is called a c-set if S ∩ {λ̄ | λ ∈ S} = ∅.) In particular, as a maximal c-set one may wish to choose the eigenvalues lying in the open upper half-plane. In such a case, by Theorem 2.15, the matrices Z and Z* form a complete pair of solutions of the matrix equation X² + BX + C = 0.
If all aⱼ(λ) are zeros, there is nothing to prove. Suppose that not all of the aⱼ(λ) are zeros, and let a_{j₀}(λ) be a polynomial of minimal degree among the nonzero entries of L(λ). We can suppose that j₀ = 1 (otherwise interchange columns in L(λ)). By elementary transformations it is possible to replace all the other entries in L(λ) by zeros. Indeed, let aⱼ(λ) ≠ 0. Divide aⱼ(λ) by a₁(λ):

aⱼ(λ) = hⱼ(λ)a₁(λ) + rⱼ(λ),

where rⱼ(λ) is the remainder, and either its degree is less than the degree of a₁(λ) or rⱼ(λ) ≡ 0. Add to the jth column the first column multiplied by −hⱼ(λ). Then in place j of the new matrix there will be rⱼ(λ). If rⱼ(λ) ≢ 0, then put rⱼ(λ) in place 1, and if there still exists a nonzero entry (different from rⱼ(λ)), apply the same argument again. Namely, divide this (say, the kth) entry by rⱼ(λ) and add to the kth column the first column multiplied by minus the quotient of the division, and so on. Since the degrees of the remainders decrease, after a finite number of steps (not more than the degree of a₁(λ)) we find that all the entries in our matrix, except the first, are zeros.

S1. THE SMITH FORM AND RELATED PROBLEMS
This proves Theorem S1.1 in the case m = 1, n > 1. The case m > 1, n = 1 is treated in a similar way.

Assume now m, n > 1, and assume that the theorem is proved for sizes (m − 1) × (n − 1). We can suppose that the (1, 1) entry of L(λ) is nonzero and has minimal degree among the nonzero entries of L(λ) (indeed, we can reach this condition by interchanging rows and/or columns of L(λ) if L(λ) contains a nonzero entry; if L(λ) ≡ 0, Theorem S1.1 is trivial). With the help of the procedure described in the previous paragraph (applied to the first row and the first column of L(λ)), by a finite number of elementary transformations we reduce L(λ) to the form

L₁(λ) = [ a₁₁⁽¹⁾(λ)   0          ···   0
          0           a₂₂⁽¹⁾(λ)  ···   a₂ₙ⁽¹⁾(λ)
          ⋮           ⋮                ⋮
          0           aₘ₂⁽¹⁾(λ)  ···   aₘₙ⁽¹⁾(λ) ].
Suppose that some aᵢⱼ⁽¹⁾(λ) ≢ 0 (i, j > 1) is not divisible by a₁₁⁽¹⁾(λ) (without remainder). Then add the ith row to the first row and apply the above arguments again. We obtain a matrix polynomial of the structure

L₂(λ) = [ a₁₁⁽²⁾(λ)   0          ···   0
          0           a₂₂⁽²⁾(λ)  ···   a₂ₙ⁽²⁾(λ)
          ⋮           ⋮                ⋮
          0           aₘ₂⁽²⁾(λ)  ···   aₘₙ⁽²⁾(λ) ],
where the degree of a₁₁⁽²⁾(λ) is less than the degree of a₁₁⁽¹⁾(λ). If some entry of L₂(λ) is still not divisible by a₁₁⁽²⁾(λ), repeat the same procedure once more, and so on. After a finite number of steps we obtain a matrix

L₃(λ) = [ a₁₁⁽³⁾(λ)   0          ···   0
          0           a₂₂⁽³⁾(λ)  ···   a₂ₙ⁽³⁾(λ)
          ⋮           ⋮                ⋮
          0           aₘ₂⁽³⁾(λ)  ···   aₘₙ⁽³⁾(λ) ],

where every aᵢⱼ⁽³⁾(λ) is divisible by a₁₁⁽³⁾(λ). Multiply the first row (or column) by a nonzero constant to make the leading coefficient of a₁₁⁽³⁾(λ) equal to 1, and write

L₃(λ) = [ a₁₁⁽³⁾(λ)   0
          0           L₄(λ) ]

(here L₄(λ) is an (m − 1) × (n − 1) matrix polynomial), and apply the induction hypothesis to L₄(λ) to complete the proof of Theorem S1.1.  □
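The Euclidean reduction step used in the proof can be sketched in code. The following is an illustrative implementation (not from the book) of the 1 × n case: every entry of a polynomial row is repeatedly replaced by its remainder modulo a nonzero entry of minimal degree, until a single nonzero entry survives. Polynomials are represented as coefficient lists, constant term first.

```python
from fractions import Fraction

def deg(p):
    """Degree of a coefficient list (constant term first); -1 for the zero polynomial."""
    d = -1
    for i, c in enumerate(p):
        if c != 0:
            d = i
    return d

def polymod(a, b):
    """Remainder of a modulo b, using exact arithmetic with Fractions."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while deg(a) >= deg(b) >= 0:
        k = deg(a) - deg(b)
        f = a[deg(a)] / b[deg(b)]
        for i in range(deg(b) + 1):
            a[k + i] -= f * b[i]
    return a

def reduce_row(row):
    """Column operations from the proof: replace every other entry by its
    remainder modulo a minimal-degree nonzero entry, until one entry survives."""
    row = [[Fraction(c) for c in p] for p in row]
    while sum(1 for p in row if deg(p) >= 0) > 1:
        j0 = min((j for j in range(len(row)) if deg(row[j]) >= 0),
                 key=lambda j: deg(row[j]))
        for j in range(len(row)):
            if j != j0 and deg(row[j]) >= 0:
                row[j] = polymod(row[j], row[j0])
    return row

# Row (lam^2 - 1, lam^2 + lam): their gcd is lam + 1, so the process must
# terminate with a single nonzero entry of degree 1.
row = reduce_row([[-1, 0, 1], [0, 1, 1]])
degrees = [deg(p) for p in row]
print(degrees)   # -> [-1, 1]  (one zero polynomial, one entry of degree 1)
```

As the proof observes, termination is guaranteed because the degree of the minimal entry strictly decreases at each round.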
The existence of the Smith form allows us to prove easily the following fact.

Corollary S1.3. An n × n matrix polynomial L(λ) has constant nonzero determinant if and only if L(λ) can be represented as a product

L(λ) = F₁(λ)F₂(λ) ··· F_p(λ)

of a finite number of elementary matrices Fᵢ(λ).

Proof. Since det Fᵢ(λ) = const ≠ 0, obviously any product of elementary matrices has a constant nonzero determinant. Conversely, suppose det L(λ) = const ≠ 0. Let

L(λ) = E(λ)D(λ)F(λ)        (S1.8)

be the Smith form of L(λ). Note that by definition of the Smith form, E(λ) and F(λ) are products of elementary matrices. Taking determinants in (S1.8), we obtain det L(λ) = det E(λ) · det D(λ) · det F(λ), and consequently det D(λ) = const ≠ 0. This happens if and only if D(λ) = I. But then L(λ) = E(λ)F(λ) is a product of elementary matrices.  □

S1.2. Invariant Polynomials and Elementary Divisors
The diagonal elements d₁(λ), ..., d_r(λ) in the Smith form are called the invariant polynomials of L(λ). The number r of invariant polynomials can be defined as

r = max_{λ∈ℂ} rank L(λ).        (S1.9)

Indeed, since E(λ) and F(λ) from (S1.1) are invertible matrices for every λ, we have rank L(λ) = rank D(λ) for every λ ∈ ℂ. On the other hand, it is clear that rank D(λ) = r if λ is not a zero of one of the invariant polynomials, and rank D(λ) < r otherwise. So (S1.9) follows.

Represent each invariant polynomial as a product of linear factors:

dᵢ(λ) = ∏ⱼ₌₁^{kᵢ} (λ − λᵢⱼ)^{αᵢⱼ},   i = 1, ..., r,

where λᵢ₁, ..., λᵢ,ₖᵢ are distinct complex numbers and αᵢ₁, ..., αᵢ,ₖᵢ are positive integers. The factors (λ − λᵢⱼ)^{αᵢⱼ}, j = 1, ..., kᵢ, i = 1, ..., r, are called the elementary divisors of L(λ). An elementary divisor is said to be linear or nonlinear according as αᵢⱼ = 1 or αᵢⱼ > 1. Different elementary divisors may contain the same polynomial (λ − λ₀)^α (this happens, for example, when dᵢ(λ) = dᵢ₊₁(λ) for some i); the total number of elementary divisors of L(λ) is therefore ∑ᵢ₌₁ʳ kᵢ.
Consider a simple example.

Example S1.1. Let

L(λ) = [ λ(λ − 1)   1
         0          λ(λ − 1) ].

First let us find the Smith form of L(λ) (we shall not mention explicitly the elementary transformations):

[ λ(λ−1)   1        →  [ 0           1        →  [ 0            1   →  [ 1   0
  0        λ(λ−1) ]      −λ²(λ−1)²   λ(λ−1) ]      −λ²(λ−1)²    0 ]      0   λ²(λ−1)² ].

Thus the elementary divisors are λ² and (λ − 1)².  □
The degrees αᵢⱼ of the elementary divisors form an important characteristic of the matrix polynomial L(λ). As we shall see later, they determine, in particular, the Jordan structure of L(λ). Here we mention only the following simple property of the elementary divisors, whose verification is left to the reader.

Proposition S1.4. Let L(λ) be an n × n matrix polynomial such that det L(λ) ≢ 0. Then the sum ∑ᵢ₌₁ʳ ∑ⱼ₌₁^{kᵢ} αᵢⱼ of the degrees of its elementary divisors (λ − λᵢⱼ)^{αᵢⱼ} coincides with the degree of det L(λ).

Note that knowledge of the elementary divisors of L(λ) and of the number r of its invariant polynomials d₁(λ), ..., d_r(λ) is sufficient to reconstruct d₁(λ), ..., d_r(λ). In this construction we use the fact that dᵢ(λ) is divisible by dᵢ₋₁(λ). Let λ₁, ..., λ_p be all the distinct complex numbers which appear in the elementary divisors, and let (λ − λᵢ)^{αᵢ₁}, ..., (λ − λᵢ)^{αᵢ,ₖᵢ} (i = 1, ..., p) be the elementary divisors containing the number λᵢ, ordered in descending order of the degrees αᵢ₁ ≥ ··· ≥ αᵢ,ₖᵢ > 0. Clearly, the number r of invariant polynomials must be greater than or equal to max{k₁, ..., k_p}. Under this condition, the invariant polynomials d₁(λ), ..., d_r(λ) are given by the formulas

dⱼ(λ) = ∏ᵢ₌₁ᵖ (λ − λᵢ)^{αᵢ,ᵣ₊₁₋ⱼ},   j = 1, ..., r,

where we put (λ − λᵢ)^{αᵢⱼ} = 1 for j > kᵢ.

The following property of the elementary divisors will be used subsequently:
Proposition S1.5. Let A(λ) and B(λ) be matrix polynomials, and let C(λ) = diag[A(λ), B(λ)], a block diagonal matrix polynomial. Then the set of elementary divisors of C(λ) is the union of the elementary divisors of A(λ) and B(λ).
Proof. Let D₁(λ) and D₂(λ) be the Smith forms of A(λ) and B(λ), respectively. Then clearly

C(λ) = E(λ) diag[D₁(λ), D₂(λ)] F(λ)

for some matrix polynomials E(λ) and F(λ) with constant nonzero determinant. Let (λ − λ₀)^{α₁}, ..., (λ − λ₀)^{α_p} and (λ − λ₀)^{β₁}, ..., (λ − λ₀)^{β_q} be the elementary divisors of D₁(λ) and D₂(λ), respectively, corresponding to the same complex number λ₀. Arrange the set of exponents α₁, ..., α_p, β₁, ..., β_q in nondecreasing order: {α₁, ..., α_p, β₁, ..., β_q} = {γ₁, ..., γ_{p+q}}, where 0 < γ₁ ≤ ··· ≤ γ_{p+q}. Using Theorem S1.2 it is clear that in the Smith form D = diag[d₁(λ), ..., d_r(λ), 0, ..., 0] of diag[D₁(λ), D₂(λ)], the invariant polynomial d_r(λ) is divisible by (λ − λ₀)^{γ_{p+q}} but not by (λ − λ₀)^{γ_{p+q}+1}, d_{r−1}(λ) is divisible by (λ − λ₀)^{γ_{p+q−1}} but not by (λ − λ₀)^{γ_{p+q−1}+1}, and so on. It follows that the elementary divisors of diag[D₁(λ), D₂(λ)] (and therefore also those of C(λ)) corresponding to λ₀ are just (λ − λ₀)^{γ₁}, ..., (λ − λ₀)^{γ_{p+q}}, and Proposition S1.5 is proved.  □
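The bookkeeping in this proof can be sketched in a few lines of code. The following illustration (with hypothetical exponent data) merges the elementary-divisor exponents of two diagonal blocks at a single point λ₀ and then distributes them among the invariant polynomials d₁ | d₂ | ··· | d_r, the largest exponent going to d_r.

```python
def invariant_exponents(exponents, r):
    """Given the elementary-divisor exponents at one eigenvalue and the number r
    of invariant polynomials, return the exponent of (lam - lam0) in d_1..d_r."""
    ordered = sorted(exponents, reverse=True)      # alpha_1 >= alpha_2 >= ...
    padded = ordered + [0] * (r - len(ordered))    # exponent 0 means factor 1
    return list(reversed(padded[:r]))              # d_1 receives the smallest

# Hypothetical exponents of (lam - lam0)^a among the elementary divisors:
alphas = [3, 1]      # from the block A(lam)
betas = [2]          # from the block B(lam)

merged = alphas + betas   # Proposition S1.5: the union of the two exponent lists
print(invariant_exponents(merged, r=4))   # -> [0, 1, 2, 3]
```

The output reads: (λ − λ₀) does not divide d₁, while d₂, d₃, d₄ contain it with exponents 1, 2, 3 respectively, exactly the γⱼ ordering used in the proof.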
S1.3. Application to Differential Equations with Constant Coefficients

Consider the system of homogeneous differential equations

A_l dˡx/dtˡ + ··· + A₁ dx/dt + A₀x = 0,        (S1.10)

where x = x(t) is a vector function (differentiable l times) of the real argument t with values in ℂⁿ, and A₀, ..., A_l are constant m × n matrices. Introduce the matrix polynomial connected with system (S1.10):

A(λ) = ∑ⱼ₌₀ˡ Aⱼλʲ.        (S1.11)

The properties of the solutions of (S1.10) are closely related to the spectral properties of the polynomial A(λ). This connection appears many times in the book. Here we shall only mention the possibility of reducing system (S1.10) to n (or fewer) independent scalar equations, using the Smith form of the matrix polynomial (S1.11).
Given a matrix polynomial B(λ) = ∑ⱼ₌₀ᵖ Bⱼλʲ and a p-times differentiable vector function y = y(t), denote for brevity

B(d/dt)y = ∑ⱼ₌₀ᵖ Bⱼ dʲy/dtʲ.

Thus, B(d/dt)y is obtained if we write formally (∑ⱼ₌₀ᵖ Bⱼλʲ)y and then replace λ by d/dt; for example, if

B(λ) = [ λ²   λ + 2
         0    λ³ + 1 ]

and y = (y₁, y₂)ᵀ, then

B(d/dt)y = [ y₁″ + y₂′ + 2y₂
             y₂‴ + y₂ ].

Note the following simple property: if B(λ) = B₁(λ)B₂(λ) is a product of two matrix polynomials B₁(λ) and B₂(λ), then

B(d/dt)y = B₁(d/dt)(B₂(d/dt)y).        (S1.12)

This property follows from the facts that for α₁, α₂ ∈ ℂ

(α₁B₁(d/dt) + α₂B₂(d/dt))y = α₁B₁(d/dt)y + α₂B₂(d/dt)y

and

(dⁱ/dtⁱ)(dʲ⁻ⁱ/dtʲ⁻ⁱ)y = (dʲ/dtʲ)y,   0 ≤ i ≤ j.

Let us go back to the system (S1.10) and the corresponding matrix polynomial (S1.11). Let D(λ) = diag[d₁(λ), ..., d_r(λ), 0, ..., 0] be the Smith form of A(λ):

A(λ) = E(λ)D(λ)F(λ),        (S1.13)

where E(λ) and F(λ) are matrix polynomials with constant nonzero determinant. According to (S1.12), the system (S1.10) can be written in the form

E(d/dt)D(d/dt)F(d/dt)x(t) = 0.        (S1.14)
Denote y = F(d/dt)x(t). Multiplying (S1.14) on the left by E⁻¹(d/dt), we obtain the system

diag[ d₁(d/dt), ..., d_r(d/dt), 0, ..., 0 ] y = 0        (S1.15)

(y = (y₁, ..., yₙ)ᵀ), which is equivalent to (S1.14). System (S1.15) splits into r independent scalar equations

dᵢ(d/dt)yᵢ(t) = 0,   i = 1, ..., r        (S1.16)

(the last n − r equations are identities: 0 · yᵢ = 0 for i = r + 1, ..., n). As is well known, the solutions of the ith equation from (S1.16) form an mᵢ-dimensional linear space Πᵢ, where mᵢ is the degree of dᵢ(λ). It is clear then that every solution of (S1.15) has the form (y₁(t), ..., yₙ(t)) with yᵢ(t) ∈ Πᵢ, i = 1, ..., r, and y_{r+1}(t), ..., yₙ(t) arbitrary l-times differentiable functions. The solutions of (S1.10) are now given by the formula

x(t) = F⁻¹(d/dt)y(t),

where yᵢ ∈ Πᵢ, i = 1, ..., r, and y_{r+1}, ..., yₙ are arbitrary l-times differentiable functions. As a particular case of this fact we obtain the following result.

Theorem S1.6. Let A(λ) = ∑ⱼ₌₀ˡ Aⱼλʲ be an n × n matrix polynomial such that det A(λ) ≢ 0. Then the dimension of the solution space of (S1.10) is equal to the degree of det A(λ).

Indeed, under the condition det A(λ) ≢ 0, the Smith form of A(λ) does not contain zeros on the main diagonal. Thus, the dimension of the solution space is just dim Π₁ + ··· + dim Πₙ, which is exactly degree(det A(λ)) (Proposition S1.4). Theorem S1.6 is often referred to as Chrystal's theorem.

Let us illustrate this reduction to scalar equations with an example.
Example S1.2.
Consider the system

d²x₁/dt² + x₂ = 0,
x₁ + dx₂/dt = 0.        (S1.17)

The corresponding matrix polynomial is

A(λ) = [ λ²   1
         1    λ ].

Compute the Smith form of A(λ):

D(λ) = [ 1   0
         0   λ³ − 1 ].

Equations (S1.15) take the form

y₁ = 0,   d³y₂/dt³ − y₂ = 0.

Linearly independent solutions of the last equation are

y₂,₁ = eᵗ,   y₂,₂ = exp[ ((−1 + i√3)/2) t ],   y₂,₃ = exp[ ((−1 − i√3)/2) t ].        (S1.18)

Thus, linearly independent solutions of (S1.17) are given by x₁ = y₂, x₂ = −y₂″, where y₂ = y₂(t) can be each one of the three functions determined in (S1.18).  □
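Example S1.2 can be checked numerically. In the sketch below (an illustration, not part of the original text), each exponent μ in (S1.18) is verified to be a cube root of unity, i.e., a zero of the invariant polynomial d₂(λ) = λ³ − 1, and the pair x₁ = e^{μt}, x₂ = −μ²e^{μt} is substituted into (S1.17), the common factor e^{μt} cancelling.

```python
import math

# The exponents appearing in (S1.18): the three cube roots of unity.
roots = [1,
         (-1 + 1j * math.sqrt(3)) / 2,
         (-1 - 1j * math.sqrt(3)) / 2]

for mu in roots:
    assert abs(mu ** 3 - 1) < 1e-12           # mu is a zero of lam^3 - 1
    # substitute x1 = e^{mu t}, x2 = -mu^2 e^{mu t} into system (S1.17):
    assert abs(mu ** 2 + (-mu ** 2)) < 1e-12  # x1'' + x2  = 0
    assert abs(1 + mu * (-mu ** 2)) < 1e-12   # x1  + x2'  = 0
print("all three exponents are cube roots of unity")
```

This also confirms Chrystal's theorem here: deg det A(λ) = deg(λ³ − 1) = 3, matching the three-dimensional solution space.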
In an analogous fashion a reduction to the scalar case can be made for the nonhomogeneous equation

A(d/dt)x(t) = f(t).        (S1.19)

Indeed, this equation is equivalent to

E(d/dt)D(d/dt)F(d/dt)x(t) = f(t),

where f(t) is a given function (we shall suppose for simplicity that f(t) has as many derivatives as necessary). Here D(λ) is the Smith form of A(λ) = ∑ⱼ₌₀ˡ Aⱼλʲ, and E(λ) and F(λ) are defined by (S1.13). Let y = F(d/dt)x and g = E⁻¹(d/dt)f(t); then the preceding equation takes the form

D(d/dt)y(t) = g(t);        (S1.20)

thus, it is reduced to a system of n independent scalar equations. Using this reduction it is not hard to prove the following result.

Lemma S1.7. The system (S1.19) (where n = m) has a solution for every right-hand side f(t) (which is sufficiently many times differentiable) if and only if det A(λ) ≢ 0.

Proof. If det A(λ) ≢ 0, then none of the entries on the main diagonal of D(λ) is zero, and therefore every scalar equation in (S1.20) has a solution. If det A(λ) ≡ 0, then the last entry on the main diagonal of D(λ) is zero. Hence, if f(t) is chosen in such a way that the last entry of E⁻¹(d/dt)f(t) is not zero, the last equation in (S1.20) does not hold, and (S1.19) has no solution for this choice of f(t).  □

Of course, it can happen that for some special choice of f(t), Eq. (S1.19) still has a solution even when det A(λ) ≡ 0.
S1.4. Application to Difference Equations

Let A₀, ..., A_l be m × n matrices with complex entries. Consider the system of difference equations

A₀xₖ + A₁xₖ₊₁ + ··· + A_l xₖ₊ₗ = yₖ,   k = 0, 1, 2, ...,        (S1.21)

where (y₀, y₁, ...) is a given sequence of vectors in ℂᵐ and (x₀, x₁, ...) is a sequence in ℂⁿ to be found. For example, such systems appear if we wish to solve approximately a system of differential equations

B₂ d²x(t)/dt² + B₁ dx(t)/dt + B₀x(t) = y(t),   0 < t < ∞        (S1.22)

(here y(t) is a given function and x(t) is unknown), by replacing each derivative by a finite difference approximation, as follows. Let h be a positive number and denote tⱼ = jh, j = 0, 1, .... Given the existence of a solution x(t), consider (S1.22) only at the points tⱼ:

B₂ d²x(tⱼ)/dt² + B₁ dx(tⱼ)/dt + B₀x(tⱼ) = y(tⱼ),   j = 0, 1, ...,        (S1.23)
and replace each derivative (dˡ/dtˡ)x(tⱼ) by a finite difference approximation. For example, if xⱼ is an approximation for x(tⱼ), j = 0, 1, 2, ..., then one may use the central difference approximations

dx(tⱼ)/dt ≈ (xⱼ₊₁ − xⱼ₋₁) / (2h),        (S1.24)
d²x(tⱼ)/dt² ≈ (xⱼ₊₁ − 2xⱼ + xⱼ₋₁) / h².        (S1.25)

Inserting these expressions in (S1.23) gives

B₂ (xⱼ₊₁ − 2xⱼ + xⱼ₋₁)/h² + B₁ (xⱼ₊₁ − xⱼ₋₁)/(2h) + B₀xⱼ = y(tⱼ),   j = 1, 2, ...,

or

(B₂ + ½B₁h)xⱼ₊₁ − (2B₂ − B₀h²)xⱼ + (B₂ − ½B₁h)xⱼ₋₁ = h²y(tⱼ),   j = 1, 2, ...,        (S1.26)

which is a difference equation of type (S1.21) with yₖ = h²y(tₖ₊₁), k = 0, 1, 2, .... Some other device may be needed to provide the approximate values of x(0) and x(h) (since, generally speaking, a solution of (S1.26) is determined uniquely by the values of x(0) and x(h)). These can be approximated if, for instance, the initial values x(0) and dx(0)/dt are given; then the approximation x₁ for x(h) could be

x₁ = x(0) + h dx(0)/dt.

This technique can, of course, be extended to approximate by finite differences the equation of lth order

B_l dˡx(t)/dtˡ + ··· + B₀x(t) = y(t).
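The recursion (S1.26) is easy to run. The sketch below (an illustration with assumed scalar data B₂ = 1, B₁ = 0, B₀ = 1, y ≡ 0) integrates x″ + x = 0 with x(0) = 1, x′(0) = 0, whose exact solution is cos t, using the central difference scheme and the first-order starting value x₁ described above.

```python
import math

h = 0.001
steps = 1000                  # integrate up to t = steps * h = 1
x_prev = 1.0                  # x(0)
x_curr = 1.0 + h * 0.0        # x(h) ~ x(0) + h x'(0), as in the text

for j in range(1, steps):
    # scheme (S1.26) with B2 = 1, B1 = 0, B0 = 1, y = 0 reduces to
    # x_{j+1} = (2 - h^2) x_j - x_{j-1}
    x_next = (2.0 - h * h) * x_curr - x_prev
    x_prev, x_curr = x_curr, x_next

t = steps * h
print(abs(x_curr - math.cos(t)))   # discretization error; small for small h
```

The error shrinks with h, illustrating (but of course not proving) the convergence question that the text defers to numerical analysis.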
The approximation by finite differences is also widely used in the numerical solution of partial differential equations. In all these cases we arrive at a system of difference equations of type (S1.21). The extent to which solutions of the difference equation give information about solutions of the original differential equation is, of course, quite another question. This belongs to the study of numerical analysis and will not be pursued here.

To study the system (S1.21), introduce the shift operator ℰ, which acts on the set of all sequences (x₁, x₂, ...) of n-dimensional vectors xⱼ as follows: ℰ(x₁, x₂, ...) = (x₂, x₃, ...). For difference equations, the operator ℰ plays a role analogous to that of the operator d/dt in Section S1.3. The system (S1.21) can be rewritten in the form

(A₀ + A₁ℰ + ··· + A_lℰˡ)x = f,        (S1.27)

where f = (f₀, f₁, ...) is a given sequence of n-dimensional vectors and x = (x₀, x₁, ...) is a sequence to be found. Now it is quite clear that we have to consider the matrix polynomial A(λ) = ∑ⱼ₌₀ˡ Aⱼλʲ connected with (S1.21). Let D(λ) be the Smith form of A(λ) and

A(λ) = E(λ)D(λ)F(λ),        (S1.28)

where det E(λ) ≡ const ≠ 0 and det F(λ) ≡ const ≠ 0. Replace λ in (S1.28) by ℰ and substitute in (S1.27) to obtain

D(ℰ)y = g,        (S1.29)

where y = (y₀, y₁, ...) = F(ℰ)(x₀, x₁, ...) and g = (g₀, g₁, ...) = (E(ℰ))⁻¹f.

S3. One-Sided and Generalized Inverses

If a matrix A is one-sided invertible, then there exists ε > 0 such that every matrix B with ‖A − B‖ < ε is also one-sided invertible, and from the same side as A. (Here and in what follows the norm ‖M‖ of the matrix M is understood as follows: ‖M‖ = sup_{‖x‖=1} ‖Mx‖, where the norms of the vectors x and Mx are euclidean.) This property follows from (ii) (in the case of left invertibility) and from (v) (in the case of right invertibility). Indeed, assume for example that A is left invertible (of size m × n). Let A₀ be a nonsingular n × n submatrix of A. The corresponding n × n submatrix B₀ of B is as close as we wish to A₀, provided ‖A − B‖ is sufficiently small. Since nonsingularity of a matrix is preserved under small perturbations, B₀ will be nonsingular as well if ‖A − B‖ is small enough. But this implies linear independence of the columns of B, and therefore B is left invertible.

One can take ε = ‖Aᴵ‖⁻¹ in the preceding paragraph, where Aᴵ is some right or left inverse of A. Indeed, suppose for definiteness that A is right invertible, AAᴵ = I. Let B be such that ‖A − B‖ < ‖Aᴵ‖⁻¹. Then BAᴵ = AAᴵ + (B − A)Aᴵ = I + S, where S = (B − A)Aᴵ. By the conditions ‖S‖ < 1, I + S is invertible and

(I + S)⁻¹ = I − S + S² − S³ + ···,
where the series converges absolutely (because ‖S‖ < 1). Thus BAᴵ(I + S)⁻¹ = I, i.e., B is right invertible.

We conclude this chapter with the notion of generalized inverse. An n × m matrix Aᴵ is called a generalized inverse of the m × n matrix A if the following equalities hold:

AAᴵA = A,   AᴵAAᴵ = Aᴵ.

This notion incorporates as special cases the notions of one-sided (left or right) inverses. A generalized inverse of A always exists (in contrast with one-sided inverses, which exist if and only if A has full rank). One of the easiest ways to verify the existence of a generalized inverse is by using the representation

A = B₁ [ I_r  0    B₂,        (S3.5)
         0    0 ]

where r = rank A and B₁ and B₂ are nonsingular matrices of sizes m × m and n × n, respectively. Representation (S3.5) may be achieved by performing elementary transformations on the columns and rows of A. For A in the form (S3.5), a generalized inverse can be found easily:

Aᴵ = B₂⁻¹ [ I_r  0    B₁⁻¹.
            0    0 ]

Using (S3.5), one can easily verify that a generalized inverse is unique if and only if A is square and nonsingular.

We shall also need the following fact: if ℳ ⊂ ℂᵐ is a direct complement to Im A, and 𝒩 ⊂ ℂⁿ is a direct complement to Ker A, then there exists a generalized inverse Aᴵ of A such that Ker Aᴵ = ℳ and Im Aᴵ = 𝒩. (For the definition of direct complements see Section S4.1.) Indeed, let us check this assertion first for

A = [ I_r  0
      0    0 ].        (S3.6)

The conditions on ℳ and 𝒩 imply that

ℳ = Im [ X          𝒩 = Im [ I_r
         I_{m−r} ],           Y   ]

for some r × (m − r) matrix X and (n − r) × r matrix Y. Then

Aᴵ = [ I_r  −X
       Y    −YX ]

is a generalized inverse of A with the properties that Ker Aᴵ = ℳ and Im Aᴵ = 𝒩. The general case is easily reduced to the case (S3.6) by means of representation (S3.5), using the relations

Ker A = B₂⁻¹(Ker D),   Im A = B₁(Im D),   D = [ I_r  0
                                                0    0 ].
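A hypothetical numeric check of this construction for the special form (S3.6) with m = n = 2 and r = 1: the parameter values x = 2, y = 3 and the explicit matrix Aᴵ = [[1, −x], [y, −yx]] are assumptions of this sketch, verified below only through the two defining identities of a generalized inverse.

```python
def matmul(P, Q):
    """Plain matrix product over nested lists (conformable integer matrices)."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

A = [[1, 0],
     [0, 0]]                      # form (S3.6) with r = 1
x, y = 2, 3                       # arbitrary choices of the parameters X and Y
AI = [[1, -x],
      [y, -y * x]]                # candidate generalized inverse

print(matmul(matmul(A, AI), A))   # -> [[1, 0], [0, 0]]   (A AI A = A)
print(matmul(matmul(AI, A), AI))  # -> [[1, -2], [3, -6]] (AI A AI = AI)
```

By construction Im Aᴵ is spanned by (1, y)ᵀ and Ker Aᴵ by (x, 1)ᵀ, matching the prescribed subspaces 𝒩 and ℳ.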
Chapter S4

Stable Invariant Subspaces

In Chapters 3 and 7 it is proved that the description of divisors of a matrix polynomial depends on the structure of invariant subspaces of its linearization. So the properties of the invariant subspaces of a given matrix (or linear transformation) play an important role in the spectral analysis of matrix polynomials. Consequently, we consider in this chapter invariant subspaces of a linear transformation acting in a finite-dimensional linear vector space 𝒳. In the main, attention is focused on a certain class of invariant subspaces which will be called stable. Sections S4.3, S4.4, and S4.5 are auxiliary, but the results presented there are also used in the main text.

We shall often assume that 𝒳 = ℂⁿ, where ℂⁿ is considered as the linear space of n-dimensional column vectors together with the customary scalar product

(x, y) = ∑ᵢ₌₁ⁿ xᵢȳᵢ,

where x = (x₁, ..., xₙ)ᵀ, y = (y₁, ..., yₙ)ᵀ ∈ ℂⁿ. We may also assume that linear transformations are represented by matrices in the standard orthonormal basis in ℂⁿ.

S4.1. Projectors and Subspaces
A linear transformation P: 𝒳 → 𝒳 is called a projector if P² = P. It follows immediately from the definition that if P is a projector, so is S⁻¹PS for any invertible transformation S. The important feature of projectors is that there exists a one-to-one correspondence between the set of all projectors and the set of all pairs of complementary subspaces in 𝒳. This correspondence is described in Theorem S4.1.

Recall first that if ℳ, ℒ are subspaces of 𝒳, then ℳ + ℒ = {z ∈ 𝒳 | z = x + y, x ∈ ℳ, y ∈ ℒ}. This sum is said to be direct if ℳ ∩ ℒ = {0}, in which case we write ℳ ∔ ℒ for the sum. The subspaces ℳ, ℒ are complementary (are direct complements of each other) if ℳ ∩ ℒ = {0} and ℳ ∔ ℒ = 𝒳. Subspaces ℳ, ℒ are orthogonal if for each x ∈ ℳ and y ∈ ℒ we have (x, y) = 0, and they are orthogonal complements if, in addition, they are complementary. In this case we write ℳ = ℒ^⊥, ℒ = ℳ^⊥.

Theorem S4.1. Let P be a projector. Then (Im P, Ker P) is a pair of complementary subspaces in 𝒳. Conversely, for every pair (ℒ₁, ℒ₂) of complementary subspaces in 𝒳, there exists a unique projector P such that Im P = ℒ₁, Ker P = ℒ₂.

Proof. Let x ∈ 𝒳. Then

x = (x − Px) + Px.

Clearly Px ∈ Im P and x − Px ∈ Ker P (because P² = P). So Im P + Ker P = 𝒳. Further, if x ∈ Im P ∩ Ker P, then x = Py for some y ∈ 𝒳 and Px = 0. So

x = Py = P²y = P(Py) = Px = 0,

and Im P ∩ Ker P = {0}. Hence Im P and Ker P are indeed complementary subspaces.

Conversely, let ℒ₁ and ℒ₂ be a pair of complementary subspaces. Let P be the unique linear transformation in 𝒳 such that Px = x for x ∈ ℒ₁ and Px = 0 for x ∈ ℒ₂. Then clearly P² = P, ℒ₁ ⊂ Im P, and ℒ₂ ⊂ Ker P. But we already know from the first part of the proof that Im P ∔ Ker P = 𝒳. By dimensional considerations we have, consequently, ℒ₁ = Im P and ℒ₂ = Ker P. So P is a projector with the desired properties. The uniqueness of P follows from the property that Px = x for every x ∈ Im P (which in turn is a consequence of the equality P² = P).  □

We say that P is the projector on ℒ₁ along ℒ₂ if Im P = ℒ₁, Ker P = ℒ₂. A projector P is called orthogonal if Ker P = (Im P)^⊥. Orthogonal projectors are particularly important and can be characterized as follows:

Proposition S4.2. A projector P is orthogonal if and only if P is self-adjoint, i.e., P* = P.
Recall that for a linear transformation T: ℂⁿ → ℂⁿ the adjoint T*: ℂⁿ → ℂⁿ is defined by the relation (Tx, y) = (x, T*y) for all x, y ∈ ℂⁿ. A transformation T is self-adjoint if T = T*; in such a case it is represented by a hermitian matrix in the standard orthonormal basis.

Proof. Suppose P* = P, and let x ∈ Im P, y ∈ Ker P. Then (x, y) = (Px, y) = (x, Py) = (x, 0) = 0, i.e., Ker P is orthogonal to Im P. Since by Theorem S4.1 Ker P and Im P are complementary, it follows that in fact Ker P = (Im P)^⊥.

Conversely, let Ker P = (Im P)^⊥. In order to prove that P* = P, we have to check the equality

(Px, y) = (x, Py)   for all x, y ∈ ℂⁿ.        (S4.1)

Because of the sesquilinearity of the function (Px, y) in the arguments x, y ∈ 𝒳, and in view of Theorem S4.1, it is sufficient to prove (S4.1) in the following four cases: (1) x, y ∈ Im P; (2) x ∈ Ker P, y ∈ Im P; (3) x ∈ Im P, y ∈ Ker P; (4) x, y ∈ Ker P. In case (4), the equality (S4.1) is trivial because both sides are 0. In case (1) we have

(Px, y) = (Px, Py) = (x, Py),

and (S4.1) follows. In case (2), the left-hand side of (S4.1) is zero (since x ∈ Ker P) and the right-hand side is also zero in view of the orthogonality Ker P = (Im P)^⊥. In the same way one checks (S4.1) in case (3). So (S4.1) holds, and P* = P.  □

Note that if P is a projector, so is I − P. Indeed, (I − P)² = I − 2P + P² = I − 2P + P = I − P. Moreover, Ker P = Im(I − P) and Im P = Ker(I − P). It is natural to call the projectors P and I − P complementary projectors.

We shall now give useful representations of a projector with respect to a decomposition of 𝒳 into a sum of two complementary subspaces. Let T: 𝒳 → 𝒳 be a linear transformation and let ℒ₁, ℒ₂ be a pair of complementary subspaces in 𝒳. Denote mᵢ = dim ℒᵢ (i = 1, 2); then m₁ + m₂ = n (= dim 𝒳). The transformation T may be written as a 2 × 2 block matrix with respect to the decomposition ℒ₁ ∔ ℒ₂ = 𝒳:

T = [ T₁₁  T₁₂
      T₂₁  T₂₂ ].        (S4.2)

Here Tᵢⱼ (i, j = 1, 2) is an mᵢ × mⱼ matrix which represents in some basis the linear transformation PᵢT|_{ℒⱼ}: ℒⱼ → ℒᵢ, where Pᵢ is the projector on ℒᵢ along ℒ₃₋ᵢ (so P₁ + P₂ = I).
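The correspondence of Theorem S4.1 is concrete enough to compute. In the sketch below (example subspaces chosen purely for illustration), the projector on ℒ₁ = span{(1, 1)ᵀ} along ℒ₂ = span{(1, −1)ᵀ} in ℂ² is obtained from the two conditions P(1, 1)ᵀ = (1, 1)ᵀ and P(1, −1)ᵀ = 0, and the projector identity P² = P is then verified.

```python
def matmul(P, Q):
    """2x2 matrix product over nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Solving P*(1,1) = (1,1) and P*(1,-1) = (0,0) gives P = 0.5 * [[1,1],[1,1]].
P = [[0.5, 0.5],
     [0.5, 0.5]]

print(matmul(P, P) == P)                         # -> True   (P^2 = P)
print([P[0][0] + P[0][1], P[1][0] + P[1][1]])    # -> [1.0, 1.0] (fixes (1,1))
print([P[0][0] - P[0][1], P[1][0] - P[1][1]])    # -> [0.0, 0.0] (kills (1,-1))
```

Since ℒ₂ is not orthogonal to ℒ₁ here only by accident of the example ((1,1) ⊥ (1,−1) in ℝ²), this particular P is also self-adjoint, consistent with Proposition S4.2.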
Suppose now that T = P is a projector on ℒ₁ = Im P. Then representation (S4.2) takes the form

P = [ I  X
      0  0 ]        (S4.3)

for some matrix X. In general X ≠ 0. One can check easily that X = 0 if and only if ℒ₂ = Ker P. Analogously, if ℒ₁ = Ker P, then (S4.2) takes the form

P = [ 0  Y
      0  I ]        (S4.4)

and Y = 0 if and only if ℒ₂ = Im P. By the way, direct multiplication P · P, where P is given by (S4.3) or (S4.4), shows that P is indeed a projector: P² = P.

S4.2. Spectral Invariant Subspaces and Riesz Projectors

Let A: 𝒳 → 𝒳 be a linear transformation on a linear vector space 𝒳 with dim 𝒳 = n. Recall that a subspace ℒ is invariant for A (or A-invariant) if Aℒ ⊂ ℒ. The subspace consisting of the zero vector and 𝒳 itself are invariant for every linear transformation. A less trivial example is a one-dimensional subspace spanned by an eigenvector.

An A-invariant subspace ℒ is called A-reducing if there exists a direct complement ℒ′ to ℒ in 𝒳 which is also A-invariant. In this case we shall also say that the pair of subspaces (ℒ, ℒ′) is A-reducing. Not every A-invariant subspace is A-reducing: for example, if A is a Jordan block of size k, considered as a linear transformation from ℂᵏ to ℂᵏ, then there exists a chain of A-invariant subspaces

ℂᵏ ⊃ ℒₖ₋₁ ⊃ ℒₖ₋₂ ⊃ ··· ⊃ ℒ₁ ⊃ {0},

where ℒᵢ is the subspace spanned by the first i unit coordinate vectors. It is easily seen that none of the invariant subspaces ℒᵢ (i = 1, ..., k − 1) is A-reducing (because ℒᵢ is the single A-invariant subspace of dimension i).

From the definition it follows immediately that ℒ is an A-invariant (resp. A-reducing) subspace if and only if Sℒ is an invariant (resp. reducing) subspace for the linear transformation SAS⁻¹, where S: 𝒳 → 𝒳 is invertible. This simple observation allows us to use the Jordan normal form of A in many questions concerning invariant and reducing subspaces, and we frequently take advantage of this in what follows.

Invariant and reducing subspaces can be characterized in terms of projectors, as follows: let A: 𝒳 → 𝒳 be a linear transformation and let P: 𝒳 → 𝒳 be a projector. Then
Let A: fl£ ~ fl£ be a linear transformation on a linear vector space fl£ with dim fl£ = n. Recall that a subspace !£ is invariant for A (or A -invariant) if Aft' c !£. The subspace consisting of the zero vector and fl£ itself is invariant for every linear transformation. A less trivial example is a one-dimensional subspace spanned by an eigenvector. An A-invariant subspace!£ is called A-reducing if there exists a direct complement !£''to !£ in fl£ which is also A-invariant. In this case we shall say also that the pair of subspaces (!£',!£'')is A-reducing. Not every A-invariant subspace is A-reducing: for example, if A is a Jordan block of size k, considered as a linear transformation of (/} to fl\ then there exists a sequence of A-invariant subspaces flk :::J ff'k_ 1 :::J ff'k_ 2 :::J · · • :::J !£ 1 :::J {0}, where ff'; is the subspace spanned by the first i unit coordinate vectors. It is easily seen that none of the invariant subspaces ff'; (i = 1, ... , k - 1) is A-reducing (because ff'; is the single A-invariant subspace of dimension i). From the definition it follows immediately that !£ is an A-invariant (resp. A-reducing) subspace if and only if Sf£' is an invariant (resp. reducing) subspace for the linear transformation SAS- 1 , where S: fl£ ~ fl£ is invertible. This simple observation allows us to use the Jordan normal form of A in many questions concerning invariant and reducing subspaces, and we frequently take advantage of this in what follows. Invariant and reducing subspaces can be characterized in terms of projectors, as follows: let A: fl£ ~ fl£ be a linear transformation and let P: fl£ ~ fl£ be a projector. Then
S4.2.
SPECTRAL INVARIANT SUBSPACES AND RIESZ PROJECTORS
(i)
357
the subspace ImP is A-invariant if and only if PAP= AP;
(ii)
the pair of subspaces ImP, Ker Pis A-reducing if and only if AP
= PA.
Indeed, write A as a 2 x 2 block matrix with respect to the decomposition Im P Ker P = !!l':
+
(so, for instance, A 11 = PA hmP: Im P--+ Im P). The projector P has the corresponding representation (see (S4.3))
OJ
P=[I0 0 .
Now PAP = AP means that A 21 = 0. But A 21 = 0 in turn means that Im P is A -invariant, and (i) is proved. Further, AP = P A means that A21 = 0 and A 12 = 0. But clearly, the condition that Im P, Ker P is an A-reducing pair is the same. So (ii) holds as well. We consider now an important class of reducing subspaces, namely, spectral invariant subspaces. Let a( A) be the spectrum of A, i.e., the set of all eigenvalues. Among the A-invariant subspaces of special interest are spectral invariant subspaces. An A-invariant subspace fi' is called spectral with respect to a subset a c a(A), if fi' is a maximal A-invariant subspace with the property that a(A 1-P) c a. If a = {A.0 } consists of only one point A.0 , then a spectral subspace with respect to a is just the root subspace of A corresponding to A.0 . To describe spectral invariant subspaces, it is convenient to use Riesz projectors. Consider the linear transformation P" = -21 . 1tl
i
(/A.- A) -1 dA.,
(S4.5)
ro
where r" is a simple rectifiable contour such that a is insider" and a(A)\a is outside r". The matrix P" given by (S4.5) is called the Riesz projector of A corresponding to the subset a c a(A). We shall list now some simple properties of Riesz projectors. First, let us check that (S4.5) is indeed a projector. To this end letf(A.) be the function defined as follows:f(A.) = 1 inside and on r",J(A.) = 0 outsider". Then we can rewrite (S4.5) in the form
    P_σ = (1/2πi) ∮_Γ f(λ)(Iλ − A)⁻¹ dλ,

where Γ is some contour such that σ(A) is inside Γ. Since f(λ) is analytic in a neighborhood of σ(A), by Theorem S1.15 we have

    P_σ² = [(1/2πi) ∮_Γ f(λ)(Iλ − A)⁻¹ dλ]² = (1/2πi) ∮_Γ f(λ)²(Iλ − A)⁻¹ dλ = P_σ,

since f(λ)² = f(λ).

Given ε > 0, there exists δ > 0 such that the following holds true: if B is a linear transformation with ‖B − A‖ < δ and {ℳⱼ} is a complete chain of B-invariant subspaces, then there exists a complete chain {𝒩ⱼ} of A-invariant subspaces such that ‖P_{𝒩ⱼ} − P_{ℳⱼ}‖ < ε for j = 1, ..., n − 1. In general the chain {𝒩ⱼ} for A will depend on the choice of B. To see this, consider
where v ∈ ℂ. Observe that for v ≠ 0 the only one-dimensional invariant subspace of B_v is Span{(0, 1)ᵀ}, while for B′_v, v ≠ 0, the only one-dimensional invariant subspace is Span{(1, 0)ᵀ}.

Proof. Assume that the conclusion of the theorem is not correct. Then there exists ε > 0 with the property that for every positive integer m there
exists a linear transformation B_m satisfying ‖B_m − A‖ < 1/m and a complete chain {ℳ_{mj}} of B_m-invariant subspaces such that for every complete chain {𝒩ⱼ} of A-invariant subspaces

    max_{1 ≤ j ≤ n−1} ‖P_{𝒩ⱼ} − P_{ℳ_{mj}}‖ ≥ ε,    m = 1, 2, ....    (S4.27)
Denote for brevity P_{mj} = P_{ℳ_{mj}}. Since ‖P_{mj}‖ = 1, there exist a subsequence {mᵢ} of the sequence of positive integers and linear transformations P₁, ..., P_{n−1} on ℂⁿ such that

    lim_{i→∞} P_{mᵢ,j} = Pⱼ,    j = 1, ..., n − 1.    (S4.28)

Observe that P₁, ..., P_{n−1} are orthogonal projectors. Indeed, passing to the limit in the equalities P_{mᵢ,j} = (P_{mᵢ,j})², we obtain that Pⱼ = Pⱼ². Further, (S4.28) combined with P*_{mᵢ,j} = P_{mᵢ,j} implies that P*ⱼ = Pⱼ; so Pⱼ is an orthogonal projector (Proposition S4.2). Further, the subspace 𝒩ⱼ = Im Pⱼ has dimension j, j = 1, ..., n − 1. This is a consequence of Theorem S4.6. By passing to the limit it follows from B_m P_{mj} = P_{mj} B_m P_{mj} that APⱼ = Pⱼ APⱼ. Hence 𝒩ⱼ is A-invariant. Since P_{mj} = P_{m,j+1} P_{mj} we have Pⱼ = P_{j+1} Pⱼ, and thus 𝒩ⱼ ⊂ 𝒩_{j+1}. It follows that {𝒩ⱼ} is a complete chain of A-invariant subspaces. Finally, θ(𝒩ⱼ, ℳ_{mᵢ,j}) = ‖Pⱼ − P_{mᵢ,j}‖ → 0. But this contradicts (S4.27), and the proof is complete. □
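The dependence of the chain on B mentioned before the proof can be illustrated numerically. The displayed matrices B_v and B′_v were lost in this copy, so the forms below are assumed stand-ins consistent with the surrounding text: both tend to the zero matrix as v → 0, yet their unique one-dimensional invariant subspaces (their kernels) are fixed and different.

```python
import numpy as np

def kernel_direction(B):
    # Unit vector spanning Ker B (assumes the kernel is one dimensional):
    # the right singular vector belonging to the zero singular value.
    _, _, vt = np.linalg.svd(B)
    return vt[-1]

v = 1e-3
B1 = np.array([[0.0, 0.0], [v, 0.0]])    # assumed form of B_v
B2 = np.array([[0.0, v], [0.0, 0.0]])    # assumed form of B'_v

# Each matrix is nilpotent with a one-dimensional kernel, which is its
# only one-dimensional invariant subspace -- and the two kernels differ.
d1 = kernel_direction(B1)    # proportional to (0, 1)^T
d2 = kernel_direction(B2)    # proportional to (1, 0)^T
```

However small v is, the one-dimensional invariant subspaces Span{(0, 1)ᵀ} and Span{(1, 0)ᵀ} stay a fixed gap apart.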
Corollary S4.11. If A has only one eigenvalue, λ₀ say, and dim Ker(λ₀I − A) = 1, then each invariant subspace of A is stable.
Proof. The conditions on A are equivalent to the requirement that for each 1 ≤ j ≤ n − 1 the transformation A has only one j-dimensional invariant subspace, and that the nontrivial invariant subspaces form a complete chain. So we may apply the previous theorem to get the desired result. □
Lemma S4.12. If A has only one eigenvalue, λ₀ say, and if dim Ker(λ₀I − A) ≥ 2, then the only stable A-invariant subspaces are {0} and ℂⁿ.
Proof. Let J = diag(J₁, ..., J_s) be a Jordan matrix for A. Here Jᵢ is a simple Jordan block with λ₀ on the main diagonal and of size κᵢ, say. As dim Ker(λ₀I − A) ≥ 2, we have s ≥ 2. By similarity, it suffices to prove that J has no nontrivial stable invariant subspace. Let e₁, ..., e_n be the standard basis for ℂⁿ. Define the linear transformation T_ε on ℂⁿ by setting

    T_ε eᵢ = ε e_{i−1}  if i = κ₁ + ⋯ + κⱼ + 1, j = 1, ..., s − 1,  and  T_ε eᵢ = 0  otherwise,
and put B_ε = J + T_ε. Then ‖B_ε − J‖ tends to 0 as ε → 0. For ε ≠ 0 the linear transformation B_ε has exactly one j-dimensional invariant subspace, namely 𝒩ⱼ = Span{e₁, ..., eⱼ}, for each 1 ≤ j ≤ n − 1. It follows that 𝒩ⱼ is the only candidate for a stable J-invariant subspace of dimension j. Now consider J̃ = diag(J_s, ..., J₁). Repeating the argument of the previous paragraph for J̃ instead of J, we see that 𝒩ⱼ is the only candidate for a stable J̃-invariant subspace of dimension j. But J̃ = SJS⁻¹, where S is the similarity transformation that reverses the order of the blocks in J. It follows that S𝒩ⱼ is the only candidate for a stable J-invariant subspace of dimension j. However, as s ≥ 2, we have S𝒩ⱼ ≠ 𝒩ⱼ for 1 ≤ j ≤ n − 1, and the proof is complete. □

Corollary S4.11 and Lemma S4.12 together prove Theorem S4.9 for the case when A has one eigenvalue only.

S4.7. General Case of Stable Invariant Subspaces
The proof of Theorem S4.9 in the general case will be reduced to the case of one eigenvalue, which was considered in Section S4.6. To carry out this reduction we introduce some additional notions. We begin with the notion of minimal opening. For two subspaces ℳ and 𝒩 of ℂⁿ the number

    η(ℳ, 𝒩) = inf{‖x + y‖ : x ∈ ℳ, y ∈ 𝒩, max(‖x‖, ‖y‖) = 1}

is called the minimal opening between ℳ and 𝒩. Note that always 0 ≤ η(ℳ, 𝒩) ≤ 1, except when both ℳ and 𝒩 are the zero subspace, in which case η(ℳ, 𝒩) = ∞. It is easily seen that η(ℳ, 𝒩) > 0 if and only if ℳ ∩ 𝒩 = {0}. The following result shows that there exists a close connection between the minimal opening and the gap θ(ℳ, 𝒩) between subspaces.

Proposition S4.13. Let ℳ_m, m = 1, 2, ..., be a sequence of subspaces in ℂⁿ. If lim_{m→∞} θ(ℳ_m, ℒ) = 0 for some subspace ℒ, then

    lim_{m→∞} η(ℳ_m, 𝒩) = η(ℒ, 𝒩)    (S4.29)

for every subspace 𝒩.

Proof. Note that the set of pairs {(x, y) : x ∈ ℳ, y ∈ 𝒩, max(‖x‖, ‖y‖) = 1} ⊂ ℂⁿ × ℂⁿ is compact for any pair of subspaces ℳ, 𝒩. Since ‖x + y‖ is a continuous function of x and y, in fact

    η(ℳ, 𝒩) = ‖x + y‖
for some vectors x ∈ ℳ, y ∈ 𝒩 such that max(‖x‖, ‖y‖) = 1. So we can choose x_m ∈ ℳ_m, y_m ∈ 𝒩, m = 1, 2, ..., such that max(‖x_m‖, ‖y_m‖) = 1 and η(ℳ_m, 𝒩) = ‖x_m + y_m‖. Pick convergent subsequences x_{m_k} → x₀ ∈ ℂⁿ, y_{m_k} → y₀ ∈ 𝒩. Clearly, max(‖x₀‖, ‖y₀‖) = 1 and η(ℳ_{m_k}, 𝒩) → ‖x₀ + y₀‖. It is easy to verify (for instance by reductio ad absurdum) that x₀ ∈ ℒ. Thus

    lim inf_{m→∞} η(ℳ_m, 𝒩) ≥ η(ℒ, 𝒩).    (S4.30)

On the other hand,

    lim sup_{m→∞} η(ℳ_m, 𝒩) ≤ η(ℒ, 𝒩).    (S4.31)

Indeed, let x ∈ ℒ, y ∈ 𝒩 be such that max(‖x‖, ‖y‖) = 1 and η(ℒ, 𝒩) = ‖x + y‖.
Assume first that x ≠ 0. For any given a > 0 there exists m₀ such that d(x, ℳ_m) < a for m ≥ m₀. Let x_m ∈ ℳ_m be such that d(x, ℳ_m) = ‖x − x_m‖, and put z_m = (‖x‖/‖x_m‖) x_m ∈ ℳ_m; then max(‖z_m‖, ‖y‖) = max(‖x‖, ‖y‖) = 1. Taking a < ‖x‖/2 (so that x_m ≠ 0), for m ≥ m₀ we obtain

    ‖z_m + y‖ ≤ ‖x + y‖ + ‖z_m − x‖ ≤ ‖x + y‖ + ‖z_m − x_m‖ + ‖x_m − x‖ ≤ ‖x + y‖ + |‖x‖ − ‖x_m‖| + a ≤ ‖x + y‖ + 2a.

Hence

    η(ℳ_m, 𝒩) ≤ ‖z_m + y‖ → ‖x + y‖ = η(ℒ, 𝒩)  as a → 0,
and (S4.31) follows. Combining (S4.30) and (S4.31) we see that for some subsequence m_k we have lim_{k→∞} η(ℳ_{m_k}, 𝒩) = η(ℒ, 𝒩). Now by a standard argument one proves (S4.29). Indeed, if (S4.29) were not true, then we could pick a subsequence m′_k such that

    lim_{k→∞} η(ℳ_{m′_k}, 𝒩) ≠ η(ℒ, 𝒩).    (S4.32)

But then we repeat the above argument replacing ℳ_m by ℳ_{m′_k}. So it is possible to pick a subsequence m″_k from m′_k such that lim_{k→∞} η(ℳ_{m″_k}, 𝒩) = η(ℒ, 𝒩), a contradiction with (S4.32). □

Let us introduce some terminology and notation which will be used in the next two lemmas and their proofs. We use the shorthand A_m → A for lim_{m→∞} ‖A_m − A‖ = 0, where A_m, m = 1, 2, ..., and A are linear transformations on ℂⁿ. Note that A_m → A if and only if the entries of the matrix representations of A_m (in some fixed basis) converge to the corresponding entries
of A (represented as a matrix in the same basis). We say that a simple rectifiable contour Γ splits the spectrum of a linear transformation T if σ(T) ∩ Γ = ∅. In that case we can associate with T and Γ the Riesz projector

    P(T; Γ) = (1/2πi) ∮_Γ (Iλ − T)⁻¹ dλ.

The following observation will be used subsequently. If T is a linear transformation for which Γ splits the spectrum, then Γ splits the spectrum for every linear transformation S which is sufficiently close to T (i.e., ‖S − T‖ is close enough to zero). Indeed, if it were not so, we would have det(λ_m I − S_m) = 0 for some sequence λ_m ∈ Γ and S_m → T. Pick a subsequence λ_{m_k} → λ₀ for some λ₀ ∈ Γ; passing to the limit in the equality det(λ_{m_k} I − S_{m_k}) = 0 (here we use the matrix representations of S_{m_k} and T in some fixed basis, as well as the continuous dependence of det A on the entries of A), we obtain det(λ₀I − T) = 0, a contradiction with the splitting of σ(T).

We shall also use the notion of the angular operator. If P is a projector of ℂⁿ and ℳ is a subspace of ℂⁿ with Ker P ∔ ℳ = ℂⁿ, then there exists a unique linear transformation R from Im P into Ker P such that ℳ = {Rx + x : x ∈ Im P}. This transformation is called the angular operator of ℳ with respect to the projector P. We leave to the reader the (easy) verification of existence and uniqueness of the angular operator (or else see [3c, Chapter 5]).
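The angular operator can be computed explicitly from a basis of ℳ: applying P to m ∈ ℳ gives the Im P component x, and (I − P)m gives Rx. A small sketch (the 2 × 2 data is an illustrative assumption):

```python
import numpy as np

def angular_operator(P, M):
    """Angular operator R of the subspace spanned by the columns of M with
    respect to the projector P, so that the subspace is {Rx + x : x in Im P}.
    Requires Ker P (+) span(M) = C^n, i.e. P maps the subspace onto Im P."""
    X = P @ M                            # components in Im P
    Y = (np.eye(P.shape[0]) - P) @ M     # components in Ker P
    # R is determined by R(X) = Y; recover it column-wise via a pseudoinverse.
    return Y @ np.linalg.pinv(X)

# Illustrative example: P projects onto Span{e1} along Span{e2},
# and M spans the line {(t, 0.5 t)}.
P = np.array([[1.0, 0.0], [0.0, 0.0]])
M = np.array([[1.0], [0.5]])
R = angular_operator(P, M)               # maps e1 to 0.5 * e2
```

Indeed Rx + x = (t, 0.5t) for x = (t, 0), recovering the subspace spanned by M.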
Lemma S4.14. Let Γ be a simple rectifiable contour that splits the spectrum of T, let T₀ be the restriction of T to Im P(T; Γ), and let 𝒩 be a subspace of Im P(T; Γ). Then 𝒩 is a stable invariant subspace for T if and only if 𝒩 is a stable invariant subspace for T₀.
Proof. Suppose 𝒩 is a stable invariant subspace for T₀, but not for T. Then one can find ε > 0 such that for every positive integer m there exists a linear transformation S_m such that ‖S_m − T‖ < 1/m and

a > 0 and a sequence {S_m} such that S_m → T and

    θ(ℒ, ℳ) ≥ a,    ℒ ∈ Ω_m,    m = 1, 2, ...,    (S4.39)
where Ω_m denotes the set of all invariant subspaces of S_m. As 𝒩 is stable for T, one can find a sequence of subspaces {𝒩_m} such that S_m 𝒩_m ⊂ 𝒩_m and θ(𝒩_m, 𝒩) → 0. Further, since Γ splits the spectrum of T and S_m → T, the contour Γ will split the spectrum of S_m for m sufficiently large. But then, without loss of generality, we may assume that Γ splits the spectrum of each S_m. Again using S_m → T, it follows that P(S_m; Γ) → P(T; Γ). Let 𝒬 be a direct complement of 𝒩 in ℂⁿ. As θ(𝒩_m, 𝒩) → 0, we have ℂⁿ = 𝒬 ∔ 𝒩_m for m sufficiently large (Theorem S4.7). So, without loss of generality, we may assume that ℂⁿ = 𝒬 ∔ 𝒩_m for each m. Let R_m be the angular operator of 𝒩_m with respect to the projector of ℂⁿ onto 𝒩 along 𝒬, and put

    E_m = [I  R_m; 0  I],

where the block matrix corresponds to the decomposition ℂⁿ = 𝒬 ∔ 𝒩. Note that T_m = E_m⁻¹ S_m E_m leaves 𝒩 invariant. Because R_m → 0 we have E_m → I, and so T_m → T.
Clearly, Γ splits the spectrum of T|_𝒩. As T_m → T and 𝒩 is invariant for T_m, the contour Γ will split the spectrum of T_m|_𝒩 too, provided m is sufficiently large. But then we may assume that this happens for all m. Also, we have

    lim P(T_m|_𝒩; Γ) = P(T|_𝒩; Γ).

Hence ℳ_m = Im P(T_m|_𝒩; Γ) → Im P(T|_𝒩; Γ) = ℳ in the gap topology. Now consider ℒ_m = E_m ℳ_m. Then ℒ_m is an S_m-invariant subspace. From E_m → I it follows that θ(ℒ_m, ℳ_m) → 0. This, together with θ(ℳ_m, ℳ) → 0, gives θ(ℒ_m, ℳ) → 0. So we arrive at a contradiction to (S4.39), and the proof is complete. □

After this long preparation we are now able to give a short proof of Theorem S4.9.

Proof of Theorem S4.9. Suppose 𝒩 is a stable invariant subspace for A. Put 𝒩ⱼ = Pⱼ𝒩. Then 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r. By Lemma S4.15 the space 𝒩ⱼ is a stable invariant subspace for the restriction Aⱼ of A to Im Pⱼ. But Aⱼ has one eigenvalue only, namely λⱼ. So we may apply Lemma S4.12 to prove that 𝒩ⱼ has the desired form. Conversely, assume that each 𝒩ⱼ has the desired form, and let us prove that 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r is a stable invariant subspace for A. By Corollary S4.11 the space 𝒩ⱼ is a stable invariant subspace for the restriction Aⱼ of A to Im Pⱼ. Hence we may apply Lemma S4.14 to show that each 𝒩ⱼ is a stable invariant subspace for A. But then the same is true for the direct sum 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r. □
Comments

Sections S4.3 and S4.4 provide basic notions and results concerning the set of subspaces (in a finite-dimensional space); these topics are covered in [32c, 48] in the infinite-dimensional case. The proof of Theorem S4.5 is from [33, Chapter IV], and Theorem S4.7 appears in [34e]. The results of Sections S4.5–S4.7 are proved in [3b, 3c]; see also [12]. The book [3c] also contains additional information on stable invariant subspaces, as well as applications to factorization of matrix polynomials and rational functions, and to the stability of solutions of the algebraic Riccati equation. See also [3c, 4] for additional information concerning the notions of minimal opening and angular operator. We mention that there is a close relationship between solutions of matrix Riccati equations and invariant subspaces with special properties of a certain linear transformation. See [3c, 12, 15b, 16, 53] for detailed information.
Chapter S5
Indefinite Scalar Product Spaces
This chapter is devoted to some basic properties of finite-dimensional spaces with an indefinite scalar product. The results presented here are used in Part III. Attention is focused on the problem of describing all indefinite scalar products in which a given linear transformation is self-adjoint. A special canonical form is used for this description.

First, we introduce the basic definitions and conventions. Let 𝒳 be a finite-dimensional vector space with scalar product (x, y), x, y ∈ 𝒳, and let H be a self-adjoint linear transformation in 𝒳; i.e., (Hx, y) = (x, Hy) for all x, y ∈ 𝒳. This property allows us to define a new scalar product [x, y], x, y ∈ 𝒳, by the formula

    [x, y] = (Hx, y).

The scalar product [ , ] has all the properties of the usual scalar product, except that [x, x] may be positive, negative, or zero. More exactly, [x, y] has the following properties:

(1) [α₁x₁ + α₂x₂, y] = α₁[x₁, y] + α₂[x₂, y];
(2) [x, α₁y₁ + α₂y₂] = ᾱ₁[x, y₁] + ᾱ₂[x, y₂];
(3) [x, y] is the complex conjugate of [y, x];
(4) [x, x] is real for every x ∈ 𝒳.
Because of the lack of positivity (in general) of the form [x, x], the scalar product [ , ] is called indefinite, and 𝒳 endowed with the indefinite scalar product [ , ] will be called an indefinite scalar product space. We are particularly interested in the case when the scalar product [ , ] is nondegenerate, i.e., [x, y] = 0 for all y ∈ 𝒳 implies x = 0. This happens if and only if the underlying self-adjoint linear transformation H is invertible. This condition will be assumed throughout this chapter. Often we shall identify 𝒳 with ℂⁿ. In this case the standard scalar product is given by (x, y) = Σᵢ₌₁ⁿ xᵢȳᵢ, x = (x₁, ..., x_n)ᵀ ∈ ℂⁿ, y = (y₁, ..., y_n)ᵀ ∈ ℂⁿ, and an indefinite scalar product is determined by an n × n nonsingular hermitian (or self-adjoint) matrix H: [x, y] = (Hx, y) for all x, y ∈ ℂⁿ. Finally, a linear transformation A: 𝒳 → 𝒳 is called self-adjoint with respect to H (or H-self-adjoint) if [Ax, y] = [x, Ay] for all x, y ∈ 𝒳, where [x, y] = (Hx, y). This means HA = A*H.
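The condition HA = A*H is a plain matrix identity and is easy to check numerically. A minimal sketch (the matrices are illustrative assumptions): a real Jordan block is self-adjoint with respect to the sip-type matrix H below, although it is not self-adjoint in the usual sense.

```python
import numpy as np

H = np.array([[0.0, 1.0], [1.0, 0.0]])   # hermitian and invertible (indefinite)
A = np.array([[2.0, 1.0], [0.0, 2.0]])   # a Jordan block with eigenvalue 2

# Sanity checks on H: hermitian and nonsingular.
H_ok = np.allclose(H, H.conj().T) and abs(np.linalg.det(H)) > 1e-12

# A is H-self-adjoint iff HA = A*H.
is_H_selfadjoint = np.allclose(H @ A, A.conj().T @ H)
usual_selfadjoint = np.allclose(A, A.conj().T)
```

Here `is_H_selfadjoint` holds while `usual_selfadjoint` fails, illustrating that H-self-adjointness is a genuinely weaker notion than ordinary self-adjointness.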
S5.1. Canonical Form of a Self-Adjoint Matrix and the Indefinite Scalar Product

Consider the following problem: given a fixed n × n matrix A, describe all the self-adjoint and nonsingular matrices H such that A is self-adjoint in the indefinite scalar product [x, y] = (Hx, y), i.e., [Ax, y] = [x, Ay] or, what is the same, HA = A*H. Clearly, in order that such an H exist, it is necessary that A is similar to A*. We shall see later (Theorem S5.1 below) that this condition is also sufficient. For the spectral properties of A this condition means that the elementary divisors of Iλ − A are symmetric relative to the real line; i.e., if λ₀ (≠ λ̄₀) is an eigenvalue of Iλ − A with corresponding elementary divisors (λ − λ₀)^{αᵢ}, i = 1, ..., k, then λ̄₀ is also an eigenvalue of Iλ − A with the elementary divisors (λ − λ̄₀)^{αᵢ}, i = 1, ..., k. Now the problem posed above can be reformulated as follows: given a matrix A similar to A*, describe and classify all the self-adjoint nonsingular matrices which carry out this similarity.

Consider first the case A = I. Then any self-adjoint nonsingular matrix H is such that I is H-self-adjoint. Thus, the problem of description and classification of such matrices H is equivalent to the classical problem of reduction of H to the form S*PS for some nonsingular matrix S and diagonal matrix P such that P² = I. The equation H = S*PS is then nothing more than the reduction of the quadratic form (Hx, x) to a sum of squares.

To formulate results for any A we need a special construction connected with a Jordan matrix J which is similar to J*. Similarity between J and J* means that the number and the sizes of Jordan blocks in J corresponding to some eigenvalue λ₀ (≠ λ̄₀) and those corresponding to λ̄₀ are the same. We
now fix the structure of J in the following way: select a maximal set {λ₁, ..., λ_a} of eigenvalues of J containing no conjugate pair, and let {λ_{a+1}, ..., λ_{a+b}} be the distinct real eigenvalues of J. Put λ_{a+b+j} = λ̄ⱼ for j = 1, ..., a, and let

    J = diag[J₁, ..., J_{2a+b}],    (S5.1)

where Jᵢ = diag[Jᵢⱼ]ⱼ₌₁^{kᵢ} is a Jordan form with eigenvalue λᵢ and Jordan blocks Jᵢ,₁, ..., Jᵢ,_{kᵢ} of sizes αᵢ,₁ ≥ ⋯ ≥ αᵢ,_{kᵢ}, respectively. An α × α matrix whose (p, q) entry is 1 or 0 according as p + q = α + 1 or p + q ≠ α + 1 will be called a sip matrix (standard involutory permutation). An important role will be played in the sequel by the matrix P_{ε,J} connected with J as follows:

    P_{ε,J} = diag[P_c, P_r],    (S5.2)

where P_c is built from sip matrices associated with the nonreal eigenvalues λ₁, ..., λ_a and their conjugates (its form appears in the proof of Theorem S5.1 below), and

    P_r = diag[diag[εᵢⱼ Pᵢⱼ]ⱼ₌₁^{kᵢ}]ᵢ₌_{a+1}^{a+b}

with sip matrices Pᵢⱼ of sizes αᵢⱼ × αᵢⱼ and an ordered set of signs ε = (εᵢⱼ), i = a + 1, ..., a + b, j = 1, ..., kᵢ, εᵢⱼ = ±1. Using these notations the main result, which solves the problem stated at the beginning of this section, can now be formulated.
Theorem S5.1. The matrix A is self-adjoint relative to the scalar product (Hx, y) (where H = H* is nonsingular) if and only if

    T*HT = P_{ε,J},    T⁻¹AT = J    (S5.3)

for some invertible T, a matrix J in Jordan form, and a set of signs ε.

The matrix P_{ε,J} of (S5.3) will be called an A-canonical form of H with reducing transformation S = T⁻¹. It will be shown later (Theorem S5.6) that the set of signs ε is uniquely defined by A and H, up to certain permutations. The proof of Theorem S5.1 is quite long and needs some auxiliary results; it is the subject matter of the next section. Recall that the signature (denoted sig H) of a self-adjoint matrix H is the difference between the number of positive eigenvalues and the number of negative eigenvalues of H (in both cases counting multiplicities). It coincides with the difference between the number of positive squares and the number of negative squares in the canonical quadratic form determined by H.
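The "only if" half of Theorem S5.1 is easy to probe numerically: with T = I, the pair (J, P_{ε,J}) must itself satisfy P_{ε,J} J = J* P_{ε,J}. A sketch for one real eigenvalue with blocks of sizes 2 and 1 and signs (+1, −1) (the helper names are my own):

```python
import numpy as np

def sip(m):
    # Standard involutory permutation: ones on the antidiagonal.
    return np.fliplr(np.eye(m))

def jordan(lmbda, m):
    # Single Jordan block of size m with eigenvalue lmbda.
    return lmbda * np.eye(m) + np.diag(np.ones(m - 1), 1)

# J with one real eigenvalue 3, blocks of sizes 2 and 1, signs eps = (+1, -1).
J = np.block([[jordan(3.0, 2), np.zeros((2, 1))],
              [np.zeros((1, 2)), jordan(3.0, 1)]])
P = np.block([[+1 * sip(2), np.zeros((2, 1))],
              [np.zeros((1, 2)), -1 * sip(1)]])

# J is P-self-adjoint, i.e. (S5.3) holds with T = I:
ok = np.allclose(P @ J, J.conj().T @ P)
```

The same check works for any block structure, since each sip matrix satisfies the single-block identity P J = Jᵀ P.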
The following corollary connects the signature of H with the structure of real eigenvalues of H-self-adjoint matrices.

Corollary S5.2. Let A be an H-self-adjoint matrix, and let s be the signature of H. Then the real eigenvalues of A have at least |s| associated elementary divisors of odd degree.

Proof. It follows from the relation (S5.3) that s is also the signature of P_{ε,J}. It is readily verified that each conjugate pair of nonreal eigenvalues makes no contribution to the signature; neither do sip matrices associated with even-degree elementary divisors and real eigenvalues. A sip matrix associated with an odd-degree elementary divisor of a real eigenvalue contributes "+1" or "−1" to s, depending on the corresponding sign in the sequence ε. The conclusion then follows. □
Another corollary of Theorem S5.1 is connected with the simultaneous reduction of a pair of hermitian matrices A and B to a canonical form, where at least one of them is nonsingular.

Corollary S5.3. Let A, B be n × n hermitian matrices, and suppose that B is invertible. Then there exists a nonsingular matrix X such that

    X*AX = P_{ε,J} J,    X*BX = P_{ε,J},

where J is the Jordan normal form of B⁻¹A, and P_{ε,J} is defined as above.

Proof. Observe that the matrix B⁻¹A is B-self-adjoint. By Theorem S5.1, we have

    X⁻¹(B⁻¹A)X = J,    X*BX = P_{ε,J}

for some nonsingular matrix X. So A = BXJX⁻¹ and X*AX = X*BXJ = P_{ε,J} J. □
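In the definite special case discussed next, the reduction of Corollary S5.3 is easy to carry out numerically: for B positive definite, a Cholesky factor B = LL* turns the problem into an ordinary hermitian eigenproblem, and all signs come out +1 (so P_{ε,J} = I). A sketch with randomly generated data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)); A = A + A.T                   # hermitian
G = rng.standard_normal((n, n)); B = G @ G.T + n * np.eye(n)   # positive definite

# Reduce via B = L L^T: C = L^{-1} A L^{-T} is hermitian, so it has an
# orthonormal eigenbasis U; then X = L^{-T} U does the job.
L = np.linalg.cholesky(B)
C = np.linalg.solve(L, np.linalg.solve(L, A).T).T   # L^{-1} A L^{-T}
w, U = np.linalg.eigh(C)
X = np.linalg.solve(L.T, U)                          # X = L^{-T} U

ok_B = np.allclose(X.T @ B @ X, np.eye(n))           # P_{eps,J} = I here
ok_A = np.allclose(X.T @ A @ X, np.diag(w))          # J = diag of real eigenvalues
```

For indefinite B the Jordan form of B⁻¹A need not be diagonal and the signs ε genuinely enter, which is exactly what the corollary's canonical form records.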
In the case when B is positive definite (or negative definite) and A is hermitian, one can prove that the Jordan form of B⁻¹A is actually a diagonal matrix with real eigenvalues. In this case Corollary S5.3 reduces to the well-known result on the simultaneous reduction of a pair of quadratic forms, one of which is definite (positive or negative), to sums of squares. See, for instance, [22, Chapter X] for details.

S5.2. Proof of Theorem S5.1

The line of argument used to prove Theorem S5.1 is the successive reduction of the problem to the study of restrictions of A and H to certain invariant subspaces of A. This approach is justified by the results of the
next two lemmas. A subspace ℳ of ℂⁿ is said to be nondegenerate (with respect to H) if for every x ∈ ℳ\{0} there is a y ∈ ℳ such that (Hx, y) ≠ 0. For any subspace ℳ the orthogonal companion of ℳ with respect to H is

    ℳ^⟂ = {x ∈ ℂⁿ : (Hx, y) = 0 for all y ∈ ℳ}.

The first lemma says that, for a nondegenerate subspace, the orthogonal companion is also complementary, together with a converse statement.

Lemma S5.4. If ℳ is a nondegenerate subspace of ℂⁿ, then ℂⁿ = ℳ ∔ ℳ^⟂; and conversely, if ℳ is a subspace and ℂⁿ = ℳ ∔ ℳ^⟂, then ℳ is nondegenerate.
Proof. Note first that ℳ^⟂ is the image under H⁻¹ of the usual orthogonal complement (in the sense of the usual scalar product ( , )) of ℳ. So

    dim ℳ^⟂ + dim ℳ = n.

Hence it is sufficient to prove that ℳ ∩ ℳ^⟂ = {0} if and only if ℳ is nondegenerate. But this follows directly from the definition of a nondegenerate subspace. □

In the next lemma a matrix A is considered as a linear transformation with respect to the standard basis in ℂⁿ.
Lemma S5.5. Let A be self-adjoint with respect to H and let ℳ be a nondegenerate invariant subspace of A. Then

(a) ℳ^⟂ is an invariant subspace of A.
(b) If Q is the orthogonal projector onto ℳ (i.e., Q² = Q, Q* = Q, Im Q = ℳ), then A|_ℳ defines a linear transformation on ℳ which is self-adjoint with respect to the invertible self-adjoint linear transformation QH|_ℳ: ℳ → ℳ.

Proof. Part (a) is straightforward and is left to the reader. For part (b) note first that QH|_ℳ is self-adjoint, since H = H*, Q = Q*, and QH|_ℳ = QHQ|_ℳ. Further, QH|_ℳ is invertible in view of the invertibility of H and the nondegeneracy of ℳ. Then HA = A*H implies Q(HA)Q = Q(A*H)Q and, since AQ = QAQ,

    (QHQ)(QAQ) = (QA*Q)(QHQ),

from which the result follows. □
Now we can start with the proof of Theorem S5.1. If (S5.3) holds, then one can easily check that A is self-adjoint relative to H. Indeed, the equality P_{ε,J} J = J* P_{ε,J} is verified directly. Then

    HA = T*⁻¹ P_{ε,J} T⁻¹ · TJT⁻¹ = T*⁻¹ P_{ε,J} J T⁻¹ = T*⁻¹ J* P*_{ε,J} T⁻¹ = T*⁻¹ J* T* T*⁻¹ P_{ε,J} T⁻¹ = A*H,
i.e., A is self-adjoint relative to H.

Now let A be self-adjoint relative to the scalar product [x, y] = (Hx, y). Decompose ℂⁿ into a direct sum

    ℂⁿ = 𝒳₁ ∔ ⋯ ∔ 𝒳_a ∔ 𝒳_{a+1} ∔ ⋯ ∔ 𝒳_{a+b},    (S5.4)

where 𝒳ᵢ is the sum of the root subspaces of A corresponding to the eigenvalues λᵢ and λ̄ᵢ, i = 1, ..., a, and, for i = a + 1, ..., a + b, 𝒳ᵢ is the root subspace of A corresponding to the real eigenvalue λᵢ. We show first that 𝒳ᵢ and 𝒳ⱼ (i ≠ j, i, j = 1, ..., a + b) are orthogonal with respect to H, i.e., [x, y] = 0 for every x ∈ 𝒳ᵢ and y ∈ 𝒳ⱼ. Indeed, let x ∈ 𝒳ᵢ, y ∈ 𝒳ⱼ and assume first that (A − μᵢI)^s x = (A − μⱼI)^t y = 0, where μᵢ (resp. μⱼ) is equal to either λᵢ or λ̄ᵢ (resp. λⱼ or λ̄ⱼ), for some nonnegative integers s and t. We have to show that [x, y] = 0. It is easily verified that this is so when s = t = 1. Now assume inductively that [x′, y′] = 0 for all pairs x′ ∈ 𝒳ᵢ and y′ ∈ 𝒳ⱼ for which (A − μᵢI)^{s′} x′ = (A − μⱼI)^{t′} y′ = 0 with s′ + t′ < s + t. Consider x′ = (A − μᵢI)x, y′ = (A − μⱼI)y. Then, by the induction hypothesis, [x′, y] = [x, y′] = 0, i.e., μᵢ[x, y] = [Ax, y] and μ̄ⱼ[x, y] = [x, Ay]. But [Ax, y] = [x, Ay] (since A is self-adjoint with respect to H); so (μᵢ − μ̄ⱼ)[x, y] = 0. By the choice of 𝒳ᵢ and 𝒳ⱼ we have μᵢ ≠ μ̄ⱼ; so [x, y] = 0. As

    𝒳ᵢ = {x₁ + x₂ : (A − λᵢI)^{s₁} x₁ = (A − λ̄ᵢI)^{s₂} x₂ = 0 for some s₁ and s₂},

with an analogous representation for 𝒳ⱼ, it follows from the above that [x, y] = 0 for all x ∈ 𝒳ᵢ, y ∈ 𝒳ⱼ, where i ≠ j. It then follows from (S5.4) and Lemma S5.4 that each 𝒳ᵢ is nondegenerate.

Consider a fixed 𝒳ᵢ, where i = 1, ..., a (i.e., λᵢ ≠ λ̄ᵢ). Then 𝒳ᵢ = 𝒳′ᵢ ∔ 𝒳″ᵢ, where 𝒳′ᵢ (resp. 𝒳″ᵢ) is the root subspace of A corresponding to λᵢ (resp. λ̄ᵢ). As above, it follows that the subspaces 𝒳′ᵢ and 𝒳″ᵢ are isotropic with respect to H, i.e., [x, y] = 0 for either x, y ∈ 𝒳′ᵢ or x, y ∈ 𝒳″ᵢ. There exists an integer m with the properties that (A − λᵢI)^m|_{𝒳′ᵢ} = 0 but (A − λᵢI)^{m−1} a₁ ≠ 0 for some a₁ ∈ 𝒳′ᵢ. Since 𝒳ᵢ is nondegenerate and 𝒳′ᵢ is isotropic, there exists b₁ ∈ 𝒳″ᵢ such that [(A − λᵢI)^{m−1} a₁, b₁] = 1. Define sequences a₁, ..., a_m ∈ 𝒳′ᵢ and b₁, ..., b_m ∈ 𝒳″ᵢ by

    aⱼ = (A − λᵢI)^{j−1} a₁,    bⱼ = (A − λ̄ᵢI)^{j−1} b₁,    j = 1, ..., m.

Observe that

    [a₁, b_m] = [a₁, (A − λ̄ᵢI)^{m−1} b₁] = [(A − λᵢI)^{m−1} a₁, b₁] = 1;

in particular, b_m ≠ 0. Further, for every x ∈ 𝒳′ᵢ we have

    [x, (A − λ̄ᵢI) b_m] = [x, (A − λ̄ᵢI)^m b₁] = [(A − λᵢI)^m x, b₁] = 0;
so the vector (A − λ̄ᵢI)b_m is H-orthogonal to 𝒳ᵢ. In view of (S5.4) we deduce that (A − λ̄ᵢI)b_m is H-orthogonal to ℂⁿ, and hence

    (A − λ̄ᵢI)b_m = 0.

Then clearly a_m, ..., a₁ (resp. b_m, ..., b₁) is a Jordan chain of A corresponding to λᵢ (resp. λ̄ᵢ), i.e.,

    Aaⱼ = λᵢaⱼ + a_{j+1} for j = 1, 2, ..., m − 1,  and  Aa_m = λᵢa_m,

with analogous relations for the bⱼ (replacing λᵢ by λ̄ᵢ). For j + k = m + 1 we have

    [aⱼ, b_k] = [(A − λᵢI)^{j−1} a₁, (A − λ̄ᵢI)^{k−1} b₁] = [(A − λᵢI)^{j+k−2} a₁, b₁] = 1,    (S5.5)

and analogously

    [aⱼ, b_k] = 0 for j + k > m + 1.    (S5.6)

Now put

    c₁ = a₁ + Σⱼ₌₂^m αⱼaⱼ,    c_{j+1} = (A − λᵢI)cⱼ, j = 1, ..., m − 1,

where α₂, ..., α_m are chosen so that

    [c₁, b_{m−1}] = [c₁, b_{m−2}] = ⋯ = [c₁, b₁] = 0.

Such a choice is possible, as can be checked easily using (S5.5) and (S5.6). Now for j + k ≤ m

    [cⱼ, b_k] = [(A − λᵢI)^{j−1} c₁, b_k] = [c₁, (A − λ̄ᵢI)^{j−1} b_k] = [c₁, b_{k+j−1}] = 0,

and for j + k ≥ m + 1 we obtain, using (A − λᵢI)^m a₁ = 0 together with (S5.5), (S5.6):

    [cⱼ, b_k] = [(A − λᵢI)^{j−1} c₁, (A − λ̄ᵢI)^{k−1} b₁] = [(A − λᵢI)^{j+k−2} c₁, b₁] = [(A − λᵢI)^{j+k−2} a₁, b₁] = 1 for j + k = m + 1, and = 0 for j + k > m + 1.
Let 𝒩₁ = Span{c₁, ..., c_m, b₁, ..., b_m}. The relations above show that A|_{𝒩₁} = J₁ ⊕ J̄₁ in the basis c₁, ..., c_m, b₁, ..., b_m, where J₁ is the Jordan block of size m with eigenvalue λᵢ, and

    [x, y] = y* [0  P₁; P₁  0] x,    x, y ∈ 𝒩₁,
in the same basis, and P₁ is the sip matrix of size m. We see from this representation that 𝒩₁ is nondegenerate. By Lemma S5.4, ℂⁿ = 𝒩₁ ∔ 𝒩₁^⟂, and by Lemma S5.5, 𝒩₁^⟂ is an invariant subspace for A. If A|_{𝒩₁^⟂} has nonreal eigenvalues, apply the same procedure to construct a subspace 𝒩₂ ⊂ 𝒩₁^⟂ with basis c′₁, ..., c′_{m′}, b′₁, ..., b′_{m′}, such that in this basis A|_{𝒩₂} = J₂ ⊕ J̄₂, where J₂ is the Jordan block of size m′ with nonreal eigenvalue, and

    [x, y] = y* [0  P₂; P₂  0] x,
with the sip matrix P₂ of size m′. Continue this procedure until the nonreal eigenvalues of A are exhausted.

Consider now a fixed 𝒳ᵢ, where i = a + 1, ..., a + b, so that λᵢ is real. Again, let m be such that (A − λᵢI)^m|_{𝒳ᵢ} = 0 but (A − λᵢI)^{m−1}|_{𝒳ᵢ} ≠ 0. Let Qᵢ be the orthogonal projector onto 𝒳ᵢ and define F: 𝒳ᵢ → 𝒳ᵢ by

    F = Qᵢ H (A − λᵢI)^{m−1}|_{𝒳ᵢ}.
Since λᵢ is real, it is easily seen that F is self-adjoint. Moreover, F ≠ 0; so there is a nonzero eigenvalue of F (necessarily real) with an eigenvector a₁. Normalize a₁ so that

    [(A − λᵢI)^{m−1} a₁, a₁] = ε,    ε = ±1.    (S5.7)

Let aⱼ = (A − λᵢI)^{j−1} a₁, j = 1, ..., m. It follows from (S5.7) that for j + k = m + 1

    [aⱼ, a_k] = [(A − λᵢI)^{j−1} a₁, (A − λᵢI)^{k−1} a₁] = [(A − λᵢI)^{m−1} a₁, a₁] = ε.    (S5.8)

Moreover, for j + k > m + 1 we have

    [aⱼ, a_k] = [(A − λᵢI)^{j+k−2} a₁, a₁] = 0    (S5.9)

in view of the choice of m. Now put

    b₁ = a₁ + α₂a₂ + ⋯ + α_m a_m,    bⱼ = (A − λᵢI)^{j−1} b₁, j = 1, ..., m,

and choose the αᵢ so that

    [b₁, b₁] = [b₁, b₂] = ⋯ = [b₁, b_{m−1}] = 0.

Such a choice of the αᵢ is possible. Indeed, the equality [b₁, bⱼ] = 0 (j = 1, ..., m − 1)
gives, in view of (S5.8) and (S5.9),

    0 = [a₁ + α₂a₂ + ⋯ + α_m a_m, aⱼ + α₂a_{j+1} + ⋯ + α_{m−j+1} a_m] = [a₁, aⱼ] + 2εα_{m−j+1} + (terms in α₂, ..., α_{m−j}).
Evidently, these equalities determine unique numbers α₂, α₃, ..., α_m in succession. Let 𝒩 = Span{b₁, ..., b_m}. In the basis b₁, ..., b_m the linear transformation A|_𝒩 is represented by the single Jordan block with eigenvalue λᵢ, and

    [x, y] = y* ε P₀ x,    x, y ∈ 𝒩,

where P₀ is the sip matrix of size m. Continue the procedure on the orthogonal companion to 𝒩, and so on. Applying this construction, we find a basis f₁, ..., f_n in ℂⁿ such that A is represented by the Jordan matrix J of (S5.1) in this basis and, with P_{ε,J} as defined in (S5.2),

    [x, y] = y* P_{ε,J} x,

where x and y are represented by their coordinates in the basis f₁, ..., f_n. Let T be the n × n invertible matrix whose ith column is formed by the coordinates of fᵢ (in the standard orthonormal basis), i = 1, ..., n. For such a T, the relation T⁻¹AT = J holds because f₁, ..., f_n is a Jordan basis for A, and the equality T*HT = P_{ε,J} follows from (S5.5), (S5.6), and (S5.8). So (S5.3) holds, and Theorem S5.1 is proved completely. □

S5.3. Uniqueness of the Sign Characteristic

Let H be an n × n hermitian nonsingular matrix, and let A be some H-self-adjoint matrix. If J is a Jordan form of A, then in view of Theorem S5.1, H admits an A-canonical form P_{ε,J}. We suppose that the order of the Jordan blocks in J is fixed as explained in the first section. The set of signs ε in P_{ε,J} will be called the A-sign characteristic of H. The problem considered in this section is that of the uniqueness of ε. Recall that ε = {ε_{a+j,i}}, j = 1, 2, ..., b, i = 1, 2, ..., k_{a+j}, in the notation of Section S5.1. Two sets of signs ε^{(r)} = {ε^{(r)}_{i,j}}, r = 1, 2, will be said to be equivalent if one can be obtained from the other by a permutation of signs within the subsets corresponding to Jordan blocks of J having the same size and the same real eigenvalue.
Theorem S5.6. Let A be an H-self-adjoint matrix. Then the A-sign characteristic of H is defined uniquely up to equivalence.

Note that in the special case A = I, this theorem states that in any reduction of the quadratic form (Hx, x) to a sum of squares, the number of positive coefficients and the number of negative coefficients are invariant, i.e., the classical inertia law.
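Before turning to the proof, the mechanism behind uniqueness can be sanity-checked numerically: the proof below extracts the signs from signatures of the hermitian matrices H(αI − A)^k, and these signatures are computable. A sketch with A = J, H = P_{ε,J}, one real eigenvalue α = 2, block sizes 3 and 1 (the helper names are my own; the power k = m₁ − 1 = 2 is even here, so no extra sign bookkeeping is needed):

```python
import numpy as np

def sig(M, tol=1e-9):
    # Signature of a hermitian matrix: (# positive) - (# negative) eigenvalues.
    w = np.linalg.eigvalsh(M)
    return int(np.sum(w > tol) - np.sum(w < -tol))

def sip(m):
    return np.fliplr(np.eye(m))

def jordan(lmbda, m):
    return lmbda * np.eye(m) + np.diag(np.ones(m - 1), 1)

alpha, eps1, eps2 = 2.0, +1, -1
A = np.block([[jordan(alpha, 3), np.zeros((3, 1))],
              [np.zeros((1, 3)), jordan(alpha, 1)]])
H = np.block([[eps1 * sip(3), np.zeros((3, 1))],
              [np.zeros((1, 3)), eps2 * sip(1)]])

# Signatures of H(alpha*I - A)^k recover the signs block size by block size:
s2 = sig(H @ np.linalg.matrix_power(alpha * np.eye(4) - A, 2))  # only the size-3 block survives
s0 = sig(H)                                                     # sum over both blocks
```

Here `s2` equals ε₁ and `s0` equals ε₁ + ε₂, so the two signatures determine both signs — the finite-dimensional shadow of the argument that follows.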
Proof. Without loss of generality a Jordan form for A can be fixed, with the conventions of Eq. (S5.1). It is then to be proved that if P_{ε,J} and P_{δ,J} are A-canonical forms for H, then the sets of signs ε and δ are equivalent. It is evident from the definition of an A-canonical form (Eqs. (S5.1) and (S5.2)) that the conclusion of the theorem follows if it can be established for an A having just one real eigenvalue, say α. Thus, the Jordan form J for A is assumed to have kᵢ blocks of size mᵢ, i = 1, 2, ..., t, where m₁ > m₂ > ⋯ > m_t, and we may write

    J = diag[diag[Jᵢ]ⱼ₌₁^{kᵢ}]ᵢ₌₁^t    and    P_{ε,J} = diag[diag[εⱼ,ᵢ Pᵢ]ⱼ₌₁^{kᵢ}]ᵢ₌₁^t,

where Jᵢ is the Jordan block of size mᵢ with eigenvalue α and Pᵢ is the sip matrix of size mᵢ, with a similar expression for P_{δ,J}, replacing εⱼ,ᵢ by signs δⱼ,ᵢ. It is proved in Theorem S5.1 that, for some nonsingular S, H = S* P_{ε,J} S and A = S⁻¹JS. It follows that for any nonnegative integer k,

    H(Iα − A)^k = S* P_{ε,J} (Iα − J)^k S,    (S5.10)

and this is a relation between hermitian matrices. Observe that (Iα − Jᵢ)^{m₁−1} = 0 for i = 2, 3, ..., t, and that (Iα − J₁)^{m₁−1} has all entries zero except for the entry in the upper right corner, which is 1. Consequently,

    P_{ε,J}(Iα − J)^{m₁−1} = diag[ε₁,₁ P₁ (Iα − J₁)^{m₁−1}, ..., ε₁,_{k₁} P₁ (Iα − J₁)^{m₁−1}, 0, ..., 0].

Using this representation in (S5.10) with k = m₁ − 1, we conclude that Σᵢ₌₁^{k₁} ε₁,ᵢ coincides with the signature of H(Iα − A)^{m₁−1}. But precisely the same argument applies using the A-canonical form P_{δ,J} for H, and it is concluded that

    Σᵢ₌₁^{k₁} ε₁,ᵢ = Σᵢ₌₁^{k₁} δ₁,ᵢ.    (S5.11)

Consequently, the subsets {ε₁,₁, ..., ε₁,_{k₁}} and {δ₁,₁, ..., δ₁,_{k₁}} of ε, δ are equivalent. Now examine the hermitian matrix P_{ε,J}(Iα − J)^{m₂−1}. This is found to be block diagonal with nonzero blocks of the form

    ε₁,ᵢ P₁ (Iα − J₁)^{m₂−1}, i = 1, 2, ..., k₁,    and    ε₂,ⱼ P₂ (Iα − J₂)^{m₂−1}, j = 1, 2, ..., k₂.
Consequently, using (S5.10) with k = m_2 − 1, the signature of H(Iα − A)^{m_2−1} is given by

    (Σ_{i=1}^{k_1} ε_{1,i}) · sig[P_1(Iα − J_1)^{m_2−1}] + Σ_{j=1}^{k_2} ε_{2,j}.

But again this must be equal to the corresponding expression formulated using δ instead of ε. Hence, using (S5.11), it is found that

    Σ_{j=1}^{k_2} ε_{2,j} = Σ_{j=1}^{k_2} δ_{2,j},

and the subsets {ε_{2,1}, ..., ε_{2,k_2}} and {δ_{2,1}, ..., δ_{2,k_2}} of ε and δ are equivalent. Now it is clear that the argument can be continued for t steps, after which the equivalence of ε and δ is established. □

It follows from Theorem S5.6 that the A-sign characteristic of H is uniquely defined if we apply the following normalization rule: in every subset of signs corresponding to the Jordan blocks of the same size and the same eigenvalue, +1s (if any) precede −1s (if any). We give a simple example for illustration.

Example S5.1.
Let

    J = diag[ [λ_1 1; 0 λ_1], [λ_2 1; 0 λ_2] ],

where λ_1 and λ_2 are distinct real numbers. Then

    P_{ε,J} = diag[ [0 ε_1; ε_1 0], [0 ε_2; ε_2 0] ]    for ε = (ε_1, ε_2), ε_i = ±1.

According to Theorem S5.1 and Theorem S5.6, the set Ω = Ω_J of all invertible self-adjoint matrices H such that J is H-self-adjoint splits into 4 disjoint sets Ω_1, Ω_2, Ω_3, Ω_4 corresponding to the sets of signs (+1, +1), (+1, −1), (−1, +1), and (−1, −1), respectively. An easy computation shows that each set Ω_i consists of all matrices H of the form

    H = diag[ [0 ε_1 a_1; ε_1 a_1 b_1], [0 ε_2 a_2; ε_2 a_2 b_2] ],

where a_1, a_2 are positive and b_1, b_2 are real parameters, and ε_1, ε_2 are ±1 depending on the set Ω_i. Note also that each set Ω_i (i = 1, 2, 3, 4) is connected. □
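The defining relation of Ω_J can be checked numerically. The sketch below is a hypothetical illustration: the two eigenvalues (0 and 1) and the parameter values a_i, b_i are arbitrary choices, not values from the text. It verifies that block matrices H of the displayed form are hermitian, invertible, and make J H-self-adjoint, i.e., HJ = J*H.

```python
import numpy as np

def jordan2(lam):
    # 2x2 Jordan block with eigenvalue lam
    return np.array([[lam, 1.0], [0.0, lam]])

# hypothetical distinct real eigenvalues for the two blocks
J = np.block([[jordan2(0.0), np.zeros((2, 2))],
              [np.zeros((2, 2)), jordan2(1.0)]])

def H_block(eps, a, b):
    # the 2x2 block [0, eps*a; eps*a, b] from the example
    return np.array([[0.0, eps * a], [eps * a, b]])

for (e1, e2) in [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]:
    H = np.block([[H_block(e1, 2.0, -0.7), np.zeros((2, 2))],
                  [np.zeros((2, 2)), H_block(e2, 0.5, 3.0)]])
    assert np.allclose(H, H.conj().T)           # H is hermitian
    assert abs(np.linalg.det(H)) > 1e-12        # H is invertible
    assert np.allclose(H @ J, J.conj().T @ H)   # J is H-self-adjoint
```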
S5.4. Second Description of the Sign Characteristic

Let H be an n × n hermitian nonsingular matrix, and let A be H-self-adjoint. In the main text we make use of another description of the A-sign characteristic of H, which is given below. In view of Theorem S5.1, H admits an A-canonical form P_{ε,J}, where J is the Jordan form of A:

    H = S*P_{ε,J}S.    (S5.12)

Here S is a nonsingular matrix such that SA = JS, and ε is the A-sign characteristic of H. We can suppose that the basis is the standard orthonormal basis in ℂⁿ. Let λ_0 be a fixed real eigenvalue of A, and let Ψ_1 ⊂ ℂⁿ be the subspace spanned by the eigenvectors of Iλ − A corresponding to λ_0. For x ∈ Ψ_1\{0} denote by ν(x) the maximal length of a Jordan chain beginning with the eigenvector x. Let Ψ_i, i = 1, 2, ..., γ (γ = max{ν(x) | x ∈ Ψ_1\{0}}), be the subspace of Ψ_1 spanned by all x ∈ Ψ_1 with ν(x) ≥ i. Then

    Ker(Iλ_0 − A) = Ψ_1 ⊇ Ψ_2 ⊇ ··· ⊇ Ψ_γ.
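An eigenvector x has ν(x) ≥ i exactly when it can be continued to a chain of length i, i.e., when x ∈ Im(A − λ_0 I)^{i−1}; hence Ψ_i = Ker(A − λ_0 I) ∩ Im(A − λ_0 I)^{i−1}. A minimal numerical sketch (the 3 × 3 matrix below is a hypothetical illustration, not from the text), with one Jordan block of size 2 and one of size 1 at λ_0 = 0, so dim Ψ_1 = 2 while dim Ψ_2 = 1:

```python
import numpy as np

def null_basis(M, tol=1e-10):
    # columns spanning Ker M, via SVD
    u, s, vh = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vh[rank:].conj().T

def dim_span_intersection(U, V):
    # dim(col U ∩ col V) = rank U + rank V - rank [U V]
    r = np.linalg.matrix_rank
    return r(U) + r(V) - r(np.hstack([U, V]))

A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])
B = A - 0.0 * np.eye(3)            # A - lambda0*I with lambda0 = 0
K = null_basis(B)                  # Psi_1 = Ker(A - lambda0*I)
assert K.shape[1] == 2             # dim Psi_1 = 2: eigenvectors e1, e3
# Psi_2 = Psi_1 ∩ Im(A - lambda0*I): only e1 starts a chain of length 2
assert dim_span_intersection(K, B) == 1
```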
The following result describes the A-sign characteristic of H in terms of certain bilinear forms defined on the Ψ_i.

Theorem S5.7. For i = 1, ..., γ, let

    f_i(x, y) = (x, Hy^{(i)}),    x, y ∈ Ψ_i,

where y^{(1)} = y and y^{(1)}, y^{(2)}, ..., y^{(i)} is a Jordan chain of Iλ − A corresponding to the real eigenvalue λ_0 with the eigenvector y. Then:

(i) f_i(x, y) does not depend on the choice of y^{(2)}, ..., y^{(i)};
(ii) for some self-adjoint linear transformation G_i: Ψ_i → Ψ_i,

    f_i(x, y) = (x, G_i y),    x, y ∈ Ψ_i;

(iii) for G_i of (ii), Ψ_{i+1} = Ker G_i (by definition Ψ_{γ+1} = {0});
(iv) the number of positive (negative) eigenvalues of G_i, counting multiplicities, coincides with the number of positive (negative) signs in the A-sign characteristic of H corresponding to the Jordan blocks of J with eigenvalue λ_0 and size i.

Proof. By (S5.12) we have f_i(x, y) = (Sx, P_{ε,J}Sy^{(i)}), x, y ∈ Ψ_i. Clearly, Sx and Sy^{(1)}, ..., Sy^{(i)} are an eigenvector and a Jordan chain, respectively, of Iλ − J corresponding to λ_0. Thus, the proof is reduced to the case A = J and H = P_{ε,J}. But in this case the assertions (i)-(iv) can be checked without difficulties. □
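For a single 2 × 2 Jordan block J at λ_0 = 0 with H = P_ε, ε = +1, the assertions can be checked directly. The sketch below verifies that f_1 vanishes on Ψ_1 (so Ker G_1 = Ψ_1 = Ψ_2, assertion (iii)), that f_2 does not depend on the choice of the second chain vector (assertion (i)), and that its sign reproduces the sign +1 of the size-2 block (assertion (iv)).

```python
import numpy as np

J = np.array([[0., 1.],
              [0., 0.]])
H = np.array([[0., 1.],
              [1., 0.]])
assert np.allclose(H @ J, J.conj().T @ H)   # J is H-self-adjoint

e1, e2 = np.eye(2)
# f_1(e1, e1) = (e1, H y^(1)) with y^(1) = e1: G_1 vanishes on Psi_1
f1 = e1.conj() @ (H @ e1)
assert f1 == 0.0

# f_2(e1, e1) = (e1, H y^(2)) for a Jordan chain y^(1) = e1, y^(2):
# J y^(2) = e1 gives y^(2) = e2 + c*e1 for any c
for c in [0.0, -3.5, 2.0]:
    y2 = e2 + c * e1
    assert np.allclose(J @ y2, e1)          # chain condition
    f2 = e1.conj() @ (H @ y2)
    assert f2 == 1.0                        # independent of c; positive sign
```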
Comments

The presentation in this chapter follows a part of the authors' paper [34f]. Indefinite scalar products in infinite-dimensional spaces have been studied extensively; see [8, 43, 51] for the theory of indefinite scalar product spaces and some of its applications. Results close to Theorem S5.1 appear in [59, Chapter 7] and [77]. Applications of the theory of indefinite scalar product spaces to solutions of algebraic Riccati equations and related problems in linear control theory are found in [53, 70d, 70f].
Chapter S6

Analytic Matrix Functions

The main result in this chapter is the perturbation theorem for self-adjoint matrices (Theorem S6.3). The proof is based on auxiliary results on analytic matrix functions from Section S6.1, which is also used in the main text.

S6.1. General Results

Here we consider problems concerning the existence of analytic bases in the image and in the kernel of an analytic matrix-valued function. The following theorem provides basic results in this direction.

Theorem S6.1. Let A(ε), ε ∈ Ω, be an n × n complex matrix-valued function which is analytic in a domain Ω ⊂ ℂ. Let r = max_{ε∈Ω} rank A(ε). Then there exist n-dimensional analytic (in Ω) vector-valued functions y_1(ε), ..., y_n(ε) with the following properties:
(i) y_1(ε), ..., y_r(ε) are linearly independent for every ε ∈ Ω;
(ii) y_{r+1}(ε), ..., y_n(ε) are linearly independent for every ε ∈ Ω;
(iii)

    Span{y_1(ε), ..., y_r(ε)} = Im A(ε)    (S6.1)

and

    Span{y_{r+1}(ε), ..., y_n(ε)} = Ker A(ε)    (S6.2)

for every ε ∈ Ω, except for a set of isolated points which consists exactly of those ε_0 ∈ Ω for which rank A(ε_0) < r. For such exceptional ε_0 ∈ Ω, the inclusions

    Span{y_1(ε_0), ..., y_r(ε_0)} ⊃ Im A(ε_0)    (S6.3)

and

    Span{y_{r+1}(ε_0), ..., y_n(ε_0)} ⊂ Ker A(ε_0)    (S6.4)

hold.

We remark that Theorem S6.1 includes the case in which A(ε) is an analytic function of the real variable ε, i.e., in a neighborhood of every real point ε_0 the entries of A(ε) = (a_{ik}(ε))_{i,k=1}^{n} can be expressed as power series in ε − ε_0:
    a_{ik}(ε) = Σ_{j=0}^{∞} a_j(ε − ε_0)^j,    (S6.5)

where the a_j are complex numbers depending on i, k, and ε_0, and the power series converge in some real neighborhood of ε_0. (Indeed, the power series (S6.5) then converges also in some complex neighborhood of ε_0, so in fact A(ε) is analytic in some complex neighborhood Ω of the real line.) Theorem S6.1 will be used in this form in the next section. The proof of Theorem S6.1 is based on the following lemma.
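The statement can be made concrete on a small hypothetical example (not from the text): A(ε) = [ε ε²; 0 0] has r = 1, with the rank dropping only at the isolated point ε = 0. The analytic vectors y_1 = (1, 0)^T and y_2(ε) = (−ε, 1)^T stay linearly independent for every ε and realize (S6.1)-(S6.4), with only the inclusions surviving at ε = 0.

```python
import numpy as np

def A(eps):
    return np.array([[eps, eps**2],
                     [0.0, 0.0]])

y1 = np.array([1.0, 0.0])
def y2(eps):
    return np.array([-eps, 1.0])

for eps in [0.7, -1.3, 0.01]:
    assert np.linalg.matrix_rank(A(eps)) == 1
    # Im A(eps) = span{y1}: both columns are multiples of (1, 0)
    assert np.allclose(A(eps), np.outer(y1, [eps, eps**2]))
    # Ker A(eps) = span{y2(eps)}
    assert np.allclose(A(eps) @ y2(eps), 0.0)

# exceptional point eps = 0: only the inclusions (S6.3), (S6.4) remain,
# since Im A(0) = {0} and Ker A(0) is the whole space
assert np.allclose(A(0.0), 0.0)
assert np.allclose(A(0.0) @ y2(0.0), 0.0)
```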
Lemma S6.2. Let x_1(ε), ..., x_r(ε), ε ∈ Ω, be n-dimensional vector-valued functions which are analytic in a domain Ω in the complex plane. Assume that for some ε_0 ∈ Ω the vectors x_1(ε_0), ..., x_r(ε_0) are linearly independent. Then there exist n-dimensional vector functions y_1(ε), ..., y_r(ε), ε ∈ Ω, with the following properties:

(i) y_1(ε), ..., y_r(ε) are analytic on Ω;
(ii) y_1(ε), ..., y_r(ε) are linearly independent for every ε ∈ Ω;
(iii) Span{y_1(ε), ..., y_r(ε)} = Span{x_1(ε), ..., x_r(ε)} (⊂ ℂⁿ)

for every ε ∈ Ω\Ω_0, where Ω_0 = {ε ∈ Ω | x_1(ε), ..., x_r(ε) are linearly dependent}.
Proof. We shall proceed by induction on r. Consider first the case r = 1. Let g(ε) be an analytic scalar function in Ω with the property that every zero of g(ε) is also a zero of x_1(ε) having the same multiplicity, and vice versa. The existence of such a g(ε) is ensured by the Weierstrass theorem (concerning the construction of an analytic function with prescribed zeros and corresponding multiplicities); see, for instance, [63, Vol. III, Chapter 3].
Put y_1(ε) = (g(ε))^{−1}x_1(ε) to prove Lemma S6.2 in the case r = 1. We pass now to the general case. Using the induction assumption, we can suppose that x_1(ε), ..., x_{r−1}(ε) are linearly independent for every ε ∈ Ω. Let X_0(ε) be an r × r submatrix of the n × r matrix [x_1(ε), ..., x_r(ε)] such that det X_0(ε_0) ≠ 0. It is well known in the theory of analytic functions that the set of zeros of the not identically zero analytic function det X_0(ε) is discrete, i.e., it is either empty or consists of isolated points. Since det X_0(ε) ≠ 0 implies that the vectors x_1(ε), ..., x_r(ε) are linearly independent, it follows that the set Ω_0 = {ε ∈ Ω | x_1(ε), ..., x_r(ε) are linearly dependent} is also discrete. Disregarding the trivial case when Ω_0 is empty, we can write Ω_0 = {ζ_1, ζ_2, ...}, where ζ_i ∈ Ω, i = 1, 2, ..., is a finite or countable sequence with no limit points inside Ω. Let us show that for every j = 1, 2, ... there exist a positive integer s_j and scalar functions a_{1j}(ε), ..., a_{r−1,j}(ε), analytic in a neighborhood of ζ_j, such that the system of n-dimensional analytic vector functions in Ω

    x_1(ε), ..., x_{r−1}(ε), (ε − ζ_j)^{−s_j}[x_r(ε) + Σ_{i=1}^{r−1} a_{ij}(ε)x_i(ε)]    (S6.6)

has the following properties: for ε ≠ ζ_j it is linearly equivalent to the system x_1(ε), ..., x_r(ε) (i.e., both systems span the same subspace in ℂⁿ); for ε = ζ_j it is linearly independent.

Indeed, consider the n × r matrix B(ε) whose columns are formed by x_1(ε), ..., x_r(ε). By the induction hypothesis, there exists an (r − 1) × (r − 1) submatrix B_0(ε) in the first r − 1 columns of B(ε) such that det B_0(ζ_j) ≠ 0. For simplicity of notation suppose that B_0(ε) is formed by the first r − 1 columns and rows in B(ε); so

    B(ε) = [B_0(ε)  B_1(ε); B_2(ε)  B_3(ε)],

where B_1(ε), B_2(ε), and B_3(ε) are of sizes (r − 1) × 1, (n − r + 1) × (r − 1), and (n − r + 1) × 1, respectively. Since B_0(ε) is invertible in a neighborhood of ζ_j, we can write

    B(ε) = [I  0; B_2(ε)B_0^{−1}(ε)  I] · [B_0(ε)  0; 0  Z(ε)] · [I  B_0^{−1}(ε)B_1(ε); 0  I],    (S6.7)

where Z(ε) = B_3(ε) − B_2(ε)B_0^{−1}(ε)B_1(ε) is an (n − r + 1) × 1 matrix. Let s_j be the multiplicity of ζ_j as a zero of the vector function Z(ε). Consider the matrix function

    B̃(ε) = [I  0; B_2(ε)B_0^{−1}(ε)  I] · [B_0(ε)  0; 0  (ε − ζ_j)^{−s_j}Z(ε)].
Clearly, the columns b̃_1(ε), ..., b̃_r(ε) of B̃(ε) are analytic and linearly independent vector functions in a neighborhood U(ζ_j) of ζ_j. From formula (S6.7) it is clear that Span{x_1(ε), ..., x_r(ε)} = Span{b̃_1(ε), ..., b̃_r(ε)} for ε ∈ U(ζ_j)\{ζ_j}. Further, from (S6.7) we obtain

    B̃(ε) = [B_0(ε)  0; B_2(ε)  (ε − ζ_j)^{−s_j}Z(ε)]

and

    [0; (ε − ζ_j)^{−s_j}Z(ε)] = (ε − ζ_j)^{−s_j} B(ε) [−B_0^{−1}(ε)B_1(ε); 1].

So the columns b̃_1(ε), ..., b̃_r(ε) of B̃(ε) have the form (S6.6), where the a_{ij}(ε) are analytic scalar functions in a neighborhood of ζ_j. Now choose y_1(ε), ..., y_r(ε) in the form

    y_i(ε) = x_i(ε), i = 1, ..., r − 1;    y_r(ε) = Σ_{i=1}^{r} g_i(ε)x_i(ε),
where the scalar functions g_i(ε) are constructed as follows:

(a) g_r(ε) is analytic and different from zero in Ω except for the set of poles ζ_1, ζ_2, ..., with corresponding multiplicities s_1, s_2, ...;
(b) the functions g_i(ε) (for i = 1, ..., r − 1) are analytic in Ω except for the poles ζ_1, ζ_2, ..., and the singular part of g_i(ε) at ζ_j (for j = 1, 2, ...) is equal to the singular part of a_{ij}(ε)g_r(ε) at ζ_j.

Let us check the existence of such functions g_i(ε). Let g_r(ε) be the inverse of an analytic function with zeros at ζ_1, ζ_2, ..., with corresponding multiplicities s_1, s_2, ... (such an analytic function exists by the Weierstrass theorem mentioned above). The functions g_1(ε), ..., g_{r−1}(ε) are constructed using the Mittag-Leffler theorem (see [63, Vol. III, Chapter 3]). Property (a) ensures that y_1(ε), ..., y_r(ε) are linearly independent for every ε ∈ Ω\{ζ_1, ζ_2, ...}. In a neighborhood of each ζ_j we have

    y_r(ε) = Σ_{i=1}^{r−1} (g_i(ε) − a_{ij}(ε)g_r(ε))x_i(ε) + g_r(ε)(x_r(ε) + Σ_{i=1}^{r−1} a_{ij}(ε)x_i(ε))
           = (ε − ζ_j)^{−s_j}[x_r(ε) + Σ_{i=1}^{r−1} a_{ij}(ε)x_i(ε)]
             + {linear combination of x_1(ζ_j), ..., x_{r−1}(ζ_j)} + ···,    (S6.8)

where the final ellipsis denotes an analytic (in a neighborhood of ζ_j) vector function which assumes the value zero at ζ_j. Formula (S6.8) and the linear
independence of the vectors (S6.6) for ε = ζ_j ensure that y_1(ζ_j), ..., y_r(ζ_j) are linearly independent. □

The proof of Lemma S6.2 shows that if for some s (≤ r) the vector functions x_1(ε), ..., x_s(ε) are linearly independent for all ε ∈ Ω, then the y_i(ε), i = 1, ..., r, can be chosen in such a way that (i)-(iii) hold and, moreover, y_1(ε) = x_1(ε), ..., y_s(ε) = x_s(ε) for all ε ∈ Ω. We shall use this observation in the proof of Theorem S6.1.

Proof of Theorem S6.1. Let A_0(ε) be an r × r submatrix of A(ε) which is nonsingular for some ε̂ ∈ Ω, i.e., det A_0(ε̂) ≠ 0. So the set Ω_0 of zeros of the analytic function det A_0(ε) is either empty or consists of isolated points. In what follows we assume for simplicity that A_0(ε) is located in the top left corner of size r × r of A(ε). Let x_1(ε), ..., x_r(ε) be the first r columns of A(ε), and let y_1(ε), ..., y_r(ε) be the vector functions constructed in Lemma S6.2. Then for each ε ∈ Ω\Ω_0 we have
    Span{y_1(ε), ..., y_r(ε)} = Span{x_1(ε), ..., x_r(ε)} = Im A(ε)    (S6.9)

(the last equality follows from the linear independence of x_1(ε), ..., x_r(ε) for ε ∈ Ω\Ω_0). We shall prove now that

    Span{y_1(ε), ..., y_r(ε)} ⊃ Im A(ε),    ε ∈ Ω.    (S6.10)

Equality (S6.9) means that for every ε ∈ Ω\Ω_0 there exists an r × r matrix B(ε) such that

    Y(ε)B(ε) = A(ε),    (S6.11)

where Y(ε) = [y_1(ε), ..., y_r(ε)]. Note that B(ε) is necessarily unique (indeed, if B′(ε) also satisfies (S6.11), we have Y(ε)(B(ε) − B′(ε)) = 0, and, in view of the linear independence of the columns of Y(ε), B(ε) = B′(ε)). Further, B(ε) is analytic in Ω\Ω_0. To check this, pick an arbitrary ε′ ∈ Ω\Ω_0, and let Y_0(ε) be an r × r submatrix of Y(ε) such that det(Y_0(ε′)) ≠ 0 (for simplicity of notation assume that Y_0(ε) occupies the top r rows of Y(ε)). Then det(Y_0(ε)) ≠ 0 in some neighborhood V of ε′, and (Y_0(ε))^{−1} is analytic on V. Now Y(ε)^I = [(Y_0(ε))^{−1}, 0] is a left inverse of Y(ε); premultiplying (S6.11) by Y(ε)^I we obtain

    B(ε) = Y(ε)^I A(ε),    ε ∈ V.    (S6.12)

So B(ε) is analytic on V; since ε′ ∈ Ω\Ω_0 was arbitrary, B(ε) is analytic in Ω\Ω_0. Moreover, B(ε) admits analytic continuation to the whole of Ω, as follows. Let ε_0 ∈ Ω_0, and let Y(ε)^I be a left inverse of Y(ε) which is analytic in a neighborhood V_0 of ε_0 (the existence of such Y(ε)^I is proved as above).
Define B(ε) as Y(ε)^I A(ε) for ε ∈ V_0. Clearly, B(ε) is analytic in V_0, and for ε ∈ V_0\{ε_0} this definition coincides with (S6.12), in view of the uniqueness of B(ε). So B(ε) is analytic in Ω. Now it is clear that (S6.11) holds also for ε ∈ Ω_0, which proves (S6.10). Consideration of dimensions shows that in fact we have an equality in (S6.10), unless rank A(ε) < r. Thus (S6.1) and (S6.3) are proved.

We pass now to the proof of existence of y_{r+1}(ε), ..., y_n(ε) such that (ii), (S6.2), and (S6.4) hold. Let a_1(ε), ..., a_r(ε) be the first r rows of A(ε). By assumption, for some ε̂ ∈ Ω the rows a_1(ε̂), ..., a_r(ε̂) are linearly independent. Apply Lemma S6.2 to construct n-dimensional analytic row functions b_1(ε), ..., b_r(ε) such that for all ε ∈ Ω the rows b_1(ε), ..., b_r(ε) are linearly independent, and

    Span{b_1(ε)^T, ..., b_r(ε)^T} = Span{a_1(ε)^T, ..., a_r(ε)^T},    ε ∈ Ω\Ω_0.    (S6.13)

Fix ε_0 ∈ Ω, and let b′_{r+1}, ..., b′_n be n-dimensional rows such that the vectors b_1(ε_0)^T, ..., b_r(ε_0)^T, (b′_{r+1})^T, ..., (b′_n)^T form a basis in ℂⁿ. Applying Lemma S6.2 again (for x_1(ε) = b_1(ε)^T, ..., x_r(ε) = b_r(ε)^T, x_{r+1}(ε) = (b′_{r+1})^T, ..., x_n(ε) = (b′_n)^T) and using the remark after the proof of Lemma S6.2, we construct n-dimensional analytic row functions b_{r+1}(ε), ..., b_n(ε) such that the n × n matrix B̂(ε) with rows b_1(ε), ..., b_n(ε) is nonsingular for all ε ∈ Ω. Then the inverse B̂(ε)^{−1} is analytic on Ω. Let y_{r+1}(ε), ..., y_n(ε) be the last n − r columns of B̂(ε)^{−1}. We claim that (ii), (S6.2), and (S6.4) are satisfied with this choice. Indeed, (ii) is evident. Take ε ∈ Ω\Ω_0; from (S6.13) and the construction of the y_i(ε), i = r + 1, ..., n, it follows that

    a_i(ε)y_j(ε) = 0,    i = 1, ..., r,  j = r + 1, ..., n.

But since ε ∉ Ω_0, every row of A(ε) is a linear combination of the first r rows; so in fact

    Ker A(ε) ⊃ Span{y_{r+1}(ε), ..., y_n(ε)}.    (S6.14)

Now (S6.14) implies

    A(ε)y_j(ε) = 0,    j = r + 1, ..., n.    (S6.15)
Passing to the limit when ε approaches a point from Ω_0, we obtain that (S6.15), as well as the inclusion (S6.14), holds for every ε ∈ Ω. Consideration of dimensions shows that the equality holds in (S6.14) if and only if rank A(ε) = r, ε ∈ Ω. □

S6.2. Analytic Perturbations of Self-Adjoint Matrices

In this section we shall consider eigenvalues and eigenvectors of a self-adjoint matrix which is an analytic function of a parameter. It turns out that the eigenvalues and eigenvectors are also analytic. This result is used in Chapter 11. Consider an n × n complex matrix function A(ε) which depends on the real parameter ε. We impose the following conditions:

(i) A(ε) is self-adjoint for every ε ∈ ℝ, i.e., A(ε) = (A(ε))*, where the star denotes the conjugate transpose matrix;
(ii) A(ε) is an analytic function of the real variable ε.

Such a matrix function A(ε) can be diagonalized simultaneously for all real ε. More exactly, the following result holds.

Theorem S6.3. Let A(ε) be a matrix function satisfying conditions (i) and (ii). Then there exist scalar functions μ_1(ε), ..., μ_n(ε) and a matrix-valued function U(ε), which are analytic for real ε and possess the following properties for every ε ∈ ℝ:

    A(ε) = (U(ε))^{−1} diag[μ_1(ε), ..., μ_n(ε)] U(ε),    U(ε)(U(ε))* = I.
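A classical illustration of why the enumeration of the μ_i matters (a hypothetical example, not from the text): for A(ε) = [0 ε; ε 0] the eigenvalues sorted by value are −|ε| and |ε|, which are not analytic at ε = 0, while the branches μ_1 = ε, μ_2 = −ε together with a constant unitary U realize the analytic diagonalization promised by Theorem S6.3.

```python
import numpy as np

# constant unitary that diagonalizes [0, eps; eps, 0] for every eps
U = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)
assert np.allclose(U @ U.conj().T, np.eye(2))    # U is unitary

for eps in [1.5, -0.3, 0.0]:
    A = np.array([[0.0, eps],
                  [eps, 0.0]])
    mu = np.array([eps, -eps])                   # analytic eigenvalue branches
    assert np.allclose(np.linalg.inv(U) @ np.diag(mu) @ U, A)
    # the value-sorted eigenvalues agree as a set but carry a kink at 0
    assert np.allclose(np.linalg.eigvalsh(A), np.sort(mu))
```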
Proof. Consider the equation

    det(λI − A(ε)) = λⁿ + a_1(ε)λ^{n−1} + ··· + a_{n−1}(ε)λ + a_n(ε) = 0,    (S6.16)

where the a_j(ε) are scalar analytic functions of the real variable ε. In general, the solutions μ(ε) of (S6.16) can be chosen (when properly ordered) as power series in (ε − ε_0)^{1/p} in a real neighborhood of every ε_0 ∈ ℝ (Puiseux series):

    μ(ε) = μ_0 + b_1(ε − ε_0)^{1/p} + b_2(ε − ε_0)^{2/p} + ···    (S6.17)

(see, for instance, [52c]; also [63, Vol. III, Section 45]). But since A(ε) is self-adjoint, all its eigenvalues (which are exactly the solutions of (S6.16)) are real. This implies that in (S6.17) only terms with integral powers of ε − ε_0 can appear. Indeed, let b_m be the first nonzero coefficient in (S6.17): b_1 = ··· = b_{m−1} = 0, b_m ≠ 0. (If b_i = 0 for all i = 1, 2, ..., then our assertion is trivial.) Letting ε − ε_0 → 0 through real positive values, we see that

    b_m = lim_{ε→ε_0+} (μ(ε) − μ_0)/(ε − ε_0)^{m/p}
is real, because μ(ε) is real for real ε. On the other hand, letting ε − ε_0 approach 0 through negative real numbers, we note that

    (−1)^{m/p} b_m = lim_{ε→ε_0−} (μ(ε) − μ_0)/(ε_0 − ε)^{m/p}

is also real. Hence (−1)^{m/p} is a real number, and therefore m must be a multiple of p. We can continue this argument to show that only integral powers of ε − ε_0 can appear in (S6.17). In other words, the eigenvalues of A(ε), when properly enumerated, are analytic functions of the real variable ε.

Let μ_1(ε) be one such analytic eigenvalue of A(ε). We turn our attention to the eigenvectors corresponding to μ_1(ε). Put B(ε) = A(ε) − μ_1(ε)I. Then B(ε) is self-adjoint and analytic for real ε, and det B(ε) ≡ 0. By Theorem S6.1 there is an analytic (for real ε) nonzero vector x_1(ε) such that B(ε)x_1(ε) = 0. In other words, x_1(ε) is an eigenvector of A(ε) corresponding to μ_1(ε). Since A(ε) = (A(ε))*, the orthogonal complement {x_1(ε)}^⊥ of x_1(ε) in ℂⁿ is A(ε)-invariant. We claim that there is an analytic (for real ε) orthogonal basis in {x_1(ε)}^⊥. Indeed, let

    x_1(ε) = [x_{11}(ε), ..., x_{1n}(ε)]^T.

By Theorem S6.1, there exists an analytic basis y_1(ε), ..., y_{n−1}(ε) in Ker[x_{11}(ε), ..., x_{1n}(ε)]. Then ȳ_1(ε), ..., ȳ_{n−1}(ε) is a basis in {x_1(ε)}^⊥ (the bar denotes complex conjugation). Since ε is assumed to be real, the basis ȳ_1(ε), ..., ȳ_{n−1}(ε) is also analytic. Applying to this basis the Gram-Schmidt orthogonalization process, we obtain an analytic orthogonal basis z_1(ε), ..., z_{n−1}(ε) in {x_1(ε)}^⊥. In this basis the restriction of A(ε)