E. Evangelisti (Ed.)
Controllability and Observability Lectures given at a Summer School of the Centro Internazionale Matematico Estivo (C.I.M.E.), held in Pontecchio (Bologna), Italy, July 1-9, 1968
C.I.M.E. Foundation c/o Dipartimento di Matematica “U. Dini” Viale Morgagni n. 67/a 50134 Firenze Italy
ISBN 978-3-642-11062-7 e-ISBN: 978-3-642-11063-4 DOI:10.1007/978-3-642-11063-4 Springer Heidelberg Dordrecht London New York
© Springer-Verlag Berlin Heidelberg 2010. Reprint of the 1st ed. C.I.M.E., Ed. Cremonese, Roma 1969. With kind permission of C.I.M.E.
Printed on acid-free paper
Springer.com
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
1st Cycle - Sasso Marconi, July 1-9, 1968
CONTROLLABILITY AND OBSERVABILITY
Coordinator: Prof. G. EVANGELISTI
R. E. KALMAN: Lectures on Controllability and Observability
R. KULIKOWSKI: Controllability and Optimum Control
A. STRASZAK: Supervisory Controllability
L. WEISS: Lectures on Controllability and Observability
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
LECTURES ON CONTROLLABILITY AND OBSERVABILITY
R. E. KALMAN (Stanford University)
Course held at Sasso Marconi (Bologna), July 1-9, 1968
TABLE OF CONTENTS
0. Introduction.
1. Classical and modern dynamical systems.
2. Standardization of definitions and "classical" results.
3. Definition of states via Nerode equivalence classes.
4. Modules induced by linear input/output maps.
5. Cyclicity and related questions.
6. Transfer functions.
7. Abstract construction of realizations.
8. Construction of realizations.
9. Theory of partial realizations.
10. General theory of observability.
11. Historical notes.
12. References.
R. E. Kalman
INTRODUCTION
The theory of controllability and observability has been developed, one might almost say reluctantly, in response to problems generated by technological science, especially in areas related to control, communication, and computers. It seems that the first conscious steps to formalize these matters as a separate area of (system-theoretic or mathematical) research were undertaken only as late as 1959, by KALMAN [1960b-c]. There have been, however, many scattered results before this time (see Section 12 for some historical comments and references), and one might confidently assert today that some of the main results have been discovered, more or less independently, in every country which has reached an advanced stage of "development", and it is certain that these same results will be rediscovered again in still more places as other countries progress on the road to development.
With the perspective afforded by ten years of happenings in this field, we ought not hesitate to make some guesses of the significance of what has been accomplished. I see two main trends:
(i) The use of the concepts of controllability and observability to study nonclassical questions in optimal control and optimal estimation theory, sometimes as basic hypotheses securing existence, more often as seemingly technical conditions which allow a sharper statement of results or shorter proofs.
(ii) Interaction between the concepts of controllability and observability and the study of structure of dynamical systems, such as: formulation and solution of the problem of realization, canonical forms, decomposition of systems.
The first of these topics is older and has been studied primarily from the point of view of analysis, although the basic lemma (2.7) is purely algebraic.
The second group of topics may be viewed as "blowing up" the ideas inherent in the basic lemma (2.7), resulting in a more and more strictly algebraic point of view.
There is active research in both areas. In the first, attention has shifted from the case of systems governed by finite-dimensional linear differential equations with constant coefficients (where success was quick and total) to systems governed by infinite-dimensional linear differential equations (delay differential equations, classical types of partial differential equations, etc.), to finite-dimensional linear differential equations with time-dependent coefficients, and finally to all sorts and subsorts of nonlinear differential equations. The first two topics are surveyed concurrently by WEISS [1969] while MARKUS [1965] looks at the nonlinear situation.
My own current interest lies in the second stream, and these lectures will deal primarily with it, after a rather hurried overview of the general problem and of the "classical" results.
Let us take a quick look at the most important of these "classical" results.
For convenience I shall describe them in system-theoretic (rather than conventional pure mathematical) language. The mathematically trained reader should have no difficulty in converting them into his preferred framework, by digging a little into the references.
In area (i), the most important results are probably those which give more or less explicit and computable results for controllability and observability of certain specific classes of systems. Beyond these, there seem to be two main theorems:
THEOREM A. A real, continuous-time, n-dimensional, constant, linear dynamical system $\Sigma$ has the property "every set of $n$ eigenvalues may be produced by suitable state feedback" if and only if $\Sigma$ is completely controllable.
The central special case is treated in great detail by KALMAN, FALB, and ARBIB [1969, Chapter 2, Theorem 5.10]; for a proof of the general case with background comments, refer to WONHAM [1967]. As a particular case, we have that every system satisfying the hypotheses of the theorem can be "stabilized" (made to have eigenvalues with negative real parts) via a suitable choice of feedback. This result is the "existence theorem" for algorithms used to construct control systems for the past three decades, and yet a conscious formulation of the problem and its mathematical solution go back to about 1963. (See Theorem D below.) The analogous problem for nonconstant linear systems (governed by linear differential equations with variable coefficients) is still not solved.
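As a small illustration of the "if" direction of Theorem A, the following sketch treats only the single-input case with the system already written in companion (controllable canonical) form, where state feedback changes nothing but the last row of the matrix. The function names and the numerical example are assumptions made for this illustration and do not come from the lectures; exact rational arithmetic is used so that the final check, which relies on the Cayley-Hamilton theorem, is exact.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def companion(a):
    """Companion matrix with characteristic polynomial
    s^n + a[n-1] s^(n-1) + ... + a[0]."""
    n = len(a)
    F = [[Fraction(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    F.append([-Fraction(c) for c in a])
    return F

def monic_from_roots(roots):
    """Coefficients c[0..n-1] (lowest degree first) of prod (s - r)."""
    p = [Fraction(1)]
    for r in roots:
        shifted = [Fraction(0)] + p                              # s * p(s)
        scaled = [-Fraction(r) * c for c in p] + [Fraction(0)]   # -r * p(s)
        p = [x + y for x, y in zip(shifted, scaled)]
    return p[:-1]                                                # drop leading 1

def closed_loop(a, roots):
    """Feedback row k and closed-loop matrix F + g k, with g = last unit vector."""
    c = monic_from_roots(roots)
    k = [Fraction(ai) - ci for ai, ci in zip(a, c)]
    F = companion(a)
    F[-1] = [f + ki for f, ki in zip(F[-1], k)]   # only the last row changes
    return k, F

def evaluate_monic(c, M):
    """M^n + c[n-1] M^(n-1) + ... + c[0] I for a square matrix M."""
    n = len(M)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [[Fraction(0)] * n for _ in range(n)]
    for coeff in list(c) + [Fraction(1)]:
        total = [[t + coeff * p for t, p in zip(tr, pr)]
                 for tr, pr in zip(total, power)]
        power = matmul(power, M)
    return total

# Hypothetical open-loop polynomial s^3 + s^2 - 3s + 2 and desired eigenvalues.
a = [2, -3, 1]
desired = [-1, -2, -3]
k, F_cl = closed_loop(a, desired)
residual = evaluate_monic(monic_from_roots(desired), F_cl)
```

Since the closed-loop matrix is again a companion matrix, the desired polynomial annihilating it confirms that the chosen eigenvalues were produced. The general multi-input case is not covered by this sketch.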
THEOREM B. ("Duality Principle") Every problem of controllability in a real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system is equivalent to an observability problem in a dual system.
This fact was first observed by KALMAN [1960a] in the solution of the optimal stochastic filtering problem for discrete-time systems, and was soon applied to several problems in system theory by KALMAN [1960b-c]. See also many related comments by KALMAN, FALB, and ARBIB [1969, Chapters 2 and 6]. As a theorem, this principle is not yet known to be valid outside the linear area, but as an intuitive prescription it has been rather useful in guiding system-theoretic research. The problems involved here are those of formulation rather than proof. The basic difficulties seem to point toward algebra and in particular category theory. System-theoretic duality, like the categoric one, is concerned with "reversing arrows". See Section 10 for a modern discussion of these points and a precise version of Theorem B.
Partly as a result of the questions raised by Theorem B and partly because of the algebraic techniques needed to prove Theorem A and related lemmas, attention in the early 1960's shifted toward certain problems of a structural nature which were, somewhat surprisingly at first, found to be related to controllability and observability. The main theorems again seem to be two:
THEOREM C. (Canonical Decomposition)
Every real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system may be canonically decomposed into four parts, of which only one part, that which is completely controllable and completely observable, is involved in the input/output behavior of the system.
The proof given by KALMAN [1962] applies to nonconstant systems only under the severe restriction that the dimensions of the subspaces of all controllable and all unobservable states are constant on the whole real line. The result represented by Theorem C is far from definitive, however, since finite-dimensional, linear, nonconstant systems admit at least four different canonical decompositions: it is possible and fruitful to dualize the notions of controllability and observability, thereby arriving at four properties, presently called reachability and controllability as well as constructibility* and observability. (See Section 2 for definitions.) Any combination of a property from the first list with a property from the second list gives a canonical decomposition result analogous to Theorem C. The complexity of the situation was first revealed by WEISS and KALMAN [1965]; this paper contributed to a revival of interest (with hopes of success) in the special problems of nonconstant linear systems. Recent progress is surveyed by WEISS [1969].
*WEISS [1969] uses "determinability" instead of constructibility. The new terminology used in these lectures is not yet entirely standard.
Intimately related to the canonical structure theorem, and in fact necessary to fully clarify the phrase "involved in the input/output behavior of the system", is the last basic result:
THEOREM D.
(Uniqueness of Minimal Realization) Given the impulse-response matrix $W$ of a real, continuous-time, finite-dimensional, linear dynamical system, there exists a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma_W$ which
(a) realizes $W$: that is, the impulse-response matrix of $\Sigma_W$ is equal to $W$;
(b) has minimal dimension in the class of linear systems satisfying (a);
(c) is completely controllable and completely observable;
(d) is uniquely determined (modulo the choice of a basis at each $t$ for its state space) by requirement (a) together with (b) or, independently, by (a) together with (c).
In short, for any $W$ as described above, there is an "essentially unique" $\Sigma_W$ of the same "type" which satisfies (a) through (c).
COROLLARY 1. If $W$ comes from a constant system, there is a constant $\Sigma_W$ which satisfies (a) through (c), and is uniquely determined by (a) + (b) or (a) + (c) (modulo a fixed choice of basis for its state space).
COROLLARY 2. All claims of Corollary 1 continue to hold if "impulse-response matrix of a constant, finite-dimensional system" is replaced by "transfer function matrix of a constant, finite-dimensional system".
The first general discussion of the situation with an equivalent statement of Theorem D is due to KALMAN [1963b, Theorems 7 and 8]. (This paper does not include complete proofs, or even an explicit statement of Corollaries 1 and 2, although they are implied by the general algorithm given in Section 7. An edited version of the original unpublished proof of Theorem D is given in KALMAN, FALB, and ARBIB [1969, Chapter 10, Appendix C].)
These results are of great importance in engineering system theory since they relate methods based on the Laplace transform (using the transfer function of the system) and the time-domain methods based on input/output data (the matrix $W$) to the state-variable (dynamical system) methods developed in 1955-1960.
In fact, by Corollary 1 it follows that the two methods must yield identical results; for instance, starting with a constant impulse-response matrix $W$, property (c) implies that the existence of a stable control law is always assured by virtue of Theorem A. Thus it is only after the development represented by Theorems A-D that a rigorous justification is obtained for the intuitive design methods used in control engineering.
As with Theorem C, certain formulational difficulties arise in connection with a precise definition of a "nonconstant linear dynamical system". Thus, it seems preferable at present to replace in Theorem D "impulse-response matrix $W$" (or "abstract input/output map $W$") by "weighting pattern $W$" and "complete controllability" by "complete reachability". The definitive form of the 1963 theorem evolved through the works of WEISS and KALMAN [1965], YOULA [1966], and KALMAN; a precise formulation and modernized proof of Theorem D in the weighting pattern case was given recently by KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13]. A completely general discussion of what is meant by a "minimal realization" of a nonconstant impulse-response matrix involves many technical complications due to the fact that such a minimal realization does not exist in the class of linear differential equations with "nice" coefficient functions. For the current status of this problem, consult especially DESOER and VARAIYA [1967], SILVERMAN and MEADOWS [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13] and WEISS [1969].
From the standpoint of the present lectures, by far the most interesting consequence of Theorem D is its influence, via efforts to arrive at a definitive proof of Corollary 1, on the development of the algebraic stream of system theory. The first proof of this important result (in the special case of distinct eigenvalues) is that of GILBERT [1963]. Immediately afterwards, a general proof was given by KALMAN [1963b, Section 7]. This proof, strictly computational and linear algebraic in nature, yields no theoretical insight although it is useful as the basis of a computer algorithm.
Using the classical theory of invariant factors, KALMAN [1965a] succeeded in showing that the solution of the minimal realization problem can be effectively reduced to the classical invariant-factor algorithm. This result is of great theoretical interest since it strongly suggests the now standard module-theoretic approach, but it does not lead to a simple proof of Corollary 1 and is not a practical method of computation.
The best known proof of Corollary 1 was obtained in 1965 by B. L. Ho, with the aid of a remarkable algorithm, which is equally important from a theoretical and computational viewpoint. The early formulation of the algorithm was described by HO and KALMAN [1966], with later refinements discussed in HO and KALMAN [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 11] and KALMAN [1969c]. Almost simultaneously with the work of B. L. Ho, the basic results were discovered independently also by YOULA and TISSI [1966] and by SILVERMAN [1966]. The subject goes back to the 19th century and centers around the theory of Hankel matrices; however, many of the results just referenced seem to be fundamentally new. This field is currently in a very active stage of development. We shall discuss the essential ideas involved in Sections 8-9. Many other topics, especially Silverman's generalization of the algorithm to nonconstant systems, unfortunately cannot be covered due to lack of time.
Acknowledgment
It is a pleasure to thank C.I.M.E. and its organizers, especially Professors E. Bompiani, E. Sarti, and E. Belardinelli, for arranging a special conference on these topics. The sunny skies and hospitality of Italy, along with Bolognese food, played a subsidiary but vital part in the success of this important gathering of scientists.
1. CLASSICAL AND MODERN DYNAMICAL SYSTEMS
In mathematics the term dynamical system (synonyms: topological dynamics, flows, abstract dynamics, etc.) usually connotes the action of a one-parameter group $T$ (the reals) on a set $X$, where $X$ is at least a topological space (more often, a differentiable manifold) and the action is at least continuous. This setup is physically motivated, but in a very old-fashioned sense. A "dynamical system" as just defined is an idealization, generalization, and abstraction of Newton's world view of the Solar System as described via a finite set of nonlinear ordinary differential equations. These equations represent the positions and momenta of the planets regarded as point masses and are completely determined by the laws of gravitation, i.e., they do not contain any terms to account for "external" forces that may act on the system.
Interesting as this notion of a dynamical system may be (and is!) in pure mathematics, it is much too limited for the study of those dynamical systems which are of contemporary interest. There are at least three different ways in which the classical concept must be generalized:
(i) The time set of the system is not necessarily restricted to the reals;
(ii) A state $x \in X$ of the system is not merely acted upon by the "passage of time" but also by inputs which are or could be manipulated to bring about a desired type of behavior;
(iii) The states of the system cannot, in general, be observed. Rather, the physical behavior of the system is manifested through its outputs which are many-to-one functions of the state.
The generalization of the time set is of minor interest to us here. The notions of input and output, however, are exceedingly fundamental; in fact, controllability is related to the input and observability to the output. With respect to dynamical systems in the classical sense, neither controllability nor observability is a meaningful concept.
A much more detailed discussion of dynamical systems in the modern sense, together with rather detailed precise definitions, will be found in KALMAN, FALB, and ARBIB [1969, Chapter 1]. From here on, we will use the term "dynamical system" exclusively in the modern sense (we have already done so in the Introduction).
The following symbols will have a fixed meaning throughout the paper:
(1.1)
$T$ = time set,
$U$ = set of input values,
$X$ = state set,
$Y$ = set of output values,
$\Omega$ = input functions,
$\varphi$ = transition map,
$\eta$ = readout map.
The following assumptions will always apply (otherwise the sets above are arbitrary):
(1.2)
$T$ = an ordered subset of the reals;
$\Omega$ = class of functions $T \to U$ such that
(i) each function $\omega \in \Omega$ is undefined outside some finite interval $J_\omega \subset T$ dependent on $\omega$;
(ii) if $J_\omega \cap J_{\omega'} = \emptyset$, there is a function $\omega'' \in \Omega$ which agrees with $\omega$ on $J_\omega$ and with $\omega'$ on $J_{\omega'}$.
For most purposes later, $T$ will be equal to $\mathbb{Z}$ = (ordered) abelian group of integers; $U, X, Y, \Omega$ will be linear spaces; "undefined" can be replaced by "equal to 0"; and "functions undefined outside a finite interval" will mean the same as "finite sequences".
The most general notion of a dynamical system for our present needs is given by the following
(1.3) DEFINITION. A dynamical system $\Sigma$ is a composite object consisting of the maps $\varphi$, $\eta$ defined on the sets $T, U, \Omega, X, Y$:
$\varphi: T \times T \times X \times \Omega \to X$, $(t; \tau, x, \omega) \mapsto \varphi(t; \tau, x, \omega)$, undefined whenever $\tau > t$;
$\eta: T \times X \to Y$, $(t, x) \mapsto \eta(t, x)$.
The transition map $\varphi$ satisfies the following assumptions:
(1.4) $\varphi(t; t, x, \omega) = x$;
(1.5) $\varphi(t; \tau, x, \omega) = \varphi(t; s, \varphi(s; \tau, x, \omega), \omega)$;
(1.6) if $\omega = \omega'$ on $[\tau, t)$, then for all $s \in [\tau, t)$, $\varphi(s; \tau, x, \omega) = \varphi(s; \tau, x, \omega')$.
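The axioms on the transition map can be made concrete with a toy discrete-time system. The sketch below is an illustration under stated assumptions: the scalar recursion x(k+1) = F x(k) + G ω(k), the default values F = 2 and G = 1, and the representation of an input as a dictionary from times to values (missing times meaning "no input", treated as 0) are all hypothetical choices, not part of the definition above.

```python
def phi(t, tau, x, omega, F=2, G=1):
    """State at time t of x(k+1) = F*x(k) + G*omega(k), started in state x
    at time tau. omega is a dict {time: input value}; a missing time means
    that no input acts at that time."""
    if t < tau:
        raise ValueError("transition map is undefined for t < tau")
    for k in range(tau, t):
        x = F * x + G * omega.get(k, 0)
    return x

omega = {1: 1, 2: -1, 4: 3}

# Identity: no time elapses, so the state is unchanged.
same_state = phi(3, 3, 5, omega)

# Composition: going from time 1 to 6 directly, or through time 4.
direct = phi(6, 1, 5, omega)
via_4 = phi(6, 4, phi(4, 1, 5, omega), omega)

# Inputs that agree on [1, 6) but differ elsewhere give the same states there.
omega_other = dict(omega)
omega_other.update({0: 100, 6: -7})
agree = all(phi(s, 1, 5, omega) == phi(s, 1, 5, omega_other) for s in range(1, 6))
```

The three checks correspond to the identity property, the composition property, and the dependence of the state only on the input restricted to the relevant interval.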
The definition of a dynamical system on this level of generality should be regarded only as a scaffolding for the terminology; interesting mathematics begins only after further hypotheses are made. For instance, it is usually necessary to endow the sets $T$, $U$, $\Omega$, $X$, and $Y$ with a topology and then require that $\varphi$ and $\eta$ be continuous.
(1.7) EXAMPLE. The classical setup in topological dynamics may be deduced from our Definition (1.3) in the following way. Let $T = \mathbb{R}$ = reals, regarded as an abelian group under the usual addition and having the usual topology; let $\Omega$ consist only of the nowhere-defined function; let $X$ be a topological space; disregard $Y$ and $\eta$ entirely; define $\varphi$ for all $t, \tau \in T$ and write it as
$\varphi(t; \tau, x, \omega) = x \cdot (t - \tau)$,
that is, a function of $x$ and $t - \tau$ alone. Check (1.4-5); in the new notation they become
$x \cdot 0 = x$ and $x \cdot (s + t) = (x \cdot s) \cdot t$.
Finally, require that the map $(x, t) \mapsto x \cdot t$ be continuous.
(1.8) INTERPRETATION.
The essential idea of Definition (1.3) is that it axiomatizes the notion of state. A dynamical system is informally a rule for state transitions (the function $\varphi$), together with suitable means of expressing the effect of the input on the state and the effect of the state on the output (the function $\eta$). The map $\varphi$ is verbalized as follows: "an input $\omega$, applied to the system $\Sigma$ in state $x$ at time $\tau$, produces the state $\varphi(t; \tau, x, \omega)$ at time $t$." The peculiar definition of an input function $\omega$ is used here mainly for technical convenience; by (1.6) only equivalence classes of inputs agreeing over $[\tau, t]$ enter into the determination of $\varphi(t; \tau, x, \omega)$. "$\omega$ not defined at $t$" means no input acts on $\Sigma$ at time $t$.
The pair $(\tau, x) \in T \times X$ will be called an event of a dynamical system $\Sigma$.
In the sequel, we shall be concerned primarily with systems which are finite-dimensional, linear, and continuous-time or discrete-time. Often these systems will be also real and constant (= stationary or time-invariant). We leave the precise definition of these terms in the context of Definition (1.3) to the reader (consult KALMAN, FALB, and ARBIB [1969, Chapter 1] as needed) and proceed to make some ad hoc definitions without detailed explanation.
The following conventions will remain in force throughout the lectures whenever the linear case is discussed:
(1.9) Continuous-time.
$T = \mathbb{R}$, $U = \mathbb{R}^m$, $X = \mathbb{R}^n$, $Y = \mathbb{R}^p$, $\Omega$ = all continuous functions $\mathbb{R} \to \mathbb{R}^m$ which vanish outside a finite interval.
(1.10) Discrete-time. $T = \mathbb{Z}$, $K$ = fixed field (arbitrary), $U = K^m$, $X = K^n$, $Y = K^p$, $\Omega$ = all functions $\mathbb{Z} \to K^m$ which are zero for all but a finite number of their arguments.
Now we have, finally,
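(1.11) DEFINITION. A real, continuous-time, n-dimensional, linear dynamical system $\Sigma$ is a triple of continuous matrix functions of time $(F(\cdot), G(\cdot), H(\cdot))$ where
$F(\cdot): \mathbb{R} \to \{n \times n \text{ matrices over } \mathbb{R}\}$,
$G(\cdot): \mathbb{R} \to \{n \times m \text{ matrices over } \mathbb{R}\}$,
$H(\cdot): \mathbb{R} \to \{p \times n \text{ matrices over } \mathbb{R}\}$.
These maps determine the equations of motion of $\Sigma$ in the following way:
(1.12) $dx/dt = F(t)x + G(t)\omega(t)$, $y(t) = H(t)x(t)$,
where $t \in \mathbb{R}$, $x \in \mathbb{R}^n$, $\omega(t) \in \mathbb{R}^m$, and $y(t) \in \mathbb{R}^p$.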
To check that (1.12) indeed makes $\Sigma$ into a well-defined dynamical system in the sense of Definition (1.3), it is necessary to recall the basic facts about finite systems of ordinary linear differential equations with continuous coefficients. Define the map
$\Phi_F(t, \tau): \mathbb{R} \times \mathbb{R} \to \{n \times n \text{ matrices over } \mathbb{R}\}$
to be the family of $n \times n$ matrix solutions of the linear differential equation
$dx/dt = F(t)x$, $x \in \mathbb{R}^n$,
subject to the initial condition
$\Phi_F(\tau, \tau) = I$ = unit matrix, $\tau \in \mathbb{R}$.
Then $\Phi_F$ is of class $C^1$ in both arguments. It is called the transition matrix of (the system $\Sigma$ whose "infinitesimal" transition matrix is) $F(\cdot)$. From this standard result we get easily also the fact that the transition map of $\Sigma$ is explicitly given by
$\varphi(t; \tau, x, \omega) = \Phi_F(t, \tau)x + \int_\tau^t \Phi_F(t, s)G(s)\omega(s)\,ds$.
T,
W(T, t) for
where
The original proof of (b) is in KALMAN [1960b]; both cases are treated in detail in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section 2]. Note that if $G(\cdot)$ is identically zero on $(-\infty, \tau)$ we cannot have reachability, and if $G(\cdot)$ is identically zero on $(\tau, +\infty)$ we cannot have controllability.
For a constant system, the integrals above depend only on the difference of the limits; hence, in particular
So we have
(2.4) PROPOSITION. In a real, continuous-time, finite-dimensional, linear, constant dynamical system an event $(\tau, x)$ is reachable for all $\tau$ if and only if it is reachable for one $\tau$; an event is reachable if and only if it is controllable.
From (2.3) one can obtain in a straightforward fashion also the following much stronger result:
(2.5) THEOREM. In a real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ a state $x$ is reachable (or, equivalently, controllable) at any $\tau \in \mathbb{R}$ if and only if
$x \in \mathrm{span}\,(G, FG, \ldots) \subset \mathbb{R}^n$;
if this condition is satisfied, we can choose $s = \tau - \delta$, $t = \tau + \delta$, with $\delta > 0$ arbitrary. (The span of a sequence of matrices is to be interpreted as the vector space generated by the columns of these matrices.)
A proof of (2.5) may be found in KALMAN, HO, and NARENDRA [1963] and in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section 3]. A trivial but noteworthy consequence is the fact that the definition of reachable states of $\Sigma$ is "coordinate-free":
(2.6) COROLLARY. The set of reachable (or controllable) states of $\Sigma$ in Theorem (2.5) is a subspace of the real vector space $X_\Sigma$, the state space of $\Sigma$.
Very often the attention to individual states is unnecessary and therefore many authors prefer to use the terminology "$\Sigma$ is completely reachable at $\tau$" for "every event $(\tau, x)$, $\tau$ = fixed, $x \in X_\Sigma$, is reachable", or "$\Sigma$ is completely reachable" for "every event in $\Sigma$ is reachable", etc. Thus (2.5), together with the Cayley-Hamilton theorem, implies the
(2.7) BASIC LEMMA. A real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ is completely reachable if and only if
(2.8) $\mathrm{rank}\,(G, FG, \ldots, F^{n-1}G) = n$.
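The rank condition of the Basic Lemma is easy to test mechanically. The following is a minimal sketch, assuming matrices are given as lists of rows with integer or rational entries; the helper names and the two example systems are hypothetical and chosen only for illustration. Exact rational arithmetic avoids the numerical-tolerance questions that a floating-point rank computation would raise.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def reachability_matrix(F, G):
    """Block matrix (G, FG, ..., F^(n-1) G), n being the size of F."""
    n = len(F)
    blocks, block = [], G
    for _ in range(n):
        blocks.append(block)
        block = matmul(F, block)
    return [sum((list(b[i]) for b in blocks), []) for i in range(n)]

def completely_reachable(F, G):
    """Rank test: the reachability matrix must have full rank n."""
    return rank(reachability_matrix(F, G)) == len(F)

# A double integrator driven through its second state: completely reachable.
chain = completely_reachable([[0, 1], [0, 0]], [[0], [1]])
# Two decoupled modes with input into the first only: not completely reachable.
decoupled = completely_reachable([[1, 0], [0, 2]], [[1], [0]])
```

In the second example the reachability matrix has a zero second row, which reflects the fact that the input never influences the second state.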
Condition (2.8) is very well-known; it or equivalent forms of it have been discovered, explicitly used, or implicitly assumed by many authors. A trivially equivalent form of (2.7) is given by
(2.9) COROLLARY 1. A constant system $\Sigma = (F, G, -)$ is completely reachable if and only if the smallest $F$-invariant subspace of $X_\Sigma$ containing (all column vectors of) $G$ is $X_\Sigma$ itself.
A useful variant of the last fact is given by
(2.10) COROLLARY 2. (W. Hahn) A constant system $\Sigma = (F, G, -)$ is completely reachable if and only if there is no nonzero eigenvector of $F'$ which is orthogonal to (every column vector of) $G$.
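The eigenvector criterion can be checked on small examples once a candidate vector is given. In the sketch below the criterion is read with eigenvectors of the transpose of F (that is, row vectors v with vF = λv), which is an assumption about the intended statement; it is the reading under which the two examples agree with the rank test (2.8). The function name and the examples are hypothetical.

```python
def is_blocking_eigenvector(F, G, v, lam):
    """True if v is a nonzero vector with v F = lam v (an eigenvector of the
    transpose of F) that is also orthogonal to every column of G."""
    n = len(F)
    vF = [sum(v[i] * F[i][j] for i in range(n)) for j in range(n)]
    vG = [sum(v[i] * G[i][j] for i in range(n)) for j in range(len(G[0]))]
    return any(v) and vF == [lam * x for x in v] and not any(vG)

# Decoupled modes, input into the first state only: (0, 1) blocks reachability.
blocked = is_blocking_eigenvector([[1, 0], [0, 2]], [[1], [0]], [0, 1], 2)

# Double integrator with input into the second state (completely reachable):
F, G = [[0, 1], [0, 0]], [[0], [1]]
# (1, 0) is orthogonal to G but does not satisfy v F = 0 v.
not_left_eigenvector = is_blocking_eigenvector(F, G, [1, 0], 0)
# (0, 1) satisfies v F = 0 v but is not orthogonal to G.
not_orthogonal = is_blocking_eigenvector(F, G, [0, 1], 0)
```

The double integrator shows why the transpose matters in this reading: the ordinary eigenvector (1, 0) of F is orthogonal to G even though the system passes the rank test.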
Finally, let us note that, far from being a technical condition, (2.5) has a direct system-theoretic interpretation, as follows:
(2.11) PROPOSITION. The state space $X_\Sigma$ of a real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ may be written as a direct sum
$X_\Sigma = X_1 \oplus X_2$
which induces a decomposition of the equations of motion as (obvious notations)
(2.12) $dx_1/dt = F_{11}x_1 + F_{12}x_2 + G_1\omega(t)$, $dx_2/dt = F_{22}x_2$.
The subsystem $\Sigma_1 = (F_{11}, G_1, -)$ is completely reachable. Hence a state $x = (x_1, x_2) \in X_\Sigma$ is reachable if and only if $x_2 = 0$.
PROOF. We define $X_1$ to be the set of reachable states of $\Sigma$; by (2.5) this is an $F$-invariant subspace of $X_\Sigma$. Hence, by finite-dimensionality, $X_1$ is a direct summand in $X_\Sigma$. By construction, every state in $X_1$ is reachable, and (every column vector of) $G$ belongs to $X_1$. The $F$-invariance of $X_1$ implies that $F_{21} = 0$, which implies the asserted form of the equations of motion. □
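The two facts on which the proof rests, namely that the reachable subspace contains the columns of G and is F-invariant, can be verified numerically for a given pair (F, G). The sketch below does this with exact rank computations; the helper names and the three-dimensional example are assumptions made for illustration only.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def hstack(*blocks):
    """Place matrices with equal row counts side by side."""
    return [sum((list(b[i]) for b in blocks), []) for i in range(len(blocks[0]))]

def rank(M):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def reachability_matrix(F, G):
    """Block matrix (G, FG, ..., F^(n-1) G); its columns span X_1."""
    blocks, block = [], G
    for _ in range(len(F)):
        blocks.append(block)
        block = matmul(F, block)
    return hstack(*blocks)

# Hypothetical example: the third state is decoupled from the input.
F = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]
G = [[1], [0], [0]]
R = reachability_matrix(F, G)
dim_X1 = rank(R)
# Adjoining G, or the image F R, to R must not enlarge the column space.
contains_G = rank(hstack(R, G)) == dim_X1
f_invariant = rank(hstack(R, matmul(F, R))) == dim_X1
```

Here the reachable subspace is two-dimensional, and any complement of it can serve as the second summand of the direct sum.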
(2.13) REMARK. Note that $X_2$ is not intrinsically defined (it depends on an arbitrary choice in completing the direct sum). Hence to say that "$(0, x_2)$ is an unreachable (or uncontrollable) state if $x_2 \neq 0$" is an abuse of language. More precisely: the set of all reachable (or controllable) states has the structure of a vector space, but the set of all unreachable (or uncontrollable) states does not have such structure. This fact is important to bear in mind for the algebraic development which follows after this section and also in the definition of observability and constructibility below. In general, the direct sum cannot be chosen in such a way that $F_{12} = 0$.
While condition (2.8) has been frequently used as a technical requirement in the solution of various optimal control problems in the late 1950's, it was only in 1959-60 that the relation between (2.8) and system-theoretic questions was clarified by KALMAN [1960b-c] via Definition (2.2) and Propositions (2.5) and (2.11). (See Section 11 for further details.) In other words, without the preceding discussion the use of (2.8) may appear to be artificial, but in fact it is not, at least in problems in which control enters, because, by (2.12), control problems stated for $\Sigma$ are nontrivial only with respect to the intrinsic subspace $X_1$.
The hypothesis "constant" is by no means essential for Proposition (2.11), but we must forego further comments here.
For later purposes, we state some facts here for discrete-time, constant linear systems analogous to those already developed for their continuous-time counterparts. The proofs are straightforward and therefore omitted (or given later, for illustrative purposes).
(2.14) PROPOSITION. A state $x$ of a real, discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ is reachable if and only if
$x \in \mathrm{span}\,(G, FG, \ldots, F^{n-1}G)$.
Thus such a system is completely reachable if and only if (2.8) holds.
(2.16) PROPOSITION. A state $x$ of the system $\Sigma$ described in Proposition (2.14) is controllable if and only if
$x \in \mathrm{span}\,(F^{-1}G, \ldots, F^{-n}G)$,
where $F^{-k}G = \{x : F^k x = \text{column vector of } G\}$.
(2.18) PROPOSITION. In a real, discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ a reachable state is always controllable and the converse is always true whenever $\det F \neq 0$.
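That the converse can fail when det F = 0 is seen already in dimension two. The sketch below uses a hypothetical nilpotent F, so that every state is driven to the origin by zero input, while the states reached from the origin stay on one coordinate axis. The brute-force search over a small grid of inputs is only suggestive; the actual claim follows from span(G, FG) as in Proposition (2.14).

```python
def step(F, G, x, u):
    """One step of x(t+1) = F x(t) + G u(t), with matrices as lists of rows."""
    n = len(F)
    return [sum(F[i][j] * x[j] for j in range(n)) +
            sum(G[i][k] * u[k] for k in range(len(u))) for i in range(n)]

F = [[0, 1], [0, 0]]   # nilpotent, hence det F = 0
G = [[1], [0]]

def drive_to_zero(x):
    """Apply zero input twice; since F*F = 0 every state ends at the origin."""
    for _ in range(2):
        x = step(F, G, x, [0])
    return x

def states_reached_from_zero(inputs=range(-2, 3)):
    """States reached from the origin in two steps, inputs taken from a grid."""
    return {tuple(step(F, G, step(F, G, [0, 0], [u0]), [u1]))
            for u0 in inputs for u1 in inputs}

end_state = drive_to_zero([0, 1])      # the state (0, 1) is controllable
reached = states_reached_from_zero()   # every reached state has second entry 0
```

The state (0, 1) is therefore controllable but not reachable, while every reachable state (u, 0) is also controllable, in agreement with the proposition.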
Note also that Proposition (2.11) and its proof continue to be correct, without any modification, when "continuous-time" is replaced by "discrete-time".
Now we turn to a discussion of observability.
The original definition of observability by KALMAN [1960b, Definition (5.23)] was concocted in such a way as to take advantage of vector-space duality. The conceptual problems surrounding duality are easy to handle in the linear case but are still by no means fully understood in the nonlinear case (see Section 10). In order to get at the main facts quickly, we shall consider here only the linear case and even then we shall use the underlying idea of vector-space duality in a rather ad-hoc fashion. The reader wishing to do so can easily turn our remarks into a strictly dual treatment of facts (2.1)-(2.12) with the aid of the setup introduced in Section 10.
DEFINITION. An event $(\tau, x)$ in a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), -, H(\cdot))$ is unobservable iff
$H(s)\Phi_F(s, \tau)x = 0$ for all $s \in [\tau, \infty)$.
DEFINITION. With respect to the same system, an event $(\tau, x)$ is unconstructible* iff
$H(\sigma)\Phi_F(\sigma, \tau)x = 0$ for all $\sigma \in (-\infty, \tau]$.
*In the older literature, starting with KALMAN [1960b, Definition (5.23)], it is this concept which is called "observability". By hindsight, the present choice of words seems to be more natural to the writer.
The motivation for the first definition is obvious: the "occurrence" of an unobservable event cannot be detected by looking at the output of the system after time $\tau$. (The definition subsumes $\omega = 0$, but this is no loss of generality because of linearity.) The motivation for the second definition is less obvious but is in fact strongly suggested by statistical filtering theory (see Section 10). In any case, Definition (2.20) complements Definition (2.21) in exactly the same way as Definition (2.1) complements Definition (2.2).
From these definitions, it is very easy to deduce the following criteria:
(2.21) PROPOSITION. In a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), -, H(\cdot))$ an event $(\tau, x)$ is
(a) unobservable if and only if $x \in \mathrm{kernel}\, M(\tau, t)$ for all $t \in \mathbb{R}$, $t > \tau$, where
M(T, t)
(b)
unconstructible if and only if for all
s
E~,
s
x € kernel
W( T,
t)
T
and observable # for some t >
x € range
M( T,
t)
T.
From these relations we can easil,y deduce the so-called "duality rules"; that is, problems involving observability (or constructibility) are converted into problems involving reachaoility (or ~ o nt rollability) in a suitabl,y defined dual system.
See KALMAN, FALB,
and ARBIB [1969, Chapter 2, Proposition (6.12)] and the broader discussion in Section 10. We will say: by slight abuse of language, that a system is completely observable whenever
0
is the onl,y unobservable state.
Thus the Basic Lemma (2 .7) "dualizes" to the PROPOSITION.
A real, continuous-time or discrete-time,
n-dimensional, linear, constant
dynamical system E
= (F,
- , H)
*A11 this would be strictly correct if we agreed to replace "direct sum" in Pr oposition (2.11) and its counterpart (2. 25) by "orthogonal. direct sum"; but thi s would be an arbitrary convention which, whi l e conve nie ~t, h~s no natural system-theoretic justification. Rer ead Rem1i.tk (2.13).
is completely observable if and only if

(2.24)   $\operatorname{rank}\,(H', F'H', \ldots, (F')^{n-1}H') = n$.
By duality, complete constructibility in a continuous-time system is equivalent to complete observability; in a discrete-time system this is not true in general but it is true when $\det F \neq 0$.
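The rank criterion (2.24) is easy to check mechanically. The following is a minimal sketch, not part of the lectures: it stacks the blocks $H, HF, \ldots, HF^{n-1}$ (the transpose of the matrix in (2.24), which has the same rank) and computes the rank over the rationals. The function names `mat_mul`, `rank`, and `is_completely_observable` are illustrative assumptions.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rank(rows):
    """Rank by Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_completely_observable(F, H):
    """Check rank (H', F'H', ..., (F')^(n-1) H') = n via the stacked rows H F^k."""
    n = len(F)
    block, rows = [list(row) for row in H], []
    for _ in range(n):
        rows.extend(block)
        block = mat_mul(block, F)  # next block: H F^(k+1)
    return rank(rows) == n

F = [[0, 1], [0, 0]]
print(is_completely_observable(F, [[1, 0]]))  # output sees x1; x2 shows up one step later
print(is_completely_observable(F, [[0, 1]]))  # x1 never influences the output
```

Exact rational arithmetic is used so that the rank decision does not depend on a floating-point tolerance.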
It is easy to see also that (2.11) "dualizes" to:

(2.25) PROPOSITION. The state space $X_\Sigma$ of a real, continuous-time or discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, -, H)$ may be written as a direct sum

$X_\Sigma = X_1 \oplus X_2$

and the equations of $\Sigma$ are decomposed correspondingly as

$dx_1/dt = F_{11}x_1$,
$dx_2/dt = F_{21}x_1 + F_{22}x_2$,
$y(t) = H_2 x_2(t)$.

PROOF. Proceed dually to the proof of Proposition (2.11), beginning with the definition of $X_1$ as the set of all unobservable states of $\Sigma$. □

Combining Propositions (2.11) and (2.25) gives Theorem C as in KALMAN [1962].
This completes our survey of the "classical" results related
to reachability, controllability, observability, and constructibility. The remaining lectures will be concerned exclusively with discrete-time systems. The main motivation for the succeeding developments will be the algebraic criteria (2.8) and (2.24) as well as a deeper examination of Theorems C and D of the Introduction.
3. DEFINITION OF STATES VIA NERODE EQUIVALENCE CLASSES
A classical dynamical system is essentially the action of the time set $T$ (= reals) on the states $X$. In other words, the states are acted on by an abelian group, namely ($\mathbb{R}$, + = usual definition of addition). This is a trivial fact, but it has deep consequences. A (modern) dynamical system is the action of the inputs $\Omega$ on $X$; in exact analogy with the classical case, to the abelian structure on $T$ there corresponds an (associative but noncommutative) semigroup structure on $\Omega$. The idea that $\Omega$ always admits such a structure was apparently overlooked until the late 1950's when it became fashionable in automata theory (school of SCHÜTZENBERGER). This seems to be the "right" way of translating the intuitive notion of dynamics into mathematics, and it will be fundamental in our succeeding investigations.

It is convenient to assume from now on, until the end of these lectures, that

$T$ = time set = $\mathbb{Z}$ = additive (ordered) group of integers.

Since we shall be only interested in constant systems from here on, we shall adopt the following normalization convention:*
*In the discrete-time nonconstant case, we would have to deal with $\mathbb{Z}$ copies of $\Omega$, each normalized with respect to a different particular value of $\tau \in \mathbb{Z}$.
No element of n is defined for
t >
In view of (3.2), we can define the "length"
max {-t E Z:
Iill I
is
ill
I!:Jt
Before defining the semigroup on fundamental notion of dynamics : defined for all
ari':
0
q~
in
n ~ n:
ill
O.
T
1(1)1
of ill
defined for any
n,
by
s < t}.
we introduce another
the (left) shift operator
an'
Z by
1-+
t ~ m(t + q).
arim:
Note that the definition of $\sigma_\Omega$ is compatible with the normalization (3.2).

If $J_\omega \cap J_{\omega'}$ = empty for $\omega, \omega' \in \Omega$, we define the join of $\omega$ and $\omega'$ as the function

(3.4)   $\omega \vee \omega' = \omega$ on $J_\omega$, $\omega'$ on $J_{\omega'}$.

When $\Omega$ has an additive structure, then we replace $\omega \vee \omega'$ by $\omega + \omega'$.

(3.5) DEFINITION. There is an associative operation

$\circ: \Omega \times \Omega \to \Omega$,

called concatenation, defined by

$\circ: (\omega, \nu) \mapsto \sigma_\Omega^{|\nu|}\omega \vee \nu$.

Note that, by (3.2) through (3.4), $\circ$ is well defined. Note also that the asserted existence of concatenation rests on the fact that $\Omega$ is made up of functions defined over finite intervals in $T$. We might express the content of (3.5) also as: $\Omega$ is a semigroup with valuation, since evidently $|\omega \circ \nu| = |\omega| + |\nu|$.
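The shift, join, and concatenation operations can be made concrete in a few lines. This is only a sketch under stated assumptions: an input $\omega$ is modeled as a Python dict from integer times to values, it is assumed to be defined exactly on the times $-|\omega|+1, \ldots, 0$ (so that its length is the number of defined times), and the names `from_list`, `shift`, `join`, and `concat` are hypothetical.

```python
def from_list(values):
    """Input whose last value is applied at t = 0 (assumed normalization)."""
    k = len(values)
    return {t - k + 1: v for t, v in enumerate(values)}

def shift(omega, q):
    """Left shift by q >= 0: the shifted input takes at t the value omega(t + q)."""
    return {t - q: u for t, u in omega.items()}

def join(omega, nu):
    """Join of two inputs with disjoint domains."""
    assert not set(omega) & set(nu), "domains must be disjoint"
    return {**omega, **nu}

def length(omega):
    """Number of time instants on which the input is defined."""
    return len(omega)

def concat(omega, nu):
    """Concatenation: shift omega left by |nu|, then join with nu."""
    return join(shift(omega, length(nu)), nu)

a, b, c = from_list([1, 2]), from_list([3]), from_list([4, 5])
print(concat(a, b) == from_list([1, 2, 3]))
print(concat(concat(a, b), c) == concat(a, concat(b, c)))
print(length(concat(a, c)) == length(a) + length(c))
```

Under these assumptions concatenation is just "play $\omega$ first, then $\nu$", and the additivity of the length is the valuation property mentioned above.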
In view of (3.5), it is natural to use an abbreviated notation* also for the transition function, as follows:

(3.6)   $x \circ \omega = \varphi(0; -|\omega|, x, \omega)$.
Now we come to an important nonclassical concept in dynamical systems, whose evolution was strongly influenced by problems in communications and automata theory: a discrete-time constant input/output map

(3.7)   $f: \Omega \to Y: \omega \mapsto f(\omega) = y(1)$.

We interpret this map as follows: $y(1)$ is the output of some system $\Sigma$ (say, a digital computer) when $\Sigma$ is subjected to the (finite) input sequence $\omega$, assuming that $\Sigma$ is in some fixed initial equilibrium state before the application of $\omega$. This definition automatically incorporates the notions of "discrete-time" as well as "causal" or "dynamics" (the latter because $y(t)$ is not defined for $t < 1$). However, (3.7) does not clearly imply "constancy" (implicitly, however, this is clear from the normalization assumption (3.2) on $\Omega$). To make the definition more forceful, we extend $f$ to the map

(3.8)   $\bar f: \Omega \to \Gamma = Y \times Y \times \cdots$ (infinite cartesian product): $\omega \mapsto (y(1), y(2), \ldots)$.

Interpretation: $\bar f$ gives the output sequence $y = (y(1), y(2), \ldots)$ of the system $\Sigma$ after $t = 0$ resulting from the application of an
*Observe that $x \circ \omega$ is the strict analog of the notation $xt$ customary in topological dynamics. The action of $\omega$ on $x$ satisfies $x \circ (\omega \circ \nu) = (x \circ \omega) \circ \nu$ in view of (1.5).
input $\omega$ which stops at $t = 0$.
This definition expresses causality more forcefully and incorporates constancy, provided we define the (left) shift operator $\sigma_\Gamma$ on $\Gamma$ so as to be compatible with (3.3). So, for any $\tau \geq 0$, $\tau \in \mathbb{Z}$, let

$\sigma_\Gamma^\tau: [t \mapsto y(t)] \mapsto [t \mapsto y(t + \tau)]$, that is, $(y(1), y(2), \ldots) \mapsto (y(\tau + 1), y(\tau + 2), \ldots)$.

Note: the operator $\sigma_\Omega$ "appends" an undefined term at 0, the operator $\sigma_\Gamma$ "discards" the term $y(1)$.

Now, dropping the bar over $f$, we adopt

(3.10) DEFINITION. A discrete-time, constant input/output map (of some underlying dynamical system $\Sigma$) is any map $f: \Omega \to \Gamma$ such that the following diagram

$$\begin{array}{ccc} \Omega & \xrightarrow{\;f\;} & \Gamma \\ \downarrow \sigma_\Omega & & \downarrow \sigma_\Gamma \\ \Omega & \xrightarrow{\;f\;} & \Gamma \end{array}$$

is commutative. $f$ is linear iff it is a $K$-vector space homomorphism.
It will be convenient to regard (3.10) as the external definition of a dynamical system, in contrast to the internal definition set up in Section 1. Intuitively, we should think of $f$ as a highly idealized kind of experimental data; namely, $f$ incorporates all possible information that could be gained by subjecting the underlying
system to experiments in which only input/output data is available. This point of view is related to experimental physics the same way as the classical notion of a dynamical system is related to Newtonian (axiomatic) physics.

The basic question which motivates much of what will follow can now be formulated as follows:

PROBLEM OF REALIZATION. Given only the knowledge of $f$ (but of course also of $\mathbb{Z}$, $\Omega$, and $\Gamma$) how can we discover, in a mathematically consistent, rigorous, and natural way, the properties of the system $\Sigma$ which is supposed to underlie the given input/output map $f$?

This suggests immediately the following fundamental concept:

DEFINITION. A fixed dynamical system $\Sigma_0$ (internal definition, as in Section 1) is a realization of a fixed input/output map $f_0$ iff $f_0 = f_{\Sigma_0}$, that is, $f_0$ is identical with the input/output map of $\Sigma_0$.

In view of the notations of Section 1 plus the special convention (3.6), the explicit form of the realization condition is
simply that f (m) o
for all m
l1.
~ ( 0
k
fez -oi)
in Z.
The proof of Theorem (4.2) is now complete, since the last lemma identifies $X_f$ as defined by (3.15) with the quotient module $\Omega/\text{kernel } f$. We write elements of the latter as $[\omega]_f = \omega + \text{kernel } f$; it is clear that since $\Omega$ itself is generated by $e_1, \ldots, e_m$ (see (4.6)), then $X_f$ as a $K[z]$-module is generated by $[e_1]_f, \ldots, [e_m]_f$. Note also that the scalar product in $\Omega/\text{kernel } f$ is

(4.10)   $\pi \cdot [\omega]_f = [\pi \cdot \omega]_f$.

The last product above (that in $\Omega$) has already been defined in (4.5). The reader should verify directly that (4.10) gives a well-defined scalar product.
(4.11) REMARK. There is a strict duality in the setup used to define $X_f$. From the point of view of homological algebra [MAC LANE 1963], this duality looks as follows. Since every free module is projective, the natural map

$\Omega \to X_f: \omega \mapsto [\omega]_f$

exhibits $X_f$ as the image of a projective module. On the other hand, there is a bijection between the set $X_f$ and the set $\Xi_f = f(\Omega)$. $\Xi_f$ is clearly a $K[z]$-submodule of $\Gamma$ (with $z \cdot f(\omega) = f(z \cdot \omega)$), and so $X_f$ and $\Xi_f$ are isomorphic also as $K[z]$-modules. It is known that $\Gamma$ is an injective module [MAC LANE 1963, page 95, Exercise 2]. So the natural map

$X_f \to \Xi_f \subset \Gamma: [\omega]_f \mapsto f(\omega)$

exhibits $X_f$ as a submodule of an injective module. This fact is basic in the construction of the "transfer function" associated with $f$ (Section 7), but its full implications are not yet understood at present.

There is an easy counterpart of Theorem (4.2) which concerns a dynamical system given in "internal" form:

(4.12) PROPOSITION. The state set $X_\Sigma$ of every discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ admits the structure of a $K[z]$-module.

PROOF. By definition (see (1.10)), $X_\Sigma = K^n$ is already a $K$-vector space. We make it into a $K[z]$-module by defining
(4.13)   $\cdot: K[z] \times K^n \to K^n: (\pi, x) \mapsto \pi(F)x$. □

(4.14) COMMENT. The construction used in the proof of (4.12) is the classical trick of studying the properties of a fixed linear map $F: K^n \to K^n$ via the $K[z]$-module structure that $F$ induces on $K^n$ by (4.13).

In view of the canonical construction of $\Sigma_f$ provided by Proposition (3.16), the state set $X$ can be treated as a $K[z]$-module irrespective as to whether $X$ is constructed from $f$ ($X = X_f$) or given a priori as part of the specification of $\Sigma$ ($X = X_\Sigma$). Thus the $K[z]$-module structure on $X$ is a nice way of uniting the "external" and the "internal" definitions of a dynamical system. Henceforth we shall talk about a (discrete-time, linear, constant dynamical) system $\Sigma$ somewhat imprecisely via properties of its associated $K[z]$-module $X_\Sigma$.
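The scalar product (4.13) is simply "evaluate the polynomial at $F$ and apply the result to $x$". The sketch below illustrates this over the rationals; it is an illustration only, with polynomials stored as coefficient lists $[c_0, c_1, \ldots]$ (lowest degree first) and with the hypothetical helper names `mat_vec`, `act`, and `poly_mul`.

```python
from fractions import Fraction

def mat_vec(F, x):
    """Matrix-vector product F x."""
    return [sum(a * b for a, b in zip(row, x)) for row in F]

def act(pi, F, x):
    """Module action pi . x = pi(F) x, with pi = c0 + c1 z + ... as [c0, c1, ...]."""
    result = [Fraction(0)] * len(x)
    power = [Fraction(v) for v in x]      # holds F^k x for the current k
    for c in pi:
        result = [r + c * p for r, p in zip(result, power)]
        power = mat_vec(F, power)
    return result

def poly_mul(p, q):
    """Product of two polynomials in the same coefficient convention."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

F = [[0, 1], [-2, -3]]
x = [1, 1]
p, q = [1, 1], [0, 2]                      # p = 1 + z, q = 2z
print(act(poly_mul(p, q), F, x) == act(p, F, act(q, F, x)))  # (pq).x = p.(q.x)
print(act([2, 3, 1], F, [1, 0]))           # z^2 + 3z + 2 is the characteristic polynomial of F
```

The second call illustrates the torsion phenomenon discussed later in this section: the characteristic polynomial of $F$ annihilates every state.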
We shall now give some examples of using module-theoretic language to express standard facts encountered before.

(4.15) PROPOSITION. If $X$ is the state-module of $\Sigma$, the map $F_\Sigma$ is given by $X \to X: x \mapsto z \cdot x$.

PROOF. This is obvious from (4.13) if $X = X_\Sigma$. If $X = X_f$, then we find that, by (1.17),

$x(1) = Fx(0) + G\omega(0) = F[\xi]_f + G\omega(0)$;

since $x(0)$ results from input $\xi$, $x(1)$ results from input $z \cdot \xi + \omega(0)$ and we get

$x(1) = [z \cdot \xi + \omega(0)]_f = z \cdot [\xi]_f + [\omega(0)]_f = z \cdot [\xi]_f + G\omega(0)$.

So the assertion is again verified. □
Now we can replace Proposition (2.14) by the much more elegant

(4.16) PROPOSITION. A system $\Sigma = (F, G, -)$ is completely reachable if and only if the columns of $G$ generate $X_\Sigma$.

PROOF. The claim is that complete reachability is equivalent to the fact that every element $x \in X_\Sigma$ is expressible as

$x = \sum_{j=1}^{m} \pi_j \cdot g_j$,   $\pi_j \in K[z]$,   $G = [g_1, \ldots, g_m]$.

In view of (4.15), this is the same as requiring that $x$ be expressible as

$x = \sum_{j=1}^{m} \pi_j(F)g_j$;

this last condition is equivalent to complete reachability by (2.14). □

(4.17) COROLLARY. The reachable states of $\Sigma$ are precisely those of the submodule of $X_\Sigma$ generated by (the columns of) $G$.

(4.18) REMARK. The statement that "$\Sigma$ is not completely reachable" simply means that $X_\Sigma$ is not generated by those vectors which make up the matrix $G$ in the specification of the input side of the system $\Sigma$. It does not follow that $X$ cannot be finitely generated by some other vectors. In fact, to avoid unnecessary generality, we shall henceforth assume that $X$ is always finitely generated over $K[z]$. From the system-theoretic point of view, the case when we need infinitely many generators, that is, infinitely many input channels, seems rather bizarre at present.

(4.19) PROPOSITION. The system $X_f$ is completely reachable.

PROOF. Obvious from the notation: a state $x = [\xi]_f$ is reached by $\xi \in \Omega$. □

(4.20) PROPOSITION. The system $X_f$ is completely observable.

PROOF. Obvious from the lemma above: $[\omega]_f \mapsto f(\omega) = 0$ iff $\omega \in [0]_f$, which says that the only unobservable state of $X_f$ is $0 \in X_f$. □
Let us generalize the last result to obtain a module-theoretic criterion for complete observability. There are two technically different ways of doing this. The first depends on the observation that the "dual" of a submodule (see Corollary (4.17)) is a quotient module. The second defines observability via the "dual" system $(F', H', -)$ associated with $(F, -, H)$.

Consider a dynamical system $\Sigma = (F, -, H)$ and the corresponding $K[z]$-module $X_\Sigma$ and $K$-homomorphism $H: X_\Sigma \to Y = K^p$. We can extend $H$ to a $K[z]$-homomorphism $\bar H: X_\Sigma \to \Gamma$ (look back at (3.8)) by setting

$\bar H: x \mapsto (Hx, H(z \cdot x), H(z^2 \cdot x), \ldots)$.

From Definition (2.19) we see that no nonzero element of the quotient module $X_\Sigma/\text{kernel } \bar H$ is unobservable. Hence, by abuse of language, we can say that $X_\Sigma/\text{kernel } \bar H$ is the module of observable states of $\Sigma$. Thus we arrive at phrasing the counterparts of (4.16-17) in the following language:
(4.21) PROPOSITION. A system $\Sigma = (F, -, H)$ is completely observable if and only if the quotient module $X_\Sigma/\text{kernel } \bar H$ is isomorphic with $X_\Sigma$.

(4.22) COROLLARY. The observable states of $\Sigma$ are to be identified with the elements of the quotient module $X_\Sigma/\text{kernel } \bar H$.

(4.23) TERMINOLOGY. The preceding considerations suggest viewing a system $\Sigma$ as essentially the same "thing" as a module $X$. Strictly speaking, however, knowing $\Sigma = (F, G, H)$ gives us not only $X_\Sigma = X_F$ (see (4.13)) but also a quotient module (over kernel $\bar H$) of a submodule (that generated by $G$) of $X_\Sigma$, that is

$X_{\Sigma_0} = K[z]G/\text{kernel } \bar H$.

If $X_\Sigma \cong X_{\Sigma_0}$ we say that $X_\Sigma$ is canonical (relative to the given $G$, $H$).

To be more precise, let us observe the following stronger version of (4.19-20):
(4.24) CORRESPONDENCE THEOREM. There is a bijective correspondence between $K[z]$-homomorphisms $f: \Omega \to \Gamma$ and the equivalence class of completely reachable and completely observable systems $\Sigma$ modulo a basis change in $X_\Sigma$.

Detailed discussion of this result is postponed until Section 7.

A stricter observation of the "duality principle" leads to

(4.25) DEFINITION. The $K$-linear dual of $\Sigma = (F, G, H)$ is $\Sigma^* = (F', H', G')$ ($'$ = matrix transposition). The states of $\Sigma^*$ are called costates of $\Sigma$.

The following fact is an immediate consequence of this definition:
(4.26) PROPOSITION. The state set $X_{\Sigma^*}$ of $\Sigma^*$ may be given the structure of a $K[z^{-1}]$-module, as follows: (i) as a vector space $X_{\Sigma^*}$ is the dual of $X_\Sigma$ regarded as a $K$-vector space, (ii) the scalar product in $X_{\Sigma^*}$ is defined by

$(z^{-1} \cdot x^*)(x) = x^*(Fx)$.

(4.26A) REMARK. We cannot define $X_{\Sigma^*}$ as equal to $\operatorname{Hom}_{K[z]}(X_\Sigma, K[z])$, the $K[z]$-linear dual of $X_\Sigma$, because every torsion module $M$ over an integral domain $D$ has a trivial $D$-dual. However, the reader can verify (using the ideas to be developed in Section 6) that $X_{\Sigma^*}$ defined above is isomorphic with $\operatorname{Hom}_{K[z]}(X_\Sigma, K(z)/K[z])$. See BOURBAKI [Algèbre, Chapter 7 (2e éd.), Section 4, No. 8].
Now we verify easily the following dual statements of (4.16-17):
(4.27) PROPOSITION. A system $\Sigma = (F, -, H)$ is completely observable if and only if $H'$ generates $X_{\Sigma^*}$.

(4.28) COROLLARY. The observable costates of $\Sigma$ are precisely the reachable states of $\Sigma^*$, that is, those of the submodule of $X_{\Sigma^*}$ generated by $H'$.

We have eliminated the abuse of language incurred by talking about "observable states" through introduction of the new notion of "observable costates". The full explication of why this is necessary (as well as natural) is postponed until Section 10.

The preceding simple facts depend only on the notion of a module and are immediate once we recognize the fact that $F$ may be eliminated from statements such as (2.8) by passing to the module induced by $F$ via (4.13). But module theory yields many other, less obvious results as well, which derive mainly from the fact that $K[z]$ is a principal-ideal domain.

We recall:
an element $m$ of an $R$-module $M$ ($R$ = arbitrary commutative ring) has torsion iff there is a nonzero $r \in R$ such that $r \cdot m = 0$. If this is not the case, $m$ is free. Similarly, $M$ is said to be a torsion module iff every element of $M$ has torsion. $M$ is a free module if no nonzero element has torsion. If $L \subset M$ is any subset of $M$, the annihilator $A_L$ of $L$ is the set

$A_L = \{r: r \cdot l = 0 \text{ for all } l \in L\}$;

it follows immediately that $A_L$ is an ideal in $R$. Note also that the statement that "$M$ is a torsion module" does not imply in general that $A_M$ is nontrivial, that is, $A_M \neq 0$. (Counterexample: take an $M$ which is not finitely generated.)

Coupling these notions with the special fact that, for us, $R = K[z]$, we get a number of interesting system-theoretic results:

(4.29) PROPOSITION. $\Sigma$ is finite-dimensional if and only if $X_\Sigma$ is a torsion $K[z]$-module.

COROLLARY. If $X_\Sigma$ is free, $\Sigma$ is infinite dimensional.

PROOF. We recall that "$\Sigma$ = finite-dimensional" is defined to be "$X_\Sigma$ = finite-dimensional as a $K$-vector space". See (1.18).

Sufficiency. By assumption $X$ is finitely generated by, say, $q$ nonzero elements $x_1, \ldots, x_q$ of $X_\Sigma$ (which are not necessarily the columns of $G$). Since $K[z]$ is a principal-ideal domain, each of the $A_{x_j}$ is a principal ideal, say, $\psi_j K[z]$ with $\psi_j \in K[z]$. If $X_\Sigma$ is a torsion module, then $\deg \psi_j = n_j > 0$ for all $j = 1, \ldots, q$. For otherwise $\psi_j$ is either zero (and then $x_j$ is free, which is a contradiction) or a unit which implies $x_j = 0$ contrary to assumption. Hence we can replace each expression

$\pi_j \cdot x_j$,   $\pi_j \in K[z]$,

by the simpler one

$(\pi_j \bmod \psi_j) \cdot x_j$,   $\deg (\pi_j \bmod \psi_j) < n_j$,

which shows that $X_\Sigma$, as a $K$-module, is generated by the finite set

$\{z^k \cdot x_j: 0 \leq k < n_j,\ j = 1, \ldots, q\}$.

Necessity. Let $\psi_F$ be the minimal polynomial of the map $F: x \mapsto z \cdot x$. If $X_\Sigma$ is finite-dimensional as a $K$-module, $\deg \psi_F > 0$. This means (by the usual definition of the minimal polynomial in matrix theory or more generally in linear algebra) that $\psi_F$ annihilates every $x \in X_\Sigma$ so that $X_\Sigma$ is a torsion $K[z]$-module. □
Notice, from the second half of the proof, that the notion of a minimal polynomial can be extended from $K$-linear algebra to $K[z]$-modules. In fact, the same argument gives us also the well-known

(4.30) PROPOSITION. Every finitely generated torsion module $M$ over a principal-ideal domain $R$ has a nontrivial minimal polynomial $\psi_M$ given by $A_M = \psi_M R$.

(4.31) COROLLARY. If a $K[z]$-module $X$ is finitely generated with $q$ generators and minimal polynomial $\psi_X$, then

$\dim X$ (as $K$-vector space) $\leq q \cdot \deg \psi_X$.

REMARK. The fact that $\Sigma_f$ is completely reachable and is therefore generated by $m$ vectors allows us to estimate the dimension of $\Sigma_f$ by (4.31) knowing only $\deg \psi_{X_f}$ but without having computed $X_f$ itself. (Knowing $X_f$ explicitly means knowing $F: x \mapsto z \cdot x$, etc.) In other words, the module-theoretic setup considerably enhances the content of Proposition (3.16). Guided by these observations, we shall develop in Section 8 explicit algorithms for calculating $\psi_{X_f}$ directly from $f$ without first having to compute $F$.

(4.33) PROPOSITION. If $X_\Sigma$ is a free $K[z]$-module, no state of $\Sigma$ can be simultaneously reachable and controllable.

PROOF. We recall that "$X_\Sigma$ = free" means that $X_\Sigma$ is (isomorphic to) a finite sum of copies of $K[z]$. Suppose for simplicity that $X_\Sigma = K[z]$. Then $x$ = reachable means that $x = \xi \cdot 1$ for some $\xi \in K[z]$. Similarly, $x$ = controllable means that $z^{|\omega|} \cdot x + \omega \cdot 1 = 0$ for some $\omega \in K[z]$. Hence if $x$ has both properties,

$(z^{|\omega|}\xi + \omega) \cdot 1 = 0$.

This shows that 1 is annihilated by $\xi \circ \omega$, the input $\xi$ followed by $\omega$, which contradicts the assumption that $X_\Sigma$ is free. □
The most important consequence of Theorem (4.2) is due to the fact that through it we can apply to linear dynamical systems the well-known

(4.34) FUNDAMENTAL STRUCTURE THEOREM FOR FINITELY GENERATED MODULES OVER A PRINCIPAL IDEAL DOMAIN $R$ (Invariant Factor Theorem for Modules). Every such module $M$ with $m$ generators is isomorphic to

(4.35)   $R/\psi_1 R \oplus \cdots \oplus R/\psi_q R \oplus R^s$,

where the $R/\psi_i R$ are quotient rings of $R$ viewed as modules over $R$, the $\psi_i$ (called the invariant factors of $M$) are uniquely determined by $M$ up to units in $R$, $\psi_i \mid \psi_{i-1}$, $i = 2, \ldots, q$, and, as usual, $R^s$ denotes the free $R$-module with $s$ generators; finally, $q + s \leq m$.

Various proofs of this theorem are referenced in KALMAN, FALB, and ARBIB [1969, page 270], and one is given later in Section 6.

Note: The divisibility conditions imply that $M$ is a torsion module iff $s = 0$ and then $\psi_M = \psi_1$.

One important consequence of this theorem (others in Section 7) is that it gives us the most general situation when $X_\Sigma$ is not a torsion module. For instance, combining (4.33) with (4.34), we get

PROPOSITION. A system cannot be simultaneously completely reachable and completely controllable if its $K[z]$-module $X$ has any $\infty$-dimensional components (i.e., $s > 0$ in (4.35)).

(4.37) REMARK. Although our entire development in this section may be regarded as a deep examination of Proposition (2.14), most of our comments apply equally well to (2.7), since both statements rest on the same algebraic condition (2.8). In fact, the only remaining thing to be "algebraized" is the notion of "continuous-time". We shall not do this here. Once this last step is taken, the algebraization of the Laplace transform (as related to ordinary linear differential equations) will be complete.
5. CYCLICITY AND RELATED QUESTIONS
We recall that an $R$-module $M$ ($R$ = arbitrary ring) is cyclic iff there is an element $m \in M$ such that $M = Rm$. [It would be better to say that such a module is monogenic: generated by one element $m$.] If $M$ is cyclic, the map $R \to M: r \mapsto r \cdot m$ is an epimorphism and has kernel $A_m$, the annihilating ideal of $m$. This plus the homomorphism theorem gives the well-known

(5.1) PROPOSITION. Every cyclic $R$-module $M$ with generator $m$ is isomorphic with the quotient ring $R/A_m$ viewed as an $R$-module.
This result is much more interesting when, as in our case, $R$ is not only commutative and a principal-ideal domain, but specifically the polynomial ring $K[z]$.

So let $X$ be a cyclic $K[z]$-module with generator $g$. Write $A_g = \psi_g K[z]$, where $\psi_g$ is the minimal or annihilating polynomial of $g$. By commutativity and cyclicity, $\psi_g$ is a minimal polynomial also for $X$. Hence $\psi_g = \psi_X = \psi$. In view of (5.1), $X \cong K[z]/\psi K[z]$.

Let us recall some features of the ring $K[z]/\psi K[z]$:

(i) Its elements are the residue classes of polynomials $\pi$ (mod $\psi$), $\pi \in K[z]$. Write these as $[\pi]$ or $[\pi]_\psi$. Multiplication is defined as $[\pi] \cdot [\sigma] = [\pi\sigma]$.

(ii) Each $[\pi]$ is either a unit or a divisor of zero. In fact, $[\pi]$ is a unit iff $(\pi, \psi)$ = greatest common divisor of $\pi$, $\psi$ is a unit in $K[z]$ (that is, $(\pi, \psi) \in K$). Then

$\sigma\pi + \tau\psi = 1$   ($\sigma, \tau \in K[z]$)

so that $[\sigma]$ is the inverse of $[\pi]$. On the other hand, if $(\pi, \psi) = \theta \neq$ unit in $K[z]$, then both $[\pi]$ and $[\psi/\theta]$ are zero divisors since

$[\pi] \cdot [\psi/\theta] = [(\pi/\theta)\psi] = 0$.

(iii) If $\psi$ is a prime in $K[z]$ (that is, an irreducible polynomial with respect to coefficients over the ground field $K$), then by (ii) $K[z]/\psi K[z]$ is a field. This is a very standard construction in algebraic number theory.

Since it is awkward to compute with equivalence classes $[\pi]$, we shall often prefer to work with the standard representative of $[\pi]$, namely a polynomial $\tilde\pi$ of least degree in $[\pi]$. $\tilde\pi$ is uniquely determined by $[\pi]$ and the condition $\deg \tilde\pi < \deg \psi$. Henceforth $\tilde{\ }$ will always be used in this sense.

The next two assertions are immediate:

(5.2) PROPOSITION. $K[z]/\psi K[z]$ as a $K$-vector space is isomorphic to the $K$-vector space of polynomials

$\{\pi \in K[z]: \deg \pi < n = \deg \psi\}$.

$K[z]/\psi K[z]$ is also isomorphic to this space as a $K[z]$-module, provided we define the scalar product in the latter as

$(\pi, \sigma) \mapsto \widetilde{\pi\sigma}$.

(5.3) PROPOSITION. If $X_\Sigma$ is cyclic with minimal polynomial $\psi$, then $\dim \Sigma = \deg \psi$.
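Feature (ii) of the ring $K[z]/\psi K[z]$ is constructive: the inverse of a unit $[\pi]$ comes out of the Euclidean algorithm. The following sketch works over $K$ = rationals, with polynomials stored as coefficient lists (lowest degree first); it is an illustration under these assumptions, and the helper names (`trim`, `divmod_poly`, `inverse_mod`, and so on) are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Quotient and remainder of p by a nonzero polynomial d."""
    d = trim(d)
    r = [Fraction(c) for c in trim(p)]
    q = [Fraction(0)] * max(len(r) - len(d) + 1, 0)
    while len(r) >= len(d):
        c, k = r[-1] / d[-1], len(r) - len(d)
        q[k] = c
        r = trim([r[i] - c * d[i - k] if i >= k else r[i] for i in range(len(r))])
    return trim(q), r

def inverse_mod(pi, psi):
    """Standard representative of [pi]^(-1) in K[z]/psi K[z], or None for a zero divisor."""
    r0, r1 = trim(psi), divmod_poly(pi, psi)[1]
    s0, s1 = [], [Fraction(1)]              # invariant: r_i = s_i * pi (mod psi)
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, mul([-c for c in q], s1))
    if len(r0) != 1:                        # gcd(pi, psi) is not a unit of K[z]
        return None
    return divmod_poly([c / r0[0] for c in s0], psi)[1]

psi = [-1, 0, 1]                            # psi = z^2 - 1
print(inverse_mod([0, 1], psi))             # [z] is a unit
print(inverse_mod([-1, 1], psi))            # [z - 1] shares a factor with psi
```

The final reduction modulo $\psi$ returns the representative of least degree, in the sense of the tilde convention above.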
Looking back at Theorem (4.34), we see that the most general $K[z]$-module is a direct sum of cyclic $K[z]$-modules. By combining (5.3) and (4.34) and using the fact that dimension is additive under direct summing, we can replace (4.31) by the following exact result:

PROPOSITION. If $X_\Sigma$ is a torsion module with invariant factors $\psi_1, \ldots, \psi_q$ then

$\dim \Sigma = \deg \psi_1 + \cdots + \deg \psi_q$.
A simple but highly useful consequence of cyclicity is the so-called control canonical form [KALMAN, FALB, and ARBIB, 1969, page 44] for a completely reachable pair $(F, g)$ where $g$ is an $n \times 1$ matrix. We shall now proceed to deduce this result.

Observe first that "$(F, g)$ completely reachable" is equivalent to "$g$ generates $X_F$, the module induced by $F$ via (4.13)." Let

$\chi_F(z) = \det (zI - F) = z^n + \alpha_1 z^{n-1} + \cdots + \alpha_n$,   $\alpha_i \in K$;

then $\chi_F$ is the characteristic (and also the minimal) polynomial for $X_F$. [This is a well-known fact of module theory. See for example KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 7] for detailed discussion.] As in KALMAN [1962], consider the vectors
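(5.5)
$e_n = g = 1 \cdot g = \chi_F^{(1)}(z) \cdot g$,
$e_{n-1} = z \cdot g + \alpha_1 g = \chi_F^{(2)}(z) \cdot g$,
$\vdots$
$e_1 = z^{n-1} \cdot g + \alpha_1 z^{n-2} \cdot g + \cdots + \alpha_{n-1} g = \chi_F^{(n)}(z) \cdot g$

in $X_F$. [For consistency, $\chi_F^{(n+1)}(z) = \chi_F(z)$.] These vectors are easily seen to be linearly independent over $K$. They generate $X_F$, since $X_F$ is isomorphic, as a $K$-vector space, to the space of polynomials of degree less than $n$ (Proposition (5.2)). Hence $e_1, \ldots, e_n$ are a basis for $X_F$ as a $K$-vector space. With respect to this basis, the $K$-homomorphism $z: x \mapsto z \cdot x$ is represented by the matrix

(5.6)   $F = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\alpha_n & -\alpha_{n-1} & -\alpha_{n-2} & \cdots & -\alpha_1 \end{pmatrix}$.

[This is proved by direct computation. In particular, it is necessary to use the fact that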
$z \cdot e_1 = z\chi_F^{(n)}(z) \cdot g = (\chi_F(z) - \alpha_n) \cdot g = -\alpha_n e_n$.]

Note that the last row of $F$ in (5.6) consists of the coefficients of $\chi_F$. By definition, $g = e_n$. Hence $g$ as a column vector in $K^n$ has the representation

(5.7)   $g = (0, \ldots, 0, 1)'$.

Conversely, suppose $F$, $g$ have the matrix representation (5.6-7) with respect to some basis in $K^n$. Then (by direct computation) the rank condition (2.8) is satisfied and therefore $(F, g)$ is completely reachable in both the continuous-time and discrete-time cases (Propositions (2.7) and (2.16)). We have now proved:

(5.8) PROPOSITION. The pair $(F, g)$ is completely reachable if and only if there is a basis relative to which $F$ is given by (5.6) and $g$ by (5.7).

(5.9) COROLLARY. Given an arbitrary $n$-th degree polynomial $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n$ in $K[z]$, $K$ = arbitrary field. There exists an $n$-vector $\ell$ such that

(5.10)   $\chi_{F - g\ell'} = \lambda$

if and only if $(F, g)$ is completely reachable.
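The "direct computation" behind the canonical form (5.6-7) and the coefficient assignment in the corollary can be checked numerically. The sketch below is illustrative only: it builds the companion matrix from $\alpha_1, \ldots, \alpha_n$, uses the standard choice $\ell' = (\beta_n - \alpha_n, \ldots, \beta_1 - \alpha_1)$ for this canonical form (stated here as an assumption, since the formula is not spelled out in the text above), and computes the characteristic polynomial by the Faddeev-LeVerrier recursion. All function names are hypothetical.

```python
from fractions import Fraction

def companion(alpha):
    """Matrix (5.6) for chi(z) = z^n + alpha[0] z^(n-1) + ... + alpha[n-1]."""
    n = len(alpha)
    F = [[Fraction(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    F.append([Fraction(-a) for a in reversed(alpha)])
    return F

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def char_poly(A):
    """Coefficients [1, c1, ..., cn] of det(zI - A) (Faddeev-LeVerrier recursion)."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = mat_mul(A, M)
        for i in range(n):
            M[i][i] += coeffs[-1]          # M_k = A M_(k-1) + c_(k-1) I
        AM = mat_mul(A, M)
        coeffs.append(-sum(AM[i][i] for i in range(n)) / k)
    return coeffs

def feedback(alpha, beta):
    """Assumed choice of l' = (beta_n - alpha_n, ..., beta_1 - alpha_1)."""
    return [Fraction(b) - Fraction(a) for a, b in zip(reversed(alpha), reversed(beta))]

def closed_loop(F, g, ell):
    """The matrix F - g l'."""
    n = len(F)
    return [[F[i][j] - g[i] * ell[j] for j in range(n)] for i in range(n)]

alpha, beta = [0, 0, 0], [1, 2, 3]
F, g = companion(alpha), [0, 0, 1]
print(char_poly(F))                                        # coefficients of chi_F
print(char_poly(closed_loop(F, g, feedback(alpha, beta))))  # coefficients of lambda
```

In this basis the feedback only alters the last row of $F$, which is why every monic polynomial of degree $n$ can be obtained.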
PROOF.
Suppose that
With respect to the same basis
(5.6-1),
forms
(F, g)
(5.5)
is completely reachable.
which exhibits the canonical
define
Then verifY by direct computation that Conversely, suppose that reachable.
(F,
A = ~_gtl' g)
is not completely
Then, recalling Proposition (2.12) (which is an
algebraic consequence of (2.8) and hence equally valid for both continuous-time and discrete-time), deg ~
•
~
Since
22
the polynomial
KU
and so is also
F-invariant subspace of X = ~,
is independent of the choicp of basis in
~
II
~ = XP/X • (In F 22 11 does not depend on the arbitrary choice of
and the same is true then also for
particular, X 2
is an
dim X > 0 2
~
22
in satisfYing the condition X =
we have for all
n-vectors
~ EB X
2.)
t, deg
This contradicts the claim that with suitable choice of
In view of (2.12),
~
22
A-X - · F- gt ·
> O.
is true for any
t.
In view of the importance of this last result, we shall
rephrase it in purely module theoretic terms:
A
o
THEOREM.
Let
K be an arbitrary field and
K[z]-module with generator n.
g and minimal polynomial
There is a bijection between n-th
= zn + 131 zn-l + .e: If' -+ If': x(j) .g ~
degree polynomials
(5.5»
J
such that
-
"A is the minimal polynomial for the
new module structure induced on X by the map Note that in (5 .11) The map l
z
X of degree
..• + 13 in K[ z l and K-homomorphisms n H l .• g (j = 1, " " nand x(j) defined
"A( e)
E
X a cyclic
(F, g, -) to
z*
l(x) .
z*: x H z -x -
lex) corresponds to gl'x in (5.10).
in (5.11) defines a control law for the system corresponding to the module X.
The passage from
is the module-theoretic form of the well-known open-loop
to closed-loop transformation used in classical linear control theory. PROOF.
If',
basis for treat
l
l
represents the equivalence class is never a
l
that this choi ce of
= 1,
••• , n + 1.
"A(l)(z _
13 j
.e
- O:j'
implies
We
(that is, an operator
l'x = l(~ 'g), where
= (s:
[s]
s·g
= x).
Unless
K[zl-homomorphism and therefore
does not commute with nonunits in K[z]. Define
j
form a
is clearly a well-defined K-homomorphism.
K-vector space), by writing
identically zero,
.e
x(l) .g, ••• , x(n)'g
formally as an element of K[z]
on X is a ~
Since the vectors
j
= 1,
.•• , n.
"A(j)(z -
Use induction on
j.
We prove first
l) = x(j)(z) for By definition,
l) = x(l)(z). f I n the general case,
(inductive hypothesis), (def . of .£), (def. of .£ .), J
(def. of x(j+l)).
j = n + 1)
It follows (case regarded as a
K[z*]-module.
A(l)(z*).g, ••• , A(n)(Z*)'g space since
F=~~asitions
to the
A annihilates
g
X
On the other hand, the is a basis for
X as a
K-vector
X(l)(z).g : .• , x(n)(z).g was such a basis.
is cyclic with generator by
that
also as a
K[z*]-module.
(5.1-2) the annihilating ideal of
So X
Hence
g" with respect
K[z*)-module structure cannot be generated by a polynomial
of degree less than
n,
nomial with respect to
that is, z*.
A is indeed the minimal poly-
The correspondence
A
f-'
.£ is obviously
o
bijective. The proof immediately implies the following COROLLARY. ~
K[z]-module.
respect to the are related as
Hz)
~
Then
x x
=
~. g
be any element of X viewed
has the representation
K[z*]-module structure on
X,
where
~*.g
S
with
~
s*
So the open-loop/closed-loop transformation is essentially a change in the canonical basis, provided $X$ is cyclic.

It is interesting that the $\chi^{(j)}$ have long been known in Algebra (they are related to the Tschirnhausen transformation discussed extensively by WEBER [1898, §46, 54, 74, 85, 96]), but their present (very natural) use in module theory seems to be new.

**Theorem (5.11) may be viewed as the central special case of Theorem A of the Introduction. Let us restate the latter in precise form as follows:

(5.13) THEOREM. Given an arbitrary $n$-th degree polynomial $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n \in K[z]$, $K$ = arbitrary field. There exists an $n \times m$ matrix $L$ over $K$ such that $\chi_{F - GL'} = \lambda$ if and only if $(F, G)$ is completely reachable.

For some time, this result had the status of a well-known folk theorem, considered to be a straightforward consequence of (5.9). The latter has been discovered independently by many people. (I first heard of it in 1958, proposed as a conjecture by J. E. Bertram and proved soon afterwards by the so-called root-locus method.) Indeed, the passage from (5.11) to (5.13) is primarily a technical problem. A proof of (5.13) was given by LANGENHOP [1964] and subsequently simplified by WONHAM [1967]. The first proof was (unnecessarily) very long, but the second proof is also unsatisfactory, since it depends on arguments using a splitting field of $K$
**The material between these marks was added after the Summer School.
and fails when $K$ is a finite field. We shall use this situation as an excuse to illustrate the power of the module-theoretic approach and to give a proof of (5.13) valid for arbitrary fields.

The procedure of LANGENHOP and WONHAM rests on the following fact, of which we give a module-theoretic proof:

(5.14) LEMMA. Let $K$ be an arbitrary but infinite field. Let $F$ be cyclic* and $(F, G)$ completely reachable. Then there is an $m$-vector $a \in K^m$ such that $(F, Ga)$ is also completely reachable.

We begin with a simple remark, which is also useful in reducing the proof of (5.13) to Lemma (5.18).

(5.15) SUBLEMMA. Every submodule of a cyclic module over a principal-ideal domain is cyclic.

PROOF OF (5.14). We use induction on $m$. The case $m = 1$ is trivial. The general case amounts to the following. Consider the submodule $Y$ of $X = X_F$ generated by the columns $g_1, \ldots, g_{m-1}$ of $G$. In view of (5.15), $Y$ is cyclic. By the inductive hypothesis, we are given the existence of a cyclic generator of $Y$ of the form

$g_Y = a_1 g_1 + \cdots + a_{m-1} g_{m-1}$,   $a_i \in K$.

We must prove: for suitable $\alpha, \beta \in K$ the vector $\alpha \cdot g_Y + \beta \cdot g_m$ is a cyclic generator for $X$.
*Of course, this means that the $K[z]$-module $X_F$ (see (4.13)) is cyclic.
By hypothesis, X has an (abstract) cyclic generator $g_X$. By cyclicity we have the representations

$g_Y = \eta\cdot g_X$ and $g_m = \xi\cdot g_X$, $\eta, \xi \in K[z]$.

Hence our problem is reduced to proving the following: for suitable $\alpha, \beta \in K$ the polynomial $\alpha\eta + \beta\xi$ is a unit in $K[z]/\chi_X K[z]$. This, in turn, is equivalent to proving

(5.16) $\alpha\eta_i + \beta\xi_i \not\equiv 0 \pmod{\theta_i}$, $i = 1, \dots, r$,

where $\theta_1, \dots, \theta_r$ are the unique prime factors of $\chi_X$ and $\eta_i$, $\xi_i$ mean the representative of least degree of equivalence classes mod $\theta_i K[z]$.

No pair $(\eta_i, \xi_i)$, $i = 1, \dots, r$, can be zero. For if one is, then $\theta_i \mid (\chi_X, \eta, \xi)$, that is, $\chi_X/\theta_i$ annihilates the submodule $X' = K[z]g_Y + K[z]g_m$, whence X' is a proper submodule of X, contradicting the fact that (F, G) is completely reachable. If all the $\xi_i$ are zero, then every $\eta_i \neq 0$, so $\eta$ is a unit in $K[z]/\chi_X K[z]$ and $g_Y$ is already a cyclic generator. So let $\alpha = 1$. Then, for each i, the condition $\eta_i + \beta\xi_i = 0$ eliminates at most one value of $\beta$ from consideration. Since K is infinite by hypothesis, there are always some $\beta$ which satisfy (5.16). □

An essential part of the lemma is the stipulation that $a \in K^m$. The hypothesis "F = cyclic + (F, G) = completely reachable" means that $g_X = Ga$, $a \in K^m[z]$, that is, the lemma is trivially true for some $a \in K^m[z]$. But since we want $a \in K^m$, there must be interaction between vector-space structure and module structure, and for this reason the lemma is nontrivial. As a matter of fact, the lemma is false when K = finite field. The simplest counterexample is provided when (5.16) rules out a single nonzero value of $\beta$, thereby ruling out all $\beta$.

(5.17) COUNTEREXAMPLE.
Let $K = \mathbb{Z}/2\mathbb{Z}$, that is, the ring of integers modulo the prime ideal $2\mathbb{Z}$. Consider $X = X_1 \oplus X_2 \oplus X_3$ (as a K[z]-module), where the minimal polynomials of the direct summands are

$\chi_1(z) = z^2 + z + 1$, $\chi_2(z) = z^2$, $\chi_3(z) = z + 1$.

All these factors are relatively prime. Notice that $(\chi_1, \chi_2, \chi_3) = 1$, hence X is cyclic. Notice also that $g_1$ generates $X_1 \oplus X_3$ while $g_2$ generates $X_2 \oplus X_3$. A cyclic generator for X is
A simple calculation gives
(z
4
2
+ Z
+ l)'~'
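Conditions (5.16) are here

$\alpha\cdot 1 + \beta\cdot 0 \not\equiv 0 \pmod{\chi_1}$,
$\alpha\cdot 0 + \beta\cdot 1 \not\equiv 0 \pmod{\chi_2}$,
$\alpha\cdot 1 + \beta\cdot 1 \not\equiv 0 \pmod{\chi_3}$.

These conditions have no solution in $\mathbb{Z}/2\mathbb{Z}$.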
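The failure of the three conditions over the two-element field can be confirmed by exhaustive search. The following is only a sketch: the residue patterns `ETA` and `XI` are read off from the conditions displayed above, and the function name `solutions` is ours, not the text's. Running the same search over $\mathbb{Z}/3\mathbb{Z}$ (with the same residue pattern, which is an assumption made purely for comparison) shows how a larger field leaves room for an admissible pair.

```python
from itertools import product

# Residues of the multipliers of g_1 and g_2 modulo chi_1, chi_2, chi_3,
# read off from the three displayed conditions (an illustrative encoding).
ETA = (1, 0, 1)
XI = (0, 1, 1)

def solutions(q):
    """All (alpha, beta) in (Z/qZ)^2 with alpha*eta_i + beta*xi_i != 0 for every i."""
    return [
        (a, b)
        for a, b in product(range(q), repeat=2)
        if all((a * e + b * x) % q != 0 for e, x in zip(ETA, XI))
    ]

print(solutions(2))  # empty: no admissible pair over Z/2Z
print(solutions(3))  # non-empty over Z/3Z
```

Over $\mathbb{Z}/2\mathbb{Z}$ the first two conditions force $\alpha = \beta = 1$, and then the third fails because $1 + 1 = 0$.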
At this point, the following is the situation concerning Theorem (5.13):

(1) Its counterpart, Theorem A of the Introduction, was claimed to be true in the continuous-time case under the hypothesis of complete controllability.

(2) In the discrete-time case (5.13) with the preceding hypothesis Theorem A is false, because of the counterexample: the pair (F = nilpotent, G = 0) is completely controllable, but evidently $\chi_{F-GL}$ is independent of L. However, in view of (5.11), Theorem (5.13) might be true also in the discrete-time case if "complete controllability" is replaced by "complete reachability", this modification being immaterial in the continuous-time case.

(3) Because of (5.17), we might expect that a theorem like (5.13) is false for an arbitrary field K.

(4) If our general claim that reachability properties are reflected in module-theoretic properties is true, then (5.13) should hold without assumptions concerning K, because the principal module-theoretic fact, that K[z] = principal ideal domain, is independent of the specific choice of K.

We now proceed to establish Theorem (5.13). That is, special hypotheses on K will turn out to be irrelevant.
PROOF OF (5.13). Necessity is proved exactly as in (5.8). Sufficiency will follow by induction on m, once we have proved it in the special case m = 2:

(5.18) LEMMA. Let K be an arbitrary field and let X be a K[z]-module generated by $g_1$, $g_2$. There is a K-homomorphism $\ell$ (of the type defined in (5.11)) such that if $z^* = z - \ell$ induces a K[z*]-module structure on X then X is cyclic with respect to this structure and is generated by either $g_1 + g_2$ or $g_2$.

PROOF. Let $Y = K[z]g_1$ and $Z = K[z]g_2$.

Case 1. $Y \cap Z = 0$, that is, $X = Y \oplus Z$. In (5.11) take an $\ell$ such that $\ell(x) = 0$ for all $x \in Z$. Replacing z by $z - \ell$ will change the K[z]-module structure on Y but preserve that on Z. Further, choose $\ell$ on Y so that the new minimal polynomial $\lambda$ on Y is prime to the unchanged minimal polynomial $\chi_Z$ on Z. Thus there exist polynomials $\nu$, $\sigma$ such that

$\nu\lambda + \sigma\chi_Z = 1$.

By hypothesis, every $x \in X$ has the representation $x = \eta\cdot g_1 + \xi\cdot g_2$. Now verify that

$x = (\eta\sigma\chi_Z + \xi\nu\lambda)\cdot(g_1 + g_2)$
$\;\; = \eta\sigma\chi_Z\cdot g_1 + \xi\nu\lambda\cdot g_2$
$\;\; = \eta(1 - \nu\lambda)\cdot g_1 + \xi(1 - \sigma\chi_Z)\cdot g_2$
$\;\; = \eta\cdot g_1 + \xi\cdot g_2$,

so that $g_1 + g_2$ generates X as a K[z*]-module.

Case 2. $Y \cap Z = W \neq 0$. To show: there is a suitable new module structure on X such that $g_2$ generates Y, and so Z = X. Take some $w \neq 0$, $w \in W$. By hypothesis, there is a $\xi \in K[z]$ such that $\xi\cdot g_2 = w$ and therefore, by cyclicity of Y, there is also an $\eta \in K[z]$ such that

$\xi\cdot g_2 = w = \eta\cdot g_1$.

Then if $\eta$ = unit (mod $\chi_Y$) we are done because $g_1 = \eta^{-1}\xi\cdot g_2$. In the nontrivial case, $\eta \neq$ unit (mod $\chi_Y$), we must show that there is a suitable new module structure for which $\eta^*$ = unit (mod $\chi^*$), $\chi^*$ being the minimal polynomial of Y as a K[z*]-module.
The main facts we need are the following:

(5.19) SUBLEMMA. Let $\chi$ be a fixed element of K[z], with $\deg\chi = n$, $F_\chi$ the companion matrix of $\chi$ given by (5.6), $X_{F_\chi}$ the cyclic module induced by $F_\chi$, and g a cyclic generator of $X_{F_\chi}$. Then $\eta \in K[z]$ is a unit modulo $\chi$ if and only if $\eta\cdot g$ is also a cyclic generator of $X_{F_\chi}$.

PROOF. Obvious. □
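The sublemma can be checked on a small example by testing complete reachability of the pair $(F_\chi, \eta(F_\chi)g)$ through the determinant of the matrix with columns $y, F_\chi y, \dots, F_\chi^{n-1}y$. The sketch below works over the rationals and makes several assumptions of its own: the companion-matrix convention ($F e_i = e_{i+1}$), the choice $g = e_1$, and all function names are ours and need not agree with (5.5)-(5.6) of the text.

```python
from fractions import Fraction

def companion(chi):
    """Companion matrix of monic chi = [1, a1, ..., an] (highest degree first).
    Assumed convention: F e_i = e_{i+1}, so e_1 is a cyclic generator."""
    n = len(chi) - 1
    F = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        F[i][i - 1] = Fraction(1)
    for i in range(n):
        F[i][n - 1] = -Fraction(chi[n - i])
    return F

def matvec(F, v):
    return [sum(f * x for f, x in zip(row, v)) for row in F]

def poly_at(eta, F, g):
    """Evaluate eta(F) g by Horner's rule; eta is given highest degree first."""
    y = [Fraction(0)] * len(g)
    for c in eta:
        y = [fy + c * gi for fy, gi in zip(matvec(F, y), g)]
    return y

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            k = M[r][c] / M[c][c]
            M[r] = [x - k * y for x, y in zip(M[r], M[c])]
    return d

def generates(eta, chi):
    """True iff eta . e_1 is a cyclic generator of the module induced by companion(chi)."""
    F = companion(chi)
    n = len(F)
    g = [Fraction(1)] + [Fraction(0)] * (n - 1)
    cols = [poly_at(eta, F, g)]
    for _ in range(n - 1):
        cols.append(matvec(F, cols[-1]))
    return det(cols) != 0  # det is unchanged by transposition

print(generates([1, 2], [1, 0, -1]))   # eta = z + 2 is prime to chi = z^2 - 1
print(generates([1, -1], [1, 0, -1]))  # eta = z - 1 divides chi = z^2 - 1
```

In the first call $\eta$ is a unit modulo $\chi$ and the determinant is nonzero; in the second $\eta$ shares the factor $z - 1$ with $\chi$ and the determinant vanishes.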
(5.20) SUBLEMMA. Same notations as in (5.19). Write

$\eta = \sum_{j=1}^{n}\eta_j\chi^{(j)}$, $\eta_j \in K$.

Then $\eta$ is a unit modulo $\chi$ if and only if

(5.21) $\det(y, F_\chi y, \dots, F_\chi^{n-1}y) \neq 0$,

where y is the column vector

(5.22) $y = (\eta_1, \dots, \eta_n)'$.

PROOF. Since $\chi^{(1)}, \dots, \chi^{(n)}$ is the basis for the K-vector space of all polynomials of degree < n, the n-tuple $(\eta_1, \dots, \eta_n)$ is uniquely determined by $\eta$. By definition $F_\chi$ is the matrix representing the module operator $z: x \mapsto z\cdot x$ relative to the special basis $e_1, \dots, e_n$ in $X_{F_\chi}$ given by (5.5). Similarly, using one of the module axioms, we verify that

$\eta\cdot g = \sum_{j=1}^{n}[\eta_j\chi^{(j)}(z)]\cdot g = \sum_{j=1}^{n}\eta_j[\chi^{(j)}(z)\cdot g]$,

in other words, the numerical vector (5.22) represents the abstract vector $\eta\cdot g$ in $X_{F_\chi}$ relative to the same basis $e_1, \dots, e_n$. Recall that $\eta\cdot g$ generates $X_{F_\chi}$ iff $(F_\chi, \eta(F_\chi)g)$ is completely reachable. By (2.7) the latter condition is equivalent to (5.21). The rest follows from (5.19). □

SUBLEMMA. Same notations as in (5.19) and (5.20). Given any nonzero numerical n-vector (5.22), there exists a polynomial $\chi$ such that (5.21) is satisfied.

PROOF. Let $\eta_r$ be the first member of the sequence of numbers $\eta_1, \eta_2, \dots$ which is nonzero.
n +
Z
~z
and determine the first ll r
'i'ir+l
o
Tjr
o
o
n-l r
Write
+ •.. + an' coefficients of
X by the rule
~:J
:J
T}r
an
o
o
1
(Since all numbers belong to a field, the required values of $\alpha_r, \dots, \alpha_n$ exist.) Now check, by computation, that these conditions reduce the matrix in (5.21) to the direct sum of two triangular matrices, each with nonzero elements on its diagonal. □

In view of (5.12), it follows from these facts that we can always choose a new $\chi_Y = \chi^\dagger$ such that $\eta^\dagger$ = unit mod $\chi^\dagger$.
The proof of Case 2 is not yet complete, however, because we must still extend the K[z*]-module structure from Y to X. This is easy. Write first $Z = W \oplus Z'$ and then $X = Y \oplus Z'$, where the direct sum is now with respect to the K-module structure of X. Extend $\ell^\dagger$ from Y to X by setting $\ell = 0$ on Z'. Now we have a new minimal polynomial $\chi^*$ defined over X. Since $z^* = z^\dagger$ on Y, $\eta^* = \eta^\dagger$. By (5.12), $\xi$ is replaced by some $\xi^*$ such that

(5.24) $\xi^*\cdot g_2 = w = \eta^*\cdot g_1$,

that is, our previous representation of $w \neq 0$ in W induces a similar representation with respect to the new K[z*]-module structure on X. Since $\eta^*$ is a unit modulo $\chi^\dagger$, we can write $\zeta\eta^* = 1 + \tau\chi^\dagger$ for suitable polynomials $\zeta$, $\tau$. By (5.24), we have, with respect to the K[z*]-structure,

$\zeta\cdot(\xi^*\cdot g_2) = \zeta\cdot(\eta^*\cdot g_1) = (1 + \tau\chi^\dagger)\cdot g_1 = g_1$.

This proves that $g_2$ generates both Y and Z; that is, $g_2$ is a cyclic generator for X endowed with the K[z*]-structure. The proof of Lemma (5.18) is now complete. □
It should be clear that Theorem (5.13) is not a purely module-theoretic result, but depends on the interplay between module theory, vector spaces, and elimination theory (via (5.21)). For instance, the fact that $\ell$ can be extended from Y to X, which was needed in the proof of Case 2, is a typical vector-space argument.**

There are many open (or forgotten) results concerning cyclic modules which are of interest in system theory. For instance, it is easy to show that an n × n real matrix is cyclic iff a certain polynomial in the entries of the matrix is nonzero at F; this polynomial is roughly analogous to the polynomial det in the same ring, but, unlike in the latter case, its general form does not seem to be known.

We must not terminate this discussion without pointing out another consequence of cyclicity which transcends the module framework. Since X = cyclic with generator g is isomorphic with $K[z]/\chi_g K[z]$, it is clear that X also has the structure of this commutative ring, that is, the product is defined as

$x \times y = (\xi\eta)\cdot g$, where $x = \xi\cdot g$ and $y = \eta\cdot g$.

If $\chi_g$ is irreducible, then X is even a field. Hence, in particular, X has a galois group. No one has ever given a dynamical interpretation of this galois group. In other words, there are obvious algebraic facts in the theory of dynamical systems which have never been examined from the dynamical point of view. For some related comments in the setting of topological semigroups, see DAY and WALLACE [1967].
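The ring structure carried by a cyclic state module, and the fact that it is a field exactly when the minimal polynomial is irreducible, can be illustrated over the two-element field. The sketch below encodes a polynomial over GF(2) as an integer bitmask (bit i is the coefficient of $z^i$); this encoding and the function names `mulmod` and `is_field` are our own illustrative choices.

```python
def mulmod(a, b, chi):
    """Multiply polynomials a, b over GF(2) modulo chi (bit i = coefficient of z^i).
    Assumes deg a < deg chi."""
    deg = chi.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a          # add a times the current power of z
        b >>= 1
        a <<= 1                  # multiply a by z
        if (a >> deg) & 1:
            a ^= chi             # reduce modulo chi
    return result

def is_field(chi):
    """True iff every nonzero residue modulo chi has a multiplicative inverse."""
    size = 1 << (chi.bit_length() - 1)
    return all(
        any(mulmod(x, y, chi) == 1 for y in range(1, size))
        for x in range(1, size)
    )

print(is_field(0b111))   # chi = z^2 + z + 1, irreducible over GF(2)
print(is_field(0b100))   # chi = z^2, reducible
```

For $\chi = z^2 + z + 1$ every nonzero state has an inverse under the product, whereas for $\chi = z^2$ the state z is nilpotent and has none.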
6. TRANSFER FUNCTIONS

(6.0) PREAMBLE. There has been a vigorous tradition in engineering (especially in electrical engineering in the United States during 1940-1960) that seeks to phrase all results of the theory of linear constant dynamical systems in the language of the Laplace transform. Textbooks in this area often try to motivate their biased point of view by claiming that "the Laplace transform reduces the analytical problem of solving a differential equation to an algebraic problem". When directed to a mathematician, such claims are highly misleading because the mathematical ideas of the Laplace transform are never in fact used. The ideas which are actually used belong to classical complex function theory: properties of rational functions, the partial-fraction expansion, residue calculus, etc. More importantly, the word "algebraic" is used in engineering in an archaic sense and the actual (modern) algebraic content of engineering education and practice as related to linear systems is very meager. For example, the crucial concept of the transfer function is usually introduced via heuristic arguments based on linearity or "defined" purely formally as "the ratio of Laplace transforms of the output over the input". To do the job properly, and to recognize the transfer function as a natural and purely algebraic gadget, requires a drastically new point of view, which is now at hand as the machinery set up in Sections 3-5. The essential idea of our present treatment was first published in KALMAN [1965b].
The first purpose of this section is to give an intrinsically algebraic definition of the transfer function associated with a discrete-time, constant, linear input/output map (see Definition (3.10)). Since the applications of transfer functions are standard, we shall not develop them in detail, but we do want to emphasize their role in relating the classical invariant factor theorem for polynomial matrices to the corresponding module theorem (4.34).

Consider an arbitrary K[z]-homomorphism $f: \Omega \to \Gamma$ (see the lemma following Theorem (4.2)). Then as a "mathematical object" f is equivalent to the set $\{f(e_j),\ j = 1, \dots, m,\ e_j \text{ defined by (4.6)}\}$, since

(6.1) $f(\omega) = \sum_{j=1}^{m}\omega_j\cdot f(e_j)$.

(The scalar product on the right is that in the K[z]-module $\Gamma$, as defined in Section 4.) By definition of $\Gamma$, each $f(e_j)$ is a formal power series in $z^{-1}$ with vanishing first term. We shall try to represent these formal power series by ratios of polynomials (which we shall call transfer functions*) and then we can replace formula (6.1) by a certain specially defined product of a ratio of polynomials by a polynomial. Some algebraic sophistication will be needed to find the correct rules of calculation. These "rules" will constitute a rigorous (and simple) version of Heaviside's so-called "calculus". There are no conceptual complications of any sort. (However, we are dodging some difficulties by working solely in discrete-time.)

*This entrenched terminology is rather unenlightening in the present algebraic context.
Let $X_f = \Omega/\text{kernel } f$ be the state set of f regarded as a K[z]-module. We assume that $X_f$ is a torsion module with nontrivial minimal polynomial $\psi$. Then, since $\psi$ annihilates $X_f$, for each $j = 1, \dots, m$ we have

(6.2) $\psi\cdot f(e_j) = f(\psi\cdot e_j) = 0$.

By definition of the module structure on $\Gamma$, (6.2) means that the ordinary product of the power series $f(e_j)$ by the polynomial $\psi$ is a (vector) polynomial. Hence (6.2) is equivalent to (notation: no dot = ordinary product)

(6.2') $\psi f(e_j) = \theta_j \in K^p[z]$, $j = 1, \dots, m$.

Intuitively, we can solve this equation by writing $f(e_j) = \theta_j/\psi$. There are two ways of making this idea rigorous.

Method 1. Define

(6.3) $f(e_j) = \theta_j/\psi$

as the formal division of $\theta_j$ by $\psi$ into ascending powers of $z^{-1}$. Check that the coefficient of $z^0$ is always 0. Verify by computation that the power series so obtained satisfies (6.2').

Method 2. Write $\tilde\psi(z^{-1}) = z^{-n}\psi(z)$ and $\tilde\theta_j(z^{-1}) = z^{-n}\theta_j(z)$, where $n = \deg\psi$. Multiply both sides of (6.2') by $z^{-n}$. Then $\tilde\psi \in K[z^{-1}] \subset K[[z^{-1}]]$ and (6.2') becomes

(6.2'') $\tilde\psi f(e_j) = \tilde\theta_j$.

Moreover, the 0-th coefficient of $\tilde\psi$ is 1 (because of the convention that the leading coefficient of $\psi$ is 1), hence $\tilde\psi$ is a unit in $K[[z^{-1}]]$ and therefore

(6.3') $f(e_j) = \tilde\psi^{-1}\tilde\theta_j$.

Note that (6.3) and (6.3') give slightly different definitions of $f(e_j)$, depending on whether we use a transfer function with respect to the variable z or $z^{-1}$. (Both notations have been used in the engineering literature.) For us the formalism of Method 1 is preferable. (The calculations of Method 1 can actually be reduced by Method 2 to the better-known calculations of the inverse in the ring $K[[z^{-1}]]$.)
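Both methods are easy to carry out mechanically for a scalar numerator. The following sketch, with function names and coefficient conventions of our own choosing (coefficient lists are given highest degree first, the denominator is monic, and the numerator has smaller degree), performs the formal division of Method 1 and, separately, the inversion in $K[[z^{-1}]]$ of Method 2, so that the two expansions can be compared.

```python
from fractions import Fraction

def expand(num, psi, terms):
    """Method 1: coefficients c_1, c_2, ... of num/psi = c_1 z^-1 + c_2 z^-2 + ...
    obtained by formal long division."""
    n = len(psi) - 1
    # remainder polynomial of degree < n, coefficients of z^(n-1), ..., z^0
    rem = [Fraction(0)] * (n - len(num)) + [Fraction(t) for t in num]
    out = []
    for _ in range(terms):
        c = rem[0]                      # next coefficient of the series
        out.append(c)
        # new remainder: z * rem - c * psi (the leading terms cancel)
        rem = [r - c * Fraction(a)
               for r, a in zip(rem[1:] + [Fraction(0)], psi[1:])]
    return out

def expand_via_inverse(num, psi, terms):
    """Method 2: invert psi~(w) = w^n psi(1/w) in K[[w]], w = z^-1, then multiply."""
    n = len(psi) - 1
    a = [Fraction(x) for x in psi]      # psi~(w) = sum a[i] w^i, a[0] = 1
    t = [Fraction(0)] * (n - len(num)) + [Fraction(x) for x in num]
    inv = [Fraction(1)]                 # coefficients of 1/psi~(w)
    for k in range(1, terms):
        inv.append(-sum(a[i] * inv[k - i] for i in range(1, min(k, n) + 1)))
    # num~(w) = sum_{i=1..n} t[i-1] w^i ; coefficient of w^k in the product
    return [sum(t[i - 1] * inv[k - i] for i in range(1, min(k, n) + 1))
            for k in range(1, terms + 1)]

psi = [1, -1, -1]                       # psi(z) = z^2 - z - 1
print(expand([1, 0], psi, 5))           # numerator z
print(expand_via_inverse([1, 0], psi, 5))
```

With numerator z and denominator $z^2 - z - 1$ both routines return the Fibonacci numbers 1, 1, 2, 3, 5 as the coefficients of $z^{-1}, \dots, z^{-5}$; the coefficient of $z^0$ never appears because the numerator has smaller degree than the denominator.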
Summarizing, we have the easy but fundamental result:

(6.4) EXISTENCE OF TRANSFER FUNCTIONS. There is a bijective correspondence between K[z]-homomorphisms $f: \Omega \to \Gamma$ with minimal polynomial $\psi$ and transfer function matrices of the type

$Z = (\theta_1/\psi, \dots, \theta_m/\psi)$,

where $\theta_j \in K^p[z]$, $\deg\theta_j < \deg\psi$, and $\psi$ is the least common denominator of Z.

In many contexts, it is preferable to deal with the $Z_f$ corresponding to f rather than with f itself. Because the correspondence is bijective, it is clear that all objects induced by f are well-defined also for $Z_f$ and conversely. Thus, for instance,

dim $Z_f \triangleq$ dim $f$ = dim $X_f$;
minimal polynomial of $Z \triangleq \psi_Z$ = least common denominator of Z.
(6.5) REMARK. In view of Propositions (4.20-21), the natural realization of Z, namely $X_Z \triangleq X_{f_Z}$, is completely reachable as well as completely observable. Not having this fact available before 1960 has caused a great confusion. Questions such as those resolved by Theorem (5.13) tended to be attacked algorithmically, using special tricks amounting to elementary algebraic manipulations of elements of Z. Very few theoretical results could be conclusively established by this route until the conceptual foundations of the theory of reachability and observability were developed.

The preceding results may be restated as "rules" whereby the values of f may be computed using Z. We have in fact,

(6.6) $f(\omega) = Z\cdot\omega$, where the dot means: multiply the polynomial matrix consisting of the numerators of Z with $\omega$, reduce to minimal-degree polynomials modulo $\psi_Z$, and then divide formally by $\psi_Z$ as in Method 1 above.

We can also compute the entire output of the system $\Sigma_Z$ (that is, all output values following the application of the first nonzero input value) by the rule

(6.7) same as above, but do not reduce modulo $\psi_Z$.

In this second case, the output sequence will begin with a positive power of z. (The coefficients of the positive powers of z are thrown away in the definition of f (see (3.7)) and in the definition of the scalar product in $\Gamma$, in order to secure a simple formula for $X_f = \Omega/\text{kernel } f$.)

Many other applications of transfer functions may be found in KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 10].

It is easy to show that the transfer function associated with the system $\Sigma = (F, G, H)$ is given by $Z_\Sigma = H(zI - F)^{-1}G$. (This is just the formal Laplace transform computed from the constant version of (1.12) by setting z = d/dt or from (1.17) by setting x(t + 1) = zx(t).)
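The expression $H(zI - F)^{-1}G$ can be computed exactly for a small example and compared with the output coefficients $HG, HFG, HF^2G, \dots$ of the system. The sketch below is not the formula of the text: it swaps in the standard Faddeev-LeVerrier recursion, which yields the characteristic polynomial of F and the polynomial matrix adj(zI - F), and the function names and the sample triple (F, G, H) are our own assumptions.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def faddeev_leverrier(F):
    """Return (coeffs, Ms) with det(zI - F) = z^n + coeffs[0] z^(n-1) + ... + coeffs[n-1]
    and adj(zI - F) = sum over k of Ms[k] z^(n-1-k)."""
    n = len(F)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M, coeffs, Ms = I, [], []
    for k in range(1, n + 1):
        Ms.append(M)
        FM = matmul(F, M)
        c = -sum(FM[i][i] for i in range(n)) / k   # minus trace(F M) over k
        coeffs.append(c)
        M = [[FM[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    return coeffs, Ms

def transfer_function(F, G, H):
    """Denominator coefficients and numerator matrices of H adj(zI - F) G / det(zI - F)."""
    coeffs, Ms = faddeev_leverrier(F)
    return coeffs, [matmul(matmul(H, M), G) for M in Ms]

def markov(F, G, H, terms):
    """H G, H F G, H F^2 G, ...: coefficients of z^-1, z^-2, ... in H (zI - F)^-1 G."""
    out, X = [], G
    for _ in range(terms):
        out.append(matmul(H, X))
        X = matmul(F, X)
    return out

F = [[0, 1], [1, 1]]
G = [[0], [1]]
H = [[1, 0]]
print(transfer_function(F, G, H))   # denominator z^2 - z - 1, numerator 0*z + 1
print(markov(F, G, H, 5))
```

For this triple the transfer function is $1/(z^2 - z - 1)$, and formal division of 1 by $z^2 - z - 1$ into powers of $z^{-1}$ reproduces the printed output coefficients 0, 1, 1, 2, 3.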
Probably the simplest way of computing
Z
is
via the formula
6.8)
q
where
1/I
F
is the minimal polynomial of the matrix
script denotes the special polynomials defined in identity
(6.8)
deg .1/I, F and the super-
(5.5).
The matrix
follows at once from the classical scalar identity
[WEBER, 1898, §4]
ttl . ( L) (z - w) .L. zJ;; q-a, (w),
7T(Z) - 7T( w) upon setting
w
= F,
7T
J= l
= 1/IF'
q
deg 7T,
and invoking the Cayley-Hamilton theorem.
Much of classical linear system theory was concerned with computing $Z_f$. In the modern context, this problem "factors" into first solving the realization problem $f \to \Sigma_f$ and then applying formula (6.8). See Sections 8 and 9.

One of the mysterious features of Rule (6.6) (as contrasted with the conventional rule (6.7)) is the necessity of reducing modulo $\psi$. The simplest way of understanding the importance of this aspect of the problem is to show how to relate the module invariant factors occurring in the structure theorem (4.34) to the classical facts concerning the invariant factors of a polynomial matrix.

INVARIANT FACTOR THEOREM FOR MATRICES.
Let
P be a
matrix with elements in an arbitrary principal-ideal domain p
(6.10) where
A and
Bare
p X P
diag
IT
= rank
P.
The
II. 1
and
Rand
Then
m X m matrices (not necessarily det A, det B units in
(~x, ~FEx, •••
define a factorization of
f.
Hence the correspondence between (3.12)
~ (7.2) is bijective.
The quickest way to exploit the algebraic consequences of our definition (7.2) is via the following arrow-theoretic fact:
ZEIGER FILL-IN LEMMA. Let A, B, C, D be sets and $\alpha$, $\beta$, $\gamma$, and $\delta$ set maps for which the following diagram commutes:

[Diagram: a square with $\alpha: A \to B$ on top, $\gamma: A \to C$ on the left, $\beta: B \to D$ on the right, $\delta: C \to D$ on the bottom, and a dashed diagonal arrow from B to C.]

If $\alpha$ is surjective and $\delta$ is injective, there exists a unique set map corresponding to the dashed arrow which preserves commutativity.

This follows by straightforward "diagram-chasing", which proves at the same time the

COROLLARY. The claim of the lemma remains valid if "sets" are replaced by "R-modules" and "set maps" by "R-homomorphisms".

Applying the module version of the lemma twice, we get
(7.6) PROPOSITION. Consider any two canonical realizations of a fixed f: the corresponding state-sets are isomorphic as K[z]-modules.

Since every K[z]-module is automatically also a K-vector space, (7.6) shows that the two state sets are K-isomorphic, that is, have the same dimension as vector spaces. The fact that they are also K[z]-isomorphic implies, via Theorem (4.34), that they have the same invariant factors. We have already employed the convention that (in view of the bijection between f and $\Sigma_f$) the invariant factors of f and $X_f$ are to be identified. In view of (7.6), this is now a general fact, not dependent on the special construction used to get $X_f$. We can therefore restate (7.6) as the

(7.7) ISOMORPHISM THEOREM FOR CANONICAL REALIZATIONS. Any two canonical realizations of a fixed f have isomorphic state modules. The state module of a canonical realization is uniquely characterized (up to isomorphism) by its invariant factors, which may be also viewed as those of f.

A simple exercise proves also

(7.8) PROPOSITION. If X is the state module of a canonical realization of f, then dim X (as a vector space) is minimum in the class of all realizations of f.

This result has been used in some of the literature to justify the terminology "minimal realization" as equivalent to "canonical realization". We shall see in Section 9 that the two notions are not always equivalent; we prefer to view (7.2) as the basic definition and (7.8) as a derived fact.
REMARK. Theorem (7.7) constitutes a proof of the previously claimed (4.24). To be more explicit: if $\Sigma = (F, G, H)$ and $\hat\Sigma = (\hat F, \hat G, \hat H)$ are two triples of matrices defining canonical realizations of the same f, then (7.7) implies the existence of a vector-space isomorphism $A: X \to \hat X$ such that

(7.10) $\hat F = AFA^{-1}$, $\hat G = AG$, $\hat H = HA^{-1}$.

If we identify X and $\hat X$ then A is simply a basis change and it follows that the class of all matrix triples which are canonical realizations of a fixed f is isomorphic with the general linear group over X.

The actual computation of a canonical realization, that is, of the abstract Nerode equivalence classes $[\omega]_f$, requires a considerable amount of applied-mathematical machinery, which will be developed in the next section. The critical hypothesis is the existence of a factorization of f
such that
expressed by saying that
f
dim X
A' + N'. ~ and
~B'
The sequence
A is uniquely determined by
from the left and the sequence acting on the matrix
N.
acting on
~AI, 7\"(~)
B is uniquely determined by from the right .
~A', A"(~)
are equal by hyp othesis on
~
~B
The two matrices
Moreover,
and
are also equal, since the matrices on the right-hand side depend only on the
2nd, •.• , N-th
member of ea ch sequence.
Using only this fact
and the associativity of the matrix product 11:-1
~AI, A"~~B ;:
So
'
k-l
~~AI, N'~B
'
o
B.
A
Now we can hope for a realization algorithm which uses only the first $\lambda' + \lambda''$ terms of a sequence of finite length. In fact, we have
(8.16) B. L. HO's REALIZATION ALGORITHM. Consider any infinite sequence A of finite length with associated Hankel matrix $\mathcal{H}$. The following steps will lead to a canonical realization of A:

(i) Determine $\lambda'$, $\lambda''$.

(ii) Compute $n = \operatorname{rank}\mathcal{H}_{\lambda',\lambda''}$; in doing so, determine nonsingular $p\lambda' \times p\lambda'$ and $m\lambda'' \times m\lambda''$ matrices P, Q such that

(8.17) $P\,\mathcal{H}_{\lambda',\lambda''}\,Q = \begin{pmatrix} I_n & 0 \\ 0 & 0 \end{pmatrix}$.

(iii) Compute

(8.18) $F = R_n P[\sigma\mathcal{H}_{\lambda',\lambda''}]Q C_n$, $G = R_n P\,\mathcal{H}_{\lambda',\lambda''}\,C_m$, $H = R_p\,\mathcal{H}_{\lambda',\lambda''}\,Q C_n$,

where $R_p$ and $C_m$ are idempotent "editing" matrices corresponding to the operations "retain only the first p rows" and "retain only the first m columns".

We claim the

(8.19) REALIZATION THEOREM FOR INFINITE SEQUENCES. For any infinite sequence A whose associated Hankel matrix $\mathcal{H}$ has finite length $(\lambda', \lambda'')$, B. L. Ho's formulas (8.17-18) yield a canonical realization.

PROOF. If $\Sigma$ defined by (8.17-18) is a realization of A, then it is certainly canonical: by (8.4) $\Sigma$ has minimal dimension in the class of all realizations of A and so it is canonical by (7.8).

The required verification is interesting. First, drop all subscripts. Observe that $\mathcal{H}^{\#} = QCRP$ is a pseudo-inverse of $\mathcal{H}$, that is, $\mathcal{H}\mathcal{H}^{\#}\mathcal{H} = \mathcal{H}$. Then, by definition of F, G, H, and $\mathcal{H}^{\#}$,

$HF^kG = (R\mathcal{H}QC)(RP[\sigma\mathcal{H}]QC)^k(RP\mathcal{H}C) = R\mathcal{H}(\mathcal{H}^{\#}[\sigma\mathcal{H}])^k\mathcal{H}^{\#}\mathcal{H}C$;

by repeated application of (8.9), this reduces to

$HF^kG = R[\sigma^k\mathcal{H}]C$.

The last equation calls for picking out the first p rows and the first m columns of $\sigma^k\mathcal{H}$, which is just $A_{k+1}$, as required. □

(8.20) COMMENT.
This is a considerably sharper result than Theorem
(8.12), in two respects: (i) use the matrix (ii) form:
It is no longer necessary to compute ~", , ,," «(J"At;;) ,
~:
we simply
which is part of the data of the problem.
Formulas (8.18) give the desired realization in minimal
there is no need to reduce (8.13) to a minimal realization (recall
here (7.11». Notice also that the proof of (8.19) does not re~uire (8.12) but depends (just like the latter) on direct use of (8.8).
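The steps of the algorithm can be sketched in exact arithmetic for scalar sequences (p = m = 1). In the sketch below everything beyond the three formulas for F, G, H is an assumption of ours: the integers `lam1`, `lam2` (playing the role of the two length parameters) are supplied by the caller rather than determined in step (i), the matrices P and Q are obtained by Gauss-Jordan elimination on rows and then on columns, and all function names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def row_reduce(A):
    """Return (P, rank) with P nonsingular and P*A in reduced row echelon form."""
    rows, cols = len(A), len(A[0])
    # augmented matrix [A | I]; the right block accumulates P
    M = [[Fraction(x) for x in A[i]] + [Fraction(int(i == j)) for j in range(rows)]
         for i in range(rows)]
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        pivot = M[r][c]
        M[r] = [x / pivot for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                k = M[i][c]
                M[i] = [x - k * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return [row[cols:] for row in M], r

def ho_realization(a, lam1, lam2):
    """a = [A_1, A_2, ...] (scalars), len(a) >= lam1 + lam2. Returns (F, G, H)."""
    hankel = [[a[i + j] for j in range(lam2)] for i in range(lam1)]
    shifted = [[a[i + j + 1] for j in range(lam2)] for i in range(lam1)]
    P, n = row_reduce(hankel)
    Qt, _ = row_reduce(transpose(matmul(P, hankel)))
    Q = transpose(Qt)                       # now P * hankel * Q = diag(I_n, 0)
    edit = lambda M, r, c: [row[:c] for row in M[:r]]   # keep first r rows, c columns
    F = edit(matmul(matmul(P, shifted), Q), n, n)
    G = edit(matmul(P, hankel), n, 1)
    H = edit(matmul(hankel, Q), 1, n)
    return F, G, H

def markov(F, G, H, terms):
    """Scalars H G, H F G, H F^2 G, ... for checking the realization."""
    out, X = [], G
    for _ in range(terms):
        out.append(matmul(H, X)[0][0])
        X = matmul(F, X)
    return out

F, G, H = ho_realization([1, 1, 2, 3, 5, 8], 2, 2)
print(F, G, H)
print(markov(F, G, H, 5))
```

For the Fibonacci sequence with a 2 × 2 Hankel block the realization has dimension 2 and reproduces the first terms of the sequence. For the geometric sequence 1, 2, 4, 8 the same block has rank 1, so the column reduction Q is nontrivial and the realization collapses to dimension 1.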
An apparently serious limitation of the algorithm (8.16) is the necessity to verify abstractly that "A has finite length". Of course, this can be done only on the basis of certain special hypotheses on A, given in advance. (Examples: (i) $A_k = 0$ for all k > q; (ii) $A_k$ = coefficients of the Taylor expansion of a rational function.) Fortunately, the difficulty is only apparent, for the preceding developments can be sharpened further:

(8.21) FUNDAMENTAL THEOREM OF LINEAR REALIZATION THEORY. Consider any infinite sequence A and the corresponding Hankel matrix $\mathcal{H}$.
Suppose there exist integers (8.22)
1,1, 1,"
such that
rank
u., +,.r, 1 n,,(~), _
rank
~£ I, 1,"+1 q~) .
_.r,
'" of Then there exists unique extension A
such that with
A'
1.
by
and
Z such that
By repeated application of (8.23), it follows that we have also
k > O.
Now i t is clear, from (8.8), that
A~
A"
every
block column of H(A) = =r
is linearly dependent on the columns to the left of it. Every partial sequence seguence
~
may be extended to an infinite
A in at least one way such that the condition n (A) o =r
for all
~
> A' (A ), v > A" (A ) =r =r
is satisfied. PROOF.
The existence of the numbers $\lambda'$, $\lambda''$, and $n_0$ is trivial. It suffices to show, for arbitrary r, how to select $A_{r+1}$ in such a way that the numbers $\lambda'$, $\lambda''$, and $n_0$ remain constant.

Consider the first row of $A_{r+1}$ and examine in turn all the first rows of the first, second, third, ..., $\lambda'$-th block rows in $\mathcal{H}(A_r)$. If the first row of the first block row is linearly dependent on the rows above it (that is, 0), we fill in the first row of $A_{r+1}$ using this linear dependence (that is, we make the first row of $A_{r+1}$ all zeros). This choice of the first row of $A_{r+1}$ will preserve linear dependencies for the first row of every block row below the second block row, by the definition of the Hankel pattern. If the first row in the first block row is linearly independent of those above (that is, contributes 1 to $n_0(A_r)$), we pass to the second block row and repeat the procedure. Eventually the first row of some block row will become linearly dependent on those above it, except when $\lambda' = r$; in that case, choose the first row of $A_{r+1}$ to be linearly dependent on the first rows of $A_1, \dots, A_r$. Repeating this process for the second, third, ..., p-th rows of each block row*, eventually $A_{r+1}$ is determined without increasing $\lambda'$ or $n_0$.

To complete the proof, we must show that the above definition of $A_{r+1}$ also preserves the value of $\lambda''$. That is, we must show that no new independent columns are produced in the Hankel array of $A_r$ when $A_{r+1}$ is filled in.
that the definition of Ar+l rank H =r, I rank I]-r- 1 , 2
rank HI = ,r
This is verified immediately by noting implies the conditions
rank!! -r+I , l' rank ~r, 2'
rank ~2, r
rank ~l,r+l.
*Of course, linear dependence in the first step does not imply that the corresponding row of $A_{r+1}$ will be all zeros.
o
With the aid of this simple but subtle observation, the problem is reduced to that covered by the Main Theorem (8.21) of Section 8. We have:

MAIN THEOREM FOR MINIMAL PARTIAL REALIZATIONS.* Let $A_r$ be a partial sequence. Then:

(i) Every minimal realization of $A_r$ has dimension $n_0(A_r)$.

(ii) All minimal realizations may be determined with the aid of B. L. Ho's formulas (8.17-18) with $\lambda' = \lambda'(A_r)$ and $\lambda'' = \lambda''(A_r)$ as given by Lemma (9.5).

(iii) If $r \geq \lambda'(A_r) + \lambda''(A_r)$ then the minimal realization is unique. Otherwise there are as many minimal realizations as there are extensions of $A_r$ satisfying (9.6).

PROOF. By the Main Lemma (9.5), every partial sequence $A_r$ has at least one infinite extension which preserves $n_0$, $\lambda'$, and $\lambda''$. So we can apply the Main Theorem (8.21) of the preceding section. It follows that the minimal partial realization is unique if $r \geq \lambda'(A_r) + \lambda''(A_r)$ (the Hankel matrix can be filled in completely with the available data); in the contrary case, the minimal extensions will depend on the manner in which the matrices $A_{r+1}, \dots, A_{\lambda'+\lambda''}$ have been determined (subject to the requirement (9.6)). □

In view of the theorem, we are justified in calling the integer $n_0(A_r)$ the dimension of $A_r$.

*A similar result was obtained simultaneously and independently by T. Tether (Stanford dissertation, 1969).
(9.8) REMARK. The essential point is that the quantities $n_0$, $\lambda'$, and $\lambda''$ are uniquely determined already from partial data, irrespective of the possible nonuniqueness of the minimal extensions of the partial sequence. We warn, however, that this result does not generalize to all invariants of the minimal realization. For instance, one cannot determine from $A_r$ how many cyclic pieces a minimal realization of $A_r$ will have: some minimal realizations may be cyclic and others may not [KALMAN 1970b].

Finally, let us note also a second consequence of the Main Theorem:

COROLLARY.
Suppose $n_1(A_r)$ is the number of independent columns of the Hankel array of $A_r$ (defined analogously with $n_0(A_r)$). Then dim $A_r = n_1(A_r)$.

PROOF. If $n_1(A_r) > n_0(A_r)$ then, using the Main Theorem,
we get a contradiction to the fact that the rank of any Hankel matrix of an infinite sequence is lower bound f0r the dimension of any realization (Proposition to any
~Al+~~l
equal to
(8.4)). If
nl(~r)
K) •
is simply a "rule" (in practice, a computing algorithm) which assigns to each possible output sequence $\gamma$ in $\Gamma$ a number in the field K. If $\gamma$ resulted from the state x then

$\hat\gamma(\gamma) = \hat\gamma(f(\omega)) = (\hat\gamma\circ f)(\omega)$

gives the value of a certain function in $\hat\Omega$ and, by definition of the state, also the value of a certain function in $\hat X$. This suggests the

(10.2) DEFINITION. An element $\hat x \in \hat X$ is an observable costate iff there is a $\hat\gamma \in \hat\Gamma$ such that we have identically for all $\omega \in \Omega$

$\hat x([\omega]_f) = \hat\gamma(f(\omega))$.

In other words, no matter what the initial state $x = [\omega]_f$ is, the value of $\hat x$ at x can always be determined by applying the rule $\hat\gamma$ to the output sequence $f(\omega)$ resulting from x. Note, carefully, that this definition subsumes (i) a fixed choice of the class of functions denoted by the circumflex, and (ii) a fixed input sequence after t = 0 (here $\nu = 0$). For certain purposes, it may be necessary to generalize the definition in various ways [KALMAN 1970a], but here we wish to avoid all unessential complications.

According to Definition (10.2), we shall see that a system is completely observable iff every costate is observable. This agrees with the point of view adopted earlier (see Section 4) in an ad-hoc fashion. Also, the vague requirement to "determine x" used in
(10.1) is now replaced by a precise notion which can be manipulated (via the actual definition of the circumflex) to express limitations on the algorithms that we may apply to the output sequence of the system.

The requirement "every costate is observable" can often be replaced by a much simpler one. For instance, if X is a vector space, it is enough to know that "every linear costate is observable" or even just that "every element of some dual basis is an observable costate"; if X is an algebraic variety, it is natural to interpret "complete observability" as "every element of the coordinate ring of X is an observable costate" [KALMAN 1970a].

We can now carry out a straightforward "dualization" of the setup involved in the definition of the input/output map $f: \Omega \to \Gamma$. First, we adopt (again with respect to a fixed interpretation of the circumflex):

DEFINITION. The dual of an input/output map $f: \Omega \to \Gamma$ is the map

$\hat f: \hat\Gamma \to \hat\Omega: \hat\gamma \mapsto \hat\gamma\circ f$.

Note that $\hat f$ is well-defined, since the circumflex means the class of all functions.

As to the next step, we wish to prove that constancy is inherited under dualization. To do this, we have to induce a definition of the shift operator on $\hat\Gamma$ and $\hat\Omega$. The only possible definitions are the obvious ones:
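$\hat\Gamma \to \hat\Gamma: \hat\gamma \mapsto \hat\gamma\circ z$, $\hat\Omega \to \hat\Omega: \hat\omega \mapsto \hat\omega\circ z$.

Both of these new shift operators will be denoted by $z^{-1}$. The reason for this notation will become clear later. Now it is easy to verify: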
(10.4) PROPOSITION. If f is constant, so is $\hat f$.

PROOF. We apply the definitions in suitable sequence:

$\hat f(z^{-1}\cdot\hat\gamma)(\omega) = (z^{-1}\cdot\hat\gamma)(f(\omega))$ (def. of $\hat f$),
$\;\; = \hat\gamma(z\cdot f(\omega))$ (def. of the shift on $\hat\Gamma$),
$\;\; = \hat\gamma(f(z\cdot\omega))$ (f is constant),
$\;\; = \hat f(\hat\gamma)(z\cdot\omega)$ (def. of $\hat f$),
$\;\; = (z^{-1}\cdot\hat f(\hat\gamma))(\omega)$ (def. of the shift on $\hat\Omega$),

and so we see that $\hat f$ commutes with the shift whenever f does. □

At this stage, we cannot as yet view $\hat f$ as the input/output map of a dynamical system because concatenation is not yet defined on $\hat\Gamma$ and therefore $\hat\Gamma$ is not yet a properly defined "input set". In other words, it is necessary to check that the notion of time is also inherited under dualization. In general, this does not appear to be possible without some strong limitation on the class $\hat\Gamma$. Here we shall look only at the simplest
(10.5) HYPOTHESIS. Every function $\hat\gamma \in \hat\Gamma$ satisfies the finiteness condition: There is an integer $|\hat\gamma|$ (dependent on $\hat\gamma$) such that for all $\gamma, \delta \in \Gamma$ the condition

$\gamma_k = \delta_k$, $k = 1, \dots, |\hat\gamma|$

implies $\hat\gamma(\gamma) = \hat\gamma(\delta)$.

In other words, we assume that the value of each $\hat\gamma$ at $\gamma$ is uniquely determined by some finite portion of the output sequence $\gamma$.
Assuming (10.5), it is immediate that
f admits a concatenation
multiplication which corresponds (at least intuitively) to the usual
n:
one defined on (10.6)
We can now prove the expected theorem, which may be regarded as the precise form of the "duality" principle:

(10.7) THEOREM. Let $f$ be an arbitrary constant input/output map and $\hat f$ its dual. Suppose further that (10.5) holds. Then each observable costate of $f$ (relative to $\hat\Gamma$ satisfying (10.5)) may be viewed as a reachable state of $\hat f$, and conversely.

PROOF. First we determine the Nerode equivalence classes on $\hat\Gamma$ induced by $\hat f$. By definition
'"€ E P. '"
for all
Now "r is linear
f
the definition of
and
(!);
in fact, direct use of
(10.6) gives (50f)(W), wEn.
So $\hat\gamma \circ f$ and $\hat\delta \circ f$ are equal as elements of $\hat\Omega$: they define the same observable costate. In fancier language, the assignment

(10.8)

is well defined and constitutes a bijection between the reachable states of $\hat f$ and those costates of $f$ which are observable relative to the function class $\hat\Gamma$. □

Thus (10.5) is a sufficient condition for the duality principle to hold.
However, the fact that the canonical realization of $\hat f$ is completely reachable is not quite the same as saying that the canonical realization of $f$ is completely observable because the latter depends on the choice of $\hat\Gamma$ and therefore is not an intrinsic property of $f$. Moreover, Theorem (10.7) does not give any indication how "big" $X_{\hat f}$ is, and it may certainly happen that the observability problem for $f$ is much more difficult than the reachability problem. These matters will be illustrated later by some examples.

Now we deduce the original form of the duality principle from Theorem (10.7). The essential point is that (10.5) holds automatically as a result of linearity.

New definition of the function class: let the circumflex denote the class of all K-linear functions. (All the underlying sets are K-vector spaces, so the definition makes sense.)
The following facts are well known:

(10.9) PROPOSITION. Let $*$ denote duality in the sense of K-vector spaces. Then:

$$\hat\Gamma \triangleq (K^p[[z^{-1}]])^* \cong K^p[z^{-1}], \qquad \hat\Omega \triangleq (K^m[z])^* \cong K^m[[z]].$$

Now we can state the

(10.10) MAIN THEOREM. Suppose $f$ is K-linear, constant, and finite-dimensional. Suppose further that the circumflex denotes K-linear duality. Then:

(i) $\hat f$ is K-linear and constant, that is, a $K[z^{-1}]$-homomorphism (and therefore written as $f^*$), and finite-dimensional.

(ii) The reachable states of $f^*$ are isomorphic with the K-linear dual of $X_f$; hence every costate of $f$ is observable.

PROOF. The fact that $f$ is K-linear implies, by (10.3), that $\hat f$ is K-linear; the constancy of $f$ always implies that of $\hat f$, by Proposition (10.4). (Caution: $f^*$ is not the $K[z]$-linear dual of the $K[z]$-homomorphism $f$, and the construction given here cannot be simplified. See Remark (4.26A).)
To prove the second part, we note that by Proposition (10.9) Hypothesis (10.5) holds and thus $\hat f = f^*$ is a well-defined input/output map of a dynamical system. We must prove that the reachable states of $f^*$ are isomorphic with $X_f^*$, the K-linear dual of $X_f$. This amounts to proving that the K-vector space of functions representing these states is isomorphic with the K-vector space $X_f^*$. It suffices to prove that the K-vector space generated by the K-linear functions

(10.11) [$i = 0, 1, \dots$; $j = 1, \dots, m$]

is isomorphic with $X_f^*$. Suppose that, for fixed $x$, every function $\lambda$ of the form (10.11) satisfies $\lambda(x) = 0$. Then $x = 0$ by definition of the Nerode equivalence relation induced by $f$ (recall here the discussion from Section 3). Since $X_f$ is finite-dimensional by hypothesis, it follows from this property of the functions (10.11) that they generate $X_f^*$. Obviously, $\dim X_f^* = \dim X_f$, so that everything is proved. □
In other terms, the fact that $f$ is a $K[z]$-homomorphism together with the appropriate definition of the circumflex implies that $\hat f = f^*$ is a $K[z^{-1}]$-homomorphism.

Since (10.5) holds, we can interpret $\hat f$ in a system-theoretic way, as follows: the output of the dual system at $t = -k$ due to input $\hat\gamma$ is given by the assignment

$$\omega_k \mapsto \bigl(\hat f(\hat\gamma)(-k)\bigr)(\omega_k),$$

which is a linear function defined on the $k$-th term of the input sequence. In fact, we have

$$\hat f(\hat\gamma)(\omega) = (\hat\gamma \circ f)(\omega) = \sum_k \bigl(\hat f(\hat\gamma)(-k)\bigr)(\omega_k).$$
(10.12) REMARK. It is essentially a consequence of Proposition (10.9) that $\hat f$ turns out to be the same kind of algebraic object as $f$. Note, however, that under duality the input and output terminals are interchanged and $t$ is replaced by $-t$ (hence $z$ by $z^{-1}$). In terms of the pictorial definition of a system, this statement simply amounts to "reversing the directions of the arrows", which is the "right" way to define duality in the most general mathematical context, namely in category theory. We would expect that the duality principles of system theory will eventually become a part of this very general duality theory. This has not happened yet because the correct categories to be considered in the study of dynamical systems have not yet been determined. It is likely that eventually many different categories will have to be looked at in studying dynamical problems.

We shall now present an example which should help to interpret the previous results. We emphasize, however, that the theory sketched here is still in a very rudimentary form.

(10.13) EXAMPLE. Consider the system $\Sigma$ defined by

$$x(t+1) = 2x(t) + u(t), \qquad y(t) = \begin{cases} 1 & \text{if } 1/2 \le x(t) < 1,\\ 0 & \text{if } 0 \le x(t) < 1/2, \end{cases} \qquad t \in \mathbb{Z},$$

with $X = U = Y = \mathbb{R}$ mod 1, i.e., the interval $[0, 1)$. (1 is to be thought of as identified with 0.) We let $u(t) = 0$. We view $x$ through its binary representation $x = \sum_k g_k(x) 2^{-k}$, with each $g_k(x)$ equal to 0 or 1. It is clear from the definition of the system that the output sequence due to any $x$ is precisely $(g_1(x), g_2(x), \dots)$. If $x$ is irrational, infinitely many terms are needed to identify it. Consequently, the $x$'s are isomorphic with the Nerode equivalence classes induced by $f_\Sigma$. So $\Sigma$ cannot be reduced.

Relative to "circumflex = functions", every costate of $f_\Sigma$ is observable, provided that Hypothesis (10.5) is not satisfied. If it is, then only those costates defined on fixed-length rationals are observable (more precisely, those functions which depend only on a fixed finite subset of the $g_k(x)$'s). Thus: either $\hat f$ does not define a dynamical system or not all costates are observable.
Now let us replace the set $[0, 1)$ by its intersection with the rationals. It is clear that there is now a finite algorithm for determining $x$: we simply apply the results of partial realization theory of the previous section. (We take $K = \mathbb{Z}_2$ and the problem is to express $(g_1(x), g_2(x), \dots)$ as a ratio of polynomials in $\mathbb{Z}_2[z]$, which is always possible since each $x$ is rational.) However, $x$ is not "effectively computable" in the strict sense since there is no way of knowing when the algorithm has stopped. In other words, given an arbitrary costate $\hat x$ there exists no rule $\hat\gamma_{\hat x}$ such that the application of $\hat\gamma_{\hat x}$ to the output sequence gives $\hat x(x)$ for all $x$. On the other hand, substituting into $\hat x$ the results of the partial-realization algorithm will give an approximation to the value of $\hat x(x)$ which always converges in a finite (but a priori unknown) number of steps as more values of the output sequence are observed. In short, the costate-determination algorithm has certain pseudo-random elements in it and therefore cannot be described through the machinery of deterministic dynamical systems. (Is there some relation here to the conceptual difficulties of Quantum Mechanics?)
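The behaviour of Example (10.13) with zero input can be tried out numerically. The following is only an illustrative sketch: the function names `output_sequence` and `state_from_outputs` are hypothetical, and exact rational arithmetic (Python's `fractions` module) is an assumption made so that doubling mod 1 introduces no rounding. It shows that the outputs are the binary digits of the initial state, and that a rational state such as 1/3 produces a periodic output sequence of which any finite portion gives only an approximation.

```python
from fractions import Fraction

def output_sequence(x0, steps):
    """Outputs y(0), ..., y(steps-1) of x(t+1) = 2 x(t) mod 1 with u(t) = 0."""
    x = Fraction(x0) % 1              # the state lives in [0, 1)
    outputs = []
    for _ in range(steps):
        outputs.append(1 if x >= Fraction(1, 2) else 0)
        x = (2 * x) % 1               # doubling mod 1 shifts the binary expansion
    return outputs

def state_from_outputs(outputs):
    """Read finitely many outputs as binary digits of the initial state."""
    return sum(Fraction(bit, 2 ** (k + 1)) for k, bit in enumerate(outputs))

print(output_sequence(Fraction(5, 8), 5))    # [1, 0, 1, 0, 0]
print(output_sequence(Fraction(1, 3), 6))    # [0, 1, 0, 1, 0, 1]
print(state_from_outputs(output_sequence(Fraction(5, 8), 5)))  # 5/8
```

For 5/8 three outputs already determine the state, whereas for 1/3 every finite prefix leaves a nonzero error; without further knowledge an observer cannot tell from a prefix alone which situation applies.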
11. HISTORICAL COMMENTS

It is not an exaggeration to say that the entire theory of linear, constant (and here, discrete-time) dynamical systems can be viewed as a systematic development of the equivalent algebraic conditions (2.8) and (2.15). Of course, the use of modules (over $K[z]$) to study a constant square matrix (see (4.13)) has been "standard" since the 1920's under the influence of E. NOETHER and especially after the publication of the Modern Algebra of VAN DER WAERDEN. Condition (2.15), by itself, must be also quite old. For instance, GANTMAKHER [1959, Vol. 1, p. 203] attributes to KRYLOV [1931] the idea of computing the characteristic polynomial of a square matrix $A$ by choosing a random vector $b$ and computing successively $b, Ab, A^2 b, \dots$ until linear dependence is obtained, which yields the coefficients of $\det(zI - A)$. (The method will succeed iff $X_A$ is cyclic with generator $b$.) However, the merger of (4.13) with (2.15), which is the essential idea in the algebraic theory of linear systems, was done explicitly first in KALMAN [1965b].

We shall direct our remarks here mainly to the history of conditions (2.8) and (2.15) as related to controllability. See also earlier comments in KALMAN [1960c, pp. 481, 483, 484] and in KALMAN, HO, and NARENDRA [1963, pp. 210-212]. We will have to bear in mind that the development of modern control theory cannot be separated from the development of the concept of controllability; moreover, the technological problems of the 1950's and even earlier had a major influence on the genesis of mathematical ideas (just as the latter have led to many new technological applications of control in the 1960's).
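The procedure attributed above to KRYLOV can be sketched in a few lines. This is a minimal illustration, not a reconstruction of any historical algorithm: the helper names `matvec`, `solve_dependence` and `krylov_polynomial` are hypothetical, and exact rational arithmetic is an assumption made to keep the dependence test free of rounding. The code forms $b, Ab, A^2 b, \dots$ until the newest vector depends linearly on the earlier ones and returns the resulting monic polynomial, lowest coefficient first.

```python
from fractions import Fraction

def matvec(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def solve_dependence(vectors, target):
    """Return c with target = sum c[i] * vectors[i], or None if impossible.
    The given vectors are assumed to be linearly independent."""
    n, k = len(target), len(vectors)
    M = [[Fraction(vectors[j][i]) for j in range(k)] + [Fraction(target[i])]
         for i in range(n)]
    row = 0
    for col in range(k):
        p = next(r for r in range(row, n) if M[r][col] != 0)
        M[row], M[p] = M[p], M[row]
        piv = M[row][col]
        M[row] = [x / piv for x in M[row]]
        for r in range(n):
            if r != row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[row])]
        row += 1
    if any(M[r][k] != 0 for r in range(row, n)):
        return None                      # target is independent of the vectors
    return [M[r][k] for r in range(k)]

def krylov_polynomial(A, b):
    """Coefficients [a_0, ..., a_{k-1}, 1] of the monic polynomial of least
    degree k with (A^k + a_{k-1} A^{k-1} + ... + a_0 I) b = 0."""
    vectors, v = [], [Fraction(x) for x in b]
    while True:
        c = solve_dependence(vectors, v)
        if c is not None:
            return [-ci for ci in c] + [Fraction(1)]
        vectors.append(v)
        v = matvec(A, v)

# Cyclic case: the result is the characteristic polynomial z^2 + 3z + 2.
print(krylov_polynomial([[0, 1], [-2, -3]], [0, 1]))
# Non-cyclic case (A = I): the sequence stops after one step with z - 1.
print(krylov_polynomial([[1, 0], [0, 1]], [1, 0]))
```

In the second call the returned degree is smaller than the size of the matrix, which is the failure case mentioned in the parenthetical remark on cyclicity.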
The writer developed the mathematical definition of controllability with applications to control theory, during the first part of 1959. (Unpublished course notes at Johns Hopkins University, 1958/59.) These first definitions were in the form of (2.17) and (2.3). Formal presentations of the results were made in Mexico City (September, 1959, see KALMAN [1960b]), University of California at Berkeley (April, 1960, see KALMAN [1960d]), and Moskva (June, 1960, see KALMAN [1960c]), and in scientific lectures on many other concurrent occasions in the U.S. As far as the writer is aware, a conscious and explicit definition of controllability which combines a control-theoretic wording with a precise mathematical criterion was first given in the above references. There are of course many instances of similar ideas arising in related contexts. Perhaps the comments below can be used as the starting point of a more detailed examination of the situation in a seminar in the history of ideas.

The following is the chain of the writer's own ideas culminating in the publications mentioned above:

(1) In KALMAN [1954] it is pointed out (using transform methods) that continuous-time linear systems can be controlled by a linear discrete-time (sampled-data) controller in finite time.*

*It is sometimes claimed in the mathematical literature of optimal control theory that this cannot be done with a linear system. This is false; the correct statement is "cannot be done with a linear controller producing control functions which are continuous (and not merely piecewise continuous!) in time." Such a restriction is completely irrelevant from the technological point of view. As a matter of fact, computer-controlled systems have been proposed and built for many years on the basis of linear, time-optimal control.
(2) Transposing the result of KALMAN [1954] from transfer functions to state variables, an algorithm was sketched for the solution of the discrete-time time-optimal control of systems with bounded control and linear continuous-time dynamics. [KALMAN, 1957]

(3) As a popularization of the results of the preceding work, the same technique was applied to give a general method for the design of linear sampled-data systems by KALMAN and BERTRAM [1958].

Some background comments concerning these papers are appropriate:

(1) The ideas and method presented in KALMAN [1954] descend directly from earlier (and very well known) engineering research on time-optimal control. (The main references in KALMAN [1954] are: McDONALD [1950], HOPKIN [1951], BOGNER and KAZDA [1954], as well as a research report included in KALMAN [1955].) Although the results of KALMAN [1954] on linear time-optimal control were considered to be new when published, it became clear later that similar ideas were at least implicit in OLDENBOURG and SARTORIUS [1951, §90, p. 219] and in TSYPKIN's work in the early 1950's. The engineering idea of nonlinear time-optimal control goes back, at least, to DOLL [1943] and to OLDENBURGER in 1944, although the latter's work was unfortunately not widely known before 1957. During the same time, there was much interest in the same problems in other countries; see, for instance, FELDBAUM [1953] and UTTLEY and HAMMOND [1953]. Mathematical work in these problems probably began with BUSHAW's dissertation [1952] in which, to quote from KALMAN [1955, before equation (40)], "... [it was] rigorously proved that the intuition which led to the formulation of the [engineering] theory [quoted above] was indeed correct." TSIEN's survey [1954] contains a lengthy account of this state of affairs and was read by many. We emphasize: none of this extensive literature contains even a hint of the algebraic considerations related to controllability.

(2-3) The critical insight gained and recorded in KALMAN [1957] is the following: the solution of the discrete-time time-optimal control problem is equivalent to expressing the state as a linear combination of a certain vector sequence (related to control and dynamics) with coefficients bounded by 1 in absolute value, the coefficients being the values of the optimal control sequence. The linear independence of the first $n$ vectors of the sequence guarantees that every point in a neighborhood of zero can be moved to the origin in at most $n$ steps (hence the terminology of "complete controllability"); and the condition for this is identical with (2.17) (stated in KALMAN [1957] and KALMAN and BERTRAM [1958] only for the case $\det F \ne 0$ and $m = 1$).
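The linear-combination idea just described can be made concrete for a single-input discrete-time system. The sketch below is an illustration under stated assumptions, not the algorithm of KALMAN [1957]: the model $x(t+1) = F x(t) + g\,u(t)$, the symbol `g` for the input vector, and the function names are all hypothetical choices, and the bound $|u| \le 1$ is not enforced (it restricts which initial states are admissible). The code solves $F^n x_0 = -\sum_k F^{n-1-k} g\,u_k$ for the $n$ control values; when $\det F \ne 0$ this is the same as writing $x_0$ as a combination of the vectors $-F^{-(k+1)} g$.

```python
from fractions import Fraction

def matvec(F, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in F]

def solve(M, rhs):
    """Solve the square system M c = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    A = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        p = next((i for i in range(c, n) if A[i][c] != 0), None)
        if p is None:
            raise ValueError("g, Fg, ..., F^(n-1) g are linearly dependent")
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [x / piv for x in A[c]]
        for i in range(n):
            if i != c and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][n] for i in range(n)]

def deadbeat_controls(F, g, x0):
    """Controls u_0, ..., u_{n-1} that move x0 to the origin in n steps."""
    n = len(x0)
    powers = [[Fraction(x) for x in g]]          # g, Fg, ..., F^(n-1) g
    for _ in range(n - 1):
        powers.append(matvec(F, powers[-1]))
    target = [Fraction(x) for x in x0]
    for _ in range(n):                           # F^n x0
        target = matvec(F, target)
    cols = [powers[n - 1 - k] for k in range(n)] # column k multiplies u_k
    M = [[cols[k][i] for k in range(n)] for i in range(n)]
    return solve(M, [-t for t in target])

def simulate(F, g, x0, controls):
    """Apply the control sequence and return the final state."""
    x = [Fraction(v) for v in x0]
    for u in controls:
        x = [a + b * u for a, b in zip(matvec(F, x), g)]
    return x

F, g, x0 = [[1, 1], [0, 1]], [0, 1], [1, 0]
u = deadbeat_controls(F, g, x0)
print(u, simulate(F, g, x0, u))
```

For this double-integrator-like example the computed controls are $u_0 = -1$, $u_1 = 1$, and the simulated state after two steps is the origin; the `ValueError` branch corresponds to the failure of the linear-independence condition.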
A thorough discussion of these matters is found in KALMAN [1960c; see especially Theorem I, p. 485]. A serious conceptual error in KALMAN [1957] occurred, however, in that complete controllability was not assumed as a hypothesis for the existence of a time-optimal control law, but an attempt was made to show that the controllability is almost always complete [Lemma 1]. In fact, this lemma is true, with a small technical modification in the condition. Only much later did it become clear (see the discussion of Theorem D in the Introduction), however, that a dynamical system is always completely controllable (in the nonconstant case, completely reachable) if it is derived from an external description. It was this difficulty, very mysterious in 1957, which led to the development of a formal machinery for the definition of controllability during the next two years. The changing point of view is already apparent in KALMAN and BERTRAM [1958]; the unpublished paper promised there was delayed precisely because the algebraic machinery to prove Theorem D was out of reach in 1957-8. Consult also the findings of the bibliographer RUDOLF [1969].

IN SUMMARY: under the stimulation of the engineering problems of minimal-time optimal control, the researches begun by KALMAN [1954, 1957] and KALMAN and BERTRAM [1958] eventually evolved into what has come to be called the mathematical theory of controllability (of linear systems).

Beginning about 1955, and stimulated by the same engineering problems, PONTRYAGIN and his school in the USSR developed their mathematical theory of optimal control around the celebrated "Maximum Principle". (They were well aware of the survey of TSIEN [1954] mentioned above, and referenced it both in English and in the Russian translation of 1956.) We now know that any theory of control, regardless of its particular mathematical style, must contain ingredients related to controllability. So it is interesting to examine how explicitly the controllability condition appears in the work of PONTRYAGIN and related research.

GAMKRELIDZE [1957, §2; 1958, §1, §2] calls the time-optimal control problem associated with the system

(11.1) $dx/dt = Ax + bu(t)$

"nondegenerate" iff $b$ is not contained in a proper $A$-invariant subspace of $R^n$. He notes immediately that this is equivalent to

(11.2) $\det(b, Ab, \dots, A^{n-1}b) \ne 0$

(i.e., the special case of (2.8) for $m = 1$). He then proves: in the "degenerate" case the problem either reduces to a simpler one or the motion cannot be influenced by the control function $u(\cdot)$.
This is very close to an explicit definition of controllability.

However, in discussing the general case $m > 1$, GAMKRELIDZE [1958, §3, Section 1] defines "nondegeneracy" of the system

(11.3) $dx/dt = Ax + Bu(t)$

as the condition

(11.4) $\det(b_i, Ab_i, \dots, A^{n-1}b_i) \ne 0$ for every column $b_i \in B$,

but he does not show that this generalized condition of "nondegeneracy" for (11.3) inherits the interesting characterization proved for "nondegeneracy" in the case of (11.1). In fact, condition (11.4) is much too strong to prove this; the correct condition is (2.8), that is, complete controllability.
In other words, in GAMKRELIDZE's work (11.4) plays the role of a technical condition for eliminating "degeneracy" (actually, lack of uniqueness) from a particular optimal control problem and is not explicitly related to the more basic notion of complete controllability. Neither GAMKRELIDZE nor PONTRYAGIN [1958] gives an interpretation of (11.4) as a property of the dynamical system (11.3); both employ (11.4) only in relation to the particular problem of time-optimal control. See also KALMAN [1960c, p. 484].
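The gap between the column-by-column condition (11.4) and complete controllability can be seen on a two-dimensional example. The sketch below is only illustrative. It assumes that (2.8), which is not restated in this section, is the usual rank condition rank $[B, AB, \dots, A^{n-1}B] = n$; the function names are hypothetical, and exact rational arithmetic is an assumption.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def krylov_block(A, B):
    """The matrix [B, AB, ..., A^(n-1) B] as a list of rows."""
    n = len(A)
    blocks, current = [], B
    for _ in range(n):
        blocks.append(current)
        current = matmul(A, current)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def completely_controllable(A, B):
    """Assumed form of (2.8): the block matrix has full rank n."""
    return rank(krylov_block(A, B)) == len(A)

def satisfies_11_4(A, B):
    """Condition (11.4): every single column of B must pass the test alone."""
    columns = [[[row[j]] for row in B] for j in range(len(B[0]))]
    return all(completely_controllable(A, b) for b in columns)

A = [[1, 0], [0, 2]]
B = [[1, 0], [0, 1]]
print(completely_controllable(A, B), satisfies_11_4(A, B))        # True False
print(completely_controllable(A, [[1], [1]]), satisfies_11_4(A, [[1], [1]]))  # True True
```

With $B = I$ the two inputs together reach every state, yet each column of $B$ spans an $A$-invariant line, so (11.4) fails; with the single column $(1, 1)^T$ both conditions hold and coincide with (11.2).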
A similar point of view is taken by LASALLE [1960]; he calls a dynamical system (11.3) satisfying (2.8) "proper" but then goes on to require (11.4) (to assure the uniqueness of the time-optimal controls) and calls such systems "normal".

The assumption of some kind of "nondegeneracy" condition was apparently unavoidable in the early phases of research on the time-optimal control problem. For example, ROSE [1953, pp. 39-58] examines this problem for (11.1); by defining "nondegeneracy" [p. 41] by a condition equivalent to (11.2), he obtains most of GAMKRELIDZE's results in the special case when $A$ has real eigenvalues [Theorem 12]. ROSE uses determinants closely related to the now familiar lemmas in controllability theory but he, too, fails to formulate controllability as a concept independent of the time-optimal control problem.

A similar situation exists in the calculus of variations. The so-called Caratheodory classes (after CARATHEODORY [1933]) correspond to a kind of classification of controllability properties of nonconstant systems. In fact, the standard notion of a normal family of extremals of the calculus of variations is closely related to condition (11.4), suitably generalized via (2.5) to nonconstant systems.* Normality is used in the calculus of variations mainly as a "nondegeneracy" condition.

It is important to note that the "nondegeneracy" conditions employed in optimal control and the calculus of variations play mainly the role of eliminating annoying technicalities and simplifying proofs.

*The use of the word "normal" by LASALLE [1960] for (11.4) is only accidentally coincident with the earlier use of "normal" in the calculus of variations.
With suitable formulation, however, the basic results of time-optimal control theory continue to hold without the assumption of complete controllability. The same is not true, however, of the four kinds of theorems mentioned in the Introduction, and therefore these results are more relevant to the story of controllability than the time-optimal control discussed above.

There is a considerable body of literature relevant to controllability theory which is quite independent of control theory. For instance, the treatment of a reachability condition in partial differential equations goes back at least to CHOW [1940] but perhaps it is fairer to attribute it to Caratheodory's well-known approach to entropy via the nonintegrability condition. The current status of these ideas as related to controllability is reviewed by WEISS [1969, Section 9]. An independent and very explicit study of reachability is due to ROXIN [1960]; unfortunately, his examples were purely geometric and therefore the paper did not help in clarifying the celebrated condition (2.8). The Wronskian determinant of the classical theory of ordinary differential equations with variable coefficients also has intersections with controllability theory, as pointed out recently with considerable success by SILVERMAN [1966].

Many problems in control theory were misunderstood or even incorrectly solved before the advent of controllability theory. Some of these are mentioned in KALMAN [1963b, Section 9]. For relations with automata theory, see ARBIB [1965].

Let us conclude by stating the writer's own current position as to the significance of controllability as a subject in mathematics:
(1) Controllability is basically an algebraic concept. (This claim applies of course also to the nonlinear controllability results obtained via the Pfaffian method.)

(2) The historical development of controllability was heavily influenced by the interest prevailing in the 1950's in optimal control theory. Ultimately, however, controllability is seen as a relatively minor component of that theory.

(3) Controllability as a conceptual tool is indispensable in the discussion of the relationship between transfer functions and differential equations and in questions relating to the four theorems of the Introduction.

(4) The chief current problem in controllability theory is the extension to more elaborate algebraic structures.

For a survey of the historical background of observability, which would take us too far afield here, the reader should consult KALMAN [1969b].
12. REFERENCES

Section A: General References

M. A. ARBIB
[1965] A common framework for automata theory and control theory, SIAM J. Contr., 3:206-222.

C. W. CURTIS and I. REINER
[1962] Representation Theory of Finite Groups and Associative Algebras, Interscience-Wiley.

E. M. DAY and A. D. WALLACE
[1967] Multiplication induced in the state space of an act, Math. System Theory, 1:305-314.

C. A. DESOER and P. VARAIYA
[1967] The minimal realization of a nonanticipative impulse response matrix, SIAM J. Appl. Math., 15:754-764.

E. G. GILBERT
[1963] Controllability and observability in multivariable control systems, SIAM J. Control, 1:128-151.

B. L. HO and R. E. KALMAN
[1966] Effective construction of linear state-variable models from input/output functions, Regelungstechnik, 14:545-548.
The realization of linear, constant input/output maps, I. Complete realizations, SIAM J. Contr., to appear.

S. T. HU
[1965] Elements of Modern Algebra, Holden-Day.

R. E. KALMAN
[1960a] A new approach to linear filtering and prediction problems, J. Basic Engr. (Trans. ASME), 82D:35-45.
[1960b] Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5:102-119.
[1960c] On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London.
[1962] Canonical structure of linear dynamical systems, Proc. Nat. Acad. Sci. (USA), 48:596-600.
[1963a] New methods in Wiener filtering theory, Proc. 1st Symp. on Engineering Applications of Random Function Theory and Probability, Purdue University, November 1960, pp. 270-388, Wiley. (Abridged from RIAS Technical Report 61-1.)
[1963b] Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.
[1965a] Irreducible realizations and the degree of a rational matrix, SIAM J. Contr., 13:520-544.
[1965b] Algebraic structure of linear dynamical systems. I. The Module of Σ, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.
[1967]
[1969a] On multilinear machines, J. Comp. and System Sci., to appear.
[1969b] Dynamic Prediction and Filtering Theory, Springer, to appear.
[1969c] On partial realizations of a linear input/output map, Guillemin Anniversary Volume, Holt, Winston and Rinehart.
[1970a] Observability in multilinear systems, to appear.
[1970b] The realization of linear, constant, input/output maps. II. Partial realizations, SIAM J. Control, to appear.

R. E. KALMAN and R. S. BUCY
[1961] New results in linear prediction and filtering theory, J. Basic Engr. (Trans. ASME, Ser. D), 83D:95-100.

R. E. KALMAN, P. L. FALB and M. A. ARBIB
[1969] Topics in Mathematical System Theory, McGraw-Hill.

R. E. KALMAN, Y. C. HO and K. NARENDRA
[1963] Controllability of linear dynamical systems, Contr. to Diff. Equations, 1:189-213.

C. E. LANGENHOP
[1964] On the stabilization of linear systems, Proc. Am. Math. Soc., 15:735-742.

S. LANG
[1965] Algebra, Addison-Wesley.

S. MAC LANE
[1963] Homology, Springer.

L. A. MARKUS
[1965] Controllability of nonlinear processes, SIAM J. Control, 3:78-90.

E. F. MOORE
[1956] Gedanken-experiments on sequential machines, in Automata Studies, C. E. Shannon and J. McCarthy (eds.), pp. 129-153, Princeton University Press.

P. MUTH
[1899] Theorie und Anwendung der Elementartheiler, Teubner, Leipzig.

A. NERODE
[1958] Linear automaton transformations, Proc. Amer. Math. Soc., 9:541-544.

L. SILVERMAN
[1966] Representation and realization of time-variable linear systems, Doctoral dissertation, Columbia University.

L. M. SILVERMAN and H. E. MEADOWS
Equivalent realizations of linear systems, J. Control, to appear.

H. WEBER
[1898] Lehrbuch der Algebra, Vol. 1, 2nd Edition, reprinted by Chelsea, New York.

L. WEISS
[1969] Lectures on Controllability and Observability, C.I.M.E. Seminar.

L. WEISS and R. E. KALMAN
[1965] Contributions to linear system theory, Intern. J. Engr. Sci., 3:141-171.

W. M. WONHAM
[1967] On pole assignment in multi-input controllable linear systems, IEEE Trans. Auto. Contr., AC-12:660-665.

A. M. YAGLOM
[1962] An Introduction to the Theory of Stationary Random Functions, Prentice-Hall.

D. C. YOULA
[1966] The synthesis of linear dynamical systems from prescribed weighting patterns, SIAM J. Appl. Math., 14:527-549.

D. C. YOULA and P. TISSI
[1966] n-port synthesis via reactance extraction, Part I, IEEE Intern. Convention Record.

O. ZARISKI and P. SAMUEL
[1958] Commutative Algebra, Vol. 1, Van Nostrand.
Section B: References for Section 11

M. A. ARBIB
[1965] A common framework for automata theory and control theory, SIAM J. Contr., 3:206-222.

I. BOGNER and L. F. KAZDA
[1954] An investigation of the switching criteria for higher order contactor servomechanisms, Trans. AIEE, 73 II:118-127.

D. W. BUSHAW
[1952] Differential equations with a discontinuous forcing term, doctoral dissertation, Princeton University.

C. CARATHEODORY
[1933] Über die Einteilung der Variationsprobleme von Lagrange nach Klassen, Comm. Mat. Helv., 5:1-19.

W. L. CHOW
[1940] Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Annalen, 117:98-105.

H. G. DOLL
[1943] Automatic control system for vehicles, US Patent 2,463,362.

A. A. FELDBAUM
[1953] Avtomatika i Telemekhanika, 14:712-728.

R. V. GAMKRELIDZE
[1957] On the theory of optimal processes in linear systems (in Russian), Dokl. Akad. Nauk SSSR, 116:9-11.
[1958] The theory of optimal processes in linear systems (in Russian), Izvestia Akad. Nauk SSSR, 22:449-474.

F. R. GANTMAKHER
[1959] The Theory of Matrices, 2 vols., Chelsea.

A. M. HOPKIN
[1951] A phase-plane approach to the compensation of saturating servomechanisms, Trans. AIEE, 70:631-639.

R. E. KALMAN
[1954] Discussion of a paper by Bergen and Ragazzini, Trans. AIEE, 73 II:245-246.
[1955] Analysis and design principles of second and higher-order saturating servomechanisms, Trans. AIEE, 74 II:294-310.
[1957] Optimal nonlinear control of saturating systems by intermittent control, IRE WESCON Convention Record, 1, IV:130-135.
[1960b] Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5:102-119.
[1960c] On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London.
[1960d] Lecture notes on control system theory (by M. Athans and G. Lendaris), Univ. of Calif. at Berkeley.
[1963b] Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.
[1965b] Algebraic structure of linear dynamical systems. I. The Module of Σ, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.
[1969b] Dynamic Prediction and Filtering Theory, Springer, to appear.

R. E. KALMAN and J. E. BERTRAM
[1958] General synthesis procedure for computer control of single and multi-loop linear systems, Trans. AIEE, 77 II.

R. E. KALMAN, Y. C. HO and K. NARENDRA
[1963] Controllability of linear dynamical systems, Contr. to Diff. Equations, 1:189-213.

$> 0$ for almost
all $\tau \in P \subset [0, T]$, then ... almost everywhere in $P$,

... for almost all $\tau \in R \subset [0, T]$, then $\operatorname{grad} \Phi(\bar x, \bar\lambda) = 0$ almost everywhere in $R$,

where $P$, $R$ = arbitrary sets of positive measure of the interval $[0, T]$.

The conditions (3), (4) (and (7), (8)) can be also written in the gradient form:

III) if $\operatorname{grad}_\lambda \Phi(\bar x, \bar\lambda) = G(\bar x) > 0$ for almost all $\tau \in P \subset [0, T]$, then $\bar\lambda(\tau) = 0$ almost everywhere in $P$,

IV) if $\bar\lambda(\tau) > 0$ for almost all $\tau \in R \subset [0, T]$, then $\operatorname{grad}_\lambda \Phi(\bar x, \bar\lambda) = 0$ almost everywhere in $R$.

Conditions III-IV have a simple physical interpretation: at the points where the constraints are not active the Lagrange function vanishes, and when the Lagrange function is positive the constraints must be active.
It is also possible to write down the gradient form of conditions of optimality in the case of minimalization of $F(x)$.

When the operator $G(x)$ is acting into the space of continuous functions, i.e. $G\colon X \to C[0, T]$, one can write

$$\langle \lambda, G(x) \rangle = \int_0^T G[x]\, d\lambda(\tau),$$

where $\lambda(\tau)$ is a non-decreasing function, what can be also written as $d\lambda(\tau) \ge 0$ (that notation means that the increase of $\lambda(\tau)$ in a small vicinity of $\tau$ is non-negative). When $\lambda(\tau)$ is differentiable, $d\lambda(\tau) = \lambda'(\tau)\, d\tau$; the same relation can be kept for discontinuous $\lambda(\tau)$ if $\lambda'(\tau)$ is understood as a distribution. In that case $\lambda'(\tau)$ contains $\delta$-functions at the discontinuity points of $\lambda(\tau)$.

The condition (3) in the present case can be written as

$$\int_0^T G[\bar x]\, d\bar\lambda(\tau) = 0,$$

or

I) if $\operatorname{grad}_\lambda \Phi(\bar x, \bar\lambda) = G[\bar x] > 0$ for $\tau \in P \subset [0, T]$, then $d\bar\lambda(\tau) = 0$ ($\bar\lambda(\tau)$ = const) at the point $\tau$ or vicinity of $\tau$, $\tau \in P$,

II) if $d\bar\lambda(\tau) > 0$, $\tau \in P \subset [0, T]$, then $G[\bar x] = 0$, $\tau \in P$.

It is also possible to write the gradient form of optimality conditions for the remaining formulae (1)-(4), (5)-(8), (10)-(13). We shall leave that as an exercise for the readers.

The conditions (5)-(8) can be called the quasi-saddle-point conditions and they can be treated as a generalization of the well known Kuhn-Tucker differential conditions of optimality, which were formulated originally for nonlinear functions in finite-dimensional spaces.
When an optimization problem is being solved it is also important to know that the functions $x(t)$ which do not satisfy the conditions (1)-(4) or (5)-(8) cannot be optimal. That problem requires proving that the conditions (1)-(4) are also necessary for optimum. In order to prove the necessary conditions we shall impose certain regularity conditions on $G$ but shall not assume any more that $F$ and $G$ are concave.

We shall call $x_0 \in X$ a regular point of $G$ if for every admissible variation $x$ at the point $x_0$, which is defined by the condition

(1) $G[x_0] + dG(x_0, x) \ge 0$,

there exists a curve $\gamma$ emanating from $x_0$, tangent to $x$ and lying in the set $\Omega$ of admissible solutions. By a curve in the Banach space we understand, generally speaking, a function $\gamma$ of real variable $s$ with the range in $X$, i.e. $\gamma\colon R \to X$. According to definition that function should satisfy the following conditions:

(2) $\gamma(s) \in \Omega$, $\gamma(0) = x_0$, $d\gamma(0, 1) = x$.

We shall show now that if $\bar x$ is a maximalizing point for $F(x)$ and a regular point of $G$, then for each admissible variation $x$, i.e.

(3) $G(\bar x) + dG(\bar x, x) \ge 0$,

we get a non-positive increase of $F$, i.e.

(4) $-dF(\bar x, x) \ge 0$.

Indeed, the real function

(5) $f(s) = F[\gamma(s)]$

attains the maximum value for $\gamma(0) = \bar x$, i.e. for $s = 0$. Then $df(0, 1) \le 0$. On the other hand, according to the differentiation rule of compound function (5) and formulae (2) we get

$$df(0, 1) = dF[\gamma(0), d\gamma(0, 1)] = dF(\bar x, x).$$

Then $dF(\bar x, x) \le 0$ and (4) has been proved.
Introducing the notation
$$l(x) = -dF(\bar{x}, x), \qquad g(x) = G(\bar{x}) + dG(\bar{x}, x),$$
the obtained result can be written as:
$$\text{if } g(x) \ge 0 \text{ then } l(x) \ge 0. \tag{6}$$
The next step in our reasoning consists in showing that there exists a functional $\lambda > 0$ which will ensure the relation (7), where $L_1(x) = dG(\bar{x}, x)$.
The main obstacle in showing that is the nonlinearity of g(x), which consists of the linear term $L_1(x)$ and the additive term $G(\bar{x})$. In order to overcome that obstacle an auxiliary operator
$$L\langle s, x\rangle = \langle s,\; sG(\bar{x}) + dG(\bar{x}, x)\rangle, \qquad s \in R, \tag{8}$$
where $L: R \times X \to R \times Z$, can be introduced. It can be proved that L is a linear operator, i.e. for real numbers $\alpha_1, \alpha_2$ and $s_1, s_2 \in R$,
$$L\bigl(\alpha_1\langle s_1, x_1\rangle + \alpha_2\langle s_2, x_2\rangle\bigr) = \alpha_1 L\langle s_1, x_1\rangle + \alpha_2 L\langle s_2, x_2\rangle.$$
Since the functional l(x) can also be treated as defined over $R \times X$, the following notation
$$l\langle s, x\rangle = -dF(\bar{x}, x)$$
can be introduced. Now we should check whether the condition $L\langle s, x\rangle \ge 0$ implies $l\langle s, x\rangle \ge 0$. Observe that this condition can be written as
$$s \ge 0 \quad\text{and}\quad sG(\bar{x}) + dG(\bar{x}, x) \ge 0. \tag{9}$$
Assuming first of all that s > 0 and dividing (9) by s we get
$$G(\bar{x}) + dG(\bar{x}, x/s) \ge 0.$$
Taking into account (4) we obtain $-dF(\bar{x}, x/s) \ge 0$ and $-dF(\bar{x}, x) \ge 0$, or
$$l\langle s, x\rangle \ge 0 \tag{10}$$
on the set P of pairs $\langle s, x\rangle$, s > 0, which satisfy the condition $sG(\bar{x}) + dG(\bar{x}, x) \ge 0$. Each point $\langle 0, x\rangle$ for which the relation
$$dG(\bar{x}, x) \ge 0 \tag{11}$$
holds can be treated as a limit of a sequence of points from P. That sequence can be constructed as follows.
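Take an element $x_0$ with the property that
$$\frac{1}{n}G(\bar{x}) + dG\!\left(\bar{x}, \frac{1}{n}x_0\right) \ge 0. \tag{12}$$
Summing up (11) and (12) one gets
$$\frac{1}{n}G(\bar{x}) + dG\!\left(\bar{x}, \frac{1}{n}x_0 + x\right) \ge 0.$$
Then $\langle \tfrac{1}{n}, \tfrac{1}{n}x_0 + x\rangle \in P$. Since
$$\langle 0, x\rangle = \lim_{n \to \infty}\left\langle \frac{1}{n}, \frac{1}{n}x_0 + x\right\rangle,$$
(9) implies (10) for all $s \ge 0$.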
Introducing the notation $\langle s, x\rangle = w$, $R \times X = W$, the obtained relation can be written as
$$\text{if } L(w) \ge 0 \text{ then } l(w) \ge 0, \tag{13}$$
where L is a linear operator, $L: W \to V = R \times Z$, and l is a linear functional over W (an element of the space $W^*$, adjoint to W).
We can now return to the main problem, which is the existence of a nonnegative functional $v^* \in V^*$ satisfying the relation
$$l(w) = v^*[L(w)]. \tag{14}$$
To solve that problem we shall need a generalization to Banach spaces of the well-known Farkas lemma. First of all it is necessary to introduce a few additional notions. We shall denote by Q the set of all functionals of $W^*$ which can be represented in the form
$$w^*(w) = v^*[L(w)], \qquad v^* \ge 0, \tag{15}$$
where $v^* \in V^*$.
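For orientation, the classical finite-dimensional Farkas lemma, of which relations (13) and (14) are the Banach-space analogue, can be stated as follows. This is the standard matrix form, added only as a reference point; the symbols A, c and y are assumptions of this note and do not appear in the lectures.

```latex
% Classical Farkas lemma (finite-dimensional form).
% A is a real m-by-n matrix and c is a vector in R^n.
\[
\bigl(\, A w \ge 0 \;\Longrightarrow\; c^{\top} w \ge 0
      \quad \text{for all } w \in \mathbb{R}^{n} \,\bigr)
\quad\Longleftrightarrow\quad
\exists\, y \in \mathbb{R}^{m},\; y \ge 0 :\; c = A^{\top} y .
\]
```

In this reading A plays the role of the operator L, c the role of the functional l, and y the role of the nonnegative functional $v^*$.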
As shown in Ref. [7], in order to check whether the regularity condition 3 of Theorem 2 holds it is sufficient to show that there exists an element $w^* \in W$ such that
$$L(w^*) > 0. \tag{21}$$
For example, in the case of the operator (20) one obtains
$$L(w) = \left\langle s,\; \int_0^t \varphi'[\bar{x}(\tau)]\,x(\tau)\,d\tau + s\left[a(t) - \int_0^t \varphi[\bar{x}(\tau)]\,d\tau\right]\right\rangle, \qquad 0 \le t \le T.$$
Since
$$a(t) - \int_0^t \varphi[\bar{x}(\tau)]\,d\tau \ge 0,$$
it can be easily proved that there exists a pair $w^* = \langle s^*, x^*\rangle$, $s^* > 0$, $x^*(t) > 0$, $t \in [0, T]$, such that
$L(w^*) > 0$.
It should be noted that Theorem 2 generalizes certain theorems of variational calculus in Banach spaces and, in particular, the following theorem of Lusternik.
Theorem 3. Let the functionals F, H be strongly differentiable at the point $\bar{x} \in X$ and $\|\operatorname{grad} H(\bar{x})\| > 0$. If $\bar{x}$ is a conditional extremum point of F(x), subject to the constraint H(x) = c, where $c = H(\bar{x})$, then
$$\operatorname{grad} F(\bar{x}) = \mu \operatorname{grad} H(\bar{x}), \tag{22}$$
where $\mu$ is a number. The proof of that theorem is given in Ref. [9].

Consider a linear dynamic system, shown in Fig. 1, having one controlled input u(t) and n+1 outputs, which are described by Volterra operators:
$$y_i(t) = \int_0^t k_i(t, \tau)\,u(\tau)\,d\tau, \tag{1}$$
where $k_i(t, \tau)$ are linearly independent transient functions of the system, i = 0, 1, ..., n.
A typical optimization problem can be formulated as follows: find the function $u(t) \in L^p[0, T]$ which minimalizes
$$\|u\| = \left(\int_0^T |u(t)|^p\,dt\right)^{1/p}, \qquad p > 1, \tag{2}$$
subject to the constraints
$$y_i(T) = \int_0^T k_i(T, \tau)\,u(\tau)\,d\tau = x_i, \qquad i = 0, 1, \ldots, n, \tag{3}$$
where T, $x_i$ are given real numbers. In other words, for the given outputs $x_i$ attained at the time t = T it is required to minimalize the control cost (2).
In certain cases additional conditions of the form
$$M_1 \le u(t) \le M_2 \tag{4}$$
($M_1$, $M_2$ given numbers) or
$$y_j(t) = \int_0^t k_j(t, \tau)\,u(\tau)\,d\tau \le x_j(t), \qquad j = 0, 1, \ldots, n \tag{5}$$
($x_j(t)$ given time functions) are imposed. The constraint (5) is called "restriction of phase coordinates".
There exist, of course, many known optimization techniques, such as variational calculus, the maximum principle, dynamic programming etc., which can be applied to the solution of the optimization problems formulated above. In the present section we should like to demonstrate that the method based on Theorems 1, 2 of Sections 4, 5 is very convenient for the solution of problems including restriction of phase coordinates. Instead of dealing with a general n-dimensional system we shall confine our analysis to a second order system, which is frequently encountered in engineering practice (e.g. in servomechanisms etc.). The result of that analysis will be useful for the investigation of a class of controllability problems.
Example 1. Consider a system described by the differential equations
$$\frac{dy_0}{dt} = y_1(t), \qquad \frac{dy_1}{dt} = u(t), \tag{6}$$
with zero initial conditions $y_0(0) = y_1(0) = 0$, shown in Fig. 2. It is required to find a control function u(t) which minimalizes the "energy cost"
$$E(u) = \frac{1}{2}\int_0^T [u(\tau)]^2\,d\tau, \tag{7}$$
subject to the constraints
$$y_0(T) = \int_0^T (T - \tau)\,u(\tau)\,d\tau = x_0 \qquad (x_0 > 0), \tag{8}$$
$$y_1(T) = \int_0^T u(\tau)\,d\tau = 0. \tag{9}$$
The constraints (8), (9) mean that the deflection of the output coordinate of the system for t = T is equal to $x_0$ and the corresponding velocity of that coordinate at t = T is equal to zero. Constraints of that kind are typical for the operation of controlled motors and servomechanisms.
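The minimum-energy problem (7)-(9) can be checked numerically. The sketch below discretizes the two integral constraints with a midpoint rule and takes the minimum-norm solution of the resulting linear system, which is what stationarity of a Lagrangean yields for a quadratic cost. The function names, the grid size and the closed-form expression $u(\tau) = (6x_0/T^2)(1 - 2\tau/T)$ are assumptions of this note: the closed form was obtained by solving (8), (9) for a control that is affine in $\tau$ and is not quoted from the lectures.

```python
import numpy as np

def min_energy_control(x0, T, n=1000):
    """Discretized minimum-energy control for constraints (8), (9)."""
    h = T / n
    tau = (np.arange(n) + 0.5) * h            # midpoint grid on [0, T]
    # Row 0: weights of the integral in (8); row 1: weights of the integral in (9).
    A = np.vstack([(T - tau) * h, np.full(n, h)])
    b = np.array([x0, 0.0])
    # Minimum-norm solution u = A^T (A A^T)^{-1} b of A u = b.
    u = A.T @ np.linalg.solve(A @ A.T, b)
    return tau, u

def closed_form_control(x0, T, tau):
    """Affine-in-tau control satisfying (8) and (9) (derived for this note)."""
    return 6.0 * x0 / T**2 * (1.0 - 2.0 * tau / T)

tau, u = min_energy_control(x0=1.0, T=2.0)
print(np.max(np.abs(u - closed_form_control(1.0, 2.0, tau))))
```

The control accelerates during the first half of the interval and brakes symmetrically during the second half, which is consistent with the requirement of zero terminal velocity.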
The Lagrangean of the present problem is equal to
$$\Phi(u, \mu) = \frac{1}{2}\int_0^T [u(\tau)]^2\,d\tau + \mu_1\left[\int_0^T (T - \tau)\,u(\tau)\,d\tau - x_0\right] + \mu_2\int_0^T u(\tau)\,d\tau, \tag{10}$$
where $\mu_1$, $\mu_2$ are Lagrange multipliers. The necessary, and at the same time sufficient (due to the convexity of E(u)), condition of optimality according to Theorem 3 becomes:
(ll ) where
grad f'l'
u
~(u, t1' 4>2
Definition 2.2. (i) Given an initial function $\phi(\cdot) \in C([\alpha, t_0], D)$, a solution to (1) is a function $x(\cdot) \in C([\alpha, \beta), D)$, $t_0 < \beta \le \gamma$, such that $x(t) = \phi(t)$ for all $t \in [\alpha, t_0]$ and x(t) satisfies (1) on $(t_0, \beta)$.
(ii) The solution at time t generated from initial time $t_0$ and initial function $\phi$ is denoted by $x(t; t_0, \phi)$. This solution is unique if any other solution $y(t; t_0, \phi)$
is identical to it as far as both are defined.
Theorem 2.2. Consider the system (2.1) and let $f(t, \psi(\cdot))$ be continuous in t and locally lipschitzian in $\psi$. Let $\phi \in C([\alpha, t_0], D)$. Then there exists a unique solution $x(t) = x(t; t_0, \phi)$ on $[\alpha, \beta)$, $t_0 < \beta \le \gamma$, and if $\beta < \gamma$ and $\beta$ cannot be increased, then for any compact set $G \subset D$ there exists a sequence of real numbers $t_0 < t_1 < t_2 < \cdots \to \beta$ such that $x(t_k) \in D - G$, k = 1, 2, ..., i.e., as $t \to \beta$, x(t) comes arbitrarily close to the boundary of D or is unbounded.
Corollary 2.3. Let $D = R^n$ and let $f(t, \psi(\cdot))$ be continuous in t and linear in $\psi$. Then for every $\phi$ there exists a unique solution $x(t) = x(t; t_0, \phi)$ on the entire interval $[\alpha, \gamma)$.
3. REPRESENTATION OF SOLUTIONS FOR LINEAR DELAY-DIFFERENTIAL SYSTEMS
In this section, which is based heavily on the work of Hale and Meyer [12], we consider the equations of the form
dx dt = f(t,x('»
(3.1)
with in itial function sp ace is linear in
x(')
t - h ::s::t f or a l l
~ £
It
B , all
The con t ro l funct ion
C([t
+ u(t)
o
- h,t o],R
n)
=
and de pends on ly on values of
u(')
L(')
x (s )
f(t, x( '» for
II f( t, ¢(.» II
i s furthe r a ssumed that t . where
B whe r e
s L( t)
II
. t
0
0
to whi ch a unique so l ut ion ex is ts by The orem 2 .2 . on
t
The hypotheses allow application of the Riesz Representation Theorem to establish existence of an $n \times n$ matrix valued function $\eta$ defined
[t-h. t 1
on $(-\infty, \infty) \times [-h, 0]$ such that
$$f(t, \phi(\cdot)) = \int_{-h}^{0} [d_\tau \eta(t, \tau)]\,\phi(\tau)$$
for all $\phi \in B$. Moreover, $\eta(t, \cdot)$ is of bounded variation on $[-h, 0]$ for each t.
Now let $L_1([t_0, T), R^n)$ denote the space of functions with range in $R^n$ which are Lebesgue integrable over $[t_0, T)$. Then we
have Theorem 3.1. (or (3.2» ~
c B.
Let
with control
for all
x(t;to'~'O) + K (t ,»)
f or each W(t,s)
T >. t
and with
o
Then
(3.3)
where
be the solution of (3.1)
x(·,to'~'u)
is defined for
t , and
f:
K(t,s)u(s)ds , t >. to o
s ~ t - h, K(t,')
K(t,s) -- a~=,s)
2 L«t 0 ,t],Rn)
£
00
1 t everyw h ere, were h amos
is the unique solution of the equations
W(t,s) = Oft
for all
t
£
[s - h,s]
(3.4)
{
s
Proof:
Let
f-ho
{d T1U;,T)} W(T + F;,i;)dF; - (t - s)1 , s
W( t ,s)
'S t
T
u(·) c L ([to,t],Rn ) l
is a continuous linear operator mapping
M(u) =6 x(t;to'O,u )
Then n
Ll([to,t],R)
Theorem, there exists an uefined for all
t
:l;
t
o
into n x n
n
R
and
matrix
such that
.
I
t X(t;t ,T)u(T)dT t
It is let
ea~ily
K(t,T)
shown [12] that
=
X is independent of
X(t;to,T) , t c (-00,00) , T c (-,t]
K( t, T)
0
W( t ,»)
- fK(t,T)dT
Then
0
o
for
T
1:
(t,t + h] for
For any
t >. n
'1
W satisfies (3.4) and
and
~
t
and let
(_, 00)
£
Wet ,~)
Hence
o
,
for
0
let t
£
['1 - h,'1]
K is as stated in the theorem .
The linear system to be discussed in some detail from a controllability standpoint is of the form
$$\frac{dx}{dt} = A(t)x(t) + B(t)x(t - h) + C(t)u(t), \tag{3.5}$$
where $x(t) \in R^n$, $u(t) \in R^p$, and A(·), B(·), C(·) are continuous functions. The solution of (3.5) can be represented as in (3.3) and it is easily checked that the function K(t, s) satisfies the partial differential equations [2]
$$\frac{\partial K(t, s)}{\partial s} = -K(t, s)A(s) - K(t, s + h)B(s + h), \qquad t_0 \le s < t - h,$$
$$\frac{\partial K(t, s)}{\partial s} = -K(t, s)A(s), \qquad t - h \le s \le t, \tag{3.6}$$
$$K(t, t) = I.$$
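A scalar instance of (3.5) can be integrated numerically to see how the initial function on $[t_0 - h, t_0]$ enters the solution. The sketch below uses a forward-Euler scheme whose step dt is assumed to divide h. The function name `simulate_delay_system`, the restriction to constant scalar coefficients a, b, c, and the Euler scheme itself are illustrative assumptions and are not part of the lectures.

```python
import numpy as np

def simulate_delay_system(a, b, c, h, phi, u, t0, t1, dt):
    """Forward-Euler integration of x'(t) = a x(t) + b x(t - h) + c u(t)."""
    lag = int(round(h / dt))                 # delay expressed in grid steps
    steps = int(round((t1 - t0) / dt))
    # Sample the initial function phi on [t0 - h, t0].
    x = [phi(t0 - h + k * dt) for k in range(lag + 1)]
    for k in range(steps):
        t = t0 + k * dt
        x_now = x[-1]                        # x(t)
        x_delayed = x[-1 - lag]              # x(t - h)
        x.append(x_now + dt * (a * x_now + b * x_delayed + c * u(t)))
    times = t0 - h + dt * np.arange(len(x))
    return times, np.array(x)

# Pure delay feedback x'(t) = -x(t - 1) with phi = 1: on [0, 1] the solution is 1 - t.
times, x = simulate_delay_system(0.0, -1.0, 0.0, 1.0,
                                 lambda s: 1.0, lambda s: 0.0, 0.0, 1.0, 1e-3)
print(times[-1], x[-1])
```

On the first delay interval the delayed term is read entirely from the initial function, which is why the state space of such a system is a space of functions rather than $R^n$.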
4. DEFINITIONS OF CONTROLLABILITY
Consider the nonlinear delay-differential system
$$\frac{dx}{dt} = f(t, x(t), x(t - h), u(t)), \qquad t \ge t_0, \tag{4.1}$$
where $x(t) \in R^n$, $u(t) \in R^p$, and u is measurable and bounded on every finite time interval (such controls will be called "admissible"). The delay is represented by a real scalar h > 0 and it is assumed that $f \in C^1$ in all its arguments and f(t, 0, 0, 0) = 0 for all t. The initial function space is the space B as defined earlier.
Definition 4.1. The system (4.1) is $R^n$-controllable if for any $\phi \in B$ there exists $t_1 = t_1(\phi) \in (t_0, \infty)$ and an admissible control segment $u_{[t_0, t_1]}$ such that $x(t_1; t_0, \phi, u) = 0$. If $t_1$ is independent of $\phi$, we speak of fixed-time $R^n$-controllability, and if $t_1 - t_0$ can be made arbitrarily small we speak of differential $R^n$-controllability.
While this definition turns out to be quite useful, it does not reflect the fact that the state space of (4.1) is a function space and that one can conceive of control problems in which the state of the system is to be transferred to a point (or region) in function space. Hence it makes sense to also consider the following definition.
Definition 4.2. The system (4.1) is controllable to the origin with respect to the space of initial functions B if for any $\phi \in B$ there exists $t_1 = t_1(\phi) \in (t_0, \infty)$ and an admissible control segment $u_{[t_0, t_1 + h]}$ such that $x(t; t_0, \phi, u) = 0$ for all $t \in [t_1, t_1 + h]$.
Although controllability to the origin does not imply controllability to some other point in function space, it is possible to obtain results for the latter problem using an approach similar to that presented in the sequel (see Weiss [23]).

5. CONTROLLABILITY OF LINEAR DELAY-DIFFERENTIAL SYSTEMS
We begin with the following Lemma.
Lemma 5.1. The system (3.5) is $R^n$-controllable if there exists $t_1 > t_0$ such that
$$\operatorname{rank} \int_{t_0}^{t_1} K(t_1, \eta)C(\eta)C'(\eta)K'(t_1, \eta)\,d\eta = n, \tag{5.1}$$
where the prime (') indicates transpose.
Proof: Let $\mathcal{C}$ be the matrix in (5.1) whose rank is n. The Lemma follows by substituting
$$u(s) = -C'(s)K'(t_1, s)\,\mathcal{C}^{-1}x(t_1; t_0, \phi, 0)$$
in (3.3), for then $x(t_1; t_0, \phi, u) = 0$.
The question of the necessity of (5.1) involves the concept defined below.
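The rank test (5.1) and the steering control used in the proof of Lemma 5.1 can be illustrated in the delay-free special case $B(t) \equiv 0$, where the kernel K(t, s) reduces to the ordinary transition matrix. The sketch below does this for a double integrator. The choice of system, the midpoint quadrature and all function names are assumptions made for illustration only.

```python
import numpy as np

C = np.array([[0.0], [1.0]])                  # input matrix of the double integrator

def transition(t, s):
    """Transition matrix of x0' = x1, x1' = u (plays the role of K(t, s) when B = 0)."""
    return np.array([[1.0, t - s], [0.0, 1.0]])

def midpoints(t0, t1, n):
    h = (t1 - t0) / n
    return t0 + (np.arange(n) + 0.5) * h, h

def gramian(t0, t1, n=2000):
    """Midpoint-rule approximation of the matrix in (5.1)."""
    s_mid, h = midpoints(t0, t1, n)
    G = np.zeros((2, 2))
    for s in s_mid:
        KC = transition(t1, s) @ C            # K(t1, s) C(s), shape (2, 1)
        G += (KC @ KC.T) * h
    return G

def steer_to_origin(x0, t0, t1, n=2000):
    """Apply u(s) = -C' K'(t1, s) G^{-1} x_free and return G and the final state."""
    G = gramian(t0, t1, n)
    x_free = transition(t1, t0) @ x0          # response with u = 0
    z = np.linalg.solve(G, x_free)
    s_mid, h = midpoints(t0, t1, n)
    x_final = x_free.copy()
    for s in s_mid:
        KC = transition(t1, s) @ C
        u = -(KC.T @ z)                       # control value at time s
        x_final += (KC @ u) * h
    return G, x_final

G, x_final = steer_to_origin(np.array([1.0, -2.0]), 0.0, 1.0)
print(np.linalg.matrix_rank(G), x_final)
```

Because the same quadrature is used for the Gramian and for the forced response, the final state vanishes up to rounding, mirroring the cancellation in the proof.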
Definition 5.1. A system (3.5) is pointwise complete if for each $t_1 > t_0$ there exists a set of initial functions $\phi_i \in B$, i = 1, ..., n, such that the set $x(t_1; t_0, \phi_i, 0)$, i = 1, ..., n, forms a basis for $R^n$.
It is easy to construct an example to show that not all time-varying systems (3.5) are pointwise complete. We conjecture, however, that all constant coefficient systems of the form (3.5) are pointwise complete.
Lemma 5.2. If the system (3.5) is pointwise complete, then (5.1) is necessary and sufficient for fixed-time $R^n$-controllability.
Proof: For any $\phi \in B$, suppose there exists a fixed time $t_1 > t_0$ and an admissible control $u_{[t_0, t_1]}$ such that $x(t_1; t_0, \phi, u) = 0$, but (5.1) does not hold. Then there exists a nonzero vector $x_1 \in R^n$ such that $x_1' K(t_1, s)C(s) = 0$ for all $s \in [t_0, t_1]$. Then $x_1' x(t_1; t_0, \phi, 0) = 0$. By hypothesis, $\phi$ can be chosen so that $x(t_1; t_0, \phi, 0) = x_1$. Then $x_1' x_1 = 0$, which is a contradiction.
Theorem 5.3.
A system (3.5) is controllable to the origin with respect to B if and only if (i) it is $R^n$-controllable, and (ii) for each $\phi \in B$ and for some corresponding $t_1$ such that $x(t_1; t_0, \phi, u) = 0$, the equation
$$C(t)u(t) = -B(t)x(t - h) \tag{5.2}$$
has an admissible solution $u^*(\cdot)$ on $(t_1, t_1 + h)$.
Proof: The necessity of (i) is obvious. Now, given $\phi \in B$, let $u_{[t_0, t_1]}$ be such that $x(t_1; t_0, \phi, u) = 0$. If (5.2) holds, then on the interval $(t_1, t_1 + h)$ the system (3.5) becomes $\dot{x}(t) = A(t)x(t)$. It then follows that x(t) = 0 for $t \in [t_1, t_1 + h]$. Conversely, if (3.5) is controllable to the origin with respect to B, then for each $\phi \in B$ there exists some $t_1 > t_0$ and control $u_{[t_0, t_1 + h)}$ such that $x(t; t_0, \phi, u) = 0$ for all $t \in [t_1, t_1 + h]$. This implies (i), and the uniqueness theorem for delay equations implies (ii).
Remark: The major element in the controllability problem for (3.5) is the solution to (5.2). Clearly, an admissible solution will exist on $(t_1, t_1 + h)$ if and only if the right side of (5.2) is in the range of C(t) almost everywhere on the interval. A sufficient condition for the latter to hold is the existence of a $p \times n$ matrix D(t) with bounded measurable elements such that B(t) = C(t)D(t) almost everywhere on $(t_1, t_1 + h)$.

6. LOCAL $R^n$-CONTROLLABILITY OF NONLINEAR DELAY-DIFFERENTIAL SYSTEMS
In this section, we generalize the results of Lee and Markus [19] to the case of delay-equations.
Definition 6.1. The system (4.1) is locally $R^n$-controllable to the origin with respect to B if it is $R^n$-controllable to the origin with respect to a neighborhood N(0) of the origin in B.
Let
$$A(t) = \frac{\partial f}{\partial x}(t, 0, 0, 0), \qquad B(t) = \frac{\partial f}{\partial x_d}(t, 0, 0, 0), \qquad C(t) = \frac{\partial f}{\partial u}(t, 0, 0, 0), \tag{6.1}$$
where $x_d(t) = x(t - h)$.
Theorem 6.1. A system (4.1) is locally $R^n$-controllable to the origin with respect to B if its first variation about the zero-solution satisfies the condition that there exists $t_1 > t_0$ such that (5.1) holds.
Proof: We introduce a parameter $\xi$ into the control u and define
(6.2)
u~(t} - u(t,O
It should be noted that that if
u(t,O}
= UO(t} = a on
for
t
£
[to,t
l],
and
Define the Jacobian matrix
J(t)
B E,
(6.3)
Since
dX( t ; 'i>' O ,u )
J( t)
dE,
by
I E, =O
0, the solution of (4.1) is written as
~
x(t,E,)
J:
f(T,x(T),xd(T),u E,(T)dT o
Then we have
J(t)
=
~; I =
J:
E,=O
where
A. B ,C
(6.4)
j(t)
[A(T)J( T) + B(T)J(t - h) + C(T) 0
are as given in (6.1) .
~~
(T ,0)
ta.
Differentiating we obtain
dU
A(t)J(t) + B(t)J(t - h) + C(t) dE, (t,O) , to
~ t
But, from (6.2)
dU ~
(t,O)
and
ct r) ~ (t,O) dE,
- B(t ) J (t - h) , t 1
. K[C(t,02)]
01 < 0 2 denotes
and by orthogonal
R[C(t,ol)] li R[C(t,02)]
Corollary 7.2. There exists a positive $C^1$ function $\mu(t)$ such that
$$\bigcup_{\sigma > t} R[C(t, \sigma)] = R[C(t, t + \mu(t))].$$
Corollary 7.3. $R[C(t, \sigma)]$, $\sigma \le t$, is monotone nondecreasing with decreasing $\sigma$.
Identical results hold with $C(t, \sigma)$ replaced by $V(t, \sigma)$. Hence, if we denote the set of states controllable from (reachable at) time t by $P_c(t)$ ($P_r(t)$) and denote the set of states which are determinable at (observable from) time t by $Q_d(t)$ ($Q_o(t)$), then there exist positive $C^1$ functions $\nu(t)$, $\omega(t)$, $\rho(t)$ such that
$$\begin{aligned} P_c(t) &= R[C(t, t + \mu(t))], \\ P_r(t) &= R[C(t, t - \nu(t))], \\ Q_d(t) &= R[V(t, t - \omega(t))], \\ Q_o(t) &= R[V(t, t + \rho(t))]. \end{aligned} \tag{7.6}$$
We can now characterize the concepts of controllability, reachability, determinability, and observability for the system (7.1).
Theorem 7.4. A state $x_0$ of (7.1) is controllable from (reachable at) time $\tau$ if and only if there exists a finite value of time $t_1 > \tau$ ($t_1 < \tau$) such that $x_0 \in R[C(\tau, t_1)]$.
Proof (for controllability only): (Sufficiency): The solution of (7.1) with initial state $x_0$ and initial time $\tau$ is given by
$$x(t; \tau, x_0) = \Phi(t, \tau)x_0 + \int_\tau^t \Phi(t, \eta)C(\eta)u(\eta)\,d\eta.$$
By hypothesis, there exists $z \in R^n$ such that $C(\tau, t_1)z = x_0$. Setting $u(\eta) = -C'(\eta)\Phi'(\tau, \eta)z$ and making use of the fact that $\Phi$ satisfies a group property ($\Phi(t, \tau) = \Phi(t, \sigma)\Phi(\sigma, \tau)$ for all $t, \sigma, \tau$) we find that $x(t_1; \tau, x_0) = 0$.
(Necessity): Suppose $x_0$ ($\ne 0$) is controllable from $\tau$, but $x_0 \notin R[C(\tau, t)]$ for all $t > \tau$. Then there exists a finite value of time $t_1$ and an admissible control segment $u_{[\tau, t_1]}$ such that
$$0 = \Phi(t_1, \tau)x_0 + \int_\tau^{t_1}\Phi(t_1, \eta)C(\eta)u(\eta)\,d\eta$$
or
$$x_0 = -\int_\tau^{t_1}\Phi(\tau, \eta)C(\eta)u(\eta)\,d\eta. \tag{7.7}$$
Since $x_0 \notin R[C(\tau, t_1)]$ it follows that $x_0$ has a nonzero projection $x_1$ on $K[C(\tau, t_1)]$. By Theorem 7.1, $x_1 \in K[C(\tau, t_1)]$, in which case $x_1'\Phi(\tau, \eta)C(\eta) = 0$ for all $\eta \in [\tau, t_1]$. But then, from (7.7), $x_1'x_0 = 0$, which implies $x_0 \in R[C(\tau, t_1)]$, a contradiction.
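From Theorem 7.1 and Theorem 7.4 we deduce
Theorem 7.5. A system (7.1) is controllable from (reachable at) time $\tau$ if and only if there exists a finite value of time $t_1 > \tau$ ($t_1 < \tau$) such that rank $C(\tau, t_1) = n$.
Another useful result is given by
Theorem 7.6. (i) If $x \in P_c(t)$ and $\tau \le t$, then $\Phi(\tau, t)x \in P_c(\tau)$. (ii) If $x \in P_r(t)$ and $\tau \ge t$, then $\Phi(\tau, t)x \in P_r(\tau)$.
Proof of (i): By hypothesis there exists $t_1 > t$ and $u^*_{[t, t_1]}$ such that $x(t_1; t, x) = 0$. Then
$$0 = \Phi(t_1, t)x + \int_t^{t_1}\Phi(t_1, \eta)C(\eta)u^*(\eta)\,d\eta. \tag{7.8}$$
Let $\tilde{u}(\eta) = 0$ on $[\tau, t)$ and $\tilde{u}(\eta) = u^*(\eta)$ on $[t, t_1]$. Then
$$x(t_1; \tau, \Phi(\tau, t)x) = \Phi(t_1, \tau)\Phi(\tau, t)x + \Phi(t_1, \tau)\int_\tau^{t_1}\Phi(\tau, \eta)C(\eta)\tilde{u}(\eta)\,d\eta = 0$$
by (7.8). The proof of (ii) follows exactly as above.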
Now, from Definitions 7.3, 7.4 and Theorem 7.4, we have
Theorem 7.7. A state $x_0$ of (7.1) is determinable at (observable from) time $\tau$ if and only if there exists a finite $t_1 < \tau$ ($t_1 > \tau$) such that $x_0 \in R[V(\tau, t_1)]$.
Corollary 7.8. A system (7.1) is determinable at (observable from) time $\tau$ if and only if there exists a finite $t_1 < \tau$ ($t_1 > \tau$) such that rank $V(\tau, t_1) = n$.
The dual of Theorem 7.6 is also clear and is given by
The significance of the phrases "observable from" and "determinable at" should now be clear from the nature of (7.1). For if (7.1) is determinable at (observable from) time $\tau$ (assuming u(·) = 0), the state of the system at time $\tau$ can be uniquely determined from knowledge of the output y(t) over a finite time interval ending at (beginning from) time $\tau$. To see this, consider the solution of (7.1) when u(·) = 0, the initial state is $x_0$, the initial time is $t_0$; and assume (7.1) is determinable at $t_0$. Then we can write $y(t) = H(t)\Phi(t, t_0)x_0$ and, for some $t_1 < t_0$,
$$x_0 = V(t_0, t_1)^{-1}\int_{t_1}^{t_0}\Phi'(t, t_0)H'(t)y(t)\,dt.$$
Application of Theorem 7 .9 indicates that the state be determined at time
t
' ( t o , t ) xo
c an
0
It is also easy to check the following facts about the system (7.1).
1. Controllability from time t implies that any state at time t can be transferred to any other state in a finite interval of time beginning with t.
2. If (7.1) is controllable from t and we reverse the ordering of the time scale, the reversed time system is reachable at t.
3. In general, complete controllability does not imply complete reachability or vice versa. To see this, consider (7.1) in which n = p = r = 1, A(t) = 0, and
$$C(t) = \begin{cases} 0, & t \le 0, \\ 1 - \cos t, & t > 0. \end{cases}$$
The above system is completely controllable, but is reachable only for t > 0.
One can say the following, however.
Proposition 7.10. (i) If a system (7.1) is controllable from time $\tau$, then it is reachable at all $t \ge \tau + \mu(\tau)$, where $\mu(\cdot)$ is as defined in Corollary 7.2. (ii) If a system is determinable at $\tau$, it is observable from all $t \le \tau - \omega(\tau)$, where $\omega(\tau)$ is as defined earlier.
Proof: (i) If (7.1) is controllable from $\tau$, then
$$\operatorname{rank} C(\tau, \tau + \mu(\tau)) = n = \operatorname{rank} C(\tau + \mu(\tau), \tau) = \operatorname{rank} C(\xi, \tau) \quad \text{for all } \xi \ge \tau + \mu(\tau).$$
(ii) Follows from (i) by duality.
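The difference between controllability and reachability in the scalar example of item 3 can be checked numerically. The sketch below assumes A(t) = 0 (so the transition matrix is identically 1) and an input coefficient that vanishes for $t \le 0$ and equals $1 - \cos t$ for t > 0; this piecewise form, the function names and the quadrature are assumptions of this note. The scalar Gramian over an interval is then just the integral of $C(\eta)^2$.

```python
import numpy as np

def c_coeff(t):
    """Assumed input coefficient: 0 for t <= 0 and 1 - cos(t) for t > 0."""
    return np.where(t > 0.0, 1.0 - np.cos(t), 0.0)

def gramian(t, sigma, n=20000):
    """Integral of C(eta)^2 between t and sigma (in either order), midpoint rule."""
    lo, hi = min(t, sigma), max(t, sigma)
    h = (hi - lo) / n
    eta = lo + (np.arange(n) + 0.5) * h
    return float(np.sum(c_coeff(eta) ** 2) * h)

# Controllable from t = -5: a future interval reaching into t > 0 gives a positive Gramian.
print(gramian(-5.0, 1.0))
# Not reachable at t = -1: every past interval gives a zero Gramian.
print(gramian(-1.0, -50.0))
# Reachable at t = 1: the past interval [0, 1] gives a positive Gramian.
print(gramian(1.0, 0.0))
```

A positive value for some future interval at every t corresponds to complete controllability, while the zero value for all past intervals at $t \le 0$ shows that reachability fails there.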
Corollary 7.11. Complete differential controllability (observability) is equivalent to complete differential reachability (determinability). [It is therefore obvious that no distinction need be made between the concepts of controllability (observability) and reachability (determinability) for time-invariant systems (7.1)].
We now consider briefly the problem of controllability and determinability for ordinary nonlinear differential systems.
The controllability problem can, in fact, be handled in a manner completely analogous to that presented for systems containing time delays (simply let $h \to 0$). Results of a slightly different nature can be obtained for special types of nonlinear equations and these are discussed in the section dealing with the application of Pfaffian systems to the controllability problem.
The determinability problem warrants more detailed comment, however, and so we consider the problem of giving sufficient conditions for a class of nonlinear ordinary differential systems to be uniformly determinable (defined below). The discussion is based on the work of Al'brekht and Krasovskii [1].
Consider the system
$$\frac{dx}{dt} = f(t, x(t)), \qquad y(t) = g(t, x(t)), \tag{7.9}$$
where $x(t) \in R^n$, $y(t) \in R^r$, f(t, 0) = 0, g(t, 0) = 0, f and g are analytic functions of x in a neighborhood of x = 0, and the components of f and g respectively can be expanded in a series as follows.
$$f_i(t, x) = \sum_{m=1}^{\infty}\phi_i^{(m)}(t, x), \qquad g_i(t, x) = \sum_{m=1}^{\infty}\psi_i^{(m)}(t, x), \tag{7.10}$$
where $\phi_i^{(m)}$ and $\psi_i^{(m)}$ are m-th order forms in x with continuous and bounded coefficients.
Definition 7.6. The system (7.9) is uniformly determinable at all $t \ge \gamma$ if there exists a function $Y: R^1 \times R^r \to R^n$ such that any trajectory x(t) can be expressed as
$$x(t) = Y(t, y(t + \theta)), \qquad -\gamma \le \theta \le 0, \tag{7.11}$$
for all $t \ge \gamma > 0$.
Remark: The interest in uniform determinability stems from the desire to develop a method of state-vector determination whose practical implementation would involve the taking of measurements over time intervals of fixed length.
The first result of interest characterizes uniform observability for the linear system (7.1) (with u(·) = 0), and follows from Corollary 7.8 (see Kalman [17]).
Lemma 7.10. The system (7.1) is uniformly determinable at all $t \ge \gamma$ if and only if rank $V(t, t - \gamma) = n$ for all $t \ge \gamma$, where
$$V(t, t - \gamma) = \int_{t - \gamma}^{t}\Phi'(\eta, t)H'(\eta)H(\eta)\Phi(\eta, t)\,d\eta \tag{7.12}$$
and $\Phi$ is the transition matrix corresponding to (7.3).
t
~
Y if its first variation has that property.
Proof: suppose in
X
n R
o
The system (7.9) is uniformly determinable
£
Consider the solution
x(-,O,x
o)
of (7.9) and
N(O), a sufficiently small neighborhood of th e o ,igin
Writing
x(t)
x(t;O,x
=
o)
, we have the following expan s ion
on the interval [t - y , t )
(7.3)
x ( t + 8)
where
t
~
t ( t + 8,t)x(t) + I S(m)(t, 8,x(t», - y " e
0
is the transition matrix associated with the fir st
variation of (7.9).
The series (7.13) will converge for
sufficiently small.
Since
(7.14)
y ft
+
8)
g (t + 8 , x( t + 8»
,
- y
~
8
x(t)
:'! 0
then s ub s t i t u t i ng (7.13) into (7.14) and expanding yi e l ds
y ( t + 8)
where
H(t + 8)t ( t + 8 , t ) x ( t ) + I p (m) ( t ,8 , x( t » m=2
, - y
:'!
0
~
0
-
231-
L. Weiss
H(t) =
18. ax
(t 0)
'
Now let
(7.15)
Y(t,y(t + e )
A( t )
y
x(t) + V(t,t - y)-l
Jo- (t,n)C(n)C' (n)' (t,Il)T' (t)
for all
t
Choosing
and all n
=
nc[t,t +
~(t)l
where
K(t,n)
is
t , it is clear that the above equation implies
Using the notation
for all
t
which proves the main part of the theorem.
The remainder
follows by a trivial observation. By completely analogous argument,one can prove Theorem 10 .3 : V(t,t - w(t»
Consider (1) with determinability matr ix
and let rank
Then there exists a
l C
V(t,t - w(t)
= rd
max(w.(t», i i
~
1,2, 3,4
It is e asy to show th at
R[Vaa (t ,t - w e t »~
Hence there e xi sts a matri x
I
R[ R( t ,t - '<J ( t )
K(t)
su ch th at
By Corollary 10 .1 it f ollows th at
K( ' ) c (:1
Now define the ( n - r matrix
T (t ) 4
c
- r
d
+ r 2
d
)
c
I rd
denotes the
1 1 is C
r
c
-
r
d
2
+ rd
)
1
J
- K( t) I n _ r -r
T4 ( r )
-c ( n -
1
b y th e f ormula
(10.14)
whe re
I
r
d
2
x r identity mat r i x. Clea rly, 1 d1 From ( 10 . 13) an d (10.14) we ob ta i n (omi t ti ng a rgumen t s d
on ri gh t hand side)
where
Q 1
Q - R'K
is n onne gati ve def i n ite a n d i t fo llows b y h y po t h e s I s
that rank
Ql(t,t - w(t»
=
r
d
- r 3
d
Applying Corollary 10 .1 let (n - r
c
- r
d
for all 1 T (t)
s
t
.
be an
(n - r
- r
c
d
)
x
Z
continuously differentiable nonsingular matrix such
)
Z
that
o
o where for all
PI (t) t •
- r and rank Pl(t) - r ) x (r d d) d 3 1 3 1 Then the coordinate transformation defined by is
(r
d
where
(10.15)
has the effect of transforming the determinability matrix
03
for the
system (10.11) into the form
(10.16)
-'i
-
:J
]
for all
t.
It is easy to check that
T(t)-l
h as
t
lu: " "'"L' fo rm
dS
T(t) , in fact -KT
(10.17)
for all
5l
J
T, t , where the dimensions of the
"0"
are
n - r
c
- r
d2
" rd
Therefore, the transformed state coefficient matrix of (10.11) is given by
•
+ T(t)T{t)
A d(t) a,
and has the form
a, d(t)
A
and the corresponding form, i.e.,
" 4>
a, d(t .. r)
~ransition
matrix
4>
a,
d(t,T)
a l s o has th e
-I
1
for all
t,T
•
Now partition
where
; ad 1
where
~ 12
Add
x n - r
is
is
;ad
(n - r
as f o l l ows :
, and
c
- r + r d ) x (r d - r d) d 2 3 1 3 1 remaining matrices are comformab1e ,.ith thi s. c
- r
; dd
and
d
and the
corresponds to a partition of the ve ct or Then the transpo se o f
(10 .19)
'
a,d
a
r····
Add '
; ad ' 1 ; ad ' 2
~ ll
Add ' ~12
''J 2 1 Add ' ~2 2
By (10 .16), states which are determinable at any fi xed time under the new c oo r di n a t e system, have the f orm
Xl
(d]
[ wet)
must,
where w(t)
dim xl(t) =
0
r
=
for all
d
3
t.
, dim w(t) = n - r
- r + r ' and d2 d3 d1 From Theorem 7.6 and (10.19) it f oll ows t hat
Add ' 1>12
for all
t.
0
=
Transposing, and using equation (7.3), we f ind that
r
A
for all
~
(t~
- r
c
a,d
aa A
Aa1 d
0
0
Add All
0
0
Add A 2l
Add A 22
t • Before giving the final regrouping of terms, it rema ins
to find the "output" coefficient matrix of (l0.11) under the new coordinate system .
This is given by
H( t )
and so, from
(10.17) (with arguments omitted)
But from Theorem 10.3 it is clear that
H
H takes on th e fo r m
where
is
We now define th e f oll owin g
quan ti ties :
~
~
a
x
a
~
[;]
c
~
b d
x x
lC = [A CC
"dd J AU
lc = [A dC
~~d ]
~dd
ac
~~d ]
.ta a
It =
[A
b d 2
~C
[HC
H~]
"dd
An =
Aaa ~bb=Aba
and if we denote
then we define
Finally,
~b b
bb ,
= A
The theorem is thus proved.
It should be emphasized that our procedure for structural decomposition is "symmetric" from a number of points of view. For example, just as Theorem 10.3 is a dual result to Theorem 10.2, we could have given a completely dual procedure for obtaining a form consonant with (10.12). That is, one can easily write the dual to Theorems 10.4 and 10.5 which would begin with the application of Theorem 10.3 and would replace the matrices $C_1$, $C_2$, $C_3$ with matrices $V_1$, $V_2$, $V_3$, etc.
In addition to all this "dual" symmetry, the results of Section 7 indicate that the same type of structural decomposition is obtained if "controllability" is replaced by "reachability" and/or "determinability" is replaced by "observability". Hence, Theorems 10.4 and 10.5 as well as their duals are each representatives for a set of four structural decomposition theorems.
To avoid confusion in the sequel, our discussion and interpretation of the results of this section are given only with reference to the actual procedure adopted in Theorems 10.4 and 10.5 to obtain (10.12). On the basis of our comments above, the reader can easily supply the interpretations for all the remaining approaches.
Remarks:
1. The overall coordinate transformation which produces the general structural decomposition of an arbitrary system (7.1) is represented by the matrix
I
r
d
0
T i(t)
1 T
T( t) 0
Ts(t)
-1
J
4(t)
-1
I
0
I
-1
iT 1(
0
T 3( t
-JI )
.J
2. For the special case when (1) is time-invariant, all applications of Corollary 10.1 will involve time-invariant transformations, so that the procedure given in the proofs of Theorems 10.4 and 10.5 clearly leads to a time-invariant structural decomposition.
3. Pictorially, the decomposition (10.12) can be viewed as in Figure 1, which shows four interconnected systems $S^a$, $S^b$, $S^c$, $S^d$ enclosed in "boxes" labelled with their associated state vectors. If, as is natural, we view the interconnecting lines inside the large "box" as input and output lines for the structural components, then the following result is readily discernible from individual examination of each structural component in (10.12) plus reference to the proofs of Theorems 10.4, 10.5.
Corollary 10.6:
(i) $S^a$ is completely controllable and completely determinable
(ii) $S^b$ is completely controllable and completely undeterminable
(iii) $S^c$ is completely uncontrollable and completely determinable
(iv) $S^d$ is completely uncontrollable and completely undeterminable
Note: If the matrices A(·), C(·), H(·) in (7.1) are analytic functions of time, the ranks of $C(t, t + \mu(t))$, $V_i(t, t - \omega_i(t))$, i = 1, 2, 3, will be constant everywhere in the t-domain. Hence, the system-theoretic interpretation of the structural decomposition of a system with analytic coefficients is given by Corollary 10.6. This provides the proof for assertions concerning the structural decomposition of analytic systems which were made by Kalman [16] and Weiss and Kalman [21]. It may be of interest to point out that in this special case the overall coordinate transformation can be taken to be analytic rather than just $C^1$. This follows directly from the proof of Dolezal's Theorem (Lemma 8.4) given by Weiss and Falb [26].
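For a time-invariant system the sizes of the four components in Corollary 10.6 can be computed from the standard controllability and observability matrices, since in that case determinability and observability coincide. The sketch below is such a computation. The helper names, the diagonal example and the use of the algebraic rank tests in place of the matrices $C(t, \sigma)$, $V(t, \sigma)$ are assumptions of this note rather than the procedure of Theorems 10.4 and 10.5.

```python
import numpy as np

def column_space(M, tol=1e-10):
    """Orthonormal basis of the range of M via the SVD."""
    u, s, _ = np.linalg.svd(M)
    return u[:, :int(np.sum(s > tol))]

def null_space(M, tol=1e-10):
    """Orthonormal basis of the kernel of M via the SVD."""
    _, s, vt = np.linalg.svd(M)
    return vt[int(np.sum(s > tol)):].T

def decomposition_dims(A, C, H):
    """Dimensions (n_a, n_b, n_c, n_d) of the four parts for x' = Ax + Cu, y = Hx."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ C for k in range(n)])
    obsv = np.vstack([H @ np.linalg.matrix_power(A, k) for k in range(n)])
    R = column_space(ctrb)                    # controllable subspace
    N = null_space(obsv)                      # undeterminable subspace
    dim_sum = int(np.linalg.matrix_rank(np.hstack([R, N])))
    n_b = R.shape[1] + N.shape[1] - dim_sum   # controllable and undeterminable
    n_a = R.shape[1] - n_b                    # controllable and determinable
    n_d = N.shape[1] - n_b                    # uncontrollable and undeterminable
    n_c = n - dim_sum                         # uncontrollable and determinable
    return n_a, n_b, n_c, n_d

A = np.diag([-1.0, -2.0, -3.0, -4.0])
C = np.array([[1.0], [1.0], [0.0], [0.0]])    # input drives modes 1 and 2
H = np.array([[1.0, 0.0, 1.0, 0.0]])          # output sees modes 1 and 3
print(decomposition_dims(A, C, H))
```

In this diagonal example each mode falls into exactly one of the four boxes of Figure 1, so every component has dimension one.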
FIGURE 1: Structural Decomposition of a Linear System
-
11.
272-
WEIGHTING PATTERNS, IMPULSE RESPONSES, MINIMAL REALIZATIONS AND CONTROLLABILITY THEORY
Until recently, input/output relations were the most popular means used in engineering textbo oks to represent systems, with the "State" being on l y impli citly considered .
Since an input/output relation is the natural ou t come
of an attempt to model a system from experimental observations, it is of interest to investigate the relationship between input/output represent ations and state-vector represent ations of a s ystem.
In
keeping with the theme of these notes, our discussion centers on the properties of controllab ility, observability, et c. for such representations .
We consider only ordinary linear differential
systems of the form ( 7.1) . The solution to ( 7.1) can be written as
( 11.1)
where
H(t)~(t,tO)xO
y Ir)
X
o
(11.2)
+
I:
W(t,T)u(T)dT o
is the state of the system at time
W(t,T)
H(t) ~(t, T)C(T)
f or all
t
o
and
t, T
and is denoted as the weighting pattern for ~ 7.1) (See Wei ss [20]). Clearly, if
x
o
o ,
then
W(t,T)
contains all the informat i on
needed to compute all input/output pairs of the system .
On this basis we concentrate our study of input/output relations solely on the function $W(t,\tau)$.
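As an illustration of (11.1) and (11.2), the following minimal Python sketch evaluates the weighting pattern of a double integrator and uses it to compute a zero-state response. The example system, the names `phi`, `weighting_pattern` and `zero_state_output`, and the midpoint quadrature are assumptions made here for illustration; they do not appear in the notes.

```python
# Hypothetical example: double integrator x1' = x2, x2' = u, y = x1.
# All names below are illustrative assumptions, not notation from the notes.

G_IN = [0.0, 1.0]    # input matrix (a column), assumed constant
H_OUT = [1.0, 0.0]   # output matrix (a row), assumed constant


def phi(t, tau):
    """Transition matrix of the double integrator, valid for all t and tau."""
    return [[1.0, t - tau], [0.0, 1.0]]


def weighting_pattern(t, tau):
    """W(t, tau) = H Phi(t, tau) G, defined for all t and tau as in (11.2)."""
    p = phi(t, tau)
    pg = [p[0][0] * G_IN[0] + p[0][1] * G_IN[1],
          p[1][0] * G_IN[0] + p[1][1] * G_IN[1]]
    return H_OUT[0] * pg[0] + H_OUT[1] * pg[1]


def zero_state_output(t, u, t0=0.0, steps=2000):
    """y(t) for x0 = 0: integral of W(t, tau) u(tau) over [t0, t] (midpoint rule)."""
    h = (t - t0) / steps
    total = 0.0
    for k in range(steps):
        tau = t0 + (k + 0.5) * h
        total += weighting_pattern(t, tau) * u(tau)
    return total * h


# Unit-step input applied from t0 = 0, evaluated at t = 2.
y = zero_state_output(2.0, lambda tau: 1.0)
```

For this system $W(t,\tau) = t - \tau$ for all $t$, $\tau$ (including $t < \tau$), and the unit-step input gives $y(t) = t^2/2$, so `y` is 2 up to rounding.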
[A historical aside: Engineers have traditionally concerned themselves with input/output relations associated with the causal impulse response function $W_c(t,\tau)$ defined by

$$W_c(t,\tau) = \begin{cases} W(t,\tau), & t \ge \tau \\ 0, & t < \tau \end{cases} \tag{11.3}$$

The distinction between $W_c$ and
The weighting pattern for the
Consider (11.1). Letting $\Psi(t) = H(t)\Phi(t,\tau_1)$ and $\Theta(t) = \Phi(\tau_1,t)G(t)$ yields the desired factorization.
It is also quite simple to construct a globally reduced weighting pattern from a given one, as indicated by the proof of the result below.
Lemma 11.2. Every weighting pattern has a globally reduced form.
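Proof: Suppose (11.4) is not globally reduced. Then the rows of $\Theta(\cdot)$ and/or the columns of $\Psi(\cdot)$ are dependent on $(-\infty,\infty)$. Suppose the rows of $\Theta(\cdot)$ are dependent. Then there exists an $n \times n$ nonsingular constant matrix $K$ such that

$$K\Theta(t) = \begin{bmatrix}\Theta_1(t)\\ 0\end{bmatrix}$$

where the rows of $\Theta_1(\cdot)$ are independent over $(-\infty,\infty)$. Then

$$W(t,\tau) = \Psi(t)K^{-1}\begin{bmatrix}\Theta_1(\tau)\\ 0\end{bmatrix} = \Psi_1(t)\Theta_1(\tau)$$

where $\Psi_1(t)$ consists of the columns of $\Psi(t)K^{-1}$ that multiply $\Theta_1$. If the columns of $\Psi_1(t)$ are not independent over $(-\infty,\infty)$, we introduce a nonsingular constant matrix $L$ such that

$$\Psi_1(t)L = \begin{bmatrix}\Psi_{11}(t) & 0\end{bmatrix}$$

where the columns of $\Psi_{11}(t)$ are independent over $(-\infty,\infty)$. Then, letting

$$\begin{bmatrix}\Theta_{11}(t)\\ \Theta_{12}(t)\end{bmatrix} = L^{-1}\Theta_1(t)$$

we get

$$W(t,\tau) = \Psi_{11}(t)\Theta_{11}(\tau) \tag{11.5}$$

and (11.5) is globally reduced.
We now investigate some of the properties of minimal realizations of globally reduced weighting patterns.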
The first result justifies the terminology "minimal".
Lemma 11.3. A minimal realization of a globally reduced weighting pattern $W(t,\tau)$ has the lowest dimension of all global realizations of $W(t,\tau)$.
Proof. Suppose the contrary. Let $n$ be the order of $W(t,\tau)$ and consider a global realization of $W(t,\tau)$ with dimension $< n$. Clearly, its weighting pattern is of order $< n$, which contradicts the assumption that $W(t,\tau)$ is globally reduced.
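The reduction used in the proof of Lemma 11.2 can be tried on a small case. The sketch below is only an illustration under assumed data: the factorization $\Psi(t) = [1 \;\; t]$, $\Theta(\tau) = [\tau \;\; \tau]'$ and the matrix `K` are hypothetical choices, not taken from the notes. It removes the dependent row of $\Theta$ and checks that the reduced pair reproduces the same weighting pattern.

```python
# Hypothetical factorization W(t, tau) = Psi(t) Theta(tau) with dependent rows in Theta.

def psi(t):
    """1 x 2 row factor (assumed example)."""
    return [1.0, t]


def theta(tau):
    """2 x 1 column factor whose two rows coincide, hence are dependent."""
    return [tau, tau]


def w(t, tau):
    """Weighting pattern given by the original, non-reduced factorization."""
    return sum(p * q for p, q in zip(psi(t), theta(tau)))


K = [[1.0, 0.0], [-1.0, 1.0]]      # constant nonsingular matrix
K_INV = [[1.0, 0.0], [1.0, 1.0]]   # its inverse


def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def row_mat(r, m):
    """Row vector times matrix."""
    return [sum(r[i] * m[i][j] for i in range(len(r))) for j in range(len(m[0]))]


def theta_1(tau):
    """Nonzero block of K Theta(tau); the remaining row of K Theta is zero."""
    return mat_vec(K, theta(tau))[0]


def psi_1(t):
    """Column of Psi(t) K^{-1} that multiplies theta_1."""
    return row_mat(psi(t), K_INV)[0]


samples = [(0.0, 1.0), (2.0, -3.0), (-1.5, 0.5)]
max_error = max(abs(psi_1(t) * theta_1(tau) - w(t, tau)) for t, tau in samples)
```

Here the reduced form is $(1 + t)\tau$, with inner dimension 1 instead of 2.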
An interesting fact which relates the material on "structure" to that presently being developed is the following (see Kalman [15], [16]).
Proposition 11.4. The subsystem $S_a$ in Figure 1 is a minimal realization of the weighting pattern for the overall system.
Proof: The weighting pattern for the system (10.12) is

$$W(t,\tau) = H_a(t)\Phi_{aa}(t,\tau)G_a(\tau) \tag{11.6}$$

where $\Phi_{aa}$ corresponds to the coefficient matrix $F_{aa}$. The right side of (11.6) is the weighting pattern for $S_a$ and is globally reduced. It is a trivial matter to check that the order of this weighting pattern is the dimension of $S_a$.
Definition 11.7. Two linear systems $S$, $\tilde S$ of the form (7.1) are algebraically equivalent if there exists a nonsingular continuously differentiable matrix $T(t)$ such that

$$A_{\tilde S}(t) = T(t)A_S(t)T(t)^{-1} + \dot T(t)T(t)^{-1} \tag{11.7}$$

for all $t$. An obvious result is
Proposition 11.5. Weighting patterns are invariant under algebraic equivalence.
Lemma 11.6. Points of time from which a system (7.1) is controllable (or observable) or at which a system is reachable (or determinable) are invariant under algebraic equivalence.
Proof: (for controllability only. The rest follows analogously). Under algebraic equivalence we have the correspondence

$$C(t, t+\mu(t)) \to T(t)C(t, t+\mu(t))T'(t)$$

and so rank $C(t, t+\mu(t)) = n$ implies rank $T(t)C(t, t+\mu(t))T'(t) = n$.
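Proposition 11.5 can be checked numerically in the simplest setting. The sketch below assumes a constant transformation $T$ (so that $\dot T = 0$ in (11.7)) and assumes that the input and output matrices transform as $TG$ and $HT^{-1}$, the usual convention for a change of state coordinates $\hat x = Tx$. The double-integrator system and all function names are hypothetical.

```python
# Hypothetical check: the weighting pattern is unchanged by a constant change of coordinates.

def matmul(x, y):
    """Product of two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def transition(a, s):
    """exp(a s) for a 2 x 2 matrix with a @ a = 0, where the series stops at I + a s."""
    return [[(1.0 if i == j else 0.0) + a[i][j] * s for j in range(2)] for i in range(2)]


def weighting_pattern(a, g, h, t, tau):
    """W(t, tau) = H exp(A (t - tau)) G for a constant system."""
    return matmul(matmul(h, transition(a, t - tau)), g)[0][0]


# Original system (assumed example): double integrator.
A = [[0.0, 1.0], [0.0, 0.0]]
G = [[0.0], [1.0]]
H = [[1.0, 0.0]]

# Constant nonsingular T and its inverse (assumed example).
T = [[1.0, 1.0], [0.0, 2.0]]
T_INV = [[1.0, -0.5], [0.0, 0.5]]

# Equivalent system: (11.7) with dT/dt = 0, plus the assumed relations for G and H.
A_HAT = matmul(matmul(T, A), T_INV)
G_HAT = matmul(T, G)
H_HAT = matmul(H, T_INV)

# A_HAT is again nilpotent, so transition() remains exact for it.
still_nilpotent = matmul(A_HAT, A_HAT) == [[0.0, 0.0], [0.0, 0.0]]

samples = [(0.0, 0.0), (2.0, 0.5), (-1.0, 3.0)]
max_diff = max(abs(weighting_pattern(A, G, H, t, tau)
                   - weighting_pattern(A_HAT, G_HAT, H_HAT, t, tau))
               for t, tau in samples)
```

Both systems give $W(t,\tau) = t - \tau$ at every sample point, although their coefficient matrices differ.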
The following result was first stated but not proved in [16] and [21]. A proof was subsequently published by Youla [26]. The following proof combines that of Youla with one given by Kalman in unpublished notes.
Theorem 11.7. Any two minimal realizations of a given globally reduced weighting pattern are algebraically equivalent.
Proof: It is clear from the proof of Lemma 11.1 that any minimal realization is algebraically equivalent to one with the coefficient matrix $A(\cdot) \equiv 0$ (take $T(t) = \Phi(\tau_1,t)$ in (11.7)). Hence it suffices to prove that any two minimal realizations $\{0, \Psi_1(\cdot), \Theta_1(\cdot)\}$ and $\{0, \Psi_2(\cdot), \Theta_2(\cdot)\}$ of a given globally reduced weighting pattern $W(t,\tau)$ are algebraically equivalent. To do this, first note that

$$W(t,\tau) = \Psi_1(t)\Theta_1(\tau) = \Psi_2(t)\Theta_2(\tau) \tag{11.8}$$

where the columns of the $\Psi_i(\cdot)$, $i = 1,2$, and the rows of the $\Theta_i(\cdot)$, $i = 1,2$, are linearly independent on $(-\infty,\infty)$. It then follows that there exist finite intervals $J_i$, $K_i$, $i = 1,2$, on which the aforementioned columns and rows are linearly independent respectively. [for if not,
then on any interval $[-k,k]$, $k = 1, 2, \ldots$, there exists a constant vector $\xi_k$ of unit norm such that $\Psi_1(t)\xi_k = 0$ almost everywhere on $[-k,k]$. The sequence $\{\xi_k\}$ has a convergent subsequence $\{\xi_{k_i}\}$ such that $\lim_{i\to\infty}\xi_{k_i} = \xi$ and $\Psi_1(t)\xi = 0$ a.e. in $(-\infty,\infty)$, thus contradicting the linear independence of the columns of $\Psi_1(\cdot)$.]
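Hence the matrices

$$M_i = \int_{J_i} \Psi_i'(t)\Psi_i(t)\,dt, \quad i = 1,2$$

and

$$N_i = \int_{K_i} \Theta_i(t)\Theta_i'(t)\,dt, \quad i = 1,2$$

are nonsingular. Hence, from (11.8) we can write that

$$\Theta_2(\tau) = U\Theta_1(\tau) \tag{11.9}$$

$$\Psi_2(t) = \Psi_1(t)V \tag{11.10}$$

where $U = M_2^{-1}\int_{J_2}\Psi_2'(t)\Psi_1(t)\,dt$ and $V = \left[\int_{K_2}\Theta_1(\tau)\Theta_2'(\tau)\,d\tau\right]N_2^{-1}$ are constant matrices. Now, multiplying both sides of (11.10) on the left by $\Psi_2'(t)$, integrating over $J_2$, and multiplying on the left again by $M_2^{-1}$ we get

$$I_n = UV$$

But from (11.8), (11.9), (11.10), we have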
$$\Psi_1(t)\Theta_1(\tau) = \Psi_2(t)\Theta_2(\tau) = \Psi_1(t)VU\Theta_1(\tau)$$

from which it follows (by the linear independence of the columns of $\Psi_1$ and of the rows of $\Theta_1$) that $I_n = VU$ so that $U = V^{-1}$ and therefore $\{0, \Psi_1(\cdot), \Theta_1(\cdot)\}$ is algebraically equivalent to $\{0, \Psi_2(\cdot), \Theta_2(\cdot)\}$ which proves the theorem.
As a direct consequence of Theorem 11.7 and Lemma 11.6 we have the statement that all minimal realizations of a given globally reduced weighting pattern have essentially the same behavior with respect to the properties of controllability, reachability, determinability, and observability. We can, of course, go even further as indicated below.
Theorem 11.8. Given a globally reduced weighting pattern (11.1), there exist finite values of time $t'$, $t''$, such that all minimal realizations of (11.1) are controllable (or observable) from all $t < t'$ and are reachable (or determinable) at all $t > t''$.
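Before the proof, the construction of $U$ and $V$ used for Theorem 11.7 can be followed numerically. The sketch below is a hypothetical example: the two factorizations of $W(t,\tau) = t - \tau$, the intervals $J_2 = K_2 = [0,1]$, the explicit formulas used for $U$ and $V$ (obtained by multiplying (11.8) by $\Psi_2'$ or $\Theta_2'$ and integrating) and the Simpson quadrature are all assumptions chosen for illustration.

```python
# Two assumed minimal factorizations of W(t, tau) = t - tau (both with A = 0).

def psi1(t):
    return [1.0, t]            # 1 x 2 row


def theta1(s):
    return [-s, 1.0]           # 2 x 1 column


def psi2(t):
    return [1.0, 1.0 + t]


def theta2(s):
    return [-s - 1.0, 1.0]


def outer(c, r):
    """Column c times row r, as a 2 x 2 matrix."""
    return [[c[i] * r[j] for j in range(2)] for i in range(2)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def simpson(f, a, b, n=20):
    """Composite Simpson rule for a 2 x 2 matrix-valued f on [a, b]; n must be even."""
    h = (b - a) / n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        value = f(a + k * h)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * value[i][j]
    return [[total[i][j] * h / 3 for j in range(2)] for i in range(2)]


# Gram matrices over the assumed intervals J_2 = K_2 = [0, 1].
M2 = simpson(lambda t: outer(psi2(t), psi2(t)), 0.0, 1.0)
N2 = simpson(lambda s: outer(theta2(s), theta2(s)), 0.0, 1.0)

# Constant matrices relating the two factorizations.
U = matmul(inv2(M2), simpson(lambda t: outer(psi2(t), psi1(t)), 0.0, 1.0))
V = matmul(simpson(lambda s: outer(theta1(s), theta2(s)), 0.0, 1.0), inv2(N2))
UV = matmul(U, V)
```

In this example `U` comes out as the inverse of `V`, and the constant matrix `U` plays the role of the transformation $T$ linking the two realizations.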
Proof:
Let
T
A realization of a causal (anticausal) impulse response $W_c(t,\tau)$ ($W_a(t,\tau)$) is a system (7.1) whose causal (anticausal) impulse response is $W_c(t,\tau)$ ($W_a(t,\tau)$).
The following Corollary of Lemma 11.1 is obvious.
Corollary 11.10. An $r \times p$ matrix $W_c(t,\tau)$ is a causal impulse response for an $n$-dimensional system (7.1) if and only if there exists an $r \times n$ matrix function $\Psi(\cdot)$, and an $n \times p$ matrix function $\Theta(\cdot)$, both defined and continuous for all time such that

$$W_c(t,\tau) = \begin{cases} \Psi(t)\Theta(\tau), & t \ge \tau \\ 0, & t < \tau \end{cases} \tag{11.13}$$
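The form (11.13) is easy to evaluate directly. The sketch below is illustrative only: the factors `psi` and `theta` (a factorization of $t - \tau$) and the function names are hypothetical choices. It returns $\Psi(t)\Theta(\tau)$ for $t \ge \tau$ and zero otherwise, in contrast with the weighting pattern, which is defined for all $t$, $\tau$.

```python
# Hypothetical scalar example (r = p = 1, n = 2): factors of W(t, tau) = t - tau.

def psi(t):
    """1 x 2 row factor (assumed example)."""
    return [1.0, t]


def theta(tau):
    """2 x 1 column factor (assumed example)."""
    return [-tau, 1.0]


def weighting_pattern(t, tau):
    """Psi(t) Theta(tau), with no restriction on the order of t and tau."""
    return sum(p * q for p, q in zip(psi(t), theta(tau)))


def causal_impulse_response(t, tau):
    """Same product for t >= tau, and zero for t < tau."""
    if t < tau:
        return 0.0
    return weighting_pattern(t, tau)


forward = causal_impulse_response(3.0, 1.0)    # t >= tau: agrees with the weighting pattern
backward = causal_impulse_response(1.0, 3.0)   # t < tau: identically zero
```

At $(t,\tau) = (1,3)$ the weighting pattern equals $-2$ while the causal impulse response vanishes, which is the only place where the two functions differ.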
In similar fashion, the anticausal impulse response must have the form
$$W_a(t,\tau) = \begin{cases} \Psi(t)\Theta(\tau), & t \le \tau \\ 0, & t > \tau \end{cases} \tag{11.14}$$

Definition 11.10.
T >
(-"',T)
are linearly independent over Definition 11 .11.
> r
the rows of
,
[i;,"')
linearly independent over
(-"', i;)
while the col umns o f
i;