Controllability and Observability (C.I.M.E. Summer Schools, 46)

Author: E. Evangelisti


E. Evangelisti (Ed.)

Controllability and Observability Lectures given at a Summer School of the Centro Internazionale Matematico Estivo (C.I.M.E.), held in Pontecchio (Bologna), Italy, July 1-9, 1968

C.I.M.E. Foundation c/o Dipartimento di Matematica “U. Dini” Viale Morgagni n. 67/a 50134 Firenze Italy [email protected]

ISBN 978-3-642-11062-7 e-ISBN: 978-3-642-11063-4 DOI:10.1007/978-3-642-11063-4 Springer Heidelberg Dordrecht London New York

© Springer-Verlag Berlin Heidelberg 2010. Reprint of the 1st ed. C.I.M.E., Ed. Cremonese, Roma 1969. With kind permission of C.I.M.E.

Printed on acid-free paper

Springer.com

CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
1st Cycle - Sasso Marconi, July 1-9, 1968

CONTROLLABILITY AND OBSERVABILITY
Coordinator: Prof. G. EVANGELISTI

R. E. KALMAN: Lectures on Controllability and Observability
R. KULIKOWSKI: Controllability and Optimum Control
A. STRASZAK: Supervisory Controllability
L. WEISS: Lectures on Controllability and Observability

CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

LECTURES ON CONTROLLABILITY AND OBSERVABILITY

R. E. KALMAN (Stanford University)

Course held at Sasso Marconi (Bologna), July 1-9, 1968

TABLE OF CONTENTS

0. Introduction.
1. Classical and modern dynamical systems.
2. Standardization of definitions and "classical" results.
3. Definition of states via Nerode equivalence classes.
4. Modules induced by linear input/output maps.
5. Cyclicity and related questions.
6. Transfer functions.
7. Abstract construction of realizations.
8. Construction of realizations.
9. Theory of partial realizations.
10. General theory of observability.
11. Historical notes.
12. References.

INTRODUCTION

The theory of controllability and observability has been developed, one might almost say reluctantly, in response to problems generated by technological science, especially in areas related to control, communication, and computers. It seems that the first conscious steps to formalize these matters as a separate area of (system-theoretic or mathematical) research were undertaken only as late as 1959, by KALMAN [1960b-c]. There have been, however, many scattered results before this time (see Section 12 for some historical comments and references), and one might confidently assert today that some of the main results have been discovered, more or less independently, in every country which has reached an advanced stage of "development", and it is certain that these same results will be rediscovered again in still more places as other countries progress on the road to development.

With the perspective afforded by ten years of happenings in this field, we ought not hesitate to make some guesses of the significance of what has been accomplished. I see two main trends:

(i) The use of the concepts of controllability and observability to study nonclassical questions in optimal control and optimal estimation theory, sometimes as basic hypotheses securing existence, more often as seemingly technical conditions which allow a sharper statement of results or shorter proofs.

(ii) Interaction between the concepts of controllability and observability and the study of structure of dynamical systems, such as: formulation and solution of the problem of realization, canonical forms, decomposition of systems.

The first of these topics is older and has been studied primarily from the point of view of analysis, although the basic lemma (2.7) is purely algebraic. The second group of topics may be viewed as "blowing up" the ideas inherent in the basic lemma (2.7), resulting in a more and more strictly algebraic point of view.

There is active research in both areas. In the first, attention has shifted from the case of systems governed by finite-dimensional linear differential equations with constant coefficients (where success was quick and total) to systems governed by infinite-dimensional linear differential equations (delay differential equations, classical types of partial differential equations, etc.), to finite-dimensional linear differential equations with time-dependent coefficients, and finally to all sorts and subsorts of nonlinear differential equations. The first two topics are surveyed concurrently by WEISS [1969] while MARKUS [1965] looks at the nonlinear situation.

My own current interest lies in the second stream, and these lectures will deal primarily with it, after a rather hurried overview of the general problem and of the "classical" results.

Let us take a quick look at the most important of these "classical" results. For convenience I shall describe them in system-theoretic (rather than conventional pure mathematical) language. The mathematically trained reader should have no difficulty in converting them into his preferred framework, by digging a little into the references.

In area (i), the most important results are probably those which give more or less explicit and computable results for controllability and observability of certain specific classes of systems. Beyond these, there seem to be two main theorems:

THEOREM A. A real, continuous-time, n-dimensional, constant, linear dynamical system $\Sigma$ has the property "every set of $n$ eigenvalues may be produced by suitable state feedback" if and only if $\Sigma$ is completely controllable.

The central special case is treated in great detail by KALMAN, FALB, and ARBIB [1969, Chapter 2, Theorem 5.10]; for a proof of the general case with background comments, refer to WONHAM [1967]. As a particular case, we have that every system satisfying the hypotheses of the theorem can be "stabilized" (made to have eigenvalues with negative real parts) via a suitable choice of feedback. This result is the "existence theorem" for algorithms used to construct control systems for the past three decades, and yet a conscious formulation of the problem and its mathematical solution go back to about 1963! (See Theorem D below.) The analogous problem for nonconstant linear systems (governed by linear differential equations with variable coefficients) is still not solved.

THEOREM B. ("Duality Principle") Every problem of controllability in a real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system is equivalent to an observability problem in a dual system.

This fact was first observed by KALMAN [1960a] in the solution of the optimal stochastic filtering problem for discrete-time systems, and was soon applied to several problems in system theory by KALMAN [1960b-c]. See also many related comments by KALMAN, FALB, and ARBIB [1969, Chapters 2 and 6]. As a theorem, this principle is not yet known to be valid outside the linear area, but as an intuitive prescription it has been rather useful in guiding system-theoretic research. The problems involved here are those of formulation rather than proof. The basic difficulties seem to point toward algebra and in particular category theory. System-theoretic duality, like the categoric one, is concerned with "reversing arrows". See Section 10 for a modern discussion of these points and a precise version of Theorem B.

Partly as a result of the questions raised by Theorem B and partly because of the algebraic techniques needed to prove Theorem A and related lemmas, attention in the early 1960's shifted toward certain problems of a structural nature which were, somewhat surprisingly at first, found to be related to controllability and observability. The main theorems again seem to be two:

THEOREM C. (Canonical Decomposition)

Every real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system may be canonically decomposed into four parts, of which only one part, that which is completely controllable and completely observable, is involved in the input/output behavior of the system.

The proof given by KALMAN [1962] applies to nonconstant systems only under the severe restriction that the dimensions of the subspaces of all controllable and of all unobservable states are constant on the whole real line. The result represented by Theorem C is far from definitive, however, since finite-dimensional linear, nonconstant systems admit at least four different canonical decompositions: it is possible and fruitful to dualize the notions of controllability and observability, thereby arriving at four properties, presently called reachability and controllability as well as constructibility* and observability. (See Section 2 for definitions.) Any combination of a property from the first list with a property from the second list gives a canonical decomposition result analogous to Theorem C. The complexity of the situation was first revealed by WEISS and KALMAN [1965]; this paper contributed to a revival of interest (with hopes of success) in the special problems of nonconstant linear systems. Recent progress is surveyed by WEISS [1969].

*WEISS [1969] uses "determinability" instead of constructibility. The new terminology used in these lectures is not yet entirely standard.

Intimately related to the canonical structure theorem, and in fact necessary to fully clarify the phrase "involved in the input/output behavior of the system", is the last basic result:

THEOREM D.

(Uniqueness of Minimal Realization) Given the impulse-response matrix $W$ of a real, continuous-time, finite-dimensional, linear dynamical system, there exists a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma_W$ which

(a) realizes $W$: that is, the impulse-response matrix of $\Sigma_W$ is equal to $W$;

(b) has minimal dimension in the class of linear systems satisfying (a);

(c) is completely controllable and completely observable;

(d) is uniquely determined (modulo the choice of a basis at each $t$ for its state space) by requirement (a) together with (b) or, independently, by (a) together with (c).

In short, for any $W$ as described above, there is an "essentially unique" $\Sigma_W$ of the same "type" which satisfies (a) through (c).

COROLLARY 1. If $W$ comes from a constant system, there is a constant $\Sigma_W$ which satisfies (a) through (c), and is uniquely determined by (a) + (b) or (a) + (c) (modulo a fixed choice of basis for its state space).

COROLLARY 2. All claims of Corollary 1 continue to hold if "impulse-response matrix of a constant, finite-dimensional system" is replaced by "transfer function matrix of a constant, finite-dimensional system".

The first general discussion of the situation with an equivalent statement of Theorem D is due to KALMAN [1963b, Theorems 7 and 8]. (This paper does not include complete proofs, or even an explicit statement of Corollaries 1 and 2, although they are implied by the general algorithm given in Section 7. An edited version of the original unpublished proof of Theorem D is given in KALMAN, FALB, and ARBIB [1969, Chapter 10, Appendix C].)

These results are of great importance in engineering system theory since they relate methods based on the Laplace transform (using the transfer function of the system) and the time-domain methods based on input/output data (the matrix $W$) to the state-variable (dynamical system) methods developed in 1955-1960. In fact, by Corollary 1 it follows that the two methods must yield identical results; for instance, starting with a constant impulse-response matrix $W$, property (c) implies that the existence of a stable control law is always assured by virtue of Theorem A. Thus it is only after the development represented by Theorems A-D that a rigorous justification is obtained for the intuitive design methods used in control engineering.

As with Theorem C, certain formulational difficulties arise in connection with a precise definition of a "nonconstant linear dynamical system". Thus, it seems preferable at present to replace in Theorem D "impulse-response matrix $W$" (or "abstract input/output map $W$") by "weighting pattern $W$" and "complete controllability" by "complete reachability". The definitive form of the 1963 theorem evolved through the works of WEISS and KALMAN [1965], YOULA [1966], and KALMAN; a precise formulation and modernized proof of Theorem D in the weighting pattern case was given recently by KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13]. A completely general discussion of what is meant by a "minimal realization" of a nonconstant impulse-response matrix involves many technical complications due to the fact that such a minimal realization does not exist in the class of linear differential equations with "nice" coefficient functions. For the current status of this problem, consult especially DESOER and VARAIYA [1967], SILVERMAN and MEADOWS [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13] and WEISS [1969].

From the standpoint of the present lectures, by far the most interesting consequence of Theorem D is its influence, via efforts to arrive at a definitive proof of Corollary 1, on the development of the algebraic stream of system theory. The first proof of this important result (in the special case of distinct eigenvalues) is that of GILBERT [1963]. Immediately afterwards, a general proof was given by KALMAN [1963b, Section 7]. This proof, strictly computational and linear algebraic in nature, yields no theoretical insight although it is useful as the basis of a computer algorithm.

Using the classical theory of invariant factors, KALMAN [1965a] succeeded in showing that the solution of the minimal realization problem can be effectively reduced to the classical invariant-factor algorithm. This result is of great theoretical interest since it strongly suggests the now standard module-theoretic approach, but it does not lead to a simple proof of Corollary 1 and is not a practical method of computation.

The best known proof of Corollary 1 was obtained in 1965 by B. L. Ho, with the aid of a remarkable algorithm, which is equally important from a theoretical and computational viewpoint. The early formulation of the algorithm was described by HO and KALMAN [1966], with later refinements discussed in HO and KALMAN [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 11] and KALMAN [1969c]. Almost simultaneously with the work of B. L. Ho, the basic results were discovered independently also by YOULA and TISSI [1966] and by SILVERMAN [1966]. The subject goes back to the 19th century and centers around the theory of Hankel matrices; however, many of the results just referenced seem to be fundamentally new. This field is currently in a very active stage of development. We shall discuss the essential ideas involved in Sections 8-9. Many other topics, especially Silverman's generalization of the algorithm to nonconstant systems, unfortunately cannot be covered due to lack of time.

Acknowledgment

It is a pleasure to thank C.I.M.E. and its organizers, especially Professors E. Bompiani, E. Sarti, and E. Belardinelli, for arranging a special conference on these topics. The sunny skies and hospitality of Italy, along with Bolognese food, played a subsidiary but vital part in the success of this important gathering of scientists.

1. CLASSICAL AND MODERN DYNAMICAL SYSTEMS

In mathematics the term dynamical system (synonyms: topological dynamics, flows, abstract dynamics, etc.) usually connotes the action of a one-parameter group $T$ (the reals) on a set $X$, where $X$ is at least a topological space (more often, a differentiable manifold) and the action is at least continuous. This setup is physically motivated, but in a very old-fashioned sense. A "dynamical system" as just defined is an idealization, generalization, and abstraction of Newton's world view of the Solar System as described via a finite set of nonlinear ordinary differential equations. These equations represent the positions and momenta of the planets regarded as point masses and are completely determined by the laws of gravitation, i.e., they do not contain any terms to account for "external" forces that may act on the system.

Interesting as this notion of a dynamical system may be (and is!) in pure mathematics, it is much too limited for the study of those dynamical systems which are of contemporary interest. There are at least three different ways in which the classical concept must be generalized:

(i) The time set of the system is not necessarily restricted to the reals;

(ii) A state $x \in X$ of the system is not merely acted upon by the "passage of time" but also by inputs which are or could be manipulated to bring about a desired type of behavior;

(iii) The states of the system cannot, in general, be observed. Rather, the physical behavior of the system is manifested through its outputs which are many-to-one functions of the state.

The generalization of the time set is of minor interest to us here. The notions of input and output, however, are exceedingly fundamental; in fact, controllability is related to the input and observability to the output. With respect to dynamical systems in the classical sense, neither controllability nor observability is a meaningful concept.

A much more detailed discussion of dynamical systems in the modern sense, together with rather detailed precise definitions, will be found in KALMAN, FALB, and ARBIB [1969, Chapter 1]. From here on, we will use the term "dynamical system" exclusively in the modern sense (we have already done so in the Introduction).

The following symbols will have a fixed meaning throughout the paper:

(1.1)
$T$ = time set,
$U$ = set of input values,
$X$ = state set,
$Y$ = set of output values,
$\Omega$ = input functions,
$\varphi$ = transition map,
$\eta$ = readout map.

The following assumptions will always apply (otherwise the sets above are arbitrary):

$T$ = an ordered subset of the reals $\mathbb{R}$,

$\Omega$ = class of functions $T \to U$ such that

(i) each function $\omega \in \Omega$ is undefined outside some finite interval $J_\omega \subset T$ dependent on $\omega$;

(ii) if $J_\omega \cap J_{\omega'} = \emptyset$, there is a function $\omega'' \in \Omega$ which agrees with $\omega$ on $J_\omega$ and with $\omega'$ on $J_{\omega'}$.

For most purposes later, $T$ will be equal to $\mathbb{Z}$ = (ordered) abelian group of integers; $U$, $X$, $Y$, $\Omega$ will be linear spaces; "undefined" can be replaced by "equal to 0"; and "functions undefined outside a finite interval" will mean the same as "finite sequences".

The most general notion of a dynamical system for our present needs is given by the following

(1.3) DEFINITION. A dynamical system $\Sigma$ is a composite object consisting of the maps $\varphi$, $\eta$ defined on the sets $T$, $U$, $\Omega$, $X$, $Y$:

$$\varphi\colon T \times T \times X \times \Omega \to X, \quad (t; \tau, x, \omega) \mapsto \varphi(t; \tau, x, \omega), \quad \text{undefined whenever } \tau > t;$$

$$\eta\colon T \times X \to Y, \quad (t, x) \mapsto \eta(t, x).$$

The transition map $\varphi$ satisfies the following assumptions:

(1.4) $\varphi(t; t, x, \omega) = x$,

(1.5) $\varphi(t; \tau, x, \omega) = \varphi(t; s, \varphi(s; \tau, x, \omega), \omega)$,

(1.6) if $\omega = \omega'$ on $[\tau, t)$, then $\varphi(s; \tau, x, \omega) = \varphi(s; \tau, x, \omega')$ for all $s \in [\tau, t)$.
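To make the axioms concrete, here is a minimal sketch of a transition map for a scalar discrete-time system $x(k+1) = Fx(k) + G\omega(k)$, with the three assumptions (1.4)-(1.6) checked on sample data. The function name `phi`, the coefficients $F = 2$, $G = 1$, and the representation of an input as a dictionary (keys are the finitely many times at which the input is defined, and an undefined value is read as 0) are illustrative assumptions, not part of the text.

```python
def phi(t, tau, x, omega, F=2, G=1):
    """State at time t of x(k+1) = F*x(k) + G*omega(k), started in x at time tau."""
    if t < tau:
        raise ValueError("phi is undefined for t < tau")
    for k in range(tau, t):
        x = F * x + G * omega.get(k, 0)  # an undefined input value is read as 0
    return x

omega = {0: 1, 1: -3, 2: 5}
omega2 = {0: 1, 1: -3, 2: 5, 3: 7}   # agrees with omega on [0, 3)

# (1.4): no time elapsed, no change of state
check_14 = phi(2, 2, 5, omega) == 5

# (1.5): going from 0 to 3 directly or through any intermediate time s
check_15 = all(
    phi(3, 0, 1, omega) == phi(3, s, phi(s, 0, 1, omega), omega) for s in range(0, 4)
)

# (1.6): inputs agreeing on [0, 3) give the same states up to time 3
check_16 = all(phi(s, 0, 1, omega) == phi(s, 0, 1, omega2) for s in range(0, 4))
```

The two inputs differ at time 3, so the states they produce first differ at time 4; this is the sense in which only the restriction of the input to $[\tau, t)$ matters for the state at time $t$ in this example.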

The definition of a dynamical system on this level of generality should be regarded only as a scaffolding for the terminology; interesting mathematics begins only after further hypotheses are made. For instance, it is usually necessary to endow the sets $T$, $U$, $\Omega$, $X$, and $Y$ with a topology and then require that $\varphi$ and $\eta$ be continuous.

(1.7) EXAMPLE. The classical setup in topological dynamics may be deduced from our Definition (1.3) in the following way. Let $T = \mathbb{R}$ = reals, regarded as an abelian group under the usual addition and having the usual topology; let $\Omega$ consist only of the nowhere-defined function; let $X$ be a topological space; disregard $Y$ and $\eta$ entirely; define $\varphi$ for all $t, \tau \in T$ and write it as

$$\varphi(t; \tau, x, \omega) = x \cdot (t - \tau),$$

that is, a function of $x$ and $t - \tau$ alone. Check (1.4-5); in the new notation they become

$$x \cdot 0 = x \quad \text{and} \quad x \cdot (s + t) = (x \cdot s) \cdot t.$$

Finally, require that the map $(x, t) \mapsto x \cdot t$ be continuous.

(1.8) INTERPRETATION.

The essential idea of Definition (1.3) is that it axiomatizes the notion of state. A dynamical system is informally a rule for state transitions (the function $\varphi$), together with suitable means of expressing the effect of the input on the state and the effect of the state on the output (the function $\eta$). The map $\varphi$ is verbalized as follows: "an input $\omega$, applied to the system $\Sigma$ in state $x$ at time $\tau$ produces the state $\varphi(t; \tau, x, \omega)$ at time $t$." The peculiar definition of an input function $\omega$ is used here mainly for technical convenience; by (1.6) only equivalence classes of inputs agreeing over $[\tau, t)$ enter into the determination of $\varphi(t; \tau, x, \omega)$. "$\omega$ not defined at $t$" means no input acts on $\Sigma$ at time $t$.

The pair $(\tau, x) \in T \times X$ will be called an event of a dynamical system $\Sigma$.

In the sequel, we shall be concerned primarily with systems which are finite-dimensional, linear, and continuous-time or discrete-time. Often these systems will be also real and constant (= stationary or time-invariant). We leave the precise definition of these terms in the context of Definition (1.3) to the reader (consult KALMAN, FALB, and ARBIB [1969, Chapter 1] as needed) and proceed to make some ad hoc definitions without detailed explanation.

The following conventions will remain in force throughout the lectures whenever the linear case is discussed:

(1.9) Continuous-time. $T = \mathbb{R}$, $U = \mathbb{R}^m$, $X = \mathbb{R}^n$, $Y = \mathbb{R}^p$, $\Omega$ = all continuous functions $\mathbb{R} \to \mathbb{R}^m$ which vanish outside a finite interval.

(1.10) Discrete-time. $T = \mathbb{Z}$, $K$ = fixed field (arbitrary), $U = K^m$, $X = K^n$, $Y = K^p$, $\Omega$ = all functions $\mathbb{Z} \to K^m$ which are zero for all but a finite number of their arguments.

Now we have, finally,

(1.11) DEFINITION. A real, continuous-time, n-dimensional, linear dynamical system $\Sigma = (F(\cdot), G(\cdot), H(\cdot))$ is a triple of continuous matrix functions of time where

$F(\cdot)\colon \mathbb{R} \to \{n \times n \text{ matrices over } \mathbb{R}\}$,

$G(\cdot)\colon \mathbb{R} \to \{n \times m \text{ matrices over } \mathbb{R}\}$,

$H(\cdot)\colon \mathbb{R} \to \{p \times n \text{ matrices over } \mathbb{R}\}$.

These maps determine the equations of motion of $\Sigma$ in the following manner:

(1.12) $$dx/dt = F(t)x + G(t)\omega(t), \qquad y(t) = H(t)x(t),$$

where $t \in \mathbb{R}$, $x \in \mathbb{R}^n$, $\omega(t) \in \mathbb{R}^m$, and $y(t) \in \mathbb{R}^p$.

To check that (1.12) indeed makes $\Sigma$ into a well-defined dynamical system in the sense of Definition (1.3), it is necessary to recall the basic facts about finite systems of ordinary linear differential equations with continuous coefficients. Define the map

$$\Phi_F(t, \tau)\colon \mathbb{R} \times \mathbb{R} \to \{n \times n \text{ matrices over } \mathbb{R}\}$$

to be the family of $n \times n$ matrix solutions of the linear differential equation

$$dx/dt = F(t)x, \qquad x \in \mathbb{R}^n,$$

subject to the initial condition $\Phi_F(\tau, \tau) = I$ = unit matrix, $\tau \in \mathbb{R}$. Then $\Phi_F$ is of class $C^1$ in both arguments. It is called the transition matrix of (the system $\Sigma$ whose "infinitesimal" transition matrix is) $F(\cdot)$. From this standard result we get easily also the fact that the transition map of $\Sigma$ is explicitly given by

$$\varphi(t; \tau, x, \omega) = \Phi_F(t, \tau)x + \int_\tau^t \Phi_F(t, s)G(s)\omega(s)\,ds.$$
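The transition matrix can be approximated numerically by integrating $d\Phi/ds = F(s)\Phi$ from the unit matrix. The sketch below does this with a classical fourth-order Runge-Kutta scheme in plain Python; the scheme, the step count, the helper names, and the sample $F(t)$ are assumptions chosen for illustration and do not come from the lectures.

```python
import math

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def mat_axpy(A, B, h):
    """Return A + h*B entrywise."""
    return [[a + h * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transition_matrix(F, t, tau, steps=2000):
    """Approximate Phi_F(t, tau) by RK4 on dPhi/ds = F(s) Phi, Phi(tau) = I."""
    n = len(F(tau))
    Phi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (t - tau) / steps
    s = tau
    for _ in range(steps):
        k1 = mat_mul(F(s), Phi)
        k2 = mat_mul(F(s + h / 2), mat_axpy(Phi, k1, h / 2))
        k3 = mat_mul(F(s + h / 2), mat_axpy(Phi, k2, h / 2))
        k4 = mat_mul(F(s + h), mat_axpy(Phi, k3, h))
        for i in range(n):
            for j in range(n):
                Phi[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
        s += h
    return Phi

def F_tv(t):
    """Hypothetical time-varying example; here Phi(t, tau) = [[1, (t^2 - tau^2)/2], [0, 1]]."""
    return [[0.0, t], [0.0, 0.0]]

P20 = transition_matrix(F_tv, 2.0, 0.0)
P21 = transition_matrix(F_tv, 2.0, 1.0)
P10 = transition_matrix(F_tv, 1.0, 0.0)
composed = mat_mul(P21, P10)   # should agree with P20 up to rounding
```

The comparison of `composed` with `P20` is the matrix form of the composition property (1.5) with $\omega$ absent: passing from time 0 to time 2 directly or through time 1 gives the same transition.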

[...]

The original proof of (b) is in KALMAN [1960b]; both cases are treated in detail in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section 2].

Note that if $G(\cdot)$ is identically zero on $(-\infty, \tau)$ we cannot have reachability, and if $G(\cdot)$ is identically zero on $(\tau, +\infty)$ we cannot have controllability.

For a constant system, the integrals above depend only on the difference of the limits; hence, in particular

[...]

So we have

(2.4) PROPOSITION. In a real, continuous-time, finite-dimensional, linear, constant dynamical system an event $(\tau, x)$ is reachable for all $\tau$ if and only if it is reachable for one $\tau$; an event is reachable if and only if it is controllable.

From (2.3) one can obtain in a straightforward fashion also the following much stronger result:

(2.5) THEOREM. In a real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ a state $x$ is reachable (or, equivalently, controllable) at any $\tau \in \mathbb{R}$ if and only if

$$x \in \operatorname{span}(G, FG, \ldots) \subset \mathbb{R}^n;$$

if this condition is satisfied, we can choose $s = \tau - \delta$, $t = \tau + \delta$, with $\delta > 0$ arbitrary. (The span of a sequence of matrices is to be interpreted as the vector space generated by the columns of these matrices.)

A proof of (2.5) may be found in KALMAN, HO, and NARENDRA [1963] and in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section 3]. A trivial but noteworthy consequence is the fact that the definition of reachable states of $\Sigma$ is "coordinate-free":

(2.6) COROLLARY. The set of reachable (or controllable) states of $\Sigma$ in Theorem (2.5) is a subspace of the real vector space $X_\Sigma$, the state space of $\Sigma$.

Very often the attention to individual states is unnecessary and therefore many authors prefer to use the terminology "$\Sigma$ is completely reachable at $\tau$" for "every event $(\tau, x)$, $\tau$ = fixed, $x \in X_\Sigma$, is reachable", or "$\Sigma$ is completely reachable" for "every event in $\Sigma$ is reachable", etc. Thus (2.5), together with the Cayley-Hamilton theorem, implies the

(2.7) BASIC LEMMA. A real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ is completely reachable if and only if

(2.8) $$\operatorname{rank}(G, FG, \ldots, F^{n-1}G) = n.$$

Condition (2.8) is very well-known; it or equivalent forms of it have been discovered, explicitly used, or implicitly assumed by many authors. A trivially equivalent form of (2.7) is given by

COROLLARY 1. A constant system $\Sigma = (F, G, -)$ is completely reachable if and only if the smallest $F$-invariant subspace of $X_\Sigma$ containing (all column vectors of) $G$ is $X_\Sigma$ itself.
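The span condition of Theorem (2.5) and the rank condition (2.8) are both easy to test by exact elimination. The sketch below is one way to do so under stated assumptions: the helper names `rank`, `reach_columns`, `is_reachable_state`, and `is_completely_reachable` are hypothetical, the example matrices are made up, and the truncation of the span at $F^{n-1}G$ relies on the Cayley-Hamilton theorem as in the passage above.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    M = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(M[0]) if M else 0):
        piv = next((i for i in range(rk, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for i in range(len(M)):
            if i != rk and M[i][col] != 0:
                f = M[i][col] / M[rk][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[rk])]
        rk += 1
    return rk

def reach_columns(F, G):
    """Columns of (G, FG, ..., F^(n-1) G), returned as a list of vectors."""
    n = len(F)
    out, block = [], [list(c) for c in zip(*G)]   # start with the columns of G
    for _ in range(n):
        out += block
        block = [[sum(F[i][j] * v[j] for j in range(n)) for i in range(n)] for v in block]
    return out

def is_reachable_state(F, G, x):
    """True if x lies in span(G, FG, ..., F^(n-1) G)."""
    cols = reach_columns(F, G)
    return rank(cols + [list(x)]) == rank(cols)

def is_completely_reachable(F, G):
    """Rank condition (2.8)."""
    return rank(reach_columns(F, G)) == len(F)

F = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]
G_one = [[0], [1], [0]]            # single input: third state is never excited
G_two = [[0, 0], [1, 0], [0, 1]]   # a second input acting on the third state
```

With `G_one` the reachable states form the plane $x_3 = 0$, so `is_reachable_state(F, G_one, [3, -1, 0])` holds while `[0, 0, 1]` is not reachable; with `G_two` the rank condition (2.8) is met.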

A useful variant of the last fact is given by

(2.10) COROLLARY 2. (W. Hahn) A constant system $\Sigma = (F, G, -)$ is completely reachable if and only if there is no nonzero eigenvector of $F'$ (the transpose of $F$) which is orthogonal to (every column vector of) $G$.

Finally, let us note that, far from being a technical condition, (2.5) has a direct system-theoretic interpretation, as follows:

(2.11) PROPOSITION. The state space $X_\Sigma$ of a real, continuous-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ may be written as a direct sum

$$X_\Sigma = X_1 \oplus X_2,$$

which induces a decomposition of the equations of motion as (obvious notations)

$$dx_1/dt = F_{11}x_1 + F_{12}x_2 + G_1\omega(t),$$
$$dx_2/dt = F_{22}x_2.$$

The subsystem $\Sigma_1 = (F_{11}, G_1, -)$ is completely reachable. Hence a state $x = (x_1, x_2) \in X_\Sigma$ is reachable if and only if $x_2 = 0$.

PROOF. We define $X_1$ to be the set of reachable states of $\Sigma$; by (2.5) this is an $F$-invariant subspace of $X_\Sigma$. Hence, by finite-dimensionality, $X_1$ is a direct summand in $X_\Sigma$. By construction, every state in $X_1$ is reachable, and (every column vector of) $G$ belongs to $X_1$. The $F$-invariance of $X_1$ implies that $F_{21} = 0$, which implies the asserted form of the equations of motion. $\square$

(2.13) REMARK. Note that $X_2$ is not intrinsically defined (it depends on an arbitrary choice in completing the direct sum). Hence to say that "$(0, x_2)$ is an unreachable (or uncontrollable) state if $x_2 \neq 0$" is an abuse of language. More precisely: the set of all reachable (or controllable) states has the structure of a vector space, but the set of all unreachable (or uncontrollable) states does not have such structure. This fact is important to bear in mind for the algebraic development which follows after this section and also in the definition of observability and constructibility below. In general, the direct sum cannot be chosen in such a way that $F_{12} = 0$.

While condition (2.8) has been frequently used as a technical requirement in the solution of various optimal control problems in the late 1950's, it was only in 1959-60 that the relation between (2.8) and system-theoretic questions was clarified by KALMAN [1960b-c] via Definition (2.2) and Propositions (2.5) and (2.11). (See Section 11 for further details.) In other words, without the preceding discussion the use of (2.8) may appear to be artificial, but in fact it is not, at least in problems in which control enters, because, by (2.12), control problems stated for $\Sigma$ are nontrivial only with respect to the intrinsic subspace $X_1$.

The hypothesis "constant" is by no means essential for Proposition (2.11), but we must forego further comments here.

For later purposes, we state some facts here for discrete-time, constant linear systems analogous to those already developed for their continuous-time counterparts. The proofs are straightforward and therefore omitted (or given later, for illustrative purposes).

(2.14) PROPOSITION. A state $x$ of a real, discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ is reachable if and only if

$$x \in \operatorname{span}(G, FG, \ldots, F^{n-1}G).$$

Thus such a system is completely reachable if and only if (2.8) holds.

(2.16) PROPOSITION. A state $x$ of the system $\Sigma$ described in Proposition (2.14) is controllable if and only if

$$x \in \operatorname{span}(F^{-1}G, \ldots, F^{-n}G),$$

where $F^{-1}G = \{x\colon Fx = \text{column vector of } G\}$.

(2.18) PROPOSITION. In a real, discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ a reachable state is always controllable and the converse is always true whenever $\det F \neq 0$.
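The role of the condition $\det F \neq 0$ in Proposition (2.18) can be seen on a small example. The sketch below simulates $x(k+1) = Fx(k) + Gu(k)$ for a nilpotent $F$ and a single scalar input; the matrices, the function names, and the finite set of test inputs are assumptions made for illustration. The state $(0, 1)$ is driven to zero by the zero input, so it is controllable, yet every state reached from the origin has second component 0, because the second rows of $F$ and $G$ vanish.

```python
from itertools import product

def step(F, G, x, u):
    """One step of x(k+1) = F x(k) + G u(k) for a single scalar input u."""
    n = len(x)
    return [sum(F[i][j] * x[j] for j in range(n)) + G[i] * u for i in range(n)]

def run(F, G, x, inputs):
    """Apply the input sequence to the system started in state x."""
    for u in inputs:
        x = step(F, G, x, u)
    return x

F = [[0, 1], [0, 0]]   # det F = 0 (F is nilpotent)
G = [1, 0]

# Controllable: the zero input takes (0, 1) to the origin in two steps.
final_state = run(F, G, [0, 1], [0, 0])

# Not reachable: starting from the origin, the second component stays 0
# (checked here on a finite sample of input sequences).
second_components = {
    run(F, G, [0, 0], list(seq))[1] for seq in product((-1, 0, 1), repeat=3)
}
```

This agrees with Proposition (2.14): here $\operatorname{span}(G, FG)$ is the line spanned by $(1, 0)$, which does not contain $(0, 1)$.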

Note also that Proposition (2.11) and its proof continue to be correct, without any modification, when "continuous-time" is replaced by "discrete-time".

Now we turn to a discussion of observability. The original definition of observability by KALMAN [1960b, Definition (5.23)] was concocted in such a way as to take advantage of vector-space duality. The conceptual problems surrounding duality are easy to handle in the linear case but are still by no means fully understood in the nonlinear case (see Section 10). In order to get at the main facts quickly, we shall consider

here only the linear case and even then we shall use the underlying idea of vector-space duality in a rather ad hoc fashion. The reader wishing to do so can easily turn our remarks into a strictly dual treatment of facts (2.1)-(2.12) with the aid of the setup introduced in Section 10.

DEFINITION. An event $(\tau, x)$ in a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), -, H(\cdot))$ is unobservable iff

$$H(s)\Phi_F(s, \tau)x = 0 \quad \text{for all } s \in \mathbb{R},\ s \geq \tau.$$

DEFINITION. With respect to the same system, an event $(\tau, x)$ is unconstructible* iff

$$H(\sigma)\Phi_F(\sigma, \tau)x = 0 \quad \text{for all } \sigma \in \mathbb{R},\ \sigma \leq \tau.$$

*In the older literature, starting with KALMAN [1960b, Definition (5.23)], it is this concept which is called "observability". By hindsight, the present choice of words seems to be more natural to the writer.

-

31 -

R. E. Kalman

The motivation for the first definition is obvious: the "occurrence" of an unobservable event cannot be detected by looking at the output of the system after time $\tau$. (The definition subsumes $\omega = 0$, but this is no loss of generality because of linearity.) The motivation for the second definition is less obvious but is in fact strongly suggested by statistical filtering theory (see Section 10). In any case, Definition (2.21) complements Definition (2.20) in exactly the same way as Definition (2.1) complements Definition (2.2).

From these definitions, it is very easy to deduce the following criteria:

(2.21) PROPOSITION. In a real, continuous-time, finite-dimensional, linear dynamical system $\Sigma = (F(\cdot), -, H(\cdot))$ an event $(\tau, x)$ is

(a) unobservable if and only if $x \in \operatorname{kernel} M(\tau, t)$ for all $t \in \mathbf{R}$, $t > \tau$, where

$M(\tau, t)$

(b)

unconstructible if and only if for all

s

E~,

s

x € kernel

W( T,

t)

T

and observable # for some t >

x € range

M( T,

t)

T.

From these relations we can easily deduce the so-called "duality rules"; that is, problems involving observability (or constructibility) are converted into problems involving reachability (or controllability) in a suitably defined dual system. See KALMAN, FALB, and ARBIB [1969, Chapter 2, Proposition (6.12)] and the broader discussion in Section 10.

We will say, by slight abuse of language, that a system is completely observable whenever 0 is the only unobservable state. Thus the Basic Lemma (2.7) "dualizes" to the

PROPOSITION. A real, continuous-time or discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, -, H)$

*All this would be strictly correct if we agreed to replace "direct sum" in Proposition (2.11) and its counterpart (2.25) by "orthogonal direct sum"; but this would be an arbitrary convention which, while convenient, has no natural system-theoretic justification. Reread Remark (2.13).


is completely observable if and only if

(2.24) $$\operatorname{rank}\,\bigl(H',\ F'H',\ \ldots,\ (F')^{n-1}H'\bigr) = n.$$

By duality, complete constructibility in a continuous-time system is equivalent to complete observability; in a discrete-time system this is not true in general but it is true when $\det F \neq 0$.
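The rank condition (2.24) is easy to test mechanically. The sketch below is an illustration only, written in Python with exact rational arithmetic; the function names are hypothetical. It stacks the rows $H, HF, \ldots, HF^{n-1}$, which is the transpose of the matrix in (2.24) and therefore has the same rank.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(rows):
    """Rank by exact Gaussian elimination over Q."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def observability_rows(F, H):
    """Rows of H, HF, ..., HF^(n-1): the transpose of (H', F'H', ..., (F')^(n-1) H')."""
    block, rows = H, []
    for _ in range(len(F)):
        rows.extend(block)
        block = mat_mul(block, F)
    return rows

def is_completely_observable(F, H):
    return rank(observability_rows(F, H)) == len(F)

F = [[0, 1], [0, 0]]
observable_case = is_completely_observable(F, [[1, 0]])    # rows (1, 0), (0, 1)
unobservable_case = is_completely_observable(F, [[0, 1]])  # rows (0, 1), (0, 0)
```

Working with the transpose is a deliberate choice here: it makes the duality with the reachability test explicit, since the same `rank` routine applied to the columns of $(G, FG, \ldots)$ gives criterion (2.8).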

It is easy to see also that (2.11) "dualizes" to:

PROPOSITION. The state space $X_\Sigma$ of a real, continuous-time or discrete-time, n-dimensional, linear, constant dynamical system $\Sigma = (F, -, H)$ may be written as a direct sum

$$X_\Sigma = X_1 \oplus X_2$$

and the equations of $\Sigma$ are decomposed correspondingly as

$$\begin{aligned} dx_1/dt &= F_{11}x_1, \\ dx_2/dt &= F_{21}x_1 + F_{22}x_2, \\ y(t) &= H_2 x_2(t). \end{aligned}$$

PROOF. Proceed dually to the proof of Proposition (2.11), beginning with the definition of $X_1$ as the set of all unobservable states of $\Sigma$. □

Combining Propositions (2.11) and (2.25) gives Theorem C as in KALMAN [1962].

This completes our survey of the "classical" results related to reachability, controllability, observability, and constructibility. The remaining lectures will be concerned exclusively with discrete-time systems. The main motivation for the succeeding developments will be the algebraic criteria (2.8) and (2.24) as well as a deeper examination of Theorems C and D of the Introduction.


3. DEFINITION OF STATES VIA NERODE EQUIVALENCE CLASSES

A classical dynamical system is essentially the action of the time set $T$ (= reals) on the states $X$. In other words, the states are acted on by an abelian group, namely ($\mathbf{R}$, + = usual definition of addition). This is a trivial fact, but it has deep consequences. A (modern) dynamical system is the action of the inputs $\Omega$ on $X$; in exact analogy with the classical case, to the abelian structure on $T$ there corresponds an (associative but noncommutative) semigroup structure on $\Omega$. The idea that $\Omega$ always admits such a structure was apparently overlooked until the late 1950's when it became fashionable in automata theory (school of SCHÜTZENBERGER). This seems to be the "right" way of translating the intuitive notion of dynamics into mathematics, and it will be fundamental in our succeeding investigations.

It is convenient to assume from now on, until the end of these lectures, that

$$T = \text{time set} = \mathbf{Z} = \text{additive (ordered) group of integers}.$$

Since we shall be only interested in constant systems from here on, we shall adopt the following normalization convention:*

*In the discrete-time nonconstant case, we would have to deal with $\mathbf{Z}$ copies of $\Omega$, each normalized with respect to a different particular value of $\tau \in \mathbf{Z}$.


No element of $\Omega$ is defined for $t > 0$.

In view of (3.2), we can define the "length" $|\omega|$ of $\omega$ by

$$|\omega| = \max\{-t \in \mathbf{Z} : \omega \text{ is not defined for any } s < t\}.$$

Before defining the semigroup on $\Omega$ we introduce another fundamental notion of dynamics: the (left) shift operator $\sigma_\Omega$, defined for all $q \geq 0$ in $\mathbf{Z}$ by

$$\sigma_\Omega^q : \Omega \to \Omega : \omega \mapsto \sigma_\Omega^q\,\omega : t \mapsto \omega(t + q).$$

Note that the definition of $\sigma_\Omega$ is compatible with the normalization (3.2).

If $J_\omega \cap J_{\omega'} = \text{empty}$ for $\omega, \omega' \in \Omega$, we define the join $\omega \vee \omega'$ of $\omega$ and $\omega'$ as the function

(3.4) $$\omega \vee \omega' = \begin{cases} \omega & \text{on } J_\omega, \\ \omega' & \text{on } J_{\omega'}. \end{cases}$$

When $\Omega$ has an additive structure, then we replace $\omega \vee \omega'$ by $\omega + \omega'$.

DEFINITION. There is an associative operation $\circ : \Omega \times \Omega \to \Omega$, called concatenation, defined by

$$\circ : (\omega, \nu) \mapsto \sigma_\Omega^{|\nu|}\,\omega \vee \nu.$$

Note that, by (3.2) through (3.4), $\circ$ is well defined. Note also that the asserted existence of concatenation rests on the fact that $\Omega$ is made up of functions defined over finite intervals in $T$. We might express the content of (3.5) also as: $\Omega$ is a semigroup with valuation, since evidently $|\omega \circ \nu| = |\omega| + |\nu|$.
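The shift, join, and concatenation operations can be made concrete with a small model. The following Python sketch is an illustration under stated assumptions, not the text's own construction: an input is modelled as a dictionary from integer times to values, defined on a finite interval ending at $t = 0$, and its length is taken to be the number of instants at which it is defined. All names are hypothetical.

```python
def make_input(values):
    """Input defined on times -(k-1), ..., 0, with the last value applied at t = 0."""
    k = len(values)
    return {t: v for t, v in zip(range(-k + 1, 1), values)}

def length(omega):
    # Assumption of this sketch: length = number of instants where omega is defined.
    return len(omega)

def shift(omega, q):
    """Left shift by q >= 0: (shift(omega, q))(t) = omega(t + q)."""
    if q < 0:
        raise ValueError("the shift is only defined for q >= 0")
    return {t - q: v for t, v in omega.items()}

def join(omega, nu):
    """Join of two inputs with disjoint domains."""
    if set(omega) & set(nu):
        raise ValueError("domains must be disjoint")
    return {**omega, **nu}

def concat(omega, nu):
    """Concatenation: shift omega left by |nu|, then join with nu."""
    return join(shift(omega, length(nu)), nu)

a, b, c = make_input([1, 2]), make_input([3]), make_input([4, 5])
ab = concat(a, b)
```

In this model `concat(a, b)` applies `a` first and `b` last, the result is again normalized to end at $t = 0$, and the valuation property $|\omega \circ \nu| = |\omega| + |\nu|$ holds by construction.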


In view of (3.5), it is natural to use an abbreviated notation* also for the transition function, as follows:

(3.6) $$x \circ \omega = \varphi(0;\ -|\omega|,\ x,\ \omega).$$

Now we come to an important nonclassical concept in dynamical systems, whose evolution was strongly influenced by problems in communications and automata theory: a discrete-time constant input/output map

$$f : \Omega \to Y : \omega \mapsto f(\omega) = y(1).$$

We interpret this map as follows: $y(1)$ is the output of some system $\Sigma$ (say, a digital computer) when $\Sigma$ is subjected to the (finite) input sequence $\omega$, assuming that $\Sigma$ is in some fixed initial equilibrium state before the application of $\omega$. This definition automatically incorporates the notions of "discrete-time" as well as "causal" or "dynamics" (the latter because $y(t)$ is not defined for $t < 1$). However, (3.7) does not clearly imply "constancy" (implicitly, however, this is clear from the normalization assumption (3.2) on $\Omega$). To make the definition more forceful, we extend $f$ to the map

(3.8) $$\bar f : \Omega \to \Gamma = Y \times Y \times \cdots \ \text{(infinite cartesian product)} : \omega \mapsto (y(1), y(2), \ldots).$$

Interpretation: $\bar f$ gives the output sequence $y = (y(1), y(2), \ldots)$ of the system $\Sigma$ after $t = 0$ resulting from the application of an

*Observe that $x \circ \omega$ is the strict analog of the notation $xt$ customary in topological dynamics. The action of $\omega$ on $x$ satisfies $x \circ (\omega \circ \nu) = (x \circ \omega) \circ \nu$ in view of (1.5).


input $\omega$ which stops at $t = 0$. This definition expresses causality more forcefully and incorporates constancy, provided we define the (left) shift operator $\sigma_\Gamma$ on $\Gamma$ so as to be compatible with (3.3). So, for any $\tau \geq 0$, $\tau \in \mathbf{Z}$, let

$$\sigma_\Gamma^\tau : (y(1), y(2), \ldots) \mapsto (y(\tau + 1), y(\tau + 2), \ldots), \qquad \text{that is, } t \mapsto y(t + \tau).$$

Note: the operator $\sigma_\Omega$ "appends" an undefined term at 0, the operator $\sigma_\Gamma$ "discards" the term $y(1)$.

Now, dropping the bar over $f$, we adopt

(3.10) DEFINITION. A discrete-time, constant input/output map (of some underlying dynamical system $\Sigma$) is any map $f$ such that the following diagram

is commutative. $f$ is linear iff it is a $K$-vector space homomorphism.

It will be convenient to regard (3.10) as the external definition of a dynamical system, in contrast to the internal definition set up in Section 1. Intuitively, we should think of $f$ as a highly idealized kind of experimental data; namely, $f$ incorporates all possible information that could be gained by subjecting the underlying system to experiments in which only input/output data is available. This point of view is related to experimental physics the same way as the classical notion of a dynamical system is related to Newtonian (axiomatic) physics.

The basic question which motivates much of what will follow can now be formulated as follows:

PROBLEM OF REALIZATION. Given only the knowledge of $f$ (but of course also of $\Omega$ and $\Gamma$) how can we discover, in a mathematically consistent, rigorous, and natural way, the properties of the system $\Sigma$ which is supposed to underlie the given input/output map $f$?

This suggests immediately the following fundamental concept:

DEFINITION. A fixed dynamical system $\Sigma_0$ (internal definition, as in Section 1) is a realization of a fixed input/output map $f_0$ iff $f_0 = f_{\Sigma_0}$, that is, $f_0$ is identical with the input/output map of $\Sigma_0$.

In view of the notations of Section 1 plus the special convention (3.6), the explicit form of the realization condition is

simply that f (m) o

for all m

l1.

~ ( 0

k

fez -oi)

in Z.

The proof of Theorem (4.2) is now complete, since the last lemma identifies $X_f$ as defined by (3.15) with the quotient module $\Omega/\operatorname{kernel} f$. We write elements of the latter as $[\omega]_f = \omega + \operatorname{kernel} f$; it is clear that since $\Omega$ itself is generated by $e_1, \ldots, e_m$ (see (4.6)), $X_f$ as a $K[z]$-module is generated by $[e_1]_f, \ldots, [e_m]_f$. Note also that the scalar product in $\Omega/\operatorname{kernel} f$ is

(4.10)

The last product above (that in $\Omega$) has already been defined in (4.5). The reader should verify directly that (4.10) gives a well-defined scalar product.


(4.11) REMARK. There is a strict duality in the setup used to define $X_f$. From the point of view of homological algebra [MAC LANE 1963], this duality looks as follows. Since every free module is projective, the natural map

$$\Omega \to X_f : \omega \mapsto [\omega]_f$$

exhibits $X_f$ as the image of a projective module. On the other hand, there is a bijection between the set $X_f$ and the set $\Xi_f = f(\Omega)$. $\Xi_f$ is clearly a $K[z]$-submodule of $\Gamma$ (with $z \cdot f(\omega) = f(z \cdot \omega)$), and so $X_f$ and $\Xi_f$ are isomorphic also as $K[z]$-modules. It is known that $\Gamma$ is an injective module [MAC LANE 1963, page 95, Exercise 2]. So the natural map

$$X_f \to \Xi_f : [\omega]_f \mapsto f(\omega)$$

exhibits $X_f$ as a submodule of an injective module. This fact is basic in the construction of the "transfer function" associated with $f$ (Section 7), but its full implications are not yet understood at present.

There is an easy counterpart of Theorem (4.2) which concerns a dynamical system given in "internal" form:

(4.12) PROPOSITION. The state set $X_\Sigma$ of every discrete-time, finite-dimensional, linear, constant dynamical system $\Sigma = (F, G, -)$ admits the structure of a $K[z]$-module.

PROOF. By definition (see (1.10)), $X_\Sigma = K^n$ is already a $K$-vector space. We make it into a $K[z]$-module by defining

(4.13) $$\cdot : K[z] \times K^n \to K^n : (\pi, x) \mapsto \pi(F)x.$$ □

(4.14) COMMENT. The construction used in the proof of (4.12) is the classical trick of studying the properties of a fixed linear map $F : K^n \to K^n$ via the $K[z]$-module structure that $F$ induces on $K^n$ by (4.13).

In view of the canonical construction of $\Sigma_f$ provided by Proposition (3.16), the state set $X$ can be treated as a $K[z]$-module irrespective as to whether $X$ is constructed from $f$ ($X = X_f$) or given a priori as part of the specification of $\Sigma$ ($X = X_\Sigma$). Thus the $K[z]$-module structure on $X$ is a nice way of uniting the "external" and the "internal" definitions of a dynamical system. Henceforth we shall talk about a (discrete-time, linear, constant dynamical) system $\Sigma$ somewhat imprecisely via properties of its associated $K[z]$-module $X_\Sigma$.
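The module structure (4.13) amounts to evaluating a polynomial at the matrix $F$ and applying the result to a vector. As a minimal sketch (Python, exact rational arithmetic, hypothetical names; polynomials are represented here as coefficient lists $[c_0, c_1, \ldots]$ for $c_0 + c_1 z + \cdots$, which is a convention of this example only):

```python
from fractions import Fraction

def mat_vec(F, x):
    return [sum(a * b for a, b in zip(row, x)) for row in F]

def act(pi, F, x):
    """Scalar product pi . x = pi(F) x of (4.13), with pi = [c0, c1, ...]."""
    result = [Fraction(0)] * len(x)
    power = [Fraction(v) for v in x]      # holds F^k x, starting with k = 0
    for c in pi:
        result = [r + c * p for r, p in zip(result, power)]
        power = mat_vec(F, power)
    return result

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

F = [[0, 1], [-2, -3]]
x = [1, 0]
pi, sigma = [1, 1], [0, 2, 1]             # 1 + z and 2z + z^2

z_times_x = act([0, 1], F, x)             # the polynomial z acts as F
lhs = act(poly_mul(pi, sigma), F, x)      # (pi sigma) . x
rhs = act(pi, F, act(sigma, F, x))        # pi . (sigma . x)
```

The equality of `lhs` and `rhs` is the module axiom $(\pi\sigma) \cdot x = \pi \cdot (\sigma \cdot x)$, and `z_times_x` shows that multiplication by $z$ is just the map $x \mapsto Fx$.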

We shall now give some examples of using module-theoretic language to express standard facts encountered before.

(4.15) PROPOSITION. If $X$ is the state-module of $\Sigma$, the map $F_\Sigma$ is given by $X \to X : x \mapsto z \cdot x$.

PROOF. This is obvious from (4.13) if $X = X_\Sigma$. If $X = X_f$, then we find that, by (1.17),

$$x(1) = Fx(0) + G\omega(0) = F[\xi]_f + G\omega(0);$$

since $x(0)$ results from input $\xi$, $x(1)$ results from input $z \cdot \xi + \omega(0)$ and we get

$$x(1) = [z \cdot \xi + \omega(0)]_f = z \cdot [\xi]_f + [\omega(0)]_f = z \cdot [\xi]_f + G\omega(0).$$

So the assertion is again verified. □

Now we can replace Proposition (2.14) by the much more elegant

(4.16) PROPOSITION. A system $\Sigma = (F, G, -)$ is completely reachable if and only if the columns of $G$ generate $X_\Sigma$.

PROOF. The claim is that complete reachability is equivalent to the fact that every element $x \in X_\Sigma$ is expressible as

$$x = \sum_{j=1}^{m} \pi_j \cdot g_j, \qquad \pi_j \in K[z], \quad G = [g_1, \ldots, g_m].$$

In view of (4.15), this is the same as requiring that $x$ be expressible as

$$x = \sum_{j=1}^{m} \pi_j(F)\,g_j;$$

this last condition is equivalent to complete reachability by (2.14). □

(4.17) COROLLARY. The reachable states of $\Sigma$ are precisely those of the submodule of $X_\Sigma$ generated by (the columns of) $G$.

(4.18) REMARK. The statement that "$\Sigma$ is not completely reachable" simply means that $X_\Sigma$ is not generated by those vectors which make up the matrix $G$ in the specification of the input side of the system $\Sigma$.


It does not follow that $X$ cannot be finitely generated by some other vectors. In fact, to avoid unnecessary generality, we shall henceforth assume that $X$ is always finitely generated over $K[z]$. From the system-theoretic point of view, the case when we need infinitely many generators, that is, infinitely many input channels, seems rather bizarre at present.

PROPOSITION. The system $X_f$ is completely reachable.

PROOF. Obvious from the notation: a state $x = [\xi]_f$ is reached by $\xi \in \Omega$. □

(4.20) PROPOSITION. The system $X_f$ is completely observable.

PROOF. Obvious from Lemma (h) above: $f(\omega) = 0$ iff $\omega \in [0]_f$, which says that the only unobservable state of $X_f$ is $0 \in X_f$. □

Let us generalize the last result to obtain a module-theoretic criterion for complete observability. There are two technically different ways of doing this. The first depends on the observation that the "dual" of a submodule (see Corollary (4.17)) is a quotient module. The second defines observability via the "dual" system $(F', H', -)$ associated with $(F, -, H)$.

Consider a dynamical system $\Sigma = (F, -, H)$ and the corresponding $K[z]$-module $X_\Sigma$ and $K$-homomorphism $H : X_\Sigma \to Y = K^p$. We can extend $H$ to a $K[z]$-homomorphism $\bar H : X_\Sigma \to \Gamma$ (look back at (3.8)) by setting

$$\bar H : x \mapsto (Hx,\ H(z \cdot x),\ H(z^2 \cdot x),\ \ldots).$$

From Definition (2.19) we see that no nonzero element of the quotient module $X_\Sigma/\operatorname{kernel} \bar H$ is unobservable. Hence, by abuse of language, we can say that $X_\Sigma/\operatorname{kernel} \bar H$ is the module of observable states of $\Sigma$.

Thus we arrive at phrasing the counterparts of (4.16-17) in the following language:

(4.21) PROPOSITION. A system $\Sigma = (F, -, H)$ is completely observable if and only if the quotient module $X_\Sigma/\operatorname{kernel} \bar H$ is isomorphic with $X_\Sigma$.

(4.22) COROLLARY. The observable states of $\Sigma$ are to be identified with the elements of the quotient module $X_\Sigma/\operatorname{kernel} \bar H$.

(4.23) TERMINOLOGY. The preceding considerations suggest viewing a system $\Sigma$ as essentially the same "thing" as a module $X$. Strictly speaking, however, knowing $\Sigma = (F, G, H)$ gives us not only $X_\Sigma = X_F$ (see (4.13)) but also a quotient module (over kernel $\bar H$) of a submodule (that generated by $G$) of $X_\Sigma$, that is $K[z]G/\operatorname{kernel} \bar H$. If $X_\Sigma \cong K[z]G/\operatorname{kernel} \bar H$ we say that $X_\Sigma$ is canonical (relative to the given $G$, $H$).

To be more precise, let us observe the following stronger version of (4.19-20):


(4.24) CORRESPONDENCE THEOREM. There is a bijective correspondence between $K[z]$-homomorphisms $f : \Omega \to \Gamma$ and the equivalence class of completely reachable and completely observable systems $\Sigma$ modulo a basis change in $X_\Sigma$.

Detailed discussion of this result is postponed until Section 7.

A stricter observation of the "duality principle" leads to

(4.25) DEFINITION. The $K$-linear dual of $\Sigma = (F, G, H)$ is $\Sigma^* = (F', H', G')$ ($'$ = matrix transposition). The states of $\Sigma^*$ are called costates of $\Sigma$.

The following fact is an immediate consequence of this definition:

(4.26) PROPOSITION. The state set $X_{\Sigma^*}$ of $\Sigma^*$ may be given the structure of a $K[z^{-1}]$-module, as follows: (i) as a vector space $X_{\Sigma^*}$ is the dual of $X_\Sigma$ regarded as a $K$-vector space, (ii) the scalar product in $X_{\Sigma^*}$ is defined by

$$(z^{-1} \cdot x^*)(x) = x^*(Fx).$$

(4.26A) REMARK. We cannot define $X_{\Sigma^*}$ as $\operatorname{Hom}_{K[z]}(X_\Sigma, K[z])$, equal to the $K[z]$-linear dual of $X_\Sigma$, because every torsion module $M$ over an integral domain $D$ has a trivial $D$-dual. However, the reader can verify (using the ideas to be developed in Section 6) that $X_{\Sigma^*}$ defined above is isomorphic with $\operatorname{Hom}_{K[z]}(X_\Sigma, K(z)/K[z])$. See BOURBAKI [Algèbre, Chapter 7 (2e éd.), Section 4, No. 8].


Now we verify easily the following dual statements of (4.16-17):

(4.27) PROPOSITION. A system $\Sigma = (F, -, H)$ is completely observable if and only if $H'$ generates $X_{\Sigma^*}$.

(4.28) COROLLARY. The observable costates of $\Sigma^*$ are precisely the reachable states of $\Sigma^*$, that is, those of the submodule of $X_{\Sigma^*}$ generated by $H'$.

We have eliminated the abuse of language incurred by talking about "observable states" through introduction of the new notion of "observable costates". The full explication of why this is necessary (as well as natural) is postponed until Section 10.

The preceding simple facts depend only on the notion of a module and are immediate once we recognize the fact that $F$ may be eliminated from statements such as (2.8) by passing to the module induced by $F$ via (4.13). But module theory yields many other, less obvious results as well, which derive mainly from the fact that $K[z]$ is a principal-ideal domain.

We recall: an element $m$ of an $R$-module $M$ ($R$ = arbitrary commutative ring) has torsion iff there is a nonzero $r \in R$ such that $r \cdot m = 0$. If this is not the case, $m$ is free. Similarly, $M$ is said to be a torsion module iff every element of $M$ has torsion. $M$ is a free module if no nonzero element has torsion. If $L \subset M$ is any subset of $M$, the annihilator $A_L$ of $L$ is the set

$$A_L = \{r : r \cdot l = 0 \text{ for all } l \in L\};$$

it follows immediately that $A_L$ is an ideal in $R$. Note also that the statement that "$M$ is a torsion module" does not imply in general that $A_M$ is nontrivial, that is, $A_M \neq 0$. (Counterexample: take an $M$ which is not finitely generated.)

Coupling these notions with the special fact that, for us, $R = K[z]$, we get a number of interesting system-theoretic results:

(4.29) PROPOSITION. $\Sigma$ is finite-dimensional if and only if $X_\Sigma$ is a torsion $K[z]$-module.

COROLLARY. If $X_\Sigma$ is free, $\Sigma$ is infinite-dimensional.

PROOF. We recall that "$\Sigma$ = finite-dimensional" is defined to be "$X_\Sigma$ = finite-dimensional as a $K$-vector space". See (1.18).

Sufficiency. By assumption $X_\Sigma$ is finitely generated by, say, $q$ nonzero elements $x_1, \ldots, x_q$ of $X_\Sigma$ (which are not necessarily the columns of $G$). Since $K[z]$ is a principal-ideal domain, each of the $A_{x_j}$ is a principal ideal, say, $\psi_j K[z]$ with $\psi_j \in K[z]$. If $X_\Sigma$ is a torsion module, then $\deg \psi_j = n_j > 0$ for all $j = 1, \ldots, q$. For otherwise $\psi_j$ is either zero (and then $x_j$ is free, which is a contradiction) or a unit, which implies $x_j = 0$ contrary to assumption. Hence we can replace each expression $\pi_j \cdot x_j$ ($\pi_j \in K[z]$) by the simpler one $\tilde\pi_j \cdot x_j$ with $\tilde\pi_j \equiv \pi_j \pmod{\psi_j}$ and $\deg \tilde\pi_j < n_j$, which shows that $X_\Sigma$, as a $K$-module, is generated by the finite set $\{z^k \cdot x_j : 0 \leq k < n_j,\ j = 1, \ldots, q\}$.

Necessity. Let $\psi_F$ be the minimal polynomial of the map $F : x \mapsto z \cdot x$. If $X_\Sigma$ is finite-dimensional as a $K$-module, $\deg \psi_F > 0$. This means (by the usual definition of the minimal polynomial in matrix theory or more generally in linear algebra) that $\psi_F$ annihilates every $x \in X_\Sigma$ so that $X_\Sigma$ is a torsion $K[z]$-module. □
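The "necessity" half of the argument can be illustrated numerically. The sketch below is only an illustration in Python with exact rationals and hypothetical names. Instead of the minimal polynomial $\psi_F$ it uses the characteristic polynomial, which by the Cayley-Hamilton theorem is a multiple of $\psi_F$ and hence also annihilates every state. The coefficients are computed with the Faddeev-LeVerrier recursion, a technique chosen for this example and not taken from the text; it divides by $1, \ldots, n$ and so assumes a field of characteristic zero.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def char_poly(F):
    """[alpha_1, ..., alpha_n] with det(zI - F) = z^n + alpha_1 z^(n-1) + ... + alpha_n."""
    n = len(F)
    F = [[Fraction(v) for v in row] for row in F]
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M, alphas = identity, []
    for k in range(1, n + 1):
        FM = mat_mul(F, M)
        alpha = -sum(FM[i][i] for i in range(n)) / k
        alphas.append(alpha)
        M = [[FM[i][j] + alpha * identity[i][j] for j in range(n)] for i in range(n)]
    return alphas

def apply_monic(alphas, F, x):
    """chi(F) x for the monic polynomial with coefficients alphas (Horner's rule)."""
    y = [Fraction(v) for v in x]
    for alpha in alphas:
        Fy = [sum(a * b for a, b in zip(row, y)) for row in F]
        y = [fy + alpha * xi for fy, xi in zip(Fy, x)]
    return y

F = [[0, 1], [-2, -3]]
alphas = char_poly(F)                     # z^2 + 3z + 2
images = [apply_monic(alphas, F, x) for x in ([1, 0], [0, 1], [5, -7])]
```

Every entry of `images` is the zero vector: each state is annihilated by a nonzero polynomial, so the module induced by a finite matrix $F$ is a torsion module.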

Notice, from the second half of the proof, that the notion of a minimal polynomial can be extended from $K$-linear algebra to $K[z]$-modules. In fact, the same argument gives us also the well-known

(4.30) PROPOSITION. Every finitely generated torsion module $M$ over a principal-ideal domain $R$ has a nontrivial minimal polynomial $\psi_M$ given by $A_M = \psi_M R$.

COROLLARY. If a $K[z]$-module $X$ is finitely generated with $q$ generators and minimal polynomial $\psi_X$, then

$$\dim X \ (\text{as } K\text{-vector space}) \leq q \cdot \deg \psi_X.$$

REMARK. The fact that $\Sigma_f$ is completely reachable and is therefore generated by $m$ vectors allows us to estimate the dimension of $\Sigma_f$ by (4.31) knowing only $\deg \psi_{X_f}$ but without having computed $X_f$ itself. (Knowing $X_f$ explicitly means knowing $F : x \mapsto z \cdot x$, etc.) In other words, the module-theoretic setup considerably enhances the content of Proposition (3.16). Guided by these observations, we shall develop in Section 8 explicit algorithms for calculating $\dim \Sigma_f$ directly from $f$ without first having to compute $F$.

PROPOSITION. If $X_\Sigma$ is a free $K[z]$-module, no state of $\Sigma$ can be simultaneously reachable and controllable.

PROOF. We recall that "$X_\Sigma$ = free" means that $X_\Sigma$ is (isomorphic to) a finite sum of copies of $K[z]$. Suppose for simplicity that $X_\Sigma = K[z]$. Then $x$ = reachable means that $x = \xi \cdot 1$ for some $\xi \in K[z]$. Similarly, $x$ = controllable means that $z^{|\omega|} \cdot x + \omega \cdot 1 = 0$ for some $\omega \in K[z]$. Hence if $x$ has both properties, $1$ is annihilated by $\xi \circ \omega$ (the input $\xi$ followed by $\omega$), which contradicts the assumption that $X_\Sigma$ is free. □

The most important consequence of Theorem (4.2) is due to the fact that through it we can apply to linear dynamical systems the well-known

FUNDAMENTAL STRUCTURE THEOREM FOR FINITELY GENERATED MODULES OVER A PRINCIPAL IDEAL DOMAIN $R$ (Invariant Factor Theorem for Modules). Every such module $M$ with $m$ generators is isomorphic to

$$R/\psi_1 R \oplus \cdots \oplus R/\psi_q R \oplus R^s,$$

where the $R/\psi_i R$ are quotient rings of $R$ viewed as modules over $R$, the $\psi_i$ (called the invariant factors of $M$) are uniquely determined by $M$ up to units in $R$, $\psi_i \mid \psi_{i-1}$, $i = 2, \ldots, q$, and, as usual, $R^s$ denotes the free $R$-module with $s$ generators; finally, $q + s \leq m$.

Various proofs of this theorem are referenced in KALMAN, FALB, and ARBIB [1969, page 270], and one is given later in Section 6. Note: The divisibility conditions imply that $M$ is a torsion module iff $s = 0$ and then $\psi_M = \psi_1$.

One important consequence of this theorem (others in Section 7) is that it gives us the most general situation when $X_\Sigma$ is not a torsion module. For instance, combining (4.33) with (4.34), we get

PROPOSITION. A system cannot be simultaneously completely reachable and completely controllable if its $K[z]$-module $X$ has any $\infty$-dimensional components (i.e., $s > 0$ in (4.35)).

(4.37) REMARK. Although our entire development in this section may be regarded as a deep examination of Proposition (2.14), most of our comments apply equally well to (2.7), since both statements rest on the same algebraic condition (2.8). In fact, the only remaining thing to be "algebraized" is the notion of "continuous-time". We shall not do this here. Once this last step is taken, the algebraization of the Laplace transform (as related to ordinary linear differential equations) will be complete.


5. CYCLICITY AND RELATED QUESTIONS

We recall that an $R$-module $M$ ($R$ = arbitrary ring) is cyclic iff there is an element $m \in M$ such that $M = Rm$. [It would be better to say that such a module is monogenic: generated by one element $m$.] If $M$ is cyclic, the map $R \to M : r \mapsto r \cdot m$ is an epimorphism and has kernel $A_m$, the annihilating ideal of $m$. This plus the homomorphism theorem gives the well-known

PROPOSITION. Every cyclic $R$-module $M$ with generator $m$ is isomorphic with the quotient ring $R/A_m$ viewed as an $R$-module.

This result is much more interesting when, as in our case, $R$ is not only commutative and a principal-ideal domain, but specifically the polynomial ring $K[z]$. So let $X$ be a cyclic $K[z]$-module with generator $g$. Write $A_g = \psi_g K[z]$, where $\psi_g$ is the minimal or annihilating polynomial of $g$. By commutativity and cyclicity, $\psi_g$ is a minimal polynomial also for $X$. Hence $\psi_g = \psi_X = \psi$. In view of (5.1), $X \cong K[z]/\psi K[z]$.

Let us recall some features of the ring $K[z]/\psi K[z]$:

(i) Its elements are the residue classes of polynomials $\pi \pmod{\psi}$, $\pi \in K[z]$. Write these as $[\pi]$ or $[\pi]_\psi$. Multiplication is defined as $[\pi] \cdot [\sigma] = [\pi\sigma]$.

(ii) Each $[\pi]$ is either a unit or a divisor of zero. In fact, $[\pi]$ is a unit iff $(\pi, \psi)$ = greatest common divisor of $\pi$, $\psi$ is a unit in $K[z]$ (that is, $(\pi, \psi) \in K$). Then $\sigma\pi + \tau\psi = 1$ ($\sigma, \tau \in K[z]$) so that $[\sigma]$ is the inverse of $[\pi]$. On the other hand, if $(\pi, \psi) = \theta \neq$ unit in $K[z]$, then both $[\pi]$ and $[\psi/\theta]$ are zero divisors since $[\pi] \cdot [\psi/\theta] = [(\pi/\theta)\psi] = 0$.

(iii) If $\psi$ is a prime in $K[z]$ (that is, an irreducible polynomial with respect to coefficients over the ground field $K$), then by (ii) $K[z]/\psi K[z]$ is a field. This is a very standard construction in algebraic number theory.

Since it is awkward to compute with equivalence classes $[\pi]$, we shall often prefer to work with the standard representative of $[\pi]$, namely a polynomial $\tilde\pi$ of least degree in $[\pi]$. $\tilde\pi$ is uniquely determined by $[\pi]$ and the condition $\deg \tilde\pi < \deg \psi$. Henceforth $\tilde{\ }$ will always be used in this sense.

The next two assertions are immediate:

(5.2) PROPOSITION. $K[z]/\psi K[z]$ as a $K$-vector space is isomorphic to the $K$-vector space

$$K^{(n)}[z] = \{\pi \in K[z] : \deg \pi < n = \deg \psi\}.$$

$K[z]/\psi K[z]$ is also isomorphic to $K^{(n)}[z]$ as a $K[z]$-module, provided we define the scalar product in $K^{(n)}[z]$ by $(\pi, \rho) \mapsto \widetilde{\pi\rho}$.

PROPOSITION. If $X_\Sigma$ is cyclic with minimal polynomial $\psi$, then $\dim \Sigma = \deg \psi$.
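Feature (ii) is constructive: the extended Euclidean algorithm either produces the inverse of $[\pi]$ or exhibits a nonconstant common divisor. The following Python sketch over the rationals is a hedged illustration with hypothetical names; polynomials are coefficient lists $[c_0, c_1, \ldots]$, lowest degree first, with the empty list standing for 0, and this representation is an assumption of the example.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_divmod(p, d):
    """Quotient and remainder of p by a nonzero polynomial d."""
    p, d = [Fraction(c) for c in trim(p)], trim(d)
    q = [Fraction(0)] * max(len(p) - len(d) + 1, 0)
    while len(p) >= len(d):
        shift, c = len(p) - len(d), p[-1] / d[-1]
        q[shift] = c
        for i, dc in enumerate(d):
            p[i + shift] -= c * dc
        p = trim(p)
    return trim(q), p

def inverse_mod(pi, psi):
    """Standard representative of [pi]^(-1) in K[z]/psi K[z], or None for a zero divisor."""
    # Invariant of the loop: r_i is congruent to s_i * pi modulo psi.
    r0, s0 = trim([Fraction(c) for c in psi]), []
    r1, s1 = poly_divmod(pi, psi)[1], [Fraction(1)]
    while r1:
        q, r2 = poly_divmod(r0, r1)
        r0, s0, r1, s1 = r1, s1, r2, poly_add(s0, [-c for c in poly_mul(q, s1)])
    if len(r0) != 1:                      # gcd(pi, psi) is not a unit of K[z]
        return None
    return poly_divmod([c / r0[0] for c in s0], psi)[1]

psi = [1, 0, 1]                           # z^2 + 1, irreducible over Q
inv_z = inverse_mod([0, 1], psi)          # inverse of [z]
zero_divisor = inverse_mod([-1, 1], [-1, 0, 1])   # z - 1 modulo z^2 - 1
```

Here `inv_z` is the class of $-z$, since $z \cdot (-z) = -z^2 \equiv 1 \pmod{z^2 + 1}$, while $z - 1$ has no inverse modulo $z^2 - 1$ because it divides the modulus.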


Looking back at Theorem (4.34), we see that the most general $K[z]$-module is a direct sum of cyclic $K[z]$-modules. By combining (5.3) and (4.34) and using the fact that dimension is additive under direct summing, we can replace (4.31) by the following exact result:

PROPOSITION. If $X_\Sigma$ is a torsion module with invariant factors $\psi_1, \ldots, \psi_q$ then

$$\dim \Sigma = \deg \psi_1 + \cdots + \deg \psi_q.$$

A simple but highly useful consequence of cyclicity is the so-called control canonical form [KALMAN, FALB, and ARBIB, 1969, page 44] for a completely reachable pair $(F, g)$ where $g$ is an $n \times 1$ matrix. We shall now proceed to deduce this result.

Observe first that "$(F, g)$ completely reachable" is equivalent to "$g$ generates $X_F$, the module induced by $F$ via (4.13)." Let

$$\chi_F(z) = \det(zI - F) = z^n + \alpha_1 z^{n-1} + \cdots + \alpha_n, \qquad \alpha_i \in K;$$

then $\chi_F$ is the characteristic (and also the minimal) polynomial for $X_F$. [This is a well-known fact of module theory. See for example KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 7] for detailed discussion.] As in KALMAN [1962], consider the vectors


(5.5) $$\begin{aligned} e_n &= g = 1 \cdot g = \chi_F^{(1)}(z) \cdot g, \\ e_{n-1} &= z \cdot g + \alpha_1 g = \chi_F^{(2)}(z) \cdot g, \\ &\ \ \vdots \\ e_1 &= z^{n-1} \cdot g + \alpha_1 z^{n-2} \cdot g + \cdots + \alpha_{n-1} g = \chi_F^{(n)}(z) \cdot g \end{aligned}$$

in $X_F$. [For consistency, $\chi_F^{(n+1)}(z) = \chi_F(z)$.] These vectors are easily seen to be linearly independent over $K$. They generate $X_F$ since $X_F \cong K^{(n)}[z]$ as a $K$-vector space (Proposition (5.2)). Hence $e_1, \ldots, e_n$ are a basis for $X_F$ as a $K$-vector space. With respect to this basis, the $K$-homomorphism $z : x \mapsto z \cdot x$ is represented by the matrix

(5.6) $$F = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\alpha_n & -\alpha_{n-1} & -\alpha_{n-2} & \cdots & -\alpha_1 \end{bmatrix}.$$

[This is proved by direct computation. In particular, it is necessary to use the fact that

$$z \cdot e_1 = z\chi_F^{(n)}(z) \cdot g = (\chi_F(z) - \alpha_n) \cdot g = -\alpha_n e_n.]$$

Note that the last row of $F$ in (5.6) consists of the coefficients of $\chi_F$. By definition, $g = e_n$. Hence $g$ as a column vector in $K^n$ has the representation

(5.7) $$g = (0, \ldots, 0, 1)'.$$

Conversely, suppose $F$, $g$ have the matrix representation (5.6-7) with respect to some basis in $K^n$. Then (by direct computation) the rank condition (2.8) is satisfied and therefore $(F, g)$ is completely reachable in both the continuous-time and discrete-time cases (Propositions (2.7) and (2.16)). We have now proved:

(5.8) PROPOSITION. The pair $(F, g)$ is completely reachable if and only if there is a basis relative to which $(F, g)$ is given by (5.6-7).

COROLLARY. Given an arbitrary $n$-th degree polynomial $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n$ in $K[z]$, $K$ = arbitrary field. There exists an $n$-vector $\ell$ such that $\chi_{F - g\ell'} = \lambda$ if and only if the pair $(F, g)$ is completely reachable.
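Both the canonical form and the corollary can be checked by direct computation on an example. The sketch below (Python, exact rationals, hypothetical names) builds the basis $e_n = g$, $e_{j-1} = Fe_j + \alpha_{n-j+1}g$, transforms $F$ to that basis, and then picks $\ell$ so that the last row of the closed-loop matrix carries the desired coefficients. It assumes characteristic zero, because the characteristic polynomial is computed with the Faddeev-LeVerrier recursion, which is this example's choice and not the text's.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def char_poly(F):
    """[alpha_1, ..., alpha_n] of det(zI - F), via Faddeev-LeVerrier."""
    n = len(F)
    F = [[Fraction(v) for v in row] for row in F]
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M, alphas = identity, []
    for k in range(1, n + 1):
        FM = mat_mul(F, M)
        alpha = -sum(FM[i][i] for i in range(n)) / k
        alphas.append(alpha)
        M = [[FM[i][j] + alpha * identity[i][j] for j in range(n)] for i in range(n)]
    return alphas

def inverse(A):
    """Gauss-Jordan inverse; raises StopIteration if A is singular."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def basis_matrix(F, g):
    """Matrix T whose columns are e_1, ..., e_n of the canonical basis."""
    n, alphas = len(F), char_poly(F)
    e = [Fraction(v) for v in g]
    basis = [e]                                   # starts as [e_n]
    for k in range(1, n):                         # builds e_(n-1), ..., e_1
        Fe = [sum(a * b for a, b in zip(row, e)) for row in F]
        e = [a + alphas[k - 1] * b for a, b in zip(Fe, g)]
        basis.insert(0, e)
    return [[basis[j][i] for j in range(n)] for i in range(n)]

def feedback(F, g, betas):
    """Row vector l' with char_poly(F - g l') == betas, for completely reachable (F, g)."""
    n, alphas = len(F), char_poly(F)
    ell_c = [[betas[n - 1 - j] - alphas[n - 1 - j] for j in range(n)]]
    return mat_mul(ell_c, inverse(basis_matrix(F, g)))[0]

F, g = [[1, 2], [3, 4]], [1, 0]
T = basis_matrix(F, g)
F_c = mat_mul(inverse(T), mat_mul(F, T))          # canonical form of F
ell = feedback(F, g, [3, 2])                      # target z^2 + 3z + 2
closed = [[F[i][j] - g[i] * ell[j] for j in range(2)] for i in range(2)]
```

For this pair the canonical form is `[[0, 1], [2, 5]]`, whose last row lists $-\alpha_2, -\alpha_1$ for $\chi_F = z^2 - 5z - 2$, and the closed-loop matrix has characteristic polynomial $z^2 + 3z + 2$ as requested.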


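PROOF. Suppose that $(F, g)$ is completely reachable. With respect to the same basis (5.5) which exhibits the canonical forms (5.6-7), define

$$\ell' = (\beta_n - \alpha_n,\ \ldots,\ \beta_1 - \alpha_1).$$

Then verify by direct computation that $\chi_{F - g\ell'} = \lambda$.

Conversely, suppose that $(F, g)$ is not completely reachable.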

Then, recalling Proposition (2.12) (which is an

algebraic consequence of (2.8) and hence equally valid for both continuous-time and discrete-time), deg ~

•

~

Since

22

the polynomial

KU

and so is also

F-invariant subspace of X = ~,

is independent of the choicp of basis in

~

II

~ = XP/X • (In F 22 11 does not depend on the arbitrary choice of

and the same is true then also for

particular, X 2

is an

dim X > 0 2

~

22

in satisfYing the condition X =

we have for all

n-vectors

~ EB X

2.)

t, deg

This contradicts the claim that with suitable choice of

In view of (2.12),

~

22

A-X - · F- gt ·

> O.

is true for any

t.

In view of the importance of this last result, we shall rephrase it in purely module-theoretic terms:

A

o


THEOREM. Let $K$ be an arbitrary field and $X$ a cyclic $K[z]$-module with generator $g$ and minimal polynomial $\chi$ of degree $n$. There is a bijection between $n$-th degree polynomials $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n$ in $K[z]$ and $K$-homomorphisms $\ell : X \to X : \chi^{(j)} \cdot g \mapsto \ell_j \cdot g$ ($j = 1, \ldots, n$ and $\chi^{(j)}$ defined as in (5.5)) such that $\lambda$ is the minimal polynomial for the new module structure induced on $X$ by the map $z^* : x \mapsto z \cdot x - \ell(x)$.

Note that in (5.11) $\ell(x)$ corresponds to $g\ell'x$ in (5.10). The map $\ell$ in (5.11) defines a control law for the system $(F, g, -)$ corresponding to the module $X$. The passage from $z$ to $z^*$ is the module-theoretic form of the well-known open-loop to closed-loop transformation used in classical linear control theory.

PROOF. Since the vectors $\chi^{(1)} \cdot g, \ldots, \chi^{(n)} \cdot g$ form a basis for $X$ (as a $K$-vector space), $\ell$ is clearly a well-defined $K$-homomorphism. We treat $\ell$ formally as an element of $K[z]$ (that is, an operator on $X$), by writing $\ell \cdot x = \ell(\xi \cdot g)$, where $\xi$ represents the equivalence class $[\xi] = \{\xi : \xi \cdot g = x\}$. Unless identically zero, $\ell$ is never a $K[z]$-homomorphism and therefore does not commute with nonunits in $K[z]$.

Define $\ell_j = \beta_j - \alpha_j$, $j = 1, \ldots, n$. We prove first that this choice of $\ell$ implies $\lambda^{(j)}(z - \ell) = \chi^{(j)}(z)$ for $j = 1, \ldots, n + 1$. Use induction on $j$. By definition, $\lambda^{(1)}(z - \ell) = \chi^{(1)}(z)$. In the general case, the chain of equalities uses, in order, the inductive hypothesis, the definition of $\ell$, the definition of $\ell_j$, and the definition of $\chi^{(j+1)}$.

It follows (case $j = n + 1$) that $\lambda$ annihilates $g$ regarded as a $K[z^*]$-module. On the other hand, $\lambda^{(1)}(z^*) \cdot g, \ldots, \lambda^{(n)}(z^*) \cdot g$ is a basis for $X$ as a $K$-vector space since $\chi^{(1)}(z) \cdot g, \ldots, \chi^{(n)}(z) \cdot g$ was such a basis. So $X$ is cyclic with generator $g$ also as a $K[z^*]$-module. Hence by Propositions (5.1-2) the annihilating ideal of $g$ with respect to the $K[z^*]$-module structure cannot be generated by a polynomial of degree less than $n$, that is, $\lambda$ is indeed the minimal polynomial with respect to $z^*$. The correspondence $\lambda \leftrightarrow \ell$ is obviously bijective. □

The proof immediately implies the following

COROLLARY. Let $x = \xi \cdot g$ be any element of $X$ viewed as a $K[z]$-module. Then $x$ has the representation $\xi^* \cdot g$ with respect to the $K[z^*]$-module structure on $X$, where $\xi$ and $\xi^*$ are related as


So the open-loop/closed-loop transformation is essentially a change in the canonical basis, provided $X$ is cyclic. It is interesting that the polynomials $\chi^{(j)}$ have long been known in algebra (they are related to the Tschirnhausen transformation discussed extensively by WEBER [1898, §46, 54, 74, 85, 96]), but their present (very natural) use in module theory seems to be new.

**Theorem (5.11) may be viewed as the central special case of Theorem A of the Introduction.

Let us restate the latter in precise form as follows:

(5.13) THEOREM. Let $K$ = arbitrary field. Given an arbitrary $n$-th degree polynomial $\lambda(z) = z^n + \beta_1 z^{n-1} + \cdots + \beta_n \in K[z]$, there exists an $n \times m$ matrix $L$ over $K$ such that $\chi_{F - GL'} = \lambda$ if and only if $(F, G)$ is completely reachable.
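The statement can be checked numerically in the simplest situation. The sketch below is an illustration only, not the module-theoretic proof given in the text: it assumes a single input ($m = 1$), works over the rationals, and uses Ackermann's formula, a standard technique swapped in here for convenience. The helper names `char_poly`, `solve` and `feedback` are hypothetical.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def char_poly(A):
    """Coefficients [1, c1, ..., cn] of det(zI - A) via the Faddeev-LeVerrier
    recursion (valid in characteristic 0, since it divides by k)."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M, coeffs = I, [Fraction(1)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    return coeffs

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            raise ValueError("(F, g) is not completely reachable")
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

def feedback(F, g, lam):
    """Return l with char. polynomial of F - g l' equal to lam = [1, b1, ..., bn]
    (Ackermann's formula: l' = e_n' R^{-1} lam(F), R = [g, Fg, ..., F^{n-1} g])."""
    n = len(F)
    F = [[Fraction(v) for v in row] for row in F]
    cols = [[Fraction(v) for v in g]]
    for _ in range(n - 1):
        prev = cols[-1]
        cols.append([sum(F[i][k] * prev[k] for k in range(n)) for i in range(n)])
    e_n = [Fraction(0)] * (n - 1) + [Fraction(1)]
    y = solve(cols, e_n)            # cols, read as rows, is R'; so y' = e_n' R^{-1}
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    P = I
    for c in lam[1:]:               # Horner evaluation of lam(F)
        PF = mat_mul(P, F)
        P = [[PF[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    return [sum(y[i] * P[i][j] for i in range(n)) for j in range(n)]

F = [[0, 1], [-2, -3]]
g = [0, 1]
l = feedback(F, g, [1, 5, 6])       # target polynomial z^2 + 5z + 6
closed = [[F[i][j] - g[i] * l[j] for j in range(2)] for i in range(2)]
```

If the pair is not completely reachable, the reachability matrix is singular and `solve` raises, which mirrors the "only if" direction of the theorem in this special case.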


For some time, this result had the status of a well-known folk theorem, considered to be a straightforward consequence of (5.9). The latter has been discovered independently by many people. (I first heard of it in 1958, proposed as a conjecture by J. E. Bertram and proved soon afterwards by the so-called root-locus method.) Indeed, the passage from (5.11) to (5.13) is primarily a technical problem. A proof of (5.13) was given by LANGENHOP [1964] and subsequently simplified by WONHAM [1967]. The first proof was (unnecessarily) very long, but the second proof is also unsatisfactory, since it depends on arguments using a splitting field of $K$ and fails when $K$ is a finite field. We shall use this situation as an excuse to illustrate the power of the module-theoretic approach and to give a proof of (5.13) valid for arbitrary fields.

**The material between these marks was added after the Summer School.

The procedure of LANGENHOP and WONHAM rests on the following fact, of which we give a module-theoretic proof:

(5.14) LEMMA.

Let $K$ be an arbitrary but infinite field. Let $F$ be cyclic* and $(F, G)$ completely reachable. Then there is an $m$-vector $a \in K^m$ such that $(F, Ga)$ is also completely reachable.

We begin with a simple remark, which is also useful in reducing the proof of (5.13) to Lemma (5.18).

(5.15) SUBLEMMA. Every submodule of a cyclic module over a principal-ideal domain is cyclic.

PROOF OF (5.14). We use induction on $m$. The case $m = 1$ is trivial. The general case amounts to the following. Consider the submodule $Y$ of $X = X_F$ generated by the columns $g_1, \ldots, g_{m-1}$ of $G$. In view of (5.15), $Y$ is cyclic. By the inductive hypothesis, we are given the existence of a cyclic generator of $Y$ of the form

$$g_Y = \alpha_1 g_1 + \cdots + \alpha_{m-1} g_{m-1}, \qquad \alpha_i \in K.$$

We must prove: for suitable $\alpha, \beta \in K$ the vector $\alpha\cdot g_Y + \beta\cdot g_m$ is a cyclic generator for $X$.

*Of course, this means that the $K[z]$-module $X_F$ (see (4.13)) is cyclic.

By hypothesis, $X$ has an (abstract) cyclic generator $g_X$. By cyclicity we have the representations

$$g_Y = \eta\cdot g_X \quad\text{and}\quad g_m = \xi\cdot g_X, \qquad \eta, \xi \in K[z].$$

Hence our problem is reduced to proving the following: for suitable $\alpha, \beta \in K$ the polynomial $\alpha\eta + \beta\xi$ is a unit in $K[z]/\chi K[z]$. This, in turn, is equivalent to proving

(5.16) $$\alpha\eta_i + \beta\xi_i \not\equiv 0 \pmod{\theta_i}, \qquad i = 1, \ldots, r,$$

where $\theta_1, \ldots, \theta_r$ are the unique prime factors of $\chi$ in $K[z]$ and $\eta_i$, $\xi_i$ mean the representative of least degree of equivalence classes mod $\theta_i$.

Then no pair $(\eta_i, \xi_i)$ can be zero. For if one is, then $\theta_i \mid (\chi, \eta, \xi)$, whence the submodule $X' = K[z]g_Y + K[z]g_m$ is a proper submodule of $X$ (that is, $\chi/\theta_i$ annihilates $X'$), contradicting the fact that $(F, G)$ is completely reachable. If all the $\xi_i$ are zero, then every $\eta_i \neq 0$, $\eta$ is a unit in $K[z]/\chi K[z]$, and so $g_Y$ is already a cyclic generator. So let $\alpha = 1$. Then the condition $\eta_i + \beta\xi_i = 0$ eliminates at most $r$ values of $\beta$ from consideration. Since $K$ is infinite by hypothesis, there are always some $\beta$ which satisfy (5.16). □

An essential part of the lemma is the stipulation that $a \in K^m$. The hypothesis "$F$ = cyclic + $(F, G)$ = completely reachable" means that $g_X = Ga$ for some $a \in K^m[z]$; that is, the lemma is trivially true if $a$ is allowed to have polynomial entries. But since we want $a \in K^m$, there must be interaction between vector-space structure and module structure, and for this reason the lemma is nontrivial. As a matter of fact, the lemma is false when $K$ = finite field. The simplest counterexample is provided when (5.16) rules out a single nonzero value of $\beta$, thereby ruling out all $\beta$.

(5.17) COUNTEREXAMPLE.

Let

integers modulo the prime ideal

Notice that

K = y~, ~.

that is, the ring of

Consider

~ = Xl e X e ~

(as a K[z]-module), where the 2 minimal polynomials of the direct sumrrands are

').(z) X 2(z) X (Z)

3

z2 + z + I, z 2, z + 1. (Xl' X X = 1, hence 2, 3) gl generates Xl eX while

All these factors are relatively prime, X is cyclic. generates

Notice also that

X ex • A cyclic generator for 2 3

3

X is


A simple calculation gives

(z

4

2

+ Z

+ l)'~'

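Conditions (5.16) are here

$$\alpha\cdot 1 + \beta\cdot 0 \not\equiv 0 \pmod{\chi_1},$$
$$\alpha\cdot 0 + \beta\cdot 1 \not\equiv 0 \pmod{\chi_2},$$
$$\alpha\cdot 1 + \beta\cdot 1 \not\equiv 0 \pmod{\chi_3}.$$

These conditions have no solution in $\mathbb{Z}/2\mathbb{Z}$.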
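The three conditions above reduce to $\alpha \neq 0$, $\beta \neq 0$, $\alpha + \beta \neq 0$ in the ground field, so for a prime field they can be checked by brute force. The following sketch (the function name `solutions` is hypothetical) enumerates all pairs over $\mathbb{Z}/p\mathbb{Z}$; it suggests that the obstruction is specific to the two-element field.

```python
from itertools import product

def solutions(p):
    """All (alpha, beta) in (Z/pZ)^2 with alpha != 0, beta != 0 and
    alpha + beta != 0 (mod p), i.e. the three conditions of the counterexample."""
    return [(a, b) for a, b in product(range(p), repeat=2)
            if a % p != 0 and b % p != 0 and (a + b) % p != 0]
```

For $p = 2$ the list is empty, while for $p = 3$ the pairs $(1, 1)$ and $(2, 2)$ survive.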

At this point, the following is the situation concerning Theorem (5.13):

(1) Its counterpart, Theorem A of the Introduction, was claimed to be true in the continuous-time case under the hypothesis of complete controllability.

(2) In the discrete-time case (5.13), Theorem A with the preceding hypothesis is false, because of the counterexample: the pair ($F$ = nilpotent, $G = 0$) is completely controllable, but evidently $\chi_{F - GL'}$ is independent of $L$. However, in view of (5.11), Theorem (5.13) might be true also in the discrete-time case if "complete controllability" is replaced by "complete reachability", this modification being immaterial in the continuous-time case.

(3) Because of (5.17), we might expect that a theorem like (5.13) is false for an arbitrary field $K$.

(4) If our general claim that reachability properties are reflected in module-theoretic properties is true, then (5.13) should hold without assumptions concerning $K$, because the principal module-theoretic fact, that $K[z]$ = principal ideal domain, is independent of the specific choice of $K$.

We now proceed to establish Theorem (5.13). Hypotheses on $K$ will turn out to be irrelevant.

PROOF OF (5.13). Necessity is proved exactly as in (5.8). Sufficiency will follow by induction on $m$, once we have proved it in the special case $m = 2$:

(5.18) LEMMA.

Let

K be an arbitrary field and let

K[z]-module generated by

gl' g2.

K[z*]-module structure on

Let

Case 1.

z*

=z

£ - £

£(x)

will change the

serve that on

o

or

x E Z.

on

Thus there exist polynomials

£

In (5.11)

Replacing

K[z]-module structure on

t.. on

z

Y but pre-

is prime to the unchanged minimal polynomial

y +

Z

by

so that the new minimal poly-

V,

a

such that

B.r hypothesis, every x E X has the representation x

induces a

g2·

X=YEllZ.

for all

nomial Z.

gl + g2

that is,

Z. Further, choose Y

z - £

Y = K[z]gl and Z = K[z]g2.

ynZ=O,

such that

=

z*

X then X is cyclic with respect to this

structure and is generated by either

PROOF.

X be a

There is a K-homomorphism £

(of the tyPe defined in (5.11] such that if

take an

That is, special

vt.. + o X

~Z

= 1.

X


Now verify that x

= (T]crX + svA)·(gl + g2)'

T]crX'g

l

+ SVA·g 2,

T](l - VA)·gl + s(l - crX)'g2' Tj'gl + s 'g2'

K[z*]-module.

ynz=wf o.

C2.s e2.

there is ag E K[z] cyclicity of Take same w Tj

-1

T]

g'g2

f

on

Then if

O.

generates

unit (mod

w

there is also a

Y,

f

su ch that

1Sc).

To show:

X such that

g' g2

Z = X.

3y ;lypotlle s i s,

and therefore, by

Tj E K[z]

such that

1Sc)

g'g2 = w = Tj'gl'

we are done because

In the nontrivial case,

there is a suitable new module structure

~ = unit (mod X* ) ,

nomial of X as a

'lE W.

T] = u,'.it (mod

and so

Y,

kt

X*

being the minimal poly-

K[ z* ]-moduLe,

The main facts we need are the following:

(5.19) SUBLEMMA. Let $\chi$ be a fixed element of $K[z]$, $\deg\chi = n$, $F_\chi$ the companion matrix of $\chi$ given by (5.6), $X_{F_\chi}$ the cyclic module induced by $F_\chi$, and $g$ a cyclic generator of $X_{F_\chi}$. Then $\eta \in K[z]$ is a unit modulo $\chi$ if and only if $\eta\cdot g$ is also a cyclic generator of $X_{F_\chi}$.

PROOF. Obvious. □


.Jl-l )

f

( dety, FXY' ••. , .1"X Y

where

y

(5.19).

Same notations as in

SUB~~~.

(5.20)

Write

0,

is the column vector Tin

PROOF.

Since

X(1), ••• , x(n)

is the basis for the

K-vector space of all polynomials of degree (~l' ••• , ~n)

is uniquely determined by

< n,

By definition

~.

is the matrix representing the module operator to the special basis

e

l,

••• , en

in

~ X

the n-tuple

z: x

given by

~

z·x

FX relative

(5.5).

Similarly,

using one of the module axioms, we verifY that

£

J=l

[rt.x(j)(Z)]'g "J

'

Jl'iij[x(j)(z).gJ,

in other words, the numerical vector (5.22) represents the abstract vector

Ti·g

in

X relative to the same basis FX

e

l,

.•• , en'

Recall


that By

generates

Tj'g

(2.7)

~X

(F x, ll(FX) g)

is complete reachable.

the latter condition is equivalent to

follows from

(5.21)

Same notations as in

(5.19)

and

(5.20).

Given

n-vector (5.22), there exists a polynomial

X

is satisfied.

PROOF.

Let

11 1 , Ti 2 , X(z )

The rest

o

any nonzero nwnerical such that

(5.21).

(5.19).

SUBLEMVA.

numbers

iff

Ti r be the first member of the sequence of which is nonzero.

n +

Z

~z

and determine the first ll r

'i'ir+l

o

Tjr

o

o

n-l r

Write

+ •.. + an' coefficients of

X by the rule

~:J

:J

T}r

an

o

o

1

(Since all numbers belong to a field, the required values of a

r,

..• , an

exist.)

reduce the matrix in

Now check, by computation, that these conditions

(5.21)

to the direct sum of two triangular

matrices, each with nonzero elements on its diagonal . In view of always choose a new

(5.12), Xy = Xt

it follows from these facts that we can such that

Tjt

= unit

mod Xt •

o


The proof of Case 2 is not yet complete, however, because we must still extend the is easy .

Write first

Z

K[z*]-module structure from

= W$

Z·

and then

direct sum is now wi t h r espect to the

t

from Y to X

"by

£i Z'

s '.:ttins

polynomial

X*

(5.12),

is replaced by some

(5.24 )

~

defined over

K-~odule

O.

=

X Since ~*

X

Y to X. This

Y $ Z',

where the

structure of X.

Extend

;;o',{ we have a n(;w mi nima.L z*

= Zt on Y,

~*

= ~t .

By

such that

w

that is, our previous representation of

w~ 0

in W induces a

similar representation with respect to the new K[z*]-module structure on X. Since

~

Xr,

is a unit modul o

By (5.24), we have, with re sp ect t o the cy.

(~* ·g2)'

c-

(~* ·gl)'

we can

~T it e

K[z*]-s tructure,

(1 + TXt) ·gl'

gl· This proves that

52 generates both Y and Z; that is,

a cyclic generator f or

X end owed

~~ th

proof of Lemma ( 5 . 18 ) is now complete.

the

K[z*]-structure.

is The

o

It should be clear that Theorem (5.13) is not a purely module-theoretic result, but depends on the interplay between module theory, vector spaces, and elimination theory (via (5.21)). For instance, the fact that $\ell$ can be extended from $Y$ to $X$, which was needed in the proof of Case 2, is a typical vector-space argument.**

There are many open (or forgotten) results concerning cyclic modules which are of interest in system theory. For instance, it is easy to show that an $n \times n$ real matrix $F$ is cyclic iff a certain polynomial in the entries of the matrix is nonzero at $F$. This polynomial is roughly analogous to the polynomial det in the same ring, but, unlike in the latter case, its general form does not seem to be known.

We must not terminate this discussion without pointing out another consequence of cyclicity which transcends the module framework. Since $X$ = cyclic with generator $g$ is isomorphic with $K[z]/\chi_g K[z]$, it is clear that $X$ also has the structure of this commutative ring, that is, the product is defined as

$$x \times y = (\xi\eta)\cdot g \qquad (x = \xi\cdot g,\ y = \eta\cdot g).$$

If $\chi_g$ is irreducible, then $X$ is even a field. Hence, in particular, $X$ has a galois group. No one has ever given a dynamical interpretation of this galois group. In other words, there are obvious algebraic facts in the theory of dynamical systems which have never been examined from the dynamical point of view. For some related comments in the setting of topological semigroups, see DAY and WALLACE [1967].


6. TRANSFER FUNCTIONS

(6.0) PREAMBLE. There has been a vigorous tradition in engineering (especially in electrical engineering in the United States during 1940-1960) that seeks to phrase all results of the theory of linear constant dynamical systems in the language of the Laplace transform. Textbooks in this area often try to motivate their biased point of view by claiming that "the Laplace transform reduces the analytical problem of solving a differential equation to an algebraic problem". When directed to a mathematician, such claims are highly misleading because the mathematical ideas of the Laplace transform are never in fact used. The ideas which are actually used belong to classical complex function theory: properties of rational functions, the partial-fraction expansion, residue calculus, etc. More importantly, the word "algebraic" is used in engineering in an archaic sense and the actual (modern) algebraic content of engineering education and practice as related to linear systems is very meager. For example, the crucial concept of the transfer function is usually introduced via heuristic arguments based on linearity or "defined" purely formally as "the ratio of Laplace transforms of the output over the input". To do the job right, and to recognize the transfer function as a natural and purely algebraic gadget, requires a drastically new point of view, which is now at hand as the machinery set up in Sections 3-5. The essential idea of our present treatment was first published in KALMAN [1965b].


The first purpose of this section is to give an intrinsically algebraic definition of the transfer function associated with a discrete-time, constant, linear input/output map (see Definition (3.10)). Since the applications of transfer functions are standard, we shall not develop them in detail, but we do want to emphasize their role in relating the classical invariant factor theorem for polynomial matrices to the corresponding module theorem (4.34).

Consider an arbitrary $K[z]$-homomorphism $f\colon \Omega \to \Gamma$ (see Lemma (g) following Theorem (4.2)). Then as a "mathematical object" $f$ is equivalent to the set $\{f(e_j),\ j = 1, \ldots, m,\ e_j \text{ defined by (4.6)}\}$, since

(6.1) $$f(\omega) = \sum_{j=1}^{m} \omega_j\cdot f(e_j).$$

(The scalar product on the right is that in the $K[z]$-module $\Gamma$, as defined in Section 4.) By definition of $\Gamma$, each $f(e_j)$ is a formal power series in $z^{-1}$ with vanishing first term. We shall try to represent these formal power series by ratios of polynomials (which we shall call transfer functions*) and then we can replace formula (6.1) by a certain specially defined product of a ratio of polynomials by a polynomial. Some algebraic sophistication will be needed to find the correct rules of calculation. These "rules" will constitute a rigorous (and simple) version of Heaviside's so-called "calculus". There are no conceptual complications of any sort. (However, we are dodging some difficulties by working solely in discrete time.)

*This entrenched terminology is rather unenlightening in the present algebraic context.


Let $X_f = \Omega/\text{kernel } f$ be the state set of $f$ regarded as a $K[z]$-module. We assume that $X_f$ is a torsion module with nontrivial minimal polynomial $\psi$. Then, for each $j = 1, \ldots, m$ we have

(6.2) $$\psi\cdot f(e_j) = f(\psi\cdot e_j) = 0.$$

By definition of the module structure on $\Gamma$, (6.2) means that the ordinary product of the power series $f(e_j)$ by the polynomial $\psi$ is a (vector) polynomial. Hence (6.2) is equivalent to (notation: no dot = ordinary product)

(6.2') $$\psi f(e_j) = \theta_j \in K^p[z], \qquad j = 1, \ldots, m.$$

Intuitively, we can solve this equation by writing $f(e_j) = \theta_j/\psi$. There are two ways of making this idea rigorous.

Method 1. Define

(6.3) $$f(e_j) = \theta_j/\psi$$

as the formal division of $\theta_j$ by $\psi$ into ascending powers of $z^{-1}$. Check that the coefficient of $z^0$ is always 0. Verify by computation that the power series so obtained satisfies (6.2').

Method 2. Multiply both sides of (6.2') by $z^{-n}$, $n = \deg\psi$. Write $\tilde\psi(z^{-1}) = z^{-n}\psi(z)$ and $\tilde\theta_j(z^{-1}) = z^{-n}\theta_j(z)$. Then $\tilde\psi \in K[z^{-1}] \subset K[[z^{-1}]]$ and (6.2') becomes

(6.2'') $$\tilde\psi f(e_j) = \tilde\theta_j.$$

Moreover, the 0-th coefficient of $\tilde\psi$ is 1 (because of the convention that the leading coefficient of $\psi$ is 1), hence $\tilde\psi$ is a unit in $K[[z^{-1}]]$ and therefore

(6.3') $$f(e_j) = \tilde\psi^{-1}\tilde\theta_j.$$

Note that (6.3) and (6.3') give slightly different definitions of $f(e_j)$, depending on whether we use a transfer function with respect to the variable $z$ or $z^{-1}$. (Both notations have been used in the engineering literature.) For us the formalism of Method 1 is actually preferable. (The calculations of Method 1 can be reduced by Method 2 to the better-known calculations of the inverse in the ring $K[[z^{-1}]]$.)
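The formal division of Method 1 amounts to a short recursion on coefficients. The sketch below is an illustration under stated assumptions: scalar case ($p = 1$), monic denominator, rational coefficients, and a hypothetical function name `formal_division`. It matches the coefficient of each power of $z$ in $\psi \cdot \sum_{k \ge 1} c_k z^{-k} = \theta$.

```python
from fractions import Fraction

def formal_division(theta, psi, terms):
    """Coefficients c_1, ..., c_terms of theta/psi = sum_{k>=1} c_k z^{-k}.

    theta, psi: coefficient lists, highest power first; psi is assumed monic
    and deg theta < deg psi, so the coefficient of z^0 is automatically 0.
    """
    n = len(psi) - 1
    # t[k-1] is the coefficient of z^{n-k} in theta, for k = 1, ..., n
    t = [Fraction(0)] * (n - len(theta)) + [Fraction(v) for v in theta]
    c = []
    for k in range(1, terms + 1):
        rhs = t[k - 1] if k <= n else Fraction(0)
        # contribution of the already known coefficients c_{k-i}, i >= 1
        acc = sum(Fraction(psi[i]) * c[k - 1 - i]
                  for i in range(1, min(n, k - 1) + 1))
        c.append(rhs - acc)
    return c

# 1 / (z^2 + 3z + 2) expanded in powers of z^{-1}
series = formal_division([1], [1, 3, 2], 5)
```

For this example the expansion starts $z^{-2} - 3z^{-3} + 7z^{-4} - 15z^{-5}$, so the first term (the coefficient of $z^{-1}$) happens to vanish as well.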

Summarizing, we have the easy but fundamental result:

(6.4) EXISTENCE OF TRANSFER FUNCTIONS. There is a bijective correspondence between $K[z]$-homomorphisms $f\colon \Omega \to \Gamma$ with minimal polynomial $\psi$ and transfer function matrices of the type

$$Z = (\theta_1/\psi, \ldots, \theta_m/\psi),$$

where $\theta_j \in K^p[z]$, $\deg\theta_j < \deg\psi$, and $\psi$ is the least common denominator of $Z$.

In many contexts, it is preferable to deal with the transfer function $Z_f$ corresponding to $f$ rather than with $f$ itself. Because the correspondence $f \leftrightarrow Z_f$ is bijective, it is clear that all objects induced by $f$ are well-defined also for $Z_f$ and conversely. Thus, for instance,

$$\dim Z_f \triangleq \dim f \triangleq \dim X_f;$$
$$\psi_Z = \text{least common denominator of } Z = \psi_{f_Z} = \text{minimal polynomial of } f_Z.$$


(6.5) REMARK. In view of Propositions (4.20-21), the natural realization of $Z$, namely $X_Z \triangleq X_{f_Z}$, is completely reachable as well as completely observable. Not having this fact available before 1960 has caused a great confusion. Questions such as those resolved by Theorem (5.13) tended to be attacked algorithmically, using special tricks amounting to elementary algebraic manipulations of elements of $Z$. Very few theoretical results could be conclusively established by this route until the conceptual foundations of the theory of reachability and observability were developed.

The preceding results may be restated as "rules" whereby the values of $f$ may be computed using $Z$. We have in fact,

(6.6) $$f(\omega) = Z\cdot\omega,$$

where the product is computed as follows: multiply the polynomial matrix consisting of the numerators of $Z$ with $\omega$, reduce to minimal-degree polynomials modulo $\psi_Z$, and then divide formally by $\psi_Z$ as in Method 1 above.

We can also compute the entire output of the system $\Sigma_Z$ (that is, all output values following the application of the first nonzero input value) by the rule

(6.7) same as above, but do not reduce modulo $\psi_Z$.

In this second case, the output sequence will begin with a positive power of $z$. (The coefficients of the positive powers of $z$ are thrown away in the definition of $f$ (see (3.7)) and in the definition of the scalar product in $\Gamma$, in order to secure a simple formula for $X_f = \Omega/\text{kernel } f$.)

Many other applications of transfer functions may be found in KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 10].

It is easy to show that the transfer function associated with the system $\Sigma_f = (F, G, H)$ is given by $Z_f = H(zI - F)^{-1}G$. (This is just the formal Laplace transform computed from the constant version of (1.12) by setting $z = d/dt$ or from (1.17) by setting $x(t + 1) = zx(t)$.)
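Expanding $H(zI - F)^{-1}G$ in powers of $z^{-1}$ gives the coefficients $HG, HFG, HF^2G, \ldots$, which can be computed without inverting anything. The sketch below assumes a single-input, single-output triple with $G$ and $H$ given as plain lists; the name `markov_parameters` and the numerical example are illustrative choices, not taken from the text.

```python
def markov_parameters(F, G, H, count):
    """Return [H G, H F G, H F^2 G, ...], the coefficients of
    z^{-1}, z^{-2}, ... in the expansion of H (zI - F)^{-1} G."""
    n = len(F)
    v, out = list(G), []
    for _ in range(count):
        out.append(sum(H[i] * v[i] for i in range(n)))
        v = [sum(F[i][k] * v[k] for k in range(n)) for i in range(n)]
    return out

# Companion matrix of z^2 + 3z + 2, so that Z(z) = 1 / (z^2 + 3z + 2)
F = [[0, 1], [-2, -3]]
a = markov_parameters(F, [0, 1], [1, 0], 5)
```

By the Cayley-Hamilton theorem the coefficients satisfy the recurrence $a_{k+2} + 3a_{k+1} + 2a_k = 0$ dictated by the characteristic polynomial, which is the numerical counterpart of the statement that multiplication by the denominator turns the power series into a polynomial.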

Probably the simplest way of computing

Z

is

via the formula

6.8)

q

where

1/I

F

is the minimal polynomial of the matrix

script denotes the special polynomials defined in identity

(6.8)

deg .1/I, F and the super-

(5.5).

The matrix

follows at once from the classical scalar identity

[WEBER, 1898, §4]

ttl . ( L) (z - w) .L. zJ;; q-a, (w),

7T(Z) - 7T( w) upon setting

w

= F,

7T

J= l

= 1/IF'

q

deg 7T,

and invoking the Cayley-Hamilton theorem.

Much of classical linear system theory was concerned with computing $Z_f$. In the modern context, this problem "factors" into first solving the realization problem $f \mapsto \Sigma_f$ and then applying formula (6.8). See Sections 8 and 9.

One of the mysterious features of Rule (6.6) (as contrasted with the conventional rule (6.7)) is the necessity of reducing modulo $\psi$. The simplest way of understanding the importance of this aspect of the problem is to show how to relate the module invariant factors occurring in the structure theorem (4.34) to the classical facts concerning the invariant factors of a polynomial matrix.

INVARIANT FACTOR THEOREM FOR MATRICES.

Let

P be a

matrix with elements in an arbitrary principal-ideal domain p

(6.10) where

A and

Bare

p X P

diag

IT

= rank

P.

The

II. 1

and

Rand

Then

m X m matrices (not necessarily det A, det B units in

(~x, ~FEx, •••

define a factorization of

f.

Hence the correspondence between (3.12)

~ (7.2) is bijective.

The quickest way to exploit the algebraic consequences of our definition (7.2) is via the following arrow-theoretic fact:


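ZEIGER FILL-IN LEMMA. Let $A, B, C, D$ be sets and $\alpha$, $\beta$, $\gamma$, and $\delta$ set maps for which the following diagram commutes:

$$\begin{array}{ccc} A & \xrightarrow{\ \alpha\ } & B \\ {\scriptstyle\beta}\downarrow & \swarrow{\scriptstyle\varphi} & \downarrow{\scriptstyle\gamma} \\ C & \xrightarrow{\ \delta\ } & D \end{array}$$

(the diagonal arrow $\varphi\colon B \to C$ is dashed). If $\alpha$ is surjective and $\delta$ is injective, there exists a unique set map $\varphi$ corresponding to the dashed arrow which preserves commutativity.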
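For finite sets the fill-in map can be written down directly. The following is a minimal sketch, assuming the maps are given as Python dictionaries and that the vertical maps are $\beta\colon A \to C$ and $\gamma\colon B \to D$; the function name `fill_in` is hypothetical.

```python
def fill_in(alpha, beta, gamma, delta):
    """Given finite maps (dicts) with gamma∘alpha == delta∘beta, alpha onto B
    (the key set of gamma) and delta one-to-one, return the unique phi: B -> C
    with phi∘alpha == beta and delta∘phi == gamma."""
    if any(gamma[alpha[a]] != delta[beta[a]] for a in alpha):
        raise ValueError("the square does not commute")
    if set(alpha.values()) != set(gamma):
        raise ValueError("alpha is not surjective")
    if len(set(delta.values())) != len(delta):
        raise ValueError("delta is not injective")
    phi = {}
    for a, b in alpha.items():
        # Well defined: alpha(a1) == alpha(a2) gives delta(beta(a1)) == delta(beta(a2)),
        # hence beta(a1) == beta(a2) because delta is injective.
        if phi.setdefault(b, beta[a]) != beta[a]:
            raise ValueError("unreachable when the hypotheses hold")
    return phi

alpha = {0: 0, 1: 1, 2: 0, 3: 1}              # A = {0,1,2,3} onto B = {0,1}
beta = {0: "x", 1: "y", 2: "x", 3: "y"}       # A -> C = {"x","y","w"}
delta = {"x": 10, "y": 20, "w": 30}           # C -> D, injective
gamma = {0: 10, 1: 20}                        # B -> D
phi = fill_in(alpha, beta, gamma, delta)
```

Surjectivity of $\alpha$ is what makes $\varphi$ defined on all of $B$, and injectivity of $\delta$ is what makes it single-valued; these are the two halves of the diagram chase.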

This follows by straightforward "diagram-chasing", which proves at the same time the

COROLLARY. The claim of the lemma remains valid if "sets" are replaced by "R-modules" and "set maps" by "R-homomorphisms".

Applying the module version of the lemma twice, we get

(7.6) PROPOSITION. Consider any two canonical realizations of a fixed $f$: the corresponding state-sets are isomorphic as $K[z]$-modules.

Since every $K[z]$-module is automatically also a $K$-vector space, (7.6) shows that the two state sets are $K$-isomorphic, that is, have the same dimension as vector spaces. The fact that they are also $K[z]$-isomorphic implies, via Theorem (4.34), that they have the same invariant factors. We have already employed the convention that (in view of the bijection between $f$ and $\Sigma_f$) the invariant factors of $f$ and $X_f$ are to be identified. In view of (7.6), this is now a general fact, not dependent on the special construction used to get $X_f$. We can therefore restate (7.6) as the

(7.7) ISOMORPHISM THEOREM FOR CANONICAL REALIZATIONS. Any two canonical realizations of a fixed $f$ have isomorphic state modules. The state module of a canonical realization is uniquely characterized (up to isomorphism) by its invariant factors, which may be also viewed as those of $f$.

A simple exercise proves also

(7.8) PROPOSITION. If $X$ is the state module of a canonical realization of $f$, then $\dim X$ (as a vector space) is minimum in the class of all realizations of $f$.

This result has been used in some of the literature to justify the terminology "minimal realization" as equivalent to "canonical realization". We shall see in Section 9 that the two notions are not always equivalent; we prefer to view (7.2) as the basic definition and (7.8) as a derived fact.

REMARK. Theorem (7.7) constitutes a proof of the previously claimed (4.24). To be more explicit: if $\Sigma = (F, G, H)$ and $\hat\Sigma = (\hat F, \hat G, \hat H)$ are two triples of matrices defining canonical realizations of the same $f$, then (7.7) implies the existence of a vector-space isomorphism $A\colon X \to \hat X$ such that

(7.10) $$\hat F = AFA^{-1}, \qquad \hat G = AG, \qquad \hat H = HA^{-1}.$$

If we identify $X$ and $\hat X$ then $A$ is simply a basis change and it follows that the class of all matrix triples which are canonical realizations of a fixed $f$ is isomorphic with the general linear group over $X$.

The actual computation of a canonical realization, that is, of the abstract Nerode equivalence classes $[\omega]_f$, requires a considerable amount of applied-mathematical machinery, which will be developed in the next section. The critical hypothesis is the existence of a factorization of $f$

such that

expressed by saying that

f

dim X

A' + N'. ~ and

~B'


The sequence

A is uniquely determined by

from the left and the sequence acting on the matrix

N.

acting on

~AI, 7\"(~)

B is uniquely determined by from the right .

~A', A"(~)

are equal by hyp othesis on

~

~B

The two matrices

Moreover,

and

are also equal, since the matrices on the right-hand side depend only on the

2nd, •.• , N-th

member of ea ch sequence.

Using only this fact

and the associativity of the matrix product 11:-1

~AI, A"~~B ;:

So

'

k-l

~~AI, N'~B

'

o

B.

A

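Now we can hope for a realization algorithm which uses only the first $\lambda' + \lambda''$ terms of a sequence of finite length. In fact, we have

(8.16) B. L. HO'S REALIZATION ALGORITHM. Consider any infinite sequence $A$ of finite length with associated Hankel matrix $\underline{H}$. The following steps will lead to a canonical realization of $A$:

(i) Determine $\lambda'$, $\lambda''$.

(ii) Compute $n = \operatorname{rank} \underline{H}_{\lambda',\lambda''}$; in doing so, determine nonsingular $p\lambda' \times p\lambda'$ and $m\lambda'' \times m\lambda''$ matrices $P$, $Q$ such that

(8.17) $$P\,\underline{H}_{\lambda',\lambda''}\,Q = \begin{pmatrix} I_n & 0 \\ 0 & 0 \end{pmatrix}.$$

(iii) Compute

(8.18) $$F = R_n P[\sigma\underline{H}_{\lambda',\lambda''}]Q C^n, \qquad G = R_n P\,\underline{H}_{\lambda',\lambda''}\,C^m, \qquad H = R_p\,\underline{H}_{\lambda',\lambda''}\,Q C^n,$$

where $R_p$ and $C^m$ are idempotent "editing" matrices corresponding to the operations "retain only the first $p$ rows" and "retain only the first $m$ columns".

We claim the

(8.19)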

REALIZATION THEOREM FOR INFINITE SEQUENCES. For any infinite sequence $A$ whose associated Hankel matrix $\underline{H}$ has finite length $(\lambda', \lambda'')$, B. L. Ho's formulas (8.17-18) yield a canonical realization.

PROOF. If $\Sigma$ defined by (8.17-18) is a realization of $A$, then it is certainly canonical: by (8.4) $\Sigma$ has minimal dimension in the class of all realizations of $A$ and so it is canonical by (7.8).

The required verification is interesting. First, drop all subscripts. Observe that $\underline{H}^{\#} = QCRP$ is a pseudo-inverse of $\underline{H}$, that is, $\underline{H}\,\underline{H}^{\#}\,\underline{H} = \underline{H}$. Then, by definition of

~G

F, G, H,

II

.'m d

~,

(~Q.C)(RP[(J"r&]Q.C)k(~C),

~(~II[(J"~])~~C; by repeated application of (8.9),

~(~I1~)~~C ~~(~II~)k-~~C,

RS~~C,

$$\cdots = R[\sigma^k\underline{H}]C.$$

The last equation calls for picking out the first $p$ rows and the first $m$ columns of $\sigma^k\underline{H}$, which is just $A_{k+1}$, as required. □

(8.20) COMMENT.

This is a considerably sharper result than Theorem

(8.12), in two respects: (i) use the matrix (ii) form:

It is no longer necessary to compute ~", , ,," «(J"At;;) ,

~:

we simply

which is part of the data of the problem.

Formulas (8.18) give the desired realization in minimal

there is no need to reduce (8.13) to a minimal realization (recall

here (7.11». Notice also that the proof of (8.19) does not re~uire (8.12) but depends (just like the latter) on direct use of (8.8).
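The steps of algorithm (8.16) can be traced on a small example. The sketch below is an illustration under stated assumptions, not the general algorithm: it treats only the scalar case ($p = m = 1$), uses a square $N \times N$ Hankel block with $N$ assumed at least as large as the dimension of the minimal realization, and works over the rationals. The function name `ho_realization` is hypothetical.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def ho_realization(a, N):
    """a = [A_1, A_2, ...] (at least 2N scalars). Returns (F, G, H) with
    H F^k G = A_{k+1}, provided the N x N Hankel block already has full rank n."""
    Hk = [[Fraction(a[i + j]) for j in range(N)] for i in range(N)]       # Hankel matrix
    sH = [[Fraction(a[i + j + 1]) for j in range(N)] for i in range(N)]   # shifted Hankel
    M = [row[:] for row in Hk]
    P = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
    Q = [row[:] for row in P]
    n = 0
    for k in range(N):   # reduce M = P Hk Q to diag(I_n, 0), as in (8.17)
        piv = next(((i, j) for i in range(k, N) for j in range(k, N)
                    if M[i][j] != 0), None)
        if piv is None:
            break
        i, j = piv
        M[k], M[i] = M[i], M[k]
        P[k], P[i] = P[i], P[k]
        for X in (M, Q):
            for row in X:
                row[k], row[j] = row[j], row[k]
        d = M[k][k]
        M[k] = [v / d for v in M[k]]
        P[k] = [v / d for v in P[k]]
        for r in range(N):            # clear column k by row operations
            if r != k and M[r][k] != 0:
                f = M[r][k]
                M[r] = [x - f * y for x, y in zip(M[r], M[k])]
                P[r] = [x - f * y for x, y in zip(P[r], P[k])]
        for c in range(N):            # clear row k by column operations
            if c != k and M[k][c] != 0:
                f = M[k][c]
                for X in (M, Q):
                    for row in X:
                        row[c] -= f * row[k]
        n += 1
    PsHQ = mat_mul(mat_mul(P, sH), Q)
    F = [row[:n] for row in PsHQ[:n]]            # first n rows and columns
    G = [mat_mul(P, Hk)[i][0] for i in range(n)] # first n rows, first column
    H = mat_mul(Hk, Q)[0][:n]                    # first row, first n columns
    return F, G, H

F, G, H = ho_realization([1, 1, 2, 3, 5, 8], 3)
```

In this example the $3 \times 3$ Hankel matrix of the Fibonacci numbers has rank 2, so the procedure returns a two-dimensional triple; the first six terms of $HF^kG$ can then be compared against the input data.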


An apparently serious limitation of the algorithm (8.16) is the necessity to verify abstractly that "$A$ has finite length". Of course, this can be done only on the basis of certain special hypotheses on $A$, given in advance. (Examples: (i) $A_k = 0$ for all $k > q$; (ii) $A_k$ = coefficients of the Taylor expansion of a rational function.) Fortunately, the difficulty is only apparent, for the preceding developments can be sharpened further:

(8.21) FUNDAMENTAL THEOREM OF LINEAR REALIZATION THEORY. Consider any infinite sequence $A$ and the corresponding Hankel matrix $\underline{H}$.

Suppose there exist integers (8.22)

1,1, 1,"

such that

rank

u., +,.r, 1 n,,(~), _

rank

~£ I, 1,"+1 q~) .

_.r,

'" of Then there exists unique extension A

such that with

A'

1.

by

and

Z such that

By repeated application of (8.23), it follows that we have also

k > O.

Now i t is clear, from (8.8), that

A~

A"

every

block column of H(A) = =r

is linearly dependent on the columns to the left of it. Every partial sequence seguence

~

may be extended to an infinite

A in at least one way such that the condition n (A) o =r

for all

~

> A' (A ), v > A" (A ) =r =r

is satisfied. PROOF.

The existence of the numbers

It suffices to show, for arbitrary such a way that the numbers

A', A",

and

Consider the first row of Ar+l

n

r,

is trivial.

how to select

Ar+l

remain constant.

and examine in turn all the

first rows of the first, second, third, ••. ,

!! U ). - -r

o

A'. A"

ALth

block rows in

If the first row of the first block row is linearly depen-

dent on the rows above it (that is,

0), we fill in the first row

in


of $A_{r+1}$ using this linear dependence (that is, we make the first row of $A_{r+1}$ all zeros). This choice of the first row of $A_{r+1}$ will preserve linear dependencies for the first row of every block row below the second block row, by the definition of the Hankel pattern. If the first row in the first block row is linearly independent of those above (that is, contributes 1 to $n_0(\underline{A}_r)$), we pass to the second block row and repeat the procedure. Eventually the first row of some block row will become linearly dependent on those above it, except when $\lambda' = r$; in that case, choose the first row of $A_{r+1}$ to be linearly dependent on the first rows of $A_1, \ldots, A_r$. Repeating this process for the second, third, ..., rows of each block row*, eventually $A_{r+1}$ is determined without increasing $\lambda'$ or $n_0$.

To complete the proof, we must show that the above definition of $A_{r+1}$ also preserves the value of $\lambda''$. That is, we must show that no new independent columns are produced in the Hankel array of $\underline{A}_r$ when $A_{r+1}$ is filled in.

that the definition of Ar+l rank H =r, I rank I]-r- 1 , 2

rank HI = ,r

This is verified immediately by noting implies the conditions

rank!! -r+I , l' rank ~r, 2'

rank ~2, r

rank ~l,r+l.

*Of course, ::0-., Li nea,r dep endence in t.he first step does :1Qt that the corresponding row of Ar+l will be ~ll zeros.

in~ly

o


With the aid of this simple but subtle observation, the problem is reduced to that covered by the Main Theorem (8.21) of Section 8. We have:

MAIN THEOREM FOR MINIMAL PARTIAL REALIZATIONS.* Let $\underline{A}_r$ be a partial sequence. Then:

(i) Every minimal realization of $\underline{A}_r$ has dimension $n_0(\underline{A}_r)$.

(ii) All minimal realizations may be determined with the aid of B. L. Ho's formulas as given by Lemma (iii) If -is unique.

(8.17-18) vdth

r> A'(A ) + A"(A) = =r =r

there are extensions of

~r

then the minimal realization

~~ny

satiSfying

By the Main Lemma

minima'

r=alizcti~ns

o

So we can apply the

as

(9.6).

(9.5),

every partial sequence

has at least one infinite extension "hich preserves n.

A" = A"(A ) =r

(9.5).

Othen,ise there are ss

PROOF.

and

A', A"

~r

and

(8.21) of the preceding section.

It fo.l Lovs that the minimal partial realization is uni que if (the A' (A ) + A"(A ) + 1 Hankel matrix can be =r =r = =r =r filled in completely with the available data); in the contrary case, the

r

> At (A ) + A" (A )

minimal extensions will depend on the

mar~er

in which the matrices

Ar+l' •••, AA'+ 1\' have been determined (subject to the requirement

o

(9.6) ).

In view of the theorem, we are justified in calling the integer $n_0(\underline{A}_r)$ the dimension of $\underline{A}_r$.

*A similar result was obtained simultaneously and independently by T. Tether (Stanford dissertation, 1969).

(9.8) REMARK. The essential point is that the quantities $n_0$, $\lambda'$, and $\lambda''$ are uniquely determined already from partial data, irrespective of the possible nonuniqueness of the minimal extensions of the partial sequence. We warn, however, that this result does not generalize to all invariants of the minimal realization. For instance, one cannot determine from $\underline{A}_r$ how many cyclic pieces a minimal realization of $\underline{A}_r$ will have: some minimal realizations may be cyclic and others may not [KALMAN 1970b].

Finally, let us note also a second consequence of the Main Theorem:

COROLLARY.

Suppose $n_1(\underline{A}_r)$ is the number of independent columns of the Hankel array of $\underline{A}_r$ (defined analogously with $n_0(\underline{A}_r)$). Then $\dim \underline{A}_r = n_1(\underline{A}_r)$.

PROOF. If $n_1(\underline{A}_r) > n_0(\underline{A}_r)$ then, using the Main Theorem, we get a contradiction to the fact that the rank of any Hankel matrix of an infinite sequence is a lower bound for the dimension of any realization (Proposition (8.4)). If $n_1(\underline{A}_r)$ ... to any ... equal to ...

K) •

is simply a "rule" (in practice, a computing algorithm) which assigns to each possible output sequence $\gamma$ in $\Gamma$ a number in the field $K$. If $\gamma$ resulted from the state $x$ then

$$\hat\gamma(\gamma) = \hat\gamma(f(\omega)) = (\hat\gamma \circ f)(\omega)$$

gives the value of a certain function in $\hat\Omega$ and, by definition of the state, also the value of a certain function in $\hat X$. This suggests the

(10.2) DEFINITION. An element $\hat x \in \hat X$ is an observable costate iff there is a $\hat\gamma_{\hat x} \in \hat\Gamma$ such that we have identically for all $\omega \in \Omega$

$$\hat x([\omega]_f) = \hat\gamma_{\hat x}(f(\omega)).$$

In other words, no matter what the initial state $x = [\omega]_f$ is, the value of $\hat x$ at $x$ can always be determined by applying the rule $\hat\gamma_{\hat x}$ to the output sequence $f(\omega)$ resulting from $x$.

Note, carefully, that this definition subsumes (i) a fixed choice of the class of functions denoted by the circumflex, and (ii) a fixed input sequence after $t = 0$ (here $\nu = 0$). For certain purposes, it may be necessary to generalize the definition in various ways [KALMAN 1970a], but here we wish to avoid all unessential complications.

According to Definition (10.2), we shall see that a system is completely observable iff every costate is observable.

This agrees with the point of view adopted earlier (see Section 4) in an ad-hoc fashion. Also, the vague requirement to "determine $x$" used in (10.1) is now replaced by a precise notion which can be manipulated (via the actual definition of the circumflex) to express limitations on the algorithms that we may apply to the output sequence of the system.

The requirement "every costate is observable" can often be replaced by a much simpler one. For instance, if $X$ is a vector space, it is enough to know that "every linear costate is observable" or even just that "every element of some dual basis is an observable costate"; if $X$ is an algebraic variety, it is natural to interpret "complete observability" as "every element of the coordinate ring of $X$ is an observable costate" [KALMAN 1970a].

We can now carry out a straightforward "dualization" of the setup involved in the definition of the input/output map $f\colon \Omega \to \Gamma$. First, we adopt (again with respect to a fixed interpretation of the circumflex):

(10.3) DEFINITION. The dual of an input/output map $f\colon \Omega \to \Gamma$ is the map

$$\hat f\colon \hat\Gamma \to \hat\Omega\colon \hat\gamma \mapsto \hat\gamma \circ f.$$

Note that $\hat f$ is well-defined, since the circumflex means the class of all functions.

As to the next step, we wish to prove that constancy is inherited under dualization. To do this, we have to induce a definition of the shift operator on $\hat\Gamma$ and $\hat\Omega$. The only possible definitions are the obvious ones:

$$z^{-1}\colon \hat\Gamma \to \hat\Gamma\colon \hat\gamma \mapsto \hat\gamma \circ z, \qquad z^{-1}\colon \hat\Omega \to \hat\Omega\colon \hat\omega \mapsto \hat\omega \circ z.$$

Both of these new shift operators will be denoted by $z^{-1}$. The reason for this notation will become clear later. Now it is easy to verify:

(10.4) PROPOSITION. If $f$ is constant, so is $\hat f$.

PROOF. We apply the definitions in suitable sequence:

$$\begin{aligned}
\hat f(z^{-1}\cdot\hat\gamma)(\omega) &= (z^{-1}\cdot\hat\gamma)(f(\omega)) && (\text{def. of } \hat f),\\
&= \hat\gamma(z\cdot f(\omega)) && (\text{def. of } z^{-1} \text{ on } \hat\Gamma),\\
&= \hat\gamma(f(z\cdot\omega)) && (f \text{ is constant}),\\
&= \hat f(\hat\gamma)(z\cdot\omega) && (\text{def. of } \hat f),\\
&= (z^{-1}\cdot\hat f(\hat\gamma))(\omega) && (\text{def. of } z^{-1} \text{ on } \hat\Omega),
\end{aligned}$$

and so we see that $\hat f$ commutes with $z^{-1}$ whenever $f$ commutes with $z$. □

At this stage, we cannot as yet view $\hat f$ as the input/output map of a dynamical system because concatenation is not yet defined on $\hat\Gamma$ and therefore $\hat\Gamma$ is not yet a properly defined "input set". In other words, it is necessary to check that the notion of time is also inherited under dualization. In general, this does not appear to be possible without some strong limitation on the class $\hat\Gamma$. Here we shall look only at the simplest

(10.5) HYPOTHESIS. Every function $\hat\gamma$ in $\hat\Gamma$ satisfies the finiteness condition: There is an integer $|\hat\gamma|$ (dependent on $\hat\gamma$) such that for all $\gamma, \delta \in \Gamma$ the condition

$$\gamma_k = \delta_k, \qquad k = 1, \ldots, |\hat\gamma|$$

implies $\hat\gamma(\gamma) = \hat\gamma(\delta)$.

In other words, we assume that the value of each $\hat\gamma$ at $\gamma$ is uniquely determined by some finite portion of the output sequence $\gamma$.

Assuming (10.5), it is immediate that $\hat\Gamma$ admits a concatenation multiplication which corresponds (at least intuitively) to the usual one defined on $\Omega$:

(10.6)

We can now prove the expected theorem, which may be regarded as the precise form of the "duality" principle:

(10.7) THEOREM. Let $f$ be an arbitrary constant input/output map and $\hat f$ its dual. Suppose further that (10.5) holds. Then each observable costate of $f$ (relative to $\hat\Gamma$ satisfying (10.5)) may be viewed as a reachable state of $\hat f$, and conversely.

PROOF. First we determine the Nerode equivalence classes on $\hat\Gamma$ induced by $\hat f$. By definition


'"€ E P. '"

for all

Now "r is linear

f

the definition of

and

(!);

in fact, direct use of

(10.6) gives (50f)(W), wEn.

So $\hat\gamma \circ f$ and $\hat\delta \circ f$ are equal as elements of $\hat\Omega$: they define the same observable costate. In fancier language, the assignment

(10.8)

is well defined and constitutes a bijection between the reachable states of $\hat f$ and those costates of $f$ which are observable relative to the function class $\hat\Gamma$. □

Thus (10.5) is a sufficient condition for the duality principle to hold.

However, the fact that the canonical realization of $\hat f$ is completely reachable is not quite the same as saying that the canonical realization of $f$ is completely observable because the latter depends on the choice of $\hat\Gamma$ and therefore is not an intrinsic property of $f$. Moreover, Theorem (10.7) does not give any indication how "big" $\hat X$ is and it may certainly happen that the observability problem for $f$ is much more difficult than the reachability problem. These matters will be illustrated later by some examples.

Now we deduce the original form of the duality principle from Theorem (10.7). The essential point is that (10.5) holds automatically as a result of linearity.

New definition of the function class: let the circumflex denote the class of all $K$-linear functions. (All the underlying sets are $K$-vector spaces, so the definition makes sense.)

The following facts are well known:

(10.9) PROPOSITION. Let $*$ denote duality in the sense of $K$-vector spaces. Then:

$$\hat\Gamma \triangleq (K^p[[z^{-1}]])^* \cong K^p[z^{-1}], \qquad \hat\Omega \triangleq (K^m[z])^* \cong K^m[[z]].$$

Now we can state the

(10.10) MAIN THEOREM. Suppose that $f$ is $K$-linear, constant, and finite-dimensional. Suppose further that the circumflex denotes $K$-linear duality. Then:

(i) $\hat f$ is $K$-linear and constant, that is, a $K[z^{-1}]$-homomorphism (and therefore written as $f^*$).

(ii) The reachable states of $f^*$ are isomorphic with the $K$-linear dual of $X_f$; hence every costate of $f$ is observable.

PROOF. The fact that $f$ is $K$-linear implies, by (10.3), that $f^*$ is $K$-linear; the constancy of $f$ always implies that of $\hat f$ by Proposition (10.4). (Caution: $f^*$ is not the $K[z]$-linear dual of the $K[z]$-homomorphism $f$, and the construction given here cannot be simplified. See Remark (4.26A).)

To prove the second part, we note that by Proposition (10.9) Hypothesis (10.5) holds and thus $\hat f = f^*$ is a well-defined input/output map of a dynamical system. We must prove that the reachable states of $f^*$ are isomorphic with $X_f^*$, the $K$-linear dual of $X_f$. This amounts to proving that the $K$-vector space of functions

is isomorphic with the $K$-vector space $X_f^*$.

It suffices to prove

that the K-vector space generated by the K-linear functions (10.1l) is isomorphic with Then by

x f

0, 1, •••

i

= 0,

X;.

Suppose that, for fixed

and x,

j

1, ••• , m]

'"

every

A(x)

= O.

by definition of the Nerode equivalence relation induced

(recall here the discussion from Section

3).

X is f finite-dimensional by hypothesis, it follows from this property of

the functions

(A)

that they generate

X* f•

Since

Obviously,

din:

x;

=

so that everything is proved.

[J

In other terms, the fact that

f

with the appropriate definition of

A

is a

f

K[z-l]-homomorphism.

= K[z]-homomorphism

t

=-

k

Since (10.5) holds, we can interpret

due to input

Y

In fact, we have

y(y)

f(y)( m), (Yof)(w) , ~(f(Y)(-

k»(Wk).

the output of the dual

is given by the assignment

which is a linear function defined on the sequence.

together

implies that

in a system-theoretic 'iay, as follc~s:

system at

dim Xf'

k-th

term of the input


(10.12) REMARK. It is essentially a consequence of Proposition (10.9) that $\hat f$ turns out to be the same kind of algebraic object as $f$. Note, however, that under duality the input and output terminals are interchanged and $t$ is replaced by $-t$ (hence $z$ by $z^{-1}$). In terms of the pictorial definition of a system, this statement simply amounts to "reversing the directions of the arrows", which is the "right" way to define duality in the most general mathematical context, namely in category theory. We would expect that the duality principles of system theory will eventually become a part of this very general duality theory. This has not happened yet because the correct categories to be considered in the study of dynamical systems have not yet been determined. It is likely that eventually many different categories will have to be looked at in studying dynamical problems.

We shall now present an example which should help to interpret the previous results. We emphasize, however, that the theory sketched here is still in a very rudimentary form.

(10.13)

EXAMPLE. Consider the system $\Sigma$ defined by

$$x(t+1) = 2x(t) + u(t), \qquad y(t) = \begin{cases} 1 & \text{if } 1/2 \le x(t) < 1, \\ 0 & \text{if } 0 \le x(t) < 1/2, \end{cases} \qquad t \in \mathbf{Z};$$

with $X = U = Y = \mathbf{R} \bmod 1$, i.e., the interval $[0, 1)$. (1 is to be thought of as identified with 0.) We let $u(t) = 0$. We view $x$ through its binary representation

$$x = \sum_{k \ge 1} g_k(x)\, 2^{-k}, \qquad g_k(x) = 0 \text{ or } 1.$$

It is clear from the definition of the system that the output sequence due to any $x$ is precisely $(g_1(x), g_2(x), \ldots)$. If $x$ is irrational, infinitely many terms are needed to identify it. Consequently, the $x$'s are isomorphic with the Nerode equivalence classes induced by $f_\Sigma$. So $\Sigma$ cannot be reduced.

Relative to "circumflex = functions", every costate of $f_\Sigma$ is observable, provided that Hypothesis (10.5) is not satisfied. If it is, then only those costates defined on fixed-length rationals are observable (more precisely, these are functions which depend only on a fixed finite subset of the $g_k(x)$'s). Thus: either $\hat f$ does not define a dynamical system or not all costates are observable.

Now let us replace the set $[0, 1)$ by its intersection with the rationals. It is clear that there is now a finite algorithm for determining $x$: we simply apply the results of partial realization theory of the previous section. (We take $K = \mathbf{Z}_2$ and the problem is to express $x$ from $(g_1(x), g_2(x), \ldots)$ as a ratio of polynomials in $\mathbf{Z}_2[z]$, which is always possible since each $x$ is rational.) However, $x$ is not "effectively computable" in the strict sense since there is no way of knowing when the algorithm has stopped. In other words, given an arbitrary costate $\hat x$ there exists no rule $\hat\gamma_{\hat x}$ such that the application of $\hat\gamma_{\hat x}$ to $\gamma_x$ gives $\hat x(x)$ for all $x$. On the other hand, substituting into $\hat x$ the results of the partial-realization algorithm will give an approximation to the value of $\hat x(x)$ which always converges in a finite (but a priori unknown) number of steps as more values of the output sequence are observed. In short, the costate-determination algorithm has certain pseudo-random elements in it and therefore cannot be described through the machinery of deterministic dynamical systems. (Is there some relation here to the conceptual difficulties of Quantum Mechanics?)
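The behaviour of the system in Example (10.13) with zero input can be simulated exactly for rational initial states. The sketch below assumes the reading of the example given above (state update $x \mapsto 2x \bmod 1$, output 1 exactly when $x \ge 1/2$); the function names are illustrative and are not taken from the text. It confirms, for the sample states chosen, that the output sequence coincides with the binary digits of $x$.

```python
from fractions import Fraction


def output_sequence(x0, steps):
    """Outputs y(0), ..., y(steps-1) of x(t+1) = 2 x(t) mod 1 with u = 0."""
    x = Fraction(x0) % 1
    ys = []
    for _ in range(steps):
        ys.append(1 if x >= Fraction(1, 2) else 0)  # read off the leading binary digit
        x = (2 * x) % 1                             # shift the binary expansion left
    return ys


def binary_digits(x0, steps):
    """First `steps` binary digits g_1, g_2, ... of a rational x0 in [0, 1)."""
    x = Fraction(x0) % 1
    return [int(x * 2 ** k) % 2 for k in range(1, steps + 1)]


five_eighths = output_sequence(Fraction(5, 8), 4)   # 5/8 = 0.101 in binary
one_third = output_sequence(Fraction(1, 3), 6)      # 1/3 = 0.010101... in binary
```

For a rational state with a terminating expansion, such as 5/8, the outputs are eventually zero; for 1/3 they are periodic. An observer who has seen only finitely many outputs cannot tell which case applies, which is the difficulty the example describes.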


11. HISTORICAL COMMENTS

It is not an exaggeration to say that the entire theory of linear, constant (and here, discrete-time) dynamical systems can be viewed as a systematic development of the equivalent algebraic conditions (2.8) and (2.15). Of course, the use of modules (over $K[z]$) to study a constant square matrix (see (4.13)) has been "standard" since the 1920's under the influence of E. NOETHER and especially after the publication of the Modern Algebra of VAN DER WAERDEN. Condition (2.15), by itself, must also be quite old. For instance, GANTMAKHER [1959, Vol. 1, p. 203] attributes to KRYLOV [1931] the idea of computing the characteristic polynomial of a square matrix $A$ by choosing a random vector $b$ and computing successively $b, Ab, A^2b, \ldots$ until linear dependence is obtained, which yields the coefficients of $\det(zI - A)$. (The method will succeed iff $X_A$ is cyclic with generator $b$.) However, the merger of (4.13) with (2.15), which is the essential idea in the algebraic theory of linear systems, was done explicitly first in KALMAN [1965b].

We shall direct our remarks here mainly to the history of conditions (2.8) and (2.15) as related to controllability. See also earlier comments in KALMAN [1960c, pp. 481, 483, 484] and in KALMAN, HO, and NARENDRA [1963, pp. 210-212]. We will have to bear in mind that the development of modern control theory cannot be separated from the development of the concept of controllability; moreover, the technological problems of the 1950's and even earlier had a major influence on the genesis of mathematical ideas (just as the latter have led to many new technological applications of control in the 1960's).
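The computation attributed above to KRYLOV can be sketched in a few lines. This is only an illustration in exact rational arithmetic, with hypothetical function names; it returns the monic polynomial of least degree that annihilates $b$ under $A$, which coincides with $\det(zI - A)$ exactly when the vectors $b, Ab, \ldots, A^{n-1}b$ are independent.

```python
from fractions import Fraction


def matvec(A, v):
    """Multiply the matrix A (list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def solve_combination(vectors, target):
    """Coefficients c with sum(c[i] * vectors[i]) == target, or None if none exist.

    The vectors are assumed to be linearly independent.
    """
    n, k = len(target), len(vectors)
    aug = [[Fraction(vectors[j][i]) for j in range(k)] + [Fraction(target[i])]
           for i in range(n)]
    row = 0
    for col in range(k):
        p = next((r for r in range(row, n) if aug[r][col] != 0), None)
        if p is None:
            return None
        aug[row], aug[p] = aug[p], aug[row]
        piv = aug[row][col]
        aug[row] = [a / piv for a in aug[row]]
        for r in range(n):
            if r != row and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[row])]
        row += 1
    if any(aug[r][k] != 0 for r in range(row, n)):
        return None  # target is not in the span of the vectors
    return [aug[i][k] for i in range(k)]


def krylov_polynomial(A, b):
    """Ascending coefficients of the monic polynomial p of least degree with p(A) b = 0."""
    if all(x == 0 for x in b):
        raise ValueError("b must be nonzero")
    vectors = [[Fraction(x) for x in b]]
    while True:
        nxt = matvec(A, vectors[-1])
        c = solve_combination(vectors, nxt)
        if c is not None:
            # A^k b = sum c_i A^i b, so z^k - sum c_i z^i annihilates b.
            return [-ci for ci in c] + [Fraction(1)]
        vectors.append(nxt)


cyclic_case = krylov_polynomial([[0, 1], [-2, -3]], [0, 1])   # z^2 + 3z + 2
noncyclic_case = krylov_polynomial([[1, 0], [0, 1]], [1, 0])  # z - 1 only
```

In the second call $A$ is the identity, so no single vector generates the whole space and the procedure stops at degree 1 instead of recovering the characteristic polynomial $(z - 1)^2$.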

The writer developed the mathematical definition of controllability with applications to control theory, during the first part of 1959. (Unpublished course notes at Johns Hopkins University, 1958/59.) These first definitions were in the form of (2.17) and (2.3). Formal presentations of the results were made in Mexico City (September, 1959, see KALMAN [1960b]), University of California at Berkeley (April, 1960, see KALMAN [1960d]), and Moskva (June, 1960, see KALMAN [1960c]), and in scientific lectures on many other concurrent occasions in the U.S. As far as the writer is aware, a conscious and explicit definition of controllability which combines a control-theoretic wording with a precise mathematical criterion was first given in the above references. There are of course many instances of similar ideas arising in related contexts. Perhaps the comments below can be used as the starting point of a more detailed examination of the situation in a seminar in the history of ideas.

The following is the chain of the writer's own ideas culminating in the publications mentioned above:

(1) In KALMAN [1954] it is pointed out (using transform methods) that continuous-time linear systems can be controlled by a linear discrete-time (sampled-data) controller in finite time.*

*It is sometimes claimed in the mathematical literature of optimal control theory that this cannot be done with a linear system. This is false; the correct statement is "cannot be done with a linear controller producing control functions which are continuous (and not merely piecewise continuous!) in time." Such a restriction is completely irrelevant from the technological point of view. As a matter of fact, computer-controlled systems have been proposed and built for many years on the basis of linear, time-optimal control.

(2) Transposing the result of KALMAN [1954] from transfer functions to state variables, an algorithm was sketched for the solution of the discrete-time time-optimal control of systems with bounded control and linear continuous-time dynamics. [KALMAN, 1957]

(3) As a popularization of the results of the preceding work, the same technique was applied to give a general method for the design of linear sampled-data systems by KALMAN and BERTRAM [1958].

Some background comments concerning these papers are appropriate:

(1) The ideas and method presented in KALMAN [1954] descend directly from earlier (and very well known) engineering research on time-optimal control. (The main references in KALMAN [1954] are: McDONALD [1950], HOPKIN [1951], BOGNER and KAZDA [1954], as well as a research report included in KALMAN [1955].) Although the results of KALMAN [1954] on linear time-optimal control were considered to be new when published, it became clear later that similar ideas were at least implicit in OLDENBOURG and SARTORIUS [1951, §90, p. 219] and in TSYPKIN's work in the early 1950's. The engineering idea of nonlinear time-optimal control goes back, at least, to DOLL [1943] and to OLDENBURGER in 1944, although the latter's work was unfortunately not widely known before 1957. During the same time, there was much interest in the same problems in other countries; see, for instance, FELDBAUM [1953] and UTTLEY and HAMMOND [1953]. Mathematical work in these problems probably began with BUSHAW's dissertation [1952] in which, to quote from KALMAN [1955, before equation (40)], "... [it was] rigorously proved that the intuition which led to the formulation of the [engineering] theory [quoted above] was indeed correct." TSIEN's survey [1954] contains a lengthy account of this state of affairs and was read by many. We emphasize: none of this extensive literature contains even a hint of the algebraic considerations related to controllability.

(2-3)

The critical insight gained and recorded in KALMAN [1957] is the following: the solution of the discrete-time time-optimal control problem is equivalent to expressing the state as a linear combination of a certain vector sequence (related to control and dynamics) with coefficients bounded by 1 in absolute value, the coefficients being the values of the optimal control sequence. The linear independence of the first $n$ vectors of the sequence guarantees that every point in a neighborhood of zero can be moved to the origin in at most $n$ steps (hence the terminology of "complete controllability"); and the condition for this is identical with (2.17) (stated in KALMAN [1957] and KALMAN and BERTRAM [1958] only for the case $\det F \neq 0$ and $m = 1$). A thorough discussion of these matters is found in KALMAN [1960c; see especially Theorem I, p. 485]. A serious conceptual error in KALMAN [1957] occurred, however, in that complete controllability was not assumed as a hypothesis for the existence of time-optimal control law, but an attempt was made to show that the controllability is almost always complete [Lemma 1]. In fact, this lemma is true, with a small technical modification in the condition. Only much later did it become clear (see the discussion of Theorem D in the Introduction), however, that a dynamical system is always completely controllable (in the nonconstant case, completely reachable) if it is derived from an external description. It was this difficulty, very mysterious in 1957, which led to the development of a formal machinery for the definition of controllability during the next two years. The changing point of view is already apparent in KALMAN and BERTRAM [1958]; the unpublished paper promised there was delayed precisely because the algebraic machinery to prove Theorem D was out of reach in 1957-8. Consult also the findings of the bibliographer RUDOLF [1969].
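The insight described above, that steering the state to the origin amounts to expressing it as a linear combination of a vector sequence, can be illustrated with a small exact computation. The sketch assumes a single-input system $x(t+1) = Fx(t) + g\,u(t)$, for which reaching the origin in $n$ steps means $F^n x_0 + \sum_k F^{n-1-k} g\, u(k) = 0$. The function names and the sample system are hypothetical, and the code neither enforces the bound $|u(k)| \le 1$ nor treats time-optimality; it only solves for the coefficients.

```python
from fractions import Fraction


def matvec(M, v):
    """Multiply the matrix M (list of rows) by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def solve(M, rhs):
    """Solve the square system M u = rhs exactly; raise ValueError if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in M[i]] + [Fraction(rhs[i])] for i in range(n)]
    for col in range(n):
        p = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if p is None:
            raise ValueError("the vectors g, Fg, ... are linearly dependent")
        aug[col], aug[p] = aug[p], aug[col]
        piv = aug[col][col]
        aug[col] = [a / piv for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def deadbeat_controls(F, g, x0):
    """Controls u(0), ..., u(n-1) that bring x0 to the origin in n steps."""
    n = len(x0)
    cols = [[Fraction(x) for x in g]]
    for _ in range(n - 1):
        cols.append(matvec(F, cols[-1]))
    cols.reverse()                      # cols[k] = F^(n-1-k) g multiplies u(k)
    target = [Fraction(x) for x in x0]
    for _ in range(n):
        target = matvec(F, target)      # F^n x0
    M = [[cols[k][i] for k in range(n)] for i in range(n)]
    return solve(M, [-t for t in target])


def simulate(F, g, x0, controls):
    """Apply the control sequence and return the final state."""
    x = [Fraction(v) for v in x0]
    for u in controls:
        x = [a + b * u for a, b in zip(matvec(F, x), g)]
    return x


F = [[1, 1], [0, 1]]
g = [0, 1]
x0 = [1, 0]
controls = deadbeat_controls(F, g, x0)
final_state = simulate(F, g, x0, controls)
```

For this sample system the computed controls happen to satisfy $|u(k)| \le 1$; for initial states farther from the origin they would not, and more than $n$ steps would be needed under a bounded control.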

IN SUMMARY: under the stimulation of the engineering problems of minimal-time optimal control, the researches begun by KALMAN [1954, 1957] and KALMAN and BERTRAM [1958] eventually evolved into what has come to be called the mathematical theory of controllability (of linear systems).

Beginning about 1955, and stimulated by the same engineering problems, PONTRYAGIN and his school in the USSR developed their mathematical theory of optimal control around the celebrated "Maximum Principle". (They were well aware of the survey of TSIEN [1954] mentioned above and referenced it both in English and in the Russian translation of 1956.) We now know that any theory of control, regardless of its particular mathematical style, must contain ingredients related to controllability. So it is interesting to examine how explicitly the controllability condition appears in the work of PONTRYAGIN and related research.

GAMKRELIDZE [1957, §2; 1958, §1, §2] calls the time optimal control problem associated with the system

(11.1) $dx/dt = Ax + bu(t)$

"nondegenerate" iff $b$ is not contained in a proper $A$-invariant subspace of $R^n$. He notes immediately that this is equivalent to

(11.2) $\det(b, Ab, \ldots, A^{n-1}b) \neq 0$

(i.e., the special case of (2.8) for $m = 1$). He then proves: in the "degenerate" case the problem either reduces to a simpler one or the motion cannot be influenced by the control function $u(\cdot)$.

This is very close to an explicit definition of controllability. However, in discussing the general case $m > 1$, GAMKRELIDZE [1958, §3, Section 1] defines "nondegeneracy" of the system

(11.3) $dx/dt = Ax + Bu(t)$

as the condition

(11.4) $\det(b_i, Ab_i, \ldots, A^{n-1}b_i) \neq 0$ for every column $b_i \in B$,

but he does not show that this generalized condition of "nondegeneracy" for (11.3) inherits the interesting characterization proved for "nondegeneracy" in the case of (11.1). In fact, condition (11.4) is much too strong to prove this; the correct condition is (2.8), that is, complete controllability. In other words, in GAMKRELIDZE's work (11.4) plays the role of a technical condition for eliminating "degeneracy" (actually, lack of uniqueness) from a particular optimal control problem and is not explicitly related to the more basic notion of complete controllability. Neither GAMKRELIDZE nor PONTRYAGIN [1958] give an interpretation of (11.4) as a property of the dynamical system (11.3), but employ (11.4) only in relation to the particular problem of time-optimal control. See also KALMAN [1960c, p. 484].
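The gap between the column-by-column condition (11.4) and complete controllability can be seen on a two-dimensional example. The sketch below assumes that (2.8), which is not reproduced in this section, is the usual rank condition $\operatorname{rank}(B, AB, \ldots, A^{n-1}B) = n$; the function names and the sample matrices are illustrative only.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of the matrices A and B, both given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rk = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][c] / m[rk][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def controllability_matrix(A, B):
    """The block matrix (B, AB, ..., A^(n-1) B)."""
    n = len(A)
    blocks, block = [], [list(row) for row in B]
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def is_completely_controllable(A, B):
    """Rank test, assumed here to be condition (2.8)."""
    return rank(controllability_matrix(A, B)) == len(A)


def satisfies_columnwise_condition(A, B):
    """Condition (11.4): the determinant test holds for every single column of B."""
    n, m = len(A), len(B[0])
    return all(is_completely_controllable(A, [[B[i][j]] for i in range(n)])
               for j in range(m))


A = [[1, 0], [0, 1]]
B = [[1, 0], [0, 1]]
controllable = is_completely_controllable(A, B)
columnwise = satisfies_columnwise_condition(A, B)
```

With $A$ and $B$ both equal to the identity, the pair is completely controllable in the sense of the rank test, yet neither column of $B$ alone satisfies the determinant condition, so (11.4) fails.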

A similar point of view is taken by LASALLE [1960]; he calls a dynamical system (11.3) satisfying (2.8) "proper" but then goes on to require (11.4) (to assure the uniqueness of the time-optimal controls) and calls such systems "normal".

The assumption of some kind of "nondegeneracy" condition was apparently unavoidable in the early phases of research on the time-optimal control problem. For example, ROSE [1953, pp. 39-58] examines this problem for (11.1); by defining "nondegeneracy" [p. 41] by a condition equivalent to (11.2), he obtains most of GAMKRELIDZE's results in the special case when $A$ has real eigenvalues [Theorem 12]. ROSE uses determinants closely related to the now familiar lemmas in controllability theory but he, too, fails to formulate controllability as a concept independent of the time-optimal control problem.

A similar situation exists in the calculus of variations. The so-called Caratheodory classes (after CARATHEODORY [1933]) correspond to a kind of classification of controllability properties of nonconstant systems. In fact, the standard notion of a normal family of extremals of the calculus of variations is closely related to condition (11.4), suitably generalized via (2.5) to nonconstant systems.* Normality is used in the calculus of variations mainly as a "nondegeneracy" condition.

*The use of the word "normal" by LASALLE [1960] for (11.4) is only accidentally coincident with the earlier use of "normal" in the calculus of variations.

It is important to note that the "nondegeneracy" conditions employed in optimal control and the calculus of variations play mainly the role of eliminating annoying technicalities and simplifying proofs. With suitable formulation, however, the basic results of time-optimal control theory continue to hold without the assumption of complete controllability. The same is not true, however, of the four kinds of theorems mentioned in the Introduction, and therefore these results are more relevant to the story of controllability than the time-optimal control discussed above.

There is a considerable body of literature relevant to controllability theory which is quite independent of control theory.

For instance, the treatment of a reachability condition in partial differential equations goes back at least to CHOW [1940] but perhaps it is fairer to attribute it to Caratheodory's well-known approach to entropy via the nonintegrability condition. The current status of these ideas as related to controllability is reviewed by WEISS [1969, Section 9]. An independent and very explicit study of reachability is due to ROXIN [1960]; unfortunately, his examples were purely geometric and therefore the paper did not help in clarifying the celebrated condition (2.8). The Wronskian determinant of the classical theory of ordinary differential equations with variable coefficients also has intersections with controllability theory, as pointed out recently with considerable success by SILVERMAN [1966].

Many problems in control theory were misunderstood or even incorrectly solved before the advent of controllability theory. Some of these are mentioned in KALMAN [1963b, Section 9]. For relations with automata theory, see ARBIB [1965].

Let us conclude by stating the writer's own current position as to the significance of controllability as a subject in mathematics:

(1) Controllability is basically an algebraic concept. (This claim applies of course also to the nonlinear controllability results obtained via the Pfaffian method.)

(2) The historical development of controllability was heavily influenced by the interest prevailing in the 1950's in optimal control theory. Ultimately, however, controllability is seen as a relatively minor component of that theory.

(3) Controllability as a conceptual tool is indispensable in the discussion of the relationship between transfer functions and differential equations and in questions relating to the four theorems of the Introduction.

(4) The chief current problem in controllability theory is the extension to more elaborate algebraic structures.

For a survey of the historical background of observability, which would take us too far afield here, the reader should consult KALMAN [1969b].

12. REFERENCES

Section A: General References

M. A. ARBIB

A common framework for automata theory and control theory, SIAM J. Contr., 2:206-222. C. W. CURTIS and 1. REINER Representation Theory of Finite Groups and Associative Algebras, Interscience-Wiley. E. M. DAY and A. D. WALIACE [1967]

Multiplication induced in the state space of an act, Math. System Theory; 1:305-314.

C. A. DESOER and P. VARAIYA [1967]

The minimal realization of a nonanticipative impulse response matrix, SIAM J. Appl. Math., 15:754-764.

E. G. GILBERT

Controllability and observability in multivariable control systems, SIAM J. Control, 1:128-151.

B. L. HO and R. E. KALMAN [1966]

Effective construction of linear state-variable models from input/output functions, Regelungstechnik, 14:545-548.

The realization of linear, constant input/output maps, I. Complete realizations, SIAM J. Contr., to appear.

S. T. HU [1965 ]

Elements of Modern Algebra, Holden-Day.

R. E. KALMAN

[1960a]

A new approach to linear filtering and prediction problems, J. Basic Engr. (Trans. ASME), 82D:35-45.

[1960b]

Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, L:I02-119.

[1960c]

On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London. Canonical structure of linear dynamical systems, Proc. Nat. Acad. of Sci. (USA), 48:596-600. New methods in Wiener filtering theory, Proc. 1st Symp. on Engineering Applications of Random Function Theory and Probability, Purdue University, November 1960, pp 270-388, Wiley. (Abridged from RIAS Technical Report 61-1.)

[1963b]

Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.

[1965a]

Irreducible realizations and the degree of a rational matrix, SIAM J. Contr., 13:520-544. Algebraic structure of linear dynamical systems. I. The Module of E, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.

[1967]

[1969a]

On multilinear machines, J. Compo and System Sci., to appear.

[1969b]

Dynamic Prediction and Filtering Theory, Springer, to appear.

[1969c]

On partial realizations of a linear input/output map, Guillemin Anniversary Volume, Holt, Winston and Rinehart.

[1970a]

Observability in multilinear systems, to appear.

[1970b]

The realization of linear, constant, input/output maps. II. Partial realizations, SIAM J. Control, to appear.

R. E. KALMAN and R. S. BUCY

New results in linear prediction and filtering theory, J. Basic Engr. (Trans. ASME, Ser. D), 83D:95-100.

R. E. KALMAN, P. L. FALB and M. A. ARBIB

Topics in Mathematical System Theory, McGraw-Hill.

R. E. KALMAN, Y. C. HO and K. NARENDRA [1963]

Controllability of linear dynamical systems, Contr. to Diff. Equations, 1:189-213.

C. E. LANGENHOP

[1964]

On the stabilization of linear systems, Proc. Am. Math. Soc., 15:735-742.

S. LANG [1965]

Algebra, Addison-Wesley.

S. MAC LANE

[1963]

Homology, Springer.

L. A. MARKUS

[1965 ]

Controllability of nonlinear processes, SIAM J. Control, J:78-9O.

E. F. MOORE [1956]

Gedanken-experiments on sequential machines, in Automata Studies, C. E. Shannon and J. McCarthy (eds.), pp. 129-153, Princeton University Press.

P. MUTH [1899]

Theorie und Anwendung der Elementarthei1er, Teubner, Leipzig.

A. NERODE [1958]

Linear automaton transformations, Proc. Amer. Math. Soc.,

2:5 41-544 .

L. SILVERMAN [1966]

Representation and realization of time-variable linear systems, Doctoral dissertation, Columbia University.

L. M. SILVERMAN and H. E. MEADOWS

Equivalent realizations of linear systems, SIAM J. Control, to appear.

H. WEBER [1898]

Lehrbuch der Algebra, Vol. 1, 2nd Edition, reprinted by Chelsea, New York.

L. WEISS Lectures on Controllability and Observability, C.I.M.E . Seminar. L. WEISS and R. E. KALMAN [1965 ]

Contributions to linear system theory, Intern. J. Engr. ScL, J:141-171.

W. M. WONHAM [1967]

On pole assignment in multi-input controllable linear systems, IEEE Trans. Auto. Contr., AC-12:6oo-665.


A. M. YAGLOM

An Introduction to the Theory of Stationary Random Functions, Prentice-Hall.

D. C. YOUIA

[1966]

The synthesis of linear dynamical systems from prescribed weighting patterns, SIAM J. Appl. Math., 14:527-549.

D. C. YOUIA and P. TISSI

[1966]

n-port synthesis via reactance extraction, Part I, IEEE Intern. Convention Record.

O. ZARISKI and P. SAMUEL [1958]

Commutative Algebra, Vol. 1, Van Nostrand.

Section B: References for Section 11

M. A. ARBIB

A common framework for automata theory and control theory, ~IAM.J. Contr., 1:206-222. I. BOGNER and L. F. KAZDA

[1954 ]

An investigation of the switching criteria for higher order contactor servomechanisms, Trans. AlEE, 11 11:118-127.

D. W. BUSHAW Differential equations with a discontinuous forcing term, doctoral dissertation, Princeton University.

[1952]

C. CARATHEODORY [1933 ]

Uber die Einteilung der Variationsprobleme von Lagrange nach Klassen, Comm, Mat. Relv., 2:1-19.

W. L. CHOW

[1940]

Uber Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Annalen, :98-105.

H. G. DOLL

[ 1943]

Automatic control system for vehicles, US Patent 2,463,362.

A. A. FELDBAUM

[1953 ]

Avtomatika i Telemekhanika, 14 :712-728.

R. V. GAMKRELIDZE

[1957]

On the theory of optimal processes in linear systems (in Russian), Dokl. Akad. Nauk SSSR, 116:9-11.

[1958]

The theory of optimal processes in linear systems (in Russian), Izvestia Akad. Nauk SSSR, 22:449-474.

F. R. GANTMAKHER

[1959]

The Theory of Matrices, 2 vols., Chelsea.


A. M. HOPKIN [1951]

A phase-plane approach to the compensation of saturating servomechanisms, Trans. AIEE, 70:631-639.

R. E. KALMAN

[1954]

Discussion of a paper by Bergen and Ragazzini, Trans. AIEE, 73 II:245-246.

[1955]

Analysis and design principles of second and higher-order saturating servomechanisms, Trans. AIEE, 74 II:294-310.

[1957]

Optimal nonlinear control of saturating systems by intermittent control, IRE WESCON Convention Record, 1, IV:130-135.

[1960b]

Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5:102-119.

[1960c]

On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London.

Lecture notes on control system theory (by M. Athans and G. Lendaris), Univ. of Calif. at Berkeley.

[1963b]

Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.

[1965b]

Algebraic structure of linear dynamical systems. I. The Module of Σ, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.

[1969b]

Dynamic Prediction and Filtering Theory, Springer, to appear.

R. E. KALMAN and J. E. BERTRAM [1958]

General synthesis procedure for computer control of single and multi-loop linear systems, Trans. AIEE, 77 II:

R. E. KALMAN, [1963]

... $> 0$ for almost all $\tau \in P \subset [0, T]$, then ... almost everywhere in $P$; ... for almost all $\tau \in R \subset [0, T]$, then $\operatorname{grad}_\lambda \Phi(x, \lambda) = 0$ almost everywhere in $R$, where $P$, $R$ = arbitrary sets of positive measure of the interval $[0, T]$.

The conditions (3), (4) (and (7), (8)) can also be written in the gradient form:

R. Kulikowski

III) if $\operatorname{grad}_\lambda \Phi(x, \lambda) = G(x) > 0$ for almost all $\tau \in P \subset [0, T]$, then $\lambda(\tau) = 0$ almost everywhere in $P$,

IV) if $\lambda(\tau) > 0$ for almost all $\tau \in R \subset [0, T]$, then $\operatorname{grad}_\lambda \Phi(x, \lambda) = 0$ almost everywhere in $R$.

Conditions III-IV have a simple physical interpretation: at the points where the constraints are not active the Lagrange function vanishes, and when the Lagrange function is positive the constraints must be active.
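The complementary behaviour described by conditions III-IV can be checked on a small finite-dimensional case. The sketch below is only an illustration under stated assumptions: it assumes a Lagrange function of the form $\Phi(x, \lambda) = F(x) + \lambda G(x)$ (the form used in the lectures is not visible in this excerpt) and a hypothetical one-dimensional problem, maximize $F(x) = -(x - a)^2$ subject to $G(x) = b - x \ge 0$. The names `solve_1d` and `slackness` are invented for the example.

```python
def solve_1d(a, b):
    """Maximize F(x) = -(x - a)**2 subject to G(x) = b - x >= 0.

    Returns (x_opt, lam), where lam is the multiplier of the ASSUMED
    Lagrange function Phi(x, lam) = F(x) + lam * G(x).
    """
    if a <= b:
        # The unconstrained maximizer is admissible: constraint not active.
        return a, 0.0
    # Otherwise the maximizer lies on the boundary G(x) = 0, i.e. x = b.
    # Stationarity dPhi/dx = -2 (x - a) - lam = 0 at x = b gives lam.
    return b, 2.0 * (a - b)


def slackness(a, b):
    """Return (G(x_opt), lam); at most one of the two is nonzero."""
    x_opt, lam = solve_1d(a, b)
    return b - x_opt, lam
```

With `slackness(2.0, 3.0)` the constraint is inactive and the multiplier is zero (condition III); with `slackness(2.0, 1.0)` the multiplier is positive and the constraint is active (condition IV).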

It is also possible to write down the gradient form of conditions of optimality in the case of minimalization of $F(x)$.

When the operator $G(x)$ is acting into the space of continuous functions, i.e. $G: X \to C[0, T]$, one can write

$\lambda[G(x)] = \int_0^T G[x]\, d\lambda(\tau)$,

where $\lambda(\tau)$ is a non-decreasing function, which can also be written as $d\lambda(\tau) \ge 0$ (that notation means that the increase of $\lambda(\tau)$ in a small vicinity of $\tau$ is non-negative). When $\lambda(\tau)$ is differentiable, $d\lambda(\tau) = \lambda'(\tau)\, d\tau$; in general $\lambda'(\tau)$ is to be understood as a distribution, with $\delta$-functions at the discontinuity points of $\lambda(\tau)$.

The condition (3) in the present case can be written as

I) if $\operatorname{grad}_\lambda \Phi(x, \lambda) = G[x] > 0$ for $\tau \in P \subset [0, T]$, then $d\lambda(\tau) = 0$ ($\lambda(\tau) = \text{const}$) at the point $\tau$ or in the vicinity of $\tau$, $\tau \in P$,

II) if $d\lambda(\tau) > 0$, $\tau \in P \subset [0, T]$, then $G[x] = 0$, $\tau \in P$.

It is also possible to write the gradient form of optimality conditions for the remaining formulae (1)-(4), (5)-(8), (10)-(13). We shall leave that as an exercise for the readers.

The conditions (5)-(8) can be called the quasi-saddle-point conditions and they can be treated as a generalization of the well known Kuhn-Tucker differential conditions of optimality, which were formulated originally for nonlinear functions in finite-dimensional spaces.

When an optimization problem is being solved it is also important to know that the functions $x(t)$ which do not satisfy the conditions (1)-(4) or (5)-(8) cannot be optimal. That requires proving that the conditions (1)-(4) are also necessary for optimum.

In order to prove the necessary conditions we shall impose certain regularity conditions on $G$ but shall not assume any more that $F$ and $G$ are concave.

We shall call $x_0 \in X$ a regular point of $G$ if for every admissible variation $x$ at the point $x_0$, which is defined by the condition

(1) $G[x_0] + dG(x_0, x) \ge 0$,

there exists a curve emanating from $x_0$, tangent to $x$ and lying in the set $\Omega$ of admissible solutions. By a curve in the Banach space we understand, generally speaking, a function $\gamma$ of real variable $s$ with the range in $X$, i.e. $\gamma: R \to X$. According to the definition that function should satisfy the following conditions:

(2) $\gamma(s) \in \Omega$, $\gamma(0) = x_0$, $d\gamma(0, 1) = x$.

We shall show now that if $\bar{x}$ is a maximalizing point for $F(x)$ and a regular point of $G$, then for each admissible variation $x$, i.e.

(3) $G(\bar{x}) + dG(\bar{x}, x) \ge 0$,

we get a non-positive increase of $F$, i.e.

(4) $-dF(\bar{x}, x) \ge 0$.

Indeed, the real function

(5) $f(s) = F[\gamma(s)]$

attains the maximum value for $\gamma(0) = \bar{x}$, i.e. for $s = 0$. Then $df(0, 1) \le 0$. On the other hand, according to the differentiation rule of the compound function (5) and formulae (2) we get

$df(0, 1) = dF[\gamma(0), d\gamma(0, 1)] = dF(\bar{x}, x)$.

Then $dF(\bar{x}, x) \le 0$ and (4) has been proved.

Introducing the notation

$l_1(x) = -dF(\bar{x}, x)$, $g(x) = G(\bar{x}) + dG(\bar{x}, x)$,

the obtained result can be written as:

(6) if $g(x) \ge 0$ then $l_1(x) \ge 0$.

The next step in our reasoning consists in showing that there exists such a functional $\lambda \ge 0$ which will ensure the relation

(7) $l_1(x) = \lambda[L_1(x)]$,

where $L_1(x) = dG(\bar{x}, x)$.

The main obstacle in showing that is the nonlinearity of $g(x)$, which consists of the linear term $L_1(x)$ and the additive term $G(\bar{x})$. In order to overcome that obstacle an auxiliary operator

(8) $L\langle s, x\rangle = \langle s,\ sG(\bar{x}) + dG(\bar{x}, x)\rangle$, $s \in R$,

where $L: R \times X \to R \times Z$, can be introduced. It can be proved that $L$ is a linear operator, i.e.

$L\langle \alpha_1 s_1 + \alpha_2 s_2,\ \alpha_1 x_1 + \alpha_2 x_2\rangle = \alpha_1 L\langle s_1, x_1\rangle + \alpha_2 L\langle s_2, x_2\rangle$

for real numbers $\alpha_1$, $\alpha_2$ and $s_1, s_2 \in R$.

Since the functional $l_1(x)$ can also be treated as defined over $R \times X$, the following notation

$l\langle s, x\rangle = -dF(\bar{x}, x)$

can be introduced. Now we should check whether the condition $L\langle s, x\rangle \ge 0$ implies $l\langle s, x\rangle \ge 0$.

Observe that this condition can be written as

(9) $s \ge 0$ and $sG(\bar{x}) + dG(\bar{x}, x) \ge 0$.

Assuming first of all that $s > 0$ and dividing (9) by $s$ we get

$G(\bar{x}) + dG(\bar{x}, x/s) \ge 0$.

Taking into account (4) we obtain $-dF(\bar{x}, x/s) \ge 0$ and $-dF(\bar{x}, x) \ge 0$, or

(10) $l\langle s, x\rangle \ge 0$

on the set $P$ of pairs $\langle s, x\rangle$, $s > 0$, which satisfy the condition $sG(\bar{x}) + dG(\bar{x}, x) \ge 0$.

Each point $\langle 0, x\rangle$ for which the relation

(11) $dG(\bar{x}, x) \ge 0$

holds can be treated as a limit of a sequence of points from $P$. That sequence can be constructed as follows. Take an element $x_0$ with the property that

(12) $\frac{1}{n} G(\bar{x}) + dG(\bar{x}, \frac{1}{n} x_0) \ge 0$.

Summing up (11) and (12) one gets

$\frac{1}{n} G(\bar{x}) + dG(\bar{x}, \frac{1}{n} x_0 + x) \ge 0$.

Then $\langle \frac{1}{n}, \frac{1}{n} x_0 + x\rangle \in P$. Since

$\langle 0, x\rangle = \lim_{n \to \infty} \langle \frac{1}{n}, \frac{1}{n} x_0 + x\rangle$,

(9) implies (10) for all $s \ge 0$.

Introducing the notation $\langle s, x\rangle = w$, $R \times X = W$, the obtained relation can be written as

(13) if $L(w) \ge 0$ then $l(w) \ge 0$,

where $L$ = linear operator, $L: W \to V = R \times Z$, and $l$ = linear functional over $W$ (an element of the space $W^*$, adjoint to $W$).

We can now return to the main problem, which is the existence of a nonnegative functional $v^*$ satisfying the relation

(14) $l(w) = v^*[L(w)]$.

To solve that problem we shall need a generalization for Banach spaces of the well known Farkas lemma. First of all it is necessary to introduce a few additional notions. We shall denote by $Q$ the set of all functionals of $W^*$ which can be represented in the form

(15) $w^*(w) = v^*[L(w)]$, $v^* \ge 0$,

where $v^* \in V^*$.

As shown in Ref. [7], in order to check whether the regularity condition 3 of theorem 2 holds it is sufficient to show that there exists such an element $w^* \in W$ that

(21) $L(w^*) > 0$.

For example, in the case of the operator (20) one obtains

$L(w) = \left\langle s,\ -\int_0^t \varphi'[\bar{x}(\tau)]\, x(\tau)\, d\tau + s\left[a(t) - \int_0^t \varphi[\bar{x}(\tau)]\, d\tau\right] \right\rangle$, $0 \le t \le T$.

Since

$a(t) - \int_0^t \varphi[\bar{x}(\tau)]\, d\tau \ge 0$,

it can be easily proved that there exists such a pair $\langle s^*, x^*\rangle$, $s^* > 0$ and $x^*(t) > 0$, $t \in [0, T]$, that $L(w^*) > 0$.

It should be noted that theorem 2 generalizes certain theorems of variational calculus in Banach spaces and, in particular, the following theorem of Lusternik.

Theorem 3. Let the functionals $F$, $H$ be strongly differentiable at the point $\bar{x} \in X$ and $\|\operatorname{grad} H(\bar{x})\| > 0$. If $\bar{x}$ is a conditional extremum point of $F(x)$, subject to the constraint $H(x) = c$, $c = H(\bar{x})$, then

(22) $\operatorname{grad} F(\bar{x}) = \mu \operatorname{grad} H(\bar{x})$,

where $\mu$ is a number.

The proof of that theorem is given in Ref. [9].

Consider a controlled linear dynamic system, shown in Fig. 1, having one input $u(t)$ and $n+1$ outputs, which are described by Volterra operators:

(1) $y_i(t) = \int_0^t k_i(t, \tau)\, u(\tau)\, d\tau$,

where $k_i(t, \tau)$ - linearly independent transient functions of the system, $i = 0, 1, \ldots, n$.

A typical optimization problem can be formulated as follows: find the function $u(t) \in L^p[0, T]$ which minimalizes

(2) $\|u\|_p = \left(\int_0^T |u(t)|^p\, dt\right)^{1/p}$, $p \ge 1$,

subject to the constraints

(3) $y_i(T) = \int_0^T k_i(T, \tau)\, u(\tau)\, d\tau = x_i$, $i = 0, 1, \ldots, n$,

where $T$, $x_i$ - given real numbers. In other words, for the given outputs $x_i$ attained at the time $t = T$ it is required to minimalize the control cost (2).

In certain cases additional conditions of the form

(4) $\underline{M} \le u(t) \le \overline{M}$

($\underline{M}$, $\overline{M}$ given numbers) or

(5) $y_j(t) = \int_0^t k_j(t, \tau)\, u(\tau)\, d\tau \le x_j(t)$, $j = 0, 1, \ldots, n$

($x_j(t)$ = given time functions) are being imposed. The constraint (5) is called "restriction of phase coordinates".

There exist, of course, many known optimization techniques, such as variational calculus, maximum principle, dynamic programming etc., which can be applied for the solution of the optimization problems formulated above. In the present section we should like to demonstrate that the method based on theorems 1, 2 of sections 4, 5 is very convenient for the solution of problems including restriction of phase coordinates. Instead of dealing with a general n-dimensional system we shall confine our analysis to a second order system, which is frequently encountered in engineering practice (e.g. in servomechanisms etc.). The result of that analysis will be useful for the investigation of a class of controllability problems.

Example 1. Consider a system described by the differential equations

(6) $\frac{dy_0}{dt} = y_1(t)$, $\frac{dy_1}{dt} = u(t)$,

with zero initial conditions $y_0(0) = y_1(0) = 0$, shown in Fig. 2. It is required to find such a control function $u(t)$ which minimalizes the "energy cost"

(7) $E(u) = \frac{1}{2} \int_0^T [u(\tau)]^2\, d\tau$,

subject to the constraints

(8) $y_0(T) = \int_0^T (T - \tau)\, u(\tau)\, d\tau = x_0$ ($x_0 > 0$),

(9) $y_1(T) = \int_0^T u(\tau)\, d\tau = 0$.

The constraints (8), (9) mean that the deflection of the output coordinate of the system at $t = T$ is equal to $x_0$ and the corresponding velocity of that coordinate at $t = T$ is equal to zero. Constraints of that kind are typical for the operation of controlled motors and servomechanisms.

The Lagrangean of the present problem is equal to

(10) $\Phi(u, \mu) = \frac{1}{2} \int_0^T [u(\tau)]^2\, d\tau + \mu_1 \left[\int_0^T (T - \tau)\, u(\tau)\, d\tau - x_0\right] + \mu_2 \int_0^T u(\tau)\, d\tau$,

where $\mu_1$, $\mu_2$ = Lagrange multipliers. The necessary, and at the same time sufficient, condition of optimality (due to the convexity of $E(u)$), according to theorem 3, becomes:

(11) $\operatorname{grad}_u \Phi(u, \mu_1, \mu_2)$ ...
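The energy problem (7)-(9) can be worked through numerically. The sketch below is a hedged illustration, not the text's own derivation (which is cut off here): it assumes that the optimality condition amounts to stationarity of the Lagrangean (10) in $u$, so that the optimal control is a linear combination $u(\tau) = c_0 (T - \tau) + c_1$ of the two constraint kernels, with $c_0 = -\mu_1$, $c_1 = -\mu_2$. The coefficients then follow from a 2 x 2 Gram system. The function names are invented for the example.

```python
from fractions import Fraction


def min_energy_control(x0, T):
    """Coefficients (c0, c1) of u(tau) = c0*(T - tau) + c1 and the cost E(u).

    Assumes the optimum is a combination of the kernels k0 = T - tau and
    k1 = 1 appearing in constraints (8), (9).
    """
    x0, T = Fraction(x0), Fraction(T)
    # Gram matrix of the kernels on [0, T]:
    #   g00 = int (T - tau)^2, g01 = int (T - tau), g11 = int 1
    g00, g01, g11 = T**3 / 3, T**2 / 2, T
    det = g00 * g11 - g01 * g01
    # Solve [[g00, g01], [g01, g11]] [c0, c1]^T = [x0, 0]^T by Cramer's rule.
    c0 = (g11 * x0) / det
    c1 = (-g01 * x0) / det
    # E(u) = (1/2) c^T G c = (1/2) c^T [x0, 0]^T
    energy = c0 * x0 / 2
    return c0, c1, energy


def u(tau, x0, T):
    """Value of the assumed optimal control at time tau."""
    c0, c1, _ = min_energy_control(x0, T)
    return c0 * (Fraction(T) - Fraction(tau)) + c1
```

Under these assumptions the control is $u(\tau) = \frac{6 x_0}{T^2}\left(1 - \frac{2\tau}{T}\right)$ with cost $6 x_0^2 / T^3$: full acceleration tapering linearly into an equal deceleration, which brings the velocity back to zero at $t = T$.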

Definition 2.2. (i) Given an initial function $\phi(\cdot) \in C([\alpha, t_0], D)$, a solution to (1) is a function $x(\cdot) \in C([\alpha, \beta), D)$, $t_0 < \beta \le \gamma$, such that $x(t) = \phi(t)$ for all $t \in [\alpha, t_0]$ and $x(\cdot)$ satisfies (1) on $(t_0, \beta)$.

(ii) The solution at time $t$ generated from initial time $t_0$ and initial function $\phi$ is denoted by $x(t; t_0, \phi)$. This solution is unique if any other solution is identical to it as far as both are defined.

Theorem 2.2. Consider the system (2.1) and let $f(t, \psi(\cdot))$ be continuous in $t$ and locally lipschitzian in $\psi$. Let $\phi \in C([\alpha, t_0], D)$. Then there exists a unique solution $x(t) = x(t; t_0, \phi)$ on $[\alpha, \beta)$, $t_0 < \beta \le \gamma$, and if $\beta < \gamma$ and $\beta$ cannot be increased, then for any compact set $G \subset D$ there exists a sequence of real numbers $t_0 < t_1 < t_2 < \cdots \to \beta$ such that $x(t_k) \in D - G$, $k = 1, 2, \ldots$, i.e., as $t \to \beta$, $x(t)$ comes arbitrarily close to the boundary of $D$ or is unbounded.

Corollary 2.3. Let $D = R^n$ and let $f(t, \psi(\cdot))$ be continuous in $t$ and linear in $\psi$. Then for every $\phi$ there exists a unique solution $x(t) = x(t; t_0, \phi)$ on the entire interval $[\alpha, \gamma)$.


L. Weiss

3. REPRESENTATION OF SOLUTIONS FOR LINEAR DELAY-DIFFERENTIAL SYSTEMS

In this section, which is based heavily on the work of Hale and Meyer [12], we consider equations of the form

(3.1) $\frac{dx}{dt} = f(t, x(\cdot)) + u(t)$

with initial function space $B = C([t_0 - h, t_0], R^n)$, where $f(t, x(\cdot))$ is linear in $x(\cdot)$ and depends only on values of $x(s)$ for $t - h \le s \le t$. It is further assumed that $\|f(t, \phi(\cdot))\| \le L(t) \|\phi\|$ for all $\phi \in B$ and all $t$. The control function $u(\cdot)$ is assumed to be such that a unique solution of (3.1) exists by Theorem 2.2.

The hypotheses on $f$ allow application of the Riesz Representation Theorem to establish existence of an $n \times n$ matrix valued function $\eta$ defined on $(-\infty, \infty) \times [-h, 0]$ such that

(3.2) $f(t, \phi(\cdot)) = \int_{-h}^{0} [d_\tau \eta(t, \tau)]\, \phi(\tau)$

for all $\phi \in B$. Moreover, $\eta(t, \cdot)$ is of bounded variation on $[-h, 0]$ for each $t$.

Now let $L_1([t_0, T], R^n)$ denote the space of functions with range in $R^n$ which are Lebesgue integrable over $[t_0, T]$. Then we have

Theorem 3.1. Let $x(\cdot; t_0, \phi, u)$ be the solution of (3.1) (or (3.2)) with control $u(\cdot) \in L_1([t_0, T], R^n)$ for each $T \ge t_0$ and with $\phi \in B$. Then

(3.3) $x(t; t_0, \phi, u) = x(t; t_0, \phi, 0) + \int_{t_0}^{t} K(t, s)\, u(s)\, ds$, $t \ge t_0$,

where $K(t, s)$ is defined for $s \le t + h$, $K(t, \cdot) \in L_\infty([t_0, t], R^{n^2})$, and $K(t, s) = \frac{\partial W(t, s)}{\partial s}$ almost everywhere, where $W(t, s)$ is the unique solution of the equations

(3.4) $W(t, s) = 0$ for all $t \in [s - h, s]$,

$W(t, s) = \int_s^t \left\{ \int_{-h}^{0} [d_\xi \eta(\tau, \xi)]\, W(\tau + \xi, s) \right\} d\tau - (t - s) I$, $s \le t$.

Proof: Let $u(\cdot) \in L_1([t_0, t], R^n)$. Then $M(u) = x(t; t_0, 0, u)$ is a continuous linear operator mapping $L_1([t_0, t], R^n)$ into $R^n$ and, by the Riesz Representation Theorem, there exists an $n \times n$ matrix $X(t; t_0, \tau)$, defined for all $t \ge t_0$, such that

$x(t; t_0, 0, u) = \int_{t_0}^{t} X(t; t_0, \tau)\, u(\tau)\, d\tau$.

It is easily shown [12] that $X$ is independent of $t_0$. Let $K(t, \tau) = X(t; t_0, \tau)$, $t \in (-\infty, \infty)$, $\tau \in (-\infty, t]$, let $K(t, \tau) = 0$ for $\tau \in (t, t + h]$, and let

$W(t, s) = -\int_s^t K(t, \tau)\, d\tau$.

Then $W(t, \eta) = 0$ for $t \in [\eta - h, \eta]$ and $W$ satisfies (3.4). Hence $K$ is as stated in the theorem.

The linear system to be discussed in some detail from a controllability standpoint is of the form

(3.5) $\frac{dx}{dt} = A(t) x(t) + B(t) x(t - h) + C(t) u(t)$,

where $x(t) \in R^n$, $u(t) \in R^p$, and $A(\cdot)$, $B(\cdot)$, $C(\cdot)$ are continuous functions. The solution of (3.5) can be represented as in (3.3) and it is easily checked that the function $K(t, s)$ satisfies the partial differential equations [2]

(3.6) $\frac{\partial K(t, s)}{\partial s} = -K(t, s) A(s) - K(t, s + h) B(s + h)$, $t_0 \le s < t - h$,

$\frac{\partial K(t, s)}{\partial s} = -K(t, s) A(s)$, $t - h \le s \le t$,

$K(t, t) = I$.

4. DEFINITIONS OF CONTROLLABILITY

Consider the nonlinear delay-differential system

(4.1) $\frac{dx}{dt} = f(t, x(t), x(t - h), u(t))$, $t \ge t_0$,

where $x(t) \in R^n$, $u(t) \in R^p$, and $u$ is measurable and bounded on every finite time interval (such controls will be called "admissible"). The delay is represented by a real scalar $h > 0$ and it is assumed that $f \in C^1$ in all its arguments and $f(t, 0, 0, 0) = 0$ for all $t$. The initial function space is the space $B$ as defined earlier.

Definition 4.1. The system (4.1) is $R^n$-controllable if for any $\phi \in B$ there exists $t_1 = t_1(\phi) \in (t_0, \infty)$ and an admissible control segment $u_{[t_0, t_1]}$ such that $x(t_1; t_0, \phi, u) = 0$. If $t_1$ is independent of $\phi$, we speak of fixed-time $R^n$-controllability, and if $t_1 - t_0$ can be made arbitrarily small we speak of differential $R^n$-controllability.

While this definition turns out to be quite useful, it does not reflect the fact that the state space of (4.1) is a function space and that one can conceive of control problems in which the state of the system is to be transferred to a point (or region) in function space. Hence it makes sense to also consider the following definition.

Definition 4.2. The system (4.1) is controllable to the origin with respect to the space of initial functions $B$ if for any $\phi \in B$ there exists $t_1 = t_1(\phi) \in (t_0, \infty)$ and an admissible control segment $u_{[t_0, t_1 + h]}$ such that $x(t; t_0, \phi, u) = 0$ for all $t \in [t_1, t_1 + h]$.

Although controllability to the origin does not imply controllability to some other point in function space, it is possible to obtain results for the latter problem using an approach similar to that presented in the sequel (see Weiss [23]).

5. CONTROLLABILITY OF LINEAR DELAY-DIFFERENTIAL SYSTEMS

We begin with the following Lemma.

Lemma 5.1. The system (3.5) is $R^n$-controllable if there exists $t_1 > t_0$ such that

(5.1) $\operatorname{rank} \int_{t_0}^{t_1} K(t_1, \eta) C(\eta) C'(\eta) K'(t_1, \eta)\, d\eta = n$,

where the prime (') indicates transpose.

Proof: Let $\mathcal{C}$ be the matrix in (5.1) whose rank is $n$. The Lemma follows by substituting

$u(\eta) = -C'(\eta) K'(t_1, \eta)\, \mathcal{C}^{-1} x(t_1; t_0, \phi, 0)$

in (3.3), for then $x(t_1; t_0, \phi, u) = 0$.

The question of the necessity of (5.1) involves the concept defined below.
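Lemma 5.1 can be tried out on a small case. The sketch below is an illustration under stated assumptions only: a hypothetical scalar instance of (3.5) with $A = 0$, $B = b$ constant, $C = 1$, a constant initial function, and simple Euler discretization. The kernel $K(t_1, s)$ is integrated backwards using (3.6), the integral in (5.1) is approximated by a Riemann sum, and the steering control $u(\eta) = -K(t_1, \eta)\, x(t_1; t_0, \phi, 0) / \text{(integral)}$ is applied with $t_0 = 0$. The function name and parameters are invented for the example.

```python
def steer_scalar_delay(b, h, t1, phi0, steps_per_h=1000):
    """Steer x'(t) = b*x(t - h) + u(t), x = phi0 on [-h, 0], to x(t1) = 0.

    Returns (gram, x_free, x_end): the scalar value of the integral in (5.1),
    the uncontrolled state at t1, and the controlled state at t1.
    """
    dt = h / steps_per_h
    d = steps_per_h                       # delay measured in steps
    n = round(t1 / h * steps_per_h)       # number of steps on [0, t1]

    # Kernel k[i] ~ K(t1, i*dt): K = 1 on [t1 - h, t1]; for smaller s,
    # dK/ds = -K(t1, s + h) * b, integrated backwards in s.
    k = [1.0] * (n + 1)
    for i in range(n - d - 1, -1, -1):
        k[i] = k[i + 1] + dt * b * k[i + 1 + d]

    def simulate(u):
        # x[0..d] stores the initial function on [-h, 0]; x[d + i] ~ x(i*dt).
        x = [phi0] * (d + 1)
        for i in range(n):
            x.append(x[d + i] + dt * (b * x[i] + u[i]))
        return x[-1]

    x_free = simulate([0.0] * n)
    gram = sum(ki * ki for ki in k[:n]) * dt
    u = [-k[i] * x_free / gram for i in range(n)]
    return gram, x_free, simulate(u)
```

For `steer_scalar_delay(-1.0, 1.0, 2.0, 1.0)` the integral is positive (rank 1 = n), so the condition of the Lemma holds, and the controlled state at $t_1$ is zero up to discretization error.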

Definition 5.1. A system (3.5) is pointwise complete if for each $t$ there exists a set of initial functions $\phi_i \in B$, $i = 1, \ldots, n$, such that the set $x(t; t_0, \phi_i, 0)$, $i = 1, \ldots, n$, forms a basis for $R^n$.

It is easy to construct an example to show that not all time varying systems (3.5) are pointwise complete. We conjecture, however, that all constant coefficient systems of the form (3.5) are pointwise complete.

Lemma 5.2. If the system (3.5) is pointwise complete, then (5.1) is necessary and sufficient for fixed-time $R^n$-controllability.

Proof: For any $\phi \in B$, suppose there exists a fixed time $t_1 > t_0$ and an admissible control $u_{[t_0, t_1]}$ such that $x(t_1; t_0, \phi, u) = 0$, but (5.1) does not hold. Then there exists a nonzero vector $x_1 \in R^n$ such that $x_1' K(t_1, s) C(s) = 0$ for all $s \in [t_0, t_1]$. Then, by (3.3), $x_1' x(t_1; t_0, \phi, 0) = 0$. By hypothesis, $\phi$ can be chosen so that $x(t_1; t_0, \phi, 0) = x_1$. Then $x_1' x_1 = 0$, which is a contradiction.

Theorem 5.3. A system (3.5) is controllable to the origin with respect to $B$ if and only if

(i) it is $R^n$-controllable, and

(ii) for each $\phi \in B$ and for some corresponding $t_1$ such that $x(t_1; t_0, \phi, u) = 0$, the equation

(5.2) $C(t) u^*(t) = -B(t) x(t - h)$

has an admissible solution $u^*(\cdot)$ defined on $(t_1, t_1 + h)$.

Proof: The necessity of (i) is obvious. Now, given $\phi \in B$, let $u_{[t_0, t_1]}$ be such that $x(t_1; t_0, \phi, u) = 0$. If (5.2) holds, then on the interval $(t_1, t_1 + h)$ the system (3.5) becomes $\dot{x}(t) = A(t) x(t)$. It then follows that $x(t) = 0$ for $t \in [t_1, t_1 + h]$. Conversely, if (3.5) is controllable to the origin with respect to $B$, then for each $\phi \in B$ there exists some $t_1 > t_0$ and control $u_{[t_0, t_1 + h]}$ such that $x(t; t_0, \phi, u) = 0$ for all $t \in [t_1, t_1 + h]$. This implies (i), and the uniqueness theorem for delay equations implies (ii).

Remark: The major element in the controllability problem for (3.5) is the solution to (5.2). Clearly, an admissible solution will exist on $(t_1, t_1 + h)$ if and only if the right side of (5.2) is in the range of $C(t)$ almost everywhere on the interval. A sufficient condition for the latter to hold is the existence of a $p \times n$ matrix $D(t)$ with bounded measurable elements such that $B(t) = C(t) D(t)$ almost everywhere on $(t_1, t_1 + h)$.

6. LOCAL $R^n$-CONTROLLABILITY OF NONLINEAR DELAY-DIFFERENTIAL SYSTEMS

In this section, we generalize the results of Lee and Markus [19] to the case of delay-equations.

Definition 6.1. The system (4.1) is locally $R^n$-controllable to the origin with respect to $B$ if it is $R^n$-controllable to the origin with respect to a neighborhood $N(0)$ of the origin in $B$.

Let

(6.1) $A(t) = \frac{\partial f}{\partial x}(t, 0, 0, 0)$, $B(t) = \frac{\partial f}{\partial x_d}(t, 0, 0, 0)$, $C(t) = \frac{\partial f}{\partial u}(t, 0, 0, 0)$,

where $x_d(t) = x(t - h)$.

Theorem 6.1. A system (4.1) is locally $R^n$-controllable to the origin with respect to $B$ if its first variation about the zero-solution satisfies the condition that there exists $t_1 > t_0$ such that (5.1) holds.

Proof: We introduce a parameter $\xi$ into the control $u$ and define

(6.2) $u^\xi(t) = u(t, \xi)$ ...

It should be noted that $u(t, 0) = u^0(t) = 0$ on $[t_0, t_1]$ ...

Define the Jacobian matrix $J(t)$ by

(6.3) $J(t) = \left.\frac{\partial x(t; \phi, u^\xi)}{\partial \xi}\right|_{\xi = 0}$.

Since ... $= 0$, the solution of (4.1) is written as

$x(t, \xi) = \int_{t_0}^{t} f(\tau, x(\tau), x_d(\tau), u^\xi(\tau))\, d\tau$.

Then we have

$J(t) = \left.\frac{\partial x}{\partial \xi}\right|_{\xi = 0} = \int_{t_0}^{t} \left[A(\tau) J(\tau) + B(\tau) J(\tau - h) + C(\tau) \frac{\partial u}{\partial \xi}(\tau, 0)\right] d\tau$,

where $A$, $B$, $C$ are as given in (6.1). Differentiating we obtain

(6.4) $\dot{J}(t) = A(t) J(t) + B(t) J(t - h) + C(t) \frac{\partial u}{\partial \xi}(t, 0)$, $t \ge t_0$.

But, from (6.2), ... and

$C(t) \frac{\partial u}{\partial \xi}(t, 0) = -B(t) J(t - h)$, $t_1 < t$ ...
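The coefficients (6.1) of the first variation are ordinary partial derivatives of $f$ at the zero solution, so they can be estimated numerically before applying the test of Theorem 6.1 to the resulting linear system (3.5). The sketch below treats the scalar case $n = p = 1$ with central differences; the right-hand side `f_example` is a hypothetical system chosen only so that $f(t, 0, 0, 0) = 0$, and the helper names are invented.

```python
import math


def first_variation(f, t, eps=1e-6):
    """Central-difference estimates of A(t), B(t), C(t) from (6.1), scalar case.

    f has the signature f(t, x, xd, u), where xd stands for x(t - h).
    """
    a = (f(t, eps, 0.0, 0.0) - f(t, -eps, 0.0, 0.0)) / (2 * eps)
    b = (f(t, 0.0, eps, 0.0) - f(t, 0.0, -eps, 0.0)) / (2 * eps)
    c = (f(t, 0.0, 0.0, eps) - f(t, 0.0, 0.0, -eps)) / (2 * eps)
    return a, b, c


def f_example(t, x, xd, u):
    # Hypothetical nonlinear delay system with f(t, 0, 0, 0) = 0.
    return math.sin(x) + 0.5 * xd + (1.0 + x * x) * u
```

For `f_example` the estimates are close to $A = 1$, $B = 0.5$, $C = 1$, so the first variation is $\dot{x} = x(t) + 0.5\, x(t - h) + u(t)$.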

. K[C(t,02)]

01 < 0 2 denotes

and by orthogonal

R[C(t,ol)] li R[C(t,02)]

Corollary 7.2.

There exists a positive

l C

function

u(t)

such that

U R[C(t,a)]

R[C(t,t + \l(t»] .

o>t Corollary 7.3. with decreasing

R[C(t,a)], a

t;

t

is monotone nondecreasing

°.

Identical results hold with

C(t,o)

replaced by

V(t, o)

Hence, if we denote the set of states controllable from (reachable at) time

t

by

Pc(t)(Pr(t»

and denote the set of states which

determinable at (observable from) time there exist positive

l C

functions

t

by

Qd(t)(~(t»

vet) , wet) ,p(t)

Pc Ct)

R[C(t,t +

Pr I t)

R[C(t,t - v (t» ]

Qd(t)

R[V(t,t - w(t»]

Qo(t)

R[V(t,t + p (t»]

(7.6)

\let»~]

, then

such that

-

223-

L. Weiss

We can now characterize the concepts of controllability, reachability, determinability, and observability for the system (7.1).

Theorem 7.4. A state $x_0$ of (7.1) is controllable from (reachable at) time $\tau$ if and only if there exists a finite value of time $t_1 > \tau$ ($t_1 < \tau$) such that $x_0 \in R[C(\tau, t_1)]$.

Proof (for controllability only): (Sufficiency): The solution of (7.1) with initial state $x_0$ and initial time $\tau$ is given by

$x(t; \tau, x_0) = \Phi(t, \tau) x_0 + \int_\tau^t \Phi(t, \eta) C(\eta) u(\eta)\, d\eta$.

By hypothesis, there exists $z \in R^n$ such that $C(\tau, t_1) z = x_0$. Setting

$u(\eta) = -C'(\eta) \Phi'(\tau, \eta) z$

and making use of the fact that $\Phi$ satisfies a group property, $\Phi(t, \tau) = \Phi(t, \sigma) \Phi(\sigma, \tau)$ for all $t$, $\sigma$, $\tau$, we find that $x(t_1; \tau, x_0) = 0$.

(Necessity): Suppose $x_0$ ($\ne 0$) is controllable from $\tau$, but $x_0 \notin R[C(\tau, t)]$ for all $t > \tau$. Then there exists a finite value of time $t_1$ and an admissible control segment $u_{[\tau, t_1]}$ such that

(7.7) $x_0 = -\int_\tau^{t_1} \Phi(\tau, \eta) C(\eta) u(\eta)\, d\eta$.

Since $x_0 \notin R[C(\tau, t_1)]$, it follows that $x_0$ has a nonzero projection $x_1$ on $K[C(\tau, t_1)]$. By Theorem 7.1, $x_1 \in K[C(\tau, t)]$ for all $t \in [\tau, t_1]$, in which case $x_1' \Phi(\tau, \eta) C(\eta) = 0$ for all $\eta \in [\tau, t_1]$. But then, from (7.7), $x_1' x_0 = 0$, which implies $x_0 \in R[C(\tau, t_1)]$, a contradiction.
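The range condition of Theorem 7.4 and the steering control used in the sufficiency proof can be illustrated on a scalar case. The sketch below is hedged: it assumes that (7.1) has the form $\dot{x} = A(t)x + C(t)u$ and that $C(\tau, t_1) = \int_\tau^{t_1} \Phi(\tau, \eta) C(\eta) C'(\eta) \Phi'(\tau, \eta)\, d\eta$ (neither definition is visible in this excerpt), and it uses a hypothetical input gain $c(t) = 1 - \cos t$ for $t > 0$, zero otherwise, with $A = 0$ so that $\Phi = 1$. Integrals are approximated by the midpoint rule; all names are invented.

```python
import math


def c(t):
    """Hypothetical scalar input gain: inactive for t <= 0."""
    return 1.0 - math.cos(t) if t > 0.0 else 0.0


def gramian(tau, t1, n=20000):
    """Midpoint-rule value of the (signed) integral of c(eta)^2 from tau to t1."""
    step = (t1 - tau) / n
    return sum(c(tau + (i + 0.5) * step) ** 2 for i in range(n)) * step


def steer_to_zero(x0, tau, t1, n=20000):
    """Apply u(eta) = -c(eta) * z with gramian * z = x0; return x(t1)."""
    z = x0 / gramian(tau, t1, n)
    step = (t1 - tau) / n
    x = x0
    for i in range(n):
        eta = tau + (i + 0.5) * step
        x += c(eta) * (-c(eta) * z) * step   # dx = c(eta) * u(eta) * d eta
    return x
```

Here `gramian(-1.0, 2 * math.pi)` is positive, so every state is controllable from $\tau = -1$, while `gramian(-0.5, -3.0)` is zero for this gain, so no nonzero state is reachable at $\tau = -0.5$.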

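From Theorem 7.1 and Theorem 7.4 we deduce

Theorem 7.5. A system (7.1) is controllable from (reachable at) time $\tau$ if and only if there exists a finite value of time $t_1 > \tau$ ($t_1 < \tau$) such that rank $C(\tau, t_1) = n$.

Another useful result is given by

Theorem 7.6. (i) If $x \in P_c(t)$ and $\tau \le t$, then $\Phi(\tau, t) x \in P_c(\tau)$.

(ii) If $x \in P_r(t)$ and $\tau \ge t$, then $\Phi(\tau, t) x \in P_r(\tau)$.

Proof of (i): By hypothesis there exists $t_1 > t$ and $u^*_{[t, t_1]}$ such that $x(t_1; t, x) = 0$. Then

(7.8) $\Phi(t_1, t) x + \int_t^{t_1} \Phi(t_1, \eta) C(\eta) u^*(\eta)\, d\eta = 0$.

Let $\tilde{u} = 0$ on $[\tau, t)$ and $\tilde{u} = u^*$ on $[t, t_1]$. Then

$\Phi(t_1, \tau) \Phi(\tau, t) x + \Phi(t_1, \tau) \int_\tau^{t_1} \Phi(\tau, \xi) C(\xi) \tilde{u}(\xi)\, d\xi = 0$

by (7.8).

The proof of (ii) follows exactly as above.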

Now, from Definitions 7.3, 7.4 and Theorem 7.4, we have

Theorem 7.7. A state $x_0$ of (7.1) is determinable at (observable from) time $\tau$ if and only if there exists a finite $t_1 < \tau$ ($t_1 > \tau$) such that $x_0 \in R[V(\tau, t_1)]$.

Corollary 7.8. A system (7.1) is determinable at (observable from) time $\tau$ if and only if there exists a finite $t_1 < \tau$ ($t_1 > \tau$) such that rank $V(\tau, t_1) = n$.

The dual of Theorem 7.6 is also clear and is given by ...

The significance of the phrases "observable from" and "determinable at" should now be clear from the nature of (7.1). For if (7.1) is determinable at (observable from) time $t_0$ (assuming $u(\cdot) = 0$), the state of the system at time $t_0$ can be uniquely determined from knowledge of the output $y(t)$ over a finite time interval ending at (beginning from) time $t_0$. To see this, consider the solution of (7.1) when $u(\cdot) = 0$, the initial state is $x_0$, the initial time is $t_0$; and assume (7.1) is determinable at $t_0$. Then we can write, for some $t_1 < t_0$,

$V(t_0, t_1)\, x_0 = \int_{t_1}^{t_0} \Phi'(t, t_0) H'(t) y(t)\, dt$.

Application of Theorem 7.9 indicates that the state $x_0$ can be determined at time $t_0$.
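The reconstruction of the state from past outputs can be sketched numerically. The example below is an illustration under stated assumptions: it takes (7.1) with $u = 0$ to be $\dot{x} = A(t)x$, $y = H(t)x$, uses $V(t_0, t_1) = \int_{t_1}^{t_0} \Phi'(t, t_0) H'(t) H(t) \Phi(t, t_0)\, dt$, and picks a hypothetical double integrator, $\Phi(t, t_0) = \begin{bmatrix} 1 & t - t_0 \\ 0 & 1 \end{bmatrix}$, $H = [1 \ \ 0]$. The state at $t_0$ is recovered by solving $V x_0 = \int \Phi' H' y\, dt$; the function name is invented.

```python
def reconstruct_state(y, t0, t1, n=1000):
    """Recover x(t0) of a double integrator from y on [t1, t0], t1 < t0.

    Uses midpoint sums for V = int Phi' H' H Phi dt and b = int Phi' H' y dt,
    where H Phi(t, t0) = [1, t - t0].
    """
    step = (t0 - t1) / n
    v11 = v12 = v22 = 0.0
    b1 = b2 = 0.0
    for i in range(n):
        t = t1 + (i + 0.5) * step
        s = t - t0                      # H Phi(t, t0) = [1, s]
        v11 += step
        v12 += s * step
        v22 += s * s * step
        yt = y(t)
        b1 += yt * step
        b2 += s * yt * step
    det = v11 * v22 - v12 * v12         # nonzero iff rank V = 2
    # Solve the symmetric 2 x 2 system V x0 = b by Cramer's rule.
    return (v22 * b1 - v12 * b2) / det, (v11 * b2 - v12 * b1) / det
```

For example, with true state $x(t_0) = (1.5, -0.7)$ at $t_0 = 0$, the output is $y(t) = 1.5 - 0.7t$, and `reconstruct_state(lambda t: 1.5 - 0.7 * t, 0.0, -2.0)` returns that state from the record on $[-2, 0]$.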

It is also easy to check the following facts about the system (7.1).

1. Controllability from $t$ implies that any state at time $t$ can be transferred to any other state in a finite interval of time beginning with $t$.

2. If (7.1) is controllable from $t$ and we reverse the ordering of the time scale, the reversed time system is reachable at $t$.

3. In general, complete controllability does not imply complete reachability or vice versa. To see this, consider (7.1) in which $n = p = r = 1$, $A(t) = 0$, and

$C(t) = \begin{cases} 1 - \cos t, & t \ge 0, \\ 0, & t < 0. \end{cases}$

The above system is completely controllable, but is reachable only for $t > 0$. One can say the following, however.

Proposition 7.10. (i) If a system (7.1) is controllable from time $\tau$, then it is reachable at all $t \ge \tau + \mu(\tau)$, where $\mu(\cdot)$ is as defined in Corollary 7.2.

(ii) If a system is determinable at $\tau$, it is observable from all $t \le \tau - \omega(\tau)$, where $\omega(\tau)$ is as defined earlier.

Proof: (i) If (7.1) is controllable from $\tau$, then

rank $C(\tau, \tau + \mu(\tau)) = n = \operatorname{rank} C(\tau + \mu(\tau), \tau) = \operatorname{rank} C(\xi, \tau)$ for all $\xi \ge \tau + \mu(\tau)$.

(ii) Follows from (i) by duality.

Corollary 7.11. Complete differential controllability (observability) is equivalent to complete differential reachability (determinability). [It is therefore obvious that no distinction need be made between the concepts of controllability (observability) and reachability (determinability) for time-invariant systems (7.1)].

We now consider briefly the problem of controllability and determinability for ordinary nonlinear differential systems.

The controllability problem can, in fact, be handled in a manner completely analogous to that presented for systems containing time delays (simply let $h \to 0$). Results of a slightly different nature can be obtained for special types of nonlinear equations and these are discussed in the section dealing with the application of Pfaffian systems to the controllability problem. The determinability problem warrants more detailed comment, however, and so we consider the problem of giving sufficient conditions for a class of nonlinear ordinary differential systems to be uniformly determinable (defined below). The discussion is based on the work of Al'brekht and Krasovskii [1].

Consider the system

(7.9) $\frac{dx}{dt} = f(t, x(t))$, $y(t) = g(t, x(t))$,

where $x(t) \in R^n$, $y(t) \in R^r$, $f(t, 0) = 0$, $g(t, 0) = 0$, $f$ and $g$ are analytic functions of $x$ in a neighborhood of $x = 0$, and the components of $f$ and $g$ respectively can be expanded in a series as follows:

(7.10) $f_i(t, x) = \sum_{m=1}^{\infty} \phi_i^{(m)}(t, x)$, $g_i(t, x) = \sum_{m=1}^{\infty} \psi_i^{(m)}(t, x)$,

where $\phi_i^{(m)}$ and $\psi_i^{(m)}$ are $m$th order forms in $x$ with continuous and bounded coefficients.

Definition 7.6. The system (7.9) is uniformly determinable at all $t \ge \gamma$ if there exists a function $Y: R^1 \times R^r \to R^n$ such that any trajectory $x(t)$ can be expressed as

(7.11) $x(t) = Y(t, y(t + \theta))$, $-\gamma \le \theta \le 0$,

for all $t \ge \gamma > 0$.

Remark: The interest in uniform determinability stems from the desire to develop a method of state-vector determination whose practical implementation would involve the taking of measurements over time intervals of fixed length.

The first result of interest characterizes uniform observability for the linear system (7.1) (with $u(\cdot) = 0$), and follows from Corollary 7.8 (see Kalman [17]).

Lemma 7.10. The system (7.1) is uniformly determinable at all $t \ge \gamma$ if and only if rank $V(t, t - \gamma) = n$ for all $t \ge \gamma$, where

(7.12) $V(t, t - \gamma) = \int_{t - \gamma}^{t} \Phi'(\eta, t) H'(\eta) H(\eta) \Phi(\eta, t)\, d\eta$

and $\Phi$ is the transition matrix corresponding to (7.3).

The main result is the theorem below.

Theorem 7.11. The system (7.9) is uniformly determinable at all $t \ge \gamma$ if its first variation has that property.

Proof: Consider the solution $x(\cdot; 0, x_0)$ of (7.9) and suppose $x_0 \in N(0)$, a sufficiently small neighborhood of the origin in $R^n$. Writing $x(t) = x(t; 0, x_0)$, we have the following expansion on the interval $[t - \gamma, t]$:

(7.13) $x(t + \theta) = \Phi(t + \theta, t) x(t) + \sum_{m=2}^{\infty} S^{(m)}(t, \theta, x(t))$, $-\gamma \le \theta \le 0$,

where $\Phi$ is the transition matrix associated with the first variation of (7.9). The series (7.13) will converge for $x(t)$ sufficiently small. Since

(7.14) $y(t + \theta) = g(t + \theta, x(t + \theta))$, $-\gamma \le \theta \le 0$,

then substituting (7.13) into (7.14) and expanding yields

$y(t + \theta) = H(t + \theta) \Phi(t + \theta, t) x(t) + \sum_{m=2}^{\infty} P^{(m)}(t, \theta, x(t))$, $-\gamma \le \theta \le 0$,

where

$H(t) = \frac{\partial g}{\partial x}(t, 0)$.

Now let

(7.15) $Y(t, y(t + \theta)) = \ldots = x(t) + V(t, t - \gamma)^{-1} \int_{-\gamma}^{0} \ldots$

... $\Phi(t, \eta) C(\eta) C'(\eta) \Phi'(t, \eta) T'(t)$ ... for all $t$ and all $\eta \in [t, t + \mu(t)]$, where $K(t, \eta)$ is ... Choosing ..., it is clear that the above equation implies ... Using the notation ..., ... for all $t$, which proves the main part of the theorem. The remainder follows by a trivial observation.

By a completely analogous argument, one can prove

Theorem 10.3: Consider (1) with determinability matrix $V(t, t - \omega(t))$ and let rank $V(t, t - \omega(t)) = r_d$. Then there exists a $C^1$ ...

... $\max_i(\omega_i(t))$, $i = 1, 2, 3, 4$.

It is easy to show that

$R[V_{aa}(t, t - \omega(t))] \supseteq R[R(t, t - \omega(t))]$.

Hence there exists a matrix $K(t)$ such that $V_{aa} K = R$. By Corollary 10.1 it follows that $K(\cdot) \in C^1$.

Now define the ( n - r matrix

T (t ) 4

c

- r

d

+ r 2

d

)

c

I rd

denotes the

1 1 is C

r

c

-

r

d

2

+ rd

)

1

J

- K( t) I n _ r -r

T4 ( r )

-c ( n -

1

b y th e f ormula

(10.14)

whe re

I

r

d

2

x r identity mat r i x. Clea rly, 1 d1 From ( 10 . 13) an d (10.14) we ob ta i n (omi t ti ng a rgumen t s d

on ri gh t hand side)

where

Q 1

Q - R'K

is n onne gati ve def i n ite a n d i t fo llows b y h y po t h e s I s


that rank

Ql(t,t - w(t»

=

r

d

- r 3

d

Applying Corollary 10 .1 let (n - r

c

- r

d

for all 1 T (t)

s

t

.

be an

(n - r

- r

c

d

)

x

Z

continuously differentiable nonsingular matrix such

)

Z

that

o

o where for all

PI (t) t •

- r and rank Pl(t) - r ) x (r d d) d 3 1 3 1 Then the coordinate transformation defined by is

(r

d

where

(10.15)

has the effect of transforming the determinability matrix $V_3$ for the system (10.11) into the form

(10.16)

-'i

-

:J

]

for all $t$.

It is easy to check that $T(t)^{-1}$ has the same form as

T(t) , in fact -KT

(10.17)

for all

5l

J

T, t , where the dimensions of the

"0"

are

n - r

c

- r

d2

" rd

Therefore, the transformed state coefficient matrix of (10.11) is given by

•

+ T(t)T{t)

A d(t) a,

and has the form

a, d(t)

A

and the corresponding form, i.e.,

" 4>

a, d(t .. r)

~ransition

matrix

4>

a,

d(t,T)

a l s o has th e

-I

1

for all $t, \tau$.

Now partition

where

; ad 1

where

~ 12

Add

x n - r

is

is

;ad

(n - r

as follows:

, and

c

- r + r d ) x (r d - r d) d 2 3 1 3 1 remaining matrices are comformab1e ,.ith thi s. c

- r

; dd

and

d

and the

corresponds to a partition of the vector

Then the transpose of

(10 .19)

'

a,d

a

r····

Add '

; ad ' 1 ; ad ' 2

~ ll

Add ' ~12

''J 2 1 Add ' ~2 2

By (10.16), states which are determinable at any fixed time under the new coordinate system have the form

Xl

(d]

[ wet)

must,


where w(t)

dim xl(t) =

0

r

=

for all

d

3

t.

, dim w(t) = n - r

- r + r ' and d2 d3 d1

From Theorem 7.6 and (10.19) it follows that

Add ' 1>12

for all

t.

0

=

Transposing, and using equation (7.3), we find that

r

A

for all

~

(t~

- r

c

a,d

aa A

Aa1 d

0

0

Add All

0

0

Add A 2l

Add A 22

t. Before giving the final regrouping of terms, it remains

to find the "output" coefficient matrix of (10.11) under the new coordinate system.

This is given by

H( t )

and so, from

(10.17) (with arguments omitted)

But from Theorem 10.3 it is clear that

H

H takes on the form

where

is

We now define the following quantities:

~

~

a

x

a

~

[;]

c

~

b d

x x

lC = [A CC

"dd J AU

lc = [A dC

~~d ]

~dd

ac

~~d ]

.ta a

It =

[A

b d 2

~C

[HC

H~]

"dd

An =

Aaa ~bb=Aba

and if we denote

then we define

Finally,

~b b

bb ,

= A

The theorem is thus proved.

It should be emphasized that our procedure for structural decomposition is "symmetric" from a number of points of view. For example, just as Theorem 10.3 is a dual result to Theorem 10.2, we could have given a completely dual procedure for obtaining a form consonant with (10.12). That is, one can easily write the dual to Theorems 10.4 and 10.5, which would begin with the application of Theorem 10.3 and would replace the matrices $\mathcal{C}_1, \mathcal{C}_2, \mathcal{C}_3$ with matrices $V_1, V_2, V_3$, etc.

In addition to all this "dual" symmetry, the results of Section 7 indicate that the same type of structural decomposition is obtained if "controllability" is replaced by "reachability" and/or "determinability" is replaced by "observability". Hence, Theorems 10.4 and 10.5, as well as their duals, are each representative of a set of four structural decomposition theorems.

To avoid confusion in the sequel, our discussion and interpretation of the results of this section are given only with reference to the actual procedure adopted in Theorems 10.4 and 10.5 to obtain (10.12). On the basis of our comments above, the reader can easily supply the interpretations for all the remaining approaches.

Remarks:

1. The overall coordinate transformation which produces the general structural decomposition of an arbitrary system (7.1) is represented by the matrix

I

r

d

0

T i(t)

1 T

T( t) 0

Ts(t)

-1

J

4(t)

-1

I

0

I

-1

iT 1(

0

T 3( t

-JI )

.J

2. For the special case when (1) is time-invariant, all applications of Corollary 10.1 will involve time-invariant transformations, so that the procedure given in the proofs of Theorems 10.4 and 10.5 clearly leads to a time-invariant structural decomposition.

3. Pictorially, the decomposition (10.12) can be viewed as in Figure 1, which shows four interconnected systems $S^a, S^b, S^c, S^d$ enclosed in "boxes" labelled with their associated state vectors. If, as is natural, we view the interconnecting lines inside the large "box" as input and output lines for the structural components, then the following result is readily discernible from individual examination of each structural component in (10.12) plus reference to the proofs of Theorems 10.4, 10.5.

Corollary 10.6:
(i) $S^a$ is completely controllable and completely determinable.
(ii) $S^b$ is completely controllable and completely undeterminable.
(iii) $S^c$ is completely uncontrollable and completely determinable.
(iv) $S^d$ is completely uncontrollable and completely undeterminable.

Note: If the matrices $A(\cdot), C(\cdot), H(\cdot)$ in (7.1) are analytic functions of time, the ranks of $\mathcal{C}(t, t + \mu(t))$, $V_i(t, t - \omega_i)$, $i = 1, 2, 3$, will be constant everywhere in the t-domain. Hence, the system-theoretic interpretation of the structural decomposition of a system with analytic coefficients is given by Corollary 10.6. This provides the proof for assertions concerning the structural decomposition of analytic systems which were made by Kalman [16] and Weiss and Kalman [21].

It may be of interest to point out that in this special case the overall coordinate transformation can be taken to be analytic rather than just $C^1$. This follows directly from the proof of Dolezal's Theorem (Lemma 8.4) given by Weiss and Falb [26].
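Corollary 10.6 can be illustrated on a deliberately simple time-invariant case. The sketch below is not taken from the lectures: it assumes a diagonal system with distinct eigenvalues (the names `eigs`, `G`, `H` and `classify` are hypothetical), for which a mode is controllable exactly when its input coefficient is nonzero and observable exactly when its output coefficient is nonzero. For time-invariant systems determinability coincides with observability, so the four labels match the four components of the corollary.

```python
import math

# Hypothetical diagonal example: x_i' = eigs[i] * x_i + G[i] * u,  y = sum_i H[i] * x_i.
eigs = [-1.0, -2.0, -3.0, -4.0]   # distinct eigenvalues
G = [1.0, 1.0, 0.0, 0.0]          # input coefficients
H = [1.0, 0.0, 1.0, 0.0]          # output coefficients


def classify(g, h):
    """Return the label of the structural component a mode belongs to."""
    if g != 0 and h != 0:
        return "a"   # controllable and determinable
    if g != 0:
        return "b"   # controllable, undeterminable
    if h != 0:
        return "c"   # uncontrollable, determinable
    return "d"       # uncontrollable, undeterminable


labels = [classify(g, h) for g, h in zip(G, H)]


def weighting_pattern(t, tau):
    """W(t, tau) = H exp(A (t - tau)) G for the full diagonal system."""
    return sum(h * math.exp(lam * (t - tau)) * g
               for lam, g, h in zip(eigs, G, H))


def weighting_pattern_a(t, tau):
    """Weighting pattern built from the 'a' modes alone."""
    return sum(h * math.exp(lam * (t - tau)) * g
               for lam, g, h, lab in zip(eigs, G, H, labels) if lab == "a")


print(labels)
print(weighting_pattern(1.5, 0.5), weighting_pattern_a(1.5, 0.5))
```

In this example the two printed values agree: only the mode labelled "a" contributes to the input/output behavior, in line with Proposition 11.4 of the next section.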

[Figure 1: block diagram of the four interconnected subsystems, with input $u$ and output $y$]

FIGURE 1: Structural Decomposition of a Linear System

11. WEIGHTING PATTERNS, IMPULSE RESPONSES, MINIMAL REALIZATIONS AND CONTROLLABILITY THEORY

Until recently, input/output relations were the most popular means used in engineering textbooks to represent systems, with the "state" being only implicitly considered. Since an input/output relation is the natural outcome of an attempt to model a system from experimental observations, it is of interest to investigate the relationship between input/output representations and state-vector representations of a system. In keeping with the theme of these notes, our discussion centers on the properties of controllability, observability, etc. for such representations. We consider only ordinary linear differential systems of the form (7.1).

The solution to (7.1) can be written as

(11.1)    $y(t) = H(t)\,\Phi(t, t_0)\,x_0 + \int_{t_0}^{t} W(t, \tau)\,u(\tau)\,d\tau$

where $x_0$ is the state of the system at time $t_0$ and

(11.2)    $W(t, \tau) = H(t)\,\Phi(t, \tau)\,C(\tau)$    for all $t, \tau$

and is denoted as the weighting pattern for (7.1) (see Weiss [20]). Clearly, if $x_0 = 0$, then $W(t, \tau)$ contains all the information needed to compute all input/output pairs of the system. On this basis we concentrate our study of input/output relations solely on the function $W(t, \tau)$.
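As a small numerical illustration of (11.1) and (11.2), the sketch below assumes a hypothetical time-invariant example that does not appear in the lectures: the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$, $y = x_1$, for which $\Phi(t, \tau)$ has rows $(1, t - \tau)$ and $(0, 1)$, so that $W(t, \tau) = t - \tau$. With $x_0 = 0$ and $u \equiv 1$ the integral term of (11.1) equals $(t - t_0)^2 / 2$, which a trapezoidal rule should reproduce. All function names are illustrative.

```python
def transition(t, tau):
    """Transition matrix of the double integrator x1' = x2, x2' = u."""
    return [[1.0, t - tau], [0.0, 1.0]]


H = [1.0, 0.0]   # output row: y = x1
C = [0.0, 1.0]   # input column: u enters the equation for x2


def weighting_pattern(t, tau):
    """W(t, tau) = H * Phi(t, tau) * C, as in (11.2)."""
    phi = transition(t, tau)
    phi_c = [sum(phi[i][j] * C[j] for j in range(2)) for i in range(2)]
    return sum(H[i] * phi_c[i] for i in range(2))


def zero_state_output(t, t0, u, steps=1000):
    """Trapezoidal approximation of the integral term in (11.1), with x0 = 0."""
    h = (t - t0) / steps
    total = 0.0
    for k in range(steps + 1):
        tau = t0 + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * weighting_pattern(t, tau) * u(tau)
    return total * h


y = zero_state_output(2.0, 0.0, lambda tau: 1.0)
print(y)   # close to (2.0 - 0.0) ** 2 / 2
```

The point of the sketch is that, once $x_0 = 0$, the function `weighting_pattern` alone is enough to compute the output for any input.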

[A historical aside: Engineers have traditionally concerned themselves with input/output relations associated with the causal impulse response function $W_c(t, \tau)$ defined by

(11.3)    $W_c(t, \tau) = \begin{cases} W(t, \tau), & t \ge \tau \\ 0, & t < \tau \end{cases}$

The distinction between $W_c(t, \tau)$ and

The weighting pattern for the

OCt )

=

Consider (11.1). 4>(Tl,t)C(t)

Letting

yields the desired factorization.


It is also quite simple to construct a globally reduced weighting pattern from a given one, as indicated by the proof of the result below.

Lemma 11.2. Every weighting pattern has a globally reduced form.

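Proof: Suppose (11.4) is not globally reduced. Then the rows of $\Theta(\cdot)$ and/or the columns of $\Psi(\cdot)$ are dependent on $(-\infty, \infty)$. Suppose the rows of $\Theta(\cdot)$ are dependent. Then there exists an $n \times n$ nonsingular constant matrix $K$ such that

$K\Theta(t) = \begin{bmatrix} \Theta_1(t) \\ 0 \end{bmatrix}$

where the rows of $\Theta_1(\cdot)$ are independent over $(-\infty, \infty)$. Then

$W(t, \tau) = \Psi(t)K^{-1}\begin{bmatrix} \Theta_1(\tau) \\ 0 \end{bmatrix} = \Psi_1(t)\,\Theta_1(\tau)$

where $\Psi_1(t)$ consists of the columns of $\Psi(t)K^{-1}$ that multiply $\Theta_1(\tau)$. If the columns of $\Psi_1(t)$ are not independent over $(-\infty, \infty)$, we introduce a nonsingular constant matrix $L$ such that

$\Psi_1(t)L = \begin{bmatrix} \Psi_{11}(t) & 0 \end{bmatrix}$

where the columns of $\Psi_{11}(t)$ are independent over $(-\infty, \infty)$.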


Then, letting

~( t)

L-lO(t)

we get

Wet,,)

(11 .5)

and (11.5) is globally reduced.

We now investigate some of the properties of minimal realizations of globally reduced weighting patterns. The first result justifies the terminology "minimal".

Lemma 11.3. A minimal realization of a globally reduced weighting pattern $W(t, \tau)$ has the lowest dimension of all global realizations of $W(t, \tau)$.

Proof. Suppose the contrary. Let $n$ be the order of $W(t, \tau)$ and consider a global realization of $W(t, \tau)$ with dimension $< n$. Clearly, its weighting pattern is of order $< n$, which contradicts the assumption that $W(t, \tau)$ is globally reduced.
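To make the notions of "globally reduced" and "order" concrete, the following small worked example may help. It is an illustration constructed for this purpose and is not part of the original lectures; it assumes the factorization $W(t, \tau) = \Psi(t)\Theta(\tau)$ and the row-reduction step with a constant matrix $K$ used in the proof of Lemma 11.2.

```latex
\Psi(t) = \begin{bmatrix} t & 1 \end{bmatrix}, \qquad
\Theta(\tau) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
W(t,\tau) = \Psi(t)\,\Theta(\tau) = t + 1 .

% The rows of \Theta are dependent, so take
K = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \qquad
K\,\Theta(\tau) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
\Psi(t)\,K^{-1} = \begin{bmatrix} t & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} t + 1 & 1 \end{bmatrix} .

% Dropping the zero row of K\Theta and the matching column of \Psi K^{-1}:
W(t,\tau) = \underbrace{(t + 1)}_{\Psi_1(t)} \cdot \underbrace{1}_{\Theta_1(\tau)} .
```

The reduced factorization has order 1. The one-dimensional system $\dot{x} = u$, $y = (t + 1)\,x$ realizes it, whereas the original factorization corresponds to a two-dimensional realization of the same weighting pattern.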

An interesting fact which relates the material on "structure" to that presently being developed is the following (see Kalman [15], [16]).

Proposition 11.4. The subsystem $S^a$ in Figure 1 is a minimal realization of the weighting pattern for the overall system.

Proof: The weighting pattern for the system (10.12) is


(11.6)    $W(t, \tau) = H^a(t)\,\Phi^{aa}(t, \tau)\,G^a(\tau)$

where $\Phi^{aa}$ corresponds to the coefficient matrix $F^{aa}$. The right side of (11.6) is the weighting pattern for $S^a$ and is globally reduced. It is a trivial matter to check that the order of this weighting pattern is the dimension of $S^a$.

Definition 11.7. Two linear systems $S$, $\hat{S}$ of the form (7.1) are algebraically equivalent if there exists a nonsingular continuously differentiable matrix $T(t)$ such that

(11.7)    $A_{\hat{S}}(t) = T(t)A_S(t)T(t)^{-1} + \dot{T}(t)T(t)^{-1}, \quad C_{\hat{S}}(t) = T(t)C_S(t), \quad H_{\hat{S}}(t) = H_S(t)T(t)^{-1}$

for all $t$. An obvious result is

Proposition 11.5. Weighting patterns are invariant under algebraic equivalence.

Lemma 11.6. Points of time from which a system (7.1) is controllable (or observable) or at which a system is reachable (or determinable) are invariant under algebraic equivalence.

Proof (for controllability only; the rest follows analogously): Under algebraic equivalence we have the correspondence

$\mathcal{C}(t, t + \mu(t)) \to T(t)\,\mathcal{C}(t, t + \mu(t))\,T'(t)$

and so, since $T(t)$ is nonsingular, rank $\mathcal{C}(t, t + \mu(t)) = n$ implies that the transformed controllability matrix also has rank $n$.
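The notion of algebraic equivalence and the invariance of the weighting pattern can be checked numerically on a small case. The sketch below assumes the double integrator (a hypothetical example, not from the lectures) and the transformation $T(t) = \Phi(0, t)$; the helper names are illustrative. It evaluates the transformed state coefficient matrix $T A T^{-1} + \dot{T} T^{-1}$ of (11.7), which should vanish for this choice of $T$, and compares the weighting pattern before and after the transformation.

```python
def matmul(X, Y):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[0.0, 1.0], [0.0, 0.0]]   # double integrator
C = [[0.0], [1.0]]             # input matrix (column)
H = [[1.0, 0.0]]               # output matrix (row)


def T(t):
    """T(t) = Phi(0, t) for the double integrator."""
    return [[1.0, -t], [0.0, 1.0]]


def T_inv(t):
    return [[1.0, t], [0.0, 1.0]]


def T_dot(t):
    return [[0.0, -1.0], [0.0, 0.0]]


def transformed_A(t):
    """T A T^{-1} + T_dot T^{-1}, the transformed state coefficient matrix."""
    first = matmul(matmul(T(t), A), T_inv(t))
    second = matmul(T_dot(t), T_inv(t))
    return [[first[i][j] + second[i][j] for j in range(2)] for i in range(2)]


def W_original(t, tau):
    phi = [[1.0, t - tau], [0.0, 1.0]]   # Phi(t, tau)
    return matmul(matmul(H, phi), C)[0][0]


def W_transformed(t, tau):
    # With zero state coefficient matrix the transition matrix is the identity,
    # so the weighting pattern is (H T^{-1})(t) * (T C)(tau).
    return matmul(matmul(H, T_inv(t)), matmul(T(tau), C))[0][0]


print(transformed_A(0.7))
print(W_original(2.0, 0.5), W_transformed(2.0, 0.5))
```

The transformed system has zero state coefficient matrix, and its weighting pattern factors as a function of $t$ times a function of $\tau$, which is the form used in the discussion of minimal realizations.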

The following result was first stated but not proved in [16] and [21]. A proof was subsequently published by Youla [26]. The following proof combines that of Youla with one given by Kalman in unpublished notes.

Theorem 11.7. Any two minimal realizations of a given globally reduced weighting pattern are algebraically equivalent.

Proof: It is clear from the proof of Lemma 11.1 that any minimal realization is algebraically equivalent to one with the coefficient matrix $A(\cdot) \equiv 0$ (take $T(t) = \Phi(\eta, t)$ in (11.7)). Hence it suffices to prove that any two minimal realizations $\{0, \Psi_1(\cdot), \Theta_1(\cdot)\}$ and $\{0, \Psi_2(\cdot), \Theta_2(\cdot)\}$ of a given globally reduced weighting pattern $W(t, \tau)$ are algebraically equivalent. To do this, first note that

(11.8)    $W(t, \tau) = \Psi_1(t)\,\Theta_1(\tau) = \Psi_2(t)\,\Theta_2(\tau)$

where the columns of the $\Psi_i(\cdot)$ and the rows of the $\Theta_i(\cdot)$, $i = 1, 2$, are linearly independent on $(-\infty, \infty)$. It then follows that there exist finite intervals $J_i$, $K_i$, $i = 1, 2$, on which the aforementioned columns and rows are linearly independent respectively. [For if not,

then on any interval $[-k, k]$, $k = 1, 2, \ldots$, there exists a constant vector $\xi_k$ of unit norm such that

$\Psi_1(t)\,\xi_k = 0$ almost everywhere on $[-k, k]$.

The sequence $\{\xi_k\}$ has a convergent subsequence $\{\xi_{k_i}\}$ such that $\lim_{i \to \infty} \xi_{k_i} = \xi$, and $\Psi_1(t)\,\xi = 0$ a.e. in $(-\infty, \infty)$, thus contradicting the linear independence of the columns of $\Psi_1(\cdot)$.]

Hence the matrices

$M_i = \int_{J_i} \Psi_i'(t)\,\Psi_i(t)\,dt, \quad i = 1, 2$

and

$N_i = \int_{K_i} \Theta_i(t)\,\Theta_i'(t)\,dt, \quad i = 1, 2$

are nonsingular. Hence, from (11.8) we can write that

(11.9)    $\Theta_2(\tau) = U\,\Theta_1(\tau), \qquad U = M_2^{-1} \int_{J_2} \Psi_2'(t)\,\Psi_1(t)\,dt$

(11.10)    $\Psi_2(t) = \Psi_1(t)\,V, \qquad V = \left[ \int_{K_2} \Theta_1(\tau)\,\Theta_2'(\tau)\,d\tau \right] N_2^{-1}$

Now, multiplying both sides of (11.10) on the left by $\Psi_2'(t)$, integrating over $J_2$, and multiplying on the left again by $M_2^{-1}$, we get

$I_n = UV.$

But from (11.8), (11.9), (11.10), we have

$\Psi_1(t)\,\Theta_1(\tau) = \Psi_2(t)\,\Theta_2(\tau) = \Psi_1(t)\,VU\,\Theta_1(\tau)$

from which it follows that $I_n = VU$, so that $U = V^{-1}$, and therefore $\{0, \Psi_1(\cdot), \Theta_1(\cdot)\}$ is algebraically equivalent to $\{0, \Psi_2(\cdot), \Theta_2(\cdot)\}$, which proves the theorem.

As a direct consequence of Theorem 11.7 and Lemma 11.6 we have the statement that all minimal realizations of a given globally reduced weighting pattern have essentially the same behavior with respect to the properties of controllability, reachability, determinability, and observability. We can, of course, go even further as indicated below.

Theorem 11.8. Given a globally reduced weighting pattern (11.1), there exist finite values of time $t'$, $t''$, such that all minimal realizations of (11.1) are controllable (or observable) from all $t < t'$ and are reachable (or determinable) at all $t > t''$.

Proof:

Let

T

A realization of a causal (anticausal) impulse response $W_c(t, \tau)$ ($W_a(t, \tau)$) is a system (7.1) whose causal (anticausal) impulse response is $W_c(t, \tau)$ ($W_a(t, \tau)$).

The following Corollary of Lemma 11.1 is obvious.

Corollary 11.10. An $r \times p$ matrix function $W_c(t, \tau)$ is a causal impulse response for an $n$-dimensional system (7.1) if and only if there exists an $r \times n$ matrix $\Psi(\cdot)$ and an $n \times p$ matrix $\Theta(\cdot)$, both defined and continuous for all time, such that

(11.13)    $W_c(t, \tau) = \begin{cases} \Psi(t)\,\Theta(\tau), & t \ge \tau \\ 0, & t < \tau \end{cases}$
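A minimal sketch of the form (11.13), assuming scalar input and output and the hypothetical factors $\Psi(t) = (1, t)$ and $\Theta(\tau) = (-\tau, 1)'$. These are chosen only for illustration and give $\Psi(t)\Theta(\tau) = t - \tau$; the function names are not from the lectures.

```python
def psi(t):
    """Row factor Psi(t), here 1 x 2."""
    return [1.0, t]


def theta(tau):
    """Column factor Theta(tau), here 2 x 1."""
    return [-tau, 1.0]


def causal_impulse_response(t, tau):
    """W_c(t, tau) = Psi(t) Theta(tau) for t >= tau, and 0 for t < tau."""
    if t < tau:
        return 0.0
    return sum(p * q for p, q in zip(psi(t), theta(tau)))


print(causal_impulse_response(3.0, 1.0))   # t >= tau: Psi(t) Theta(tau) = t - tau
print(causal_impulse_response(1.0, 3.0))   # t < tau: zero
```

The separable product is defined for all $t$ and $\tau$; causality enters only through the truncation to the region $t \ge \tau$.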


In similar fashion, the anticausal impulse response must have the form

(11.14)

'J' ( t ) 0 ( T) , t

o, Defini tion 11.]0.

T >

(-"',T)

are linearly independent over Definition 11 .11.

> r

the rows of

,

[i;,"')

linearly independent over

(-"', i;)

while the col umns o f

i;
