Stochastic Stability and Control
M A T H E MAT I C S I N SCIENCE AND ENGINEERING A S E R I E S OF M O N O G R A P H S A N D T E X T B O O K S
Edited by Richard Bellman University of Southern California 1.
2.
3. 4.
5. 6.
7. 8. 9. 10. 11.
12. 13. 14. 15. 16.
17. 18. 19. 20. 21.
22.
TRACY Y. THOMAS. Concepts from Tensor Analysis and Differential Geometry. Second Edition. 1965 TRACY Y. THOMAS. Plastic Flow and Fracture in Solids. 1961 RUTHERFORD ARIS. The Optimal Design of Chemical Reactors: A Study in Dynamic Programming. 1961 JOSEPH LASALLEand SOLOMON LEFSCHETZ.Stability by Liapunov's Direct Method with Applications. 1961 GEORGE LEITMANN (ed.) . Optimization Techniques: With Applications to Aerospace Systems. 1962 RICHARDBELLMANand KENNETHL. COOKE.DifferentialDifference Equations. 1963 FRANKA. HAIGHT.Mathematical Theories of Traffic Flow. 1963 F. V. ATKINSON. Discrete and Continuous Boundary Problems. 1964 A. JEFFREY and T. TANIUTI. NonLinear Wave Propagation: With Applications to Physics and Magnetohydrodynamics. 1964 JULIUS T. Tow. Optimum Design of Digital Control Systems. 1963 HARLEY FLANDERS. Differential Forms: With Applications to the Physical Sciences. 1963 SANFORD M. ROBERTS. Dynamic Programming in Chemical Engineering and Process Control. 1964 SOLOMON LEFSCHETZ. Stability of Nonlinear Control Systems. 1965 DIMITRISN. CHORAFAS. Systems and Simulation. 1965 A. A. PERvozvANsKrr. Random Processes in Nonlinear Control Systems. 1965 MARSHALL C. PEASE,111. Methods of Matrix Algebra. 1965 V. E. BENES.Mathematical Theory of Connecting Networks and Telephone Traffic. 1965 WILLIAMF. AMES.Nonlinear Partial Differential Equations in Engineering. 1965 J. A C Z ~ LLectures . on Functional Equations and Their Applications. 1966 R. E. MURPHY.Adaptive Processes in Economic Systems. 1965 S. E. DREYFUS.Dynamic Programming and the Calculus of Variations. 1965 A. A. FEL'DBAUM. Optimal Control Systems. 1965
MATHEMATICS I N S C I E N C E A N D E N G I N E E R I N G 23. 24. 25. 26. 27.
28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38.
A. HALANAY. Differential Equations : Stability, Oscillations, Time Lags. 1966 M. NAMIKOGVUZTORELI.TimeLag Control Systems. 1966 DAVIDSWORDER. Optimal Adaptive Control Systems. 1966 MILTONASH. Optimal Shutdown Control of Nuclear Reactors. 1966 N. CHORAFAS. Control System Functions and Programming ApDIMITRIS proaches. ( I n Two Volumes.) 1966 N. P. ERUGIN.Linear Systems of Ordinary Differential Equations. 1966 SOLOMON MARCUS.Algebraic Linguistics; Analytical Models. 1967 A. M. LIAPUNOV. Stability of Motion. 1966 GEORGE LEITMANN (ed.). Topics in Optimization. 1967 MASANAO AOKI.Optimization of Stochastic Systems. 1967 J. KUSHNER. Stochastic Stability and Control. 1967 HAROLD MINORUURABE.Nonlinear Autonomous Oscillations. 1967 F. CALOGERO. Variable Phase Approach to Potential Scattering. 1967 A. KAUFMANN. Graphs, Dynamic Programming, and Finite Games. 1967 A. KAUFMANN and R. CRUON. Dynamic Programming: Sequential Scientific Management. 1967 J. H. AHLBERG, E. N. NILSON,and J. L. WALSH.The Theory of Splines and Their Applications. 1967
In preparation Y. SAWARAGI, Y. SUNAHARA, and T . NAKAMIZO. Statistical Decision Theory in Adaptive Control Systems A. K A U P M A Nand N R. FAURE.Introduction to Operations Research RICHARD BELLMAN. Introduction to the Mathematical Theory of Control Processes ( I n Three Volumes. ) E. STANLEY LEE. Quasilinearization and Invariant Bedding WILLARD MILLER,JR. Lie Theory and Special Functions Nonlinear PAULB. BAILEY,LAWRENCE F. SHAMPINE, and PAULE. WALTMAN. Two Point Boundary Value Problems
This page intentionally left blank
STOCHASTIC STABILITY A N D CONTROL Harold J. Kushner Brown University Providence, Rhode Island
1967
ACADEMIC PRESS New York / London
COPYRIGHT Q 1967, BY ACADEMIC PRESSINC. ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OX ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC. 111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. Berkeley Square House, London W. 1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 6630089
PRINTED IN THE UNITED STATES OF AMERICA
To Linda
This page intentionally left blank
Preface
In recent years there has been a great deal of activity in the study of qualitative properties of solutions of differential equations using the Liapunov function approach. Some effort has also been devoted to the application of the “Liapunov function method” to the design of controls or to obtain sufficient conditions for the optimality of a given control (say, via the HamiltonJacobi equation, or dynamic programming approach to optimal control). In this monograph we develop the stochastic Liapunov function approach, which plays the same role in the study of qualitative properties of Markov processes or controlled Markov processes as Liapunov functions do for analogous deterministic problems. Roughly speaking, a stochastic Liapunov function is a suitable function of the state of a process, which, considered as a random process, possesses the supermartingale property in a neighborhood. From the existence of such functions, many properties of the random trajectories, both asymptotic and finite time, can be inferred. The motivation for the work was the author’s interest in stochastic problems in control, and the methods to be discussed enlighten many such problems; nevertheless, much of the material seems to have an independent probabilistic interest. Although some discrete time results are given, we have emphasized processes with a continuous time parameter, which require somewhat more elaborate methods. Actually, most of the proofs are not difficult in that they do not involve highly detailed or subtle arguments; some of the proofs are difficult in the sense that we have found it ix
X
PREFACE
necessary to refer, in their proof, to theorems which are subtle. Some of these are discussed in the background material of Chapter I, and others in the referenced works of E. B. Dynkin. The analysis requires the introduction of the weak infinitesimal operators of (strong) Markov processes, either as a general abstract object, or in the forms in which it appears for special cases. Instead of doing the analysis for various special processes, we chose the economical alternative of treating general cases, and listing special cases in remarks or corollaries. Chapter I is devoted to a discussion of many of the concepts from probability theory which are used in the sequel. Chapter 11, on stability, is the longest and probably the most basic part; the material underlies most of the results of the other chapters. The chapter is by no means exhaustive. We have concentrated on several results which we feel to be important (and can prove); there are certainly many other cases of potential interest, both obvious and not. The chapter contains a number of nonlinear and linear examples. Nevertheless, a quick survey of the examples reveals a shortcoming that we share with the deterministic method; namely, the difficulties in finding suitable Liapunov functions. In particular, we have not been able to obtain a family of Liapunov functions which can be used to completely characterize the asymptotic stability properties of the solutions of linear differential equations with homogeneous Markov process coefficients (i.e., obtain necessary and sufficient conditions in terms of (say) the transition functions of the coefficient processes, for certain types of statistical asymptotic stability). Chapter 111 is devoted to the study of first exit times or, equivalently, to the problem of obtaining useful upper bounds to the probability that the state of the process will leave some given set at least once by a given time. The approach is expected to be useful, in view of the importance of such problems in control and the difficulty of obtaining useful estimates. Chapter IV is devoted to problems in optimal control; in particular, to the determination of sufficient conditions for the optimality of a control. Some of the material provides a stochastic analog to the
PREFACE
xi
HamiltonJacobi equation, or dynamic programming, approach to sufficient conditions for optimality in the deterministic case, and provides a justification for some wellknown formal results obtained by dynamic programming. In Chapter V, we discuss several uses of stochastic Liapunov functions in the design of controls. The method may be applied to the computation of a control which reduces some “cost” or ensures that some stability property obtains. It is a pleasure to express my appreciation to P. L. Falb and W. Fleming for their helpful criticisms on parts of the manuscript, and to Mrs. KSue Brinson for the excellent typing of several drafts. The work was supported in part by the National Aeronautics and Space Administration under Grant No. NGR40002015 and in part by the United States Air Force through the Air Force Office of Scientific Research under Grant No. AFAFOSR69366.
April 1967
HAROLDJ. KUSHNER
This page intentionally left blank
Contents Preface
.
. . . . . . . . . . .
I / Introduction
. . .
ix
. . . . . . . . . . . . . . . .
1
1. Markov Processes .
2.
3. 4.
5.
6. 7.
1. Introduction
3. 4. 5.
. . .
.
. .
. .
.
.
. .
.
. .
.
.
Discrete time Markov process 1, Continuous time Markov processes 3, Notation 4, Continuity 4 Strong Markov Processes. . . . . . . . . . . . . Markov time 5, Strong Markov process 5, Discrete parameter processes 7, Feller processes 8, Weak infinitesimal operator of a strong Markov process 9, Dynkin’s formula 10 Stopped Processes. . . . . . . . . . . . . . . It6 Processes . . . . . . : . . . . . . . . . Strong Markov property 15, Differential generator 15, It6’s lemma 16, Stopped processes 17, Weakening of condition (45) 18 Poisson Differential Equations . . . . . . . . . . . Second form of the Poisson equation 20 Strong Diffusion Processes . . . . . . . . . . . . Martingales . . . . . . . . . . . . . . . .
II / Stochastic Stability
2.
.
.
.
.
. . . . . . . . .
.
. .
.
. .
...
4
11 12
18 22 25
. . . .
27
.
27
,
.
.
. . .
.
Definitions 30, The idea of a Liapunov function 33, The stochastic Liapunov function approach 34, History 35 Theorems. Continuous Parameter . . . . . . . . . . Assumptions 36, Definition 39, Remark on time dependence 41, Remark on strong diffusion processes 44 Examples . . . . . . . . . . . . . . . . . Discrete Parameter Stability . . . . . . . . . . . . On the Construction of Stochastic Liapunov Functions . . . . XI11
1
36 55 71 72
xiv
CONTENTS
Ill / Finite Time Stability and First Exit Times .
.
1 Introduction 2 . Theorems .
. . . .
77
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
79
Strong diffusion processes 89 3. Examples . . . . . . Improving the bound 100
77
. . . . . . . . . . .
91
. . . . . . . . . .
102
IV / Optimal Stochastic Control
.
1 Introduction . . . . . . . . . . . . . . . . A dynamic programming algorithm 103. Discussion 106. A deterministic result 107. The stochastic analog 108 2. Theorems . . . . . . . . . . . . . . . . . Terminology 109. A particular form for F( V ( x ) ) 118. An optimality theorem 118. Control over a fixed time interval 121. Strong diffusion process 124. General nonanticipative comparison controls 126. “Practical optimality” : Compactifying the state space 128 3 . Examples . . . . . . . . . . . . . . . . . 4. A Discrete Parameter Theorem . . . . . . . . . . .
130 141
. . . . . . . . . . . .
143
V / The Design of Controls
1. Introduction . . . . . . . . . . . . . . 2 The Calculation of Controls Which Assure a Given Stability
.
Property . . . . . . . . . . . . 3. Design of Controls to Decrease the Cost . . . The Liapunov function approach to design 150
102
109
. . 143
. . . . . 144
. . . . .
147
. . . . . . . . . . . . . . . . . . .
153
Author Index
. . . . . . . . . . . . . . . . .
159
Subject Index
. . . . . . . . . . . . . . . . . . 160
References
I / INTRODUCTION In this chapter, we list and discuss some results and definitions in various branches of the theory of Markov processes which are to be used in the sequel. The chapter is for introductory purposes only, and all the ideas which are discussed are treated in greater detail in the listed references. Occasionally, to simplify the introduction, the list of properties which characterize a definition will be shortened, and reference made to the precise definition. The sample space is denoted by Q, with generic point o.The range of the process is in a Euclidean space E , and d is a Bore1 field of sets in E.
1. Markov Processes
DISCRETE TIME MARKOV PROCESS
Let xl, .. . be a discrete time parameter stochastic process with the associated transition function P ( s , x; n + s, f). Then the function P ( s , x ; n s, r ) is to be.interpreted as the probability that x n + , is in T E € ,given that x, = x . Suppose that the conditional distribution function satisfies
+
P ( ~ n + s ~ r I x*., l , xs> = P ( X n i s E r J X s )
(11)
for all f in d and nonnegative n and s, with probability one. Then the process is termed a Markov process and, in addition, the Chapman1
2
I
/ INTRODUCTION
Kolmogorov equation (12) holds:
P(s, x; n
+ rn + s, r )
=[~(s,x;s
+ m , d y ) P ( s + m , y ; n + rn +s, r).
(12)
E
The following alternative characterization of a Markov process will be useful. To each point x in E and nonnegative integer s < co, a probability measure P , , , { A } is associated. The measure, on an appropriate aalgebra of sets in R is defined by* its values on the Bore1 sets px,s { X n + s E r }
= P ( s , X; s
+ n, r )
and satisfies with probability one p x , m {Xm+n+sErIXI
3
xm+s}
= P(m
+s,
xm+s;
s
+ n + m , r)
for each r in 8.P,, { A Ib} is interpreted as the probability of the event A , with respect to the measure P,,, conditioned on the minimum 0algebra on SZ over which b is measurable.+ Thus P x , , { x n + , ~isr }the probability that x,+, is in r, given that the process starts at s with x, = x, a constant. If P , , , ( X ~ + ~ does E ~ } not depend on s, we write P, {x,ErJ for the probability that X " + ~ E given ~ , that the process starts at x with x, = x, for any initial time s. The sample paths of stochastic process are not always defined for all time (for either continuous or discrete parameter processes). There may be a nonzero probability that the process will escape to infinity in a finite time, or at least leave the domain on which it is defined as a Markov process at some finite time. Consider the trivial Markov process whose paths are the solutions of the differential equation f = x 2 where the initial value xo (taken at t = 0) is a random variable. Then, for any T < co,it is easily seen that x, 03 as t + ((x,) 5 T, with the probability that x, 2 1/ T . In general to each initial condition x in E there is associated a
r}.
* {xn+*E T Jis defined as the w set ( w : xn+s E + Sometimes the appropriate aalgebra is written in lieu of b.
1.
3
MARKOV PROCESSES
random variable 5 which takes values in [0, co]. ( is termed the “killing time” and the process x, is defined for n < ( only. If 5 03 with a nonzero probability, then P,,,,(X,EE} will be less than 1 for large n. If the sample paths are defined for all time with probability one, then ( = 03 with probability one. It is often convenient to suppose that x, = 03 for n 2 5. However, for most purposes of the sequel, the killing time will be unimportant, and unless mentioned otherwise, we assume that it is equal to 03 with probability one. Generally we will be concerned with the behavior of a process up to the first instant of time that the process exists from a given set. It will be required that the process be defined up to this time, and this will be true either by assumption or as a known property of the specific case treated. An exception to this is Theorem 8, Chapter 11, in which it is proved that ( = co with probability one (a type of Lagrange stability) under stated conditions on the process.
=
CONTINUOUS TIME MARKOV PROCESSES If the time indices are allowed to take any values in [0, a),then the discrete case definitions carry over to the continuous parameter case. Precise definitions and results on continuous parameter processes are given in Dynkin [l, 21. Other useful sources on discrete or continuous parameter processes are Feller [l], Chung [l], Doob [l], Loeve [l], and BharuchaReid [l]. Occasionally, to assist in the simplification of notation, it will be helpful to consider time as a component of the Markov process. According to the definitions above, if x, is a Markov process, then so is the pair * (x,, t ) . We will occasionally use the convention of writing the state (x, s) (or (xs, s)) simply as x (or xs). The new state space E x [0, a)(which contains the (x, t ) sets I‘) is then written simply as E.
* The pair (xt ,t ) does not satisfy the precise definition of a Markov process as given by Dynkin [2], p. 78, condition 3.1G. This deficiency is easily overcome by a standard enlargement of the space l2 which leaves all other properties intact. See footnote on p. 79 of Dynkin [2].
4
I
/ INTRODUCTION
This will allow several results which are stated more succinctly in the homogeneous case terminology, to extend to the nonhomogeneous case. NOTATION
The following terminology will be used. Suppose that time is not a component of the state. Let the process x, be homogeneous. Then P , { x ~ E= ~P(s, ) x; s + t, r )= P ( x , t, r )with probability one for any s >= 0. Also, E,f(x,) = S f ( y ) P ( x , t , dy). If the process is not homogeneous, let P x , s { X t + s d=)P ( s , x; s + t , r ) and E,,,f(x,+,) =Sf(y) P(s, x; s + t , dy). In interpreting P , , , { x , + , E ~ }it is to be understood that x is the initial value of the process at initial time s. Px(do) and P,,s(dw)are the probability measures corresponding to the homogeneous and nonhomogeneous cases, respectively. If time is considered as a component of the state, then either the homogeneous case or the nonhomogeneous case terminology may be applied. CONTINUITY
The process x, is said to be stochastically continuous at the point x if P,{II.x,
 XI1 2 &}
+o,
lI~Il= 2
c xi'
as 6+ 0, f or any E > O . If P,{
SUP
d2AtO
IIXd
 XI1 2 8 )
+
0
uniformly for x in a set M , as 6+0, for any E > 0, then the process is uniformly stochastically continuous in the set M.
2. Strong Markov Processes
In this section, it is assumed that the killing time 5 equals infinity with probability one. This assumption will not restrict the results of
2.
STRONG MARKOV PROCESSES
5
the following chapters, but will allow a discussion of a number of concepts and results of Dynkin [2] (which we will use later) with a reasonably unburdened notation. TIME MARKOV
Let 9, be the minimum aalgebra on SZ determined by conditions on x,, 0 S s 5 t , xo = x (and completed with respect to the measure Px(do)).A random variable 7 taking values in [0, co] and depending also on x is called a Markov time if the event {T 5 t } is contained in 9, for each t < co and fixed x. 7 is also called a “time” or “optional time.” More intuitively, 7 is a functional of the sample paths and, whether or not the time 7 has “arrived” by the time t , can be determined by observing the sample paths only up to and including t . Define the first exit time of x , from an open set Q by 7 = inf { t : x, not in Q}. It is intuitively clear that if x, is continuous from the right with probability one, then z is a Markov time (since whether or not the process has left Q at least once by time tcan be determined (with probability one) by observing xs,s 5 t ) . More precisely (see Loeve [l], p. 580), let R, be the set of x points whose distance (Euclidean) from E  Q is less than l/n and let { r i } be the rationals. Then (7 5 t > = { x t $ Q }
U r’l U {XriERn} + N € F t , n r i j t
where N is a null set. A much more thorough characterization of the first entrance and exit times which are also Markov times is given in Dynkin [2, Chapter 4, Section 11. See also It6 [2]. If a Markov time is undefined for some o,we set it equal to infinity. PROCESS STRONGMARKOV
Markov times play an important role in the analysis of a Markov process, and will play a crucial role in the sequel. The events of greatest
6
I
/ INTRODUCTION
interest to us will occur at Markov times; for example, the first time that a controlled process enters a target set; the first time that a trajectory leaves a given neighborhood of the origin (which is a property of interest in stability), or the first time that two processes are within E of one another. In addition, much of the analysis concerns the behavior of sequences of random variables which are the values of a Markov process at a sequence of Markov times. It is useful to further restrict the class of Markov process with which we deal so that the analysis with the Markov times will be feasible. The definitions which follow are according to Dynkin [ 2 ] . First, let z be a Markov time and define the aalgebra Pras follows: let the set A be in Prif
for each t 2 0. The verification that Sr is a afield is straightforward; for example,letA,,n= 1, ... b e i n S  , . T h e n s i n c e A , n { s ~ t }is i n 9 , for each t and n, (U, A , ) n { T 5 t } is also in 9, for each t 2 0; hence, U, A , is in Fr,and so forth. Princludes those events which can be determined by observations on the process up to and including, but no later than, 7. Two examples will clarify this. Let z = s, a constant. T is obviously a Markov time. For t < s, (7 t } is empty, and all sets A satisfy (21). For t 2 s, (7 t } = Q and only the sets A in Sssatisfy Q u A ~9,, all t 2 s. Thus Pr= Ss. Consider also the simple example where Q contains all possible outcomes of two successive coin tosses; C?= { H H , HT, TH, TT,r)}. Let 7 = 1 if the first toss is heads and T = 2 otherwise. T is a random time. Also, Sl = {Q, r), H H u HT, T T u T H } and T2contains all possible combinations of points in Q. It is easily verified that gris the field over {Q, 4, H H u HT, T H u TT, TH, T T } , that is, the collection of events which can be observed at n = 1, if the first toss is a head, or at n = 2 otherwise. In what follows we always assume that the process x,is continuous on the right with probability one; no other processes will be considered
s
2.
STRONG MARKOV PROCESSES
7
in the monograph. The process x, is termed* a strong Markov process if, for any Markov time 7 and any t 2 0, r in 8,and x in E, the conditional probability satisfies
with probability one. Equation (22) has the interpretation that the probability of { x t + , E r } , conditioned upon the history up to 7, equals the probability of { x ~ + , E ~conditioned ), upon x, only. For example, let x, be a continuous process, and let 7 be the first exit time from a bounded open set Q. Then x , ~ d Q and , (22) means that the conditional probability that the path will be in r, t units of time after contacting dQ, is the same whether only the position on dQ at 7 is given, or whether the method of attaining the position on dQ at T is given. Since (22) holds for 7 equal to any finite constant, any strong Markov process is also a Markov process. The reverse is not true. See the counterexample in Loeve [l], pp. 577578. Nevertheless, to the author’s knowledge, all Markov processes which have been studied as models of physical processes are strongly Markovian.
DISCRETE PARAMETER
PROCESSES
All discrete parameter Marksv processes are also strong Markov processes. This follows directly from Loeve [l, p. 5811. The basic distinction between discrete and continuous parameter processes which yields the result is that Markov times for discrete parameter processes can take only countably many values. That this is sufficient may be seen from the following argument. Suppose that 7 is a Markov time for the process xl, ... . Assume, for simplicity, that the process is homogeneous. To prove the result we must show that
* If xt is not right continuous, then Dynkin [2], p. 99, requires that it be measurable.
8
I
/ INTRODUCTION
with probability one. By the left side of (23), we mean P, {xr+,,~T ISr}. Equation (23) is equivalent to the statement that
s
p, { A n {
[email protected])E r}}=
A
px(dw) p (%&(u)
9
r)
(24)
for any set A in S r .Define the event Bi = {T = j } . The B j are disjoint sets and their union (including the event B , = {z = 03)) is a.Thus (24) may be written as
1i P, { A n Bjn {xm+ r}}= j~
=
/ 7
p x (dw)p
(k&(m)
9
r>
An(UBj)
px(dw) ~ ( nx ,j ,
AnB
r>.
(25)
j
However, the defining relation (26) for discrete parameter Markov processes P , { X , , + ~ E ~ ~ Xr ,5, j } = P ( n , xi,
r)
(with probability one)
(26)
implies (25) and concludes the demonstration.
FELLER PROCESSES Suppose that the function f(x)is bounded, continuous, and has compact support. If the function
E,f (4= F (4 is continuous in x,for each t 2 0, the process is termed a Fellerprocess. Also, every right continuous Feller Markov process is a strong Markov process (Dynkin [2, Theorem 5.101). Indeed, this criterion provides a quite useful computational check for the strong Markov property; see for example, Sections 4 and 5. Other, more abstract, criteria for the strong Markov property are given in Dynkin [2], Chapter 3, Section 3.
2.
STRONG MARKOV PROCESSES
WEAKINFINITESIMALOPERATOR OF A
STRONG
9
MARKOV PROCESS
The functionf(x) is said t o be in the domain of the weak infinitesimal operator A’ of the process x,, and we write Jf(x) = h ( x ) , if the limit (27)
60
exists pointwise in E , and satisfies* lim Exh( x 6 ) = h (x) 6+0
.
Clearly, 2 is linear. It will be calculated for It6 and Poisson processes in Sections 4 and 5, respectively. Suppose that x, is the solution to an ordinary deterministic differential equation f = g ( x ) . Then “ f ( x ) is in the domain of 2’implies that f ( x ) has continuous first partial derivatives and that f ( x , ) is continuous in t. Under these conditions x f ( x ) = f:(x) f = f(x). In general, x f ( x ) is interpreted as the average time rate of change of the processf (x,) at time s, given that x, = x . Note that A’ is not necessarily a local (in the ordinary Euclidean topology) operator. The value of the function $ ( x ) = h ( x ) at some point xo will depend on the values of f ( x , ) (with initial condition xol which are attainable for small s. For example, if x, is a Poisson process, then h ( x o ) will depend on the values off(x) which can be reached by a single jump from xo. For a detailed development of the properties and uses of A’ and related operators, the reader is referred to the monograph of Dynkin [2]. Its application in the sequel is due to formula (29). Consider the form and domain of the weak infinitesimal operator when the process is not homogeneous or when f ( x , t ) depends explicitly on time. Suppose that for each constant b in some interval the function of x given by f ( x , 6 ) is in the domain of A” and J f ( x , 6) = h(x, b). Suppose also that f ( x , t ) has a continuous and bounded deriua
* If the limits (27) and (28) are uniform in x in E, thenf(x) is in the domain of the strong infinitesimal operator of the process.
10
I
/ INTRODUCTION
tioe,f,(x, r), with respect to t, for each x in E and t 5 T. Let Ex.,h(x,+a,t+6),h(x,t) E,.fft(Xf+d, t + a)fl(x, t ) as 6 + 0 and 6 >= ct +0. Then (27a) and (28a) hold and f (x, t) is in the domain of 2.(In (27a) and (28a) the symbol 2 is used for the weak infinitesimal operator of x, operating onf (x, t), where t is fixed.) lim
E,.ff(X,+d t 3
+ 6) f(x, 0 6
d0
= lim
Ex,rf(Xc+drt
+ 6)  E,,ff(Xf+d,
t)
6
d+O
+ lim Ex,tf(X,+d,6 t ) f(& r> = lim E , , f f r ( ~ t + at ,+ + z f ( x , r) 6tO
ct)
60 dZa+O
= A (x, 2) + h (XI r )
(27a)
DYNKIN’S FORMULA Suppose that x, is a right continuous strong Markov process and z is a random time with E,z < co. Letf(x) be in the domain of 2,with zf(x) = h (x). Then (Dynkin [2], p. 133)
1
E,f(x,) f(x) = Ex
1 r
T
h (x,) ds
0
= Ex
Af(x,)ds .
(29)
0
Equation (29) will be termed Dynkin’s formula. The deterministic counterpart is the basic formula of the calculus: r
f(xt) f(x) = /Axs) ds * 0
3.
STOPPED PROCESSES
11
3. Stopped Processes Define r n s = min(t, s). Suppose that T is the first time of exit of x, from an open set Q . The process I , = xtnr is a sfoppedprocess:
< T),
I,
= x,
(t
I,
= X,
( r 2 T).
We use the notation Pf and Ef (in lieu of Px and Ex)for the stopped process only when confusion may otherwise arise. If x, is right continuous, then so is I,. The process I , is strongly Markovian (Dynkin [2], Theorem 10.2). Denote the corresponding weak infinitesimal operator by JQ. If the limit (32) exists for the process I , and l i m a + o E ~(Ia) h = h (x), thenf(x) is in the domain of
xQ:
~ 9 ( 1 a > = ~x [ ~ 7 2 a I f ( ~ 6+ ) ~ . [~r<JIf(xr) x
,yA is the characteristic function of the set A . Suppose that x is not in Q . Then z Q f ( x ) = O . If x is in Q, then T > 0 with probability one by right continuity. Iff(.) is in the domain of A” and
and E:h(I,)h(x) Exh (xa)
+
as
S0
(33a) (33b)
h (x)
for all XEQ, thenf(x) is also in the domain of JQand J Q f ( x )= @(x) for x in Q. Equation (32) equals E x ~ a > [f(Xanr) r
6
 f(x6)I
12
I
/ INTRODUCTION
If V(x) is bounded and continuous, then the limit (32) is zero if (34) Equation (34) is true for the Ita and Poisson processes to be discussed, and essentially implies that the transition density P ( t , x, r ) is differentiable with respect to t (from above) at t = 0. I f f ( x ) is in the domain of A“ and (32) holds, thenf(x) is in the domain of if h(x) is continuous and bounded in Q. This follows since the difference between the lefthand sides of (33a) and (33b) is
xQ
lim ExXa>r Ch ( X a J  h (XAl
60
which equals zero under the above condition. It is implicitly assumed in much of the text that if Q 2 P,then V(x) in the domain of A”Q implies V ( x )is in the domain of and xQV(x)= JPv(x) in P. 4. It6 Processes
A class of continuous time Markov processes whose members are often used as models of stochastic control systems are the solution processes of the stochastic differential (It8) equation d x = f (x, t ) dt
+ a(x, t ) d z .
(41) z, is a normalized vector Wiener process with E(z,  zs)(zt  zs)’= Z i t  sI, where Z is the identity matrix. Define the matrix { S i j ( X , t ) } = S(x, t ) = a’(x, t ) a(x, t ) .
In some engineering literature, (41) appears as = f ( x , r) + t)$ 9 (42) where $ is called “white Gaussian noise.” The resemblence of (42) to an ordinary differential equation is misleading, since t+b is not a function. Some relations between the solutions of (41) (or (42)) and the solution of ordinary differential equations are discussed in Example 2 of Chapter 11. See also Wong and Zakai [l, 21. If a(x, s) = 0, then (41)
4. IT6 PROCESSES
13
is interpreted to be an ordinary differential equation. Equation (41) is given a precise interpretation by It8 [l],who writes it as an integral equation: 1 2 x, = xo
+
If(.,,
s ) ds
+ /.(x,,
0
s ) dz,.
0
The second integral, called a stochastic integral, is defined roughly as follows. A random function V, and variable z, respectively, are said to be nonanticipative if V, and the event {z s], respectively, are independent of zt z, for all triples t 2 u 2 s. Suppose that g ( o , s) is a scalarvalued nonanticipative random function which satisfies
s
i
E g 2 (o, s ) ds
0
=
03.
(43)
Let {ti} be an increasing sequence of numbers tending to T. Now, let g(w, s) satisfy (43)and take the value (independent of s) gi(o) in the interval [ t i , ti+l). Define the stochastic integrali(w, s) by ~i(o.s)dzs=Zgi(o)(ill+, i zti)*
(44)
0
Approximateg(w, s) to be a mean fundamental sequence of such simple functions. The integral j’bg(w, s) dz, is defined as a (mean square or probability one) limit of the corresponding sequence (44).With this definition, and the conditions (49, It8 uses a Picard iteration technique to construct a sequence of processes which converge with probability one to a process which is defined as the solution to (41):
f (x, t ) , a (x, t ) I I ~ ( X ,~ ) I I= ’ CIfi(x,
are continuous functions t)12
s~
i
where K is a nonnegative real number.
( +111x11’)
I 1 INTRODUCTION
14
The scalar case of (41) is discussed by Doob [l], and the vector case by Dynkin [2] and Skorokhod [l]. Under (45), the solution of (41) is a Markov process with killing time equal to infinity. The process is continuous with probability one and, for any 0 a < b < 00 and initial condition x , J%,a
x, is independent of z,
max
asisb
IlXill2
u 2 t . Also
P,,,{ max JIxi+d xII 2 E dbAb0
= 0} 5 (1 + llx112)3’2O ( 6 3 ’ 2 ) .
(46)
O ( . ) is uniform in t and x , but depends on E. The scalar version of (46) is given in Doob [l], p. 285, and the vector version is derived in an identical manner. By (46), the process is uniformly stochastically
continuous in any compact set. Skorokhod [I] shows that, for fixed t , x , is continuous in probability with respect to the initial condition. In other words, let the solutions xt and y, of (41) correspond to initial values x and y, respectively. Then (47) P,,y.r {Ilx,+a  Y I + d l l L E l 0 +
as IIx  yll0, for any E > 0, 6 > 0. Since Ex,amaxogtSbllxtllz < co, for each fixed a, b, and x , Chebychev’s inequality gives p,.t
{IlXT
 XI1
’N l s NK1 2
(48)
9
where K1 is some real number. See Doob 111, p. 285, for the derivations of the scalar versions of (49) and (410): f+d
f (.s,X) d . +~ (1 + 1 1 ~ 1 1 ~ ) ” ~
E x , i ( x i + d X) = E,,f(X,+d
XI
(Xt+d i+6
(49)
 XI’
S(s, x) ds
+ (1 + ( 1 ~ 1 ) ~ ) ’ ”
O(h3l2).
(410)
15
4. I T 6 PROCESSES
STRONG MARKOV PROPERTY Next, we show that the solution (in the sense of Itd) of (41) is a Feller process and, hence, since the paths x, are continuous, x, is a strong Markov process. Let g(x) be a bounded continuous realvalued function and consider t as a component of the state; then we must show that E,g(x,) is continuous in x for each t > 0. Suppose that Ig(x)l 5 B < co and x, and y , are solutions of (41) with initial conditions x and y , respectively. Then we must show that IE,g(x,)  E,g ( y , ) )+ 0 as IIx  yII + 0. We now divide the possible situations into the following three cases: either IIx, ytll 2 p, or else Ilx,  y,ll < p occurs together with either 1Ix,\1 2 N or I)x,)1< N . Thus lE,g(x,)  Eyg(y,)IS P,.,{lIxt
+ P,{
IIXrII
 YrlI 2 PI 2 8 2 N ) 2B + SUP Ig(w llull 6 P 11 W1I 5 N
+ .)
 g(w)I. (411)
Given any 6 > 0, we will find an E > 0 so that if IIx  yll 5 E , then each term on the right of (411) is less than 6/3. Choose N < c a so that P, ( 1 1 ~ ~ 1 > 1 N } 6/6B. Then choose p > 0 so that the last term of (41 1) is less than 4'3. Finally choose E > 0 so that P,,,{ llx, y,ll 2 p} < 6/6B, if Ilx yJJ < E . The demonstration is complete.
DIFFERENTIAL GENERATOR The operator a
9 = Cfi(x, t ) i axi
+ 21 CSij(x, 
i,j
a* axiaxj
t)
+ata

(412)
is known as the differential generator of the process x,. Suppose that the function g(x, r ) is bounded and has bounded and continuous first and second partial derivatives with respect to the x i , and a bounded and continuous derivative with respect to t . By using the evaluations (49) and (410) it is not hard to show that for each
16
I
/ INTRODUCTION
fixed t and x
Let A?g(x, t) = h(x, t). The verification of lim E x , fh ( x , + ~ ,t
60
+ 6) = h (x, t )
(414)
is straightforward, and will be demonstrated for the term gxlx,(x,t) S i j ( x ,t) only. By our assumptions on g(x, t) and S(x), lgxrx,(x,t) Sij (x, t)l 5 K2(1 + 11x(I2),for some real number K 2 . Also, Ex maxdzuzo llxul12< co for any 6 > 0. Then it follows directly from the dominated convergencetheorem, and the continuity of both the process x, and the functions Sij(.;) and gxix,(.,*),that
Equation (413)together with (414)imply that, on the class of functions described by the first sentence of the previous paragraph,
Al=
2.
1 ~ 6 LEMMA ' ~
Let x, be an It6 process, and let F(x, t) have continuous derivatives F,(x, t), Fxi(x,t), and Fxix,(x,t ) for 0 6 t 5 T and JIxJI< co, and suppose that s ~ p ~ ~ , ~ , ( F x , , with probability one. Then, for t)l(,  ~ ( x,s> ,
d i ~ ~ ( x u, ), du
S
1
+ J ~:(x,,, u ) c(x,,, u ) dz,. S
(415) with probability one. Now, consider a more general It6 process. Assume thatf(x, t,
0)
4.
IT6 PROCESSES
17
and a(x, t , 0)satisfy (45) uniformly in o and are nonanticipative random functions if x: is nonanticipative. Then the equation dx = f ( x , t, 0)dt
+ a ( x , t , 0)dz
(416)
may be interpreted in very much the same way as (41). The solutions 2 0 < 03, etc. are continuous with probability one, E m a x m > b > s > n x:xs Of course, x, is not necessarily a Markov process. Furthermore, (41 5) holds where =
a

at
+Cfi(X, i
t,
a
0)
axi
aZ + 1 c s i j ( x , t , 0)axi . axj 2i,j
Equation (415) also holds if t and s are nonanticipative random variables. Write Zr as the indicator of the ( q t ) set where T 5 t , and suppose that T and q(0, t ) are nonanticipative, and T is uniformly bounded. Then
j. 0
4 (0, t ) dzr =
E
j
z,q
(0, t)
7 0
I r q (0, t ) dzr.
dz, = 0 .
0
STOPPED PROCESSES The processes obtained upon stopping the solutions of (41) at random times (of the process z t )are discussed in Dynkin [2], Chapter 11, Section 3. Suppose that the stopping time T is the first moment of exit of x, from an open set Q. By continuity, x, is on aQ with probability one. By (46), the limit (34) is zero. Let g(x, r ) be the restriction to Q + aQ of a function which, together with its derivatives, satisfies the boundedness and continuity conditions following equation (412). Then g(x, t ) is in the domain of JQand &(x, t ) = JQg(x, t ) in Q.
18 WEAKENING OF
I
/ INTRODUCTION
CONDITION (45)
Suppose that there is an increasing sequence of open sets Q,, tending to E , so that (45)is satisfied in each Q, + dQ, (with finite Lipschitz constant and bound K;). Construct a sequence of functions f m ( x , t ) and am(x, f), agreeing withf(x, 1) and a(x, t), respectively, on Q, + dQ,, and satisfying (45)for some finite K,. Corresponding to each of these functions there is a welldefined unique solution to (41), which we denote by xr. The processes x:, m 2 n, take identical values until the first exit time from Q,, and this first exit time from Q, is identical for all xr, m >= n. (See, for example, Dynkin [2], Chapter 11, Section 3, or Doob [ 2 ] . )The sequence of first exit times, T,, of xy from Q,, increase and tend to a finite or infinite limit [. By this procedure we may define a unique solution x,, to (4l),for t < [, with (45)being satisfied only locally. If ( ( 0 )< 00, then we say that x,(o) has a finite escape time. In the sequel, we will generally be concerned with processes which are either stopped at or defined until the first moment of exit from open sets. It will then be required that (45)hold in these sets only. 5. Poisson Differential Equations
Let q, be a vectorvalued process whose components are independent Poisson step processes with aid + o ( A ) the probability that the ith component will experience a jump in [f, t A ) . Given that a jump occurs, let Pi(dy)be the corresponding probability measure on the jump amplitude. Let Pi(dy)have compact support. Let*
+
for each i. Write
s
yPi(dy)= 0
d x = f (x, t ) d t
+ a(x, t ) dq .
(51)
* Eyi = 0 is a helpful assumption, since then the 91 in (51) is a martingale, and a very close analog with the results of the previous section is available. + Actually, (51) was also studied first by It6 [I].
5. POISSON DIFFERENTIAL EQUATIONS
19
The theory of equations of the form (51) is precisely that of equations (41).The definitions and existence and uniqueness proofs are identical. This derives from the fact that both the Wiener process zr and the Poisson process qt are martingales and have independent infinitely divisible increments. Both (51) and (41)are special cases of the equations discussed by It6 [l]and Skorokhod [l]. Let (45)hold. Then
as l]xyll+O. Equation (53) is the analog of (47): P,,t{ max
d>AZO
IIXt+A
 XI1 > 4
=O
(4.
(54)
The derivation of (54)is almost exactly that of (46).(The scalar case derivation of (46)appears in Doob [I],p. 385; note that the last term of Doob’s equation (3.19), p. 285, is now 0(6), rather than 0(a3/’), since our qi is a Poisson process. All other terms are similar in both cases.) The analog of (48)is derived by using (52).Equation (49) also holds here. Also, for fixed x and t , i+d
l+d
1
r
(55)
The normed term in (55) equals t+d
t+d
r
t
whose mean square value is of the order of 6. (The integrals are evaluated exactly as in the scalar case in Doob [l],p. 284,(3.16)and
(3.17).) It is readily verified that xi is a Feller process and, hence, since the paths are continuous from the right, it is a strong Markov process.
20
I
/ INTRODUCTION
Equation (54) implies that, at least on bounded continuous functions, if g ( x , t ) is in the domain of 2, then it is in the domain of A”, and J g ( x , t ) = &g(x, t ) in Q (see (31)). Define the operator 9. Let yi be a vector with y in the ith component and zeros elsewhere:
n
Suppose that g(x, r), g,(x, r), gxi(x,r ) are bounded and continuous, and the support of Pi(dy)is compact. The evaluation (55) then yields
and l i m E x , t h ( x r + at, 6+0
+ 6) = h ( x , t ) .
Thus
9=Al on the class of functions described. SECOND FORM OF THE POISSON EQUATION Let us consider another form of a “Poisson driven” process. We use the homogeneous case notation. Let f = f ( x , y ) , where f ( x , y ) is bounded and satisfies a uniform Lipschitz condition in x, and yt is a Poisson step process, taking values y’, ..., y”. Let aij8 o(6) = P{yt+a= y  ” ( y=yi}. , The pair ( x t , y,) is clearly a right continuous
+
5.
21
POISSON DIFFERENTIAL EQUATIONS
strong Markov process with no finite escape time. In fact, x, is uniformly 'continuous. Suppose that g(x, y ) is bounded and has continuous and bounded derivatives gxi(x, y ) , for each y'. Then we define the operator d : d g ( x , Y i ) = c g x , <X , Yi)fj(x, yi) + i
Ci aij [ g ( x t )'Y
 g (x,
Y')I (58)
= 12 ( x , y i ) .
We will show that on the described class of functions, d is the weak infinitesimal operator of the process (x,, y,). The fact that Ex,yih ( x , , y,) + h(x, y i ) is obvious. This property holds for any function h ( x , y ) which is bounded and continuous in x for each value of y , y = y ' , . .., y". Now, supposing that g(x,y) and its derivatives are bounded in absolute value by G, d c
d
A
+ E x , iy i ~ ~ i , d g [ ~ + ~ f ( ~ s ~ y ' ) ~ ~ + ~ f ( ~ , ~ ~ ' ) d ~ 0
A
(59) where A is the (random) time of a transition in y , (supposing that there is one transition), and the expectation on the right of (59) is over A(w). Continuing the evaluation of (59), we have Ex,yi
g(xJ9 ~
6
=) d C g x i ( x 7 Y i > f j ( x , yi) i
 d C g ( x , Y i ) aij i
+ d Ci g ( x , Y')
aij
+ G o ( 6 ) + g ( x , Yi),
which yields the desired result that (58) equals 2 on g(x, y). Let Q , , ..., Q , be open x sets. Let Q have the form Q = Q , x { y ' } ... + Q , x { y " } . Write Q = Uy=lQ i . Let T be the first exit time of (x,, y,)Jom the open set Q . Since for (x, y ) in Q , P,,,{Tc t } = O(f), we have A = d = JQon the described class of functions. If g(x, y ) and
+
22
I
/ INTRODUCTION
f ( x , y ) satisfy the imposed uniform boundedness, differentiability, and uniform Lipschitz conditions only on for each y i , then we still have the process ( x r n , , y t n , )is a right continuous strong Markov process.
JQ= d,and
6. Strong Diffusion Processes
Markov processes with continuous paths and for which there are Holder continuous functions a i j ( x , t ) , b i ( x , t), and c ( x , t ) given by (61) are called diffusion processes. Let E be any positive real number: 1
lim s+o s
1
(llxt+sxll
( x , + ,  x ) P,, (dw) = {bi(x, t ) ) = b ( x , t ) =El
r
1
k t
+.€El
It6 processes (Section 4) constitute one example (where c=O). ( c ( x , r ) A is essentially the probability of escape to infinity from x in the interval [t, t + A ) . ) Under conditions (62) and (63), there are some rather strong properties associated with the process. We now suppose that (62) and (63) hold in a bounded open region G in E. For some real positive p and any vector t,
a i j ( x ,t ) , b i ( x , t ) , c ( x , t ) are continuous and bounded
and satisfy a uniform Holder condition in x , aG ~
is of class* H ~ + . .
(63) (64)
* The local representation of the boundary aC has Holder continuous second derivatives.
6. STRONG DIFFUSIONPROCESSES
23
Let aij, b i , and c depend only on x, and not on r. The following references are all to Dynkin [2]. Under (62) and (63), there is a neighborhood U of each x in G with E,T < 00, where T is the first exit time from U (Theorem 5.8). Since G has a compact closure, (62) and (63) imply the following (Theorems 5.10 and 13.18) for x in G and on the class of bounded functions of x with continuous first and second derivatives * :
A”= A”Q = L  c(x). L
).(C
1
= Caij(X)
2i.j
a2 ax, a x j
+xbi(x)i
a
dXi
c(x).
The process is stochastically continuous in G, and the killing time is greater than zero with probability one (Lemma 5.11). In fact, if ~ ( x= ) 0, the process is continuous and defined with probability one up to at least the first moment of contact with dG. All processes with the same differential generator L  c(x) in Gareequivalent in G (p. 160) (that is, they have the same probability transition function). Furthermore (Theorems 5.8, 5.11, 13.11, 13.12, and 13.16), to each operator L  C(X) satisfying (62) and (63) on G, there is a unique process on G (terminated upon first exit from G) with differential generator L  c(x). The transition density of this process is the unique fundamental solution p(r, x, y ) of ap ( L  C(X)) p
‘at
with boundary condition p ( t , x, y ) + 0 as x ,aG and t > 0. Let G be bounded and dG satisfy (64). Let g(x) satisfy a uniform Holder condition and be bounded in G , and let q ( x ) , defined on aG, becontinuous. Then (65) has a unique solution which is (66) (Theorem 13.16):
* We write L in lieu of P, since P includes a term alar.
t The process is unique if the initial distribution is fixed.
24
I
/ INTRODUCTION
is the first instant of contact of x, with dG. In particular, let q ( x ) = 0 on dG and g(x) = 1, c = 0. Then the unique solution of Lf =  1 i s f ( x ) = E,T, which must thus be finite. Suppose that G is of the form of Figure 1, with smooth exterior
T
Figure 1
boundary rl and smooth interior boundary Tz. Let q ( x ) = 1 on rl and q ( x ) = 0 on Tz and let c = g = 0. Then Lf = 0 has the unique solution
f (x) = Exq (x,)
= P,
{xr touches
rl before r,} .
7.
25
MARTINGALES
7. Martingales
The references of this section are specifically to Doob [l]. (See also Loeve [l].) Let y,,, n = 1,. .. be a sequence of random variables and B,,, n = 1, ... a nondecreasing sequence of aalgebras. If y,, is measurable with respect to B,, and satisfies
with probability one, then the sequence {y,,, B,,} is a martingale. If
E [ ~ n +1 IBn1 5 Y,
(71)
with probability one, then the sequence {y,,, B,,} is a supermartingale. Equation (71) implies, for each set A in B,,, and any n 2 I , m LO,
1
5
Y n + rnp (do)
A
s A
Ynp ( d w )
Let x,, n = 1, ..., be a Markov process, and B, the minimum aalgebra over which x i , i 5 n, is measurable. If El V(x,,)I < 03 for each n < co and Ex[V(xn+,)IBnI 5 V ( x n ) (72) with probability one, then { V(x,,),B,,}, n < 03 is a supermartingale. The lefthand side of equation (72) equals E x n V ( x , + l with ) probability one. (This holds also for continuous parameter processes.) Let z be an (integralvalued) Markov time. If {y,,, B,} is a supermartingale or martingale, then so is { y n n T B,,) , (Chapter 7, Theorem 2.1). Let y , 2 0 and suppose that { y,,, B,,} is a supermartingale. The arguments yielding equation (3.2) of Chapter 7 of Doob [ l ] also give, for each A >= 0, P{maxyiL1}5E n>i>l
Yl
1
.
(73)
(Theorem 4.1s (Doob)). Let {y,,, B,} be a submartingale. Let 1.u.b. Ely,l = M < 03. Then limn y , = y, exists with probability one and
26
I
/ INTRODUCTION
E (y,lS M. (If y , 2 0 is a supermartingale, the theorem applies to { y,, B,}. Then the hypothesis is satisfied if Ely,l <w.) If the y. are uniformly integrable, then E ( y ,  y n ( 0. The continuous parameter results are essentially identical to the discrete parameter results. Separability of the process is always assumed. If (71) holds and B, 2 B,, t 2 s, and y , is measurable over Bt and E (y,l 0 and E > 0 there is a 6 ( p , E ) > 0 such that, if llxoII c 6 ( p , E ) , then, with xo = x, p
p, { sup IIx,II 2 E ) m>fzO
s P.
The “with probability one” clause is motivated by the arbitrariness of p. An alternative definition which is less ambiguous is: (D2) The system is stable with respect to the triple (Q, P,p ) if and only if x in Q implies that P x ( x t ~ P ,all
t
rlO
v(xt>LE},
which are provided by the theorems. V ( x ) is a stochastic Liapunov function. (D3) The origin is asymptotically stable with probability one, if and only if it is stable with probability one and x, + 0 with probability one, for all xo in some neighborhood of the origin R . If R is the whole space, we add “in the large.” (D4) For a given initial condition x, x, is un$ormly bounded by E with probability p if and only if P, { sup
W>ttO
IIXfII
>= E } s 1  p
*
The stability properties of some components of the Markov process may not be of interest. For example, let y t be a Markov process and let k = f ( ~ y’ ), . Then, under suitable conditions on f(w, y ) , y , ) is a Markov process, but it is possible that only M’, is of interest in some problems. Then definition (D2) is still applicabie, and the probability bounds still yield useful information. (Dl) is irrelevent, but may be modified as : Let (wr ,y,) be a Markov process. Let wv take values in the (wit,
32
/ STOCHASTIC STABILITY
I1
Euclidean space E,. The origin of E , is said to be stable with probability one if and only if, for each y o = y , p > 0, and E > 0, there is a 6 ( p , E , y ) > 0 so that, if 11 wo 11 < 6 ( p , E , y ) , we have, with wo = w , P,,,{
SUP
m>i>O
IIWfII
2 8) 5 P .
In fact, the case where some components of the Markov process are of no interest is probably the general case. Some of the states of a Markov model of a control process will be the observed states, or the quantities of interest in the control process. However, in order to “Markovianize” a control process, many “states” may have to be added.
(D5) The origin is exponentially stable with probability one if and only if it is stable with probability one and, for all T < co, P, { sup IIx,I( 2 E } 5 K e‘*, m>ftT
where K < co and ci > 0. (D6) The origin is unstable with probability p if and only if limlimP,{ sup I l x , l l Z ~ } = p . &+OXrO
W>f>O
(D7) The process is ultimately bounded with probability one and with bound m if and only if, for each x in E, there is a finitevalued (with probability one) random time T ( X ) such that PX{
SUP
m > f > T(X)
IIXfII
5 m} = 1
9
or, alternatively, lim P x { sup IIx,(I > m}
Trm
m>i>T
= 0.
If the time ~ ( xdoes ) not depend on x (and is finite with probability one), then the bound is said to be uniform. (D8) Let xi and it be processes corresponding to the initial conditions x and 2, respectively. The process is said to be equi
1.
33
INTRODUCTION
distance* bounded with probability one if and only if for fixed IIX
2117
Px,, { sup 11x1  frll 2 o o > t ~ O
E}
+
0
uniformly in x and 2,as E + 00. (D9) The process is said to be equistable ~ i f probability h one if and only if it satisfies (D8) and, for any fixed E > 0,
K*;{sup
m>t>O
11x1 
2111
L &}  0
as j/x 211 + O , uniformly in x and
2.
The properties (Dl) and (D9) depend on the behavior of the sample paths over a time interval of positive length (as opposed to moments, for example, which depend only on the distribution of x, at fixed instants of I, and only indirectly on the path characteristics). Most of the sequel is devoted to this type of property. An exception is Theorem 6 that allows the estimation oflim,,, E,KV(x,) (if such exists and is independent of x) and of l imj r E x J V ( x , )d t / T for suitable functions Finite time or first exit time results appear in Chapter 111.
THEIDEA
OF A
LIAPUNOV FUNCTION
The Liapunov function approach to the study of the stability of deterministic systems can be briefly introduced as follows. Suppose that the nonnegative continuous function V ( x ) , satisfying V ( 0 )= 0, V ( x )> 0, x # 0, has continuous first partial derivatives in the bounded set Q,, = (x: Y ( x ) < m } , m < co. Let i= f ( x ) have a unique solution in Q,,. Now the sets Q,, each containing the origin, decrease monotonically to the origin, as r 0. Suppose that r'(x,) = C': (.Y,)~(X,) =  k ( x , ) 5 Oin Q,, where k ( x ) is continuous. The fact that r'(.~,) 5 0 in Q, together with the definition of Q, implies that, if V ( x o )< m,then f
*
Following the deterministic definition of Yoshizawa [I].
34
11 / STOCHASTIC STABILITY
=
V ( x , )< m ;hence, x, is in Q, for all t 00, provided only that xo is in Q,. This is a stability result. The function V ( x ) is termed a Liapunov function. We may proceed further. Write V ( x o ) V ( x , ) =
i 0
1 I
k ( x , ) ds
=
r‘(xs) d s .
(11)
0
Since k ( x ) is nonnegative and continuous in the bounded set Q,, and x, is in Q,, t < 00, and since the left side of (11) is bounded, it is obvious that x, + {x: k ( x ) = 0} as t 4 0 0 . Also, it is clear that V ( x , ) decreases monotonically to some nonnegative constant v. If, for example, k ( x ) = 0 in Q , implies x = 0, then x, + 0 as t + 00. These results are among many which may be obtained by the use of Liapunov functions. (See the references previously cited.) The “Liapunov function method” is a method of obtaining information on a family of solutions of i = f ( x ) without actually computing the numerical values of each solution. In lieu of the problem of numerical solution, there is the problem of obtaining useful Liapunov functions.
THESTOCHASTIC LIAPUNOV FUNCTION APPROACH
In studying the qualitative properties of a random process, we study the qualitative properties of a family of functions (the sample paths) simultaneously. If r‘(x,) 5 0 for each sample path, then the deterministic theorems may be used, and the problem may not have much probabilistic interest. A basic property of deterministic Liapunov functions, obtained in the previous subsection, is V ( x , ) v 2 0. While we do not ordinarily expect r’(x,)SO in the stochastic case, it is reasonable to expect that some properties of a stability nature could be inferred from the form of A V ( x ) , the natural stochastic analog of the deterministic derivative. Further, since we are interested in making inferences concerning the asymptotic value of V ( x J from local properties, the possible applicability of the martingale theorems is suggested.
35
1. INTRODUCTION
If V ( x ) 2 0 and, for any A > 0, with probability one, then the supermartingale theorems apply and one may infer V ( x , )+ v 2 0 with probability one. Just as the form of the functions k(x) was helpful in obtaining information on the asymptotic values of x, in the deterministic case, we expect that the form of (12) would yield similar information in the stochastic case. In fact, Dynkin’s formula provides the needed connection between V ( x ) , A V ( x ) , and the martingale theorems, and plays the same role in the stochastic proofs as (11) plays in the deterministic proofs. Let V ( x ) 2 0 be in the domain of 2 and let T be a Markov time satisfying Exr < 03. Then, supposing that A”V(x) =  k(x) 5 0, V ( X ) E X V ( x , ) = E x
i 0
s
k(xJ ds =  Ex A V ( X , )ds >= 0 . 0
(13)
It is reasonable to expect that V ( x , ) is a supermartingale, and also that x,+ {x:k ( x ) = 0} with probability one (these facts will be proved in Section 2). A number of difficulties attend the immediate application of (13). First, the domain of 2 i s usually too restricted to include many functions V ( x ) of interest. Even if V ( x ) is in the domain of A, k ( x ) may be nonnegative in a subset of E only and, furthermore, to investigate asymptotic properties, we must let T + cc.These difficulties may be circumvented by first applying (13) to an appropriate stopped process derived from x,, and then taking limits.
HISTORY The probability literature contains much material concerned with criteria under which random processes have some given qualitative property. However, the first suggestions that there may be a stochastic Liapunov method analogous to the deterministic Liapunov method seems to have appeared in the papers of Bertram and Sarachik [ l ] and Kats and Krasovskii [l]. Both papers are concerned with moments
I1 / STOCHASTIC STABILITY
36
only. Bucy [l] recognized that stochastic Liapunov functions should have the supermartingale property and proved a theorem on “with probability one” convergence for discrete parameter processes. Bucy’s work is probably the first to treat a nonlinear stochastic stability problem in any generality. Some results, of the Liapunov form, were given by Khas’minskii [3] for strong diffusion processes. The martingale theorems have had, of course, wide application in providing probability bounds on stochastic processes, although their applicability to the problems of the references had not been exploited very much prior to these works. Other works are those of the author included here and in Kushner [361, Wonham [2, 31 (for strong diffusion processes), Khas’minskii [2], Kats [l], and Rabotnikov [l]. There is certainly much other material on “stochastic stability,” concerned with more direct methods. See Kozin [l31 and Caughy and Gray [l].
2. Theorems. Continuous Parameter
ASSUMPTIONS The following assumptions are collected here for use in the following theorems. For some fixed m, (Al) V(x) is nonnegative and is continuous in the open set Q, = {x: V(x) < m } . (A2) x, is a right continuous strong Markov process defined until at least some 7’ > 7, = inf(t: x,$Q,f with probability one (or, for all tO.
xQm,
Denote the w set { w : x,EQ,, all t < co} by B,.
2.
THEOREMS. CONTINUOUS PARAMETER
37
Lemma 1 establishes a result which is t o be used repeatedly. It also illustrates the basic use to which Dynkin’s formula will be put in the sequel.
Lemma 1. Assume (Al) to (A4). Let A”,V(x) 5 O.* Then V(xrnrm) is a nonnegative supermartingale of the stopped process xrnr,,,and, for 2 5 m , and initial condition xo = x in Q,, Px
{
SUP
m>rbo
V (XI
2 I} 5 __ 2
V (Xrnr,,)
(21)
Also there is a random variable c(o), 0 5 c ( w ) < m, such that, with probability one relative to B,, V ( x , ) + c ( w ) , as t + w . P,{B,}2 1 V(x)/m. Proof.
By Dynkin’s formula rnrm
E,V ( x r n r m ) V (x)
= E,
J 0
A”,,,V (xJ ds
5 0.
(22)
Thus E,V(.xrnr,,,)5 V ( x ) . Also, since V ( x ) is in the domain of A”,,,, E , V ( X ~ ~+ , ,V ~ () x ) ,as t + 0. These two facts imply the supermartingale property (Dynkin [2], Theorem 12.6). Equation (21) is the supermartingale probability inequality, and can also be derived from (22). The existence of a c ( w ) follows from the supermartingale convergence theorem and the last statement of the hypothesis follows from (21).
Remark. Suppose the state is (x, t ) . Define Q , as in (Al). If ~ , , , V ( Xt ), 5 0 in Q,(J,,, is the weak infinitesimal operator of the process (xrnr,, t n7,)), then the conclusions of Lemma 1 hold. * For greater clarity, we could write the redundant statement A,!,V ( x ) 6 0 in Qt3,.ArrlV ( x ) is, of course, defined only in Q, . + An obvious set of increasing afields can be adjoined to V (xrnrm)to complete the description of the supermartingale.
I1 / STOCHASTIC STABILITY
38
Equation (21) is written as px,,{
SUP
m>rzT
V(Xr5
t ) 2 I> S
V ( x ,T )
I
~
(21')
'
where (x, 7')is in Q, and A 5 m. Theorem 1 is the basic result on stochastic stability, and is the stochastic analog of Liapunov's theorem on stability. The probability estimate (or (D2)) is more important than the mere fact that (DI) is satisfied. Theorem 1. (Stability) Assume (Al) to (A4) for some m > 0. Let V ( 0 )= 0, XEQ,. Then the system is stable relative to (Q,, Q,, 1  r / m ) , for any r = V ( x o )= V ( x ) m (see (D2) in Section 1). Also, for almost all w in B,, V ( x r n Z mt) c ( w ) m. If V ( x ) 0 for x # Oand XEQ,, then the origin is stable with probability one.
s
=
s
Proof. The proof is a direct consequence of Lemma 1, the definitions of stability, and the properties of the function V ( x ) . Lemma 2 establishes an upper bound on the average time that the process x, spends in a set. It will later be used to prove, among other things, that certain sets are reached in finite average time.
Lemma 2. Let V ( x )be nonnegative in an open region Q (Q is not necessarily bounded), x, a right continuous strong Markov process 'in Q, and T the random first exit time from the open set P c Q. Let V ( x ) be in the domain of l Q Let . lQV ( x ) 2  b < 0 in P . Then the average time, E,T, spent interior to P up to the first exit time is no greater than V(x)lb. Proof.
By Dynkin's formula
V ( x )  E , V ( x r n T )=  E , / l Q V ( x , ) ds 2 bE, 0
s
fnr
rnr
I,(s, w ) bs
0
=bEx(tnT),
(23)
where I,(s,w) is the indicator function of the ( s , w ) set where
2.
39
THEOREMS. CONTINUOUS PARAMETER
&V(x) 5  b (it depends on xo = x). The nonnegativity of V(x) and (23) clearly imply the theorem.
DEFINITION An &neighborhoodof a set M relative to an open set Q is an open set N , ( M ) c Q, such that N,(M) = (XEQ
j/x  y/l < E
such that
for some
EM}.
N , ( M ) = N , ( M ) + dN,(M). Note that it is not necessary that N , ( M ) contain M (although N , ( M ) will contain M in Theorem 2). Theorems 2 and 3 are stochastic analogs of some Liapunov theorems on asymptotic stability. Theorem 2. (Asymptotic Stability) Assume (Al) to (A5) and also A”,V(x) =  k ( x ) 5 0. Let Q, be bounded and define P , = Q, n {x: k ( x ) = O}. For each d less than some do > 0, let there exist an &d > 0 such that, for the &,neighborhood N,,(P,) of P , relative to Q,, we have k(x) 2 d > 0 on Q,  NEd(P,,,).(A condition implying this is uniformcontinuity ofk(x)onP,andk(x) > Oforsornex~Q,.) Then xt
+
prn
with a probability no less than 1  V(x)/m. Suppose that the hypotheses hold for all m > 0. Define P = Uy P,. and let N,(P)be the &neighborhood of P relative to Q = Up Q,. If, for each d, 0 < d do, there is an &d > 0 such that k ( x ) 2 d on Q  N,,(P), then x, + P with probability one. Proof. Note that the statement of the last paragraph follows from the statement of the first paragraph. Note also that if k ( x ) = O in Q,, the theorem is trivially true as a consequence of Theorem 1. By Lemma 1, x, remains strictly interior to Q, with a probability 2 [ 1  V(x)/m]. Fix dl and d2 such that do > dl > d , > 0. Let t i correspond to di
I1 / STOCHASTIC STABILITY
40
(that is, in Q,,  NCl(Pm), we have k ( x ) 2 di).It is no loss of generality to suppose that N,,(P,) is properly contained in N,,(Pm). Define Zx(s, W , E ~ ) as the indicator of the (s, W ) set where x,(w) is in Q ,  N,, (P,) and write
I
r,.
T,(t,
E ~= )
I,(s,
W , E ~ ds )
.
tnr,
T'(t, ci) is the total time spent in Q ,  NZi(P,)after time t and before either t = co or the first exit time from Q , (with xo = x). T,(t, ci) = 0 if T , 2 t . By Theorem 1, P, {x,
leaves Q , at least once before
t
v (XI
=a}= 1  P, { B , } 2 m
By Lemma 2 we conclude that T,(t, ti) < co with probability one and, hence, T,(t, ci) + 0 as t + 00. Now we distinguish two possibilities: (a) there is a random variable T ( E , ) T ( E , ) ; or (b) for W E B , , x, moves from Ne2(P,)to Q,,  N E I ( P , )and back to N,,(P,) infinitely often in any interval [ t , co). ((a) and(b) are all possibilities, since the path WEB,) can stay exterior to N E 2 ( P , ) for only a finite total time, since T,(t, c2)+0 as t + co.) Consider case (b). Since T , ( f ,E J + O with probability one as t + co there are infinitely many movements from NE2(P,,) to Q ,  NEI(P,) and back to N,,(P,) in a total time which (in the integral sense) is arbitrarily small. We now show that the probability of case (b) is zero. Fix 6, > 0. Choose h > 0 so that sup
XEQ",  N & z ( P , , , )
P,{ sup IIx,  XI1 2 h2sb 0
El
 E2}
0 there is some h > 0 for which (24) holds. For each 6, > 0, and the fixed initial condition x, there is a t < co so that P,{T,(t,
E2)
> h} < 62.
(25)
Under case (b), (24) and (25) are contradictory. Thus, we conclude
2.
THEOREMS. CONTINUOUS PARAMETER
41
that P x { x s ~ Q r n  N E , ( P , )for some s E ( f , a ) } + O as t +a, for XEQ,. Since
REMARK ON TIME
and E~ are arbitrary, the theorem is proved.
DEPENDENCE
Suppose that either the process x, is not homogeneous or that the Liapunov function V ( x , r ) depends on time, so that as a consequence either k or V are time dependent. Then Theorem 2 may still be used as stated, although it is the custom in the deterministic theory to reword the hypothesis so that certain trivial possibilities are eliminated. For example, suppose that k ( x , t ) = g ( x ) / ( l r), where g ( x ) > 0 for x # 0. Then, considering t as a state, all that we may conclude (from Theorem 3, since Q, is not compact in this case) is that either t + 00 or g(x) + 0 with some probability. We may reword the hypothesis in the following way. Let V ( x , t ) be nonnegative, continuous, and in the domain of 2, (corresponding to the process (xfnr,,, r n r , > ) w i t h ~ , V ( x , t ) =  k ( x , t)sOinQ, . (Note that Q, = {x, t : V ( x , t ) < m } . ) Suppose that k ( x , t ) k , (x) in Q,, where k , (x) satisfies the conditions on k ( x ) in Theorem 2. Let xT = x be the initial condition. Then x, + {x: k , (x) = 0 } n {x: V ( x , t ) < m for some t 2 T } with a probability no less than 1  V ( x , T ) / m . Many special results, analogous to deterministic results, may be developed for timedependent processes or functions, but we will not pursue the matter here. Theorem 3 will be useful in examples where the “stability” of only a few components of x is of interest. (For example, where some components are of the nature of “parameters.”) In such cases the sets Q, are not always bounded. Define the event DE,3 { w : W E B , , and :j Ix(s,w , ci) ds 0. Proof. If Q , and P , are not bounded, then the proof of Theorem 2 implies that either: (c) there is a random variable zl(o) < 00 with probability onesuch that x , E N , , ( P , ) , all f > T , ( w ) , for W E B , ; or (d) for any compact set* A c Q , , there is a random variable r ( A ) < co with probability one such that x , is not A for t > T ( A )and W E B , . The event DEi,ci > 0, occurs in any case. (Equations (24) and (25) are valid when Q , is replaced by compact A . ) Under condition (d), there is a monotone increasing sequence of sets A , , A , t Q,, and random variables ?(A,) < co with probability one for each n, which satisfy (d). Then, under condition (d), IIx,II +a, and x , E Q , for all t . Since P, { IIx, II 00, x, E Q,, DEi}= 0, case (c) must hold. Since is arbitrary, x, P , with probability one, relative to B,. The last statement follows from the proof of Theorem 2, and the observation that, under the uniformity hypothesis, there can be no escape to infinity in finite (with probability one) time, unless the tails of the trajectories are entirelcontained in NE(Pm). The following corollary will be useful in the chapters on optimal cony trol and design of controls. The set S will correspond to a “target” set. +
+
Corollary 31.
Assume (Al) to (A5) with the following exception: V ( x ) is not necessarily nonnegative. Let S = { x : V ( x ) 0}, and define Qk = { x : m > V ( x )> O}. Define T, as inf { t : x , $ Q L } . Let 2, be the weak infinitesimal operator corresponding to the stopped process x r n r mLet . V ( x )be in the domain of A”, with X , V ( x ) =  k ( x ) 5 0 in QL. Suppose that k ( x ) is uniformly continuous** in the set PL = * The words “any compact set A” could be replaced by the words “any set A for which (A5) holds uniformly, bounded or not.” ** More generally let N , (Pm) be an &neighborhoodof P‘,,,relative to Q‘,,,. Let k ( x ) satisfy, on Q‘,,,  N,, (Fm) the conditions on Q‘,,,  N,, (P,) of Theorem 2. Then Corollary 31 remains true.
2.
THEOREMS. CONTINUOUS PARAMETER
43
{ x : k ( x ) = 0 } n {Q,, +as} and k ( x ) > 0 at some x in QL.Then there is a random variable 7 5 co such that x,+ ( x : k ( x ) = 0)u {Q,,+ S } as ? + T , with a probability no less than 1  V ( x ) / m .Also
Proof. The proof follows closely the proofs of Theorem 2 and Lemma 1 and will be omitted. This situation envisioned in Corollary 31 is depicted in Figure 1. The behavior of the process after 7 is not of interest. Note that if m is arbitrary in Corollary 31, and k ( x ) > 0 in each Q6,then there is a random variable 7’ co such that x, + S with probability one as t + 7’.
Figure 1
The result for It6 and Poisson processes are collected in Corollaries 32 and 33. The proofs involve merely the substitution ofthe properties of these processes into the hypotheses of Theorem 2 or 3. Corollary 32. Let x, be an It8 process, the coefficients of whose differential generator satisfy the boundedness and Lipschitz conditions (of Chapter I, Section 4) in Q, . Let V ( x ) be a continuous nonnegative function which is bounded and has bounded continuous first and second derivatives in Q,. (If S i j ( x )E 0 in Q,, then 8’ V(x)/dxia x j need not be continuous.) Then V ( x ) is in the domain of 2 ,. With
44
I1
/ STOCHASTIC STABILITY
the homogeneous case notation, A”,V(x) = m q x ) = X&(x)i
1 a2 v ( x ) axi + 2 iC, Sj i j ( X )  axi a x j =  k ( x ) .
dV(X)
~
s ( x ) = { S i i ( X ) } = a ‘ ( x ) .(x).
Let k ( x ) 2 0 in Q,, and if Q , is unbounded, let P,{ IIx,II 3 co,X , E Q,, D,} =0, for each E > 0. Then x,+ {x: k ( x ) = 0} n Q, with at least the probability 1  V ( x ) / m .
REMARK ON
STRONG DIFFUSION PROCESSES
Define G, = {x: /lx/\ r > and R > r > 0. Suppose that x, is a strong diffusion process in G R  C,, for some fixed R > 0 and each r > 0 (that is, for each r > 0, there is an m, > 0 so that t’S(x) 2 mrl\tl\z, for any vector 5). Note that the process is not necessarily a strong diffusion process in the set G R ,since m, may tend to zero as r + 0. In fact, if the origin is stable with probability one ((26)) this must be the case, si.nce, otherwise, the escape time from G Ris finite with probability one, for any initial value x, including x = 0. Let
rzO
as
x+O.
(26)
Then there is a nonnegative continuous function V ( x ) satisfying V ( 0 )= 0, V ( x ) = 1, x E a C , , V ( x ) > 0, x # 0, and X E G , , and V ( x )is in the domain of the weak infinitesimal operator of each process x t o r r , where z, is the first exit time from G R G,  aGR = D,. Also, z D , V ( x )= L V ( x ) = 0 in D,for each r > 0 and px {
SUP
m>r>=O
llxrll
L R} I Px {
SUP
U2>t20
v(xr)
L I} I J‘(x).
(27)
In other words, the existence of a stochastic Liapunov function is also a necessary condition for stability of the origin with probability one (26). The sufficiency follows from Theorem 5 (we say nothing there about the properties of V ( x )near x = 0). Also x, 0 with a probability
2.
45
THEOREMS. CONTINUOUS PARAMETER
at least 1  P x { ~ ~ p , , , z O llxfll 2 R } . Thus, stability of the origin with probability one implies asymptotic stability with probability one, in this case. The result is due to Khas’minskii [3], and a proof follows. Define the function q ( x ) on the boundary of G R G,. Let q ( x ) = 1 on JGR and q ( x ) = 0 on dG,. Then, since x, is a strong diffusion process in D,, the equation (see Chapter I, Section 6) LV,(x) = 0 with boundary condition V,(x) = q ( x ) , for xEdD, has a unique continuous solution which is given by V, (x)
= E,q (xr,)
= P, {x,
reaches JG, before x, reaches JG,),
where T, is the first exit time from D,.Also E,T, < 03, X E D , , and 0 Vr(X) 5 1. Let { r i } be a sequence of values of r tending to zero. It is easy to show, using the strong maximum principle for elliptic operators, that V,,(x) is a nondecreasing sequence in each GR G, (for r, 5 r ) . Define V ( x ) :
s
V (x)
= lim r+O
V ,(x)
= lim P, {x,
reaches dG,
before
JG,}
r+O
= P,
{ sup IIx,II 2 R } . rn>fZO
(28)
The last term on the right of (28) should be PX{sup,,, 2 o llxfll 2 R}, where T = limr+o T,. If T < co with a probability which is greater than zero, then either IIx,II = R for some t < 03, or llx,ll 4 0 , t  5 . In the latter case sup,>,zo IIx,II < R with probability one (relative to (Ilxtll + 0 as t + T}). This may be proved by use,of the hypothesis (26). Thus equation (28) is valid. By hypothesis the right side of (28) goes to zero as x+O; thus Y(0)= 0. Also, by the method of constructing the Ifr(.). 0 5 V ( X )5 1 and V ( x )= 1 on dGR. V ( x ) is the limit of a nondecreasing uniformly bounded sequence of harmonic functions; hence, L V ( x ) = LV,(x) = 0 in each D,, r > 0. Now an argument similar to that used in Theorem 5 may be applied to complete the proof. V ( x ) is in the domain of the weak infinitesimal operator of the process in each D,, and the operator, acting on V ( x ) in this region, is L. In Theorem 5, we require A”,,, V ( X )  6, in
I1 / STOCHASTIC STABILITY
46
Q R Q, (or, equivalently, L V ( x )  6, < 0 for each r > 0 in Dr), in order to assure that the first exit time from the set Q R QEwould be finite with probability one. The latter property holds here for each D,. Thus, pursuing the argument of Theorem 5 , we conclude that xt+O with at least the probability 1  Px{sup, IIxtII 2 R } . Equation (27) is obvious from (28). Corollary 33.
Let the nonnegative function V ( x ) have bounded and continuous first partial derivatives in Q , and let dx = f ( x ) dt
+ C(X) dz
be the Poisson equation of Section 5, Chapter I. Let f ( x ) and ~ ( x ) satisfy a uniform Lipschitz condition and be bounded in Q,. Let aid + o(d) be the probability that zit has a jump in [ t , t + d) and Pi(dy) the density of the jump in zit. Let each Pi(&) have compact support. Suppose that V ( x + ~ ( xyi) ) is bounded* and continuous for x in (9, and for each y' for which Pi(dy) is not zero. Then V ( x ) is in the domain of A', and A",V(x) = ~ V ( X = ) C L ( x )~ x , ( x ) i
s
+ Ci [ v (X + 0 (x) yi)  v (.)I
=
aipi ( d y )
k(x).
Suppose that k ( x ) satisfies the conditions of either Theorem 2 or 3. (Note that k ( x ) is continuous in Q,.) Then xt + { x :
k ( x ) = 0 } n Q,
with probability at least 1  V ( x ) / m . Proof. The theorem follows from Theorem 3 and Section 5 , Chapter I. If A,V(x) =  k ( x ) 5 0 in (3, and k ( x ) is proportional to V ( x ) in Q,, then a type of exponential convergence occurs. In particular:
* yi
is a vector with y in the ith component and zero elsewhere.
2.
47
THEOREMS. CONTINUOUS PARAMETER
Theorem 4. (Exponential Asymptotic Stability) Assume (Al) to (A5), V(0)= 0,&V(x) 5  o!V(x)in Q , for some o! > 0. Then, with xo = x,
P x { sup
oo>rbT
V(XJ
v (x) v (x) eaT +
2 A} 5
~
I
ni
If the hypothesis holds for arbitrary rn, then
Proof. Let
T,,,= inf
{ t : x,#Q,,,}. Define the process Zt 2, = X,
( t < T,),
Zr = 0
(t
2 Tm).
Let Excorrespond to the process Zr. The process Zr, t < oc), is a right continuous strong Markov process (although 7, is not necessarily a Markov time of the process &). By Dynkin's formula,
s
7,,,nr
EXV(2,) V ( X ) E , V ( X ~ , ~ ,) V ( X )= E x
.X,,,V(X,)~~
0
r
r,nt
 o!EJ
V(X,) ds =  .Ex/
I 
r
V(XJ ds 0
0
=. p y ( P , )
ds,
0
which implies, by the GronwallBellman lemma, that
EXv(zT)5 ~ Now, the facts gxV(zr)
( x e'=. )
s V(X)
E x ~ (,~~J( x ) , as
t ,0
imply that V(2,) is a nonnegative supermartingale (Dynkin [2],
48
I1
/ STOCHASTIC STABILITY
Theorem 12.6). The supermartingale inequality yields
which, together with P,{
SUP
Ca>tzo
IV(Tr)  V(x,)l > 0 } = P x {
SUP
m>r20
V ( x r )L m }
V (x)
Sm
implies the theorem. An alternative proof can be based on the observation that J,ezr V ( x ) S 0 in {x: V ( x ) < m } = Q,. Then earn', V(xlnl,) is a non
negative supermartingale. Thus
P,{ sup V ( x J exp m>rZO
at
2 m } = P,(V(x,)
>= m exp
 a t , some t
0; we have
Since P,{t, < T } S V ( x ) / m and ExV(xTnr,)exp c( T n r , S V ( x ) , Theorem 4 is again proved. It is not usually necessary that V ( x )be in the domain A", in all of Q,. In particular, if we are concerned only with the proof that x,+ 0 as t + co, then the properties of V ( x ) at x = 0 may not be important. It may only be required that V ( x ) be in the domain of the weak infinitesimal operator in the complement (with respect to Q,) of each neighborhood of { x = O } . Theorem 5, giving just such an extension of Theorem 3, will be useful to improve the result in one of the examples where the process is an It6 process and the Liapunov function has a
2.
49
THEOREMS. CONTINUOUS PARAMETER
cusp at the origin. The theorem may be extended to include noncompact Q,. Note that stochastic continuity is not explicitly used in the proof. The supermartingale inequalities (29) and (210) are sufficient to give the result in the case of Theorem 5, since { x : k ( x ) = 0} must be included in ( x : V ( x ) = O > . Define, for E < ~ , T , ( E ) = inf ( t : x , $ Q m 
Q,  aQJ
Theorem 5. Assume (Al) to (A3), V ( 0 )= 0, and that the set Q , is bounded. Denote the first exit time of x, from Q ,  Q ,  dQ, by T,(E). Let A”,,,denote the weak infinitesimal operator of the process xtnTm(&). Let V ( x )be in the domain of for E > 0 arbitrarily small, and let A,,,, V ( x )=  k ( x ) 5 0 in Q ,  Q ,  dQ,. For each small E > 0, let there be a 6 > 0 so that k ( x ) 2 6 in Q ,  Q ,  d Q , . Suppose that {x: V ( x ) = O } is an absorbing set; that is, exit is impossible.* Then x,+ ( x : V ( x )= 0 } with a probability at least 1  V ( x ) / m ,as t+m.
zm,t
Proof. define the sequence^^,&^=^^^^^,^^=^, whereO<E<m, and E < 1. The Q,, decrease t o { x : V ( x )= 0}, and V ( x ) is in the domain of XmxEi, for each i < 00. Let ai correspond to c i . Then, by Lemma 2, E,T,,(E~) < co and the T , ( E ~ ) , i < co,are all finite with probability one. Denote by T the limit T = limn T,(E,,). T is a Markov time of x , , since the nondecreasing T , ( E ~ ) are. Let x = xo # 0 and let y be any point in Q,, dQ,,. Note that
+
P,,{V(x,) 2
before V ( x , ) 2 E , , + ~ } 2
V ( Y )<  = EEnn ~
En1

.
(210)
En1
The probability of the set of trajectories (Q  Be) which leave Q , before entering any arbitrary Q,, E > 0, is less than V ( x ) / m . Then
* The assumption that { x : V ( x ) 0) is an absorbing set is convenient if r < co (see proof) witha nonzero probability. It eliminates the need for introducing assumptions which guarantee it, as well as the attendent arguments. In any case, either V ( m + O a s r + m o r V(xJ =Oforsomes< co,withprobability 2 [ I V(x)/rn]. :
I1 / STOCHASTIC STABILITY
50
+
ape", as t + T,(E,) with at least the probability 1  V ( x ) / m .Once in Q,, dQ,., the probability that a trajectory will leave pen,before entering Q,.,, +dQ,,+, is bounded by E" by (210). Thus, for W E B , , ,V ( x t )is bounded by E , ,  ~ in the time interval [ ~ , , ( E , , ) , T , ( E , , + ~ ) ) with aprobabilitynoless than(1  E , , / E , ,  ~ ) = (1  E"). Thus, x, converges uniformly to zero, as t + T , with a probability no less than
x,(w)+ Q,,
+
(21I) Since E is arbitrary, V ( x , ) + O with probability 2 [l  V ( x ) / m ] , as t + 7,and the theorem is proved. The Liapunov function method may be used to estimate the asymptotic values of moments, providing that they exist. In any case, we have: Theorem 6.
Assume (Al) to (A4). Let V ( x ) be in the domain of
x,,,for each m > 0. Suppose that V ( x ) is bounded for finite x, and
that x, does not have a finite escape time (with probability one) to cc). Let
+
(212)
5 1.
(213)
X,,,V(x) 5  k ( x ) c2 k ( x ) Z O ineach Q,, cc)>c>O.
Then K
t 0
Strict equality in (212) implies strict equality in (213). If E,k(x,) has a limit, then it is at most c2. Proof. By Dynkin's formula and (212)
1
tnrm
(XI

(Xznr,,,)
=  ~x
s
JmV (xs) ds
0
tnf,,,
2 Ex
0
[ k (xs)  c2] d s ,
(214)
2.
51
THEOREMS. CONTINUOUS PARAMETER
where T, is the first exit time from Q,. Since t < 00, co > c > 0, and k ( x ) 2 0, and also since, by hypothesis, ~ , + c o with probability one, we may replace the limit of the righthand integral of (214) (as rn + co) by the integral from 0 to t (dominated convergence theorem). Also, by Fubini’s theorem, the order of integration may be changed. Thus r
V ( x )  lim EXV(xrnr,) 1 / E , k ( x , ) ds  c2r. mr w
0
Now, (213) follows by letting t + c o in V ( x )  limm+m E,V(xtnr,)
V(x) C2t
2 
j
1 E k ( x ) ds >  x s 1 .  t
C2 t
c2
0
Remark. Generally, in applications of Theorem 6 , when V ( x ) is large, A^,V(x) is negative ( V ( x ) decereases “on the average” along trajectories of the process) and Z,V(x) is positive when V ( x ) is small ( V ( x ) then increases along trajectories of the process “on the average”). x, ‘‘oscillates’’ in some random fashion about the set Urn“= I ( ( x : x m Y ( x )= 0 } n Qn,). Since, in applications generally A”,V(x) = x , , V ( x ) , n > rn, in Q,, the sets { x : Z , V ( x ) = 0 } nQ, will generally be increasing. Extension. Let x , V ( x ) =  k ( x ) +g(x), k ( x ) 2 0, g(x) 2 0. Then Theorem 6 may be extended to read
provided only that some condition is given which guarantees that rmnf
Ex
and
J ( k ( x J + d x , ) ) ds =  E x 0
1
r,nr
0
1
rmnt
k ( x J ds
+Ex
0
g ( x J ds
52
I1
/ STOCHASTIC STABILITY
While the problem of existence of stochastic Liapunov functions has not been solved,* some existence results are possible. Theorem 7 is applicable to the case where, for example, x, is an It6 process, h(x,) = IIxsll, and Ex llxJ tends to zero exponentially. See Kats and Krasovskii [l] and also Hahn [l], Theorem 24.5.
Theorem 7. Let x, be a right continuous strong Markov process. Let h ( x ) satisfy h(0) = 0, h ( x ) > 0 for x # 0, and be continuous at 0. Suppose that x o = x and
s m
F ( x ) = Ex
all x in E
h ( x , ) ds
ds'
dnrp
* Stochastic counterparts of the deterministic existence theorems will appear in H. J. Kushner, Converse theorems for stochastic Liapunov functions, SZAM J . Control 5, No. 2 (1967).
2.
THEOREMS. CONTINUOUS PARAMETER
53
Then, under the hypothesis,
E J (XanrJ  F ( X I h (xs) ds   Ex IFTQ t  h (x) 6
6
and, hence, A",F(x)=h(x),
Exh (4P h ( X I
each Q c E
and
XEQ.
By hypothesis, Q , = ( x : F ( x ) < m} is bounded and F ( x ) is continuous. Q.E.D. Thus F(x) satisfies the hypothesis on V ( x )of Theorem 2. Theorem 8. Assume (Al) to (A4), except that the killing time C may be less than infinity. Let 0 S V ( x ),03 as llxll+co and, for each m > 0, A",V(x) =g(x), where g(x) S 0 for large IIxII. Let there be an infinite sequence of compact sets G , f E , such that, if llx,ll 00 as t + 5 03, then (with probability one) x,(~)EG,for some sequence of random times t ( i ) f c. Then there is no finite escape time ([ = m with probability one). If A",V(x) 5  E < 0 in Q ,  G, for all large m and where G is compact, then x , always returns to G with probability one.
=
Remark. The purpose of the condition on the Gi and T ( i ) is to eliminate the possibility that x, either disappears or jumps directly to infinity from some finite point.* A simple example will illustrate the problem. Let x , be a constant, t t } = c6 + o(6). Then E,V(x = (1  cb o(6)) V ( x ) and
+
A",v(x)=  c V ( x ) for all V ( x ) . But we may infer neither stability nor even boundedness of the x, process (in any time interval). Proof. For large i each Gi is contained in some bounded Q,, and contains some Qpi. For all large tl, g(x) 0 in E  Q,. Let Q, 3 Q, =I Q p 
* An equivalent condition is that for each bounded open set Q with first exit time TQ, there is an EQ > 0 with probability one and C >= TQ EQ.
+
I1 / STOCHASTIC STABILITY
54
with x = x o in Q,  Q,  dQ, and g(x) 5 0 in Q,  Q,  dQ,. Define T, = inf { t : x,$Qm Q,  @,}.Then T, < 5 with probability one, for each m, and
s
rmnr
E,V(X~~,~,) V ( X )= E x
=
Since r,,, i we have mp, all t < co and, hence, P,{
XmV(xS)ds = E ,
0
0
SUP^,^^ z s
sup V(xs)
r,bsbO
s
r,nt
g(xJ ds
5 0.
V(xs>2 m > 5 ExJ'(Xr_,,)
V (XI >= m } 5 . m
for (215)
+
Since (215) holds for arbitrary m < co,x , must always enter Q, aQ, before going to infinity, with probability one. This implies that there is no finite escape time. The last statement of the theorem follows from the above reasoning and Lemma 2. Many other stochastic Liapunov theorems are possible. The proof of Theorem 9 is a slight variation on the proof of Theorem 2 and will not be given. Theorem 9. Assume (Al) to (A5) with x n , V ( x )5  k(x) + qr, for each m,where k ( x ) Z O and cpt+O as t+m. Let k(x)+cr, and V ( x ) + c o as //x/]+co. Let k ( x ) be continuous on the compact set {x: k ( x ) = 0} = P. Then, for each neighborhood N ( P ) , there is a finitevalued random time T~ such that x , e N ( P ) , all I > T ~ with , probability one. Suppose that k ( x ) 5 0 in Q, 3 P only. Then x , + P with the probability that x , does not escape from Q m . The following theorem is proved by use of the supermartingale V ( x ) +S:cp, ds; the proof is omitted. Theorem 10. Assume ( A l ) to (A4) and let V ( x ) + c r , as llxll +a. Let x m V ( x )5 cp, where cp, 2 0 is integrable over [O,co).Then there is a finitevalued random variable v such that V ( x , )  v with probability one. Also
3.
55
EXAMPLES
There is a simple stochastic analog of a theorem of Yoshizawa [ l ] concerning a bound on the “sensitivity” of the trajectory to the initial condition. The proof is similar to that of Theorem 1 or 2 and is omitted. The processes x,, y , have initial conditions x, y , respectively. If x, and y, are right continuous strong Markov processes, then so is (x,,y,). Theorem 11. Let ( A l ) to (A5) hold for the process (x,, y,) and function V ( x ,y , t ) in the bounded open set Q = {x, y : W(llx yII) < m>,where W ( 0 )= 0 and W (r ) is continuous and strictly monotone increasing for r 5 m . Let
W(lIx  Yll)
Let JQV(x, Y , 1 )
Then
K,,{
S U P IIX,
rn>t20
IV(x, Y, t ) .
5  k ( x , Y , t ) S  ki(Ilx  YII) 5 0.
 Yrll 2 w  ’ ( m ) ) s P,,,{ < V(X, 
SUP
rn>?ZO
V(X,?
Y f ?t ) 5 m>
Y , 0)
m
If k , ( r ) is continuous on P = { x , y : k , ( I l x  y l l ) = O } n Q , then we have Ijx, y,/j = r, + { r : k , ( r ) = 0 ) with a probability at least 1  q x , y, 0 ) h .

3. Examples First, several simple scalar It8 processes will be investigated. Example I .
Let x, be the solution of the scalar It8 equation dx = ax d t
+ ox d z .
(31)
The function V ( x ) = x2 satisfies the conditions of Corollary 32 in each set Q, = {x: xz < m 2 } ,and
,T,v(x)
= YV((X)= x2(2a
+
0’).
56 If 2a
I1
/ STOCHASTIC STABILITY
+ u 2S 0, Corollary 32 yields the estimate P,{ sup x,‘ 2 m 2 >5 a J > t ~ O
X2 
m2.
Let 2a + u2 < 0; then xt ,0 with at least the probability 1  x 2 / m 2 . Since m is arbitrary here, x, +0 with probability one. Since 9 V ( x ) =  aV(x), a = 12a 0’1, and m is arbitrary, Theorem 4 yie1ds;for any A < 00,
+
Example 2. Let x, be the solution of (31). An improved result will be obtained with the use of the Liapunov function V ( x ) = IxIs, 03 > s > 0. If s < 2, the second derivative of V ( x ) does not exist at x = 0. Nevertheless, Theorem 5 is still applicable. Then, appealing to Corollary 32 and Theorem 5, we have in each Q,, Q,  dQ,

A , , , V ( X ) = ~ ’ V ( X ) =b x 2 ,
where we suppose that (s  1) fJ2
b=a+0. 2 Hence,
Clearly the larger is s, the smaller is (32). In general, Liapunov functions of higher algebraic order are preferable, if they exist, since they yield better estimates of the probabilities, as in the case (32). If a < a2/2, there is some GO > s > 0 such that b < 0. Then x, + 0 with probability one. Thus, even if a > 0, there may still be asymptotic stability. Except for some specially constructed examples, there does not seem to be a vector analog of this result. In our case, it is verifiable
3.
51
EXAMPLES
directly by noting that the exact solution to (31) is (with probability one) x,
= xo exp [ ( u  (izi)t
If u  cr’ =  c < 0, then x,
c
= xo ex,[
+
+ z,].
]:
(33)
t
which tends to zero with probability one, as t t co. In deterministic stability studies, if a function V ( x ) is a Liapunov function in a region Q then so is Va(x)= V“(x),ct > 1, in the same region. There is not necessarily a stochastic analog of this property. (If there were, then all estimates of the form (32) could be written with s = 00.) That V ( x ) is a stochastic Liapunov function in Q implies
S
m
W
Ex
E,AQV (x,) ds < 00
AQV (x,) ds = 0
0
or that E X J Q V ( x , )decreases sufficiently rapidly as sco. But this does not imply that E x x Q V a ( x , )will decrease sufficiently rapidly to ensure that, for any a > 1,
i
ExAQVa(x,)ds < 0 0 .
0
It is these “integral moment” properties which effectively determine the probability bounds. Remarks on the model (31). Astrom [l] gives a thorough analysis, together with numerical results on the sample path behavior, of the scalar equation dx = crlx dz, +a2 d z , , where zll and z2* are (possibly correlated) Wiener processes. Let R = 0x5
11 / STOCHASTIC STABILITY
58 where
5, is integrable on each
[0, TI. Then f
x , = x o exp o S g , d s . 0
The solution still holds (with probability one) if the It6 equation
dt As aoo,
=
Jb ( , d s + z ,
dt
+ a dz
5, is the solution of
(a > 0 ) .
and x , + xo exp oz,
in mean square. The last equation is the solution to the It6 equation
The conclusion to the discussion of this paragraph is the following. Suppose that 5, is a random function with a “large” bandwidth, which acts on the system whose output is governed by i = xo5, and which we wish to model by an It6 equation. Then the model dx = a2x/2dt + v x dz may be preferable to the model 2 = v x dz. This modelling question, which is concerned with the relation of the solution of ordinary and stochastic differential equations, is discussed by Wong and Zakai [l, 21. Example 3, Consider the scalar problem dx =  a x d t
+o(x)dz
(U
> 0).
The exponential form I/ ( x ) = exp I I x I ‘
is useful in the study of a variety of scalar situations. The constants a and I are to be selected. Here, and in Example 4, we set a = 1. Since V ( x ) does not have a second derivative at x = 0, but is in the domain of the weak infinitesimal operator of x, in all Q,  Q,  dQ,,E > 0,
3.
59
EXAMPLES
Theorem 5, rather than Theorem 2, must be applied. Suppose that 0’(x) = 1x1 0’.
(Note that the corresponding ~ ( x does ) not satisfy a Lipschitz condition in any Q,, owing to its behavior at x = 0. This is unimportant. We define the origin as an absorbing point, and the solution may be defined up to the time of contact with the origin by a method analogous to that of the last subsection of Section 4, Chapter I. Furthermore, the results below and the argument used in the proof of Theorem 5 show that either x, + 0 as t 3 00, or x , = 0 for some finitevalued Markov time s.) For x # 0,
(34) Setting I = 2a/a2 yields 9 V ( x ) = 0. Then, by an appeal to Theorem 5,
Example 4 . Take the model of Example 3 but set a’ (x) = (I  elXl>0 ’ .
The remarks in Example 3 regarding the properties of the process in the neighborhoods Q, hold here also. For x # 0, 9 V ( x )=W(X)
Set A = 2a/a2; this yields d p V ( x ) conclude, as in Example 3, that
AT’(I  elxl) 2
3.
(35)
0, for x # 0. By Theorem 5, we
The bound is identical to that obtained in Example 3, even though
60
I1 / STOCHASTIC STABILITY
the variance a’(.) is uniformly bounded here. The probability bounds are determined by the choice of 1,whose value in both cases depends ) x = 0 (or, alternatively, on the reupon the properties of ~ ’ ( x at quirement that the bracketed terms of (34) and (35) be nonpositive). Example 5 . A method of computing Liapunov functions for scalar It6 processes will be given. (See also Khas’minskii [I], where a similar method is used in the study of processes which are “dominated” in a certain sense by a onedimensional process and Feller [2].) Let the model be d x = f ( x ) dt ~ ( xd )z (36) f ( 0 ) = a ( 0 )= 0 .
+
In order to obtain a tentative Liapunov function, we proceed formally at first. Assume that V ( 0 )= 0, V ( x ) > 0 for x # 0. Let V ( x ) be in the domain of A”,,,in Q,  Q,  aQ, and let A”,,,V(x)5 0 in Q,  Q, aQ,, where each Q, is bounded. Then, in Q,  Q,  aQ,,
or, equivalently, (37) The function defined by (38) satisfies the relation (37), provided that the righthand integral exists: X
S
(The exponent of the exponential is written as an indefinite integral since, in many applications, the integral from 0 to s may not exist.) Since x = 0 is an absorbing point, the rest of the discussion may be specialized to x 2 0. It is readily verified that 9 V , ( x ) = 0 on Q,  Q, aQ, for any 0 < E < m < co. From the nonformal point of view, the existence of the integrals in (38) must be verified.
3.
61
EXAMPLES
Some special cases where the preceeding method is applicable are of interest. Suppose that a2(x) = c1x2
+ czlxl
> 0, (u > 0).
cz
(cl
f(x) =  ax
> O),
Then, computation of the integral (38), and omission of the constants of integration, yields the tentative Liapunov function V , (x)
= (x
+
);
2alcl+ 1 *
The function V l ( x ) is in the domain of A",,, for each E > 0. By the method of construction, 9 V l ( x ) = 0. In order to infer asymptotic stability of the origin from Theorem 2 or 5 , we require a Liapunov function V(x) with  Y V ( x ) =  k ( x ) 5 0, where k ( x ) = 0 implies x = 0. To achieve this, we take V(X) = (x
+
y: (;I 
(39)
+
where p = 2a/c1 1  6, for some arbitrarily small positive 6. Now V ( 0 )= 0 and V ( x ) > 0, x > 0, and
Thus by Theorem 5 , Corollary 32, and the arbitrariness of m, x, + 0 with probability one, and for x = xo > 0,
62
I1 STOCHASTIC STABILITY
Example 6. Another application of the method of construction given in Example 5 will be considered. We suppose again that the origin is an absorbing point and that x > 0. Let f ( x ) be differentiable and less than  a x in the region 0 s x S 2 and also a > 0 . Let a2(x)=clxz +c,Ixl. Now, we replace the right side of (37) by its minorant 2ax/02( x )and compute a tentative Liapunov function, V, (x), using the minorant. Thus V,(x> =jds[expj(,
du]
0
where
Vz (4= Vl ( X I 9vz(x)j0
3
(xfO
and x S 2 ) .
Now proceeding as in Example 5 , and using the function V(x) defined by (39) (which is in the domain of A”,,c for each E > 0 and m 5 V ( Z ) ) , we may apply Theorem 5 and Corollary 32, which yield
and x, + 0 with at least the probability 1  V ( x ) / V ( P ) . Example 7. Consider the twodimensional nonlinear Itd equation, where g ( x l ) satisfies a local Lipschitz condition dx, = x2 dt dx, =  g(xJ dt
 axz dt  x2c dz
f
/ g (s)d s+cc
as
0
sg(s)> 0,
g(0) = 0.
S Z O
tcc
(310)
3. EXAMPLES
63
The function XI
v ( x ) = x:
+2
J 0
g(s) ds
is a Liapunov function for the deterministic problem where c = 0 (see LaSalle and Lefschetz [l], p. 59 ff). Each Q, is bounded and V ( x ) is in the domain of each 2,. (Since Sll(x)= Sl,(x) = 0, it is not required that V x , x r ( exist, ~ ) i = 1, 2.) We have
9 v ( x )= x;(
2a
+ 2).
(311)
Suppose that a > c2/2. Then (311) (by appeal to Corollary 32, and the arbitrariness of m) implies x,, + 0 with probability one. It will now be proved that xlt+ 0 with probability one. By Lemma 1, and the fact that, for any m > 0, (312) there is a random variable b(w) < 00 with probability one, such that V ( x , )+ b(w) with probability one, as t + 00. Since xzt+ 0 with probability one, this implies that
s
Xlt
G(xl,)=2
0
g(s)ds+b(w)= 0, a2(w)5 0. Thus x l f + { a l ( w ) , a2(w)}. In fact, in the limit, each path xlt tends to only one of a1(o), a 2 ( w ) ,since it cannot (with probability one) jump from a1( w ) to a z ( w )and back to a1( w ) . (Since xlt+ {al (o), a2(w)},for any E > 0 there is a random variable T,(o) < cc with probability one such that min [(x,,  a1(w))’ x$, (xlt a,(o))’ xi,] < E with probability one for t 2 ~ ~ ( wx,) cannot . move from a neighborhood of {al(w),O} to a neighborhood of {a,(o), 0} without traversing the path between the points. By stochastic continuity, the probability
+
+
I1 / STOCHASTIC STABILITY
64
of such a jump is zero.) Thus x l t + a(w),la(o)l < co with probability one, where U(W) is either a l ( w ) or a 2 ( o ) . Now, for any 1 > p > 0, there is a compact set M so that x,, t co, is confined to M with probability at least 1 p (by (312)). Thus, by the local Lipschitz condition on g(s), for each p > 0 there is a realvalued Kp < 00, so that, uniformly in t,
=
Ig(x1J  g(.(o))l 5 K p l X l t  a(o)l
with a probability at least 1  p . This and the convergence of x l t imply that g(xl,)+g(a(o)) with probability one. The final evaluation of a(o) is obtained by integrating the second equation of (310) between T, A and T,, where A is positive but arbitrary. Let T,+a and suppose that the intervals [T,, T, A ) are disjoint :
1
T. X2T,
 XZT,A
=
1
T,
g(X1,) dt  a
TnA
[
T, X2r
dt  c
T,A
x2t
dz,.
(313)
T,A
Now, all with probability one as nco, the left side of (313) goes to zero, the first term on the right tends to dg(a(w)), and the second term on the right tends to zero. Thus T.
 g(a(o)) A
= lim c ntm
x l t dz,.
(314)
T,A
The last term on the right side of (313) represents a sequence of orthogonal random variables with uniformly bounded variance (since E x x i , Ex V ( x , ) 5 V ( x ) ) . Thus, the sequence, since it converges, can converge only to zero (with probability one). Thus, by (314), g ( a ( o ) ) = 0, and, hence, ~ ( o =) 0 with probability one. In conclusion, x , + 0 with probability one under the condition  2a tc2 < 0. Also, for any m > 0, (315)
3.
65
EXAMPLES
Let us try to improve the probability estimate (315) for small c2. Let Vn(x)= V"(x),n 2 1. For each m > 0, V,,(x)is in the domain of 2, (where for this example we let Q , still denote the set { x : V ( x ) < m } ) :
(316)
s2nVn1(x)x;[a + c 2 ( n  l))].
s
If a >= (n  1) cz, then Y V " ( x ) 0 and we have P,{
sup V ( X ) Z rn} = P x { sup V " ( x ) Z rn"}
rnzrzo
m>rzO
s '"('), m"
(317)
which is better bound than (315). Example 8. Suppose that a2(x) = a2x:/(l + x:) in Example 7. The function Vl (x) = exp iV ( x ),
where V ( x )is the Liapunov function of Example 7, is in the domain of 2, for each m > 0, and
[
I p V l ( x ) = 2,V1 ( x ) = 21V1 ( x )  ax:
5 2nv1 ( x ) x:
[
a
+
~
2(1
o2
+ (1 + 21x3
+ x:>
++I
1x:a2
%I
2
(1 + x:)
~2~v,(X)X:[a+~2(i++)].
If a > 02/2, then letting 1 = (a a2/2)/az yields Y V , ( x ) 5 0 and the bound P, { sup V ( X r ) Z P } = px { SUP Vl(X,) 2 exp API rn>t'O
rnsrz0
s exp n ( ~ ( x) p ) .
Example 9. A Stochastic van der Pol Equation. Consider the form
66
I1
/ STOCHASTIC STABILITY
of the van der Pol equation d(R)
or
+ ~ ( 1 x’)
R dt
+ bx dt = c x d z
(E
> 0),
(318)
With c = 0, the origin is asymptotically stable, and there is an unstable limit cycle. Equation (318) is also equivalent to the system
dy2
=
 by1 dl + Cy1 d z .
(320)
Equation (320) is obtained from (318) by integrating (318) and letting y1 = x and yZl=Jb [ byls ds + cyls dz,]. The identity between (318) and (320)can also be seen by noting that (at least until the first exit time from an arbitrary compact set) y , is differentiable and dyi,/dr = (i  1 ) y;; pit. See also the discussion of the deterministic problem in LaSalle and Lefschetz [l], pp. 5962, where the following Liapunov function is used: YZ
V(y)=2
by: + . 2
Let A”, be the weak infinitesimal operator of the process (320) in Q, . Q, is bounded, and the terms of (320) satisfy a uniform Lipschitz condition in each Q,. V ( y ) is in the domain of 2, for each m > 0:
YpV(y)= y:
[(;  be) +

I)%(
Suppose that c2/2 be < 0. Then Y V ( y )< 0 on the set p
=b:Y : < P I
’=
3 (be  c 2 / 2 ) be
3. EXAMPLES Define bp m== 2
67
3(b& c2/2) 2E
Then, in the open set Q,, Y V ( y )< 0, provided that y , # 0. By Corollary 32,
(32 1) and x,, = y,, + 0 with a probability at least 1  V ( y ) / m . By a method similar to that used in Example 7, it can be shown t h a t y ,  + O f o r w i n { w : V ( x t ) < m , all t O and define t , = t,l + A , to=O. Write t' for tnq,,. Then, from (320), 1'"
f',,
Ylr,,,
 Ylt,,,,
=
j' 1'"
Y2t
dt
1
&
j'
(y1r 
$)
(322)
dt.
r'.I
There is a random variable c ( w ) , 0 5 c(w) < m, such that V ( x , )+ c ( o ) (with probability one, relative to B,) for w in B , . Let w be in El,. Sincey,, + 0, we havey:, ,240). As in Example 7, eithery,, +JC(&) or y 2 ,+ Since the term on the left and the term on the far right of (322) tend to zero as n + 03, the first term on the right also tends to zero as n + 03. This implies that c(w) = 0. Note that P , {B,} is given by (321). If c2 is small, there is a simple device which will improve the estimate (321). Let V , ( y )= V " ( y ) . V,(y) is in the domain o f z , for each m > 0:
dm.
I .Vnyy)Y:[(; 
 b&)+
y+
(n
 1)cZ
1
.
Let c2 be sufficiently small so that
be > ~ ' ( n 3).
(323)
68
I1 / STOCHASTIC STABILITY
Then Y V n ( y )< 0 (if y , # 0) provided that 3 ( b~ C ' ( U  *)) bs
2
Y l < P n = 
Define m,, m = b p= " 2
3 ( b~ C'(U  3)) 2E
Then, U V , ( y ) < O ( y , # O ) in the set Q,, V"(Y)< 4 1 , and
= { y : V ( y )< m , } = { y :
There is some n for which V"(y)/m:is minimum, within the constraint (323). Also, x l t = yl, + 0 with a probability at least 1  V"(y)/m:. An application of Theorem 6 and its extension is given in: ExampIe 10. Let E < 0 in (320). Then the deterministic problem (c = 0) has an unstable origin and a stable limit cycle. With Y: V(y)=2
+ b,2y :
we have, for each m > 0,
Also there is no finite escape time. Suppose that (as is true) ExJy:sds=m. 0
Then
3.
69
EXAMPLES
If E,yts and E,y:, both converge to nonzero limits, then
+
lim E,yt,  c2/2 be lim E,y:, bs/3 . Moment results on another type of stochastic van der Pol equation appear in Zakai [l, 21. Example I I . A TwoDimensional Poisson Parameter Problem. Let
k=(A+ Y ) x , where Y, is a matrixvalued Poisson process with two possible values, & = D or Y, =  D . Let the probabilities of transition be P{ Y , + , = D l y , =  D}= P { Y , + , =  DI y , = D } = P A + o(d). Suppose that both M & D are positive definite. The function V ( x , Y )= x’(M
+ Y )x
is bounded and has bounded derivatives in every bounded set of x points and tends to infinity as llxll +a. For the process stopped on exit from each region Q,,= {x, y : V ( x , y ) < m } , V ( x , y ) is in the domain of the weak infinitesimal operator (which, on V ( x ) , is d ) . To continue, let the real parts of the eigenvalues of ( A D ) be negative. We write
+
d V ( x , Y )=  x’C+x,
when the initial condition is Y =
+ D , and
d V ( x , Y ) =  X’Cx
when the initial condition is Y =  D :
 C + = ( A + D)’(M
+ D)+ (&I + D ) ’ ( A + D) 2pD  C = ( A  D)’ ( M  D) + ( M  0)’ ( A  D) + 2pD
(324a) (324b)
If ( M t D ) and both C, and C  were positive definite, then Theorem 2
I1 / STOCHASTIC STABILITY
70
would apply. To be specific, suppose that A=[' 1 b
'1,
D=[i
:]
(d>b>O)
( A  D is an unstable matrix, so that the problem is not trivial.) If F is a stable matrix and G positive definite and symmetric, then there exists a positive definite symmetric matrix solution to G = F'B BF (Bellman [l]). Thus, by choosing a diagonal C, = C, with positive entries cl, c2 chosen so that C,  2pD is positive definite and symmetric, there is a positive definite symmetric solution ( M + D ) to (324a) :

+
M+D=(
Also M  D is positive definite. Substitute ( M  D) into (324b) and compute  C ; this yields that C is positive definite provided that
( d  b)(c2
+ c1 + 2pd)  4 ( d  b) d + 2pd
1
 d 2 [2  c112 > 0.
(325)
If p is sufficiently large, then the inequality is satisfied. As p increases, so does the probability of a transition in any interval. We will not pursue the matter further. Certainly, stability is implied by weaker conditions than (325). If (325) holds, then, for any m > 0, Px,, { sup x: ( M m>t2O
'(M + Y ) x + r,) x, 2 m} 5 x~~. m
The form of V ( x , Y ) was suggested by the following observation. If Y =  D, then, although the derivative of x : ( M  D)x, may be positive, the average rate of decrease in x ' ( M + Y,)x (initial condition Y =  D) for fixed x may be sufficiently negative for at V ( x , Y ) 5 0.
4.
DISCRETE PARAMETER STABILITY
71
4. Discrete Parameter Stability
The statements of the stability theorems for the discrete parameter process are similar to the continuous parameter theorems. The proofs are generally simpler, since the technicalities centering about the weak infinitesimal operator are not required. To illustrate the method, we state and prove only one theorem. Theorem 12. Let xl, . .. be a discrete parameter Markov process, V ( x ) 2 0, and Q , = {x: V ( x )< m } . In Q,, let
Then
E X , , V ( x , + 1 ) V ( X )=  k ( x ) 5 0.
There is a random variable v, 0 5 v 5 m, such that V ( x n )+ v with probability >= [I  V ( x ) / m ] . Also k(x,)+O in Q , with at least the probability that x,, is in Q,, all n < co. Proof. Let t be the first exit time of x, from Q,. Define I, = x,,,; 2, is a Markov process. Also, V ( 2 , ) is a nonnegative supermartingale since J ! ? ~ , , V ( X , +~ )V ( X ) S 0 for Z either in Q , or not in Q,. Equa
tion (41) and the sentence following 'it follow from the supermartingale properties and the fact that V ( . Q 2 m if V ( x , ) 2 m for some i n. Define i ( y ) by Ex,, v(z,+l) V ( I ) =  i ( I ) . If I is in Q,, then l ( X ) = k ( 2 ) , and if 2 is not in Q,, then 2n+l= 2, = I and L ( 2 ) = 0; thus i ( 2 ) 2 0. Also,
EXJlv (x,)
v ( 2 ) = i c Ex,ol (Xi). = n 1

1
(42)
Equation (42) stays bounded as n + co since V ( x ) 2 0 and &(x) 2 0. Thus pi = P x , o( a ( x i )2 E ) is a summable sequence for each E > 0. By the BorelCantelli lemma, &(f,)+ 0 with probability one. Since I,,= x, all n < co, with probability 2 [I  V(x)/m],the theorem follows.
72
I1 / STOCHASTIC STABILITY
5. O n the Construction of Stochastic Liapunov Functions
Generally, in the application of the Liapunov function method, a Liapunov function V ( x )is given and & V ( x ) =  k ( x ) computed. The usefulness of the function for stability inferences depends heavily on the form of k ( x ) , and much effort is usually expended in search for functions V ( x )yielding useful k ( x ) . In the deterministic theory, a great deal of effort has also been devoted to the reverse problem: Choose a k ( x ) z O of the desired form and seek the function V ( x ) giving v(x) =  k ( x ) in some region Q. If such a V ( x ) can be found, and if it has the appropriate properties, then stability inferences can be drawn. This reverse approach is intriguing, but it has not, to date, produced many useful Liapunov functions, except for the case of time varying linear systems. See Zubov [l], Rekasius [l], Geiss [l], and Infante [l]. We now give the stochastic analog of the method of partial integration, an approach which attempts to obtain V ( x ) by integrating a given k(x,), with the use of the system equations. In the stochastic case, the average value of the integral is the quantity of interest. Only diffusion processes will be considered. Our starting point will be Dynkin’s formula, instead of the (deterministic tool) integralderivative relation of the ordinary calculus. Suppose 9 V ( x ) is given, then under suitable conditions on V ( x ) and T , V ( X ) E x V ( ~ = J  Ex
i 0
9 V ( x S )ds
= Ex
i
k ( x J ds.
(51)
0
In order to reduce the number of technicalities in the discussion, all operations in the development will be formal. T (in (51)) will not be specified; we assume that, on all functions considered, 9 V ( x ) = x Q V ( x ) ,and Q will not be specified. Further, E,V(x,) is assumed equal to zero. These assumptions provide for convenience in the derivations, and are not seriously restrictive from the point of view of obtaining a result, since, in any case, any computed Liapunov function must be checked against the theorems of Section 2. (Roughly, if T is the first
5.
ON THE CONSTRUCTION OF STOCHASTIC LIAPUNOV FUNCTIONS
73
exit time of x , from an open set Q, and 9 V ( x ) = &V(x) =  k ( x ) in Q , then YPE,V(x,) = 0 and E,V(x,) contributes nothing of value to either V ( x ) or z Q V ( x ) . )The method is illustrated by two examples. Example 1. Suppose that the process is the solution to the It6 equation dx, = x2 dt d ~ 2 =  ( b ~ 2 + 2 +~3~x 1: ) d t + o ~ , d ~ ( b > 0 , c > O ) .
Suppose that we seek a function V ( x ) which is a Liapunov function in a neighborhood of the origin and which satisfies 9 V ( x ) =  x:.
Define 11 = E x
I,
i 0
~ 1 t ~ d2 t 1,
dt.
= 0
Note that, formally, 1, satisfies 91, =  x : , and we will solve for 1,. Upon applying Dynkin’s formula (and omitting the €,g(xT) terms), and substituting the values of the 9 x 1 , we have 2
x 2 = Ex
i i i
( 2&,) dt = 4c1,
+ 61, + ( 2 b  d)I , ,
(53a)
( 25:)dt
,
(53b)
0
x;
= E,
X: =
E,
=  21,
0
0
( 2 ~ ; dt = ) 31,.
(53c)
I1 / STOCHASTIC STABILITY
74
If a’ c 26, the solution of the linear set (53) for the function I,, which we now denote by V ( x ) , is
It is indeed true that EpV(x)=  x: and, further, it is readily verified thatTheorem2may be applied to V(x) in a neighborhood of the origin. Equation (54) was not hard to calculate. The answer (54) was known : the application of 9to each term of (54) individually yields only terms which are integrands in (52); hence, the I i could be calculated by use of the integral formula. In general, we pursue the following procedure. Choose a collection of functions g , ( x ) , i = 1, ...,n ; these correspond to theleft sides of(53). Computeeach 9 g i ( x )= x j a i j h i j ( x ) . Hopefully, the set { A i j } will contain only n distinct functions and one of these will (hopefully) be the desired function k ( x ) , the desired value of  U V ( x ) . Finally, by writing
gi(x) = E x J
 9 g i ( x s )ds =
0
z a i j E , i
s
kij(xs)ds
0
we may solve the set of linear equations for the desired Liapunov function
1 T
V ( x )= Ex
k(x,) ds.
0
Of course, it is usually no easy matter (if not impossible) to select a proper set { g , ( x ) }(so that there are only n distinct terms in { h i j ( x ) } ) . As in the deterministic case, the choice of { g i ( x ) }seems to be a matter of trial and error. The computed V ( x )must be tested according to the appropriate theorem. If the number of terms in { h i j ( x ) } cannot be reduced to n ( k ( x ) must be a distinct term in {hij(x)} in any case), it may still be possible to compute a Liapunov function which gives some information. An example follows.
5.
ON THE CONSTRUCTION OF STOCHASTIC LIAPUNOV FUNCTIONS
75
Example 2. Let dx
= ( a
+ y ) x dt
(a
> 0)
(55)
where the time varying coefficient y , is the solution of
Our aim is to obtain a function V ( x , y ) which is uniformly (in y ) positive definite in a neighborhood of the origin (of the x space) and which satisfies Y V ( x , y ) =  x2 . We restrict our search to functions of the form V ( x , y ) = x 2 P ( y ) , where P ( y ) is a polynomial. Define the sequence I , , ... : T
I  , = I  , = 0,
I,, = ~ , , , [ x : y : d t
( n > 0).
0
The desired V ( x ,y ) equals I,. Then, a formal application of Dynkin's formula gives
I 1
x2yn= E x , y ( Yx:~:)dt 0
n ( n  1)Jn2 =(2~+nb)I,,2I,,+l2
(57)
where J n  2 = Ex,yJL a 2 ( y , )x:y:2 dt. We now make the additional assumption, which will be helpful in the following calculations, that J,,2 = a 2 ( y ) I ,  , . Let us solve (57) under the supposition that I , = 0, n 2 3. Then, providing that A
= inf Y
{ 2 a ( 2 a + 2b) (2a + b )  4a2(y)} > 0,
76
I1
/ STOCHASTIC STABILITY
we have 1, = V ( x , y) =
2 V ( x , Y)
=
x 2 [(2a
+ 6 ) (2a + 2 b ) + 2 ( 2 a + 2 6 ) y + 4y2] A
2x2 [  a (2a + 6 ) (2a
(58)
+ 2 6 ) + 2a2(y) + 4y3]
A
Suppose that a2(y) and yo are such that y: < A / 8 for all f < 03 and the bracketed term in (58) is always positive. Then (58) is a stochastic Liapunov function. The set of such a2(y) and initial conditions yo is neither empty nor trivial. Let a = b = 1 , and max 0 2 ( y )= 1. Then A = 20,andtheconditionsaresatisfiedifa2(y)= 0,for lyl > (20/8)’’3> lyol. By truncating the set of linear equations at larger n, and solving the set for a new approximation to I,, a larger allowable range of variation in y, is obtained.
111 /
F I N I T E T I M E STABILITY AND FIRST E X I T T I M E S 1. Introduction
The behavior of many tracking and control systems is of interest for a finite time only, and their study requires estimates of the quanti
for some suitable function V ( x ) . First exit time probabilities have a traditional importance in communication theory (see, for example, Rice [l]) where they are used, for example, to study the statistical properties of local extrema of random functions, or of clipped random functions; the study of control systems via a study of appropriate first exit times is certainly not new (see, for example, Ruina and Van Valkenburg [l]). Nevertheless, most of the available work that is pertinent to control systems seems to bear mainly on asymptotic properties or to involve numerical solutions of partial differential equations which are associated with certain types of first exit time problems. In this chapter, the stochastic Liapunov function method of obtaining upper bounds to (11) is discussed, and several theorems and examples are given. Most of the published work on stochastic control and tracking systems seems to be concerned with the calculation of moments, especially with second moments. Moments have generally been easier to calculate or to estimate than the values of (11). However, for many problems, knowledge of (11) has greater pertinence than knowledge of moments. 77
78
/
111 FINITE TIME STABILITY A N D FIRST EXIT TIMES
Many tracking or breakdown problems must be considered only as finite time problems, since either a stationary distribution of error may not exist, owing to eventual breakdown or loss of track, or else the “transient” period is the time of greatest interest. Estimates of (11) are also of importance when the system, even in the absence of the disturbing noise, is unstable outside of a given region. The noise may eventually drive the system out of the stable region with probability one, and then estimates of the probabilities of the times of these occurrences are of interest. Tracking systems occur in many forms. For example, an aircraft may be tracked through the use of the noise corrupted signals which are returned to a tracking radar system, or a parameter of a control system may be tracked either directly by using noise disturbed observations of the value of the parameter, or indirectly by noise disturbed observations on the values of the states of the system which is parametrized by the parameter. In such cases, bounds on the probability that the maximum tracking error (within some given time interval) will be greater than some given quantity are of interest. This is especially true in systems which lose track at the moment that the error exceeds some given quantity. The orbiting telescope is a problem of some current interest, where first exit times are important. To allow successful photography of astronomical objects, the direction of point (attitude) of the telescope must remain within a given region for at least the time required to take the photograph. However, the attitude is influenced by the random forces acting on the satellite, and in its control system. Problems in hill climbing are often of the same type. Given some maximum seeking method of the gradient estimation type, one would like a reasonably good upper bound of the probability that at least one member of some finite sequence of estimates of the peak deviates by more than a given amount from the true location of the peak. Such estimates provide a useful evaluation of the hill climbing procedure,* * Two opposing demands on the design of maximum seeking methods (which are based on the use of noise corrupted observations)are (l), to have the probability
2.
79
THEOREMS
and are of particular use in cases where there are many relative maxima, or where large errors cannot be tolerated ;for example, where a parameter of an operating plant is being adjusted. Similarly, the use of first exit probabilities provide one natural way of evaluating other “socalled” adaptive control systems. The reader is also referred to the work on deterministic finite time stability in Infante and Weiss [ l , 21. The finite time results differ from the infinite time results in that 2 ,V ( x ) may be positive for some or all values of x. Nontrivial statements regarding the values of the probabilities (11) may, nevertheless, be made. If the system has some “finite time stability” properties, but is not stable in the sense of Chapter 11, then a positive definite Liapunov function cannot satisfy x , V ( x ) s O in the set of interest, since this would imply a result of the type of Chapter I1 which gives (stability) information on the trajectories over the infinite interval [0, a). Generally, in this chapter, A,,,V(x) will be positive for all or some values of x in the set of interest. Nevertheless, the form of x , V ( x ) will suggest time dependent nonnegative supermartingales which, in turn, will be useful to obtain upper bounds to the probabilities (11). An analog to the deterministic case is helpful. Let V ( x ) be differentiable in the set Q , = {x: V ( x ) m},m > 0. Let i= f ( x ) and assume that in the set Q,, v(x) = ( d V ( x ) / d x ) ‘ f ( x )5 cp, where cp is a nonnegative constant. Then elementary considerations show that x, is in the set Q , for at least the time T = (m V(x))/cp,where x = xo is the initial condition. This is a result in deterministic finite time stability.
=

2. Theorems
Theorem 1.
Assume ( A l ) to (A5) of Chapter 11. In Q,, let AmV(x)
s  P ~ ( x )+ qpI
(P > O > ,
(21)
of “bad” estimates as low as possible (for example, keep the probability of at least one “bad” estimate below a given quantity), and (2), to have the largest possible “rate of improvement” in the estimate of the maximum. The greater “caution” demanded by (1) slows down (2); see Kushner 191.
80
I11
/ FINITE TIME STABILITY AND FIRST EXIT TIMES
where rp, is nonnegative and continuous in [0, TI.Define in [O, T ] @, =
i
rps ds
0
where c is selected so that p 2 max crp,. fsT
Let I be such that if V(x,) 2 m, then W ( x , t ) 2 I. Then, with xo = x,
With suitable choices of I and c, the appropriate equation, (23) or (24), holds, where @; = @,p/max rp, :
Remark. Letr(t) = e'*'(I  eCGT/c) + l/c.Thedefinition of W ( x ,r) together with equation (22) imply that (x,, t ) stays within the hatched area of Figure 1 with at least the probability given by (22). To obtain (23) or (24), I is chosen so that maxTbrhOr(t) = m.
2. THEOREMS
81
Moximum of r ( t ) in LOJI occurs ot t = O
Moximum of r l t ) in t0,Tl occurs
ot t=T
Figure 1
Remark. Note that, for V ( x , ) = V ( x ) # 0 and T = 0, the right sides of (23) and (24) d o not reduce to zero. This behavior is due to the method of proof which does not distinguish between initial conditions (of V ( x , ) )which are fixed constants of value V ( x ) and initial conditions which are random variables whose expectation is V ( x ) .The results remain valid (as they do in Chapter 11) if the initial condition is a random variable (nonanticipative) with mean value V ( x ) . The fact that W ( x , ,t ) (stopped on exit from some region) may be a nonnegative supermartingale is suggested by the form of A",V(x) in (21). Proof. Define & = {x, t : W ( x , t ) < 1, t < T ) and let z be the first (Labeling the vertical axis in Figure 1 time of exit of ( x t , t ) form
on.
82
111 / FINITE TIME STABILITY AND FIRST EXIT TIMES
by V ( x ) ,the scaled Q, corresponds to the interior of the shaded areas.) Let s, = r n 7 . Both 2, = xrnr and i t= (Z,, s,) are right continuous strong Markov processes in [0, TI. In reference to the itprocess, let px,,{ B } be the probability of the event B given the initial condition 2, = x, s, = t where (x, t)e&,. Now, let (x, t ) e & . Then the stochastic continuity of the process 2, at t implies that p,,,(7 > t h ) ,1 as h decreases to zero. Consequently Ex,,(exp [ c @ ( t h ) nT]  e'@')/h + c q , ecdt as h + 0. Let be the weak infinitesimal operator of i t The . operation of d, on W ( x , t ) , for (x, t ) in Q A , is easily calculated to be (note that if (x,, s,) is in Q A , then t = s,)
a,
a , W ( x , s)
+
+
= e'*s[cq,V(x)
+ A",V(x)]  ~ s e c *0.s ~
(25)
Note that W ( x ,s) 2 0 for (x, s) in Q,. Now an application of Dynkin's formula vields
which is (22). Inequalities (23) and (24) are obtained by a weakening of the bound (22). If W ( x , t ) 2 1 for some t 5 T , then V(XJ 2 r ( t ) = e'Or
C
for some t 2 T(and vice versa). (See the remark following the theorem statement.) Also, Px{supT,,,, V ( x , )2 supT2,,,r(t)} is no greater than the bound given by (22). r ( t ) is either monotonic nonincreasing 01 monotonic nondecreasing, as t increases to T, depending on whether or not 1is greater than or less than eCoT/c.Each case will be considered separately. Let the maximum o f r ( t ) (in [0, TI) occur at t = 0. Then, choosing 1 so that the maximum of r ( t ) equals m, we have 1= in + (eCoT 1)/c. Note that, for consistency, we require 12 ec*T/c; thus c 2 l / m is required in this case, and the right side of (22) equals B(c) =
~ ( x+) (ecQT l ) / c in
+ (ecQT l ) / c
*
2.
THEOREMS
83
A further constraint on the value of c is c 5 p/max cp,, which assures that (25) is nonpositive in &. Suppose that the interval A = [l/m, p/max cp,] is not empty (a necessary and sufficient condition for the maximum of r ( t ) , in [0, TI, to occur at t = 0). The value of c in A which minimizes B ( c ) is c = l/m, the smallest number in A . Equation (23) follows immediately. If A is empty, then the maximum of r ( t ) in [0, TI occurs at t = T. To satisfy c 5 p/max cp,, we set c = p/max cp,. Also, choose 1 so that m = max,.. r ( t ) = r ( T ) = leKCaT.The right side of (22) now equals B(c) =
+
~ ( x ) (ecaT  l)/c ~~
mecoT
and (24) follows immediately upon substitution of the chosen c. (It can also be verified directly that A eC*T/cin this case.)
s
Remark. In the examples, we use the special case cp, = cp, a constant. Then QT = cpT and rnax,.,cp,/p = cp/p. Corollary 11.
p
2 0. Then
Assume the conditions of Theorem 1, except that
Proof. The proof is the same as that of Theorem 1, except that W ( x , t ) = V ( x ) QT  Q, is used. (Alternatively, let c + 0 in Theorem 1.) For m @ j T , the bound (26) is trivial.
s
+
Remark. The theorems of this chapter are only particular cases of the general idea that if W ( x , t ) , t 5 T, is a nonnegative supermartingale in a suitable region, then
Theorem 1 is given in the form in which it appears, since form (21)
84
I11 / FINITE TIME STABILITY AND FIRST EXIT TIMES
arose in a number of examples, with the use of some simple V ( x ) . It is likely that other cases, not covered in the theorems or by the examples, will be of greater use for some problems, either in that the corresponding Liapunov functions will be easier to find or that they will yield better bounds. One such special case is given by Corollary 12, and the Wiener process example following it. Remark. The forms of A”, on given domains for It6 and Poisson processes are given in Chapter 1, Sections 4 and 5. Remark. The bound (26) may seem poor, and it probably is for most problems that one is likely to encounter. Yet there are simple systems for which the probability in (26) does increase linearly with time. Let i = 1, and let the initial condition be a random variable which is uniformly distributed in the interval [0, 13. Then P x { ~ ~ p T 2 , 2 0 x , 2 l} = T f o r T s 1. The Liapunov function approach is often a crude approach. Given a tentative Liapunov function, its application to systems of rather diverse natures may yield very similar forms for &,V(x). If there is to be no further analysis or search for other possible Liapunov functions, then we must be content with the result given by a function which does not distinguish between quite different systems. Corollary 12. Assume (Al) to (A5) and let A“,V(x)S p V ( x ) . Let m 2 WLT, 11, 2 p > 0. Then V (x) P, { sup e”lr V ( x , ) 2 A} S ~. (27) T>rzO
1
Proof. The proof is very similar to the first part of the proof of Theorem 1 and will be omitted. The following example illustrates a case where the value of p, may be chosen as a function of T, to yield a useful estimate. Example. Let dx = a dz, where z , is a Wiener process. Let V ( x )= cosh px = (eP” C P X ) / 2Then, . for each m c co, A“, acting on V ( x )
+
2.
THEOREMS
85
(or on a function which is equal to V ( x ) on Q, and which is bounded and has bounded and continuous second derivatives) is (a2/2)d’ldx’. Thus a2J2 A,V(x)=V(x). 2

By reasoning as in Theorem 1, for pl = J20’/2 = p , we have ~ ( x= ) 0.
Define m by cosh Jrn = (cosh J E )exp (pT). Then
(28)
P, { sup (x,[ 2 m> 2 P, { sup e”‘cosh px, 2 cosh PE) T>rgO
TgrtO
cosh Jx = 0 is continuous and increasing and W, (0) = 0, and V ( x ,y , r ) 2 W, (IIx  y ll), for t 5 T . Let
where p > 0 and cp, is nonnegative and continuous in 10,TI. Define W ( x ,y , t ) as in Theorem 1, with V ( x , y , I ) replacing V ( x ) .Then, (22)(24) hold, with V ( x , , y,, t ) replacing V ( x , ) and V ( x ,y , 0) replacing V ( x ) . Also, with these substitutions the left sides of (23) and (24) majorize the quantity
Theorem 3 is the discrete time parameter version of Theorem 1. Theorem 3. Let x,, n = 0, ..., N be a Markov process and V ( x ) a continuous nonnegative function with
(210) in Q,
= { x : Y ( x )< m},where
P > 1, cp, 2 0. Define
whereF"(K)=n{' [ K / ( K  cpi)],FO(k)=l , a n d P > K / ( K  cpJ.Let1 be such that if Y ( x )2 m, then W ( x ,n) 5 A. Then with xo = x,
P x { sup W(x,,, n) 2 A} 5
V(X)
NLnzO
With appropriate choices of
+ K F N ( K ) K = B ( K ) . ~
1
~
~
(211)
A and K, and letting m > max 'pi= cp', we
2. obtain P,{
(
sup V ( x n ) zm> 5 1  1 
NtnZO
87
THEOREMS
y )y ~
(212)
(1  "i),
M
""
m' =>m P1 Remark. In the special case where reduces to
'pi = q, the
(213)
right side of (213) (213a)
Proof. Define T as the first Markov time (integral valued) for which W ( x , , n) 2 A, n 5 N , and let &, be as in the proof of Theorem 1. Except for the fact that Dynkin's formula is not needed, the proof is similar to that ofTheorem 1. The stopped process X, = x,,, is a Markov process, and S O is in = (z,, z nn). A straightforwardcalculation shows that W ( x , , n) is a nonnegative supermartingale if fi > K / ( K  qi)(for each i) or K > (p'/?/(/?  1) (with adjunction of the appropriate family of afields), for n 5 N . (For ( x , , n ) in Q,, Ex, W ( X , + ~n ,+ 1) W ( x , , n) 5 0.) Inequality (21 1) is the nonnegative supermartingale probability inequality. Define
=
yj!(+) K
~ p .
[A:  K y(A)] + . K
The maximum of r ( n ) , in the interval [0, N ] , occurs at either n = 0 or n = N , depending on whether A is greater than or less than
/
88
111 FINITE TIME STABILITY AND FIRST EXIT TIMES
KnE‘(K/(K= m, then
(pi))=
K F N ( K ) If . 1is chosen so that maxos,5Nr(n)
P,{ sup V(x,,)Z m } 2 P,{ sup W ( x , , n) 2 A} iB ( K ) . NtntO
N>ntO
+
Assume A? KF,(K) and let maxNtntor(n) = m . Then 1 = m K F N ( K ) K . The last two sentences jmply that m 2 K . Thus we require
Now
which is minimized (in the alloted interval m 2 K 2 (p’p/(p  1)) by the maximum value K = m , and (212) follows immediately upon making this substitution. Now, let 1 5 K F , ( K ) ; then the maximum of r ( n ) occurs at n = N , and, setting the maximum equal to m yields 1= m F , ( K ) . For consistency, we now require that K > m. Since K 2 (p’p/(p  1) is also required, and (in this case) m < (p’p/(p  l), and
decreases as K decreases, we set K = (p’p/(p 1) = m’ > m, the minimum allowed value. Equation (213) follows immediately upon this substitution. Corollary 21.
B .~L 1. Then
Assume the conditions of Theorem 2, except that
P, { sup V (x,) 2 m } i NtntO
m
(214)
Proof. The proof is similar to that of Theorem 2, using W ( x ,n) = V(X)
+ 1;
cpi.
2.
89
THEOREMS
STRONG DIFFUSION PROCESSES
It is instructive to consider a derivation of a result of the type of Theorem 1 from another point of view. The derivation is for the special case of the strong diffusion process in a compact region and makes use of the relationship between such processes and parabolic differential equations. Let G be an open region with boundary dG and compact closure. In order to apply the result for parabolic equations it is required that dG satisfy some smoothness condition. We suppose that dG is of class H 2 + = (see Chapter I, Section 6). If g is given by Q, = {x: V ( x ) < m ) , where Q, is bounded and V ( x ) has Holder continuous second derivatives, the smoothness condition on dG is satisfied. Theorem 4. Let x, be a strong diffusion process in G, where C and dG satisfy the conditions in the previous paragraph. Let P ( x , t ) be a continuous nonnegative function with continuous second derivatives in the components of x and a continuous first derivative in I , in G = [G dG] x [0, TI. If
+
(L 
);
P(x, t) 2 0
(215)
and P(x, 0 ) L 0 ,
P(x,t)Zl,
TZt>O,
x ~ d G .
(216)
Then, for t 5 T, P,{x,$G,
some s 5
t}
5 P ( x , t).
(217)
Remark. ,Although Theorem 4 is applicable only to strong diffusion processes, it is suggestive as a criterion for evaluating stochastic Liapunov functions to which Theorems 1 and 4 are to be applied. Let P, (x, t ) and P,(x, t ) satisfy the conditions on P ( x , r ) of Theorem4. Let P, (x, t ) 2 P,(x, t ) on dG x [0, TI G x {0},
+
90
111
/ FINITE TIME STABILITY A N D FIRST EXIT TIMES
in G . Then Theorem 4 and the strong maximum principle for parabolic operators (Friedman [l], Chapter 2) yields P , (x, t ) 2 P2(x, t ) 2 P, {xs#G , some s 5 t } .
Thus P 2 ( x , t ) is no worse an estimate than P, (x, t ) , and will be better if the inequalities are strict at some point. Let x, be a strong diffusion process and suppose that V ( x ) has continuous second derivatives. Let P,(x, t ) be the right side of (23) of Theorem 1 (where cp, = cp and p 2 cpfm) and L = A",, and let G = {x: V ( x ) < m } be bounded. On dG, V ( x ) = rn and Pl (x, t ) = 1. At t = 0, Pl(x, 0) = V ( x ) / m2 0. Also
Since L V ( x ) 5  p V ( x ) + cp and p 2 cp/m by assumption,
and we conclude that the right side of (23) satisfies the conditions o n P ( x , t ) of Theorem 4, under the appropriate conditions on the process x,. A similar statement can be made for the right side of (24). , on Proof of Theorem 4. Under the conditions on S i j ( x ) , f ; ( x ) and dG, Theorem 13.18 of Dynkin [2] (see Section 6 , Chapter I) yields that p ( t , x, y ) , the probability transition density of the process x, in G (and c G) = stopped on dG), satisfies ( L  d / d t ) u = 0. (Here P , ( x , E ~ Jr(p(t,x,y)dy). The function p ( / , x , y ) is continuous and has continuous second derivatives for t > 0. Alsop(t, x, y ) + 0, x + dG, t > 0, and p ( t , x, y ) 5 Kt  exp  Ily  x1I2/2th,for some positive real numbers K and h. So, the function 1  jGP(trx , y ) dy = Q(x, t ) is the probability that the first time that xsexists from G is no larger than t . Q(x, t ) satisfies ( L  a / d t ) Q(x, t ) = 0 with boundary condition Q(x, t ) + 1 as x + dG, t > O a n d Q(x,t)+Oas t + O , x ~ G . N o t i n g t h a t ( L  d / d t ) P ( x , t ) g O and that P ( x , t ) 1 Q(x, 1 ) on G x ( 0 ) dG x [0, T I , an application of
+
3.
91
EXAMPLES
the strong maximum principle for parabolic operators yields that P(x, t ) 2 Q(x, t ) Thus, for t
in
[G
+ dG]
x [0, T I .
T, some s S t } S P ( x , t ) .
Q(x, t ) = P,{x,$G,
Q.E.D.
3. Examples Example I. A Hill Climbing Problem. In this example, bounds on the errors associated with a standard simple gradient hill climbing method are developed. Let f ( x ) be a smooth scalarvalued function with a unique maximum On, at time n, which varies in time according to the rule en+,= 8, + $, where the $, are assumed to be independent random variables. The quantities x and 8, are scalar valued, and x, is the nth sequential estimate of 8,. For the most general problem, it would be assumed that not much a priori information concerningf(x) is available, but that observations on f(x), corrupted by additive noise, can be taken. To facilitate a simple development we suppose that f ( x ) takes the simple formf(x) =  k(x  8,)/2 and let two observations (at x, c and at x,  c ) be taken simultaneously. The sequence x, is given by the gradient procedure
+
x,+1
= x,
+
= x,
+
a [observation at (x, 
+ c)  observation at (x,  c)] ~~~~~
~~
2c
(xn + C)  f (xn  C)
+tnl
2c
~~

(31)
The random variable t, is the total observation noise and the members of {t,}are assumed to be independent. a and c are suitable positive constants. Equation (31) can be written as xn+1  e n +
1 = (1
0
 a k ) (xn  e n > + v n
at, =$, 2c
92
111
/ FINITE TIME STABILITY AND FIRST EXIT TIMES
Define Ev; = m2 and ED: = m4 and let Eon = Evi = 0. In order to apply Theorem 3, the assumption that 11  akl c 1 will also be needed. Let en = x,  9,; then e n + , = (1
+
 &)en
(32)
0,.
The estimates of P,(supNLntOlenl 2 E ) given by twodifferent Liapunov functions will be compared. First, define V, ( e ) = e 2 .Then (32), applied to V, (e), gives
Eerie;+
 ak)2 e: p = (1  a k )  ' , = (1
+m2,
9, = m2.
Thus Theorem 3 is applicable and, with the use of m = E~ and e t gives the bounds
= e2,
P,{ sup lenl 2 E } = P,{ sup e: 2 2) NtnzO
NtnzO
if &
2
z  9P
and P x { sup (enl 2 E } 5 NBnzO

m2
p  1 [I  (1  a k y j
e 2 ( l  ak)2N m2[1 (1  ~ k ) ~ ~ ] (33b) + E2 E 2 [l  (1  ak)2] ~
~~
~
~
~~~~
if &
2
5  pm2 [l  ( 1  ak)2j'
The bound in (33b) is separated into two terms, the first depending on the initial error, and the second depending on the noise variance. As N increases, the contribution of the initial error decreases exponentially, while the noise contribution increases as a constant minus a decreasing exponential. Now let us try V,(e) = Be4 + e2 as a Liapunov function.
3.
+ en+ 2
&“(Be:+
= B(l
93
EXAMPLES
 a k ) , e:
cp = Bin,
+ m2.
(34)
Apply Theorem 3 to V , ( e ) and let m = Be4 + E ’ ; this gives the bounds
P, { SUP lenl 2 NLn>O
E l = P,{ I l 
when
SUP V 2 ( 4 1 Vz(E)l
N>n>O
(
1
+
)
Be4 + e2 Bm, m2 1 ___ BE, + c 2 ) ( BE, + E’
N
(35a)
+
4 Bm, m2 m=BE +E‘L1  1 / p’
and
P,{ I 
SUP
NLn>O
lenl
L El
(Be4 + e’) /TNf (1  p”) ( B m , BE>,
when BE, + E’
E, then the maximum value of the second term of (31 1) occurs at E = e. Let E~ 5 E . To make the most effective use of Theorem 1 (assuming (23) is applicable) we want to select y so that cpy/m is a minimum subject to the constraint p y 2 q y / m(which assures us that (23) applies); i.e., if m is sufficientlylarge, we would like to select y = y * so that m = cpy./py.. (See Figure 2.) The value of y which minimizes cpu/pyis y* = 2/n and
m
I
0
1
Y*
Y
I
Figure 2
yields cp,./py.
= D.Thus,
we require
m >= cpy,/py. >= D . The inequalities are satisfied by our assumption on m. Thus, corresponding to some 0 5 y 6 1, the equality m = cpy/p7holds. With this y, (23) yields the bound (315)
3.
99
EXAMPLES
If the hypotheses of the construction are satisfied, namely m 2 D, then (315) is preferable to (313), provided that the same value of n is used in each. However, the best value of n in (313) is n = 2 (at least for e = 0; n 2 2 by assumption), and the computations which have been made have not proved that (315) is preferable to (313) when each is minimized seperately with respect to n. Nevertheless, the method does seem to have a general interest. Example 4. A TwoDimensional Linear It6 Equation. system be modeled by d x , = x 2 dt dx2 = (
XI
~
2
dt)
+
CT
d~
Let the
(316)
We will use the stochastic Liapunov function V ( x ) = +x;
with
+ X l X 2 + x:
A”,V(X)= ~ V ( X=) X :
+
 X:
= x’Bx
C T ~=
 X ’ A X + 0’.
(317)
Equation (317) can be put into the form
+ cp p = min (x: + x:)/(+x: + x 1 x 2 + x : ) . J m V ( x )2  p V ( x )
cp = 0
2
,
X
The minimum of the ratio is easily shown to be the least value of 111 for which ( A 1B)x = 0
+
has a nontrivial solution, and it equals p Thus z m V ( x )2
= 1  0.4J1.25
2 0.552.
 0 . 5 5 2 V ( x ) + a’.
Suppose that m 2 a2/(0.552). Then equation (23) gives the bound
(
P,{ sup V ( x , ) 2 m } 5 1  1  v ( X I )exp($). TLtZO
~
(318)
100
I11
/ FINITE TIME STABILITY A N D FIRST EXIT TIMES
IMPROVING THE BOUND We will give a qualitative description of the situation when the Liapunov function V " ( x )is used : A",V"(x)
=9V"(x)
=nV~l(x)[x:x2+fJ
I.
+(2x2 + 2V ( XdI ( n  1 ) (319)
There are many ways of factoring (319) into the form 9 V " ( x ) s  pV"(x) + cp. Let ji = 0.552 and write 9 V " ( X )5  p,V"(x) where
s
p,, = npa,
,
+ cp"
1 > a, > 0
cpn=max nji(a,, ~ ) ~ " ( x ) + n ~ "  ' ( x ) c ~ ~
+(2x2 + x l ) 2 V " 2' ( x ) c?(n
1.
 1) n ~
(320)
The first term in the bracket of (320) is negative and dominates the other terms for large ]lxll. Thus, for any 1 > a, > 0, the term being maximized will have a finite maximum which will occur at a finite point. cpn increases, as a, increases to one. cpn also increases with n roughly as n". Choose 1 > a, > 0 so that cpn is as small as possible subject to the constraint m" 2 cp,/pn. (To simplify the discussion, we assume that m" 2 cp,/pn for the range of n of interest. For any fixed m, this will hold for at most finitely many n. If m" 2 cp,/p, cannot be achieved, then the argument must be based on (24) rather than on (23). For any n, if m is suflciently large, then the exponent in (23) or (321) satisfies cpn/mn< cpr/mr for all r < n. Also, V n ( x ) / m n< V r ( x ) / m r , r < n. We may conclude that, for each value of m, there is an optimum n (n is not required to be integral valued, provided n 2 1); the optimum
3.
101
EXAMPLES
value of n is an increasing function of m. While no computations have been made, it seems reasonable to expect that, if the optimum n ( m ) were substituted into (321), then, for fixed T , the graph of the right side of (321), considered as a function of m, would have a bell shape, rather than the shape of a simple exponential: P, { sup
TzthO
v (XI)
2 m } = P, { sup
TzItO
I 1  (1
I/" (XI)
7)
2 m")
 V" ( X I exp
( ).:
(321)
A related approach involves the use of the Liapunov function exp $ V ( x ) in lieu of V " ( x ) .The choice of $ replaces the choice of n. We have exp $ V ( x )
= $ exp $ v ( x )
 x:  x i + 0'
+ $0' (2x, 2
+ x1 (322)
which may be treated similarly to 9 V n ( x ) . For more details, see Example 2 of Chapter V which concerns the design of a control for the system (316).
Iv /
OPTIMAL STOCHASTIC CONTROL
1. Introduction
This section motivates the more general discussion of the succeeding sections. Let x: be a family of strong Markov processes. The parameter u associated with each member is termed a control. The control determines the probability transition function p ” ( t ,x; s, y ) of the process xy. Usually each u may be identified with a specific member of a given family of functions which takes values u(x:, t ) depending only on x: and t , at time t. The object of control theory is to select the control so that the corresponding process possesses some desired property. One object may be to transfer an initial state x to some target set S with probability one, that is, choose a u so that x: + S with probability one); another object could be for x: to follow as closely as possible (in a suitable statistical sense) some preassigned path. Write A”u and A”,” for the weak infinitesimal operators of the process x: and the process stopped on exit from an open set Q,,,, respectively, and write E: for the expectation given that the initial state is x. Henceforth x: will be written as x,, and #($, t ) may be written u,. The specific control associated with x, will be clear from the context. The problem may be developed further by associating to each control a cost C”(x) = E l b (XI,)
+ El 102
1 0
k (x,, u s ) ds .
(11)
1 . INTRODUCTION
103
If we wish to attain a target set S, then 7, is the random time of arrival at S . A problem of optimal control is to select u so that x, + S with probability one as t approaches some random time 7, (possibly infinite valued) and which minimizes the cost C"(x) with respect to other controls of a specified class of "admissible comparison controls." A
DYNAMIC PROGRAMMING ALGORITHM
Let us first consider the control of processes governed by It8 equations, and, for the moment, proceed in a formal way. Suppose that the control process is modeled by the homogeneous process d x = f (x, U ) dt Y = c f i ( X , u )  +a j axi
+ a ( x , u ) dz 1c s i j ( x , u )    at .
axi a x j
2i,j
,
(12)
and assume A"u = 2"on all functions to which the operator is to be applied. The function u takes values u ( x J at time t . The system (12) has a welldefined solution in the sense of Itd for any sufficiently smoothf, u, and 0 . Thus, the control u indeed determines the process. Let a target set S be given and let (11) represent the cost given that xo = x is not in S. Denote the minimum cost by V ( x ) ; in other words, V ( x )5 C"(x) for all controls u lying in a specified comparison class. If V ( x ) is indeed the minimum cost, then the principle of dynamic programming (Bellman [ 2 ] ) implies that V ( x ) satisfies the functional equation V ( x ) = min E;[V(xAnru U
)
AT
+
0
k ( x , , us) d s ] .
(13)
The time index Ant, appears since control is terminated at 7,. The initial time in (13) is set equal to zero for convenience. Since u depends only on x, the process is homogeneous in time and the convention is inconsequential. If V ( x ) is given, then, by the algorithm of dynamic programming, the control u minimizing the right side of (13) is the optimal control.
104
IV
/ OPTIMAL STOCHASTIC CONTROL
Continuing formally, let us put (13) into a more convenient form. Divide (13) by A and let
A0
k ( x s , us) = k ( x , u (x)) . ff
0
This yields
0 = min [ 9 " " V ( x + ) k ( x , u)] , U
which is a nonlinear partial differential equation for V ( x ) , and also gives the optimum control u ( x ) in terms of the derivatives of V ( x ) . In other words, if u is the function minimizing (15) and w is some other control, then
+
0 = ~ ' V ( X ) k ( x , U ) 5 Y w V ( ~+) k ( x , w ) .
(16)
V ( x ) is, of course, subject to the boundary condition V ( x )= b ( x ) for
x on as. In a sense, for stochastic control, (15) replaces the Hamilton
Jacobi equation corresponding to the deterministic optimal control problem, where the It6 equation is replaced by an ordinary differential equation. See, for example, Kalman [l], Dreyfus [l], and Athans and Falb [l]. A slightly different control problem occurs if we add the requirement that the process is to be terminated upon the first time x, exits from an open set G containing S, provided that this occurs before x, reaches S . Then by letting 7, be the least time to either a G or dS, whichever is contacted first, and assigning the penalty b(x7") to the stopping position, the problem is unchanged except that the solution of (15) is subject to the boundary condition b ( x ) on asu ac. A seemingly different type of control problem appears when no target set S is specified, and the control is to be terminated at a given finite time T. Now, let u take values u ( x , , t ) depending on both x, and t . By
1.
105
INTRODUCTION
letting T" equal the minimum of T and the first time of contact with JG, and letting x, = x e G be the initial condition, the cost is written as ru
CU(x,t ) = E:,, b(x,,,, T,,)
+ E:,,/k(x,,
u s ,s ) d s ,
(17)
t
where E:,[ is the expectation given x, = x e G and r T . The associated dynamic programming equation is (we now allow all functions to depend explicitly on t )
+ 9 " V (x, t ) + k (x, u , where V ( x , t ) is the minimum cost function (the minimum of Cu(x, t ) over u). Actually, from the point of view of formal development, the problem leading to (17) and (18) is a special case of the preceeding problem. It is not necessary to introduce time explicitly since some component of x, can be considered to be time; that is, dx,+ = dt or x,+ = t . In fact, unless otherwise mentioned, we will omit the t argument, and assume that some state is time. The control forms which have been discussed up to now are functions of only the present value x, at time t. Their values at t have not been allowed to depend on the values of x s , s < t . Let P be the class of controls whose values depend only on the present state x,. Let P, be the class of controls whose values have an arbitrary dependence upon the present and also upon the past history of the state. Is the class P no worse than the class P,? In other words, for each control i t ' in P, is there a control u ( w ) in P such that C u ( w(x) ) Cw(x)?The question as formulated is still a little vague. Nevertheless, it is intuitively reasonable that, if the uncontrolled process has the Markov property, that the class P i s "just as good" as the class PI. Derman [ l ] gives some results on this problem for the discrete parameter control problem and Fleming [2] for the continuous parameter diffusion process; see also Theorem 8 and Example 1 of this chapter.
106
IV
/ OPTIMAL STOCHASTIC CONTROL
DISCUSSION A number of difficulties with the development leading to (15) and (16) are readily apparent. Let Q be the family of controls (functions of x) such that, if U E Q then , (12) has a unique solution (in the sense of It6) which is a welldefined strong Markov process. The class Q is essentially limited to functions satisfying at least a local Lipschitz condition in x. There may be no u in Q such that C"(x) C w ( x )for every other control M? in Q.* Even if an optimum control u in Q did exist, there is no guarantee that the cost C"(x) is in the domain of 2 and that 2 = 9'' when ' acting on C"(x); that is, the limits (14) may be meaningless. Third, if a smooth solution to (15) did exist, there is no guarantee that the control which minimizes (15) will be in Q; in particular, it may not satisfy even a local Lipschitz condition in x. (If u does not satisfy this condition, then the corresponding It6 process has not been defined, although there are diffusion processes whose differential generators have coefficients that are less smooth than that required (at present) for the existence and uniqueness of solutions to the It6 equation.) The family of comparison controls Q is not clear from the derivation of (15) and (16). Also, without further analysis, we cannot assume as is done in (14) that P: { A < T,,}+ 1 as A + 0. (This property holds for all the right continuous processes of concern here.) Lastly, the solution to ( 1 4 subject to the appropriate boundary conditions, may not be unique. ~
* Consider a control problem where a target set S is to be attained. Typically, the question of the existence of an optimal control is broken into two questions. The first is a question of attainability and the second of optimality. First: Is there a control with which the set S will be attained with probability one? It is possible to treat this with the methods developed here for stochastic stability. See Chapter V. The second question is: Given that there is one control accomplishing the desired task, then is there an optimal control? The latter question has been well treated for the deterministic problem. See, for example, Lee and Marcus [I], Roxin [I], and Fillipov [I]. For the stochastic problem, the possibilities are great and the results more fragmentary; see Kushner [7] and Flemingand Nisio [I]. See the survey papers Kushner [2, 101 for more references to the literature on stochastic control.
1.
107
INTRODUCTION
The purpose of the next section is to resolve these and related questions.
A
DETERMINISTIC RESULT
To motivate the theorems of the sequel, we first consider the control of the deterministic system i = f ( x , u). Let Q be a class of controls each member of which is a function whose values depend on time, and impose conditions on Q and f so that f = f ( x , u) has a unique continuous solution and Ilf(x, u)II is bounded in the time interval [0, a), for each u in Q . Let S be a compact target set and define the cost as C"(X) = b (xJ
+
i 0
k ( x s , us)ds
where T" is the time of arrival at 8s.Let b ( x ) be continuous and let k ( x , u ) > 0 outside of S. Now, let V ( x ) be a nonnegative function with continuous derivatives, and which tends to infinity as llxll+oo. Define the operator H" = c i A ( x ,u ) 8 / d x i . Let H"V (x)
= V ( X ) =
k (x, u ) .
Then V ( x ) is a Liapunov function for f = f ( x , u), and any trajectory of the system is uniformly bounded in time. Also, with xo = x and t 5 T", V ( X ~) V ( X )=
i
H"V(x,) ds = 
0
i
k ( x , , u,) d s .
(19)
0
Owing to the boundedness of the solutions and of 1 1 li and to the properties of V ( x ) and v(x) outside S, we see that, as t increases to some finite or infinite value T", x, must approach some point xTyon as, and V ( x , ) must approach b ( x T J .Hence V ( x )= C"(x). Now, let w be any other control in Q which transfers x to dS and let H " V ( x ) k ( x , w ) 2 H"V(x) k ( x , u). Then, again V ( x t )+ b(xrw),
+
+
108
IV
/ OPTIMAL STOCHASTIC CONTROL
and a simple calculation yields T"
C " ( x ) = / k ( x s , us) ds 0
+ b(xJ
s
rW
S C w ( x )=
k ( x , , w,) ds
0
+ b(x,,,,).
In other words, if 0 = H " V ( x ) + k ( x , u ) 5 H " V ( x ) + k ( x , w ) for all w in Q , then u is an optimal control relative to the set of controls Q. Also, the minimum cost is a solution to H " V ( x ) + k ( x , u ) = 0 with boundary data b ( x ) and the optimum u minimizes H " V ( x ) + k ( x , u). ( H " V ( x )+ k ( x , u) = 0 is an equation of the HamiltonJacobi type.) Except for a few results concerning existence and uniqueness for some problems in the control of strong diffusion processes, the approach of the sequel is similar to the deterministic approach just described. Given candidates for the minimum cost and optimum control, we discuss criteria which may be used to test for optimality. The processes, target sets, and conditions will vary from theorem to theorem. The stochastic situation is much more difficult than the deterministic counterpart just described.
THESTOCHASTIC ANALOG Stochastic analogs of the deterministic result are the burden of part of Section 2. In order to base a stochastic proof upon the deterministic model of the previous subsection, an analog of the integral formula (19) is needed. Dynkin's formula
i
EZV ( x ~) V ( x ) = Ef: A""V ( x , ) ds
(110)
0
will provide this. T is a Markov time with finite average value, and V ( x ) is in the domain of A"". The use of (110) would appear to be hindered by these conditions. Nevertheless, by first applying (110) under these conditions, we may subsequently extend its domain of validity in such a way as to provide a useful tool. The extension is
2.
THEOREMS
109
accomplished by truncating either V(x) or the process x, and then taking limits, and is developed in the proofs which follow.
2. Theorems
TERMINOLOGY In Theorem 1, the following conventions and assumptions are used for the target set S. Let x = (2,K), where Tr is continuous with probability one, for t =< T,, if T, c co, and for t < co otherwise. The component R, is only right continuous on the same time interval. Let 2 be in the Euclidean space E and x’ in E, E = E x E. The object of the control is to transfer the initial value of Zt, denoted by 2, to S in E. These conventions will simplify the proof. They do not seem to be a serious compromise of generality, since discontinuous processes will probably enter the problem as parameters or, at least, not as variables whose values are to be transferred to some set. For each control under consideration, the corresponding process is assumed to be a right continuous strong Markov process. The previous usage Q, = {x: V(x) < m}, where V(x) is a given function, will be retained. Note also that A”,” refers to the weak infinitesimal operator of the process which corresponds to u and which is stopped on exit from a given set Q,. The set Q, (and function V(x)) will be clear from the context. Except where explicitly noted, u will always denote a real or vectorvalued function whose values will depend either only on x, or on x and t , as indicated. We define A:,b as the weak infinitesimal operator of the process (with control u) stopped on exit from Q,  S = {x: V(x) S m}  S . T~ is always the first entrance time into the set S, or the time of termination of control if this occurs first. Theorem 1 gives conditions on a function V(x) and control u so that V(x) = C”(x). It is assumed that b(x) = 0 and Tr + 8s with probability one as t + T,. Theorem 2 gives conditions on a function V(x) so that V ( x ) = C ” ( x ) and b(x) may take nonzero values if x , + S a s t + ~ , , where T” < co with probability one. Theorem 3 gives a readily check
110
IV
/ OPTIMAL STOCHASTIC CONTROL
able condition under which a condition (uniform integrability) required in Theorems 1 and 2 is satisfied. Theorem 4 is an optimality theorem; conditions on two controls u and u are given so that C"(x) 5 C"(x). The analog of Theorem 2 for fixed time of control appears in Theorem 5. Theorems 6 and 7 give special forms (stronger results under stronger conditions) of Theorems 2 and 5, respectively, for the special case of strong diffusion processes. Finally, there are examples and a discrete parameter theorem. Theorem 1. Assume the target set conventions of the first paragraph of Section 2. Let b ( x ) = 0. Let V ( x ) 2 0 be a continuous function defined on E and which takes the value zero on S. For each rn > 0, let V ( x ) be in the domain of Ai,b and
Suppose* that ,ff+ a3 as t ,t,,. There is a nondecreasing sequence of Markov times tisatisfying E:ti < co,ti t,, with probability one and ti5 inf { t : V ( x , )2 i}. Let x be in E  S . Assume that either (21) holds, or that (22) togethert with either (23) or (24) holds: f
E:l/(xTi)+O
as
(21)
im;
the V (xTi) are uniformly integrable 1 ;
(22)
V ( x ) is uniformly continuous on 8 s ;
(23)
P: { sup llxtll 2 N } + O T,>tZO
as
N+co.
(24)
... .
* This is a question in stochastic stability, and can be verified immediately if k ( x , u) has the appropriate form; see for example Lemma 2 or Corollary 31 of
Chapter 11. t If V ( x ) x as l[x[l+ m, then the nonpositivity of a m , , u V ( x )implies (24). See Chapter 11. A sequence fi is uniformly integrable if, for every E > 0, there is an N < 00 independent of i such that EZulfil x , l ~ , B l N )5 E , where X A is the characteristic function of the set A . If fi +f with probability one, then uniform integrability implies €Sufi + E Z y J f
2. THEOREMS Then
111
7"
V ( x ) = C"(x) = E,/k(x,,
us) d s .
(25)
0
Proof. Define t i = min (i, zu,inf { t : V ( x t )2 i}). Since P x { ~ ~ p 7 u , t ~ 0 V ( x , )2 i} V(x)/i implies that inf { t : V ( x , )2 i} t 03 as i + 03 (with probability one), we have z, n inf { t : V ( x , )>= i} + T, (with probability one) as i + m . Thus { z i } satisfies the hypothesis. First, we prove that (22) and (24) imply (21). Ifthe sample function x,, t < z,, is uniformly bounded, then, by the continuity of V ( x ) ,the sample values V(x,) + 0 as zi + T,. Equation (24) implies that the sample functions x, are uniformly bounded with a probability arbitrarily close to one. Thus V(x,) + 0 with probability one. (This is true even though the x, may not converge to a unique limit on 8s. By hypothesis x, + as = aS x E with probability one. Thus, for each E > 0, there is a finitevalued random variable 4&such that the distance between x, and dS is less than E for t > $&,with probability one. V ( x , )+O as t + z, since V ( x ) is uniformly continuous on the range of x,, t < z,, with a probability arbitrarily close to one.) Finally, this fact together with (22) implies (21). Similarly, it is proved that (22) and (23) imply (21). (Note that the fact that V ( x , ) t 0 with probability one follows immediately from the uniform continuity of V ( x ) on as together with the fact that x, + as with probability one as t + z,.) By Dynkin's formula
j:
V ( X )= E:V(xTi)  Ej: X:~V(X)ds 0
.Ti
= E;V(xri)
+ E:/
k(x,, us)ds. 0
Since k(x) 2 0, the integral tends to E," f: k(x,, us) ds = C"(x). Since the first term on the right tends to zero as i + m , the proof is complete. Theorem 2.
Let the nonnegative function V ( x ) be defined and continuous on E and take the nonnegative value b ( x ) on the target set
1V / OPTIMAL STOCHASTIC CONTROL
112
S. Assume that V ( x )is in the domain of A”:,*, for each m > 0, and that Z:,*V(x) =  k ( x , u ) 5 0
+
in E  S as, and that the control u transfers the initial condition to S at T,, where T, c 00 with probability one. There is a nondecreasing sequence of Markov times T~ such that zi 4T , with probability one, E:T~< co and T~ =< inf ( t : V(x,)2 i]. For some such sequence, let either (2la) or (22a) hold:
the V ( x T i are ) uniformly integrable.
(22a)
Then V ( X )= C”(X) = E:b(x,,,)
+ E:
J 0
k ( x , , us) d s .
(25a)
Remark. The primary difference between the problems stated in Theorems 1 and 2 is that here the limit lim, x, = xTumust exist with probability one. Otherwise, the “terminal cost” term E,Ub(x,”)is meaningless. This is the reason for the requirement that T, be essentially finite valued. Proof. Since A”i,,V(x) 5 0, in E  S
+ as we have,
for X E E S ,
which goes to zero as m 03. Define zi = min (i, T,, inf ( 1 : V ( x , )2 i}). As in Theorem 1, { T ~ satisfies } the hypothesis. Also, since T , < 00 with probability one, there is an i(o)which is finite valued with probability one, such that x,, = xTufor all i > i ( w ) . Hence lim+ ‘IJ V(x,) = V(x,,,) with probability one, whether or not the process V ( x , ) is continuous. Now (22a) implies (2la). Applying Dynkin’s formula to T~ and V ( x )
2.
113
THEOREMS
we have V ( x )= E : V ( x , )
+ El!
Ti
k ( x , , us) d s . 0
Using (2la) and taking limits we obtain (25a). Remark on the significance of (22) or (22a). Whether or not any of (21)(24), (2la), or (22a) holds, the nonnegativity of k ( x , u ) and the monotone convergence theorem imply that, for any sequence of Markov times t , V ( x ) = E:
I
k ( x , , us)&
+ lim E:V(x,). 1T"
0
By Fatou's lemma (if b ( x ) is identically zero set b(xTU) = V(xTu) = 0 in what follows), lirn E:V ( x , ) 2 E:V (xT,)= E:b ( x J . ZT"
Hence
V ( x ) 2 C"(X)>= 0
and, for x on S , V ( x ) = b (x) = C " ( X ) .
Also, since V ( x ) is in the domain of each z i , b , the function defined by rw
Cy (x)
= E:
k (xs, us) ds
+ E: b(x,,,)
0
may be assumed* to be in the domain of each 
2 i . b with Ai,bCy(x)=
* I t is not always true that integrals of the form ELuSiUg(xr) ds = C (x) are in the domain of A,,,J,~ for any m > 0. If this is so and g (x) is continuous, then AnL.bu G ( x ) = g (x). The assumption is made for the sake of argument, but it will not be pursued here. See Dynkin [2], Chapter 5, for a discussion of such questions.
114
IV
/ OPTIMAL STOCHASTIC CONTROL
If the only function of the process xtnr, which is a martingale and which satisfies the above boundary and expectation conditions is the trivial function q ( x ) = 0, then condition (22) is satisfied and (25) holds. To see that there can be such functions q ( x ) which are not identically zero, consider the scalar uncontrolled It6 process whose stopped subprocesses have the weak infinitesimal operators x z , b and with the origin as the target set S : dx
=x
dt
+ J:x
dz
Suppose that 4x2 k ( x , u ) = k ( x ) = , 3
b(O)=O.
Let V ( x ) = x2. Then V ( x ) is in the domain of x : , b , and on V ( x ) in the set Q,, 9 =A:, for each m,and
2.
115
THEOREMS
Now let V, (x) = x2 + x4; V , (x) is also in the domain of for each m,and V , (0) = 0 (note that Q, is now relative to V , (x)) and x : , b = Y o on V, (x): p ~ l ( ~ ) =  4x2 =k(x)=~oV(x). 3
In this case it can be shown that zo = co with probability one (see the proof of Khas'minskii [3]) ; there are analogous examples, however, when zo is finite with probability one. The process x, may be essentially written as x, = xo exp (42,  4 t ) .
A direct computation yields 4
4
Ex,xr + s = x s
with probability one, verifying that the difference V , (x,)  V ( x , )= x: is a martingale. Hence, the stopped process x&,, is also a martingale. Write = V(x,). By a supermartingale convergence theorem of Doob [l], the supermartingale sequence 6 converges to a finite valued random variable v with probability one if ElVJ 5 M 00. Also, if the sequence 5 is uniformly integrable, then Ex& + Exv. Thus, uniform integrability (condition (22)) together with the convergence of the V, to the appropriate boundary values is equivalent to (21) or (2la). Theorem 3 gives a general and usable criterion for the uniform integrability of the sequence V(xTi);hence, under the conditions of Theorem 3, if V(x,) converges to the desired boundary value b(xJ with probability one, then E: V ( x , )+ E:b(xru).
v
=
Theorem 3. Let the function V ( x ) and the sequence T~ satisfy the conditions of Theorem 1 or 2. Let there exist a nonnegative function E ( x ) in the domain of x : , b with E ( x ) 5 0 for each m. Let
xi,"
116
as
IV
A + 03. Then the
/ OPTIMAL STOCHASTIC CONTROL V(x,,) are uniformly integrable. In particular
E:V
('Ti)
+
E:v
(xTu).
v
= V(xTi) and Fi = g(xr,). Proof. Define the random sequences For each w in the set Bmi= {w: 2 m}, v / F i 5 l/g(m). Thus,
Since J:,bg(x) 5 0, the process @(xtnTU) is a nonnegative supermartingale. The discrete parameter process obtained by sampling a supermartingale at a sequence of Markov times is a discrete parameter supermartingale (Doob [l], Chapter VIII, Theorems 11.6 and 11.8). Thus the process Fi is a discrete parameter nonnegative supermartingale. By the martingale property, E:Fi 5
F?.
Substituting this into the right side of (27) one obtains a bound which is independent of i and tends to zero as m + 03. Hence the sequence Vi is uniformly integrable. Q.E.D Remark on the selection of P ( x ) f o r It6 processes. In Example 3, E ( x ) will be taken in the special form E ( x ) = F( V ( x ) ) ,where F(A)/A+ co as A+co. Let F(A) have continuous second derivatives and let V ( x ) be in the domain of A:,,for each m > 0. Then F( V ( x ) )is in the domain of J:,b for each m > 0, and
2.
117
THEOREMS
where V,(x) is the gradient of V ( x ) and
k ( x , U) =  ~ ' V ( X ) . Let m(s) be any continuous function which satisfies
in E  S. If F ( V ( x ) ) is to satisfy the conditions of Theorem 2 then (28) must be nonpositive in E  S, or, equivalently,
in E  S . The function F (A) defined by S
(210) satisfies 9 ' F ( V ( x ) )S 0 in E  S. If the condition S
exp[rn(y)dy+cc
as
s+m
(211)
is also satisfied, then F (A)/A + m as A+ co,and since F (A) has continuous second derivatives, it will satisfy the hypotheses of Theorem 3. If m(s) is the trivial function m(s) = 0, then F ( A ) / A = 1. Thus, if the bound (29) satisjes (2ll), the V(xri) are uniformly integrable. If the process x, is confined to a set Q, 3 S with probability one, then the Viare uniformly integrable, since they are uniformly bounded. This boundedness is guaranteed if S (x) + 0 uniformly as x + dQ, from the interior and k ( x , u ) > E > 0 in some neighborhood of dQ, (relative to Q,) for some E > 0. To show this we need only observe that the given conditions imply the existence of a function m(A) which is continuous in Q,  S , satisfies (29), and tends to infinity as A f r.
118
IV
/ OPTIMAL STOCHASTIC CONTROL
Thus F ( A ) / 1 (defined by (210)) tends to infinity as A+r, and finally, this implies that P x { sup V ( x J > r } = O . r.>rtO
A
PARTICULAR FORM FOR
F( V(X))
For It6 processes, it is convenient to use the particular form F(1) = 1log ( A + A)
(212)
for a suitably large real number A . Then log@
+ [2 AN OPTIMALITY
+ V(X))
V (x)
+A
k(x, u)
V(x)’ [ V i ( x ) S(x) V x ( x ) ] .
+ V(x)) +
2(A
+I
(213)
THEOREM
The cost corresponding to control u (of the statement of Theorem 2) may be compared to the costs corresponding to other controls by the following considerations. Let w be a control which takes the initial condition x to S with probability one, for each X E E  S . Let V ( x ) satisfy the conditions of Theorem 2 and suppose that V ( x ) is also in the domain of 2 1 , b for each m > 0 and that A”,”,bl/(x)>=  k ( x ,
W).
(214)
Suppose that T, < co with probability one, and that, with control w, there is a sequence T~ of Markov times satisfying the conditions of Theorem 2 (but converging to T,). Then, by Dynkin’s formula, for each i,
+
V ( x ) 5 E:V ( x r i ) E:
j: 0
k ( x , , ws)d s .
2.
119
THEOREMS
By taking limits, we obtain V(x)
where
s C W ( X )+ 6 ( x ) ,
C” ( x ) = E:b (xr,)
+ E:
i 0
k ( x s, w,) ds
S w ( x ) = lim E:V (x,,)  EZV ( x r w )2 0 . it m
In order to compare the controls u and w , we need to evaluate a”(.). If Sw(x)= 0, then C ” ( x )= V ( x )5 C w ( x ) . In general, u is optimal (in the sense of minimizing the cost) with respect to at least all controls w for which A”,”,, v ( X ) 2  k ( x , w), and a”(.) = 0. In the deterministic problem, if V ( x ) is continuous and if a control w transfers the initial point x to a finite point on dS, then V(x,)+ V(x,,) = b(x,,). In the stochastic problem, if w transfers the initial point x to S i n finite time and V ( x )is continuous and E  S is bounded, then V ( x J + V(x,,) = b(xrw) with probability one, and EZ V(x,,)+ E:b(x,,). Some of the most common models taken for continuous time control processes do not have a bounded state space. Then, although we may have x, + San d V ( x , )+ V(x,,), both with probability one as t + T” < co,we may not have ExwV ( x , ) + E,” V(x,,,,),as the example following Theorem 2 showed. Thus, in order to compare C ” ( x )and C w ( x )in this case, some further constraints on the effects of the comparison control w are required. Theorem 4 sums up this discussion. Let both the controls u and w transfer the initial condition x to the set S with T, < 03 and T, < co with probability one. Let the nonnegative continuous function V ( x ) take the value b ( x ) in S , and be in the domain of both z:, and A”,”,b for each m > 0. Suppose that Theorem 4.
,
z:,bV(x) =  k ( x , U) A”,”,bV(X)
=  k,(X,
W)
2  k ( x , W).
(2 14a)
(214b)
120
IV
/ OPTIMAL STOCHASTIC CONTROL
Let either k, S 0 or
P:{
sup V ( x J ~ m }  + O as
r,>rbO
ma.
For the process corresponding to each control, there exists a sequence of Markov times satisfying the condition of Theorem 2. For any such pair of sequences, let either ( u = u or w)
EI:V ( X r J
+
EI:V (xz,)
(215)
or let the V ( x , ) be uniformly integrable. Then C"(X)
S
CW(X).
Remark. Theorem 4 is a precise statement of the dynamic programming result of Section 1 for a more general process. In Section 1, = 9" was used. Conditions on k ( x , u ) which imply that x, + S with probability one are discussed as stability results in Chapter 11. Proof. The conditions imply that the appropriate sequences { T ~ ) exist. The rest of the proof follows from Theorem 2, and the discussion preceeding the statement of Theorem 4. Corollary 41 follows immediately from Theorem 4, and applies to the case where C"(x) = Pf: { s u ~ , ~ ,V~( x>, )~>= A} and the target set S is contained strictly interior to Q,; that is, the cost is the probability that x, will leave Q, at least once before absorbtion on S at T". Corollary 41. Let k ( x , u) = 0 with V ( x ) = 0 on S (and V ( x ) = A on dQ,) where S is strictly interior to Qd.Let V ( x ) and the controls u and w satisfy the other conditions of Theorem 4. Then
P:{ sup V ( X J 1 A} 5 P,w( sup V ( X J 1 A}. r,>rtO
r,>tbO
For future reference in Example 3 we write: Corollary 42. Let u and w be controls and let the corresponding processes be It6 processes, for t < ?,,and t < T,, respectively. Let (214)
2.
THEOREMS
121
and the conditions preceding it in Theorem 4 be satisfied. Suppose that A > 0 and . Y V ( X ) log ( A v ( x ) ) 2 0 LYWV( x ) log ( A (x)) 5 0.
+ +v
Then
C " ( x )5 C W ( X ) .
Proof. The proof follows from Theorems 1 to 4. Corollary 4 3 . Let the set E  S + dS be bounded. Let the continuous nonnegative function V ( x ) satisfy the boundary condition b ( x ) on S and let V ( x ) be in the domains of both Ji,band 2 Z . b . Let the controls u and w transfer the initial condition x to S with probability one as t + T,, or T,, respectively. Then, if
J ; , g y x ) =  k ( x , u ) 5 0, ZZ,b v ( x ) 2  ( x , w, > we have
C"(X)5 C W ( X ) .
Proof. V ( x ) is continuous and bounded in the bounded set E  S + 8s.Thus, for both controls u and MI, the corresponding processes V ( x , ) converge to the appropriate boundary condition (V(xrU) or V ( x , _ ) ) . Also, for the associated sequences isi}, we have both ET: V ( x J + ET: V(xr,) and Ef: V(xT,)+ Ef:V(x,.). The rest of the proof follows from Theorem 1 or 2. Note that, if b ( x ) E 0, the arrival times z, and zw may take infinite values.
CONTROL OVER
A FIXED TIME INTERVAL
If the control is to be exercised over a fixed time interval ( t , TI only, the result is essentially a special case of Theorem 2. We now introduce the time variable t explicitly, and list some conventions which provide
IV / OPTIMAL STOCHASTIC CONTROL
122
for the immediate application of Theorem 2. The pair (x,, t ) is considered as a right continuous Markov process, for t T,and the conventions of Chapter I, Section 3, are used to define the weak infinitesimal operator 2 ; acting on the function V ( x , t ) in the set Qm= {x, t : V ( x , t ) < m}. Define the target set S, = S x [0,TI { T } x E. We allow all components of x, to be only right continuous. Define T, = T ni nf { t : x , E S } . The terminal cost is the nonnegative continuous function b(x, t ) defined on the target set S,. Let the initial value x, = x E E  S + dS be given at time t < T . Then, we define the cost corresponding to control u and the interval ( t , TI by
s
+
k ( x , , u s , s) ds
+ C,,b(xTU,T,,),
where k ( x , u, s) 2 0. In Theorem 5 , we use xi,, as the weak infinitesimal operator of the nonhomogeneous process x, (with control u) stopped at T”, and acting on functions which may depend on time. Theorem 5. Let the killing times for the processes corresponding to controls u and w be no less than T with probability one. Let the nonnegative function V ( x , t ) be continuous in all its arguments and suppose that it satisfies the boundary condition V ( x , t ) = b(x, t ) 2 0 on S,. Assume that, in ( E  S ) x [O, TI
Ji,bv(x,t ) =  k ( x , u , f) 5 0
(216a)
for each m and A ” , ” , b v ( x ,t )
=
 k,(x,
W,
t)
>=  k ( X ,
t),
(2 16b)
m+m.
(217)
W,
where either k,(x, w, t ) 2 0 or P I { sup V ( x , , t ) L m } + O
as
Tw>fLO
To each control u and w, there is a corresponding sequence of nondecreasing Markov times with the properties; T~+ 7, with probability one; E;,,q< cx) (where u is either u or w ) ; T~ 5 inf ( t : V(xs, s) 2 i > ;
2. THEOREMS
123
for each w , there is an integer i(o)< co so that T~ = T" for all i > i(w). Finally, suppose that either (218) or (219) hold: E:, t (x,,
7
7,)
+
E:, IV ( X I " 9 T,)
the V ( x , , , T,) are uniformly integrable for both u and w . Then
V ( x , t ) = C"( x , r )
s CW(X,t ) .
(218) (219) (220)
ProoJ Let the control be u. By (216a), the process V ( x l ,t), stopped at T,, is bounded with probability one, for t 5 T,; that is,
PI{ sup
T 2r. 2 sh I
V ( x s , s) 2 m }
V(x,t) rn
,
which goes to zero, as rnco. Thus there is a sequence { T ~ }satisfying the conditions of the hypothesis (for example, T ~ =min (i, T,, inf { t : V(xl) 2 i})). Let x be in E  S and t < T. Dynkin's formula may be applied to T~ and V ( x , t ) and yields 7,
The last term on the right converges to 7"
I
as i+co. By (218), V ( x , t ) = C"(x, t ) . In any case, since x,, = xrYfor some i < 00 with probability one (since T~ = T" for some i < 00 with probability one), we have limi+mV ( x , , , T ~ )= V ( x r u T,,), , with probability one. Now (219) implies E:,t V ( x 7 , ,T~)+ EI,t V(xTU,T,), and the left side of (220) follows.
124
IV
/ OPTIMAL STOCHASTIC CONTROL
If the control w satisfies either kw(x, w , s) >= 0 in ( E  S ) x [0, TI or condition (217), then we also infer that V ( x r i ,zi)+ V(xrW,zw) with probability one. Then (219) implies (218). Combining (218) with Dynkin's formula, we obtain V ( x ) 5 lim E:, ,V (xTi, zi)+ lim i+ m
im
=
C W ( x t, )
and (220) follows.
STRONG DIFFUSION PROCESS
A stronger result is available when x, is a strong diffusion process and E  S is bounded. In Theorem 6 the control depends on x only, and time is not one of the components of x. Theorem 6. Let Q be a bounded open region of class* IT2+,, and let the process x,, corresponding to controls u = u or w, be a strong diffusion process in Q + aQ, with differential generator
a L"= Yip" = C f i ( X , u) i axi
+ 12 C ~
Sij(X,
i.j
a2 axi a x j
u)
where the control u(x), and the functionsfi(x, u ) and Sij(x, u ) satisfy a uniform Holder condition in Q + aQ. Then each u transfers the initial condition X E Q to aQ in time zip"with E ~ r, and V ( x ) = 0 for llxll = r. Then the optimality will be proved. The symmetry of (318) suggests that C"(x) is of the form C"(x) = g( Ilxll) for some monotonically increasing function g(w). Let w = JIxJJ. Then, supposing that the candidate V ( x ) is of the form g( llxll), the relation aw
implies that
(320)
aw
where n is the dimension of the vector x. Equation (320) admits of a solution of the form O' Bi (321) g(w) =A, Axw A, log w 17.
+
+
+
1 w
Substituting (321) into (320), and equating coefficients of like terms
140
IV
/ OPTIMAL STOCHASTIC CONTROL
in w on each side of (320) we obtain the coefficients A iand B,: 1
=P
,
A,
= (n
o2
 1)  2 , 2P
(322)
Since the set (322) contains only n nonzero terms, V ( x ) = s(llxll) = A ,
+ A , llxll + A , log llxll +
1:
where the empty sum defined by Bi/w obtained from the boundary condition g(r)=0 = A ,
= 0.
c llxil’ Bi
n2
7
(323)
1
The value of A , is
+ A , r + A , log r + c1 .ir n2
Bi
Since A i2 0 and Bi 5 0, we have g(w) > 0, for w > r . The optimality of (319) and minimality of (323) will now be verified .by the use of Theorems 1 and 3. By construction, V ( x )> 0 and Y ” V ( x )=  1 in E  S, and V ( x ) = 0 on 13s.We may define V ( x )= 0 on S. V ( x ) is in the domain of J,,,, b , for each m > 0 and A”,, b V(x) = ~ “ V ( x > . 2 ? ” V ( x= )  1 implies that x, + 8s with probability one in a finite average time for x in E  S. Finally, by the use of Theorem 3 and F( V ( x ) )= V ( x ) log ( A + V ( x ) ) and
9 ° F (v(x)) 5 0
for large A , and x in E  S, the uniform integrability condition of Theorem 1 is satisfied. Thus V ( x ) = C”(x). Now, noting that the control (319) absolutely minimizes YW V ( x )+ 1, where w is subject to the constraint w’w S p 2 , Theorems 3 and 4 imply that (319) is optimal with respect to at least the family of uniform Lipschitz controls w such that Y p ” V ( x )log@
in E  S .
+ V ( x ) )5 0
4. A DISCRETE PARAMETER THEOREM
141
4. A Discrete Parameter Theorem
Theorem 9 is the discrete parameter version of Theorem 2. The discrete parameter case has been more extensively studied than the continuous parameter case. See Howard [l], Blackwell [l], Derman [I], [2], and Bellmanand Dreyfus [l] for some representative results. We denote the target set by S and suppose that the transition probability of the homogeneous * Markov process xl, ... depends on the control. The values of the control u, depend only on x,. Nuis the first Markov time of arrival at S , and we suppose that the process is stopped at Nu. The cost is
b(x) h 0,
k,(x, u ) 2 0,
conditioned on the initial condition x1 = x. Write G , n k l ( x n + l u,) = k(x, un). 7
Let k ( x , u) = 0 for x in S. By the algorithm of dynamic programming, the minimum cost, V ( x ) , satisfies the functional equation V(x)=minE,,.[V(x,+,)+
kl(x,+1,un)1.
(41)
U
Let u transfer x1 = x to S with probability one. (With Nu< a3 with probability one if b ( x ) f 0.) Let V ( x ) be a continuous nonnegative function satisfying b ( x ) = V ( x ) for XES and Theorem 9.
E:,"V(x"+J
 V ( x ) =  k ( x , u ) s 0.
(42)
in E  S . Let either (43) or (44) together with either (45) or (46) hold : E:V (x,,~,) E:b ( X N , ) as n 00 (43) V (xnnN,) are uniformly integrable, (44) V (x) uniformly continuous on S , (45) +
*
+
9
The nonhomogeneous case result is the same, except for notation.
142
IV
/ OPTIMAL STOCHASTIC CONTROL
P:{ sup llxi[I Z N } + O N.>it
as
1
N+oo.
(46)
Then C"(X) = V ( x ) .
Let w be a control such that the above conditions are satisfied except that E:,nV(xn+l)
v(x)Z k ( x , w ) *
Then C"(x) = Y (x)
Proof. If b ( x ) = 0, then x,
5 CW(X).
and either (45) or (46) imply Nu c 03, and either (45) or (46) imply V ( x n )+ 0 (all with probability one). Then (44) implies (43) whether or not b ( x ) = 0. Next, we note that the application of (43) to V(xn)+ 0. If b ( x ) f 0, then x,
+S
+ S,
nnN,
V (x)
= E:
1 k(xi 1
9
ui)
+ E:Y
( x n + InN,)
implies that V ( x )= C"(x). Similarly, it can be proved that V ( x )6 C"(x).
v / T H E DESIGN OF C O N T R O L S 1. Introduction
In this chapter we discuss some methods of designing controls which are suggested by the results of Chapters I1 to IV. If a control problem is posed as a wellformulated mathematical optimization problem, as in the introduction to Chapter IV, then it is natural at least to attempt to compute the optimizing control. Owing to the difficulty of the computational problem (even if existence and other theoretical questions were settled), this is not always possible. In addition, the practical control problem is not usually posed as a wellformulated mathematical optimization problem. The goal which the control is designed to accomplish may be phrased somewhat loosely. We may desire a control which will guarantee that a given target set is attained with probability one at some random time which is specified only by a bound on its average value or on the probability of large values. Or we may require that the system satisfy some particular statistical stability property, either in some asymptotic sense, or with bounded paths, or with a condition on the average rate of decrease of some error for large values of the error. Any member of some family of loss functions may be satisfactory. It may only be desired that the control, which accomplishes a given task, not take “large” values with a high probability. For the deterministic problem there is interest in the use of Liapunov function methods to design controls which will satisfy such qualitative requirements; for example, 143
144
V
/ THE DESIGN OF CONTROLS
Kalman and Bertram [ l ] , LaSalle [ l ] ,Geiss [ l ] ,Nahi [ l ] , Rekasius [ l ] , and Johnson [ l ] ; see Kushner [ 5 ] for the stochastic case. In the next section, we give two examples of the use of the stability results to design controls which will assure that the resulting process has some specified stability property. The design of controls to improve the cost is discussed in Section 3 . It is doubtful that the design of a control can be based solely on the Liapunov function method ; however, it should provide some helpful assistance.
2. The Calculation of Controls Which Assure a Given Stability Property
Example 1 . We will compute a control which will assure that the stability properties of (21) satisfy a given specification : d x l = X , dt d x , = ( x 1  x ,
The function V ( x ) = *x;
is in the domain of Lipschitz control:
2; for
+ U ) dt + c d z .
+ X l X , + x:
each m , and
Y V ( x )=x:
(21)
2; =Yufor
(22)
any uniformly
 x : + u ( 2 x , + x l ) + 2.
Suppose that U ( X ) is uniformly Lipschitz. Define S = { x : x ' x 5 R2} where R 2 2 d.Let u = 0. Outside of S, Y oV ( x ) < 0 and tends to  co as llxll tends to infinity. By Chapter 11, Corollary 21, there is a time T~ (possibly taking infinite values) so that x, + 8.9with probability one as t   * T O . Similarly, if the control u ( x ) satisfies sign u ( x ) = sign ( 2 x , +x,), there is a T , such that x,+dS with probability one as t + T " . We will select a control u so that
p:{ SUP r,>t$O
v(Xf)rEjsP,
(23)
2.
145
CALCULATION OF CONTROLS
where p and E are fixed; that is, the control will assure that the probability that x , leaves Q, before touching dS is no greater than p. Let c > 0. Then, for any stochastic Liapunov function V ( x ) , V ( x ) + c gives a worse estimate than V ( x ) . This is easily seen by noting that
A m ( V ( x )+ c ) = AmV(X) (provided only that the probability that x , is killed in [ t , t x, = X E Q ~is ,o(d)) and P x { sup
?,>,to
v(x,)2 E }
= Px{
sup
?,>ttO
I
V(x)
+ A ) , given
v (XI) + c 2 E + c)
+ c >V ( x )
E + C
>= P x {r ,sup >t>o
E
V ( x , )>= E } .
Therefore, to improve the estimate in our problem, we use P ( x ) = V ( x )  v 2 0 in E  S , where v = minxcPSV ( x ) . Define W ( x )= exp $ P ( x )  1, m = exp t,bE, and E = E  v. W ( x ) is in the domain of 2;and A; = 9”. Note that W ( x ) >= 0, where P ( x ) 2 0.
Y ” W ( X )= $ ( W ( x ) + 1)
If u ( x ) is such that 2”’ W ( x )2 0 in E  S , then by Chapter 11, Corollary 21, P : { sup V ( x l )2 E } 7,>120
= P:{
,to
exp $ P ( x )  1 exp $E  1 ~
= B($).
To complete the choice of u ( x ) , fix the initial condition V ( x ) ,choose $so that B ( $ ) = p and, finally, choose u ( x ) so that u ( 2 x 2 x,)cancels the noise contribution ( $ a 2 / 2 ) (2x2 + x , ) ’ ; one choice, which is not necessarily the “smallest,” is u = $ a 2 ( 2 x 2 + x , ) / 2 .
+
146
V
/ THE DESIGN OF CONTROLS
It cannot be claimed that the chosen control is best in the sense of minimizing the left side of (22), or even that it improves the stability in any absolute sense. All that can be claimed is that (23) is satisfied. Nevertheless, if this cannot be mathematically ascertained by some other technique, and some other control, then the method employed does have value for the problem. It should be mentioned that, in a realistic control problem, the specification ((23)) would rarely be given so precisely; a fair amount of freedom may be available in the choice of both S and ‘‘QE,”and other Liapunov functions, giving information on different regions, may also be useful. In fact, it would be desirable to compare several Liapunov functions, bounds, and regions.
Example 2. We consider now a finite time problem for the system (21). Let V ( x ) be defined as in Example 1 and let w(x) = exp + V ( x )  1, W ( x )= exp,@ V ( x ) . Determine a control which assures that
P I { sup V ( X , ) ~ &5;p. } ThfbO
(25)
Recall that (Chapter 111, Example 3), x: + x i 2 p V ( x ) ,where p = 0.552. Let u =  (2x, + xl) a‘+/2 and write (analogous to (24))
Y“‘(x) 5  ~ $ P (+ x@ ) W ( X )[ p + 0’  ~ V ( X ) ] . (26) The second term of (26) has a maximum, denoted by cp, at any point x such that
If the maximum occurs at V ( x )= 0, then cp = t,h@ + a2); otherwise, cp = p expl[+(@ a2)/P  I]. In what follows, we suppose that the latter case holds; we also suppose that E is large enough so that, for all of interest,
+
+
We allow these assumptions merely for ease of presentation.
3.
DESIGN OF CONTROLS TO DECREASE THE COST
147
Under condition (27), Theorem 1 of Chapter 111 gives
P: { sup
TztzO
v ( x J 2 E } = PI {
sup W ( x J 2 exp $8  1 = m}
TztzO
JI can be selected so that the righthand side of (28) equals
p, as
required by (25). Note that, as $+m, the right side of (28) tends to zero for all fixed T < co, provided (27) holds. Of course, as $ increases, so does the magnitude of the control.
3. Design of Controls t o Decrease the Cost Theorem 1.
Suppose that the pair ( V ( x ) ,u ) satisfies the conditions of Theorem 1 (or Theorem 2 if b ( x ) f 0), Chapter IV (so that V ( x ) = C”(x)). Let w be any control which transfers the initial condition x to S with probability one and for which V ( x ) is in the domain of A”:, for each m > 0. If b ( x ) f 0, then let T, c co with probability one. Let
A”,”V(x)5  k ( x , W )
0.
(31)
Then CW(X)5 C”(X).
(32)
If (31) is strict for some x, then so is (32) for some x. Proof. By Theorem 1 (or Theorem 2, if applicable) of Chapter IV, V ( x )= C”(x). Let zi be a sequence of Markov times (for the process with control w) tending to T, with probability one and satisfying E”;si 00, T~ Sinf { t : V(x,) 2 i, with control w}. Then, by Dynkin’s
=
148
V
/ THE DESIGN OF CONTROLS
formula and (3l),
1 Ti
+
V ( x ) 1 E:V ( x T i ) E:
k ( x , , w,)ds .
(33)
0
Now applying Fatou’s lemma and using the fact that V ( x , )+ V(x,,) with probability one, we obtain
+
V ( x ) 2 lim inf E:V ( x T i ) lim inf E:
2 E:b(x,,) whether or not b ( x ) = 0.
+ E:
i
k ( x , , w,) ds
i 0
k (x, , w,)ds
= CW(x),
(34)
0
Q.E.D.
Remark. Note that it is not required that E : V ( x T i )  E Z b ( x T w ) . If this is not the case, then the difference is in favor of the control w. If E  S +as is bounded, then it is again only required that the processes x, converge either to S if b ( x ) is identically zero or to a specific point on S in finite time if b ( x ) is not identically zero. Remark. Suppose that V ( x ) is know to equal the cost C”(x)= E:b(x ru) + E: s‘o. k ( x , , us)ds, and is in the domain of A”,”, for each m > 0, as well. Suppose also that k ( x , u) is continuous at each x and u and that us is also right continuous with probability one for t < T ~ . Then Al,”V(x)=  k ( x , u). (This can be demonstrated by an application of Theorem 5.2 and Lemma 5.6 of Dynkin [2].) Then any control w for which V ( x ) is in the domain of A”,” and A”,” V ( x )5  k ( x , w ) yields a cost that is no larger than C”(x), provided V ( x J+ V(xrw)as t + T,. The theorem provides a procedure which, at least in principle, may be used to compute a sequence of controls, each member of the sequence yielding a smaller cost than the previous member. Suppose that the cost, corresponding to u, can be computed, and suppose that the pair ( C ” ( x ) ,u ) satisfies the conditions on ( V ( x ) , u) of the theorem. Suppose also that a control 1v can be found which satisfies the con
3.
DESIGN OF CONTROLS TO DECREASE THE COST
149
ditions of the theorem. Then C"(x) S Cu(x). If C"(x) can be computed, and ( C w ( x ) ,w)has the properties of ( V ( x ) , u) of the theorem, then the procedure may be repeated, etc. For It8 processes, the computations involve, under suitable conditions, the solution of a partial differential equation. See, for example, Chapter IV. If a solution to the appropriate equation is available, it must still be checked against the conditions of the theorems of Chapter IV. The limiting properties of at least the sequence of solutions to this partial differential equation, for strong diffusion processes in bounded domains, is discussed by Fleming [l]. See also Fleming [2]. We will not pursue these interesting results here.
Remark. The following form for It8 processes will be useful in the example. Let d x = f (x, U ) d t CJ (x) dz
+
where CJ does not depend on the control. Suppose that the triple V ( x ) ,u = 0 and w, satisfies the conditions of Theorem 1, and that, on V ( x ) , = 2" and 2;= Sw. Let
xz
C"(X) = E:
where k(x) 2 0 ,
i 0
[k(x,)
+ l ( x , , us)] ds
l ( x , 24) 2 0 ,
Let U"(X)
I ( x , 0) = 0 .
+ k(x)=0.
By the theorem, C'(x) 2 C w ( x ) .The inequality is strict for some x, if U W V ( X ) S O V ( X )
or, equivalently, if
+ 2(x, w) < 0,
150
V
/ THE DESIGN OF CONTROLS
THELIAPUNOV FUNCTION APPROACH TO DESIGN Suppose that some stochastic Liapunov function V ( x )2 0 is given. Then the control problem may be studied in several ways. The common factor underlying the several approaches is that V ( x ) is assumed to equal the cost associated with some control, and some loss function k ( x , u). The control may be given a priori, but the loss function is determined by V ( x ) and u. First let b(x) = 0 and define the loss to be k ( x , u) = J,:V(x) 2 0 and suppose that the value of A",,,V(x)in any fixed set does not depend on m. Define R" = { x : k ( x ,;)I O } . Suppose that there is some y > 0 so that Q , + dQ, = { x : V ( x )5 y } 3 R".Then, for the loss k ( x , u ) and target set Q , + dQ, = S , the cost corresponding to control u is C"(x)= V ( x ) y. (We suppose, of course, that ( V ( x ) , u), statisfies the conditions of Theorem 1 .) Let w be any control satisfying the conditions of Theorem 1, and suppose also that S 3 R".Then the Theorem says that C w ( x )6 C"(x). Thus, given some suitable V(x), we constructed a control problem and then improved the control. Obviously, the usefulness of the procedure is dependent upon whether the chosen V ( x ) yields a k ( x , u) and an S of suitable forms. In any case, with the given V ( x ) and computed S and any k ( x , u) S 0 in E  S , some type of stochastic stability is guaranteed. Now, choose a set S which has a smooth boundary and which contains R", and define the function V ( x )= b(x) on S. The procedure outlined above may be repeated. Now, C"(x)= V ( x ) and, if there is a w satisfying Theorem 1, then w is at least as useful as u. The method is applied to a control process in Example 3. More examples are given in Kushner [ 5 ] . Example 3. Let the control system be defined by d x , = x2 dt dxZ=(x,
xx,+~)dt+odz.
Define V ( x ) = +x:
+
XlXZ
+xi.
3.
DESIGN OF CONTROLS TO DECREASE THE COST
151
V ( x ) is in the domain of A”,” for each m > 0 and each locally Lipschitz , =Al,”. In particular, control u(x). On ~ ( x ) 9” 9OV(X) =  x’x
+ 6‘.
We will now construct a control problem based on V ( x ) , and compare the control u = 0 to another control. Define Ro = {x:x’x S o’} and let S = Q , aQ,= {x: V ( x )5 y } 2 Ro. By Corollary 21, Chapter 11, there is some z0 S 03 such that x, + dQ, as s + zo, with probability one. In addition, it is not hard to verify (although we omit the proof) * that E : V ( x J + y for any sequence of random times tending to zo . Thus, by Theorem 1, Chapter IV,
+
V(x)
 y = C“X)
= E,O
i
(xix,
0
 o’) ds.
We conclude that V ( x )  y is the cost corresponding to the control problem with target set S , loss x’x  02,and control u = 0. Suppose now that we wish to find a control u = w for which there is a random time z, such that x,+dS as t + z w , and, in addition, for which (xo = x@S)
s
rw
Co(x) > CW(x)= E,”
0
(x~x, 6’
+ wf) ds .
The procedure to be followed is outlined in the last remark following Theorem 1. Let w be a uniformly Lipschitz control. Define the loss k(x,
W)
=k(x)
+ l(w)
k ( x ) = x’x  02, I(w) = w 2 9WV (x) =  x:  x: + o2 + w (XI + 2x2).
* Suppose that the distance between aS and aRU is greater than zero. Then  x’x + d <  E < 0 in E  S, and it is easy to show that EzoV (xr,) + y. Let > 0. Then 9 0 V l + a (x) = (1 + a) V n (x) [ X’X + u2 to2a (2xa + xd2/2] which, for small a, is nonpositive in E  S. An appeal to Theorem 3, Chapter IV, (Y
completes the demonstration for this case.
152
V
/ THE DESIGN OF CONTROLS
If sign w =  sign (xl + 2x2), then 6pwV ( x ) 5 Y oV(x) and, hence, x, + as as t + T, (that is, z, is defined). Suppose that w satisfies, in addition, Y w V (x) 5  k (x, W ) or, equivalently, if
v: (x) (f (x, w)  f (x, 0)) + w 2 = (xl + 2x2) w + w 2 < 0, then the control u = w is better than the control u = 0. The control minimizing the left side of the last expression (and which also satisfies the other requirements) is x1 2x2 w=
+
3
Other forms, such as w1 = w, if IwI 5 1, and w1 = sign w otherwise, are also satisfactory. It does not appear easy to compute explicitly the improvement in the cost obtained by the use of w, Co(x)  Cw(x). Nevertheless, the use of control w allows an improved estimate of P]:(supr, > , ,, V ( x t )>= m}. In fact, this estimate is essentially provided by Example 1.
REFERENCES
Aiserman, M. A., and Gantmacher, F. R. [I] Absolute Stability of Regulator Systems, HoldenDay, San Francisco, 1964. Astrom, K. J. [ I ] “On a First Order Stochastic Differential Equation,” Intern. J. Control 1, No. 4, 301326 (1965). Athans, M., and Falb, P. L. [I] Optimal Control: An Introduction to the Theory and Its Applications, McGrawHill, New York, 1966. Bellman, R. [ I ] Introduction to Matrix Analysis, McGrawHill, New York, 1960. [2] Adaptive Control Processes: A Guided Tour, Princeton Univ. Press, Princeton, New Jersey, 1961. Bellman, R., and Dreyfus, S. E. [I] Applied Dynamic Programming, Princeton Univ. Press, Princeton, New Jersey, 1962. Bertram, J. E., and Sarachik, P. E. [I] “On the Stability of Systems with Random Parameters,” Trans. IREPGCT 5 (1959). BharuchaReid, A. T. [ I ] Elements of the Theory of Markov Processes and Their Applications, McGrawHill, New York, 1960. Blackwell, D. [I] “Discrete Dynamic Programming,” Ann. Math. Statist. 33, 719726 (1962). Bucy, R. S. [I] “Stability and Positive Supermartingales,” J . Differential Equations 1, NO. 2, 151155 (1965). Caughy, T. K., and Gray, A. H., Jr. [I] “On the Almost Sure Stability of Linear Dynamic Systemswith Stochastic Coefficients,” J. Appl. Mech. ( A S M E ) 87, 365372 (1965). Chung, K. L. [I ] Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960. Derman, C. [ I ] “On Sequential Control Processes,” Ann. Math. Statist. 35, 341349 (1964). [2] “Markovian Sequential Control Processes  Denumerable State Space,” 1.Math. And. and Appl. 10, No. 2, 303318 (1965). 153
154
REFERENCES
Doob, J. L. [ I ] Stochastic Processes, Wiley, New York, 1953. [2] “Martingales and One Dimensional Diffusion,” Trans. Amer. Math. SOC.78, 168208 (1955). Dreyfus, S. E. [I] Dynamic Programming and the Calculus of Variations, Academic Press, New York, 1965. Dynkin, E. B. [l] Foundations of the Theory of Markov Processes, Springer, Berlin, 1961 (translation of the 1959 Russian monograph). [2] Markov Processes, Springer, Berlin, 1965 (translation of 1963 publication of State Publishing House, Moscow). [3] “Controlled Random Sequences,” Theory of Probability and Its Applications 10, 114 (1965). Feller, W. [I] Probability Theory and Its Applications. Vol. I , Wiley, New York, 1950. [2] “Diffusion Processes in One Dimension,” Trans. Amer. Mafh. SOC.77, 131 (1954). Fleming, W. H. [11 “Some Markovian Optimization Problems,” J . Math. and Mech. 12, 131140 (1963). [2] “Duality and a Priori Estimates in Markovian Optimization Problems,” J. Math. Anal. and Appl. 16, 254279 (1966). Fleming, W. H., and Nisio, M. [ l ] “On the Existence of Optimal Stochastic Controls,” J. Math. and Mech. 15, 777794 (1966). Fillipov, A. F. [l ] “On Certain Questions in the Theory of Optimal Control” (English translation), SlAM J. Control 1, 7684 (1962). Florentin, J. J. [I] “Optimal Control of Continuoustime Markov Stochastic Systems,” J. Electronics and Control 10, 473488 (1961). Friedman, A. [l ] Partial Differential Equations of Parabolic Type, Prentice Hall, Englewood Cliffs, New Jersey, 1964. Geiss, G. [I] “The Analysis and Design of Nonlinear Control Systems via Liapunov’s Direct Method,” Grumman Aircraft Corp. Rept. RTDTDR634076 (1964). Hahn, W. [ l ] Theory and Application of Liapunov’s Direct Method, Prentice Hall, Englewood Cliffs, New Jersey, 1963.
REFERENCES
155
Howard, R. A. [ l ] Dynamic ProgramrningandMarkov Processes, Technology Press of M.I.T., 1960. Infante, E. F. [I] “Stability Criteria for nth Order, Homogeneous Linear Differential Equations,” in Proc. Intern. Symp. on Differential Equations and Dynamical Systems, December, 1965, Academic Press, New York, 1967. To be published. Infante, E. F., and Weiss, L. [I] “On the Stability of Systems Defined over a Finite Time Interval,” Proc. Natl. Acad. of Sci. 54, 4448 (1965). 121 “Finite Time Stability under Perturbing Forces and on Product Spaces,” in Proc. Intern. Symp. on Differential Equations and Dynamical Systems. December, 1965, Academic Press, New York, 1967. To be published. Ingwerson, D. R. [l] “A Modified Lyapunov Method for Nonlinear Stability Analysis,” Trans. IEEEAC AC6,199210 (1961). ItB, K. [l ] “On Stochastic Differential Equations,” Memoirs, Anter. Math. SOC. No. 4 (1951). [2] Lectures on Stochastic Processes, Tata Institute, Bombay, 1961. Johnson, G. W. [l] “Synthesis of Control Systems with Stability Constraints via the Second Method of Liapunov,” Trans IEEEAC AC9, 380385 (1964). Kalman, R. E. [ l ] “The Theory of Optimal Control and the Calculus of Variations” in Mathematical Optimization Techniques (R. Bellman, ed.), pp. 309332. Univ. of California Press, Berkeley, 1963. Kalman, R. E., and Bertram, J. E. [I] “Control System Analysis and Design via the Second Method of Liapunov,” J. Basic Eng. ( A S M E ) 82, 371393 (1960). Kalman, R. E., and Bucy, R. S. [I] “New Results in Linear Filtering and Prediction Theory,” J. Basic Eng. ( A S M E ) 83D, 95108 (1961). Kats, I. 1. [ l ] “On the Stability of Stochastic Systems in the Large,” J. Appl. Math. and Mech. ( P M M ) 28 (1964). Kats, I. I., and Krasovskii, N. N. [ l ] “On the Stability of Systems with Random Disturbances,” J. Appl. Math. and Mech. ( P M M ) 24 (1960). Khas’minskii, R. Z. [l] “Diffusion Processes and Elliptic Differential Equations Degenerating
156
REFERENCES
at the Boundary of a Domain,’’ Theory of Probability and Its Applications 3, 400419 (1958). [2] “Ergodic Properties of Recurrent Diffusion Processes and Stabilization of the solution to the Cauchy Problem for Parabolic Equations,” Theory of Probability and Its Applications 5, 179196 (1960). [3] “On the Stability of the Trajectories of Markov Processes,” J. Appl. Math. and Mech. ( P M M ) 26, 15541565 (1962). [4] “The Behavior of a Conservative System under the Action of Slieht Friction and Slight Random Noise,” J . Appl. Math. and Mech. ( P M M ) 28, 11261130(1964). Kozin, F. [ I ] “On Almost Sure Stability of Linear Systems with Random Coefficients,” J . Math. andPhys. 43, 5967 (1963). [2] “On Almost Sure Asymptotic Sample Properties of Diffusion Processes Defined by Stochastic Differential Equations,” J . Math. Kyoto Univ. 4, 5 15528 (1965). [3] “On Relations between Moment Properties and Almost Sure Lyapunov Stability for Linear Stochastic Systems,” J . Math. Anal. and Appl. 10, 342353 (1965). Krasovskii, N. N. [I] Stability of Motion, Stanford Univ. Press, Stanford, California, 1963 (translation of the 1959 Russian book). Kushner, H. J . “On the Differential Equations Satisfied by Conditional Probability Densities of Markov Processes, with Applications,” SIAM J . Control 2, 1061 19 (1964). “Some Problems and Some Recent Results in Stochastic Control,” IEEE Intern. Conv. Rec. Pt. 6, 1081 16 (1965). “On the Stability of Stochastic Dynamical Systems,” Proc. Natl. Acad. Sci. 53, 812 (1965). “On the Theory of Stochastic Stability,” Rept. 651, Center for Dynamical Systems, Brown Univ. Providence, Rhode Island (1965); See also Advan. Control Systems 4, and Joint Automatic Control Conf., 1965. “Stochastic Stability and the Design of Feedback Controls,” Rept. 655, Center for Dynamical Systems, Brown Univ. Providence, Rhode Island (May, 1965); Proc. Polytechnic Inst. of Brooklyn Symp. on Systems, 1965. “On the Construction of Stochastic Liapunov Functions,” Trans. IEEEA C AC10, 477478 (1965). “On the Existence of Optimal Stochastic Controls,” SIAM J . Control 3 , 463474 (1966). “Sufficient Conditions for the Optimality of a Stochastic Control,” SIAM J . Control 3,499508 (1966).
REFERENCES
157
[9] “A Note on the Maximum Sample Excursions of Stochastic Approximation Processes,” Ann. Math. Statist. 37, 513516 (1966). [lo] “On the Status of Optimal Control and Stability for Stochastic Systems,” IEEE Intern. Conv. Rec. Pt. 6, 143151 (1966). [I 1] “Dynamical Equations for Optimal Nonlinear Filtering,” J. Differential Equations 3, No. 2 (1967) to appear. LaSalle, J. P. [I] “Stability and Control,” J . SIAM Control 1 , 315 (1962). LaSalle, J. P., and Lefschetz, S . [I] Stability by Liapunov’s Direct Method, with Applications, Academic Press, New York, 1961. Lee, E. B., and Marcus, L. [ l ] “Optimal Control for Nonlinear Processes,” Arch. Rail. Mech. Anal. 8, 3658 (1961). Lefschetz, S . [ I ] Stability of Nonlinear Control Systems, Academic Press, New York, 1964. Loeve, M. [I] Probability Theory, 3rd ed. Van Nostrand, Princeton, New Jersey, 1963. Nahi, N. E. [l] “On the Design of Optimal Systems via the Second Method of Liapunov,” Trans. IEEEAC AC9, 214275 (1964). Rabotnikov, I. L. [I J “On the Impossibility of Stabilizing a System in the MeanSquare by Random Perturbations of Its Parameters,” J. Appl. Math. and Mech. ( P M M ) 2 8 , 11311136 (1964). Rekasius, Z. V. [I] “Suboptimal Design of Intentionally Nonlinear Controllers, Trans. IEEEAC AC9, 380385 (1964). Rice, S. 0. [I] “Mathematical Analysis of Random Noise,” Bell System Tech. J . 23, 282332 (1944); 24, 46156 (1945). Reprinted in Wax, N., Selected Papers on Noise and Stochastic Processes, Dover, New York, 1954. Roxin, E. [l] “The Existence of Optimal Controls,” Michigan Math. J . 9, 109119 ( 1962). Ruina, J. P., and Van Valkenburg, M. E. 11) “Stochastic Analysis of Automatic Tracking Systems,” Proc. Is? IFAC Conf.. Moscow 1, 810815 (1960). Skorokhod, A. V. [I] Studies in the Theory of Random Processes, AddisonWesley, Reading, Massachusetts. 1965 (translated from the 1961 Russian edition).
158
REFERENCES
Wong, E., and Zakai, M. [I I “On the Convergence of Ordinary Integrals to Stochastic Integrals,” Ann. Math. Statist. 36, 15601564(1965). [21 “The Oscillation of Stochastic Integrals,” Z. Wahrscheinlichkeitsrheorie 4, NO. 2, 103112 (1965). Wonham, W. M. [ I ] “Stochastic Problems in Optimal Control,” IEEE Intern. Conv. Rec. Pt. 4, (1963). [2] “A Liapunov Criteria for Weak Stochastic Stability,” J. Di’erential Equations. 2, 195207 (1966). [3] “A Lyapunov Method for the Estimation of Statistical Averages,” J. Differential Equations. 2, 365377 (1966). Yoshizawa, T. [ I ] “Stability and Boundedness of Systems,” Arch. Rat/. Mech. and Anal. 6, 4 0 9 4 2 1 (1 960). Zakai, M. [ I ] “On the First Order Probability Distribution of the van der Pol Oscillator Output,” J. EIPcfronics and Control 14, 381388 (1963). [2] “The Effect of Background Noise on the Operation of Oscillators,” Intern. J. Electronics to appear. Zubov, V. I. [ I ] “Methods of A. M. Liapunov and Their Application,” M a t . Leningrad Univ. (1957) (English transl.: Noordhoff, Groningen, 1964).
AUTHOR INDEX Numbers in italics indicate the page on which the complete reference is listed. Aiserman, M. A., 28,153 Astrom, K. J., 57, 153 Athans, M.,104, 138, 153 Bellman, R., 70,103, 141, 153 Bertram, J. E., 35, 144, 153, 155 BharuchaReid, A. T., 3, 153 Blackwell, D., 141, 153 Bucy, R. S., 36, 134, 153, 155 Caughy, T.K., 36, I53 Chung, K. L., 3,153 Derman, C., 105, 141, I53 Doob, J. L., 3, 14, 18, 19, 25, 115, 154 Dreyfus, S. E., 104, 141, 153, 154 Dynkin, E. B., 3, 5, 6, 7, 8, 9, 10, 11, 14, 16, 17, 18, 23, 37, 41, 52, 90, 113, 148,154 Falb, P. L., 104, 138, 153 Feller, W., 3, 60, 154 Fillipov, A. F., 106, 154 Fleming, W. H., 105, 106, 126, 149, 154 Florentin, J. J., 130, 154 Friedman, A., 90, 154 Gantmacher, F. R., 28, I53 Geiss, G., 72, 144, 154 Gray, A. H., Jr., 36, 153 Hahn, W., 28, 52, 154 Howard, R. A., 141, I55 Infante, R. A., 12, 79, I55 Ingwerson, D. R.,155 118, K., 5, 13, 16, 18, 19, 155
Johnson, G. W., 144,155 Kalman, R. E., 104, 134, 144,155 Kats, I. I., 35, 36, 52, 155 Khas’minskii, R. Z.,36, 45, 60, 115, 155, 156 Kozin, F., 36, 156 Krasovskii, N. N., 28, 35, 52, 155, I56 Kushner, H. J., 36, 52, 79, 106, 134, 144,150,156;157 LaSalle, J. P., 28, 63, 66, 94, 157 Lee, E. B., 106, 157 Lefschetz, S., 28, 63, 66, 94, 157 Loeve, M., 3, 5, 7, 25, 28, 157 Marcus, L., 106, I57 Nahi, N. E., 144,157 Nisio, M., 106, 154 Rabotnikov, I. L., 36, I57 Rekasius, Z.V., 72, 144.157 Rice, S. O.,77,157 Roxin, E., 106, 157 Ruina, J. P., 77,157 Sarachik, P. E., 35, 153 Skorohod, A. V., 14, 16, 19, 157 Van Valkenburg, M. E., 77,I57 Weiss, L.,79, 155 Wong, E., 12, 58,158 Wonham, W. M.,36, 158 Yoshizawa, T., 33, 55, 158 Zakai, M.,12, 58, 69, 158 Zubov. V. I., 72, I58 159
SUBJECT INDEX
Asymptotic sets, 39, 42, 54 Converse theorem, 52 Design of controls, 143ff. to decrease cost, 147 with given stability properties, 144 Liapunov function approach to, 156 Differential generator, 15 Dynamic programming, 103 Dynkin’s formula, 10 &Neighborhood,39 Equidistance bounded w.p.l., 32 Equistability w.p.l., 33, 55 Feller process, 8 Finite time stability, 77ff. First exit times, 77ff. discrete parameter, 86 strong diffusion process, 89 HamiltonJacobi equation, 108 Hill climbing estimates, 91 Instability w.p.l., 32 It6 stochastic differential equation, 12 differential generator, 15 stopped process, 17 strong Markov property, 15 It6’s lemma, 16 Killing time, 3 Lagrange stability, 53 Liapunov function, 23 stochastic, 34, 55ff.
construction, 72 degenerate at origin, 49 It6 process, 43, 55ff. moment estimates, 50 nonhomogeneous process, 41 Poisson differential equation, 46, 69 strong diffusion process, 44 Markov process continuous parameter, 3 discrete parameter, 1 , 7 strong, 4 Markov time, 5 Martingale, 25 super, 25, 31 probability inequality, 26 Moment estimates, 50 Poisson differential equation, 18, 20, 46, 69 differential generator, 20, 21 Stability definitions, 30 of origin, w.p.l., 30, 37, 38 asymptotic w.p.l., 31, 39, 41 exponential, 32, 47 with respect to (Q, P,p), 31 Stochastic continuity, 4 uniform, 4 160
SUBJECT INDEX
Stochastic control, 102ff. attaining target set, 1IOff. discrete time, 141 fixed time, 121 linear system, 130 with filtering, 133 nonanticipative controls, 126 norm invariant system, 138 practical optimality, 128 strong diffusion process, 124
161
sufficient condition for optimality, 11Off. Stochastic differential equation, see It6 or Poisson differential equation Stopped process, 1 1 Strong diffusion process, 22, 44, 89, 124 Ultimate boundedness w.p.l., 32 Uniform integrability, sufficient condition for, 115 Weak infinitesimal operator, 9
This page intentionally left blank