Lemma 7.1 Let $A$ and $B$ be disjoint, nonempty, closed subsets of a metrizable space $X$. Then there exists a continuous function $f: X \to [0,1]$ such that $f = 0$ on $A$ and $f = 1$ on $B$. If $d$ is a metric on $X$ consistent with its topology and $\inf_{a \in A,\, b \in B} d(a, b) > 0$, then $f$ can be chosen to be uniformly continuous with respect to the metric $d$.

Proof Let $d$ be a metric on $X$ consistent with its topology and define
$$f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)},$$
where the distance from a point to a nonempty closed set is defined by (8). This distance is zero if and only if the point is in the set, and the mapping of (8) is Lipschitz continuous by (6) of Appendix C. This $f$ has the required properties. If $\inf_{a \in A,\, b \in B} d(a, b) > 0$, then $d(x, A) + d(x, B)$ is bounded away from zero, and the uniform continuity of $f$ follows. Q.E.D.

Lemma 7.2 Let $X$ be a metrizable space. Every closed subset of $X$ is a $G_\delta$, and every open subset is an $F_\sigma$.

Proof We prove the first statement; the second follows by complementation. Let $F$ be closed. We may assume without loss of generality that $F$ is nonempty. Let $d$ be a metric on $X$ consistent with its topology. The continuity of the function $x \mapsto d(x, F)$ implies that
$$G_n = \{x \in X \mid d(x, F) < 1/n\}$$
is open. But $F = \bigcap_{n=1}^{\infty} G_n$. Q.E.D.
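For finite point sets in Euclidean space (which are closed), the separating function of Lemma 7.1 can be computed directly; a minimal sketch, with hypothetical sets $A$ and $B$ chosen only for illustration:

```python
import math

def dist_to_set(x, S):
    """Euclidean distance from point x to a finite set S of points."""
    return min(math.dist(x, s) for s in S)

def urysohn(x, A, B):
    """f(x) = d(x,A) / (d(x,A) + d(x,B)): equals 0 on A, 1 on B,
    and lies strictly between elsewhere (A, B disjoint)."""
    da, db = dist_to_set(x, A), dist_to_set(x, B)
    return da / (da + db)

A = [(0.0, 0.0), (1.0, 0.0)]   # illustrative closed set (finite sets are closed)
B = [(5.0, 5.0)]
print(urysohn((0.0, 0.0), A, B))              # 0.0 on A
print(urysohn((5.0, 5.0), A, B))              # 1.0 on B
print(0.0 < urysohn((2.0, 2.0), A, B) < 1.0)  # True elsewhere
```

Since $A$ and $B$ here are a positive distance apart, this $f$ is also uniformly continuous, in line with the second part of the lemma.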
Definition 7.2 Let $X$ be a metrizable topological space. The space $X$ is separable if it contains a countable dense set.

It is easily verified that any subspace of a separable metrizable space is separable and metrizable. A collection of subsets of a topological space $(X, \mathcal{T})$ is a base for the topology if every open set can be written as a union of sets from the collection. It is a subbase if a base can be obtained by taking finite intersections of sets from the collection. If $\mathcal{T}$ has a countable base, $(X, \mathcal{T})$ is said to be second countable. A topological space is Lindelöf if every collection of open sets which covers the space contains a countable subcollection which also covers the space. It is a standard result that in metrizable spaces, separability, second countability, and the Lindelöf property are equivalent. The following proposition is a direct consequence of this fact.

Proposition 7.1 Let $(X, \mathcal{T})$ be a separable metrizable topological space and $\mathcal{B}$ a base for the topology $\mathcal{T}$. Then $\mathcal{B}$ contains a countable subcollection $\mathcal{B}_0$ which is also a base for $\mathcal{T}$.

Proof Let $\mathcal{C}$ be a countable base for the topology $\mathcal{T}$. Every set $C \in \mathcal{C}$ has the form $C = \bigcup_{\alpha \in I(C)} B_\alpha$, where $I(C)$ is an index set and $B_\alpha \in \mathcal{B}$ for every $\alpha \in I(C)$. Since $C$ is Lindelöf, we may assume $I(C)$ is countable. Let $\mathcal{B}_0 = \bigcup_{C \in \mathcal{C}} \{B_\alpha \mid \alpha \in I(C)\}$. Q.E.D.
The Hilbert cube $\mathcal{H}$ is the product of countably many copies of the unit interval $[0,1]$ (with the product topology). The unit interval is separable and metrizable, and, as we will show later (Proposition 7.4), these properties carry over to the Hilbert cube. In a sense, $\mathcal{H}$ is the canonical separable metrizable space, as the following proposition shows.

Proposition 7.2 (Urysohn's theorem) Every separable metrizable space is homeomorphic to a subset of the Hilbert cube $\mathcal{H}$.
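The embedding behind Urysohn's theorem can be sampled numerically. The sketch below uses one standard construction, $h(x) = (\min\{d(x, x_1), 1\}, \min\{d(x, x_2), 1\}, \ldots)$ truncated to finitely many coordinates, for $X \subset R$; the dense sequence and truncation level are illustrative assumptions, and the exact formula in the text's proof may differ:

```python
def embed(x, dense_seq, K=50):
    """Truncated Hilbert-cube embedding of x: the k-th coordinate is
    min(d(x, x_k), 1), where {x_k} enumerates a countable dense set."""
    return tuple(min(abs(x - xk), 1.0) for xk in dense_seq[:K])

# illustrative dense sequence in [0, 2]: dyadic rationals with denominator 32
dense = [i / 32 for i in range(65)]
hx, hy = embed(0.2, dense), embed(0.7, dense)
print(all(0.0 <= c <= 1.0 for c in hx))  # coordinates lie in the unit cube: True
print(hx != hy)                          # distinct points get distinct images: True
```

Injectivity holds because the dense sequence comes arbitrarily close to any given point, so some coordinate distinguishes two distinct points.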
Proof Let $(X, d)$ be a separable metric space with a countable dense set $\{x_k\}$. Define functions $f_k(x) = \min\{d(x, x_k), 1\}$, $k = 1, 2, \ldots$, and let $h: X \to \mathcal{H}$ be given by $h(x) = (f_1(x), f_2(x), \ldots)$. Each $f_k$ is continuous, and a routine argument shows that $h$ is a homeomorphism of $X$ onto $h(X) \subset \mathcal{H}$. Q.E.D.

A metric space $(X, d)$ is totally bounded if for every $\varepsilon > 0$ there exists a finite set $F_\varepsilon \subset X$ such that
$$X = \bigcup_{x \in F_\varepsilon} \{y \in X \mid d(x, y) < \varepsilon\}.$$
A totally bounded metric space is necessarily separable, since $\bigcup_{n=1}^{\infty} F_{1/n}$ is a countable dense subset. Total boundedness depends on the metric, however, and a space which is totally bounded (and separable) with one metric may not be totally bounded with another. Like separability, total boundedness is preserved under passage to subspaces, i.e., if $(X, d)$ is totally bounded and $Y \subset X$, then $(Y, d)$ is totally bounded. To see this, take $\varepsilon > 0$ and let $F_{\varepsilon/2}$ be a finite subset of $X$ such that
$$X = \bigcup_{x \in F_{\varepsilon/2}} \{y \in X \mid d(x, y) < \varepsilon/2\}.$$
Choose a point, if possible, in each of the sets $Y \cap \{y \in X \mid d(x, y) < \varepsilon/2\}$, $x \in F_{\varepsilon/2}$, and call the collection of these points $G_\varepsilon$. Then
$$Y = \bigcup_{y \in G_\varepsilon} \{z \in Y \mid d(y, z) < \varepsilon\}.$$
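The subspace argument above can be checked numerically on a dense sample of $X = [0,1]$ with the usual metric; a minimal sketch (sample resolution and tolerances are illustrative choices):

```python
def eps_net(points, eps):
    """Greedily extract a finite eps-net F from a list of points:
    after the pass, every point lies within eps of some member of F."""
    net = []
    for p in points:
        if all(abs(p - q) >= eps for q in net):
            net.append(p)
    return net

X = [i / 1000 for i in range(1001)]          # dense sample of [0, 1]
F = eps_net(X, 0.1)
print(all(min(abs(p - q) for q in F) < 0.1 for p in X))   # F is a 0.1-net: True

# Subspace: from an (eps/2)-net of X, pick a nearby point of Y (when one
# exists) to obtain an eps-net of Y, mirroring the G_eps construction.
Y = [p for p in X if 0.25 <= p <= 0.5]
F2 = eps_net(X, 0.05)
G = [min(Y, key=lambda y, x=x: abs(y - x)) for x in F2
     if min(abs(y - x) for y in Y) < 0.05]
print(all(min(abs(y - g) for g in G) < 0.1 for y in Y))   # G is a 0.1-net of Y: True
```

The triangle inequality step in the text is exactly what makes the second check pass: a point of $Y$ within $\varepsilon/2$ of a net point $x$ is within $\varepsilon$ of the chosen representative near $x$.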
We use this fact to prove the following classical result relating completeness, compactness, and total boundedness. Proposition 7.6 A metric space is compact if and only if it is complete and totally bounded.
Proof If $(X, d)$ is a compact metric space, then every Cauchy sequence has an accumulation point. The Cauchy property implies that the sequence converges to this point, and completeness follows. Also, for $\varepsilon > 0$, the collection of sets $\{y \in X \mid d(x, y) < \varepsilon\}$, $x \in X$, contains a finite cover of $X$. Hence $(X, d)$ is totally bounded.

If $(X, d)$ is complete and totally bounded and $S = \{s_j\}$ is a sequence in $(X, d)$, then an infinite subsequence $S_1 \subset S$ must lie in some set $B_1 = \{y \in X \mid d(x_1, y) < 1\}$. Since $B_1$ is totally bounded, an infinite subsequence $S_2 \subset S_1$ must lie in some set $B_2 = \{y \in B_1 \mid d(x_2, y) < \tfrac{1}{2}\}$. Continuing in this manner, we have for each $n$ an infinite subsequence $S_{n+1} \subset S_n$ lying in $B_{n+1} = \{y \in B_n \mid d(x_{n+1}, y) < 1/(n+1)\}$. Choosing one point from each $S_n$ yields a Cauchy subsequence of $S$, which converges by completeness, so $(X, d)$ is compact. Q.E.D.

Lemma 7.3 Let $Y$ be a metrizable space with metric $d$, let $X \subset Y$, and let $g$ be a bounded, $d$-uniformly continuous, real-valued function on $X$. Then $g$ has a continuous extension $\bar{g}$ to $Y$ satisfying $\sup_{x \in X} |g(x)| = \sup_{y \in Y} |\bar{g}(y)|$, and this extension is uniquely determined on the closure $\bar{X}$ of $X$.

Proof For $\varepsilon > 0$, choose $\delta(\varepsilon) > 0$ such that $|g(x) - g(x')| \le \varepsilon$ whenever $x, x' \in X$ and $d(x, x') \le \delta(\varepsilon)$. Given $y \in \bar{X}$, let $\{x_n\}$ be a sequence in $X$ with $x_n \to y$. Given $\varepsilon > 0$, there exists $N(\varepsilon)$ such that $d(x_n, x_m) \le \delta(\varepsilon)$ for all $n, m \ge N(\varepsilon)$, so $\{g(x_n)\}$ is Cauchy in $R$. Define $\bar{g}(y) = \lim_{n \to \infty} g(x_n)$. Note that $n \ge N(\varepsilon)$ implies $|g(x_n) - \bar{g}(y)| \le \varepsilon$. Suppose now that $x \in X$ and $d(x, y) \le \delta(\varepsilon)/2$. Choose $n \ge N(\varepsilon)$ so that $d(x_n, y) \le \delta(\varepsilon)/2$. Then $d(x, x_n) \le \delta(\varepsilon)$ and
$$|g(x) - \bar{g}(y)| \le |g(x) - g(x_n)| + |g(x_n) - \bar{g}(y)| \le 2\varepsilon. \tag{16}$$
This shows that for any sequence $\{x_n'\} \subset X$ with $x_n' \to y$, we have $\bar{g}(y) = \lim_{n \to \infty} g(x_n')$, so the definition of $\bar{g}(y)$ is independent of the particular sequence $\{x_n\}$ chosen. If $y \in X$, we can take $x_n = y$, $n = 1, 2, \ldots$, and obtain $\bar{g}(y) = g(y)$, so $\bar{g}$ is an extension of $g$. If $\{y_m\}$ is a sequence in $\bar{X}$ which converges to $y \in \bar{X}$, then there exist sequences $\{x_{mn}\}$ in $X$ with $y_m = \lim_{n \to \infty} x_{mn}$. Choose $n_1 < n_2 < \cdots$ so that $\lim_{m \to \infty} x_{m n_m} = y$ and $d(x_{m n_m}, y_m) \le \delta(1/m)/2$. Then
$$\bar{g}(y) = \lim_{m \to \infty} g(x_{m n_m}), \tag{17}$$
and, by (16),
$$|g(x_{m n_m}) - \bar{g}(y_m)| \le 2/m. \tag{18}$$
Letting $m \to \infty$ in (18) and using (17), we conclude that $\bar{g}(y) = \lim_{m \to \infty} \bar{g}(y_m)$ and $\bar{g}$ is continuous on $\bar{X}$. It is clear that
$$\sup_{x \in X} |g(x)| = \sup_{y \in \bar{X}} |\bar{g}(y)|.$$
If $\bar{X} = Y$, $\bar{g}$ is clearly unique and we are done. If $\bar{X}$ is a proper subset of $Y$, use the Tietze extension theorem (see, e.g., Ash [A1] or Dugundji [D7]) to extend $\bar{g}$ to all of $Y$ so that $\|\bar{g}\| = \sup_{y \in \bar{X}} |\bar{g}(y)|$. Q.E.D.
Proposition 7.9 If $(X, d)$ is a totally bounded metric space, then $U_d(X)$ is separable.

Proof Corollary 7.8.1 tells us that $(X, d)$ can be isometrically embedded as a dense subset of a compact metric space $(\bar{X}_d, \bar{d})$. We regard $X$ as a subspace of $\bar{X}_d$. Given any $g \in U_d(X)$, by Lemma 7.3, $g$ has a unique extension $\bar{g} \in C(\bar{X}_d)$ such that $\|\bar{g}\| = \|g\|$. The mapping $g \mapsto \bar{g}$ is linear and norm-preserving, thus an isometry from $U_d(X)$ into $C(\bar{X}_d)$. The latter space is separable by Proposition 7.7, and the separability of $U_d(X)$ follows. Q.E.D.
7.3 Borel Spaces
The constructions necessary for the subsequent theory of dynamic programming are impossible when the state space and control space are arbitrary sets, or even arbitrary measurable spaces. For this reason, we introduce the concept of a Borel space, and in this and subsequent sections we develop the properties of Borel spaces which permit these constructions.

Definition 7.6 If $X$ is a topological space, the smallest $\sigma$-algebra of subsets of $X$ which contains all open subsets of $X$ is called the Borel $\sigma$-algebra and is denoted by $\mathcal{B}_X$. The members of $\mathcal{B}_X$ are called the Borel subsets of $X$.

If $X$ is separable and metrizable and $\mathcal{F}$ is a $\sigma$-algebra on $X$ containing a subbase $\mathcal{Y}$ for its topology, then $\mathcal{F}$ contains $\mathcal{B}_X$. This is because, from Proposition 7.1, any open set in $X$ can be written as a countable union of finite intersections of sets in $\mathcal{Y}$. Thus we have $\mathcal{B}_X = \sigma(\mathcal{Y})$ for any subbase $\mathcal{Y}$.

We will often refer to the smallest $\sigma$-algebra containing a class of subsets as the $\sigma$-algebra generated by the class. Thus, $\mathcal{B}_X$ is the $\sigma$-algebra generated by the class of open subsets of $X$. Note that $\mathcal{B}_R$ is the class of Borel subsets of the real numbers in the usual sense, i.e., the $\sigma$-algebra generated by the intervals. Given a class of real-valued functions on a topological space $X$, it is common to speak of the weakest topology with respect to which all functions in the class are continuous. In a similar vein, one can speak of the smallest $\sigma$-algebra with respect to which all functions in the class are measurable. If $X$ is a metrizable space, it is easy to show that its topology is the weakest with respect to which all functions in $C(X)$ are continuous. The following proposition is the analogous result for $\mathcal{B}_X$. In the proof and in subsequent proofs, we will use the fact that for any two sets $\Omega$, $\Omega'$, any collection $\mathcal{E}$ of subsets of $\Omega'$, and any function $f: \Omega \to \Omega'$, we have
$$\sigma[f^{-1}(\mathcal{E})] = f^{-1}[\sigma(\mathcal{E})].$$

Proposition 7.10 Let $X$ be a metrizable space. Then $\mathcal{B}_X$ is the smallest $\sigma$-algebra with respect to which every function in $C(X)$ is measurable, i.e., $\mathcal{B}_X = \sigma\bigl[\bigcup_{f \in C(X)} f^{-1}(\mathcal{B}_R)\bigr]$.
Proof Denote $\mathcal{F} = \sigma\bigl[\bigcup_{f \in C(X)} f^{-1}(\mathcal{B}_R)\bigr]$ and let $\mathcal{T}_R$ be the topology of $R$. We have
$$\mathcal{F} = \sigma\Bigl[\bigcup_{f \in C(X)} f^{-1}[\sigma(\mathcal{T}_R)]\Bigr] = \sigma\Bigl[\bigcup_{f \in C(X)} \sigma[f^{-1}(\mathcal{T}_R)]\Bigr] \subset \sigma\Bigl[\bigcup_{f \in C(X)} \mathcal{B}_X\Bigr] = \mathcal{B}_X.$$
To prove the reverse containment $\mathcal{B}_X \subset \mathcal{F}$, we need only establish that $\mathcal{F}$ contains every nonempty open set. By Lemma 7.2, it suffices to show that $\mathcal{F}$ contains every nonempty closed set. Let $A$ be such a set. We may assume without loss of generality that $A \neq X$, so there exists $x \in X - A$. Let $B = \{x\}$, and let $f$ be given by Lemma 7.1. Then $A = f^{-1}(\{0\}) \in \mathcal{F}$. Q.E.D.

We use Lemma 7.2 to prove another useful characterization of the Borel $\sigma$-algebra in a metrizable space.

Proposition 7.11 Let $X$ be a metrizable space. Then $\mathcal{B}_X$ is the smallest class of sets which is closed under countable unions and intersections and contains every closed (open) set.
Proof Let $\mathcal{D}$ be the smallest class of sets which contains every closed set and is closed under countable unions and intersections, i.e., $\mathcal{D}$ is the intersection of all such classes. Then $\mathcal{D} \subset \mathcal{B}_X$, and it suffices to prove that $\mathcal{D}$ is closed under complementation. Let $\mathcal{D}'$ be the class of complements of sets in $\mathcal{D}$. Then $\mathcal{D}'$ is also closed under countable unions and intersections. Lemma 7.2 implies that $\mathcal{D}$ contains every open set, so $\mathcal{D}'$ contains every closed set, and consequently $\mathcal{D} \subset \mathcal{D}'$. Given $D \in \mathcal{D}$, we have $D \in \mathcal{D}'$, so $D^c \in \mathcal{D}$. Q.E.D.
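On a finite ground set, the $\sigma$-algebra generated by a class can be computed by brute-force closure under complements and unions; a small sketch (the ground set and generators are illustrative, and on a finite set closure under finite unions already gives countable closure):

```python
def generated_sigma_algebra(ground, gens):
    """Smallest family containing gens that is closed under complement
    and union.  Sets are represented as frozensets."""
    ground = frozenset(ground)
    fam = {frozenset(), ground} | {frozenset(g) for g in gens}
    while True:
        new = {ground - A for A in fam}            # complements
        new |= {A | B for A in fam for B in fam}   # pairwise unions
        if new <= fam:
            return fam
        fam |= new

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
# The atoms are {1}, {2}, {3,4}, so the sigma-algebra has 2**3 = 8 members.
print(len(sigma))                    # 8
print(frozenset({2}) in sigma)       # True: {1,2} minus {1}
print(frozenset({3, 4}) in sigma)    # True: complement of {1,2}
```

Note that $\{3\}$ alone is not obtainable: the generators never separate $3$ from $4$, which is exactly the atom structure of the generated $\sigma$-algebra.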
Definition 7.7 Let $X$ be a topological space. If there exists a complete separable metric space $Y$ and a Borel subset $B \in \mathcal{B}_Y$ such that $X$ is homeomorphic to $B$, then $X$ is said to be a Borel space. The empty set will also be regarded as a Borel space.

Note that every Borel space is metrizable and separable. Also, every complete separable metrizable space is a Borel space. Examples of Borel spaces are $R$, $R^n$, and $R^*$ with the weakest topology containing the intervals $[-\infty, \alpha)$, $(\beta, \infty]$, $(\alpha, \beta)$, $\alpha, \beta \in R$. (This is also the topology that makes the function $\varphi$ defined by
$$\varphi(x) = \begin{cases} 1 & \text{if } x = \infty, \\ \operatorname{sgn}(x)\,(1 - e^{-|x|}) & \text{if } x \in R, \\ -1 & \text{if } x = -\infty, \end{cases}$$
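The map $\varphi$ and its inverse can be implemented directly, with the extended reals represented by Python's infinities; a minimal sketch:

```python
import math

def phi(x):
    """phi: R* -> [-1, 1], as in the text; x may be float('+/-inf')."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return -1.0
    return math.copysign(1 - math.exp(-abs(x)), x)

def phi_inv(y):
    """Inverse of phi, mapping +/-1 back to +/-infinity."""
    if y == 1.0:
        return math.inf
    if y == -1.0:
        return -math.inf
    return math.copysign(-math.log(1 - abs(y)), y)

print(phi(0.0))                        # 0.0
print(phi(math.inf), phi(-math.inf))   # 1.0 -1.0
print(abs(phi_inv(phi(2.5)) - 2.5) < 1e-12)  # True: round trip
```

Both $\varphi$ and $\varphi^{-1}$ are monotone and continuous, which is what makes $\varphi$ a homeomorphism onto $[-1,1]$.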
a homeomorphism from $R^*$ onto $[-1, 1]$.) Any countable set $X$ with the discrete topology (i.e., the topology consisting of all subsets of $X$) is also a Borel space. We will show that every Borel subset of a Borel space is itself a Borel space. For this we shall need the following two lemmas. The proof of the first is elementary and is left to the reader.

Lemma 7.4 If $Y$ is a topological space and $E \subset Y$, then the $\sigma$-algebra $\mathcal{B}_E$ generated by the relative topology coincides with the relative $\sigma$-algebra, i.e., the collection $\{E \cap C \mid C \in \mathcal{B}_Y\}$. In particular, if $E \in \mathcal{B}_Y$, then $\mathcal{B}_E$ consists of the Borel subsets of $Y$ contained in $E$.

Lemma 7.5 If $X$ and $Y$ are topological spaces and $\varphi$ is a homeomorphism of $X$ into $Y$, then $\varphi(\mathcal{B}_X) = \mathcal{B}_{\varphi(X)}$.

Corollary 7.21.1 Let $X$ be a separable metrizable space, and for $x \in X$ let $p_x \in P(X)$ be the probability measure concentrated at $x$. Then the mapping $\delta: X \to P(X)$ defined by $\delta(x) = p_x$ is a homeomorphism of $X$ into $P(X)$.

Proof Suppose $x_n \to x$ and $G$ is an open set. If $x \in G$, then $x_n \in G$ for large $n$ and $\liminf_{n \to \infty} p_{x_n}(G) = 1 = p_x(G)$; if $x \notin G$, then $\liminf_{n \to \infty} p_{x_n}(G) \ge 0 = p_x(G)$. Proposition 7.21 implies $p_{x_n} \to p_x$, so $\delta$ is continuous. On the other hand, if $p_{x_n} \to p_x$ and $G$ is an open neighborhood of $x$, then since $\liminf_{n \to \infty} p_{x_n}(G) \ge p_x(G) = 1$, we must have $x_n \in G$ for sufficiently large $n$, i.e., $x_n \to x$. This shows that $\delta$ is a homeomorphism. Q.E.D.
From Corollary 7.21.1 we see that $p_n$ can converge to $p$ in such a way that strict inequality holds in (d) and (e) of Proposition 7.21. For example, let $G \subset X$ be open, let $x$ be on the boundary of $G$, and let $x_n$ converge to $x$ through $G$. Then $p_{x_n}(G) = 1$ for every $n$, but $p_x(G) = 0$. We now show that compactness of $X$ is inherited by $P(X)$.

Proposition 7.22 If $X$ is a compact metrizable space, then $P(X)$ is a compact metrizable space.
Proof If $X$ is a compact metrizable space, it is separable (Corollary 7.6.2) and $C(X)$ is separable (Proposition 7.7). Let $\{f_k\}$ be a countable set in $C(X)$ such that $f_1 \equiv 1$, $\|f_k\| \le 1$ for every $k$, and $\{f_k\}$ is dense in the unit sphere $\{f \in C(X) \mid \|f\| \le 1\}$. Let $[-1, 1]^\infty$ be the product of countably many copies of $[-1, 1]$ and define $\varphi: P(X) \to [-1, 1]^\infty$ by
$$\varphi(p) = \left( \int f_1\,dp,\ \int f_2\,dp,\ \ldots \right).$$
A trivial modification of the proof of Proposition 7.20 shows $\varphi$ is a homeomorphism. We will show that $\varphi[P(X)]$ is closed in the compact space $[-1, 1]^\infty$, and the compactness of $P(X)$ will follow.

Suppose $\{p_n\}$ is a sequence in $P(X)$ and $\varphi(p_n) \to (\alpha_1, \alpha_2, \ldots) \in [-1, 1]^\infty$. Given $\varepsilon > 0$ and $f \in C(X)$ with $\|f\| \le 1$, there is a function $f_k$ with $\|f - f_k\| < \varepsilon/3$. There is a positive integer $N$ such that $n, m \ge N$ implies $|\int f_k\,dp_n - \int f_k\,dp_m| < \varepsilon/3$. Then
$$\left| \int f\,dp_n - \int f\,dp_m \right| \le \left| \int f\,dp_n - \int f_k\,dp_n \right| + \left| \int f_k\,dp_n - \int f_k\,dp_m \right| + \left| \int f_k\,dp_m - \int f\,dp_m \right| < \varepsilon,$$
so $\{\int f\,dp_n\}$ is Cauchy in $[-1, 1]$. Denote its limit by $E(f)$. If $\|f\| > 1$, define
$$E(f) = \|f\|\, E(f / \|f\|).$$
It is easily verified that $E$ is a linear functional on $C(X)$, that $E(f) \ge 0$ whenever $f \ge 0$, $|E(f)| \le \|f\|$ for every $f \in C(X)$, and $E(f_1) = 1$. Suppose $\{h_n\}$ is a sequence in $C(X)$ and $h_n(x) \downarrow 0$ for every $x \in X$. Then for each $\varepsilon > 0$, the set $K_n(\varepsilon) = \{x \mid h_n(x) \ge \varepsilon\}$ is compact, and $\bigcap_{n=1}^\infty K_n(\varepsilon) = \varnothing$. Therefore, for $n$ sufficiently large, $K_n(\varepsilon) = \varnothing$, which implies $\|h_n\| \downarrow 0$. Consequently, $E(h_n) \downarrow 0$. This shows that the functional $E$ is a Daniell integral, and by a classical theorem (see, e.g., Royden [R5, p. 299, Proposition 21]) there exists a unique probability measure $p$ on $\sigma\bigl[\bigcup_{f \in C(X)} f^{-1}(\mathcal{B}_R)\bigr]$ which satisfies $E(f) = \int f\,dp$ for every $f \in C(X)$. Proposition 7.10 implies $p \in P(X)$. We have
$$\alpha_k = \lim_{n \to \infty} \int f_k\,dp_n = E(f_k) = \int f_k\,dp, \qquad k = 1, 2, \ldots,$$
so $\varphi(p_n) \to \varphi(p)$. This proves $\varphi[P(X)]$ is closed. Q.E.D.
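The convergence criterion used throughout this argument, $\int f\,dp_n \to \int f\,dp$ for bounded continuous $f$, can be illustrated numerically with uniform discrete measures on $[0,1]$ converging to Lebesgue measure; the test function below is an arbitrary illustrative choice:

```python
import math

def integral_discrete(f, atoms):
    """Integral of f under the uniform probability measure on finitely many atoms."""
    return sum(f(a) for a in atoms) / len(atoms)

f = math.cos                    # a bounded continuous test function
exact = math.sin(1.0)           # Lebesgue integral of cos over [0, 1]
approx = [integral_discrete(f, [(i + 0.5) / n for i in range(n)])
          for n in (10, 100, 1000)]
errors = [abs(a - exact) for a in approx]
print(errors[0] > errors[1] > errors[2])  # True: errors shrink as n grows
print(errors[2] < 1e-4)                   # True
```

Each discrete measure here lies in $P([0,1])$, and the shrinking errors are one concrete instance of weak convergence of the sequence to its limit.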
In order to show that topological completeness and separability of $X$ imply the same properties for $P(X)$, we need the following lemma.

Lemma 7.10 Let $X$ and $Y$ be separable metrizable spaces and $\varphi: X \to Y$ a homeomorphism. The mapping $\psi: P(X) \to P(Y)$ defined by
$$\psi(p)(B) = p\bigl(\varphi^{-1}(B)\bigr), \qquad B \in \mathcal{B}_Y,$$
is a homeomorphism.

Proof Suppose $p_1, p_2 \in P(X)$ and $p_1 \neq p_2$. Since $p_1$ and $p_2$ are regular, there is an open set $G \subset X$ for which $p_1(G) \neq p_2(G)$. The image $\varphi(G)$ is relatively open in $\varphi(X)$, so $\varphi(G) = \varphi(X) \cap B$, where $B$ is open in $Y$. It is clear that
$$\psi(p_1)(B) = p_1(G) \neq p_2(G) = \psi(p_2)(B),$$
so $\psi$ is one-to-one. Let $\{p_n\}$ be a sequence in $P(X)$ and $p \in P(X)$. If $p_n \to p$, then since $\varphi^{-1}(H)$ is open in $X$ for every open set $H \subset Y$, Proposition 7.21 implies
$$\liminf_{n \to \infty} \psi(p_n)(H) = \liminf_{n \to \infty} p_n\bigl(\varphi^{-1}(H)\bigr) \ge p\bigl(\varphi^{-1}(H)\bigr) = \psi(p)(H),$$
so $\psi(p_n) \to \psi(p)$ and $\psi$ is continuous. If we are given $\{p_n\}$ and $p$ such that $\psi(p_n) \to \psi(p)$, a reversal of this argument shows that $p_n \to p$ and $\psi^{-1}$ is continuous. Q.E.D.

Proposition 7.23 If $X$ is a topologically complete separable space, then $P(X)$ is topologically complete and separable.
Proof By Urysohn's theorem (Proposition 7.2) there is a homeomorphism $\varphi: X \to \mathcal{H}$, and the mapping $\psi$ obtained by replacing $Y$ by $\mathcal{H}$ in Lemma 7.10 is a homeomorphism from $P(X)$ to $P(\mathcal{H})$. Alexandroff's theorem (Proposition 7.3) implies $\varphi(X)$ is a $G_\delta$-subset of $\mathcal{H}$, and we see that
$$\psi[P(X)] = \{p \in P(\mathcal{H}) \mid p[\mathcal{H} - \varphi(X)] = 0\}. \tag{31}$$
We will show $\psi[P(X)]$ is a $G_\delta$-subset of the compact space $P(\mathcal{H})$ (Proposition 7.22) and use Alexandroff's theorem again to conclude that $P(X)$ is topologically complete. Since $\varphi(X)$ is a $G_\delta$-subset of $\mathcal{H}$, we can find open sets $G_1 \supset G_2 \supset \cdots$ such that $\varphi(X) = \bigcap_{n=1}^\infty G_n$. It is clear from (31) that
$$\psi[P(X)] = \bigcap_{n=1}^\infty \{p \in P(\mathcal{H}) \mid p(\mathcal{H} - G_n) = 0\} = \bigcap_{n=1}^\infty \bigcap_{k=1}^\infty \{p \in P(\mathcal{H}) \mid p(\mathcal{H} - G_n) < 1/k\}.$$
But for any closed set $F$ and real number $c$, the set $\{p \in P(\mathcal{H}) \mid p(F) \ge c\}$ is closed by Proposition 7.21(d), and $\{p \in P(\mathcal{H}) \mid p(\mathcal{H} - G_n) < 1/k\}$ is the complement of such a set. Q.E.D.

We turn now to characterizing the $\sigma$-algebra $\mathcal{B}_{P(X)}$ when $X$ is metrizable and separable. From Lemma 7.6, we have that the mapping $\theta_f: P(X) \to R$ given by
$$\theta_f(p) = \int f\,dp$$
is continuous for every $f \in C(X)$. One can easily verify from Proposition 7.21 that the mapping $\theta_B: P(X) \to [0, 1]$ defined by†
$$\theta_B(p) = p(B)$$
is Borel-measurable when $B$ is a closed subset of $X$. (Indeed, in the final stage of the proof of Proposition 7.23, we used the fact that when $B$ is closed the upper level sets $\{p \in P(X) \mid \theta_B(p) \ge c\}$ are closed.) Likewise, when $B$ is open, $\theta_B$ is Borel-measurable. It is natural to ask if $\theta_B$ is also Borel-measurable when $B$ is an arbitrary Borel set. The answer to this is yes, and in fact, $\mathcal{B}_{P(X)}$ is the smallest $\sigma$-algebra with respect to which $\theta_B$ is measurable for every $B \in \mathcal{B}_X$. A useful aid in proving this and several subsequent results is the concept of a Dynkin system.

† The use of the symbol $\theta_B$ here is a slight abuse of notation. In keeping with the definition of $\theta_f$, the technically correct symbol would be $\theta_{\chi_B}$.
Definition 7.11 Let $X$ be a set and $\mathcal{D}$ a class of subsets of $X$. We say $\mathcal{D}$ is a Dynkin system if the following conditions hold:
(a) $X \in \mathcal{D}$.
(b) If $A, B \in \mathcal{D}$ and $B \subset A$, then $A - B \in \mathcal{D}$.
(c) If $A_1, A_2, \ldots \in \mathcal{D}$ and $A_1 \subset A_2 \subset \cdots$, then $\bigcup_{n=1}^\infty A_n \in \mathcal{D}$.

Proposition 7.24 (Dynkin system theorem) Let $\mathcal{F}$ be a class of subsets of a set $X$, and assume $\mathcal{F}$ is closed under finite intersections. If $\mathcal{D}$ is a Dynkin system containing $\mathcal{F}$, then $\mathcal{D}$ also contains $\sigma(\mathcal{F})$.

Proof This is a standard result in measure theory. See, for example, Ash [A1, page 169]. Q.E.D.
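The Dynkin system theorem can be verified concretely on a finite ground set: there, condition (c) adds nothing new (an increasing union of nested sets is its largest member), so the smallest Dynkin system is obtained by closing under differences of nested sets alone. A small sketch, with an illustrative intersection-closed generating class:

```python
def dynkin_closure(ground, gens):
    """Smallest Dynkin system on a finite ground set containing gens.
    Closing under A - B for B a subset of A (B == A allowed, giving the
    empty set) suffices: X is included, and increasing unions are trivial
    on a finite set."""
    ground = frozenset(ground)
    fam = {ground} | {frozenset(g) for g in gens}
    while True:
        new = {A - B for A in fam for B in fam if B <= A}
        if new <= fam:
            return fam
        fam |= new

pi = [{1, 2}, {2, 3}, {2}]     # closed under finite intersections
d = dynkin_closure({1, 2, 3, 4}, pi)
# The generated sigma-algebra has atoms {1}, {2}, {3}, {4}, i.e. it is the
# whole power set of 16 sets -- and, as the theorem predicts, so is d.
print(len(d))                  # 16
print(frozenset({4}) in d)     # True
```

That the closure reaches all 16 subsets, rather than some smaller Dynkin system, is exactly the content of Proposition 7.24 for this generating class.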
Proposition 7.25 Let $X$ be a separable metrizable space and $\mathcal{E}$ a collection of subsets of $X$ which generates $\mathcal{B}_X$ and is closed under finite intersections. Then $\mathcal{B}_{P(X)}$ is the smallest $\sigma$-algebra with respect to which all functions of the form $\theta_E$, $E \in \mathcal{E}$, are measurable from $P(X)$ to $[0, 1]$, i.e.,
$$\mathcal{B}_{P(X)} = \sigma\Bigl[\bigcup_{E \in \mathcal{E}} \theta_E^{-1}(\mathcal{B}_{[0,1]})\Bigr].$$

Proof Let $\mathcal{F}$ be the smallest $\sigma$-algebra with respect to which $\theta_E$ is measurable for every $E \in \mathcal{E}$. To show $\mathcal{F} \subset \mathcal{B}_{P(X)}$, we show that $\theta_B$ is $\mathcal{B}_{P(X)}$-measurable for every $B \in \mathcal{B}_X$. Let $\mathcal{D} = \{B \in \mathcal{B}_X \mid \theta_B \text{ is } \mathcal{B}_{P(X)}\text{-measurable}\}$. It is easily verified that $\mathcal{D}$ is a Dynkin system. We have already seen that $\mathcal{D}$ contains every closed set, so the Dynkin system theorem (Proposition 7.24) implies $\mathcal{D} = \mathcal{B}_X$.

It remains to show that $\mathcal{B}_{P(X)} \subset \mathcal{F}$. Let $\mathcal{D}' = \{B \in \mathcal{B}_X \mid \theta_B \text{ is } \mathcal{F}\text{-measurable}\}$. As before, $\mathcal{D}'$ is a Dynkin system, and since $\mathcal{E} \subset \mathcal{D}'$, we have $\mathcal{D}' = \mathcal{B}_X$. Thus the function $\theta_f(p) = \int f\,dp$ is $\mathcal{F}$-measurable when $f$ is the indicator of a Borel set. Therefore $\theta_f$ is $\mathcal{F}$-measurable when $f$ is a Borel-measurable simple function. If $f \in C(X)$, then there is a sequence of simple functions, uniformly bounded below, such that $f_n \uparrow f$. The monotone convergence theorem implies $\theta_{f_n} \uparrow \theta_f$, so $\theta_f$ is $\mathcal{F}$-measurable. It follows that for $\varepsilon > 0$, $p \in P(X)$, and $f \in C(X)$, the subbasic open set
$$V_\varepsilon(p; f) = \left\{ q \in P(X) \;\middle|\; \left|\int f\,dq - \int f\,dp\right| < \varepsilon \right\}$$
is $\mathcal{F}$-measurable. It follows that $\mathcal{B}_{P(X)} \subset \mathcal{F}$ (Definition 7.6). Q.E.D.
Definition 7.12 Let $X$ and $Y$ be separable metrizable spaces. A stochastic kernel $q(dy \mid x)$ on $Y$ given $X$ is a collection of probability measures in $P(Y)$ parameterized by $x \in X$. If $\mathcal{F}$ is a $\sigma$-algebra on $X$ and the mapping $\gamma: X \to P(Y)$ defined by
$$\gamma(x) = q(dy \mid x) \tag{32}$$
is $\mathcal{F}$-measurable,
then $q(dy \mid x)$ is said to be $\mathcal{F}$-measurable. If $\gamma$ is continuous, $q(dy \mid x)$ is said to be continuous.

Proposition 7.26 Let $X$ and $Y$ be Borel spaces, $\mathcal{E}$ a collection of subsets of $Y$ which generates $\mathcal{B}_Y$ and is closed under finite intersections, and $q(dy \mid x)$ a stochastic kernel on $Y$ given $X$. Then $q(dy \mid x)$ is Borel-measurable if and only if the mapping $\lambda_E: X \to [0, 1]$ defined by
$$\lambda_E(x) = q(E \mid x)$$
is Borel-measurable for every $E \in \mathcal{E}$.

Proof Let $\gamma: X \to P(Y)$ be defined by $\gamma(x) = q(dy \mid x)$. Then for $E \in \mathcal{E}$, we have $\lambda_E = \theta_E \circ \gamma$. If $q(dy \mid x)$ is Borel-measurable (i.e., $\gamma$ is Borel-measurable), then Proposition 7.25 implies $\lambda_E$ is Borel-measurable for every $E \in \mathcal{E}$. Conversely, if $\lambda_E$ is Borel-measurable for every $E \in \mathcal{E}$, then $\gamma^{-1}[\theta_E^{-1}(B)] = \lambda_E^{-1}(B) \in \mathcal{B}_X$ for every $B \in \mathcal{B}_{[0,1]}$, and since $\mathcal{B}_{P(Y)} = \sigma\bigl[\bigcup_{E \in \mathcal{E}} \theta_E^{-1}(\mathcal{B}_{[0,1]})\bigr]$ by Proposition 7.25, $\gamma$ is Borel-measurable. Q.E.D.
Proposition 7.27 Let $X$, $Y$, and $Z$ be Borel spaces, $\mathcal{F}$ a $\sigma$-algebra on $X$, and $q(d(y, z) \mid x)$ an $\mathcal{F}$-measurable stochastic kernel on $YZ$ given $X$. Then there exist an $\mathcal{F}\mathcal{B}_Y$-measurable stochastic kernel $r(dz \mid x, y)$ on $Z$ given $XY$ and an $\mathcal{F}$-measurable stochastic kernel $s(dy \mid x)$ on $Y$ given $X$ such that
$$q(\bar{Y}\bar{Z} \mid x) = \int_{\bar{Y}} r(\bar{Z} \mid x, y)\, s(dy \mid x) \qquad \forall x \in X,\ \bar{Y} \in \mathcal{B}_Y,\ \bar{Z} \in \mathcal{B}_Z. \tag{34}$$

Proof Let $s(dy \mid x)$ be the marginal of $q(d(y, z) \mid x)$ on $Y$. We may assume without loss of generality that $Y = Z = (0, 1]$. For $n = 1, 2, \ldots$, partition $(0, 1]$ into the sets $M(j, n) = ((j-1)/2^n,\, j/2^n]$, $j = 1, \ldots, 2^n$, and for $z \in Z$ define
$$G_n(z \mid x, y) = \begin{cases} \dfrac{q[M(j, n)(0, z] \mid x]}{s[M(j, n) \mid x]} & \text{if } y \in M(j, n) \text{ and } s[M(j, n) \mid x] > 0, \\[1ex] 0 & \text{if } y \in M(j, n) \text{ and } s[M(j, n) \mid x] = 0. \end{cases}$$
The functions $G_n(z \mid x, y)$ can be regarded as generalized difference quotients of $q(dy\,(0, z] \mid x)$ relative to $s(dy \mid x)$. For each $z$, the set
$$B(z) = \Bigl\{(x, y) \in XY \;\Big|\; \lim_{n \to \infty} G_n(z \mid x, y) \text{ exists in } R\Bigr\} = \bigl\{(x, y) \in XY \;\big|\; \{G_n(z \mid x, y)\} \text{ is Cauchy}\bigr\} = \bigcap_{k=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n, m \ge N} \bigl\{(x, y) \in XY \;\big|\; |G_n(z \mid x, y) - G_m(z \mid x, y)| < 1/k\bigr\}$$
is $\mathcal{F}\mathcal{B}_Y$-measurable. Theorem 2.5, page 612 of Doob [D4] states that
$$s[B(z)_x \mid x] = 1 \qquad \forall x \in X,\ z \in Q \cap Z,$$
and if we define
$$G(z \mid x, y) = \begin{cases} \lim_{n \to \infty} G_n(z \mid x, y) & \text{if } (x, y) \in B(z), \\ 0 & \text{otherwise}, \end{cases}$$
then
$$q[\bar{Y}(0, z] \mid x] = \int_{\bar{Y}} G(z \mid x, y)\, s(dy \mid x) \qquad \forall x \in X,\ z \in Q \cap Z,\ \bar{Y} \in \mathcal{B}_Y. \tag{35}$$
It is clear that for any $z$, $G(z \mid x, y)$ is $\mathcal{F}\mathcal{B}_Y$-measurable in $(x, y)$.† A comparison of (34) and (35) suggests that we should try to extend $G(z \mid x, y)$ in such a way that for fixed $(x, y)$, $G(z \mid x, y)$ is a distribution function.

† For the reader familiar with martingales, we give the proof of the theorem just referenced. Fix $x$ and observe that for $m \ge n$,
$$q[M(j, n)(0, z] \mid x] = \int_{M(j, n)} G_m(z \mid x, y)\, s(dy \mid x). \tag{*}$$
Since $\{M(j, n) \mid j = 1, \ldots, 2^n\}$ generates the $\sigma$-algebra with respect to which $G_n(z \mid x, y)$ is measurable as a function of $y$, we conclude that $G_n(z \mid x, y)$, $n = 1, 2, \ldots$, is a martingale on $Y$ under the measure $s(dy \mid x)$. Each $G_n(z \mid x, y)$ is bounded above by 1, so by the martingale convergence theorem (see, e.g., Ash [A1, p. 292]), $G_n(z \mid x, y)$ converges for $s(dy \mid x)$ almost every $y$. Thus $s[B(z)_x \mid x] = 1$, and the definition of $G(z \mid x, y)$ given above is possible. Let $m \to \infty$ in (*) to see that (35) holds whenever $\bar{Y} = M(j, n)$ for some $j$ and $n$. The collection of sets $\bar{Y}$ for which (35) holds is a Dynkin system, and it follows from Proposition 7.24 that (35) holds for every $\bar{Y} \in \mathcal{B}_Y$.
Toward this end, for each $z_0 \in Q \cap Z$, we define
$$C(z_0) = \{(x, y) \in XY \mid \exists z \in Q \cap Z \text{ with } z \le z_0 \text{ and } G(z \mid x, y) > G(z_0 \mid x, y)\} = \bigcup_{\substack{z \in Q \cap Z \\ z \le z_0}} \{(x, y) \in XY \mid G(z \mid x, y) > G(z_0 \mid x, y)\},$$
$$C = \bigcup_{z_0 \in Q \cap Z} C(z_0),$$
$$D(z_0) = \{(x, y) \in XY \mid G(\cdot \mid x, y) \text{ is not right-continuous at } z_0\} = \bigcup_{n=1}^\infty \bigcap_{k=1}^\infty \bigcup_{\substack{z \in Q \cap Z \\ z_0 \le z < z_0 + 1/k}} \{(x, y) \in XY \mid |G(z \mid x, y) - G(z_0 \mid x, y)| \ge 1/n\},$$
$$D = \bigcup_{z_0 \in Q \cap Z} D(z_0),$$
$$E = \Bigl\{(x, y) \in XY \;\Big|\; \lim_{\substack{z \downarrow 0 \\ z \in Q \cap Z}} G(z \mid x, y) > 0\Bigr\},$$
and
$$F = \{(x, y) \in XY \mid G(1 \mid x, y) \neq 1\}.$$
For fixed $x \in X$ and $z_0 \in Q \cap Z$, (35) implies that whenever $z \in Q \cap Z$, $z \le z_0$, then
$$\int_{\bar{Y}} G(z \mid x, y)\, s(dy \mid x) \le \int_{\bar{Y}} G(z_0 \mid x, y)\, s(dy \mid x) \qquad \forall \bar{Y} \in \mathcal{B}_Y.$$
Therefore $G(z \mid x, y) \le G(z_0 \mid x, y)$ for $s(dy \mid x)$ almost all $y$, so $s[C(z_0)_x \mid x] = 0$ and
$$s(C_x \mid x) = 0. \tag{36}$$
Equation (36) implies that $G(z \mid x, y)$ is nondecreasing in $z$ for $s(dy \mid x)$ almost all $y$. This fact and (35) imply that if $z \downarrow z_0$ ($z \in Q \cap Z$), then
$$\int_{\bar{Y}} G(z \mid x, y)\, s(dy \mid x) \downarrow \int_{\bar{Y}} G(z_0 \mid x, y)\, s(dy \mid x) \qquad \forall \bar{Y} \in \mathcal{B}_Y,$$
and
$$G(z \mid x, y) \downarrow G(z_0 \mid x, y)$$
for $s(dy \mid x)$ almost all $y$. Therefore $s[D(z_0)_x \mid x] = 0$ and
$$s(D_x \mid x) = 0. \tag{37}$$
Equation (35) also implies that as $z \downarrow 0$ ($z \in Q \cap Z$),
$$\int_{\bar{Y}} G(z \mid x, y)\, s(dy \mid x) \downarrow 0 \qquad \forall \bar{Y} \in \mathcal{B}_Y.$$
Since $G(z \mid x, y)$ is nondecreasing in $z$ for $s(dy \mid x)$ almost all $y$, we must have $G(z \mid x, y) \downarrow 0$ for $s(dy \mid x)$ almost all $y$, i.e.,
$$s(E_x \mid x) = 0. \tag{38}$$
Substituting $z = 1$ in (35), we see that
$$\int_{\bar{Y}} G(1 \mid x, y)\, s(dy \mid x) = s(\bar{Y} \mid x) \qquad \forall \bar{Y} \in \mathcal{B}_Y,$$
so $G(1 \mid x, y) = 1$ for $s(dy \mid x)$ almost all $y$, i.e.,
$$s(F_x \mid x) = 0. \tag{39}$$
For $z \in Z$, let $\{z_n\}$ be a sequence in $Q \cap Z$ such that $z_n \downarrow z$, and define, for every $x \in X$, $y \in Y$,
$$F(z \mid x, y) = \begin{cases} \lim_{n \to \infty} G(z_n \mid x, y) & \text{if } (x, y) \in XY - (C \cup D \cup E \cup F), \\ z & \text{otherwise}. \end{cases} \tag{40}$$
For $(x, y) \in XY - (C \cup D \cup E \cup F)$, $G(z \mid x, y)$ is a nondecreasing, right-continuous function of $z \in Q \cap Z$, so $F(z \mid x, y)$ is well defined, nondecreasing, and right-continuous. It also satisfies, for every $(x, y) \in XY$,
$$0 \le F(z \mid x, y) \le F(1 \mid x, y) = 1 \qquad \forall z \in Z,$$
and
$$\lim_{z \downarrow 0} F(z \mid x, y) = 0.$$
It is a standard result of probability theory (Ash [A1, p. 24]) that for each $(x, y)$ there is a probability measure $r(dz \mid x, y)$ on $Z$ such that
$$r((0, z] \mid x, y) = F(z \mid x, y) \qquad \forall z \in (0, 1].$$
The collection of subsets $\bar{Z} \in \mathcal{B}_Z$ for which $r(\bar{Z} \mid x, y)$ is $\mathcal{F}\mathcal{B}_Y$-measurable in $(x, y)$ forms a Dynkin system which contains $\{(0, z] \mid z \in Z\}$, so $r(\bar{Z} \mid x, y)$ is $\mathcal{F}\mathcal{B}_Y$-measurable for every $\bar{Z} \in \mathcal{B}_Z$. Relations (35)-(40) and the monotone convergence theorem imply
$$q[\bar{Y}(0, z] \mid x] = \int_{\bar{Y}} F(z \mid x, y)\, s(dy \mid x) = \int_{\bar{Y}} r((0, z] \mid x, y)\, s(dy \mid x) \qquad \forall x \in X,\ z \in Z,\ \bar{Y} \in \mathcal{B}_Y. \tag{41}$$
The collection of subsets $\bar{Z} \in \mathcal{B}_Z$ for which (34) holds forms a Dynkin system which contains $\{(0, z] \mid z \in Z\}$, so (34) holds for every $\bar{Z} \in \mathcal{B}_Z$. Q.E.D.

If $\mathcal{F} = \mathcal{B}_X$, an application of Proposition 7.26 reduces Proposition 7.27 to the following form.
Corollary 7.27.1 Let $X$, $Y$, and $Z$ be Borel spaces and let $q(d(y, z) \mid x)$ be a Borel-measurable stochastic kernel on $YZ$ given $X$. Then there exist Borel-measurable stochastic kernels $r(dz \mid x, y)$ and $s(dy \mid x)$ on $Z$ given $XY$ and on $Y$ given $X$, respectively, such that (34) holds.

If there is no dependence on the parameter $x$ in Corollary 7.27.1, we have the following well-known result for Borel spaces.

Corollary 7.27.2 Let $Y$ and $Z$ be Borel spaces and $q \in P(YZ)$. Then there exists a Borel-measurable stochastic kernel $r(dz \mid y)$ on $Z$ given $Y$ such that
$$q(\bar{Y}\bar{Z}) = \int_{\bar{Y}} r(\bar{Z} \mid y)\, s(dy) \qquad \forall \bar{Y} \in \mathcal{B}_Y,\ \bar{Z} \in \mathcal{B}_Z,$$
where $s$ is the marginal of $q$ on $Y$.
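For a measure on a finite product space, the decomposition of Corollary 7.27.2 is the familiar factorization into a marginal and a conditional distribution; a minimal sketch with made-up numbers:

```python
# joint distribution q on Y x Z as a dict (illustrative values)
q = {('y1', 'z1'): 0.10, ('y1', 'z2'): 0.30,
     ('y2', 'z1'): 0.45, ('y2', 'z2'): 0.15}

# marginal s of q on Y
s = {}
for (y, z), prob in q.items():
    s[y] = s.get(y, 0.0) + prob

# conditional kernel r(z | y) = q(y, z) / s(y)
r = {(y, z): prob / s[y] for (y, z), prob in q.items()}

# recompose: q(y, z) = r(z | y) * s(y)
recomposed = {(y, z): r[(y, z)] * s[y] for (y, z) in q}
print(all(abs(recomposed[k] - q[k]) < 1e-12 for k in q))  # True
print(abs(sum(r[('y1', z)] for z in ('z1', 'z2')) - 1.0) < 1e-12)  # rows sum to 1
```

The hard part of the corollary is precisely what disappears in the finite case: on general Borel spaces one cannot divide by $s(y)$ pointwise, and the martingale construction above supplies the conditional distributions instead.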
7.4.4 Integration
As in Section 2.1, we adopt the convention
$$-\infty + \infty = +\infty - \infty = \infty. \tag{42}$$
With this convention, for $a, b, c \in R^*$ the associative law
$$(a + b) + c = a + (b + c)$$
still holds, since if either $a$, $b$, or $c$ is $\infty$, then both sides are $\infty$, while if none of them is $\infty$, the usual arithmetic involving finite numbers and $-\infty$ applies. Also, if $a, b, c \in R^*$ and $a + b = c$, then $a = c - b$, provided $b \neq \pm\infty$. It is always true, however, that if $a + b \le c$, then $a \le c - b$.

We use convention (42) to extend the definition of the integral. If $X$ is a metrizable space, $p \in P(X)$, and $f: X \to R^*$ is Borel-measurable, we define
$$\int f\,dp = \int f^+\,dp - \int f^-\,dp. \tag{43}$$
Note that if $\int f^+\,dp < \infty$ or if $\int f^-\,dp < \infty$, (43) reduces to the classical definition of $\int f\,dp$. We collect some of the properties of integration in this extended sense in the following lemma.

Lemma 7.11 Let $X$ be a metrizable space and let $p \in P(X)$ be given. Let $f$, $g$, and $f_n$, $n = 1, 2, \ldots$, be Borel-measurable, extended real-valued functions on $X$.
(a) Using (42) to define $f + g$, ...
(b) If either (b1) $\int f^+\,dp < \infty$, or (b2) $\int f^-\,dp < \infty$, or (b3) $\int g^+\,dp$ ...

... $-\infty$ implies $\int f_n\,dp = \int f\,dp = \infty$, and the conclusion follows from (d). Statement (f) follows from the extended monotone convergence theorem. Q.E.D.
We saw in Corollary 7.27.2 that a probability measure on a product of Borel spaces can be decomposed into a stochastic kernel and a marginal. This process can be reversed; that is, given a probability measure and one or more Borel-measurable stochastic kernels on Borel spaces, a unique probability measure on the product space can be constructed.

Proposition 7.28 Let $X_1, X_2, \ldots$ be a sequence of Borel spaces, $Y_n = X_1 X_2 \cdots X_n$ and $Y = X_1 X_2 \cdots$. Let $p \in P(X_1)$ be given, and, for $n = 1, 2, \ldots$, let $q_n(dx_{n+1} \mid y_n)$ be a Borel-measurable stochastic kernel on $X_{n+1}$ given $Y_n$.
Then for $n = 2, 3, \ldots$, there exist unique probability measures $r_n \in P(Y_n)$ such that
$$r_n(\bar{X}_1 \bar{X}_2 \cdots \bar{X}_n) = \int_{\bar{X}_1} \int_{\bar{X}_2} \cdots \int_{\bar{X}_{n-1}} q_{n-1}(\bar{X}_n \mid x_1, x_2, \ldots, x_{n-1})\, q_{n-2}(dx_{n-1} \mid x_1, x_2, \ldots, x_{n-2}) \cdots q_1(dx_2 \mid x_1)\, p(dx_1) \qquad \forall \bar{X}_1 \in \mathcal{B}_{X_1}, \ldots, \bar{X}_n \in \mathcal{B}_{X_n}. \tag{46}$$
If $f: Y_n \to R^*$ is Borel-measurable and either $\int f^+\,dr_n < \infty$ or $\int f^-\,dr_n < \infty$, then
$$\int_{Y_n} f\,dr_n = \int_{X_1} \int_{X_2} \cdots \int_{X_n} f(x_1, x_2, \ldots, x_n)\, q_{n-1}(dx_n \mid x_1, x_2, \ldots, x_{n-1}) \cdots q_1(dx_2 \mid x_1)\, p(dx_1). \tag{47}$$
Furthermore, there exists a unique probability measure $r$ on $Y = X_1 X_2 \cdots$ such that for each $n$ the marginal of $r$ on $Y_n$ is $r_n$.
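The construction of Proposition 7.28 can be carried out explicitly when the spaces are finite: the measure $r_2$ on $X_1 \times X_2$ is assembled from $p$ and the kernel $q_1$ via formula (46) with $n = 2$; a small sketch with made-up numbers:

```python
# p on X1 and a kernel q1(. | x1) on X2 -- illustrative values
p = {'a': 0.3, 'b': 0.7}
q1 = {'a': {'u': 0.5, 'v': 0.5},
      'b': {'u': 0.2, 'v': 0.8}}

# r2({x1} x {x2}) = q1(x2 | x1) * p(x1), i.e. (46) for n = 2
r2 = {(x1, x2): q1[x1][x2] * p[x1] for x1 in p for x2 in q1[x1]}

print(abs(sum(r2.values()) - 1.0) < 1e-12)  # r2 is a probability measure: True
# the marginal of r2 on X1 recovers p:
marg = {x1: sum(v for (a, _), v in r2.items() if a == x1) for x1 in p}
print(all(abs(marg[x1] - p[x1]) < 1e-12 for x1 in p))  # True
```

Iterating the same step with kernels $q_2, q_3, \ldots$ produces the consistent family $r_n$ whose projective limit is the measure $r$ of the proposition.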
Proof The spaces $Y_n$, $n = 2, 3, \ldots$, and $Y$ are Borel by Proposition 7.13. If there exists $r_n \in P(Y_n)$ satisfying (46), it must be unique. To see this, suppose $r_n' \in P(Y_n)$ also satisfies (46). The collection $\mathcal{D} = \{B \in \mathcal{B}_{Y_n} \mid r_n(B) = r_n'(B)\}$ is a Dynkin system containing the measurable rectangles, so $\mathcal{D} = \mathcal{B}_{Y_n}$ and $r_n = r_n'$.

We establish the existence of $r_n$ by induction, considering first the case $n = 2$. For $B \in \mathcal{B}_{Y_2}$, use Corollary 7.26.1 to define
$$r_2(B) = \int_{X_1} \int_{X_2} \chi_B(x_1, x_2)\, q_1(dx_2 \mid x_1)\, p(dx_1). \tag{48}$$
It is easily verified that $r_2 \in P(Y_2)$ and $r_2$ satisfies (46). If $f$ is the indicator of $B \in \mathcal{B}_{Y_2}$, then $\int_{X_2} f(x_1, x_2)\, q_1(dx_2 \mid x_1)$ is Borel-measurable and, by (48),
$$\int_{Y_2} f\,dr_2 = \int_{X_1} \int_{X_2} f(x_1, x_2)\, q_1(dx_2 \mid x_1)\, p(dx_1). \tag{49}$$
Linearity of the integral implies that (49) holds for Borel-measurable simple functions as well. If $f: Y_2 \to [0, \infty]$ is Borel-measurable, then there exists an increasing sequence of simple functions $\{f_n\}$ such that $f_n \uparrow f$. By the monotone convergence theorem,
$$\lim_{n \to \infty} \int_{X_2} f_n(x_1, x_2)\, q_1(dx_2 \mid x_1) = \int_{X_2} f(x_1, x_2)\, q_1(dx_2 \mid x_1) \qquad \forall x_1 \in X_1,$$
so $\int_{X_2} f(x_1, x_2)\, q_1(dx_2 \mid x_1)$ is Borel-measurable and
$$\lim_{n \to \infty} \int_{Y_2} f_n\,dr_2 = \lim_{n \to \infty} \int_{X_1} \int_{X_2} f_n(x_1, x_2)\, q_1(dx_2 \mid x_1)\, p(dx_1) = \int_{X_1} \int_{X_2} f(x_1, x_2)\, q_1(dx_2 \mid x_1)\, p(dx_1).$$
But $\int_{Y_2} f_n\,dr_2 \uparrow \int_{Y_2} f\,dr_2$, so (49) holds for any Borel-measurable nonnegative $f$. For a Borel-measurable $f: Y_2 \to R^*$ satisfying $\int f^+\,dr_2 < \infty$ or $\int f^-\,dr_2 < \infty$, we have
$$\int_{Y_2} f^+\,dr_2 = \int_{X_1} \int_{X_2} f^+(x_1, x_2)\, q_1(dx_2 \mid x_1)\, p(dx_1)$$
and
$$\int_{Y_2} f^-\,dr_2 = \int_{X_1} \int_{X_2} f^-(x_1, x_2)\, q_1(dx_2 \mid x_1)\, p(dx_1).$$
Assume for specificity that $\int_{Y_2} f^-\,dr_2 < \infty$.
0, the Stone-Weierstrass theorem implies that such a linear combination can be found satisfying IIIJ= gil < 6. If {Pn} is a sequence in P(X) converging to P E P(X), {qn} a sequence in P(Y) converging to q E P(Y), and fj and hj the restrictions of Jj and hj to X and Y, respectively, then
d/ij -
lim supls g d(Pnqn) n~OCJ
xr
slim supI r n~OCJ JXY
I
r g d(Pq)j JXY
(g - I 1fjhj)d(Pnqn) \ j=
+ j=l n-e lim I r fjdPn r hjdqn - r fjdp r hjdq/ co JX Jy Jx Jr
+ li~s~plfxyCtl
fjh j
-
g)d(pq)1
S 26.
The continuity of (J follows from the equivalence of (a) and (c) of Proposition 7.21. Q.E.D. Proposition 7.30 Let X and Y be separable metrizable spaces and let q(dYlx) be a continuous stochastic kernel on Y given X. If f E C(X Y), then the function A: X --+ R defined by A(X)
$$\lambda(x) = \int f(x, y)\, q(dy \mid x)$$
is continuous.

Proof The mapping $\nu: X \to P(XY)$ defined by $\nu(x) = p_x q(dy \mid x)$ is continuous by Corollary 7.21.1 and Lemma 7.12. We have $\lambda(x) = (\theta_f \circ \nu)(x)$, where $\theta_f: P(XY) \to R$ is defined by $\theta_f(r) = \int f\,dr$. By Proposition 7.21, $\theta_f$ is continuous. Hence, $\lambda$ is continuous. Q.E.D.
7.5 Semicontinuous Functions and Borel-Measurable Selection

In the dynamic programming algorithm given by (17) and (18) of Chapter 1, three operations are performed repetitively. First, there is the evaluation of a conditional expectation. Second, an extended real-valued function of two variables (state and control) is infimized over one of these variables (control). Finally, if an optimal or nearly optimal policy is to be constructed, a "selector" which maps each state to a control which achieves
or nearly achieves the infimum in the second step must be chosen. In this section, we give results which will enable us to show that, under certain conditions, the extended real-valued functions involved are semicontinuous and the selectors can be chosen to be Borel-measurable. The results are applied to dynamic programming in Propositions 8.6-8.7 and Corollaries 9.17.2-9.17.3.

Definition 7.13 Let $X$ be a metrizable space and $f$ an extended real-valued function on $X$. If $\{x \in X \mid f(x) \le c\}$ is closed for every $c \in R$, $f$ is said to be lower semicontinuous. If $\{x \in X \mid f(x) \ge c\}$ is closed for every $c \in R$, $f$ is said to be upper semicontinuous.

Note that $f$ is lower semicontinuous if and only if $-f$ is upper semicontinuous. We will use this duality in the proofs of the following propositions to assert facts about upper semicontinuous functions given analogous facts about lower semicontinuous functions. Note also that if $f$ is lower semicontinuous, the sets $\{x \in X \mid f(x) = -\infty\}$ and $\{x \in X \mid f(x) \le \infty\}$ are closed, since the former is equal to $\bigcap_{n=1}^\infty \{x \in X \mid f(x) \le -n\}$ and the latter is $X$. There is a similar result for upper semicontinuous functions. The following lemma provides an alternative characterization of lower and upper semicontinuous functions.

Lemma 7.13 Let $X$ be a metrizable space and $f: X \to R^*$.
(a) The function $f$ is lower semicontinuous if and only if for each sequence $\{x_n\} \subset X$ converging to $x \in X$,
$$\liminf_{n \to \infty} f(x_n) \ge f(x). \tag{53}$$
(b) The function $f$ is upper semicontinuous if and only if for each sequence $\{x_n\} \subset X$ converging to $x \in X$,
$$\limsup_{n \to \infty} f(x_n) \le f(x). \tag{54}$$
Proof Suppose f is lower semicontinuous and x_n → x. We can extract a subsequence {x_{n_k}} such that x_{n_k} → x as k → ∞ and
lim_{k→∞} f(x_{n_k}) = lim inf_{n→∞} f(x_n).
Given ε > 0, define
θ(ε) = lim inf_{n→∞} f(x_n) + ε if lim inf_{n→∞} f(x_n) > −∞,
θ(ε) = −1/ε otherwise.
7.5 BOREL-MEASURABLE SELECTION
There exists a positive integer k(ε) such that f(x_{n_k}) ≤ θ(ε) for all k ≥ k(ε). The set {y ∈ X | f(y) ≤ θ(ε)} is closed, and hence it contains x. Inequality (53) follows. Conversely, if (53) holds and for some c ∈ R, {x_n} is a sequence in {y ∈ X | f(y) ≤ c} converging to x, then f(x) ≤ c, so f is lower semicontinuous. Part (b) of the proposition follows from part (a) by the duality mentioned earlier. Q.E.D.
If f and g are lower semicontinuous and bounded below on X and if x_n → x, then
lim inf_{n→∞} [f(x_n) + g(x_n)] ≥ lim inf_{n→∞} f(x_n) + lim inf_{n→∞} g(x_n) ≥ f(x) + g(x),
so f + g is lower semicontinuous. If f is lower semicontinuous and α > 0, then αf is lower semicontinuous as well. Upper semicontinuous functions have similar properties. It is clear from (53) and (54) that f: X → R* is continuous if and only if it is both lower and upper semicontinuous. We can often infer properties of semicontinuous functions from properties of continuous functions by means of the next lemma.
Lemma 7.14
Let X be a metrizable space and f: X → R*.
(a) The function f is lower semicontinuous and bounded below if and only if there exists a sequence {f_n} ⊂ C(X) such that f_n ↑ f.
(b) The function f is upper semicontinuous and bounded above if and only if there exists a sequence {f_n} ⊂ C(X) such that f_n ↓ f.
Proof We prove only part (a) of the proposition and appeal to duality for part (b). Assume f is lower semicontinuous and bounded below by b ∈ R, and let d be a metric on X consistent with its topology. We may assume without loss of generality that for some x_0 ∈ X, f(x_0) < ∞, since the result is trivial otherwise. (Take f_n(x) = n for every x ∈ X.) As in Lemma 7.7, define
g_n(x) = inf_{y∈X} [f(y) + n d(x, y)].
Exactly as in the proof of Lemma 7.7, we show that {g_n} is an increasing sequence of continuous functions bounded below by b and above by f. The characterization (53) of lower semicontinuous functions can be used in place of continuity to prove g_n ↑ f. In particular, (28) becomes
f(x) ≤ lim inf_{n→∞} f(y_n) ≤ lim_{n→∞} g_n(x) + ε.
Now define f_n = min{n, g_n}. Then each f_n is continuous and bounded and f_n ↑ f. This concludes the proof of the direct part of the proposition. For
the converse part, suppose {f_n} ⊂ C(X) and f_n ↑ f. For c ∈ R,
{x ∈ X | f(x) ≤ c} = ∩_{n=1}^∞ {x ∈ X | f_n(x) ≤ c}
is closed. Q.E.D.
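The approximating sequence in the direct part of Lemma 7.14(a) can be computed on a grid. The following sketch (hypothetical data, not from the text; a minimum over grid points stands in for the infimum over X) evaluates g_n(x) = inf_y [f(y) + n·d(x, y)] for a lower semicontinuous step function and checks that the g_n increase toward f from below.

```python
import numpy as np

# Sketch of Lemma 7.14(a): approximate the lsc indicator of (0, 1] from below
# by the continuous functions g_n(x) = inf_y [f(y) + n*|x - y|].
ys = np.linspace(-1.0, 1.0, 801)       # grid standing in for X = [-1, 1]
f = (ys > 0).astype(float)             # lower semicontinuous step function

def g(n):
    # g_n at each grid point: min over grid points y of f(y) + n|x - y|
    return np.min(f[None, :] + n * np.abs(ys[:, None] - ys[None, :]), axis=1)

g1, g4, g16 = g(1), g(4), g(16)
assert np.all(g1 <= g4 + 1e-12) and np.all(g4 <= g16 + 1e-12)  # g_n increasing
assert np.all(g16 <= f + 1e-12)                                # bounded by f

# Pointwise convergence g_n -> f: at x = 0.5, g_n(x) = min(1, n/2).
i = np.searchsorted(ys, 0.5)
print(g1[i], g16[i])
```

Each g_n is Lipschitz with constant n, which is why the approximants are continuous while the limit f need not be.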
The following proposition shows that the semicontinuity of a function of two variables is preserved when one of the variables is integrated out via a continuous stochastic kernel.
Proposition 7.31 Let X and Y be separable metrizable spaces, let q(dy|x) be a continuous stochastic kernel on Y given X, and let f: XY → R* be Borel-measurable. Define
λ(x) = ∫ f(x, y) q(dy|x).
(a) If f is lower semicontinuous and bounded below, then λ is lower semicontinuous and bounded below.
(b) If f is upper semicontinuous and bounded above, then λ is upper semicontinuous and bounded above.
Proof We prove part (a) of the proposition and appeal to duality for part (b). If f: XY → R* is lower semicontinuous and bounded below, then by Lemma 7.14 there exists a sequence {f_n} ⊂ C(XY) such that f_n ↑ f. Define
λ_n(x) = ∫ f_n(x, y) q(dy|x).
By Proposition 7.30, each λ_n is continuous, and by the monotone convergence theorem λ_n ↑ λ. By Lemma 7.14, λ is lower semicontinuous. Q.E.D.
An important operation in the execution of the dynamic programming algorithm is the infimization over one of the variables of a bivariate function. In the context of semicontinuity, we have the following result related to this operation.
Proposition 7.32 Let X and Y be metrizable spaces and let f: XY → R* be given. Define
f*(x) = inf_{y∈Y} f(x, y).   (55)
(a) If f is lower semicontinuous and Y is compact, then f* is lower semicontinuous and for every x ∈ X the infimum in (55) is attained by some y ∈ Y.
(b) If f is upper semicontinuous, then f* is upper semicontinuous.
Proof (a) Fix x and let {y_n} ⊂ Y be such that f(x, y_n) ↓ f*(x). Then {y_n} accumulates at some y_0 ∈ Y, and part (a) of Lemma 7.13 implies that f(x, y_0) = f*(x). To show that f* is lower semicontinuous, let {x_n} ⊂ X be
such that x_n → x_0. Choose a sequence {y_n} ⊂ Y such that
f(x_n, y_n) ≤ f*(x_n) + 1/n if f*(x_n) > −∞,  f(x_n, y_n) ≤ −n if f*(x_n) = −∞,  n = 1, 2, ....
There is a subsequence of {(x_n, y_n)}, call it {(x_{n_k}, y_{n_k})}, such that lim inf_{n→∞} f(x_n, y_n) = lim_{k→∞} f(x_{n_k}, y_{n_k}). The sequence {y_{n_k}} accumulates at some y_0 ∈ Y, and, by Lemma 7.13(a),
lim inf_{n→∞} f*(x_n) = lim inf_{n→∞} f(x_n, y_n) = lim_{k→∞} f(x_{n_k}, y_{n_k}) ≥ f(x_0, y_0) ≥ f*(x_0),
so f* is lower semicontinuous.
(b) Let d_1 be a metric on X and d_2 a metric on Y consistent with their topologies. If G ⊂ XY is open and x_0 ∈ proj_X(G), then there is some y_0 ∈ Y for which (x_0, y_0) ∈ G, and there is some ε > 0 such that
N_ε(x_0, y_0) = {(x, y) ∈ XY | d_1(x, x_0) < ε, d_2(y, y_0) < ε} ⊂ G.
Then
x_0 ∈ proj_X[N_ε(x_0, y_0)] = {x ∈ X | d_1(x, x_0) < ε} ⊂ proj_X(G),
so proj_X(G) is open in X. For c ∈ R,
{x ∈ X | f*(x) < c} = proj_X({(x, y) ∈ XY | f(x, y) < c}).
The upper semicontinuity of f implies that {(x, y) | f(x, y) < c} is open, so {x ∈ X | f*(x) < c} is open and f* is upper semicontinuous. Q.E.D.
Another important operation in the dynamic programming algorithm is the choice of a measurable "selector" which assigns to each x ∈ X a y ∈ Y which attains or nearly attains the infimum in (55). We first discuss Borel-measurable selection in case (a) of Proposition 7.32. For this we will need the Hausdorff metric and the corresponding topology on the set 2^Y of closed subsets of a compact metric space Y (Appendix C). The space 2^Y under this topology is compact (Proposition C.2) and, therefore, complete and separable. Several preliminary lemmas are required.
Lemma 7.15 Let Y be a compact metrizable space and let g: Y → R* be lower semicontinuous. Define g*: 2^Y → R* by
g*(A) = min_{y∈A} g(y) if A ≠ ∅,  g*(A) = ∞ if A = ∅.   (56)
Then g* is lower semicontinuous.
Proof Since the empty set is an isolated point in 2^Y, we need only prove that g* is lower semicontinuous on 2^Y − {∅}. We have already shown [Proposition 7.32(a)] that, given a nonempty set A ∈ 2^Y, there exists y ∈ A
such that g*(A) = g(y). Let {A_n} ⊂ 2^Y be a sequence of nonempty sets with limit A ∈ 2^Y, and let y_n ∈ A_n be such that g*(A_n) = g(y_n), n = 1, 2, .... Choose a subsequence {y_{n_k}} such that
lim_{k→∞} g(y_{n_k}) = lim inf_{n→∞} g(y_n) = lim inf_{n→∞} g*(A_n).
The subsequence {y_{n_k}} accumulates at some y_0 ∈ Y, and, by Lemma 7.13(a),
g(y_0) ≤ lim_{k→∞} g(y_{n_k}) = lim inf_{n→∞} g*(A_n).
From (14) of Appendix C and from Proposition C.3, we have (in the notation of Appendix C)
y_0 ∈ lim_{n→∞} A_n = A,
so
g*(A) ≤ g(y_0) ≤ lim inf_{n→∞} g*(A_n).
The result follows from Lemma 7.13(a).
Q.E.D.
Lemma 7.16 Let Y be a compact metrizable space and let g: Y → R* be lower semicontinuous. Define G: 2^Y R* → 2^Y by
G(A, c) = A ∩ {y ∈ Y | g(y) ≤ c}.   (57)
Then G is Borel-measurable.
Proof We show that G is upper semicontinuous (K) (Definition C.2) and apply Proposition C.4. Let {(A_n, c_n)} ⊂ 2^Y R* be a sequence with limit (A, c). If lim sup_{n→∞} G(A_n, c_n) = ∅, then
lim sup_{n→∞} G(A_n, c_n) ⊂ G(A, c).   (58)
Otherwise, choose y ∈ lim sup_{n→∞} G(A_n, c_n). There is a sequence n_1 < n_2 < ⋯ of positive integers and a sequence y_{n_k} ∈ G(A_{n_k}, c_{n_k}), k = 1, 2, ..., such that y_{n_k} → y. By definition, y_{n_k} ∈ A_{n_k} for every k, so y ∈ lim_{n→∞} A_n = A. We also have g(y_{n_k}) ≤ c_{n_k}, k = 1, 2, ..., and using the lower semicontinuity of g, we obtain
g(y) ≤ lim inf_{k→∞} g(y_{n_k}) ≤ lim_{k→∞} c_{n_k} = c.
Therefore y ∈ G(A, c), (58) holds, and G is upper semicontinuous (K). Q.E.D.
Lemma 7.17 Let Y be a compact metrizable space and let g: Y → R* be lower semicontinuous. Let g*: 2^Y → R* be defined by (56) and define G*: 2^Y → 2^Y by
G*(A) = A ∩ {y ∈ Y | g(y) ≤ g*(A)}.   (59)
Then G* is Borel-measurable.
Proof Let G be the Borel-measurable function given by (57). Lemma 7.15 implies g* is Borel-measurable. A comparison of (57) and (59) shows that
G*(A) = G[A, g*(A)].
It follows that G* is also Borel-measurable.
Q.E.D.
Lemma 7.18 Let Y be a compact metrizable space. There is a Borel-measurable function σ: 2^Y − {∅} → Y such that σ(A) ∈ A for every A ∈ 2^Y − {∅}.
Proof Let {g_n | n = 1, 2, ...} be a subset of C(Y) which separates points in Y (for example, the one constructed in the proof of Proposition 7.7). As in Lemma 7.15, define g_n*: 2^Y → R* by
g_n*(A) = min_{y∈A} g_n(y) if A ≠ ∅,  g_n*(A) = ∞ if A = ∅,
and, as in Lemma 7.17, define G_n*: 2^Y → 2^Y by
G_n*(A) = A ∩ {y ∈ Y | g_n(y) ≤ g_n*(A)} = {y ∈ A | g_n(y) = g_n*(A)}.
Let H_n: 2^Y → 2^Y be defined recursively by
H_0(A) = A,  H_n(A) = G_n*[H_{n−1}(A)],  n = 1, 2, ....
Then for A ≠ ∅, each H_n(A) is nonempty and compact, and
A = H_0(A) ⊃ H_1(A) ⊃ H_2(A) ⊃ ⋯.
Therefore, ∩_{n=0}^∞ H_n(A) ≠ ∅. If y, y′ ∈ ∩_{n=0}^∞ H_n(A), then for n = 1, 2, ..., we have
g_n(y) = g_n*[H_{n−1}(A)] = g_n(y′).
Since {g_n | n = 1, 2, ...} separates points in Y, we have y = y′, and ∩_{n=0}^∞ H_n(A) must consist of a single point, which we denote by σ(A).
We show that for A ≠ ∅
lim_{n→∞} H_n(A) = ∩_{n=0}^∞ H_n(A) = {σ(A)}.   (60)
Since the sequence {H_n(A)} is nonincreasing, we have from (14) and (15) of Appendix C that
∩_{n=0}^∞ H_n(A) ⊂ lim inf_{n→∞} H_n(A) ⊂ lim sup_{n→∞} H_n(A).   (61)
If y ∈ lim sup_{n→∞} H_n(A), then there exist positive integers n_1 < n_2 < ⋯ and a sequence y_{n_k} ∈ H_{n_k}(A), k = 1, 2, ..., such that y_{n_k} → y. For fixed k, y_{n_j} ∈ H_{n_k}(A) for all j ≥ k, and since H_{n_k}(A) is closed, we have y ∈ H_{n_k}(A). Therefore, y ∈ ∩_{n=0}^∞ H_n(A) and
lim sup_{n→∞} H_n(A) ⊂ ∩_{n=0}^∞ H_n(A).   (62)
From relations (61) and (62), we obtain (60).
Since G_n* and H_n are Borel-measurable for every n, the mapping ν: 2^Y − {∅} → 2^Y defined by ν(A) = {σ(A)} is Borel-measurable. It is easily seen that the mapping τ: Y → 2^Y defined by τ(y) = {y} is a homeomorphism. Since Y is compact, τ(Y) is compact, thus closed in 2^Y, and τ^{-1}: τ(Y) → Y is Borel-measurable. Since σ = τ^{-1} ∘ ν, it follows that σ is Borel-measurable. Q.E.D.
Lemma 7.19 Let X be a metrizable space, Y a compact metrizable space, and let f: XY → R* be lower semicontinuous. Define F: XR* → 2^Y by
F(x, c) = {y ∈ Y | f(x, y) ≤ c}.   (63)
Then F is Borel-measurable.
Proof The proof is very similar to that of Lemma 7.16. We show that F is upper semicontinuous (K) and apply Proposition C.4. Let (x_n, c_n) → (x, c) in XR* and let y be an element of lim sup_{n→∞} F(x_n, c_n), provided this set is nonempty. There exist positive integers n_1 < n_2 < ⋯ and y_{n_k} ∈ F(x_{n_k}, c_{n_k}) such that y_{n_k} → y. Since f(x_{n_k}, y_{n_k}) ≤ c_{n_k} and f is lower semicontinuous, we conclude that f(x, y) ≤ c, so that lim sup_{n→∞} F(x_n, c_n) ⊂ F(x, c). The result follows. Q.E.D.
Lemma 7.20 Let X be a metrizable space, Y a compact metrizable space, and let f: XY → R* be lower semicontinuous. Let f*: X → R* be given by f*(x) = min_{y∈Y} f(x, y), and define F*: X → 2^Y by
F*(x) = {y ∈ Y | f(x, y) ≤ f*(x)}.   (64)
Then F* is Borel-measurable.
Proof Let F be the Borel-measurable function defined by (63). Proposition 7.32(a) implies that f* is Borel-measurable. From (63) and (64) we have
F*(x) = F[x, f*(x)].
It follows that F* is also Borel-measurable.
Q.E.D.
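The construction in the proof of Lemma 7.18 can be imitated on finite data (a sketch with hypothetical inputs, not part of the text): starting from a finite set A ⊂ [0, 1], repeatedly shrink to the argmin set of g_n(y) = |y − q_n|, where {q_n} enumerates the rationals in [0, 1]; since this family separates points, the sets shrink to the single point σ(A).

```python
from fractions import Fraction

# Finite-data sketch of the selector sigma of Lemma 7.18 (hypothetical example).
def rationals():
    # q_1, q_2, ... = 0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, ...
    yield Fraction(0)
    yield Fraction(1)
    d = 2
    while True:
        for num in range(1, d):
            q = Fraction(num, d)
            if q.denominator == d:      # skip non-reduced fractions
                yield q
        d += 1

def selector(A, max_steps=50):
    # H_0 = A; H_n = argmin of g_n over H_{n-1}; stop once a point remains.
    H = set(A)
    for q, _ in zip(rationals(), range(max_steps)):
        m = min(abs(y - q) for y in H)
        H = {y for y in H if abs(y - q) == m}
        if len(H) == 1:
            break
    return min(H)                       # tie-break if enumeration exhausted

A = {Fraction(1, 3), Fraction(2, 3), Fraction(1, 5)}
assert selector(A) in A
print(selector(A))
```

In the lemma the same refinement runs over all n, and measurability of each step (Lemmas 7.15-7.17) is what makes the resulting selector Borel-measurable.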
We are now ready to prove the selection theorem for lower semicontinuous functions.
Proposition 7.33 Let X be a metrizable space, Y a compact metrizable space, D a closed subset of XY, and let f: D → R* be lower semicontinuous. Let f*: proj_X(D) → R* be given by
f*(x) = min_{y∈D_x} f(x, y).   (65)
Then proj_X(D) is closed in X, f* is lower semicontinuous, and there exists a Borel-measurable function φ: proj_X(D) → Y such that Gr(φ) ⊂ D and
f[x, φ(x)] = f*(x)  ∀x ∈ proj_X(D).   (66)
Proof We first prove the result for the case where D = XY. As in Lemma 7.18, let σ: 2^Y − {∅} → Y be a Borel-measurable function satisfying σ(A) ∈ A for every A ∈ 2^Y − {∅}. As in Lemma 7.20, let F*: X → 2^Y be the Borel-measurable function defined by F*(x) = {y ∈ Y | f(x, y) = f*(x)}. Proposition 7.32(a) implies that f* is lower semicontinuous and F*(x) ≠ ∅ for every x ∈ X. The composition φ = σ ∘ F* satisfies (66).
Suppose now that D is not necessarily XY. To see that proj_X(D) is closed, note that the function g = −χ_D is lower semicontinuous and
proj_X(D) = {x ∈ X | g*(x) ≤ −1},
where g*(x) = min_{y∈Y} g(x, y). By the special case of the proposition already proved, g* is lower semicontinuous, proj_X(D) is closed, and there is a Borel-measurable function φ_1: X → Y such that g[x, φ_1(x)] = g*(x) for every x ∈ X or, equivalently, (x, φ_1(x)) ∈ D for every x ∈ proj_X(D). Define now the lower semicontinuous function f̄: XY → R* by
f̄(x, y) = f(x, y) if (x, y) ∈ D,  f̄(x, y) = ∞ otherwise.
For all c ∈ R,
{x ∈ proj_X(D) | f*(x) ≤ c} = {x ∈ X | min_{y∈Y} f̄(x, y) ≤ c}.
Since min_{y∈Y} f̄(x, y) is lower semicontinuous, it follows that f* is also lower semicontinuous. Let φ_2: X → Y be a Borel-measurable function satisfying
f̄[x, φ_2(x)] = min_{y∈Y} f̄(x, y).
Clearly (x, φ_2(x)) ∈ D for all x in the Borel set
{x ∈ X | min_{y∈Y} f̄(x, y) < ∞}.
Define φ: proj_X(D) → Y by
φ(x) = φ_1(x) if min_{y∈Y} f̄(x, y) = ∞,
φ(x) = φ_2(x) if min_{y∈Y} f̄(x, y) < ∞.
Then φ is Borel-measurable, Gr(φ) ⊂ D, and (66) holds. Q.E.D.
Lemma 7.21 Let X be a metrizable space, Y a separable metrizable space, and G an open subset of XY. Then proj_X(G) is open in X, and there exists a Borel-measurable function φ: proj_X(G) → Y such that Gr(φ) ⊂ G.
Proof Let {y_n | n = 1, 2, ...} be a countable dense subset of Y. For fixed y ∈ Y, the mapping x → (x, y) is continuous, so {x ∈ X | (x, y) ∈ G} is open. Let G_n = {x ∈ X | (x, y_n) ∈ G}, and note that proj_X(G) = ∪_{n=1}^∞ G_n is open. Define φ: proj_X(G) → Y by
φ(x) = y_1 if x ∈ G_1,
φ(x) = y_n if x ∈ G_n − ∪_{k=1}^{n−1} G_k,  n = 2, 3, ....
Then φ is Borel-measurable and Gr(φ) ⊂ G.
Q.E.D.
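The selector of Lemma 7.21 has a direct computational analogue (a sketch with hypothetical data, not part of the text): scan a countable dense sequence {y_n} and return the first y_n with (x, y_n) ∈ G.

```python
# Sketch of the selector of Lemma 7.21. Hypothetical data: G = {(x, y) : y > x}
# is open in the plane, and the dyadic rationals are dense in Y = (0, 1).

def dyadics():
    # y_1, y_2, ... = 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, ...
    d = 2
    while True:
        for num in range(1, d, 2):
            yield num / d
        d *= 2

def phi(x, max_terms=1000):
    # First dense point y_n with (x, y_n) in G. Measurability comes from the
    # countable partition by the open sets G_n = {x : (x, y_n) in G}.
    for y, _ in zip(dyadics(), range(max_terms)):
        if y > x:                       # membership test (x, y) in G
            return y
    raise ValueError("x not in proj_X(G) up to max_terms")

assert phi(0.3) == 0.5                  # y_1 = 1/2 already works
assert phi(0.6) == 0.75                 # first dyadic above 0.6 in this order
print(phi(0.3), phi(0.6), phi(0.9))
```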
Proposition 7.34 Let X be a metrizable space, Y a separable metrizable space, D an open subset of XY, and let f: D → R* be upper semicontinuous. Let f*: proj_X(D) → R* be given by
f*(x) = inf_{y∈D_x} f(x, y).   (67)
Then proj_X(D) is open in X, f* is upper semicontinuous, and for every ε > 0, there exists a Borel-measurable function φ_ε: proj_X(D) → Y such that
Gr(φ_ε) ⊂ D and for all x ∈ proj_X(D)
f[x, φ_ε(x)] ≤ f*(x) + ε if f*(x) > −∞,
f[x, φ_ε(x)] ≤ −1/ε if f*(x) = −∞.   (68)
Proof The set proj_X(D) is open in X by Lemma 7.21. To show that f* is upper semicontinuous, define an upper semicontinuous function f̄: XY → R* by
f̄(x, y) = f(x, y) if (x, y) ∈ D,  f̄(x, y) = ∞ otherwise.
For c ∈ R, we have
{x ∈ proj_X(D) | f*(x) < c} = {x ∈ X | inf_{y∈Y} f̄(x, y) < c},
and this set is open by Proposition 7.32(b). Let ε > 0 be given. For k = 0, ±1, ±2, ..., define (see Fig. 7.1)
A(k) = {(x, y) ∈ D | f(x, y) < kε},
B(k) = {x ∈ proj_X(D) | (k − 1)ε ≤ f*(x) < kε},
B(−∞) = {x ∈ proj_X(D) | f*(x) = −∞},
B(∞) = {x ∈ proj_X(D) | f*(x) = ∞}.
[...]
It follows that N(R) = N(S). Q.E.D.
Note that if a regular Suslin scheme R satisfies (80), then for all z in the set
𝒩₁ = {z ∈ 𝒩 | ∩_{s<z} R(s) ≠ ∅},
[...]
Proposition 7.37 [...] there is a closed subset 𝒩₁ of 𝒩 and a continuous function f: 𝒩₁ → X such that A = f(𝒩₁). Conversely, if 𝒩₁ ⊂ 𝒩 is closed and f: 𝒩₁ → X is continuous, then f(𝒩₁) is analytic.
Proof Let A = N(R) be nonempty, where R is a regular Suslin scheme for ℱ_X satisfying (80). Define
7.6 ANALYTIC SETS
Let z = (ζ_1, ζ_2, ...) be in 𝒩₁. If for each n we have R(ζ_1, ζ_2, ..., ζ_n) ≠ ∅, then it is possible to choose x_n ∈ R(ζ_1, ζ_2, ..., ζ_n). The sequence {x_n} is Cauchy by (80), and since (X, d) is complete, {x_n} has a limit x ∈ X. But for each n the regularity of R implies {x_m | m ≥ n} ⊂ R(ζ_1, ζ_2, ..., ζ_n), so x ∈ R(ζ_1, ζ_2, ..., ζ_n). Therefore x ∈ ∩_{s<z} [...] Given ε > 0, (80) implies that there exists s_n < z_0 for which diam R(s_n) < ε. For k sufficiently large, z_k ∈ {z ∈ 𝒩 | s_n < z}, so f(z_k) ∈ R(s_n). Therefore d(f(z_k), f(z_0)) ≤ diam R(s_n) < ε, which shows that f is continuous. For the converse, suppose 𝒩₁ ⊂ 𝒩 is closed and f: 𝒩₁ → X is continuous. Define a regular Suslin scheme R for ℱ_X by
R(s) = the closure of f({z ∈ 𝒩₁ | s < z}),
where R(s) = ∅ if {z ∈ 𝒩₁ | s < z} = ∅. If z ∈ 𝒩₁, then f(z) ∈ ∩ [...]
[...] the mapping φ(z) = (z_1, z_2, ...) of 𝒩 into 𝒩𝒩⋯ given by (86), where z_k consists of the components of z with indices in I_k. Then φ is one-to-one and onto and, because convergence in a product space is componentwise, φ is a homeomorphism. Q.E.D.
Combination of Lemma 7.25 with Proposition 7.37 gives the following.
Proposition 7.38 Let X_1, X_2, ... be a sequence of Borel spaces and A_k an analytic subset of X_k, k = 1, 2, .... Then the sets A_1A_2⋯ and A_1A_2⋯A_n, n = 1, 2, ..., are analytic subsets of X_1X_2⋯ and X_1X_2⋯X_n, respectively.
Proof Let f_k: 𝒩 → X_k be continuous such that A_k = f_k(𝒩), k = 1, 2, .... Let φ be given by (86) and F: 𝒩𝒩⋯ → X_1X_2⋯ be given by
F(z_1, z_2, ...) = (f_1(z_1), f_2(z_2), ...).
Then F ∘ φ is continuous and maps 𝒩 onto A_1A_2⋯. The finite products are handled similarly. Q.E.D.
Another consequence of Proposition 7.37 is that the continuous image of an analytic set, in particular, the projection of an analytic set, is analytic. As discussed at the beginning of this section, this property motivated our inquiry into analytic sets. We formalize this and a related fact to obtain another characterization of analytic sets.
Proposition 7.39 Let X and Y be Borel spaces and A an analytic subset of XY. Then proj_X(A) is analytic. Conversely, given any analytic set C ⊂ X and any uncountable Borel space Y, there is a Borel set B ⊂ XY such that C = proj_X(B). If Y = 𝒩, B can be chosen to be closed.
Proof If A = f(𝒩) ⊂ XY is analytic, where f is continuous, then proj_X(A) = (proj_X ∘ f)(𝒩) is analytic by Proposition 7.37. If C = f(𝒩) ⊂ X is nonempty and analytic, then
C = proj_X[Gr(f)],
where Gr(f) = {(f(z), z) ∈ X𝒩 | z ∈ 𝒩} is closed because f is continuous. If Y is any uncountable Borel space, then there exists a Borel isomorphism φ from 𝒩 onto Y (Corollary 7.16.1). The mapping ψ defined by
ψ(x, z) = (x, φ(z))
is a Borel isomorphism from X𝒩 onto XY, and C = proj_X(ψ[Gr(f)]). Q.E.D.
So far we have treated only the continuous images of analytic sets. With the aid of Proposition 7.39, we can consider their images under Borel-measurable functions as well.
Proposition 7.40 Let X and Y be Borel spaces and f: X → Y a Borel-measurable function. If A ⊂ X is analytic, then f(A) is analytic. If B ⊂ Y is analytic, then f^{-1}(B) is analytic.
Proof Suppose A ⊂ X is analytic. By Proposition 7.39, there exists a Borel set B ⊂ X𝒩 such that A = proj_X(B). Define ψ: B → Y by ψ(x, z) = f(x). Then ψ is Borel-measurable, and Corollary 7.14.1 implies that Gr(ψ) ∈ ℬ_{X𝒩Y}. Finally, f(A) = proj_Y[Gr(ψ)] is analytic by Proposition 7.39. If B ⊂ Y is analytic, then B = N(S), where S is some Suslin scheme for ℱ_Y. Then f^{-1}(B) = N(f^{-1} ∘ S), where f^{-1} ∘ S is the Suslin scheme for ℬ_X defined by
(f^{-1} ∘ S)(s) = f^{-1}[S(s)].
The analyticity of f^{-1}(B) follows from Proposition 7.36.
Q.E.D.
We summarize the equivalent definitions of analytic sets in Borel spaces.
Proposition 7.41 Let X be a Borel space. The following definitions of the collection of analytic subsets of X are equivalent:
(a) 𝒮(ℱ_X);
(b) 𝒮(ℬ_X);
(c) the empty set and the images of 𝒩 under continuous functions from 𝒩 into X;
(d) the projections into X of the closed subsets of X𝒩;
(e) the projections into X of the Borel subsets of XY, where Y is an uncountable Borel space;
(f) the images of Borel subsets of Y under Borel-measurable functions from Y into X, where Y is an uncountable Borel space.
Proof The only new characterization here is (f). If Y is an uncountable Borel space and f: Y → X is Borel-measurable, then for every B ∈ ℬ_Y, f(B) is analytic in X by Proposition 7.40. To show that every nonempty analytic set A ⊂ X can be obtained this way, let φ be a Borel isomorphism from Y onto X𝒩 and let F ⊂ X𝒩 be closed and satisfy proj_X(F) = A. Define B = φ^{-1}(F) ∈ ℬ_Y. Then (proj_X ∘ φ)(B) = A. If A = ∅, then f(∅) = A for any Borel-measurable f: Y → X. Q.E.D.

7.6.2 Measurability Properties of Analytic Sets
At the beginning of this section we indicated that extended real-valued functions on a Borel space X whose lower level sets are analytic arise naturally via partial infimization. Because the collection of analytic subsets of an uncountable Borel space is strictly larger than the Borel σ-algebra (Appendix B), such functions need not be Borel-measurable. Nonetheless, they can be integrated with respect to any probability measure on (X, ℬ_X). To show this, we must discuss the measurability properties of analytic sets.
If X is a Borel space and p ∈ P(X), we define p-outer measure, denoted by p*, on the set of all subsets of X by
p*(E) = inf{p(B) | E ⊂ B, B ∈ ℬ_X}.   (87)
Outer measure on an increasing sequence of sets has a convergence property, namely, p*(E_n) ↑ p*(∪_{n=1}^∞ E_n) if E_1 ⊂ E_2 ⊂ ⋯. This is easy to verify from (87) and also follows from Eq. (5) and Proposition A.1 of Appendix A (see also Ash [A1, Lemma 1.3.3(d)]). The collection of sets ℬ_X(p) defined by
ℬ_X(p) = {E ⊂ X | p*(E) + p*(X − E) = 1}
is a σ-algebra (Ash [A1, Theorem 1.3.5]), called the completion of ℬ_X with respect to p. It can be described as the class of sets of the form B ∪ N as B ranges over ℬ_X and N ranges over all subsets of sets of p-measure zero in ℬ_X (Ash [A1, p. 18]), and we have p*(B ∪ N) = p(B).
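On a finite space, (87) and the membership test for the completion can be checked by brute force. A hypothetical example, not from the text:

```python
from itertools import chain, combinations

# X = {1,2,3,4}; the sigma-algebra is generated by the blocks {1,2} and {3,4}.
X = frozenset({1, 2, 3, 4})
B = [frozenset(), frozenset({1, 2}), frozenset({3, 4}), X]
p = {frozenset(): 0.0, frozenset({1, 2}): 0.3, frozenset({3, 4}): 0.7, X: 1.0}

def p_outer(E):
    # (87): infimum of p(B) over measurable supersets B of E
    return min(p[b] for b in B if E <= b)

def powerset(s):
    return map(frozenset,
               chain.from_iterable(combinations(s, r) for r in range(len(s) + 1)))

# E is in the completion B_X(p) iff p*(E) + p*(X - E) = 1.
completion = [E for E in powerset(X)
              if abs(p_outer(E) + p_outer(X - E) - 1.0) < 1e-12]
assert frozenset({1}) not in completion   # p*({1}) + p*({2,3,4}) = 0.3 + 1.0
assert frozenset({1, 2}) in completion
print(len(completion))
```

Here no block has p-measure zero, so the completion adds nothing; had we set p({1,2}) = 0, every subset of {1,2} would join the completion as a subset of a null set.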
Furthermore, p* restricted to ℬ_X(p) is a probability measure, and is the only extension of p to ℬ_X(p) that is a probability measure. In what follows, we denote this measure also by p and write p(E) in place of p*(E) for all E ∈ ℬ_X(p).
Definition 7.18 Let X be a Borel space. The universal σ-algebra 𝒰_X is defined by 𝒰_X = ∩_{p∈P(X)} ℬ_X(p). If E ∈ 𝒰_X, we say E is universally measurable.
The usefulness of analytic sets in measure theory is in large degree derived from the following proposition.
Proposition 7.42 (Lusin's theorem) Let X be a Borel space and S a Suslin scheme for 𝒰_X. Then N(S) is universally measurable. In other words, 𝒮(𝒰_X) = 𝒰_X.
Proof Denote A = N(S), where S is a Suslin scheme for 𝒰_X. For (σ_1, ..., σ_k) ∈ Σ, define
N(σ_1, ..., σ_k) = {(ζ_1, ζ_2, ...) ∈ 𝒩 | ζ_1 = σ_1, ..., ζ_k = σ_k}   (88)
and
M(σ_1, ..., σ_k) = {(ζ_1, ζ_2, ...) ∈ 𝒩 | ζ_1 ≤ σ_1, ..., ζ_k ≤ σ_k} = ∪_{τ_1≤σ_1,...,τ_k≤σ_k} N(τ_1, ..., τ_k).   (89)
Define also
R(σ_1, ..., σ_k) = ∪_{z∈M(σ_1,...,σ_k)} ∩_{s<z} S(s).   (90)
[...]
This proves (95), i.e., as k → ∞, K(ζ_1, ..., ζ_k) decreases to a set contained in A, and X − K(ζ_1, ..., ζ_k) increases to a set containing X − A. Letting k → ∞ in (94), we obtain
1 ≥ p*(A) − ε + p*(X − A).
Since ε > 0 is arbitrary, this implies that
1 ≥ p*(A) + p*(X − A).
It is true for any E ⊂ X that p*(E) + p*(X − E) ≥ 1, so
p*(A) + p*(X − A) = 1,
and A is measurable with respect to p.
Q.E.D.
Corollary 7.42.1 Let X be a Borel space. Every analytic subset of X is universally measurable.
Proof The closed subsets of X are universally measurable, so 𝒮(ℱ_X) ⊂ 𝒰_X by Proposition 7.42. Q.E.D.
As remarked earlier, the class of analytic subsets of an uncountable Borel space is not a σ-algebra, so there are universally measurable sets which are not analytic. In fact, we show in Appendix B that in any uncountable Borel space, the universal σ-algebra is strictly larger than the σ-algebra generated by the analytic subsets.

7.6.3 An Analytic Set of Probability Measures

In Proposition 7.25 we saw that when X is a Borel space, the function θ_A: P(X) → [0, 1] defined by θ_A(p) = p(A) is Borel-measurable for every Borel-measurable A ⊂ X. We now investigate the properties of this function when A is analytic. The main result is that the set {p ∈ P(X) | p(A) ≥ c} is analytic for each real c.
Proposition 7.43 Let X be a Borel space and A an analytic subset of X. For each c ∈ R, the set {p ∈ P(X) | p(A) ≥ c} is analytic.
Proof Let S be a Suslin scheme for ℱ_X, the class of closed subsets of X, such that A = N(S). For s ∈ Σ, let N(s), M(s), R(s), and K(s) be defined by (88)-(90) and (92). Then (91) and (95) hold and each K(s) is closed. We show that for c ∈ R
{p ∈ P(X) | p(A) ≥ c} = ∩_{n=1}^∞ ∪_{z∈𝒩} ∩_{s<z} {p ∈ P(X) | p[K(s)] ≥ c − (1/n)}.   (98)
[...]
Proceed as before. In the general case,
∫ f(x, y) q(dy|x) = ∫ f⁺(x, y) q(dy|x) − ∫ f⁻(x, y) q(dy|x).
The functions f⁺ and −f⁻ are lower semianalytic, so by the preceding arguments each of the summands on the right is lower semianalytic. The result follows from Lemma 7.30(4). Q.E.D.
Corollary 7.48.1 Let X be a Borel space and let f: X → R* be lower semianalytic. Then the function θ_f: P(X) → R* defined by
θ_f(p) = ∫ f dp
is lower semianalytic.
Proof Define a Borel-measurable stochastic kernel on X given P(X) by q(dx|p) = p(dx). Apply Proposition 7.48. Q.E.D.
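For a concrete sense of θ_f (a hypothetical numerical sketch, not from the text): on a finite X, θ_f(p) = ∫ f dp reduces to a dot product, and linearity of the integral makes θ_f affine in p.

```python
# theta_f on a finite space X = {0, 1, 2}: integration is a dot product.
f = [1.0, -2.0, 4.0]          # values of f on X (hypothetical)
p = [0.5, 0.25, 0.25]         # two probability measures on X
q = [0.1, 0.8, 0.1]

def theta(mu):
    return sum(fx * m for fx, m in zip(f, mu))

# theta_f is affine: theta(lam*p + (1-lam)*q) = lam*theta(p) + (1-lam)*theta(q).
lam = 0.3
mix = [lam * a + (1 - lam) * b for a, b in zip(p, q)]
assert abs(theta(mix) - (lam * theta(p) + (1 - lam) * theta(q))) < 1e-12
print(theta(p), theta(q))
```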
As an aid in proving the selection theorem for lower semianalytic functions, we give a result concerning selection in an analytic subset of a product of Borel spaces. The reader will notice a strong resemblance between this result and Lemma 7.21, which was instrumental in proving the selection theorem for upper semicontinuous functions.
Proposition 7.49 (Jankov-von Neumann theorem) Let X and Y be Borel spaces and A an analytic subset of XY. There exists an analytically measurable function φ: proj_X(A) → Y such that Gr(φ) ⊂ A.
Proof (See Fig. 7.2.) Let f: 𝒩 → XY be continuous such that A = f(𝒩). Let g = proj_X ∘ f. Then g: 𝒩 → X is continuous from 𝒩 onto proj_X(A). For x ∈ proj_X(A), g^{-1}({x}) is a closed nonempty subset of 𝒩. Let ζ_1(x) be the smallest integer which is the first component of an element z_1 ∈ g^{-1}({x}). Let ζ_2(x) be the smallest integer which is the second component of an element z_2 ∈ g^{-1}({x}) whose first component is ζ_1(x). In general, let ζ_k(x) be the smallest integer which is the kth component of an element z_k ∈ g^{-1}({x}) whose first (k − 1) components are ζ_1(x), ..., ζ_{k−1}(x). Let ψ(x) = (ζ_1(x), ζ_2(x), ...). Since z_k → ψ(x), we have
ψ(x) ∈ g^{-1}({x}).   (104)
Define φ: proj_X(A) → Y by φ = proj_Y ∘ f ∘ ψ, so that Gr(φ) ⊂ A.

FIGURE 7.2
7.7 UNIVERSALLY MEASURABLE SELECTION
We show that φ is analytically measurable. As in the proof of Proposition 7.42, for (σ_1, ..., σ_k) ∈ Σ let
N(σ_1, ..., σ_k) = {(ζ_1, ζ_2, ...) ∈ 𝒩 | ζ_1 = σ_1, ..., ζ_k = σ_k},
M(σ_1, ..., σ_k) = {(ζ_1, ζ_2, ...) ∈ 𝒩 | ζ_1 ≤ σ_1, ..., ζ_k ≤ σ_k}.
We first show that ψ is analytically measurable, i.e., ψ^{-1}(ℬ_𝒩) ⊂ 𝒜_X. Since {N(s) | s ∈ Σ} is a base for the topology on 𝒩, by the remark following Definition 7.6, we have σ({N(s) | s ∈ Σ}) = ℬ_𝒩. Then
ψ^{-1}(ℬ_𝒩) = ψ^{-1}[σ({N(s) | s ∈ Σ})] = σ[ψ^{-1}({N(s) | s ∈ Σ})],
and it suffices to prove
ψ^{-1}[N(s)] ∈ 𝒜_X for every s ∈ Σ.   (105)
We claim that for s = (σ_1, σ_2, ..., σ_k) ∈ Σ
ψ^{-1}[N(s)] = g[M(s)] − ∪_{j=1}^k g[M(σ_1, ..., σ_{j−1}, σ_j − 1)],   (106)
where M(σ_1, ..., σ_{j−1}, σ_j − 1) = ∅ if σ_j − 1 = 0. We show this by proving that ψ^{-1}[N(s)] is a subset of the set on the right-hand side of (106) and vice versa.
Suppose x ∈ ψ^{-1}[N(s)]. Let ψ(x) = (ζ_1(x), ζ_2(x), ...). Then
ψ(x) ∈ N(s) ⊂ M(s),   (107)
so (104) implies
x = g[ψ(x)] ∈ g[M(s)].   (108)
Relation (107) also implies ζ_1(x) = σ_1, ..., ζ_k(x) = σ_k. By the construction of ψ, we have that σ_1 is the smallest integer which is the first component of an element of g^{-1}({x}), and for j = 2, ..., k, σ_j is the smallest integer which is the jth component of an element of g^{-1}({x}) whose first (j − 1) components are σ_1, ..., σ_{j−1}. In other words,
g^{-1}({x}) ∩ M(σ_1, ..., σ_{j−1}, σ_j − 1) = ∅,  j = 1, ..., k.
It follows that
x ∉ ∪_{j=1}^k g[M(σ_1, ..., σ_{j−1}, σ_j − 1)].   (109)
Relations (108) and (109) imply
ψ^{-1}[N(s)] ⊂ g[M(s)] − ∪_{j=1}^k g[M(σ_1, ..., σ_{j−1}, σ_j − 1)].   (110)
To prove the reverse set containment, suppose
x ∈ g[M(s)] − ∪_{j=1}^k g[M(σ_1, ..., σ_{j−1}, σ_j − 1)].   (111)
Since x ∈ g[M(s)], there must exist y = (η_1, η_2, ...) ∈ g^{-1}({x}) such that
η_1 ≤ σ_1, ..., η_k ≤ σ_k.   (112)
Clearly, x ∈ proj_X(A) = g(𝒩), so ψ(x) is defined. Let ψ(x) = (ζ_1(x), ζ_2(x), ...). By (104), we have g[ψ(x)] = x, so (111) implies
ψ(x) ∉ M(σ_1, ..., σ_{j−1}, σ_j − 1),  j = 1, 2, ..., k.
Since ψ(x) ∉ M(σ_1 − 1), we know that ζ_1(x) ≥ σ_1. But ζ_1(x) is the smallest integer which is the first component of an element of g^{-1}({x}), so (112) implies ζ_1(x) ≤ η_1 ≤ σ_1. Therefore ζ_1(x) = σ_1. Similarly, since ψ(x) ∉ M(ζ_1(x), σ_2 − 1), we have ζ_2(x) ≥ σ_2. Again from (112) we see that ζ_2(x) ≤ η_2 ≤ σ_2, so ζ_2(x) = σ_2. Continuing in this manner, we show that ψ(x) ∈ N(s), i.e., x ∈ ψ^{-1}[N(s)], and
ψ^{-1}[N(s)] ⊃ g[M(s)] − ∪_{j=1}^k g[M(σ_1, ..., σ_{j−1}, σ_j − 1)].   (113)
Relations (110) and (113) imply (106).
We note now that M(t) is open in 𝒩 for every t ∈ Σ, so g[M(t)] is analytic by Proposition 7.40. Relation (105) now follows from (106), so ψ is analytically measurable. By the definition of φ and the Borel-measurability of f and proj_Y, we have
φ^{-1}(ℬ_Y) = ψ^{-1}[f^{-1}(proj_Y^{-1}(ℬ_Y))] ⊂ ψ^{-1}(ℬ_𝒩).
We have just proved ψ^{-1}(ℬ_𝒩) ⊂ 𝒜_X, and the analytic measurability of φ follows. Q.E.D.
This brings us to the selection theorem for lower semianalytic functions.
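The lexicographic construction of ψ in the proof of Proposition 7.49 can be imitated on finite data (a hypothetical sketch, not from the text): among all preimages z ∈ g^{-1}({x}), select the lexicographically smallest, coordinate by coordinate.

```python
# Finite sketch of the Jankov-von Neumann selector: finitely many truncated
# "sequences" z (stand-ins for elements of the Baire space) mapped by g to
# states x; psi(x) is the lexicographically smallest element of g^{-1}({x}).
preimages = {
    "a": [(2, 1, 3), (1, 5, 2), (1, 4, 9)],   # g^{-1}({a}), hypothetical data
    "b": [(3, 3, 3)],
}

def psi(x):
    candidates = preimages[x]
    for k in range(len(candidates[0])):
        zk = min(z[k] for z in candidates)    # smallest k-th component
        candidates = [z for z in candidates if z[k] == zk]
    return candidates[0]

assert psi("a") == (1, 4, 9)   # first components: 1 is smallest; then 4 < 5
assert psi("b") == (3, 3, 3)
print(psi("a"))
```

The measurability argument in the proof amounts to showing that each coordinate-wise refinement above corresponds to set differences of analytic sets, which is where analytic (rather than Borel) measurability of the selector comes from.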
Proposition 7.50 Let X and Y be Borel spaces, D ⊂ XY an analytic set, and f: D → R* a lower semianalytic function. Define f*: proj_X(D) → R* by
f*(x) = inf_{y∈D_x} f(x, y).   (114)
(a) For every ε > 0, there exists an analytically measurable function φ: proj_X(D) → Y such that Gr(φ) ⊂ D and for all x ∈ proj_X(D),
f[x, φ(x)] ≤ f*(x) + ε if f*(x) > −∞,
f[x, φ(x)] ≤ −1/ε if f*(x) = −∞.
(b) The set
I = {x ∈ proj_X(D) | for some y_x ∈ D_x, f(x, y_x) = f*(x)}
is universally measurable, and for every ε > 0 there exists a universally measurable function φ: proj_X(D) → Y such that Gr(φ) ⊂ D and for all x ∈ proj_X(D)
f[x, φ(x)] = f*(x) if x ∈ I,   (115)
f[x, φ(x)] ≤ f*(x) + ε if x ∉ I, f*(x) > −∞,
f[x, φ(x)] ≤ −1/ε if x ∉ I, f*(x) = −∞.   (116)
Proof (a) (Cf. proof of Proposition 7.34 and Fig. 7.1.) The function f* is lower semianalytic by Proposition 7.47. For k = 0, ±1, ±2, ..., define
A(k) = {(x, y) ∈ D | f(x, y) < kε},
B(k) = {x ∈ proj_X(D) | (k − 1)ε ≤ f*(x) < kε},
B(−∞) = {x ∈ proj_X(D) | f*(x) = −∞},
B(∞) = {x ∈ proj_X(D) | f*(x) = ∞}.
The sets A(k), k = 0, ±1, ±2, ..., and B(−∞) are analytic, while the sets B(k), k = 0, ±1, ±2, ..., and B(∞) are analytically measurable. By the Jankov-von Neumann theorem (Proposition 7.49) there exists, for each k = 0, ±1, ±2, ..., an analytically measurable φ_k: proj_X[A(k)] → Y with (x, φ_k(x)) ∈ A(k) for all x ∈ proj_X[A(k)], and an analytically measurable φ̄: proj_X(D) → Y such that (x, φ̄(x)) ∈ D for all x ∈ proj_X(D). Let k* be an integer such that k* ≤ −1/ε². Define φ: proj_X(D) → Y by
φ(x) = φ_k(x) if x ∈ B(k), k = 0, ±1, ±2, ...,
φ(x) = φ̄(x) if x ∈ B(∞),
φ(x) = φ_{k*}(x) if x ∈ B(−∞).
Since B(k) ⊂ proj_X[A(k)] and B(−∞) ⊂ proj_X[A(k)] for all k, this definition is possible. It is clear that φ is analytically measurable and Gr(φ) ⊂ D. If x ∈ B(k), then (x, φ_k(x)) ∈ A(k) and we have
f[x, φ(x)] = f[x, φ_k(x)] < kε ≤ f*(x) + ε.
If x ∈ B(∞), then f(x, y) = ∞ for all y ∈ D_x and f[x, φ(x)] = ∞ = f*(x). If x ∈ B(−∞), we have
f[x, φ(x)] = f[x, φ_{k*}(x)] < k*ε ≤ −1/ε.
Hence φ has the required properties.
(b) Consider the set E ⊂ XYR* defined by
E = {(x, y, b) | (x, y) ∈ D, f(x, y) ≤ b}.
Since
E = ∩_{k=1}^∞ ∪_{r∈Q} {(x, y, b) | (x, y) ∈ D, f(x, y) ≤ r, r ≤ b + (1/k)},
it follows from Corollary 7.35.2 and Proposition 7.38 that E is analytic in XYR*, and hence the set
A = proj_{XR*}(E)
is analytic in XR*. The mapping T: proj_X(D) → XR* defined by
T(x) = (x, f*(x))
is analytically measurable, and
I = {x | (x, f*(x)) ∈ A} = T^{-1}(A).
Hence I is universally measurable by Corollary 7.44.2. Since E is analytic, there is, by the Jankov-von Neumann theorem, an analytically measurable p: A → Y such that (x, p(x, b), b) ∈ E for every (x, b) ∈ A. Define ψ: I → Y by
ψ(x) = p(x, f*(x)) = (p ∘ T)(x)  ∀x ∈ I.
Then ψ is universally measurable by Corollary 7.44.2, and by construction f[x, ψ(x)] ≤ f*(x) for x ∈ I. Hence
f[x, ψ(x)] = f*(x)  ∀x ∈ I.   (117)
By part (a) there exists an analytically measurable ψ_ε: proj_X(D) → Y such that
f[x, ψ_ε(x)] ≤ f*(x) + ε if f*(x) > −∞,
f[x, ψ_ε(x)] ≤ −1/ε if f*(x) = −∞.   (118)
Define φ: proj_X(D) → Y by
φ(x) = ψ(x) if x ∈ I,
φ(x) = ψ_ε(x) if x ∈ proj_X(D) − I.
Then φ is universally measurable and, by (117) and (118), it has the required properties. Q.E.D.
Since the composition of analytically measurable functions can fail to be analytically measurable (Appendix B), the selector obtained in the proof
of Proposition 7.50(b) can fail to be analytically measurable. The composition of universally measurable functions is universally measurable, and so we obtained a selector which is universally measurable. However, there is a σ-algebra, which we call the limit σ-algebra, lying between 𝒜_X and 𝒰_X such that the composition of limit-measurable functions is again limit-measurable. We discuss this σ-algebra in Appendix B and state a strengthened version of Proposition 7.50 in Section 11.1.
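The ε-optimal selector of Proposition 7.50(a) is easy to visualize numerically. A hypothetical sketch (not from the text; a full rectangle D = X × Y on grids stands in for a general analytic D): for each x, take the first y in a fixed enumeration whose value is within ε of the partial infimum f*(x).

```python
import numpy as np

# Epsilon-optimal selection on grids: f(x, y) = (y - x)**2 + y on [0,1]^2.
eps = 1e-2
xs = np.linspace(0.0, 1.0, 101)
ys = np.linspace(0.0, 1.0, 1001)            # grid standing in for Y

F = (ys[None, :] - xs[:, None]) ** 2 + ys[None, :]
f_star = F.min(axis=1)                      # partial infimum f*(x) on the grid

# phi(x): first y (in the fixed grid enumeration) with f(x, y) <= f*(x) + eps,
# mimicking the level-set sets B(k) = {x : (k-1)eps <= f*(x) < k eps}.
phi_idx = np.argmax(F <= f_star[:, None] + eps, axis=1)
phi = ys[phi_idx]

chosen = F[np.arange(len(xs)), phi_idx]
assert np.all(chosen <= f_star + eps)       # epsilon-optimality at every x
print(float(phi[xs.searchsorted(0.5)]))
```

In the finite setting the selector is trivially measurable; the content of the proposition is that this can still be done measurably when X, Y are Borel spaces and f is only lower semianalytic.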
Chapter 8
The Finite Horizon Borel Model
In Chapters 8-10 we will treat a model very similar to that of Section 2.3.2. An applications-oriented treatment of that model can be found in "Dynamic Programming and Stochastic Control" by Bertsekas [B4], hereafter referred to as DPSC. The model of Section 2.3.2 and DPSC has a countable disturbance space and arbitrary state and control spaces, whereas the model treated here will have Borel state, control, and disturbance spaces.

8.1
The Model
Definition 8.1 A finite horizon stochastic optimal control model is a nine-tuple (S, C, U, W, p, f, α, g, N) as described here. The letters x and u are used to denote elements of S and C, respectively.
S  State space. A nonempty Borel space.
C  Control space. A nonempty Borel space.
U  Control constraint. A function from S to the set of nonempty subsets of C. The set

Γ = {(x, u) | x ∈ S, u ∈ U(x)}   (1)

is assumed to be analytic in SC.
W  Disturbance space. A nonempty Borel space.
p(dw|x, u)  Disturbance kernel. A Borel-measurable stochastic kernel on W given SC.
f  System function. A Borel-measurable function from SCW to S.
α  Discount factor. A positive real number.
g  One-stage cost function. A lower semianalytic function from Γ to R*.
N  Horizon. A positive integer.

We envision a system moving from state x_k to state x_{k+1} via the system equation

x_{k+1} = f(x_k, u_k, w_k),   k = 0, 1, ..., N − 2,

and incurring cost at each stage of g(x_k, u_k). The disturbances w_k are random objects with probability distributions p(dw_k|x_k, u_k). The goal is to choose u_k dependent on the history (x_0, u_0, ..., x_{k−1}, u_{k−1}, x_k) so as to minimize

E[ ∑_{k=0}^{N−1} α^k g(x_k, u_k) ].   (2)

The meaning of this statement will be made precise shortly. We have the constraint that when x_k is the kth state, the kth control u_k must be chosen to lie in U(x_k). In the models in Section 2.3.2 and DPSC, the one-stage cost g is also a function of the disturbance, i.e., has the form g(x, u, w). If this is the case, then g(x, u, w) can be replaced by
g(x, u) = ∫ g(x, u, w) p(dw|x, u).
If g(x, u, w) is lower semianalytic, so is g(x, u) (Proposition 7.48). If p(dw|x, u) is continuous and g(x, u, w) is lower semicontinuous and bounded below or upper semicontinuous and bounded above, then g(x, u) is lower semicontinuous and bounded below or upper semicontinuous and bounded above, respectively (Proposition 7.31). Since these are the three cases we deal with, there is no loss of generality in considering a one-stage cost function which is independent of the disturbance. The model posed in Definition 8.1 is stationary, i.e., the data does not vary from stage to stage. A reduction of the nonstationary model to this form is discussed in Section 10.1. A notational device which simplifies the presentation is the state transition stochastic kernel on S given SC defined by
t(B|x, u) = p({w | f(x, u, w) ∈ B} | x, u) = p(f^{−1}(B)_{(x,u)} | x, u).   (3)
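Equation (3) and the disturbance-averaged cost g(x, u) = ∫ g(x, u, w) p(dw|x, u) discussed above can be sketched for a finite disturbance space. The system function f, the kernel p, and the cost g below are invented purely for illustration.

```python
# A finite-disturbance sketch of the state transition kernel (3),
#   t(B|x,u) = p({w : f(x,u,w) in B} | x,u),
# and of the disturbance-averaged one-stage cost
#   g_bar(x,u) = integral of g(x,u,w) p(dw|x,u).
# All model data (f, p, g) here are invented for illustration.

W = ("w1", "w2")

def f(x, u, w):                  # next state
    return x + u if w == "w1" else max(x - u, 0)

def p(w, x, u):                  # disturbance probabilities p({w}|x,u)
    return 0.7 if w == "w1" else 0.3

def g(x, u, w):                  # one-stage cost depending on the disturbance
    return float(x + u) if w == "w1" else float(x)

def t(B, x, u):
    """t(B|x,u): probability that the next state falls in the set B."""
    return sum(p(w, x, u) for w in W if f(x, u, w) in B)

def g_bar(x, u):
    """Disturbance-free one-stage cost obtained by averaging out w."""
    return sum(g(x, u, w) * p(w, x, u) for w in W)

# t(.|x,u) is a probability measure on the (here finite) state space:
assert abs(t(set(range(100)), 2, 1) - 1.0) < 1e-12
print(t({3}, 2, 1), g_bar(2, 1))
```

Averaging w out of the cost this way is exactly why the text can assume a one-stage cost independent of the disturbance without loss of generality.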
Thus t(B|x, u) is the probability that the (k+1)st state is in B given that the kth state is x and the kth control is u. Proposition 7.26 and Corollary 7.26.1 imply that t(dx′|x, u) is Borel-measurable.

Definition 8.2 A policy for the model of Definition 8.1 is a sequence π = (μ_0, μ_1, ..., μ_{N−1}) such that, for each k, μ_k(du_k|x_0, u_0, ..., u_{k−1}, x_k) is a universally measurable stochastic kernel on C given SC⋯CS satisfying

μ_k(U(x_k)|x_0, u_0, ..., u_{k−1}, x_k) = 1

for every (x_0, u_0, ..., u_{k−1}, x_k). If, for each k, μ_k is parameterized only by (x_0, x_k), π is a semi-Markov policy. If μ_k is parameterized only by x_k, π is a Markov policy. If, for each k and (x_0, u_0, ..., u_{k−1}, x_k), μ_k(du_k|x_0, u_0, ..., u_{k−1}, x_k) assigns mass one to some point in C, π is nonrandomized. In this case, by a slight abuse of notation, π can be considered to be a sequence of universally measurable (Corollary 7.44.3) mappings μ_k: SC⋯CS → C such that

μ_k(x_0, u_0, ..., u_{k−1}, x_k) ∈ U(x_k)

for every (x_0, u_0, ..., u_{k−1}, x_k). If ℱ is a type of σ-algebra on Borel spaces and all the stochastic kernel components of a policy are ℱ-measurable, we say the policy is ℱ-measurable. (For example, ℱ could represent the Borel σ-algebras or the analytic σ-algebras.)
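The bookkeeping in Definition 8.2 (a stochastic kernel μ_k(du_k|·) that puts all of its mass on the admissible set U(x_k)) can be sketched with a toy stationary randomized policy; the state space, constraint, and dynamics below are invented.

```python
import random

# Toy sketch of a randomized policy in the sense of Definition 8.2:
# mu(du|x) is a distribution over the admissible control set U(x), so
# mu(U(x)|x) = 1 at every state.  All model data here are invented.

def U(x):                            # control constraint
    return (0, 1) if x % 2 == 0 else (1,)

def mu(x):                           # mu(du|x): uniform over U(x)
    us = U(x)
    return {u: 1.0 / len(us) for u in us}

def f(x, u, w):                      # system function
    return (x + u + w) % 5

def sample_trajectory(x0, N, rng):
    """Sample (x0,u0,x1,u1,...) for N stages under the stationary policy mu."""
    traj, x = [], x0
    for _ in range(N):
        dist = mu(x)
        u = rng.choices(list(dist), weights=list(dist.values()))[0]
        traj.append((x, u))
        x = f(x, u, rng.choice([0, 1]))   # disturbance uniform on {0, 1}
    return traj

rng = random.Random(0)
print(sample_trajectory(0, 4, rng))
```

Every sampled control satisfies u_k ∈ U(x_k), which is the constraint μ_k(U(x_k)|·) = 1 in executable form.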
We denote by Π′ the set of all policies for the model of Definition 8.1 and by Π the set of all Markov policies. We will show that in many cases it is not necessary to go outside Π to find the "best" available policy. In most cases, this "best" policy can be taken to be nonrandomized. Since Γ is analytic, the Jankov-von Neumann theorem (Proposition 7.49) guarantees that there exists at least one nonrandomized Markov policy, so Π and Π′ are nonempty. If π = (μ_0, μ_1, ..., μ_{N−1}) is a nonrandomized Markov policy, then π is a finite horizon version of a policy in the sense of Section 2.1. The notion of policy as set forth in Definition 8.2 is wider than the concept of Section 2.1 in that randomized non-Markov policies are permitted. It is narrower in that universal measurability is required.

We are now in a position to make precise expression (2). In this and subsequent discussions, we often index the state and control spaces for clarity. However, except in Chapter 10 when the nonstationary model is treated, we will always understand S_k to be a copy of S and C_k to be a copy of C. Suppose p ∈ P(S) and π = (μ_0, μ_1, ..., μ_{N−1}) is a policy for the model of Definition 8.1. By Proposition 7.45, there is a unique probability measure r_N(π, p) on S_0C_0⋯S_{N−1}C_{N−1} such that for any universally measurable function h: S_0C_0⋯S_{N−1}C_{N−1} → R* which satisfies either ∫ h⁺ dr_N(π, p) < ∞
or ∫ h⁻ dr_N(π, p) < ∞.
p_{k+1}(π′, p_x)(B̄_{k+1}) = ∫_{S_kC_k} t(B̄_{k+1}|x_k, u_k) dq_k(π′, p_x).   (14)

If h: S_{k+1} → [0, ∞] is a Borel-measurable indicator function, then

∫ h(x_{k+1}) p_{k+1}(π′, p_x)(dx_{k+1}) = ∫_{S_kC_k} ∫_{S_{k+1}} h(x_{k+1}) t(dx_{k+1}|x_k, u_k) dq_k(π′, p_x).   (15)

Then (15) holds for Borel-measurable simple functions, and finally, for all Borel-measurable functions h: S_{k+1} → [0, ∞]. Letting h(x_{k+1}) in (15) be μ_{k+1}(C̄_{k+1}|x_{k+1}), we obtain from (14), the induction hypothesis, and (8)

q_{k+1}(π′, p_x)(B̄_{k+1}C̄_{k+1}) = ∫_{S_kC_k} ∫_{B̄_{k+1}} μ_{k+1}(C̄_{k+1}|x_{k+1}) t(dx_{k+1}|x_k, u_k) dq_k(π′, p_x)
 = ∫_{S_kC_k} ∫_{B̄_{k+1}} μ_{k+1}(C̄_{k+1}|x_{k+1}) t(dx_{k+1}|x_k, u_k) dq_k(π, p_x)
 = q_{k+1}(π, p_x)(B̄_{k+1}C̄_{k+1}),

which proves (13) for k + 1. Q.E.D.
Corollary 8.1.1 (F⁺)(F⁻) For K = 1, 2, ..., N, we have

J*_K(x) = inf_{π∈Π} J_{K,π}(x)   ∀x ∈ S,

where Π is the set of all Markov policies.
Corollary 8.1.1 shows that the admission of non-Markov policies to our discussion has not resulted in a reduction of the optimal cost function. The advantage of allowing non-Markov policies is that an ε-optimal nonrandomized policy can then be guaranteed to exist (Proposition 8.3), whereas one may not exist within the class of Markov policies (Example 2).

8.2
The Dynamic Programming Algorithm - Existence of Optimal and ε-Optimal Policies
Let U(C|S) denote the set of universally measurable stochastic kernels μ on C given S which satisfy μ(U(x)|x) = 1 for every x ∈ S. Thus the set of Markov policies is Π = U(C|S) U(C|S) ⋯ U(C|S), where there are N factors.

Definition 8.4 Let J: S → R* be universally measurable and μ ∈ U(C|S). The operator T_μ mapping J into T_μ(J): S → R* is defined by

T_μ(J)(x) = ∫_C [ g(x, u) + α ∫_S J(x′) t(dx′|x, u) ] μ(du|x)

for every x ∈ S. The operator T_μ can also be written in terms of the system function f and the disturbance kernel p(dw|x, u) as [cf. (3)]

T_μ(J)(x) = ∫_C [ g(x, u) + α ∫_W J[f(x, u, w)] p(dw|x, u) ] μ(du|x).
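For a finite model the operator T_μ of Definition 8.4 reduces to finite sums, and the corresponding optimal-cost operator T(J)(x) = inf over admissible u reduces to a minimum. The states, controls, costs, and kernel below are invented, with U(x) = C for simplicity.

```python
# Finite-model sketch of the operator T_mu of Definition 8.4:
#   T_mu(J)(x) = sum_u mu(u|x) [ g(x,u) + alpha * sum_x' t(x'|x,u) J(x') ],
# together with the optimal-cost operator T(J)(x) = min over u.
# States, controls, costs, and the kernel t are invented; U(x) = C here.

S, C, alpha = (0, 1), (0, 1), 0.9

g = {(x, u): float(x + u) for x in S for u in C}               # one-stage cost
t = {(x, u): {u: 0.8, 1 - u: 0.2} for x in S for u in C}       # t(dx'|x,u)

def expect(J, x, u):
    return sum(pr * J(y) for y, pr in t[x, u].items())

def T_mu(mu, J):
    return lambda x: sum(mu[x][u] * (g[x, u] + alpha * expect(J, x, u))
                         for u in C)

def T(J):
    return lambda x: min(g[x, u] + alpha * expect(J, x, u) for u in C)

J0 = lambda x: 0.0
mu_unif = {x: {u: 0.5 for u in C} for x in S}
print([T_mu(mu_unif, J0)(x) for x in S])   # one-stage cost of a randomized mu
print([T(T(J0))(x) for x in S])            # two applications of T
```

Composing such operators is exactly how the multistage costs of this section are built up from J_0 ≡ 0.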
By Proposition 7.46, T_μ(J) is universally measurable. We show that under (F⁺) or (F⁻), the cost corresponding to a Markov policy π = (μ_0, ..., μ_{N−1}) can be defined in terms of the composition of the operators T_{μ_0}, T_{μ_1}, ..., T_{μ_{N−1}}.

Lemma 8.1 (F⁺)(F⁻) Let π = (μ_0, μ_1, ..., μ_{N−1}) be a Markov policy and let J_0: S → R* be identically zero. Then for K = 1, 2, ..., N we have

J_{K,π} = (T_{μ_0} ⋯ T_{μ_{K−1}})(J_0),   (16)

where T_{μ_0} ⋯ T_{μ_{K−1}} denotes the composition of T_{μ_0}, ..., T_{μ_{K−1}}.
Proof We proceed by induction. For x ∈ S,

J_{1,π}(x) = ∫ g dq_0(π, p_x) = ∫_{C_0} g(x, u_0) μ_0(du_0|x) = T_{μ_0}(J_0)(x).
Suppose the lemma holds for K − 1. Let π̄ = (μ_1, μ_2, ..., μ_{N−1}, μ), where μ is some element of U(C|S). Then for any x ∈ S, the (F⁺) or (F⁻) assumption along with Lemma 7.11(b) implies that (5) can be rewritten as

J_{K,π}(x) = ∫_{C_0} g(x, u_0) μ_0(du_0|x)
  + α ∫_{C_0} ∫_{S_1} ∫_{C_1} ⋯ ∫_{C_{K−1}} [ ∑_{k=1}^{K−1} α^{k−1} g(x_k, u_k) ]
    × μ_{K−1}(du_{K−1}|x_{K−1}) t(dx_{K−1}|x_{K−2}, u_{K−2}) ⋯ μ_1(du_1|x_1) t(dx_1|x, u_0) μ_0(du_0|x)
 = ∫_{C_0} g(x, u_0) μ_0(du_0|x) + α ∫_{C_0} ∫_{S_1} J_{K−1,π̄}(x_1) t(dx_1|x, u_0) μ_0(du_0|x).

Under (F⁻), ∫_{C_0} g⁺(x, u_0) μ_0(du_0|x)
J_k ↑ J, then T_μ(J_k) → T_μ(J).

Proof Assume first that T_μ(J_1) < ∞ and J_k ↑ J. Fix x. Since

T_μ(J_1)(x) = ∫ [ g(x, u) + α ∫ J_1(x′) t(dx′|x, u) ] μ(du|x)
inf_{μ_0} ⋯ inf_{μ_{N−2}} inf_{μ_{N−1}} (T_{μ_0} ⋯ T_{μ_{N−2}} T_{μ_{N−1}})(J_0)
 = inf_{μ_0} ⋯ inf_{μ_{N−2}} (T_{μ_0} ⋯ T_{μ_{N−2}} T)(J_0)
 = T^N(J_0),   (23)

where the last equality is obtained by repeating the process used to obtain the previous equality. Combining (20) and (23), we see that the proposition holds under the (F⁻) assumption. Q.E.D.

When the state, control, and disturbance spaces are countable, the model of Definition 8.1 falls within the framework of Part I. Consider such a model, and, as in Part I, let M be the set of mappings μ: S → C for which μ(x) ∈ U(x) for every x ∈ S. In Section 3.2, it was often assumed that for every x ∈ S and μ_j ∈ M, j = 0, ..., K − 1, we have

(T_{μ_0} ⋯ T_{μ_{K−1}})(J_0)(x) > −∞.
For ε > 0, it may not be possible to find a nonrandomized Markov ε-optimal policy, as the following example demonstrates.

EXAMPLE 2 Let S = {0, 1, 2, ...}, C = {1, 2, ...}, W = {w_1, w_2}, Γ = SC, N = 2, and define

g(x, u) = { −u   if x = 1,
            0    if x ≠ 1,

f(x, u, w) = { 0   if x = 0 or x = 1 or w = w_1,
               1   if x ≠ 0, x ≠ 1, and w = w_2,

p({w_1}|x, u) = 1 − 1/x,   p({w_2}|x, u) = 1/x,   if x ≠ 0, x ≠ 1.
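The two-stage costs in this example can be checked numerically; below, u stands for the control μ_1(1) that a nonrandomized Markov policy picks at state 1, and exact rationals keep the arithmetic transparent.

```python
from fractions import Fraction

# Example 2, checked numerically.  From x0 not in {0,1} the first stage is
# free, x1 = 1 with probability 1/x0, and at state 1 the control u = mu1(1)
# incurs cost -u before the system moves to 0.

def g(x, u):
    return -u if x == 1 else 0

def two_stage_cost(x0, u):
    """J_{2,pi}(x0) for x0 != 0, 1 under a nonrandomized Markov policy
    whose second-stage control at state 1 is u."""
    p_one = Fraction(1, x0)          # P(x1 = 1) = 1/x0
    return p_one * g(1, u)           # reaching 0 instead costs nothing

# two_stage_cost(x0, u) = -u/x0 is unbounded below in u, so no single
# nonrandomized Markov policy can come within eps of the optimum.
print([two_stage_cost(5, u) for u in (1, 10, 100)])
```

Letting u grow drives the cost to −∞, which is exactly the obstruction to ε-optimality analyzed next.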
The (F⁻) assumption is satisfied. Let π = (μ_0, μ_1) be a nonrandomized Markov policy. If the initial state x_0 is neither zero nor one, then regardless of the policy employed, x_1 = 0 with probability 1 − (1/x_0), and x_1 = 1 with probability 1/x_0. Once the system reaches zero, it remains there at no further cost. If the system reaches one, it moves to x_2 = 0 at a cost of −μ_1(1). Thus J_{2,π}(x_0) = −μ_1(1)/x_0 if x_0 ≠ 0, x_0 ≠ 1, and J*_2(x_0) = −∞ if x_0 ≠ 0, x_0 ≠ 1. For any ε > 0, π cannot be ε-optimal.

In Example 2, it is possible to find a sequence of nonrandomized Markov policies {π_n} such that J_{N,π_n} ↓ J*_N. This example motivates the idea of policies exhibiting {ε_n}-dominated convergence to optimality (Definition 8.3) and Proposition 8.4, which we prove with the aid of the next lemma.

Lemma 8.5 Let {J_k} and J be universally measurable functions from S to R* and μ a universally measurable function from S to C whose graph lies in Γ. Suppose

∫ J_1(x′) t(dx′|x, μ(x)) < ∞   ∀x ∈ S,

and for some sequence {ε_k} of positive numbers with ∑_{k=1}^∞ ε_k < ∞, we have, for every x ∈ S,

J(x) ≤ J_k(x) ≤ { J(x) + ε_k     if J(x) > −∞,
                 −1/ε_k + ε_k   if J(x) = −∞.

Then

lim_{k→∞} T_μ(J_k) = T_μ(J).   (26)
Proof Since J ≤ J_k for every k, it is clear that

T_μ(J) ≤ liminf_{k→∞} T_μ(J_k).   (27)

For x ∈ S,

limsup_{k→∞} T_μ(J_k)(x) ≤ g[x, μ(x)] + α limsup_{k→∞} ∫_{{x′|J(x′)>−∞}} J_k(x′) t(dx′|x, μ(x))
  + α limsup_{k→∞} ∫_{{x′|J(x′)=−∞}} J_k(x′) t(dx′|x, μ(x)).

Now

limsup_{k→∞} ∫_{{x′|J(x′)>−∞}} J_k(x′) t(dx′|x, μ(x))
 ≤ limsup_{k→∞} [ ∫_{{x′|J(x′)>−∞}} J(x′) t(dx′|x, μ(x)) + ε_k ]
 = ∫_{{x′|J(x′)>−∞}} J(x′) t(dx′|x, μ(x)).
If J(x′) = −∞, then

J_k(x′) ≤ −1/ε_k + ε_k → −∞,

so

limsup_{k→∞} ∫_{{x′|J(x′)=−∞}} J_k(x′) t(dx′|x, μ(x)) ≤ ∫_{{x′|J(x′)=−∞}} J(x′) t(dx′|x, μ(x)),

and (26) follows by combining this with (27). Q.E.D.

For each n, there is a universally measurable μ_0^n: S → C whose graph lies in Γ such that

T_{μ_0^n}(J_0)(x) ≤ { T(J_0)(x) + ε_n   if T(J_0)(x) > −∞,
                     −1/ε_n            if T(J_0)(x) = −∞.
We may assume without loss of generality that

T_{μ_0^n}(J_0) ≤ T_{μ_0^{n−1}}(J_0),   n = 2, 3, ....

Therefore {π_n} exhibits {ε_n}-dominated convergence to one-stage optimality. Suppose the result holds for N − 1. Let π̄_n = (μ̄_1^n, ..., μ̄_{N−1}^n) be a sequence of (N − 1)-stage nonrandomized Markov policies exhibiting {ε_n/2α}-dominated convergence to (N − 1)-stage optimality, i.e.,

J_{N−1,π̄_n}(x) ≤ { J*_{N−1}(x) + ε_n/2α   if J*_{N−1}(x) > −∞,
                   −2α/ε_n                if J*_{N−1}(x) = −∞.   (31)

We may assume without loss of generality that

T_{μ^n}(J*_{N−1}) ≤ T_{μ^{n−1}}(J*_{N−1}),   n = 2, 3, ....   (32)
By Proposition 7.48, the set

A(J*_{N−1}) = {(x, u) ∈ Γ | t({x′ | J*_{N−1}(x′) = −∞} | x, u) > 0}
            = {(x, u) ∈ Γ | ∫ −χ_{{x′|J*_{N−1}(x′)=−∞}}(x′) t(dx′|x, u) < 0}   (33)

is analytic.
Let p ∈ P(S), let ε > 0, and let π̂ ∈ Π′ be ε-optimal. If p({x | J*_N(x) = −∞}) = 0, then

∫ J_{N,π̂}(x) p(dx) ≤ ∫ J*_N(x) p(dx) + ε,   (38)

and it follows that

inf_{π∈Π′} ∫ J_{N,π}(x) p(dx) ≤ ∫ J*_N(x) p(dx).   (39)

If p({x | J*_N(x) = −∞}) > 0, then

∫ J_{N,π̂}(x) p(dx) ≤ −p({x | J*_N(x) = −∞})/ε + ∫_{{x|J*_N(x)>−∞}} J*_N(x) p(dx) + ε.   (40)

If ∫_{{x|J*_N(x)>−∞}} J*_N(x) p(dx) = ∞, then ∫ J*_N(x) p(dx) = ∞ and (39) follows. Otherwise, the right-hand side of (40) can be made arbitrarily small by letting ε approach zero, so inf_{π∈Π′} ∫ J_{N,π}(x) p(dx) = −∞ and (39) is again valid. The lemma follows from (38) and (39). Q.E.D.
We now consider the question of constructing an optimal policy, if this is at all possible. When the dynamic programming algorithm can be used to construct an optimal policy, this policy usually satisfies a condition stronger than mere optimality. This condition is given in the next definition.

Definition 8.6 Let π = (μ_0, ..., μ_{N−1}) be a Markov policy and π^{N−k} = (μ_k, ..., μ_{N−1}), k = 0, ..., N − 1. The policy π is uniformly N-stage optimal if

J_{N−k,π^{N−k}} = J*_{N−k},   k = 0, ..., N − 1.

Lemma 8.7 (F⁺)(F⁻) The policy π = (μ_0, ..., μ_{N−1}) ∈ Π is uniformly N-stage optimal if and only if

(T_{μ_k} T^{N−k−1})(J_0) = T^{N−k}(J_0),   k = 0, ..., N − 1.

Proof If π = (μ_0, ..., μ_{N−1}) is uniformly N-stage optimal, then

T^{N−k}(J_0) = J*_{N−k} = J_{N−k,π^{N−k}} = T_{μ_k}(J_{N−k−1,π^{N−k−1}})
 = T_{μ_k}(J*_{N−k−1}) = (T_{μ_k} T^{N−k−1})(J_0),   k = 0, ..., N − 1,

where J_{0,π^0} = J*_0 ≡ 0. If (T_{μ_k} T^{N−k−1})(J_0) = T^{N−k}(J_0), k = 0, ..., N − 1, then for all k

J*_{N−k} = T^{N−k}(J_0) = (T_{μ_k} T^{N−k−1})(J_0)
 = (T_{μ_k} T_{μ_{k+1}} T^{N−k−2})(J_0)
 = (T_{μ_k} ⋯ T_{μ_{N−1}})(J_0) = J_{N−k,π^{N−k}},

where the next to last equality is obtained by continuing the process used to obtain the previous equalities. Q.E.D.

Lemma 8.7 is the analog for the Borel model of Proposition 3.3 for the model of Part I. Because (F⁺) or (F⁻) is a required assumption in Lemma 8.1, one of them is also required in Lemma 8.7, as the following example shows. If we take (16) as the definition of J_{k,π}, then Lemma 8.7, Proposition 8.5, and
Corollaries 8.5.1 and 8.5.2 hold without the (F⁺) and (F⁻) assumptions. The proofs are similar to those of Section 3.2.

EXAMPLE 3 Let S = {s, t} ∪ {(k, j) | k = 1, 2, ...; j = 1, 2}, C = {a, b}, U(s) = {a, b}, U(t) = U((k, j)) = {b}, k = 1, 2, ..., j = 1, 2, W = S, and α = 1. Let the disturbance kernel be given by

p(s|s, a) = 1,
p[(k, 1)|s, b] = p[(k, 2)|(l, 1), b] = k^{−2} ( ∑_{n=1}^∞ 1/n² )^{−1},   k, l = 1, 2, ...,
p[t|(k, 2), b] = 1,   k = 1, 2, ...,
p(t|t, b) = 1.

Let the system function be f(x, u, w) = w. Thus if the system begins at state s, we can hold it at s or allow it to move to some state (l, 1), from which it subsequently moves to some (k, 2) and then to t. Having reached t, the system remains there. The relevant costs are

g(s, a) = g(s, b) = g(t, b) = 0,
g[(k, 1), b] = k,   g[(k, 2), b] = −k,   k = 1, 2, ....

Let π = (μ_0, μ_1, μ_2) be a policy with μ_0(s) = b, μ_1(s) = μ_2(s) = a. Then

T(J_0)(x_2) = T_{μ_2}(J_0)(x_2) = { 0    if x_2 = s,
                                    k    if x_2 = (k, 1),
                                    −k   if x_2 = (k, 2),
                                    0    if x_2 = t,

T²(J_0)(x_1) = (T_{μ_1} T_{μ_2})(J_0)(x_1) = { 0    if x_1 = s,
                                               −∞   if x_1 = (k, 1),
                                               −k   if x_1 = (k, 2),
                                               0    if x_1 = t,

T³(J_0)(x_0) = (T_{μ_0} T_{μ_1} T_{μ_2})(J_0)(x_0) = { −∞   if x_0 = s,
                                                       −∞   if x_0 = (k, 1),
                                                       −k   if x_0 = (k, 2),
                                                       0    if x_0 = t.

However, J_{3,π}(s) = ∞ > J_{3,π̄}(s) = 0, where π̄ = (μ̄_0, μ̄_1, μ̄_2) and μ̄_0(s) = μ̄_1(s) = μ̄_2(s) = a, so π is not optimal and T³(J_0) ≠ J*_3. It is easily verified that π̄ is a uniformly three-stage optimal policy, so Corollary 3.3.1(b) also fails to hold for the Borel model of this chapter. Here both assumptions (F⁺) and (F⁻) are violated.
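The divergence driving this example can be seen in truncated form: with c = (∑ 1/n²)^{−1} = 6/π², the expected first-stage cost under μ_0(s) = b is c ∑_k 1/k = +∞, while the second stage contributes the negative of a similarly divergent sum.

```python
import math

# Example 3, truncated numerically.  Under mu0(s) = b the expected cost of
# the first move out of s is  c * sum_k k^(-2) * k = c * sum_k 1/k  with
# c = 6/pi^2; the second stage contributes the negative of a similarly
# divergent sum, so neither (F+) nor (F-) holds.

c = 6.0 / math.pi ** 2                      # (sum_n 1/n^2)^(-1)

def partial_expected_cost(K):
    """c * sum_{k=1}^K 1/k: expected stage-1 cost truncated at level K."""
    return c * sum(1.0 / k for k in range(1, K + 1))

print([round(partial_expected_cost(K), 3) for K in (10, 1000, 100000)])
# The truncated sums grow like c * log K, without bound.
```

The "∞ − ∞" character of J_{3,π}(s) is exactly the cancellation between these two divergent stage costs.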
Proposition 8.5 (F⁺)(F⁻) If the infimum in

inf_{u∈U(x)} { g(x, u) + α ∫ J*_k(x′) t(dx′|x, u) },   k = 0, ..., N − 1,   (41)

is achieved for each x ∈ S, where J*_0 is identically zero, then a uniformly N-stage optimal (and hence optimal) nonrandomized Markov policy exists. This policy is generated by the dynamic programming algorithm, i.e., by measurably selecting for each x a control u which achieves the infimum.

Proof Let π̄ = (μ̄_0, ..., μ̄_{N−1}), where μ̄_{N−k−1}: S → C achieves the infimum in (41) and satisfies μ̄_{N−k−1}(x) ∈ U(x) for every x ∈ S, k = 0, ..., N − 1 (Proposition 7.50). Apply Lemma 8.7. Q.E.D.
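For finite state and control spaces the measurable-selection issues disappear, and the dynamic programming algorithm of Proposition 8.5 is plain backward induction; the model data below are invented, and `min` plays the role of the selector achieving the infimum in (41).

```python
# Backward-induction sketch of Proposition 8.5 for invented finite data:
# the infimum in (41) is attained trivially, and selecting a minimizing
# control at each state yields a nonrandomized Markov policy
# (mu_0, ..., mu_{N-1}) built from the last stage backward.

S = (0, 1, 2)
U = {0: (0, 1), 1: (0, 1), 2: (1,)}          # control constraint
alpha = 1.0

def g(x, u):
    return abs(x - u)

def t(x, u):                                  # kernel t(dx'|x,u)
    return {x: 1.0} if u == x else {u: 0.9, x: 0.1}

def dp(N):
    J = {x: 0.0 for x in S}                   # J*_0 identically zero
    policy = []
    for _ in range(N):
        Q = {x: {u: g(x, u) + alpha * sum(pr * J[y] for y, pr in t(x, u).items())
                 for u in U[x]} for x in S}
        mu = {x: min(Q[x], key=Q[x].get) for x in S}   # selector for (41)
        J = {x: Q[x][mu[x]] for x in S}
        policy.insert(0, mu)                  # policy = (mu_0, ..., mu_{N-1})
    return J, policy

J3, pi = dp(3)
print(J3, pi[0])
```

Each pass computes J*_{k+1} = T(J*_k) and records the minimizing control, mirroring how the proof assembles π̄ stage by stage.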
Corollary 8.5.1 (F⁺)(F⁻) If U(x) is a finite set for each x ∈ S, then a uniformly N-stage optimal nonrandomized Markov policy exists.

Corollary 8.5.2 (F⁺)(F⁻) If for each x ∈ S, λ ∈ R, and k = 0, ..., N − 1, the set

U_k(x, λ) = { u ∈ U(x) | g(x, u) + α ∫ J*_k(x′) t(dx′|x, u) ≤ λ }

is compact, then there exists a uniformly N-stage optimal nonrandomized Markov policy.

Proof Apply Lemma 3.1 to Proposition 8.5. Q.E.D.

8.3 The Semicontinuous Models
Along the lines of our development of lower and upper semicontinuous functions in Section 7.5, we can consider lower and upper semicontinuous decision models. Our models will be designed to take advantage of the possibility for Borel-measurable selection (Propositions 7.33 and 7.34) and, in the case of lower semicontinuity, the attainment of the infimum in (55) of Chapter 7. We discuss the lower semicontinuous model first.

Definition 8.7 The lower semicontinuous, finite horizon, stochastic, optimal control model is a nine-tuple (S, C, U, W, p, f, α, g, N) as given in Definition 8.1 which has the following additional properties:

(a) The control space C is compact.
(b) The set Γ defined by (1) has the form Γ = ⋃_{j=1}^∞ Γ_j, where Γ_1 ⊂ Γ_2 ⊂ ⋯, each Γ_j is a closed subset of SC, and

lim_{j→∞} inf_{(x,u)∈Γ−Γ_j} g(x, u) = ∞.²

(c) The disturbance kernel p(dw|x, u) is continuous on Γ.

² By convention, the infimum over the empty set is ∞, so this condition is satisfied if the Γ_j are all identical for j larger than some index k.
(d) The system function f is continuous on ΓW.
(e) The one-stage cost function g is lower semicontinuous and bounded below on Γ.

Conditions (c) and (d) of Definition 8.7 and Proposition 7.30 imply that t(dx′|x, u) defined by (3) is continuous on Γ, since for any h ∈ C(S) we have

∫ h(x′) t(dx′|x, u) = ∫ h[f(x, u, w)] p(dw|x, u).
Condition (e) implies that the (F⁺) assumption holds.

Proposition 8.6 Consider the lower semicontinuous finite horizon model of Definition 8.7. For k = 1, 2, ..., N, the k-stage optimal cost function J*_k is lower semicontinuous and bounded below, and J*_k = T^k(J_0). Furthermore, a Borel-measurable, uniformly N-stage optimal, nonrandomized Markov policy exists.
Proof Suppose J: S → R* is lower semicontinuous and bounded below, and define K: Γ → R* by

K(x, u) = g(x, u) + α ∫ J(x′) t(dx′|x, u).   (42)

Extend K to all of SC by defining

K̄(x, u) = { K(x, u)   if (x, u) ∈ Γ,
            ∞         if (x, u) ∉ Γ.

By Proposition 7.31(a) and the remarks following Lemma 7.13, the function K is lower semicontinuous on Γ. For c ∈ R, the set {(x, u) ∈ SC | K̄(x, u) ≤ c} must be contained in some Γ_k by Definition 8.7(b), so the set

{(x, u) ∈ SC | K̄(x, u) ≤ c} = {(x, u) ∈ Γ_k | K(x, u) ≤ c}

is closed in Γ_k and thus closed in SC as well. It follows that K̄(x, u) is lower semicontinuous and bounded below on SC and, by Proposition 7.32, the function

T(J)(x) = inf_{u∈C} K̄(x, u)   (43)

is as well. In fact, Proposition 7.33 states that the infimum in (43) is achieved for every x ∈ S, and there exists a Borel-measurable φ: S → C such that

T(J)(x) = K̄[x, φ(x)]   ∀x ∈ S.

For j = 1, 2, ..., let φ_j: proj_S(Γ_j) → C be a Borel-measurable function with graph in Γ_j. (Set D = Γ_j in Proposition 7.33 to establish the existence of such a function.) Define μ: S → C so that μ(x) = φ(x) if T(J)(x) < ∞; μ(x) = φ_1(x) if T(J)(x) = ∞ and x ∈ proj_S(Γ_1); and for j = 2, 3, ..., define μ(x) = φ_j(x) if
T(J)(x) = ∞ and x ∈ proj_S(Γ_j) − proj_S(Γ_{j−1}). Then μ is Borel-measurable, μ(x) ∈ U(x) for every x ∈ S, and T_μ(J) = T(J). Since J_0 ≡ 0 is lower semicontinuous and bounded below, the above argument shows that J*_k = T^k(J_0) has these properties also, and furthermore, for each k = 0, ..., N − 1, there exists a Borel-measurable μ_k: S → C such that μ_k(x) ∈ U(x) for every x ∈ S and (T_{μ_k} T^{N−k−1})(J_0) = T^{N−k}(J_0). The proposition follows from Lemma 8.7. Q.E.D.
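The +∞ extension step in this proof has a direct computational analogue: extending K off Γ by +∞ lets one minimize over all of C while automatically enforcing u ∈ U(x). A finite-grid sketch with invented Γ and K (the grid stands in for a compact C):

```python
import math

# Sketch of the extension K_bar in the proof of Proposition 8.6: K is given
# on Gamma only; K_bar equals K on Gamma and +infinity elsewhere, so
# T(J)(x) = inf over all u in C of K_bar(x,u) respects the constraint
# u in U(x).  Gamma, K, and the grid below are invented.

C = [i / 10 for i in range(11)]              # finite stand-in for compact [0,1]

def in_Gamma(x, u):
    return u <= x                            # invented constraint U(x) = [0, x]

def K(x, u):
    return (u - 0.35) ** 2 + x               # invented l.s.c. cost on Gamma

def K_bar(x, u):
    return K(x, u) if in_Gamma(x, u) else math.inf

def T(x):
    return min(K_bar(x, u) for u in C)       # infimum over all of C

print(T(1.0), T(0.2))
# At x = 1.0 the unconstrained minimizer near u = 0.35 is feasible; at
# x = 0.2 the +infinity extension pushes the minimizer down to u = 0.2.
```

Because +∞ can never win the minimization when Γ_x is nonempty, the constrained and unconstrained infima agree, which is the point of the extension.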
We note that although condition (a) of Definition 8.7 requires the compactness of C, the conclusion of Proposition 8.6 still holds if C is not compact but can be homeomorphically embedded in a compact space C̄ in such a way that the image of each Γ_j, j = 1, 2, ..., is closed in SC̄. That is to say, the conclusion holds if there is a compact space C̄ and a homeomorphism φ: C → C̄ such that for j = 1, 2, ..., ψ(Γ_j) is closed in SC̄, where ψ(x, u) = (x, φ(u)). The continuity of f and p(dw|x, u) and the lower semicontinuity of g are unaffected by this embedding. In particular, if Γ_j is compact for each j, we can take C̄ = ℋ, the Hilbert cube, and use Urysohn's theorem (Proposition 7.2) and the fact that the continuous image of a compact set is compact to accomplish this transformation. We state this last result as a corollary.

Corollary 8.6.1 The conclusions of Proposition 8.6 hold if instead of assuming that C is compact and each Γ_j is closed in Definition 8.7, we assume that each Γ_j is compact.

Definition 8.8 The upper semicontinuous, finite horizon, stochastic, optimal control model is a nine-tuple (S, C, U, W, p, f, α, g, N) as given in Definition 8.1 which has the following additional properties:
(a) The set Γ defined by (1) is open in SC.
(b) The disturbance kernel p(dw|x, u) is continuous on Γ.
(c) The system function f is continuous on ΓW.
(d) The one-stage cost g is upper semicontinuous and bounded above on Γ.

As in the lower semicontinuous model, the stochastic kernel t(dx′|x, u) is continuous in the upper semicontinuous model. In the upper semicontinuous model, the (F⁻) assumption holds. If J: S → R* is upper semicontinuous and bounded above, then K: Γ → R* defined by (42) is upper semicontinuous and bounded above. By Proposition 7.34, the function

T(J)(x) = inf_{u∈U(x)} K(x, u)

is upper semicontinuous, and for every ε > 0 there exists a Borel-measurable
μ: S → C such that μ(x) ∈ U(x) for every x ∈ S, and

T_μ(J)(x) ≤ { T(J)(x) + ε   if T(J)(x) > −∞,
             −1/ε          if T(J)(x) = −∞.

Since J_0 ≡ 0 is upper semicontinuous and bounded above, so is J*_k = T^k(J_0), k = 1, 2, ..., N. The following proposition is obtained by using these facts to parallel the proof of the (F⁻) part of Proposition 8.3.

Proposition 8.7 Consider the upper semicontinuous finite horizon model of Definition 8.8. For k = 1, 2, ..., N, the k-stage optimal cost function J*_k is upper semicontinuous and bounded above, and J*_k = T^k(J_0). For each ε > 0, there exists a Borel-measurable, nonrandomized, semi-Markov, ε-optimal policy and a Borel-measurable, (randomized) Markov, ε-optimal policy.
Actually, it is not necessary that S and C be Borel spaces for Proposition 8.7 to hold. Assuming only that S and C are separable metrizable spaces, one can use the results on upper semicontinuity of Section 7.5 and the other assumptions of the upper semicontinuous model to prove the conclusion of Proposition 8.7.

It is not possible to parallel the proof of Proposition 8.4 to show for the upper semicontinuous model that, given a sequence of positive numbers {ε_n} with ε_n ↓ 0, a sequence of Borel-measurable, nonrandomized, Markov policies exhibiting {ε_n}-dominated convergence to optimality exists. The set A(J*_{N−1}) defined by (33) may not be open, so the proof breaks down when one is restricted to Borel-measurable policies.

We conclude this section by pointing out one important case when the disturbance kernel p(dw|x, u) is continuous. If W is n-dimensional Euclidean space and the distribution of w is given by a density d(w|x, u) which is continuous in (x, u) for fixed w, then p(dw|x, u) is continuous. To see this, let G be an open set in W and let (x_k, u_k) → (x, u) in SC. Then

liminf_{k→∞} p(G|x_k, u_k) = liminf_{k→∞} ∫_G d(w|x_k, u_k) dw ≥ ∫_G d(w|x, u) dw = p(G|x, u)
by Fatou's lemma. The continuity of p(dw|x, u) follows from Proposition 7.21.†

† Note that by the same argument, liminf_{k→∞} p(G^c|x_k, u_k) ≥ p(G^c|x, u), so p(G|x_k, u_k) → p(G|x, u). Under this condition, the assumption that the system function is continuous in the state (Definitions 8.7(d) and 8.8(c)) can be weakened. See [H3] and [S5].
In fact, it is not necessary that d be continuous in (x, u) for each w, but only that (x_k, u_k) → (x, u) imply d(w|x_k, u_k) → d(w|x, u) for Lebesgue almost all w. For example, if W = R, the exponential density

d(w|x, u) = { exp[−(w − m(x, u))]   if w ≥ m(x, u),
             0                     if w < m(x, u),

where m: SC → R is continuous, has this property, but need not be continuous in (x, u) for any w ∈ R.
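A numeric sketch of this closing remark, taking the invented continuous shift m(x, u) = x + u: the density jumps in (x, u) at the point where m(x, u) = w, yet the measure of a fixed interval varies continuously.

```python
import math

# The shifted exponential of the text: d(w|x,u) = exp(-(w - m(x,u))) for
# w >= m(x,u), zero otherwise.  For fixed w the map (x,u) -> d(w|x,u) is
# discontinuous where m(x,u) = w, but p(G|x,u) varies continuously.
# m(x,u) = x + u is an invented continuous shift.

def m(x, u):
    return x + u

def d(w, x, u):
    return math.exp(-(w - m(x, u))) if w >= m(x, u) else 0.0

def p_interval(a, b, x, u):
    """p((a,b)|x,u) in closed form for the shifted exponential density."""
    lo = max(a, m(x, u))
    return math.exp(-(lo - m(x, u))) - math.exp(-(b - m(x, u))) if b > lo else 0.0

# The density jumps as (x,u) crosses the point where m(x,u) = w ...
print(d(1.0, 0.5, 0.5), d(1.0, 0.5, 0.5 + 1e-9))
# ... while interval probabilities converge as (x_k,u_k) -> (x,u):
print([p_interval(1.0, 3.0, 0.5, 0.5 + 1.0 / k) for k in (1, 10, 1000)],
      p_interval(1.0, 3.0, 0.5, 0.5))
```

The jump at a single point is Lebesgue-null, which is exactly why the almost-everywhere convergence condition suffices.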
Chapter 9
The Infinite Horizon Borel Models
A first approach to the analysis of the infinite horizon decision model is to treat it as the limit of the finite horizon model as the horizon tends to infinity. In the case (N) of nonpositive cost per stage and the case (D) of bounded cost per stage and discount factor less than one, this procedure has merit. However, in the case (P) of nonnegative cost per stage, the finite horizon optimal cost functions can fail to converge to the infinite horizon optimal cost function (Example 1 in this chapter), and this failure to converge can occur in such a way that each finite horizon optimal cost function is Borel-measurable, while the infinite horizon optimal cost function is not (Example 2). We thus must develop an independent line of analysis for the infinite horizon model. Our strategy is to define two models, a stochastic one and its deterministic equivalent. There are no measurability restrictions on policies in the deterministic model, and the theory of Part I or of Bertsekas [B4], hereafter abbreviated DPSC, can be applied to it directly. We then transfer this theory to the stochastic model. Sections 9.1-9.3 set up the two models and establish their relationship. Sections 9.4-9.6 analyze the stochastic model via its deterministic counterpart.
9.1
The Stochastic Model
Definition 9.1 An infinite horizon stochastic optimal control model, denoted by (SM), is an eight-tuple (S, C, U, W, p, f, α, g) as described in
Definition 8.1. We consider three cases, where Γ is defined by (1) of Chapter 8:

(P) 0 ≤ g(x, u) for every (x, u) ∈ Γ.
(N) g(x, u) ≤ 0 for every (x, u) ∈ Γ.
(D) 0 < α < 1, and for some b ∈ R, −b ≤ g(x, u) ≤ b for every (x, u) ∈ Γ.

Thus we are really treating three models: (P), (N), and (D). If a result is applicable to one of these models, the corresponding symbol will appear. The assumptions (P), (N), and (D) replace the (F⁺) and (F⁻) conditions of Chapter 8.

Definition 9.2 A policy for (SM) is a sequence π = (μ_0, μ_1, ...) such that for each k, μ_k(du_k|x_0, u_0, ..., u_{k−1}, x_k) is a universally measurable stochastic kernel on C given SC⋯CS satisfying

μ_k(U(x_k)|x_0, u_0, ..., u_{k−1}, x_k) = 1

for every (x_0, u_0, ..., u_{k−1}, x_k). The concepts of semi-Markov, Markov, nonrandomized, and ℱ-measurable policies are the same as in Definition 8.2. We denote by Π′ the set of all policies for (SM) and by Π the set of all Markov policies. If π is a Markov policy of the form π = (μ, μ, ...), it is said to be stationary. As in Chapter 8, we often index S and C for clarity, understanding S_k to be a copy of S and C_k to be a copy of C.

Suppose p ∈ P(S) and π = (μ_0, μ_1, ...) is a policy for (SM). By Proposition 7.45, there is a sequence of unique probability measures r_N(π, p) on S_0C_0⋯S_{N−1}C_{N−1}, N = 1, 2, ..., such that for any N and any universally measurable function h: S_0C_0⋯S_{N−1}C_{N−1} → R* which satisfies either ∫ h⁺ dr_N(π, p) < ∞ or ∫ h⁻ dr_N(π, p) < ∞, (4) of Chapter 8 is satisfied. Furthermore, there exists a unique probability measure r(π, p) on S_0C_0S_1C_1⋯ such that for each N the marginal of r(π, p) on S_0C_0⋯S_{N−1}C_{N−1} is r_N(π, p). With r_N(π, p) and r(π, p) determined in this manner, we are ready to define the cost corresponding to a policy.

Definition 9.3 Suppose π is a policy for (SM). The (infinite horizon) cost corresponding to π at x ∈ S is
J_π(x) = ∫ [ ∑_{k=0}^∞ α^k g(x_k, u_k) ] dr(π, p_x) = ∑_{k=0}^∞ α^k ∫ g(x_k, u_k) dr(π, p_x).†   (1)

† The interchange of integration and summation is justified by appeal to the monotone convergence theorem under (P) and (N), and the bounded convergence theorem under (D).
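Under (D) the interchange in (1) is harmless because the tail of the series is uniformly small: with |g| ≤ b and 0 < α < 1, the cost beyond stage K is at most b α^K/(1 − α). A numeric sketch with an invented bounded cost sequence:

```python
# Tail bound behind the footnote's appeal to bounded convergence under (D):
# |g| <= b and 0 < alpha < 1 give
#   |sum over k >= K of alpha^k g_k| <= b * alpha^K / (1 - alpha).
# The oscillating stage costs below are invented but respect |g| <= b.

alpha, b = 0.9, 1.0
costs = [(-1) ** k * b for k in range(200)]

def partial(K):
    return sum(alpha ** k * costs[k] for k in range(K))

def tail_bound(K):
    return b * alpha ** K / (1 - alpha)

print(partial(50), partial(200), tail_bound(50))
```

The partial sums settle to within the geometric tail bound, which is what makes the discounted cost well defined for any bounded cost sequence.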
If π = (μ, μ, ...) is stationary, we sometimes write J_μ in place of J_π. The (infinite horizon) optimal cost at x is

J*(x) = inf_{π∈Π′} J_π(x).   (2)

If ε > 0, the policy π is ε-optimal at x provided

J_π(x) ≤ { J*(x) + ε   if J*(x) > −∞,
          −1/ε        if J*(x) = −∞.
If J_π(x) = J*(x), then π is optimal at x. If π is ε-optimal or optimal at every x ∈ S, it is said to be ε-optimal or optimal, respectively.

It is easy to see, using Propositions 7.45 and 7.46, that, for any policy π, J_π(x) is universally measurable in x. In fact, if π = (μ_0, μ_1, ...) and π_k = (μ_0, ..., μ_{k−1}), then J_{k,π_k}(x) defined by (5) of Chapter 8 is universally measurable in x and

lim_{k→∞} J_{k,π_k}(x) = J_π(x)   ∀x ∈ S.   (3)

If π is Markov, then (3) can be rewritten in terms of the operators T_{μ_k} of Definition 8.4 as

lim_{k→∞} (T_{μ_0} ⋯ T_{μ_{k−1}})(J_0)(x) = J_π(x)   ∀x ∈ S,   (4)

which is the infinite horizon analog of Lemma 8.1. If π is a Borel-measurable policy and g is Borel-measurable, then J_π(x) is Borel-measurable in x (Proposition 7.29). It may occur under (P), however, that lim_{k→∞} J*_k(x) ≠ J*(x), where J*_k(x) is the optimal k-stage cost defined by (6) of Chapter 8. We offer an example of this.
EXAMPLE 1 Let S = {0, 1, 2, ...}, C = {1, 2, ...}, U(x) = C for every x ∈ S, α = 1,

f(x, u) = { u       if x = 0,
           x − 1   if x ≠ 0,

g(x, u) = { 1   if x = 1,
           0   if x ≠ 1.

The problem is deterministic, so the choice of W and p(dw|x, u) is irrelevant. Beginning at x_0 = 0, the system moves to some positive integer u_0 at no cost. It then successively moves to u_0 − 1, u_0 − 2, ..., until it returns to zero and the process begins again. The only transition which incurs a nonzero cost is the transition from one to zero. If the horizon k is finite and u_0 is chosen larger than k, then no cost is incurred before termination, so J*_k(0) = 0. Over the infinite horizon, the transition from one to zero will be made infinitely often, regardless of the policy employed, so J*(0) = ∞.
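A direct computation for this example; the loop below just iterates f and accumulates g, reusing u_0 each time the state returns to 0.

```python
# Example 1, computed.  Finite-horizon costs vanish at x0 = 0 by choosing
# u0 larger than the horizon, but over a long horizon the unit cost at
# state 1 is paid once per cycle, so the total cost grows without bound.

def f(x, u):
    return u if x == 0 else x - 1

def g(x, u):
    return 1 if x == 1 else 0

def horizon_cost(x0, u0, k):
    """Cost of k stages from x0, using control u0 whenever the state is 0
    (the control is irrelevant at other states)."""
    x, total = x0, 0
    for _ in range(k):
        total += g(x, u0)
        x = f(x, u0)
    return total

# Choosing u0 = k + 1 avoids all cost within horizon k: J*_k(0) = 0.
print([horizon_cost(0, k + 1, k) for k in (1, 5, 50)])
# Any fixed u0 pays 1 per cycle of length u0 + 1, unboundedly often:
print([horizon_cost(0, 10, n) for n in (100, 1000)])
```

The horizon-dependent escape u_0 > k is what breaks the convergence of J*_k(0) to J*(0).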
For π = (μ_0, μ_1, ...) ∈ Π′ and p ∈ P(S), let q_k(π, p) be the marginal of r(π, p) on S_kC_k, k = 0, 1, .... Then (7) of Chapter 8 holds, and if π is Markov, (8) holds as well. Furthermore, from (1) we have

J_π(x) = ∑_{k=0}^∞ α^k ∫_{S_kC_k} g dq_k(π, p_x)   ∀x ∈ S,   (5)

which is the infinite horizon analog of (9) of Chapter 8. Using these facts to parallel the proof of Proposition 8.1, we obtain the following infinite horizon version.

Proposition 9.1 (P)(N)(D) If x ∈ S and π′ ∈ Π′, then there is a Markov policy π such that

J_π(x) = J_{π′}(x).

Corollary 9.1.1 (P)(N)(D) We have

J*(x) = inf_{π∈Π} J_π(x)   ∀x ∈ S,

where Π is the set of all Markov policies for (SM).

9.2
The Deterministic Model
Definition 9.4 Let (S, C, U, W, p, f, α, g) be an infinite horizon stochastic optimal control model as given by Definition 9.1. The corresponding infinite horizon deterministic optimal control model, denoted by (DM), consists of the following:

P(S)  State space.
P(SC)  Control space.
D̄  Control constraint. A function from P(S) to the set of nonempty subsets of P(SC) defined for each p ∈ P(S) by

D̄(p) = {q ∈ P(SC) | q(Γ) = 1 and the marginal of q on S is p},   (6)

where Γ is given by (1) of Chapter 8.
f̄  System function. The function from P(SC) to P(S) defined by

f̄(q)(B) = ∫_{SC} t(B|x, u) q(d(x, u))   ∀B ∈ ℬ_S,   (7)

where t(dx′|x, u) is given by (3) of Chapter 8.
α  Discount factor.
ḡ  One-stage cost function. The function from P(SC) to R* given by

ḡ(q) = ∫_{SC} g(x, u) q(d(x, u)).   (8)
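For finite S and C the objects of Definition 9.4 are just vectors and joint-distribution tables, which makes (6)-(8) easy to sketch; the kernel t and cost g below are invented.

```python
# Finite sketch of (6)-(8): a state distribution p is a vector over S, a
# control q in D_bar(p) is a joint distribution on S x C with S-marginal p
# (here Gamma = S x C), and the deterministic system moves p to f_bar(q).

S, C = (0, 1), (0, 1)
t = {(x, u): {u: 0.8, 1 - u: 0.2} for x in S for u in C}   # t(dx'|x,u)
g = {(x, u): float(x + u) for x in S for u in C}

def marginal_S(q):
    return {x: sum(q.get((x, u), 0.0) for u in C) for x in S}

def in_D_bar(p, q, tol=1e-12):
    """Membership test for the control constraint (6)."""
    return all(abs(marginal_S(q)[x] - p[x]) < tol for x in S)

def f_bar(q):
    """(7): f_bar(q)(B) = sum over (x,u) of t(B|x,u) q(x,u)."""
    out = {y: 0.0 for y in S}
    for (x, u), w in q.items():
        for y, pr in t[x, u].items():
            out[y] += pr * w
    return out

def g_bar(q):
    """(8): expected one-stage cost under q."""
    return sum(g[x, u] * w for (x, u), w in q.items())

p = {0: 0.5, 1: 0.5}
q = {(0, 1): 0.5, (1, 1): 0.5}       # both states select control 1
print(in_D_bar(p, q), f_bar(q), g_bar(q))
```

Choosing q amounts to choosing a (possibly randomized) control at each state, which is how (DM) absorbs the stochastic model's policies into deterministic dynamics on distributions.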
The model (DM) inherits considerable regularity from (8M). Its state and control spaces peS) and P(SC) are Borel spaces (Corollary 7.25.1). The system function J is Borel-measurable (Proposition 7.26 and Corollary 7.29.1), and the one-stage cost function g is lower semianalytic (Corollary 7.48.1). Furthermore, under assumption (P) in (8M), we have g ?': 0, while under (N), u s: 0, and under (D), 0< oc < 1 and VpEP(S).
Let σ: S → P(SC)P(SC)⋯ be defined by σ(x) = φ(p_x). Then σ is universally measurable (Proposition 7.44) and

G[p_x, σ(x)] ≤ J*(x) + ε  ∀x ∈ S.  (17)

Under (N), there exists a universally measurable σ: S → P(SC)P(SC)⋯ such that for every x ∈ S, (p_x, σ(x)) ∈ Δ and

G[p_x, σ(x)] ≤ J*(x) + ε  if J*(x) > −∞,
G[p_x, σ(x)] ≤ −(1 + ε²)/[ε p_∞(J*)]  if J*(x) = −∞,  (18)

where p_∞(J*) = p({x | J*(x) = −∞}) if p({x | J*(x) = −∞}) > 0 and p_∞(J*) = 1 otherwise.
Denote

σ(x) = [q₀(d(x₀, u₀)|x), q₁(d(x₁, u₁)|x), …].

Each q_k(d(x_k, u_k)|x) is a universally measurable stochastic kernel on S_kC_k given S. Furthermore,

q₀(d(x₀, u₀)|x) ∈ D̄(p_x)  ∀x ∈ S,

and, for k = 0, 1, …,

q_{k+1}(d(x_{k+1}, u_{k+1})|x) ∈ D̄[f̄(q_k(d(x_k, u_k)|x))]  ∀x ∈ S.

For k = 0, 1, …, define q̄_k ∈ P(S_kC_k) by

q̄_k(B) = ∫_S q_k(B|x) p(dx)  ∀B ∈ ℬ_{S_kC_k}.

Then q̄_k(Γ) = 1, k = 0, 1, …. We show that (p, q̄₀, q̄₁, …) ∈ Δ. Since the marginal of q₀(d(x₀, u₀)|x) on S₀ is p_x, we have

q̄₀(S̲₀C₀) = ∫_S q₀(S̲₀C₀|x) p(dx) = ∫_S χ_{S̲₀}(x) p(dx) = p(S̲₀),

so q̄₀ ∈ D̄(p). For k = 0, 1, …, we have

q̄_{k+1}(S̲_{k+1}C_{k+1}) = ∫_S q_{k+1}(S̲_{k+1}C_{k+1}|x) p(dx)
= ∫_S ∫_{S_kC_k} t(S̲_{k+1}|x_k, u_k) q_k(d(x_k, u_k)|x) p(dx)
= ∫_{S_kC_k} t(S̲_{k+1}|x_k, u_k) q̄_k(d(x_k, u_k))  ∀S̲_{k+1} ∈ ℬ_S.

Therefore q̄_{k+1} ∈ D̄[f̄(q̄_k)] and (p, q̄₀, q̄₁, …) ∈ Δ. Let π̄ be any policy for (DM) which generates the admissible sequence (p, q̄₀, q̄₁, …). Then under (P) and (D), we have from (17) and the monotone or bounded convergence theorem

J̄_π̄(p) = G(p, q̄₀, q̄₁, …)
= Σ_{k=0}^∞ α^k ∫_{S_kC_k} g(x_k, u_k) q̄_k(d(x_k, u_k))
= Σ_{k=0}^∞ α^k ∫_S ∫_{S_kC_k} g(x_k, u_k) q_k(d(x_k, u_k)|x) p(dx)
= ∫_S [Σ_{k=0}^∞ α^k ∫_{S_kC_k} g(x_k, u_k) q_k(d(x_k, u_k)|x)] p(dx)
= ∫_S G[p_x, σ(x)] p(dx)
≤ ∫_S J*(x) p(dx) + ε.
9.3
223
RELATIONS BETWEEN THE MODELS
Under (N), we have from the monotone convergence theorem

J̄_π̄(p) = ∫_S G[p_x, σ(x)] p(dx).  (19)

If p({x | J*(x) = −∞}) = 0, (18) and (19) imply

J̄_π̄(p) ≤ ∫ J*(x) p(dx) + ε,

where both sides may be −∞. If p({x | J*(x) = −∞}) > 0, then ∫ J*(x) p(dx) = −∞ and we have, from (18) and (19),

J̄_π̄(p) ≤ ∫_{{x|J*(x)>−∞}} [J*(x) + ε] p(dx) − (1 + ε²)/ε ≤ ε − (1 + ε²)/ε = −1/ε.  Q.E.D.

Proposition 9.5 (P)(N)(D) For every p ∈ P(S),

J̄*(p) = ∫ J*(x) p(dx).

Proof Lemma 9.2 shows that

J̄*(p) ≤ ∫ J*(x) p(dx)  ∀p ∈ P(S).

For the reverse inequality, let p be in P(S) and let π̄ be a policy for (DM). There exists a policy π ∈ Π′ corresponding to π̄ at p (Proposition 9.2), and, by Proposition 9.3,

J̄_π̄(p) = ∫ J_π(x) p(dx) ≥ ∫ J*(x) p(dx).

By taking the infimum of the left-hand side over all policies π̄ for (DM), we obtain the desired result. Q.E.D.

Propositions 9.3 and 9.5 are the key relationships between (SM) and (DM). As an example of their implications, consider the following corollary.

Corollary 9.5.1 (P)(N)(D) Suppose π ∈ Π′ and π̄ are corresponding policies for (SM) and (DM). Then π is optimal if and only if π̄ is optimal.

Proof If π is optimal, then

J̄_π̄(p) = ∫ J_π(x) p(dx) = ∫ J*(x) p(dx) = J̄*(p)  ∀p ∈ P(S).

If π̄ is optimal, then

J_π(x) = J̄_π̄(p_x) = J̄*(p_x) = J*(x)  ∀x ∈ S.  Q.E.D.
The next corollary is a technical result needed for Chapter 10.
Corollary 9.5.2 (P)(N)(D) For every p ∈ P(S),

∫ J*(x) p(dx) = inf_{π∈Π′} ∫ J_π(x) p(dx).

Proof By Propositions 9.2 and 9.3,

J̄*(p) = inf_{π∈Π′} ∫ J_π(x) p(dx)  ∀p ∈ P(S).

Apply Proposition 9.5. Q.E.D.
We now explore the connections between the operators T̄_μ̄ and T_μ and the operators T̄ and T. The first proposition is a direct consequence of the definitions. We leave the verification to the reader.

Proposition 9.6 (P)(N)(D) Let J: S → R* be universally measurable and satisfy J ≥ 0, J ≤ 0, or −c ≤ J ≤ c, c < ∞, according as (P), (N), or (D) is in force. Let J̄: P(S) → R* be defined by

J̄(p) = ∫_S J(x) p(dx)  ∀p ∈ P(S),

and suppose μ̄: P(S) → P(SC) is of the form

μ̄(p)(S̲C̲) = ∫_{S̲} μ(C̲|x) p(dx)  ∀S̲ ∈ ℬ_S, C̲ ∈ ℬ_C,

for some μ ∈ U(C|S).† Then μ̄(p) ∈ D̄(p) for every p ∈ P(S), and

T̄_μ̄(J̄)(p) = ∫_S T_μ(J)(x) p(dx)  ∀p ∈ P(S).
Proposition 9.7 (P)(N)(D) Let J: S → R* be lower semianalytic and satisfy J ≥ 0, J ≤ 0, or −c ≤ J ≤ c, c < ∞, according as (P), (N), or (D) is in force. Let J̄: P(S) → R* be defined by

J̄(p) = ∫_S J(x) p(dx)  ∀p ∈ P(S).  (20)

Then

T̄(J̄)(p) = ∫_S T(J)(x) p(dx)  ∀p ∈ P(S).

Proof For p ∈ P(S) and q ∈ D̄(p) we have

ḡ(q) + α J̄[f̄(q)] = ∫_{SC} [g(x, u) + α ∫_S J(x′) t(dx′|x, u)] q(d(x, u)) ≥ ∫_S T(J)(x) p(dx),

† The set U(C|S), defined in Section 8.2, is the collection of universally measurable stochastic kernels μ on C given S which satisfy μ(U(x)|x) = 1 for every x ∈ S.
9.4
225
THE OPTIMALITY EQUATION
which implies

T̄(J̄)(p) ≥ ∫ T(J)(x) p(dx).

Given ε > 0, Lemma 8.2 implies that there exists μ ∈ U(C|S) such that

∫_C [g(x, u) + α ∫_S J(x′) t(dx′|x, u)] μ(du|x) ≤ T(J)(x) + ε.

Let q̄ ∈ D̄(p) be such that

q̄(S̲C̲) = ∫_{S̲} μ(C̲|x) p(dx).

Then

T̄(J̄)(p) ≤ ∫_{SC} [g(x, u) + α ∫_S J(x′) t(dx′|x, u)] q̄(d(x, u))
= ∫_S ∫_C [g(x, u) + α ∫_S J(x′) t(dx′|x, u)] μ(du|x) p(dx)
≤ ∫_S T(J)(x) p(dx) + ε,

where ∫ T(J)(x) p(dx) + ε may be −∞. Therefore,

T̄(J̄)(p) ≤ ∫ T(J)(x) p(dx).  Q.E.D.

9.4 The Optimality Equation - Characterization of Optimal Policies
As noted following Definition 9.8, the model (DM) is a special case of that considered in Part I and in DPSC.† This allows us to easily obtain many results for both (SM) and (DM). A prime example of this is the next proposition.
Proposition 9.8 (P)(N)(D) We have

J̄* = T̄(J̄*),  (21)
J* = T(J*).  (22)
Proof The optimality equation (21) for (DM) follows from Propositions 4.2(a), 5.2, and 5.3 or from DPSC, Chapter 6, Proposition 2 and Chapter 7,

† Whereas we allow g to be extended real-valued, in Chapter 7 of DPSC the one-stage cost function is assumed to be real-valued. This more restrictive assumption is not essential to any of the results we quote from DPSC.
Proposition 1. We have then, for any x ∈ S,

J*(x) = J̄*(p_x) = T̄(J̄*)(p_x) = T(J*)(x)

by Propositions 9.5 and 9.7, so (22) holds as well. Q.E.D.
Proposition 9.9 (P)(N)(D) If π̄ = (μ̄, μ̄, …) is a stationary policy for (DM), then J̄_μ̄ = T̄_μ̄(J̄_μ̄). If π = (μ, μ, …) is a stationary policy for (SM), then J_μ = T_μ(J_μ).

Proof For (DM) this result follows from Proposition 4.2(b), Corollary 5.2.1, and Corollary 5.3.2 or from DPSC, Chapter 6, Corollary 2.1 and Chapter 7, Corollary 1.1. Let π = (μ, μ, …) be a stationary policy for (SM) and let π̄ = (μ̄, μ̄, …) be a policy for (DM) corresponding to π. Then for each x ∈ S,

J_μ(x) = J̄_μ̄(p_x) = T̄_μ̄(J̄_μ̄)(p_x) = T_μ(J_μ)(x)

by Propositions 9.3 and 9.6. Q.E.D.

Note that Proposition 9.9 for (SM) cannot be deduced from Proposition 9.8 by considering a modified (SM) with control constraint of the form

U_μ(x) = {μ(x)}  ∀x ∈ S,  (23)

as was done in the proof of Corollary 5.2.1. Even if μ is nonrandomized so that (23) makes sense, the set

Γ_μ = {(x, u) | x ∈ S, u ∈ U_μ(x)}

may not be analytic, so U_μ is not an acceptable control constraint.

The optimality equations are necessary conditions for the optimal cost functions, but except in case (D) they are by no means sufficient. We have the following partial sufficiency results.

Proposition 9.10

(P) If J̄: P(S) → [0, ∞] and J̄ ≥ T̄(J̄), then J̄ ≥ J̄*. If J: S → [0, ∞] is lower semianalytic and J ≥ T(J), then J ≥ J*.
(N) If J̄: P(S) → [−∞, 0] and J̄ ≤ T̄(J̄), then J̄ ≤ J̄*. If J: S → [−∞, 0] is lower semianalytic and J ≤ T(J), then J ≤ J*.
(D) If J̄: P(S) → [−c, c], c < ∞, and J̄ = T̄(J̄), then J̄ = J̄*. If J: S → [−c, c], c < ∞, is lower semianalytic and J = T(J), then J = J*.
Proof We consider first the statements for (DM). The result under (P) follows from Proposition 5.2, the result under (N) from Proposition 5.3, and the result under (D) from Proposition 4.2(a). These results for (DM)
follow from Proposition 2 and trivial modifications of the proof of Proposition 9 of DPSC, Chapter 6. We now establish the (SM) part of the proposition under (P). Cases (N) and (D) are handled in the same manner. Given a lower semianalytic function J: S → [0, ∞] satisfying J ≥ T(J), define J̄: P(S) → [0, ∞] by (20). Then

J̄(p) = ∫ J(x) p(dx) ≥ ∫ T(J)(x) p(dx) = T̄(J̄)(p)  ∀p ∈ P(S)

by Proposition 9.7. By the result for (DM), J̄ ≥ J̄*. In particular,

J(x) = J̄(p_x) ≥ J̄*(p_x) = J*(x)  ∀x ∈ S.  Q.E.D.

Proposition 9.11 Let π̄ = (μ̄, μ̄, …) and π = (μ, μ, …) be stationary policies in (DM) and (SM), respectively.

(P) If J̄: P(S) → [0, ∞] and J̄ ≥ T̄_μ̄(J̄), then J̄ ≥ J̄_μ̄. If J: S → [0, ∞] is universally measurable and J ≥ T_μ(J), then J ≥ J_μ.
(N) If J̄: P(S) → [−∞, 0] and J̄ ≤ T̄_μ̄(J̄), then J̄ ≤ J̄_μ̄. If J: S → [−∞, 0] is universally measurable and J ≤ T_μ(J), then J ≤ J_μ.
(D) If J̄: P(S) → [−c, c], c < ∞, and J̄ = T̄_μ̄(J̄), then J̄ = J̄_μ̄. If J: S → [−c, c], c < ∞, is universally measurable and J = T_μ(J), then J = J_μ.

Proof The (DM) results follow from Proposition 4.2(b) and Corollaries 5.2.1 and 5.3.2 or from DPSC, Corollary 2.1 and trivial modifications of Corollary 9.1 of Chapter 6. The (SM) results follow from the (DM) results and Proposition 9.6 in a manner similar to the proof of Proposition 9.10. Q.E.D.
Proposition 9.11 implies that under (P), J_μ is the smallest nonnegative universally measurable solution to the functional equation J = T_μ(J). Under (D), J_μ is the only bounded universally measurable solution to this equation. This provides us with a simple necessary and sufficient condition for a stationary policy to be optimal under (P) and (D).

Proposition 9.12 (P)(D) Let π̄ = (μ̄, μ̄, …) and π = (μ, μ, …) be stationary policies in (DM) and (SM), respectively. The policy π̄ is optimal if and only if J̄* = T̄_μ̄(J̄*). The policy π is optimal if and only if J* = T_μ(J*).

Proof If π̄ is optimal, then J̄_μ̄ = J̄*. By Proposition 9.9, J̄* = T̄_μ̄(J̄*). Conversely, if J̄* = T̄_μ̄(J̄*), then, by Proposition 9.11, J̄* ≥ J̄_μ̄ and π̄ is
optimal. The proof for (SM) follows from the (SM) parts of the same propositions. Q.E.D.

Corollary 9.12.1 (P)(D) There is an optimal nonrandomized stationary policy for (SM) if and only if for each x ∈ S the infimum in

inf_{u∈U(x)} {g(x, u) + α ∫ J*(x′) t(dx′|x, u)}  (24)

is achieved.

Proof If the infimum in (24) is achieved for every x ∈ S, then by Proposition 7.50 there is a universally measurable selector μ: S → C whose graph lies in Γ and for which

g[x, μ(x)] + α ∫ J*(x′) t(dx′|x, μ(x)) = inf_{u∈U(x)} {g(x, u) + α ∫ J*(x′) t(dx′|x, u)}  ∀x ∈ S.

Then by Proposition 9.8

T_μ(J*) = T(J*) = J*,

so π = (μ, μ, …) is optimal by Proposition 9.12. If π = (μ, μ, …) is an optimal nonrandomized stationary policy for (SM), then by Propositions 9.8 and 9.9

T_μ(J*) = T_μ(J_μ) = J_μ = J* = T(J*),

so μ(x) achieves the infimum in (24) for every x ∈ S. Q.E.D.
In Proposition 9.19, we show that under (P) or (D), the existence of any optimal policy at all implies the existence of an optimal policy that is nonrandomized and stationary. This means that Corollary 9.12.1 actually gives a necessary and sufficient condition for the existence of an optimal policy. Under (N) we can use Proposition 9.10 to obtain a necessary and sufficient condition for a stationary policy to be optimal. This condition is not as useful as that of Proposition 9.12, however, since it cannot be used to construct a stationary optimal policy in the manner of Corollary 9.12.1.

Proposition 9.13 (N)(D) Let π̄ = (μ̄, μ̄, …) and π = (μ, μ, …) be stationary policies in (DM) and (SM), respectively. The policy π̄ is optimal if and only if J̄_μ̄ = T̄(J̄_μ̄). The policy π is optimal if and only if J_μ = T(J_μ).

Proof If π̄ is optimal, then J̄_μ̄ = J̄*. By Proposition 9.8

J̄_μ̄ = J̄* = T̄(J̄*) = T̄(J̄_μ̄).
9.5
CONVERGENCE OF THE DYNAMIC PROGRAMMING ALGORITHM
229
Conversely, if J̄_μ̄ = T̄(J̄_μ̄), then Proposition 9.10 implies that J̄_μ̄ ≤ J̄* and π̄ is optimal. If π is optimal, J_μ = T(J_μ) by the (SM) part of Proposition 9.8. The converse is more difficult, since the (SM) part of Proposition 9.10 cannot be invoked without knowing that J_μ is lower semianalytic. Let π̄ = (μ̄, μ̄, …) be a policy for (DM) corresponding to π = (μ, μ, …), so that J̄_μ̄(p) = ∫ J_μ(x) p(dx) for every p ∈ P(S). Then for fixed p ∈ P(S) and q ∈ D̄(p),

ḡ(q) + α J̄_μ̄[f̄(q)] = ∫_{SC} [g(x, u) + α ∫_S J_μ(x′) t(dx′|x, u)] q(d(x, u))
≥ ∫_S inf_{u∈U(x)} {g(x, u) + α ∫_S J_μ(x′) t(dx′|x, u)} p(dx),

provided the integrand

T(J_μ)(x) = inf_{u∈U(x)} {g(x, u) + α ∫_S J_μ(x′) t(dx′|x, u)}

is universally measurable in x. But T(J_μ) = J_μ by assumption, and J_μ is universally measurable, so

ḡ(q) + α J̄_μ̄[f̄(q)] ≥ ∫_S J_μ(x) p(dx) = J̄_μ̄(p).

By taking the infimum of the left-hand side over q ∈ D̄(p) and using Proposition 9.9, we see that T̄(J̄_μ̄) ≥ J̄_μ̄ = T̄_μ̄(J̄_μ̄). The reverse inequality always holds, and by the result already proved for (DM), π̄ is optimal. The optimality of π follows from Corollary 9.5.1. Q.E.D.

9.5 Convergence of the Dynamic Programming Algorithm - Existence of Stationary Optimal Policies
Definition 9.10 The dynamic programming algorithm is defined recursively for (DM) and (SM) by

J̄₀(p) = 0  ∀p ∈ P(S),
J̄_{k+1}(p) = T̄(J̄_k)(p)  ∀p ∈ P(S),  k = 0, 1, …,
J₀(x) = 0  ∀x ∈ S,
J_{k+1}(x) = T(J_k)(x)  ∀x ∈ S,  k = 0, 1, ….

We know from Proposition 8.2 that this algorithm generates the k-stage optimal cost functions J*_k. For simplicity of notation, we suppress the * here. At present we are concerned with the infinite horizon case and the possibility that J_k may converge to J* as k → ∞.
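For a finite-state, finite-control model, the recursion of Definition 9.10 is ordinary value iteration. A minimal sketch under (D) follows; the transition law t and cost g are illustrative data, not taken from the text:

```python
import numpy as np

# Value iteration J_{k+1} = T(J_k) for a 2-state, 2-control model under (D):
# |g| <= b = 3 and discount factor 0 < alpha < 1.
alpha = 0.9
g = np.array([[1.0, 2.0], [0.0, 3.0]])        # g[x, u]
t = np.array([[[0.8, 0.2], [0.1, 0.9]],       # t[x, u, x']
              [[0.5, 0.5], [0.3, 0.7]]])

def T(J):
    """T(J)(x) = min_u { g(x,u) + alpha * sum_x' t(x'|x,u) J(x') }."""
    return np.min(g + alpha * t @ J, axis=1)

J = np.zeros(2)                 # J_0 = 0
for _ in range(500):
    J = T(J)                    # J_{k+1} = T(J_k)
print(J)                        # approximates the fixed point J* = T(J*)
```

Because T is an alpha-contraction under (D), the iterates converge geometrically and stay within b/(1 - alpha) in absolute value.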
Under (P), J̄₀ ≤ J̄₁, and so J̄₁ = T̄(J̄₀) ≤ T̄(J̄₁) = J̄₂. Continuing, we see that J̄_k is an increasing sequence of functions, and so J̄_∞ = lim_{k→∞} J̄_k exists and takes values in [0, +∞]. Under (N), J̄_k is a decreasing sequence of functions and J̄_∞ exists, taking values in [−∞, 0]. Under (D), we have

J̄₀ ≤ b + T̄(J̄₀),
J̄₀ ≤ b + T̄(J̄₀) ≤ b + T̄[b + T̄(J̄₀)] = (1 + α)b + T̄²(J̄₀),
J̄₀ ≤ b + T̄[(1 + α)b + T̄²(J̄₀)] = (1 + α + α²)b + T̄³(J̄₀),

and, in general,

J̄₀ ≤ b Σ_{i=0}^{k−2} αⁱ + T̄^{k−1}(J̄₀) ≤ b Σ_{i=0}^{k−1} αⁱ + T̄^k(J̄₀).

As k → ∞, we see that b Σ_{i=0}^{k−1} αⁱ + T̄^k(J̄₀) increases to a limit. But b Σ_{i=0}^∞ αⁱ = b/(1 − α), so J̄_∞ = lim_{k→∞} T̄^k(J̄₀) exists and satisfies −b/(1 − α) ≤ J̄_∞. Similarly, we have

J̄_∞ ≤ b/(1 − α).

Now if J̄: P(S) → [−c, c], c < ∞, then

J̄₀ ≤ J̄ + c,
T̄(J̄₀) ≤ T̄(J̄ + c) = αc + T̄(J̄),

and, in general, T̄^k(J̄₀) ≤ α^k c + T̄^k(J̄). It follows that

J̄_∞ ≤ lim inf_{k→∞} T̄^k(J̄),

and by a similar argument beginning with J̄ − c ≤ J̄₀, we can show that lim sup_{k→∞} T̄^k(J̄) ≤ J̄_∞. This shows that under (D), if J̄ is any bounded real-valued function on P(S), then J̄_∞ = lim_{k→∞} T̄^k(J̄). In like manner, under (P), J_∞ = lim_{k→∞} T^k(J₀) takes values in [0, +∞]; under (N), J_∞: S → [−∞, 0]; and under (D), J_∞ = lim_{k→∞} T^k(J) takes values in [−b/(1 − α), b/(1 − α)], where J: S → [−c, c], c < ∞, is lower semianalytic. Note that in every case, J_∞ is lower semianalytic by Lemma 7.30(2).

Lemma 9.3 (P)(N)(D) For every p ∈ P(S),

J̄_k(p) = ∫ J_k(x) p(dx),  k = 0, 1, …, k = ∞.
Proof For k = 0, 1, …, the lemma follows from Proposition 9.7 by induction. When k = ∞, the lemma follows from the monotone convergence theorem under (P) and (N) and the bounded convergence theorem under (D). Q.E.D.

Proposition 9.14 (N)(D) We have

J̄_∞ = J̄*,  (25)
J_∞ = J*.  (26)

Indeed, under (D) the dynamic programming algorithm can be initiated from any J̄: P(S) → [−c, c], c < ∞, or lower semianalytic J: S → [−c, c], c < ∞, and converges uniformly, i.e.,

lim_{k→∞} sup_{p∈P(S)} |T̄^k(J̄)(p) − J̄*(p)| = 0,  (27)
lim_{k→∞} sup_{x∈S} |T^k(J)(x) − J*(x)| = 0.  (28)

Proof The result for (DM) follows from Propositions 4.2(c) and 5.7 or from DPSC, Chapter 6, Proposition 3 and Chapter 7, Proposition 4. By Lemma 9.3,

J_k(x) = J̄_k(p_x)  ∀x ∈ S,  k = 0, 1, …, k = ∞,

so (25) implies (26). Under (D), if a lower semianalytic function J: S → [−c, c], c < ∞, is given, then define J̄: P(S) → [−c, c] by (20). Equation (28) now follows from (27) and Propositions 9.5 and 9.7. Q.E.D.

Case (D) is the best suited for computational procedures. The machinery developed thus far can be applied to Proposition 4.6 or to DPSC, Chapter 6, Proposition 4, to show the validity for (SM) of the error bounds given there. We provide the theorem for (SM). The analogous result is of course true for (DM).

Proposition 9.15 (D) Let J: S → [−c, c], c < ∞, be lower semianalytic. Then for all x ∈ S and k = 0, 1, …,

T^k(J)(x) + b_k ≤ T^{k+1}(J)(x) + b_{k+1} ≤ J*(x) ≤ T^{k+1}(J)(x) + b̄_{k+1} ≤ T^k(J)(x) + b̄_k,  (29)

where

b_k = [α/(1 − α)] inf_{x∈S} [T^k(J)(x) − T^{k−1}(J)(x)],  (30)
b̄_k = [α/(1 − α)] sup_{x∈S} [T^k(J)(x) − T^{k−1}(J)(x)].  (31)
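On a finite model the brackets (29)-(31) can be checked numerically after each value-iteration step. The sketch below reuses illustrative data (t and g are assumptions, not from the text):

```python
import numpy as np

alpha = 0.9
g = np.array([[1.0, 2.0], [0.0, 3.0]])        # g[x, u]
t = np.array([[[0.8, 0.2], [0.1, 0.9]],       # t[x, u, x']
              [[0.5, 0.5], [0.3, 0.7]]])

def T(J):
    return np.min(g + alpha * t @ J, axis=1)

# Near-exact J* for reference, via many iterations of the contraction T.
J_star = np.zeros(2)
for _ in range(2000):
    J_star = T(J_star)

J_prev = np.zeros(2)
for k in range(1, 20):
    J_k = T(J_prev)
    b_k  = alpha / (1 - alpha) * np.min(J_k - J_prev)   # (30)
    bb_k = alpha / (1 - alpha) * np.max(J_k - J_prev)   # (31)
    # (29): T^k(J) + b_k <= J* <= T^k(J) + b_bar_k at every state
    assert np.all(J_k + b_k <= J_star + 1e-8)
    assert np.all(J_star <= J_k + bb_k + 1e-8)
    J_prev = J_k
```

These are the classical monotone error bounds for discounted value iteration; they tighten as k grows.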
Proof Given a lower semianalytic function J: S → [−c, c], c < ∞,

Proposition 9.19 (P)(D) For each ε > 0, there exists an ε-optimal nonrandomized Markov policy for (SM), and if α < 1, it can be taken to be
stationary. If for each x ∈ S there exists a policy for (SM) which is optimal at x, then there exists an optimal nonrandomized stationary policy.

Proof Choose ε > 0 and ε_k > 0 such that Σ_{k=0}^∞ α^k ε_k = ε. If α < 1, let ε_k = (1 − α)ε for every k. By Proposition 7.50, there are universally measurable functions μ_k: S → C, k = 0, 1, …, such that μ_k(x) ∈ U(x) for every x ∈ S and

T_{μ_k}(J*) ≤ J* + ε_k.

If α < 1, we choose all the μ_k to be identical. Then

(T_{μ_{k−1}}T_{μ_k})(J*) ≤ T_{μ_{k−1}}(J*) + αε_k ≤ J* + ε_{k−1} + αε_k.

Continuing this process, we have

(T_{μ_0}T_{μ_1}⋯T_{μ_k})(J*) ≤ J* + Σ_{j=0}^k α^j ε_j ≤ J* + ε,

and, letting k → ∞, we obtain

lim_{k→∞} (T_{μ_0}T_{μ_1}⋯T_{μ_k})(J*) ≤ J* + ε.

Under (P) we have

J_π = lim_{k→∞} (T_{μ_0}T_{μ_1}⋯T_{μ_k})(J₀) ≤ lim_{k→∞} (T_{μ_0}T_{μ_1}⋯T_{μ_k})(J*),  (41)

so π = (μ_0, μ_1, …) is ε-optimal. Under (D),

J₀ ≤ J* + [b/(1 − α)],
(T_{μ_0}T_{μ_1}⋯T_{μ_k})(J₀) ≤ (T_{μ_0}T_{μ_1}⋯T_{μ_k})(J* + [b/(1 − α)]) = [α^{k+1}b/(1 − α)] + (T_{μ_0}T_{μ_1}⋯T_{μ_k})(J*),

so (41) is valid and π = (μ_0, μ_1, …) is ε-optimal. This proves the first part of the proposition.

Suppose that for each x ∈ S there is a policy for (SM) which is optimal at x. Fix x and let π = (μ_0, μ_1, …) be a policy which is optimal at x. By Proposition 9.1, we may assume without loss of generality that π is Markov. By Lemma 8.4(b) and (c), we have

J*(x) = J_π(x) = lim_{k→∞} (T_{μ_0}T_{μ_1}⋯T_{μ_k})(J₀)(x)
= T_{μ_0}[lim_{k→∞} (T_{μ_1}⋯T_{μ_k})(J₀)](x)
≥ T_{μ_0}(J*)(x) ≥ T(J*)(x)
= J*(x).
9.6
239
EXISTENCE OF e-OPTIMAL POLICIES
Consequently,

T_{μ_0}(J*)(x) = T(J*)(x).

This implies that the infimum in the expression

inf_{u∈U(x)} {g(x, u) + α ∫ J*(x′) t(dx′|x, u)}

is achieved. Since x is arbitrary, Corollary 9.12.1 implies the existence of an optimal nonrandomized stationary policy. Q.E.D.

Proposition 9.20 (N) For each ε > 0, there exists an ε-optimal nonrandomized semi-Markov policy for (SM). If for each x ∈ S there exists a policy for (SM) which is optimal at x, then there exists a semi-Markov (randomized) optimal policy.
Proof Under (N) we have J_k ↓ J* (Proposition 9.14), so, given ε > 0, the analytically measurable sets

A_k = {x ∈ S | J*(x) > −∞, J_k(x) ≤ J*(x) + ε/2} ∪ {x ∈ S | J*(x) = −∞, J_k(x) ≤ −(2 + ε²)/2ε}

converge up to S as k → ∞. By Proposition 8.3, for each k there exists a k-stage nonrandomized semi-Markov policy π^k such that for every x ∈ S

J_{k,π^k}(x) ≤ J_k(x) + ε/2  if J_k(x) > −∞,
J_{k,π^k}(x) ≤ −1/ε  if J_k(x) = −∞.

Then for x ∈ A_k we have either J*(x) > −∞ and

J_{k,π^k}(x) ≤ J_k(x) + ε/2 ≤ J*(x) + ε,

or else J*(x) = −∞. If J*(x) = −∞, then either J_k(x) = −∞ and

J_{k,π^k}(x) ≤ −1/ε,

or else J_k(x) > −∞ and

J_{k,π^k}(x) ≤ J_k(x) + ε/2 ≤ −[(2 + ε²)/2ε] + ε/2 = −1/ε.

Choose any μ ∈ U(C|S) and define π̂^k = (μ₀^k, …, μ_{k−1}^k, μ, μ, …), where π^k = (μ₀^k, …, μ_{k−1}^k). For every x ∈ A_k, we have

J_{π̂^k}(x) ≤ J*(x) + ε  if J*(x) > −∞,
J_{π̂^k}(x) ≤ −1/ε  if J*(x) = −∞,

so π̂^k is a nonrandomized semi-Markov policy which is ε-optimal for every x ∈ A_k. The policy π, defined to be π̂^k when the initial state is in A_k but not in A_j for any j < k, is semi-Markov, nonrandomized, and ε-optimal at every x ∈ ∪_{k=1}^∞ A_k = S.
Suppose now that for each x ∈ S there exists a policy π^x for (SM) which is optimal at x. Let π̄^x be a policy for (DM) which corresponds to π^x, and let (p_x, q₀^x, q₁^x, …) be the sequence generated from p_x by π̄^x via (10) and (11). If G: Δ → [−∞, 0] is defined by (15), then we have from Proposition 9.3 that

J*(x) = J_{π^x}(x) = J̄_{π̄^x}(p_x) = G(p_x, q₀^x, q₁^x, …).  (42)

We have from Proposition 9.5 and (16) that

J*(x) = J̄*(p_x) = inf_{(q₀,q₁,…)∈Δ_{p_x}} G(p_x, q₀, q₁, …).  (43)

Therefore the infimum in (43) is attained for every p_x ∈ S̄, where S̄ = {p_y | y ∈ S}, so by Proposition 7.50, there exists a universally measurable selector ψ: S̄ → P(SC)P(SC)⋯ such that ψ(p_x) ∈ Δ_{p_x} and

J*(x) = J̄*(p_x) = G[p_x, ψ(p_x)]  ∀p_x ∈ S̄.

Let δ: S → S̄ be the homeomorphism δ(x) = p_x and let φ(x) = ψ[δ(x)]. Then φ is universally measurable, φ(x) ∈ Δ_{p_x}, and

J*(x) = G[p_x, φ(x)]  ∀x ∈ S.  (44)

Denote

φ(x) = [q₀(d(x₀, u₀)|x), q₁(d(x₁, u₁)|x), …].

For each k ≥ 0, q_k(d(x_k, u_k)|x) is a universally measurable stochastic kernel on S_kC_k given S, and by Proposition 7.27 and Lemma 7.28(a), (b), q_k(d(x_k, u_k)|x) can be decomposed into its marginal p_k(dx_k|x), which is a universally measurable stochastic kernel on S_k given S, and a universally measurable stochastic kernel μ_k(du_k|x, x_k) on C_k given SS_k. Since p₀(dx₀|x) = p_x(dx₀), the stochastic kernel μ₀(du₀|x, x₀) is arbitrary except when x = x₀. Set

μ̂₀(du₀|x) = μ₀(du₀|x, x)  ∀x ∈ S.

The sequence π = (μ̂₀, μ₁, μ₂, …) is a randomized semi-Markov policy for (SM). From (7) of Chapter 8, we see that for each x ∈ S

q_k(π, p_x) = q_k(d(x_k, u_k)|x),  k = 0, 1, ….

From (5), (15), and (44), we have

J_π(x) = G[p_x, q₀(π, p_x), q₁(π, p_x), …] = J*(x)  ∀x ∈ S,

so π is optimal. Q.E.D.
Although randomized policies may be considered inferior and are avoided in practice, under (N) as posed here they cannot be disregarded even in deterministic problems, as the following example demonstrates.
EXAMPLE 3 (St. Petersburg paradox) Let S = {0, 1, 2, …}, C = {0, 1}, U(x) = C for every x ∈ S, α = 1,

f(x, u) = x + 1 if u = 1, x ≠ 0,  f(x, u) = 0 otherwise,
g(x, u) = −2^x if x ≠ 0, u = 0,  g(x, u) = 0 otherwise.

Beginning in state one, any nonrandomized policy either increases the state by one indefinitely and incurs no nonzero cost or else, after k increases, jumps the system to zero at a cost of −2^{k+1}, where it remains at no further cost. Thus J*(1) = −∞, but this cost is not achieved by any nonrandomized policy. On the other hand, the randomized stationary policy which jumps the system to zero with probability 1/2 when the state x is nonzero yields an expected cost of −∞ and is optimal at every x ∈ S.
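The divergence of the expected cost under the randomized policy can be checked term by term: starting at state one, the jump after k increases has probability (1/2)^{k+1} and cost −2^{k+1}, so every term of the expectation equals −1. A short sketch, with the jump probability left as a parameter:

```python
from fractions import Fraction

def truncated_expected_cost(n, p=Fraction(1, 2)):
    """Sum of the first n terms of the expected cost, starting at state 1,
    for the policy that jumps to zero with probability p at each nonzero state."""
    total = Fraction(0)
    for k in range(n):
        total += p * (1 - p) ** k * (-(2 ** (k + 1)))
    return total

# With p = 1/2 every term equals -1, so the partial sums diverge to -infinity,
assert truncated_expected_cost(10) == -10
# while any jump probability p > 1/2 gives a finite expected cost (here > -3).
assert truncated_expected_cost(1000, p=Fraction(3, 4)) > -3
```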
The one-stage cost g in Example 3 is unbounded, but by a slight modification an example can be constructed in which g is bounded and the only optimal policies are randomized. If one stipulates that J* must be finite, it may be possible to restrict attention to nonrandomized policies in Proposition 9.20. This is an unsolved problem. If (SM) is lower semicontinuous, then Proposition 9.19 can be strengthened, as Corollary 9.17.2 shows. Similarly, if (SM) is upper semicontinuous, a stronger version of Proposition 9.20 can be proved.

Proposition 9.21 Assume (SM) satisfies conditions (a)-(d) of Definition 8.8 (the upper semicontinuous model).

(D) For each ε > 0, there exists a Borel-measurable, ε-optimal, nonrandomized, stationary policy.
(N) For each ε > 0, there exists a Borel-measurable, ε-optimal, nonrandomized, semi-Markov policy.

Under both (D) and (N), J* is upper semicontinuous.

Proof Under (D) and (N) we have lim_{k→∞} J_k = J* (Proposition 9.14), and each J_k is upper semicontinuous (Proposition 8.7). By an argument similar to that used in the proof of Corollary 9.17.2, J* is upper semicontinuous. By using Proposition 7.34 in place of Proposition 7.50, the proof of Proposition 9.19 can be modified to show the existence of a Borel-measurable, ε-optimal, nonrandomized, stationary policy under (D). By using Proposition 8.7 in place of Proposition 8.3, the proof of Proposition 9.20 can be modified to show the existence of a Borel-measurable, ε-optimal, nonrandomized, semi-Markov policy under (N). Q.E.D.
Chapter 10
The Imperfect State Information Model
In the models of Chapters 8 and 9 the current state of the system is known to the controller at each stage. In many problems of practical interest, however, the controller has instead access only to imperfect measurements of the system state. This chapter is devoted to the study of models relating to such situations. In our analysis we will encounter nonstationary versions of the models of Chapters 8 and 9. We will show in the next section that nonstationary models can be reduced to stationary ones by appropriate reformulation. We will thus be able to obtain nonstationary counterparts to the results of Chapters 8 and 9.

10.1 Reduction of the Nonstationary Model - State Augmentation

The finite horizon stochastic optimal control model of Definition 8.1 and the infinite horizon stochastic optimal control model of Definition 9.1 are said to be stationary, i.e., the data defining the model does not vary from stage to stage. In this section we define a nonstationary model and show how it can be reduced to a stationary one by augmenting the state with the time index.
10.1
REDUCTION OF THE NONSTATIONARY MODEL
243
We combine the treatments of the finite and infinite horizon models. Thus when N = ∞ and notation of the form S₀, S₁, …, S_{N−1} or k = 0, …, N − 1 appears, we take this to mean S₀, S₁, … and k = 0, 1, …, respectively.

Definition 10.1 A nonstationary stochastic optimal control model, denoted by (NSM), consists of the following objects:

N Horizon. A positive integer or ∞.
S_k, k = 0, …, N − 1 State spaces. For each k, S_k is a nonempty Borel space.
C_k, k = 0, …, N − 1 Control spaces. For each k, C_k is a nonempty Borel space.
U_k, k = 0, …, N − 1 Control constraints. For each k, U_k is a function from S_k to the set of nonempty subsets of C_k, and the set

Γ_k = {(x_k, u_k) | x_k ∈ S_k, u_k ∈ U_k(x_k)}  (1)

is analytic in S_kC_k.
W_k, k = 0, …, N − 1 Disturbance spaces. For each k, W_k is a nonempty Borel space.
p_k(dw_k|x_k, u_k), k = 0, …, N − 1 Disturbance kernels. For each k, p_k(dw_k|x_k, u_k) is a Borel-measurable stochastic kernel on W_k given S_kC_k.
f_k, k = 0, …, N − 2 System functions. For each k, f_k is a Borel-measurable function from S_kC_kW_k to S_{k+1}.
α Discount factor. A positive real number.
g_k, k = 0, …, N − 1 One-stage cost functions. For each k, g_k is a lower semianalytic function from Γ_k to R*.

We envision a system which begins at some x_k ∈ S_k and moves successively through state spaces S_{k+1}, S_{k+2}, … and, if N < ∞, finally terminates in S_{N−1}. A policy governing such a system evolution is a sequence π^k = (μ_k, μ_{k+1}, …, μ_{N−1}), where each μ_j is a universally measurable stochastic kernel on C_j given S_kC_k⋯C_{j−1}S_j satisfying

μ_j(U_j(x_j)|x_k, u_k, …, u_{j−1}, x_j) = 1

for every (x_k, u_k, …, u_{j−1}, x_j). Such a policy is called a k-originating policy and the collection of all k-originating policies will be denoted by Π_k. The concepts of semi-Markov, Markov, nonrandomized, and ℱ-measurable policies are analogous to those of Definitions 8.2 and 9.2. The set Π₀ is also written as Π′, and the subset of Π′ consisting of all Markov policies is denoted by Π. Define the Borel-measurable state transition stochastic kernels by

t_k(S̲_{k+1}|x_k, u_k) = p_k({w_k ∈ W_k | f_k(x_k, u_k, w_k) ∈ S̲_{k+1}}|x_k, u_k).

244
10.
THE IMPERFECT STATE INFORMATION MODEL

Given a probability measure p_k ∈ P(S_k) and a policy π^k = (μ_k, …, μ_{N−1}) ∈ Π_k, define for j = k, k + 1, …, N − 1

q_j(π^k, p_k)(S̲_jC̲_j) = ∫_{S_k} ∫_{C_k} ⋯ ∫_{C_{j−1}} ∫_{S̲_j} μ_j(C̲_j|x_k, u_k, …, u_{j−1}, x_j) t_{j−1}(dx_j|x_{j−1}, u_{j−1}) μ_{j−1}(du_{j−1}|x_k, u_k, …, u_{j−2}, x_{j−1}) ⋯ μ_k(du_k|x_k) p_k(dx_k)  ∀S̲_j ∈ ℬ_{S_j}, C̲_j ∈ ℬ_{C_j}.  (2)
There is a unique probability measure q_j(π^k, p_k) ∈ P(S_jC_j) satisfying (2). If the horizon N is finite, we treat (NSM) only under one of the following assumptions:

J_{π̄(q)}(q) ≤ ∫ J*(y₀) q(dy₀) + ε  if ∫ J*(y₀) q(dy₀) > −∞,
J_{π̄(q)}(q) ≤ −1/ε  if ∫ J*(y₀) q(dy₀) = −∞,
i.e., π̄(q) is weakly q-ε-optimal for (PSI). By Proposition 10.3, the policy π defined by (44), where μ_k(p; y_k) = μ̄(φ(p); y_k), is ε-optimal for (ISI). Q.E.D.

The other specific results which can be derived for (ISI) from Chapters 8 and 9 are obvious and shall not be exhaustively listed. We content ourselves with describing the dynamic programming algorithm over a finite horizon. By Proposition 8.2, the dynamic programming algorithm has the following form under (F+, P+) or (F−, P−), where we assume for notational simplicity that (PSI) is stationary:

J*₀(y) = 0  ∀y ∈ Y,  (46)
J*_{k+1}(y) = inf_{u∈U(y)} {g(y, u) + α ∫ J*_k(y′) t(dy′|y, u)},  k = 0, …, N − 1.  (47)
If the infimum in (47) is achieved for every y and k = 0, …, N − 1, then there exist universally measurable functions μ̄_k: Y → C such that for every y and k = 0, …, N − 1, μ̄_k(y) ∈ U(y) and μ̄_k(y) achieves the infimum in (47). Then π̄ = (μ̄₀, …, μ̄_{N−1}) is optimal in (PSI) (Proposition 8.5), so is optimal in (ISI) as well (Proposition 10.3). If (F+, P+) holds and the infimum in (47) is not achieved for every y and k = 0, …, N − 1, the dynamic programming algorithm (46) and (47) can still be used in the manner of Proposition 8.3 to construct an ε-optimal, nonrandomized, Markov policy π̄ for (PSI). We see from Proposition 10.3 that π̄ is an ε-optimal policy for (ISI) as well. In many cases, η_{k+1}(p; i_{k+1}) is a function of η_k(p; i_k), u_k, and z_{k+1}. The computational procedure in such a case is to first construct (μ̄₀, …, μ̄_{N−1}) via (46) and (47), then compute y₀ = η₀(p; i₀) from the initial distribution and the initial observation, and apply control u₀ = μ̄₀(y₀). Given y_k, u_k, and z_{k+1}, compute y_{k+1} and apply control u_{k+1} = μ̄_{k+1}(y_{k+1}), k = 0, …, N − 2. In this way the information contained in (p; i_k) has been condensed into y_k. This condensation of information is the historical motivation for statistics sufficient for control.

10.3 Existence of Statistics Sufficient for Control

Turning to the question of the existence of a statistic sufficient for control, it is not surprising to discover that the sequence of identity mappings on P(S)I_k, k = 0, …, N − 1, is such an object (Proposition 10.6). Although this
represents no condensation of information, it is sufficient to justify our analysis thus far. We will show that if the constraint sets Γ_k are equal to I_kC_k, k = 0, …, N − 1, then the functions mapping P(S)I_k into the distribution of x_k conditioned on (p; i_k), k = 0, …, N − 1, constitute a statistic sufficient for control (Proposition 10.5). This statistic has the property that its value at the (k + 1)st stage is a function of its value at the kth stage, u_k, and z_{k+1} [see (52)], so it represents a genuine condensation of information. It also results in a stationary perfect state information model and, if the conditional distributions can be characterized by a finite set of parameters, it may result in significant computational simplification. This latter condition is the case, for example, if it is possible to show beforehand that all these distributions are Gaussian.
10.3.1 Filtering and the Conditional Distributions of the States

We discuss filtering with the aid of the following basic lemma.

Lemma 10.3 Consider the (ISI) model. There exist Borel-measurable stochastic kernels r₀(dx₀|p; z₀) on S given P(S)Z and r(dx|p; u, z) on S given P(S)CZ which satisfy

∫_{S̲₀} s₀(Z̲₀|x₀) p(dx₀) = ∫_S ∫_{Z̲₀} r₀(S̲₀|p; z₀) s₀(dz₀|x₀) p(dx₀)  ∀S̲₀ ∈ ℬ_S, Z̲₀ ∈ ℬ_Z, p ∈ P(S),  (48)

∫_{S̲} s(Z̲|u, x) p(dx) = ∫_S ∫_{Z̲} r(S̲|p; u, z) s(dz|u, x) p(dx)  ∀S̲ ∈ ℬ_S, Z̲ ∈ ℬ_Z, p ∈ P(S), u ∈ C.  (49)
Proof For fixed (p; u) ∈ P(S)C, define a probability measure q on SZ by specifying its values on measurable rectangles to be (Proposition 7.28)

q(S̲Z̲|p; u) = ∫_{S̲} s(Z̲|u, x) p(dx).

By Propositions 7.26 and 7.29, q(d(x, z)|p; u) is a Borel-measurable stochastic kernel on SZ given P(S)C. By Corollary 7.27.1, this stochastic kernel can be decomposed into its marginal on Z given P(S)C and a Borel-measurable stochastic kernel r(dx|p; u, z) on S given P(S)CZ such that (49) holds. The existence of r₀(dx₀|p; z₀) is proved in a similar manner. Q.E.D.

It is customary to call p, the given distribution of x₀, the a priori distribution of the initial state. After z₀ is observed, the distribution is updated, i.e., the distribution of x₀ conditioned on z₀ is computed. The updated distribution is called the a posteriori distribution and, as we will show in Lemma 10.4, is just r₀(dx₀|p; z₀). At the kth stage, k ≥ 1, we will have some a priori distribution p_k′ of x_k based on i_{k−1} = (z₀, u₀, …, u_{k−2}, z_{k−1}). Control
uₖ₋₁ is applied, some zₖ is observed, and an a posteriori distribution of xₖ conditioned on (iₖ₋₁, uₖ₋₁, zₖ) is computed. We will show that this distribution is just r(dx|pₖ; uₖ₋₁, zₖ). The process of passing from an a priori to an a posteriori distribution in this manner is called filtering, and it is formalized next. Consider the function t̄: P(S)C → P(S) defined by

t̄(p, u)(S̲) = ∫ t(S̲|x, u) p(dx)  ∀S̲ ∈ ℬ_S.  (50)
Equation (50) is called the one-stage prediction equation. If xₖ has an a posteriori distribution pₖ and the control uₖ is chosen, then the a priori distribution of xₖ₊₁ is t̄(pₖ, uₖ). The mapping t̄ is Borel-measurable (Propositions 7.26 and 7.29). Given a sequence iₖ ∈ Iₖ such that iₖ₊₁ = (iₖ, uₖ, zₖ₊₁), k = 0, ..., N − 2, and given p ∈ P(S), define recursively

P₀(p; i₀) = r₀(dx₀|p; z₀),  (51)
Pₖ₊₁(p; iₖ₊₁) = r(dx|t̄[Pₖ(p; iₖ), uₖ]; uₖ, zₖ₊₁),  k = 0, ..., N − 2.  (52)
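For intuition, the filtering recursion (51)-(52) can be sketched for a finite state, observation, and control model. Everything below (the kernels t and s, the prior, the control and observation sequence) is randomly generated placeholder data, not an object from the text; observations here are taken independent of the control for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_z, n_u = 3, 2, 2

# Hypothetical finite kernels: t[u][x, x'] = t(x'|x,u), s[x, z] = s(z|u,x).
t = rng.random((n_u, n_x, n_x)); t /= t.sum(axis=2, keepdims=True)
s = rng.random((n_x, n_z));      s /= s.sum(axis=1, keepdims=True)
p0 = np.full(n_x, 1.0 / n_x)     # a priori distribution of x0

def bayes_update(prior, z):
    """The updating kernel r(dx|p; u, z): posterior proportional to s(z|u,x) prior(x)."""
    w = s[:, z] * prior
    return w / w.sum()

def predict(post, u):
    """One-stage prediction (50): pbar(x') = sum_x t(x'|x,u) post(x)."""
    return t[u].T @ post

# (51): P0(p; i0) = r0(p; z0); (52): P_{k+1}(p; i_{k+1}) = r(tbar[P_k, u_k]; u_k, z_{k+1}).
post = bayes_update(p0, z=1)
for u_k, z_next in [(0, 1), (1, 0)]:
    post = bayes_update(predict(post, u_k), z_next)
```

At every stage only the current posterior is carried forward, which is exactly the "genuine condensation of information" property of the statistic Pₖ.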
Note that for each k, Pₖ: P(S)Iₖ → P(S) is Borel-measurable. Equations (48)-(52) are called the filtering equations corresponding to the (ISI) model. For a given initial distribution and policy, they generate the conditional distribution of the state given the current information, as the following lemma shows.

Lemma 10.4 Let the model (ISI) be given. For any p ∈ P(S), π = (μ₀, ..., μ_{N−1}) ∈ Π and S̲ₖ ∈ ℬ_S, we have

Pₖ(π, p)[xₖ ∈ S̲ₖ|iₖ] = Pₖ(p; iₖ)(S̲ₖ)  (53)

for Pₖ(π, p)-almost every iₖ, k = 0, ..., N − 1.
Proof† We proceed by induction. For any S̲₀ ∈ ℬ_S and Z̲₀ ∈ ℬ_Z, we have from (51), (16), and (48) that

∫_{{z₀∈Z̲₀}} P₀(p; z₀)(S̲₀) dP₀(π, p) = ∫_{{z₀∈Z̲₀}} r₀(S̲₀|p; z₀) dP₀(π, p)
= ∫_S ∫_{Z̲₀} r₀(S̲₀|p; z₀) s₀(dz₀|x₀) p(dx₀)
= ∫_{S̲₀} s₀(Z̲₀|x₀) p(dx₀)
= P₀(π, p)({x₀ ∈ S̲₀, z₀ ∈ Z̲₀}).  (54)

Equation (53) for k = 0 follows from (54) and the definition of conditional probability.
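The defining property (48) of the updating kernel, and hence the k = 0 identity (54), can be checked numerically in a finite model. All numbers below are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_z = 4, 3
p = rng.random(n_x); p /= p.sum()                                  # prior on x0
s0 = rng.random((n_x, n_z)); s0 /= s0.sum(axis=1, keepdims=True)   # s0(z0|x0)

joint = p[:, None] * s0          # P(x0 = x, z0 = z)
pz = joint.sum(axis=0)           # marginal distribution of z0
r0 = joint / pz                  # r0(x0|p; z0): each column sums to one

# Both sides of (54) with test sets S0 = {0, 1} and Z0 = {0, 2}:
S0, Z0 = [0, 1], [0, 2]
lhs = sum(pz[z] * r0[S0, z].sum() for z in Z0)   # integral of r0(S0|p; z0) dP0
rhs = joint[np.ix_(S0, Z0)].sum()                # P0({x0 in S0, z0 in Z0})
```

The two quantities agree to machine precision, which is the finite-space content of (54).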
† In this and subsequent proofs, the reader may find the discussion concerning conditional expectations and probabilities at the beginning of Section 10.2 helpful.
Assume now that (53) holds for k. For any I̲ₖ ∈ ℬ_{Iₖ}, C̲ₖ ∈ ℬ_C, Z̲ₖ₊₁ ∈ ℬ_Z, and S̲ₖ₊₁ ∈ ℬ_S, we have from (16), the induction hypothesis, Fubini's theorem, (50), (52), and (49) that

Pₖ₊₁(π, p)({iₖ ∈ I̲ₖ, uₖ ∈ C̲ₖ, zₖ₊₁ ∈ Z̲ₖ₊₁, xₖ₊₁ ∈ S̲ₖ₊₁})
= ∫_{{iₖ∈I̲ₖ}} ∫_{C̲ₖ} ∫_{S̲ₖ₊₁} s(Z̲ₖ₊₁|uₖ, xₖ₊₁) t(dxₖ₊₁|xₖ, uₖ)[Pₖ(p; iₖ)(dxₖ)] μₖ(duₖ|p; iₖ) dPₖ(π, p)
= ∫_{{iₖ∈I̲ₖ}} ∫_{C̲ₖ} ∫_{S̲ₖ₊₁} s(Z̲ₖ₊₁|uₖ, xₖ₊₁) t̄[Pₖ(p; iₖ), uₖ](dxₖ₊₁) μₖ(duₖ|p; iₖ) dPₖ(π, p)
= ∫_{{iₖ∈I̲ₖ}} ∫_{C̲ₖ} ∫_S ∫_{Z̲ₖ₊₁} r(S̲ₖ₊₁|t̄[Pₖ(p; iₖ), uₖ]; uₖ, zₖ₊₁) s(dzₖ₊₁|uₖ, xₖ₊₁) t̄[Pₖ(p; iₖ), uₖ](dxₖ₊₁) μₖ(duₖ|p; iₖ) dPₖ(π, p)
= ∫_{{iₖ∈I̲ₖ}} ∫_{C̲ₖ} ∫_S ∫_{Z̲ₖ₊₁} Pₖ₊₁(p; iₖ, uₖ, zₖ₊₁)(S̲ₖ₊₁) s(dzₖ₊₁|uₖ, xₖ₊₁) t̄[Pₖ(p; iₖ), uₖ](dxₖ₊₁) μₖ(duₖ|p; iₖ) dPₖ(π, p)
= ∫_{{iₖ∈I̲ₖ, uₖ∈C̲ₖ, zₖ₊₁∈Z̲ₖ₊₁}} Pₖ₊₁(p; iₖ₊₁)(S̲ₖ₊₁) dPₖ₊₁(π, p),

and (53) for k + 1 follows from the definition of conditional probability.
R defined by (3), we have, because θ⁻¹(C) ∈ ℬ_{P(Y)} (Proposition 7.25), that (b) holds. If (b) holds, we can show that (a) holds by the same argument used in the proof of Proposition 11.2. Q.E.D.

We know from Corollary B.11.1 that the composition of analytically measurable functions need not be analytically measurable, so the cost corresponding to an analytically measurable policy for a stochastic optimal control model may not be analytically measurable. To see this, just write out explicitly the cost corresponding to a two-stage, nonrandomized, Markov, analytically measurable policy (cf. Definition 8.3). A review of Chapters 8 and 9 shows the following. Proposition 8.3 is still valid if the word "policy" is replaced by "analytically measurable policy," except that under (F⁻) an analytically measurable, nonrandomized, semi-Markov, ε-optimal policy is not guaranteed to exist. However, an analytically measurable nonrandomized ε-optimal policy can be shown to exist if g ≤ 0 [B12]. The proof of the existence of a sequence of nonrandomized Markov policies exhibiting {εₙ}-dominated convergence to optimality (Proposition 8.4) breaks down at the point where we assume that a sequence of one-stage policies {μ₀ⁿ} exists for which

T_{μ₀ⁿ}(J₀) ≤ T^{n−1}(J₀).

This occurs because T_{μ₀ⁿ}(J₀) may not be analytically measurable. In the first sentence of Proposition 9.19, the word "policy" can be replaced by "analytically measurable policy." The ε-optimal part of Proposition 9.20 depends on the (F⁻) part of Proposition 8.3, so it cannot be strengthened in
this way. Under assumption (N), an analytically measurable, nonrandomized, ε-optimal policy can be shown to exist [B12], but it is unknown whether this policy can be taken to be semi-Markov. The results of Chapters 8 and 9 relating to existence of universally measurable optimal policies depend on the exact selection property of Proposition 7.50(b). Since this property is not available for analytically measurable functions, we cannot use the same arguments to infer existence of optimal analytically measurable policies.

11.3 Models with Multiplicative Cost
In this section we revisit the stochastic optimal control model with a multiplicative cost functional first encountered in Section 2.3.4. We pose the finite horizon model in Borel spaces and state the results which are obtainable by casting this Borel space model in the generalized framework of Chapter 6. This does not permit a thorough treatment of the type already given to the model with additive cost in Chapters 8 and 9, but it does yield some useful results and illustrates how the generalized abstract model of Chapter 6 can be applied. The reader can, of course, use the mathematical theory of Chapter 7 to analyze the model with multiplicative cost directly under conditions more general than those given here.

We set up the Borel model with multiplicative cost. Let the state space S, the control space C, and the disturbance space W be Borel spaces. Let the control constraint U, mapping S into the set of nonempty subsets of C, be such that

Γ = {(x, u)|x ∈ S, u ∈ U(x)}

is analytic. Let the disturbance kernel p(dw|x, u) and the system function f: SCW → S be Borel-measurable. Let the one-stage cost function g be Borel-measurable, and assume that there exists b ∈ R such that 0 ≤ g(x, u, w) ≤ b for all x ∈ S, u ∈ U(x), w ∈ W. Let the horizon N be a positive integer. In the framework of Section 6.1, we define F to be the set of extended real-valued, universally measurable functions on S and F* to be the set of functions in F which are lower semianalytic. We let M be the set of universally measurable functions from S to C with graph in Γ. Define H: SCF → [0, ∞] by

H(x, u, J) = ∫_W g(x, u, w) J[f(x, u, w)] p(dw|x, u),

where we define 0·∞ = ∞·0 = 0·(−∞) = (−∞)·0 = 0. We take J₀: S → R* to be identically one. Then Assumptions A.1-A.4, F.2, and the Exact Selection Assumption of Section 6.1 hold. (Assumption A.2 follows from Lemma 7.30(4) and Propositions 7.47 and 7.48. Assumption A.4 follows from Proposition
7.50.) From Propositions 6.1(a), 6.2(a), and 6.3 we have the following results, where the notation of Section 6.1 is used.

Proposition 11.7 In the finite horizon Borel model with multiplicative cost, we have

J*_N = T^N(J₀),

and for every ε > 0 there exists an N-stage ε-optimal (Markov) policy. A policy π* = (μ₀*, ..., μ*_{N−1}) is uniformly N-stage optimal if and only if (T_{μₖ*} T^{N−k−1})(J₀) = T^{N−k}(J₀), k = 0, ..., N − 1, and such a policy exists if and only if the infimum in the relation

T^{k+1}(J₀)(x) = inf_{u∈U(x)} H[x, u, T^k(J₀)]

is attained for each x ∈ S and k = 0, ..., N − 1. A sufficient condition for this infimum to be attained is for the set

Uₖ(x, λ) = {u ∈ U(x)|H[x, u, T^k(J₀)] ≤ λ}

to be compact for each x ∈ S, λ ∈ R, and k = 0, ..., N − 1.
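As a concrete sketch of the recursion J*_N = T^N(J₀) with J₀ ≡ 1, here is a finite model with randomly generated cost, disturbance kernel, and dynamics. All of the data below are illustrative placeholders, not objects from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n_x, n_u, n_w, N = 3, 2, 2, 4
g = rng.uniform(0.5, 1.5, (n_x, n_u, n_w))               # one-stage cost, 0 <= g <= b
f = rng.integers(0, n_x, (n_x, n_u, n_w))                # system function f(x,u,w)
pw = rng.random((n_x, n_u, n_w)); pw /= pw.sum(axis=2, keepdims=True)  # p(w|x,u)

def T(J):
    """H(x,u,J) = E_w[g(x,u,w) J(f(x,u,w))]; (TJ)(x) = min_u H(x,u,J)."""
    H = (pw * g * J[f]).sum(axis=2)                      # shape (n_x, n_u)
    return H.min(axis=1), H.argmin(axis=1)

J, policy = np.ones(n_x), []
for _ in range(N):
    J, mu = T(J)          # J becomes T^k(J0); mu attains the minimum
    policy.append(mu)     # policy[0] is the selector for the last stage
```

Because the minimum over the finite set U(x) is always attained, this toy model automatically satisfies the existence condition of Proposition 11.7 and the resulting (μ₀*, ..., μ*_{N−1}) is uniformly N-stage optimal.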
Appendix A
The Outer Integral
Throughout this appendix, (X, ℬ, p) is a probability space. Unless otherwise specified, f, g, and h are functions from X to [−∞, ∞].

Definition A.1 If f ≥ 0, the outer integral of f with respect to p is defined by

∫* f dp = inf{∫ g dp|f ≤ g, g is ℬ-measurable}.  (1)

If f is arbitrary, define

∫* f dp = ∫* f⁺ dp − ∫* f⁻ dp,  (2)

where

f⁺(x) = max{0, f(x)},  f⁻(x) = max{0, −f(x)},

and we set ∞ − ∞ = ∞.

Lemma A.1 If f ≥ 0, then there exists a ℬ-measurable g with g ≥ f such that

∫ g dp = ∫* f dp.  (3)
Proof Choose gₙ ≥ f, gₙ ℬ-measurable, so that

∫ gₙ dp ↓ ∫* f dp.

We assume without loss of generality that g₁ ≥ g₂ ≥ ⋯. Let g = limₙ→∞ gₙ. Then g ≥ f, g is ℬ-measurable, and (3) holds. Q.E.D.

Lemma A.2 If f ≥ 0 and h ≥ 0, then

∫*(f + h) dp ≤ ∫* f dp + ∫* h dp.  (4)

If either f or h is ℬ-measurable, then equality holds in (4).

Proof Suppose g₁ ≥ f, g₂ ≥ h, g₁ and g₂ are ℬ-measurable, and ∫* f dp = ∫ g₁ dp, ∫* h dp = ∫ g₂ dp. Then g₁ + g₂ ≥ f + h and (4) follows from (1). Suppose h is ℬ-measurable and ∫ h dp < ∞. [If ∫ h dp = ∞, equality is easily seen to hold in (4).] Suppose f + h ≤ g, where g is ℬ-measurable and

∫*(f + h) dp = ∫ g dp.

Then f ≤ g − h and g − h is ℬ-measurable, so

∫* f dp ≤ ∫ g dp − ∫ h dp,

which implies

∫* f dp + ∫ h dp ≤ ∫*(f + h) dp.

Therefore equality holds in (4). Q.E.D.
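The strict inequality in (4) requires a nonmeasurable set, but the mechanism is already visible on a finite space with a coarse σ-algebra, where outer measures are computable by brute force. The four-point space below is an illustrative stand-in, not the construction used in the examples that follow.

```python
from itertools import chain, combinations

X = frozenset(range(4))
atoms = [frozenset({0, 1}), frozenset({2, 3})]
# The sigma-algebra generated by the atoms: all unions of atoms.
sigma = [frozenset().union(*c)
         for c in chain.from_iterable(combinations(atoms, r)
                                      for r in range(len(atoms) + 1))]
p = {B: len(B) / 4 for B in sigma}        # "uniform" measure on measurable sets

def outer(E):
    """p*(E) = inf{p(B) | E subset of B, B measurable}; this is the outer
    integral of the indicator chi_E, cf. (5) below."""
    return min(p[B] for B in sigma if E <= B)

E = frozenset({0})    # not measurable with respect to this sigma-algebra
# outer(E) = 0.5 and outer(X - E) = 1.0, so the outer integrals of chi_E and
# chi_{X-E} add to 1.5, while the ordinary integral of their (measurable)
# sum chi_E + chi_{X-E} = 1 is 1: a finite analogue of strict inequality in (4).
```

The infimum defining `outer` runs over only four measurable sets here, which is what makes the phenomenon easy to inspect.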
We provide an example to show that strict inequality can occur in (4), even if f + h is ℬ-measurable. For this and subsequent examples we will need the following observation: For any E ⊂ X,

∫* χ_E dp = p*(E),  (5)

where p*(E) is the p-outer measure defined by

p*(E) = inf{p(B)|E ⊂ B, B ∈ ℬ}

and χ_E is the indicator function of E defined by

χ_E(x) = 1 if x ∈ E,  χ_E(x) = 0 if x ∉ E.
To verify (5), note that if χ_E ≤ g and g is ℬ-measurable, then {x|g(x) ≥ 1} is a ℬ-measurable set containing E, and consequently ∫ g dp ≥ p*(E). Definition A.1 implies

∫* χ_E dp ≥ p*(E).  (6)

On the other hand, if {Bₙ} is a sequence of ℬ-measurable sets with E ⊂ Bₙ and p(Bₙ) ↓ p*(E), then p(⋂ₙ₌₁^∞ Bₙ) = p*(E). By construction, χ_{⋂ₙ₌₁^∞ Bₙ} ≥ χ_E. But χ_{⋂ₙ₌₁^∞ Bₙ} is ℬ-measurable, and the reverse of inequality (6) follows. Note that the preceding argument shows that for any set E, there exists a set B ∈ ℬ such that E ⊂ B and p(B) = p*(E).

EXAMPLE 1 Let X = [0,1], let ℬ be the Borel σ-algebra, and let p be Lebesgue measure restricted to ℬ. Let E ⊂ X be a set for which p*(E) = p*(X − E) = 1 (see [H1, Section 16, Theorem E]). Then

∫(χ_E + χ_{X−E}) dp = ∫ 1 dp = 1,
∫* χ_E dp + ∫* χ_{X−E} dp = 2,

and strict inequality holds in (4). Lemma A.2 cannot be extended to (possibly negative) bounded functions, even if h is ℬ-measurable, as the following example demonstrates.

EXAMPLE 2 Let (X, ℬ, p) and E be as before. Let f = χ_E − χ_{X−E}, h = 1.
Then

∫*(f + h) dp = ∫* 2χ_E dp = 2,
∫* f dp + ∫ h dp = ∫* χ_E dp − ∫* χ_{X−E} dp + 1 = 1.

Lemma A.3 (a) If f ≤ g, then ∫* f dp ≤ ∫* g dp.
(b) If ε > 0 and f ≤ g ≤ f + ε, then

∫* f dp ≤ ∫* g dp ≤ ∫* f dp + 2ε.  (7)
(c) If ∫* f⁺ dp < ∞ or ∫* f⁻ dp < ∞, then

∫*(−f) dp = −∫* f dp.  (8)

(d) If A and B are disjoint ℬ-measurable sets, then

∫* χ_{A∪B} f dp = ∫* χ_A f dp + ∫* χ_B f dp.  (9)

(e) If p*(E) = 0, then ∫* f dp = ∫* χ_{X−E} f dp.
(f) If p*({x|f(x) = ∞}) > 0, then for every g, ∫*(g + f) dp = ∞.
(g) If p*({x|f(x) = −∞}) > 0, then for every g either ∫*(g + f) dp = ∞ or ∫*(g + f) dp = −∞.

Proof (a) If f ≤ g, then f⁺ ≤ g⁺ and f⁻ ≥ g⁻. By (1), ∫* f⁺ dp ≤ ∫* g⁺ dp and ∫* f⁻ dp ≥ ∫* g⁻ dp. The result follows from (2).
(b) In light of (a), it remains only to show that
∫*(f + ε) dp ≤ ∫* f dp + 2ε.  (10)

For g₁ ≥ f⁺, g₁ ℬ-measurable, and ∫* f⁺ dp = ∫ g₁ dp, we have g₁ + ε ≥ (f + ε)⁺, so

∫*(f + ε)⁺ dp ≤ ∫ g₁ dp + ε = ∫* f⁺ dp + ε.  (11)

For g₂ ≥ (f + ε)⁻, g₂ ℬ-measurable, and ∫*(f + ε)⁻ dp = ∫ g₂ dp, we have

g₂ + ε ≥ (f + ε)⁻ + ε = max{f⁻ − ε, 0} + ε ≥ f⁻,

so

ε + ∫*(f + ε)⁻ dp = ε + ∫ g₂ dp = ∫(g₂ + ε) dp ≥ ∫* f⁻ dp.  (12)

Combine (11) and (12) to conclude (10).
(c) We have

∫*(−f) dp = ∫*(−f)⁺ dp − ∫*(−f)⁻ dp = ∫* f⁻ dp − ∫* f⁺ dp
= −[∫* f⁺ dp − ∫* f⁻ dp] = −∫* f dp,

where the assumption that ∫* f⁺ dp < ∞ or ∫* f⁻ dp < ∞ is necessary for the next-to-last equality.
, where the assumption that S*f + dp < 00 or S*f - dp < 00 is necessary for the next to last equality. (d) Suppose r > O. Let gbea.'J6'-measurable function with g::::: XA u Bfand f* XAvBf dp =
Sg dp.
Then XAg::::: XAf, XBg ::::: XBf, so f* XAuBf dp = f XAg dp
:: r
+f
XAf dp
XBg dp
+ f* XBf dp.
(13)
Now suppose gl ::::: XAf, gz ::::: XBf are .?4-measurable and f g 1 dp = f* XAf dp,
Then
e. + gz ::::: XAuBf, so f* XAf dp
+
r
f gz dp
f* XBf dp.
=
XBf dp = f(gl
+ gz) dp
::::: f* XAvBf dp.
Combine (13) and (14) to conclude (9) for f
f is straightforward.
(14)
::::: O. The extension to arbitrary
(e) Suppose f ≥ 0. Choose B ∈ ℬ with p(B) = p*(E) = 0, B ⊃ E. By (d),

∫* f dp = ∫* χ_{X−B} f dp ≤ ∫* χ_{X−E} f dp ≤ ∫* f dp.

Hence ∫* f dp = ∫* χ_{X−E} f dp. The extension to arbitrary f is straightforward.

(f) We have (g + f)⁺(x) = ∞ if f(x) = ∞, so that

p*({x|(g + f)⁺(x) = ∞}) > 0.

Hence ∫*(g + f)⁺ dp = ∞, and it follows that ∫*(g + f) dp = ∞.

(g) Consider the sets E = {x|f(x) = −∞} and E_g = {x|f(x) = −∞, g(x) < ∞}. If p*(E_g) = 0, then

p*(E − E_g) = p*(E − E_g) + p*(E_g) ≥ p*(E) > 0.
278
APPENDIX A
Since we have f(x) + g(x) = ∞ for x ∈ E − E_g, it follows from (f) that ∫*(g + f) dp = ∞. If p*(E_g) > 0, then p*({x|(g + f)⁻(x) = ∞}) ≥ p*(E_g) > 0 and hence, by (f), ∫*(g + f)⁻ dp = ∞. Hence, if ∫*(g + f)⁺ dp = ∞, then ∫*(g + f) dp = ∞, while if ∫*(g + f)⁺ dp < ∞, then ∫*(g + f) dp = −∞. Q.E.D.

The bound given in (7) is the sharpest possible. To see this, let f be as defined in Example 2, g = f + 1, and ε = 1. Despite these pathologies of outer integration, there is a monotone convergence theorem, which we now prove.

Proposition A.1 If {fₙ} is a sequence of nonnegative functions and fₙ ↑ f, then

∫* fₙ dp ↑ ∫* f dp.  (15)

If {fₙ} is a sequence of nonpositive functions and fₙ ↓ f, then

∫* fₙ dp ↓ ∫* f dp.

Proof We prove the first statement of the theorem. The second follows from the first and Lemma A.3(c). Assume fₙ ≥ 0 and fₙ ↑ f. Let {gₙ} be a sequence of ℬ-measurable functions such that gₙ ≥ fₙ and

∫ gₙ dp = ∫* fₙ dp.  (16)

If, for some n, ∫ gₙ dp = ∫* fₙ dp = ∞, then (15) is assured. If not, then for every n,

∫ gₙ dp < ∞.  (17)

Suppose that for some n,

p({x|gₙ(x) > gₙ₊₁(x)}) > 0.

Then, since gₙ₊₁ ≥ fₙ₊₁ ≥ fₙ, we have that g defined by

g(x) = gₙ(x) if gₙ(x) ≤ gₙ₊₁(x),  g(x) = gₙ₊₁(x) if gₙ(x) > gₙ₊₁(x),

satisfies gₙ ≥ g ≥ fₙ everywhere and g < gₙ on a set of positive measure. This contradicts (16). We may therefore assume without loss of generality that (17) holds and g₁ ≤ g₂ ≤ ⋯. Let g = limₙ→∞ gₙ. Then g ≥ f and

limₙ→∞ ∫* fₙ dp = limₙ→∞ ∫ gₙ dp = ∫ g dp ≥ ∫* f dp.

But fₙ ≤ f for every n, so the reverse inequality holds as well. Q.E.D.
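Proposition A.1 can be watched in action on the same kind of finite coarse-σ-algebra setup used earlier: for f ≥ 0 the minimal measurable majorant is constant on each atom at the atom's maximum, so the outer integral is directly computable. The construction below is illustrative, not from the text.

```python
atoms = [{0, 1}, {2, 3}]          # atoms of a coarse sigma-algebra on {0,1,2,3}
p_atom = [0.5, 0.5]

def outer_integral(f):
    """For f >= 0 on a finite space, the least measurable g >= f is constant
    on each atom at the atom's maximum, so the outer integral is
    sum over atoms of p(atom) * max_{x in atom} f(x)."""
    return sum(p * max(f(x) for x in A) for A, p in zip(atoms, p_atom))

f = lambda x: float(x == 0)                        # chi_E with E = {0}
fn = [lambda x, n=n: (1 - 1 / n) * f(x) for n in (1, 2, 4, 8)]
vals = [outer_integral(g) for g in fn]
# vals = [0.0, 0.25, 0.375, 0.4375], increasing toward outer_integral(f) = 0.5
```

The nondecreasing sequence of outer integrals converges to the outer integral of the limit, exactly as (15) asserts, even though the limit function is nonmeasurable here.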
One might hope that if {fₙ} is a sequence of functions which are bounded below and fₙ ↑ f, then (15) remains valid. This is not the case, as the following example shows.

EXAMPLE 3 Let X = [0,1), let ℬ be the Borel σ-algebra, and let p be Lebesgue measure restricted to ℬ. Define an equivalence relation ~ on X by

x ~ y ⟺ x − y is rational.

Let F₀ be constructed by choosing one representative from each equivalence class. Let Q = {q₀, q₁, ...} be an enumeration of the rationals in [0,1) with q₀ = 0, and define

Fₖ = {x + qₖ [mod 1]|x ∈ F₀} = F₀ + qₖ [mod 1],  k = 0, 1, ....

Then F₀, F₁, ... is a sequence of disjoint sets with

⋃ₖ₌₀^∞ Fₖ = [0, 1).  (18)

If for some n < ∞ we have p*(⋃ₖ₌ₙ^∞ Fₖ) < 1, then E = ⋃ₖ₌₀^{n−1} Fₖ contains a ℬ-measurable set with measure b > 0. For k = 1, ..., n − 1, let qₖ = rₖ/sₖ, where rₖ and sₖ are integers and rₖ/sₖ is reduced to lowest terms. Let {p₁, p₂, ...} be a sequence of prime numbers such that

max_{1≤k≤n−1} sₖ < p₁ < p₂ < ⋯.

Then the sets E, E + p₁⁻¹ [mod 1], E + p₂⁻¹ [mod 1], ... are disjoint, and by the translation invariance of p, each contains a ℬ-measurable set with measure b > 0. It follows that [0,1) must contain a ℬ-measurable set of infinite measure. This contradiction implies

p*(⋃ₖ₌ₙ^∞ Fₖ) = 1  (19)

for every n. Define

fₙ = −χ_{⋃ₖ₌ₙ^∞ Fₖ},  n = 0, 1, ....

Then fₙ ↑ 0, but (5) and (19) imply that for every n

∫* fₙ dp = −1.

By a change of sign in Example 3, we see that the second part of Proposition A.1 cannot be extended to functions which are bounded above unless additional conditions are imposed. We impose such conditions in order to prove a corollary.
Corollary A.1.1 Let {εₙ} be a sequence of positive numbers with Σₙ₌₁^∞ εₙ < ∞. Let {fₙ} be a sequence with limₙ→∞ fₙ = f and

f(x) ≤ fₙ(x) ≤ fₙ₋₁(x) + εₙ  ∀x ∈ X.

Then

limₙ→∞ ∫* fₙ dp = ∫* f dp.
φ(ν₁, ν₂, ...) ∈ S(ν₁, ..., νₖ), and S(μ₁, ..., μₖ) is disjoint from S(ν₁, ..., νₖ), we see that φ(μ₁, μ₂, ...) ≠ φ(ν₁, ν₂, ...), so φ is one-to-one. To show φ is continuous, let {mₙ} be a sequence converging to m ∈ A. Choose ε > 0 and let k be a positive integer such that 2/k < ε. There exists an n̄ such that whenever n ≥ n̄, the elements mₙ and m = (μ₁, μ₂, ...) agree in the first k components, so both φ(mₙ) and φ(m) are in S(μ₁, ..., μₖ). This implies d(φ(mₙ), φ(m)) ≤ 2/k < ε, so φ is continuous. To show that φ⁻¹ is continuous, it suffices to show that φ(F) is closed in φ(A) whenever F is closed in A. This follows from the fact that A is compact and φ is continuous. Define 𝒩₁ ⊂ 𝒩 to be the compact homeomorphic image of A under φ. We now show that f: 𝒩₁ → X is a homeomorphism. To see that f is one-to-one, choose distinct points z and z̄ in 𝒩₁. Then there exist distinct points m = (μ₁, μ₂, ...) and m̄ = (μ̄₁, μ̄₂, ...) in A such that z = φ(m) and z̄ = φ(m̄). For some k, we have μₖ ≠ μ̄ₖ, so by (i),

f[S(μ₀, ..., μₖ)] ∩ f[S(μ̄₀, ..., μ̄ₖ)] = ∅.

Since z ∈ S(μ₀, ..., μₖ) and z̄ ∈ S(μ̄₀, ..., μ̄ₖ), we see that f(z) ≠ f(z̄), so f is one-to-one. Just as in the case of φ, the continuity of f⁻¹ follows from the fact that f is continuous and has a compact domain. The set K = f(𝒩₁) is a compact subset of X homeomorphic to A. Q.E.D.
Lemma B.4 Let X be an uncountable Borel space. There exists a Borel subset L of A such that X and L are Borel-isomorphic.
Proof By definition, X is homeomorphic to a Borel subset B of a complete separable metric space Y. By Urysohn's and Alexandroff's theorems (Propositions 7.2 and 7.3), Y is homeomorphic to a G_δ-subset of the Hilbert cube ℋ, so B and hence X are homeomorphic to a Borel subset of ℋ. It suffices then to show that ℋ is Borel-isomorphic to a Borel subset of A. The idea of the proof is this. Each element in ℋ is a sequence of real numbers in [0, 1]. Each of these numbers has a binary expansion, and by
mixing all these expansions, we obtain an element in A. Let us first define ψ: [0, 1] → A, which maps a real number into a sequence of zeroes and ones which is its binary expansion. It is easier to define ψ⁻¹, which we define on A₁ ∪ {(0, 0, 0, ...)}, where

A₁ = {(μ₁, μ₂, ...) ∈ A|μₖ = 1 for infinitely many k}.

It is given by

ψ⁻¹(μ₁, μ₂, ...) = Σₖ₌₁^∞ μₖ 2⁻ᵏ,

and it is easily verified that ψ⁻¹ is one-to-one, continuous, and maps onto [0, 1]. Since A − A₁ is countable, the domain of ψ⁻¹ is a Borel subset of A, and Proposition 7.15 tells us that ψ is a Borel isomorphism. Since we have not proved Proposition 7.15, we show directly that ψ is Borel-measurable. Consider the collection of sets

R₀(k) = {(μ₁, μ₂, ...) ∈ A|μₖ = 0},  k = 1, 2, ...,
R₁(k) = {(μ₁, μ₂, ...) ∈ A|μₖ = 1},  k = 1, 2, ....

These sets form a subbase for the topology of A, so by the remark following Definition 7.6, we need only prove that ψ⁻¹[R₀(k)] and ψ⁻¹[R₁(k)] are Borel-measurable to conclude that ψ is. Since one of these sets is the complement of the other, we may restrict attention to ψ⁻¹[R₀(k)]. Remembering that the domain of ψ⁻¹ is A₁ ∪ {(0, 0, 0, ...)}, we have

ψ⁻¹[R₀(k)] = {ψ⁻¹(μ₁, μ₂, ...)|(μ₁, μ₂, ...) ∈ A₁, μₖ = 0} ∪ {0},

which is a finite union of Borel sets. The proof that AA⋯ and A are homeomorphic is essentially the same as the one given in Lemma 7.25, and we do not repeat it here. Let θ mapping AA⋯ onto A be a homeomorphism and define φ: ℋ → A by

φ(x₁, x₂, ...) = θ[ψ(x₁), ψ(x₂), ...].

Then φ is the required Borel-isomorphism. Q.E.D.
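The binary-expansion map ψ and its inverse from the proof can be sketched directly; truncating to k digits is the only concession to finite arithmetic.

```python
def psi(x, k=32):
    """First k digits of the binary expansion of x in [0, 1]."""
    digits = []
    for _ in range(k):
        x *= 2
        d = min(int(x), 1)    # clamp in case x == 1.0 exactly
        digits.append(d)
        x -= d
    return digits

def psi_inv(digits):
    """psi^{-1}(mu_1, mu_2, ...) = sum_k mu_k 2^{-k}."""
    return sum(d / 2 ** (k + 1) for k, d in enumerate(digits))

x = 0.8125    # 0.1101 in binary, exactly representable in floating point
```

Note that this sketch picks the expansion ending in zeroes for dyadic rationals, while the proof selects expansions with infinitely many ones; that choice is exactly why the domain of ψ⁻¹ is A₁ ∪ {(0, 0, 0, ...)}.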
Lemma B.5 If K₁ and L are Borel subsets of A, K₁ ⊂ L, and K₁ is Borel-isomorphic to A, then L is Borel-isomorphic to A.
Proof For Borel subsets A and B of the space A, we write A ≈ B to indicate that A and B are Borel-isomorphic. Note that A ≈ B and B ≈ C implies
A ≈ C. Also, if A₁, A₂, ... is a sequence of disjoint Borel sets, if B₁, B₂, ... is another such sequence, and if Aᵢ ≈ Bᵢ for every i, then ⋃ᵢ₌₁^∞ Aᵢ ≈ ⋃ᵢ₌₁^∞ Bᵢ. We note finally that if A = A₁ ∪ A₂ and A ≈ B, then B = B₁ ∪ B₂, where A₁ ≈ B₁ and A₂ ≈ B₂. If A₁ and A₂ are disjoint, then B₁ and B₂ can be taken to be disjoint.

Under the hypotheses of the lemma, let D₁ = A − K₁. Since A ≈ K₁ and A = K₁ ∪ D₁, there exist disjoint Borel sets K₂ and D₂ such that K₁ = K₂ ∪ D₂, K₁ ≈ K₂, and D₁ ≈ D₂. Since K₁ ≈ K₂ and K₁ = K₂ ∪ D₂, there exist disjoint Borel sets K₃ and D₃ such that K₂ = K₃ ∪ D₃, K₂ ≈ K₃, and D₂ ≈ D₃. Continuing in this manner, at the nth step we construct disjoint Borel sets Kₙ and Dₙ such that Kₙ₋₁ = Kₙ ∪ Dₙ, Kₙ₋₁ ≈ Kₙ, and Dₙ₋₁ ≈ Dₙ. Let K∞ = ⋂ₙ₌₁^∞ Kₙ. Then A = K∞ ∪ [⋃ₙ₌₁^∞ Dₙ], and all the sets on the right side of this equation are disjoint. Let A₁ = A − L and B₁ = L − K₁. Then A₁ and B₁ are disjoint and D₁ = A₁ ∪ B₁. For each n, D₁ ≈ Dₙ, so Dₙ = Aₙ ∪ Bₙ, where Aₙ and Bₙ are disjoint Borel sets and A₁ ≈ Aₙ, B₁ ≈ Bₙ. In particular, Aₙ ≈ Aₙ₊₁ for n = 1, 2, ..., and we have

A = K∞ ∪ [⋃ₙ₌₁^∞ Dₙ] = K∞ ∪ [⋃ₙ₌₁^∞ Aₙ] ∪ [⋃ₙ₌₁^∞ Bₙ]
≈ K∞ ∪ [⋃ₙ₌₂^∞ Aₙ] ∪ [⋃ₙ₌₁^∞ Bₙ]
= {K∞ ∪ [⋃ₙ₌₁^∞ Dₙ]} − A₁ = A − A₁ = L.  Q.E.D.
We can now prove Proposition 7.16, and the proof clearly shows that Corollary 7.16.1 is also true.

Proposition B.3 Let X and Y be Borel spaces. Then X and Y are isomorphic if and only if they have the same cardinality.
Proof If X and Y are isomorphic, then clearly they must have the same cardinality. If X and Y both have the same finite or countably infinite cardinality, then their Borel σ-algebras are their power sets and any one-to-one onto mapping from one to the other is a Borel-isomorphism. If X is uncountable, then by Lemma B.4 there exists a Borel isomorphism φ: X → L, where L is a Borel subset of A.

ε > 0 for which S_ε(a₀) ⊂ X − K. For B ∈ 𝒮_ε(A), we have

d(a₀, B) ≤ max_{a∈A} d(a, B) < ε,

which implies B ∩ S_ε(a₀) ≠ ∅ and B ∈ 2^X − 2^K. Therefore 𝒮_ε(A) ⊂ 2^X − 2^K, and 2^X − 2^K is open. Having thus shown that the sets 2^G and 2^X − 2^K are open in (2^X, ρ) when G is open and K is closed, we must now show that given any open subset Φ of (2^X, ρ) and any nonempty A ∈ Φ, we can find open sets G₁, G₂, ..., Gₘ and closed sets K₁, K₂, ..., Kₙ in X for which

A ∈ 2^{G₁} ∩ ⋯ ∩ 2^{Gₘ} ∩ (2^X − 2^{K₁}) ∩ ⋯ ∩ (2^X − 2^{Kₙ}) ⊂ Φ.

Since Φ is open in (2^X, ρ), there exists ε > 0 such that 𝒮_ε(A) ⊂ Φ. Since A is closed in the compact set X, there exist points {x₁, ..., xₙ} in A such that A ⊂ ⋃ₖ₌₁ⁿ S_{ε/2}(xₖ). Let G₁ = {x ∈ X|d(x, A) <

0, let {x₁, ..., xₖ} be points of A such that the open spheres S_{ε/2}(xⱼ), j = 1, ..., k, cover A. Choose N large enough so that for all n ≥ N,

d(xⱼ, Aₙ) < ε/2,  j = 1, ..., k.
Now use the Lipschitz continuity [cf. (6)] of the function x → d(x, Aₙ) to conclude that

d(x, Aₙ) < ε  ∀x ∈ A.

This implies that

limₙ→∞ max_{x∈A} d(x, Aₙ) = 0.

This equation and (3) imply that (17) will follow if we can show

limₙ→∞ max_{y∈Aₙ} d(y, A) = 0.  (18)

If (18) fails to hold, then for some ε > 0 there exists a sequence yₖ ∈ A_{nₖ} such that n₁ < n₂ < ⋯ and

d(yₖ, A) ≥ ε  ∀k.  (19)

The compactness of X implies that {yₖ} accumulates at some y₀ ∈ X which, by (19) and the continuity of x → d(x, A), must satisfy d(y₀, A) ≥ ε. But y₀ ∈ lim̄ₙ→∞ Aₙ by (14), and this contradicts (16). Hence (18) holds.

Still assuming A ≠ ∅, we turn to the reverse implication of the proposition. If (17) holds, then

limₙ→∞ d(x, Aₙ) = 0  ∀x ∈ A,  (20)

and

limₙ→∞ max_{y∈Aₙ} d(y, A) = 0.  (21)

Equation (20) implies that

A ⊂ lim̲ₙ→∞ Aₙ ⊂ lim̄ₙ→∞ Aₙ.  (22)

If x ∈ lim̄ₙ→∞ Aₙ, then by definition there exists a sequence yₖ ∈ A_{nₖ} such that n₁ < n₂ < ⋯ and

limₖ→∞ d(x, yₖ) = 0.  (23)

We have from (6) that

d(x, A) ≤ d(x, yₖ) + d(yₖ, A),

and, letting k → ∞ and using (21) and (23), we conclude d(x, A) = 0. Since A is closed, this proves x ∈ A and

lim̄ₙ→∞ Aₙ ⊂ A.  (24)

Combine (22) and (24) to obtain (16). Assume finally that A = ∅. If (16) holds, then all but finitely many of the sets Aₙ must be empty, for otherwise one could find yₖ ∈ A_{nₖ}, n₁ < n₂ < ⋯, and {yₖ} would accumulate at some y₀ ∈ lim̄ₙ→∞ Aₙ. If all but finitely many of the sets Aₙ are empty, then (5) implies that (17) holds. Conversely, if (17) holds and A = ∅, then (4) implies that all but finitely many of the sets Aₙ are empty. Equation (16) follows from (2), (14), and (15). Q.E.D.

For the proof of Proposition 7.33 in Section 7.5 we need the concept of a function which is upper semicontinuous in the sense of Kuratowski, or in abbreviation, upper semicontinuous (K).
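For finite subsets of the real line, the Hausdorff distance and the two-sided convergence criterion just established can be computed directly; the sets below are illustrative.

```python
def dist(x, S):
    """Distance from a point to a finite nonempty set."""
    return min(abs(x - s) for s in S)

def hausdorff(A, B):
    """Hausdorff distance: the larger of the two one-sided excesses."""
    return max(max(dist(a, B) for a in A), max(dist(b, A) for b in B))

A = [0.0, 1.0]
An = lambda n: [0.0, 1.0, 1.0 + 1.0 / n]   # An converges to A in the Hausdorff metric
dists = [hausdorff(An(n), A) for n in (1, 2, 4, 8)]
# dists = [1.0, 0.5, 0.25, 0.125]: both max_{x in A} d(x, An) and
# max_{y in An} d(y, A) tend to zero, matching the criterion (17).
```

The extra point 1 + 1/n drives the excess of Aₙ over A, while every point of A is already in Aₙ; the distance therefore equals 1/n exactly.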
Definition C.2 Let Y be a metric space and X a compact metric space. A function F: Y → 2^X is upper semicontinuous (K) if for every convergent sequence {yₙ} in Y with limit y, we have lim̄ₙ→∞ F(yₙ) ⊂ F(y).

The similarity of Definition C.2 to the idea of an upper semicontinuous real or extended real-valued function is apparent [Lemma 7.13(b)]. Although we will not discuss functions which are lower semicontinuous (K), it is interesting to note that such a concept exists and has the obvious definition, namely, that the function F: Y → 2^X is lower semicontinuous (K) if for every convergent sequence {yₙ} in Y with limit y, we have lim̲ₙ→∞ F(yₙ) ⊃ F(y). It can be seen from Proposition C.3 that a function F: Y → 2^X is continuous in the usual sense (where 2^X has the exponential topology) if and only if it is both upper and lower semicontinuous (K). We carry the analogy with real-valued functions even farther by showing that an upper semicontinuous (K) function is Borel-measurable, and the remainder of the appendix is devoted to this.

Lemma C.2 Let Y be a metric space and X a compact metric space. If F: Y → 2^X is upper semicontinuous (K), then for each open set G ⊂ X, the set

F⁻¹(2^G) = {y ∈ Y|F(y) ⊂ G}  (25)

is open.

Proof The openness of F⁻¹(2^G) for every open G is in fact equivalent to upper semicontinuity (K), but we need only the weaker result stated. To prove it, we show that for G open, the set F⁻¹(2^X − 2^G) is closed. If {yₙ} is a sequence in this set with limit y ∈ Y, then

F(yₙ) ∩ (X − G) ≠ ∅,  n = 1, 2, ...,
and so there exists a sequence {xₙ} in the compact set X − G such that xₙ ∈ F(yₙ), n = 1, 2, .... This sequence has an accumulation point x ∈ X − G, and, by (14), x ∈ lim̄ₙ→∞ F(yₙ). The upper semicontinuity (K) of F implies x ∈ F(y), and so F(y) ∩ (X − G) ≠ ∅, i.e., y ∈ F⁻¹(2^X − 2^G). Q.E.D.

Proposition C.4 Let Y be a metric space, (X, d) a compact metric space, and let 2^X have the exponential topology. Let F: Y → 2^X be upper semicontinuous (K). Then F is Borel-measurable.

Proof If F: Y → 2^X is upper semicontinuous (K) and G is an open subset of X, then F⁻¹(2^G) is Borel-measurable in Y by Lemma C.2. If K is a closed subset of X, define open sets Gₙ = {x|d(x, K) < 1/n}. We have K = ⋂ₙ₌₁^∞ Gₙ, and so a closed set A is a subset of K if and only if A ⊂ Gₙ, n = 1, 2, .... Hence

F⁻¹(2^K) = ⋂ₙ₌₁^∞ F⁻¹(2^{Gₙ})

is a G_δ-set, thus Borel-measurable in Y. It follows that for any set 𝒰 in the subbase for the exponential topology on 2^X, F⁻¹(𝒰) is Borel-measurable in Y. By Proposition 7.1, any open set in 2^X can be represented as a countable union of finite intersections of sets in the subbase, and so its inverse image under F is Borel-measurable. Q.E.D.
References
[A1] R. Ash, "Real Analysis and Probability." Academic Press, New York, 1972.
[A2] K. J. Astrom, Optimal control of Markov processes with incomplete state information, J. Math. Anal. Appl. 10 (1965), 174-205.
[B1] R. Bellman, "Dynamic Programming." Princeton Univ. Press, Princeton, New Jersey, 1957.
[B2] D. P. Bertsekas, Infinite-time reachability of state-space regions by using feedback control, IEEE Trans. Automatic Control AC-17 (1972), 604-613.
[B3] D. P. Bertsekas, On error bounds for successive approximation methods, IEEE Trans. Automatic Control AC-21 (1976), 394-396.
[B4] D. P. Bertsekas, "Dynamic Programming and Stochastic Control." Academic Press, New York, 1976.
[B5] D. P. Bertsekas, Monotone mappings with application in dynamic programming, SIAM J. Control Optimization 15 (1977), 438-464.
[B6] D. P. Bertsekas and S. Shreve, Existence of optimal stationary policies in deterministic optimal control, J. Math. Anal. Appl. (to appear).
[B7] P. Billingsley, Invariance principle for dependent random variables, Trans. Amer. Math. Soc. 83 (1956), 250-282.
[B8] D. Blackwell, Positive dynamic programming, Proc. Fifth Berkeley Sympos. Math. Statist. and Probability, 1965, 415-418.
[B9] D. Blackwell, Discounted dynamic programming, Ann. Math. Statist. 36 (1965), 226-235.
[B10] D. Blackwell, On stationary policies, J. Roy. Statist. Soc. 133A (1970), 33-37.
[B11] D. Blackwell, Borel-programmable functions, Ann. Prob. 6 (1978), 321-324.
[B12] D. Blackwell, D. Freedman, and M. Orkin, The optimal reward operator in dynamic programming, Ann. Probability 2 (1974), 926-941.
[B13] N. Bourbaki, "General Topology." Addison-Wesley, Reading, Massachusetts, 1966.
[B14] D. W. Bressler and M. Sion, The current theory of analytic sets, Canad. J. Math. 16 (1964), 207-230.
[B15] L. D. Brown and R. Purves, Measurable selections of extrema, Ann. Statist. 1 (1973), 902-912.
[C1] D. Cenzer and R. D. Mauldin, Measurable parameterizations and selections, Trans. Amer. Math. Soc. (to appear).
[D1] C. Dellacherie, "Ensembles Analytiques, Capacites, Mesures de Hausdorff." Springer-Verlag, Berlin and New York, 1972.
[D2] E. V. Denardo, Contraction mappings in the theory underlying dynamic programming, SIAM Rev. 9 (1967), 165-177.
[D3] C. Derman, "Finite State Markovian Decision Processes." Academic Press, New York, 1970.
[D4] J. L. Doob, "Stochastic Processes." Wiley, New York, 1953.
[D5] L. Dubins and D. Freedman, Measurable sets of measures, Pacific J. Math. 14 (1964), 1211-1222.
[D6] L. Dubins and L. Savage, "Inequalities for Stochastic Processes (How to Gamble if You Must)." McGraw-Hill, New York, 1965. (Republished by Dover, New York, 1976.)
[D7] J. Dugundji, "Topology." Allyn & Bacon, Rockleigh, New Jersey, 1966.
[D8] E. B. Dynkin and A. A. Juskevic, "Controlled Markov Processes and their Applications." Moscow, 1975. (English translation to be published by Springer-Verlag.)
[F1] D. Freedman, The optimal reward operator in special classes of dynamic programming problems, Ann. Probability 2 (1974), 942-949.
[F2] E. B. Frid, On a problem of D. Blackwell from the theory of dynamic programming, Theor. Probability Appl. 15 (1970), 719-722.
[F3] N. Furukawa, Markovian decision processes with compact action spaces, Ann. Math. Statist. 43 (1972), 1612-1622.
[F4] N. Furukawa and S. Iwamoto, Markovian decision processes and recursive reward functions, Bull. Math. Statist. 15 (1973), 79-91.
[F5] N. Furukawa and S. Iwamoto, Dynamic programming on recursive reward systems, Bull. Math. Statist. 17 (1976), 103-126.
[G1] K. Godel, The consistency of the axiom of choice and of the generalized continuum-hypothesis, Proc. Nat. Acad. Sci. U.S.A. 24 (1938), 556-557.
[H1] P. R. Halmos, "Measure Theory." Van Nostrand-Reinhold, Princeton, New Jersey, 1950.
[H2] F. Hausdorff, "Set Theory." Chelsea, Bronx, New York, 1957.
[H3] C. J. Himmelberg, T. Parthasarathy, and F. S. Van Vleck, Optimal plans for dynamic programming problems, Math. Operations Res. 1 (1976), 390-394.
[H4] K. Hinderer, "Foundations of Nonstationary Dynamic Programming with Discrete Time Parameter." Springer-Verlag, Berlin and New York, 1970.
[H5] J. Hoffman-Jorgensen, "The Theory of Analytic Spaces." Aarhus Universitet, Aarhus, Denmark, 1970.
[H6] A. Hordijk, "Dynamic Programming and Markov Potential Theory." Mathematical Centre Tracts, Amsterdam, 1974.
[H7] R. Howard, "Dynamic Programming and Markov Processes." MIT Press, Cambridge, Massachusetts, 1960.
[J1] B. Jankov, On the uniformisation of A-sets, Dokl. Akad. Nauk SSSR 30 (1941), 591-592 (in Russian).
[J2] W. Jewell, Markov renewal programming I and II, Operations Res. 11 (1963), 938-971.
[J3] A. A. Juskevic (Yushkevich), Reduction of a controlled Markov model with incomplete data to a problem with complete information in the case of Borel state and control spaces, Theor. Probability Appl. 21 (1976), 153-158.
[K1] L. Kantorovich and B. Livenson, Memoir on analytical operations and projective sets, Fund. Math. 18 (1932), 214-279.
[K2] K. Kuratowski, "Topology I." Academic Press, New York, 1966.
[K3] K. Kuratowski, "Topology II." Academic Press, New York, 1968.
[K4] K. Kuratowski and A. Mostowski, "Set Theory." North-Holland, Amsterdam, 1976.
[K5] K. Kuratowski and C. Ryll-Nardzewski, A general theorem on selectors, Bull. Polish Acad. Sci. 13 (1965), 397-411.
[K6] H. Kushner, "Introduction to Stochastic Control." Holt, New York, 1971.
[L1] M. Loeve, "Probability Theory." Van Nostrand-Reinhold, Princeton, New Jersey, 1963.
[L2] N. Lusin, Sur les ensembles analytiques, Fund. Math. 10 (1927), 1-95.
[L3] N. Lusin and W. Sierpinski, Sur quelques propriétés des ensembles (A), Bull. Acad. Sci. Cracovie (1918), 35-48.
[M1] G. Mackey, Borel structure in groups and their duals, Trans. Amer. Math. Soc. 85 (1957), 134-165.
[M2] A. Maitra, Discounted dynamic programming on compact metric spaces, Sankhya 30A (1968), 211-216.
[M3] J. McQueen, A modified dynamic programming method for Markovian decision problems, J. Math. Anal. Appl. 14 (1966), 38-43.
[M4] P. A. Meyer, "Probability and Potentials." Ginn (Blaisdell), Boston, Massachusetts, 1966.
[M5] P. A. Meyer and M. Traki, Réduites et jeux de hasard (Séminaire de Probabilités VII, Université de Strasbourg, in "Lecture Notes in Mathematics," Vol. 321), pp. 155-171. Springer, Berlin, 1973.
[N1] J. von Neumann, On rings of operators. Reduction theory, Ann. of Math. 50 (1949), 401-485.
[O1] P. Olsen, Multistage stochastic programming with recourse: The equivalent deterministic problem, SIAM J. Control Optimization 14 (1976), 495-517.
[O2] P. Olsen, When is a multistage stochastic programming problem well-defined?, SIAM J. Control Optimization 14 (1976), 518-527.
[O3] P. Olsen, Multistage stochastic programming with recourse as mathematical programming in an Lp space, SIAM J. Control Optimization 14 (1976), 528-537.
[O4] D. Ornstein, On the existence of stationary optimal strategies, Proc. Amer. Math. Soc. 20 (1969), 563-569.
[O5] J. M. Ortega and W. C. Rheinboldt, "Iterative Solutions of Nonlinear Equations in Several Variables." Academic Press, New York, 1970.
[P1] K. Parthasarathy, "Probability Measures on Metric Spaces." Academic Press, New York, 1967.
[P2] Yu. V. Prohorov, Convergence of random processes and limit theorems in probability theory, Theor. Probability Appl. 1 (1956), 157-214.
[R1] D. Rhenius, Incomplete information in Markovian decision models, Ann. Statist. 2 (1974), 1327-1334.
[R2] R. T. Rockafellar, Integral functionals, normal integrands and measurable selections, in "Nonlinear Operators and the Calculus of Variations." Springer-Verlag, Berlin and New York, 1976.
[R3] R. T. Rockafellar and R. Wets, Stochastic convex programming: relatively complete recourse and induced feasibility, SIAM J. Control Optimization 14 (1976), 574-589.
[R4] R. T. Rockafellar and R. Wets, Stochastic convex programming: basic duality, Pacific J. Math. 62 (1976), 173-195.
[R5] H. L. Royden, "Real Analysis." Macmillan, New York, 1968.
[S1] S. Saks, "Theory of the Integral." Stechert, New York, 1937.
[S2] Y. Sawaragi and T. Yoshikawa, Discrete-time Markovian decision processes with incomplete state information, Ann. Math. Statist. 41 (1970), 78-86.
[S3] M. Schäl, On continuous dynamic programming with discrete time parameter, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 21 (1972), 279-288.
[S4] M. Schäl, On dynamic programming: Compactness of the space of policies, Stochastic Processes Appl. 3 (1975), 345-364.
[S5] M. Schäl, Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 32 (1975), 179-196.
[S6] E. Selivanovskij, Ob odnom klasse effektivnyh mnozestv (mnozestva C), Mat. Sb. 35 (1928), 379-413.
[S7] S. Shreve, A General Framework for Dynamic Programming with Specializations, M.S. thesis (1977), Dept. of Elec. Eng., Univ. of Illinois, Urbana.
[S8] S. Shreve, Dynamic Programming in Complete Separable Spaces, Ph.D. thesis (1977), Dept. of Math., Univ. of Illinois, Urbana.
[S9] S. Shreve and D. P. Bertsekas, A new theoretical framework for finite horizon stochastic control, Proc. Fourteenth Annual Allerton Conf. Circuit and System Theory, Allerton Park, Illinois, October, 1976, 336-343.
[S10] S. Shreve and D. P. Bertsekas, Equivalent stochastic and deterministic optimal control problems, Proc. 1976 IEEE Conf. Decision and Control, Clearwater Beach, Florida, 705-709.
[S11] S. Shreve and D. P. Bertsekas, Alternative theoretical frameworks for finite horizon discrete-time stochastic optimal control, SIAM J. Control Optimization 16 (1978).
[S12] S. Shreve and D. P. Bertsekas, Universally measurable policies in dynamic programming, Math. Operations Res. (to appear).
[S13] R. Solovay, A model of set theory in which every set of reals is Lebesgue measurable, Ann. of Math. 92 (1970), 1-56.
[S14] R. E. Strauch, Negative dynamic programming, Ann. Math. Statist. 37 (1966), 871-890.
[S15] C. Striebel, Sufficient statistics in the optimal control of stochastic systems, J. Math. Anal. Appl. 12 (1965), 576-592.
[S16] C. Striebel, "Optimal Control of Discrete Time Stochastic Systems." Springer-Verlag, Berlin and New York, 1975.
[S17] M. Suslin (Souslin), Sur une définition des ensembles mesurables B sans nombres transfinis, C. R. Acad. Sci. Paris 164 (1917), 88-91.
[V1] V. S. Varadarajan, Weak convergence of measures on separable metric spaces, Sankhya 19 (1958), 15-22.
[W1] D. H. Wagner, Survey of measurable selection theorems, SIAM J. Control Optimization 15 (1977), 859-903.
[W2] A. Wald, "Statistical Decision Functions." Wiley, New York, 1950.
[W3] H. S. Witsenhausen, A standard form for sequential stochastic control, Math. Systems Theory 7 (1973), 5-11.
Table of Propositions, Lemmas, Definitions, and Assumptions
Chapter 2
Monotonicity Assumption  27
Chapter 3
Proposition 3.1  40
Proposition 3.2  43
Proposition 3.3  44
Proposition 3.4  46
Proposition 3.5  47
Proposition 3.6  50
Proposition 3.7  51

Lemma 3.1  45

Assumption F.1  39
Assumption F.2  40
Assumption F.3  40

Chapter 4
Proposition 4.1  53
Proposition 4.2  55
Proposition 4.3  56
Proposition 4.4  57
Proposition 4.5  59
Proposition 4.6  60
Proposition 4.7  62
Proposition 4.8  64
Proposition 4.9  64
Proposition 4.10  68
Proposition 4.11  69

Assumption C (Contraction Assumption)  52
Fixed Point Theorem  55
Chapter 5
Proposition 5.1  71
Proposition 5.2  73
Proposition 5.3  75
Proposition 5.4  78
Proposition 5.5  78
Proposition 5.6  79
Proposition 5.7  80
Proposition 5.8  81
Proposition 5.9  84
Proposition 5.10  86
Proposition 5.11  87
Proposition 5.12  88
Proposition 5.13  89
Proposition 5.14  90
Proposition 5.15  90

Lemma 5.1  75
Lemma 5.2  82

Assumption I (Uniform Increase Assumption)  70
Assumption D (Uniform Decrease Assumption)  70
Assumption I.1  71
Assumption I.2  71
Assumption D.1  71
Assumption D.2  71
Chapter 6
Proposition 6.1  95
Proposition 6.2  95
Proposition 6.3  96
Proposition 6.4  97
Proposition 6.5  97

Assumption A.1  93
Assumption A.2  93
Assumption A.3  93
Assumption A.4  93
Assumption A.5  93
Assumption F.2  94
Assumption F.3  95
Exact Selection Assumption  95
Assumption C  96
Chapter 7
Proposition 7.1  106
Proposition 7.2 (Urysohn's Theorem)  106
Proposition 7.3 (Alexandroff's Theorem)  107
Proposition 7.4  108
Proposition 7.5  109
Proposition 7.6  112
Proposition 7.7  113
Proposition 7.8  114
Proposition 7.9  116
Proposition 7.10  117
Proposition 7.11  118
Proposition 7.12  119
Proposition 7.13  119
Proposition 7.14  120
Proposition 7.15 (Kuratowski's Theorem)  121
Proposition 7.16  121
Proposition 7.17  122
Proposition 7.18  124
Proposition 7.19  127
Proposition 7.20  127
Proposition 7.21  128
Proposition 7.22  130
Proposition 7.23  131
Proposition 7.24 (Dynkin System Theorem)  133
Proposition 7.25  133
Proposition 7.26  134
Proposition 7.27  135
Proposition 7.28  140
Proposition 7.29  144
Proposition 7.30  145
Proposition 7.31  148
Proposition 7.32  148
Proposition 7.33  153
Proposition 7.34  154
Proposition 7.35  158
Proposition 7.36  161
Proposition 7.37  164
Proposition 7.38  165
Proposition 7.39  165
Proposition 7.40  165
Proposition 7.41  166
Proposition 7.42 (Lusin's Theorem)  167
Proposition 7.43  169
Proposition 7.44  172
Proposition 7.45  175
Proposition 7.46  177
Proposition 7.47  179
Proposition 7.48  180
Proposition 7.49 (Jankov-von Neumann Theorem)  182
Proposition 7.50  184

Lemma 7.1 (Urysohn's Lemma)  105
Lemma 7.2  105
Lemma 7.3  116
Lemma 7.4  119
Lemma 7.5  119
Lemma 7.6  125
Lemma 7.7  125
Lemma 7.8  125
Lemma 7.9  127
Lemma 7.10  131
Lemma 7.11  139
Lemma 7.12  144
Lemma 7.13  146
Lemma 7.14  147
Lemma 7.15  149
Lemma 7.16  150
Lemma 7.17  151
Lemma 7.18  151
Lemma 7.19  152
Lemma 7.20  152
Lemma 7.21  154
Lemma 7.22  161
Lemma 7.23  162
Lemma 7.24  163
Lemma 7.25  164
Lemma 7.26  172
Lemma 7.27  173
Lemma 7.28  174
Lemma 7.29  174
Lemma 7.30  177

Definition 7.1  104
Definition 7.2  105
Definition 7.3  107
Definition 7.4  112
Definition 7.5  114
Definition 7.6  117
Definition 7.7  118
Definition 7.8  120
Definition 7.9  121
Definition 7.10  122
Definition 7.11  133
Definition 7.12  134
Definition 7.13  146
Definition 7.14  157
Definition 7.15  157
Definition 7.16  160
Definition 7.17  161
Definition 7.18  167
Definition 7.19  171
Definition 7.20  171
Definition 7.21  177
Chapter 8
Proposition 8.1  192
Proposition 8.2  198
Proposition 8.3  200
Proposition 8.4  203
Proposition 8.5  207
Proposition 8.6  209
Proposition 8.7  211

Lemma 8.1  194
Lemma 8.2  196
Lemma 8.3  196
Lemma 8.4  197
Lemma 8.5  202
Lemma 8.6  205
Lemma 8.7  206

Definition 8.1  188
Definition 8.2  190
Definition 8.3  191
Definition 8.4  194
Definition 8.5  195
Definition 8.6  206
Definition 8.7  208
Definition 8.8  210

Chapter 9
Proposition 9.1  216
Proposition 9.2  219
Proposition 9.3  219
Proposition 9.4  220
Proposition 9.5  223
Proposition 9.6  224
Proposition 9.7  224
Proposition 9.8  225
Proposition 9.9  226
Proposition 9.10  226
Proposition 9.11  227
Proposition 9.12  227
Proposition 9.13  228
Proposition 9.14  231
Proposition 9.15  231
Proposition 9.16  232
Proposition 9.17  234
Proposition 9.18  236
Proposition 9.19  237
Proposition 9.20  239
Proposition 9.21  241

Lemma 9.1  220
Lemma 9.2  221
Lemma 9.3  230

Definition 9.1  213
Definition 9.2  214
Definition 9.3  214
Definition 9.4  216
Definition 9.5  217
Definition 9.6  217
Definition 9.7  217
Definition 9.8  217
Definition 9.9  218
Definition 9.10  229
Chapter 10
Proposition 10.1  246
Proposition 10.2  254
Proposition 10.3  256
Proposition 10.4  257
Proposition 10.5  262
Proposition 10.6  264

Lemma 10.1  253
Lemma 10.2  255
Lemma 10.3  260
Lemma 10.4  261

Definition 10.1  243
Definition 10.2  245
Definition 10.3  248
Definition 10.4  249
Definition 10.5  249
Definition 10.6  250
Definition 10.7  251
Definition 10.8  256

Chapter 11

Proposition 11.1  266
Proposition 11.2  267
Proposition 11.3  267
Proposition 11.4  268
Proposition 11.5  269
Proposition 11.6  270
Proposition 11.7  272

Appendix A

Proposition A.1  278

Lemma A.1  273
Lemma A.2  274
Lemma A.3  275

Definition A.1  273

Appendix B

Proposition B.1  282
Proposition B.2  285
Proposition B.3  289
Proposition B.4  290
Proposition B.5  291
Proposition B.6  292
Proposition B.7  293
Proposition B.8  295
Proposition B.9  296
Proposition B.10  297
Proposition B.11  299
Proposition B.12  300

Lemma B.1  285
Lemma B.2  285
Lemma B.3  286
Lemma B.4  287
Lemma B.5  288
Lemma B.6  294
Lemma B.7  295
Lemma B.8  298

Definition B.1  290
Definition B.2  293
Definition B.3  298

Appendix C

Proposition C.1  304
Proposition C.2  306
Proposition C.3  308
Proposition C.4  310

Lemma C.1  306
Lemma C.2  310

Definition C.1  303
Definition C.2  310
Index
A
Alexandroff's theorem, 107
Analytic measurability of a function, 171
Analytic set, 160
Analytic σ-algebra, 171
A posteriori distribution, 260ff
A priori distribution, 260ff
Axiom of choice, 301
B

Baire null space, 103, 109
Borel isomorphism, 121
Borel measurability of a function, 120
Borel programmable, 21
Borel σ-algebra, 117
Borel space, 118

C

Cantor's continuum hypothesis, 301ff
Completion
  of a metric space, 114
  of a σ-algebra, 167
Composition of measurable functions, 298
Contraction assumption, 52
Control
  constraint, 2, 26, 188, 216, 243, 245, 248, 251, 271
  space, 2, 26, 188, 216, 243, 245, 248, 251, 271
Cost
  corresponding to a policy, 2, 28, 191, 217, 244, 249, 254
  one-stage, 2, 189, 216, 243, 245, 248, 251, 271
  optimal, 2, 29, 191, 217, 244, 246, 250, 254
C-sets, 20

D

Disturbance kernel, 189, 243, 245, 271
Disturbance space, 189, 243, 245, 271
Dynamic programming (DP) algorithm, 3, 6, 39, 57, 80, 198, 229, 259
Dynkin system, 133
Dynkin system theorem, 133

E

Epigraph, 82
Exact selection assumption, 95
Exponential topology, 304
F
Filtering, 261
Fixed point theorem (Banach), 55
Fσ-set, 102
G
Gδ-set, 102
Hausdorff metric, 303
Hilbert cube, 103
Homeomorphism, 104
Horizon
  finite, 28, 189, 243, 245, 248, 251, 271
  infinite, 70, 213, 216, 243, 245, 248, 251

I

Imperfect state information model, 248
Indicator function, 103
Information vector, 248
Isometry, 144
J

Jankov-von Neumann theorem, 182

K

Kuratowski's theorem, 121

L
Limit measurability, 298
Limit σ-algebra, 293
Lindelof space, 106
Lower semianalytic function, 177
Lower semicontinuous function, 146
Lower semicontinuous model, 208

M

Metrizable space, 104
Monotonicity assumption, 27
N
Nonstationary model, 243

O

Observation kernel, 248
Observation space, 248
Optimality equation, 4, 57, 71, 73, 78ff, 225
  nonstationary, 246
Outer integral, 273
  monotone convergence theorem for, 278

P
Paved space, 157
Policy, 2, 6, 26, 91, 190, 214, 217, 243, 249
  analytically measurable, 190, 269ff
  Borel-measurable, 190
  ε-optimal, 29, 191, 215, 244
  {εn}-dominated convergence to optimality, 29, 191, 245
  k-originating, 243
  limit-measurable, 190, 266ff
  Markov, 6, 190
  nonrandomized, 190, 249
  optimal, 29, 191, 215, 244
  p-ε-optimal, 12
  q-optimal, 256
  semi-Markov, 190
  stationary, 214
  uniformly N-stage optimal, 29, 206
  universally measurable, 190
  weakly q-ε-optimal, 256
p-outer measure, 166, 274
Projection mapping, 103

R
Regular probability measure, 122
Relative topology, 104
R-operator, 21
S

Second countable space, 106
Semi-Markov decision problems, 34
Separable space, 105
State space, 2, 26, 188, 216, 243, 245, 248, 251, 271
State transition kernel, 189, 243, 248, 251
Statistic sufficient for control, 250
  existence of, 259ff
Stochastic kernel, 134
Stochastic programming, 11ff
Suslin scheme, 157
  nucleus of, 157
  regular, 161
System function, 189, 216, 243, 245, 271

T

Topologically complete space, 107
Totally bounded space, 112
U

Uniform decrease assumption, 70
Uniform increase assumption, 70
Universal function, 290
Universal measurability of a function, 171
Universal σ-algebra, 167
Upper semicontinuous function, 146
Upper semicontinuous model, 210
Upper semicontinuous (K) function, 310
Urysohn's lemma, 105
Urysohn's theorem, 106
W

Weak topology on space of probability measures, 125