... $\to \infty$) iff there is a set $A \in \mathcal{F}$ with $\mu(A) = 0$ and $f_n - f_m \to 0$ uniformly on $A^c$. It is immediate that the Hölder inequality still holds when $p = 1$, $q = \infty$, and we have shown above that the Minkowski inequality holds when $p = \infty$. To show that $L^\infty$ is complete, let $\{f_n\}$ be a Cauchy sequence in $L^\infty$, and let $A$ be a set of measure 0 such that $f_n(\omega) - f_m(\omega) \to 0$ uniformly for $\omega \in A^c$. But then $f_n(\omega)$ converges to a limit $f(\omega)$ for each $\omega \in A^c$, and the convergence is uniform on $A^c$. If we define $f(\omega) = 0$ for $\omega \in A$, we have $f \in L^\infty$ and $f_n \to f$ in $L^\infty$.

Theorem 2.4.13 holds also when $p = \infty$. For if $f$ is a function in $L^\infty$, the standard approximating sequence $\{f_n\}$ of simple functions (see 1.5.5) converges to $f$ uniformly, outside a set of measure 0. However, Theorem 2.4.14 fails when $p = \infty$ (see Problem 12).

If $\Omega$ is an arbitrary set, $\mathcal{F}$ consists of all subsets of $\Omega$, and $\mu$ is counting measure, then $L^\infty(\Omega, \mathcal{F}, \mu)$ is the set of all bounded complex-valued functions $f = (f(\alpha), \alpha \in \Omega)$, denoted by $l^\infty(\Omega)$. The essential supremum is simply the supremum; in other words, $\|f\|_\infty = \sup\{|f(\alpha)| : \alpha \in \Omega\}$. If $\Omega$ is the set of positive integers, $l^\infty(\Omega)$ is the space of bounded sequences of complex numbers, denoted simply by $l^\infty$.

Problems
1. (a) If $f = \{a_n, n = 1, 2, \ldots\}$, the $a_n$ are real or complex numbers, and $\mu$ is counting measure on subsets of the positive integers, show that $\int_\Omega f\,d\mu = \sum_{n=1}^\infty a_n$, where the sum is interpreted as in 2.4.12.
(b) If $f = (f(\alpha), \alpha \in \Omega)$ is a real- or complex-valued function on the arbitrary set $\Omega$, and $\mu$ is counting measure on subsets of $\Omega$, show that $\int_\Omega f\,d\mu = \sum_\alpha f(\alpha)$, where the sum is interpreted as in 2.4.12.
2. Give an example of functions $f, f_1, f_2, \ldots$ from $\mathbb{R}$ to $[0, 1]$ such that (a) each $f_n$ is continuous on $\mathbb{R}$, (b) $f_n(x)$ converges to $f(x)$ for all $x$, $\int_{-\infty}^{\infty} |f_n(x) - f(x)|^p\,dx \to 0$ for every $p \in (0, \infty)$, and (c) $f$ is discontinuous at some point of $\mathbb{R}$.

2.4 $L^p$ SPACES

3. For each $n = 1, 2, \ldots$, let $\{a_1^{(n)}, a_2^{(n)}, \ldots\}$ be a sequence of complex numbers.
(a) If the $a_k^{(n)}$ are real and $0 \le a_k^{(n)} \le a_k^{(n+1)}$ for all $k$ and $n$, show that
$$\lim_{n\to\infty} \sum_{k=1}^\infty a_k^{(n)} = \sum_{k=1}^\infty \lim_{n\to\infty} a_k^{(n)}.$$
Show that the same conclusion holds if the $a_k^{(n)}$ are complex and $|a_k^{(n)}| \le b_k$ for all $k$ and $n$, where $\sum_{k=1}^\infty b_k < \infty$.
(b) If the $a_k^{(n)}$ are real and nonnegative, show that
$$\sum_{n=1}^\infty \sum_{k=1}^\infty a_k^{(n)} = \sum_{k=1}^\infty \sum_{n=1}^\infty a_k^{(n)}.$$
(c) If the $a_k^{(n)}$ are complex and $\sum_{n=1}^\infty \sum_{k=1}^\infty |a_k^{(n)}| < \infty$, show that $\sum_{n=1}^\infty \sum_{k=1}^\infty a_k^{(n)}$ and $\sum_{k=1}^\infty \sum_{n=1}^\infty a_k^{(n)}$ both converge to the same finite number.
4. Show that there is equality in the Hölder inequality iff $|f|^p$ and $|g|^q$ are linearly dependent, that is, iff $A|f|^p = B|g|^q$ a.e. for some constants $A$ and $B$, not both 0.
5. If $f$ is a complex-valued $\mu$-integrable function, show that $|\int_\Omega f\,d\mu| = \int_\Omega |f|\,d\mu$ iff $\arg f$ is a.e. constant on $\{\omega : f(\omega) \ne 0\}$.
6. Show that equality holds in the Cauchy-Schwarz inequality iff $f$ and $g$ are linearly dependent.
7. (a) If $1 < p < \infty$, show that equality holds in the Minkowski inequality iff $Af = Bg$ a.e. for some nonnegative constants $A$ and $B$, not both 0.
(b) What are the conditions for equality if $p = 1$?
8.
If $1 \le r < s < \infty$, and $f \in L^s(\Omega, \mathcal{F}, \mu)$, $\mu$ finite, show that $\|f\|_r \le k\|f\|_s$ for some finite positive constant $k$. Thus $L^s \subset L^r$ and $L^s$ convergence implies $L^r$ convergence. (We may take $k = 1$ if $\mu$ is a probability measure.) Note that finiteness of $\mu$ is essential here; if $\mu$ is Lebesgue measure on $\mathcal{B}(\mathbb{R})$ and $f(x) = 1/x$ for $x \ge 1$, $f(x) = 0$ for $x < 1$, then $f \in L^2$ but $f \notin L^1$.
9. If $\mu$ is finite, show that $\|f\|_p \to \|f\|_\infty$ as $p \to \infty$. Give an example to show that this fails if $\mu(\Omega) = \infty$.
10. (Radon-Nikodym theorem, complex case) If $\mu$ is a $\sigma$-finite (nonnegative, real) measure, $\lambda$ a complex measure on $(\Omega, \mathcal{F})$, and $\lambda \ll \mu$, show that there is a complex-valued $\mu$-integrable function $g$ such that $\lambda(A) = \int_A g\,d\mu$ for all $A \in \mathcal{F}$. If $h$ is another such function, $g = h$ a.e. Show also that the Lebesgue decomposition theorem holds if $\lambda$ is a complex measure and $\mu$ is a $\sigma$-finite measure. (See Problem 6, Section 2.2, for properties of complex measures.)

2 FURTHER RESULTS IN MEASURE AND INTEGRATION THEORY

11. (a) Let $f$ be a complex-valued $\mu$-integrable function, where $\mu$ is a nonnegative real measure. If $S$ is a closed set of complex numbers and $[1/\mu(E)] \int_E f\,d\mu \in S$ for all measurable sets $E$ such that $\mu(E) > 0$, show that $f(\omega) \in S$ for almost every $\omega$. [If $D$ is a closed disk with center at $z$ and radius $r$, and $D \subset S^c$, take $E = f^{-1}(D)$. Show that $|\int_E (f - z)\,d\mu| \le r\mu(E)$, and conclude that $\mu(E) = 0$.]
(b) If $\lambda$ is a complex measure, then $\lambda \ll |\lambda|$ by definition of $|\lambda|$; hence by the Radon-Nikodym theorem, there is a $|\lambda|$-integrable complex-valued function $h$ such that $\lambda(E) = \int_E h\,d|\lambda|$ for all $E \in \mathcal{F}$. Show that $|h| = 1$ a.e. $[|\lambda|]$. [Let $A_r = \{\omega : |h(\omega)| < r\}$, $0 < r < 1$, and use the definition of $|\lambda|$ to show that $|h| \ge 1$ a.e. Use part (a) to show $|h| \le 1$ a.e.]
(c) Let $\mu$ be a nonnegative real measure, $g$ a complex-valued $\mu$-integrable function, and $\lambda(E) = \int_E g\,d\mu$, $E \in \mathcal{F}$. If $h = d\lambda/d|\lambda|$ as in part (b), show that $|\lambda|(E) = \int_E \bar{h} g\,d\mu$. (Intuitively, $\bar{h} g\,d\mu = \bar{h}\,d\lambda = \bar{h} h\,d|\lambda| = |h|^2\,d|\lambda| = d|\lambda|$. Formally, show that $\int_\Omega f h\,d|\lambda| = \int_\Omega f g\,d\mu$ if $f$ is a bounded, complex-valued, Borel measurable function, and set $f = \bar{h} I_E$.)
(d) Under the hypothesis of (c), show that
$$|\lambda|(E) = \int_E |g|\,d\mu \quad \text{for all } E \in \mathcal{F}.$$
12. Give an example of a bounded real-valued function $f$ on $\mathbb{R}$ such that there is no sequence of continuous functions $f_n$ such that $\|f_n - f\|_\infty \to 0$. Thus the continuous functions are not dense in $L^\infty(\mathbb{R})$.
2.5 Convergence of Sequences of Measurable Functions
In the previous section we introduced the notion of LP convergence; we are also familiar with convergence almost everywhere. We now consider other types of convergence and make comparisons.
Let $f, f_1, f_2, \ldots$ be complex-valued Borel measurable functions on $(\Omega, \mathcal{F}, \mu)$. We say that $f_n \to f$ in measure (or in $\mu$-measure if we wish to emphasize the dependence on $\mu$) iff for every $\varepsilon > 0$, $\mu\{\omega : |f_n(\omega) - f(\omega)| \ge \varepsilon\} \to 0$ as $n \to \infty$. (Notation: $f_n \xrightarrow{\mu} f$.) When $\mu$ is a probability measure, the convergence is called convergence in probability. The first result shows that $L^p$ convergence is stronger than convergence in measure.

2.5.1 Theorem. If $f, f_1, f_2, \ldots \in L^p$ $(0 < p < \infty)$, then $f_n \xrightarrow{L^p} f$ implies $f_n \xrightarrow{\mu} f$.

PROOF. Apply Chebyshev's inequality (2.4.9) to $|f_n - f|$. ∎
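The Chebyshev bound behind 2.5.1 can be checked numerically on step functions. The following is a minimal sketch, not part of the original text: the helper names and the example sequence `spike` (playing the role of $f_n - f$ on $[0, 1]$ with Lebesgue measure) are assumptions made for illustration.

```python
def lp_norm_to_p(steps, p):
    """Integral of |g|**p for a step function given as (interval_length, value) pairs."""
    return sum(length * abs(value) ** p for length, value in steps)


def measure_at_least(steps, eps):
    """Lebesgue measure of {|g| >= eps} for the same step function."""
    return sum(length for length, value in steps if abs(value) >= eps)


def spike(n):
    """Hypothetical g_n = f_n - f: height n**0.25 on [0, 1/n], zero on the rest of [0, 1]."""
    return [(1.0 / n, n ** 0.25), (1.0 - 1.0 / n, 0.0)]


if __name__ == "__main__":
    p, eps = 2, 0.5
    for n in (1, 10, 100, 10000):
        g = spike(n)
        # Chebyshev: mu{|g| >= eps} <= eps**(-p) * integral of |g|**p
        print(n, measure_at_least(g, eps), lp_norm_to_p(g, p) / eps ** p)
```

Here the $L^2$ norms tend to 0 although the heights $n^{1/4}$ are unbounded, and the measure of the set where $|g_n| \ge \varepsilon$ is forced to 0 by the bound.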
The same argument shows that if $\{f_n\}$ is a Cauchy sequence in $L^p$, then $\{f_n\}$ is Cauchy in measure, that is, given $\varepsilon > 0$, $\mu\{\omega : |f_n(\omega) - f_m(\omega)| \ge \varepsilon\} \to 0$ as $n, m \to \infty$.

If $f, f_1, f_2, \ldots$ are complex-valued Borel measurable functions on $(\Omega, \mathcal{F}, \mu)$, we say that $f_n \to f$ almost uniformly iff, given $\varepsilon > 0$, there is a set $A \in \mathcal{F}$ such that $\mu(A) < \varepsilon$ and $f_n \to f$ uniformly on $A^c$. Almost uniform convergence is stronger than both a.e. convergence and convergence in measure, as we now prove.

2.5.2 Theorem. If $f_n \to f$ almost uniformly, then $f_n \to f$ in measure and almost everywhere.
PROOF. If $\varepsilon > 0$, let $f_n \to f$ uniformly on $A^c$, with $\mu(A) < \varepsilon$. If $\delta > 0$, then eventually $|f_n - f| < \delta$ on $A^c$, so $\{|f_n - f| \ge \delta\} \subset A$. Thus $\mu\{|f_n - f| \ge \delta\} \le \mu(A) < \varepsilon$, proving convergence in measure.

To prove almost everywhere convergence, choose, for each positive integer $k$, a set $A_k$ with $\mu(A_k) < 1/k$ and $f_n \to f$ uniformly on $A_k^c$. If $B = \bigcup_{k=1}^\infty A_k^c$, then $f_n \to f$ on $B$ and $\mu(B^c) = \mu(\bigcap_{k=1}^\infty A_k) \le \mu(A_k) \to 0$ as $k \to \infty$. Thus $\mu(B^c) = 0$ and the result follows. ∎

The converse to 2.5.2 does not hold in general, as we shall see in 2.5.6(c), but we do have the following result.
2.5.3 Theorem. If $\{f_n\}$ is convergent in measure, there is a subsequence converging almost uniformly (in particular, a.e. and in measure) to the same limit function.

PROOF. First note that $\{f_n\}$ is Cauchy in measure, because if $|f_n - f_m| \ge \varepsilon$, then either $|f_n - f| \ge \varepsilon/2$ or $|f - f_m| \ge \varepsilon/2$. Thus
$$\mu\{|f_n - f_m| \ge \varepsilon\} \le \mu\{|f_n - f| \ge \varepsilon/2\} + \mu\{|f - f_m| \ge \varepsilon/2\} \to 0 \quad \text{as } n, m \to \infty.$$
Now for each positive integer $k$, choose a positive integer $N_k$ such that $N_{k+1} > N_k$ for all $k$ and
$$\mu\{\omega : |f_n(\omega) - f_m(\omega)| \ge 2^{-k}\} \le 2^{-k} \quad \text{for } n, m \ge N_k.$$
Pick integers $n_k \ge N_k$, $k = 1, 2, \ldots$; then if $g_k = f_{n_k}$,
$$\mu\{\omega : |g_k(\omega) - g_{k+1}(\omega)| \ge 2^{-k}\} \le 2^{-k}.$$
Let $A_k = \{|g_k - g_{k+1}| \ge 2^{-k}\}$, $A = \limsup_k A_k$. Then $\mu(A) = 0$ by 2.2.4; but if $\omega \notin A$, then $\omega \in A_k$ for only finitely many $k$; hence $|g_k(\omega) - g_{k+1}(\omega)| < 2^{-k}$ for large $k$, and it follows that $g_k(\omega)$ converges to a limit $g(\omega)$. Since $\mu(A) = 0$ we have $g_k \to g$ a.e.

If $B_r = \bigcup_{k=r}^\infty A_k$, then $\mu(B_r) \le \sum_{k=r}^\infty \mu(A_k) < \varepsilon$ for large $r$. If $\omega \notin B_r$, then $|g_k(\omega) - g_{k+1}(\omega)| < 2^{-k}$, $k = r, r + 1, r + 2, \ldots$. By the Weierstrass M-test, $g_k \to g$ uniformly on $B_r^c$, which proves almost uniform convergence. Now by hypothesis, we have $f_n \xrightarrow{\mu} f$ for some $f$, hence $g_k \xrightarrow{\mu} f$. But by 2.5.2, $g_k \xrightarrow{\mu} g$ as well, hence $f = g$ a.e. (see Problem 1). Thus $g_k$ converges almost uniformly to $f$, completing the proof. ∎

There is a partial converse to 2.5.2, but before discussing this it will be convenient to look at a condition equivalent to a.e. convergence:
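2.5.4 Lemma. If $\mu$ is finite, then $f_n \to f$ a.e. iff for every $\delta > 0$,
$$\mu\Big(\bigcup_{k=n}^\infty \{\omega : |f_k(\omega) - f(\omega)| \ge \delta\}\Big) \to 0 \quad \text{as } n \to \infty.$$

PROOF. Let $B_{n\delta} = \{\omega : |f_n(\omega) - f(\omega)| \ge \delta\}$, $B_\delta = \limsup_n B_{n\delta} = \bigcap_{n=1}^\infty \bigcup_{k=n}^\infty B_{k\delta}$. Now $\bigcup_{k=n}^\infty B_{k\delta} \downarrow B_\delta$; hence $\mu(\bigcup_{k=n}^\infty B_{k\delta}) \to \mu(B_\delta)$ as $n \to \infty$ by 1.2.7(b). Now
$$\{\omega : f_n(\omega) \not\to f(\omega)\} = \bigcup_{\delta > 0} B_\delta = \bigcup_{m=1}^\infty B_{1/m} \quad \text{since } B_{\delta_1} \subset B_{\delta_2} \text{ for } \delta_1 > \delta_2.$$
Therefore, $f_n \to f$ a.e. iff $\mu(B_\delta) = 0$ for all $\delta > 0$ iff $\mu(\bigcup_{k=n}^\infty B_{k\delta}) \to 0$ for all $\delta > 0$. ∎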
2.5.5 Egoroff's Theorem. If $\mu$ is finite and $f_n \to f$ a.e., then $f_n \to f$ almost uniformly. Hence by 2.5.2, if $\mu$ is finite, then almost everywhere convergence implies convergence in measure.

PROOF. It follows from 2.5.4 that given $\varepsilon > 0$ and a positive integer $j$, for sufficiently large $n = n(j)$, the set $A_j = \bigcup_{k=n(j)}^\infty \{|f_k - f| \ge 1/j\}$ has measure less than $\varepsilon/2^j$. If $A = \bigcup_{j=1}^\infty A_j$, then $\mu(A) \le \sum_{j=1}^\infty \mu(A_j) < \varepsilon$. Also, if $\delta > 0$ and $j$ is chosen so that $1/j < \delta$, we have, for any $k \ge n(j)$ and $\omega \in A^c$ (hence $\omega \notin A_j$), $|f_k(\omega) - f(\omega)| < 1/j < \delta$. Thus $f_n \to f$ uniformly on $A^c$. ∎
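As an informal illustration of almost uniform convergence, consider $f_n(x) = x^n$ on $[0, 1]$ with Lebesgue measure, which converges to 0 except at $x = 1$. The sketch below is an assumption-laden example rather than anything from the text: the exceptional set is taken to be $A = (1 - \varepsilon, 1]$, and the function names are hypothetical.

```python
def sup_on_good_set(n: int, eps: float) -> float:
    """sup of |x**n - 0| over [0, 1 - eps].

    x**n is increasing on [0, 1], so the supremum is attained at x = 1 - eps.
    """
    return (1.0 - eps) ** n


def first_index_below(delta: float, eps: float) -> int:
    """Smallest n with sup over [0, 1 - eps] of x**n strictly below delta."""
    n = 1
    while sup_on_good_set(n, eps) >= delta:
        n += 1
    return n


if __name__ == "__main__":
    # Off a set of measure eps the convergence is uniform ...
    for eps in (0.1, 0.01):
        print(eps, first_index_below(0.01, eps))
    # ... but with eps = 0 the supremum never drops below 1.
    print(sup_on_good_set(1000, 0.0))
```

The smaller the exceptional set, the later the uniform bound takes effect, which matches the role of $n(j)$ in the proof above.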
We now give some examples to illustrate the relations between the various types of convergence. In all cases, we assume that $\mathcal{F}$ is the class of Borel sets and $\mu$ is Lebesgue measure.

2.5.6 Examples. (a)
Let .0 = [0, 1] and define Le
f(x) =
if
0 x). If $B \subset C_1$ or $B \subset C_2$, show that $B \in \mathcal{F}$. (The relation $y < x$ refers to the ordering of $y$ and $x$ as ordinals, not as real numbers.)
(c) Show that $\mathcal{F}$ consists of all subsets of $\Omega$.
8. Show that a measurable function of one variable is jointly measurable. Specifically, if $g: (\Omega_1, \mathcal{F}_1) \to (\Omega', \mathcal{F}')$ and we define $f: \Omega_1 \times \Omega_2 \to \Omega'$ by $f(\omega_1, \omega_2) = g(\omega_1)$, then $f$ is measurable relative to $\mathcal{F}_1 \times \mathcal{F}_2$ and $\mathcal{F}'$, regardless of the nature of $\mathcal{F}_2$.

† Rao, B. V., Bull. Amer. Math. Soc. 75, 614 (1969).

9. Give an example of a function $f: [0, 1] \times [0, 1] \to [0, 1]$ such that (a) $f(x, y)$ is Borel measurable in $y$ for each fixed $x$ and Borel measurable in $x$ for each fixed $y$, (b) $f$ is not jointly measurable, that is, $f$ is not measurable relative to the product $\sigma$-field $\mathcal{B}[0, 1] \times \mathcal{B}[0, 1]$, and (c) $\int_0^1 (\int_0^1 f(x, y)\,dy)\,dx$ and $\int_0^1 (\int_0^1 f(x, y)\,dx)\,dy$ exist but are unequal. (One example is suggested by Problem 7.)
2.7 Measures on Infinite Product Spaces
The n-dimensional product measure theorem formalizes the notion of an n-stage random experiment, where the probability of an event associated with the nth stage depends on the result of the first $n - 1$ trials. It will be convenient later to have a single probability space which is adequate to handle n-stage experiments for n arbitrarily large (not fixed in advance). Such a space can be constructed if the product measure theorem can be extended to infinitely many dimensions. Our first task is to construct the product of infinitely many $\sigma$-fields.
2.7.1 Definitions. For each $j = 1, 2, \ldots$, let $(\Omega_j, \mathcal{F}_j)$ be a measurable space. Let $\Omega = \prod_{j=1}^\infty \Omega_j$, the set of all sequences $(\omega_1, \omega_2, \ldots)$ such that $\omega_j \in \Omega_j$, $j = 1, 2, \ldots$. If $B^n \subset \prod_{j=1}^n \Omega_j$, we define $B_n = \{\omega \in \Omega : (\omega_1, \ldots, \omega_n) \in B^n\}$. The set $B_n$ is called the cylinder with base $B^n$; the cylinder is said to be measurable iff $B^n \in \prod_{j=1}^n \mathcal{F}_j$. If $B^n = A_1 \times \cdots \times A_n$, where $A_i \subset \Omega_i$ for each $i$, $B_n$ is called a rectangle, a measurable rectangle iff $A_i \in \mathcal{F}_i$ for each $i$.

A cylinder with an n-dimensional base may always be regarded as having a higher dimensional base. For example, if
$$B = \{\omega \in \Omega : (\omega_1, \omega_2, \omega_3) \in B^3\},$$
then
$$B = \{\omega \in \Omega : (\omega_1, \omega_2, \omega_3) \in B^3,\ \omega_4 \in \Omega_4\} = \{\omega \in \Omega : (\omega_1, \omega_2, \omega_3, \omega_4) \in B^3 \times \Omega_4\}.$$
It follows that the measurable cylinders form a field. It is also true that finite disjoint unions of measurable rectangles form a field; the argument is the same as in Problem 1 of Section 2.6. The minimal $\sigma$-field over the measurable cylinders is called the product of the $\sigma$-fields $\mathcal{F}_j$, written $\prod_{j=1}^\infty \mathcal{F}_j$; $\prod_{j=1}^\infty \mathcal{F}_j$ is also the minimal $\sigma$-field over the measurable rectangles (see Problem 1). If all $\mathcal{F}_j$ coincide with a fixed $\sigma$-field $\mathcal{F}$, then $\prod_{j=1}^\infty \mathcal{F}_j$ is denoted by $\mathcal{F}^\infty$, and if all $\Omega_j$ coincide with a fixed set $S$, $\prod_{j=1}^\infty \Omega_j$ is denoted by $S^\infty$.

The infinite-dimensional version of the product measure theorem will be used only for probability measures, and is therefore stated in that context. (In fact the construction to be described below runs into trouble for nonprobability measures.)
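2.7.2 Theorem. Let $(\Omega_j, \mathcal{F}_j)$, $j = 1, 2, \ldots$, be arbitrary measurable spaces; let $\Omega = \prod_{j=1}^\infty \Omega_j$, $\mathcal{F} = \prod_{j=1}^\infty \mathcal{F}_j$. Suppose that we are given an arbitrary probability measure $P_1$ on $\mathcal{F}_1$, and for each $j = 1, 2, \ldots$ and each $(\omega_1, \ldots, \omega_j) \in \Omega_1 \times \cdots \times \Omega_j$ we are given a probability measure $P(\omega_1, \ldots, \omega_j, \cdot)$ on $\mathcal{F}_{j+1}$. Assume that $P(\omega_1, \ldots, \omega_j, C)$ is measurable: $(\prod_{i=1}^j \Omega_i, \prod_{i=1}^j \mathcal{F}_i) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ for each fixed $C \in \mathcal{F}_{j+1}$.

If $B^n \in \prod_{j=1}^n \mathcal{F}_j$, define
$$P_n(B^n) = \int_{\Omega_1} P_1(d\omega_1) \int_{\Omega_2} P(\omega_1, d\omega_2) \cdots \int_{\Omega_{n-1}} P(\omega_1, \ldots, \omega_{n-2}, d\omega_{n-1}) \int_{\Omega_n} I_{B^n}(\omega_1, \ldots, \omega_n)\, P(\omega_1, \ldots, \omega_{n-1}, d\omega_n).$$
Note that $P_n$ is a probability measure on $\prod_{j=1}^n \mathcal{F}_j$ by 2.6.7 and 2.6.8(a). There is a unique probability measure $P$ on $\mathcal{F}$ such that for all $n$, $P$ agrees with $P_n$ on n-dimensional cylinders, that is, $P\{\omega \in \Omega : (\omega_1, \ldots, \omega_n) \in B^n\} = P_n(B^n)$ for all $n = 1, 2, \ldots$ and all $B^n \in \prod_{j=1}^n \mathcal{F}_j$.

PROOF. Any measurable cylinder can be represented in the form $B_n = \{\omega \in \Omega : (\omega_1, \ldots, \omega_n) \in B^n\}$ for some $n$ and some $B^n \in \prod_{j=1}^n \mathcal{F}_j$; define $P(B_n) = P_n(B^n)$. We must show that $P$ is well-defined on measurable cylinders. For suppose that $B_n$ can also be expressed as $\{\omega \in \Omega : (\omega_1, \ldots, \omega_m) \in C^m\}$ where $C^m \in \prod_{j=1}^m \mathcal{F}_j$; we must show that $P_n(B^n) = P_m(C^m)$. Say $m < n$; then $(\omega_1, \ldots, \omega_m) \in C^m$ iff $(\omega_1, \ldots, \omega_n) \in B^n$, hence $B^n = C^m \times \Omega_{m+1} \times \cdots \times \Omega_n$. It follows from the definition of $P_n$ that $P_n(B^n) = P_m(C^m)$. (The fact that the $P(\omega_1, \ldots, \omega_j, \cdot)$ are probability measures is used here.)

Since $P_n$ is a measure on $\prod_{j=1}^n \mathcal{F}_j$, it is immediate that $P$ is finitely additive on the field $\mathcal{F}_0$ of measurable cylinders. If we can show that $P$ is continuous from above at the empty set, 1.2.8(b) implies that $P$ is countably additive on $\mathcal{F}_0$, and the Caratheodory extension theorem extends $P$ to a probability measure on $\prod_{j=1}^\infty \mathcal{F}_j$; by construction, $P$ agrees with $P_n$ on n-dimensional cylinders.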
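Let $\{B_n, n = n_1, n_2, \ldots\}$ be a sequence of measurable cylinders decreasing to $\emptyset$ (we may assume $n_1 < n_2 < \cdots$, and in fact we may take $n_i = i$ for all $i$). Assume $\lim_{n\to\infty} P(B_n) > 0$. Then for each $n > 1$,
$$P(B_n) = \int_{\Omega_1} g_n^{(1)}(\omega_1)\, P_1(d\omega_1),$$
where
$$g_n^{(1)}(\omega_1) = \int_{\Omega_2} P(\omega_1, d\omega_2) \cdots \int_{\Omega_n} I_{B^n}(\omega_1, \ldots, \omega_n)\, P(\omega_1, \ldots, \omega_{n-1}, d\omega_n).$$
Since $B_{n+1} \subset B_n$, it follows that $B^{n+1} \subset B^n \times \Omega_{n+1}$; hence
$$I_{B^{n+1}}(\omega_1, \ldots, \omega_{n+1}) \le I_{B^n}(\omega_1, \ldots, \omega_n).$$
Therefore $g_n^{(1)}(\omega_1)$ decreases as $n$ increases ($\omega_1$ fixed); say $g_n^{(1)}(\omega_1) \to h_1(\omega_1)$. By the extended monotone convergence theorem (or the dominated convergence theorem), $P(B_n) \to \int_{\Omega_1} h_1(\omega_1)\, P_1(d\omega_1)$. If $\lim_{n\to\infty} P(B_n) > 0$, then $h_1(\omega_1') > 0$ for some $\omega_1' \in \Omega_1$. In fact $\omega_1' \in B^1$, for if not, $I_{B^n}(\omega_1', \omega_2, \ldots, \omega_n) = 0$ for all $n$; hence $g_n^{(1)}(\omega_1') = 0$ for all $n$, and $h_1(\omega_1') = 0$, a contradiction.

Now for each $n > 2$,
$$g_n^{(1)}(\omega_1') = \int_{\Omega_2} g_n^{(2)}(\omega_2)\, P(\omega_1', d\omega_2),$$
where
$$g_n^{(2)}(\omega_2) = \int_{\Omega_3} P(\omega_1', \omega_2, d\omega_3) \cdots \int_{\Omega_n} I_{B^n}(\omega_1', \omega_2, \ldots, \omega_n)\, P(\omega_1', \omega_2, \ldots, \omega_{n-1}, d\omega_n).$$
As above, $g_n^{(2)}(\omega_2) \downarrow h_2(\omega_2)$; hence
$$g_n^{(1)}(\omega_1') \to \int_{\Omega_2} h_2(\omega_2)\, P(\omega_1', d\omega_2).$$
Since $g_n^{(1)}(\omega_1') \to h_1(\omega_1') > 0$, we have $h_2(\omega_2') > 0$ for some $\omega_2' \in \Omega_2$, and as above we have $(\omega_1', \omega_2') \in B^2$.

The process may be repeated inductively to obtain points $\omega_1', \omega_2', \ldots$ such that for each $n$, $(\omega_1', \ldots, \omega_n') \in B^n$. But then $(\omega_1', \omega_2', \ldots) \in \bigcap_{n=1}^\infty B_n = \emptyset$, a contradiction. This proves the existence of the desired probability measure $P$. If $Q$ is another such probability measure, then $P = Q$ on measurable cylinders, hence $P = Q$ on $\mathcal{F}$ by the uniqueness part of the Caratheodory extension theorem. ∎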
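The consistency step in the proof (a cylinder has the same probability whether its base is taken in dimension $n$ or lifted to dimension $n + 1$) can be checked exactly on a finite toy model. The sketch below is only an illustration under stated assumptions: every $\Omega_j$ is the two-point set $\{0, 1\}$, and `P1` and `kernel` are hypothetical choices of the initial measure and of the transition measures $P(\omega_1, \ldots, \omega_j, \cdot)$.

```python
from fractions import Fraction
from itertools import product

STATES = (0, 1)  # assumption: every Omega_j is the same two-point set
P1 = {0: Fraction(1, 3), 1: Fraction(2, 3)}  # hypothetical initial measure P_1


def kernel(history):
    """Hypothetical P(w_1, ..., w_j, .): depends on how many 1s occurred so far."""
    p_one = Fraction(sum(history) + 1, len(history) + 2)
    return {0: 1 - p_one, 1: p_one}


def path_prob(path):
    """Probability of one finite path, built stage by stage as in the iterated integral."""
    prob = P1[path[0]]
    for j in range(1, len(path)):
        prob *= kernel(path[:j])[path[j]]
    return prob


def P_n(base):
    """P_n(B^n) for a base B^n given as a set of equal-length tuples."""
    return sum((path_prob(p) for p in base), Fraction(0))


def lift(base):
    """The higher dimensional base B^n x Omega_{n+1} of the same cylinder."""
    return {p + (s,) for p in base for s in STATES}


if __name__ == "__main__":
    base = {(0, 1), (1, 1)}
    print(P_n(base), P_n(lift(base)))             # equal, so P is well defined
    print(P_n(set(product(STATES, repeat=3))))    # total mass of P_3
```

The equality holds because each `kernel(history)` has total mass 1, which is exactly where the proof uses the fact that the transition measures are probability measures.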
The classical product measure theorem extends as follows:

2.7.3 Corollary. For each $j = 1, 2, \ldots$, let $(\Omega_j, \mathcal{F}_j, P_j)$ be an arbitrary probability space. Let $\Omega = \prod_{j=1}^\infty \Omega_j$, $\mathcal{F} = \prod_{j=1}^\infty \mathcal{F}_j$. There is a unique probability measure $P$ on $\mathcal{F}$ such that
$$P\{\omega \in \Omega : \omega_1 \in A_1, \ldots, \omega_n \in A_n\} = \prod_{j=1}^n P_j(A_j)$$
for all $n = 1, 2, \ldots$ and all $A_j \in \mathcal{F}_j$, $j = 1, 2, \ldots$. We call $P$ the product of the $P_j$, and write $P = \prod_{j=1}^\infty P_j$.

PROOF. In 2.7.2, take $P(\omega_1, \ldots, \omega_j, B) = P_{j+1}(B)$, $B \in \mathcal{F}_{j+1}$. Then $P_n(A_1 \times \cdots \times A_n) = \prod_{j=1}^n P_j(A_j)$, and thus the probability measure $P$ of 2.7.2 has the desired properties. If $Q$ is another such probability measure, then $P = Q$ on the field of finite disjoint unions of measurable rectangles; hence $P = Q$ on $\mathcal{F}$ by the Caratheodory extension theorem. ∎
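Problems
1. Show that $\prod_{j=1}^\infty \mathcal{F}_j$ is the minimal $\sigma$-field over the measurable rectangles.
2. Let $\mathcal{F} = \mathcal{B}(\mathbb{R})$; show that the following sets belong to $\mathcal{F}^\infty$:
(a) $\{x \in \mathbb{R}^\infty : \sup_n x_n < a\}$,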
(b) {xeR°°:Yn IIxxl = ilx112+ IIY112+<x,y>+ = IIx112+ IIY112+2Re<x,y> I = I <x,, , Y. - Y> + <x. - x, Y> I -
<x, y>. I
The computation of 3.2.2 establishes the following result, which says geometrically that the sum of the squares of the lengths of the diagonals of a parallelogram is twice the sum of the squares of the lengths of the sides:
3 INTRODUCTION TO FUNCTIONAL ANALYSIS

3.2.4 Parallelogram Law. In an inner product space, $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$.

PROOF. $\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\,\mathrm{Re}\langle x, y\rangle$, and $\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\,\mathrm{Re}\langle x, y\rangle$. ∎

Now suppose that $x_1, \ldots, x_n$ are mutually perpendicular unit vectors in $\mathbb{R}^k$, $k \ge n$. If $x$ is an arbitrary vector in $\mathbb{R}^k$, we try to approximate $x$ by a linear combination $\sum_{j=1}^n a_j x_j$. The reader may recall that $\sum_{j=1}^n a_j x_j$ will be closest to $x$ in the sense of Euclidean distance when $a_j = \langle x, x_j\rangle$. This result holds in an arbitrary inner product space.
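3.2.5 Definition. Two elements $x$ and $y$ in an inner product space $L$ are said to be orthogonal or perpendicular iff $\langle x, y\rangle = 0$. If $B \subset L$, $B$ is said to be orthogonal iff $\langle x, y\rangle = 0$ for all $x, y \in B$ such that $x \ne y$; $B$ is orthonormal iff it is orthogonal and $\|x\| = 1$ for all $x \in B$. The computation of 3.2.2 shows that if $x_1, x_2, \ldots, x_n$ are orthogonal, the Pythagorean relation holds: $\|\sum_{i=1}^n x_i\|^2 = \sum_{i=1}^n \|x_i\|^2$.

3.2.6 Theorem. If $\{x_1, \ldots, x_n\}$ is an orthonormal set in the inner product space $L$, and $x \in L$, then
$$\Big\|x - \sum_{j=1}^n a_j x_j\Big\| \quad \text{is minimized when } a_j = \langle x, x_j\rangle,\ j = 1, \ldots, n.$$

PROOF.
$$\Big\|x - \sum_{j=1}^n a_j x_j\Big\|^2 = \Big\langle x - \sum_{j=1}^n a_j x_j,\ x - \sum_{k=1}^n a_k x_k\Big\rangle = \|x\|^2 - \sum_{k=1}^n \bar{a}_k \langle x, x_k\rangle - \sum_{j=1}^n a_j \langle x_j, x\rangle + \Big\langle \sum_{j=1}^n a_j x_j,\ \sum_{k=1}^n a_k x_k\Big\rangle.$$

3.2 BASIC PROPERTIES OF HILBERT SPACES

The last term on the right is $\sum_{j=1}^n |a_j|^2$ since the $x_j$ are orthonormal. Furthermore, $-\bar{a}_j \langle x, x_j\rangle - a_j \langle x_j, x\rangle + |a_j|^2 = -|\langle x, x_j\rangle|^2 + |a_j - \langle x, x_j\rangle|^2$. Thus
$$0 \le \Big\|x - \sum_{j=1}^n a_j x_j\Big\|^2 = \|x\|^2 - \sum_{j=1}^n |\langle x, x_j\rangle|^2 + \sum_{j=1}^n |a_j - \langle x, x_j\rangle|^2, \qquad (1)$$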
so that we can do no better than to take $a_j = \langle x, x_j\rangle$. ∎

The above computation establishes the following important inequality.

3.2.7 Bessel's Inequality. If $B$ is an arbitrary orthonormal subset of the inner product space $L$ and $x$ is an element of $L$, then
$$\|x\|^2 \ge \sum_{y \in B} |\langle x, y\rangle|^2.$$
In other words, $\langle x, y\rangle = 0$ for all but countably many $y \in B$, say $y = x_1, x_2, \ldots$, and
$$\|x\|^2 \ge \sum_i |\langle x, x_i\rangle|^2.$$
Equality holds iff $\sum_{i=1}^n \langle x, x_i\rangle x_i \to x$ as $n \to \infty$.

PROOF. If $x_1, \ldots, x_n \in B$, set $a_j = \langle x, x_j\rangle$ in Eq. (1) of 3.2.6 to obtain
$$\Big\|x - \sum_{j=1}^n \langle x, x_j\rangle x_j\Big\|^2 = \|x\|^2 - \sum_{j=1}^n |\langle x, x_j\rangle|^2 \ge 0. \ ∎$$
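Equation (1) of 3.2.6 and Bessel's inequality are easy to check numerically in $\mathbb{C}^3$. The following sketch is illustrative only: the orthonormal pair `E`, the vector `x`, and the helper names are assumptions, and the inner product is taken linear in the first argument and conjugate linear in the second, as in the computations above.

```python
import math


def inner(x, y):
    """<x, y> = sum of x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    return inner(x, x).real


def combo(coeffs, vectors):
    """The linear combination sum_j a_j x_j, computed coordinate by coordinate."""
    dim = len(vectors[0])
    return [sum(a * v[i] for a, v in zip(coeffs, vectors)) for i in range(dim)]


def residual_sq(x, coeffs, vectors):
    """||x - sum_j a_j x_j||^2."""
    approx = combo(coeffs, vectors)
    return norm_sq([xi - ai for xi, ai in zip(x, approx)])


s = 1 / math.sqrt(2)
E = [[1, 0, 0], [0, s, s * 1j]]          # hypothetical orthonormal set in C^3
x = [1 + 2j, 3, -1j]                     # hypothetical vector to approximate
fourier = [inner(x, e) for e in E]       # the coefficients <x, x_j>

if __name__ == "__main__":
    print(residual_sq(x, fourier, E))                    # minimal squared distance
    print(residual_sq(x, [0.5, 1j], E))                  # any other choice is worse
    print(norm_sq(x), sum(abs(c) ** 2 for c in fourier)) # Bessel: first >= second
```

The gap between the two printed residuals is the term $\sum_j |a_j - \langle x, x_j\rangle|^2$ of Eq. (1).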
We now consider another basic geometric idea, that of projection. If $M$ is a subspace of $\mathbb{R}^n$ and $x$ is any vector in $\mathbb{R}^n$, $x$ can be resolved into a component in $M$ and a component perpendicular to $M$. In other words, $x = y + z$ where $y \in M$ and $z$ is orthogonal to every vector in $M$. Before generalizing to an arbitrary space, we indicate some terminology.

3.2.8 Definitions. A subspace or linear manifold of a vector space $L$ is a subset $M$ of $L$ that is also a vector space; that is, $M$ is closed under addition and scalar multiplication. If $L$ is a topological vector space, $M$ is said to be a closed subspace of $L$ if $M$ is a subspace and is also a closed set in the topology of $L$. A subset $M$ of the vector space $L$ is said to be convex iff for all $x, y \in M$, we have $ax + (1 - a)y \in M$ for all real $a \in [0, 1]$.

The key fact that we need is that if $M$ is a closed convex subset of the Hilbert space $H$ and $x$ is an arbitrary point of $H$, there is a unique point of $M$ closest to $x$.
3.2.9 Theorem. Let $M$ be a nonempty closed convex subset of the Hilbert space $H$. If $x \in H$, there is a unique element $y_0 \in M$ such that
$$\|x - y_0\| = \inf\{\|x - y\| : y \in M\}.$$

PROOF. Let $d = \inf\{\|x - y\| : y \in M\}$, and pick points $y_1, y_2, \ldots \in M$ with $\|x - y_n\| \to d$ as $n \to \infty$; we show that $\{y_n\}$ is a Cauchy sequence. Since $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$ for all $u, v \in H$ by the parallelogram law 3.2.4, we may set $u = y_n - x$, $v = y_m - x$ to obtain
$$\|y_n + y_m - 2x\|^2 + \|y_n - y_m\|^2 = 2\|y_n - x\|^2 + 2\|y_m - x\|^2$$
or
$$\|y_n - y_m\|^2 = 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\|\tfrac{1}{2}(y_n + y_m) - x\|^2.$$
Since $\tfrac{1}{2}(y_n + y_m) \in M$ by convexity, $\|\tfrac{1}{2}(y_n + y_m) - x\|^2 \ge d^2$, and it follows that $\|y_n - y_m\| \to 0$ as $n, m \to \infty$.

By completeness of $H$, $y_n$ approaches a limit $y_0$, hence $\|x - y_n\| \to \|x - y_0\|$. But then $\|x - y_0\| = d$, and $y_0 \in M$ since $M$ is closed; this finishes the existence part of the proof. To prove uniqueness, let $y_0, z_0 \in M$, with $\|x - y_0\| = \|x - z_0\| = d$. In the parallelogram law, take $u = y_0 - x$, $v = z_0 - x$, to obtain
$$\|y_0 + z_0 - 2x\|^2 + \|y_0 - z_0\|^2 = 2\|y_0 - x\|^2 + 2\|z_0 - x\|^2 = 4d^2.$$
But $\|y_0 + z_0 - 2x\|^2 = 4\|\tfrac{1}{2}(y_0 + z_0) - x\|^2 \ge 4d^2$; hence $\|y_0 - z_0\| = 0$, so $y_0 = z_0$. ∎

If $M$ is a closed subspace of $H$, the element $y_0$ found in 3.2.9 is called the projection of $x$ on $M$. The following result helps to justify this terminology.
3.2.10 Theorem. Let $M$ be a closed subspace of the Hilbert space $H$, and $y_0$ an element of $M$. Then
$$\|x - y_0\| = \inf\{\|x - y\| : y \in M\} \quad \text{iff} \quad x - y_0 \perp M,$$
that is, $\langle x - y_0, y\rangle = 0$ for all $y \in M$.

PROOF. Assume $x - y_0 \perp M$. If $y \in M$, then
$$\|x - y\|^2 = \|x - y_0 - (y - y_0)\|^2 = \|x - y_0\|^2 + \|y - y_0\|^2 - 2\,\mathrm{Re}\langle x - y_0, y - y_0\rangle = \|x - y_0\|^2 + \|y - y_0\|^2 \ge \|x - y_0\|^2,$$
where the cross term vanishes since $y - y_0 \in M$. Therefore, $\|x - y_0\| = \inf\{\|x - y\| : y \in M\}$.

Conversely, assume $\|x - y_0\| = \inf\{\|x - y\| : y \in M\}$. Let $y \in M$ and let $c$ be an arbitrary complex number. Then $y_0 + cy \in M$ since $M$ is a subspace, hence $\|x - y_0 - cy\| \ge \|x - y_0\|$. But $\|x - y_0 - cy\|^2 = \|x - y_0\|^2 + |c|^2\|y\|^2 - 2\,\mathrm{Re}\langle x - y_0, cy\rangle$; hence
$$|c|^2\|y\|^2 - 2\,\mathrm{Re}\langle x - y_0, cy\rangle \ge 0.$$
Take $c = b\langle x - y_0, y\rangle$, $b$ real. Then $\langle x - y_0, cy\rangle = b|\langle x - y_0, y\rangle|^2$. Thus $|\langle x - y_0, y\rangle|^2 [b^2\|y\|^2 - 2b] \ge 0$. But the expression in square brackets is negative if $b$ is positive and sufficiently close to 0; hence $\langle x - y_0, y\rangle = 0$. ∎
We may give still another way of characterizing the projection of x on M.
3.2.11 Projection Theorem. Let $M$ be a closed subspace of the Hilbert space $H$. If $x \in H$, then $x$ has a unique representation $x = y + z$ where $y \in M$ and $z \perp M$. Furthermore, $y$ is the projection of $x$ on $M$.

PROOF. Let $y_0$ be the projection of $x$ on $M$, and take $y = y_0$, $z = x - y_0$. By 3.2.10, $z \perp M$, proving the existence of the desired representation. To prove uniqueness, let $x = y + z = y' + z'$ where $y, y' \in M$, $z, z' \perp M$. Then $y - y' \in M$ since $M$ is a subspace, and $y - y' \perp M$ since $y - y' = z' - z$. Thus $y - y'$ is orthogonal to itself, hence $y = y'$. But then $z = z'$, proving uniqueness. ∎

If $M$ is any subset of $H$, the set $M^\perp = \{x \in H : x \perp M\}$ is a closed subspace by definition of the inner product and 3.2.3. If $M$ is a closed subspace, $M^\perp$ is called the orthogonal complement of $M$, and the projection theorem is expressed by saying that $H$ is the orthogonal direct sum of $M$ and $M^\perp$, written $H = M \oplus M^\perp$.

In $\mathbb{R}^n$, it is possible to construct an orthonormal basis, that is, a set $\{x_1, \ldots, x_n\}$ of $n$ mutually perpendicular unit vectors. Any vector $x$ in $\mathbb{R}^n$ may then be represented as $x = \sum_{i=1}^n \langle x, x_i\rangle x_i$, so that $\langle x, x_i\rangle$ is the component of $x$ in the direction of $x_i$. We are now able to generalize this idea to an arbitrary Hilbert space. The following terminology will be used.
3.2.12 Definitions. If $B$ is a subset of the topological vector space $L$, the space spanned by $B$, denoted by $S(B)$, is the smallest closed subspace of $L$ containing all elements of $B$. If $L(B)$ is the linear manifold generated by $B$, that is, $L(B)$ consists of all elements $\sum_{i=1}^n a_i x_i$, $a_i \in \mathbb{C}$, $x_i \in B$, $i = 1, \ldots, n$, $n = 1, 2, \ldots$, then $S(B) = \overline{L(B)}$.

If $B$ is a subset of the Hilbert space $H$, $B$ is said to be an orthonormal basis for $H$ iff $B$ is a maximal orthonormal subset of $H$, in other words, $B$ is not a proper subset of any other orthonormal subset of $H$. An orthonormal set $B \subset H$ is maximal iff $S(B) = H$, and there are several other conditions equivalent to this, as we now prove.

3.2.13 Theorem. Let $B = \{x_\alpha, \alpha \in I\}$ be an orthonormal subset of the Hilbert space $H$. The following conditions are equivalent:

(a) $B$ is an orthonormal basis.
(b) $B$ is a "complete orthonormal set," that is, the only $x \in H$ such that $x \perp B$ is $x = 0$.
(c) $B$ spans $H$, that is, $S(B) = H$.
(d) For all $x \in H$, $x = \sum_\alpha \langle x, x_\alpha\rangle x_\alpha$. (Let us explain this notation. By 3.2.7, $\langle x, x_\alpha\rangle = 0$ for all but countably many $x_\alpha$, say for $x_1, x_2, \ldots$; the assertion is that $\sum_{j=1}^n \langle x, x_j\rangle x_j \to x$, and this holds regardless of the order in which the $x_j$ are listed.)
(e) For all $x, y \in H$, $\langle x, y\rangle = \sum_\alpha \langle x, x_\alpha\rangle \langle x_\alpha, y\rangle$.
(f) For all $x \in H$, $\|x\|^2 = \sum_\alpha |\langle x, x_\alpha\rangle|^2$.

Condition (f) [and sometimes (e) as well] is referred to as the Parseval relation.

PROOF. (a) implies (b): If $x \perp B$, $x \ne 0$, let $y = x/\|x\|$. Then $B \cup \{y\}$ is an orthonormal set, contradicting the maximality of $B$.

(b) implies (c): If $x \in H$, write $x = y + z$ where $y \in S(B)$ and $z \perp S(B)$ (see 3.2.11). By (b), $z = 0$; hence $x \in S(B)$.

(c) implies (d): Since $S(B) = \overline{L(B)}$, given $x \in H$ and $\varepsilon > 0$ there is a finite set $F \subset I$ and complex numbers $a_\alpha$, $\alpha \in F$, such that
$$\Big\|x - \sum_{\alpha \in F} a_\alpha x_\alpha\Big\| \le \varepsilon.$$
By 3.2.6, if $G$ is any finite subset of $I$ such that $F \subset G$,
$$\Big\|x - \sum_{\alpha \in G} \langle x, x_\alpha\rangle x_\alpha\Big\| \le \Big\|x - \sum_{\alpha \in F} a_\alpha x_\alpha\Big\| \le \varepsilon$$
(set $a_\alpha = 0$ for $\alpha \in G - F$). Thus if $x_1, x_2, \ldots$ is any ordering of the points $x_\alpha \in B$ for which $\langle x, x_\alpha\rangle \ne 0$, $\|x - \sum_{i=1}^n \langle x, x_i\rangle x_i\| \le \varepsilon$ for sufficiently large $n$, as desired.

(d) implies (e): This is immediate from 3.2.3.

(e) implies (f): Set $x = y$ in (e).

(f) implies (a): Let $C$ be an orthonormal set with $B \subset C$, $B \ne C$. If $x \in C$, $x \notin B$, we have $\|x\|^2 = \sum_\alpha |\langle x, x_\alpha\rangle|^2 = 0$ since by orthonormality of $C$, $x$ is orthogonal to everything in $B$. This is a contradiction because $\|x\| = 1$ for all $x \in C$. ∎
3.2.14 Corollary. Let $B = \{x_\alpha, \alpha \in I\}$ be an orthonormal subset of $H$, not necessarily a basis.
(a) $B$ is an orthonormal basis for $S(B)$. [Note that $S(B)$ is a closed subspace of $H$, hence is itself a Hilbert space with the same inner product.]
(b) If $x \in H$ and $y$ is the projection of $x$ on $S(B)$, then
$$y = \sum_\alpha \langle x, x_\alpha\rangle x_\alpha$$
[see 3.2.13(d) for the interpretation of the series].

PROOF. (a) Let $x \in S(B)$, $x \perp B$; then $x \perp L(B)$. If $y \in S(B)$, let $y_1, y_2, \ldots \in L(B)$ with $y_n \to y$. Since $\langle x, y_n\rangle = 0$ for all $n$, we have $\langle x, y\rangle = 0$ by 3.2.3. Thus $x \perp S(B)$, so that $\langle x, x\rangle = 0$, hence $x = 0$. The result follows from 3.2.13(b).

(b) By part (a) and 3.2.13(d), $y = \sum_\alpha \langle y, x_\alpha\rangle x_\alpha$. But $x - y \perp S(B)$ by 3.2.11, hence $\langle x, x_\alpha\rangle = \langle y, x_\alpha\rangle$ for all $\alpha$. ∎
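The formula of 3.2.14(b) gives a direct way to compute a projection when an orthonormal set spanning the subspace is available. The sketch below works in $\mathbb{R}^3$ with a hypothetical orthonormal pair `E` and vector `x` (both chosen only for illustration) and checks the decomposition $x = y + z$ of the projection theorem.

```python
import math


def inner(x, y):
    """<x, y> = sum of x_k * conj(y_k); conjugation is harmless for real entries."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def project(x, orthonormal):
    """Projection of x on the span of an orthonormal list: sum of <x, e> e."""
    y = [0.0] * len(x)
    for e in orthonormal:
        c = inner(x, e)
        y = [yi + c * ei for yi, ei in zip(y, e)]
    return y


r = 1 / math.sqrt(2)
E = [[r, r, 0.0], [0.0, 0.0, 1.0]]   # hypothetical orthonormal set spanning a plane M
x = [3.0, 1.0, 2.0]
y = project(x, E)                     # component in M
z = [xi - yi for xi, yi in zip(x, y)] # component orthogonal to M

if __name__ == "__main__":
    print(y, z)
    print([inner(z, e) for e in E])               # both should be (numerically) zero
    print(inner(x, x), inner(y, y) + inner(z, z)) # Pythagorean relation
```

Note that `project` relies on the list being orthonormal; for a general spanning set one would first orthonormalize it.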
A standard application of Zorn's lemma shows that every Hilbert space has an orthonormal basis; an additional argument shows that any two orthonormal bases have the same cardinality (see Problem 5). This fact may be used to classify all possible Hilbert spaces, as follows:

3.2.15 Theorem. Let $S$ be an arbitrary set, and let $H$ be a Hilbert space with an orthonormal basis $B$ having the same cardinality as $S$. Then there is an isometric isomorphism (a one-to-one-onto, linear, norm-preserving map) between $H$ and $l^2(S)$.

PROOF. We may write $B = \{x_\alpha, \alpha \in S\}$. If $x \in H$, 3.2.13(d) then gives $x = \sum_\alpha \langle x, x_\alpha\rangle x_\alpha$, where $\sum_\alpha |\langle x, x_\alpha\rangle|^2 = \|x\|^2 < \infty$ by 3.2.13(f). The map $x \to (\langle x, x_\alpha\rangle, \alpha \in S)$ of $H$ into $l^2(S)$ is therefore norm-preserving; since it is also linear, it must be one-to-one. To show that the map is onto, consider any collection of complex numbers $a_\alpha$, $\alpha \in S$, with $\sum_\alpha |a_\alpha|^2 < \infty$. Say $a_\alpha = 0$ except for $\alpha = \alpha_1, \alpha_2, \ldots$, and let $x = \sum_j a_{\alpha_j} x_{\alpha_j}$. [The series converges to an element of $H$ because of the following fact, which occurs often enough to be stated separately: If $\{y_1, y_2, \ldots\}$ is an orthonormal subset of $H$, the series $\sum_j c_j y_j$ converges to some element of $H$ iff $\sum_j |c_j|^2 < \infty$. To see this, observe that $\|\sum_{j=n}^m c_j y_j\|^2 = \sum_{j=n}^m |c_j|^2 \|y_j\|^2 = \sum_{j=n}^m |c_j|^2$; thus the partial sums form a Cauchy sequence iff $\sum_j |c_j|^2 < \infty$.] Since the $x_\alpha$ are orthonormal, it follows that $\langle x, x_\alpha\rangle = a_\alpha$ for all $\alpha$, so that $x$ maps onto $(a_\alpha, \alpha \in S)$. ∎

We may also characterize Hilbert spaces that are separable, that is, have a countable dense set.

3.2.16 Theorem. A Hilbert space $H$ is separable iff it has a countable orthonormal basis. If the orthonormal basis has $n$ elements, $H$ is isometrically isomorphic to $\mathbb{C}^n$; if the orthonormal basis is infinite, $H$ is isometrically isomorphic to $l^2$, that is, $l^2(S)$ with $S = \{1, 2, \ldots\}$.

PROOF. Let $B$ be an orthonormal basis for $H$. Now $\|x - y\|^2 = \|x\|^2 + \|y\|^2 = 2$ for all $x, y \in B$, $x \ne y$, hence the balls $A_x = \{y : \|y - x\| < \tfrac{1}{2}\}$, $x \in B$, are disjoint. If $D$ is dense in $H$, $D$ must contain a point in each $A_x$, so that if $B$ is uncountable, $D$ must be also, and therefore $H$ cannot be separable.

Now assume $B$ is a countable set $\{x_1, x_2, \ldots\}$. If $U$ is a nonempty open subset of $H$ $[= S(B) = \overline{L(B)}]$, $U$ contains an element of the form $\sum_{j=1}^n a_j x_j$ with the $a_j \in \mathbb{C}$; in fact the $a_j$ may be assumed to be rational, in other words, to have rational numbers as real and imaginary parts. Thus
$$D = \Big\{\sum_{j=1}^n a_j x_j : n = 1, 2, \ldots,\ \text{the } a_j \text{ rational}\Big\}$$
is a countable dense set, so that $H$ is separable. The remaining statements of the theorem follow from 3.2.15. ∎
A linear norm-preserving map from one Hilbert space to another automatically preserves inner products; this is a consequence of the following proposition:

3.2.17 Polarization Identity. In any inner product space,
$$4\langle x, y\rangle = \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2.$$

PROOF.
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\,\mathrm{Re}\langle x, y\rangle$$
$$\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\,\mathrm{Re}\langle x, y\rangle$$
$$\|x + iy\|^2 = \|x\|^2 + \|y\|^2 + 2\,\mathrm{Re}\langle x, iy\rangle$$
$$\|x - iy\|^2 = \|x\|^2 + \|y\|^2 - 2\,\mathrm{Re}\langle x, iy\rangle$$
But $\mathrm{Re}\langle x, iy\rangle = \mathrm{Re}[-i\langle x, y\rangle] = \mathrm{Im}\langle x, y\rangle$, and the result follows. ∎
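Both the polarization identity and the parallelogram law 3.2.4 can be sanity-checked on small complex vectors. This is a minimal sketch, assuming the inner product on $\mathbb{C}^n$ that is linear in the first argument and conjugate linear in the second; the function names and the sample vectors are illustrative.

```python
def inner(x, y):
    """<x, y> = sum of x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    return inner(x, x).real


def add(x, y, scale=1):
    """Return x + scale * y, coordinate by coordinate."""
    return [a + scale * b for a, b in zip(x, y)]


def polarization(x, y):
    """Recover <x, y> from norms alone, following 3.2.17."""
    return (norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
            + 1j * norm_sq(add(x, y, 1j)) - 1j * norm_sq(add(x, y, -1j))) / 4


def parallelogram_gap(x, y):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2); zero in an inner product space."""
    return norm_sq(add(x, y)) + norm_sq(add(x, y, -1)) - 2 * (norm_sq(x) + norm_sq(y))


if __name__ == "__main__":
    x, y = [1 + 1j, 2], [3, -1j]
    print(inner(x, y), polarization(x, y))
    print(parallelogram_gap(x, y))
```

Since `polarization` uses only norms, a norm-preserving linear map must leave its value, and hence the inner product, unchanged.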
Problems
1. In the Hilbert space $l^2(S)$, show that the elements $e_\alpha$, $\alpha \in S$, form an orthonormal basis, where $e_\alpha(s) = 0$ for $s \ne \alpha$ and $e_\alpha(s) = 1$ for $s = \alpha$.
2. (a) If $A$ is an arbitrary subset of the Hilbert space $H$, show that $A^{\perp\perp} = S(A)$.
(b) If $M$ is a linear manifold of $H$, show that $M$ is dense in $H$ iff $M^\perp = \{0\}$.
3. Let $x_1, \ldots, x_n$ be elements of a Hilbert space. Show that the $x_i$ are linearly dependent iff the Gramian (the determinant of the inner products $\langle x_i, x_j\rangle$, $i, j = 1, \ldots, n$) is 0.
4. (Gram-Schmidt process) Let $B = \{x_1, x_2, \ldots\}$ be a countable linearly independent subset of the Hilbert space $H$. Define $e_1 = x_1/\|x_1\|$; having chosen orthonormal elements $e_1, \ldots, e_n$, let $y_{n+1}$ be the projection of $x_{n+1}$ on the space spanned by $e_1, \ldots, e_n$:
$$y_{n+1} = \sum_{i=1}^n \langle x_{n+1}, e_i\rangle e_i.$$
Define
$$e_{n+1} = \frac{x_{n+1} - y_{n+1}}{\|x_{n+1} - y_{n+1}\|}.$$
(a) Show that $L\{e_1, \ldots, e_n\} = L\{x_1, \ldots, x_n\}$ for all $n$, hence $x_{n+1} \ne y_{n+1}$ and the process is well defined.
(b) Show that the $e_n$ form an orthonormal basis for $S(B)$.

Comments. Consider the space $H = L^2(-1, 1)$; if we take $x_n(t) = t^n$, $n = 0, 1, \ldots$, the Gram-Schmidt process yields the Legendre polynomials $e_n(t) = a_n\, d^n[(t^2 - 1)^n]/dt^n$, where $a_n$ is chosen so that $\|e_n\| = 1$. Similarly, if in $L^2(-\infty, \infty)$ we take $x_n(t) = t^n e^{-t^2/2}$, $n = 0, 1, \ldots$, we obtain the Hermite polynomials $e_n(t) = a_n (-1)^n e^{t^2}\, d^n(e^{-t^2})/dt^n$.
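The Gram-Schmidt process of Problem 4 can be tried out on the monomials in $L^2(-1, 1)$, where the inner product of two polynomials is computable in closed form. The following is a sketch under stated assumptions: polynomials are real and stored as coefficient lists, and the helper names `inner`, `axpy` and `gram_schmidt` are hypothetical.

```python
import math


def inner(p, q):
    """<p, q> = integral over (-1, 1) of p(t) q(t) dt for real polynomials.

    Polynomials are coefficient lists [c0, c1, ...], c_k multiplying t**k.
    The integral of t**m over (-1, 1) is 2/(m+1) for even m and 0 for odd m.
    """
    total = 0.0
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += a * b * 2.0 / (i + j + 1)
    return total


def axpy(a, p, q):
    """Return the polynomial a*p + q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0.0] * (n - len(p))
    q = list(q) + [0.0] * (n - len(q))
    return [a * pi + qi for pi, qi in zip(p, q)]


def gram_schmidt(xs):
    """Orthonormalize xs: subtract the projection on earlier e_i, then normalize."""
    es = []
    for x in xs:
        r = list(x)
        for e in es:
            r = axpy(-inner(x, e), e, r)   # r = x - sum of <x, e_i> e_i
        norm = math.sqrt(inner(r, r))      # nonzero when xs is linearly independent
        es.append([c / norm for c in r])
    return es


if __name__ == "__main__":
    e0, e1, e2 = gram_schmidt([[1.0], [0.0, 1.0], [0.0, 0.0, 1.0]])
    print(e0, e1, e2)   # e2 is proportional to 3 t**2 - 1
```

The third output is a multiple of $3t^2 - 1$, in agreement with $d^2[(t^2 - 1)^2]/dt^2 = 12t^2 - 4$ from the formula in the Comments.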
5. (a) Show that every Hilbert space has an orthonormal basis.
(b) Show that any two orthonormal bases have the same cardinality.
6. Let $U$ be an open subset of the complex plane, and let $H(U)$ be the collection of all functions $f$ analytic on $U$ such that
$$\|f\|^2 = \iint |f(x + iy)|^2\,dx\,dy$$
... for all $x$, then $y - z$ is orthogonal to everything in $H$ (including itself), so $y = z$.
3.3
131
LINEAR OPERATORS ON NORMED LINEAR SPACES
To prove existence, let $N$ be the null space of $f$, that is, $N = \{x \in H: f(x) = 0\}$. If $N^{\perp} = \{0\}$, then $N = H$ by the projection theorem; hence $f = 0$, and we may take $y = 0$. Thus assume we have an element $u \in N^{\perp}$ with $u \ne 0$. Then $u \notin N$, and if we define $z = u/f(u)$, we have $z \in N^{\perp}$ and $f(z) = 1$. If $x \in H$ and $f(x) = a$, then

$$x = (x - az) + az, \quad \text{with } x - az \in N, \quad az \perp N.$$

If $y = z/\|z\|^2$, then

$$\langle x, y\rangle = \langle x - az, y\rangle + a\langle z, y\rangle = a\langle z, y\rangle \quad (\text{since } y \perp N) \quad = a = f(x)$$

as desired.

The above argument shows that if $f$ is not identically 0, then $N^{\perp}$ is one-dimensional. For if $x \in N^{\perp}$ and $f(x) = a$, then $x - az \in N \cap N^{\perp}$, hence $x = az$. Therefore $N^{\perp} = \{az: a \in \mathbb{C}\}$. Notice also that if $\|f\|$ is the norm of $f$, considered as a linear operator, then $\|f\| = \|y\|$.
For $|f(x)| = |\langle x, y\rangle|$ for all $x$, then $af(x) = \langle x, \bar{a}y\rangle$.] Such a map is called a conjugate isometry.

(b) Let $f$ be a continuous linear functional on $l^p$ ($= l^p(S)$, where $S$ is the set of positive integers), $1 < p < \infty$. We show that if $q$ is defined by $(1/p) + (1/q) = 1$, there is a unique element $y = (y_1, y_2, \ldots) \in l^q$ such that

$$f(x) = \sum_{k=1}^{\infty} x_k y_k \quad \text{for all } x \in l^p.$$

Furthermore,

$$\|f\| = \|y\| = \Big(\sum_{k=1}^{\infty} |y_k|^q\Big)^{1/q}.$$
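To prove this, let $e_n$ be the sequence in $l^p$ defined by $e_n(j) = 0$, $j \ne n$; $e_n(n) = 1$. If $x \in l^p$, then $\|x - \sum_{k=1}^{n} x_k e_k\|^p = \sum_{k=n+1}^{\infty} |x_k|^p \to 0$; hence $x = \sum_{k=1}^{\infty} x_k e_k$ where the series converges in $l^p$. By continuity of $f$,

$$f(x) = \sum_{k=1}^{\infty} x_k y_k, \quad \text{where } y_k = f(e_k). \tag{1}$$

Now write $y_k$ in polar form, that is, $y_k = r_k e^{i\theta_k}$, $r_k \ge 0$. Let

$$z_n = (r_1^{q-1} e^{-i\theta_1}, \ldots, r_n^{q-1} e^{-i\theta_n}, 0, 0, \ldots);$$

by (1),

$$f(z_n) = \sum_{k=1}^{n} r_k^{q-1} e^{-i\theta_k} r_k e^{i\theta_k} = \sum_{k=1}^{n} |y_k|^q. \tag{2}$$

But

$$|f(z_n)| \le \|f\|\,\|z_n\| = \|f\| \Big(\sum_{k=1}^{n} |y_k|^{(q-1)p}\Big)^{1/p} = \|f\| \Big(\sum_{k=1}^{n} |y_k|^{q}\Big)^{1/p}.$$

By Eq. (2),

$$\Big(\sum_{k=1}^{n} |y_k|^q\Big)^{1/q} \le \|f\|;$$

hence $y \in l^q$ and $\|y\| \le \|f\|$.

... $\ge 0$ and $\|aA\| = |a|\,\|A\|$ for all $A \in [L, M]$ and $a \in \mathbb{C}$. Also by 3.3.1, if $\|A\| = 0$, then $Ax = 0$ for all $x \in L$, hence $A = 0$. If $A, B \in [L, M]$, then again by 3.3.1,

$$\|A + B\| = \sup\{\|(A + B)x\| : x \in L,\ \|x\| \le 1\}.$$

Since $\|(A + B)x\| \le \|Ax\| + \|Bx\|$ for all $x$,

$$\|A + B\| \le \|A\| + \|B\|$$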
and it follows that $[L, M]$ is a vector space and the operator norm is in fact a norm on $[L, M]$. Now let $A_1, A_2, \ldots$ be a Cauchy sequence in $[L, M]$. Then

$$\|(A_n - A_m)x\| \le \|A_n - A_m\|\,\|x\| \to 0 \quad \text{as } n, m \to \infty. \tag{1}$$

Therefore $\{A_n x\}$ is a Cauchy sequence in $M$ for each $x \in L$, hence $A_n$ converges pointwise on $L$ to an operator $A$. Since the $A_n$ are linear, so is $A$ (observe that $A_n(ax + by) = aA_n x + bA_n y$, and let $n \to \infty$). Now given $\varepsilon > 0$, choose $N$ such that $\|A_n - A_m\| < \varepsilon$ for $n, m \ge N$. Fix $n \ge N$ and let $m \to \infty$ in Eq. (1) to conclude that $\|(A_n - A)x\| \le \varepsilon\|x\|$ for $n \ge N$; therefore $\|A_n - A\| \to 0$ as $n \to \infty$. Since $\|A\| \le \|A - A_n\| + \|A_n\|$, we have $A \in [L, M]$ and $A_n \to A$ in the operator norm.
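To make the operator norm concrete, here is a small Python sketch for the special case $L = M = \mathbb{R}^n$ with the sup norm, where a linear operator is a matrix and its operator norm is the largest absolute row sum (a standard fact, used here as an assumption about this particular norm). The names `sup_norm`, `apply`, `op_norm`, and `add`, and the sample matrices, are hypothetical.

```python
def sup_norm(x):
    """Sup norm of a vector in R^n."""
    return max(abs(t) for t in x)


def apply(A, x):
    """Matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def op_norm(A):
    """Operator norm of A on (R^n, sup norm): the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def add(A, B):
    """Entrywise sum A + B."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1.0, -2.0], [0.5, 3.0]]
B = [[-1.0, 2.0], [4.0, 0.0]]
x = [0.3, -1.0]

# ||Ax|| <= ||A|| ||x||
assert sup_norm(apply(A, x)) <= op_norm(A) * sup_norm(x)
# Triangle inequality ||A + B|| <= ||A|| + ||B||
assert op_norm(add(A, B)) <= op_norm(A) + op_norm(B)
# The supremum over the unit ball is attained at a sign vector.
assert sup_norm(apply(A, [1.0, 1.0])) == op_norm(A)
```

In this finite-dimensional setting every linear operator is bounded, so the example illustrates the norm properties only, not the completeness argument.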
In the above proof we have talked about two different types of convergence of sequences of operators.

3.3.6 Definitions and Comments. Let $A, A_1, A_2, \ldots \in [L, M]$. We say that $A_n$ converges uniformly to $A$ if $\|A_n - A\| \to 0$ (notation: $A_n \xrightarrow{u} A$). Since $\|(A_n - A)x\| \le \|A_n - A\|\,\|x\|$, uniform operator convergence means that $A_n x \to Ax$, uniformly for $\|x\|$ ... $0$ such that $\|Ax\| \ge m\|x\|$ for all $x \in L$.
(b) Let $\|\cdot\|_1$ and $\|\cdot\|_2$ be norms on the linear space $L$. Show that the norms induce the same topology if there are finite numbers $m, M > 0$ such that

$$m\|x\|_1 \le \|x\|_2 \le M\|x\|_1 \quad \text{for all } x \in L.$$
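(This may be done using part (a), or it may be shown directly that, for example, if $\|x\|_1$ ...

7.

... $0$. Then

$$|f(x)| = r = f(e^{-i\theta}x) = f_1(e^{-i\theta}x) \quad \text{since } r \text{ is real}$$
$$\le p(e^{-i\theta}x) \quad \text{since } f_1 \le p \text{ on } L$$
$$= p(x) \quad \text{by absolute homogeneity.}$$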
3.4.4 Corollary. Let $g$ be a continuous linear functional on the subspace $M$ of the normed linear space $L$. There is an extension of $g$ to a continuous linear functional $f$ on $L$ such that $\|f\| = \|g\|$.

PROOF. Let $p(x) = \|g\|\,\|x\|$; then $p$ is a seminorm on $L$ and $|g|$ ... $\|x\|$, and consequently $\|h(x)\| = \|x\|$. ∎

If $h(L) = L^{**}$, $L$ is said to be reflexive. Note that $L^{**}$ is complete by 3.3.5 so by 3.4.6, a reflexive normed linear space is necessarily complete. We shall now consider some examples.

3.4.7 Examples. (a) Every Hilbert space is reflexive. For if $\psi$ is the conjugate isometry of 3.3.4(a), $H^*$ becomes a Hilbert space if we take ... $0$. Then $\mathcal{U}$ is a neighborhood base for a topology that makes $L$ a topological vector space. This topology is the weakest making all the $p_i$ continuous, and for a net $\{x_n\}$ in $L$, $x_n \to x$ iff $p_i(x_n - x) \to 0$ for each $i \in I$.
3.5
SOME PROPERTIES OF TOPOLOGICAL VECTOR SPACES
153
PROOF. We show that the conditions of 3.5.1 are satisfied. Condition (a) follows because an intersection of two finite intersections of sets $\{x: p_i(x) < \delta_i\}$ is a finite intersection of such sets. If $x \in L$, then $p_i(ax) = |a|\,p_i(x)$, which is less than $\delta_i$ if $|a|$ is sufficiently small; this proves (b). To prove (c), suppose that $U = \bigcap_{i=1}^{n}\{x: p_i(x) < \delta_i\}$, and let $V = \bigcap_{i=1}^{n}\{x: p_i(x) < \delta\}$ where $0 < \delta < \tfrac{1}{2}\min_i \delta_i$. If $y, z \in V$, then $p_i(y + z) \le p_i(y) + p_i(z) < \delta_i$, hence $y + z \in U$. Finally, each $U \in \mathcal{U}$ is circled since $p_i(ax) = |a|\,p_i(x)$, proving (d). If $x \in U = \bigcap_{i=1}^{n}\{z: p_i(z) < \delta_i\}$ and $y \in V = \bigcap_{i=1}^{n}\{z: p_i(z) < \min_j[\delta_j - p_j(x)]\}$, then $p_i(x + y)$ ... $0$, then $\{z: p_i(z - x) < \delta/2\}$ and $\{z: p_i(z - y) < \delta/2\}$ are disjoint neighborhoods of $x$ and $y$.

3.5.3 Examples. (a) Let $L$ be a vector space of complex-valued functions on a topological space $\Omega$. For each compact subset $K$ of $\Omega$, define $p_K(x) = \sup\{|x(t)| : t \in K\}$. In the (Hausdorff) topology induced by the seminorms $p_K$, convergence means uniform convergence on all compact subsets of $\Omega$. If we restrict the $K$ to finite subsets of $\Omega$, we obtain the topology of pointwise convergence. In general, if the $K$ are restricted to a class $\mathcal{C}$ of subsets of $\Omega$, we obtain the topology of uniform convergence on sets in $\mathcal{C}$.

(b) Let $L = C^{\infty}[a, b]$, the collection of all infinitely differentiable complex-valued functions on the closed bounded interval $[a, b] \subset \mathbb{R}$. For each $n$, let $p_n(x) = \sup\{|x^{(n)}(t)| : a \le t \le b\}$ where $x^{(n)}$ is the $n$th derivative of $x$. In the topology induced by the $p_n$, convergence means uniform convergence of all derivatives.

We now examine convex sets in more detail.
3.5.4 Definitions. Let $K$ be a subset of the vector space $L$. Then $K$ is said to be radial at $x$ if $K$ contains a line segment through $x$ in each direction, in other words, if $y \in L$, there is a $\delta > 0$ such that $x + \lambda y \in K$ for $0 \le \lambda < \delta$. (If $K$ is radial at $x$, $x$ is sometimes called an internal point of $K$.) If $K$ is convex and radial at 0, the Minkowski functional of $K$ is defined as

$$p(x) = \inf\{r > 0 : x/r \in K\}.$$

Intuitively, $p(x)$ is the factor by which $x$ has to be shrunk in order to reach the boundary of $K$. The Minkowski functional has the following properties.
3.5.5 Lemma. Let $K$ be a convex subset of $L$, radial at 0, and let $p$ be the Minkowski functional of $K$.

(a) The functional $p$ is sublinear, that is, subadditive and positive-homogeneous.
(b) $\{x \in L: p(x) < 1\}$ is the radial kernel of $K$, defined by $\operatorname{rad\,ker} K = \{x \in K: K \text{ is radial at } x\}$; also, $K \subset \{x \in L: p(x) \le 1\}$.
(c) If $K$ is circled, then $p$ is a seminorm.
(d) If $L$ is a topological vector space and 0 belongs to the interior $K^{\circ}$ of $K$, then $p$ is continuous, $\overline{K} = \{x \in L: p(x) \le 1\}$, and $K^{\circ} = \{x \in L: p(x) < 1\}$; hence $\{x \in L: p(x) = 1\}$ is the boundary of $K$.

PROOF. (a) If $x/r \in K$ and $y/s \in K$, then

$$\frac{x + y}{r + s} = \frac{r}{r + s}\,\frac{x}{r} + \frac{s}{r + s}\,\frac{y}{s} \in K$$

by convexity. Thus $p(x + y) \le r + s$; take the inf over $r$, then over $s$, to obtain $p(x + y) \le p(x) + p(y)$. ... $1$; hence $x \in K$ by convexity. [Write $x = (1/s)(sx) + [1 - (1/s)]0$.] Now if $p(x) < 1$ and $y \in L$, then $p(x + \lambda y) \le p(x) + \lambda p(y) < 1$ for $\lambda$ sufficiently small and nonnegative; hence $K$ is radial at $x$. Conversely, if $K$ is radial at $x$, then $x + \lambda x \in K$ for some $\lambda > 0$; hence $p(x + \lambda x) \le 1$ by definition of $p$. Thus $p(x) \le (1 + \lambda)^{-1} < 1$. The last statement follows from the definition of $p$.

(c) If $x/r \in K$ and $a \in \mathbb{C}$, $a \ne 0$, then $ax/|a|r \in K$ since $K$ is circled; consequently $p(ax) \le |a|r$. Take the inf over $r$ to obtain $p(ax) \le |a|\,p(x)$. Replace $x$ by $x/a$ to obtain $p(x) \le |a|\,p(x/a)$ or, with $b = 1/a$, $p(bx) \ge |b|\,p(x)$. Now $p(0) = 0$ since $0 \in K$, and the result follows.

(d) Since $0 \in K^{\circ}$ there is a neighborhood $U$ of 0 such that $U \subset K$. If $\lambda > 0$ and $y \in \lambda U$, that is, $y = \lambda x$ for some $x \in U$, then $p(y) = \lambda p(x) \le \lambda$.
[Note $x \in K$ implies $p(x) \le 1$, by (b).] Thus $p$ is continuous at the origin, and therefore continuous everywhere by subadditivity. Since $p$ is continuous, $\{x: p(x) \le 1\}$ is closed, so by (b), $\overline{K} \subset \{x: p(x) \le 1\}$. But if $0 < \lambda < 1$ and $p(x) \le 1$, then $p(\lambda x) < 1$; hence $\lambda x \in K$. If $\lambda \to 1$, then $\lambda x \to x$; therefore $p(x) \le 1$ implies $x \in \overline{K}$. Again, by continuity of $p$, $\{x: p(x) < 1\}$ is open, and hence is a subset of $K^{\circ}$. But if $p(x) \ge 1$, then by considering $\lambda x$ with $\lambda > 1$ we see that $x$ is a limit of a sequence of points not in $K$, hence $x \notin K^{\circ}$. ∎
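The definition $p(x) = \inf\{r > 0 : x/r \in K\}$ can be explored numerically. The sketch below is an illustration only: it assumes $K \subset \mathbb{R}^2$ is convex and radial at 0 and is given through a membership test, and it approximates $p$ by bisection. The names `minkowski` and `in_ellipse`, and the choice of an ellipse for $K$, are hypothetical.

```python
import math


def minkowski(in_K, x, tol=1e-10):
    """Approximate p(x) = inf{r > 0 : x/r in K} by bisection.

    Assumes K is convex and radial at 0, so x/r is in K for all large r
    and the set of admissible r is an interval unbounded above.
    """
    hi = 1.0
    while not in_K([t / hi for t in x]):
        hi *= 2.0  # terminates because K is radial at 0
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_K([t / mid for t in x]):
            hi = mid
        else:
            lo = mid
    return hi


def in_ellipse(v):
    """Membership test for K = {(a, b) : a^2/4 + b^2 <= 1}."""
    return v[0] ** 2 / 4 + v[1] ** 2 <= 1


x, y = [2.0, 0.0], [0.0, 3.0]
px, py = minkowski(in_ellipse, x), minkowski(in_ellipse, y)
pxy = minkowski(in_ellipse, [2.0, 3.0])

assert abs(px - 1.0) < 1e-8          # x lies on the boundary {p = 1}
assert abs(py - 3.0) < 1e-8          # y must be shrunk by a factor of 3
assert pxy <= px + py + 1e-8         # subadditivity, part (a)
assert abs(pxy - math.sqrt(1 + 9)) < 1e-8
```

For this circled convex set the functional is the norm $\sqrt{a^2/4 + b^2}$, consistent with part (c) of the lemma.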
Before characterizing locally convex spaces, we need the following result.

3.5.6 Lemma. If $U$ is a neighborhood of 0 in the topological vector space $L$, there is a circled neighborhood $V$ of 0 with $V \subset U$, and a closed circled overneighborhood $W$ of 0 with $W \subset U$. If $L$ is locally convex, $V$ and $W$ can be taken as convex.

PROOF. Choose a neighborhood $T$ of 0 and $\delta > 0$ such that $aT \subset U$ for $|a| < \delta$, and take $V = \bigcup\{aT: |a| < \delta\}$. Now if $A \subset L$, we claim that

$$\overline{A} = \bigcap\{A + N: N \in \mathcal{N}\}, \tag{1}$$

where $\mathcal{N}$ is the family of all neighborhoods of 0. [This may be written as $\overline{A} = \bigcap\{A - N: N \in \mathcal{N}\}$; note that $N \in \mathcal{N}$ iff $-N \in \mathcal{N}$ since the map $y \to -y$ is a homeomorphism.] For if $x \in \overline{A}$ and $N \in \mathcal{N}$, then $x + N$ is a neighborhood of $x$; hence $(x + N) \cap A \ne \emptyset$. If $y \in (x + N) \cap A$, then $x \in y - N \subset A - N$. If $x \notin \overline{A}$, then $(x + N) \cap A = \emptyset$ for some $N \in \mathcal{N}$; hence $x \notin A - N$.

Now if $U$ is a neighborhood of 0, let $V_1$ be a circled neighborhood of 0 with $V_1 + V_1 \subset U$. By (1), $\overline{V}_1 \subset V_1 + V_1$, and since $V_1$ is circled, so is $\overline{V}_1$. Thus we may take $W = \overline{V}_1$.

In the locally convex case, we may as well assume $U$ convex [the interior of a convex set is convex, by 3.5.5(d) and the fact that a translation of a convex set is convex]. If $V_2$ is a circled neighborhood included in $U$, the convex hull

$$\hat{V}_2 = \Big\{\sum_{i=1}^{n} \lambda_i x_i : x_i \in V_2,\ \lambda_i \ge 0,\ \sum_{i=1}^{n} \lambda_i = 1,\ n = 1, 2, \ldots\Big\},$$

the smallest convex overset of $V_2$, is also included in $U$. Since $V_2$ is circled so is $\hat{V}_2$, and therefore so is $(\hat{V}_2)^{\circ}$. (If $x + N \subset \hat{V}_2$, $N \in \mathcal{N}$, and $|a| \le 1$, then $ax + aN \subset \hat{V}_2$.) Thus we may take $V = (\hat{V}_2)^{\circ}$. Finally, if $V_3$ is a circled convex neighborhood of 0 with $V_3 + V_3 \subset U$, then $W = \overline{V}_3$ is closed, circled, and convex, and by (1), $W \subset V_3 + V_3 \subset U$. ∎
3.5.7 Theorem. If $L$ is a locally convex topological vector space, the topology of $L$ is generated, in the sense of 3.5.2, by a family of seminorms. Specifically, if $\mathcal{U}$ is the collection of all circled convex neighborhoods of 0, the Minkowski functionals $p_U$ of the sets $U \in \mathcal{U}$ are the desired seminorms.

PROOF. By 3.5.5(c), the $p_U$ are seminorms, and by 3.5.6, $\mathcal{U}$ is a base at 0 for the topology of $L$. By 3.5.5(d), for each $U \in \mathcal{U}$ we have $U = \{x: p_U(x) < 1\}$, and it follows that the topology of $L$ is the same as the topology induced by the $p_U$. ∎

The fact that the Minkowski functional is sublinear suggests that the Hahn-Banach theorem may provide useful information. The next result illustrates this idea.

3.5.8 Theorem. Let $K$ be a convex subset of the real vector space $L$; assume $K$ is radial at 0 and has Minkowski functional $p$.

(a) If $f$ is a linear functional on $L$, then $f \le 1$ on $K$ iff $f \le p$ on $L$.
(b) If $g$ is a linear functional on the subspace $M$, and $g \le 1$ on $K \cap M$, then $g$ may be extended to a linear functional $f$ on $L$ such that $f \le 1$ on $K$.
(c) If in addition $K$ is circled and $L$ is a complex vector space, and $g$ is a linear functional on the subspace $M$ with $|g| \le 1$ on $K \cap M$, then $g$ may be extended to a linear functional $f$ on $L$ such that $|f| \le 1$ on $K$.
(d) A continuous linear functional $g$ on a subspace $M$ of a locally convex topological vector space $L$ may be extended to a continuous linear functional on $L$.

PROOF. (a) If $f \le p$ on $L$, then $f \le 1$ on $K$ by 3.5.5(b). Conversely, assume $f \le 1$ on $K$. If $x/r \in K$, then $f(x/r) \le 1$, so $f(x)$ ... $c$ for all $x$ in the other set. We are going to consider generalizations of this idea. The following theorem is the fundamental result of this type.

3.5.13 Basic Separation Theorem. Let $K_1$ and $K_2$ be disjoint, nonempty convex subsets of the real vector space $L$, and assume that $K_1$ has at least one internal point. There is a linear functional $f$ on $L$ separating $K_1$ and $K_2$, that is, $f \not\equiv 0$ and $f(x) \le f(y)$ for all $x \in K_1$ and $y \in K_2$ [for short, $f(K_1) \le f(K_2)$].

PROOF. First assume that 0 is an internal point of $K_1$. Pick an element $z \in K_2$; then $-z$ is an internal point of $-z + K_1 \subset K_1 - K_2$; hence 0 is an internal point of the convex set $K = z + K_1 - K_2$. If $z \in K$, then $K_1 \cap K_2 \ne \emptyset$, contradicting the hypothesis; therefore $z \notin K$; so if $p$ is the Minkowski functional of $K$, we have $p(z) \ge 1$ by 3.5.5(b). Define a linear functional $g$ on the subspace $M = \{\lambda z: \lambda \in \mathbb{R}\}$ by $g(\lambda z) = \lambda$. If $a > 0$, then $g(az) = a = a(1) \le ap(z) = p(az)$, and if $a \le 0$, then $g(az) = a \le 0$ ... $c_2 > 0$. ∎
If we adopt the above definition of separation in complex vector spaces, all parts of 3.5.14 extend immediately to the complex case.

Separation theorems may be applied effectively in the study of weak topologies. In 3.4.10 we defined the weak topology on a normed linear space; the definition is identical for an arbitrary topological vector space $L$. Specifically, for each $f \in L^*$, $p_f(x) = |f(x)|$ defines a seminorm on $L$. The locally convex topology induced by the seminorms $p_f$, $f \in L^*$, is called the weak topology on $L$. By 3.5.1 and 3.5.2, a base at $x_0$ for the weak topology consists of finite intersections of sets of the form $\{x: p_f(x - x_0) < \varepsilon\}$, so in the case of a normed linear space we obtain the topology defined in 3.4.10.

There is a dual topology defined on $L^*$; if $x \in L$, then $p_x(f) = |f(x)|$ defines a seminorm on $L^*$. The locally convex topology induced by the seminorms $p_x$ is called the weak* topology on $L^*$.

By 3.5.2, the weak topology is the weakest topology on $L$ making each $f \in L^*$ continuous, so the weak topology is weaker than the original topology of $L$. Convergence of $x_n$ to $x$ in the weak topology means $f(x_n) \to f(x)$ for each $f \in L^*$. The weak* topology on $L^*$ is the weakest topology making all evaluation maps $f \to f(x)$ continuous. Convergence of $f_n$ to $f$ in the weak* topology means $f_n(x) \to f(x)$ for all $x \in L$; thus weak* convergence is simply pointwise convergence, so if $L$ is a normed linear space, the weak* topology is weaker than the norm topology on $L^*$.

We have observed in 3.4.10 that the weak topology is an example of a product topology. Since weak* convergence is pointwise convergence, the weak* topology is the product topology on the set $\mathbb{C}^L$ of all complex-valued functions on $L$, relativized to $L^*$.

In distinguishing between the weak topology and the original topology on $L$, it will be convenient to call the original topology the strong topology. By the above discussion, a weakly closed subset of $L$ is closed in the original topology. Under certain conditions there is a converse statement:
3.5.15 Theorem. Let $L$ be a locally convex topological vector space. If $K$ is a convex subset of $L$, then $K$ is strongly closed in $L$ iff it is weakly closed.

PROOF. Assume $K$ strongly closed. If $y \notin K$, then by 3.5.14(b) there are real numbers $c_1$ and $c_2$ and an $f \in L^*$ with $\mathrm{Re}\, f(x) \le c_1 < c_2 \le \mathrm{Re}\, f(y)$ for all $x \in K$. But then $W = \{x \in L: |f(x) - f(y)| < c_2 - c_1\}$ is a weak neighborhood of $y$, and if $x \in K$, we have

$$|f(x - y)| \ge |\mathrm{Re}\, f(y) - \mathrm{Re}\, f(x)| \ge c_2 - c_1;$$

therefore $W \cap K = \emptyset$, proving $K^c$ weakly open. ∎

For the remainder of this section we consider normed linear spaces. The closed unit ball $\{f: \|f\|$ ... $0$,
$\|x\|_2 \le r\|x\|_1$ for all $x \in L$; in other words (Problem 6, Section 3.3) the norms induce the same topology.

14. (a) (Closed graph theorem) Let $A$ be a linear map from $L$ to $M$, where $L$ and $M$ are complete metrizable topological vector spaces. Assume $A$ is closed; in other words, the graph of $A$ is a closed subset of $L \times M$, with the product topology. Show that $A$ is continuous.
(b) If $A$ is a continuous linear operator from $L$ to $M$ where $L$ and $M$ are topological vector spaces and $M$ is Hausdorff, show that $A$ is closed.

15. Let $g, f_1, \ldots, f_n$ be linear functionals on the vector space $L$. If $N$ denotes null space and $\bigcap_{i=1}^{n} N(f_i) \subset N(g)$, show that $g$ is a linear combination of the $f_i$.

16. Let $L$ be a normed linear space. If the weak and strong topologies coincide on $L$, show that $L$ is finite-dimensional, as follows:
(a) The unit ball $B = \{x: \|x\| < 1\}$ is strongly open, hence weakly open by hypothesis. Thus we can find $f_1, \ldots, f_n \in L^*$ and $\delta_1, \ldots, \delta_n > 0$ such that $\{x: |f_i(x)| < \delta_i,\ i = 1, \ldots, n\} \subset B$. Show that (with $N(f)$ denoting the null space of $f$) $\bigcap_{i=1}^{n} N(f_i) = \{0\}$.
(b) Define $T(x) = (f_1(x), \ldots, f_n(x))$; by (a), $T$ is a one-to-one map of $L$ onto a subspace $M$ of $\mathbb{C}^n$. If $\{y_1, \ldots, y_k\}$ is a basis for $M$ and $x_j = T^{-1}(y_j)$, $j = 1, \ldots, k$, show that $\{x_1, \ldots, x_k\}$ is a basis for $L$, hence $L$ is finite-dimensional.

17. Let $L$ be a separable normed linear space. If $B$ is the closed unit ball in $L^*$, show that $B$ is metrizable in the weak* topology. Thus since $B$ is compact by 3.5.16, every sequence in $B$ (hence every norm-bounded
sequence in $L^*$) has a weak* convergent subsequence. Similarly, if $L$ is reflexive, the closed unit ball of $L$ is metrizable in the weak topology. (Let $\{x_1, x_2, \ldots\}$ be a countable dense subset of $L$, and set

$$d(f, g) = \sum_{n=1}^{\infty} \frac{1}{2^n}\,\frac{|f(x_n) - g(x_n)|}{1 + |f(x_n) - g(x_n)|}, \quad f, g \in L^*.$$

Show that weak* convergence implies $d$-convergence.)

18. Show that a reflexive Banach space is weakly complete; in other words every weak Cauchy sequence [$\{f(x_n)\}$ is Cauchy for each $f \in L^*$] converges weakly.

19. A subset $B$ of a topological vector space $L$ is said to be absorbed by a subset $A$ if $B \subset aA$ for sufficiently large $|a|$; $B$ is said to be bounded if it is absorbed by every neighborhood of 0.
(a) Show that $B$ is bounded if for each sequence of points $x_n \in B$ and each sequence of complex numbers $\lambda_n \to 0$, we have $\lambda_n x_n \to 0$.
(b) A bornivore in a topological vector space $L$ is a convex circled set that absorbs every bounded set; $L$ is bornological if $L$ is locally convex and every bornivore is an overneighborhood of 0. Show that every metrizable locally convex space is bornological.
(c) Let $T$ be a linear operator from $L$ to $M$, where $L$ is bornological and $M$ is locally convex. If $T$ maps bounded sets into bounded sets, show that $T$ is continuous. (The converse is true for arbitrary $L$ and $M$.)
(d) If $L$ is a Hausdorff topological vector space, show that there is no bounded subspace of $L$ except $\{0\}$.
3.6
References
There is a vast literature on functional analysis, and we only give a few representative titles. Readable introductory treatments are given in Liusternik and Sobolev (1961), Taylor (1958), Bachman and Narici (1966), and Halmos (1951); the last deals exclusively with Hilbert spaces. Among the more advanced treatments, Dunford and Schwartz (1958, 1963, 1970) emphasize normed spaces, while Kelley and Namioka (1963) and Schaefer (1966) emphasize topological vector spaces. Yosida (1968) gives a broad survey of applications to differential equations, semigroup theory, and other areas of analysis.
4 The Interplay between Measure Theory and Topology
4.1
Introduction
A connection between measure theory and topology is established when a $\sigma$-field $\mathcal{F}$ is defined in terms of topological properties. In the most common situation, we have a topological space $\Omega$, and $\mathcal{F}$ is taken as the smallest $\sigma$-field containing all open sets of $\Omega$. If this is done, there is a natural connection between measure-theoretic and topological questions. For example, if $\mu$ is a measure on $\mathcal{F}$, and $A \in \mathcal{F}$, we may ask whether $A$ can be approximated by a compact subset. In other words, we wish to know if $\mu(A) = \sup\{\mu(K): K \text{ a compact subset of } A\}$. As another example, we may ask whether a function in $L^p(\Omega, \mathcal{F}, \mu)$ can be approximated by a continuous function. One formulation of this is to ask whether the continuous functions are dense in $L^p$. In this chapter we investigate questions of this type. The results in the first two sections are not topological, but they serve as basic tools in the later development. We first consider a result that is a companion to the monotone class theorem:

4.1.1 Definition. Let $\mathcal{D}$ be a class of subsets of a set $\Omega$. Then $\mathcal{D}$ is said to be a Dynkin system (D-system for short) if the following conditions hold.

(a) $\Omega \in \mathcal{D}$.
(b) If $A, B \in \mathcal{D}$, $B \subset A$, then $A - B \in \mathcal{D}$. Thus $\mathcal{D}$ is closed under proper differences.
(c) If $A_1, A_2, \ldots \in \mathcal{D}$ and $A_n \uparrow A$, then $A \in \mathcal{D}$.

Note that by (a) and (b), $\mathcal{D}$ is closed under complementation; hence by (c), $\mathcal{D}$ is a monotone class. If $\mathcal{D}$ is closed under finite union (or closed under finite intersection), then $\mathcal{D}$ is a field, and hence a $\sigma$-field (see 1.2.1).

4.1.2 Dynkin System Theorem. Let $\mathcal{S}$ be a class of subsets of $\Omega$, and assume $\mathcal{S}$ closed under finite intersection. If $\mathcal{D}$ is a Dynkin system and $\mathcal{D} \supset \mathcal{S}$, then $\mathcal{D}$ includes the minimal $\sigma$-field $\mathcal{F} = \sigma(\mathcal{S})$.

PROOF. Let $\mathcal{D}_0$ be the smallest D-system including $\mathcal{S}$. We show that $\mathcal{D}_0 = \mathcal{F}$, in other words the smallest D-system and the smallest $\sigma$-field over a class closed under finite intersection coincide. Since $\mathcal{D}_0 \subset \mathcal{D}$, the result will follow.

Now $\mathcal{D}_0 \subset \mathcal{F}$ since $\mathcal{F}$ is a D-system. To show that $\mathcal{F} \subset \mathcal{D}_0$, let $\mathcal{C} = \{A \in \mathcal{D}_0: A \cap B \in \mathcal{D}_0 \text{ for all } B \in \mathcal{S}\}$. Then $\mathcal{S} \subset \mathcal{C}$ since $\mathcal{S}$ is closed under finite intersection, and since $\mathcal{D}_0$ is a D-system, so is $\mathcal{C}$. Thus $\mathcal{D}_0 \subset \mathcal{C}$, hence $\mathcal{D}_0 = \mathcal{C}$. Now let $\mathcal{C}' = \{C \in \mathcal{D}_0: C \cap D \in \mathcal{D}_0 \text{ for all } D \in \mathcal{D}_0\}$. The result $\mathcal{D}_0 = \mathcal{C}$ implies that $\mathcal{S} \subset \mathcal{C}'$, and since $\mathcal{C}'$ is a D-system we have $\mathcal{D}_0 \subset \mathcal{C}'$, hence $\mathcal{D}_0 = \mathcal{C}'$. It follows that $\mathcal{D}_0$ is closed under finite intersection; by 4.1.1, $\mathcal{D}_0$ is a $\sigma$-field, so that $\mathcal{F} \subset \mathcal{D}_0$. ∎

If $\mathcal{S}$ is a field and $\mathcal{M}$ is the smallest monotone class including $\mathcal{S}$, then $\mathcal{M} = \sigma(\mathcal{S})$ (see 1.3.9). In the Dynkin system theorem, we have a weaker hypothesis on $\mathcal{S}$ (it is closed under finite intersection but need not be a field) but a stronger hypothesis on the class of sets including $\mathcal{S}$ ($\mathcal{D}$ is a Dynkin system, not merely a monotone class).

4.1.3 Corollary. Let $\mathcal{S}$ be a class of subsets of $\Omega$, and let $\mu_1$ and $\mu_2$ be finite measures on $\sigma(\mathcal{S})$. Assume $\Omega \in \mathcal{S}$ and $\mathcal{S}$ is closed under finite intersection. If $\mu_1 = \mu_2$ on $\mathcal{S}$, then $\mu_1 = \mu_2$ on $\sigma(\mathcal{S})$.

PROOF. Let $\mathcal{D}$ be the collection of sets $A \in \sigma(\mathcal{S})$ such that $\mu_1(A) = \mu_2(A)$. Then $\mathcal{D}$ is a D-system and $\mathcal{S} \subset \mathcal{D}$, hence $\mathcal{D} = \sigma(\mathcal{S})$ by 4.1.2. ∎

4.1.4 Corollary. Let $\mathcal{S}$ be a class of subsets of $\Omega$; assume that $\Omega \in \mathcal{S}$ and $\mathcal{S}$ is closed under finite intersection. Let $H$ be a vector space of real-valued functions on $\Omega$, such that $I_A \in H$ for each $A \in \mathcal{S}$.
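Suppose that whenever $f_1, f_2, \ldots$ are nonnegative functions in $H$, $|f_n| \le M < \infty$ for all $n$, and $f_n \uparrow f$, the limit function $f$ belongs to $H$. Then $I_A \in H$ for all $A \in \sigma(\mathcal{S})$.

PROOF. Let $\mathcal{D} = \{A \subset \Omega: I_A \in H\}$; then $\mathcal{S} \subset \mathcal{D}$ and $\mathcal{D}$ is a D-system. For if $A, B \in \mathcal{D}$, $B \subset A$, then $I_{A-B} = I_A - I_B \in H$, so that $A - B \in \mathcal{D}$. If $A_n \in \mathcal{D}$ and $A_n \uparrow A$, then $I_{A_n} \uparrow I_A$, hence $I_A \in H$ by hypothesis; thus $A \in \mathcal{D}$. The result now follows from 4.1.2. ∎

4.2 The Daniell Integral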
One of the basic properties of the integral is linearity; if $f$ and $g$ are $\mu$-integrable functions and $a$ and $b$ are real or complex numbers, then

$$\int_\Omega (af + bg)\,d\mu = a\int_\Omega f\,d\mu + b\int_\Omega g\,d\mu.$$

Thus the integral may be regarded as a linear functional on the vector space of integrable functions. This idea may be used as the basis for a different approach to integration theory. Instead of beginning with a measure and constructing the corresponding integral, we start with a given linear functional $E$ on a vector space. Under appropriate hypotheses, we extend $E$ to a larger space, and finally we show that there is a measure $\mu$ such that $E$ is in fact the integral with respect to $\mu$. We first fix the notation to be used.

4.2.1 Notation. In this section, $L$ will denote a vector space of real-valued functions on a set $\Omega$; $L$ is assumed closed under the lattice operations, in other words if $f, g \in L$ and $f \vee g = \max(f, g)$, $f \wedge g = \min(f, g)$, then $f \vee g, f \wedge g \in L$.

There are several familiar examples of such spaces; $L$ can be the class of continuous real-valued functions on a given topological space, or equally well, the bounded continuous real-valued functions. Another possibility is to take $L$ as the collection of all real-valued functions on a given set. The letter $E$ will denote a positive linear functional on $L$, that is, a linear map $E$ from $L$ to $\mathbb{R}$ such that $f \ge 0$ implies $E(f) \ge 0$. This implies that $E$ is monotone, that is, $f \le g$ implies $E(f)$ ... $0\}$. The collection of functions $f: \Omega \to \overline{\mathbb{R}}$ of the form $\lim_n f_n$ where the $f_n$ form an increasing sequence of functions in $L^+$, will be denoted by $L'$; if the $f_n$ are allowed to form an increasing net in $L^+$, the resulting class is denoted by $L''$. (See the appendix on general topology, Section A1, for a discussion of nets.)
If $H$ is as above, $\sigma(H)$ is defined as the smallest $\sigma$-field of subsets of $\Omega$ making every function in $H$ Borel measurable, that is, the minimal $\sigma$-field containing all sets $f^{-1}(B)$, where $f$ ranges over $H$ and $B$ over the Borel sets. If $\mathcal{C}$ is a class of subsets of $\Omega$, $\sigma(\mathcal{C})$ as usual denotes the minimal $\sigma$-field over $\mathcal{C}$.

The following will be assumed throughout:

Hypothesis A: If the functions $f_n$ form a sequence in $L$ decreasing to 0, then $E(f_n)$ decreases to 0. Equivalently, if the $f_n$ belong to $L$ and increase to $f \in L$, then $E(f_n)$ increases to $E(f)$.

We are also going to carry through a parallel development under the following assumption:

Hypothesis B: If the functions $f_n$ form a net in $L$ decreasing to 0, then $E(f_n)$ decreases to 0 (with the equivalent statement just as in hypothesis A).

Hypothesis A is always assumed in the statements of theorems. Corresponding results under hypothesis B will be added in brackets.

4.2.2 Lemma. Let $\{f_m\}$ and $\{f_n'\}$ be sequences in $L$ increasing to $f$ and $f'$, respectively, with $f \le f'$ ($f$ and $f'$ need not belong to $L$). Then $\lim_m E(f_m)$ ... $a\} \in \mathcal{G}$. [Under hypothesis B, the proof is the same, with $\{f_n\}$ a net instead of a sequence.] ∎

4.2.6 Lemma. The $\sigma$-fields $\sigma(L)$, $\sigma(L')$, and $\sigma(\mathcal{G})$ are identical. [Under hypothesis B we only have $\sigma(L'') = \sigma(\mathcal{G})$ and $\sigma(L) \subset \sigma(L'')$.]

PROOF. By 4.2.5, $\sigma(\mathcal{G})$ makes every function in $L'$ measurable; hence $\sigma(L') \subset \sigma(\mathcal{G})$. If $G \in \mathcal{G}$, then $I_G \in L'$; hence $G = \{I_G = 1\} \in \sigma(L')$; therefore $\sigma(\mathcal{G}) \subset \sigma(L')$. [Under hypothesis B, $\sigma(L'') = \sigma(\mathcal{G})$ by the same argument.]
If $f \in L$, then $f = f^+ - f^-$, where $f^+, f^- \in L^+ \subset L'$. Since $f^+$ and $f^-$ are $\sigma(L')$-measurable, so is $f$. In other words, $\sigma(L')$ makes every function in $L$ Borel measurable, so that $\sigma(L) \subset \sigma(L')$. [Under hypothesis B, $\sigma(L) \subset \sigma(L'')$ by the same argument.] Now if $f \in L'$, then $f$ is the limit of a sequence $f_n \in L$, and since the $f_n$ are $\sigma(L)$-measurable, so is $f$. [This fails under hypothesis B because the limit of a net of measurable functions need not be measurable; see Problem 1.] Thus $\sigma(L') \subset \sigma(L)$. ∎

Now by definition of the set function $\mu^*$ (see 4.2.4) we have, for all $A \subset \Omega$,

$$\mu^*(A) = \inf\{E(I_G): G \in \mathcal{G},\ G \supset A\} = \inf\{E(f): f = I_G \in L',\ f \ge I_A\} \ge \inf\{E(f): f \in L',\ f \ge I_A\}.$$

In fact equality holds.

4.2.7 Lemma. For any $A \subset \Omega$, $\mu^*(A) = \inf\{E(f): f \in L',\ f \ge I_A\}$. [The result is the same under hypothesis B, with $L'$ replaced by $L''$.]

PROOF. Let $f \in L'$, $f \ge I_A$. If $0 < a < 1$, then $A \subset \{f > a\}$, which belongs to $\mathcal{G}$ by 4.2.5. Thus $\mu^*(A) \le \mu\{f > a\} = E(I_{\{f > a\}})$. But since $f \ge 0$ we have $f \ge aI_{\{f > a\}}$; hence $E(I_{\{f > a\}}) \le E(f)/a$. Let $a \to 1$ to conclude that $\mu^*(A) \le E(f)$. [The proof is the same under hypothesis B.] ∎
We may now prove that $\mu^*$ is a measure on $\sigma(\mathcal{G})$:

4.2.8 Lemma. If $\mathcal{H} = \{H \subset \Omega: \mu^*(H) + \mu^*(H^c) = 1\}$, then $\mathcal{G} \subset \mathcal{H}$, hence $\sigma(\mathcal{G}) \subset \mathcal{H}$. [The result is the same under hypothesis B.]

PROOF. Let $G \in \mathcal{G}$; since $I_G \in L'$, there are functions $f_n \in L^+$ with $f_n \uparrow I_G$. Then

$$\mu^*(G) = \mu(G) = E(I_G) = \lim_n E(f_n) \quad \text{by 4.2.3(e)}.$$

By 4.2.7, $\mu^*(G^c) = \inf\{E(f): f \in L',\ f \ge I_{G^c}\}$. But if $f_n \le I_G$, then $1 - f_n \ge I_{G^c}$; since $1 - f_n \ge 0$, we have $1 - f_n \in L^+ \subset L'$; hence

$$\mu^*(G^c) \le \inf_n E(1 - f_n) = 1 - \lim_n E(f_n) = 1 - E(I_G).$$

Thus $\mu^*(G) + \mu^*(G^c) \le 1$; since $\mu^*(G) + \mu^*(G^c)$ is always at least 1 by 1.3.3(b), we have $G \in \mathcal{H}$. [Under hypothesis B, the proof is the same, with $\{f_n\}$ a net instead of a sequence.] ∎
We now prove the main theorem; for clarity we state the results under hypotheses A and B separately (in 4.2.9 and 4.2.10).

4.2.9 Daniell Representation Theorem. Let $L$ be a vector space of real-valued functions on the set $\Omega$; assume that $L$ contains the constant functions and is closed under the lattice operations. Let $E$ be a Daniell integral on $L$, that is, a positive linear functional on $L$ such that $E(f_n) \downarrow 0$ for each sequence of functions $f_n \in L$ with $f_n \downarrow 0$; assume that $E(1) = 1$. Then there is a unique probability measure $P$ on $\sigma(L)$ ($= \sigma(L') = \sigma(\mathcal{G})$ by 4.2.6) such that each $f \in L$ is $P$-integrable and $E(f) = \int_\Omega f\,dP$.

PROOF. Let $P$ be the restriction of $\mu^*$ to $\sigma(L)$. By 4.2.6 and 4.2.8, $P$ is a probability measure. If $G \in \mathcal{G}$, then

$$E(I_G) = \mu(G) = \mu^*(G) = P(G) = \int_\Omega I_G\,dP.$$

Now if $f \in L'$, we define

$$h_n = \sum_{k=1}^{n2^n} \frac{k - 1}{2^n}\, I_{\{(k-1)/2^n < f \le k/2^n\}} + n I_{\{f > n\}}.$$

Since $I_{\{a < f \le b\}} = I_{\{f > a\}} - I_{\{f > b\}}$ for $a < b$, and $\{f > a\}, \{f > b\} \in \mathcal{G}$ by 4.2.5, it follows that $E(h_n) = \int_\Omega h_n\,dP$. But the $h_n$ form a sequence of nonnegative simple functions increasing to $f$ [see 1.5.5(a)], so by the monotone convergence theorem, $E(f) = \int_\Omega f\,dP$.

Now let $f \in L$; $f = f^+ - f^-$, where $f^+, f^- \in L^+ \subset L'$. Then $E(f) = E(f^+) - E(f^-) = \int_\Omega f^+\,dP - \int_\Omega f^-\,dP = \int_\Omega f\,dP$. (Since $f^+, f^- \in L$, the integrals are finite.)

This establishes the existence of the desired probability measure $P$. If $P'$ is another such measure, then $\int_\Omega f\,dP = \int_\Omega f\,dP'$ for all $f \in L$, and hence for all $f \in L'$, by the monotone convergence theorem. Set $f = I_G$, $G \in \mathcal{G}$, to show that $P = P'$ on $\mathcal{G}$. Now $\mathcal{G}$ is closed under finite intersection by 4.2.4(b); hence by 4.1.3, $P = P'$ on $\sigma(\mathcal{G})$, proving uniqueness. ∎

4.2.10 Theorem. Let $L$ be a vector space of real-valued functions on the set $\Omega$; assume that $L$ contains the constant functions and is closed under the lattice operations. Let $E$ be a positive linear functional on $L$ such that $E(f_n) \downarrow 0$ for each net of functions $f_n \in L$ with $f_n \downarrow 0$; assume that $E(1) = 1$. Then there is a unique probability measure $P$ on $\sigma(L'')$ ($= \sigma(\mathcal{G})$ by 4.2.6) such that:

(a) Each $f \in L$ is $P$-integrable and $E(f) = \int_\Omega f\,dP$.
(b) If $\{G_\alpha\}$ is a net of sets in $\mathcal{G}$ and $G_\alpha \uparrow G$, then $G \in \mathcal{G}$ and $P(G_\alpha) \uparrow P(G)$.
PROOF. Let $P$ be the restriction of $\mu^*$ to $\sigma(L'')$. The proof that $P$ satisfies (a) is done exactly as in 4.2.9, with $L'$ replaced by $L''$; $P$ satisfies (b) by 4.2.4(d).

The uniqueness part cannot be done exactly as in 4.2.9 because the monotone convergence theorem fails in general for nets (see Problem 2). Let $P'$ be a probability measure satisfying (a) and (b). If $f \in L''$, there is a net of functions $f_\alpha \in L^+$ with $f_\alpha \uparrow f$. We define

$$h_{n\alpha} = \frac{1}{2^n}\sum_{j=1}^{n2^n} I_{\{f_\alpha > j2^{-n}\}}, \quad n = 1, 2, \ldots.$$

If $(k - 1)/2^n < f_\alpha(\omega) \le k/2^n$, $k = 1, 2, \ldots, n2^n$, then $h_{n\alpha}(\omega) = (k - 1)/2^n$, and if $f_\alpha(\omega) > n$, then $h_{n\alpha}(\omega) = n$. Thus the $h_{n\alpha}$, $n = 1, 2, \ldots$, are in fact the standard sequence of nonnegative simple functions increasing to $f_\alpha$ [see 1.5.5(a)]. Similarly, if

$$h_n = \frac{1}{2^n}\sum_{j=1}^{n2^n} I_{\{f > j2^{-n}\}},$$

the $h_n$ are nonnegative simple functions increasing to $f$. Now

$$\int_\Omega h_n\,dP' = \frac{1}{2^n}\sum_{j=1}^{n2^n} P'\{f > j2^{-n}\}$$
$$= \frac{1}{2^n}\sum_{j=1}^{n2^n} \lim_\alpha P'\{f_\alpha > j2^{-n}\} \quad \text{by (b)}$$
$$= \lim_\alpha \frac{1}{2^n}\sum_{j=1}^{n2^n} P'\{f_\alpha > j2^{-n}\} \quad \text{since the sum on } j \text{ is finite}$$
$$= \lim_\alpha \int_\Omega h_{n\alpha}\,dP';$$

but

$$\int_\Omega f\,dP' = \lim_n \int_\Omega h_n\,dP' \quad \text{by the monotone convergence theorem}$$
$$= \lim_n \lim_\alpha \int_\Omega h_{n\alpha}\,dP'$$
$$= \lim_\alpha \lim_n \int_\Omega h_{n\alpha}\,dP' \quad \text{since } h_{n\alpha} \text{ is monotone in each variable, so that "lim" may be replaced by "sup"}$$
$$= \lim_\alpha \int_\Omega f_\alpha\,dP' \quad \text{by the monotone convergence theorem}$$
$$= \lim_\alpha \int_\Omega f_\alpha\,dP \quad \text{by (a)}$$
$$= \int_\Omega f\,dP \quad \text{by the above argument with } P' \text{ replaced by } P.$$
Set $f = I_G$, $G \in \mathcal{G}$, to show that $P = P'$ on $\mathcal{G}$; hence, as in 4.2.9, on $\sigma(\mathcal{G})$. ∎

The following approximation theorem will be helpful in the next section.

4.2.11 Theorem. Assume the hypothesis of 4.2.9, and in addition assume that $L$ is closed under limits of uniformly convergent sequences. Let
$$\mathcal{G}' = \{G \subset \Omega : G = \{f > 0\} \text{ for some } f \in L^+\}.$$
Then:
(a) $\mathcal{G}' = \mathcal{G}$.
(b) If $A \in \sigma(L)$, then $P(A) = \inf\{P(G) : G \in \mathcal{G}',\ G \supset A\}$.
(c) If $G \in \mathcal{G}$, then $P(G) = \sup\{E(f) : f \in L^+,\ f \le I_G\}$.

PROOF. (a) We have $\mathcal{G}' \subset \mathcal{G}$ by 4.2.5. Conversely, suppose $G \in \mathcal{G}$, and let $f_n \in L^+$ with $f_n \uparrow I_G$ ($\in L'$). Set $f = \sum_{n=1}^{\infty} 2^{-n} f_n$. Since $0 \le f_n \le 1$, the series converges uniformly, hence $f \in L^+$. Furthermore,
$$\{f > 0\} = \bigcup_{n=1}^{\infty} \{f_n > 0\} = G.$$
Consequently, $G \in \mathcal{G}'$.
(b) This is immediate from (a) and the fact that $P = \mu^*$ on $\sigma(L)$.
(c) If $f \in L^+$, $f \le I_G$, then $E(f) \le E(I_G) = P(G)$. Conversely, let $G \in \mathcal{G}$, with $f_n \in L^+$, $f_n \uparrow I_G$. Then $P(G) = E(I_G) = \lim_n E(f_n) = \sup_n E(f_n)$; hence
$$P(G) \le \sup\{E(f) : f \in L^+,\ f \le I_G\}. \qquad ∎$$

Problems
1. Give an example to show that the limit of a net of measurable functions need not be measurable.
2. Give an example of a net of nonnegative Borel measurable functions $f_\alpha$ increasing to a Borel measurable function $f$, with $\lim_\alpha \int f_\alpha\,d\mu \ne \int f\,d\mu$.
3. Let $L$ be the class of real-valued continuous functions on $[0, 1]$, and let $E(f)$ be the Riemann integral of $f$. Show that $E$ is a Daniell integral on $L$, and show that $\sigma(L) = \mathcal{B}[0, 1]$ and $P$ is Lebesgue measure.
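The dyadic functions $h_n$ used in the uniqueness proof of 4.2.10 can be checked numerically at a single point. The following is a minimal sketch, not part of the text: the helper name `dyadic_approx` and the sample value 5/3 are illustrative assumptions, and exact rational arithmetic is used so that the comparisons $f > j2^{-n}$ are not affected by rounding.

```python
from fractions import Fraction

def dyadic_approx(value, n):
    """h_n at a point where f equals `value`: 2^{-n} * #{j in 1..n*2^n : value > j*2^{-n}}."""
    step = Fraction(1, 2 ** n)
    count = sum(1 for j in range(1, n * 2 ** n + 1) if value > j * step)
    return count * step

# Hypothetical sample value f(w) = 5/3; the approximations increase toward it.
x = Fraction(5, 3)
approximations = [dyadic_approx(x, n) for n in range(1, 8)]
```

For $n = 1$ the value is capped at $n$, and once $n \ge f(\omega)$ the error is at most $2^{-n}$, in line with the description of $h_{n\alpha}$ in the proof.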
4.3 Measures on Topological Spaces
We are now in a position to obtain precise results on the interplay between measure theory and topology.
4.3.1 Definitions and Comments. Let $\Omega$ be a normal topological space ($\Omega$ is Hausdorff, and if $A$ and $B$ are disjoint closed subsets of $\Omega$, there are disjoint open sets $U$ and $V$ with $A \subset U$ and $B \subset V$). The basic property of normal spaces that we need is Urysohn's lemma: If $A$ and $B$ are disjoint closed subsets of $\Omega$, there is a continuous function $f : \Omega \to [0, 1]$ such that $f = 0$ on $A$ and $f = 1$ on $B$. Other standard results are that every compact Hausdorff space is normal, and every metric space is normal.

The class of Borel sets of $\Omega$, denoted by $\mathcal{B}(\Omega)$ or simply by $\mathcal{B}$, is defined as the smallest $\sigma$-field of subsets of $\Omega$ containing the open (or equally well the closed) sets. The class of Baire sets of $\Omega$, denoted by $\mathcal{A}(\Omega)$ or simply by $\mathcal{A}$, is defined as the smallest $\sigma$-field of subsets of $\Omega$ making all continuous real-valued functions (Borel) measurable, that is, $\mathcal{A}$ is the minimal $\sigma$-field containing all sets $f^{-1}(B)$ where $B$ ranges over $\mathcal{B}(R)$ and $f$ ranges over the class $C(\Omega)$ of continuous maps from $\Omega$ to $R$.

Note that $\mathcal{A}$ is the smallest $\sigma$-field making all bounded continuous functions measurable. For let $\mathcal{F}$ be a $\sigma$-field that makes all bounded continuous functions measurable. If $f \in C(\Omega)$, then $f^+ \wedge n$ is a bounded continuous function and $f^+ \wedge n \uparrow f^+$ as $n \to \infty$. Thus $f^+$ (and similarly $f^-$) is $\mathcal{F}$-measurable, hence $f = f^+ - f^-$ is $\mathcal{F}$-measurable. Thus $\mathcal{A} \subset \mathcal{F}$, as desired. The class of bounded continuous real-valued functions on $\Omega$ will be denoted by $C_b(\Omega)$.

If $V$ is an open subset of $R$ and $f \in C(\Omega)$, then $f^{-1}(V)$ is open in $\Omega$, hence $f^{-1}(V) \in \mathcal{B}(\Omega)$. But the sets $f^{-1}(V)$ generate $\mathcal{A}(\Omega)$, since any $\sigma$-field containing the sets $f^{-1}(V)$ for all open sets $V$ must contain the sets $f^{-1}(B)$ for all Borel sets $B$. (Problem 6 of Section 1.2 may be used to give a formal proof, with $\mathcal{C}$ taken as the class of open sets.) It follows that $\mathcal{A}(\Omega) \subset \mathcal{B}(\Omega)$.

An $F_\sigma$ set in $\Omega$ is a countable union of closed sets, and a $G_\delta$ set is a countable intersection of open sets.

4.3.2 Theorem. Let $\Omega$ be a normal topological space. Then $\mathcal{A}(\Omega)$ is the minimal $\sigma$-field containing the open $F_\sigma$ sets (or equally well, the minimal $\sigma$-field containing the closed $G_\delta$ sets).
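PROOF. Let $f \in C(\Omega)$; then $\{f > a\} = \bigcup_{n=1}^{\infty} \{f \ge a + (1/n)\}$ is an open $F_\sigma$ set. As above, the sets $\{f > a\}$, $a \in R$, $f \in C(\Omega)$, generate $\mathcal{A}$; hence $\mathcal{A}$ is included in the minimal $\sigma$-field $\mathcal{H}$ over the open $F_\sigma$ sets. Conversely, let $H = \bigcup_{n=1}^{\infty} F_n$, $F_n$ closed, be an open $F_\sigma$ set. By Urysohn's lemma, there are functions $f_n \in C(\Omega)$ with $0 \le f_n \le 1$, $f_n = 0$ on $H^c$, $f_n = 1$ on $F_n$. If $f = \sum_{n=1}^{\infty} 2^{-n} f_n$, then $f \in C(\Omega)$, with $0 \le f \le 1$ and $\{f > 0\} = H$ ...

... $\{f > 0\}$ where $f \in C_b(\Omega)$, $f \ge 0$. Then $I_A$ is the limit of an increasing sequence of continuous functions.

PROOF. We have $\{f > 0\} = \bigcup_{n=1}^{\infty} \{f \ge 1/n\}$, and by Urysohn's lemma there are functions $f_n \in C(\Omega)$ with $0 \le f_n \le 1$, $f_n = 0$ on $\{f = 0\}$, $f_n = 1$ on $\{f \ge 1/n\}$. If $g_n = \max(f_1, \ldots, f_n)$, then $g_n \uparrow I_{\{f > 0\}}$. ∎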
The Daniell theory now gives us a basic approximation theorem.

4.3.6 Theorem. Let $P$ be any probability measure on $\mathcal{A}(\Omega)$, where $\Omega$ is a normal topological space. If $A \in \mathcal{A}$, then
(a) $P(A) = \inf\{P(V) : V \supset A,\ V \text{ an open } F_\sigma \text{ set}\}$,
(b) $P(A) = \sup\{P(C) : C \subset A,\ C \text{ a closed } G_\delta \text{ set}\}$.
PROOF. Let $L = C_b(\Omega)$ and define $E(f) = \int_\Omega f\,dP$, $f \in L$. [Note that $\sigma(L) = \mathcal{A}$, so each $f \in L$ is $\mathcal{A}$-measurable; furthermore, since $f$ is bounded, $\int_\Omega f\,dP$ is finite. Thus $E$ is well-defined.] Now $E$ is a positive linear functional on $L$, and by the dominated convergence theorem, $E$ is a Daniell integral. By 4.2.11(b),
$$P(A) = \inf\{P(G) : G \in \mathcal{G}',\ G \supset A\},$$
where
$$\mathcal{G}' = \{G \subset \Omega : G = \{f > 0\} \text{ for some } f \in L^+\}.$$
By 4.3.3, $\mathcal{G}'$ is the class of open $F_\sigma$ sets, proving (a). Part (b) follows upon applying (a) to the complement of $A$. ∎

4.3.7 Corollary. If $\Omega$ is a metric space, and $P$ is a probability measure on $\mathcal{B}(\Omega)$, then for each $A \in \mathcal{B}(\Omega)$,
(a) $P(A) = \inf\{P(V) : V \supset A,\ V \text{ open}\}$,
(b) $P(A) = \sup\{P(C) : C \subset A,\ C \text{ closed}\}$.

PROOF. In a metric space, every closed set is a $G_\delta$ and every open set is an $F_\sigma$ (see 4.3.4); the result follows from 4.3.6. ∎
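Part (a) of 4.3.7 can be illustrated in a simple case. The sketch below is an illustration under stated assumptions, not the author's construction: $P$ is taken to be Lebesgue measure on $\Omega = [0, 1]$, the set $A$ is a closed interval $[a, b]$, and the open sets are the neighborhoods $V_n = \{x : d(x, A) < 1/n\}$; the function name `open_neighborhood_measure` is hypothetical.

```python
from fractions import Fraction

def open_neighborhood_measure(a, b, n):
    """Lebesgue measure of V_n = {x in [0, 1] : dist(x, [a, b]) < 1/n}."""
    left = max(Fraction(0), a - Fraction(1, n))
    right = min(Fraction(1), b + Fraction(1, n))
    return right - left

# A = [1/4, 3/4] has measure 1/2; the open sets V_n shrink down to A.
a, b = Fraction(1, 4), Fraction(3, 4)
outer = [open_neighborhood_measure(a, b, n) for n in (1, 2, 4, 8, 16, 1000)]
```

The values $P(V_n)$ are nonincreasing and approach $P(A) = 1/2$ from above, which is the infimum statement of the corollary for this particular closed set.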
Under additional hypotheses on $\Omega$, we obtain approximations by compact subsets.

4.3.8 Theorem. Let $\Omega$ be a complete separable metric space (sometimes called a "Polish space"). If $P$ is a probability measure on $\mathcal{B}(\Omega)$, then for each $A \in \mathcal{B}(\Omega)$,
$$P(A) = \sup\{P(K) : K \text{ compact subset of } A\}.$$

PROOF. By 4.3.7, the approximation property holds with "compact" replaced by "closed." We are going to show that if $\varepsilon > 0$, there is a compact set $K_\varepsilon$ such that $P(K_\varepsilon) \ge 1 - \varepsilon$. This implies the theorem, for if $C$ is closed, then $C \cap K_\varepsilon$ is compact, and $P(C) - P(C \cap K_\varepsilon) = P(C - K_\varepsilon) \le P(\Omega - K_\varepsilon) \le \varepsilon$.

Since $\Omega$ is separable, there is a countable dense set $\{\omega_1, \omega_2, \ldots\}$. Let $B(\omega_n, r)$ (respectively, $\bar{B}(\omega_n, r)$) be the open (respectively, closed) ball with center at $\omega_n$ and radius $r$. Then for every $r > 0$, $\Omega = \bigcup_{n=1}^{\infty} B(\omega_n, r)$, so that $\bigcup_{k=1}^{m} B(\omega_k, 1/n) \uparrow \Omega$ as $m \to \infty$ ($n$ fixed). Thus given $\varepsilon > 0$ and a positive integer $n$, there is a positive integer $m(n)$ such that $P\left(\bigcup_{k=1}^{m} B(\omega_k, 1/n)\right) \ge 1 - \varepsilon 2^{-n}$ for all $m \ge m(n)$.
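Let $K_\varepsilon = \bigcap_{n=1}^{\infty} \bigcup_{k=1}^{m(n)} \bar{B}(\omega_k, 1/n)$. Then $K_\varepsilon$ is closed, and
$$P(K_\varepsilon^c) \le \sum_{n=1}^{\infty} P\left(\left[\bigcup_{k=1}^{m(n)} \bar{B}\left(\omega_k, \frac{1}{n}\right)\right]^c\right) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon.$$
Therefore $P(K_\varepsilon) \ge 1 - \varepsilon$. It remains to show that $K_\varepsilon$ is compact. Let $\{x_1, x_2, \ldots\}$ be a sequence in $K_\varepsilon$. Then $x_p \in \bigcap_{n=1}^{\infty} \bigcup_{k=1}^{m(n)} \bar{B}(\omega_k, 1/n)$ for all $p$; hence $x_p \in \bigcup_{k=1}^{m(1)} \bar{B}(\omega_k, 1)$ for all $p$. We conclude that for some $k_1$, $x_p \in \bar{B}(\omega_{k_1}, 1)$ for infinitely many $p$, say, for $p \in T_1$, an infinite subset of the positive integers. But $x_p \in \bigcup_{k=1}^{m(2)} \bar{B}(\omega_k, \tfrac{1}{2})$ for all $p$, in particular for all $p \in T_1$; hence for some $k_2$, $x_p \in \bar{B}(\omega_{k_1}, 1) \cap \bar{B}(\omega_{k_2}, \tfrac{1}{2})$ for infinitely many $p \in T_1$, say, for $p \in T_2 \subset T_1$. Continue inductively to obtain integers $k_1, k_2, \ldots$ and infinite sets $T_1 \supset T_2 \supset \cdots$ such that
$$x_p \in \bigcap_{j=1}^{i} \bar{B}\left(\omega_{k_j}, \frac{1}{j}\right) \qquad \text{for all } p \in T_i.$$
Pick $p_i \in T_i$, $i = 1, 2, \ldots$, with $p_1 < p_2 < \cdots$. Then if $j < i$, we have $x_{p_i}, x_{p_j} \in \bar{B}(\omega_{k_j}, 1/j)$, so $d(x_{p_i}, x_{p_j}) \le 2/j \to 0$ as $j \to \infty$. Thus $\{x_{p_i}\}$ is a Cauchy sequence, hence converges (to a point of $K_\varepsilon$ since $K_\varepsilon$ is closed). Therefore $\{x_p\}$ has a subsequence converging to a point of $K_\varepsilon$, so $K_\varepsilon$ is compact. ∎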
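The choice of the integers $m(n)$ in the proof of 4.3.8 can be carried out numerically in a concrete case. The following sketch rests on assumptions that are not in the text: $\Omega = R$, $P$ is the standard normal distribution (computed with `math.erf`), and the dense sequence is one particular enumeration of the rationals; the helper names `rationals`, `union_measure`, and `choose_m` are hypothetical. Only finitely many levels $n$ are examined, so the code checks the measure estimate, not compactness.

```python
import math

def normal_cdf(x):
    """Distribution function of the standard normal law (assumed choice of P)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rationals():
    """Enumerate a dense sequence in R: all p/q and -p/q ordered by p + q (repeats allowed)."""
    s = 1
    while True:
        for q in range(1, s + 1):
            p = s - q
            yield p / q
            yield -p / q
        s += 1

def union_measure(centers, radius):
    """P-measure of the union of the closed balls [c - radius, c + radius]."""
    intervals = sorted((c - radius, c + radius) for c in centers)
    total = 0.0
    cur_lo, cur_hi = intervals[0]
    for lo, hi in intervals[1:]:
        if lo <= cur_hi:                      # overlapping or touching: merge
            cur_hi = max(cur_hi, hi)
        else:                                 # gap: close the current interval
            total += normal_cdf(cur_hi) - normal_cdf(cur_lo)
            cur_lo, cur_hi = lo, hi
    return total + normal_cdf(cur_hi) - normal_cdf(cur_lo)

def choose_m(n, eps, max_points=100000):
    """Smallest m with P(union of the first m balls of radius 1/n) >= 1 - eps * 2^{-n}."""
    target = 1.0 - eps * 2.0 ** (-n)
    centers = []
    for w in rationals():
        centers.append(w)
        coverage = union_measure(centers, 1.0 / n)
        if coverage >= target:
            return len(centers), coverage
        if len(centers) >= max_points:
            raise RuntimeError("no suitable m found within max_points")

eps = 0.1
levels = {n: choose_m(n, eps) for n in range(1, 5)}
# Total mass missed by the first four levels; bounded by the sum of eps * 2^{-n}.
deficit = sum(1.0 - coverage for _, coverage in levels.values())
```

Each level misses at most $\varepsilon 2^{-n}$ of the mass, so the intersection of the finitely many unions computed here has probability greater than $1 - \varepsilon$, mirroring the estimate $P(K_\varepsilon^c) \le \sum_n \varepsilon 2^{-n}$.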
We now apply the Daniell theory to obtain theorems on representation of positive linear functionals in a topological context.

4.3.9 Theorem. Let $\Omega$ be a compact Hausdorff space, and let $E$ be a positive linear functional on $C(\Omega)$, with $E(1) = 1$. There is a unique probability measure $P$ on $\mathcal{A}(\Omega)$ such that $E(f) = \int_\Omega f\,dP$ for all $f \in C(\Omega)$.

PROOF. Let $L = C(\Omega)$. If $f_n \in L$, $f_n \downarrow 0$, then $f_n \to 0$ uniformly (this is Dini's theorem). For given $\delta > 0$, we have $\Omega = \bigcup_{n=1}^{\infty} \{f_n < \delta\}$; hence by compactness,
$$\begin{aligned}
\Omega &= \bigcup_{n=1}^{N} \{f_n < \delta\} && \text{for some } N \\
&= \{f_N < \delta\} && \text{by monotonicity of } \{f_n\}.
\end{aligned}$$
Thus $n \ge N$ implies $0 \le f_n(\omega) \le f_N(\omega) < \delta$ for all $\omega$, proving uniform convergence.
Thus if $\delta > 0$ is given, eventually $0 \le f_n \le \delta$, so $0 \le E(f_n) \le E(\delta) = \delta$. Therefore $E(f_n) \downarrow 0$, hence $E$ is a Daniell integral. The result follows from 4.2.9. ∎
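The role of compactness in Dini's theorem can be seen numerically. This is a rough sketch under assumptions that are not in the text: the functions $f_n(x) = x^n(1 - x)$ on $[0, 1]$ and $g_n(x) = x^n$ on $[0, 1)$ are illustrative choices, and the supremum is only approximated on a finite grid.

```python
def sup_on_grid(func, points):
    """Largest value of func over a finite sample of points."""
    return max(func(x) for x in points)

# Sample of the compact interval [0, 1], endpoints included.
grid = [k / 1000 for k in range(1001)]

def f(n):
    """f_n(x) = x^n (1 - x): continuous on [0, 1] and decreasing to 0 pointwise."""
    return lambda x: x ** n * (1 - x)

# Sup norms over the grid shrink toward 0, as Dini's theorem predicts.
sups_compact = [sup_on_grid(f(n), grid) for n in (1, 2, 4, 8, 16, 32)]

def witness(n):
    """A point of [0, 1) at which g_n(x) = x^n equals 1/2."""
    return 0.5 ** (1.0 / n)

# On the non-compact interval [0, 1), g_n decreases to 0 pointwise but not uniformly:
# for every n there is still a point where g_n is 1/2.
values_at_witness = [witness(n) ** n for n in (1, 2, 4, 8, 16, 32)]
```

On $[0, 1]$ the sampled sup norms decrease toward 0, while on $[0, 1)$ the value $1/2$ is attained for every $n$, so the finite-subcover step of the proof has no analogue there.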
A somewhat different result is obtained if we use the Daniell theory with hypothesis B.

4.3.10 Theorem. Let $\Omega$ be a compact Hausdorff space, and let $E$ be a positive linear functional on $C(\Omega)$, with $E(1) = 1$. There is a unique probability measure $P$ on $\mathcal{B}(\Omega)$ such that
(a) $E(f) = \int_\Omega f\,dP$ for all $f \in C(\Omega)$, and
(b) for all $A \in \mathcal{B}(\Omega)$,
$$P(A) = \inf\{P(V) : V \supset A,\ V \text{ open}\}$$
or equivalently,
$$P(A) = \sup\{P(K) : K \subset A,\ K \text{ compact}\}.$$
("Compact" may be replaced by "closed" since $\Omega$ is compact Hausdorff.)

PROOF. Let $L = C(\Omega)$. If $\{f_n, n \in D\}$ is a net in $L$ and $f_n \downarrow 0$, then, as in 4.3.9, for any $\delta > 0$ we have $\Omega = \bigcup_{j \in F} \{f_j < \delta\}$ for some finite set $F \subset D$. If $N \in D$ and $N \ge j$ for all $j \in F$, then $\Omega = \{f_N < \delta\}$ by monotonicity of the net. Thus $n \ge N$ implies $0 \le f_n \le f_N < \delta$, and it follows as in 4.3.9 that $E(f_n) \downarrow 0$.

By 4.2.10 there is a probability measure $P$ on $\sigma(L'') = \sigma(\mathcal{G})$ such that $E(f) = \int_\Omega f\,dP$ for all $f \in L$. But in fact $\mathcal{G}$ is the class of open sets, so that $\sigma(\mathcal{G}) = \mathcal{B}(\Omega)$. For if $f \in L''$, there is a net of nonnegative continuous functions $f_n \uparrow f$; hence for each real $a$, $\{f > a\} = \bigcup_n \{f_n > a\}$ is an open set. Thus if $G \in \mathcal{G}$, then $I_G \in L''$, so that $G = \{I_G > 0\}$ is open. Conversely if $G$ is open and $\omega \in G$, there is a continuous function $f_\omega : \Omega \to [0, 1]$ such that $f_\omega(\omega) = 1$ and $f_\omega = 0$ on $G^c$. Thus $I_G = \sup_\omega f_\omega$, so that if for each finite set $F \subset G$ we define $g_F = \max\{f_\omega : \omega \in F\}$, and direct the sets $F$ by inclusion, we obtain a monotone net of nonnegative continuous functions increasing to $I_G$. Therefore $I_G \in L''$, so that $G \in \mathcal{G}$.

Thus we have established the existence of a probability measure $P$ on $\mathcal{B}(\Omega)$ satisfying part (a) of 4.3.10; part (b) follows since $P = \mu^*$ on $\sigma(\mathcal{G})$, and $\mathcal{G}$ is the class of open sets.

To prove uniqueness, let $P'$ be another probability measure satisfying (a) and (b) of 4.3.10. If we can show that $P'$ satisfies 4.2.10(b), it will follow from the uniqueness part of 4.2.10 that $P' = P$. Thus let $\{G_n\}$ be a net of open sets with $G_n \uparrow G$; since $G$ is the union of the $G_n$, $G$ is open. By hypothesis, given $\delta > 0$, there is a compact $K \subset G$ such that $P'(G) \le P'(K) + \delta$. Now $G_n \cup K^c \uparrow G \cup K^c = \Omega$; hence by compactness and the monotonicity of $\{G_n\}$, $G_m \cup K^c = \Omega$ for some $m$, so that $K \subset G_m$. Consequently, $P'(G) \le P'(K) + \delta \le P'(G_m) + \delta$; since $\delta$ is arbitrary,
$$P'(G_n) \uparrow P'(G). \qquad ∎$$
Property (b) of 4.3.10 is often referred to as the "regularity" of $P$. Since the word "regular" is used in so many different ways in the literature, let us state exactly what it will mean for us.

4.3.11 Definitions. If $\mu$ is a measure on $\mathcal{B}(\Omega)$, where $\Omega$ is a normal topological space, $\mu$ is said to be regular if for each $A \in \mathcal{B}(\Omega)$,
$$\mu(A) = \inf\{\mu(V) : V \supset A,\ V \text{ open}\}$$
and
$$\mu(A) = \sup\{\mu(C) : C \subset A,\ C \text{ closed}\}.$$
Either one of these conditions implies the other if $\mu$ is finite, and if in addition, $\Omega$ is a compact Hausdorff space, we obtain property (b) of 4.3.10. If $\mu = \mu^+ - \mu^-$ is a finite signed measure on $\mathcal{B}(\Omega)$, $\Omega$ normal, we say that $\mu$ is regular iff $\mu^+$ and $\mu^-$ are regular (equivalently, iff the total variation $|\mu|$ is regular).

The following result connects 4.3.9 and 4.3.10.

4.3.12 Theorem. If $P$ is a probability measure on $\mathcal{A}(\Omega)$, $\Omega$ compact Hausdorff, then $P$ has a unique extension to a regular probability measure on $\mathcal{B}(\Omega)$.

PROOF. Let $E(f) = \int_\Omega f\,dP$, $f \in C(\Omega)$. Then $E$ is a positive linear functional on $L = C(\Omega)$, and thus (see the proof of 4.3.10) if $\{f_n\}$ is a net in $L$ decreasing to 0, then $E(f_n) \downarrow 0$. By 4.3.10 there is a unique regular probability measure $P'$ on $\mathcal{B}(\Omega)$ such that $\int_\Omega f\,dP = \int_\Omega f\,dP'$ for all $f \in L$. But each $f$ in $L$ is measurable: $(\Omega, \mathcal{A}) \to (R, \mathcal{B}(R))$, hence by 1.5.5, $\int_\Omega f\,dP'$ is determined by the values of $P'$ on Baire sets. Thus the condition that $\int_\Omega f\,dP = \int_\Omega f\,dP'$ for all $f \in L$ is equivalent to $P = P'$ on $\mathcal{A}(\Omega)$, by the uniqueness part of 4.3.9. ∎

In 4.3.9 and 4.3.10, the assumption $E(1) = 1$ is just a normalization, and if it is dropped, the results are the same, except that "unique probability measure" is replaced by "unique finite measure." Similarly, 4.3.12 applies equally well to finite measures.
Now let $\Omega$ be a compact Hausdorff space, and consider $L = C(\Omega)$ as a vector space over the reals; $L$ is a Banach space with the sup norm. If $E$ is a positive linear functional on $L$, we can show that $E$ is continuous, and this will allow us to generalize 4.3.9 and 4.3.10 by giving representation theorems for continuous linear functionals on $C(\Omega)$. To prove continuity of $E$, note that if $f \in L$ and $\|f\| \le 1$, then $-1 \le f(\omega) \le 1$ for all $\omega$; hence $-E(1) \le E(f) \le E(1)$, that is, $|E(f)| \le E(1)$. Therefore $\|E\| \le E(1)$; in fact $\|E\| = E(1)$, as may be seen by considering the function that is identically 1.

The representation theorem we are about to prove will involve integration with respect to a finite signed measure $\mu = \mu^+ - \mu^-$; the integral is defined in the obvious way, namely,
$$\int_\Omega f\,d\mu = \int_\Omega f\,d\mu^+ - \int_\Omega f\,d\mu^-,$$
assuming the right side is well-defined.

4.3.13 Theorem. Let $E$ be a continuous linear functional on $C(\Omega)$, $\Omega$ compact Hausdorff.
(a) There is a unique finite signed measure $\mu$ on $\mathcal{A}(\Omega)$ such that $E(f) = \int_\Omega f\,d\mu$ for all $f \in C(\Omega)$.
(b) There is a unique regular finite signed measure $\lambda$ on $\mathcal{B}(\Omega)$ such that $E(f) = \int_\Omega f\,d\lambda$ for all $f \in C(\Omega)$.
Furthermore, any finite signed measure on $\mathcal{A}(\Omega)$ has a unique extension to a regular finite signed measure on $\mathcal{B}(\Omega)$; in particular $\lambda$ is the unique extension of $\mu$.

PROOF. The existence of the desired signed measures $\mu$ and $\lambda$ will follow from 4.3.9 and 4.3.10 if we show that $E$ is the difference of two positive linear functionals $E^+$ and $E^-$. If $f \ge 0$, $f \in C(\Omega)$, define
$E^+(f) = \sup\{E(g) : 0$ ...

... $n\}$, which can be made less than $\delta/2$ for sufficiently large $n$. If $g$ is continuous and $\mu\{f_n \ne g\} < \delta/2$, then $\mu\{f \ne g\} < \delta$, as desired. Finally, if $|f| \le M < \infty$, and $g$ approximates $f$ as above, define $g_1(\omega) = g(\omega)$ if $|g(\omega)| \le M$, $g_1(\omega) = M g(\omega)/|g(\omega)|$ if $|g(\omega)| > M$. Then $g_1$ is continuous, $|g_1| \le M$, and $f(\omega) = g(\omega)$ implies $|g(\omega)| \le M$; hence $g_1(\omega) = g(\omega) = f(\omega)$. Therefore $\mu\{f \ne g_1\} \le \mu\{f \ne g\} < \delta$, completing the proof. ∎
4.3.17 Corollaries. Assume the hypothesis of 4.3.16.
(a) There is a sequence of continuous complex-valued functions $f_n$ on $\Omega$ converging to $f$ a.e. $[\mu]$, with $|f_n| \le \sup |f|$ for all $n$.
(b) Given $\varepsilon > 0$, there is a closed set $C \subset \Omega$ and a continuous complex-valued function $g$ on $\Omega$ such that $\mu(C) \ge \mu(\Omega) - \varepsilon$ and $f = g$ on $C$, hence the restriction of $f$ to $C$ is continuous. If $\mu$ has the additional property that $\mu(A) = \sup\{\mu(K) : K \subset A,\ K \text{ compact}\}$ for each $A \in \mathcal{B}(\Omega)$, then $C$ may be taken as compact.

PROOF. (a) By 4.3.16, there is a continuous function $f_n$ such that $|f_n| \le M = \sup |f|$ and $\mu\{f_n \ne f\} < 2^{-n}$. If $A_n = \{f_n \ne f\}$ and $A = \limsup_n A_n$, then $\mu(A) = 0$ by the Borel-Cantelli lemma. But if $\omega \notin A$, then $f_n(\omega) = f(\omega)$ for sufficiently large $n$.
(b) By 4.3.16, there is a continuous $g$ such that $\mu\{f \ne g\} < \varepsilon/2$. By regularity of $\mu$, there is a closed set $C \subset \{f = g\}$ with $\mu(C) \ge \mu\{f = g\} - \varepsilon/2$. The set $C$ has the desired properties. The proof under the assumption of approximation by compact subsets is the same, with $C$ compact rather than closed. ∎

Corollary 4.3.17(b) is called Lusin's theorem.
Problems
1. Let $F$ be a closed subset of the metric space $\Omega$. Define $f_n(\omega) = e^{-n\,d(\omega, F)}$ where $d(\omega, F) = \inf\{d(\omega, y) : y \in F\}$. Show that the $f_n$ are continuous and $f_n \downarrow I_F$. Use this to give a direct proof (avoiding 4.3.2) that in a metric space, the Baire and Borel sets coincide.
2. Give an example of a measure space $(\Omega, \mathcal{F}, \mu)$, where $\Omega$ is a metric space and $\mathcal{F} = \mathcal{B}(\Omega)$, such that for some $A \in \mathcal{F}$, $\mu(A) \ne \sup\{\mu(K) : K \subset A,\ K \text{ compact}\}$.
3. In 4.3.14, assume in addition that $\Omega$ is locally compact, and that $\mu(A) = \sup\{\mu(K) : K \text{ compact subset of } A\}$ for all Borel sets $A$. Show that the continuous functions with compact support (that is, the continuous functions that vanish outside a compact subset) are dense in $L^p(\Omega, \mathcal{F}, \mu)$, $0 < p < \infty$. Also, as in 4.3.14, if $f \in L^p$ is approximated by the continuous function $g$ with compact support, $g$ may be chosen so that $\sup |g| \le \sup |f|$.
4. (a) Let $\Omega$ be a normal topological space, and let $H$ be the smallest class of real-valued functions on $\Omega$ that contains the continuous functions and is closed under pointwise limits of monotone sequences. Show that $H$ is the class of Baire measurable functions, that is, $H$ consists of all $f : (\Omega, \mathcal{A}) \to (R, \mathcal{B}(R))$ (use 4.1.4).
(b) If $H$ is as in part (a) and $\sigma(H)$ is the smallest $\sigma$-field $\mathcal{G}$ of subsets of $\Omega$ making all functions in $H$ measurable (relative to $\mathcal{G}$ and $\mathcal{B}(R)$), show that $\sigma(H) = \mathcal{A}(\Omega)$; hence $\sigma(H)$ is the same as $\sigma(C(\Omega))$.
5. Let $\Omega$ be a normal topological space, and let $K_0$ be the class of all continuous real-valued functions on $\Omega$. Having defined $K_\beta$ for all ordinals $\beta$ less than the ordinal $\alpha$, define
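$K_\alpha = \bigcup\{K_\beta : \beta$ ...

... $\displaystyle \sum_{i=1}^{k} P(A_k \cap (A_i')^c) \le \sum_{i=1}^{k} P(A_i - A_i') < \sum_{i=1}^{k} \varepsilon/2^{i+1} < \varepsilon/2.$

Since $D_k \subset A_k' \subset A_k$, $P(A_k - D_k) = P(A_k) - P(D_k)$, consequently $P(D_k) > P(A_k) - \varepsilon/2$. In particular, $D_k$ is not empty.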
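Now pick $x^k \in D_k$, $k = 1, 2, \ldots$. Say $A_1' = C^{n_1}(v_1)$ (note all $D_k \subset A_1'$). Consider the sequence
$$(x^1_{t_1}, \ldots, x^1_{t_{n_1}}),\ (x^2_{t_1}, \ldots, x^2_{t_{n_1}}),\ (x^3_{t_1}, \ldots, x^3_{t_{n_1}}),\ \ldots,$$
that is, $x^1_{v_1}, x^2_{v_1}, x^3_{v_1}, \ldots$.

Since the $x^n_{v_1}$ belong to $C^{n_1}$, a compact subset of $\Omega_{v_1}$, we have a convergent subsequence $x^{r_{1n}}_{v_1}$ approaching some $x_{v_1} \in C^{n_1}$. If $A_2' = C^{n_2}(v_2)$ (so $D_k \subset A_2'$ for $k \ge 2$), consider the sequence $x^{r_{11}}_{v_2}, x^{r_{12}}_{v_2}, \ldots \in C^{n_2}$ (eventually), and extract a convergent subsequence $x^{r_{2n}}_{v_2} \to x_{v_2} \in C^{n_2}$. Note that $(x^{r_{2n}}_{v_2})_{v_1} = x^{r_{2n}}_{v_1}$; as $n \to \infty$, the left side approaches $(x_{v_2})_{v_1}$, and since $\{r_{2n}\}$ is a subsequence of $\{r_{1n}\}$, the right side approaches $x_{v_1}$. Hence $(x_{v_2})_{v_1} = x_{v_1}$.

Continue in this fashion; at step $i$ we have a subsequence $x^{r_{in}}_{v_i} \to x_{v_i} \in C^{n_i}$, and
$$(x_{v_i})_{v_j} = x_{v_j} \qquad \text{for } j < i.$$
Pick any $\omega \in \prod_{t \in T} \Omega_t$ such that $\omega_{v_j} = x_{v_j}$ for all $j = 1, 2, \ldots$ (such a choice is possible since $(x_{v_i})_{v_j} = x_{v_j}$, $j < i$). Then $\omega_{v_j} \in C^{n_j}$ for each $j$; hence
$$\omega \in \bigcap_{j=1}^{\infty} A_j' \subset \bigcap_{j=1}^{\infty} A_j = \emptyset,$$
a contradiction. Thus $P$ extends to a measure on $\mathcal{F}$, and by construction, $\pi_v(P) = P_v$ for all $v$.

Finally, if $P$ and $Q$ are two probability measures on $\mathcal{F}$ such that $\pi_v(P) = \pi_v(Q)$ for all finite $v \subset T$, then for any $B^n \in \mathcal{F}_v$,
$$P(B^n(v)) = [\pi_v(P)](B^n) = [\pi_v(Q)](B^n) = Q(B^n(v)).$$
Thus $P$ and $Q$ agree on measurable cylinders, and hence on $\mathcal{F}$ by the uniqueness part of the Carathéodory extension theorem. ∎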
Problems

In Problems 1-7, $(\Omega_t, \mathcal{F}_t)$ is a measurable space for each $t \in T$, and $\Omega = \prod_{t \in T} \Omega_t$, $\mathcal{F} = \prod_{t \in T} \mathcal{F}_t$.

1. Let $p_t$ be the projection map of $\Omega$ onto $\Omega_t$, that is, $p_t(\omega) = \omega(t)$. If $(S, \mathcal{S})$ is a measurable space and $f : S \to \Omega$, show that $f$ is measurable iff $p_t \circ f$ is measurable for all $t$.
2. If all $\Omega_t = R$, all $\mathcal{F}_t = \mathcal{B}(R)$, and $T$ is a (nonempty) subset of $R$, how many sets are there in $\mathcal{F}$?
3. Show that if $B \in \mathcal{F}$, then membership in $B$ is determined by a countable number of coordinates, that is, there is a countable set $T_0 \subset T$ and a set $B_0 \in \mathcal{F}_{T_0} = \prod_{t \in T_0} \mathcal{F}_t$ such that $\omega \in B$ iff $\omega_{T_0} \in B_0$, where $\omega_{T_0} = (\omega(t), t \in T_0)$.
4. If $T$ is an open interval of reals and $\Omega_t = R$ (or $\bar{R}$), $\mathcal{F}_t = \mathcal{B}(R)$ (or $\mathcal{B}(\bar{R})$) for all $t$, use Problem 3 to show that the following sets do not belong to $\mathcal{F}$:
(a) $\{\omega : \omega \text{ is continuous at } t_0\}$, where $t_0$ is a fixed element of $T$.
(b) $\{\omega : \sup_{a \le t \le b} \omega(t) < c\}$, where $c \in R$ and $[a, b] \subset T$.
5. Assume each $\Omega_t$ is a compact metric space, with $\mathcal{F}_t$ the Baire (= Borel) subsets of $\Omega_t$. Then by the Tychonoff theorem, $\Omega$ is compact in the topology of pointwise convergence. Show that $\mathcal{F}$ is the class of Baire sets of $\Omega$, in other words, $\mathcal{A}(\prod_{t \in T} \Omega_t) = \prod_{t \in T} \mathcal{A}(\Omega_t)$, as follows:
(a) If $A = \{\omega \in \Omega : \omega(t) \in F\}$, $F$ a closed subset of $\Omega_t$, show that $A = \{\omega \in \Omega : f(\omega) = 0\}$ for some $f \in C(\Omega)$. Conclude that $\mathcal{F} \subset \mathcal{A}(\Omega)$.
(b) Use the Stone-Weierstrass theorem to show that the functions $f \in C(\Omega)$ depending on only one coordinate [that is, $f(\omega) = g_t(\omega(t))$ for some $t$, where $g_t \in C(\Omega_t)$] generate an algebra that is dense in $C(\Omega)$. Conclude that $\mathcal{A}(\Omega) \subset \mathcal{F}$.
6. Assume $T$ is an open interval of reals, and $(\Omega_t, \mathcal{F}_t) = (\bar{R}, \mathcal{B}(\bar{R}))$ for all $t$; thus $\Omega = \bar{R}^T$, $\mathcal{F} = \mathcal{B}(\bar{R})^T$. Let $A = \{\omega \in \Omega : \omega \text{ is continuous at } t_0\}$, where $t_0$ is a fixed point of $T$.
(a) Show that $A$ is an $F_{\sigma\delta}$ (a countable intersection of $F_\sigma$ sets); hence $A \in \mathcal{B}(\Omega)$.
(b) Show that $A \notin \mathcal{A}(\Omega)$, so that we have an example of a compact Hausdorff space in which the Baire and Borel sets do not coincide.
7. (Alternative proof of the Kolmogorov extension theorem†) Assume the hypothesis of 4.4.3, with the stronger condition that each $\Omega_t$ is a compact metric space. Put the topology of pointwise convergence on $\Omega$.
(a) Let $A$ be the set of functions in $C(\Omega)$ that depend on only a finite number of coordinates; that is, there is a finite set $v \subset T$ and a continuous $g : \Omega_v \to R$ such that $f(\omega) = g(\omega_v)$, $\omega \in \Omega$. Use the Stone-Weierstrass theorem to show that $A$ is dense in $C(\Omega)$.
(b) If $f \in A$, define $E(f) = \int_{\Omega_v} g\,dP_v$. Show that $E$ is well-defined and extends uniquely to a positive linear functional on $C(\Omega)$. Since $E(1) = 1$, the Riesz representation theorem 4.3.9 yields a unique probability measure $P$ on $\mathcal{A}(\Omega)$ ($= \prod_{t \in T} \mathcal{F}_t$ by Problem 5) such that $E(f) = \int_\Omega f\,dP$ for all $f \in C(\Omega)$.
(c) Let $v$ be a fixed finite subset of $T$, and let $H$ be the collection of all functions $g : (\Omega_v, \mathcal{F}_v) \to (R, \mathcal{B}(R))$ such that if $f(\omega) = g(\omega_v)$, $\omega \in \Omega$, then $\int_\Omega f\,dP = \int_{\Omega_v} g\,dP_v$. Show that $I_A \in H$ for each open set $A \subset \Omega_v$, and then show that $H$ contains all bounded Borel measurable functions on $(\Omega_v, \mathcal{F}_v)$. Conclude that $\pi_v(P) = P_v$. (Uniqueness of $P$ is proved as in 4.4.3.)
† Nelson, E., Ann. of Math. 69, 630 (1959).
8. The metric space $\Omega$ is said to be Borel equivalent to a subset of the metric space $\Omega'$ iff there is a one-to-one map $f : \Omega \to \Omega'$ such that $E = f(\Omega) \in \mathcal{B}(\Omega')$ and $f$ and $f^{-1}$ are Borel measurable. [Measurability of $f^{-1}$ means that $f^{-1} : (E, \mathcal{B}(E)) \to (\Omega, \mathcal{B}(\Omega))$.]
(a) Let $\Omega$ be a complete separable metric space with metric $d$. [It may be assumed without loss of generality that $d(x, y) \le 1$ for all $x, y$, since the metric $d' = d/(1 + d)$ induces the same topology as $d$ and is also complete.] Denote by $[0, 1]^\infty$ the space of all sequences of real numbers with components in $[0, 1]$, with the topology of pointwise convergence. (This space is metrizable; explicitly, we may take the metric as
$$d_0(x, y) = \sum_{n=1}^{\infty} \frac{1}{2^n} \frac{|x_n - y_n|}{1 + |x_n - y_n|}.)$$
If $D = \{\omega_1, \omega_2, \ldots\}$ is a countable dense subset of $\Omega$, define $f : \Omega \to [0, 1]^\infty$ by $f(\omega) = (d(\omega, \omega_n), n = 1, 2, \ldots)$. Show that $f$ is continuous and one-to-one, with a continuous inverse.
(b) Show that $f(\Omega)$ is a Borel subset of $[0, 1]^\infty$; thus $\Omega$ is Borel equivalent to a subset of $[0, 1]^\infty$.
(c) Let $S = \{0, 1\}$, and let $\mathcal{S}$ consist of all subsets of $S$. Show that there is a map $g : ([0, 1), \mathcal{B}[0, 1)) \to (S^\infty, \mathcal{S}^\infty)$ such that $g$ is one-to-one, $g[0, 1) \in \mathcal{S}^\infty$, and both $g$ and $g^{-1}$ are measurable.
(d) Show that $([0, 1], \mathcal{B}[0, 1])$ and $(S^\infty, \mathcal{S}^\infty)$ are equivalent, that is, there is a map $h : [0, 1] \to S^\infty$ such that $h$ is one-to-one onto, and $h$ and $h^{-1}$ are measurable.
(e) If $(\Omega_n, \mathcal{F}_n)$ is equivalent to $(S_n, \mathcal{S}_n)$, with associated map $h_n$, $n = 1, 2, \ldots$, show that $(\prod_n \Omega_n, \prod_n \mathcal{F}_n)$ is equivalent to $(\prod_n S_n, \prod_n \mathcal{S}_n)$. Thus by (d), $([0, 1]^\infty, (\mathcal{B}[0, 1])^\infty)$ is equivalent to $(S^\infty, \mathcal{S}^\infty)$.
Now by 4.4.2, $(\mathcal{B}[0, 1])^\infty$ is the minimal $\sigma$-field over the open sets of $[0, 1]^\infty$, that is, $(\mathcal{B}[0, 1])^\infty$ is the class of Borel sets $\mathcal{B}([0, 1]^\infty)$. It follows from these results that if $\Omega$ is a complete, separable metric space, $\Omega$ is Borel equivalent to a (Borel) subset of $[0, 1]$.
4.5 Weak Convergence of Measures
By the Riesz representation theorem, a continuous linear functional on $C(\Omega)$, where $\Omega$ is compact Hausdorff, may be identified with a regular finite signed measure on $\mathcal{B}(\Omega)$. Thus if $\{\mu_n\}$ is a sequence of such measures, weak* convergence of the sequence to the measure $\mu$ means that $\int_\Omega f\,d\mu_n \to \int_\Omega f\,d\mu$ for all $f \in C(\Omega)$. In this section we investigate this type of convergence in a somewhat different context; $\Omega$ will be a metric space, not necessarily compact, and all measures will be nonnegative. The results form the starting point for the study of the central limit theorem of probability.

4.5.1 Theorem. Let $\mu, \mu_1, \mu_2, \ldots$ be finite measures on the Borel sets of a metric space $\Omega$. The following conditions are equivalent:
(a) $\int_\Omega f\,d\mu_n \to \int_\Omega f\,d\mu$ for all bounded continuous $f : \Omega \to R$.
(b) $\liminf_{n \to \infty} \int_\Omega f\,d\mu_n \ge \int_\Omega f\,d\mu$ for all bounded lower semicontinuous $f : \Omega \to R$.
(b') $\limsup_{n \to \infty} \int_\Omega f\,d\mu_n \le \int_\Omega f\,d\mu$ for all bounded upper semicontinuous $f : \Omega \to R$.
(c) $\int_\Omega f\,d\mu_n \to \int_\Omega f\,d\mu$ for all bounded $f : (\Omega, \mathcal{B}(\Omega)) \to (R, \mathcal{B}(R))$ such that $f$ is continuous a.e. $[\mu]$.
(d) $\liminf_{n \to \infty} \mu_n(A) \ge \mu(A)$ for every open set $A \subset \Omega$, and $\mu_n(\Omega) \to \mu(\Omega)$.
(d') $\limsup_{n \to \infty} \mu_n(A) \le \mu(A)$ for every closed set $A \subset \Omega$, and $\mu_n(\Omega) \to \mu(\Omega)$.
(e) $\mu_n(A) \to \mu(A)$ for every $A \in \mathcal{B}(\Omega)$ such that $\mu(\partial A) = 0$ ($\partial A$ denotes the boundary of $A$).
PROOF. (a) implies (b): If $g \le f$ and $g$ is bounded continuous,
$$\liminf_{n \to \infty} \int_\Omega f\,d\mu_n \ge \liminf_{n \to \infty} \int_\Omega g\,d\mu_n = \int_\Omega g\,d\mu \qquad \text{by (a)}.$$
But since $f$ is lower semicontinuous (LSC), it is the limit of an increasing sequence of continuous functions, and if $|f| \le M$, all functions in the sequence can also be taken less than or equal to $M$ in absolute value. (See the appendix on general topology, Section A6, for the basic properties of semicontinuous functions.) Thus if we take the sup over $g$ in the above equation, we obtain (b).

(b) is equivalent to (b'): Note that $f$ is LSC iff $-f$ is upper semicontinuous (USC).

(b) implies (c): Let $\underline{f}$ be the lower envelope of $f$ (the sup of all LSC functions $g$ such that $g \le f$) and $\bar{f}$ the upper envelope (the inf of all USC functions $g$ such that $g \ge f$). Since $\underline{f}(x) = \liminf_{y \to x} f(y)$ and $\bar{f}(x) = \limsup_{y \to x} f(y)$, continuity of $f$ at $x$ implies $\underline{f}(x) = f(x) = \bar{f}(x)$. Furthermore, $\underline{f}$ is LSC and $\bar{f}$ is USC. Thus if $f$ is bounded and continuous a.e. $[\mu]$,
$$\begin{aligned}
\int_\Omega f\,d\mu = \int_\Omega \underline{f}\,d\mu &\le \liminf_{n \to \infty} \int_\Omega \underline{f}\,d\mu_n && \text{by (b)} \\
&\le \liminf_{n \to \infty} \int_\Omega f\,d\mu_n && \text{since } \underline{f} \le f \\
&\le \limsup_{n \to \infty} \int_\Omega f\,d\mu_n \\
&\le \limsup_{n \to \infty} \int_\Omega \bar{f}\,d\mu_n && \text{since } f \le \bar{f} \\
&\le \int_\Omega \bar{f}\,d\mu && \text{by (b')} \\
&= \int_\Omega f\,d\mu,
\end{aligned}$$
proving (c).

(c) implies (d): Clearly (c) implies (a), which in turn implies (b). If $A$ is open, then $I_A$ is LSC, so by (b), $\liminf_{n \to \infty} \mu_n(A) \ge \mu(A)$. Now $I_\Omega \equiv 1$, so $\mu_n(\Omega) \to \mu(\Omega)$ by (c).
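(d) is equivalent to (d'): Take complements.

(d) implies (e): Let $A^\circ$ be the interior of $A$, $\bar{A}$ the closure of $A$. Then
$$\begin{aligned}
\limsup_{n \to \infty} \mu_n(A) \le \limsup_{n \to \infty} \mu_n(\bar{A}) &\le \mu(\bar{A}) && \text{by (d')} \\
&= \mu(A) && \text{by hypothesis.}
\end{aligned}$$
Also, using (d),
$$\liminf_{n \to \infty} \mu_n(A) \ge \liminf_{n \to \infty} \mu_n(A^\circ) \ge \mu(A^\circ) = \mu(A).$$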
(e) implies (a): Let $f$ be a bounded continuous function on $\Omega$. If $|f| < M$, let $A = \{c \in R : \mu(f^{-1}\{c\}) \ne 0\}$; $A$ is countable since the $f^{-1}\{c\}$ are disjoint and $\mu$ is finite. Construct a partition of $[-M, M]$, say $-M = t_0 < t_1 < \cdots < t_j = M$, with $t_i \notin A$, $i = 0, 1, \ldots, j$ ($M$ may be increased if necessary). If $B_i = \{x : t_i \le f(x) < t_{i+1}\}$, $i = 0, 1, \ldots, j - 1$, it follows from (e) that
$$\sum_{i=0}^{j-1} t_i \mu_n(B_i) \to \sum_{i=0}^{j-1} t_i \mu(B_i).$$
[Since $f^{-1}(t_i, t_{i+1})$ is open, $\partial f^{-1}[t_i, t_{i+1}) \subset f^{-1}\{t_i, t_{i+1}\}$, and $\mu f^{-1}\{t_i, t_{i+1}\} = 0$ since $t_i, t_{i+1} \notin A$.] Now
$$\left| \int_\Omega f\,d\mu_n - \int_\Omega f\,d\mu \right| \le \left| \int_\Omega f\,d\mu_n - \sum_{i=0}^{j-1} t_i \mu_n(B_i) \right| + \left| \sum_{i=0}^{j-1} t_i \mu_n(B_i) - \sum_{i=0}^{j-1} t_i \mu(B_i) \right| + \left| \sum_{i=0}^{j-1} t_i \mu(B_i) - \int_\Omega f\,d\mu \right|.$$
The first term on the right may be written as
$$\left| \sum_{i=0}^{j-1} \int_{B_i} (f(x) - t_i)\,d\mu_n(x) \right|$$
and this is bounded by $\max_i (t_{i+1} - t_i)\,\mu_n(\Omega)$, which can be made arbitrarily small by choice of the partition since $\mu_n(\Omega) \to \mu(\Omega) < \infty$ by (e). The third term on the right is bounded by $\max_i (t_{i+1} - t_i)\,\mu(\Omega)$, which can also be made arbitrarily small. The second term approaches 0 as $n \to \infty$, proving (a). ∎

4.5.2 Comments. Another condition equivalent to those of 4.5.1 is that $\int_\Omega f\,d\mu_n \to \int_\Omega f\,d\mu$ for all bounded uniformly continuous $f : \Omega \to R$ (see Problem 1).

The proof of 4.5.1 works equally well if the sequence $\{\mu_n\}$ is replaced by a net.

The convergence described in 4.5.1 is sometimes called weak or vague convergence of measures. We shall write $\mu_n \xrightarrow{w} \mu$. If $\mu_n$ and $\mu$ are defined on $\mathcal{B}(R)$, there are corresponding distribution functions $F_n$ and $F$ on $R$. We may relate convergence of measures to convergence of distribution functions.

4.5.3 Definition. A continuity point of a distribution function $F$ on $R$ is a point $x \in R$ such that $F$ is continuous at $x$, or $\pm\infty$ (thus by convention, $\infty$ and $-\infty$ are continuity points).
4.5.4 Theorem. Let $\mu, \mu_1, \mu_2, \ldots$ be finite measures on $\mathcal{B}(R)$, with corresponding distribution functions $F, F_1, F_2, \ldots$. The following are equivalent:
(a) $\mu_n \xrightarrow{w} \mu$.
(b) $F_n(a, b] \to F(a, b]$ at all continuity points $a, b$ of $F$, where $F(a, b] = F(b) - F(a)$, $F(\infty) = \lim_{x \to \infty} F(x)$, $F(-\infty) = \lim_{x \to -\infty} F(x)$.

If all distribution functions are 0 at $-\infty$, condition (b) is equivalent to the statement that $F_n(x) \to F(x)$ at all points $x \in R$ at which $F$ is continuous, and $F_n(\infty) \to F(\infty)$.
PROOF. (a) implies (b): If $a$ and $b$ are continuity points of $F$ in $R$, then $(a, b]$ is a Borel set whose boundary has $\mu$-measure 0. By 4.5.1(e), $\mu_n(a, b] \to \mu(a, b]$, that is, $F_n(a, b] \to F(a, b]$. If $a = -\infty$, the argument is the same, and if $b = \infty$, then $(a, \infty)$ is a Borel set whose boundary has $\mu$-measure 0, and the proof proceeds as before.

(b) implies (a): Let $A$ be an open subset of $R$; write $A$ as the disjoint union of open intervals $I_1, I_2, \ldots$. Then
$$\liminf_{n \to \infty} \mu_n(A) = \liminf_{n \to \infty} \sum_k \mu_n(I_k) \ge \sum_k \liminf_{n \to \infty} \mu_n(I_k) \qquad \text{by Fatou's lemma.}$$
Let $\varepsilon > 0$ be given. For each $k$, let $I_k'$ be a right-semiclosed subinterval of $I_k$ such that the endpoints of $I_k'$ are continuity points of $F$, and $\mu(I_k') \ge \mu(I_k) - \varepsilon 2^{-k}$; the $I_k'$ can be chosen since $F$ has only countably many discontinuities. Then
$$\liminf_{n \to \infty} \mu_n(I_k) \ge \liminf_{n \to \infty} \mu_n(I_k') = \mu(I_k') \qquad \text{by (b).}$$
Thus
$$\liminf_{n \to \infty} \mu_n(A) \ge \sum_k \mu(I_k') \ge \sum_k \mu(I_k) - \varepsilon = \mu(A) - \varepsilon.$$
Since $\varepsilon$ is arbitrary, we have $\mu_n \xrightarrow{w} \mu$ by 4.5.1(d). ∎
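A standard example shows why 4.5.1 and 4.5.4 single out open sets, closed sets, and continuity points. The sketch below is an illustration only: it assumes $\mu_n$ is the unit mass at $1/n$ and $\mu$ the unit mass at 0, takes $f = \arctan$ as a sample bounded continuous function, and represents sets by membership predicates; the helper names `dirac_cdf` and `dirac_measure` are hypothetical.

```python
import math

def dirac_cdf(point):
    """Distribution function of the unit mass at `point`."""
    return lambda x: 1.0 if x >= point else 0.0

def dirac_measure(point):
    """Unit mass at `point`, acting on sets given as membership predicates."""
    return lambda contains: 1.0 if contains(point) else 0.0

ns = [1, 10, 100, 1000, 10 ** 6]
F = dirac_cdf(0.0)          # distribution function of the limit measure
mu = dirac_measure(0.0)     # the limit measure itself

# 4.5.1(a): the integral of f against the unit mass at 1/n is f(1/n), which tends to f(0) = 0.
integrals = [math.atan(1.0 / n) for n in ns]

# 4.5.4(b): F_n converges to F at the continuity point x = 0.25 of F ...
at_continuity_point = [dirac_cdf(1.0 / n)(0.25) for n in ns]
# ... but not at x = 0, where F jumps.
at_jump = [dirac_cdf(1.0 / n)(0.0) for n in ns]

# 4.5.1(d) and (d') can hold with strict inequality.
open_set = lambda x: 0.0 < x < 2.0      # the open interval (0, 2)
closed_set = lambda x: x == 0.0         # the closed set {0}
mass_open = [dirac_measure(1.0 / n)(open_set) for n in ns]
mass_closed = [dirac_measure(1.0 / n)(closed_set) for n in ns]
```

Here $\mu_n(0, 2) = 1$ for every $n$ while $\mu(0, 2) = 0$, and $\mu_n\{0\} = 0$ while $\mu\{0\} = 1$; both sets have boundary of positive $\mu$-measure, so 4.5.1(e) does not apply to them.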
Condition (b) of 4.5.4 is sometimes called weak convergence of the sequence $\{F_n\}$ to $F$, written $F_n \xrightarrow{w} F$.

Problems
1. (a) If $F$ is a closed subset of the metric space $\Omega$, show that $I_F$ is the limit of a decreasing sequence of uniformly continuous functions $f_n$, with $0 \le f_n$ ...

... $V$ is open iff for every net $x_n \to x \in V$, we have $x_n \in V$ eventually. [The "only if" part follows from the definition of convergence; for the "if" part, apply A2.3(b) to $V^c$.]

A2.5 Definitions. Let $\{x_n, n \in D\}$ be a net, and suppose that we are given a directed set $E$ and a map $k \to n_k$ of $E$ into $D$. Then $\{x_{n_k}, k \in E\}$ is called a subnet of $\{x_n, n \in D\}$, provided that "as $k$ becomes large, so does $n_k$"; that is, given $n_0 \in D$, there is a $k_0 \in E$ such that $k \ge k_0$ implies $n_k \ge n_0$. If $E = D =$ the positive integers, we obtain the usual notion of a subsequence.

If $\{x_n, n \in D\}$ is a net in the topological space $\Omega$, the point $x \in \Omega$ is called an accumulation point of the net if for each neighborhood $U$ of $x$, $x_n$ is frequently in $U$; in other words, given $n \in D$, there is an $m \in D$ with $m \ge n$ and $x_m \in U$.

A2.6 Theorem. Let $\{x_n, n \in D\}$ be a net in the topological space $\Omega$. If $x \in \Omega$, $x$ is an accumulation point of $\{x_n\}$ iff there is a subnet $\{x_{n_k}, k \in E\}$ converging to $x$.

PROOF. If $x_{n_k} \to x$, $U \in \mathcal{U}(x)$ and $n \in D$, then for some $k_0 \in E$, we have $x_{n_k} \in U$ for $k \ge k_0$. But by definition of subnet, there is a $k_1 \in E$ such that $k \ge k_1$ implies $n_k \ge n$. Thus if $k \ge k_0$ and $k \ge k_1$, we have $n_k \ge n$ and $x_{n_k} \in U$, proving that $x$ is an accumulation point of $\{x_n\}$.

Conversely, let $x$ be an accumulation point of $\{x_n, n \in D\}$. Let $E$ be the collection of pairs $(n, U)$, where $n \in D$, $U$ is a neighborhood of $x$, and $x_n \in U$. Direct $E$ by setting $(n, U) \le (m, V)$ iff $n \le m$ and $U \supset V$. If $k = (m, V) \in E$ let $n_k = m$. Given $n$ and $U$, then if $k = (m, V) \ge (n, U)$, we have $n_k = m \ge n$, so that $\{x_{n_k}, k \in E\}$ is a subnet of $\{x_n, n \in D\}$. Now if $U$ is a neighborhood of $x$, then $x_n \in U$ for some $n \in D$. If $k = (m, V) \ge (n, U)$, then $x_{n_k} = x_m \in V \subset U$; therefore $x_{n_k} \to x$. ∎
For some purposes, it is more convenient to specify convergence in a topological space by means of filters rather than nets. If $\{x_n, n \in D\}$ is a net in $\Omega$, and $a \in D$, let $T_a = \{n \in D : n \ge a\}$, and let $x(T_a)$ be the set of all $x_n$, $n \ge a$. The $x(T_a)$, $a \in D$, are called the tails of the net. The collection $\mathcal{A}$ of tails is an example of a filterbase, which we now define.

A2.7 Definitions and Comments. Let $\mathcal{A}$ be a nonempty family of subsets of a set $\Omega$. Then $\mathcal{A}$ is called a filterbase in $\Omega$ if
(a) each $U \in \mathcal{A}$ is nonempty;
(b) if $U, V \in \mathcal{A}$, there is a $W \in \mathcal{A}$ with $W \subset U \cap V$.

If $\mathcal{F}$ is a nonempty family of subsets of $\Omega$ such that
(c) each $U \in \mathcal{F}$ is nonempty,
(d) if $U, V \in \mathcal{F}$, then $U \cap V \in \mathcal{F}$, and
(e) if $U \in \mathcal{F}$ and $U \subset V$, then $V \in \mathcal{F}$,
then $\mathcal{F}$ is called a filter in $\Omega$. If $\mathcal{A}$ is a filterbase, then $\mathcal{F} = \{U \subset \Omega : U \supset V \text{ for some } V \in \mathcal{A}\}$ is a filter, called the filter generated by $\mathcal{A}$. If $\mathcal{A}$ is the collection of neighborhoods of a given point $x$ in a topological space, $\mathcal{A}$ is a filterbase, and the filter generated by $\mathcal{A}$ is the system of overneighborhoods of $x$.

A filterbase $\mathcal{A}$ in a topological space $\Omega$ is said to converge to the point $x$ (notation $\mathcal{A} \to x$) if for each $U \in \mathcal{U}(x)$ there is a set $A \in \mathcal{A}$ such that $A \subset U$. A filter $\mathcal{F}$ in $\Omega$ is said to converge to $x$ if each $U \in \mathcal{U}(x)$ belongs to $\mathcal{F}$. Thus a filterbase $\mathcal{A}$ converges to $x$ iff the filter generated by $\mathcal{A}$ converges to $x$.

If $\{x_n, n \in D\}$ is a net, then $x_n \to x$ iff for each $U \in \mathcal{U}(x)$ we have $x(T_a) \subset U$ for some $a \in D$, that is, $x_n \to x$ iff the associated filterbase converges to $x$. Convergence in a topological space may be described using filterbases instead of nets. The analog of Theorem A2.3 is the following:

A2.8 Theorem. Let $B$ be a subset of the topological space $\Omega$.
(a) A point $x \in \Omega$ belongs to $\overline{B}$ iff there is a filterbase $\mathcal{A}$ in $B$ such that $\mathcal{A} \to x$.
(b) $B$ is closed iff for every filterbase $\mathcal{A}$ in $B$ such that $\mathcal{A} \to x$, we have $x \in B$.
(c) A point $x \in \Omega$ is a cluster point of $B$ iff there is a filterbase in $B - \{x\}$ converging to $x$.

PROOF. (a) If $\mathcal{A} \to x$ and $U \in \mathcal{U}(x)$, then $A \subset U$ for some $A \in \mathcal{A}$; in particular, $U \cap B \ne \emptyset$; thus $x \in \overline{B}$. Conversely, if $x \in \overline{B}$, then $U \cap B \ne \emptyset$ for each $U \in \mathcal{U}(x)$. Let $\mathcal{A}$ be the collection of sets $U \cap B$, $U \in \mathcal{U}(x)$. Then $\mathcal{A}$ is a filterbase in $B$ and $\mathcal{A} \to x$.
(b) If $B$ is closed, $\mathcal{A}$ is a filterbase in $B$, and $\mathcal{A} \to x$, then $x \in \overline{B}$ by (a), hence $x \in B$ by hypothesis. Conversely, if $B$ is not closed and $x \in \overline{B} - B$, by (a) there is a filterbase $\mathcal{A}$ in $B$ with $\mathcal{A} \to x$. Since $x \notin B$, the result follows.
(c) If there is such a filterbase $\mathcal{A}$ and $U \in \mathcal{U}(x)$, then $U \supset A$ for some $A \in \mathcal{A}$; in particular, $U \cap (B - \{x\}) \ne \emptyset$, so $x$ is a cluster point of $B$. Conversely, if $x$ is a cluster point of $B$, let $\mathcal{A}$ consist of all sets $U \cap (B - \{x\})$, $U \in \mathcal{U}(x)$. Then $\mathcal{A}$ is a filterbase in $B - \{x\}$ and $\mathcal{A} \to x$. ∎
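The filterbase used in the proof of A2.8(a) can be checked by hand on a finite space. The following is a minimal sketch, assuming a small hypothetical topology on {0, 1, 2} and the subset B = {1}; the function names and the example space are illustrative and do not come from the text.

```python
def is_filterbase(family):
    """A2.7 (a)-(b): a nonempty family of nonempty sets in which every
    pairwise intersection contains some member of the family."""
    family = list(family)
    if not family or any(len(s) == 0 for s in family):
        return False
    return all(any(w <= (u & v) for w in family)
               for u in family for v in family)


def converges_to(family, x, open_sets):
    """The family converges to x if every open neighborhood of x
    contains some member of the family."""
    return all(any(a <= u for a in family) for u in open_sets if x in u)


def closure(b, points, open_sets):
    """x lies in the closure of b iff every open neighborhood of x meets b."""
    return frozenset(x for x in points
                     if all(u & b for u in open_sets if x in u))


# Hypothetical example (an assumption, not from the text): a topology on {0, 1, 2}.
POINTS = frozenset({0, 1, 2})
OPEN_SETS = [frozenset(), frozenset({0}), frozenset({0, 1}), POINTS]
B = frozenset({1})


def trace_filterbase(x):
    """All sets U & B with U an open neighborhood of x, as in the proof of A2.8(a)."""
    return {u & B for u in OPEN_SETS if x in u}


CLOSURE_B = closure(B, POINTS, OPEN_SETS)
RESULTS = {x: (is_filterbase(trace_filterbase(x)),
               converges_to(trace_filterbase(x), x, OPEN_SETS))
           for x in POINTS}
```

For points of the closure, the traces of the neighborhoods on B form a filterbase in B converging to the point; for a point outside the closure, one of the traces is empty, so the family fails condition (a) of A2.7.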
If $\Omega$ is first countable, the filterbases in A2.8 may be formed using a countable system of neighborhoods of $x$, so that in a first countable space, the topology may be described by filterbases containing countably many sets.

A2.9 Definitions. The filterbase $\mathcal{B}$ is said to be subordinate to the filterbase $\mathcal{A}$ if for each $A \in \mathcal{A}$ there is a $B \in \mathcal{B}$ with $B \subset A$; this means that the filter generated by $\mathcal{A}$ is included in the filter generated by $\mathcal{B}$.

If $\{x_{n_k}, k \in E\}$ is a subnet of $\{x_n, n \in D\}$, the filterbase determined by the subnet is subordinate to the filterbase determined by the original net. For if $n_0 \in D$, there is a $k_0 \in E$ such that $k \ge k_0$ implies $n_k \ge n_0$. Therefore $\{x_{n_k} : k \in E, k \ge k_0\} \subset \{x_n : n \in D, n \ge n_0\}$.

If $\mathcal{A}$ is a filterbase in the topological space $\Omega$, the point $x \in \Omega$ is called an accumulation point of $\mathcal{A}$ if $U \cap A \ne \emptyset$ for all $U \in \mathcal{U}(x)$ and all $A \in \mathcal{A}$, in other words, $x \in \overline{A}$ for all $A \in \mathcal{A}$. We may now prove the analog of Theorem A2.6.

A2.10 Theorem. Let $\mathcal{A}$ be a filterbase in the topological space $\Omega$. If $x \in \Omega$, $x$ is an accumulation point of $\mathcal{A}$ iff there is a filterbase $\mathcal{B}$ subordinate to $\mathcal{A}$ with $\mathcal{B} \to x$; in other words, some overfilter of $\mathcal{A}$ converges to $x$.

PROOF. If $\mathcal{B}$ is subordinate to $\mathcal{A}$ and $\mathcal{B} \to x$, let $U \in \mathcal{U}(x)$, $A \in \mathcal{A}$. Then $U \supset B$ and $A \supset B_1$ for some $B, B_1 \in \mathcal{B}$; hence $U \cap A \supset B \cap B_1$, which is nonempty since $\mathcal{B}$ is a filterbase. Therefore $x \in \overline{A}$. Conversely, if $x$ is an accumulation point of $\mathcal{A}$, let $\mathcal{B}$ consist of all sets $U \cap A$, $U \in \mathcal{U}(x)$, $A \in \mathcal{A}$. Then $\mathcal{A} \subset \mathcal{B}$ (take $U = \Omega$), hence $\mathcal{B}$ is subordinate to $\mathcal{A}$; since $\mathcal{B} \to x$, the result follows. ∎

A2.11 Definition. An ultrafilter is a maximal filter, that is, a filter included in no properly larger filter. (By Zorn's lemma, every filter is included in an ultrafilter.)
A2.12 Theorem. Let $\mathcal{F}$ be a filter in the set $\Omega$.
(a) $\mathcal{F}$ is an ultrafilter iff for each $A \subset \Omega$ we have $A \in \mathcal{F}$ or $A^c \in \mathcal{F}$.
(b) If $\mathcal{F}$ is an ultrafilter and $p: \Omega \to \Omega'$, the filter $\mathcal{G}$ generated by the filterbase $p(\mathcal{F}) = \{p(F) : F \in \mathcal{F}\}$ is an ultrafilter in $\Omega'$.
(c) If $\Omega$ is a topological space and $\mathcal{F}$ is an ultrafilter in $\Omega$, $\mathcal{F}$ converges to each of its accumulation points.

PROOF. (a) If $\mathcal{F}$ is an ultrafilter and $A \notin \mathcal{F}$, necessarily $A \cap B = \emptyset$ for some $B \in \mathcal{F}$. For if not, let $\mathcal{A}$ consist of all sets $A \cap B$, $B \in \mathcal{F}$; then $\mathcal{A}$ is a filterbase generating a filter larger than $\mathcal{F}$. But $A \cap B = \emptyset$ implies $B \subset A^c$; hence $A^c \in \mathcal{F}$. Conversely, if the condition is satisfied, let $\mathcal{F}$ be included in the filter $\mathcal{G}$. If $A \in \mathcal{G}$ and $A \notin \mathcal{F}$, then $A^c \in \mathcal{F} \subset \mathcal{G}$, a contradiction since $A \cap A^c = \emptyset$.
(b) Let $A \subset \Omega'$; by (a), either $p^{-1}(A) \in \mathcal{F}$ or $p^{-1}(A^c) \in \mathcal{F}$. If $p^{-1}(A) \in \mathcal{F}$, then $A \supset p\,p^{-1}(A) \in p(\mathcal{F})$; hence $A \in \mathcal{G}$. Similarly, if $p^{-1}(A^c) \in \mathcal{F}$, then $A^c \in \mathcal{G}$. By (a), $\mathcal{G}$ is an ultrafilter.
(c) Let $x$ be an accumulation point of $\mathcal{F}$. If $U \in \mathcal{U}(x)$ and $U \notin \mathcal{F}$, then $U^c \in \mathcal{F}$ by (a). But $U \cap U^c = \emptyset$, contradicting the fact that $x$ is an accumulation point of $\mathcal{F}$. ∎

We have associated with each net $\{x_n, n \in D\}$ the filterbase $\{x(T_a) : a \in D\}$ of tails of the net, and have seen that convergence of the net is equivalent to convergence of the filterbase. We now prove a converse result.
A2.13 Theorem. If $\mathcal{A}$ is a filterbase in the set $\Omega$, there is a net in $\Omega$ such that the collection of tails of the net coincides with $\mathcal{A}$.

PROOF. Let $D$ be all ordered pairs $(a, A)$ where $a \in A$ and $A \in \mathcal{A}$; define $(a, A) \le (b, B)$ iff $B \subset A$. If $(a, A)$ and $(b, B)$ belong to $D$, choose $C \in \mathcal{A}$ with $C \subset A \cap B$; for any $c \in C$ we have $(c, C) \ge (a, A)$ and $(c, C) \ge (b, B)$, hence $D$ is directed. If we set $x_{(a, A)} = a$ we obtain a net in $\Omega$ with $x(T_{(a, A)}) = A$. ∎
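The construction in the proof of A2.13 is explicit enough to run on a finite filterbase. The sketch below assumes a hypothetical four-set filterbase on {1, 2, 3, 4}; the helper names are illustrative. It builds the index set D of pairs (a, A), orders it by reverse inclusion of the second coordinate, and computes the tails of the resulting net.

```python
def net_from_filterbase(filterbase):
    """Index set D of the proof: all pairs (a, A) with a in A and A in the
    filterbase. The net sends the index (a, A) to the point a."""
    return [(a, A) for A in filterbase for a in A]


def precedes(i, j):
    """(a, A) <= (b, B) iff B is a subset of A."""
    return j[1] <= i[1]


def tail(index_set, i):
    """x(T_i): the set of net values x_j over all indices j >= i."""
    return frozenset(j[0] for j in index_set if precedes(i, j))


def is_directed(index_set):
    """Every pair of indices has a common upper bound in the index set."""
    return all(any(precedes(i, k) and precedes(j, k) for k in index_set)
               for i in index_set for j in index_set)


# Hypothetical filterbase (an assumption for illustration only).
FILTERBASE = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3}),
              frozenset({2, 3, 4}), frozenset({2, 3})]
D = net_from_filterbase(FILTERBASE)
TAILS = {tail(D, i) for i in D}
```

In this example the tail at the index (a, A) is exactly A, so the collection of tails is the filterbase itself, as the theorem asserts.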
We conclude this section with a characterization of continuity.

A2.14 Theorem. Let $f: \Omega \to \Omega'$, where $\Omega$ and $\Omega'$ are topological spaces. The following are equivalent:
(a) The function $f$ is continuous on $\Omega$; that is, $f^{-1}(V)$ is open in $\Omega$ whenever $V$ is open in $\Omega'$.
(b) For every net $\{x_n\}$ in $\Omega$ converging to the point $x \in \Omega$, the net $\{f(x_n)\}$ converges to $f(x)$.
(c) For every filterbase $\mathcal{A}$ in $\Omega$ converging to the point $x \in \Omega$, the filterbase $f(\mathcal{A})$ converges to $f(x)$.

PROOF. Let $\{x_n, n \in D\}$ be a net and $\mathcal{A}$ a filterbase such that the tails of the net coincide with the elements of the filterbase. If, say, $x(T_a) = A \in \mathcal{A}$, then $f(A) = \{f(x_n) : n \in D, n \ge a\}$. Thus the tails of the net $\{f(x_n)\}$ coincide with the elements of $f(\mathcal{A})$. It follows that (b) and (c) are equivalent. If $f$ is continuous and $x_n \to x$, let $V$ be a neighborhood of $f(x)$. Then $f^{-1}(V)$ is a neighborhood of $x$; hence $x_n$ is eventually in $f^{-1}(V)$, so that $f(x_n)$ is eventually in $V$. Thus (a) implies (b). Conversely, if (b) holds and $C$ is closed in $\Omega'$, let $\{x_n\}$ be a net in $f^{-1}(C)$ converging to $x$. Then $f(x_n) \to f(x)$ by (b), and since $C$ is closed we have $f(x) \in C$ by A2.3(b). Thus $x \in f^{-1}(C)$, hence $f^{-1}(C)$ is closed, proving continuity of $f$. ∎

A3 Product and Quotient Topologies
In the Euclidean plane $R^2$, a base for the topology may be formed from sets $U \times V$, where $U$ and $V$ are open subsets of $R$; in fact $U$ and $V$ can be taken to be open intervals, so that $U \times V$ is an open rectangle. If $\{(x_n, y_n), n = 1, 2, \ldots\}$ is a sequence in $R^2$, then $(x_n, y_n) \to (x, y)$ iff $x_n \to x$ and $y_n \to y$, that is, convergence in $R^2$ is "pointwise" or "coordinatewise" convergence.

In general, given an arbitrary collection of topological spaces $\Omega_i$, $i \in I$, let $\Omega$ be the cartesian product $\prod_{i \in I} \Omega_i$, which is the collection of all families $(x_i, i \in I)$; that is, all functions on $I$ such that $x_i \in \Omega_i$ for each $i$. We shall place a topology on $\Omega$ such that convergence in the topology coincides with pointwise convergence.

A3.1 Definition. The product topology (also called the topology of pointwise convergence) on $\Omega = \prod_{i \in I} \Omega_i$ has as a base all sets of the form

$$\{x \in \Omega : x_{i_k} \in U_{i_k},\ k = 1, \ldots, n\}$$

where the $U_{i_k}$ are open in $\Omega_{i_k}$ and $n$ is an arbitrary positive integer. (Since the intersection of two sets of this type is a set of this type, the sets do in fact form a base.)
If $p_i$ is the projection of $\Omega$ onto $\Omega_i$, the product topology is the weakest topology making each $p_i$ continuous; in other words, the product topology is included in any topology that makes each $p_i$ continuous. The product topology has the following properties:

A3.2 Theorem. Let $\Omega = \prod_{i \in I} \Omega_i$, with the product topology.
(a) If $\{x^{(n)}, n \in D\}$ is a net in $\Omega$ and $x \in \Omega$, then $x^{(n)} \to x$ iff $x_i^{(n)} \to x_i$ for each $i$.
(b) A map $f$ from a topological space $\Omega_0$ into $\Omega$ is continuous iff $p_i \circ f$ is continuous for each $i$.
(c) If $f_i: \Omega_0 \to \Omega_i$, $i \in I$, and we define $f: \Omega_0 \to \Omega$ by $f(x) = (f_i(x), i \in I)$, then $f$ is continuous iff each $f_i$ is continuous.
(d) The projections $p_i$ are open maps of $\Omega$ onto $\Omega_i$.

PROOF. (a) If $x^{(n)} \to x$, then $x_i^{(n)} = p_i(x^{(n)}) \to p_i(x) = x_i$ by continuity of the $p_i$. Conversely, assume $x_i^{(n)} \to x_i$ for each $i$. Let $V = \{y \in \Omega : y_{i_k} \in U_{i_k},\ k = 1, \ldots, r\}$ be a basic neighborhood of $x$. Since $x_{i_k} \in U_{i_k}$, there is an $n_k \in D$ with $x_{i_k}^{(n)} \in U_{i_k}$ for $n \ge n_k$. Therefore, if $n \in D$ and $n \ge n_k$ for all $k = 1, \ldots, r$, we have $x^{(n)} \in V$, so that $x^{(n)} \to x$.
(b) The "only if" part follows by continuity of the $p_i$. Conversely, assume each $p_i \circ f$ continuous. If $x^{(n)} \to x$, then $p_i(f(x^{(n)})) \to p_i(f(x))$ by hypothesis; hence $f(x^{(n)}) \to f(x)$ by (a).
(c) We have $f_i = p_i \circ f$, so (b) applies.
(d) The result follows from the observation that

$$p_i\{x \in \Omega : x_{i_k} \in U_{i_k},\ k = 1, \ldots, n\} = \begin{cases} \Omega_i & \text{if } i \ne \text{any } i_k, \\ U_{i_k} & \text{if } i = i_k \text{ for some } k. \end{cases}$$ ∎

Note that if $\Omega$ is the collection of all functions from a topological space $S$ to a topological space $T$, then $\Omega = \prod_{i \in I} \Omega_i$, where $I = S$ and $\Omega_i = T$ for all $i$. If $\{f_n\}$ is a net of functions from $S$ to $T$, and $f: S \to T$, then $f_n \to f$ in the product topology iff $f_n(s) \to f(s)$ for all $s \in S$. We now consider quotient spaces.
A3.3 Definition. Let $\Omega_0$ be a topological space, and $p$ a map of $\Omega_0$ onto a set $\Omega$. The identification topology on $\Omega$ is the strongest topology making $p$ continuous, that is, the open subsets of $\Omega$ are the sets $U$ such that $p^{-1}(U)$ is open in $\Omega_0$. When $\Omega$ has the identification topology, it is called an identification space, and $p$ is called an identification map.

A quotient topology may be regarded as a particular identification topology. Let $R$ be an equivalence relation on the topological space $\Omega_0$, $\Omega_0/R$ the set of equivalence classes, and $p: \Omega_0 \to \Omega_0/R$ the canonical projection: $p(x) = [x]$, the equivalence class containing $x$. The quotient space of $\Omega_0$ by $R$ is $\Omega_0/R$ with the identification topology determined by $p$. In fact, any identification space may be regarded as a quotient space. To see this, we need two preliminary results.

A3.4 Lemma. If $\Omega$ has the identification topology determined by $p: \Omega_0 \to \Omega$, then $g: \Omega \to \Omega_1$ is continuous iff $g \circ p: \Omega_0 \to \Omega_1$ is continuous.

PROOF. The "only if" part follows from the continuity of $p$. If $g \circ p$ is continuous and $V$ is open in $\Omega_1$, then $(g \circ p)^{-1}(V) = p^{-1}(g^{-1}(V))$ is open in $\Omega_0$. By definition of the identification topology, $g^{-1}(V)$ is open in $\Omega$. ∎

A3.5 Lemma. Let $p: \Omega_0 \to \Omega$ be an identification, and let $h: \Omega_0 \to \Omega_1$ be continuous. Assume that $h \circ p^{-1}$ is single-valued, in other words, $p(x) = p(y)$ implies $h(x) = h(y)$. Then $h \circ p^{-1}$ is continuous.

PROOF. Since $h = (h \circ p^{-1}) \circ p$, the result follows from A3.4. (Note that $p^{-1}$ is defined on all of $\Omega$ since $p$ is onto.) ∎

A3.6 Theorem. Let $f: \Omega_0 \to \Omega$ be an identification. Define an equivalence relation $R$ on $\Omega_0$ by calling $x$ and $y$ equivalent iff $f(x) = f(y)$. Let $p$ be the canonical projection of $\Omega_0$ onto $\Omega_0/R$. Then $\Omega_0/R$, with the quotient topology, is homeomorphic to $\Omega$.

PROOF. We have $f(x) = f(y)$ iff $p(x) = p(y)$, so by A3.5, $f \circ p^{-1}$ and $p \circ f^{-1}$ are both continuous. Since these functions are inverses of each other, they define a homeomorphism of $\Omega$ and $\Omega_0/R$. ∎

The following result gives conditions under which a given topology arises from an identification.

A3.7 Theorem. Let $p$ be a map of $\Omega_0$ onto $\Omega$. If $p$ is continuous and either open or closed, it is an identification, that is, the identification topology on $\Omega$ determined by $p$ coincides with the original topology on $\Omega$.
PROOF. Since $p$ is continuous and the identification topology is the largest one making $p$ continuous, the original topology is included in the identification topology. If $p$ is an open map and $U$ is an open subset of $\Omega$ in the identification topology, $p^{-1}(U)$ is open in $\Omega_0$, hence $p(p^{-1}(U)) = U$ is open in the original topology. If $p$ is a closed map, the same argument applies, with "open" replaced by "closed." ∎

A4 Separation Properties and Other Ways of Classifying Topological Spaces

Topological spaces may be classified as to how well disjoint sets may be separated, as follows. (The results of this section are generally discussed in a first course in topology, and will not be proved.)

A4.1 Definitions. A topological space $\Omega$ is said to be a $T_0$ space if given any two distinct points $x$ and $y$, at least one point has a neighborhood not containing the other; $\Omega$ is a $T_1$ space if each point has a neighborhood not containing the other; $\Omega$ is a $T_2$ (or Hausdorff) space if $x$ and $y$ have disjoint neighborhoods. This is equivalent to uniqueness of limits of nets (or filterbases). Also, $\Omega$ is said to be a $T_3$ (or regular) space if $\Omega$ is $T_2$ and for each closed set $C$ and point $x \notin C$, there are disjoint open sets $U$ and $V$ with $x \in U$ and $C \subset V$; it is said to be a $T_4$ (or normal) space if $\Omega$ is $T_2$ and for each pair of disjoint closed sets $A$ and $B$ there are disjoint open sets $U$ and $V$ with $A \subset U$ and $B \subset V$.

It follows from the definitions that $T_i$ implies $T_{i-1}$, $i = 4, 3, 2, 1$. Also, the $T_1$ property is equivalent to the statement that $\{x\}$ is closed for each $x$. The space $\Omega$ is regular iff it is Hausdorff and for each open set $U$ and point $x \in U$, there is a $V \in \mathcal{U}(x)$ with $\overline{V} \subset U$. The space $\Omega$ is normal iff it is Hausdorff and for each closed set $A$ and open set $U \supset A$, there is an open set $V$ with $A \subset V \subset \overline{V} \subset U$.

A metric space is $T_4$, for if $A$ and $B$ are disjoint closed sets, we may take $U = \{x : d(x, A) - d(x, B) < 0\}$, $V = \{x : d(x, A) - d(x, B) > 0\}$, where $d(x, A) = \inf\{d(x, y) : y \in A\}$.
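The separating sets for a metric space can be evaluated directly. The sketch below assumes the real line with the usual metric and two hypothetical finite (hence closed) disjoint sets A and B; it computes the sign of d(x, A) - d(x, B), which is negative on U and positive on V.

```python
def dist_to_set(x, s):
    """d(x, A) = inf{|x - y| : y in A} for a finite nonempty set A of reals."""
    return min(abs(x - y) for y in s)


def separation_sign(x, a, b):
    """Sign of d(x, A) - d(x, B): -1 on U, +1 on V, 0 on neither."""
    diff = dist_to_set(x, a) - dist_to_set(x, b)
    return (diff > 0) - (diff < 0)


# Hypothetical disjoint closed sets on the real line (illustrative only).
A = [0.0, 1.0, 5.0]
B = [2.0, 3.5]

IN_U = [separation_sign(x, A, B) == -1 for x in A]
IN_V = [separation_sign(x, A, B) == 1 for x in B]
```

Each point of A has distance 0 to A and positive distance to B, so it lies in U, and symmetrically for B. The sets U and V are disjoint by construction, and they are open because x -> d(x, A) - d(x, B) is continuous.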
A4.2 Urysohn's Lemma. Let $\Omega$ be a Hausdorff space. Then $\Omega$ is normal iff for each pair of disjoint closed sets $A$ and $B$, there is a continuous function $f: \Omega \to [0, 1]$ with $f = 0$ on $A$ and $f = 1$ on $B$.

A4.3 Tietze Extension Theorem. Let $\Omega$ be a Hausdorff space. Then $\Omega$ is normal iff for every closed set $A \subset \Omega$ and every continuous real-valued function $f$ defined on $A$, $f$ has an extension to a continuous real-valued function $F$ on $\Omega$. Furthermore, if $|f| < c$ (respectively, $|f| \le c$) on $A$, then $|F|$ can be taken less than $c$ (respectively, less than or equal to $c$) on $\Omega$.

A4.4 Theorem. Let $A$ be a closed subset of the normal space $\Omega$. There is a continuous $f: \Omega \to [0, 1]$ such that $A = f^{-1}\{0\}$ iff $A$ is a $G_\delta$, that is, a countable intersection of open sets.
A4.5 Definitions and Comments. A topological space $\Omega$ is second countable if there is a countable base for the topology, first countable if there is a countable base at each point (see A2.4). Second countability implies first countability but not conversely. Any metric space is first countable.

If $\Omega$ is second countable, it is necessarily separable, that is, there is a countable dense subset of $\Omega$. Furthermore, if $\Omega$ is second countable, it is Lindelöf, that is, for every family of open sets $V_i$, $i \in I$, such that $\bigcup_i V_i = \Omega$, there is a countable subfamily whose union is $\Omega$ (for short, every open covering of $\Omega$ has a countable subcovering).

In a metric space, the separable, second countable, and Lindelöf properties are equivalent as follows. Second countability always implies separability and Lindelöf. If $\Omega$ is separable with a countable dense set $\{x_1, x_2, \ldots\}$, then the balls $B(x_i, r) = \{y \in \Omega : d(y, x_i) < r\}$, $i = 1, 2, \ldots$, $r$ rational, form a countable base. If $\Omega$ is Lindelöf, the cover by balls $B(x, 1/n)$, $x \in \Omega$, has a countable subcover $\{B(x_{ni}, 1/n), i = 1, 2, \ldots\}$, and the sets $B(x_{ni}, 1/n)$, $i, n = 1, 2, \ldots$, form a countable base.

This result implies that any space that is separable but not second countable (or not Lindelöf), or Lindelöf but not second countable (or not separable) cannot be metrizable, that is, there is no metric whose topology coincides with the original one.

A4.6 Definitions and Comments. A topological space $\Omega$ is said to be completely regular if $\Omega$ is Hausdorff and for each $x \in \Omega$ and closed set $C \subset \Omega$ with $x \notin C$, there is a continuous $f: \Omega \to [0, 1]$ such that $f(x) = 1$ and $f = 0$ on $C$.

If $A$ is a subset of the normal space $\Omega$, then $A$, with the relative topology, is completely regular (this follows quickly from Urysohn's lemma). Also, complete regularity implies regularity. Thus complete regularity is in between regularity and normality; for this reason, completely regular spaces are sometimes called $T_{3\frac{1}{2}}$ spaces.

Now if $\Omega_0$ is a Hausdorff space and $\mathcal{F}$ is the family of continuous maps $f: \Omega_0 \to [0, 1]$, let $\Omega = \prod\{I_f : f \in \mathcal{F}\}$ where each $I_f = [0, 1]$. Let $e: \Omega_0 \to \Omega$ be the evaluation map, that is, $e(x) = (f(x), f \in \mathcal{F})$. With the product topology on $\Omega$, $e$ is continuous, and $e$ is one-to-one iff $\mathcal{F}$ distinguishes points, in other words, given $x, y \in \Omega_0$, $x \ne y$, there is an $f \in \mathcal{F}$ such that $f(x) \ne f(y)$. Finally, if $\mathcal{F}$ distinguishes points from closed sets, that is, if $\Omega_0$ is completely regular, then $e$ is an open map of $\Omega_0$ onto $e(\Omega_0) \subset \Omega$. [If $\{x_n\}$ is a net and $e(x_n) \to e(x)$ but $x_n \not\to x$, there is a neighborhood $U$ of $x$ such that $x_n$ is not eventually in $U$; that is, given $m$, there is an $n \ge m$ with $x_n \notin U$. Choose $f \in \mathcal{F}$ with $f(x) = 1$ and $f = 0$ on $\Omega_0 - U$. Then for each $m$, we have $f(x_n) = 0$ for some $n \ge m$, so that $f(x_n) \not\to f(x)$. But then $e(x_n) \not\to e(x)$, a contradiction.]

It follows that if $\Omega_0$ is completely regular, it is homeomorphic to a subset of a normal space. [Since $\Omega$ is a product of Hausdorff spaces, it is Hausdorff; the Tychonoff theorem, to be proved later, shows that $\Omega$ is compact, and hence normal (see A5.3(d) and A5.4).] Since $e(\Omega_0)$ is determined completely by $\mathcal{F}$, we may say that the continuous functions are adequate to describe the topology of $\Omega_0$.
A5 Compactness

The notion of compactness appears in virtually all areas of mathematics. The original compactness result was the Heine-Borel theorem: If $[0, 1] \subset \bigcup_i V_i$, where the $V_i$ are open subsets of $R$, then in fact $[0, 1]$ is covered by finitely many $V_i$, that is, $[0, 1] \subset \bigcup_{k=1}^{n} V_{i_k}$ for some $V_{i_1}, \ldots, V_{i_n}$. In general, we have the following definition:

A5.1 Definition. The topological space $\Omega$ is compact if every open covering of $\Omega$ has a finite subcovering.

There are several ways of expressing this idea.

A5.2 Theorem. If $\Omega$ is a topological space, the following are equivalent:
(a) $\Omega$ is compact.
(b) Each family of closed sets $C_i \subset \Omega$ with the finite intersection property (all finite intersections of the $C_i$ are nonempty) has nonempty intersection. Equivalently, for every family of closed subsets of $\Omega$ with empty intersection, there is a finite subfamily with empty intersection.
(c) Every net in $\Omega$ has an accumulation point in $\Omega$; in other words (by A2.6), every net in $\Omega$ has a subnet converging to a point of $\Omega$.
(d) Every filterbase in $\Omega$ has an accumulation point in $\Omega$, that is (by A2.10), every filterbase in $\Omega$ has a convergent filterbase subordinate to it, or equally well, every filter in $\Omega$ has an overfilter converging to a point of $\Omega$.
(e) Every ultrafilter in $\Omega$ converges to a point of $\Omega$.
PROOF. Parts (a) and (b) are equivalent by the duality between open and closed sets. If $\{x_n\}$ is a net and $\mathcal{A}$ is a filterbase whose elements coincide with the tails of the net, then the accumulation points of the net and of the filterbase coincide, so that (c) and (d) are equivalent. Now (d) implies (e) by A2.12(c), and (e) implies (d) since every filter is included in an ultrafilter. To prove that (b) implies (d), observe that if $\mathcal{A}$ is a filterbase, the sets $A \in \mathcal{A}$, hence the sets $\overline{A}$, $A \in \mathcal{A}$, have the finite intersection property, so by (b), there is a point $x \in \bigcap\{\overline{A} : A \in \mathcal{A}\}$. Finally, we prove that (d) implies (b). If the closed sets $C_i$ have the finite intersection property, the finite intersections of the $C_i$ form a filterbase, which by (d) has an accumulation point $x$. But then $x$ belongs to all $C_i$. ∎

It is important to note that if $A \subset \Omega$, the statement that every covering of $A$ by sets open in $\Omega$ has a finite subcovering is equivalent to the statement that every covering of $A$ by sets open in $A$ (in the relative topology inherited from $\Omega$) has a finite subcovering. Thus when we talk about a compact subset of a topological space, there is no ambiguity. The following results follow quickly from the definition of compactness.

A5.3 Theorem. (a) If $\Omega$ is compact and $f$ is continuous on $\Omega$, then $f(\Omega)$ is compact.
(b) A closed subset $C$ of a compact space $\Omega$ is compact.
(c) If $A$ and $B$ are disjoint compact subsets of the Hausdorff space $\Omega$, there are disjoint open sets $U$ and $V$ such that $A \subset U$ and $B \subset V$. In particular (take $A = \{x\}$), a compact subset of a Hausdorff space is closed.
(d) A compact Hausdorff space is normal.
(e) If $A$ is a compact subset of the regular space $\Omega$, and $A$ is a subset of the open set $U$, there is an open set $V$ with $A \subset V \subset \overline{V} \subset U$.

PROOF. (a) This is immediate from the definition of compactness.
(b) If $C$ is covered by sets $U$ open in $\Omega$, the sets $U$ together with $\Omega - C$ cover $\Omega$. By compactness there is a finite subcover.
(c) If $x \notin B$ and $y \in B$, there are disjoint neighborhoods $U_y(x)$ and $V(y)$ of $x$ and $y$. The $V(y)$ cover $B$; hence there is a finite subcover $V(y_i)$, $i = 1, \ldots, n$. Then $U' = \bigcap_{i=1}^{n} U_{y_i}(x)$ and $V' = \bigcup_{i=1}^{n} V(y_i)$ are disjoint open sets with $x \in U'$ and $B \subset V'$. If we repeat the process for each $x \in A$, we obtain disjoint open sets $U(x)$ and $V(x)$ as above. The $U(x)$ cover $A$; hence there is a finite subcover $U(x_i)$, $i = 1, \ldots, m$. Take $U = \bigcup_{i=1}^{m} U(x_i)$, $V = \bigcap_{i=1}^{m} V(x_i)$.
(d) If $A$ and $B$ are disjoint closed sets, they are compact by (b); the result then follows from (c).
(e) If $x \in A$, regularity yields an open set $V(x)$ with $x \in V(x)$ and $\overline{V(x)} \subset U$. The $V(x)$ cover $A$; so for some $x_1, \ldots, x_n$, we have

$$A \subset \bigcup_{i=1}^{n} V(x_i) \subset \overline{\bigcup_{i=1}^{n} V(x_i)} = \bigcup_{i=1}^{n} \overline{V(x_i)} \subset U.$$ ∎
The following is possibly the most important compactness result.

A5.4 Tychonoff Theorem. If $\Omega_i$ is compact for each $i \in I$, then $\Omega = \prod_{i \in I} \Omega_i$ is compact in the product topology.

PROOF. Let $\mathcal{F}$ be an ultrafilter in $\Omega$. If $p_i$ is the projection of $\Omega$ onto $\Omega_i$, then by A2.12(b), $p_i(\mathcal{F})$ is a filterbase that generates an ultrafilter in $\Omega_i$. By hypothesis, $p_i(\mathcal{F})$ converges to some $x_i \in \Omega_i$, and it follows that $\mathcal{F} \to x = (x_i, i \in I)$. [To see this, observe that if $\mathcal{A}$ is a filterbase in $\Omega$ and $\{x^{(n)}\}$ is a net whose tails are the elements of $\mathcal{A}$, then the tails of $\{p_i(x^{(n)})\}$ are the elements of $p_i(\mathcal{A})$. Since $x^{(n)} \to x$ iff $p_i(x^{(n)}) \to p_i(x)$ for all $i$, by A3.2(a), it follows that $\mathcal{A} \to x$ iff $p_i(\mathcal{A}) \to p_i(x)$ for all $i$.] The Tychonoff theorem now follows from A5.2(e). ∎

The following result is often used to infer that the inverse of a particular one-to-one continuous map is continuous.

A5.5 Theorem. Let $f: \Omega \to \Omega_1$, where $\Omega$ is compact, $\Omega_1$ is Hausdorff, and $f$ is continuous. Then $f$ is a closed map; consequently if $f$ is one-to-one onto, it is a homeomorphism.

PROOF. By A5.3(a), (b) and (c). ∎

A5.6 Corollary. Let $p: \Omega_0 \to \Omega$ be an identification, and let $h: \Omega_0 \to \Omega_1$ be continuous; assume $h \circ p^{-1}$ is single-valued, and hence continuous by A3.5. Assume also that $\Omega_0$ is compact (hence so is $\Omega$ because $p$ is onto) and $\Omega_1$ is Hausdorff. If $h \circ p^{-1}$ is one-to-one onto, it is a homeomorphism.

PROOF. Apply A5.5 with $f = h \circ p^{-1}$. ∎
Corollary A5.6 is frequently applied in constructing quotient spaces. For example, if one pair of opposite edges of a rectangle are identified, we obtain a cylinder. Formally, let $I^2 = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}$. Define an equivalence relation $R$ on $I^2$ by specifying that $(0, y)$ be equivalent to $(1, y)$, $0 \le y \le 1$, with the other equivalence classes consisting of single points. Let $h(x, y) = (e^{i2\pi x}, y)$, $(x, y) \in I^2$; $h$ maps $I^2$ onto a cylinder $C$. If $p$ is the canonical projection of $I^2$ onto $I^2/R$, then A5.6 implies that $h \circ p^{-1}$ is a homeomorphism of $I^2/R$ and $C$.

In some situations, for example in metric spaces, there are alternative ways of expressing the idea of compactness.

A5.7 Definition. A topological space $\Omega$ is said to be countably compact if every countable open covering of $\Omega$ has a finite subcover.

A5.8 Theorem. For any topological space $\Omega$, the following properties (a)-(d) are equivalent, and each implies (e). If $\Omega$ is $T_1$, then all five properties are equivalent.
(a) $\Omega$ is countably compact.
(b) Each sequence of closed subsets of $\Omega$ with the finite intersection property has nonempty intersection.
(c) Every sequence in $\Omega$ has an accumulation point.
(d) Every countable filterbase in $\Omega$ has an accumulation point.
(e) Every infinite subset of $\Omega$ has a cluster point.
PROOF. The equivalence of (a), (b), and (d) is proved exactly as in A5.2, and (d) implies (c) because the tails of a sequence form a countable filterbase. To prove that (c) implies (d), let $\mathcal{A} = \{A_1, A_2, \ldots\}$ be a countable filterbase, and choose $x_n \in \bigcap_{i=1}^{n} A_i$, $n = 1, 2, \ldots$. If $x$ is an accumulation point of $\{x_n\}$ and $U \in \mathcal{U}(x)$, then for each $n$ there is an $m \ge n$ such that $x_m \in U$, hence $U \cap \bigcap_{i=1}^{m} A_i \ne \emptyset$. It follows that $U \cap A_n \ne \emptyset$ for all $n$, and consequently $x$ is an accumulation point of $\mathcal{A}$. To prove that (c) implies (e), pick a sequence of distinct points from the infinite set $A$ and observe that if $x$ is an accumulation point of the sequence, then $x$ is a cluster point of $A$. Finally, we show that (e) implies (a) if $\Omega$ is $T_1$. Say $U_1, U_2, \ldots$ form a countable open covering of $\Omega$ with no finite subcover. Choose $x_1 \notin U_1$; having chosen distinct $x_1, \ldots, x_k$ with $x_j \notin \bigcup_{i=1}^{j} U_i$, $j = 1, \ldots, k$, then $x_1, \ldots, x_k$ all belong to some finite union $\bigcup_{i=1}^{n} U_i$, $n \ge k + 1$; choose $x_{k+1} \notin \bigcup_{i=1}^{n} U_i$ (hence $x_{k+1} \notin \bigcup_{i=1}^{k+1} U_i$ and $x_1, \ldots, x_{k+1}$ are distinct). In this way we form an infinite set $A = \{x_1, x_2, \ldots\}$ with no cluster point. For if $x$ is such a point, $x$ belongs to $U_n$ for some $n$. Since $\Omega$ is $T_1$, there is a set $U \in \mathcal{U}(x)$ such that $U \subset U_n$ and $x_i \notin U$, $i = 1, 2, \ldots, n - 1$ (unless $x_i = x$). Since $x_k \notin U_n$ for $k \ge n$, $U$ contains no point of $A$ distinct from $x$. ∎
A5.9 Definitions and Comments. The topological space $\Omega$ is said to be sequentially compact iff every sequence in $\Omega$ has a convergent subsequence. By A5.8(c), sequential compactness implies countable compactness. In a first countable space, countable and sequential compactness are equivalent. For if $x$ is an accumulation point of the sequence $\{x_n\}$ and $V_1, V_2, \ldots$ (with $V_{n+1} \subset V_n$ for all $n$) form a countable base at $x$, then for each $k$ we may find $n_k \ge k$ such that $x_{n_k} \in V_k$. Thus we have a subsequence converging to $x$.

A5.10 Theorem. In a second countable space or a metric space, compactness, countable compactness, and sequential compactness are equivalent.

PROOF. A second countable space is Lindelöf (see A4.5), so compactness and countable compactness are equivalent. It is first countable, so countable compactness and sequential compactness are equivalent; this result holds in a metric space also, because a metric space is first countable.

Now a sequentially compact metric space $\Omega$ is totally bounded, that is, for each $\varepsilon > 0$, $\Omega$ can be covered by finitely many balls of radius $\varepsilon$. (If not, inductively pick $x_1, x_2, \ldots$ with $x_{n+1} \notin \bigcup_{i=1}^{n} B(x_i, \varepsilon)$; then $\{x_n\}$ can have no convergent subsequence.) Thus for each positive integer $n$, $\Omega$ can be covered by finitely many balls $B(x_{ni}, 1/n)$, $i = 1, 2, \ldots, k_n$. If $\{U_j, j \in J\}$ is an arbitrary open covering of $\Omega$, for each ball $B(x_{ni}, 1/n)$ we choose, if possible, a set $U_{ni}$ of the covering such that $U_{ni} \supset B(x_{ni}, 1/n)$. If $x \in \Omega$, then $x$ belongs to a ball $B(x, \varepsilon)$ included in some $U_j$; hence $x \in B(x_{ni}, 1/n) \subset B(x, \varepsilon) \subset U_j$ for some $n$ and $i$; therefore $x \in U_{ni}$. Thus the $U_{ni}$ form a countable subcover, and $\Omega$, which is countably compact, must in fact be compact. ∎

Note that a compact metric space is Lindelöf, hence (see A4.5) is second countable and separable.
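The inductive selection used in the total boundedness argument (pick each new point outside all balls chosen so far) can be sketched on a finite sample. The code below assumes a hypothetical 21 x 21 grid in the unit square with the Euclidean metric; on a finite sample the selection necessarily stops, and the chosen balls then cover the sample.

```python
import math


def greedy_epsilon_net(points, eps):
    """Pick centers one at a time, each lying outside every open ball
    B(c, eps) around the centers chosen so far."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


# Hypothetical sample of the unit square (an assumption for illustration).
GRID = [(i / 20, j / 20) for i in range(21) for j in range(21)]
EPS = 0.25
CENTERS = greedy_epsilon_net(GRID, EPS)

# Every sample point lies in some ball B(c, EPS) ...
COVERED = all(any(math.dist(p, c) < EPS for c in CENTERS) for p in GRID)
# ... and distinct centers are at distance at least EPS from each other.
SEPARATED = all(math.dist(c1, c2) >= EPS
                for i, c1 in enumerate(CENTERS) for c2 in CENTERS[i + 1:])
```

The separation property is what rules out a convergent subsequence in the proof: an infinite sequence of points that are pairwise at least $\varepsilon$ apart has no Cauchy subsequence.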
A5.11 Definition. A Hausdorff space $\Omega$ is said to be locally compact if each $x \in \Omega$ has a relatively compact neighborhood, that is, a neighborhood whose closure is compact. (It follows that a compact Hausdorff space is locally compact.)

A5.12 Theorem. The following are equivalent, for a Hausdorff space $\Omega$:
(a) $\Omega$ is locally compact.
(b) For each $x \in \Omega$ and $U \in \mathcal{U}(x)$, there is a relatively compact open set $V$ with $x \in V \subset \overline{V} \subset U$. (Thus a locally compact space is regular; furthermore, the relatively compact open sets form a base for the topology.)
(c) If $K$ is compact, $U$ is open, and $K \subset U$, there is a relatively compact open set $V$ with $K \subset V \subset \overline{V} \subset U$.

PROOF. It is immediate that (c) implies (a), and (b) implies (c) is proved by applying (b) to each point of $K$ and using compactness. To prove that (a) implies (b), let $x$ belong to the open set $U$. By (a), there is a neighborhood $V_1$ of $x$ such that $K = \overline{V_1}$ is compact. Now $K$ is compact Hausdorff, and hence regular, and $x \in U \cap V_1$, which is open in $\Omega$, and hence open in $K$. Thus (see A4.1) there is a set $W$ open in $\Omega$ such that $x \in W \cap K$ and the closure of $W \cap K$ in $K$, namely, $\overline{W \cap K}$, is a subset of $U \cap V_1$.
Now $x \in W \cap V_1$ and $\overline{W \cap V_1} \subset \overline{W \cap K} \subset U$, so $V = W \cap V_1$ is the desired relatively compact neighborhood. ∎
The following properties of locally compact spaces are often useful:

A5.13 Theorem. (a) Let $\Omega$ be a locally compact Hausdorff space. If $K \subset U \subset \Omega$, with $K$ compact and $U$ open, there is a continuous $f: \Omega \to [0, 1]$ such that $f = 0$ on $K$ and $f = 1$ on $\Omega - U$. In particular, a locally compact Hausdorff space is completely regular.
(b) Let $\Omega$ be locally compact Hausdorff, or, more generally, completely regular. If $A$ and $B$ are disjoint subsets of $\Omega$ with $A$ compact and $B$ closed, there is a continuous $f: \Omega \to [0, 1]$ such that $f = 0$ on $A$ and $f = 1$ on $B$.
(c) Let $\Omega$ be locally compact Hausdorff, and let $A \subset U \subset \Omega$, with $A$ compact and $U$ open. Then there are sets $B$ and $V$ with $A \subset V \subset B \subset U$, where $V$ is open and $\sigma$-compact (a countable union of compact sets) and $B$ is compact and is also a $G_\delta$ (a countable intersection of open sets). Consequently (take $A = \{x\}$) the $\sigma$-compact open sets form a base for the topology.

PROOF. (a) Let $K \subset V \subset \overline{V} \subset U$, with $V$ open, $\overline{V}$ compact [see A5.12(c)]. $\overline{V}$ is normal, so there is a continuous $g: \overline{V} \to [0, 1]$ with $g = 0$ on $K$, $g = 1$ on $\overline{V} - V$. Define $f = g$ on $\overline{V}$, $f = 1$ on $\Omega - V$. (On $\overline{V} - V$, $g = 1$ so $f$ is well-defined.) Now $f$ is continuous on $\Omega$ (look at preimages of closed sets), so $f$ is the desired function.
(b)
By complete regularity, for each $x \in A$, there is a continuous $f_x: \Omega \to [0, 1]$ with $f_x(x) = 0$, $f_x = 1$ on $B$. By compactness,

$$A \subset \bigcup_{i=1}^{n} \{x : f_{x_i}(x) < \tfrac{1}{2}\}$$

for some $x_1, \ldots, x_n$. Let
; then $g = 1$ on $B$ and $0$ ... $a\}$ is open. Let $x_n \to x$, where $f(x) > a$. Then $\liminf_n f(x_n) > a$, hence $f(x_n) > a$ eventually, that is, $x_n \in V$ eventually. Thus (see A2.4) $V$ is open. ∎

We now prove a few properties of semicontinuous functions.

A6.3 Theorem. Let $f$ be LSC on the compact space $\Omega$. Then $f$ attains its infimum. (Hence if $f$ is USC on the compact space $\Omega$, $f$ attains its supremum.)

PROOF. If $b = \inf f$, there is a sequence of points $x_n \in \Omega$ with $f(x_n) \to b$. By compactness, we have a subnet $x_{n_k}$ converging to some $x \in \Omega$. Since $f$ is LSC, $\liminf_k f(x_{n_k}) \ge f(x)$. But $f(x_{n_k}) \to b$, so that $f(x) \le b$; consequently $f(x) = b$. ∎
A6.4 Theorem. If $f_i$ is LSC on $\Omega$ for each $i \in I$, then $\sup_i f_i$ is LSC; if $I$ is finite, then $\min_i f_i$ is LSC. (Hence if $f_i$ is USC for each $i$, then $\inf_i f_i$ is USC, and if $I$ is finite, then $\max_i f_i$ is USC.)

PROOF. Let $f = \sup_i f_i$; then $\{x : f(x) > a\} = \bigcup_{i \in I} \{x : f_i(x) > a\}$; hence $\{x : f(x) > a\}$ is open. If $g = \min(f_1, f_2, \ldots, f_n)$, then

$$\{x : g(x) > a\} = \bigcap_{i=1}^{n} \{x : f_i(x) > a\}$$

is open. ∎

A6.5 Theorem. Let $f: \Omega \to \overline{R}$, $\Omega$ any topological space, $f$ arbitrary. Define

$$\underline{f}(x) = \liminf_{y \to x} f(y), \qquad x \in \Omega;$$

that is,

$$\underline{f}(x) = \sup_{V} \inf_{y \in V} f(y),$$

where $V$ ranges over all neighborhoods of $x$. [If $\Omega$ is a metric space, then $\underline{f}(x) = \sup_{n = 1, 2, \ldots} \inf_{d(x, y) < 1/n} f(y)$.]

Then $\underline{f}$ is LSC on $\Omega$ and $\underline{f} \le f$; furthermore if $g$ is LSC on $\Omega$ and $g \le f$, then $g \le \underline{f}$. Thus $\underline{f}$, called the lower envelope of $f$, is the sup of all LSC functions that are less than or equal to $f$ (there is always at least one such function, namely the function constant at $-\infty$).

Similarly, if $\overline{f}(x) = \limsup_{y \to x} f(y) = \inf_{V} \sup_{y \in V} f(y)$, then $\overline{f}$, the upper envelope of $f$, is USC and $\overline{f} \ge f$; in fact $\overline{f}$ is the inf of all USC functions that are greater than or equal to $f$.

PROOF. It suffices to consider $\underline{f}$. Let $\{x_n\}$ be a net in $\Omega$ with $x_n \to x$ and $\liminf_n \underline{f}(x_n) < b < \underline{f}(x)$. If $V$ is a neighborhood of $x$, we can choose $n$ such that $x_n \in V$ and $\underline{f}(x_n) < b$. Since $V$ is also a neighborhood of $x_n$, we have $b > \underline{f}(x_n) \ge \inf_{y \in V} f(y)$, so

$$\underline{f}(x) = \sup_{V} \inf_{y \in V} f(y) \le b < \underline{f}(x),$$

a contradiction. By A6.2, $\underline{f}$ is LSC, and $\underline{f} \le f$ by definition of $\underline{f}$. Finally if $g$ is LSC, $g \le f$, then $\underline{f}(x) = \liminf_{y \to x} f(y) \ge \liminf_{y \to x} g(y) \ge g(x)$ since $g$ is LSC. [If $\sup_V \inf_{y \in V} g(y) < b < g(x)$, then for each $V$ pick $x_V \in V$ with $g(x_V) < b$. If $V_1 \le V_2$ means that $V_2 \subset V_1$, the $x_V$ form a net converging to $x$, while $\liminf_V g(x_V) \le b < g(x)$, contradicting A6.2.] ∎
It can be shown that if $\Omega$ is completely regular, every LSC function on $\Omega$ is the sup of a family of continuous functions. If $\Omega$ is a metric space, the family can be assumed countable, as we now prove.
A6.6 Theorem. Let $\Omega$ be a metric space, $f$ a LSC function on $\Omega$. There is a sequence of continuous functions $f_n\colon \Omega \to R$ such that $f_n \uparrow f$. (Thus if $f$ is USC, there is a sequence of continuous functions $f_n \downarrow f$.) If $|f| \le M < \infty$, the $f_n$ may be chosen so that $|f_n| \le M$ for all $n$.
PROOF [following Hausdorff (1962)]. First assume $f \ge 0$ and finite-valued. If $d$ is the metric on $\Omega$, define $g(x) = \inf\{f(z) + t\,d(x, z) : z \in \Omega\}$, where $t > 0$ is fixed; then $0 \le g \le f$ since $g(x) \le f(x) + t\,d(x, x) = f(x)$. If $x, y \in \Omega$, then $f(z) + t\,d(x, z) \le f(z) + t\,d(y, z) + t\,d(x, y)$. Take the inf over $z$ to obtain $g(x) \le g(y) + t\,d(x, y)$. By symmetry, $|g(x) - g(y)| \le t\,d(x, y)$, hence $g$ is continuous on $\Omega$.
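The construction of $g$ can be tried numerically. The following is only a sketch under stated assumptions: the space is replaced by a finite grid in $[0, 1]$ with the usual distance, the infimum is taken over grid points only, and the function `step` and the helper name `lipschitz_minorant` are illustrative choices, not part of the text.

```python
def lipschitz_minorant(f, points, t):
    """Return g(x) = min over z of f(z) + t*|x - z| for each x in points.

    Assumption: the infimum over the whole space is replaced by a
    minimum over the same finite grid.
    """
    return [min(f(z) + t * abs(x - z) for z in points) for x in points]


def step(x):
    # Lower semicontinuous on [0, 1]: {step > a} is open for every a.
    return 0.0 if x <= 0.5 else 1.0


grid = [i / 100 for i in range(101)]
f_vals = [step(x) for x in grid]
g1 = lipschitz_minorant(step, grid, 1.0)   # t = 1
g8 = lipschitz_minorant(step, grid, 8.0)   # t = 8, a larger minorant
```

On this grid one can check that $0 \le g \le f$, that $g$ is Lipschitz with constant $t$, and that increasing $t$ increases $g$, which is the monotonicity used in the rest of the proof.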
Now set $t = n$; in other words let $f_n(x) = \inf\{f(z) + n\,d(x, z) : z \in \Omega\}$. Then $0 \le f_n \uparrow h \le f$. But given $\varepsilon > 0$, for each $n$ we can choose $z_n \in \Omega$ such that
$$f_n(x) + \varepsilon > f(z_n) + n\,d(x, z_n) \ge n\,d(x, z_n).$$
But $f_n(x) + \varepsilon \le f(x) + \varepsilon$, and it follows that $d(x, z_n) \to 0$. Since $f$ is LSC, $\liminf_{n \to \infty} f(z_n) \ge f(x)$; thus $f(z_n) > f(x) - \varepsilon$ eventually. But now
$$f_n(x) > f(z_n) - \varepsilon > f(x) - 2\varepsilon$$
for large enough $n$. It follows that $0 \le f_n \uparrow f$.
$(x_0, y_0)$ for some $x_0, y_0 \in \Omega$; thus $f(x_{n_k}) \to f(x_0)$, $f(y_{n_k}) \to f(y_0)$ by continuity. But $d(x_{n_k}, y_{n_k}) \to 0$, so $d(x_0, y_0) = 0$ and consequently $e(f(x_0), f(y_0)) = 0$ by continuity of $f$, a contradiction. ∎
We now prove a basic compactness theorem in function spaces.
A8.5 Arzelà-Ascoli Theorem. Let $\Omega$ be a compact topological space, $\Omega_1$ a Hausdorff gauge space, and $G \subset C(\Omega, \Omega_1)$, with the uniform topology. Then $G$ is compact iff the following three conditions are satisfied:
(a) $G$ is closed,
(b) $\{g(x) : g \in G\}$ is a relatively compact subset of $\Omega_1$ for each $x \in \Omega$, and
(c) $G$ is equicontinuous at each point of $\Omega$; that is, if $\varepsilon > 0$, $d \in \mathcal{D}(\Omega_1)$, $x_0 \in \Omega$, there is a neighborhood $U$ of $x_0$ such that if $x \in U$, then $d(g(x), g(x_0)) < \varepsilon$ for all $g \in G$.
PROOF. We first note two facts about equicontinuity.
(1) If $M \subset F(\Omega, \Omega_1)$, where $\Omega$ is a topological space and $\Omega_1$ is a gauge space, and $M$ is equicontinuous at $x_0$, the closure of $M$ in the topology of pointwise convergence is also equicontinuous at $x_0$.
(2) If $M$ is equicontinuous at all $x \in \Omega$, then on $M$, the topology of pointwise convergence coincides with the topology of uniform convergence on compact subsets.
To prove (1), let $f_n \in M$, $f_n \to f$ pointwise; if $d \in \mathcal{D}(\Omega_1)$, we have
$$d(f(x), f(x_0)) \le d(f(x), f_n(x)) + d(f_n(x), f_n(x_0)) + d(f_n(x_0), f(x_0)).$$
If $\delta > 0$, the third term on the right will eventually be less than $\delta/3$ by the pointwise convergence, and the second term will be less than $\delta/3$ for $x$ in some neighborhood $U$ of $x_0$, by equicontinuity. If $x \in U$, the first term is eventually less than $\delta/3$ by pointwise convergence, and the result follows.
To prove (2), let $f_n \in M$, $f_n \to f$ pointwise, and let $K$ be a compact subset of $\Omega$; fix $\delta > 0$ and $d \in \mathcal{D}(\Omega_1)$. If $x \in K$, equicontinuity yields a neighborhood
$U(x)$ such that $y \in U(x)$ implies $d(f_n(y), f_n(x)) < \delta/3$ for all $n$. By compactness, $K \subset \bigcup_{i=1}^{r} U(x_i)$ for some $x_1, \dots, x_r$. Then
$$d(f(x), f_n(x)) \le d(f(x), f(x_i)) + d(f(x_i), f_n(x_i)) + d(f_n(x_i), f_n(x)).$$
If $x \in K$, then $x \in U(x_i)$ for some $i$; thus the third term on the right is less than $\delta/3$ for all $n$, so that the first term is less than or equal to $\delta/3$. The second term is eventually less than $\delta/3$ by pointwise convergence, and it follows that $f_n \to f$ uniformly on $K$.
Now assume (a)-(c) hold. Since $G \subset \prod_{x \in \Omega} \overline{\{g(x) : g \in G\}}$, which is pointwise compact by (b) and the Tychonoff theorem A5.4, the pointwise closure $G_0$ of $G$ is pointwise compact. Thus if $\{g_n\}$ is a net in $G$, there is a subnet converging pointwise to some $g \in G_0$. By (c) and (1), $G_0$ is equicontinuous at each point of $\Omega$; hence by (2), the subnet converges uniformly to $g$. But $g$ is continuous by A8.3; hence $g \in G$ by (a).
Conversely, assume $G$ compact. Since $\Omega_1$ is Hausdorff, so is $C(\Omega, \Omega_1)$ [as well as $F(\Omega, \Omega_1)$], hence $G$ is closed, proving (a). The map $g \to g(x)$ of $G$ into $\Omega_1$ is continuous, and (b) follows from A5.3(a). Finally, if $G$ is not equicontinuous at $x$, there is an $\varepsilon > 0$ and a $d \in \mathcal{D}(\Omega_1)$ such that for each neighborhood $U$ of $x$ there is an $x_U \in U$ and $g_U \in G$ with $d(g_U(x), g_U(x_U)) \ge \varepsilon$. If $U \ge V$ means $U \subset V$, the $g_U$ form a net in $G$, so there is a subnet converging uniformly to a limit $g \in G$. But $x_U \to x$; hence $g_U(x_U) \to g(x)$, a contradiction. [The last step follows from the fact that the map $(x, g) \to g(x)$ of $\Omega \times G$ into
$\Omega_1$ is continuous. To see this, let $x_n \to x$, and $g_n \to g$ uniformly on $\Omega$; if $d \in \mathcal{D}(\Omega_1)$, then $d(g_n(x_n), g(x))$
: in. (In other words, the net is Cauchy.) It may be assumed that $i_n < i_{n+1}$ for all $n$; then the $x_{i_n}$ form a Cauchy sequence which converges to a limit $y \in A$ by completeness. Since $\Omega$ is Hausdorff, we have $y = x$; hence $x \in A$.
We now prove the main theorem on topological completeness.
A9.11 Theorem. Let $\Omega$ be a metrizable topological space. Then $\Omega$ is topologically complete iff it is an absolute $G_\delta$; in other words, $\Omega$ is a $G_\delta$ in every metric space in which it is topologically embedded.
PROOF. If $\Omega$ is topologically complete and is embedded in the metric space $\Omega_1$, then $\Omega$ is dense in $\overline{\Omega}$; so by A9.10, $\Omega$ is a $G_\delta$ in $\overline{\Omega}$. Thus $\Omega$ can be expressed as $\bigcap_{n=1}^{\infty} W_n$, where $W_n = U_n \cap \overline{\Omega}$, $U_n$ open in $\Omega_1$. But $\overline{\Omega}$ is a closed subset of the metric space $\Omega_1$, so $\overline{\Omega}$ is a $G_\delta$ in $\Omega_1$ ($\overline{\Omega} = \bigcap_{n=1}^{\infty} \{x \in \Omega_1 : d(x, \Omega) < 1/n\}$). It follows that $\Omega$ is a $G_\delta$ in $\Omega_1$.
Conversely, let $\Omega$ be an absolute $G_\delta$. Embed $\Omega$ in its completion (we assume as known the standard process of completing a metric space by forming equivalence classes of Cauchy sequences). By A9.9, $\Omega$ is topologically complete. ∎
We now establish topological completeness for a wide class of spaces.
A9.12 Theorem. A locally compact metric space is topologically complete.
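PROOF. If $\Omega$ is such a space and $\overline{\Omega}$ is its completion, then $\Omega$ is open in $\overline{\Omega}$. For if $x \in \Omega$, there is, by local compactness, an open (in $\overline{\Omega}$) set $V$ with $x \in V \cap \Omega$ and $\overline{V \cap \Omega}$ compact. In fact $V \subset \Omega$, proving $\Omega$ open. For if $y \in V$ and $y \notin \Omega$, then for each $U \in \mathcal{U}(y)$, $U \cap V \ne \emptyset$; since $\Omega$ is dense, consequently $U \cap V \cap \Omega \ne \emptyset$. The sets $U \cap V \cap \Omega$, $U \in \mathcal{U}(y)$, form a filterbase $\mathcal{B}$ in $V \cap \Omega$ converging to $y$, and since $\overline{\Omega}$ is Hausdorff, $y$ is the only possible accumulation point of $\mathcal{B}$. But $y \notin V \cap \Omega$, contradicting compactness of $\overline{V \cap \Omega}$ [see A5.2(d)]. By A9.8, $\Omega$ is topologically complete. ∎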
A10 Uniform Spaces
We now give an alternative way of describing gauge spaces.
A10.1 Definitions and Comments. Let $V$ be a relation on the set $\Omega$, that is, a subset of $\Omega \times \Omega$. Then $V$ is called a connector if the diagonal $D = \{(x, x) : x \in \Omega\}$ is a subset of $V$. A nonempty collection $\mathcal{H}$ of connectors is called a uniformity if
(a) for all $V_1, V_2 \in \mathcal{H}$, there is a $W \in \mathcal{H}$ with $W \subset V_1 \cap V_2$ (in other words $\mathcal{H}$ is a filterbase), and
(b) for each $V \in \mathcal{H}$, there is a $W \in \mathcal{H}$ with $WW^{-1} \subset V$.
[The relation $WW^{-1}$ is the composition of the two relations $W$ and $W^{-1}$; that is, $(x, z) \in WW^{-1}$ iff for some $y \in \Omega$ we have $(x, y) \in W^{-1}$ and $(y, z) \in W$; $(x, y) \in W^{-1}$ means $(y, x) \in W$.]
If in addition, $\mathcal{H}$ is a filter, that is, if $V \in \mathcal{H}$ and $V \subset W$, then $W \in \mathcal{H}$, then $\mathcal{H}$ is called a uniform structure. In particular, if $\mathcal{H}$ is a uniformity, the filter generated by $\mathcal{H}$ is a uniform structure.
As an example, let $\mathcal{D}$ be a family of pseudometrics on $\Omega$, and let $\mathcal{H}$ consist of all sets of the form $V = \{(x, y) : d^+(x, y) < \delta\}$, $\delta > 0$, $d^+ \in \mathcal{D}^+$; then $\mathcal{H}$ is a uniformity on $\Omega$. [To see that condition (b) is satisfied, let
$$W = \{(x, y) : d^+(x, y) < \delta/2\};$$
then $WW^{-1} \subset V$ by the triangle inequality.]
By analogy with the pseudometric case, if $V$ belongs to the uniformity $\mathcal{H}$, we say that $x$ and $y$ are $V$-close if $(x, y) \in V$.
Two uniformities $\mathcal{H}_1$ and $\mathcal{H}_2$ are said to be equivalent if they generate the same uniform structure; in other words, given $V_2 \in \mathcal{H}_2$, there is a $V_1 \in \mathcal{H}_1$ with $V_1 \subset V_2$, and given $W_1 \in \mathcal{H}_1$, there is a $W_2 \in \mathcal{H}_2$ with $W_2 \subset W_1$.
Two uniform structures that are equivalent must be equal; hence there is exactly one uniform structure in each equivalence class of uniformities.
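Condition (b) for the pseudometric example can be checked by brute force on a small finite set. This is a sketch only: the eleven-point grid, the value of `delta`, and the helper names `ball_connector`, `inverse`, and `compose` are assumptions made for illustration. The composition follows the convention stated in A10.1.

```python
from itertools import product


def ball_connector(points, dist, delta):
    """V = {(x, y): dist(x, y) < delta}, a connector on a finite set."""
    return {(x, y) for x, y in product(points, repeat=2) if dist(x, y) < delta}


def inverse(rel):
    """R^{-1} = {(y, x): (x, y) in R}."""
    return {(y, x) for (x, y) in rel}


def compose(a, b):
    """(x, z) is in the composition `a b` iff (x, y) in b and (y, z) in a
    for some y, as in the definition of W W^{-1}."""
    return {(x, z) for (x, y) in b for (y2, z) in a if y == y2}


points = [i / 10 for i in range(11)]


def dist(x, y):
    return abs(x - y)


delta = 0.35
V = ball_connector(points, dist, delta)
W = ball_connector(points, dist, delta / 2)
WW_inv = compose(W, inverse(W))   # the relation W W^{-1}
```

Here `WW_inv` is a subset of `V`, which is the finite counterpart of the triangle-inequality argument.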
If $\mathcal{H}$ is a uniform structure and $V, W \in \mathcal{H}$, $WW^{-1} \subset V$, then if $D$ is the diagonal, we have
$$W^{-1} = DW^{-1} \subset WW^{-1} \subset V;$$
hence $W \subset V^{-1}$, so that $V^{-1} \in \mathcal{H}$. But then $V \cap V^{-1} \in \mathcal{H}$ (and $V \cap V^{-1} \subset V$).
Now $U = V \cap V^{-1}$ is symmetric, that is, $U = U^{-1}$. Thus if $\mathcal{H}$ is a uniform structure, the symmetric sets in $\mathcal{H}$ form a uniformity that generates $\mathcal{H}$, so that every uniformity is equivalent to a uniformity containing only symmetric sets.
We are going to show later that every uniform structure is generated by a uniformity corresponding to a family of pseudometrics. As a preliminary, we establish the following result:
A10.2 Metrization Lemma. Let $U_n$, $n = 0, 1, 2, \dots$, be a sequence of subsets of $\Omega \times \Omega$ such that each $U_n$ is a connector, $U_0 = \Omega \times \Omega$, and $U_{n+1}^3$ $(= U_{n+1}U_{n+1}U_{n+1}) \subset U_n$ for all $n$. Then there is a function $d$ from $\Omega \times \Omega$ to the nonnegative reals, such that $d$ satisfies the triangle inequality and $U_n \subset \{(x, y) : d(x, y) < 2^{-n}\} \subset U_{n-1}$ for all $n = 1, 2, \dots$. If each $U_n$ is symmetric, there is a pseudometric satisfying this condition.
PROOF. Define $f(x, y) = 0$ if $(x, y) \in U_n$ for all $n$; $f(x, y) = \tfrac{1}{2}$ if $(x, y) \in U_0 - U_1$; $f(x, y) = \tfrac{1}{4}$ if $(x, y) \in U_1 - U_2$, and in general, $f(x, y) = 2^{-n}$ if $(x, y) \in U_{n-1} - U_n$. Since $U_{n+1} \subset U_{n+1}^3 \subset U_n$, $f$ is well-defined on $\Omega \times \Omega$; furthermore, $(x, y) \in U_n$ iff $f(x, y) < 2^{-n}$.
A10.5
0. Then $\mathcal{F}$ and $\mathcal{H}$ are equivalent, so that the topology induced by $\mathcal{H}$ is identical to the gauge space topology determined by the pseudometrics $d$.
PROOF. It follows from the definition of $\mathcal{H}$ that $\mathcal{H} \subset \mathcal{F}$. Now if $U$ is a symmetric set in $\mathcal{F}$, set $U_0 = \Omega \times \Omega$, $U_1 = U$, $U_2$ a symmetric set in $\mathcal{F}$ such that $U_2^3 \subset U_1$, and in general, let $U_n$ be a symmetric set in $\mathcal{F}$ such that $U_n^3 \subset U_{n-1}$. By A10.2, there is a pseudometric $d$ on $\Omega$ with $U_n \subset V_{d, 2^{-n}} \subset U_{n-1}$ for all $n$. Since $\mathcal{F}$ is a uniform structure, $V_{d, 2^{-n}} \in \mathcal{F}$ for all $n$, hence $V_{d, \delta} \in \mathcal{F}$ for all $\delta > 0$. But $U_2 \subset V_{d, 1/4} \subset U_1 = U$, proving the equivalence of $\mathcal{F}$ and $\mathcal{H}$. ∎
Since uniform spaces and gauge spaces coincide, the concept of uniform continuity, defined previously (see A8.4) for mappings of one gauge space to another, may now be translated into the language of uniform spaces. If $f\colon \Omega \to \Omega_1$, where $\Omega$ and $\Omega_1$ are uniform spaces with uniformities $\mathcal{U}$ and $\mathcal{V}$, $f$ is uniformly continuous iff given $V \in \mathcal{V}$, there is a $U \in \mathcal{U}$ such that $(x, y) \in U$ implies $(f(x), f(y)) \in V$.
We now consider separation properties in uniform spaces.
A10.6 Theorem. Let $\Omega$ be a uniform space, with uniformity $\mathcal{H}$. The following are equivalent:
(a) $\bigcap \{V : V \in \mathcal{H}\}$ is the diagonal $D$.
(b) $\Omega$ is $T_2$.
(c) $\Omega$ is $T_1$.
(d) $\Omega$ is $T_0$.
PROOF. If (a) holds and $x \ne y$, then $(x, y) \notin V$ for some $V \in \mathcal{H}$. If $W$ is a symmetric set in the uniform structure generated by $\mathcal{H}$, and $W^2 \subset V$, then $W(x)$ and $W(y)$ are disjoint overneighborhoods of $x$ and $y$, proving (b). If (d) holds and $x \ne y$, there is a set $V \in \mathcal{H}$ such that $y \notin V(x)$ [or a set $W \in \mathcal{H}$ with $x \notin W(y)$]. But then $(x, y) \notin V$ for some $V \in \mathcal{H}$, proving (a). ∎
Finally, we discuss topological groups and topological vector spaces as uniform spaces.
A topological group is a group on which there is defined a topology which makes the group operations continuous ($x_n \to x$, $y_n \to y$ implies $x_n y_n \to xy$; $x_n \to x$ implies $x_n^{-1} \to x^{-1}$). Familiar examples are the integers, with ordinary addition and the discrete topology; the unit circle $\{z : |z| = 1\}$ in the complex plane, with multiplication of complex numbers and the Euclidean topology; all nonsingular $n \times n$ matrices of complex numbers, with matrix multiplication and the Euclidean topology (on $R^{n^2}$). If $\Omega = \prod_i \Omega_i$, where the $\Omega_i$ are topological groups, then $\Omega$ is a topological group with the product topology if multiplication is defined by $xy = (x_i y_i, i \in I)$.
A10.7 Theorem. Let $\Omega$ be a topological group, and let $\mathcal{H}$ consist of all sets $V_N = \{(x, y) \in \Omega \times \Omega : yx^{-1} \in N\}$, where $N$ ranges over all overneighborhoods of the identity element $e$ in $\Omega$. Then $\mathcal{H}$ is a uniformity, and the topology induced by $\mathcal{H}$ coincides with the original topology. In particular, if $\Omega$ is Hausdorff, it is completely regular.
PROOF. The diagonal is a subset of each $V_N$, and $V_N \cap V_M = V_{N \cap M}$, so only the last condition [A10.1(b)] for a uniformity need be checked. Let $f(x, y) = xy^{-1}$, a continuous map of $\Omega \times \Omega$ into $\Omega$ (by definition of a topological group). If $N$ is an overneighborhood of $e$, there is an overneighborhood $M$ of $e$ such that $f(M \times M) \subset N$. We claim that $V_M V_M^{-1} \subset V_N$. For if $(x, y) \in V_M^{-1}$ and $(y, z) \in V_M$, then $(y, x) \in V_M$; hence $xy^{-1} \in M$ and $zy^{-1} \in M$. But $zx^{-1} = (zy^{-1})(yx^{-1}) = (zy^{-1})(xy^{-1})^{-1} \in f(M \times M) \subset N$. Consequently, $(x, z) \in V_N$, so $\mathcal{H}$ is a uniformity.
Let $U$ be an overneighborhood of the point $x$ in the original topology. If $N = Ux^{-1} = \{yx^{-1} : y \in U\}$, then $V_N(x) = \{y : (x, y) \in V_N\} = \{y : yx^{-1} \in N\} = Nx = U$; therefore $U$ is an overneighborhood of $x$ in the topology induced by $\mathcal{H}$.
Conversely, let $U$ be an overneighborhood of $x$ in the uniform space topology; then $V_N(x) \subset U$ for some overneighborhood $N$ of $e$. But $V_N(x) = Nx$, and the map $y \to yx$, carrying $N$ onto $Nx$, is a homeomorphism of $\Omega$ with itself. Thus $Nx$, hence $U$, is an overneighborhood of $x$ in the original topology. ∎
A10.8 Theorem. Let $\Omega$ be a topological group, with uniformity $\mathcal{H}$ as defined in A10.7. The following are equivalent:
(a) $\mathcal{H}$ is pseudometrizable (see A10.3).
(b) $\Omega$ is pseudometrizable; that is, there is a pseudometric that induces the given topology.
(c) $\Omega$ has a countable base of neighborhoods at the identity $e$ (hence at every $x$, since the map $N \to Nx$ sets up a one-to-one correspondence between neighborhoods of $e$ and neighborhoods of $x$).
PROOF. We obtain (a) implies (b) by the definition of the topology induced by $\mathcal{H}$ (see A10.4). Since every pseudometric space is first countable, it follows that (b) implies (c). Finally, if (c) holds and the sets $N_1, N_2, \dots$ form a countable base at $e$, then by definition of $\mathcal{H}$, the sets $V_{N_1}, V_{N_2}, \dots$ form a countable base for $\mathcal{H}$; hence, by A10.3, $\mathcal{H}$ is pseudometrizable. ∎
A topological vector space is a vector space with a topology that makes addition and scalar multiplication continuous (see 3.5). In particular, a topological vector space is an abelian topological group under addition. The uniformity $\mathcal{H}$ consists of sets of the form $V_N = \{(x, y) : y - x \in N\}$, where $N$ is an overneighborhood of 0. Theorem A10.8 shows that a topological vector space is pseudometrizable iff there is a countable base at 0. The following fact is needed in the proof of the open mapping theorem (see 3.5.9):
A10.9
Theorem. If $L$ is a pseudometrizable topological vector space, there is an invariant pseudometric [$d(x, y) = d(x + z, y + z)$ for all $z$] that induces the topology of $L$.
PROOF. The pseudometric may be constructed by the method given in A10.2 and A10.3, and furthermore, the symmetric sets needed in A10.3 may be taken as sets in the uniformity $\mathcal{H}$ itself rather than in the uniform structure generated by $\mathcal{H}$. For if $N$ is an overneighborhood of 0, so is $-N = \{-x : x \in N\}$ since the map $x \to -x$ is a homeomorphism. Thus $M = N \cap (-N)$ is an overneighborhood of 0. But then $V_M$ is symmetric and $V_M \subset V_N$. Now by definition of $V_M$, we have $(x, y) \in V_M$ iff $(x + z, y + z) \in V_M$ for all $z$, and it follows from the proof of A10.2 that the pseudometric $d$ is invariant. ∎
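The algebra behind the sets $V_N$, the symmetric set $M = N \cap (-N)$, and translation invariance can be checked in a toy setting. The sketch below assumes the additive group of integers mod 12 in place of a topological vector space, so it illustrates only the group-theoretic steps of the argument, not the topology; the names `V`, `neg`, and the particular set `N` are hypothetical choices.

```python
n = 12
elements = range(n)


def V(N):
    """V_N = {(x, y): y - x in N}, with subtraction taken mod 12."""
    return {(x, y) for x in elements for y in elements if (y - x) % n in N}


def neg(N):
    """-N = {-a: a in N}, computed mod 12."""
    return {(-a) % n for a in N}


N = {0, 1, 2, 11}      # contains the identity 0 but is not symmetric
M = N & neg(N)         # M = N intersected with -N
V_M, V_N = V(M), V(N)
```

In this example `V_M` equals its own inverse, is contained in `V_N`, and is unchanged when both coordinates are shifted by the same element, mirroring the three facts used in the proof.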
Bibliography
Apostol, T. M., "Mathematical Analysis." Addison-Wesley, Reading, Massachusetts, 1957.
Bachman, G., and Narici, L., "Functional Analysis." Academic Press, New York, 1966.
Billingsley, P., "Convergence of Probability Measures." Wiley, New York, 1968.
Dubins, L., and Savage, L., "How to Gamble If You Must." McGraw-Hill, New York, 1965.
Dugundji, J., "Topology." Allyn and Bacon, Boston, 1966.
Dunford, N., and Schwartz, J. T., "Linear Operators." Wiley (Interscience), New York, Part 1, 1958; Part 2, 1963; Part 3, 1970.
Halmos, P. R., "Measure Theory." Van Nostrand, Princeton, New Jersey, 1950.
Halmos, P. R., "Introduction to Hilbert Space." Chelsea, New York, 1951.
Halmos, P. R., "Naive Set Theory." Van Nostrand, Princeton, New Jersey, 1960.
Hausdorff, F., "Set Theory." Chelsea, New York, 1962.
Kelley, J. L., and Namioka, I., "Linear Topological Spaces." Van Nostrand, Princeton, New Jersey, 1963.
Liusternik, L., and Sobolev, V., "Elements of Functional Analysis." Ungar, New York, 1961.
Loève, M., "Probability Theory." Van Nostrand, Princeton, New Jersey, 1955; 2nd ed., 1960; 3rd ed., 1963.
Neveu, J., "Mathematical Foundations of the Calculus of Probability." Holden-Day, San Francisco, 1965.
Parthasarathy, K., "Probability Measures on Metric Spaces." Academic Press, New York, 1967.
Royden, H. L., "Real Analysis." Macmillan, New York, 1963; 2nd ed., 1968.
Rudin, W., "Real and Complex Analysis." McGraw-Hill, New York, 1966.
Schaefer, H., "Topological Vector Spaces." Macmillan, New York, 1966.
Simmons, G., "Introduction to Topology and Modern Analysis." McGraw-Hill, New York, 1963.
Taylor, A. E., "Introduction to Functional Analysis." Wiley, New York, 1958.
Titchmarsh, E. C., "The Theory of Functions." Oxford Univ. Press, London and New York, 1939.
Yosida, K., "Functional Analysis." Springer-Verlag, Berlin and New York, 1968.
Solutions to Problems
Chapter 1
Section 1.1
2. We have $\limsup_n A_n = (-1, 1]$, $\liminf_n A_n = \{0\}$.
3. Using $\limsup_n A_n = \{\omega : \omega \in A_n \text{ for infinitely many } n\}$, $\liminf_n A_n = \{\omega : \omega \in A_n \text{ for all but finitely many } n\}$, we obtain
$$\liminf_n A_n = \{(x, y) : x^2 + y^2 < 1\}, \qquad \limsup_n A_n = \{(x, y) : x^2 + y^2 \le 1\} - \{(0, 1), (0, -1)\}.$$
4. If $x = \limsup_n x_n$, then $\limsup_n A_n$ is either $(-\infty, x)$ or $(-\infty, x]$. For if $y \in A_n$ for infinitely many $n$, then $x_n > y$ for infinitely many $n$; hence $x \ge y$. Thus $\limsup_n A_n \subset (-\infty, x]$. But if $y < x$, then $x_n > y$ for infinitely many $n$, so $y \in \limsup_n A_n$. Thus $(-\infty, x) \subset \limsup_n A_n$, and the result follows. The same result is valid for lim inf; the above analysis applies, with "eventually" replacing "for infinitely many $n$."
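The two descriptions of lim sup and lim inf used above can be computed directly for a periodic sequence of finite sets. This is a sketch under assumptions: the sequence `A` with period 3 is an invented example (not one of the sequences in the problems), and periodicity is what allows each infinite tail to be replaced by one full period.

```python
def tail_union(A, n, period):
    """Union of A_k over k >= n; one period suffices for a periodic sequence."""
    return set().union(*(A(k) for k in range(n, n + period)))


def tail_intersection(A, n, period):
    """Intersection of A_k over k >= n, again over one period."""
    return set.intersection(*(set(A(k)) for k in range(n, n + period)))


def lim_sup(A, period):
    # Points lying in A_n for infinitely many n: intersect the tail unions.
    return set.intersection(*(tail_union(A, n, period) for n in range(1, period + 1)))


def lim_inf(A, period):
    # Points lying in A_n for all but finitely many n: unite the tail intersections.
    return set().union(*(tail_intersection(A, n, period) for n in range(1, period + 1)))


def A(n):
    # Hypothetical periodic sequence of sets with period 3.
    return [{0, 1}, {0, 2}, {0, 1, 2}][n % 3]


upper = lim_sup(A, 3)
lower = lim_inf(A, 3)
```

For this sequence the point 0 lies in every set, while 1 and 2 each appear infinitely often but are also missed infinitely often, so `lower` is contained in `upper`.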
Section 1.2
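4. (a) If $-\infty < a < b < c < \infty$, then $\mu(a, c] = \mu(a, b] + \mu(b, c]$, and $\mu(a, \infty) = \mu(a, b] + \mu(b, \infty)$; finite additivity follows quickly. If $A_n = (-\infty, n]$, then $A_n \uparrow R$, but $\mu(A_n) = n \not\to \mu(R) = 0$. Thus $\mu$ is not continuous from below, hence not countably additive.
(b) Finiteness of $\mu$ follows from the definition; since $\mu(-\infty, n] \to \infty$, $\mu$ is unbounded.
5. We have $\mu(\bigcup_{i=1}^{\infty} A_i) \ge \mu(\bigcup_{i=1}^{n} A_i) = \sum_{i=1}^{n} \mu(A_i)$ for all $n$; let $n \to \infty$ to obtain the desired result.
8. The minimal $\sigma$-field $\mathcal{F}$ (which is also the minimal field) consists of the collection $\mathcal{G}$ of all (finite) unions of sets of the form $B_1 \cap B_2 \cap \cdots \cap B_n$, where $B_i$ is either $A_i$ or $A_i^c$. For any $\sigma$-field containing $A_1, \dots, A_n$ must contain all sets in $\mathcal{G}$; hence $\mathcal{G} \subset \mathcal{F}$. But $\mathcal{G}$ is a $\sigma$-field; hence $\mathcal{F} \subset \mathcal{G}$. Since there are $2^n$ disjoint sets of the form $B_1 \cap \cdots \cap B_n$, and each such set may or may not be included in a typical set in $\mathcal{F}$, $\mathcal{F}$ has at most $2^{2^n}$ members. The upper bound is attained if all sets $B_1 \cap \cdots \cap B_n$ are nonempty.
9. (a) As in Problem 8, any field over $\mathcal{C}$ must contain all sets in $\mathcal{G}$; hence $\mathcal{G} \subset \mathcal{F}$. But $\mathcal{G}$ is a field; hence $\mathcal{F} \subset \mathcal{G}$. For if $A_i = \bigcap_j B_{ij}$, then $(\bigcup_i A_i)^c = \bigcap_i \bigcup_j B_{ij}^c$, which belongs to $\mathcal{G}$ because of the distributive law $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
(b) Note that the complement of a finite intersection $\bigcap_j B_{ij}$ belongs to $\mathcal{D}$; for example, if $B_1, B_2 \in \mathcal{C}$, then
$$(B_1 \cap B_2^c)^c = B_1^c \cup B_2 = (B_1^c \cap B_2) \cup (B_1^c \cap B_2^c) \cup (B_2 \cap B_1) \cup (B_2 \cap B_1^c) \in \mathcal{D}.$$
Now $\mathcal{D}$ is closed under finite intersection by the distributive law, and it follows from this and the above remark that $\mathcal{D}$ is closed under complementation and is therefore a field. Just as in the proof that $\mathcal{F} = \mathcal{G}$, we find that $\mathcal{F} = \mathcal{D}$.
(c) This is immediate from (a) and (b).
11. (a) Let $A_n \in \mathcal{S}$, $n = 1, 2, \dots$. Then $A_n$ belongs to some $\mathcal{C}_{\alpha_n}$, and we may assume $\alpha_1 < \alpha_2 < \cdots$, so $\mathcal{C}_{\alpha_1} \subset \mathcal{C}_{\alpha_2} \subset \cdots$. Let $\alpha = \sup_n \alpha_n < \beta_1$. Then all $\mathcal{C}_{\alpha_n} \subset \mathcal{C}_\alpha$, hence all $A_n \in \mathcal{C}_\alpha$. Thus $\bigcup_n A_n \in \mathcal{C}_{\alpha+1} \subset \mathcal{S}$, so $\bigcup_n A_n \in \mathcal{S}$. If $A \in \mathcal{S}$, then $A$ belongs to some $\mathcal{C}_\alpha$; hence $A^c \in \mathcal{C}_{\alpha+1} \subset \mathcal{S}$.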
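(b) We have card $\mathcal{C}_\alpha \le c$ for all $\alpha$. For this is true for $\alpha = 0$, by hypothesis. If it is true for all $\beta < \alpha$, then $\bigcup_{\beta < \alpha} \mathcal{C}_\beta$ has cardinality at most $(\text{card } \alpha)c = c$. Now if $\mathcal{D}$ has cardinality $c$, then $\mathcal{D}'$ has cardinality at most $c^{\aleph_0} = (2^{\aleph_0})^{\aleph_0} = 2^{\aleph_0} = c$. Thus card $\mathcal{C}_\alpha \le c$. It follows that $\bigcup_{\alpha < \beta_1} \mathcal{C}_\alpha$ has cardinality at most $c$.
Section 1.3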
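3. (a) Since $\lambda(\emptyset) = 0$ we have $\Omega \in \mathcal{M}$, and $\mathcal{M}$ is clearly closed under complementation. If $E, F \in \mathcal{M}$ and $A \subset \Omega$, then
$$\begin{aligned}
\lambda[A \cap (E \cup F)] &= \lambda[A \cap (E \cup F) \cap E] + \lambda[A \cap (E \cup F) \cap E^c] \quad \text{since } E \in \mathcal{M} \\
&= \lambda(A \cap E) + \lambda(A \cap F \cap E^c).
\end{aligned}$$
Thus
$$\begin{aligned}
\lambda[A \cap (E \cup F)] + \lambda[A \cap (E \cup F)^c] &= \lambda(A \cap E) + \lambda(A \cap E^c \cap F) + \lambda(A \cap E^c \cap F^c) \\
&= \lambda(A \cap E) + \lambda(A \cap E^c) \quad \text{since } F \in \mathcal{M} \\
&= \lambda(A) \quad \text{since } E \in \mathcal{M}.
\end{aligned}$$
This proves that $\mathcal{M}$ is a field. Also, if $E$ and $F$ are disjoint we have
$$\begin{aligned}
\lambda[A \cap (E \cup F)] &= \lambda[A \cap (E \cup F) \cap E] + \lambda[A \cap (E \cup F) \cap E^c] \\
&= \lambda(A \cap E) + \lambda(A \cap F \cap E^c) \\
&= \lambda(A \cap E) + \lambda(A \cap F) \quad \text{since } E \cap F = \emptyset.
\end{aligned}$$
Now if the $E_n$ are disjoint sets in $\mathcal{M}$ and $F_n = \bigcup_{i=1}^{n} E_i \uparrow E$, then
$$\begin{aligned}
\lambda(A) &= \lambda(A \cap F_n) + \lambda(A \cap F_n^c) \quad \text{since } F_n \text{ belongs to the field } \mathcal{M} \\
&\ge \lambda(A \cap F_n) + \lambda(A \cap E^c) \quad \text{since } E^c \subset F_n^c \text{ and } \lambda \text{ is monotone} \\
&= \sum_{i=1}^{n} \lambda(A \cap E_i) + \lambda(A \cap E^c)
\end{aligned}$$
by what we have proved above. Since $n$ is arbitrary,
$$\lambda(A) \ge \sum_{n=1}^{\infty} \lambda(A \cap E_n) + \lambda(A \cap E^c) \ge \lambda(A \cap E) + \lambda(A \cap E^c)$$
by countable subadditivity of $\lambda$. Thus $E \in \mathcal{M}$, proving that $\mathcal{M}$ is a $\sigma$-field.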
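Now $\lambda(A \cap E) + \lambda(A \cap E^c) \ge \lambda(A)$ by subadditivity, hence
$$\lambda(A) = \sum_{n=1}^{\infty} \lambda(A \cap E_n) + \lambda(A \cap E^c).$$
Replace $A$ by $A \cap E$ to obtain $\lambda(A \cap E) = \sum_{n=1}^{\infty} \lambda(A \cap E_n)$, as desired.
(b) All properties are immediate except for countable subadditivity. If $A = \bigcup_{n=1}^{\infty} A_n$, we must show that $\mu^*(A) \le \sum_{n=1}^{\infty} \mu^*(A_n)$, and we may assume that $\mu^*(A_n) < \infty$ for all $n$. Given $\varepsilon > 0$, we may choose sets $E_{nk} \in \mathcal{F}_0$ with $A_n \subset \bigcup_k E_{nk}$ and $\sum_k \mu(E_{nk}) \le \mu^*(A_n) + \varepsilon 2^{-n}$. Then $A \subset \bigcup_{n,k} E_{nk}$ and $\sum_{n,k} \mu(E_{nk}) \le \sum_n \mu^*(A_n) + \varepsilon$. Thus $\mu^*(A) \le \sum_n \mu^*(A_n) + \varepsilon$, $\varepsilon$ arbitrary.
Now if $A \in \mathcal{F}_0$, then $\mu^*(A) \le \mu(A)$ by definition of $\mu^*$, and if $A \subset \bigcup_n E_n$, $E_n \in \mathcal{F}_0$, then $\mu(A) \le \sum_n \mu(E_n)$ by 1.2.5 and 1.3.1. Take the infimum over all such coverings of $A$ to obtain $\mu(A) \le \mu^*(A)$; hence $\mu^* = \mu$ on $\mathcal{F}_0$.
(c) If $F \in \mathcal{F}_0$, $A \subset \Omega$, we must show that $\mu^*(A) \ge \mu^*(A \cap F) + \mu^*(A \cap F^c)$; we may assume $\mu^*(A) < \infty$. Given $\varepsilon > 0$, there are sets $E_n \in \mathcal{F}_0$ with $A \subset \bigcup_n E_n$ and $\sum_{n=1}^{\infty} \mu(E_n) \le \mu^*(A) + \varepsilon$. Now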
(U (En n F))
µ*(A n F)
p(-s, e) >_
1
k=r k
for some s
for some r
=00.
Thus (b) fails. (Another example: Let p(A) be the number of rational points in A.) Section 1.5
2. If $B \in \mathcal{B}(R)$,
$$\{\omega : h(\omega) \in B\} = \{\omega \in A : h(\omega) \in B\} \cup \{\omega \in A^c : h(\omega) \in B\} = [A \cap f^{-1}(B)] \cup [A^c \cap g^{-1}(B)],$$
which belongs to $\mathcal{F}$ since $f$ and $g$ are Borel measurable.
5. (a) $\{x : f \text{ is discontinuous at } x\} = \bigcup_{n=1}^{\infty} D_n$, where $D_n = \{x \in R^k :$ for all $\delta > 0$, there exist $x_1, x_2 \in R^k$ such that $|x_1 - x| < \delta$ and $|x_2 - x| < \delta$, but $|f(x_1) - f(x_2)| \ge 1/n\}$. We show that the $D_n$ are closed. Let $\{x_\alpha\}$ be a sequence of points in $D_n$ with $x_\alpha \to x$. If $\delta > 0$ and $N = \{y : |y - x| < \delta\}$, then $x_\alpha \in N$ for large $\alpha$, and since $x_\alpha \in D_n$, there are points $x_{\alpha 1}$ and $x_{\alpha 2} \in N$ such that $|f(x_{\alpha 1}) - f(x_{\alpha 2})| \ge 1/n$. Thus $|x_{\alpha 1} - x| < \delta$, $|x_{\alpha 2} - x| < \delta$, but $|f(x_{\alpha 1}) - f(x_{\alpha 2})| \ge 1/n$, so that $x \in D_n$.
The result is true for a function from an arbitrary topological space $S$ to a metric space $(T, d)$. Take $D_n = \{x \in S :$ for every neighborhood $N$ of $x$, there exist $x_1, x_2 \in N$ such that $d(f(x_1), f(x_2)) \ge 1/n\}$ (the above proof goes through with "sequence" replaced by "net"). In fact $T$ may be a uniform space (see the appendix on general topology, Section A10). If $\mathcal{D}$ is the associated family of pseudometrics, we take $D_n = \{x \in S :$ for every neighborhood $N$ of $x$, there exist $x_1, x_2 \in N$ and $d \in \mathcal{D}$ such that $d(f(x_1), f(x_2)) \ge 1/n\}$ and proceed as above.
The result is false if no assumptions are made about the topology of the range space. For example, let $\Omega = \{1, 2, 3\}$, with open sets $\emptyset$, $\Omega$, and $\{1\}$. Define $f\colon \Omega \to \Omega$ by $f(1) = f(3) = 1$, $f(2) = 2$. Then the set of discontinuities is $\{3\}$, which is not an $F_\sigma$.
(b) This follows from part (a) because the irrationals $I$ cannot be expressed as a countable union of closed sets $C_n$. For if this were possible, then each $C_n$ would have empty interior since every nonempty open set contains rational points. But then $I$ is of category 1 in $R$, and since $Q = R - I$ is of category 1 in $R$, it follows that $R$ is of category 1 in itself, contradicting the Baire category theorem.
6. By Problem 11, Section 1.4, there are $c$ Borel subsets of $R^n$; hence there are only $c$ simple functions on $R^n$. Since a Borel measurable function is the limit of a sequence of simple functions, there are $c^{\aleph_0} = c$ Borel measurable functions from $R^n$ to $R$. By 1.5.8, there are only $c$ Borel measurable functions from $R^n$ to $R^k$.
7. (a) Since the $P_n$ are measures, $\sum_k P_n(A_k) = P_n(\Omega) = 1$, and it follows quickly that the $a_{nk}$ satisfy the hypotheses of Steinhaus' lemma. If $\{x_n\}$ is the sequence given by the lemma, let $S = \{k : x_k = 1\}$ and let $B$ be the union of the sets $A_k$, $k \in S$. Then $t_n =$
L. [P,,(Ak) - P(Ak)J = 1 - CC keS 1
l
l-a
[P.(B)
-
l keS
P(Ak)J
and it follows that $t_n$ converges, a contradiction. Thus $P$ is a probability measure. If $A_k \in \mathcal{F}$, $A_k \downarrow \emptyset$, then given $\varepsilon > 0$, we have $P(A_k) < \varepsilon$ for large $k$, say for $k \ge k_0$. Thus $P_n(A_{k_0}) < \varepsilon$ for large $n$, say for $n \ge n_0$. Since the $A_k$ decrease, we have $\sup_{n \ge n_0} P_n(A_k) < \varepsilon$ for $k \ge k_0$, and since $A_k \downarrow \emptyset$, there is a $k_1$ such that for $n = 1, 2, \dots, n_0 - 1$, $P_n(A_k) < \varepsilon$ for $k \ge k_1$. Thus $\sup_n P_n(A_k) < \varepsilon$, $k \ge \max(k_0, k_1)$.
(b) Without loss of generality, assume $P_n(\Omega) \le 1$ for all $n$. Add a point (call it $\infty$) to the space and set $P_n\{\infty\} = 1 - P_n(\Omega)$, $P\{\infty\} = 1 - P(\Omega)$. The $P_n$ are now probability measures, and the result follows from part (a).
Section 1.6
2. $\int_\Omega \sum_{n=1}^{\infty} |f_n|\,d\mu = \sum_{n=1}^{\infty} \int_\Omega |f_n|\,d\mu < \infty$; hence $\sum_{n=1}^{\infty} |f_n|$ is integrable and therefore finite a.e. Thus $\sum_{n=1}^{\infty} f_n$ converges a.e. to a finite-valued function $g$. Let $g_n = \sum_{k=1}^{n} f_k$. Then $|g_n| \le \sum_{k=1}^{\infty} |f_k|$, an integrable function. By the dominated convergence theorem, $\int_\Omega g_n\,d\mu \to \int_\Omega g\,d\mu$, that is,
$$\sum_{n=1}^{\infty} \int_\Omega f_n\,d\mu = \int_\Omega \sum_{n=1}^{\infty} f_n\,d\mu.$$
3. Let $x_0 \in (c, d)$, and let $x_n \to x_0$, $x_n \ne x_0$. Then
$$\frac{1}{x_n - x_0} \left[ \int_a^b f(x_n, y)\,dy - \int_a^b f(x_0, y)\,dy \right] = \int_a^b \left[ \frac{f(x_n, y) - f(x_0, y)}{x_n - x_0} \right] dy.$$
By the mean value theorem,
$$\frac{f(x_n, y) - f(x_0, y)}{x_n - x_0} = f_1(\lambda_n, y)$$
for some $\lambda_n = \lambda_n(y)$ between $x_n$ and $x_0$. By hypothesis, $|f_1(\lambda_n, y)| \le h(y)$, where $h$ is integrable, and the result now follows from the dominated convergence theorem (since $[f(x_n, y) - f(x_0, y)]/[x_n - x_0] \to f_1(x_0, y)$, $f_1(x, \cdot)$ is Borel measurable for each $x$).
8. Let $\mu$ be Lebesgue measure. If $f$ is an indicator $I_B$, $B \in \mathcal{B}(R)$, the result to be proved states that $\mu(B) = \mu(a + B)$, which holds by translation-invariance of $\mu$ (Problem 4, Section 1.4). The passage to nonnegative simple functions, nonnegative measurable functions, and arbitrary measurable functions is done as in 1.6.12.
Section 1.7
2. (a) If $f$ is Riemann-Stieltjes integrable, $\alpha = f = \beta$ a.e. $[\mu]$ as in 1.7.1(a). Thus the set of discontinuities of $f$ is a subset of a set of $\mu$-measure 0, together with the endpoints of the subintervals of the $P_k$. Take a different sequence of partitions having the original endpoints as interior points to conclude that $f$ is continuous a.e. $[\mu]$. Conversely, if $f$ is continuous a.e. $[\mu]$, then $\alpha = f = \beta$ a.e. $[\mu]$. [The result that $f$ continuous at $x$ implies $\alpha(x) = f(x) = \beta(x)$ is true even if $x$ is an endpoint.] As in 1.7.1(a), $f$ is Riemann-Stieltjes integrable.
(b) This is done exactly as in 1.7.1(b).
3. (a) By definition of the improper Riemann integral, $f$ must be Riemann integrable (hence continuous a.e.) on each bounded interval, and the result follows. For the counterexample to the converse, take $f(x) = 1$,
n< x + oo as a -+ - oo, b -+ oo.) (b) Define
fi(x) =
(.1(x)
if -n<x f I9I dµ z A
n
hence $\mu(A_n) < \infty$. For the example, let $\mu$ be Lebesgue measure on $\mathcal{B}(R)$, and let $g(x)$ be any strictly positive integrable function, such as $g(x) = e^{-|x|}$. In this case, $A = R$, so that $\mu(A) = \infty$.
4. If $f$ is an indicator $I_A$, the result is true by hypothesis. If $f$ is a nonnegative simple function $\sum_{j=1}^{n} x_j I_{A_j}$, the $A_j$ disjoint sets in $\mathcal{F}$, then
$$\int_\Omega f\,d\lambda = \sum_{j=1}^{n} x_j \lambda(A_j) = \sum_{j=1}^{n} x_j \int_{A_j} g\,d\mu = \sum_{j=1}^{n} x_j \int_\Omega I_{A_j} g\,d\mu = \int_\Omega f g\,d\mu$$
by the additivity theorem.
If $f$ is a nonnegative Borel measurable function, let $f_1, f_2, \dots$ be nonnegative simple functions increasing to $f$. By what we have just proved, $\int_\Omega f_n\,d\lambda = \int_\Omega f_n g\,d\mu$; hence $\int_\Omega f\,d\lambda = \int_\Omega f g\,d\mu$ by the monotone convergence theorem. Finally, if $f$ is an arbitrary Borel measurable function, write $f = f^+ - f^-$. By what we have just proved,
$$\int_\Omega f^+\,d\lambda = \int_\Omega f^+ g\,d\mu, \qquad \int_\Omega f^-\,d\lambda = \int_\Omega f^- g\,d\mu,$$
and the result follows from the additivity theorem.
6. (a) In the definition of $|\lambda|$, we may assume without loss of generality that the $E_j$ partition $A$. If $A$ is the disjoint union of sets $A_1, A_2, \dots \in \mathcal{F}$, then
$$\sum_{j=1}^{n} |\lambda(E_j)| = \sum_{j=1}^{n} \Big| \sum_{i=1}^{\infty} \lambda(E_j \cap A_i) \Big| \le \sum_{j=1}^{n} \sum_{i=1}^{\infty} |\lambda(E_j \cap A_i)| = \sum_{i=1}^{\infty} \sum_{j=1}^{n} |\lambda(E_j \cap A_i)| \le \sum_{i=1}^{\infty} |\lambda|(A_i).$$
Thus $|\lambda|(A) \le \sum_{i=1}^{\infty} |\lambda|(A_i)$. Now to show the reverse inequality, we may assume $|\lambda|(A) < \infty$; hence $|\lambda|(A_i) \le |\lambda|(A) < \infty$. For each $i$, there is a partition $\{E_{i1}, \dots, E_{in_i}\}$ of $A_i$ such that
$$\sum_{j=1}^{n_i} |\lambda(E_{ij})| > |\lambda|(A_i) - \frac{\varepsilon}{2^i}, \qquad \varepsilon > 0 \text{ preassigned}.$$
Then for any $n$,
$$|\lambda|(A) \ge \sum_{i=1}^{n} \sum_{j=1}^{n_i} |\lambda(E_{ij})| \ge \sum_{i=1}^{n} |\lambda|(A_i) - \varepsilon.$$
Since $n$ and $\varepsilon$ are arbitrary, the result follows.
(b) If $E_1, \dots, E_n$ are disjoint measurable subsets of $A$,
$$\sum_{i=1}^{n} |(\lambda_1 + \lambda_2)(E_i)| \le \sum_{i=1}^{n} |\lambda_1(E_i)| + \sum_{i=1}^{n} |\lambda_2(E_i)| \le |\lambda_1|(A) + |\lambda_2|(A),$$
proving $|\lambda_1 + \lambda_2| \le |\lambda_1| + |\lambda_2|$; $|a\lambda| = |a|\,|\lambda|$ is immediate from the definition of total variation.
(c) If $\mu(A_i) = 0$ and $|\lambda_i|(A_i^c) = 0$, $i = 1, 2$, then $\mu(A_1 \cup A_2) = 0$ and by (b), $|\lambda_1 + \lambda_2|(A_1^c \cap A_2^c) \le |\lambda_1|(A_1^c) + |\lambda_2|(A_2^c) = 0$.
(d) This has been established when $\lambda$ is real (see 2.2.5), so assume $\lambda$ complex, say, $\lambda = \lambda_1 + i\lambda_2$. If $\mu(A) = 0$, then $\lambda_1(A) = \lambda_2(A) = 0$; hence
A 0 `f(x+h)-f(x)l
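If the $a_n$ are $\ge 0$, $\int_\Omega f\,d\mu = \sum_{n=1}^{\infty} a_n$ by the result for simple functions and the monotone convergence theorem. If the $a_n$ are real numbers, then $\int_\Omega f\,d\mu = \int_\Omega f^+\,d\mu - \int_\Omega f^-\,d\mu = \sum_{n=1}^{\infty} a_n^+ - \sum_{n=1}^{\infty} a_n^-$ if this is not of the form $+\infty - \infty$. Finally, if the $a_n$ are complex,
$$\int_\Omega f\,d\mu = \sum_{n=1}^{\infty} \operatorname{Re} a_n + i \sum_{n=1}^{\infty} \operatorname{Im} a_n$$
provided $\operatorname{Re} f$ and $\operatorname{Im} f$ are integrable; since $|\operatorname{Re} a_n|$, $|\operatorname{Im} a_n|$
$\int_\Omega f\,d\mu$ for all finite $F$; hence $\int_\Omega f\,d\mu \ge \sum_\alpha f(\alpha)$. If $f(\alpha) > 0$ for uncountably many $\alpha$, then $\sum_\alpha f(\alpha) = \infty$; hence $\int_\Omega f\,d\mu = \infty$ also. If $f(\alpha) > 0$ for only countably many $\alpha$, then $\int_\Omega f\,d\mu = \sum_\alpha f(\alpha)$ by the monotone convergence theorem. The remainder of the proof is as in part (a).
8. Apply Hölder's inequality with $g = 1$, $f$ replaced by $|f|^r$, $p = s/r$, $1/q = 1 - r/s$, to obtain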
CHAPTER 2
IfI"Pdµ)11p(f
I"dµ-
If l"dµ fA
- (II! II.-E)"µ(A)
Since $\mu(A) > 0$ by definition of $\|f\|_\infty$, we have $\liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty$. If $\|f\|_\infty = \infty$, let $A = \{\omega : |f(\omega)| \ge M\}$ and show that
$$\liminf_{p \to \infty} \|f\|_p \ge M;$$
since $M$ can be arbitrarily large, the result follows.
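The behavior of $\|f\|_p$ as $p \to \infty$ can be observed on a finite measure space. The sketch below assumes a four-point space with total mass 1; the values, the weights, and the helper names `p_norm` and `ess_sup` are illustrative, and the point of measure 0 is included only to show that it does not affect the essential supremum.

```python
def p_norm(values, weights, p):
    """(sum of w_i * |v_i|^p)^(1/p): the L^p norm on a finite measure space."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


def ess_sup(values, weights):
    """Largest |v_i| over points of positive measure."""
    return max(abs(v) for v, w in zip(values, weights) if w > 0)


values = [0.5, 2.0, 3.0, 10.0]
weights = [0.2, 0.3, 0.5, 0.0]   # the last point has measure 0

norms = [p_norm(values, weights, p) for p in (1, 2, 8, 32, 128)]
limit = ess_sup(values, weights)
```

Because the total mass is 1, the computed norms increase with $p$, and they approach `limit` from below rather than the largest listed value, which sits on a null set.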
If $\mu(\Omega) = \infty$, it is still true that $\liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty$; for if $\mu(A) = \infty$ in the above argument, then $\|f\|_p = \infty$ for all $p < \infty$. However, if $\mu$ is Lebesgue measure on $\mathcal{B}(R)$ and $f(x) = 1$ for $n \le x \le n + (1/n)$, $n = 1, 2, \dots$, and $f(x) = 0$ elsewhere, then $\|f\|_p = \infty$ for
p 0, then 1
= p(E)
dp p(E) fE f µ 1
f '(f - z) dµ
r;
$[1/\mu(E)] \int_E f\,d\mu \in D \subset S^c$, a contradiction. Therefore $\mu(E) = 0$, that is, $\mu\{\omega : f(\omega) \in D\} = 0$. Since $\{\omega : f(\omega) \notin S\}$ is a countable union of sets $f^{-1}(D)$, the result follows.
(b) Let $\mu = |\lambda|$; if $E_1, \dots, E_n$ are disjoint measurable subsets of $A$
hence
j=1
I2(E;)I =
j=1
If E1
h dp
$\ge 1$ a.e. Now if $\mu(E) > 0$, then $[1/\mu(E)] \int_E h\,d\mu = \lambda(E)/\mu(E) \in S$, where $S = \{z \in C : |z| \le 1\}$. By (a), $h(\omega) \in S$ for almost every $\omega$, so $|h| \le 1$ a.e. $[|\lambda|]$.
(c) If $E \in \mathcal{F}$, $\int_\Omega I_E h\,d|\lambda| = \int_E h\,d|\lambda| = \lambda(E)$ by (b); also, $\int_\Omega I_E g\,d\mu = \int_E g\,d\mu = \lambda(E)$ by definition of $\lambda$. It follows immediately that $\int_\Omega f h\,d|\lambda| = \int_\Omega f g\,d\mu$ when $f$ is a complex-valued simple function. If $f$ is a bounded, complex-valued Borel measurable function, by 1.5.5(b), there are simple functions $f_n \to f$ with $|f_n|$
$0$ for all $E$; hence $\bar{h} g \ge 0$ a.e. $[\mu]$ by 1.6.11. But if $g(\omega) = |g(\omega)| e^{i\theta(\omega)}$ and $h(\omega) = e^{i\varphi(\omega)}$, then $e^{i(\theta - \varphi)} = 1$ a.e. on $\{g \ne 0\}$, so that $\bar{h} g = |g|$ a.e., as desired.
12.
If I(a, b) can be approximated in L°° by continuous functions, let 0 < E < I and let f be a continuous function such that l'(a, b) -f IIco < e;
hence I'(a, b) -f I 0, there are points x E (a, a + 5)
and y e (a - 5, a) such that I 1 - f (x)I < s and If() I < e. Consequently,
lim supra, f(x) >- 1 - e and lim infra- f(x) < e, contradicting continuity off. Section 2.5 4.
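Let $B_{jk\delta} = \{|f_j - f_k| \ge \delta\}$, $B_\delta = \bigcap_{n=1}^\infty \bigcup_{j,k=n}^\infty B_{jk\delta}$. Then
$$\bigcup_{j,k=n}^\infty B_{jk\delta} \downarrow B_\delta$$
and the proof proceeds just as in 2.5.4.
5. Let $\{f_{n_k}\}$ be a subsequence converging a.e., necessarily to $f$ by Problem 1. By 1.6.9, $f$ is $\mu$-integrable. Now if $\int_\Omega f_n\,d\mu \not\to \int_\Omega f\,d\mu$, then for some $\varepsilon > 0$, we have $|\int_\Omega f_n\,d\mu - \int_\Omega f\,d\mu| \ge \varepsilon$ for $n$ in some subsequence $\{m_k\}$. But we may then extract a subsequence $\{f_{r_j}\}$ of $\{f_{m_k}\}$ converging to $f$ a.e., so that $\int_\Omega f_{r_j}\,d\mu \to \int_\Omega f\,d\mu$ by 1.6.9, a contradiction.
Section 2.6
4.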
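By Fubini's theorem,
$$\mu(C) = \iint I_C\,d\mu = \int_{\Omega_1}\Bigl[\int_{\Omega_2} I_C\,d\mu_2\Bigr]d\mu_1 = \int_{\Omega_1} \mu_2(C(\omega_1))\,d\mu_1(\omega_1).$$
Similarly, $\mu(C) = \int_{\Omega_2} \mu_1(C(\omega_2))\,d\mu_2(\omega_2)$. The result follows since $f \ge 0$, $\int_\Omega f = 0$ implies $f = 0$ a.e.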
7.
(a)
Let
Ank={xefl': (
Bnk = y e 522:
n
- N(s).
fn(re'B) - f(re'B) 12 dO < s,
o l
Since $r_0$ may be chosen arbitrarily close to 1, $N^2(f_n - f) \le \varepsilon$ for $n \ge N(\varepsilon)$, proving completeness.
(e) Since $e_n$ corresponds to $(0, \ldots, 0, 1, 0, \ldots)$, with the 1 in position $n$, in the isometric isomorphism between $H^2$ and a subspace of $l^2$, the $e_n$ are orthonormal. Now if $f \in H^2$, with Taylor coefficients $a_n$, $n = 0, 1, \ldots$, then $\langle f, e_n \rangle = a_n$, again by the isometric isomorphism. Thus
$$N^2(f) = \sum_{n=0}^\infty |a_n|^2 = \sum_{n=0}^\infty |\langle f, e_n \rangle|^2.$$
… $z \in L$.
9. For (a) implies (b), see Problem 7; if (c) holds, then $\{x: \|x\| \le \varepsilon\}$ is compact for small enough $\varepsilon > 0$; hence every closed ball is compact (note that the map $x \to kx$ is a homeomorphism). But any closed bounded set is a subset of a closed ball, and hence is compact. To prove that (f) implies (a), choose $x_1 \in L$ such that $\|x_1\| = 1$. Suppose we have chosen $x_1, \ldots, x_k \in L$ such that $\|x_i\| = 1$ and $\|x_i - x_j\| \ge \frac{1}{2}$ for $i, j = 1, \ldots, k$, $i \ne j$. If $L$ is not finite-dimensional, then $S\{x_1, \ldots, x_k\}$ is a proper subspace of $L$, necessarily closed, by Problem 7(d). By Problem 8, we can find $x_{k+1} \in L$ with $\|x_{k+1}\| = 1$ and $\|x_i - x_{k+1}\| \ge \frac{1}{2}$, $i = 1, \ldots, k$. The sequence $x_1, x_2, \ldots$ satisfies $\|x_n\| = 1$ for all $n$, but $\|x_n - x_m\| \ge \frac{1}{2}$ for $n \ne m$; hence the unit sphere cannot possibly be covered by a finite number of balls of radius less than $\frac{1}{4}$.
11.
(a) Define $\lambda(A) = f(I_A)$, $A \in \mathcal{F}$. If $A_1, A_2, \ldots$ are disjoint sets in $\mathcal{F}$ whose union is $A$, then $\lambda(A) = \sum_{i=1}^\infty f(I_{A_i})$ since $f$ is continuous and $\sum_{i=1}^n I_{A_i} \to I_A$ in $L^p$. [Note that
$$\int_\Omega \Bigl|\sum_{i=1}^n I_{A_i} - I_A\Bigr|^p d\mu = \sum_{i=n+1}^\infty \mu(A_i) \to 0$$
by finiteness of $\mu$.] Thus $\lambda$ is a complex measure on $\mathcal{F}$. If $\mu(A) = 0$, then $I_A = 0$ a.e. $[\mu]$, so we may write $I_A \xrightarrow{L^p} 0$ and use the continuity of $f$ to obtain $\lambda(A) = 0$. By the Radon-Nikodym theorem, we have $\lambda(A) = \int_A y\,d\mu$ for some $\mu$-integrable $y$. Thus $f(x) = \int_\Omega xy\,d\mu$ when $x$ is an indicator; hence when $x$ is a simple function. Since $f$ is continuous, $y$ is $\mu$-integrable, and the finite-valued simple functions are dense in $L^p$, the result holds when $x$ is a bounded Borel measurable function. Now let $y_1, y_2, \ldots$ be nonnegative, finite-valued, simple functions increasing to $|y|$. Then $\|y_n\|_q^q = \int_\Omega y_n^q\,d\mu$
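$$\le \int_\Omega y_n^{q-1}|y|\,d\mu = \int_\Omega y_n^{q-1}e^{-i\arg y}\,y\,d\mu = f(y_n^{q-1}e^{-i\arg y})$$
since $y_n^{q-1}e^{-i\arg y}$ is bounded, and
$$f(y_n^{q-1}e^{-i\arg y}) \le \|f\|\,\|y_n^{q-1}\|_p = \|f\|\,\|y_n\|_q^{q/p}$$
since $(q - 1)p = q$. Thus $\|y_n\|_q \le \|f\|$; hence by the monotone convergence theorem, $\|y\|_q \le \|f\|$. … $> 0$ on the set $F \subset A \cap B$, let $x = I_F$; then $xI_A = xI_B$; hence $\int_\Omega x(y_A - y_B)\,d\mu = 0$, that is,
$$\int_F (y_A - y_B)\,d\mu = 0.$$
But then $\mu(F) = 0$.
(ii) … $y_A$ a.e. on $A$. Since $\|y_{A_n}\|_q$ … we have … Let $B = \{\omega: |y(\omega)| \ge k\}$; then
$$k\mu(B) \le \int_B |y|\,d\mu = \int_\Omega I_B e^{-i\arg y}\,y\,d\mu = f(I_B e^{-i\arg y}) \le \|f\|\,\|I_B\|_1 = \|f\|\,\mu(B).$$
Thus if $k > \|f\|$, we have $\mu(B) = 0$, proving that $y \in L^\infty$ and $\|y\|_\infty \le \|f\|$. As in (a), we obtain $f(x) = \int_\Omega xy\,d\mu$ for all $x \in L^1$, $\|f\| = \|y\|_\infty$, and $y$ is essentially unique.
(d) Part (i) of (b) holds, with the same proof. Now if $\Omega$ is the union of disjoint sets $A_n$, with $\mu(A_n) < \infty$, define $y$ on $\Omega$ by taking $y = y_{A_n}$ on $A_n$. Since $\|y_{A_n}\|_\infty \le \|f\|$ for all $n$, we have $y \in L^\infty$ and $\|y\|_\infty \le \|f\|$. If $x \in L^1$, then … $f(x) = \sum_{n=1}^\infty$ … Since $\|f\|$ …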
… $> 0$ such that $\|f\| < \delta_n$ implies $p_n(f) < 1$; hence $\|f\| < 1$ implies $p_n(f) < 1/\delta_n$. Therefore $W = \{f \in L: p_n(f) < 1/\delta_n \text{ for all } n\}$ is an overneighborhood of 0. For each $n$ let $z_n$ be a point in $K_{n+1}$ but not in $K_n$, and let $f_n \in U_n$ be such that … $\ge 1/\delta_{n+1}$; a choice of $f_n$ is possible because, for example, if … $1/\delta_{n+1}$, and since the $U_n$ decrease with $n$ and $f_n \in U_n$, we have, for any $k$, $f_n \in U_k$ for all $n \ge k$; consequently $f_n \to 0$. But then $f_n \in W$ for large $n$; hence $p_{n+1}(f_n) < 1/\delta_{n+1}$, a contradiction.
(a) We may write $x = xI_{[0,r)} + xI_{[r,1]} = y + z$, and
$$d(x, 0) = \int_0^1 |x(t)|^p\,dt = d(y, 0) + d(z, 0)$$
since $[0, r) \cap [r, 1] = \emptyset$. Now $d(y, 0) = \int_0^r |x(t)|^p\,dt$, which is continuous in $r$, approaches 0 as $r \to 0$, and approaches $d(x, 0)$ as $r \to 1$. By the intermediate value theorem, $d(y, 0) = d(z, 0) = \frac{1}{2}d(x, 0)$ for some choice of $r$.
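The intermediate value argument of part (a) can be mimicked numerically by bisection. This is a minimal sketch, assuming the example $x(t) = t$ and $p = 1/2$ (neither is from the text); the helper names `d_partial` and `halving_point` are hypothetical, and the integral is approximated by a midpoint rule.

```python
def d_partial(x, p, r, steps=2000):
    """Midpoint-rule approximation of the integral of |x(t)|^p over [0, r]."""
    if r <= 0:
        return 0.0
    h = r / steps
    return h * sum(abs(x((k + 0.5) * h)) ** p for k in range(steps))

def halving_point(x, p, tol=1e-9):
    """Bisection for r with d(x I_[0,r), 0) = d(x, 0) / 2.

    Works because r -> d_partial(x, p, r) is continuous and nondecreasing,
    which is the property the intermediate value theorem uses in part (a).
    """
    total = d_partial(x, p, 1.0)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if d_partial(x, p, mid) < total / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Assumed example: x(t) = t with p = 1/2, so d(y, 0) = (2/3) r^(3/2).
r = halving_point(lambda t: t, 0.5)
print(r)
```

For this example the exact answer is $r = (1/2)^{2/3}$, since $(2/3)r^{3/2}$ must equal half of $2/3$.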
(b) By part (a) we can find $y_1$ with $d(y_1, 0) = \frac{1}{2}d(x, 0)$ and $|f(y_1)| \ge \frac{1}{2}$. Let $x_1 = 2y_1$; then $|f(x_1)| \ge 1$ and $d(x_1, 0) = 2^p d(y_1, 0) = 2^{p-1}d(x, 0)$. Having chosen $x_1, \ldots, x_n$ with $|f(x_i)| \ge 1$ and $d(x_i, 0) = 2^{i(p-1)}d(x, 0)$, $i = 1, \ldots, n$, apply part (a) to $x_n$ to obtain $x_{n+1}$ with $|f(x_{n+1})| \ge 1$ and $d(x_{n+1}, 0) = 2^{p-1}d(x_n, 0) = 2^{(n+1)(p-1)}d(x, 0)$. Since $p < 1$, $d(x_n, 0) \to 0$ as $n \to \infty$, as desired.
13. (a) Let $i$ be the identity map from $(L, \mathcal{T}_2)$ to $(L, \mathcal{T}_1)$. Since $\mathcal{T}_1 \subset \mathcal{T}_2$, $i$ is continuous, and by the open mapping theorem, $\mathcal{T}_2 \subset \mathcal{T}_1$. (b) If $\mathcal{T}_j$ is the topology induced by $\|\ \|_j$, $j = 1, 2$, then $\mathcal{T}_1 \subset \mathcal{T}_2$ by hypothesis, and the result follows from part (a).
15. Define $T: L \to C^n$ by $T(x) = (f_1(x), \ldots, f_n(x))$ and define $h: T(L) \to C$ by $h(Tx) = g(x)$. Then $h$ is well-defined; for if $Tx_1 = Tx_2$, then $f_i(x_1) = f_i(x_2)$ for all $i$, so $g(x_1) = g(x_2)$ by hypothesis. Since $h$ is linear on
$T(L)$, it may be extended to a linear functional on $C^n$, necessarily of the form $h(y_1, \ldots, y_n) = c_1 y_1 + \cdots + c_n y_n$. Thus $g(x) = h(Tx) = c_1 f_1(x) + \cdots + c_n f_n(x)$ for all $x \in L$.
16. (a) Assume $f_i(x) = 0$ for all $i$. If $k$ is any real number, $f_i(kx) = 0$ for all $i$, hence $\|kx\| < 1$. Since $k$ is arbitrary, $x$ must be 0. (b) Since the $y_j$ are linearly independent and $T^{-1}$ is one-to-one, the $x_j$ are linearly independent. If $x \in L$, then $Tx = \sum_{i=1}^n c_i y_i$ for some $c_1, \ldots, c_n$; hence $x = \sum_{i=1}^n c_i x_i$.
Chapter 4
Section 4.2
2. Let $\Omega$ be the first uncountable ordinal, with $\mathcal{F}$ the class of all subsets of $\Omega$, and $\mu(A) = 0$ if $A$ is countable, $\mu(A) = \infty$ if $A$ is uncountable. Define, for each $\alpha \in \Omega$, $f_\alpha(\omega) = 1$ if $\omega \le \alpha$; $f_\alpha(\omega) = 0$ if $\omega > \alpha$. Then $f_\alpha \uparrow f$ where $f \equiv 1$, and $\int_\Omega f_\alpha\,d\mu = 0$ for all $\alpha$ since $f_\alpha$ is the indicator of a countable set. But $\int_\Omega f\,d\mu = \infty$, so the monotone convergence theorem fails.
Section 4.3
4. (a) First note that $H$ is a vector space. For if $g$ is continuous, $\{f \in H: f + g \in H\}$ contains the continuous functions and is closed under pointwise limits of monotone sequences, and hence coincides with $H$. Thus if $f \in H$ and $g$ is continuous, then $f + g \in H$. A repetition of this argument (notice the bootstrapping technique) shows that if $g \in H$, then $\{f \in H: f + g \in H\} = H$; hence $H$ is closed under addition. Now if $a$ is real, then $\{f \in H: af \in H\} = H$ by the same argument; hence $H$ is closed under scalar multiplication. Let $\mathcal{S}$ be the open $F_\sigma$ sets. Then $\mathcal{S}$ is closed under finite intersection, and $I_A \in H$ for all $A \in \mathcal{S}$. (By 4.3.5, $I_A$ is the limit of an increasing sequence of continuous functions.) By 4.1.4, $I_A \in H$ for all $A \in \sigma(\mathcal{S})$ [$= \mathcal{A}(\Omega)$ by 4.3.2]. The usual passage to nonnegative simple functions, nonnegative measurable functions, and arbitrary measurable functions shows that all Baire measurable functions belong to $H$. But the class of Baire measurable functions contains the continuous functions and is closed under pointwise limits of monotone sequences; hence $H$ is the class of Baire measurable functions.
(b) All functions in $H$ are Baire measurable by part (a), hence $\sigma(H) \subset \mathcal{A}$. But if $A \in \mathcal{A}$, then $I_A$ is Baire measurable, so $I_A \in H$ by part (a).
But then $I_A$ is $\sigma(H)$-measurable by definition of $\sigma(H)$, hence $A \in \sigma(H)$.
5. Let $H$ be the class of Baire measurable functions. Then $K_0 \subset H$, and if $K_\beta \subset H$ for all $\beta < \alpha$, then $K_\alpha \subset H$ since $H$ is closed under monotone limits. Thus $K \subset H$. But $K$ contains the continuous functions and is closed under pointwise limits of monotone sequences. For if $f_1, f_2, \ldots \in K$, then $f_n \in K_{\alpha_n}$, $n = 1, 2, \ldots$, where $\alpha_n$ …
… $\ge \delta > 0$ for all $n$, let $\omega^*$ be an element of $D$ such that $d(\omega, \omega^*) < \delta/2$. Since $f(\omega_n) \to f(\omega)$, we have $d(\omega_n, \omega^*) \to d(\omega, \omega^*)$ as $n \to \infty$, and therefore $d(\omega_n, \omega) \le d(\omega_n, \omega^*) + d(\omega^*, \omega) < \delta$ for large $n$, a contradiction. Thus $f$ has a continuous inverse.
(b) By part (a), $f$ is a homeomorphism of $\Omega$ and $f(\Omega) \subset [0, 1]^\infty$. Since $\Omega$ is complete, it is a $G_\delta$ in any space in which it is topologically embedded (see the appendix on general topology, Theorem A9.11). Thus $f(\Omega)$ is a $G_\delta$ in $[0, 1]^\infty$, in particular a Borel set.
(c) Let $r \in [0, 1)$ with binary expansion $0.a_1a_2\cdots$ (to avoid ambiguity, do not use expansions that terminate in all ones). Define $g(r) = (a_1, a_2, \ldots)$; $g$ is then one-to-one. If $k/2^n$ is a dyadic rational number with binary expansion $0.a_1 \cdots a_n00\cdots$, then
$$g\Bigl[\frac{k}{2^n}, \frac{k+1}{2^n}\Bigr) = \{y \in S^\infty: y_1 = a_1, \ldots, y_n = a_n\}.$$
Thus $g$ maps finite disjoint unions of dyadic rational intervals onto measurable cylinders, and since $g[0, 1) = S^\infty - \{y \in S^\infty: y_n \text{ is eventually } 1\} = S^\infty -$ a countable set, we have $g[0, 1) \in \mathcal{S}^\infty$ and $g$ and $g^{-1}$ are measurable.
(d) Let $A = \{y \in S^\infty: y_n \text{ is eventually } 1\}$, $B = \{y \in S^\infty: y \in g[0, 1) \text{ and } g^{-1}(y) \text{ is rational}\}$, where $g$ is the map of part (c). Let $q$ be a one-to-one correspondence of the rationals in $[0, 1]$ and $A \cup B$. Define $h: [0, 1] \to S^\infty$ by
$h(x) = g(x)$ if $x$ is irrational,
$h(x) = q(x)$ if $x$ is rational.
Then $h$ has the desired properties.
(e) Define $h: \prod_n \Omega_n \to \prod_n S_n$ by $h(\omega_1, \omega_2, \ldots) = (h_1(\omega_1), h_2(\omega_2), \ldots)$. The mapping $h$ yields the desired equivalence.
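The cylinder identity in part (c) can be spot-checked with exact rational arithmetic. The sketch below is illustrative only: the function name `g_prefix` and the choice $k = 5$, $n = 3$ are assumptions, and the code computes just the first $n$ coordinates of $g(r)$ for rational $r$.

```python
from fractions import Fraction

def g_prefix(r, n):
    """First n binary digits (a_1, ..., a_n) of r in [0, 1).

    Repeated doubling yields the expansion that does not end in all ones,
    matching the convention adopted in part (c).
    """
    digits = []
    for _ in range(n):
        r *= 2
        bit = int(r >= 1)
        digits.append(bit)
        r -= bit
    return tuple(digits)

# Assumed example: k/2^n = 5/8, whose binary expansion is 0.101000...
k, n = 5, 3
left = Fraction(k, 2 ** n)
width = Fraction(1, 2 ** n)

# Seven sample points of the half-open interval [5/8, 6/8).
samples = [left + width * Fraction(j, 7) for j in range(7)]
prefixes = {g_prefix(s, n) for s in samples}
print(prefixes)
```

Every sampled point of $[5/8, 6/8)$ lands in the cylinder with first three coordinates $(1, 0, 1)$, while the excluded right endpoint $6/8$ starts with $(1, 1, 0)$.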
Section 4.5
3. Assume $\mu_n \xrightarrow{w} \mu$, and let $A$ be a bounded Borel set whose boundary has $\mu$-measure 0. Let $V$ be a bounded open set with $V \supset \overline{A}$, and let $G_j = \{x \in V: d(x, A) < 1/j\}$. If $f_j$ is a continuous map from $\Omega$ to $[0, 1]$ such that $f_j = 1$ on $\overline{A}$ and $f_j = 0$ off $G_j$ [see A5.13(b)], then
$|\mu_n(A) - \mu(A)|$ … $\mu(A)$, proving that (a) implies (b).
Assume $\mu_n(A) \to \mu(A)$ for all bounded Borel sets with $\mu(\partial A) = 0$, and let $f$ be a bounded function from $\Omega$ to $R$, continuous a.e. $[\mu]$ with $\operatorname{supp} f \subset K$, $K$ compact. Let $V$ and $W$ be bounded open sets such that $K \subset V \subset \overline{V} \subset W$ [see A5.12(c)]. Now
$$\overline{V} = \bigcap_{\delta > 0} \{x \in W: d(x, V) < \delta\} = \bigcap_{\delta > 0} W_\delta;$$
the $W_\delta$ are open and $\partial W_\delta \subset \{x \in W: d(x, V) = \delta\}$.
Thus the $W_\delta$ have disjoint boundaries and $\mu(W) < \infty$, so $\mu(\partial W_\delta) = 0$ for some $\delta$. Therefore we may assume without loss of generality that we have $K \subset V$, with $V$ a bounded open set and $\mu(\partial V) = 0$.
Now if $A \subset V$, the interior of $A$ is the same relative to $V$ as to the entire space $\Omega$ since $V$ is open. The closure of $A$ relative to $V$ is given by $\overline{A}_V = \overline{A} \cap V$; hence the boundary of $A$ relative to $V$ is $\partial_V A = (\partial A) \cap V$.
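If $A$ is a Borel subset of $V$ and $\mu(\partial_V A) = 0$, then $\mu[(\partial A) \cap V] = 0$; also
$$(\partial A) \cap V^c \subset \overline{A} \cap V^c \subset \overline{V} - V = \overline{V} - V^\circ = \partial V;$$
hence $\mu[(\partial A) \cap V^c] = 0$, so that $\mu(\partial A) = 0$. Thus by hypothesis, $\mu_n(A) \to \mu(A)$. By 4.5.1, if $\mu_n'$ and $\mu'$ denote the restrictions of $\mu_n$ and $\mu$ to $V$, we have $\mu_n' \xrightarrow{w} \mu'$. Since $f$ restricted to $V$ is still bounded and continuous a.e. $[\mu]$, we have $\int_V f\,d\mu_n' \to \int_V f\,d\mu'$, that is, $\int_\Omega f\,d\mu_n \to \int_\Omega f\,d\mu$. This proves that (b) implies (c); (c) implies (a) is immediate.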
Subject Index
A Absolute continuity of functions, 70 of measures, 59, 63 Absolute $G_\delta$, 233 Absolute homogeneity, 114 Absorption of one set by another, 167 Accumulation point of filterbase, 206 of net, 204 Additivity theorem for integrals, 45 Adjoint of linear operator, 149 Algebra of sets, 4 Almost everywhere, 46 Annihilator of subset of normed linear space, 149 Approximation of Baire or Borel sets by closed, compact, or open sets, 34, 179-183 by continuous functions, 88, 185-188 by simple functions, 38, 88, 90 Arzela-Ascoli theorem, 228
B
Baire category theorem, 230 Baire sets, 178 Baire space, 231 Banach space, 114 Banach-Alaoglu theorem, 162 Bessel's inequality, 119 Bilinear form, 150 Borel-Cantelli lemma, 66 Borel equivalence, 195 Borel measurable functions, 35, 36 complex-valued, 80 properties of, 38-40 Borel sets, 7, 27, 178 Bornivore, 167 Bornological space, 167 Bounded linear operators, 128 weakly, 150 Bounded set, in topological vector space, 167
Bounded variation, 71
C
c, space of convergent sequences of complex numbers, 115, 136 Cantor function, 77, 78 Cantor sets, 33, 34 Caratheodory extension theorem, 19 Cardinality arguments, 13, 34, 42 Cauchy in measure, 93 Cauchy-Schwarz inequality, 82, 116 for sums, 87 Chain rule, 69 Change of variable formula for multiple integrals, 78 Chebyshev's inequality, 84 Circled set, 151 Closed graph theorem, 148, 166 Closed linear operator, 147 unbounded, 150 Cluster point, 203 Compact topological space, 213 countably, 216 locally, 217 relatively, 217 sequentially, 217 σ-, 218 Compactification, one-point, 220 Complete metric space, 230
Complete orthonormal set, 122 Completeness of $L^p$ spaces, 85, 90 Completion of measure space, 18 Complex measure, 69 Composition of measurable functions, 39 Conjugate isometry, 131 Conjugate linear map, 131 Conjugate space, 142 second, 142 Consistent probability measures, 190, 191 Continuity of countably additive set functions, 10, 11 Continuity point of distribution function, 198

Continuous functions dense in $L^p$, 88, 185, 188
Continuous linear functionals, 130, 135 extension of, 140, 156 representations of, 130-133, 136, 137, 184-186,188 space of, 131, 141
Convergence of filterbases, 205 of nets, 203 in normed linear spaces strong, 144 weak, 144 of sequences of linear operators, 134 strong, 134, 144, 149 uniform, 134 of sequences of measurable functions, 92ff.
almost everywhere, 47 almost uniform, 93 in $L^p$, 88 in $L^\infty$, 89 in measure, 92 in probability, 92 Convergence theorems for integrals, 44, 47, 49 Convex sets, 119
in topological vector spaces, 154ff. Countably additive set function, 6, 43, 62 expressed as difference of measures, 11, 44, 61
Counting measure, 7 Cylinder, 108, 189 D
Daniell integral, 170ff., 175 Daniell representation theorem, 175 Decreasing sequence of sets, 1

De Morgan laws, 1 Density, 66 Derivative of function of bounded variation, 76 of signed measure, 74 Radon-Nikodym, 66 Difference operator, 27 Differentiation under integral sign, 52 of measures, 74ff. Dini's theorem, 181 Directed set, 203 Discontinuous linear functional, 135 Discrete distribution function, 76 Distribution function, 23, 29 decomposition of, 76 Dominated convergence theorem, 49 extension of, 96 Dynkin system, 168, 169
E Egoroff's theorem, 94 Equicontinuity, 228 Extension of finitely additive set functions, 149
Extension theorems for measures, 13ff., 18, 19, 22, 183, 184
F $F_\sigma$ set, 42, 178 Fatou's lemma, 48 Field of sets, 4 Filter, 205 Filterbase, 205 subordinate, 206 Finitely additive set function, 6 not countably additive, 11, 12 σ-finite, 9 First countable space, 204, 212 Fubini's theorem, 101, 104 classical, 103, 106 Functional analysis, 113ff. basic theorems of, 138ff. G $G_\delta$ set, 178
Gauge space, 226, 237 Good sets principle, 5 Gram-Schmidt process, 125 Gramian, 125
H Hahn-Banach theorem, 139, 140, 149 Hausdorff space, 211 Heine-Borel theorem, 213 Hermite polynomials, 125 Hilbert spaces, 114, 116ff. classification of, 123, 124 separable, 124, 133 Holder inequality, 82 for sums, 87 I
Identification topology, 209
Increasing sequence of sets, 1 Indicator, 35 Inner product, 114 space, 114 Integrable function, 37, 81 Integral, 36ff. as countably additive set function, 43 indefinite, 59, 73 Integration of series, 46, 52 Internal point, 154 Isometric isomorphism, 123, 133, 137, 142, 163, 186, 189
J Jordan-Hahn decomposition theorem, 60
K
Kolmogorov extension theorem, 191, 194
L $L^p$ spaces, 80ff. completeness of, 85 continuous linear functionals on, 131-133, 137, 165 $l^p$, $l^p(\Omega)$, 87 $L^\infty$, 89 $l^\infty$, $l^\infty(\Omega)$, 90
Lattice operations, 170 Lebesgue decomposition theorem, 68, 76 Lebesgue integrable function, 51 Lebesgue integral abstract, 36ff.
comparison with Riemann integral, 53 Lebesgue measurable function, 39 Lebesgue measurable sets, 26, 31, 33, 54 Lebesgue measure, 26, 31, 100, 106 Lebesgue set, 78 Lebesgue-Stieltjes measure, 23, 27 Legendre polynomials, 125 Lim inf (lower limit), 2 Lim sup (upper limit), 2 Limit under integral sign, 52 of sequence of sets, 3, 12, 52
Lindelof space, 212 Linear functionals, 130 continuous, 130, see also Continuous linear functionals positive, 170 Linear manifold, 119 generated by set, 121 Linear operator(s), 127ff.
bounded, 128 closed, 147 continuous, 128 with discontinuous inverse, 147 idempotent, 130 range and null space of, 149 spaces of, 133 Lipschitz condition, 78 Locally compact space, 217 Locally convex topological vector space, 153
characterization of, 156 Lusin's theorem, 187
M Measurable cylinder, 108, 189 Measurable function, 35 Borel, 35 jointly, 101, 107 Lebesgue, 39 Measurable rectangle, 97, 108, 189 Measurable sets and spaces, 35 Measure(s), 6 absolutely continuous, 59, 63 complete, 18 complex, 69 extension of, 13ff., 183, 184 on field, 6 finite, 9 on infinite product spaces, 108ff., 189ff. Lebesgue, 26, 31, 100, 106 Lebesgue-Stieltjes, 23, 27 outer, 16, 22 probability, 6 product, 97, 100, 104, 106, 109, 111 regular, 183, 189 σ-finite, 9 signed, 62 singular, 59, 66 spaces of, 186, 189
on topological spaces, 178ff. uniformly σ-finite, 97 Measure-preserving transformation, 50 Minkowski functional, 154 Minkowski inequality, 83 for sums, 87 Monotone class theorem, 19 Monotone convergence theorem, 44 extended, 47 Monotone set function, 16
N
Negative part of countably additive set function, 62 of function, 37 Neighborhood, 201 Net, 203 Norm(s), 84, 114 on finite-dimensional space, 134 inducing same topology, 134-136 of linear operator, 128, 133 Normal topological space, 178, 211 Normed linear space, 114 linear operators on, 127ff.
O Open mapping theorem, 147, 159 Orthogonal complement, 121 Orthogonal direct sum, 121 Orthogonal elements, 118 Orthogonal set, 118 Orthonormal basis, 122 Orthonormal set, 118 complete, 122 Outer measure, 16, 22 Overneighborhood, 201
P Parallelogram law, 118 Parseval relation, 122 Polarization identity, 124 Polish space, 180 Positive homogeneity, 138 Positive linear functional, 170
Positive part of countably additive set function, 62 of function, 37 Pre-Hilbert space, 114 Probability measure, 6 Product measure theorem, 97, 104 classical, 100, 106, 111 infinite-dimensional, 109, 111 Product σ-field, 97, 108, 189 Product topology, 208 Projection, 119, 120, 130, 148 Projection theorem, 121 Pseudometric, 84, 226 Pseudonorm, 84 Pythagorean relation, 118 Q
Quotient space, 135 Quotient topology, 210
R Radial, 154 Radial kernel, 154 Radon-Nikodym derivative, 66 Radon-Nikodym theorem, 63 Rectangle, 96, 97, 108, 189 Reflexivity, 142, 163, 164 Regular measure, 183, 189 Regular topological space, 211 completely, 212 Riemann integral, 53-57 Riemann-Stieltjes integral, 56 Riesz lemma, 136 Riesz representation theorem, 130, 133, 181, 182, 184-186, 188 S
Second countable space, 212 Section of set, 98 Semicontinuous functions, 220 Semimetric, 84 Seminorm(s), 84, 113 family of, generating locally convex topology, 153 Separable Hilbert spaces, 124, 133
Separable topological spaces, 212 Separation properties for topological spaces, 211 Separation theorems, 159-161 strong, 160 Set function, 3 countably additive, 6 finite, 9 finitely additive, 6 Shift operator, 129 one-sided (unilateral), 129 two-sided (bilateral), 129 σ-field (σ-algebra), 4 countably generated, 111, 148 minimal, 5, 12 Simple functions, 36 dense in $L^p$, 88, 90 Singular distribution function, 77, 78 Singular measures, 66 Solvability theorem, 150 Space spanned by set, 121 Steinhaus' lemma, 42 Stone's theorem, 200 Stone-Weierstrass theorem, 225 Strong convergence in normed linear space, 144 of operators, 134, 144, 149 Strong topology, 144, 161 Subadditivity, 114, 138 countable, 16 Sublinear functional, 138 Subnet, 204 Subspace, 119
closed, 119
T T, spaces, 211, 212 Tails of net, 205 Tietze extension theorem, 211 Topological isomorphism, 136 Topological spaces, 201ff, measures on, 178ff. Topological vector space, 114, 150ff. locally convex, 153 Topologically complete space, 232 Topology of pointwise convergence, 208, 227 of uniform convergence, 153, 227 Total variation, 62, 69
Translation-invariance of Lebesgue measure, 33 Tychonoff theorem, 215 U
Ultrafilter, 206 Uniform boundedness principle, 143 Uniform convergence of operators, 134 Uniform space, 237 Uniform structure, 234 Uniformity, 234 Urysohn metrization theorem, 219 Urysohn's lemma, 211
V
Vague (=weak) convergence of measures, 198 Variation bounded, 71 of function, 71 lower, 62 total, 62, 69 upper, 62 Vitali-Hahn-Saks theorem, 43

W Weak and weak* compactness, 162-164 Weak convergence, 144, 161 of distribution functions, 199 of measures, 196-199 Weak* convergence and weak* topology, 161

Weak topology, 144, 161