§16. Applications of the convergence theorems

0, that is, differentiable in $]0,+\infty[$, and

$$\varphi'(x) = -\int e^{-x(1+w^2)}\,\lambda^1(dw) \quad\text{for } x > 0, \tag{16.7}$$

and via the substitution $t = w\sqrt{x}$ this reads

$$\varphi'(x) = -G\,x^{-1/2}e^{-x} \quad\text{for } x > 0, \tag{16.8}$$

where $G$ designates the integral (16.1) that we are trying to explicitly compute. Its existence is already part of the preceding analysis, but can also be inferred from the majorization $e^{-t^2} \le e^{-t}$, which holds for $t \ge 1$. From (16.6), (16.8) and the fundamental theorem of calculus

$$\varphi(x) - \varphi(a) = G\int_x^a t^{-1/2}e^{-t}\,dt = 2G\int_{\sqrt{x}}^{\sqrt{a}} e^{-w^2}\,dw \quad\text{for } x > 0 \text{ and } a > 0.$$

Upon letting $a$ run to $+\infty$, we will get

$$\varphi(x) = 2G\int_{\sqrt{x}}^{+\infty} e^{-w^2}\,dw \tag{16.9}$$

if we notice that $\varphi(a) \to 0$ as $a \to +\infty$, which in turn is a consequence of the inequalities

$$\varphi(a) \le e^{-a}\int (1+w^2)^{-1}\,\lambda^1(dw) = \varphi(0)\,e^{-a} \quad\text{for all } a > 0.$$

Because $\varphi$ is continuous on $\mathbb{R}_+$ we can pass to the limit $x \to 0+$ in (16.9) and get

$$\pi = \varphi(0) = 2G\int_0^{+\infty} e^{-w^2}\,dw = G^2,$$

using the obvious (on grounds of symmetry) fact that $\int_{-\infty}^{0} e^{-w^2}\,dw = \int_0^{+\infty} e^{-w^2}\,dw$. Since $G > 0$, it follows finally that $G = \sqrt{\pi}$. That is,

$$\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi} \tag{16.10}$$

or equivalently, in the form seen in probability theory,

$$\int_{-\infty}^{+\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}. \tag{16.10'}$$
This derivation goes back to ANONYME [1889] and VAN YZEREN [1979]. A particularly short alternative one is made possible by Tonelli's theorem (cf. Exercise 4 in §23).
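The identities (16.9) and (16.10) can be checked numerically. The following is only a sketch under an assumption: it takes $\varphi(x) = \int e^{-x(1+w^2)}(1+w^2)^{-1}\,\lambda^1(dw)$ over $\mathbb{R}$, which is what the bound $\varphi(a) \le \varphi(0)e^{-a}$ and the value $\varphi(0) = \pi$ suggest, since definition (16.6) itself lies outside this passage. With $G = \sqrt{\pi}$ the right-hand side of (16.9) equals $\pi\,\mathrm{erfc}(\sqrt{x})$. The function names and the step count are illustrative choices.

```python
import math

def phi(x, steps=20000):
    """Midpoint-rule approximation of the assumed function
    phi(x) = integral over R of exp(-x(1+w^2)) / (1+w^2) dw.
    The substitution w = tan(theta) turns it into the integral of
    exp(-x / cos(theta)^2) over ]-pi/2, pi/2[."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h
        total += math.exp(-x / math.cos(theta) ** 2)
    return total * h

def closed_form(x):
    """Right-hand side of (16.9) with G = sqrt(pi):
    2G * integral from sqrt(x) to infinity of exp(-w^2) dw = pi * erfc(sqrt(x))."""
    return math.pi * math.erfc(math.sqrt(x))

if __name__ == "__main__":
    for x in (0.0, 0.25, 1.0):
        print(x, phi(x), closed_form(x))
```

At $x = 0$ both sides reduce to $\pi = G^2$, which is exactly the step that yields $G = \sqrt{\pi}$.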
Exercises. 1. Which of the two functions below are integrable, which are square-integrable with respect to Lebesgue-Borel measure on the indicated intervals?
(a) $f(x) := x^{-1}$, $x \in I := [1,+\infty[$;
(b) $f(x) := x^{-1/2}$, $x \in I := \,]0,1]$.
2. Show that for every real number $\alpha > 0$ the function $x \mapsto e^{-\alpha x}$ is $\lambda^1$-integrable over $\mathbb{R}_+$.
3. Show that for every real number $\alpha > 0$ the function

$$x \mapsto e^{-\alpha x}\Big[\frac{\sin x}{x}\Big]^3$$

is $\lambda^1$-integrable over $]0,+\infty[$ and that

$$\alpha \mapsto \int_0^{+\infty} e^{-\alpha x}\Big[\frac{\sin x}{x}\Big]^3\,\lambda^1(dx)$$

is continuous on $]0,+\infty[$.

II. Integration Theory
§17. Measures with densities: the Radon-Nikodym theorem

Again let $(\Omega, \mathscr{A}, \mu)$ be an arbitrary measure space and $E^* = E^*(\Omega, \mathscr{A})$ the set of all $\mathscr{A}$-measurable, non-negative numerical functions on $\Omega$. In 12.4 we defined the integral of every function $f \in E^*$ over every set $A \in \mathscr{A}$. We are interested here in how this integral behaves with respect to $A$.

17.1 Theorem. For each function $f \in E^*$ the equation

$$\nu(A) := \int_A f\,d\mu \tag{17.1}$$

defines a measure $\nu$ on $\mathscr{A}$.

Proof. $\nu(\emptyset) = 0$ and $\nu \ge 0$. For every sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets from $\mathscr{A}$ with $A := \bigcup_{n\in\mathbb{N}} A_n$,

$$1_A f = \sum_{n=1}^{\infty} 1_{A_n} f$$

and so by 11.5

$$\nu(A) = \sum_{n=1}^{\infty} \nu(A_n),$$

the final property needing to be checked in confirming that $\nu$ is a measure on $\mathscr{A}$. □
17.2 Definition. If $f$ is a non-negative $\mathscr{A}$-measurable numerical function on $\Omega$, then the measure $\nu$ defined on $\mathscr{A}$ by (17.1) is called the measure having density $f$ with respect to $\mu$. It will be denoted by

$$\nu = f\mu. \tag{17.2}$$
Concerning the relationship between $\nu$- and $\mu$-integrals we will show

17.3 Theorem. Let $f, \varphi \in E^*$, $\nu := f\mu$. Then

$$\int \varphi\,d\nu = \int \varphi f\,d\mu \tag{17.3}$$

or, written out,

$$\int \varphi\,d(f\mu) = \int \varphi f\,d\mu. \tag{17.3'}$$

An $\mathscr{A}$-measurable function $\varphi : \Omega \to \overline{\mathbb{R}}$ is $\nu$-integrable if and only if $\varphi f$ is $\mu$-integrable. In this case (17.3) is again valid.

Proof. First suppose $\varphi = \sum_{i=1}^n \alpha_i 1_{A_i}$ is an $\mathscr{A}$-elementary function. In this case (17.3) holds because

$$\int \varphi\,d\nu = \sum_{i=1}^n \alpha_i\,\nu(A_i) = \sum_{i=1}^n \alpha_i \int 1_{A_i} f\,d\mu = \int \varphi f\,d\mu.$$

For an arbitrary $\varphi \in E^*$ there is a sequence $(u_n)$ in $E$ such that $u_n \uparrow \varphi$. Since then $u_n f \uparrow \varphi f$ as well, (17.3) follows from 11.4. Finally, consider any $\mathscr{A}$-measurable numerical function $\varphi$ on $\Omega$. By now we know that

$$\int \varphi^+\,d\nu = \int \varphi^+ f\,d\mu = \int (\varphi f)^+\,d\mu$$

and

$$\int \varphi^-\,d\nu = \int \varphi^- f\,d\mu = \int (\varphi f)^-\,d\mu.$$

From these equations and the definition of integrability follows the second part of the theorem. □

It now follows that the formation of measures with densities is transitive:
17.4 Corollary. Let $f, g \in E^*$, $\nu := f\mu$ and $\varrho := g\nu$. Then $\varrho = (gf)\mu$, that is,

$$g(f\mu) = (gf)\mu. \tag{17.4}$$

Proof. For every $A \in \mathscr{A}$

$$\varrho(A) = \int_A g\,d\nu = \int 1_A g\,d\nu$$

and furthermore, according to 17.3,

$$\int 1_A g\,d\nu = \int 1_A g f\,d\mu = \int_A (gf)\,d\mu.$$

We thus obtain $\varrho(A) = \int_A gf\,d\mu$ for all $A \in \mathscr{A}$, which is what had to be proved. □
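On a finite set, where a measure is determined by its point masses and integrals are finite sums, 17.1, 17.3 and 17.4 can be checked by direct computation. The sketch below is illustrative only: the four-point space, the weights and the functions `f`, `g`, `phi` are hypothetical choices, not taken from the text.

```python
from fractions import Fraction as F

# Hypothetical finite measure space: point masses of mu on four points.
OMEGA = ("a", "b", "c", "d")
mu = {"a": F(1, 2), "b": F(1), "c": F(2), "d": F(0)}

def integral(phi, measure, A=OMEGA):
    """Integral of phi over A with respect to a measure given by point masses."""
    return sum(phi[w] * measure[w] for w in A)

def with_density(f, measure):
    """Point masses of the measure f*measure defined by (17.1)."""
    return {w: f[w] * measure[w] for w in OMEGA}

def product(f, g):
    """Pointwise product of two functions on OMEGA."""
    return {w: f[w] * g[w] for w in OMEGA}

f = {"a": F(3), "b": F(0), "c": F(1, 4), "d": F(7)}
g = {"a": F(2), "b": F(5), "c": F(1), "d": F(1, 3)}
phi = {"a": F(1), "b": F(4), "c": F(6), "d": F(2)}

nu = with_density(f, mu)

# (17.3): integrating phi against nu equals integrating phi*f against mu.
lhs = integral(phi, nu)
rhs = integral(product(phi, f), mu)

# (17.4): g(f mu) = (g f) mu.
chained = with_density(g, nu)
direct = with_density(product(g, f), mu)
```

Note that `nu` gives mass 0 to the point `"d"` although `f` is large there: the density only matters where $\mu$ is positive, which anticipates the uniqueness question treated in 17.5.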
On the question of uniqueness of density functions we have

17.5 Theorem. For functions $f, g \in E^*$

$$f = g \ \ \mu\text{-almost everywhere} \implies f\mu = g\mu. \tag{17.5}$$

If either $f$ or $g$ is $\mu$-integrable, the converse implication holds as well.

Proof. If $f$ and $g$ coincide $\mu$-almost everywhere, then so do $1_A f$ and $1_A g$ for each $A \in \mathscr{A}$, whence

$$\int_A f\,d\mu = \int_A g\,d\mu \quad\text{for all } A \in \mathscr{A},$$

which just says that $f\mu = g\mu$. Now suppose that $f$ is $\mu$-integrable and that $f\mu = g\mu$. Since $g \ge 0$ and $\int g\,d\mu = \int f\,d\mu < +\infty$, $g$ is also $\mu$-integrable. Let us show that the set

$$N := \{f > g\},$$

which lies in $\mathscr{A}$ by 9.3, is a $\mu$-nullset. For every $\omega \in N$, $f(\omega) - g(\omega)$ is defined and is positive, which means that the definition

$$h := 1_N f - 1_N g$$

makes sense. The functions $1_N f$, $1_N g$, being majorized by the $\mu$-integrable functions $f$, $g$, are themselves integrable. Because $f\mu = g\mu$, they have the same $\mu$-integral. From this we get that

$$\int h\,d\mu = \int_N f\,d\mu - \int_N g\,d\mu = 0.$$

Since $N = \{h > 0\}$, this equality and 13.2 tell us that $\mu(N) = 0$. With the roles of $f$ and $g$ reversed, this conclusion reads $\mu(N') = 0$, where $N' := \{g > f\}$. Since $\{f \ne g\} = N \cup N'$, the desired conclusion, namely that $\{f \ne g\}$ is a $\mu$-nullset, is obtained. □

The converse of implication (17.5) is not valid without some additional hypothesis on the densities $f$ and $g$. The next example illustrates this.
Example. 1. As in Example 2 of §3 let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (see Example 2 in §1). But the measure $\mu$ will be defined on $\mathscr{A}$ by $\mu(A) := 0$ or $+\infty$, according as $A$ or $\complement A$ is countable. If $f$ and $g$ are the constant functions on $\Omega$ with the respective values 1 and 2, then indeed $f\mu = g\mu$, yet $f(\omega) = g(\omega)$ holds for no $\omega \in \Omega$. Of course, it then follows from 17.5 that neither $f$ nor $g$ is $\mu$-integrable.

Before turning to the principal problem of this section, we will examine another characterization of $\sigma$-finite measures which is important for what follows and is of interest in its own right.
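17.6 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The measure $\mu$ is $\sigma$-finite if and only if there exists a $\mu$-integrable function $h$ on $\Omega$ which satisfies

(17.6) $0 <$ [...]

[...] $> 0$ there exists $\delta > 0$ such that

$$A \in \mathscr{A} \text{ and } \mu(A) \le \delta \implies \nu(A) \le \varepsilon. \tag{17.7}$$

[...] $> 0$ if $A$ is a $\mu$-nullset. Hence $\nu(A) = 0$ and $\nu$ is thus a $\mu$-continuous measure, even without the finiteness hypothesis. For the converse we will show that if (17.7) fails, then $\nu$ is not $\mu$-continuous. Thus, for some $\varepsilon > 0$ there is no $\delta$, which means there is a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathscr{A}$ with the properties

$$\mu(A_n) \le 2^{-n} \quad\text{and}\quad \nu(A_n) > \varepsilon \quad\text{for each } n \in \mathbb{N}.$$

We set

$$A := \limsup_{n\to\infty} A_n := \bigcap_{n\in\mathbb{N}} \bigcup_{m \ge n} A_m$$

and have a set in $\mathscr{A}$ which on the one hand satisfies

$$\mu(A) \le \mu\Big(\bigcup_{m \ge n} A_m\Big) \le \sum_{m=n}^{\infty} \mu(A_m) \le \sum_{m=n}^{\infty} 2^{-m} = 2^{-n+1} \quad\text{for every } n \in \mathbb{N},$$

whence $\mu(A) = 0$, and on the other hand, due to the finiteness of $\nu$ and 15.3, satisfies

$$\nu(A) \ge \limsup_{n\to\infty} \nu(A_n) \ge \varepsilon > 0,$$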
which proves that $\nu$ is not $\mu$-continuous. □

Examples. 2. Let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (Example 2 in §1). As in the preceding Example, consider the measure $\nu$ on $\mathscr{A}$ which assigns to a set the value 0 or $+\infty$ according as the set or its complement is countable. Let $\mu$ denote the counting measure $\zeta$ on $\mathscr{A}$ (from Example 6, §3). Since $\emptyset$ is the only $\mu$-nullset, $\nu$ is trivially $\mu$-continuous. However, $\nu$ cannot have a density with respect to $\mu$. For from $\nu = f\mu$ with $f \in E^*$ it would follow that

$$0 = \nu(\{\omega\}) = \int_{\{\omega\}} f\,d\mu = f(\omega)\,\mu(\{\omega\}) = f(\omega)$$

for every $\omega \in \Omega$, making $f = 0$ and therefore $\nu = f\mu = 0$, which is not the case because $\Omega$ is uncountable.

3. Let $(\mathbb{R}, \mathscr{B}^1, \mu)$ be the 1-dimensional Lebesgue-Borel measure space (so $\mu = \lambda^1$) and denote by $\mathscr{N}$ the system of all $\mu$-nullsets. Then $\mathscr{N}$ is an example of a $\sigma$-ideal in $\mathscr{B}^1$: The union of any sequence of its sets is another, as are the intersections of its sets with those of $\mathscr{B}^1$ (cf. Exercise 5, §3). These properties insure that

$$\nu(A) := \begin{cases} 0 & \text{if } A \in \mathscr{N} \\ +\infty & \text{if } A \in \mathscr{B}^1 \setminus \mathscr{N} \end{cases}$$

defines a measure on $\mathscr{B}^1$ (cf. Exercise 6, §3). From its definition it is clear that $\nu$ is $\mu$-continuous. Here however (17.7) fails, since for every $\delta > 0$

$$\mu([0,\delta]) = \delta \quad\text{and}\quad \nu([0,\delta]) = +\infty.$$

Thus the finiteness hypothesis on $\nu$ in 17.8 is not superfluous.

Example 2 shows that for the existence of a density $f \in E^*$ with $\nu = f\mu$, the $\mu$-continuity of $\nu$, while necessary, is not sufficient. All the more noteworthy is the theorem of Radon and Nikodym which we will prove, after a preparatory lemma.
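17.9 Lemma. Let $\sigma$ and $\tau$ be finite measures on a $\sigma$-algebra $\mathscr{A}$ of subsets of $\Omega$ and let $\varrho := \tau - \sigma$ denote their difference. Then there is a set $\Omega_0 \in \mathscr{A}$ with the properties

$$\varrho(\Omega_0) \ge \varrho(\Omega); \tag{17.8}$$
$$\varrho(A) \ge 0 \quad\text{for all } A \in \Omega_0 \cap \mathscr{A}. \tag{17.9}$$

Proof. Let us first prove the weaker claim:

(*) For every $\varepsilon > 0$ there exists $\Omega_\varepsilon \in \mathscr{A}$ with the properties

$$\varrho(\Omega_\varepsilon) \ge \varrho(\Omega); \tag{17.8'}$$
$$\varrho(A) > -\varepsilon \quad\text{for all } A \in \Omega_\varepsilon \cap \mathscr{A}. \tag{17.9'}$$

We may obviously suppose that $\varrho(\Omega) \ge 0$, since otherwise $\Omega_\varepsilon := \emptyset$ does what is wanted. If then $\varrho(A) > -\varepsilon$ for all $A \in \mathscr{A}$, it suffices to choose $\Omega_\varepsilon := \Omega$. So we consider the case that some $A_1 \in \mathscr{A}$ satisfies $\varrho(A_1) \le -\varepsilon$. From the definition of $\varrho$ and the subtractivity of the finite measures $\sigma$ and $\tau$,

$$\varrho(\complement A_1) = \varrho(\Omega) - \varrho(A_1) \ge \varrho(\Omega) + \varepsilon > \varrho(\Omega).$$

Therefore, if $\varrho(A) > -\varepsilon$ for all $A \in (\complement A_1) \cap \mathscr{A}$, we can set $\Omega_\varepsilon := \complement A_1$ and be done. In the contrary case there is a set $A_2 \in (\complement A_1) \cap \mathscr{A}$ with $\varrho(A_2) \le -\varepsilon$. Then because $A_1$, $A_2$ are disjoint

$$\varrho(\complement(A_1 \cup A_2)) = \varrho(\Omega) - \varrho(A_1) - \varrho(A_2) \ge \varrho(\Omega) + 2\varepsilon > \varrho(\Omega)$$

and the preceding dichotomy presents itself anew. If after finitely many repetitions of this procedure we have not reached our goal, then we will have generated a sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets in $\mathscr{A}$ with

$$\varrho(\Omega \setminus (A_1 \cup \ldots \cup A_n)) > \varrho(\Omega) \quad\text{and}\quad \varrho(A_n) \le -\varepsilon \quad\text{for every } n \in \mathbb{N}.$$

Because of the finite additivity of $\sigma$ and $\tau$, this would have the consequence that

$$\varrho(A_1 \cup \ldots \cup A_n) = \sum_{j=1}^{n} \varrho(A_j)\ [\ldots]$$

[...] $> -1/n$ for all $n \in \mathbb{N}$ and every $A \in \Omega_0 \cap \mathscr{A}$. □

As indicated, this puts us in a position to answer the important question we posed earlier.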
17.10 Theorem (Radon-Nikodym). Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$. If $\mu$ is $\sigma$-finite, the following two assertions are equivalent:

(i) $\nu$ has a density with respect to $\mu$.
(ii) $\nu$ is $\mu$-continuous.
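In the simplest situation, a finite set whose measures are given by point masses, the theorem can be made completely explicit: if $\nu$ is $\mu$-continuous, a density is obtained by dividing point masses wherever $\mu$ is positive. The following sketch assumes exactly this finite setting; the function names and the sample weights are hypothetical, and the code illustrates the statement rather than the proof given below.

```python
from fractions import Fraction as F
from itertools import chain, combinations

def density(nu, mu):
    """Return f with nu = f*mu on a finite set, or raise if nu is not mu-continuous."""
    f = {}
    for w in mu:
        if mu[w] == 0:
            if nu[w] != 0:
                raise ValueError(f"nu charges the mu-nullset {{{w!r}}}")
            f[w] = F(0)  # any value would do on a mu-nullset
        else:
            f[w] = nu[w] / mu[w]
    return f

def subsets(points):
    """All subsets of a finite collection of points, as tuples."""
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))

# Hypothetical point masses; nu vanishes wherever mu does.
mu = {1: F(1, 2), 2: F(3), 3: F(0), 4: F(1)}
nu = {1: F(2), 2: F(0), 3: F(0), 4: F(5, 7)}

f = density(nu, mu)

# nu(A) equals the integral of f over A with respect to mu, for every subset A.
ok = all(
    sum(nu[w] for w in A) == sum(f[w] * mu[w] for w in A)
    for A in subsets(mu)
)
```

The arbitrary choice made on the $\mu$-nullset $\{3\}$ shows why a density can be unique at most $\mu$-almost everywhere.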
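Proof. Only the implication (ii)$\Rightarrow$(i) is still in need of proof. To that end we distinguish three cases.

First Case: The measures $\mu$ and $\nu$ are each finite. Form the set $\mathscr{G}$ of all $\mathscr{A}$-measurable numerical functions $g \ge 0$ on $\Omega$ which satisfy $g\mu \le \nu$, that is, which satisfy

$$\int_A g\,d\mu \le \nu(A) \quad\text{for all } A \in \mathscr{A}.$$

The constant function $g = 0$ lies in $\mathscr{G}$, so $\mathscr{G}$ is not empty. $\mathscr{G}$ is moreover sup-stable, that is, $g \vee h \in \mathscr{G}$ whenever $g, h \in \mathscr{G}$. Indeed, setting $A_1 := \{g \ge h\}$, $A_2 := \complement A_1$, every $A \in \mathscr{A}$ satisfies

$$\int_A g \vee h\,d\mu = \int_{A \cap A_1} g\,d\mu + \int_{A \cap A_2} h\,d\mu \le \nu(A \cap A_1) + \nu(A \cap A_2) = \nu(A).$$

Since $\int g\,d\mu \le \nu(\Omega) < +\infty$ for every $g \in \mathscr{G}$, the number

$$\gamma := \sup\Big\{\int g\,d\mu : g \in \mathscr{G}\Big\}$$

is finite and there is a sequence $(g_n')$ in $\mathscr{G}$ such that $\lim \int g_n'\,d\mu = \gamma$. Due to sup-stability the functions $g_n := g_1' \vee \ldots \vee g_n'$ lie in $\mathscr{G}$, and consequently $\gamma \ge \int g_n\,d\mu \ge \int g_n'\,d\mu$ (since $g_n \ge g_n'$) for all $n \in \mathbb{N}$. This shows that $\lim \int g_n\,d\mu = \gamma$. As the sequence $(g_n)$ is isotone, the monotone convergence theorem can be applied, assuring that $f := \sup g_n$ is a function in $\mathscr{G}$ and that $\int f\,d\mu = \gamma$. All this proves that the function $g \mapsto \int g\,d\mu$ on $\mathscr{G}$ assumes its maximum value at $f$.

Now we prove that $\nu = f\mu$. In any case we have $f\mu \le \nu$, since $f \in \mathscr{G}$, and so

$$\tau := \nu - f\mu$$

is a finite measure on $\mathscr{A}$, evidently $\mu$-continuous since $\nu$ is by hypothesis. We have to show that $\tau = 0$. So let us assume contrariwise that $\tau(\Omega) > 0$. Due to the $\mu$-continuity of $\tau$, this entails that $\mu(\Omega) > 0$ as well, and we may form the real number

$$\beta := \frac{1}{2}\,\frac{\tau(\Omega)}{\mu(\Omega)} > 0,$$

which satisfies $\tau(\Omega) = 2\beta\mu(\Omega) > \beta\mu(\Omega)$. The preceding lemma applied to $\tau$ and $\sigma := \beta\mu$ supplies a set $\Omega_0 \in \mathscr{A}$ which satisfies

$$\tau(\Omega_0) - \beta\mu(\Omega_0) \ge \tau(\Omega) - \beta\mu(\Omega) > 0$$

and $\tau(A) \ge \beta\mu(A)$ for all $A \in \Omega_0 \cap \mathscr{A}$. The $\mathscr{A}$-measurable, non-negative function $f_0 := f + \beta 1_{\Omega_0}$ therefore has the property

$$\int_A f_0\,d\mu = \int_A f\,d\mu + \beta\mu(\Omega_0 \cap A) \le \int_A f\,d\mu + \tau(\Omega_0 \cap A) \le \int_A f\,d\mu + \tau(A) = \nu(A)$$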
for every $A \in \mathscr{A}$. These inequalities put $f_0$ in $\mathscr{G}$. Since $\tau$ is $\mu$-continuous and $\tau(\Omega_0) > \beta\mu(\Omega_0)$, we must have $\mu(\Omega_0) > 0$, leading to

$$\int f_0\,d\mu = \int f\,d\mu + \beta\mu(\Omega_0) = \gamma + \beta\mu(\Omega_0) > \gamma,$$

an inequality which is incompatible with the definition of $\gamma$ and the fact that $f_0 \in \mathscr{G}$. The assumption $\tau(\Omega) > 0$ is therefore untenable, and $\tau = 0$, as desired.

Second Case: The measure $\mu$ is finite and the measure $\nu$ is infinite. We will produce a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$ into pairwise disjoint sets from $\mathscr{A}$ with the following properties:

(a) $A \in \Omega_0 \cap \mathscr{A} \implies$ either $\mu(A) = \nu(A) = 0$ or $0 < \mu(A) < \nu(A) = +\infty$.
(b) $\nu(\Omega_n) < +\infty$ for all $n \in \mathbb{N}$.

To this end let $\mathscr{Q}$ denote the system of all $Q \in \mathscr{A}$ with $\nu(Q) < +\infty$ and define

$$\alpha := \sup\{\mu(Q) : Q \in \mathscr{Q}\}.$$

This is a real number because the measure $\mu$ is finite. There is a sequence $(Q_m)_{m\in\mathbb{N}}$ in $\mathscr{Q}$ with $\lim \mu(Q_m) = \alpha$. Since $\mathscr{Q}$ is evidently closed under finite unions, $(Q_m)$ may be assumed to be isotone. $Q_0 := \bigcup_{m\in\mathbb{N}} Q_m$ is then a set from $\mathscr{A}$ satisfying $\mu(Q_0) = \alpha$. We will show that $\Omega_0 := \complement Q_0$ satisfies (a). So consider $A \in \Omega_0 \cap \mathscr{A}$ with $\nu(A) < +\infty$. We need to see that $\mu(A) = \nu(A) = 0$, and since $\nu$ is $\mu$-continuous we really only need to confirm that $\mu(A) = 0$. Since $\nu(A) < +\infty$ and, as noted already, $\mathscr{Q}$ is closed under union, each $Q_m \cup A$ lies in $\mathscr{Q}$, so that $\mu(Q_m \cup A) \le \alpha$, and consequently

$$\mu(Q_0 \cup A) = \lim_{m\to\infty} \mu(Q_m \cup A) \le \alpha.$$

Since $A$ is disjoint from $Q_0$, $\mu(Q_0 \cup A) = \alpha + \mu(A)$. Conjoined with the preceding inequality and the finiteness of $\alpha$ this says that indeed $\mu(A)$ must be 0. Finally, to take care of (b) we merely define $\Omega_1 := Q_1$, and $\Omega_m := Q_m \setminus Q_{m-1}$ for all integers $m \ge 2$ in order to get a decomposition of $\Omega$ with the desired properties.

Now let $\mu_n$, $\nu_n$ denote the restrictions of $\mu$, $\nu$ to the trace $\sigma$-algebra $\Omega_n \cap \mathscr{A}$, for $n = 0, 1, \ldots$ and note that each $\nu_n$ is a $\mu_n$-continuous measure. Moreover, for all $n \ge 1$ both $\mu_n$ and $\nu_n$ are finite. Case 1 therefore supplies $\Omega_n \cap \mathscr{A}$-measurable functions $f_n \ge 0$ on $\Omega_n$ with $\nu_n = f_n\mu_n$. Taking $f_0$ to be the constant function $+\infty$ on $\Omega_0$, $\nu_0 = f_0\mu_0$ also holds, thanks to (a). Finally, "putting all the pieces together" gives our result in this second case. Namely, the function $f$ on $\Omega$ defined to coincide on each $\Omega_n$ with $f_n$ ($n = 0, 1, \ldots$) is non-negative, $\mathscr{A}$-measurable and satisfies

$$\nu = f\mu.$$
Third Case: This is the general case: only the $\sigma$-finiteness of $\mu$ is demanded. There is according to 17.6 a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\mu$ is therefore finite and possesses exactly the same nullsets as does $\mu$. Consequently $\nu$ is also $(h\mu)$-continuous. By what has already been proved there is then an $\mathscr{A}$-measurable function $f \ge 0$ on $\Omega$ with $\nu = f(h\mu)$. According to 17.4 $\nu$ then has the density $fh$ with respect to $\mu$. □

The question arises whether, in the situation of Theorem 17.10, the density $f$ of $\nu$ is $\mu$-almost everywhere uniquely determined. From 17.5 we at least get a positive answer when $f$ is $\mu$-integrable, that is, when $\nu$ is a finite measure. But more is true:
17.11 Theorem. Let $\nu = f\mu$ be a measure having a density $f$ with respect to a $\sigma$-finite measure $\mu$ on $\mathscr{A}$. Then $f$ is $\mu$-almost everywhere uniquely determined. The measure $\nu$ is $\sigma$-finite exactly when $f$ is $\mu$-almost everywhere real-valued.

Proof. First we show that $f$ is $\mu$-almost everywhere uniquely determined if the measure $\mu$ is finite. In proving this we may assume that $\nu(\Omega) = +\infty$, since its truth is otherwise a consequence of the second part of 17.5. Furthermore, as we now find ourselves in case 2 of the preceding proof, the decomposition of $\Omega$ into $\Omega_0, \Omega_1, \ldots$ employed there lets us confine our attention to $\Omega_0$, as 17.5 takes care of the remaining $\Omega_n$ ($n \in \mathbb{N}$). So it suffices to treat the case $\Omega = \Omega_0$, that is, to assume that $\mu$ and $\nu$ are linked by the alternative:

$$A \in \mathscr{A} \implies \text{either } \mu(A) = \nu(A) = 0 \text{ or } 0 < \mu(A) < \nu(A) = +\infty.$$

The constant function $+\infty$ is then a density for $\nu$ with respect to $\mu$ and what has to be shown for uniqueness is that $f = +\infty$ holds $\mu$-almost everywhere. And for that it suffices to show that $\mu(\{f \le n\}) = 0$ for each $n \in \mathbb{N}$, which in turn is a consequence of the above alternative and the inequalities

$\nu(\{f$ [...]
[...] $\ge 0$ for all $A$ in the trace $\sigma$-algebra $\Omega^+ \cap \mathscr{A}$, and $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.

Proof. Set

$$\gamma := \sup\{\varrho(A) : A \in \mathscr{A}\}$$

and choose a sequence $(A_n)$ in $\mathscr{A}$ with $\lim \varrho(A_n) = \gamma$. By applying 17.9 to the restriction of $\varrho$ to $A_n \cap \mathscr{A}$, we may replace $A_n$ by a set $P_n \in \mathscr{A}$ satisfying $\varrho(P_n) \ge \varrho(A_n)$ and $\varrho(A) \ge 0$ for all $A \in P_n \cap \mathscr{A}$. We will then have

$$\gamma = \sup\{\varrho(P_n) : n \in \mathbb{N}\}. \tag{18.1}$$

The decomposition of $\Omega$ that is sought can be realized by

$$\Omega^+ := \bigcup_{n\in\mathbb{N}} P_n, \qquad \Omega^- := \Omega \setminus \Omega^+.$$

Indeed, all $A \in \Omega^+ \cap \mathscr{A}$ satisfy $\varrho(A) \ge 0$ because such an $A$ has the form

$$A = \bigcup_{n\in\mathbb{N}} B_n$$

with pairwise disjoint sets $B_n \in P_n \cap \mathscr{A}$ (by the disjointification procedure used in the verification of (3.10)). From this representation of $A$ and the $\sigma$-additivity follows

$$\varrho(A) = \sum_{n=1}^{\infty} \varrho(B_n) \ge 0.$$

Thus $\varrho$ assumes only non-negative real values on $\Omega^+ \cap \mathscr{A}$, that is, the restriction of $\varrho$ to $\Omega^+ \cap \mathscr{A}$ is a finite measure. Moreover, because $\varrho(P_n) \le \varrho(\Omega^+) \le \gamma$ and (18.1) this measure satisfies

$$\gamma = \varrho(\Omega^+).$$

In particular, $\gamma < +\infty$ since $\varrho$ assumes only real values. $\varrho(A) > 0$ cannot hold for any $A \in \Omega^- \cap \mathscr{A}$, for otherwise $\varrho(\Omega^+ \cup A) = \varrho(\Omega^+) + \varrho(A) > \gamma$. Thus, $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$. □

Measures (in the sense of Definition 3.3) have occasionally been interpreted as mass distributions on the underlying set $\Omega$. A finite signed measure can be analogously interpreted as an (electric) charge distribution smeared over $\Omega$. The foregoing theorem justifies this metaphor by showing that as with charge in electrostatics, there are two disjoint sets, one carrying all the positive charge, the other all the negative charge.
§18*. Signed measures
From this theorem another important feature of signed measures becomes evident: The difference $\varrho$ in Lemma 17.9 is more than an illustrative example of a signed measure; it is the typical signed measure:

18.2 Corollary. Every finite signed measure $\varrho$ on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is the difference of two finite measures on $\mathscr{A}$.

Proof. Let $\Omega = \Omega^+ \cup \Omega^-$ be a Hahn decomposition in the sense of 18.1. Then evidently

$$\varrho^+(A) := \varrho(A \cap \Omega^+) \quad\text{and}\quad \varrho^-(A) := -\varrho(A \cap \Omega^-), \qquad A \in \mathscr{A},$$

define measures on $\mathscr{A}$, which satisfy $\varrho = \varrho^+ - \varrho^-$, since each $A \in \mathscr{A}$ is the disjoint union $(A \cap \Omega^+) \cup (A \cap \Omega^-)$. □

With this result the circle closes: finite signed measures are nothing more than the differences of finite measures. It is however possible to dispense with the finiteness hypothesis if $\sigma$-additivity is handled with sufficient care, but we will not go into this further. In the final analysis it is because of the preceding corollary that we only consider measures with non-negative values in this book. Often to emphasize the distinction with signed measures, what we call simply measures are called positive measures.
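For a signed measure on a finite set, given by real point masses, a Hahn decomposition and the two measures of Corollary 18.2 can be written down directly. The sketch below assumes this finite setting; the function names and the sample charges are hypothetical. Points of mass 0 are put into the positive set here, a choice that reflects the freedom left by the theorem.

```python
from fractions import Fraction as F
from itertools import chain, combinations

def hahn_decomposition(rho):
    """Split the points into a set carrying mass >= 0 and its complement."""
    positive = {w for w, mass in rho.items() if mass >= 0}
    negative = set(rho) - positive
    return positive, negative

def jordan_parts(rho):
    """Point masses of rho^+(A) = rho(A ∩ Ω+) and rho^-(A) = -rho(A ∩ Ω-)."""
    positive, negative = hahn_decomposition(rho)
    rho_plus = {w: (rho[w] if w in positive else F(0)) for w in rho}
    rho_minus = {w: (-rho[w] if w in negative else F(0)) for w in rho}
    return rho_plus, rho_minus

def value(measure, A):
    """Value of a (signed) measure with the given point masses on the set A."""
    return sum((measure[w] for w in A), F(0))

def subsets(points):
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))

# Hypothetical charge distribution on five points.
rho = {"a": F(3), "b": F(-1, 2), "c": F(0), "d": F(-4), "e": F(5, 2)}

positive, negative = hahn_decomposition(rho)
rho_plus, rho_minus = jordan_parts(rho)

# gamma = sup{rho(A)} is attained at the positive set, as in the proof of 18.1.
gamma = max(value(rho, A) for A in subsets(rho))
```

Moving the point `"c"` of mass 0 to the negative set would give a second Hahn decomposition with the same $\varrho^+$ and $\varrho^-$, in line with Exercise 2 below.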
Exercises. 1. Show that every finite signed measure on a $\sigma$-algebra is bounded and assumes a largest and a smallest value.

2. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and $\Omega = \Omega_1^+ \cup \Omega_1^-$, $\Omega = \Omega_2^+ \cup \Omega_2^-$ be two Hahn decompositions for it. Show that $\Omega_1^+ \triangle \Omega_2^+$ and $\Omega_1^- \triangle \Omega_2^-$ are totally $\varrho$-nullsets, meaning that $\varrho(N) = 0$ for every $N \in \mathscr{A}$ which is a subset of either of them. Conclude that to within such totally $\varrho$-nullsets there is only one Hahn decomposition for $\varrho$.

3. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. Show that the specific representation $\varrho = \varrho^+ - \varrho^-$ of $\varrho$ as the difference of the two measures on $\mathscr{A}$ which was produced in the proof of 18.2 is characterized by the following minimality property: In every representation $\varrho = \varrho_1 - \varrho_2$ as the difference of measures $\varrho_1$, $\varrho_2$ on $\mathscr{A}$, $\varrho_1 = \varrho^+ + \delta$ and $\varrho_2 = \varrho^- + \delta$ for an appropriate finite measure $\delta$ on $\mathscr{A}$, and indeed if $\Omega = \Omega^+ \cup \Omega^-$ is any Hahn decomposition of $\Omega$ corresponding to $\varrho$,

$$\delta = (1_{\Omega^+})\varrho_2 + (1_{\Omega^-})\varrho_1.$$

(Conversely, of course, every finite non-zero measure $\delta$ on $\mathscr{A}$ generates in this way a different representation of $\varrho$.) Infer that the only measure $\nu$ on $\mathscr{A}$ which satisfies $\nu(A) \le \min\{\varrho^+(A), \varrho^-(A)\}$ for every $A \in \mathscr{A}$ is the identically 0 measure. [Remark: The representation $\varrho = \varrho^+ - \varrho^-$ uniquely determined by this minimality condition is called the Jordan decomposition of the finite signed measure $\varrho$. As with functions, $\varrho^+$ and $\varrho^-$ are called the positive part and the negative part of $\varrho$.]
§19. Integration with respect to an image measure

Along with the measure space $(\Omega, \mathscr{A}, \mu)$ a measurable space $(\Omega', \mathscr{A}')$ and an $\mathscr{A}$-$\mathscr{A}'$-measurable mapping

$$T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$$

are given. Then the image measure

$$\mu' := T(\mu)$$

is defined in (7.5). The connection between $\mu$-integrals and $\mu'$-integrals is elucidated by:

19.1 Theorem. For every $\mathscr{A}'$-measurable numerical function $f' \ge 0$ on $\Omega'$

$$\int f'\,dT(\mu) = \int f' \circ T\,d\mu. \tag{19.1}$$

Proof. The non-negative function $f' \circ T$ is $\mathscr{A}$-measurable, by 7.3. The integral on the right-hand side of (19.1) is therefore defined. To prove the equality there we first consider only $\mathscr{A}'$-elementary $f'$:

$$f' = \sum_{i=1}^{n} \alpha_i 1_{A_i'}$$

(with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i' \in \mathscr{A}'$). For such $f'$

$$f' \circ T = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$

with $A_i := T^{-1}(A_i')$, so this composite is an $\mathscr{A}$-elementary function. Since

$$T(\mu)(A_i') = \mu(A_i) \qquad (i = 1, \ldots, n)$$

holds by definition of image measures, (19.1) follows in this case. For an arbitrary $\mathscr{A}'$-measurable $f' \ge 0$ there is an isotone sequence $(u_n')$ of $\mathscr{A}'$-elementary functions for which $u_n' \uparrow f'$. Then $(u_n' \circ T)$ is a sequence of $\mathscr{A}$-elementary functions for which $u_n' \circ T \uparrow f' \circ T$. From the validity of (19.1) for the $u_n'$ and Definition 11.3 of the integral in general, we get (19.1) for $f'$. □
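On a finite set, formula (19.1) amounts to regrouping a finite sum according to the values of $T$. The following sketch checks it for hypothetical data: the weights `mu`, the map `T` (reduction modulo 3) and the function `f_prime` are illustrative choices, and measures are represented by their point masses.

```python
from fractions import Fraction as F

def image_measure(T, mu):
    """Point masses of T(mu): the mass of y is the mu-mass of the preimage of {y}."""
    image = {}
    for w, mass in mu.items():
        image[T(w)] = image.get(T(w), F(0)) + mass
    return image

def integral(fn, measure):
    """Integral of fn with respect to a measure given by point masses."""
    return sum(fn(x) * mass for x, mass in measure.items())

# Hypothetical data on Omega = {0, ..., 5} and Omega' = {0, 1, 2}.
mu = {w: F(1, w + 1) for w in range(6)}
T = lambda w: w % 3
f_prime = lambda y: y * y + 1

lhs = integral(f_prime, image_measure(T, mu))    # integral of f' with respect to T(mu)
rhs = integral(lambda w: f_prime(T(w)), mu)      # integral of f' ∘ T with respect to mu
```

Both sides equal $103/20$ for these data.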
19.2 Corollary 1. Let $f'$ be an $\mathscr{A}'$-measurable numerical function on $\Omega'$. Then the $T(\mu)$-integrability of $f'$ entails the $\mu$-integrability of $f' \circ T$, and conversely. In case of integrability

$$\int f'\,dT(\mu) = \int f' \circ T\,d\mu. \tag{19.2}$$

Proof. From 19.1

$$\int (f')^+\,dT(\mu) = \int (f')^+ \circ T\,d\mu \quad\text{and}\quad \int (f')^-\,dT(\mu) = \int (f')^- \circ T\,d\mu,$$

and of course

$$(f' \circ T)^+ = (f')^+ \circ T \quad\text{and}\quad (f' \circ T)^- = (f')^- \circ T.$$

Both claims therefore follow from the definition of the integral 12.1. □
19.3 Corollary 2. Suppose that the mapping $T : \Omega \to \Omega'$ is bijective and $\mathscr{A}$-$\mathscr{A}'$-measurable, with $\mathscr{A}'$-$\mathscr{A}$-measurable inverse $T^{-1}$, and that $f'$ is a numerical function on $\Omega'$. Then the $T(\mu)$-integrability of $f'$ is equivalent to the $\mu$-integrability of $f' \circ T$, and in its presence equality (19.2) prevails.

One has only to note that the integrability of $f' \circ T$ entails the measurability of $f' \circ T$ and therewith that of $f' = f' \circ T \circ T^{-1}$.

The content of 19.1-19.3 constitutes what is called the "general transformation theorem for integrals".
As the behavior of the L-B measure with respect to $C^1$-diffeomorphisms is known from (8.16'), the transformation theorem for Lebesgue integrals follows at once:

19.4 Theorem. Let $G$, $G'$ be open subsets of $\mathbb{R}^d$, $\varphi : G \to G'$ a $C^1$-diffeomorphism of $G$ onto $G'$. A numerical function $f'$ on $G'$ is $\lambda^d$-integrable if and only if the function $f' \circ \varphi\,|\det D\varphi|$ is $\lambda^d$-integrable over $G$, and in this case

$$\int_{G'} f'\,d\lambda^d = \int_G f' \circ \varphi\,|\det D\varphi|\,d\lambda^d. \tag{19.3}$$
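Formula (19.3) can be illustrated numerically in dimension $d = 1$. The sketch below uses hypothetical choices throughout: $G = G' = \,]0,1[$, the $C^1$-diffeomorphism $\varphi(x) = x^2$ with $|\det D\varphi(x)| = 2x$, the function $f' = \cos$, and a plain midpoint rule in place of the Lebesgue integral.

```python
import math

def midpoint(fn, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of fn over ]a, b[."""
    h = (b - a) / steps
    return h * sum(fn(a + (i + 0.5) * h) for i in range(steps))

phi = lambda x: x * x       # C^1-diffeomorphism of ]0,1[ onto ]0,1[
dphi = lambda x: 2 * x      # derivative of phi; |det D phi| in dimension 1
f_prime = math.cos          # the function f' on G'

lhs = midpoint(f_prime, 0.0, 1.0)                                    # integral of f' over G'
rhs = midpoint(lambda x: f_prime(phi(x)) * abs(dphi(x)), 0.0, 1.0)   # right side of (19.3)
```

Both values approximate $\sin 1$, the exact value of $\int_0^1 \cos y\,dy$.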
Proof. The $\lambda^d$-integrability of $f'$ over $G'$ and that of $f' \circ \varphi\,|\det D\varphi|$ over $G$ means the $\lambda^d_{G'}$-integrability and the $\lambda^d_G$-integrability of those functions, respectively. According to (8.16')

$$\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi|\,\lambda^d_G;$$

furthermore, the Borel measurability of $f'$ is equivalent to that of $f' \circ$ [...]

[...] 0 such that

$$\int_{T^{-1}(A)} f \circ T\,d\mu = \int_A f q\,d\mu$$

for all $\mathscr{A}$-measurable numerical functions $f \ge 0$ on $\Omega$, and all $A \in \mathscr{A}$.
§20. Stochastic convergence

Let us return to the study of $p$-fold integrable functions begun in §14. Our goal will be to replace the almost-everywhere convergence concept that underlies the theorems proved there with a weaker convergence concept. It is suggested by a simple but very useful inequality.

The setting is once again an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$.

20.1 Lemma. For every measurable numerical function $f$ on $\Omega$ and every pair of real numbers $p > 0$ and $\alpha > 0$ the Chebyshev-Markov inequality

$$\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha^p}\int |f|^p\,d\mu \tag{20.1}$$

holds.

[...] $\alpha\} \cap A$ and $\{|f_n - f| \ge \alpha\} \cap A$ differ from each other only in an ($n$-independent) nullset. The converse of this is important:

20.3 Theorem. For every $\sigma$-finite measure $\mu$, any two stochastic limits of a sequence of measurable real functions are $\mu$-almost everywhere equal to each other.
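Proof. If $f$ and $f^*$ are stochastic limits of the sequence $(f_n)$, then from the triangle inequality in $\mathbb{R}$

$$\{|f - f^*| \ge \alpha\} \subset \{|f_n - f| \ge \alpha/2\} \cup \{|f_n - f^*| \ge \alpha/2\},$$

whence

$$\mu(\{|f - f^*| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha/2\} \cap A) + \mu(\{|f_n - f^*| \ge \alpha/2\} \cap A)$$

for every $n \in \mathbb{N}$ and every $A \in \mathscr{A}$. Letting $n \to \infty$ shows that

$$\mu(\{|f - f^*| \ge \alpha\} \cap A) = 0$$

for every $\alpha > 0$ and every $A \in \mathscr{A}$ of finite measure. Then however, $f = f^*$ $\mu$-almost everywhere in every such set $A$, since

$$\{f \ne f^*\} \cap A = \bigcup_{k\in\mathbb{N}} \{|f - f^*| \ge 1/k\} \cap A$$

is a $\mu$-nullset. Upon taking for $A$ the sets in a sequence $(A_n)$ in $\mathscr{A}$ which satisfies $\mu(A_n) < +\infty$ for all $n$ and $A_n \uparrow \Omega$, the $\mu$-almost everywhere equality of $f$ and $f^*$ follows. □

To supplement this fact we mention: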
Remark. 4. Stochastic limits $f$ and $f^*$ of the same sequence $(f_n)$ are almost everywhere equal without any hypotheses on the measure itself if both functions are $p$-fold integrable for some $p \in [1, +\infty[$. This is because for every real $\alpha > 0$ the set $\{|f - f^*| \ge \alpha\}$ has finite measure, by (20.1), and so $f = f^*$ $\mu$-almost everywhere in this set, whence $\{|f - f^*| > 0\} = \bigcup_{n\in\mathbb{N}} \{|f - f^*| \ge 1/n\}$ is a countable union of $\mu$-nullsets. This just says that $f = f^*$ $\mu$-almost everywhere in $\Omega$. But the next example shows that it may fail if one of the functions is not in any $\mathscr{L}^p$-space.

Example. 2. Consider the measure space $(\Omega, \mathscr{P}(\Omega), \mu)$, where $\Omega$ consists of exactly two elements $\omega_0$, $\omega_1$ and $\mu(\{\omega_0\}) = 0$, $\mu(\{\omega_1\}) = +\infty$, $f_n = f = 0$ for every $n \in \mathbb{N}$. These functions lie in every $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges stochastically to $f$, as well as to every real-valued function $f^*$ on $\Omega$. Every such $f^*$ which is non-zero at $\omega_1$, however, lies in no $\mathscr{L}^p(\mu)$ with $1 \le p < +\infty$ and fails to coincide $\mu$-almost everywhere in $\Omega$ with $f$.

The considerations with which we began this section lead to an important class of stochastically convergent sequences:
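20.4 Theorem. If the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, then it also converges to $f$ $\mu$-stochastically.

Proof. The Chebyshev-Markov inequality tells us that

$\mu(\{|f_n - f| \ge \alpha\} \cap A)$ [...]

[...]

$$\lim_{n\to\infty} \mu\big(\{\sup_{m \ge n} |f_m| \ge \alpha\}\big) = 0 \quad\text{for every } \alpha > 0, \tag{20.6}$$

$$\lim_{n\to\infty} \mu\big(\{\sup_{m \ge n} |f_m| > \alpha\}\big) = 0 \quad\text{for every } \alpha > 0, \tag{20.6'}$$

$$\mu\big(\limsup_{n\to\infty} \{|f_n| > \alpha\}\big) = 0 \quad\text{for every } \alpha > 0. \tag{20.7}$$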
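Proof. To prove the equivalence of (20.6) with the almost everywhere convergence of $(f_n)$ to 0, we set, for each $\alpha > 0$ and each $n \in \mathbb{N}$

$$A_n^\alpha := \{\sup_{m \ge n} |f_m| \ge \alpha\}.$$

Obviously both $n \mapsto A_n^\alpha$ and $\alpha \mapsto A_n^\alpha$ are antitone mappings; then $k \mapsto A_n^{1/k}$ is isotone on $\mathbb{N}$. If we also set

$$A := \{\omega \in \Omega : \lim_{n\to\infty} f_n(\omega) = 0\} = \{\omega \in \Omega : \limsup_{n\to\infty} |f_n|(\omega) = 0\},$$

then this set lies in $\mathscr{A}$, either by appeal to 9.5 or by noticing that each $A_n^\alpha \in \mathscr{A}$ and

$$A = \bigcap_{k\in\mathbb{N}} \bigcup_{n\in\mathbb{N}} \complement A_n^{1/k}.$$

Passing to complements,

$$\complement A = \bigcup_{k\in\mathbb{N}} \bigcap_{n\in\mathbb{N}} A_n^{1/k}$$

and so

$$\bigcap_{n\in\mathbb{N}} A_n^{1/k} \uparrow \complement A \ \text{ as } k \to \infty, \quad\text{and}\quad A_n^{1/k} \downarrow \bigcap_{m\in\mathbb{N}} A_m^{1/k} \ \text{ as } n \to \infty.$$

Consequently,

$$\mu(\complement A) = \sup_{k\in\mathbb{N}} \mu\Big(\bigcap_{n\in\mathbb{N}} A_n^{1/k}\Big) = \sup_{k\in\mathbb{N}} \inf_{n\in\mathbb{N}} \mu(A_n^{1/k}) \tag{20.8}$$

because the finite measure $\mu$ is both continuous from above and continuous from below, by 3.2. Thus $(f_n)$ converges almost everywhere to 0 just when the number defined by (20.8) is 0. In turn, the latter occurs exactly in case

$$\inf_{n\in\mathbb{N}} \mu(A_n^{1/k}) = \lim_{n\to\infty} \mu(A_n^{1/k}) = 0$$

for every $k \in \mathbb{N}$. The first equivalence follows from this.

The equivalence of (20.6) with (20.6') follows from the observation that for any numerical function $g$ on $\Omega$

$$\{g > \alpha\} \subset \{g \ge \alpha\} \subset \{g > \alpha'\}$$

whenever $0 < \alpha' < \alpha$.

Finally, the equivalence of (20.6') with (20.7) follows from the validity, for every $\alpha > 0$, of the equality

$$\lim_{n\to\infty} \mu\big(\{\sup_{m \ge n} |f_m| > \alpha\}\big) = \mu\big(\limsup_{n\to\infty} \{|f_n| > \alpha\}\big). \tag{20.9}$$

For the proof of this we introduce

$$B_n := \bigcup_{m \ge n} \{|f_m| > \alpha\} \quad\text{and}\quad B := \limsup_{n\to\infty} \{|f_n| > \alpha\}.$$

On the one hand, $B_n \downarrow B$ and consequently $\lim \mu(B_n) = \mu(B)$. On the other hand, however,

$$B_n = \bigcup_{m \ge n} \{|f_m| > \alpha\} = \{\sup_{m \ge n} |f_m| > \alpha\}.$$

From this finally we get the needed (20.9). □

The conditions involved in Theorems 20.4 and 20.5 are indeed sufficient to insure stochastic convergence, but they are not necessary for it, as the following examples show.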
Examples. 3. Let $\Omega := [0,1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$, a finite measure. With $A_n := \,]0, 1/n[\, \in \mathscr{A}$, the sequence $(n1_{A_n})$ converges to 0 at every point of $\Omega$ and so, either by appeal to 20.5 or by virtue of

$$\mu(\{n1_{A_n} \ge \alpha\}) = \mu(A_n) = \frac{1}{n} \quad\text{whenever } 0 < \alpha \le n \in \mathbb{N},$$

this sequence also converges stochastically to 0. By contrast

$$\int (n1_{A_n})^p\,d\mu = n^p\mu(A_n) = n^{p-1}$$

shows that the sequence does not converge to 0 in $p$th mean for any $p \ge 1$.

4. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space of the preceding example. Write each $n \in \mathbb{N}$ as $n = 2^h + k$ with non-negative integers $h$ and $k$ satisfying $0 \le k < 2^h$ (which uniquely determines them) and set

$$A_n := [k2^{-h}, (k+1)2^{-h}[, \qquad f_n := 1_{A_n}, \qquad n \in \mathbb{N}.$$

It was shown in the example in §15 that the sequence $(f_n(\omega))_{n\in\mathbb{N}}$ converges for no $\omega \in \Omega$. Nevertheless the sequence $(f_n)$ does converge stochastically to 0, since for every $\alpha > 0$ and $n \in \mathbb{N}$

$$\mu(\{|f_n| \ge \alpha\}) \le 2^{-h} < \frac{2}{n}.$$

In this example stochastic convergence can also be inferred from 20.4, since the example in §15 showed that $(f_n)$ converges to 0 in $p$th mean for every $p \in [1, +\infty[$.

The connection between stochastic convergence and almost-everywhere convergence is nevertheless closer than one would be led to suspect on the basis of the last example.
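The behaviour of the sequence in Example 4 can be traced with exact arithmetic. The sketch below is illustrative: the function names are hypothetical, the sample point $\omega = 1/3$ and the subsequence $n_k = 2^k$ are arbitrary choices, and `exceedance_measure` is only meant for $\alpha > 0$.

```python
from fractions import Fraction as F

def interval(n):
    """Endpoints of A_n = [k 2^-h, (k+1) 2^-h[ for n = 2^h + k with 0 <= k < 2^h."""
    h = n.bit_length() - 1
    k = n - (1 << h)
    return F(k, 1 << h), F(k + 1, 1 << h)

def f(n, w):
    """Value of the indicator function f_n = 1_{A_n} at the point w."""
    left, right = interval(n)
    return 1 if left <= w < right else 0

def exceedance_measure(n, alpha):
    """mu({|f_n| >= alpha}) for Lebesgue measure on [0,1[, assuming alpha > 0."""
    if alpha > 1:
        return F(0)
    left, right = interval(n)
    return right - left

w = F(1, 3)

# Within each block 2^h <= n < 2^(h+1) the intervals A_n partition [0,1[,
# so f_n(w) takes the value 1 exactly once and the value 0 otherwise.
block_values = [{f(n, w) for n in range(2 ** h, 2 ** (h + 1))} for h in range(1, 10)]

# Along n_k = 2^k the intervals are [0, 2^-k[, which miss w = 1/3 once k >= 2.
subsequence = [f(2 ** k, w) for k in range(2, 12)]
```

The values along the full sequence oscillate between 0 and 1 at $\omega$, while the measures $2^{-h}$ tend to 0 and the chosen subsequence is eventually 0 at $\omega$, in line with the theorem that follows.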
20.7 Theorem. If a sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions converges $\mu$-stochastically to a measurable real function $f$, then for every $A \in \mathscr{A}$ of finite $\mu$-measure some subsequence of $(f_n)$ converges to $f$ $\mu$-almost everywhere in $A$.

Proof. For $A \in \mathscr{A}$ with $\mu(A) < +\infty$, the measure $\mu_A$, which is the restriction of $\mu$ to $A \cap \mathscr{A}$, is finite. It therefore suffices to deal with the case of a finite measure $\mu$; moreover, in that case we can simply take $A$ to be $\Omega$ itself. For $\alpha > 0$ and $m, n \in \mathbb{N}$ the triangle inequality shows that

$$\{|f_m - f_n| \ge \alpha\} \subset \{|f_m - f| \ge \alpha/2\} \cup \{|f_n - f| \ge \alpha/2\};$$

thus by hypothesis $\mu(\{|f_m - f_n| \ge \alpha\})$ can be made arbitrarily small by taking $m$ and $n$ sufficiently large. If therefore $(\eta_k)_{k\in\mathbb{N}}$ is a sequence of positive real numbers with

$$\sum_{k=1}^{\infty} \eta_k < +\infty,$$

then for each $k \in \mathbb{N}$ there is an $n_k \in \mathbb{N}$ such that

$$\mu(\{|f_m - f_{n_k}| \ge \eta_k\}) \le \eta_k \quad\text{for all } m \ge n_k.$$

We may arrange that $n_1 < n_2 < \ldots$. Setting $A_k := \{|f_{n_{k+1}} - f_{n_k}| \ge \eta_k\}$, we then have

$$\sum_{k=1}^{\infty} \mu(A_k) \le \sum_{k=1}^{\infty} \eta_k < +\infty$$

and consequently,

$$\lim_{n\to\infty} \sum_{k=n}^{\infty} \mu(A_k) = 0.$$

From this it follows that the set $A := \limsup_{n\to\infty} A_n$ satisfies

$$\mu(A) = 0,$$

because $A \subset \bigcup_{k \ge n} A_k$ for every $n \in \mathbb{N}$, entailing that $\mu(A) \le \sum_{k=n}^{\infty} \mu(A_k)$ for every $n$. The definition of $A$ shows that if $\omega \in \complement A$, then the inequality

$$|f_{n_{k+1}}(\omega) - f_{n_k}(\omega)| \ge \eta_k$$

prevails for at most finitely many $k \in \mathbb{N}$. Therefore, along with the series $\sum \eta_k$, the series

$$\sum_{k=1}^{\infty} |f_{n_{k+1}}(\omega) - f_{n_k}(\omega)|$$

converges; that is, the sequence $(f_{n_k}(\omega))_{k\in\mathbb{N}}$ converges in $\mathbb{R}$. In summary, the sequence $(f_{n_k})$ converges almost everywhere to a measurable real function $f^*$ on $\Omega$. By 20.5 $f^*$ is also a stochastic limit of $(f_{n_k})_{k\in\mathbb{N}}$. But, as a subsequence of $(f_n)$, that sequence converges stochastically to $f$ as well. Hence by 20.3, $f = f^*$ almost everywhere. We have shown therefore that $(f_{n_k})_{k\in\mathbb{N}}$ converges almost everywhere to $f$. □

In terms of almost-everywhere convergence we can now even characterize stochastic convergence by a subsequence principle.
20.8 Corollary. A sequence $(f_n)$ of measurable real functions on $\Omega$ converges $\mu$-stochastically to a measurable real function $f$ on $\Omega$ if and only if for each $A \in \mathscr{A}$ of finite measure, each subsequence $(f_{n_k})_{k\in\mathbb{N}}$ of $(f_n)$ contains a further subsequence which converges to $f$ $\mu$-almost everywhere in $A$.

Proof. The preceding theorem establishes that the subsequence condition is necessary for the stochastic convergence of $(f_n)$ to $f$, since every subsequence of $(f_n)$ likewise converges stochastically to $f$. Let us now assume that the subsequence condition is fulfilled, and fix an $A \in \mathscr{A}$ of finite measure. Since every subsequence $(f_{n_k})$ contains another which converges almost everywhere in $A$ to $f$ and by 20.5 this latter subsequence must also converge (in $A$) stochastically to $f$, we see that in the sequence of numbers

$$\mu(\{|f_{n_k} - f| \ge \alpha\} \cap A) \qquad (k \in \mathbb{N}),$$

in which $\alpha > 0$ is fixed, a subsequence exists which converges to 0. But, as an easy argument confirms, a sequence of real numbers all of whose subsequences have this property must itself converge to 0. That is, the sequence of real numbers

$$\mu(\{|f_n - f| \ge \alpha\} \cap A) \qquad (n \in \mathbb{N})$$

converges to 0. As this is true of every $A \in \mathscr{A}$ having finite measure and every $\alpha > 0$, the stochastic convergence of $(f_n)$ to $f$ is thereby confirmed. □

Remarks. 5. It is not to be expected that in 20.7 and 20.8 the reference to the finite-measure set $A \in \mathscr{A}$ can be stricken. This is already illustrated by Example 2 if one replaces the sequence $(f_n)$ there with the sequence $(f_n')$ defined by $f_n' := n1_{\{\omega_1\}}$, $n \in \mathbb{N}$. This new sequence also converges stochastically to $f := 0$. See however Exercise 5.
6. The second part of the proof of 20.7 shows that for finite measures $\mu$ there is a Cauchy criterion for the stochastic convergence of a sequence $(f_n)$: Necessary and sufficient for the stochastic convergence of a sequence $(f_n)$ to a measurable real function on $\Omega$ is the condition
$$\lim_{m,n\to\infty} \mu(\{|f_m - f_n| \ge \alpha\}) = 0 \qquad \text{for every } \alpha > 0.$$
7. The sequence formed by alternately taking terms from each of two stochastically convergent sequences whose limit functions do not coincide almost everywhere shows that in Corollary 20.8 it does not suffice to demand that in each $A$ some subsequence of the full sequence $(f_n)$ converge almost everywhere.

A particularly useful consequence of 20.8 is:
20.9 Theorem. If the sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$, and $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous, then the sequence $(\varphi \circ f_n)_{n\in\mathbb{N}}$ converges stochastically to $\varphi \circ f$.

Proof. One exploits both directions of 20.8, noting that from the almost everywhere convergence of a subsequence $(f_{n_k})$ to $f$ on an $A \in \mathcal{A}$ follows the almost everywhere convergence of $(\varphi \circ f_{n_k})$ to $\varphi \circ f$ on $A$. □

The general question of functions $\varphi : \mathbb{R} \to \mathbb{R}$ which preserve convergence, in the sense that $(\varphi \circ f_n)_{n\in\mathbb{N}}$ inherits the kind of convergence $(f_n)_{n\in\mathbb{N}}$ has, is investigated by BARTLE and JOICHI [1961]. They show how Theorem 20.9 can fail if the more restrictive definition (20.5) is adopted for stochastic convergence.
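The difference between stochastic and almost-everywhere convergence that 20.7 and 20.8 address can be made concrete with a small computation. The Python sketch below uses the classical "sliding block" sequence on $[0,1[$ with Lebesgue measure; this sequence is an illustrative assumption and is not taken from the text, and the helper names `block`, `f` and `measure_of_support` are hypothetical. For $n = 2^k + j$ with $0 \le j < 2^k$, $f_n$ is the indicator function of $[j2^{-k}, (j+1)2^{-k}[$.

```python
from fractions import Fraction

def block(n):
    """Return (k, j) with n = 2**k + j and 0 <= j < 2**k, for n >= 1."""
    k = n.bit_length() - 1
    return k, n - 2**k

def f(n, x):
    """Indicator of [j/2**k, (j+1)/2**k[ evaluated at a point x of [0, 1[."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= x < Fraction(j + 1, 2**k) else 0

def measure_of_support(n):
    """Lebesgue measure of {|f_n| >= alpha}, the same for every 0 < alpha <= 1."""
    k, _ = block(n)
    return Fraction(1, 2**k)

x = Fraction(3, 10)

# Stochastic convergence to 0: the exceptional sets have measure 2**-k.
print([measure_of_support(n) for n in (1, 2, 4, 8, 1024)])

# No convergence at the point x: f_n(x) = 1 exactly once per dyadic block.
print([sum(f(n, x) for n in range(2**k, 2**(k + 1))) for k in range(6)])

# The subsequence n_k = 2**k does converge to 0 at x (indeed at every x > 0).
print([f(2**k, x) for k in range(6)])
```

The full sequence converges at no point of $[0,1[$, yet the subsequence $(f_{2^k})$ converges to 0 everywhere except at the origin, in line with the subsequence principle.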
Exercises. 1. $(f_n)$ and $(g_n)$ are stochastically convergent sequences of measurable real functions, having limit functions $f$ and $g$, respectively. Show that for all $\alpha, \beta \in \mathbb{R}$ the sequence $(\alpha f_n + \beta g_n)$ converges stochastically to $\alpha f + \beta g$, and the sequences $(f_n \wedge g_n)$, $(f_n \vee g_n)$ converge stochastically to $f \wedge g$, $f \vee g$, respectively.

2. For a measure space $(\Omega, \mathcal{A}, \mu)$ with finite measure $\mu$ let $d_\mu$ be the pseudometric on $\mathcal{A}$ constructed in Exercise 7 of §3. Show that a sequence $(A_n)$ in $\mathcal{A}$ is $d_\mu$-convergent to $A \in \mathcal{A}$ if and only if the sequence $(1_{A_n})$ of indicator functions converges stochastically to the indicator function $1_A$.

3. For every pair of measurable real functions $f$ and $g$ on a measure space $(\Omega, \mathcal{A}, \mu)$ with finite measure $\mu$ define
$$D_\mu(f,g) := \inf\{\varepsilon > 0 : \mu(\{|f - g| > \varepsilon\}) < \varepsilon\}$$
and then prove that
(a) $D_\mu$ is a pseudometric on the set $M(\mathcal{A})$ of all measurable real functions.
(b) A sequence $(f_n)$ in $M(\mathcal{A})$ converges stochastically to $f \in M(\mathcal{A})$ if and only if $\lim_{n\to+\infty} D_\mu(f_n, f) = 0$.
(c) $M(\mathcal{A})$ is $D_\mu$-complete, that is, every $D_\mu$-Cauchy sequence in $M(\mathcal{A})$ converges with respect to $D_\mu$ to some function in $M(\mathcal{A})$.
What is the relation of $D_\mu$ to the $d_\mu$ of Exercise 2?

4. In the context of Exercise 3 define
If - gi
dp,
for every pair of functions $f, g \in M(\mathcal{A})$. Show that $D'_\mu$ also enjoys the properties (a)-(c) proved for $D_\mu$ in the preceding exercise.

5. Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space. Show that a sequence $(f_n)$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$ if and only if from every subsequence $(f_{n_k})$ of $(f_n)$ a further subsequence can be extracted which converges almost everywhere in $\Omega$ to $f$. [Hints: Suppose $(f_n)$ is stochastically convergent. Choose a sequence $(A_k)$ from $\mathcal{A}$ with $\mu(A_k) < +\infty$ for each $k$ and $A_k \uparrow \Omega$, and consider the finite measures $\mu_k(A) := \mu(A \cap A_k)$ on $\mathcal{A}$. The claim is true of each measure $\mu_k$. Given a subsequence $\Phi$ of $(f_n)$, there is for each $k \in \mathbb{N}$ a subsequence $(g_n^{(k)})_{n\in\mathbb{N}}$ of $\Phi$ which converges $\mu_k$-almost everywhere to $f$. It can be arranged that $(g_n^{(k+1)})$ is a subsequence of $(g_n^{(k)})$ for each $k$. Then the diagonal subsequence $(g_n^{(n)})_{n\in\mathbb{N}}$ does what is wanted.]

6. Give an "elementary" proof of 20.9 based directly on the relevant definition 20.2.
To this end, show that for each E E 10, 1[ there exists 6 > 0 such that fl f I
0.
3. Suppose $M$ is a set of measurable numerical functions on $\Omega$, $1 \le p < +\infty$, and there is a $p$-fold $\mu$-integrable majorant $g$ for $M$, that is, every $f \in M$ satisfies
$$|f| \le g \qquad \mu\text{-almost everywhere.}$$
Then the set
$$M^p := \{|f|^p : f \in M\}$$
is equi-integrable. Indeed, as in Example 2, the single integrable function $h := 2g^p$ is an $\varepsilon$-bound for every $\varepsilon > 0$, since by 13.6
$$\int_{\{|f|^p \ge h\}} |f|^p\,d\mu \le \int_{\{g^p \ge h\}} g^p\,d\mu = \int_{\{g = +\infty\}} g^p\,d\mu = 0.$$
This example shows that Theorem 15.6 on dominated convergence is really about an equi-integrable set of functions. Of course, one cannot expect that conversely from the equi-integrability of a subset of $\mathcal{L}^1(\mu)$ there should follow the existence of a single integrable majorant for the set. The following example confirms this. Consider the probability space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, the finite measure $\mu$ being specified by $\mu(\{n\}) = 2^{-n}$ for each $n \in \mathbb{N}$. The sequence of functions $f_n := 2^n n^{-1} 1_{\{n\}}$ $(n \in \mathbb{N})$ is equi-integrable: For the constant function $1 \in \mathcal{L}^1(\mu)$ the inequality
4.
fn dµ
0 and every n E N
/
JIf-I>g}
If,.Idµ=J
r
ndµ=J ndµ-J A
ndµ>1-J
A
From the finiteness of the measure gµ and the fact that An 1 {0}, it follows that
liminf J n_+00
Ifnl dµ> 1,
{If..I>g}
showing that $g$ cannot be an $\varepsilon$-bound for any $\varepsilon \in\ ]0,1[$.

Here is a useful characterization of equi-integrability, which, for $\sigma$-finite measures, will be improved upon in 21.8.
§21. Equi-integrability
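21.2 Theorem. A set $M$ of measurable numerical functions on $\Omega$ is equi-integrable if and only if the following two conditions are satisfied:

(21.3) $$\sup_{f \in M} \int |f|\,d\mu < +\infty.$$

(21.4) For every $\varepsilon > 0$ there exists a $\mu$-integrable function $h \ge 0$ and a number $\delta > 0$ such that
$$\int_A h\,d\mu \le \delta \implies \int_A |f|\,d\mu \le \varepsilon \qquad \text{for all } f \in M \text{ and all } A \in \mathcal{A}.$$

Proof. For every $A \in \mathcal{A}$, every measurable numerical function $f$ on $\Omega$, and every integrable function $g \ge 0$
$$\int_A |f|\,d\mu = \int_{A \cap \{|f| \ge g\}} |f|\,d\mu + \int_{A \cap \{|f| < g\}} |f|\,d\mu$$
…
and $\delta > 0$ be as furnished by (21.4). For each $f \in M$ and real $\alpha > 0$, consider the obviously valid inequality
$$\int |f|\,d\mu \ge \int_{\{|f| \ge \alpha h\}} |f|\,d\mu \ge \alpha \int_{\{|f| \ge \alpha h\}} h\,d\mu$$
or its equivalent
$$\int_{\{|f| \ge \alpha h\}} h\,d\mu \le \frac{1}{\alpha} \int |f|\,d\mu.$$
The integrals $\int |f|\,d\mu$ here are bounded as $f$ ranges over $M$, by (21.3). Therefore $\alpha > 0$ can be chosen so large that
$$\int_{\{|f| \ge \alpha h\}} h\,d\mu \le \delta \qquad \text{for all } f \in M.$$
(21.4) then insures that $g := \alpha h$ is an $\varepsilon$-bound for $M$, which proves that this set is equi-integrable. □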
21.3 Corollary. Let M C 2P and the set MP :_ { If I P : f E MI be equiintegrable, where 1 < p < +oo. Then the set
M;:={laf+,0glP:f,gEM,a,,0ER,Ial:_1,1,01g} Ifnrn IP do + J Ifm,.I 0 from 2'(It). If in addition lien
then the
sequence
f f dit = If dp, J
converges to f in mean.
Proof. We consider the sequence $(f \wedge f_n)_{n\in\mathbb{N}}$. The inequalities
$$0 \le f \wedge f_n \le f$$
and Example 3 show that it is equi-integrable. Since
05f-fAfnz
From this, the decomposition $f + f_n = f \vee f_n + f \wedge f_n$, and the convergence hypothesis follows the companion result

(21.10') $$\lim_{n\to\infty} \int f \vee f_n\,d\mu = \int f\,d\mu.$$

But then the decomposition
$$|f_n - f| = f \vee f_n - f \wedge f_n$$
shows that the claimed mean convergence ensues upon subtracting (21.10) from (21.10'). □
Now we can get the sharpening of Theorems 21.4 and 15.4 mentioned earlier:

21.7 Theorem. For every sequence $(f_n)$ in $\mathcal{L}^p(\mu)$ which converges $\mu$-stochastically to a function $f \in \mathcal{L}^p(\mu)$ the following three assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean to $f$.
(ii) The sequence $(|f_n|^p)$ is equi-integrable.
(iii) $\lim_{n\to\infty} \int |f_n|^p\,d\mu = \int |f|^p\,d\mu$.

Proof. The equivalence of (i) and (ii) is contained in Theorem 21.4. We need therefore establish only two implications:
(i)$\Rightarrow$(iii): Assertion (15.6) in Theorem 15.1 affirms this.
(iii)$\Rightarrow$(ii): From the hypothesized stochastic convergence of the sequence $(f_n)$ to $f$ follows that of $(|f_n|^p)$ to $|f|^p$, via 20.9. And then from the preceding lemma it further follows that the sequence $(|f_n|^p)$ converges to $|f|^p$ in mean. Finally, Theorem 21.4 - with the $p$ there chosen to be 1 - shows that the convergence in mean of this sequence entails its equi-integrability. □
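For $\sigma$-finite measures $\mu$, equi-integrability can be characterized in a way that is particularly convenient for applications. The $\sigma$-finiteness will be exploited in the form expressed by 17.6, that there is a strictly positive function $h$ in $\mathcal{L}^1(\mu)$.

21.8 Theorem. Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space and $h$ a strictly positive function from $\mathcal{L}^1(\mu)$. Then for any set $M$ of $\mathcal{A}$-measurable numerical functions on $\Omega$ the following three assertions are equivalent:
(i) $M$ is equi-integrable.
(ii) For every $\varepsilon > 0$ some scalar multiple of $h$ is an $\varepsilon$-bound for $M$.
(iii) $M$ satisfies

(21.11) $$\sup_{f \in M} \int |f|\,d\mu < +\infty$$

as well as the following: Given $\varepsilon > 0$ there exists $\delta > 0$ such that

(21.12) $$\int_A h\,d\mu \le \delta \implies \int_A |f|\,d\mu \le \varepsilon \qquad \text{for all } f \in M \text{ and all } A \in \mathcal{A}.$$

…

(21.13) $$\lim_{\alpha\to+\infty} \int_{\{|f| \ge \alpha h\}} |f|\,d\mu = 0$$

holds uniformly for $f \in M$. Condition (21.12) is for obvious reasons (cf. 17.8) called the equi-$(h\mu)$-continuity of the measures $|f|\mu$, $f \in M$.

Proof. (i)$\Rightarrow$(ii): Let $g$ be an $\frac{\varepsilon}{2}$-bound for $M$. Then for all $f \in M$ and all $\alpha > 0$
$$\int_{\{|f| \ge \alpha h\}} |f|\,d\mu = \int_{\{|f| \ge \alpha h\} \cap \{|f| \ge g\}} |f|\,d\mu + \int_{\{|f| \ge \alpha h\} \cap \{|f| < g\}} |f|\,d\mu \le \int_{\{|f| \ge g\}} |f|\,d\mu + \int_{\{g \ge \alpha h\}} g\,d\mu \le \frac{\varepsilon}{2} + \int_{\{g \ge \alpha h\}} g\,d\mu.$$
According to 13.6, $\mu(\{g = +\infty\}) = 0$. Since $g\mu$ is a finite measure on $\mathcal{A}$, it is continuous from above. Hence the fact that
$$\bigcap_{\alpha > 0} \{g \ge \alpha h\} = \bigcap_{n \in \mathbb{N}} \{g \ge nh\} = \{g = +\infty\}$$
is a set of $(g\mu)$-measure 0 means that
$$\int_{\{g \ge \alpha h\}} g\,d\mu \le \frac{\varepsilon}{2}$$
for all sufficiently large $\alpha$. Coupled with the preceding inequality this shows that indeed $\alpha h$ is an $\varepsilon$-bound for all sufficiently large $\alpha$, that is, (ii) holds.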
(ii)$\Rightarrow$(iii): This can be gleaned from the inequality derived at the beginning of the proof of 21.2, $\alpha h$ being now eligible for the function $g$ there:
$$\int_A |f|\,d\mu \le \int_{\{|f| \ge \alpha h\}} |f|\,d\mu + \alpha \int_A h\,d\mu \qquad \text{for all } f \in M.$$
(iii)$\Rightarrow$(i): 21.2 affirms this. □

Theorem 21.8 is of special significance for finite measures $\mu$. Then it is often expedient to choose for $h$ the constant function 1. When one does, (21.13) assumes the equivalent form

(21.13') $$\lim_{\alpha\to+\infty} \int_{\{|f| \ge \alpha\}} |f|\,d\mu = 0 \qquad \text{uniformly for } f \in M.$$

This condition is thus - just as (21.13) for $\sigma$-finite measures - necessary and sufficient for equi-integrability of $M$.
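The uniform tail condition for finite measures lends itself to a quick numerical check. The following Python sketch works on the probability space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ with $\mu(\{n\}) = 2^{-n}$ from the earlier example and compares the sequence $f_n = 2^n n^{-1} 1_{\{n\}}$ with a second sequence $h_n := 2^n 1_{\{n\}}$. The sequence $h_n$, the function names, and the truncation of the supremum at `n_max` are assumptions made for illustration only.

```python
from fractions import Fraction

def tail_sup(height, alpha, n_max=200):
    """Max over 1 <= n <= n_max of the integral of f_n over {f_n >= alpha},
    where f_n = height(n) * 1_{{n}} on N and mu({n}) = 2**-n."""
    best = Fraction(0)
    for n in range(1, n_max + 1):
        h = Fraction(height(n))
        if h >= alpha:
            # f_n lives on the single point n, so its integral is h * mu({n}).
            best = max(best, h * Fraction(1, 2**n))
    return best

def f_height(n):
    """f_n = 2**n / n on {n}; its integral is 1/n."""
    return Fraction(2**n, n)

def h_height(n):
    """h_n = 2**n on {n}; its integral is 1 for every n."""
    return Fraction(2**n)

for alpha in (1, 100, 10**6):
    print(alpha, tail_sup(f_height, alpha), tail_sup(h_height, alpha))
```

For $(f_n)$ the tail supremum decreases towards 0 as $\alpha$ grows, while for $(h_n)$ it stays at 1 however large $\alpha$ is chosen, so within the truncated range only the first sequence shows the behaviour the tail condition asks for.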
Remark. 2. In part (iii) of Theorem 21.8 the $\mathcal{L}^1$-boundedness of $M$ expressed by (21.11) cannot in general be dropped from the hypotheses. It suffices to consider the measure space $(\{a\}, \mathcal{P}(\{a\}), \varepsilon_a)$ consisting of a single point and the sequence of functions $f_n := n \cdot 1$. This sequence is not equi-integrable, although for every $\varepsilon > 0$ and every strictly positive $h$, (21.12) holds whenever $0 < \delta < h(a)$.

Let us close by deriving a sufficient condition for equi-integrability in the finite-measure case which generalizes the introductory Example 3.
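21.9 Lemma. Let $\mu$ be a finite measure and $M \subset \mathcal{L}^1(\mu)$. Suppose that there is a $\mu$-integrable function $g \ge 0$ such that

(21.14) $$\int_{\{|f| \ge \alpha\}} |f|\,d\mu \le \int_{\{|f| \ge \alpha\}} g\,d\mu$$

for all $f \in M$ and all $\alpha \in \mathbb{R}_+$. Then $M$ is equi-integrable.

Proof. The case $\alpha := 0$ of (21.14) says that $\int |f|\,d\mu \le \int g\,d\mu < +\infty$ for all $f \in M$. Then Chebyshev's inequality tells us that
$$\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha} \int |f|\,d\mu \le \frac{1}{\alpha} \int g\,d\mu \qquad \text{for all } \alpha > 0,\ f \in M.$$
It follows from this that

(21.15) $$\lim_{\alpha\to+\infty} \mu(\{|f| \ge \alpha\}) = 0 \qquad \text{uniformly in } f \in M.$$

For each $\varepsilon > 0$, 17.8 supplies a $\delta > 0$ such that
$$A \in \mathcal{A} \text{ and } \mu(A) \le \delta \implies \int_A g\,d\mu \le \varepsilon.$$
Together with (21.14) and (21.15) this gives
$$\lim_{\alpha\to+\infty} \int_{\{|f| \ge \alpha\}} |f|\,d\mu = 0 \qquad \text{uniformly for } f \in M,$$
that is, (21.13'), which we have seen entails equi-integrability of $M$. □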
Exercises. 1. Show that for any measure space $(\Omega, \mathcal{A}, \mu)$ a set $M$ of measurable numerical functions is equi-integrable if and only if for every $\varepsilon > 0$ there is an integrable function $h = h_\varepsilon \ge 0$ such that $\int (|f| - h)^+\,d\mu \le \varepsilon$ for all $f \in M$. [Hint: For sufficiently large $\eta > 0$, $g := \eta h$ will be a $2\varepsilon$-bound for $M$.]

2. Let $(\Omega, \mathcal{A}, \mu)$ be an arbitrary measure space, $1 \le p < +\infty$. Suppose the sequence $(f_n)$ in $\mathcal{L}^p(\mu)$ converges almost everywhere on $\Omega$ to a measurable real function $f$. Show that $f$ lies in $\mathcal{L}^p(\mu)$ and $(f_n)$ converges to $f$ in $p$th mean if the sequence $(|f_n|^p)$ is equi-integrable.

3. Show that from the $\mathcal{L}^p$-convergence of a sequence $(f_n)$ to a function $f \in \mathcal{L}^p(\mu)$ follows the $\mathcal{L}^1$-convergence of the sequence $(|f_n|^p)$ to $|f|^p$, for any $1 \le p < +\infty$.

4. Consider a finite measure $\mu$ and an $M \subset \mathcal{L}^1(\mu)$. For each $n \in \mathbb{N}$, $f \in M$ set
an(f):=nµ({n 0 from the sequence (f,,) in the Example from § 15. 7. Let (f), .x, µ) be a measurable space with µ(S2) < +oo, and let (v;)iE f be a family of finite and it-continuous measures on 0. Suppose this family is equi-continuous at 0, meaning that to every sequence (An)nEN in iA with A,, J. 0 and to every
c>0there is an nEENsuch that y;(A,)<efor all n>nE,and all iEI.Show that then this family is equi-µ-continuous in the following sense (cf. (21.12)): To every E > 0 there corresponds a 6 = 6e > 0 such that
and µ(A) 2. One important application of product measures is the introduction of the concept of convolution for measures and functions.
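§22. Products of $\sigma$-algebras and measures

Finitely many measurable spaces $(\Omega_j, \mathcal{A}_j)$, $j = 1, \ldots, n \in \mathbb{N}$ are given. We consider the product set
$$\Omega := \mathop{\times}_{j=1}^{n} \Omega_j = \Omega_1 \times \ldots \times \Omega_n$$
and for each $j$ the projection mapping
$$p_j : \Omega \to \Omega_j$$
which assigns to each point $(\omega_1, \ldots, \omega_n) \in \Omega$ its $j$th coordinate $\omega_j$. The $\sigma$-algebra in $\Omega$ generated by the mappings $p_1, \ldots, p_n$ is designated
$$\bigotimes_{j=1}^{n} \mathcal{A}_j := \mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$$
and called the product of the $\sigma$-algebras $\mathcal{A}_1, \ldots, \mathcal{A}_n$. According to (7.3) we have to do here with the smallest $\sigma$-algebra $\mathcal{A}$ in $\Omega$ such that each $p_j$ is $\mathcal{A}$-$\mathcal{A}_j$-measurable.

The reader may recall that the product of finitely many topological spaces is defined in a very similar way. An important principle of generation for such products is immediately at hand:

22.1 Theorem. For each $j = 1, \ldots, n$ let $\mathcal{E}_j$ be a generator of the $\sigma$-algebra $\mathcal{A}_j$ in $\Omega_j$ which contains a sequence $(E_{jk})_{k\in\mathbb{N}}$ of sets with $E_{jk} \uparrow \Omega_j$. Then the $\sigma$-algebra $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$ is generated by the system of all sets
$$E_1 \times \ldots \times E_n$$
with $E_j \in \mathcal{E}_j$ for each $j = 1, \ldots, n$.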
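Proof. Let $\mathcal{A}$ be any $\sigma$-algebra in $\Omega$. What we have to show is that the mappings $p_j$ are all $\mathcal{A}$-$\mathcal{A}_j$-measurable ($j = 1, \ldots, n$) if and only if $\mathcal{A}$ contains each of the sets $E_1 \times \ldots \times E_n$ described above. According to 7.2 $p_j$ is $\mathcal{A}$-$\mathcal{A}_j$-measurable just exactly if $p_j^{-1}(E_j) \in \mathcal{A}$ for every $E_j \in \mathcal{E}_j$. If this condition is fulfilled for each $j \in \{1, \ldots, n\}$, then the sets
$$E_1 \times \ldots \times E_n = p_1^{-1}(E_1) \cap \ldots \cap p_n^{-1}(E_n)$$
all lie in $\mathcal{A}$. If conversely, $E_1 \times \ldots \times E_n \in \mathcal{A}$ for every possible choice of $E_j \in \mathcal{E}_j$ and $j \in \{1, \ldots, n\}$, then upon fixing $E_j \in \mathcal{E}_j$, the sets
$$F_k := E_{1k} \times \ldots \times E_{j-1,k} \times E_j \times E_{j+1,k} \times \ldots \times E_{nk}, \qquad k \in \mathbb{N},$$
all lie in $\mathcal{A}$. Since the sequence $(F_k)_{k\in\mathbb{N}}$ increases to
$$\Omega_1 \times \ldots \times \Omega_{j-1} \times E_j \times \Omega_{j+1} \times \ldots \times \Omega_n = p_j^{-1}(E_j),$$
this set too lies in $\mathcal{A}$, for each $j$. The claim is therewith proven. □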
Remark. 1. The restriction imposed on the generators $\mathcal{E}_j$ cannot generally be dispensed with. Take, for example, $n := 2$, $\mathcal{A}_1 := \{\emptyset, \Omega_1\}$, $\mathcal{E}_1 := \{\emptyset\}$ and $\mathcal{E}_2 := \mathcal{A}_2$, in which $\mathcal{A}_2$ contains at least four sets.

A particular case of this theorem is the fact that the product $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$ is generated by all the sets $A_1 \times \ldots \times A_n$ with each $A_j \in \mathcal{A}_j$. Our further course will be guided by the following example:
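Example. For each $j \in \{1, \ldots, n\}$ let $\Omega_j := \mathbb{R}$, $\mathcal{A}_j := \mathcal{B}^1$ and $\mathcal{E}_j := \mathcal{I}^1$. The system of all sets $E_1 \times \ldots \times E_n$ with each $E_j \in \mathcal{I}^1$ is evidently just the system $\mathcal{I}^n$ of all right half-open intervals in $\mathbb{R}^n$. According to 6.1, $\mathcal{I}^n$ generates the $\sigma$-algebra $\mathcal{B}^n$ of $n$-dimensional Borel sets. Taken together with 22.1 - whose hypotheses are clearly satisfied here - this reveals that

(22.2) $$\mathcal{B}^n = \mathcal{B}^1 \otimes \ldots \otimes \mathcal{B}^1 \qquad (n \text{ factors on the right}).$$

By 6.2, $\lambda^n$ is the only measure on $\mathcal{B}^n$ which satisfies
$$\lambda^n(I_1 \times \ldots \times I_n) = \lambda^1(I_1) \cdot \ldots \cdot \lambda^1(I_n)$$
for all $I_1, \ldots, I_n \in \mathcal{I}^1$. This remark and the example preceding it lead to the following question.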
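Measure spaces $(\Omega_j, \mathcal{A}_j, \mu_j)$ are given, $1 \le j \le n$ with $n \ge 2$, and for each $\mathcal{A}_j$ a generator $\mathcal{E}_j$. Under what hypotheses can the existence of a measure $\pi$ on $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$ satisfying

(22.3) $$\pi(E_1 \times \ldots \times E_n) = \mu_1(E_1) \cdot \ldots \cdot \mu_n(E_n)$$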
for all E,ESj,I<j 2.
Remark. 2. In closing it should again be mentioned that a mapping
$$f : \Omega_0 \to \Omega_1 \times \ldots \times \Omega_n$$
of a measurable space $(\Omega_0, \mathcal{A}_0)$ into a product of measurable spaces $(\Omega_j, \mathcal{A}_j)$ is measurable with respect to the $\sigma$-algebra $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$ if and only if each component mapping $f_j := p_j \circ f$ of $f$ is $\mathcal{A}_0$-$\mathcal{A}_j$-measurable - a fact which is immediate from Theorem 7.4.

Exercise. Finitely many measurable spaces $(\Omega_j, \mathcal{A}_j)$ are given, $j = 1, \ldots, n$. Show that the algebra in $\Omega_1 \times \ldots \times \Omega_n$ generated by all sets $A_1 \times \ldots \times A_n$ with each $A_j \in \mathcal{A}_j$ consists of all finite unions of such product sets.
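§23. Product measures and Fubini's theorem

Initially measure spaces $(\Omega_1, \mathcal{A}_1, \mu_1)$, $(\Omega_2, \mathcal{A}_2, \mu_2)$ are given. For every $Q \subset \Omega_1 \times \Omega_2$ the sets

(23.1) $$Q_{\omega_1} := \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in Q\}, \qquad Q_{\omega_2} := \{\omega_1 \in \Omega_1 : (\omega_1, \omega_2) \in Q\}$$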
are called, respectively, the $\omega_1$-section of $Q$ ($\omega_1 \in \Omega_1$) and the $\omega_2$-section of $Q$ ($\omega_2 \in \Omega_2$). This notation is chosen for typographic simplicity and will see us through §23, after which it is not needed. In case $\Omega_1 = \Omega_2$, however, it presents obvious problems, to circumvent which, alternative notations like ${}_{\omega_1}Q$ or $Q^{\omega_1}$ for $Q_{\omega_1}$ are also popular in the literature. About these sets we claim:
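23.1 Lemma. If $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$, then its $\omega_1$-section lies in $\mathcal{A}_2$ for every $\omega_1 \in \Omega_1$, and its $\omega_2$-section lies in $\mathcal{A}_1$ for every $\omega_2 \in \Omega_2$.

Proof. For arbitrary subsets $Q, Q_1, Q_2, \ldots$ of $\Omega := \Omega_1 \times \Omega_2$, and points $\omega_1 \in \Omega_1$
$$(\Omega \setminus Q)_{\omega_1} = \Omega_2 \setminus Q_{\omega_1} \qquad \text{and} \qquad \Bigl(\bigcup_{n\in\mathbb{N}} Q_n\Bigr)_{\omega_1} = \bigcup_{n\in\mathbb{N}} (Q_n)_{\omega_1}.$$
Furthermore $\Omega_{\omega_1} = \Omega_2$, and more generally for $A_1 \subset \Omega_1$, $A_2 \subset \Omega_2$ we have
$$(A_1 \times A_2)_{\omega_1} = \begin{cases} A_2 & \text{if } \omega_1 \in A_1 \\ \emptyset & \text{if } \omega_1 \in \Omega_1 \setminus A_1. \end{cases}$$
For each $\omega_1 \in \Omega_1$, therefore, the system of all sets $Q \subset \Omega$ having section $Q_{\omega_1} \in \mathcal{A}_2$ is a $\sigma$-algebra in $\Omega$ which contains every product set $A_1 \times A_2$ with $A_1 \in \mathcal{A}_1$, $A_2 \in \mathcal{A}_2$. But according to 22.1 $\mathcal{A}_1 \otimes \mathcal{A}_2$ is the smallest $\sigma$-algebra which contains all such product sets. This proves the part of the lemma dealing with $\omega_1$-sections. Of course, $\omega_2$-sections are treated the same way. □

Since now $\mu_2(Q_{\omega_1})$ and $\mu_1(Q_{\omega_2})$ make sense for all $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$, $\omega_1 \in \Omega_1$ and $\omega_2 \in \Omega_2$, we are in a position to take the next step: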
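23.2 Lemma. Suppose the measures $\mu_1$ and $\mu_2$ are $\sigma$-finite. Then for every $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$ the functions
$$\omega_1 \mapsto \mu_2(Q_{\omega_1}) \qquad \text{and} \qquad \omega_2 \mapsto \mu_1(Q_{\omega_2})$$
on $\Omega_1$ and $\Omega_2$, respectively, are $\mathcal{A}_1$-measurable and $\mathcal{A}_2$-measurable, respectively.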
III. Product Measures
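Proof. The function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ will be denoted by $s_Q$. We will establish the $\mathcal{A}_1$-measurability of $s_Q$, for each $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$. The other function can be treated analogously.

First suppose that $\mu_2(\Omega_2) < +\infty$. In this case the set $\mathcal{D}$ of all $D \in \mathcal{A}_1 \otimes \mathcal{A}_2$ whose $s_D$ function is $\mathcal{A}_1$-measurable constitutes a Dynkin system in $\Omega := \Omega_1 \times \Omega_2$. This involves the following easily checked assertions: $s_\Omega = \mu_2(\Omega_2)$; $s_{\Omega \setminus D} = s_\Omega - s_D$ for every $D \in \mathcal{D}$; $s_{\bigcup D_n} = \sum s_{D_n}$ for every sequence $(D_n)$ of disjoint sets in $\mathcal{D}$. Furthermore $\mathcal{D}$ contains $A_1 \times A_2$ for every $A_1 \in \mathcal{A}_1$, $A_2 \in \mathcal{A}_2$, since
$$s_{A_1 \times A_2} = \mu_2(A_2) \cdot 1_{A_1}.$$
The system $\mathcal{E}$ of all such $A_1 \times A_2$ is $\cap$-stable and generates $\mathcal{A}_1 \otimes \mathcal{A}_2$, by 22.1. Therefore 2.4 insures that $\mathcal{A}_1 \otimes \mathcal{A}_2$ is the Dynkin system generated by it. From $\mathcal{E} \subset \mathcal{D} \subset \mathcal{A}_1 \otimes \mathcal{A}_2$ therefore follows that $\mathcal{D} = \mathcal{A}_1 \otimes \mathcal{A}_2$, which is what is being claimed.

If $\mu_2$ is only $\sigma$-finite, then there is a sequence $(B_n)$ of sets from $\mathcal{A}_2$, each of finite $\mu_2$-measure, with $B_n \uparrow \Omega_2$. For each $n$, $A_2 \mapsto \mu_2(A_2 \cap B_n)$ is therefore a finite measure $\mu_{2,n}$ on $\mathcal{A}_2$, to which the already proven result can be applied, showing that $\omega_1 \mapsto \mu_{2,n}(Q_{\omega_1})$ is $\mathcal{A}_1$-measurable for each $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$. Now
$$\mu_2(Q_{\omega_1}) = \sup_{n\in\mathbb{N}} \mu_{2,n}(Q_{\omega_1})$$
because of the continuity from below of the measure $\mu_2$. From Theorem 9.5 then the mapping $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ is indeed $\mathcal{A}_1$-measurable. □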
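On finite sets, where every subset is measurable, sections and the function $s_Q$ can be computed directly. The Python sketch below does this for two small weighted sets; the sets, the weights, and the helper names `section_1`, `section_2` and `measure` are all illustrative assumptions. It also checks the identity $s_{A_1 \times A_2} = \mu_2(A_2) \cdot 1_{A_1}$ used in the proof and compares the two iterated sums obtained by integrating the section functions.

```python
from fractions import Fraction

def section_1(Q, w1):
    """omega_1-section of Q: all w2 with (w1, w2) in Q."""
    return {b for (a, b) in Q if a == w1}

def section_2(Q, w2):
    """omega_2-section of Q: all w1 with (w1, w2) in Q."""
    return {a for (a, b) in Q if b == w2}

def measure(mu, A):
    """Measure of a finite set A under point weights mu."""
    return sum((mu[w] for w in A), Fraction(0))

# Two finite measure spaces given by point weights (hypothetical data).
mu1 = {"a": Fraction(1, 2), "b": Fraction(3), "c": Fraction(0)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3)}
Q = {("a", 0), ("a", 1), ("b", 1), ("c", 0)}

# Integrate s_Q(w1) = mu2(Q_{w1}) against mu1, and the analogue in the other order.
first = sum(measure(mu2, section_1(Q, w1)) * mu1[w1] for w1 in mu1)
second = sum(measure(mu1, section_2(Q, w2)) * mu2[w2] for w2 in mu2)
print(first, second)

# For a rectangle the section function is mu2(A2) times the indicator of A1.
A1, A2 = {"a", "b"}, {1}
R = {(a, b) for a in A1 for b in A2}
s_R = {w1: measure(mu2, section_1(R, w1)) for w1 in mu1}
print(s_R)
```

Both iterated sums give the same value for this $Q$, although $Q$ is not a rectangle.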
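It is now rather simple to construct the measure $\pi$ that we seek:

23.3 Theorem. Let $(\Omega_j, \mathcal{A}_j, \mu_j)$ be $\sigma$-finite measure spaces, $j = 1, 2$. Then there is exactly one measure $\pi$ on $\mathcal{A}_1 \otimes \mathcal{A}_2$ which satisfies

(23.2) $$\pi(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2) \qquad \text{for all } A_1 \in \mathcal{A}_1,\ A_2 \in \mathcal{A}_2.$$

In addition this measure satisfies

(23.3) $$\pi(Q) = \int \mu_2(Q_{\omega_1})\,\mu_1(d\omega_1) = \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2) \qquad \text{for all } Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$$

and is $\sigma$-finite.

Proof. As before, for each $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$ let $s_Q$ denote the $\mathcal{A}_1$-measurable function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ on $\Omega_1$; it is of course non-negative. Consequently via
$$\pi(Q) := \int s_Q\,d\mu_1$$
a non-negative function $\pi$ is well defined on $\mathcal{A}_1 \otimes \mathcal{A}_2$. For every sequence $(Q_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets from $\mathcal{A}_1 \otimes \mathcal{A}_2$ the equality $s_{\bigcup Q_n} = \sum s_{Q_n}$ and 11.5 insure that
$$\pi\Bigl(\bigcup_{n\in\mathbb{N}} Q_n\Bigr) = \sum_{n=1}^{\infty} \pi(Q_n).$$
Since $s_\emptyset = 0$ we have $\pi(\emptyset) = 0$. This proves that $\pi$ is indeed a measure on $\mathcal{A}_1 \otimes \mathcal{A}_2$. It has property (23.2) because $s_{A_1 \times A_2} = \mu_2(A_2) 1_{A_1}$, whence integration yields $\pi(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$.

Proceeding analogously, we confirm that
$$\pi'(Q) := \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$$
also defines a measure on $\mathcal{A}_1 \otimes \mathcal{A}_2$ having this property. But when Theorem 22.2 is applied to $\mathcal{E}_1 := \mathcal{A}_1$ and $\mathcal{E}_2 := \mathcal{A}_2$ it affirms that there is at most one such measure. Thus $\pi = \pi'$ and (23.3) is confirmed. There is a sequence $(A_{jn})_{n\in\mathbb{N}}$ of sets from $\mathcal{A}_j$, each of finite $\mu_j$-measure, with $A_{jn} \uparrow \Omega_j$, for $j = 1$ and $j = 2$. Using these as the $A_1$, $A_2$, respectively, in (23.2) proves the $\sigma$-finiteness of $\pi$ because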
r(A1nxA2n) y}, namely
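$$E := \{(\omega, t) \in \Omega \times \mathbb{R}_+ : f(\omega) > t\},$$
lies in the product of $\mathcal{A}$ and the Borel $\sigma$-algebra of $\mathbb{R}_+$. Theorem 23.6 for the product measure $\mu \otimes \lambda^1$ consequently supplies the equalities

(23.8) $$\iint \varphi'(t) 1_E(\omega, t)\,\lambda^1(dt)\,\mu(d\omega) = \iint \varphi'(t) 1_E(\omega, t)\,\mu(d\omega)\,\lambda^1(dt) = \int \varphi'(t)\,\mu(E_t)\,\lambda^1(dt) = \int \varphi'(t)\,\mu(\{f > t\})\,\lambda^1(dt),$$

since the $t$-section of $E$ is just the set of all $\omega \in \Omega$ which satisfy $f(\omega) > t$. As $\varphi$ is isotone, $\varphi'(t) \ge 0$ for all $t > 0$. The continuous function $\varphi'$ is integrable over $[1/n, a]$ whenever $1/n < a < +\infty$, and since $[1/n, a] \uparrow\ ]0, a]$, and
$$\int_{]0,a]} \varphi'(t)\,\lambda^1(dt) = \lim_{n\to\infty} \int_{1/n}^{a} \varphi'(t)\,dt = \varphi(a) - \lim_{n\to\infty} \varphi(1/n) = \varphi(a)$$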
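($\varphi(0) = 0$ and $\varphi$ is continuous on $\mathbb{R}_+$), we see that $\varphi'$ is also integrable over $]0, a]$ for every $a > 0$. It follows from $f \ge 0$ and the preceding calculation that
$$\int_{]0, f(\omega)]} \varphi'(t)\,\lambda^1(dt) = \varphi(f(\omega)) \qquad \text{for every } \omega \in \Omega,$$
both expressions being 0 whenever $f(\omega) = 0$. We thus get
$$\int \varphi \circ f\,d\mu = \int \Bigl(\int_{]0, f(\omega)]} \varphi'(t)\,\lambda^1(dt)\Bigr)\mu(d\omega) = \iint \varphi'(t) 1_{]0, f(\omega)]}(t)\,\lambda^1(dt)\,\mu(d\omega) = \iint \varphi'(t) 1_E(\omega, t)\,\lambda^1(dt)\,\mu(d\omega),$$
which combined with (23.8) concludes the proof. □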
Example. 2. The relevant hypotheses are certainly fulfilled by the functions $\varphi(t) := t^p$ with $p > 0$. Thus for every $\mathcal{A}$-measurable real function $f \ge 0$ on $\Omega$

(23.9) $$\int f^p\,d\mu = p \int_0^{+\infty} t^{p-1} \mu(\{f > t\})\,dt.$$

When $p = 1$ we get the especially important formula

(23.10) $$\int f\,d\mu = \int_{\mathbb{R}_+} \mu(\{f > t\})\,\lambda^1(dt) = \int_0^{+\infty} \mu(\{f > t\})\,dt.$$

The reader should not overlook the geometric significance of this, which is that the integral $\int f\,d\mu$ is formed "vertically", while the integral on the right-hand side of (23.10) is formed "horizontally".
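The "vertical" and "horizontal" ways of integrating can be compared exactly for a function taking finitely many values. The Python sketch below does so on a finite weighted set, using exact rational arithmetic and an integer exponent $p$. The data and the function names `vertical_integral` and `horizontal_integral` are assumptions chosen for illustration. Between two consecutive values of $f$ the function $t \mapsto \mu(\{f > t\})$ is constant, which turns the right-hand sides of (23.9) and (23.10) into finite sums.

```python
from fractions import Fraction

def vertical_integral(values, weights, p=1):
    """Sum of f(w)**p * mu({w}) over all points w."""
    return sum((Fraction(v) ** p * Fraction(w) for v, w in zip(values, weights)),
               Fraction(0))

def horizontal_integral(values, weights, p=1):
    """p * integral over t > 0 of t**(p-1) * mu({f > t}), evaluated exactly."""
    levels = sorted({Fraction(v) for v in values if v > 0})
    total = Fraction(0)
    previous = Fraction(0)
    for level in levels:
        # For previous <= t < level the set {f > t} equals {f >= level}.
        mass = sum((Fraction(w) for v, w in zip(values, weights) if v >= level),
                   Fraction(0))
        # The integral of p * t**(p-1) over [previous, level] is level**p - previous**p.
        total += (level ** p - previous ** p) * mass
        previous = level
    return total

values = [0, 1, 3, 3, Fraction(1, 2)]      # f at five points
weights = [2, 1, Fraction(1, 4), 1, 5]     # mu of each point

for p in (1, 2, 3):
    print(p, vertical_integral(values, weights, p), horizontal_integral(values, weights, p))
```

For each exponent the two computations return the same rational number, as (23.9) predicts.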
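Now at last we turn back to the general case of §22 and consider finitely many $\sigma$-finite measure spaces $(\Omega_j, \mathcal{A}_j, \mu_j)$, $j = 1, \ldots, n$ and $n \ge 2$. The two product sets $(\Omega_1 \times \ldots \times \Omega_{n-1}) \times \Omega_n$ and $\Omega_1 \times \ldots \times \Omega_{n-1} \times \Omega_n$ will be identified via the bijection
$$((\omega_1, \ldots, \omega_{n-1}), \omega_n) \mapsto (\omega_1, \ldots, \omega_{n-1}, \omega_n).$$
The agreed-upon equality of these sets leads at once to the equality of the corresponding products of $\sigma$-algebras:

(23.11) $$(\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}) \otimes \mathcal{A}_n = \mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1} \otimes \mathcal{A}_n.$$

In fact, by 22.1 the sets $A_1 \times \ldots \times A_{n-1}$ with each $A_j \in \mathcal{A}_j$ generate $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}$, and by the same theorem the sets
$$(A_1 \times \ldots \times A_{n-1}) \times A_n = A_1 \times \ldots \times A_n$$
then generate $(\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}) \otimes \mathcal{A}_n$ as well as $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1} \otimes \mathcal{A}_n$.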
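In a completely analogous fashion one confirms a general associativity in the formation of products of $\sigma$-algebras:

(23.12) $$\Bigl(\bigotimes_{j=1}^{m} \mathcal{A}_j\Bigr) \otimes \Bigl(\bigotimes_{j=m+1}^{n} \mathcal{A}_j\Bigr) = \bigotimes_{j=1}^{n} \mathcal{A}_j \qquad (1 \le m < n).$$

… $\ge 2$ of factors via induction on $n$.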
23.9 Theorem. $\sigma$-finite measures $\mu_1, \ldots, \mu_n$ on $\sigma$-algebras $\mathcal{A}_1, \ldots, \mathcal{A}_n$ uniquely determine a measure $\pi$ on $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$ such that

(23.13) $$\pi(A_1 \times \ldots \times A_n) = \mu_1(A_1) \cdot \ldots \cdot \mu_n(A_n) \qquad \text{for all } A_j \in \mathcal{A}_j,\ 1 \le j \le n.$$

This measure $\pi$ is $\sigma$-finite.

Corresponding to Definition 23.4, $\pi$ is called the product of the measures $\mu_1, \ldots, \mu_n$ and is denoted by
$$\bigotimes_{j=1}^{n} \mu_j = \mu_1 \otimes \ldots \otimes \mu_n.$$
The question posed in §22 is finally answered in full, by this theorem.

Proof. In 22.2 take for the various generators $\mathcal{E}_j$ the $\sigma$-algebra $\mathcal{A}_j$ itself, and learn that there is at most one measure $\pi$ which satisfies (23.13). The existence question has already been settled for $n = 2$, in 23.3. We make the inductive assumption that $\pi' := \mu_1 \otimes \ldots \otimes \mu_{n-1}$ exists for some $n > 2$ and show how that leads to the existence of $\mu_1 \otimes \ldots \otimes \mu_n$. Evidently the $\sigma$-finiteness of $\mu_1, \ldots, \mu_{n-1}$ entails that of $\pi'$, as in the proof of Theorem 23.3. That theorem therefore supplies us with a measure $\pi := \pi' \otimes \mu_n$ on $(\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}) \otimes \mathcal{A}_n$ which satisfies
$$\pi(Q' \times A_n) = \pi'(Q')\,\mu_n(A_n)$$
for all $Q' \in \mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}$ and all $A_n \in \mathcal{A}_n$. Because of (23.11) this measure does what is wanted at level $n$, completing the induction. Again, $\sigma$-finiteness of $\pi$ is confirmed exactly as in the proof of 23.3. □

This inductive construction of the $n$-fold product measure builds in the equality

(23.14) $$(\mu_1 \otimes \ldots \otimes \mu_{n-1}) \otimes \mu_n = \mu_1 \otimes \ldots \otimes \mu_{n-1} \otimes \mu_n.$$

By now familiar considerations show that in fact a general associativity prevails in the formation of product measures:

(23.15) $$\Bigl(\bigotimes_{j=1}^{m} \mu_j\Bigr) \otimes \Bigl(\bigotimes_{j=m+1}^{n} \mu_j\Bigr) = \bigotimes_{j=1}^{n} \mu_j.$$

In particular
=
V
®V,
(1<m 0 be an s91®... ®.c 4-measurable numerical function on 01 x... x Stn. Then for every permutation j1, ... , j,, of 1, ... , n
(23.16) $$\int f\,d(\mu_1 \otimes \ldots \otimes \mu_n) = \int \Bigl(\ldots \Bigl(\int \Bigl(\int f(\omega_1, \ldots, \omega_n)\,\mu_{j_1}(d\omega_{j_1})\Bigr)\mu_{j_2}(d\omega_{j_2})\Bigr) \ldots \Bigr)\mu_{j_n}(d\omega_{j_n}).$$

Every integral that occurs on the right-hand side is measurable with respect to the product of the appropriate $\mathcal{A}_j$, namely those corresponding to the coordinates in which integration has not yet occurred. This right-hand side is often written in the shorter fashion
$$\int \ldots \int f(\omega_1, \ldots, \omega_n)\,\mu_{j_1}(d\omega_{j_1}) \ldots \mu_{j_n}(d\omega_{j_n}).$$
The simple proof of this theorem (involving induction), as well as the formulation and proof of the analog of 23.7, will be left to the reader. One more piece of notation is convenient:
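23.10 Definition. For finitely many $\sigma$-finite measure spaces $(\Omega_j, \mathcal{A}_j, \mu_j)$, $1 \le j \le n$, the triple
$$\Bigl(\mathop{\times}_{j=1}^{n} \Omega_j,\ \bigotimes_{j=1}^{n} \mathcal{A}_j,\ \bigotimes_{j=1}^{n} \mu_j\Bigr)$$
is called the product of these measure spaces and is denoted by
$$\bigotimes_{j=1}^{n} (\Omega_j, \mathcal{A}_j, \mu_j).$$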
Remark. 2. Throughout the preceding the index set was finite. But there is also a theory of products of (finite) measures indexed by arbitrary sets, which is particularly important in probability theory; it is treated in detail by BAUER [1996], and somewhat more extensively in HEWITT and STROMBERG [1965]. For p-measures SAEKI [1996] gives a short, elementary proof that uses only 5.1.

In closing we will consider the case where each measure $\mu_j$ comes with a real density $f_j \ge 0$. According to Theorem 17.11, $\nu_j := f_j \mu_j$ is then a $\sigma$-finite measure too.
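23.11 Theorem. Let $(\Omega_j, \mathcal{A}_j, \mu_j)$ be $\sigma$-finite measure spaces and $f_j \ge 0$ real-valued $\mathcal{A}_j$-measurable functions on $\Omega_j$. Set
$$\nu_j := f_j \mu_j, \qquad j = 1, \ldots, n.$$
Then the product of these measures is defined and satisfies

(23.17) $$\bigotimes_{j=1}^{n} \nu_j = F \cdot \Bigl(\bigotimes_{j=1}^{n} \mu_j\Bigr)$$

with the density function

(23.18) $$F(\omega_1, \ldots, \omega_n) := \prod_{j=1}^{n} f_j(\omega_j).$$

The function $F$ is the so-called tensor product of the densities $f_1, \ldots, f_n$.

Proof. As already noted, 17.11 insures that each measure $\nu_j$ is $\sigma$-finite, guaranteeing that their product is defined. It suffices to treat the case $n = 2$ and refer the general case to induction. For sets $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$
$$\nu_1(A_1)\nu_2(A_2) = \Bigl(\int_{A_1} f_1\,d\mu_1\Bigr)\Bigl(\int_{A_2} f_2\,d\mu_2\Bigr) = \iint 1_{A_1}(\omega_1) f_1(\omega_1) 1_{A_2}(\omega_2) f_2(\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2) = \iint 1_{A_1 \times A_2}(\omega_1, \omega_2) F(\omega_1, \omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2).$$
From 23.6 therefore
$$\nu_1(A_1)\nu_2(A_2) = \int_{A_1 \times A_2} F\,d(\mu_1 \otimes \mu_2) \qquad \text{for all } A_1 \in \mathcal{A}_1,\ A_2 \in \mathcal{A}_2.$$
But then according to 23.3, $\nu_1 \otimes \nu_2$ coincides with the measure $F \cdot (\mu_1 \otimes \mu_2)$. □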
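On finite sets the statement about tensor-product densities reduces to a pointwise identity that can be checked mechanically. The Python sketch below builds two weighted finite sets with densities, forms $\nu_1 \otimes \nu_2$ directly, and compares it with the measure having density $F(\omega_1, \omega_2) = f_1(\omega_1) f_2(\omega_2)$ with respect to $\mu_1 \otimes \mu_2$. All data and the helper name `product_measure` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Point weights of two finite measures and non-negative densities (hypothetical data).
mu1 = {"a": Fraction(1, 2), "b": Fraction(3)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}
f1 = {"a": Fraction(4), "b": Fraction(0)}
f2 = {0: Fraction(1, 2), 1: Fraction(5), 2: Fraction(2)}

# nu_j := f_j mu_j, the measure with density f_j with respect to mu_j.
nu1 = {w: f1[w] * mu1[w] for w in mu1}
nu2 = {w: f2[w] * mu2[w] for w in mu2}

def product_measure(m1, m2):
    """Point weights of the product measure on the product of two finite sets."""
    return {(a, b): m1[a] * m2[b] for a, b in product(m1, m2)}

lhs = product_measure(nu1, nu2)                              # nu1 (x) nu2
pi = product_measure(mu1, mu2)                               # mu1 (x) mu2
rhs = {(a, b): f1[a] * f2[b] * pi[(a, b)] for (a, b) in pi}  # F . (mu1 (x) mu2)
print(lhs == rhs)
```

In this finite setting the identity is just the commutativity of multiplication at each point; the theorem asserts that the same relation persists for general $\sigma$-finite measure spaces.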
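Exercises. 1. Consider $\Omega_1 = \Omega_2 := \mathbb{R}$, $\mathcal{A}_1 = \mathcal{A}_2 := \mathcal{B}^1$, $\mu_1 := \lambda^1$ and $\mu_2$ the non-$\sigma$-finite counting measure on $\mathcal{B}^1$ (cf. Example 3, §5). Show that equality (23.3) fails to hold for $Q := D$, the diagonal $\{(\omega, \omega) : \omega \in \mathbb{R}\}$ in $\Omega_1 \times \Omega_2$. Why does $D$ lie in $\mathcal{A}_1 \otimes \mathcal{A}_2 = \mathcal{B}^2$?

2. Show that the function $(x, y) \mapsto 2e^{-2xy} - e^{-xy}$ is not $\lambda^2$-integrable over the set $[1, +\infty[\ \times\ [0, 1]$.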
3. With the aid of Tonelli's theorem find a new proof of Theorem 8.1 along the following lines: If $\mu$ is a translation-invariant measure on $\mathcal{B}^d$, $\mu([0,1[) = 1$, and $f \ge 0$, $g \ge 0$ are Borel measurable numerical functions on $\mathbb{R}^d$, compare the integrals
$$\iint g(x) f(x + y)\,\mu(dx)\,\lambda^d(dy) \qquad \text{and} \qquad \iint g(y - x) f(y)\,\mu(dx)\,\lambda^d(dy)$$
and, finally, take $f$ to be any indicator function, $g$ the indicator function of $[0, 1[$.
2
I:= f e_x dx, 0
and thereby evaluate anew the important integral G = 21 in (16.1), in the folye_y2V2 lowing simple way: fo a-e2 dt = fo dx for every y > 0 and therefore
I2 = f °° (, fn f (x, y) dx) dy for the function f on R+ x R+ defined by f (x, y) yP-v2(1+z2). Applying Tonelli's theorem leads to I = 2Vr7r.
5. Let $|x| := (x_1^2 + \ldots + x_d^2)^{1/2}$ denote the usual euclidean norm of the vector $x := (x_1, \ldots, x_d) \in \mathbb{R}^d$. Show that the function $x \mapsto e^{-|x|^\alpha}$ is $\lambda^d$-integrable for every $\alpha > 0$. (Recall Exercise 2 of §16.) In case $\alpha = 2$, show that the $\lambda^d$-integral of this function is $G^d$.
6. KL(xo) will denote the closed ball in Rd with center xo and radius r > 0. Set ad :_ and prove that ,\d(K*(xo)) = adrd .
Show also that the numbers ad can be calculated by a2q = 4 9rq,
2q(2q
and a2q- i = 1 3
- 1)
a-1
(q E Dl).
[Hint: Use (7.10) and note that every xd-section of K,.(0) is either empty or is a (d-1)-dimensional closed ball. Tone1G's theorem then leads to a recursion formula for the ad. Here, of course, 7r has its customary geometric meaning.]
How do these relations change if we replace K,.(xo) by the open ball Kr(xo) in Rd of radius r and center xo? [Cf. Exercise 3 in §7.] 7. For every compact interval [a, ,Q] C R+ designate by R(a, Q) the spherical shell
K,3(0) \ K.(0) _ {x E Rd : a < IxI < /3} . Show that for every continuous real function h on such an interval (a, /3] C R+
f
h(Jxj)Ad(dx) = d ad f
.
a
R(a,p)
h(t)td-1
dt,
ad being the number ad(KI (0)) from the preceding exercise. [Hint: The function H defined on [a, p) by
H(t) := f
h(IxI)J1d(dx),
is differentiable with H'(t) = d ad h(t) td-1 for all such t.] 8. Apply the result of Exercise 7 to the case d = 2 and h(t) := show, using Exercise 5, once again that G = f.
tE
a-t2
in order to
9. Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space, $f : \Omega \to \mathbb{R}_+$ measurable. Show that the set of all $t > 0$ such that $\mu(\{f = t\}) \ne 0$, as well as the set of all $t > 0$ such that $\mu(\{f > t\}) \ne \mu(\{f \ge t\})$ is countable. Therefore in the equalities (23.8), (23.9) and (23.10), $\mu(\{f > t\})$ can always be replaced by $\mu(\{f \ge t\})$.
§24. Convolution of finite Borel measures

Consider the $d$-dimensional Borel measurable space $(\mathbb{R}^d, \mathcal{B}^d)$. Every finite measure $\mu$ on $\mathcal{B}^d$ will be called a finite or also a bounded Borel measure, and the set of all of them will be designated by $\mathcal{M}_+^b(\mathbb{R}^d)$. For every such $\mu$ the number

(24.1) $$\|\mu\| := \mu(\mathbb{R}^d)$$

is called the total mass of $\mu$. Making critical use of the group structure of $(\mathbb{R}^d, +)$ a so-called convolution product can be assigned to any finitely many measures $\mu_1, \ldots, \mu_n \in \mathcal{M}_+^b(\mathbb{R}^d)$; in contrast to the previously studied product measure, it is again a measure on the original $\sigma$-algebra $\mathcal{B}^d$, even an element of $\mathcal{M}_+^b(\mathbb{R}^d)$. What we do below can be carried out in every (abelian) locally compact group. We cannot, however, go into this generalization, but must instead refer interested readers to the excellent monographs of HEWITT and ROSS [1979] and RUDIN [1962].

Initially we consider the product measure $\mu_1 \otimes \ldots \otimes \mu_n$ defined in §23. Since $\mathcal{B}^{nd} = \mathcal{B}^d \otimes \ldots \otimes \mathcal{B}^d$, this measure is an element of $\mathcal{M}_+^b(\mathbb{R}^{nd})$. The mapping $A_n : \mathbb{R}^{nd} \to \mathbb{R}^d$ defined by
$$A_n(x_1, \ldots, x_n) := x_1 + \ldots + x_n$$
is continuous, and so $\mathcal{B}^{nd}$-$\mathcal{B}^d$-measurable. The following definition accordingly makes sense:
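24.1 Definition. The image under the mapping $A_n$ of the product measure $\mu_1 \otimes \ldots \otimes \mu_n$ is called the convolution product of the measures $\mu_1, \ldots, \mu_n \in \mathcal{M}_+^b(\mathbb{R}^d)$, in symbols

(24.2) $$\mu_1 * \ldots * \mu_n := A_n(\mu_1 \otimes \ldots \otimes \mu_n).$$

The theorems on product and image measures combine to yield the most important properties of the convolution operation $*$. First of all, $\mu_1 * \ldots * \mu_n$ is again an element of $\mathcal{M}_+^b(\mathbb{R}^d)$ and
$$\mu_1 * \ldots * \mu_n(\mathbb{R}^d) = \mu_1 \otimes \ldots \otimes \mu_n(\mathbb{R}^{nd}) = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|$$
so that in fact

(24.3) $$\|\mu_1 * \ldots * \mu_n\| = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|.$$

In studying the convolution product it suffices to deal with $n = 2$, because

(24.4) $$\mu_1 * \ldots * \mu_n * \mu_{n+1} = (\mu_1 * \ldots * \mu_n) * \mu_{n+1}$$

for every $n + 1$ measures from $\mathcal{M}_+^b(\mathbb{R}^d)$. To see this, introduce the continuous mapping $B_{n+1} : \mathbb{R}^{(n+1)d} \to \mathbb{R}^{2d}$ by
$$B_{n+1}(x_1, \ldots, x_n, x_{n+1}) := (x_1 + \ldots + x_n, x_{n+1})$$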
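and have $A_{n+1} = A_2 \circ B_{n+1}$. Checking that

$B_{n+1}(\mu_1 \otimes \dots \otimes \mu_n \otimes \mu_{n+1}) = A_n(\mu_1 \otimes \dots \otimes \mu_n) \otimes \mu_{n+1}$,

and remembering that the formation of image measures is transitive, we get

$\mu_1 * \dots * \mu_n * \mu_{n+1} = A_2(B_{n+1}(\mu_1 \otimes \dots \otimes \mu_n \otimes \mu_{n+1})) = A_2((\mu_1 * \dots * \mu_n) \otimes \mu_{n+1})$,

which confirms (24.4). Henceforth therefore $n = 2$. For any measures $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any $\mathscr{B}^d$-measurable numerical function $f \ge 0$ it follows from 19.1 and 23.6 that

(24.5)  $\int f \, d(\mu * \nu) = \int f \circ A_2 \, d(\mu \otimes \nu) = \iint f(x + y)\, \mu(dx)\, \nu(dy) = \iint f(x + y)\, \nu(dy)\, \mu(dx)$.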
As this holds for $f := 1_B$, the indicator function of any set $B \in \mathscr{B}^d$, we have

(24.6)  $\mu * \nu(B) = \int \mu(B - y)\, \nu(dy) = \int \nu(B - x)\, \mu(dx)$.

(Recall (7.8) that $B - x = -x + B$.) Consequently $*$ is a commutative, and by (24.4) also an associative operation in $\mathscr{M}_+^b(\mathbb{R}^d)$. Due to 19.2 and 23.7, the equalities (24.5) are valid as well for every $\mu * \nu$-integrable numerical function $f$ on $\mathbb{R}^d$. Equality (24.6) is frequently taken as the definition of $\mu * \nu$.

Evidently $\mathscr{M}_+^b(\mathbb{R}^d)$ is closed with respect to addition and under multiplication by numbers in $\mathbb{R}_+$. From (24.6) we immediately see the relation of convolution to these two operations: For all $\mu, \nu, \nu_1, \nu_2 \in \mathscr{M}_+^b(\mathbb{R}^d)$, $\alpha \in \mathbb{R}_+$

(24.7)  $\mu * (\nu_1 + \nu_2) = \mu * \nu_1 + \mu * \nu_2$,

(24.8)  $\mu * (\alpha \nu) = (\alpha \mu) * \nu = \alpha (\mu * \nu)$.
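The distributive law (24.7) even holds in the following generality: For every sequence $(\nu_n)_{n \in \mathbb{N}}$ of measures from $\mathscr{M}_+^b(\mathbb{R}^d)$ satisfying $\sum_{n=1}^{\infty} \|\nu_n\| < +\infty$, the sum $\sum_{n=1}^{\infty} \nu_n$ is also a measure in $\mathscr{M}_+^b(\mathbb{R}^d)$ (cf. Example 4 of §3). Taking account of 11.5, it therefore follows from (24.6) that

(24.9)  $\mu * \Bigl( \sum_{n=1}^{\infty} \nu_n \Bigr) = \sum_{n=1}^{\infty} \mu * \nu_n$

for every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$.

Let us now compute $\mu * \nu$ in some special cases.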
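1. We again denote by $T_a$ the translation mapping $x \mapsto x + a$ of $\mathbb{R}^d$ onto itself via $a \in \mathbb{R}^d$, and by $\varepsilon_a$ the (Dirac) measure on $\mathscr{B}^d$ defined by unit mass at the point $a$. Of course, $\varepsilon_a \in \mathscr{M}_+^b(\mathbb{R}^d)$ and $\|\varepsilon_a\| = 1$. From (24.6) it follows that $\varepsilon_a * \mu(B) = \mu(B - a) = \mu(T_a^{-1}(B))$ for all $B \in \mathscr{B}^d$, and so

(24.10)  $\varepsilon_a * \mu = T_a(\mu)$  for all $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, $a \in \mathbb{R}^d$.

Now $T_0$ is the identity mapping, so $\varepsilon_0$ is a (and obviously the only) unit with respect to convolution. If, namely, $\varepsilon$ were also a unit, meaning that $\mu = \varepsilon * \mu$ for every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, then it would follow that $\varepsilon_0 = \varepsilon * \varepsilon_0 = \varepsilon$. For the special choice $\mu := \varepsilon_b$, (24.10) says that

(24.10')  $\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}$  for all $a, b \in \mathbb{R}^d$.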
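The identities (24.3), (24.10) and (24.10') can be checked mechanically on measures with finite support. The following is only a minimal sketch under simplifying assumptions: $d = 1$, every measure is a finite sum of point masses at integers, and the names `convolve`, `total_mass` and `dirac` are illustrative helpers that do not come from the text.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(mu, nu):
    """Convolution of two finitely supported measures on the integers.

    A measure is a dict {point: mass}. The mass at z is the sum of
    mu[x] * nu[y] over all pairs with x + y == z, i.e. the image of the
    product measure under the addition map (x, y) -> x + y.
    """
    out = defaultdict(Fraction)
    for x, mass_x in mu.items():
        for y, mass_y in nu.items():
            out[x + y] += mass_x * mass_y
    return dict(out)


def total_mass(mu):
    """Total mass ||mu|| = mu(R) of a finitely supported measure."""
    return sum(mu.values(), Fraction(0))


def dirac(a):
    """Unit mass at the point a."""
    return {a: Fraction(1)}


# Two arbitrary finite measures (assumed sample data).
mu = {0: Fraction(1, 2), 1: Fraction(3, 2)}
nu = {-1: Fraction(1, 3), 2: Fraction(2, 3), 5: Fraction(1)}
```

With exact rational masses, `total_mass(convolve(mu, nu))` equals `total_mass(mu) * total_mass(nu)`, `convolve(dirac(a), mu)` is `mu` shifted by `a`, and `convolve(dirac(0), mu)` returns `mu` unchanged.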
2. Let $f \ge 0$ be a $\lambda^d$-integrable numerical function on $\mathbb{R}^d$ and $\mu := f\lambda^d$. Since $\|\mu\| = \int f \, d\lambda^d < +\infty$, $\mu$ also lies in $\mathscr{M}_+^b(\mathbb{R}^d)$. Let us compute $\mu * \nu$ for an arbitrary $\nu \in \mathscr{M}_+^b(\mathbb{R}^d)$. From 17.3, using the translation-invariance of $\lambda^d$ and the general transformation theorem 19.1, we get

$\mu * \nu(B) = \iint 1_B(x + y) f(x)\, \lambda^d(dx)\, \nu(dy) = \iint 1_B(x + y) f(x)\, T_{-y}(\lambda^d)(dx)\, \nu(dy) = \iint 1_B(x) f(x - y)\, \lambda^d(dx)\, \nu(dy)$

for every $B \in \mathscr{B}^d$. With the help of Tonelli's theorem it further follows that

$\mu * \nu(B) = \int 1_B(x) q(x)\, \lambda^d(dx) = \int_B q \, d\lambda^d$,

where $q$ is the non-negative $\mathscr{B}^d$-measurable function $x \mapsto \int f(x - y)\, \nu(dy)$. This function is also $\lambda^d$-integrable, since $\int q \, d\lambda^d = \|\mu * \nu\| < +\infty$. Thus whenever $\mu$ has a density with respect to $\lambda^d$, so does $\mu * \nu$. We set $f * \nu := q$, that is, we make the definition

(24.11)  $f * \nu(x) := \int f(x - y)\, \nu(dy)$  for $x \in \mathbb{R}^d$.

The preceding result now assumes the more suggestive form

(24.12)  $(f\lambda^d) * \nu = (f * \nu)\lambda^d$.

Naturally $f * \nu$ is called the convolution of $f$ and $\nu$.
3. Besides $\mu = f\lambda^d$, let now $\nu = g\lambda^d$ also have a $\lambda^d$-integrable density $g \ge 0$. According to 17.3 and the preceding,

$f * (g\lambda^d)(x) = \int f(x - y) g(y)\, \lambda^d(dy)$  $(x \in \mathbb{R}^d)$

is a density for $\mu * \nu$ with respect to $\lambda^d$. We denote this function by $f * g$, that is, we set

(24.13)  $f * g(x) := \int f(x - y) g(y)\, \lambda^d(dy)$  $(x \in \mathbb{R}^d)$

and get

(24.14)  $(f\lambda^d) * (g\lambda^d) = (f * g)\lambda^d$.
Here too $f * g$ is called the convolution of $f$ and $g$. It is defined for every pair of non-negative $\lambda^d$-integrable functions and is itself such a function. Nevertheless, it might not be real-valued, even if $f$ and $g$ each are (cf. Remark 1 below). From (24.13) and the translation- and reflection-invariance of $\lambda^d$ it follows that for every $x \in \mathbb{R}^d$

$f * g(x) = \int f(x - y) g(y)\, \lambda^d(dy) = \int f(x + y) g(-y)\, \lambda^d(dy) = \int f(y) g(x - y)\, \lambda^d(dy) = g * f(x)$.

That is, the $*$ operation between functions is also commutative:

(24.15)  $f * g = g * f$.

Similar calculations confirm its associativity; that is,

(24.16)  $(f * g) * h = f * (g * h)$

for all $\lambda^d$-integrable, non-negative functions $f, g, h$. The distributive law

(24.17)  $f * (g + h) = f * g + f * h$

and the homogeneity property

(24.18)  $f * (\alpha g) = (\alpha f) * g = \alpha (f * g)$  $(\alpha \in \mathbb{R}_+)$

for such functions hold as well and follow immediately from (24.13).
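Formula (24.13) and the commutativity (24.15) can be illustrated numerically. The sketch below is not part of the text: it assumes $d = 1$, picks two sample densities supported in $[0, 1[$ (the names `indicator01` and `ramp01` are hypothetical), and replaces the Lebesgue integral by a midpoint Riemann sum, so all values are approximations.

```python
def indicator01(t):
    """Density of the uniform distribution on [0, 1[."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def ramp01(t):
    """Density 2t on [0, 1[, zero elsewhere."""
    return 2.0 * t if 0.0 <= t < 1.0 else 0.0


def midpoints(lo, hi, n):
    """Midpoints and common width of n equal cells partitioning [lo, hi]."""
    h = (hi - lo) / n
    return [lo + (k + 0.5) * h for k in range(n)], h


def conv(f, g, x, lo=0.0, hi=1.0, n=400):
    """Midpoint-rule approximation of (f*g)(x) = integral of f(x-y) g(y) dy.

    Assumes g vanishes outside [lo, hi].
    """
    ys, h = midpoints(lo, hi, n)
    return h * sum(f(x - y) * g(y) for y in ys)


def mass(f, g, lo=0.0, hi=2.0, n=200):
    """Approximate total mass of f*g, assuming f*g vanishes outside [lo, hi]."""
    xs, h = midpoints(lo, hi, n)
    return h * sum(conv(f, g, x, n=200) for x in xs)
```

For these two probability densities the approximate total mass of `f*g` comes out close to 1, in line with (24.3) and (24.14), and `conv(f, g, x)` agrees with `conv(g, f, x)` up to discretization error.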
4. For arbitrary functions $f, g \in \mathscr{L}^1(\lambda^d)$, decomposition into their positive and negative parts and appeal to the results secured in 3. show that

$x \mapsto \int f(x - y) g(y)\, \lambda^d(dy)$,

while possibly defined only $\lambda^d$-almost everywhere (see Remark 1 below), is always $\lambda^d$-integrable. One can therefore define $f * g$ by

$f * g(x) := \int f(x - y) g(y)\, \lambda^d(dy)$,

but generally only for $\lambda^d$-almost all $x \in \mathbb{R}^d$. Once again the expression convolution is used for this $f * g$.
Remarks. 1. For real-valued, non-negative functions $f, g \in \mathscr{L}^1(\lambda^d)$ the function $f * g$ need not be finite everywhere. It suffices to consider any real-valued, non-negative, even function $f$ which lies in $\mathscr{L}^1(\lambda^d)$ but not in $\mathscr{L}^2(\lambda^d)$ and to take $g = f$. Then $f * g(0) = +\infty$. In case $d = 1$, such a function is

$f(x) := \begin{cases} 0 & \text{for } |x| \ge 1 \text{ or } x = 0, \\ |x|^{-1/2} & \text{for } 0 < |x| < 1. \end{cases}$
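The claim of Remark 1 can be verified by a direct computation. The following lines are a sketch for $d = 1$, $g = f$ and $f(x) = |x|^{-1/2}$ on $0 < |x| < 1$ (zero elsewhere); the only property used besides the definition (24.13) is that $f$ is even.

```latex
\begin{align*}
\int f \, d\lambda^1 &= 2 \int_0^1 x^{-1/2}\, dx = 4 < +\infty, \\
f * f(0) &= \int f(0 - y)\, f(y)\, \lambda^1(dy)
          = \int f(y)^2\, \lambda^1(dy)
          = 2 \int_0^1 y^{-1}\, dy = +\infty .
\end{align*}
```

So $f$ is integrable but not square-integrable, and this is exactly what makes the convolution infinite at the origin.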
2. In passing to $L^1(\lambda^d)$ (cf. Remark 1 in §15) the difficulties highlighted above with the definition of $f * g$ disappear. Indeed, let $f \mapsto \hat{f}$ be the canonical mapping of $\mathscr{L}^1(\lambda^d)$ onto $L^1(\lambda^d)$. One defines $\hat{f} * \hat{g}$ for arbitrary $\hat{f}, \hat{g} \in L^1(\lambda^d)$ as the image $\hat{h}$ of a function $h \in \mathscr{L}^1(\lambda^d)$ which coincides $\lambda^d$-almost everywhere with $f * g$. This definition is independent of the special choice of representing functions $f$, $g$ and $h$ from $\mathscr{L}^1(\lambda^d)$. The new operation $*$ renders the vector space $L^1(\lambda^d)$ an algebra over $\mathbb{R}$.
Exercises.

1. Show that for any $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any linear mapping $T : \mathbb{R}^d \to \mathbb{R}^d$, $T(\mu * \nu) = T(\mu) * T(\nu)$. To this end, first observe that $T \circ A_2 = A_2 \circ (T \otimes T)$, where $T \otimes T$ denotes the mapping $(x, y) \mapsto (T(x), T(y))$ of $\mathbb{R}^d \times \mathbb{R}^d$ into itself.

2. Compute the nth convolution power of the function f defined on R by f (x)
ethat is, the convolution f * ... * f with n(E N) factors. Is it true that for every n E N, f has an "nth convolution root"? That is, is f the nth convolution power of some A'-integrable function g > 0? 3. If we set N1(f) f I f I dAd (this is (14.1) for it := Ad), then
N, (f *g) n, and this is true of each n E N. Now the set
$K := \{x\} \cup \bigcup_{n \in \mathbb{N}} K_n$

is compact. For if $\mathscr{U}$ is an open cover of $K$, then some $U \in \mathscr{U}$ contains $x$ and since $(V_n)$ is a neighborhood basis at $x$, $V_{n_0} \subset U$ for some $n_0 \in \mathbb{N}$. It follows that $K_n \subset V_n \subset U$ for all $n \ge n_0$. Since $K_1 \cup \dots \cup K_{n_0}$ is a compact subset of $K$, it is covered by finitely many sets in $\mathscr{U}$. These together with $U$ then furnish the desired finite covering of $K$. On the one hand then $\mu(K) < +\infty$, since $\mu$ is a Borel measure, and on the other hand, since $K_n \subset K$,

$\mu(K) \ge \mu(K_n) > n$  for all $n \in \mathbb{N}$.

This is the contradiction sought. □

IV. Measures on Topological Spaces
Exercises.

1. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mathscr{E}$ a generator of $\mathscr{A}$ and $\Omega'$ a subset of $\Omega$. Consider the traces $\mathscr{A}'$ and $\mathscr{E}'$ of $\mathscr{A}$ and $\mathscr{E}$, resp., on $\Omega'$ and show that $\mathscr{E}'$ is a generator of the $\sigma$-algebra $\mathscr{A}'$ in $\Omega'$. Example 3 above is a special case.

2. Equip the set $\mathbb{R}$ with the so-called right-sided topology (which is also sometimes named after SORGENFREY [1947]) whose system $\mathscr{O}_r$ of open sets is defined as follows: A subset $U \subset \mathbb{R}$ lies in $\mathscr{O}_r$ if and only if for each $x \in U$ there is an $\varepsilon > 0$ such that $[x, x + \varepsilon[ \subset U$. The topological space thus created will be denoted $\mathbb{R}_r$. Establish, one after another, the following claims:

(a) Every right half-open interval $[a, b[$ is both open and closed in $\mathbb{R}_r$. The right-sided topology on $\mathbb{R}$ is strictly finer than the usual topology. In particular, $\mathbb{R}_r$ is a Hausdorff space.

(b) $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$.

(c) Suppose $(x_n)$ is a strictly isotone sequence of real numbers possessing the supremum $b \in \mathbb{R}$. Then the set $\{x_n : n \in \mathbb{N}\} \cup \{b\}$ is closed but not compact in $\mathbb{R}_r$. By contrast, if $(y_n)$ is a strictly antitone sequence of real numbers possessing the infimum $a \in \mathbb{R}$, then $\{a\} \cup \{y_n : n \in \mathbb{N}\}$ is compact in $\mathbb{R}_r$.

(d) Let $K$ be compact in $\mathbb{R}_r$. Then there exists (from the first part of (c)) for every $x \in K$ a $y \in \mathbb{Q}$ with $y < x$ and $[y, x[ \cap K = \emptyset$. If for each $x \in K$, $\varphi(x)$ designates such a rational number $y$, then a mapping $\varphi : K \to \mathbb{Q}$ materializes which is strictly isotone, and hence injective.

(e) Every compact subset of $\mathbb{R}_r$ is countable. (But (c) shows that the converse is not true.)

(f) Consider on $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$ the measure $\mu$ which assigns to every countable set the value 0 and to every uncountable set the value $+\infty$ (cf. Example 6). Then $\mu$ is a Borel measure on $\mathbb{R}_r$ for which no point of $\mathbb{R}_r$ has a neighborhood of finite measure. In particular, the measure $\mu$ is not locally finite and is neither inner regular nor outer regular.

(g) Consider the measure $\nu := f\lambda^1$ with density

$f(x) := x^{-1}\, 1_{]0,+\infty[}(x)$  $(x \in \mathbb{R})$

and show that it too is a non-locally-finite Borel measure on $\mathbb{R}_r$.

(h) Investigate the L-B measure $\lambda^1$, thought of as a Borel measure on $\mathbb{R}_r$, in respect to its inner and outer regularity.
§26. Radon measures on Polish spaces

For two extensive classes of Hausdorff spaces Borel measures come up very naturally. The first of these classes will be discussed in this section, beginning of course with its

26.1 Definition. A topological space $E$ is called Polish when its topology has a countable base and can be defined by a complete metric.

The terminology is due to N. BOURBAKI and commemorates the achievements of Polish topologists in the development of general topology. A metric is called complete when the associated metric space is complete: every Cauchy sequence in it converges. A countable base or basis for the topology is a countable system of open sets such that every open set is the union of those from the system which are subsets of it. For a metrizable space $E$ the existence of such a basis is equivalent to the existence of a countable dense subset.

Examples. 1. The euclidean spaces $\mathbb{R}^d$ of every dimension $d \ge 1$ are Polish, the ordinary euclidean metric being complete.

2. The product $E' \times E''$ of two Polish spaces is another, when given the product topology. For if $d'$, $d''$ are complete metrics generating the topologies of $E'$ and $E''$, resp., then the product topology of $E' \times E''$ is generated by the metric

$d(x, y) := d'(x', y') + d''(x'', y'')$,  $x := (x', x'')$, $y := (y', y'')$,

which moreover is complete. If $\mathscr{G}'$, $\mathscr{G}''$ are countable bases for $E'$, $E''$, resp., then $\{G' \times G'' : G' \in \mathscr{G}', G'' \in \mathscr{G}''\}$ is a countable basis for $E' \times E''$.

3. Every closed subspace $F$ of a Polish space $E$ is Polish. Just restrict to $F$ any complete metric that generates the topology of $E$.

4. Every open subspace $G$ of a Polish space $E$ is Polish.
Proof. We may suppose $G \ne E$. By 1. and 2. $\mathbb{R} \times E$ is Polish. Let $d$ be a complete metric giving the topology of $E$, and consider the set $F$ of all $(\lambda, x) \in \mathbb{R} \times E$ satisfying

$\lambda\, d(x, E \setminus G) = 1$.

Here, as usual, for $\emptyset \ne A \subset E$, $d(x, A) := \inf\{d(x, a) : a \in A\}$ is the distance from the point $x \in E$ to $A$. The mapping $x \mapsto d(x, A)$ is continuous on $E$; in fact, as the reader can easily check, $|d(x, A) - d(y, A)| \le d(x, y)$ for all $x, y \in E$. Consequently, $(\lambda, x) \mapsto \lambda\, d(x, E \setminus G)$ is a continuous real function on $\mathbb{R} \times E$, and $F$ is a closed subset of $\mathbb{R} \times E$, hence itself a Polish space, by 3. Finally, $(\lambda, x) \mapsto x$ maps $F$ homeomorphically onto $G$. To see surjectivity, we only have to notice that, because $E \setminus G$ is closed, $G$ coincides with the set $\{x \in E : d(x, E \setminus G) > 0\}$.

5. More generally it is true (cf. COHN [1980], Theorem 8.1.4 or WILLARD [1970], Theorem 24.12) that a subspace $A$ of a Polish space $E$ is Polish if $A$ is a $G_\delta$-set in $E$, that is, $A$ is the intersection of a sequence of open subsets of $E$. Thus, for example, the set $J$ of all irrational numbers with its topology as a subspace of $\mathbb{R}$ is Polish, since

$J = \bigcap_{x \in \mathbb{Q}} (\mathbb{R} \setminus \{x\})$.

6. Every compact space $E$ with a countable basis is Polish. For a famous theorem of P.S. URYSOHN (1898-1924) (cf. KELLEY [1955], p. 125 or WILLARD [1970], Theorem 23.1) guarantees that $E$ is metrizable, and in Remark 3 of §31 we shall even give a proof of this. The compactness of $E$ easily entails that every metric defining its topology is complete.
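The construction used for Example 4 becomes concrete in the simplest case. The following worked instance is only an illustration under the assumptions $E = \mathbb{R}$ with the usual metric and $G = \,]0, +\infty[$; the symbol $\rho$ for the resulting metric is not from the text.

```latex
\begin{align*}
d(x, E \setminus G) &= \max\{x, 0\}, \\
F &= \{(\lambda, x) \in \mathbb{R}^2 : \lambda \max\{x, 0\} = 1\}
   = \{(1/x,\, x) : x > 0\}, \\
\rho(x, y) &:= |x - y| + \left| \frac{1}{x} - \frac{1}{y} \right|
   \qquad (x, y \in G).
\end{align*}
```

Here $F$ is one branch of a hyperbola, a closed subset of $\mathbb{R}^2$, and $\rho$ is the sum metric of Example 2 transported to $G$ via $x \mapsto (1/x, x)$. It generates the usual topology of $G$ and is complete. The sequence $(1/n)$, which is Cauchy for the usual metric but has no limit in $G$, is not Cauchy for $\rho$, since $\rho(1/n, 1/m) \ge |n - m|$.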
The key to the further discussion is the following lemma, which is here just a preliminary to the big theorem that follows it, but nevertheless is significant in its own right. In it we encounter our first extensive class of Radon measures.

26.2 Lemma. Every finite Borel measure $\mu$ on a Polish space $E$ is regular.

Proof. We consider the system $\mathscr{D}$ of all $B \in \mathscr{B}(E)$ which satisfy both

(26.1)  $\mu(B) = \sup\{\mu(K) : K \text{ compact} \subset B\}$

and

(26.2)  $\mu(B) = \inf\{\mu(U) : B \subset U \text{ open}\}$.

The goal of course is to show that $\mathscr{D} = \mathscr{B}(E)$. We block off the work into five sections. Let $d$ be a complete metric defining the topology of $E$.

1. $E \in \mathscr{D}$: Only (26.1) needs proof when $B = E$. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence which is dense in $E$, and for $x \in E$, real $r > 0$ let $K_r(x)$ denote the closed ball of center $x$ and $d$-radius $r$. For every $r$ then $E = \bigcup_{n \in \mathbb{N}} K_r(x_n)$, because in every ball $K_r(x)$ lies some $x_n$, so that $x \in K_r(x_n)$. Since $\mu$ is continuous from below,

$\mu(E) = \lim_{k \to \infty} \mu\Bigl( \bigcup_{j=1}^{k} K_r(x_j) \Bigr)$.

Therefore, for each $\varepsilon > 0$ and $n \in \mathbb{N}$ there exists $k_n \in \mathbb{N}$ such that

$\mu\Bigl( \bigcup_{j=1}^{k_n} K_{1/n}(x_j) \Bigr) > \mu(E) - \varepsilon 2^{-n}$.

Each set $B_n := \bigcup_{j=1}^{k_n} K_{1/n}(x_j)$, hence also their intersection $K := \bigcap_{n \in \mathbb{N}} B_n$, is closed, and we have
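$\mu(E) - \mu(K) = \mu(E \setminus K) = \mu\Bigl( \bigcup_{n \in \mathbb{N}} (E \setminus B_n) \Bigr) \le \sum_{n \in \mathbb{N}} \mu(E \setminus B_n) < \sum_{n \in \mathbb{N}} \varepsilon 2^{-n} = \varepsilon$.

Being closed in the complete space $E$ and, for every $n$, covered by finitely many balls of radius $1/n$, the set $K$ is compact. Thus (26.1) holds for $B = E$.

2. Every closed set $C \subset E$ lies in $\mathscr{D}$: Let $\varepsilon > 0$ be given. We already know that there is a compact set $K$ with $\mu(E) - \mu(K) < \varepsilon$. According to 3.5 however

$\mu(C) - \mu(C \cap K) = \mu(C \cup K) - \mu(K) \le \mu(E) - \mu(K) < \varepsilon$

and this proves (26.1) for $B := C$, because $C \cap K$ is compact. As a closed subset of a metric space, $C$ is a $G_\delta$-set, that is, there are open sets $G_n \downarrow C$. To see this we may assume $C \ne \emptyset$, so that $G := E \setminus C$ is an open proper subset of $E$. Consequently, $x \mapsto d(x, C)$ is a continuous mapping whose zero-set is $C$, as was shown in treating Example 4. The sets $G_n := \{x \in E : d(x, C) < 1/n\}$ are therefore open and decrease to $C$. From the finiteness of $\mu$ and 3.2(c) we then have that $\mu(G_n) \downarrow \mu(C)$, showing that (26.2) is also satisfied by $B := C$.

3. Whenever $B$ lies in $\mathscr{D}$ so does $\complement B$: First note that for every compact $K \subset B$

$\mu(\complement K) - \mu(\complement B) = \mu(B) - \mu(K)$,

and so $\complement B$ satisfies (26.2) whenever $B$ satisfies (26.1). Moreover, if $G$ is an open superset of $B$, then $\complement G$ is a closed subset of $\complement B$ with

$\mu(\complement B) - \mu(\complement G) = \mu(G) - \mu(B)$,

showing, at least, that $\complement B$ satisfies (26.1) weakened by replacing "compact" there by "closed". But then application of step 2 to these closed sets gives us the full (26.1) for $\complement B$.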
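4. Whenever pairwise disjoint sets $D_n$ lie in $\mathscr{D}$ $(n \in \mathbb{N})$, their union $D$ also lies in $\mathscr{D}$: First of all

$\mu(D) = \sum_{n=1}^{\infty} \mu(D_n)$.

Letting $\varepsilon > 0$ be given, we therefore have an $n_\varepsilon \in \mathbb{N}$ such that

(26.3)  $\mu(D) - \sum_{n=1}^{n_\varepsilon} \mu(D_n) < \varepsilon/2$.

Every $D_j$ contains a compact $K_j$ such that

$\mu(D_j) - \mu(K_j)$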
[...] $\varepsilon > 0$ there is a compact subset $K_\varepsilon \subset E$ such that $\mu(\complement K_\varepsilon) < \varepsilon$ and the restriction of $f$ to $K_\varepsilon$ is continuous.

Proof. Let us first suppose that $\mu$ is finite. Let $\mathscr{G}'$ be a countable base for the topology of $E'$ and $(G'_n)_{n \in \mathbb{N}}$ a sequential arrangement of its elements. Notice that $\mathscr{G}'$ is a generator of the Borel $\sigma$-algebra because every open subset of $E'$ is a (countable) union of sets from $\mathscr{G}'$.
(a)⇒(c): By hypothesis there is a Borel measurable mapping $g : E \to E'$ and a $\mu$-nullset $N \in \mathscr{B}(E)$ with

(26.7)  $f(x) = g(x)$  for all $x \in \complement N$.

For every set $G'_n$, $g^{-1}(G'_n) \in \mathscr{B}(E)$. Because every Radon measure on $E$ is regular, given $\varepsilon > 0$, there exist compact sets $K_n$ and open sets $U_n$ such that

(26.8)  $K_n \subset g^{-1}(G'_n) \subset U_n$ and $\mu(U_n \setminus K_n) < 2^{-n}\varepsilon$  for each $n \in \mathbb{N}$.

The set $A := \bigcup_{n \in \mathbb{N}} (U_n \setminus K_n)$ is open, being a union of open sets. For its measure we have the obvious inequality

$\mu(A) \le \sum_{n=1}^{\infty} \mu(U_n \setminus K_n) < \varepsilon$.
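Using once more the (inner) regularity of $\mu$, we find a compact $K \subset \complement(A \cup N) = \complement A \cap \complement N$ such that

$\mu(\complement A \cap \complement N \cap \complement K) < \varepsilon - \mu(A)$,

thus (since $A \cup N \subset \complement K$ and $A \cup N \cup (\complement A \cap \complement N) = E$) such that

$\mu(\complement K) = \mu(A \cup N \cup [\complement A \cap \complement N \cap \complement K]) < \mu(A) + \mu(N) + \varepsilon - \mu(A) = \varepsilon$.

This set $K$ does what is wanted in (c), because by (26.7) $f$ and $g$ coincide in $K$ and because the restriction $g_0$ of $g$ to $\complement A$ is continuous, as we now confirm. For each set $G'_n$,

$g_0^{-1}(G'_n) = g^{-1}(G'_n) \cap \complement A$;

from (26.8) and the fact $U_n \setminus K_n \subset A$ follows therefore

$U_n \cap \complement A = K_n \cap \complement A \subset g_0^{-1}(G'_n) \subset U_n \cap \complement A$,

which means that

$g_0^{-1}(G'_n) = U_n \cap \complement A = K_n \cap \complement A$,

showing that the $g_0$-pre-image of $G'_n$ is open (as well as closed) in $\complement A$. Since $(G'_n)_{n \in \mathbb{N}}$ is a base for the topology of $E'$, this is enough to guarantee the continuity of $g_0 = g \,|\, \complement A$.

(c)⇒(b): It suffices to find pairwise disjoint compact subsets $K_n$ of $E$ such that $f \,|\, K_n$ is continuous and

$\mu\Bigl( \complement \bigcup_{j=1}^{n} K_j \Bigr) < \frac{1}{n}$  for each $n \in \mathbb{N}$.

For then

$N := \complement \bigcup_{n \in \mathbb{N}} K_n = \bigcap_{n \in \mathbb{N}} \complement K_n$

is a Borel set disjoint from each $K_n$ and satisfying $\mu(N) < 1/n$ for every $n \in \mathbb{N}$, i.e., $\mu(N) = 0$. The sequence $(K_n)$ is gotten inductively from (c) as follows: To start off, there is a compact $K_1 \subset E$ such that $\mu(\complement K_1) < 1$ and $f \,|\, K_1$ is continuous.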
If $K_1, \dots, K_n$ have been defined having the desired properties, we will get $K_{n+1}$ from (c) and the inner regularity of $\mu$. By (c) there is a compact $K' \subset E$ such that $\mu(\complement K') < (2n + 2)^{-1}$ and $f \,|\, K'$ is continuous. With $L := K_1 \cup \dots \cup K_n$ the inner regularity of $\mu$ supplies a compact $K_{n+1} \subset K' \setminus L$ such that

$\mu(K' \setminus L) - \mu(K_{n+1}) = \mu(K' \cap \complement L \cap \complement K_{n+1}) < (2n + 2)^{-1}$.

Because

$\mu(\complement(L \cup K_{n+1})) = \mu(\complement K' \cap \complement L \cap \complement K_{n+1}) + \mu(K' \cap \complement L \cap \complement K_{n+1}) \le \mu(\complement K') + \mu(K' \cap \complement L \cap \complement K_{n+1}) < (n + 1)^{-1}$,

with this set $K_{n+1}$ the inductive construction is complete.

(b)⇒(a): If $E = N \cup K_1 \cup K_2 \cup \dots$ is the given decomposition, one defines a mapping $g : E \to E'$ as follows. In case $N = \emptyset$, let $g := f$. In case $N \ne \emptyset$, choose $y_0 \in f(N)$ arbitrarily and set

$g(x) := f(x)$ for $x \in E \setminus N$,  $g(x) := y_0$ for $x \in N$.

What has to be shown is that $g$ is Borel measurable, which is done as follows: For every open $G' \subset E'$

$g^{-1}(G') = (g^{-1}(G') \cap N) \cup \bigcup_{n \in \mathbb{N}} (g^{-1}(G') \cap K_n) = N_0 \cup \bigcup_{n \in \mathbb{N}} g_n^{-1}(G')$,

where $N_0 := g^{-1}(G') \cap N$ and $g_n := g \,|\, K_n$. Now $N_0$ is either $N$ or $\emptyset$, according as $y_0 \in G'$ or $y_0 \notin G'$. Moreover, $g_n$ coincides with the restriction of $f$ to $K_n$, so that by hypothesis $g_n^{-1}(G')$ is open in $K_n$, that is, of the form $K_n \cap U_n$ for some open subset $U_n$ of $E$. Therefore only Borel sets occur in the above decomposition of $g^{-1}(G')$ and we conclude that $g^{-1}(G')$ is a Borel set. This being true of every open $G' \subset E'$, the Borel measurability of $g$ follows from 7.2.

Now consider an arbitrary locally finite measure $\mu$ on $\mathscr{B}(E)$. According to 26.3, $\mu$ is $\sigma$-finite. Lemma 17.6 therefore furnishes a strictly positive $\mu$-integrable real function $h$ on $E$. The measure $\nu := h\mu$ is then a finite Borel measure on $E$ which has exactly the same nullsets as $\mu$. The proven equivalence of (a) and (b) for the measure $\nu$ therefore entails the validity of this equivalence for the measure $\mu$. Thus the whole theorem is proved.
Remarks. 1. The equivalence of (a) and (b) in Lusin's theorem may be lost if (a) is strengthened to the $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurability of $f$. It suffices to take for $E$ the compact set $[0,1] \times [0,1]$ and for $\mu$ the L-B measure $\lambda^2_E$. As was noted in the second part of Remark 4, §8, $E$ contains a $\mu$-nullset $N$ which contains a non-Borel subset. If $M$ is such a set, its indicator function $f = 1_M$ is not Borel measurable, although $f$ is $\mu$-almost everywhere equal to the Borel measurable function $1_N$. On the other hand, if $f$ is $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurable, there is a Polish topology $\tau$ on $E$, stronger than the original but generating exactly the same Borel sets, such that $f$ is $\tau$-continuous. See 3.2.6 of SRIVASTAVA [1998] for the proof, which is not difficult.
2. The Dirichlet jump function (cf. Remark 1 of §16) is continuous at no point of its domain of definition $[0, 1]$, yet it is Borel measurable. This shows that in assertion (c) of Lusin's theorem one cannot hope to be able to replace the continuity of the function $f \,|\, K$ by the continuity of $f$ at each point of $K$.
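For the Dirichlet jump function the compact sets promised by assertion (c) can be written down explicitly. The following is a sketch under two assumptions not fixed here: the function is taken to be $D = 1_{\mathbb{Q} \cap [0,1]}$ on $E = [0, 1]$ with the L-B measure, and $(q_n)_{n \in \mathbb{N}}$ is an arbitrary enumeration of $\mathbb{Q} \cap [0, 1]$. The opposite convention (indicator of the irrationals) works the same way.

```latex
\begin{align*}
K_\varepsilon &:= [0,1] \setminus \bigcup_{n \in \mathbb{N}}
   \left] q_n - \varepsilon 2^{-n-2},\; q_n + \varepsilon 2^{-n-2} \right[ , \\
\lambda^1\bigl([0,1] \setminus K_\varepsilon\bigr)
   &\le \sum_{n=1}^{\infty} 2 \cdot \varepsilon 2^{-n-2}
   = \sum_{n=1}^{\infty} \varepsilon 2^{-n-1}
   = \frac{\varepsilon}{2} < \varepsilon , \\
D \,|\, K_\varepsilon &\equiv 0 .
\end{align*}
```

The set $K_\varepsilon$ is closed in $[0, 1]$, hence compact, and contains no rational point, so the restriction of $D$ to it is constant and therefore continuous, even though $D$ itself is continuous at no point of $K_\varepsilon$.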
Exercises.

1. Show that every inner regular finite Borel measure on a Hausdorff space is outer regular.

2. Show that in a Polish space $E$ the Dirac measures are the only non-zero Borel measures $\mu$ which take only the values 0 and 1. [Hint: Show that the system of all compact $K \subset E$ such that $\mu(K) = 1$ is $\cap$-stable and investigate the intersection of all its sets.]

3. Show that $\mathscr{B}(E \times E') = \mathscr{B}(E) \otimes \mathscr{B}(E')$ for any Polish spaces $E$, $E'$.

4. Consider $K$ compact $\subset U$ open $\subset \mathbb{R}^d$, and for each $n \in \mathbb{N}$ let $V_n$ denote the open ball of radius $1/n$ and center 0. Show that $K + V_n \subset U$ for some $n$. [Hint: If $(K + V_n) \cap \complement U \ne \emptyset$ for every $n \in \mathbb{N}$, find $x_n \in K$, $v_n \in V_n$, $z_n \in \complement U$ such that $x_n + v_n = z_n$ for every $n \in \mathbb{N}$. Some subsequence of $(x_n)$ converges to a point $x_0 \in K$ and because $\complement U$ is closed we even have $x_0 \in K \cap \complement U$, which contradicts the fact that $K \subset U$.]

5. Let $\mu$ be a locally finite Borel measure on a Polish space $E$ and $f : E \to E'$ a mapping into a topological space $E'$ with a countable base. Show that assertions (a) and (b) in Lusin's theorem are equivalent to (c'): For every $\varepsilon > 0$ and every compact $K \subset E$ there is a further compact $K_\varepsilon \subset K$ such that $\mu(K \setminus K_\varepsilon) < \varepsilon$ and $f \,|\, K_\varepsilon$ is continuous.
§27. Properties of locally compact spaces

A topological space is called locally compact if it is Hausdorff and if each of its points has at least one compact neighborhood. Examples of such spaces are the euclidean space $\mathbb{R}^d$, every manifold (i.e., every locally euclidean Hausdorff space), every discrete space, and every compact space. When an arbitrary point is removed from a compact space the remainder is a locally compact space. Actually every locally compact space is of this form. For if $\mathscr{O}$ is the system of all open subsets of the locally compact space $E$ and $\omega_0$ is any (so-called ideal) point not in $E$, then a topology can be defined on $E' := E \cup \{\omega_0\}$ as follows: The system $\mathscr{O}'$ of open sets in $E'$ shall consist of $\mathscr{O}$ together with the sets $E' \setminus K$ for all the compact subsets $K$ of $E$. This defines a compact topology on $E'$, $E$ is an open subset of $E'$ and the topology that $E$ inherits from $\mathscr{O}'$ is its original topology. $E$ was compact to start with if and only if $\omega_0$ is an isolated point in $E'$. If $E$ is not compact, then it is dense in $E'$. These claims are easily confirmed, or the reader can consult KELLEY [1955], p. 150, or WILLARD [1970], 19.2. The space $E'$ is called, after its creator P.S. ALEXANDROFF (1896-1982), the (Alexandroff) one-point compactification of $E$ and $\omega_0$ its infinitely remote point.

We will pursue the further theory of locally compact spaces via this compactification. First we study some distinguished continuous functions in this environment. For an arbitrary topological space $E$ we denote by

$C(E)$ and $C_b(E)$

the vector space of all, respectively all bounded, continuous real functions on $E$.
27.1 Definition. Let $f : E \to \mathbb{R}$ be a real function on a topological space $E$. The set

(27.1)  $\mathrm{supp}(f) := \overline{\{f \ne 0\}}$

is called the support of $f$.

The complement of $\mathrm{supp}(f)$ is thus the largest open set at every point of which $f$ takes the value zero. If $E$ is locally compact, we will designate by

$C_c(E)$

the set of all $f \in C(E)$ with compact support $\mathrm{supp}(f)$. A function $f \in C(E)$ lies in $C_c(E)$ just if there is some compact subset of $E$ in the complement of which $f$ is identically zero. Clearly

(27.2)  $C_c(E) \subset C_b(E) \subset C(E)$,

since an $f \in C_c(E)$ is bounded on its compact support, hence throughout $E$. $C_c(E)$ is a vector subspace of $C_b(E)$. More generally for any $n \in \mathbb{N}$, $\varphi \in C(\mathbb{R}^n)$ with $\varphi(0) = 0$ and $f_1, \dots, f_n \in C_c(E)$, the composition $\varphi(f_1, \dots, f_n)$ lies in $C_c(E)$, and indeed its support is a subset of $\bigcup_{j=1}^{n} \mathrm{supp}(f_j)$. In particular, whenever $u, v \in C_c(E)$ the functions $|u|$, $u \vee v$, $u \wedge v$, and therewith $u^+$ and $u^-$, all lie in $C_c(E)$. The needed continuity of $\varphi(x, y) := x \vee y$ on $\mathbb{R}^2$ follows from the identity $x \vee y = \frac{1}{2}(x + y + |x - y|)$. In the special case of a compact space $E$, all three function spaces in (27.2) coincide.
A fundamental property of the space $C_c(E)$ is the following:

27.2 Theorem (on partitions of unity). Suppose that the compact subset $K$ of the locally compact space $E$ is covered by the $n$ open sets $U_1, \dots, U_n$, $n \in \mathbb{N}$. Then there are functions $f_1, \dots, f_n \in C_c(E)$ with the following properties:

(27.3)  $f_j \ge 0$  for $j = 1, \dots, n$;

(27.4)  $\mathrm{supp}(f_j) \subset U_j$  for $j = 1, \dots, n$;

(27.5)  $\sum_{j=1}^{n} f_j(x) \le 1$  for all $x \in E$;

(27.6)  $\sum_{j=1}^{n} f_j(x) = 1$  for all $x \in K$.
Proof. We work in the one-point compactification $E' := E \cup \{\omega_0\}$ of $E$. The given open sets together with $U_0 := E' \setminus K$ constitute an open cover of $E'$. Because compact spaces are normal topological spaces (cf. KELLEY [1955], p. 141 or WILLARD [1970], Theorem 17.10), this covering can be "shrunk" to an open covering $U'_0, \dots, U'_n$ of $E'$ satisfying $\overline{U'_j} \subset U_j$ for each $j = 0, \dots, n$, where of course the bar denotes closure in $E'$. The theorem on partitions of unity in normal spaces (KELLEY [1955], p. 171 or WILLARD [1970], 20 C) provides functions $f'_0, \dots, f'_n \in C(E')$ such that

(i)  $f'_j \ge 0$, $\mathrm{supp}(f'_j) \subset U'_j$  for $j = 0, \dots, n$;

(ii)  $\sum_{j=0}^{n} f'_j(x) = 1$  for all $x \in E'$.

The restrictions $f_1, \dots, f_n$ to $E$ of $f'_1, \dots, f'_n$ lie in $C(E)$ and it will be easy to show that they have all the properties wanted. From (i) and (ii) properties (27.3)-(27.5) follow almost immediately. One only has to notice that for each $j = 1, \dots, n$

$\mathrm{supp}(f_j) = \mathrm{supp}(f'_j) \cap E \subset \overline{U'_j} \cap E = \overline{U'_j} \subset U_j$,

since $\overline{U'_j} \subset U_j \subset E$. In particular, $\overline{U'_j}$, being a closed subset of the compact space $E'$, is a compact subset of $E$. From $\mathrm{supp}(f_j) \subset \overline{U'_j}$ therefore follows the compactness of this support. Thus $f_1, \dots, f_n$ all lie in $C_c(E)$. The remaining property (27.6) likewise follows from (ii) because $\mathrm{supp}(f'_0) \subset U_0 = E' \setminus K$ entails that $f'_0(x) = 0$ for all $x \in K$. □

Two consequences of the foregoing will turn out to be especially useful. The first, known as Urysohn's lemma, often serves as the starting point for inductive constructions of partitions of unity (see, e.g., RUDIN [1987], p. 39). The second can also be proven directly, as indicated in Exercise 1 below.
27.3 Corollary 1. In the locally compact space $E$, let $U$ be an open neighborhood of the compact subset $K$. Then $C_c(E)$ contains a function $f$ which satisfies

(27.7)  $0 \le f \le 1$,  $f(K) = \{1\}$,  and  $\mathrm{supp}(f) \subset U$.

In particular, $\mathrm{supp}(f)$ is a compact neighborhood of $K$.

Proof. We have only to apply 27.2 for $n = 1$. Since $K \subset \{f_1 > 0\} \subset \mathrm{supp}(f_1)$, the fact that $\{f_1 > 0\}$ is open means that $\mathrm{supp}(f_1)$ is indeed a neighborhood of $K$. □

27.4 Corollary 2. In the locally compact space $E$ the compact subset $K$ is covered by the $n$ open sets $U_1, \dots, U_n$, $n \in \mathbb{N}$. Then $K$ can be decomposed as $K = K_1 \cup \dots \cup K_n$ with $K_j$ a compact subset of $U_j$ for each $j = 1, \dots, n$.
Proof. Let $f_1, \dots, f_n \in C_c(E)$ be as provided by 27.2. The compact sets

$K_j := K \cap \mathrm{supp}(f_j)$,  $j = 1, \dots, n$

do what is wanted; for if $x \in K$, then $1 = f_1(x) + \dots + f_n(x)$ means that $f_j(x) \ne 0$ for some $j$, and therefore $x \in K_j$. □
For a locally compact space $E$ there is another function space besides $C_c(E)$ that is of importance. To define it we assign to every bounded real function $f$ on an arbitrary space $E$ its supremum norm, also called its uniform norm, via

$\|f\| := \sup_{x \in E} |f(x)|$.

The mapping $(f, g) \mapsto \|f - g\|$ makes $C_b(E)$, and more generally even the vector space of all bounded real functions on $E$, into a metric space. One speaks of the metric of uniform convergence (on $E$). That a sequence $(f_n)$ of bounded real functions on $E$ converges uniformly on $E$ to a bounded function $f$ just means that

$\lim_{n \to \infty} \|f_n - f\| = 0$.

27.5 Definition. A continuous real function $f$ on a locally compact space $E$ is said to vanish at infinity if it lies in the closure $C_0(E)$ of $C_c(E)$ in $C_b(E)$ with respect to the metric of uniform convergence. Denoting closure in this metric by a bar, we thus have $C_0(E) := \overline{C_c(E)} \subset C_b(E)$.

The terminology "vanishing at infinity" is both clarified and justified by

27.6 Theorem. For a real function $f$ on a locally compact space $E$ the following statements are equivalent:

(a) $f \in C_0(E)$;

(b) $f \in C(E)$ and $\{|f| \ge \varepsilon\}$ is compact for each $\varepsilon > 0$;

(c) the function

$f'(x) := \begin{cases} f(x) & \text{for all } x \in E, \\ 0 & \text{for } x = \omega_0, \end{cases}$

is continuous on the one-point compactification $E'$ of $E$.
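Proof. (a)⇒(b): Given $\varepsilon > 0$, there is by definition of $f \in C_0(E)$ a $g \in C_c(E)$ with $\|f - g\| \le \varepsilon/2$. Every $x \in E$ satisfies $|f(x)| - |g(x)| \le |f(x) - g(x)| \le \varepsilon/2$, and therefore $\{|f| \ge \varepsilon\} \subset \{|g| \ge \varepsilon/2\} \subset \mathrm{supp}(g)$. This shows that $\{|f| \ge \varepsilon\}$ is a relatively compact set. But, due to the continuity of $f$, it is also closed. Hence it is compact.

(b)⇒(c): Since the subspace topology of $E$ in $E'$ is its original topology and $E$ is an open subset of $E'$, continuity of $f'$ at each point of $E$ is assured by $f \in C(E)$. As to continuity at the ideal point $\omega_0$, given $\varepsilon > 0$, we have $|f'(x) - f'(\omega_0)| = |f'(x)| < \varepsilon$ for all $x$ in the set $E' \setminus \{|f| \ge \varepsilon\}$, which is an open neighborhood of $\omega_0$ because $\{|f| \ge \varepsilon\}$ is a compact subset of $E$.

(c)⇒(a): Continuity of $f'$ at $\omega_0$ and the definition of the topology in $E'$ mean that for each $\varepsilon > 0$ there is a compact $K \subset E$ such that $|f(x)| = |f'(x) - f'(\omega_0)| < \varepsilon$ for all $x \in E \setminus K$. 27.3 supplies a $g \in C_c(E)$ with $0 \le g \le 1$ and $g(K) = \{1\}$. Then $fg \in C_c(E)$ and satisfies

$|f(x)g(x) - f(x)| = |f(x)|\,(1 - g(x)) < \varepsilon$

for all $x \in E$, so $\|fg - f\| \le \varepsilon$. As $\varepsilon > 0$ is arbitrary, this proves that $f \in C_0(E)$. □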
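Criterion (b) is easy to test on a concrete function. The following lines are a sketch for the assumed example $E = \mathbb{R}$ and $f(x) = 1/(1 + x^2)$, which is not taken from the text.

```latex
\begin{align*}
\{|f| \ge \varepsilon\}
  &= \Bigl\{ x \in \mathbb{R} : x^2 \le \tfrac{1}{\varepsilon} - 1 \Bigr\}
   = \Bigl[ -\sqrt{\tfrac{1}{\varepsilon} - 1},\; \sqrt{\tfrac{1}{\varepsilon} - 1} \Bigr]
   && \text{for } 0 < \varepsilon \le 1, \\
\{|f| \ge \varepsilon\} &= \emptyset
   && \text{for } \varepsilon > 1 .
\end{align*}
```

In both cases the set is compact, so $f \in C_0(\mathbb{R})$, although $f \notin C_c(\mathbb{R})$ because $f > 0$ everywhere. By contrast, the constant function 1 lies in $C_b(\mathbb{R})$ but not in $C_0(\mathbb{R})$, since $\{|1| \ge 1/2\} = \mathbb{R}$ is not compact.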
Exercises.

1. Without resort to partitions of unity, prove Corollary 27.4 directly. [Hint for the case $n = 2$: Separate the disjoint compacta $K \setminus U_1$, $K \setminus U_2$ with disjoint open neighborhoods $V_1$, $V_2$ and set $K_1 := K \setminus V_1$, $K_2 := K \setminus V_2$.]

2. Let $E' = E \cup \{\omega_0\}$ be the one-point compactification of a locally compact space $E$. Describe the Borel sets in $E'$ by means of the Borel sets in $E$. In particular, see how your description fits into the following general picture: For a measurable space $(E, \mathscr{A})$, a point $\omega_0 \notin E$ and the set $E^{\omega_0} := E \cup \{\omega_0\}$, the $\sigma$-algebra $\mathscr{A}^{\omega_0}$ in $E^{\omega_0}$ generated by $\mathscr{A}$ and $\{\omega_0\}$ consists of all $A' \subset E^{\omega_0}$ such that $A' \cap E \in \mathscr{A}$.
§28. Construction of Radon measures on locally compact spaces

In what follows $E$ will be a locally compact space. We consider a Borel measure $\mu$ (defined on $\mathscr{B}(E)$). Here the requirement $\mu(K) < +\infty$ for every compact set $K$ is the same as the local finiteness requirement, because every point of $E$ has a compact neighborhood and the implication (25.7) holds in general. So in the present context the concepts of Borel measure and locally finite measure on $\mathscr{B}(E)$ coincide. The Radon measures on $E$ are thus (cf. 25.3) those Borel measures which are inner regular.

For a Borel measure $\mu$ every $u \in C_c(E)$ turns out to be $\mu$-integrable. For, being continuous, $u$ is Borel measurable. Denoting by $K$ the compact support of $u$, we have $|u| \le \|u\|\, 1_K$. Since $\mu$ is a Borel measure, $1_K$ is $\mu$-integrable, and the $\mu$-integrability of $u$ follows. Therefore corresponding to the Borel measure $\mu$ is a linear form $I_\mu$ on $C_c(E)$ defined by

(28.1)  $I_\mu(u) := \int u \, d\mu$.

This is an isotone linear form in the sense of (12.3): From $u \le v$ follows $I_\mu(u) \le I_\mu(v)$. Because of the linearity of $I_\mu$ this is equivalent to

$u \ge 0 \implies I_\mu(u) \ge 0$,

which is why $I_\mu$ is usually called a positive linear form.

This brings us to a key question for our further work: Is every positive linear form on $C_c(E)$ an $I_\mu$ for some Borel measure $\mu$ on $E$, or are there possibly positive linear forms of a completely different kind? Even for compact intervals $J := [a, b]$ on the number line, answering this question is by no means a trivial task. In this case however, as early as 1909 F. Riesz showed (cf. RIESZ [1911]) that besides the linear forms $I_\mu$ arising from Borel measures $\mu$ on $J$, there are no other positive linear forms on $C_c(J) = C(J)$. One of our goals is to show that every locally compact space $E$ shares this property with $J$. The result in question will, in view of this pioneering work, be called the Riesz representation theorem. En route to it we will naturally be led to the construction of Radon measures on $E$.

Besides the locally compact space $E$, let now a positive linear form
$I : C_c(E) \to \mathbb{R}$

be given. What follows will prepare the way for the proof of the Riesz representation theorem. For every compact $K \subset E$ we set

(28.2)  $\mu_*(K) := \inf\{I(u) : 1_K \le u \in C_c(E)\}$.

Such functions $u$ exist thanks to Corollary 27.3. Consequently,

(28.3)  $0 \le \mu_*(K) < +\infty$.
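Definition (28.2) can be evaluated explicitly in a familiar case. The computation below is a sketch under assumptions not made in the text: $E = \mathbb{R}$, $I(u) := \int u \, d\lambda^1$, $K = [a, b]$, and $u_\delta$ is a trapezoidal function introduced only for this illustration.

```latex
\begin{align*}
u_\delta(x) &:= \max\bigl\{ 0,\; 1 - \delta^{-1} d(x, K) \bigr\}
   \qquad (\delta > 0), \\
1_K \le u_\delta &\in C_c(\mathbb{R}), \qquad
   \mathrm{supp}(u_\delta) = [a - \delta,\, b + \delta], \\
I(u_\delta) &= (b - a) + 2 \cdot \tfrac{\delta}{2} = (b - a) + \delta , \\
I(u) &\ge \int 1_K \, d\lambda^1 = b - a
   \qquad \text{for every } u \in C_c(\mathbb{R}) \text{ with } u \ge 1_K .
\end{align*}
```

Letting $\delta \to 0$ gives $\mu_*([a, b]) = b - a$, so in this case (28.2) recovers the length of the interval.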
Moreover, the mapping $K \mapsto \mu_*(K)$ is obviously isotone on the system $\mathscr{K}$ of all compact sets. For an arbitrary $A \in \mathscr{P}(E)$ we set
(28.4)  $\mu_*(A) := \sup\{\mu_*(K) : K \text{ compact} \subset A\}$.
Because of the above noted isotoneity of $\mu_*$ on $\mathscr{K}$, this new definition is consistent with (28.2). Finally, for $A \in \mathscr{P}(E)$ we define
(28.5)  $\mu^*(A) := \inf\{\mu_*(U) : A \subset U \text{ open}\}$.
Then $\mu_*$ and $\mu^*$ are isotone functions on $\mathscr{P}(E)$. Moreover
(28.6)  $\mu_*(A) \le \mu^*(A)$ for all $A \in \mathscr{P}(E)$,
as follows from the obvious fact that $\mu_*(A) \le \mu_*(U)$ for every open $U \supset A$; and
(28.7)  $\mu_*(U) = \mu^*(U)$ for all open $U \in \mathscr{P}(E)$,
which follows from (28.5) and the isotoneity of $\mu_*$. Somewhat more effort is required to check that
(28.8)  $\mu_*(K) = \mu^*(K)$ for all $K \in \mathscr{K}$.
For every $\varepsilon > 0$ definition (28.2) supplies a $u \in C_c(E)$ with $u \ge 1_K$ and
$I(u) - \mu_*(K) < \varepsilon$.
For $0 < \alpha < 1$, $U_\alpha := \{u > \alpha\}$ is an open superset of $K$ and $1_{U_\alpha}$ [...]
$1_{K_1 \cup K_2} = 1_{K_1} + 1_{K_2}$. According to 27.3 there is a $v \in C_c(E)$ with $0 \le v \le 1$, $v(K_1) = \{1\}$, and $\operatorname{supp}(v) \subset \complement K_2$, hence with $v(K_2) = \{0\}$. The functions $vu$ and $(1 - v)u$ lie in $C_c(E)$ and satisfy
$vu \ge 1_{K_1}$ and $(1 - v)u \ge 1_{K_2}$.
Therefore
$\mu_*(K_1) + \mu_*(K_2) \le I(vu) + I((1 - v)u) = I(u)$,
which, because of (28.2), has the consequence that
$\mu_*(K_1) + \mu_*(K_2) \le \mu_*(K_1 \cup K_2)$.
In view of (28.8) this inequality is half of the equality being claimed. The other half is simply the subadditivity of the outer measure $\mu^*$. The first important consequence of all this is:
28.3 Theorem. The restriction of $\mu^*$ to $\mathscr{B}(E)$ is a Borel measure.
The proof is immediate from Lemma 26.5 and the facts accumulated to this point. Notice that (28.7) and (28.5) say that hypothesis (i) of 26.5 is fulfilled, while (28.7), (28.8) and (28.4) ensure that hypothesis (ii) of 26.5 is fulfilled.
The Borel measure $\mu^* \,|\, \mathscr{B}(E)$ has a series of further remarkable properties:
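28.4 Theorem. Every Borel subset $A \subset E$ with $\mu^*(A) < +\infty$ satisfies
$\mu_*(A) = \mu^*(A)$.
Proof. Given $\varepsilon > 0$, there is an open $U \supset A$ such that
$\mu^*(U) - \mu^*(A) < \varepsilon/2$,
which, due to $\mu^*(A) < +\infty$ and $\mu^*$ being a measure on $\mathscr{B}(E)$, can be written as
$\mu^*(U \setminus A) = \mu^*(U) - \mu^*(A) < \varepsilon/2$.
From (28.4) we get compact $L \subset U$ such that
$\mu^*(U \setminus L) = \mu^*(U) - \mu^*(L) < \varepsilon/2$.
The set
$Q := (U \setminus A) \cup (U \setminus L)$
then satisfies $\mu^*(Q) < \varepsilon$. Hence there is an open $G \supset Q$ such that $\mu^*(G) < \varepsilon$.
Now $K := L \setminus G$ is a (closed, hence) compact subset of $L$ with the properties
(28.10)  $K \subset A$ and $A \setminus K \subset G$.
In fact, on the one hand
$K = L \setminus G \subset L \setminus Q \subset L \setminus (U \setminus A) = L \cap A$,
since $L \subset U$, and on the other hand
$A \setminus K = A \setminus (L \setminus G) = (A \cap G) \cup (A \setminus L) \subset G \cup (U \setminus L) = G$,
since $U \setminus L \subset Q \subset G$. From (28.10) we get
$\mu^*(A) - \mu^*(K) = \mu^*(A \setminus K) \le \mu^*(G) < \varepsilon$,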
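and so $\mu^*(A) < \mu^*(K) + \varepsilon = \mu_*(K) + \varepsilon \le \mu_*(A) + \varepsilon$, by (28.8) and the isotoneity of $\mu_*$. Since $\varepsilon > 0$ was arbitrary, this says that $\mu^*(A) \le \mu_*(A)$, which with (28.6) finishes the proof.
The finiteness hypothesis in the preceding theorem can be weakened. In doing so we make use of the terminology introduced just before the proof of Theorem 13.6.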
28.5 Corollary. The equality $\mu_*(A) = \mu^*(A)$ also holds for every $A \in \mathscr{B}(E)$ which has $\sigma$-finite $\mu^*$-measure.
Proof. The terminology means that there exist $A_n \in \mathscr{B}(E)$ ($n \in \mathbb{N}$), each of finite $\mu^*$-measure, such that $A_n \uparrow A$. The preceding theorem and the isotoneity yield
$\mu^*(A_n) = \mu_*(A_n) \le \mu_*(A)$,
from which and the continuity of $\mu^*$ from below on $\mathscr{B}(E)$ follows $\mu^*(A) = \sup \mu^*(A_n)$ [...] $\varepsilon > 0$, a compact $K_n \subset A_n$ satisfying
$\mu_*(A_n) - \mu_*(K_n) < 2^{-n}\varepsilon$ for each $n \in \mathbb{N}$.
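Since the sets $K_j$ are pairwise disjoint,
$\mu_*(A) \ge \mu_*\bigl(\bigcup_{j=1}^n K_j\bigr) = \mu^*\bigl(\bigcup_{j=1}^n K_j\bigr) = \sum_{j=1}^n \mu^*(K_j) = \sum_{j=1}^n \mu_*(K_j) > \sum_{j=1}^n \mu_*(A_j) - \sum_{j=1}^n 2^{-j}\varepsilon > \sum_{j=1}^n \mu_*(A_j) - \varepsilon$
for every $n \in \mathbb{N}$.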
Letting $n \to \infty$ we infer that
$\mu_*(A) \ge \sum_{j=1}^{\infty} \mu_*(A_j) - \varepsilon$,
holding for every $\varepsilon > 0$. That is, $\mu_*(A) \ge \sum_{j=1}^{\infty} \mu_*(A_j)$, the complementary inequality we needed to finish the proof.
We now set
(28.11)  $\mu_\circ := \mu_* \,|\, \mathscr{B}(E)$ and $\mu^\circ := \mu^* \,|\, \mathscr{B}(E)$
and, inspired by COURRÈGE [1962], call these the essential measure determined by $I$ and the principal measure determined by $I$, respectively. Each is a Borel measure (28.3 and 28.6).
Obviously the essential measure $\mu_\circ$ is inner regular, hence is a Radon measure on $E$. By contrast the principal measure $\mu^\circ$ is outer regular. It turns out that $\mu^\circ$ is the more important of the two. Thus to the given positive linear form $I$ on $C_c(E)$ we have associated two Borel measures. The further relation of these measures to $I$ and the questions of whether and when they coincide will be clarified in the next section. The closing lemma of this section recasts definition (28.4), when $A$ is open, into an equivalent form. It has a preparatory character.
28.7 Lemma. Every open set $U \subset E$ satisfies
(28.12)  $\mu_\circ(U) = \mu^\circ(U) = \sup\{I(u) : u \in C_c(E),\ \operatorname{supp}(u) \subset U,\ 0 \le u \le 1\}$.
Proof. The first equality is just (28.7). Denote the right side of (28.12) by $\gamma$, and consider any compact $K \subset U$. Corollary 27.3 provides a function $u \in C_c(E)$ with $0 \le u \le 1$, $u(K) = \{1\}$ and $\operatorname{supp}(u) \subset U$. In particular, $1_K \le u$ and so by (28.2) $\mu_*(K) \le I(u) \le \gamma$, that is, $\mu_*(K) \le \gamma$ for every such $K$. It follows that $\mu^\circ(U) = \mu^*(U) = \mu_*(U) \le \gamma$, by (28.4). The reverse inequality $\gamma \le \mu^\circ(U)$ is derived as follows: Let $u \in C_c(E)$ be a typical function involved in the definition of $\gamma$. Set $L := \operatorname{supp}(u)$ and consider a typical $v \in C_c(E)$ involved in the definition (28.2) of $\mu_*(L)$. Evidently then $u \le v$, so $I(u) \le I(v)$; that is, $I(u) \le \mu_*(L) = \mu_\circ(L) = \mu^\circ(L) \le \mu^\circ(U)$. Taking the supremum over eligible $u$ gives finally the desired complementary inequality $\gamma \le \mu^\circ(U)$.
A sharpening of equality (28.12) will be presented in Exercise 2 of §29. The special case $U = E$ of Lemma 28.7 furnishes the following useful description of the total masses of $\mu_\circ$ and $\mu^\circ$:
(28.13)  $\|\mu_\circ\| = \|\mu^\circ\| = \sup\{I(u) : u \in C_c(E),\ 0 \le u \le 1\}$.
Exercises.
1. For a locally compact space $E$ and a measure $\mu$ defined on $\mathscr{B}(E)$, show that $\mu$ is a Borel measure if and only if $C_c(E) \subset \mathscr{L}^1(\mu)$.
2. Let $\mu$ be a Radon measure on a locally compact space $E$ and $(G_i)_{i \in I}$ a family of open sets which is upward filtering, that is, for any $i, j \in I$ there is a $k \in I$ such that $G_i \cup G_j \subset G_k$. Show that $G := \bigcup_{i \in I} G_i$ satisfies
$\mu(G) = \sup\{\mu(G_i) : i \in I\}$.
3. Using the preceding exercise, show that for any Radon measure $\mu$ on a locally compact space $E$:
(a) There exists a largest open set $G$ with $\mu(G) = 0$. The set $\complement G$ is called the support of the measure $\mu$ and is denoted $\operatorname{supp}(\mu)$.
(b) A point $x \in E$ lies in $\operatorname{supp}(\mu)$ if and only if every open neighborhood of $x$ has positive $\mu$-measure.
(c) For a non-negative $f \in C(E)$, $\int f\,d\mu = 0$ if and only if $f = 0$ throughout $\operatorname{supp}(\mu)$.
Determine $\operatorname{supp}(\lambda^d)$ for L-B measure $\lambda^d$ on $\mathbb{R}^d$, and $\operatorname{supp}(\varepsilon_a)$ for every Dirac measure $\varepsilon_a$ on $E$.
4. Let $\mu$ be a Borel measure on a locally compact space $E$. Show that every set $A$ from the $\sigma$-ring $\rho_\sigma(\mathscr{K})$ generated by the system $\mathscr{K}$ of compact subsets of $E$ is a Borel set which satisfies $\mu_\circ(A) = \mu^\circ(A)$. Here a ring $\mathscr{R}$ in a set $\Omega$ is called a $\sigma$-ring if the union of every sequence of sets in $\mathscr{R}$ is itself a set in $\mathscr{R}$. In complete analogy with $\sigma$-algebras, every subset of $\mathscr{P}(\Omega)$ is contained in a smallest $\sigma$-ring. Sometimes it is only the sets in $\rho_\sigma(\mathscr{K})$ which get called "Borel sets"; this is the case, e.g., in the classic exposition of HALMOS [1974]. Why is it generally the case that $\rho_\sigma(\mathscr{K}) \ne \mathscr{B}(E)$?
§29. Riesz representation theorem
Again let $E$ be a locally compact space. Every Borel measure $\mu$ on $E$ defines a positive linear form
$I_\mu(u) := \int u\,d\mu$
on $C_c(E)$. The question posed in §28 was: Is it true that for every positive linear form $I$ on $C_c(E)$ there is a Borel measure $\mu$ on $E$ such that $I_\mu = I$, that is, such that
$I(u) = \int u\,d\mu$ for all $u \in C_c(E)$?
Any such Borel measure $\mu$ will be called a representing measure for $I$. The answer, leaked earlier, to this question reads:
29.1 Riesz representation theorem. If $E$ is a locally compact space, every positive linear form $I$ on $C_c(E)$ has at least one representing measure. In fact, both the essential measure $\mu_\circ$ determined by $I$ and the principal measure $\mu^\circ$ determined by $I$ are representing measures for $I$.
Proof. $\mu_\circ$ and $\mu^\circ$ are Borel measures. It must be shown that
(29.1)  $I(u) = \int u\,d\mu^\circ = \int u\,d\mu_\circ$ for all $u \in C_c(E)$,
and because of linearity and the fact that the positive and negative parts of each $u \in C_c(E)$ also lie in $C_c(E)$, it suffices to show this for non-negative $u$. So let such a $u$ be given and let the real number $b > 0$ be an upper bound for $u$. For a given $\varepsilon > 0$ choose real numbers $y_0, \ldots, y_n$ with
$0 = y_0$ [...] $\alpha > 0$, $\beta > 0$ the measure $\alpha\mu + \beta\nu$ also lies in $\mathscr{M}_+(E)$, as is easily checked. That is, $\mathscr{M}_+(E)$ is what is called a convex cone. Besides $\mathscr{M}_+(E)$ we often consider the following subsets
$\mathscr{M}_+^b(E) := \{\mu \in \mathscr{M}_+(E) : \mu(E) < +\infty\}$,
$\mathscr{M}_+^1(E) := \{\mu \in \mathscr{M}_+(E) : \mu(E) = 1\}$,
the set of all finite (or bounded) Radon measures and the set of all Radon p-measures on $E$, respectively. Evidently
$\mathscr{M}_+^1(E) \subset \mathscr{M}_+^b(E) \subset \mathscr{M}_+(E)$.
In $\mathscr{M}_+^1(E)$ are to be found all the Dirac measures on $E$. And $\mathscr{M}_+^b(E)$ is a convex subcone of $\mathscr{M}_+(E)$. In the special case $E = \mathbb{R}^d$ the set $\mathscr{M}_+^b(\mathbb{R}^d)$ is the set of all finite Borel measures on $\mathbb{R}^d$, already familiar to us from §24. That the definition there is equivalent to the present one is due to Theorem 29.12, according to which every Borel measure on $\mathbb{R}^d$ is a Radon measure. Depending on whether one thinks of the elements of $\mathscr{M}_+(E)$ as measures on $\mathscr{B}(E)$ or as positive linear forms on $C_c(E)$, two notions of convergence suggest themselves: One can define the convergence of a sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ to
§30. Convergence of Radon measures
$\mu \in \mathscr{M}_+(E)$ by requiring either that
$\lim_{n\to\infty} \mu_n(A) = \mu(A)$ for all $A \in \mathscr{B}(E)$
or
$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu$ for all $f \in C_c(E)$.
We will forthwith show that the first of these is of limited interest, while the second is of considerable significance.
30.1 Definition. A sequence $(\mu_n)_{n \in \mathbb{N}}$ of Radon measures on $E$ is said to be vaguely convergent to a Radon measure $\mu$ if
(30.1)  $\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu$ for all $f \in C_c(E)$.
A sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ is vaguely convergent just when the sequence of real numbers $(\int f\,d\mu_n)$ converges in $\mathbb{R}$ for every $f \in C_c(E)$. For in this case $f \mapsto \lim_n \int f\,d\mu_n$ evidently defines a positive linear form on $C_c(E)$, so by the Riesz representation theorem together with Theorem 29.3 there is a unique Radon measure $\mu$ to which $(\mu_n)$ vaguely converges. At the same time we see that a sequence in $\mathscr{M}_+(E)$ can have at most one vague limit.
Examples. 1. Let $(x_n)$ be a sequence in $E$, $x \in E$. If $(x_n)$ converges to $x$, then $(\varepsilon_{x_n})$ converges vaguely to $\varepsilon_x$, for the latter just amounts to $\lim f(x_n) = f(x)$. In general however $\lim \varepsilon_{x_n}(A) = \varepsilon_x(A)$ does not hold for all $A \in \mathscr{B}(E)$; in fact, if all $x_n$ are distinct from $x$, $A := \{x\}$ is such a set. Conversely, if $(\varepsilon_{x_n})$ vaguely converges to $\varepsilon_x$, then $(x_n)$ converges to $x$. For if this were not so, there would be a subsequence of $(x_n)$ which remains outside of some neighborhood $U$ of $x$. 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$ and $\operatorname{supp}(f) \subset U$. Evidently the sequence $(\int f\,d\varepsilon_{x_n}) = (f(x_n))$ does not converge to $\int f\,d\varepsilon_x$.
2. Let $(\alpha_n)$ be an arbitrary sequence of non-negative real numbers and $(x_n)$ a sequence in $E$ with the property that $\{n \in \mathbb{N} : x_n \in K\}$ is finite for every compact $K \subset E$. (In other words, $E$ is not compact and $\lim x_n = \omega_0 \in E'$.) Then the sequence of measures $\mu_n := \alpha_n \varepsilon_{x_n}$ ($n \in \mathbb{N}$) is vaguely convergent to the zero measure $\mu := 0$. For $\int f\,d\mu_n = \alpha_n f(x_n) = 0$ for all $n$ except the finitely many for which $x_n \in \operatorname{supp}(f)$, whenever $f \in C_c(E)$.
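Both examples can be checked numerically on $E = \mathbb{R}$. The following is a minimal sketch under assumptions that are not in the text: the test function is a hypothetical tent function supported in $[-1, 1]$, the convergent sequence is $x_n = 1/n$, and the escaping sequence is $x_n = n$ with weights $\alpha_n = n^2$.

```python
def tent(x):
    """A test function in C_c(R): continuous, supported in [-1, 1], tent(0) = 1."""
    return max(0.0, 1.0 - abs(x))

def integral_dirac(f, a):
    """Integral of f with respect to the Dirac measure at the point a."""
    return f(a)

def dirac_mass_of_point(a, point):
    """Mass which the Dirac measure at a assigns to the one-point set {point}."""
    return 1.0 if a == point else 0.0

# Example 1 with x_n = 1/n -> 0: the integrals converge to f(0) ...
xs = [1.0 / n for n in range(1, 1001)]
integrals = [integral_dirac(tent, x) for x in xs]
print(integrals[-1], tent(0.0))

# ... but the masses of A = {0} stay at 0, while the limit measure gives A mass 1.
masses = [dirac_mass_of_point(x, 0.0) for x in xs]
print(max(masses), dirac_mass_of_point(0.0, 0.0))

# Example 2 with x_n = n and alpha_n = n^2: every integral against tent vanishes,
# although the total masses alpha_n are unbounded.
escaping = [n**2 * integral_dirac(tent, float(n)) for n in range(1, 50)]
print(max(escaping))
```

The second part shows why vague convergence says nothing about total mass: the weights grow without bound, yet each fixed test function with compact support eventually sees nothing.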
The fact, illustrated by Example 1, that the vague convergence of $(\mu_n)$ to $\mu$ does not generally entail the convergence of $(\mu_n(A))$ to $\mu(A)$ for each $A \in \mathscr{B}(E)$, while, as 30.2 will show, the converse is true, seems to indicate that the first mode of convergence mentioned above is too restrictive to be of much use. Actually, vague convergence of $(\mu_n)$ to $\mu$ follows just from knowing that $(\mu_n(A))$ converges to $\mu(A)$ for certain special sets $A \in \mathscr{B}(E)$. Even more:
30.2 Theorem. A sequence $(\mu_n)$ of Radon measures on a locally compact space $E$ converges vaguely to a Radon measure $\mu$ if and only if the following condition is fulfilled:
(30.2)  $\limsup_{n\to\infty} \mu_n(K) \le \mu(K)$ and $\liminf_{n\to\infty} \mu_n(G) \ge \mu(G)$
for every compact $K \subset E$ and every relatively compact, open $G \subset E$.
Proof. Suppose $(\mu_n)$ converges vaguely to $\mu$ and that $K$ and $G$ are any compact and open sets, respectively. Consider functions $u, v \in C_c(E)$ with $u \ge 1_K$, $0 \le v \le 1$ and $\operatorname{supp}(v) \subset G$. Then for all $n \in \mathbb{N}$
$\mu_n(K) \le \int u\,d\mu_n$ and $\int v\,d\mu_n \le \mu_n(G)$ [...] $> 0$ we choose finitely many numbers
$0 = y_0$ [...] $r > 0$ set $K_r(x) := r^d K(rx)$ ($x \in \mathbb{R}^d$). Then $K_r$ is also non-negative and $\lambda^d$-integrable, and $\int K_r\,d\lambda^d = 1$ as well. To see this we only have to recall (7.10), according to which the homothety $H_r(x) := rx$ on $\mathbb{R}^d$ transforms L-B measure thus: $H_r(\lambda^d) = r^{-d}\lambda^d$. For from that it follows
that
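$\int K_r\,d\lambda^d = r^d \int K \circ H_r\,d\lambda^d = r^d \int K\,d(H_r(\lambda^d)) = \int K\,d\lambda^d = 1$.
Now $r \mapsto K_r\lambda^d$ is a mapping of $]0,+\infty[$ into $\mathscr{M}_+^1(\mathbb{R}^d)$, and in the sense of the vague topology it satisfies
(30.7)  $\lim_{r\to+\infty} K_r\lambda^d = \varepsilon_0$.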
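As a numerical sanity check of (30.7), one can evaluate $\int f K_r\,d\lambda^d$ for growing $r$ and compare with $f(0)$. The sketch below rests on assumptions that are not in the text: $d = 1$, the kernel is $K = \tfrac12 1_{[-1,1]}$, the test function is a hypothetical tent function, and the integral is computed by a midpoint rule after the substitution $y = rx$. For this choice and $r \ge 1$ the exact value is $1 - 1/(2r)$.

```python
def tent(x):
    """Test function in C_c(R) with tent(0) = 1."""
    return max(0.0, 1.0 - abs(x))

def kernel(y):
    """K = (1/2) * indicator of [-1, 1]; non-negative with integral 1."""
    return 0.5 if -1.0 <= y <= 1.0 else 0.0

def integral_against_K_r(f, r, n=2000):
    """Midpoint-rule value of  int f(x) r K(r x) dx  =  int f(y / r) K(y) dy.

    Only [-1, 1] is sampled because K vanishes outside that interval.
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        y = -1.0 + (i + 0.5) * h
        total += f(y / r) * kernel(y) * h
    return total

for r in (1.0, 10.0, 100.0, 1000.0):
    print(r, integral_against_K_r(tent, r))   # approaches tent(0) = 1 as r grows
```

The substitution mirrors the transformation formula used in the text: the mass of $K_r\lambda^d$ stays equal to 1 while it concentrates near the origin.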
To confirm this, first notice that for every $f \in C_c(\mathbb{R}^d)$
$\int f K_r\,d\lambda^d = r^d \int f\,(K \circ H_r)\,d\lambda^d = r^d \int (f \circ H_r^{-1})\,K\,dH_r(\lambda^d) = \int (f \circ H_r^{-1})\,K\,d\lambda^d = \int f(r^{-1}x) K(x)\,\lambda^d(dx)$.
From this and the Lebesgue dominated convergence theorem the claim (30.7) follows upon checking that, on the one hand
$\lim_{r\to+\infty} f(r^{-1}x) K(x) = f(0) K(x)$ for every $x \in \mathbb{R}^d$,
and on the other hand for all real $r > 0$ and all $x \in \mathbb{R}^d$
$|f(r^{-1}x) K(x)| \le \|f\|\,K(x)$ [...] $\Phi(\mathscr{M}_+)$ into $\mathbb{R}$ ($f \in C_c(E)$). But this mapping is just the restriction to $\Phi(\mathscr{M}_+)$ of the projection of $P = \mathbb{R}^{C_c(E)}$ onto its coordinate specified by $f$.
As to (b): Let $I \in P$ be a point in the closure of $\Phi(\mathscr{M}_+)$ in $P$. Then $I$ is a positive linear form on $C_c(E)$. To see its additivity, for example, let $f, g \in C_c(E)$ and $\varepsilon > 0$ be given. The set of all $I' \in P$ which satisfy
$|I'(u) - I(u)| < \varepsilon$ for $u \in \{f, g, f + g\}$
is a neighborhood of $I$ in $P$, and therefore contains a point $I' = \Phi(\mu)$ from $\Phi(\mathscr{M}_+)$. $I'$ is thus the positive linear form
$u \mapsto I'(u) = \int u\,d\mu$
on $C_c(E)$. That means that we have
$|I(f+g) - I(f) - I(g)| \le |I(f+g) - I'(f+g)| + |I'(f+g) - I(f) - I(g)|$
$= |I(f+g) - I'(f+g)| + |I'(f) - I(f) + I'(g) - I(g)|$
$< \varepsilon + |I'(f) - I(f)| + |I'(g) - I(g)| < 3\varepsilon$.
Since $\varepsilon > 0$ is arbitrary, the extreme inequality means that its left-hand side must be 0. In a completely analogous way one proves that $I(\alpha f) = \alpha I(f)$ for every $\alpha \in \mathbb{R}$, $f \in C_c(E)$, and $I(g) \ge 0$ for every non-negative $g \in C_c(E)$. With the linearity of $I$ confirmed, the Riesz representation theorem supplies a Radon measure $\nu \in \mathscr{M}_+$ such that $\Phi(\nu) = I$. That is, $I$ lies in $\Phi(\mathscr{M}_+)$, confirming that the latter is closed in $P$. □
31.3 Corollary. For every real number $a > 0$ the set
$\mathscr{D}_a := \{\mu \in \mathscr{M}_+(E) : \|\mu\|$ [...] $> 0$. That is, the desired equality $\int f\,d\mu = \int f\,d\nu$ must hold. The next step is to show that the topology determined by $\rho$ is none other than the vague topology. We will, to that end, make use of the fact that the sets defined in (30.5) are a neighborhood base at $\nu \in \mathscr{M}_+$ in the vague
topology, when all possible finite subsets $\{f_1, \ldots, f_n\}$ of $C_c(E)$ and all numbers $\varepsilon > 0$ are considered. We will denote by $U_\varepsilon(\nu)$ the open ball of center $\nu$ and radius $\varepsilon$ with respect to the metric $\rho$.
1. Given $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that
(31.7)  $V_{d_1, \ldots, d_m;\,\varepsilon/2}(\nu) \subset U_\varepsilon(\nu)$ for every $\nu \in \mathscr{M}_+$.
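Indeed, one may take any $m \in \mathbb{N}$ such that
$\sum_{n=m+1}^{\infty} 2^{-n} < \varepsilon/2$
and every $\mu \in V_{d_1, \ldots, d_m;\,\varepsilon/2}(\nu)$ will then satisfy
$\rho(\mu, \nu)$ [...] $\varepsilon > 0$ and every $\nu \in \mathscr{M}_+$, there is a number $\eta > 0$ such that
(31.8)  $U_\eta(\nu) \subset V_{f_1, \ldots, f_n;\,\varepsilon}(\nu)$.
First of all, choose $k \in \mathbb{N}$ so that
$\bigcup_{j=1}^n \operatorname{supp}(f_j) \subset L_k \subset \{e_k = 1\}$.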
We can find a number 8, dependent on v, so that
0 N.
if
The second of the (valid for all $r, s > N$) inequalities in (31.11) shows that the numerical sequence $(\int e_k\,d\mu_n)_{n \in \mathbb{N}}$ is bounded, say by $M \in \mathbb{R}_+$:
$\int e_k\,d\mu_n \le M$ for all $n \in \mathbb{N}$.
The earlier inequality therefore yields
Jfdpr_JfdP8N.
Notice that $M$ depends only on $k$, hence only on $f$. Furthermore $N$ depends only on $\delta$ and $f$. Therefore this last inequality affirms that $(\int f\,d\mu_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$. According to the remark following Definition 30.1 the sequence $(\mu_n)$ is therefore vaguely convergent to some $\mu_0 \in \mathscr{M}_+$. Since the vague topology coincides with the $\rho$-topology, as we have already confirmed, this means that the sequence $(\mu_n)$ converges to $\mu_0$ in the $\rho$-metric.
We finally need to prove that, like the topology of $E$, the vague topology of $\mathscr{M}_+$ has a countable base. Since the vague topology is generated by the metric $\rho$, it is enough to find a countable set $\mathscr{D}_0$ which is dense in $\mathscr{M}_+$; because it is obvious that the set of all open balls with respect to the metric $\rho$ centered at points of $\mathscr{D}_0$ and having rational radii is then a countable base for the $\rho$-topology of $\mathscr{M}_+$. Our candidate for $\mathscr{D}_0$ is the set of all discrete measures
$\delta := \sum_{i=1}^k \alpha_i \varepsilon_{x_i}$
with positive rational $\alpha_i$ and points $x_i$ drawn from a countable set $E_0$ which is dense in $E$. We get such a set $E_0$ simply by taking a point from each set in a countable base for the topology of $E$. Evidently, this $\mathscr{D}_0$ is countable. We have to show that for every $\mu \in \mathscr{M}_+$, every real $\varepsilon > 0$, and every finite set $F := \{f_1, \ldots, f_n\} \subset C_c(E)$, the basic vague neighborhood $V_{f_1, \ldots, f_n;\,\varepsilon}(\mu)$ contains a measure from $\mathscr{D}_0$. At least, according to 30.4, this neighborhood contains a
with positive real Ui and Ti E E. Thus (31.12)
ip- Jfdbl-l Jfd
k
<e i=1
for all f EF.
§31. Vague compactness and metrizability questions
Now for such f and d as above
if fdIt-Jfd.6l
[...] $\varepsilon > 0$ and integers $1 \le n_1 < n_2 < \ldots$ such that $|\int f\,d\mu_{n_j} - \int f\,d\mu| \ge \varepsilon$ for all $j \in \mathbb{N}$. The sequence $(\mu_{n_j})_{j \in \mathbb{N}}$ would have a vaguely convergent subsequence and its vague limit could not be $\mu$. If we further hypothesize of $(\mu_n)$ that it is tight, then with the aid of Remark 3 in §30 we can conclude that $\mu \in \mathscr{M}_+^1(E)$ as well, and that $(\mu_n)$ even converges weakly to $\mu$.
5. The foregoing deliberations show (for locally compact $E$ with a countable base) that tight sequences in $\mathscr{M}_+^1(E)$ always contain weakly convergent subsequences. Explicitly formulated this says: A set $H \subset \mathscr{M}_+^1(E)$ is relatively compact (= relatively sequentially compact) in the weak topology if it is tight, meaning that for every $\varepsilon > 0$ a compact $K_\varepsilon \subset E$ exists such that $\mu(E \setminus K_\varepsilon) < \varepsilon$ for every $\mu \in H$. A theorem of Yu. V. PROHOROV asserts that the tightness of $H$ is even equivalent to its weak relative compactness. More is true: This equivalence prevails as well whenever $E$ is any Polish space. For details the reader can consult BILLINGSLEY [1968].
The ideas employed in the proofs of Theorems 31.4 and 31.5, slightly modified, lead to a further interesting result. It concerns the space
$C := C(\mathbb{R}_+, E)$
of all continuous mappings $f$ of $\mathbb{R}_+ := [0, +\infty[$ into a Polish space $E$, for example, $\mathbb{R}^d$. We endow $C$ with the topology of uniform convergence on compact subsets of $\mathbb{R}_+$.
31.6 Theorem. Along with $E$, the space $C(\mathbb{R}_+, E)$ is also Polish.
Proof. Consider any complete metric $\rho$ which generates the topology of $E$. Another such metric is given by $(x, y) \mapsto \min\{1, \rho(x, y)\}$, and using it if need be, we can simply assume that $\rho \le 1$. This lets us define $d_n$ in $C$ for each $n \in \mathbb{N}$ by
$d_n(f, g) := \sup\{\rho(f(x), g(x)) : x \in [0, n]\}$, $f, g \in C$;
and
(31.14)  $d(f, g) := \sum_{n=1}^{\infty} 2^{-n} d_n(f, g)$, $f, g \in C$.
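The metric (31.14) can be approximated numerically, which makes its behavior easier to see. The sketch below is an illustration under assumptions that are not in the text: $E = \mathbb{R}$ with the bounded metric $\rho(a, b) = \min\{1, |a - b|\}$, the supremum in $d_n$ is replaced by a maximum over a finite grid, and the series is truncated after a fixed number of terms (the names `steps_per_unit` and `terms` are hypothetical parameters).

```python
def rho(a, b):
    """Bounded metric on E = R, as in the reduction to rho <= 1."""
    return min(1.0, abs(a - b))

def d_n(f, g, n, steps_per_unit=50):
    """Grid approximation of sup{rho(f(x), g(x)) : x in [0, n]}."""
    points = (k / steps_per_unit for k in range(n * steps_per_unit + 1))
    return max(rho(f(x), g(x)) for x in points)

def d(f, g, terms=30):
    """Truncation of the series (31.14); the neglected tail is at most 2**-terms."""
    return sum(2.0 ** -n * d_n(f, g, n) for n in range(1, terms + 1))

def zero(x):
    return 0.0

def identity(x):
    return x

def far(x):
    """Agrees with zero on [0, 5] and grows afterwards."""
    return max(0.0, x - 5.0)

print(d(zero, zero))      # identical functions have distance 0
print(d(zero, identity))  # differs already on [0, 1]; close to 1
print(d(zero, far))       # differs only beyond x = 5; close to 2**-5
```

Functions that agree on a long initial interval are close in $d$, which is how the series encodes uniform convergence on compact subsets of $\mathbb{R}_+$.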
Just as earlier (cf. (31.3) and (31.4)), one easily confirms that d is a metric on C (with all its values in [0,1]) which satisfies (31.15)
2-nd(f,g)