Measure and Integration Theory (De Gruyter Studies in Mathematics)


Author: Heinz Bauer


Heinz Bauer

Measure and Integration Theory

de Gruyter Studies in Mathematics 26

Editors: Carlos Kenig, Andrew Ranicki, Michael Röckner

de Gruyter Studies in Mathematics

1 Riemannian Geometry, 2nd rev. ed., Wilhelm P. A. Klingenberg
2 Semimartingales, Michel Métivier
3 Holomorphic Functions of Several Variables, Ludger Kaup and Burchard Kaup
4 Spaces of Measures, Corneliu Constantinescu
5 Knots, Gerhard Burde and Heiner Zieschang
6 Ergodic Theorems, Ulrich Krengel
7 Mathematical Theory of Statistics, Helmut Strasser
8 Transformation Groups, Tammo tom Dieck
9 Gibbs Measures and Phase Transitions, Hans-Otto Georgii
10 Analyticity in Infinite Dimensional Spaces, Michel Hervé
11 Elementary Geometry in Hyperbolic Space, Werner Fenchel
12 Transcendental Numbers, Andrei B. Shidlovskii
13 Ordinary Differential Equations, Herbert Amann
14 Dirichlet Forms and Analysis on Wiener Space, Nicolas Bouleau and Francis Hirsch
15 Nevanlinna Theory and Complex Differential Equations, Ilpo Laine
16 Rational Iteration, Norbert Steinmetz
17 Korovkin-type Approximation Theory and its Applications, Francesco Altomare and Michele Campiti
18 Quantum Invariants of Knots and 3-Manifolds, Vladimir G. Turaev
19 Dirichlet Forms and Symmetric Markov Processes, Masatoshi Fukushima, Yoichi Oshima, Masayoshi Takeda
20 Harmonic Analysis of Probability Measures on Hypergroups, Walter R. Bloom and Herbert Heyer
21 Potential Theory on Infinite-Dimensional Abelian Groups, Alexander Bendikov
22 Methods of Noncommutative Analysis, Vladimir E. Nazaikinskii, Victor E. Shatalov, Boris Yu. Sternin
23 Probability Theory, Heinz Bauer
24 Variational Methods for Potential Operator Equations, Jan Chabrowski
25 The Structure of Compact Groups, Karl H. Hofmann and Sidney A. Morris

Heinz Bauer

Measure and Integration Theory Translated from the German by Robert B. Burckel

Walter de Gruyter, Berlin, New York 2001

Author

Heinz Bauer
Mathematisches Institut der Universität Erlangen-Nürnberg
Bismarckstraße 1 1/2
91054 Erlangen
Germany

Translator

Robert B. Burckel
Department of Mathematics
Kansas State University
137 Cardwell Hall
Manhattan, Kansas 66506-2602
USA

Series Editors

Carlos E. Kenig
Department of Mathematics
University of Chicago
5734 University Ave
Chicago, IL 60637
USA

Andrew Ranicki
Department of Mathematics
University of Edinburgh
Mayfield Road
Edinburgh EH9 3JZ
Scotland

Michael Röckner
Fakultät für Mathematik
Universität Bielefeld
Universitätsstraße 25
33615 Bielefeld
Germany

Mathematics Subject Classification 2000: 28-01; 28-02

Keywords: Product measures, measures on topological spaces, topological measure theory, introduction to measures and integration theory

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress - Cataloging-in-Publication Data

Bauer, Heinz, 1928-
[Maß- und Integrationstheorie. English]
Measure and integration theory / Heinz Bauer ; translated from the German by Robert B. Burckel.
p. cm. - (De Gruyter studies in mathematics ; 26)
Includes bibliographical references and indexes.
ISBN 3110167190 (acid-free paper)
1. Measure theory. 2. Integrals, Generalized. I. Title. II. Series.
QC20.7.M43 B4813 2001
530.8'01 - dc21
2001028235

Die Deutsche Bibliothek - Cataloging-in-Publication Data

Bauer, Heinz:
Measure and integration theory / Heinz Bauer. Transl. from the German by Robert B. Burckel. - Berlin ; New York : de Gruyter, 2001
(De Gruyter studies in mathematics ; 26)
Einheitssacht.: Maß- und Integrationstheorie (engl.)
ISBN 3-11-016719-0

© Copyright 2001 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin, Germany. All rights reserved including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Printed in Germany. Typesetting: Oldřich Ulrych, Prague, Czech Republic. Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen. Cover design: Rudolf Hübler, Berlin.

In memoriam

OTTO HAUPT (5.3.1887 - 10.11.1988), former Professor of Mathematics at the University of Erlangen

Preface

More than thirty years ago my textbook Wahrscheinlichkeitstheorie und Grundzüge der Maßtheorie was published for the first time. It contained three introductory chapters on measure and integration as well as a chapter on measure in topological spaces, which was embedded in the probabilistic developments. Over the years these parts of the book were made the basis for lectures on measure and integration at various universities. Generations of students used the measure theory part for self-study and for examination preparations, even if their interests often did not extend as far as the probability theory.

When the decision was made to rewrite and extend the parts devoted to probability theory, it was also decided to publish the part on measure and integration theory as a separate volume. This volume had to serve two purposes. As before it had to provide the measure-theoretic background for my book on probability theory. Secondly, it should be a self-contained introduction into the field. The German edition of this book was published in 1990 (with a second edition in 1992), followed in 1992 by the rewritten book on probability theory. The latter was translated into English and the translation was published in 1995 as Probability Theory (Volume 23) in this series.

When offering now a translation of the book Maß- und Integrationstheorie we have two aims: To provide the reader of my book on probability theory with the necessary auxiliary results and, secondly, to serve as a secure entry into a theory which to an ever-increasing extent is significant not only for many areas within mathematics, but also for applications in physics, economics and computer science.

However, once again this book is much more than a pure translation of the German original and the following quotation of the preface of my book Probability Theory applies a further time: "It is in fact a revised and improved version of that book. A translator, in the sense of the word, could never do this job. This explains why I have to express my deep gratitude to my very special translator, to my American colleague Professor Robert B. Burckel from Kansas State University. He had gotten to know my book by reading its very first German edition. I owe our friendship to his early interest in it. He expended great energy, especially on this new book, using his extensive acquaintance with the literature to make many knowledgeable suggestions, pressing for greater clarity and giving intensive support in bringing this enterprise to a good conclusion."

In addition I want to thank Dr. Oldřich Ulrych from Prague for his skill and patience in preparing the book manuscript in TeX for final processing. Many thanks are due to my family and Professor Niels Jacob, University of Swansea, for reasons they will know. Finally, I thank my publisher Walter de Gruyter & Co., and, above all, Dr. Manfred Karbe for publishing the translation of my book.

Erlangen, March 2001

Heinz Bauer

Introduction

Measure theory and integration are closely interwoven theories, both content-wise and in their historical developments. They form a unit. The development of analysis in the 19th century - here one is thinking especially about the theory of Fourier series and classical function theory - compelled the creation of a sufficiently general concept of the integral that discontinuous functions could also be integrated. The jump function of P. G. LEJEUNE DIRICHLET should be seen in this light. At that time only an integration theory due to CAUCHY, a precursor of Riemann's, was known. And it was not until B. RIEMANN's Habilitation in 1854 (text published posthumously in 1867) that Cauchy's ideas were made sufficiently precise to integrate (certain) discontinuous functions. For the first time the need was felt for integrability criteria. Parallel to this a "theory of content" was evolving - primarily at the hands of G. PEANO and C. JORDAN - to measure the areas of plane and the volumes of spatial "figures".

But the decisive breakthrough occurred at the turn of the century, thanks to the French mathematicians EMILE BOREL and HENRI LEBESGUE. In 1898 Borel - coming from the direction of function theory - described the "$\sigma$-algebra" of sets that today bear his name, the Borel sets, and showed how to construct a "measure" on this $\sigma$-algebra that satisfactorily resolved the problems of measuring content. In particular, he recognized the significance of the "$\sigma$-additivity" of the measure. In his thesis (1902) LEBESGUE presented the integral concept, subsequently named after him, that proved decisive for the development of a general theory. At the same time he furnished the tools needed to make Borel's ideas more precise. From then on Lebesgue-Borel measure on the $\sigma$-algebra of Borel sets and Lebesgue measure on a somewhat larger $\sigma$-algebra - consisting of the sets which are "measurable" in Lebesgue's sense - became standard methods of analysis.

What was new about Lebesgue's integral concept was not just the way it was defined, but also - and this was the real reason for its fame - its great versatility as manifested in the way it behaved with respect to limit operations. Consequently the convergence theorems are at the center of the integration theory developed by Lebesgue and his intellectual progeny. Subsequent developments are characterized by increasing recognition of the versatility of Lebesgue's concepts in dealing with new demands from mathematics and its applications. In the course of time (up to 1930) the general (abstract) measure concept crystallized, and a theory of integration built on it - after Lebesgue's model.

It is this theory that will be developed here in an introductory fashion, but far enough that from the platform so erected the reader can easily press ahead to deeper questions and the manifold applications. Areas in which measure and integration play a key role are, for example, ergodic theory, spectral theory, harmonic analysis on locally compact groups, and mathematical economics. But the foremost example is probability theory, which uses measure and integration as an indispensable tool and whose own specific kinds of questions and methods have in turn helped to shape the former. Even today the development of measure and integration theory is far from finished.

The book consists of four chapters. The first is devoted to the measure concept and in particular to the Lebesgue-Borel measure and its interplay with geometry. In the second chapter the integral determined by a measure, and in particular the Lebesgue integral, the one determined by Lebesgue-Borel measure, will be introduced and investigated. The short third chapter deals with the product of measures and the associated integration. An application of this which is very important in Fourier analysis is the convolution of measures. In the fourth and last chapter the abstract concept of measure is made more concrete in the form of Radon measures. As in the original example of Lebesgue-Borel measure, here the relation of the measure to a topology on the underlying set moves into the foreground. Essentially two kinds of spaces are allowed: Polish spaces and locally compact spaces. The topological tools needed for this will mostly be developed in the text, with the reader occasionally being given only a reference (very specific) to the standard textbook literature.

The examples accompanying the exposition of a theme have an important function. They are supposed to illuminate the concepts and illustrate the limitations of the theory. The reader should therefore work through them with care. Exercises also accompany the exposition. They are not essential to understanding later developments and, in particular, proofs are not superficially shortened by consigning parts to the exercises. But the exercises do serve to deepen the reader's understanding of the material treated in the text, and working them is strongly recommended.

Notations

Here we assemble some of the notation and phraseology which will be used in the text without further comment and which - with but a few exceptions - are in general use.

By $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ we designate the sets of natural numbers $1, 2, \ldots$ (excluding 0), of whole numbers, of rational numbers and of real numbers, respectively. We always think of the field $\mathbb{R}$ as equipped with its usual (euclidean) metric and the topology that it determines. Thus $|x - y|$ is the euclidean distance between two numbers $x, y \in \mathbb{R}$. We also speak of the number line $\mathbb{R}$.

Via the adjunction of $+\infty$ and $-\infty$ to $\mathbb{R}$, the extended or compactified number line $\overline{\mathbb{R}}$ is produced. Addition with the improper numbers $+\infty$ and $-\infty$ is performed in the usual way: $a + (\pm\infty) = (\pm\infty) + a = \pm\infty$ for $a \in \mathbb{R}$, and as well $(+\infty) + (+\infty) = +\infty$ and $(-\infty) + (-\infty) = -\infty$. On the other hand $+\infty + (-\infty)$ and $-\infty + (+\infty)$ are not defined. As usual too we set $a \cdot (\pm\infty) = \pm\infty$ for all real $a > 0$, including $a = +\infty$, and $a \cdot (\pm\infty) = \mp\infty$ for all real $a < 0$, including $a = -\infty$. Not so general but typical in measure theory are the additional conventions

$$0 \cdot (\pm\infty) = (\pm\infty) \cdot 0 = 0,$$

which mean that the product $a \cdot b$ is defined for all $a, b \in \overline{\mathbb{R}}$. The notation $A := B$ or $B =: A$ means that this equation is the definition of $A$ in terms of $B$. The < (resp., [...]

4. [...] $\Omega'$ a mapping. Then the system of sets

$$T^{-1}(\mathscr{A}') := \{T^{-1}(A') : A' \in \mathscr{A}'\} \tag{1.5}$$

is a $\sigma$-algebra in $\Omega$, as follows from the known behavior of the set-theoretic operations under inverse mappings (like $T^{-1}$ here).

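Every $\sigma$-algebra $\mathscr{A}$ has properties "dual" to (1.1) and (1.3), namely:

$$\emptyset \in \mathscr{A}, \tag{1.6}$$

$$(A_n)_{n \in \mathbb{N}} \subset \mathscr{A} \implies \bigcap_{n \in \mathbb{N}} A_n \in \mathscr{A}. \tag{1.7}$$

These follow from (1.1)-(1.3) and the identities $\emptyset = \complement\Omega$ and $\bigcap A_n = \complement\bigl(\bigcup \complement A_n\bigr)$. Moreover,

$$A_1 \cup \ldots \cup A_n = A_1 \cup \ldots \cup A_n \cup \emptyset \cup \emptyset \cup \ldots$$

and

$$A_1 \cap \ldots \cap A_n = A_1 \cap \ldots \cap A_n \cap \Omega \cap \Omega \cap \ldots$$

Therefore, along with any finite number of sets which $\mathscr{A}$ contains, it also contains their union and their intersection. From this observation and (1.2) follows as well:

$$A, B \in \mathscr{A} \implies A \setminus B = A \cap \complement B \in \mathscr{A}. \tag{1.8}$$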

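For constructing $\sigma$-algebras the following theorem is important:

1.2 Theorem. The intersection $\bigcap_{i \in I} \mathscr{A}_i$ of any family $(\mathscr{A}_i)_{i \in I}$ of $\sigma$-algebras in a common set $\Omega$ is itself a $\sigma$-algebra in $\Omega$.

Its proof is just a routine check of properties (1.1)-(1.3). It follows that for every system $\mathscr{E}$ of subsets of $\Omega$ there is a smallest $\sigma$-algebra $\sigma(\mathscr{E})$ which contains $\mathscr{E}$; that is, $\sigma(\mathscr{E})$ is a $\sigma$-algebra in $\Omega$ with the defining properties

(i) $\mathscr{E} \subset \sigma(\mathscr{E})$,
(ii) for every $\sigma$-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$, $\sigma(\mathscr{E}) \subset \mathscr{A}$.

For a proof, consider the system $\Sigma$ of all $\sigma$-algebras $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$; for example, $\mathscr{P}(\Omega)$ is an element of $\Sigma$. Then $\sigma(\mathscr{E})$ is the intersection of all the $\mathscr{A} \in \Sigma$, which according to 1.2 possesses all the desired properties. $\sigma(\mathscr{E})$ is called the $\sigma$-algebra generated by $\mathscr{E}$ (in $\Omega$) and $\mathscr{E}$ is called a generator of $\sigma(\mathscr{E})$.

Examples. 5. If $\mathscr{E}$ itself is a $\sigma$-algebra in $\Omega$, then $\mathscr{E} = \sigma(\mathscr{E})$.

6. If $\mathscr{E}$ consists of a single set $A \subset \Omega$, then $\sigma(\mathscr{E}) = \{\emptyset, A, \complement A, \Omega\}$.

7. The $\sigma$-algebra in Example 2 is generated by the system of all finite subsets of $\Omega$.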
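On a finite base set the generated σ-algebra can be computed by brute force, which gives a concrete feel for the abstract "intersection of all σ-algebras containing the generator". The following is only an illustrative sketch: the function name `generated_sigma_algebra` and the four-point base set are assumptions made for the example and do not come from the text. It relies on the fact that on a finite set closure under complements and finite unions already yields a σ-algebra.

```python
from itertools import combinations


def generated_sigma_algebra(omega, generator):
    """Smallest family containing `generator` that is closed under
    complement and union (finite base set only, illustrative helper)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(s) for s in generator}
    while True:
        # One closure step: all complements and all pairwise unions.
        new = {omega - a for a in family}
        new |= {a | b for a, b in combinations(family, 2)}
        if new <= family:      # nothing new appeared: the family is closed
            return family
        family |= new


OMEGA = {1, 2, 3, 4}           # hypothetical base set
A = {1, 2}

# Generator consisting of a single set A (compare Example 6).
sigma_A = generated_sigma_algebra(OMEGA, [A])

# Generator consisting of all one-element sets: every subset is obtained.
sigma_points = generated_sigma_algebra(OMEGA, [{w} for w in OMEGA])
```

Here `sigma_A` consists of the four sets ∅, A, ∁A, Ω, while `sigma_points` is the full power set of the four-point set.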

Several systems of sets possessing some of the properties of $\sigma$-algebras frequently occur as generators. Of special interest are rings of sets.

1.3 Definition. A system $\mathscr{R}$ of subsets of a set $\Omega$ is called a ring (in $\Omega$) if it has the following properties:

$$\emptyset \in \mathscr{R}; \tag{1.9}$$

$$A, B \in \mathscr{R} \implies A \setminus B \in \mathscr{R}; \tag{1.10}$$

$$A, B \in \mathscr{R} \implies A \cup B \in \mathscr{R}. \tag{1.11}$$

If in addition

$$\Omega \in \mathscr{R} \tag{1.12}$$

then $\mathscr{R}$ is called an algebra (in $\Omega$).

A ring contains with each two of its sets (and so, with each finite collection of its sets) not only their union, but also their intersection. This is because $A \cap B = A \setminus (A \setminus B)$.

1.4 Theorem. A system $\mathscr{R}$ of subsets of a set $\Omega$ is an algebra if and only if it has properties (1.1), (1.2) and (1.11).

Proof. By definition an algebra has properties (1.1) and (1.11) and (1.10), and from the latter follows (1.2). The converse follows from the fact that $\emptyset = \complement\Omega$, together with the set-theoretic identity

$$A \setminus B = A \cap \complement B = \complement(B \cup \complement A). \qquad \square$$

Examples. 8. Every $\sigma$-algebra is an algebra.

9. For any set $\Omega$ the system of all sets $A \subset \Omega$ which are either finite or co-finite (i.e., have finite complement in $\Omega$) is an algebra, but is a $\sigma$-algebra only if $\Omega$ is finite.

10. The system of all finite subsets of a set $\Omega$ is a ring, but is an algebra only if $\Omega$ itself is finite.

11. The smallest ring of subsets of a set $\Omega$ is the system $\{\emptyset\}$ consisting of the empty set alone.

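Exercises. 1. For every system $\mathscr{E}$ of subsets of a set $\Omega$ there exists a smallest ring $\rho(\mathscr{E})$ in $\Omega$ which contains $\mathscr{E}$. It is called the ring generated by $\mathscr{E}$. Prove this existence assertion. Determine $\rho(\mathscr{E})$ and $\sigma(\mathscr{E})$ in the case where $\mathscr{E}$ consists of two subsets $A$, $B$ of $\Omega$. When does $\rho(\mathscr{E}) = \sigma(\mathscr{E})$ hold in this latter case; when does it hold for general $\mathscr{E}$?

2. For sets $A$ and $B$

$$A \,\triangle\, B := (A \setminus B) \cup (B \setminus A)$$

is called their symmetric difference. Prove that it obeys the following rules of calculation (in which $A$, $B$, $C$ are arbitrary sets):

(a) $A \triangle B = B \triangle A$;
(b) $(A \triangle B) \triangle C = A \triangle (B \triangle C)$;
(c) $A \triangle A = \emptyset$; $A \triangle \emptyset = A$;
(d) $\complement A \triangle \complement B = A \triangle B$;
(e) $(A \triangle B) \cap C = (A \cap C) \triangle (B \cap C)$;
(f) $\bigl(\bigcup_{n \in \mathbb{N}} A_n\bigr) \triangle \bigl(\bigcup_{n \in \mathbb{N}} B_n\bigr) \subset \bigcup_{n \in \mathbb{N}} (A_n \triangle B_n)$

(for arbitrary sequences $(A_n)$ and $(B_n)$ of sets).

3. Deduce from exercise 2 that $\mathscr{R} \subset \mathscr{P}(\Omega)$ is a ring in a set $\Omega$ if and only if with respect to the operation $\triangle$ (as addition) and $\cap$ (as multiplication) $\mathscr{R}$ constitutes a commutative ring in the sense that the algebraists use that term.

4. A subset $\mathscr{N}$ of a ring $\mathscr{R}$ in a set $\Omega$ is called an ideal if it satisfies

(a) $\emptyset \in \mathscr{N}$;
(b) $N \in \mathscr{N},\ M \in \mathscr{R},\ M \subset N \implies M \in \mathscr{N}$;
(c) $M, N \in \mathscr{N} \implies M \cup N \in \mathscr{N}$.

Continuing with exercise 3, show that $\mathscr{N} \subset \mathscr{R}$ is an ideal in $\mathscr{R}$ if and only if it is an ideal in the algebraists' sense in the commutative ring $\mathscr{R}$. Every ideal in $\mathscr{R}$ is itself a ring in $\Omega$.

5. Let $\Omega := \mathbb{N}$ and for each $n \in \mathbb{N}$, let $\mathscr{A}_n$ denote the $\sigma$-algebra in $\Omega$ generated by the system $\mathscr{E}_n$ comprised of the singletons $\{1\}, \{2\}, \ldots, \{n\}$. Show that $\mathscr{A}_n$ consists of all subsets of $\Omega$ which are either contained in $\{1, 2, \ldots, n\}$ or contain the complement of this set. Obviously $\mathscr{A}_n \subset \mathscr{A}_{n+1}$ for every $n \in \mathbb{N}$. Why is $\bigcup_{n \in \mathbb{N}} \mathscr{A}_n$ nevertheless not a $\sigma$-algebra in $\Omega = \mathbb{N}$? [Hint: It is generally true of any isotone sequence $(\mathscr{A}_n)_{n \in \mathbb{N}}$ of rings in a set $\Omega$ that the union of all of them constitutes a $\sigma$-algebra if and only if they are equal from some index onward. Cf. OVERDIJK, SIMONS and THIEMANN [1979] and, for the special case of $\sigma$-algebras, BROUGHTON and HUFF [1977].]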

§2. Dynkin systems

It is often difficult to directly determine whether a given system of sets is a $\sigma$-algebra. The following concept, which goes back to DYNKIN [1961] but in inchoate form even to SIERPINSKI [1928], helps to get around some of these difficulties.

2.1 Definition. A system $\mathscr{D}$ of subsets of a set $\Omega$ is called a Dynkin system (in $\Omega$) if it has the following properties:

$$\Omega \in \mathscr{D}; \tag{2.1}$$

$$D \in \mathscr{D} \implies \complement D \in \mathscr{D}; \tag{2.2}$$

$$D_n \in \mathscr{D}\ (n \in \mathbb{N}) \text{ pairwise disjoint} \implies \bigcup_{n \in \mathbb{N}} D_n \in \mathscr{D}. \tag{2.3}$$

Every Dynkin system $\mathscr{D}$ thus contains the empty set $\emptyset = \complement\Omega$, and then (2.3) also ensures that $\mathscr{D}$ contains the union of every finite, pairwise disjoint collection of its sets.

Examples. 1. Every $\sigma$-algebra is obviously a Dynkin system.

2. Let $\Omega$ be a finite set with an even number $2n$ of elements ($n \in \mathbb{N}$). Then the system $\mathscr{D}$ of all $D \subset \Omega$ which contain an even number of elements is a Dynkin system. In case $n > 1$, $\mathscr{D}$ is not an algebra, hence certainly not a $\sigma$-algebra.
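The example of the even-sized subsets can be checked mechanically for a small base set. The sketch below is illustrative only: the helper names `is_dynkin_system` and `is_intersection_stable` and the four-element base set (the case n = 2) are assumptions introduced here. On a finite set, property (2.3) reduces to closure under the union of two disjoint members, which is what the code tests.

```python
from itertools import combinations


def subsets(omega):
    """All subsets of a finite set, as frozensets."""
    items = sorted(omega)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_dynkin_system(omega, family):
    """Check (2.1)-(2.3) for a family of subsets of a finite set."""
    omega = frozenset(omega)
    family = set(family)
    if omega not in family:                             # (2.1)
        return False
    if any(omega - d not in family for d in family):    # (2.2)
        return False
    # (2.3): on a finite set, unions of two disjoint members suffice.
    return all(a | b in family
               for a in family for b in family if not (a & b))


def is_intersection_stable(family):
    """Check closure under the intersection of two members."""
    family = set(family)
    return all(a & b in family for a in family for b in family)


OMEGA = frozenset({1, 2, 3, 4})                  # 2n elements with n = 2
EVEN = {d for d in subsets(OMEGA) if len(d) % 2 == 0}

dynkin_ok = is_dynkin_system(OMEGA, EVEN)        # True
stable = is_intersection_stable(EVEN)            # False: {1,2} ∩ {2,3} = {2}
```

The failure of intersection stability is exactly why this Dynkin system is not an algebra: an algebra would have to contain the one-element set {2}.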

The precise connection between the concepts of $\sigma$-algebra and Dynkin system is elucidated in the following considerations:

2.2 Lemma. Every Dynkin system $\mathscr{D}$ is closed with respect to the formation of proper complements, meaning that

$$D, E \in \mathscr{D},\ D \subset E \implies E \setminus D \in \mathscr{D}. \tag{2.2'}$$

Proof. According to what was noted right after definition 2.1, the set $D \cup \complement E$, being the union of the disjoint sets $D$ and $\complement E$ from $\mathscr{D}$, lies in $\mathscr{D}$. But then the complement of this set with respect to $\Omega$, that is, $E \cap \complement D = E \setminus D$, lies in $\mathscr{D}$. $\square$

Consequently, Dynkin systems can also be defined via properties (2.1), (2.2') and (2.3).

2.3 Theorem. A Dynkin system is a $\sigma$-algebra if and only if it contains the intersection of any two of its sets.

Proof. What needs to be shown is that every Dynkin system $\mathscr{D}$ which is closed under finite intersections is a $\sigma$-algebra. Of the defining properties of a $\sigma$-algebra, only (1.3) needs to be confirmed and we do that thus: According to (2.2') and the closure hypothesis, $A \setminus B = A \setminus (A \cap B)$ lies in $\mathscr{D}$ whenever $A, B \in \mathscr{D}$. Since $(A \setminus B) \cap B = \emptyset$ and $A \cup B = (A \setminus B) \cup B$, $\mathscr{D}$ contains the union of any two, hence the union of any finitely many, of its elements. For any sequence $(D_n)_{n \in \mathbb{N}} \subset \mathscr{D}$, we have

$$\bigcup_{n=1}^{\infty} D_n = \bigcup_{n=0}^{\infty} (D'_{n+1} \setminus D'_n)$$

in which $D'_0 := \emptyset$ and $D'_n := D_1 \cup \ldots \cup D_n$ for each $n \in \mathbb{N}$. The sets $D'_{n+1} \setminus D'_n$ are pairwise disjoint and, thanks to (2.2') and what has already been proved, they lie in $\mathscr{D}$. According to (2.3) then the union of the sets $D_n$ lies in $\mathscr{D}$. $\square$

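Just as for $\sigma$-algebras, algebras and rings, every system $\mathscr{E} \subset \mathscr{P}(\Omega)$ lies in a smallest Dynkin system. It is, of course, called the Dynkin system generated by $\mathscr{E}$, and is denoted $\delta(\mathscr{E})$. The significance of Dynkin systems lies primarily in the following fact:

2.4 Theorem. Every $\mathscr{E} \subset \mathscr{P}(\Omega)$ which is closed with respect to finite intersection satisfies

$$\delta(\mathscr{E}) = \sigma(\mathscr{E}). \tag{2.4}$$

Proof. Since every $\sigma$-algebra is a Dynkin system, $\sigma(\mathscr{E})$ is a Dynkin system containing $\mathscr{E}$ and consequently $\delta(\mathscr{E}) \subset \sigma(\mathscr{E})$. If conversely, $\delta(\mathscr{E})$ were known to be a $\sigma$-algebra, the dual relation $\sigma(\mathscr{E}) \subset \delta(\mathscr{E})$ would also follow. In view of 2.3 therefore it suffices to show that $\delta(\mathscr{E})$ is closed under intersection. To prove this, we introduce for every $D \in \delta(\mathscr{E})$ the system

$$\mathscr{D}_D := \{Q \in \mathscr{P}(\Omega) : Q \cap D \in \delta(\mathscr{E})\}.$$

A routine check confirms that $\mathscr{D}_D$ is a Dynkin system. For every $E \in \mathscr{E}$ the hypothesis on $\mathscr{E}$ ensures that $\mathscr{E} \subset \mathscr{D}_E$ and therewith that $\delta(\mathscr{E}) \subset \mathscr{D}_E$. Thus for every $D \in \delta(\mathscr{E})$ and every $E \in \mathscr{E}$ we have $E \cap D \in \delta(\mathscr{E})$; that is, $\mathscr{E} \subset \mathscr{D}_D$, and consequently $\delta(\mathscr{E}) \subset \mathscr{D}_D$, holding for every $D \in \delta(\mathscr{E})$. But this is just the property of $\delta(\mathscr{E})$ that had to be confirmed. $\square$

Systems of subsets which are closed under intersections (respectively, unions) of two, hence of any finite number, of their sets will from now on be described as $\cap$-stable (respectively, $\cup$-stable).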
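For finite base sets both generated systems can be computed by iterating the closure operations, which makes the content of Theorem 2.4 easy to observe. This is a sketch under stated assumptions: the names `close_under`, `dynkin_system` and `sigma_algebra`, as well as the two sample generators, are hypothetical and chosen only for illustration.

```python
def close_under(omega, generator, disjoint_only):
    """Smallest family containing omega and the generator that is closed
    under complement and under union (of disjoint pairs only, if requested).
    Finite base sets only."""
    omega = frozenset(omega)
    family = {omega} | {frozenset(s) for s in generator}
    while True:
        new = {omega - a for a in family}
        new |= {a | b for a in family for b in family
                if not disjoint_only or not (a & b)}
        if new <= family:
            return family
        family |= new


def dynkin_system(omega, generator):
    """delta(generator): closure under complements and disjoint unions."""
    return close_under(omega, generator, disjoint_only=True)


def sigma_algebra(omega, generator):
    """sigma(generator): closure under complements and arbitrary unions."""
    return close_under(omega, generator, disjoint_only=False)


OMEGA = {1, 2, 3, 4}

# An intersection-stable generator: {1} ∩ {1, 2} = {1} belongs to it.
stable_gen = [{1}, {1, 2}]
delta_stable = dynkin_system(OMEGA, stable_gen)
sigma_stable = sigma_algebra(OMEGA, stable_gen)

# A generator that is not intersection-stable: {1, 2} ∩ {2, 3} = {2}.
other_gen = [{1, 2}, {2, 3}]
delta_other = dynkin_system(OMEGA, other_gen)
sigma_other = sigma_algebra(OMEGA, other_gen)
```

For the intersection-stable generator the two families coincide (eight sets each). For the second generator the Dynkin system has only six members while the σ-algebra is the whole power set, so the stability hypothesis in the theorem cannot simply be dropped.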

Exercise. Determine the Dynkin system generated by the system $\mathscr{E}$ consisting of just two subsets $A$, $B$ of $\Omega$. Show that $\delta(\mathscr{E})$ and $\sigma(\mathscr{E})$ coincide just in case one of the sets $A \cap B$, $A \cap \complement B$, $B \cap \complement A$ or $\complement A \cap \complement B$ is empty.

§3. Contents, premeasures, measures

Combining the concepts of ring and $\sigma$-algebra with the properties (B) and (C) of lengths, areas and volumes that we encountered in the introduction leads to the basic concepts of measure theory.

3.1 Definition. Let $\mathscr{R}$ be a ring in $\Omega$ and $\mu$ a function on $\mathscr{R}$ with values in $[0, +\infty]$. $\mu$ is called a premeasure on $\mathscr{R}$ if

$$\mu(\emptyset) = 0 \tag{3.1}$$

and for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$

$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n) \qquad (\sigma\text{-additivity}) \tag{3.2}$$

holds. $\mu$ is called a content if instead of (3.2) it only satisfies

$$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu(A_i) \qquad (\text{finite additivity}) \tag{3.3}$$

(for every two and therewith) for every finitely many pairwise disjoint sets $A_1, \ldots, A_n \in \mathscr{R}$.

Due to (3.1) every premeasure is evidently a content. To see this, you have only to take $A_{n+1} = A_{n+2} = \ldots = \emptyset$ in (3.2).

Examples. 1. For every ring $\mathscr{R}$ in $\Omega$ and every point $\omega \in \Omega$ the function $\varepsilon_\omega$ defined on $\mathscr{R}$ by

$$\varepsilon_\omega(A) := \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases}$$

is a premeasure. It is called the premeasure defined by unit mass at $\omega$.

2. Let $\mathscr{A}$ be the $\sigma$-algebra defined in Example 2 of §1, for an uncountable set $\Omega$, say for $\Omega = \mathbb{R}$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is countable. Since of two disjoint subsets of $\Omega$ at most one can have a countable complement, property (3.2) is easily confirmed; thus $\mu$ is a premeasure on $\mathscr{A}$.

3. Let $\mathscr{A}$ be the algebra defined in Example 9 of §1, for a countably infinite set $\Omega$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is finite. Then $\mu$ is a content but not a premeasure. The first assertion has a proof analogous to that in the preceding example, the second follows from the fact that $\Omega$ is the disjoint union of countably many 1-element sets.

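4. Let $\mu_1, \mu_2, \ldots$ be a sequence of contents (premeasures) on a ring $\mathscr{R}$, and let $\alpha_1, \alpha_2, \ldots$ be a sequence of non-negative real numbers. Then

$$\mu := \sum_{n=1}^{\infty} \alpha_n \mu_n$$

is also a content (premeasure) on $\mathscr{R}$.

Every content $\mu$ on a ring $\mathscr{R}$ enjoys the following further properties (in which $A, B, A_1, B_1, \ldots \in \mathscr{R}$):

$$\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B); \tag{3.5}$$

$$A \subset B \implies \mu(A) \le \mu(B) \qquad (\text{isotoneity}); \tag{3.6}$$

$$A \subset B,\ \mu(A) < +\infty \implies \mu(B \setminus A) = \mu(B) - \mu(A) \qquad (\text{subtractivity}); \tag{3.7}$$

$$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) \le \sum_{i=1}^{n} \mu(A_i) \qquad (\text{subadditivity}); \tag{3.8}$$

for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$

$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr). \tag{3.9}$$

[...] 0, delivers both (3.6) and (3.7). If we set $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, ..., $B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$, then $B_1, \ldots, B_n$ are pairwise disjoint sets from $\mathscr{R}$, which entails that

$$\mu\Bigl(\bigcup_{i=1}^{n} B_i\Bigr) = \sum_{i=1}^{n} \mu(B_i).$$

From the facts that $B_i \subset A_i$ ($i = 1, \ldots, n$), $\mu$ is isotone, and $\bigcup_{i=1}^{n} B_i = \bigcup_{i=1}^{n} A_i$ now follows (3.8). To prove (3.9) we only have to observe that for every sequence $(A_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathscr{R}$ with $A := \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{R}$

$$\mu(A_1) + \ldots + \mu(A_m) = \mu(A_1 \cup \ldots \cup A_m) \le \mu(A) \qquad (m \in \mathbb{N})$$

and let $m \to \infty$.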
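The disjointification step used in the proof of (3.8), replacing each set by the part not already covered by its predecessors, is easy to try out on finite sets. The sketch below uses the counting measure (the number of elements) as a stand-in content; the function name `disjointify` and the sample sets are assumptions made for illustration.

```python
def disjointify(sets):
    """Return B_1, ..., B_n where B_i is A_i minus (A_1 | ... | A_{i-1}).

    The B_i are pairwise disjoint, satisfy B_i <= A_i, and have the same
    union as the A_i.
    """
    seen = set()
    result = []
    for a in sets:
        result.append(set(a) - seen)   # keep only what is new in A_i
        seen |= set(a)
    return result


A = [{1, 2, 3}, {3, 4}, {1, 4, 5}]     # hypothetical sets, not disjoint
B = disjointify(A)                     # [{1, 2, 3}, {4}, {5}]

union_A = set().union(*A)
mu_union = len(union_A)                # counting measure of the union: 5
sum_B = sum(len(b) for b in B)         # 5, equals mu_union by additivity
sum_A = sum(len(a) for a in A)         # 8, an upper bound as in (3.8)
```

Finite additivity gives equality for the disjoint pieces, and isotoneity applied to each piece then yields the subadditivity inequality for the original sets.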

Finally, if $\mu$ is a premeasure on $\mathscr{R}$, then for any sets $A_0, A_1, \ldots \in \mathscr{R}$

$$A_0 \subset \bigcup_{n=1}^{\infty} A_n \implies \mu(A_0) \le \sum_{n=1}^{\infty} \mu(A_n). \tag{3.10}$$

Because of $A_0 = \bigcup (A_0 \cap A_n)$ and (3.6), we can assume, in verifying (3.10), that $A_0 = \bigcup A_n$. Then set $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, ..., $B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$ and proceed as in the proof of (3.8). In particular, we now have

$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) \le \sum_{n=1}^{\infty} \mu(A_n) \tag{3.10'}$$

whenever all the sets $A_n$ as well as their union lie in $\mathscr{R}$.

The following theorem characterizes premeasures via other properties related to the $\sigma$-additivity. Its formulation is facilitated by the notations:

$$E_n \uparrow E \quad\text{and}\quad E_n \downarrow E \tag{3.11}$$

which mean that the sets $E_1 \subset E_2 \subset \ldots$ satisfy $E = \bigcup E_n$, or that the sets $E_1 \supset E_2 \supset \ldots$ satisfy $E = \bigcap E_n$. In other words, the sequence $(E_n)$ either increases isotonically to $E$ or decreases antitonically to $E$.

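3.2 Theorem. For a content $\mu$ on a ring $\mathscr{R}$ consider the following statements:

(a) $\mu$ is a premeasure.
(b) $A_n, A \in \mathscr{R}$ with $A_n \uparrow A \implies \lim_{n \to \infty} \mu(A_n) = \mu(A)$ (continuity from below).
(c) $A_n, A \in \mathscr{R}$ with $A_n \downarrow A$ and $\mu(A_n) < +\infty$ for all $n \implies \lim_{n \to \infty} \mu(A_n) = \mu(A)$ (continuity from above).
(d) $A_n \in \mathscr{R}$ with $A_n \downarrow \emptyset$ and $\mu(A_n) < +\infty$ for all $n \implies \lim_{n \to \infty} \mu(A_n) = 0$ (continuity at $\emptyset$).

Then the following implications hold:

$$\text{(a)} \iff \text{(b)} \implies \text{(c)} \iff \text{(d)}.$$

If $\mu$ is finite on $\mathscr{R}$, that is, $\mu(A) < +\infty$ for all $A \in \mathscr{R}$, then all four statements (a)-(d) are equivalent.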

Proof. (a)$\implies$(b): Defining $A_0 := \emptyset$, the sets $B_n := A_n \setminus A_{n-1}$ ($n \in \mathbb{N}$) are pairwise disjoint, lie in $\mathscr{R}$ and satisfy

$$A = \bigcup_{n=1}^{\infty} B_n, \qquad A_n = B_1 \cup \ldots \cup B_n.$$

Therefore on account of the $\sigma$-additivity of $\mu$

$$\mu(A) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_{n \to \infty} \sum_{i=1}^{n} \mu(B_i) = \lim_{n \to \infty} \mu(A_n).$$

(b)$\implies$(a): Let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{R}$ whose union $A := \bigcup A_n$ also is in $\mathscr{R}$. If we set $B_n := A_1 \cup \ldots \cup A_n$, then $B_n \in \mathscr{R}$ and $B_n \uparrow A$; therefore $\mu(A) = \lim \mu(B_n)$. As a result of the finite additivity of $\mu$

$$\mu(B_n) = \mu(A_1) + \ldots + \mu(A_n)$$

and therefore $\mu(A) = \sum \mu(A_n)$. Thus $\mu$ is $\sigma$-additive, and consequently is a premeasure.

(b)$\implies$(c): According to (3.7), $\mu(A_1 \setminus A_n) = \mu(A_1) - \mu(A_n)$ for every $n \in \mathbb{N}$. From $A_n \downarrow A$ follows $A_1 \setminus A_n \uparrow A_1 \setminus A$, and all the sets appearing here are in $\mathscr{R}$. From (b) therefore

$$\mu(A_1 \setminus A) = \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n).$$

From this follows (c), because $A \subset A_n$ means that also $\mu(A) < +\infty$ and so $\mu(A_1 \setminus A) = \mu(A_1) - \mu(A)$.

(c)$\implies$(d): Here there is nothing to prove!

(d)$\implies$(c): From $A_n \downarrow A$ follows $A_n \setminus A \downarrow \emptyset$. Since $A_n \setminus A \subset A_n$, the isotoneity of $\mu$ means that along with $\mu(A_n)$, $\mu(A_n \setminus A)$ is finite too. Hence by (d), $\lim \mu(A_n \setminus A) = 0$. But then (c) follows because $\mu(A) \le \mu(A_n) < +\infty$, causing $\mu(A_n \setminus A)$ to equal $\mu(A_n) - \mu(A)$.

To finish off, let us consider the case that $\mu$ is finite, and show that then (d)$\implies$(b): If $(A_n)$ is a sequence of sets from $\mathscr{R}$ and $A_n \uparrow A \in \mathscr{R}$, then $A \setminus A_n \downarrow \emptyset$. Taking account of the finiteness of $\mu$, it therefore follows that $0 = \lim \mu(A \setminus A_n) = \lim [\mu(A) - \mu(A_n)]$ and therewith (b). $\square$

Remark. If one modifies Example 3 of this section by making $\mu(A) := 0$ for all finite sets and $\mu(A) := +\infty$ for all cofinite sets, then one gets a content that is continuous at $\emptyset$ but is not a premeasure. Thus without the finiteness hypothesis in the preceding theorem, statements (a)-(d) are not generally equivalent. On the other hand, in (c) and (d) it is enough to explicitly hypothesize $\mu(A_n) < +\infty$ for some $n \in \mathbb{N}$, as then $\mu(A_m) < +\infty$ for all $m \ge n$ (isotoneity).

The concepts of content and premeasure are preliminary to the central concept of this book, that of a measure.

3.3 Definition. A premeasure defined on a $\sigma$-algebra $\mathscr{A}$ of subsets of a set $\Omega$ is called a measure (on $\mathscr{A}$). The function value $\mu(A)$ of $\mu$ at an $A \in \mathscr{A}$ is called the ($\mu$-)measure or the ($\mu$-)mass of $A$. If $\mu(\Omega) < +\infty$ (and consequently $\mu(A) < +\infty$ for every $A \in \mathscr{A}$), the measure $\mu$ is called finite.

Thus a measure is a non-negative, numerical function $\mu$ defined on a $\sigma$-algebra $\mathscr{A}$ and enjoying properties (3.1) and (3.2). The constant function $\mu = 0$ is a measure on every $\sigma$-algebra, the so-called zero-measure. The examples that follow are still of a rather formal nature. But as early as §6 and then quite a bit later we will become acquainted with an abundance of important examples.

Examples. 5. If for the ring $\mathscr{R}$ in Example 1 one takes a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, then $\varepsilon_\omega$ is a measure on $\mathscr{A}$, called the measure defined by a unit point mass at $\omega$, or more briefly the unit mass at $\omega$, and also the Dirac measure at $\omega$. These designations derive from interpreting a measure $\mu$ on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ as a mass distribution over $\Omega$. Accordingly for $A \in \mathscr{A}$, $\mu(A)$ is viewed as the mass that has been "smeared" over $A$. The Dirac measure at $\omega$ has, in so far as the one-element set $\{\omega\}$ lies in $\mathscr{A}$, all of its (unit) mass concentrated at the point $\omega$: $\varepsilon_\omega(\{\omega\}) = 1$, $\varepsilon_\omega(\complement\{\omega\}) = 0$.

6. Let $\Omega$ be an arbitrary set. For every $A \in \mathscr{P}(\Omega)$ let $|A|$ denote the number of elements in $A$ in case $A$ is finite, and otherwise $+\infty$. Then $\zeta(A) := |A|$ defines a measure on $\mathscr{P}(\Omega)$, called the counting measure on $\Omega$ (or on $\mathscr{P}(\Omega)$). Its restriction to a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is called counting measure on $\mathscr{A}$.

7. The premeasure defined in Example 2 is a measure.

Next we derive a not-so-obvious consequence of the $\sigma$-additivity of measures.

3.4 Lemma. Let $\mu$ be a measure on a $\sigma$-algebra $\mathscr{A}$ and $(A_n)_{n \in \mathbb{N}}$ a sequence of sets from $\mathscr{A}$. Suppose there is a $k \in \mathbb{N}$ such that the sets $A_m$ and $A_n$ are disjoint whenever their indices satisfy $|m - n| \ge k$. Then

$$\sum_{n=1}^{\infty} \mu(A_n) \le k\, \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr). \tag{3.12}$$

When $k = 1$ this is, in view of (3.10'), just the $\sigma$-additivity requirement of a measure.

Proof. Designate the union of all the $A_n$ as $C$. For each $r = 1, \ldots, k$ the sets $(A_{r+mk})_{m \in \mathbb{N}_0}$ are pairwise disjoint. So if we set

$$F_r := \bigcup_{m=0}^{\infty} A_{r+mk}$$

then

$$\sum_{m=0}^{\infty} \mu(A_{r+mk}) = \mu(F_r) \le \mu(C)$$

because $F_r \subset C$. Since the sum of a series of non-negative terms is independent of the ordering of the terms, it follows that

$$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{r=1}^{k} \mu(F_r).$$

From this equality and the preceding inequality the asserted inequality can be read off. $\square$
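A small numerical check can make inequality (3.12) tangible. The sketch below uses the counting measure on finite sets and a finite sequence of overlapping "windows" of length k, for which sets with index distance at least k are disjoint. The function name `lemma_3_4_sides` and the sample data are assumptions introduced for this illustration only.

```python
def lemma_3_4_sides(sets, k, measure=len):
    """Return (sum of measures, k * measure of the union) for a finite
    sequence of sets, after checking the disjointness hypothesis."""
    for m, a in enumerate(sets):
        for n, b in enumerate(sets):
            if abs(m - n) >= k and a & b:
                raise ValueError("sets with index distance >= k must be disjoint")
    union = set().union(*sets)
    return sum(measure(a) for a in sets), k * measure(union)


k = 3
# A_n = {n, n+1, n+2}: A_m and A_n are disjoint exactly when |m - n| >= 3.
windows = [set(range(n, n + k)) for n in range(1, 6)]

lhs, rhs = lemma_3_4_sides(windows, k)   # 15 and 3 * 7 = 21
```

Each point of the union is counted at most k times on the left-hand side, which is the counting-measure version of the grouping into the sets F_r used in the proof.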

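Exercises. 1. Let $\Omega$ be a finite, non-empty set. Show that the counting measure $\zeta$ on $\mathcal{P}(\Omega)$ coincides with $\sum_{\omega\in\Omega} \varepsilon_\omega$. Show further that every measure $\mu$ on $\mathcal{P}(\Omega)$ has the form $\mu = \sum_{\omega\in\Omega} \alpha_\omega \varepsilon_\omega$, with each $\alpha_\omega := \mu(\{\omega\})$.

2. For a finite content $\mu$ on a ring $\mathcal{R}$ establish the following input-output formula generalizing equality (3.5): For all $n \in \mathbb{N}$, $A_1, \dots, A_n \in \mathcal{R}$

$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu(A_i) - \sum \mu(A_i \cap A_j) + \dots$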

… each a union of pairwise disjoint intervals, then $\lambda(I_1) + \dots + \lambda(I_n) = \lambda(J_1) + \dots + \lambda(J_m)$.

Indeed, $I_j = I_j \cap F = \bigcup_{i=1}^{m} (I_j \cap J_i)$ is a representation of $I_j$ as a union of the pairwise disjoint intervals $I_j \cap J_1, \dots, I_j \cap J_m$, and thanks to (c)

$\lambda(I_j) = \sum_{i=1}^{m} \lambda(I_j \cap J_i)$ $\quad (j = 1, \dots, n)$.

§4. Lebesgue premeasure

Upon interchanging the roles of $i$ and $j$, one gets analogously

$\lambda(J_i) = \sum_{j=1}^{n} \lambda(I_j \cap J_i)$ $\quad (i = 1, \dots, m)$.

Together these last two equations entail the equality $\sum_{j} \lambda(I_j) = \sum_{i} \lambda(J_i)$.

(e) Thus for every $F \in \mathcal{F}^d$ the number $\sum_{j} \lambda(I_j)$ is independent of the special representation $F = I_1 \cup \dots \cup I_n$ of $F$ as a union of finitely many pairwise disjoint $I_1, \dots, I_n \in \mathcal{I}^d$. Therefore the decree

$\lambda(F) := \lambda(I_1) + \dots + \lambda(I_n)$

well defines an extension, to be denoted still by $\lambda$, of the original function on $\mathcal{I}^d$ to one on $\mathcal{F}^d$. This function is real-valued, non-negative, and according to (d) finitely additive. Since $\emptyset \in \mathcal{F}^d$ and $\lambda(\emptyset) = 0$, a content with the sought-for properties is at hand. $\Box$
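The independence of the content from the chosen representation can be illustrated in dimension $d = 1$. This is a sketch under stated assumptions: a half-open interval $[a, b[$ is encoded as a pair `(a, b)`, and the helpers `content` and `refine` as well as the two sample representations are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction


def content(intervals):
    """Sum of the lengths b - a over pairwise disjoint half-open intervals [a, b[."""
    ordered = sorted(intervals)
    # Reject representations whose intervals overlap.
    for (_, b1), (a2, _) in zip(ordered, ordered[1:]):
        if a2 < b1:
            raise ValueError("intervals are not pairwise disjoint")
    return sum(b - a for a, b in ordered)


def refine(rep_i, rep_j):
    """Common refinement: all non-empty intersections of an I_j with a J_i."""
    pieces = []
    for a, b in rep_i:
        for c, d in rep_j:
            lo, hi = max(a, c), min(b, d)
            if lo < hi:
                pieces.append((lo, hi))
    return pieces


F = Fraction
# Two representations of the same figure [0, 3[ union [5, 6[.
rep_i = [(F(0), F(1)), (F(1), F(3)), (F(5), F(6))]
rep_j = [(F(0), F(2)), (F(2), F(3)), (F(5), F(11, 2)), (F(11, 2), F(6))]

print(content(rep_i), content(rep_j), content(refine(rep_i, rep_j)))
```

The common refinement plays the role of the intervals $I_j \cap J_i$ in steps (d) and (e): both representations are compared with it rather than with each other.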

4.4 Theorem. The content $\lambda$ on $\mathcal{F}^d$ is a premeasure.

Proof. Because $\lambda$ is finite, 3.2 says that we only need to prove the continuity of $\lambda$ at $\emptyset$. To this end, let $(F_n)$ be an antitone sequence of figures from $\mathcal{F}^d$. We will show that from the assumption that

$\delta := \lim_{n\to\infty} \lambda(F_n) = \inf_{n\in\mathbb{N}} \lambda(F_n) > 0$

follows

$\bigcap_{n\in\mathbb{N}} F_n \ne \emptyset$.

Each $F_n$ being a union of finitely many pairwise disjoint intervals from $\mathcal{I}^d$, it should be clear that by a slight leftward shift of the right endpoints of each of these intervals a new figure $G_n \in \mathcal{F}^d$ is created, whose topological closure $\overline{G}_n$ is still a subset of $F_n$, and $\lambda(F_n) - \lambda(G_n) \le 2^{-n}\delta$.

If we set $H_n := G_1 \cap \dots \cap G_n$, then $(H_n)$ is a sequence of sets from $\mathcal{F}^d$ satisfying $H_n \supset H_{n+1}$, $\overline{H}_n \subset \overline{G}_n \subset F_n$ for all $n$. Because $F_n$ is bounded its closed subset $\overline{H}_n$ is compact. As soon as we succeed in showing that each $H_n$ is not empty, it will follow from the finite-intersection property of compacts (WILLARD [1970], p. 118, KELLEY [1955], p. 136) that $\bigcap_{n\in\mathbb{N}} \overline{H}_n \ne \emptyset$ and so a fortiori $\bigcap_{n\in\mathbb{N}} F_n \ne \emptyset$. So let us prove that no $H_n$ is empty. For every $n \in \mathbb{N}$

(*) $\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta$,

as we will confirm by induction. The inequality holds for $n = 1$ because $H_1 = G_1$, and by choice of $G_1$, $\lambda(F_1) - \lambda(G_1) \le 2^{-1}\delta$. Suppose the inequality valid for


1. Measure Theory

some $n$. Since $H_{n+1} = G_{n+1} \cap H_n$ and everything is finite, (3.5) gives

$\lambda(H_{n+1}) = \lambda(G_{n+1}) + \lambda(H_n) - \lambda(G_{n+1} \cup H_n)$.

From the induction hypothesis $\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta$; from the choice of $G_{n+1}$, $\lambda(G_{n+1}) \ge \lambda(F_{n+1}) - 2^{-n-1}\delta$; and $G_{n+1} \cup H_n \subset F_{n+1} \cup F_n = F_n$, so that $\lambda(G_{n+1} \cup H_n) \le \lambda(F_n)$. Combining these observations completes the inductive step in the confirmation of (*):

$\lambda(H_{n+1}) \ge \lambda(F_{n+1}) - 2^{-n-1}\delta - (1 - 2^{-n})\delta = \lambda(F_{n+1}) - (1 - 2^{-n-1})\delta$.

Recalling that $\lambda(F_n) \ge \delta$ by definition of $\delta$, we infer from (*) the inequality $\lambda(H_n) \ge 2^{-n}\delta > 0$ and therewith the fact that $H_n \ne \emptyset$, the last link that had to be accounted for in the logical chain. $\Box$

4.5 Definition. The premeasure $\lambda$ on the ring $\mathcal{F}^d$ of $d$-dimensional figures in $\mathbb{R}^d$ is called Lebesgue premeasure in $\mathbb{R}^d$ or $d$-dimensional Lebesgue premeasure. From now on it will be denoted by $\lambda^d$.

Here we encounter for the first time the name of the French mathematician H. LEBESGUE (1875-1941), the inventor of the measure and integration concepts that today are named after him. The development of the theory of measure and integration was spurred above all by his investigations and those of his countryman E. BOREL (1871-1956). For the history of Lebesgue integration see DIEUDONNE [1978] and HAWKINS [1970].

Exercises. 1. Show that on 91 there is exactly one content p that assigns to the right halfopen interval [a,,3[, a, f3 E R, the following values

if $\alpha$ …

… $> 0$, then each $\mathcal{U}(Q_n)$ contains a sequence $(A_{nm})_{m\in\mathbb{N}}$ such that

$\sum_{m=1}^{\infty} \mu(A_{nm}) \le \mu^*(Q_n) + 2^{-n}\varepsilon$.

The double sequence $(A_{nm})_{n,m\in\mathbb{N}}$ lies in $\mathcal{U}\bigl(\bigcup_{n=1}^{\infty} Q_n\bigr)$ and as a consequence the definition of $\mu^*$ gives

$\mu^*\Bigl(\bigcup_{n=1}^{\infty} Q_n\Bigr) \le \sum_{n,m\in\mathbb{N}} \mu(A_{nm}) \le \sum_{n=1}^{\infty} \mu^*(Q_n) + \varepsilon$

and (5.4) follows from this and the arbitrariness of $\varepsilon > 0$. It is immediate from the definition that

(5.5) $\mu^* \ge 0$.

Decisive for what follows is the fact that every $A \in \mathcal{R}$ satisfies

(5.6) $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap \complement A)$ for every $Q \in \mathcal{P}(\Omega)$,

as well as

(5.7) $\mu^*(A) = \mu(A)$.

In proving (5.6) we can again assume $\mu^*(Q) < +\infty$, so that $\mathcal{U}(Q) \ne \emptyset$. First of all we have

$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n \cap A) + \sum_{n=1}^{\infty} \mu(A_n \setminus A)$

for every sequence $(A_n)$ from $\mathcal{U}(Q)$, due to the finite additivity of $\mu$. Moreover, the sequence $(A_n \cap A)$ lies in $\mathcal{U}(Q \cap A)$ and the sequence $(A_n \setminus A)$ lies in $\mathcal{U}(Q \setminus A)$. Consequently,

$\sum_{n=1}^{\infty} \mu(A_n) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$

for every such sequence $(A_n)$, and from this fact (5.6) is immediate. Equality (5.7) follows on the one hand from (3.10), according to which $\mu(A) \le \mu^*(A)$, and on the other hand from consideration of the sequence $A, \emptyset, \emptyset, \dots$ which lies in $\mathcal{U}(A)$.

The significance of what has been proven lies in the fact, which we will establish, that the system $\mathcal{A}^*$ of all sets $A \in \mathcal{P}(\Omega)$ satisfying (5.6) is a $\sigma$-algebra in $\Omega$ and the restriction of $\mu^*$ to $\mathcal{A}^*$ is a measure. Now (5.6) as just proved says that $\mathcal{R} \subset \mathcal{A}^*$, and so we shall have $\sigma(\mathcal{R}) \subset \mathcal{A}^*$. Then according to (5.7) $\tilde{\mu} := \mu^* \,|\, \sigma(\mathcal{R})$ is an extension of $\mu$ to a measure on $\sigma(\mathcal{R})$. The definition and theorem which follow will therefore complete the present proof. $\Box$

5.2 Definition. A numerical function $\mu^*$ on the power set $\mathcal{P}(\Omega)$ having properties (5.2)-(5.4) is called an outer measure on the set $\Omega$. A subset $A$ of $\Omega$ is called $\mu^*$-measurable if it satisfies (5.6).

Notice that $\mu^* \ge 0$ always prevails, an immediate consequence of (5.2) and (5.3) together. The idea in the proof of the measure-extension theorem, which goes back to C. CARATHÉODORY (1873-1950), consists in associating via definition (5.1) an outer measure to the premeasure $\mu$ on $\mathcal{R}$ and then invoking the following theorem.
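On a finite set the $\mu^*$-measurable sets of Definition 5.2 can be found by brute force. The sketch below is illustrative only: the function `capped` is a hypothetical outer measure chosen because it is monotone and subadditive without being additive, and the helper names `powerset` and `measurable_sets` are assumptions of this example.

```python
from itertools import chain, combinations


def powerset(omega):
    """All subsets of a finite set, as frozensets."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def measurable_sets(omega, outer):
    """All A with outer(Q) == outer(Q & A) + outer(Q - A) for every subset Q."""
    subsets = powerset(omega)
    return [
        a
        for a in subsets
        if all(outer(q) == outer(q & a) + outer(q - a) for q in subsets)
    ]


def capped(q):
    # Hypothetical outer measure: zero on the empty set, monotone,
    # subadditive, but not additive on disjoint sets.
    return min(len(q), 2)


def counting(q):
    # The counting measure, viewed as an outer measure.
    return len(q)


omega = frozenset({0, 1, 2})
print(measurable_sets(omega, capped))
print(len(measurable_sets(omega, counting)))
```

For `capped` only the empty set and $\Omega$ pass the test, whereas for the counting measure every subset does; in both cases the surviving sets form a $\sigma$-algebra.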

§5. Extension of a premeasure to a measure


5.3 Theorem (Carathéodory). Let $\mu^*$ be an outer measure on a set $\Omega$. Then the system $\mathcal{A}^*$ of all $\mu^*$-measurable sets $A \subset \Omega$ is a $\sigma$-algebra in $\Omega$. Moreover, the restriction of $\mu^*$ to $\mathcal{A}^*$ is a measure.

Proof. First let us note that the requirement (5.6) for a subset $A$ of $\Omega$ to lie in $\mathcal{A}^*$ is equivalent to

(5.6') $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \setminus A)$ for all $Q \in \mathcal{P}(\Omega)$,

because from (5.4) applied to the sequence $Q \cap A, Q \setminus A, \emptyset, \emptyset, \dots$ follows the reverse of inequality (5.6), for every $Q \in \mathcal{P}(\Omega)$. From either (5.6) or (5.6') it is immediate that $\Omega \in \mathcal{A}^*$, and because of their symmetry in $A$ and $\complement A$, whenever $A$ lies in $\mathcal{A}^*$, so does $\complement A$. The following considerations will show that with each two of its sets $A$ and $B$, $\mathcal{A}^*$ also contains their union $A \cup B$, and so $\mathcal{A}^*$ is an algebra. $B \in \mathcal{A}^*$ entails that

$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \setminus B)$

for every $Q \in \mathcal{P}(\Omega)$. Replacing $Q$ here first by $Q \cap A$, then by $Q \setminus A = Q \cap \complement A$, we get two new equalities (valid for all $Q \in \mathcal{P}(\Omega)$) which, when inserted into (5.6'), lead to

$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B) + \mu^*(Q \cap \complement A \cap \complement B)$.

Replacing $Q$ here by $Q \cap (A \cup B)$ gives

(5.8) $\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B)$,

which in conjunction with the preceding equality yields

$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap \complement A \cap \complement B) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \setminus (A \cup B))$.

This being valid for all $Q \in \mathcal{P}(\Omega)$ affirms that $A \cup B \in \mathcal{A}^*$.

Now let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathcal{A}^*$ and $A$ be their union. The choice of $A := A_1$, $B := A_2$ in (5.8) produces

$\mu^*(Q \cap (A_1 \cup A_2)) = \mu^*(Q \cap A_1) + \mu^*(Q \cap A_2)$.

An induction argument generalizes this to

$\mu^*\Bigl(Q \cap \bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu^*(Q \cap A_i)$

for all $Q \in \mathcal{P}(\Omega)$, all $n \in \mathbb{N}$. Recalling that $B_n := \bigcup_{i=1}^{n} A_i$ has already been proven to be in $\mathcal{A}^*$, and that $Q \setminus B_n \supset Q \setminus A$, so that $\mu^*(Q \setminus B_n) \ge \mu^*(Q \setminus A)$, we obtain

$\mu^*(Q) = \mu^*(Q \cap B_n) + \mu^*(Q \setminus B_n) \ge \sum_{i=1}^{n} \mu^*(Q \cap A_i) + \mu^*(Q \setminus A)$

for all $n \in \mathbb{N}$. From this and an application of (5.4) follows

$\mu^*(Q) \ge \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \setminus A) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$

and consequently, as noted at the beginning of the proof, we actually have equality throughout:

$\mu^*(Q) = \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \setminus A) = \mu^*(Q \cap A) + \mu^*(Q \setminus A)$,

holding for all $Q \in \mathcal{P}(\Omega)$. Thus $A$ lies in $\mathcal{A}^*$. After all this we recognize that the algebra $\mathcal{A}^*$ is an $\cap$-stable Dynkin system and therefore by Theorem 2.3 a $\sigma$-algebra. If in the last pair of equalities we take $Q := A$, we get

$\mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A_n)$,

proving that the restriction of $\mu^*$ to $\mathcal{A}^*$ is a measure. $\Box$

It can be further shown that in many important cases the measure $\tilde{\mu}$ from Theorem 5.1 is uniquely determined. As a preliminary we give a proof that is a typical application of the technique of Dynkin systems. (Cf. also Exercise 9.)

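5.4 Theorem (Uniqueness theorem). Let $\mathcal{E}$ be an $\cap$-stable generator of a $\sigma$-algebra $\mathcal{A}$ in $\Omega$ and suppose that $(E_n)$ is a sequence in $\mathcal{E}$ with $\bigcup_{n\in\mathbb{N}} E_n = \Omega$. Then measures $\mu_1$ and $\mu_2$ on $\mathcal{A}$ which satisfy

(i) $\mu_1(E) = \mu_2(E)$ for all $E \in \mathcal{E}$

$\mu_1(E_n) = \mu_2(E_n)$ …

… $\varepsilon > 0$ be given. At issue is the existence of a $C \in \mathcal{A}_0$ with $\mu(A \triangle C) \le \varepsilon$. According to 5.1 and 5.6, especially the equation (5.1) which extends $\mu \,|\, \mathcal{A}_0$ to $\mathcal{A}$, there exists a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathcal{A}_0$ which covers $A$ and satisfies

(5.11) $0 \le \sum_{n=1}^{\infty} \mu(A_n) - \mu(A) < \frac{\varepsilon}{2}$.

If we set $C_n := \bigcup_{i=1}^{n} A_i$, $n \in \mathbb{N}$, then $A' := \bigcup_{n\in\mathbb{N}} A_n$ satisfies

$C_n \uparrow A'$ and $A' \setminus C_n \downarrow \emptyset$.

Since $\mu$ is finite, and consequently continuous at $\emptyset$, an $n_0 \in \mathbb{N}$ exists for which

(5.12) $\mu(A' \setminus C_{n_0}) < \frac{\varepsilon}{2}$.

Let us show that the set $C := C_{n_0} \in \mathcal{A}_0$ does what is wanted:

$A \triangle C = (A \setminus C) \cup (C \setminus A) \subset (A' \setminus C) \cup (A' \setminus A)$,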

and so the subadditivity of $\mu$ yields

$\mu(A \triangle C)$ …

… 0, (6.9) entails that the function $F$ must be isotone. Moreover, $F$ has to be left-continuous. This is because for every $x \in \mathbb{R}$ and every sequence $(x_n)$ in $\mathbb{R}$ with $x_n \uparrow x$, the corresponding interval behavior is $[x_1, x_n[ \,\uparrow\, [x_1, x[$, and since $\mu$ must be continuous from below, it follows that

$\lim_{n\to\infty} F(x_n) - F(x_1) = \lim_{n\to\infty} \mu([x_1, x_n[) = \mu([x_1, x[) = F(x) - F(x_1)$,

that is, $\lim_{n\to\infty} F(x_n) = F(x)$, $F$ is left-continuous at $x$.

Functions $F : \mathbb{R} \to \mathbb{R}$ which are isotone and left-continuous will be called measure-generating (or measure-defining) functions (on $\mathbb{R}$). Of course, whenever $F$ is such a function, so is $aF + b$ for any $a \in \mathbb{R}_+$, $b \in \mathbb{R}$. The designation "measure-generating" is justified by the next theorem, which answers completely the earlier question of what are the appropriate conditions on $F$.

6.5 Theorem. To every measure-generating function $F$ on $\mathbb{R}$ there corresponds exactly one measure $\mu_F$ on $\mathcal{B}^1$ having property (6.9), that is, satisfying

$\mu_F([a, b[) = F(b) - F(a)$ for all $[a, b[ \,\in \mathcal{I}^1$.

The measure $\mu_G$ determined by the measure-generating function $G$ satisfies $\mu_G = \mu_F$ if and only if $G = F + c$ for some constant $c \in \mathbb{R}$. Every $\mu_F$ is a Borel measure on $\mathbb{R}$, and every Borel measure on $\mathbb{R}$ is a $\mu_F$ for an appropriate $F$.

Proof. The techniques employed in the proof of Theorem 4.3 can be repeated to show that corresponding to $F$ there is a unique content $\mu$ on the ring $\mathcal{F}^1$ of 1-dimensional figures which has property (6.9). That part of the proof used only the isotoneity of $F$. From the left-continuity of $F$ it follows that for every $I = [a, b[ \,\in \mathcal{I}^1$ and every $\varepsilon > 0$ there is a $J = [a, c[ \,\in \mathcal{I}^1$ with $\overline{J} \subset I$ and

$\mu(I) - \mu(J) = \mu([c, b[) = F(b) - F(c) \le \varepsilon$.

§6. Lebesgue-Borel measure and measures on the number line

But then the technique employed in the proof of Theorem 4.4 shows that $\mu$ is a $\sigma$-finite (as well as finite) premeasure on $\mathcal{F}^1$.

According to 5.6 it can be extended in exactly one way to a measure on $\mathcal{B}^1$. This measure does what is wanted, is a $\mu_F$. Its uniqueness with respect to its prescription on $\mathcal{I}^1$ via $F$ was settled in the deliberations preceding the present theorem. From $\mu_F = \mu_G$ we get $G(b) - G(a) = F(b) - F(a)$ whenever $a \le b$. Upon applying this with $a = 0 \le b$ as well as with $a \le 0 = b$, we learn that $G = F + c$, with $c := G(0) - F(0)$. Every $\mu_F$ is a Borel measure, because every bounded $B \in \mathcal{B}^1$ is contained in $[-n, n[$ for some $n \in \mathbb{N}$ and so $\mu_F(B) \le \mu_F([-n, n[) = F(n) - F(-n) < +\infty$. If conversely, $\mu$ is an arbitrary Borel measure on $\mathbb{R}$, we can define

$F(x) := \mu([0, x[)$ if $x \ge 0$, $\quad F(x) := -\mu([x, 0[)$ if $x < 0$,

and get a function on $\mathbb{R}$ having property (6.9) and therewith, in light of the discussion preceding this proof, measure-generating. In fact, for real numbers $0 \le a \le b$ the subtractivity (3.7) of measures entails that

$\mu([a, b[) = \mu([0, b[ \,\setminus\, [0, a[) = F(b) - F(a)$,

and (6.9) is confirmed analogously when $a \le b \le 0$. In the remaining case $a < 0 < b$ we get (6.9) from $[a, b[ \,=\, [a, 0[ \,\cup\, [0, b[$ and the additivity of $\mu$. The uniqueness already proved leads finally to the equality of $\mu$ with the measure $\mu_F$ derived from $F$. $\Box$

Notice that L-B measure $\lambda^1$ has the form $\mu_F$, with $F$ the identity map $x \mapsto x$ on $\mathbb{R}$.
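Property (6.9) is easy to experiment with on half-open intervals. The following is a minimal sketch, not the book's construction: the helper names `mu_F`, `identity` and `shifted` are hypothetical, and Python's `math.ceil` is used as one convenient example of an isotone, left-continuous function.

```python
import math


def mu_F(F, a, b):
    """Value F(b) - F(a) that property (6.9) assigns to the interval [a, b[."""
    if a > b:
        raise ValueError("expected a <= b")
    return F(b) - F(a)


def identity(x):
    # F(x) = x generates L-B measure: mu_F([a, b[) is the length b - a.
    return x


def shifted(F, c):
    # G = F + c; by Theorem 6.5 it generates the same measure as F.
    return lambda x: F(x) + c


# math.ceil is isotone and left-continuous, hence measure-generating;
# mu_F([a, b[) then counts the integers lying in [a, b[.
length = mu_F(identity, 2, 5)
integers_in_interval = mu_F(math.ceil, 0.5, 3)
same_after_shift = mu_F(shifted(math.ceil, 7), 0.5, 3)
additive = mu_F(math.ceil, -1, 0) + mu_F(math.ceil, 0, 2) == mu_F(math.ceil, -1, 2)
print(length, integers_in_interval, same_after_shift, additive)
```

The sketch only evaluates $\mu_F$ on intervals; the extension to all Borel sets is what Theorem 6.5 supplies and is not computed here.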

Of special importance are the finite measures on $\mathcal{B}^1$. Every one is a Borel measure on $\mathbb{R}$. Because $0 \le \mu(B) \le \mu(\mathbb{R}) < +\infty$ for all $B \in \mathcal{B}^1$, a finite Borel measure $\mu$ on $\mathbb{R}$ is either the zero measure $\mu = 0$, or $0 < \mu(\mathbb{R}) < +\infty$ and $\nu := \frac{1}{\mu(\mathbb{R})}\,\mu$ is a measure on $\mathcal{B}^1$ with $\nu(\mathbb{R}) = 1$. Measures normalized this way play a fundamental role in probability theory. This explains the following vocabulary: A measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ is called a probability measure (abbreviated to p-measure) if $\mu(\Omega) = 1$. Because of the isotoneity property every p-measure satisfies

(6.10) $0 \le \mu(A) \le 1 = \mu(\Omega)$ for all $A \in \mathcal{A}$.

Consider now a p-measure $\mu$ on $\mathcal{B}^1$. The open interval $]-\infty, x[$ lies in $\mathcal{B}^1$ for each $x \in \mathbb{R}$, so a real function $F_\mu$ with values in $[0, 1]$ is defined by

(6.11) $F_\mu(x) := \mu(]-\infty, x[)$ $\quad (x \in \mathbb{R})$.

It is called the distribution function of $\mu$. For example, the distribution function of the Dirac measure $\varepsilon_0$ equals 0 throughout $]-\infty, 0]$ and 1 throughout $]0, +\infty[$. Since $]-\infty, b[ \,\setminus\, ]-\infty, a[ \,=\, [a, b[$ whenever $a \le b$,

$\mu([a, b[) = F_\mu(b) - F_\mu(a)$ for all $[a, b[ \,\in \mathcal{I}^1$.

Therefore (6.11) uniquely defines a measure-generating function, which obviously satisfies

(6.12) $\mu_{F_\mu} = \mu$

in the notation introduced in Theorem 6.5. Among the infinitely many measure-generating functions $F$ that satisfy $\mu_F = \mu$ for a given p-measure $\mu$ the distribution function $F_\mu$ is characterized as follows:

6.6 Theorem. A real function $F$ on $\mathbb{R}$ is the distribution function of a (necessarily uniquely determined) p-measure $\mu$ on $\mathcal{B}^1$ if and only if it is measure-generating (that is, isotone and left-continuous) and satisfies

(6.13) $\lim_{x\to-\infty} F(x) = 0$ and $\lim_{x\to+\infty} F(x) = 1$.

Proof. The distribution function $F_\mu$ of a p-measure $\mu$ on $\mathcal{B}^1$ is always measure-generating, as (6.12) shows. Properties (6.13) follow from the continuity at $\emptyset$ and the continuity from below of every finite measure, respectively, since for sequences $(x_n)$ in $\mathbb{R}$ with $x_n \downarrow -\infty$, resp., $x_n \uparrow +\infty$ we have $]-\infty, x_n[ \,\downarrow \emptyset$, resp., $]-\infty, x_n[ \,\uparrow \mathbb{R}$. If conversely $F$ is a measure-generating function satisfying (6.13), then according to 6.5 $\mu_F$ is the only Borel measure on $\mathbb{R}$ with property (6.9), in particular, with

$\mu_F([-n, n[) = F(n) - F(-n)$ for all $n \in \mathbb{N}$.

When $n \to +\infty$ here, the normalization condition $\mu_F(\mathbb{R}) = 1$ follows from (6.13). Thus $\mu_F$ is a probability measure. $F$ is then the distribution function of $\mu_F$, because for $x \in \mathbb{R}$ and all $n \in \mathbb{N} \cap [-x, +\infty[$

$\mu_F([-n, x[) = F(x) - F(-n)$ and $[-n, x[ \,\uparrow\, ]-\infty, x[$,

so that

$F(x) = \lim_{n\to+\infty} \mu_F([-n, x[) + \lim_{n\to+\infty} F(-n) = \mu_F(]-\infty, x[) = F_{\mu_F}(x)$. $\Box$

Via $\mu \mapsto F_\mu$ the set of p-measures on $\mathcal{B}^1$ is thus bijectively mapped onto the set of measure-generating functions $F$ on $\mathbb{R}$ having property (6.13). This is the significance of the preceding theorem.

Remarks. 1. Measure-generating functions are also called "Stieltjes measure functions". This is because, even before the invention of the measure concept, T.J. STIELTJES (1856-1894) had used such functions to extend the ideas behind the Riemann integral (cf. Remark 2 in §12).

2. Measure-generating functions (and distribution functions) also make sense in $\mathbb{R}^d$. But they are difficult to deal with and that is not the least reason why they are of less significance. A function $F : \mathbb{R}^d \to \mathbb{R}$ is called measure-generating if in each of its $d$ variables $\xi_1, \dots, \xi_d$, when the others are held fixed, it is left-continuous and satisfies the additional condition

$\Delta_{a_1}^{b_1} \cdots \Delta_{a_d}^{b_d} F \ge 0$

for all $a, b \in \mathbb{R}^d$ with $a$ …

… $> 0$ there is a covering of $B$ by countably many open intervals $I_n \subset \mathbb{R}^d$ such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < \varepsilon$. (b) There is a covering of $B$ by countably many open intervals $I_n$ such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < +\infty$ and every point of $B$ lies in $I_n$ for infinitely many $n$. Both characterizations remain valid if the $I_n$ are allowed to be half-open or compact, instead of open. [Hint for (a): Utilize (5.1).]

2. Write $\mathbb{R}^d$ in the form $\mathbb{R}^d = \mathbb{R}^p \times \mathbb{R}^q$ with $p, q \in \mathbb{N}$, $p + q = d$, by grouping the first $p$ coordinates of a point $x \in \mathbb{R}^d$ into a point in $\mathbb{R}^p$ and the last $q$ coordinates into a point in $\mathbb{R}^q$. Denoting by 0 the zero of the vector space $\mathbb{R}^q$, show that for a set $A \subset \mathbb{R}^p$, $A \times \{0\} \in \mathcal{B}^d$ precisely when $A \in \mathcal{B}^p$.

3. Let $\mu$ be a p-measure on $\mathcal{B}^1$ and $F_\mu$ its distribution function. Show that $F_\mu$ is continuous at the point $x \in \mathbb{R}$ just if $\mu(\{x\}) = 0$.

4. Determine the p-measure on $\mathcal{B}^1$ which has $x \mapsto 0 \vee (x \wedge 1)$ as distribution function, and answer anew the question in Exercise 1 of §4.

5. Show that every $\sigma$-finite measure $\mu$ on $\mathcal{B}^d$ can be represented in the form $\mu = \sum_{n=1}^{\infty} \alpha_n \mu_n$, where for each $n \in \mathbb{N}$, $\alpha_n \in \mathbb{R}_+$ and $\mu_n$ is a p-measure on $\mathcal{B}^d$. The supplemental condition that for every bounded set $B \in \mathcal{B}^d$, $\mu_n(B) \ne 0$ for only finitely many $n \in \mathbb{N}$ can be imposed if and only if $\mu$ is a Borel measure.

§7. Measurable mappings and image measures

The following considerations can be more simply formulated if we introduce some shorthand terminology. If $\Omega$ is a set and $\mathcal{A}$ a $\sigma$-algebra in $\Omega$, the pair $(\Omega, \mathcal{A})$ will be called a measurable space and the sets in $\mathcal{A}$ measurable sets. If in addition a measure $\mu$ is defined on the $\sigma$-algebra $\mathcal{A}$, then the triple $(\Omega, \mathcal{A}, \mu)$ arising from the measurable space $(\Omega, \mathcal{A})$ is called a measure space (cf. Exercise 7 of §5). If $\mu$ is a p-measure, the measure space $(\Omega, \mathcal{A}, \mu)$ is called a probability space (p-space for short). Correspondingly, one speaks of a $\sigma$-finite measure space $(\Omega, \mathcal{A}, \mu)$ if the measure $\mu$ is $\sigma$-finite.

The measurable space $(\mathbb{R}^d, \mathcal{B}^d)$ will henceforth be called the $d$-dimensional Borel measurable space. The measure space $(\mathbb{R}^d, \mathcal{B}^d, \lambda^d)$ will correspondingly be called the $d$-dimensional Lebesgue-Borel measure space (abbreviated to L-B measure space). The concept measurable space exhibits a formal analogy to that of topological space. For a topological space is also a pair, consisting of a set and a system of its subsets, namely, the open ones. In the sense of this analogy the next concept, that of a measurable mapping, corresponds to the concept of continuity in topology.

7.1 Definition. Let $(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be measurable spaces, and $T : \Omega \to \Omega'$ a mapping of $\Omega$ into $\Omega'$. $T$ is called $\mathcal{A}$-$\mathcal{A}'$-measurable if

(7.1) $T^{-1}(A') \in \mathcal{A}$ for every $A' \in \mathcal{A}'$.

We express the $\mathcal{A}$-$\mathcal{A}'$-measurability of $T$ symbolically by

$T : (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$

and speak of a measurable mapping of the first measurable space into the second. Using the notation introduced in (1.5), (7.1) can be written as

(7.1') $T^{-1}(\mathcal{A}') \subset \mathcal{A}$.

Examples. 1. Every constant mapping $T : \Omega \to \Omega'$ is $\mathcal{A}$-$\mathcal{A}'$-measurable.

2. Every continuous mapping $T : \mathbb{R}^d \to \mathbb{R}^{d'}$ ($d, d' \in \mathbb{N}$) is $\mathcal{B}^d$-$\mathcal{B}^{d'}$-measurable, briefly put, Borel measurable. According to 6.4 the system $\mathcal{O}^{d'}$ of all open subsets of $\mathbb{R}^{d'}$ is a generator of $\mathcal{B}^{d'}$. Because of the continuity of $T$, $T^{-1}(O') \in \mathcal{O}^d \subset \mathcal{B}^d$ for every $O' \in \mathcal{O}^{d'}$. The asserted measurability of $T$ therefore follows from the next theorem.

7.2 Theorem. Let $(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be measurable spaces; further, let $\mathcal{E}'$ be a generator of $\mathcal{A}'$. A mapping $T : \Omega \to \Omega'$ is measurable just if

(7.2) $T^{-1}(E') \in \mathcal{A}$ for every $E' \in \mathcal{E}'$.

Proof. The system $\mathcal{Q}'$ of all sets $Q' \in \mathcal{P}(\Omega')$ for which $T^{-1}(Q') \in \mathcal{A}$ is a $\sigma$-algebra in $\Omega'$. Consequently, $\mathcal{A}' \subset \mathcal{Q}'$ holds just if $\mathcal{E}' \subset \mathcal{Q}'$ does. $\mathcal{A}' \subset \mathcal{Q}'$ is equivalent to the measurability of $T$, while $\mathcal{E}' \subset \mathcal{Q}'$ is equivalent to (7.2). $\Box$

Concerning the composition of measurable mappings, what the earlier analogy with topology suggests, prevails:

7.3 Theorem. If $T_1 : (\Omega_1, \mathcal{A}_1) \to (\Omega_2, \mathcal{A}_2)$ and $T_2 : (\Omega_2, \mathcal{A}_2) \to (\Omega_3, \mathcal{A}_3)$ are measurable mappings, then the composite mapping $T_2 \circ T_1$ is $\mathcal{A}_1$-$\mathcal{A}_3$-measurable.

Proof. The claim follows from the validity of the equation $(T_2 \circ T_1)^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$ for all $A \in \mathcal{P}(\Omega_3)$, in particular, from its validity for all $A \in \mathcal{A}_3$. $\Box$

Next consider a family of measurable spaces $((\Omega_i, \mathcal{A}_i))_{i\in I}$ and a family $(T_i)_{i\in I}$ of mappings $T_i : \Omega \to \Omega_i$ of some fixed set $\Omega$ into the individual sets $\Omega_i$. Obviously the $\sigma$-algebra in $\Omega$ generated by $\bigcup_{i\in I} T_i^{-1}(\mathcal{A}_i)$ is the smallest $\sigma$-algebra $\mathcal{A}$ with respect to which every $T_i$ is $\mathcal{A}$-$\mathcal{A}_i$-measurable. We designate this $\sigma$-algebra $\sigma(T_i : i \in I)$, that is, we define

(7.3) $\sigma(T_i : i \in I) := \sigma\Bigl(\bigcup_{i\in I} T_i^{-1}(\mathcal{A}_i)\Bigr)$

and call it the $\sigma$-algebra generated by the mappings $T_i$ (and the measurable spaces $(\Omega_i, \mathcal{A}_i)$). In the case of the finite index set $I = \{1, \dots, n\}$, we also use the notation $\sigma(T_1, \dots, T_n)$. For $n = 1$ we clearly have $\sigma(T_1) = T_1^{-1}(\mathcal{A}_1)$. If therefore a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ is given, then a mapping $T_1 : \Omega \to \Omega_1$ being $\mathcal{A}$-$\mathcal{A}_1$-measurable is equivalent to

(7.4) $\sigma(T_1) \subset \mathcal{A}$.

Cf. (7.1').

As a further application of 7.2 we will demonstrate:

7.4 Theorem. Let $(T_i)_{i\in I}$ be a family of mappings $T_i : \Omega \to \Omega_i$ of a set $\Omega$ into measurable spaces $(\Omega_i, \mathcal{A}_i)$. Further, let $S : \Omega_0 \to \Omega$ be a mapping of a measurable space $(\Omega_0, \mathcal{A}_0)$ into $\Omega$. The mapping $S$ is then $\mathcal{A}_0$-$\sigma(T_i : i \in I)$-measurable if and only if each mapping $T_i \circ S$ ($i \in I$) is $\mathcal{A}_0$-$\mathcal{A}_i$-measurable.

Proof. According to Theorem 7.3 the condition is necessary. The following considerations show that it is also sufficient. By (7.3) the system

$\mathcal{E} := \bigcup_{i\in I} T_i^{-1}(\mathcal{A}_i)$

is a generator of $\sigma(T_i : i \in I)$. Each set $E \in \mathcal{E}$ has the form $E = T_i^{-1}(A_i)$ for some $i \in I$, $A_i \in \mathcal{A}_i$. Thus $S^{-1}(E) = (T_i \circ S)^{-1}(A_i) \in \mathcal{A}_0$ because of the hypothesized measurability of $T_i \circ S$. From 7.2 therefore, $S$ is $\mathcal{A}_0$-$\sigma(T_i : i \in I)$-measurable. $\Box$

Finally, with the aid of measurable mappings, measures can be mapped:

7.5 Theorem. Let $T : (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$ be a measurable mapping. Then for every measure $\mu$ on $\mathcal{A}$,

(7.5) $A' \mapsto \mu(T^{-1}(A'))$

defines a measure $\mu'$ on $\mathcal{A}'$.

Proof. We only have to observe that for every sequence $(A'_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets from $\mathcal{A}'$, $(T^{-1}(A'_n))_{n\in\mathbb{N}}$ is a sequence of pairwise disjoint sets from $\mathcal{A}$, and that

$T^{-1}\Bigl(\bigcup_{n\in\mathbb{N}} A'_n\Bigr) = \bigcup_{n\in\mathbb{N}} T^{-1}(A'_n)$. $\Box$

7.6 Definition. In the situation described in 7.5, the measure $\mu'$ is called the image of $\mu$ under the mapping $T$ and is denoted by $T(\mu)$. Thus according to this definition

(7.5') $T(\mu)(A') := \mu(T^{-1}(A'))$ for all $A' \in \mathcal{A}'$.

The formation of image measures is transitive, that is,

(7.6) $(T_2 \circ T_1)(\mu) = T_2(T_1(\mu))$,

whenever we are in the situation of 7.3 and $\mu$ is a measure on $\mathcal{A}_1$: For every $A \in \mathcal{A}_3$, $T := T_2 \circ T_1$ satisfies $T^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$, and $T_2^{-1}(A) \in \mathcal{A}_2$. Therefore, setting $\mu' := T_1(\mu)$, $\mu'' := T_2(\mu')$ for short, it follows that

$T(\mu)(A) = \mu(T_1^{-1}(T_2^{-1}(A))) = \mu'(T_2^{-1}(A)) = \mu''(A)$

for all $A \in \mathcal{A}_3$, showing that $T(\mu) = \mu''$ and confirming (7.6).
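For a measure concentrated on finitely many points, the image measure of Definition 7.6 and the transitivity (7.6) can be checked directly. The sketch below rests on assumptions made only for illustration: a discrete measure is stored as a dictionary `{point: mass}`, every mapping is measurable because all subsets are, and the names `image_measure`, `t1` and `t2` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def image_measure(mu, T):
    """Image T(mu) of a discrete measure {point: mass}.

    The mass of a point y under T(mu) is mu(T^{-1}({y})), the total mass of
    all points that T sends to y.
    """
    image = defaultdict(Fraction)
    for point, mass in mu.items():
        image[T(point)] += mass
    return dict(image)


def t1(w):
    return w % 3


def t2(w):
    return w % 2


# Uniform p-measure on {0, ..., 5}.
mu = {w: Fraction(1, 6) for w in range(6)}

direct = image_measure(mu, lambda w: t2(t1(w)))        # (T2 o T1)(mu)
stepwise = image_measure(image_measure(mu, t1), t2)    # T2(T1(mu))
print(direct, stepwise, direct == stepwise)
```

Exact fractions are used so that the comparison of the two image measures is not affected by rounding.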

Examples. 3. Let $(\Omega, \mathcal{A}) = (\Omega', \mathcal{A}') := (\mathbb{R}^d, \mathcal{B}^d)$ be the $d$-dimensional Borel measurable space and $\mu := \lambda^d$ the associated L-B measure. For every point $a \in \mathbb{R}^d$, the translation mapping $T_a : \mathbb{R}^d \to \mathbb{R}^d$ is defined by

$T_a(x) := a + x$, $\quad x \in \mathbb{R}^d$.

It is continuous and so (Example 2) measurable. We inquire into the image measure $\lambda' := T_a(\lambda^d)$.

The mapping $T_a$ is bijective, and $T_a^{-1} = T_{-a}$. So for every interval $[b, c[ \,\in \mathcal{I}^d$, $T_a^{-1}([b, c[) = [b - a, c - a[$, whence $\lambda'([b, c[) = \lambda^d([b - a, c - a[) = \lambda^d([b, c[)$. Both measures $\lambda^d$ and $\lambda'$ thus assign to every interval from $\mathcal{I}^d$ its $d$-dimensional elementary content. According to 6.2 therefore $\lambda^d = \lambda'$, that is,

(7.7) $T_a(\lambda^d) = \lambda^d$ for every $a \in \mathbb{R}^d$.

This property of $\lambda^d$ is called its translation-invariance. If we set, as is customary

(7.8) $a + A = A + a := T_a(A) = \{a + x : x \in A\}$

for sets $A \in \mathcal{P}(\mathbb{R}^d)$ and points $a \in \mathbb{R}^d$, then $T_a(\lambda^d)(A) = \lambda^d(-a + A)$ for arbitrary $A \in \mathcal{B}^d$. Property (7.7) can therefore also be expressed as

(7.7') $\lambda^d(a + A) = \lambda^d(A)$ for all $A \in \mathcal{B}^d$, $a \in \mathbb{R}^d$.

4. In the context of Example 3, each non-zero real number $\alpha$ and each $i \in \{1, \dots, d\}$ determine a continuous, hence Borel measurable, linear mapping $D_\alpha^{(i)}$ which assigns to the point $x = (x_1, \dots, x_d) \in \mathbb{R}^d$ the image point $x' \in \mathbb{R}^d$ having coordinates $x'_i := \alpha x_i$, and $x'_j = x_j$ for all $j \ne i$, a dilation of $x$. It satisfies

(7.9) $D_\alpha^{(i)}(\lambda^d) = |\alpha|^{-1} \lambda^d$.

For, every open interval $]a, b[ \,\subset \mathbb{R}^d$ has $D_\alpha^{(i)}$-pre-image equal to $]a', b'[$, where the coordinates of $a'$, $b'$ except the $i$th are those of $a$, $b$, the $i$th being $\alpha^{-1}$ times those of $a$, $b$ if $\alpha > 0$, and $\alpha^{-1}$ times those of $b$, $a$ if $\alpha < 0$. Hence

$\lambda^d\bigl((D_\alpha^{(i)})^{-1}(]a, b[)\bigr) = |\alpha|^{-1} \lambda^d(]a, b[)$.

$D_\alpha^{(i)}(\lambda^d)$ and $|\alpha|^{-1}\lambda^d$ are therefore measures on $\mathcal{B}^d$ which coincide on all bounded open intervals. Thanks to 6.4 such intervals constitute a generator of $\mathcal{B}^d$, which obviously has with respect to each of these measures all the properties of the generator $\mathcal{E}$ in the uniqueness Theorem 5.4. From that theorem (7.9) therefore follows.

5. If we set $H_r := D_r^{(1)} \circ \dots \circ D_r^{(d)}$ for real $r \ne 0$, we obtain the linear mapping $H_r(x) = rx$ ($x \in \mathbb{R}^d$), called a homothety. Because of the transitivity of image measures, it follows from (7.9) that

(7.10) $H_r(\lambda^d) = |r|^{-d} \lambda^d$.

For $r = -1$ we get $H_{-1}(\lambda^d) = \lambda^d$. Because $H_{-1}$ is reflection through the origin, this property is called the reflection-invariance of $\lambda^d$.

Exercises. 1. For $\Omega := \mathbb{R}$, let $(\Omega, \mathcal{A}, \mu)$ be the measure space of Example 2, §3. For $\Omega' := \{0, 1\}$ and $\mathcal{A}' := \mathcal{P}(\Omega')$ define the mapping $T : \Omega \to \Omega'$ by $T(\omega) := 0$ if $\omega$ is rational, $T(\omega) := 1$ if $\omega$ is irrational. Show that $T$ is $\mathcal{A}$-$\mathcal{A}'$-measurable and determine the image measure $T(\mu)$.

2. Show that for any sets $\Omega$, $\Omega'$, any mapping $T : \Omega \to \Omega'$, and any system of sets $\mathcal{E}' \subset \mathcal{P}(\Omega')$, $T^{-1}(\sigma(\mathcal{E}')) = \sigma(T^{-1}(\mathcal{E}'))$.

3. Let $K$ be a compact subset of $\mathbb{R}^d$ with the property that the intersection $H_r(K) \cap H_{r'}(K)$ of every two homothetic images of $K$ with $0 < r < r' < 1$ is an L-B-null set. (This property is enjoyed by every sphere $S_\alpha(0)$ of radius $\alpha > 0$ and center $0 := (0, \dots, 0)$, that is, the set of $x \in \mathbb{R}^d$ having euclidean distance $\alpha$ from 0.) Show that $\lambda^d(K) = 0$. [Hint: For all $r \in \,]0, 1]$, $H_r(K) \subset \widetilde{K} := \{tx : 0$ …

… 0 and $f^+(\omega) = 0$ in case $f(\omega) < 0$. Observe that not only $f^+ \ge 0$, but also $f^- \ge 0$. The important equalities

(9.8) $f = f^+ - f^-$ and $|f| = f^+ + f^-$

are immediate. From 9.4 and 9.6 we effortlessly infer our concluding result:

9.8 Theorem. A numerical function $f$ on $\Omega$ is $\mathcal{A}$-measurable if and only if both its positive part $f^+$ and its negative part $f^-$ are $\mathcal{A}$-measurable. Furthermore, along with $f$, its absolute value $|f|$ is always $\mathcal{A}$-measurable.
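The equalities (9.8) are pointwise identities and can be spot-checked on sample points. This is a small sketch with assumed names: `positive_part`, `negative_part` and the sample function `f` are illustrative, the formulas $f^+ = f \vee 0$ and $f^- = (-f) \vee 0$ are the usual ones, and measurability itself is not something the code can test.

```python
def positive_part(f):
    """f+ = f v 0, evaluated pointwise."""
    return lambda w: max(f(w), 0)


def negative_part(f):
    """f- = (-f) v 0, evaluated pointwise; note that f- is non-negative."""
    return lambda w: max(-f(w), 0)


def f(w):
    # Hypothetical sample function taking both signs.
    return w * w - 2


f_plus, f_minus = positive_part(f), negative_part(f)
points = range(-3, 4)

difference_ok = all(f(w) == f_plus(w) - f_minus(w) for w in points)
absolute_ok = all(abs(f(w)) == f_plus(w) + f_minus(w) for w in points)
print(difference_ok, absolute_ok)
```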

Exercises. 1. Let $(\Omega, \mathcal{A})$ be a measurable space, $D$ a dense subset of $\mathbb{R}$ (e.g., $\mathbb{Q}$). Show that a numerical function $f$ on $\Omega$ is $\mathcal{A}$-measurable if the analog for all $\alpha \in D$ of one of (a)-(d) in Theorem 9.2 holds.

2. Let $(f_n)_{n\in\mathbb{N}}$ be a sequence of $\mathcal{A}$-measurable numerical functions on a measurable space $(\Omega, \mathcal{A})$. Why is the set of all $\omega \in \Omega$ for which the sequence $(f_n(\omega))_{n\in\mathbb{N}}$ converges in $\mathbb{R}$, and that for which it converges in $\overline{\mathbb{R}}$, $\mathcal{A}$-measurable?

3. The real function $f : \Omega \to \mathbb{R}$ is measurable on the measurable space $(\Omega, \mathcal{A})$. Are $\exp f$ and $\sin f$, that is, the functions $\omega \mapsto e^{f(\omega)}$ and $\omega \mapsto \sin f(\omega)$, $\mathcal{A}$-measurable?

4. With the aid of Theorem 9.1 show that the real function defined on $\mathbb{R}^2$ by $(x, y) \mapsto \max\{x, y\}$ is $\mathcal{B}^2$-measurable. Deduce from this another proof of Corollary 9.6.

5. Show via an example that the measurability of a numerical function $f$ is not always a consequence of the measurability of $|f|$.

§10. Elementary functions and their integral

Our path to the integral proceeds via the set

$E = E(\Omega, \mathcal{A})$

of $\mathcal{A}$-elementary functions on $\Omega$, which we define as follows:

10.1 Definition. A real function on $\Omega$ is called an ($\mathcal{A}$-)elementary function (or a non-negative step function) if it is non-negative, $\mathcal{A}$-measurable, and assumes only finitely many different values.


11. Integration Theory

If $\{\alpha_1, \dots, \alpha_n\}$ is the set of distinct values of a function $u \in E$, then the sets $A_i := u^{-1}(\alpha_i)$, $i = 1, \dots, n$, are pairwise disjoint, and as pre-images of the Borel sets $\{\alpha_i\}$ they each lie in $\mathcal{A}$. Using the notation for indicator functions introduced in (9.2), we have then

(10.1) $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$.

If conversely, numbers $\alpha_1, \dots, \alpha_n \in \mathbb{R}_+$ and sets $A_1, \dots, A_n \in \mathcal{A}$ are given ($n \in \mathbb{N}$) and we define $u$ via (10.1), then $u$ is an elementary function, because by 9.4 it is measurable. Thus $E$ is the set of all functions having a representation of the form (10.1), with $n \in \mathbb{N}$, coefficients $\alpha_i$ in $\mathbb{R}_+$ and sets $A_i$ from $\mathcal{A}$. From Definition 10.1 and the results of §9 the following further properties of $E$ are immediate:

(10.2) $u, v \in E$, $\alpha \in \mathbb{R}_+$ $\implies$ $\alpha u$, $u + v$, $u \cdot v$, $u \vee v$, $u \wedge v \in E$.

The derivation of (10.1) shows moreover that every function $u \in E$ has a representation of the form (10.1) in which the sets $A_i \in \mathcal{A}$ are pairwise disjoint and cover $\Omega$, that is, constitute a decomposition of $\Omega$. Such representations will henceforth be called normal representations of $u$. It is easy to see that generally functions $u \in E$ can have several different normal representations. However, for $u \ne 0$ there is only one representation in which the coefficients are the distinct non-zero values taken by $u$. Anyway, for purposes of integration non-uniqueness of normal representations is not an issue, as the next lemma shows.

10.2 Lemma. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. For any normal representations

$u = \sum_{i=1}^{m} \alpha_i 1_{A_i} = \sum_{j=1}^{n} \beta_j 1_{B_j}$

of an elementary function $u \in E$ we have

$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{j=1}^{n} \beta_j \mu(B_j)$

(bearing in mind the conventions for calculating with $+\infty$).

Proof. From $\Omega = A_1 \cup \dots \cup A_m = B_1 \cup \dots \cup B_n$ follows

$A_i = \bigcup_{j=1}^{n} (A_i \cap B_j)$ and $B_j = \bigcup_{i=1}^{m} (A_i \cap B_j)$

in which the sets $A_i \cap B_j$ are pairwise disjoint. The finite additivity of $\mu$ therefore supplies the equalities

$\mu(A_i) = \sum_{j=1}^{n} \mu(A_i \cap B_j)$ and $\mu(B_j) = \sum_{i=1}^{m} \mu(A_i \cap B_j)$,

the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$. After further summation

$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{i,j} \alpha_i \mu(A_i \cap B_j)$ and $\sum_{j=1}^{n} \beta_j \mu(B_j) = \sum_{i,j} \beta_j \mu(A_i \cap B_j)$.

From these two equalities the claim follows when we observe the following fact: Because we started with normal representations of $u$, $\alpha_i = \beta_j$ for every index pair $(i, j)$ such that $A_i \cap B_j \ne \emptyset$, in particular, for every pair $(i, j)$ such that $\mu(A_i \cap B_j) \ne 0$. $\Box$

Thanks to the preceding our next definition is sound:

10.3 Definition. Let $u$ be an elementary function. The number

(10.3) $\int u \, d\mu := \sum_{i=1}^{n} \alpha_i \mu(A_i)$,

which is independent of the special choice of normal representation

$u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$

of $u$, is called the ($\mu$-)integral of $u$ (over $\Omega$).
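Formula (10.3) and the independence asserted in Lemma 10.2 can be illustrated on a three-point measure space. The following is a sketch under assumptions of its own: the point masses in `weights`, the helper names `mu` and `integral`, and the two sample representations are hypothetical, and the conventions for $+\infty$ are not modelled because all masses are finite.

```python
from fractions import Fraction

# Hypothetical finite measure on {"a", "b", "c"} given by point masses.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}


def mu(subset):
    """Measure of a subset: the sum of its point masses."""
    return sum(weights[w] for w in subset)


def integral(representation):
    """Sum of alpha_i * mu(A_i) over a normal representation [(alpha_i, A_i), ...]."""
    return sum(alpha * mu(a) for alpha, a in representation)


# Two normal representations of the same function:
# u = 2 on {"a", "b"} and u = 5 on {"c"}.
rep_coarse = [(2, {"a", "b"}), (5, {"c"})]
rep_fine = [(2, {"a"}), (2, {"b"}), (5, {"c"})]

print(integral(rep_coarse), integral(rep_fine))
```

Both calls return the same number, as Lemma 10.2 predicts; the finer representation merely splits one set of the coarser one.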

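Thus $u \mapsto \int u \, d\mu$ defines a mapping from $E$ into $\overline{\mathbb{R}}_+$. Clearly it is a mapping in $\mathbb{R}_+$ just if $\mu$ is finite. The most important properties of this mapping are summarized in:

(10.4) $\int 1_A \, d\mu = \mu(A)$ for all $A \in \mathcal{A}$;

(10.5) $\int (\alpha u) \, d\mu = \alpha \int u \, d\mu$ for all $u \in E$, $\alpha \in \mathbb{R}_+$;

(10.6) $\int (u + v) \, d\mu = \int u \, d\mu + \int v \, d\mu$ …

(10.7) $u$ …

… $\alpha u\}$ lies in $\mathcal{A}$ for each $n \in \mathbb{N}$. From this definition follows on the one hand that $u_n \ge \alpha u 1_{B_n}$ and consequently by (10.5) and (10.7)

$\int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu$

for every $n \in \mathbb{N}$. Since the sequence $(u_n)$ is isotone and $u \le \sup u_n$, it follows on the other hand that $B_n \uparrow \Omega$, and so $A_j \cap B_n \uparrow A_j$ for each $j \in \{1, \dots, m\}$ and consequently, because $\mu$ is continuous from below

$\int u \, d\mu = \sum_{j=1}^{m} \alpha_j \mu(A_j) = \lim_{n\to\infty} \sum_{j=1}^{m} \alpha_j \mu(A_j \cap B_n) = \lim_{n\to\infty} \int u 1_{B_n} \, d\mu$.

Hence

$\sup_{n\in\mathbb{N}} \int u_n \, d\mu \ge \sup_{n\in\mathbb{N}} \alpha \int u 1_{B_n} \, d\mu = \alpha \lim_{n\to\infty} \int u 1_{B_n} \, d\mu = \alpha \int u \, d\mu$,

where the first step follows from $\int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu$. Since $\alpha \in \,]0, 1[$ is arbitrary here, the claim follows. $\Box$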

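11.2 Corollary. For any isotone sequences $(u_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ of functions from $E$

(11.2) $\sup_{n\in\mathbb{N}} u_n = \sup_{n\in\mathbb{N}} v_n \implies \sup_{n\in\mathbb{N}} \int u_n \, d\mu = \sup_{n\in\mathbb{N}} \int v_n \, d\mu$.

Proof. For every $m \in \mathbb{N}$, $v_m \le \sup_n u_n$ and $u_m \le \sup_n v_n$, from which inequalities and 11.1 follow

$\int v_m \, d\mu \le \sup_{n\in\mathbb{N}} \int u_n \, d\mu$ and $\int u_m \, d\mu \le \sup_{n\in\mathbb{N}} \int v_n \, d\mu$.

Claim (11.2) is immediate from the validity of these inequalities for all $m \in \mathbb{N}$. $\Box$

Now let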

E- = E'(0,a)

(11.3)

designate the set of all non-negative numerical functions f on 1 for which an isotone sequence

of functions from E can be found satisfying

sup un = f . nEN

Then according to (11.2) the number sup J U. dp E Ft+ nEN

depends only on f and not on the special representating sequence (u,,) of f used to compute it. We're in a position similar to that of 10.3. Therefore we make the

11.3 Definition. Let $f$ be a function in $E^*$, represented as the upper envelope $f = \sup u_n$ of an isotone sequence $(u_n)_{n\in\mathbb{N}}$ of elementary functions. Then the number

\[ \int f\,d\mu := \sup_{n\in\mathbb{N}} \int u_n\,d\mu \in \overline{\mathbb{R}}_+, \tag{11.4} \]

shown above to be independent of the special representing sequence $(u_n)$, is called the ($\mu$-)integral of $f$ (over $\Omega$).

Evidently $E \subset E^*$, because every $u \in E$ satisfies $u = \sup u_n$ for the constant sequence $u_n := u$. Moreover, using this sequence (as we may) in (11.4), we see that in case $f = u \in E$, that definition of the integral coincides with the earlier one. The mapping $f \mapsto \int f\,d\mu$ initially defined only on $E$ is thereby extended to a mapping of $E^*$ into $\overline{\mathbb{R}}_+$. That in this extension process the known properties of the integral persist, will now be confirmed. The analogs of (10.2) and of (10.5)-(10.7) are

\[ f, g \in E^*,\ \alpha \in \mathbb{R}_+ \implies \alpha f,\ f+g,\ f\cdot g,\ f \vee g,\ f \wedge g \in E^*; \tag{11.5} \]

\[ \int (\alpha f)\,d\mu = \alpha \int f\,d\mu \quad \text{for all } f \in E^*,\ \alpha \in \mathbb{R}_+; \tag{11.6} \]

§11. The integral of non-negative measurable functions

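\[ \int (f+g)\,d\mu = \int f\,d\mu + \int g\,d\mu \quad \text{for all } f, g \in E^*; \tag{11.7} \]

\[ f \le g \implies \int f\,d\mu \le \int g\,d\mu \quad \text{for all } f, g \in E^*. \tag{11.8} \]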

for $m \ge n$ we have $u_{mn} \le v_m$ and so

\[ \sup_{m\in\mathbb{N}} u_{mn} = f_n \le \sup_{m\in\mathbb{N}} v_m \quad \text{for every } n \in \mathbb{N}. \]

Together with the preceding this gives finally $\sup_m v_m = f$. Therefore $(v_n)$ is a sequence with the needed properties. □

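11.5 Corollary. For every sequence $(f_n)_{n\in\mathbb{N}}$ of functions from $E^*$

\[ \sum_{n=1}^{\infty} f_n \in E^* \quad\text{and}\quad \int \Bigl(\sum_{n=1}^{\infty} f_n\Bigr)\,d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu. \]

Proof. Apply 11.4 to the sequence $(f_1 + \ldots + f_n)_{n\in\mathbb{N}}$ and recall (11.7). □

In analogy with the device of writing $A_n \uparrow A$, $A_n \downarrow A$ for sets, introduced in §3, we will from now on write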

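\[ f_n \uparrow f, \qquad f_n \downarrow f \]

for numerical functions $f, f_1, f_2, \ldots$ on the set $\Omega$ to signal that $f_n(\omega) \uparrow f(\omega)$ for every $\omega \in \Omega$, or $f_n(\omega) \downarrow f(\omega)$ for every $\omega \in \Omega$; that is, the notations mean $(f_n)$ is an isotone sequence and $f$ is its upper envelope, or $(f_n)$ is an antitone sequence and $f$ is its lower envelope. Obviously for a sequence $(A_n)$ of subsets of $\Omega$

\[ A_n \uparrow A \iff 1_{A_n} \uparrow 1_A \quad\text{and}\quad A_n \downarrow A \iff 1_{A_n} \downarrow 1_A. \]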

Examples. 1. Let $(\Omega, \mathscr{A})$ be an arbitrary measurable space and $\varepsilon_\omega$ the measure defined on $\mathscr{A}$ by unit mass at the point $\omega \in \Omega$ (cf. Example 5 in §3). Then

\[ \int f\,d\varepsilon_\omega = f(\omega) \]

for every $f \in E^*$. Due to 11.3 we can at once assume that $f \in E$. If, however, $f = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ is a normal representation of $f$, then $\omega$ lies in exactly one of the sets $A_i$, say in $A_{i_0}$. Then $\int f\,d\varepsilon_\omega = \sum_{i=1}^{n} \alpha_i\,\varepsilon_\omega(A_i) = \alpha_{i_0} = f(\omega)$.

2. Consider $\Omega := \mathbb{N}$ and $\mathscr{A} := \mathscr{P}(\mathbb{N})$. The $\sigma$-additivity requirement means that a measure $\mu$ on $\mathscr{A}$ is uniquely defined whenever numbers $\alpha_n = \mu(\{n\}) \in \overline{\mathbb{R}}_+$ are specified for each $n \in \mathbb{N}$. $E^*$ consists of all numerical functions $f \ge 0$ on $\Omega$. Indeed, one sets $f_n := f(n) 1_{\{n\}}$ for each $n \in \mathbb{N}$ and then $f_n \in E^*$, and in case $f(n) < +\infty$, $f_n \in E$. Since

\[ f = \sum_{n=1}^{\infty} f_n, \]

it follows from 11.5 that $f \in E^*$ and

\[ \int f\,d\mu = \sum_{n=1}^{\infty} f(n)\,\alpha_n. \]

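3. Let $(\Omega, \mathscr{A})$ be a measurable space, $(\mu_n)_{n\in\mathbb{N}}$ a sequence of measures on $\mathscr{A}$ and $\mu := \sum_{n=1}^{\infty} \mu_n$ (cf. Example 4 in §3). Then for every $f \in E^*$

\[ \int f\,d\mu = \sum_{n=1}^{\infty} \int f\,d\mu_n. \]

This is evidently true of indicator functions $f$, so the claimed equality holds for all elementary functions. Transition to an arbitrary $f \in E^*$ is accomplished thus: Let $(u_n)$ be a sequence in $E$ with $u_n \uparrow f$. Then the double sequence

\[ a_{mn} := \sum_{i=1}^{n} \int u_m\,d\mu_i \qquad (m, n \in \mathbb{N}) \]

satisfies

\[ \sup_{m\in\mathbb{N}} \bigl(\sup_{n\in\mathbb{N}} a_{mn}\bigr) = \sup_{n\in\mathbb{N}} \bigl(\sup_{m\in\mathbb{N}} a_{mn}\bigr) \quad \bigl(= \sup_{m,n\in\mathbb{N}} a_{mn}\bigr), \]

which confirms the assertion.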

Now that $E^*$ is seen as a natural generalization of $E$, we might ask for a more workable characterization of it. A surprisingly simple one exists which brings us back to the measurability concept in §9.

11.6 Theorem. $E^*$ is the set of non-negative, $\mathscr{A}$-measurable, numerical functions on $\Omega$.

Proof. Every elementary function is measurable and so therefore is every function in $E^*$, by 9.5. Suppose conversely that $f$ is a non-negative, measurable, numerical function on $\Omega$. The sets

\[ A_{in} := \begin{cases} \{f \ge n\}, & i = n2^n, \\ \{i2^{-n} \le f < (i+1)2^{-n}\}, & i = 0, 1, \ldots, n2^n - 1 \end{cases} \]

all lie in $\mathscr{A}$, and for each fixed $n \in \mathbb{N}$ the $n2^n + 1$ sets are a decomposition of $\Omega$. Consequently, for each $n$

\[ u_n := \sum_{i=0}^{n2^n} i2^{-n} 1_{A_{in}} \]

is a normal representation of a function in $E$. On the set $A_{in}$ the function $u_{n+1}$ can take only the values $(2i)2^{-n-1}$ and $(2i+1)2^{-n-1}$ if $i \in \{0, \ldots, n2^n - 1\}$, and only values $\ge n$ when $i = n2^n$. Therefore the sequence $(u_n)$ is isotone. It satisfies $\sup_n u_n = f$, because for any $\omega \in \Omega$ either $f(\omega) = +\infty$, in which case $u_n(\omega) = n$ for every $n$, or $f(\omega) < +\infty$, in which case $u_n(\omega) \le f(\omega) < u_n(\omega) + 2^{-n}$ for all $n > f(\omega)$. Thus $f$ lies in $E^*$. □

Example. 4. Let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra in $\Omega$ comprised of all sets which are either countable or have countable complement (introduced in Example 2 of §1). We claim that a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable just if there is a countable set $A$ in the complement of which $f$ is constant. This constant $\alpha(f)$ does not depend on the particular set $A$, because if $B$ is another such, $\complement A \cap \complement B$, being the complement in uncountable $\Omega$ of the countable set $A \cup B$, is not empty. That this condition really implies the $\mathscr{A}$-measurability of $f$ follows from Theorem 9.1, because for every $\alpha \in \mathbb{R}$ either $\{f \ge \alpha\} \subset A$ or $\complement A \subset \{f \ge \alpha\}$. In proving the converse we can, thanks to (9.8), assume that $f \ge 0$. The claim is then true for elementary functions $f \in E(\Omega, \mathscr{A})$, because among finitely many pairwise disjoint sets whose union is $\Omega$, exactly one has a countable complement. For arbitrary $f \in E^*(\Omega, \mathscr{A})$ let $(u_n)$ be a sequence of elementary functions with $u_n \uparrow f$. Each function $u_n$ is constantly $\alpha(u_n)$ in the complement of some countable set $A_n$. But then $f(\omega)$ has the constant value $\sup_n \alpha(u_n)$ for all

\[ \omega \in \bigcap_{n\in\mathbb{N}} \complement A_n = \complement\Bigl(\bigcup_{n\in\mathbb{N}} A_n\Bigr). \]

As the set $\bigcup_{n\in\mathbb{N}} A_n$ is countable, this proves that $f$ has the asserted property and that moreover $\alpha(f) = \sup_n \alpha(u_n)$. If now $\mu$ is the measure defined in Examples 2 and 7 of §3 which takes only the values 0 and 1, then it follows from the preceding deliberations that

\[ \int f\,d\mu = \alpha(f) \quad \text{for all } f \in E^*(\Omega, \mathscr{A}). \]
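The dyadic approximation used in the proof of Theorem 11.6 can be sketched pointwise. The code below is only an illustration under stated assumptions: it works with a handful of hypothetical sample values of $f(\omega)$ rather than with a function on a measurable space, and the helper name `dyadic_approx` is introduced here, not taken from the text.

```python
import math

def dyadic_approx(f_value, n):
    """Value u_n(w) for a point w with f(w) = f_value in [0, +inf]:
    n on {f >= n}, and i * 2**-n on {i * 2**-n <= f < (i + 1) * 2**-n}."""
    if f_value >= n:
        return float(n)
    return math.floor(f_value * 2**n) / 2**n

# Hypothetical sample values of f, including +infinity.
f_values = [0.0, 0.3, math.pi, 7.25, math.inf]

# For each sample value, the sequence u_1(w), ..., u_11(w).
approx = {x: [dyadic_approx(x, n) for n in range(1, 12)] for x in f_values}

for x, seq in approx.items():
    print(x, seq[:5])
```

Each printed sequence is non-decreasing, and for finite $f(\omega)$ the gap $f(\omega) - u_n(\omega)$ is below $2^{-n}$ as soon as $n > f(\omega)$, which is exactly the estimate used in the proof.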

In closing we will use Theorem 11.6 to derive a factorization lemma, due to J.L. Doob, which is interesting in its own right and quite important for its applications in probability theory.

11.7 Factorization lemma. Let $T : \Omega \to \Omega'$ be a mapping of a set $\Omega$ into a measurable space $(\Omega', \mathscr{A}')$ and $f : \Omega \to \overline{\mathbb{R}}$ a numerical function on $\Omega$. The function $f$ is measurable with respect to the $\sigma$-algebra $\sigma(T) = T^{-1}(\mathscr{A}')$ in $\Omega$ generated by $T$ if and only if there exists a measurable numerical function $g$ on $(\Omega', \mathscr{A}')$ such that

\[ f = g \circ T. \tag{11.9} \]

In case $f$ is $\sigma(T)$-measurable and real (resp., non-negative)-valued, then there is such a $g$ which is real (resp., non-negative)-valued.


Proof. If $f$ has the form $f = g \circ T$ as specified, then it is the composite of a $\sigma(T)$-$\mathscr{A}'$-measurable with an $\mathscr{A}'$-$\overline{\mathscr{B}}^1$-measurable mapping, making it $\sigma(T)$-$\overline{\mathscr{B}}^1$-measurable. For the proof of the converse we distinguish three cases:

1. Let $f = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ be a $\sigma(T)$-elementary function; so $A_i \in \sigma(T)$ and $\alpha_i \in \mathbb{R}_+$ for $i = 1, \ldots, n$. For each $A_i$ there is a set $A_i' \in \mathscr{A}'$ with $A_i = T^{-1}(A_i')$, by definition of $\sigma(T)$. Therefore the function $g := \sum_{i=1}^{n} \alpha_i 1_{A_i'}$ does what is wanted.

2. Let $f \ge 0$. According to Theorem 11.6 there is an isotone sequence $(u_n)_{n\in\mathbb{N}}$ of $\sigma(T)$-elementary functions with $f = \sup_n u_n$, and by the proof just given, there are $\mathscr{A}'$-elementary functions $g_n$ such that $u_n = g_n \circ T$. The function $g := \sup_n g_n$ then does what is wanted in this case.

3. An arbitrary $\sigma(T)$-measurable $f : \Omega \to \overline{\mathbb{R}}$ decomposes into its positive part $f^+$ and its negative part $f^-$. From 2. we get $\mathscr{A}'$-measurable $g_0' \ge 0$ and $g_0'' \ge 0$ on $\Omega'$ for which $f^+ = g_0' \circ T$ and $f^- = g_0'' \circ T$. For $\omega'$ in the set $U' := \{g_0' = +\infty\} \cap \{g_0'' = +\infty\}$ the difference $g_0'(\omega') - g_0''(\omega')$ is not defined. But the set $T(\Omega)$ is disjoint from $U'$, because $g_0'(T(\omega)) = +\infty$ always entails that $g_0''(T(\omega)) = f^-(\omega) = 0$. Therefore if we set

\[ g' := 1_{\complement U'}\,g_0' \quad\text{and}\quad g'' := 1_{\complement U'}\,g_0'', \]

then $g := g' - g''$ will do the desired job.

4. If $f$ is real, 3. supplies a numerical $\mathscr{A}'$-measurable function $g_0$ on $\Omega'$ such that $f = g_0 \circ T$. If we set $U := \{|g_0| = +\infty\}$, then $U \cap T(\Omega) = \emptyset$ since $f$ takes only real values, and so the real function $g := 1_{\complement U}\,g_0$ does what is wanted. □
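On a finite set the factorization $f = g \circ T$ can be carried out by hand. The sketch below is a toy illustration only: the sets `Omega` and `Omega_prime` and the maps `T` and `f` are hypothetical, and every subset of the finite target is taken as measurable, so that $\sigma(T)$-measurability of `f` just means that `f` is constant on each fibre of `T`.

```python
# Hypothetical finite example: Omega = {0, ..., 7}, Omega' = {0, 1, 2, 3}.
Omega = list(range(8))
Omega_prime = [0, 1, 2, 3]

def T(w):
    """Mapping Omega -> Omega'; the point 3 of Omega' is never attained."""
    return w % 3

def f(w):
    """A function on Omega that depends on w only through T(w)."""
    return float((w % 3) ** 2 + 1)

# On T(Omega) there is no choice: g(T(w)) := f(w).
g = {}
for w in Omega:
    previous = g.setdefault(T(w), f(w))
    assert previous == f(w), "f is not constant on a fibre of T"

# Off T(Omega) the value of g is arbitrary; 0 is chosen here.
for w_prime in Omega_prime:
    g.setdefault(w_prime, 0.0)

print(g)
```

The second loop reflects the point that $g$ is only determined on $T(\Omega)$; outside the image any measurable choice works.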

Remark. The restriction of $g$ to $T(\Omega)$ is uniquely determined by $f$ and (11.9). Specifically, for each $\omega' \in T(\Omega)$, $g(\omega') = f(\omega)$ for every $\omega \in T^{-1}(\omega')$. On $T(\Omega)$ one therefore has no other choice than to set $g(T(\omega)) := f(\omega)$. In case $T(\Omega) \in \mathscr{A}'$, in particular when $T(\Omega) = \Omega'$, the existence of $g$ can thus be secured without recourse to 11.7 (cf. Exercise 3 below). The factorization lemma is therefore noteworthy only in so far as it allows the measurability of $T(\Omega)$ to be dispensed with. And in doing that the special structure of $(\overline{\mathbb{R}}, \overline{\mathscr{B}}^1)$ is critical. Remark 4 in §8 shows how we are sometimes forced to do without the measurability of $T(\Omega)$.

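Exercises. 1. Show that every bounded, $\mathscr{A}$-measurable, non-negative real-valued function on a measurable space $(\Omega, \mathscr{A})$ is the uniform limit of an isotone sequence of $\mathscr{A}$-measurable elementary functions.

2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with a finite measure $\mu$. Further, let $f, f_1, f_2, \ldots$ be measurable numerical functions on $\Omega$. Prove the equivalence of the two assertions:

(i) $\lim_{n\to\infty} \mu\bigl(\bigcup_{m \ge n} \{f_m > f + \varepsilon\}\bigr) = 0$ for every $\varepsilon > 0$;

(ii) for every $\delta > 0$ there exists an $A_\delta \in \mathscr{A}$ with $\mu(A_\delta) < \delta$ such that for every $\varepsilon > 0$, $f_n(\omega) < f(\omega) + \varepsilon$ holds for all $\omega \in \complement A_\delta$ and all sufficiently large $n \in \mathbb{N}$.

[Hints: Note that (i) is also equivalent to the statement that for every $\varepsilon > 0$ and $\delta > 0$ there exists an $A_{\delta,\varepsilon} \in \mathscr{A}$ with $\mu(A_{\delta,\varepsilon}) < \delta$ and an $N_{\delta,\varepsilon} \in \mathbb{N}$ such that $f_n(\omega) < f(\omega) + \varepsilon$ for all $\omega \in \complement A_{\delta,\varepsilon}$ and $n \ge N_{\delta,\varepsilon}$.] Why does (i) hold, given the sequence $(f_n)_{n\in\mathbb{N}}$, for every measurable function $f$ which satisfies $f \ge \limsup_{n\to\infty} f_n$?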

3. With the hypotheses and notation of the factorization lemma, show that for any $\omega_1, \omega_2 \in \Omega$ with $T(\omega_1) = T(\omega_2)$, and every $C \in \sigma(T)$, either $\omega_1, \omega_2 \in C$ or $\omega_1, \omega_2 \in \complement C$. (That is, $\omega_1$ and $\omega_2$ cannot be "separated" by any set in $\sigma(T)$.) From this fact infer that a $\sigma(T)$-measurable $f$ satisfies $f(\omega_1) = f(\omega_2)$ whenever $T(\omega_1) = T(\omega_2)$. In case $T(\Omega) \in \mathscr{A}'$, deduce the existence of an $\mathscr{A}'$-measurable mapping $g : \Omega' \to \overline{\mathbb{R}}$ with $f = g \circ T$. [Hint: Consider the system $\mathscr{C}$ of all $C \subset \Omega$ which have this two-point property and conclude that $\sigma(T) \subset \mathscr{C}$. Further, take note of the equality $T(T^{-1}(A')) = A' \cap T(\Omega)$ for $A' \subset \Omega'$.]

§12. Integrability

By now the integral $\int f\,d\mu$ is defined for all non-negative $\mathscr{A}$-measurable numerical functions on $\Omega$, as a result of 11.4 and 11.6 together. In a third and final step $\int f\,d\mu$ will now be defined for certain numerical functions $f$ which are not of constant sign.

According to Theorem 9.8, $f$ is measurable just if both its positive part $f^+$ and its negative part $f^-$ are measurable. This remark prompts the following definition:

12.1 Definition. A numerical function $f$ on the measure space $(\Omega, \mathscr{A}, \mu)$ is called ($\mu$-)integrable if it is $\mathscr{A}$-measurable and the integrals $\int f^+\,d\mu$, $\int f^-\,d\mu$ are real numbers. Then

\[ \int f\,d\mu := \int f^+\,d\mu - \int f^-\,d\mu \tag{12.1} \]

is called the ($\mu$-)integral of $f$ (over $\Omega$). If for some reason one wants to put the variable $\omega \in \Omega$ into evidence, one also writes

\[ \int f(\omega)\,\mu(d\omega) \quad\text{or}\quad \int f(\omega)\,d\mu(\omega). \]

Remarks. 1. The right side of (12.1) is meaningful for measurable $f$ if at least one of $f^+$, $f^-$ has a real integral. One says that then $f$ is quasi-integrable or that the integral of $f$ exists and one uses (12.1) to define $\int f\,d\mu \in \overline{\mathbb{R}}$. Only occasionally will we be concerned with this obvious generalization.

2. In the special case $\mu = \lambda^d$ we speak of Lebesgue integrable functions (on $\mathbb{R}^d$) and of their Lebesgue integrals. If a Borel measure $\mu_F$ on $\mathbb{R}^d$ is described with the help of a measure-generating function $F$ on $\mathbb{R}^d$ (cf. §6), the $\mu_F$-integrable functions $f$ on $\mathbb{R}^d$ are called Lebesgue-Stieltjes integrable (or Stieltjes integrable) with respect to $F$. One speaks of its (Lebesgue-)Stieltjes integral and writes $\int f\,dF$ instead of $\int f\,d\mu_F$. The general theory of measure and integration has however displaced this terminology and the notation $\int f\,dF$, despite their historical significance.

Let us now summarize the most important properties of the conceptual edifice just built:

12.2 Theorem. Each of the following four statements is equivalent to the integrability of the measurable numerical function $f$ on $\Omega$:

(a) $f^+$ and $f^-$ are integrable.
(b) There are integrable functions $u \ge 0$, $v \ge 0$ such that $f = u - v$. (Note that the last equality entails that $u(\omega) - v(\omega)$ is defined (in $\overline{\mathbb{R}}$) for every $\omega \in \Omega$.)
(c) There is an integrable function $g$ with $|f| \le g$.
(d) $|f|$ is integrable.

From (b) follows: $\int f\,d\mu = \int u\,d\mu - \int v\,d\mu$.

Proof. What has to be shown is the equivalence of (a) through (d), since (a) constitutes the definition of $f$ being integrable.

(a)⇒(b): According to (9.8), $u := f^+$ and $v := f^-$ do the job required in (b).

(b)⇒(c): Because the integral is additive on $E^*$, along with $u$ and $v$, $u + v$ is also integrable. Since $f = u - v \le u \le u + v$ and $-f = v - u \le v \le u + v$, the function $g := u + v$ is as required.

(c)⇒(d): This follows from the isotoneity of the integral on $E^*$ and the fact that $|f| \in E^*$ (Theorems 11.6 and 9.8): $\int |f|\,d\mu \le \int g\,d\mu < +\infty$.

(d)⇒(a): Upon recalling that $f^+ \le |f|$ and $f^- \le |f|$, this too follows from the isotoneity of the integral on $E^*$.

In (b), $f = u - v = f^+ - f^-$ and so $u + f^- = v + f^+$, which via (11.7) yields $\int u\,d\mu + \int f^-\,d\mu = \int v\,d\mu + \int f^+\,d\mu$ and therewith the last assertion of the theorem, since all the integrals here are finite. □

12.3 Theorem. Let $f$ and $g$ be integrable numerical functions on $\Omega$, $\alpha \in \mathbb{R}$. Then the functions $\alpha f$ and, if it is everywhere defined on $\Omega$, $f + g$ are integrable, and satisfy

\[ \int (\alpha f)\,d\mu = \alpha \int f\,d\mu \quad\text{and}\quad \int (f+g)\,d\mu = \int f\,d\mu + \int g\,d\mu. \tag{12.2} \]

Furthermore, the functions $f \vee g$ and $f \wedge g$ are integrable.

Proof. The claims regarding $\alpha f$ follow from (11.6), since

\[ (\alpha f)^+ = \alpha f^+, \quad (\alpha f)^- = \alpha f^- \quad \text{if } \alpha \ge 0, \]

and

\[ (\alpha f)^+ = |\alpha| f^-, \quad (\alpha f)^- = |\alpha| f^+ \quad \text{if } \alpha \le 0. \]

Regarding $f + g$, we argue as follows: from $f = f^+ - f^-$ and $g = g^+ - g^-$ follow $f + g = f^+ + g^+ - (f^- + g^-)$. (11.7) insures that $u := f^+ + g^+$ and $v := f^- + g^-$ are integrable. Then the claims about $f + g$ follow from the equality $f + g = u - v$ via 12.2. Finally, $|f \vee g| \le |f| + |g|$ and $|f \wedge g|$ […] $0\}$ lies in $\mathscr{A}$. What has to be shown is that

\[ \int f\,d\mu = 0 \iff \mu(N) = 0. \]

Suppose $\int f\,d\mu = 0$. For each $n \in \mathbb{N}$ the set $A_n := \{f \ge n^{-1}\}$ also lies in $\mathscr{A}$ and $A_n \uparrow N$, so that $\mu(N) = \lim_{n\to\infty} \mu(A_n)$ and it is enough to show that $\mu(A_n) = 0$ for every $n$. But obviously $f \ge n^{-1} 1_{A_n}$, entailing that $0 = \int f\,d\mu \ge n^{-1}\mu(A_n) \ge 0$, that is, $\mu(A_n) = 0$, as wanted.

Suppose conversely that $\mu(N) = 0$. Each of the functions $u_n := n 1_N$ ($n \in \mathbb{N}$) lies in $E(\Omega, \mathscr{A})$ and satisfies $\int u_n\,d\mu = 0$. Setting $g := \sup_n u_n$ gives a function $g \in E^*(\Omega, \mathscr{A})$ such that $u_n \uparrow g$, so $\int g\,d\mu = \sup_n \int u_n\,d\mu = 0$. Finally, since evidently $f \le g$, $0 \le \int f\,d\mu \le \int g\,d\mu = 0$ gives the desired equality $\int f\,d\mu = 0$. □

13.3 Corollary. Every $\mathscr{A}$-measurable numerical function $f$ on $\Omega$ is integrable over every $\mu$-nullset $N$, and

\[ \int_N f\,d\mu = 0. \]

Proof. If $f \ge 0$, this claim follows from the theorem, because the function $1_N f$ lies in $E^*(\Omega, \mathscr{A})$ and is almost everywhere 0. In turn, application of this to $f^+$ and $f^-$ delivers the full claim. □

13.4 Theorem. Let $f$, $g$ be $\mathscr{A}$-measurable numerical functions on $\Omega$ which are $\mu$-almost everywhere equal on $\Omega$. Then

(a) $f \ge 0$, $g \ge 0 \implies \int f\,d\mu = \int g\,d\mu$;

(b) $f$ integrable $\implies$ $g$ integrable and $\int f\,d\mu = \int g\,d\mu$.

Proof. (a): By hypothesis (and 9.3) $N := \{f \ne g\}$ is a $\mu$-nullset. From 13.3 then

\[ \int_N f\,d\mu = \int_N g\,d\mu = 0. \]

On the other hand, for $M := \complement N$ we have $1_M f = 1_M g$ due to the definition of $N$, and so by (12.6)

\[ \int_M f\,d\mu = \int_M g\,d\mu. \]

Adding integrals and using (12.8') leads to the conclusion in (a).

(b): The almost everywhere equality hypothesis entails that

\[ f^+ = g^+ \text{ almost everywhere and } f^- = g^- \text{ almost everywhere.} \]

From (a) then

\[ \int f^+\,d\mu = \int g^+\,d\mu \quad\text{and}\quad \int f^-\,d\mu = \int g^-\,d\mu. \]

Because $f$ is integrable, what we have here are non-negative real numbers, showing that $g$ is integrable (part (a) of 12.2) and, upon subtracting the second equality from the first, we get the equality claimed in (b). □

Since, roughly speaking, all this shows that integrability and the integral of a function are insensitive to (measurable) changes of the function on nullsets, results proved earlier can easily be reformulated somewhat more sharply. For example:

13.5 Corollary. Let the l-measurable numerical functions f and g on 11 satisfy If I 11 /satisfies IA < n f and therewith µ(An) :5 n

,f dµ < +00.

This holds for all $n \in \mathbb{N}$, confirming the $\sigma$-finiteness claim. □

Theorem 13.6 has yet another consequence: Let $N$ be a $\mu$-nullset and $f$ a numerical function which is defined on $M := \complement N$ and is $M \cap \mathscr{A}$-measurable. Such a function is described as being a ($\mu$-)almost everywhere defined ($\mathscr{A}$-)measurable function. The function $f_M$ introduced in 12.6 extends it to an $\mathscr{A}$-measurable function on $\Omega$. Any other extension of $f$ to $\Omega$ must agree with $f_M$ almost everywhere. According to 13.4 therefore either every such extension is integrable or none is. In the first case moreover all extensions have the same $\mu$-integral. These observations justify the following definition:

13.7 Definition. Let $f$ be a $\mu$-almost everywhere defined, $\mathscr{A}$-measurable numerical function on $\Omega$. It will be called ($\mu$-)integrable if it can be extended to a ($\mu$-)integrable function $f'$ defined on the whole of $\Omega$. $\int f'\,d\mu$ will then be called the ($\mu$-)integral of $f$ and denoted $\int f\,d\mu$.

We will only occasionally be concerned with this extension of the integral concept, but its utility is already shown by the following

Remark. Suppose $f$ and $g$ are integrable numerical functions on $\Omega$. According to 13.6 each is almost everywhere finite. Because the union of two nullsets is itself a nullset, there is a nullset $N$ such that both $|f(\omega)| < +\infty$ and $|g(\omega)| < +\infty$ for all $\omega \in \complement N$. But then

\[ \omega \mapsto f(\omega) + g(\omega) \qquad (\omega \in \complement N) \]

is an almost everywhere defined measurable function. This fact, in conjunction with what was shown above, shows that the explicit hypothesis made in 12.3 that $f + g$ be everywhere defined is of little significance. For two integrable numerical functions $f$ and $g$ on $\Omega$ the sum $f + g$ is almost everywhere defined, and in the sense of 13.7 integrable. The equality

\[ \int (f+g)\,d\mu = \int f\,d\mu + \int g\,d\mu \]

prevails unrestrictedly.

Exercises. 1. The numerical functions $f$ and $g$ on the measure space $(\Omega, \mathscr{A}, \mu)$ satisfy $f = g$ $\mu$-almost everywhere. Show via an example that in general the $\mathscr{A}$-measurability of $g$ does not follow from that of $f$. Show however that in case $(\Omega, \mathscr{A}, \mu)$ is complete, the $\mathscr{A}$-measurability of $g$ is equivalent to that of $f$.

2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that $f : \Omega \to \overline{\mathbb{R}}$ is $\mathscr{A}_0$-measurable just if $\mathscr{A}$-measurable numerical functions $f_1$, $f_2$ on $\Omega$ exist with the properties $f_1 \le f \le f_2$ everywhere in $\Omega$ and $f_1 = f_2$ $\mu$-almost everywhere. If $f$ is $\mu_0$-integrable, then any functions $f_1$, $f_2$ with these properties are $\mu$-integrable, and $\int f_1\,d\mu = \int f_2\,d\mu = \int f\,d\mu_0$. (This supplements Exercise 7 in §5 and generalizes Exercise 1 in §10.)

3. Even if the $f$ in the preceding exercise is real-valued, the functions $f_1$, $f_2$ which were proved to exist there cannot always be chosen to be real-valued. Prove this for the case where $\Omega$ is any infinite set, $\mathscr{A} := \{\emptyset, \Omega\}$ and $\mu := 0$.

§14. The spaces $\mathscr{L}^p(\mu)$

According to 9.4 the product of two measurable functions is again measurable. By contrast however the product of two integrable functions is not generally integrable, as the next example shows:

Example. $(\Omega, \mathscr{A}, \mu)$ is the measure space described in Example 2 of §12 and Example 2 of §11, with $\alpha_n := n^{-p-1}$ for each $n \in \mathbb{N}$, where $1 < p < +\infty$. The identity function, $f(n) := n$ for all $n \in \mathbb{N}$, is integrable, but its $p$-th power is not. Thus for $p = 2$, $f^2 = f \cdot f$ is not integrable.
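The example can be checked numerically for $p = 2$. The sketch below assumes, as in the example, the weights $\alpha_n = n^{-p-1}$ on $\mathbb{N}$ and $f(n) = n$; the helper names and the cut-off values are illustrative. Partial sums can of course only suggest convergence or divergence, they do not prove it.

```python
import math

p = 2  # the case singled out in the example

def alpha(n):
    """Point mass of {n}: alpha_n = n**(-p-1)."""
    return n ** (-(p + 1))

def integral_f(N):
    """Partial sum of the series for the integral of f(n) = n: sum of n * alpha_n."""
    return sum(n * alpha(n) for n in range(1, N + 1))

def integral_fp(N):
    """Partial sum of the series for the integral of f**p: sum of n**p * alpha_n."""
    return sum(n**p * alpha(n) for n in range(1, N + 1))

finite_part = integral_f(10**4)                               # stays bounded
divergent_part = [integral_fp(10**k) for k in range(1, 5)]    # keeps growing

print(finite_part, math.pi**2 / 6)
print(divergent_part)
```

For $p = 2$ the first series is $\sum n^{-2}$, bounded by $\pi^2/6$, while the second is the harmonic series, whose partial sums grow like $\ln N$.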

This observation suggests the investigation of those measurable functions $f$ on $\Omega$ for which $|f|^p$ is integrable. In what follows $p$ will designate a real number, $p \ge 1$. For every $\mathscr{A}$-measurable function $f$ on $\Omega$, $|f|$ and then also $|f|^p$ is measurable, because (adopting the usual convention that $(+\infty)^p := +\infty$) for every real $\alpha$

\[ \{|f|^p \ge \alpha\} = \begin{cases} \Omega & \text{if } \alpha \le 0, \\ \{|f| \ge \alpha^{1/p}\} & \text{if } \alpha > 0. \end{cases} \]

For such an $f$

\[ N_p(f) := \Bigl(\int |f|^p\,d\mu\Bigr)^{1/p} \tag{14.1} \]

is therefore defined. It satisfies $0 \le N_p(f) \le +\infty$ and, clearly,

\[ N_p(\alpha f) = |\alpha|\,N_p(f) \quad \text{for all } \alpha \in \mathbb{R}. \tag{14.2} \]

Two deeper properties will now be established:

14.1 Theorem. Suppose $p > 1$ is a real number and $q > 1$ is defined by the equation

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

Then for any measurable numerical functions $f$, $g$ on $\Omega$

\[ N_1(fg) \le N_p(f)\,N_q(g) \qquad \text{(HÖLDER's inequality)}. \tag{14.3} \]
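A quick numerical sanity check of (14.3) on a finite weighted space may help fix the notation. Everything concrete here (the four weights, the two vectors, the exponent $p = 3$) is a hypothetical choice for illustration, and the helper `N` simply implements (14.1) for a measure given by point masses.

```python
def N(f, weights, p):
    """N_p(f) = (sum |f_i|**p * w_i) ** (1/p) for a measure with point masses w_i."""
    return sum(abs(x) ** p * w for x, w in zip(f, weights)) ** (1.0 / p)

# Hypothetical data on a four-point space.
weights = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [2.0, 0.1, -1.5, 1.0]

p = 3.0
q = p / (p - 1.0)           # conjugate exponent: 1/p + 1/q = 1

lhs = N([a * b for a, b in zip(f, g)], weights, 1.0)   # N_1(fg)
rhs = N(f, weights, p) * N(g, weights, q)              # N_p(f) * N_q(g)

print(lhs, rhs)
```

A single example cannot replace the proof, but it makes the roles of $p$, $q$ and the three quantities in (14.3) concrete.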

Proof. It is clear from definition (14.1) that we may assume $f \ge 0$ and $g \ge 0$. Setting

\[ \sigma := N_p(f) \quad\text{and}\quad \tau := N_q(g), \]

we can also assume that both these numbers are positive. For if, say, $\sigma = 0$, then by 13.2 $f^p$, whence also $f$, is almost everywhere equal to 0. The same is then true of $fg$ (remember that $0 \cdot (+\infty) = 0$), so that again by 13.2 we have $N_1(fg) = 0$, and (14.3) holds. Once $\sigma$, $\tau$ are each positive, no loss of generality is incurred by assuming that each is also finite, which we now do. Applying the mean-value theorem of the differential calculus to the function $\eta \mapsto (1+\eta)^{1/p}$, there follows at once the well-known Bernoulli inequality

(1+71)I/p 0, satisfies

f fndp.

f limonf fndp 0 on S2 such that 8f (x, w) < h(w) 8xi

for all (x, w) E U x S2.

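Then the function defined on $U$ by

\[ \varphi(x) := \int f(x, \omega)\,\mu(d\omega) \]

has an $i$-th partial derivative at every $x \in U$, the function $\omega \mapsto \frac{\partial f}{\partial x_i}(x, \omega)$ is $\mu$-integrable, and

\[ \frac{\partial \varphi}{\partial x_i}(x) = \int \frac{\partial f}{\partial x_i}(x, \omega)\,\mu(d\omega) \quad \text{for every } x \in U. \]

This follows at once from the differentiation lemma: Given $x = (x_1, \ldots, x_d) \in U$, there is an open interval $I \subset \mathbb{R}$ containing $x_i$ such that for each $t \in I$ the point $(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d)$ lies in $U$, and we can apply 16.2 to the function $(t, \omega) \mapsto f(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d, \omega)$.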

II. Comparison of the Riemann and Lebesgue Integrals. For every $d$-dimensional Borel set $B \in \mathscr{B}^d$ and suitable Borel measurable numerical functions $f$ on $B$ the integral $\int_B f\,d\lambda^d$ was defined in §12 and identified with $\int f\,d\lambda^d_B$. This integral is called for short the Lebesgue integral of $f$ over $B$. A frequently encountered alternative way of writing it is

\[ \int_B f(x)\,dx := \int_B f\,d\lambda^d. \tag{16.4} \]

§16. Applications of the convergence theorems


In case $d = 1$ and $B = [a, b]$, or $]-\infty, a]$, or $\mathbb{R}$, etc. the notations $\int_a^b f(x)\,dx$, or $\int_{-\infty}^{a} f(x)\,dx$, or $\int_{-\infty}^{+\infty} f(x)\,dx$, etc., are also common. Since in basic analysis courses it is frequently only the Riemann integral that is dealt with, the following remarks relating it to what has been done here may be useful.

16.4 Theorem. Consider a Borel measurable real function $f$ defined on a compact interval $I := [\alpha, \beta]$ in $\mathbb{R}$. If $f$ is Riemann integrable (which in particular means it is bounded), then it is also Lebesgue integrable, and the values of the two integrals of $f$ coincide.

Proof. To every finite subdivision

J:={a=ao oo.

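Therefore the series $\sum_{k=1}^{\infty} a_k$ converges. Using this it is very easy to confirm that the improper Riemann integral

\[ \lim_{R\to+\infty} \int_0^R \frac{\sin x}{x}\,dx \]

exists. On the other hand,

\[ \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx = \int_0^{\pi} \frac{\sin t}{t + k\pi}\,dt \ge \frac{1}{(k+1)\pi} \int_0^{\pi} \sin t\,dt = \frac{2}{(k+1)\pi} \]

and so for every $n \in \mathbb{N}$

\[ \int_{\mathbb{R}_+} |f|\,d\lambda^1 \ge \int_{[\pi,(n+1)\pi]} |f|\,d\lambda^1 = \sum_{k=1}^{n} \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx \ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k+1}. \]

Since the harmonic series diverges, these inequalities show that $\int_{\mathbb{R}_+} |f|\,d\lambda^1 = +\infty$, and so by 12.2 $f$ is not Lebesgue integrable over $\mathbb{R}_+$.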
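The lower bound $2/((k+1)\pi)$ for the integral of $|\sin x|/x$ over $[k\pi, (k+1)\pi]$ can be compared with a numerical value of each piece. The sketch below uses a plain midpoint rule; the helper `piece`, the number of steps and the range of $k$ are illustrative assumptions, not part of the text.

```python
import math

def piece(k, steps=400):
    """Midpoint-rule approximation of the integral of |sin x| / x over [k*pi, (k+1)*pi]."""
    a = k * math.pi
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        x = a + (j + 0.5) * h
        total += abs(math.sin(x)) / x
    return total * h

ks = range(1, 51)
pieces = [piece(k) for k in ks]
bounds = [2.0 / ((k + 1) * math.pi) for k in ks]

print(sum(pieces), sum(bounds))
```

Each numerical piece lies above its harmonic-type bound, so the running total of the pieces is dragged to infinity by the divergent sum of the bounds.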

III. Calculation of the integral G. The preceding considerations show that integrals which the reader may already have encountered as Riemann integrals can, in the stated circumstances, be immediately interpreted as Lebesgue integrals. Known formulas and computational rules for the Riemann integral thereby become available to the Lebesgue theory as well.


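As an illustration, consider the non-negative function

\[ f(x, \omega) := \frac{e^{-x(1+\omega^2)}}{1+\omega^2}, \qquad (x, \omega) \in \mathbb{R}_+ \times \mathbb{R}. \tag{16.5} \]

Both $f$ and the function $(x, \omega) \mapsto f'(x, \omega) := -e^{-x(1+\omega^2)}$ are continuous. For fixed $x_0 > 0$ form the auxiliary functions

\[ h_0(\omega) := e^{-2x_0|\omega|} \quad\text{and}\quad h(\omega) := (1+\omega^2)^{-1}, \qquad \omega \in \mathbb{R}. \]

Their $\lambda^1$-integrability (over $\mathbb{R}$) follows from Corollary 16.5 and the fundamental theorem of calculus. For example,

\[ \int (1+\omega^2)^{-1}\,d\omega = \lim_{n\to\infty} \bigl[\arctan \omega\bigr]_{-n}^{n} = \pi. \]

Obviously $f(x, \omega) \le h(\omega)$ for all $(x, \omega) \in \mathbb{R}_+ \times \mathbb{R}$. It follows from 12.2 that for each $x \in \mathbb{R}_+$ the function $\omega \mapsto f(x, \omega)$ is $\lambda^1$-integrable. And the real function defined by

\[ \varphi(x) := \int f(x, \omega)\,d\omega, \qquad x \in \mathbb{R}_+ \tag{16.6} \]

is continuous by the continuity lemma 16.1. Note that $\varphi(0) = \pi$. Since $2|\omega| \le 1+\omega^2$ for all $\omega \in \mathbb{R}$, we have $|f'(x, \omega)| \le h_0(\omega)$ for all $(x, \omega) \in [x_0, +\infty[\, \times \mathbb{R}$. Consequently the differentiation lemma 16.2 insures that $\varphi$ is differentiable in $]x_0, +\infty[$ for every $x_0 > 0$, that is, differentiable in $]0, +\infty[$, and

\[ \varphi'(x) = -\int e^{-x(1+\omega^2)}\,\lambda^1(d\omega) \quad \text{for } x > 0 \tag{16.7} \]

and via the substitution $t = \omega\sqrt{x}$ this reads

\[ \varphi'(x) = -G\,x^{-1/2} e^{-x} \quad \text{for } x > 0 \tag{16.8} \]

where $G$ designates the integral (16.1) that we are trying to explicitly compute. Its existence is already part of the preceding analysis, but can also be inferred from the majorization $e^{-t^2} \le e^{-t}$, which holds for $t \ge 1$. From (16.6), (16.8) and the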


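fundamental theorem of calculus

\[ \varphi(x) - \varphi(a) = G \int_x^a t^{-1/2} e^{-t}\,dt = 2G \int_{\sqrt{x}}^{\sqrt{a}} e^{-\omega^2}\,d\omega \]

for $x > 0$ and $a > 0$. Upon letting $a$ run to $+\infty$, we will get

\[ \varphi(x) = 2G \int_{\sqrt{x}}^{+\infty} e^{-\omega^2}\,d\omega \tag{16.9} \]

if we notice that $\varphi(a) \to 0$ as $a \to +\infty$, which in turn is a consequence of the inequalities

\[ \varphi(a) \le e^{-a} \int (1+\omega^2)^{-1}\,\lambda^1(d\omega) = \varphi(0)\,e^{-a} \quad \text{for all } a \ge 0. \]

Because $\varphi$ is continuous on $\mathbb{R}_+$ we can pass to the limit $x \to 0+$ in (16.9) and get

\[ \pi = \varphi(0) = 2G \int_0^{+\infty} e^{-\omega^2}\,d\omega = G^2, \]

using the obvious (on grounds of symmetry) fact that $\int_{-\infty}^{0} e^{-\omega^2}\,d\omega = \int_0^{+\infty} e^{-\omega^2}\,d\omega$. Since $G > 0$, it follows finally that $G = \sqrt{\pi}$. That is,

\[ \int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi} \tag{16.10} \]

or equivalently, in the form seen in probability theory,

\[ \int_{-\infty}^{+\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}. \tag{16.10'} \]

This derivation goes back to ANONYME [1889] and VAN YZEREN [1979]. A particularly short alternative one is made possible by Tonelli's theorem (cf. Exercise 4 in §23).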
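The value $G = \sqrt{\pi}$ and the relation $G^2 = \pi$ can be cross-checked numerically. The sketch below is not the derivation above; it is a plain trapezoid rule on a truncated interval, with the truncation bounds and step counts chosen as illustrative assumptions.

```python
import math

def trapezoid(func, a, b, steps):
    """Composite trapezoid rule for func on [a, b] with the given number of steps."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for j in range(1, steps):
        total += func(a + j * h)
    return total * h

# G: integral of exp(-x^2); the tails beyond |x| = 10 are negligibly small.
G = trapezoid(lambda x: math.exp(-x * x), -10.0, 10.0, 20000)

# The probabilistic form: integral of exp(-x^2 / 2), truncated at |x| = 15.
G_prob = trapezoid(lambda x: math.exp(-x * x / 2.0), -15.0, 15.0, 30000)

print(G, math.sqrt(math.pi))
print(G_prob, math.sqrt(2.0 * math.pi))
```

The trapezoid rule is unusually accurate for rapidly decaying smooth integrands such as these, which is why a simple rule suffices here.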

Exercises. 1. Which of the two functions below are integrable, which are square-integrable with respect to Lebesgue-Borel measure on the indicated intervals?

(a) $f(x) := x^{-1}$, $x \in I := [1, +\infty[$;

(b) $f(x) := x^{-1/2}$, $x \in I := \,]0, 1]$.

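2. Show that for every real number $\alpha > 0$ the function $x \mapsto e^{-\alpha x}$ is $\lambda^1$-integrable over $\mathbb{R}_+$.

3. Show that for every real number $\alpha > 0$ the function

\[ x \mapsto e^{-\alpha x} \Bigl[\frac{\sin x}{x}\Bigr]^3 \]

is $\lambda^1$-integrable over $]0, +\infty[$ and that

\[ \alpha \mapsto \int_0^{+\infty} e^{-\alpha x} \Bigl[\frac{\sin x}{x}\Bigr]^3 \lambda^1(dx) \]

is continuous on $]0, +\infty[$.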

§17. Measures with densities: the Radon-Nikodym theorem

Again let $(\Omega, \mathscr{A}, \mu)$ be an arbitrary measure space and $E^* = E^*(\Omega, \mathscr{A})$ the set of all $\mathscr{A}$-measurable, non-negative numerical functions on $\Omega$. In 12.4 we defined the integral of every function $f \in E^*$ over every set $A \in \mathscr{A}$. We are interested here in how this integral behaves with respect to $A$.

17.1 Theorem. For each function $f \in E^*$ the equation

\[ \nu(A) := \int_A f\,d\mu \tag{17.1} \]

defines a measure $\nu$ on $\mathscr{A}$.

Proof. $\nu(\emptyset) = 0$ and $\nu \ge 0$. For every sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets from $\mathscr{A}$ with $A := \bigcup_{n\in\mathbb{N}} A_n$

\[ 1_A f = \sum_{n=1}^{\infty} 1_{A_n} f \]

and so by 11.5

\[ \nu(A) = \sum_{n=1}^{\infty} \nu(A_n), \]

the final property needing to be checked in confirming that $\nu$ is a measure on $\mathscr{A}$. □

17.2 Definition. If $f$ is a non-negative $\mathscr{A}$-measurable, numerical function on $\Omega$, then the measure $\nu$ defined on $\mathscr{A}$ by (17.1) is called the measure having density $f$ with respect to $\mu$. It will be denoted by

\[ \nu = f\mu. \tag{17.2} \]

Concerning the relationship between $\nu$- and $\mu$-integrals we will show

17.3 Theorem. Let $f, \varphi \in E^*$, $\nu := f\mu$. Then

\[ \int \varphi\,d\nu = \int \varphi f\,d\mu \tag{17.3} \]

or, written out,

\[ \int \varphi\,d(f\mu) = \int \varphi f\,d\mu. \tag{17.3'} \]

An $\mathscr{A}$-measurable function $\varphi : \Omega \to \overline{\mathbb{R}}$ is $\nu$-integrable if and only if $\varphi f$ is $\mu$-integrable. In this case (17.3) is again valid.

Proof. First suppose $\varphi = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ is an $\mathscr{A}$-elementary function. In this case (17.3) holds because

\[ \int \varphi\,d\nu = \sum_{i=1}^{n} \alpha_i\,\nu(A_i) = \sum_{i=1}^{n} \alpha_i \int 1_{A_i} f\,d\mu = \int \varphi f\,d\mu. \]

For an arbitrary $\varphi \in E^*$ there is a sequence $(u_n)$ in $E$ such that $u_n \uparrow \varphi$. Since then $u_n f \uparrow \varphi f$ as well, (17.3) follows from 11.4. Finally, consider any $\mathscr{A}$-measurable numerical function $\varphi$ on $\Omega$. By now we know that

\[ \int \varphi^+\,d\nu = \int \varphi^+ f\,d\mu = \int (\varphi f)^+\,d\mu \quad\text{and}\quad \int \varphi^-\,d\nu = \int \varphi^- f\,d\mu = \int (\varphi f)^-\,d\mu. \]

From these equations and the definition of integrability follows the second part of the theorem. □

It now follows that the formation of measures with densities is transitive:

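17.4 Corollary. Let $f, g \in E^*$, $\nu := f\mu$ and $\varrho := g\nu$. Then $\varrho = (gf)\mu$, that is,

\[ g(f\mu) = (gf)\mu. \tag{17.4} \]

Proof. For every $A \in \mathscr{A}$

\[ \varrho(A) = \int_A g\,d\nu = \int 1_A g\,d\nu \]

and furthermore, according to 17.3

\[ \int 1_A g\,d\nu = \int 1_A g f\,d\mu = \int_A (gf)\,d\mu. \]

We thus obtain $\varrho(A) = \int_A g f\,d\mu$, for all $A \in \mathscr{A}$; which is what had to be proved. □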
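On a finite space, where a measure is just a table of point masses, both (17.3) and (17.4) reduce to identities between finite sums. The sketch below checks them on a hypothetical four-point space; the names `integrate` and `with_density` and all numerical values are assumptions made for the illustration.

```python
from fractions import Fraction as F

# Hypothetical four-point measure space given by point masses.
mu = {"a": F(1, 2), "b": F(2), "c": F(0), "d": F(3)}
f = {"a": F(4), "b": F(1, 3), "c": F(7), "d": F(0)}     # density of nu w.r.t. mu
g = {"a": F(1), "b": F(5), "c": F(2), "d": F(9)}        # density of rho w.r.t. nu
phi = {"a": F(3), "b": F(6), "c": F(1), "d": F(2)}      # function to integrate

def integrate(h, m):
    """Integral of h with respect to the measure with point masses m."""
    return sum(h[w] * m[w] for w in m)

def with_density(h, m):
    """Point masses of the measure h*m, i.e. the measure with density h w.r.t. m."""
    return {w: h[w] * m[w] for w in m}

nu = with_density(f, mu)

# (17.3): the integral of phi against nu equals the integral of phi*f against mu.
lhs = integrate(phi, nu)
rhs = integrate({w: phi[w] * f[w] for w in mu}, mu)

# (17.4): g(f mu) = (g f) mu, compared point mass by point mass.
rho = with_density(g, nu)
rho_direct = with_density({w: g[w] * f[w] for w in mu}, mu)

print(lhs, rhs, rho == rho_direct)
```

Note that the point `"c"` carries no $\mu$-mass, so the value of the density there is irrelevant, in line with the uniqueness statement that follows.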

On the question of uniqueness of density functions we have

17.5 Theorem. For functions $f, g \in E^*$

\[ f = g \ \ \mu\text{-almost everywhere} \implies f\mu = g\mu. \tag{17.5} \]

If either $f$ or $g$ is $\mu$-integrable, the converse implication holds as well.

Proof. If $f$ and $g$ coincide $\mu$-almost everywhere, then so do $1_A f$ and $1_A g$ for each $A \in \mathscr{A}$, whence

\[ \int_A f\,d\mu = \int_A g\,d\mu \quad \text{for all } A \in \mathscr{A}, \]

which just says that $f\mu = g\mu$. Now suppose that $f$ is $\mu$-integrable and that $f\mu = g\mu$. Since $g \ge 0$ and $\int g\,d\mu = \int f\,d\mu < +\infty$, $g$ is also $\mu$-integrable. Let us show that the set

\[ N := \{f > g\}, \]

which lies in $\mathscr{A}$ by 9.3, is a $\mu$-nullset. For every $\omega \in N$, $f(\omega) - g(\omega)$ is defined and is positive, which means that the definition

\[ h := 1_N f - 1_N g \]

makes sense. The functions $1_N f$, $1_N g$, being majorized by the $\mu$-integrable functions $f$, $g$, are themselves integrable. Because $f\mu = g\mu$, they have the same $\mu$-integral. From this we get that

\[ \int h\,d\mu = \int_N f\,d\mu - \int_N g\,d\mu = 0. \]

Since $N = \{h > 0\}$, this equality and 13.2 tell us that $\mu(N) = 0$. With the roles of $f$ and $g$ reversed, this conclusion reads $\mu(N') = 0$, where $N' := \{g > f\}$. Since $\{f \ne g\} = N \cup N'$, the desired conclusion, namely that $\{f \ne g\}$ is a $\mu$-nullset, is obtained. □

The converse of implication (17.5) is not valid without some additional hypothesis on the densities $f$ and $g$. The next example illustrates this.

Example. 1. As in Example 2 of §3 let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (see Example 2 in §1). But the measure $\mu$ will be defined on $\mathscr{A}$ by $\mu(A) := 0$ or $+\infty$, according as $A$ or $\complement A$ is countable. If $f$ and $g$ are the constant functions on $\Omega$ with the respective values 1 and 2, then indeed $f\mu = g\mu$, yet $f(\omega) = g(\omega)$ holds for no $\omega \in \Omega$. Of course, it then follows from 17.5 that neither $f$ nor $g$ is $\mu$-integrable.

Before turning to the principal problem of this section, we will examine another characterization of $\sigma$-finite measures which is important for what follows and is of interest in its own right.

17.6 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The measure is $\sigma$-finite if and only if there exists a $\mu$-integrable function $h$ on $\Omega$ which satisfies (17.6)

0 0 there exists d > 0 such that v(A) < e. (17.7) . A E O and u(A) 0 if A is a p-nullset. Hence v(A) = 0 and v is thus a p-continuous measure, even without the finiteness

hypothesis. For the converse we will show that if (17.7) fails, then $\nu$ is not $\mu$-continuous. Thus, for some $\varepsilon > 0$ there is no $\delta$, which means there is a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathscr{A}$ with the properties $\mu(A_n) \le 2^{-n}$ and $\nu(A_n) \ge \varepsilon$ for each $n \in \mathbb{N}$. We set

\[ A := \limsup_{n\to\infty} A_n := \bigcap_{n\in\mathbb{N}} \bigcup_{m \ge n} A_m \]

and have a set in $\mathscr{A}$ which on the one hand satisfies

\[ \mu(A) \le \mu\Bigl(\bigcup_{m \ge n} A_m\Bigr) \le \sum_{m=n}^{\infty} \mu(A_m) \le \sum_{m=n}^{\infty} 2^{-m} = 2^{-n+1} \quad \text{for every } n \in \mathbb{N}, \]

whence $\mu(A) = 0$, and on the other hand, due to the finiteness of $\nu$ and 15.3, satisfies

\[ \nu(A) \ge \limsup_{n\to\infty} \nu(A_n) \ge \varepsilon > 0, \]

which proves that $\nu$ is not $\mu$-continuous. □

Examples. 2. Let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (Example 2 in §1). As in the preceding Example, consider the measure $\nu$ on $\mathscr{A}$ which assigns to a set the value 0 or $+\infty$ according as the set or its complement is countable. Let $\mu$ denote the counting measure $\zeta$ on $\mathscr{A}$ (from Example 6, §3). Since $\emptyset$ is the only $\mu$-nullset, $\nu$ is trivially $\mu$-continuous. However, $\nu$ cannot have a density with respect to $\mu$. For from $\nu = f\mu$ with $f \in E^*$ it would follow that

\[ 0 = \nu(\{\omega\}) = \int_{\{\omega\}} f\,d\mu = f(\omega)\,\mu(\{\omega\}) = f(\omega) \]

for every $\omega \in \Omega$, making $f = 0$ and therefore $\nu = f\mu = 0$, which is not the case because $\Omega$ is uncountable.

3. Let $(\mathbb{R}, \mathscr{B}^1, \mu)$ be the 1-dimensional Lebesgue-Borel measure space (so $\mu = \lambda^1$) and denote by $\mathscr{N}$ the system of all $\mu$-nullsets. Then $\mathscr{N}$ is an example of a $\sigma$-ideal in $\mathscr{B}^1$: The union of any sequence of its sets is another, as are the intersections of its sets with those of $\mathscr{B}^1$ (cf. Exercise 5, §3). These properties insure that

\[ \nu(A) := \begin{cases} 0 & \text{if } A \in \mathscr{N}, \\ +\infty & \text{if } A \in \mathscr{B}^1 \setminus \mathscr{N} \end{cases} \]

defines a measure on $\mathscr{B}^1$ (cf. Exercise 6, §3). From its definition it is clear that $\nu$ is $\mu$-continuous. Here however (17.7) fails, since for every $\delta > 0$

\[ \mu([0, \delta]) = \delta \quad\text{and}\quad \nu([0, \delta]) = +\infty. \]

Thus the finiteness hypothesis on $\nu$ in 17.8 is not superfluous.

Example 2 shows that for the existence of a density $f \in E^*$ with $\nu = f\mu$, the $\mu$-continuity of $\nu$, while necessary, is not sufficient. All the more noteworthy is the theorem of Radon and Nikodym which we will prove, after a preparatory lemma.

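17.9 Lemma. Let $\sigma$ and $\tau$ be finite measures on a $\sigma$-algebra $\mathscr{A}$ of subsets of $\Omega$ and let $\varrho := \tau - \sigma$ denote their difference. Then there is a set $\Omega_0 \in \mathscr{A}$ with the properties

(17.8) $\varrho(\Omega_0) \ge \varrho(\Omega)$;

(17.9) $\varrho(A) \ge 0$ for all $A \in \Omega_0 \cap \mathscr{A}$.

Proof. Let us first prove the weaker claim: (*) For every $\varepsilon > 0$ there exists $\Omega_\varepsilon \in \mathscr{A}$ with the properties

(17.8') $\varrho(\Omega_\varepsilon) \ge \varrho(\Omega)$;

(17.9') $\varrho(A) > -\varepsilon$ for all $A \in \Omega_\varepsilon \cap \mathscr{A}$.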



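We may obviously suppose that $\varrho(\Omega) \ge 0$, since otherwise $\Omega_\varepsilon := \emptyset$ does what is wanted. If then $\varrho(A) > -\varepsilon$ for all $A \in \mathscr{A}$, it suffices to choose $\Omega_\varepsilon := \Omega$. So we consider the case that some $A_1 \in \mathscr{A}$ satisfies $\varrho(A_1) \le -\varepsilon$. From the definition of $\varrho$ and the subtractivity of the finite measures $\sigma$ and $\tau$,

$$\varrho(\complement A_1) = \varrho(\Omega) - \varrho(A_1) \ge \varrho(\Omega) + \varepsilon > \varrho(\Omega).$$

Therefore, if $\varrho(A) > -\varepsilon$ for all $A \in (\complement A_1) \cap \mathscr{A}$, we can set $\Omega_\varepsilon := \complement A_1$ and be done. In the contrary case there is a set $A_2 \in (\complement A_1) \cap \mathscr{A}$ with $\varrho(A_2) \le -\varepsilon$. Then because $A_1, A_2$ are disjoint

$$\varrho(\complement(A_1 \cup A_2)) = \varrho(\Omega) - \varrho(A_1) - \varrho(A_2) \ge \varrho(\Omega) + 2\varepsilon > \varrho(\Omega)$$

and the preceding dichotomy presents itself anew. If after finitely many repetitions of this procedure we have not reached our goal, then we will have generated a sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets in $\mathscr{A}$ with

$$\varrho(\Omega \setminus (A_1 \cup \ldots \cup A_n)) > \varrho(\Omega) \quad\text{and}\quad \varrho(A_n) \le -\varepsilon \quad\text{for every } n \in \mathbb{N}.$$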

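Because of the finite additivity of $\sigma$ and $\tau$, this would have the consequence that

$$\varrho(A_1 \cup \ldots \cup A_n) = \sum_{i=1}^{n} \varrho(A_i) \ \ldots$$

… $\varrho(A) > -1/n$ for all $n \in \mathbb{N}$ and every $A \in \Omega_0 \cap \mathscr{A}$. □

As indicated, this puts us in a position to answer the important question we posed earlier.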

17.10 Theorem (Radon-Nikodym). Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$. If $\mu$ is $\sigma$-finite, the following two assertions are equivalent:

(i) $\nu$ has a density with respect to $\mu$.
(ii) $\nu$ is $\mu$-continuous.

Proof. Only the implication (ii)$\Rightarrow$(i) is still in need of proof. To that end we distinguish three cases.

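First Case: The measures $\mu$ and $\nu$ are each finite. Form the set $\mathscr{G}$ of all $\mathscr{A}$-measurable numerical functions $g \ge 0$ on $\Omega$ which satisfy $g\mu \le \nu$, that is, which satisfy

$$\int_A g\,d\mu \le \nu(A) \quad\text{for all } A \in \mathscr{A}.$$

The constant function $g = 0$ lies in $\mathscr{G}$, so $\mathscr{G}$ is not empty. $\mathscr{G}$ is moreover sup-stable, that is, $g \vee h \in \mathscr{G}$ whenever $g, h \in \mathscr{G}$. Indeed, setting $A_1 := \{g \ge h\}$, $A_2 := \complement A_1$, every $A \in \mathscr{A}$ satisfies

$$\int_A g \vee h\,d\mu = \int_{A \cap A_1} g\,d\mu + \int_{A \cap A_2} h\,d\mu \le \nu(A \cap A_1) + \nu(A \cap A_2) = \nu(A).$$

Since $\int g\,d\mu \le \nu(\Omega) < +\infty$ for every $g \in \mathscr{G}$, the number

$$\gamma := \sup\Big\{\int g\,d\mu : g \in \mathscr{G}\Big\}$$

is finite and there is a sequence $(g'_n)$ in $\mathscr{G}$ such that $\lim \int g'_n\,d\mu = \gamma$. Due to sup-stability the functions $g_n := g'_1 \vee \ldots \vee g'_n$ lie in $\mathscr{G}$, and consequently $\gamma \ge \int g_n\,d\mu \ge \int g'_n\,d\mu$ (since $g_n \ge g'_n$) for all $n \in \mathbb{N}$, which shows that $\lim \int g_n\,d\mu = \gamma$. As the sequence $(g_n)$ is isotone, the monotone convergence theorem can be applied, assuring that $f := \sup g_n$ is a function in $\mathscr{G}$ and that $\int f\,d\mu = \gamma$. All this proves that the function $g \mapsto \int g\,d\mu$ on $\mathscr{G}$ assumes its maximum value at $f$.

Now we prove that $\nu = f\mu$. In any case we have $f\mu \le \nu$, since $f \in \mathscr{G}$, and so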

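$$\tau := \nu - f\mu$$

is a finite measure on $\mathscr{A}$, evidently $\mu$-continuous since $\nu$ is by hypothesis. We have to show that $\tau = 0$. So let us assume contrariwise that $\tau(\Omega) > 0$. Due to the $\mu$-continuity of $\tau$, this entails that $\mu(\Omega) > 0$ as well, and we may form the real number

$$\beta := \frac{1}{2}\,\frac{\tau(\Omega)}{\mu(\Omega)} > 0,$$

which satisfies $\tau(\Omega) = 2\beta\mu(\Omega) > \beta\mu(\Omega)$. The preceding lemma applied to $\tau$ and $\sigma := \beta\mu$ supplies a set $\Omega_0 \in \mathscr{A}$ which satisfies

$$\tau(\Omega_0) - \beta\mu(\Omega_0) \ge \tau(\Omega) - \beta\mu(\Omega) > 0$$

and $\tau(A) \ge \beta\mu(A)$ for all $A \in \Omega_0 \cap \mathscr{A}$. The $\mathscr{A}$-measurable, non-negative function $f_0 := f + \beta 1_{\Omega_0}$ therefore has the property

$$\int_A f_0\,d\mu = \int_A f\,d\mu + \beta\mu(\Omega_0 \cap A) \le \int_A f\,d\mu + \tau(A) = \nu(A)$$

for every $A \in \mathscr{A}$. These inequalities put $f_0$ in $\mathscr{G}$. Since $\tau$ is $\mu$-continuous and $\tau(\Omega_0) > \beta\mu(\Omega_0)$, we must have $\mu(\Omega_0) > 0$, leading to

$$\int f_0\,d\mu = \int f\,d\mu + \beta\mu(\Omega_0) = \gamma + \beta\mu(\Omega_0) > \gamma,$$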

an inequality which is incompatible with the definition of $\gamma$ and the fact that $f_0 \in \mathscr{G}$. The assumption $\tau(\Omega) > 0$ is therefore untenable, and $\tau = 0$, as desired.

Second Case: The measure $\mu$ is finite and the measure $\nu$ is infinite. We will produce a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$ into pairwise disjoint sets from $\mathscr{A}$ with the following properties:

(a) $A \in \Omega_0 \cap \mathscr{A} \implies$ either $\mu(A) = \nu(A) = 0$ or $0 < \mu(A) < \nu(A) = +\infty$.
(b) $\nu(\Omega_n) < +\infty$ for all $n \in \mathbb{N}$.

To this end let $\mathscr{Q}$ denote the system of all $Q \in \mathscr{A}$ with $\nu(Q) < +\infty$ and define $\alpha := \sup\{\mu(Q) : Q \in \mathscr{Q}\}$. This is a real number because the measure $\mu$ is finite. There is a sequence $(Q_m)_{m\in\mathbb{N}}$ in $\mathscr{Q}$ with $\lim \mu(Q_m) = \alpha$. Since $\mathscr{Q}$ is evidently closed under finite unions, $(Q_m)$ may be assumed to be isotone. $Q_0 := \bigcup_{m\in\mathbb{N}} Q_m$ is then a set from $\mathscr{A}$ satisfying $\mu(Q_0) = \alpha$. We will show that $\Omega_0 := \complement Q_0$ satisfies (a). So consider $A \in \Omega_0 \cap \mathscr{A}$ with $\nu(A) < +\infty$. We need to see that $\mu(A) = \nu(A) = 0$, and since $\nu$ is $\mu$-continuous we really only need to confirm that $\mu(A) = 0$. Since $\nu(A) < +\infty$ and, as noted already, $\mathscr{Q}$ is closed under union, each $Q_m \cup A$ lies in $\mathscr{Q}$, so that $\mu(Q_m \cup A) \le \alpha$, and consequently

$$\mu(Q_0 \cup A) = \lim_{m\to\infty} \mu(Q_m \cup A) \le \alpha.$$

Since $A$ is disjoint from $Q_0$, $\mu(Q_0 \cup A) = \alpha + \mu(A)$. Conjoined with the preceding inequality and the finiteness of $\alpha$ this says that indeed $\mu(A)$ must be 0. Finally, to take care of (b) we merely define $\Omega_1 := Q_1$, and $\Omega_m := Q_m \setminus Q_{m-1}$ for all integers $m \ge 2$ in order to get a decomposition of $\Omega$ with the desired properties.

Now let $\mu_n, \nu_n$ denote the restrictions of $\mu, \nu$ to the trace $\sigma$-algebra $\Omega_n \cap \mathscr{A}$, for $n = 0, 1, \ldots$ and note that each $\nu_n$ is a $\mu_n$-continuous measure. Moreover, for all $n \ge 1$ both $\mu_n$ and $\nu_n$ are finite. Case 1 therefore supplies $\Omega_n \cap \mathscr{A}$-measurable functions $f_n \ge 0$ on $\Omega_n$ with $\nu_n = f_n\mu_n$. Taking $f_0$ to be the constant function $+\infty$ on $\Omega_0$, $\nu_0 = f_0\mu_0$ also holds, thanks to (a). Finally, "putting all the pieces together" gives our result in this second case. Namely, the function $f$ on $\Omega$ defined to coincide on each $\Omega_n$ with $f_n$ ($n = 0, 1, \ldots$) is non-negative, $\mathscr{A}$-measurable and satisfies

$$\nu = f\mu.$$

Third Case: This is the general case: only the $\sigma$-finiteness of $\mu$ is demanded. There is according to 17.6 a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\mu$ is therefore finite and possesses exactly the same nullsets as does $\mu$. Consequently $\nu$ is also $(h\mu)$-continuous. By what has already been proved there is then an $\mathscr{A}$-measurable function $f \ge 0$ on $\Omega$ with $\nu = f(h\mu)$. According to 17.4 $\nu$ then has the density $fh$ with respect to $\mu$. □

The question arises whether, in the situation of Theorem 17.10, the density $f$ of $\nu$ is $\mu$-almost everywhere uniquely determined. From 17.5 we at least get a positive answer when $f$ is $\mu$-integrable, that is, when $\nu$ is a finite measure. But more is true:
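On a finite set the Radon-Nikodym density can be written down point by point, which makes the statement of 17.10 easy to check by hand. The following is a minimal sketch under that assumption; the dictionaries `mu` and `nu` and the helper names are illustrative and not part of the text. Exact rational arithmetic is used so that the identity $\nu(A) = \int_A f\,d\mu$ can be compared with `==`.

```python
from fractions import Fraction
from itertools import chain, combinations


def density(mu, nu):
    """Pointwise density f with nu = f*mu on a finite set.

    Assumption: mu and nu map each point to a finite non-negative weight,
    and nu is mu-continuous (mu-nullsets are nu-nullsets).
    """
    f = {}
    for w in mu:
        if mu[w] == 0:
            if nu[w] != 0:
                raise ValueError("nu is not mu-continuous")
            f[w] = Fraction(0)  # any value works on a mu-nullset
        else:
            f[w] = Fraction(nu[w]) / Fraction(mu[w])
    return f


def measure(weights, A):
    """Measure of the set A for a measure given by point weights."""
    return sum((weights[w] for w in A), Fraction(0))


def integral(f, mu, A):
    """Integral of f over A with respect to mu."""
    return sum((f[w] * mu[w] for w in A), Fraction(0))


def all_subsets(points):
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))


# Hypothetical example data.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(1, 8), "c": Fraction(0)}
f = density(mu, nu)
```

The value chosen for `f` on the $\mu$-nullset `{"c"}` is arbitrary, which is the finite-set shadow of the "almost everywhere" uniqueness discussed next.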

17.11 Theorem. Let $\nu = f\mu$ be a measure having a density $f$ with respect to a $\sigma$-finite measure $\mu$ on $\mathscr{A}$. Then $f$ is $\mu$-almost everywhere uniquely determined. The measure $\nu$ is $\sigma$-finite exactly when $f$ is $\mu$-almost everywhere real-valued.

Proof. First we show that $f$ is $\mu$-almost everywhere uniquely determined if the measure $\mu$ is finite. In proving this we may assume that $\nu(\Omega) = +\infty$, since its truth is otherwise a consequence of the second part of 17.5. Furthermore, as we now find ourselves in case 2 of the preceding proof, the decomposition of $\Omega$ into $\Omega_0, \Omega_1, \ldots$ employed there lets us confine our attention to $\Omega_0$, as 17.5 takes care of the remaining $\Omega_n$ ($n \in \mathbb{N}$). So it suffices to treat the case $\Omega = \Omega_0$, that is, to assume that $\mu$ and $\nu$ are linked by the alternative:

$$A \in \mathscr{A} \implies \text{either } \mu(A) = \nu(A) = 0 \text{ or } 0 < \mu(A) < \nu(A) = +\infty.$$

The constant function $+\infty$ is then a density for $\nu$ with respect to $\mu$ and what has to be shown for uniqueness is that $f = +\infty$ holds $\mu$-almost everywhere. And for that it suffices to show that $\mu(\{f < n\}) = 0$ for each $n \in \mathbb{N}$, which in turn is a consequence of the above alternative and the inequalities

$\nu(\{f < n\})$ …

… $\varrho(A) \ge 0$ for all $A$ in the trace $\sigma$-algebra $\Omega^+ \cap \mathscr{A}$, and $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.

Proof. Set

$$\gamma := \sup\{\varrho(A) : A \in \mathscr{A}\}$$

and choose a sequence $(A_n)$ in $\mathscr{A}$ with $\lim \varrho(A_n) = \gamma$. By applying 17.9 to the restriction of $\varrho$ to $A_n \cap \mathscr{A}$, we may replace $A_n$ by a set $P_n \in \mathscr{A}$ satisfying $\varrho(P_n) \ge \varrho(A_n)$ and $\varrho(A) \ge 0$ for all $A \in P_n \cap \mathscr{A}$. We will then have

(18.1) $\gamma = \sup\{\varrho(P_n) : n \in \mathbb{N}\}$.

The decomposition of $\Omega$ that is sought can be realized by

$$\Omega^+ := \bigcup_{n\in\mathbb{N}} P_n, \qquad \Omega^- := \Omega \setminus \Omega^+.$$

Indeed, all $A \in \Omega^+ \cap \mathscr{A}$ satisfy $\varrho(A) \ge 0$ because such an $A$ has the form

$$A = \bigcup_{n\in\mathbb{N}} B_n$$

with pairwise disjoint sets $B_n \in P_n \cap \mathscr{A}$ (by the disjointification procedure used in the verification of (3.10)). From this representation of $A$ and the $\sigma$-additivity follows $\varrho(A) = \sum_{n=1}^{\infty} \varrho(B_n) \ge 0$. Thus $\varrho$ assumes only non-negative real values on $\Omega^+ \cap \mathscr{A}$, that is, the restriction of $\varrho$ to $\Omega^+ \cap \mathscr{A}$ is a finite measure. Moreover, because $\varrho(P_n) \le \varrho(\Omega^+) \le \gamma$ and (18.1), this measure satisfies

$$\gamma = \varrho(\Omega^+).$$

In particular, $\gamma < +\infty$ since $\varrho$ assumes only real values. $\varrho(A) > 0$ cannot hold for any $A \in \Omega^- \cap \mathscr{A}$, for otherwise $\varrho(\Omega^+ \cup A) = \varrho(\Omega^+) + \varrho(A) > \gamma$. Thus, $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$. □

Measures (in the sense of Definition 3.3) have occasionally been interpreted as mass distributions on the underlying set $\Omega$. A finite signed measure can be analogously interpreted as an (electric) charge distribution smeared over $\Omega$. The foregoing theorem justifies this metaphor by showing that, as with charge in electrostatics, there are two disjoint sets, one carrying all the positive charge, the other all the negative charge.
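The charge picture is easiest to see on a finite set, where a signed measure is given by point charges. The sketch below is an illustration under that assumption (the dictionary `rho` and the function names are hypothetical): it collects the points of non-negative charge into $\Omega^+$, the rest into $\Omega^-$, and evaluates $\gamma = \varrho(\Omega^+)$, the largest value that $\varrho$ assumes.

```python
from fractions import Fraction


def hahn_decomposition(rho):
    """Split a finite set into (omega_plus, omega_minus) for point charges rho."""
    omega_plus = {w for w, q in rho.items() if q >= 0}
    omega_minus = set(rho) - omega_plus
    return omega_plus, omega_minus


def signed_measure(rho, A):
    """Value of the signed measure on the set A."""
    return sum((rho[w] for w in A), Fraction(0))


# Hypothetical example data: charges on the points 1, 2, 3, 4.
rho = {1: Fraction(2), 2: Fraction(-3), 3: Fraction(0), 4: Fraction(1, 2)}
omega_plus, omega_minus = hahn_decomposition(rho)
gamma = signed_measure(rho, omega_plus)
```

Moving the zero-charge point 3 from `omega_plus` to `omega_minus` gives another valid decomposition, so the decomposition is unique only up to sets on which the signed measure vanishes identically.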

§18*. Signed measures


From this theorem another important feature of signed measures becomes evident: The difference $\varrho$ in Lemma 17.9 is more than an illustrative example of a signed measure; it is the typical signed measure:

18.2 Corollary. Every finite signed measure $\varrho$ on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is the difference of two finite measures on $\mathscr{A}$.

Proof. Let $\Omega = \Omega^+ \cup \Omega^-$ be a Hahn decomposition in the sense of 18.1. Then evidently

$$\varrho^+(A) := \varrho(A \cap \Omega^+) \quad\text{and}\quad \varrho^-(A) := -\varrho(A \cap \Omega^-), \qquad A \in \mathscr{A},$$

define measures on $\mathscr{A}$, which satisfy $\varrho = \varrho^+ - \varrho^-$, since each $A \in \mathscr{A}$ is the disjoint union $(A \cap \Omega^+) \cup (A \cap \Omega^-)$. □

With this result the circle closes: finite signed measures are nothing more than the differences of finite measures. It is however possible to dispense with the finiteness hypothesis if $\sigma$-additivity is handled with sufficient care, but we will not go into this further. In the final analysis it is because of the preceding corollary that we only consider measures with non-negative values in this book. Often, to emphasize the distinction with signed measures, what we call simply measures are called positive measures.

Exercises. 1. Show that every finite signed measure on a $\sigma$-algebra is bounded and assumes a largest and a smallest value.

2. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and $\Omega = \Omega_1^+ \cup \Omega_1^-$, $\Omega = \Omega_2^+ \cup \Omega_2^-$ be two Hahn decompositions for it. Show that $\Omega_1^+ \triangle \Omega_2^+$ and $\Omega_1^- \triangle \Omega_2^-$ are totally $\varrho$-nullsets, meaning that $\varrho(N) = 0$ for every $N \in \mathscr{A}$ which is a subset of either of them. Conclude that to within such totally $\varrho$-nullsets there is only one Hahn decomposition for $\varrho$.

3. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. Show that the specific representation $\varrho = \varrho^+ - \varrho^-$ of $\varrho$ as the difference of the two measures on $\mathscr{A}$ which was produced in the proof of 18.2 is characterized by the following minimality property: In every representation $\varrho = \varrho_1 - \varrho_2$ as the difference of measures $\varrho_1, \varrho_2$ on $\mathscr{A}$, $\varrho_1 = \varrho^+ + \delta$ and $\varrho_2 = \varrho^- + \delta$ for an appropriate finite measure $\delta$ on $\mathscr{A}$, and indeed if $\Omega = \Omega^+ \cup \Omega^-$ is any Hahn decomposition of $\Omega$ corresponding to $\varrho$,

$$\delta = (1_{\Omega^+})\varrho_2 + (1_{\Omega^-})\varrho_1.$$

(Conversely, of course, every finite non-zero measure $\delta$ on $\mathscr{A}$ generates in this way a different representation of $\varrho$.) Infer that the only measure $\nu$ on $\mathscr{A}$ which satisfies $\nu(A) \le \min\{\varrho^+(A), \varrho^-(A)\}$ for every $A \in \mathscr{A}$ is the identically 0 measure. [Remark: The representation $\varrho = \varrho^+ - \varrho^-$ uniquely determined by this minimality condition is called the Jordan decomposition of the finite signed measure $\varrho$. As with functions, $\varrho^+$ and $\varrho^-$ are called the positive part and the negative part of $\varrho$.]



§19. Integration with respect to an image measure

Along with the measure space $(\Omega, \mathscr{A}, \mu)$ a measurable space $(\Omega', \mathscr{A}')$ and an $\mathscr{A}$-$\mathscr{A}'$-measurable mapping

$$T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$$

are given. Then the image measure

$$\mu' := T(\mu)$$

is defined in (7.5). The connection between $\mu$-integrals and $\mu'$-integrals is elucidated by:

19.1 Theorem. For every $\mathscr{A}'$-measurable numerical function $f' \ge 0$ on $\Omega'$

(19.1) $\int f'\,dT(\mu) = \int f' \circ T\,d\mu$.

Proof. The non-negative function $f' \circ T$ is $\mathscr{A}$-measurable, by 7.3. The integral on the right-hand side of (19.1) is therefore defined. To prove the equality there we first consider only $\mathscr{A}'$-elementary $f'$:

$$f' = \sum_{i=1}^{n} \alpha_i 1_{A'_i}$$

(with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A'_i \in \mathscr{A}'$). For such $f'$

$$f' \circ T = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$

with $A_i := T^{-1}(A'_i)$, so this composite is an $\mathscr{A}$-elementary function. Since

$$T(\mu)(A'_i) = \mu(A_i) \qquad (i = 1, \ldots, n)$$

holds by definition of image measures, (19.1) follows in this case. For an arbitrary $\mathscr{A}'$-measurable $f' \ge 0$ there is an isotone sequence $(u'_n)$ of $\mathscr{A}'$-elementary functions for which $u'_n \uparrow f'$. Then $(u'_n \circ T)$ is a sequence of $\mathscr{A}$-elementary functions for which $u'_n \circ T \uparrow f' \circ T$. From the validity of (19.1) for the $u'_n$ and Definition 11.3 of the integral in general, we get (19.1) for $f'$.
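For a measure concentrated on finitely many points, both sides of (19.1) are finite sums and the identity can be verified directly. The following sketch makes that assumption; the point weights `mu`, the map `T`, and the function `f_prime` (standing for $f'$, not a derivative) are hypothetical example data.

```python
from collections import defaultdict
from fractions import Fraction


def image_measure(mu, T):
    """Image measure T(mu): the weight of y is the mu-measure of T^{-1}({y})."""
    out = defaultdict(Fraction)
    for w, m in mu.items():
        out[T(w)] += m
    return dict(out)


def integral(f, mu):
    """Integral of f with respect to a measure given by point weights."""
    return sum((f(w) * m for w, m in mu.items()), Fraction(0))


# Hypothetical example data.
mu = {-2: Fraction(1, 4), -1: Fraction(1, 4), 1: Fraction(1, 3), 2: Fraction(1, 6)}


def T(w):
    return w * w


def f_prime(y):
    return Fraction(y) + 1


lhs = integral(f_prime, image_measure(mu, T))    # integral of f' with respect to T(mu)
rhs = integral(lambda w: f_prime(T(w)), mu)      # integral of f' o T with respect to mu
```

Here `T` is not injective, so the image measure merges the weights of $-2$ and $2$ (and of $-1$ and $1$); the two integrals agree nonetheless.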

19.2 Corollary 1. Let $f'$ be an $\mathscr{A}'$-measurable numerical function on $\Omega'$. Then the $T(\mu)$-integrability of $f'$ entails the $\mu$-integrability of $f' \circ T$, and conversely. In case of integrability

(19.2) $\int f'\,dT(\mu) = \int f' \circ T\,d\mu$.

Proof. From 19.1

$$\int (f')^+\,dT(\mu) = \int (f')^+ \circ T\,d\mu \quad\text{and}\quad \int (f')^-\,dT(\mu) = \int (f')^- \circ T\,d\mu,$$

and of course

$$(f' \circ T)^+ = (f')^+ \circ T \quad\text{and}\quad (f' \circ T)^- = (f')^- \circ T.$$

Both claims therefore follow from the definition of the integral 12.1.

19.3 Corollary 2. Suppose the mapping $T : \Omega \to \Omega'$ is bijective and $\mathscr{A}$-$\mathscr{A}'$-measurable, with $\mathscr{A}'$-$\mathscr{A}$-measurable inverse $T^{-1}$, and $f'$ is a numerical function on $\Omega'$. Then the $T(\mu)$-integrability of $f'$ is equivalent to the $\mu$-integrability of $f' \circ T$, and in its presence equality (19.2) prevails.

One has only to note that the integrability of $f' \circ T$ entails the measurability of $f' \circ T$ and therewith that of $f' = f' \circ T \circ T^{-1}$. The content of 19.1-19.3 constitutes what is called the "general transformation theorem for integrals".

As the behavior of the L-B measure with respect to $C^1$-diffeomorphisms is known from (8.16'), the transformation theorem for Lebesgue integrals follows at once:

19.4 Theorem. Let $G, G'$ be open subsets of $\mathbb{R}^d$, $\varphi : G \to G'$ a $C^1$-diffeomorphism of $G$ onto $G'$. A numerical function $f'$ on $G'$ is $\lambda^d$-integrable if and only if the function $f' \circ \varphi\,|\det D\varphi|$ is $\lambda^d$-integrable over $G$, and in this case

(19.3) $\int_{G'} f'\,d\lambda^d = \int_G f' \circ \varphi\,|\det D\varphi|\,d\lambda^d$.
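A quick numerical sanity check of (19.3) in dimension $d = 1$ may help fix the roles of $\varphi$ and $|\det D\varphi|$. The sketch below is an illustration only: it assumes $G = G' = \,]0,1[$, $\varphi(x) = x^2$ (so $|\det D\varphi(x)| = 2x$), and $f' = \cos$, and it approximates both Lebesgue integrals by a midpoint rule, which is adequate here because the integrands are continuous. All names are hypothetical.

```python
import math


def midpoint_integral(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over ]a, b[."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Assumed C^1-diffeomorphism of ]0,1[ onto ]0,1[."""
    return x * x


def dphi(x):
    """Derivative of phi; its absolute value plays the role of |det D phi|."""
    return 2.0 * x


def f_prime(y):
    """The function f' on G' (not a derivative)."""
    return math.cos(y)


lhs = midpoint_integral(f_prime, 0.0, 1.0)
rhs = midpoint_integral(lambda x: f_prime(phi(x)) * abs(dphi(x)), 0.0, 1.0)
```

Both numbers approximate $\sin 1$, the exact value of either side.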

Proof. The $\lambda^d$-integrability of $f'$ over $G'$ and that of $f' \circ \varphi\,|\det D\varphi|$ over $G$ means the $\lambda^d_{G'}$-integrability and the $\lambda^d_G$-integrability of those functions, respectively. According to (8.16') $\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi|\,\lambda^d_G$;

furthermore, the Borel measurability of f' is equivalent to that of f'o 0 such that

r

fA-'(A) f oTdp fqdu

-TJ for all dat-measurable numerical functions f > 0 on fl, and all A E d.

§20. Stochastic convergence Let us return to the study of p-fold integrable functions begun in §14. Our goal will be to replace the almost-everywhere convergence concept that underlies the theorems proved there with a weaker convergence concept. It is suggested by a simple but very useful inequality.

The setting is once again an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$.

20.1 Lemma. For every measurable numerical function $f$ on $\Omega$ and every pair of real numbers $p > 0$ and $\alpha > 0$ the Chebyshev-Markov inequality

(20.1) $\mu(\{|f| \ge \alpha\}) \le \alpha^{-p} \int |f|^p\,d\mu$ …

… the sets $\{|f_n - f^*| \ge \alpha\} \cap A$ and $\{|f_n - f| \ge \alpha\} \cap A$ differ from each other only in an ($n$-independent) nullset. The converse of this is important:

20.3 Theorem. For every $\sigma$-finite measure $\mu$, any two stochastic limits of a sequence of measurable real functions are $\mu$-almost everywhere equal to each other.



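Proof. If $f$ and $f^*$ are stochastic limits of the sequence $(f_n)$, then from the triangle inequality in $\mathbb{R}$

$$\{|f - f^*| \ge \alpha\} \subset \{|f_n - f| \ge \alpha/2\} \cup \{|f_n - f^*| \ge \alpha/2\},$$

whence

$$\mu(\{|f - f^*| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha/2\} \cap A) + \mu(\{|f_n - f^*| \ge \alpha/2\} \cap A)$$

for every $n \in \mathbb{N}$ and every $A \in \mathscr{A}$. Letting $n \to \infty$ shows that

$$\mu(\{|f - f^*| \ge \alpha\} \cap A) = 0$$

for every $\alpha > 0$ and every $A \in \mathscr{A}$ of finite measure. Then however, $f = f^*$ $\mu$-almost everywhere in every such set $A$, since

$$\{f \ne f^*\} \cap A = \bigcup_{k\in\mathbb{N}} \{|f - f^*| \ge 1/k\} \cap A$$

is a $\mu$-nullset. Upon taking for $A$ the sets in a sequence $(A_n)$ in $\mathscr{A}$ which satisfies $\mu(A_n) < +\infty$ for all $n$ and $A_n \uparrow \Omega$, the $\mu$-almost everywhere equality of $f$ and $f^*$ follows. □

To supplement this fact we mention: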

Remark. 4. Stochastic limits $f$ and $f^*$ of the same sequence $(f_n)$ are almost everywhere equal without any hypotheses on the measure itself if both functions are $p$-fold integrable for some $p \in [1, +\infty[$. This is because for every real $\alpha > 0$ the set $\{|f - f^*| \ge \alpha\}$ has finite measure, by (20.1), and so $f = f^*$ $\mu$-almost everywhere in this set, whence $\{|f - f^*| > 0\} = \bigcup_{n\in\mathbb{N}} \{|f - f^*| \ge 1/n\}$ is a countable union of $\mu$-nullsets. This just says that $f = f^*$ $\mu$-almost everywhere in $\Omega$. But the next example shows that it may fail if one of the functions is not in any $\mathscr{L}^p$-space.

Example. 2. Consider the measure space $(\Omega, \mathscr{P}(\Omega), \mu)$, where $\Omega$ consists of exactly two elements $\omega_0, \omega_1$ and $\mu(\{\omega_0\}) = 0$, $\mu(\{\omega_1\}) = +\infty$, $f_n = f = 0$ for every $n \in \mathbb{N}$. These functions lie in every $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges stochastically to $f$, as well as to every real-valued function $f^*$ on $\Omega$. Every such $f^*$ which is non-zero at $\omega_1$, however, lies in no $\mathscr{L}^p(\mu)$ with $1 \le p < +\infty$ and fails to coincide $\mu$-almost everywhere in $\Omega$ with $f$.

The considerations with which we began this section lead to an important class of stochastically convergent sequences:

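20.4 Theorem. If the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, then it also converges to $f$ $\mu$-stochastically.

Proof. The Chebyshev-Markov inequality tells us that

$\mu(\{|f_n - f| \ge \alpha\} \cap A)$ …

(20.6) $\lim_{n\to\infty} \mu(\{\sup_{m \ge n} |f_m| \ge \alpha\}) = 0$ for every $\alpha > 0$,

(20.6') $\lim_{n\to\infty} \mu(\{\sup_{m \ge n} |f_m| > \alpha\}) = 0$ for every $\alpha > 0$,

(20.7) $\mu(\limsup_{n\to\infty} \{|f_n| > \alpha\}) = 0$ for every $\alpha > 0$.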

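Proof. To prove the equivalence of (20.6) with the almost everywhere convergence of $(f_n)$ to 0, we set, for each $\alpha > 0$ and each $n \in \mathbb{N}$,

$$A_n^\alpha := \{\sup_{m \ge n} |f_m| \ge \alpha\}.$$

Obviously both $n \mapsto A_n^\alpha$ and $\alpha \mapsto A_n^\alpha$ are antitone mappings; then $k \mapsto A_n^{1/k}$ is isotone on $\mathbb{N}$. If we also set

$$A := \{\omega \in \Omega : \lim_{n\to\infty} f_n(\omega) = 0\} = \{\omega \in \Omega : \limsup_{n\to\infty} |f_n|(\omega) = 0\},$$

then all these sets lie in $\mathscr{A}$, either by appeal to 9.5 or by noticing that each $A_n^\alpha \in \mathscr{A}$ and

$$A = \bigcap_{k\in\mathbb{N}} \bigcup_{n\in\mathbb{N}} \complement A_n^{1/k}.$$

Passing to complements,

$$\complement A = \bigcup_{k\in\mathbb{N}} \bigcap_{n\in\mathbb{N}} A_n^{1/k}$$

and so

$$\bigcap_{n\in\mathbb{N}} A_n^{1/k} \uparrow \complement A \ \text{ as } k \to \infty, \quad\text{and}\quad A_n^{1/k} \downarrow \bigcap_{m\in\mathbb{N}} A_m^{1/k} \ \text{ as } n \to \infty.$$

Consequently,

(20.8) $\mu(\complement A) = \sup_{k\in\mathbb{N}} \mu\Big(\bigcap_{n\in\mathbb{N}} A_n^{1/k}\Big) = \sup_{k\in\mathbb{N}} \inf_{n\in\mathbb{N}} \mu(A_n^{1/k})$

because the finite measure $\mu$ is both continuous from above and continuous from below, by 3.2. Thus $(f_n)$ converges almost everywhere to 0 just when the number defined by (20.8) is 0. In turn, the latter occurs exactly in case

$$\inf_{n\in\mathbb{N}} \mu(A_n^{1/k}) = \lim_{n\to\infty} \mu(A_n^{1/k}) = 0$$

for every $k \in \mathbb{N}$. The first equivalence follows from this. The equivalence of (20.6) with (20.6') follows from the observation that for any numerical function $g$ on $\Omega$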

$$\{g > \alpha\} \subset \{g \ge \alpha\} \subset \{g > \alpha'\}$$

whenever $0 < \alpha' < \alpha$. Finally, the equivalence of (20.6') with (20.7) follows from the validity, for every $\alpha > 0$, of the equality

(20.9) $\lim_{n\to\infty} \mu(\{\sup_{m \ge n} |f_m| > \alpha\}) = \mu(\limsup_{n\to\infty} \{|f_n| > \alpha\})$.

For the proof of this we introduce

$$B_n := \bigcup_{m \ge n} \{|f_m| > \alpha\} \quad\text{and}\quad B := \limsup_{n\to\infty} \{|f_n| > \alpha\}.$$

On the one hand, $B_n \downarrow B$ and consequently $\lim \mu(B_n) = \mu(B)$. On the other hand, however,

$$B_n = \bigcup_{m \ge n} \{|f_m| > \alpha\} = \{\sup_{m \ge n} |f_m| > \alpha\}.$$

From this finally we get the needed (20.9). □

The conditions involved in Theorems 20.4 and 20.5 are indeed sufficient to insure stochastic convergence, but they are not necessary for it, as the following examples show.


Examples. 3. Let $\Omega := [0,1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$, a finite measure. With $A_n := \,]0, 1/n[\, \in \mathscr{A}$, the sequence $(n1_{A_n})$ converges to 0 at every point of $\Omega$ and so, either by appeal to 20.5 or by virtue of

$$\mu(\{n1_{A_n} \ge \alpha\}) = \mu(A_n) = \frac{1}{n} \quad\text{whenever } 0 < \alpha \le n \in \mathbb{N},$$

this sequence also converges stochastically to 0. By contrast

$$\int (n1_{A_n})^p\,d\mu = n^p\mu(A_n) = n^{p-1}$$

shows that the sequence does not converge to 0 in $p$th mean for any $p \ge 1$.

4. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space of the preceding example. Write each $n \in \mathbb{N}$ as $n = 2^h + k$ with non-negative integers $h$ and $k$ satisfying $0 \le k < 2^h$ (which uniquely determines them) and set

$$A_n := [k2^{-h}, (k+1)2^{-h}[, \qquad f_n := 1_{A_n}, \qquad n \in \mathbb{N}.$$

It was shown in the example in §15 that the sequence $(f_n(\omega))_{n\in\mathbb{N}}$ converges for no $\omega \in \Omega$. Nevertheless the sequence $(f_n)$ does converge stochastically to 0, since for every $\alpha > 0$ and $n \in \mathbb{N}$

$$\mu(\{|f_n| \ge \alpha\}) \le 2^{-h} < \frac{2}{n}.$$

In this example stochastic convergence can also be inferred from 20.4, since the example in §15 showed that $(f_n)$ converges to 0 in $p$th mean for every $p \in [1, +\infty[$.

The connection between stochastic convergence and almost-everywhere convergence is nevertheless closer than one would be led to suspect on the basis of the last example.
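The sequence of Example 4 can be explored directly with exact rational arithmetic. The sketch below is an illustration with hypothetical function names: it recovers $h$ and $k$ from $n = 2^h + k$, builds the interval $A_n$, and lets one observe both phenomena at once, namely that $\mu(A_n) = 2^{-h} < 2/n$ tends to 0, while every point of $[0,1[$ lies in exactly one $A_n$ on each dyadic level $h$ and hence in infinitely many $A_n$.

```python
from fractions import Fraction


def decompose(n):
    """Return (h, k) with n = 2**h + k and 0 <= k < 2**h, for n >= 1."""
    h = n.bit_length() - 1
    return h, n - 2**h


def interval(n):
    """Endpoints (a, b) of the half-open interval A_n = [a, b[."""
    h, k = decompose(n)
    return Fraction(k, 2**h), Fraction(k + 1, 2**h)


def f(n, w):
    """Indicator function of A_n evaluated at the point w."""
    a, b = interval(n)
    return 1 if a <= w < b else 0


def measure_A(n):
    """Lebesgue measure of A_n."""
    a, b = interval(n)
    return b - a
```

For instance, `interval(5)` is the interval $[1/4, 1/2[$, since $5 = 2^2 + 1$.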

20.7 Theorem. If a sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions converges $\mu$-stochastically to a measurable real function $f$, then for every $A \in \mathscr{A}$ of finite $\mu$-measure some subsequence of $(f_n)$ converges to $f$ $\mu$-almost everywhere in $A$.

Proof. For $A \in \mathscr{A}$ with $\mu(A) < +\infty$, the measure $\mu_A$, which is the restriction of $\mu$ to $A \cap \mathscr{A}$, is finite. It therefore suffices to deal with the case of a finite measure $\mu$; moreover, in that case we can simply take $A$ to be $\Omega$ itself. For $\alpha > 0$ and $m, n \in \mathbb{N}$ the triangle inequality shows that

$$\{|f_m - f_n| \ge \alpha\} \subset \{|f_m - f| \ge \alpha/2\} \cup \{|f_n - f| \ge \alpha/2\};$$

thus by hypothesis $\mu(\{|f_m - f_n| \ge \alpha\})$ can be made arbitrarily small by taking $m$ and $n$ sufficiently large. If therefore $(\eta_k)_{k\in\mathbb{N}}$ is a sequence of positive real numbers with

$$\sum_{k=1}^{\infty} \eta_k < +\infty,$$

then for each $k \in \mathbb{N}$ there is an $n_k \in \mathbb{N}$ such that

$$\mu(\{|f_m - f_{n_k}| \ge \eta_k\}) \le \eta_k \quad\text{for all } m \ge n_k.$$

The $n_k$ can be chosen strictly increasing. Setting $A_k := \{|f_{n_{k+1}} - f_{n_k}| \ge \eta_k\}$, we then have

$$\sum_{k=1}^{\infty} \mu(A_k) \le \sum_{k=1}^{\infty} \eta_k < +\infty$$

and consequently,

$$\lim_{n\to\infty} \sum_{k=n}^{\infty} \mu(A_k) = 0.$$

From this it follows that the set $A := \limsup_{n\to\infty} A_n$ satisfies

$$\mu(A) = 0,$$

because $A \subset \bigcup_{k \ge n} A_k$ for every $n \in \mathbb{N}$, entailing that $\mu(A) \le \sum_{k=n}^{\infty} \mu(A_k)$ for every $n$. The definition of $A$ shows that if $\omega \in \complement A$, then the inequality

$$|f_{n_{k+1}}(\omega) - f_{n_k}(\omega)| \ge \eta_k$$

prevails for at most finitely many $k \in \mathbb{N}$. Therefore, along with the series $\sum \eta_k$, the series

$$\sum_{k=1}^{\infty} |f_{n_{k+1}}(\omega) - f_{n_k}(\omega)|$$

converges (absolutely); that is, the sequence $(f_{n_k}(\omega))_{k\in\mathbb{N}}$ converges in $\mathbb{R}$. In summary, the sequence $(f_{n_k})$ converges almost everywhere to a measurable real function $f^*$ on $\Omega$. By 20.5 $f^*$ is also a stochastic limit of $(f_{n_k})_{k\in\mathbb{N}}$. But, as a subsequence of $(f_n)$, that sequence converges stochastically to $f$ as well. Hence by 20.3, $f = f^*$ almost everywhere. We have shown therefore that $(f_{n_k})_{k\in\mathbb{N}}$ converges almost everywhere to $f$. □

In terms of almost-everywhere convergence we can now even characterize stochastic convergence by a subsequence principle.
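Theorem 20.7 can be seen at work on the sequence of Example 4 above, which converges at no point. Along the subsequence $n_k = 2^k$ the functions are the indicators of $[0, 2^{-k}[$, whose measures $2^{-k}$ are summable, and this subsequence converges to 0 at every point except $\omega = 0$, that is, almost everywhere. The following sketch (hypothetical names, exact arithmetic) tabulates the subsequence at a given point.

```python
from fractions import Fraction


def f(n, w):
    """Indicator of A_n = [k/2**h, (k+1)/2**h[ at w, where n = 2**h + k."""
    h = n.bit_length() - 1
    k = n - 2**h
    return 1 if Fraction(k, 2**h) <= w < Fraction(k + 1, 2**h) else 0


def subsequence_values(w, K):
    """Values f_{n_k}(w) along the subsequence n_k = 2**k for k = 0, ..., K-1."""
    return [f(2**k, w) for k in range(K)]
```

At $\omega = 1/3$ the values are eventually 0, whereas at the single exceptional point $\omega = 0$ they are constantly 1; the exceptional set is a nullset, in agreement with the theorem.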

20.8 Corollary. A sequence $(f_n)$ of measurable real functions on $\Omega$ converges $\mu$-stochastically to a measurable real function $f$ on $\Omega$ if and only if for each $A \in \mathscr{A}$ of finite measure, each subsequence $(f_{n_k})_{k\in\mathbb{N}}$ of $(f_n)$ contains a further subsequence which converges to $f$ $\mu$-almost everywhere in $A$.

Proof. The preceding theorem establishes that the subsequence condition is necessary for the stochastic convergence of $(f_n)$ to $f$, since every subsequence of $(f_n)$ likewise converges stochastically to $f$. Let us now assume that the subsequence condition is fulfilled, and fix an $A \in \mathscr{A}$ of finite measure. Since every subsequence $(f_{n_k})$ contains another which converges almost everywhere in $A$ to $f$ and by 20.5 this latter subsequence must also converge (in $A$) stochastically to $f$, we see that in the sequence of numbers

$$\mu(\{|f_{n_k} - f| \ge \alpha\} \cap A) \qquad (k \in \mathbb{N}),$$

in which $\alpha > 0$ is fixed, a subsequence exists which converges to 0. But, as an easy argument confirms, a sequence of real numbers all of whose subsequences have this property must itself converge to 0. That is, the sequence of real numbers

$$\mu(\{|f_n - f| \ge \alpha\} \cap A) \qquad (n \in \mathbb{N})$$

converges to 0. As this is true of every $A \in \mathscr{A}$ having finite measure and every $\alpha > 0$, the stochastic convergence of $(f_n)$ to $f$ is thereby confirmed. □

Remarks. 5. It is not to be expected that in 20.7 and 20.8 the reference to the finite-measure set $A \in \mathscr{A}$ can be stricken. This is already illustrated by Example 2 if one replaces the sequence $(f_n)$ there with the sequence $(f'_n)$ defined by $f'_n := n1_{\{\omega_1\}}$, $n \in \mathbb{N}$. This new sequence also converges stochastically to $f := 0$. See however Exercise 5.

6. The second part of the proof of 20.7 shows that for finite measures $\mu$ there is a Cauchy criterion for the stochastic convergence of a sequence $(f_n)$: Necessary and sufficient for the stochastic convergence of a sequence $(f_n)$ to a measurable real function on $\Omega$ is the condition

$$\lim_{m,n\to\infty} \mu(\{|f_m - f_n| \ge \alpha\}) = 0 \quad\text{for every } \alpha > 0.$$

7. The sequence formed by alternately taking terms from each of two stochastically convergent sequences whose limit functions do not coincide almost everywhere shows that in Corollary 20.8 it does not suffice to demand that in each $A$ some subsequence of the full sequence $(f_n)$ converge almost everywhere.

A particularly useful consequence of 20.8 is:

20.9 Theorem. If the sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$, and $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous, then the sequence $(\varphi \circ f_n)_{n\in\mathbb{N}}$ converges stochastically to $\varphi \circ f$.

Proof. One exploits both directions of 20.8, noting that from the almost everywhere convergence of a subsequence $(f_{n_k})$ to $f$ on an $A \in \mathscr{A}$ follows the almost everywhere convergence of $(\varphi \circ f_{n_k})$ to $\varphi \circ f$ on $A$. □

The general question of functions $\varphi : \mathbb{R} \to \mathbb{R}$ which preserve convergence, in the sense that $(\varphi \circ f_n)_{n\in\mathbb{N}}$ inherits the kind of convergence $(f_n)_{n\in\mathbb{N}}$ has, is investigated by BARTLE and JOICHI [1961]. They show how Theorem 20.9 can fail if the more restrictive definition (20.5) is adopted for stochastic convergence.



Exercises. 1. $(f_n)$ and $(g_n)$ are stochastically convergent sequences of measurable real functions, having limit functions $f$ and $g$, respectively. Show that for all $\alpha, \beta \in \mathbb{R}$ the sequence $(\alpha f_n + \beta g_n)$ converges stochastically to $\alpha f + \beta g$, and the sequences $(f_n \wedge g_n)$, $(f_n \vee g_n)$ converge stochastically to $f \wedge g$, $f \vee g$, respectively.

2. For a measure space $(\Omega, \mathscr{A}, \mu)$ with finite measure $\mu$ let $d_\mu$ be the pseudometric on $\mathscr{A}$ constructed in Exercise 7 of §3. Show that a sequence $(A_n)$ in $\mathscr{A}$ is $d_\mu$-convergent to $A \in \mathscr{A}$ if and only if the sequence $(1_{A_n})$ of indicator functions converges stochastically to the indicator function $1_A$.

3. For every pair of measurable real functions $f$ and $g$ on a measure space $(\Omega, \mathscr{A}, \mu)$ with finite measure $\mu$ define

$$D_\mu(f, g) := \inf\{\varepsilon > 0 : \mu(\{|f - g| \ge \varepsilon\}) \le \varepsilon\}$$

and then prove that
(a) $D_\mu$ is a pseudometric on the set $M(\mathscr{A})$ of all measurable real functions.
(b) A sequence $(f_n)$ in $M(\mathscr{A})$ converges stochastically to $f \in M(\mathscr{A})$ if and only if $\lim_{n\to\infty} D_\mu(f_n, f) = 0$.
(c) $M(\mathscr{A})$ is $D_\mu$-complete, that is, every $D_\mu$-Cauchy sequence in $M(\mathscr{A})$ converges with respect to $D_\mu$ to some function in $M(\mathscr{A})$.
What is the relation of $D_\mu$ to the $d_\mu$ of Exercise 2?

4. In the context of Exercise 3 define

If - gi

dp,

for every pair of functions $f, g \in M(\mathscr{A})$. Show that $D'_\mu$ also enjoys the properties (a)-(c) proved for $D_\mu$ in the preceding exercise.

5. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space. Show that a sequence $(f_n)$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$ if and only if from every subsequence $(f_{n_k})$ of $(f_n)$ a further subsequence can be extracted which converges almost everywhere in $\Omega$ to $f$. [Hints: Suppose $(f_n)$ is stochastically convergent. Choose a sequence $(A_k)$ from $\mathscr{A}$ with $\mu(A_k) < +\infty$ for each $k$ and $A_k \uparrow \Omega$, and consider the finite measures $\mu_k(A) := \mu(A \cap A_k)$ on $\mathscr{A}$. The claim is true of each measure $\mu_k$. Given a subsequence of $(f_n)$, there is for each $k \in \mathbb{N}$ a subsequence $(g_n^{(k)})_{n\in\mathbb{N}}$ of it which converges $\mu_k$-almost everywhere to $f$. It can be arranged that $(g_n^{(k+1)})$ is a subsequence of $(g_n^{(k)})$ for each $k$. Then the diagonal subsequence $(g_n^{(n)})_{n\in\mathbb{N}}$ does what is wanted.]

6. Give an "elementary" proof of 20.9 based directly on the relevant definition 20.2.

To this end, show that for each E E 10, 1[ there exists 6 > 0 such that fl f I
0.

3. Suppose $M$ is a set of measurable numerical functions on $\Omega$, $1 \le p < +\infty$, and there is a $p$-fold $\mu$-integrable majorant $g$ for $M$, that is, every $f \in M$ satisfies

$$|f| \le g \quad \mu\text{-almost everywhere.}$$

Then the set

$$M^p := \{|f|^p : f \in M\}$$

is equi-integrable. Indeed, as in Example 2, the single integrable function $h := 2g^p$ is an $\varepsilon$-bound for every $\varepsilon > 0$, since by 13.6

$$\int_{\{|f|^p \ge h\}} |f|^p\,d\mu \le \int_{\{g^p \ge h\}} g^p\,d\mu = \int_{\{g = +\infty\}} g^p\,d\mu = 0.$$

This example shows that Theorem 15.6 on dominated convergence is really about an equi-integrable set of functions. Of course, one cannot expect that conversely from the equi-integrability of a subset of $\mathscr{L}^1(\mu)$ there should follow the existence of a single integrable majorant for the set. The following example confirms this.

4. Consider the probability space $(\mathbb{N}, \mathscr{P}(\mathbb{N}), \mu)$, the finite measure $\mu$ being specified by $\mu(\{n\}) = 2^{-n}$ for each $n \in \mathbb{N}$. The sequence of functions $f_n := 2^n n^{-1} 1_{\{n\}}$ ($n \in \mathbb{N}$) is equi-integrable: For the constant function $1 \in \mathscr{L}^1(\mu)$ the inequality

fn dµ
0 and every n E N

/

JIf-I>g}

If,.Idµ=J

r

ndµ=J ndµ-J A

ndµ>1-J

A

From the finiteness of the measure $g\mu$ and the fact that $A_n \downarrow \emptyset$, it follows that

$$\liminf_{n\to\infty} \int_{\{|f_n| \ge g\}} |f_n|\,d\mu \ge 1,$$

showing that $g$ cannot be an $\varepsilon$-bound for any $\varepsilon \in \,]0, 1[$.

Here is a useful characterization of equi-integrability, which, for $\sigma$-finite measures, will be improved upon in 21.8.
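The example of the functions $f_n := 2^n n^{-1} 1_{\{n\}}$ on $\mathbb{N}$ with $\mu(\{n\}) = 2^{-n}$, given earlier in this section, can be checked with exact arithmetic. The sketch below is an illustration with hypothetical helper names and with truncation at a finite index $N$ as an explicit assumption. It computes $\int f_n\,d\mu = 1/n$, the largest tail integral $\int_{\{f_n \ge c\}} f_n\,d\mu$ over $n \le N$ for a constant threshold $c$ (which shrinks as $c$ grows, as equi-integrability requires), and the partial sums of $\int \sup_n f_n\,d\mu$, which are harmonic sums and therefore unbounded, so that no integrable majorant exists.

```python
from fractions import Fraction


def mu_point(n):
    """mu({n}) = 2**(-n)."""
    return Fraction(1, 2**n)


def peak(n):
    """The only non-zero value of f_n, taken at the point n."""
    return Fraction(2**n, n)


def integral_fn(n):
    """Integral of f_n with respect to mu."""
    return peak(n) * mu_point(n)


def tail(n, c):
    """Integral of f_n over the set {f_n >= c}, for a constant c > 0."""
    return integral_fn(n) if peak(n) >= c else Fraction(0)


def worst_tail(c, N):
    """Largest tail integral among f_1, ..., f_N (truncation at N is an assumption)."""
    return max(tail(n, c) for n in range(1, N + 1))


def majorant_integral(N):
    """Integral of sup_n f_n over {1, ..., N}: the N-th harmonic number."""
    return sum((peak(n) * mu_point(n) for n in range(1, N + 1)), Fraction(0))
```

With threshold $c = 100$, only the functions $f_n$ with $n \ge 10$ reach the threshold, so the worst tail integral is $1/10$.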

§21. Equi-integrability


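21.2 Theorem. A set $M$ of measurable numerical functions on $\Omega$ is equi-integrable if and only if the following two conditions are satisfied:

(21.3) $\sup_{f \in M} \int |f|\,d\mu < +\infty$.

(21.4) For every $\varepsilon > 0$ there exists a $\mu$-integrable function $h \ge 0$ and a number $\delta > 0$ such that

$$\int_A h\,d\mu \le \delta \implies \int_A |f|\,d\mu \le \varepsilon \quad\text{for all } f \in M \text{ and all } A \in \mathscr{A}.$$

Proof. For every $A \in \mathscr{A}$, every measurable numerical function $f$ on $\Omega$, and every integrable function $g \ge 0$

$$\int_A |f|\,d\mu = \int_{A \cap \{|f| \ge g\}} |f|\,d\mu + \int_{A \cap \{|f| < g\}} |f|\,d\mu \ \ldots$$

… $\delta > 0$ be as furnished by (21.4). For each $f \in M$ and real $\alpha > 0$, consider the obviously valid inequality

$$\int |f|\,d\mu \ge \int_{\{|f| \ge \alpha h\}} |f|\,d\mu \ge \int_{\{|f| \ge \alpha h\}} \alpha h\,d\mu$$

or its equivalent

$$\int_{\{|f| \ge \alpha h\}} h\,d\mu \le \frac{1}{\alpha} \int |f|\,d\mu.$$

The integrals $\int |f|\,d\mu$ here are bounded as $f$ ranges over $M$, by (21.3). Therefore $\alpha > 0$ can be chosen so large that

$$\int_{\{|f| \ge \alpha h\}} h\,d\mu \le \delta \quad\text{for all } f \in M.$$

(21.4) then insures that $g := \alpha h$ is an $\varepsilon$-bound for $M$, which proves that this set is equi-integrable. □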

21.3 Corollary. Let $M \subset \mathscr{L}^p(\mu)$ and let the set $M^p := \{|f|^p : f \in M\}$ be equi-integrable, where $1 \le p < +\infty$. Then the set

M;:={laf+,0glP:f,gEM,a,,0ER,Ial:_1,1,01g} Ifnrn IP do + J Ifm,.I 0 from 2'(It). If in addition lien

then the

sequence

f f dit = If dp, J

converges to f in mean.

Proof. We consider the sequence $(f \wedge f_n)_{n\in\mathbb{N}}$. The inequalities

$$0 \le f \wedge f_n \le f$$

and Example 3 show that it is equi-integrable. Since

$0 \le f - f \wedge f_n$ …

From this, the decomposition $f + f_n = f \vee f_n + f \wedge f_n$, and the convergence hypothesis follows the companion result

(21.10') $\lim_{n\to\infty} \int f \vee f_n\,d\mu = \int f\,d\mu$.

But then the decomposition

$$|f_n - f| = f \vee f_n - f \wedge f_n$$

shows that the claimed mean convergence ensues upon subtracting (21.10) from (21.10').

Now we can get the sharpening of Theorems 21.4 and 15.4 mentioned earlier:

21.7 Theorem. For every sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ which converges $\mu$-stochastically to a function $f \in \mathscr{L}^p(\mu)$ the following three assertions are equivalent:

(i) The sequence $(f_n)$ converges in $p$th mean to $f$.
(ii) The sequence $(|f_n|^p)$ is equi-integrable.
(iii) $\lim_{n\to\infty} \int |f_n|^p\,d\mu = \int |f|^p\,d\mu$.

Proof. The equivalence of (i) and (ii) is contained in Theorem 21.4. We need therefore establish only two implications:

(i)$\Rightarrow$(iii): Assertion (15.6) in Theorem 15.1 affirms this.

(iii)$\Rightarrow$(ii): From the hypothesized stochastic convergence of the sequence $(f_n)$ to $f$ follows that of $(|f_n|^p)$ to $|f|^p$, via 20.9. And then from the preceding lemma it further follows that the sequence $(|f_n|^p)$ converges to $|f|^p$ in mean. Finally, Theorem 21.4, with the $p$ there chosen to be 1, shows that the convergence in mean of this sequence entails its equi-integrability.



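For $\sigma$-finite measures $\mu$, equi-integrability can be characterized in a way that is particularly convenient for applications. The $\sigma$-finiteness will be exploited in the form expressed by 17.6, that there is a strictly positive function $h$ in $\mathscr{L}^1(\mu)$.

21.8 Theorem. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space and $h$ a strictly positive function from $\mathscr{L}^1(\mu)$. Then for any set $M$ of $\mathscr{A}$-measurable numerical functions on $\Omega$ the following three assertions are equivalent:

(i) $M$ is equi-integrable.
(ii) For every $\varepsilon > 0$ some scalar multiple of $h$ is an $\varepsilon$-bound for $M$.
(iii) $M$ satisfies

$$\sup_{f \in M} \int |f|\,d\mu < +\infty \tag{21.11}$$

as well as the following: Given $\varepsilon > 0$ there exists $\delta > 0$ such that

$$\int_A h\,d\mu \le \delta \;\Longrightarrow\; \int_A |f|\,d\mu \le \varepsilon \quad \text{for all } f \in M \text{ and } A \in \mathscr{A}. \tag{21.12}$$

[...]

$$\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha h\}} |f|\,d\mu = 0 \tag{21.13}$$

holds uniformly for $f \in M$. Condition (21.12) is for obvious reasons (cf. 17.8) called the equi-$(h\mu)$-continuity of the measures $|f|\mu$, $f \in M$.

Proof. (i)$\Rightarrow$(ii): Let $g$ be an $\varepsilon/2$-bound for $M$. Then for all $f \in M$ and all $\alpha > 0$

$$\int_{\{|f| \ge \alpha h\}} |f|\,d\mu = \int_{\{|f| \ge \alpha h\} \cap \{|f| \ge g\}} |f|\,d\mu + \int_{\{|f| \ge \alpha h\} \cap \{|f| < g\}} |f|\,d\mu \le \int_{\{|f| \ge g\}} |f|\,d\mu + \int_{\{g > \alpha h\}} g\,d\mu \le \frac{\varepsilon}{2} + \int_{\{g > \alpha h\}} g\,d\mu.$$

According to 13.6, $\mu(\{g = +\infty\}) = 0$. Since $g\mu$ is a finite measure on $\mathscr{A}$, it is continuous from above. Hence the fact that

$$\bigcap_{\alpha > 0} \{g > \alpha h\} = \bigcap_{n \in \mathbb{N}} \{g > nh\} = \{g = +\infty\}$$

is a set of $(g\mu)$-measure 0 means that

$$\int_{\{g > \alpha h\}} g\,d\mu \le \frac{\varepsilon}{2}$$

for all sufficiently large $\alpha$. Coupled with the preceding inequality this shows that indeed $\alpha h$ is an $\varepsilon$-bound for all sufficiently large $\alpha$, that is, (ii) holds.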


(ii)$\Rightarrow$(iii): This can be gleaned from the inequality derived at the beginning of the proof of 21.2, $\alpha h$ being now eligible for the function $g$ there:

$$\int_A |f|\,d\mu \le \int_{\{|f| \ge \alpha h\}} |f|\,d\mu + \alpha \int_A h\,d\mu \quad \text{for all } f \in M.$$

(iii)$\Rightarrow$(i): 21.2 affirms this. □

Theorem 21.8 is of special significance for finite measures $\mu$. Then it is often expedient to choose for $h$ the constant function 1. When one does, (21.13) assumes the equivalent form

$$\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha\}} |f|\,d\mu = 0 \quad \text{uniformly for } f \in M. \tag{21.13'}$$

This condition is thus - just as (21.13) for $\sigma$-finite measures - necessary and sufficient for equi-integrability of $M$.
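As an illustration of the uniform tail criterion (21.13'), the following is a minimal sketch on $[0,1]$ with Lebesgue measure. The two families $f_n = n \cdot 1_{[0,1/n[}$ and $g_n = n \cdot 1_{[0,1/n^2[}$ are hypothetical examples chosen here, not taken from the text, and the supremum over $n$ is truncated at an assumed cutoff `n_max`.

```python
from fractions import Fraction

def tail_integral(height, width, a):
    """Integral of f over {f >= a}, where f = height * indicator of an interval of length width."""
    return height * width if height >= a else Fraction(0)

def sup_tail(family, a, n_max=10_000):
    """Largest tail integral over the first n_max members (a truncated stand-in for the supremum)."""
    return max(tail_integral(*family(n), a) for n in range(1, n_max + 1))

def spike(n):
    # f_n = n * 1_[0, 1/n[ : every member has integral 1
    return Fraction(n), Fraction(1, n)

def thin(n):
    # g_n = n * 1_[0, 1/n^2[ : the integral 1/n shrinks with n
    return Fraction(n), Fraction(1, n * n)

for a in (1, 10, 100):
    print(a, sup_tail(spike, a), sup_tail(thin, a))
```

For the first family the tail integrals stay at 1 however large the threshold is, so the limit in (21.13') is not uniform; for the second they are bounded by $1/\alpha$ and the criterion holds.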

Remark. 2. In part (iii) of Theorem 21.8 the $\mathscr{L}^1$-boundedness of $M$ expressed by (21.11) cannot in general be dropped from the hypotheses. It suffices to consider the measure space $(\{a\}, \mathscr{P}(\{a\}), \varepsilon_a)$ consisting of a single point and the sequence of functions $f_n := n \cdot 1$. This sequence is not equi-integrable, although for every $\varepsilon > 0$ and every strictly positive $h$, (21.12) holds whenever $0 < \delta < h(a)$.

Let us close by deriving a sufficient condition for equi-integrability in the finite-measure case which generalizes the introductory Example 3.

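21.9 Lemma. Let $\mu$ be a finite measure and $M \subset \mathscr{L}^1(\mu)$. Suppose that there is a $\mu$-integrable function $g \ge 0$ such that

$$\int_{\{|f| \ge \alpha\}} |f|\,d\mu \le \int_{\{|f| \ge \alpha\}} g\,d\mu \tag{21.14}$$

for all $f \in M$ and all $\alpha \in \mathbb{R}_+$. Then $M$ is equi-integrable.

Proof. The case $\alpha := 0$ of (21.14) says that $\int |f|\,d\mu \le \int g\,d\mu < +\infty$ for all $f \in M$. Then Chebyshev's inequality tells us that

$$\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha} \int |f|\,d\mu \le \frac{1}{\alpha} \int g\,d\mu \quad \text{for all } \alpha > 0,\ f \in M.$$

It follows from this that

$$\lim_{\alpha \to +\infty} \mu(\{|f| \ge \alpha\}) = 0 \quad \text{uniformly in } f \in M. \tag{21.15}$$

For each $\varepsilon > 0$, 17.8 supplies a $\delta > 0$ such that

$$A \in \mathscr{A} \text{ and } \mu(A) \le \delta \;\Longrightarrow\; \int_A g\,d\mu \le \varepsilon.$$

By (21.15), $\mu(\{|f| \ge \alpha\}) \le \delta$ for all sufficiently large $\alpha$ and all $f \in M$, so that (21.14) yields $\int_{\{|f| \ge \alpha\}} |f|\,d\mu \le \varepsilon$ for these $\alpha$. Hence

$$\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha\}} |f|\,d\mu = 0 \quad \text{uniformly for } f \in M,$$

that is, (21.13'), which we have seen entails equi-integrability of $M$. □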

Exercises.

1. Show that for any measure space $(\Omega, \mathscr{A}, \mu)$ a set $M$ of measurable numerical functions is equi-integrable if and only if for every $\varepsilon > 0$ there is an integrable function $h = h_\varepsilon \ge 0$ such that $\int (|f| - h)^+\,d\mu \le \varepsilon$ for all $f \in M$. [Hint: For sufficiently large $\eta > 0$, $g := \eta h$ will be a $2\varepsilon$-bound for $M$.]

2. Let $(\Omega, \mathscr{A}, \mu)$ be an arbitrary measure space, $1 \le p < +\infty$. Suppose the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere on $\Omega$ to a measurable real function $f$. Show that $f$ lies in $\mathscr{L}^p(\mu)$ and $(f_n)$ converges to $f$ in $p$th mean if the sequence $(|f_n|^p)$ is equi-integrable.

3. Show that from the $\mathscr{L}^p$-convergence of a sequence $(f_n)$ to a function $f \in \mathscr{L}^p(\mu)$ follows the $\mathscr{L}^1$-convergence of the sequence $(|f_n|^p)$ to $|f|^p$, for any $1 \le p < +\infty$.

4. Consider a finite measure $\mu$ and an $M \subset \mathscr{L}^1(\mu)$. For each $n \in \mathbb{N}$, $f \in M$ set

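$a_n(f) := n\,\mu(\{n \dots$ [...] from the sequence $(f_n)$ in the Example from §15.

7. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$, and let $(\nu_i)_{i \in I}$ be a family of finite and $\mu$-continuous measures on $\mathscr{A}$. Suppose this family is equi-continuous at $\emptyset$, meaning that to every sequence $(A_n)_{n \in \mathbb{N}}$ in $\mathscr{A}$ with $A_n \downarrow \emptyset$ and to every $\varepsilon > 0$ there is an $n_\varepsilon \in \mathbb{N}$ such that $\nu_i(A_n) \le \varepsilon$ for all $n \ge n_\varepsilon$ and all $i \in I$. Show that then this family is equi-$\mu$-continuous in the following sense (cf. (21.12)): To every $\varepsilon > 0$ there corresponds a $\delta = \delta_\varepsilon > 0$ such that

$$A \in \mathscr{A} \text{ and } \mu(A) \le \delta \;\Longrightarrow\; \nu_i(A) \le \varepsilon \quad \text{for all } i \in I.$$

[...] 2. One important application of product measures is the introduction of the concept of convolution for measures and functions.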

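§22. Products of σ-algebras and measures

Finitely many measurable spaces $(\Omega_j, \mathscr{A}_j)$, $j = 1, \dots, n \in \mathbb{N}$, are given. We consider the product set

$$\Omega := \prod_{j=1}^{n} \Omega_j = \Omega_1 \times \dots \times \Omega_n$$

and for each $j$ the projection mapping

$$p_j : \Omega \to \Omega_j$$

which assigns to each point $(\omega_1, \dots, \omega_n) \in \Omega$ its $j$th coordinate $\omega_j$. The $\sigma$-algebra in $\Omega$ generated by the mappings $p_1, \dots, p_n$ is designated

$$\bigotimes_{j=1}^{n} \mathscr{A}_j = \mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$$

and called the product of the $\sigma$-algebras $\mathscr{A}_1, \dots, \mathscr{A}_n$. According to (7.3) we have to do here with the smallest $\sigma$-algebra $\mathscr{A}$ in $\Omega$ such that each $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable.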

The reader may recall that the product of finitely many topological spaces is defined in a very similar way. An important principle of generation for such products is immediately at hand:

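22.1 Theorem. For each $j = 1, \dots, n$ let $\mathscr{E}_j$ be a generator of the $\sigma$-algebra $\mathscr{A}_j$ in $\Omega_j$ which contains a sequence $(E_{jk})_{k \in \mathbb{N}}$ of sets with $E_{jk} \uparrow \Omega_j$. Then the $\sigma$-algebra $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$ is generated by the system of all sets

$$E_1 \times \dots \times E_n$$

with $E_j \in \mathscr{E}_j$ for each $j = 1, \dots, n$.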


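Proof. Let $\mathscr{A}$ be any $\sigma$-algebra in $\Omega$. What we have to show is that the mappings $p_j$ are all $\mathscr{A}$-$\mathscr{A}_j$-measurable ($j = 1, \dots, n$) if and only if $\mathscr{A}$ contains each of the sets $E_1 \times \dots \times E_n$ described above. According to 7.2, $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable just exactly if $p_j^{-1}(E_j) \in \mathscr{A}$ for every $E_j \in \mathscr{E}_j$. If this condition is fulfilled for each $j \in \{1, \dots, n\}$, then the sets

$$E_1 \times \dots \times E_n = p_1^{-1}(E_1) \cap \dots \cap p_n^{-1}(E_n)$$

all lie in $\mathscr{A}$. If conversely, $E_1 \times \dots \times E_n \in \mathscr{A}$ for every possible choice of $E_j \in \mathscr{E}_j$ and $j \in \{1, \dots, n\}$, then upon fixing $E_j \in \mathscr{E}_j$, the sets

$$F_k := E_{1k} \times \dots \times E_{j-1,k} \times E_j \times E_{j+1,k} \times \dots \times E_{nk}, \qquad k \in \mathbb{N},$$

all lie in $\mathscr{A}$. Since the sequence $(F_k)_{k \in \mathbb{N}}$ increases to

$$\Omega_1 \times \dots \times \Omega_{j-1} \times E_j \times \Omega_{j+1} \times \dots \times \Omega_n = p_j^{-1}(E_j),$$

this set too lies in $\mathscr{A}$, for each $j$. The claim is therewith proven. □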

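Remark. 1. The restriction imposed on the generators $\mathscr{E}_j$ cannot generally be dispensed with. Take, for example, $n := 2$, $\mathscr{A}_1 := \{\emptyset, \Omega_1\}$, $\mathscr{E}_1 := \{\emptyset\}$ and $\mathscr{E}_2 := \mathscr{A}_2$, in which $\mathscr{A}_2$ contains at least four sets.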
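The role of the exhausting sequence in Theorem 22.1 can be checked by brute force on finite sets. The sketch below is only an illustration under assumptions made here: both factors are the hypothetical two-point set $\{0, 1\}$, the first carries the trivial $\sigma$-algebra, and on a finite set closure under complements and pairwise unions already yields the generated $\sigma$-algebra. The helper names are invented for this example.

```python
from itertools import product

def generated_sigma_algebra(omega, generators):
    """Smallest family containing the generators, closed under complement and union (finite omega)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        new = {omega - a for a in family} | {a | b for a in family for b in family}
        if new <= family:
            return family
        family |= new

def rectangles(gen1, gen2):
    """All product sets E1 x E2 with E1 from gen1 and E2 from gen2."""
    return {frozenset(product(e1, e2)) for e1 in gen1 for e2 in gen2}

omega1, omega2 = {0, 1}, {0, 1}
a1 = [set(), omega1]               # trivial sigma-algebra on omega1
a2 = [set(), {0}, {1}, omega2]     # a sigma-algebra with four sets on omega2
omega = set(product(omega1, omega2))

# Generator a1 of itself contains omega1, so the hypothesis of 22.1 is met.
good = generated_sigma_algebra(omega, rectangles(a1, a2))
# The generator {emptyset} of a1 contains no sequence increasing to omega1.
bad = generated_sigma_algebra(omega, rectangles([set()], a2))

print(len(good), len(bad))
```

With the generator $\{\emptyset\}$ every rectangle is empty, so only the trivial $\sigma$-algebra is generated, whereas the product $\sigma$-algebra has four members.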

A particular case of this theorem is the fact that the product $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$ is generated by all the sets $A_1 \times \dots \times A_n$ with each $A_j \in \mathscr{A}_j$. Our further course will be guided by the following example:

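Example. For each $j \in \{1, \dots, n\}$ let $\Omega_j := \mathbb{R}$, $\mathscr{A}_j := \mathscr{B}^1$ and $\mathscr{E}_j := \mathscr{I}^1$. The system of all sets $E_1 \times \dots \times E_n$ with each $E_j \in \mathscr{I}^1$ is evidently just the system $\mathscr{I}^n$ of all right half-open intervals in $\mathbb{R}^n$. According to 6.1, $\mathscr{I}^n$ generates the $\sigma$-algebra $\mathscr{B}^n$ of $n$-dimensional Borel sets. Taken together with 22.1 - whose hypotheses are clearly satisfied here - this reveals that

$$\mathscr{B}^n = \mathscr{B}^1 \otimes \dots \otimes \mathscr{B}^1 \qquad (n \text{ factors on the right}). \tag{22.2}$$

By 6.2, $\lambda^n$ is the only measure on $\mathscr{B}^n$ which satisfies

$$\lambda^n(I_1 \times \dots \times I_n) = \lambda^1(I_1) \cdot \ldots \cdot \lambda^1(I_n)$$

for all $I_1, \dots, I_n \in \mathscr{I}^1$. This remark and the example preceding it lead to the following question.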

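Measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$ are given, $1 \le j \le n$ with $n \ge 2$, and for each $\mathscr{A}_j$ a generator $\mathscr{E}_j$. Under what hypotheses can the existence of a measure $\pi$ on $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$ satisfying

$$\pi(E_1 \times \dots \times E_n) = \mu_1(E_1) \cdot \ldots \cdot \mu_n(E_n) \tag{22.3}$$

for all $E_j \in \mathscr{E}_j$, $1 \le j \le n$ [...] 2.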

Remark. 2. In closing it should again be mentioned that a mapping

$$f : \Omega_0 \to \Omega_1 \times \dots \times \Omega_n$$

of a measurable space $(\Omega_0, \mathscr{A}_0)$ into a product of measurable spaces $(\Omega_j, \mathscr{A}_j)$ is measurable with respect to the $\sigma$-algebra $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$ if and only if each component mapping $f_j := p_j \circ f$ of $f$ is $\mathscr{A}_0$-$\mathscr{A}_j$-measurable - a fact which is immediate from Theorem 7.4.

Exercise. Finitely many measurable spaces $(\Omega_j, \mathscr{A}_j)$ are given, $j = 1, \dots, n$. Show that the algebra in $\Omega_1 \times \dots \times \Omega_n$ generated by all sets $A_1 \times \dots \times A_n$ with each $A_j \in \mathscr{A}_j$ consists of all finite unions of such product sets.


§23. Product measures and Fubini's theorem

Initially measure spaces $(\Omega_1, \mathscr{A}_1, \mu_1)$, $(\Omega_2, \mathscr{A}_2, \mu_2)$ are given. For every $Q \subset \Omega_1 \times \Omega_2$ the sets

$$Q_{\omega_1} := \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in Q\}, \qquad Q_{\omega_2} := \{\omega_1 \in \Omega_1 : (\omega_1, \omega_2) \in Q\} \tag{23.1}$$

are called, respectively, the $\omega_1$-section of $Q$ ($\omega_1 \in \Omega_1$) and the $\omega_2$-section of $Q$ ($\omega_2 \in \Omega_2$). This notation is chosen for typographic simplicity and will see us through §23, after which it is not needed. In case $\Omega_1 = \Omega_2$, however, it presents obvious problems, and to circumvent them alternative notations for $Q_{\omega_1}$ are also popular in the literature. About these sets we claim:

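23.1 Lemma. If $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$, then its $\omega_1$-section lies in $\mathscr{A}_2$ for every $\omega_1 \in \Omega_1$, and its $\omega_2$-section lies in $\mathscr{A}_1$ for every $\omega_2 \in \Omega_2$.

Proof. For arbitrary subsets $Q, Q_1, Q_2, \dots$ of $\Omega := \Omega_1 \times \Omega_2$, and points $\omega_1 \in \Omega_1$,

$$(\Omega \setminus Q)_{\omega_1} = \Omega_2 \setminus Q_{\omega_1} \quad \text{and} \quad \Big(\bigcup_{n \in \mathbb{N}} Q_n\Big)_{\omega_1} = \bigcup_{n \in \mathbb{N}} (Q_n)_{\omega_1}.$$

Furthermore $\Omega_{\omega_1} = \Omega_2$, and more generally for $A_1 \subset \Omega_1$, $A_2 \subset \Omega_2$ we have

$$(A_1 \times A_2)_{\omega_1} = \begin{cases} A_2 & \text{if } \omega_1 \in A_1, \\ \emptyset & \text{if } \omega_1 \in \Omega_1 \setminus A_1. \end{cases}$$

For each $\omega_1 \in \Omega_1$, therefore, the system of all sets $Q \subset \Omega$ having section $Q_{\omega_1} \in \mathscr{A}_2$ is a $\sigma$-algebra in $\Omega$ which contains every product set $A_1 \times A_2$ with $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$. But according to 22.1, $\mathscr{A}_1 \otimes \mathscr{A}_2$ is the smallest $\sigma$-algebra which contains all such product sets. This proves the part of the lemma dealing with $\omega_1$-sections. Of course, $\omega_2$-sections are treated the same way. □

Since now $\mu_2(Q_{\omega_1})$ and $\mu_1(Q_{\omega_2})$ make sense for all $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$, $\omega_1 \in \Omega_1$ and $\omega_2 \in \Omega_2$, we are in a position to take the next step: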

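23.2 Lemma. Suppose the measures $\mu_1$ and $\mu_2$ are $\sigma$-finite. Then for every $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$ the functions

$$\omega_1 \mapsto \mu_2(Q_{\omega_1}) \quad \text{and} \quad \omega_2 \mapsto \mu_1(Q_{\omega_2})$$

on $\Omega_1$ and $\Omega_2$, respectively, are $\mathscr{A}_1$-measurable and $\mathscr{A}_2$-measurable, respectively.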


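Proof. The function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ will be denoted by $s_Q$. We will establish the $\mathscr{A}_1$-measurability of $s_Q$, for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$. The other function can be treated analogously.

First suppose that $\mu_2(\Omega_2) < +\infty$. In this case the set $\mathscr{D}$ of all $D \in \mathscr{A}_1 \otimes \mathscr{A}_2$ whose $s_D$ function is $\mathscr{A}_1$-measurable constitutes a Dynkin system in $\Omega := \Omega_1 \times \Omega_2$. This involves the following easily checked assertions: $s_\Omega = \mu_2(\Omega_2)$; $s_{\Omega \setminus D} = s_\Omega - s_D$ for every $D \in \mathscr{D}$; $s_{\bigcup D_n} = \sum s_{D_n}$ for every sequence $(D_n)$ of disjoint sets in $\mathscr{D}$. Furthermore $\mathscr{D}$ contains $A_1 \times A_2$ for every $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$, since

$$s_{A_1 \times A_2} = \mu_2(A_2) \cdot 1_{A_1}.$$

The system $\mathscr{E}$ of all such $A_1 \times A_2$ is $\cap$-stable and generates $\mathscr{A}_1 \otimes \mathscr{A}_2$, by 22.1. Therefore 2.4 insures that $\mathscr{A}_1 \otimes \mathscr{A}_2$ is the Dynkin system generated by it. From $\mathscr{E} \subset \mathscr{D} \subset \mathscr{A}_1 \otimes \mathscr{A}_2$ therefore follows that $\mathscr{D} = \mathscr{A}_1 \otimes \mathscr{A}_2$, which is what is being claimed.

If $\mu_2$ is only $\sigma$-finite, then there is a sequence $(B_n)$ of sets from $\mathscr{A}_2$, each of finite $\mu_2$-measure, with $B_n \uparrow \Omega_2$. For each $n$, $A_2 \mapsto \mu_2(A_2 \cap B_n)$ is therefore a finite measure $\mu_{2n}$ on $\mathscr{A}_2$, to which the already proven result can be applied, showing that $\omega_1 \mapsto \mu_{2n}(Q_{\omega_1})$ is $\mathscr{A}_1$-measurable for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$. Now

$$\mu_2(Q_{\omega_1}) = \sup_{n \in \mathbb{N}} \mu_{2n}(Q_{\omega_1})$$

because of the continuity from below of the measure $\mu_2$. From Theorem 9.5 then the mapping $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ is indeed $\mathscr{A}_1$-measurable. □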
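The section functions of Lemmas 23.1 and 23.2 become elementary on finite sets. The following sketch, with hypothetical point weights `mu1`, `mu2` and a hypothetical non-rectangular set `Q` chosen purely for illustration, computes $s_Q(\omega_1) = \mu_2(Q_{\omega_1})$ and integrates it against $\mu_1$, then does the same with the roles of the coordinates exchanged.

```python
from fractions import Fraction

# Hypothetical weights of two finite measures (assumptions for illustration only).
mu1 = {"a": Fraction(1, 2), "b": Fraction(3)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(5)}
Q = {("a", 0), ("a", 2), ("b", 1)}   # a subset of the product set that is not a rectangle

def section_1(Q, w1):
    """The w1-section of Q: all w2 with (w1, w2) in Q."""
    return {w2 for (v1, w2) in Q if v1 == w1}

def section_2(Q, w2):
    """The w2-section of Q: all w1 with (w1, w2) in Q."""
    return {w1 for (w1, v2) in Q if v2 == w2}

def measure(mu, A):
    return sum((mu[w] for w in A), Fraction(0))

s_Q = {w1: measure(mu2, section_1(Q, w1)) for w1 in mu1}       # w1 -> mu2(Q_{w1})
first = sum(s_Q[w1] * mu1[w1] for w1 in mu1)                    # integrate against mu1
second = sum(measure(mu1, section_2(Q, w2)) * mu2[w2] for w2 in mu2)

print(first, second)
```

Both iterated sums agree, which is the finite counterpart of integrating the section measures in either order.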

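It is now rather simple to construct the measure $\pi$ that we seek:

23.3 Theorem. Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be $\sigma$-finite measure spaces, $j = 1, 2$. Then there is exactly one measure $\pi$ on $\mathscr{A}_1 \otimes \mathscr{A}_2$ which satisfies

$$\pi(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2) \quad \text{for all } A_1 \in \mathscr{A}_1,\ A_2 \in \mathscr{A}_2. \tag{23.2}$$

In addition this measure satisfies

$$\pi(Q) = \int \mu_2(Q_{\omega_1})\,\mu_1(d\omega_1) = \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2) \quad \text{for all } Q \in \mathscr{A}_1 \otimes \mathscr{A}_2 \tag{23.3}$$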

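and is $\sigma$-finite.

Proof. As before, for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$ let $s_Q$ denote the $\mathscr{A}_1$-measurable function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ on $\Omega_1$; it is of course non-negative. Consequently via

$$\pi(Q) := \int s_Q\,d\mu_1$$

a non-negative function $\pi$ is well defined on $\mathscr{A}_1 \otimes \mathscr{A}_2$. For every sequence $(Q_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathscr{A}_1 \otimes \mathscr{A}_2$ the equality $s_{\bigcup Q_n} = \sum s_{Q_n}$ and 11.5 insure that

$$\pi\Big(\bigcup_{n \in \mathbb{N}} Q_n\Big) = \sum_{n=1}^{\infty} \pi(Q_n).$$

Since $s_\emptyset = 0$ we have $\pi(\emptyset) = 0$. This proves that $\pi$ is indeed a measure on $\mathscr{A}_1 \otimes \mathscr{A}_2$. It has property (23.2) because $s_{A_1 \times A_2} = \mu_2(A_2) 1_{A_1}$, whence integration yields $\pi(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$.

Proceeding analogously, we confirm that

$$\pi'(Q) := \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$$

also defines a measure on $\mathscr{A}_1 \otimes \mathscr{A}_2$ having this property. But when Theorem 22.2 is applied to $\mathscr{E}_1 := \mathscr{A}_1$ and $\mathscr{E}_2 := \mathscr{A}_2$ it affirms that there is at most one such measure. Thus $\pi = \pi'$ and (23.3) is confirmed. There is a sequence $(A_{jn})_{n \in \mathbb{N}}$ of sets from $\mathscr{A}_j$, each of finite $\mu_j$-measure, with $A_{jn} \uparrow \Omega_j$, for $j = 1$ and $j = 2$. Using these as the $A_1$, $A_2$, respectively, in (23.2) proves the $\sigma$-finiteness of $\pi$ because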

r(A1nxA2n) y}, namely

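$$E := \{(\omega, t) \in \Omega \times \mathbb{R}_+ : f(\omega) \ge t\},$$

lies in $\mathscr{A} \otimes (\mathbb{R}_+ \cap \mathscr{B}^1)$. Theorem 23.6 for the product measure $\mu \otimes \lambda^1$ consequently supplies the equalities

$$\iint \varphi'(t)\,1_E(\omega, t)\,\lambda^1(dt)\,\mu(d\omega) = \iint \varphi'(t)\,1_E(\omega, t)\,\mu(d\omega)\,\lambda^1(dt) = \int \varphi'(t)\,\mu(E_t)\,\lambda^1(dt) = \int \varphi'(t)\,\mu(\{f \ge t\})\,\lambda^1(dt), \tag{23.8}$$

since the $t$-section of $E$ is just the set of all $\omega \in \Omega$ which satisfy $f(\omega) \ge t$. As $\varphi$ is isotone, $\varphi'(t) \ge 0$ for all $t > 0$. The continuous function $\varphi'$ is integrable over $[1/n, a]$ whenever $1/n < a < +\infty$, and since $[1/n, a] \uparrow\ ]0, a]$, and

$$\int_{]0,a]} \varphi'(t)\,\lambda^1(dt) = \lim_{n \to \infty} \int_{1/n}^{a} \varphi'(t)\,dt = \varphi(a) - \lim_{n \to \infty} \varphi(1/n) = \varphi(a)$$

($\varphi(0) = 0$ and $\varphi$ is continuous on $\mathbb{R}_+$), we see that $\varphi'$ is also integrable over $]0, a]$ for every $a > 0$. It follows from $f \ge 0$ and the preceding calculation that

$$\int_{]0, f(\omega)]} \varphi'(t)\,\lambda^1(dt) = \varphi(f(\omega)) \quad \text{for every } \omega \in \Omega,$$

both expressions being 0 whenever $f(\omega) = 0$. We thus get

$$\int \varphi \circ f\,d\mu = \int \Big(\int_{]0, f(\omega)]} \varphi'(t)\,\lambda^1(dt)\Big)\mu(d\omega) = \iint \varphi'(t)\,1_{]0, f(\omega)]}(t)\,\lambda^1(dt)\,\mu(d\omega) = \iint \varphi'(t)\,1_E(\omega, t)\,\lambda^1(dt)\,\mu(d\omega),$$

which combined with (23.8) concludes the proof. □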

Example. 2. The relevant hypotheses are certainly fulfilled by the functions $\varphi(t) := t^p$ with $p > 0$. Thus for every $\mathscr{A}$-measurable real function $f \ge 0$ on $\Omega$

$$\int f^p\,d\mu = p \int_0^{+\infty} t^{p-1}\,\mu(\{f \ge t\})\,dt. \tag{23.9}$$

When $p = 1$ we get the especially important formula

$$\int f\,d\mu = \int_{\mathbb{R}_+} \mu(\{f \ge t\})\,\lambda^1(dt) = \int_0^{+\infty} \mu(\{f \ge t\})\,dt. \tag{23.10}$$

The reader should not overlook the geometric significance of this, which is that the integral $\int f\,d\mu$ is formed "vertically", while the integral on the right-hand side of (23.10) is formed "horizontally".
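The "vertical versus horizontal" reading of (23.9) and (23.10) can be tested exactly on a finite measure space. The sketch below is an illustration under assumptions made here: the weights `mu` and the values `f` are hypothetical, and because $f$ takes finitely many values, $t \mapsto \mu(\{f \ge t\})$ is a step function, so the integral over $t$ reduces to a finite sum.

```python
from fractions import Fraction

def integral(f, mu, p=1):
    """Vertical integral: sum of f(w)^p * mu({w})."""
    return sum(f[w] ** p * mu[w] for w in mu)

def layer_cake(f, mu, p=1):
    """Horizontal integral of p * t^(p-1) * mu({f >= t}) over t > 0, exact for finitely many values."""
    total, previous = Fraction(0), Fraction(0)
    for level in sorted(set(f.values())):
        if level <= 0:
            continue
        # mu({f >= t}) is constant for previous < t <= level
        mass = sum(mu[w] for w in mu if f[w] >= level)
        # the integral of p * t^(p-1) over ]previous, level] equals level^p - previous^p
        total += (level ** p - previous ** p) * mass
        previous = level
    return total

# Hypothetical data (assumptions for illustration only).
mu = {"a": Fraction(1, 2), "b": Fraction(2), "c": Fraction(1)}
f = {"a": Fraction(3), "b": Fraction(1, 2), "c": Fraction(0)}

print(integral(f, mu), layer_cake(f, mu))
print(integral(f, mu, p=2), layer_cake(f, mu, p=2))
```

Both ways of summing give the same value for $p = 1$ and $p = 2$; the exponent is restricted to positive integers here only to keep the arithmetic exact.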

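Now at last we turn back to the general case of §22 and consider finitely many $\sigma$-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $j = 1, \dots, n$ and $n \ge 2$. The two product sets $(\Omega_1 \times \dots \times \Omega_{n-1}) \times \Omega_n$ and $\Omega_1 \times \dots \times \Omega_{n-1} \times \Omega_n$ will be identified via the bijection

$$((\omega_1, \dots, \omega_{n-1}), \omega_n) \mapsto (\omega_1, \dots, \omega_{n-1}, \omega_n).$$

The agreed-upon equality of these sets leads at once to the equality of the corresponding products of $\sigma$-algebras:

$$(\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n = \mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1} \otimes \mathscr{A}_n. \tag{23.11}$$

In fact, by 22.1 the sets $A_1 \times \dots \times A_{n-1}$ with each $A_j \in \mathscr{A}_j$ generate $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}$, and by the same theorem the sets

$$(A_1 \times \dots \times A_{n-1}) \times A_n = A_1 \times \dots \times A_n$$

then generate $(\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n$ as well as $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1} \otimes \mathscr{A}_n$.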


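In a completely analogous fashion one confirms a general associativity in the formation of products of $\sigma$-algebras:

$$\Big(\bigotimes_{j=1}^{m} \mathscr{A}_j\Big) \otimes \Big(\bigotimes_{j=m+1}^{n} \mathscr{A}_j\Big) = \bigotimes_{j=1}^{n} \mathscr{A}_j \qquad (1 \le m < n). \tag{23.12}$$

[...] $n \ge 2$ of factors via induction on $n$.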

23.9 Theorem. $\sigma$-finite measures $\mu_1, \dots, \mu_n$ on $\sigma$-algebras $\mathscr{A}_1, \dots, \mathscr{A}_n$ uniquely determine a measure $\pi$ on $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$ such that

$$\pi(A_1 \times \dots \times A_n) = \mu_1(A_1) \cdot \ldots \cdot \mu_n(A_n) \quad \text{for all } A_j \in \mathscr{A}_j,\ 1 \le j \le n. \tag{23.13}$$

This measure $\pi$ is $\sigma$-finite.

Corresponding to Definition 23.4, $\pi$ is called the product of the measures $\mu_1, \dots, \mu_n$ and is denoted by

$$\bigotimes_{j=1}^{n} \mu_j = \mu_1 \otimes \dots \otimes \mu_n.$$

The question posed in §22 is finally answered in full by this theorem.

Proof. In 22.2 take for the various generators $\mathscr{E}_j$ the $\sigma$-algebra $\mathscr{A}_j$ itself, and learn that there is at most one measure $\pi$ which satisfies (23.13). The existence question has already been settled for $n = 2$, in 23.3. We make the inductive assumption that $\pi' := \mu_1 \otimes \dots \otimes \mu_{n-1}$ exists for some $n > 2$ and show how that leads to the existence of $\mu_1 \otimes \dots \otimes \mu_n$. Evidently the $\sigma$-finiteness of $\mu_1, \dots, \mu_{n-1}$ entails that of $\pi'$, as in the proof of Theorem 23.3. That theorem therefore supplies us with a measure $\pi := \pi' \otimes \mu_n$ on $(\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n$ which satisfies

$$\pi(Q' \times A_n) = \pi'(Q')\,\mu_n(A_n)$$

for all $Q' \in \mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}$ and all $A_n \in \mathscr{A}_n$. Because of (23.11) this measure does what is wanted at level $n$, completing the induction. Again, $\sigma$-finiteness of $\pi$ is confirmed exactly as in the proof of 23.3. □

This inductive construction of the $n$-fold product measure builds in the equality (23.14)

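$$(\mu_1 \otimes \dots \otimes \mu_{n-1}) \otimes \mu_n = \mu_1 \otimes \dots \otimes \mu_{n-1} \otimes \mu_n.$$

By now familiar considerations show that in fact a general associativity prevails in the formation of product measures:

$$\Big(\bigotimes_{j=1}^{m} \mu_j\Big) \otimes \Big(\bigotimes_{j=m+1}^{n} \mu_j\Big) = \bigotimes_{j=1}^{n} \mu_j \qquad (1 \le m < n). \tag{23.15}$$

In particular [...]

[...] Let $f \ge 0$ be an $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$-measurable numerical function on $\Omega_1 \times \dots \times \Omega_n$. Then for every permutation $j_1, \dots, j_n$ of $1, \dots, n$

$$\int f\,d(\mu_1 \otimes \dots \otimes \mu_n) = \int \Big( \dots \Big( \int \Big( \int f(\omega_1, \dots, \omega_n)\,\mu_{j_1}(d\omega_{j_1}) \Big) \mu_{j_2}(d\omega_{j_2}) \Big) \dots \Big) \mu_{j_n}(d\omega_{j_n}). \tag{23.16}$$

Every integral that occurs on the right-hand side is measurable with respect to the product of the appropriate $\mathscr{A}_j$, namely those corresponding to the coordinates in which integration has not yet occurred. This right-hand side is often written in the shorter fashion

$$\int \dots \int f(\omega_1, \dots, \omega_n)\,\mu_{j_1}(d\omega_{j_1}) \dots \mu_{j_n}(d\omega_{j_n}).$$

The simple proof of this theorem (involving induction), as well as the formulation and proof of the analog of 23.7, will be left to the reader. One more piece of notation is convenient: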

23.10 Definition. For finitely many $\sigma$-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $1 \le j \le n$, the triple $\big(\prod_{j=1}^{n} \Omega_j, \bigotimes_{j=1}^{n} \mathscr{A}_j, \bigotimes_{j=1}^{n} \mu_j\big)$ is called the product of these measure spaces and is denoted by

$$\bigotimes_{j=1}^{n} (\Omega_j, \mathscr{A}_j, \mu_j).$$

Remark. 2. Throughout the preceding the index set was finite. But there is also a theory of products of (finite) measures indexed by arbitrary sets, which is particularly important in probability theory; it is treated in detail by BAUER [1996], and somewhat more extensively in HEWITT and STROMBERG [1965]. For p-measures SAEKI [1996] gives a short, elementary proof that uses only 5.1.

In closing we will consider the case where each measure $\mu_j$ comes with a real density $f_j \ge 0$. According to Theorem 17.11, $\nu_j := f_j \mu_j$ is then a $\sigma$-finite measure too.

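23.11 Theorem. Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be $\sigma$-finite measure spaces and $f_j \ge 0$ real-valued $\mathscr{A}_j$-measurable functions on $\Omega_j$. Set

$$\nu_j := f_j \mu_j, \qquad j = 1, \dots, n.$$

Then the product of these measures is defined and satisfies

$$\bigotimes_{j=1}^{n} \nu_j = F \cdot \Big(\bigotimes_{j=1}^{n} \mu_j\Big) \tag{23.17}$$

with the density function

$$F(\omega_1, \dots, \omega_n) := \prod_{j=1}^{n} f_j(\omega_j). \tag{23.18}$$

The function $F$ is the so-called tensor product of the densities $f_1, \dots, f_n$.

Proof. As already noted, 17.11 insures that each measure $\nu_j$ is $\sigma$-finite, guaranteeing that their product is defined. It suffices to treat the case $n = 2$ and refer the general case to induction. For sets $A_1 \in \mathscr{A}_1$ and $A_2 \in \mathscr{A}_2$

$$\nu_1(A_1)\nu_2(A_2) = \Big(\int_{A_1} f_1\,d\mu_1\Big)\Big(\int_{A_2} f_2\,d\mu_2\Big) = \iint 1_{A_1}(\omega_1) f_1(\omega_1) 1_{A_2}(\omega_2) f_2(\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2) = \iint 1_{A_1 \times A_2}(\omega_1, \omega_2) F(\omega_1, \omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2).$$

From 23.6 therefore

$$\nu_1(A_1)\nu_2(A_2) = \int_{A_1 \times A_2} F\,d(\mu_1 \otimes \mu_2) \quad \text{for all } A_1 \in \mathscr{A}_1,\ A_2 \in \mathscr{A}_2.$$

But then according to 23.3, $\nu_1 \otimes \nu_2$ coincides with the measure $F(\mu_1 \otimes \mu_2)$. □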
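The identity behind Theorem 23.11 can be verified by hand on finite sets. The sketch below uses hypothetical weights and densities (all names and numbers are assumptions for illustration) and checks on one rectangle that $\nu_1(A_1)\nu_2(A_2)$ equals the integral of the tensor product $F$ over $A_1 \times A_2$ with respect to $\mu_1 \otimes \mu_2$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measures and densities (assumptions for illustration only).
mu1 = {0: Fraction(1), 1: Fraction(2)}
mu2 = {"x": Fraction(1, 2), "y": Fraction(3)}
f1 = {0: Fraction(4), 1: Fraction(1, 2)}
f2 = {"x": Fraction(2), "y": Fraction(1, 3)}

def nu(f, mu, A):
    """The measure with density f with respect to mu, evaluated on A."""
    return sum((f[w] * mu[w] for w in A), Fraction(0))

def F(w1, w2):
    """Tensor product of the two densities."""
    return f1[w1] * f2[w2]

def product_with_density(A1, A2):
    """Integral of F over A1 x A2 with respect to the product measure."""
    return sum((F(w1, w2) * mu1[w1] * mu2[w2] for w1, w2 in product(A1, A2)), Fraction(0))

A1, A2 = {1}, {"x", "y"}
lhs = nu(f1, mu1, A1) * nu(f2, mu2, A2)
rhs = product_with_density(A1, A2)
print(lhs, rhs)
```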

Exercises.

1. Consider $\Omega_1 = \Omega_2 := \mathbb{R}$, $\mathscr{A}_1 = \mathscr{A}_2 := \mathscr{B}^1$, $\mu_1 := \lambda^1$ and $\mu_2$ the non-$\sigma$-finite counting measure on $\mathscr{B}^1$ (cf. Example 3, §5). Show that equality (23.3) fails to hold for $Q := D$, the diagonal $\{(\omega, \omega) : \omega \in \mathbb{R}\}$ in $\Omega_1 \times \Omega_2$. Why does $D$ lie in $\mathscr{A}_1 \otimes \mathscr{A}_2 = \mathscr{B}^2$?

2. Show that the function $(x, y) \mapsto 2e^{-2xy} - e^{-xy}$ is not $\lambda^2$-integrable over the set $[1, +\infty[ \times [0, 1]$.

3. With the aid of Tonelli's theorem find a new proof of Theorem 8.1 along the following lines: If $\mu$ is a translation-invariant measure on $\mathscr{B}^d$, $\mu([0, 1[) = 1$, and $f \ge 0$, $g \ge 0$ are Borel measurable numerical functions on $\mathbb{R}^d$, compare the integrals

$$\iint g(x) f(x + y)\,\mu(dx)\,\lambda^d(dy) \quad \text{and} \quad \iint g(y - x) f(y)\,\mu(dx)\,\lambda^d(dy)$$

and, finally, take $f$ to be any indicator function, $g$ the indicator function of $[0, 1[$.

4. Compute

$$I := \int_0^{\infty} e^{-x^2}\,dx,$$

and thereby evaluate anew the important integral $G = 2I$ in (16.1), in the following simple way: $\int_0^{\infty} e^{-t^2}\,dt = \int_0^{\infty} y e^{-y^2 x^2}\,dx$ for every $y > 0$ and therefore $I^2 = \int_0^{\infty} \big(\int_0^{\infty} f(x, y)\,dx\big)\,dy$ for the function $f$ on $\mathbb{R}_+ \times \mathbb{R}_+$ defined by $f(x, y) := y e^{-y^2(1 + x^2)}$. Applying Tonelli's theorem leads to $I = \frac{1}{2}\sqrt{\pi}$.

5. Let $|x| := (x_1^2 + \dots + x_d^2)^{1/2}$ denote the usual euclidean norm of the vector $x := (x_1, \dots, x_d) \in \mathbb{R}^d$. Show that the function $x \mapsto e^{-|x|^\alpha}$ is $\lambda^d$-integrable for every $\alpha > 0$. (Recall Exercise 2 of §16.) In case $\alpha = 2$, show that the $\lambda^d$-integral of this function is $G^d$.

6. $\overline{K}_r(x_0)$ will denote the closed ball in $\mathbb{R}^d$ with center $x_0$ and radius $r > 0$. Set $\alpha_d := \lambda^d(\overline{K}_1(0))$ and prove that $\lambda^d(\overline{K}_r(x_0)) = \alpha_d r^d$. Show also that the numbers $\alpha_d$ can be calculated by

$$\alpha_{2q} = \frac{1}{q!}\,\pi^q \quad \text{and} \quad \alpha_{2q-1} = \frac{2^q}{1 \cdot 3 \cdot \ldots \cdot (2q - 1)}\,\pi^{q-1} \qquad (q \in \mathbb{N}).$$

[Hint: Use (7.10) and note that every $x_d$-section of $\overline{K}_r(0)$ is either empty or is a $(d-1)$-dimensional closed ball. Tonelli's theorem then leads to a recursion formula for the $\alpha_d$. Here, of course, $\pi$ has its customary geometric meaning.]

How do these relations change if we replace $\overline{K}_r(x_0)$ by the open ball $K_r(x_0)$ in $\mathbb{R}^d$ of radius $r$ and center $x_0$? [Cf. Exercise 3 in §7.]

7. For every compact interval $[\alpha, \beta] \subset \mathbb{R}_+$ designate by $R(\alpha, \beta)$ the spherical shell

$$\overline{K}_\beta(0) \setminus \overline{K}_\alpha(0) = \{x \in \mathbb{R}^d : \alpha < |x| \le \beta\}.$$

Show that for every continuous real function $h$ on such an interval $[\alpha, \beta] \subset \mathbb{R}_+$

$$\int_{R(\alpha, \beta)} h(|x|)\,\lambda^d(dx) = d\,\alpha_d \int_\alpha^\beta h(t)\,t^{d-1}\,dt,$$

$\alpha_d$ being the number $\lambda^d(\overline{K}_1(0))$ from the preceding exercise. [Hint: The function $H$ defined on $[\alpha, \beta]$ by

$$H(t) := \int_{R(\alpha, t)} h(|x|)\,\lambda^d(dx)$$

is differentiable with $H'(t) = d\,\alpha_d\,h(t)\,t^{d-1}$ for all such $t$.]

8. Apply the result of Exercise 7 to the case $d = 2$ and $h(t) := e^{-t^2}$ ($t \in \mathbb{R}_+$) in order to show, using Exercise 5, once again that $G = \sqrt{\pi}$.

9. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space, $f : \Omega \to \mathbb{R}_+$ measurable. Show that the set of all $t > 0$ such that $\mu(\{f = t\}) \ne 0$, as well as the set of all $t > 0$ such that $\mu(\{f \ge t\}) \ne \mu(\{f > t\})$, is countable. Therefore in the equalities (23.8), (23.9) and (23.10), $\mu(\{f \ge t\})$ can always be replaced by $\mu(\{f > t\})$.


§24. Convolution of finite Borel measures

Consider the $d$-dimensional Borel measurable space $(\mathbb{R}^d, \mathscr{B}^d)$. Every finite measure $\mu$ on $\mathscr{B}^d$ will be called a finite or also a bounded Borel measure, and the set of all of them will be designated by $\mathscr{M}_+^b(\mathbb{R}^d)$. For every such $\mu$ the number

$$\|\mu\| := \mu(\mathbb{R}^d) \tag{24.1}$$

is called the total mass of $\mu$. Making critical use of the group structure of $(\mathbb{R}^d, +)$ a so-called convolution product can be assigned to any finitely many measures $\mu_1, \dots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$; in contrast to the previously studied product measure, it is again a measure on the original $\sigma$-algebra $\mathscr{B}^d$, even an element of $\mathscr{M}_+^b(\mathbb{R}^d)$. What we do below can be carried out in every (abelian) locally compact group. We cannot, however, go into this generalization, but must instead refer interested readers to the excellent monographs of HEWITT and ROSS [1979] and RUDIN [1962].

Initially we consider the product measure $\mu_1 \otimes \dots \otimes \mu_n$ defined in §23. Since $\mathscr{B}^{nd} = \mathscr{B}^d \otimes \dots \otimes \mathscr{B}^d$, this measure is an element of $\mathscr{M}_+^b(\mathbb{R}^{nd})$. The mapping $A_n : \mathbb{R}^{nd} \to \mathbb{R}^d$ defined by

$$A_n(x_1, \dots, x_n) := x_1 + \dots + x_n$$

is continuous, and so $\mathscr{B}^{nd}$-$\mathscr{B}^d$-measurable. The following definition accordingly makes sense:

24.1 Definition. The image under the mapping $A_n$ of the product measure $\mu_1 \otimes \dots \otimes \mu_n$ is called the convolution product of the measures $\mu_1, \dots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$, in symbols

$$\mu_1 * \dots * \mu_n := A_n(\mu_1 \otimes \dots \otimes \mu_n). \tag{24.2}$$

The theorems on product and image measures combine to yield the most important properties of the convolution operation $*$. First of all, $\mu_1 * \dots * \mu_n$ is again an element of $\mathscr{M}_+^b(\mathbb{R}^d)$ and

$$\mu_1 * \dots * \mu_n(\mathbb{R}^d) = \mu_1 \otimes \dots \otimes \mu_n(\mathbb{R}^{nd}) = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|,$$

so that in fact

$$\|\mu_1 * \dots * \mu_n\| = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|. \tag{24.3}$$

In studying the convolution product it suffices to deal with $n = 2$, because

$$\mu_1 * \dots * \mu_n * \mu_{n+1} = (\mu_1 * \dots * \mu_n) * \mu_{n+1} \tag{24.4}$$

for every $n + 1$ measures from $\mathscr{M}_+^b(\mathbb{R}^d)$. To see this, introduce the continuous mapping $B_{n+1} : \mathbb{R}^{(n+1)d} \to \mathbb{R}^{2d}$ by

$$B_{n+1}(x_1, \dots, x_n, x_{n+1}) := (x_1 + \dots + x_n, x_{n+1})$$

and have $A_{n+1} = A_2 \circ B_{n+1}$. Checking that

$$B_{n+1}(\mu_1 \otimes \dots \otimes \mu_n \otimes \mu_{n+1}) = A_n(\mu_1 \otimes \dots \otimes \mu_n) \otimes \mu_{n+1},$$

and remembering that the formation of image measures is transitive, we get

$$\mu_1 * \dots * \mu_n * \mu_{n+1} = A_2(B_{n+1}(\mu_1 \otimes \dots \otimes \mu_n \otimes \mu_{n+1})) = A_2((\mu_1 * \dots * \mu_n) \otimes \mu_{n+1}),$$

which confirms (24.4). Henceforth therefore $n = 2$. For any measures $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any $\mathscr{B}^d$-measurable numerical function $f \ge 0$ it follows from 19.1 and 23.6 that

$$\int f\,d(\mu * \nu) = \int f \circ A_2\,d(\mu \otimes \nu) = \iint f(x + y)\,\mu(dx)\,\nu(dy) = \iint f(x + y)\,\nu(dy)\,\mu(dx). \tag{24.5}$$

As this holds for $f := 1_B$, the indicator function of any set $B \in \mathscr{B}^d$, we have

$$\mu * \nu(B) = \int \mu(B - y)\,\nu(dy) = \int \nu(B - x)\,\mu(dx). \tag{24.6}$$

(Recall (7.8) that $B - x = -x + B$.) Consequently $*$ is a commutative, and by (24.4) also an associative operation in $\mathscr{M}_+^b(\mathbb{R}^d)$. Due to 19.2 and 23.7, the equalities (24.5) are valid as well for every $\mu * \nu$-integrable numerical function $f$ on $\mathbb{R}^d$. Equality (24.6) is frequently taken as the definition of $\mu * \nu$.

Evidently $\mathscr{M}_+^b(\mathbb{R}^d)$ is closed with respect to addition and under multiplication by numbers in $\mathbb{R}_+$. From (24.6) we immediately see the relation of convolution to these two operations: For all $\mu, \nu, \nu_1, \nu_2 \in \mathscr{M}_+^b(\mathbb{R}^d)$, $\alpha \in \mathbb{R}_+$

$$\mu * (\nu_1 + \nu_2) = \mu * \nu_1 + \mu * \nu_2, \tag{24.7}$$

$$\mu * (\alpha\nu) = (\alpha\mu) * \nu = \alpha(\mu * \nu). \tag{24.8}$$

The distributive law (24.7) even holds in the following generality: For every sequence $(\nu_n)$ of measures from $\mathscr{M}_+^b(\mathbb{R}^d)$ satisfying $\sum_{n=1}^{\infty} \|\nu_n\| < +\infty$, the sum $\sum_{n=1}^{\infty} \nu_n$ is also a measure in $\mathscr{M}_+^b(\mathbb{R}^d)$ (cf. Example 4 of §3). Taking account of 11.5, it therefore follows from (24.6) that

$$\mu * \Big(\sum_{n=1}^{\infty} \nu_n\Big) = \sum_{n=1}^{\infty} \mu * \nu_n \tag{24.9}$$

for every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$.

Let us now compute $\mu * \nu$ in some special cases.

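1. We again denote by $T_a$ the translation mapping $x \mapsto x + a$ of $\mathbb{R}^d$ onto itself via $a \in \mathbb{R}^d$, and by $\varepsilon_a$ the (Dirac) measure on $\mathscr{B}^d$ defined by unit mass at the point $a$. Of course, $\varepsilon_a \in \mathscr{M}_+^b(\mathbb{R}^d)$ and $\|\varepsilon_a\| = 1$. From (24.6) follows that $\varepsilon_a * \mu(B) = \mu(B - a) = \mu(T_a^{-1}(B))$ for all $B \in \mathscr{B}^d$, and so

$$\varepsilon_a * \mu = T_a(\mu) \quad \text{for all } \mu \in \mathscr{M}_+^b(\mathbb{R}^d),\ a \in \mathbb{R}^d. \tag{24.10}$$

Now $T_0$ is the identity mapping, so $\varepsilon_0$ is a - and obviously the only - unit with respect to convolution. If, namely, $\varepsilon$ were also a unit, meaning that $\mu = \varepsilon * \mu$ for every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, then it would follow that $\varepsilon_0 = \varepsilon * \varepsilon_0 = \varepsilon$. For the special choice $\mu := \varepsilon_b$, (24.10) says that

$$\varepsilon_a * \varepsilon_b = \varepsilon_{a+b} \quad \text{for all } a, b \in \mathbb{R}^d. \tag{24.10'}$$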
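For finitely supported measures on the integers, Definition 24.1 reduces to a double sum, which makes (24.3), commutativity and the Dirac rules (24.10), (24.10') easy to check. The sketch below is an illustration only: measures are represented as dictionaries from points to masses, and the helper names and sample weights are assumptions introduced here.

```python
from collections import defaultdict
from fractions import Fraction

def convolve(mu, nu):
    """Image of the product measure under (x, y) -> x + y, for finitely supported measures."""
    out = defaultdict(Fraction)
    for x, p in mu.items():
        for y, q in nu.items():
            out[x + y] += p * q
    return dict(out)

def total_mass(mu):
    return sum(mu.values(), Fraction(0))

def dirac(a):
    """Unit mass at the point a."""
    return {a: Fraction(1)}

# Hypothetical finite measures on the integers (assumptions for illustration only).
mu = {0: Fraction(1, 2), 1: Fraction(3, 2)}
nu = {-1: Fraction(2), 4: Fraction(1, 3)}

print(convolve(mu, nu))
print(total_mass(convolve(mu, nu)), total_mass(mu) * total_mass(nu))
print(convolve(dirac(2), mu))          # mu translated by 2
print(convolve(dirac(2), dirac(3)))    # unit mass at 5
```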

2. Let $f \ge 0$ be a $\lambda^d$-integrable numerical function on $\mathbb{R}^d$ and $\mu := f\lambda^d$. Since $\|\mu\| = \int f\,d\lambda^d < +\infty$, $\mu$ also lies in $\mathscr{M}_+^b(\mathbb{R}^d)$. Let us compute $\mu * \nu$ for an arbitrary $\nu \in \mathscr{M}_+^b(\mathbb{R}^d)$. From 17.3, using the translation-invariance of $\lambda^d$ and the general transformation theorem 19.1, we get

$$\mu * \nu(B) = \iint 1_B(x + y) f(x)\,\lambda^d(dx)\,\nu(dy) = \iint 1_B(x + y) f(x)\,T_{-y}(\lambda^d)(dx)\,\nu(dy) = \iint 1_B(x) f(x - y)\,\lambda^d(dx)\,\nu(dy)$$

for every $B \in \mathscr{B}^d$. With the help of Tonelli's theorem it further follows that

$$\mu * \nu(B) = \int 1_B(x) q(x)\,\lambda^d(dx) = \int_B q\,d\lambda^d,$$

where $q$ is the non-negative $\mathscr{B}^d$-measurable function $x \mapsto \int f(x - y)\,\nu(dy)$. This function is also $\lambda^d$-integrable, since $\int q\,d\lambda^d = \|\mu * \nu\| < +\infty$. Thus whenever $\mu$ has a density with respect to $\lambda^d$, so does $\mu * \nu$. We set $f * \nu := q$, that is, we make the definition

$$f * \nu(x) := \int f(x - y)\,\nu(dy) \quad \text{for } x \in \mathbb{R}^d. \tag{24.11}$$

The preceding result now assumes the more suggestive form

$$(f\lambda^d) * \nu = (f * \nu)\lambda^d. \tag{24.12}$$

Naturally $f * \nu$ is called the convolution of $f$ and $\nu$.

3. Besides $\mu = f\lambda^d$, let now $\nu = g\lambda^d$ also have a $\lambda^d$-integrable density $g \ge 0$. According to 17.3 and the preceding,

$$f * (g\lambda^d)(x) = \int f(x - y) g(y)\,\lambda^d(dy) \qquad (x \in \mathbb{R}^d)$$

is a density for $\mu * \nu$ with respect to $\lambda^d$. We denote this function by $f * g$, that is, we set

$$f * g(x) := \int f(x - y) g(y)\,\lambda^d(dy) \qquad (x \in \mathbb{R}^d) \tag{24.13}$$

and get

$$(f\lambda^d) * (g\lambda^d) = (f * g)\lambda^d. \tag{24.14}$$

Here too $f * g$ is called the convolution of $f$ and $g$. It is defined for every pair of non-negative $\lambda^d$-integrable functions and is itself such a function. Nevertheless, it might not be real-valued, even if $f$ and $g$ each are (cf. Remark 1 below). From (24.13) and the translation- and reflection-invariance of $\lambda^d$ it follows that for every $x \in \mathbb{R}^d$

$$f * g(x) = \int f(x - y) g(y)\,\lambda^d(dy) = \int f(x + y) g(-y)\,\lambda^d(dy) = \int f(y) g(x - y)\,\lambda^d(dy) = g * f(x).$$

That is, the $*$ operation between functions is also commutative:

$$f * g = g * f. \tag{24.15}$$

Similar calculations confirm its associativity; that is,

$$(f * g) * h = f * (g * h) \tag{24.16}$$

for all $\lambda^d$-integrable, non-negative functions $f, g, h$. The distributive law

$$f * (g + h) = f * g + f * h \tag{24.17}$$

and the homogeneity property

$$f * (\alpha g) = (\alpha f) * g = \alpha(f * g) \qquad (\alpha \in \mathbb{R}_+) \tag{24.18}$$

for such functions hold as well and follow immediately from (24.13).

4. For arbitrary functions $f, g \in \mathscr{L}^1(\lambda^d)$ decomposition into their positive and negative parts and appeal to the results secured in 3. show that

$$x \mapsto \int f(x - y) g(y)\,\lambda^d(dy),$$

while possibly defined only $\lambda^d$-almost everywhere (see Remark 1 below), is always $\lambda^d$-integrable. One can therefore define $f * g$ by

$$f * g(x) := \int f(x - y) g(y)\,\lambda^d(dy),$$

but generally only for $\lambda^d$-almost all $x \in \mathbb{R}^d$. Once again the expression convolution is used for this $f * g$.
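As a worked illustration of the convolution integral for functions (an example chosen here, not taken from the text), take $d = 1$ and $f = g = 1_{[0,1]}$. The integrand $f(x - y) g(y)$ is the indicator of $[0,1] \cap [x-1, x]$, so the convolution is the length of that intersection:

```latex
\[
  f * g(x) = \int 1_{[0,1]}(x - y)\, 1_{[0,1]}(y)\, \lambda^1(dy)
           = \lambda^1\bigl([0,1] \cap [x-1, x]\bigr)
           = \begin{cases}
               x     & \text{if } 0 \le x \le 1, \\
               2 - x & \text{if } 1 \le x \le 2, \\
               0     & \text{otherwise.}
             \end{cases}
\]
\[
  \int f * g \, d\lambda^1 = \int_0^1 x \, dx + \int_1^2 (2 - x) \, dx
                           = \tfrac12 + \tfrac12 = 1
                           = \Bigl(\int f \, d\lambda^1\Bigr)\Bigl(\int g \, d\lambda^1\Bigr).
\]
```

The last line is the total-mass identity for the measures $f\lambda^1$ and $g\lambda^1$, read off through their densities.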


Remarks. 1. For real-valued, non-negative functions $f, g \in \mathscr{L}^1(\lambda^d)$ the function $f * g$ need not be finite everywhere. It suffices to consider any real-valued, non-negative, even function $f$ which lies in $\mathscr{L}^1(\lambda^d)$ but not in $\mathscr{L}^2(\lambda^d)$ and to take $g = f$. Then $f * g(0) = +\infty$. In case $d = 1$, such a function is

$$f(x) := \begin{cases} 0 & \text{for } |x| \ge 1 \text{ or } x = 0, \\ |x|^{-1/2} & \text{for } 0 < |x| < 1. \end{cases}$$

2. In passing to $L^1(\lambda^d)$ - cf. Remark 1 in §15 - the difficulties highlighted above with the definition of $f * g$ disappear. Indeed, let $f \mapsto \tilde{f}$ be the canonical mapping of $\mathscr{L}^1(\lambda^d)$ onto $L^1(\lambda^d)$. One defines $\tilde{f} * \tilde{g}$ for arbitrary $\tilde{f}, \tilde{g} \in L^1(\lambda^d)$ as the image $\tilde{h}$ of a function $h \in \mathscr{L}^1(\lambda^d)$ which coincides $\lambda^d$-almost everywhere with $f * g$. This definition is independent of the special choice of representing functions $f$, $g$ and $h$ from $\mathscr{L}^1(\lambda^d)$. The new operation $*$ renders the vector space $L^1(\lambda^d)$ an algebra over $\mathbb{R}$.

Exercises.

1. Show that for any $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any linear mapping $T : \mathbb{R}^d \to \mathbb{R}^d$, $T(\mu * \nu) = T(\mu) * T(\nu)$. To this end, first observe that $T \circ A_2 = A_2 \circ (T \otimes T)$, where $T \otimes T$ denotes the mapping $(x, y) \mapsto (T(x), T(y))$ of $\mathbb{R}^d \times \mathbb{R}^d$ into itself.

2. Compute the $n$th convolution power of the function $f$ defined on $\mathbb{R}$ by $f(x) := e$ [...], that is, the convolution $f * \dots * f$ with $n$ ($\in \mathbb{N}$) factors. Is it true that for every $n \in \mathbb{N}$, $f$ has an "$n$th convolution root"? That is, is $f$ the $n$th convolution power of some $\lambda^1$-integrable function $g \ge 0$?

3. If we set $N_1(f) := \int |f|\,d\lambda^d$ (this is (14.1) for $\mu := \lambda^d$), then $N_1(f * g)$ [...]

[...] $n$, and this is true of each $n \in \mathbb{N}$. Now the set

$$K := \{x\} \cup \bigcup_{n \in \mathbb{N}} K_n$$

is compact. For if $\mathscr{U}$ is an open cover of $K$, then some $U \in \mathscr{U}$ contains $x$ and since $(V_n)$ is a neighborhood basis at $x$, $V_{n_0} \subset U$ for some $n_0 \in \mathbb{N}$. It follows that $K_n \subset V_n \subset U$ for all $n \ge n_0$. Since $K_1 \cup \dots \cup K_{n_0}$ is a compact subset of $K$, it is covered by finitely many sets in $\mathscr{U}$. These together with $U$ then furnish the desired finite covering of $K$. On the one hand then $\mu(K) < +\infty$, since $\mu$ is a Borel measure, and on the other hand, since $K_n \subset K$,

$$\mu(K) \ge \mu(K_n) > n \quad \text{for all } n \in \mathbb{N}.$$

This is the contradiction sought. □

Exercises.

1. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mathscr{E}$ a generator of $\mathscr{A}$ and $\Omega'$ a subset of $\Omega$. Consider the traces $\mathscr{A}'$ and $\mathscr{E}'$ of $\mathscr{A}$ and $\mathscr{E}$, resp., on $\Omega'$ and show that $\mathscr{E}'$ is a generator of the $\sigma$-algebra $\mathscr{A}'$ in $\Omega'$. Example 3 above is a special case.

2. Equip the set $\mathbb{R}$ with the so-called right-sided topology (which is also sometimes named after SORGENFREY [1947]) whose system $\mathscr{O}_r$ of open sets is defined as follows: A subset $U \subset \mathbb{R}$ lies in $\mathscr{O}_r$ if and only if for each $x \in U$ there is an $\varepsilon > 0$ such that $[x, x + \varepsilon[ \subset U$. The topological space thus created will be denoted $\mathbb{R}_r$. Establish, one after another, the following claims:

(a) Every right half-open interval $[a, b[$ is both open and closed in $\mathbb{R}_r$. The right-sided topology on $\mathbb{R}$ is strictly finer than the usual topology. In particular, $\mathbb{R}_r$ is a Hausdorff space.

(b) $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$.

(c) Suppose $(x_n)$ is a strictly isotone sequence of real numbers possessing the supremum $b \in \mathbb{R}$. Then the set $\{x_n : n \in \mathbb{N}\} \cup \{b\}$ is closed but not compact in $\mathbb{R}_r$. By contrast, if $(y_n)$ is a strictly antitone sequence of real numbers possessing the infimum $a \in \mathbb{R}$, then $\{a\} \cup \{y_n : n \in \mathbb{N}\}$ is compact in $\mathbb{R}_r$.

(d) Let $K$ be compact in $\mathbb{R}_r$. Then there exists (from the first part of (c)) for every $x \in K$ a $y \in \mathbb{Q}$ with $y < x$ and $[y, x[ \cap K = \emptyset$. If for each $x \in K$, $\rho(x)$ designates such a rational number $y$, then a mapping $\rho : K \to \mathbb{Q}$ materializes which is strictly isotone, and hence injective.

(e) Every compact subset of $\mathbb{R}_r$ is countable. (But (c) shows that the converse is not true.)

(f) Consider on $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$ the measure $\mu$ which assigns to every countable set the value 0 and to every uncountable set the value $+\infty$ (cf. Example 6). Then $\mu$ is a Borel measure on $\mathbb{R}_r$ for which no point of $\mathbb{R}_r$ has a neighborhood of finite measure. In particular, the measure $\mu$ is not locally finite and is neither inner regular nor outer regular.

(g) Consider the measure $\nu := f\lambda^1$ with density

$$f(x) := x^{-1}\,1_{]0, +\infty[}(x) \qquad (x \in \mathbb{R})$$

and show that it too is a non-locally-finite Borel measure on $\mathbb{R}_r$.

(h) Investigate the L-B measure $\lambda^1$, thought of as a Borel measure on $\mathbb{R}_r$, in respect to its inner and outer regularity.

§26. Radon measures on Polish spaces


For two extensive classes of Hausdorff spaces Borel measures come up very naturally. The first of these classes will be discussed in this section, beginning of course with its

26.1 Definition. A topological space $E$ is called Polish when its topology has a countable base and can be defined by a complete metric.

The terminology is due to N. BOURBAKI and commemorates the achievements of Polish topologists in the development of general topology. A metric is called complete when the associated metric space is complete: every Cauchy sequence in it converges. A countable base or basis for the topology is a countable system of open sets such that every open set is the union of those from the system which are subsets of it. For a metrizable space $E$ the existence of such a basis is equivalent to the existence of a countable dense subset.

Examples. 1. The euclidean spaces $\mathbb{R}^d$ of every dimension $d \ge 1$ are Polish, the ordinary euclidean metric being complete.

2. The product $E' \times E''$ of two Polish spaces is another, when given the product topology. For if $d'$, $d''$ are complete metrics generating the topologies of $E'$ and $E''$, resp., then the product topology of $E' \times E''$ is generated by the metric

$$d(x, y) := d'(x', y') + d''(x'', y''), \qquad x := (x', x''),\ y := (y', y''),$$

which moreover is complete. If $\mathscr{G}'$, $\mathscr{G}''$ are countable bases for $E'$, $E''$, resp., then $\{G' \times G'' : G' \in \mathscr{G}', G'' \in \mathscr{G}''\}$ is a countable basis for $E' \times E''$.

3. Every closed subspace $F$ of a Polish space $E$ is Polish. Just restrict to $F$ any complete metric that generates the topology of $E$.

4. Every open subspace $G$ of a Polish space $E$ is Polish.

Proof. We may suppose $G \ne E$. By 1. and 2. $\mathbb{R} \times E$ is Polish. Let $d$ be a complete metric giving the topology of $E$, and consider the set $F$ of all $(\lambda, x) \in \mathbb{R} \times E$ satisfying

$$\lambda \cdot d(x, E \setminus G) = 1.$$

Here, as usual, for $\emptyset \ne A \subset E$, $d(x, A) := \inf\{d(x, a) : a \in A\}$ is the distance from the point $x \in E$ to $A$. The mapping $x \mapsto d(x, A)$ is continuous on $E$, in fact, as the reader can easily check, $|d(x, A) - d(y, A)| \le d(x, y)$ for all $x, y \in E$. Consequently, $(\lambda, x) \mapsto \lambda\, d(x, E \setminus G)$ is a continuous real function on $\mathbb{R} \times E$, and $F$ is a closed subset of $\mathbb{R} \times E$, hence itself a Polish space, by 3. Finally, $(\lambda, x) \mapsto x$ maps $F$ homeomorphically onto $G$. To see surjectivity, we only have to notice that, because $E \setminus G$ is closed, $G$ coincides with the set $\{x \in E : d(x, E \setminus G) > 0\}$.

5. More generally it is true (cf. COHN [1980], Theorem 8.1.4 or WILLARD [1970], Theorem 24.12) that a subspace $A$ of a Polish space $E$ is Polish if $A$ is a $G_\delta$-set in $E$, that is, $A$ is the intersection of a sequence of open subsets of $E$. Thus, for example, the set $J$ of all irrational numbers with its topology as a subspace of $\mathbb{R}$ is Polish, since

$$J = \bigcap_{x \in \mathbb{Q}} (\mathbb{R} \setminus \{x\}).$$

6. Every compact space $E$ with a countable basis is Polish. For a famous theorem of P.S. URYSOHN (1898-1924) (cf. KELLEY [1955], p. 125 or WILLARD [1970], Theorem 23.1) guarantees that $E$ is metrizable, and in Remark 3 of §31 we shall even give a proof of this. The compactness of $E$ easily entails that every metric defining its topology is complete.
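The Lipschitz estimate for the distance function, left to the reader in the proof for Example 4, can be checked in a few lines. The following is a sketch of one standard argument; it uses only the triangle inequality for $d$ and the definition of $d(x, A)$ given there.

```latex
% For every a in A the triangle inequality gives
\[
  d(x, A) \le d(x, a) \le d(x, y) + d(y, a).
\]
% Taking the infimum over a in A on the right-hand side:
\[
  d(x, A) \le d(x, y) + d(y, A),
  \qquad\text{i.e.}\qquad
  d(x, A) - d(y, A) \le d(x, y).
\]
% Exchanging the roles of x and y yields the same bound for
% d(y, A) - d(x, A), hence
\[
  |d(x, A) - d(y, A)| \le d(x, y) \qquad (x, y \in E).
\]
```

In particular $x \mapsto d(x, A)$ is uniformly continuous, which is more than the proof needs.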

The key to the further discussion is the following lemma, which is here just a preliminary to the big theorem that follows it, but nevertheless is significant in its own right. In it we encounter our first extensive class of Radon measures.

26.2 Lemma. Every finite Borel measure $\mu$ on a Polish space $E$ is regular.

Proof. We consider the system $\mathscr{D}$ of all $B \in \mathscr{B}(E)$ which satisfy both

$$\mu(B) = \sup\{\mu(K) : K \text{ compact} \subset B\} \tag{26.1}$$

and

$$\mu(B) = \inf\{\mu(U) : B \subset U \text{ open}\}. \tag{26.2}$$

The goal of course is to show that $\mathscr{D} = \mathscr{B}(E)$. We block off the work into five sections. Let $d$ be a complete metric defining the topology of $E$.

1. $E \in \mathscr{D}$: Only (26.1) needs proof when $B = E$. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence which is dense in $E$, and for $x \in E$, real $r > 0$ let $K_r(x)$ denote the open ball of center $x$ and $d$-radius $r$. For every $r$ then

$$E = \bigcup_{n \in \mathbb{N}} K_r(x_n),$$

because in every ball $K_r(x)$ lies some $x_n$, so that $x \in K_r(x_n)$. Since $\mu$ is continuous from below,

$$\mu(E) = \lim_{k \to \infty} \mu\Bigl(\bigcup_{j=1}^{k} K_r(x_j)\Bigr).$$

Therefore, for each $\varepsilon > 0$ and $n \in \mathbb{N}$ there exists $k_n \in \mathbb{N}$ such that

$$\mu\Bigl(\bigcup_{j=1}^{k_n} K_{1/n}(x_j)\Bigr) > \mu(E) - \varepsilon 2^{-n}.$$

Each set $B_n := \bigcup_{j=1}^{k_n} \overline{K_{1/n}(x_j)}$, hence also their intersection $K := \bigcap_{n \in \mathbb{N}} B_n$, is closed, and we have

$$\mu(E) - \mu(K) = \mu(E \setminus K) = \mu\Bigl(\bigcup_{n \in \mathbb{N}} (E \setminus B_n)\Bigr) \le \sum_{n \in \mathbb{N}} \mu(E \setminus B_n) < \varepsilon.$$

Being closed in the complete space $E$ and totally bounded by construction, $K$ is compact, which gives (26.1) for $B = E$.

2. Every closed set $C \subset E$ lies in $\mathscr{D}$: Let $\varepsilon > 0$ be given. We already know that there is a compact set $K$ with $\mu(E) - \mu(K) < \varepsilon$. According to 3.5 however

$$\mu(C) - \mu(C \cap K) = \mu(C \cup K) - \mu(K) \le \mu(E) - \mu(K) < \varepsilon$$

and this proves (26.1) for $B := C$, because $C \cap K$ is compact. As a closed subset of a metric space, $C$ is a $G_\delta$-set, that is, there are open sets $G_n \downarrow C$. To see this we may assume $C \ne \emptyset$, so that $G := E \setminus C$ is an open proper subset of $E$. Consequently, $x \mapsto d(x, C)$ is a continuous mapping whose zero-set is $C$, as was shown in treating Example 4. The sets $G_n := \{x \in E : d(x, C) < 1/n\}$ are therefore open and decrease to $C$. From the finiteness of $\mu$ and 3.2(c) we then have that $\mu(G_n) \downarrow \mu(C)$, showing that (26.2) is also satisfied by $B := C$.

3. Whenever $B$ lies in $\mathscr{D}$ so does $\complement B$: First note that for every compact $K \subset B$

$$\mu(\complement K) - \mu(\complement B) = \mu(B) - \mu(K),$$

and so $\complement B$ satisfies (26.2) whenever $B$ satisfies (26.1). Moreover, if $G$ is an open superset of $B$, then $\complement G$ is a closed subset of $\complement B$ with

$$\mu(\complement B) - \mu(\complement G) = \mu(G) - \mu(B),$$

showing, at least, that $\complement B$ satisfies (26.1) weakened by replacing "compact" there by "closed". But then application of step 2 to these closed sets gives us the full (26.1) for $\complement B$.

4. Whenever pairwise disjoint sets $D_n$ lie in $\mathscr{D}$ ($n \in \mathbb{N}$), their union $D$ also lies in $\mathscr{D}$: First of all

$$\mu(D) = \sum_{n=1}^{\infty} \mu(D_n).$$

Letting $\varepsilon > 0$ be given, we therefore have an $n_\varepsilon \in \mathbb{N}$ such that

$$\mu(D) - \sum_{n=1}^{n_\varepsilon} \mu(D_n) < \varepsilon/2. \tag{26.3}$$

Every $D_j$ contains a compact $K_j$ such that

$\mu(D_j) - \mu(K_j)$
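(c) For every $\varepsilon > 0$ there is a compact subset $K_\varepsilon \subset E$ such that $\mu(\complement K_\varepsilon) < \varepsilon$ and the restriction of $f$ to $K_\varepsilon$ is continuous.

Proof. Let us first suppose that $\mu$ is finite. Let $\mathscr{G}'$ be a countable base for the topology of $E'$ and $(G'_n)_{n \in \mathbb{N}}$ a sequential arrangement of its elements. Notice that $\mathscr{G}'$ is a generator of the Borel $\sigma$-algebra $\mathscr{B}(E')$ because every open subset of $E'$ is a (countable) union of sets from $\mathscr{G}'$.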



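(a)$\Rightarrow$(c): By hypothesis there is a Borel measurable mapping $g : E \to E'$ and a $\mu$-nullset $N \in \mathscr{B}(E)$ with

$$f(x) = g(x) \quad \text{for all } x \in \complement N. \tag{26.7}$$

For every set $G'_n$, $g^{-1}(G'_n) \in \mathscr{B}(E)$. Because every Radon measure on $E$ is regular, given $\varepsilon > 0$, there exist compact sets $K_n$ and open sets $U_n$ such that

$$K_n \subset g^{-1}(G'_n) \subset U_n \quad \text{and} \quad \mu(U_n \setminus K_n) < 2^{-n}\varepsilon \quad \text{for each } n \in \mathbb{N}. \tag{26.8}$$

The set $A := \bigcup_{n \in \mathbb{N}} (U_n \setminus K_n)$ is open, being a union of open sets. For its measure we have the obvious inequality

$$\mu(A) \le \sum_{n=1}^{\infty} \mu(U_n \setminus K_n) < \varepsilon.$$

Using once more the (inner) regularity of $\mu$, we find a compact $K \subset \complement(A \cup N) = \complement A \cap \complement N$ such that

$$\mu(\complement A \cap \complement N \cap \complement K) < \varepsilon - \mu(A),$$

thus (since $A \cup N \subset \complement K$ and $A \cup N \cup (\complement A \cap \complement N) = E$) such that

$$\mu(\complement K) = \mu(A \cup N \cup [\complement A \cap \complement N \cap \complement K]) < \mu(A) + \mu(N) + \varepsilon - \mu(A) = \varepsilon.$$

This set $K$ does what is wanted in (c), because by (26.7) $f$ and $g$ coincide in $K$ and because the restriction $g_0$ of $g$ to $\complement A$ is continuous, as we now confirm. For each set $G'_n$,

$$g_0^{-1}(G'_n) = g^{-1}(G'_n) \cap \complement A;$$

from (26.8) and the fact $U_n \setminus K_n \subset A$ follows therefore

$$U_n \cap \complement A = K_n \cap \complement A \subset g_0^{-1}(G'_n) \subset U_n \cap \complement A,$$

which means that

$$g_0^{-1}(G'_n) = U_n \cap \complement A = K_n \cap \complement A,$$

showing that the $g_0$-pre-image of $G'_n$ is open (as well as closed) in $\complement A$. Since $(G'_n)_{n \in \mathbb{N}}$ is a base for the topology of $E'$, this is enough to guarantee the continuity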

of $g_0 = g \mid \complement A$.

(c)$\Rightarrow$(b): It suffices to find pairwise disjoint compact subsets $K_n$ of $E$ such that $f \mid K_n$ is continuous and

$$\mu\Bigl(\complement \bigcup_{j=1}^{n} K_j\Bigr) < \frac{1}{n}$$

for each $n \in \mathbb{N}$. For then

$$N := \complement \bigcup_{n \in \mathbb{N}} K_n = \bigcap_{n \in \mathbb{N}} \complement K_n$$

is a Borel set disjoint from each $K_n$ and satisfying $\mu(N) < 1/n$ for every $n \in \mathbb{N}$, i.e., $\mu(N) = 0$. The sequence $(K_n)$ is gotten inductively from (c) as follows: To start off, there is a compact $K_1 \subset E$ such that $\mu(\complement K_1) < 1$ and $f \mid K_1$ is continuous. If $K_1, \ldots, K_n$ have been defined having the desired properties, we will get $K_{n+1}$ from (c) and the inner regularity of $\mu$. By (c) there is a compact $K' \subset E$ such that

$\mu(\complement K') < (2n + 2)^{-1}$ and $f \mid K'$ is continuous. With $L := K_1 \cup \ldots \cup K_n$ the inner regularity of $\mu$ supplies a compact $K_{n+1} \subset K' \setminus L$ such that

$$\mu(K' \setminus L) - \mu(K_{n+1}) = \mu(K' \cap \complement L \cap \complement K_{n+1}) < (2n + 2)^{-1}.$$

Because

$$\mu(\complement(L \cup K_{n+1})) = \mu(\complement K' \cap \complement L \cap \complement K_{n+1}) + \mu(K' \cap \complement L \cap \complement K_{n+1})$$

$$\le \mu(\complement K') + \mu(K' \cap \complement L \cap \complement K_{n+1}) < (n + 1)^{-1},$$

with this set $K_{n+1}$ the inductive construction is complete.

(b)$\Rightarrow$(a): If $E = N \cup K_1 \cup K_2 \cup \ldots$ is the given decomposition, one defines a mapping $g : E \to E'$ as follows. In case $N = \emptyset$, let $g := f$. In case $N \ne \emptyset$, choose $y_0 \in f(N)$ arbitrarily and set

$$g(x) := f(x) \ \text{ for } x \in E \setminus N, \qquad g(x) := y_0 \ \text{ for } x \in N.$$

What has to be shown is that $g$ is Borel measurable, which is done as follows: For every open $G' \subset E'$

$$g^{-1}(G') = (g^{-1}(G') \cap N) \cup \bigcup_{n \in \mathbb{N}} (g^{-1}(G') \cap K_n) = N_0 \cup \bigcup_{n \in \mathbb{N}} g_n^{-1}(G'),$$

where $N_0 := g^{-1}(G') \cap N$ and $g_n := g \mid K_n$. Now $N_0$ is either $N$ or $\emptyset$, according as $y_0 \in G'$ or $y_0 \notin G'$. Moreover, $g_n$ coincides with the restriction of $f$ to $K_n$, so that by hypothesis $g_n^{-1}(G')$ is open in $K_n$, that is, of the form $K_n \cap U_n$ for some open subset $U_n$ of $E$. Therefore only Borel sets occur in the above decomposition of $g^{-1}(G')$ and we conclude that $g^{-1}(G')$ is a Borel set. This being true of every open $G' \subset E'$, the Borel measurability of $g$ follows from 7.2.

Now consider an arbitrary locally finite measure $\mu$ on $\mathscr{B}(E)$. According to 26.3, $\mu$ is $\sigma$-finite. Lemma 17.6 therefore furnishes a strictly positive $\mu$-integrable real function $h$ on $E$. The measure $\nu := h\mu$ is then a finite Borel measure on $E$ which has exactly the same nullsets as $\mu$. The proven equivalence of (a) and (b) for the measure $\nu$ therefore entails the validity of this equivalence for the measure $\mu$. Thus the whole theorem is proved.
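The measure estimate that closes the inductive step in (c)$\Rightarrow$(b) can be written out as a single chain. The following is only a sketch of that bookkeeping, with $K'$, $L$ and $K_{n+1}$ as in the proof; it relies on $K_{n+1} \subset K' \setminus L$ and on the two bounds by $(2n+2)^{-1}$ chosen there.

```latex
% The complement of L \cup K_{n+1} is split into the part outside K'
% and the part inside K'; the two pieces are disjoint.
\begin{align*}
\mu\bigl(\complement(L \cup K_{n+1})\bigr)
  &= \mu(\complement K' \cap \complement L \cap \complement K_{n+1})
   + \mu(K' \cap \complement L \cap \complement K_{n+1}) \\
  &\le \mu(\complement K')
   + \mu(K' \cap \complement L \cap \complement K_{n+1}) \\
  &< \frac{1}{2n+2} + \frac{1}{2n+2} = \frac{1}{n+1}.
\end{align*}
```

Since $L \cup K_{n+1} = K_1 \cup \ldots \cup K_{n+1}$, this is the required bound for the index $n + 1$, and $f \mid K_{n+1}$ is continuous because $K_{n+1} \subset K'$.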

Remarks. 1. The equivalence of (a) and (b) in Lusin's theorem may be lost if (a) is strengthened to the $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurability of $f$. It suffices to take for $E$ the compact set $[0,1] \times [0,1]$ and for $\mu$ the L-B measure $\lambda^2_E$. As was noted in the second part of Remark 4, §8, $E$ contains a $\mu$-nullset $N$ which contains a non-Borel subset. If $M$ is such a set, its indicator function $f = 1_M$ is not Borel measurable, although $f$ is $\mu$-almost everywhere equal to the Borel measurable function $1_N$. On the other hand, if $f$ is $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurable, there is a Polish topology $\tau$ on $E$, stronger than the original but generating exactly the same Borel sets, such that $f$ is $\tau$-continuous. See 3.2.6 of SRIVASTAVA [1998] for the proof, which is not difficult.



2. The Dirichlet jump function (cf. Remark 1 of §16) is continuous at no point of its domain of definition $[0, 1]$, yet it is Borel measurable. This shows that in assertion (c) of Lusin's theorem one cannot hope to be able to replace the continuity of the function $f \mid K$ by the continuity of $f$ at each point of $K$.
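For the Dirichlet function a set $K$ as in assertion (c) can be written down explicitly. The sketch below assumes the usual convention that the function equals $1$ at rational and $0$ at irrational points of $[0,1]$; the enumeration $(q_n)$ and the set $U$ are illustrative notation, not taken from the text.

```latex
% (q_n)_{n \ge 1}: an enumeration of \mathbb{Q} \cap [0,1]  (assumed notation)
\[
  U := \bigcup_{n=1}^{\infty}
       \bigl]\, q_n - \varepsilon 2^{-n-2},\; q_n + \varepsilon 2^{-n-2} \,\bigr[ ,
  \qquad
  K := [0,1] \setminus U .
\]
% The n-th interval has length \varepsilon 2^{-n-1}, so by subadditivity
\[
  \lambda^1([0,1] \setminus K) \le \lambda^1(U)
  \le \sum_{n=1}^{\infty} \varepsilon 2^{-n-1} = \frac{\varepsilon}{2} < \varepsilon .
\]
```

$K$ is closed and bounded, hence compact, and contains no rational point, so the restriction of the function to $K$ is identically $0$ and therefore continuous, although the function itself is discontinuous at every point of $K$.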

Exercises. 1. Show that every inner regular finite Borel measure on a Hausdorff space is outer regular.

2. Show that in a Polish space $E$ the Dirac measures are the only non-zero Borel measures $\mu$ which take only the values $0$ and $1$. [Hint: Show that the system of all compact $K \subset E$ such that $\mu(K) = 1$ is $\cap$-stable and investigate the intersection of all its sets.]

3. Show that $\mathscr{B}(E \times E') = \mathscr{B}(E) \otimes \mathscr{B}(E')$ for any Polish spaces $E$, $E'$.

4. Consider $K$ compact $\subset U$ open $\subset \mathbb{R}^d$, and for each $n \in \mathbb{N}$ let $V_n$ denote the open ball of radius $1/n$ and center $0$. Show that $K + V_n \subset U$ for some $n$. [Hint: If $(K + V_n) \cap \complement U \ne \emptyset$ for every $n \in \mathbb{N}$, find $x_n \in K$, $v_n \in V_n$, $z_n \in \complement U$ such that $x_n + v_n = z_n$, for every $n \in \mathbb{N}$. Some subsequence of $(x_n)$ converges to a point $x_0 \in K$ and because $\complement U$ is closed we even have $x_0 \in K \cap \complement U$, which contradicts the fact that $K \subset U$.]

5. Let $\mu$ be a locally finite Borel measure on a Polish space $E$ and $f : E \to E'$ a mapping into a topological space $E'$ with a countable base. Show that assertions (a) and (b) in Lusin's theorem are equivalent to (c'): For every $\varepsilon > 0$ and every compact $K \subset E$ there is a further compact $K_\varepsilon \subset K$ such that $\mu(K \setminus K_\varepsilon) < \varepsilon$ and $f \mid K_\varepsilon$ is continuous.

§27. Properties of locally compact spaces

A topological space is called locally compact if it is Hausdorff and if each of its points has at least one compact neighborhood. Examples of such spaces are the euclidean space $\mathbb{R}^d$, every manifold (i.e., every locally euclidean Hausdorff space), every discrete space, and every compact space. When an arbitrary point is removed from a compact space the remainder is a locally compact space. Actually every locally compact space is of this form. For if $\mathscr{O}$ is the system of all open subsets of the locally compact space $E$ and $\omega_0$ is any (so-called ideal) point not in $E$, then a topology can be defined on $E' := E \cup \{\omega_0\}$ as follows: The system $\mathscr{O}'$ of open sets in $E'$ shall consist of $\mathscr{O}$ together with the sets $E' \setminus K$ for all the compact subsets $K$ of $E$. This defines a compact topology on $E'$, $E$ is an open subset of $E'$ and the topology that $E$ inherits from $\mathscr{O}'$ is its original topology. $E$ was compact to start with if and only if $\omega_0$ is an isolated point in $E'$. If $E$ is not compact, then it is dense in $E'$. These claims are easily confirmed, or the reader can consult KELLEY [1955], p. 150, or WILLARD [1970], 19.2. The space $E'$ is called, after its creator P.S. ALEXANDROFF (1896-1982), the (Alexandroff) one-point compactification of $E$ and $\omega_0$ its infinitely remote point.

We will pursue the further theory of locally compact spaces via this compactification. First we study some distinguished continuous functions in this environment. For an arbitrary topological space $E$ we denote by $C(E)$ and $C_b(E)$ the vector space of all, respectively all bounded, continuous real functions on $E$.

27.1 Definition. Let $f : E \to \mathbb{R}$ be a real function on a topological space $E$. The set

$$\operatorname{supp}(f) := \overline{\{f \ne 0\}} \tag{27.1}$$

is called the support of $f$.

The complement of $\operatorname{supp}(f)$ is thus the largest open set at every point of which $f$ takes the value zero. If $E$ is locally compact, we will designate by $C_c(E)$ the set of all $f \in C(E)$ with compact support $\operatorname{supp}(f)$. A function $f \in C(E)$ lies in $C_c(E)$ just if there is some compact subset of $E$ in the complement of which $f$ is identically zero. Clearly

$$C_c(E) \subset C_b(E) \subset C(E), \tag{27.2}$$

since an $f \in C_c(E)$ is bounded on its compact support, hence throughout $E$. $C_c(E)$ is a vector subspace of $C_b(E)$. More generally for any $n \in \mathbb{N}$, $\varphi \in C(\mathbb{R}^n)$ with $\varphi(0) = 0$ and $f_1, \ldots, f_n \in C_c(E)$, the composition $\varphi \circ (f_1, \ldots, f_n)$ lies in $C_c(E)$, and indeed its support is a subset of $\bigcup_{j=1}^{n} \operatorname{supp}(f_j)$. In particular, whenever $u, v \in C_c(E)$ the functions $|u|$, $u \vee v$, $u \wedge v$, and therewith $u^+$ and $u^-$, all lie in $C_c(E)$. The needed continuity of $\varphi(x, y) := x \vee y$ on $\mathbb{R}^2$ follows from the identity $x \vee y = \frac{1}{2}(x + y + |x - y|)$. In the special case of a compact space $E$, all three function spaces in (27.2) coincide.
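The identity for $x \vee y$ quoted above is a two-case check. The following short sketch also records the companion identity for $x \wedge y$, which is not stated in the text but follows the same way.

```latex
% Case x \ge y: |x - y| = x - y.  Case x < y: |x - y| = y - x.
\[
  \tfrac12\bigl(x + y + |x - y|\bigr)
  = \begin{cases}
      \tfrac12 (x + y + x - y) = x, & x \ge y,\\[2pt]
      \tfrac12 (x + y + y - x) = y, & x < y,
    \end{cases}
  \;=\; x \vee y .
\]
% Since x \vee y + x \wedge y = x + y, it follows that
\[
  x \wedge y = \tfrac12\bigl(x + y - |x - y|\bigr).
\]
```

Both right-hand sides are continuous on $\mathbb{R}^2$ and vanish at $(0, 0)$, so they are admissible choices of $\varphi$ in the composition statement above.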

A fundamental property of the space $C_c(E)$ is the following:

27.2 Theorem (on partitions of unity). Suppose that the compact subset $K$ of the locally compact space $E$ is covered by the $n$ open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then there are functions $f_1, \ldots, f_n \in C_c(E)$ with the following properties

$$f_j \ge 0 \quad \text{for } j = 1, \ldots, n; \tag{27.3}$$

$$\operatorname{supp}(f_j) \subset U_j \quad \text{for } j = 1, \ldots, n; \tag{27.4}$$

$$\sum_{j=1}^{n} f_j(x) \le 1 \quad \text{for all } x \in E; \tag{27.5}$$

$$\sum_{j=1}^{n} f_j(x) = 1 \quad \text{for all } x \in K. \tag{27.6}$$

Proof. We work in the one-point compactification $E' := E \cup \{\omega_0\}$ of $E$. The given open sets together with $U_0 := E' \setminus K$ constitute an open cover of $E'$. Because compact spaces are normal topological spaces (cf. KELLEY [1955], p. 141 or WILLARD [1970], Theorem 17.10), this covering can be "shrunk" to an open covering $U'_0, \ldots, U'_n$ of $E'$ satisfying $\overline{U'_j} \subset U_j$ for each $j = 0, \ldots, n$, where of course the bar denotes closure in $E'$. The theorem on partitions of unity in normal spaces (KELLEY [1955], p. 171 or WILLARD [1970], 20 C) provides functions $f'_0, \ldots, f'_n \in C(E')$ such that

(i) $f'_j \ge 0$, $\operatorname{supp}(f'_j) \subset \overline{U'_j}$, for $j = 0, \ldots, n$;

(ii) $\sum_{j=0}^{n} f'_j(x) = 1$ for all $x \in E'$.

The restrictions $f_1, \ldots, f_n$ to $E$ of $f'_1, \ldots, f'_n$ lie in $C(E)$ and it will be easy to show that they have all the properties wanted. From (i) and (ii) properties (27.3)-(27.5) follow almost immediately. One only has to notice that for each $j = 1, \ldots, n$

$$\operatorname{supp}(f_j) = \operatorname{supp}(f'_j) \cap E \subset \overline{U'_j} \cap E = \overline{U'_j} \subset U_j,$$

since $\overline{U'_j} \subset U_j \subset E$. In particular, $\overline{U'_j}$, being a closed subset of the compact space $E'$, is a compact subset of $E$. From $\operatorname{supp}(f_j) \subset \overline{U'_j}$ therefore follows the compactness of this support. Thus $f_1, \ldots, f_n$ all lie in $C_c(E)$. The remaining property (27.6) likewise follows from (ii) because $\operatorname{supp}(f'_0) \subset U_0 = E' \setminus K$ entails that $f'_0(x) = 0$ for all $x \in K$. $\Box$

Two consequences of the foregoing will turn out to be especially useful. The first, known as Urysohn's lemma, often serves as the starting point for inductive constructions of partitions of unity (see, e.g., RUDIN [1987], p. 39). The second can also be proven directly, as indicated in Exercise 1 below.

27.3 Corollary 1. In the locally compact space $E$, $U$ is an open neighborhood of the compact subset $K$. Then $C_c(E)$ contains a function $f$ which satisfies

$$0 \le f \le 1, \quad f(K) = \{1\}, \quad \text{and} \quad \operatorname{supp}(f) \subset U. \tag{27.7}$$

In particular, $\operatorname{supp}(f)$ is a compact neighborhood of $K$.

Proof. We have only to apply 27.2 for $n = 1$. Since $K \subset \{f_1 > 0\} \subset \operatorname{supp}(f_1)$, the fact that $\{f_1 > 0\}$ is open means that $\operatorname{supp}(f_1)$ is indeed a neighborhood of $K$. $\Box$

27.4 Corollary 2. In the locally compact space $E$ the compact subset $K$ is covered

by the $n$ open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then $K$ can be decomposed as $K = K_1 \cup \ldots \cup K_n$ with $K_j$ a compact subset of $U_j$ for each $j = 1, \ldots, n$.

Proof. Let $f_1, \ldots, f_n \in C_c(E)$ be as provided by 27.2. The compact sets

$$K_j := K \cap \operatorname{supp}(f_j), \qquad j = 1, \ldots, n$$

do what is wanted; for if $x \in K$, then $1 = f_1(x) + \ldots + f_n(x)$ means that $f_j(x) \ne 0$ for some $j$, and therefore $x \in K_j$.

For a locally compact space $E$ there is another function space besides $C_c(E)$ that is of importance. To define it we assign to every bounded real function $f$ on an arbitrary space $E$ its supremum norm, also called its uniform norm, via

$$\|f\| := \sup_{x \in E} |f(x)|.$$

The mapping $(f, g) \mapsto \|f - g\|$ makes $C_b(E)$, more generally even the vector space of all bounded real functions on $E$, into a metric space. One speaks of the metric of uniform convergence (on $E$). That a sequence $(f_n)$ of bounded real functions on $E$ converges uniformly on $E$ to a bounded function $f$ just means that

$$\lim_{n \to \infty} \|f_n - f\| = 0.$$

27.5 Definition. A continuous real function $f$ on a locally compact space $E$ is said to vanish at infinity if it lies in the closure $C_0(E)$ of $C_c(E)$ in $C_b(E)$ with respect to the metric of uniform convergence. Denoting closure in this metric by a bar, we thus have $C_0(E) := \overline{C_c(E)} \subset C_b(E)$.

The terminology "vanishing at infinity" is both clarified and justified by

27.6 Theorem. For a real function $f$ on a locally compact space $E$ the following statements are equivalent:

(a) $f \in C_0(E)$;

(b) $f \in C(E)$ and $\{|f| \ge \varepsilon\}$ is compact for each $\varepsilon > 0$;

(c) the function

$$f'(x) := \begin{cases} f(x), & \text{for all } x \in E \\ 0, & \text{for } x = \omega_0 \end{cases}$$

is continuous on the one-point compactification $E'$ of $E$.

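Proof. (a)$\Rightarrow$(b): Given $\varepsilon > 0$, there is by definition of $C_0(E)$ a $g \in C_c(E)$ with $\|f - g\| \le \varepsilon/2$. Every $x \in E$ satisfies $|f(x)| - |g(x)| \le |f(x) - g(x)| \le \varepsilon/2$, and therefore $\{|f| \ge \varepsilon\} \subset \{|g| \ge \varepsilon/2\} \subset \operatorname{supp}(g)$. This shows that $\{|f| \ge \varepsilon\}$ is a relatively compact set. But, due to the continuity of $f$, it is also closed. Hence it is compact.

(b)$\Rightarrow$(c): Since the subspace topology of $E$ in $E'$ is its original topology and $E$ is an open subset of $E'$, continuity of $f'$ at each point of $E$ is assured by $f \in C(E)$. As to continuity at the ideal point $\omega_0$, given $\varepsilon > 0$, we have $|f'(x) - f'(\omega_0)| = |f'(x)| < \varepsilon$ for all $x$ in the neighborhood $E' \setminus \{|f| \ge \varepsilon\}$ of $\omega_0$; this set is open in $E'$ because $\{|f| \ge \varepsilon\}$ is a compact subset of $E$.

(c)$\Rightarrow$(a): Continuity of $f'$ at $\omega_0$ and the definition of the topology in $E'$ mean that for each $\varepsilon > 0$ there is a compact $K \subset E$ such that $|f(x)| = |f'(x) - f'(\omega_0)| < \varepsilon$ for all $x \in E \setminus K$. 27.3 supplies a $g \in C_c(E)$ with $0 \le g \le 1$ and $g(K) = \{1\}$. Then $fg \in C_c(E)$ and satisfies

$$|f(x)g(x) - f(x)| = |f(x)|\,(1 - g(x)) < \varepsilon$$

for all $x \in E$, so $\|fg - f\| \le \varepsilon$. As $\varepsilon > 0$ is arbitrary, this proves that $f \in C_0(E)$.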
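Criterion (b) is easy to test on a concrete function. The sketch below takes $E = \mathbb{R}$ and the function $f(x) := 1/(1 + x^2)$, an example chosen here for illustration and not taken from the text.

```latex
% f(x) = 1/(1+x^2) on E = \mathbb{R}  (illustrative choice)
\[
  \{|f| \ge \varepsilon\}
  = \Bigl\{x \in \mathbb{R} : x^2 \le \tfrac{1}{\varepsilon} - 1\Bigr\}
  = \begin{cases}
      \bigl[-\sqrt{1/\varepsilon - 1},\ \sqrt{1/\varepsilon - 1}\,\bigr],
        & 0 < \varepsilon \le 1,\\[2pt]
      \emptyset, & \varepsilon > 1 .
    \end{cases}
\]
```

Each of these sets is compact, so $f \in C_0(\mathbb{R})$ by 27.6. On the other hand $f > 0$ everywhere, so $\operatorname{supp}(f) = \mathbb{R}$ is not compact and $f \notin C_c(\mathbb{R})$; the inclusion $C_c(E) \subset C_0(E)$ can therefore be strict.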

Exercises. 1. Without resort to partitions of unity, prove Corollary 27.4 directly. [Hint for the case $n = 2$: Separate the disjoint compacta $K \setminus U_1$, $K \setminus U_2$ with disjoint open neighborhoods $V_1$, $V_2$ and set $K_1 := K \setminus V_1$, $K_2 := K \setminus V_2$.]

2. Let $E' = E \cup \{\omega_0\}$ be the one-point compactification of a locally compact space $E$. Describe the Borel sets in $E'$ by means of the Borel sets in $E$. In particular, see how your description fits into the following general picture: For a measurable space $(E, \mathscr{A})$, a point $\omega_0 \notin E$ and the set $E^{\omega_0} := E \cup \{\omega_0\}$, the $\sigma$-algebra $\mathscr{A}^{\omega_0}$ in $E^{\omega_0}$ generated by $\mathscr{A}$ and $\{\omega_0\}$ consists of all $A' \subset E^{\omega_0}$ such that $A' \cap E \in \mathscr{A}$.

§28. Construction of Radon measures on locally compact spaces

In what follows $E$ will be a locally compact space. We consider a Borel measure $\mu$ (defined on $\mathscr{B}(E)$). Here the requirement $\mu(K) < +\infty$ for every compact set $K$ is the same as the local finiteness requirement, because every point of $E$ has a compact neighborhood and the implication (25.7) holds in general. So in the present context the concepts of Borel measure and locally finite measure on $\mathscr{B}(E)$ coincide. The Radon measures on $E$ are thus (cf. 25.3) those Borel measures which are inner regular.

For a Borel measure $\mu$ every $u \in C_c(E)$ turns out to be $\mu$-integrable. For, being continuous, $u$ is Borel measurable. Denoting by $K$ the compact support of $u$, we have $|u| \le \|u\| 1_K$. Since $\mu$ is a Borel measure, $1_K$ is $\mu$-integrable, and the $\mu$-integrability of $u$ follows. Therefore corresponding to the Borel measure $\mu$ is a linear form $I_\mu$ on $C_c(E)$ defined by

$$I_\mu(u) := \int u \, d\mu. \tag{28.1}$$

This is an isotone linear form in the sense of (12.3): From $u \le v$ follows $I_\mu(u) \le I_\mu(v)$. Because of the linearity of $I_\mu$ this is equivalent to

$$u \ge 0 \implies I_\mu(u) \ge 0,$$


which is why $I_\mu$ is usually called a positive linear form. This brings us to a key question for our further work: Is every positive linear form on $C_c(E)$ an $I_\mu$ for some Borel measure $\mu$ on $E$, or are there possibly positive linear forms of a completely different kind? Even for compact intervals $J := [a, b]$ on the number line, answering this question is by no means a trivial task. In this case however, as early as 1909 F. RIESZ showed (cf. RIESZ [1911]) that besides the linear forms $I_\mu$ arising from Borel measures $\mu$ on $J$, there are no other positive linear forms on $C_c(J) = C(J)$. One of our goals is to show that every locally compact space $E$ shares this property with $J$. The result in question will, in view of this pioneering work, be called the Riesz representation theorem. En route to it we will naturally be led to the construction of Radon measures on $E$.

Besides the locally compact space $E$, let now a positive linear form

$$I : C_c(E) \to \mathbb{R}$$

be given. What follows will prepare the way for the proof of the Riesz representation theorem. For every compact $K \subset E$ we set

$$\mu_*(K) := \inf\{I(u) : 1_K \le u \in C_c(E)\}. \tag{28.2}$$

Such functions $u$ exist thanks to Corollary 27.3. Consequently,

$$0 \le \mu_*(K) < +\infty. \tag{28.3}$$
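Definition (28.2) can be evaluated by hand for a very simple positive linear form. The sketch below assumes the evaluation functional $I(u) := u(x_0)$ at a fixed point $x_0 \in E$; this choice of $I$ and the name $x_0$ are illustrative assumptions, not part of the text.

```latex
% I(u) := u(x_0) for a fixed x_0 \in E  (hypothetical example)
%
% Case x_0 \in K: every admissible u satisfies u(x_0) \ge 1_K(x_0) = 1,
% and Corollary 27.3 (with U := E) gives a u with 0 \le u \le 1 and
% u(K) = \{1\}, hence u(x_0) = 1.
%
% Case x_0 \notin K: U := E \setminus \{x_0\} is an open neighborhood of K
% (E is Hausdorff), so Corollary 27.3 gives an admissible u with
% supp(u) \subset U, i.e. u(x_0) = 0; and I(u) \ge 0 for all admissible u.
\[
  \mu_*(K) = \inf\{u(x_0) : 1_K \le u \in C_c(E)\}
  = \begin{cases} 1, & x_0 \in K,\\ 0, & x_0 \notin K. \end{cases}
\]
```

On compact sets this is the value of the Dirac measure at $x_0$, which is consistent with that measure representing the evaluation functional.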

Moreover, the mapping $K \mapsto \mu_*(K)$ is obviously isotone on the system $\mathscr{K}$ of all compact sets. For an arbitrary $A \in \mathscr{P}(E)$ we set

$$\mu_*(A) := \sup\{\mu_*(K) : K \text{ compact} \subset A\}. \tag{28.4}$$

Because of the above noted isotoneity of $\mu_*$ on $\mathscr{K}$, this new definition is consistent with (28.2). Finally, for $A \in \mathscr{P}(E)$ we define

$$\mu^*(A) := \inf\{\mu_*(U) : A \subset U \text{ open}\}. \tag{28.5}$$

Then $\mu_*$ and $\mu^*$ are isotone functions on $\mathscr{P}(E)$. Moreover

$$\mu_*(A) \le \mu^*(A) \quad \text{for all } A \in \mathscr{P}(E), \tag{28.6}$$

as follows from the obvious fact that $\mu_*(A) \le \mu_*(U)$ for every open $U \supset A$; and

$$\mu_*(U) = \mu^*(U) \quad \text{for all open } U \in \mathscr{P}(E), \tag{28.7}$$

which follows from (28.5) and the isotoneity of $\mu_*$. Somewhat more effort is required to check that

$$\mu_*(K) = \mu^*(K) \quad \text{for all } K \in \mathscr{K}. \tag{28.8}$$

For every $\varepsilon > 0$ definition (28.2) supplies a $u \in C_c(E)$ with $u \ge 1_K$ and

$$I(u) - \mu_*(K) < \varepsilon.$$



For $0 < \alpha < 1$, $U_\alpha := \{u > \alpha\}$ is an open superset of $K$ and $1_{U_\alpha}$
$1_{K_1 \cup K_2} = 1_{K_1} + 1_{K_2}$. According to 27.3 there is a $v \in C_c(E)$ with $0 \le v \le 1$, $v(K_1) = \{1\}$, and $\operatorname{supp}(v) \subset \complement K_2$, hence with $v(K_2) = \{0\}$. The functions $vu$ and $(1 - v)u$ lie in $C_c(E)$ and satisfy

$$vu \ge 1_{K_1} \quad \text{and} \quad (1 - v)u \ge 1_{K_2}.$$

Therefore

$$\mu_*(K_1) + \mu_*(K_2) \le I(vu) + I((1 - v)u) = I(u),$$

which, because of (28.2), has the consequence that

$$\mu_*(K_1) + \mu_*(K_2) \le \mu_*(K_1 \cup K_2).$$

In view of (28.8) this inequality is half of the equality being claimed. The other half is simply the subadditivity of the outer measure $\mu^*$.

The first important consequence of all this is:

28.3 Theorem. The restriction of $\mu^*$ to $\mathscr{B}(E)$ is a Borel measure.

The proof is immediate from Lemma 26.5 and the facts accumulated to this point. Notice that (28.7) and (28.5) say that hypothesis (i) of 26.5 is fulfilled, while (28.7), (28.8) and (28.4) insure that hypothesis (ii) of 26.5 is fulfilled.

The Borel measure $\mu^* \mid \mathscr{B}(E)$ has a series of further remarkable properties:

28.4 Theorem. Every Borel subset $A \subset E$ with $\mu^*(A) < +\infty$ satisfies

$$\mu_*(A) = \mu^*(A).$$

Proof. Given $\varepsilon > 0$, there is an open $U \supset A$ such that

$$\mu^*(U) - \mu^*(A) < \varepsilon/2,$$

which, due to $\mu^*(A) < +\infty$ and $\mu^*$ being a measure on $\mathscr{B}(E)$, can be written as

$$\mu^*(U \setminus A) = \mu^*(U) - \mu^*(A) < \varepsilon/2.$$

From (28.4) we get a compact $L \subset U$ such that

$$\mu^*(U \setminus L) = \mu^*(U) - \mu^*(L) < \varepsilon/2.$$

The set

$$Q := (U \setminus A) \cup (U \setminus L)$$

then satisfies $\mu^*(Q) < \varepsilon$. Hence there is an open $G \supset Q$ such that $\mu^*(G) < \varepsilon$. Now $K := L \setminus G$ is a (closed, hence) compact subset of $L$ with the properties

$$K \subset A \quad \text{and} \quad A \setminus K \subset G. \tag{28.10}$$

In fact, on the one hand

$$K = L \setminus G \subset L \setminus Q \subset L \setminus (U \setminus A) = L \cap A,$$

since $L \subset U$, and on the other hand

$$A \setminus K = A \setminus (L \setminus G) = (A \cap G) \cup (A \setminus L) \subset G \cup (U \setminus L) = G,$$

since $U \setminus L \subset Q \subset G$. From (28.10) we get

$$\mu^*(A) - \mu^*(K) = \mu^*(A \setminus K) \le \mu^*(G) < \varepsilon,$$


and so $\mu^*(A) < \mu^*(K) + \varepsilon = \mu_*(K) + \varepsilon \le \mu_*(A) + \varepsilon$, by (28.8) and (28.4). Since $\varepsilon > 0$ was arbitrary, this says that $\mu^*(A) \le \mu_*(A)$, which with (28.6) finishes the proof.

The finiteness hypothesis in the preceding theorem can be weakened. In doing so we make use of the terminology introduced just before the proof of Theorem 13.6.

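28.5 Corollary. The equality $\mu_*(A) = \mu^*(A)$ also holds for every $A \in \mathscr{B}(E)$ which has $\sigma$-finite $\mu^*$-measure.

Proof. The terminology means that there exist $A_n \in \mathscr{B}(E)$ ($n \in \mathbb{N}$), each of finite $\mu^*$-measure, such that $A_n \uparrow A$. The preceding theorem and the isotoneity yield

$$\mu^*(A_n) = \mu_*(A_n) \le \mu_*(A),$$

from which and the continuity of $\mu^*$ from below on $\mathscr{B}(E)$ follows $\mu^*(A) = \sup_n \mu^*(A_n)$

0, a compact $K_n \subset A_n$ satisfying

$$\mu_*(A_n) - \mu_*(K_n) < 2^{-n}\varepsilon \quad \text{for each } n \in \mathbb{N}.$$

Since the sets $K_j$ are pairwise disjoint,

$$\mu_*(A) \ge \mu_*\Bigl(\bigcup_{j=1}^{n} K_j\Bigr) = \mu^*\Bigl(\bigcup_{j=1}^{n} K_j\Bigr) = \sum_{j=1}^{n} \mu^*(K_j) = \sum_{j=1}^{n} \mu_*(K_j) > \sum_{j=1}^{n} \mu_*(A_j) - \varepsilon$$

for every $n \in \mathbb{N}$.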



Letting $n \to \infty$ we infer that

$$\mu_*(A) \ge \sum_{j=1}^{\infty} \mu_*(A_j) - \varepsilon,$$

holding for every $\varepsilon > 0$. That is, $\mu_*(A) \ge \sum_{n=1}^{\infty} \mu_*(A_n)$, the complementary inequality we needed to finish the proof.

We now set

$$\mu_0 := \mu_* \mid \mathscr{B}(E) \quad \text{and} \quad \mu^0 := \mu^* \mid \mathscr{B}(E) \tag{28.11}$$

and, inspired by COURRÈGE [1962], call these the essential measure determined by $I$ and the principal measure determined by $I$, respectively. Each is a Borel measure (28.3 and 28.6).

Obviously the essential measure $\mu_0$ is inner regular, hence is a Radon measure on $E$. By contrast the principal measure $\mu^0$ is outer regular. It turns out that $\mu^0$ is the more important of the two. Thus to the given positive linear form $I$ on $C_c(E)$ we have associated two Borel measures. The further relation of these measures to $I$ and the questions of whether and when they coincide will be clarified in the next section. The closing lemma of this section recasts definition (28.4), when $A$ is open, into an equivalent form. It has a preparatory character.

28.7 Lemma. Every open set $U \subset E$ satisfies

$$\mu_0(U) = \mu^0(U) = \sup\{I(u) : u \in C_c(E),\ \operatorname{supp}(u) \subset U,\ 0 \le u \le 1\}. \tag{28.12}$$

Proof. The first equality is just (28.7). Denote the right side of (28.12) by $\gamma$, and consider any compact $K \subset U$. Corollary 27.3 provides a function $u \in C_c(E)$ with $0 \le u \le 1$, $u(K) = \{1\}$ and $\operatorname{supp}(u) \subset U$. In particular, $1_K \le u$ and so by (28.2) $\mu_*(K) \le I(u) \le \gamma$, that is, $\mu_*(K) \le \gamma$ for every such $K$. It follows that $\mu^0(U) = \mu^*(U) = \mu_*(U) \le \gamma$, by (28.4). The reverse inequality $\gamma \le \mu^0(U)$ is derived as follows: Let $u \in C_c(E)$ be a typical function involved in the definition of $\gamma$. Set $L := \operatorname{supp}(u)$ and consider a typical $v \in C_c(E)$ involved in the definition (28.2) of $\mu_*(L)$. Evidently then $u \le v$, so $I(u) \le I(v)$; that is, $I(u) \le \mu_*(L) = \mu_0(L) = \mu^0(L) \le \mu^0(U)$. Taking the supremum over eligible $u$ gives finally the desired complementary inequality $\gamma \le \mu^0(U)$.

A sharpening of equality (28.12) will be presented in Exercise 2 of §29. The special case $U = E$ of Lemma 28.7 furnishes the following useful description of the total masses of $\mu_0$ and $\mu^0$:

$$\|\mu_0\| = \|\mu^0\| = \sup\{I(u) : u \in C_c(E),\ 0 \le u \le 1\}. \tag{28.13}$$
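Formula (28.13) shows at once that the total mass may be infinite. The sketch below assumes $E = \mathbb{R}$ and the positive linear form $I(u) := \int u \, d\lambda^1$; the trapezoidal functions $u_n$ are illustrative choices, not taken from the text.

```latex
% u_n: continuous, equal to 1 on [-n, n], equal to 0 outside [-n-1, n+1],
% and linear in between  (hypothetical test functions)
\[
  u_n(x) := \min\bigl\{1,\ \max\{0,\ n + 1 - |x|\}\bigr\},
  \qquad 0 \le u_n \le 1, \quad u_n \in C_c(\mathbb{R}),
\]
\[
  I(u_n) = \int u_n \, d\lambda^1 = 2n + 1 \;\xrightarrow[n \to \infty]{}\; +\infty .
\]
```

By (28.13) the essential and principal measures determined by this $I$ therefore have total mass $+\infty$, as one expects for L-B measure on the line.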


Exercises.

1. For a locally compact space $E$ and a measure $\mu$ defined on $\mathscr{B}(E)$, show that $\mu$ is a Borel measure if and only if $C_c(E) \subset \mathscr{L}^1(\mu)$.

2. Let $\mu$ be a Radon measure on a locally compact space $E$ and $(G_i)_{i \in I}$ a family of open sets which is upward filtering, that is, for any $i, j \in I$ there is a $k \in I$ such that $G_i \cup G_j \subset G_k$. Show that $G := \bigcup_{i \in I} G_i$ satisfies

$$\mu(G) = \sup\{\mu(G_i) : i \in I\}.$$

3. Using the preceding exercise, show that for any Radon measure $\mu$ on a locally compact space $E$:

(a) There exists a largest open set $G$ with $\mu(G) = 0$. The set $\complement G$ is called the support of the measure $\mu$ and is denoted $\operatorname{supp}(\mu)$.

(b) A point $x \in E$ lies in $\operatorname{supp}(\mu)$ if and only if every open neighborhood of $x$ has positive $\mu$-measure.

(c) For a non-negative $f \in C(E)$, $\int f \, d\mu = 0$ if and only if $f = 0$ throughout $\operatorname{supp}(\mu)$.

Determine $\operatorname{supp}(\lambda^d)$ for L-B measure $\lambda^d$ on $\mathbb{R}^d$, and $\operatorname{supp}(\varepsilon_a)$ for every Dirac measure $\varepsilon_a$ on $E$.

4. Let $\mu$ be a Borel measure on a locally compact space $E$. Show that every set $A$ from the $\sigma$-ring $\rho_\sigma(\mathscr{K})$ generated by the system $\mathscr{K}$ of compact subsets of $E$ is a Borel set which satisfies $\mu_0(A) = \mu^0(A)$. Here a ring $\mathscr{R}$ in a set $\Omega$ is called a $\sigma$-ring if the union of every sequence of sets in $\mathscr{R}$ is itself a set in $\mathscr{R}$. In complete analogy with $\sigma$-algebras, every subset of $\mathscr{P}(\Omega)$ is contained in a smallest $\sigma$-ring. Sometimes it is only the sets in $\rho_\sigma(\mathscr{K})$ which get called "Borel sets"; this is the case, e.g., in the classic exposition of HALMOS [1974]. Why is it generally the case that $\rho_\sigma(\mathscr{K}) \ne \mathscr{B}(E)$?

§29. Riesz representation theorem

Again let $E$ be a locally compact space. Every Borel measure $\mu$ on $E$ defines a positive linear form

$$I_\mu(u) := \int u \, d\mu$$

on $C_c(E)$. The question posed in §28 was: Is it true that for every positive linear form $I$ on $C_c(E)$ there is a Borel measure $\mu$ on $E$ such that $I_\mu = I$, that is, such that

$$I(u) = \int u \, d\mu \quad \text{for all } u \in C_c(E)?$$

Any such Borel measure $\mu$ will be called a representing measure for $I$. The answer, leaked earlier, to this question reads:


29.1 Riesz representation theorem. If $E$ is a locally compact space, every positive linear form $I$ on $C_c(E)$ has at least one representing measure. In fact, both the essential measure $\mu_0$ determined by $I$ and the principal measure $\mu^0$ determined by $I$ are representing measures for $I$.

Proof. $\mu_0$ and $\mu^0$ are Borel measures. It must be shown that

$$I(u) = \int u \, d\mu_0 = \int u \, d\mu^0 \quad \text{for all } u \in C_c(E), \tag{29.1}$$

and because of linearity and the fact that the positive and negative parts of each $u \in C_c(E)$ also lie in $C_c(E)$, it suffices to show this for non-negative $u$. So let such a $u \in C_c(E)$ be given and let the real number $b > 0$ be an upper bound for $u$. For a given $\varepsilon > 0$ choose real numbers $y_0, \ldots, y_n$ with

$0 = y_0$ [...]

§30. Convergence of Radon measures

[...] $\alpha > 0$, $\beta > 0$ the measure $\alpha\mu + \beta\nu$ also lies in $\mathscr{M}_+(E)$, as is easily checked. That is, $\mathscr{M}_+(E)$ is what is called a convex cone. Besides $\mathscr{M}_+(E)$ we often consider the following subsets

$\mathscr{M}_+^b(E) := \{\mu \in \mathscr{M}_+(E) : \mu(E) < +\infty\}$

$\mathscr{M}_+^1(E) := \{\mu \in \mathscr{M}_+(E) : \mu(E) = 1\}$,

the set of all finite (or bounded) Radon measures and the set of all Radon p-measures on $E$, respectively. Evidently

$\mathscr{M}_+^1(E) \subset \mathscr{M}_+^b(E) \subset \mathscr{M}_+(E)$.

In $\mathscr{M}_+^1(E)$ are to be found all the Dirac measures on $E$. And $\mathscr{M}_+^b(E)$ is a convex subcone of $\mathscr{M}_+(E)$. In the special case $E = \mathbb{R}^d$ the set $\mathscr{M}_+^b(\mathbb{R}^d)$ is the set of all finite Borel measures on $\mathbb{R}^d$, already familiar to us from §24. That the definition there is equivalent to the present one is due to Theorem 29.12, according to which every Borel measure on $\mathbb{R}^d$ is a Radon measure.

Depending on whether one thinks of the elements of $\mathscr{M}_+(E)$ as measures on $\mathscr{B}(E)$ or as positive linear forms on $C_c(E)$, two notions of convergence suggest themselves: One can define the convergence of a sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ to $\mu \in \mathscr{M}_+(E)$ by requiring either that

$\lim_{n \to \infty} \mu_n(A) = \mu(A)$ for all $A \in \mathscr{B}(E)$

or

$\lim_{n \to \infty} \int f\,d\mu_n = \int f\,d\mu$ for all $f \in C_c(E)$.

We will forthwith show that the first of these is of limited interest, while the second is of considerable significance.

30.1 Definition. A sequence $(\mu_n)_{n \in \mathbb{N}}$ of Radon measures on $E$ is said to be vaguely convergent to a Radon measure $\mu$ if

(30.1)   $\lim_{n \to \infty} \int f\,d\mu_n = \int f\,d\mu$ for all $f \in C_c(E)$.

A sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ is vaguely convergent just when the sequence of real numbers $(\int f\,d\mu_n)$ converges in $\mathbb{R}$ for every $f \in C_c(E)$. For in this case $f \mapsto \lim_n \int f\,d\mu_n$ evidently defines a positive linear form on $C_c(E)$, so by the Riesz representation theorem together with Theorem 29.3 there is a unique Radon measure $\mu$ to which $(\mu_n)$ vaguely converges. At the same time we see that a sequence in $\mathscr{M}_+(E)$ can have at most one vague limit.

Examples. 1. Let $(x_n)$ be a sequence in $E$, $x \in E$. If $(x_n)$ converges to $x$, then $(\varepsilon_{x_n})$ converges vaguely to $\varepsilon_x$, for the latter just amounts to $\lim f(x_n) = f(x)$. In general however $\lim \varepsilon_{x_n}(A) = \varepsilon_x(A)$ does not hold for all $A \in \mathscr{B}(E)$; in fact, if all $x_n$ are distinct from $x$, $A := \{x\}$ is such a set. Conversely, if $(\varepsilon_{x_n})$ vaguely converges to $\varepsilon_x$, then $(x_n)$ converges to $x$. For if this were not so, there would be a subsequence of $(x_n)$ which remains outside of some neighborhood $U$ of $x$. 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$ and $\operatorname{supp}(f) \subset U$. Evidently the sequence $(\int f\,d\varepsilon_{x_n}) = (f(x_n))$ does not converge to $\int f\,d\varepsilon_x$.

2. Let $(\alpha_n)$ be an arbitrary sequence of non-negative real numbers and $(x_n)$ a sequence in $E$ with the property that $\{n \in \mathbb{N} : x_n \in K\}$ is finite for every compact $K \subset E$. (In other words, $E$ is not compact and $\lim x_n = \omega_0 \in E'$.) Then the sequence of measures $\mu_n := \alpha_n \varepsilon_{x_n}$ ($n \in \mathbb{N}$) is vaguely convergent to the zero measure $\mu := 0$. For $\int f\,d\mu_n = \alpha_n f(x_n) = 0$ for all $n$ except the finitely many for which $x_n \in \operatorname{supp}(f)$, whenever $f \in C_c(E)$.
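The two examples can be checked numerically in the simplest setting $E = \mathbb{R}$. The following is only a sketch under assumptions that are not part of the text: the tent function `hat`, the helper names, and the concrete sequences $x_n = 1/n$ (Example 1) and $x_n = n$ with weights $\alpha_n = n^2$ (Example 2) are chosen purely for illustration.

```python
def hat(center, radius):
    """Continuous tent function with compact support [center - radius, center + radius]."""
    def f(t):
        return max(0.0, 1.0 - abs(t - center) / radius)
    return f

def integrate_dirac(f, x):
    """The integral of f against the Dirac measure at x is just f(x)."""
    return f(x)

def dirac_mass(x, A):
    """Mass that the Dirac measure at x gives to the set A (A is passed as a predicate)."""
    return 1.0 if A(x) else 0.0

f = hat(0.0, 1.0)                       # test function with f(0) = 1
xs = [1.0 / n for n in range(1, 2001)]  # x_n = 1/n converges to x = 0

# Example 1: the integrals f(x_n) approach f(0) ...
integrals = [integrate_dirac(f, x) for x in xs]

# ... while the masses of A = {0} stay at 0, although the limit measure gives A mass 1.
is_zero = lambda t: t == 0.0
masses = [dirac_mass(x, is_zero) for x in xs]

# Example 2: x_n = n eventually leaves every compact set, so even the
# unbounded weights alpha_n = n^2 give integrals that vanish against f.
weighted = [(n ** 2) * integrate_dirac(f, float(n)) for n in range(1, 50)]
```

Testing convergence against a single function proves nothing, of course; the sketch only makes visible why convergence of $\mu_n(A)$ for all Borel sets $A$ asks for more than vague convergence does.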

The fact, illustrated by Example 1, that the vague convergence of $(\mu_n)$ to $\mu$ does not generally entail the convergence of $(\mu_n(A))$ to $\mu(A)$ for each $A \in \mathscr{B}(E)$, while, as 30.2 will show, the converse is true, seems to indicate that the first mode of convergence mentioned above is too restrictive to be of much use. Actually, vague convergence of $(\mu_n)$ to $\mu$ follows just from knowing that $(\mu_n(A))$ converges to $\mu(A)$ for certain special sets $A \in \mathscr{B}(E)$. Even more:

30.2 Theorem. A sequence $(\mu_n)$ of Radon measures on a locally compact space $E$ converges vaguely to a Radon measure $\mu$ if and only if the following condition is fulfilled:

(30.2)   $\limsup_{n \to \infty} \mu_n(K) \le \mu(K)$ and $\liminf_{n \to \infty} \mu_n(G) \ge \mu(G)$

for every compact $K \subset E$ and every relatively compact, open $G \subset E$.

Proof. Suppose $(\mu_n)$ converges vaguely to $\mu$ and that $K$ and $G$ are any compact and open sets, respectively. Consider functions $u, v \in C_c(E)$ with $u \ge 1_K$, $0 \le v \le 1$ and $\operatorname{supp}(v) \subset G$. Then for all $n \in \mathbb{N}$

$\mu_n(K) \le \int u\,d\mu_n$ and $\int v\,d\mu_n \le \mu_n(G)$ [...]

[...] $> 0$ we choose finitely many numbers

$0 = y_0$ [...]

[...] $r > 0$ set $K_r(x) := r^d K(rx)$ ($x \in \mathbb{R}^d$). Then $K_r$ is also non-negative and $\lambda^d$-integrable, and $\int K_r\,d\lambda^d = 1$ as well. To see this we only have to recall (7.10), according to which the homothety $H_r(x) := rx$ on $\mathbb{R}^d$ transforms L-B measure thus: $H_r(\lambda^d) = r^{-d}\lambda^d$. For from that it follows

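that

$\int K_r\,d\lambda^d = r^d \int K \circ H_r\,d\lambda^d = r^d \int K\,d(H_r(\lambda^d)) = \int K\,d\lambda^d = 1.$

Now $r \mapsto K_r\lambda^d$ is a mapping of $]0, +\infty[$ into $\mathscr{M}_+^1(\mathbb{R}^d)$, and in the sense of the vague topology it satisfies

(30.7)   $\lim_{r \to +\infty} K_r\lambda^d = \varepsilon_0$.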
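A numerical sanity check of (30.7) in dimension $d = 1$ may help fix ideas. The sketch below rests on assumptions not taken from the text: $K$ is chosen to be the standard normal density, the test function `f` is an arbitrary continuous function supported in $[-2, 2]$, and the integrals are approximated by a midpoint rule on $[-10, 10]$.

```python
import math

def K(x):
    """Standard normal density: non-negative with integral 1 (an assumed choice of K)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def K_r(r, x):
    """Rescaled kernel K_r(x) = r^d K(r x), here with d = 1."""
    return r * K(r * x)

def f(x):
    """Continuous test function with compact support [-2, 2] and f(0) = 1."""
    return max(0.0, 1.0 - abs(x) / 2.0) * math.cos(x)

def integral(g, a=-10.0, b=10.0, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

scales = (1.0, 10.0, 100.0)

# Total mass of K_r * lambda: stays (approximately) equal to 1 for every r.
masses = {r: integral(lambda x, r=r: K_r(r, x)) for r in scales}

# Integral of f against K_r * lambda: moves toward f(0) as r grows.
values = {r: integral(lambda x, r=r: f(x) * K_r(r, x)) for r in scales}
```

Printing `values` next to `f(0.0)` shows the integrals approaching the value of the test function at the origin, which is what vague convergence to $\varepsilon_0$ predicts for this one function.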

To confirm this, first notice that for every $f \in C_c(\mathbb{R}^d)$

$\int f K_r\,d\lambda^d = r^d \int f\,(K \circ H_r)\,d\lambda^d = r^d \int (f \circ H_r^{-1})\,K\,dH_r(\lambda^d) = \int (f \circ H_r^{-1})\,K\,d\lambda^d = \int f(r^{-1}x)K(x)\,\lambda^d(dx).$

From this and the Lebesgue dominated convergence theorem the claim (30.7) follows upon checking that, on the one hand

$\lim_{r \to +\infty} f(r^{-1}x)K(x) = f(0)K(x)$ for every $x \in \mathbb{R}^d$,

and on the other hand for all real $r > 0$ and all $x \in \mathbb{R}^d$

$|f(r^{-1}x)K(x)|$ [...]

[...] $\Phi(\mathscr{M}_+)$ into $\mathbb{R}$ ($f \in C_c(E)$). But this mapping is just the restriction to $\Phi(\mathscr{M}_+)$ of the projection of $P = \mathbb{R}^{C_c(E)}$ onto its coordinate specified by $f$.

As to (b): Let $I \in P$ be a point in the closure of $\Phi(\mathscr{M}_+)$ in $P$. Then $I$ is a positive linear form on $C_c(E)$. To see its additivity, for example, let $f, g \in C_c(E)$ and $\varepsilon > 0$ be given. The set of all $I' \in P$ which satisfy

$|I'(u) - I(u)| < \varepsilon$ for $u \in \{f, g, f + g\}$

is a neighborhood of $I$ in $P$, and therefore contains a point $I' = \Phi(\mu)$ from $\Phi(\mathscr{M}_+)$. $I'$ is thus the positive linear form

$u \mapsto I'(u) = \int u\,d\mu$

on $C_c(E)$. That means that we have

$|I(f+g) - I(f) - I(g)|$
$\le |I(f+g) - I'(f+g)| + |I'(f+g) - I(f) - I(g)|$
$= |I(f+g) - I'(f+g)| + |I'(f) - I(f) + I'(g) - I(g)|$
$< \varepsilon + |I'(f) - I(f)| + |I'(g) - I(g)| < 3\varepsilon.$

Since $\varepsilon > 0$ is arbitrary, the extreme inequality means that its left-hand side must be 0. In a completely analogous way one proves that $I(\alpha f) = \alpha I(f)$ for every $\alpha \in \mathbb{R}$, $f \in C_c(E)$, and $I(g) \ge 0$ for every non-negative $g \in C_c(E)$. With the linearity of $I$ confirmed, the Riesz representation theorem supplies a Radon measure $\nu \in \mathscr{M}_+$ such that $\Phi(\nu) = I$. That is, $I$ lies in $\Phi(\mathscr{M}_+)$, confirming that the latter is closed in $P$. □

31.3 Corollary. For every real number $\alpha > 0$ the set

$\{\mu \in \mathscr{M}_+(E) : \|\mu\|$ [...]

[...] $> 0$. That is, the desired equality $\int f\,d\mu = \int f\,d\nu$ must hold.

The next step is to show that the topology determined by $\rho$ is none other than the vague topology. We will, to that end, make use of the fact that the sets $V_{f_1,\dots,f_n;\varepsilon}(\nu)$ defined in (30.5) are a neighborhood base at $\nu \in \mathscr{M}_+$ in the vague topology, when all possible finite subsets $\{f_1, \dots, f_n\}$ of $C_c(E)$ and all numbers $\varepsilon > 0$ are considered. We will denote by $U_\varepsilon(\nu)$ the open ball of center $\nu$ and radius $\varepsilon$ with respect to the metric $\rho$.

1. Given $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that

(31.7)   $V_{d_1,\dots,d_m;\varepsilon/2}(\nu) \subset U_\varepsilon(\nu)$ for every $\nu \in \mathscr{M}_+$.

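Indeed, one may take any $m \in \mathbb{N}$ such that

$\sum_{n=m+1}^{\infty} 2^{-n} < \varepsilon/2$

and every $\mu \in V_{d_1,\dots,d_m;\varepsilon/2}(\nu)$ will then satisfy

$\rho(\mu, \nu)$ [...] $\sum_{n=1}^{m} 2^{-n}$ [...]

[...] $> 0$ and every $\nu \in \mathscr{M}_+$, there is a number $\eta > 0$ such that

(31.8)   $U_\eta(\nu) \subset V_{f_1,\dots,f_n;\varepsilon}(\nu)$.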

First of all, choose $k \in \mathbb{N}$ so that

$\bigcup_{j=1}^{n} \operatorname{supp}(f_j) \subset L_k \subset \{e_k = 1\}$.

We can find a number $\delta$, dependent on $\nu$, so that

$0 <$ [...]

(31.11)   [...] if $r, s > N$.

The second of the (valid for all $r, s > N$) inequalities in (31.11) shows that the numerical sequence $(\int e_k\,d\mu_n)_{n \in \mathbb{N}}$ is bounded, say by $M \in \mathbb{R}_+$:

$\int e_k\,d\mu_n \le M$ for all $n \in \mathbb{N}$.

The earlier inequality therefore yields

$|\int f\,d\mu_r - \int f\,d\mu_s|$ [...] $> N$.

Notice that $M$ depends only on $k$, hence only on $f$. Furthermore $N$ depends only on $\delta$ and $f$. Therefore this last inequality affirms that $(\int f\,d\mu_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$. According to the remark following Definition 30.1 the sequence $(\mu_n)$

is therefore vaguely convergent to some $\mu_0 \in \mathscr{M}_+$. Since the vague topology coincides with the $\rho$-topology, as we have already confirmed, this means that the sequence $(\mu_n)$ converges to $\mu_0$ in the $\rho$-metric.

We finally need to prove that, like the topology of $E$, the vague topology of $\mathscr{M}_+$ has a countable base. Since the vague topology is generated by the metric $\rho$, it is enough to find a countable set $\mathscr{D}_0$ which is dense in $\mathscr{M}_+$; because it is obvious that the set of all open balls with respect to the metric $\rho$ centered at points of $\mathscr{D}_0$ and having rational radii is then a countable base for the $\rho$-topology of $\mathscr{M}_+$. Our candidate for $\mathscr{D}_0$ is the set of all discrete measures

$\delta := \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i}$

with positive rational $\alpha_i$ and points $x_i$ drawn from a countable set $E_0$ which is dense in $E$. We get such a set $E_0$ simply by taking a point from each set in a countable base for the topology of $E$. Evidently, this $\mathscr{D}_0$ is countable. We have to show that for every $\mu \in \mathscr{M}_+$, every real $\varepsilon > 0$, and every finite set $F := \{f_1, \dots, f_n\} \subset C_c(E)$, the basic vague neighborhood $V_{f_1,\dots,f_n;\varepsilon}(\mu)$ contains a measure from $\mathscr{D}_0$. At least, according to 30.4, this neighborhood contains a [...] with positive real coefficients and points of $E$. Thus

(31.12)   $|\int f\,d\mu - {}$ [...] $| < \varepsilon$ for all $f \in F$.

§31. Vague compactness and metrizability questions

Now for such $f$ and $\delta$ as above

$|\int f\,d\mu - \int f\,d\delta|$ [...]

[...] $\varepsilon > 0$ and integers $1 \le n_1 < n_2 < \dots$ such that $|\int f\,d\mu_{n_j} - \int f\,d\mu| \ge \varepsilon$ for all $j \in \mathbb{N}$. The sequence $(\mu_{n_j})_{j \in \mathbb{N}}$ would have a vaguely convergent subsequence and its vague limit could not be $\mu$. If we further hypothesize of $(\mu_n)$ that it is tight, then with the aid of Remark 3 in §30 we can conclude that $\mu \in \mathscr{M}_+^b(E)$ as well, and that $(\mu_n)$ even converges weakly to it.

5. The foregoing deliberations show (for locally compact $E$ with a countable base) that tight sequences in $\mathscr{M}_+^1(E)$ always contain weakly convergent subsequences. Explicitly formulated this says: A set $H \subset \mathscr{M}_+^1(E)$ is relatively compact (= relatively sequentially compact) in the weak topology if it is tight, meaning that for every $\varepsilon > 0$ a compact $K_\varepsilon \subset E$ exists such that $\mu(E \setminus K_\varepsilon) < \varepsilon$ for every $\mu \in H$. A theorem of Yu. V. PROHOROV asserts that the tightness of $H$ is even equivalent to its weak relative compactness. More is true: This equivalence prevails as well whenever $E$ is any Polish space. For details the reader can consult BILLINGSLEY [1968].

The ideas employed in the proofs of Theorems 31.4 and 31.5, slightly modified, lead to a further interesting result. It concerns the space

$C := C(\mathbb{R}_+, E)$

of all continuous mappings $f$ of $\mathbb{R}_+ := [0, +\infty[$ into a Polish space $E$, for example, $\mathbb{R}^d$. We endow $C$ with the topology of uniform convergence on compact subsets of $\mathbb{R}_+$.

31.6 Theorem. Along with $E$, the space $C(\mathbb{R}_+, E)$ is also Polish.

Proof. Consider any complete metric $\rho$ which generates the topology of $E$. Another such metric is given by $(x, y) \mapsto \min\{1, \rho(x, y)\}$, and using it if need be, we can simply assume that $\rho \le 1$. This lets us define $d_n$ on $C$ for each $n \in \mathbb{N}$ by

$d_n(f, g) := \sup\{\rho(f(x), g(x)) : x \in [0, n]\}$, $f, g \in C$;

and

(31.14)   $d(f, g) := \sum_{n=1}^{\infty} 2^{-n} d_n(f, g)$, $f, g \in C$.
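To see how the metric (31.14) behaves, one can approximate it for $E = \mathbb{R}$. The following sketch makes several assumptions that are not in the text: $\rho(x, y) = \min\{1, |x - y|\}$ is taken as the bounded metric, each supremum over $[0, n]$ is replaced by a maximum over a finite grid, and the series is cut off after `terms` summands (the neglected tail is at most $2^{-\text{terms}}$ because $\rho \le 1$). The helper `bump_far` is hypothetical.

```python
def rho(x, y):
    """Bounded metric min{1, |x - y|} on E = R (an assumed choice)."""
    return min(1.0, abs(x - y))

def d_n(f, g, n, points_per_unit=200):
    """Grid approximation of sup{rho(f(x), g(x)) : x in [0, n]}."""
    m = n * points_per_unit
    return max(rho(f(i / points_per_unit), g(i / points_per_unit))
               for i in range(m + 1))

def d(f, g, terms=20):
    """Partial sum of sum_{n >= 1} 2^{-n} d_n(f, g), truncated after `terms` summands."""
    return sum(2.0 ** (-n) * d_n(f, g, n) for n in range(1, terms + 1))

def zero(t):
    """The constant function 0."""
    return 0.0

def bump_far(k):
    """Continuous function vanishing on [0, k] and rising linearly to 1 on [k, k + 1]."""
    return lambda t: min(1.0, max(0.0, t - k))

# Functions that differ from 0 only far out on the half-line are close to 0 in d.
distances = [d(zero, bump_far(k)) for k in (1, 3, 6)]
```

The computed distances shrink as the bump moves outward, which reflects that convergence in $d$ only asks for uniform convergence on each compact interval $[0, n]$.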

Just as earlier (cf. (31.3) and (31.4)), one easily confirms that $d$ is a metric on $C$ (with all its values in $[0, 1]$) which satisfies

(31.15)   $2^{-n} d_n(f, g)$ [...]
