Studies in Advanced Mathematics

Series Editor
STEVEN G. KRANTZ
Washington University in St. Louis

Editorial Board
R. Michael Beals, Rutgers University
Gerald B. Folland, University of Washington
Dennis de Turck, University of Pennsylvania
William Helton, University of California at San Diego
Ronald DeVore, University of South Carolina
Norberto Salinas, University of Kansas
Lawrence C. Evans, University of California at Berkeley
Michael E. Taylor, University of North Carolina
Titles Included in the Series
Steven R. Bell, The Cauchy Transform, Potential Theory, and Conformal Mapping
John J. Benedetto, Harmonic Analysis and Applications
John J. Benedetto and Michael W. Frazier, Wavelets: Mathematics and Applications
Albert Boggess, CR Manifolds and the Tangential Cauchy-Riemann Complex
Goong Chen and Jianxin Zhou, Vibration and Damping in Distributed Systems, Vol. 1: Analysis, Estimation, Attenuation, and Design. Vol. 2: WKB and Wave Methods, Visualization, and Experimentation
Carl C. Cowen and Barbara D. MacCluer, Composition Operators on Spaces of Analytic Functions
John P. D'Angelo, Several Complex Variables and the Geometry of Real Hypersurfaces
Lawrence C. Evans and Ronald F. Gariepy, Measure Theory and Fine Properties of Functions
Gerald B. Folland, A Course in Abstract Harmonic Analysis
Jose García-Cuerva, Eugenio Hernandez, Fernando Soria, and Jose-Luis Torrea, Fourier Analysis and Partial Differential Equations
Peter B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah-Singer Index Theorem, 2nd Edition
Alfred Gray, Modern Differential Geometry of Curves and Surfaces with Mathematica, 2nd Edition
Eugenio Hernandez and Guido Weiss, A First Course on Wavelets
Steven G. Krantz, Partial Differential Equations and Complex Analysis
Steven G. Krantz, Real Analysis and Foundations
Kenneth L. Kuttler, Modern Analysis
Michael Pedersen, Functional Analysis in Applied Mathematics and Engineering
Clark Robinson, Dynamical Systems: Stability, Symbolic Dynamics, and Chaos, 2nd Edition
John Ryan, Clifford Algebras in Analysis and Related Topics
Xavier Saint Raymond, Elementary Introduction to the Theory of Pseudodifferential Operators
Robert Strichartz, A Guide to Distribution Theory and Fourier Transforms
Andre Unterberger and Harald Upmeier, Pseudodifferential Analysis on Symmetric Cones
James S. Walker, Fast Fourier Transforms, 2nd Edition
James S. Walker, Primer on Wavelets and their Scientific Applications
Gilbert G. Walter, Wavelets and Other Orthogonal Systems with Applications
Kehe Zhu, An Introduction to Operator Algebras
JEWGENI H. DSHALALOW

Real Analysis
An Introduction to the Theory of Real Functions and Integration

CHAPMAN & HALL/CRC
Boca Raton  London  New York  Washington, D.C.
Library of Congress Cataloging-in-Publication Data

Dshalalow, Jewgeni H.
Real analysis : an introduction to the theory of real functions and integration / Jewgeni H. Dshalalow.
p. cm. -- (Studies in advanced mathematics)
Includes bibliographical references and index.
ISBN 1-58488-073-2 (alk. paper)
1. Mathematical analysis. I. Title. II. Series.
2. Biology-molecular. I. McLachlan, Alan. II. Title.
QA300 .074 2000
515--dc21
00-058593 CIP
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

© 2001 by CRC Press LLC

No claim to original U.S. Government works
International Standard Book Number 1-58488-073-2
Library of Congress Card Number 00-058593
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
To my Lord and Redeemer Who made the supreme sacrifice for me and Who will come again
Preface

This book is intended to be an introductory two-semester course in abstract analysis, which includes topology, measure theory, and integration, an assemblage of topics traditionally gathered under the name "Real Analysis," more common in the United States. Most North American schools offer this as a graduate one- to two-semester course for mathematics, physics, and engineering majors. Many European schools, to the best of my knowledge, do not have such a course; they have instead a sequence of separate courses such as Topology, Measure and Integration, and Functional Analysis. In some countries, such as Russia and the former Soviet Republics, they additionally have a Real Variables course, which is somewhat similar to Real Analysis but is more specialized, and its profile and rigor vary from college to college.
A very good reason for learning real analysis is that not only is it a core course for all mathematical disciplines, but it is absolutely mandatory for statistics and probability, operations research, physics, and some engineering majors as well. Hence, rephrasing an old adage, all routes of science and technology go through real analysis.

This text predominantly targets first-year graduate students of mathematical science majors as well as first- and second-year graduate students of engineering, physics, and operations research majors. A stronger senior undergraduate mathematics student can also benefit from the course. Some less theoretically oriented programs or those with weaker mathematics course curricula may find it reasonable to use the book for a three-semester course, with the first two semesters of basics and the third semester of advanced topics. The course can always be shortened to two semesters in such schools with the option to cover the first seven chapters, which are also quite sufficient for technical majors.

This book is intended primarily as a textbook, and its purpose as a reference is secondary. The reason for such a claim is a rather thorough elaboration of major theorems, notions, and constructions, very often supplied with a blueprint and sometimes a less formal introduction. The latter are then succeeded by detailed treatments. For instance, the Radon-Nikodym Theorem is first introduced in Chapter 6, with a minimum of proofs and formalities, but with a number of examples and exercises. It is then followed by a more abstract version later, in Chapter 8.
The first three chapters of the book (Part I) include preliminaries on set theory and basics of metric spaces and topology. I have been using these three chapters for many years in teaching a bilevel topology course at Florida Tech during our quarter system. However, I would not be able to cover the present version of the three chapters in one quarter, and one semester would be a more appropriate term for the current program at our school. Hence, the first three chapters can easily serve as a separate one-quarter to one-semester topology course for senior undergraduate or beginning graduate students.

Chapters 4-7 (Part II) present basics of measure and integration and, again, they can be offered as a separate measure theory (and integration) course. Consequently, Parts I and II can become appealing to those programs with separately named courses and, in particular, to European students. Part III (Chapters 8 and 9) includes a more elaborate and abstract version of measure and integration, along with their applications to functional analysis ($L^p$ spaces and the Riesz Representation Theorem for locally compact Hausdorff spaces), probability theory (conditional expectation, uniform integrability, Lebesgue-Stieltjes integrals, decomposition of distribution functions, stochastic convergence, and convergence of Radon measures), and conventional analysis on the real line (monotone and absolutely continuous functions, functions of bounded variation, and major theorems of calculus). Part III can be utilized for advanced topics, as well as an enlarged variant of measure and integration. While the reader would be better off having studied Part I prior to Part II and the first six sections of Chapter 8, the latter can also be used as independent material with sufficient basics of topology drawn from any generic advanced analysis course.

The book can also be used as a reference source for researchers in mathematical and engineering sciences, and especially, operations research (such as applied stochastic processes, queueing theory, and reliability). The reader should understand, however, that the book is not intended to become an encyclopedia of mathematics or to be any kind of a broad reference.
I had to suppress my temptation to include some written chapters on Hilbert spaces, functional analysis, and Fourier transforms, because of my motives to compile the main topics of what constitutes real analysis and to design a text by spending more time on details (within the framework of the book size imposed by the publisher and buyers' affordability).

This text may be well suited for independent studies with or without instructors, for which an abundance of examples and over 600 exercises provide pertinent support. While a solution manual is in preparation and will become available soon (and it would be an additional studying aid), the publisher and I have agreed on honoring only university instructors with this manual upon adoption of the book for the course. The reader may also find the New Terms subsections (at the end of each section) useful, especially considering a plethora of new definitions and notations, which not only can be intimidating but can also create an additional memory burden and thereby slow down learning of the main concepts.
Most of my thanks are due to my wife Irina for her ample support, encouragement, and overwhelming sacrifice. I would like to express my deep appreciation to Mr. Jürgen Becker for his constant guidance and countless ideas, to Mr. Donald Konwinski for his enormous editorial work on earlier versions of my manuscript, and to Professors Gerald B. Folland and Ryszard Syski for their numerous and very constructive remarks. I am also grateful for the kind assistance of Professors S.G. Deo, Jean-B. Lassere, and Jordan Stoyanov, Mr. Gary Russell, the project editor Mr. David Alliot, and the anonymous reviewers who thoroughly read my manuscript and made many helpful suggestions. My thanks are also due to the publisher, Mr. Robert Stern, for his help and extreme patience.

Jewgeni H. Dshalalow
Melbourne, Florida
Contents

Preface

Part I. An Introduction to General Topology

Chapter 1 Set-Theoretic and Algebraic Preliminaries
1. Sets and Basic Notation
2. Functions
3. Set Operations under Maps
4. Relations and Well-Ordering Principle
5. Cartesian Product
6. Cardinality
7. Basic Algebraic Structures

Chapter 2 Analysis of Metric Spaces
1. Definitions and Notations
2. The Structure of Metric Spaces
3. Convergence in Metric Spaces
4. Continuous Mappings in Metric Spaces
5. Complete Metric Spaces
6. Compactness
7. Linear and Normed Linear Spaces

Chapter 3 Elements of Point Set Topology
1. Topological Spaces
2. Bases and Subbases for Topological Spaces
3. Convergence of Sequences in Topological Spaces and Countability
4. Continuity in Topological Spaces
5. Product Topology
6. Notes on Subspaces and Compactness
7. Function Spaces and Ascoli's Theorem
8. Stone-Weierstrass Approximation Theorem
9. Filter and Net Convergence
10. Separation
11. Functions on Locally Compact Spaces

Part II. Basics of Measure and Integration

Chapter 4 Measurable Spaces and Measurable Functions
1. Systems of Sets
2. System's Generators
3. Measurable Functions

Chapter 5 Measures
1. Set Functions
2. Extension of Set Functions to a Measure
3. Lebesgue and Lebesgue-Stieltjes Measures
4. Image Measures
5. Extended Real-Valued Measurable Functions
6. Simple Functions

Chapter 6 Elements of Integration
1. Integration on $\mathcal{C}^{-1}(\Omega, \Sigma)$
2. Main Convergence Theorems
3. Lebesgue and Riemann Integrals on $\mathbb{R}$
4. Integration with Respect to Image Measures
5. Measures Generated by Integrals. Absolute Continuity. Orthogonality
6. Product Measures of Finitely Many Measurable Spaces and Fubini's Theorem
7. Applications of Fubini's Theorem

Chapter 7 Calculus in Euclidean Spaces
1. Differentiation
2. Change of Variables

Part III. Further Topics in Integration

Chapter 8 Analysis in Abstract Spaces
1. Signed and Complex Measures
2. Absolute Continuity
3. Singularity
4. $L^p$ Spaces
5. Modes of Convergence
6. Uniform Integrability
7. Radon Measures on Locally Compact Hausdorff Spaces
8. Measure Derivatives

Chapter 9 Calculus on the Real Line
1. Monotone Functions
2. Functions of Bounded Variation
3. Absolute Continuous Functions
4. Singular Functions

BIBLIOGRAPHY
INDEX
Part I. An Introduction to General Topology

Chapter 1 Set-Theoretic and Algebraic Preliminaries

Set theory is not just one of the main tools in mathematics; it is the very root of mathematics, from which all mathematical disciplines stem. The great German mathematician Georg Ferdinand Cantor is considered to be the sole founder of set theory, which he developed in a series of papers, the first of which appeared in 1874. Although the Czech Bernard Bolzano (1781-1848) made one of the first attempts to formalize set theory, in particular in his 1851 work Paradoxien des Unendlichen, by considering the one-to-one correspondence between two sets (later developed by Cantor into what we now know as cardinals), neither he nor anyone else was really a predecessor to Cantor's creation. Ernst Zermelo (1871-1953) was another German who, among his numerous contributions to set theory, is the author of the first axiom for set theory (of 1908) and undoubtedly the primary axiom of the whole of mathematics. This chapter presents only the essentials of set theory and abstract algebra needed throughout the book.

1. SETS AND BASIC NOTATION

Cantor defined a set as a collection $M$ into a whole of definite, distinct objects (that are called elements of $M$) of our thought. In other words, we bind objects (perhaps of different nature) in our mind into a single entity and call that entity a set. We will denote sets by capital letters, and their elements by lower case letters. For instance, a set $A$ has elements $a, b, c$, or $a_1, a_2, \ldots$. To abbreviate the expression "$a$ is an element of the set $A$," we will write "$a \in A$." The expression "$a \notin A$" reads "$a$ is not an element of $A$."

Observe that the notion of a set is relatively simple if we deal with such frequently encountered sets as sets of integers, rational numbers, real numbers, or continuous functions. In some rare situations, thoughtless use of this notion can lead to contradictions, like Bertrand Russell's paradox. Russell posed the following set dilemma. Let $S$ be the set of all sets that are not elements of themselves. Clearly, $S$ is not empty. For instance, the set of all real numbers is not an element of itself (for it is not a real number), thus it belongs to $S$. The question arises: Is $S$ an element of itself? If $S \in S$, then by definition of $S$ it should not belong to $S$, which is a contradiction. Thus, $S \notin S$. But then, by definition, it must belong to $S$, which is impossible. In this case, we have put the definition of an object ahead of its existence. The concept of a set must be supported by axioms of set theory, just as the main axioms of plane geometry define the shape of lines.

1.1 Definitions.
(i) A set $A$ is said to be a subset of a set $B$ (in notation, $A \subseteq B$) if all elements of $A$ are also elements of $B$. If $A$ is a subset of $B$, we call $B$ a superset of $A$ (in notation, $B \supseteq A$). A set that contains exactly one element, say $a$, is called a singleton (set) and it is denoted by $\{a\}$. If $a \in A$, then we can alternatively write $\{a\} \subseteq A$. Any set is obviously a subset of itself: $A \subseteq A$.

(ii) The unique set with no elements is called the empty set and is denoted $\emptyset$. Clearly, $\emptyset$ is a subset of any set, including itself.

(iii) $A = B$ (read "set $A$ equals set $B$") if and only if $A \subseteq B$ and $B \subseteq A$; otherwise, we will write $A \neq B$. Occasionally, we will be using the symbol "$\subset$" applied to the situation where one set is a subset of another set but the sets are not equal. $A \subset B$ reads "$A$ is a proper subset of $B$." In this case, $B$ is a proper superset of $A$ (in notation, $B \supset A$). $\Box$

We postulate the existence of a set that is a superset of all other sets in the framework of a certain mathematical model. This set is usually called a universal set or just universe. We will also make use of the word "carrier" as a synonym for the universe and reserve for it the Greek letter $\Omega$. Sometimes, we will denote it by $X$, $Y$, or $Z$. A universe (as a base for some mathematical model or problem) is generally defined to contain all considered sets and it varies from model to model. For example, if $C^n[a,b]$ denotes the set of all $n$-times differentiable functions on the interval $[a,b]$, it contains, as a subset, the set of possible solutions of an ordinary differential equation of the $n$th order. Thus, $\Omega = C^n[a,b]$ is a relevant universe within which the problem is posed. One could also take for $\Omega$ the set $C[a,b]$ of all continuous functions on $[a,b]$ or even the set of all real-valued functions on $[a,b]$. However, these are too "vast" to serve as universes and they are impractical for this concrete problem.

Set theory is also a basic ingredient of probability theory, which always begins with elements of set theory under a slightly modified lexicon. For instance, a universe is referred to as a sample space. Subsets of the sample space are called events; specifically, singletons are called elementary events. The concept of the universe is most vivid when used in probability theory. Let us consider the experiment that consists of tossing a coin until the first appearance of the head on the upper face of the coin. Denoting by $H$ an outcome of the head and by $T$ an outcome of the tail when tossing the coin, we may define $\{(T,T,\ldots,T,H)\}$ as an elementary event of the sample space $\Omega = \{(H), (T,H), (T,T,H), \ldots\}$. The universe $\Omega$ contains, as elements, all possible outcomes of tossing the coin until the "first success" or the first appearance of the head. For instance, in the language of probability theory, the event $\{(H), (T,H), (T,T,H)\}$ corresponds to the "success in at most three tosses."

1.2 Notations. Throughout the whole book we will be using the following notation.
(i) Logical symbols:
$\forall$ means "for all"
$\exists$ means "there is" or "there are" or "there exists"
$\Rightarrow$ means "implies" or "from ... it follows that ..."
$\Leftrightarrow$ means "if and only if"
$\wedge$ (&) means "and"
$\vee$ means "or"
$:$ means "such that" (primarily used for definition of sets)

(ii) Frequently used sets:
$\mathbb{N}$: the set of all positive integers
$\mathbb{N}_0$: the set of all nonnegative integers
$\mathbb{Z}$: the set of all integers
$\mathbb{Q}$: the set of all rational numbers
$\mathbb{Q}^c$: the set of all irrational numbers
$\mathbb{R}$: the set of all real numbers
$\mathbb{C}$: the set of all complex numbers
$\mathbb{R}_+$: the set of all nonnegative real numbers
$\mathbb{R}_-$: the set of all negative real numbers

(iii) Denotation of sets:
List: The elements are listed inside a pair of braces [for instance, $\{a,b,c\}$ or $\{a_1, a_2, \ldots\}$].
Condition: A description of the elements with a condition following a colon (that in this case reads "such that"), again with braces enclosing the set [for instance, the set of odd integers is $\{n \in \mathbb{Z} : n = 2k+1,\ k \in \mathbb{Z}\}$].

(iv) Main set operations:
Union: $A \cup B = \{x \in \Omega : x \in A \vee x \in B\}$
Intersection: $A \cap B = \{x \in \Omega : x \in A \wedge x \in B\}$
Two subsets $A, B \subseteq \Omega$ are called disjoint if $A \cap B = \emptyset$.
Difference: $A \setminus B = \{x \in \Omega : x \in A \wedge x \notin B\}$ [$A \setminus B$ is also called the complement of $B$ with respect to $A$, with the alternative notation $A - B$ or $B_A^c$.]
Symmetric Difference: $A \,\Delta\, B = (A \setminus B) \cup (B \setminus A)$
Complement (with respect to the universe $\Omega$): $A^c = A_\Omega^c = \Omega \setminus A$

(v) General notation:
"$:=$" reads "set by definition."
$\Box$ indicates the end of a proof, remarks, examples, etc.

A set-algebraic expression is a set in the form of some defined sets connected through set operations. Any transformation of a set-algebraic expression into another expression would require a set-theoretic manipulation which we call a set-algebraic transformation. All basic set-algebraic transformations over basic set-algebraic expressions are known as Laws of Algebra (or Calculus) of Sets. $\Box$

1.3 Remark. One of the standard tools of the algebra of sets is the so-called pick-a-point process applied to, say, showing that $A \subseteq B$ or $A = B$. It is based on the following Axiom of Extent: For each set $A$ and each set $B$, it is true that $A = B$ if and only if for every $x \in \Omega$, $x \in A$ when and only when $x \in B$. Axiom's modification: If every element of $A$ is an element of $B$, then $A \subseteq B$. Thus, for the modification, the pick-a-point process consists of selecting an arbitrary point $x$ of $A$ (picking a point $x$) and then proving that $x$ also belongs to $B$. The identities below can be verified easily by the reader using pick-a-point techniques. $\Box$

1.4 Theorem (Laws of Algebra of Sets).
(i) Commutative Laws:
$A \cup B = B \cup A$
$A \cap B = B \cap A$

(ii) Associative Laws:
$(A \cup B) \cup C = A \cup (B \cup C)$
$(A \cap B) \cap C = A \cap (B \cap C)$

(iii) Distributive Laws:
$(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
$(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$

(iv) Idempotence of
complement: $(A^c)^c = A$
union: $A \cup A = A$
intersection: $A \cap A = A$

(v) $A \cap A^c = \emptyset$

(vi) $A \cup A^c = \Omega$

(vii) DeMorgan's Laws:
$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$

(viii) $A \cup \emptyset = A$

(ix) $A \cap \emptyset = \emptyset$

(x) $\Omega^c = \emptyset$ and $\emptyset^c = \Omega$. $\Box$
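The laws of Theorem 1.4 can also be spot-checked mechanically before proving them. The following is only a sketch under stated assumptions: the four-element universe `OMEGA` and the helper names `subsets`, `complement`, and `laws_hold` are hypothetical and not part of the text. It exhaustively tests several of the laws over all subsets of that small universe. Such a check illustrates the identities but does not replace the pick-a-point proofs.

```python
from itertools import combinations

# Hypothetical finite universe, chosen only for illustration.
OMEGA = frozenset(range(4))


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def complement(a, universe=OMEGA):
    """Complement of `a` with respect to the universe."""
    return universe - a


def laws_hold(universe=OMEGA):
    """Check laws (iii)-(vii) of Theorem 1.4 for all subsets A, B, C."""
    all_sets = subsets(universe)
    for a in all_sets:
        a_c = complement(a, universe)
        if complement(a_c, universe) != a:      # (iv) (A^c)^c = A
            return False
        if (a & a_c) != frozenset():            # (v)
            return False
        if (a | a_c) != universe:               # (vi)
            return False
        for b in all_sets:
            b_c = complement(b, universe)
            # (vii) DeMorgan's laws
            if complement(a | b, universe) != (a_c & b_c):
                return False
            if complement(a & b, universe) != (a_c | b_c):
                return False
            for c in all_sets:
                # (iii) distributive laws
                if ((a | b) & c) != ((a & c) | (b & c)):
                    return False
                if ((a & b) | c) != ((a | c) & (b | c)):
                    return False
    return True
```

Calling `laws_hold()` loops over all $16^3$ triples of subsets of the assumed universe and reports whether every tested identity held.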
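1.5 Example. Show the validity of the first distributive law.

$$
\begin{aligned}
x \in (A \cup B) \cap C
&\Leftrightarrow x \in (A \cup B) \wedge x \in C \\
&\Leftrightarrow [x \in A \wedge x \in C] \vee [x \in B \wedge x \in C] \\
&\Leftrightarrow [x \in A \cap C] \vee [x \in B \cap C] \\
&\Leftrightarrow x \in (A \cap C) \cup (B \cap C). \qquad \Box
\end{aligned}
$$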
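The pick-a-point argument of Example 1.5 reduces the set identity to a statement about the three membership conditions $x \in A$, $x \in B$, $x \in C$. As a minimal sketch (the function name is hypothetical), one can enumerate all eight combinations of these conditions and confirm that both sides of the first distributive law agree in every case:

```python
from itertools import product


def first_distributive_law_holds():
    """Compare both sides of (A u B) n C = (A n C) u (B n C) pointwise.

    Each boolean stands for one membership statement about a picked
    point x: in_a means "x in A", and so on.
    """
    for in_a, in_b, in_c in product([False, True], repeat=3):
        left = (in_a or in_b) and in_c                  # x in (A u B) n C
        right = (in_a and in_c) or (in_b and in_c)      # x in (A n C) u (B n C)
        if left != right:
            return False
    return True
```

Because membership of a single point in $A$, $B$, and $C$ can occur in only these eight ways, the enumeration covers every case the pick-a-point process has to consider.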
1.6 Remark. The concepts of union and intersection can be extended to an arbitrary family of sets. For instance,

$$\bigcup_{i \in I} A_i = \{x \in \Omega : \exists\, i \in I,\ x \in A_i\}.$$

The distributive laws and DeMorgan's laws hold for arbitrary families (subject to Problem 1.1 b)):

$$\Big(\bigcup_{i \in I} A_i\Big) \cap B = \bigcup_{i \in I} (A_i \cap B), \qquad \Big(\bigcap_{i \in I} A_i\Big) \cup B = \bigcap_{i \in I} (A_i \cup B),$$

$$\Big(\bigcup_{i \in I} A_i\Big)^c = \bigcap_{i \in I} A_i^c, \qquad \Big(\bigcap_{i \in I} A_i\Big)^c = \bigcup_{i \in I} A_i^c. \qquad \Box$$
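For a finite index set, the generalized laws of Remark 1.6 can be tried out directly. The sketch below makes several assumptions that are not in the text: the universe `OMEGA`, the sample `FAMILY` and set `B`, and the representation of an indexed family as a dictionary from indices to sets. It also adopts the convention that the intersection of an empty family is the whole universe.

```python
# Hypothetical finite universe and indexed family {A_i : i in I}.
OMEGA = frozenset(range(8))
FAMILY = {1: {0, 1, 2, 3}, 2: {2, 3, 4}, 3: {2, 3, 5}}
B = frozenset({2, 4, 6})


def union_of(family):
    """Union of an indexed family given as {index: set}."""
    result = frozenset()
    for member in family.values():
        result |= frozenset(member)
    return result


def intersection_of(family, universe=OMEGA):
    """Intersection of an indexed family; an empty family yields the universe."""
    result = frozenset(universe)
    for member in family.values():
        result &= frozenset(member)
    return result


def generalized_laws_hold(family, b, universe=OMEGA):
    """Check the four identities of Remark 1.6 for one family and one set."""
    b = frozenset(b)
    meet_b = {i: frozenset(a) & b for i, a in family.items()}
    join_b = {i: frozenset(a) | b for i, a in family.items()}
    complements = {i: universe - frozenset(a) for i, a in family.items()}
    return (
        (union_of(family) & b) == union_of(meet_b)
        and (intersection_of(family, universe) | b)
        == intersection_of(join_b, universe)
        and (universe - union_of(family))
        == intersection_of(complements, universe)
        and (universe - intersection_of(family, universe))
        == union_of(complements)
    )
```

With the sample data, `generalized_laws_hold(FAMILY, B)` compares both sides of each identity for that one family. A general proof still requires the pick-a-point argument requested in Problem 1.1 b).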
1.7 Definitions.

An indexed family [...]

To specify the type of convergence, we will write $\{A_n\} \uparrow A$ ($\{A_n\} \downarrow A$). A sequence $\{A_n\}$ of sets is said to be monotone vanishing if it is monotone nonincreasing and $\{A_n\} \downarrow \emptyset$.

(v) Let $\{A_n\}$ be an arbitrary sequence of sets. Denote

(a) $\liminf_{n \to \infty} A_n$ (or just $\underline{\lim}\, A_n$) $= \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m$. This limit is called the limit inferior.

(b) $\limsup_{n \to \infty} A_n$ (or just $\overline{\lim}\, A_n$) $= \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m$. This limit is called the limit superior.

If $\underline{\lim}\, A_n = \overline{\lim}\, A_n$, then we denote this common limit as $\lim_{n \to \infty} A_n$. In this case, the limit of $\{A_n\}$ is said to exist and equal $\lim_{n \to \infty} A_n$. $\Box$
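The limit inferior and limit superior involve infinitely many sets, so they cannot be computed by brute force in general. For a periodic sequence, however, every tail $\{A_m : m \geq n\}$ contains the same sets as one full period starting at $n$, so the tail intersections and unions can be evaluated over a window of one period. The sketch below relies on that periodicity assumption. The function names and the sample sequence `alternating` are hypothetical.

```python
def tail_intersection(seq, n, period):
    """Intersection of A_m for m >= n, assuming `seq` has the given period."""
    result = frozenset(seq(n))
    for m in range(n + 1, n + period):
        result &= frozenset(seq(m))
    return result


def tail_union(seq, n, period):
    """Union of A_m for m >= n, assuming `seq` has the given period."""
    result = frozenset(seq(n))
    for m in range(n + 1, n + period):
        result |= frozenset(seq(m))
    return result


def lim_inf(seq, period):
    """Union over n of the tail intersections (limit inferior)."""
    result = frozenset()
    for n in range(1, period + 1):
        result |= tail_intersection(seq, n, period)
    return result


def lim_sup(seq, period):
    """Intersection over n of the tail unions (limit superior)."""
    result = tail_union(seq, 1, period)
    for n in range(2, period + 1):
        result &= tail_union(seq, n, period)
    return result


def alternating(n):
    """Sample sequence: A_n = {0, 1} for odd n and {1, 2} for even n."""
    return frozenset({0, 1}) if n % 2 == 1 else frozenset({1, 2})
```

For the sample sequence, `lim_inf(alternating, 2)` is `{1}` while `lim_sup(alternating, 2)` is `{0, 1, 2}`, so the two differ and the limit of this sequence does not exist in the sense of Definition 1.7.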
PROBLEMS

1.1 a) Prove Theorem 1.4, the laws of algebra of sets, by using the pick-a-point process.
b) Prove the generalized distributive laws and DeMorgan's laws stated in Remark 1.6.

1.2 Show that:
a) $(A \cup B) \setminus C = (A \setminus C) \cup (B \setminus C)$.
b) $(A \cap B) \setminus C = (A \setminus C) \cap (B \setminus C)$.
c) $C \setminus (A \cup B) = (C \setminus A) \cap (C \setminus B)$.
d) $C \setminus (A \cap B) = (C \setminus A) \cup (C \setminus B)$.

1.3 Show that $A \setminus B = A \cap B^c$.

1.4 Let $|A| = n$ (i.e., the set $A$ contains $n$ elements). Show that $|\mathcal{P}(A)| = 2^n$.

1.5 Prove that:
a) $(A \setminus B)^c = A^c \cup B$.
b) $[(A^c \cup B)^c \cup (A \cup B^c)]^c = B \setminus A$.
c) $(A \cap B) \cup (A \cap B^c) \cup (A^c \cap B) = A \cup B$.

1.6 For each of the following, justify with a proof or give a counterexample.
a) $A \cup C = B \cup C \Rightarrow A = B$.
b) $(A \cup B) \setminus B = A$.
c) $A \setminus B = C \setminus B \Rightarrow A = C$.
d) $(A \setminus B)^c = (A \cap B^c)^c$.

1.7 Give an example of a monotone vanishing sequence of sets.

1.8 Let $\{A_n : n = 1, 2, \ldots\}$ be an arbitrary sequence of sets. Define
$$A_0 = \bigcap_{n=1}^{\infty} A_n \quad \text{and} \quad A_\infty = \bigcup_{n=1}^{\infty} A_n.$$
a) Construct a monotone nonincreasing sequence of sets $\{B_n\}$ such that $\{B_n\} \downarrow A_0$.
b) Construct a monotone nondecreasing sequence of sets $\{C_n\}$ such that $\{C_n\} \uparrow A_\infty$.
c) Given $\{C_n\} \uparrow A_\infty$, construct a pairwise disjoint sequence $\{D_n\}$ such that $\sum_{n=1}^{\infty} D_n = A_\infty$.

1.9 In the condition of Problem 1.8, show that [...]

1.10 Let $\Omega$ be an arbitrary set. Find a sequence $\{E_n\}$ of subsets of $\Omega$ such that [...]
NEW TERMS:

set; element of a set; Russell's paradox; subset; superset; singleton; empty set; proper subset; proper superset; universe; carrier; sample space; events; elementary events; union; intersection; disjoint sets; difference; symmetric difference; complement; set-algebraic expression; set-algebraic transformation; pick-a-point process; axiom of extent; commutative laws; associative laws; distributive laws; idempotence; DeMorgan's laws; pairwise disjoint sets; disjoint family of sets; decomposition of a set; partition of a set; partition of an interval; power set; monotone nondecreasing sequence of sets; monotone nonincreasing sequence of sets; monotone vanishing sequence of sets; limit inferior; limit superior; limit of a sequence
2. FUNCTIONS

The word "function" was introduced by Gottfried von Leibnitz in 1694, initially as a term to denote any quantity related to a curve, such as its slope, the radius of curvature, etc. The notion of the function was refined subsequently by Johann Bernoulli, Leonhard Euler, Joseph Fourier, and finally by Lejeune Dirichlet in the middle of the nineteenth century, with a formulation pretty close to what we are using at the present time and which a mathematics or engineering student meets in an introductory calculus course. Dirichlet introduced a variable as a symbol that represents a set of numbers. Two variables $x$ and $y$ are related if, whenever $x$ takes on a value, there is a value $y$ assigned to $x$ by some rule of correspondence. In this case $y$ (a dependent variable) was said to be a function of $x$ (an independent variable).

In this section we introduce a more contemporary notion of a function. For functions operating with sets (rather than with points), we will be using a nontraditional notation of $f_*$ and $f^*$ (instead of just $f$), previously used by MacLane and Birkhoff [1993] and which we found very appealing, as it brings more order within functions acting on collections of sets (such as topologies and sigma-algebras) and simplifies many proofs.
2.1 Definitions.

(i) Let $X$ and $Y$ be two sets. The set $\{(x, y) : x \in X,\ y \in Y\}$ of all ordered pairs of elements of $X$ and $Y$ is called the Cartesian or direct product of $X$ and $Y$ and it is denoted by $X \times Y$. If $X = Y$ then we shall write $X \times X = X^2$. Similarly, the Cartesian product of $n$ sets is the set of all ordered $n$-tuples.

(ii) Any subset $f$ of $X \times Y$ is called a binary relation.

(iii) A binary relation $f \subseteq X \times Y$ is called a (single-valued) function if whenever $(x, y_1)$ and $(x, y_2)$ are elements of $f$, then $y_1 = y_2$. We also say that the function $f$ is a map (or mapping) from $X$ to $Y$ and denote this most frequently by the triple $[X, Y, f]$ or by $f : X \to Y$ or by $(x, f(x))$ or by $f(x) = y$ or by $x \mapsto f(x)$.

(iv) For a function $f$ (as a subset of $X \times Y$), denote

$$D_f = \{x \in X : (x, y) \in f \text{ for some } y \in Y\}$$

and call it the domain of $f$. When a function $[X, Y, f]$ is given we will agree that $X$ is the domain of $f$. If a domain is not specified, we agree to regard as $D_f$ the largest possible set where $f$ is defined. The latter requires a more rigorous motivation. For instance, let

$$f(x) = \frac{1}{\sqrt{x - 1}}.$$

This function is defined for all $x \in (1, \infty)$. On the extended real line $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$, we allow $x \in [1, \infty]$. And finally, it is not wrong to have $x$ be any real (or even complex) number, if $f$ will take on values in $Y \subseteq \mathbb{C}$ (or $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$).

(v) Another component of a function is its range,

$$R_f = \{y \in Y : (x, y) \in f \text{ for some } x \in D_f\}.$$
A superset of $R_f$ (such as $Y$) is referred to as a codomain. In other words, $R_f$ is the subset of all such elements of $Y$ which take part in the relation $f \subset D_f \times Y$.
(vi) If $x \in D_f$, then $f(x)$ ($\in R_f$) is called the image of $x$ under $f$. By the above definition, for every $x$ there is a unique image. [Note that an "extended" concept of a function allows more than one image of each point $x$ under $f$. Any such function $f$ is called multi-valued. The reader is definitely acquainted with principles of complex analysis where such functions are common. It is also known that in this case the range of a multi-valued function can be partitioned into pairwise disjoint subsets, such that the function is then split into a number of single-valued functions called branches.]
(vii) If $D \subset D_f$ then the set of the images of all points of $D$ under $f$ is called the image of $D$ under $f$ and, following the notation of most analysis textbooks, it can be denoted
$f(D) = \{y \in Y : \exists x \in D,\ f(x) = y\}.$
However, for the upcoming constructions, it is convenient to distinguish images of points of a set from images of subsets of $X$ under $f$. In other words, we introduce the function
$[\mathcal{P}(X), \mathcal{P}(Y), f_*]$, where for $D \in \mathcal{P}(X)$ we denote
$f_*(D) = \{y \in Y : \exists x \in D,\ f(x) = y\}.$
Specifically, $R_f = f_*(D_f)$. We agree to set $f_*(\{x\}) = \emptyset$ $\forall x \notin D_f$. However, unless specified, we will always assume that in $[X,Y,f]$, $X$ is the domain of the function $f$. [In particular, this agreement excludes such an inconsistency as having $f(x) = \emptyset$ whenever $x \notin D_f$, since $f(x)$ is supposed to be a point and not a set.]
(viii) Let $[X,Y,f]$ be a function. Define the function $[\mathcal{P}(R_f), \mathcal{P}(X), f^*]$ and call it the inverse of $f_*$. In other words, for each $B \in \mathcal{P}(R_f)$,
$f^*(B) = \{x \in X : f(x) \in B\}.$
The set $f^*(B)$ is called the inverse image of $B$ under $f$, or the pre-image of $B$ under $f$. Another construction related to $f^*$ is $f^{-1}$, defined as $\{(y,x) \in Y \times X : (x,y) \in f\}$ and called the inverse of $f$. Unlike $f^*$, in general, $f^{-1}$ is not a single-valued function (in other words, it is a binary relation or multi-valued function). Consider, for instance, the function $[\mathbb{R},\mathbb{R},f]$ such that $f(x) = x^2$. Clearly, $R_f = \mathbb{R}_+$ and the inverse $\sqrt{\ } = f^{-1}$ of $f$ is a two-valued function with domain $D_{f^{-1}} = \mathbb{R}_+$ and with range equal to $\mathbb{R}$, which can be decomposed as $\mathbb{R} = (-\infty,0) + [0,\infty)$. Accordingly, we have two branches $[\mathbb{R}_+, (-\infty,0), \sqrt{\ }]$ and $[\mathbb{R}_+, \mathbb{R}_+, \sqrt{\ }]$ of $\sqrt{\ }$.
(ix) Observe that it is legitimate that $f(x_1) = f(x_2)$ and $x_1 \neq x_2$. However, if $f$ is such that $f(x_1) = f(x_2)$ if and only if $x_1 = x_2$, then $f$ is called one-to-one (or injective or invertible). If $f$ is one-to-one, $f^{-1}$ is a single-valued function too. Since $f^{-1}$ in general is not a single-valued function, we will agree to regard $f^{-1}(y)$ as a set (which in particular can be a singleton or the empty set), with the alternative notation $f^*(\{y\})$.
(x) Let $[X,Y,f]$ be a function. Generally, $f_*(X) = R_f \subset Y$. In this case, we say the map $f$ is from $X$ into $Y$. When $f_*(X) = Y$, we say the map $f$ is from $X$ onto $Y$ or surjective. We call $f$ bijective if $f$ is surjective (onto) and injective (one-to-one).
(xi) Let $f \subset X \times Y$ and $g \subset Y \times Z$ be binary relations. Then the composition of $f$ with $g$ is defined as
$g \circ f = \{(x,z) \in X \times Z : \exists y : (x,y) \in f,\ (y,z) \in g\}.$
The composition of $f$ with $g$ is most frequently used when $[X,Y,f]$ and $[R_f \cap D_g, Z, g]$ are functions and, consequently, it is defined as
$[X, R_f \cap D_g, Z, g \circ f].$ □
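The constructions of Definitions 2.1 can be tried out on finite sets. The following is a minimal sketch, assuming a binary relation is stored as a Python set of pairs; the helper names (`domain`, `range_`, `image`, `preimage`, `inverse`, `compose`) are illustrative choices and not notation from the text.

```python
# A binary relation f in X x Y is modelled as a Python set of (x, y) pairs.

def domain(f):
    """D_f: all first coordinates that occur in f."""
    return {x for x, _ in f}

def range_(f):
    """R_f: all second coordinates that occur in f."""
    return {y for _, y in f}

def is_function(f):
    """Single-valued: no x is paired with two different y's."""
    return len(domain(f)) == len(f)

def image(f, D):
    """f_*(D): images of all points of D."""
    return {y for x, y in f if x in D}

def preimage(f, B):
    """f^*(B): all points whose image lies in B."""
    return {x for x, y in f if y in B}

def inverse(f):
    """f^{-1}: the relation with every pair reversed."""
    return {(y, x) for x, y in f}

def compose(g, f):
    """g o f = {(x, z) : there is y with (x, y) in f and (y, z) in g}."""
    return {(x, z) for x, y in f for y2, z in g if y == y2}

X = range(-3, 4)
square = {(x, x * x) for x in X}          # f(x) = x^2 on {-3, ..., 3}
shift = {(y, y + 1) for y in range(10)}   # g(y) = y + 1 on {0, ..., 9}
```

Here `square` is single-valued while `inverse(square)` is not, which mirrors the two-valued square root discussed in (viii).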
2.2 Example. For a fixed subset $A \subset X$, define the indicator function $[X,\mathbb{R},1_A]$ as
$1_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \in A^c. \end{cases}$
Then $[X,\mathbb{R},1_A]$ is an into map, while $[X,\{0,1\},1_A]$ is an onto map (provided $A \neq \emptyset$ and $A \neq X$). □
2.3 Definition. Let $f: X \to Y$ and let $A \subset X$. Then define
$\mathrm{Res}_A f = \{(x,y) \in (A \times Y) \cap f\}.$
This function is called the restriction of $f$ to $A$. On the other hand, the function $f$ is called an extension of the function $\mathrm{Res}_A f$ from $A$ to $X$. □
2.4 Example. Consider $[\mathbb{R},[-1,1],\sin]$, which is surjective (i.e., onto) but not injective (one-to-one). Take a restriction of the function $[\mathbb{R},[-1,1],\sin]$ to one of the largest subsets $A$ of $\mathbb{R}$ where $[\mathbb{R},[-1,1],\sin]$ is monotone increasing. It is plausible to set $A = [-\frac{\pi}{2}, \frac{\pi}{2}]$, since it is also symmetric about the Y-axis. Then $[A,[-1,1],\mathrm{Res}_A \sin]$ is obviously bijective and its inverse is the well-known function $[[-1,1],A,\arcsin]$. □
2.5 Remark. Let $[X,Y,f]$ be a single-valued function such that for some $y \in R_f$, $f^*(\{y\}) = \{x_1,x_2,x_3\} \subset X$. Consider the composition $f \circ f^{-1}$ and find that
$f \circ f^{-1}(y) = f_*(\{x_1,x_2,x_3\}) = \{y\}.$
Thus, if $f$ is single-valued, the restriction of $f \circ f^{-1}$ to $R_f$ is the identity function (denoted $I$, with the domain $D_{f \circ f^{-1}} = R_f$). However, $f^{-1} \circ f$ need not be a single-valued function at all (show it). $f^{-1} \circ f$ is the identity function only when $f$ is injective. □
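Remark 2.5 can be checked on a small example. The sketch below assumes the set-of-pairs representation of relations and takes $f(x) = x^2$ on $\{-2,\dots,2\}$; the helper `compose` and the variable names are illustrative only.

```python
def compose(g, f):
    """g o f for relations stored as sets of pairs."""
    return {(x, z) for x, y in f for y2, z in g if y == y2}

f = {(x, x * x) for x in range(-2, 3)}   # f(x) = x^2 on {-2, ..., 2}
f_inv = {(y, x) for x, y in f}           # the inverse relation f^{-1}

f_after_inv = compose(f, f_inv)          # f o f^{-1}, a relation on R_f
inv_after_f = compose(f_inv, f)          # f^{-1} o f, a relation on D_f
```

In this example `f_after_inv` pairs each point of the range with itself, while `inv_after_f` relates $1$ to both $1$ and $-1$, so it is not single-valued.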
PROBLEMS
2.1 Find the image of $[-3,5)$ under $1_{(1,2]}$.
2.2 Find the inverse image of (�,4] under $1_{(1,2]}$.
2.3 Composition:
a) Show that the compose operator is associative.
b) Show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
c) Show that $D_{g \circ f} = D_f \cap f^*(D_g)$.
2.4 Show the equivalence of the following statements:
a) $f$ is one-to-one.
b) $f_*(A \cap B) = f_*(A) \cap f_*(B)$.
c) For every pair $A$ and $B$ of disjoint sets, $f_*(A) \cap f_*(B) = \emptyset$.
In the following problems we assume that $f$ is a map from $X$ into $Y$.
2.5 Show that $A \subset X \Rightarrow A \subset f^* \circ f_*(A)$.
2.6 Show that $\forall B \subset Y$, $f_* \circ f^*(B) \subset B$.
2.7 Show that $[X,Y,f]$ is onto if and only if $f_* \circ f^*(B) = B$ holds $\forall B \subset Y$.
NEW TERMS:
Cartesian (direct) product; binary relation; function; map; mapping; domain; range; codomain; image of a point; multi-valued function; image of a set; branch of a function; inverse image of function f; pre-image; inverse of function f; one-to-one (injective, invertible) map; into map; onto (surjective) map; bijective (onto and one-to-one) map; composition of binary relations; composition of maps; indicator function; restriction of a map; extension of a map; identity function
3. SET OPERATIONS UNDER MAPS
The most remarkable property of the inverse of a function is that it "preserves" all set operations. The function itself, as we shall see, does not have such a quality. The main theorems in this section will be proved for special cases of surjective maps; the rest will be left for the reader.
3.1 Theorem. Let $[X,Y,f]$ be a surjective map and let $B \subset Y$. Then
$f^*(B^c) = [f^*(B)]^c.$
Proof. We prove an equivalent statement,
$f^*(B) + f^*(B^c) = X,$
i.e., we show that (i) $f^*(B)$ and $f^*(B^c)$ are disjoint and (ii) $f^*(B)$ complements $f^*(B^c)$ up to $X$. We start with:
(i) Suppose $f^*(B)$ and $f^*(B^c)$ have a common point $x$. Then there is $y_1 \in B$ such that $f(x) = y_1$ and $y_2 \in B^c$ such that $f(x) = y_2$. Thus, $y_1 \neq y_2$ and $f$ is not a single-valued function. (See Figure 3.1.)
Figure 3.1
(ii) If $f^*(B)$ does not complement $f^*(B^c)$ up to $X$, there will be at least one point $x$ which does not belong to either of these sets (for they are disjoint as shown above). This is an obvious contradiction, since it follows that $f(x) \notin Y$. (See Figure 3.2 below.) □
Figure 3.2
3.2 Example. Let $[X,Y,f]$ be a function. Then $[f^*(Y)]^c = X^c = \emptyset$. On the other hand, setting $B = Y$, by Problem 3.1, we obtain
$[f^*(Y)]^c = f^*(Y^c) = f^*(\emptyset),$
i.e. $f^*(\emptyset) = \emptyset$. □
3.3 Theorem. Let $[X,Y,f]$ be a surjective map. Then $B_1 \subset B_2 \subset Y$ implies that $f^*(B_1) \subset f^*(B_2)$.
Proof. Suppose that $f^*(B_1)$ is not a subset of $f^*(B_2)$. This implies the existence of a point $x$ which belongs to $f^*(B_1)$ and does not belong to $f^*(B_2)$. Therefore, there is exactly one point $y \in B_1$ with $f(x) = y$. On the other hand, since $x \notin f^*(B_2)$, $f(x)$ cannot belong to $B_2$. But it must, since $f(x) = y \in B_1 \subset B_2$. (See Figure 3.3 below.) Hence, our assumption above was wrong. □
Figure 3.3
3.4 Theorem. Let $f: X \to Y$ be an onto map and let $\{B_i : i \in I\}$ be an indexed family of subsets of $Y$. Then,
$f^*\Big(\bigcup_{i \in I} B_i\Big) = \bigcup_{i \in I} f^*(B_i).$
Proof.
(i) We prove that $\bigcup_{i \in I} f^*(B_i) \subset f^*\big(\bigcup_{i \in I} B_i\big)$. Let $x \in \bigcup_{i \in I} f^*(B_i)$. Then there is an index $i_0 \in I$ such that $x \in f^*(B_{i_0})$. Since $B_{i_0} \subset \bigcup_{i \in I} B_i$, by Theorem 3.3, $f^*(B_{i_0}) \subset f^*\big(\bigcup_{i \in I} B_i\big)$, which implies that $x \in f^*\big(\bigcup_{i \in I} B_i\big)$.
(ii) We show the validity of the inverse inclusion, $f^*\big(\bigcup_{i \in I} B_i\big) \subset \bigcup_{i \in I} f^*(B_i)$. Let $x \in f^*\big(\bigcup_{i \in I} B_i\big)$. Then $f(x) \in \bigcup_{i \in I} B_i$. Therefore, there is an index $i_0 \in I$ such that $f(x) \in B_{i_0}$ or, equivalently, $\{f(x)\} \subset B_{i_0}$. By Theorem 3.3, it follows that $f^*(\{f(x)\}) \subset f^*(B_{i_0})$. Since $x \in f^*(\{f(x)\})$, we have
$\{x\} \subset f^*(\{f(x)\}) \subset f^*(B_{i_0}) \subset \bigcup_{i \in I} f^*(B_i).$ □
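The contrast between inverse images and images can be illustrated numerically. The following is a minimal sketch on finite sets, assuming $f(x) = x^2$ from $X = \{-3,\dots,3\}$ into $Y = \{0,\dots,9\}$; the sets `B1`, `B2`, `A1`, `A2` and the helper names are hypothetical choices made for the illustration.

```python
def image(f, A):
    """f_*(A) for a Python function f and a finite set A."""
    return {f(x) for x in A}

def preimage(f, B, X):
    """f^*(B) = {x in X : f(x) in B}."""
    return {x for x in X if f(x) in B}

X = set(range(-3, 4))
Y = set(range(0, 10))
f = lambda x: x * x

B1, B2 = {0, 1, 2}, {1, 4, 5}
union_ok = preimage(f, B1 | B2, X) == preimage(f, B1, X) | preimage(f, B2, X)
inter_ok = preimage(f, B1 & B2, X) == preimage(f, B1, X) & preimage(f, B2, X)
compl_ok = preimage(f, Y - B1, X) == X - preimage(f, B1, X)

# Images need not preserve intersections: A1 and A2 are disjoint,
# yet their images coincide.
A1, A2 = {-2, -1}, {1, 2}
image_of_intersection = image(f, A1 & A2)
intersection_of_images = image(f, A1) & image(f, A2)
```

A single example of this kind proves nothing in general, but it shows concretely why the inclusion for images of intersections can be strict when $f$ is not one-to-one.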
PROBLEMS
3.1 Prove Theorem 3.1 under the condition that $f$ is an into map.
3.2 Prove Theorem 3.3 under the condition that $f$ is an into map.
3.3 Generalize Theorem 3.4 when $f$ is an into map.
3.4 Let $[X,Y,f]$ be an into map and let $\{B_i : i \in I\}$ be an indexed family of subsets of $Y$.
a) Prove that $f^*\big(\bigcap_{i \in I} B_i\big) = \bigcap_{i \in I} f^*(B_i)$.
b) If $\{B_i : i \in I\}$ is a pairwise disjoint family, show that $f^*\big(\sum_{i \in I} B_i\big) = \sum_{i \in I} f^*(B_i)$.
3.5 Show that $f^*(A \setminus B) = f^*(A) \setminus f^*(B)$.
3.6 The results above prove that all set operations are closed under the inverses of maps. Show that not all set operations are closed under maps per the following.
a) Show that maps preserve inclusions.
b) Show that maps preserve unions.
c) Show that maps do not preserve intersections; specifically, show that
$f_*\Big(\bigcap_{i \in I} A_i\Big) \subset \bigcap_{i \in I} f_*(A_i)$
and that the inverse inclusion need not hold. Explain the latter without a counterexample.
d) Do maps preserve the difference?
3.7 Let $[X,Y,f]$ be a map and let $A \subset Y$. Show that
3.8 Prove the following properties of the indicator function defined on a nonempty set $\Omega$:
(i) $1_{A \cap B} = \min\{1_A, 1_B\} = 1_A 1_B$.
(ii) $1_{A \cup B} = \max\{1_A, 1_B\}$.
(iii) $1_{A+B} = 1_A + 1_B$.
(iv) $1_{\sum_{i \in I} A_i} = \sum_{i \in I} 1_{A_i}$.
(v) $1_{A^c} = 1 - 1_A$.
(vi) $A \subset B \Rightarrow 1_A \le 1_B$.
(vii) $1_{\bigcup_{i \in I} A_i} = \sup\{1_{A_i} : i \in I\}$, $1_{\bigcap_{i \in I} A_i} = \inf\{1_{A_i} : i \in I\}$.
3.9 Let $\{A_n\}$ be a sequence of subsets of $\Omega$. Show that the function $\overline{\lim}\, 1_{A_n}$ is the indicator function of the set $\overline{\lim}\, A_n$ and that the function $\underline{\lim}\, 1_{A_n}$ is the indicator function of the set $\underline{\lim}\, A_n$.
3.10 Prove that $\lim_{n \to \infty} A_n$ exists if and only if $\lim_{n \to \infty} 1_{A_n}$ exists. [Hint: Use Problem 3.9.]
3.11 Let $[X,X',F]$ be a bijective map and let $\tau$ and $\tau'$ be respective collections of subsets of $X$ and $X'$ such that $F^{**}(\tau') \subseteq \tau$ and $F_{**}(\tau) \subset \tau'$. Show that $F^{**}(\tau') = \tau$ and $F_{**}(\tau) = \tau'$.
4. RELATIONS AND WELL-ORDERING PRINCIPLE
In Definition 2.1 (ii) we introduced the concept of a binary relation $R$ as an arbitrary subset of $A \times B$. In the special case when $R \subset A \times B$ and $A = B$, we call $R$ a binary relation on $A$. We will sometimes use the notation $aRb$ instead of $(a,b) \in R$. This notation makes sense, for instance, if $R$ is stipulated by $\le$ or $<$ on some set. In addition, we will also say that a pair $(A,R)$ is a binary relation, where in fact $R$ is a binary relation on a set $A$ (a carrier). Now we consider some special relations.
4.1 Definitions. Let $R$ be a binary relation on $S$.
(i) $R$ is called reflexive if $\forall a \in S$, $(a,a) \in R$ [$aRa$].
(ii) $R$ is called symmetric if $(a,b) \in R \Rightarrow (b,a) \in R$ [$aRb \Rightarrow bRa$].
(iii) $R$ is called antisymmetric if $(a,b),(b,a) \in R \Rightarrow a = b$ [$aRb \wedge bRa \Rightarrow a = b$].
(iv) $R$ is called transitive if $(a,b),(b,c) \in R \Rightarrow (a,c) \in R$ [$aRb \wedge bRc \Rightarrow aRc$].
(v) $R$ is called an equivalence on $S$ (denoted by the symbol $\sim$ or $E$) if it is reflexive, symmetric and transitive. [Observe that the equivalence $E$ on $S$ partitions $S$ into mutually disjoint subsets, called equivalence classes. A partition of $S$ is a family of disjoint subsets of $S$ whose union is $S$ (i.e. a decomposition of $S$). The elements of $S$ "communicate" only within these classes. Therefore, every equivalence relation generates mutually disjoint classes. The converse is also true: an arbitrary partition of the carrier $S$ generates an equivalence relation.]
(vi) $R$ is called a partial order (denoted by the symbol $\preceq$) if it is reflexive, antisymmetric and transitive.
(vii) If $\preceq$ is a partial order, it is called linear or total if every two elements of $S$ are comparable, i.e. $\forall a,b \in S$ either $a \preceq b$ or $b \preceq a$.
(viii) Let $S$ be an arbitrary set and let $\sim$ ($E$) be an equivalence relation on $S$. For $t \in S$ denote
$[t]_\sim\ (= [t]_E) = \{s \in S : s \sim t\}$
and call it an equivalence class modulo $\sim$ ($E$). The set of all equivalence classes
$\{[t]_\sim\} = S|_\sim$ (or $S|_E$ or $S/E$)
is said to be the quotient (or factor) set of $S$ modulo $\sim$. It is easily seen that a quotient set of $S$ is also a partition of $S$. Note that $x \mapsto [x]_\sim$ is a function assigning to each $x \in S$ an equivalence class $[x]_\sim$. We will denote this function by $\pi_E$ (or $\pi_\sim$) and call it the projection of $S$ on its quotient by $E$ (or $\sim$). □
4.2 Examples.
(i) $(\mathbb{R}, =)$ is an equivalence relation. Therefore, every real number as a singleton represents an equivalence class.
(ii) $(\mathbb{R}, \le)$ is a linear order.
(iii) Congruent triangles on a plane offer an equivalence relation on the set of all triangles. [Two sets $A$ and $B$ are called congruent if there exists an "isometric" bijective map $f: A \to B$, i.e., $f$ must preserve the "distance" for every pair of points $a,b \in A$ and their images $f(a),f(b) \in B$.]
(iv) $(\mathbb{R}^2, \le)$ is not a linear order if we define $\le$ as $(a_1,b_1) \le (a_2,b_2)$ if and only if $a_1 \le a_2 \wedge b_1 \le b_2$. To make any two points comparable we can define, for instance, $(a_1,b_1) \le (a_2,b_2)$ if and only if $\|(a_1,b_1)\| \le \|(a_2,b_2)\|$, where $\|(a,b)\|$ is the distance of the point $(a,b)$ from the origin. [Note that distinct points equidistant from the origin are then related in both directions, so antisymmetry holds only after such points are identified.]
(v) Let $\mid$ be the relation on $\mathbb{N}$ such that $n \mid m$ if and only if $n$ divides $m$ (without a remainder). It can be shown that $(\mathbb{N}, \mid)$ is a partial order but not a linear order. (See Problem 4.5.)
(vi) Let $p$ be a fixed integer greater than or equal to 2. Two integers $a$ and $b$ are called congruent modulo $p$ if $a - b$ is divisible by $p$ (without remainder); in notation we write $p \mid a - b$ or $a \equiv b \pmod{p}$. The number $p$ is called the modulus of congruence. Let
$[m]_p = \{n \in \mathbb{Z} : m \equiv n \pmod{p}\}$ $(m \in \mathbb{Z})$.
In other words, $[m]_p = \{n \in \mathbb{Z} : \exists k \in \mathbb{Z} : n = kp + m\}$.
Then any two integers $m$ and $n$ are related in terms of $[\,\cdot\,]_p$ if and only if $n \in [m]_p$. This is an equivalence relation. (Show it; see Problem 4.1.)
(vii) Let $S$ be a nonempty set and $R \subset S \times S$ be a binary relation. Taking for $R$ the diagonal $D = \{(s,s) : s \in S\}$ we have with $(S,D)$ the "smallest" (by the contents of elements of $S \times S$) equivalence relation on $S$, where each element forms a singleton-class, and $D$ partitions $S$ into the classes $\{s\}$, $s \in S$. The "largest" equivalence relation on $S$ is obviously $R = S \times S$ itself and it consists of the single class.
(viii) Any function $[X,Y,f]$ generates an equivalence relation on its domain $X$, partitioning $X$ into disjoint subsets. Define the binary relation $E_f$ ($\sim_f$) on $X$ as
$x\,E_f\,y \iff f(x) = f(y).$
Then, it is readily seen that $E_f$ is an equivalence relation on $X$, referred to as the equivalence kernel of the function $f$. Formally, for every point $y \in f_*(X)$, the pre-image $f^{-1}(y)$ is an equivalence class in $X$ and $\{[f^{-1}(y)]_{E_f} : y \in f_*(X)\}$ is the quotient set of $X$ modulo $E_f$ (or $\sim_f$). Furthermore, $\sum_{y \in f_*(X)} f^{-1}(y)$ is a decomposition of $X$. For instance, the function $f(x) = x^2$ generates a partition of $\mathbb{R}$ into a collection of subsets of the form $\{-a,a\}$, for $a > 0$, along with $\{0\}$, which is a factor set of $\mathbb{R}$ modulo $E_{x^2}$. Another example is the function
$[X, \mathbb{R}, \tan]$, $X = \mathbb{R} \setminus \{\tfrac{\pi(2n-1)}{2} : n \in \mathbb{Z}\}$.
Let
$A_y = \tan^{-1}(y) = \{\arctan y + \pi n : n \in \mathbb{Z}\} = [\arctan y]_{E_{\tan}}.$
Then, $E_{\tan}$ is the equivalence kernel of the function $\tan$, and
$X|_{E_{\tan}}$ (the quotient set of $X$ modulo $E_{\tan}$) $= \{\tan^{-1}(y) : y \in \mathbb{R}\}$. □
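The equivalence kernel of Example 4.2 (viii) is easy to compute on a finite carrier. The sketch below is an illustration under stated assumptions: the function name `quotient_by_kernel` and the dictionaries `projection` and `induced` are hypothetical names, and the carriers are small integer ranges chosen for the example.

```python
from collections import defaultdict

def quotient_by_kernel(f, X):
    """Group the points of X by their value under f.

    Returns a dict {value y: equivalence class f^{-1}(y)}; its values
    form the quotient set of X modulo the equivalence kernel of f.
    """
    classes = defaultdict(set)
    for x in X:
        classes[f(x)].add(x)
    return {y: frozenset(c) for y, c in classes.items()}

X = range(-3, 4)
square = lambda x: x * x
classes = quotient_by_kernel(square, X)

# Projection: each point goes to its own equivalence class.
projection = {x: cls for cls in classes.values() for x in cls}
# Map induced on the quotient set: each class goes to the common value of f.
induced = {cls: y for y, cls in classes.items()}

# Congruence modulo 4 arises as the kernel of n -> n mod 4.
mod4 = quotient_by_kernel(lambda n: n % 4, range(-4, 8))
```

For $f(x) = x^2$ the classes are the sets $\{-a,a\}$ together with $\{0\}$, and composing `projection` with `induced` recovers $f$ on every point of the carrier.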
The last discussion about equivalence relation generated by a func tion yields some important results and notions we would like to use in the upcoming materials of Chapters 6 and 8. While we demonstrated in Example 4.2 (viii) that any function on X generates an equivalence relation, the following proposition states that the converse is also true; namely that any equivalence relation E is the equivalence kernel of some function.
4.3 Proposition. Let $E$ be an equivalence relation on a nonempty set $X$. Then the projection $[X, X|_E, \pi_E]$ is an onto map with $E$ as the equivalence kernel.
Proof. From the definition of $\pi_E$ it follows that $\pi_E$ is surjective. To claim that $E$ is the equivalence kernel of $\pi_E$, we need to show that $\pi_E(x) = \pi_E(y)$ if and only if $xEy$. Let $\pi_E(x) = \pi_E(y)$. Since $xEx$, $x \in [x]_E$ and therefore, by the assumption ($\pi_E(x) = \pi_E(y)$), $x \in [y]_E$. This proves that $xEy$. Now let $xEz$. If $y \in [x]_E$, then $yEx$ and thus, by transitivity, $yEz$, i.e. $y \in [z]_E$. Therefore, $[x]_E \subset [z]_E$. The inverse inclusion, and thus the equality, is due to the symmetry of $E$. Hence, $\pi_E(x) = \pi_E(z)$. □
Proposition 4.3 asserts that the projection $\pi_E$ is a trivial example of an onto function defined on $X$ and with the range $X|_E$. Now suppose $E$ is an equivalence relation on a set $X$ and $[X,Y,f]$ is any function whose equivalence kernel is $E$. The following theorem claims that there is a unique "mediator" $\bar{f}$ between the quotient set $X|_E$ and the codomain $Y$ of $f$.
4.4 Theorem. Let $E$ be an equivalence relation on a nonempty set $X$ and $[X,Y,f]$ be a function whose equivalence kernel is $E$. Then there is a unique function $[X|_E, Y, \bar{f}]$ such that $f = \bar{f} \circ \pi_E$. □
The reader shall be able to take care of this theorem (Problem 4.10) as well as of Corollaries 4.5 and 4.6 (Problems 4.11 and 4.12).
4.5 Corollary. In the condition of Theorem 4.4, if $f$ is onto, then $\bar{f}$ is bijective. □
4.6 Corollary. Let $[X,Y,f]$ be a function and let $E_f$ denote its equivalence kernel. Then, there is a unique one-to-one function $[X|_{E_f}, Y, \bar{f}]$ such that $f$ can be represented as a composition
$f = \bar{f} \circ \pi_{E_f}.$
Furthermore, $\bar{f}$ is bijective if $f$ is surjective (onto). □
Now, we turn to a discussion on the partial order relation and all relevant notions and theorems, which we are going to apply throughout the book.
4.7 Definitions. Let $(A, \preceq)$ be a partial order and let $B \subset A$. Clearly, $(B, \preceq)$ is also a partial order.
(i) The partial order $(B, \preceq)$ is called a chain in $(A, \preceq)$ if it is linear.
(ii) An element $b_0 \in B$ is called a minimal element of $B$ (relative to
$\preceq$) if for each $b \in B$ with $b \preceq b_0$, $b = b_0$ (compared with the smallest element $b_0$, which is $\preceq b$ for all $b \in B$).
(iii) An element $b_\infty \in B$ is called a maximal element of $B$ (relative to $\preceq$), if for each $b \in B$ with $b_\infty \preceq b$, it holds true that $b = b_\infty$ (compared with the largest element $b_\infty$, which is such that $b \preceq b_\infty$ $\forall b \in B$). [Observe that the difference between a minimal element and the smallest element of a set is as follows. A minimal element $b_0$ is $\preceq b \in B$ whenever $b_0$ is comparable with some $b$. In addition, the smallest element is comparable with all elements of $B$.]
(iv) An element $u \in A$ is said to be an upper bound of $B$ if $b \preceq u$ $\forall b \in B$. An element $l \in A$ is said to be a lower bound of $B$ if $l \preceq b$ $\forall b \in B$. If $B$ has lower and upper bounds then $B$ is called bounded (or $\preceq$-bounded).
(v) If the set of upper bounds of $B$ has a smallest element $u_0$ then this element is called the least upper bound of the set $B$ (abbreviated lub($B$)) or supremum (sup($B$)). Similarly, if the set of all lower bounds has a largest element $l_\infty$ then it is called the greatest lower bound of the set $B$ (in notation glb($B$)) or infimum (inf($B$)). [For instance, $0$ is the glb($(0,1)$) or inf$(0,1)$ in $(\mathbb{R}, \le)$, while a lub of the set $[1,\sqrt{2}] \cap \mathbb{Q}$ does not exist in $(\mathbb{Q}, \le)$.]
(vi) Let $B$ contain at least two points. The partial order $(B, \preceq)$ is called a lattice if every two-element subset of $B$ has a supremum and an infimum and they are also elements of $B$. [In notation: if $B = \{x,y\}$, then
$x \vee y = \sup\{x,y\} = \sup(B)$ and $x \wedge y = \inf\{x,y\} = \inf(B)$.] □
4.8 Examples.
(i) Let $B = \{1, 3, 3^2, \dots, 3^n, \dots\}$. Then $(B, \mid)$ (where $\mid$ is the relation in Example 4.2 (v)) is a chain in $(\mathbb{N}, \mid)$.
(ii) Let $B = \{2,3,4,\dots\}$ and consider the relation $\mid$ on $B$. In terms of this relation, the set of all prime numbers $\{2,3,5,7,11,\dots\}$ is the set of all minimal elements, while there is no smallest element in $B$, since there is no minimal element related to all other elements. $B$ does not have a maximal element either.
(iii) Consider the partial order $(\mathcal{P}(\Omega), \subset)$. It is obvious that for an arbitrary subcollection $\mathcal{A} = \{A_i \subset \Omega : i \in I\} \subset \mathcal{P}(\Omega)$, it is true that
$\sup \mathcal{A} = \bigcup_{i \in I} A_i \in \mathcal{P}(\Omega)$ and $\inf \mathcal{A} = \bigcap_{i \in I} A_i \in \mathcal{P}(\Omega)$.
In particular, it holds true for pairs of subsets. Thus, $(\mathcal{P}(\Omega), \subset)$ is a lattice. □
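The difference between minimal and smallest elements in Example 4.8 (ii) can be seen on a finite truncation of $B$. This is a minimal sketch, assuming $B = \{2,\dots,30\}$ ordered by divisibility; the function names are illustrative, and the truncation is only for computation (unlike the infinite set in the example, a finite $B$ does have maximal elements).

```python
def divides(a, b):
    """The relation a | b on positive integers."""
    return b % a == 0

def minimal_elements(B, leq):
    """Elements b0 such that b <= b0 within B forces b == b0."""
    return {b0 for b0 in B if all(b == b0 for b in B if leq(b, b0))}

def smallest_element(B, leq):
    """The element below every element of B, or None if there is none."""
    for b0 in B:
        if all(leq(b0, b) for b in B):
            return b0
    return None

def is_chain(C, leq):
    """True if every two elements of C are comparable."""
    return all(leq(a, b) or leq(b, a) for a in C for b in C)

B = set(range(2, 31))
primes_up_to_30 = minimal_elements(B, divides)
no_smallest = smallest_element(B, divides)
powers_of_3 = {1, 3, 9, 27}
```

The minimal elements returned are exactly the primes in the range, no element divides all others, and the powers of 3 form a chain as in Example 4.8 (i).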
4.9 Definition. A linear order $(A, \preceq)$ is said to be well-ordered if every nonempty subset of $A$ has a smallest element in the sense of the same order $\preceq$. □
4.10 Example. Let $\mathbb{R}$ be the set of all real numbers and consider the relation $(\mathbb{R}, \le)$, which is clearly a linear order. However, $\mathbb{R}$ is not well-ordered by $\le$, for there are nonempty subsets containing no smallest element, such as $(0,1)$. But $(\mathbb{N}, \le)$ is well-ordered. □
Can all sets be well-ordered? This is one of the fundamental questions in set theory posed by Georg Cantor in the 1870's. Cantor considered it obvious that every set can indeed be well-ordered. At that time set theory was not well-postulated yet. In 1908, Ernst Zermelo formulated his axiom of choice and showed in his paper, Untersuchungen über die Grundlagen der Mengenlehre, that the axiom of choice is equivalent to the "well-ordering principle." The axiom of choice was included in an axiom scheme for set theory that was later (1922) strengthened by A. Fraenkel in his paper, Zu den Grundlagen der Cantor-Zermeloschen Mengenlehre.
Zermelo and Fraenkel introduced the following notions. Let $\mathcal{S}$ be a collection of sets. A function $c$ defined on $\mathcal{S}$ is called a choice function if for each $S \in \mathcal{S}$, $c(S) \in S$. In other words, $c$ assigns to each set exactly one element of the set. Or less formally, we can choose exactly one element from each set. Observe that if $\mathcal{S}$ is an indexed set, i.e. $\mathcal{S} = \{S_i : i \in I\}$, then we have $f(i) = c(S_i) \in S_i$. The axiom of choice is formulated in this way:
Every system of sets has a choice function.
Zermelo proved that a nonempty set $A$ can be well-ordered if and only if its power set $\mathcal{P}(A)$ has a choice function. [There will be a short discussion of the axiom of choice in the upcoming sections.]
4.11 Theorem (Zermelo). The axiom of choice is equivalent to the well-ordering principle.
4.12 Examples.
(i) To illustrate a use of the axiom of choice, consider the following example. Let $[X,Y,f]$ be an onto map. We show that there exists a subset $A \subset X$ such that $\mathrm{Res}_A f : A \to Y$ is bijective. Let $c$ be a choice function for the factor set $\{[f^{-1}(y)] : y \in Y\}$ of $X$ modulo $E_f$. Then the set $A = \{c(f^{-1}(y)) : y \in Y\}$ has the desired property. In other words, we choose one $x$ from $f^{-1}(y)$ for each $y$ and the collection of all these $x$'s is $A$.
(ii) Let $A = \{c(\tan^{-1} y) = \arctan y : y \in \mathbb{R}\}$. Then $A = (-\frac{\pi}{2}, \frac{\pi}{2})$ and hence $[A, \mathbb{R}, \mathrm{Res}_A \tan]$ is a function such that it is one-to-one and $(\mathrm{Res}_A \tan)^{-1} = \arctan$. □
One of the central results in set theory is Zorn's Lemma [1935], which is widely used in set theory and which is also equivalent to the axiom of choice.
4.13 Lemma (Zorn). If each chain in a partially ordered set $A$ has an upper bound, then $A$ has a maximal element.
PROBLEMS
4.1 Show that the relation in Example 4.2 (vi) is an equivalence relation on $\mathbb{Z}$. Give the equivalence classes for $p = 4$.
4.2 Classify the following binary relations.
a) Let $\Omega$ be a nonempty set. Define the relation $(\mathcal{P}(\Omega), \subset)$.
b) Let $\Omega = \mathbb{R}^2 \setminus (x,0)$. Define $R$: $(a,b)R(c,d)$
5.3 Definitions.
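(i) Let $\{Y_x : x \in X\}$ be a collection of sets. The map
$\big[\prod_{x \in X} Y_x,\ Y_\alpha,\ \pi_\alpha\big]$
for each $\alpha \in X$ is called the $\alpha$th projection map if $\pi_\alpha(f) = f(\alpha)$, where $f \in \prod_{x \in X} Y_x$, $f(\alpha) \in Y_\alpha$. The point $f(\alpha)$ is called the $\alpha$th coordinate of $f$ and the space $Y_\alpha$ is called the $\alpha$th factor space. (See Figure 5.2.) [Observe that $\pi_\alpha^*(\{f(\alpha)\}) \neq \{f\}$ but it contains $\{f\}$. For instance, if $X = \{1,\dots,n\}$ is finite,
$\pi_\alpha^*(\{f(\alpha)\}) = Y_1 \times \dots \times Y_{\alpha-1} \times \{f(\alpha)\} \times Y_{\alpha+1} \times \dots \times Y_n.$
In general, $\pi_\alpha^*(\{f(\alpha)\}) = \prod_{x \in X} \tilde{Y}_x$, where $\tilde{Y}_x = Y_x$ for $x \neq \alpha$, and $\tilde{Y}_\alpha = \{f(\alpha)\}$.]
Figure 5.2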
(ii) Let $X = \{1,\dots,n\}$ and let $A_i \subset Y_i$, $i = 1,\dots,n$. The set $\prod_{i=1}^n A_i$ is called a rectangle or parallelepiped and it can be expressed in the form
$\prod_{i=1}^n A_i = \bigcap_{i=1}^n \pi_i^*(A_i).$ (5.1)
(See Figure 5.3 below.) The notion of a parallelepiped can also be extended when the index set $X$ is arbitrary. Given $A_x \subset Y_x$, $x \in X$, the set $\prod_{x \in X} A_x$ is a parallelepiped with the alternative representation (5.1).
Figure 5.3
(iii) Now we introduce a more general notion of a projection map. Let $\{Y_x : x \in X\}$ be an arbitrary indexed family of sets and let $A \subset X$. Define
$\big[\prod_{x \in X} Y_x,\ \prod_{a \in A} Y_a,\ \pi_A\big]$
and call it the $A$-projection map if $\pi_A(f) = f_*(A)$. Specifically, if $A = \{a\}$ we have $\pi_{\{a\}}(f) = f_*(\{a\})$ which, in contrast with definition (i), is a singleton. Let $\mathcal{A} \subset \prod_{a \in A} Y_a$. Then call $\pi_A^*(\mathcal{A})$ an $A$-cylinder with base $\mathcal{A}$. An $A$-cylinder is called a rectangular cylinder if $\mathcal{A}$ is a rectangle. If, in addition, $A$ is a finite set then the rectangular cylinder is called simple. A simple $A$-cylinder is called a unit cylinder if $A$ is a singleton. (See Figures 5.4-5.7.) □
5.4 Example. Let $A = \{\alpha_1, \alpha_2, \dots, \alpha_n\}$. Then,
$\pi_{\{\alpha_1,\dots,\alpha_n\}}(f) = f_*(\{\alpha_1,\dots,\alpha_n\}) = \{f(\alpha_1),\dots,f(\alpha_n)\}$
and hence,
$\pi^*_{\{\alpha_1,\dots,\alpha_n\}}(\{f(\alpha_1),\dots,f(\alpha_n)\}) = \bigcap_{i=1}^n \pi^*_{\alpha_i}(\{f(\alpha_i)\})$
is a $\{\alpha_1,\dots,\alpha_n\}$-simple cylinder with base $\{f(\alpha_1),\dots,f(\alpha_n)\}$. □
Figure 5.4
Figure 5.5
Figure 5.6 ($\pi_A(f) = f_*(A)$)
Figure 5.7 ($A$-cylinder with base $\mathcal{A}$)
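For a finite index set, the description of a rectangle as an intersection of inverse images of projections can be verified by brute force. The sketch below assumes three small factor spaces and indices $0,1,2$ (Python's zero-based positions rather than the $1,\dots,n$ of the text); the names `Y`, `Z`, `proj_preimage` and the particular sets are hypothetical choices for the illustration.

```python
from itertools import product

# Three factor spaces; a point f of the product is a tuple (f[0], f[1], f[2]).
Y = [{0, 1}, {"a", "b", "c"}, {10, 20}]
Z = set(product(*Y))

def proj(a, f):
    """The a-th projection: the a-th coordinate of the point f."""
    return f[a]

def proj_preimage(a, B):
    """Inverse image of B under the a-th projection: all points with f[a] in B."""
    return {f for f in Z if proj(a, f) in B}

# A rectangle A_0 x A_1 x A_2 with A_i a subset of Y_i.
A = [{0}, {"a", "b"}, {10, 20}]
rectangle = set(product(*A))
as_intersection = set.intersection(
    *(proj_preimage(i, A[i]) for i in range(len(Y)))
)

# The inverse image of a single coordinate value is much larger than {f}.
f = (0, "a", 10)
fibre = proj_preimage(0, {proj(0, f)})
```

The last two lines illustrate the bracketed observation in Definition 5.3 (i): fixing one coordinate of `f` leaves all other coordinates free, so the inverse image contains `f` but is not equal to `{f}`.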
PROBLEMS
5.1 Let $Z = \prod_{x \in X} Y_x$, let $A_x \subset Y_x$ and let $A = \prod_{x \in X} A_x$, where $A_x = Y_x$ except for finitely many values of the index $x$, say $1,\dots,n \in X$. Show that $A = \bigcap_{k=1}^n \pi_k^*(A_k)$.
5.2 Let $Z = \prod_{x=1}^\infty Y_x$, and let $A_{nx} \subset Y_x$, where for each $x = 1,2,\dots$, the sequence of sets $\{A_{nx}\}$ is monotone nondecreasing (i.e. $A_{1x} \subset A_{2x} \subset \dots$) with $\sup\{A_{nx} : n = 1,2,\dots\} = \bigcup_{n=1}^\infty A_{nx} = Y_x$ for $x = 2,3,\dots$. Also assume that $A_{11} = A_{21} = A_{31} = \dots = A_1$. Show that
$\sup\Big\{\prod_{x=1}^\infty A_{nx} : n = 1,2,\dots\Big\} = \pi_1^*(A_1).$
5.3 Let $Y_x = \mathbb{R}$ for all $x \in \mathbb{R}$, $A = (0,2)$.
a) Draw $\pi_A(f)$ for $f(x) = x^2$.
b) Draw $\pi_A^*(\mathcal{A})$ for $\mathcal{A} = (0,1) \times (0,1)$.
5.4 Let $\{Y_x ; x \in X\}$ and $\{Z_x ; x \in X\}$ be two families of sets. Show that
a) $\big(\prod_{x \in X} Y_x\big) \cap \big(\prod_{x \in X} Z_x\big) = \prod_{x \in X} (Y_x \cap Z_x)$.
b) $\big(\prod_{x \in X} Y_x\big) \cup \big(\prod_{x \in X} Z_x\big) \subset \prod_{x \in X} (Y_x \cup Z_x)$.
5.5 Let $m,n \in \mathbb{N}$ and $Y \neq \emptyset$.
a) For $m < n$, find an injective map $[Y^m, Y^n, f]$.
b) Find an injective map $[Y^n, Y^{\mathbb{R}}, f]$.
c) Find a bijective map $[Y^n \times Y^{\mathbb{R}}, Y^{\mathbb{R}}, f]$.
d) Find a bijective map $[Y^{\mathbb{R}} \times Y^{\mathbb{R}}, Y^{\mathbb{R}}, f]$.
e) For $A \subset X$, find an injective map $[Y^A, Y^X, f]$.
NEW TERMS:
Cartesian product of a sequence; Cartesian product of an indexed family of sets; projection map; coordinate; factor space; rectangle; parallelepiped; A-projection map; cylinder; rectangular cylinder; simple cylinder; unit cylinder
6. CARDINALITY
One of the main perplexities in the theory of sets is finding a criterion for their "powers." We can overcome this difficulty when considering the class of "finite" sets. (We frequently operate with the term "finite", though we did not give any strong definition.) We can easily define an equivalence relation in this class, for example, introducing $\mathcal{C}_n$ as the class of all $n$-element sets for every $n \in \mathbb{N}_0$. A partial order relation in this class would act as an appropriate comparison among sets from various classes. Sets $A$ and $B$ are said to be compared, in notation $A \preceq B$, when and only when $A \in \mathcal{C}_n$, $B \in \mathcal{C}_s$ and $n \le s$. Then we could assign to the set $A$ the number $n$ and call it the cardinal number of $A$. Doing this, however, we would experience real difficulties when introducing "countable" and "uncountable" sets. Specifically, we would fail to operate with cardinal numbers as numbers in the usual sense. (Pursuing this philosophy we readily encounter contradictions, the most frequent phenomenon in set theory.) The basic principles of the formalism of cardinality belong to Georg Cantor, who was the first to introduce a well-structured concept of "infinity" in his pioneering work done in the 1870's and 1880's. We will present a rather informal version of cardinality sufficient for us throughout the analysis presented in this book. A curious reader is referred to special monographs on set theory. We will start with comparison ideas based on finite sets, ideas that enable us to deal with infinite sets as well.
6.1 Definitions.
(i) Two sets $A$ and $B$ are said to be equipotent if there is a bijective function $f: A \to B$. In this case we denote $|A| = |B|$ (or $A \approx B$) and also say that $A$ and $B$ have equal cardinality.
(ii) If there exists a one-to-one function $f: A \to B$, then we say that the cardinality of $A$ is less than or equal to the cardinality of $B$, in notation $|A| \le |B|$ or $A \preceq B$. If $|A| \le |B|$ and $|A| \neq |B|$ we shall write $|A| < |B|$ or $A \prec B$.
(iii) A cardinal number is an equivalence class containing all sets that are "$\approx$-comparable." [For some cardinal numbers we will be using the same notation as for regular numbers.]
(iv) Let $0$ denote the cardinal number of the empty set $\emptyset$ (the only representative of this class). Note that $0$ is not a number but the class containing $\emptyset$. Thus, $|\emptyset| = 0$.
(v) Similarly, the cardinal number $n$ is the equivalence class containing the set $\{1,\dots,n\}$. Therefore, a set $A$ is finite if it is equipotent with some set of cardinal number $n$, such that the integer number $n$ is an element of $\mathbb{N}$, i.e., $|A| = |\{1,\dots,n\}| = n$. A set that is not finite is called infinite. [One can easily show that $\mathbb{N}$ is infinite.]
(vi) A set $A$ is said to be countable or denumerable if it is equipotent with $\mathbb{N}$ and in this case we write $|A| = \aleph_0$ (pronounced aleph nought). A set $A$ is called at most countable if $|A| \le |\mathbb{N}|$ or $A \preceq \mathbb{N}$.
(vii) An infinite set which is not countable is called uncountable.
(viii) A set $A$ is said to have the cardinality of continuum if it is equipotent with the set $\mathbb{R}$ of real numbers and we write $|A| = \mathfrak{c}$. [We show below that $\aleph_0 <$ . . . 1) . . . . k=1
Then, clearly $A = \sum_{n=1}^\infty B_n$. Without loss of generality, we assume that each set $B_n$ is countable (in general, any set may also be at most countable) and, therefore, can be enumerated as $B_n = \{b_{n1}, b_{n2}, \dots\}$, $n = 1,2,\dots$. We can place these sets in the form of a matrix:
$$\begin{matrix} b_{11} & b_{12} & b_{13} & \cdots \\ b_{21} & b_{22} & b_{23} & \cdots \\ b_{31} & b_{32} & b_{33} & \cdots \\ \vdots & \vdots & \vdots & \end{matrix}$$
Now the desired bijective map from $\mathbb{N}$ to $A$ is $f(1) = b_{11}$, $f(2) = b_{12}$, $f(3) = b_{21}$, $f(4) = b_{31}$, $f(5) = b_{22}$, $f(6) = b_{13}, \ldots$.
(v) The set $\mathbb{Q}$ of rational numbers is countable, for the function $f\left(\frac{m}{n}\right) = (m,n)$ is one-to-one from $\mathbb{Q}$ to $(\mathbb{N} \times \mathbb{N}) \cup \{(0,0)\}$. The latter is countable by (iii).
(vi) We can show that $\aleph_0 < \mathfrak{c}$. Clearly, $\aleph_0 \le \mathfrak{c}$. Then it is sufficient to show that $\mathbb{N} \prec [0,1]$, since $[0,1] \approx \mathbb{R}$ (see Problem 6.5). If a bijective function $f\colon \mathbb{N} \to [0,1]$ exists, then each $f(n)$ is of the type $0.a_{n1}a_{n2}\ldots$. Now define the number $b := 0.b_1 b_2 \ldots$ such that $b_i = 3$ if $a_{ii} \ne 3$ and $b_i = 5$ if $a_{ii} = 3$, $i = 1,2,\ldots$. Then $b$ cannot appear among the values of $f(n)$, for it differs from $f(n)$ at the $n$th place. On the other hand, $b \in [0,1]$, which contradicts the assumption that $f$ is onto. Thus $\aleph_0 < \mathfrak{c}$. Observe that some rational numbers have two decimal representations, e.g. $0.1$ and $0.0999\ldots$, so one has to be careful when telling numbers apart by their digits as above. Since $b$ contains only the digits 3 and 5, its decimal representation is unique and the argument is not affected. □
The following theorem is one of the central results in set theory.
6.4 Theorem (Cantor). $A \prec \mathcal{P}(A)$ for every set $A$.
Proof. The result holds trivially for any finite set $A$ (see Problem 1.4). Specifically, for the empty set, $|\emptyset| = 0$, while $|\mathcal{P}(\emptyset)| = 1$. Since $\mathcal{P}(A)$ contains all singletons, it immediately follows that $A \preceq \mathcal{P}(A)$. To show that $|A| \ne |\mathcal{P}(A)|$, we assume that $A \approx \mathcal{P}(A)$ and derive a contradiction. By our assumption, there exists a bijective map $f\colon A \to \mathcal{P}(A)$. Each element $a \in A$ is then assigned a subset $f(a)$ of $A$, and $a$ may or may not belong to $f(a)$. We define $B = \{a \in A : a \notin f(a)\}$. $B$ is nonempty, since there exists at least one element $a_0 \in A$ assigned to $\emptyset$. We pick a point $b \in A$ such that $f(b) = B$. By the definition of $B$, $b \in B \Leftrightarrow b \notin f(b) = B$, and this is a contradiction. □
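The diagonal set $B = \{a \in A : a \notin f(a)\}$ from the proof of Cantor's theorem can be checked by brute force on small finite sets. The following is only an illustrative sketch: the function names `diagonal_set` and `all_maps_miss_diagonal` are hypothetical and do not come from the text, and a finite check of course proves nothing about infinite sets.

```python
from itertools import product


def diagonal_set(A, f):
    """Return B = {a in A : a not in f(a)} for a map f: A -> P(A).

    `f` is a dict sending each element of A to a frozenset of elements of A.
    """
    return frozenset(a for a in A if a not in f[a])


def all_maps_miss_diagonal(A):
    """Check that for EVERY map f: A -> P(A) the set B is not a value of f."""
    A = list(A)
    # All subsets of A, encoded as frozensets.
    subsets = [
        frozenset(a for a, keep in zip(A, bits) if keep)
        for bits in product([False, True], repeat=len(A))
    ]
    # Every map f: A -> P(A) is a choice of one subset per element of A.
    for images in product(subsets, repeat=len(A)):
        f = dict(zip(A, images))
        if diagonal_set(A, f) in f.values():
            return False
    return True
```

For a three-element set this runs through all $8^3 = 512$ maps; in each case $B$ is missed by $f$, so no map $A \to \mathcal{P}(A)$ is onto.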
6.5 Remarks.
(i) In Remark 5.2 (iii) we showed that the power set $\mathcal{P}(X)$ of a set $X$ is equipotent with the set $\{0,1\}^X$ of all functions $f\colon X \to \{0,1\}$. Note that 2 is the cardinal number of the set $\{0,1\}$. Thus, we conclude that $|\mathcal{P}(X)| = 2^{|X|}$ (where we set $|B|^{|A|} = |B^A|$). In particular, if $|X| = |\mathbb{N}| = \aleph_0$, then $|\mathcal{P}(\mathbb{N})| = 2^{\aleph_0}$. An interesting fact is that $2^{\aleph_0} = \mathfrak{c}$, the proof of which is left for the reader as an exercise (see Problem 6.6).
(ii) The continuum hypothesis states that if $\mathfrak{n}$ is an infinite cardinal, then there is no cardinal $\mathfrak{m}$ such that $\mathfrak{n} < \mathfrak{m} < 2^{\mathfrak{n}}$. This was conjectured by Cantor for $\mathfrak{n} = \aleph_0$. In 1900 David Hilbert included the "continuum problem" as Problem #1 in his famous list of open problems in mathematics. In 1940 Kurt Gödel proved that the continuum hypothesis is consistent with (i.e. does not contradict) the axioms of set theory (axiom of existence, axiom of choice, etc.). In 1963 Paul Cohen [1966] showed that the continuum hypothesis is independent of the axioms.
(iii) The cardinal number $2^{\mathfrak{c}}$ is called the hypercontinuum. For example, the set $\mathcal{P}(\mathbb{R})$ has the hypercontinuum cardinal. □
Supplementary Historical Note. Modern set theory was founded by Georg Cantor, in a sequence of several articles that appeared between 1870 and 1880. One of these articles, Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen, appeared in Crelle's Journal in 1874, and is said to have given birth to set theory. Georg Cantor was born of Danish parents (both of Jewish descent) in St. Petersburg, Russia, in 1845, and lived there until 1856, when his parents moved to Frankfurt, Germany. Cantor began his university studies at Zürich in 1862. After one semester at Zürich he moved to Berlin University, where he attended lectures of Weierstrass, Kummer and Kronecker. Leopold Kronecker later became Cantor's main opponent, criticizing his concept of infinity and regarding it as theology and not as mathematics. (Cantor, whose mother was a Catholic and father a Protestant, was a devoted Protestant and an active theologian. The latter became a major target of attacks by Cantor's liberal opponents at Berlin University.) In 1867 Cantor received his Ph.D. (in number theory) from Berlin University.
His dream of getting a teaching position at Berlin University never came true, primarily due to the opposition of Kronecker. In 1869 Cantor was appointed at Halle University, where he remained until his retirement in 1913. Cantor died in a mental hospital in Halle in 1918. In 1925 David Hilbert recognized Cantor's concept of infinity. He said, "No one can drive us from the paradise that Cantor created for us." □
PROBLEMS
6.1 Show the validity of the statement in Remark 6.2.
6.2 Prove the Schröder-Bernstein Theorem: If $A \preceq B$ and $B \preceq A$, then $A \approx B$.
6.3 We call an algebraic number any root of a polynomial with integer coefficients. What is the cardinal number of all algebraic numbers? [Hint: Use Problem 6.2.]
6.4 Prove that every subset of a countable set is at most countable. [Hint: Use the well-ordering principle.]
6.5 Show that $\mathbb{R} \approx [0,1]$. [Hint: Show that $[0,1] \approx (0,1)$.]
6.6 Show that $2^{\aleph_0} = \mathfrak{c}$.
6.7 Let $[X,Y,f]$ be a surjective map. Show that there is a subset of $X$ equipotent with $Y$.
6.8 Let $[X,Y,f]$ be an injective map, where $Y$ is countable, and let $f^{-1}(y)$ be a countable set for each $y \in Y$. Must $X$ be countable?
6.9 Let $A$ be an uncountable set and let $B \subset A$ be countable. Show that $A \setminus B$ is uncountable.
6.10 Prove the statement: Every infinite set contains a countable subset.
6.11 What is the cardinal number of all polynomials whose coefficients are algebraic numbers?
6.12 Show that the set of all finite subsets of $\mathbb{N}$ is countable.
NEW TERMS:
cardinal number 40
equipotent sets 40
finite set 41
countable (denumerable) set 41
$\aleph_0$ 41
at most countable set 41
uncountable set 41
continuum 41
Cantor's Theorem 42
continuum hypothesis 43
hypercontinuum 43
Schröder-Bernstein Theorem 44
algebraic number 44
7. BASIC ALGEBRAIC STRUCTURES
Algebra is a mathematical discipline that studies algebraic structures. The most rudimentary algebraic operations with natural and positive rational numbers were already encountered in ancient mathematical texts. The famous book "Arithmetica," by the Greek Diophantos (of Alexandria) in the third century A.D., had a significant influence on the development of algebraic formalism. The term "algebra" stems from the text Al-jabr wa'l-muqabala (by Muhammad al-Khwarizmi in the ninth century A.D.), which dealt with solution techniques for various problems reducing to first and second order algebraic equations. Until the end of the fifteenth century, when symbols for the common algebraic operations $+$, $-$, $\times$, powers, roots, and parentheses were introduced, one used cumbersome phrases and verbal descriptions of algebraic expressions. François Viète, by the end of the sixteenth century, was the first to use letters to denote unknowns and parameters. The algebraic symbolism, as we know it now, has been used only since the middle of the seventeenth century. Elementary algebra (which deals with basic arithmetic operations on real numbers, first to fourth order algebraic equations, the binomial formula, and Diophantine equations) was completed by the middle of the eighteenth century. Leonhard Euler's Introduction to Algebra was one of the most prominent texts then. In the early nineteenth century algebra became furnished with five basic (commutative, associative, and distributive) laws with respect to two algebraic operations, $+$ (addition) and $\cdot$ (multiplication). Later on, on the strength of Dirichlet's definition of a function, these operations were declared binary operations based on the following definition. An operation on a set $A$ is a rule that assigns to each ordered subset $A_n \subset A$ of $n$ elements a uniquely defined element of the same set $A$. For $n = 1, 2,$ and $3$, the operation is called unary, binary, and ternary, respectively.
The algebraic structures were formalized by the British mathematicians George Peacock (in 1830), Duncan Gregory (in 1840), and Augustus De Morgan, and further refined by the Germans Hermann Hankel and Hermann Grassmann. Abstract algebra is regarded as having been born in 1846, when Joseph Liouville published Galois' theory (of solvability of polynomial equations) based on the group concept, which has been spreading within mathematics ever since. In 1872, the German Felix Klein published a program in which he proposed to formulate all of geometry as the study of invariants under groups of transformations. In 1883, the Norwegian mathematician Marius Sophus Lie published his fundamental work on continuous groups of transformations used in studies of continuous functions. Group theory, which is at the heart of contemporary abstract algebra, made prominent contributions to geometry, topology, and even physics in the 20th century.
In this section, we review some familiar algebraic structures. These will provide a basis for analysis, shifting it to more abstract settings in the upcoming chapters.
7.1 Definitions.
(i) A set $\mathcal{G}$ with a binary algebraic operation $*$ (frequently called addition $+$ or multiplication $\cdot$) from $\mathcal{G} \times \mathcal{G}$ into $\mathcal{G}$ is called a semigroup, in notation $(\mathcal{G},*)$, if $*$ is associative. [Note that even though $+$ or $\cdot$ may denote addition and multiplication, they need not mean the conventional algebraic operations known for numbers.]
(ii) A semigroup $(\mathcal{G},*)$ is called a monoid if there is an element $I \in \mathcal{G}$ (called a two-sided identity) such that for all $x \in \mathcal{G}$, $x * I = I * x = x$.
(iii) A monoid $(\mathcal{G},*)$ is called a group if for each $x \in \mathcal{G}$ there is a $*$-inverse $x'$ such that $x * x' = x' * x = I$. If $*$ is commutative, $(\mathcal{G},*)$ (semigroup, monoid, or group) is called commutative or Abelian. If we use for $*$ the symbol $+$ or $\cdot$, then $(\mathcal{G},+)$ is referred to as additive and $(\mathcal{G},\cdot)$ as multiplicative, respectively.
If $(\mathcal{G},+)$ is additive, the element $I$, denoted by $\theta$, is called zero, and the element $x'$, denoted by $-x$, is said to be an additive inverse of $x$. If $(\mathcal{G},\cdot)$ is multiplicative, the element $I$ is called the unity and denoted by $1$. The element $x'$ is denoted by $x^{-1}$ and is said to be a multiplicative inverse of $x$.
(iv) A set $\mathcal{R}$ with addition $+$ and multiplication $\cdot$ from $\mathcal{R} \times \mathcal{R}$ into $\mathcal{R}$, i.e. a triple $(\mathcal{R},+,\cdot)$, is called a ring if:
a) $(\mathcal{R},+)$ is an Abelian group;
b) $\cdot$ is associative;
c) $\forall a,b,x \in \mathcal{R}$,
$x \cdot (a+b) = x \cdot a + x \cdot b$ (called the left distributive law),
$(a+b) \cdot x = a \cdot x + b \cdot x$ (called the right distributive law).
Observe that multiplication $\cdot$ need not be commutative in a ring. However, if this is the case, the ring is called commutative. A ring need not have a unity either; consequently, a ring equipped with a unity is called a ring with unity. [For instance, the set $\mathcal{M}(n,n)$ of all $n \times n$ matrices is a noncommutative ring with unity (the unit matrix).]
(v) Let $(\mathcal{G},*)$ and $(\mathcal{G}',\star)$ be two groups and let $[\mathcal{G},\mathcal{G}',f]$ be a map preserving the algebraic operations $*$ and $\star$, i.e. such that
$$f(x * y) = f(x) \star f(y).$$
Then $f$ is called a (group) homomorphism of $\mathcal{G}$ into $\mathcal{G}'$. If $[\mathcal{G},\mathcal{G}',f]$ is bijective then it is called an isomorphism. In this case, the groups $(\mathcal{G},*)$ and $(\mathcal{G}',\star)$ are called isomorphic. If $[\mathcal{G},\mathcal{G}',f]$ is a homomorphism and, in addition, $(\mathcal{G},*) = (\mathcal{G}',\star)$, then $[\mathcal{G},\mathcal{G}',f]$ is said to be an endomorphism. If $[\mathcal{G},\mathcal{G}',f]$ is an endomorphism and an isomorphism then it is called an automorphism. □
A homomorphism preserves some (but not all) structural properties of groups, as the following theorem states.
7.2 Theorem. Let $[\mathcal{G},\mathcal{G}',f]$ be a homomorphism. Then (i) for each $x \in \mathcal{G}$, $f(x') = [f(x)]'$, and (ii) $f(I) = I$. (See Problem 7.1.)
7.3 Definition. Let $[\mathcal{G},\mathcal{G}',f]$ be a homomorphism. Define
$$\operatorname{Ker} f = f^*(\{I\})$$
and call it the kernel of $f$. □
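As a concrete instance of a homomorphism and its kernel, one may take the reduction map $x \mapsto x \bmod n$ from $(\mathbb{Z},+)$ onto the integers modulo $n$ with addition modulo $n$. This example and the names `reduce_mod`, `is_homomorphism_on`, and `kernel_on` are illustrative assumptions, not taken from the text; the check runs over a finite sample of integers only.

```python
def reduce_mod(x, n=6):
    """The map f(x) = x mod n from (Z, +) to ({0, ..., n-1}, + mod n)."""
    return x % n


def is_homomorphism_on(samples, n=6):
    """Check f(x + y) = f(x) (+ mod n) f(y) for all pairs from `samples`."""
    return all(
        reduce_mod(x + y, n) == (reduce_mod(x, n) + reduce_mod(y, n)) % n
        for x in samples
        for y in samples
    )


def kernel_on(samples, n=6):
    """Elements of `samples` mapped to the identity 0, i.e. Ker f restricted to the sample."""
    return [x for x in samples if reduce_mod(x, n) == 0]
```

On the sample `range(-12, 13)` the kernel consists of the multiples of 6, and both statements of Theorem 7.2 can be observed directly: the identity 0 goes to 0, and the image of $-x$ is the additive inverse of the image of $x$ modulo 6.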
7.4 Examples.
(i) The space of all continuous functions with operation $+$ forms an Abelian group. The same space is not a group with the operation of multiplication.
(ii) All polynomials with operation $+$ form an Abelian group.
(iii) $(\mathbb{Z},+)$, $(\mathbb{R},+)$ and $((0,\infty),\cdot)$ are Abelian groups; $(\mathbb{Z},\cdot)$ is an Abelian monoid.
(iv) The space $\mathbb{C} \setminus \{(0,0)\}$ with the operation "complex multiplication" is obviously an Abelian group and $(\mathbb{C},+,\cdot)$ is a ring.
(v) Let $\mathcal{C}^{(n)}_{[a,b]} = \mathcal{C}^{(n)}([a,b];\mathbb{R})$ denote the space of all $n$ times continuously differentiable real-valued functions on $[a,b] \subset \mathbb{R}$. Then $(\mathcal{C}^{(n)}_{[a,b]},+)$ is a commutative group. If $\mathcal{D}^n f$ denotes the $n$th derivative of a function $f$, then $[\mathcal{C}^{(n)}_{[a,b]}, \mathcal{C}^{(0)}_{[a,b]}, \mathcal{D}^n]$ is a homomorphism of $(\mathcal{C}^{(n)}_{[a,b]},+)$ into $(\mathcal{C}^{(0)}_{[a,b]},+)$. Replacing $\mathcal{C}^{(n)}_{[a,b]}$ by the space $\mathcal{P}$ of all polynomials on $[a,b]$, we have $[\mathcal{P},\mathcal{P},\mathcal{D}^n]$ as an endomorphism.
(vi) Consider two groups $(\mathbb{R},+)$ and $((0,\infty),\cdot)$ and the function $f(x) = e^x$. Then $[\mathbb{R},\mathbb{R}_+,f]$ is an isomorphism. Indeed, $f(x+y) = f(x)f(y)$. In addition, $[\mathbb{R},\mathbb{R}_+,f]$ is bijective.
(vii) Let $\mathcal{F} = \mathcal{F}(X;Y) = Y^X$ be the space of all functions from $X$ into $Y$. Then $(\mathcal{F},\cdot)$ is a multiplicative monoid. For any nonnegative integer $n$ and $f \in \mathcal{F}$, define the unary operation power $f^n$ on $\mathcal{F}$ as: $f^0 = 1$, $f^{k+1} = f \cdot f^k$. The power has the properties $f^i \cdot f^k = f^{i+k}$ and $(f^i)^k = f^{ik}$. Note that the power can be defined on an arbitrary multiplicative monoid with the above properties.
(viii) A function $T$ from $\overline{\mathbb{C}}$ onto $\overline{\mathbb{C}}$ (where $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$) is called a bilinear transformation if $T(z) = \frac{az+b}{cz+d}$ with $ad - bc \ne 0$. Let $\mathcal{T}$ denote the set of all bilinear transformations. Then, ( . . .
. . . $0\}$ be an indexed family of functions and let $*$ be some binary operation defined on $\mathcal{F}$. $(\mathcal{F},*)$ is called a semigroup (of functions) if $f_0 = 1$ and for all $s,t \ge 0$, $f_s * f_t = f_{s+t}$. Obviously, the semigroup $(\mathcal{F},*)$ is a commutative monoid.
(x) Let $\ell^p$ ($\subset \mathbb{R}^{\mathbb{N}}$) be the space of all sequences such that for each $x = (x_0,x_1,\ldots) \in \ell^p$, $\sum_{n=0}^{\infty} |x_n|^p < \infty$, where $p \in [1,\infty)$. Define the following operation on $\ell^p$. For $x$ and $y$, let $z = (z_0,z_1,\ldots) = x * y$ be such that $z_n = \sum_{k=0}^{n} x_k y_{n-k}$ (called discrete convolution). The operation $*$ is commutative and associative and it is closed in $\ell^p$ (see Problem 7.11). Obviously, $\mathbf{1} = (1,0,0,\ldots)$ is the unity of $(\ell^p,*)$ and thus $(\ell^p,*)$ is an Abelian monoid. Let $x = (x_0,x_1,\ldots) \in \ell^p$ be such that $x_0 \ne 0$. Define $y = (y_0,y_1,\ldots)$ such that $y_0 = \frac{1}{x_0}$. For $n \ge 1$, $y_n$ can be determined recursively from the equations $\sum_{k=0}^{n} x_k y_{n-k} = 0$. For instance,
$$y_1 = -\frac{x_1}{x_0^2}, \qquad y_2 = \frac{x_1^2}{x_0^3} - \frac{x_2}{x_0^2}.$$
In conclusion, for each $x$ with $x_0 \ne 0$, there is a unique element $y = x^{-1}$. On the other hand, if $\ell^p_0$ denotes the subset of all elements $x \in \ell^p$ with $x_0 = 0$, then $\ell^p_0$ and its complement $\ell^p \setminus \ell^p_0$ relative to $\ell^p$ are two equivalence classes induced by $*$. This implies that $(\ell^p \setminus \ell^p_0,*)$ is a commutative group. Obviously, the triple $(\ell^p,+,*)$ is a commutative ring with unity.
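The recursion for the convolution inverse in Example (x) can be traced on finite prefixes of sequences. The sketch below is illustrative only: `convolve` and `inverse_prefix` are hypothetical names, sequences are truncated to finitely many terms, and exact rational arithmetic (`fractions.Fraction`) is an assumption made to avoid rounding.

```python
from fractions import Fraction


def convolve(x, y):
    """First terms of the discrete convolution z_n = sum_{k=0}^{n} x_k * y_{n-k}."""
    n_terms = min(len(x), len(y))
    return [sum(x[k] * y[n - k] for k in range(n + 1)) for n in range(n_terms)]


def inverse_prefix(x, n_terms):
    """First n_terms of y with x * y = (1, 0, 0, ...), assuming x[0] != 0."""
    if x[0] == 0:
        raise ValueError("x_0 must be nonzero for a convolution inverse to exist")
    # Pad the given prefix of x with zeros so that x_k is defined for k < n_terms.
    xs = [Fraction(v) for v in x] + [Fraction(0)] * max(0, n_terms - len(x))
    y = [1 / xs[0]]  # y_0 = 1 / x_0
    for n in range(1, n_terms):
        # From sum_{k=0}^{n} x_k y_{n-k} = 0:  x_0 y_n = -sum_{k=1}^{n} x_k y_{n-k}.
        tail = sum(xs[k] * y[n - k] for k in range(1, n + 1))
        y.append(-tail / xs[0])
    return y
```

For $x = (2, 3, 5, 0, \ldots)$ this yields $y_0 = 1/2$, $y_1 = -3/4$, and $y_2 = -1/8$, in agreement with the closed-form expressions for $y_1$ and $y_2$ given in the example.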
Now, let $\mathcal{A}$ be the space of all complex-valued functions analytic at zero and not equal to zero at the origin. This space is closed with respect to multiplication. Hence, $(\mathcal{A},\cdot)$ is an Abelian group. Indeed, $u \equiv 1$ is the unity and for each $x \in \mathcal{A}$, $\frac{1}{x}$ is analytic at zero and it is a two-sided inverse of $x$. Obviously, each $x \in \mathcal{A}$ can be expanded in a Taylor series at zero, such that $x$ is uniquely associated with the sequence
$$\hat{x} = \left\{ x_n = \tfrac{1}{n!}\, x^{(n)}(0);\ n = 0,1,\ldots \right\}.$$
If $F$ is defined as $F(\hat{x}) = x$ and $F(\mathbf{1}) = u$, then $[\ell^p \setminus \ell^p_0, \mathcal{A}, F]$ is a group homomorphism such that
$$F(\hat{x} * \hat{y}) = F(\hat{x})F(\hat{y}).$$
Notice that $F^{-1}(x) = \hat{x}$ need not be an element of $\ell^p \setminus \ell^p_0$, for $\sum_{n=0}^{\infty} |x_n|^p$ may be a divergent series.
(xi) Let $L^p$ ($p \ge 1$) denote the class of all real-valued functions $\{[\mathbb{R},\mathbb{R},f]\}$ such that $\int_{-\infty}^{\infty} |f|^p < \infty$. Define on $L^p$ the operation $*$ as follows:
$$x * y\,(u) = \int_{-\infty}^{\infty} x(u-v)\,y(v)\,dv.$$
The operation $*$ is closed in $L^p$ and it is commutative and associative (see Problem 7.12). Define the function
$$f(u,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{u^2}{2\sigma^2} \right\}, \quad \text{for } \sigma > 0 \text{ and } u \in \mathbb{R}.$$
This function is the well-known probability density function of a normal random variable with mean 0 and variance $\sigma^2$. Consequently,
$$\int_{-\infty}^{\infty} f(u,\sigma)\,du = 1.$$
From the theory of probability, it is also known that the lion's share of the integral under the curve $f$ (over 99%) is concentrated over the interval $(-3\sigma,3\sigma)$. The function $f$ has its maximum value at 0, equal to approximately $\frac{0.399}{\sigma}$. Now, if we let $\sigma \to 0+$, the resulting function is called the (Dirac) delta function, in notation $\delta$. It is readily seen that the delta function equals 0 on $\mathbb{R} \setminus \{0\}$ and $\infty$ at 0, and that $\int_{-\infty}^{\infty} \delta = 1$. There is an alternative integral representation of the delta function. Recall that the Fourier transform of $f$ is
$$\hat{f}(\theta,\sigma) = \int_{-\infty}^{\infty} e^{i\theta u} f(u,\sigma)\,du = \exp\left\{ -\frac{(\sigma\theta)^2}{2} \right\}$$
and that $f$ can be restored by applying the inverse Fourier transform to its image as follows:
$$f(u,\sigma) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\{-i\theta u\} \exp\left\{ -\frac{(\sigma\theta)^2}{2} \right\} d\theta.$$
Again, letting $\sigma \to 0$, we arrive at
$$\delta(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\{-i\theta u\}\,d\theta.$$
By using this integral representation it will be easy to show that $\delta$ is the unity of the $*$ operation:
$$x * \delta(u) = \int_{v \in \mathbb{R}} x(u-v)\,\delta(v)\,dv = \int_{t \in \mathbb{R}} x(t)\,\delta(u-t)\,dt$$
$$= \int_{t \in \mathbb{R}} x(t)\, \frac{1}{2\pi} \int_{\theta \in \mathbb{R}} e^{-i\theta(u-t)}\,d\theta\,dt = \frac{1}{2\pi} \int_{\theta \in \mathbb{R}} e^{-i\theta u} \left( \int_{t \in \mathbb{R}} x(t)\,e^{i\theta t}\,dt \right) d\theta.$$
Since the expression in parentheses is $\hat{x}(\theta)$, which denotes the Fourier transform of $x$, the rest is the inverse Fourier operator, which should restore $x$ at $u$. So, $x * \delta = x$. According to Problem 7.1, $\delta$ is the unique unity of the operation $*$. Since $\delta \ge 0$ and because
$$\int_{-\infty}^{\infty} |f(u,\sigma)|^p\,du = \int_{-\infty}^{\infty} f(u,\sigma)^p\,du = \frac{1}{\sqrt{p}\,(\sqrt{2\pi}\,\sigma)^{p-1}}$$
. . .
. . . $- d(x,z)$. (1.2b)
Inequalities (1.2a) and (1.2b) yield
$$|d(x,y) - d(y,z)| \le d(x,z), \quad \forall x,y,z \in X. \qquad (1.2c)$$
Let $Y \subseteq X$. Then the pair $(Y,d)$ is also a metric space, called a subspace of $(X,d)$. □
1.3 Examples (of metric spaces).
(i) The discrete metric is defined on a nonempty set $X$ as
$$d(x,y) = \begin{cases} 1, & x \ne y \\ 0, & x = y. \end{cases}$$
The triangle inequality could fail only if $d(x,y) = 1$ and $d(x,z) = d(z,y) = 0$. However, this would only be possible for $x = z = y$. Hence, $d(x,y)$ cannot equal 1.
(ii) Let $X = (0,\infty)$ and $d(x,y) = \left| \frac{1}{x} - \frac{1}{y} \right|$. The triangle inequality follows from
$$d(x,y) = \left| \tfrac{1}{x} - \tfrac{1}{y} \right| = \left| \tfrac{1}{x} - \tfrac{1}{z} + \tfrac{1}{z} - \tfrac{1}{y} \right| \le \left| \tfrac{1}{x} - \tfrac{1}{z} \right| + \left| \tfrac{1}{z} - \tfrac{1}{y} \right| = d(x,z) + d(z,y).$$
(iii) Let $X$ consist of all sequences $\{x_n\} \subset \mathbb{R}$. Such a carrier $X$ is denoted by $\mathbb{R}^{\mathbb{N}}$. Recall that a subset of $\mathbb{R}^{\mathbb{N}}$ is the $\ell^1$ space if it contains . . . □
2.10 Examples.
(i) From Definition 2.8 (iii) it follows that $A \subseteq \overline{A}$.
(ii) Since the set of all open subsets of a discrete metric space $(X,d)$ coincides with its power set, the set of all closed subsets is also the power set. Particularly, in a discrete metric space all subsets are simultaneously open and closed. □
2.11 Proposition. For any subset $A$ of $X$, $\overline{A}$ is the smallest closed superset of $A$.
Proof.
(i) We show first that $\overline{A}$ is a closed set, i.e. that $(\mathrm{Cl}(A))^c$ is open. Let $x \in (\mathrm{Cl}(A))^c$. Then there exists an open ball $B(x,r)$ such that $B(x,r) \cap A = \emptyset$ (since, otherwise, $x$ would belong to $\overline{A}$ by the definition). However, we have not proved yet that $B(x,r) \cap \overline{A} = \emptyset$, which would immediately imply that $(\mathrm{Cl}(A))^c$ is open. Now we show that no point of $B(x,r)$ is a closure point of $A$. Take an arbitrary point $t \in B(x,r)$. Since $B(x,r)$ is an open set, there is an open ball $B(t,r_t) \subseteq B(x,r)$, also disjoint from $A$. By the definition of a closure point, this means that $t \notin \overline{A}$. Since $t$ was an arbitrary point of $B(x,r)$, $B(x,r) \subseteq (\mathrm{Cl}(A))^c$.
(ii) Now we show that the closure of $A$ is the smallest closed set containing $A$. Let $B$ be an arbitrary closed set such that $A \subseteq B$. We prove that $B^c \subseteq (\overline{A})^c$. Since $B^c$ is open, for each $x \in B^c$ there is an open ball $B(x,r) \subseteq B^c$. This implies that $B(x,r) \cap B = \emptyset$ and that
$$B(x,r) \cap A = \emptyset.$$
Thus $x \notin \overline{A}$ (by the definition of a closure point), which is equivalent to $x \in (\mathrm{Cl}\,A)^c$. Therefore, we have proved that $x \in B^c$ yields $x \in (\mathrm{Cl}\,A)^c$, i.e. $B^c \subseteq (\mathrm{Cl}\,A)^c$. The latter is obviously equivalent to $\overline{A} \subseteq B$. □
2.12 Corollary. A set $A$ is closed if and only if $A = \overline{A}$.
(See Problem 2. 1 .) 2.13 Remark. Consider the set C(x,r) = {y E X : d(x,y) < r }. It can be easily shown that C is a closed set. (See Problem 2.4.) Such C is called a closed ball centered at x with radius r. Evidently, B(x,r) C C(x,r) implies that B(x,r) C C(x,r), since B is the smallest closed set containing B. However, we observe that C(x,r ) does not necessarily coincide ,w ith the closure of the corresponding open ball B( x, r ) . For instance, let (X ,d) be a discrete metric space, where any open ball is both closed and open set, i.e. B(x,r) = B(x,r). Because
C(x, r) =
{x }, X,
r 1,
we have B(x,r) = C(x,r) = X for r > 1 or B(x,r) = C(x,r) = {x} for r < 1. For r = 1 , B(x,r) = {x} C C(x,r ) = X, unless X is a singleton. D 2.14 Examples.
(i) In the Euclidean metric space $(\mathbb{R},d_e)$, for each $x \in \mathbb{R}$, $\{x\}$ is closed. Indeed, $\{x\}^c = (-\infty,x) \cup (x,\infty)$ is open.
(ii) The set of all rational numbers $\mathbb{Q}$ is neither open nor closed. Indeed, it is known that each irrational point $x$ is a limit of a sequence of rational points $\{x_n\}$. Therefore, there is no open ball $B(x,r)$ which does not contain rational points. This implies that $\mathbb{Q}^c$ is not open, or equivalently, $\mathbb{Q}$ is not closed. On the other hand, $\mathbb{Q}$ cannot be open, since otherwise every rational point $q$ would be the center of an open ball (interval) containing just rational numbers. This is absurd, since any interval has the cardinality of continuum, while $\mathbb{Q}$ is countable. Therefore, the set of all rational numbers is neither open nor closed. It also follows that the set of all irrational numbers is neither open nor closed. □
2.15 Definition. A point $x \in X$ is called an accumulation point of a set $A \subseteq X$ if $\forall r > 0$, $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$. [Observe that $x$ need not be an element of $A$.] The set of all accumulation points of $A$ is called the derived set of $A$ and it is denoted by $A'$. □
Unlike a closure point, an accumulation point must be "close" to $A$. If $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$, then $B(x,r) \cap A \ne \emptyset$, and, consequently, $x \in A'$ yields that $x \in \overline{A}$, or $A' \subseteq \overline{A}$.
2.16 Examples.
(i) Notice that not every closure point is an accumulation point. For instance, let $A = (0,1) \cup \{2\} \subset (\mathbb{R},d_e)$. Then 2 is obviously a closure point of $A$. However, 2 is not an accumulation point of $A$, since $B(2,\tfrac{1}{2}) \cap (0,1) = \emptyset$. On the other hand, 0 is an accumulation and closure point of $A$.
(ii) Let $A = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\} \subset (\mathbb{R},d_e)$. Since 0 is the limit of the sequence $\{\tfrac{1}{n}\}$ (in terms of Euclidean distance), it is also an accumulation point of $A$: any open ball at 0 contains at least one point of $A$. This is the only accumulation point of $A$. By the way, $A$ is not closed, for 0 is a closure point of $A$ that does not belong to $A$. So we have $A' = \{0\}$, $\overline{A} = A \cup \{0\}$. □
In the previous section we introduced the notion of the product metric. We wonder what the shape of open sets in the product metric space is. A remarkable property of this metric is given by the following theorem.
2.17 Theorem. Let $\{(Y_k,d_k) : k = 1,\ldots,n\}$ be a finite family of metric spaces and let $(Y,d) = \times\{(Y_k,d_k) : k = 1,\ldots,n\}$ be the product space. Then $O \subseteq (Y,d)$ is open if and only if $O$ is the union of sets of the form $\times\{O_i : i = 1,\ldots,n\}$, where each $O_i$ is open in $(Y_i,d_i)$.
A proof of this theorem in a more general form is given in Chapter 3.
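The observation of Remark 2.13, that a closed ball need not coincide with the closure of the corresponding open ball, can be verified mechanically on a small discrete metric space. The sketch below is an illustration under stated assumptions: the three-point carrier and the helper names are hypothetical, and the `closure` routine relies on the carrier being finite, where a point is a closure point of $S$ exactly when it lies at distance 0 from some point of $S$.

```python
def discrete_metric(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1


def open_ball(X, d, x, r):
    """B(x, r) = {y in X : d(x, y) < r}."""
    return {y for y in X if d(x, y) < r}


def closed_ball(X, d, x, r):
    """C(x, r) = {y in X : d(x, y) <= r}."""
    return {y for y in X if d(x, y) <= r}


def closure(X, d, S):
    """Closure of S in a FINITE metric space (X, d).

    Only finitely many distances occur, so every ball B(y, r) meets S
    exactly when some point of S is at distance 0 from y.
    """
    return {y for y in X if any(d(y, s) == 0 for s in S)}


X = {"a", "b", "c"}
ball = open_ball(X, discrete_metric, "a", 1)        # the singleton {"a"}
ball_closure = closure(X, discrete_metric, ball)    # still {"a"}
big_ball = closed_ball(X, discrete_metric, "a", 1)  # the whole carrier X
```

With radius $r = 1$ the closure of the open ball is the singleton, while the closed ball is all of $X$, which is the strict inclusion stated in the remark.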
PROBLEMS
2.1 Prove Corollary 2.12.
2.2 Is it true that $A \subseteq B \Rightarrow \overline{A} \subseteq \overline{B}$?
2.3 Show that $[\overline{A^c}]^c \subseteq A$.
2.4 Prove that a closed ball $C(x,r)$ is a closed set.
2.5 Show that in $(\mathbb{R}^n,d_e)$, $\overline{B(x,r)} = C(x,r)$.
2.6 Show that $\overline{A} = A \cup A'$.
2.7 Let $A \subseteq (X,d)$, where $X$ is an infinite set. Show that, if $x$ is an accumulation point of $A$, then every open set containing $x$ contains infinitely many points of $A$.
2.8 Give an example of a continuum closed set that does not have any accumulation point.
2.9 Find the shape of open balls in the metric space $(X,d)$ introduced in Example 1.3 (ii).
2.10 Show that the set $[1,\infty)$ is closed in the metric space in Problem 2.9.
NEW TERMS:
open ball 65
radius of an open ball 65
supremum metric 65
open ball with respect to the Euclidean metric 66
open ball with respect to the supremum metric 66
open ($d$-open) set 67
interior point 69
interior of a set 69
closed set 69
closure point 69
closure of a set 69
closed ball 70
accumulation point 71
derived set 71
3. CONVERGENCE IN METRIC SPACES
This section introduces the reader to one of the central notions in the analysis of metric spaces: convergence. Among other things, we will discuss the relation between limit and closure points.
3.1 Definitions.
(i) Recall that a function $[\mathbb{N},X,f]$ is called a sequence, and its most commonly used notation is $\{x_n\} = f$, with $x_n = f(n)$. Let $\{x_n\} \subseteq (X,d)$ be a sequence and let $x \in X$. A subsequence $Q_N = \{x_N, x_{N+1}, \ldots\}$ is called an $N(x,\varepsilon)$-tail of $\{x_n\}$ if there are $N \ge 1$ and $\varepsilon > 0$ such that $Q_N \subseteq B(x,\varepsilon)$. The sequence $\{x_n\}$ is said to converge to a point $x \in X$ if for every $\varepsilon > 0$ there is an $N(x,\varepsilon)$-tail. In notation,
$$\lim_{n \to \infty} d(x_n,x) = 0 \quad (\text{also } d\text{-}\lim_{n \to \infty} x_n = x \text{ or just } x_n \to x).$$
$x$ is called a limit point of the sequence $\{x_n\}$. A sequence is convergent if it is convergent to at least one limit point that belongs to $X$.
(ii) A point $x$ is said to be a limit point of a set $A$ if there is a sequence $\{x_n\} \subseteq A$ convergent to $x$.
(iii) A sequence $\{x_n\}$ is called a Cauchy sequence if for each $\varepsilon > 0$ there is an $N$ such that $d(x_n,x_m) < \varepsilon$ for $n,m > N$.
(iv) A metric space $(X,d)$ is called complete if every Cauchy sequence in $X$ is convergent.
(v) A sequence $\{x_n\}$ is called bounded if for every $n$, $d(x_1,x_n) \le M$, where $M$ is a positive real number. □
3.2 Remark. A sequence in a metric space can have at most one limit point. Indeed, let $x,y$ be limits of a sequence $\{x_n\} \subseteq (X,d)$ and let $\varepsilon > 0$ be arbitrary. Then, given a sufficiently large $N$, by the triangle inequality,
$$d(x,y) \le d(x,x_n) + d(x_n,y) < 2\varepsilon \quad \text{for all } n > N$$
(i.e. $d(x,y)$ can be made arbitrarily small). Thus, $x = y$. □
3.3 Theorem. Let $A \subseteq (X,d)$. Then a point $x$ is a closure point of a set $A$ if and only if $x$ is a limit point of $A$ (i.e. there is a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$).
Proof.
(i) Let $x$ be a closure point of $A$. If $x \in A$ then the proof becomes trivial (take $x_n = x$, $n = 1,2,\ldots$). Let $x \in \overline{A} \setminus A$. By the definition of a closure point, every open ball $B(x,r)$ meets $A$. Thus for every $n$ there is a point $x_n \in A \cap B(x,\tfrac{1}{n})$, so that $d(x,x_n) < \tfrac{1}{n}$. Therefore, $\{x_n\}$ is a desired sequence convergent to $x$.
(ii) Let $\{x_n\} \subseteq A$ be such that $\lim_{n \to \infty} x_n = x$. We prove that $x \in \overline{A}$. The convergence implies that for every $\varepsilon > 0$ there is an $N$ such that
$$d(x,x_n) < \varepsilon \quad \text{for all } n > N.$$
Thus $\forall \varepsilon > 0$, $B(x,\varepsilon) \cap A \ne \emptyset$, which yields that $x \in \overline{A}$. (Particularly, if $x \in A' \setminus A \ne \emptyset$, then there exists a sequence $\{x_n\}$ with all distinct terms such that $x_n \to x$.) □
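Part (i) of the proof picks a point $x_n \in A \cap B(x,\tfrac{1}{n})$ for every $n$. As a small worked instance, assume $A = (0,1) \subset (\mathbb{R},d_e)$ and the closure point $x = 1 \notin A$; the choice $x_n = 1 - \tfrac{1}{n+1}$ below is just one convenient selection, and the function name is hypothetical.

```python
from fractions import Fraction


def approximating_sequence(n_terms):
    """Points x_n = 1 - 1/(n+1) of A = (0, 1), n = 1, ..., n_terms.

    Each x_n lies in A and within Euclidean distance 1/n of the closure point x = 1.
    """
    return [1 - Fraction(1, n + 1) for n in range(1, n_terms + 1)]
```

Since $d(1,x_n) = \tfrac{1}{n+1} < \tfrac{1}{n}$, the sequence lies in $A$ and converges to 1, so 1 is a limit point of $A$ in the sense of Definition 3.1 (ii).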
3.4 Corollary. A subset $A$ of a metric space $(X,d)$ is closed if and only if it contains all of its limit points.
Proof.
(i) Let $A$ be closed and let $\{x_n\} \subseteq A$ be a convergent sequence. Then, by Theorem 3.3, $\lim_{n \to \infty} x_n = x \in \overline{A}$. Since $A$ is closed, $A = \overline{A}$ and $x \in A$. Thus, $A$ contains all of its limit points.
(ii) Let $A$ contain all of its limit points. Apply the pick-a-point process. Let $x \in \overline{A}$. Then, by Theorem 3.3, there is a sequence $\{x_n\} \subseteq A$ such that $\lim_{n \to \infty} x_n = x$. By our assumption, $x$ belongs to $A$ or, equivalently, $\overline{A} \subseteq A$, implying that $A = \overline{A}$ and hence $A$ is closed. □
3.5 Definitions.
(i) A subset $A \subseteq (X,d)$ is called dense in $X$ if $\overline{A} = X$. [By Theorem 3.3, $A$ is dense in $X$ if and only if the set of all limit points of $A$ coincides with $X$, or, in other words, if and only if for every $x \in X$ there exists a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$.]
(ii) A set $A \subseteq (X,d)$ is called nowhere dense if its closure has the empty set for its interior, i.e., if $\mathrm{Int}(\mathrm{Cl}(A)) = \emptyset$.
(iii) A point $x \in (X,d)$ is called a boundary point of $A$ if every open ball at $x$ contains points from $A$ and from $A^c$. The set of all boundary points of $A$ is called the boundary of $A$ and is denoted by $\partial A$. [Note that $\partial A = \partial A^c = \overline{A} \cap \overline{A^c}$.] □
3.6 Examples.
(i) Since each irrational number can be represented as the limit of a sequence of rational numbers, $\mathbb{Q}$ is dense in $\mathbb{R}$ (in terms of the Euclidean metric).
(ii) $X$ and $\emptyset$ have no boundary points.
(iii) Let $A = [0,1) \cup \{2\}$. Then $\mathring{A} = (0,1)$, $\overline{A} = [0,1] \cup \{2\}$, $A' = [0,1]$, $\partial A = \{0,1,2\}$ (since $A^c = (-\infty,0) \cup [1,2) \cup (2,\infty)$, $\overline{A^c} = (-\infty,0] \cup [1,\infty)$, and $\overline{A} \cap \overline{A^c} = \{0,1,2\}$).
(iv) Let $A = \{1,5,10\} \subseteq (\mathbb{R},d_e)$. Then $A$ is nowhere dense.
(v) $\{\tfrac{1}{n} : n = 1,2,\ldots\}$ is nowhere dense in $(\mathbb{R},d_e)$. □
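Example 3.6 (i) rests on approximating an irrational number by rationals. A minimal sketch for the particular number $\sqrt{2}$, assuming decimal truncations as the approximating sequence (the function name is hypothetical; `math.isqrt` is the integer square root from the Python standard library):

```python
from fractions import Fraction
from math import isqrt


def rational_approx_sqrt2(n):
    """Rational x_n = floor(sqrt(2) * 10**n) / 10**n, computed exactly.

    isqrt(2 * 10**(2n)) is the integer part of sqrt(2) * 10**n, so x_n is
    the truncation of sqrt(2) to n decimal places.
    """
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)
```

Each $x_n$ is rational, satisfies $x_n^2 < 2$, and lies within $10^{-n}$ of $\sqrt{2}$, so $x_n \to \sqrt{2}$ in $(\mathbb{R},d_e)$; the same truncation idea works for any real number.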
PROBLEMS
3.1 Show that every convergent sequence is a Cauchy sequence. Give an example when the converse is not true.
3.2 Prove that $\overline{A} = \mathring{A} + \partial A$.
3.3 If $x \in \partial A$, must $x$ be an accumulation point?
3.4 Prove that a set $A \subseteq (X,d)$ is nowhere dense in $X$ if and only if the complement of its closure is dense in $X$.
3.5 Assuming that $(\mathbb{R},d_e)$ is complete (a known fact from calculus), prove that $(\mathbb{R}^n,d_e)$ is also complete.
3.6 Show that any Cauchy sequence is bounded.
3.7 Show that in a discrete metric space any convergent sequence has at most finitely many distinct terms.
3.8 Show that any discrete metric space is complete.
3.9 Show that if $\{x_n\} \subseteq (X,d)$ is a Cauchy sequence and $\{x_{n_k}\}$ is a subsequence convergent to a point $a \in X$, then $x_n \to a$.
NEW TERMS:
sequence 74
$N(x,\varepsilon)$-tail 74
convergent sequence 74
limit point of a sequence 74
limit point of a set 74
Cauchy sequence 74
complete metric space 74
bounded sequence 74
dense set 75
nowhere dense set 75
boundary point 76
boundary of a set 76
4. CONTINUOUS MAPPINGS IN METRIC SPACES
4.1 Definition. Let $(X,d)$ and $(Y,\rho)$ be two metric spaces. A function $f\colon (X,d) \to (Y,\rho)$ is called continuous at a point $x_0 \in X$ if for each $\varepsilon > 0$ there is a number $\delta > 0$ such that $\rho(f(x),f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$. The function $f$ is called continuous on $X$, or simply continuous, if $f$ is continuous at every point of $X$. □
4.2 Remark. Since $x_0 \in f^*(\{f(x_0)\})$, we have $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$. However, in general, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. The continuity of the function $f$ at $x_0$ is equivalent to the statement that, for any $\varepsilon > 0$, $x_0$ is indeed an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In other words, $f$ is continuous at $x_0$ if and only if the inverse image under $f$ of any open ball centered at $f(x_0)$ contains $x_0$ as an interior point. (See Figure 4.1.) Consequently, there is an open ball $B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$. In particular, this implies that: 1) such a positive $\delta$ exists, and 2) the image of $B_d(x_0,\delta)$ under $f$ is a subset of $B_\rho(f(x_0),\varepsilon)$, which guarantees that $\rho(f(x),f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$.
Figure 4.1
However, if $f$ is not continuous at $x_0$, as depicted in Figure 4.2 below, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In this case, no ball $B_d(x_0,\delta)$ can be inscribed in $f^*(B_\rho(f(x_0),\varepsilon))$ or, equivalently, no positive $\delta$ exists to warrant $\rho(f(x),f(x_0))$ to be less than $\varepsilon$ for all $x$ with $d(x,x_0) < \delta$. □
Figure 4.2 [axes $X$ and $Y$; the ball $B_\rho(f(x_0),\varepsilon)$ around $f(x_0)$; $x_0$ marked as a non-interior point]
The following theorem is a generalization of the above principles of continuity.
4.3 Theorem. A function $f\colon (X,d) \to (Y,\rho)$ is continuous if and only if the inverse image of any open set in $(Y,\rho)$ under $f$ is open in $(X,d)$.
Proof.
1) As mentioned in Remark 4.2, we begin the proof by showing the validity of the following assertion: $f$ is continuous at $x_0$ if and only if $x_0$ is an interior point of the inverse image under $f$ of any open ball $B_\rho(f(x_0),\varepsilon)$.
Let $x_0$ be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. Then there is an open ball
$$B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$$
and hence (by Problems 3.6 (a) and 2.6 of Chapter 1),
$$f_*(B_d(x_0,\delta)) \subseteq f_*(f^*(B_\rho(f(x_0),\varepsilon))) \subseteq B_\rho(f(x_0),\varepsilon),$$
which yields continuity of $f$ at $x_0$. Now, let $f$ be continuous at $x_0$. Then the inclusion $f_*(B_d(x_0,\delta)) \subseteq B_\rho(f(x_0),\varepsilon)$ holds, which, along with Problem 2.5 (Chapter 1), leads to the following sequence of inclusions:
$$B_d(x_0,\delta) \subseteq f^*(f_*(B_d(x_0,\delta))) \subseteq f^*(B_\rho(f(x_0),\varepsilon)).$$
Because $x_0$ is the center of $B_d(x_0,\delta)$, it is an interior point of this ball and, due to the last inclusion, an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$.
2) Suppose $f$ is continuous on $X$. We show that for each open set $O \subseteq Y$, $f^*(O)$ is open in $(X,d)$. Pick a point $x_0 \in f^*(O)$. Then $f(x_0) \in f_*(f^*(O)) \subseteq O$ and, since $O$ is open, $f(x_0)$ is its interior point. Thus, $O$ is a superset of the open ball $B_\rho(f(x_0),\varepsilon)$, for some $\varepsilon$, and consequently,
$$f^*(B_\rho(f(x_0),\varepsilon)) \subseteq f^*(O). \qquad (4.3)$$
Since $f$ is continuous at $x_0$, by assertion 1), $x_0$ must be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$, and, by (4.3), an interior point of $f^*(O)$. Thus, $f^*(O)$ is open.
3) Let $f^*(O)$ be open in $(X,d)$ for every open subset $O$ of $Y$. Take $x_0 \in X$ and construct an open ball $B_\rho(f(x_0),\varepsilon)$. By our assumption, the set $f^*(B_\rho(f(x_0),\varepsilon))$ is open in $(X,d)$. Since $f(x_0) \in B_\rho(f(x_0),\varepsilon)$, we have that $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$ and, therefore, it is an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. By 1), $f$ must then be continuous at $x_0$. □
4.
Continuous Mappings in Metric Spaces
There is also yet another useful criterion of continuity.

4.4 Theorem. A function $f\colon (X,d) \to (Y,\rho)$ is continuous at $x \in X$ if and only if for every sequence $\{x_n\}$, $d$-convergent to $x$, its image sequence $\{f(x_n)\}$ is $\rho$-convergent to $f(x)$.

We will prove this theorem for a more general case in Chapter 3 (Theorems 4.9 and 4.10).

4.5 Definition. Let $(X,d)$ be a metric space and $\tau(d)$ be the collection of all open subsets of $X$ with respect to metric $d$. Then $\tau(d)$ (or just $\tau$) is said to be the topology on $X$ generated by $d$. □

Theorem 4.3 can now be reformulated as follows.

4.6 Theorem. Let $f\colon (X,d) \to (Y,\rho)$ be a function and let $\tau(d)$ and $\tau(\rho)$ be the topologies generated by metrics $d$ and $\rho$, respectively. Then $f$ is continuous on $X$ if and only if $f^{**}(\tau(\rho)) \subseteq \tau(d)$ [i.e., $\forall O \in \tau(\rho)$, $f^*(O) \in \tau(d)$]. □

4.7 Example. Let $f\colon (\mathbb{R},d) \to (\mathbb{R},d_e)$ be the Dirichlet function defined as $f = \mathbf{1}_{\mathbb{Q}}$, where $\mathbb{Q}$ is the set of rational numbers. If $d = d_e$ is the Euclidean metric, then $f$ is discontinuous at every point. If $d$ is the discrete metric, then, by Theorem 4.3, $f$ is continuous on $\mathbb{R}$, since the inverse image of any open set in $(\mathbb{R},d_e)$ under $f$ is clearly an element of the power set, which coincides with the "discrete topology" generated by the discrete metric (see Example 2.7). □

We will further be interested in the conditions under which two different metrics on $X$ generate one and the same topology. This property defines an equivalence relation on the set of all metrics on $X$ and is hence referred to as equivalence of metrics. In other words, topologies generated by metrics on a carrier induce an equivalence relation.

4.8 Definition. Two metrics $d_1$ and $d_2$ on $X$ are called equivalent if $\tau(d_1) = \tau(d_2)$ (in notation, $d_1 \sim d_2$). □

4.9 Remark. Let $(X,d_1)$ and $(X,d_2)$ be two metric spaces and let $f\colon (X,d_1) \to (X,d_2)$ be the identity function ($f(x) = x$, $x \in X$). If $d_1$ and $d_2$ are equivalent and therefore $\tau(d_1) = \tau(d_2)$, then for every open set $O$ in $(X,d_2)$ (and in $(X,d_1)$), $f^*(O) \in \tau(d_1)$. According to Theorem 4.4, this is equivalent to the statement that

$$\lim_{n\to\infty} d_1(x_n,x) = 0$$

implying that

$$\lim_{n\to\infty} d_2(f(x_n),f(x)) = \lim_{n\to\infty} d_2(x_n,x) = 0.$$

Thus, assuming (i) $\tau(d_1) = \tau(d_2)$, we showed that

(ii) $\lim_{n\to\infty} d_1(x_n,x) = 0 \;\Leftrightarrow\; \lim_{n\to\infty} d_2(x_n,x) = 0$.
By Theorem 4.4, it follows that the converse is also true, i.e., that statement (ii) implies statement (i). Hence, we may call two metrics $d_1$ and $d_2$ on $X$ equivalent if (i) or (ii) holds. □

From Theorem 4.3, it also follows that the identity map above is continuous under equivalent metrics. However, an identity map need not be continuous if $d_1$ and $d_2$ are not equivalent.

4.10 Definitions.

(i) Let $A$ be a subset of a metric space $(X,d)$. The number

$$d(A) = \sup\{d(x,y) : x,y \in A\}$$

(more precisely, a real number or infinity) is called the diameter of $A$. The set $A$ is called $d$-bounded or just bounded if $d(A) < \infty$. In particular, the metric space $(X,d)$ or $d$ is called bounded if $X$ is bounded. $A$ is said to be unbounded if $d(A) = \infty$.

(ii) A subset $A$ of a metric space $(X,d)$ is called totally bounded if for every $\varepsilon > 0$, the set $A$ can be covered by finitely many $\varepsilon$-balls (i.e., balls with common radius $\varepsilon$). □

4.11 Example. According to Problem 1.4, the function

$$\rho(x,y) = \frac{d(x,y)}{1 + d(x,y)}$$

defined on a metric space $(X,d)$ is a metric on $X$. Obviously, $\lim_{n\to\infty} d(x_n,x) = 0$ if and only if $\lim_{n\to\infty} \rho(x_n,x) = 0$ (due to $d = \frac{\rho}{1-\rho}$). Therefore, $d$ and $\rho$ are equivalent. Observe that $\rho$ is clearly bounded while $d$ is arbitrary. □

We finish this section with a short discussion of uniform continuity. This concept will be further developed in Section 6 and Chapter 3.

4.12 Definition. A function $f\colon (X,d) \to (Y,\rho)$ is called uniformly continuous on $X$ if for every $\varepsilon > 0$, there is a positive real number $\delta$ such that $d(x,y) < \delta$ implies that $\rho(f(x),f(y)) < \varepsilon$, for every $x,y \in X$. □
Unlike continuity, uniform continuity guarantees the existence of such a positive $\delta$ (for every fixed $\varepsilon$) for all points of $X$ simultaneously. In the case of usual continuity, $\delta$ depends upon the particular point $x \in X$ where the continuity holds, so that a common $\delta$, good for all points $x \in X$, need not exist. Clearly, uniform continuity implies continuity. Uniform continuity can also be defined on some subset $A$ of $X$, so that in Definition 4.12, $X$ is replaced by $A$.

4.13 Examples.

(i) Consider $f\colon (\mathbb{R},d_e) \to (\mathbb{R},d_e)$ such that $f(x) = x^2$. Then $|x - x_0| < \delta$ implies that

$$|x + x_0| = |x - x_0 + 2x_0| \le |x - x_0| + 2|x_0| < \delta + 2|x_0|$$

and

$$|f(x) - f(x_0)| = |x^2 - x_0^2| = |x - x_0| \cdot |x + x_0| < \delta(\delta + 2|x_0|).$$

Take $\delta(\delta + 2|x_0|)$ as $\varepsilon$. Then $\delta$ can be found explicitly as a function of $\varepsilon$:

$$\delta = \sqrt{x_0^2 + \varepsilon} - |x_0|.$$

Therefore, the function $x^2$ is $d_e$-continuous at every point $x_0 \in \mathbb{R}$. However, $x^2$ is not uniformly continuous on $\mathbb{R}$, since $\delta$ depends upon $x_0$ as well. Specifically, $\delta \to 0$ when $x_0 \to \infty$. Consequently, we cannot find a $\delta > 0$ good for all $x_0$.

(ii) Let $f(x) = x^2$ be given as $f\colon ([0,3],d_e) \to (\mathbb{R},d_e)$. From the last inequality above we derive

$$|f(x) - f(x_0)| < \delta(\delta + 2|x_0|) \le \delta(\delta + 6),$$

and thus $\delta = \sqrt{9 + \varepsilon} - 3$, where $\varepsilon = \delta(\delta + 6)$. Thus $d_e(f(x),f(x_0)) < \varepsilon$ whenever $d_e(x,x_0) < \delta = \sqrt{9 + \varepsilon} - 3$. Since $\delta$ is independent of $x_0$, $f(x)$ is uniformly continuous. Observe that $f$ has been given on a closed and bounded interval, which provides the uniform continuity. However, in this case $f$ would also be uniformly continuous if $f$ were defined on any bounded but not necessarily closed interval, for instance $(0,3)$ (why?).

(iii) A continuous function can be uniformly continuous over unbounded sets, as, for example, the functions $f(x) = \frac{1}{x}$, $x \in [1,\infty)$, and $f(x) = \sin x$, $x \in \mathbb{R}$. □

There is an analytical result, known as the Heine-Borel Theorem, stating that any continuous function defined on a closed and bounded set in any Euclidean metric space is also uniformly continuous. The general form of this result will be discussed in Section 6 (Theorem 6.13).

4.14 Remark. It is known from calculus that the space of all real-valued continuous functions defined on $\mathbb{R}^n$ is closed under the formation of the main algebraic operations. What if functions were defined on an arbitrary space $(X,d)$? We give here some informal discussion on this matter. Let $\mathbb{R}^X$ be the space of all real-valued functions defined on a set $X$ and let $f,g \in \mathbb{R}^X$. Define the following.

(i) $f \pm g$ is the function such that for each point $x \in X$, $(f \pm g)(x) = f(x) \pm g(x)$.

(ii) $fg$ is the function such that $\forall x \in X$, $(fg)(x) = f(x) \cdot g(x)$.

(iii) $+\infty$ and $-\infty$ are not real numbers. Consequently, $f/g$ is the function such that for all $x \in X$, $(f/g)(x) = f(x)/g(x)$, excluding those $x \in X$ for which $g(x) = 0$. At all those values, the function $f/g$ is either undefined or can be specified.

(iv) As a special case, any real-valued function multiplied by a real number is a real-valued function too.

(v) The associative (relative to multiplication) and distributive laws for functions, relative to the addition and multiplication defined in (i) and (ii), are consequences of the corresponding laws for real numbers.

Bearing in mind these observations, we conclude that the space $\mathbb{R}^X$ is a commutative algebra over $\mathbb{R}$ with unity and a vector lattice (that was also mentioned in Example 7.7 (ix), Chapter 1). A subset $\mathcal{C}((X,d);(\mathbb{R},\rho))$ (of $\mathbb{R}^X$) of all continuous functions is a subalgebra characterized by the following properties:

(a) $f,g \in \mathcal{C} \Rightarrow af + bg \in \mathcal{C}$, $\forall a,b \in \mathbb{R}$.

(b) $f,g \in \mathcal{C} \Rightarrow fg \in \mathcal{C}$.
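Returning to Example 4.13, the dependence of $\delta$ on $x_0$ can be observed numerically. The following is only an illustrative sketch: the names `delta_at`, `worst_gap` and the chosen value of `eps` are assumptions made for this illustration, and the grid search is a finite check, not a proof.

```python
import math

def delta_at(x0: float, eps: float) -> float:
    """Positive root of delta * (delta + 2|x0|) = eps, as in Example 4.13 (i)."""
    return math.sqrt(x0 * x0 + eps) - abs(x0)

eps = 0.1  # illustrative choice, not taken from the text

# Pointwise deltas for f(x) = x^2 shrink as |x0| grows,
# so no single delta can serve all of R.
deltas = [delta_at(x0, eps) for x0 in (0.0, 1.0, 10.0, 100.0, 1000.0)]

# On [0, 3] the bound |x0| <= 3 yields one delta valid at every point.
uniform_delta = math.sqrt(9.0 + eps) - 3.0

def worst_gap(delta: float, lo: float, hi: float, steps: int = 3000) -> float:
    """Largest |x^2 - y^2| over grid pairs in [lo, hi] with |x - y| < delta."""
    worst = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        y = min(hi, x + 0.999 * delta)  # stay strictly within distance delta
        worst = max(worst, abs(x * x - y * y))
    return worst

gap_on_0_3 = worst_gap(uniform_delta, 0.0, 3.0)
```

Under these assumptions, `deltas` decreases toward zero while `gap_on_0_3` stays below `eps`, in agreement with parts (i) and (ii) of the example.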
PROBLEMS

4.1 Show that if $A$ is totally bounded then $A$ is bounded. Give an example where a bounded set is not totally bounded.

4.2 Prove that $\mathcal{C}$ is indeed a subalgebra with properties (a) and (b) above.

4.3 Show that a continuous bounded function on a bounded interval need not be uniformly continuous.

In the problems below it is assumed that $f$ and $g$ are functions from $(\mathbb{R},d_e)$ to $(\mathbb{R},d_e)$.

4.4 Let $f\colon ((-\infty,0),d_e) \to ((-\infty,0),d_e)$ be a function given by $f(x) = \frac{1}{x}$. Show that $f$ is continuous. Explain why $f(x)$ is not uniformly continuous.

4.5 Let $f\colon A \to \mathbb{R}$ be a differentiable function such that its derivative $f'$ is bounded over $A$, where $A$ is an arbitrary (bounded or unbounded) interval. Show that $f$ is uniformly continuous on $A$.

4.6 Show that if $f$ and $g$ are uniformly continuous on $\mathbb{R}$ and bounded then $fg$ is uniformly continuous on $\mathbb{R}$ too.

4.7 Which of the following functions are uniformly continuous?
a) $f(x) = \sin^2 x$ ($x \in \mathbb{R}$).
b) $f(x) = x^3 \cos x$ ($x \in \mathbb{R}$).
c) $f(x) = x \sin x$ ($x \in \mathbb{R}$).
d) $f(x) = \ln^2 x$ ($x \in [1,\infty)$).
e) $f(x) = x \ln x$ ($x \in (1,100)$).

4.8 Let $f$ be a continuous function and $g$ a uniformly continuous function on a set $A$ such that $|f| \le |g|$. Is $f$ then uniformly continuous?

4.9 Show that in $(\mathbb{R}^n,d_e)$, any bounded set is also totally bounded.
NEW TERMS:

function continuous at a point
continuous function on a set
inverse image of an open set under f
continuity criteria
topology generated by a metric
Dirichlet function
equivalent metrics
diameter of a set
bounded set
d-bounded set
unbounded set
totally bounded set
uniformly continuous function
algebra of functions
5. COMPLETE METRIC SPACES
In this section we will discuss the completeness of metric spaces as it was introduced in Definition 3.1 (iv).

5.1 Theorem. Let $(X,d)$ be a complete metric space. Then a subspace $(A,d)$ is complete if and only if $A$ is closed.

Proof. Let $A$ be closed and let $\{x_n\} \subseteq A$ be any Cauchy sequence. Since $(X,d)$ is complete, there is a point $x \in X$ such that $\lim_{n\to\infty} x_n = x$. Then, by Corollary 3.4, $x \in A$. Thus, $(A,d)$ is complete. Now, let $(A,d)$ be complete and $\{x_n\}$ be any convergent sequence in $A$. Then this sequence is also a Cauchy sequence and hence $A$ contains its limit. Therefore, $A$ is closed, again by Corollary 3.4. □

The reader should be aware of the differences between the notions of completeness and closedness of a subspace. (See Problem 5.3.)

5.2 Theorem. A metric space $(X,d)$ is complete if and only if every nested sequence $\{C(x_n,r_n)\}$ of closed balls, with $r_n \downarrow 0$ as $n \to \infty$, has a nonempty intersection.

Proof. Because $r_n \downarrow 0$, for any $\varepsilon > 0$, there is an integer $\nu$ such that $r_\nu < \frac{1}{2}\varepsilon$. Given that $k > n > \nu$,

$$C(x_k,r_k) \subseteq C(x_n,r_n) \subseteq C(x_\nu,r_\nu)$$

and, consequently,

$$d(x_k,x_n) \le d(x_k,x_\nu) + d(x_\nu,x_n) \le 2r_\nu < \varepsilon.$$

Therefore, $\{x_n\}$ is a Cauchy sequence.

First assume that $(X,d)$ is complete. Then $\{x_n\}$ converges to a point, say $x \in X$. Since each ball $C(x_n,r_n)$ contains the tail of the sequence $\{x_n\}$ and because it is closed, it must contain $x$. Thus, $\bigcap_{n=1}^{\infty} C(x_n,r_n)$ contains $x$ and hence it is not empty.

Now, let any nested sequence of closed balls have a nonempty intersection and let $\{x_k\}$ be a Cauchy sequence in $X$. By Definition 3.1 (iii), this implies the existence of an increasing sequence $\{\nu_1,\nu_2,\ldots\}$ of indices of $\{x_k\}$ such that for each $n$,

$$d(x_k,x_{\nu_n}) < \frac{1}{2^{n+1}} \quad \text{for all } k \ge \nu_n.$$

We show that the sequence $\{C_n = C(x_{\nu_n},\frac{1}{2^n})\}$ is nested. Indeed, let $y \in C_{n+1}$. Then

$$d(y,x_{\nu_{n+1}}) \le \frac{1}{2^{n+1}} \quad \text{and} \quad d(x_{\nu_n},x_{\nu_{n+1}}) < \frac{1}{2^{n+1}}.$$

Therefore,

$$d(y,x_{\nu_n}) < \frac{1}{2^n},$$

which yields that $y$ is an interior point of $C_n$ and thus $C_n \supseteq C_{n+1}$. Since, by our assumption, the intersection $\bigcap_{n=1}^{\infty} C_n \ne \emptyset$, there is at least one point, say $x$, that belongs to all balls. Furthermore, because the sequence $\{r_n\}$ of their radii is convergent to zero, the subsequence $\{x_{\nu_n}\}$ of their centers must converge to $x \in X$ and thus, by Problem 3.9, $\{x_k\}$ also converges to $x$. □

5.3 Remark. Clearly, in the final phrase of the last theorem, the point $x$ is the unique point of the intersection $\bigcap_{n=1}^{\infty} C_n$. The theorem below is a useful refinement of this statement due to Georg Cantor. Because of its similarity with Theorem 5.2, its proof is suggested as an exercise (Problem 5.8). □

5.4 Theorem (Cantor). Let $(X,d)$ be a complete metric space and let $\{A_n\}\!\downarrow\; \subseteq X$ be a sequence of nonempty closed subsets with

$$\lim_{n\to\infty} d(A_n) = 0.$$

Then $\bigcap_{n=1}^{\infty} A_n$ consists of exactly one element. □
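Cantor's theorem is the principle behind the bisection method on the real line. The sketch below is an illustration only: the function name `nested_intervals`, the choice $f(x) = x^2 - 2$ and the starting interval $[1,2]$ are assumptions made for this example. It builds nested closed intervals whose diameters tend to zero, and their common point is $\sqrt{2}$.

```python
import math

def nested_intervals(f, a, b, n):
    """Return n nested closed intervals [a_k, b_k], each half the previous one,
    every one of which contains a sign change of f (assumes f(a) * f(b) <= 0)."""
    intervals = []
    for _ in range(n):
        intervals.append((a, b))
        m = (a + b) / 2.0
        if f(a) * f(m) <= 0:
            b = m  # the sign change lies in the left half
        else:
            a = m  # the sign change lies in the right half
    return intervals

# Hypothetical example: closed intervals A_1, A_2, ... shrinking to sqrt(2).
ivs = nested_intervals(lambda x: x * x - 2.0, 1.0, 2.0, 30)
diameters = [b - a for a, b in ivs]
root = math.sqrt(2.0)
```

Each interval here is closed and nonempty, the sequence is decreasing, and the diameters halve at every step, so the hypotheses of Theorem 5.4 are met in the complete space $(\mathbb{R},d_e)$.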
5.5 Definition. A function $[X,(Y,d),f]$ is called $d$-bounded if $Y$ is a linear space and there is a nonnegative real number $M$ such that

$$d(f(x),\mathbf{0}(x)) \le M, \quad \forall x \in X,$$

where $\mathbf{0}$ is the function identically equal to $\theta \in Y$ (the origin of $Y$). □

5.6 Examples.

(i) Let $X$ be a nonempty set, $(Y,d)$ a linear metric space, and let $\mathcal{F}^* = \mathcal{F}^*(X;(Y,d))$ be the set of all $d$-bounded functions from $X$ to $Y$. For all $f,g \in \mathcal{F}^*$ define

$$\rho(f,g) = \sup\{d(f(x),g(x)) : x \in X\}.$$

It can be shown (Problem 5.4) that $\rho$ is a metric on $\mathcal{F}^*$, called a uniform (or supremum) metric. Consequently, the convergence in $(\mathcal{F}^*,\rho)$ is called uniform convergence. A subset of functions $\mathcal{F} \subseteq \mathcal{F}^*$ is said to be uniformly bounded on $X$ if $\mathcal{F}$ is $\rho$-bounded, i.e., $\operatorname{diam}\mathcal{F} \le M$ (a positive real number).

We show that any Cauchy sequence in $(\mathcal{F}^*,\rho)$ is uniformly bounded. We will make use of Problem 5.5. Let $\{f_n\}$ be a Cauchy sequence in $(\mathcal{F}^*,\rho)$. Therefore, for $\varepsilon = 1$, there is an $N = N(1)$ such that $\rho(f_n,f_k) < 1$, $n,k \ge N$. Let $k = N(1)$. Then,

$$\rho(f_n,\mathbf{0}) \le \rho(f_n,f_N) + \rho(f_N,\mathbf{0}) < 1 + M(f_N),$$

where $M(f_N)$ is a "$\rho$-bound" of the function $f_N$. If $M(f_i)$ is a bound of $f_i$, then $M$, defined as $\max\{M(f_1),\ldots,M(f_{N-1}),1 + M(f_N)\}$, $\rho$-dominates the whole sequence $\{f_n\}$. By Problem 5.5, we have that $\{f_n\}$ is $\rho$-bounded.

(ii) Assume that $(Y,d)$ is a complete linear metric space. Let us show then that $(\mathcal{F}^*,\rho)$ is complete too. Consider a Cauchy sequence $\{f_n\} \subseteq (\mathcal{F}^*,\rho)$. It is obvious that for each fixed $x \in X$, the sequence $\{f_n(x)\}$ is also Cauchy in $(Y,d)$. Since $(Y,d)$ is by our assumption complete, the "pointwise limit" of $\{f_n\}$ exists. Denote it by $f$. In other words,

$$\lim_{n\to\infty} d(f_n(x),f(x)) = 0, \quad \forall x \in X.$$

We need to show that $f \in (\mathcal{F}^*,\rho)$. Since $\{f_n\}$ is a Cauchy sequence, according to (i) it is uniformly bounded by a real number $M$. Thus we have

$$d(f(x),\mathbf{0}(x)) \le d(f(x),f_n(x)) + d(f_n(x),\mathbf{0}(x)) \le d(f(x),f_n(x)) + \rho(f_n,\mathbf{0}),$$

i.e.,

$$d(f(x),\mathbf{0}(x)) \le d(f(x),f_n(x)) + M.$$

The last inequality holds for every $x \in X$; letting $n \to \infty$ yields

$$d(f(x),\mathbf{0}(x)) \le M, \quad \text{for all } x \in X.$$

Consequently, $\rho(f,\mathbf{0}) \le M$ and hence $f \in (\mathcal{F}^*,\rho)$. We only showed that $f_n(x) \to f(x)$ in $d$, for each $x \in X$, and that $f \in \mathcal{F}^*$. The assertion $f_n \to f$ in $\rho$ is subject to Problem 5.6. □
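The supremum metric of Example 5.6 (i) is easy to compute when $X$ is finite. The following minimal sketch rests on assumptions chosen for illustration: $X$ is a finite grid in $[0,1]$, $(Y,d) = (\mathbb{R},d_e)$, and the family $f_n(x) = x^n/n$ and the names `sup_metric`, `f`, `zero` are hypothetical.

```python
def sup_metric(f, g, points):
    """rho(f, g) = sup{ |f(x) - g(x)| : x in points } for a finite set of points."""
    return max(abs(f(x) - g(x)) for x in points)

# Assumed carrier: a finite grid X = {0, 0.01, ..., 1}.
X = [k / 100 for k in range(101)]

def zero(x):
    """The function identically equal to the origin of Y = R."""
    return 0.0

def f(n):
    """Hypothetical family f_n(x) = x**n / n, bounded on X."""
    return lambda x: x ** n / n

# rho(f_n, 0) = 1/n on this grid, so f_n converges to 0 uniformly.
dists = [sup_metric(f(n), zero, X) for n in (1, 2, 5, 10)]

# One instance of the triangle inequality required in Problem 5.4.
lhs = sup_metric(f(1), f(3), X)
rhs = sup_metric(f(1), f(2), X) + sup_metric(f(2), f(3), X)
```

On a finite carrier the supremum is a maximum, which is why `max` suffices here; for an infinite $X$ this computation would only give a lower estimate of $\rho$.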
PROBLEMS

5.1 Using similar arguments as in Example 5.6, show that the limit of any uniformly convergent sequence of continuous bounded functions from $(X,d_0)$ to $(Y,d)$ is a bounded and continuous function.

5.2 Let $\{C_n\}$ be a sequence of closed balls in $(\mathbb{R}^n,d_e)$ such that each of the balls $C_n$ is centered at a point $x_0 \in \mathbb{R}^n$ and has radius $\frac{1}{n}$, $n = 1,2,\ldots$. Find $\bigcap_{n=1}^{\infty} C_n$.

5.3 Show that if a metric space $(X,d)$ is not complete then a closed subspace $(A,d)$ need not be complete either. [Hint: Consider the metric space in Problems 2.9 and 2.10.]

5.4 Show that $\rho$, defined in Example 5.6 (i), is a metric on $\mathcal{F}^*$.

5.5 Let $\mathcal{F} \subseteq \mathcal{F}^*(X;(Y,d))$, where $Y$ is a linear space. Prove that $\mathcal{F}$ is $\rho$-bounded if and only if there is a positive constant $M$ such that for all $f \in \mathcal{F}$, $\rho(f,\mathbf{0}) \le M$.

5.6 Show that in Example 5.6 (ii) $f_n \to f$ in $\rho$.

5.7 We can make use of the fact that the Euclidean and uniform metrics are equivalent to show completeness of $(\mathbb{R}^n,d_e)$. For $n = 1$, it is well known from calculus. Prove completeness of $(\mathbb{R}^n,d_e)$ for an arbitrary $n$. (See Problem 4.9.)

5.8 Prove Cantor's Theorem 5.4.

5.9 Let $(X,d)$ be a metric space. A subset $A \subseteq X$ is said to be of the first category if it can be represented as a countable union of nowhere dense sets. Otherwise, $A$ is of the second category. Prove Baire's Category Theorem: A complete metric space is of the second category.
NEW TERMS:

completeness criteria
Cantor's Theorem on intersection of closed sets
d-bounded function
bounded function
uniform metric
supremum metric
uniform convergence
uniformly bounded set of functions
ρ-bound of a function
bound of a function
pointwise limit
Baire's Category Theorem
6. COMPACTNESS
Compactness is one of the kernel concepts in real analysis. We develop it in the present section for metric spaces and then in Chapter 3 for general topological spaces. It stems from the fact known in $\mathbb{R}$ that every bounded sequence has a convergent subsequence, which implies that any sequence in a closed bounded interval has a subsequence convergent to a point in this interval. In a general metric space, a subset $A$ in which every sequence has a subsequence convergent to a point in $A$ is called sequentially compact or compact. Although compactness and sequential compactness are distinct notions in general topological spaces (and they are defined differently), they are equivalent in metric spaces, as Theorem 6.3 states. Continuous functions defined on compact sets are uniformly continuous; continuous images of compact sets are compact (hence, closed and bounded) and this means that in normed linear spaces continuous functions on compa [...]

6.13 Theorem. [...] $> 0$, there is a $\delta_x > 0$ such that

$$\rho(f(x),f(y)) < \frac{\varepsilon}{2}$$

for all $y$ with $d(x,y) < \delta_x$. Since $X$ is compact, after reduction, there is an $n$-tuple of open balls $B(x_1,\delta_{x_1}/2),\ldots,B(x_n,\delta_{x_n}/2)$ such that

$$X \subseteq \bigcup_{i=1}^{n} B(x_i,\delta_{x_i}/2).$$

Let $\delta = \frac{1}{2}\min\{\delta_{x_1},\ldots,\delta_{x_n}\}$ and let $x,y$ be such that $d(x,y) < \delta$. Then $x \in B(x_i,\delta_{x_i}/2)$ implies that

$$d(x,x_i) < \frac{\delta_{x_i}}{2}$$

and

$$d(y,x_i) \le d(y,x) + d(x,x_i) < \delta + \frac{\delta_{x_i}}{2} \le \delta_{x_i}.$$

Thus, $y$ belongs to the ball $B(x_i,\delta_{x_i})$. Since $y$ and $x_i$ are within the distance of $\delta_{x_i}$, due to continuity of $f$ at $x_i$, given $\varepsilon$,

$$\rho(f(x_i),f(y)) < \frac{\varepsilon}{2}.$$

Obviously, $d(x,x_i) < \delta_{x_i}$ yields $\rho(f(x_i),f(x)) < \frac{\varepsilon}{2}$ and, therefore,

$$\rho(f(x),f(y)) \le \rho(f(x_i),f(x)) + \rho(f(x_i),f(y)) < \varepsilon.$$ □

6.14 Theorem. A metric space $(X,d)$ is compact if and only if it is complete and totally bounded.

Proof.

1) Let $(X,d)$ be compact. Then, by Problem 6.6, it is complete. Since $X \subseteq \bigcup_{x \in X} B(x,\varepsilon)$ for any $\varepsilon > 0$, by compactness, the cover $\{B(x,\varepsilon) : x \in X\}$ can be reduced to a finite subcover, which implies total boundedness.

2) Let $(X,d)$ be complete and totally bounded. We will show that $(X,d)$ is sequentially compact, which, by Theorem 6.3, would imply compactness. Let $\{x_n\}$ be a sequence in $X$. We will construct a Cauchy subsequence. Since $X$ is totally bounded, it can be covered by finitely many open balls of radius 1. Then at least one of the balls, for instance $B_1$, contains infinitely many terms, say $\{x_n^1\}$, of this sequence. Furthermore, cover $X$ by balls of radius $\frac{1}{2}$; again an infinite subsequence $\{x_n^2\} \subseteq \{x_n^1\}$ (since $B_1$ will also be covered) is contained in one of the balls, which we label $B_2$, and so on. The desired Cauchy sequence is formed by the selection of the first term from each subsequence. Indeed, by the construction, $x_1^1$ and $x_1^2$ belong to ball $B_1$. Thus, $d(x_1^1,x_1^2) < 1$. Likewise, $x_1^2$ and $x_1^3$ belong to ball $B_2$, which implies that $d(x_1^2,x_1^3) < \frac{1}{2}$, and so on. Since $(X,d)$ is complete, this Cauchy sequence is convergent, yielding sequential compactness of $(X,d)$. □
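Total boundedness, used in part 2) of the proof, can be illustrated on a finite sample of a bounded set. The sketch below is an assumption-laden illustration, not part of the proof: the greedy selection `greedy_eps_net`, the random sample of the unit square and the value of `eps` are all chosen for this example.

```python
import math
import random

def greedy_eps_net(points, eps):
    """Pick centers greedily so that every point lies within eps of some center.
    Centers are pairwise at least eps apart by construction."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)  # p is not yet covered, so it becomes a center
    return centers

random.seed(0)  # fixed seed so the sample is reproducible
sample = [(random.random(), random.random()) for _ in range(2000)]
eps = 0.25  # illustrative radius
net = greedy_eps_net(sample, eps)

# Every sample point lies in an open eps-ball around some center.
covered = all(any(math.dist(p, c) < eps for c in net) for p in sample)

# Packing bound: disjoint balls of radius eps/2 around the centers
# fit inside a square of side 1 + eps.
packing_bound = (1 + eps) ** 2 / (math.pi * (eps / 2) ** 2)
```

The number of centers stays below `packing_bound` no matter how large the sample is, which reflects the fact that a bounded subset of $(\mathbb{R}^2,d_e)$ is covered by finitely many $\varepsilon$-balls (compare Problem 4.9).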
PROBLEMS

6.1 Show that if $\{x_k\} \subseteq (\mathbb{R}^n,d_e)$ with $d(x_k,0) \le 3$, then $\{x_k\}$ has a convergent subsequence.

6.2 Define, $\forall A,B \subseteq (X,d)$, $d(A,B) = \inf\{d(a,b) : a \in A,\ b \in B\}$. Let $A$ be compact. Show that $\forall B \subseteq X$, there is an $x \in A$ such that $d(x,B) = d(A,B)$. [Hint: Use the fact that $A$ is sequentially compact.]

6.3 Let $A,B \subseteq (X,d)$ such that $A$ is compact and $B$ is closed. If $A \cap B = \emptyset$, show that $d(A,B) > 0$.

6.4 Let $A \subseteq (X,d)$. Show that if $A$ is totally bounded then $\overline{A}$ is also totally bounded.

6.5 Generalize Theorem 6.6: Any Lindelöf metric space is separable.

6.6 Show that sequential compactness of a subspace implies its completeness.

6.7 Prove Theorem 6.2.

6.8 Prove Theorem 6.3.
NEW TERMS:

cover
subcover
open cover
open subcover
compact set
compact metric space
Lindelöf set
Lindelöf space
compactness, criteria of
Bolzano-Weierstrass compactness
sequential compactness
separable metric space
Heine-Borel Theorem
compact set under a continuous function
uniform continuity criterion in compact space
7. LINEAR AND NORMED LINEAR SPACES
We have already mentioned that the Euclidean metric defines the length of a vector in $n$-dimensional Euclidean vector (linear) space. The following generalizes the notion of vector length in a linear space and reconciles it with the notion of a special metric defined on a linear space (initially discussed in Section 5).

7.1 Definition. Let $(X,d)$ be a metric space such that $X$ is a linear space over $\mathbb{R}$ or $\mathbb{C}$. The metric $d$ is said to be:

a) translation invariant if for all $a,x,y \in X$, $d(x + a,y + a) = d(x,y)$.

b) homothetic if for all $\alpha \in \mathbb{F}$ and $x,y \in X$, $d(\alpha x,\alpha y) = |\alpha|\, d(x,y)$.

If $d$ is translation invariant and homothetic we will abbreviate it by TIH. □

If $d$ is a metric on a linear space $X$, then we are able to measure the length of vectors, and thus compare them, by taking the distance from any point $x \in X$ to one fixed point of $X$, the origin. If, in addition, $d$ is TIH then we can use the properties of $X$ as a linear space and, in some particular cases, even employ the geometry, thereby replicating the Euclidean space and preserving the generality needed in applications.

7.2 Definition. Let $d$ be a TIH metric on a linear space $X$, with the origin $\theta$, over $\mathbb{F}$ (assuming that $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$). Then for all $x \in X$, we call the distance $d(x,\theta)$ the norm of the vector $x$ and denote it by $\|x\|$. We will also call $\|\cdot\|$ the norm on $X$ induced by the TIH metric $d$. The pair $(X,\|\cdot\|)$ is called a normed linear space (NLS). □

7.3 Theorem. Let $\|\cdot\|$ be a norm on $X$ in Definition 7.2. Then the following properties of $\|\cdot\|$ hold true:

(i) $\|x\| = 0 \Leftrightarrow x = \theta$.

(ii) $\|\alpha x\| = |\alpha|\, \|x\|$, $\forall \alpha \in \mathbb{F}$, $\forall x \in X$.

(iii) $\|x + y\| \le \|x\| + \|y\|$, $\forall x,y \in X$.

Proof. Property (i) is obvious.

(ii) $\|\alpha x\| = d(\alpha x,\theta) = d(\alpha x,\alpha\theta) = |\alpha|\, d(x,\theta) = |\alpha|\, \|x\|$.

(iii) $\|x + y\| = d(x + y,\theta) = d(x,-y) \le d(x,\theta) + d(\theta,-y) = \|x\| + |-1|\, \|y\| = \|x\| + \|y\|$. □

Conversely, if $\|\cdot\|$ is a real-valued nonnegative function defined on a linear space $X$ and has properties (i)-(iii) of Theorem 7.3, then $\|\cdot\|$ generates a TIH metric on $X$ by setting

$$d(x,y) = \|x - y\|$$

(show it, see Problem 7.10). If $d$ in Definition 7.2 is a TIH pseudometric then the function $\|\cdot\|$ is called a semi-norm and, correspondingly, the pair $(X,\|\cdot\|)$ is called a semi-normed linear space (SNLS).

It is easy to show that the Euclidean metric $d_e$ on $\mathbb{R}^n$ is TIH. The associated norm induced by $d_e$ is called the Euclidean norm and it will be denoted $\|\cdot\|_e$. A very important class of NLS's is introduced below.

7.4 Definition. An NLS is called a Banach space if it is complete with respect to the metric induced by the norm (or the norm induced by a TIH metric). □
I I
(i)
I xI
The NLS (lR", ) over the field lR with is a Banach space with the Euclidean norm (see Problem 7 . 1 ) . ( ii) The NLS l P over the field C with the norm ·
e
I xI P = [ I: := I x n I PJ!P is a Banach space. Observe that I I P indeed defines a norm (called the l P norm) . (See Problem 7.5. ) Now let {x(n ) } be a Cauchy sequence. Then this sequence is uniformly bounded (show it in Problem 7.6), say, by some M E lR + . Let x = (x 1 , x2, ) be the pointwise limit of the sequence { x ( n ) } . This limit exists, since each xi is the limit of the ith-component sequence in (C,d e ) which is complete. We need to show that x is an element of lP, i.e. I x I P < and that [P X X ( n ) -+ (i.e. {x ( n ) } converges to x in l P norm). We have [ ktl l x k I pJ / p = [ ktl l xk - x�n) + x �n) I pl / p (by Minkowski 's inequality with ak = xk - x �n ) and bk = x �" ) ) < [ J:l I x k - x�n) I PJ/P + [ i:l I x �n) I PJ / p < [ ktl l xk - x�n) l pJ / P + l x ( n) I P < [ ktl l xk - x�n ) l pJ / P +M. 1
·
• • •
oo
Now, letting
n
-+
oo ,
we have
CHAPTER 2 . ANALYSIS
102
OF
METRIC SPACES
[ kt=1 I xk I P]1 / P < M, which holds for all 1, 2, . . . . Hence, we have II x II < M . Show th at fP norm (Problem 7. 7). Thus, fP is completeP and therefore is xa (Banach n ) --+ x inspace. (iii) Let GJ ( n ) be the space of all bounded real-valued functions on n valued in ( IR, d e ) or (C,d e ) · One can show that GJ is a linear space. The r =
*
*
norm II f I I u = sup { I f (w) I : w E n} is called the supre mum n o rm. GJ is a Banach space with respect to this norm (see Problem 7.4) . iv) Consider e [a , b ] as the space of all n-times differentiable real valued functions on a compact interval [ ]. It is easily seen that e ra ' b ] is a linear space. We introduce the following norm in e ra , b ] : *
(
a, b
Clearly, II . II E is a norm in e ra , b ] • We show that e ra , b ] is a Banach space under this norm. Let { f k } be a I I II E-Cauchy sequence. Then, for every £ > 0, there is a positive integer N such that \lk,j > N, ·
which implies
II f ( i ) j
i ) II u - sup { I t ( i ) - t kc i ) I } N (since G:? Generalize Theorem 5.5 for the case of Tychonov's topology.
CHAPTER 3. ELEMENTS OF POINT SET TOPOLOGY

NEW TERMS:

product topology for finitely many factor spaces
base parallelepiped
subbase parallelepiped
continuity of projection maps
product topology for arbitrarily many factor spaces
box topology
box parallelepiped
Tychonov topology
weak topology (generated by a family of functions)
topology of pointwise convergence
pointwise convergence, criterion of
open map

6. NOTES ON SUBSPACES AND COMPACTNESS
It has been mentioned that subspaces of topological spaces (i.e., relative topologies) inherit certain qualities of the original spaces. In this section we consider this notion more systematically. We will be concerned with such topological properties as separability, countability, and compactness and their effect on subspaces.

6.1 Definition. A property of a space is referred to as hereditary if every subspace has this property. A property is said to be weakly hereditary if it is inherited by a subspace whose carrier is closed in the original space. A property is vaguely hereditary if it is inherited by a subspace whose carrier is open in the original space. [The last notion is restricted to use in this textbook.] □

6.2 Example. Second countability is hereditary. (See Problem 3.10.) □

6.3 Remark. In Section 1 we denoted by $\overline{A}$ the closure of some subset $A$ of a topological space $(X,\tau)$, understanding that this is the closure relative to the topology $\tau$. As was mentioned in Definition 1.8 (i), in the case of subspaces we may need to deal with closures of subsets with respect to any relative topology, say $(Y,\tau_Y)$. To make the distinction clear we will then write $\mathrm{Cl}_Y A$. However, we will still use $\overline{A}$, having in mind the closure relative to the original space $(X,\tau)$. □

6.4 Example. The property of density of a set is not hereditary and not weakly hereditary, i.e., if $D$ is dense in $(X,\tau)$, its trace in a subspace $(Y,\tau_Y)$ need not be dense. Let $(X,\tau) = (\mathbb{R},\tau_e)$ and $Y = \mathbb{R}_+ \cup \{-\sqrt{2}\}$. Then, obviously, the set $\mathbb{Q}_+ = \mathbb{Q} \cap Y$ is not dense in $(Y,\tau_Y)$. It is easily seen that $\{-\sqrt{2}\}$ is an open neighborhood of the point $-\sqrt{2}$ that does not meet $\mathbb{Q}_+$. Thus $\mathrm{Cl}_Y \mathbb{Q}_+ \ne Y$. Since $Y$ is closed in $(\mathbb{R},\tau_e)$, the density property is not weakly hereditary either. □

6.5 Theorem. Separability is vaguely hereditary, but not (weakly) hereditary.

Proof.

(i) Let $(X,\tau)$ be separable and let $(Y,\tau_Y)$ be a subspace of $(X,\tau)$ such that $Y \in \tau$. We show that $(Y,\tau_Y)$ is separable. Let $D$ be a countable, dense set in $(X,\tau)$. We need to prove that $\mathrm{Cl}_Y(D \cap Y) = Y$; specifically, we need to show that $Y \subseteq \mathrm{Cl}_Y(D \cap Y)$, for the inverse inclusion holds trivially. Let $y$ be any point of $Y$ and let $U_y'$ be any open neighborhood of $y$ in $\tau_Y$. Since $Y$ is open in $X$, $U_y'$ is also a neighborhood of the point $y$ in $\tau$. [It is easy to show the following. Since $U_y'$ is a neighborhood of $y$ in $\tau_Y$, there is $O_y' \subseteq U_y'$ which is $\tau_Y$-open. But $O_y' = O_y \cap Y$, where $O_y \in \tau$. Since $Y$ is $\tau$-open, it follows that $O_y'$ is also $\tau$-open and, clearly, $U_y' \in \mathcal{U}_y$ in $\tau$.] Therefore, $U_y'$ meets $D$ and, consequently, $U_y'$ meets $D \cap Y$ (as a subset of $Y$). Observe that if $Y$ is not open in $X$, $U_y'$ need not be a neighborhood of $y$ in $X$. [For instance, let $Y = (0,2]$ and $U_2 = (1,2]$. Clearly, $U_2$ is not a neighborhood of 2 in $(\mathbb{R},\tau_e)$ but it is a neighborhood of 2 in $(Y,\tau_Y)$.]

(ii) As a counterexample to separability as a hereditary property, we consider the topological space $(X,\tau)$ known as the Moore plane. Let $X = \mathbb{R} \times [0,\infty)$ (the upper semi-plane and the horizontal axis). The topology on $X$ is described by the following base sets. At each point $(x,y) \in \mathbb{R} \times (0,\infty)$, the neighborhood base is the collection of all open balls $\{B((x,y),r) : r < y\}$, where $B(z,r) = \{z' \in X : d_e(z,z') < r\}$. At each point $(x,0)$, the neighborhood base consists of all open balls touching the horizontal axis at $(x,0)$, with the point $(x,0)$ attached to these balls. Take the union of all neighborhood bases and construct a base for the Moore plane topology in light of Theorem 2.2. This topological space is separable with the dense countable subset $D = \mathbb{Q}^2 \cap X$. Indeed, let $(x,y) \in \mathbb{R} \times (0,\infty)$. Then any neighborhood of $(x,y)$ contains points with rational coordinates (a property inherited from the Euclidean space). As for the points $(x,0)$, any open ball bordering $(x,0)$ also contains points with rational coordinates. Now, for a subspace of the Moore plane, consider the one with the horizontal axis as the carrier $Y$. Clearly, all singletons are traces of base neighborhoods at $(x,0)$ in $Y$, yielding the discrete topology as the relative topology on $Y$. According to Problem 6.2, any discrete topology with an uncountable carrier is not separable. Observe that $Y^c$ is obviously open in $X$. Hence separability is not weakly hereditary. □

6.6 Definition. A subset $A$ of a topological space $(X,\tau)$ is said to be compact (Lindelöf) if every open cover of $A$ contains a finite (at most countable) subcover. We also say that $A$ is finitely (countably) reducible. Specifically, if $X$ is compact (Lindelöf), $(X,\tau)$ is called a compact topological space (Lindelöf space). □
6.7 Example. Compactness in metrizable topological spaces obviously coincides with that for the corresponding metric spaces. In this case, we may use the tools and criteria of compactness for metric spaces. For instance, the interval $[a,b]$ for $a,b \in \mathbb{R}^n$ is compact in the sense of the Euclidean metric; therefore, it is compact in $(\mathbb{R}^n,\tau_e)$, while $(a,b)$ is not compact in $(\mathbb{R}^n,\tau_e)$ since it is not closed. □

6.8 Theorem. Let $f\colon (X,\tau) \to (Y,\tau')$ be a continuous function. Then the image of any compact subset of $X$ is compact.

One can use the same method of proof of Theorem 6.8 as that of Theorem 6.10, Chapter 2. □

6.9 Theorem. Compactness is weakly hereditary (i.e., a closed subset of a compact topological space is compact).

Proof. Let $(X,\tau)$ be compact and let $B \subseteq X$ be closed. Let $\{O_i : i \in I\}$ be an open cover of $B$. Since $B^c$ is open, $\{B^c, O_i : i \in I\}$ is an open cover of $X$. Since $X$ is compact, there exists a finite subcover of $X$, say $\{B^c,O_1,\ldots,O_n\}$, which also covers $B$. Since $B^c$ contains no point of $B$, the sets $O_1,\ldots,O_n$ alone cover $B$. Hence, $B$ is compact. □

Hausdorff topological spaces possess an important property with respect to compactness.

6.10 Theorem. Every compact subset of a Hausdorff space is closed.

Proof. Let $A$ be a compact subset of the Hausdorff space $(X,\tau)$. We show that $A^c$ is open. Take $x \in A^c$. The family of neighborhoods of all points $y \in A$ covers $A$. We extract a particular subfamily of these neighborhoods. Since $(X,\tau)$ is Hausdorff, for each $y \in A$, there is a neighborhood $U_x(y)$ of $x$ and a neighborhood $V_y(x)$ of $y$ such that $U_x(y) \cap V_y(x) = \emptyset$. Without loss of generality we may assume that the family $\{V_y(x) : y \in A\}$ is an open cover of $A$. (Otherwise, for each $y \in A$ we can select open subsets $O_y(x) \subseteq V_y(x)$ such that $y \in O_y(x)$.) Since $A$ is compact, there exists a finite open subcover $\{V_{y_k}(x) : k = 1,\ldots,n\}$ of $A$. Obviously, $V_{y_k}(x) \cap U_x(y_k) = \emptyset$. Select $\{O_x(y_k) \subseteq U_x(y_k),\ k = 1,\ldots,n\}$, whose intersection (denoted by $O_x$), being finite, is open and nonempty. Therefore, $O_x$ is an open neighborhood of $x \in A^c$ with $O_x \cap A = \emptyset$, which means that $x$ is an interior point of $A^c$. Thus, $A^c$ is open or, equivalently, $A$ is closed. □

6.11 Remark. In Theorem 6.3, Chapter 2, we stated and proved (Problem 6.8) the equivalence of the conditions: (i) $A \subseteq (X,d)$ is compact; (ii) every infinite subset of $A$ has an accumulation point in $A$ (Bolzano-Weierstrass compactness); (iii) every sequence in $A$ has a convergent subsequence (sequential compactness). This equivalence does not hold for topological spaces, where (i) and (iii) are in general distinct properties, and compactness just implies Bolzano-Weierstrass compactness, as the reader will prove (see Problem 6.6). □

Recall that second countable spaces are first countable and separable (see Problem 3.6). In addition, they are Lindelöf spaces, as the following theorem asserts.
6.12 Theorem. Any second countable topological space is Lindelöf.

Proof. Let $\mathcal{B}$ be a countable base of a topological space $(X,\tau)$ and let $\mathcal{O} = \{O_i : i \in I\}$ be an arbitrary open cover of $X$. Let $x \in X$. Then $x$ belongs to some $O_x \in \mathcal{O}$. Since $\mathcal{B}$ is a base for $\tau$, by Theorem 2.2, there is a neighborhood base $\mathcal{B}_x \subseteq \mathcal{B}$. Then there is a base neighborhood $B_x \in \mathcal{B}_x$ such that $B_x \subseteq O_x$. The collection of all distinct $B_x$'s for all $x \in X$ is at most countable. On the other hand, this collection obviously covers $X$. Consequently, the collection of open supersets $\{O_x\}$, one associated with each distinct $B_x$, is also countable and it covers $X$. Thus $(X,\tau)$ is indeed Lindelöf. □
The result below is in the spirit of Theorem 4.8, where, for continuity of a function $f$ from $(X,\tau)$ to $(Y,\tau(\mathcal{S}))$, it was sufficient to verify that $f^*(\mathcal{S}) \subseteq \tau$. Here we claim that, if $\mathcal{S}$ is a subbase for $\tau$, then $X$ is compact whenever every cover of $X$ by elements from $\mathcal{S}$ can be reduced to a finite subcover.

6.13 Theorem (Alexander Subbase Theorem). Let $(X,\tau)$ be a topological space and let $\mathcal{S}$ be a subbase for $\tau$. If every open cover of $X$ by elements of $\mathcal{S}$ is finitely reducible, i.e. if every such cover can be reduced to a finite subcover, then $X$ is compact.

Proof. We prove the equivalent statement: If $X$ is not compact, then there exists an open cover by elements of $\mathcal{S}$ that is not finitely reducible. Assume that $X$ is not compact. We will prove this assertion in four steps, which we outline as follows:

(i) Let $\mathfrak{O}$ be the collection of all open covers of $X$ that cannot be reduced to finite subcovers. We will show that $\mathfrak{O}$ has a maximal element; call it $\mathcal{M}$.

(ii) We will show that for every $x \in X$, there is an open set $M_x \in \mathcal{M}$ and a finite tuple of open sets $\{S_1(x), \dots, S_n(x)\} \subseteq \mathcal{S}$ such that $x \in S_1(x) \cap \dots \cap S_n(x) \subseteq M_x$.

(iii) We will show that at least one of the sets $\{S_1(x), \dots, S_n(x)\}$, denote it $S(x)$, belongs to $\mathcal{M}$.

(iv) We will recognize that for each $x \in X$, $S(x) \in \mathcal{S}$ and $S(x) \in \mathcal{M}$. In particular, the latter will imply that $\{S(x) : x \in X\}$ is an open cover of $X$ which is not finitely reducible. On the other hand, since we will have $\{S(x) : x \in X\} \subseteq \mathcal{S}$, the proof will be complete.

We now treat each of the above steps in detail.

Step (i): Since $X$ is not compact, $\mathfrak{O}$ is not empty. Introduce on $\mathfrak{O}$ the partial order relation in terms of inclusion. (In other words, two open covers $\mathcal{C}_1$ and $\mathcal{C}_2$ of $X$ from $\mathfrak{O}$ are related as $\mathcal{C}_1 \subseteq \mathcal{C}_2$ if and only if $\mathcal{C}_1$ is a subcover of the cover $\mathcal{C}_2$.) Let $\mathfrak{C} \subseteq \mathfrak{O}$ be any chain, and let $\mathcal{U}$ be the union of all elements of $\mathfrak{C}$. Clearly, $\mathcal{U}$ is a cover of $X$ that cannot be reduced to a finite subcover, because any finite subfamily of $\mathcal{U}$ lies in a single element of the chain. Thus, $\mathcal{U}$ must belong to $\mathfrak{O}$, and $\mathcal{U}$ is an upper bound of $\mathfrak{C}$. By Zorn's Lemma 4.13, Chapter 1, there is at least one maximal element in $\mathfrak{O}$; denote it by $\mathcal{M}$.

Step (ii): Let $x \in X$ and let $M_x \in \mathcal{M}$ be such that $x$ is an interior point of $M_x$ (which exists, for $\mathcal{M} \in \mathfrak{O}$ is an open cover). On the other hand, by Theorem 2.7, the collection of all finite intersections of elements of the subbase $\mathcal{S}$ is a base ...

7.4 ... 1) Choose $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon > 2\delta_1 + \delta_2$.
2) Cover $\mathfrak{F}$ by balls $B_\rho(f_i, \delta_1)$, $i = 1, \dots, n$ [call the $n$-tuple $\{f_1, \dots, f_n\}$ a $\delta_1$-net].
3) Use continuity of each $f_i$ at $x_0$ in the form of Problem 4.7 b): for each $\delta_2 > 0$, there is a neighborhood $U^{(i)}_{x_0}$ of $x_0$ with ... $x \in U^{(i)}_{x_0}$, $i = 1, \dots, n$.
4) Choose a neighborhood $U_{x_0}$, good for all $f_i$'s.
5) Let $f$ be any function in $\mathfrak{F}$; thus $f$ falls into one of the balls in 2), say $B_\rho(f_i, \delta_1)$.
6) Use the estimate ... where the first term of the right-hand side of the inequality is less than $\delta_1$ (why?), and the second term is dominated by ... (The estimate needed then follows.)]

7.5 Prove Lemma 7.10. [Hint: Choose $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon > 2\delta_1 + \delta_2$. Show that there exists an $\varepsilon$-net $\{f_1, \dots, f_N\} \subseteq \mathfrak{F}$. Use the steps that follow.
1) Use equicontinuity of $\mathfrak{F}$ and compactness of $X$ to show that, for every $\delta_1 > 0$, there is a finite open cover (by neighborhoods) $\{U_{x_1}(\delta_1), \dots, U_{x_n}(\delta_1)\}$ of $X$, such that for any $f \in \mathfrak{F}$ and for any $y$ that falls into a neighborhood $U_{x_i}(\delta_1)$, ...
2) Cover $Y$ by a finite collection $\{B(j)\}$ of $d$-balls, such that $B(j) = B_d(y_j, \delta_2)$, $j = 1, \dots, m$.
3) Let $\Gamma$ be the collection of all integer functions $\gamma: \{1, \dots, n\} \to \{1, \dots, m\}$. Let $\Gamma'$ be a subset of $\Gamma$ with the following property: an element $\gamma \in \Gamma$ belongs to $\Gamma'$ if and only if there is a function $f \in \mathfrak{F}$ such that $f(x_i) \in B(\gamma(i))$, $i = 1, \dots, n$. Let $|\Gamma'| = N$. Then order the elements of $\Gamma'$ and the functions assigned to $\Gamma'$ by $\{1, \dots, N\}$, so that ... Show that $\mathfrak{F}'$ is a relevant $\varepsilon$-net.
4) Let $f \in \mathfrak{F}$. Show that for this $f$ there is an element of $\Gamma'$, say $\gamma_j$, such that if $f(x_i) \in B(\gamma_j(i))$, $i = 1, \dots, n$, then $d(f(x_i), f_j(x_i)) < \delta_2$, $i = 1, \dots, n$.
5) Show that for all $x \in X \setminus \{x_1, \dots, x_n\}$, ... by using the triangle inequality and the inequality in 1).
6) Show that the inequality in 5) implies the desired inequality $\rho(f, f_k) < \varepsilon$ for some $k \in \{1, \dots, N\}$ and therefore $\{f_1, \dots, f_N\}$ is indeed an $\varepsilon$-net in $\mathfrak{F}$.]

7.6 Prove the following: Let $(X,\tau)$ be a topological space and let $(Y,d)$ be a complete metric space. If $\mathcal{C}^*(X;Y)$ is the subspace of continuous bounded functions, then $(\mathcal{C}^*(X;Y), \rho)$ is a complete uniform metric space.

7.7 Prove the statement: If $\mathfrak{F} \subseteq \mathcal{C}(X;Y)$ is an equicontinuous family, then so is its uniform closure $\overline{\mathfrak{F}}$.

7.8 Prove Dini's Theorem: Let $(X,\tau)$ be a compact topological space. Consider the space $(\mathcal{C}(X;\mathbb{R}), \rho)$. Let $\{f_n\}$ be a monotone sequence from $\mathcal{C}(X;\mathbb{R})$ such that $\{f_n\}$ converges to a continuous function $f \in \mathcal{C}(X;\mathbb{R})$ in the topology of pointwise convergence. Then $\{f_n\}$ converges to $f$ in $\rho$ also.

7.9 Let $\mathfrak{P}_n$ be the set of all polynomials defined on $[0,1]$ with degrees less than or equal to $n$ and with all real coefficients bounded by a positive constant. Show that $(\mathfrak{P}_n, \rho)$ is compact.

7.10 Let $\mathfrak{F} \subseteq \mathcal{C}(X;Y)$, where $X$ is a compact topological space and $Y$ is a metric space. Show that if $\mathfrak{F}$ is equicontinuous and pointwise bounded, then it is uniformly bounded in $(\mathcal{C}(X;Y), \rho)$.

7.11 Let $\mathfrak{F} \subseteq \mathcal{C}(X;\mathbb{R}^n)$ and let $X$ be compact. Show that $\mathfrak{F}$ is relatively compact if and only if it is equicontinuous and pointwise bounded.

7.12 Let $\mathfrak{F}$ be the set of functions $\mathfrak{F} = \{f(x) = a \sin x : a \in [-2, 2]\}$. Show that the set $(\mathfrak{F}, \rho)$ is sequentially compact.

7.13 Let $\mathfrak{F}$ be a sequence of functions with $f_n(x) = b_n \cos x$, $b_n = 1 + \frac{1}{n}$, $n = 1, 2, \dots$, and $f_0(x) = \cos x$. Show that $(\mathfrak{F}, \rho)$ is compact.

7.14 Let $\mathfrak{F}$ be a subset of $(\mathcal{C}(X;\mathbb{R}^n), \rho)$. Show that the uniform closure $(\overline{\mathfrak{F}}, \rho)$ is equicontinuous and bounded if and only if $(\mathfrak{F}, \rho)$ is equicontinuous and bounded.

7.15 Prove Arzelà's Theorem: Let $X$ be compact and let $\{f_k\} \subseteq \mathcal{C}(X;\mathbb{R}^n)$ be a pointwise bounded and equicontinuous sequence of functions. Show that $(\{f_k\}, \rho)$ is sequentially compact.

7.16 Prove Theorem 7.13.
NEW TERMS:

uniform metric
uniform metric space
supremum norm
space of all continuous functions
space of all continuous bounded functions
uniform convergence
uniform convergence, criterion of
completeness of a uniform metric space
equicontinuity at a point
equicontinuity on a set
Ascoli's Theorem
equicontinuity, criterion of
total boundedness, criterion of
relative compactness
pointwise bounded set of functions
uniformly bounded set of functions
Dini's Theorem
Arzelà's Theorem
8. STONE-WEIERSTRASS APPROXIMATION THEOREM
Let $(\Omega,\tau)$ be a topological space, let $X$ be a compact subset, and let $(\mathcal{A},\mathbb{R}) \subseteq \mathcal{C}(X;\mathbb{R})$ be a subspace of the space of all real-valued continuous functions on $X$ that also contains the products $f \cdot g$ of functions from $\mathcal{A}$. Each continuous function on a compact set is bounded, as we know from Theorem 6.8. We will use the uniform metric $\rho$ introduced in the previous section:

$$\rho(f,g) = \sup\{\,|f(x) - g(x)| : x \in X\,\}.$$

Since $\mathcal{C}(X;\mathbb{R})$ is complete (Example 7.3 or Theorem 7.2), $\overline{\mathcal{A}} \subseteq \mathcal{C}(X;\mathbb{R})$. We ask under what condition $\overline{\mathcal{A}} = \mathcal{C}(X;\mathbb{R})$, i.e., under what condition each continuous function can be "uniformly approximated" by elements of $\mathcal{A}$. For instance, if $\mathcal{A}$ is the set of all polynomials, can a continuous function be uniformly approximated by a sequence of polynomials? It is known from calculus that every function analytic at a point can be uniformly approximated in a vicinity of this point by a sequence of polynomials (Taylor's theorem). In 1885, the German mathematician Karl Weierstrass established a more general result (also known from calculus), which states that every continuous function defined on a compact interval $X$ can be uniformly approximated by polynomials. Finally, the American mathematician Marshall H. Stone in 1937 generalized the classical Weierstrass Theorem, allowing $X$ to be a compact topological space with some minor restriction on the subspace $\mathcal{A}$.

For all necessary preliminaries the reader is referred to the beginning of Section 7, Chapter 1. We will start with some auxiliary results to be rendered in a few steps (Lemmas 8.4 and 8.5) that lead to the Stone-Weierstrass Approximation Theorem.

8.1 Remark. Compactness of the topological space $(X,\tau)$ discussed above is not a mandatory prerequisite for defining the uniform metric, if we consider $\mathcal{C}^*(X;Y)$ as the subspace of all $d$-bounded continuous functions from $(X,\tau)$ to a complete metric space $(Y,d)$. The uniform metric $\rho$ is also well-defined on $\mathcal{C}^*(X;Y)$. Completeness of $(\mathcal{C}^*(X;Y),\rho)$ is then due to Theorem 7.6 (where only the boundedness of the functions from $\mathcal{C}(X;Y)$ on the compact space $X$ is essential). □
8.2 Definitions.

(i) Let $\mathcal{Y}$ be a family of functions defined on a set $X$. Then $\mathcal{Y}$ separates points of $X$ if for each $x$ and $y$ from $X$ such that $x \neq y$, there is a function $f \in \mathcal{Y}$ such that $f(x) \neq f(y)$.

(ii) Let $\mathcal{Y} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be an arbitrary nonempty subcollection of continuous, bounded functions on $X$ and let $A$ be any subalgebra of $\mathcal{C}^*(X;\mathbb{R})$ containing $\mathcal{Y}$. The intersection of all subalgebras containing $\mathcal{Y}$ is obviously a subalgebra (see Problem 8.1); moreover, it is the smallest subalgebra containing $\mathcal{Y}$. It is denoted by $A(\mathcal{Y})$ and is called the subalgebra generated by $\mathcal{Y}$. The subcollection $\mathcal{Y}$ is called the generator of this subalgebra. □
8.3 Theorem (Stone-Weierstrass). Let $X$ be a compact subset of a topological space $(\Omega,\tau)$ and let $\mathcal{Y} \subseteq \mathcal{C}^*(X;\mathbb{R})$. If $\mathcal{Y}$ separates points and contains the unity $1$ (i.e. the function identically equal to 1), then the subalgebra $A(\mathcal{Y})$ generated by $\mathcal{Y}$ is dense in $\mathcal{C}^*(X;\mathbb{R})$ relative to the uniform metric $\rho$.

[Observe that, if needed, the condition "$\mathcal{Y}$ separates points" can be replaced by the condition "$A(\mathcal{Y})$ separates points."]
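As a numerical illustration of the theorem in the simplest setting $X = [-1,1]$ with $\mathcal{Y} = \{1, t\}$ (so that $A(\mathcal{Y})$ is the algebra of polynomials), the following Python sketch approximates the continuous function $|t|$ by polynomials. The polynomials used here, partial sums of the binomial series of $(1-x)^{1/2}$ evaluated at $x = 1 - t^2$, are one classical choice among many. The function names `abs_approx_coeffs`, `abs_approx`, `sup_error` and the grid size are assumptions made for this sketch only, and the supremum is merely estimated on a finite grid.

```python
def abs_approx_coeffs(N):
    """Coefficients c_0..c_N with c_n = (-1)^n * binom(1/2, n)."""
    coeffs = [1.0]
    b = 1.0  # running value of binom(1/2, n)
    for n in range(1, N + 1):
        b *= (0.5 - (n - 1)) / n  # binom(a, n) = binom(a, n-1) * (a - n + 1) / n
        coeffs.append((-1) ** n * b)
    return coeffs


def abs_approx(t, coeffs):
    """Evaluate the polynomial sum_n c_n * (1 - t^2)^n by Horner's rule."""
    x = 1.0 - t * t
    s = 0.0
    for c in reversed(coeffs):
        s = s * x + c
    return s


def sup_error(N, grid=2001):
    """Grid estimate of sup |P_N(t) - |t|| over [-1, 1] (hypothetical helper)."""
    coeffs = abs_approx_coeffs(N)
    points = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return max(abs(abs_approx(t, coeffs) - abs(t)) for t in points)


if __name__ == "__main__":
    for N in (10, 100, 400):
        print(N, sup_error(N))
```

For these partial sums the error is largest at $t = 0$ and decays slowly (roughly like $1/\sqrt{\pi N}$), so the sketch illustrates uniform convergence rather than an efficient approximation scheme.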
A few lemmas will precede the proof of Theorem 8.3.

8.4 Lemma. For each $\varepsilon > 0$, there is a polynomial $P(t)$ such that $|P(t) - |t|| < \varepsilon$ for all $t \in [-1,1]$.

Proof. Let $\sum_{n=0}^{\infty} b_n z^n$ be the binomial expansion of the function $(1+z)^a$ for $a \in \mathbb{Q}$ and $z \in \mathbb{C}$. Recall that this function can be expanded in the binomial series, where the coefficient $b_n$ is given by the formula

$$b_n = \frac{a(a-1)\cdots(a-n+1)}{n!}, \quad n \geq 1, \quad \text{and} \quad b_0 = 1. \tag{8.4}$$

The binomial series converges in the open ball $B(0,1) \subseteq \mathbb{C}$, and at the point $z = (-1,0)$ for $a > 0$ it is absolutely convergent as a special case of a hypergeometric series. Since $|b_n x^n| \leq |b_n|$ for $|x| \leq 1$, the Weierstrass M-test shows that the series $\sum_{n=0}^{\infty} b_n x^n$ with coefficients given by (8.4) is uniformly convergent to the function $(1+x)^a$, at least for all $x \in [-1,0]$. Letting $a = \frac{1}{2}$ and replacing $x$ by $-x$ we arrive at the series $\sum_{n=0}^{\infty} b'_n x^n$, which is uniformly convergent to $(1-x)^{1/2}$, $\forall x \in [0,1]$, where $b'_n = (-1)^n b_n$. The statement now follows if we set $x = 1 - t^2$, where $t \in [-1,1]$. The series $\sum_{n=0}^{\infty} b'_n (1 - t^2)^n$ converges uniformly to $|t|$, $\forall t \in [-1,1]$, with $b'_n = (-1)^n b_n$; and the partial sums of the series are polynomials. □

8.5 Lemma. Let $(X,\tau)$ be a topological space, $A \subseteq \mathcal{C}^*(X;\mathbb{R})$ be a subalgebra, and $(\mathcal{C}^*(X;\mathbb{R}), \rho)$ be a uniform metric space. Then the closure $\overline{A}$ relative to $\rho$ is a subalgebra and, in addition, $\overline{A}$ is a vector lattice, i.e., $\forall f, g \in \overline{A}$, $f \wedge g$ and $f \vee g \in \overline{A}$.

Proof. By Problem 8.2 a), $\overline{A}$ is a subalgebra. We need to show that $\overline{A}$ is a lattice. Because of

$$a \wedge b = \frac{(a+b) - |a-b|}{2} \quad \text{and} \quad a \vee b = \frac{(a+b) + |a-b|}{2},$$

it suffices to show that with $f \in \overline{A}$, $|f| \in \overline{A}$. Since $f$ is a continuous, bounded function on $X$, there is an $M > 0$ such that $|f| < M$ and $|g| < 1$, where $g = \frac{f}{M} \in \overline{A}$. Then, by Lemma 8.4, for every $\varepsilon > 0$, there is a polynomial $P$ such that for all $x \in X$,

$$|P(g(x)) - |g(x)|| < \varepsilon. \tag{8.5}$$
Since $\overline{A}$ is an algebra, $P(g) \in \overline{A}$. From inequality (8.5), we have that for each $\varepsilon > 0$, the ball $B_\rho(|g|, \varepsilon)$ meets $\overline{A}$, implying that $|g| \in \overline{A}$, since $\overline{A}$ is closed. Hence, $|f| \in \overline{A}$ (see Problem 1.16). Finally, the statement of the lemma follows from the linearity of $\overline{A}$. □

Now we return to the Stone-Weierstrass Theorem.

Proof (of the Stone-Weierstrass Theorem). We will show that each function $f \in \mathcal{C}^*(X;\mathbb{R})$ can be approximated by functions from $A = A(\mathcal{Y})$ relative to the uniform metric. By the assumption, $\mathcal{Y}$ separates points, i.e., $\forall x_1 \neq x_2 \in X$, there exists a function $g \in \mathcal{Y}$ such that $g(x_1) \neq g(x_2)$. Define for fixed $\alpha, \beta \in \mathbb{R}$ the auxiliary function

$$h(x) = \alpha + (\beta - \alpha)\,\frac{g(x) - g(x_1)}{g(x_2) - g(x_1)},$$

which belongs to $A$, because $1 \in A$. Thus, $\forall x_1 \neq x_2 \in X$ and $\forall \alpha, \beta \in \mathbb{R}$, there is an $h \in A$ such that $h(x_1) = \alpha$ and $h(x_2) = \beta$.

Let $f \in \mathcal{C}^*(X;\mathbb{R})$. Then by the above argument, $\forall x \neq y$, there is an $h_{xy} \in A$ with the property that $h_{xy}(x) = f(x)$ and $h_{xy}(y) = f(y)$, where

$$h_{xy}(z) = f(x) + [f(y) - f(x)]\,\frac{g(z) - g(x)}{g(y) - g(x)}.$$

Fix an $x$ and let $y$ be arbitrary. Since $f - h_{xy}$ is continuous at $y$ and $f(y) - h_{xy}(y) = 0$, $\forall \varepsilon > 0$, there is an open neighborhood $B(y)$ of $y$ such that

$$|h_{xy}(z) - f(z) - 0| < \varepsilon, \quad \forall z \in B(y). \tag{$*$}$$

Now, we cover $X$ by $\{B(y) : y \in X\}$, and by compactness of $X$, reduce this cover to a finite subcover $\{B(y_1), \dots, B(y_n)\}$. Let the associated functions, with the above properties in vicinities of $y_1, \dots, y_n$, be $h_{xy_1}, \dots, h_{xy_n}$, respectively, and let $h_x = \min\{h_{xy_1}, \dots, h_{xy_n}\}$ on $X$. By Lemma 8.5, $h_x \in \overline{A}$. By ($*$), $\forall \varepsilon > 0$,

$$|h_{xy_i}(z) - f(z)| < \varepsilon, \quad \forall z \in B(y_i), \quad i = 1, \dots, n,$$

which implies that

$$h_x(z) < f(z) + \varepsilon, \quad \forall z \in X. \tag{$**$}$$

Observe that the above inequalities, along with their parameters, depend upon a fixed $x \in X$. Notice that $h_x$ does not really approximate $f$ on $X$; it just approximates $f$ in a vicinity of the point $x$. Indeed, $f(x) = h_x(x)$, since $h_{xy_i}(x) = f(x)$ for each $i$, and $h_x$ and $f$ satisfy inequality ($**$). By continuity of $f - h_x$, for each $\varepsilon > 0$, there is an open neighborhood $B(x)$ of $x$ such that

$$|f(z) - h_x(z) - 0| < \varepsilon, \quad \forall z \in B(x).$$

Again, let us cover $X$ by the collection $\{B(x) : x \in X\}$ and then reduce the latter to a finite subcover $\{B(x_1), \dots, B(x_k)\}$. Correspondingly, $\forall \varepsilon > 0$,

$$|f(z) - h_{x_i}(z)| < \varepsilon, \quad \forall z \in B(x_i), \quad i = 1, \dots, k.$$

Then $h = \max\{h_{x_1}, \dots, h_{x_k}\} \in \overline{A}$ by similar considerations, and hence

$$f(z) - \varepsilon < h(z), \quad z \in X.$$

Furthermore, ($**$) yields that $h(z)$
exists, where $l = g'(a)$, and now we can say that the function $g$ is differentiable at $a$ if and only if this limit exists along any sequence $\{x_n\}$ convergent to $a$ in the sense of notation (9.7b). This idea is frequently used in analysis whenever convergence along a sequence is a plausible (if not the only) option for us.

(iii) Consider some special cases of limits along the filter bases from Example 9.3 (iii). Let $X = Y = \mathbb{R}$ and $f: X \to (Y,\tau_e)$.
a) If $\mathfrak{F}_b$ on $X$ is $\mathfrak{F}_b = \{(a - \varepsilon, a + \varepsilon) : \varepsilon > 0\}$, then the concept of limit introduced in Definition 9.6 (iv) reduces to the conventional definition of the limit of a function known from calculus, with the usual notation $\lim_{x \to a} f(x) = l$.
b) Similarly, with $\mathfrak{F}_b = \{[a, a + \varepsilon) : \varepsilon > 0\}$ we obtain $\lim_{x \to a+} f(x) = l$.
c) With $\mathfrak{F}_b = \{[b, \infty) : b \in \mathbb{R}\}$, we have $\lim_{x \to +\infty} f(x) = l$.

(iv) Let $f: X \to (Y,\tau)$, let $\mathfrak{F}_b$ be a filter base on $X$ and let $(Y,\tau)$ be Hausdorff. We show that if $f$ has a limit along $\mathfrak{F}_b$, then it is unique. Assume that $l_1$ and $l_2$ are two different limits along $\mathfrak{F}_b$. Since $Y$ is Hausdorff, there are two disjoint neighborhoods of $l_1$ and $l_2$: $V_{l_1}$ and $V_{l_2}$. By the definition of the limit along $\mathfrak{F}_b$, there are two sets $U_1, U_2 \in \mathfrak{F}_b$ such that

$$f_*(U_1) \subseteq V_{l_1} \quad \text{and} \quad f_*(U_2) \subseteq V_{l_2}.$$

By the definition of $\mathfrak{F}_b$ as a filter base, there is $U \in \mathfrak{F}_b$ such that $U \subseteq U_1 \cap U_2$. Since

$$f_*(U_1 \cap U_2) \subseteq V_{l_1} \quad \text{and} \quad f_*(U_1 \cap U_2) \subseteq V_{l_2},$$

we have $f_*(U) \subseteq V_{l_1} \cap V_{l_2} = \emptyset$. This is absurd, for $U \neq \emptyset$. □

When introducing convergence of a function $f: X \to (Y,\tau)$ along a filter base $\mathfrak{F}_b$ on $X$ in Definition 9.6 (iv) we did not need to assume any topology on $X$. Now if we define a topology on $X$ and take for $\mathfrak{F}_b$ the neighborhood filter $\mathcal{U}_{x_0}$ at a point $x_0 \in X$, then, by Definition 9.6 (iv) (applied to $\mathcal{U}_{x_0} = \mathfrak{F}_b$) and taking $l \in Y$ as $f(x_0)$, we arrive at the definition of continuity of $f$ at $x_0$ that agrees with Definition 4.1: A function $f: (X,\tau) \to (Y,\tau')$ is called continuous at a point $x_0$ if

$$\lim_{x \to x_0} f(x) = f(x_0).$$

Now, we consider another very useful type of convergence: convergence along nets. As we will see, filter and net convergence have a very close relationship.

9.8 Definitions.
(i) A set $\Lambda$ is called directed if there exists a relation (denoted $\leq$) on $\Lambda$ defined as:
a) (R) for each $\lambda \in \Lambda$, $\lambda \leq \lambda$.
b) (T) $\lambda_1 \leq \lambda_2$ and $\lambda_2 \leq \lambda_3$ imply that $\lambda_1 \leq \lambda_3$.
c) (SL, superlativity) for each pair $\lambda_1, \lambda_2 \in \Lambda$, there is $\lambda \in \Lambda$ such that $\lambda_1 \leq \lambda$ and $\lambda_2 \leq \lambda$.

(ii) A net is, roughly speaking, a set indexed by a directed set, and it is a generalization of a sequence. More formally: A net in $X$ induced by $\Lambda$ is any function $f: \Lambda \to X$ where $\Lambda$ is a directed set. The point $f(\lambda)$ is denoted by $x_\lambda$ and we will then instead denote the net by $\{x_\lambda\} = \{x_\lambda : \lambda \in \Lambda\}$. Observe that since $f$ need not be surjective, $\{x_\lambda\}$ is in general a proper subset of $X$.

(iii) If $\{x_\lambda\}$ is a net, then $\{x_\lambda : \lambda_0 \leq \lambda\}$ is called a $\lambda_0$-tail of $\{x_\lambda\}$.

(iv) Let $A \subseteq X$. A $\lambda_0$-tail of a net $\{x_\lambda\}$ is called a $\lambda_0(A)$-tail of $\{x_\lambda\}$ if the $\lambda_0$-tail is a subset of $A$.

(v) A net $\{x_\lambda\}$ is said to be cofinally in $A \subseteq X$ if for each $\lambda_0 \in \Lambda$, there is $\lambda \geq \lambda_0$ such that $x_\lambda \in A$.

(vi) A point $x \in X$ is said to be an accumulation point of a net $\{x_\lambda\}$ if $\{x_\lambda\}$ is cofinally in each neighborhood $U_x \in \mathcal{U}_x$.

(vii) Let $\{x_\lambda\}$ be a net in $X$. $\{x_\lambda\}$ is said to converge to a point $x \in X$ (in notation $x_\lambda \to x$), if for each neighborhood $U_x$ of $x$, there is a $\lambda_0(U_x)$-tail of $\{x_\lambda\}$. $x$ is called a limit point of the net $\{x_\lambda\}$.

(viii) A net $\{x_\lambda\}$ is called an ultranet if for every subset $A \subseteq X$, there is a $\lambda_0(A)$-tail of $\{x_\lambda\}$ or a $\lambda_0(A^c)$-tail of $\{x_\lambda\}$. □
9.9 Examples.

(i) An example of a directed set $\Lambda$ will be $\mathbb{R}^n$ with $\lambda = (\lambda_1, \dots, \lambda_n) \leq \mu = (\mu_1, \dots, \mu_n)$ if and only if $\lambda_i \leq \mu_i$, for all $i = 1, \dots, n$.

(ii) A neighborhood base $\mathcal{B}_x$ at $x$, or the even more trivial case, the neighborhood system $\mathcal{U}_x$, with the relation $U_1 \leq U_2$ if and only if $U_1 \supseteq U_2$ for their elements, is a directed set.

(iii) Let $X$ be an arbitrary continuum set and let $\{x_\lambda\}$ be the net in $X$ induced by $\Lambda$ defined in (i). Now, a $\lambda_0$-tail involves only those $x \in X$ whose indices are $\leq$-related.

(iv) Let $(X,\tau)$ be a topological space, $x \in X$, and let $\mathcal{B}_x$ be any neighborhood base of $x$ directed as in (ii). Now, we index a subset of $X$ as follows. For each neighborhood $B \in$ ...

... $\lambda_k$, $k = 1, \dots, n$, such that each $\lambda_0$-tail of $\{\pi_{i_k}(x_\lambda)\}$ is in $U_{x_{i_k}}$, $k = 1, \dots, n$. Hence, $\pi_{i_k}^*(\lambda_0\text{-tail of } \{\pi_{i_k}(x_\lambda)\})$ is contained in $\pi_{i_k}^*(U_{x_{i_k}})$. Consequently, the $\lambda_0$-tail of $\{x_\lambda\}$ is in $\pi_{i_k}^*(U_{x_{i_k}})$, $k = 1, \dots, n$, and

$$x_\lambda \in \bigcap_{k=1}^{n} \pi_{i_k}^*(U_{x_{i_k}}) = U_x \quad \text{for all } \lambda \geq \lambda_0.$$

In other words, $x_\lambda \to x$. □

9.15 Remark. We revisit Example 9.7 (i), treating a special case of the convergence of a function on $\mathbb{N}$ (a sequence) along the filter base $\mathfrak{F}_b = \{\{n, n+1, \dots\} : n \in \mathbb{N}\}$ in a discrete topological space. Since any sequence is a net, the filter base $\mathfrak{F}_b$ in this case obviously contains all $n_0$-tails of this net, and the convergence of $f$ along $\mathfrak{F}_b$ is equivalent to the convergence of the net $\{f(n)\}$. We ask what the connection between filter and net convergence is, and in which cases they are equivalent. We will start with the natural generalization of this case. □
9.16 Proposition. Let $\{x_\lambda\}$ be a net in $X$. Then the collection of all tails of $\{x_\lambda\}$ is a filter base on $X$.

(See Problem 9.11.)
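The statement of Proposition 9.16 can be sanity-checked on a small finite example. The following Python sketch is only an illustration, not a proof: it assumes a hypothetical finite directed set (the grid $\{0,\dots,3\}^2$ with the componentwise order of Example 9.9 (i)) and an arbitrary illustrative net on it, and it verifies the filter-base property used in the text, namely that the tails are nonempty and that the intersection of any two tails contains a tail. The names `leq`, `tail` and `is_filter_base` are introduced here for the sketch.

```python
from itertools import product


def leq(a, b):
    """Componentwise order on tuples, as in Example 9.9 (i)."""
    return all(x <= y for x, y in zip(a, b))


def tail(net, index_set, lam0):
    """The lam0-tail {x_lam : lam0 <= lam} of the net, as a set of values."""
    return frozenset(net[lam] for lam in index_set if leq(lam0, lam))


def is_filter_base(sets):
    """Nonempty family of nonempty sets in which every pairwise
    intersection contains some member of the family."""
    sets = list(sets)
    if not sets or any(len(s) == 0 for s in sets):
        return False
    return all(any(c <= (a & b) for c in sets) for a in sets for b in sets)


# Hypothetical finite directed set and an arbitrary net on it.
index_set = list(product(range(4), repeat=2))
net = {lam: (lam[0] * lam[1]) % 5 for lam in index_set}
tails = {tail(net, index_set, lam0) for lam0 in index_set}

if __name__ == "__main__":
    print(is_filter_base(tails))
```

The check succeeds because any two indices have a common upper bound (superlativity), and the tail at that upper bound lies inside both tails; the same reasoning is what Problem 9.11 asks for in general.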
9.17 Definition. Let $\{x_\lambda\}$ be a net in $X$. The filter base in Proposition 9.16 is said to be the filter base generated by the net $\{x_\lambda\}$ and it is denoted by $\mathfrak{F}_\lambda$. Correspondingly, the filter $\mathfrak{F}(\mathfrak{F}_\lambda)$ generated by this particular filter base is called the filter generated by the net $\{x_\lambda\}$. □

The following two criteria form a bridge between filter and net convergence.

9.18 Theorem. A net $\{x_\lambda\} \to x$ if and only if the filter $\mathfrak{F}(\mathfrak{F}_\lambda)$ generated by this net converges to $x$. □

9.19 Theorem. $x$ is an accumulation point of a net $\{x_\lambda\}$ if and only if $x$ is an accumulation point of the filter $\mathfrak{F}(\mathfrak{F}_\lambda)$ generated by this net. □

The proofs of both theorems are left to the reader as Problems 9.12 and 9.13.

9.20 Remark. Let $\mathfrak{F}$ be a filter on $X$. Denote

$$\Lambda_{\mathfrak{F}} = \{(x,F) : x \in F \in \mathfrak{F}\}$$

and introduce the relation $\leq$ on $\Lambda_{\mathfrak{F}}$ by

$$(x_1, F_1) \leq (x_2, F_2) \iff F_2 \subseteq F_1.$$

Note that from each $F$, each time we select exactly one point $x$. Consequently, we pair all elements of $F$ with $F$. Then $(\Lambda_{\mathfrak{F}}, \leq)$ is a directed set (show it as Problem 9.14) and the projection map $\pi: \Lambda_{\mathfrak{F}} \to X$ (assigning $x$ to $(x,F)$) is a net in $X$. This net is called the net based on $\mathfrak{F}$. So, the net based on $\mathfrak{F}$ is just $\{x_\lambda\}$ where $\lambda = (x,F)$ and this particular $x$ is labeled by $\lambda$ or by $F$. This is somewhat similar to the labeling of a net generated by a neighborhood base. However, in this case, we select all elements $x$ of $F$ and, in addition, we deal with a filter instead of a filter base. □

9.21 Theorem. A filter $\mathfrak{F}$ converges to $x$ if and only if the net based on $\mathfrak{F}$ converges to $x$.

Proof.
1) Suppose that $\mathfrak{F} \to x$. Then by Definition 9.6 (i), $\mathcal{U}_x \subseteq \mathfrak{F}$. Let $U_x \in \mathcal{U}_x$. Then $U_x \in \mathfrak{F}$. Let $x_{\lambda_0} \in U_x$. Then $\lambda_0 = (x_{\lambda_0}, U_x) \in \Lambda_{\mathfrak{F}}$. By superlativity of $\Lambda_{\mathfrak{F}}$, there is $\lambda \geq \lambda_0$. Hence, there is an $F\ (\in \mathfrak{F}) \subseteq U_x$ with $\lambda_0 \leq \lambda = (x_\lambda, F)$, and $x_\lambda \in F$. The collection of all such $x_\lambda$'s is the $\lambda_0$-tail and it is a subset of $U_x$, which was an arbitrary neighborhood of $x$. Therefore, $x_\lambda \to x$.

2) Let $\{x_\lambda\}$ be the net based on a filter $\mathfrak{F}$ such that $x_\lambda \to x \in X$. We need to show that $\mathcal{U}_x \subseteq \mathfrak{F}$. Since $x_\lambda \to x$, for each $U_x$, there is $\lambda_0 \in \Lambda_{\mathfrak{F}}$ such that the $\lambda_0$-tail is in $U_x$, i.e., for some $\lambda_0 = (x_{\lambda_0}, F_0)$, all $x_\lambda \in U_x$ with $\lambda \geq \lambda_0$, or equivalently, with $x_\lambda \in F_\lambda \subseteq F_0$. Furthermore, $F_0$ must be contained in $U_x$. If this is not the case, then at least $F_0$ and $U_x$ are not disjoint (it follows from the above inclusions). Since by our assumption $F_0 \setminus U_x \neq \emptyset$, there is a $y \in F_0 \setminus U_x$, and then the pair $(y, F_0)$, marked with some $\lambda$, is obviously in the $\lambda_0$-tail. Thus $y_\lambda$ must belong to $U_x$, which contradicts the assumption. Another reason why $F_0 \subseteq U_x$ is that if some $x_\lambda \in F_0$ belongs to $U_x$, then all other elements of $F_0$ belong to $U_x$, for they participate in the relation $(x_\lambda, F_0) \leq (y, F_0)$ and thus belong to the $\lambda_0$-tail. So, we have shown that an arbitrary neighborhood $U_x$ is a superset of some $F_0 \in \mathfrak{F}$. By the definition of a filter, $U_x \in \mathfrak{F}$. □

9.22 Example. If $\mathfrak{F} = \mathcal{U}_x$, then such a filter always converges to $x$. By Theorem 9.21, the net $\{x_\lambda\}$ based on $\mathcal{U}_x$ converges to $x$. A $\lambda_0$-tail of this net would consist of all points $y$ indexed with all neighborhoods $U \in \mathcal{U}_x$ which are included in the "$\lambda_0$-neighborhood" $U_{\lambda_0}$. □

9.23 Remark. The following considerations are similar to those in Remark 9.20. Let $\mathfrak{F}_b$ be a filter base on $X$. Denote

$$\Lambda_{\mathfrak{F}_b} = \{(x,F) : x \in F \in \mathfrak{F}_b\}$$

and set the relation $\leq$ in $\Lambda_{\mathfrak{F}_b}$ by $(x_1, F_1) \leq (x_2, F_2) \iff F_2 \subseteq F_1$. Then $\Lambda_{\mathfrak{F}_b}$ is a directed set (show it, in Problem 9.15). Now, the projection map $\pi: \Lambda_{\mathfrak{F}_b} \to X$ is a net in $X$. This net is called the net based on the filter base $\mathfrak{F}_b$. □

9.24 Theorem. A filter base $\mathfrak{F}_b$ converges to $x$ if and only if the net based on $\mathfrak{F}_b$ converges to $x$. □

The proof of this theorem is similar to that of Theorem 9.21 and it is the subject of Problem 9.16.
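The construction of the net based on a filter base can be made concrete on a finite example. The following Python sketch assumes a hypothetical nested filter base on $X = \{0, \dots, 5\}$, builds the index set $\{(x,F) : x \in F \in \mathfrak{F}_b\}$ with the relation $(x_1,F_1) \leq (x_2,F_2) \iff F_2 \subseteq F_1$, and checks the three axioms of a directed set from Definition 9.8 (i) by brute force. The names `leq`, `is_directed`, `filter_base` and `net` are chosen for this sketch; a finite check of this kind does not replace the general argument asked for in Problem 9.15.

```python
def leq(p, q):
    """(x1, F1) <= (x2, F2) if and only if F2 is a subset of F1."""
    return q[1] <= p[1]


def is_directed(elements, leq):
    """Brute-force check of reflexivity, transitivity and superlativity."""
    reflexive = all(leq(a, a) for a in elements)
    transitive = all(
        leq(a, c)
        for a in elements for b in elements for c in elements
        if leq(a, b) and leq(b, c)
    )
    has_upper_bounds = all(
        any(leq(a, c) and leq(b, c) for c in elements)
        for a in elements for b in elements
    )
    return reflexive and transitive and has_upper_bounds


# Hypothetical filter base on X = {0,...,5}: {0..5}, {1..5}, {2..5}, {3..5}.
filter_base = [frozenset(range(k, 6)) for k in range(4)]
index_set = [(x, F) for F in filter_base for x in F]

# The net based on the filter base is the projection (x, F) -> x.
net = {lam: lam[0] for lam in index_set}

if __name__ == "__main__":
    print(is_directed(index_set, leq))
```

If the family of sets is not a filter base, for example $\{\{0,1\},\{1,2\}\}$, the same check fails at superlativity, which shows where the filter-base property enters.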
9.25 Example. Let $\mathfrak{F}_b =$ ...

(i) $\Rightarrow$ (ii): Let $(X,\tau)$ be $T_2$ and let $\mathfrak{F}$ be any filter on $X$ with $\mathfrak{F} \to x$ and $\mathfrak{F} \to y$. By Definition 9.6 (i), $\mathcal{U}_x \subseteq \mathfrak{F}$ and $\mathcal{U}_y \subseteq \mathfrak{F}$. Thus, $\forall U_x, U_y \in \mathfrak{F}$, $U_x \cap U_y \neq \emptyset$ (by the definition of a filter). Consequently, either $x = y$ or $(X,\tau)$ is not Hausdorff. If now $\{x_\lambda\}$ is any net in $X$ with $x_\lambda \to x$, then by Theorem 9.18, the filter $\mathfrak{F}(\mathfrak{F}_\lambda)$ generated by this net converges to the same point $x$. If $y$ were another point such that $x_\lambda \to y \neq x$, then by the same Theorem 9.18, it would mean that $\mathfrak{F}(\mathfrak{F}_\lambda) \to y$ as well, which is impossible, for in $T_2$, any filter, as proved, converges to at most one point.

(ii) $\Rightarrow$ (iii): Assume that all limits in $(X,\tau)$ are unique along any nets. Therefore, the net based on a filter $\mathfrak{F}$ converges to $x$ and to no other point of $X$. By Theorem 9.21, it follows that $\mathfrak{F}$ also converges to $x$ and to no other point of $X$. Let $D := \{(x,x) \in X^2 : x \in X\}$. Then the diagonal $D$ will contain all nets $(x_\lambda, x_\lambda)$. By Proposition 9.10, a point $(x,y) \in \overline{D}$ if and only if there is a net $(x_\lambda, x_\lambda) \subseteq D$: $(x_\lambda, x_\lambda) \to (x,y)$. Thus, if we show that $x = y$, it would imply that $\overline{D} = D$. The statement $x = y$ easily follows from the uniqueness of limits along nets. Therefore, for each point $(x,x) \in \overline{D}$, there is a net $(x_\lambda, x_\lambda) \to (x,x)$. The latter yields $\overline{D} = D$.

(iii) $\Rightarrow$ (i): It can be directly taken from (iii) $\Rightarrow$ (i) of Theorem 3.11. □

The next two results are analogous to Lemma 4.11 and Proposition 4.12 and are left to the reader as exercises.

9.27 Lemma. Let $f, g: (X,\tau) \to (Y,\tau')$ be continuous functions and let $(Y,\tau')$ be $T_2$. Then the set $S := \{x : f(x) = g(x)\}$ is closed in $X$. □

9.28 Proposition. Let $f, g: (X,\tau) \to (Y,\tau')$ be continuous maps and let $(Y,\tau')$ be $T_2$. If $f$ and $g$ coincide on some dense set $D \subseteq X$ then $f = g$ on $X$. □
PROBLEMS

9.1 Show that the filter $\mathfrak{F}$ in Remark 9.2 (ii) is the smallest filter containing the filter base $\mathfrak{F}_b$.

9.2 Show that a filter base for a filter is a filter base.

9.3 Let $X$ be a set and $A \subseteq X$. Define $\mathfrak{F} := \{F \in \mathcal{P}(X) : A \subseteq F\}$. Show that $\mathfrak{F}$ is a filter on $X$. Give the smallest filter base $\mathfrak{F}_b$ on $X$ containing the set $A$.

For Problems 9.4-9.7, let $\mathfrak{F}$ be a filter on $X$ and $A \subseteq X$.

9.4 Show that if one element $F \in \mathfrak{F}$ meets $A$, then $A$ meets all other elements of $\mathfrak{F}$. In this case we say $\mathfrak{F}$ meets $A$.

9.5 Let $\mathfrak{F}$ meet $A$. Show that $\mathfrak{F}_A := \{F \cap A : F \in \mathfrak{F}\}$ is a filter on $A$, called the trace of the filter $\mathfrak{F}$ on $A$.

9.6 Show that $\mathfrak{F}' := \mathfrak{F} \cup \Big(\bigcup_{B \supseteq A} \mathfrak{F}_B\Big)$ is the smallest filter containing $\mathfrak{F} \cup \{A\}$.

9.7 Show that $\mathfrak{F}_A$ is a filter base for $\mathfrak{F}'$.

9.8 Show that $x$ is an accumulation point of a filter $\mathfrak{F}$ if and only if $x \in \bigcap \{\overline{F} : F \in \mathfrak{F}\}$.

9.9 Show that if a filter $\mathfrak{F}$ converges to $x$ then $x$ is an accumulation point of $\mathfrak{F}$.

9.10 Let $(X,d)$, $(Y,\rho)$ be metric spaces, $f: X \to Y$, $x_0 \in X$, $l \in Y$. Show that the following statements are equivalent:
(i) $\lim_{x \to x_0} f(x) = l$ (in the sense of Definition 9.6 (iv) and Example 9.7 (ii)).
(ii) For each $\varepsilon > 0$, there is a $\delta > 0$ such that for all $x \in X$ with $d(x, x_0) < \delta$, $\rho(f(x), l) < \varepsilon$.
[Hint: Work with the system of open balls as a filter base.]

9.11 Prove Proposition 9.16.

9.12 Prove Theorem 9.18.

9.13 Prove Theorem 9.19.

9.14 Show that $(\Lambda_{\mathfrak{F}}, \leq)$ is a directed set.

9.15 Show that $(\Lambda_{\mathfrak{F}_b}, \leq)$ is a directed set.

9.16 Prove Theorem 9.24.

9.17 Show that the net based on an ultrafilter is an ultranet.

9.18 Show that the filter generated by an ultranet is an ultrafilter.

9.19 Generalize Theorem 3.11 replacing condition (ii) by the condition: each net or filter in $(X,\tau)$ converges to no more than one point.

9.20 Prove Lemma 9.27.

9.21 Prove Proposition 9.28.
NEW TERMS:

filter
filter base
filter base for a filter
ultrafilter
filter generated by a filter base
neighborhood filter
maximal filter
convergence of a filter
limit point of a filter
convergence of a filter base
accumulation point of a filter
accumulation point of a filter base
convergence of a function along a filter base
limit of a function at a point
continuity of a function at a point
directed set
net
net induced by a directed set
$\lambda_0$-tail of a net
net, cofinally in a set
accumulation point of a net
convergence of a net to a point
limit point of a net
ultranet
net generated by a neighborhood base
partition of an interval
refinement of a partition
Darboux lower sum
Darboux upper sum
Riemann integral
function convergent along a net
continuity of a function, criterion of
convergence of a net to a point, criterion of
filter base generated by a net
accumulation point of a net, criterion of
convergence of a filter to a point, criterion of
convergence of a filter base to a point, criterion of
uniqueness of limits along nets and filters, criteria of
filter that meets a set
trace of a filter on a set
10. SEPARATION
In this section we will see that the fineness of a topology is characterized by its ability to separate points and sets. We will treat some special types of topological spaces that have qualities somewhat similar to Hausdorff spaces, introduced in Section 3 and here given in weaker or stronger forms. In addition to countability, this is another attempt to arrive at various classes of topological spaces having common properties with metric spaces and yet being sufficiently more general.

10.1 Definitions. Let $(X,\tau)$ be a topological space.

(i) $(X,\tau)$ is called a $T_0$ space if for each pair of points $x \neq y \in X$, at least one of them, say $x$, has a neighborhood $U_x$ such that $y \in U_x^c$.

(ii) $(X,\tau)$ is called a $T_1$ space if for each pair $x \neq y \in X$, there are $U_x$ and $U_y$ such that $y \in U_x^c$ and $x \in U_y^c$.

(iii) $(X,\tau)$ is called a $T_2$ space (or Hausdorff) if $\forall x \neq y \in X$, $\exists\, U_x, U_y$: $U_x \cap U_y = \emptyset$.

(iv) $(X,\tau)$ is called regular if for every closed set $F \subseteq X$ and for every point $x \in F^c$ there are disjoint open sets $O_x$ and $O_F$ such that $F \subseteq O_F$ and $x \in O_x$.

(v) $(X,\tau)$ is called a $T_3$ space if it is regular and it is a $T_1$ space.

(vi) $(X,\tau)$ is called completely regular if every closed set $F \subseteq X$ and every point $x \in F^c$ can be separated by a continuous function, i.e. if there is a continuous function $f: (X,\tau) \to ([0,1],\tau_e)$ with $f(x) = 0$ and $f = 1$ on $F$.

(vii) $(X,\tau)$ is called Tychonov if it is completely regular and a $T_1$ space.

(viii) $(X,\tau)$ is called normal if any two disjoint closed sets have disjoint open supersets.

(ix) $(X,\tau)$ is called a $T_4$ space if it is normal and a $T_1$ space.

(x) $(X,\tau)$ is called locally compact if every point of $X$ has at least one compact neighborhood. □

10.2 Lemma. The following are equivalent:
(i) $(X,\tau)$ is $T_1$.
(ii) Each one-point set is closed.
(iii) Every subset of $X$ equals the intersection of all open sets containing this set.
Proof. (i) ⇒ (ii): Let $(X,\tau)$ be $T_1$ and let $x \in X$. Then, by the definition, each $y$ ($\neq x$) has a neighborhood disjoint from $\{x\}$; consequently $X\setminus\{x\}$, as a superset of this neighborhood, is a neighborhood of $y$ too. By the definition of a neighborhood, there is an open neighborhood of $y$, say $O_y \subseteq X\setminus\{x\}$. Thus, $y$ is an interior point of $X\setminus\{x\}$. Since $y \in X\setminus\{x\}$ was an arbitrary choice, it follows that $X\setminus\{x\}$ is an open set. [Observe that Hausdorff spaces have the same property, cf. Problem 3.2.]

(ii) ⇒ (iii): Assume that each singleton in $(X,\tau)$ is closed. Let $A \subseteq X$. Then $A = \bigcap_{x \in A^c} (X\setminus\{x\})$. Now, the statement follows from the fact that $X\setminus\{x\}$ is open and that $A \subseteq X\setminus\{x\}$, $\forall x \in A^c$.

(iii) ⇒ (i): Assume that every subset $A \subseteq X$ is the intersection of all open sets containing $A$. Let $A = \{x\}$. Then $\{x\}$ is the intersection of all open neighborhoods $O_x$ of $x$, i.e. $\{x\} = \bigcap O_x$. Let $y$ be a point that belongs to every open neighborhood $O_x$ of $x$. Then $y \in \bigcap O_x$ and hence $y = x$. Consequently, each $y \neq x$ is missed by some open neighborhood of $x$ and, exchanging the roles of $x$ and $y$, $x$ is missed by some open neighborhood of $y$. □

10.3 Proposition. For the $T_i$ spaces, the following diagram holds:

$$T_4 \Rightarrow T_3 \Rightarrow T_2 \Rightarrow T_1 \Rightarrow T_0.$$

Proof. Indeed: $T_2 \Rightarrow T_1 \Rightarrow T_0$ is obvious. Since a $T_3$ space is $T_1$, by Lemma 10.2 each singleton is closed; taking $F = \{y\}$ in the definition of regularity, we get $T_2$. Similarly, letting one of the two closed sets in the definition of normality be $\{x\}$, which is closed by Lemma 10.2, we have $T_4 \Rightarrow T_3$. □

10.4 Example. Let $X$ be any infinite set equipped with the cocountable topology $\tau = \{X, \emptyset, C^c : |C| \le |\mathbb{N}|\}$ (introduced in Problem 1.7).
Thus, by the definition, all at most countable sets are closed; specifically, all singletons are closed. Thus, by Lemma 10.2, $\tau$ must be $T_1$. Similarly, any cofinite topology (cf. Problem 1.1) is $T_1$. Now let $O_1$ and $O_2$ be any two open sets in a cofinite topology with an infinite carrier. We show that $O_1$ and $O_2$ cannot be disjoint unless $O_1$ or $O_2$ is empty. If they are disjoint and nonempty then $O_1 \subseteq O_2^c$, which is impossible, for $O_2^c$ must be finite and $O_1$ is infinite. Thus any cofinite topology on an infinite carrier cannot be $T_2$. Similarly, any cocountable topology on a carrier whose cardinal number is greater than $\aleph_0$ cannot be $T_2$. □

10.5 Theorem. The following are equivalent for a topological space $(X,\tau)$:
(i) $X$ is regular.
(ii) If $O_x$ is an open neighborhood of $x$ then there exists an open set $U$ which contains $x$ and such that $\overline{U} \subseteq O_x$.
(iii) Each $x \in X$ has a neighborhood base consisting of closed sets.

Proof.
(i) ⇒ (ii). Suppose $X$ is regular. Let $x \in O \in \tau$. Then $O^c$ is closed and $x \notin O^c$ and, by regularity of $X$, there are disjoint open sets $U$ and $W$ such that $x \in U$ and $O^c \subseteq W$. Clearly, $W^c$ is closed and $U \subseteq W^c \subseteq O$. Furthermore, $\overline{U} \subseteq W^c \subseteq O$.

(ii) ⇒ (iii). [...]

[...] If $\overline{O} \subseteq \overline{K} = K$), by Theorem 6.9, $\overline{O}$ is compact in $\tau \cap K$. By Theorem 10.6, as a compact and Hausdorff subspace, $\overline{O}$ is regular. As an open neighborhood of $x$ in $(X,\tau)$, and a subset of $\overline{O}$, $O$ is also open in $\tau \cap \overline{O}$. By Theorem 10.5, there is an open neighborhood $W$ of $x$ in $\tau \cap \overline{O}$ such that its closure in $\tau \cap \overline{O}$ satisfies $\overline{W} \subseteq O$. (It is easily seen that $W$ is also open in $\tau$.) Since $\overline{O}$ is a compact subspace, $\overline{W}$ is compact in $\overline{O}$. We need to show that $\overline{W}$ is also compact in $(X,\tau)$. Let $\{V_s\}$ be an open cover of $\overline{W}$ in $\tau$. Then $\{V_s \cap \overline{O}\}$ is obviously an open cover of $\overline{W}$ in $\tau \cap \overline{O}$. This cover can be reduced to a finite subcover $\{V_1 \cap \overline{O}, \dots, V_k \cap \overline{O}\}$ and therefore $\{V_1, \dots, V_k\}$ is a finite subcover of $\overline{W}$ in $\tau$.

In a nutshell, we showed that an arbitrary neighborhood $U$ of $x$ has an open subneighborhood $W$ whose closure is compact. Hence, from a neighborhood base at $x$ we obtain a neighborhood base consisting of open sets whose closures are compact. In particular, it means that every point of $X$ possesses a neighborhood base consisting of compact sets. □
10.10 Proposition. Let $(X,\tau)$ be a locally compact Hausdorff space, $x$ a point of $X$, and let $U_x$ be an open neighborhood of $x$. Then there is an open neighborhood $O_x$ of $x$ such that $\overline{O}_x \subseteq U_x$ and $\overline{O}_x$ is compact. (See Problem 10.6.)
10.11 Proposition. Let $K$ be a compact set in a locally compact Hausdorff space $(X,\tau)$ and $W$ be an open superset of $K$. Then there is an open superset $U$ of $K$ such that $\overline{U} \subseteq W$ and $\overline{U}$ is compact.

Proof. By Proposition 10.10, each point $x$ of $K$ has an open neighborhood $U_x$ whose closure is compact and included in $W$. If we cover $K$ by all $U_x$'s, because of compactness of $K$, this cover can be reduced to a finite subcover, say $U_1, \dots, U_n$. If $U = U_1 \cup \dots \cup U_n$ then clearly

$$K \subseteq U \subseteq \overline{U} = \overline{U}_1 \cup \dots \cup \overline{U}_n \subseteq W.$$

As a finite union of compact sets, $\overline{U}$ is compact. □

The next is a small and useful consequence of Proposition 10.11 (whose proof we assign to Problem 10.8). It states that every locally compact Hausdorff space is "weakly" normal. Recall that a space is normal if every two disjoint closed sets can be separated, i.e. they have disjoint open supersets. In a locally compact Hausdorff space, the same property applies to compact sets, which, as we know (cf. Theorem 6.10), are closed in Hausdorff spaces. In other words, any two disjoint compact sets can be separated by disjoint open supersets.

10.12 Corollary. In a locally compact Hausdorff space any two disjoint compact sets have disjoint open supersets. □
The theorem below is quite famous and is known as Urysohn's Lemma. Given two disjoint closed sets $A$ and $B$ in a normal space $(X,\tau)$, the lemma asserts the existence of a real-valued continuous function $f$ on $X$ that "separates" the two sets, i.e. $f: X \to [0,1]$ such that $f_*(A) = \{0\}$ and $f_*(B) = \{1\}$. (The original proof guarantees the existence of a function $f$ from $X$ into $[0,1]$, but with a simple transformation, the range of $f$ can be made $[a,b]$.) Whenever we talk about real-valued functions from $X$ to $\mathbb{R}$, we will mean the usual topology in $\mathbb{R}$. The following short biographical note on Pavel S. Urysohn will add to the prominence of his widely referred to lemma.

Pavel Samuilovich Urysohn (born in 1898 in Odessa, Russia), according to Pavel S. Alexandrov, was the founder of the Russian school of topology. He studied mathematics under Nikolai N. Lusin at Moscow State University, from which he was awarded a doctoral degree in 1921. He tragically died by drowning in Brittany, France (at the early age of 26), during his visit to one of the mathematical conferences. Among the significant results Urysohn obtained during his less than four years of academic work was his treatment of one of the central problems in topology: the dimension of arbitrarily complex geometrical figures.

10.13 Theorem (Urysohn's Lemma). A space $(X,\tau)$ is normal if and only if whenever $A$ and $B$ are disjoint closed sets in $X$, there is a continuous function $f: X \to [0,1]$ such that $f_*(A) = \{0\}$ and $f_*(B) = \{1\}$.

Proof.
Necessity. We assume that $(X,\tau)$ is normal and that $A$ and $B$ are disjoint closed sets. By normality of $(X,\tau)$ and Problem 10.8, there is an open superset $U_{1/2}$ of $A$ such that $\overline{U}_{1/2} \cap B = \emptyset$. Now, the sets $A$ and $(U_{1/2})^c$ are disjoint and closed. By normality, there are open supersets, $U_{1/4}$ and $V$, of $A$ and $(U_{1/2})^c$, respectively, such that

$$A \subseteq U_{1/4}, \quad (U_{1/2})^c \subseteq V \quad \text{and} \quad U_{1/4} \cap V = \emptyset.$$

Therefore, $U_{1/4} \subseteq V^c \subseteq U_{1/2}$ and, $V^c$ being closed, this yields that $\overline{U}_{1/4} \subseteq V^c \subseteq U_{1/2}$. Since $B$ and $\overline{U}_{1/2}$ are disjoint and closed, by Problem 10.8, there is an open superset $U_{3/4}$ of $\overline{U}_{1/2}$ such that $\overline{U}_{3/4} \cap B = \emptyset$. In summary,

$$A \subseteq U_{1/4}, \quad \overline{U}_{1/4} \subseteq U_{1/2}, \quad \overline{U}_{1/2} \subseteq U_{3/4}, \quad \text{and} \quad \overline{U}_{3/4} \cap B = \emptyset.$$

For convenience, we display one more step. Repeating the above arguments, there are open sets

$$U_{1/8},\ U_{1/4},\ U_{3/8},\ U_{1/2},\ U_{5/8},\ U_{3/4},\ \text{and}\ U_{7/8}$$

that are embedded in the following way:

$$A \subseteq U_{1/8},\ \overline{U}_{1/8} \subseteq U_{1/4},\ \overline{U}_{1/4} \subseteq U_{3/8},\ \overline{U}_{3/8} \subseteq U_{1/2},\ \overline{U}_{1/2} \subseteq U_{5/8},\ \overline{U}_{5/8} \subseteq U_{3/4},\ \overline{U}_{3/4} \subseteq U_{7/8},\ \text{with}\ \overline{U}_{7/8} \cap B = \emptyset.$$

Continuing the same process, we define sets $U_{i/2^n}$, $i = 1, \dots, 2^n - 1$, which are embedded as

$$A \subseteq U_{1/2^n},\ \overline{U}_{1/2^n} \subseteq U_{2/2^n},\ \dots,\ \overline{U}_{(2^n-1)/2^n} \cap B = \emptyset.$$

Let $D_0$ denote the set of all dyadic rationals belonging to $[0,1]$, i.e. those numbers of the form $i/2^n$ where $i = 0,1,\dots,2^n$ and $n = 0,1,\dots$, and let $D$ be the subset of dyadic rationals from $(0,1)$, i.e., $D_0\setminus\{0,1\}$. It is easy to show that $D_0$ is dense in $[0,1]$. By induction, we can construct the countable family $\{U_d;\ d \in D\}$ of open sets indexed by the elements of $D$ such that for each pair $p, q \in D$ with $p < q$,

$$A \subseteq U_p, \quad \overline{U}_p \subseteq U_q \quad \text{and} \quad \overline{U}_q \cap B = \emptyset.$$

Let $U$ denote the union of all $U_d$'s. Now, we introduce the function

$$f(w) = \begin{cases} \inf\{p : w \in U_p\}, & \text{if } w \text{ belongs to some } U_p, \\ 1, & w \in X\setminus U \end{cases}$$

on $X$. Clearly, $f_*(A) = \{0\}$ and $f_*(B) = \{1\}$ and $[0,1]$ is the range of $f$. We prove that $f$ is continuous at each point $w$ of $X$. Continuity is subject to the following arguments. It is easy to show that:

if $w \in \overline{U}_p$ then $f(w) \le p$;

if $w \notin U_p$ then $f(w) \ge p$.

By Definition 4.1, $f$ is continuous at $w$ if for every neighborhood $W_{f(w)}$, there is a neighborhood $V_w$ such that $f_*(V_w) \subseteq W_{f(w)}$. Let $f(w) \in (0,1)$ and let $(a,b) = W_{f(w)}$ be any open subinterval of $[0,1]$ containing $f(w)$. Because $D$ is dense in $[0,1]$, there is a pair of dyadic rationals $p, q \in D$ such that

$$a < p < f(w) < q < b.$$

Now, the open set $V_w = U_q \setminus \overline{U}_p$ is a neighborhood of $w$ such that $f_*(V_w) \subseteq (a,b)$. It is a rather routine procedure to verify the continuity of $f$ at the points where $f$ equals 0 or 1. This completes the necessity of the statement.

Sufficiency. Assume that for any two closed disjoint sets $A$ and $B$, there is a continuous function $f: X \to [0,1]$ such that $f_*(A) = \{0\}$ and $f_*(B) = \{1\}$. Since $f$ is continuous, $f^*([0,\tfrac12))$ and $f^*((\tfrac12,1])$ are disjoint open sets in $(X,\tau)$ and they contain $A$ and $B$, respectively. □

10.14 Corollary. A $T_4$ space is Tychonov.

Proof. Let $(X,\tau)$ be a $T_4$ space. By Lemma 10.2, as a $T_1$ space, each singleton in $(X,\tau)$ is closed. Since the $T_4$ space is normal, given an $x$ and a closed set $F$, to which $x$ does not belong, by Urysohn's Lemma there is a continuous function $f$ with the range $[0,1]$, which separates $\{x\}$ and $F$. Hence, $(X,\tau)$ is completely regular. In addition, $(X,\tau)$ is a $T_1$ space. □

10.15 Corollary (Urysohn). Let $K$ and $W$ be compact and open sets, respectively, in a locally compact Hausdorff space $(X,\tau)$ such that $K \subseteq W$. Then there is a continuous function $[X,[0,1],f]$ such that $f_*(K) = \{1\}$ and $f_*(G) = \{0\}$, where $G^c$ is a compact subset of $W$ containing $K$.
Proof. By Proposition 10.11, there is an open superset $U$ of $K$ whose closure $\overline{U}$ is compact and is contained in $W$. Since the subspace $(\overline{U}, \tau \cap \overline{U})$ is compact Hausdorff, by Corollary 10.7, it is normal. Then, by Urysohn's Lemma, for any two disjoint closed subsets $A$ and $B$ of $\overline{U}$, there is a continuous function $[\overline{U},[0,1],\varphi]$ such that $\varphi_*(A) = \{0\}$ and $\varphi_*(B) = \{1\}$. Now, if we take $A = \overline{U}\setminus U$ and $B = K$, we have two disjoint closed subsets of $\overline{U}$ (see Theorem 6.10) in the scenario of Urysohn's Lemma. Now, we extend the function $\varphi$ to $X$ by letting $f_*(X\setminus U) = \{0\}$, where $f$ denotes the extension of $\varphi$ from $\overline{U}$ to $X$. Hence, $f$ vanishes, in particular, on the subset $G = (\overline{U})^c$ of $X\setminus U$. It remains to show that $f$ is continuous. Let $C$ be any closed subset of $[0,1]$. If $C$ does not contain 0, then $f^*(C) = \varphi^*(C)$ is closed in $\overline{U}$ and, therefore, it is closed in $(X,\tau)$ (as the traces of all $\tau$-closed sets on $\overline{U}$ are all closed sets in $\tau \cap \overline{U}$ and they are closed in $\tau$). If $0 \in C$, then

$$f^*(C) = f^*(C \cup \{0\}) = \varphi^*(C) \cup U^c$$

is also closed in $(X,\tau)$. □
10.16 Definition and Notation. Let $(X,\tau)$ be a topological space. Any at most countable intersection of open sets is denoted by $G_\delta$. Any at most countable union of closed sets is denoted by $F_\sigma$. A set is referred to as $\sigma$-compact, in notation $K_\sigma$, if it is an at most countable union of compact sets. □

10.17 Proposition. Let $(X,\tau)$ be a second countable locally compact Hausdorff space. Then each open set is an $F_\sigma$- and $K_\sigma$-set and each closed set is a $G_\delta$.

Proof. Let $\mathfrak{B}$ be a countable basis for $\tau$ and let $U \in \tau$. By Proposition 10.10, each point $x \in U$ has an open neighborhood $O_x$ such that $\overline{O}_x \subseteq U$ and $\overline{O}_x$ is compact. On the other hand, $O_x$ can be represented as a union of some sets from $\mathfrak{B}$ [...]

[...]

$$\mu^*\Big(\bigcup_{n=1}^{\infty} Q_n\Big) \le \sum_{n=1}^{\infty} \mu^*(Q_n) \quad (\sigma\text{-subadditivity}). \tag{2.1c}$$

Although axiom a) is redundant, since $\mu^*(\emptyset) = 0$ holds for $\mu^*$ as a set function in general, we find it to be a useful reminder. □

2.2 Definition. Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$. A subset $M \subseteq \Omega$ is said to be $\mu^*$-measurable, if for any $Q \subseteq \Omega$,

$$\mu^*(Q) = \mu^*(Q \cap M) + \mu^*(Q \cap M^c). \tag{2.2}$$
We will also say that $M$ separates $Q$. □

The following is what essentially constitutes the widely referred to Carathéodory Extension Theorem. For convenience, we will break it up into several theorems. The idea of outer measures and the construction below belong to the German mathematician (of Greek origin) Constantin Carathéodory; they appeared in his 1914 paper, Über das lineare Maß von Punktmengen - eine Verallgemeinerung des Längenbegriffs (in Göttinger Nachrichten), and in his famous 1918 book, Vorlesungen über reelle Funktionen (Teubner, Leipzig).

2.3 Theorem. The collection $\Sigma^*$ of all $\mu^*$-measurable subsets forms a $\sigma$-algebra in $\Omega$. The restriction of $\mu^*$ from $\mathcal{P}(\Omega)$ to $\Sigma^*$, in notation $\mu_0^*$, is a measure.
Proof. Since throughout the proof of this theorem we will largely use equation (2.2) or prove its validity, we first notice that, due to $\sigma$-subadditivity of $\mu^*$, as an outer measure, the inequality

$$\mu^*(Q) \le \mu^*(Q \cap M) + \mu^*(Q \cap M^c) \tag{2.3}$$

holds true for all subsets, $Q$ and $M$, of $\Omega$. Our proof will consist of the following steps.

a) $\Omega$ is obviously an element of $\Sigma^*$, as it satisfies (2.2). If $M \in \Sigma^*$, then $M^c \in \Sigma^*$, by their symmetry in (2.2).

b) We show that $\Sigma^*$ is closed with respect to the formation of finite unions, i.e., we show that with $A, B \in \Sigma^*$, $A \cup B \in \Sigma^*$. Since $B \in \Sigma^*$, it follows that for each $Q' \in \mathcal{P}(\Omega)$,

$$\mu^*(Q') = \mu^*(Q' \cap B) + \mu^*(Q' \cap B^c). \tag{2.3a}$$

Specifically, (2.3a) is valid for $Q' = Q \cap A$ and $Q' = Q \cap A^c$, $Q \in \mathcal{P}(\Omega)$. Hence,

$$\mu^*(Q \cap A) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c)$$

and

$$\mu^*(Q \cap A^c) = \mu^*(Q \cap A^c \cap B) + \mu^*(Q \cap A^c \cap B^c).$$

Summing up the last two equations and taking into account that $A \in \Sigma^*$, we have $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \cap A^c)$, implying that

$$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap A^c \cap B) + \mu^*(Q \cap A^c \cap B^c). \tag{2.3b}$$

Now replacing $Q$ in (2.3b) with $Q \cap (A \cup B)$ we also have

$$\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap B \cap A^c) + \mu^*(Q \cap \emptyset).$$

The latter reduces to

$$\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap B \cap A^c). \tag{2.3c}$$

Substituting (2.3c) into (2.3b) and noticing that $A^c \cap B^c = (A \cup B)^c$, we get

$$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap (A \cup B)^c),$$

which shows that $A \cup B \in \Sigma^*$. The above assertions a) and b) imply that $\Sigma^*$ is an algebra in $\Omega$.

c) Now we prove that $\Sigma^*$ is a $\sigma$-algebra in $\Omega$. Since $\Sigma^*$, as an algebra, is $\cap$-stable, it is sufficient to show that $\Sigma^*$ is a Dynkin system. (See Problem 1.10 of Chapter 4.) Let $\{A_n\} \subseteq \Sigma^*$ be a sequence of disjoint sets. Take $A_1, A_2 \in \{A_n\}$. Substituting $A_1 = A$ and $A_2 = B$ into (2.3c), taking $A$ and $B$ in (2.3c) disjoint, and then noticing that $A \cap B^c = A$ and $B \cap A^c = B$, we arrive at

$$\mu^*[Q \cap (A + B)] = \mu^*(Q \cap A) + \mu^*(Q \cap B). \tag{2.3d}$$
If $A_1, \dots, A_n$ is an $n$-tuple of mutually disjoint elements of $\Sigma^*$, then, by induction, from (2.3d),

$$\mu^*(Q \cap S_n) = \sum_{k=1}^{n} \mu^*(Q \cap A_k), \tag{2.3e}$$

where $S_n = \sum_{k=1}^{n} A_k$. Denote $S = \sum_{n=1}^{\infty} A_n$. Because of $S_n \subseteq S$, $(Q \cap S^c) \subseteq (Q \cap S_n^c)$, and by monotonicity of $\mu^*$,

$$\mu^*(Q \cap S^c) \le \mu^*(Q \cap S_n^c), \quad n = 1,2,\dots. \tag{2.3f}$$

Since $\Sigma^*$ is an algebra, it follows that $S_n \in \Sigma^*$ and hence it is $\mu^*$-measurable, i.e., it separates $Q$, which, combined with (2.3e) and (2.3f), yields

$$\mu^*(Q) = \mu^*(Q \cap S_n) + \mu^*(Q \cap S_n^c) \ge \sum_{k=1}^{n} \mu^*(Q \cap A_k) + \mu^*(Q \cap S^c), \quad n = 1,2,\dots.$$

Therefore,

$$\mu^*(Q) \ge \sum_{k=1}^{\infty} \mu^*(Q \cap A_k) + \mu^*(Q \cap S^c) \tag{2.3g}$$

that, by $\sigma$-subadditivity, is

$$\ge \mu^*\Big(Q \cap \sum_{k=1}^{\infty} A_k\Big) + \mu^*(Q \cap S^c) = \mu^*(Q \cap S) + \mu^*(Q \cap S^c). \tag{2.3h}$$

Inequalities (2.3) and (2.3g-2.3h) lead to

$$\mu^*(Q) = \mu^*(Q \cap S) + \mu^*(Q \cap S^c),$$

concluding that $S = \sum_{n=1}^{\infty} A_n$ indeed separates any $Q \in \mathcal{P}(\Omega)$ and thus is an element of $\Sigma^*$. The latter supports the claim that $\Sigma^*$ is a Dynkin system and, consequently, that $\Sigma^*$ is a $\sigma$-algebra.

d) We show that $\mu_0^*$ is a measure on $\Sigma^*$. Substituting the set $S = \sum_{n=1}^{\infty} A_n$ for $Q$ in (2.3g), we have

$$\mu_0^*\Big(\sum_{n=1}^{\infty} A_n\Big) \ge \sum_{k=1}^{\infty} \mu_0^*(A_k),$$

which, due to $\sigma$-subadditivity of $\mu^*$, leads to the strict equality and thereby to $\sigma$-additivity of $\mu_0^*$. Therefore, we have proved that $\mathrm{Res}_{\Sigma^*}\, \mu^*$, denoted by $\mu_0^*$, is a measure. □

2.4 Examples.
(i) Let $\Omega = \{a,b,c\}$, $A = \{a\}$, $A^c = \{b,c\}$, $P = \{b\}$, $Q = \{c\}$, $R = \{a,b\}$, $S = \{a,c\}$. Define the following set function $\mu^*$ on $\mathcal{P}(\Omega)$:

$$\mu^*(\emptyset) = 0, \quad \mu^*(\Omega) = 4, \quad \mu^*(A) = 1, \quad \mu^*(A^c) = \mu^*(R) = \mu^*(S) = 3, \quad \mu^*(P) = \mu^*(Q) = 2.$$

One can easily verify that $\mu^*$ is an outer measure on $\mathcal{P}(\Omega) = \{\emptyset, \Omega, A, A^c, P, Q, R, S\}$, as it satisfies axioms (2.1a-2.1c), but $\mu^*$ is not a measure, because it is not additive: for instance, $\mu^*(P) + \mu^*(Q) = 4$, while $\mu^*(P + Q) = \mu^*(A^c) = 3$. We can see that only the sets $\emptyset$, $\Omega$, $A$, and $A^c$ $\mu^*$-separate all subsets of $\Omega$ and, consequently, $\{\emptyset, \Omega, A, A^c\}$ is the $\sigma$-algebra $\Sigma^*$. Clearly, $\mu_0^*$, as the restriction of $\mu^*$ on $\Sigma^*$, is a measure.

(ii) Let $\Omega$ be an infinite set. Define the set function $\gamma$ on $\mathcal{P}(\Omega)$ by $\gamma(Q) = 0$ if $Q$ is a finite set and $\gamma(Q) = 1$ if $Q$ is infinite. Let $\{\{\omega_n\},\ n = 1,2,\dots\}$ be a sequence of all different singletons and let $Q = \bigcup_{n=1}^{\infty} \{\omega_n\}$. Then $\sum_{n=1}^{\infty} \gamma(\{\omega_n\}) = 0$, while $\gamma(Q) = 1$. Thus, $\gamma$ is not $\sigma$-subadditive and not an outer measure. □
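The claims of Example 2.4 (i) can be confirmed by brute force, since $\mathcal{P}(\Omega)$ has only eight elements. The sketch below assumes that subsets are encoded as frozensets of the letters a, b, c, and that on a finite carrier $\sigma$-subadditivity reduces to pairwise subadditivity (by induction, as only finitely many distinct sets can occur in a sequence). The helper names `powerset`, `is_outer_measure` and `measurable_sets` are illustrative and not part of the text.

```python
from itertools import combinations

OMEGA = frozenset("abc")


def powerset(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# The set function mu* of Example 2.4 (i).
MU = {
    frozenset(): 0, OMEGA: 4,
    frozenset("a"): 1, frozenset("bc"): 3,   # A and its complement
    frozenset("b"): 2, frozenset("c"): 2,    # P and Q
    frozenset("ab"): 3, frozenset("ac"): 3,  # R and S
}


def is_outer_measure(mu, omega):
    """Check mu(empty) = 0, monotonicity and (pairwise) subadditivity."""
    subsets = powerset(omega)
    if mu[frozenset()] != 0:
        return False
    monotone = all(mu[a] <= mu[b]
                   for a in subsets for b in subsets if a <= b)
    subadditive = all(mu[a | b] <= mu[a] + mu[b]
                      for a in subsets for b in subsets)
    return monotone and subadditive


def measurable_sets(mu, omega):
    """Sets M with mu(Q) = mu(Q & M) + mu(Q - M) for every Q, as in (2.2)."""
    subsets = powerset(omega)
    return {m for m in subsets
            if all(mu[q] == mu[q & m] + mu[q - m] for q in subsets)}


SIGMA_STAR = measurable_sets(MU, OMEGA)
print(is_outer_measure(MU, OMEGA))
print(sorted("".join(sorted(m)) for m in SIGMA_STAR))
```

The set `SIGMA_STAR` consists of exactly the four sets $\emptyset$, $\Omega$, $A$, $A^c$, and on this family $\mu^*$ is additive, since $\mu^*(A) + \mu^*(A^c) = 1 + 3 = 4 = \mu^*(\Omega)$.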
Recall that a restriction of a function $[X,Y,f]$ is a function $[X_0,Y_0,f_0]$ defined on a contracted domain $X_0 \subseteq X$ with $f = f_0$ on $X_0$ and $Y_0 \subseteq Y$. (In notation, $f_0 = \mathrm{Res}_{X_0} f$.) From Theorem 2.3, we learned that the set function $[\Sigma^*,[0,\infty],\mu_0^*]$ is a restriction of an outer measure $[\mathcal{P}(\Omega),[0,\infty],\mu^*]$. If $\tilde{X}$ and $\tilde{Y}$ are supersets of $X$ and $Y$, respectively, a function $[\tilde{X},\tilde{Y},\tilde{f}]$ is called an extension of $f$ (from $X$ to $\tilde{X}$), if $[X,Y,f]$ is the restriction of $\tilde{f}$ to $X$. (In notation, $\tilde{f} = \mathrm{Ext}_{\tilde{X}} f$.) We will apply this notion to extend a set function $\gamma$ defined on a collection $\mathcal{G}$ of subsets of $\Omega$ to a set function $\tilde{\gamma}$ on an expanded family $\tilde{\mathcal{G}}(\mathcal{G})$ of subsets of $\Omega$. For instance, in Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. We can extend the Lebesgue elementary content $\lambda_0$ to a (unique) content $\lambda_c$ on $\mathcal{R}(\mathcal{S})$ (see Problem 2.2), which turns out to be a premeasure on $\mathcal{R}$ (verified in Theorem 3.1). The primary goal in this section is to construct an extension of a set function, such as a premeasure, given on a ring, to a measure on the smallest $\sigma$-algebra generated by this ring. Although this is the main objective, other extensions, such as the "completion" of a measure, will also be a focus of our discussions.

2.5 Definitions.
(i) Let $(\Omega,\Sigma,\mu)$ be a measure space. A set $N \in \Sigma$ is called a $\mu$-null set (or just null set) if $\mu(N) = 0$. We denote the set of all $\mu$-null sets by $\mathcal{N}_\mu$. A set $E$ is called $\mu$-negligible (or just negligible), if there is a measurable null superset of $E$. The measure space is called complete, if for each null set $N \in \mathcal{N}_\mu$, $\mathcal{P}(N) \subseteq \Sigma$, i.e., if all negligible sets are measurable.

(ii) Consider a measure space $(\Omega,\Sigma,\mu)$. Let $\overline{\Sigma}$ be the collection of all sets of type $A \cup M$ where $A \in \Sigma$ and $M$ is any negligible set. According to Problem 2.8, $\overline{\Sigma}$ is a $\sigma$-algebra. We extend $\mu$ to $\overline{\mu}$ on $\overline{\Sigma}$ by setting

$$\overline{\mu}(A \cup M) = \overline{\mu}(A) = \mu(A).$$

The extension $(\overline{\Sigma},\overline{\mu})$ of $(\Sigma,\mu)$, or just $\overline{\mu}$, is then said to be the completion of measure $\mu$ and, due to Problem 2.7, $(\Omega,\overline{\Sigma},\overline{\mu})$ is a measure space, called the completion of measure space $(\Omega,\Sigma,\mu)$. □

2.6 Example. Let $\Omega = \mathbb{R}$, $\Sigma = \{A \in \mathcal{P}(\mathbb{R}) : \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$, which is a $\sigma$-algebra on $\Omega$ (see Example 1.2 (vii), Chapter 4), and let $\varepsilon_1$ be the point mass. Both $A = \{\tfrac1n,\ n = 1,2,\dots\}$ and $A^c$ are elements of $\Sigma$ and $\varepsilon_1(A^c) = 0$. Obviously, $E = [2,\infty)$, as a subset of $A^c$, is negligible, but not measurable. Therefore, the measure space $(\mathbb{R},\Sigma,\varepsilon_1)$ is not complete. (See a more general case in Problem 2.14.) □
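Definition 2.5 can be traced on a finite analog of Example 2.6. The sketch below is an illustration under assumptions of our own, not the example of the text: the carrier is $\{1,2,3,4\}$, the $\sigma$-algebra is generated by the hypothetical partition $\{1\}$, $\{2,3\}$, $\{4\}$ with masses 1, 0, 2, and the names `generated_measure`, `negligible_sets`, `is_complete` and `completion` are illustrative.

```python
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})
# Hypothetical atoms of the sigma-algebra with their masses (an assumption).
ATOMS = {frozenset({1}): 1, frozenset({2, 3}): 0, frozenset({4}): 2}


def powerset(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def generated_measure(atoms):
    """Measure on the sigma-algebra of all unions of the given atoms."""
    mu = {}
    names = list(atoms)
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            mu[frozenset().union(*combo)] = sum(atoms[a] for a in combo)
    return mu


def negligible_sets(mu, omega):
    """Subsets of measurable null sets (Definition 2.5 (i))."""
    nulls = [n for n, mass in mu.items() if mass == 0]
    return {e for e in powerset(omega) if any(e <= n for n in nulls)}


def is_complete(mu, omega):
    """Complete means that every negligible set is measurable."""
    return all(e in mu for e in negligible_sets(mu, omega))


def completion(mu, omega):
    """Sets A | M with A measurable and M negligible; mu_bar(A | M) = mu(A)."""
    mu_bar = {}
    for a, mass in mu.items():
        for m in negligible_sets(mu, omega):
            # The value must not depend on the representation A | M.
            assert mu_bar.setdefault(a | m, mass) == mass
    return mu_bar


MU = generated_measure(ATOMS)
MU_BAR = completion(MU, OMEGA)
print(is_complete(MU, OMEGA), is_complete(MU_BAR, OMEGA), len(MU_BAR))
```

Here $\{2\}$ plays the role of the set $E$ of Example 2.6: it is negligible but not measurable, so the original space is not complete, whereas its completion is. The internal `assert` in `completion` checks, for this example only, that $\overline{\mu}$ is well defined.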
The proposition below is a paradigm of a complete measure space.

2.7 Proposition. The restriction $\mu_0^*$ of an outer measure $\mu^*$ to the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable subsets of $\Omega$ is complete and $(\Omega,\Sigma^*,\mu_0^*)$ is a complete measure space.

Proof. Since $\mu^*$ is defined on the whole $\mathcal{P}(\Omega)$, for any $\mu_0^*$-negligible subset $N \subseteq \Omega$, due to (2.1b), $\mu^*(N) = 0$ and, therefore, it is sufficient to show that $N$ is $\mu^*$-measurable. Let $Q \subseteq \Omega$. Due to monotonicity of the outer measure, $\mu^*(Q \cap N) = 0$ and $\mu^*(Q \cap N^c) \le \mu^*(Q)$ and this, along with (2.3), yields

$$\mu^*(Q) = \mu^*(Q \cap N) + \mu^*(Q \cap N^c)$$

and, hence, that $N \in \Sigma^*$. □

The following will be a construction of an outer measure by an arbitrary set function $\gamma$ defined on an arbitrary subcollection of sets $\mathcal{G} \subseteq \mathcal{P}(\Omega)$. As usual, we only assume that $\mathcal{G}$ contains the empty set and that $\gamma$, as a set function, is such that $\gamma(\emptyset) = 0$. This construction lies in the basis of the Carathéodory extension of the set function $\gamma$ to a measure on the $\sigma$-algebra $\Sigma(\mathcal{G})$. For any subset $Q \subseteq \Omega$, denote by $\mathfrak{C}_Q(\mathcal{G})$ the collection of all at most countable covers of set $Q$ by elements of $\mathcal{G}$. (Unless there is another subcollection, besides $\mathcal{G}$, under consideration, we will for brevity drop $\mathcal{G}$ in $\mathfrak{C}_Q(\mathcal{G})$.) Therefore, if $\mathfrak{C}_Q \neq \emptyset$, for any $\{G_n\} \in \mathfrak{C}_Q$, we have $Q \subseteq \bigcup_{n=1}^{\infty} G_n$.

2.8 Proposition.
The set function $\mu^*$ defined on $\mathcal{P}(\Omega)$ as

$$\mu^*(Q) = \begin{cases} \inf\Big\{\sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathfrak{C}_Q\Big\}, & \mathfrak{C}_Q \neq \emptyset, \\ \infty, & \mathfrak{C}_Q = \emptyset \end{cases} \tag{2.8}$$

is an outer measure.

Proof. We need to verify the above properties (2.1a-2.1c) of $\mu^*$ as an outer measure:

a) Since $\emptyset \in \mathcal{G}$ and $\gamma(\emptyset) = 0$, it follows that $\mu^*(\emptyset) = 0$.

b) We assume that both $\mu^*(A)$ and $\mu^*(B)$ are finite, since otherwise, the proof is obvious. If $A \subseteq B$, then $\mathfrak{C}_B \subseteq \mathfrak{C}_A$ and we can reach on $\mathfrak{C}_A$ a possibly smaller infimum than that on $\mathfrak{C}_B$. Therefore, $\mu^*(A) \le \mu^*(B)$.

c) Let $\{Q_i\} \subseteq \mathcal{P}(\Omega)$ and $Q = \bigcup_{i=1}^{\infty} Q_i$. If for at least one $i$, $\mathfrak{C}_{Q_i} = \emptyset$, then also $\mathfrak{C}_Q = \emptyset$ and (2.1c) holds trivially, both of its sides being infinite. The same is true if $\mu^*(Q_i) = \infty$ for some $i$. Otherwise, from (2.8) and by the definition of an infimum, it follows that for each $\varepsilon > 0$ and each $i$ there is a cover $\{G_{in},\ n = 1,2,\dots\} \in \mathfrak{C}_{Q_i}$ such that

$$\sum_{n=1}^{\infty} \gamma(G_{in}) \le \mu^*(Q_i) + \frac{\varepsilon}{2^i}.$$

Now, clearly $\{G_{in},\ i,n = 1,2,\dots\} \in \mathfrak{C}_Q$. Thus,

$$\mu^*(Q) \le \sum_{i=1}^{\infty} \sum_{n=1}^{\infty} \gamma(G_{in}) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon,$$

which, $\varepsilon$ being arbitrary, proves $\sigma$-subadditivity. □

We will call the couple $(\mathcal{G},\gamma)$ (a subset $\mathcal{G}$ of $\mathcal{P}(\Omega)$ and a set function $\gamma$ on $\mathcal{G}$) a formatter of outer measure $\mu^*$ defined by (2.8). As it has been shown, the formatter and, subsequently, the outer measure, induced the $\sigma$-algebra $\Sigma^*$, on which $\mu^*$ was a complete measure. When constructing a measure space $(\Omega,\Sigma^*,\mu_0^*)$ by $(\mathcal{G},\gamma)$, the major goal is to extend $\gamma$ from $\mathcal{G}$ to a measure, say $\mu$, acting on the smallest $\sigma$-algebra $\Sigma(\mathcal{G})$ generated by $\mathcal{G}$. This can be achieved by restricting $(\Sigma^*,\mu_0^*)$ to $(\Sigma(\mathcal{G}),\mu)$, given that $(\Sigma^*,\mu_0^*)$ itself is an extension of $(\mathcal{G},\gamma)$. The latter, however, is not guaranteed from the above construction, unless we impose some restrictions on the formatter $(\mathcal{G},\gamma)$, for even though $(\mathcal{G},\gamma)$ produces $(\Omega,\Sigma^*,\mu_0^*)$, $(\mathcal{G},\gamma)$ need not have all elements $\mu^*$-measurable. In other words, $\mathcal{G}$ need not be a subset of $\Sigma^*$. In addition, $\mu_0^*$ need not coincide with $\gamma$ on $\mathcal{G}$. For example, if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then, according to Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $\sum_{n=1}^{\infty} C_n$ is a decomposition of $G$ and

$$\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n).$$

Hence, in order that $\mu^*(G) = \gamma(G)$, $\gamma$ must be $\sigma$-additive on $\mathcal{G}$, which, in general, it is not.

Consequently, we call $(\Sigma^*,\mu_0^*)$ (produced by $(\mathcal{G},\gamma)$ in (2.8)) the complete Carathéodory extension of $(\mathcal{G},\gamma)$ if $\mathcal{G} \subseteq \Sigma^*$ and $\mathrm{Res}_{\mathcal{G}}\, \mu_0^* = \gamma$. If $(\Sigma^*,\mu_0^*)$ is the complete Carathéodory extension of $(\mathcal{G},\gamma)$, then the formatter $(\mathcal{G},\gamma)$ is said to be extendible and the corresponding restriction of $(\Sigma^*,\mu_0^*)$ to $(\Sigma(\mathcal{G}),\mu)$ is referred to as the Carathéodory extension of $(\mathcal{G},\gamma)$.

As mentioned above, one of the most important questions is what the formatter $(\mathcal{G},\gamma)$ should be in order to be extendible and, consequently, to generate the Carathéodory extension. By now, we have a fairly large choice of systems of sets and set functions on them, ranging from semi-rings to $\sigma$-algebras and from elementary contents to measures. The idea is, however, to select a possibly more rudimentary formatter $(\mathcal{G},\gamma)$, which is tame and suited to most common practical applications and constructions and such that $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$. In particular, this means that the elements of $\mathcal{G}$ have to be $\mu^*$-measurable. The theorem below, which is a crucial step in the whole extension procedure, implies that $(\mathcal{G},\gamma)$ can be a ring and premeasure to serve as a reasonable extendible formatter.

2.9 Theorem.
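(i) Let $(\mathcal{G},\gamma)$ be a semi-ring and elementary content, respectively, in $\Omega$, which produce the outer measure $\mu^*$ and $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable subsets of $\Omega$. Then $\mathcal{G} \subseteq \Sigma^*$.

(ii) If, in addition, $\gamma$ is $\sigma$-additive on $\mathcal{G}$, then $\gamma = \mathrm{Res}_{\mathcal{G}}\, \mu^*$ and therefore $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$.

Proof.

(i) We have to show that $\mathcal{G} \subseteq \Sigma^*$, i.e., that any element $G \in \mathcal{G}$ $\mu^*$-separates all subsets of $\Omega$. Take any subset $Q \subseteq \Omega$ with $\mathfrak{C}_Q \neq \emptyset$, since, otherwise, the proof would be trivial, and let $\mathcal{C} = \{C_n\}$ be any (countable) cover of $Q$ from $\mathfrak{C}_Q$. For a $G \in \mathcal{G}$ and $C_n \in \mathcal{C}$,

$$C_n = (C_n \cap G) + (C_n \setminus G). \tag{2.9}$$

Since $\mathcal{G}$ is a semi-ring, $C_n \cap G$ is an element of $\mathcal{G}$ and $C_n \setminus G$ can be represented as a finite union of pairwise disjoint elements of $\mathcal{G}$, say $\sum_{j=1}^{N_n} S_{jn}$. Consequently, (2.9) can be rewritten as

$$C_n = (C_n \cap G) + \sum_{j=1}^{N_n} S_{jn}$$

and, by finite additivity of $\gamma$,

$$\gamma(C_n) = \gamma(C_n \cap G) + \sum_{j=1}^{N_n} \gamma(S_{jn}). \tag{2.9a}$$

Now, suppose $\sum_{n=1}^{\infty} \gamma(C_n) < \infty$. Then, summing up all equations in (2.9a) over $n$ gives

$$\sum_{n=1}^{\infty} \gamma(C_n) = \sum_{n=1}^{\infty} \gamma(C_n \cap G) + \sum_{n=1}^{\infty} \gamma(S_n), \tag{2.9b}$$

where $\{S_n\}$ is the reordered sequence $\{S_{jn},\ j = 1,\dots,N_n,\ n = 1,2,\dots\}$. As $Q = (Q \cap G) + (Q \cap G^c)$, obviously, $\{C_n \cap G\} \subseteq \mathcal{G}$ and $\{S_n\} \subseteq \mathcal{G}$ are covers of $Q \cap G$ and $Q \cap G^c$, respectively. Consequently,

$$\mu^*(Q \cap G) \le \sum_{n=1}^{\infty} \gamma(C_n \cap G) \quad \text{and} \quad \mu^*(Q \cap G^c) \le \sum_{n=1}^{\infty} \gamma(S_n),$$

and then, by (2.9b),

$$\mu^*(Q \cap G) + \mu^*(Q \cap G^c) \le \sum_{n=1}^{\infty} \gamma(C_n).$$

Since this inequality holds for every cover $\mathcal{C}$ of $Q$, it should also hold for the infimum to yield

$$\mu^*(Q \cap G) + \mu^*(Q \cap G^c) \le \mu^*(Q). \tag{2.9c}$$

If $\sum_{n=1}^{\infty} \gamma(C_n) = \infty$, then the equation symbol in (2.9b) must be replaced by $\ge$ to yield (2.9c) again. The inverse inequality is due to (2.3). Therefore, $G$ separates all subsets of $\Omega$ and, consequently, $\mathcal{G} \subseteq \Sigma^*$.

(ii) By Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^{\infty} C_n$ and $\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n)$. Hence, if $\gamma$ is $\sigma$-additive, $\mu^*$ coincides with $\gamma$ on $\mathcal{G}$. These two facts warrant that $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$. □

2.10 Remarks.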
(i) One should bear in mind that, while $(\mathcal{G},\gamma)$ can be an extendible formatter for the outer measure $\mu^*$, $\mathcal{G}$ is not really a generator for $\Sigma^*$, as the latter need not be the smallest $\sigma$-algebra containing $\mathcal{G}$. We would like to make a clear distinction between these two terms. Recall that a family $\mathcal{G} \subseteq \mathcal{P}(\Omega)$ is said to be a generator of another family $\mathcal{G}_0$ ($\mathcal{G} \subseteq \mathcal{G}_0 \subseteq \mathcal{P}(\Omega)$) with a property $P$, if $\mathcal{G}_0$ is the intersection of all supercollections of $\mathcal{G}$ on each of which property $P$ holds. In our case, $\Sigma^*$ will eventually contain the smallest $\sigma$-algebra $\Sigma = \Sigma(\mathcal{G})$ and, in general, $\mu^*$ needs to be further restricted to this $\sigma$-algebra. From Theorem 2.9, we conclude that any elementary content $\gamma$ on a semi-ring $\mathcal{G}$, which is $\sigma$-additive, can be extended to a measure $\mu = \mathrm{Res}_{\Sigma(\mathcal{G})}\, \mu^*$ (acting on the smallest $\sigma$-algebra $\Sigma$ generated by $\mathcal{G}$). In other words, if $\gamma$ is a $\sigma$-additive elementary content on a semi-ring $\mathcal{G}$, then there exists at least one extension, namely, Carathéodory's extension.

(ii) From the proof of Theorem 2.9, it is obvious that a semi-ring with a $\sigma$-additive elementary content on it is one of the most economical systems good for the Carathéodory extension. However, it is often more prudent to work with premeasures on rings. In practice, to start with, one can first extend a semi-ring with an elementary content to the smallest ring with the content using the procedure of Theorem 2.5 (Chapter 4) and Proposition 2.11 below.

(iii) Another reasonable question arises: in how many different ways can a formatter $(\mathcal{G},\gamma)$ be extended to a measure on $\Sigma(\mathcal{G})$? Theorem 2.13 below states that with some relatively minor restriction (given in Remark 2.12) on a set function $\gamma$, the uniqueness of Carathéodory's extension is guaranteed. □

We will begin with one useful extension of an elementary content on a semi-ring to a content on the smallest ring containing the semi-ring.

2.11 Proposition. There is exactly one content on $\mathcal{R}(\mathcal{S})$, which coincides with the elementary content on $\mathcal{S}$. (See Problem 2.3.)

2.12 Remark. In Definition 1.1 (vii) we introduced the notion of $\sigma$-finiteness of a set function. Sometimes it is more convenient to use another definition of $\sigma$-finiteness, which is equivalent to 1.1 (vii) for a large class of set functions. Namely, the condition of having a monotone increasing sequence $\{G_n\} \uparrow \Omega$ from $\mathcal{G}$ with $\gamma(G_n) < \infty$ for all $n$ can be replaced by the equivalent condition that there is an at most countable partition $\{\Omega_1, \Omega_2, \dots\} \subseteq \mathcal{G}$ of $\Omega$ ($= \sum_{n=1}^{\infty} \Omega_n$) such that $\gamma(\Omega_n) < \infty$ for all $n$. For instance, rings with contents clearly provide a basis for such equivalence. For a semi-ring with elementary content, the first definition yields the second one, as we can arrange from $\{G_n\} \uparrow \Omega$ a countable decomposition; the converse is not true.

Another related notion we are going to use in the sequel is $\sigma$-finiteness of a set. Let $(\Omega,\Sigma,\mu)$ be a measure space. A measurable set $A$ is said to be $\sigma$-finite if $\mathrm{Res}_{\Sigma \cap A}\, \mu$ is $\sigma$-finite. □

2.13 Theorem. Let $\mathcal{G}$ be a $\cap$-stable generator of the $\sigma$-algebra $\Sigma(\mathcal{G})$ in $\Omega$ such that $\mathcal{G}$ contains a monotone increasing sequence $\{B_n\} \uparrow \Omega$.
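Let $\mu_1$ and $\mu_2$ be two measures on $\Sigma(\mathcal{G})$, which are $\sigma$-finite on $\{B_n\}$ and which coincide on $\mathcal{G}$. Then $\mu_1 = \mu_2$ on $\Sigma(\mathcal{G})$.

Proof. Let $A \in \mathcal{G}$ such that $\mu_1(A) = \mu_2(A) < \infty$ and let $\mathcal{D}_A = \{B \in \Sigma : \mu_1(A \cap B) = \mu_2(A \cap B)\}$. We show that $\mathcal{D}_A$ is a Dynkin system:

a) $A \in \mathcal{D}_A$ implies that $\Omega \in \mathcal{D}_A$.

b) Let $D \in \mathcal{D}_A$. Then $A \cap D^c = A \setminus D = A \setminus (A \cap D)$, which implies that

$$\mu_1(A \cap D^c) = \mu_1(A) - \mu_1(A \cap D) = \mu_2(A) - \mu_2(A \cap D) = \mu_2(A \cap D^c),$$

and this leads to $D^c \in \mathcal{D}_A$.

c) Let $\{D_n\}$ be a sequence of disjoint sets from $\mathcal{D}_A$. Then

$$\mu_1\Big(A \cap \sum_{n=1}^{\infty} D_n\Big) = \mu_1\Big(\sum_{n=1}^{\infty} A \cap D_n\Big) = \sum_{n=1}^{\infty} \mu_1(A \cap D_n) = \sum_{n=1}^{\infty} \mu_2(A \cap D_n) = \mu_2\Big(\sum_{n=1}^{\infty} A \cap D_n\Big) = \mu_2\Big(A \cap \sum_{n=1}^{\infty} D_n\Big).$$

Hence $\sum_{n=1}^{\infty} D_n \in \mathcal{D}_A$, and therefore $\mathcal{D}_A$ is a Dynkin system. Since obviously $\mathcal{G} \subseteq \mathcal{D}_A$, it follows that $\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_A$. Also, since $\mathcal{G}$ is $\cap$-stable, it follows that $\mathcal{D}(\mathcal{G})$ is a $\sigma$-algebra. Hence, we have

$$\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) = \Sigma(\mathcal{G}) \subseteq \mathcal{D}_A \subseteq \Sigma(\mathcal{G}),$$

leading to

$$\mathcal{D}(\mathcal{G}) = \Sigma(\mathcal{G}) = \mathcal{D}_A.$$

In particular, we proved that $\forall B \in \Sigma(\mathcal{G})$, $\mu_1(A \cap B) = \mu_2(A \cap B)$. Now let $\{B_n\}$ be a monotone increasing sequence of sets from $\mathcal{G}$ convergent to $\Omega$. Thus $\Sigma(\mathcal{G}) = \mathcal{D}_{B_n}$. Then $\forall n = 1,2,\dots$, and $\forall B \in \Sigma(\mathcal{G})$, $\mu_1(B_n \cap B) = \mu_2(B_n \cap B)$. Since $\{B_n \cap B\} \uparrow B$ and since $\mu_i(B \cap B_n) < \infty$, by Lemma 1.6,

$$\mu_1(B) = \lim_{n \to \infty} \mu_1(B \cap B_n) = \lim_{n \to \infty} \mu_2(B \cap B_n) = \mu_2(B).$$ □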
Now, by means of Theorem 2.13 we easily deduce the following significant statement.

2.14 Corollary. Let $\gamma$ be a $\sigma$-finite and $\sigma$-additive elementary content on a semi-ring $\mathcal{G}$. Then the Carathéodory extension of $\gamma$ to a measure on the $\sigma$-algebra $\Sigma(\mathcal{G})$ is a unique extension. □
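The whole construction, from formula (2.8) through Theorem 2.9, can be traced on a finite toy model. The sketch below is an illustration under assumptions of our own: the carrier is $\{0,1,2,3\}$, the semi-ring consists of the "half-open intervals" $[i,j) = \{i,\dots,j-1\}$, the elementary content is the length $j - i$, and, on a finite family with nonnegative $\gamma$, the countable covers in (2.8) are replaced by finite subfamilies. The names `interval`, `outer_measure` and `measurable_sets` are illustrative. A second, deliberately non-additive set function (the squared length) shows that $\mu^*$ need not coincide with $\gamma$ on the formatter.

```python
from itertools import combinations

N = 4
OMEGA = frozenset(range(N))


def interval(i, j):
    """The 'half-open interval' [i, j) inside {0, ..., N-1}."""
    return frozenset(range(i, j))


# Semi-ring of all intervals [i, j), the empty set included (an assumption).
SEMIRING = {interval(i, j) for i in range(N + 1) for j in range(i, N + 1)}


def powerset(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def outer_measure(gamma, family, omega):
    """Formula (2.8) by brute force over all finite subfamilies of `family`."""
    members = [g for g in family if g]
    subsets = powerset(omega)
    # The empty set is covered by the empty member, whose gamma-value is 0.
    mu = {q: (0 if not q else float("inf")) for q in subsets}
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            union = frozenset().union(*cover)
            cost = sum(gamma(g) for g in cover)
            for q in subsets:
                if q <= union and cost < mu[q]:
                    mu[q] = cost
    return mu


def measurable_sets(mu, omega):
    """Sets M separating every Q, as in (2.2)."""
    subsets = powerset(omega)
    return {m for m in subsets
            if all(mu[q] == mu[q & m] + mu[q - m] for q in subsets)}


MU_LEN = outer_measure(len, SEMIRING, OMEGA)                   # additive gamma
MU_SQ = outer_measure(lambda g: len(g) ** 2, SEMIRING, OMEGA)  # not additive

print(all(g in measurable_sets(MU_LEN, OMEGA) for g in SEMIRING))
print(MU_SQ[interval(0, 2)], len(interval(0, 2)) ** 2)
```

With the length as $\gamma$, every interval is $\mu^*$-measurable and $\mu^*$ agrees with $\gamma$ on the semi-ring, in line with Theorem 2.9; in this model $\mu^*$ is simply the counting measure. With the squared length, $\mu^*([0,2)) = 2$ while $\gamma([0,2)) = 4$, so the restriction of $\mu^*$ is not an extension of $\gamma$.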
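The lemmas below will be used for various purposes and, in particular, will lead to a relationship between the completion $(\Omega,\overline{\Sigma},\overline{\mu})$ of a measure space $(\Omega,\Sigma,\mu)$ and the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.

2.15 Lemma. Let $(\Omega,\mathcal{G},\gamma)$ be an extendible formatter of the outer measure $\mu^*$, and $\mathcal{G}_\sigma$ the collection of all at most countable unions of elements from $\mathcal{G}$. Then, for each $Q \subseteq \Omega$ and each $\varepsilon > 0$, there is a set $G \in \mathcal{G}_\sigma$ such that $G \supseteq Q$ and

$$\mu^*(G) \le \mu^*(Q) + \varepsilon. \tag{2.15}$$

Proof. Because $\mu^*$ is generated by $(\mathcal{G},\gamma)$,

$$\mu^*(Q) = \inf\Big\{\sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathfrak{C}_Q\Big\},$$

so that, for $\mu^*(Q) < \infty$, there is a cover $\{G_n\} \in \mathfrak{C}_Q$ such that

$$\mu^*(Q) + \varepsilon \ge \sum_{n=1}^{\infty} \gamma(G_n). \tag{2.15a}$$

Furthermore, for each $k$,

$$\sum_{n=1}^{\infty} \gamma(G_n) \ge \sum_{n=1}^{k} \gamma(G_n) = \sum_{n=1}^{k} \mu^*(G_n) \ge \mu^*\Big(\bigcup_{n=1}^{k} G_n\Big). \tag{2.15b}$$

Now, we make use of the fact that $(\mathcal{G},\gamma)$ is an extendible formatter. This implies that not only $\mathcal{G} \subseteq \Sigma^*$, but also $\mathcal{G}_\sigma \subseteq \Sigma^*$. Since $\bigcup_{n=1}^{k} G_n$ is monotone increasing in $k$ and $\mu^*\big(\bigcup_{n=1}^{k} G_n\big) < \infty$ for all $k$, by continuity from below (Lemma 1.6),

$$\lim_{k \to \infty} \mu^*\Big(\bigcup_{n=1}^{k} G_n\Big) = \mu^*\Big(\bigcup_{n=1}^{\infty} G_n\Big).$$

Passing to the limit in (2.15b), which holds true for all $k$, we prove (2.15) with $G = \bigcup_{n=1}^{\infty} G_n$ being the desired set. □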
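Lemma 2.16. Let $\mu^*$ be an outer measure, $\Sigma^*$ the $\sigma$-algebra of all $\mu^*$-measurable sets, and $A$ any subset of $\Omega$. If there is a $\mu^*$-measurable set $B$ such that $B \supset A$ and $\mu^*(B \setminus A) = 0$, then $A \in \Sigma^*$.

Proof. Since $B \in \Sigma^*$, it should $\mu^*$-separate every $Q \subset \Omega$:
$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \cap B^c). \tag{2.16}$$
Now, because $A \subset B$, we can easily show that
$$Q \cap A^c = (Q \cap B^c) + (Q \cap (B \setminus A)). \tag{2.16a}$$
From $Q \cap (B \setminus A) \subset B \setminus A$, it follows that $\mu^*(Q \cap (B \setminus A)) = 0$. From (2.16a),
$$\mu^*(Q \cap B^c) \le \mu^*(Q \cap A^c) \le \mu^*(Q \cap B^c) + \mu^*(Q \cap (B \setminus A)) = \mu^*(Q \cap B^c).$$
Consequently, we can replace $\mu^*(Q \cap B^c)$ in (2.16) by $\mu^*(Q \cap A^c)$. Finally, noticing that $Q \cap B \supset Q \cap A$, we have that
$$\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap A^c),$$
and this is the desired inequality. □

Lemma 2.17. Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, $\mu_0^*$ be $\mathrm{Res}_{\Sigma^*}\,\mu^*$, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$ such that $\mu_0^*(A^*) < \infty$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supset A^*$ and $\mu_0^*(B \setminus A^*) = 0$.

Proof. Since $\mu_0^*(A^*) < \infty$, $\mathcal{C}_{A^*}(\mathcal{G}) \ne \emptyset$. From Lemma 2.15, for every $\varepsilon > 0$, say $\frac{1}{m}$, there is a $G_\sigma^m \supset A^*$ such that $\mu_0^*(G_\sigma^m) \le \mu_0^*(A^*) + \frac{1}{m}$. The latter yields that $\mu_0^*(G_\sigma^m \setminus A^*) \le \frac{1}{m}$. Obviously, $\bigcap_{k=1}^m G_\sigma^k$ is still a superset of $A^*$, and since $\bigcap_{k=1}^m G_\sigma^k \subset G_\sigma^m$, it follows that
$$\mu_0^*(D_m) \le \frac{1}{m}, \tag{2.17}$$
where $D_m = \big(\bigcap_{k=1}^m G_\sigma^k\big) \setminus A^* \in \Sigma^*$. The sequence $\{D_m\}$ is clearly monotone nonincreasing and $\mu_0^*(D_1) < \infty$. Therefore, by continuity from above (see Theorem 1.7 (i)) of $\mu_0^*$ and because of (2.17),
$$\lim_{m \to \infty} \mu_0^*(D_m) = \mu_0^*\Big\{\Big(\bigcap_{k=1}^\infty G_\sigma^k\Big) \setminus A^*\Big\} = 0.$$
The set $\bigcap_{k=1}^\infty G_\sigma^k$ obviously meets the requirements on the set $B$ "promised" in the statement and we are done with the proof. □

Corollary 2.18. Let $\mu^*$ be the outer measure generated by a $\sigma$-finite extendible formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supset A^*$ and $\mu^*(B \setminus A^*) = 0$.

Proof. Since $(\mathcal{G}, \gamma)$ is $\sigma$-finite, there is a partition $\{H_1, H_2, \ldots\} \subset \mathcal{G}$ of $\Omega$ such that $\gamma(H_k) < \infty$. If $A^* \in \Sigma^*$, then $\{A_k^* = A^* \cap H_k,\ k = 1, 2, \ldots\}$ is a $\mu^*$-measurable partition of $A^*$, with $\mu^*(A_k^*) < \infty$ for every $k$, and to each of which we can apply Lemma 2.17 and have a set $B_k \in \Sigma(\mathcal{G})$, with $B_k \supset A_k^*$ and $\mu^*(B_k \setminus A_k^*) = 0$. Notice that since
$$\Big(\bigcup_{k=1}^\infty B_k\Big) \setminus \Big(\sum_{n=1}^\infty A_n^*\Big) = \Big(\bigcup_{k=1}^\infty B_k\Big) \cap \Big(\bigcap_{n=1}^\infty (A_n^*)^c\Big) \subset \bigcup_{k=1}^\infty (B_k \setminus A_k^*),$$
it holds true that
$$\mu^*\Big[\Big(\bigcup_{k=1}^\infty B_k\Big) \setminus \Big(\sum_{n=1}^\infty A_n^*\Big)\Big] \le \mu^*\Big[\bigcup_{k=1}^\infty (B_k \setminus A_k^*)\Big] \le \sum_{n=1}^\infty \mu^*(B_n \setminus A_n^*) = 0.$$
The statement follows after setting $B = \bigcup_{k=1}^\infty B_k$ ($\in \Sigma(\mathcal{G})$). □

Now, with the aid of the above propositions, we can finally answer the question about the relationship between the completion $(\Omega, \bar\Sigma, \bar\mu)$ of a measure space $(\Omega, \Sigma, \mu)$ and the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.

2.19 Theorem. Let $(\mathcal{G}, \gamma)$ be an extendible formatter for $(\Omega, \Sigma^*, \mu_0^*)$ and a generator for the measure space $(\Omega, \Sigma = \Sigma(\mathcal{G}), \mu = \mathrm{Res}_\Sigma\,\mu^*)$ whose completion is $(\Omega, \bar\Sigma, \bar\mu)$.

(i) Then, $\bar\Sigma \subset \Sigma^*$.

(ii) If $(\mathcal{G}, \gamma)$ is $\sigma$-finite, then $\bar\Sigma = \Sigma^*$ and $\bar\mu = \mu_0^*$.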
Proof.

(i) Obviously, $\bar\Sigma \subset \Sigma^*$ if and only if any element $\bar A$ of $\bar\Sigma$, which is of the form $A \cup N$, where $A \in \Sigma$ and $N$ is $\mu$-negligible, is $\mu^*$-measurable. According to Lemma 2.16, $A \cup N$ would be $\mu^*$-measurable if there is a $\mu^*$-measurable set $B$ such that $B \supset A \cup N$ and $\mu^*(B \setminus (A \cup N)) = 0$. By Definition 2.5 (i) of a $\mu$-negligible set, $N$ must have a $\Sigma$-measurable $\mu$-null superset, say $N_0$. (Note that even though, by Problem 2.10, $\mu^*(N) = 0$ and $\mu^*(A \cup N) = \mu^*(A)$, this does not warrant that $A \cup N \in \Sigma^*$.) Since $A \cup N_0$ is a superset of $A \cup N$ and, by Problem 2.11, $(A \cup N_0) \setminus (A \cup N)$ is a $\mu^*$-null set, $B = A \cup N_0$ meets all prerequisites of Lemma 2.16, which makes $A \cup N$ indeed $\mu^*$-measurable. This proves part (i) of the theorem.

(ii) Because of part (i), we need to show that $\Sigma^* \subset \bar\Sigma$, i.e., that each $A^*$ can be represented as the union of a $\mu$-measurable set and a $\mu$-negligible set. By Problem 2.12, for any $A^* \in \Sigma^*$, there is a $\Sigma$-measurable subset $B$ of $A^*$ such that $\mu^*(A^* \setminus B) = 0$. Obviously, $A^*$ can be decomposed as $B$ and the $\mu^*$-measurable null set $A^* \cap B^c$. It only remains to show that $A^* \cap B^c$ is $\mu$-negligible.

By Corollary 2.18, for $A^*$, there is a set $C \in \Sigma$ such that $C \supset A^*$ and $\mu^*(C \setminus A^*) = 0$. The set-difference $C \setminus B = (C \setminus A^*) + (A^* \setminus B)$, as the union of two $\mu^*$-null sets, is a $\mu^*$-null set, and therefore a $\mu$-null set (as $C \setminus B \in \Sigma$). This proves that $A^* \cap B^c$ is $\mu$-negligible.

Now, we show that $\bar\mu = \mu_0^*$. (Recall that they are equal on $\Sigma$.) Since $\bar\Sigma = \Sigma^*$, $A^* = A \cup N$, where $A \in \Sigma$ and $N$ is $\mu$-negligible, and
$$\bar\mu(A^*) = \mu(A) = \mu^*(A). \tag{2.19}$$
On the other hand, there is a $\mu$-null superset of $N$ to yield $\mu^*(N) = 0$ due to monotonicity of $\mu^*$. Finally, from the inequalities
$$\mu^*(A^*) \le \mu^*(A) + \mu^*(N) = \mu^*(A)$$
and
$$\mu^*(A^* = A \cup N) \ge \mu^*(A),$$
it follows that $\mu^*(A^*) = \mu^*(A)$ and this, along with (2.19), yields that $\bar\mu(A^*) = \mu^*(A^*)$ for each $A^* \in \Sigma^* = \bar\Sigma$. □

Example 2.20. If $(\Omega, \Sigma, \mu)$ is a probability space, it follows from Theorem 2.19 that the completion of $(\Sigma, \mu)$ coincides with $(\Sigma^*, \mu_0^*)$ produced by $(\Sigma, \mu)$ or by a "smaller generator" $(\mathcal{G}, \gamma)$ of $(\Sigma, \mu)$. □

A noteworthy question arises: if we have a semi-ring and a $\sigma$-additive elementary content, would it make any difference if we first extend them to the smallest ring and premeasure, according to Proposition 2.11, and then use the Caratheodory extension to arrive at the smallest $\sigma$-algebra and a measure on it, or apply the Caratheodory extension directly to that semi-ring and $\sigma$-additive elementary content? The same question applies, say, to a ring with a premeasure and the generated $\sigma$-algebra with a measure. The difference, if any, can apparently take place at the expense of two outer measures, induced by a formatter and its extension.

2.21 Theorem. Let $(\Omega, \mathcal{G}, \gamma_0)$ be an extendible formatter of outer measure $\mu^*$ and $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable sets and let $(\mathcal{E} = \mathcal{E}(\mathcal{G}), \gamma)$ be an extension of $(\mathcal{G}, \gamma_0)$ and an extendible formatter of outer measure $\nu^*$ and […] $> 0$, there is a cover $\{G_{nk},\ k = 1, 2, \ldots\} \in \mathcal{C}_{E_n}(\mathcal{G})$, such that
$$\sum_{k=1}^\infty \gamma_0(G_{nk}) < \mu^*(E_n) + \varepsilon 2^{-n-1} = \gamma(E_n) + \varepsilon 2^{-n-1}. \tag{2.21b}$$
Because $\{G_{nk},\ n, k = 1, 2, \ldots\} \in \mathcal{C}_Q(\mathcal{G})$, from (2.21a) and (2.21b),
$$\mu^*(Q) \le \sum_{n=1}^\infty \sum_{k=1}^\infty \gamma_0(G_{nk}) < \nu^*(Q) + \frac{\varepsilon}{2} + \frac{\varepsilon}{2}. \tag{2.21c}$$
Finally, taking in (2.21c) $\varepsilon = \frac{1}{n}$ leads to the inverse of inequality (2.21) and proves that $\mu^* = \nu^*$ on […]

With […] and, consequently, […] we meet all conditions of Theorem 2.21 to have $\tilde\mu^* = \bar\mu^* = \mu^*$.

4) $\Sigma^* = \tilde\Sigma^* = \bar\Sigma^*$, also by Theorem 2.21. □

For instance, $\mathcal{E}$ can be a ring generated by $\mathcal{S}$ and $\gamma$ the extension of the elementary content $\gamma_0$ in accordance with Proposition 2.11; or $\mathcal{E}$ can be an algebra with $\gamma$ as a premeasure; or $\mathcal{E}$ can even be the $\sigma$-algebra $\Sigma(\mathcal{S})$. In particular, it follows that, once the Caratheodory extension from $(\mathcal{S}, \gamma_0)$ to $(\Sigma(\mathcal{S}), \mu)$ is rendered, another Caratheodory extension of $(\mathcal{E}, \gamma)$ would be redundant.

Another consequence of Theorem 2.21 is the uniqueness of outer measures generated by measures.

Corollary 2.23. Let $\mu$ be a measure on a $\sigma$-algebra $\Sigma$, which produces the outer measure $\mu^*$ with $\sigma$-algebra $\Sigma^*$ of measurable sets. If there is another outer measure $\tilde\mu^*$, then $\tilde\mu^* = \mu^*$ on $\mathcal{P}(\Omega)$ and $\tilde\Sigma^* = \Sigma^*$.

Proof. This is a direct application of Theorem 2.21 with the following identification of the above characteristics:

1) Let $\tilde\mu$ be a measure on $\Sigma$ such that $\tilde\mu = \mu$. Then $\tilde\mu$ can serve as an extension of $(\Sigma, \mu)$.

2) $\Sigma \subseteq \Sigma^*$.

3) $\mu = \tilde\mu = \mathrm{Res}_\Sigma\,\mu^*$. □

Remark 2.24. Corollary 2.23 is useful in various applications of Caratheodory's extension. Suppose $\gamma_1$ and $\gamma_2$ are two elementary contents coinciding on a $\sigma$-finite semi-ring $\mathcal{S}$ (i.e. they are $\sigma$-finite on $\mathcal{S}$). By Corollary 2.14, their respective Caratheodory extensions $\mu_1$ and $\mu_2$ must coincide on $\Sigma(\mathcal{S})$. Let $\mu_1^*$ and $\mu_2^*$ be the corresponding outer measures, according to Corollary 2.22, produced by $\gamma_1$ and $\gamma_2$ or $\mu_1$ and $\mu_2$ (regardless). By Corollary 2.23, $\mu_1^* = \mu_2^*$ on $\mathcal{P}(\Omega)$ and $\Sigma_1^* = \Sigma_2^*$. □

As in Theorem 2.21, by comparing two measures generated by a set function acting on a collection of sets and their extension, we ended up comparing the two corresponding outer measures they produce. It seems reasonable to raise another question: what if an outer measure produces another outer measure? Would this make any difference? More specifically, can the restriction $\mu_0^*$ of an outer measure $\mu^*$ on $\Sigma^*$ become a formatter of another outer measure, different from $\mu^*$? Note that this is a different scenario from the one considered in Theorem 2.21, since here $\mu^*$ is not supposed to be generated by a formatter and it "acts on its own." The following example shows us this distinction.

2.25 Example. Consider $\mathcal{P}(\Omega)$, $\mu^*$, $\Sigma^*$, and $\mu_0^*$ in Example 2.4 (i):
$\Omega = \{a, b, c\}$, $A = \{a\}$, $A^c = \{b, c\}$, $P = \{b\}$, $Q = \{c\}$, $R = \{a, b\}$, $S = \{a, c\}$,
$$\mu^*(\emptyset) = 0,\quad \mu^*(\Omega) = 4,\quad \mu^*(A) = 1,\quad \mu^*(A^c) = \mu^*(R) = \mu^*(S) = 3,\quad \mu^*(P) = \mu^*(Q) = 2.$$
Then, generate the outer measure $\nu^*$ by $(\Sigma^*, \mu_0^*)$. So, we have: $\mu^* = \nu^*$ on $\Sigma^*$ and
$$\nu^*(P) = \mu^*(A^c) = 3\ (> \mu^*(P) = 2),$$
$$\nu^*(Q) = \mu^*(A^c) = 3\ (> \mu^*(Q) = 2),$$
$$\nu^*(R) = \mu^*(\Omega) = 4\ (> \mu^*(R) = 3),$$
$$\nu^*(S) = \mu^*(\Omega) = 4\ (> \mu^*(S) = 3). \qquad \Box$$

As we see it, in most cases $\nu^*$ is strictly greater than $\mu^*$ […] on $Q$ and $\mu^*(A) = \mu^*(Q)$. □ (See Problem 2.17.)

2.27 Remark. If $\mu^*$ is generated by an extendible formatter $(\mathcal{G}, \gamma)$, then clearly $\mu^* = \nu^*$, due to Theorem 2.21, as $(\mu_0^*, \Sigma^*)$ can serve as an extension of $(\mathcal{G}, \gamma)$. Alternatively, if $Q \subset \Omega$ […] and $\mu^*(G_\sigma) < \mu^*(Q) + \varepsilon$. We assume that $\nu^*$ is the outer measure generated by $\mu_0^*$. Since $\mu^* = \nu^*$ on $\Sigma^*$ and $G_\sigma \in \Sigma^*$, we have $\mu^*(G_\sigma) = \nu^*(G_\sigma)$ and, by monotonicity, $\nu^*(G_\sigma) \ge \nu^*(Q)$. Thus, we have
$$\nu^*(Q) \le \nu^*(G_\sigma) = \mu^*(G_\sigma) < \mu^*(Q) + \varepsilon,$$
which yields $\nu^*(Q) \le \mu^*(Q)$. The inverse inequality is due to Problem 2.1. □
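The computation in Example 2.25 can be checked by brute force. The sketch below is only an illustration and not part of the original text: it tabulates $\mu^*$ on the power set of $\{a, b, c\}$, finds the $\mu^*$-measurable sets with the Caratheodory criterion, and then evaluates $\nu^*$ as the cheapest cover by measurable sets. The names `MU_STAR`, `SIGMA_STAR`, `is_measurable` and `nu_star` are hypothetical helpers, and on a finite space it is assumed that finite covers suffice.

```python
from itertools import combinations

OMEGA = frozenset("abc")


def subsets(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Outer measure mu* of Example 2.25, tabulated on the power set.
MU_STAR = {
    frozenset(): 0, frozenset("a"): 1, frozenset("b"): 2, frozenset("c"): 2,
    frozenset("ab"): 3, frozenset("ac"): 3, frozenset("bc"): 3, OMEGA: 4,
}


def is_measurable(a):
    """Caratheodory criterion: a splits every test set q additively."""
    return all(MU_STAR[q] == MU_STAR[q & a] + MU_STAR[q - a]
               for q in subsets(OMEGA))


# The sigma-algebra of mu*-measurable sets.
SIGMA_STAR = [a for a in subsets(OMEGA) if is_measurable(a)]


def nu_star(q):
    """Outer measure generated by (SIGMA_STAR, mu* restricted to it)."""
    costs = []
    for r in range(1, len(SIGMA_STAR) + 1):
        for family in combinations(SIGMA_STAR, r):
            if q <= frozenset().union(*family):   # family covers q
                costs.append(sum(MU_STAR[g] for g in family))
    return min(costs)


for q in subsets(OMEGA):
    print(sorted(q), MU_STAR[q], nu_star(q))
```

Under these assumptions the measurable sets come out as $\emptyset$, $A$, $A^c$, $\Omega$, and the printed table shows $\nu^*$ exceeding $\mu^*$ exactly on $P$, $Q$, $R$, $S$.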
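2.28 Theorem. Let $(\Omega, \Sigma, \mu)$ be a measure space such that $\Sigma = \Sigma(\mathcal{S})$ with $\mathcal{S}$ being a semi-ring, and $\mu$ be $\sigma$-finite on $\mathcal{S}$. Then, given $A \in \Sigma(\mathcal{S})$ and $\varepsilon > 0$, there is a disjoint countable cover $\{S_n\} \subset \mathcal{S}$ of $A$ that "approximates" $A$, i.e. such that $A \subset \sum_{n=1}^\infty S_n$ and $\mu\big(\big(\sum_{n=1}^\infty S_n\big) \setminus A\big) < \varepsilon$.

Proof. Let $\gamma = \mathrm{Res}_{\mathcal{S}}\,\mu$ and $\mu^*$ be the outer measure produced by $(\mathcal{S}, \gamma)$. Then, $\mu$ is the unique Caratheodory extension of $\gamma$ from $\mathcal{S}$ to $\Sigma(\mathcal{S})$, according to Corollary 2.14, and $\mu = \mathrm{Res}_\Sigma\,\mu^*$.

Case 1. Let $\mu(A) = \mu^*(A) < \infty$. Then, by (2.8) (of Proposition 2.8), for each $\varepsilon > 0$, there is a sequence $\{G_n\} \in \mathcal{C}_A(\mathcal{S})$ such that
$$\sum_{n=1}^\infty \gamma(G_n) = \sum_{n=1}^\infty \mu(G_n) < \mu(A) + \varepsilon.$$
Since $\mu\big(\bigcup_{n=1}^\infty G_n\big) \le \sum_{n=1}^\infty \mu(G_n)$, we have that
$$\mu\Big(\Big\{\bigcup_{n=1}^\infty G_n\Big\} \setminus A\Big) < \varepsilon.$$

Case 2. $\mu(A)$ is arbitrary. By $\sigma$-finiteness of $\mu$ on $\mathcal{S}$, there is an at most countable decomposition $\sum_{k=1}^\infty \Omega_k$ of $\Omega$ by $\{\Omega_k\} \subset \mathcal{S}$ such that $\mu(\Omega_k) < \infty$ and hence $\mu(A \cap \Omega_k) < \infty$. Now, we apply the above arguments to $A \cap \Omega_k$ and $\frac{\varepsilon}{2^k}$. Hence, there is a sequence $\{G_{nk}\} \subset \mathcal{S} \cap \Omega_k$ such that
$$\mu\Big(\Big(\bigcup_{n=1}^\infty G_{nk}\Big) \setminus (A \cap \Omega_k)\Big) < \frac{\varepsilon}{2^k}.$$
This leads to
$$\mu\Big(\Big(\bigcup_{k=1}^\infty \bigcup_{n=1}^\infty G_{nk}\Big) \setminus A\Big) \le \sum_{k=1}^\infty \mu\Big(\Big(\bigcup_{n=1}^\infty G_{nk}\Big) \setminus (A \cap \Omega_k)\Big) < \varepsilon$$
and thus to
$$\mu\Big(\Big(\bigcup_{n=1}^\infty G_n\Big) \setminus A\Big) < \varepsilon,$$
where $G_n := \sum_{k=1}^\infty G_{nk}$. Finally, it remains to form a disjoint sequence of semi-ring sets as stated in the theorem. The latter can be rendered in the same way as in Lemma 2.5 of Chapter 4. □

2.29 Corollary. If in the condition of Theorem 2.28, $\mu(A) < \infty$, then $A$ can be approximated by just a finite tuple of disjoint semi-ring sets.

Proof. Because $\mu(A) < \infty$, by Case 1 of Theorem 2.28, so is $\mu\big(\sum_{n=1}^\infty S_n\big) < \infty$. Then, by continuity from above, for each $\varepsilon > 0$, there is an $N$ such that
$$\mu\Big(\sum_{k=N+1}^\infty S_k\Big) < \varepsilon,$$
thereby leading to
$$\mu\Big(A\,\Delta\,\Big(\sum_{k=1}^N S_k\Big)\Big) < 2\varepsilon. \qquad \Box$$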
PROBLEMS

2.1 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$, $\mu_0^* = \mathrm{Res}_{\Sigma^*}\,\mu^*$, and $\nu^*$ be the outer measure induced by $\mu_0^*$. Show that $\mu^* \le \nu^*$ on $\mathcal{P}(\Omega)$.

2.2 Let $(\mathcal{G}, \gamma)$ be a formatter of the outer measure $\mu^*$ defined by (2.8). Show that if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^\infty C_n$ and $\mu^*(G) = \sum_{n=1}^\infty \gamma(C_n)$.

2.3 Prove Proposition 2.11.

2.4 Let $\mu$ be a finite measure on $(\Omega, \Sigma)$ and let $\mathcal{E}$ be any subcollection of $\Sigma$. Show that, for any fixed subset $Q \subset \Omega$, it is true that $\inf\{\mu(A) : Q \subset A\} = \mu(\Omega) - \sup\{\mu(A^c) : Q \subset A\}$.

2.5 Show that the original definition of $\sigma$-finiteness 1.1 (vii) implies the second definition of $\sigma$-finiteness for semi-rings and elementary contents mentioned in Remark 2.12.

2.6 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $\{A_n\}$ a sequence of disjoint $\mu^*$-measurable sets. Show that for any $Q \subset \Omega$, $\mu^*\big(Q \cap \sum_{n=1}^\infty A_n\big) = \sum_{n=1}^\infty \mu^*(Q \cap A_n)$.

2.7 Let $N \in \mathcal{N}_\mu$ (i.e., a $\mu$-null set) and let $B \in \Sigma$. Show that $\mu(N \cup B) = \mu(B \setminus N) = \mu(B)$.

2.8 Show that $\bar\Sigma$ defined in Definition 2.5 (ii) is a $\sigma$-algebra, $\bar\mu$ is a measure, that this extension does not depend upon representations of sets of $\bar\Sigma$, and that $(\Omega, \bar\Sigma, \bar\mu)$ is complete.

2.9 Show that the measure space defined in Example 1.2 (iv) is complete.

2.10 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $N \subset \Omega$ be such that $\mu^*(N) = 0$. Show that for any subset $Q \subset \Omega$, $\mu^*(Q \cup N) = \mu^*(Q)$.

2.11 Show that $(A \cup N_0) \setminus (A \cup N)$ in part (i) of Theorem 2.19 is a $\mu^*$-null set.

2.12 Let $\mu^*$ be the outer measure generated by an extendible $\sigma$-finite formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \subset A^*$ and $\mu^*(A^* \setminus B) = 0$.

2.13 Let $(\Omega, \Sigma_0, \mu_0)$ be a completion of a measure space $(\Omega, \Sigma, \mu)$. Define for each $A \subset \Omega$
$$\underline{\mu}(A) = \sup\{\mu(B) : B \in \Sigma,\ B \subset A\} \quad \text{and} \quad \overline{\mu}(A) = \inf\{\mu(B) : B \in \Sigma,\ A \subset B\}.$$
Show that
a) if $A \in \Sigma_0$, then $\underline{\mu}(A) = \overline{\mu}(A) = \mu_0(A)$;
b) if $\underline{\mu}(A) = \overline{\mu}(A) < \infty$, then $A \in \Sigma_0$.

2.14 Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $a \in \Omega$. Show that for $\{a\} \in \Sigma$ the measure space $(\Omega, \Sigma, \varepsilon_a)$ is complete if and only if $\Sigma = \mathcal{P}(\Omega)$.

2.15 (Generalization of Problem 1.12.) Let $(\Omega, \Sigma, \mu)$ be a measure space, $\mathcal{G}_\infty = \{G \in \Sigma : \mu(G) < \infty\}$, $\Sigma_\infty = \{Q \subset \Omega : Q \cap G \in \Sigma,\ \forall G \in \mathcal{G}_\infty\}$, and $\mu$ be $\sigma$-finite. Show that $\Sigma_\infty = \Sigma$.

2.16 In the condition of Problem 1.13, show that if $\mu$ is complete, then so is $\mu_\infty$.

2.17 Prove Proposition 2.26.

2.18 Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G}, \gamma)$ on a non-empty set $\Omega$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that a subset $N \subset \Omega$ is negligible if and only if $\mu^*(N) = 0$.
NEW TERMS:

outer measure 235
monotonicity of outer measure 236
subadditivity of outer measure 236
$\mu^*$-measurable set 236
$\mu^*$-separability 236
Caratheodory's Extension Theorem 236
$\mu_0^*$-measure 236
$\Sigma^*$-$\sigma$-algebra 236
restriction of a function 240
extension of a function 240
$\mu$-null (null) set 240
null ($\mu$-null) set 240
$\mathcal{N}_\mu$-set 240
$\mu$-negligible (negligible) set 240
negligible ($\mu$-negligible) set 240
extension of a measure 240
completion of a measure 240
completion of a measure space 240
restriction of outer measure to $\Sigma^*$-algebra 241
Caratheodory's extension 241, 242
formatter of an outer measure 242
complete Caratheodory's extension 242
extendible formatter 242
extendibility of a formatter, criterion of 243
$\sigma$-finiteness of a set function 245
Caratheodory's extension, uniqueness of 245, 246
3. LEBESGUE AND LEBESGUE-STIELTJES MEASURES

In this section, we will use the results of the previous section for the construction of Lebesgue and Lebesgue-Stieltjes measures. We have learned that to warrant the Caratheodory extension, a given formatter should be at least a semi-ring and $\sigma$-additive elementary content, which applies to some special cases of formatters in Euclidean spaces. In Theorem 3.1 below, we will show that the Lebesgue content is $\sigma$-additive on the ring $\mathcal{R}(\mathbb{R}^n)$, which will clearly yield that the Lebesgue elementary content is also $\sigma$-additive on the semi-ring of half open intervals. Although it is possible to prove this statement directly (cf. Problem 3.25, with no prior extension and $\emptyset$-continuity arguments, as in Theorem 3.1), we prefer first to extend the elementary content to the ring, as we want to exploit the equivalence of $\emptyset$-continuity and $\sigma$-additivity. The latter, as we know, can be observed on set families not lesser than rings.

Theorem 3.1. The Lebesgue content $\lambda^c$ on the ring $\mathcal{R}(\mathbb{R}^n)$ is $\sigma$-additive, i.e. a premeasure.

Proof. Since the Lebesgue content $\lambda^c$ is finite on $\mathcal{R}$, by Proposition 1.7 (ii), $\lambda^c$ is a premeasure if it is $\emptyset$-continuous. We shall be using an equivalent version of $\emptyset$-continuity: for every monotone decreasing sequence $\{A_n\} \downarrow\ \subset \mathcal{R}$ with $\lambda^c(A_1) < \infty$, the assumption that $\lim_{n \to \infty} \lambda^c(A_n)$ (which clearly exists) is strictly positive must yield that $\bigcap_{n=1}^\infty A_n \ne \emptyset$.

Let $\{A_n\}$ be any such monotone decreasing sequence with
$$\lim_{n \to \infty} \lambda^c(A_n) = \varepsilon > 0. \tag{3.1}$$
It is readily seen that (3.1) implies that $A_n \ne \emptyset$ for each $n$, and therefore, by Cantor's Theorem 5.4, Chapter 2, $\bigcap_{n=1}^\infty \bar A_n \ne \emptyset$. However, the nonempty intersection of the closures of the $A_n$'s need not yield that the intersection $\bigcap_{n=1}^\infty A_n \ne \emptyset$ either. To overcome this difficulty we will construct a sequence of compact subsets of the $A_n$'s with the desired above property.

Now, since the $A_n$'s $\in \mathcal{R}$, each $A_n$ can be represented as a finite union of disjoint half open parallelepipeds, say $\sum_{s=1}^S P_s$ (for brevity let us drop the index $n$), such that $\lambda^c(P_s) > 0$. Then for each value of $n$ and for every $P_s$ there is a half open parallelepiped $\Pi_s$ whose closure $\bar\Pi_s$ is a proper subset of $P_s$ and such that
$$\lambda^c(P_s) \le \lambda^c(\Pi_s) + 2^{-s}\,2^{-n}\,\varepsilon. \tag{3.1a}$$
Bound (3.1a) yields that
$$\lambda^c(A_n) - \lambda^c(B_n) \le 2^{-n}\,\varepsilon, \tag{3.1b}$$
where $B_n = \sum_{s=1}^S \Pi_s$. Obviously, $\bar B_n \subset A_n$. It seems like we are done with the sequence $\{\bar B_n\}$. However, the claim that $\bigcap_{n=1}^\infty \bar B_n \ne \emptyset$ is unwarranted, as $\{\bar B_n\}$ need not be monotone decreasing. Therefore, we define
$$C_n = \bigcap_{k=1}^n B_k,$$
which forms a monotone nonincreasing sequence of sets term-wise dominated by $\{A_n\}$. Now, we need to show that $C_n \ne \emptyset$. We shall be able to prove a much stronger statement, that $\lambda^c(C_n) > 0$ for all $n$. Namely, we will prove that
$$\lambda^c(C_n) \ge \lambda^c(A_n) - \varepsilon\big(1 - 2^{-n}\big), \tag{3.1c}$$
which, because of $\lambda^c(A_n) \ge \varepsilon$, would yield the desired
$$\lambda^c(C_n) \ge 2^{-n}\,\varepsilon > 0. \tag{3.1d}$$
We prove (3.1c) by induction. For $n = 1$, (3.1c) holds true, since from (3.1b),
$$\lambda^c(C_1) = \lambda^c(B_1) \ge \lambda^c(A_1) - \tfrac{1}{2}\varepsilon.$$
Now we assume that (3.1c) holds for some $n \ge 1$ and show the validity of (3.1c) for $n + 1$. Because of $C_{n+1} = B_{n+1} \cap C_n$ and Proposition 1.5 (ii),
$$\lambda^c(C_{n+1}) = \lambda^c(B_{n+1}) + \lambda^c(C_n) - \lambda^c(B_{n+1} \cup C_n). \tag{3.1e}$$
Due to (3.1e), the inequality $\lambda^c(B_{n+1}) \ge \lambda^c(A_{n+1}) - 2^{-(n+1)}\varepsilon$ (from (3.1b) for $n + 1$), and the assumption that (3.1c) holds true for some fixed $n$, we have
$$\lambda^c(C_{n+1}) \ge \lambda^c(A_{n+1}) + \lambda^c(A_n) - \lambda^c(B_{n+1} \cup C_n) - \varepsilon\big(1 - 2^{-(n+1)}\big).$$
Since obviously $B_{n+1} \cup C_n \subset A_n$, we have $\lambda^c(A_n) \ge \lambda^c(B_{n+1} \cup C_n)$, and hence
$$\lambda^c(C_{n+1}) \ge \lambda^c(A_{n+1}) - \varepsilon\big(1 - 2^{-(n+1)}\big) + \lambda^c(A_n) - \lambda^c(B_{n+1} \cup C_n) \ge \lambda^c(A_{n+1}) - \varepsilon\big(1 - 2^{-(n+1)}\big).$$
This proves (3.1c) and (3.1d) and thereby yields that $\{\bar C_n\}$ is a monotone nonincreasing sequence of nonempty compact sets; hence, by Cantor's Theorem 5.4, Chapter 2,
$$\emptyset \ne \bigcap_{n=1}^\infty \bar C_n \subset \bigcap_{n=1}^\infty A_n.$$
Consequently, it shows that $\lambda^c$ is indeed a premeasure on the ring $\mathcal{R}$. □
3.2 Remarks and Definitions.

(i) Theorem 3.1 states that the Lebesgue content on $\mathcal{R}(\mathcal{S})$ in $\mathbb{R}^n$ is $\sigma$-additive. This, obviously, implies that the Lebesgue elementary content is also $\sigma$-additive on $\mathcal{S}$.

(ii) In Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. Now, by the use of Proposition 2.11, Corollary 2.14, and Theorem 3.1, we can have the couple $(\mathcal{R}, \lambda^c)$ or, in light of Remark (i), even $(\mathcal{S}, \lambda_0)$ as an extendible formatter of the outer measure $\lambda^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Lebesgue outer measure. The $\sigma$-algebra $\Sigma^* \subset \mathcal{P}(\mathbb{R}^n)$ of all $\lambda^*$-measurable sets, in notation $L^*$, called the Lebesgue $\sigma$-algebra of measurable sets, along with $\lambda_0^* = \mathrm{Res}_{L^*}\,\lambda^*$, called the Lebesgue measure, will form a complete measure space, according to Proposition 2.7. The further restriction $\lambda = \mathrm{Res}_{\Sigma(\mathcal{S})}\,\lambda^*$ of the Lebesgue outer measure on the smallest $\sigma$-algebra generated by $\mathcal{S}$ (which, according to Theorem 2.7, Chapter 4, is identical to the smallest $\sigma$-algebra generated by the usual topology) or, equivalently, by $\mathcal{R}$, known as the Borel $\sigma$-algebra $\mathcal{B}$, is called the Borel-Lebesgue measure. Its completion coincides with the Lebesgue measure $\lambda_0^*$ on $L^*$, and the corresponding completion of the Borel $\sigma$-algebra coincides with the $\sigma$-algebra $L^*$ of Lebesgue measurable sets.

Both Lebesgue and Borel-Lebesgue measures have their strengths and weaknesses. The Borel-Lebesgue measure acts on the Borel $\sigma$-algebra, which stems from the usual topology and preserves some topological properties. The Borel-Lebesgue measure is also an element of a very important class of Borel measures. However, unlike Lebesgue measure, Borel-Lebesgue measure is not complete. □

3.3 Definitions.
(i) Let […] $> 0$, there is a cover $\{I_k\} \in \mathcal{C}_N$ such that […], which proves the first part of the statement.

Conversely, let $\varepsilon > 0$ and let $\{I_k\} \subset \mathcal{S}$ be a countable cover of $N$ with the property that $\sum_{k=1}^\infty \lambda_0(I_k) < \varepsilon$. Then,
$$\lambda^*(N) \le \lambda^*\Big(\bigcup_{k=1}^\infty I_k\Big) \le \sum_{k=1}^\infty \lambda_0(I_k) < \varepsilon$$
and hence, by Problem 2.18, $N$ is a $\lambda$-negligible set. □

3.7 Lemma. Let $f : \mathbb{R} \to \mathbb{R}$ be an additive function, continuous at zero. Then, $f$ is linear.

Proof. First note that
$$f(0) + f(0) = f(0 + 0) = f(0). \tag{3.7a}$$
This yields that $f(0) = 0$. Then, from
$$0 = f(0) = f(x - x) = f(x) + f(-x) \tag{3.7b}$$
it follows that $f(x) = -f(-x)$ and thus $f$ is odd. Now, let $n$ be any positive integer. Then, since $f$ is additive,
$$f(nx) = n f(x). \tag{3.7c}$$
If $n$ is a negative integer, then, from (3.7b-c),
$$f(nx) = f\big((-n)(-1)x\big) = -f(-nx) = -(-n) f(x) = n f(x).$$
Hence, for each $n \in \mathbb{Z}$,
$$f(nx) = n f(x), \tag{3.7d}$$
which yields that, for each integer $k \ne 0$,
$$f\Big(\frac{x}{k}\Big) = \frac{1}{k} f(x). \tag{3.7f}$$
Combining (3.7d) and (3.7f) we have that for each integer $m$,
$$f\Big(\frac{m}{k}x\Big) = m f\Big(\frac{x}{k}\Big) = \frac{m}{k} f(x).$$
In other words, for each rational number $q$,
$$f(qx) = q f(x).$$
Since $f$ is continuous at zero and because $f$ is additive and odd, we have from
$$f(x - y) = f(x) + f(-y) = f(x) - f(y)$$
that $f$ is continuous on $\mathbb{R}$. Now, let $r \in \mathbb{R}$. Then, there is a sequence $\{q_n\}$ of rationals convergent to $r$. Due to continuity of $f$,
$$\lim_{n \to \infty} f(q_n) = f(r). \tag{3.7g}$$
On the other hand, $f(q_n \cdot 1) = q_n f(1)$ and (3.7g) lead to
$$f(r) = \lim_{n \to \infty} f(q_n) = f(1) \lim_{n \to \infty} q_n = f(1)\, r.$$
This shows that $f$ is a linear function $f(x) = cx$, where $c = f(1)$. □

3.8 Corollary. Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuous at zero and additive for each variable separately. Then
$$f(x_1, \ldots, x_n) = c\, x_1 \cdots x_n,$$
where $c = f(1, \ldots, 1)$.

Proof. If $x_2, \ldots, x_n$ are fixed, then by Lemma 3.7,
$$f(x_1, x_2, \ldots, x_n) = x_1 f(1, x_2, \ldots, x_n).$$
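Applying the same procedure successively to the other variables we have the statement. □

3.9 Definition. A Borel measure $\mu$ on […]

3.10 Theorem. Let $\mu$ be a translation-invariant Borel measure on $\mathcal{B}(\mathbb{R}^n)$. Then $\mu = c\lambda$, where $\lambda$ is the Borel-Lebesgue measure on $\mathcal{B}(\mathbb{R}^n)$ and $c = \mu(C)$ ($C$ stands for a unit cube).

Proof. For each $x \in \mathbb{R}$, define
$$I_x = \begin{cases} [x, 0), & x < 0 \\ \emptyset, & x = 0 \\ [0, x), & x > 0 \end{cases} \qquad \text{and} \qquad \operatorname{sgn} x = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0. \end{cases}$$
Denote
$$f(x_1, \ldots, x_n) = \operatorname{sgn}\Big(\prod_{i=1}^n x_i\Big)\, \mu\Big(\prod_{i=1}^n I_{x_i}\Big). \tag{3.10}$$
We show that $f$ defined in (3.10) is additive and continuous in each variable separately. Without loss of generality, we show it with respect to $x_1$. Let $x_1 = x + y$.

Case 1. Suppose $x > 0$ and $y > 0$. Then, $I_{x+y} = [0, x + y) = [0, x) + [x, x + y)$ and
$$I_{x+y} \times I_{x_2} \times \cdots \times I_{x_n} = R_1 + R_2,$$
where
$$R_1 = I_x \times I_{x_2} \times \cdots \times I_{x_n} \quad \text{and} \quad R_2 = [x, x + y) \times I_{x_2} \times \cdots \times I_{x_n}.$$
Since $x$, $y$, and $x + y$ are all positive,
$$\operatorname{sgn}\big((x + y)\, x_2 \cdots x_n\big) = \operatorname{sgn}(x\, x_2 \cdots x_n) = \operatorname{sgn}(y\, x_2 \cdots x_n). \tag{3.10a}$$
From (3.10a), and since $\mu$ is translation invariant,
$$f(x + y, x_2, \ldots, x_n) = f(x, x_2, \ldots, x_n) + f(y, x_2, \ldots, x_n).$$

Case 2. Suppose $x + y > 0$ and $x > 0$, $y < 0$. Then,
$$\operatorname{sgn}\big((x + y)\, x_2 \cdots x_n\big) = \operatorname{sgn}(x\, x_2 \cdots x_n) = -\operatorname{sgn}(y\, x_2 \cdots x_n). \tag{3.10b}$$
Since
$$I_{x+y} = [0, x + y) = [0, x) \setminus [x + y, x),$$
$\lambda([x + y, x)) = \lambda([y, 0))$, and because $\mu$ is translation invariant, using (3.10b) we have that
$$f(x + y, x_2, \ldots, x_n) = f(x, x_2, \ldots, x_n) + f(y, x_2, \ldots, x_n).$$

Case 3. $x + y > 0$ and $x < 0$, $y > 0$ is the same as Case 2. The other combinations of $x$ and $y$ are left for the reader. (See Problem 3.20.)

Now, we prove continuity of $f$ at zero. Let $\{a_k\}$ be a sequence convergent to zero from the right. Then, $\{a_k\} \subset \mathbb{R}_+$ and the sequence of sets $\{I_{a_k}\}$ is such that
$$\bigcap_{k=1}^\infty I_{a_k} = \{0\}.$$
The latter yields that
$$\bigcap_{k=1}^\infty \{I_{a_k} \times I_{x_2} \times \cdots \times I_{x_n}\} = I_0 \times I_{x_2} \times \cdots \times I_{x_n}.$$
By the definition, $I_0 = \emptyset$; and by continuity from above of $\mu$, we have that
$$\lim_{k \to \infty} f(a_k, x_2, \ldots, x_n) = 0.$$
Similarly, by continuity from below of $\mu$, we have that
$$\lim_{k \to \infty} f(a_k, x_2, \ldots, x_n) = 0$$
for $\{a_k\} \uparrow 0$. In addition, $f(0, x_2, \ldots, x_n) = 0$ is by the definition of $f$.

By Corollary 3.8,
$$f(x_1, \ldots, x_n) = f(1, \ldots, 1)\, x_1 \cdots x_n = \operatorname{sgn}(1 \cdots 1)\, x_1 \cdots x_n\, \mu(C), \tag{3.10c}$$
where $C = [0, 1) \times \cdots \times [0, 1)$. On the other hand,
$$f(x_1, \ldots, x_n) = \operatorname{sgn}\Big(\prod_{i=1}^n x_i\Big)\, \mu\Big(\prod_{i=1}^n I_{x_i}\Big),$$
which, along with (3.10c), gives
$$\mu\Big(\prod_{i=1}^n I_{x_i}\Big) = \frac{x_1 \cdots x_n}{\operatorname{sgn}(x_1 \cdots x_n)}\, \mu(C). \tag{3.10d}$$
Note that
$$\lambda\Big(\prod_{i=1}^n I_{x_i}\Big) = \frac{x_1 \cdots x_n}{\operatorname{sgn}(x_1 \cdots x_n)}. \tag{3.10e}$$
Equations (3.10d) and (3.10e) tell us that for any rectangle $R$ all of whose sides lie on the corresponding coordinate axes,
$$\mu(R) = \mu(C)\, \lambda(R). \tag{3.10f}$$
For an arbitrarily positioned rectangle $R$ all of whose sides are parallel to the corresponding coordinate axes, (3.10f) still holds true due to the translation invariance of $\mu$.

By $\mu_0 = \mu(C)\lambda_0$ we define an elementary content on the semi-ring $\mathcal{S}$ of half open rectangles. Then, by $\tilde\mu = \mu(C)\lambda$ we also have a Borel measure on $\mathcal{B}$. Now, we have three Borel measures on $\mathcal{B}$: $\tilde\mu$, $\mu$, and the (unique, as $\mathcal{S}$ is $\cap$-stable) extension of $\mu_0$ from $\mathcal{S}$ to […]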
$$\mathbb{R} = \bigcup_{q \in \mathbb{Q}} (q + A) = \sum_{q \in \mathbb{Q}} (q + A). \tag{P3.16a}$$

4) Finally, let $\tilde Q = \mathbb{Q} \cap (0, 1]$. Then $\bigcup_{q \in \tilde Q} (q + A) \subset (0, 2]$. If $A$ is a Borel set (and this is the assumption that will lead to a contradiction), then by Problem 3.11, $x + A$ is a Borel set too; and by the translation-invariance of Borel-Lebesgue measure $\lambda$, $\lambda(x + A) = \lambda(A)$, implying that
$$\lambda\Big(\bigcup_{q \in \tilde Q} (q + A)\Big) = \sum_{q \in \tilde Q} \lambda(q + A) \le \lambda((0, 2]) = 2.$$
Thus the above series is finite; and since the $\lambda(q + A)$ values are equal for all $q \in \tilde Q$, each of them must be zero, which implies that $\lambda(q + A) = 0$, $\forall q \in \mathbb{Q}$. But
$$\mathbb{R} = \sum_{q \in \mathbb{Q}} (q + A) \ \Rightarrow\ \lambda(\mathbb{R}) = \sum_{q \in \mathbb{Q}} \lambda(q + A) = 0,$$
which is an absurdity. Thus, our assumption that $A$ is a Borel set was wrong.]

3.17 Let $\lambda$ denote the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$. Show that for each Borel set $B$ and $\varepsilon > 0$, there is a countable cover of $B$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\lambda\Big(\Big(\sum_{k=1}^\infty C_k\Big) \setminus B\Big) < \varepsilon.$$
In particular,
$$\lambda(B) = \inf\Big\{\sum_{k=1}^\infty \lambda_0(C_k) : \{C_k\} \in \mathcal{C}_B(\text{cubes})\Big\}.$$
[Hint: Use Problem 3.15.]

3.18 Show that if $N$ is a negligible set in $(\mathbb{R}^n, \mathcal{B}, \lambda)$, for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^\infty \lambda_0(C_k) < \varepsilon.$$

3.19 Show that if $N$ is a subset of $\mathbb{R}^n$, and for each $\varepsilon > 0$, there is a countable cover of $N$ by semi-open (not necessarily disjoint) cubes such that
$$\sum_{k=1}^\infty \lambda_0(C_k) < \varepsilon,$$
then $N$ is negligible.

3.20 Show additivity of $f$ in Theorem 3.10 for the other combinations of $x$ and $y$.

3.21 Let $\mu$ be a translation invariant Borel measure on […]

[…] be the Borel-Lebesgue measure on the Borel $\sigma$-algebra […]

1) Let […]
$$\lambda f^*((a, b]) = \frac{1}{\alpha^n}\, \lambda((a, b]) \quad \text{for } \alpha > 0,$$
$$\lambda f^*((a, b]) = \frac{(-1)^n}{\alpha^n}\, \lambda((a, b]) \quad \text{for } \alpha < 0,$$
and thus
$$\lambda f^*((a, b]) = \frac{1}{|\alpha|^n}\, \lambda((a, b]).$$
As a continuous map relative to the usual topology, $f$ is Borel and, consequently, $\lambda f^*$ is a Borel measure on $\mathcal{B}$. Obviously, $\frac{1}{|\alpha|^n}\lambda$ is also a Borel measure on $\mathcal{B}$. Since $\lambda f^*$ and $\frac{1}{|\alpha|^n}\lambda$ are $\sigma$-finite on $\mathcal{S}$ and coincide on $\mathcal{S}$ (being a $\cap$-stable generator of $\mathcal{B}$), and since $\frac{1}{|\alpha|^n}\lambda$ is $\sigma$-additive, by Corollary 2.14, they should also coincide on […]

4.8 […] on $L^*$ is translation-invariant.

4.9 Let $\mu$ be a translation-invariant Borel measure on $\mathcal{B}(\mathbb{R}^n)$ and let $\mu^*$ be the outer measure produced by […]
$A_0(n) = \{f > 0\}$ […] $A_1(n) = $ […] In other words, […] and $A_i(n) = f^*((n, \infty])$, $i = n2^n$. Therefore, all sets $A_i(n)$, $i = 0, \ldots, n2^n$, are disjoint and obviously $\Sigma$-measurable. Let us define
$$s_n = \sum_{i=0}^{n2^n} \frac{i}{2^n}\, \mathbf{1}_{A_i(n)}.$$
Both $f$ and $s_n$ are depicted in Figure 6.2.
Figure 6.2

Clearly $s_{n+1} \ge s_n$. Besides, $s_n(\omega) \le f(\omega) < s_n(\omega) + 2^{-n}$, $\forall \omega \in \Omega$: $f(\omega) < n$, and $f(\omega) > n$, $\forall \omega \in \Omega$: $f(\omega) = \infty$. Functions $s_n$ and $s_{n+1}$ are drawn in Figure 6.3.
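The two properties just stated, monotonicity in $n$ and the bound $s_n \le f < s_n + 2^{-n}$ where $f < n$, can be checked pointwise. The following is a minimal sketch and not the book's construction verbatim: it assumes the common variant with left-closed dyadic intervals $[i2^{-n}, (i+1)2^{-n})$ and caps the value at $n$; the function name `s_n` and the sample values are hypothetical.

```python
import math


def s_n(f_value, n):
    """Value of the n-th dyadic simple approximation at a point where
    the nonnegative function takes the value f_value (possibly +inf)."""
    if f_value >= n:                 # also covers f_value = +infinity
        return float(n)
    # Round down to the nearest multiple of 2**(-n); scaling by powers
    # of two is exact in binary floating point.
    return math.floor(f_value * 2**n) / 2**n


samples = [0.0, 0.3, 1.75, math.pi, 10.0, math.inf]
for value in samples:
    print(value, [s_n(value, n) for n in range(1, 6)])
```

Each printed row is nondecreasing in $n$ and approaches the sampled value from below, which is the pointwise content of $\sup\{s_n\} = f$.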
Figure 6.3

Thus there exists $\sup\{s_n\} = f$ (pointwise $\forall \omega \in \Omega$), and therefore $f \in \mathbb{F}_+$, implying that $\mathcal{C}_+^{-1} \subset \mathbb{F}_+$. This proves that $\mathcal{C}_+^{-1} = \mathbb{F}_+$. □

PROBLEMS

6.1 Prove Proposition 6.3.

6.2 Let $\Omega$ be an uncountable set and let $\Sigma = \{A \subset \Omega : A \text{ or } A^c \text{ is at most countable}\}$. Show that $f \in \mathcal{C}^{-1}(\Omega, \Sigma)$ if and only if $f$ is constant everywhere except on an at most countable subset of $\Omega$.
NEW TERMS:

nonnegative simple functions 288
canonic representation (expansion) 289
canonic expansion (representation) 289
semi-linear space 289
semi-linear lattice 289
quasiring 289
quasialgebra 289
semi-$-space 289
simple function 289
closed semi-$-space 290

Chapter 6

Elements of Integration
The historical significance of the development of measure theory is that it created a base for a generalization of the classical Riemann notion of the definite integral ( which since 1854 was considered to be the most general theory of integration ) . Riemann defined a bounded function over an interval [a,b] to be integrable if and only if the Darboux ( or Cauchy ) sums I: ':_ 1 /( t i ) >.. ( I i ), where I: ':_ 1 I i ' is a finite decomposition of [a, b) into subi � tervals, approach a uni que limiting value whenever the length of the largest interval goes to zero. A French mathematician, Henri Lebesgue (1 875-1941), assumed that the above intervals I i may be substituted by more general measurable sets and that the class of Riemann integrable functions can be enlarged to the class of measurable functions. In this case, we arrive at a more solid theory of integration, which is better suited for dealing with various limit processes and which greatly contributed to the contemporary theory of probability and stochastic processes. Although many results existed prior to Lebesgue's major work be tween 1901 and 19 10, Lebesgue's construction appeared to be the most ef ficient. After 1910, a large number of mathematicians began to engage in work initiated by Lebesgue. Some of the most significant contributions were made by the Frenchman Pierre Fatou (1878-1 929), Italian Guido Fubini (1 897-1943), Hungarian Frigyes ( Frederic ) Riesz (1880- 1 956), Pole Otto Nikodym (1887-1974), and Austrian Johann Radon (1 887-1 956) who developed the Lebesgue-Stieltjes integral and whose work led to the modern abstract theory of measure and integration. In this chapter, we will first be concerned with the main principles of integration with respect to arbitrary measures. We will be using standard techniques developed for Lebesgue integration but without sacrificing the generality . Then various applications of the integral will be considered. 
We will look at the integral as a measure (and later, in Chapter 8, in the general case, as a "signed measure"), at Radon-Nikodym derivatives, at decomposition of measures and decomposition of absolutely continuous functions, and at "multiple integration." Other applications of integration (including uniform integrability) and various principles of convergence will be developed in Chapter 8.
1. INTEGRATION ON $\mathcal{C}^{-1}(\Omega,\Sigma)$
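We begin the theory of integration with integrals of nonnegative simple functions, which we introduced in Section 6, Chapter 5. Prior to the definition of the rudimentary integral, the lemma below states that integrals of nonnegative simple functions do not depend on their representations.

1.1 Lemma. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $s \in \Psi_+(\Omega,\Sigma)$ have two representations:
$$s = \sum_{i=1}^n a_i 1_{A_i} = \sum_{k=1}^m b_k 1_{B_k}.$$
Then it holds that
$$\sum_{i=1}^n a_i\,\mu(A_i) = \sum_{k=1}^m b_k\,\mu(B_k).$$
Proof. The above representations are due to the two decompositions of $\Omega$:
$$\Omega = \sum_{i=1}^n A_i = \sum_{k=1}^m B_k.$$
Then
$$A_i = \sum_{k=1}^m A_i \cap B_k \quad\text{and}\quad B_k = \sum_{i=1}^n A_i \cap B_k,$$
which implies that
$$\sum_{i=1}^n a_i\,\mu(A_i) = \sum_{i=1}^n \sum_{k=1}^m a_i\,\mu(A_i \cap B_k)$$
and
$$\sum_{k=1}^m b_k\,\mu(B_k) = \sum_{k=1}^m \sum_{i=1}^n b_k\,\mu(A_i \cap B_k).$$
By noticing that $a_i = b_k$ on $A_i \cap B_k$, we are done with the proof. □

1.2 Definition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $s \in \Psi_+(\Omega,\Sigma)$ with the representation
$$s = \sum_{i=1}^n a_i 1_{A_i}.$$
Then the number
$$\sum_{i=1}^n a_i\,\mu(A_i)$$
is called the integral of $s$ with respect to $\mu$, and it is denoted by one of the symbols:
$$\int s(\omega)\,d\mu(\omega) \quad\text{or}\quad \int s(\omega)\,\mu(d\omega) \quad\text{or, shortly,}\quad \int s\,d\mu.$$
□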
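Lemma 1.1 and Definition 1.2 can be checked on a small example. The sketch below is only an illustration: the finite space, the point weights of `mu`, and the helper `integrate_simple` are hypothetical choices, not part of the text. It evaluates $\sum_i a_i\mu(A_i)$ for two different representations of the same simple function and compares the results.

```python
from fractions import Fraction

def integrate_simple(representation, mu):
    """Integral of s = sum_i a_i * 1_{A_i} with respect to mu: sum_i a_i * mu(A_i).

    `representation` is a list of pairs (a_i, A_i) with disjoint sets A_i;
    `mu` maps each point of a finite space to its weight (an assumption made
    only so that mu(A) can be computed by summing point weights).
    """
    return sum(a * sum(mu[w] for w in A) for a, A in representation)

# Hypothetical finite measure space: Omega = {0,...,5} with point weights (w+1)/2.
mu = {w: Fraction(w + 1, 2) for w in range(6)}

# Two representations of the same s (s = 2 on {0,1,2}, s = 5 on {3,4,5}).
rep_a = [(2, {0, 1, 2}), (5, {3, 4, 5})]
rep_b = [(2, {0}), (2, {1, 2}), (5, {3}), (5, {4, 5})]

value_a = integrate_simple(rep_a, mu)
value_b = integrate_simple(rep_b, mu)
print(value_a, value_b)
```

Both calls return the same number, as Lemma 1.1 predicts; exact rational arithmetic is used so the comparison is not blurred by rounding.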
Since the value of the integral of a function $s$ does not depend upon its representation, this definition is consistent. In other words, the integral $s \mapsto \int s\,d\mu$ defines a functional on $\Psi_+$ with values in $\overline{\mathbb{R}}_+$.

1.3 Proposition (Properties of the integral).
(i) For each measurable set $A \in \Sigma$,
$$\int 1_A\,d\mu = \mu(A).$$
(ii) The integral is a nonnegative linear functional, i.e.,
$$\int (as + bt)\,d\mu = a\int s\,d\mu + b\int t\,d\mu, \quad\text{where } s,t \in \Psi_+ \text{ and } a,b \in \mathbb{R}_+.$$
(iii) For any two nonnegative simple functions, $s,t \in \Psi_+$, such that $s \le t$, it holds that $\int s\,d\mu \le \int t\,d\mu$ (monotonicity).
(See Problem 1.1.)

1.4 Example. Let $f$ be the Dirichlet function defined as $f = 1_{\mathbb{Q}}$ (earlier introduced in Example 4.7, Chapter 2), where $\mathbb{Q}$ is the set of all rational numbers (hence a Borel set). Thus $f \in \Psi_+(\mathbb{R},\mathcal{B})$, and its integral with respect to the Lebesgue measure $\lambda$ is
$$\int f\,d\lambda = 1\cdot\lambda(\mathbb{Q}) = \sum_{x \in \mathbb{Q}} \lambda(\{x\}) = 0.$$
□
For the upcoming definitions and statements we will denote a monotone nondecreasing sequence of functions by $\{f_n\}\uparrow$ and a monotone nonincreasing sequence of functions by $\{f_n\}\downarrow$.

1.5 Lemma. Let $\{s_n\}\uparrow \subset \Psi_+$ and $s \in \Psi_+$ such that $s \le \sup\{s_n\}$. Then
$$\int s\,d\mu \le \sup\Big\{\int s_n\,d\mu\Big\}.$$
Proof. Let $s = \sum_{i=1}^m a_i 1_{A_i}$ and let $\varepsilon > 0$ be any small number. Denote
$$B_n = \{\omega : s_n \ge (1-\varepsilon)s\} \ (\in \Sigma).$$
Thus $s_n \ge s(1-\varepsilon)1_{B_n}$. By Proposition 1.3 (ii,iii),
$$\int s_n\,d\mu \ge (1-\varepsilon)\int s1_{B_n}\,d\mu.$$
By the definition of $\{s_n\}$, it follows that $\{B_n\}\uparrow\Omega$, which implies that $\{A_i \cap B_n\}\uparrow A_i$. Therefore, by continuity from below of $\mu$ (Lemma 1.6, Chapter 5),
$$\int s\,d\mu = \sum_{i=1}^m a_i\mu(A_i) = \sum_{i=1}^m a_i \lim_{n\to\infty}\mu(A_i \cap B_n) = \lim_{n\to\infty}\sum_{i=1}^m a_i\mu(A_i \cap B_n) = \lim_{n\to\infty}\int s1_{B_n}\,d\mu.$$
The last equation is due to the relationship
$$s1_{B_n} = \sum_{i=1}^m a_i 1_{A_i \cap B_n}.$$
Thus,
$$\sup\Big\{\int s_n\,d\mu\Big\} = \lim_{n\to\infty}\int s_n\,d\mu \ge (1-\varepsilon)\lim_{n\to\infty}\int s1_{B_n}\,d\mu = (1-\varepsilon)\int s\,d\mu,$$
which proves the statement because the inequality holds for each $\varepsilon > 0$. □

1.6 Corollary. For $\{s_n\}\uparrow, \{t_n\}\uparrow \subset \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$, it holds that
$$\sup\Big\{\int s_n\,d\mu\Big\} = \sup\Big\{\int t_n\,d\mu\Big\}.$$
□
(See Problem 1.2.)

Let us now turn to the integral of the functions from the more general class $\mathcal{C}_+^{-1} = \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}}_+)$, which we first became familiar with in Theorem 6.5, Chapter 5.

1.7 Definition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}$. By Theorem 6.5, Chapter 5, there is a monotone nondecreasing sequence $\{s_n\}\uparrow \subset \Psi_+$ such that $f = \sup\{s_n\}$. Hence, it is plausible to define
$$\int f\,d\mu = \sup\Big\{\int s_n\,d\mu\Big\}$$
and call it the integral of (an extended real-valued nonnegative function) $f$ with respect to measure $\mu$. By Corollary 1.6, the value of the integral, $\int f\,d\mu$, is unique. □

Analogous to Proposition 1.3 (ii,iii), we have:

1.8 Proposition. The integral introduced in Definition 1.7 is a positive, linear, monotone nondecreasing functional on $\mathcal{C}_+^{-1}$.
Proof. Let $f,g \in \mathcal{C}_+^{-1}$ and $a,b \in \mathbb{R}_+$. Then
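$f = \sup\{s_n\}$, $g = \sup\{t_n\}$ and $af + bg = \sup\{as_n + bt_n\}$ yield that
$$\int (af + bg)\,d\mu = \sup\Big\{\int (as_n + bt_n)\,d\mu\Big\},$$
which, by Proposition 1.3 (ii), equals
$$\sup\Big\{a\int s_n\,d\mu + b\int t_n\,d\mu\Big\} = a\sup\Big\{\int s_n\,d\mu\Big\} + b\sup\Big\{\int t_n\,d\mu\Big\} = a\int f\,d\mu + b\int g\,d\mu.$$
Now let $f \le g$. Then we have
$$s_k \le \sup\{t_n\} \quad\text{for each } k.$$
Thus, by Lemma 1.5,
$$\int s_k\,d\mu \le \sup\Big\{\int t_n\,d\mu\Big\} = \int g\,d\mu,$$
and finally,
$$\int f\,d\mu = \sup\Big\{\int s_k\,d\mu\Big\} \le \int g\,d\mu.$$
□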
1.9 Examples.
(i) Let $\varepsilon_a$ be a point mass on a measurable space $(\Omega,\Sigma)$ for some $a \in \Omega$ and let $s = \sum_{i=1}^n a_i 1_{A_i} \in \Psi_+(\Omega,\Sigma)$ be such that $s(a) = a_{i_0}$, for some $i_0 \in \{1,\dots,n\}$. Then
$$\int s\,d\varepsilon_a = \sum_{i=1}^n a_i\,\varepsilon_a(A_i) = a_{i_0}\cdot 1 = s(a).$$
Now let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then there is a sequence $\{s_n\}\uparrow \subset \Psi_+$ such that $f = \sup\{s_n\}$. Thus $\int f\,d\varepsilon_a = \sup\{s_n(a)\} = f(a)$. Similarly, if $\mu = c\,\varepsilon_a$ (for some $c > 0$), $\int f\,d\mu = cf(a)$.

(ii) Let
$$\mu = \sum_{i=0}^\infty c_i\,\varepsilon_{a_i}.$$
By Problem 1.3,
$$\int f\,d\mu = \sum_{i=0}^\infty c_i f(a_i).$$
Specifically, if $a_i = i$ and $c_i = \binom{n}{i}p^i(1-p)^{n-i}$ for $i = 0,\dots,n$, then $\mu$ is the binomial measure $\beta_{n,p}$. (See Example 1.8 (iii), Chapter 5.) Furthermore, if $f(x) = e^{tx}$, for $t \in \mathbb{R}$, then the transform of the binomial measure
$$\int e^{tx}\,\beta_{n,p}(dx) = \sum_{i=0}^n \binom{n}{i}p^i(1-p)^{n-i}e^{ti} = (pe^t + 1 - p)^n$$
is a function in $t$ and is referred to as the moment generating function. In the general definition, $t$ is allowed to run over the complex plane $\mathbb{C}$.

(iii) Let $(\Omega,\Sigma,\mu)$ be the measure space with $\Omega = [0,1]$, $\Sigma = \mathcal{B}([0,1])$, and $\mu = \lambda$ (Borel-Lebesgue measure on $\mathcal{B}([0,1])$). Let $C$ be the Cantor set and $G_n$ be the open intervals of the Borel-Lebesgue measure $\frac{1}{3}\big(\frac{2}{3}\big)^{n-1}$ (introduced in Example 3.11, Chapter 5). Let us define the function
$$f(x) = \begin{cases} 1, & x \in C \\ \frac{1}{2^n}, & x \in G_n,\ n = 1,2,\dots \end{cases}$$
We are going to evaluate the integral $\int_{[0,1]} f(x)\,\lambda(dx)$ (with respect to the Borel-Lebesgue measure). First of all, we have to identify the function $f$, which can be represented in the form $f = \sup\{s_n\}$, where
$$s_n(x) = \begin{cases} 1, & x \in C \\ 0, & x \in [0,1]\setminus(G_1 \cup \dots \cup G_n \cup C) \\ \frac{1}{2^k}, & x \in G_k,\ k = 1,\dots,n. \end{cases}$$
Clearly, $s_n \in \Psi_+([0,1],\mathcal{B}[0,1])$ and $f(x) = \sup\{s_n(x)\}$. Thus $f \in \mathcal{C}_+^{-1}([0,1],\mathcal{B}([0,1]))$ and hence
$$\int_{[0,1]} f(x)\,\lambda(dx) = \sup_n \int_{[0,1]} s_n(x)\,\lambda(dx) = \sup_n\Big[1\cdot\lambda(C) + 0\cdot\lambda\big([0,1]\setminus(G_1 \cup \dots \cup G_n \cup C)\big) + \sum_{k=1}^n \frac{1}{2^k}\lambda(G_k)\Big].$$
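The supremum in Example 1.9 (iii) can be explored numerically. The sketch below rests on two assumptions that should be checked against Example 3.11, Chapter 5: that $\lambda(C) = 0$ and that $\lambda(G_k) = \frac{1}{3}\big(\frac{2}{3}\big)^{k-1}$ (the total length removed at step $k$ of the standard Cantor construction). The function name `integral_s_n` is illustrative only.

```python
from fractions import Fraction

def integral_s_n(n):
    """Integral of s_n: 1 * lambda(C) + sum_{k=1}^n 2^{-k} * lambda(G_k).

    Assumptions (not taken from the text): lambda(C) = 0 and
    lambda(G_k) = (1/3) * (2/3)^(k-1).
    """
    lam_C = Fraction(0)
    total = 1 * lam_C
    for k in range(1, n + 1):
        lam_G_k = Fraction(1, 3) * Fraction(2, 3) ** (k - 1)
        total += Fraction(1, 2 ** k) * lam_G_k
    return total

# The integrals of s_n increase with n; under the stated assumptions each
# term equals (1/6) * (1/3)^(k-1), so the partial sums are (1/4) * (1 - 3^(-n)).
partial = [integral_s_n(n) for n in (1, 2, 5, 20)]
print([float(v) for v in partial])
```

Under these assumptions the increasing sequence of integrals approaches $1/4$, which would then be the value of $\int_{[0,1]} f\,d\lambda$ by Definition 1.7.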
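(iv) Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$. Then $\mu = \sum_{n=1}^\infty \mu_n$ is a measure on $\Sigma$; and for an $A \in \Sigma$, the integral of the indicator function $1_A$ is
$$\int 1_A\,d\mu = \mu(A) = \sum_{n=1}^\infty \mu_n(A) = \sum_{n=1}^\infty \int 1_A\,d\mu_n.$$
Let $s = \sum_{k=1}^m a_k 1_{A_k} \in \Psi_+(\Omega,\Sigma)$. Then
$$\int s\,d\mu = \sum_{k=1}^m a_k\mu(A_k) = \sum_{k=1}^m a_k \sum_{n=1}^\infty \mu_n(A_k) = \sum_{n=1}^\infty \sum_{k=1}^m a_k\mu_n(A_k) = \sum_{n=1}^\infty \int s\,d\mu_n. \tag{1.9}$$
Now, for $f \in \mathcal{C}_+^{-1}$, we have $f = \sup\{s_j\}$ such that $\{s_j\}\uparrow \subset \Psi_+$. Let $b_{jn} = \sum_{i=1}^n \int s_j\,d\mu_i$. Since $\{b_{jn}\}$ is monotone increasing in $j$ and in $n$,
$$\sup_j \sup_n b_{jn} = \sup_n \sup_j b_{jn},$$
which yields that
$$\sup_j \sum_{i=1}^\infty \int s_j\,d\mu_i = \sup_n \sum_{i=1}^n \int f\,d\mu_i = \sum_{i=1}^\infty \int f\,d\mu_i. \tag{1.9a}$$
Therefore,
$$\int f\,d\mu = \sup_j \int s_j\,d\mu \overset{(1.9)}{=} \sup_j \sum_{i=1}^\infty \int s_j\,d\mu_i \overset{(1.9a)}{=} \sum_{i=1}^\infty \int f\,d\mu_i.$$
Thus we showed that
$$\int f\,d\Big(\sum_{n=1}^\infty \mu_n\Big) = \sum_{n=1}^\infty \int f\,d\mu_n.$$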
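Examples 1.9 (ii) and (iv) combine into a simple computational rule: a discrete measure is a sum of weighted point masses, so the integral reduces to a weighted sum of function values. The sketch below illustrates this for the binomial measure; the helper names `integrate_discrete` and `binomial_atoms` and the parameter values are illustrative assumptions.

```python
from math import comb, exp, isclose

def integrate_discrete(f, atoms):
    """Integral of f with respect to mu = sum_i c_i * eps_{a_i}: sum_i c_i * f(a_i).

    `atoms` is a list of pairs (a_i, c_i).
    """
    return sum(c * f(a) for a, c in atoms)

def binomial_atoms(n, p):
    """Atoms (i, C(n,i) p^i (1-p)^(n-i)), i = 0..n, of the binomial measure."""
    return [(i, comb(n, i) * p**i * (1 - p)**(n - i)) for i in range(n + 1)]

n, p, t = 10, 0.3, 0.5          # illustrative parameter values
atoms = binomial_atoms(n, p)

total_mass = integrate_discrete(lambda x: 1.0, atoms)        # integral of 1
mgf = integrate_discrete(lambda x: exp(t * x), atoms)        # integral of e^{tx}
closed_form = (p * exp(t) + 1 - p) ** n                      # binomial theorem
print(total_mass, mgf, closed_form)
```

The integral of the constant function 1 returns the total mass of the measure, and the integral of $e^{tx}$ agrees (up to rounding) with the closed form $(pe^t + 1 - p)^n$ of the moment generating function.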
Now we further enlarge the class of integrable functions by considering arbitrary extended real-valued measurable functions of $\mathcal{C}^{-1}(\Omega,\Sigma)$. For each $f \in \mathcal{C}^{-1}$ and $0$ being the function identically equal to zero on $\Omega$, denote
$$f^+ = \sup\{f,0\}$$
and
$$f^- = -\inf\{f,0\} = (-f)^+$$
(cf. Definition 7.7, Chapter 1). Clearly (see also Problem 7.16, Chapter 1),
$$f = f^+ - f^- \quad\text{and}\quad |f| = f^+ + f^-.$$
By Proposition 6.6, Chapter 5, $f^+$ and $f^-$ are also elements of $\mathcal{C}^{-1}$ (more precisely, elements of $\mathcal{C}_+^{-1}$) if and only if $f \in \mathcal{C}^{-1}$.

1.10 Definitions.
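(i) Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$). If at least one of the integrals, $\int f^+\,d\mu$ or $\int f^-\,d\mu$, is finite, we say that the integral of $f$ with respect to measure $\mu$ exists and denote this integral by
$$\int f\,d\mu = \int f^+\,d\mu - \int f^-\,d\mu. \tag{1.10}$$
We also denote
$$\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}}) : \int f\,d\mu \text{ exists}\Big\}. \tag{1.10a}$$
(ii) If both of the integrals of the functions $f^+$ and $f^-$ are finite, we say that the function $f$ is $\mu$-integrable and again denote the integral of $f$ by formula (1.10). The subset of $\mathcal{C}^{-1}$ of all $\mu$-integrable functions is denoted by $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, i.e.,
$$L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{f \in \mathcal{C}^{-1}(\Omega,\Sigma) : \int f^+\,d\mu < \infty \text{ and } \int f^-\,d\mu < \infty\Big\}.$$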
[...] $g\}$ we conclude that $\mu\{f > g,\ g \text{ is finite}\} = 0$.

Letting $B_n = \{g = -\infty,\ f > -n\}$ we have
$$-n\,\mu(B_n) \le \int_{B_n} f\,d\mu \le \int_{B_n} g\,d\mu = -\infty\cdot\mu(B_n),$$
or, equivalently, $n\,\mu(B_n) \ge \infty\cdot\mu(B_n)$. This holds true if and only if $\mu(B_n) = 0$ (as the consequence of the agreement that $\infty\cdot 0 = 0$). Thus,
$$\mu\Big(\bigcup_{n=1}^\infty B_n\Big) = \mu\{f > g,\ g = -\infty\} = 0.$$
In summary, we proved that $\mu\{f > g\} = 0$, which implies that $f \le g$ $\mu$-a.e.
b) Now, let $\mu$ be $\sigma$-finite and let $\mu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\,\mu$. Then
$$\int_A f\,d\mu_n = \int 1_{A \cap \Omega_n} f\,d\mu \le \int 1_{A \cap \Omega_n} g\,d\mu = \int_A g\,d\mu_n$$
and hence $f \le g$ $\mu$-a.e. on $\Omega_n$. The rest of this case is obvious. □

The reader can easily conclude the following.

1.20 Corollary. If $\mu$ is $\sigma$-finite, $f,g \in \mathbb{L}(\Omega,\Sigma,\mu;\mathbb{R})$, and
$$\int_A f\,d\mu = \int_A g\,d\mu, \quad\text{for each } A \in \Sigma, \tag{1.20}$$
then $f = g$ $\mu$-a.e. on $\Omega$. □

(For a pertinent discussion, see Problem 1.28.) Finally, we would like to formulate the proposition below, which will often be cited in the sequel and whose proof we assign to the reader as Problem 1.19.

1.21 Proposition. Each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$. □
PROBLEMS
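1.1 Prove Proposition 1.3.
1.2 Prove Corollary 1.6, i.e., for $\{s_n\}\uparrow, \{t_n\}\uparrow \subset \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$ it holds that
$$\sup\Big\{\int s_n\,d\mu\Big\} = \sup\Big\{\int t_n\,d\mu\Big\}.$$
[Hint: Use the fact that $s_j \le \sup\{t_n\}$ and $t_k \le \sup\{s_n\}$.]
1.3 Show that for $\mu = \sum_{i=0}^\infty c_i\,\varepsilon_{a_i}$ the corresponding value of the integral of any bounded measurable function $f$ is
$$\int f\,d\mu = \sum_{i=0}^\infty c_i f(a_i).$$
1.4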
Let $\pi_\lambda$ be a Poisson measure and let [...] $\int f\,d\pi_\lambda =$ [...]
1.5 Under the condition of Problem [...]
f E e + 1 ( lR, II. Therefore, N� C rr c
2. Main Convergence Theorems
and $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in N_1^c$. Since $g \in L^1(\Omega,\Sigma,\mu)$, by Proposition 1.21, it follows that $g$ is finite $\mu$-a.e. on $\Omega$, i.e., there is a $\mu$-null set $N_2$ such that $g(\omega) < \infty$ for all $\omega \in N_2^c$. Define the function
$$f = \lim_{n\to\infty} f_n 1_A, \tag{2.6}$$
where $A = (N_1 \cup N_2)^c$. Clearly, $f_n$ converges to $f$ pointwise $\mu$-a.e. on $\Omega$ and hence, by Proposition 5.6 (iii) and (vi), Chapter 5, $f \in \mathcal{C}^{-1}$. Indeed, since $f_n$ and $1_A \in \mathcal{C}^{-1}$, it follows that $f_n 1_A \in \mathcal{C}^{-1}$ and that $f_n 1_A \to f$ in the topology of pointwise convergence; the latter implies that $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$.

(ii) From (2.6) it follows that on the set $A$, $\lim_{n\to\infty} f_n = f$; in addition, $\{f_n\}$ is dominated by a finite function $g$ on $A$. Thus, $|f| \le g$ on $A$ and, due to (2.6), $f = 0$ on $A^c$. Hence,
$$|f| \le g, \quad \forall \omega \in \Omega.$$
By Proposition 1.17 and since $\int |f|\,d\mu \le \int g\,d\mu < \infty$, $f \in L^1(\Omega,\Sigma,\mu)$. Also by Proposition 1.17, $\{f_n\} \subset L^1(\Omega,\Sigma,\mu)$.

(iii) We prove that $f_n$ is convergent in mean to $f$, i.e.,
$$\lim_{n\to\infty}\int |f - f_n|\,d\mu = 0.$$
Let $g_n = |f - f_n|$ $(\in \mathcal{C}_+^{-1}(\Omega,\Sigma))$. Since [...]
0. Let a 1 = b 1 = 1 and suppose a j and b j are positive integers defined for all j < n . Furthermore, let a n + 1 > a n such that (If there is no such a n + 1 , then it would surely contradict our assumption that lim k -+ oo J.L( A k ) = £ > 0.) Now, let b n + 1 > b n such that
!.£ > an 1 ( A bn 1 ) . -r + + (Such a b n + 1 should exists, because J.L a n 1 is 0 -continuous. ) For B n : + Abn \A bn + 1 , we have that J.La n + 1 (B n ) > �E . Therefo re for j being odd 8
II
=
and j > k > 1,
·( n J.L ( n
f..La
Then, for k > 1 ,
J
) !c:. En > kB n ) > !c:. }:n Bn >k
even:
�
even :
We can easily verify that the last inequality holds true also for all odd values of n. Consequently, for all k > 1,
JL( Abk) = t{ E :;'
tB s) > �c:.
The latter contradicts the assumption that $\lim_{k\to\infty}\mu(A_k) = \varepsilon > 0$. □

2.15 Theorem (of Monotone Convergence). Let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$ and $\{\mu_1,\mu_2,\dots\}$ be a monotone nondecreasing sequence of measures on a measurable space $(\Omega,\Sigma)$. Then there is a measure $\mu$ on $(\Omega,\Sigma)$ such that $\mu_n(A) \to \mu(A)$ for all $A$ of $\Sigma$ and
$$\lim_{n\to\infty}\int f\,d\mu_n = \int f\,d\mu. \tag{2.15}$$
Proof. Since $\{\mu_n\}$ is monotone nondecreasing, by Theorem 2.14 (i), the setwise limit $\mu$ of $\{\mu_n\}$ exists and it is a measure on $(\Omega,\Sigma)$. Since $f$ is nonnegative and $\mu_n \uparrow \mu$, the sequence $\{\int f\,d\mu_n\}$ is monotone nondecreasing and hence
$$\lim_{n\to\infty}\int f\,d\mu_n = \sup\Big\{\int f\,d\mu_n\Big\} \le \int f\,d\mu. \tag{2.15a}$$
The last inequality holds because of $\int f\,d\mu_n \le \int f\,d\mu$ which, in turn, is due to Problem 1.26. On the other hand, from Fatou's Lemma 2.10 applied to our case,
$$\int f\,d\mu \le \underline{\lim}\int f\,d\mu_n,$$
that, combined with (2.15a), yields the statement. □

The convergence theorems below are for sequences of functions and measures at once.

2.16 Lemma (Fatou). Let $\{\mu,\mu_1,\mu_2,\dots\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$ and let $\{f_n\} \subset \mathcal{C}_+^{-1}(\Omega,\Sigma)$ such that for each $A \in \Sigma$, $\underline{\lim}\,\mu_n(A) \ge \mu(A)$. Then
$$\int f\,d\mu \le \underline{\lim}\int f_n\,d\mu_n, \tag{2.16}$$
where
$$f(\omega) := \underline{\lim}\,f_n(\omega), \quad \omega \in \Omega. \tag{2.16a}$$
Proof. First assume that $\{f_n\} \subset \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then, for every fixed positive integer $N$ and for every $n \ge N$,
$$\int f_n\,d\mu_n \ge \int \inf\{f_m : m \ge N\}\,d\mu_n. \tag{2.16b}$$
Applying the version of Fatou's Lemma 2.10 to the right-hand side of (2.16b) we have
$$\underline{\lim}\int f_n\,d\mu_n \ge \underline{\lim}\int \inf\{f_m : m \ge N\}\,d\mu_n \ge \int \inf\{f_m : m \ge N\}\,d\mu. \tag{2.16c}$$
Since $\{\inf\{f_m : m \ge N\}\}_N \uparrow f$ defined in (2.16a), applying the standard Monotone Convergence Theorem 2.1, we arrive at (2.16). □

The following generalization of Fatou's Lemma 2.16 applies to arbitrary measurable functions $\{f_n\}$, and its proof is left to the reader (Problem 2.13).
2.17 Lemma (Fatou). In the condition of Fatou's Lemma 2.16, let $\{g,f_1,f_2,\dots\} \subset \mathcal{C}^{-1}(\Omega,\Sigma)$ such that for all $n$, $f_n \ge g$ and $\lim_{n\to\infty}\int g\,d\mu_n = \int g\,d\mu > -\infty$. Then,
$$\int f\,d\mu \le \underline{\lim}\int f_n\,d\mu_n, \tag{2.17}$$
where $f(\omega) := \underline{\lim}\,f_n(\omega)$, $\omega \in \Omega$. □
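2.18 Theorem (Lebesgue's Dominated Convergence Theorem). Let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma)$, $g \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$, and $\{\nu,\mu,\mu_1,\mu_2,\dots\}$ be a sequence of measures on the measurable space $(\Omega,\Sigma)$ such that:
(i) $\mu_n \le \nu$.
(ii) $f_n$ converges to a function $f$ in the topology of pointwise convergence.
(iii) $\mu_n$ converges to $\mu$ setwise.
(iv) $\int g\,d\nu < \infty$.
(v) $|f_n| \le g$.
Then,
$$\lim_{n\to\infty}\int f_n\,d\mu_n = \int f\,d\mu. \tag{2.18}$$
Proof. Consider Theorem 2.11, for which we use the conditions (i), (iii) and (iv). Then, applying Theorem 2.11 to $g$ we have that
$$\lim_{n\to\infty}\int g\,d\mu_n = \int g\,d\mu.$$
Now, since $g \pm f_n \ge 0$ for all $n$, we have from Fatou's Lemma 2.17,
$$\int (g \pm f)\,d\mu \le \underline{\lim}\int (g \pm f_n)\,d\mu_n.$$
On the other hand, since $\int g\,d\mu \le \int g\,d\nu < \infty$,
$$\int f\,d\mu \le \underline{\lim}\int f_n\,d\mu_n \quad\text{and}\quad \overline{\lim}\int f_n\,d\mu_n \le \int f\,d\mu,$$
that yields the assertion. □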
PROBLEMS

2.1 Prove Corollary 2.2.
2.2 Generalize the Monotone Convergence Theorem: Let $\{f_n\}\uparrow \subset \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \ge g$ for all $n$ and suppose that $\int g\,d\mu > -\infty$. Prove that
$$\sup\Big\{\int f_n\,d\mu\Big\} = \int \sup\{f_n\}\,d\mu.$$
2.3 Show that if $\int g\,d\mu = -\infty$, the Generalized Monotone Convergence Theorem need not hold.
2.4 Let $\{f_n\}\downarrow \subset \mathcal{C}^{-1}$ and $g \in \mathcal{C}^{-1}$ such that $f_n \le g$ for all $n$. If $\int g\,d\mu < \infty$, show that
$$\inf\Big\{\int f_n\,d\mu\Big\} = \int \inf\{f_n\}\,d\mu.$$
2.5 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{A_n\} \subset \Sigma$. Prove that
$$\mu\big(\underline{\lim}\,A_n\big) \le \underline{\lim}\,\mu(A_n)$$
and if $\mu < \infty$ that
$$\mu\big(\overline{\lim}\,A_n\big) \ge \overline{\lim}\,\mu(A_n).$$
[Hint: Apply Fatou's Lemma 2.4 to the sequence of functions $\{1_{A_n}\}$ and use Problem 3.8, Chapter 1; then apply DeMorgan's law to prove the second inequality.]
2.6 Show that if $f_n \to f$ in mean then $\{1_A$ [...]
2.7 Generalize Fatou's Lemma 2.4 in the following way. Let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $g \le f_n$ for all $n$. Let $\int g^-\,d\mu < \infty$. Show that
$$\int \underline{\lim}\,f_n\,d\mu \le \underline{\lim}\int f_n\,d\mu.$$
2.8 Let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \le g$ for all $n$. Let $\int g^+\,d\mu < \infty$. Show that
$$\overline{\lim}\int f_n\,d\mu \le \int \overline{\lim}\,f_n\,d\mu.$$
2.9 Let
$$f_n(x) = \begin{cases} n, & 0 < x < \frac{1}{n} \\ 0, & \frac{1}{n} \le x < \infty. \end{cases}$$
Show that $f_n \to 0$ $\lambda$-a.e. in the topology of pointwise convergence. Explain why
$$\int \lim_{n\to\infty} f_n\,\lambda(dx) \ne \lim_{n\to\infty}\int f_n\,\lambda(dx).$$
2.10 Let [...] $x > 2$ [...] Show that [...]
2.11 Use Lebesgue's Dominated Convergence Theorem 2.6 to prove that for all $a > 0$,
$$\lim_{n\to\infty}\frac{n!\,n^{a-1}}{a(a+1)\cdots(a+n-1)} = \Gamma(a), \tag{P2.11}$$
where $\Gamma(a)$ is known to be the gamma function and it is expressed as the improper Riemann integral
$$\Gamma(a) = \int_0^\infty x^{a-1}e^{-x}\,dx. \tag{P2.11a}$$
2.12 Give an example of a monotone nonincreasing sequence of measures convergent setwise to a set function $\mu$ such that $\mu$ is not a measure.
2.13 Prove Fatou's Lemma 2.17.
2.14 Prove Theorem 2.9. [Hint: Use Theorem 2.6, the Mean Value Theorem, and Example 9.7 (ii), Chapter 3.]
NEW TERMS:
Monotone Convergence Theorem for functions 312; Beppo Levi's Corollary 313; Monotone Convergence Theorem, Generalized 313; Fatou's Lemma for functions 313; convergence in mean 314; Lebesgue's Dominated Convergence Theorem for functions 314; interchanging derivative and integral 317; Fatou's Lemma for measures 318; Lebesgue's Dominated Convergence Theorem for measures 318; setwise convergence of measures 319; setwise limit of measures 319; setwise convergence, criterion of 320; Monotone Convergence Theorem for measures 321; Fatou's Lemma for measures and nonnegative functions 321; Fatou's Lemma for measures and functions 322; Lebesgue's Dominated Convergence Theorem for measures and functions 322; gamma function 324, 325
3. LEBESGUE AND RIEMANN INTEGRALS ON $\mathbb{R}$
In this section we will develop integration techniques in $L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$ (see Definition 1.10 (ii)). The principal idea is to reduce the Lebesgue integral to the Riemann integral whenever it is possible, in combination with the main convergence theorems. The Riemann notion of an integral, a refinement of the one introduced by Cauchy in 1823, was introduced in 1854. We begin with the concept of the Riemann integral of a bounded function on a compact interval suggested by the Frenchman Gaston (in some sources, Jean-Gaston) Darboux (1842-1917) in 1875. Although the construction below is self-contained, the reader is encouraged to go back to Example 9.9 (vi), Chapter 3, for topological preliminaries of this construction.

Let $\Omega = [a,b]$ be a compact interval in $\mathbb{R}$. By Definition 1.7 (ii), Chapter 1 (see also Example 9.9 (vi), Chapter 3), a partition of $[a,b]$ is any ordered $(n+1)$-tuple $P = P(n) = P(a_0,\dots,a_n)$ with
$$P = \{a_0,\dots,a_n \in [a,b] : a = a_0 < a_1 < \dots < a_n = b\}.$$
Let $P_1$ and $P_2$ be two partitions of $[a,b]$. We say $P_2$ is finer than $P_1$ if $P_1 \subset P_2$. $P_2$ is also said to be a refinement of $P_1$ (in notation $P_1 \preceq P_2$).
Thus, if $\mathcal{P}$ is the set of all partitions on $[a,b]$, $\preceq$ is a partial order on $\mathcal{P}$. Denote by $\mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b])) = \mathcal{C}_b^{-1}([a,b],\mathcal{B}\cap[a,b];\mathbb{R})$ the set of all real-valued, Borel-measurable, bounded functions on $[a,b]$. Let $f \in \mathcal{C}_b^{-1}([a,b],$ [...]

[...] $0$, there is a partition $P$ of $[a,b]$ such that $U_-(f,P) - L_+(f,P) < \varepsilon$. However, Riemann did not specify the class of functions that are subject to integration (although he pointed out that a function can be discontinuous on a dense set and nevertheless integrable), as Lebesgue did in his Theorem 3.5, which is to follow. □

3.4 Example. Let $f$ be the Dirichlet jump function introduced in Example 1.4. Consider its modification
$$f(x) = 1_{\mathbb{Q}\cap[0,1]}(x) \in \mathcal{C}_+^{-1}([0,1],\ [\dots]$$
[...] $0$, there is a $\delta > 0$, such that for each partition $P$ whose mesh is less than $\delta$,
$$U(f,P) - L(f,P) < \varepsilon. \tag{3.5}$$
(Show it, see Problem 3.1.) This leads to Riemann integrability.

(ii) Let $f$ be bounded, Borel-measurable and $\lambda$-a.e. continuous on $[a,b]$. If $f$ is not continuous everywhere, but is bounded, it can have only discontinuities of finite magnitude. From the nature of the lower and the upper Baire functions, $l$ and $u$, it follows that $l$ and $u$ coincide with $f$ at all points of continuity of $f$. (A rigorous proof of this statement, known as Baire's theorem, is contained in many standard analysis textbooks.) At the points of discontinuity of $f$, $l$ assumes the smallest values and $u$ takes the largest values (this can be shown by elementary methods). (See Figure 3.1.)
Figure 3.1. The lower and upper Baire functions: $l(x) = u(x)$ for $x \ne x_0$, with the value $l(x_0)$ marked at the point of discontinuity $x_0$.

Then, if $f$ is discontinuous on a negligible set $S$, it should equivalently follow that $u$ and $l$ differ on the same set $S$. By the above condition, $S \subset N$ where $N$ is a measurable null set. Since $f$ is bounded, $u_n$ and $l_n$ are measurable, bounded jump functions, and $U_n$ and $L_n$ exist. By Lebesgue's Dominated Convergence Theorem, $U_- - L_+ = 0$, which implies that $f$ is Riemann integrable. Indeed,
$$U_- - L_+ = \int u\,d\lambda - \int l\,d\lambda = 0,$$
by Lemma 1.15, since $u = l$ on $N^c$, i.e., a.e.

(iii) Let $f$ be Riemann integrable. Then, by (3.2),
$$f = l = \lim_{n\to\infty} l_n = u = \lim_{n\to\infty} u_n \quad\text{a.e.}$$
Furthermore, $f$ is bounded. We repeat the above arguments. From the nature of $u$ and $l$, it follows that, in this case, $u$, $l$, and $f$ coincide wherever $f$ is continuous. At all points of discontinuity, while $f$ assumes one of these values, the smallest values of $f$ will be assigned to $l$ and the largest ones to $u$. Therefore, the set on which the function $f$ is discontinuous equals the set on which $u$ and $l$ differ. This proves that $f$ is continuous $\lambda$-a.e. □

3.6 Remarks.
(i) By employing a canonic chain of partitions on the X-axis, in construction of the Riemann integral, we sometimes face the problem that the sequence of the corresponding lower jump functions $\{l_n\}$ converges to the lower Baire function $l$, but it does not converge to $f$, as it turns out for the Dirichlet function. Consequently, the lower Darboux integral gives a "wrong" value. In contrast, the construction of the Lebesgue integral literally sets up partitions on the Y-axis whose canonic chains form monotone increasing sequences of "lower jump functions." The latter, due to Theorem 5.5, Chapter 5, always converge to $f$. Consequently, the "lower Darboux integral" $L_+$ equals the Lebesgue integral $\int f\,d\mu$.

(ii)
Although Riemann and Darboux enlarged the previously existing class of integrable functions, the Riemann integral has a plethora of limitations, one of which goes back to the fundamental theorem of calculus in the form
$$(R)\int_a^b f'(x)\,dx = f(b) - f(a).$$
This formula becomes meaningless when the derivative $f'$ of a differentiable function $f$ is not integrable. On the other hand, the classical proof of the formula
$$\frac{d}{dx}\int_a^x f(u)\,du = f(x)$$
was originally based on the continuity assumption for $f$. The new concept of integration suggested by Henri Lebesgue in 1902 in his doctoral work restored the generality of the fundamental theorem to its current status. Furthermore, the class of Lebesgue integrable functions is significantly enlarged. Notice that from Theorem 6.5, Chapter 5, it follows that, in contrast with the Cauchy-Riemann-Darboux formation of partitions of $[a,b]$ essentially leading to Definition (3.3), the Lebesgue construction of the integral of an (initially nonnegative) function $f$ suggests partitions of the interval $[0,\sup f]$ on the Y-axis instead. The latter leads to a notion of a sequence of nonnegative simple functions
$\{s_n\}$ approximating $f$ from below, a very elegant and lucid definition of the integral of a nonnegative simple function, and, as a consequence, the definition of the integral $\int f\,d\lambda$ as $\sup\{\int s_n\,d\lambda\}$. The function $f$ need not be $\lambda$-a.e. continuous, nor need it even be bounded.

(iii) As we mentioned, in order that a function be Lebesgue integrable, it need not be bounded. The class of Riemann-integrable functions, as is known, can be "extended" to unbounded functions by the use of the "improper integral." Another need for the improper integral arises when the interval of integration is unbounded. In the latter case, the integral is constructed as usual on a compact interval $[a,b]$, and then its values are taken for $a \to -\infty$ or $b \to \infty$. This is a "trick" rather than a proper integral construction. That is why such integrals are called improper.

(iv) Unlike this type of improper integration over infinite intervals, there is another way to integrate functions with the conventional approach of constructing an integral via uniform "partitions" of the infinite interval. Consider as an example a bounded Borel measurable function $f$ on an interval $[a,\infty)$ and a partition of this interval by the sequence $\{a_n\}$, where $a_n = a + \delta n$, $n = 0,1,\dots$, for some positive $\delta$. Then on each of the intervals $\Delta_n = [a_n,a_{n+1})$ consider
$$m_n = \inf\{f(x) : x \in \Delta_n\}$$
and
$$M_n = \sup\{f(x) : x \in \Delta_n\}.$$
Since the Lebesgue measure of each interval $\Delta_n$ equals $\delta$, we have again the lower Darboux sum,
$$L(f,\delta) = \delta\sum_{n=0}^\infty m_n,$$
and the upper Darboux sum,
$$U(f,\delta) = \delta\sum_{n=0}^\infty M_n.$$
If $\lim_{\delta\downarrow 0} L(f,\delta) = \lim_{\delta\downarrow 0} U(f,\delta)$, then the common value is denoted by
$$(D)\int_a^\infty f(x)\,dx$$
and called the direct Riemann integral. The function $f$ is then said to be directly Riemann integrable. Direct integrability is used in probability, specifically in renewal theory, where such a notion is introduced for a class of nonnegative functions bounded over finite intervals. □
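The construction in Remark 3.6 (iv) can be tried out numerically. The sketch below is an illustration under stated assumptions: it takes $f(x) = e^{-x}$ on $[0,\infty)$, uses the fact that for a nonincreasing continuous $f$ the infimum and supremum over $[a_n,a_{n+1})$ are $f(a_{n+1})$ and $f(a_n)$, and truncates the infinite sums after a finite number of terms. The function name and the truncation point are hypothetical choices.

```python
from math import exp

def direct_sums_decreasing(f, a, delta, n_terms):
    """Truncated lower and upper sums L(f, delta), U(f, delta) on [a, oo).

    Valid as written only for a nonincreasing continuous f, for which
    m_n = f(a_{n+1}) and M_n = f(a_n) on [a_n, a_{n+1}).
    """
    lower = delta * sum(f(a + delta * (n + 1)) for n in range(n_terms))
    upper = delta * sum(f(a + delta * n) for n in range(n_terms))
    return lower, upper

f = lambda x: exp(-x)

results = {}
for delta in (1.0, 0.1, 0.01):
    n_terms = round(60 / delta)   # cover [0, 60]; the neglected tail is below e^{-60}
    results[delta] = direct_sums_decreasing(f, 0.0, delta, n_terms)
    print(delta, results[delta])
```

As $\delta$ decreases, the two sums squeeze the value 1 from below and above, in line with $\int_0^\infty e^{-x}\,dx = 1$; their gap is about $\delta$.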
3.7 Examples.
(i) Let $\Omega = [0,1]$ and let $f(x) = x^2 1_A(x) + \sin x\,1_{A^c}(x)$, where $A^c$ is the Cantor ternary set. The function $f$ is a bounded Borel-measurable function on $[0,1]$ and obviously $\lambda$-a.e. continuous on $[0,1]$. Thus, $f$ is Lebesgue as well as Riemann integrable and $f(x) = x^2$ $\lambda$-a.e. on $[0,1]$. Furthermore,
$$(R)\int_0^1 f(x)\,dx = (L)\int_{[0,1]} f(x)\,\lambda(dx) = (L)\int_{[0,1]} x^2\,\lambda(dx) = (R)\int_0^1 x^2\,dx = \frac{1}{3}.$$
(ii) Let $\Omega = [1,2]$ and $f(x) = (x-1)^{-1/3}$. We wish to evaluate $\int_{[1,2]} f(x)\,\lambda(dx)$. Since $f$ is no longer bounded (on $[1,2]$) we cannot apply the same techniques as discussed above. Consequently, we introduce an auxiliary sequence of functions, $\{f_n\}$, defined as
$$f_n(x) = \begin{cases} n, & 1 \le x \le 1 + \frac{1}{n^3} \\ (x-1)^{-1/3}, & 1 + \frac{1}{n^3} < x \le 2 \end{cases}$$
(see Figure 3.2). It is easily seen that $\{f_n\}$ is a monotone increasing sequence of continuous functions contained in $\mathcal{C}_+^{-1}([1,2],$ [...]

[...] function $f$ can be integrated over arbitrary Borel sets, while the Riemann integral is defined just on intervals. With all these advantages, however, the Lebesgue integral does not have the same elegance and analytical tractability the Riemann integral has, due to its "Newton-Leibniz bridge" to derivatives and a huge inventory of integration techniques. In many cases, whenever possible, the Lebesgue integral is just reduced to a Riemann integral. In addition, the class of Riemann integrable functions is traditionally enlarged to include those functions which are Riemann integrable in an improper sense. These are functions with discontinuities of an infinite magnitude and functions defined on intervals of type $[a,\infty)$ or $(-\infty,b]$ or $(-\infty,\infty)$.

In Example 3.7 (ii) we examined a Lebesgue integral of an unbounded function. In a certain sense, the approach used there reminds us of Riemann integration of unbounded functions. In the proposition below we will state that in most cases, when integration over an infinite interval is needed, we can use Riemann integration in the improper sense and equate its values to those for Lebesgue integrals. This fact makes the Riemann improper integral more legitimate. □

3.9 Proposition. Let $f \in \mathcal{C}_+^{-1}(\mathbb{R},$ [...] $\lambda;\mathbb{R}_+)$ if and only if the improper Riemann integral of $f$,
$$R = \lim_{a\to-\infty,\ b\to\infty}\int_{[a,b]} f(x)\,dx,$$
exists. (We say that $f \in \mathcal{R}(\mathbb{R})$, where $\mathcal{R}(\mathbb{R})$ is the class of all functions on $\mathbb{R}$ Riemann integrable in the improper sense.) In this case $R = \int f\,d\lambda$.

Proof. Denote
$$R_{nk} = (R)\int_{B_{nk}} f(x)\,dx, \quad\text{where } B_{nk} = [-k,n].$$
Then, since $f$ is Riemann integrable, $R_{nk} = \int f1_{B_{nk}}\,d\lambda$. Observing that
$$f = \sup\{f1_{B_{nk}} : n = 1,2,\dots;\ k = 1,2,\dots\},$$
we have, by the Monotone Convergence Theorem,
$$\int f\,d\lambda = \sup R_{nk} = R.$$
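The proof of Proposition 3.9 can be followed on a concrete function. As a minimal sketch, assume $f(x) = 1/(1+x^2)$ (an illustrative choice, not taken from the text), for which the Riemann integral over $B_{nk} = [-k,n]$ has the closed form $\arctan n + \arctan k$. The values $R_{nk}$ increase with $n$ and $k$, and their supremum is the improper Riemann integral, which is also the Lebesgue integral of $f$.

```python
from math import atan, pi, isclose

def R(n, k):
    """(R) integral of 1/(1+x^2) over B_nk = [-k, n], via the antiderivative arctan."""
    return atan(n) + atan(k)

# R_nk is nondecreasing in both indices; its supremum is pi.
values = [R(m, m) for m in (1, 10, 100, 10**6)]
print(values, pi)
```

The printed values increase toward $\pi$, the common value of the improper Riemann integral and the Lebesgue integral in this example.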
[...] $\lambda;\mathbb{R})$. Therefore, using Proposition 3.9, we conclude that $|f|$ must be an element of $\mathcal{R}(\mathbb{R})$. In this case, evidently,
$$(R)\int_{-\infty}^\infty f(x)\,dx = (R)\int_{-\infty}^\infty f^+(x)\,dx - (R)\int_{-\infty}^\infty f^-(x)\,dx = \int f^+\,d\lambda - \int f^-\,d\lambda = \int f\,d\lambda.$$
□

3.11 Examples.
(i) Consider the function $f(x) = \frac{x\sin x}{k^2 + x^2}$ (where $k \ne 0$). We show that this function is Riemann integrable in the improper sense but not Lebesgue integrable over $\mathbb{R}_+$. We apply the Dirichlet criterion:

Let $g$ and $h$ be two real-valued functions defined on $[a,\infty)$. If $g$ is monotonically vanishing at $\infty$ and $\big|(R)\int_a^b h(x)\,dx\big| \le C$, for each $b > a$ and some positive real number $C$, i.e., the integral of $h$ is uniformly bounded in $b$, then the improper integral $(R)\int_a^\infty gh$ is convergent.

In our case, the function $\frac{x}{k^2 + x^2}$ can be taken for $g$ and $\sin x$ can represent $h$. [...] $0$, and consequently, $(R)\int_0^\infty f$ converges. On the other hand, $f \notin L^1(\mathbb{R}_+,$ [...]
$$= \sum_{n=0}^\infty \int_0^\pi \frac{\sin t\,(t + n\pi)}{k^2 + (n\pi + t)^2}\,dt \ge \sum_{n=0}^\infty \frac{\pi n}{k^2 + \pi^2(n+1)^2}\int_0^\pi \sin t\,dt$$
(the second summation is due to the inequality $\pi n + t \le (n+1)\pi$, for $t \in [0,\pi]$). Thus [...]
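(ii) The function $f(x) = \sin x\,\exp\big(-\frac{x^2}{2}\big)$ is an element of $\mathcal{C}^{-1}$ and it is Lebesgue integrable, because $|f(x)| \le g(x) = \exp\big(-\frac{x^2}{2}\big)$, $g(x) > 0$, and because
$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty g(x)\,dx = 1.$$
Observe that $x \mapsto \frac{1}{\sqrt{2\pi}}\exp\big(-\frac{x^2}{2}\big)$, $x \in \mathbb{R}$, is the normal density function of the standard normal distribution. (See Example 5.10 (iii).) □

PROBLEMS
3.1 3.2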
Observe that x 1---+ exp - � , x E IR, is the normal density func21r tion of the standard normal distribution. (See Example 5. 10 (iii).) D PROBLEMS 3.1 3.2
3.3
Prove (3.5) in Theorem 3.5. In Example 3.4, we showed that the Dirichlet function f on [0,1] is Lebesgue integrable, but not Riemann integrable. Since the rationals have the Lebesgue measure of 0, the function f is equal to 0 (a constant) for A-almost all points on [0, 1 ], and therefore, it is continuous almost everywhere on [0,1]. By Theorem 3.5, f must be Riemann integrable. This is just the opposite of the result of Example 3.4. What is wrong with this reasoning? Is the function f ( x ) = � on [0, 1] Borel-measurable and A-integr-
3.4 3.5
able? Show that the function /, such that f(x) = � cos( � ) on f(O) = 0, is Borel-measurable and not >.-integrable. Let f: [0, 1] � IR be defined as
f (x) =
3.6
0,
(0, 1]
and
0.
X=
Show that f is improperly Riemann integrable but not Lebesgue integrable. Let f be a monotone increasing differentiable function on [ a ,b] and let cp be its inverse function on [f(a),f(b)] . Prove that
f f( b ) J � f(x)>.(dx) = ycp '( y)>.(d y ). f( a )
3.7
Investigate whether the function $f(x) = \frac{\sin x}{x^a}1_{\{x \ne 0\}}(x)$ (where $0 < a < 1$) is improperly Riemann and Lebesgue integrable over $\mathbb{R}_+$.
3.8 Let $G$ be a nonempty open subset of $[a,b]$ and let $f$ be a Borel measurable function on $[a,b]$, discontinuous at each point of $G$. Can $f$ be Riemann integrable?
3.9 Show that the functional
$$\|f - g\|_{L^1} = \int_a^b |f - g|\,d\lambda$$
defines a semi-norm on $L^1([a,b],\mathcal{B}([a,b]),\lambda)$. How can it become a norm?
3.10 Let $s \in \Psi_+([a,b],$ [...] $0$, there is a [...]
3.11 Show that the space $\mathcal{C}([a,b])$ of all continuous functions on the interval $[a,b]$ is dense in $\big(L^1([a,b],\mathcal{B}([a,b]),\lambda),\ \|\cdot\|_{L^1}\big)$.
3.12 Use Lebesgue's Theorem 3.5 to show that the limit of a uniformly convergent sequence $\{f_n\}$ of bounded Riemann integrable functions on $[a,b]$ is Riemann integrable on $[a,b]$. Prove that under this condition,
$$\lim_{n\to\infty}(R)\int_a^b f_n(x)\,dx = (R)\int_a^b \lim_{n\to\infty} f_n(x)\,dx. \tag{P3.12}$$
3.13 Let $A$ be a closed negligible subset of $[a,b]$. Is the function $1_A$ Riemann integrable?
3.14 Let $A$ be a subset of $[a,b]$ whose closure is negligible. Is $1_A$ Riemann integrable?
3.15 Let $\{f_n\}$ be a sequence of bounded, Borel measurable, nonnegative functions on $A \subset \mathbb{R}$. Suppose $(L)\int_A f_n\,d\lambda \to 0$ for $n \to \infty$. Is it true that $f_n \to 0$ $\lambda$-a.e. on $A$?
NEW TERMS:
partition 327; refinement 327; Borel-measurable bounded functions 327; Darboux lower sum 327; Darboux upper sum 327; mesh of a partition 328; canonic chain of partitions 328; upper Darboux integral 328; lower Darboux integral 328; Riemann integral 328; Riemann integrable function 328; upper Baire function 329; lower Baire function 329; Cauchy sum 329; Cauchy integrable function 330; Dirichlet function 330; Lebesgue's Theorem of Riemann integrability 330; improper Riemann integral 333; direct Riemann integral 333; direct Riemann integrability 333; Dirichlet's criterion 336
4. INTEGRATION WITH RESPECT TO IMAGE MEASURES
As one of the extensions of major integration techniques, we will study integration with respect to the image measure $\mu F^*$ (where $F$ is a measurable mapping), nicknamed change of variables because it resembles the prominent method for the Riemann integral. In this section we restrict our attention to the abstract integral. A more specific approach to a change of variables for Lebesgue integrals in Euclidean spaces is treated separately in Chapter 7.
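4.1 Theorem (Change of Variables). Let $(\Omega_0,\Sigma_0,\mu)$ be a measure space, $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, and $F: (\Omega_0,\Sigma_0) \to (\Omega,\Sigma)$ be a measurable map (such that $\mu F^*$ is an image measure on the measurable space $(\Omega,\Sigma)$). Then, the following formula holds true:
$$\int f\circ F\,d\mu = \int f\,d\mu F^*. \tag{4.1}$$
Specifically, if $f = g1_A$, where $A \in \Sigma$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$, then (4.1) reduces to
$$\int_{F^*(A)} g\circ F\,d\mu = \int_A g\,d\mu F^*. \tag{4.1a}$$
Proof.
(i) Let $s \in \Psi_+(\Omega,\Sigma)$ be just an indicator function $s = 1_A$. By Problem 3.7, Chapter 1, we have that
$$1_A\circ F = 1_{F^*(A)}.$$
Therefore,
$$\int 1_A\circ F(\omega_0)\,d\mu(\omega_0) = \int 1_{F^*(A)}(\omega_0)\,d\mu(\omega_0) = \mu(F^*(A)) = \mu F^*(A) = \int 1_A(\omega)\,d\mu F^*(\omega).$$
(ii) Let $s$ be a nonnegative simple function with the representation
$$s = \sum_{i=1}^n a_i 1_{A_i}.$$
Then,
$$s\circ F = \sum_{i=1}^n a_i 1_{F^*(A_i)}$$
and
$$\int s\circ F\,d\mu = \sum_{i=1}^n a_i\,\mu F^*(A_i) = \int s\,d\mu F^*.$$
(iii) Let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then there exists $\{s_n\}\uparrow\ \subset \Psi_+$ such that $f = \sup\{s_n\}$. For $s_n$ we have, according to (ii):
$$\int s_n\circ F\,d\mu = \int s_n\,d\mu F^*.$$
Observe that $\{s_n\circ F\}\uparrow\ \subset \Psi_+(\Omega_0,\Sigma_0)$ and, by Proposition 5.6 (iv),
$$\sup\{s_n\circ F\} = f\circ F \in \mathcal{C}_+^{-1}(\Omega_0,\Sigma_0).$$
Therefore, we have that
$$\int f\circ F\,d\mu = \sup\Big\{\int s_n\circ F\,d\mu\Big\} = \sup\Big\{\int s_n\,d\mu F^*\Big\} = \int f\,d\mu F^*.$$
(iv) Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$. Then, $f = f^+ - f^-$ and, according to Problem 4.1,
$$(f\circ F)^+ = f^+\circ F, \qquad (f\circ F)^- = f^-\circ F.$$
Therefore,
$$f\circ F = (f\circ F)^+ - (f\circ F)^- = f^+\circ F - f^-\circ F,$$
and this, along with (iii), implies that
$$\int f\circ F\,d\mu = \int (f\circ F)^+\,d\mu - \int (f\circ F)^-\,d\mu = \int f^+\circ F\,d\mu - \int f^-\circ F\,d\mu = \int f^+\,d\mu F^* - \int f^-\,d\mu F^* = \int f\,d\mu F^*.$$
(v) Let $f = g1_A$ where $A \in \Sigma$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$. Then we have,
$$\int (g1_A)\circ F\,d\mu = \int (g\circ F)\,1_{F^*(A)}\,d\mu = \int_{F^*(A)} g\circ F\,d\mu = \int_A g\,d\mu F^*.$$
□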
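On a finite measure space, formulas (4.1) and (4.1a) can be checked by direct summation. The sketch below is only an illustration under assumed data: the point masses `mu`, the map `F`, the function `f`, and the set `A` are hypothetical and are not taken from the text.

```python
from fractions import Fraction

# Hypothetical finite measure space (Omega0, all subsets, mu) given by point masses.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6), "d": Fraction(2)}

# Hypothetical measurable map F: Omega0 -> Omega = {0, 1, 2}.
F = {"a": 0, "b": 1, "c": 1, "d": 2}

def image_measure(mu, F):
    """Point masses of the image measure: mass of {w} is mu of the preimage of {w}."""
    nu = {}
    for w0, mass in mu.items():
        nu[F[w0]] = nu.get(F[w0], Fraction(0)) + mass
    return nu

def integral(f, measure, domain=None):
    """Integral of f over `domain` (default: whole space) w.r.t. a discrete measure."""
    return sum(f(w) * m for w, m in measure.items() if domain is None or w in domain)

f = lambda w: w * w - 1                     # a signed function on Omega
nu = image_measure(mu, F)

lhs = integral(lambda w0: f(F[w0]), mu)     # integral of f o F with respect to mu
rhs = integral(f, nu)                       # integral of f with respect to the image measure

A = {1, 2}
preimage_A = {w0 for w0 in mu if F[w0] in A}
lhs_A = integral(lambda w0: f(F[w0]), mu, preimage_A)   # left side of (4.1a)
rhs_A = integral(f, nu, A)                              # right side of (4.1a)
```

Exact rational arithmetic is used so that both sides can be compared with `==` rather than up to rounding.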
4.2 Corollary. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $F: (\Omega,\Sigma) \to (\Omega,\Sigma)$ be a bijective transformation which is $\Sigma$-$\Sigma$ measurable along with its inverse $F^*$. Then, for each $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, the following formula holds true:
$$\int_A f\,d\mu = \int_{F_*(A)} f\circ F^*\,d\mu F^*. \tag{4.2}$$
(See Problem 4.2.) □
4.3 Examples.
(i) Then,
Let
/ E e - 1 (1R " , 0. (ii) Let ( rl,E,IP) be a probability space and let X E e - 1 ( f2,E) be a random variable. Recall that X induces the image measure IP' X * , or, equi valently, the probability distribution on the measurable space ( IR,C!B), thereby generating the new probability space (IR, 0 and g E L 1 (n,E,J.L). Then and
gn > l A n . Thus
g of
which implies that $\mu(A_n) < \infty$. Since $g > 0$, it follows that $A_n \uparrow \Omega$. □
We have shown that $\sigma$-finiteness of $\mu$ is equivalent to the existence of a positive integrable function $g$. In other words, there is a positive "Radon-Nikodym density" $g$ such that the measure $\nu$ generated by the integral is finite. Another noteworthy observation is that if
$$\nu(A) = \int_A g\,d\mu = 0,$$
then $g1_A \in [0]_\mu$. Since $g > 0$, $A \in \mathcal{N}_\mu$, i.e., from $\nu(A) = 0$ it follows that $\mu(A) = 0$. Should $\mu(A) = 0$, then $g1_A \in [0]_\mu$ and $\nu(A) = 0$. Thus, $\nu(A) = 0$ if and only if $\mu(A) = 0$. In other words, $\nu$ and $\mu$ possess the same null-sets. It is clear that, if $g$ is just nonnegative, $\nu(A) = 0$ does not necessarily imply that $\mu(A) = 0$. But from $\mu(A) = 0$, it follows anyway that $\nu(A) = 0$ (why?). If $\nu$ has a density relative to $\mu$, then a $\mu$-null set is also a $\nu$-null set. Is the converse of the statement true? (I.e., would this relation between the measures guarantee the existence of a density?) The answer will be given in the Radon-Nikodym Theorem below.
5.5 Definition. Let $\mu$ and $\nu$ be two measures on a measurable space $(\Omega,\Sigma)$. The measure $\nu$ is called (absolutely) continuous (with respect to $\mu$) if every $\mu$-null set is also a $\nu$-null set. If $\nu$ is continuous relative to $\mu$, then we write $\nu \ll \mu$. Any Borel measure continuous with respect to the Lebesgue measure is just called continuous. □
The use of the word "continuity" is basically due to the following proposition.
5.6 Proposition. Let $\nu$ be a finite measure on $(\Omega,\Sigma)$ and let $\mu$ be another measure on $(\Omega,\Sigma)$. Then the following are equivalent:
(A) $\nu \ll \mu$.
(B) For all $\varepsilon > 0$, there is $\delta > 0$, such that for each $A \in \Sigma$ with $\mu(A) < \delta$, the inequality $\nu(A) < \varepsilon$ holds.
Proof.
(i) Suppose statement (B) is true. Choose an $\varepsilon$. Denote by $\Delta$ the set of all $A \in \Sigma$ for which $\mu(A) < \delta$. Then $\mathcal{N}_\mu \subset \Delta$ (where $\mathcal{N}_\mu$ denotes the subset of all $\mu$-null sets). Then, for all $N \in \mathcal{N}_\mu$, $0 = \mu(N) < \delta$ and $\nu(N) < \varepsilon$. Since $\varepsilon$ can be made arbitrarily small, we conclude that $\nu(N) = 0$ and thus $\nu \ll \mu$.
(ii) Suppose now that statement (B) is not true. That means, for some $\varepsilon > 0$ and for any $\delta > 0$ there is a set $A(\delta) \in \Sigma$ such that $\mu(A(\delta)) < \delta$ and yet $\nu(A(\delta)) \geq \varepsilon$. We now define the sequence of $\delta$'s as $\delta_n = \frac{1}{2^n}$, $n = 1,2,\ldots$, and construct the corresponding sequence of $A$'s such that $A(\delta_n) = A_n$ with the above property, i.e. $\{A_n\}$ is a $\mu$-monotone decreasing sequence but "$\nu$-resistant." Let $A = \overline{\lim}\,A_n$. Then $A \subset \bigcup_{m=n}^\infty A_m$ and
$$\mu(A) \leq \mu\Big(\bigcup_{m=n}^\infty A_m\Big) \leq \sum_{m=n}^\infty \mu(A_m) < \frac{1}{2^{n-1}}, \quad n = 1,2,\ldots.$$
Therefore, $\mu(A) = 0$. However, by Problem 2.5, since $\nu$ is finite,
$$\nu(A) = \nu(\overline{\lim}\,A_n) \geq \overline{\lim}\,\nu(A_n) \geq \varepsilon,$$
and thus $\nu$ is not $\mu$-continuous. Hence (A) is not true either. □
The most general version of the celebrated Radon-Nikodym Theorem was proved by the Pole Otto Nikodym in his paper, Sur une généralisation des intégrales de M. J. Radon, of 1930. Another prominent Pole, Stanislav Saks, suggested the name of this theorem, perhaps meaning it as Nikodym's Theorem on Radon Integrals, although Radon himself proved a much more special case. The idea of Radon-Nikodym's result had its inception in an 1884 paper by Thomas Stieltjes, in which he introduced the new concept of a density function in connection with his famous "Stieltjes integral" (in its present version known as the Riemann-Stieltjes integral), initially applied to very restricted classes of functions. In 1909, Frederic Riesz proved in his widely cited Representation Theorem that Stieltjes integrals are represented by the most general continuous linear functionals on [a,b] (whose more general version we will explore in Section 7, Chapter 8). Riesz's result yielded many generalizations, of which the most productive was by Johann Radon in his 1913 paper, Theorie und Anwendungen der absolut additiven Mengenfunktionen. In this paper, Radon, combining the ideas of Lebesgue and Riesz, introduced an integral with respect to Borel measures on the Borel $\sigma$-algebra of $\mathbb{R}^n$ rather than the Borel-Lebesgue measure used by Lebesgue. Among other things, Radon showed the existence of a Radon-Nikodym density function with respect to this integral as an absolutely continuous measure with respect to the Borel-Lebesgue measure, significantly generalizing the earlier theorem by Lebesgue about the existence of an almost everywhere differentiable density. Right after the appearance of Radon's paper, Maurice Fréchet noticed that Radon's result can be generalized for arbitrary measures, rather than Borel measures of $\mathbb{R}^n$. This led Nikodym to his 1930 generalization of Radon's theorem in the form very close to the present version. Consequently, a significant gap in integral theory existed between 1913 and 1930. Soon thereafter, in 1933, Nikodym's generalization led to the birth of measure-theoretic probability theory (in Andrey Kolmogorov's famous monograph, Grundbegriffe der Wahrscheinlichkeitsrechnung), the concept of conditional expectation, and an introduction to the theory of stochastic processes. Still, many consider Radon as the father of the modern theory of integration.
Otto Nikodym, who is at the heart of one of the most important results ever obtained in mathematics, was born on August 13, 1887, in eastern Poland, then belonging to the Russian empire. In 1919 he was among 16 mathematicians to found the Polish Mathematical Society. Shortly after World War II, Nikodym's family moved to Belgium and then to France, where Nikodym was invited by the Institute of H. Poincaré to work on the mathematical foundations of quantum mechanics. (He published his results in numerous papers, and his monograph, The Mathematical Apparatus for Quantum Theories, was published by Springer-Verlag in 1966.) In 1948 he accepted a position in the United States at Kenyon College, Gambier, Ohio, where he stayed until his retirement. He died in 1974.
We introduce some preliminaries on the Radon-Nikodym Theorem (further to be embellished in Chapter 8).
5.7 Notation. Let $\mathfrak{M} = \mathfrak{M}(\Omega,\Sigma)$ be the set of all measures on $(\Omega,\Sigma)$. For a fixed measure $\mu \in \mathfrak{M}$, denote $\mathfrak{M}_{\ll\mu} = \{\nu \in \mathfrak{M} : \nu \ll \mu\}$. (This set is not empty, since $\mu \in \mathfrak{M}_{\ll\mu}$.) Define on $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ a mapping $I_\mu$ such that for each $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$,
$$I_\mu f = \int_{(\cdot)} f\,d\mu = \nu(\cdot).$$
□
By Problem 1.20, $I_\mu$ is valued in $\mathfrak{M}_{\ll\mu}$. Now the Radon-Nikodym Theorem states that if $\mu$ is $\sigma$-finite, for each $\nu \in \mathfrak{M}_{\ll\mu}$, there exists a unique (up to the equivalence class modulo $\mu$) Radon-Nikodym density $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ of $\nu$ relative to $\mu$. This needs some clarification:
1) Given a function $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, $I_\mu f$ defines a measure, which is absolutely continuous with respect to $\mu$. As noticed above, this is done. Consequently, $[\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+), \mathfrak{M}_{\ll\mu}, I_\mu]$ is an into mapping.
2) Recall (Definitions and Remarks 1.14 (iii)) that the $\mu$-almost everywhere property of equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and thus on $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, as a subset of $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Consequently, $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu$ is a quotient set, "inherited" from (1.14). On the other hand, by Corollary 1.20, the
mapping $I_\mu$ "agrees" with this equivalence relation $E$, i.e., $I_\mu$ adopts $E$ as its equivalence kernel. Then, by Theorem 4.4, Chapter 1, there is a unique function, say $\tilde{I}_\mu$,
such that where 7rE stands for the projection of l(n, E, J.L; IR + ) on its quotient IL(O, E, J.L; _lR + ) I 11 by E. (See Section 4, Chapter 1.) Therefore, I , literally turns to the injective mapping I p that now acts on the quotient set IL(O, E, J.L; IR + ) I 11• 3) The major claim (existence) of the Radon-Nikodym Theorem is that the mapping [IL(O, E, J.L; IR + ) I 11 , r.m 11< , I ,J is surjective. In other words, for each measure v E r.m 11< (i.e. , absolutely continuous with respect to J.L) , there is an equivalence class [!] 11 of Radon-Nikodym densities of v relative to J.l · A compact version of the above arguments is as follows: 5.8 Theorem (Radon-Nikodym). Let J.L E fJJI. ( O, E) be a u-finite meas ure. Then [IL(O, E, J.L; IR + ) I 11 , !IJ1 11< , J P ] is a bijective map. As mentioned, the uniqueness of the Radon-Nikodym density class is due to Corollary 1.20. The rest of the proof of Theorem 5.8 (existence) will be rendered in Section 2, Chapter 8, for more general classes of D signed measures. By Radon-Nikodym's Theorem, the map I P is therefore invertible and its inverse, denoted by symbol is also a map valued in IL(O, E, J.L) J 11 • Thus, for any v E ID111< , there is a nonempty equivalence class [!] 11 of Radon-Nikodym densities of v relative to J.L and, for a fixed E fJJl 11 J..L ( A n ) , if J..L ( A n ) > 0) or A E X 1-' (if J..L ( A n ) = 0). In the latter case set g = /1 A c .] Let J..L and v be measures on (O,E) such that v � J..L and let v be finite and g E �- Denote A = {w E Q : g ( w ) '# 0}. Show that the restriction of J.l 6n E n A is u-finite. Give an example where J.l ( A ) is not finite. Let 1r be a Poisson measure on (IR, .") and (IR ,� ,>. ) be the Borel-Lebesgue measure spaces. Show that
6.7 Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be measure spaces with $\sigma$-finite measures and let $A \in \Sigma_1\otimes\Sigma_2$. Show that the following statements are equivalent:
1) $\mu_1\otimes\mu_2(A) = 0$;
2) $\mu_2(A_{a_1}) = 0$ $\mu_1$-a.e. on $\Omega_1$;
3) $\mu_1(A_{a_2}) = 0$ $\mu_2$-a.e. on $\Omega_2$.
6.8 Let $A \subset \Omega_1\times\Omega_2$ and let $a_1 \in \Omega_1$. Show that $(1_A)_{a_1} = 1_{A_{a_1}}$.
6.9 Show that $f_{a_1}^*(A_3) = (f^*(A_3))_{a_1}$, $A_3 \subset \Omega_3$.
6.10 Prove Proposition 6.13. [Hint: Apply Lemma 6.7 and Problem 6.9.]
6.11 Let $A, B \subset \Omega_1\times\Omega_2$ be two disjoint sets and let $\alpha,\beta \in \mathbb{R}$. Show that
$$(\alpha 1_A + \beta 1_B)_{a_1} = \alpha(1_A)_{a_1} + \beta(1_B)_{a_1}.$$
6.12 Let $f \in \mathcal{C}_+^{-1}(\Omega_1\times\Omega_2, \Sigma_1\otimes\Sigma_2)$ and let $\{s_n\} \subset \Psi_+(\Omega_1\times\Omega_2, \Sigma_1\otimes\Sigma_2)$ such that $f = \sup\{s_n\}$. Show that $f_{a_1} = \sup\{(s_n)_{a_1}\}$. [Hint: Apply Theorem 6.5, Chapter 5, and Problem 6.10.]
6.13 Show that $|f|_a = |f_a|$, $(f^+)_a = (f_a)^+$, and $(f^-)_a = (f_a)^-$.
6.14 Let $\Sigma_1$ and $\Sigma_2$ be $\sigma$-algebras on $\Omega_1$ and $\Omega_2$, respectively. Show that $\Sigma_1\odot\Sigma_2$ is a semi-ring. Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be semi-rings on $\Omega_1$ and $\Omega_2$, respectively. Is $\mathcal{G}_1\odot\mathcal{G}_2$ also a semi-ring?
6.15 What will the smallest algebra generated by $\Sigma_1\odot\Sigma_2$ from Problem 6.14 look like?
6.16 Let $\mu_i$ and $\nu_i$ be finite measures on a measurable space $(\Omega_i,\Sigma_i)$, $i = 1,2$. Show that if $\mu_i \ll \nu_i$, $i = 1,2$, then $\mu_1 + \mu_2 \ll \nu_1 + \nu_2$.
6.17 Let $(\Omega,\Sigma,\mu)$ be a $\sigma$-finite measure space and let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Prove that
$$\int f\,d\mu = \int_{(0,\infty)} \mu(\{f > x\})\,\lambda(dx) \tag{P6.17}$$
by using Theorem 6.10.
6.18 Generalization of (P6.17). In the condition of Problem 6.17, let $g: \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous monotone nondecreasing function such that $g(0) = 0$ and which is continuously differentiable on $(0,\infty)$. Show that
$$\int g(f)\,d\mu = (L)\int_{(0,\infty)} g'(x)\,\mu(\{f > x\})\,\lambda(dx) = (R)\int_0^\infty g'(x)\,\mu(\{f > x\})\,dx.$$
6.19 Show that if $F$ and $G$ in Example 6.18 (vi) have no common discontinuities, then formula (6.18m) reduces to
$$F(b)G(b) - F(a-)G(a-) = \int_I F(x)\,\mu_G(dx) + \int_I G(x)\,\mu_F(dx). \tag{P6.19}$$
NEW TERMS:
measurable rectangle 357; product $\sigma$-algebra 357; measurable cylinder 357; section of a set 359; $a_i$-section of a set 359; section of a function 365; $a_i$-section of a function 365; Tonelli's Theorem 365; Fubini's Theorem 367; product measure space 368; closed ball in $\mathbb{R}^n$, Borel-Lebesgue measure of 368; integral with respect to the counting measure 371; moment generating function 373; integration by parts formula for Lebesgue-Stieltjes integrals 375, 376
7. APPLICATIONS OF FUBINI'S THEOREM
Product measures and Fubini's theorem find some of their finest applications in probability theory. One of them has to do with independence of random variables, a popular topic in statistics and stochastic processes.
7.1 Definitions. Let $(\Omega,\Sigma,\mathbb{P})$ be a probability space.
(i) Let $\mathcal{G} \subset \Sigma$ be an arbitrary (indexed) family of events (i.e. measurable subsets of $\Omega$). $\mathcal{G}$ is called $\mathbb{P}$-independent (or just independent) if, for any finite subcollection $\{A_{i_1},\ldots,A_{i_n}\}$ of $n \geq 2$ events from $\mathcal{G}$, the following relation holds true:
$$\mathbb{P}\{A_{i_1}\cap\ldots\cap A_{i_n}\} = \mathbb{P}(A_{i_1})\cdots\mathbb{P}(A_{i_n}). \tag{7.1a}$$
Observe that, if $\mathcal{G}$ is an independent family of events then the Dynkin system generated by $\mathcal{G}$ is also independent (see Problem 7.1). If, in addition, $\mathcal{G}$ is $\cap$-stable, then $\mathcal{D}(\mathcal{G})$ is an independent $\sigma$-algebra.
(ii) Let $\mathfrak{M} = \{\mathcal{G}_i;\ i \in I\} \subset \Sigma$ be an indexed collection of families of events. $\mathfrak{M}$ is called independent if, for any finite subset $\{i_1,\ldots,i_n\} \subset I$, $n \geq 2$, and for any choice of $A_{i_k} \in \mathcal{G}_{i_k}$, $k = 1,\ldots,n$, the events $A_{i_1},\ldots,A_{i_n}$ are independent.
(iii) Let $\mathcal{X} = \{X_i;\ i \in I\}$ be an indexed collection of random variables on $(\Omega,\Sigma,\mathbb{P})$. $\mathcal{X}$ is called independent if the corresponding collection $\{\sigma(X_i);\ i \in I\}$ of $\sigma$-algebras generated by these random variables is independent.
(iv) Let $X_i: \Omega \to \Omega_i$, $i = 1,\ldots,n$, be $\Sigma$-$\Sigma_i$ random variables on $(\Omega,\Sigma,\mathbb{P})$. Then we denote
$$\bigotimes_{i=1}^n X_i = (X_1,\ldots,X_n):\ \Omega \to \Omega_1\times\ldots\times\Omega_n$$
and call it the product map.
(v) It appears (Problem 7.2) that the product map $\bigotimes_{i=1}^n X_i$ is $\Sigma$-$\bigotimes_{i=1}^n\Sigma_i$-measurable. Therefore, by letting
$$\mathbb{P}_{\otimes X_i} = \mathbb{P}\Big(\bigotimes_{i=1}^n X_i\Big)^*,$$
we can define a probability measure on $\big(\prod_{i=1}^n\Omega_i, \bigotimes_{i=1}^n\Sigma_i\big)$ and call it the joint distribution of random variables $X_1,\ldots,X_n$. □
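For a finite probability space, the joint distribution in (v) can be tabulated directly, and relation (7.1a) can be tested on the events $\{X_1 = a\}$, $\{X_2 = b\}$ by comparing the joint distribution with the product of the marginals. The following is a minimal sketch under assumed data: the two-coin space and the random variables `X1`, `X2`, `S` are hypothetical illustrations, and checking atoms suffices only because everything here is finite.

```python
from fractions import Fraction
from itertools import product

# Hypothetical probability space: two fair coin tosses with the uniform measure P.
omega = list(product([0, 1], repeat=2))
P = {w: Fraction(1, 4) for w in omega}

def joint_distribution(P, *rvs):
    """Image measure of P under the product map (X_1, ..., X_n), as point masses."""
    dist = {}
    for w, mass in P.items():
        value = tuple(X(w) for X in rvs)
        dist[value] = dist.get(value, Fraction(0)) + mass
    return dist

def is_product_measure(joint, marginals):
    """Compare `joint` with the product of the marginals at every point."""
    for keys in product(*marginals):          # marginal keys are 1-tuples such as (0,)
        point = tuple(k[0] for k in keys)
        expected = Fraction(1)
        for m, k in zip(marginals, keys):
            expected *= m[k]
        if joint.get(point, Fraction(0)) != expected:
            return False
    return True

X1 = lambda w: w[0]            # first toss
X2 = lambda w: w[1]            # second toss
S = lambda w: w[0] + w[1]      # number of heads, which depends on both tosses

m1, m2, mS = (joint_distribution(P, X) for X in (X1, X2, S))
independent_pair = is_product_measure(joint_distribution(P, X1, X2), [m1, m2])
dependent_pair = is_product_measure(joint_distribution(P, X1, S), [m1, mS])
```

Here `independent_pair` refers to the two separate tosses and `dependent_pair` to the first toss together with the head count.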
Let $\mathbb{P}_{X_i} = \mathbb{P}X_i^*$ be the distribution of the random variable $X_i$, $i = 1,\ldots,n$. This is a probability measure on $\Sigma_i$. Then, according to the previous section, we can construct the triple
$$\Big(\prod_{i=1}^n\Omega_i,\ \bigotimes_{i=1}^n\Sigma_i,\ \bigotimes_{i=1}^n\mathbb{P}_{X_i}\Big).$$
On the other hand, we already have another measure $\mathbb{P}_{\otimes X_i}$ on $\big(\prod_{i=1}^n\Omega_i, \bigotimes_{i=1}^n\Sigma_i\big)$, which, in general, need not be a product measure. The following statement clarifies the matter.
7.2 Proposition. The joint probability distribution $\mathbb{P}_{\otimes X_i}$ is a product measure and equals $\bigotimes_{i=1}^n\mathbb{P}_{X_i}$ if and only if the random variables $X_1,\ldots,X_n$ are independent. □
(See Problem 7.3.)
Note that the treatment of the product $\mathbb{P}_{\otimes X_i}$ of more than finitely
many independent random variables is more complicated; such a treat ment involves the product of infinitely many u-algebras and measures. Another important application of product measures and Fubini's theorem is the notion of "convolution" of measures. k 7.3 Definition. Let !B*(IR , � ( IR k)) be the set of all fmite Borel meas ures on � k = .. ( C) = rn and
C is (1.3)
If $N$ is a negligible set, then according to Problem 3.18, Chapter 5, for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^\infty \lambda(C_k) < \frac{\varepsilon}{(K\sqrt{n})^n}. \tag{1.3a}$$
Therefore, $N \subset \sum_{k=1}^\infty C_k$ and since maps preserve inclusions and unions,
$$F_*(N) \subset \bigcup_{k=1}^\infty F_*(C_k).$$
The latter, along with (1.3) and (1.3a), yield that:
$$\lambda^*(F_*(N)) \leq \sum_{k=1}^\infty \lambda^*(F_*(C_k)) \leq \sum_{k=1}^\infty \lambda(C_k') = \sum_{k=1}^\infty (K\,\mathrm{diam}\,C_k)^n = \sum_{k=1}^\infty (K\sqrt{n})^n\lambda(C_k) = (K\sqrt{n})^n\sum_{k=1}^\infty \lambda(C_k) < \varepsilon.$$
We showed that for any $\varepsilon$, $F_*(N)$ can be covered by countably many half-open cubes with the sum of their volumes less than $\varepsilon$. By Lemma 3.6 of Chapter 5, $F_*(N)$ is negligible. □
The following concept of the derivative was given by Fréchet in 1903, which we first formulate for Banach spaces.
CHAPTER 7. CALCULUS IN EUCLIDEAN SPACES
1.4 Definitions.
(i) Let $\Omega$ and $\Omega'$ be Banach spaces and let $O$ be an open set in $\Omega$. A map $F: O \to \Omega'$ is said to be differentiable at a point $x \in O$ if there is a continuous linear operator $L_{(F,x)}: \Omega \to \Omega'$ and a map $o: \Omega \to \Omega'$ such that
$$\lim_{h\to\theta}\frac{o(h)}{\|h\|} = \theta'$$
and
$$F(x + h) = F(x) + L_{(F,x)}(h) + o(h), \quad x + h \in O. \tag{1.4}$$
It is easy to show that if a map $F$ has such an operator $L_{(F,x)}$, then it is unique given $F$ and $x$ (Problem 1.4). The operator $L_{(F,x)}$, usually denoted by $F'(x)$ or $DF_x$, is called the derivative (or Fréchet derivative) of $F$ at $x$. Consequently, from (1.4),
$$\lim_{h\to\theta}\frac{F(x+h) - F(x)}{\|h\|} = \lim_{h\to\theta}\frac{DF_x(h)}{\|h\|}. \tag{1.4a}$$
If the function $F$ is differentiable at every point of $O$, it is said to be differentiable on $O$. Then $x \mapsto DF_x$ is evidently a function itself, which is obtained by the application of the operator $D$ to $F$.
(ii) Consider the special case of $\Omega$ and $\Omega'$ being Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Then, at every $x = (u_1,\ldots,u_n)^T \in \mathbb{R}^n$, $F(x) = (f_1(x),\ldots,f_m(x))^T$. In the above definition, the linear operator $L_{(F,x)}$, as any linear operator in $\mathbb{R}^n$ (recall it is also continuous), is known to be represented by an $m\times n$ matrix, say $M_x$. Therefore, the derivative of $F$ at $x$ is, in this case, a matrix, called the Jacobian matrix, in notation $\mathfrak{J}_F(x)$. Then, (1.4) and (1.4a) can be rewritten as
$$F(x + h) = F(x) + \mathfrak{J}_F(x)h + o(h), \quad x \in O, \tag{1.4b}$$
and
$$\lim_{h\to\theta}\frac{F(x+h) - F(x)}{\|h\|} = \lim_{h\to\theta}\frac{\mathfrak{J}_F(x)h}{\|h\|}. \tag{1.4c}$$
For $m = n$, the determinant of $\mathfrak{J}_F(x)$ is denoted by $J_F(x)$ and is called the Jacobian. □
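The defining property (1.4b) says that the remainder $o(h) = F(x+h) - F(x) - \mathfrak{J}_F(x)h$ is small compared with $\|h\|$. The sketch below checks this numerically for one assumed example: the map `F`, its hand-computed matrix `jacobian_F`, the base point, and the direction are all hypothetical choices made for illustration.

```python
import math

def F(x):
    """Hypothetical smooth map F: R^2 -> R^2, used only for illustration."""
    u, v = x
    return [u * u * v, u + math.sin(v)]

def jacobian_F(x):
    """Hand-computed 2 x 2 matrix of partial derivatives of F at x."""
    u, v = x
    return [[2 * u * v, u * u],
            [1.0, math.cos(v)]]

def norm(vec):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in vec))

def remainder_ratio(x, h):
    """||F(x+h) - F(x) - J h|| / ||h||, which should tend to 0 as h -> 0."""
    J = jacobian_F(x)
    Fx = F(x)
    Fxh = F([xi + hi for xi, hi in zip(x, h)])
    Jh = [sum(J[i][k] * h[k] for k in range(2)) for i in range(2)]
    o = [Fxh[i] - Fx[i] - Jh[i] for i in range(2)]
    return norm(o) / norm(h)

x = [1.0, 0.5]
direction = [0.6, -0.8]        # a unit vector, so ||t * direction|| = t
ratios = [remainder_ratio(x, [t * d for d in direction]) for t in (1e-1, 1e-2, 1e-3)]
```

For a twice differentiable map such as this one, the ratios are expected to shrink roughly in proportion to the step size.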
1.5 Examples.
(i) If $F$ itself is a continuous linear map, then $F(x + h) - F(x) = F(h)$ and taking $o = 0$ (zero function), we get $L_{(F,x)}(h) = DF_x(h) = F(h)$. Therefore, $F$ is everywhere differentiable and for all $x$, $DF_x = F$, i.e., $DF_x$ does not depend on $x$ and $F$ coincides with its derivative. In particular, if $F$ acts in the Euclidean space and thus is represented by an $m\times n$ matrix, say $M$, then the Jacobian matrix $\mathfrak{J}_F(x)$ equals $M$.
(ii) Let $\Omega = \Omega' = \mathcal{C}([0,1],\mathbb{R})$ with norm $\|x\| = \sup\{|x(t)| : t \in [0,1]\}$ and let $O = \{x : \|x\| < r\}$ for some $r > 0$. Define the operator $F: O \to \Omega$ as
$$F(x)(t) = y(t) + \int_0^1 K(t,s)\,g(s,x(s))\,ds, \tag{1.5}$$
where $K(t,s)$ is continuous on $[0,1]^2$ and the partial derivative $\frac{\partial g}{\partial v}(u,v)$ (defined on the set $R = [0,1]\times\mathbb{R}$) exists and is uniformly continuous on $R$. Then we can show that
$$F(x+h)(t) - F(x)(t) = \int_0^1 K(t,s)\,[g(s,x(s)+h(s)) - g(s,x(s))]\,ds = \int_0^1 K(t,s)\,\frac{\partial g}{\partial v}(s,x(s))\,h(s)\,ds + \varphi(x,h),$$
where
$$\lim_{h\to\theta}\frac{\|\varphi(x,h)\|}{\|h\|} = 0.$$
Thus, $F$ is differentiable at $x$ and its derivative satisfies
$$(F'(x)h)(t) = \int_0^1 K(t,s)\,\frac{\partial g}{\partial v}(s,x(s))\,h(s)\,ds. \tag{1.5a}$$
□
1.6 Proposition. Let $[O(\subset\mathbb{R}^n), \mathbb{R}^m, F = (f_1,\ldots,f_m)^T]$ be a function. $F$ is differentiable at an interior point $x$ of $O$ if and only if each component function $f_1,\ldots,f_m$ is differentiable at $x$ and in this case
$$F'(x) = (f_1'(x),\ldots,f_m'(x))^T.$$
Proof.
(i) Suppose $F$ is differentiable at $x$. Then,
$$F(x+h) - F(x) = (f_1(x+h) - f_1(x),\ldots,f_m(x+h) - f_m(x))^T = DF_x(h) + o(h) = \mathfrak{J}_F(x)h + o(h) = (\mathfrak{J}_{F_1}(x),\ldots,\mathfrak{J}_{F_m}(x))^T h + o(h), \tag{1.6}$$
where $\mathfrak{J}_{F_i}(x)$ is the $i$th row vector of $\mathfrak{J}_F(x)$. The right-hand side of (1.6) can also be written in the form
$$(\mathfrak{J}_{F_1}(x)h + o_1(h),\ldots,\mathfrak{J}_{F_m}(x)h + o_m(h))^T,$$
which yields that
$$f_i(x+h) - f_i(x) = \mathfrak{J}_{F_i}(x)h + o_i(h)$$
and, hence, $f_i$ is differentiable at $x$ and its derivative $f_i'(x)$ is expressed by a $1\times n$ Jacobian matrix $\mathfrak{J}_{F_i}(x)$. Consequently, we have that $F'(x) = (f_1'(x),\ldots,f_m'(x))^T$.
(ii) The converse of the statement is obvious. □
1.7 Definitions.
(i) Suppose $[O(\subset\mathbb{R}^n), \mathbb{R}, f]$ is a function. If $f$ is differentiable at $x \in O$ "along the segment $[x, x+te_k]$" parallel to the $x_k$-axis, where $t$ is a real scalar and $e_k$ is the $k$th basis vector of $\mathbb{R}^n$, i.e., the limit
$$\lim_{t\to 0}\frac{f(x+te_k) - f(x)}{t}$$
exists, it is called the partial derivative of $f$ with respect to its $k$th coordinate, in notation $\frac{\partial f}{\partial e_k}(x)$. [Note that by fixing all components of vector $x$ except for $x_k$, in the above limit, the partial derivative $\frac{\partial f}{\partial e_k}(x)$ is nothing else but the usual Newton-Leibniz derivative.]
(ii) We can analogously define the $k$th partial derivative of a vector function $[O(\subset\mathbb{R}^n), \mathbb{R}^m, F = (f_1,\ldots,f_m)^T]$ as
$$\frac{\partial F}{\partial e_k}(x) = \lim_{t\to 0}\frac{F(x+te_k) - F(x)}{t},$$
if the limit on the right exists. In light of Proposition 1.6 ($h = te_k$), the $k$th partial derivative $\frac{\partial F}{\partial e_k}(x)$ of $F$ is
$$\frac{\partial F}{\partial e_k}(x) = \Big(\frac{\partial f_1}{\partial e_k}(x),\ldots,\frac{\partial f_m}{\partial e_k}(x)\Big)^T \tag{1.7}$$
and it exists if and only if the corresponding partial derivatives of all its component functions exist. □
Suppose $[O(\subseteq\mathbb{R}^n), \mathbb{R}, f]$ is a function differentiable at a point $x \in O$. Therefore, $f'(x)$ exists and from (1.4a),
$$\lim_{h\to\theta}\frac{f(x+h) - f(x)}{\|h\|} = \lim_{h\to\theta}\frac{f'(x)(h)}{\|h\|}. \tag{1.8}$$
In particular, if $h = te_k$, where $t$ is a real scalar and $e_k$ is the $k$th basis vector of $\mathbb{R}^n$, $h$ is the increment of $x$ taken along the segment of a line parallel to the $x_k$-axis. Then, $\|h\| = |t|$ and, since $f'$ is linear,
$$\frac{\partial f}{\partial e_k}(x) = \lim_{t\to 0}\frac{f(x+te_k) - f(x)}{t} = f'(x)(e_k). \tag{1.8a}$$
From (1.8a) it follows that $\frac{\partial f}{\partial e_k}(x)$ equals the scalar product of $f$'s Jacobian matrix $\mathfrak{J}_f(x)$ and the $k$th basis vector $e_k$. If $[O(\subset\mathbb{R}^n), \mathbb{R}^m, F]$ is a vector function differentiable at an interior point $x$ of $O$, then Proposition 1.6 and (1.8a) yield
$$\frac{\partial F}{\partial e_k}(x) = F'(x)(e_k) = \mathfrak{J}_F(x)e_k. \tag{1.8b}$$
Thus, if $F$ is differentiable at $x$, all its partial derivatives exist and are determined by formula (1.8b). In particular, (1.8b) reveals the nature of the Jacobian matrix $\mathfrak{J}_F(x)$. Namely, from (1.8b) and (1.7) it follows that
$$\mathfrak{J}_F(x) = \begin{pmatrix} \frac{\partial f_1}{\partial e_1}(x) & \cdots & \frac{\partial f_1}{\partial e_n}(x) \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial e_1}(x) & \cdots & \frac{\partial f_m}{\partial e_n}(x) \end{pmatrix}. \tag{1.8c}$$
The $k$th column of $\mathfrak{J}_F(x)$ is $\frac{\partial F}{\partial e_k}(x)$ and therefore,
$$\mathfrak{J}_F(x) = \Big(\frac{\partial F}{\partial e_1}(x),\ldots,\frac{\partial F}{\partial e_n}(x)\Big). \tag{1.8d}$$
□
The above can be summarized as the following theorem.
1.8 Theorem. Let $[O(\subset\mathbb{R}^n), \mathbb{R}^m, F]$ be a function differentiable at a point $x \in O$ (an interior point). Then, all its partial derivatives exist and its Jacobian matrix $\mathfrak{J}_F(x)$ is equal to
$$\Big(\frac{\partial f_i}{\partial e_k}(x);\ i = 1,\ldots,m;\ k = 1,\ldots,n\Big).$$
□
1.9 Definition. Let $O$ be an open set in $\mathbb{R}^n$. A function $[O, \mathbb{R}^m, F]$ is said to be continuously differentiable on $O$ or a $\mathcal{C}^1(O,\mathbb{R}^m)$-function if $F$ is differentiable on $O$, and all of its partial derivatives $\frac{\partial F}{\partial e_1},\ldots,\frac{\partial F}{\partial e_n}$ exist and are continuous on $O$. Note that $F$ is a $\mathcal{C}^1$-map if and only if $F$ is differentiable and $F'$ is continuous on $O$. □
1.10 Examples.
(i) If $F \in \mathcal{C}^1(O,\mathbb{R}^m)$ and $m = n$, then the Jacobian $J_F$ is obviously a continuous function on $O$.
( ii) It can be easily verified that F (x , y )
=
( ':+�i )
is a $\mathcal{C}^1(\{(x,y) \in \mathbb{R}^2 : x = y\}^c, \mathbb{R}^2)$-function.
The following is the chain rule holding in Banach spaces.
1.11 Theorem (Chain Rule). Let $\Omega$, $\Omega_1$, and $\Omega_2$ be Banach spaces and let $H: O(\subseteq\Omega) \to \Omega_1$ and $G: O_1(\subset\Omega_1) \to \Omega_2$ be maps such that $H(O) \subset O_1$. Let $H$ be differentiable at $x \in O$ and $G$ be differentiable at $H(x)$. Then the composed map $G\circ H$ is a differentiable function at $x$ and
$$(G\circ H)'(x) = G'(H(x))(H'(x)). \tag{1.11}$$
Proof. By the assumption of differentiability,
$$H(x+h) = H(x) + DH_x(h) + o_H(h)$$
and
$$G(H(x+h)) = G(H(x)) + DG_{H(x)}(H(x+h) - H(x)) + o_G(H(x+h) - H(x)).$$
Substituting the expression for $H(x+h) - H(x) = DH_x(h) + o_H(h)$ we have that
$$G(H(x+h)) = G(H(x)) + DG_{H(x)}(DH_x(h) + o_H(h)) + o_G(H(x+h) - H(x)).$$
By linearity of $DG_{H(x)}$,
$$G(H(x+h)) = G(H(x)) + DG_{H(x)}(DH_x(h)) + DG_{H(x)}(o_H(h)) + o_G(H(x+h) - H(x)).$$
Now, by continuity of $H$, $H(x+h) - H(x) \to \theta_1$ when $h \to \theta$, and by linearity and continuity of $DG_{H(x)}$,
$$\lim_{h\to\theta}\frac{DG_{H(x)}(o_H(h))}{\|h\|} = DG_{H(x)}\Big(\lim_{h\to\theta}\frac{o_H(h)}{\|h\|}\Big) = \theta_2.$$
Therefore,
$$G\circ H(x+h) = G\circ H(x) + DG_{H(x)}\circ DH_x(h) + o_{G\circ H}(h).$$
□
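1.12 Corollary. In the condition of Theorem 1.11, let $\Omega = \mathbb{R}^n$, $\Omega_1 = \mathbb{R}^m$, and $\Omega_2 = \mathbb{R}^l$. Then,
$$\mathfrak{J}_{G\circ H}(x) = \mathfrak{J}_G(H(x))\,\mathfrak{J}_H(x). \tag{1.12}$$
□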
1.13 Theorem (The Mean Value Theorem). Let $F: \mathbb{R}^n \to \mathbb{R}^m$ be differentiable on a convex set $O$. Then, for any $x$ and $y \in O$, there is a point $\eta$, which belongs to the line segment $S(x,y)$ between $x$ and $y$, such that
$$F(y) - F(x) = F'(\eta)(y - x). \tag{1.13}$$
Proof. Let $x, y \in O$. Denote $g(t) = ty + (1-t)x$ for $0 \leq t \leq 1$. Then, the function $g$ represents the segment $S(x,y)$ and $F\circ g$ will let the function $F$ run over the segment $S(x,y)$. By the chain rule, the function $\Phi = F\circ g$ is evidently differentiable on the segment $[0,1]$ and by (1.11),
cJi ' ( t ) = (F o g) ' ( t ) = F ' (g( t ))(g ' ( t )) = F ' (g( t ))( y - x) . Now, applying to X M II II e � II
II e '
(P 1.9b)
I I M - l x II e > � I I X II u '
(P 1.9c)
1.10 Let $[O \subset \mathbb{R}^n, \mathbb{R}^m, F]$ be a $\mathcal{C}^1$-function, where $O$ is an open set, and $x_0 \in O$. Prove that for each $\varepsilon > 0$, there is an open ball $B_e(x_0,\delta) \subset O$ or $B_u(x_0,\delta) \subset O$ such that
$$\|(F'(x) - F'(x_0))(h)\|_e < \varepsilon\|h\|_u, \tag{P1.10}$$
or
$$\|(F'(x) - F'(x_0))(h)\|_u < \varepsilon\|h\|_u, \tag{P1.10a}$$
respectively.
1.11 In the conditions of Problem 1.10, let $O$ be a convex set. Prove that for each $\varepsilon > 0$, there is an open ball $B_e(x_0,\delta) \subset O$ or $B_u(x_0,\delta) \subset O$ such that
$$\|F(x+h) - F(x) - DF_x(h)\|_e < \varepsilon\|h\|_u, \tag{P1.11}$$
for all $x \in B_e(x_0,\delta)$ and $h \in \mathbb{R}^n$ such that $x + h \in B_e(x_0,\delta)$, or
$$\|F(x+h) - F(x) - DF_x(h)\|_u < \varepsilon\|h\|_u, \tag{P1.11a}$$
for all $x \in B_u(x_0,\delta)$ and $h \in \mathbb{R}^n$ such that $x + h \in B_u(x_0,\delta)$, respectively.
1.12 Let $[O \subset \mathbb{R}^n, \mathbb{R}^n, F]$ be a $\mathcal{C}^1$-function, where $O$ is an open set, and $x_0 \in O$ such that the Jacobian $J_F(x_0) \neq 0$. Prove that there is an open ball $B_e(x_0,\delta) \subset O$ such that for all $y \in B_e(x_0,\delta)$, (P1.12)
[Be(x0 ,6),1R " , F] is one-to-one. 1.13
( P 1 . 1 2a )
[0 C IR " ,IR", F] be a diffeomorphism. Show that for each x0 E 0, Let
or, equivalently,
1.14 1.15
F'(x0 )(F - 1 ) ' (F(x0)) = 1. Show that if (IR",1R m , F] is differentiable, then { x E IR" : I I I F'(x) I l i e < a } is an open set in IR". Under the condition of Problem 1 . 14, is { x E IR" : II F'(x) II u < a}
an open set?
1.
Differentiation
NEW TERMS: Lipschitz condition 387 Lipschitz constant 387 Euclidean (Frobenius) norm of a matrix 387 Frobenius (Euclidean) norm of a matrix 387 submultiplicative property of a matrix norm 387 matrix supremum norm 388 maximum row sum matrix norm 388 differentiable map 3 90 derivative of a map 390 Frechet derivative 390 Jacobian matrix 390 Jacobian 390 partial derivative 392 continuously differentiable function 394 chain rule in Banach spaces 3 94 chain rule in Euclidean spaces 395 Mean Value Theorem 395 diffeomorphism 396 Inverse Mapping Theorem 397
40 1
402
CHAPTER 7. CALCULUS IN EUCLIDEAN S P A CES
2. CHANGE OF VARIABLES
2.1 Lemma. Let L be a linear op erator from IR" to IR" expressed by a regular matrix M and C be the compact unit cube spanned by the basis vectors in IR". Then, it holds true that
( 2 . 1) Proof.
( i) We will refer to the linear operator L as to
elementary, if the
corresponding matrix M is regular and one of the following three types:
Type 1. M is derived from the n x n unity matrix I whose ith
element on the main diagonal is replaced by a nonzero real number c.
Type 2. M is obtained from the n x n unity matrix I , in which
the columns i and j are interchanged.
Type 3. M is obtained from the
unity matrix I in such a way that in its column i, the element e j i = 0 is replaced by the element m j i = 1. n x n
In all types above we assume i,j = 1, . . . , n and i '# j. Clearly, if x E IR" is a column vector, then L( x) = M x stipulates the rules of the following transformation of x: For type 1, the ith entry of x is multiplied by c and the rest of the entries are left unchanged. For type 2, the entries x i and x j are interchanged and the rest of the entries remain unchanged. For type 3, entry x i is replaced by x i + x j and the other entries are left unchanged. ( ii) We first show that J.l( G) = ).. L * (C) = I detM I , if L is an elementary operator. Remember that C is the closed unit cube spanned by the basis vectors e 1 ,. . . , e n and expressed as the Cartesian product [0, 1] ". Consequently, it is obvious that when mapping C by L * we apply L to each of its points x = t1 e 1 + . . . + t n e n , where t i E [0, 1] . Therefore, by the above rules we have: Type 1
or
L * (C) = (0, 1]
X
• • •
X
[ 0, c]
'-v-'
ith edge
X
• • •
X
[ 0, 1] , if c > 0
2. Change of Varia bles
403
and L * (C) = [0, 1] x . . . x [c,O] x . . . x [0, 1] , if c < 0. '-y-J
ith edge
The edges $e_1, \dots, e_n$ of $C$ are transformed onto $e_1, \dots, ce_i, \dots, e_n$, and the volume $\lambda(L_*(C))$ of the resulting rectangle equals $|c|$. This is the same value as that of $|\det M|$.

Type 2

In this case, the edges $e_i$ and $e_j$ are interchanged, and therefore, the shape of the cube remains the same. The volume $\lambda(L_*(C))$ is the same as that of $\lambda(C) = 1 = |-\det I| = |\det M|$.

Type 3

The edges of $C$ will be transformed onto $(e_1, \dots, \underbrace{e_i + e_j}_{i\text{th edge}}, \dots, e_n)$, which will span a paralleletop whose sides parallel to the $x_ix_j$-plane are parallelograms and the other sides are squares. Taking, for convenience, $i = 1$ and $j = 2$, the volume of $L_*(C)$ can be calculated by using Fubini's theorem as follows:

$$\lambda(L_*(C)) = \int_{L_*(C)} d\lambda^n(x_1, \dots, x_n) = \underbrace{\int_0^1 \cdots \int_0^1}_{n-2} \left[ \int_0^1 \int_{x_2}^{x_2+1} dx_1\,dx_2 \right] dx_3 \cdots dx_n.$$

This reduces to 1, as it is easy to see. On the other hand, it is also the same quantity as $|\det M| = \det(e_1, \dots, e_i + e_j, \dots, e_n)$.

(iii) Now, if instead of a cube, we have a compact rectangle $R$, i.e. a paralleletop with its edges parallel to the coordinate axes and possibly translated, by similar arguments as in (i-ii) we obtain that

$$\lambda(L_*(R)) = |\det M|\,\lambda(R), \tag{2.1a}$$

if $L$ is an elementary linear operator. (See Problem 2.1 where the validity of (2.1a) is to be shown.)
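The three elementary cases can be checked numerically in the plane. The following is a minimal sketch, assuming $n = 2$ and using the shoelace formula for the area of the image parallelogram; the helper names (`apply`, `det2`, `polygon_area`, `image_area`) and the sample matrices are illustrative and not part of the text.

```python
# Sketch: area of the image of the unit square under 2x2 elementary matrices,
# compared with |det M|. All names here are illustrative assumptions.

def apply(M, p):
    """Apply the 2x2 matrix M (nested lists) to the point p = (x, y)."""
    return (M[0][0] * p[0] + M[0][1] * p[1],
            M[1][0] * p[0] + M[1][1] * p[1])

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def polygon_area(vertices):
    """Unsigned area of a simple polygon by the shoelace formula."""
    s = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

# Vertices of the unit square C = [0,1]^2, listed counterclockwise.
UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]

# One sample matrix for each elementary type described above.
ELEMENTARY = {
    "type 1 (x1 multiplied by c = -3)": [[-3, 0], [0, 1]],
    "type 2 (x1 and x2 interchanged)": [[0, 1], [1, 0]],
    "type 3 (x1 replaced by x1 + x2)": [[1, 1], [0, 1]],
}

def image_area(M):
    """Area of L_*(C): a linear map sends the square to a parallelogram
    whose vertices are the images of the square's vertices."""
    return polygon_area([apply(M, v) for v in UNIT_SQUARE])

for name, M in ELEMENTARY.items():
    print(name, image_area(M), abs(det2(M)))
```

In each case the computed area agrees with $|\det M|$, which is the planar instance of the identity shown in (ii).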
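(iv) Let $P$ be a compact paralleletop in $\mathbb{R}^n$. Since the boundary $\partial P$ of $P$ consists of parallelograms, each of which has dimension less than $n$, $\lambda(\partial P) = 0$ and, therefore, $\lambda(P) = \lambda(\overset{\circ}{P})$. By Problem 2.10, Chapter 4, as an open set, $\overset{\circ}{P}$ can be represented as a countable union of disjoint semi-open cubes:

$$\overset{\circ}{P} = \sum_{j=1}^{\infty} C_j.$$

Therefore, $\lambda(P) = \lambda(\overset{\circ}{P}) = \sum_{j=1}^{\infty} \lambda(C_j)$ and, given $\varepsilon > 0$, there is an $N \in \mathbb{N}$ such that

$$\sum_{j=1}^{N} \lambda(C_j) \le \lambda(P) < \sum_{j=1}^{N} \lambda(C_j) + \frac{\varepsilon}{2}. \tag{2.1b}$$

On the other hand, by Problem 3.22, Chapter 5, for each $\varepsilon > 0$, there is a finite cover of $P$ by disjoint semi-open rectangles $R_1, \dots, R_r$ such that

$$\sum_{i=1}^{r} \lambda(R_i) - \frac{\varepsilon}{2} < \lambda(P) \le \sum_{i=1}^{r} \lambda(R_i). \tag{2.1c}$$

Equations (2.1b) and (2.1c) yield

$$\sum_{i=1}^{r} \lambda(R_i) - \frac{\varepsilon}{2} < \lambda(P) < \sum_{j=1}^{N} \lambda(C_j) + \frac{\varepsilon}{2}. \tag{2.1d}$$

Therefore, from (2.1d) we have that

$$0 \le \sum_{i=1}^{r} \lambda(R_i) - \sum_{j=1}^{N} \lambda(C_j) < \varepsilon. \tag{2.1e}$$

Now $L_*(C) = P$ is a compact paralleletop with the property that for each $\varepsilon > 0$, there is a finite cover of $P$ by semi-open disjoint rectangles and a finite tuple of semi-open disjoint rectangles that can "approximate" $P$ from above and below,

$$\sum_{j=1}^{N} C_j \subseteq P \subseteq \sum_{i=1}^{r} R_i. \tag{2.1f}$$

In terms of the Lebesgue measure $\lambda$, this is in accordance with (2.1c-2.1f).

(v) Suppose $L$ is an elementary linear operator. Then, applying $L$ to (2.1f) and evaluating the Lebesgue measure of the resulting inclusion we have

$$\lambda\Big(L_*\Big(\sum_{j=1}^{N} C_j\Big)\Big) \le \lambda(L_*(P)) \le \lambda\Big(L_*\Big(\sum_{i=1}^{r} R_i\Big)\Big).$$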
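From (2.1a), the last inequality can be rewritten as

$$|\det M| \sum_{j=1}^{N} \lambda(C_j) \le \lambda(L_*(P)) \le |\det M| \sum_{i=1}^{r} \lambda(R_i)$$

or, with notation $\mathcal{C} = \sum_{j=1}^{N} C_j$ and $\mathcal{R} = \sum_{i=1}^{r} R_i$, in the form

$$\lambda(L_*(\mathcal{C})) = |\det M|\,\lambda(\mathcal{C}) \le \lambda(L_*(P)) \le \lambda(L_*(\mathcal{R})) = |\det M|\,\lambda(\mathcal{R}). \tag{2.1g}$$

On the other hand, replacing $\varepsilon$ in (2.1e) by $\varepsilon / |\det M|$ we get

$$|\det M|\,[\lambda(\mathcal{R}) - \lambda(\mathcal{C})] < \varepsilon. \tag{2.1h}$$

We conclude that, if $L$ is an elementary operator applied to a compact paralleletop $P$, for each $\varepsilon > 0$, there are a subset $\mathcal{C}$ and a superset $\mathcal{R}$ of $P$ whose images under $L_*$ satisfy inequalities (2.1g-2.1h) and

$$\lambda(L_*(\mathcal{C})) = |\det M|\,\lambda(\mathcal{C}) \tag{2.1i}$$

and

$$\lambda(L_*(\mathcal{R})) = |\det M|\,\lambda(\mathcal{R}). \tag{2.1j}$$

Equations (2.1g-2.1j) yield that

$$\lambda(L_*(P)) = |\det M|\,\lambda(P). \tag{2.1k}$$

(vi) If $L$ is a regular linear operator, then, as it is known from linear algebra, $L$ can be expressed as a composition of finitely many elementary operators or, equivalently, $M = M_1 \cdots M_s$, where the $M_i$'s are elementary matrices. (One of the arguments is the Gauss-Jordan algorithm for derivation of the matrix inverse.) The application of $L_* = (L_1 \circ \cdots \circ L_s)_*$, or of any partial composition of $L_1, \dots, L_s$, to $C$ makes it a compact paralleletop such as $P$ above. Consequently,

$$\lambda(L_*(C)) = \lambda\big((L_1)_*[(L_2 \circ \cdots \circ L_s)_*(C)]\big)$$

and because of (2.1k),

$$\lambda(L_*(C)) = |\det M_1|\,\lambda\big((L_2 \circ \cdots \circ L_s)_*(C)\big) = \cdots = |\det M_1| \cdots |\det M_s|\,\lambda(C),$$

which finally yields

$$\lambda(L_*(C)) = |\det M|\,\lambda(C) = |\det M|.$$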
2.2 Theorem. Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator specified by a matrix $M$. Then, for every Lebesgue measurable set $E$,

$$\lambda(L_*(E)) = |\det M|\,\lambda(E). \tag{2.2}$$
Proof.

(i) If $M$ is a singular matrix, then $L$ maps the ($n$-dimensional) set $E$ into $\mathbb{R}^m$, where $m < n$ and, therefore, $L_*(E)$ becomes $\lambda$-negligible. On the other hand, $\det M = 0$ and thus equation (2.2) is valid.

(ii) Suppose $M$ is regular.* Then $L$ is diffeomorphic on $\mathbb{R}^n$ and, due to Proposition 1.21, $L_*(E) \in \mathcal{L}$. Denote
Then J.L� is a measure on ( z ) l l l e: z E C}, where
(2.4a)
$x, x_0 \in C$ and, obviously, $d\Phi(x) = dF(x) - I$. From (2.4a),

$$\|F(x) - F(x_0)\|_u \le \|\Phi(x) - \Phi(x_0)\|_u + \|x - x_0\|_u \le (K+1)\,\|x - x_0\|_u. \tag{2.4b}$$

If $x_0$ is the center of a cube $C$ and $2r$ is the length of its edge, $\|x - x_0\|_u \le r$ and

$$\|F(x) - F(x_0)\|_u \le r(K+1). \tag{2.4c}$$

The last inequality tells us that $F(x)$ belongs to the compact cube centered at $F(x_0)$ with edge $2r(K+1)$ or ball with radius $r(K+1)$, with respect to the supremum norm, in notation $B_u(F(x_0), r(K+1))$. In other words, (2.4c) yields that

$$F_*(C) \subseteq B_u(F(x_0), r(K+1))$$

(see Figure 2.1),
[Figure 2.1: the cube $C$ centered at $x_0$ with half-edge $r$, and its image $F_*(C)$ contained in the cube centered at $F(x_0)$ with half-edge $r(K+1)$.]

and because $F_*(C)$ is a Borel set,*

$$\lambda(F_*(C)) \le \lambda_0(B_u(F(x_0), r(K+1))) = (2r)^n (K+1)^n = (K+1)^n \lambda_0(C).$$

Now, if $|||dF(x) - I|||_e \le \varepsilon$ for all $x \in C$, then $K \le \varepsilon\sqrt{n}$ and (2.4) follows.
2.5 Proposition. Let $[\Omega \subset \mathbb{R}^n, \Omega_1 \subset \mathbb{R}^n, F]$ be a diffeomorphism. Suppose for some $b > 0$,

$$|J_F(x)| = |\det dF(x)| \le b, \tag{2.5}$$

for all $x \in B$, where $B$ is a Borel subset of $\Omega$. Then,

$$\lambda(F_*(B)) \le b\,\lambda(B). \tag{2.5a}$$

Proof.

(i) Suppose $B$ is an open and bounded set such that $\bar{B} \subset \Omega$. We prove (2.5a) under the assumption that (2.5) holds true for all $x \in \bar{B}$. Denote
0 such that

$$\|\Psi(x)\| \le M \quad \text{for all } x \in \bar{B}. \tag{2.5c}$$

As a $C^1$-map, $F'$ is continuous on $\bar{B}$ and because $\bar{B}$ is compact, $F'$ is therefore uniformly continuous, i.e., for every $\varepsilon > 0$, there is a $\delta > 0$ such that, for all $x, y \in \bar{B}$ with $\|x - y\|_e < \delta$,

$$|||F'(x) - F'(y)|||_e < \frac{\varepsilon}{M}. \tag{2.5d}$$

Combining (2.5c) and (2.5d) we have from (2.5b) that

$$|||[\Psi(x_0)F(x) - Ix]'|||_e < \varepsilon \quad \text{given } \|x - x_0\|_e < \delta. \tag{2.5e}$$
By Problem 2.10, Chapter 4, $B$, as an open set, can be represented as at most a countable union of disjoint semi-open cubes $\{C_k\}$ with edges parallel to the coordinate axes. Obviously, we can assume that the edge of each cube does not exceed $2\delta$ or, otherwise, we can subdivide the edges accordingly if necessary. Now, if $x_0$ is the center of such a cube, then $\|x - x_0\|_u < \delta$ for any $x$ from the cube. From Problem 1.13,

Hence,

(2.5f)

Since $\Psi(x_0)F$ is diffeomorphic (as a composition of regular linear and diffeomorphic maps), $\Psi(x_0)F_*(C_k)$ is a Borel set. Since $F'(x_0)$ is a linear operator, by Theorem 2.2, and from (2.5f),

$$\lambda(F_*(C_k)) = \lambda(F'(x_0)\Psi(x_0)F_*(C_k)) = |\det F'(x_0)|\,\lambda(\Psi(x_0)F_*(C_k)). \tag{2.5g}$$

By our assumption, $|\det F'(x)| \le b$ on $B$. By Lemma 2.4, applied to $\Psi(x_0)F$,

$$\lambda(\Psi(x_0)F_*(C_k)) \le (1 + \varepsilon\sqrt{n})^n \lambda_0(C_k).$$

Hence,

$$\lambda(F_*(C_k)) \le b\,(1 + \varepsilon\sqrt{n})^n \lambda_0(C_k). \tag{2.5h}$$

Inequality (2.5h) holds for any cube. Now, since $B = \sum_{k=1}^{\infty} C_k$, we have that

$$F_*(B) = \sum_{k=1}^{\infty} F_*(C_k)$$

and thus

$$\lambda(F_*(B)) = \sum_{k=1}^{\infty} \lambda(F_*(C_k)) \le b\,(1 + \varepsilon\sqrt{n})^n \sum_{k=1}^{\infty} \lambda_0(C_k) = b\,(1 + \varepsilon\sqrt{n})^n \lambda(B).$$

Since the latter holds for every $\varepsilon > 0$, we have that

$$\lambda(F_*(B)) \le b\,\lambda(B).$$

Hence, given that (2.5) holds true on an open and bounded set $B$, (2.5a) is valid.

(ii) Now we suppose that (2.5) holds true on $\Omega$. Note that $\Omega$ is open but not necessarily bounded. By Problem 6.12, Chapter 3, there is a monotone sequence $\{O_k\}$ of bounded open subsets of $\Omega$, increasing to $\Omega$. By Part (i), for each $O_k$,

$$\lambda(F_*(O_k)) \le b\,\lambda(O_k).$$

Since $F_*(\Omega) = \bigcup_{k=1}^{\infty} F_*(O_k)$, by continuity from below,

$$\lambda(F_*(\Omega)) \le b\,\lambda(\Omega).$$
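(iii) Finally, let $B$ be a Borel subset of $\Omega$ on which (2.5) holds true. By regularity of $\lambda$, Problem 3.15 (Chapter 5), for each $\varepsilon > 0$, there is an open superset $O_\varepsilon$ of $B$ such that $\lambda(O_\varepsilon \setminus B) < \varepsilon$ or $\lambda(O_\varepsilon) < \lambda(B) + \varepsilon$. We assume that $O_\varepsilon \subseteq \Omega$, or, otherwise, we take $O_\varepsilon \cap \Omega$ instead. Denote

$$\tilde{O} = O_\varepsilon \cap \{x \in \mathbb{R}^n: |\det F'(x)| < b + \varepsilon\}.$$

$\tilde{O}$ has the following properties:

1) Since $|\det F'(x)| \le b$ on $B$, $B \subseteq \tilde{O}$.

2) Since $\tilde{O} = O_\varepsilon \cap \{x \in \mathbb{R}^n: |\det F'(x)| < b + \varepsilon\}$, by Problem 1.14, $\tilde{O}$ is open.

So, we have that $B \subseteq \tilde{O} \subseteq O_\varepsilon$. Thus,

$$\lambda(F_*(B)) \le \lambda(F_*(\tilde{O})) \le (b + \varepsilon)\lambda(\tilde{O}) \le (b + \varepsilon)\lambda(O_\varepsilon) < (b + \varepsilon)[\lambda(B) + \varepsilon].$$

This holds true for any $\varepsilon > 0$. Hence it yields the statement. $\Box$

2.6 Proposition. Let $[\Omega \subset \mathbb{R}^n, \Omega_1 \subset \mathbb{R}^n, F]$ be a diffeomorphism. Then for each Borel subset $B$ of $\Omega$,

$$\lambda(F_*(B)) = \int_B |J_F(x)|\,\lambda(dx). \tag{2.6}$$

Proof.

(i) Let $B$ be a Borel subset of $\Omega$ such that $\lambda(B) < \infty$. Define for each $k = 1, 2, \dots$ and a fixed positive integer $m$,

$$B_{mk} = \Big\{x \in B: \frac{k-1}{m} \le |J_F(x)| < \frac{k}{m}\Big\}.$$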
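From Proposition 2.5,

$$\lambda(F_*(B_{mk})) \le \frac{k}{m}\,\lambda(B_{mk}). \tag{2.6a}$$

From Example 1.20, (1.20d),

$$\frac{1}{J_F(x)} = J_{F^{-1}}(F(x)). \tag{2.6b}$$

For all $x \in B_{mk}$, $\frac{k-1}{m} \le |J_F(x)|$ $\big(< \frac{k}{m}\big)$ and hence, from (2.6b),

$$\Big(\frac{m}{k} <\Big)\ |J_{F^{-1}}(F(x))| \le \frac{m}{k-1} \quad \text{or} \quad \Big(\frac{m}{k} <\Big)\ |J_{F^{-1}}(y)| \le \frac{m}{k-1} \ \text{ for all } y \in F_*(B_{mk}).$$

If we apply Proposition 2.5 to $F^{-1}$ we will have that

$$\lambda(B_{mk}) \le \frac{m}{k-1}\,\lambda(F_*(B_{mk})),$$

which along with (2.6a) yields

$$\frac{k-1}{m}\,\lambda(B_{mk}) \le \lambda(F_*(B_{mk})) \le \frac{k}{m}\,\lambda(B_{mk}). \tag{2.6c}$$

For all $x \in B_{mk}$,

$$\frac{k-1}{m} \le |J_F(x)| < \frac{k}{m}. \tag{2.6d}$$

Integrating (2.6d) we have

$$\frac{k-1}{m}\,\lambda(B_{mk}) \le \int_{B_{mk}} |J_F(x)|\,\lambda(dx) \le \frac{k}{m}\,\lambda(B_{mk}). \tag{2.6e}$$

Combining (2.6c) and (2.6e) leads to

$$\Big|\lambda(F_*(B_{mk})) - \int_{B_{mk}} |J_F(x)|\,\lambda(dx)\Big| \le \frac{k}{m}\,\lambda(B_{mk}) - \frac{k-1}{m}\,\lambda(B_{mk}) = \frac{1}{m}\,\lambda(B_{mk})$$

and, after summing over $k$, to

$$\Big|\sum_{k=1}^{\infty} \Big\{\lambda(F_*(B_{mk})) - \int_{B_{mk}} |J_F(x)|\,\lambda(dx)\Big\}\Big| = \Big|\lambda(F_*(B)) - \int_B |J_F(x)|\,\lambda(dx)\Big| \le \frac{1}{m}\,\lambda(B). \tag{2.6f}$$

Since by our assumption $\lambda(B) < \infty$, we have from (2.6f) the validity of (2.6) by letting $m \to \infty$.

(ii) If $B$ is an arbitrary Borel set, we can make a countable decomposition of $B = \sum_{s=1}^{\infty} B_s$ such that $\lambda(B_s) < \infty$ and get (2.6) by summing up the equations

$$\lambda(F_*(B_s)) = \int_{B_s} |J_F(x)|\,\lambda(dx)$$

over $s$. $\Box$

2.7 Remark. Formula (2.6) can be alternatively expressed as follows. Let $B_1$ be a Borel subset of $\Omega_1$ and $B = F^*(B_1)$. Then $B$ is also Borel and $B_1 = F_*(B)$. Applying Proposition 2.6 to such a $B$, we have that

$$\lambda(B_1) = \int_{F^*(B_1)} |J_F(x)|\,\lambda(dx). \tag{2.7}$$

$\Box$

2.8 Theorem. (Change of Variables.) Let $[\Omega \subset \mathbb{R}^n, \Omega_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism, let $A$ be a Borel subset of $\Omega$ and $A_1 = F_*(A)$. Then for each Borel measurable function $[\Omega_1, \mathbb{R}, g]$,

$$\int_{A_1} g(y)\,\lambda(dy) = \int_A g(F(x))\,|J_F(x)|\,\lambda(dx). \tag{2.8}$$

Proof. Let $g = 1_{B_1}$ for some Borel subset $B_1$ of $A_1$ and $B = F^*(B_1)$.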
Then, by (2.6),

$$\int_{A_1} g(y)\,\lambda(dy) = \int_{A_1} 1_{B_1}(y)\,\lambda(dy) = \lambda(B_1) = \lambda(F_*(B)) = \int_B |J_F(x)|\,\lambda(dx)$$

$$= \int_A 1_B(x)\,|J_F(x)|\,\lambda(dx) = \int_A 1_{B_1}(F(x))\,|J_F(x)|\,\lambda(dx) = \int_A g(F(x))\,|J_F(x)|\,\lambda(dx). \tag{2.8a}$$

Thus (2.8) holds true for $g$ being an indicator function. Let $g$ be a simple function, i.e., $g = \sum_{i=1}^{k} a_i 1_{B_i}$, where $\{B_i,\ i = 1, \dots, k\}$ is a measurable partition of $A_1$. From (2.8a),

$$\int_{A_1} g(y)\,\lambda(dy) = \int_{A_1} \sum_{i=1}^{k} a_i 1_{B_i}(y)\,\lambda(dy) = \sum_{i=1}^{k} a_i \int_{A_1} 1_{B_i}(y)\,\lambda(dy)$$

$$= \sum_{i=1}^{k} a_i \int_A 1_{B_i}(F(x))\,|J_F(x)|\,\lambda(dx) = \int_A g(F(x))\,|J_F(x)|\,\lambda(dx).$$
The rest of this theorem is due to the standard procedure by going over to the class of nonnegative Borel measurable functions and then to $g = g^+ - g^-$. $\Box$

2.9 Examples. (Spherical Coordinate Transformation).

(i) Let $\Omega$ be an open subset of $\mathbb{R}^3$ defined as

$$\Omega = \{(r, \theta, \varphi) \in \mathbb{R}^3: r > 0,\ 0 < \theta < 2\pi,\ 0 < \varphi < \pi\}$$

and let $[\Omega, \mathbb{R}^3, F]$ be defined as

$$F = \big(x(r,\theta,\varphi) = r\cos\theta\sin\varphi,\ \ y(r,\theta,\varphi) = r\sin\theta\sin\varphi,\ \ z(r,\theta,\varphi) = r\cos\varphi\big)^T. \tag{2.9}$$

The transformation has the range $\mathbb{R}^3 \setminus D$, where $D = \{(x,y,z) \in \mathbb{R}^3: x \ge 0,\ y = 0,\ z \in \mathbb{R}\}$. One can easily see that $F$ is a $C^1$-map on $\Omega$ and its Jacobian, $J_F(r,\theta,\varphi) = -r^2\sin\varphi \ne 0$ on $\Omega$. By Remark 1.19 (ii), $[\Omega, F_*(\Omega) = \mathbb{R}^3 \setminus D, F]$ is a diffeomorphism. Such a map transforms the rectangle $[0,\rho] \times [0,2\pi) \times [0,\pi]$ onto the ball $B_e(0,\rho)$, but it obviously fails to be a diffeomorphism. On the other hand, if we take $R = (0,\rho) \times (0,2\pi) \times (0,\pi)$ instead as the domain of $F$, it will transform the open rectangle $R$ onto an open ball $B_e(0,\rho)$ with the deleted sector

$$S = \{(x,y,z) \in \mathbb{R}^3: x = r\sin\varphi,\ y = 0,\ z = r\cos\varphi,\ 0 \le r < \rho,\ 0 \le \varphi \le \pi\} = \{(x,y,z) \in \mathbb{R}^3: x^2 + z^2 < \rho^2,\ 0 \le x,\ y = 0\}.$$

The transformation $[R, B_e(0,\rho) \setminus S, F]$, with $F$ defined by (2.9), is clearly a diffeomorphism.

(ii) Let $[\mathbb{R}, \mathbb{R}, h]$ be a continuous function and let $g$ be defined as

$$g(x,y,z) = h\big(\sqrt{x^2 + y^2 + z^2}\big). \tag{2.9a}$$

Let $B_e(0,\rho)$ be an open ball in $\mathbb{R}^3$. We will show that

$$\int_{B_e(0,\rho)} g\,d\lambda = 4\pi \int_0^{\rho} h(r)\,r^2\,dr. \tag{2.9b}$$

Consider the transformation $[R, B_e(0,\rho) \setminus S, F]$ from (i). Since $S$ is a two-dimensional set, its Lebesgue measure in $\mathbb{R}^3$ is zero and, consequently,

$$\int_{B_e(0,\rho)} g\,d\lambda = \int_{B_e(0,\rho) \setminus S} g\,d\lambda.$$

Now we are going to apply formula (2.8):

$$\int_{A_1} g\,d\lambda = \int_A g(F(p))\,|J_F(p)|\,\lambda(dp),$$

with $A_1 = B_e(0,\rho) \setminus S$, $A = F^*(A_1) = R = (0,\rho) \times (0,2\pi) \times (0,\pi)$, $p = (r,\theta,\varphi)$, and $|J_F(p)| = r^2\sin\varphi$. Clearly, $g(F(p)) = h(r)$, which by Fubini's Theorem leads to

$$\int_{A_1} g\,d\lambda = \int_{r=0}^{\rho} \int_{\theta=0}^{2\pi} \int_{\varphi=0}^{\pi} h(r)\,r^2\sin\varphi\ \lambda(dr)\,\lambda(d\theta)\,\lambda(d\varphi).$$

The last expression reduces to a Riemann integral and this further reduces to $4\pi \int_0^{\rho} h(r)\,r^2\,dr$. $\Box$
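The reduction in (ii) can be illustrated numerically. The sketch below is an assumption-laden check, not part of the text: it approximates the iterated integral in spherical coordinates by the midpoint rule (with the Jacobian factor $r^2\sin\varphi$) and compares it with a midpoint approximation of $4\pi\int_0^\rho h(r)r^2\,dr$. The function names and the grid size `n` are illustrative.

```python
import math

def ball_integral_spherical(h, rho, n=400):
    """Midpoint-rule approximation of the integral of g(x, y, z) = h(|(x, y, z)|)
    over the ball of radius rho, computed in spherical coordinates with the
    Jacobian factor r^2 * sin(phi). The integrand separates, so the triple
    integral is a product of three one-dimensional integrals."""
    dr = rho / n
    dphi = math.pi / n
    radial = sum(h((i + 0.5) * dr) * ((i + 0.5) * dr) ** 2 for i in range(n)) * dr
    polar = sum(math.sin((j + 0.5) * dphi) for j in range(n)) * dphi
    azimuthal = 2.0 * math.pi  # the integrand does not depend on theta
    return radial * polar * azimuthal

def radial_formula(h, rho, n=400):
    """Midpoint-rule approximation of 4*pi * integral_0^rho h(r) r^2 dr."""
    dr = rho / n
    return 4.0 * math.pi * sum(
        h((i + 0.5) * dr) * ((i + 0.5) * dr) ** 2 for i in range(n)
    ) * dr

if __name__ == "__main__":
    rho = 2.0
    # With h = 1 both quantities approximate the volume (4/3) * pi * rho^3.
    print(ball_integral_spherical(lambda r: 1.0, rho))
    print(radial_formula(lambda r: 1.0, rho))
    print(4.0 / 3.0 * math.pi * rho ** 3)
```

With $h \equiv 1$ both approximations are close to the volume $\frac{4}{3}\pi\rho^3$ of the ball, which is the simplest instance of (2.9b).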
PROBLEMS

2.1 Show the validity of (2.1a).

2.2 Let $[\mathbb{R}, \mathbb{R}, h]$ be a continuous function and let $E(0; a_1, a_2, a_3)$ denote the open ellipsoid

Show that

2.3 Show that the volume of the ellipsoid in Problem 2.2 is $\frac{4}{3}\pi a_1a_2a_3$.

2.4 Evaluate the integral

$$\int_{B_e(0,\rho)} \exp\{(x^2 + y^2 + z^2)^{3/2}\}\,d\lambda(x,y,z),$$

where $B_e(0,\rho)$ is a ball in $\mathbb{R}^3$.

NEW TERMS:
Borel-Lebesgue measure of a cube under a linear map
Lebesgue measure of a set under an affine map
Borel-Lebesgue measure of a Borel set under a diffeomorphism
change of variables in Euclidean spaces
spherical coordinate transformation
volume of an ellipsoid
Part III Further Topics in Integration
Chapter 8

Analysis in Abstract Spaces

This chapter (which is the least focused of the entire text) continues integration started in Chapter 6 and combines seemingly diverse topics from measure, integration, functional analysis, and topology. After we learned about absolute continuity of positive measures briefly introduced in Chapter 6, Section 5 (which may be sufficient for a first acquaintance), we will render a more thorough analysis of the Radon-Nikodym theory (Section 2) from the position of signed and complex measures (the subject of Section 1). Singularity and Lebesgue decomposition of signed measures are also treated here (Section 3) in a more rigorous fashion. The reader will definitely benefit from having a first look at Chapter 6, Section 5, even though many of its formalities are suppressed. The results on signed measures are then applied to the analysis of $L^p$ spaces (a traditional topic of functional analysis) and generalization of the Lebesgue Dominated Convergence Theorem (Section 4), followed by convergence of measures (Section 5) and uniform integrability (Section 6). In Section 7, we return to locally compact Hausdorff spaces (started in Sections 10 and 11, Chapter 3) in connection with regularity of Radon measures and the general proof of the Riesz Representation Theorem. The chapter concludes with derivatives of measures (Section 8), making traditional calculus on the real line (Chapter 9) very powerful.

Besides the Radon-Nikodym Theorem (initially discussed in Chapter 6), $L^p$ spaces and the Riesz Representation Theorem are among the main topics of this chapter. $L^p$ spaces (and their duals) were introduced and studied by the Hungarian Frigyes (Frederic) Riesz (one of the major figures in early functional analysis), who presented in 1910 a fully developed theory of these spaces, operators on them, and their spectral theory. His widely cited 1909 Representation Theorem (of continuous linear functionals through integrals), initiated by Jacques Hadamard in 1903, was his other major accomplishment, even though he proved this theorem for the special case of Riemann-Stieltjes integrals on $[a,b]$. Consequently, Riesz used no measure theory, although his work made a huge impact on the development of measure theory and integration and, in particular, led Johann Radon to his 1913 revolutionary work.
1. SIGNED AND COMPLEX MEASURES
The situation below motivates the study of a more general class of set functions than those we called "measures." Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f \in L^1(\Omega, \Sigma, \mu; \mathbb{R})$. Define the following set function on $\Sigma$:

$$\nu(A) = \int_A f\,d\mu = \nu^+(A) - \nu^-(A),$$

where

$$\nu^+(A) = \int_A f^+\,d\mu \quad \text{and} \quad \nu^-(A) = \int_A f^-\,d\mu.$$

The set function $\nu$ has all the properties of a measure ($\sigma$-additivity follows by Lebesgue's Dominated Convergence Theorem) except for being positive. However, in the above decomposition $\nu = \nu^+ - \nu^-$, the set function $\nu$ is represented by the difference of two measures. We will study this type of a set function, which we wish to call a signed measure. We give a formal definition below, without saying anything about a decomposition which is to follow later.

1.1 Definitions.
(i) Let $(\Omega, \Sigma)$ be a measurable space. A set function $\nu: \Sigma \to \bar{\mathbb{R}}$ is called a signed measure if:

a) $\nu(\emptyset) = 0$;
b) for each $A \in \Sigma$, the value of $\nu(A)$ is well defined, i.e. it is either finite or $+\infty$ or $-\infty$;
c) $\nu$ is $\sigma$-additive.

To tell signed measures from nonnegative measures, we will refer to the latter as positive measures. $\mathfrak{S}(\Omega, \Sigma)$ will denote the set of all signed measures on the measurable space $(\Omega, \Sigma)$.

(ii) The signed measure is called finite if its range is a subset of $\mathbb{R}$. Otherwise, it is called infinite. The triple $(\Omega, \Sigma, \nu)$ is called the signed measure space. According to the type of the signed measure, the signed measure space is referred to as finite or infinite. The signed measure $\nu$ is called $\sigma$-finite if $\Omega$ admits a countable measurable partition $\{\Omega_n\}$ of $\nu$-finite sets.

(iii) Sometimes, we will need a notion of a finite set under $\nu$ (or just a $\nu$-finite set). This is referred to as a measurable set $A$ with $|\nu(A)| < \infty$. A measurable set $P$ is called $\nu$-positive (or just positive) if $\nu(P \cap A) \ge 0$ for all $A \in \Sigma$. A measurable set $N$ is called $\nu$-negative (or just negative) if $\nu(N \cap A) \le 0$ for all $A \in \Sigma$. Obviously, $P$ ($N$) is positive (negative) if and only if for any measurable subset $E$ of $P$ ($N$), $\nu(E) \ge 0$ ($\le 0$).
(iv) A set function $\nu: \Sigma \to \bar{\mathbb{R}}$ is called continuous from below if for every monotone nondecreasing sequence $\{A_n\}\uparrow\ \subseteq \Sigma$ it holds that

$$\lim_{n\to\infty} \nu(A_n) = \nu\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$

(v) Let $\{A_n\}$ be a monotone nonincreasing sequence of sets from $\Sigma$ of which at least one is $\nu$-finite. A set function $\nu: \Sigma \to \bar{\mathbb{R}}$ is said to be continuous from above on $\{A_n\}$ if

$$\lim_{n\to\infty} \nu(A_n) = \nu\Big(\bigcap_{n=1}^{\infty} A_n\Big). \tag{1.1}$$

The set function $\nu$ is continuous from above on $\Sigma$, if (1.1) holds for every monotone nonincreasing sequence $\{A_n\}\downarrow\ \subseteq \Sigma$ with at least one $\nu$-finite set. In particular, if $\{A_n\}\downarrow\emptyset$, (1.1) reduces to

$$\lim_{n\to\infty} \nu(A_n) = 0$$

and this is referred to as continuity from above at the empty set or, shortly, $\emptyset$-continuity of $\nu$.

(vi) Any signed measure on the Borel $\sigma$-algebra is called a signed Borel measure. In particular, a signed Borel measure on ($\mathbb{R}^n$, 0, $A$ is negative itself and the statement of the lemma is proved. Otherwise, let

$$S_0 = \sup\{\nu(C): C \subseteq B_0 = A\},$$
which, by Proposition 1.3 (i), is finite and by our assumption about $A$ is also positive. Hence, for every $\varepsilon > 0$, there is a set $C_1 \subseteq A$ such that $\nu(C_1) + \varepsilon > S_0 > 0$. Let $\varepsilon = \frac{1}{2}S_0$. Then, $C_1$ is such that $\nu(C_1) > \frac{1}{2}S_0$. Now, if $B_1 = A \setminus C_1$ is $\nu$-negative, then we are done with the proof. Indeed, $\nu(B_1) = \nu(A) - \nu(C_1)$, by Proposition 1.3 (i), and because $\nu(C_1) > 0$, $\nu(B_1) < \nu(A)$. Otherwise, there is at least one subset of $B_1$ whose measure is strictly positive. Continuing with the same procedure, at step $n$ we arrive at set

$$B_n = A \setminus \sum_{k=1}^{n} C_k,$$

which is either a $\nu$-negative set satisfying $\nu(B_n) < \nu(A)$ or it admits at least one subset with a positive value under $\nu$. This again leads to a positive real number

$$S_n = \sup\{\nu(C): C \subseteq B_n\}$$

and the existence of a nontrivial set $C_{n+1}$ such that $\nu(C_{n+1}) > \frac{1}{2}S_n > 0$. If for no $n$, $B_n$ defined above is negative, then we set

$$N = A \setminus \sum_{n=1}^{\infty} C_n.$$

We show that $N$ is a negative subset of $A$ claimed in the statement of the lemma. From

$$\nu(A) = \nu(N) + \sum_{n=1}^{\infty} \nu(C_n)$$

we see that both $\nu(N)$ and $\sum_{n=1}^{\infty} \nu(C_n)$ are finite. The latter implies that $\nu(C_n)$ and, consequently, $S_n$, dominated by $2\nu(C_{n+1})$, are vanishing. (Notice that, because $\nu(\sum_{n=1}^{\infty} C_n) > 0$, $N \ne \emptyset$.) This in turn yields that $N$ is negative. Indeed, from the definition of $S_n$, for every measurable subset $E$ of $B_n$, $\nu(E) \le S_n$. Since $N \subseteq B_n$, it follows that for every measurable set $D$, $\nu(N \cap D) \le S_n \downarrow 0$. Finally, that $\nu(N) < \nu(A)$ is obvious. $\Box$

The following theorem states that there is an (essentially unique) decomposition of the carrier set $\Omega$ into a positive and a negative set relative to a given signed measure $\nu$. This decomposition, referred to as a Hahn decomposition, leads to the upcoming Jordan decomposition of $\nu$ into the difference of two positive measures mentioned in the beginning of this section.

1.6 Theorem (Hahn Decomposition Theorem). Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then $\Omega$ can be partitioned into two sets, $P$ and $N$, of which $P$ is a positive and $N$ is a negative set, referred to as a Hahn decomposition of $\Omega$ with respect to $\nu$, in notation $(P, N)$. A Hahn decomposition is unique in the following sense. If there is another Hahn decomposition $(P', N')$, then $P \Delta P'$ and $N \Delta N'$ are $\nu$-null sets and therefore all Hahn decompositions form a unique equivalence class.
Proof. We assume without loss of generality that $\nu$ does not take the value $-\infty$. If $\emptyset$ is the only negative set of $\nu$, then for each $A \in \Sigma$, $\nu(A) \ge 0$. (If there is a set $A$ such that $\nu(A) < 0$, then by Lemma 1.5 there would be a nonempty, negative subset of $A$.) Therefore, $(\Omega, \emptyset)$ is the "trivial" Hahn decomposition and we are done with the proof. Let

$$I = \inf\{\nu(E): E \in \Sigma \text{ and } E \text{ is } \nu\text{-negative}\}.$$

Clearly, $I \le 0$. Then, there is a sequence $\{N_n\}$ of negative sets with $\lim_{n\to\infty} \nu(N_n) = I$. Because of Problem 1.5,

$$N := \bigcup_{n=1}^{\infty} N_n$$

is also a negative set. Regarding $B_n$ as $\bigcup_{k=1}^{n} N_k$, we have $\{B_n\}$ as a monotone nondecreasing sequence of negative sets $\uparrow N$ and hence, by Proposition 1.3 (ii), $\lim_{n\to\infty} \nu(B_n) = \nu(N)$. Furthermore, since $B_n \setminus N_n \subseteq B_n$ and $B_n$ is negative, $\nu(B_n \setminus N_n) \le 0$. On the other hand, $\nu(B_n \setminus N_n) = \nu(B_n) - \nu(N_n)$ and thus $\nu(B_n) \le \nu(N_n)$. The latter yields that $\nu(N) \le I$. On the other hand, as for a negative set, $\nu(N) \ge I$, and thus $\nu(N) = I$.

Now we show that $P = N^c$ is a $\nu$-positive set. If this is not the case, then there is at least one measurable subset $A$ of $P$ with $-\infty < \nu(A) < 0$ and then, by Lemma 1.5, there is a measurable, negative subset $B$ of $A$ with $\nu(B) \le \nu(A)$; hence $\nu(B) < 0$. Then, $B + N$ makes a negative set such that $\nu(B + N) = \nu(B) + \nu(N) < \nu(N) = I$, which contradicts the fact that $I$ is the infimum of $\nu$ over all negative sets. The uniqueness of the Hahn decomposition is left for an exercise. (See Problem 1.7.) $\Box$

While the Hahn decomposition is a decomposition of the carrier $\Omega$ (with respect to the signed measure $\nu$), the Jordan decomposition below is of the signed measure itself. It states that each signed measure is the difference of two positive measures.

1.7 Corollary (Jordan Decomposition). Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then $\nu$ can be represented as the difference of two positive measures, of which at least one is finite, and this representation is unique (in the sense that it is invariant of any Hahn decomposition).
Proof. Let $(P, N)$ be a Hahn decomposition of $\Omega$ relative to $\nu$ and define the set functions $\nu^+$ and $\nu^-$ on $\Sigma$ as follows:

$$\nu^+(A) = \nu(A \cap P) \quad \text{and} \quad \nu^-(A) = -\nu(A \cap N). \tag{1.7}$$

It follows from the definition of $\nu^+$ and $\nu^-$ that both are positive measures on $\Sigma$. It is also obvious why only one of them can be infinite. Hence, $\nu = \nu^+ - \nu^-$ is the Jordan decomposition induced by the Hahn decomposition $(P, N)$.

Suppose that $\mu^+ - \mu^-$ is yet another Jordan decomposition of $\nu$ induced by the Hahn decomposition $(P', N')$. Then, it can be easily shown (and it is left for an exercise; see Problem 1.8) that $\nu^+ = \mu^+$ and $\nu^- = \mu^-$. $\Box$

1.8 Definition. The Jordan decomposition of a signed measure $\nu$ defined in Corollary 1.7, due to its uniqueness, suggests the following terms:

$\nu^+$ is called the positive variation of $\nu$
$\nu^-$ is called the negative variation of $\nu$
$|\nu| = \nu^+ + \nu^-$ is called the total variation of $\nu$. (As the sum of two positive measures, $|\nu|$ is a positive measure itself.) $\Box$

One of the remarkable properties of the Hahn-Jordan decomposition of a signed measure is that it attains its maximum and minimum values on two disjoint measurable subsets of $\Omega$ as stated by the following proposition.
1.9 Proposition. Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then the positive, negative and total variations of $\nu$ can be represented as follows. Given any measurable set $A \in \Sigma$,

(i) $\nu^+(A) = \sup\{\nu(E): E \in \Sigma \cap A\}$
(ii) $\nu^-(A) = \sup\{-\nu(E): E \in \Sigma \cap A\} = -\inf\{\nu(E): E \in \Sigma \cap A\}$
(iii) $|\nu|(A) = \sup\{\sum_{k=1}^{n} |\nu(E_k)|: \{E_1, \dots, E_n\} \subseteq \Sigma \text{ and } \sum_{k=1}^{n} E_k \subseteq A\}$.

Proof. Denote by $(P, N)$ a Hahn decomposition of $\Omega$ with respect to $\nu$ and let

$$\nu_{\sup}(A) = \sup\{\nu(E): E \in \Sigma \cap A\}$$

and

$$\nu_{\inf}(A) = \sup\{-\nu(E): E \in \Sigma \cap A\} = -\inf\{\nu(E): E \in \Sigma \cap A\}.$$

(i) Clearly, $\nu^+(A) = \nu(A \cap P) \le \nu_{\sup}(A)$. To prove the inverse inequality we notice that because $(P, \Sigma \cap P, \mathrm{Res}_{\Sigma \cap P}\,\nu)$ is a positive measure space, $\mathrm{Res}_{\Sigma \cap P}\,\nu$ is monotone and hence, for each $E \in \Sigma \cap A$,

$$\nu(E) = \nu(E \cap P) + \nu(E \cap N) \le \nu(E \cap P) \le \nu(A \cap P) = \nu^+(A).$$

This yields the desired inverse inequality and thereby proves part (i) of the proposition.

(ii) Because $P$ and $N$ interchange their roles for $-\nu$, we have

$$\nu^-(A) = (-\nu)(A \cap N) = \sup\{-\nu(E): E \in \Sigma \cap A\}$$

and therefore $\nu^- = \nu_{\inf}$.

We leave part (iii) for an exercise (Problem 1.9). $\Box$

1.10 Remark. In summary of the Hahn-Jordan decomposition, we have that

$$\nu^+(A) = \sup\{\nu(E): E \in \Sigma \cap A\} = \nu(A \cap P),$$

$$-\nu^-(A) = \inf\{\nu(E): E \in \Sigma \cap A\} = \nu(A \cap N)$$

and

$$\nu(A) = \nu(A \cap P) + \nu(A \cap N).$$

This has an obvious interpretation. The signed measure $\nu$ attains its maximum and minimum values on two measurable disjoint subsets of $A$: $A \cap P$ and $A \cap N$, respectively; and the entire measure of $A$ is the sum of these two values. In particular, it follows that $P$ and $N$ are the $\nu$-maximal and $\nu$-minimal subsets of $\Omega$ (in notation, $P = S$ and $N = I$) on which $\nu$ attains maximum and minimum values, respectively. This is due to the fact that $(P, \Sigma \cap P, \mathrm{Res}_{\Sigma \cap P}\,\nu)$ is a positive measure space and hence $\mathrm{Res}_{\Sigma \cap P}\,\nu$ is monotone. A similar argument explains why $\nu$ attains a minimum value on $N$. $\Box$

Let us consider a few examples.

1.11 Examples.
(i) Let $(\Omega, \Sigma, \nu)$ be a signed measure space. If $\nu$ is a positive measure, then, obviously, $S = \Omega$ and $I = \emptyset$. Consider the case with $\nu = \int f\,d\rho$, where $\rho$ is a positive measure on $(\Omega, \Sigma)$ and $f \in L^1(\Omega, \Sigma, \rho)$. Then,

$$\nu(A) = \int_A f\,d\rho = \int_{A \cap \{f > 0\}} f\,d\rho + \int_{A \cap \{f < 0\}} f\,d\rho.$$

Therefore,

$$\int_{\{f < 0\}} f\,d\rho \le \nu(A) \le \int_{\{f > 0\}} f\,d\rho, \quad \forall A \in \Sigma,$$

and thus $S = \{f > 0\}$ and $I = \{f < 0\}$.

(ii) If $\mu$ and $\rho$ are two positive measures (of which at least one is finite), the difference $\nu = \mu - \rho$ is a signed measure. However, it is not, in general, the Jordan decomposition of $\nu$. Let $\nu$ be a signed measure on the measurable space $(\Omega, \Sigma)$. Denote by $\nu_E = \mathrm{Res}_{\Sigma \cap E}\,\nu$, where $E$ is a measurable set. To obtain the Jordan decomposition of $\nu = \mu - \rho$, we need any Hahn decomposition of $\Omega$ with respect to $\nu$. Say, $(P, N)$ is one. Then, from Corollary 1.7,

$$\nu^+ = \nu_P = \mu_P - \rho_P \tag{1.11}$$

and

$$\nu^- = -\nu_N = \rho_N - \mu_N. \tag{1.11a}$$

We can also make use of formulas (i) and (ii) of Proposition 1.9 to determine the positive and negative variations.
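For measures on a finite set, the recipe in (ii) can be carried out by hand or by a short program. The sketch below assumes a four-point carrier set with point masses chosen purely for illustration (the dictionaries `MU` and `RHO` and all function names are hypothetical). It builds a Hahn decomposition pointwise, forms $\nu^+$ and $\nu^-$ as in Corollary 1.7, and compares them by brute force with the supremum formulas of Proposition 1.9.

```python
from itertools import combinations

# Two positive measures on the finite set {0, 1, 2, 3}, given by point masses.
# These values are illustrative assumptions.
MU = {0: 2.0, 1: 0.5, 2: 1.0, 3: 0.0}
RHO = {0: 1.0, 1: 1.5, 2: 1.0, 3: 2.0}
OMEGA = sorted(MU)

def nu_point(w):
    """Value of nu = mu - rho on the singleton {w}."""
    return MU[w] - RHO[w]

def nu(A):
    """Signed measure nu(A) = mu(A) - rho(A) for a subset A of OMEGA."""
    return sum(nu_point(w) for w in A)

# Hahn decomposition: on a finite set, P collects the points of
# nonnegative mass and N is its complement.
P = {w for w in OMEGA if nu_point(w) >= 0}
N = set(OMEGA) - P

def nu_plus(A):
    """Positive variation: nu restricted to P."""
    return nu(set(A) & P)

def nu_minus(A):
    """Negative variation: minus nu restricted to N."""
    return -nu(set(A) & N)

def subsets(A):
    """All subsets of A (including the empty set)."""
    A = list(A)
    for k in range(len(A) + 1):
        for c in combinations(A, k):
            yield set(c)

def sup_nu(A):
    """sup{nu(E): E a subset of A}, computed by brute force."""
    return max(nu(E) for E in subsets(A))

def sup_minus_nu(A):
    """sup{-nu(E): E a subset of A}, computed by brute force."""
    return max(-nu(E) for E in subsets(A))

if __name__ == "__main__":
    print("P =", P, " N =", N)
    print("nu+(Omega) =", nu_plus(OMEGA), " mu(Omega) =", sum(MU.values()))
    print("nu-(Omega) =", nu_minus(OMEGA), " rho(Omega) =", sum(RHO.values()))
```

In this sample $\nu^+(\Omega)$ is strictly smaller than $\mu(\Omega)$, so the pair $(\mu, \rho)$ is a representation of $\nu$ as a difference of positive measures but not its Jordan decomposition.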
(iii) Let $\varepsilon_0$ be the point mass and $\mathbb{P}$ a probability measure on ($\mathbb{R}$, . . . $\nu = \lambda - \mu$, where $\lambda$ is the Lebesgue measure on $(\mathbb{R}, \mathfrak{B})$ and $\mu$ is the geometric measure defined as

Clearly, $N = \{1, 2, \dots\}$ is a negative set relative to $\nu$, whereas $P = N^c$ is a positive set. Thus, $(P, N)$ is a Hahn decomposition of $\mathbb{R}$ relative to $\nu$ and, consequently, for every Borel set $A$,

$$\nu^+(A) = (\lambda - \mu)(A \cap \{1, 2, \dots\}^c)$$

and

$$\nu^-(A) = (\mu - \lambda)(A \cap \{1, 2, \dots\})$$

represent the Jordan decomposition of $\nu$. Since $N$ is a $\lambda$-null set, the latter reduces to

$$\nu^-(A) = \mu(A \cap \{1, 2, \dots\}).$$

Therefore, $\nu$ attains its minimum at $N$ and its value is $-1$, while the maximum value of $\nu$ is $\infty$ and it is attained at $N^c$. $\Box$

The next embellishment of the notion of measure is a complex measure.
1.12 Definition. Let $(\Omega, \Sigma)$ be a measurable space. A set function $\nu$ on $\Sigma$ is said to be a complex measure if:

(i) $\nu$ is valued in $\mathbb{C}$. [Notice that being valued in $\mathbb{C}$, $\nu$ must not have infinite values, and therefore, of those signed or positive measures only finite ones can be qualified as complex measures.]

(ii) $\nu(\emptyset) = 0$.

(iii) $\nu$ is $\sigma$-additive. [Analogously to the signed measures (see Remark 1.2 (ii)), $\sigma$-additivity of $\nu$,

$$\nu\Big(\sum_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \nu(A_n),$$

implies that the series $\sum_{n=1}^{\infty} \nu(A_n)$ is also absolutely convergent, i.e. $\sum_{n=1}^{\infty} |\nu(A_n)| < \infty$, where $|\cdot|$ stands for the two-dimensional Euclidean norm.]

The triple $(\Omega, \Sigma, \nu)$ is referred to as a complex measure space. $\Box$

Now, we use a concept similar to that of Proposition 1.9 (iii) to define the total variation of a complex measure.

1.13 Definitions.
(i) Given a complex measure space $(\Omega, \Sigma, \nu)$, the complex measure $\nu$ can be represented as $\nu = \nu_1 + i\nu_2$, where $\nu_1$ and $\nu_2$ are finite signed measures on $\Sigma$. Hahn decompositions should then be applied for $\nu_1$ and $\nu_2$ and their corresponding Jordan decompositions will yield

$$\nu = \nu_1^+ - \nu_1^- + i(\nu_2^+ - \nu_2^-) \tag{1.13}$$

with $\nu_1^+$, $\nu_1^-$, $\nu_2^+$, and $\nu_2^-$ being positive finite measures. We will call (1.13) the Jordan decomposition of the complex measure $\nu$.

(ii) For each measurable set $A$, the total variation $|\nu|(A)$ of a complex measure $\nu$ is defined as

$$\sup\Big\{\sum_{k=1}^{n} |\nu(A_k)|, \text{ over all finite measurable partitions } \{A_1, \dots, A_n\} \text{ of } A\Big\}.$$

$\Box$

1.14 Proposition. The total variation of a complex measure $\nu$ on $(\Omega, \Sigma)$ is a finite positive measure on $(\Omega, \Sigma)$.
Proof. Let $\{A_1, \dots, A_n\}$ be a measurable partition of a set $A \in \Sigma$. Because for nonnegative real numbers $a, b, c, d$,

$$\sqrt{(a-b)^2 + (c-d)^2} \le a + b + c + d,$$

and due to Proposition 1.9 (iii) we have

$$\sum_{k=1}^{n} |\nu(A_k)| \le \sum_{k=1}^{n} |\nu_1|(A_k) + \sum_{k=1}^{n} |\nu_2|(A_k) \le |\nu_1|(A) + |\nu_2|(A),$$

and therefore,

$$|\nu|(A) \le |\nu_1|(A) + |\nu_2|(A) = (\nu_1^+ + \nu_1^- + \nu_2^+ + \nu_2^-)(A). \tag{1.14}$$

Consequently, the total variation of any measurable set is a real nonnegative number. Obviously, $|\nu|(\emptyset) = 0$. Now we show that $|\nu|$ is an additive set function. Let $A$ and $B$ be disjoint measurable sets and let $\{E_1, \dots, E_n\}$ be a measurable partition of $A + B$. Then,

$$\nu(E_k) = \nu(E_k \cap A) + \nu(E_k \cap B)$$

and the triangle inequality of the Euclidean norm yield that:

$$\sum_{k=1}^{n} |\nu(E_k)| \le \sum_{k=1}^{n} |\nu(E_k \cap A)| + \sum_{k=1}^{n} |\nu(E_k \cap B)| \le |\nu|(A) + |\nu|(B),$$

and therefore,

$$|\nu|(A + B) \le |\nu|(A) + |\nu|(B). \tag{1.14a}$$

The inverse inequality is due to the following. Given a measurable partition $\{E_1, \dots, E_n\}$ of $A + B$, it holds true that

$$\sum_{k=1}^{n} |\nu(E_k \cap A)| + \sum_{k=1}^{n} |\nu(E_k \cap B)| = \sum_{k=1}^{2n} |\nu(F_k)| \le |\nu|(A + B)$$

with

$$F_k = \begin{cases} E_k \cap A, & k = 1, \dots, n \\ E_{k-n} \cap B, & k = n+1, \dots, 2n \end{cases}$$

as another partition of $A + B$. Applying the supremum twice to the left-hand side of the above inequality we arrive at the desired inverse to inequality (1.14a). Hence, we showed that the total variation of $\nu$ is a finite content on $\Sigma$.

Finally, by Proposition 1.7 (ii), $|\nu|$ is $\sigma$-additive if it is $\emptyset$-continuous. This readily follows from (1.14) and the fact that $\nu_1^+$, $\nu_1^-$, $\nu_2^+$, and $\nu_2^-$, as positive measures, are $\emptyset$-continuous. $\Box$

1.15 Remarks.
(i) Notice that there is a slight difference in the definition of the total variation of a signed measure and a complex measure, but according to Problem 1.13, they agree in the case of finite signed measures.

(ii) While the set $\mathfrak{S}(\Omega, \Sigma)$ of all signed measures is not a linear space (the sum of two signed measures need not be a signed measure, as we can arrive at $\infty - \infty$), the space $\mathfrak{C}(\Omega, \Sigma)$ of all complex measures (over the field $\mathbb{C}$) is. It is easy to verify that $\|\nu\|$ defined as $|\nu|(\Omega)$ is a norm and therefore upgrades $\mathfrak{C}(\Omega, \Sigma)$ to a normed linear space. It can be shown (Problem 1.14) that ( $= \{f \in L^1_+: \int_A f\,d\mu \le \nu(A),\ \forall A \in \Sigma\}$
the subset of
Since 0 E
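is closed under finite suprema. Indeed, let $f, g \in \Phi$, $A \in \Sigma$,

$$E = \{\omega \in A: f(\omega) \ge g(\omega)\}$$

and

$$G = \{\omega \in A: f(\omega) < g(\omega)\}.$$

Then, $E + G = A$ and

$$\int_A f \vee g\,d\mu = \int_E f\,d\mu + \int_G g\,d\mu \le \nu(E) + \nu(G) = \nu(A).$$

Now, let

$$S := \sup\Big\{\int f\,d\mu: f \in \Phi\Big\} \le \nu(\Omega) < \infty.$$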
Then, there is a sequence {
w
Theorem. L et (0, E) be a measurable space such that for each E n, { w } E E and let J.l be a u-finite positive measure on (n , E ) . Then
3.6
there is a unique decomposition
J.l
= J.lc + J.l d into a continuous and dis-
3. Singularity
crete comp onent such that
f..L c ..L f..Ld ·
Proof. Assume that $\mu$ is finite. Let $C$ be any countable subset of $\Omega$. Then $C$ is measurable and

$$\mu(C) = \sum_{\omega \in C} \mu(\{\omega\}) \le \mu(\Omega) < \infty.$$

Obviously,

$$\sum_{\omega \in \Omega} \mu(\{\omega\}) = \sup\{\mu(C) : C \in \Sigma \text{ and } C \text{ is countable}\}. \tag{3.6}$$

From (3.6) we have that $\sum_{\omega \in \Omega} \mu(\{\omega\}) < \infty$. Thus, $\sum_{\omega \in \Omega} \mu(\{\omega\})$ can have at most countably many positive terms. In other words, the set $A$ of all $\mu$-atoms is at most countable. Denote

$$\mu_d(B) = \mu(A \cap B), \quad B \in \Sigma.$$
Then, $\mu_d$ is an atomic measure. We will show that the set function $\mu_c = \mu - \mu_d$ is a positive measure. It clearly suffices to show that $\mu_c \ge 0$. Let $B$ be a measurable set. Then,

$$\mu(B) = \mu(A \cap B) + \mu(A^c \cap B) = \mu_d(B) + \mu(A^c \cap B).$$

Clearly, $\mu_c$ is continuous and, as mentioned previously, $\mu_c \perp \mu_d$. Consequently, $\mu = \mu_c + \mu_d$ is the desired decomposition.

Now suppose that $\mu$ is $\sigma$-finite and let $\{\Omega_n\}$ be a countable measurable partition of $\Omega$ such that $\mu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\, \mu$ is finite for each $n$. Applying the above arguments to every $\mu_n$, we arrive at the decomposition $\mu_n = \mu_c^n + \mu_d^n$ relative to the set $A_n$ of the atoms of $\mu_n$. Then, $A = \bigcup_n A_n$ is the set of all atoms of $\mu$ and $\mu = \mu_c + \mu_d$, with $\mu_c = \sum_n \mu_c^n$ and $\mu_d = \sum_n \mu_d^n$, is the desired decomposition of $\mu$, and $\mu_c \perp \mu_d$ with respect to $A$. It now remains to prove the uniqueness of the decomposition. Let
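$$\mu = \mu_c + \mu_d = \rho_c + \rho_d. \tag{3.6a}$$

Since the set $A$ of all atoms of $\mu$ is unique, both $\mu_d$ and $\rho_d$ are concentrated on $A$, which makes them clearly equal. If $B$ is a $\mu$-finite measurable set, then $\mu_d = \rho_d$ and (3.6a) immediately imply that $\mu_c(B) = \rho_c(B)$. Otherwise, let $B_n = B \cap \Omega_n$, where $\{\Omega_n\} \uparrow \Omega$ and $\mu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\, \mu < \infty$. Then, $\mu_c(B_n) = \rho_c(B_n)$ and continuity from below lead to

$$\mu_c(B) = \lim_{n \to \infty} \mu_c(B_n) = \lim_{n \to \infty} \rho_c(B_n) = \rho_c(B)$$

and to the equality of $\mu_c$ and $\rho_c$. □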
3.7 Theorem. Let $(\Omega, \Sigma)$ be a measurable space as in Theorem 3.6 and let $\nu$ be a $\sigma$-finite signed measure on $(\Omega, \Sigma)$. Then, given a $\sigma$-finite positive measure $\mu$ on $(\Omega, \Sigma)$, there is a unique decomposition

$$\nu = \nu_{ca} + \nu_{cs} + \nu_d$$

with respect to $\mu$ into three $\sigma$-finite signed measures, of which the first one is continuous and absolutely continuous with respect to $\mu$, the second is continuous and singular with respect to $\mu$, and the third one is atomic. Furthermore, $\nu_d \perp \nu_{ca}$ and $\nu_{cs} \perp \nu_d$.

Proof. Let $\nu = \nu^+ - \nu^-$ be its Jordan decomposition. Then, by Theorem 3.6, $\nu^+$ and $\nu^-$ can be decomposed as

$$\nu^+ = \nu_c^+ + \nu_d^+ \quad \text{and} \quad \nu^- = \nu_c^- + \nu_d^-$$

relative to the sets $A^+$ and $A^-$ of atoms of $\nu^+$ and $\nu^-$, respectively. Consequently,

$$\nu = (\nu_c^+ - \nu_c^-) + (\nu_d^+ - \nu_d^-) = \nu_c + \nu_d$$

is the corresponding decomposition of the signed measure $\nu$ into its continuous $\nu_c$ and atomic $\nu_d$ components with respect to the set $A = A^+ \cup A^-$ of atoms of $\nu$. This representation is obviously unique.
Now, given a $\sigma$-finite positive measure $\mu$, let $\nu = \nu_c + \nu_d$ be the decomposition (with respect to the set $A$ of atoms of $\nu$). According to Theorem 3.4, there is a unique Lebesgue decomposition $\nu_c = \nu_{ca} + \nu_{cs}$ with respect to $\mu$. Therefore, $\nu = \nu_{ca} + \nu_{cs} + \nu_d$ is a unique decomposition of $\nu$ with respect to $\mu$ into three $\sigma$-finite signed measures, of which the first is continuous and absolutely continuous with respect to $\mu$, the second is continuous and singular with respect to $\mu$, and the third one is atomic. Furthermore, we have that $\nu_{ca}(A) = \nu_{cs}(A) = \nu_d(A^c) = 0$. Therefore, $\nu_d \perp \nu_{ca}$ and $\nu_{cs} \perp \nu_d$. □

3.8 Corollary. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ and let $\lambda$ be the Borel-Lebesgue measure. Then, there is a unique decomposition

$$\nu = \nu_a + \nu_{cs} + \nu_d \tag{3.8}$$

with respect to the Borel-Lebesgue measure $\lambda$ such that $\nu_a \ll \lambda$, $\nu_{cs} + \nu_d \perp \lambda$, and $\nu_{cs} \perp \nu_d$.

Proof. Because any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite, by Theorem 3.7, $\nu$ can uniquely be decomposed as

$$\nu = \nu_{ca} + \nu_{cs} + \nu_d,$$

where $\nu_{ca} \ll \lambda$. Since, obviously, $\nu_d \perp \lambda$, by Proposition 3.2 (ii),

$$\nu_{cs} + \nu_d \perp \lambda.$$

Because the Lebesgue decomposition is unique, it follows that $\nu_{ca}$ is the absolutely continuous and $\nu_{cs} + \nu_d$ is the singular component in the Lebesgue decomposition of $\nu$. In particular, it follows that $\nu_a = \nu_{ca}$ is also continuous. □

3.9 Definition. The singular components $\nu_{cs}$ and $\nu_d$ of $\nu$ in decomposition (3.8) are said to be singular-continuous and singular-discrete (or just discrete), respectively. □
We are going to continue our discussion of singularity of measures in Section 4, Chapter 9.

PROBLEMS

3.1 Prove part (i) of Proposition 3.1.

3.2 Generalize Proposition 3.2 for complex measures replacing signed measures.

3.3 Prove a version of the Lebesgue Decomposition Theorem with a complex measure replacing the signed measure $\nu$ in Theorem 3.4.

3.4 Prove a version of the Lebesgue Decomposition Theorem with a complex measure replacing the signed measure $\nu$ and an arbitrary positive measure $\mu$ replacing the $\sigma$-finite positive measure $\mu$ in Theorem 3.4.

3.5 Prove a version of the Lebesgue Decomposition Theorem with a $\sigma$-finite positive measure replacing the signed measure $\nu$ and an arbitrary positive measure $\mu$ replacing the $\sigma$-finite positive measure $\mu$ in Theorem 3.4.

3.6 Let $\nu_a$ and $\nu_s$ be the absolutely continuous and singular components of a complex measure $\nu$ with respect to a positive measure $\mu$. Show that $|\nu| = |\nu_a| + |\nu_s|$.
NEW TERMS:

singularity (orthogonality) of a signed measure
orthogonality (singularity) of a signed measure
Lebesgue decomposition of a signed measure
absolutely continuous component of a signed measure
component of a signed measure
Lebesgue Decomposition Theorem
atom ($\nu$-atom)
atom of a singular measure
continuous singular measure
decomposition of a positive measure
decomposition of a $\sigma$-finite signed measure
singular components of a signed measure
singular-continuous component of a signed measure
singular-discrete component of a signed measure
4. $L^p$ SPACES
This section deals with the so-called $L^p$-spaces and studies them more systematically as metric spaces.

4.1 Notation. Let $(\Omega, \Sigma, \mu)$ be a (positive) measure space. Then, for $0 < p < \infty$, we denote by $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ the set of all measurable complex-valued functions $f$ such that $|f|^p \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$. In particular, if $\mu$ is the counting measure on $(\Omega, \Sigma)$ with $\Omega = \{1, 2, \ldots\}$, then the set $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ reduces to the familiar $l^p$ space of all $p$-summable sequences. We will occasionally abbreviate $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ as $L^p(\Omega, \Sigma, \mu)$ or just $L^p$. One more notation we are going to use throughout is $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ for the set of all $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$-functions with $|f|^p \in L^1(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$. □

4.2 Proposition. $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is a linear space over the field $\mathbb{C}$.

Proof. Let $a, b \ge 0$. Then

$$(a + b)^p \le [2(a \vee b)]^p \le 2^p (a^p + b^p). \tag{4.2}$$

Now, for $f, g \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$, due to (4.2), we have

$$|f + g|^p \le 2^p (|f|^p + |g|^p), \tag{4.2a}$$

from which we see that $f + g \in L^p$. The other linear space properties are obvious. □
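Inequality (4.2) is easy to sanity-check numerically. The following Python sketch is only an illustration under stated assumptions: the helper name `check_42`, the sample grid, and the rounding tolerance are not from the text, and the form checked is $(a+b)^p \le [2(a \vee b)]^p \le 2^p(a^p + b^p)$ for $a, b \ge 0$ and $p > 0$.

```python
def check_42(a: float, b: float, p: float, tol: float = 1e-12) -> bool:
    """Check (a + b)^p <= (2 max(a, b))^p <= 2^p (a^p + b^p) for a, b >= 0, p > 0."""
    left = (a + b) ** p
    middle = (2.0 * max(a, b)) ** p
    right = 2.0 ** p * (a ** p + b ** p)
    # A small relative tolerance absorbs floating-point rounding (assumed value).
    return left <= middle * (1 + tol) and middle <= right * (1 + tol)


# Hypothetical sample grid, including exponents below and above 1.
samples = [0.0, 0.1, 1.0, 2.5, 10.0]
exponents = [0.25, 0.5, 1.0, 2.0, 3.5]
all_hold = all(check_42(a, b, p) for a in samples for b in samples for p in exponents)
print(all_hold)
```

A finite grid proves nothing, of course; it only illustrates that the bound is not restricted to $p \ge 1$, which is why $L^p$ is a linear space for every $0 < p < \infty$.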
Notice that $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ is sort of quasi-linear over $\mathbb{R}$. Due to (4.2) and the homogeneity, this $L^p$ is "linear" when restricted to scalars from $\mathbb{R}$ but not from $\overline{\mathbb{R}}$, of course. Consequently, endowing $L^p$ with a norm should be done with care and with respect to the accepted terminology. We now introduce a semi-norm on $L^p$.

4.3 Theorem. The real-valued function $\|\cdot\|_p : L^p(\Omega, \Sigma, \mu; \mathbb{C}) \to \mathbb{R}_+$ defined as

$$\|f\|_p = \left( \int |f|^p \, d\mu \right)^{1/p}$$

is a semi-norm.

Theorem 4.3, whose proof will follow, essentially reduces to the triangle inequality, which we show in two steps below. Recall (Problem 1.5, Chapter 2) that two real numbers $p > 1$ and $q > 1$ are said to be conjugate exponents if $\frac{1}{p} + \frac{1}{q} = 1$. Now we prove the Hölder inequality for the semi-norm on $L^p(\Omega, \Sigma, \mu; \mathbb{C})$.
4.4 Proposition (Holder's Inequality). Let 1
f n ( w ) exists for all w E Me.
Denote by L( w ) the value of this limit. Since g P E L 1 ( n , E, J.L;lR + ) , by Proposition 1 .2 1 , Chapter 6, there is N E N, such that g(w) < oo on Ne. Furthermore, there is a set o n E N su ch th at I f n I < g for all w E 0�.
= n U= O n . Then, clearly 0 E N. Denote A == Me n Ne n oe and f l == Ll A . Then, f n --+ f J.L-a.e., f E e - 1 (0, E;C). Because I f n I < g < on A, I f I < g a. e. , I f I < oo and hence f E C. By Proposition 1 . 17, Let 0
00
00
Chapter 6 , we have that
I f I and, consequently, f E £P(f1., E, J.L;C). Let Y n = I f n - f I P and h = ( I f I + g)P. Then, the sequence { g n } is nonnegative and is dominated by h. Since I f I + g E £P(f1., E, J.L;IR + ), Y n E £ 1 (0., E, J.L;IR + ). Applying Fatou's Lemma to h - Y n we have
Therefore,
Since Y n --+ 0 a . e . , This and (4.8) yield
Because
h - Y n --+ h
a.e.
h
and therefore lim(h - g n ) =
a.e
..
Y n > 0, we have l i m n--+ oo J Y n dJ.L = l i m n oo II f n - f II p = 0. --+
Finally,
II f n II --+ II f II P is due to Problem 4.2.
D
P
We are going to show that the space $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is complete with respect to the seminorm $\|\cdot\|_p$ and hence the quotient space $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is Banach.

4.9 Theorem (Riesz-Fischer). Let $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ (or $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$) be a Cauchy sequence with respect to the seminorm $\|\cdot\|_p$. Then, there exists $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ such that $f_n \xrightarrow{L^p} f$.
Proof. Let {/ n } be an LP-Cauchy sequence. Then, given there is an N k such that for all indices n k , n k + l > N k '
£
= 2 -k, (4. 9)
Hence, there is a subsequence
{/ n k } whose terms satisfy (4. 9) . Denote
Y k = f nk - f n k + t and g = r: : 1 1 Y k l inequality of Problem 4.1 to the sequence { I g k I } .
and apply the we have from ( 4.9):
Then (4. 9a)
Thus, g E LP or, equivalently, gP E L 1 . By Proposition 1 .2 1 , Chapter 6 , g P and, therefore, g is finite J.L-a.e .. The latter implies that the partial
sums
and hence the subsequence
I f" k I
{/ nk } converge J.L-a.e. on n.
Furthermore,
= I f n + g 1 + . . . + g k I � I f n 1 I + g, 1
and since (due to ( 4. 9a)) g E £P(f2, E, J.L;IR + ), the subsequence
{/" k }
is
dominated by an integrable nonnegative function I f n I + g. All other 1 conditions of the Lebesgue Dominated Convergence Theorem 4.8 (applied to the subsequence {/ n k } ) are met. Consequently, there is a function f
E £P(f2, E, J.L; C ) to which
{ / n k } converges J.L-a.e., both in the topology of
pointwise convergence and in the $p$th mean. Finally, $\{f_n\}$, being an $L^p$-Cauchy sequence, by Problem 3.9, Chapter 2, must converge to the same limit function $f$ (as its subsequence $\{f_{n_k}\}$) in the $p$th mean. □

Notice that the function $f$ to which $\{f_n\}$ converges in the $p$th mean is defined uniquely $\mu$-a.e. Therefore, the Riesz-Fischer theorem states that the quotient space $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is Banach. As a byproduct, the theorem provides a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ which converges to $f$ $\mu$-a.e. in the topology of pointwise convergence. The theorem does not state, however, that $\{f_n\}$ itself also converges to $f$ $\mu$-a.e. pointwise. (The reader is encouraged to provide a counterexample where this is not the case; see Problem 4.6.) Below is what we can afford.

4.10 Proposition. If an $L^p(\Omega, \Sigma, \mu; \mathbb{C})$-Cauchy sequence $\{f_n\}$ converges $\mu$-a.e. pointwise to a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$, then $f \in L^p$ and $f_n \xrightarrow{L^p} f$.
Proof. By the Riesz-Fischer Theorem 4.9, there is an $L^p$-function $\tilde{f}$ such that $f_n \xrightarrow{L^p} \tilde{f}$ and there is a subsequence $\{f_{n_k}\} \subseteq \{f_n\}$ such that $f_{n_k} \to \tilde{f}$ a.e. pointwise. On the other hand, by our assumption, $f_{n_k} \to f$ a.e. pointwise. Therefore, $f \in [\tilde{f}]_\mu$ and the rest of the statement is again due to the Riesz-Fischer Theorem. □
4.11 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space such that $\mu$ is finite, and let $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$. If $1 \le p \le q < +\infty$, then

$$\|f\|_p \le \|f\|_q \, [\mu(\Omega)]^{\frac{1}{p} - \frac{1}{q}} \tag{4.11}$$

and therefore $L^q(\Omega, \Sigma, \mu; \mathbb{C}) \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$.

Proof. We assume that $p < q$, or else (4.11) is trivial. Denote $a = q/p$ and $b = a/(a-1) = q/(q-p)$. Then, $a$ and $b$ are conjugate exponents with $a > 1$. Since $\mu$ is finite, the constant function $1 \in L^b(\Omega, \Sigma, \mu; \mathbb{R})$. Now apply Hölder's inequality to $|f|^p$ and to $1$ with respect to the conjugate exponents $a$ and $b$:

$$\|f\|_p^p = \int |f|^p \cdot 1 \, d\mu \le \left( \int |f|^{pa} \, d\mu \right)^{1/a} [\mu(\Omega)]^{1/b}$$

or, equivalently (since $pa = q$, $1/a = p/q$ and $1/b = 1 - p/q$),

$$\|f\|_p^p \le \|f\|_q^p \, [\mu(\Omega)]^{1 - \frac{p}{q}},$$

which proves (4.11). □
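The bound of Proposition 4.11 can be illustrated on a finite measure space with finitely many points, where integrals become weighted sums. The sketch below is only an illustration: the function names `lp_norm` and `bound_411`, the sample values, and the weights are assumptions, and the inequality is taken in the form $\|f\|_p \le \|f\|_q [\mu(\Omega)]^{1/p - 1/q}$ for $p \le q$.

```python
def lp_norm(values, weights, p):
    """||f||_p on a finite set, where point i carries the mass weights[i]."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


def bound_411(values, weights, p, q):
    """Right-hand side of the bound: ||f||_q * mu(Omega)^(1/p - 1/q)."""
    total_mass = sum(weights)
    return lp_norm(values, weights, q) * total_mass ** (1.0 / p - 1.0 / q)


# Hypothetical data: four points with total mass 2.5.
values = [3.0, -1.0, 0.5, 4.0]
weights = [0.2, 1.5, 0.7, 0.1]
p, q = 1.5, 4.0

print(lp_norm(values, weights, p) <= bound_411(values, weights, p, q))
```

For a constant function both sides coincide, which suggests that the exponent $\frac{1}{p} - \frac{1}{q}$ on $\mu(\Omega)$ cannot be improved.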
4.12 Examples.

(i) Consider an important special case. If $\mu$ is a probability measure in Proposition 4.11, then the result applied to a random variable $X$ can be interpreted as follows: the existence of the moment of $n$th order implies the existence of all lower moments of $X$.

(ii) The statement of Proposition 4.11 that, for $p < q$,

$$L^q(\Omega, \Sigma, \mu; \mathbb{C}) \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$$

need not hold if $\mu$ is not finite. For example, let $\Omega = [1, \infty)$ and let $\mu$ be the counting measure concentrated on the set $\{1, 2, \ldots\}$, i.e. $\mu = \sum_{n=1}^{\infty} \varepsilon_n$. Let $f(x) = \frac{1}{x}$. Then,

$$\int f^2 \, d\mu = \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$$

and thus $f \in L^2$. However, it is easily seen that $f \notin L^1$. □
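The contrast in Example 4.12 (ii) shows up already in partial sums. The following sketch (the function name `partial_sum` and the cut-off values are assumptions chosen for illustration) computes $\sum_{n \le N} n^{-2}$ and $\sum_{n \le N} n^{-1}$, that is, the integrals of $f^2$ and of $f$ over $\{1, \ldots, N\}$ with respect to the counting measure.

```python
import math


def partial_sum(power: int, terms: int) -> float:
    """Sum of 1 / n^power for n = 1..terms (integral of f^power over {1..terms})."""
    return sum(1.0 / n ** power for n in range(1, terms + 1))


for terms in (10, 1000, 100000):
    # The first column of sums stays bounded; the second keeps growing like log(terms).
    print(terms, partial_sum(2, terms), partial_sum(1, terms))
```

The squared sums stay below $\pi^2/6$, while the harmonic sums exceed $\log N$, so no finite bound for $\int f \, d\mu$ can exist.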
The theorem below states that the space of all real-valued integrable "extended" simple functions is dense in $L^p$. We need the following notation. Let $\Psi^p(\Omega, \Sigma, \mu; \mathbb{R}) = \Psi(\Omega, \Sigma; \mathbb{R}) \cap L^p(\Omega, \Sigma, \mu; \mathbb{R})$ denote the subset of all real-valued simple $L^p$-integrable functions. (See Remark 6.2 (iii), Chapter 5, on simple functions.)

4.13 Theorem. The real subspace $\Psi^p$ is dense in $(L^p, \|\cdot\|_p)$.

Proof. $\Psi^p \subseteq L^p$, by the definition. Now, given an $f \in L^p$, by Theorem 6.5, Chapter 5, for $f^+$ and $f^-$ there are monotone nondecreasing sequences $\{s_n^+\} \uparrow f^+$ and $\{s_n^-\} \uparrow f^-$ of simple functions. Since $f \in L^p$, so are $f^+, f^- \in L^p$ and, consequently, $\{s_n^+\}, \{s_n^-\} \subseteq \Psi^p$ and $s_n := s_n^+ - s_n^- \in \Psi^p$. By (4.2), and since $f \in L^p$, we have that $\{f - s_n\} \subseteq L^p$. Therefore, the sequence $\{|f - s_n|^p\}$ is dominated by the $L^1$-function $2^{p+1} |f|^p$. We also know that $\{f - s_n\}$ converges to the function $0$ pointwise. Hence, the sequence $\{f - s_n\}$ meets all criteria of the Lebesgue Dominated Convergence Theorem. As a result, there is an $L^p(\Omega, \Sigma, \mu; \mathbb{R})$-function, say $f^*$, to which $\{f - s_n\}$ converges a.e. pointwise. Hence $f^* \in [0]_\mu$, and by setting $f^* = 0$, we have $\lim_{n \to \infty} \|f - s_n\|_p = 0$, or that $s_n \xrightarrow{L^p} f$. In other words, $\overline{\Psi^p} = L^p$. □

4.14 Remarks.

(i) Given an $L^p$-function $f$, we proved the existence of an "extended" sequence $\{s_n\}$ of simple functions such that $\{|s_n|\}$ is monotone increasing to $|f|$ and $\{s_n\}$ converges to $f$ pointwise.

(ii) Notice that not only is $\overline{\Psi} = \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ in $(\mathcal{C}^{-1}(\Omega, \Sigma), \tau_\infty)$ (i.e., in the topology of pointwise convergence), but, as we showed, the subspace $\Psi^p$ of $\Psi$ is dense in $(L^p, \|\cdot\|_p)$.
(iii) A minor adjustment to Theorem 4.13 allows us to claim that the subspace $\Psi^p(\Omega, \Sigma, \mu; \mathbb{C}) = \Psi(\Omega, \Sigma; \mathbb{C}) \cap L^p(\Omega, \Sigma, \mu; \mathbb{C})$ of all complex-valued simple $L^p$-integrable functions is dense in $L^p(\Omega, \Sigma, \mu; \mathbb{C})$. (Problem 4.8.) □

The following topic on $\mu$-a.e. bounded measurable functions, or "$L^\infty$-functions", occurs often in applications and is going to be explored. We will also see how the $L^\infty$-space fits in the $L^p$-family.

4.15 Definition. Let $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ or $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$. A positive real number $M$ is said to be an essential bound for $f$ if $|f| \le M$ $\mu$-a.e. on $\Omega$. If $f$ has an essential bound, it is naturally called essentially bounded. □
We would like to notice the difference between $\mu$-a.e. finite and essentially bounded functions. For instance, the function $\frac{1}{x} \in \mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$ is finite $\lambda$-a.e. on $\mathbb{R}$, i.e. everywhere except for $0$, whereas it is not essentially bounded. Moreover, the "repaired" version of $\frac{1}{x}$,

$$f(x) = \begin{cases} \frac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases}$$

becomes finite (and an element of $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$), but is still not essentially bounded.
4.16 Definition and Notation. If a measurable function $f$ on $(\Omega, \Sigma, \mu)$ is essentially bounded, then the infimum of all essential bounds for $f$ is called the ($\mu$-)essential supremum of $f$ and is denoted by $\|f\|_\infty$ or by $\operatorname{ess\,sup}\{|f|\}$. More formally,

$$\|f\|_\infty = \inf\{M > 0 : \mu\{|f| > M\} = 0\}.$$

The subset of $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$) of all essentially bounded functions is denoted by $L^\infty(\Omega, \Sigma, \mu; \mathbb{C})$ (or $L^\infty(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$, resp.). (Of course, if $f$ is not essentially bounded, it would make sense to set $\|f\|_\infty = \infty$. However, since we are going to use $\|\cdot\|_\infty$ as the norm within $L^\infty$, we do not need such an extension.) □

It is easy to see that $L^\infty(\Omega, \Sigma, \mu; \mathbb{C})$ is a vector space over the field $\mathbb{C}$, while $L^\infty(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ is a "quasi"-vector space over $\mathbb{R}$. The properties below justify $\|\cdot\|_\infty$ as a semi-norm on $L^\infty$.

4.17 Proposition. Given two measurable functions $f$ and $g$ on $(\Omega, \Sigma, \mu)$ and a scalar $a \in \mathbb{C}$, the following are valid:

(i) $|f| \le \|f\|_\infty$ $\mu$-a.e. on $\Omega$.
(ii) $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.
(iii) $|f| \le |g|$ $\mu$-a.e. on $\Omega$ implies that $\|f\|_\infty \le \|g\|_\infty$.
(iv) $f \in [g]_\mu \Rightarrow \|f\|_\infty = \|g\|_\infty$.
(v) $\|af\|_\infty = |a| \, \|f\|_\infty$.
(vi) $\|f\|_\infty = 0 \Leftrightarrow f \in [0]_\mu$.
(vii) $\|a\|_\infty = |a|$.
(viii) $\|fg\|_\infty \le \|f\|_\infty \|g\|_\infty$.
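On a finite set with point masses, the essential supremum is simply the largest value of $|f|$ over the points of positive mass, so several of the listed properties can be observed directly. The sketch below is an illustration only; the function name `ess_sup` and the sample data are assumptions, and the second point is deliberately given mass zero to show that values on a null set are ignored.

```python
def ess_sup(values, weights):
    """Essential supremum of |f| on a finite set; points of zero mass are ignored."""
    return max((abs(v) for v, w in zip(values, weights) if w > 0), default=0.0)


# Hypothetical data: the second point is a null set, so f's value 100 there is irrelevant.
weights = [0.5, 0.0, 1.0, 2.0]
f = [1.0, 100.0, -3.0, 2.0]
g = [2.0, -7.0, 0.5, -4.0]

f_plus_g = [x + y for x, y in zip(f, g)]
f_times_g = [x * y for x, y in zip(f, g)]

print(ess_sup(f, weights), ess_sup(g, weights))
print(ess_sup(f_plus_g, weights), ess_sup(f_times_g, weights))
```

With these values the triangle inequality (ii), the homogeneity (v), and the product bound (viii) can all be read off from the printed numbers.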
Proof.
( i)
Given £ =
�,
there is an essential bound M n such that
4. £P Spaces
IfI Hen ce, the set
( ii)
{IfI
I t+g I
>
0, J1.
C{ I f I
> c} )
0
is such that for each k
=
1 , 2, . . . ,
(5 . 9)
where Let 00
Bs : = . U A j. j=s s, { hk } is Cauchy ,
We will show that for each notice that since for each w E Ak,
In other > N,
uniformly on
B �.
l h < h (w) (w) E � l k m l k l hi (w) - hi + I (w) l "' m - 1 1: = 1 _ 1 < 1 . < L...J i = k 2 z 2k - 1 2m - 1 2k - 1 words, given a 6 > 0 , there is an N > s such that for all I h k (w) - h m (w) I
< 6,
good for all 00
First
(5 . 9a) m
>k
w E B�.
Consequently, {hk } is Cauchy on A: = U B� in the topology of points=l . w1se con vergence. Furthermore, since the sequence { B s } is monotone nonincreasing and
5. Modes of Convergence from (5 .9) , J.L(B5)
s,
Moreover, since
and because { f n } was assumed to be Cauchy in measure, each of the sets on the right of inclusion (5. 9c) converges to zero. Therefore,
Finally, let g be yet another J.L-limit of { f n } · Then, from
{ I f - g I > t:} E N11, good for all £ > 0, and thus g = f
(mod J.L) .
From Proposition 5.4 and Theorem 5.9 we arrive at
f C LP(Q, E, J.L; C) such that
D
£P
---. f . Then {f } there is a subsequence { f k } of { f } that converges to f J.L- a . e . D . t wzse. . pozn
5.10 Corollary. Let
n
,
n
f
n
n
The following proposition gives a sort of converse of Proposition 5.4 (that $L^p$-convergence implies convergence in measure) under one additional condition.

5.11 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f, \{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ be such that $f_n \xrightarrow{\mu} f$, and suppose there is an $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$-function $g$ such that $|f_n| \le g$. Then $f_n \xrightarrow{L^p} f$.

Proof. Since $f_n \xrightarrow{\mu} f$, according to Theorem 5.9, there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ which converges to $f$ $\mu$-a.e. on $\Omega$ in the topology of pointwise convergence. Since $\{f_{n_k}\}$ is dominated by $g$, by Lebesgue's Dominated Convergence Theorem 4.8, $f_{n_k} \xrightarrow{L^p} f$ and $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$. Suppose that $f_n$ does not converge to $f$ in $L^p$. Then there is a positive $\varepsilon$ and a subsequence $\{h_j := f_{n_j}\}$ of $\{f_n\}$ such that for all $j$'s, it holds true that

$$\|h_j - f\|_p > \varepsilon. \tag{$*$}$$

On the other hand, since $h_j \xrightarrow{\mu} f$, there is a subsequence $\{h_{j_i}\}$ (of $\{h_j\}$) convergent to $f$ $\mu$-a.e. on $\Omega$ (and also dominated by $g$), and thus, by the Lebesgue Dominated Convergence Theorem, $h_{j_i} \xrightarrow{L^p} f$, thereby directly contradicting ($*$). □

5.12 Proposition. Let $f_n \xrightarrow{\mu} f$. Then, every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ contains a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ $\mu$-a.e. on $\Omega$.

Proof. By the assumption, every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ must converge to $f$ in measure. Then, by Theorem 5.9, $\{f_{n_k}\}$ must have at least one subsequence, say $\{f_{n_{k_j}}\}$, that converges to $f$ $\mu$-a.e. on $\Omega$. □
The converse of Proposition 5.12 requires the finiteness of $\mu$.

5.13 Proposition. Let $\{f_n\}$ be a sequence of $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$-functions on a finite measure space $(\Omega, \Sigma, \mu)$. Suppose that every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ contains a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ $\mu$-a.e. on $\Omega$. Then, $f_n \xrightarrow{\mu} f$.

Proof. Since $\mu$ is finite, by Proposition 5.8, $f_{n_{k_j}} \xrightarrow{\mu} f$. Therefore, given an $\varepsilon > 0$, every subsequence $\{a_{n_k}\}$ of $\{a_n := \mu\{|f_n - f| > \varepsilon\};\ n = 1, 2, \ldots\}$ has a subsequence $\{a_{n_{k_j}}\}$ that converges to $0$. Therefore, the numeric sequence $\{a_n\}$ is sequentially compact and (cf. Theorem 6.3, Chapter 2 or Problem 3.9, Chapter 2) converges to $0$ itself. □

The following chart (Figure 5.1) gives an overview of the major convergence modes and their relations and summarizes the theorems and propositions above.
Figure 5.1. Chart of the major convergence modes and their relations; the arrow labels recoverable from the original are "$\mu$ is finite", "(Egorov)", "(LDCT)", "$|f_n| \le g \in L^p$", and the subsequence criterion "every subsequence of $\{f_n\}$ has a subsequence convergent to $f$ a.e."

5.14 Proposition. Let $(\Omega, \Sigma, \mu)$ be a finite measure space and $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ such that $f_n \xrightarrow{\mu} f$. Suppose a function $\varphi: \mathbb{C} \to \mathbb{C}$ is continuous. Then, $\varphi \circ f_n \xrightarrow{\mu} \varphi \circ f$.

Proof. Since $\varphi$ is continuous, $\varphi \circ f, \{\varphi \circ f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$. By Proposition 5.12, each subsequence of $\{f_n\}$ has a subsequence, say $\{f_{n_{k_j}}\}$, convergent to $f$ $\mu$-a.e. on $\Omega$. Hence, by continuity of $\varphi$, $\{\varphi \circ f_{n_{k_j}}\}$ also converges to $\varphi \circ f$ $\mu$-a.e. on $\Omega$. Since $\mu$ is finite, the statement is due to Proposition 5.13. □
5.15 Proposition. Let {! n} , {gn} C e - 1 (n, E; C) be two sequences on a measure space (0, E, J.L) convergent in measure to measurable functions
f an d g, resp ectively .. Then, for any two complex numb ers a and b, JJ af n + bgn ---. af + bg.
Proof. From
we have that
Therefore,
af n � af .. D 5. 16 Proposition. Let { f n }, {g n } C e - 1 (0, E; C) be two sequences on a finite measure space (0, E, J.L) convergent in measure to m easurable functions f and g, respectively. Then, f nYn .!:... f g. Proof. By Proposition 5. 12, every subsequence of { f n } contains a subsequence convergent to f J-L-a.e. on n. Let f n k be any subsequence of {f n } and f n k be a subsequen ce of f n k convergent to f J-L-a.e. on Furthermore, it is obvious that
n.
{ j}
Then the subsequence
{ Gi : = Ynki
J
convergent to
{ g n ki }
g J-L-a.e.
{ } { } of
{g n }
on
n.
has a sub se que nce
Therefore, the sequence
{F i G i } ( where F i : = f n k · ) converges to fg J.L- a .e . on n.
J 1· In summary, we showed that an arbitrary subsequence
{/ nYn }
has a subsequence {F i G i } that converges to statement now follows by Proposition 5. 13.
{ !nkYnk } of
fg J.L-a. e . on 0.
The D
5.17 Examples.

(i) Let $\Omega = [0, 1]$ and let $f_n = e^n 1_{A_n}$, where $A_n = [0, \frac{1}{n}]$. Obviously, $f_n \to 0$ $\lambda$-a.e. pointwise. Therefore, by Proposition 5.8, $f_n \xrightarrow{\lambda} f = 0$. Since $\lambda$ is finite on $\Omega$, by Egorov's Theorem, $f_n \to f = 0$ a.u. However,

$$\|f_n\|_p^p = \int e^{np} 1_{A_n} \, d\lambda = \frac{e^{np}}{n} \to \infty$$

for $n \to \infty$ ($0 < p < \infty$). So, the $L^p$-convergence of $\{f_n\}$ does not hold. The same applies to $L^\infty$: $\|f_n\|_\infty = e^n \to \infty$, for $n \to \infty$.
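The two behaviours in Example 5.17 (i) can be tabulated side by side. The sketch below rests on the closed forms $\lambda\{|f_n| > \varepsilon\} = \frac{1}{n}$ (for $\varepsilon < e^n$) and $\|f_n\|_p = (e^{np}/n)^{1/p}$ for $f_n = e^n 1_{[0, 1/n]}$; the function names and the chosen values of $n$, $p$, and $\varepsilon$ are assumptions made for illustration.

```python
import math


def lp_norm_fn(n: int, p: float) -> float:
    """||f_n||_p for f_n = e^n on [0, 1/n] and 0 elsewhere: (e^{np} / n)^{1/p}."""
    return (math.exp(n * p) / n) ** (1.0 / p)


def measure_exceeding(n: int, eps: float) -> float:
    """Lebesgue measure of {|f_n| > eps}; it equals 1/n whenever eps < e^n."""
    return 1.0 / n if eps < math.exp(n) else 0.0


for n in (5, 10, 20, 40):
    # The measure column shrinks to 0 while both norm columns blow up.
    print(n, measure_exceeding(n, 0.5), lp_norm_fn(n, 1.0), lp_norm_fn(n, 2.0))
```

The measure of the exceptional set shrinks like $\frac{1}{n}$, which is convergence in measure, while the norms grow exponentially, so no $L^p$-limit can exist.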
(ii) Let f2 = 1R + , E = joint), which is a contradiction.
i
1=
I: � n A (A ; ) = oo (since A;'s are dis-
(iii) The following is an application of two major convergence modes to probability. Let $\{X_n\}$ be a sequence of $L^1(\Omega, \Sigma, \mathbb{P}; \mathbb{R})$-random variables. Construct the sequence

$$f_n = \frac{1}{n} \sum_{k=1}^{n} (X_k - \mathbb{E}[X_k]) \tag{5.17}$$

and denote $f := 0$. If $f_n \xrightarrow{\mathbb{P}} f$ in measure, we say that $\{f_n\}$ converges to $f$ in probability (also called stochastic convergence), and in this particular case, we say that the sequence $\{X_n\}$ obeys the Weak Law of Large Numbers. If the sequence in (5.17) is such that $f_n \to f$ $\mathbb{P}$-a.e. on $\Omega$ (more precisely, $\mathbb{P}$-almost surely or $\mathbb{P}$-a.s.), then $\{X_n\}$ is said to obey the Strong Law of Large Numbers.

Due to Proposition 5.8, the Strong Law of Large Numbers implies the Weak Law of Large Numbers, thereby justifying their names. In the special case when the random variables $\{X_n\}$ share a common mean, say $m$, the convergence of $f_n$ to $0$ means that the average value

$$\mu_n := \frac{1}{n} \sum_{k=1}^{n} X_k$$

of the sequence converges to $m$ (weakly or strongly) and therefore becomes a constant in the limit. This is often used in statistics to evaluate the unknown mean ($m$) of a population (by $\mu_n$). Notice that the Central Limit Theorem is also applied as a practical tool to estimate the sample size within a given significance level. Finally, the reader is referred to regular textbooks in probability to learn about various sufficient conditions under which the Weak and Strong Laws of Large Numbers hold. □
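The law-of-large-numbers discussion in Example 5.17 (iii) can be illustrated by simulation. The sketch below assumes independent draws from the uniform distribution on $[0, 1)$, whose common mean is $\frac{1}{2}$, and takes the centered average in the form $f_n = \frac{1}{n} \sum_{k=1}^{n} (X_k - m)$; the function name, the seed, and the sample size are illustrative assumptions, and a single simulated path is of course not a proof.

```python
import random


def centered_average(samples, mean):
    """f_n = (1/n) * sum(X_k - E[X_k]) when all X_k share the common mean `mean`."""
    return sum(x - mean for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is reproducible
draws = [rng.random() for _ in range(20000)]  # uniform on [0, 1), mean 1/2

for n in (10, 100, 1000, 20000):
    # Centered averages along one simulated path; they should drift toward 0.
    print(n, centered_average(draws[:n], 0.5))
```

In statistical practice the same computation is read the other way around: the sample average $\mu_n$ serves as an estimate of the unknown mean $m$.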
PROBLEMS
5.2.
5.1
Prove Proposition
5.2
Show that
5.3
Give an example of a sequence convergent in measure but not in £P.
5.4
Let J.l =
f n .!:.. f implies that {/ n } is Cauchy in measure.
£P(f2,L',j.t;lR) be as follows: !1 = (0,1], E = .. , and p > 1. Define a sequence {/ n } in £P as f n (x ) =
n lAn (x) ,
A n : = [0,�). Show that the J.L-limit of {/ n } £P-limit of {/ n } is not, for all p > 1 (including oo )
is
0,
but the
.
5.5
Let £P(f2, E, J.L;IR) be as follows : Q = IR, E
Define f n ( x ) :
uniformly on lR,
Find
.. , p >
II f n II 00 •
0.
Sh ow th at
f n --+ 0 >..- a.e., >..-I im f n = 0. However, show that f n fails to converge in £P (0 < p < oo ) . Define n = [0,1], E = .. , p > 0. Define f n m = l A n m , where A n m = ( m; l , �], m = 1, . . . , n , n = 1,2,. . . . Show tltat the sequence {/ n m , m = 1,. . . , n , = 1,2 , . . . } converges to 0 in the pth mean but does not converge >..- a.e. , not a. u . . and f n --+ 0
5.6
= � 1A n ( x ) , A n : = [O,e n].
=
f n --+ 0
i n L 00 ,
n
not in £ 00 •
NEW TERMS:

convergence in measure sequence of functions
Cauchy in measure sequence of functions
almost uniform convergence of a sequence of functions
Chebyshev's inequality
Egorov's Theorem
convergence in probability (stochastic convergence)
stochastic convergence (convergence in probability)
Weak Law of Large Numbers
Strong Law of Large Numbers
6. UNIFORM INTEGRABILITY
Uniform integrability bears some resemblance to equicontinuity, as it applies to a family of functions. Recall that Problem 1.22, Chapter 6, states that a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ on a measure space $(\Omega, \Sigma, \mu)$ is integrable if and only if for each $\varepsilon > 0$, there is $g \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$ such that

$$\int_{\{|f| > g\}} |f| \, d\mu < \varepsilon.$$

This is a motivation for the notion of uniform integrability of a family of integrable functions, for all of which such a function $g$ exists, given any positive $\varepsilon$.

6.1 Definition. A family $\Phi \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ of functions is said to be uniformly integrable with respect to a measure $\mu$ on $(\Omega, \Sigma)$ if for each $\varepsilon > 0$, there is $g \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$ such that for every $f \in \Phi$,

$$\int_{\{|f| > g\}} |f| \, d\mu < \varepsilon. \tag{6.1}$$

The function $g$ is said to be an $\varepsilon$-bound of $\Phi$. □

6.2 Remark. If $(\Omega, \Sigma, \mu)$ is a finite measure space, then Problem 1.22 of Chapter 6 can be restated as: a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ on a finite measure space $(\Omega, \Sigma, \mu)$ is integrable if and only if for each $\varepsilon > 0$, there is a nonnegative number $N$ such that

$$\int_{\{|f| > N\}} |f| \, d\mu < \varepsilon. \tag{6.2}$$

Consequently, a family $\Phi \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ is uniformly integrable with respect to a finite measure $\mu$ if for every $\varepsilon > 0$, there is a nonnegative number $N$ such that for every $f \in \Phi$, (6.2) holds true. This second variant of uniform integrability was originally introduced in connection with martingale theory in probability. Definition 6.1 is therefore more general. □

6.3 Examples.
(i) A finite set $\Phi = \{f_1, \ldots, f_n\}$ of $L^1$-functions forms a uniformly integrable family. Indeed, given an $\varepsilon > 0$, by Problem 1.22, Chapter 6, each $f_i$ has an $\varepsilon$-bound $g_i$. Therefore, $g = g_1 \vee \ldots \vee g_n$ is an $\varepsilon$-bound of $\Phi$. More generally, replacing $f_i$ by a uniformly integrable family $\Phi_i$ of functions, we deduce that a finite union of uniformly integrable families of functions is uniformly integrable.
(ii) In the Lebesgue Dominated Convergence Theorem, a sequence $\{f_n\}$, dominated by a nonnegative $L^1$-function $g$, is uniformly integrable. Indeed, since for each $n$, $|f_n| \le g$ a.e., we have that

$$\int_{\{|f_n| > g\}} |f_n| \, d\mu = 0.$$

However, it is not true that a uniformly integrable family must be dominated by an integrable function. Consider a finite measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ such that $\mu(\{n\}) = \frac{1}{2^n}$, $n = 1, 2, \ldots$, and a sequence $\{f_n\}$ of measurable functions defined as

$$f_n(k) = \begin{cases} \frac{2^n}{n}, & k = n \\ 0, & k \ne n. \end{cases}$$

We will show that $\{f_n\}$ is uniformly integrable by using the definition of Remark 6.2. Let $N > 0$. Then,

$$\{f_n > N\} = \begin{cases} \{n\}, & \frac{2^n}{n} > N \\ \emptyset, & \frac{2^n}{n} \le N. \end{cases}$$

Since, obviously, $1_{\{f_n > N\}} \le 1_{\{n\}}$ holds, this leads to

$$\int_{\{|f_n| > N\}} |f_n| \, d\mu \le \int_{\{n\}} |f_n| \, d\mu = \frac{2^n}{n} \mu(\{n\}) = \frac{1}{n}.$$

Consequently, given an $\varepsilon > 0$, for all $n > \frac{1}{\varepsilon}$, the set $\{f_{n+1}, \ldots\}$ is uniformly integrable. Since $f_1, \ldots, f_n$ are integrable, the whole sequence $\{f_n\}$ is uniformly integrable.

On the other hand, $g(k) = \frac{2^k}{k}$ is evidently the smallest function of those dominating the sequence $\{f_n\}$, and it is not $\mu$-integrable. Indeed,

$$\int g \, d\mu = \sup_n \sum_{k=1}^{n} \frac{2^k}{k} \cdot \frac{1}{2^k} = \sup_n \sum_{k=1}^{n} \frac{1}{k} = \infty.$$

Therefore, there is no integrable dominating function for $\{f_n\}$. □
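Example 6.3 (ii) involves only rational numbers, so its key quantities can be computed exactly. The sketch below assumes the data as reconstructed from the example, $\mu(\{k\}) = 2^{-k}$ and $f_n(k) = 2^n/n$ for $k = n$ and $0$ otherwise; the function names are hypothetical.

```python
from fractions import Fraction


def mu(k: int) -> Fraction:
    """Mass of the point k: mu({k}) = 2^(-k)."""
    return Fraction(1, 2 ** k)


def f(n: int, k: int) -> Fraction:
    """f_n(k) = 2^n / n if k == n, else 0."""
    return Fraction(2 ** n, n) if k == n else Fraction(0)


def tail_integral(n: int, level: int) -> Fraction:
    """Integral of |f_n| over {|f_n| > level}; only the point k = n can contribute."""
    return f(n, n) * mu(n) if f(n, n) > level else Fraction(0)


def integral_of_g(terms: int) -> Fraction:
    """Integral of g(k) = 2^k / k over {1, ..., terms}: a harmonic partial sum."""
    return sum((Fraction(2 ** k, k) * mu(k) for k in range(1, terms + 1)), Fraction(0))


print(tail_integral(10, 5), tail_integral(3, 5), integral_of_g(4))
```

The tail integrals never exceed $\frac{1}{n}$, whatever the level, whereas the integral of the dominating function $g$ over $\{1, \ldots, K\}$ is the $K$th harmonic number and grows without bound.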
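We immediately observe that

6.4 Proposition. If a family $\Phi$ is uniformly integrable, then

$$\sup\left\{ \int |f| \, d\mu : f \in \Phi \right\} < \infty.$$

Proof. Indeed, given an $\varepsilon > 0$, let $g$ be an $\varepsilon$-bound of $\Phi$. Then,

$$\int |f| \, d\mu = \int_{\{|f| > g\}} |f| \, d\mu + \int_{\{|f| \le g\}} |f| \, d\mu < \varepsilon + \int g \, d\mu,$$

good for all $f \in \Phi$. □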
The following is a useful criterion of uniform integrability for a sequence of functions on a finite measure space. We start with

6.5 Definition. A sequence $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ on a measure space $(\Omega, \Sigma, \mu)$ is said to be uniformly continuous in $L^p$ if $\int_A |f_n|^p \, d\mu \to 0$ with $\mu(A) \to 0$ uniformly in $n$. □

6.6 Theorem. Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ be a sequence of functions on a finite measure space $(\Omega, \Sigma, \mu)$. Then $\{f_n\}$ is uniformly integrable if and only if it is uniformly continuous in $L^1$ and the integrals $\int |f_n| \, d\mu$ are uniformly bounded.

Proof.

1. Let $\{f_n\}$ be uniformly continuous in $L^1$ and the integrals $\int |f_n| \, d\mu$ be uniformly bounded. Then, by Chebyshev's Inequality (Lemma 5.3) and due to uniform boundedness,

$$\mu\{|f_n| > N\} \le \frac{1}{N} \int |f_n| \, d\mu \le \frac{1}{N} \sup_n \int |f_n| \, d\mu.$$

Hence, $\mu\{|f_n| > N\} \to 0$, as $N \to \infty$, and this implies that

$$\int_{\{|f_n| > N\}} |f_n| \, d\mu \to 0, \text{ for all } n,$$

by uniform continuity. The latter leads to uniform integrability of $\{f_n\}$.
2. Let $\{f_n\}$ be uniformly integrable. Then,

$$\int_A |f_n| \, d\mu = \int_{A \cap \{|f_n| \le N\}} |f_n| \, d\mu + \int_{A \cap \{|f_n| > N\}} |f_n| \, d\mu \le N \mu(A) + \int_{\{|f_n| > N\}} |f_n| \, d\mu. \tag{6.6}$$

By uniform integrability of $\{f_n\}$, $N$ can be chosen such that

$$\int_{\{|f_n| > N\}} |f_n| \, d\mu < \frac{\varepsilon}{2}, \text{ for all } n.$$

If $\mu(A) < \frac{\varepsilon}{2N}$, then from (6.6) we have that $\int_A |f_n| \, d\mu < \varepsilon$, and thus $\{f_n\}$ is uniformly continuous in $L^1$. The uniform boundedness is due to Proposition 6.4. □

Now, we prove another criterion of uniform integrability for arbitrary measures, generalizing Theorem 6.6.

6.7 Theorem. A family $\Phi \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ is uniformly $\mu$-integrable if and only if the following two conditions hold:

A) $\sup\{\int |f| \, d\mu : f \in \Phi\} < \infty$.

B) For each $\varepsilon > 0$, there is a nonnegative $L^1$-function $\varphi$ and $\delta > 0$ such that for each measurable set $A$ with $\int_A \varphi \, d\mu < \delta$, $\int_A |f| \, d\mu < \varepsilon$ uniformly for all $f \in \Phi$.
Proof.
1 . Suppose conditions A) and B ) are met. For each
Since by A) , I I f I dJ.L < M,
with
A = { l / 1 2: ccp}.
an £-bound for .
c
c
> 0 and f E ,
can be chosen large enough to have
I cpdJ.L < 6 A Then, by B), I I f I < e for all f and thus ccp is A
2. Conversely, let be uniformly integrable. Since
I I f I d jl , { I l l < g}
we have
(6.7)
If g is an £-bound of �, then (6. 7) yields
and thus condition A) . Taking cp = g and 6 = £ we have for each measurable set A with I gd Jl < £ , we have from (6.7) I I f I d jl < 2£ and thereby condition B).
A
0
A
6.8 Proposition. Let $\Phi \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ for some $1 \le p < \infty$ and suppose the family $\Phi^p := \{|f|^p : f \in \Phi\}$ is uniformly integrable. Then the family

$$\{|af + bg|^p : f, g \in \Phi,\ a, b \in \mathbb{C}\}$$

is also uniformly integrable.

Proof. For any $f \in L^p$ and $A \in \Sigma$, $f\mathbf{1}_A \in L^p$. By Minkowski's inequality,

$$\|(f_1 + f_2)\mathbf{1}_A\|_p \le \|f_1\mathbf{1}_A\|_p + \|f_2\mathbf{1}_A\|_p. \tag{6.8}$$

Now, let $f_1 = af$ and $f_2 = bg$, for some $f, g \in \Phi$. Then, from (6.8) and subsequently, by (4.2),

$$\|(af + bg)\mathbf{1}_A\|_p^p \le \left(|a|\,\|f\mathbf{1}_A\|_p + |b|\,\|g\mathbf{1}_A\|_p\right)^p \le 2^p\left(|a|^p\,\|f\mathbf{1}_A\|_p^p + |b|^p\,\|g\mathbf{1}_A\|_p^p\right).$$

Therefore, by Theorem 6.7, conditions A) and B) for $|f|^p$ imply those for $|af + bg|^p$. □
By Proposition 5.4, $f_n \xrightarrow{L^p} f$ implies that $f_n \xrightarrow{\mu} f$. The converse holds true if $\{f_n\}$, in addition, is uniformly integrable. The following two versions of the converse are left for the reader.

6.9 Theorem. Let $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ be a sequence on a finite measure space $(\Omega, \Sigma, \mu)$ such that $f_n \xrightarrow{\mu} f$. If $\{|f_n|^p\}$ is uniformly integrable for some $p > 0$, then $f_n \xrightarrow{L^p} f$.

6.10 Theorem. For each sequence $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$, the following are equivalent:
(i) $\{f_n\}$ is $L^p$-convergent.
(ii) $\{f_n\}$ is convergent in measure and $\{|f_n|^p\}$ is uniformly integrable.

PROBLEMS

6.1
Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ be a uniformly integrable sequence on a measure space $(\Omega, \Sigma, \mu)$. (Using Fatou's Lemma) show that

6.2
Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ be a uniformly integrable sequence on a measure space $(\Omega, \Sigma, \mu)$. If $f_n \to f$ $\mu$-a.e. on $\Omega$ or in measure, then $f$ is integrable.
6.3
Prove Theorem 6.9.
6.4
Prove Theorem 6. 10.
NEW TERMS:
uniformly integrable family of functions
ε-bound of a family of functions
uniformly integrable sequence of functions, a criterion of

7. RADON MEASURES ON LOCALLY COMPACT HAUSDORFF SPACES
We will assume that (X, r ) is a locally compact Hausdorff"-/topological space, A, G E r } . b) J.L- inner regular if J.L(A) = sup{ J.L(K) : K C A, K E R} .
( ii) A Borel measure J.L is said to be outer ( inner) regular on a sub family � C 0, for each n = 1,2, . . . , there is an open superset U k of Q k such that 1 (U k ) < J.l * ( Qk) + e / 2 k . By Corollary 1 1 .6, Chapter 3, there is an f E e ( X ) such that
c
0
0 and $g(K) = 1$. Then $\mu^*(K) \le I(g)$ and $\mu^*$ is finite on $\mathcal{K}(X)$.

Notice that, unlike Proposition 7.5, the function $g$ is given and it does not dominate $K$.

Proof. Let $0 < \alpha < 1$ and $U_\alpha = \{x \in X : g(x) > \alpha\}$. Then $U_\alpha$ is an open set. By Corollary 11.6, Chapter 3, there is $h \in \mathcal{C}_c(X)$ such that $h \prec U_\alpha$. It is readily seen that $\alpha^{-1} g \ge h$. (It is strictly greater on $U_\alpha$ and greater than or equal to elsewhere.) It follows that

$$\alpha^{-1} I(g) \ge I(h), \quad \text{good for all } h \prec U_\alpha,$$

and therefore for $\sup\{I(h) : h \prec U_\alpha\} = \gamma(U_\alpha)$. From this, since $K \subseteq U_\alpha$, and by monotonicity of $\mu^*$,

$$\mu^*(K) \le \mu^*(U_\alpha) = \gamma(U_\alpha) \le \alpha^{-1} I(g).$$

The above inequality holds true for all $\alpha \uparrow 1$. Finally, given $K \in \mathcal{K}$, by Corollary 11.5, Chapter 3, there is $g \in \mathcal{C}_c(X)$ such that $K \prec g$, which yields that $\mu^*(K)$ is finite. □

7.7 Lemma. $\mu^*$ is finitely additive on $\mathcal{K}$.

Proof. Let $K_1$ and $K_2$ be two disjoint compact sets. By Corollary 10.12, Chapter 3, in a locally compact Hausdorff space, $K_1$ and $K_2$ can be separated by two disjoint open supersets, say $U$ and $V$, respectively. Now, for each $\varepsilon > 0$, there is an open superset $W$ of $K_1 + K_2$ such that

$$\mu^*(K_1 + K_2) + \varepsilon > \gamma(W).$$
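Since $(U + V) \cap W$ covers $K_1 + K_2$, the open sets $U_1 = U \cap W$ and $U_2 = V \cap W$ cover $K_1$ and $K_2$, respectively. By monotonicity of $\gamma$,

$$\mu^*(K_1 + K_2) = \inf\{\gamma(O) : K_1 + K_2 \subseteq O \in \tau\} > \gamma(W) - \varepsilon \ge \gamma(U_1 + U_2) - \varepsilon. \tag{7.7}$$

On the other hand, by Corollary 11.4, Chapter 3, there are $f_1, f_2 \in \mathcal{C}_c(X)$ such that $K_1 \prec f_1 \prec U_1$ and $K_2 \prec f_2 \prec U_2$. Therefore, by Proposition 7.6,

$$\mu^*(K_1) + \mu^*(K_2) \le I(f_1) + I(f_2) = I(f_1 + f_2). \tag{7.7a}$$

Obviously, in our case, $K_1 \prec f_1 \prec U_1$ and $K_2 \prec f_2 \prec U_2$ if and only if $K_1 + K_2 \prec f_1 + f_2 \prec U_1 + U_2$, and hence from (7.7a),

$$\mu^*(K_1) + \mu^*(K_2) \le I(f_1 + f_2) \le \gamma(U_1 + U_2).$$

The latter, combined with (7.7) for $\varepsilon \downarrow 0$, yields

$$\mu^*(K_1) + \mu^*(K_2) \le \mu^*(K_1 + K_2).$$

The inverse inequality is due to subadditivity of $\mu^*$. □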
7.8 Theorem. $\mu^*$ is inner regular on $\tau$.

Proof. We need to prove that

$$\gamma(U) = \mu^*(U) = \sup\{\mu^*(K) : K \subseteq U,\ K \in \mathcal{K}\}. \tag{7.8}$$

Given an $\varepsilon > 0$ and $U$ open with $\gamma(U) < \infty$, let $a \in \mathbb{R}$ be such that $\gamma(U) = a + \varepsilon$. By Corollary 11.6, Chapter 3, there is $f \prec U$ such that $I(f) + \varepsilon > \gamma(U) = a + \varepsilon$. Hence, $I(f) > a$. Let $K = \operatorname{supp} f$. Then, by Problem 7.1, $\mu^*(K) \ge I(f) > a$ and

$$\mu^*(K) + \varepsilon > a + \varepsilon = \gamma(U). \tag{7.8a}$$

Thus, we showed that, given $\varepsilon > 0$, there is a compact set $K \subseteq U$ with (7.8a) holding. This yields (7.8).

Now, let $\gamma(U) = \infty$. Since $\gamma(U) = \sup\{I(f) : f \prec U\}$, for any $M > 0$ (arbitrarily large), there is $f \in \mathcal{C}_c(X)$, $f \prec U$, such that $I(f) > M$. Given $K = \operatorname{supp} f$, by Problem 7.1, $\mu^*(K) > M$. Hence, we showed that, given $U$ with $\gamma(U) = \infty$ and $M > 0$, arbitrarily large, there is a compact subset $K \subseteq U$ such that $\mu^*(K) > M$. Therefore,

$$\sup\{\mu^*(K) : K \subseteq U,\ K \in \mathcal{K}\} = \infty.$$

7.9 Theorem. $\tau \subseteq \Sigma^*$.
Consequently, O, by Corollary 11.6, Chapter 3, there is an $f \prec Q \cap U$ such that $I(f) + \varepsilon > \gamma(Q \cap U)$. Because $Q \cap (\operatorname{supp} f)^c$ is an open set, there is $g \prec Q \cap (\operatorname{supp} f)^c$ such that $I(g) + \varepsilon > \gamma(Q \cap (\operatorname{supp} f)^c)$.

Clearly, $f + g \prec Q$. Consequently,

$$\gamma(Q) \ge I(f) + I(g) > \gamma(Q \cap U) + \gamma(Q \cap (\operatorname{supp} f)^c) - 2\varepsilon. \tag{7.9a}$$

On the other hand, $Q \cap (\operatorname{supp} f)^c \supseteq Q \cap (U \cap Q)^c = Q \cap U^c$, which leads to

$$\gamma(Q \cap (\operatorname{supp} f)^c) = \mu^*(Q \cap (\operatorname{supp} f)^c) \ge \mu^*(Q \cap U^c).$$

The latter, along with (7.9a), yields

$$\gamma(Q) > \gamma(Q \cap U) + \mu^*(Q \cap U^c) - 2\varepsilon$$

and hence,

$$\gamma(Q) \ge \gamma(Q \cap U) + \mu^*(Q \cap U^c).$$

The inverse inequality is, as usual, due to subadditivity of $\mu^*$.

2. Let $Q \subseteq X$. If $\mu^*(Q) = \infty$, then the separation is due to subadditivity. Let $\mu^*(Q) < \infty$. Then, since

$$\mu^*(Q) = \inf\{\gamma(V) : Q \subseteq V \in \tau\},$$

for each $\varepsilon > 0$, there is an open superset $V$ of $Q$ such that

$$\mu^*(Q) + \varepsilon > \gamma(V) \overset{\text{by case 1}}{=} \gamma(V \cap U) + \mu^*(V \cap U^c) \ge \mu^*(Q \cap U) + \mu^*(Q \cap U^c).$$

For $\varepsilon \downarrow 0$, we obtain $\mu^*(Q) \ge \mu^*(Q \cap U) + \mu^*(Q \cap U^c)$, and the inverse inequality follows from subadditivity of $\mu^*$. Thus we showed that $\tau \subseteq \Sigma^*$. This immediately implies that all Borel sets are $\mu^*$-measurable. □
From now on, the restriction of J.l * from E * (act ually, J.l�) to � ( X ) will be denoted by J.l· The last two theorems finalize the most significant feature of J.l * , besides its integral representation, that its restriction from � ( X ) to � ( X ) is a Radon measure. Indeed, Theorem 7.8 states that J.L * is inner regular on r . Proposition 7.5 states that J.l * is finite on compact sets. Theorem 7.9 states that Res � (X) J.l* = J.L is a Borel measure. And, finally , J.L >"' is outer regular, by definition, on
�(X ) , and therefore, on
J.l ( B ) 2£ , which shows the J.t-inner regularity of B and hence, regularity of J.l · In particu lar, J.l is Radon, and because of the uniqueness of J.l, we have that J.l = v.D -
7.18 Remark. Notice that, since $\mathbb{R}^n$ with the usual topology is a σ-compact and locally compact Hausdorff space, any Borel-Lebesgue-Stieltjes measure, according to Theorem 7.17, is regular. □
Another very useful result is as follows.
0 , there i s an N such that for all n > N,
By case 1, applied to f bounded on E n , there is an F E e ( X ) such that c F = f i E n everywhere except on a set of measure less than � - Thus, F = fl E except on a set of measure less than
e.
0
PROBLEMS

7.1
Show that for any function $[X, [0,1], f] \in \mathcal{C}_c(X)$, $\mu^*(\operatorname{supp} f) \ge I(f)$.

7.2
Show that for all $K \in \mathcal{K}$, it holds true that $\mu^*(K) = \inf\{I(f) : f \in \mathcal{C}_c(X) \text{ and } f \ge \mathbf{1}_K\}$. [Hint: Apply Proposition 7.6.]

7.3
Can the uniqueness of the Radon measure induced by a positive linear functional be established by means of Theorem 2. 13, Chapter 5 , at least in part?
7.4
Prove Corollary 7. 13.
7.5
Show that if $(X, \tau)$ is a locally compact Hausdorff space, then every σ-finite Radon measure on .) and call $D\nu(x)$ the (measure) derivative of $\nu$ at $x$ (with respect to $\lambda$). □

Notice that if $\nu \ll \lambda$, then $\nu = \int f\,d\lambda$ (with respect to some Radon-Nikodym density $f$), and since $\frac{\nu(C(x,d))}{\lambda(C(x,d))}$ represents the mean value of the function $f$ on the cube $C(x,d)$ (of diameter $d$ and containing point $x$), $D\nu$, if it exists, seems to be equal to $f$ $\lambda$-a.e. in a vicinity of $x$. This idea (which gives a practical insight into the Radon-Nikodym derivative) will be explored in a rigorous way through several statements below.
8.2 Remark. One interpretation of the measure derivative is as follows: if $D\nu$ exists at a point $x_0$ (and therefore coincides with its upper and lower derivatives), then

$$D\nu(x_0) = \lim_{\delta \to 0} \left\{ \frac{\nu(C)}{\lambda(C)} : C \in \mathcal{F}(x_0, \delta) \right\} \tag{8.2}$$

exists for $\delta \downarrow 0$ along any pertinent net of open cubes. Therefore, for any $\varepsilon > 0$, there is a $\delta > 0$ such that for any open cube $C$ containing $x_0$, of diameter less than or equal to $\delta$,

$$\left| \frac{\nu(C)}{\lambda(C)} - D\nu(x_0) \right| < \varepsilon. \tag{8.2a}$$

As a relevant net of cubes, we can take those centered at $x_0$ and even reduce that net to a sequence of cubes of diameters $\{\frac{1}{n}\}$. □
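The limit in Remark 8.2 can be watched numerically in dimension one, where cubes are intervals. The sketch below is only an illustration under stated assumptions: the distribution function `F(x) = x**3` (whose induced measure has density $3x^2$), the helper names `nu` and `ratio`, and the point `x0 = 2.0` are all hypothetical choices, not taken from the text.

```python
def F(x: float) -> float:
    # Hypothetical nondecreasing distribution function; its measure has density 3x^2.
    return x ** 3


def nu(a: float, b: float) -> float:
    """Measure of the interval (a, b] induced by F (F is continuous, so the
    open and half-open intervals carry the same mass)."""
    return F(b) - F(a)


def ratio(x0: float, d: float) -> float:
    """nu(C) / lambda(C) for the interval C of diameter d centred at x0."""
    return nu(x0 - d / 2, x0 + d / 2) / d


x0 = 2.0
# Sequence of cubes of diameters 1/n, as suggested at the end of the remark.
ratios = [ratio(x0, 1.0 / n) for n in (1, 10, 100, 1000)]
for r in ratios:
    print(r)
```

For this particular `F` the ratio equals $3x_0^2 + d^2/4$, so the printed values approach the density value $3x_0^2 = 12$ as the diameter shrinks.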
8.3 Lemma. Let $C_1, \ldots, C_m$ be open cubes in $\mathbb{R}^n$. Then there is a subcollection, $C_{k_1}, \ldots, C_{k_s}$, of pairwise disjoint cubes among $C_1, \ldots, C_m$ such that

$$\lambda\Big(\bigcup_{i=1}^{m} C_i\Big) \le 3^n \sum_{j=1}^{s} \lambda(C_{k_j}).$$

Proof. Let $\delta_i$ be the diameter of $C_i$. Rearranging the cubes, we can assume that $\delta_1 \ge \delta_2 \ge \ldots \ge \delta_m$. Set $k_1 = 1$ and let $k_2$ be the smallest index (of the cubes) greater than 1 such that the cube with this index is disjoint from $C_{k_1}$. If there is no such cube available, then we are done. Otherwise, set $k_3$ to be the smallest index greater than $k_2$ such that $C_{k_3}$ is disjoint from $C_{k_1} + C_{k_2}$. Continue this process until the formation of all disjoint cubes $C_{k_1}, \ldots, C_{k_s}$ is finished. Suppose $S_{k_j}$ is a cube with the same center as $C_{k_j}$ but with a diameter three times as large. Since each $C_i$ intersects some $C_{k_j}$ with $i \ge k_j$ (it is impossible otherwise, as the set of the disjoint cubes is assumed to be complete) and $d(C_i) \le d(C_{k_j})$, it yields that $C_i \subseteq S_{k_j}$. Hence,

$$\lambda\Big(\bigcup_{i=1}^{m} C_i\Big) \le \lambda\Big(\bigcup_{j=1}^{s} S_{k_j}\Big) \le \sum_{j=1}^{s} \lambda(S_{k_j}) = 3^n \sum_{j=1}^{s} \lambda(C_{k_j}). \qquad \Box$$
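The selection procedure in the proof of Lemma 8.3 is a greedy algorithm, and it can be sketched in dimension $n = 1$, where open cubes are open intervals and the constant is $3^1 = 3$. The following is a minimal illustration, not the book's construction verbatim: the names `select_disjoint` and `union_length` and the sample intervals are assumptions made for the example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # open interval (left, right): a "cube" in R^1


def length(c: Interval) -> float:
    return c[1] - c[0]


def disjoint(c: Interval, d: Interval) -> bool:
    # Two open intervals are disjoint iff one ends before the other starts.
    return c[1] <= d[0] or d[1] <= c[0]


def select_disjoint(cubes: List[Interval]) -> List[Interval]:
    """Greedy step of the proof: order by decreasing diameter, then keep
    each cube that is disjoint from all cubes kept so far."""
    chosen: List[Interval] = []
    for c in sorted(cubes, key=length, reverse=True):
        if all(disjoint(c, k) for k in chosen):
            chosen.append(c)
    return chosen


def union_length(cubes: List[Interval]) -> float:
    """Lebesgue measure of a finite union of intervals (sweep from the left)."""
    total, right_end = 0.0, float("-inf")
    for a, b in sorted(cubes):
        if b > right_end:
            total += b - max(a, right_end)
            right_end = b
    return total


# Hypothetical sample data.
cubes = [(0.0, 1.0), (0.5, 1.2), (1.1, 1.4), (3.0, 3.2), (3.1, 3.6), (0.9, 2.5)]
chosen = select_disjoint(cubes)
lhs = union_length(cubes)                        # measure of the union
rhs = 3 ** 1 * sum(length(c) for c in chosen)    # 3^n times the selected mass, n = 1
print(chosen, lhs, rhs)
```

Sorting once by diameter replaces the "smallest admissible index" bookkeeping of the proof; the two descriptions pick the same cubes when the diameters are distinct.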
8.4 Lemma. Let $\mu$ be a positive Borel-Lebesgue-Stieltjes measure on $\mathbb{R}^n$ and let $N \in \mathcal{N}_\mu$. Then $D\mu$ exists $\lambda$-a.e. on $N$ and $D\mu\,\mathbf{1}_N \in [0]_\lambda$.

Proof. Because $\mu$ is a positive measure, $0 \le \underline{D}\mu \le \overline{D}\mu$; and thus we need to show that for each positive $a$,

$$\{x \in N : \overline{D}\mu(x) > a\} \in \mathcal{N}_\lambda.$$

Let

$$A = N \cap \{x \in \mathbb{R}^n : \overline{D}\mu(x) > a\}, \quad \text{for some } a > 0.$$

Then, $A$ is Borel (Problem 8.4) and, by regularity of $\mu$ (see Theorem 7.17 and Remark 7.18), for any $\varepsilon > 0$, there is an open superset $U$ of $A$ such that $\mu(U \setminus A) < \varepsilon$. Since $A$ is a $\mu$-null set, we can make $\mu(U)$ arbitrarily small. We will show that the latter, times a positive constant, dominates $\lambda(K)$, where $K$ is a compact subset of $A$, and hence $\lambda(A)$, taking into account regularity of $\lambda$.

Let $K \subseteq A$ be a compact set and $U \supseteq A$ be an open set. Given $x \in K$, by Problem 8.2, there is an open cube $C$ of any fixed diameter, say $d$, that contains $x$ and such that $\lambda(C) < \frac{1}{a}\mu(C)$. From Problem 8.2, we can make $d$ small enough to ensure $C \subseteq U$. We can cover $K$ by all such cubes and, due to compactness, have this open cover (dominated by $U$) reduce to a finite subcover, say, $C_1, \ldots, C_m$. Then, by Lemma 8.3, there is a subcollection, $C_{k_1}, \ldots, C_{k_s}$, of pairwise disjoint cubes among $C_1, \ldots, C_m$ such that

$$\lambda\Big(\bigcup_{i=1}^{m} C_i\Big) \le 3^n \sum_{j=1}^{s} \lambda(C_{k_j}).$$

As mentioned above, due to regularity of $\mu$, given an $\varepsilon > 0$, $U$ can be selected as

$$\mu(A) + \frac{\varepsilon a}{3^n} > \mu(U).$$

Hence, since $\mu(A) = 0$,

$$\lambda(K) \le \lambda\Big(\bigcup_{i=1}^{m} C_i\Big) \le 3^n \sum_{j=1}^{s} \lambda(C_{k_j}) < \frac{3^n}{a} \sum_{j=1}^{s} \mu(C_{k_j}) \le \frac{3^n}{a}\,\mu(U) < \varepsilon.$$

On the other hand, by regularity of $\lambda$, for each $\varepsilon > 0$, $K$ can be selected as

$$\lambda(K) + \varepsilon > \lambda(A).$$

The latter, along with $\lambda(K) < \varepsilon$, gives $\lambda(A) < 2\varepsilon$. □
8.5 Corollary. Let $\nu$ be a singular signed Borel-Lebesgue-Stieltjes measure. Then, the measure derivative $D\nu$ exists $\lambda$-a.e. and $D\nu = 0$ $\lambda$-a.e.

Proof. Since $\nu \perp \lambda$, by Proposition 3.2 (iii), $\nu^+, \nu^- \perp \lambda$ and there is a Borel set $B$ such that $|\nu|(B) = \nu^+(B) = \nu^-(B) = \lambda(B^c) = 0$. Hence, by Lemma 8.4, $D|\nu| = D\nu^+ = D\nu^- = 0$ $\lambda$-a.e. on $B$, and since $B^c \in \mathcal{N}_\lambda$, we have that $D|\nu| = D\nu^+ = D\nu^- = 0$ $\lambda$-a.e. on $\mathbb{R}^n$. Because $D$ is a linear operator on the set of all signed Borel-Lebesgue-Stieltjes measures, we have that $D\nu = 0$ $\lambda$-a.e. □
Since any Borel-Lebesgue-Stieltjes measure is σ-finite, by Theorem 3.4, there is a unique Lebesgue decomposition of a signed Borel-Lebesgue-Stieltjes measure $\nu$ with respect to the Borel-Lebesgue measure $\lambda$, as $\nu = \nu_a + \nu_s$, where $\nu_a \ll \lambda$ and $\nu_s \perp \lambda$. Absolute continuity of $\nu_a$ (with respect to $\lambda$) provides a $\lambda$-equivalence class $\frac{d\nu_a}{d\lambda}$ of Radon-Nikodym densities, which is referred to as the Radon-Nikodym derivative. The theorem below states that $\nu_a$ is $\lambda$-almost everywhere differentiable and its derivative coincides with any Radon-Nikodym density of the class $\frac{d\nu_a}{d\lambda}$ $\lambda$-a.e. We therefore formulate the theorem for an absolutely continuous signed Borel-Lebesgue-Stieltjes measure.
8.6 Theorem. L et v be a sig ned Borel-L ebesg ue-Stieltjes measure on such that v . . Then Dv exists on some set A such that A c E N ,\ and l A Dv E �Proof. Let f E ��- Given a real number a, denote
�n
Then p is a positive Borel-Lebesgue-Stieltjes measure on de-bounded Borel set. Then p (B )
a} f. C/J, for some real a. Show that there is a cube C containing x such that ���� > a.
8.2
v be a signed Borel-Lebesgue-Stieltjes measure and A = {x E B C IR " : Dv(x) > a} f. C/J , for some real a and B being a Borel set. Show that , given a positive real number 6, there is a cube C(x,6) v(G(x , 8) ) such that A(G(x, o)) > a.
8.3
Show that
Let
Let
8. A is an open set.
{
Measure Derivatives
( v (c) c)
= x E IR n : sup A (
-
:
C E :f ( x , 6)
515
) } >
a
D J.L, D J.L and D J.L E e 1 (IR", �; IR).
8.4
Prove that
8.5
Let $F$ be an extended distribution function induced by a positive Borel-Lebesgue-Stieltjes measure $\mu$ on ($\mathbb{R}$, $f(y)$) whenever $x < y$. A function is monotone if it is of either type. The jump $\delta_f(x)$ of a function $f$ at a point $x$ is $f(x+) - f(x-)$. The latter is clearly a finite number at any real point $x$. A point $x$ is a jump discontinuity of $f$ if $\delta_f(x) \ne 0$. [Note that the function $\frac{1}{x}$ does not fall into this category of monotone functions, as it is not bounded over bounded intervals around zero.] Note that monotone functions are measurable. Indeed, if $f$ is monotone nondecreasing, for any real number $a$, the set $\{f > a\}$ is either empty or an interval. □
1.2 Theorem. The set $D$ of all jump discontinuities of a monotone function $[\mathbb{R}, \mathbb{R}, f]$ is at most countable, and if $f$ is defined on a compact interval $[a,x]$ and $D_{(a,x)} = \{x_1, x_2, \ldots\}$ is the set of all discontinuities of $f$ on $(a,x)$ ($a < x$), then

$$\Delta_f([a,x]) := f(a+) - f(a) + \sum_{k \ge 1} \delta_f(x_k) + f(x) - f(x-) \le f(x) - f(a). \tag{1.2}$$

Proof. We assume that $f$ is monotone nondecreasing. Otherwise, we will deal with $-f$. Because

$$\mathbb{R} = \bigcup_{n=1}^{\infty} (-n, n) \quad \text{and} \quad (-n, n) = \bigcup_{k=1}^{\infty} \left[-n + \tfrac{1}{k},\ n - \tfrac{1}{k}\right],$$

it is sufficient to prove that $f$ has at most countably many points of discontinuity on any compact interval $[a,x]$. First observe that for an $n$-tuple, $a < x_1 < \ldots < x_n < x$, of points it is true that

$$f(a+) - f(a) + \sum_{k=1}^{n} \delta_f(x_k) + f(x) - f(x-) \le f(x) - f(a). \tag{1.2a}$$

Indeed, if $t_0 \in (a, x_1)$, $t_1 \in (x_1, x_2)$, ..., $t_n \in (x_n, x)$ are arbitrarily selected points, then by summing up the inequalities

$$f(a+) - f(a) \le f(t_0) - f(a),$$
$$\delta_f(x_k) \le f(t_k) - f(t_{k-1}), \quad k = 1, \ldots, n,$$
$$f(x) - f(x-) \le f(x) - f(t_n),$$

we have (1.2a). From inequality (1.2a), it also follows that if $D_\varepsilon$ is the set of all jump discontinuities of $f$ on $[a,x]$ at which the jumps are greater than an $\varepsilon > 0$, and if $x_1, \ldots, x_n \in D_\varepsilon$, then $n\varepsilon < f(x) - f(a)$ and therefore $D_\varepsilon$ is finite. Let $D_{[a,x]}$ denote the set of all jump discontinuities of $f$ on $[a,x]$ and let

$$D_{1/k} = \{u \in [a,x] : \delta_f(u) > \tfrac{1}{k}\}.$$

Then, it is readily seen that

$$D_{[a,x]} = \bigcup_{k=1}^{\infty} D_{1/k},$$

and since each $D_{1/k}$ is finite, $D_{[a,x]} \preceq \mathbb{N}$, i.e., $D_{[a,x]} = \{x_1, x_2, \ldots\}$. The latter and (1.2a) yield (1.2). □
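The counting step of the proof (at most $(f(x) - f(a))/\varepsilon$ jumps can exceed $\varepsilon$) can be checked on a concrete step function. The sketch below is an illustration only: the jump locations $1/k$ and jump sizes $2^{-k}$ are hypothetical, and exact rational arithmetic is used so that the inequality is verified without rounding.

```python
from fractions import Fraction

# Hypothetical data: a jump of size 2^-k at the point 1/k, for k = 1..K.
K = 20
jumps = {Fraction(1, k): Fraction(1, 2 ** k) for k in range(1, K + 1)}


def f(x: Fraction) -> Fraction:
    """Right-continuous nondecreasing step function: sum of jumps at points <= x."""
    return sum((size for point, size in jumps.items() if point <= x), Fraction(0))


a, b = Fraction(0), Fraction(1)
total_rise = f(b) - f(a)


def big_jumps(eps: Fraction):
    """Jump points in (a, b] at which the jump exceeds eps (the set D_eps)."""
    return [p for p, size in jumps.items() if a < p <= b and size > eps]


for eps in (Fraction(1, 4), Fraction(1, 100), Fraction(1, 10 ** 5)):
    n = len(big_jumps(eps))
    # The bound from the proof: n * eps < f(b) - f(a), so D_eps is finite.
    print(eps, n, n * eps < total_rise)
```

Taking `eps = 1/k` for `k = 1, 2, ...` exhausts all jump points, which is the countability argument in miniature.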
Observe that if the function $f$ is defined on $[a,x]$, then $f(a+) - f(a) = \delta_f(a)$ and $f(x) - f(x-) = \delta_f(x)$ can be taken for jumps of $f$ at the ends of the interval $[a,x]$. With $\Delta_f([a,a]) = 0$, equation (1.2) still holds. On the other hand, if $f$ is really defined on $\mathbb{R}$, then from (1.2) it follows directly that $\Delta_f([a,a]) = \delta_f(a)$. Now, if for $\Delta_f([a,x])$ we take $a$ as a fixed constant and let $x$ vary in $[a,b]$, then $\Delta_f([a,x])$ in (1.2) turns into a function of $x$, in new notation, $\Delta_f(x)$, which is monotone nondecreasing on $[a,b]$. The "step" function $\Delta_f(x)$ is referred to as the cumulative jump function of $f$. While it is almost obvious how to turn a monotone function into a continuous one, we would like to formalize it as follows:
1.3 Proposition. Let $[[a,b], \mathbb{R}, f]$ be a monotone nondecreasing function. Then the function $f - \Delta_f$ is monotone nondecreasing and continuous on $[a,b]$.

Proof. Let x applied to $[x, y]$, $f(x+) - f(x)$, which, along with (1.3c), yields that

Analogously, we can show that

$$\Delta_f(x-) - f(x-) = \Delta_f(x) - f(x). \qquad \Box$$
Recall that extended distribution functions fall into the category of monotone functions and there is a bijective map between the factor space $\mathfrak{D}_e|_\sim$ of $\sim$-equivalence classes of all extended distribution functions that differ in constants and the Borel-Lebesgue-Stieltjes measures they induce, and vice versa. (See Example 1.2 (iii) and Remark 3.5 (iii), Chapter 5, for a refresher.) It is intuitively clear that the measure derivative as a "pointwise" limit, if it exists, is identical to the function derivative. This is subject to the following theorem.

1.4 Theorem. Let $f$ ($\in \mathfrak{D}_e$) be an extended distribution function and let $\mu_f$ be the positive Borel-Lebesgue-Stieltjes measure induced by $f$. Then $f$ is differentiable at a point $x_0$ if and only if $\mu_f$ is differentiable at $x_0$, and in this case,

$$f'(x_0) = D\mu_f(x_0). \tag{1.4}$$

Proof. Let $f$ be differentiable at $x_0$. Then, for each positive $\varepsilon$, there is a positive $\delta$ such that

$$|F(x)| := \left| \frac{f(x) - f(x_0)}{x - x_0} - f'(x_0) \right| < \varepsilon, \quad 0 < |x - x_0| < \delta. \tag{1.4a}$$

If $x > x_0$, then by Problem 3.7 a), Chapter 5,

$$\mu_f((x_0, x)) = f(x-) - f(x_0),$$

and if $x < x_0$, since $f$ is continuous at $x_0$,

$$\mu_f((x, x_0)) = f(x_0) - f(x).$$

Therefore, if $x < x_0$,

$$F(x) = \frac{\mu_f((x, x_0))}{\lambda((x, x_0))} - f'(x_0),$$

and if $x > x_0$,

$$F(x-) = \frac{\mu_f((x_0, x))}{\lambda((x_0, x))} - f'(x_0).$$

The latter is not a significant difference from (1.4), since $f$, and therefore $F$, are continuous at $x_0$. Furthermore, because $f$ can have only at most countably many discontinuities, there is an interval around $x_0$ where $f$ and $F$ are continuous. In other words, the selection of $\delta$ can be made appropriate to warrant $F(x) = F(x-)$. Then, by (1.4a) and Remark 8.2, $D\mu(x_0)$ exists and (1.4) holds. The converse is subject to similar arguments after, in the expression for $F(x)$, $f'(x_0)$ is replaced by $D\mu(x_0)$. □

Corollary 8.7 and Theorem 1.4 combined immediately yield:
1.5 Corollary. Every extended distribution function $f \in \mathfrak{D}_e$ is differentiable $\lambda$-a.e. and

$$f' = D\mu_f = g \quad \lambda\text{-a.e.},$$

where $\mu_f$ is the Borel-Lebesgue-Stieltjes measure induced by $f$ and $g$ is a Radon-Nikodym density of the continuous component of $\mu_f$ in its Lebesgue decomposition.

1.6 Corollary. Every monotone function bounded over bounded intervals is differentiable $\lambda$-a.e.

Proof. Let $g$ be a monotone nondecreasing function (otherwise, we consider $-g$). Define

$$f(x) := g(x+)$$

to have $f \in \mathfrak{D}_e$. Then $f$ is differentiable $\lambda$-a.e., due to Corollary 1.5, and so is $g$, which, by Theorem 1.2, has at most countably many discontinuities, and hence equals $f$ $\lambda$-a.e. □

1.7 Theorem (Fubini). Let $\{F_n\}$ be a sequence of monotone nondecreasing functions such that the series $\sum_{n=1}^{\infty} F_n$ converges to a function $F$ in the topology of pointwise convergence. Then:

(i) Both $F_n$ and $F$ are differentiable $\lambda$-a.e.

(ii) $F'(x) = \sum_{n=1}^{\infty} F_n'(x)$, $\lambda$-a.e.
Proof. Assume that for each n , F n is a distribution function and F is bounded. Let J.L F n be the corresponding finite Borel-Lebesgue-Stieltjes
00
measure. The set function J.L F = I: n - 1 J.L F n is a positive measure. Then, F is clearly a distribution function, and _
It follows by eleiJlentary arguments that J.L F is a finite Borel-Lebesgue Stieltjes measure induced by F. Let denote the Lebesgue decomposition of J.L F n and let f n be a RadonNikodym density of its absolute continuous component. We show that is the Lebesgue decomposition of J.L F and f: = E�= 1 / n is a Radon Nikodym density of its absolute continuous component . Since J.L� _!_ A, there is a A-null set N n such that A(N n) = J.L�(N�) = 0. Let N
= n 00u= l Nn.
Then, because N ::> N n for each n ( and thus Nc C N�) , On the other hand, E�= l f..L � is the continuous component of J.L F ' since by the Monotone Convergence Theorem, As a finite measure, E�= 1 J.L� ( < J.L F ) provides that f is an L 1 -function and, by the Radon-Nikodym Theorem, f is a unique, modulo A, Radon
Nikodym density of E� 1 J.L� with respect to the Lebesgue measure. Since F is a distribution function, by Corollary 1 . 5 , F' exists A-a.e. and =
1 . Monot one Functions
On the other hand, applying the same argument to F F�
n
= D J.L� = f A-a. e.
and the two equations yield F' =
I: :0=
1
523
n ' we have that
F� A-a. e.
The general case of the theorem, when $F$ is a monotone nondecreasing function, bounded over bounded intervals, is left for the exercise (Problem 1.1). □

The following statement is an interesting partial confirmation of the revered Newton-Leibniz theorem applied to a class of monotone functions. The latter are differentiable $\lambda$-a.e. Unless specified otherwise, we will extend the derivative of such a function $f$ by setting $f' = 0$ on the set $N \in \mathcal{N}_\lambda$, where $N^c$ is the set on which $f'$ exists.
1.8 Theorem. Let $f$ be a bounded monotone nondecreasing function on the compact interval $[a,b]$. Then, $f'$ is measurable and

$$\int_a^b f'\,d\lambda \le f(b) - f(a). \tag{1.8}$$

Proof. Let us (continuously) extend $f$ through $(b, b+1]$ by setting $f(x) = f(b)$ on this interval. Then, at every point $x$ where the derivative of $f$ exists it can be represented as the limit

$$f'(x) = \lim_{n \to \infty} n\left[f\left(x + \tfrac{1}{n}\right) - f(x)\right]$$

of a convergent sequence of measurable functions. Furthermore, $f'$ exists on a measurable subset of $[a,b]$ whose complement is a $\lambda$-null set on which $f'$ is set to equal zero. Thus, $f'$ is well defined on $[a,b]$, it is nonnegative and therefore its Lebesgue integral exists. By Fatou's Lemma, then

$$\int_a^b f'\,d\lambda \le \sup_n \left\{ n \int_a^b \left[f\left(x + \tfrac{1}{n}\right) - f(x)\right] \lambda(dx) \right\}.$$

By the change of variables,

$$\int_a^b f\left(x + \tfrac{1}{n}\right) \lambda(dx) = \int_{a + \frac{1}{n}}^{b + \frac{1}{n}} f(x)\,\lambda(dx),$$

and thus:

$$\int_a^b \left[f\left(x + \tfrac{1}{n}\right) - f(x)\right] \lambda(dx) = \int_b^{b + \frac{1}{n}} f(x)\,\lambda(dx) - \int_a^{a + \frac{1}{n}} f(x)\,\lambda(dx) = \tfrac{1}{n} f(b) - \int_a^{a + \frac{1}{n}} f(x)\,\lambda(dx) \le \tfrac{1}{n}\left[f(b) - f(a)\right]. \qquad \Box$$
The above statement seems to fall surprisingly short of the familiar Newton-Leibniz equation. Moreover, as we will learn from the example below, the result of Theorem 1.8 can deliver a strict inequality.

1.9 Example (Cantor function). Let $G_n$, $n = 1, 2, \ldots$, be open sets removed from $[0,1]$ to form the Cantor ternary set (see Example 3.11, Chapter 5). Recall that each $G_n$ is the union of $2^{n-1}$ disjoint open intervals. Now, the set $\bigcup_{k=1}^{n} G_k$ is the union of $2^n - 1$ (as the result of the summation of $1 + \ldots + 2^{n-1}$) open intervals denoted by

$$A_1(n), \ldots, A_{2^n - 1}(n)$$

and arranged in the order of their location in $[0,1]$. For each $n$, define the function $F_n : [0,1] \to [0,1]$ as follows. Let

$$F_n(x) = k/2^n, \quad \text{if } x \in A_k(n),\ k = 1, 2, \ldots, 2^n - 1,$$

and

$$F_n(0) = 0, \quad F_n(1) = 1.$$

Then, interpolate $F_n$ by connecting the ends of the corresponding segments of $F_n$ on $A_k(n)$. For instance,

$$\bigcup_{k=1}^{3} G_k = \left(\tfrac{1}{27}, \tfrac{2}{27}\right) + \left(\tfrac{1}{9}, \tfrac{2}{9}\right) + \left(\tfrac{7}{27}, \tfrac{8}{27}\right) + \left(\tfrac{1}{3}, \tfrac{2}{3}\right) + \left(\tfrac{19}{27}, \tfrac{20}{27}\right) + \left(\tfrac{7}{9}, \tfrac{8}{9}\right) + \left(\tfrac{25}{27}, \tfrac{26}{27}\right).$$

The graphs of $F_1$ and $F_3$ are drawn in Figure 1.1 below.
[Figure 1.1: graphs of $F_1$ and $F_3$, with the intervals $A_k(3)$ marked on the horizontal axis.]

Observe that $A_k(n) = A_{2k}(n+1)$, and that $F_n(x) = F_{n+1}(x) = k/2^n$, for $x \in A_k(n) = A_{2k}(n+1)$, $k = 1, \ldots, 2^n - 1$. It is easily seen that $F_n$ is a monotone nondecreasing, continuous function on $[0,1]$, and it is also clear that

$$|F_n(x) - F_{n+1}(x)| < \frac{1}{2^n}, \quad \forall x \in [0,1].$$

Thus $F_n(x)$ converges uniformly to a function $F(x)$, which is called the Cantor function, and $F$ is also continuous and monotone nondecreasing (as the result of the uniform convergence of a sequence of monotone nondecreasing, continuous functions). Therefore, since $F(x) = F_n(x) = k/2^n$ for $x \in A_k(n)$, we have that $F'(x) = 0$, for $x \in A_k(n)$, $k = 1, 2, \ldots, 2^n - 1$, $n = 1, 2, \ldots$. Hence, $F'(x) = 0$ on $\bigcup_{n=1}^{\infty} G_n$. The latter is the complement of the Cantor set $C$. Consequently, $F' \in [0]_\lambda$ on $[0,1]$. Therefore,

$$\int_0^1 F'\,d\lambda = 0, \quad \text{while} \quad F(1) - F(0) = 1. \qquad \Box$$
1.1
Complete the proof of Fubini's Theorem 1. 7 for the general case of when F is a monotone nondecreasing function, bounded over bounded intervals.
1.2
Let f be a monotone nondecreasing function on [a,b] and F be a monotone function on [A,B]. Is the composition F o f: [a,b] --+IR monotone?
1.3
Let f and F be the functions of Problem 1.2 and suppose the function f has a jump of discontinuity at x0 E (a,b). Must F o f be discontinuous at x0 ?
1.4
Show that if f is continuous on [a,b] , then the functions m (x): = inf{ f(t) : t E [a,x]} and M(x): = sup{f(t) : t E [a,x]} are continuous and monotone on [a,b] .
1.5
Give an example of two monotone nondecreasing functions whose product is not monotone.
1.6
Give a monotone increasing function $[\mathbb{R}, \mathbb{R}, f]$ discontinuous at each rational point.

1.7
Prove that if a function $[(a,b), \mathbb{R}, f]$ is monotone, bounded, and continuous, then it is uniformly continuous.

1.8
Does the validity of the statement of Problem 1.7 still hold if the interval $(a,b)$ is replaced by $\mathbb{R}$?

NEW TERMS:
monotone nondecreasing function
monotone nonincreasing function
monotone function
jump discontinuity
cumulative jump function
Fubini's Theorem for monotone functions
Cantor's ternary function
2. FUNCTIONS OF BOUNDED VARIATION

Now we will introduce the class of functions of "bounded variation," which play the same role for signed measures as distribution functions do for generating positive Borel-Lebesgue-Stieltjes measures.

2.1 Definition. Let $[a,b]$ be a compact interval in $\mathbb{R}$ and let $P = \{a_0 = a, \ldots, a_n = b\}$ be a partition of $[a,b]$. Let $f$ be a bounded real-valued measurable function defined on $[a,b]$. Denote

$$V(P) = V(P, f, [a,b]) = \sum_{i=1}^{n} |f(a_i) - f(a_{i-1})|$$

and let $\mathcal{P}$ be the set of all partitions of $[a,b]$. Then we call $\sup\{V(P) : P \in \mathcal{P}\}$ the variation of $f$ on $[a,b]$ and denote it by $V_f[a,b]$. The function $f$ is said to be of bounded variation on $[a,b]$ if $V_f[a,b] < \infty$. □

2.2 Example. Consider the function

$$f(x) = \begin{cases} 0, & x = 0, \\ x \sin\frac{1}{x}, & 0 < x \le 1, \end{cases}$$

and make the partition $P = \{0 < x_n < \ldots < x_1 < 1\}$ such that

$$x_n = \frac{1}{(n + \frac{1}{2})\pi}, \qquad f(x_n) = (-1)^n \frac{1}{(n + \frac{1}{2})\pi}.$$

Then,

$$|f(x_k) - f(x_{k+1})| = \frac{1}{\pi}\left(\frac{1}{k + \frac{1}{2}} + \frac{1}{k + \frac{3}{2}}\right) > \frac{2}{\pi} \cdot \frac{1}{k + 2},$$

and hence

$$V(P) > \frac{2}{\pi} \sum_{k=1}^{n-1} \frac{1}{k + 2},$$

which is unbounded in $n$ because the harmonic series diverges. Consequently,

$$V_f[0,1] = \infty. \qquad \Box$$
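The divergence in Example 2.2 is slow (logarithmic), which a short computation makes visible. The sketch below assumes the function $f(x) = x \sin(1/x)$ with $f(0) = 0$ and partition points $x_k = 1/((k + \tfrac12)\pi)$, at which $\sin(1/x_k) = \pm 1$; the helper name `variation_sum` is hypothetical.

```python
import math


def f(x: float) -> float:
    # The oscillating function of the example, with f(0) = 0.
    return 0.0 if x == 0.0 else x * math.sin(1.0 / x)


def variation_sum(n: int) -> float:
    """V(P) for the partition 0 < x_n < ... < x_1 < 1, x_k = 1/((k + 1/2) * pi)."""
    points = [0.0] + [1.0 / ((k + 0.5) * math.pi) for k in range(n, 0, -1)] + [1.0]
    return sum(abs(f(q) - f(p)) for p, q in zip(points, points[1:]))


for n in (10, 100, 1000, 10000):
    print(n, round(variation_sum(n), 3))
```

Each tenfold refinement adds roughly $(2/\pi)\ln 10$ to the sum, so the partition sums grow without bound even though $f$ is continuous on $[0,1]$.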
We will leave for an exercise (Problems 2.1-2.14) the following properties of functions of bounded variation.

2.3 Theorem. Let $[[a,b], \mathbb{R}, f]$ be a bounded function. The following hold true:

(i) If $f$ is monotone, then it is of bounded variation.

(ii) If $f$ satisfies a Lipschitz condition, then $f \in \mathcal{V}[a,b]$.

(iii) Let $f \in \mathcal{V}[a,b]$. Then $x \mapsto V_f[a,x]$ is a monotone nondecreasing function on $[a,b]$.

(iv) The set $\mathcal{V}[a,b]$ of all functions of bounded variation on $[a,b]$ is a vector space over the field $\mathbb{R}$ and it is closed with respect to multiplication.

(v) Let $f, g \in \mathcal{V}[a,b]$ such that $g \ge \delta > 0$. Then $\frac{f}{g} \in \mathcal{V}[a,b]$.

(vi) If $f \in \mathcal{V}[a,b]$, then $V_f[a,b] = V_f[a,c] + V_f[c,b]$.

(vii) If $P = \{a = a_0 < a_1 < \ldots < a_n = b\}$ is a partition of $[a,b]$ such that on each of the subintervals $[a_i, a_{i+1}]$ $f$ is monotone, then $f \in \mathcal{V}[a,b]$.

(viii) If $f \in \mathcal{V}[a,b]$ and $[a,b] = [a,c] + (c,b]$, then $f \in \mathcal{V}[a,c]$ and $f \in \mathcal{V}[c,b]$.

(ix) $f \in \mathcal{V}[a,b]$ if and only if $f$ can be represented as the difference of two monotone nondecreasing functions.

(x) If $f \in \mathcal{V}[a,b]$, then $f$ is differentiable $\lambda$-a.e. on $[a,b]$.

(xi) The set of all jump discontinuities of any function $f \in \mathcal{V}[a,b]$ is at most countable.

(xii) Any $f \in \mathcal{V}[a,b]$ can be represented as the sum of its jump function $\Delta_f$ and a continuous function of bounded variation on $[a,b]$.

(xiii) Let $f \in \mathcal{V}[a,b]$. If $f$ is continuous at $x_0 \in (a,b)$, then so is $x \mapsto V_f[a,x]$. If $f$ is right-continuous, then so is $V_f[a,x]$.

(xiv) Any continuous function $f \in \mathcal{V}[a,b]$ can be represented as the difference of two continuous monotone functions. □

2.4 Definition. Let $[\mathbb{R}, \mathbb{R}, f]$ be a bounded function. The limits

$$V_f(-\infty, b] = \lim_{a \to \infty} V_f[-a, b],$$
$$V_f[a, \infty) = \lim_{b \to \infty} V_f[a, b],$$
$$V_f(\mathbb{R}) = V_f(-\infty, +\infty) = \lim_{a \to \infty} V_f[-a, a]$$

are said to be the variation of $f$ on $(-\infty, b]$, the variation of $f$ on $[a, \infty)$, and the total variation of $f$, respectively. The function $f$ is said to be of bounded variation on $(-\infty, b]$, $[a, \infty)$, or $\mathbb{R}$, if the above respective limits are finite, in notation, $f \in \mathcal{V}(-\infty, b]$, $f \in \mathcal{V}[a, \infty)$, or $f \in \mathcal{V}(\mathbb{R})$, respectively. □
2.5 Theorem.

(i) For any two real numbers $a < b$,

$$V_f(-\infty, b] = V_f(-\infty, a] + V_f[a, b]$$

and

$$V_f[a, \infty) = V_f[a, b] + V_f[b, \infty).$$

(ii) If $f \in \mathcal{V}(\mathbb{R})$, then

$$\lim_{a \to \infty} V_f(-\infty, -a] = \lim_{a \to \infty} V_f[a, \infty) = 0.$$

(iii) $f \in \mathcal{V}(\mathbb{R})$ if and only if $f$ can be represented as the difference of two monotone nondecreasing bounded functions. If, in addition, $f$ is a distribution function, then the latter representation is of two distribution functions. □

Proof. Parts (i) and (ii) are left for the reader (Problem 2.21).

(iii) Denote $v_f(x) = V_f(-\infty, x]$ and

$$F := v_f + f \quad \text{and} \quad G = v_f - f.$$

Clearly, $v_f$ is a monotone nondecreasing and bounded function. Let $x < y$. Then, because $|f(y) - f(x)| \le V_f[x,y]$,

$$F(y) - F(x) = V_f[x,y] + f(y) - f(x) \ge 0.$$

The proof that $G$ is monotone nondecreasing is analogous. Now,

$$f = \tfrac{1}{2}F - \tfrac{1}{2}G$$

is a pertinent representation. Finally, if $f$ is a distribution function, then so is $v_f$, due to part (ii) and Proposition 2.3 (xiii). □
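The decomposition $f = \tfrac12 F - \tfrac12 G$ from the proof of part (iii) can be imitated on a grid. The sketch below is a discrete stand-in under stated assumptions: it works on the compact interval $[0, 2\pi]$ instead of $\mathbb{R}$, it replaces $v_f$ by the running sum of absolute increments over sample points, and the sample function $\sin$ and all variable names are hypothetical.

```python
import math
from itertools import accumulate

# Sample points of a hypothetical function of bounded variation on [0, 2*pi].
xs = [2 * math.pi * i / 200 for i in range(201)]
fs = [math.sin(x) for x in xs]

# Discrete stand-in for v_f(x) = V_f[a, x]: running sum of |increments|.
increments = [abs(q - p) for p, q in zip(fs, fs[1:])]
vs = [0.0] + list(accumulate(increments))

F = [v + y for v, y in zip(vs, fs)]   # v_f + f
G = [v - y for v, y in zip(vs, fs)]   # v_f - f


def nondecreasing(seq, tol: float = 1e-12) -> bool:
    """True if the sequence never drops by more than a rounding tolerance."""
    return all(q - p >= -tol for p, q in zip(seq, seq[1:]))


print(nondecreasing(F), nondecreasing(G), vs[-1])
```

Each increment of `F` is $|\Delta f| + \Delta f \ge 0$ and each increment of `G` is $|\Delta f| - \Delta f \ge 0$, which is the inequality $|f(y) - f(x)| \le V_f[x,y]$ of the proof in discrete form.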
2.6 Definition. A function $f \in \mathcal{V}(\mathbb{R})$ is said to be a signed distribution function (in notation, $f \in \mathfrak{D}_s$), if it is right-continuous and vanishes at $-\infty$. □

2.7 Theorem. Let
GJ)
finite signe d measures by
be the operator defined on set f (x)
= v(( - oo,x]).
Then, f is a signe d distribution function an d jection. Proof.
6 * (lR, 0 , there is a fJ > 0 such that for any n-tuple of disjoint open intervals, { ( ak ,b k )} k = 1 with E � 1 (b k - ak ) < 6 , it holds true that E � = 1 l f(b k ) - f(ak ) l < £. =
Since
V1 [ak ,b k] = sup{ E
I f( · ) - f( · ) I
over all finite partitions of
for each {n > 0, there is a partition that
Therefore,
On the other hand, since
[ak ,bk] },
a k = a0 , k < . . . < a N k ' k = b k
such
3 . A bsolutely Continuous Functions
Nk
L.J k = l L.J m k = l ( a m k , k - a m k - 1 , k )
�
�
n
=
�
L.J k = l (b k - a k ) n
53 7
< 0'
we have
L � = 1 L:�Z = 1 f ( a m k , k ) - f ( a m k - l , k ) < � Consequently, from (3.5) ,
L�
=
1
V1 [ak , b k] < e: .
(3. 5a)
By Theorem 2.5 ( i ) , and by our assumption that f E 'r(IR),
which allows us to rewrite (3. 5a) in the form
and there by complete the proof.
3.6 Corollary.
Let f and v 1 be as in L emma
F:
=
v f + f and
G
=
vf
-
D 3. 5.
Then, the functions
f
are absolutely continuous, bounded, and monotone nondecreasing. If f E A( IR ) n 'r(IR) and vanishes at infinity, then it can be represented as the difference, f
=
�F - �G,
of two absolutely continuous distribution functions. Proof. From Lemma 3.5, and linearity of A(IR), it follows that F and G are elements of A(lR) and they are obviously bounded. The rest is due to Theorem 2.5 (iii). D
3.7
Proposition. If f E A[a,b] , then f can be represented as the dif
ference of two distribution functions on [a,b] . 3.8 Theorem. Let v E s. ( IR, Iv
=
=
= L: � 1 I t ,., ( b k ) - t ,., (a k ) I , =
implying that f v E A(IR).
(ii) Now, let $f \in A(\mathbb{R}) \cap \mathfrak{D}_s$. Since $\nu = \nu_f$ is finite, [...] Therefore, $f \in \mathcal{V}(\mathbb{R}) \cap A(\mathbb{R})$, and by Lemma 3.5, $v_f \in A(\mathbb{R})$. By Corollary 3.6, the functions
$$F := \frac{v_f + f}{2} \quad \text{and} \quad G := \frac{v_f - f}{2}$$
are absolutely continuous, bounded, monotone nondecreasing, and vanishing at $-\infty$. In particular, being absolutely continuous, $F$ and $G$ are elements of $\mathfrak{D}$. Let $\mu_F$ and $\mu_G$ be the corresponding finite Borel-Lebesgue-Stieltjes measures induced by $F$ and $G$, respectively. Because $F - G = f$, the signed measure $\nu_f := \mu_F - \mu_G$ is clearly an element of $\mathfrak{S}^*$ [...] $\lambda(B) < \frac{\delta}{2}$, where $\delta$ is the "threshold" taken from the absolute continuity condition of the distribution function $F$. By regularity of $\lambda$ (see Theorem 7.17 and Remark 7.18), for each $\frac{\delta}{2}$, there is an open superset $U$ of $B$ such that
$$\lambda(B) + \frac{\delta}{2} > \lambda(U). \tag{3.8}$$
On the other hand, by Problem 2.10, Chapter 4, $U$ can be represented as at most a countable union of disjoint semi-open intervals: [...] so that, from (3.8), [...] (3.8a)

Now, by absolute continuity of $F$, for any finite subcollection of $\{I_j = (a_j, b_j]\}$, say $\{(a_j, b_j]\}_{j=1}^{n}$, because of (3.8a),
$$\sum_{j=1}^{n} \big( F(b_j) - F(a_j) \big) = \sum_{j=1}^{n} \mu_F((a_j, b_j]) = \mu_F\Big( \sum_{j=1}^{n} (a_j, b_j] \Big) < \varepsilon.$$
By continuity from below of $\mu_F$, we have that $\mu_F(U) \le \varepsilon$. Since $B \subseteq U$, $\mu_F(B) \le \varepsilon$. In summary, we showed that for each $\varepsilon > 0$, there is a $\delta_0 = \frac{\delta}{2}$ such that for every Borel set $B$ with $\lambda(B) < \delta_0$, $\mu_F(B) \le \varepsilon$, and therefore $\mu_F \ll \lambda$ [...] □

[...] $> 0$ for each $a \in A$ and $\nu(A^c) = 0$. Since $\nu \perp \lambda$, it is also called singular-discrete. Binomial, geometric, and Poisson measures are examples of positive singular-discrete Borel-Lebesgue-Stieltjes measures.

4.1 Definition. A function $f$ is called singular-continuous if it is continuous, not a constant, $\lambda$-a.e. differentiable, and its derivative is zero $\lambda$-a.e. [Observe that by Corollary 3.12, a singular-continuous function is continuous but not absolutely continuous.] □
4.2 Example (Cantor Singular-Continuous Function). From Example 1.9, the Cantor ternary function $F$ is monotone nondecreasing and singular-continuous. Let $\mu_F$ be the corresponding Borel-Lebesgue-Stieltjes measure. Since $F$ is constant on $A_k(n)$, it follows that $\mu_F(A_k(n)) = 0$ and thus $\mu_F(C^c) = 0$. On the other hand, $\lambda(C) = 0$. Thus, $\mu_F \perp \lambda$. Furthermore, since $F$ is continuous, $\mu_F(\{x\}) = 0$ for all $x \in [0,1]$. Therefore, $\mu_F$ is a singular-continuous Borel-Lebesgue-Stieltjes measure induced by $F$. □
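The two facts used in Example 4.2, that $F$ is constant on the removed middle thirds and that the removed intervals exhaust the length of $[0,1]$, can be illustrated with exact rational arithmetic. This is a sketch only: the function names `cantor` and `removed_length` and the truncation depth are assumptions made for illustration, and the code evaluates a finite-depth approximation of the Cantor ternary function from its ternary-digit description.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Approximate the Cantor ternary function on [0, 1] from the ternary digits of x."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # x lies in the closure of a removed middle third
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value

def removed_length(n):
    """Total length of the middle thirds removed in the first n construction steps."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

# F is constant on the removed interval (1/3, 2/3), so mu_F((2/5, 3/5]) = 0.
mass_middle = cantor(Fraction(3, 5)) - cantor(Fraction(2, 5))

# The removed length after n steps is 1 - (2/3)^n, which tends to 1, so lambda(C) = 0.
gap_after_10 = 1 - removed_length(10)
```

Here `mass_middle` plays the role of $\mu_F$ of a subinterval of a removed middle third, and `gap_after_10` is the Lebesgue measure still left after ten steps of the construction.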
The above example gives rise to a seemingly close relation between singular-continuous distribution functions and singular-continuous Borel-Lebesgue-Stieltjes measures. We will start with the following:

4.3 Theorem. Let $\mu$ be a positive $\sigma$-finite singular-continuous Borel-Lebesgue-Stieltjes measure. Then the corresponding extended distribution function $f_\mu$ is singular-continuous.

Proof.
(i) Let $\mu \perp \lambda$. Then, there is a Borel set $A$ such that $\mu(A) = \lambda(A^c) = 0$. Since $f := f_\mu$ is an extended distribution function, by Corollary 1.6, $f'$ exists $\lambda$-a.e. and clearly $f' \ge 0$ everywhere it exists. We will show that $E = \{x : f'(x) > 0\} \in \mathcal{N}_\lambda$.
$$\int_E f'\,d\lambda = \int_E \mathbf{1}_A f'\,d\lambda \le \int_A f'\,d\lambda.$$
CHAPTER 9. CALCULUS ON THE REAL LINE
We will prove that $\int_A f'\,d\lambda = 0$. If so, $\int_E f'\,d\lambda = 0$ and thus $f' = 0$ $\lambda$-a.e.

By Theorem 1.8, for each compact interval $[a,b]$,
$$\int_a^b f'\,d\lambda \le f(b) - f(a) = \mu((a,b]). \tag{4.3}$$
Since $A$ is Borel and $\mu$ is $\sigma$-finite, by Theorem 2.28, Chapter 5, for each $\varepsilon > 0$, there is a disjoint sequence $\{I_n\}$ of semi-open intervals such that $A \subseteq \sum_{n=1}^{\infty} I_n$ and
$$\mu\Big( \sum_{n=1}^{\infty} I_n \setminus A \Big) = \sum_{n=1}^{\infty} \mu(I_n) - \mu(A) = \sum_{n=1}^{\infty} \mu(I_n) < \varepsilon.$$
(Notice that since $\mu(A) = 0$, the $\sigma$-finiteness of $\mu$ is not a necessary constraint to use Theorem 2.28.) Because of (4.3), [...] Therefore,
$$\int_A f'\,d\lambda = 0.$$

(ii) $f$ is continuous, because $\mu$ is continuous, i.e. $\mu(\{x\}) = 0$ for all $x \in \mathbb{R}$. □
4.4 Corollary. Let $\nu$ be a singular-continuous signed Borel-Lebesgue-Stieltjes measure and $f$ be the signed distribution function induced by $\nu$. Then, $f$ is singular-continuous.

Proof. Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition of $\nu$ and $f^+$ and $f^-$ be the corresponding distribution functions. Then, clearly $\nu^+$ and $\nu^-$ are singular-continuous finite positive Borel-Lebesgue-Stieltjes measures. The proof is complete after applying Theorem 4.3. □
4.5 Theorem. Let $f \in \mathfrak{D}_e$ and $f' = 0$ $\lambda$-a.e. Then $\mu_f \perp \lambda$.

Proof. Denote $\mu = \mu_f$. Then, $\mu$ is a positive $\sigma$-finite Borel-Lebesgue-Stieltjes measure. By the Lebesgue Decomposition Theorem 3.4, Chapter 8, there is a unique decomposition $\mu = \mu_a + \mu_s$ such that $\mu_a \ll \lambda$ and $\mu_s \perp \lambda$. Assume first that $\mu$ is finite. Then, both $\mu_a$ and $\mu_s$ are finite. By Radon-Nikodym's Theorem 2.2 (case 1), there is a nonnegative $L^1$-function $g$ such that $\mu_a = \int g\,d\lambda$. By Lebesgue Corollary 3.10, the function
$$F(x) = \int_{-\infty}^{x} g(u)\,\lambda(du) = \mu_a((-\infty, x])$$
is differentiable $\lambda$-a.e. and $F' = g$ $\lambda$-a.e. On the other hand, $f = F + G$, where $G(x) := \mu_s((-\infty, x])$. By Theorem 4.3 (i), since $\mu_s \perp \lambda$, $G' = 0$ $\lambda$-a.e. and therefore, $F' = 0$ $\lambda$-a.e. and $g = 0$ $\lambda$-a.e. Consequently,
$$\mu_a = \int g\,d\lambda = 0$$
and it leaves $\mu \perp \lambda$.

Now, if $\mu$ is $\sigma$-finite, let $\{\Omega_n\}$ be a countable measurable partition of $\mathbb{R}$ so that $\mu_n = \operatorname{Res}_{\mathcal{B} \cap \Omega_n} \mu$ is a finite Borel-Lebesgue-Stieltjes measure, which, according to the above arguments, is orthogonal to $\lambda$, i.e., there is a set $A_n \subseteq \Omega_n$ such that $\mu_n(A_n) = \lambda(\Omega_n \setminus A_n) = 0$. Therefore, the set
$$A = \sum_{n=1}^{\infty} A_n$$
is such that $\mu(A) = \lambda(A^c) = 0$. □
4.6 Corollary. Let $f$ be a singular-continuous signed distribution function and let $\nu_f$ be the signed Borel-Lebesgue-Stieltjes measure induced by $f$. Then, $\nu_f$ is a singular-continuous signed measure.

Proof. In the decomposition $f = F - G$ into two distribution functions, each one is singular-continuous. This, as we know, yields the decomposition $\nu_f = \mu_F - \mu_G$ into two finite positive Borel-Lebesgue-Stieltjes measures, each one of which is singular-continuous due to Theorem 4.5. □

4.7 Definitions.

(i) An extended distribution function $D$ is said to be discrete if it is a monotone nondecreasing step function on any compact interval and it can be represented as [...] (4.7) where $\{d_n\}\uparrow\ \subset \mathbb{R}$ and $\sum_{n=-\infty}^{\infty} A_n = \mathbb{R}$ is a countable decomposition of $\mathbb{R}$ into semi-open intervals. Due to Theorem 1.2, an extended discrete distribution function can also be defined as a piecewise constant monotone nondecreasing function. If $D = D_1 - D_2$ is a signed distribution function, with $D_i$ being discrete distribution functions, then $D$ is said to be a discrete signed distribution function.

Since any discrete signed distribution function $D$ is almost everywhere constant, its derivative $D'$ exists $\lambda$-a.e. and $D' = 0$ $\lambda$-a.e. Unlike its singular-continuous counterpart, a discrete signed distribution function is not continuous and thus we can alternatively call it singular-discrete.

(ii) Any singular-discrete or singular-continuous signed distribution function is referred to as singular. □
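A discrete distribution function in the sense of Definition 4.7 can be sketched as a right-continuous step function built from finitely many atoms. The code below is a minimal illustration under stated assumptions: the helper name `make_discrete_df` is hypothetical, only finitely many atoms are handled, and the weights are those of a binomial measure with parameters $3$ and $\tfrac12$, one of the singular-discrete examples mentioned above.

```python
from bisect import bisect_right
from itertools import accumulate

def make_discrete_df(atoms, masses):
    """Return D with D(x) = sum of masses[n] over all atoms[n] <= x.

    D is piecewise constant, monotone nondecreasing and right-continuous;
    it jumps by masses[n] exactly at atoms[n].
    """
    pairs = sorted(zip(atoms, masses))
    points = [p for p, _ in pairs]
    levels = [0.0] + list(accumulate(m for _, m in pairs))

    def D(x):
        # bisect_right counts the atoms that are <= x, which gives right-continuity.
        return levels[bisect_right(points, x)]

    return D

# Hypothetical example: binomial(3, 1/2) weights at the atoms 0, 1, 2, 3.
D = make_discrete_df([0, 1, 2, 3], [0.125, 0.375, 0.375, 0.125])

jump_at_1 = D(1) - D(0.999)     # mass of the atom {1}
flat_between = D(0.5) - D(0.1)  # no mass strictly between atoms
```

Away from the atoms $D$ is locally constant, so its derivative vanishes there, which matches the observation that $D' = 0$ $\lambda$-a.e.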
4.8 Remark. If $D$ is an extended discrete distribution function given by (4.7), it increases only at points $\{x_n\}$ of an at most countable set $A$ and it induces the following atomic measure [...] (4.8) where $\delta_n = d_n - d_{n-1} = \mu(\{x_n\}) > 0$. Correspondingly, any signed singular-discrete distribution function induces a unique signed singular-discrete Borel-Lebesgue-Stieltjes measure. Conversely, any signed singular-discrete Borel-Lebesgue-Stieltjes measure generates a unique signed singular-discrete distribution function. □

4.9 Theorem. Any signed distribution function $f$ can uniquely be decomposed as
$$f = f_a + f_{cs} + f_d, \tag{4.9}$$
where $f_a$, $f_{cs}$, $f_d$ are its absolutely continuous, singular-continuous, and discrete components, respectively. Furthermore, $f'$ exists $\lambda$-a.e. and $f' = f_a'$ $\lambda$-a.e.

Proof. By Corollary 3.8, Chapter 8, any signed Borel-Lebesgue-Stieltjes measure $\nu$ can uniquely be decomposed as
$$\nu = \nu_a + \nu_{cs} + \nu_d \tag{4.9a}$$
such that $\nu_a \ll \lambda$, $\nu_{cs} + \nu_d \perp \lambda$, and $\nu_{cs} \perp \nu_d$, where $\nu_{cs}$ and $\nu_d$ are singular-continuous and singular-discrete components of $\nu$. By the above theorems and propositions, each of the three components of $\nu$ induces a unique signed distribution function of its respective type and therefore, the signed distribution function $f_\nu$ (induced by $\nu$) is of the form
$$f_\nu = f_a + f_{cs} + f_d.$$
This representation is clearly unique. Conversely, if $f$ is a signed distribution function, it generates a unique signed Borel-Lebesgue-Stieltjes measure $\nu$, which by the above decomposition, in turn yields the corresponding unique decomposition (4.9) of signed distribution functions. Finally, $f'$ exists $\lambda$-a.e. and $f' = f_a'$ $\lambda$-a.e. □
The following provides a practical method for determining the decomposition of a distribution function. By Proposition 1.3, any monotone
4. Singular Functions
nondecreasing function $f$ can be represented as the sum of a monotone nondecreasing continuous function (namely $f$ minus its cumulative jump function) and the step function given by the cumulative jump function of $f$. The theorem below states how a continuous function of bounded variation can uniquely be represented as a sum of an absolutely continuous and a singular-continuous function.

4.10 Theorem. Let $f \in \mathcal{V}[a,b] \cap \mathcal{C}[a,b]$. Then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous function. With $\alpha(a) = f(a)$, this representation is unique.

Proof.

(i) Existence. Since $f$ is differentiable $\lambda$-a.e. on $[a,b]$, we can define
$$\alpha(x) = f(a) + \int_a^x f'\,d\lambda, \qquad \sigma = f - \alpha. \tag{4.10}$$
Since $f \in \mathcal{V}[a,b]$, it is bounded and it can be decomposed as the difference of two monotone nondecreasing functions. Hence, applying Theorem 1.8 to each of them, we conclude that $f' \in L^1$. Then, by Theorem 3.9, $\alpha \in A[a,b]$. As regards $\sigma$, it appears to be a linear combination of two $\mathcal{V}[a,b]$ functions, and therefore, its derivative $\sigma'$ exists $\lambda$-a.e. and wherever it exists, it is equal to $f' - \alpha' = 0$. Of course, $\sigma \in \mathcal{C}[a,b]$. Therefore, $\sigma$ is singular-continuous.

(ii) Uniqueness. Suppose $f = \alpha + \sigma = \tilde\alpha + \tilde\sigma$. Thus $\alpha - \tilde\alpha = \tilde\sigma - \sigma$. Since $\sigma' = \tilde\sigma' = 0$ $\lambda$-a.e., $(\alpha - \tilde\alpha)' \in [0]_\lambda$. Furthermore, $\alpha - \tilde\alpha \in A[a,b]$, and therefore, by Corollary 3.12, $\alpha - \tilde\alpha = \text{const}$. On the other hand,
$$\alpha(a) - \tilde\alpha(a) = f(a) - f(a) = 0.$$
The latter shows that $\alpha$ is identical to $\tilde\alpha$ and thus, $\sigma$ is identical to $\tilde\sigma$. □

4.11 Corollary. Let $f \in \mathfrak{D}_s \cap \mathcal{C}[a,b]$. Then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous function. With $\alpha(a) = f(a)$, this representation is unique. □

4.12 Proposition. If $f$ is a distribution function, then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous distribution function.
Proof. Let $f$ be defined on $[a,b]$. Since $f' \ge 0$, $\alpha(x) = f(a) + \int_a^x f'\,d\lambda$ is monotone nondecreasing. Furthermore, from Theorem 1.8,
$$\int_x^y f'\,d\lambda \le f(y) - f(x) \qquad (x < y)$$
and hence
$$\alpha(y) - \alpha(x) \le f(y) - f(x),$$
or in the form $\sigma(x) = f(x) - \alpha(x) \le \sigma(y) = f(y) - \alpha(y)$.

Now, suppose that the domain of $f$ is $\mathbb{R}$. Since $f(x) \to 0$ for $x \to -\infty$, we have for $a \to -\infty$,
$$\int_{-\infty}^{x} f'\,d\lambda < \infty.$$
Set
$$\alpha(x) = \mu((-\infty, x]) = \int_{-\infty}^{x} f'\,d\lambda, \tag{4.12}$$
and $\alpha(x) \to 0$ for $x \to -\infty$ by $\emptyset$-continuity of $\mu$. This also implies that $\sigma(x) \to 0$ for $x \to -\infty$. □

4.13 Example. Consider the following distribution function:
_
F(x) =
We can decompose
0,
x 2.
1 2' O < x < 1 x2 , 1 < x < 2
F as the sum of an absolutely continuous component, 0,
x2 - 1 ,
X