Real Analysis: An Introduction to the Theory of Real Functions and Integration


Author: David Mond | Marcelo Saia


JEWGENI H. DSHALALOW

Real Analysis An Introduction to the Theory of Real Functions and Integration


CHAPMAN & HALL/CRC

Studies in Advanced Mathematics

Series Editor

STEVEN G. KRANTZ
Washington University in St. Louis

Editorial Board

R. Michael Beals, Rutgers University
Dennis de Turck, University of Pennsylvania
Ronald DeVore, University of South Carolina
Lawrence C. Evans, University of California at Berkeley
Gerald B. Folland, University of Washington
William Helton, University of California at San Diego
Norberto Salinas, University of Kansas
Michael E. Taylor, University of North Carolina

Titles Included in the Series

Steven R. Bell, The Cauchy Transform, Potential Theory, and Conformal Mapping
John J. Benedetto, Harmonic Analysis and Applications
John J. Benedetto and Michael W. Frazier, Wavelets: Mathematics and Applications
Albert Boggess, CR Manifolds and the Tangential Cauchy-Riemann Complex
Goong Chen and Jianxin Zhou, Vibration and Damping in Distributed Systems, Vol. 1: Analysis, Estimation, Attenuation, and Design. Vol. 2: WKB and Wave Methods, Visualization, and Experimentation
Carl C. Cowen and Barbara D. MacCluer, Composition Operators on Spaces of Analytic Functions
John P. D'Angelo, Several Complex Variables and the Geometry of Real Hypersurfaces
Lawrence C. Evans and Ronald F. Gariepy, Measure Theory and Fine Properties of Functions
Gerald B. Folland, A Course in Abstract Harmonic Analysis
José García-Cuerva, Eugenio Hernández, Fernando Soria, and José-Luis Torrea, Fourier Analysis and Partial Differential Equations
Peter B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah-Singer Index Theorem, 2nd Edition
Alfred Gray, Modern Differential Geometry of Curves and Surfaces with Mathematica, 2nd Edition
Eugenio Hernández and Guido Weiss, A First Course on Wavelets
Steven G. Krantz, Partial Differential Equations and Complex Analysis
Steven G. Krantz, Real Analysis and Foundations
Kenneth L. Kuttler, Modern Analysis
Michael Pedersen, Functional Analysis in Applied Mathematics and Engineering
Clark Robinson, Dynamical Systems: Stability, Symbolic Dynamics, and Chaos, 2nd Edition
John Ryan, Clifford Algebras in Analysis and Related Topics
Xavier Saint Raymond, Elementary Introduction to the Theory of Pseudodifferential Operators
Robert Strichartz, A Guide to Distribution Theory and Fourier Transforms
André Unterberger and Harald Upmeier, Pseudodifferential Analysis on Symmetric Cones
James S. Walker, Fast Fourier Transforms, 2nd Edition
James S. Walker, Primer on Wavelets and their Scientific Applications
Gilbert G. Walter, Wavelets and Other Orthogonal Systems with Applications
Kehe Zhu, An Introduction to Operator Algebras

JEWGENI H. DSHALALOW

Real Analysis: An Introduction to the Theory of Real Functions and Integration

CHAPMAN & HALL/CRC
Boca Raton London New York Washington, D.C.

Library of Congress Cataloging-in-Publication Data

Dshalalow, Jewgeni H.
Real analysis : an introduction to the theory of real functions and integration / Jewgeni H. Dshalalow.
p. cm. -- (Studies in advanced mathematics)
Includes bibliographical references and index.
ISBN 1-58488-073-2 (alk. paper)
1. Mathematical analysis. I. Title. II. Series. 2. Biology-molecular. I. McLachlan, Alan. II. Title.
QA300 .D74 2000
515--dc21 00-058593 CIP

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying. Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

© 2001 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 1-58488-073-2
Library of Congress Card Number 00-058593
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper

To my Lord and Redeemer Who made the supreme sacrifice for me and Who will come again

Preface

This book is intended to be an introductory two-semester course in abstract analysis, which includes topology, measure theory, and integration, an assemblage of topics traditionally gathered under the name "Real Analysis," which is more common in the United States. Most North American schools offer this as a graduate one- to two-semester course for mathematics, physics, and engineering majors. Many European schools, to the best of my knowledge, do not have such a course; they have instead a sequence of separate courses such as Topology, Measure and Integration, and Functional Analysis. In some countries, such as Russia and the former Soviet republics, they additionally have a Real Variables course, which is somewhat similar to Real Analysis but is more specialized, and its profile and rigor vary from college to college.

A very good reason for learning real analysis is that not only is it a core course for all mathematical disciplines, but it is absolutely mandatory for statistics and probability, operations research, physics, and some engineering majors as well. Hence, rephrasing an old adage, all routes of science and technology go through real analysis.

This text predominantly targets first-year graduate students of mathematical science majors as well as first- and second-year graduate students of engineering, physics, and operations research majors. A stronger senior undergraduate mathematics student can also benefit from the course. Some less theoretically oriented programs or those with weaker mathematics course curricula may find it reasonable to use the book for a three-semester course, with the first two semesters of basics and the third semester of advanced topics. The course can always be shortened to two semesters in such schools with the option to cover the first seven chapters, which are also quite sufficient for technical majors.

This book is intended primarily as a textbook, and its purpose as a reference is secondary. The reason for such a claim is a rather thorough elaboration of major theorems, notions, and constructions, very often supplied with a blueprint and sometimes a less formal introduction. The latter are then succeeded by detailed treatments. For instance, the Radon-Nikodym Theorem is first introduced in Chapter 6, with a minimum of proofs and formalities, but with a number of examples and exercises. It is then followed by a more abstract version later, in Chapter 8.

The first three chapters of the book (Part I) include preliminaries on set theory and basics of metric spaces and topology. I have been using these three chapters for many years teaching a bilevel topology course at Florida Tech during our quarter system. However, I would not be able to cover the present version of the three chapters in one quarter, and one semester would be a more appropriate term for the current program at our school. Hence, the first three chapters can easily serve as a separate one-quarter to one-semester topology course for senior undergraduates or beginning graduates. Chapters 4-7 (Part II) present basics of measure and integration and, again, they can be offered as a separate measure theory (and integration) course. Consequently, Parts I and II can become appealing to those programs with separately named courses and, in particular, to European students.

Part III (Chapters 8 and 9) includes a more elaborate and abstract version of measure and integration, along with their applications to functional analysis ($L^p$ spaces and the Riesz Representation Theorem for locally compact Hausdorff spaces), probability theory (conditional expectation, uniform integrability, Lebesgue-Stieltjes integrals, decomposition of distribution functions, stochastic convergence, and convergence of Radon measures), and conventional analysis on the real line (monotone and absolutely continuous functions, functions of bounded variation, and major theorems of calculus). Part III can be utilized for advanced topics, as well as an enlarged variant of measure and integration. While the reader would be better off having studied Part I prior to Part II and the first six sections of Chapter 8, the latter can also be used as independent material with sufficient basics of topology drawn from any generic advanced analysis course.

The book can also be used as a reference source for researchers in mathematical and engineering sciences, and especially operations research (such as applied stochastic processes, queueing theory, and reliability). The reader should understand, however, that the book is not intended to become an encyclopedia of mathematics or to be any kind of broad reference. I had to suppress my temptation to include some written chapters on Hilbert spaces, functional analysis, and Fourier transforms, because of my intention to compile the main topics of what constitutes real analysis and to design a text by spending more time on details (within the limits of the book size imposed by the publisher and buyers' affordability).

This text may be well suited for independent study, with or without instructors, for which an abundance of examples and over 600 exercises provide pertinent support. While a solution manual is in preparation and will become available soon (and it would be an additional study aid), the publisher and I have agreed to provide this manual only to university instructors upon adoption of the book for a course. The reader may also find the new-terms subsections (at the end of each section) useful, especially considering the plethora of new definitions and notations, which not only can be intimidating but can also create an additional memory burden and thereby slow down the learning of the main concepts.

Most of my thanks are due to my wife Irina for her ample support, encouragement, and overwhelming sacrifice. I would like to express my deep appreciation to Mr. Jürgen Becker for his constant guidance and countless ideas; to Mr. Donald Konwinski for his enormous editorial work on earlier versions of my manuscript; to Professors Gerald B. Folland and Ryszard Syski for their numerous and very constructive remarks; and for the kind assistance of Professors S.G. Deo, Jean-B. Lassere, and Jordan Stoyanov, Mr. Gary Russell, the project editor, Mr. David Alliot, and the anonymous reviewers who thoroughly read my manuscript and made many helpful suggestions. My thanks are also due to the publisher, Mr. Robert Stern, for his help and extreme patience.

Jewgeni H. Dshalalow
Melbourne, Florida

Contents

Preface

Part I. An Introduction to General Topology

Chapter 1. Set-Theoretic and Algebraic Preliminaries
1. Sets and Basic Notation
2. Functions
3. Set Operations under Maps
4. Relations and Well-Ordering Principle
5. Cartesian Product
6. Cardinality
7. Basic Algebraic Structures

Chapter 2. Analysis of Metric Spaces
1. Definitions and Notations
2. The Structure of Metric Spaces
3. Convergence in Metric Spaces
4. Continuous Mappings in Metric Spaces
5. Complete Metric Spaces
6. Compactness
7. Linear and Normed Linear Spaces

Chapter 3. Elements of Point Set Topology
1. Topological Spaces
2. Bases and Subbases for Topological Spaces
3. Convergence of Sequences in Topological Spaces and Countability
4. Continuity in Topological Spaces
5. Product Topology
6. Notes on Subspaces and Compactness
7. Function Spaces and Ascoli's Theorem
8. Stone-Weierstrass Approximation Theorem
9. Filter and Net Convergence
10. Separation
11. Functions on Locally Compact Spaces

Part II. Basics of Measure and Integration

Chapter 4. Measurable Spaces and Measurable Functions
1. Systems of Sets
2. System's Generators
3. Measurable Functions

Chapter 5. Measures
1. Set Functions
2. Extension of Set Functions to a Measure
3. Lebesgue and Lebesgue-Stieltjes Measures
4. Image Measures
5. Extended Real-Valued Measurable Functions
6. Simple Functions

Chapter 6. Elements of Integration
1. Integration on $\mathcal{C}^{-1}(\Omega, \Sigma)$
2. Main Convergence Theorems
3. Lebesgue and Riemann Integrals on $\mathbb{R}$
4. Integration with Respect to Image Measures
5. Measures Generated by Integrals. Absolute Continuity. Orthogonality
6. Product Measures of Finitely Many Measurable Spaces and Fubini's Theorem
7. Applications of Fubini's Theorem

Chapter 7. Calculus in Euclidean Spaces
1. Differentiation
2. Change of Variables

Part III. Further Topics in Integration

Chapter 8. Analysis in Abstract Spaces
1. Signed and Complex Measures
2. Absolute Continuity
3. Singularity
4. $L^p$ Spaces
5. Modes of Convergence
6. Uniform Integrability
7. Radon Measures on Locally Compact Hausdorff Spaces
8. Measure Derivatives

Chapter 9. Calculus on the Real Line
1. Monotone Functions
2. Functions of Bounded Variation
3. Absolute Continuous Functions
4. Singular Functions

Bibliography

Part I An Introduction to General Topology

Chapter 1 Set-Theoretic and Algebraic Preliminaries

Set theory is not just one of the main tools in mathematics; it is the very root of mathematics, from which all mathematical disciplines stem. The great German mathematician Georg Ferdinand Cantor is considered to be the sole founder of set theory, which he developed in a series of papers, the first of which appeared in 1874. Although the Czech Bernard Bolzano (1781-1848) made one of the first attempts to formalize set theory, in particular in his 1851 work Paradoxien des Unendlichen, by considering the one-to-one correspondence between two sets (later developed by Cantor into what we now know as cardinals), neither he nor anyone else was really a predecessor to Cantor's creation. Ernst Zermelo (1871-1953) was another German who, among his numerous contributions to set theory, is the author of the first axiom for set theory (of 1908) and undoubtedly the primary axiom of the whole of mathematics. This chapter presents only the essentials of set theory and abstract algebra needed throughout the book.

1. SETS AND BASIC NOTATION

Cantor defined a set as a collection $M$ into a whole of definite, distinct objects (that are called elements of $M$) of our thought. In other words, we bind objects (perhaps of different nature) in our mind into a single entity and call that entity a set. We will denote sets by capital letters, and their elements by lower case letters. For instance, a set $A$ has elements $a, b, c$, or $a_1, a_2, \ldots$. To abbreviate the expression "$a$ is an element of the set $A$," we will write $a \in A$. The expression "$a \notin A$" reads "$a$ is not an element of $A$."

Observe that the notion of a set is relatively simple if we deal with such frequently encountered sets as sets of integers, rational numbers, real numbers, or continuous functions. In some rare situations, thoughtless use of this notion can lead to contradictions, like Bertrand Russell's paradox. Russell posed the following set dilemma. Let $\mathfrak{R}$ be the set of all sets that are not elements of themselves. Clearly, $\mathfrak{R}$ is not empty. For instance, the set of all real numbers is not an element of itself (for it is not a real number), thus it belongs to $\mathfrak{R}$. The question arises: Is $\mathfrak{R}$ an element of itself? If $\mathfrak{R} \in \mathfrak{R}$, then by the definition of $\mathfrak{R}$ it should not belong to $\mathfrak{R}$, which is a contradiction. Thus, $\mathfrak{R} \notin \mathfrak{R}$. But then, by definition, it must belong to $\mathfrak{R}$, which is impossible. In this case, we have put the definition of an object ahead of its existence. The concept of a set must be supported by axioms of set theory, just as the main axioms of plane geometry define the shape of lines.

1.1 Definitions.

(i) A set $A$ is said to be a subset of a set $B$ (in notation, $A \subseteq B$) if all elements of $A$ are also elements of $B$. If $A$ is a subset of $B$, we call $B$ a superset of $A$ (in notation, $B \supseteq A$). A set that contains exactly one element, say $a$, is called a singleton (set) and it is denoted by $\{a\}$. If $a \in A$, then we can alternatively write $\{a\} \subseteq A$. Any set is obviously a subset of itself: $A \subseteq A$.

(ii) The unique set with no elements is called the empty set and is denoted $\emptyset$. Clearly, $\emptyset$ is a subset of any set, including itself.

(iii) $A = B$ (read "set $A$ equals set $B$") if and only if $A \subseteq B$ and $B \subseteq A$; otherwise, we will write $A \neq B$. Occasionally, we will use the symbol "$\subset$" for the situation where one set is a subset of another set but the sets are not equal. $A \subset B$ reads "$A$ is a proper subset of $B$." In this case, $B$ is a proper superset of $A$ (in notation, $B \supset A$).

We postulate the existence of a set that is a superset of all other sets in the framework of a certain mathematical model. This set is usually called a universal set or just universe. We will also make use of the word "carrier" as a synonym for the universe and reserve for it the Greek letter $\Omega$. Sometimes, we will denote it by $X$, $Y$, or $Z$. A universe (as a base for some mathematical model or problem) is generally defined to contain all considered sets, and it varies from model to model. For example, if $C^n_{[a,b]}$ denotes the set of all $n$-times differentiable functions on the interval $[a,b]$, it contains, as a subset, the set of possible solutions of an ordinary differential equation of the $n$th order. Thus, $\Omega = C^n_{[a,b]}$ is a relevant universe within which the problem is posed. One could also take for $\Omega$ the set $C_{[a,b]}$ of all continuous functions on $[a,b]$ or even the set of all real-valued functions on $[a,b]$. However, these are too "vast" to serve as universes and they are impractical for this concrete problem.

Set theory is also a basic ingredient of probability theory, which always begins with elements of set theory under a slightly modified lexicon. For instance, a universe is referred to as a sample space. Subsets of the sample space are called events; specifically, singletons are called elementary events. The concept of the universe is most vivid when used in probability theory. Let us consider the experiment that consists of tossing a coin until the first appearance of the head on the upper face of the coin. Denoting by $H$ an outcome of the head and by $T$ an outcome of the tail when tossing the coin, we may define $\{(T, T, \ldots, T, H)\}$ as an elementary event of the sample space $\Omega$ populated by the elements $\{(H), (T, H), (T, T, H), \ldots\}$. The universe $\Omega$ contains, as elements, all possible outcomes of tossing the coin until the "first success" or the first appearance of the head. For instance, in the language of probability theory, the event $\{(H), (T, H), (T, T, H)\}$ corresponds to the "success in at most three tosses."
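The coin-tossing sample space described above can be sketched in Python. This is only an illustration under stated assumptions: the names `outcomes` and `success_within` are hypothetical helpers that do not come from the text, and outcomes are encoded as tuples of the strings "T" and "H".

```python
from itertools import count, islice


def outcomes():
    """Yield the elementary outcomes (H), (T,H), (T,T,H), ... in order."""
    for tails in count(0):
        # `tails` tosses showing T, followed by the first H.
        yield ("T",) * tails + ("H",)


def success_within(k):
    """Return the event 'first head in at most k tosses' as a finite set."""
    return set(islice(outcomes(), k))


# The event {(H), (T,H), (T,T,H)} from the text.
event = success_within(3)
```

Because the sample space is infinite, the sketch uses a generator for the whole universe and materializes only finite events as sets.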

1.2 Notations. Throughout the whole book we will be using the following notation.

(i) Logical symbols:

$\forall$ means "for all"
$\exists$ means "there is" or "there are" or "there exists"
$\Rightarrow$ means "implies" or "from ... it follows that ..."
$\Leftrightarrow$ means "if and only if"
$\wedge$ (&) means "and"
$\vee$ means "or"
$:$ means "such that" (primarily used for definition of sets)

(ii) Frequently used sets:

$\mathbb{N}$: the set of all positive integers
$\mathbb{N}_0$: the set of all nonnegative integers
$\mathbb{Z}$: the set of all integers
$\mathbb{Q}$: the set of all rational numbers
$\mathbb{Q}^c$: the set of all irrational numbers
$\mathbb{R}$: the set of all real numbers
$\mathbb{C}$: the set of all complex numbers
$\mathbb{R}_+$: the set of all nonnegative real numbers
$\mathbb{R}_-$: the set of all negative real numbers

(iii) Denotation of sets:

List: The elements are listed inside a pair of braces [for instance, $\{a, b, c\}$ or $\{a_1, a_2, \ldots\}$].

Condition: A description of the elements with a condition following a colon (that in this case reads "such that"), again with braces enclosing the set [for instance, the set of odd integers is $\{n \in \mathbb{Z} : n = 2k + 1,\ k \in \mathbb{Z}\}$].

(iv) Main set operations:

Union: $A \cup B = \{x \in \Omega : x \in A \vee x \in B\}$
Intersection: $A \cap B = \{x \in \Omega : x \in A \wedge x \in B\}$. Two subsets $A, B \subseteq \Omega$ are called disjoint if $A \cap B = \emptyset$.
Difference: $A \setminus B = \{x \in \Omega : x \in A \wedge x \notin B\}$ [$A \setminus B$ is also called the complement of $B$ with respect to $A$, with the alternative notation $A - B$.]
Symmetric Difference: $A \,\Delta\, B = (A \setminus B) \cup (B \setminus A)$
Complement (with respect to the universe $\Omega$): $A^c = \Omega \setminus A$

(v) General notation:

$:=$ reads "set by definition."
$\square$ indicates the end of a proof, remarks, examples, etc.

A set-algebraic expression is a set in the form of some defined sets connected through set operations. Any transformation of a set-algebraic expression into another expression would require a set-theoretic manipulation, which we call a set-algebraic transformation. All basic set-algebraic transformations over basic set-algebraic expressions are known as Laws of Algebra (or Calculus) of Sets. $\square$
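The main set operations listed above map directly onto Python's built-in set type, which may help in experimenting with set-algebraic expressions. The following is a minimal sketch; the finite universe `OMEGA` and the sets `A` and `B` are arbitrary illustrative choices, not taken from the text.

```python
# A hypothetical finite universe and two of its subsets.
OMEGA = frozenset(range(10))
A = frozenset({1, 2, 3, 4})
B = frozenset({3, 4, 5, 6})

union = A | B                   # x in A or x in B
intersection = A & B            # x in A and x in B
difference = A - B              # x in A and x not in B
symmetric_difference = A ^ B    # (A \ B) united with (B \ A)
complement_A = OMEGA - A        # complement with respect to the universe


def are_disjoint(S, T):
    """Two sets are disjoint when their intersection is empty."""
    return S.isdisjoint(T)
```

Note that the complement is only meaningful relative to a chosen universe, which is why `OMEGA` appears explicitly in its definition.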

1.3 Remark. One of the standard tools of the algebra of sets is the so-called pick-a-point process applied to, say, showing that $A \subseteq B$ or $A = B$. It is based on the following Axiom of Extent: For each set $A$ and each set $B$, it is true that $A = B$ if and only if for every $x \in \Omega$, $x \in A$ when and only when $x \in B$.

Axiom's modification: If every element of $A$ is an element of $B$, then $A \subseteq B$.

Thus, for the modification, the pick-a-point process consists of selecting an arbitrary point $x$ of $A$ (picking a point $x$) and then proving that $x$ also belongs to $B$. The identities below can be verified easily by the reader using pick-a-point techniques.

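1.4 Theorem (Laws of Algebra of Sets).

(i) Commutative Laws:
$A \cup B = B \cup A$
$A \cap B = B \cap A$

(ii) Associative Laws:
$(A \cup B) \cup C = A \cup (B \cup C)$
$(A \cap B) \cap C = A \cap (B \cap C)$

(iii) Distributive Laws:
$(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
$(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$

(iv) Idempotence of
complement: $(A^c)^c = A$
union: $A \cup A = A$
intersection: $A \cap A = A$

(v) $A \cap A^c = \emptyset$

(vi) $A \cup A^c = \Omega$

(vii) DeMorgan's Laws:
$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$

(viii) $A \cup \emptyset = A$

(ix) $A \cap \emptyset = \emptyset$

(x) $\Omega^c = \emptyset$ and $\emptyset^c = \Omega$.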

1.5 Example. Show the validity of the first distributive law.
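As a sanity check that complements (but does not replace) a pick-a-point proof, the first distributive law $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$ can be tested exhaustively on a small universe. The sketch below assumes a three-element universe; the helper names `power_set`, `is_subset_pick_a_point`, and `first_distributive_law_holds` are hypothetical and introduced only for illustration.

```python
from itertools import chain, combinations


def power_set(universe):
    """Return all subsets of a finite universe as frozensets."""
    items = list(universe)
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]


def is_subset_pick_a_point(S, T):
    """Check S is a subset of T by testing each point of S for membership in T."""
    return all(x in T for x in S)


def first_distributive_law_holds(universe):
    """Test (A u B) n C == (A n C) u (B n C) for all subsets A, B, C."""
    subsets = power_set(universe)
    for A in subsets:
        for B in subsets:
            for C in subsets:
                left = (A | B) & C
                right = (A & C) | (B & C)
                # Equality is shown by two inclusions, as in the Axiom of Extent.
                if not (is_subset_pick_a_point(left, right)
                        and is_subset_pick_a_point(right, left)):
                    return False
    return True
```

A finite check of this kind only covers the chosen universe; the general statement still requires the element-wise argument, where an arbitrary $x$ in the left-hand side is shown to lie in the right-hand side and vice versa.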

1.6 Remark. The concepts of union and intersection can be extended to an arbitrary family of sets. For instance,

$$\bigcup_{i \in I} A_i = \{x \in \Omega : \exists\, i \in I,\ x \in A_i\}.$$

The distributive laws and DeMorgan's laws hold for arbitrary families (subject to Problem 1.1 b)):

$$\Big(\bigcup_{i \in I} A_i\Big) \cap B = \bigcup_{i \in I} (A_i \cap B)$$

$$\Big(\bigcap_{i \in I} A_i\Big) \cup B = \bigcap_{i \in I} (A_i \cup B)$$
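The extension of union and intersection to an indexed family can be illustrated on a small finite example. The following sketch assumes a family stored as a Python dictionary from indices to frozensets; `union_all`, `intersection_all`, and the sample data are hypothetical and chosen only for illustration.

```python
from functools import reduce


def union_all(family):
    """Union of a finite family of frozensets (empty family gives the empty set)."""
    return reduce(frozenset.union, family, frozenset())


def intersection_all(family, universe):
    """Intersection of a finite family, taken inside the given universe."""
    return reduce(frozenset.intersection, family, frozenset(universe))


OMEGA = frozenset(range(10))
# Indexed family A_1 = {1,2,3,4}, A_2 = {2,3,4,5}, A_3 = {3,4,5,6}.
family = {i: frozenset(range(i, i + 4)) for i in range(1, 4)}
B = frozenset({4, 5, 9})

# Both sides of the two generalized distributive laws.
lhs_union = union_all(family.values()) & B
rhs_union = union_all(A_i & B for A_i in family.values())
lhs_inter = intersection_all(family.values(), OMEGA) | B
rhs_inter = intersection_all((A_i | B for A_i in family.values()), OMEGA)

# One of DeMorgan's laws for the family.
lhs_dm = OMEGA - union_all(family.values())
rhs_dm = intersection_all((OMEGA - A_i for A_i in family.values()), OMEGA)
```

The convention that an empty intersection equals the universe is a design choice of this sketch; it matches the usual reading of the condition "for all $i \in I$" when $I$ is empty.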


(i) An indexed family $\mathcal{F} = \{A_i \subseteq \Omega : i \in I\}$ of sets is called (pairwise) disjoint if for all $i \neq j$, $A_i \cap A_j = \emptyset$. Throughout this book, the union of a pairwise disjoint family of sets will be denoted for convenience by $\sum_{i \in I} A_i$. Specifically, $A + B$ means $A \cup B$ when $A$ and $B$ are disjoint.

(ii) A decomposition of a set $A$ is any representation of $A$ as the union of a disjoint family of sets, $A = \sum_{i \in I} A_i$. The family $\{A_i : i \in I\}$ is referred to as a partition of $A$. [There is another use of the term partition, applied to a different construction in a narrower sense. Namely, $P$ is a partition of a closed interval $[a,b] \subseteq \mathbb{R}$ if $P$ is any ordered finite set of points $\{a_0, \ldots, a_n\} \subseteq [a,b]$ with $a = a_0 < a_1 < \cdots < a_n = b$.]

(iii) Let $\Omega$ be a fixed set. The family of all subsets of $\Omega$ is called the power set of $\Omega$ and it is denoted by $\mathcal{P}(\Omega)$.

(iv) A sequence $\{A_n : n = 1, 2, \ldots\}$ of sets is said to be monotone nondecreasing (nonincreasing) if

$$A_1 \subseteq A_2 \subseteq \cdots \quad (A_1 \supseteq A_2 \supseteq \cdots).$$

To specify the type of convergence, we will write $\{A_n\} \uparrow A$ ($\{A_n\} \downarrow A$), where $A = \bigcup_{n=1}^{\infty} A_n$ in the nondecreasing case and $A = \bigcap_{n=1}^{\infty} A_n$ in the nonincreasing case. A sequence $\{A_n\}$ of sets is said to be monotone vanishing if it is monotone nonincreasing and $\{A_n\} \downarrow \emptyset$.

(v) Let $\{A_n\}$ be an arbitrary sequence of sets. Denote

(a) $\displaystyle \liminf_{n \to \infty} A_n$ (or just $\underline{\lim}\, A_n$) $\displaystyle = \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m$. This limit is called the limit inferior.

(b) $\displaystyle \limsup_{n \to \infty} A_n$ (or just $\overline{\lim}\, A_n$) $\displaystyle = \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m$. This limit is called the limit superior.

If $\underline{\lim}\, A_n = \overline{\lim}\, A_n$, then we denote this common limit as $\lim_{n \to \infty} A_n$. In this case, the limit of $\{A_n\}$ is said to exist and equal $\lim_{n \to \infty} A_n$.
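The limit inferior and limit superior of a sequence of sets can be computed exactly when the sequence is eventually periodic, because only finitely many distinct tails then occur. The sketch below assumes the sequence is given as a finite `prefix` followed by a nonempty `cycle` that repeats forever; these names and the helper functions are hypothetical, introduced only to illustrate the two formulas.

```python
from functools import reduce


def tails(prefix, cycle):
    """Yield the distinct tails {A_m : m >= n} as lists of sets.

    Once n passes the prefix, every tail contains exactly the sets
    of the cycle, so one representative tail per prefix position
    plus one for the pure cycle is enough.
    """
    for n in range(len(prefix) + 1):
        yield list(prefix[n:]) + list(cycle)


def lim_inf(prefix, cycle):
    """Union over n of the intersection of the n-th tail."""
    inner = [reduce(frozenset.intersection, t) for t in tails(prefix, cycle)]
    return reduce(frozenset.union, inner)


def lim_sup(prefix, cycle):
    """Intersection over n of the union of the n-th tail."""
    inner = [reduce(frozenset.union, t) for t in tails(prefix, cycle)]
    return reduce(frozenset.intersection, inner)


# A_1 = {0,1,2,3}, then {1,2}, {2,3}, {1,2}, {2,3}, ...
prefix = [frozenset({0, 1, 2, 3})]
cycle = [frozenset({1, 2}), frozenset({2, 3})]
lower = lim_inf(prefix, cycle)
upper = lim_sup(prefix, cycle)
```

In this example the limit inferior collects the points that belong to all but finitely many of the sets, and the limit superior collects the points that belong to infinitely many of them; since the two differ, the limit of the sequence does not exist.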

PROBLEMS

1.1 a) Prove Theorem 1.4, the laws of algebra of sets, by using the pick-a-point process.
b) Prove the generalized distributive laws and DeMorgan's laws stated in Remark 1.6.

1.2 Show that:

1.3 Show that $A \setminus B = A \cap B^c$.

1.4 Let $|A| = n$ (i.e., the set $A$ contains $n$ elements). Show that $|\mathcal{P}(A)| = 2^n$.

1.5 Prove that:

1.6 For each of the following, justify with a proof or give a counterexample.

1.7 Give an example of a monotone vanishing sequence of sets.

1.8 Let $\{A_n : n = 1, 2, \ldots\}$ be an arbitrary sequence of sets. Define

$$A_* = \bigcap_{n=1}^{\infty} A_n \quad \text{and} \quad A^* = \bigcup_{n=1}^{\infty} A_n.$$

a) Construct a monotone nonincreasing sequence of sets $\{B_n\}$ such that $\{B_n\} \downarrow A_*$.
b) Construct a monotone nondecreasing sequence of sets $\{C_n\}$ such that $\{C_n\} \uparrow A^*$.
c) Given $\{C_n\} \uparrow A^*$, construct a pairwise disjoint sequence $\{D_n\}$ such that $\sum_{n=1}^{\infty} D_n = A^*$.

1.9 In the condition of Problem 1.8, show that $A_* \subseteq \underline{\lim}\, A_n \subseteq \overline{\lim}\, A_n \subseteq A^*$.

1.10 Let $\Omega$ be an arbitrary set. Find a sequence $\{E_n\}$ of subsets of $\Omega$ such that $\underline{\lim}\, E_n = \emptyset$ and $\overline{\lim}\, E_n = \Omega$.


NEW TERMS: set; element of a set; Russell's paradox; subset; superset; singleton; empty set; proper subset; proper superset; universe; carrier; sample space; events; elementary events; union; intersection; disjoint sets; difference; symmetric difference; complement; set-algebraic expression; set-algebraic transformation; pick-a-point process; axiom of extent; commutative laws; associative laws; distributive laws; idempotence; DeMorgan's laws; pairwise disjoint sets; disjoint family of sets; decomposition of a set; partition of a set; partition of an interval; power set; monotone nondecreasing sequence of sets; monotone nonincreasing sequence of sets; monotone vanishing sequence of sets; limit inferior; limit superior; limit of a sequence


2. FUNCTIONS

The word "function" was introduced by Gottfried von Leibniz in 1694, initially as a term to denote any quantity related to a curve, such as its slope, the radius of curvature, etc. The notion of the function was refined subsequently by Johann Bernoulli, Leonhard Euler, Joseph Fourier, and finally by Lejeune Dirichlet in the middle of the nineteenth century, with a formulation pretty close to what we are using at the present time and which a mathematics or engineering student meets in an introductory calculus course. Dirichlet introduced a variable as a symbol that represents a set of numbers; if two variables $x$ and $y$ are so related that whenever $x$ takes on a value, there is a value $y$ assigned to $x$ by some rule of correspondence, then $y$ (a dependent variable) was said to be a function of $x$ (an independent variable).

In this section we introduce a more contemporary notion of a function. For functions operating with sets (rather than with points), we will be using a nontraditional notation of $f_*$ and $f^*$ (instead of just $f$), previously used by MacLane and Birkhoff [1993], which we found very appealing, as it brings more order within functions acting on collections of sets (such as topologies and sigma-algebras) and simplifies many proofs.

2.1 Definitions.

(i) Let $X$ and $Y$ be two sets. The set $\{(x,y): x \in X,\ y \in Y\}$ of all ordered pairs of elements of $X$ and $Y$ is called the Cartesian or direct product of $X$ and $Y$ and it is denoted by $X \times Y$. If $X = Y$ then we shall write $X \times X = X^2$. Similarly, the Cartesian product of $n$ sets is

$$X_1 \times \cdots \times X_n = \{(x_1,\ldots,x_n): x_i \in X_i,\ i = 1,\ldots,n\},$$

the set of all ordered $n$-tuples.

(ii) Any subset $f$ of $X \times Y$ is called a binary relation.

(iii) A binary relation $f \subseteq X \times Y$ is called a (single-valued) function if whenever $(x,y_1)$ and $(x,y_2)$ are elements of $f$, then $y_1 = y_2$. We also say that the function $f$ is a map (or mapping) from $X$ to $Y$ and denote this most frequently by the triple $[X,Y,f]$ or by $f: X \to Y$ or by $(x,f(x))$ or by $f(x) = y$ or by $x \mapsto f(x)$.

(iv) For a function $f$ (as a subset of $X \times Y$), denote

$$D_f = \{x \in X: \exists\, y \in Y,\ (x,y) \in f\}$$

and call it the domain of $f$. When a function $[X,Y,f]$ is given we will


agree that $X$ is the domain of $f$. If a domain is not specified, we agree to regard as $D_f$ the largest possible set where $f$ is defined. The latter requires a more rigorous motivation. For instance, let

$$f(x) = \frac{1}{\sqrt{x-1}}.$$

This function is defined for all $x \in (1,\infty)$. On the extended real line $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty, -\infty\}$, we allow $x \in [1,\infty]$. And finally, it is not wrong to have $x$ be any real (or even complex) number, if $f$ will take on values in $Y \subseteq \mathbb{C}$ (or $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$).

(v) Another component of a function is its range,

$$R_f = \{y \in Y: \exists\, x \in D_f,\ f(x) = y\}.$$

A superset of $R_f$ (such as $Y$) is referred to as a codomain. In other words, $R_f$ is the subset of all such elements of $Y$ which take part in the relation $f \subseteq D_f \times Y$.

(vi) If $x \in D_f$, then $f(x)$ $(\in R_f)$ is called the image of $x$ under $f$. By the above definition, for every $x$ there is a unique image. [Note that an "extended" concept of a function allows more than one image of each point $x$ under $f$. Any such function $f$ is called multi-valued. The reader is definitely acquainted with principles of complex analysis where such functions are common. It is also known that in this case the range of a multi-valued function can be partitioned into pairwise disjoint subsets, such that the function is then split into a number of single-valued functions called branches.]

(vii) If $D \subseteq D_f$ then the set of the images of all points of $D$ under $f$ is called the image of $D$ under $f$ and, following the notation of most analysis textbooks, it can be denoted

$$f(D) = \{f(x): x \in D\}.$$

However, for the upcoming constructions, it is convenient to distinguish images of points of a set from images of subsets of $X$ under $f$. In other words, we introduce the function

$$f_*: \mathcal{P}(X) \to \mathcal{P}(Y),$$

where for $D \in \mathcal{P}(X)$ we denote $f_*(D) = \{y \in Y: \exists\, x \in D,\ f(x) = y\}$.


Specifically, $R_f = f_*(D_f)$. We agree to set $f_*(\{x\}) = \emptyset$ $\forall x \notin D_f$. However, unless specified, we will always assume that in $[X,Y,f]$, $X$ is the domain of the function $f$. [In particular, this agreement excludes such an inconsistency as having $f(x) = \emptyset$ whenever $x \notin D_f$, since $f(x)$ is supposed to be a point and not a set.]

(viii) Let $[X,Y,f]$ be a function. Define the function

$$f^*: \mathcal{P}(R_f) \to \mathcal{P}(X)$$

and call it the inverse of $f_*$. In other words, for each $B \in \mathcal{P}(R_f)$, $f^*(B) = \{x \in X: f(x) \in B\}$. The set $f^*(B)$ is called the inverse image of $B$ under $f$, or the pre-image of $B$ under $f$. Another construction related to $f^*$ is $f^{-1}$, defined as $\{(y,x) \in Y \times X: (x,y) \in f\}$ and called the inverse of $f$. Unlike $f^*$, in general, $f^{-1}$ is not a single-valued function (in other words, it is a binary relation or multi-valued function). Consider, for instance, the function $[\mathbb{R},\mathbb{R},f]$ such that $f(x) = x^2$. Clearly, $R_f = \mathbb{R}_+$ and the inverse $J = f^{-1}$ of $f$ is a two-valued function with domain $D_J = \mathbb{R}_+$ and with range equal to $\mathbb{R}$, which can be decomposed as $\mathbb{R} = (-\infty,0) + [0,\infty)$. Accordingly, we have two branches $[\mathbb{R}_+, (-\infty,0], J_1]$ and $[\mathbb{R}_+, \mathbb{R}_+, J_2]$ of $J$.

(ix) Observe that it is legitimate that $f(x_1) = f(x_2)$ and $x_1 \ne x_2$. However, if $f$ is such that $f(x_1) = f(x_2)$ if and only if $x_1 = x_2$, then $f$ is called one-to-one (or injective or invertible). If $f$ is one-to-one, $f^{-1}$ is a single-valued function too.

Since $f^{-1}$ in general is not a single-valued function we will agree to regard $f^{-1}(y)$ as a set (which in particular can be a singleton or the empty set), with the alternative notation $f^*(\{y\})$.

(x) Let $[X,Y,f]$ be a function. Generally, $f_*(X) = R_f \subseteq Y$. In this case, we say the map $f$ is from $X$ into $Y$. When $f_*(X) = Y$, we say the map $f$ is from $X$ onto $Y$ or surjective. We call $f$ bijective if $f$ is surjective (onto) and injective (one-to-one).

(xi) Let $f \subseteq X \times Y$ and $g \subseteq Y \times Z$ be binary relations. Then the composition of $f$ with $g$ is defined as

$$g \circ f = \{(x,z) \in X \times Z: \exists\, y \in Y,\ (x,y) \in f,\ (y,z) \in g\}.$$

The composition of $f$ with $g$ is most frequently used when $[X,Y,f]$ and $[R_f \cap D_g, Z, g]$ are functions and, consequently, it is defined as

$$(g \circ f)(x) = g(f(x)).$$



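2.2 Example. For a fixed subset $A \subseteq X$, define the indicator function $[X,\mathbb{R},\mathbf{1}_A]$ as

$$\mathbf{1}_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \in A^c. \end{cases}$$

Then, $[X,\mathbb{R},\mathbf{1}_A]$ is an into map, while $[X,\{0,1\},\mathbf{1}_A]$ is an onto map (provided $A$ is a nonempty proper subset of $X$).

2.3 Definition. Let $f: X \to Y$ and let $A \subseteq X$. Then define

$$\operatorname{Res}_A f = f \cap (A \times Y), \quad \text{i.e.,}\ (\operatorname{Res}_A f)(x) = f(x),\ x \in A.$$

This function is called the restriction of $f$ to $A$. On the other hand, the function $f$ is called an extension of the function $\operatorname{Res}_A f$ from $A$ to $X$. □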

2.4 Example. Consider $[\mathbb{R},[-1,1],\sin]$, which is surjective (i.e., onto) but not injective (one-to-one). Take a restriction of the function $[\mathbb{R},[-1,1],\sin]$ to one of the largest subsets $A$ of $\mathbb{R}$ where $[\mathbb{R},[-1,1],\sin]$ is monotone increasing. It is plausible to set $A = [-\frac{\pi}{2},\frac{\pi}{2}]$ since it is also symmetric about the $Y$-axis. Then $[A,[-1,1],\operatorname{Res}_A \sin]$ is obviously bijective and its inverse is the well-known function $[[-1,1],A,\arcsin]$. □

2.5 Remark. Let $[X,Y,f]$ be a single-valued function such that for some $y \in R_f$, $f^*(\{y\}) = \{x_1,x_2,x_3\} \subseteq X$. Consider the composition $f_* \circ f^*$ and find that

$$f_* \circ f^*(\{y\}) = f_*(\{x_1,x_2,x_3\}) = \{y\}.$$

Thus, if $f$ is single-valued, the restriction of $f \circ f^{-1}$ to $R_f$ is the identity function (denoted $I$, with the domain $D_{f^{-1}} = R_f$). However, $f^{-1} \circ f$ need not be a single-valued function at all (show it). $f^{-1} \circ f$ is the identity function only when $f$ is injective. □
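The maps $f_*$ and $f^*$ can be tried out on a small finite function. The following Python sketch is only an illustration: the helper names `image` and `preimage`, the dictionary encoding of $f$, and the sample function $x \mapsto x^2$ on $\{-2,\ldots,2\}$ are assumptions made here, not notation from the text.

```python
def image(f, D):
    """f_*(D): the set of images of the points of D that lie in the domain of f."""
    return {f[x] for x in D if x in f}


def preimage(f, B):
    """f^*(B): the set of points of the domain whose image lies in B."""
    return {x for x in f if f[x] in B}


# Hypothetical finite example: f(x) = x^2 on X = {-2, ..., 2}.
X = {-2, -1, 0, 1, 2}
f = {x: x * x for x in X}  # a function stored as the set of pairs (x, f(x))

range_f = image(f, X)                # R_f = f_*(X)
fibre = preimage(f, {4})             # f^*({4}), i.e. f^{-1}(4) regarded as a set
back = image(f, preimage(f, {4}))    # f_* o f^* applied to {4}
bigger = preimage(f, image(f, {2}))  # f^* o f_* applied to {2}

print(range_f, fibre, back, bigger)
```

In this example $f_* \circ f^*$ returns the singleton $\{4\}$ it started from, while $f^* \circ f_*$ enlarges $\{2\}$ to $\{-2, 2\}$, which is what one expects of a map that is not one-to-one.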

PROBLEMS

2.1 Find the image of $[-3,5)$ under $\mathbf{1}_{(1,2]}$.

2.2 Find the inverse image of (&4] under ~1.

2.3 Composition:

a) Show that the composition operator is associative.
b) Show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
c) Show that $D_{g \circ f} = D_f \cap f^*(D_g)$.


2.4 Show the equivalence of the following statements:

a) $f$ is one-to-one.
b) $f_*(A \cap B) = f_*(A) \cap f_*(B)$.
c) For every pair $A$ and $B$ of disjoint sets, $f_*(A) \cap f_*(B) = \emptyset$.

In the following problems we assume that $f$ is a map from $X$ into $Y$.

2.5 Show that $A \subseteq X \Rightarrow A \subseteq f^* \circ f_*(A)$.

2.6 Show that $\forall B \subseteq Y$, $f_* \circ f^*(B) \subseteq B$.

2.7 Show that $[X,Y,f]$ is onto if and only if $f_* \circ f^*(B) = B$ holds $\forall B \subseteq Y$.


NEW TERMS: Cartesian (direct) product 11 binary relation, 11 function 11 map 11 mapping 11 domain 11 range 12 codomain 12 image of a point 12 multi-valued function 12 image of a set 12 branch of a function 12 inverse image of function f, 13 pre-image 13 inverse of function f 13 one-to-one (injective, invertible) map 13 into map 13 onto (surjective) map 13 bijective (onto and one-to-one) map 13 composition of binary relations 13 composition of maps 13 indicator function 14 restriction of a map 14 extension of a map 14 identity function 14

3. Set Operations under Maps

The most remarkable property of the inverse of a function is that it "preserves" all set operations. The function itself, as we shall see, does not have such a quality. The main theorems in this section will be proved for special cases of surjective maps; the rest will be left for the reader.

3.1 Theorem. Let $[X,Y,f]$ be a surjective map and let $B \subseteq Y$. Then

$$f^*(B^c) = [f^*(B)]^c.$$

Proof. We prove an equivalent statement, $f^*(B) + f^*(B^c) = X$, i.e., we show that (i) $f^*(B)$ and $f^*(B^c)$ are disjoint and (ii) $f^*(B)$ complements $f^*(B^c)$ up to $X$. We start with:

(i) Suppose $f^*(B)$ and $f^*(B^c)$ have a common point $x$. Then there is $y_1 \in B$ such that $f(x) = y_1$ and $y_2 \in B^c$ such that $f(x) = y_2$. Thus, $y_1 \ne y_2$ and $f$ is not a single-valued function. (See Figure 3.1.)

Figure 3.1



(ii) If $f^*(B)$ does not complement $f^*(B^c)$ up to $X$, there will be at least one point $x$ which does not belong to either of these sets (for they are disjoint as shown above). This is an obvious contradiction, since it follows that $f(x) \notin Y$. (See Figure 3.2 below.)

Figure 3.2

□

3.2 Example. Let $[X,Y,f]$ be a function. Then $[f^*(Y)]^c = X^c = \emptyset$. On the other hand, setting $B = Y$, by Problem 3.1, we obtain

$$[f^*(Y)]^c = f^*(Y^c) = f^*(\emptyset), \quad \text{so that}\ f^*(\emptyset) = \emptyset.$$

3.3 Theorem. Let $[X,Y,f]$ be a surjective map. Then $B_1 \subseteq B_2 \subseteq Y$ implies that $f^*(B_1) \subseteq f^*(B_2)$.

Proof. Suppose that $f^*(B_1)$ is not a subset of $f^*(B_2)$. This implies the existence of a point $x$ which belongs to $f^*(B_1)$ and does not belong to $f^*(B_2)$. Therefore, there is exactly one point $y \in B_1$ with $f(x) = y$. On the other hand, since $x \notin f^*(B_2)$, $f(x)$ cannot belong to $B_2$. But it must, since $f(x) = y \in B_1 \subseteq B_2$. (See Figure 3.3 below.) Hence, our assumption above was wrong.


Figure 3.3

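3.4 Theorem. Let $f: X \to Y$ be an onto map and let $\{B_i: i \in I\}$ be an indexed family of subsets of $Y$. Then,

$$f^*\Big(\bigcup_{i \in I} B_i\Big) = \bigcup_{i \in I} f^*(B_i).$$

Proof.

(i) We prove that

$$\bigcup_{i \in I} f^*(B_i) \subseteq f^*\Big(\bigcup_{i \in I} B_i\Big).$$

Let $x \in \bigcup_{i \in I} f^*(B_i)$. Then there is an index $i_0 \in I$ such that $x \in f^*(B_{i_0})$. Since $B_{i_0} \subseteq \bigcup_{i \in I} B_i$, by Theorem 3.3, $f^*(B_{i_0}) \subseteq f^*\big(\bigcup_{i \in I} B_i\big)$, which implies that $x \in f^*\big(\bigcup_{i \in I} B_i\big)$.

(ii) We show the validity of the inverse inclusion,

$$f^*\Big(\bigcup_{i \in I} B_i\Big) \subseteq \bigcup_{i \in I} f^*(B_i).$$

Let $x \in f^*\big(\bigcup_{i \in I} B_i\big)$. Then $f(x) \in \bigcup_{i \in I} B_i$. Therefore, there is an index $i_0 \in I$ such that $f(x) \in B_{i_0}$, or equivalently, $\{f(x)\} \subseteq B_{i_0}$. By Theorem 3.3, it follows that $f^*(\{f(x)\}) \subseteq f^*(B_{i_0})$. Since $x \in f^*(\{f(x)\})$, we have

$$x \in f^*(B_{i_0}) \subseteq \bigcup_{i \in I} f^*(B_i).$$

□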
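Theorems 3.1, 3.3 and 3.4 can be checked by brute force on a small example. The Python sketch below is an illustration only: the helpers `image`, `preimage` and `subsets`, and the particular onto map on four points, are assumptions introduced here. It confirms on this one example that $f^*$ preserves complements, unions, intersections and inclusions, and that $f_*$ may fail to preserve intersections.

```python
from itertools import combinations


def image(f, D):
    """f_*(D) for a finite function f stored as a dict."""
    return {f[x] for x in D if x in f}


def preimage(f, B):
    """f^*(B) for a finite function f stored as a dict."""
    return {x for x in f if f[x] in B}


def subsets(S):
    """All subsets of the finite set S (its power set)."""
    items = list(S)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Hypothetical onto map that is not one-to-one.
X = {0, 1, 2, 3}
Y = {"a", "b", "c"}
f = {0: "a", 1: "a", 2: "b", 3: "c"}

preimage_ok = all(
    preimage(f, Y - B) == X - preimage(f, B)                        # Theorem 3.1
    and all(
        preimage(f, B | C) == preimage(f, B) | preimage(f, C)       # Theorem 3.4
        and preimage(f, B & C) == preimage(f, B) & preimage(f, C)   # Problem 3.4 a)
        and (not B <= C or preimage(f, B) <= preimage(f, C))        # Theorem 3.3
        for C in subsets(Y)
    )
    for B in subsets(Y)
)

# The image map need not preserve intersections.
A1, A2 = {0}, {1}
lhs = image(f, A1 & A2)             # image of the empty intersection
rhs = image(f, A1) & image(f, A2)   # both points are sent to "a"

print(preimage_ok, lhs, rhs)
```

A finite check of this kind is of course not a proof; it only shows which identities survive on one example and which one fails.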



PROBLEMS

3.1 Prove Theorem 3.1 under the condition that $f$ is an into map.

3.2 Prove Theorem 3.3 under the condition that $f$ is an into map.

3.3 Generalize Theorem 3.4 when $f$ is an into map.

3.4 Let $[X,Y,f]$ be an into map and let $\{B_i: i \in I\}$ be an indexed family of subsets of $Y$.

a) Prove that $f^*\big(\bigcap_{i \in I} B_i\big) = \bigcap_{i \in I} f^*(B_i)$.
b) If $\{B_i: i \in I\}$ is a pairwise disjoint family, show that

$$f^*\Big(\sum_{i \in I} B_i\Big) = \sum_{i \in I} f^*(B_i).$$

3.5 Show that $f^*(A \setminus B) = f^*(A) \setminus f^*(B)$.

3.6 The results above prove that all set operations are closed under the inverses of maps. Show that not all set operations are closed under maps as per the following.

a) Show that maps preserve inclusions.
b) Show that maps preserve unions.
c) Show that maps do not preserve intersections; specifically, show that

$$f_*(A \cap B) \subseteq f_*(A) \cap f_*(B)$$

and that the inverse inclusion need not hold. Explain the latter without a counterexample.
d) Do maps preserve the difference?

3.7 Let $[X,Y,f]$ be a map and let $A \subseteq Y$. Show that

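3.8 Prove the following properties of the indicator function defined on a nonempty set $\Omega$:

(i) $\mathbf{1}_{A \cap B} = \min\{\mathbf{1}_A, \mathbf{1}_B\} = \mathbf{1}_A \mathbf{1}_B$.

(iii) $\mathbf{1}_{A+B} = \mathbf{1}_A + \mathbf{1}_B$.

(vi) $A \subseteq B \Rightarrow \mathbf{1}_A \le \mathbf{1}_B$.

(vii) $\mathbf{1}_{\bigcup_{i \in I} A_i} = \sup\{\mathbf{1}_{A_i}: i \in I\}$, $\mathbf{1}_{\bigcap_{i \in I} A_i} = \inf\{\mathbf{1}_{A_i}: i \in I\}$.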

3.9 Let $\{A_n\}$ be a sequence of subsets of $\Omega$. Show that the function $\underline{\lim}\, \mathbf{1}_{A_n}$ is the indicator function of the set $\underline{\lim}\, A_n$ and that the function $\overline{\lim}\, \mathbf{1}_{A_n}$ is the indicator function of the set $\overline{\lim}\, A_n$.

3.10 Prove that $\lim_{n \to \infty} A_n$ exists if and only if $\lim_{n \to \infty} \mathbf{1}_{A_n}$ exists. [Hint: Use Problem 3.9.]

3.11 Let $[X,X',F]$ be a bijective map and let $\tau$ and $\tau'$ be respective collections of subsets of $X$ and $X'$ such that $F^{**}(\tau') \subseteq \tau$ and $F_{**}(\tau) \subseteq \tau'$. Show that $F^{**}(\tau') = \tau$ and $F_{**}(\tau) = \tau'$.



4. RELATIONS AND WELL-ORDERING PRINCIPLE

In Definition 2.1 (ii) we introduced the concept of a binary relation $R$ as an arbitrary subset of $A \times B$. In the special case when $R \subseteq A \times B$ and $A = B$, we call $R$ a binary relation on $A$. We will sometimes use the notation $aRb$ instead of $(a,b) \in R$. This notation makes sense, for instance, if $R$ is stipulated by $<$ or $\le$ on some set. In addition, we will also say that a pair $(A,R)$ is a binary relation, where in fact $R$ is a binary relation on a set $A$ (a carrier). Now we consider some special relations.

4.1 Definitions. Let $R$ be a binary relation on $S$.

(i) $R$ is called reflexive if $\forall a \in S$, $(a,a) \in R$ [$aRa$].

(ii) $R$ is called symmetric if $(a,b) \in R \Rightarrow (b,a) \in R$ [$aRb \Rightarrow bRa$].

(iii) $R$ is called antisymmetric if $(a,b), (b,a) \in R \Rightarrow a = b$ [$aRb \wedge bRa \Rightarrow a = b$].

(iv) $R$ is called transitive if $(a,b), (b,c) \in R \Rightarrow (a,c) \in R$ [$aRb \wedge bRc \Rightarrow aRc$].

(v) $R$ is called an equivalence on $S$ (denoted by the symbol $\approx$ or $E$) if it is reflexive, symmetric and transitive.

[Observe that the equivalence $E$ on $S$ partitions $S$ into mutually disjoint subsets, called equivalence classes. A partition of $S$ is a family of disjoint subsets of $S$ whose union is a decomposition of $S$. The elements of $S$ "communicate" only within these classes. Therefore, every equivalence relation generates mutually disjoint classes. The converse is also true: an arbitrary partition of the carrier $S$ generates an equivalence relation.]

(vi) $R$ is called a partial order (denoted by the symbol $\preceq$) if it is reflexive, antisymmetric and transitive.

(vii) If $\preceq$ is a partial order, it is called linear or total if every two elements of $S$ are comparable, i.e., $\forall a,b \in S$ either $a \preceq b$ or $b \preceq a$.

(viii) Let $S$ be an arbitrary set and let $\approx$ $(E)$ be an equivalence relation on $S$. For $t \in S$ denote

$$[t]_\approx\ (= [t]_E) = \{s \in S: s \approx t\}$$

and call it an equivalence class modulo $\approx$ $(E)$. The set of all equivalence classes

$$S|_\approx = \{[t]_\approx: t \in S\}$$

is said to be the quotient (or factor) set of $S$ modulo $\approx$. It is easily seen that a quotient set of $S$ is also a partition of $S$. Note that $x \mapsto [x]_\approx$ is a function assigning to each $x \in S$ an equivalence class $[x]_\approx$. We will denote this function by $\pi_E$ (or $\pi_\approx$) and call it the projection of $S$ on its quotient by $E$ (or $\approx$).
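For a finite carrier, the quotient set and the projection of Definition 4.1 (viii) can be computed directly. The following Python sketch is a hedged illustration: the function names `quotient_set` and `projection` and the choice of congruence modulo 3 on $\{0,\ldots,9\}$ are assumptions made here, and the code relies on the supplied relation really being an equivalence.

```python
def quotient_set(S, related):
    """Partition S into equivalence classes of the relation `related`.

    Assumes `related` is reflexive, symmetric and transitive, so comparing
    with a single representative of each class is enough.
    """
    classes = []
    for s in S:
        for cls in classes:
            if related(s, next(iter(cls))):
                cls.add(s)
                break
        else:
            classes.append({s})
    return classes


def projection(classes):
    """The projection x -> [x], stored as a dict from points to classes."""
    return {s: frozenset(cls) for cls in classes for s in cls}


# Hypothetical example: congruence modulo 3 on S = {0, ..., 9}.
S = range(10)
classes = quotient_set(S, lambda a, b: (a - b) % 3 == 0)
pi = projection(classes)

print(sorted(sorted(c) for c in classes))
print(pi[4] == pi[7], pi[4] == pi[5])
```

Two points have the same image under `pi` exactly when they are related, which is the property used later when equivalence kernels are discussed.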

4.2 Examples.

(i) $(\mathbb{R}, =)$ is an equivalence relation. Therefore, every real number as a singleton represents an equivalence class.

(ii) $(\mathbb{R}, \le)$ is a linear order.

(iii) Congruent triangles on a plane offer an equivalence relation on the set of all triangles. [Two sets $A$ and $B$ are called congruent if there exists an "isometric" bijective map $f: A \to B$, i.e., $f$ must preserve the "distance" for every pair of points $a,b \in A$ and their images $f(a), f(b) \in B$.]

(iv) $(\mathbb{R}^2, \le)$ is not a linear order if we define "$\le$" as $(a_1,b_1) \le (a_2,b_2)$ if and only if $a_1 \le a_2 \wedge b_1 \le b_2$. To make this relation a linear order we can define, for instance, $(a_1,b_1) \le (a_2,b_2)$ if and only if $\|(a_1,b_1)\| \le \|(a_2,b_2)\|$, where $\|(a,b)\|$ is the distance of point $(a,b)$ from the origin.

(v) Let $\mid$ be the relation on $\mathbb{N}$ such that $n \mid m$ if and only if $n$ divides $m$ (without a remainder). It can be shown that $(\mathbb{N}, \mid)$ is a partial order but not a linear order. (See Problem 4.5.)

(vi) Let $p$ be a fixed integer greater than or equal to 2. Two integers $a$ and $b$ are called congruent modulo $p$ if $a - b$ is divisible by $p$ (without remainder); in notation we write $p \mid a - b$ or $a \equiv b \pmod{p}$. The number $p$ is called the modulus of congruence. Let

$$[m]_p = \{n \in \mathbb{Z}: m \equiv n \pmod{p}\} \quad (m \in \mathbb{Z}).$$

In other words,

$$[m]_p = \{m + kp: k \in \mathbb{Z}\}.$$

Then any two integers $m$ and $n$ are related in terms of $[\cdot]_p$ if and only if $n \in [m]_p$. This is an equivalence relation. (Show it; see Problem 4.1.)

(vii) Let $S$ be a nonempty set and $R \subseteq S \times S$ be a binary relation. Taking for $R$ the diagonal $D = \{(s,s): s \in S\}$ we have with $(S,D)$ the "smallest" (by the contents of elements of $S \times S$) equivalence relation on $S$, where each element forms a singleton-class, and $D$ partitions $S$ into singleton classes $\{s\}$. The "largest" equivalence relation on $S$ is obviously $R = S \times S$ itself and it consists of the single class.

(viii) Any function $[X,Y,f]$ generates an equivalence relation on its domain $X$ partitioning $X$ into disjoint subsets. Define the binary relation $E_f$ $(\approx_f)$ on $X$ as

$$x\, E_f\, x' \iff f(x) = f(x').$$

Then, it is readily seen that $E_f$ is an equivalence relation on $X$, referred to as the equivalence kernel of the function $f$. Formally, for every point $y \in f_*(X)$, the pre-image $f^{-1}(y)$ is an equivalence class in $X$ and $\{[f^{-1}(y)]_{E_f}: y \in f_*(X)\}$ is the quotient set of $X$ modulo $E_f$ (or $\approx_f$). Furthermore,

$$X = \sum_{y \in f_*(X)} f^{-1}(y)$$

is a decomposition of $X$.

For instance, the function $f(x) = x^2$ generates a partition of $\mathbb{R}$ into a collection of subsets of the form $\{-a,a\}$, for $a > 0$, along with $\{0\}$, which is a factor set of $\mathbb{R}$ modulo $E_{x^2}$.

Another example is the function $\tan$ on its domain $X \subseteq \mathbb{R}$. Let

$$A_y = \tan^{-1}(y) = \{\arctan y + \pi n: n \in \mathbb{Z}\} = [\arctan y]_{E_{\tan}}.$$

Then, $E_{\tan}$ is the equivalence kernel of the function $\tan$,

$$X|_{E_{\tan}} = \{A_y: y \in \mathbb{R}\}$$

(the quotient set of $X$ modulo $E_{\tan}$) and

$$X = \sum_{y \in \mathbb{R}} A_y.$$

The last discussion about the equivalence relation generated by a function yields some important results and notions we would like to use in the upcoming materials of Chapters 6 and 8. While we demonstrated in Example 4.2 (viii) that any function on $X$ generates an equivalence relation, the following proposition states that the converse is also true; namely, that any equivalence relation $E$ is the equivalence kernel of some function.

4.3 Proposition. Let $E$ be an equivalence relation on a nonempty set $X$. Then the projection $[X, X|_E, \pi_E]$ is an onto map with $E$ as the equivalence kernel.

Proof. From the definition of $\pi_E$ it follows that $\pi_E$ is surjective. To claim that $E$ is the equivalence kernel of $\pi_E$, we need to show that $\pi_E(x) = \pi_E(y)$ if and only if $xEy$. Let $\pi_E(x) = \pi_E(y)$. Since $xEx$, $x \in [x]_E$ and therefore, by the assumption ($\pi_E(x) = \pi_E(y)$), $x \in [y]_E$. This proves that $xEy$. Now let $xEz$. If $y \in [x]_E$, then $yEx$ and thus, by transitivity, $yEz$, i.e., $y \in [z]_E$. Therefore, $[x]_E \subseteq [z]_E$. The inverse inclusion, and thus the equality, is due to the symmetry of $E$. Hence, $\pi_E(x) = \pi_E(z)$. □

Proposition 4.3 asserts that the projection $\pi_E$ is a trivial example of an onto function defined on $X$ and with the range $X|_E$. Now suppose $E$ is an equivalence relation on a set $X$ and $[X,Y,f]$ is any function whose equivalence kernel is $E$. The following theorem claims that there is a unique "mediator" $\tilde{f}$ between the quotient set $X|_E$ and the codomain $Y$ of $f$.

4.4 Theorem. Let $E$ be an equivalence relation on a nonempty set $X$ and $[X,Y,f]$ be a function whose equivalence kernel is $E$. Then there is a unique function $[X|_E, Y, \tilde{f}]$ such that $f = \tilde{f} \circ \pi_E$.

The reader shall be able to take care of this theorem (Problem 4.10) as well as of Corollaries 4.5 and 4.6 (Problems 4.11 and 4.12).

4.5 Corollary. Under the conditions of Theorem 4.4, if $f$ is onto, then $\tilde{f}$ is bijective. □

4.6 Corollary. Let $[X,Y,f]$ be a function and let $E_f$ denote its equivalence kernel. Then, there is a unique one-to-one function $[X|_{E_f}, Y, \tilde{f}]$ such that $f$ can be represented as a composition

$$f = \tilde{f} \circ \pi_{E_f}.$$

Furthermore, $\tilde{f}$ is bijective if $f$ is surjective (onto).
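The factorization of a function through the quotient by its equivalence kernel can be traced on a finite example. The Python sketch below is illustrative only: the names `kernel_classes`, `pi` and `f_tilde`, and the sample function $x \mapsto x^2$ on $\{-3,\ldots,3\}$, are assumptions chosen here rather than constructions taken from the text.

```python
def kernel_classes(f, X):
    """Quotient set of X modulo the equivalence kernel of f (classes are the fibres of f)."""
    fibres = {}
    for x in X:
        fibres.setdefault(f(x), set()).add(x)
    return {frozenset(c) for c in fibres.values()}


def square(x):
    return x * x


# Hypothetical example: f(x) = x^2 on X = {-3, ..., 3}.
X = range(-3, 4)
quotient = kernel_classes(square, X)

# Projection onto the quotient: each point goes to the class containing it.
pi = {x: next(c for c in quotient if x in c) for x in X}

# The "mediator": defined on classes through any representative.
f_tilde = {c: square(next(iter(c))) for c in quotient}

factorizes = all(f_tilde[pi[x]] == square(x) for x in X)
one_to_one = len(set(f_tilde.values())) == len(f_tilde)

print(sorted(sorted(c) for c in quotient), factorizes, one_to_one)
```

The mediator is well defined because all representatives of a class share the same value of $f$, and it is one-to-one because distinct classes correspond to distinct values.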

Now, we turn to a discussion on the partial order relation and all relevant notions and theorems, which we are going to apply throughout the book.

4.7 Definitions. Let $(A, \preceq)$ be a partial order and let $B \subseteq A$. Clearly, $(B, \preceq)$ is also a partial order.

(i) The partial order $(B, \preceq)$ is called a chain in $(A, \preceq)$ if it is linear.

(ii) An element $b_0 \in B$ is called a minimal element of $B$ (relative to $\preceq$) if for each $b \in B$ with $b \preceq b_0$, $b = b_0$ (compared with the smallest element $b_0$, which is $\preceq b$ for all $b \in B$).

(iii) An element $b_1 \in B$ is called a maximal element of $B$ (relative to $\preceq$), if for each $b \in B$ with $b_1 \preceq b$, it holds true that $b = b_1$ (compared with the largest element $b_1$, which is such that $b \preceq b_1$ $\forall b \in B$). [Observe that the difference between a minimal element and the smallest element of a set is as follows. A minimal element $b_0$ is $\preceq b \in B$ whenever $b_0$ is comparable with some $b$. In addition, the smallest element is comparable with all elements of $B$.]

(iv) An element $u \in A$ is said to be an upper bound of $B$ if $b \preceq u$ $\forall b \in B$. An element $l \in A$ is said to be a lower bound of $B$ if $l \preceq b$ $\forall b \in B$. If $B$ has lower and upper bounds then $B$ is called bounded (or $\preceq$-bounded).

(v) If the set of upper bounds of $B$ has a smallest element $u_0$ then this element is called the least upper bound of the set $B$ (abbreviated lub($B$)) or supremum (sup($B$)). Similarly, if the set of all lower bounds has a largest element $l_0$ then it is called the greatest lower bound of the set $B$ (in notation glb($B$)) or infimum (inf($B$)). [For instance, 0 is the glb((0,1)) or inf(0,1) in $(\mathbb{R}, \le)$, while a lub of the set $[1,\sqrt{2}) \cap \mathbb{Q}$ does not exist in $(\mathbb{Q}, \le)$.]

(vi) Let $B$ contain at least two points. The partial order $(B, \preceq)$ is called a lattice if every two-element subset of $B$ has a supremum and an infimum and they are also elements of $B$. [In notation: if $B = \{x,y\}$, then

$$x \vee y = \sup\{x,y\}$$

and

$$x \wedge y = \inf\{x,y\}.]$$

4.8 Examples.

(i) Let $B = \{1, 3, 3^2, \ldots, 3^n, \ldots\}$. Then $(B, \mid)$ (where $\mid$ is the relation in Example 4.2 (v)) is a chain in $(\mathbb{N}, \mid)$.

(ii) Let $B = \{2,3,4,\ldots\}$ and consider the relation $\mid$ on $B$. In terms of this relation, the set of all prime numbers $\{2,3,5,7,11,\ldots\}$ is the set of all minimal elements, while there is no smallest element in $B$, since there is no minimal element related to all other elements. $B$ does not have a maximal element either.

(iii) Consider the partial order $(\mathcal{P}(\Omega), \subseteq)$. It is obvious that for an arbitrary subcollection $\mathcal{A} = \{A_i \subseteq \Omega: i \in I\} \subseteq \mathcal{P}(\Omega)$, it is true that

$$\sup \mathcal{A} = \bigcup_{i \in I} A_i \in \mathcal{P}(\Omega) \quad \text{and} \quad \inf \mathcal{A} = \bigcap_{i \in I} A_i \in \mathcal{P}(\Omega).$$

In particular, it holds true for pairs of subsets. Thus, $(\mathcal{P}(\Omega), \subseteq)$ is a lattice. □

4.9 Definition. A linear order $(A, \preceq)$ is said to be well-ordered if every nonempty subset of $A$ has a smallest element in the sense of the same order $\preceq$.

4.10 Example. Let $\mathbb{R}$ be the set of all real numbers and consider the relation $(\mathbb{R}, \le)$, which is clearly a linear order. However, $\mathbb{R}$ is not well-ordered by $\le$, for there are nonempty subsets containing no smallest element, such as $(0,1)$. But $(\mathbb{N}, \le)$ is well-ordered.
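The distinction between minimal elements and a smallest element under the divisibility order can be explored on a finite truncation. The Python sketch below is an illustration under stated assumptions: the helper names `minimal_elements` and `smallest_element` are introduced here, and the infinite set $\{2,3,4,\ldots\}$ is replaced by $\{2,\ldots,30\}$ so that the search terminates.

```python
def minimal_elements(B, precedes):
    """Elements b0 of B such that b precedes b0 only for b = b0."""
    return {b0 for b0 in B if all(b == b0 for b in B if precedes(b, b0))}


def smallest_element(B, precedes):
    """The element preceding every element of B, or None if there is none."""
    for b0 in B:
        if all(precedes(b0, b) for b in B):
            return b0
    return None


def divides(n, m):
    """The relation n | m on positive integers."""
    return m % n == 0


# Finite truncation of B = {2, 3, 4, ...} (an assumption made for computability).
B = set(range(2, 31))
mins = minimal_elements(B, divides)
smallest = smallest_element(B, divides)

# Powers of 3 form a chain: any two of them are comparable.
chain = [3 ** k for k in range(5)]
is_chain = all(divides(a, b) or divides(b, a) for a in chain for b in chain)

print(sorted(mins), smallest, is_chain)
```

On the truncated set the minimal elements are exactly the primes up to 30 and no smallest element is found, in line with Example 4.8 (ii). Maximal elements do exist in the truncation, which is an artifact of cutting the set off at 30.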

$x = (x_0, x_1, \ldots)$ such that $\sum_{n=0}^{\infty} |x_n|^p < \infty$, where $p \in [1,\infty)$. Define the following operation on $l^p$. For $x$ and $y$, let $z = (z_0, z_1, \ldots) = x * y$ be such that

$$z_n = \sum_{k=0}^{n} x_k y_{n-k}$$

(called discrete convolution). The operation $*$ is commutative and associative and it is closed in $l^p$ (see Problem 7.11). Obviously, $\mathbf{1} = (1,0,0,\ldots)$ is the unity of $(l^p,*)$ and thus $(l^p,*)$ is an Abelian monoid. Let $x = (x_0,x_1,\ldots) \in l^p$ be such that $x_0 \ne 0$. Define $y = (y_0,y_1,\ldots)$ such that $y_0 = \frac{1}{x_0}$. For $n \ge 1$, $y_n$ can be determined recursively from the equations

$$\sum_{k=0}^{n} x_k y_{n-k} = 0.$$

For instance, $y_1 = -\frac{x_1 y_0}{x_0} = -\frac{x_1}{x_0^2}$.

In conclusion, for each $x$ with $x_0 \ne 0$, there is a unique element $y = x^{-1}$. On the other hand, if $l_0^p$ denotes the subset of all elements $x \in l^p$ with $x_0 = 0$ then $l_0^p$ and its complement $l^p \setminus l_0^p$ relative to $l^p$ are two equivalence classes induced by $*$. This implies that $(l^p \setminus l_0^p, *)$ is a commutative group. Obviously, the triple $(l^p, +, *)$ is a commutative ring with unity.
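The recursion for the convolution inverse can be carried out term by term. The Python sketch below is a hedged illustration: the function names `convolve` and `convolution_inverse`, the truncation to finitely many terms, and the sample sequence $(1, -\tfrac12, 0, 0, \ldots)$ are assumptions made here. Exact rational arithmetic is used so that the check against the unity $(1,0,0,\ldots)$ involves no rounding.

```python
from fractions import Fraction


def convolve(x, y):
    """First terms of the discrete convolution: z_n = sum_{k=0}^{n} x_k * y_{n-k}."""
    n_terms = min(len(x), len(y))
    return [sum(x[k] * y[n - k] for k in range(n + 1)) for n in range(n_terms)]


def convolution_inverse(x, terms):
    """First `terms` entries of y with x * y = (1, 0, 0, ...), assuming x_0 != 0."""
    if x[0] == 0:
        raise ValueError("x_0 must be nonzero for the inverse to exist")
    y = [Fraction(1) / x[0]]
    for n in range(1, terms):
        # Solve x_0 * y_n + sum_{k=1}^{n} x_k * y_{n-k} = 0 for y_n.
        tail = sum(x[k] * y[n - k] for k in range(1, n + 1) if k < len(x))
        y.append(-tail / x[0])
    return y


# Hypothetical sample sequence x = (1, -1/2, 0, 0, ...), truncated to six terms.
x = [Fraction(1), Fraction(-1, 2)] + [Fraction(0)] * 4
y = convolution_inverse(x, 6)
unit = convolve(x, y)

print(y)
print(unit)
```

For this sample the recursion produces the geometric sequence $y_n = 2^{-n}$, and convolving it with $x$ returns the first six terms of the unity.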



Now, let $\mathcal{Y}$ be the space of all complex-valued functions analytic at zero and not equal to zero at the origin. This space is closed with respect to multiplication. Hence, $(\mathcal{Y}, \cdot)$ is an Abelian group. Indeed, $u \equiv 1$ is the unity and for each $x \in \mathcal{Y}$, $\frac{1}{x}$ is analytic at zero and it is a two-sided inverse of $x$. Obviously, each $x \in \mathcal{Y}$ can be expanded in Taylor series at zero, such that $x$ is uniquely associated with the sequence

If F is defined as F(z) = x and F ( l ) = 0, then [ l p \ l ~ , ~is, a~ group ] homomorphism such that

Notice that $F^{-1}(x)$ need not be an element of $l^p \setminus l_0^p$, since $\sum_n |x_n|^p$ may be a divergent series.

(xi) Let $L^p$ ($p \ge 1$) denote the class of all real-valued functions $\{[\mathbb{R},\mathbb{R},f]\}$ such that $\int_{\mathbb{R}} |f|^p < \infty$. Define on $L^p$ the operation $*$ as follows.

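The operation $*$ is closed in $L^p$ and it is commutative and associative (see Problem 7.12). Define the function

$$f(u,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\Big\{-\frac{u^2}{2\sigma^2}\Big\} \quad \text{for}\ \sigma > 0\ \text{and}\ u \in \mathbb{R}.$$

This function is a well-known probability density function of a normal random variable with mean 0 and variance $\sigma^2$. Consequently,

$$\int_{-\infty}^{\infty} f(u,\sigma)\, du = 1.$$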

From the theory of probability, it is also known that the lion's share of the integral under the curve $f$ (over 99%) is concentrated over the interval $(-3\sigma, 3\sigma)$. The function $f$ has its maximum value at 0, equal to $\frac{1}{\sigma\sqrt{2\pi}}$ (approximately $0.399/\sigma$). Now, if we let $\sigma \to 0+$, the resulting function is called the (Dirac) delta function, in notation, $\delta$. It is readily seen that the delta equals 0 on $\mathbb{R} \setminus \{0\}$ and $\infty$ at 0, and that $\int_{\mathbb{R}} \delta = 1$. There is an alternative integral representation of the delta function. Recall that the Fourier transform of $f$ is

and that $f$ can be restored by applying the inverse Fourier transform to its image as follows:

Again, letting $\sigma \to 0$, we arrive at

By using this integral representation it will be easy to show that $\delta$ is the unity of the $*$ operation:

Since the expression in parentheses is $\hat{x}(\theta)$, that denotes the Fourier transform of $x$, the rest is the inverse Fourier operator, which should restore $x$ at $u$. So, $x * \delta = x$. According to Problem 7.1, $\delta$ is a unique unity of operation $*$. Since $\delta \ge 0$ and because $\int_{\mathbb{R}} \delta = 1$, $\delta$ is an element of $L^p$. This all implies that $(L^p, *)$ is a commutative monoid and, therefore, $(L^p, +, *)$ is a commutative ring with unity.

(xii) As an application of the last example, consider the discrete indexed family of functions $\{f^{n*},\ n = 0,1,\ldots\}$ defined as follows:

Then, $f^{(n+k)*} = f^{n*} * f^{k*}$, and therefore $(\{f^{n*}: n = 0,1,\ldots\}, *)$ is a semigroup.

(xiii) Let $\mathcal{B} = \mathcal{B}(\mathbb{R};\mathbb{R})$ denote the space of all bounded real-valued functions. For a function $A \in \mathcal{B}$, define

$$e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!}$$

in agreement with Example (vii). Obviously, for each $u$, the above series converges absolutely, since there is a positive constant $M$ such that

$$|A(u)| \le M,$$

so that $e^A$ is again an element of $\mathcal{B}$. For a fixed $A$, define the family of functions $f_t = e^{tA}$, $t \ge 0$. From the above definition of $e^A$ it follows that $f_0 = 1$. It is easy to show that $e^{sA} e^{tA} = e^{(s+t)A}$. Indeed,

The last expression yields $e^{(s+t)A}$ on letting $n \to \infty$. Consequently, $(\{e^{tA}\}, \cdot)$ is a semigroup defined in (b). This example can be generalized for operators, for instance, square matrices. To discuss such cases rigorously, one would require the concept of the "norm" of operators treated in upcoming chapters.

□
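The semigroup property of the family $e^{tA}$ can be observed numerically for a concrete bounded function. The Python sketch below is only an illustration: the helper name `exp_series`, the truncation of the series at 30 terms, the choice $A = \sin$, and the sample values of $s$, $t$ and $u$ are all assumptions made here.

```python
import math


def exp_series(A, t, terms=30):
    """Return the function u -> sum_{n < terms} (t * A(u))**n / n!, a partial sum of e^{tA}."""
    def f_t(u):
        return sum((t * A(u)) ** n / math.factorial(n) for n in range(terms))
    return f_t


A = math.sin  # a bounded real-valued function, |A(u)| <= 1
s, t = 0.7, 1.5
points = [-2.0, -0.5, 0.0, 1.0, 3.0]

# Compare e^{sA} * e^{tA} with e^{(s+t)A} pointwise at the sample points.
max_gap = max(
    abs(exp_series(A, s)(u) * exp_series(A, t)(u) - exp_series(A, s + t)(u))
    for u in points
)

f0 = exp_series(A, 0.0)  # t = 0 should give the constant function 1

print(max_gap, f0(1.0))
```

Because $|tA(u)|$ stays small here, 30 terms already agree with the exponential to floating-point accuracy; for larger bounds $M$ more terms would be needed.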

7.5 Definitions.

(i) Let $\mathbb{F}$ be a nonempty set with two binary operations, addition ($\alpha + \beta$) and multiplication ($\alpha \cdot \beta$) [in many instances, especially for the elements of $\mathbb{F}$, we will drop the conventional multiplication symbol $\cdot$]. $(\mathbb{F},+,\cdot)$ is called a field if it is a commutative ring with unity and if for every $\alpha \ne 0$ there is a multiplicative inverse $\alpha^{-1}$.

In other words, $\mathbb{F}$ is a field if for all $\alpha,\beta,\gamma \in \mathbb{F}$,

1) (commutative law) $\alpha + \beta = \beta + \alpha$, $\alpha\beta = \beta\alpha$
2) (associative law) $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$, $(\alpha\beta)\gamma = \alpha(\beta\gamma)$
3) (zero) there is an element $0 \in \mathbb{F}$ such that $\alpha + 0 = \alpha$
4) (additive inverse) there is an element $-\alpha \in \mathbb{F}$ such that $\alpha + (-\alpha) = 0$
5) (distributive law) $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$
6) (unity) there is an element $1 \in \mathbb{F}$ such that $1\alpha = \alpha$
7) (multiplicative inverse) for every $\alpha \ne 0$, there is $\alpha^{-1} \in \mathbb{F}$ such that $\alpha\alpha^{-1} = 1$.

The elements of a field are called scalars.

(ii) Let $\mathbb{F}$ be as above with the exception that $\mathbb{F}$ does not have additive inverses. Then $\mathbb{F}$ is called a semifield. We will denote a semifield by $\mathbb{F}_+$. [The set of all rational numbers, $\mathbb{Q}$, the set of real numbers, $\mathbb{R}$, and the set of all complex numbers, $\mathbb{C}$, are typical examples of fields. The set of all nonnegative rational or real numbers and the set of complex numbers $z \in \mathbb{C}$ with $\operatorname{Re}(z) \ge 0$, are examples of semifields.]

(iii) A linear or vector space $X$ over a field $\mathbb{F}$ is a nonempty set with the binary operations addition ($+$) on $X \times X$ into $X$ and multiplication ($\cdot$) on $\mathbb{F} \times X$ into $X$ such that

1) $+$ is commutative and associative;
2) there exists an element (called an origin of $X$), $\theta \in X$, such that $0 \cdot x = \theta$, $\forall x \in X$;
3) $1 \cdot x = x$, $\forall x \in X$;
4) $\alpha(x + y) = \alpha x + \alpha y$, $(\alpha + \beta)x = \alpha x + \beta x$, $\forall \alpha,\beta \in \mathbb{F}$, $\forall x,y \in X$;
5) $\alpha(\beta x) = (\alpha\beta)x$, $\forall \alpha,\beta \in \mathbb{F}$, $\forall x \in X$.

(iv) Elements of $X$ are frequently called vectors. If $\mathbb{F} = \mathbb{R}$ then $X$ is called a real linear space. If $\mathbb{F} = \mathbb{C}$ then $X$ is called a complex linear space. If in (iii) a semifield $\mathbb{F}_+$ is taken, then we call $X$ a semi-linear space.

(v) Any subset of a linear space, which itself is a linear space, is referred to as a subspace.

(vi) A ring $(A,+,\cdot)$ is called an algebra over a field $\mathbb{F}$ if its additive (Abelian) group $(A,+)$ is a linear space over $\mathbb{F}$. An algebra over a field $\mathbb{F}$ will be denoted by $(A;\mathbb{F})$. If $(A;\mathbb{F})$ is an algebra, a pair $(A';\mathbb{F}')$ is called a subalgebra (of $(A;\mathbb{F})$) if $A' \subseteq A$, $\mathbb{F}' \subseteq \mathbb{F}$, and $(A';\mathbb{F}')$ is also an algebra. The above characteristics of commutative rings and rings with unities are hereditary for algebras.

(vii) A partially ordered linear space, which is also a lattice, is called a vector lattice. □
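The field axioms 1) to 7) of Definition 7.5 (i) can be verified mechanically on a finite structure. The Python sketch below is a hedged illustration: the integers modulo $n$ are not discussed in the text and are brought in here only as a convenient finite test case, and the function name `is_field_mod` is an assumption.

```python
from itertools import product


def is_field_mod(n):
    """Brute-force check of field axioms 1)-7) for the integers modulo n (n >= 2)."""
    F = range(n)

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    commutative = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
                      for a, b in product(F, repeat=2))
    associative = all(add(add(a, b), c) == add(a, add(b, c))
                      and mul(mul(a, b), c) == mul(a, mul(b, c))
                      for a, b, c in product(F, repeat=3))
    distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                       for a, b, c in product(F, repeat=3))
    zero_and_unity = all(add(a, 0) == a and mul(1, a) == a for a in F)
    additive_inverse = all(any(add(a, b) == 0 for b in F) for a in F)
    multiplicative_inverse = all(any(mul(a, b) == 1 for b in F)
                                 for a in F if a != 0)
    return all([commutative, associative, distributive, zero_and_unity,
                additive_inverse, multiplicative_inverse])


print(is_field_mod(5), is_field_mod(6))
```

Modulo 5 every axiom holds, while modulo 6 only axiom 7) fails (for instance, 2 has no multiplicative inverse), so that structure is a commutative ring with unity but not a field.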

7.6 Properties of Linear Spaces.

(i) By Definition 7.5 (iii), 2) and 3), we have $\theta + x = 0 \cdot x + 1 \cdot x = (0 + 1) \cdot x = x$. Therefore, the origin $\theta$ is zero and, by Problem 7.1, it is unique.

(ii) For every $x \in X$, there exists $-x$ such that $x + (-x) = \theta$. Indeed, by Definition 7.5 (iii), 2) and 4), we have

$$x + (-1)x = 1 \cdot x + (-1) \cdot x = (1 + (-1)) \cdot x = 0 \cdot x = \theta.$$

We call $(-1)x$ the additive inverse of $x$ and denote it by $-x$. Properties (i) and (ii) imply that $(X,+)$ is an Abelian group.

(iii) $\forall \alpha \in \mathbb{F}$, $\alpha\theta = \alpha(0 \cdot x) = (\alpha 0) \cdot x = 0 \cdot x = \theta$. □

7.7 Notation. Let $X$ be a vector lattice over a field $\mathbb{F}$. Then $\forall x \in X$,

7.8 Examples.

(i) $\{\theta\}$ is a subspace, since by Property 7.6 (iii), $\alpha \cdot \theta = \theta$.

(ii) Any field is a linear space over itself.

(iii) $\mathbb{R}^n$ is a real linear space with $\theta = (0,\ldots,0)$ over $\mathbb{R}$.

(iv) $l^1$ space, with all real sequences over the field $\mathbb{R}$ whose series are absolutely convergent, is a linear space.

(v) $l^p$ space over the field $\mathbb{C}$, of all sequences such that for each $x = (x_1,x_2,\ldots) \in l^p$, $\sum_{n=1}^{\infty} |x_n|^p < \infty$, where $p \in [1,\infty)$, is a linear space. (See Problems 7.9 and 7.10.)

(vi) $\mathcal{C}[a,b]$ space of all continuous functions on $[a,b]$ is a real linear space.

(vii) $\mathcal{C}^n[a,b]$ space of all $n$-times differentiable functions on $[a,b]$ is a real linear space.

(viii) $\mathcal{C}^{(\infty)}$ space of all analytic (entire) functions is a complex linear space.

(ix) In Example 7.4 (x), $(l^p \setminus l_0^p \cup \{\theta\}, +, *)$, where $\theta = (0,0,\ldots)$, is a field, since elements of $l^p \setminus l_0^p$ have multiplicative inverses. $(\mathbb{C},+,\cdot)$ is another example of a field.

(x) The space $\mathbb{R}^X$ of all real-valued functions on a set $X$ is a commutative algebra over $\mathbb{R}$ with unity. $\mathbb{R}^X$ is also a vector lattice.

(xi) The subspace $\mathcal{B}(X;\mathbb{R}) \subseteq \mathbb{R}^X$ of all bounded real-valued functions on a set $X$ is a commutative subalgebra with unity and a vector lattice.

(xii) The subspace $\mathcal{C}(X;\mathbb{R})$ of all continuous functions is also a commutative subalgebra over $\mathbb{R}$ with unity and a vector lattice.

(xiii) The subspace $\mathcal{C}_b(X;\mathbb{R})$ of all bounded continuous functions is a commutative subalgebra of $\mathcal{C}(X;\mathbb{R})$ and a vector lattice.

(xiv) The subspace $\mathcal{C}^n(\mathbb{R};\mathbb{R})$ of all $n$-times differentiable functions is a commutative subalgebra with unity but not a lattice ($\sup\{x,-x\} = |x| \notin \mathcal{C}^n(\mathbb{R};\mathbb{R})$).

(xv) The space $\mathcal{C}^{(\infty)}(\mathbb{C};\mathbb{C})$ of all entire functions over $\mathbb{C}$ is a commutative algebra with unity but not a lattice.

(xvi) The space $\mathcal{P}$ of all polynomials with real coefficients is a commutative subalgebra over $\mathbb{R}$ with unity but not a lattice.

(xvii) The space $\mathcal{Q}$ of all polynomials with rational coefficients is a commutative subalgebra over the field of rational numbers with unity but not a lattice. □

PROBLEMS. Show that each monoid has exactly one identity. Let (Q,*) be a group. Show that for each two elements x,y E Q , there are 1,r E Q, such that l*x = y and x*r = y.

An operation * is called reducible if x*y = x*z implies that y = z for all x,y,z. Show that if (Q,*)is a group, then * is reducible. In particular, show that for each x E Cj, its inverse is unique. Prove Theorem 7.2. Let [ Q 1 f l l f ]be an isomorphism. Show that isomorphism.

[o,Q,f -'I

is also an

6, f ] be an isomorphism. Find K e r f . Let [Cj, 0, f ] be a mapping such that Cj = 6 = R with operation + and let f ( x ) = [XI (i.e. the greatest integer less than or equal to x). Let [Cj,

Is [Q, 0, f J an endomorphism?

Let ( Q , * ) be the set of all 2 x 2 real matrices with determinant equal 1. a) Show that ( C j , - ) is a group.

b ) Let B be any 2 x 2 nonsingular matrix. Define the map [g, Cj, f ] such that f ( A )= B - AB. Show that [Q,Q, f ] is an automorphism.

7.9

Show that, $\forall a,b \ge 0$ and $p \in [1,\infty)$, $(a+b)^p \le 2^{p-1}(a^p + b^p)$. [Hint: For $p > 1$, work with the auxiliary function $f(x) = (a+x)^p - 2^{p-1}(a^p + x^p)$, $x \ge 0$.]

7.10

Show that $l^p$ is a linear space; specifically show that $x,y \in l^p \Rightarrow x + y \in l^p$. [Hint: Apply the inequality in Problem 7.9 in the form $|x_n + y_n|^p \le 2^{p-1}(|x_n|^p + |y_n|^p)$.]

7.11

Show that the operation $*$ in Example 7.4 (x) is commutative and associative and it is closed in $l^p$.

7.12

Show that the operation $*$ in Example 7.4 (xi) is commutative and associative.

7.13

Show that $\circ$ defined in Example 7.4 (viii) is associative and that $T \circ T^{-1} = T^{-1} \circ T = I$.

7.14

Is ( 4 , + , o ) (where ( 4 , o ) is defined in Example 7.4 (viii)) a ring?

7.15

Let $S$ be a subset of $\mathbb{C}$. Argue for what cases $S$ is a subspace of $\mathbb{C}$ over $\mathbb{R}$.

a) $S$ is a closed unit disc centered at zero, i.e., $S = \{z \in \mathbb{C}: |z| \le 1\}$.

b) $S = \{z \in \mathbb{C}: \{|\mathrm{Re}(z)| \le 1\} \times \{|\mathrm{Im}(z)| \le 1\}\}$.

c) S = {z E C: {Im(z) = 0) x ( 1 Im(z) I 1)).

d) $S = \{z \in \mathbb{C}: \mathrm{Im}(z) \ge 0 \text{ and } \mathrm{Re}(z) \ge 0\} \cup \{z \in \mathbb{C}: \mathrm{Im}(z) \le 0 \text{ and } \mathrm{Re}(z) \le 0\}$.

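1.5

Two numbers $p > 1$ and $q > 1$ are called conjugate exponents if $\frac{1}{p} + \frac{1}{q} = 1$. Show that for all $x, y \in \mathbb{R}_+$ and for conjugate exponents $p$ and $q$, the following inequality holds.

$$xy \le \frac{x^p}{p} + \frac{y^q}{q}.$$

[Hint: Work with the function $f(z) = \frac{1}{q} + \frac{z}{p} - z^{1/p}$ and then substitute $z = \frac{x^p}{y^q}$.]

1.6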

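Prove Hölder's inequality (for finite sums): for conjugate exponents $p > 1$ and $q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$, $a_1,\ldots,a_n \ge 0$, and $b_1,\ldots,b_n \ge 0$,

$$\sum_{i=1}^{n} a_i b_i \le \left[\sum_{i=1}^{n} a_i^p\right]^{1/p} \left[\sum_{i=1}^{n} b_i^q\right]^{1/q}.$$

[Hint: Apply Problem 1.5 to $x = a_i/A$ and $y = b_i/B$, where $A = \left[\sum_{i=1}^{n} a_i^p\right]^{1/p}$ and $B = \left[\sum_{i=1}^{n} b_i^q\right]^{1/q}$.]

1.7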

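a) Prove Minkowski's inequality (for finite sums): for $p \ge 1$, $a_1,\ldots,a_n \ge 0$, and $b_1,\ldots,b_n \ge 0$, it holds true that

$$\left[\sum_{i=1}^{n} (a_i + b_i)^p\right]^{1/p} \le \left[\sum_{i=1}^{n} a_i^p\right]^{1/p} + \left[\sum_{i=1}^{n} b_i^p\right]^{1/p}.$$

[Hint: Make use of $(a+b)^p = a(a+b)^{p-1} + b(a+b)^{p-1}$ and then apply Hölder's inequality.]

b) Generalize Minkowski's inequality for infinite sums.

1.8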

The Euclidean metric or Euclidean distance is defined in $\mathbb{R}^n$ by

$$d_e(x,y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}. \qquad \text{(P1.8)}$$

(Specifically, if $n = 1$, we have $d_e(x,y) = \sqrt{(x-y)^2} = |x - y|$.) Show that $d_e$ is indeed a metric. [Hint: Apply Minkowski's inequality.]

[In Problem 1.8 we defined the Euclidean metric on $\mathbb{R}^n$ by equation (P1.8). This metric can be regarded as

$$d_p(x,y) = \sqrt{\sum_{k=1}^{n} d_k^2(x_k,y_k)}, \qquad \text{(P1.8a)}$$

where $d_k(x_k,y_k)$ is the one-dimensional Euclidean metric on the kth coordinate axis (kth factor space). We can extend this notion and define a metric on the n-times Cartesian product set $Y = Y_1 \times Y_2 \times \ldots \times Y_n$ by formula (P1.8a). The proposition in Problem 1.9 states that such $d_p$ is indeed a metric on $Y$. We call this metric the product metric and the corresponding metric space $(Y,d_p)$ the product space. In notation, $\times\{(Y_k,d_k): k = 1,\ldots,n\}$.]

1.9

Prove the statement. Let $(Y_k,d_k)$, $k = 1,\ldots,n$, be a collection of metric spaces and let $Y$ be the Cartesian product of $Y_1,\ldots,Y_n$. Then the function $d_p$ on $Y \times Y$ defined by (P1.8a) is a metric on $Y$.

1.10

Show that the function $\rho(x,y) = \sum_{k=1}^{n} d_k(x_k,y_k)$ is also a metric on $Y = Y_1 \times Y_2 \times \ldots \times Y_n$.
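A numerical experiment does not replace the proofs asked for in Problems 1.6 to 1.9, but it can help catch a mis-stated inequality before one tries to prove it. The following is a minimal sketch in Python; the helper names (`p_norm`, `holder_gap`, `minkowski_gap`, `product_metric`, `smallest_gaps`) are illustrative and do not come from the text, and the sampling ranges are arbitrary assumptions.

```python
import math
import random


def p_norm(values, p):
    """Return (sum_i |v_i|^p)^(1/p) for a finite list of numbers."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def holder_gap(a, b, p):
    """Right side minus left side of Hoelder's inequality (expected >= 0)."""
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    lhs = sum(x * y for x, y in zip(a, b))
    return p_norm(a, p) * p_norm(b, q) - lhs


def minkowski_gap(a, b, p):
    """Right side minus left side of Minkowski's inequality (expected >= 0)."""
    lhs = p_norm([x + y for x, y in zip(a, b)], p)
    return p_norm(a, p) + p_norm(b, p) - lhs


def product_metric(x, y, metrics):
    """d_p(x, y) = sqrt(sum_k d_k(x_k, y_k)^2) for coordinate metrics d_k."""
    return math.sqrt(sum(d(u, v) ** 2 for d, u, v in zip(metrics, x, y)))


def smallest_gaps(trials=1000, seed=0):
    """Sample random nonnegative vectors and return the smallest gaps seen."""
    rng = random.Random(seed)
    worst_holder = worst_minkowski = float("inf")
    for _ in range(trials):
        n = rng.randint(1, 6)
        p = rng.uniform(1.1, 5.0)
        a = [rng.uniform(0.0, 10.0) for _ in range(n)]
        b = [rng.uniform(0.0, 10.0) for _ in range(n)]
        worst_holder = min(worst_holder, holder_gap(a, b, p))
        worst_minkowski = min(worst_minkowski, minkowski_gap(a, b, p))
    return worst_holder, worst_minkowski
```

Up to floating-point rounding, both gaps returned by `smallest_gaps()` should be nonnegative. With the absolute-value metric on each coordinate, `product_metric` reduces to the Euclidean distance of Problem 1.8.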

CHAPTER 2. ANALYSIS OF METRIC SPACES

NEW TERMS: metrization, carrier, metric, distance, triangle inequality, metric space, pseudo-metric, pseudo-metric space, subspace, discrete metric, $l^p$-space, supremum metric, conjugate exponents, Hölder's inequality, Minkowski's inequality, Euclidean metric, Euclidean distance, product metric, product space

2. THE STRUCTURE OF METRIC SPACES

The structural properties of metric spaces stem from the notion of the open ball, with the aid of which we shall be able to introduce open and closed sets, interior, closure, and accumulation points. Open balls, determined by a particular metric, generate convergence and continuity, the principles of any analysis, which we explore in this chapter and Chapter 3.

2.1 Definition. Let $(X,d)$ be a metric space and let $x \in X$ and $r > 0$. The subset of $X$, $B(x,r) = \{y \in X: d(x,y) < r\}$, is called the open ball centered at $x$ with radius $r$ (with respect to metric $d$). [If we need to emphasize that the ball is with respect to metric $d$, we will write $B_d(x,r)$. This notation makes sense whenever more than one metric on $X$ is considered.]

2.2 Examples.

(i) The open ball $B(x,r)$ in Euclidean space $(\mathbb{R}, d_e)$ is the open interval $(x - r, x + r)$.

(ii) The open ball $B(x,r)$ in Euclidean space $(\mathbb{R}^2, d_e)$ is the open disc centered at $x$ with radius $r$ in the usual sense.

(iii) Different choices of metric on a given carrier give rise to different spaces and, as a result, to different open balls. In metric spaces other than Euclidean, the shape of open balls may be quite surprising compared with our usual perception of them. Consider, for instance, an open ball $B(x,r)$ in $(\mathbb{R}^2,d)$, where $d$ is the supremum metric defined as in Problem 1.3, for $n = 2$, i.e.,

$$d(x,y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}.$$

It is easy to see that the open ball $B(x,r)$ is of the square shape and that the corresponding open ball $B_e(x,r)$ with respect to the Euclidean metric in $\mathbb{R}^2$ is inscribed in this square (see Figure 2.1 below).

Figure 2.1

(iv) Let $(X,d)$ be a discrete metric space with the metric defined in Example 1.3 (i). Then, for any $x \in X$, an open ball centered at $x$ is

$$B(x,r) = \begin{cases} \{x\}, & r \le 1, \\ X, & r > 1. \end{cases}$$

(v) Let $(X,d)$ be the metric space defined in Example 1.3 (iv), where $X = C_{[a,b]}$ and $d$ is the metric given there. Then the open ball $B(x,r)$ has a shape as depicted in Figure 2.2. □

Figure 2.2
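The dependence of open balls on the metric in Example 2.2 can be tried out directly. The sketch below is only an illustration in Python: the function names are hypothetical, the supremum metric is taken as the maximum of coordinate differences, and the discrete metric is assumed to be $d(x,y) = 1$ for $x \ne y$ and $d(x,x) = 0$, which is the usual definition but is not restated in this section.

```python
import math


def d_euclid(x, y):
    """Euclidean metric on tuples of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def d_sup(x, y):
    """Supremum metric: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def d_discrete(x, y):
    """Discrete metric (assumed form): 0 if the points coincide, else 1."""
    return 0.0 if x == y else 1.0


def in_open_ball(d, center, r, y):
    """Return True when y lies in B(center, r) = {y : d(center, y) < r}."""
    return d(center, y) < r
```

For instance, the point (0.9, 0.9) lies in the unit ball around the origin for the supremum metric (the open square) but not for the Euclidean metric (the inscribed disc). Under the discrete metric, a ball of radius 1 contains only its center, while a ball of radius 1.5 contains every point.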

2.3 Definition. Let $(X,d)$ be a metric space. A subset $A$ of the carrier $X$ is called a d-open set (or just open set) if every point $x$ of $A$ can serve as the center of an open ball inscribed in $A$, i.e., there is an $r > 0$ such that $B(x,r) \subseteq A$. □

2.4 Examples.

(i) Every open ball is an open set itself. Indeed, if $x_1 \in B(x,r)$ then $r - d(x,x_1) > 0$. Take $r_1 = r - d(x,x_1)$ and show that $B(x_1,r_1) \subseteq B(x,r)$. For every $z \in B(x_1,r_1)$, by the triangle inequality,

$$d(x,z) \le d(x,x_1) + d(x_1,z) < d(x,x_1) + r_1 = r.$$

Thus $z \in B(x,r)$ (see Figure 2.3).

Figure 2.3

(ii) The set $[a,b)$, for $a < b$, in $(\mathbb{R}, d_e)$ is not open, since there is no open ball $B(a,r) \subseteq [a,b)$.

(iii) The carrier $X$ is obviously open.

(iv) A set $A$ is not open if there is at least one point $x \in A$ such that there is no ball $B(x,r)$ that can be inscribed in $A$. Since the empty set does not have any point, it is reasonable to assign it to the class of open sets.

(v) In the Euclidean space $(\mathbb{R},d_e)$, $\mathbb{R}$ is an open set but not an open ball (why?). □

2.5 Theorem. For every metric space $(X,d)$, the following statements hold true:

(i) Arbitrary unions of open sets are open sets.

(ii) Finite intersections of open sets are open sets.

Proof.

(i) Let $\{A_k: k \in I\}$ be an indexed family of open sets in $X$ and let $A = \bigcup_{k \in I} A_k$. If $x \in A$ then there is an index $i$ such that $x \in A_i$. Since $A_i$ is open, there is an $r > 0$ such that

$$B(x,r) \subseteq A_i \subseteq A.$$

Therefore, $A$ is open.

(ii) Let $A_1,\ldots,A_n$ be open subsets of $X$ and let $A = \bigcap_{k=1}^{n} A_k$. If $x \in A$ then $x \in A_k$, $k = 1,\ldots,n$. It follows that there are $r_1,\ldots,r_n$ such that $B(x,r_k) \subseteq A_k$, $k = 1,\ldots,n$. Let $r = \min\{r_1,\ldots,r_n\}$. Then, obviously, $B(x,r) \ne \emptyset$ and $B(x,r) \subseteq A_k$, $k = 1,\ldots,n$. Thus, $B(x,r) \subseteq A$ and $A$ is open. □

2.6 Remark. The intersection of infinitely many open sets need not be open. The reason is that $r = \inf\{r_k: k \in I\}$ can be zero. For example, let

$$A_n = \left(1 - \tfrac{1}{n},\, 1 + \tfrac{1}{n}\right), \quad n = 1,2,\ldots$$

Then $1 \in A_n$, $n = 1,2,\ldots$, which implies that $1 \in \bigcap_{n=1}^{\infty} A_n$ and hence

$$\bigcap_{n=1}^{\infty} A_n = \{1\}.$$

However, the set $\{1\}$ is not open in $(\mathbb{R},d_e)$.
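The shrinking radii behind Remark 2.6 can be made concrete with exact rational arithmetic. The sketch below is in Python and assumes the intervals $A_n = (1 - 1/n, 1 + 1/n)$; that specific family is an assumption chosen for illustration, and the function name `first_excluding_index` is hypothetical.

```python
from fractions import Fraction


def first_excluding_index(x, n_max):
    """Return the smallest n <= n_max with x outside A_n = (1 - 1/n, 1 + 1/n),
    or None if x belongs to every A_n checked."""
    for n in range(1, n_max + 1):
        # x is in A_n exactly when |x - 1| < 1/n
        if abs(x - 1) >= Fraction(1, n):
            return n
    return None
```

Any point other than 1 is eventually excluded: `first_excluding_index(Fraction(1001, 1000), 10**4)` gives 1000, since $|x - 1| = 1/1000$ stops being smaller than $1/n$ at $n = 1000$. The point 1 itself is never excluded, so the intersection collapses to the singleton $\{1\}$, which contains no open interval.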

2.7 Example. Let $(X,d)$ be a discrete metric space. Then the power set $\mathcal{P}(X)$ coincides with the set of all open sets. Indeed, in Example 2.2 (iv), we showed that in any discrete metric space, every singleton $\{x\}$ and the carrier $X$ are open balls. In addition, $\emptyset$ is an open set. Since any subset $A$ of $X$ can be represented as the union of its singletons, by Theorem 2.5 (i), it follows that $A$ is also open. Specifically, in $\mathbb{R}$ endowed with the discrete metric, all singletons are open, while in Euclidean space $(\mathbb{R},d_e)$ they are not. □

2.8 Definitions.

(i) A point $x \in A \subseteq X$ is called an interior point of $A$ if there exists an open ball $B(x,r) \subseteq A$. The set of all interior points of set $A$ is denoted by $\mathring{A}$ or $\mathrm{Int}(A)$ and called the interior of $A$.

[Clearly, $\mathring{A}$ is the largest open subset of $A$, which yields that $A$ is open if and only if $A = \mathring{A}$. Indeed, let $C \subseteq A$ be an open set, larger than $\mathring{A}$. Then there is an $x \in C$ such that $x \notin \mathring{A}$. But this is a contradiction, since $x$ must be an interior point of $A$.]

(ii) A subset $A$ of $X$ is called closed if its complement $A^c$ is open. [Specifically, the carrier $X$ and the empty set $\emptyset$ are both closed.]

(iii) A point $x \in X$ is called a closure point of $A \subseteq X$ if every open ball centered at $x$ contains at least one element of $A$ (including $x$ if $x \in A$). We will also say, "if every open ball centered at $x$ meets $A$." The set of all closure points of $A$ is denoted by $\bar{A}$ or by $\mathrm{Cl}(A)$ and called the closure of $A$.

[For example, let $A = [0,2) \cup \{5\}$. The point 5 is one of the closure points since $B(5,r)$ contains 5 for all $r > 0$. Thus, $\bar{A} = [0,2] \cup \{5\}$.]

2.9 Proposition. Arbitrary intersections or finite unions of closed sets are closed sets.

Proof. The statements follow by applying DeMorgan's laws.

2.10 Examples.

(i) From Definition 2.8 (iii) it follows that $A \subseteq \bar{A}$.

(ii) Since the set of all open subsets of a discrete metric space $(X,d)$ coincides with its power set, the set of all closed subsets is also the power set. Particularly, in a discrete metric space all subsets are simultaneously open and closed.

2.11 Proposition. For any subset $A$ of $X$, $\bar{A}$ is the smallest closed superset of $A$.


Proof. (i) We show first that $\bar{A}$ is a closed set, i.e. that $(\mathrm{Cl}(A))^c$ is open. Let $x \in (\mathrm{Cl}(A))^c$. Then there exists an open ball $B(x,r)$ such that $B(x,r) \cap A = \emptyset$ (since, otherwise, $x$ would belong to $\bar{A}$ by the definition). However, we have not proved yet that $B(x,r) \cap \bar{A} = \emptyset$, which would immediately imply that $(\mathrm{Cl}(A))^c$ is open. Now we show that no point of $B(x,r)$ is a closure point of $A$. Take an arbitrary point $t \in B(x,r)$. Since $B(x,r)$ is an open set, there is an open ball $B(t,r_t) \subseteq B(x,r)$, also disjoint from $A$. By the definition of a closure point, this means that $t \notin \bar{A}$. Since $t$ was an arbitrary point of $B(x,r)$, $B(x,r) \subseteq (\mathrm{Cl}(A))^c$.

(ii) Now we show that the closure of $A$ is the smallest closed set containing $A$. Let $B$ be an arbitrary closed set such that $A \subseteq B$. We prove that $B^c \subseteq (\bar{A})^c$. Since $B^c$ is open, for each $x \in B^c$, there is an open ball $B(x,r) \subseteq B^c$. This implies that $B(x,r) \cap B = \emptyset$ and that

$$B(x,r) \cap A = \emptyset.$$

Thus $x \notin \bar{A}$ (by the definition of a closure point), which is equivalent to $x \in (\mathrm{Cl}\, A)^c$. Therefore, we have proved that $x \in B^c$ yields that $x \in (\mathrm{Cl}\, A)^c$, i.e. $B^c \subseteq (\mathrm{Cl}\, A)^c$. The latter is obviously equivalent to $\bar{A} \subseteq B$.

2.12 Corollary. A set $A$ is closed if and only if $A = \bar{A}$. (See Problem 2.1.)

2.13 Remark. Consider the set $C(x,r) = \{y \in X: d(x,y) \le r\}$. It can be easily shown that $C$ is a closed set. (See Problem 2.4.) Such $C$ is called a closed ball centered at $x$ with radius $r$. Evidently, $B(x,r) \subseteq C(x,r)$ implies that $\overline{B(x,r)} \subseteq C(x,r)$, since $\bar{B}$ is the smallest closed set containing $B$. However, we observe that $C(x,r)$ does not necessarily coincide with the closure of the corresponding open ball $B(x,r)$. For instance, let $(X,d)$ be a discrete metric space, where any open ball is both a closed and an open set, i.e. $\overline{B(x,r)} = B(x,r)$. Because

$$d(x,y) = \begin{cases} 1, & x \ne y, \\ 0, & x = y, \end{cases}$$

we have $B(x,r) = C(x,r) = X$ for $r > 1$ or $B(x,r) = C(x,r) = \{x\}$ for $r < 1$. For $r = 1$, $B(x,r) = \{x\} \subset C(x,r) = X$, unless $X$ is a singleton. □

2.14 Examples.

(i) In the Euclidean metric space $(\mathbb{R},d_e)$, for each $x \in \mathbb{R}$, $\{x\}$ is closed. Indeed, $\{x\}^c = (-\infty,x) \cup (x,\infty)$ is open.

(ii) The set of all rational numbers $\mathbb{Q}$ is neither open nor closed. Indeed, it is known that each irrational point $x$ is a limit of a sequence of rational points $\{x_n\}$. Therefore, there is no open ball $B(x,r)$ which does not contain rational points. This implies that $\mathbb{Q}^c$ is not open, or equivalently, $\mathbb{Q}$ is not closed. On the other hand, $\mathbb{Q}$ cannot be open, since otherwise every rational point $q$ could be the center of an open ball (interval) containing just rational numbers. This is absurd, since any interval has the cardinality of the continuum, whereas $\mathbb{Q}$ is countable. Therefore, the set of all rational numbers is neither open nor closed. It also follows that the set of all irrational numbers is neither open nor closed. □

2.15 Definition. A point $x \in X$ is called an accumulation point of a set $A \subseteq X$ if $\forall r > 0$, $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$. [Observe that $x$ need not be an element of $A$.] The set of all accumulation points of $A$ is called the derived set of $A$ and it is denoted by $A'$.

Unlike a closure point, an accumulation point must be "close" to $A$. If $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$, then $B(x,r) \cap A \ne \emptyset$, and, consequently, $x \in A'$ yields that $x \in \bar{A}$, or $A' \subseteq \bar{A}$.

2.16 Examples.

(i) Notice that not every closure point is an accumulation point. For instance, let $A = (0,1) \cup \{2\} \subseteq (\mathbb{R},d_e)$. Then 2 is obviously a closure point of $A$. However, 2 is not an accumulation point of $A$, since $B(2,\tfrac{1}{2}) \cap (0,1) = \emptyset$. On the other hand, 0 is an accumulation and closure point of $A$.

(ii) Let $A = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\} \subseteq (\mathbb{R},d_e)$. Since 0 is the limit of the sequence $\{\tfrac{1}{n}\}$ (in terms of Euclidean distance), it is also an accumulation point of $A$. Any open ball at 0 contains at least one point of $A$. This is the only accumulation point of $A$. By the way, $A$ is not closed, for 0 is a closure point of $A$ that does not belong to $A$. So we have $A' = \{0\}$, $\bar{A} = A \cup \{0\}$.

In the previous section we introduced the notion of the product metric. We wonder what the shape of open sets in the product metric space is. A remarkable property of this metric is given by the following theorem.

2.17 Theorem. Let $\{(Y_k,d_k): k = 1,\ldots,n\}$ be a finite family of metric spaces and let $(Y,d) = \times\{(Y_k,d_k): k = 1,\ldots,n\}$ be the product space. Then $O \subseteq (Y,d)$ is open if and only if $O$ is the union of sets of the form $\times\{O_i: i = 1,\ldots,n\}$, where each $O_i$ is open in $(Y_i,d_i)$.

A proof of this theorem in a more general form is given in Chapter 3.


PROBLEMS

2.1 Prove Corollary 2.12.

2.2 Is it true that $A \subseteq B \Rightarrow \bar{A} \subseteq \bar{B}$?

2.3 Show that [FIC C - 2.

2.4 Prove that a closed ball $C(x,r)$ is a closed set.

2.5 Show that in $(\mathbb{R}^n,d_e)$, $\overline{B(x,r)} = C(x,r)$.

2.6 Show that $\bar{A} = A \cup A'$.

2.7 Let $A \subseteq (X,d)$, where $X$ is an infinite set. Show that, if $x$ is an accumulation point of $A$, then every open set containing $x$ contains infinitely many points of $A$.

2.8 Give an example of a continuum closed set that does not have any accumulation point.

2.9 Find the shape of open balls in the metric space $(X,d)$ introduced in Example 1.3 (ii).

2.10 Show that the set $[1,\infty)$ is closed in the metric space in Problem 2.9.

NEW TERMS: open ball, radius of an open ball, supremum metric, open ball with respect to the Euclidean metric, open ball with respect to the supremum metric, open (d-open) set, interior point, interior of a set, closed set, closure point, closure of a set, closed ball, accumulation point, derived set


3. CONVERGENCE IN METRIC SPACES

This section introduces the reader to one of the central notions in the analysis of metric spaces: convergence. Among other things, we will discuss the relation between limit and closure points.

3.1 Definitions.

(i) Recall that a function $[\mathbb{N},X,f]$ is called a sequence, and its most commonly used notation is $\{x_n\} = f$, with $x_n = f(n)$. Let $\{x_n\} \subseteq (X,d)$ be a sequence and let $x \in X$. A subsequence $Q_N = \{x_N, x_{N+1}, \ldots\}$ is called an $N(x,\varepsilon)$-tail of $\{x_n\}$ if there are $N \ge 1$ and $\varepsilon > 0$ such that $Q_N \subseteq B(x,\varepsilon)$. The sequence $\{x_n\}$ is said to converge to a point $x \in X$ if for every $\varepsilon > 0$, there is an $N(x,\varepsilon)$-tail. In notation,

$$\lim_{n \to \infty} d(x_n,x) = 0$$

(also $d\text{-}\lim_{n \to \infty} x_n = x$ or just $x_n \to x$). $x$ is called a limit point of the sequence $\{x_n\}$. A sequence is convergent if it is convergent to at least one limit point that belongs to $X$.

(ii) A point $x$ is said to be a limit point of a set $A$ if there is a sequence $\{x_n\} \subseteq A$ convergent to $x$.

(iii) A sequence $\{x_n\}$ is called a Cauchy sequence, in notation

$$\lim_{n,m \to \infty} d(x_n,x_m) = 0,$$

if for each $\varepsilon > 0$, there is an $N$ such that $d(x_n,x_m) < \varepsilon$, for $n,m > N$.

(iv) A metric space $(X,d)$ is called complete if every Cauchy sequence in $X$ is convergent.

(v) A sequence $\{x_n\}$ is called bounded if for every $n$, $d(x_1,x_n) \le M$, where $M$ is a positive real number. □
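The $N(x,\varepsilon)$-tail of Definition 3.1 (i) can be searched for numerically over a finite range of indices. The Python sketch below is only an illustration: the function name `tail_index` and the parameter `horizon` are hypothetical, and a finite scan can suggest convergence but cannot prove it.

```python
def tail_index(seq, limit, eps, horizon, dist=lambda u, v: abs(u - v)):
    """Return the smallest N <= horizon such that dist(seq(n), limit) < eps
    for all N <= n <= horizon, or None if the term at the horizon itself
    lies outside the ball B(limit, eps)."""
    last_bad = 0  # largest index seen so far whose term is outside the ball
    for n in range(1, horizon + 1):
        if dist(seq(n), limit) >= eps:
            last_bad = n
    return None if last_bad == horizon else last_bad + 1
```

For $x_n = 1/n$, limit 0 and $\varepsilon = 0.01$, the scan returns $N = 101$, because $1/100$ is not strictly less than $0.01$. For the oscillating sequence $x_n = (-1)^n$ with candidate limit 1, odd-indexed terms keep leaving the ball, so no tail is found when the horizon is odd.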

3.2 Remark. A sequence in a metric space can have at most one limit point. Indeed, let $x$, $y$ be limits of a sequence $\{x_n\} \subseteq (X,d)$ and let $\varepsilon > 0$ be arbitrary. Then there is an $N$ such that, for $n \ge N$, by the triangle inequality,

$$d(x,y) \le d(x,x_n) + d(x_n,y) < \varepsilon$$

(i.e. $d(x,y)$ can be made arbitrarily small). Thus, $x = y$.

3.3 Theorem. Let $A \subseteq (X,d)$. Then a point $x$ is a closure point of a set $A$ if and only if $x$ is a limit point of $A$ (i.e. there is a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$).

Proof. (i) Let $x$ be a closure point of $A$. If $x \in A$ then the proof becomes trivial (take $x_n = x$, $n = 1,2,\ldots$). Let $x \in X \setminus A$. By the definition of a closure point, every open ball $B(x,r)$ meets $A$. Thus for every $n$, there is a point $x_n \in A \cap B(x,\tfrac{1}{n})$, so that $d(x,x_n) < \tfrac{1}{n}$. Therefore, $\{x_n\}$ is a desired sequence convergent to $x$.

(ii) Let $\{x_n\} \subseteq A$ be such that $\lim_{n \to \infty} x_n = x$. We prove that $x \in \bar{A}$. The convergence implies that for every $\varepsilon > 0$, there is an $N$ such that $d(x_n,x) < \varepsilon$, for all $n \ge N$. Thus $\forall \varepsilon > 0$, $B(x,\varepsilon) \cap A \ne \emptyset$, which yields that $x \in \bar{A}$. (Particularly, if $x \in A' \setminus A \ne \emptyset$, then there exists a sequence $\{x_n\}$ with all distinct terms such that $x_n \to x$.) □

3.4 Corollary. A subset $A$ of a metric space $(X,d)$ is closed if and only if it contains all of its limit points.

Proof. (i) Let $A$ be closed and let $\{x_n\} \subseteq A$ be a convergent sequence. Then, by Theorem 3.3,

$$\lim_{n \to \infty} x_n = x \in \bar{A}.$$

Since $A$ is closed, $A = \bar{A}$ and $x \in A$. Thus, $A$ contains all of its limit points.

(ii) Let $A$ contain all of its limit points. Apply the pick-a-point process. Let $x \in \bar{A}$. Then, by Theorem 3.3, there is a sequence $\{x_n\} \subseteq A$ such that $\lim_{n \to \infty} x_n = x$. By our assumption, $x$ belongs to $A$ or, equivalently, $\bar{A} \subseteq A$, implying that $A = \bar{A}$ and hence $A$ is closed. □

3.5 Definitions.

(i) A subset $A \subseteq (X,d)$ is called dense in $X$ if $\bar{A} = X$. [By Theorem 3.3, $A$ is dense in $X$ if and only if the set of all limit points of $A$ coincides with $X$, or, in other words, if and only if for every $x \in X$, there exists a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$.]

(ii) A set $A \subseteq (X,d)$ is called nowhere dense if its closure has the empty set for its interior, i.e., if $\mathrm{Int}(\mathrm{Cl}(A)) = \emptyset$.

(iii) A point $x \in (X,d)$ is called a boundary point of $A$ if every open ball at $x$ contains points from $A$ and from $A^c$. The set of all boundary points of $A$ is called the boundary of $A$ and is denoted by $\partial A$. [Note that $\partial A = \partial A^c = \bar{A} \cap \overline{A^c}$.]

3.6 Examples.

(i) Since each irrational number can be represented as the limit of a sequence of rational numbers, $\mathbb{Q}$ is dense in $\mathbb{R}$ (in terms of the Euclidean metric).

(ii) $X$ and $\emptyset$ have no boundary points.

(iii) Let $A = [0,1) \cup \{2\}$. Then, $\mathring{A} = (0,1)$, $\bar{A} = [0,1] \cup \{2\}$, $A' = [0,1]$, $\partial A = \{0,1,2\}$ (since $A^c = (-\infty,0) \cup [1,2) \cup (2,\infty)$, $\overline{A^c} = (-\infty,0] \cup [1,\infty)$, and $\bar{A} \cap \overline{A^c} = \{0,1,2\}$).

(iv) Let $A = \{1,5,10\} \subseteq (\mathbb{R},d_e)$. Then $A$ is nowhere dense.

(v) $\{\tfrac{1}{n}: n = 1,2,\ldots\}$ is nowhere dense in $(\mathbb{R},d_e)$.

PROBLEMS

3.1 Show that every convergent sequence is a Cauchy sequence. Give an example when the converse is not true.

3.2 Prove that $\bar{A} = \mathring{A} + \partial A$.

3.3 If $x \in \partial A$, must $x$ be an accumulation point?

3.4 Prove that a set $A \subseteq (X,d)$ is nowhere dense in $X$ if and only if the complement of its closure is dense in $X$.

3.5 Assuming that $(\mathbb{R},d_e)$ is complete (a known fact from calculus) prove that $(\mathbb{R}^n,d_e)$ is also complete.

3.6 Show that any Cauchy sequence is bounded.

3.7 Show that in a discrete metric space any convergent sequence has at most finitely many distinct terms.

3.8 Show that any discrete metric space is complete.

3.9 Show that if $\{x_n\} \subseteq (X,d)$ is a Cauchy sequence and $\{x_{n_k}\}$ is a subsequence convergent to a point $a \in X$, then $x_n \to a$.

NEW TERMS: sequence, $N(x,\varepsilon)$-tail, convergent sequence, limit point of a sequence, limit point of a set, Cauchy sequence, complete metric space, bounded sequence, dense set, nowhere dense set, boundary point, boundary of a set


4. CONTINUOUS MAPPINGS IN METRIC SPACES

4.1 Definition. Let $(X,d)$ and $(Y,\rho)$ be two metric spaces. A function $f: (X,d) \to (Y,\rho)$ is called continuous at a point $x_0 \in X$ if for each $\varepsilon > 0$, there is a number $\delta > 0$ such that $\rho(f(x),f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$. The function $f$ is called continuous on $X$ or simply continuous if $f$ is continuous at every point of $X$. □

4.2 Remark. Since $x_0 \in f^*(\{f(x_0)\})$, $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$. However, in general, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. The continuity of function $f$ at $x_0$ is equivalent to the statement that, for any $\varepsilon > 0$, $x_0$ is indeed an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In other words, $f$ is continuous at $x_0$ if and only if the inverse image under $f$ of any open ball centered at $f(x_0)$ contains $x_0$ as an interior point. (See Figure 4.1.) Consequently, there is an open ball $B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$. In particular, this implies that: 1) such a positive $\delta$ exists, and 2) the image of $B_d(x_0,\delta)$ under $f$ is a subset of $B_\rho(f(x_0),\varepsilon)$, which guarantees that $\rho(f(x),f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$.

Figure 4.1

However, if $f$ is not continuous at $x_0$, as depicted in Figure 4.2 below, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In this case, no ball $B_d(x_0,\delta)$ can be inscribed in $f^*(B_\rho(f(x_0),\varepsilon))$ or, equivalently, no positive $\delta$ exists to warrant $\rho(f(x),f(x_0))$ to be less than $\varepsilon$ for all $x$ with $d(x,x_0) < \delta$.

Figure 4.2

The following theorem is a generalization of the above principles of continuity.

4.3 Theorem. A function $f: (X,d) \to (Y,\rho)$ is continuous if and only if the inverse image of any open set in $(Y,\rho)$ under $f$ is open in $(X,d)$.

Proof. 1) As mentioned in Remark 4.2, we will begin the proof by showing the validity of the following assertion:

$f$ is continuous at $x_0$ if and only if $x_0$ is an interior point of the inverse image under $f$ of any open ball $B_\rho(f(x_0),\varepsilon)$.

Let $x_0$ be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. Then there is an open ball $B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$, and hence (by Problems 3.6 (a) and 2.6 of Chapter 1),

$$f_*(B_d(x_0,\delta)) \subseteq f_*(f^*(B_\rho(f(x_0),\varepsilon))) \subseteq B_\rho(f(x_0),\varepsilon),$$

which yields continuity of $f$ at $x_0$. Now, let $f$ be continuous at $x_0$. Then, the inclusion $f_*(B_d(x_0,\delta)) \subseteq B_\rho(f(x_0),\varepsilon)$ holds, which, along with Problem 2.5 (Chapter 1), leads to the following sequence of inclusions:

$$B_d(x_0,\delta) \subseteq f^*(f_*(B_d(x_0,\delta))) \subseteq f^*(B_\rho(f(x_0),\varepsilon)).$$

Because $x_0$ is the center of $B_d(x_0,\delta)$, it is an interior point of this ball and, due to the last inclusion, an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$.

2) Suppose $f$ is continuous on $X$. We show that for each open set $O \subseteq Y$, $f^*(O)$ is open in $(X,d)$. Pick a point $x_0 \in f^*(O)$. Then, $f(x_0) \in f_*(f^*(O)) \subseteq O$ and, since $O$ is open, $f(x_0)$ is its interior point. Thus, $O$ is a superset of the open ball $B_\rho(f(x_0),\varepsilon)$, for some $\varepsilon$, and consequently,

$$f^*(B_\rho(f(x_0),\varepsilon)) \subseteq f^*(O). \qquad (4.3)$$

Since $f$ is continuous at $x_0$, by assertion 1), $x_0$ must be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$, and, by (4.3), an interior point of $f^*(O)$. Thus, $f^*(O)$ is open.

3) Let $f^*(O)$ be open in $(X,d)$ for every open subset $O$ of $Y$. Take $x_0 \in X$ and construct an open ball $B_\rho(f(x_0),\varepsilon)$. By our assumption, the set $f^*(B_\rho(f(x_0),\varepsilon))$ is open in $(X,d)$. Since $f(x_0) \in B_\rho(f(x_0),\varepsilon)$, we have that

$$x_0 \in f^*(\{f(x_0)\}) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$$

and, therefore, $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$ and it is an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. By 1), $f$ must then be continuous at $x_0$. □

There will also be yet another useful criterion of continuity.

4.4 Theorem. A function $f: (X,d) \to (Y,\rho)$ is continuous at $x \in X$ if and only if for every sequence $\{x_n\}$, d-convergent to $x$, its image sequence $\{f(x_n)\}$ is $\rho$-convergent to $f(x)$.

We will prove this theorem for a more general case in Chapter 3 (Theorems 4.9 and 4.10).

4.5 Definition. Let $(X,d)$ be a metric space and $\tau(d)$ be the collection of all open subsets of $X$ with respect to metric $d$. Then $\tau(d)$ (or just $\tau$) is said to be the topology on $X$ generated by $d$.

Theorem 4.3 can now be reformulated as follows.

4.6 Theorem. Let $f: (X,d) \to (Y,\rho)$ be a function and let $\tau(d)$ and $\tau(\rho)$ be the topologies generated by metrics $d$ and $\rho$, respectively. Then $f$ is continuous on $X$ if and only if $f^{**}(\tau(\rho)) \subseteq \tau(d)$ [i.e., $\forall O \in \tau(\rho)$, $f^*(O) \in \tau(d)$]. □

4.7 Example. Let $f: (\mathbb{R},d) \to (\mathbb{R},d_e)$ be the Dirichlet function defined as $f = 1_{\mathbb{Q}}$, where $\mathbb{Q}$ is the set of rational numbers. If $d = d_e$ is the Euclidean metric then $f$ is discontinuous at every point. If $d$ is the discrete metric, by Theorem 4.3, $f$ is continuous on $\mathbb{R}$, since the inverse image of any open set in $(\mathbb{R},d_e)$ under $f$ is clearly an element of the power set coinciding with the "discrete topology" generated by the discrete metric (see Example 2.7). □

We will further be interested in the conditions under which two different metrics on $X$ generate one and the same topology. Generating the same topology is an equivalence relation on the set of all metrics on $X$ and is hence referred to as equivalence of metrics. In other words, topologies generated by metrics on a carrier induce an equivalence relation.

4.8 Definition. Two metrics $d_1$ and $d_2$ on $X$ are called equivalent if $\tau(d_1) = \tau(d_2)$ (in notation $d_1 \approx d_2$).

4.9 Remark. Let $(X,d_1)$ and $(X,d_2)$ be two metric spaces and let $f: (X,d_1) \to (X,d_2)$ be the identity function ($f(x) = x$, $x \in X$). If $d_1$ and $d_2$ are equivalent and therefore $\tau(d_1) = \tau(d_2)$, then for every open set $O$ in $(X,d_2)$ (and in $(X,d_1)$), $f^*(O) \in \tau(d_1)$. According to Theorem 4.4, this is equivalent to the statement that

$$\lim_{n \to \infty} d_1(x_n,x) = 0$$

implying that

$$\lim_{n \to \infty} d_2(f(x_n),f(x)) = \lim_{n \to \infty} d_2(x_n,x) = 0.$$

Thus, assuming

(i) $\tau(d_1) = \tau(d_2)$,

we showed that

(ii) $\lim_{n \to \infty} d_1(x_n,x) = 0 \Leftrightarrow \lim_{n \to \infty} d_2(x_n,x) = 0$.

By Theorem 4.4, it follows that the converse is also true, i.e. that statement (ii) implies statement (i). Hence, we may call two metrics $d_1$ and $d_2$ on $X$ equivalent if (i) or (ii) holds. □

From Theorem 4.3, it also follows that the identity map above is continuous under equivalent metrics. However, an identity map need not be continuous if $d_1$ and $d_2$ are not equivalent.

4.10 Definitions.

(i) Let $A$ be a subset in a metric space $(X,d)$. The number

$$d(A) = \sup\{d(x,y): x,y \in A\}$$

(more precisely, a real number or infinity) is called the diameter of $A$. The set $A$ is called d-bounded or just bounded if $d(A) < \infty$. Particularly, the metric space $(X,d)$ or $d$ is called bounded if $X$ is bounded. $A$ is said to be unbounded if $d(A) = \infty$.

(ii) A subset $A$ in a metric space $(X,d)$ is called totally bounded if for every $\varepsilon > 0$, the set $A$ can be covered by finitely many $\varepsilon$-balls (i.e. balls with common radius $\varepsilon$). □

4.11 Example. According to Problem 1.4, the function

$$\rho = \frac{d}{1 + d}$$

defined on a metric space $(X,d)$ is a metric on $X$. Obviously $\lim_{n \to \infty} d(x_n,x) = 0$ if and only if $\lim_{n \to \infty} \rho(x_n,x) = 0$ (due to $d = \frac{\rho}{1 - \rho}$). Therefore, $d$ and $\rho$ are equivalent. Observe that $\rho$ is clearly bounded while $d$ is arbitrary. □
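The passage between a metric and a bounded equivalent metric in Example 4.11 can be sketched in a few lines of Python. The formula $\rho = d/(1+d)$ is assumed here as the standard bounded companion of $d$; the names `bounded_version` and `recover_distance` are hypothetical.

```python
def bounded_version(d):
    """Given a metric d, return rho = d / (1 + d), which takes values in [0, 1)."""
    def rho(x, y):
        t = d(x, y)
        return t / (1.0 + t)
    return rho


def recover_distance(r):
    """Invert rho = d / (1 + d): d = rho / (1 - rho), valid for 0 <= rho < 1."""
    return r / (1.0 - r)
```

With `d` the absolute-value metric on the real line, `bounded_version(d)(0.0, 3.0)` equals 0.75 and `recover_distance(0.75)` returns 3.0. Since each of $d$ and $\rho$ is a continuous increasing function of the other that vanishes at zero, $d(x_n,x) \to 0$ exactly when $\rho(x_n,x) \to 0$.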

We finish this section by rendering a short discussion on uniform continuity. This concept will be further developed in Section 6 and Chapter 3.

4.12 Definition. A function $f: (X,d) \to (Y,\rho)$ is called uniformly continuous on $X$ if for every $\varepsilon > 0$, there is a positive real number $\delta$ such that $d(x,y) < \delta$ implies that $\rho(f(x),f(y)) < \varepsilon$, for every $x,y \in X$.

Unlike continuity, uniform continuity guarantees the existence of such positive $\delta$ (for every fixed $\varepsilon$) for all points of $X$ simultaneously. In the case of usual continuity, a delta depends upon a particular point $x \in X$, where the continuity holds, so that a common delta, good for all points $x \in X$, need not exist. Clearly, uniform continuity implies continuity. Uniform continuity can also be defined on some subset $A$ of $X$, so that in Definition 4.12, $X$ will be replaced by $A$.

4.13 Examples.

(i) Consider $f: (\mathbb{R},d_e) \to (\mathbb{R},d_e)$ such that $f(x) = x^2$. Then

$$|f(x) - f(x_0)| = |x - x_0|\,|x + x_0|,$$

and, since the factor $|x + x_0|$ can be made arbitrarily large, for a given $\varepsilon$ there is no $\delta > 0$ good for all $x_0$.

(ii) Let $f(x) = x^2$ be given as $f: ([0,3],d_e) \to (\mathbb{R},d_e)$. For $x, x_0 \in [0,3]$ with $|x - x_0| < \delta$,

$$|f(x) - f(x_0)| = |x - x_0|\,|x + x_0| < \delta(\delta + 6).$$

From the last inequality above we derive $\varepsilon = \delta(\delta + 6)$ and thus $\delta = \sqrt{\varepsilon + 9} - 3$. Thus $d_e(f(x),f(x_0)) < \varepsilon$ whenever $d_e(x,x_0) < \delta = \sqrt{\varepsilon + 9} - 3$. Since $\delta$ is independent of $x_0$, $f(x)$ is uniformly continuous. Observe that $f$ has been given on a closed and bounded interval, which provides the uniform continuity. However, in this case $f$ would also be uniformly continuous if $f$ were defined on any bounded but not necessarily closed interval, for instance $(0,3)$ (why?).

(iii) A continuous function can be uniformly continuous over unbounded sets, as for example, functions $f(x) = \frac{1}{x}$, $x \in [1,\infty)$, and $f(x) = \sin x$, $x \in \mathbb{R}$.

There is an analytical result, known as the Heine-Borel Theorem, stating that any continuous function defined on a closed and bounded set in any Euclidean metric space is also uniformly continuous. The general form of this result will be discussed in Section 6 (Theorem 6.13).

4.14 Remark. It is known from calculus that the space of all real-valued continuous functions defined on $\mathbb{R}^n$ is closed under the formation of main algebraic operations. What if functions were defined on an arbitrary space $(X,d)$? We give here some informal discussion on this matter. Let $\mathbb{R}^X$ be the space of all real-valued functions defined on a set $X$ and let $f,g \in \mathbb{R}^X$. Define the following.

(i) $f \pm g$ is the function such that for each point $x \in X$, $(f \pm g)(x) = f(x) \pm g(x)$.

(ii) $f \cdot g$ is the function such that $\forall x \in X$, $(f \cdot g)(x) = f(x) \cdot g(x)$.

(iii) $+\infty$ and $-\infty$ are not real numbers. Consequently, $f/g$ is the function such that for all $x \in X$, $(f/g)(x) = f(x)/g(x)$, excluding $x \in X$ for which $g(x) = 0$. At all those values, the function $f/g$ is either undefined or can be specified.

(iv) As a special case, any real-valued function multiplied by a real number is a real-valued function too.

(v) The associative (relative to multiplication) and distributive laws of functions relative to the addition and multiplication defined in (i) and (ii) are the corresponding consequences of these laws for real numbers.

Bearing in mind these observations, we conclude that the space $\mathbb{R}^X$ is a commutative algebra over $\mathbb{R}$ with unity and a vector lattice (that was also mentioned in Example 7.7 (ix), Chapter 1). A subset $\mathcal{C}((X,d);(\mathbb{R},\rho))$ (of $\mathbb{R}^X$) of all continuous functions is a subalgebra characterized by the following properties:

(a) $f,g \in \mathcal{C} \Rightarrow af + bg \in \mathcal{C}$, $\forall a,b \in \mathbb{R}$.

(b) $f,g \in \mathcal{C} \Rightarrow fg \in \mathcal{C}$.
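The uniform choice of $\delta$ discussed in Example 4.13 (ii) for $f(x) = x^2$ on $[0,3]$ can be tested on a grid. The Python sketch below takes $\delta = \sqrt{\varepsilon + 9} - 3$ from that example; the function names, the grid size and the factor 0.999 (used to stay strictly inside the $\delta$-ball) are illustrative assumptions, and a grid test is evidence, not a proof.

```python
import math


def delta_for(eps):
    """delta = sqrt(eps + 9) - 3, the positive root of delta * (delta + 6) = eps."""
    return math.sqrt(eps + 9.0) - 3.0


def worst_change(eps, lo=0.0, hi=3.0, steps=300):
    """Largest |x^2 - x0^2| found over grid points x0 in [lo, hi] and points x
    at distance just under delta_for(eps) from x0, clipped to [lo, hi]."""
    shift = 0.999 * delta_for(eps)
    worst = 0.0
    for i in range(steps + 1):
        x0 = lo + (hi - lo) * i / steps
        for x in (x0 - shift, x0 + shift):
            x = min(max(x, lo), hi)  # keep x inside the interval
            worst = max(worst, abs(x * x - x0 * x0))
    return worst
```

On $[0,3]$ the value `worst_change(eps)` stays below `eps` for each tested `eps`, in line with the uniform continuity claimed there. Keeping the same $\delta$ but enlarging the interval, for example `worst_change(1.0, hi=1000.0, steps=10)`, produces changes far above 1, which reflects the failure of uniform continuity of $x^2$ on unbounded sets.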


PROBLEMS

4.1 Show that if $A$ is totally bounded then $A$ is bounded. Give an example where a bounded set is not totally bounded.

4.2 Prove that $\mathcal{C}$ is indeed a subalgebra with properties (a) and (b) above.

4.3 Show that a continuous bounded function on a bounded interval need not be uniformly continuous.

In the problems below it is assumed that $f$ and $g$ are functions from $(\mathbb{R},d_e)$ to $(\mathbb{R},d_e)$.

4.4 Let $f: ((-\infty,0),d_e) \to ((-\infty,0),d_e)$ be a function given by $f(x) = \frac{1}{x}$. Show that $f$ is continuous. Explain why $f(x)$ is not uniformly continuous.

4.5 Let $f: A \to \mathbb{R}$ be a differentiable function such that its derivative $f'$ is bounded over $A$, where $A$ is an arbitrary (bounded or unbounded) interval. Show that $f$ is uniformly continuous on $A$.

4.6 Show that if $f$ and $g$ are uniformly continuous on $\mathbb{R}$ and bounded then $fg$ is uniformly continuous on $\mathbb{R}$ too.

4.7 Which of the following functions are uniformly continuous?
a) $f(x) = \sin^2 x$ ($x \in \mathbb{R}$).
b) $f(x) = x^3 \cos x$ ($x \in \mathbb{R}$).
c) $f(x) = x \sin x$ ($x \in \mathbb{R}$).
d) $f(x) = \ln x$ ($x \in [1,\infty)$).
e) $f(x) = x^2 \ln x$ ($x \in (1,100)$).

4.8 Let $f$ be a continuous function and $g$ a uniformly continuous function on a set $A$ such that $|f| \le |g|$. Is $f$ then uniformly continuous?

4.9 Show that in $(\mathbb{R}^n,d_e)$, any bounded set is also totally bounded.

NEW TERMS: continuous at a point function, continuous function on a set, inverse image of an open set under f, continuity criteria, topology generated by a metric, Dirichlet function, equivalent metrics, diameter of a set, bounded set, d-bounded set, unbounded set, totally bounded set, uniformly continuous function, algebra of functions

5. Complete Metric Spaces

5. COMPLETE METRIC SPACES In this section we will discuss the completeness of metric spaces as it was introduced in Definition 3.1 (iv).

5.1 Theorem. Let (X,d) be a complete metric space. Then a subspace (A,d) is complete if and only if A is closed.

Proof. Let A be closed and let $\{x_n\} \subseteq A$ be any Cauchy sequence. Since (X,d) is complete, there is a point $x \in X$ such that $\lim_{n\to\infty} x_n = x$. Then, by Corollary 3.4, $x \in A$. Thus, (A,d) is complete.

Now, let (A,d) be complete and let $\{x_n\}$ be any sequence in A that converges in X. Then this sequence is also a Cauchy sequence and hence A contains its limit. Therefore, A is closed, again by Corollary 3.4. □

The reader should be aware of the difference between the notions of completeness and closedness of a subspace. (See Problem 5.3.)

5.2 Theorem. A metric space (X,d) is complete if and only if every nested sequence $\{C(x_n,r_n)\}$ of closed balls, with $r_n \downarrow 0$ as $n \to \infty$, has a nonempty intersection.

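Proof. Let $\{C(x_n,r_n)\}$ be a nested sequence of closed balls with $r_n \downarrow 0$. Because $r_n \downarrow 0$, for any $\varepsilon > 0$ there is an integer $\nu$ such that $r_\nu < \tfrac{1}{2}\varepsilon$. Given that $k > n \ge \nu$, the nesting yields $x_k, x_n \in C(x_\nu, r_\nu)$ and, consequently,

$$d(x_k,x_n) \le 2r_\nu < \varepsilon.$$

Therefore, $\{x_n\}$ is a Cauchy sequence.

First assume that (X,d) is complete. Then $\{x_n\}$ converges to a point, say $x \in X$. Since each ball $C(x_n,r_n)$ contains the tail of the sequence $\{x_n\}$ and because it is closed, it must contain x. Thus, $\bigcap_{n=1}^{\infty} C(x_n,r_n)$ contains x and hence it is not empty.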

Now, let any nested sequence of closed balls have a nonempty intersection and let {xk) be a Cauchy sequence in X. By Definition 3.1 (iii), it implies the existence of an increasing subsequence {ul,u2,. ..} of indices of {xk) such that for each n, d(x3,xpn)
0, such that

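for all y with $d(x,y) < \delta_x$. Since X is compact, after reduction, there is an n-tuple of open balls such that

$$X = \bigcup_{i=1}^{n} B(x_i, \delta_{x_i}/2).$$

Let $\delta = \tfrac{1}{2}\min\{\delta_{x_1},\dots,\delta_{x_n}\}$ and let x, y be such that $d(x,y) < \delta$. Then $x \in B(x_i,\delta_{x_i}/2)$ for some i, which implies that $d(x,x_i) < \delta_{x_i}/2$ and

$$d(y,x_i) \le d(y,x) + d(x,x_i) < \delta + \delta_{x_i}/2 \le \delta_{x_i}.$$

Thus, y belongs to the ball $B(x_i,\delta_{x_i})$. Since y and $x_i$ are within the distance of $\delta_{x_i}$, due to continuity of f at $x_i$, given $\varepsilon$, we have $\rho(f(x_i),f(y)) < \varepsilon/2$. Obviously, $d(x,x_i) < \delta_{x_i}$ yields $\rho(f(x_i),f(x)) < \varepsilon/2$ and, therefore,

$$\rho(f(x),f(y)) \le \rho(f(x),f(x_i)) + \rho(f(x_i),f(y)) < \varepsilon.$$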

6.14 Theorem. A metric space (X,d) is compact if and only if it is complete and totally bounded.

Proof. 1) Let (X,d) be compact. Then, by Problem 6.6, it is complete. Since $X \subseteq \bigcup_{x \in X} B(x,\varepsilon)$ for every $\varepsilon > 0$, by compactness the cover $\{B(x,\varepsilon) : x \in X\}$ can be reduced to a finite subcover, which implies total boundedness.

2) Let (X,d) be complete and totally bounded. We will show that (X,d) is sequentially compact, which, by Theorem 6.3, would imply compactness. Let $\{x_n\}$ be a sequence in X. We will construct a Cauchy subsequence. Since X is totally bounded, it can be covered by finitely many open balls of radius 1. Then at least one of the balls, for instance $B_1$, contains infinitely many terms, say $\{x_n^1\}$, of this sequence. Furthermore, cover X by finitely many balls of radius $\tfrac{1}{2}$; again an infinite subsequence $\{x_n^2\} \subseteq \{x_n^1\}$ (since $B_1$ will also be covered) is contained in one of the balls, which we label $B_2$, and so on, the balls at the kth step having radius $2^{1-k}$. The desired Cauchy sequence is formed by the selection of the first term from each subsequence. Indeed, by the construction, $x_1^1$ and $x_1^2$ belong to ball $B_1$. Thus, $d(x_1^1,x_1^2) < 2$. Similarly, $x_1^2$ and $x_1^3$ belong to ball $B_2$, which implies that $d(x_1^2,x_1^3) < 1$, and so on. In general, $d(x_1^k,x_1^{k+1}) < 2^{2-k}$, and since these bounds are summable, the selected sequence is Cauchy. Since (X,d) is complete, this Cauchy sequence is convergent, yielding sequential compactness of (X,d). □
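Total boundedness in Theorem 6.14 means that, for every ε > 0, finitely many open ε-balls cover the space. The following Python sketch is only an illustration of that notion: it greedily builds a finite ε-net for a finite sample of the unit square in the Euclidean metric. The function name `greedy_epsilon_net` and the sample data are assumptions of this sketch, not notation from the text.

```python
import math
import random


def greedy_epsilon_net(points, eps):
    """Return centers such that every point lies within eps of some center."""
    centers = []
    for p in points:
        # Keep p as a new center only if no existing center is eps-close to it.
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


random.seed(0)
# A bounded sample in the square [0, 1] x [0, 1] (hypothetical data).
sample = [(random.random(), random.random()) for _ in range(500)]

for eps in (0.5, 0.25, 0.125):
    net = greedy_epsilon_net(sample, eps)
    # Every sample point is covered by an open ball of radius eps around a center.
    assert all(any(math.dist(p, c) < eps for c in net) for p in sample)
    print(eps, len(net))
```

The number of centers grows as ε shrinks but stays finite, which is the behavior total boundedness requires of a bounded subset of the plane.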



PROBLEMS

6.1

Show that if $\{x_k\} \subseteq (\mathbb{R}^n, d_e)$ with $d(x_k,0) \le 3$, then $\{x_k\}$ has a convergent subsequence.

6.2

Define, $\forall A, B \subseteq (X,d)$, $d(A,B) = \inf\{d(a,b) : a \in A, b \in B\}$. Let A be compact. Show that $\forall B \subseteq X$, there is an $x \in A$ such that $d(x,B) = d(A,B)$. [Hint: Use the fact that A is sequentially compact.]

6.3

Let $A, B \subseteq (X,d)$ be such that A is compact and B is closed. If $A \cap B = \emptyset$, show that $d(A,B) > 0$.

6.4

Let $A \subseteq (X,d)$. Show that if A is totally bounded then $\bar{A}$ is also totally bounded.

6.5

Generalize Theorem 6.6: Any Lindelöf metric space is separable.

6.6

Show that sequential compactness of a subspace implies its completeness.

6.7

Prove Theorem 6.2.

6.8

Prove Theorem 6.3.

NEW TERMS: cover 92 subcover 92 open cover 92 open subcover 92 compact set 92 compact metric space 92 Lindelöf set 92 Lindelöf space 92 compactness, criteria of 93, 97 Bolzano-Weierstrass compactness 93 sequential compactness 93 separable metric space 93 Heine-Borel Theorem 94 compact set under a continuous function 95 uniform continuity criterion in compact space 96



7. LINEAR AND NORMED LINEAR SPACES

We have already mentioned that the Euclidean metric defines the length of a vector in n-dimensional Euclidean vector (linear) space. The following generalizes the notion of vector length in a linear space and reconciles it with the notion of a special metric defined on a linear space (initially discussed in Section 5).

7.1 Definition. Let (X,d) be a metric space such that X is a linear space over $\mathbb{F}$, where $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$. The metric d is said to be:

a) translation invariant if for all $a, x, y \in X$, $d(x+a, y+a) = d(x,y)$.

b) homothetic if for all $a \in \mathbb{F}$ and $x, y \in X$, $d(ax,ay) = |a|\, d(x,y)$.

If d is translation invariant and homothetic we will abbreviate it by TIH.

If d is a metric on a linear space X, then we are able to measure the length of vectors, and thus compare them, by taking the distance from any point $x \in X$ to one fixed point of X, the origin. If, in addition, d is TIH, then we can use the properties of X as a linear space and, in some particular cases, even employ the geometry, thereby replicating the Euclidean space while preserving the generality needed in applications.

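7.2 Definition. Let d be a TIH metric on a linear space X, with the origin 0, over $\mathbb{F}$ (assuming that $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$). Then for all $x \in X$, we call the distance $d(x,0)$ the norm of the vector x and denote it by $\|x\|$. We will also call $\|\cdot\|$ the norm on X induced by the TIH metric d. The pair $(X, \|\cdot\|)$ is called a normed linear space (NLS). □

7.3 Theorem. Let $\|\cdot\|$ be a norm on X in Definition 7.2. Then the following properties of $\|\cdot\|$ hold true:

(i) $\|x\| = 0$ if and only if $x = 0$.

(ii) $\|ax\| = |a|\,\|x\|$, $\forall a \in \mathbb{F}$, $\forall x \in X$.

(iii) $\|x + y\| \le \|x\| + \|y\|$, $\forall x, y \in X$.

Proof. Property (i) is obvious. Property (ii) follows from homothety: $\|ax\| = d(ax, a0) = |a|\, d(x,0) = |a|\,\|x\|$. Property (iii) follows from the triangle inequality and translation invariance: $\|x+y\| = d(x+y,0) \le d(x+y,y) + d(y,0) = d(x,0) + d(y,0) = \|x\| + \|y\|$. □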

Conversely, if $\|\cdot\|$ is a real-valued nonnegative function defined on a linear space X and has properties (i-iii) of Theorem 7.3, then $\|\cdot\|$ generates a TIH metric on X by setting $d(x,y) = \|x - y\|$ (show it, see Problem 7.10).

If d in Definition 7.2 is a TIH pseudometric, then the function $\|\cdot\|$ is called a semi-norm and, correspondingly, the pair $(X, \|\cdot\|)$ is called a semi-normed linear space (SNLS).

It is easy to show that the Euclidean metric $d_e$ on $\mathbb{R}^n$ is TIH. The associated norm induced by $d_e$ is called the Euclidean norm and it will be denoted $\|\cdot\|_e$.
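The claim that the Euclidean metric is TIH, and that its norm has the properties listed in Theorem 7.3, can be spot-checked numerically. The sketch below is an illustration on random vectors of $\mathbb{R}^3$, not a proof; the helper names `norm_e`, `d_e`, `add`, and `scale` are assumptions of this sketch.

```python
import math
import random


def norm_e(x):
    """Euclidean norm of a vector given as a tuple of floats."""
    return math.sqrt(sum(t * t for t in x))


def d_e(x, y):
    """Metric generated by the norm: d(x, y) = ||x - y||."""
    return norm_e(tuple(s - t for s, t in zip(x, y)))


def add(x, y):
    return tuple(s + t for s, t in zip(x, y))


def scale(a, x):
    return tuple(a * t for t in x)


def random_vector(rng, dim=3):
    return tuple(rng.uniform(-5, 5) for _ in range(dim))


rng = random.Random(1)
tol = 1e-9
for _ in range(1000):
    x, y, z = random_vector(rng), random_vector(rng), random_vector(rng)
    a = rng.uniform(-5, 5)
    # Translation invariance: d(x + z, y + z) = d(x, y).
    assert abs(d_e(add(x, z), add(y, z)) - d_e(x, y)) < tol
    # Homothety: d(ax, ay) = |a| d(x, y).
    assert abs(d_e(scale(a, x), scale(a, y)) - abs(a) * d_e(x, y)) < tol
    # Triangle inequality for the norm: ||x + y|| <= ||x|| + ||y||.
    assert norm_e(add(x, y)) <= norm_e(x) + norm_e(y) + tol
print("all sampled checks passed")
```

The tolerance `tol` only absorbs floating-point rounding; the identities themselves are exact.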

A very important class of NLS's is introduced below.

7.4 Definition. An NLS is called a Banach space if it is complete with respect to the metric induced by the norm (or the norm induced by a TIH metric).

7.5 Examples.

(i) The NLS $(\mathbb{R}^n, \|\cdot\|_e)$ over the field $\mathbb{R}$ with

$$\|x\|_e = \sqrt{\sum_{i=1}^{n} x_i^2}$$

is a Banach space with the Euclidean norm (see Problem 7.1).

(ii) The NLS $l^p$ over the field $\mathbb{C}$ with the norm

$$\|x\|_p = \Big[\sum_{n=1}^{\infty} |x_n|^p\Big]^{1/p}$$

is a Banach space. Observe that $\|\cdot\|_p$ indeed defines a norm (called the $l^p$ norm). (See Problem 7.5.) Now let $\{x^{(n)}\}$ be a Cauchy sequence in $l^p$. Then this sequence is uniformly bounded (show it in Problem 7.6), say, by some $M \in \mathbb{R}_+$. Let $x = (x_1, x_2, \dots)$ be the pointwise limit of the sequence $\{x^{(n)}\}$. This limit exists, since each $x_i$ is the limit of the ith-component sequence in $(\mathbb{C},d_e)$, which is complete. We need to show that x is an element of $l^p$, i.e. $\|x\|_p < \infty$, and that

$$\lim_{n\to\infty} \|x^{(n)} - x\|_p = 0$$

(i.e. $\{x^{(n)}\}$ converges to x in $l^p$ norm). We have (by Minkowski's inequality with $a_k = x_k - x_k^{(n)}$ and $b_k = x_k^{(n)}$)

$$\Big[\sum_{k=1}^{r} |x_k|^p\Big]^{1/p} \le \Big[\sum_{k=1}^{r} |x_k - x_k^{(n)}|^p\Big]^{1/p} + \Big[\sum_{k=1}^{r} |x_k^{(n)}|^p\Big]^{1/p} \le \Big[\sum_{k=1}^{r} |x_k - x_k^{(n)}|^p\Big]^{1/p} + M.$$

Now, letting $n \to \infty$, we have

$$\Big[\sum_{k=1}^{r} |x_k|^p\Big]^{1/p} \le M,$$

which holds for all r = 1,2,... . Hence, we have $\|x\|_p \le M$. Show that $x^{(n)} \to x$ in $l^p$ norm (Problem 7.7). Thus, $l^p$ is complete and therefore is a Banach space.

(iii) Let $\mathcal{B}(\Omega)$ be the space of all bounded functions on $\Omega$ valued in $(\mathbb{R},d_e)$ or $(\mathbb{C},d_e)$. One can show that $\mathcal{B}(\Omega)$ is a linear space. The norm $\|f\|_u = \sup\{|f(\omega)| : \omega \in \Omega\}$ is called the supremum norm. $\mathcal{B}(\Omega)$ is a Banach space with respect to this norm (see Problem 7.4).
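The Minkowski step used in Example 7.5 (ii) can be observed on finite truncations. The sketch below assumes a concrete sequence $x^{(n)}_k = (1 + 1/n)/k^2$ with pointwise limit $x_k = 1/k^2$, chosen purely for illustration; `lp_norm` and `sup_norm` are hypothetical helper names, the latter corresponding to the supremum norm of Example 7.5 (iii) restricted to a finite index set.

```python
def lp_norm(x, p):
    """l^p norm of a finite sequence (a truncation of an element of l^p)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def sup_norm(x):
    """Supremum norm of a finite sequence."""
    return max(abs(t) for t in x)


p = 3   # exponent of the l^p norm (illustrative choice)
r = 50  # truncation length, the r of the partial sums in the text

x = [1.0 / k**2 for k in range(1, r + 1)]  # pointwise limit
# M bounds the truncated norms of all x^(n); it is attained at n = 1.
M = lp_norm([2.0 / k**2 for k in range(1, r + 1)], p)

for n in (1, 10, 100, 1000):
    xn = [(1 + 1 / n) / k**2 for k in range(1, r + 1)]
    diff = [a - b for a, b in zip(x, xn)]
    # Minkowski: ||x|| <= ||x - x^(n)|| + ||x^(n)|| <= ||x - x^(n)|| + M.
    assert lp_norm(x, p) <= lp_norm(diff, p) + lp_norm(xn, p) + 1e-12
    assert lp_norm(xn, p) <= M + 1e-12
    print(n, lp_norm(diff, p), sup_norm(diff))

assert lp_norm(x, p) <= M
```

As n grows, the printed distances shrink, which is the numerical counterpart of letting $n \to \infty$ in the displayed inequality.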

(iv) Consider $C^n_{[a,b]}$ as the space of all n-times differentiable real-valued functions on a compact interval [a,b]. It is easily seen that $C^n_{[a,b]}$ is a linear space. We introduce the following norm in $C^n_{[a,b]}$:

Clearly, $\|\cdot\|$ is a norm in $C^n_{[a,b]}$. We show that $C^n_{[a,b]}$ is a Banach space under this norm. Let $\{f_k\}$ be a $\|\cdot\|$-Cauchy sequence. Then, for every $\varepsilon > 0$, there is a positive integer N such that $\forall k, j > N$,

which implies

Therefore, by the well-known theorem from calculus (cf. Theorem 4.2, p. 508, in Fisher [1983]), there exists a function $g_i : [a,b] \to \mathbb{R}$ to which the sequence $\{f_j^{(i)} : j = 1,2,\dots\}$ converges uniformly, and $g_i$ is continuous, $i = 0,1,\dots,n$. On the other hand, it holds that

$$f_k^{(i-1)}(x) - f_k^{(i-1)}(a) = \int_{[a,x]} f_k^{(i)}(u)\,du, \quad i = 1,\dots,n.$$

Let $k \to \infty$ in the above equation. Since the convergence is uniform, we may interchange the limit and the integral (a more rigorous justification is due to the Lebesgue Dominated Convergence Theorem in Chapter 6) and have

$$g_{i-1}(x) - g_{i-1}(a) = \int_{[a,x]} g_i(u)\,du, \quad i = 1,\dots,n.$$

Consequently, we conclude that $g_{i-1}$ is differentiable on [a,b] and $g'_{i-1}(x) = g_i(x)$. Thus $g_0 \in C^n_{[a,b]}$, implying that $\|f_k - g_0\| \to 0$ and $C^n_{[a,b]}$ is a Banach space.

7.6 Definitions.

(i) Let X and Y be linear spaces over a field $\mathbb{F}$. A map $A : X \to Y$ is called a linear operator (with respect to $\mathbb{F}$) if

$$A(ax + by) = aA(x) + bA(y), \quad \forall a, b \in \mathbb{F},\ \forall x, y \in X.$$

(ii) A linear map $f : X \to \mathbb{F}$ (where X is a linear space over a field $\mathbb{F}$) is called a linear functional.

(iii) Replacing a field $\mathbb{F}$ in (i) and (ii) by a semifield $\mathbb{F}_+$, we have the notions of a semi-linear operator and a semi-linear functional, respectively.
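The defining identity of a linear operator in Definition 7.6 (i) can be tested on random inputs. The sketch below does this for a matrix map from $\mathbb{R}^3$ to $\mathbb{R}^2$ and for a squaring map that is not linear. Random sampling can only refute linearity, never prove it, and the names `apply_matrix` and `looks_linear` are assumptions of this sketch.

```python
import random


def apply_matrix(A, x):
    """Apply an n x m matrix A (a list of rows) to a vector x of length m."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def looks_linear(T, m, trials=200, tol=1e-9):
    """Check T(ax + by) = aT(x) + bT(y) on random samples from R^m."""
    rng = random.Random(0)
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(m)]
        y = [rng.uniform(-1, 1) for _ in range(m)]
        a, b = rng.uniform(-3, 3), rng.uniform(-3, 3)
        lhs = T([a * s + b * t for s, t in zip(x, y)])
        rhs = [a * u + b * v for u, v in zip(T(x), T(y))]
        if any(abs(p - q) > tol for p, q in zip(lhs, rhs)):
            return False
    return True


# A 2 x 3 matrix, so it acts as a map from R^3 to R^2 (illustrative entries).
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]

print(looks_linear(lambda x: apply_matrix(A, x), 3))    # matrix map
print(looks_linear(lambda x: [t * t for t in x], 3))    # coordinatewise squaring
```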

PROBLEMS

7.1

Show that $(\mathbb{R}^n, \|\cdot\|_e)$ defined in Example 7.5 (i) is an NLS and then show that it is a Banach space.

7.2

Define the space $l^\infty$ as the set of all bounded sequences $x = \{x_1,x_2,\dots\} \subseteq \mathbb{C}$. Show that $l^\infty$ is an NLS with the norm defined as $\|x\| = \sup\{|x_i| : i = 1,2,\dots\}$.

7.3

Define the space $c \subseteq l^\infty$ as the subset of all convergent sequences and let $c_0 \subseteq c$ be the set of all sequences convergent to zero. Show that $c$ and $c_0$ are normed linear subspaces of $l^\infty$ with the same norm as that in Problem 7.2.

7.4

Let $\mathcal{B}(\Omega)$ be the space of all bounded real-valued functions on $\Omega$. Show that $\mathcal{B}(\Omega)$ is a linear space. Let $\|f\|_u = \sup\{|f(\omega)| : \omega \in \Omega\}$ be the supremum norm defined in Example 7.5 (iii). Show that the supremum norm in $\mathcal{B}(\Omega)$ is indeed a norm and show that $\mathcal{B}(\Omega)$ is a Banach space with respect to this norm.

7.5

Show that $\|\cdot\|_p$ in Example 7.5 (ii) is a norm.

7.6

Show that the Cauchy sequence $\{x^{(n)}\}$ in Example 7.5 (ii) is uniformly bounded.

7.7

Show that the pointwise limit x of the sequence $\{x^{(n)}\}$ in Example 7.5 (ii) is also an $l^p$-limit.

7.8

Show that the differential operator $\frac{d^n}{dx^n} : C^n_{[a,b]} \to C_{[a,b]}$ is linear with respect to $\mathbb{R}$.

7.9

Let A be an n × m matrix. Show that $A : \mathbb{R}^m \to \mathbb{R}^n$ is a linear operator with respect to $\mathbb{R}$.

7.10

Let $\|\cdot\|$ be a real-valued nonnegative function defined on a linear space X over a field $\mathbb{F}$ (which is $\mathbb{R}$ or $\mathbb{C}$) and let it have properties (i-iii) of Theorem 7.3. Show that $\|\cdot\|$ generates a TIH metric on X by $d(x,y) = \|x - y\|$.

NEW TERMS: translation invariant metric 100 homothetic metric 100 TIH metric 100 norm 100 normed linear space (NLS) 100 NLS 100 semi-norm 101 semi-normed linear space (SNLS) 101 SNLS 101 Euclidean norm 101 Banach space 101 $l^p$-norm 101 supremum norm 102 $C^n$-norm 102 linear operator 103 linear functional 103 semi-linear operator 103 semi-linear functional 103

Chapter 3

Elements of Point Set Topology

1. TOPOLOGICAL SPACES

In Definition 4.5, Chapter 2, we called the collection of all open sets $\tau(d)$ of a metric space (X,d) the topology induced by a metric. We recall that this collection of open sets, or topology, is closed with respect to the formation of arbitrary unions and finite intersections. The topology of a metric space carries the main information about its structural quality. For instance, equivalent metrics possess the same topology. In addition, through the topology we can establish the continuity of a function (see Theorem 4.6, Chapter 2) without need of a metric. All this leads to the idea of defining a structure on a set more general than distance, a structure that preserves convergence and continuity.

Mathematics historians are not in complete agreement about the roots of topology and who should get full credit for being its initiator. Most consider that topology, as the theory of structures, has its basis in the work of the German mathematician Felix Hausdorff, who published his fundamental monograph, Grundzüge der Mengenlehre (Principles of Set Theory), in Leipzig, in 1914. It was "immediately" preceded by Maurice Fréchet's 1906 pioneering introduction to metric spaces. (Notice that contemporary topology has branched out into several specialized areas, such as general topology, algebraic topology, and combinatorial topology. The very topology founded by Hausdorff was what we now refer to as general topology, also called point set topology, which is deeply bound to classical analysis.) Bourbaki [1994] regarded the work of the German mathematician Bernhard Riemann (his doctoral and habilitation theses and a paper on abelian functions) from 1851 to 1857 as revolutionary and qualified him as the creator of topology, since he was the first to recognize where topological ideas were needed.
In 1870, Georg Cantor (apparently inspired by Riemann's work), in connection with the representation of real-valued functions by Fourier series, was concerned with the characterization of sets on which the function's value can be altered leaving the series invariant. This yielded more advanced concepts of topological accumulation point (earlier introduced by Karl Weierstrass), derived set, closed set, connected set, dense set and others that further led to the topological big bang. The word topology was introduced for the first time in 1836 by German Johann B. Listing, who used this as the notion of a "new analysis."


Topology has evolved further ever since. Most of the fundamental results in general topology were developed in works by the Germans Felix Hausdorff, Heinrich Hopf, and Hermann Weyl, the Russians Pavel Alexandrov and Pavel Urysohn, the Poles Stefan Banach, Kazimierz Kuratowski, and Wacław Sierpiński, the Americans Eliakim H. Moore and James Alexander, and the Bourbaki group of French mathematicians.

1.1 Definition. Let $X \neq \emptyset$. A collection $\tau$ of subsets of X is called a topology on X or a family of open sets, if:

(i) $X, \emptyset \in \tau$.

(ii) $\{O_i : i \in I\} \subseteq \tau \Rightarrow \bigcup_{i \in I} O_i \in \tau$.

(iii) $\tau$ is ∩-stable, i.e., $O_1, O_2 \in \tau \Rightarrow O_1 \cap O_2 \in \tau$.

[Observe that property (iii) implies inductively that the intersection of any finite collection of open subsets will also be open.]

A carrier X endowed with a topology $\tau$ is said to be a topological space. The topological space is denoted by $(X,\tau)$. □

1.2 Examples.

(i) Let (X,d) be a metric space and let $\tau(d)$ be the topology generated by the metric d (see Definition 4.5, Chapter 2). Due to Theorem 2.5, Chapter 2, the collection of all open sets generated by metric d is closed under arbitrary unions and finite intersections. Moreover, $\emptyset$ and X are also open, so that $\tau(d)$ is indeed a topology as it was defined above. For instance, the topology in $\mathbb{R}^n$ generated by the Euclidean metric $d_e$ is called the usual (or standard or natural) topology and it is denoted by $\tau_e$.

(ii) Let X be a nonempty set. Then the pair $\{X, \emptyset\} = \tau_0$ is a trivial example of a topology. It is obviously the smallest topology on X, and it is called the indiscrete topology. Another trivial example of a topology is $\mathcal{P}(X)$, the collection of all subsets of X. This is the largest possible topology on X, and it is called the discrete topology.

(iii) For $A \subseteq X$, $\tau_1 = \{X, \emptyset, A\}$ is a topology "induced by set A."

(iv) Let $X = \bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty\} \cup \{+\infty\}$ be the extended real line. Let $\bar{\tau} \subseteq \mathcal{P}(X)$ be the following collection of sets: $O \in \bar{\tau}$ if and only if

1) $O \cap \mathbb{R} \in \tau_e$;

2) if $\infty \in O$ or $-\infty \in O$, then there is an $a \in \mathbb{R}$ such that $(a,\infty] \subseteq O$ or an $a \in \mathbb{R}$ such that $[-\infty,a) \subseteq O$, respectively.

Then $\bar{\tau}$ is a topology on $\bar{\mathbb{R}}$ (see Problem 1.1).

(v) Let $(X,\tau)$ be a topological space and let $Y \subseteq X$. Define the system of subsets $\tau_Y = \{O \cap Y : O \in \tau\}$. We show that $\tau_Y$ is a topology on Y. Indeed, Y and $\emptyset$ obviously belong to $\tau_Y$. Let $\{U_i : i \in I\} \subseteq \tau_Y$. Then, $\forall i \in I$, there is $O_i \in \tau$ such that $O_i \cap Y = U_i \in \tau_Y$. Now $\bigcup_{i \in I} O_i \in \tau$ and therefore $Y \cap \bigcup_{i \in I} O_i \in \tau_Y$. On the other hand, due to the distributive law,

$$Y \cap \bigcup_{i \in I} O_i = \bigcup_{i \in I} (Y \cap O_i) = \bigcup_{i \in I} U_i.$$

It can similarly be shown that $\tau_Y$ is closed with respect to the formation of all finite intersections. Therefore, $\tau_Y$ is a topology on $Y \subseteq X$, called the relative topology of $\tau$ on Y. The pair $(Y,\tau_Y)$ is called a subspace. In some older textbooks, the topology $\tau_Y$ is also called the trace of Y in $\tau$. For instance, take the Euclidean metric space $(\mathbb{R},d_e)$ and let Y = [0,1]. Then the set $(\tfrac{1}{2},1]$ is open in $(Y,\tau_Y)$. □
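On a finite carrier, the axioms of Definition 1.1 and the relative topology of Example 1.2 (v) can be checked mechanically. The sketch below assumes a hypothetical four-point carrier with a topology loosely patterned on Example 1.2 (iii); `is_topology` and `relative_topology` are names introduced here for illustration only.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check the axioms of a topology for a finite family tau on X."""
    X = frozenset(X)
    tau = {frozenset(O) for O in tau}
    if X not in tau or frozenset() not in tau:
        return False
    # For a finite family, closure under pairwise unions and pairwise
    # intersections already gives closure under all unions and finite
    # intersections (by induction).
    return all((U | V) in tau and (U & V) in tau
               for U, V in combinations(tau, 2))


def relative_topology(tau, Y):
    """Trace of tau on Y: all sets O & Y with O in tau."""
    Y = frozenset(Y)
    return {frozenset(O) & Y for O in tau}


X = {1, 2, 3, 4}                               # hypothetical finite carrier
tau = [set(), {1}, {3, 4}, {1, 3, 4}, X]
print(is_topology(X, tau))                     # family closed under both operations
print(is_topology(X, [set(), {1}, {2}, X]))    # {1} | {2} is missing

Y = {1, 2, 3}
tau_Y = relative_topology(tau, Y)
print(sorted(sorted(O) for O in tau_Y))
print(is_topology(Y, tau_Y))
```

Note that {3} is open in the subspace Y although it is not open in X, just as $(\tfrac{1}{2},1]$ is open in [0,1] but not in $\mathbb{R}$.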

1.3 Definitions.

(i) Let X be a non-empty set and let $\tau$ and $\tau'$ be two topologies on X. If $\tau \subseteq \tau'$, then we say $\tau$ is weaker (or smaller or coarser) than $\tau'$. We also say that $\tau'$ is stronger (or larger or finer) than $\tau$. As it follows from Examples 1.2 (ii) and (iii), $\tau_0 \subseteq \tau_1 \subseteq \mathcal{P}(X)$. The indiscrete topology is, therefore, the coarsest topology on X, while $\mathcal{P}(X)$ is the finest topology on X.

(ii) If (X,d) is a metric space and $\tau(d)$ is the topology induced by metric d (also called the metric topology), then $(X,\tau(d))$ is said to be a metrizable (topological) space. Therefore, a metrizable space is a topological space with a topology that comes from some metric.

1.4 Definition. Let $(X,\tau)$ be a topological space. A subset $A \subseteq X$ is called $\tau$-closed or just closed if $A^c \in \tau$. □

As in the case of metric spaces, we can easily prove that X and $\emptyset$ are closed, finite unions of closed sets are closed, and arbitrary intersections of closed sets are closed.

In Definitions 1.5 below we introduce some important notions for topological spaces. It will be advantageous to support these definitions by examples immediately after the notions are introduced. To reference the examples, we assign them the letter D followed by the number of the definition.

1.5 Definitions.

(i) Let $(X,\tau)$ be a topological space. A subset $A \subseteq X$ is called a neighborhood of a point $x \in X$ if x belongs to some open subset of A. Specifically, if $A \in \tau$ then A is called an open neighborhood of x.

[Example D1.5(i). Let $X = \mathbb{R}$ and $\tau = \{\mathbb{R}, \emptyset, \{1\}, (3,4], \{1\} \cup (3,4]\}$. Then {1} is an open neighborhood of 1, [3,5] is a neighborhood of $3\tfrac{1}{2}$, (-2,0) is not a neighborhood of -1, and $\mathbb{R}$ is the only neighborhood of -1.]

(ii) A point x is called an interior point of a set A if A is a neighborhood of x. The set of all points interior to A is called the interior of A and is denoted by $\mathring{A}$ or by Int(A).

[Example D1.5(ii). In Example D1.5(i), 1 is the interior point of the set {1}. The interior of the set A = [3,5] is $\mathring{A} = (3,4]$.]

(iii) The collection of all neighborhoods of a point $x \in X$ is called the neighborhood system at x and it is denoted by $\mathfrak{U}_x$. An arbitrary subcollection $\mathfrak{B}_x \subseteq \mathfrak{U}_x$ is called a neighborhood base at x (or a fundamental system of neighborhoods of x), if every neighborhood $U \in \mathfrak{U}_x$ is a superset of at least one $B \in \mathfrak{B}_x$. Any element $B \in \mathfrak{B}_x$ is called a base neighborhood. Clearly, $\mathfrak{U}_x$ itself is a neighborhood base at x. Obviously, $\mathfrak{B}_x$ is a neighborhood base at x if and only if there is another neighborhood base $\mathfrak{D}_x$ such that every base neighborhood $D_x \in \mathfrak{D}_x$ is a superset of at least one base neighborhood B from $\mathfrak{B}_x$.

[Example D1.5(iii). Let $\{B(x,\tfrac{1}{n}) : n = 1,2,\dots\}$ be the sequence of $d_e$-open balls centered at a point $x \in \mathbb{R}^n$. Clearly, it is a fundamental system of neighborhoods of x. Another neighborhood base at x, which contains the above neighborhood base, is the system of all open balls with rational radii, centered at x. We can also take the system of all open balls with positive real radii, centered at x. This system contains the first two neighborhood bases.]

A neighborhood base $\mathfrak{B}_x$ at x is in general a more "economical system" of neighborhoods than the whole neighborhood system $\mathfrak{U}_x$; and, as it will be shown, it is as informative about the structure of the space in the vicinity of x as $\mathfrak{U}_x$ is. Technically, it is of greater advantage in various proofs to use a base neighborhood than to use an arbitrary neighborhood. As it follows from the definition, an arbitrary set A need not be a neighborhood of all of its points. For instance, [0,1] is not a neighborhood for points 0 and 1 in the usual topology $(\mathbb{R},\tau_e)$. More about the nature of neighborhoods is contained in the following propositions that the reader can easily verify.

1.6 Proposition. $A \subseteq X$ is a neighborhood for all of its points if and only if A is open. □

(See Problem 1.4.)

1.7 Proposition. $\mathring{A}$ is the largest open set contained in A. □

(See Problem 1.5.)

In particular, it follows that A is open if and only if $A = \mathring{A}$.

1.8 Definitions.

(i) $x \in X$ is called a closure point for a set A if any neighborhood of x has a nonempty intersection with A. We also say that any neighborhood of x meets A. The set of all closure points of A is called the closure of A and it is denoted by $\bar{A}$. [Sometimes, when working with relative topologies, it is necessary to emphasize that the closure of A is with respect to the carrier X; it is then advisable to use the notation $\mathrm{Cl}_X A$. However, for brevity we shall still use the notation $\bar{A}$ whenever X is the only carrier under consideration.]

[Example D1.8(i). In the topology introduced in Example D1.5(i), let us take A = (-2,0). Then we have

$$\bar{A} = (-\infty,1) \cup (1,3] \cup (4,\infty),$$

while $\mathring{A} = \emptyset$. Indeed, for any $x \in (-\infty,1)$, $\mathbb{R}$ is the only neighborhood of x; thus $\mathbb{R} \cap (-2,0) \neq \emptyset$. Observe that 1 is not a closure point of A, since {1} is a neighborhood (of 1) such that $\{1\} \cap A = \emptyset$. For the set B = {-1} we have

$$\bar{B} = (-\infty,1) \cup (1,3] \cup (4,\infty).]$$

(ii) A subset $A \subseteq X$ is said to be dense in X if $\bar{A} = X$. $A \subseteq X$ is said to be nowhere dense if $\mathrm{Int}(\bar{A}) = \emptyset$.

[Example D1.8(ii). Consider Example D1.8(i). For A = (-2,0),

$$\bar{A} = (-\infty,1) \cup (1,3] \cup (4,\infty) \neq X,$$

while $\mathrm{Int}(\bar{A}) = \emptyset$, i.e. A is nowhere dense. The set $C = \{-1\} \cup \{1\} \cup (3,4]$ is dense in X.]

(iii) A point $x \in X$ is called an accumulation point (or cluster point) of a set A if every neighborhood of x contains at least one point of A other than x. The set of all accumulation points is called the derived set and is denoted by $A'$.

[Example D1.8(iii). In Example D1.8(i), $A' = \bar{A}$.]

(iv) A point $x \in X$ is called a boundary point of a set A if every neighborhood of x contains at least one point of A and at least one point of $A^c$. The set of all boundary points of A is denoted by $\partial A$ and called the boundary of A.

[Example D1.8(iv). In Example D1.8(i), $\mathring{A} = \emptyset$ and $\partial A = \bar{A}$.]

(The closure of A is evidently the smallest closed set containing A; and A is closed if and only if $A = \bar{A}$. See Problem 1.6.)

(v) A topological space $(X,\tau)$ is called separable if there exists an at most countable dense subset of X.
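The notions of Definitions 1.5 and 1.8 become computable on a finite carrier. The sketch below assumes a five-point analogue of Example D1.5(i), in which the open sets {1} and {3} play the roles of {1} and (3,4]; the carrier, the sets, and all function names are hypothetical and serve only to illustrate the definitions.

```python
def interior(A, tau):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    result = frozenset()
    for O in tau:
        if O <= A:
            result |= O
    return result


def closure(A, X, tau):
    """Points all of whose open neighborhoods meet A."""
    A = frozenset(A)
    return frozenset(x for x in X if all(O & A for O in tau if x in O))


def derived_set(A, X, tau):
    """Points all of whose open neighborhoods contain a point of A other than x."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all((O & A) - {x} for O in tau if x in O))


def boundary(A, X, tau):
    """Points all of whose neighborhoods meet both A and its complement."""
    A = frozenset(A)
    return closure(A, X, tau) & closure(frozenset(X) - A, X, tau)


X = frozenset({0, 1, 2, 3, 4})
tau = [frozenset(s) for s in (set(), {1}, {3}, {1, 3}, X)]

A = {0, 2}           # plays the role of (-2, 0): it misses both small open sets
C = {0, 1, 3}        # meets every nonempty open set

print(sorted(closure(A, X, tau)))       # closure of A
print(sorted(interior(A, tau)))         # interior of A
print(sorted(derived_set(A, X, tau)))   # derived set of A
print(sorted(boundary(A, X, tau)))      # boundary of A
print(closure(C, X, tau) == X)          # is C dense?
```

It suffices to test open neighborhoods, because every neighborhood of x contains an open set containing x. In this toy space the derived set and the boundary of A both coincide with its closure, and the interior of the closure is empty, which parallels Examples D1.8(i) to D1.8(iv).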

PROBLEMS

1.1

Show that the collection is a topology in R.

1.2

Let X be a nonempty set and $\tau = \{X, \emptyset, C^c : C \subseteq X \text{ and } C \text{ is finite}\}$. Show that $\tau$ is a topology on X. $\tau$ is called the cofinite (or finite complement) topology on X.

1.3

Let $X = \mathbb{R}$ and let $\tau = \{X, \emptyset, (-\infty,1], [1,\infty), (3,10]\}$. Is $\tau$ a topology on $\mathbb{R}$? If not, supplement $\tau$ by some subsets to a topology (and be reasonable).

1.4

Prove Proposition 1.6.

1.5

Prove Proposition 1.7. [Hint: Show that $\mathring{A}$ contains all open sets that are contained in A and use Proposition 1.6.]

1.6

Show that the closure of A is the smallest closed set containing A; and A is closed if and only if $A = \bar{A}$.

1.7

Show that (a) A C B

; i of

sets introduced in Example 1.2 (iv)

A E B, ( b )

= 2 U B, (c) AnB

C _ ~ f l ~ ai n td( ~ n ~ ) = X f l IS h .i n t ~ = ~ ?

2 = A U aA.

1.8

Show that

1.9

For X being an infinite set, define $\tau = \{X, \emptyset, C^c : C \text{ is at most countable}\}$. Show that $\tau$ is a topology on X. We call such a topology cocountable (or the countable complement topology).

1.10

Show that 2 = A + a A [Hint: Proceed in the same way as in Problem 3.2, Chapter 2, and work with a neighborhood instead of a ball.]

1.11

Prove that a subset of a topological space is closed if and only if it contains all of its accumulation points.

1.12

Let ?=(W,(-1,1],[0,5),(0},{10)). a) Extend ? to the smallest topology

T

in R generated by ?.

b ) Let A = ( - 7 , - 51, B = (0,7],and C = [ -

k,20). Find the sets

A , B , C , i , b , & A',Bt,C', , aA, aB, and 6'C. Determine whether A,B and C are dense in R.

1.13

Show that $\partial A = \emptyset$ if and only if A is open and closed.

1.14 Show that (2)' C p. 1.15

Show that the inverse inclusion in the previous problem holds if and only if A is closed and open.

1.16

This provides an equivalent definition of a closure point. Show that $x \in \bar{A}$ if and only if $\forall U_x \in \mathfrak{U}_x$, $U_x \cap \bar{A} \neq \emptyset$.


NEW TERMS: topology 108 open sets 108 ∩-stable family of sets 108 topological space 108 usual topology 108 standard (natural) topology 108 indiscrete topology 108 discrete topology 108 topology induced by a set 108 topology on the extended real line 108 relative topology (subspace) 109 subspace 109 trace of a set in a topology 109 weaker (coarser, smaller) topology 109 coarser topology 109 stronger (finer, larger) topology 109 finer topology 109 metric topology 109 metrizable topological space 109 closed set 109 neighborhood of a point 109 open neighborhood of a point 109 interior point 110 interior of a set 110 neighborhood system at a point 110 neighborhood base at a point 110 fundamental system of neighborhoods at a point 110 base neighborhood 110 closure point for a set 111 neighborhood of a point that meets a set 111 closure of a set 111 dense set 111 nowhere dense set 111 accumulation (cluster) point 111 cluster (accumulation) point 111 derived set 111 boundary point of a set 111 boundary of a set 112 separable topological space 112 cofinite (finite complement) topology 112 cocountable (countable complement) topology 112

2. BASES AND SUBBASES FOR TOPOLOGICAL SPACES

In the previous section, we introduced the notion of a collection of open sets, called a topology. In many applications, describing an entire topology on a carrier is difficult and sometimes even impossible. This predicament is manageable if one deals instead with a sort of "pre-topology," a smaller collection of sets, which is not a topology, but which generates a topology and thereby can be extended to a topology. With a similar idea, we came to introduce neighborhood bases. Take, for example, a metric space. While the family of all open balls does not form a topology, every open set, as we know, can be made of the union of some subcollection of open balls; consequently, this family leads to a topology and gives rise to the notion of a base for a topology.

2.1 Definition. Let $(X,\tau)$ be a topological space. A subcollection $\mathfrak{B}$ of open sets is called a base for $\tau$ if every open set is a union of some elements of $\mathfrak{B}$. (Specifically, it follows that $\emptyset$ must be an element of $\mathfrak{B}$.) The elements of $\mathfrak{B}$ are called base sets. □

With no major difficulty (and with hints provided), the reader can establish a very useful criterion of a base for $\tau$, subject to Problem 2.2. An important relation between bases and neighborhood bases is given in the following theorem.

2.2 Theorem. $\mathfrak{B}$ is a base for $\tau$ if and only if $\emptyset \in \mathfrak{B}$ and, for every point $x \in X$, there is a neighborhood base $\mathfrak{B}_x$ consisting of open sets such that $\mathfrak{B}_x \subseteq \mathfrak{B}$.

Proof. We have to show that $\mathfrak{B}$ is a base for $\tau$ if and only if, for every $x \in X$ and each neighborhood $U_x$ of x, there is a base neighborhood $B_x \in \mathfrak{B}$ such that $B_x \subseteq U_x$.

(i) Let $\mathfrak{B}$ be a base for $\tau$ and let $U_x$ be a neighborhood of a point $x \in X$. Without loss of generality we assume that $U_x$ is open. (Otherwise, take any open neighborhood $O_x \subseteq U_x$ of x and work with $O_x$ instead.) If $U_x$ is open, there exists a subcollection of $\mathfrak{B}$ whose union equals $U_x$. Thus, at least one set of this subcollection, say $B_x$ ($\in \mathfrak{B}$), must contain x, and $B_x \subseteq U_x$. Observe that by Definition 1.5 (iii), $B_x$ is then an element of a neighborhood base, and the collection $\mathfrak{B}_x = \{B_x\}$ of all such sets forms a neighborhood base of x. Therefore, the neighborhood system $\mathfrak{U}_x$ of x has at least one neighborhood base $\mathfrak{B}_x$ of x such that $\mathfrak{B}_x \subseteq \mathfrak{B}$ and each $U_x \in \mathfrak{U}_x$ is a superset of at least one $B_x \in \mathfrak{B}_x$.

(ii) Let $\mathfrak{B} \subseteq \tau$ and assume that for every $x \in X$, there is a neighborhood base $\mathfrak{B}_x \subseteq \mathfrak{B}$. Let O be an arbitrary open set. Then, by our assumption and by the definition of a neighborhood base, for any point $x \in O$ (since O is a neighborhood of x), there is a base neighborhood $B_x \in \mathfrak{B}_x$ such that $B_x \subseteq O$. Thus

$$O = \bigcup_{x \in O} B_x$$

(union of all such $B_x \in \mathfrak{B}$). Hence, every open set $O \in \tau$ can be composed of a union of some elements of $\mathfrak{B}$, or equivalently, $\mathfrak{B}$ is a base for $\tau$. □
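For a finite topology, both sides of Theorem 2.2 can be computed and compared. The sketch below assumes a hypothetical four-point carrier. It follows the convention of Definition 2.1 that only unions of nonempty subfamilies are formed, so the empty set has to belong to the base. The function names are introduced here for illustration.

```python
from itertools import combinations


def unions_of(base):
    """All unions of nonempty subfamilies of a finite family of sets."""
    base = [frozenset(B) for B in base]
    result = set()
    for k in range(1, len(base) + 1):
        for family in combinations(base, k):
            result.add(frozenset().union(*family))
    return result


def is_base(base, tau):
    """Definition 2.1: base sets are open and their unions give all open sets."""
    base = {frozenset(B) for B in base}
    tau = {frozenset(O) for O in tau}
    return base <= tau and unions_of(base) == tau


def satisfies_theorem_2_2(base, tau):
    """Criterion of Theorem 2.2, tested on open neighborhoods only."""
    base = {frozenset(B) for B in base}
    tau = {frozenset(O) for O in tau}
    if not base <= tau or frozenset() not in base:
        return False
    # For each open O and each x in O there must be a base set B with x in B <= O.
    return all(any(x in B and B <= O for B in base) for O in tau for x in O)


X = {1, 2, 3, 4}                                   # hypothetical finite carrier
tau = [set(), {1}, {3, 4}, {1, 3, 4}, X]

good = [set(), {1}, {3, 4}, X]
bad = [set(), {1}, X]

print(is_base(good, tau), satisfies_theorem_2_2(good, tau))
print(is_base(bad, tau), satisfies_theorem_2_2(bad, tau))
```

The enumeration in `unions_of` is exponential in the size of the base, so it is only meant for very small examples such as this one.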

2.3 Examples.

(i) Let $\{\mathfrak{B}_x : x \in X\}$ be an arbitrary collection of open neighborhood bases at all points. Then $\bigcup_{x \in X} \mathfrak{B}_x$ can be regarded as an example of a base for $\tau$. Indeed, as in Theorem 2.2, take a point x of any open set O. Then O is a neighborhood of x and thus it belongs to the neighborhood system at x. By the definition, a neighborhood base $\mathfrak{B}_x \in \{\mathfrak{B}_x : x \in X\}$ is such that there is at least one base neighborhood $B_x$ of x included in O. Collecting all such neighborhoods of all points of O, we can represent O as the union $\bigcup_{x \in O} B_x$. Hence, $\bigcup_{x \in X} \mathfrak{B}_x$ is a base for the topology $\tau$.

(ii) As mentioned at the beginning of this section, in any metric space (X,d), the collection of all open balls is a trivial example of a base for the corresponding metrizable topological space. Indeed, by Definition 2.3, Chapter 2, for each open neighborhood $O_x$ of $x \in X$, there exists an open ball $B(x,\varepsilon) \subseteq O_x$. Earlier (in Example D1.5(iii)), we showed that $B(x,r)$ is a base neighborhood at x. Thus by Theorem 2.2, the system $\{B(x,\varepsilon) : x \in X, \varepsilon > 0\}$ is a base for $\tau(d)$. As in Example D1.5(iii), a neighborhood base at x can be reduced to the system $\mathfrak{B}_x = \{B(x,q) : q \in \mathbb{Q}, q > 0\}$ of all balls with rational radii. Consequently, by Theorem 2.2, the collection of all open balls with rational radii is a base for $\tau(d)$. [Note that these balls are centered at all $x \in X$, so this base need not be countable.]

(iii) We give a rather informal definition of an open parallelepiped in $(\mathbb{R}^n, \tau_e)$. More formalism is brought in Section 5. A set

$$P = O^{(1)} \times \dots \times O^{(n)}$$

is called an open parallelepiped (or rectangle) in $\mathbb{R}^n$ if each $O^{(i)}$ is an open set in $\mathbb{R}$. An open parallelepiped is said to be base (or simple) if each $O^{(i)}$ is an open interval. Let $\mathfrak{P}$ be the system of all base parallelepipeds in $(\mathbb{R}^n,\tau_e)$ along with the empty set $\emptyset$. Let $x \in \mathbb{R}^n$ and let $O_x$ be any open neighborhood of x. Then, there is an open ball $B(x,r) \subseteq O_x$. On the other hand, there obviously is a base parallelepiped $P_x$ "centered" at x that can be inscribed into this ball, and this implies that $P_x \subseteq O_x$. Therefore, the system $\mathfrak{P}_x$ of all open base parallelepipeds centered at x is a neighborhood base at x; and again by Theorem 2.2, $\mathfrak{P}$, which contains $\mathfrak{P}_x$ for every $x \in \mathbb{R}^n$, is a base for $(\mathbb{R}^n,\tau_e)$. Observe that the system of all "rational" parallelepipeds (i.e. those base ones with rational coordinates) is also a base for $(\mathbb{R}^n,\tau_e)$.

(iv) The collection of all singletons $\{x\} \in \mathcal{P}(X)$, along with $\emptyset$, is a base for the discrete topology on X. □

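2.4 Remarks.

(i) Let $\tau_1$ and $\tau_2$ be two topologies on X and let $\mathfrak{B}_1$ be a base for $\tau_1$. If $\mathfrak{B}_1 \subseteq \tau_2$ then $\tau_1 \subseteq \tau_2$. [Observe that $\mathfrak{B}_1$ need not be a base for $\tau_2$.] Indeed, by the definition of a base, each $O^1 \in \tau_1$ can be represented as $O^1 = \bigcup_i B_i$ with $B_i \in \mathfrak{B}_1$. However, $B_i \in \tau_2$ implies that $\bigcup_i B_i = O^1 \in \tau_2$.

(ii) Let $\tau_1$ and $\tau_2$ be two topologies on X with a common base $\mathfrak{B}$. Then, by (i), $\tau_1 \subseteq \tau_2$ and $\tau_2 \subseteq \tau_1$, and thus $\tau_1 = \tau_2$. In other words, a base uniquely defines a topology. Note that although one topology may have different bases, one base cannot serve different topologies.

(iii) Let $\tau_1 \subseteq \tau_2$ and let $\mathfrak{B}_2$ be a base for $\tau_2$. It does not follow that $\mathfrak{B}_2$ is a base for $\tau_1$. In fact, $\mathfrak{B}_2$ need not even be a subcollection of $\tau_1$. However, if in addition $\mathfrak{B}_2 \subseteq \tau_1$, then by (i), $\tau_2 \subseteq \tau_1$ and therefore $\tau_1 = \tau_2$. Indeed, $\tau_1 \subseteq \tau_2$ together with $\tau_2 \subseteq \tau_1$ implies that $\tau_1 = \tau_2$. □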

In the construction of a topology on a carrier, it is often very helpful to start with a collection yet smaller and more rudimentary than a base. This is even more rewarding in the formation of product topologies and in quick and tame continuity criteria for functions. Recall that a function $f$ between two metric spaces $X$ and $Y$ is continuous if and only if inverse images under $f$ of open sets in $Y$ are open in $X$. Remarkably, continuity of $f$ can be verified on a (frequently) much smaller community of subbase sets in $Y$. This will be established and elaborated in Section 4 for topological spaces. We begin with the following:

2.5 Definition. Let $\mathcal{S} \subseteq \mathcal{P}(X)$ be such that $\bigcup_{A \in \mathcal{S}} A = X$. If there exists the weakest topology containing $\mathcal{S}$, then it is called the topology generated by $\mathcal{S}$, and the collection $\mathcal{S}$ is called a subbase on $X$. [Note that $\mathcal{S}$ can directly restore only $X$, while a base $\mathcal{B}$ restores all open sets, including $\emptyset$. Clearly, a base $\mathcal{B}$ for a topology $\tau$, besides $\tau$ itself, offers a trivial example of a subbase on $X$.]

To justify Definition 2.5 we need:

2.6 Proposition. The weakest topology generated by a subbase exists.

Proof. Clearly, there exists a topology containing $\mathcal{S}$ (for instance, $\mathcal{P}(X)$). Then define $\tau(\mathcal{S})$ as the intersection of all topologies containing $\mathcal{S}$. We show that $\tau(\mathcal{S})$ is a topology on $X$.

(i) $X$ and $\emptyset$ belong to all topologies containing $\mathcal{S}$. Therefore $X$ and $\emptyset \in \tau(\mathcal{S})$.

(ii) Let $O_1,O_2,\ldots,O_n \in \tau(\mathcal{S})$. Then $O_1,O_2,\ldots,O_n$ are elements of every topology containing $\mathcal{S}$. This implies that $\bigcap_{k=1}^{n} O_k$ belongs to all topologies containing $\mathcal{S}$, and thus it belongs to $\tau(\mathcal{S})$.

(iii) By similar arguments, $\tau(\mathcal{S})$ is closed relative to the formation of arbitrary unions. Obviously, $\tau(\mathcal{S})$ is the weakest topology containing $\mathcal{S}$.

The following theorem shows that the way we generated the weakest topology $\tau(\mathcal{S})$ over a collection $\mathcal{S}$ of "primitive" sets, or a subbase, by extending this collection to one closed with respect to the formation of finite intersections and arbitrary unions, takes place in the construction of arbitrary topologies. [It seems plausible to supplement $\mathcal{S}$ by $X$, $\emptyset$, and all unions and finite intersections of elements of $\mathcal{S}$.] In addition, the theorem shows that the extension of a subbase to a $\cap$-stable supercollection makes a base for the weakest topology $\tau(\mathcal{S})$.

2.7 Theorem. Let $\mathcal{S}$ be an arbitrary subcollection of $\mathcal{P}(X)$ with $\bigcup_{S \in \mathcal{S}} S = X$, and let $\mathcal{B}$ be the collection that contains $\emptyset$ and all finite intersections of elements of $\mathcal{S}$. Then $\mathcal{B}$ is a base for $\tau(\mathcal{S})$.

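Proof. Let

$\tau' = \{\bigcup_{B \in \mathcal{B}'} B : \mathcal{B}' \subseteq \mathcal{B}\}$,

where $\mathcal{B}$ is defined in the condition of this statement. We show that $\tau'$ is a topology on $X$. It is sufficient to show that $\tau'$ contains all finite intersections; the other properties of $\tau'$ as a topology are obvious. Also, for brevity in notation, we show this for the case of the intersection of two open sets. Let $U$ and $V$ be two elements of $\tau'$. By the definition of $\mathcal{B}$,

$U = \bigcup_{i \in I} U_i$ and $V = \bigcup_{j \in J} V_j$ $(U_i, V_j \in \mathcal{B})$,

where

$U_i = \bigcap_{k=1}^{n_i} S^i_k$ and $V_j = \bigcap_{s=1}^{m_j} S^j_s$ $(S^i_k, S^j_s \in \mathcal{S})$.

Then

$U \cap V = \bigcup_{i \in I} \bigcup_{j \in J} (U_i \cap V_j)$,

and each $U_i \cap V_j$ is again a finite intersection of elements of $\mathcal{S}$, so that $U \cap V \in \tau'$.

Now, since obviously $\mathcal{B}$ is a base for $\tau'$ and $\mathcal{B} \subseteq \tau(\mathcal{S}) \subseteq \tau'$, by Remark 2.4 (iii), identifying $\tau(\mathcal{S})$ as $\tau_1$, $\tau'$ as $\tau_2$, and $\mathcal{B}$ as $\mathcal{B}_2$, we have $\tau(\mathcal{S}) = \tau'$. In particular, we see that $\mathcal{B}$ is a base for $\tau(\mathcal{S})$.

2.8 Examples.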

(i) In Example 2.3 (iii), it was shown that the system $\mathcal{P}$ of all base parallelepipeds is a base for $(\mathbb{R}^n,\tau_e)$. On the other hand, it is easily seen that $\mathcal{P}$ is closed relative to the formation of all finite intersections (recall that $\emptyset$ is also in $\mathcal{P}$). Thus, $\mathcal{P}$ is a base for $\tau(\mathcal{P})$, according to Theorem 2.7. Furthermore, $\mathcal{P}$ is a base for $\tau_e$. Thus, by Remark 2.4 (ii), $\tau_e$ and $\tau(\mathcal{P})$ coincide. In other words, the natural topology $\tau_e$ on $\mathbb{R}^n$ is generated by the system of all base parallelepipeds. In another situation, we can take for $\mathcal{S}$ the system of all open parallelepipeds with rational coordinates, which is certainly closed relative to all finite intersections. Then, $\tau_e$ would also be generated by the system of all rational parallelepipeds. [Recall that the metric $d_e$ and the supremum metric are equivalent in $\mathbb{R}^n$. No wonder that $\tau_e$ and $\tau(\mathcal{P})$ coincide.]

(ii) In another scenario of $(\mathbb{R}^n,\tau_e)$, the collection of open parallelepipeds of types $\pi_i^*((a_i,b_i)) = \mathbb{R} \times \cdots \times \mathbb{R} \times (a_i,b_i) \times \mathbb{R} \times \cdots \times \mathbb{R}$, where the $(a_i,b_i)$'s are open intervals in $\mathbb{R}$, $i = 1,\ldots,n$, forms a subbase for $\tau_e$. [Note that none of the $\pi_i^*((a_i,b_i))$ is a base parallelepiped.] This collection can be extended to a base $\mathcal{B}$ for $\tau_e$ by including in $\mathcal{B}$ the empty set $\emptyset$ and all finite intersections of the subbase parallelepipeds. The base $\mathcal{B}$ evidently contains $\mathcal{P}$ (why?).
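The construction of Theorem 2.7 can be tried out on a finite carrier, where every collection of sets can be listed explicitly. The following Python sketch is only an illustration under our own assumptions: the carrier `X`, the subbase `S`, and the helper names `base_from_subbase`, `unions_of`, and `is_topology` are hypothetical and do not come from the text. It forms the base of all finite intersections of subbase sets (together with the empty set) and then all unions of base sets.

```python
from itertools import combinations

def base_from_subbase(subbase):
    """Empty set plus all finite intersections of subbase members."""
    members = [frozenset(s) for s in subbase]
    base = {frozenset()}
    for k in range(1, len(members) + 1):
        for combo in combinations(members, k):
            base.add(frozenset.intersection(*combo))
    return base

def unions_of(base):
    """All unions of subcollections of a finite collection of sets."""
    unions = {frozenset()}
    for b in base:
        # Adjoin b to every union found so far.
        unions |= {u | b for u in unions}
    return unions

def is_topology(tau, carrier):
    """Check the topology axioms directly (finite case)."""
    carrier = frozenset(carrier)
    if frozenset() not in tau or carrier not in tau:
        return False
    return all((u & v) in tau and (u | v) in tau for u in tau for v in tau)

# Hypothetical example: a subbase on X = {1, 2, 3, 4} whose union is X.
X = {1, 2, 3, 4}
S = [{1, 2}, {2, 3}, {3, 4}]
B = base_from_subbase(S)   # candidate base for tau(S)
tau = unions_of(B)         # candidate topology tau(S)
print(sorted(sorted(o) for o in tau))
```

In this small example the base has six members and the generated topology has nine open sets; the singleton $\{2\}$ is open (as $\{1,2\} \cap \{2,3\}$), while $\{1\}$ is not.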

PROBLEMS

2.1

Let $(X,\tau)$ be a topological space and let $\mathcal{B} \subseteq \tau$. Show that $\mathcal{B}$ is a base for $\tau$ if and only if for every open set $O \in \tau$ and each point $x \in O$, there is a subset $U$ of $O$ such that $x \in U \in \mathcal{B}$.

2.2

Show that $\mathcal{B} \subseteq \mathcal{P}(X)$ is a base for a topology on $X$ if and only if

(i) each $x \in X$ belongs to at least one set $B \in \mathcal{B}$ (or equivalently, $X = \bigcup_{B \in \mathcal{B}} B$) and

(ii) $\forall B_1,B_2 \in \mathcal{B}$ and $\forall x \in B_1 \cap B_2$, $\exists B \in \mathcal{B}$ such that $x \in B \subseteq B_1 \cap B_2$.

[Hint: Use the steps that follow. 1) If $\mathcal{B}$ is a base, then apply Theorem 2.2 (ii). 2) Let $\tau = \{\bigcup_{B \in \mathcal{B}'} B : \forall \mathcal{B}' \subseteq \mathcal{B}\}$. Show that $\tau$ is a topology on $X$ and that $\mathcal{B}$ is a base for $\tau$.]

2.3

Let $\mathcal{B}$ be a base for a topology $\tau$ on $X$. Since $\mathcal{B}$, in particular, is a subbase on $X$, it also generates the weakest topology $\tau(\mathcal{B})$ and hence $\tau(\mathcal{B}) \subseteq \tau$. Is $\tau(\mathcal{B}) = \tau$?

2.4

Let $\tau_l$ denote the topology on the real line generated by all semi-open intervals of type $[a,b)$ where $a,b \in \mathbb{R}$. This topology is called the lower limit topology. Show that $\{[a,b): a,b \in \mathbb{R}\}$ is a base for $\tau_l$ and that $\tau_l$ is strictly finer than $\tau_e$, the usual topology on the real line.

2.5

Let $\mathcal{B} = \{[a,b): a,b \in \mathbb{Q}\}$. Show that $\mathcal{B}$ is a base for the topology $\tau$ that $\mathcal{B}$ generates and that $\tau$ is strictly coarser than the lower limit topology $\tau_l$ of Problem 2.4.

2.6

Show that the collection of all sets on the real line of types $(a,\infty)$ and $(-\infty,b)$ is a subbase for the usual topology $(\mathbb{R},\tau_e)$.

2.7

Show that any base and subbase parallelepipeds in Example 2.3 (iii) and Example 2.8 (ii), respectively, are open sets.



NEW TERMS: pre-topology; base for a topology; base sets; base for a topology, criterion for; open parallelepiped (rectangle); rectangle; base (simple) parallelepiped (rectangle); simple parallelepiped; rational parallelepiped; subbase; topology generated by a subbase; base, a construction of; subbase parallelepiped; lower limit topology



3. CONVERGENCE OF SEQUENCES IN TOPOLOGICAL SPACES AND COUNTABILITY

Convergence of sequences introduced in this section generalizes that of Section 3, Chapter 2, for metric spaces, and it is preparatory for the more general type of convergence of nets and filters to be treated in Section 9.

3.1 Definition. Let $\{x_n: n = 1,2,\ldots\} \subseteq (X,\tau)$ be a sequence and let $A$ be a set. A subsequence $Q_N = \{x_n: n = N, N+1,\ldots\}$ is called an $N(A)$-tail of $\{x_n\}$ for some $N \geq 1$ if $Q_N \subseteq A$. A sequence $\{x_n: n = 1,2,\ldots\} \subseteq X$ is said to converge to a point $x \in X$ if for every neighborhood $U_x$ of $x$, there is an $N(U_x)$-tail of $\{x_n\}$. The point $x$ is said to be a limit point of the sequence. A point $x$ is said to be a limit point of a set $A$ if $x$ is a limit point of some sequence $\{x_n\} \subseteq A$.

Unlike in metric spaces, a sequence in a topological space can have more than one limit, as the following example shows.

3.2 Example. Let $X = \mathbb{R}$, let $\tau = \{\mathbb{R}, \emptyset, (-2,3], [-1,2]\}$ and let $x_n = \frac{1}{n}$, $n = 1,2,\ldots$. Then, $\{\frac{1}{n}\}$ converges to all points of the set $[-1,2]$, since for each point $x \in [-1,2]$, its open neighborhoods are $\mathbb{R}$, $(-2,3]$, and $[-1,2]$, each one of which contains the whole sequence.

In most applications we will deal with topological spaces in which every convergent sequence has exactly one limit. An important representative of this class is introduced in the definition below.

3.3 Definition. A topological space $(X,\tau)$ is said to be Hausdorff (or separated or $T_2$) if every two distinct points, $x, y \in X$, possess disjoint neighborhoods.

$T_2$ is often referred to as the second separation axiom. Other separation axioms will be introduced and discussed in Section 10. As was mentioned, the following proposition (which will hardly be a challenge for the reader) is a consequence of the Hausdorff property.

3.4 Proposition. Let $(X,\tau)$ be a Hausdorff topological space, let $\lim_{n \to \infty} x_n = x$, and let $\lim_{n \to \infty} x_n = y$. Then $x = y$.

(See Problem 3.1.)

3.5 Example. Let $(X,d)$ be a metric space and let $(X,\tau(d))$ be the corresponding metrizable topological space. With $x_1$ and $x_2$ being distinct points of $X$, construct two open balls, $B(x_1,r)$ and $B(x_2,r)$, with $r = \frac{1}{2}d(x_1,x_2)$. It follows that the balls are neighborhoods of $x_1$ and $x_2$, respectively, and, by the triangle inequality, that $B(x_1,r) \cap B(x_2,r) = \emptyset$. This immediately implies that any metrizable topological space is Hausdorff.
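The contrast between Example 3.2 and the Hausdorff case can be checked mechanically on a finite carrier. The sketch below is an illustration under our own assumptions: the four-point carrier, the topology `tau` (a finite analog of the one in Example 3.2), and the helper names are hypothetical. For a sequence whose tail is constantly $c$, a point $x$ is a limit exactly when every open set containing $x$ also contains $c$, which is what `limits_of_eventually_constant` tests.

```python
from itertools import combinations

def is_hausdorff(tau, carrier):
    """True if every two distinct points have disjoint open neighborhoods."""
    for x, y in combinations(list(carrier), 2):
        separated = any(x in u and y in v and not (u & v)
                        for u in tau for v in tau)
        if not separated:
            return False
    return True

def limits_of_eventually_constant(c, tau, carrier):
    """Limits of a sequence with constant tail c:
    all x such that every open set containing x contains c."""
    return {x for x in carrier if all(c in u for u in tau if x in u)}

X = frozenset({0, 1, 2, 3})

# Hypothetical finite analog of Example 3.2: a chain of open sets.
tau = {frozenset(), frozenset({1, 2}), frozenset({1, 2, 3}), X}

# Discrete topology on X: every subset is open.
discrete = {frozenset(c) for k in range(len(X) + 1)
            for c in combinations(X, k)}

print(is_hausdorff(tau, X), limits_of_eventually_constant(1, tau, X))
print(is_hausdorff(discrete, X), limits_of_eventually_constant(1, discrete, X))
```

In the chain topology the constant sequence at 1 converges to every point of the carrier, and the space is not Hausdorff; in the discrete topology, which is Hausdorff, the only limit is 1, in agreement with Proposition 3.4.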


3.6 Remarks.

(i) In metric spaces (see Corollary 3.4, Chapter 2) a point is a closure point of a set $A$ if and only if it is a limit point of $A$. This does not apply to general topological spaces. More specifically, a limit point is always a closure point, but the converse is not true. Let $x$ be a closure point of $A$. If $x \in A$, then setting $x_n = x$, we have a sequence convergent to $x$. If $x \notin A$ then, by the definition of a closure point, for each neighborhood $U_x$ of $x$, $U_x \cap A \neq \emptyset$. In this case, however, it is not clear how to choose a sequence convergent to $x$, i.e., how to ensure that for each $U_x$, there is an $N(U_x)$-tail, for we do not have the flexibility of metric spaces with balls like $B(x,\frac{1}{n})$ of Theorem 3.3, Chapter 2. In Remark (ii) below we will demonstrate an example of a topology where a set $A$ contains all of its limit points and yet is not closed, or, in other words, some closure points of $A$ are not its limit points. However, if $x$ is a limit point of $A$, then it is always a closure point. Indeed, if $\{x_n\} \subseteq A$ is a sequence convergent to $x$, then for every neighborhood $U_x$ of the point $x$, there is a tail $\{x_N, x_{N+1}, \ldots\}$, which is contained in $U_x$, and hence $U_x$ meets $A$.

(ii) Consider the cocountable topology $\tau$ on $\mathbb{R}$ introduced earlier in Problem 1.9. Take $A = (a,b)$ where $a < b$. Let $\{x_i\} \subseteq A$ be a sequence. Then, by the definition of $\tau$, the complement of $\{x_i\}$ is open (and disjoint from $\{x_i\}$). If this sequence has a limit $x \in A^c$, then this limit should belong to the open set $\{x_i\}^c$ (since $\{x_i\} \subseteq A \Rightarrow A^c \subseteq \{x_i\}^c$), which can serve as an open neighborhood of $x$. This neighborhood does not contain a single element of the sequence and, therefore, $x$ cannot be its limit; or equivalently, this sequence cannot converge to any point of $A^c$. Therefore, $x \in A$. However, $A$ is not closed. To see this, consider the point $a$ in association with the set $A = (a,b)$. Let $O$ be any open set of the form $\mathbb{R} \setminus$(any sequence not containing $a$). Then $O$ is a neighborhood of $a$, any such neighborhood $O$ meets $A$, and $a$ is an accumulation point of $A$. Thus, $A$ is not closed, for otherwise, by Problem 1.11, it would contain all of its accumulation points.

An alternative argument shows that the only convergent sequences in a cocountable topology are those with constant tails. In other words, any convergent sequence $\{x_n\}$ has an $N$-tail of the form $\{x_n = x: n \geq N\}$. It is clear that the complement of $\{x_n: x_n \neq x\}$ is an open set containing $x$. Therefore, every set contains all limits of its convergent sequences, but the only closed sets are the countable ones and the carrier.

(iii) Consequently, there arises the question: Under what condition does a topological space have the property metric spaces have, namely,

$x \in \bar{A} \Leftrightarrow \exists$ a sequence in $A$ whose limit is $x$? (3.6)

(With no additional condition, this result is valid for metric spaces; see Theorem 3.3, Chapter 2.) In other words, when is a set closed if and only if it contains all of its limit points? We also raise another question: When can Proposition 3.4 be reversed, i.e. when does the uniqueness of limits imply that the space $(X,\tau)$ is Hausdorff? These two questions are closely related. To see this, assume that in a topological space $(X,\tau)$ property (3.6) holds and, in addition, all limits are unique. Assume that we can prove that (3.6) also holds in the "product topology" $\tau_p$ on $X^2 = X \times X$ generated by the base $\mathcal{B} = \tau \times \tau$ that consists of all open parallelepipeds $\{O_1 \times O_2: O_1,O_2 \in \tau\}$. Pick an arbitrary point $(x,y)$ from $\bar{D}$, which stands for the closure of the diagonal $D = \{(x,x): x \in X\}$. If (3.6) holds in $(X,\tau)$ (and eventually in $(X^2,\tau_p)$), then the point $(x,y)$ is a limit of some sequence $\{(x_n,y_n)\} \subseteq D$. Since $(x_n,y_n) \in D$, we have that $x_n = y_n$; and, in accordance with our above assumption, by uniqueness of limits, $x = y$. Thus $(x,y) \in D$, i.e. $D = \bar{D}$ or $D$ is closed in the product topology $\tau_p$. The latter implies that any point $(x,y)$ with distinct coordinates is an interior point of $D^c$ and hence it is contained in some base neighborhood $O_x \times O_y \subseteq D^c$. This implies that $O_x \cap O_y = \emptyset$, i.e. $(X,\tau)$ is Hausdorff.

If (3.6) is so crucial for $(X,\tau)$ to be Hausdorff, what then is a prerequisite for (3.6)? The answer is provided in the upcoming Theorems 3.8 and 3.10. Before that we introduce the following important notions.

3.7 Definitions.

(i) A topological space $(X,\tau)$ is said to satisfy the first axiom of countability (or to be first countable), if each point $x \in X$ has at most a countable neighborhood base.

(ii) A topological space $(X,\tau)$ is said to satisfy the second axiom of countability (or to be second countable), if $(X,\tau)$ has a countable base.

As mentioned, a noteworthy attribute of topological spaces emulating metric spaces is the subject of Theorem 3.8 combined with the reader's efforts in Problem 3.7.

3.8 Theorem. Let $(X,\tau)$ be first countable and let $A$ be a subset of $X$. Then a point $x$ is a closure point of $A$ if and only if there exists a sequence $\{x_n\}$ $(\subseteq A)$ which converges to $x$.

3.9 Remark. In what follows, we will advance to the notion of the product topology to be rigorously constructed in Section 5 of this chapter. We will call the topology on the Cartesian product $X \times X$ generated by all open parallelepipeds, $O_1 \times O_2 \in \tau \times \tau$, the product topology and denote it by $\tau_p$. The reason why $\tau \times \tau$ is a generator for $\tau_p$ is that $\tau \times \tau$ is a subbase and base for $\tau_p$ (in light of Theorem 2.7). Obviously, $\tau_p$ is first countable if $\tau$ is; show it (see Problem 3.12).

The statement below builds the promised bridges between uniqueness of limits of sequences, Hausdorff spaces, and closedness of the diagonal in $\tau_p$. The same result will be generalized and applied to filters and nets in Section 10 (Theorem 10.22).

3.10 Theorem. Let $(X,\tau)$ be a topological space. Then the following are equivalent.

(i) $(X,\tau)$ is Hausdorff.

(ii) All convergent sequences in $(X,\tau)$ have unique limit points.

(iii) The diagonal $D = \{(x,x) \in X^2\}$ is closed in the product topology $\tau_p$ on $X^2$.

Proof. (i) $\Rightarrow$ (ii) holds according to Proposition 3.4 (Problem 3.1).

For (ii) $\Rightarrow$ (iii) we assume that all limits of sequences in $(X,\tau)$ are unique. If $D$ is not closed, then, granted property (3.6) in $(X^2,\tau_p)$ (which holds, by Theorem 3.8 and Remark 3.9, when $(X,\tau)$ is first countable), there is a sequence $\{(x_n,x_n)\} \subseteq D$ such that $(x_n,x_n) \to (x,y)$ with $x \neq y$, but then it immediately contradicts assumption (ii), since then $x_n \to x$ and $x_n \to y$.

For (iii) $\Rightarrow$ (i) we assume that the diagonal $D$ is closed in $(X^2,\tau_p)$. Let $x \neq y \in X$. Then $(x,y) \in D^c \subseteq X^2$. Since $D^c$ is open, it can be represented as a union of base open sets, i.e. as a union of open parallelepipeds. Then at least one of these parallelepipeds, say $O_x \times O_y \subseteq D^c$, must contain the point $(x,y)$, i.e., $x \in O_x$ and $y \in O_y$. Thus $O_x$ and $O_y$ are open neighborhoods of $x$ and $y$, respectively. They are disjoint, since $O_x \times O_y \subseteq D^c$. Hence, $(X,\tau)$ is Hausdorff.

PROBLEMS

3.1


3.2

Show that any one-point set in a Hausdorff space is closed.

3.3

Show that any metric space is first countable.

3.4

Prove that any separable metric space is second countable.

3.5

Is it true that any first countable topological space is also second countable?

3.6

Prove that if a topological space is second countable, then it is separable and first countable.



3.7

Prove Theorem 3.8.

3.8

Let $O \subseteq X$ be open. Show that $\forall x \in O$ and $\forall$ sequence $x_n \to x$, there is an $N(O)$-tail of this sequence. Prove the converse of this statement assuming that $X$ is first countable.

3.9

While Corollary 3.4, Chapter 2, claims that in a metric space a set A is closed if and only if it contains all its limit points, Remark 3.6 (ii) asserts that in a general topological space a set A could contain all its limit points and still not be closed. However, for any set A of a first countable space, the former property does hold. Show that a set F is closed in X if and only if each convergent sequence in F converges to a point in F.

3.10

Show that subspaces of second countable spaces are second countable.

3.11

Show that $\tau_1 \times \tau_2$, which consists of all open parallelepipeds $\{O_1 \times O_2: O_1 \in \tau_1, O_2 \in \tau_2\}$, is $\cap$-stable.

3.12

Show that $\tau_p$ in Remark 3.9 is first countable if $\tau$ is first countable.



NEW TERMS: $N(A)$-tail of a sequence; convergent sequence; limit point of a sequence; limit point of a set; Hausdorff (separated, $T_2$) topological space; separated topological space; $T_2$ space; Second Separation Axiom; product topology; diagonal; First Axiom of Countability; first countable topological space; Second Axiom of Countability; second countable topological space; closure point, criterion of; Hausdorff space, criterion of


4. CONTINUITY IN TOPOLOGICAL SPACES

Except for a brief introduction of sequences (being a rather vague manifestation of functions) in the previous section, functions appear in the present section for the first time in conjunction with topologies. Naturally, the first quality we look into will be continuity. After a first acquaintance with continuity in metric spaces (Section 4, Chapter 2), the reader will be well prepared for its "surprising" variant for topological spaces and a striking similarity between Theorem 4.2 below and Theorem 4.3, Chapter 2, with respect to a key continuity criterion. Again, we will observe some other continuity properties, typical for metric spaces and holding for special topological spaces, yet more general than metric spaces. One of them deals with an important relationship between convergence of sequences and continuity of functions initiated in Chapter 2 (formulated as Theorem 4.4 and pledged to be proved in this section).

4.1 Definitions.

(i) A function $f: (X,\tau) \to (Y,\tau_1)$ is said to be continuous at a point $a \in X$ if, for every neighborhood $W_{f(a)}$, there is a neighborhood $U_a$ such that $f_*(U_a) \subseteq W_{f(a)}$.

This is obviously equivalent to the following definition: $f$ is continuous at $a$, if for every neighborhood $W_{f(a)}$, $f^*(W_{f(a)})$ is a neighborhood of $a$ (see Problem 4.1).

(ii) The function $f$ is said to be continuous on $X$ (or simply continuous) if it is continuous at each point $a \in X$.

4.2 Theorem. Let $f: (X,\tau) \to (Y,\tau_1)$ be a function. Then the following are equivalent.

(i) $f$ is continuous.

(ii) The inverse image under $f$ of any open set $H \in \tau_1$ is open, i.e. is an element of $\tau$.

Proof. (i) $\Rightarrow$ (ii). Let $H \in \tau_1$. For each point $a \in f^*(H)$, $f(a) \in H$ and therefore $f(a)$ is an interior point of $H$. Specifically, $H$ is a neighborhood of $f(a)$. Since $f$ is continuous at $a$, there is a neighborhood $U_a$ such that $f_*(U_a) \subseteq H$. Because the inclusion is preserved under the inverse, we have

$U_a \subseteq f^*(f_*(U_a)) \subseteq f^*(H)$,

which implies that $f^*(H)$ contains a neighborhood for each of its points. Hence, $f^*(H)$ is itself a neighborhood for all of its points. Therefore, by Proposition 1.6, $f^*(H)$ is open, i.e. is an element of $\tau$.

(ii) $\Rightarrow$ (i). Let $a \in X$ and let $W_{f(a)}$ be a neighborhood of $f(a)$. Then, there exists an open set $H \in \tau_1$ such that $f(a) \in H \subseteq W_{f(a)}$. By assumption (ii), $f^*(H)$ is an element of $\tau$. Since obviously $a \in f^*(H)$, $f^*(H)$ is a neighborhood of $a$ and thus $f^*(W_{f(a)})$ is also a neighborhood of $a$. Consequently, we have continuity of $f$ at $a$.

Let $(X,\tau)$ be a topological space. Denote the collection of all closed sets $O^c$ such that $O \in \tau$ by $\tau^c$.

4.3 Proposition. A function $f: (X,\tau) \to (Y,\tau_1)$ is continuous on $X$ if and only if the inverse image under $f$ of any closed set $O^c \in \tau_1^c$ is closed in $(X,\tau)$.

(See Problem 4.2.)

4.4 Proposition. Let $(X,\tau)$, $(Y,\tau_1)$, and $(Z,\tau_2)$ be topological spaces and let $f: X \to Y$ and $g: Y \to Z$ be continuous functions. Then the function $g \circ f: X \to Z$ is continuous.

(See Problem 4.3.)

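4.5 Definition. Let $(X,\tau)$ be a topological space and let $[X,Y,f]$ be a function. Define

$\tau_q = \{B \subseteq Y : f^*(B) \in \tau\}$,

i.e., $f^{**}(\tau_q) \subseteq \tau$. By the arguments below (Remarks 4.6), $\tau_q$ is a topology and it contains any topology relative to which $f$ is continuous. $\tau_q$ is called the quotient topology induced on $Y$ by $f$. [Recall that $f^*$ is defined on $\mathcal{P}(Y)$; consequently, we denote by $f^{**}$ the corresponding function acting on $\mathcal{P}(\mathcal{P}(Y))$.]

4.6 Remarks.

(i) $\tau_q$ is indeed a topology:

1) $f^*(Y) = X \in \tau$ and $f^*(\emptyset) = \emptyset \in \tau$, so that $Y, \emptyset \in \tau_q$.

2) $B_1,\ldots,B_n \in \tau_q$ $\Rightarrow$ $f^*(\bigcap_{k=1}^{n} B_k) = \bigcap_{k=1}^{n} f^*(B_k) \in \tau$ (as a finite intersection of open sets) $\Rightarrow$ $\bigcap_{k=1}^{n} B_k \in \tau_q$.

3) A similar consideration can be used to show that $\tau_q$ contains all unions.

(ii) $\tau_q$ is the largest topology on $Y$ relative to which $f$ is continuous. This follows directly from Definition 4.5.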
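On finite carriers the quotient topology of Definition 4.5 can be computed by brute force: list all subsets of $Y$ and keep those whose inverse images are open. The following Python sketch is only an illustration; the carriers `X` and `Y`, the topology `tau`, the map `f`, and the helper names are our own hypothetical choices, not taken from the text.

```python
from itertools import combinations

def powerset(s):
    """All subsets of a finite set, as frozensets."""
    items = list(s)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def preimage(f, subset):
    """Inverse image of `subset` under a map f given as a dict."""
    return frozenset(x for x in f if f[x] in subset)

def quotient_topology(f, tau, Y):
    """All B in P(Y) whose inverse image under f belongs to tau."""
    return {B for B in powerset(Y) if preimage(f, B) in tau}

# Hypothetical data: a three-set topology on X and a map f: X -> Y.
X = frozenset({1, 2, 3, 4})
tau = {frozenset(), frozenset({1, 2}), X}
Y = frozenset({'a', 'b', 'c'})
f = {1: 'a', 2: 'a', 3: 'b', 4: 'b'}

tau_q = quotient_topology(f, tau, Y)
print(sorted(sorted(B) for B in tau_q))
```

Here $\{b\}$ is not in the quotient topology because its inverse image $\{3,4\}$ is not open, whereas $\{c\}$ is, since its inverse image is empty. Any topology on $Y$ for which `f` is continuous, such as the indiscrete one, is a subcollection of `tau_q`, in line with Remark 4.6 (ii).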

4.7 Example. Let $X = \mathbb{R}$, $\tau = \{\mathbb{R}, \emptyset, (-1,2], [0,3), [0,2], (-1,3), (-1,1)\}$ and let $f(x) = x^2$ be defined as $f: \mathbb{R} \to \mathbb{R} = Y$. It is clear that $\mathbb{R}$, $\emptyset$ and $[0,1)$ are the only subsets of $Y$ whose inverse images are in $\tau$. Therefore, $\{\mathbb{R}, \emptyset, [0,1)\}$ is the quotient topology on $Y$.

By Theorem 4.2, $f: (X,\tau) \to (Y,\tau_1)$ is continuous if and only if $f^{**}(\tau_1) \subseteq \tau$. However, if we know a generator $\mathcal{S}$ of $\tau_1$, then condition (ii) of Theorem 4.2 can be weakened as the following theorem shows.

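4.8 Theorem. Let $f: (X,\tau) \to (Y,\tau(\mathcal{S}))$ (where $\tau(\mathcal{S})$ is the topology generated by a subbase $\mathcal{S}$). Then $f$ is continuous if and only if $f^{**}(\mathcal{S}) \subseteq \tau$.

Proof. If $f$ is continuous, then, in particular, $f^{**}(\mathcal{S}) \subseteq \tau$. Conversely, assume that $f^{**}(\mathcal{S}) \subseteq \tau$ and introduce the quotient topology $\tau_q$ induced by $f$. Thus, $\mathcal{S} \subseteq \tau_q$, which implies that $\tau(\mathcal{S}) \subseteq \tau_q$, for $\tau(\mathcal{S})$ is the smallest topology containing $\mathcal{S}$. Then since $f^{**}(\tau_q) \subseteq \tau$, we have $f^{**}(\tau(\mathcal{S})) \subseteq f^{**}(\tau_q) \subseteq \tau$, i.e. $f$ is continuous.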
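The saving promised by Theorem 4.8 is easy to observe on finite carriers: checking inverse images of the subbase sets gives the same verdict as checking every open set of the generated topology. The Python sketch below is an illustration under our own assumptions; the carriers, the subbase `S_Y`, the maps `f` and `g`, and the helper names are hypothetical.

```python
from itertools import combinations

def generated_topology(subbase):
    """Topology generated by a subbase on a finite carrier:
    finite intersections first, then all unions."""
    members = [frozenset(s) for s in subbase]
    base = {frozenset()}
    for k in range(1, len(members) + 1):
        for combo in combinations(members, k):
            base.add(frozenset.intersection(*combo))
    tau = {frozenset()}
    for b in base:
        tau |= {u | b for u in tau}
    return tau

def preimage(f, subset):
    """Inverse image of `subset` under a map f given as a dict."""
    return frozenset(x for x in f if f[x] in subset)

def preimages_open(collection, f, tau_X):
    """True if the inverse image of every member of `collection` is in tau_X."""
    return all(preimage(f, H) in tau_X for H in collection)

# Hypothetical data.
tau_X = {frozenset(), frozenset({2, 3}), frozenset({1, 2, 3}),
         frozenset({2, 3, 4}), frozenset({1, 2, 3, 4})}
S_Y = [{0, 1}, {1, 2}]               # subbase on Y = {0, 1, 2}
tau_Y = generated_topology(S_Y)

f = {1: 0, 2: 1, 3: 1, 4: 2}         # continuous in this example
g = {1: 1, 2: 0, 3: 0, 4: 0}         # not continuous in this example

print(preimages_open(S_Y, f, tau_X), preimages_open(tau_Y, f, tau_X))
print(preimages_open(S_Y, g, tau_X), preimages_open(tau_Y, g, tau_X))
```

For `f` both checks succeed, and for `g` both fail (the inverse image of $\{1,2\}$ is $\{1\}$, which is not open), although the subbase check inspects only two sets instead of five.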

4.9 Theorem. Let $f: (X,\tau) \to (Y,\tau_1)$ be a map continuous at some point $x \in X$. If $\{x_n\}$ is a sequence convergent to $x$, then the sequence $\{f(x_n)\}$ is convergent to $f(x)$.

(See Problem 4.10.)

Theorems 4.8 and 4.9 and the next theorem form an analog to Theorem 4.6, Chapter 2, which was only valid for metric spaces. The statement in Theorem 4.9 has no restriction as to the nature of the topological spaces $(X,\tau)$ and $(Y,\tau_1)$, while its converse needs to be strengthened by the condition that $(X,\tau)$ is first countable.

4.10 Theorem. Let $f: (X,\tau) \to (Y,\tau_1)$ be a map and let $(X,\tau)$ be first countable. If for any sequence $\{x_n\}$ convergent to a point $x \in X$, the sequence $\{f(x_n)\}$ converges to $f(x)$, then $f$ is continuous at $x$.

Proof. To prove this theorem, we assume that $f$ is not continuous at $x$, then select a sequence $\{x_n\}$ convergent to $x$ such that $\{f(x_n)\}$ does not converge to $f(x)$. The assumption that $(X,\tau)$ is first countable is essential in the selection of a convergent sequence $\{x_n\}$, which otherwise need not exist. If $f$ is not continuous at $x$, there is a neighborhood $W_{f(x)}$ such that $f^*(W_{f(x)})$ is not a neighborhood of $x$, or equivalently, there is no neighborhood $U_x$ such that $f_*(U_x) \subseteq W_{f(x)}$. [Otherwise, if $f_*(U_x) \subseteq W_{f(x)}$, then

$U_x \subseteq f^*(f_*(U_x)) \subseteq f^*(W_{f(x)})$.

This would contradict our assumption. (See Figure 4.1.)]

[Figure 4.1]

Specifically, it follows that, for each base neighborhood $B \in \mathcal{B}_x$, $f_*(B)$ is not a subset of $W_{f(x)}$. Since $(X,\tau)$ is first countable, there is a countable neighborhood base $\mathcal{B}_x = \{B_1,B_2,\ldots\}$ which can always be assumed to be monotone decreasing (why?). Now, each $B_i$ contains at least one point, say $x_i$, such that $f(x_i) \notin W_{f(x)}$, which immediately yields that the sequence $\{f(x_n)\}$ is not in $W_{f(x)}$ and, thus, does not converge to $f(x)$. However, $x_n \to x$. Indeed, for every neighborhood $V_x$, there is an element $B_N \in \mathcal{B}_x$ such that $B_N \subseteq V_x$, which implies that $B_k \subseteq V_x$, $\forall k \geq N$ (since $\mathcal{B}_x$ is monotone decreasing). Thus, $\{x_N, x_{N+1},\ldots\}$ is the $N(V_x)$-tail of $\{x_n\}$.

Theorem 4.10 leads to some useful applications.

4.11 Lemma. Let $f,g: (X,\tau) \to (Y,\tau_1)$ be two continuous maps. If $(Y,\tau_1)$ is Hausdorff, then the set $S = \{x \in X: f(x) = g(x)\}$ is closed in $(X,\tau)$.

Proof. Since $f$ and $g$ are continuous, clearly the map $(f,g): X \to Y \times Y$, $x \mapsto (f(x),g(x))$, is continuous relative to the product topology on $Y \times Y$. Since by the assumption $(Y,\tau_1)$ is Hausdorff, by Theorem 3.10, the diagonal $D$ in $Y \times Y$ is closed. Hence, the set $S$, as the inverse image of the diagonal $D$ under the continuous map $(f,g)$, must be closed.

4.12 Proposition. Let $f,g: (X,\tau) \to (Y,\tau_1)$ be two continuous maps that coincide on some dense set in $X$. If $(X,\tau)$ is first countable and if $(Y,\tau_1)$ is $T_2$, then $f = g$ on $X$.

Thus, it follows that a continuous function is completely determined by its values on a dense set. The proof of this proposition is the subject of Problem 4.11.

4.13 Example. If $f,g: (\mathbb{R}^n,\tau_e) \to (\mathbb{R}^n,\tau_e)$ are continuous maps that coincide on the set $\mathbb{Q}^n$ of all vectors with rational coordinates, then $f$ and $g$ are identical on $\mathbb{R}^n$. This fact takes into account that $(\mathbb{R}^n,\tau_e)$ is Hausdorff and first countable.

4.14 Definition. Let $(X,\tau)$ and $(Y,\tau_1)$ be two topological spaces. A bijective map $[X,Y,f]$ is called a homeomorphism if both $f$ and $f^{-1}$ are continuous. The topological spaces $(X,\tau)$ and $(Y,\tau_1)$ are then called homeomorphic. We write $X \cong Y$. If $f$ fails to be surjective, then $f$ is called an embedding of $X$ into $Y$. $X$ is also said to be embedded in $Y$ by $f$.

4.15 Remark. It is not hard to see that the homeomorphic property applied to a collection of topological spaces on fixed carriers $X$ and $Y$ offers an equivalence relation (show it, Problem 4.12).

PROBLEMS

4.1

Show that $f$ is continuous at a point $a$ if and only if for every neighborhood $W_{f(a)}$, $f^*(W_{f(a)})$ is a neighborhood of $a$.

4.2


4.3


4.4

Let $f: (X,\tau) \to (Y,\tau_1)$ be a function such that $f(x) = x$, $X = Y = \mathbb{R}$, $\tau = \{\mathbb{R},\emptyset,\{1\},[1,3)\}$ and $\tau_1 = \{\mathbb{R},\emptyset,\{2\},[2,4)\}$. Is $f$ continuous?

4.5

Under the conditions of Problem 4.4, set $f(x) = x + 1$. Is $f$ continuous?

4.6

Let $f: (X,\tau) \to (Y,\tau_1)$ be a map. Show that $f$ is continuous at a point $x \in X$ if and only if, for any base neighborhood $B_{f(x)}$ of the point $f(x)$, $f^*(B_{f(x)})$ is a neighborhood of $x$.

4.7

Under the condition of Problem 4.6, assume that $(Y,\tau_1)$ is a metrizable topological space, i.e. $\tau_1 = \tau(d)$.

a) Show that $f$ is continuous at $x \in X$ if and only if the inverse image under $f$ of any open ball $B_d(f(x),\varepsilon)$ is a neighborhood of $x$.

b) Show that, for each open ball $B_d(f(x),\varepsilon)$ there is a neighborhood $U_x(\varepsilon)$ such that $f_*(U_x(\varepsilon)) \subseteq B_d(f(x),\varepsilon)$.

4.8

Let $f: (X,\tau) \to (Y, \|\cdot\|_d)$ be a map, where $Y$ is an NLS over a field $\mathbb{F}$, and let $\|\cdot\|_d$ be the norm generated by a TIH metric $d$. Show that $f$ is continuous at $x \in X$ if and only if, for every $\varepsilon > 0$, there is a neighborhood $U_x(\varepsilon) \in \mathcal{U}_x$ such that for each $y \in U_x(\varepsilon)$, $\|f(x) - f(y)\|_d < \varepsilon$.

4.9

Prove the following statement: Let $f: (X,\tau) \to (\mathbb{R}^n,d_e)$, where $(X,\tau)$ is a topological space. Then $f$ is continuous at a point $x \in X$ if and only if, for every $\varepsilon > 0$, there is a neighborhood $U_x(\varepsilon) \in \mathcal{U}_x$ such that, for all $y \in U_x(\varepsilon)$, $\|f(x) - f(y)\| < \varepsilon$. ($\|\cdot\|$ denotes the Euclidean norm.)

4.10

Prove Theorem 4.9.

4.11


4.12

Prove the statement posed in Remark 4.15.

4.13

Show that $(\mathbb{R},\tau_e)$ is homeomorphic to $(-1,1)$ with the corresponding relative topology on $(-1,1)$.

4.14

Is $(\mathbb{R},\tau_e)$ homeomorphic to $[-1,1]$?



NEW TERMS: function continuous at a point; function continuous at a point, criterion of; continuous function; continuity of a function, criterion of; composition of continuous functions; quotient topology; continuous function on a dense set; homeomorphism; homeomorphic topological spaces; embedding; embedded set

5. PRODUCT TOPOLOGY

Let $(Y_1,\tau_1),\ldots,(Y_n,\tau_n)$ be topological spaces. One of the reasonable ways to define a topology on the Cartesian product $Y = Y_1 \times \cdots \times Y_n$ is to take the collection

$\mathcal{B} = \tau_1 \times \cdots \times \tau_n = \{O_1 \times \cdots \times O_n : O_i \in \tau_i, i = 1,\ldots,n\}$

for a family of "open" parallelepipeds and declare it as a base for the topology it generates. $\mathcal{B}$ is obviously closed relative to the formation of all finite intersections [show it], and therefore, by Theorem 2.7, is a base for $\tau(\mathcal{B})$, which includes all unions of elements of $\mathcal{B}$. We wish to call $\tau(\mathcal{B})$ the product topology on $Y$ and denote it by $\tau_p$. The following is an attempt to reduce the base $\mathcal{B}$ for $\tau_p$.

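5.1 Proposition. Let

$\mathcal{B}' = \mathcal{B}_1 \times \cdots \times \mathcal{B}_n = \{B_1 \times \cdots \times B_n : B_i \in \mathcal{B}_i, i = 1,\ldots,n\}$,

where $\mathcal{B}_i$ is a base for $\tau_i$, $i = 1,\ldots,n$. Then $\mathcal{B}'$ is also a base for $\tau_p$.

(See Problem 5.1.) Any element of $\mathcal{B}'$ is called a base parallelepiped.

5.2 Proposition. Let

$\mathcal{S} = \mathcal{S}_1 \times \cdots \times \mathcal{S}_n = \{S_1 \times \cdots \times S_n : S_i \in \mathcal{S}_i, i = 1,\ldots,n\}$,

where $\mathcal{S}_i$ is a subbase for $\tau_i$, $i = 1,\ldots,n$. Then $\mathcal{S}$ is a subbase for $\tau_p$.

(See Problem 5.2.) Any element of $\mathcal{S}$ is called a subbase parallelepiped.

5.3 Proposition. Let $\mathcal{S}' = \{\pi_i^*(S_i): S_i \in \mathcal{S}_i, i = 1,\ldots,n\}$, where $\mathcal{S}_i$ is a subbase for $\tau_i$. Then $\mathcal{S}' \subseteq \mathcal{S}$ is a subbase for $\tau_p$.

(See Problem 5.3.) Observe that any element of $\mathcal{S}'$ is a unit cylinder.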
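For a product of two finite spaces, the base of open parallelepipeds and the subbase of unit cylinders can be compared directly. The Python sketch below is only an illustration under our own assumptions: both factors are taken to be the two-point space with open sets $\emptyset$, $\{0\}$, $\{0,1\}$, and the helper names are hypothetical. It generates one topology from all rectangles $O_1 \times O_2$ and another from the unit cylinders $O_1 \times Y_2$ and $Y_1 \times O_2$, and the two turn out to coincide.

```python
from itertools import combinations, product

def unions_of(collection):
    """All unions of subcollections of a finite collection of sets."""
    out = {frozenset()}
    for b in collection:
        out |= {u | b for u in out}
    return out

def finite_intersections(collection):
    """All intersections of nonempty finite subcollections."""
    members = list(collection)
    out = set()
    for k in range(1, len(members) + 1):
        for combo in combinations(members, k):
            out.add(frozenset.intersection(*combo))
    return out

# Hypothetical factor spaces: two copies of a two-point space.
Y1 = Y2 = frozenset({0, 1})
tau1 = tau2 = {frozenset(), frozenset({0}), frozenset({0, 1})}

# Base of open parallelepipeds O1 x O2.
rectangles = {frozenset(product(O1, O2)) for O1 in tau1 for O2 in tau2}

# Unit cylinders: inverse images of open sets under the two projections.
cylinders = ({frozenset(product(O1, Y2)) for O1 in tau1}
             | {frozenset(product(Y1, O2)) for O2 in tau2})

tau_p = unions_of(rectangles)
tau_from_cylinders = unions_of(finite_intersections(cylinders))
print(len(tau_p), tau_p == tau_from_cylinders)
```

Since every unit cylinder is itself a member of `tau_p`, both projection maps are continuous relative to the product topology in this example.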

5.4 Example. As was mentioned in Example 2.8 (i), the usual topology $\tau_e$ on $\mathbb{R}^n$ coincides with the product topology $\tau_p$ on $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$ generated by the base $\mathcal{B}$ of all open parallelepipeds (as the $n$-times Cartesian product of open sets in $\mathbb{R}$). The base parallelepipeds are of the form $\prod_{i=1}^{n} (a_i,b_i)$, where $(a_i,b_i) \subseteq \mathbb{R}$, and they are elements of a base for $(\mathbb{R}^n,\tau_e)$. In particular, the system of all rational parallelepipeds is also a base for $\tau_p = \tau_e$. The system $\mathcal{S}'$ of all unit cylinders $\{\pi_i^*((a_i,b_i)): (a_i,b_i) \subseteq \mathbb{R}, i = 1,\ldots,n\}$ is a subbase for $\tau_e$. (See Example 2.8 (ii).)

It is apparent that the projection maps are continuous relative to the product topology. Furthermore,

5.5 Theorem. Let $Y = \prod_{i=1}^{n} Y_i$ and let $\pi_j: Y \to Y_j$ be the $j$th projection map, $j = 1,\ldots,n$. Then the product topology $\tau_p$ on $Y$ is the weakest topology for which each projection is continuous.

Proof. Let $\tau$ be a topology on $Y$ for which each projection is continuous, i.e. $\pi_j^{**}(\tau_j) \subseteq \tau$. Then for every choice of sets $O_j \in \tau_j$, $j = 1,\ldots,n$,

$O = \bigcap_{j=1}^{n} \pi_j^*(O_j) = O_1 \times \cdots \times O_n \in \tau$.

But $O$ is known to belong to $\tau_p$, where $O$ is a base set of $\tau_p$. Thus, the base $\mathcal{B}$ for $\tau_p$ satisfies $\mathcal{B} \subseteq \tau$, and then by Remark 2.4 (i), $\tau_p \subseteq \tau$.

We extend the notion of product topology of finitely many factor spaces to that on the Cartesian product of arbitrarily many factor spaces. We therefore assume that $\{(Y_x,\tau_x): x \in X\}$ is an arbitrary indexed family of topological spaces. Let us consider two different models of topologies on the Cartesian product $Y = \prod_{x \in X} Y_x$. One of them, called the box topology (in notation $\tau_b$), is subject to the following construction. We take for a base for $\tau_b$ the system of box parallelepipeds,

$\{\prod_{x \in X} O_x : O_x \in \tau_x\}$,

or even a weaker base,

$\mathcal{B}_b = \{\prod_{x \in X} B_x : B_x \in \mathcal{B}_x\}$.

Hence, the introduced box topology $\tau_b$ is not different from its version for finitely many factor spaces. There is another, "more economical" topology on $Y$, which also preserves continuity of projection maps, and in addition, it leads to a tame formation of the widely used "pointwise topology" (which the box topology does not).

5.6 Definition. Let us define the topology $\tau_p$ on $Y$ through the base

$\mathcal{B} = \{\prod_{x \in X} O_x : O_x \in \tau_x\}$, (5.6)

where $O_x = Y_x$, except for finitely many indices $x \in X$. In other words, all elements of $\mathcal{B}$ are simple cylinders (see Definition 5.3, Chapter 1). The topology $\tau_p$ generated by such a base is called the product or Tychonov topology on $Y$.

Obviously base (5.6) for $\tau_p$ can be further reduced if each $O_x$ is selected from a base $\mathcal{B}_x$ for $\tau_x$.

5.7 Remarks.

(i) Let $\mathcal{S}_x$ be a subbase for $\tau_x$. One can show that the collection $\mathcal{S} = \{\pi_x^*(S): S \in \mathcal{S}_x, x \in X\}$ of unit subbase cylinders is a subbase for $\tau_p$, just as it is for the case of finite products. (See Problem 5.7.)

(ii) We will always prefer to deal with the smallest possible base or subbase for $\tau_p$, provided that we have the knowledge of bases or subbases for each $\tau_x$. For instance, as a rule of thumb, we can take $\{\pi_x^*(O_x): O_x \in \tau_x\}$ as a subbase for $\tau_p$, unless more is known about the nature of the $\tau_x$'s.

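5.8 Examples.

(i) Let $\{(Y_x,\tau_x),\ x \in X\}$ be a collection of metrizable topological spaces and let $Y = \prod_{x \in X} Y_x$. According to Example 2.3 (i), the collection of all open balls $B_{d_x}(y_x,r)$, $y_x \in Y_x$, constitutes a base for $(Y_x,\tau_x(d_x))$. Now, the set of all simple cylinders of the form

$$\pi_{x_1}^*\big(B_{d_{x_1}}(y_1,r_1)\big) \cap \pi_{x_2}^*\big(B_{d_{x_2}}(y_2,r_2)\big) \cap \dots \cap \pi_{x_k}^*\big(B_{d_{x_k}}(y_k,r_k)\big), \quad x_i \in X,\ y_i \in Y_{x_i},\ k = 1,2,\dots, \qquad (5.8)$$

is a base for $\tau_p$, whereas the collection of all unit cylinders of the form $\pi_x^*\big(B_{d_x}(y_x,r_x)\big)$ is a subbase for $\tau_p$.

(ii) Let $Y = \mathbb{R}^{\mathbb{R}}$ be the collection of all real-valued functions on $\mathbb{R}$, regarded as the Cartesian product of copies of $\mathbb{R}$, each equipped with the usual topology. We select an open neighborhood $U_f$ of a point $f \in Y$. First of all, according to (5.8), a simple cylinder with base $(y_1 - \varepsilon_1, y_1 + \varepsilon_1) \times \dots \times (y_k - \varepsilon_k, y_k + \varepsilon_k)$ has the form

$$\{ g \in \mathbb{R}^{\mathbb{R}} : |g(x_i) - y_i| < \varepsilon_i,\ i = 1,\dots,k \}, \qquad (5.8a)$$

where $y_i$ is a point in $Y_{x_i} = \mathbb{R}$. In order that this cylinder be a neighborhood of $f$, we need to replace $y_i$ by the corresponding traces $f(x_i)$ of $f$ in the factor spaces $Y_{x_1},\dots,Y_{x_k}$:

$$U_f = \{ g \in \mathbb{R}^{\mathbb{R}} : |g(x_i) - f(x_i)| < \varepsilon_i,\ i = 1,\dots,k \}. \qquad (5.8b)$$

(See Figure 5.1.)

Figure 5.1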

5.9 Remark. Let $\{g_x : x \in X\}$ be a family of functions $g_x : \Omega \to Y_x$, where each $Y_x$ is endowed with a topology $\tau_x$. Recall that $g_x^*(\tau_x)$, $\forall x \in X$, is a topology on $\Omega$, and that each function $g_x$ is continuous relative to this topology. The union of all these topologies,

$$\mathcal{S} = \bigcup_{x \in X} g_x^*(\tau_x),$$

need not be a topology, for it does not necessarily preserve unions and intersections. But we can extend it to a topology, say $\tau(\mathcal{S})$, regarding $\mathcal{S}$ as a subbase. This topology is the weakest one for which all functions of the above family are continuous. $\tau(\mathcal{S})$ is called the weak topology generated by the family $\{g_x\}$. Now, taking $\prod_{x \in X} Y_x$ for $\Omega$ and $\pi_x$ (the $x$th projection map) for $g_x$, we deduce that the Tychonov topology $\tau_p$ is the weakest topology for which all projections are continuous. Consequently, $\tau_p$ turns out to be the weak topology generated by the projection maps. (Of course, we need to show that $\tau_p = \tau(\mathcal{S})$; see Problem 5.7.) By the way, this offers another (equivalent) definition of the Tychonov topology on $\prod_{x \in X} Y_x$.
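5.10 Example. Recall that a sequence $\{x_n\} \subseteq \Omega$ converges to a point $x \in \Omega$ if, for every neighborhood $U_x$, there is an $N(U_x)$-tail of $\{x_n\}$ contained in $U_x$. In the product space $\Omega = \mathbb{R}^{\mathbb{R}}$, a sequence of points $\{f_n\}$ is convergent to a point $f \in \Omega$ if and only if $f_n(x) \to f(x)$ for all $x \in \mathbb{R}$. To see this we note (see Example 5.8 (ii)) that a base neighborhood $U_f$ of $f$ in (5.8b) is of the form

$$U_f = \{ g \in \mathbb{R}^{\mathbb{R}} : |g(x_i) - f(x_i)| < \varepsilon_i,\ i = 1,\dots,k \}.$$

In other words, $f_n \to f$ if $f_n$ is eventually close to $f$ on each finite set $\{x_1,\dots,x_k\} \subseteq \mathbb{R}$, specifically on singletons $\{x\} \subseteq \mathbb{R}$. Example 5.10 motivates the following notion.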
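The criterion of Example 5.10 can be tried out numerically. The following is only a sketch under stated assumptions: the sequence $f_n(x) = x^n$ on $[0,1]$, the chosen points and radii, and all function names are illustrative choices, not taken from the text. It checks that a tail of the sequence lies in one base neighborhood of the pointwise limit, while the values $f_n(1 - 1/n)$ show that the convergence is not uniform.

```python
def f_n(n, x):
    """The n-th term of the sequence: f_n(x) = x**n on [0, 1]."""
    return x ** n

def f_limit(x):
    """Pointwise limit of f_n on [0, 1]: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

def in_base_neighborhood(g, f, points, eps):
    """True if |g(x_i) - f(x_i)| < eps_i at each of the finitely many x_i."""
    return all(abs(g(x) - f(x)) < e for x, e in zip(points, eps))

def first_index(x, e, n_max=100_000):
    """Smallest n with |f_n(x) - f(x)| < e (the N_i of a single coordinate)."""
    for n in range(1, n_max + 1):
        if abs(f_n(n, x) - f_limit(x)) < e:
            return n
    raise ValueError("no index found up to n_max")

# A base neighborhood of the limit is fixed by finitely many points and radii.
points = [0.0, 0.5, 0.9, 0.99, 1.0]
eps = [1e-3] * len(points)

# N = max{N_1, ..., N_k}; x**n is nonincreasing in n on [0, 1],
# so every later term stays inside the neighborhood.
N = max(first_index(x, e) for x, e in zip(points, eps))
tail_ok = all(
    in_base_neighborhood(lambda x, n=n: f_n(n, x), f_limit, points, eps)
    for n in range(N, N + 200)
)

# No uniform convergence: f_n(1 - 1/n) stays near 1/e while the limit there is 0.
witnesses = [f_n(n, 1.0 - 1.0 / n) for n in (10, 100, 1000)]
print(N, tail_ok, witnesses)
```

Taking the maximum of finitely many indices is possible only because a base neighborhood restricts finitely many coordinates.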

5.11 Definition. Let $\{(Y_x,\tau_x),\ x \in X\}$ be a family of topological spaces and let $\big(\prod_{x \in X} Y_x, \tau_p\big)$ be the Tychonov product space. Recall that if $Y_x = Y$ and $\tau_x = \tau$ for each $x \in X$, then we denoted $\prod_{x \in X} Y_x$ by $Y^X$ and called it the set of functions from $X$ to $Y$. Now the special Tychonov product topology $(Y^X,\tau_p)$ is called the topology of pointwise convergence.

As a generalization of Example 5.10, the following proposition can help solidify our understanding of the topology of pointwise convergence.

5.12 Proposition. Let $\{f_n\}$ be a sequence in $Y^X$. Then $f_n \to f \in Y^X$ (in the topology of pointwise convergence) if and only if $f_n(x) \to f(x)$, $\forall x \in X$ (in $(Y,\tau)$).

Proof. Recall that $\pi_x : Y^X \to Y$ is the $x$-projection map defined as $\pi_x(f) = f(x)$ (see Section 5, Chapter 1).

(i) First assume that $f_n \to f$ in $(Y^X,\tau_p)$. By Theorem 5.5, $\pi_x$ is continuous for every $x$. Thus, by Theorem 4.9, $\pi_x(f_n) \to \pi_x(f)$. This yields that $f_n(x) \to f(x)$ in $(Y,\tau)$.

(ii) Let $f_n(x) \to f(x)$ in $(Y,\tau)$, $\forall x \in X$. Let $U_f$ be a neighborhood of $f$ in $(Y^X,\tau_p)$. Clearly, $U_f$ contains some base neighborhood $B_f$. Since by Theorem 2.2, $B_f \in \mathcal{B}_f \subseteq \mathcal{B}$ (for $\tau_p$), it follows that $B_f$ is of the form

$$B_f = \prod_{x \in X} O_{f(x)},$$

where all $O_{f(x)}$'s but finitely many ($O_{f(x_1)},\dots,O_{f(x_k)}$) are $Y_x$'s and, for each $i = 1,\dots,k$, $O_{f(x_i)}$ contains $f(x_i)$. Thus the base neighborhood $B_f$ is a simple cylinder

$$B_f = \bigcap_{i=1}^{k} \pi_{x_i}^*\big(O_{f(x_i)}\big).$$

Now, $f_n \to f$ if and only if for every base neighborhood $B_f$, there is an $N(B_f)$-tail of $\{f_n\}$. By our assumption, $f_n(x_i) \to f(x_i)$, which implies the existence of an $N_i(O_{f(x_i)})$-tail, $i = 1,\dots,k$. Let $N = \max\{N_1,\dots,N_k\}$.

(Note that this is exactly the place where we take advantage of the Tychonov product topology, for otherwise, in the case of the box topology, a base neighborhood of $f$ could not be represented by a simple cylinder. The latter would be an obstacle in finding a finite maximum of infinitely many $N_i$'s, so that $\{f_n\}$ need not converge to $f$ in the box topology.)

Then, for each $x_i$, $i = 1,\dots,k$, we have the $N(O_{f(x_i)})$-tail of $\{f_n(x_i)\}$, which yields that

$$f_n(x_i) \in O_{f(x_i)}, \quad \text{for all } n > N,\ i = 1,\dots,k.$$

Therefore, we have

$$f_n \in \bigcap_{i=1}^{k} \pi_{x_i}^*\big(\{f_n(x_i)\}\big) \subseteq \bigcap_{i=1}^{k} \pi_{x_i}^*\big(O_{f(x_i)}\big) = B_f, \quad \text{for all } n > N.$$

The latter tells us that an $N(B_f)$-tail of $\{f_n\}$ exists, and therefore, $f_n \to f$ in $(Y^X,\tau_p)$. □

PROBLEMS

5.1


5.2


5.3

Prove Proposition 5.3. [Hint: Apply Theorem 2.7.]

5.4

A map $f : (X,\tau) \to (Y,\tau')$ is said to be open if $f_*(\tau) \subseteq \tau'$. Show that in the product topology each projection map is open. [Hint: Use the fact that, according to Problem 3.3, Chapter 1, maps preserve unions.]

5.5

Let $f : (\Omega,\tau) \to \big(X = \prod_{i=1}^{n} X_i, \tau_p\big)$. Show that the function $f$ is continuous if and only if each $\pi_i \circ f$ is continuous. [Hint: Show that $f^*(S) \in \tau$ for every subbase element $S$ of $\tau_p$, and then apply Theorem 4.8.]

5.6

Let $(X_i,\tau_i)$ be a Hausdorff space, $i = 1,\dots,n$. Prove that $\big(\prod_{i=1}^{n} X_i, \tau_p\big)$ is Hausdorff.

5.7

Show that $\mathcal{S}$ in Remark 5.9 is a subbase for the Tychonov topology.

5.8

Show that all major properties of the product topology of finitely many spaces can be reformulated and can hold for the Tychonov topology (Problems 5.4-5.6).

5.9

Let $\big(X = \prod_{i \in I} X_i, \tau_p\big)$ be endowed with the Tychonov topology and assume that each factor space is first countable. Is $(X,\tau_p)$ first countable if:

a) $|I| = \aleph_0$?

b) $|I| \ge \mathfrak{c}$?

5.10

Generalize Theorem 5.5 for the case of Tychonov's topology.


NEW TERMS:
product topology for finitely many factor spaces
base parallelepiped
subbase parallelepiped
continuity of projection maps
product topology for arbitrarily many factor spaces
box topology
box parallelepiped
Tychonov topology
weak topology (generated by a family of functions)
topology of pointwise convergence
pointwise convergence, criterion of
open map

6. NOTES ON SUBSPACES AND COMPACTNESS

It has been mentioned that subspaces of topological spaces (i.e. relative topologies) inherit certain qualities of the original spaces. In this section we consider this notion more systematically. We will be concerned with such topological properties as separability, countability, and compactness and their effect on subspaces.

6.1 Definition. A property of a space is referred to as hereditary if every subspace has this property. A property is said to be weakly hereditary if it is inherited by every subspace whose carrier is closed in the original space. A property is vaguely hereditary if it is inherited by every subspace whose carrier is open in the original space. [The last notion is restricted to use in this textbook.] □

6.2 Example. Second countability is hereditary. (See Problem 3.10.)

6.3 Remark. In Section 1 we denoted by $\bar{A}$ the closure of some subset $A$ of a topological space $(X,\tau)$, understanding that this is the closure relative to the topology $\tau$. As was mentioned in Definition 1.8 (i), in the case of subspaces we may need to deal with closures of subsets with respect to a relative topology, say $(Y,\tau_Y)$. To make the distinction clear we will then write $\mathrm{Cl}_Y A$. However, we will still use $\bar{A}$ having in mind the closure relative to the original space $(X,\tau)$.

6.4 Example. The property of density of a set is not hereditary and not weakly hereditary, i.e. if $D$ is dense in $(X,\tau)$, its trace in a subspace $(Y,\tau_Y)$ need not be dense. Let $(X,\tau) = (\mathbb{R},\tau_e)$ and $Y = \mathbb{R}_+ \cup \{-\sqrt{2}\}$. Then, obviously, the set $\mathbb{Q}_+ = \mathbb{Q} \cap Y$ is not dense in $(Y,\tau_Y)$. It is easily seen that $\{-\sqrt{2}\}$ is an open neighborhood of the point $-\sqrt{2}$ that does not meet $\mathbb{Q}_+$. Thus $\mathrm{Cl}_Y \mathbb{Q}_+ \neq Y$. Since $Y$ is closed in $(\mathbb{R},\tau_e)$, the density property is not weakly hereditary either. □

6.5 Theorem. Separability is vaguely hereditary, but not (weakly) hereditary.

Proof. Let $(X,\tau)$ be separable and let $(Y,\tau_Y)$ be a subspace of $(X,\tau)$ such that $Y \in \tau$. We show that $(Y,\tau_Y)$ is separable. Let $D$ be a countable, dense set in $(X,\tau)$. We need to prove that $\mathrm{Cl}_Y(D \cap Y) = Y$; specifically, we need to show that $Y \subseteq \mathrm{Cl}_Y(D \cap Y)$, for the inverse inclusion holds trivially. Let $y$ be any point of $Y$ and let $U'_y$ be any open neighborhood of $y$ in $\tau_Y$. Since $Y$ is open in $X$, $U'_y$ is also a neighborhood of the point $y$ in $\tau$. [It is easy to show the follow…

… $> 0$, there is a neighborhood $U_{x_0}$ of $x_0$ such that for each $x \in U_{x_0}$ and $f \in \mathcal{F}$, $d(f(x),f(x_0)) < \varepsilon$. The subset $\mathcal{F}$ is called ($d$-)equicontinuous if it is equicontinuous at each point of $X$. □

7.8 Theorem (Ascoli). Let $(X,\tau)$ be a compact topological space and let $(C(X;\mathbb{R}^n),\rho)$ be the function space endowed with the uniform metric $\rho$. A subset $\mathcal{F} \subseteq (C(X;\mathbb{R}^n),\rho)$ is compact if and only if it is closed, bounded and $d_e$-equicontinuous.

The proof of Ascoli's theorem is based on the following two lemmas.

7.9 Lemma. Let $(X,\tau)$ be a compact topological space and let $(Y,d)$ be a metric space. If a subset $\mathcal{F} \subseteq (C(X;Y),\rho)$ is totally bounded in $(C(X;Y),\rho)$, then $\mathcal{F}$ is $d$-equicontinuous on $X$. (See Problem 7.4.)

7.10 Lemma. Let $(X,\tau)$ be a compact topological space, $(Y,d)$ be a totally bounded metric space, and $\mathcal{F} \subseteq C(X;Y)$ be any $d$-equicontinuous subset. Then $\mathcal{F}$ is totally bounded. (See Problem 7.5.)

Proof of Ascoli's Theorem.

(i) If $\mathcal{F}$ is compact, it is closed and bounded by Theorem 6.7, Chapter 2, with no further restrictions. In this case, we have to prove that $\mathcal{F}$ is $d_e$-equicontinuous. We first show that since $\mathcal{F}$ is bounded, there is a compact subset $Y \subseteq \mathbb{R}^n$ such that, for all $x \in X$ and for all $f \in \mathcal{F}$, $f(x) \in Y$. Let $f_0 \in \mathcal{F}$. Since $f_0$ is continuous, by Theorem 6.8, $f_{0*}(X)$ is a compact subset of $\mathbb{R}^n$. In other words, $f_{0*}(X)$ is closed and bounded. Hence, there is an open ball $B_{d_e}(\theta = (0,\dots,0),R)$ such that $f_{0*}(X) \subseteq B_{d_e}(\theta,R)$. On the other hand, since $\mathcal{F}$ is bounded, there is an $M \ge 0$ such that $\rho(f_0,f) < M$, $\forall f \in \mathcal{F}$. Thus for all $f \in \mathcal{F}$ and all $x \in X$,

$$d_e(f(x),\theta) \le d_e(f(x),f_0(x)) + d_e(f_0(x),\theta) < M + R,$$

and now, the closed ball $\bar{B}_{d_e}(\theta,R+M)$ can be taken for $Y$. Hence, $(Y,d_e)$ is a compact subspace of $(\mathbb{R}^n,d_e)$ such that each $f \in \mathcal{F}$ is valued in $Y$. By compactness, $\mathcal{F}$ is totally bounded (see Theorem 6.14, Chapter 2), and we conclude that $\mathcal{F}$ is $d_e$-equicontinuous by Lemma 7.9.

(ii) Let $\mathcal{F}$ be closed, bounded, and $d_e$-equicontinuous. As a closed subset of the complete metric space $(C(X;\mathbb{R}^n),\rho)$ (Example 7.5), $(\mathcal{F},\rho)$ is complete (see Theorem 5.1, Chapter 2). Since $\mathcal{F}$ is bounded, by the above argument in (i), all functions of $\mathcal{F}$ are valued in a compact subspace $Y$ of $(\mathbb{R}^n,d_e)$. Now, $X$ and $Y$ are compact and $\mathcal{F}$, by the assumption, is $d_e$-equicontinuous. By Lemma 7.10, we conclude that $\mathcal{F}$ is totally bounded. Finally, we can make use of Theorem 6.14, Chapter 2, and have $\mathcal{F}$ compact. □

7.11 Examples. For the following examples we denote by $C^{(1)}(X;Y)$ the space of all differentiable functions with uniformly bounded derivatives.

(i) Let $X = Y = \mathbb{R}$. Then $C^{(1)}(\mathbb{R};\mathbb{R})$ is an equicontinuous family. Indeed, for every $f \in C^{(1)}(\mathbb{R};\mathbb{R})$, $|f'| \le M$. Let $\varepsilon > 0$ and $x \in \mathbb{R}$. Then for all $y \in \mathbb{R}$ such that $|x - y| < \varepsilon/M$, we have, by the mean value theorem,

$$|f(x) - f(y)| \le M\,|x - y| < \varepsilon.$$

(ii) Let $X = [a,b]$, $Y = \mathbb{R}$, and $\mathcal{F}$ be the subspace of $C^{(1)}(X;Y)$ consisting of all uniformly bounded functions. We wish to show that $(\mathcal{F},\rho)$ is compact. By Example (i), $\mathcal{F}$ is equicontinuous. Clearly, $(\mathcal{F},\rho)$ is bounded, since the diameter of $\mathcal{F}$ is

$$\sup\{\rho(f,g) : f,g \in \mathcal{F}\} \le 2N,$$

where $N$ is defined as the common bound for all $f \in \mathcal{F}$. Furthermore, it is easy to see that $(\mathcal{F},\rho)$ is closed. Since a subset of a metric space is closed if and only if it contains all of its limit points, we select an arbitrary convergent sequence $\{f_n\} \subseteq \mathcal{F}$ and show that its limit is a function, which

1) is differentiable,
2) is bounded by $N$,
3) has its derivative bounded by $M$.

The first statement immediately follows from the known fact in analysis that a uniformly convergent sequence $\{f_n\}$ of differentiable functions has as the limit a differentiable function $f$, and that $\rho$-$\lim f_n' = f'$. The other two statements can be easily verified.

There is another version of Ascoli's Theorem frequently used in applications. It is based on the result of Problem 7.14: If $(\mathcal{F},\rho) \subseteq (C(X;\mathbb{R}^n),\rho)$ is equicontinuous and bounded, then $(\bar{\mathcal{F}},\rho)$ is also equicontinuous and bounded. We will need another definition. Any subset of a topological space is called relatively compact if its closure is compact. For instance, if $\mathcal{F}$ is a sequence of continuous functions (which need not be closed), we might be interested in whether or not it has a convergent subsequence, i.e. if $\mathcal{F}$ is sequentially compact or, equivalently, if $\mathcal{F}$ is relatively compact. Now, with the use of Problem 7.14, the following version of Ascoli's Theorem obviously holds.

7.12 Theorem (Ascoli). Let $\mathcal{F}$ be a subset in a uniform metric space $(C(X;\mathbb{R}^n),\rho)$. Then $\mathcal{F}$ is relatively compact if and only if $\mathcal{F}$ is bounded and equicontinuous.
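The mean value estimate of Example 7.11 (i) lends itself to a quick numerical check. The sketch below is illustrative only: the two families $f_k(x) = \sin(kx)/k$ and $g_k(x) = \sin(kx)$, the point $x_0$, the sampling, and all names are assumptions made for this example. For the first family $|f_k'| \le M = 1$, so $\delta = \varepsilon/M$ should serve every member at once; the second family has no common bound on the derivatives.

```python
import math

def oscillation(f, x0, delta, samples=2001):
    """Largest sampled |f(x) - f(x0)| over x in [x0 - delta, x0 + delta]."""
    fx0 = f(x0)
    xs = (x0 - delta + 2.0 * delta * i / (samples - 1) for i in range(samples))
    return max(abs(f(x) - fx0) for x in xs)

def family_oscillation(make_f, ks, x0, delta):
    """Largest sampled oscillation over all members of the family."""
    return max(oscillation(make_f(k), x0, delta) for k in ks)

def bounded_derivative(k):
    """f_k(x) = sin(k x) / k, so f_k'(x) = cos(k x) and |f_k'| <= 1."""
    return lambda x: math.sin(k * x) / k

def unbounded_derivative(k):
    """g_k(x) = sin(k x), so g_k'(x) = k cos(k x); no common bound."""
    return lambda x: math.sin(k * x)

eps, M, x0 = 0.1, 1.0, 0.3
ks = range(1, 51)
delta = eps / M   # the delta suggested by the mean value theorem

osc_bounded = family_oscillation(bounded_derivative, ks, x0, delta)
osc_unbounded = family_oscillation(unbounded_derivative, ks, x0, delta)
print(osc_bounded, osc_unbounded)
```

In this experiment one $\delta$ keeps the whole first family within $\varepsilon$ of its values at $x_0$, while the second family oscillates by more than 1 on the same interval. A finite sample of 50 functions only illustrates the definition and does not prove anything.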

A more general version of Ascoli's Theorem for a subset $\mathcal{F} \subseteq C(X;Y)$, where $Y$ is a Banach space, requires a finer condition imposed on $\mathcal{F}$.

7.13 Theorem (Ascoli). Let $\mathcal{F}$ be a subset in a uniform normed linear space $(C(X;Y),\sup\|\cdot\|)$, where $(Y,\|\cdot\|)$ is a Banach space (over $\mathbb{F}$). Then $\mathcal{F}$ is relatively compact with respect to $\sup\|\cdot\|$ if and only if $\mathcal{F}$ is equicontinuous and, for every $x \in X$, the set

$$\mathcal{F}(x) = \{f(x) \in Y : f \in \mathcal{F}\}$$

is relatively compact in $(Y,\|\cdot\|)$. □

(See Problem 7.16.) As mentioned earlier, there are very many other versions of Ascoli's Theorem known from textbooks and research papers that led to special applications. For instance, consider Arzelà's Theorem (see Problem 7.15). To work with some of the problems below we need the notion of pointwise boundedness.

7.14 Definition. A collection $\mathcal{F} \subseteq C(X;(Y,d))$ of functions is called pointwise bounded if, for every $x \in X$, the set $\mathcal{F}(x) = \{f(x) : f \in \mathcal{F}\}$ is bounded, i.e. for each $x \in X$, there is a positive real number $M_x$ such that $d(f(x),g(x)) \le M_x$ for each pair $f,g \in \mathcal{F}$. □

Recall that a collection $\mathcal{F} \subseteq (C(X;Y),\rho)$ is uniformly bounded if there is a positive real number $M$ such that $\rho(f,g) \le M$, $\forall f,g \in \mathcal{F}$.

PROBLEMS

7.1

Prove Lemma 7.3. [Hint: Make use of the inequality

$$d(f(x),f(x_0)) \le d(f(x),f_k(x)) + d(f_k(x),f_k(x_0)) + d(f_k(x_0),f(x_0))$$

and continuity of $f_k$ in the form of Problem 4.8.]

7.2

Prove Theorem 7.4. [Hint: Use Problems 4.6-4.9 and apply Lemma 7.3.]

7.3

Prove Theorem 7.6. [Hint: Show first the validity of the statement similar to Lemma 7.3: Under the conditions of Theorem 7.6, if a sequence $\{f_n\} \subseteq (C(X;Y),\rho)$ converges uniformly to a function $f$, then $f \in C(X;Y)$.]

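7.4

Prove Lemma 7.9. [Hint: Let $\mathcal{F}$ be totally bounded; show that $\mathcal{F}$ is equicontinuous at any fixed point $x_0 \in X$.

1) Choose any $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon \ge 2\delta_1 + \delta_2$.

2) Cover $\mathcal{F}$ by balls $B_\rho(f_i,\delta_1)$, $i = 1,\dots,n$ [call the $n$-tuple $\{f_1,\dots,f_n\}$ a $\delta_1$-net].

3) Use continuity of each $f_i$ at $x_0$ in the form of Problem 4.7 b): for each $\delta_2 > 0$, there is a neighborhood $U^i_{x_0}(\delta_2)$ of $x_0$ such that $d(f_i(x),f_i(x_0)) < \delta_2$ for all $x \in U^i_{x_0}(\delta_2)$.

4) Choose a neighborhood $U_{x_0}(\delta_2)$ of $x_0$ with $U_{x_0}(\delta_2) \subseteq \bigcap_{i=1}^{n} U^i_{x_0}(\delta_2)$, good for all $f_i$'s.

5) Let $f$ be any function in $\mathcal{F}$; thus $f$ falls into one of the balls in 2), say $B_\rho(f_i,\delta_1)$.

6) Use the estimate

$$d(f(x),f(x_0)) \le d(f(x),f_i(x)) + d(f_i(x),f(x_0)),$$

where the first term of the right-hand side of the inequality is less than $\delta_1$ (why?), and the second term is dominated by

$$d(f_i(x),f_i(x_0)) + d(f_i(x_0),f(x_0)) < \delta_2 + \delta_1.$$

(The estimate needed then follows.)]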

7.5

Prove Lemma 7.10. [Hint: Choose $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon > 2\delta_1 + \delta_2$. Show that there exists an $\varepsilon$-net $\{f_1,\dots,f_N\} \subseteq \mathcal{F}$. Use the steps that follow.

1) Use equicontinuity of $\mathcal{F}$ and compactness of $X$ to show that, for every $\delta_1 > 0$, there is a finite open cover (by neighborhoods) $\{U_{x_1}(\delta_1),\dots,U_{x_n}(\delta_1)\}$ of $X$, such that for any $f \in \mathcal{F}$ and for any $y$ that falls into a neighborhood $U_{x_i}(\delta_1)$,

$$d(f(y),f(x_i)) < \delta_1.$$

2) Cover $Y$ by a finite collection $\{B(j)\}$ of $d$-balls, such that $B(j) = B_d(y_j,\delta_2)$, $j = 1,\dots,m$.

3) Let $\Gamma$ be the collection of all integer functions

$$\gamma : \{1,\dots,n\} \to \{1,\dots,m\}.$$

Let $\Gamma'$ be a subset of $\Gamma$ with the following property: an element $\gamma \in \Gamma$ belongs to $\Gamma'$ if and only if there is a function $f \in \mathcal{F}$ such that $f(x_i) \in B(\gamma(i))$, $i = 1,\dots,n$. Let $|\Gamma'| = N$. Then order the elements of $\Gamma'$ and the functions assigned to $\Gamma'$ by $(1,\dots,N)$, so that $\Gamma' = \{\gamma_1,\dots,\gamma_N\}$ and $\mathcal{F}' = \{f_1,\dots,f_N\}$. Show that $\mathcal{F}'$ is a relevant $\varepsilon$-net.

4) Let $f \in \mathcal{F}$. Show that for this $f$ there is an element of $\Gamma'$, say $\gamma_j$, such that if $f(x_i) \in B(\gamma_j(i))$, $i = 1,\dots,n$, then $d(f(x_i),f_j(x_i)) < \delta_2$, $i = 1,\dots,n$.

5) Show that for all $x \in X \setminus \{x_1,\dots,x_n\}$,

$$d(f(x),f_j(x)) < 2\delta_1 + \delta_2,$$

by using the triangle inequality and the inequality in 1).

6) Show that the inequality in 5) implies the desired inequality $\rho(f,f_k) < \varepsilon$ for some $k \in \{1,\dots,N\}$ and therefore $\{f_1,\dots,f_N\}$ is indeed an $\varepsilon$-net in $\mathcal{F}$.]

7.6

Prove the following: Let $(X,\tau)$ be a topological space and let $(Y,d)$ be a complete metric space. If $C_*(X;Y)$ is the subspace of continuous bounded functions, then $(C_*(X;Y),\rho)$ is a complete uniform metric space.

7.7

Prove the statement: If $\mathcal{F} \subseteq C(X;Y)$ is an equicontinuous family, then so is its uniform closure $\bar{\mathcal{F}}$.

7.8

Prove Dini's Theorem: Let $(X,\tau)$ be a compact topological space. Consider the space $(C(X;\mathbb{R}),\rho)$. Let $\{f_n\}$ be a monotone sequence from $C(X;\mathbb{R})$ such that $\{f_n\}$ converges to a continuous function $f \in C(X;\mathbb{R})$ in the topology of pointwise convergence. Then $\{f_n\}$ converges to $f$ in $\rho$ also.

7.9

Let $\mathcal{P}_n$ be the set of all polynomials defined on $[0,1]$ with degrees less than or equal to $n$ and with all real coefficients bounded by a positive constant. Show that $(\mathcal{P}_n,\rho)$ is compact.

7.10

Let $\mathcal{F} \subseteq C(X;Y)$, where $X$ is a compact topological space and $Y$ is a metric space. Show that if $\mathcal{F}$ is equicontinuous and pointwise bounded, then it is uniformly bounded in $(C(X;Y),\rho)$.

7.11

Let $\mathcal{F} \subseteq C(X;\mathbb{R}^n)$ and let $X$ be compact. Show that $\mathcal{F}$ is relatively compact if and only if it is equicontinuous and pointwise bounded.

7.12

Let $\mathcal{F}$ be the set of functions

Show that the set $(\mathcal{F},\rho)$ is sequentially compact.

7.13

Let $\mathcal{F}$ be a sequence of functions with $f_n(x) = b_n \cos x$, $b_n = 1 + \frac{1}{n}$, $n = 1,2,\dots$, and $f_0(x) = \cos x$. Show that $(\mathcal{F},\rho)$ is compact.

7.14

Let $\mathcal{F}$ be a subset of $(C(X;\mathbb{R}^n),\rho)$. Show that the uniform closure $(\bar{\mathcal{F}},\rho)$ is equicontinuous and bounded if and only if $(\mathcal{F},\rho)$ is equicontinuous and bounded.

7.15

Prove Arzelà's Theorem: Let $X$ be compact and let $\{f_k\} \subseteq C(X;\mathbb{R}^n)$ be a pointwise bounded and equicontinuous sequence of functions. Show that $(\{f_k\},\rho)$ is sequentially compact.

7.16

Prove Theorem 7.13.

NEW TERMS:
uniform metric
uniform metric space
supremum norm
space of all continuous functions
space of all continuous bounded functions
uniform convergence
uniform convergence, criterion of
completeness of a uniform metric space
equicontinuity at a point
equicontinuity on a set
Ascoli's Theorem
equicontinuity, criterion of
total boundedness, criterion of
relative compactness
pointwise bounded set of functions
uniformly bounded set of functions
Dini's Theorem
Arzelà's Theorem


8. STONE-WEIERSTRASS APPROXIMATION THEOREM

Let $(\Omega,\tau)$ be a topological space, $X$ be a compact subset, and let $(A,\mathbb{R}) \subseteq C(X;\mathbb{R})$ be a subspace of the space of all real-valued continuous functions on $X$ that also contains products $f \cdot g$ of functions from $A$. Each continuous function on a compact set is bounded, as we know from Theorem 6.8. We will use the uniform metric $\rho$ introduced in the previous section:

$$\rho(f,g) = \sup\{|f(x) - g(x)| : x \in X\}.$$

Since $C(X;\mathbb{R})$ is complete (Example 7.3 or Theorem 7.2), $\bar{A} \subseteq C(X;\mathbb{R})$. We wonder under what condition $\bar{A} = C(X;\mathbb{R})$, i.e., under what condition each continuous function can be "uniformly approximated" by elements of $A$. For instance, if $A$ is the set of all polynomials, can a continuous function be uniformly approximated by a sequence of polynomials? It is known from calculus that every function analytic at a point can be uniformly approximated in a vicinity of this point by a sequence of polynomials (Taylor's theorem). In 1885, the German mathematician Karl Weierstrass established a more general result (also known from calculus), which states that every continuous function defined on a compact interval $X$ can be uniformly approximated by polynomials. Finally, in 1937 the American mathematician Marshall H. Stone generalized the classical Weierstrass Theorem, allowing $X$ to be a compact topological space with some minor restriction on the subspace $A$. For all necessary preliminaries the reader is referred to the beginning of Section 7, Chapter 1. We will start with some auxiliary results to be rendered in a few steps (Lemmas 8.4 and 8.5) that lead to the Stone-Weierstrass Approximation Theorem.

8.1 Remark. Compactness of the topological space $(X,\tau)$ discussed above is not a mandatory prerequisite for defining the uniform metric, if we consider $C_*(X;Y)$ as the subspace of all $d$-bounded continuous functions from $(X,\tau)$ to a complete metric space $(Y,d)$. The uniform metric $\rho$ is also well-defined on $C_*(X;Y)$. Completeness of $(C_*(X;Y),\rho)$ is then due to Theorem 7.6 (where only boundedness of $C_*(X;Y)$ on the compact space $X$ is essential).

8.2 Definitions.

(i) Let $\mathcal{G}$ be a family of functions defined on a set $X$. Then $\mathcal{G}$ separates points of $X$ if for each $x$ and $y$ from $X$ such that $x \neq y$, there is a function $f \in \mathcal{G}$ such that $f(x) \neq f(y)$.

(ii) Let $\mathcal{G} \subseteq C_*(X;\mathbb{R})$ be an arbitrary nonempty subcollection of continuous, bounded functions on $X$ and let $A$ be any subalgebra of $C_*(X;\mathbb{R})$ containing $\mathcal{G}$. The intersection of all subalgebras containing $\mathcal{G}$ is obviously a subalgebra (see Problem 8.1); moreover, it is the smallest subalgebra containing $\mathcal{G}$. It is denoted by $A(\mathcal{G})$ and is called the subalgebra generated by $\mathcal{G}$. The subcollection $\mathcal{G}$ is called the generator of this subalgebra. □

8.3 Theorem (Stone-Weierstrass). Let $X$ be a compact subset of a topological space $(\Omega,\tau)$ and let $\mathcal{G} \subseteq C_*(X;\mathbb{R})$. If $\mathcal{G}$ separates points and contains the unity 1 (i.e. the function identically equal to 1), then the subalgebra $A(\mathcal{G})$ generated by $\mathcal{G}$ is dense in $C_*(X;\mathbb{R})$ relative to the uniform metric $\rho$. [Observe that, if needed, the condition "$\mathcal{G}$ separates points" can be replaced by the condition "$A(\mathcal{G})$ separates points."]

A few lemmas will precede the proof of Theorem 8.3.

8.4 Lemma. For each $\varepsilon > 0$, there is a polynomial $P(t)$ such that

$$\big|\, P(t) - |t| \,\big| < \varepsilon \quad \text{for all } t \in [-1,1].$$

Proof. Let $\sum_{n=0}^{\infty} b_n z^n$ be the binomial expansion of the function $(1+z)^{\alpha}$ for $\alpha \in \mathbb{Q}$ and $z \in \mathbb{C}$. Recall that this function can be expanded in the binomial series, where the coefficient $b_n$ is given by the formula

$$b_n = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!},\ n \ge 1, \quad \text{and } b_0 = 1. \qquad (8.4)$$

The binomial series converges in the open ball $B(0,1) \subseteq \mathbb{C}$, uniformly on its compact subsets, and at the point $z = (-1,0)$, for $\alpha > 0$, it is absolutely convergent as a special case of a hypergeometric series. Thus, by Abel's theorem, the series $\sum_{n=0}^{\infty} b_n x^n$ with coefficients given by (8.4) is uniformly convergent to the function $(1+x)^{\alpha}$, at least for all $x \in [-1,0]$. Letting $\alpha = \frac{1}{2}$ and replacing $x$ by $-x$ we arrive at the series $\sum_{n=0}^{\infty} b'_n x^n$, which is uniformly convergent to $\sqrt{1-x}$, $\forall x \in [0,1]$, where $b'_n = (-1)^n b_n$. The statement now follows if we set $x = 1 - t^2$, where $t \in [-1,1]$. The series

$$\sum_{n=0}^{\infty} b'_n (1 - t^2)^n$$

converges uniformly to $|t|$, $\forall t \in [-1,1]$, with $b'_n = (-1)^n b_n$, and the partial sums of the series are polynomials. □
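8.5 Lemma. Let $(X,\tau)$ be a topological space, $A \subseteq C_*(X;\mathbb{R})$ be a subalgebra, and $(C_*(X;\mathbb{R}),\rho)$ be a uniform metric space. Then the closure $\bar{A}$ relative to $\rho$ is a subalgebra and, in addition, $\bar{A}$ is a vector lattice, i.e., $\forall f,g \in \bar{A}$, $f \wedge g$ and $f \vee g \in \bar{A}$.

Proof. By Problem 8.2 a), $\bar{A}$ is a subalgebra. We need to show that $\bar{A}$ is a lattice. Because of

$$a \wedge b = \frac{(a+b) - |a-b|}{2} \quad \text{and} \quad a \vee b = \frac{(a+b) + |a-b|}{2},$$

it suffices to show that with $f \in \bar{A}$, $|f| \in \bar{A}$. Since $f$ is a continuous, bounded function on $X$, $|f| \le M$ for some $M > 0$ and $|g| \le 1$, where $g = f/M \in \bar{A}$. By Lemma 8.4, for every $\varepsilon > 0$ there is a polynomial $P$ such that $\big|P(g(x)) - |g(x)|\big| < \varepsilon$ for all $x \in X$. Since $\bar{A}$ is a subalgebra, $P(g) \in \bar{A}$. Hence, for every $\varepsilon > 0$, the ball $B_\rho(|g|,\varepsilon)$ meets $\bar{A}$, implying that $|g| \in \bar{A}$. Hence, $|f| = M|g| \in \bar{A}$ (see Problem 1.16). Finally, the statement of the lemma follows from the linearity of $\bar{A}$. □

Now we return to the Stone-Weierstrass Theorem.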

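Proof (of the Stone-Weierstrass Theorem). We will show that each function $f \in C_*(X;\mathbb{R})$ can be approximated by functions from $A = A(\mathcal{G})$ relative to the uniform metric. By the assumption, $\mathcal{G}$ separates points, i.e., $\forall x_1 \neq x_2 \in X$, there exists a function $g \in \mathcal{G}$ such that $g(x_1) \neq g(x_2)$. Define, for fixed $\alpha,\beta \in \mathbb{R}$, the auxiliary function

$$h(x) = \alpha + (\beta - \alpha)\,\frac{g(x) - g(x_1)}{g(x_2) - g(x_1)},$$

which belongs to $A$, because $1 \in A$. Thus, $\forall x_1 \neq x_2 \in X$ and $\forall \alpha,\beta \in \mathbb{R}$, there is an $h \in A$ such that $h(x_1) = \alpha$ and $h(x_2) = \beta$. Let $f \in C_*(X;\mathbb{R})$. Then by the above argument, $\forall x \neq y$, there is an $h_{xy} \in A$ with the property that

$$h_{xy}(x) = f(x) \quad \text{and} \quad h_{xy}(y) = f(y),$$

where $h_{xy}$ is the auxiliary function above with $x_1 = x$, $x_2 = y$, $\alpha = f(x)$ and $\beta = f(y)$.

Fix an $x$ and let $y$ be arbitrary. Since $f - h_{xy}$ is continuous at $y$ and $f(y) - h_{xy}(y) = 0$, $\forall \varepsilon > 0$, there is an open ball $B(y) = B(y,\delta_y)$, such that

$$|f(z) - h_{xy}(z)| < \varepsilon, \quad \forall z \in B(y). \qquad (*)$$

Now, we cover $X$ by $\{B(y) : y \in X\}$, and by compactness of $X$, reduce this cover to a finite subcover $\{B(y_1),\dots,B(y_n)\}$. Let the associated functions, with the above properties in vicinities of $y_1,\dots,y_n$, be

$$h_{xy_1},\dots,h_{xy_n},$$

respectively, and let $h_x = \min(h_{xy_1},\dots,h_{xy_n})$ on $X$. By Lemma 8.5, $h_x \in \bar{A}$. By (*), $\forall \varepsilon > 0$,

$$h_{xy_i}(z) < f(z) + \varepsilon, \quad \forall z \in B(y_i),\ i = 1,\dots,n,$$

which implies that

$$h_x(z) < f(z) + \varepsilon, \quad \forall z \in X. \qquad (**)$$

Observe that the above inequalities, along with their parameters, depend upon a fixed $x \in X$. Notice that $h_x$ does not really approximate $f$ on $X$; it just approximates $f$ in a vicinity of the point $x$. At the point $x$ itself, $f(x) = h_x(x)$, and $h_x$ and $f$ satisfy inequality (**). By continuity of $f - h_x$, for each $\varepsilon > 0$, there is a ball $B(x) = B(x,\delta_x)$ such that $|f(z) - h_x(z) - 0| < \varepsilon$, $\forall z \in B(x)$. Again, let us cover $X$ by the collection $\{B(x) : x \in X\}$ and then reduce the latter to a finite subcover $\{B(x_1),\dots,B(x_k)\}$. Correspondingly, $\forall \varepsilon > 0$,

$$f(z) - \varepsilon < h_{x_i}(z), \quad \forall z \in B(x_i),\ i = 1,\dots,k.$$

Then $h = \max(h_{x_1},\dots,h_{x_k}) \in \bar{A}$ by similar considerations, and hence

$$f(z) - \varepsilon < h(z), \quad \forall z \in X.$$

Furthermore, (**) yields that

$$h(z) < f(z) + \varepsilon, \quad \forall z \in X,$$

and

$$|f(z) - h(z)| < \varepsilon, \quad \forall z \in X.$$

From the last inequality we have that any function $f \in C_*(X;\mathbb{R})$ is approximated by elements of $\bar{A}$, i.e. $\forall \varepsilon > 0$, $B_\rho(f,\varepsilon) \cap \bar{A} \neq \emptyset$, which, due to Problem 1.16, implies that $B_\rho(f,\varepsilon) \cap A \neq \emptyset$. □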

8.6 Corollary (K. Weierstrass). Every real-valued continuous function defined on a compact interval $[a,b]$ can be approximated uniformly by polynomials. (In other words, the algebra $C([a,b];\mathbb{R})$ of all continuous functions on $[a,b]$ is the closure of the subalgebra $A$ of all polynomials on $[a,b]$.)

Proof. The subalgebra of all polynomials on $[a,b]$ has $\mathcal{G} = \{1,x\}$ as a generator, which contains 1 and separates points (see Problem 8.3). Therefore, the hypotheses of the Stone-Weierstrass theorem are satisfied. □

In the proof of the classical version of the Stone-Weierstrass theorem we essentially needed a subalgebra $A(\mathcal{G})$. Indeed, in Lemma 8.5 we made use of the fact that $g \in \bar{A}$ and $P(g) \in \bar{A}$ to show that $|g| \in \bar{A}$ and to claim that $\bar{A}$ is a vector lattice. Had we assumed that $\mathcal{G}$ is already a vector lattice separating points and containing 1, we would have been able to prove the Stone-Weierstrass theorem (special version) without Lemmas 8.4 and 8.5.

8.7 Theorem (Stone-Weierstrass, special version). Let $(X,\tau)$ be a compact topological space and let $\mathcal{G} \subseteq C_*(X;\mathbb{R})$ be a vector lattice that separates points and contains 1. Then $\mathcal{G}$ is dense in $C_*(X;\mathbb{R})$.

8.8 Example. Let $\mathcal{G}$ be the collection of all continuous piecewise linear functions on $[0,1]$. Then $\mathcal{G}$ satisfies the hypotheses of Theorem 8.7 and $\bar{\mathcal{G}} = C([0,1];\mathbb{R})$. In other words, every continuous function on $[0,1]$ can be uniformly approximated by piecewise linear functions.
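The conclusion of Example 8.8 can be observed on a concrete function. The sketch below is an illustration under assumptions of its own: it uses interpolation at equally spaced nodes (one convenient way to pick a piecewise linear approximant, not the argument of Theorem 8.7), the test function $\sqrt{x}$, and hypothetical function names.

```python
import math

def piecewise_linear(f, n):
    """Continuous piecewise linear interpolant of f at n + 1 equally spaced nodes on [0, 1]."""
    nodes = [i / n for i in range(n + 1)]
    values = [f(x) for x in nodes]

    def g(x):
        i = min(int(x * n), n - 1)   # index of a subinterval containing x
        t = (x - nodes[i]) * n       # local coordinate: 0 at the left node, 1 at the right
        return (1.0 - t) * values[i] + t * values[i + 1]

    return g

def sup_distance(f, g, grid_size=10_001):
    """Sampled uniform distance between f and g on [0, 1]."""
    grid = (i / (grid_size - 1) for i in range(grid_size))
    return max(abs(f(x) - g(x)) for x in grid)

f = math.sqrt   # continuous on [0, 1], but not Lipschitz at 0
errors = [sup_distance(f, piecewise_linear(f, n)) for n in (4, 64, 1024)]
print(errors)
```

The sampled uniform distance decreases as the partition is refined, even though the derivative of $\sqrt{x}$ is unbounded near 0.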

PROBLEMS

8.1

Show that $A(\mathcal{G})$ in Definition 8.2 (ii) is a subalgebra.

8.2

Let $\bar{A}$ be the closure of a subset $A \subseteq C_*(X;\mathbb{R})$ relative to the uniform metric $\rho$. Show that

a) if $A$ is an algebra, then $\bar{A}$ is also an algebra;

b) if $A$ is a vector lattice, then $\bar{A}$ is also a vector lattice.

8.3

Let $\mathcal{G} = \{f(x) = 1,\ g(x) = x\} \subseteq C([a,b];\mathbb{R})$, for $a < b \in \mathbb{R}$. Show that $A(\mathcal{G})$ is the subalgebra of all polynomials on $[a,b]$.

8.4

Let $\mathcal{G}$ be the collection of all continuous, piecewise linear functions on $[0,1]$. Show that $\mathcal{G} \subseteq C([0,1];\mathbb{R})$ is a vector lattice but not a subalgebra.

8.5

Let $X$ be a compact subset of $\mathbb{R}$. Show that $(C(X;\mathbb{R}),\rho)$ is separable.

8.6

Let $(X,\tau(d))$ be a compact metrizable topological space. Show that $(C(X;\mathbb{R}),\rho)$ is separable. [Hint: Use the steps that follow.

1) Let $D = \{d_1,d_2,\dots\}$ be a countable, dense set in $(X,d)$ (why?). Define $f_n(x) = d(x,d_n)$, $\forall x \in X$.

2) Show that $f_n \in C(X;\mathbb{R})$.

3) Show that $\{f_n\}$ separates points.

4) Show that the algebra generated by $\{f_n : n = 0,1,\dots\}$, with $f_0 = 1$, is dense in $C(X;\mathbb{R})$.]

8.7

Prove the following: Let $X$ be a compact subset of $\mathbb{R}^n$. Then every real-valued continuous function on $X$ can be approximated uniformly by polynomial functions of $n$ variables.

8.8

Can continuous functions on a compact interval be approximated by polynomials with rational coefficients?

8.9

Show that each continuous function on a compact interval can be approximated by a differentiable function.

8.10

Can continuous functions on a compact interval be approximated by polynomials with integer coefficients? Can we apply the Stone-Weierstrass theorem?

8.11

A continuous function $f$ defined on a compact interval $[a,b]$ is called a parabolic spline if there is a partition $(a_0 = a, a_1, \dots, a_n = b)$ of $[a,b]$ (cf. Definition 1.7 (ii), Chapter 1) such that $f$ is a second degree polynomial on each subinterval $[a_i,a_{i+1}]$, $i = 0,\dots,n-1$. Can continuous functions on $[a,b]$ be approximated by parabolic splines? If so, what version of the Stone-Weierstrass theorem should be applied?

8.12

Consider a subcollection $\mathcal{F}$ of "rational" parabolic splines on $[a,b]$, i.e. piecewise second degree polynomials with rational coefficients. Can continuous functions on $[a,b]$ be approximated by elements of $\mathcal{F}$?

NEW TERMS:
set of functions that separates points
subalgebra generated by continuous functions
generator of a subalgebra
Stone-Weierstrass Theorem
binomial series
Weierstrass Theorem
piecewise linear function
subalgebra of polynomials
parabolic spline
rational parabolic spline

9. FILTER AND NET CONVERGENCE

In this section we will generalize the concept of convergence of sequences introduced in Section 3. Many problems in topological spaces allow significantly weaker conditions imposed on the linear order of terms in sequences while retaining the principles of convergence. This gives rise to the notion of a net, which is a set indexed by another (partially ordered) set, in which the usual linear order is therefore largely relaxed. One of the prominent applications of convergence of nets is the notion of the Riemann integral, which is known to have inspired the American mathematician Eliakim H. Moore, in his widely cited 1915 paper, Definition of limit in general integral analysis, and in the 1922 paper, A general theory of limits, co-authored with H. L. Smith, to develop the general concept of a net. Filters offer another, very useful type of convergence in topological spaces, such as convergence of neighborhoods to a point. The theory of filters was developed in the thirties by the famous Bourbaki group of French mathematicians.

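9.1 Definitions.

(i) Let $X$ be a set and $\mathcal{F} \subseteq \mathcal{P}(X)$ be a nonempty collection of sets. $\mathcal{F}$ is said to be a filter on $X$ if:

a) $\emptyset \notin \mathcal{F}$,

b) for each two sets $F_1,F_2 \in \mathcal{F}$, $F_1 \cap F_2 \in \mathcal{F}$ (specifically, it means that no two elements of $\mathcal{F}$ are disjoint),

c) if $F_1 \in \mathcal{F}$, then any superset $F_2$ of $F_1$ is also an element of $\mathcal{F}$. Clearly, $X \in \mathcal{F}$.

(ii) A collection of subsets $\mathcal{F}_b \subseteq \mathcal{P}(X)$ is called a filter base on $X$ if:

a) $\emptyset \notin \mathcal{F}_b$,

b) for each two sets $F_1,F_2 \in \mathcal{F}_b$, there is a set $F \in \mathcal{F}_b$ such that $F \subseteq F_1 \cap F_2$ (clearly, $F_1 \cap F_2 \neq \emptyset$).

(iii) Let $\mathcal{F}$ be a filter on $X$. A collection of subsets $\mathcal{F}_b \subseteq \mathcal{P}(X)$ is called a filter base for the filter $\mathcal{F}$ if:

a) $\mathcal{F}_b \subseteq \mathcal{F}$,

b) each $F \in \mathcal{F}$ is a superset of some $F_b \in \mathcal{F}_b$.

(iv) A filter $\mathcal{F}$ on $X$ is called an ultrafilter if for each subset $A$ of $X$, either $A$ or $A^c$ is in $\mathcal{F}$.

9.2 Remarks.

(i) A filter is obviously a filter base, since we can take $F_1 \cap F_2$ for $F$ to have $\mathcal{F}_b = \mathcal{F}$.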

(ii) Let $\mathcal{F}_b$ be a filter base on $X$. We can extend $\mathcal{F}_b$ to a filter $\mathcal{F}$ by including in $\mathcal{F}$ additionally all supersets of each $F_b \in \mathcal{F}_b$. Indeed, $\emptyset \notin \mathcal{F}$, because no element of $\mathcal{F}_b$ is empty, and:

a) Let $F_1, F_2 \in \mathcal{F}$. Then there are $F_b', F_b'' \in \mathcal{F}_b$ such that $F_b' \subseteq F_1$ and $F_b'' \subseteq F_2$. Thus, there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq F_b' \cap F_b''$ $(\subseteq F_1 \cap F_2)$. By definition, $\mathcal{F}$ contains all supersets of elements of $\mathcal{F}_b$; in particular, $F_1 \cap F_2$ is one superset of $F_b$. Consequently, $F_1 \cap F_2 \in \mathcal{F}$.

b) Let $F \in \mathcal{F}$. Then there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq F$. Now, $\mathcal{F}$ contains all supersets of $F_b$, thus all supersets of $F$.

Therefore, $\mathcal{F}$ is a filter. Note that the above filter $\mathcal{F}$ is the smallest filter containing the filter base $\mathcal{F}_b$ (show it in Problem 9.1); in general, there are other, finer filters containing $\mathcal{F}_b$. Consequently, it is called the filter generated by the filter base and it is denoted by $\mathcal{F}(\mathcal{F}_b)$. Thus a filter base on $X$ is a filter base for a filter on $X$, namely for the filter generated by the filter base.

(iii) We showed that a filter base on $X$ is a filter base for a filter. The converse is also true: a filter base for a filter is a filter base on $X$ (show it in Problem 9.2).
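The construction in Remark 9.2 (ii) can be tried out on a small finite set. The sketch below is only an illustration under stated assumptions: the universe `X`, the family `base`, and all function names are hypothetical choices, not taken from the text. It builds the family of all supersets of elements of a filter base and checks the filter axioms of Definition 9.1 (i) by brute force.

```python
from itertools import combinations


def powerset(universe):
    """Return all subsets of a finite set as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter_base(base):
    """Definition 9.1 (ii): no empty member, and any two members
    contain a common member of the base in their intersection."""
    if not base or frozenset() in base:
        return False
    return all(any(f <= f1 & f2 for f in base)
               for f1 in base for f2 in base)


def generated_filter(universe, base):
    """Remark 9.2 (ii): all supersets (inside the universe) of base members."""
    return {s for s in powerset(universe) if any(b <= s for b in base)}


def is_filter(universe, family):
    """Definition 9.1 (i), checked exhaustively on a finite universe."""
    if not family or frozenset() in family:
        return False
    closed_meet = all(f1 & f2 in family for f1 in family for f2 in family)
    closed_up = all(s in family for f in family
                    for s in powerset(universe) if f <= s)
    return closed_meet and closed_up


# Hypothetical example data (an assumption made for illustration only).
X = frozenset({1, 2, 3, 4})
base = {frozenset({1, 2}), frozenset({1, 2, 3})}
F = generated_filter(X, base)
```

Here `F` consists of the four supersets of `{1, 2}` in `X`, and `base` itself is not a filter because it misses the supersets `{1, 2, 4}` and `X`.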

9.3 Examples.

(i) The neighborhood system $\mathcal{U}_x$ at a point $x \in (X,\tau)$ is a filter on $X$, called the neighborhood filter.

(ii) A neighborhood base $\mathcal{B}_x$ at a point $x \in (X,\tau)$ is a filter base on $X$.

(iii) Let xo E X = R. Then the following collection of sets are filter bases:

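9.4 Lemma. Let $\mathbb{F}(\mathcal{F}_0)$ be the collection of all filters that contain a filter $\mathcal{F}_0$ on $X$. Let $\subseteq$ be the partial order of inclusion on $\mathbb{F}(\mathcal{F}_0)$. A filter $\mathcal{F} \in \mathbb{F}(\mathcal{F}_0)$ is an ultrafilter if and only if $\mathcal{F}$ is a maximal filter in $\mathbb{F}(\mathcal{F}_0)$.

Proof. 1) Let $\mathcal{F}$ be a maximal filter in $\mathbb{F}(\mathcal{F}_0)$ and let $A \subseteq X$. Each element of $\mathcal{F}$ intersects $A$ or $A^c$. Assume that one such $F$ meets $A$. Then, by Problem 9.4, $\mathcal{F}$ meets $A$. By Problems 9.5-9.7, $\mathcal{F}_A := \{F \cap A : F \in \mathcal{F}\}$ is a filter base for a filter $\mathcal{F}'$, which is equal to $\mathcal{F}(\mathcal{F} \cup \{A\})$, i.e., the filter generated by the collection $\mathcal{F} \cup \{A\}$. $\mathcal{F}'$ is finer than $\mathcal{F}$ and it contains $A$. Since $\mathcal{F}$ is a maximal filter, it follows that $\mathcal{F} = \mathcal{F}'$. Thus, $\mathcal{F}$ contains $A$. The same argument shows that $\mathcal{F}$ contains $A^c$ if $F$ meets $A^c$. Therefore, $\mathcal{F}$ is an ultrafilter.

2) Let $\mathcal{F}$ be an ultrafilter. We show that $\mathcal{F}$ is maximal. Suppose $\mathcal{F}'$ is a filter in $\mathbb{F}(\mathcal{F}_0)$ such that $\mathcal{F} \subset \mathcal{F}'$ properly. Then there is $F' \in \mathcal{F}' \setminus \mathcal{F}$. Since $\mathcal{F}$ is an ultrafilter and $F' \notin \mathcal{F}$, we have that $F'^c \in \mathcal{F}$ and hence $F'^c \in \mathcal{F}'$. However, this is impossible, for two disjoint sets $F'$ and $F'^c$ cannot belong to the same filter. This contradiction shows that $\mathcal{F}$ is maximal. □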

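9.5 Proposition. For each filter $\mathcal{F}_0$, there is an ultrafilter $\mathcal{U} \supseteq \mathcal{F}_0$.

Proof. Let $\mathbb{F}(\mathcal{F}_0)$ be the collection of all filters finer than $\mathcal{F}_0$ and let $\mathfrak{C}(\mathcal{F}_0)$ be any chain in $\mathbb{F}(\mathcal{F}_0)$. Then it is easy to see that

$$\bigcup \{\mathcal{F} : \mathcal{F} \in \mathfrak{C}(\mathcal{F}_0)\}$$

is again a filter and that it contains every filter of the chain. Specifically, it is an upper bound for $\mathfrak{C}(\mathcal{F}_0)$. Then, by Zorn's Lemma 4.13, Chapter 1, $\mathbb{F}(\mathcal{F}_0)$ has a maximal element, which by Lemma 9.4 is an ultrafilter. □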
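On a finite set, Lemma 9.4 can be confirmed by exhaustive search. The following sketch is an illustration only: it assumes the three-point set `X = {0, 1, 2}` and takes $\mathcal{F}_0 = \{X\}$, so that every filter on `X` is admitted; all names are hypothetical. It enumerates every family of subsets, keeps the filters, and compares the ultrafilters with the filters that are maximal under inclusion.

```python
from itertools import combinations


def powerset(items):
    """All subcollections of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


X = frozenset({0, 1, 2})          # assumed example universe
SUBSETS = powerset(X)             # the 8 subsets of X


def is_filter(family):
    """Definition 9.1 (i) on the finite universe X."""
    if not family or frozenset() in family:
        return False
    meets = all(a & b in family for a in family for b in family)
    upward = all(s in family for a in family for s in SUBSETS if a <= s)
    return meets and upward


# Every family of subsets of X (2**8 = 256 candidates).
FAMILIES = powerset(SUBSETS)
FILTERS = [f for f in FAMILIES if is_filter(f)]

# Definition 9.1 (iv): for each A, either A or its complement belongs to F.
ULTRAFILTERS = {f for f in FILTERS
                if all(a in f or (X - a) in f for a in SUBSETS)}

# Maximal filters: not properly contained in any other filter.
MAXIMAL = {f for f in FILTERS if not any(f < g for g in FILTERS)}
```

In this finite setting every filter is generated by a single nonempty set, so the ultrafilters found are exactly the three families of all sets containing a fixed point.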

9.6 Definitions.

(i) A filter $\mathcal{F}$ on a topological space $(X,\tau)$ is said to converge to an $x \in X$ (in notation $\mathcal{F} \to x$) if it is finer than or equal to the neighborhood system $\mathcal{U}_x$, i.e. if $\mathcal{U}_x \subseteq \mathcal{F}$. $x$ is said to be a limit point of the filter $\mathcal{F}$. Clearly, every neighborhood system $\mathcal{U}_x$ converges to $x$.

(ii) A filter base $\mathcal{F}_b$ is said to converge to $x$ ($\mathcal{F}_b \to x$) if for every neighborhood $U_x \in \mathcal{U}_x$, there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq U_x$. Consequently, each neighborhood base $\mathcal{B}_x$ converges to $x$.

(iii) A point $x \in (X,\tau)$ is said to be an accumulation point of the filter $\mathcal{F}$ (filter base $\mathcal{F}_b$) if for each $F \in \mathcal{F}$ ($F_b \in \mathcal{F}_b$) and for each $U_x \in \mathcal{U}_x$,

$$F \cap U_x \neq \emptyset.$$

(iv) Let $\mathcal{F}_b$ be a filter base on $X$ and let $f: X \to (Y,\tau_1)$ (a topological space). The function $f$ is said to converge to $l \in Y$ ($f \to l$) along the filter base $\mathcal{F}_b$ if for every neighborhood $V_l$ of $l$, there is an $F \in \mathcal{F}_b$ such that $f_*(F) \subseteq V_l$. □

9.7 Examples.

(i) Let $X = \mathbb{N}$ and let $\mathcal{F}_b = \{\{n, n+1, \ldots\} : n \in \mathbb{N}\}$ be a filter base on $\mathbb{N}$. Now consider a map $f: \mathbb{N} \to (Y,\tau)$, which, in fact, is a sequence in the space $Y$. Then, Definition 9.6 (iv) in this case reduces to the conventional definition of the limit of the sequence $\{f(n) = y_n\}$ (cf. Definition 3.1).

(ii) Let $(X,\tau)$, $(Y,\tau_1)$ be topological spaces, $f: X \to Y$, $a \in X$, $l \in Y$, and let $\mathcal{F}_b = \mathcal{U}_a$ (the neighborhood filter at $a$). Now, the expression $f \to l$ along $\mathcal{U}_a$ means: for each neighborhood $V_l$, there is a neighborhood $U_a \in \mathcal{U}_a$ such that for each $x \in U_a$, $f(x) \in V_l$ (or, equivalently, $f_*(U_a) \subseteq V_l$), in notation,

$$\lim_{x \to a} f(x) = l. \tag{9.7}$$

Observe that as long as $\mathcal{U}_a$ is declared and since it is unique with respect to the point $a$ and topology $\tau$, we need not specify along which filter base $f$ converges to $l$. Should $\mathcal{U}_a$ be replaced by a specific neighborhood base $\mathcal{B}_a$ (also a filter base), then we can write

$$\lim_{x \to a;\, \mathcal{B}_a} f(x) = l. \tag{9.7a}$$

Now, let $\mathcal{B}_a$ be a neighborhood base at $a$ with (9.7a) holding. Then, by Definition 9.6 (iv), for each neighborhood $V_l$ of $l$, there is a neighborhood $B_a \in \mathcal{B}_a$ of $a$ such that $f_*(B_a) \subseteq V_l$. Since $\mathcal{B}_a \subseteq \mathcal{U}_a$, (9.7a) then implies (9.7). Conversely, if (9.7) holds, then for each $V_l$, there is a neighborhood $U_a$ from the neighborhood system $\mathcal{U}_a$ with $f_*(U_a) \subseteq V_l$. Because each $U_a$ is, by Definition 1.5 (iii), a superset of at least one $B_a \in \mathcal{B}_a$ ($\mathcal{B}_a$ being an arbitrary neighborhood base at $a$), (9.7a) must hold. Consequently, (9.7) and (9.7a) are equivalent, even though (9.7a) is related to a specific neighborhood base of $a$. We therefore see that the limit does not depend on the choice of a neighborhood base of $a$, and (9.7) can be sustained with no specification of any neighborhood base. Consequently, (9.7) can be used for the notion of convergence of a function $f$ at a point $a$. Notice that $f$ acts between two topological spaces. Interestingly enough, we could alternatively use a definition of convergence similar to that of continuity in Definition 4.1, i.e. with no visible mention of a filter base. This would read:

A function $f$ is said to have a limit $l$ at a point $a$ if for each neighborhood $V_l$ of $l$ in $(Y,\tau_1)$, there is a neighborhood $U_a$ of $a$ in $(X,\tau)$ such that $f_*(U_a) \subseteq V_l$, or equivalently, if $f^*(V_l)$ is a neighborhood of $a$.

In particular, if $(X,\tau)$ is first countable (which is the case for metric spaces and many other applications), we can have $f$ converge to $l$ along any monotone decreasing countable neighborhood base of the point $a$, say, $\{B_a^n\}$. If we now select from each $B_a^n$ an arbitrary point $x_n$ (as in the proof of Theorem 4.10), then $x_n \to a$ in the usual sense and, consequently, we can write

$$\lim_{x_n \to a} f(x) = l, \tag{9.7b}$$

which has a double meaning. For one, it goes back to notation (9.7-9.7a), and limit (9.7b) is a limit of $f$ along the filter base $\{B_a^n\}$. On the other hand, it coincides with our conventional definition of the limit of $f$ at a point $a$ along the sequence $\{x_n\}$. Finally, if the limit in (9.7b) is the same along any sequence $\{x_n\}$ that converges to $a$, then, by arguments as in Theorem 4.10, we can show that $l$ is a limit of $f$ along the filter base $\{B_a^n\}$ and therefore along any neighborhood base of $a$. The uniqueness of $l$ is subject to Example (iv) below, and we will see that this is the case if $(Y,\tau_1)$ is Hausdorff. For instance, if we consider as $f$ the function

$$f(x) = \frac{g(x) - g(a)}{x - a},$$

then the function $[\mathbb{R},\mathbb{R},g]$ is differentiable at $a$ if and only if the limit

$$\lim_{x \to a} f(x) = l$$

exists, where $l = g'(a)$, and now we can say that the function $g$ is differentiable at $a$ if and only if this limit exists along any sequence $\{x_n\}$ convergent to $a$ in the sense of notation (9.7b). This idea is frequently used in analysis whenever convergence along a sequence is a plausible (if not the only) option for us.

(iii) Consider some special cases of limits along the filter bases from Example 9.3 (iii). Let $X = Y = \mathbb{R}$ and $f: X \to (Y,\tau_e)$.

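a) If $\mathcal{F}_b$ on $X$ is $\mathcal{F}_b = \{(a - \varepsilon, a + \varepsilon) : \varepsilon > 0\}$, then the concept of limit introduced in Definition 9.6 (iv) reduces to the conventional definition of the limit of a function known from calculus, with the usual notation

$$\lim_{x \to a} f(x) = l.$$

b) Similarly, with $\mathcal{F}_b = \{[a, a + \varepsilon) : \varepsilon > 0\}$ we obtain

$$\lim_{x \to a^+} f(x) = l.$$

c) With $\mathcal{F}_b = \{[b, \infty) : b \in \mathbb{R}\}$, we have

$$\lim_{x \to +\infty} f(x) = l.$$

(iv) Let $f: X \to (Y,\tau)$, let $\mathcal{F}_b$ be a filter base on $X$ and let $(Y,\tau)$ be Hausdorff. We show that if $f$ has a limit along $\mathcal{F}_b$, then it is unique. Assume that $l_1$ and $l_2$ are two different limits along $\mathcal{F}_b$. Since $Y$ is Hausdorff, there are two disjoint neighborhoods of $l_1$ and $l_2$: $V_{l_1}$ and $V_{l_2}$. By the definition of the limit along $\mathcal{F}_b$, there are two sets $U_1, U_2 \in \mathcal{F}_b$ such that

$$f_*(U_1) \subseteq V_{l_1} \quad \text{and} \quad f_*(U_2) \subseteq V_{l_2}.$$

By the definition of $\mathcal{F}_b$ as a filter base, there is $U \in \mathcal{F}_b$ such that $U \subseteq U_1 \cap U_2$. Since

$$f_*(U) \subseteq f_*(U_1) \cap f_*(U_2),$$

we have $f_*(U) \subseteq V_{l_1} \cap V_{l_2} = \emptyset$. This is absurd, for $U \neq \emptyset$.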

When introducing convergence of a function $f: X \to (Y,\tau)$ along a filter base $\mathcal{F}_b$ on $X$ in Definition 9.6 (iv), we did not need to assume any topology on $X$. Now if we define a topology on $X$ and take for $\mathcal{F}_b$ the neighborhood filter $\mathcal{U}_{x_0}$ at a point $x_0 \in X$, then, by Definition 9.6 (iv) (applied to $\mathcal{U}_{x_0} = \mathcal{F}_b$) and taking $l \in Y$ as $f(x_0)$, we arrive at the definition of continuity of $f$ at $x_0$ that agrees with Definition 4.1: A function $f: (X,\tau) \to (Y,\tau_1)$ is called continuous at a point $x_0$ if

$$\lim_{x \to x_0} f(x) = f(x_0).$$

Now, we consider another very useful type of convergence: convergence along nets. As we will see, filter and net convergence are closely related.

9.8 Definitions.

(i) A set $\Lambda$ is called directed if there exists a relation (denoted $\preceq$) on $\Lambda$ such that:

a) (R) for each $\lambda \in \Lambda$, $\lambda \preceq \lambda$;

b) (T) $\lambda_1 \preceq \lambda_2$ and $\lambda_2 \preceq \lambda_3$ imply that $\lambda_1 \preceq \lambda_3$;

c) (SL - superlativity) for each pair $\lambda_1, \lambda_2 \in \Lambda$, there is $\lambda \in \Lambda$ such that $\lambda_1 \preceq \lambda$ and $\lambda_2 \preceq \lambda$.

(ii) A net is, roughly speaking, a set indexed by a directed set, and it is a generalization of a sequence. More formally: a net in $X$ induced by $\Lambda$ is any function $f: \Lambda \to X$ where $\Lambda$ is a directed set. The point $f(\lambda)$ is denoted by $x_\lambda$, and we will then denote the net by $\{x_\lambda\} = \{x_\lambda : \lambda \in \Lambda\}$. Observe that since $f$ need not be surjective, $\{x_\lambda\}$ is in general a proper subset of $X$.

(iii) If $\{x_\lambda\}$ is a net, then $\{x_\lambda : \lambda_0 \preceq \lambda\}$ is called a $\lambda_0$-tail of $\{x_\lambda\}$.

(iv) Let $A \subseteq X$. A $\lambda_0$-tail of a net $\{x_\lambda\}$ is called a $\lambda_0(A)$-tail of $\{x_\lambda\}$ if the $\lambda_0$-tail is a subset of $A$.

(v) A net $\{x_\lambda\}$ is said to be cofinally in $A \subseteq X$ if for each $\lambda_0 \in \Lambda$, there is $\lambda \succeq \lambda_0$ such that $x_\lambda \in A$.

(vi) A point $x \in X$ is said to be an accumulation point of a net $\{x_\lambda\}$ if $\{x_\lambda\}$ is cofinally in each neighborhood $U_x \in \mathcal{U}_x$.

(vii) Let $\{x_\lambda\}$ be a net in $X$. $\{x_\lambda\}$ is said to converge to a point $x \in X$ (in notation $x_\lambda \to x$) if for each neighborhood $U_x$ of $x$, there is a $\lambda_0(U_x)$-tail of $\{x_\lambda\}$. $x$ is called a limit point of the net $\{x_\lambda\}$.

(viii) A net $\{x_\lambda\}$ is called an ultranet if for every subset $A \subseteq X$, there is a $\lambda_0(A)$-tail of $\{x_\lambda\}$ or a $\lambda_0(A^c)$-tail of $\{x_\lambda\}$.

9.9 Examples.

(i) An example of a directed set $\Lambda$ is $\mathbb{R}^n$ with $\lambda = (\lambda_1, \ldots, \lambda_n) \preceq \mu = (\mu_1, \ldots, \mu_n)$ if and only if $\lambda_i \leq \mu_i$ for all $i = 1, \ldots, n$.

(ii) A neighborhood base $\mathcal{B}_x$ at $x$, or, as an even more trivial case, the neighborhood system $\mathcal{U}_x$, with the relation $U_1 \preceq U_2$ if and only if $U_1 \supseteq U_2$ for their elements, is a directed set.
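The three axioms (R), (T), (SL) of Definition 9.8 (i) can be checked mechanically on finite samples of the directed sets named in Examples 9.9. The sketch below is an illustration under stated assumptions: the finite grid stands in for $\mathbb{R}^2$, the five intervals $(-1/n, 1/n)$ stand in for a neighborhood base at $0$, and every identifier is hypothetical.

```python
from itertools import product


def is_directed(elements, leq):
    """Check reflexivity, transitivity and superlativity of `leq` on a finite set."""
    elements = list(elements)
    reflexive = all(leq(a, a) for a in elements)
    transitive = all(leq(a, c)
                     for a, b, c in product(elements, repeat=3)
                     if leq(a, b) and leq(b, c))
    superlative = all(any(leq(a, u) and leq(b, u) for u in elements)
                      for a, b in product(elements, repeat=2))
    return reflexive and transitive and superlative


# Example 9.9 (i), restricted to a finite grid in the plane (an assumption).
grid = list(product(range(3), repeat=2))


def componentwise(lam, mu):
    """lam precedes mu iff every coordinate of lam is <= that of mu."""
    return all(x <= y for x, y in zip(lam, mu))


# Example 9.9 (ii): open intervals (-1/n, 1/n) stored as (left, right) pairs.
balls = [(-1 / n, 1 / n) for n in range(1, 6)]


def reverse_inclusion(u, v):
    """u precedes v iff u contains v."""
    return u[0] <= v[0] and v[1] <= u[1]


# A non-example: two points related only to themselves have no common successor.
antichain = [1, 2]


def equality(a, b):
    return a == b
```

In the grid, the points `(0, 2)` and `(2, 0)` are not comparable, yet `(2, 2)` follows both; this is exactly what superlativity asks for and what distinguishes a directed set from a linearly ordered one.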

E C be

a sequence of

< ca and inf{p(An): n = 1 , 2 , . ..) = a 2 0.

(nGKOO~n) 2 a.

Let ( f l , C , p ) be a measure space and for a number 0 < a define g, = {G E C: p(G) < a) and

5 oo,

Show that E, is a a-algebra. In the condition of Problem 1.11, let a = oo and p be a finite measure. Show that C, = C.

1. Set Functions

1.13

Let (R, 13,p ) be a measure space, Ern = {Q C - 51: Q fl G E C, VG E 9,). on C, as t

(Notice that C 5 C,.)

g,

23 = {G E C:p(G) < 001, an( Define the set function p,

Show that p, is a measure on C,.

1.14 Argue that for any probability space $(\Omega, \Sigma, \mathbb{P})$, the axiom $\mathbb{P}(\emptyset) = 0$ is redundant. Is it also true for any measure?

CHAPTER

5 . MEASURES

NEW TERMS: set function 222; additive set function 222; σ-additive set function 222; continuity from below on a σ-algebra 222; continuity from below on a sequence of sets 223; continuity from above on a sequence of sets 223; continuity from above on a σ-algebra 223; continuity from above at the empty set 223; ∅-continuity 223; finite set function 223; σ-finite set function on a system of sets 223; σ-finite set function on a sequence of sets 223; elementary content 223; content 223; premeasure 223; measure 223; probability measure 223; measure space 223; probability space 223; point mass (Dirac measure) 224; Dirac measure (point mass) 224; Lebesgue elementary content 224; distribution function 224; extended distribution function 224; Lebesgue-Stieltjes elementary content 225; atomic (discrete) measure 226; discrete (atomic) measure 226; atomic probability measure 226; counting measure 226; monotonicity 226; μ-minimal decomposition of a set 228; σ-subadditivity 228; finite subadditivity 228; continuity from below, criterion of 229; continuity from above, criterion of 230; Bernoulli measure 231; Binomial measure 231; Poisson measure 231; ideal 232; σ-ideal 232

2. EXTENSION OF SET FUNCTIONS TO A MEASURE

We begin this section with the introduction of a set function that is not exactly a measure, as it is not even additive, but which is at the heart of the formation of measures extended from some more primitive set functions. A prominent example of such a construction yields the Lebesgue measure. It is initially defined on rectangular figures, and then the measurement of a more arbitrary figure is accomplished by means of approximation by rectangles inscribed into the figure or rectangles that cover the figure. The latter leads to the notion of an "outer measure," which was initially proposed by Lebesgue at the turn of the twentieth century and later on refined by Carathéodory. Carathéodory's approach is essentially preserved in the contemporary constructions.

The principal idea of the extension begins with measuring an arbitrary set by sequences of rudimentary sets, which should cover the set and whose measure is previously defined. The total "measure" of the cover is then minimized over all available cover-sequences of basal sets (such as rectangles in Euclidean space). As it turns out, this way we can measure all subsets by the resulting set function, i.e., the outer measure, but the latter fails to be additive, although it preserves some rather useful properties of a measure, such as subadditivity and monotonicity. Having proved this, we will notice that some of the additivity can be regained; namely, there are sets, including the basal sets, each of which, along with its complement, forms a two-set partition of any other set on which the outer measure becomes additive. The collection of all such "separating" sets forms a σ-algebra, which, as we will notice, will contain the basal sets. This is generally not the smallest σ-algebra over the basic collection, but this σ-algebra of separating sets can further be reduced.

Our procedure, however, will be different from the more intuitive way described above. Rather than having a particular generator (such as a semi-ring along with an elementary content) in mind, we will try to develop the whole extension in general. In the beginning, we will define an outer measure as a set function with monotonicity and subadditivity and show that the subcollection of all separating sets is a σ-algebra and, in addition, that the outer measure on this subcollection is a measure. All this will initially be done without assuming that the outer measure was generated by a "formatter" (i.e., some collection of sets and a set function on it). Then, we take an arbitrary formatter and create a more specific outer measure by applying the above construction with countable covers.

2.1 Definition. Let $\Omega$ be a nonempty set and $\mu^*$ be a set function defined on $\mathcal{P}(\Omega)$. $\mu^*$ is called an outer measure if:

a) $\mu^*(\emptyset) = 0$. (2.1a)

b) $A \subseteq B \Rightarrow \mu^*(A) \leq \mu^*(B)$ (monotonicity). (2.1b)

c) $\mu^*\left(\bigcup_{n=1}^{\infty} Q_n\right) \leq \sum_{n=1}^{\infty} \mu^*(Q_n)$ for any sequence $\{Q_n\} \subseteq \mathcal{P}(\Omega)$ (σ-subadditivity). (2.1c)

Although axiom a) is redundant, since $\mu^*(\emptyset) = 0$ holds for a set function in general, we find it to be a useful reminder. □

2.2 Definition. Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$. A subset $M \subseteq \Omega$ is said to be $\mu^*$-measurable if for any $Q \subseteq \Omega$,

$$\mu^*(Q) = \mu^*(Q \cap M) + \mu^*(Q \cap M^c). \tag{2.2}$$

We will also say that $M$ separates $Q$. □

The following is what essentially constitutes the widely cited Carathéodory Extension Theorem. For convenience, we will break it up into several theorems. The idea of outer measures and the construction below belong to the German mathematician of Greek origin Constantin Carathéodory. They appeared in his 1914 paper, Über das lineare Maß von Punktmengen - eine Verallgemeinerung des Längenbegriffs (in Göttinger Nachrichten), and in his famous 1918 book, Vorlesungen über reelle Funktionen (Teubner, Leipzig).

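2.3 Theorem. The collection $\Sigma^*$ of all $\mu^*$-measurable subsets forms a σ-algebra in $\Omega$. The restriction of $\mu^*$ from $\mathcal{P}(\Omega)$ to $\Sigma^*$, in notation $\mu_0^*$, is a measure.

Proof. Since throughout the proof of this theorem we will largely use equation (2.2) or prove its validity, we first notice that, due to subadditivity of $\mu^*$ as an outer measure, the inequality

$$\mu^*(Q) \leq \mu^*(Q \cap M) + \mu^*(Q \cap M^c) \tag{2.3}$$

holds true for all subsets $Q$ and $M$ of $\Omega$. Our proof will consist of the following steps.

a) $\Omega$ is obviously an element of $\Sigma^*$, as it satisfies (2.2). If $M \in \Sigma^*$, then $M^c \in \Sigma^*$, by their symmetry in (2.2).

b) We show that $\Sigma^*$ is closed with respect to the formation of finite unions, i.e., we show that with $A, B \in \Sigma^*$, $A \cup B \in \Sigma^*$. Since $B \in \Sigma^*$, it follows that for each $Q' \in \mathcal{P}(\Omega)$,

$$\mu^*(Q') = \mu^*(Q' \cap B) + \mu^*(Q' \cap B^c). \tag{2.3a}$$

Specifically, (2.3a) is valid for $Q' = Q \cap A$ and $Q' = Q \cap A^c$, $Q \in \mathcal{P}(\Omega)$. Hence,

$$\mu^*(Q \cap A) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c)$$

and

$$\mu^*(Q \cap A^c) = \mu^*(Q \cap A^c \cap B) + \mu^*(Q \cap A^c \cap B^c).$$

Summing up the last two equations and taking into account that $A \in \Sigma^*$, i.e., that $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \cap A^c)$, we have

$$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap A^c \cap B) + \mu^*(Q \cap A^c \cap B^c). \tag{2.3b}$$

Now replacing $Q$ in (2.3b) with $Q \cap (A \cup B)$ we also have

$$\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap A^c \cap B) + \mu^*(\emptyset).$$

The latter reduces to

$$\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap A^c \cap B). \tag{2.3c}$$

Substituting (2.3c) into (2.3b) and noticing that $A^c \cap B^c = (A \cup B)^c$, we get

$$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap (A \cup B)^c),$$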


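which shows that $A \cup B \in \Sigma^*$. The above assertions a) and b) imply that $\Sigma^*$ is an algebra in $\Omega$.

c) Now we prove that $\Sigma^*$ is a σ-algebra in $\Omega$. Since $\Sigma^*$, as an algebra, is $\cap$-stable, it is sufficient to show that $\Sigma^*$ is a Dynkin system. (See Problem 1.10 of Chapter 4.) Let $\{A_n\} \subseteq \Sigma^*$ be a sequence of disjoint sets. Take $A_1, A_2 \in \{A_n\}$. Substituting $A_1 = A$ and $A_2 = B$ into (2.3c), taking $A$ and $B$ in (2.3c) disjoint, and then noticing that $A \cap B^c = A$ and $B \cap A^c = B$, we arrive at

$$\mu^*(Q \cap (A_1 \cup A_2)) = \mu^*(Q \cap A_1) + \mu^*(Q \cap A_2). \tag{2.3d}$$

If $A_1, \ldots, A_n$ is an $n$-tuple of mutually disjoint elements of $\Sigma^*$, then, by induction, from (2.3d),

$$\mu^*(Q \cap S_n) = \sum_{k=1}^{n} \mu^*(Q \cap A_k), \tag{2.3e}$$

where $S_n = \bigcup_{k=1}^{n} A_k$. Denote $S = \bigcup_{n=1}^{\infty} A_n$. Because of $S_n \subseteq S$, $(Q \cap S^c) \subseteq (Q \cap S_n^c)$, and by monotonicity of $\mu^*$,

$$\mu^*(Q \cap S^c) \leq \mu^*(Q \cap S_n^c). \tag{2.3f}$$

Since $\Sigma^*$ is an algebra, it follows that $S_n \in \Sigma^*$ and hence it is $\mu^*$-measurable, i.e., it separates $Q$, which, combined with (2.3e) and (2.3f), yields

$$\mu^*(Q) = \mu^*(Q \cap S_n) + \mu^*(Q \cap S_n^c) \geq \sum_{k=1}^{n} \mu^*(Q \cap A_k) + \mu^*(Q \cap S^c).$$

Therefore,

$$\mu^*(Q) \geq \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \cap S^c), \tag{2.3g}$$

that, by σ-subadditivity, is

$$\geq \mu^*(Q \cap S) + \mu^*(Q \cap S^c). \tag{2.3h}$$

Inequalities (2.3) and (2.3g-2.3h) lead to

$$\mu^*(Q) = \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \cap S^c) = \mu^*(Q \cap S) + \mu^*(Q \cap S^c),$$

concluding that $S = \bigcup_{n=1}^{\infty} A_n$ indeed separates any $Q \in \mathcal{P}(\Omega)$ and thus is an element of $\Sigma^*$. The latter supports the claim that $\Sigma^*$ is a Dynkin system and, consequently, that $\Sigma^*$ is a σ-algebra.

d) We show that $\mu_0^*$ is a measure on $\Sigma^*$. Substituting the set $S = \bigcup_{n=1}^{\infty} A_n$ for $Q$ in (2.3g), we have

$$\mu^*\left(\bigcup_{n=1}^{\infty} A_n\right) \geq \sum_{n=1}^{\infty} \mu^*(A_n),$$

which, due to σ-subadditivity of $\mu^*$, leads to equality and thereby to σ-additivity of $\mu_0^*$. Therefore, we have proved that $\mathrm{Res}_{\Sigma^*}\mu^*$, denoted by $\mu_0^*$, is a measure. □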

2.4 Examples.

(i) Let $\Omega = \{a,b,c\}$, $A = \{a\}$, $A^c = \{b,c\}$, $P = \{b\}$, $Q = \{c\}$, $R = \{a,b\}$, $S = \{a,c\}$. Define the following set function $\mu^*$ on $\mathcal{P}(\Omega)$.

One can easily verify that $\mu^*$ is an outer measure on $\mathcal{P}(\Omega)$, as it satisfies axioms (2.1a-2.1c), but $\mu^*$ is not a measure, because it is not additive. We can see that only the sets $\emptyset$, $\Omega$, $A$, and $A^c$ $\mu^*$-separate all subsets of $\Omega$ and, consequently, $\{\emptyset, \Omega, A, A^c\}$ is the σ-algebra $\Sigma^*$. Clearly, $\mu_0^*$, as the restriction of $\mu^*$ to $\Sigma^*$, is a measure.
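A brute-force computation makes the separation test (2.2) concrete on a three-point set. The values of the set function used below are hypothetical: they are not the values of the example in the text, but were chosen (as an assumption) so that the same conclusion is reached, namely that exactly $\emptyset$, $\Omega$, $\{a\}$ and $\{b,c\}$ are measurable. All identifiers are illustrative.

```python
from itertools import combinations

OMEGA = frozenset("abc")


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


SUBSETS = powerset(OMEGA)


def mu_star(q):
    """Hypothetical outer measure: 0 on the empty set, 2 on {a,b}, {a,c}
    and OMEGA, and 1 on the three singletons and on {b,c}."""
    if not q:
        return 0
    if q in (frozenset("ab"), frozenset("ac"), OMEGA):
        return 2
    return 1


def is_outer_measure():
    """Axioms (2.1a)-(2.1c); on a finite set, finite subadditivity suffices."""
    monotone = all(mu_star(a) <= mu_star(b)
                   for a in SUBSETS for b in SUBSETS if a <= b)
    subadditive = all(mu_star(a | b) <= mu_star(a) + mu_star(b)
                      for a in SUBSETS for b in SUBSETS)
    return mu_star(frozenset()) == 0 and monotone and subadditive


def separates(m):
    """Definition 2.2: m splits every test set q additively."""
    return all(mu_star(q) == mu_star(q & m) + mu_star(q - m) for q in SUBSETS)


MEASURABLE = {m for m in SUBSETS if separates(m)}
```

With these values, $\{b\}$ fails to separate $\{b,c\}$, because $\mu^*(\{b\}) + \mu^*(\{c\}) = 2$ while $\mu^*(\{b,c\}) = 1$; the same pair of sets shows that this $\mu^*$ is not additive on $\mathcal{P}(\Omega)$.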

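(ii) Let $\Omega$ be an infinite set. Define the set function $\gamma$ on $\mathcal{P}(\Omega)$ by $\gamma(Q) = 0$ if $Q$ is a finite set and $\gamma(Q) = 1$ if $Q$ is infinite. Let $\{\{\omega_n\} : n = 1, 2, \ldots\}$ be a sequence of distinct singletons and let $Q = \bigcup_{n=1}^{\infty} \{\omega_n\}$. Then,

$$\sum_{n=1}^{\infty} \gamma(\{\omega_n\}) = 0,$$

while $\gamma(Q) = 1$. Thus, $\gamma$ is not σ-subadditive and not an outer measure.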

Recall that a restriction of a function $[X,Y,f]$ is a function $[X_0,Y_0,f_0]$ defined on a contracted domain $X_0 \subseteq X$ with $f = f_0$ on $X_0$ and $Y_0 \subseteq Y$. (In notation, $f_0 = \mathrm{Res}_{X_0} f$.) From Theorem 2.3, we learned that the set function $[\Sigma^*,[0,\infty],\mu_0^*]$ is a restriction of an outer measure $[\mathcal{P}(\Omega),[0,\infty],\mu^*]$. If $\bar{X}$ and $\bar{Y}$ are supersets of $X$ and $Y$, respectively, a function $[\bar{X},\bar{Y},\bar{f}]$ is called an extension of $f$ (from $X$ to $\bar{X}$) if $[X,Y,f]$ is the restriction of $\bar{f}$ to $X$. (In notation, $\bar{f} = \mathrm{Ext}_{\bar{X}} f$.) We will apply this notion to extend a set function $\gamma$ defined on a collection $\mathcal{G}$ of subsets of $\Omega$ to a set function $\bar{\gamma}$ on an expanded family of subsets of $\Omega$. For instance, in Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. We can extend the Lebesgue elementary content $\lambda_0$ to a (unique) content on the ring $\mathcal{R}(\mathcal{S})$ generated by $\mathcal{S}$ (see Problem 2.2), which turns out to be a premeasure (verified in Theorem 3.1).

The primary goal in this section is to construct an extension of a set function, such as a premeasure given on a ring, to a measure on the smallest σ-algebra generated by this ring. Although this is the main objective, other extensions, such as the "completion" of a measure, will also be a focus of our discussions.

2.5 Definitions.

(i) Let $(\Omega,\Sigma,\mu)$ be a measure space. A set $N \in \Sigma$ is called a $\mu$-null set (or just null set) if $\mu(N) = 0$. We denote the set of all $\mu$-null sets by $\mathcal{N}_\mu$. A set $E$ is called $\mu$-negligible (or just negligible) if there is a measurable null superset of $E$. The measure space is called complete if for each null set $N \in \mathcal{N}_\mu$, $\mathcal{P}(N) \subseteq \Sigma$, i.e., if all negligible sets are measurable.

(ii) Consider a measure space $(\Omega,\Sigma,\mu)$. Let $\bar{\Sigma}$ be the collection of all sets of type $A \cup M$, where $A \in \Sigma$ and $M$ is any negligible set. According to Problem 2.8, $\bar{\Sigma}$ is a σ-algebra. We extend $\mu$ to $\bar{\mu}$ on $\bar{\Sigma}$ by setting

$$\bar{\mu}(A \cup M) = \mu(A).$$

The extension $(\bar{\Sigma},\bar{\mu})$ of $(\Sigma,\mu)$, or just $\bar{\mu}$, is then said to be the completion of the measure $\mu$ and, due to Problem 2.7, $(\Omega,\bar{\Sigma},\bar{\mu})$ is a measure space, called the completion of the measure space $(\Omega,\Sigma,\mu)$.

2.6 ]Example. Let Sl = W, E = {A E ?(R): either A or AC 4 N), which is a o-algebra on !it (see Example 1.2 (vii), Chapter 4)) and i t E~ be the n = l,2,. . .) and Ac are elements of C and point mass. Both A = E~(A')= 0. Obviously, E = [2,m), as a subset of AC, is negligible, but not measurable. Therefore, the measure space (IW,E,E~)is not complete. (See a more general case in Problem 2.14.)

{A,


The proposition below is a paradigm of a complete measure space.

2.7 Proposition. The restriction $\mu_0^*$ of an outer measure $\mu^*$ to the σ-algebra $\Sigma^*$ of all $\mu^*$-measurable subsets of $\Omega$ is complete and $(\Omega,\Sigma^*,\mu_0^*)$ is a complete measure space.

Proof. Since $\mu^*$ is defined on the whole of $\mathcal{P}(\Omega)$, for any $\mu^*$-negligible subset $N \subseteq \Omega$, due to (2.1b), $\mu^*(N) = 0$ and, therefore, it is sufficient to show that $N$ is $\mu^*$-measurable. Let $Q \subseteq \Omega$. Due to monotonicity of the outer measure, $\mu^*(Q \cap N) = 0$ and $\mu^*(Q \cap N^c) \leq \mu^*(Q)$, and this, along with (2.3), yields

$$\mu^*(Q) = \mu^*(Q \cap N) + \mu^*(Q \cap N^c)$$

and, hence, $N \in \Sigma^*$. □

The following will be a construction of an outer measure by an arbitrary set function $\gamma$ defined on an arbitrary subcollection of sets $\mathcal{G} \subseteq \mathcal{P}(\Omega)$. As usual, we only assume that $\mathcal{G}$ contains the empty set and that $\gamma$, as a set function, is such that $\gamma(\emptyset) = 0$. This construction lies at the basis of the Carathéodory extension of the set function $\gamma$ to a measure on the σ-algebra $\Sigma(\mathcal{G})$. For any subset $Q \subseteq \Omega$, denote by $\mathcal{C}_Q(\mathcal{G})$ the collection of all at most countable covers of the set $Q$ by elements of $\mathcal{G}$. (Unless there is another subcollection, besides $\mathcal{G}$, under consideration, we will for brevity drop $\mathcal{G}$ in $\mathcal{C}_Q(\mathcal{G})$.) Therefore, if $\mathcal{C}_Q \neq \emptyset$, for any $\{G_n\} \in \mathcal{C}_Q$, we have

$$Q \subseteq \bigcup_{n=1}^{\infty} G_n.$$
2.8 Proposition. The set function p* defined on ?(a) as

is an outer measure.

Proof. We need to verify the above properties (2.la-2.1~)of p* as an outer measure: a) Since (b E g and y (0) = 0, it follows that p*(@) = 0. b ) We assume that both p*(A) and p*(B) are finite, since otherwise, the proof is obvious. If A B, EB E Q A and then we can reach on EA a possibly smaller limit inferior than that on EB. Therefore,

00

C)

Let {Qn} 5 ?(a) and Q = U Q,. If for a t least one n, n=l

(E. Qn

= @,

CHAPTER 5 . MEASURES

then also gQ = 0 and subadditivity immediately follows. We assume that for all n, gQ # 0 and choose an E > 0. From n

and by the definition of a limit inferior, it follows that for €2 -', there is a cover {Gin, n = 1,2,. ..) E (Sp. such that 8

.

Now, clearly {Gin, i,n = 1,2,. .) E

which proves monotonicity

(Sg.

Thus,

.

0
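For a finite family $\mathcal{G}$ with nonnegative $\gamma$, the infimum in (2.8) is attained on a finite subfamily used without repetition, so the outer measure can be computed by exhaustive search. The sketch below is a toy illustration under that assumption; the four-point universe, the family `GAMMA`, and its values are hypothetical and not taken from the text.

```python
import math
from itertools import combinations

OMEGA = frozenset(range(4))

# Hypothetical formatter: a small family of sets with a set function gamma on it.
GAMMA = {
    frozenset(): 0.0,
    frozenset({0, 1}): 1.0,
    frozenset({1, 2}): 1.0,
    frozenset({0, 1, 2}): 1.5,
}


def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mu_star(q):
    """Cheapest cover of q by members of GAMMA; infinity if no cover exists."""
    best = math.inf
    members = list(GAMMA)
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            if q <= frozenset().union(*cover):
                best = min(best, sum(GAMMA[g] for g in cover))
    return best


ALL = subsets(OMEGA)
MONOTONE = all(mu_star(a) <= mu_star(b) for a in ALL for b in ALL if a <= b)
SUBADDITIVE = all(mu_star(a | b) <= mu_star(a) + mu_star(b)
                  for a in ALL for b in ALL)
```

The point `3` lies in no member of the family, so every set containing it has no cover and receives the value infinity, which matches the convention used in the proof when the collection of covers is empty. The resulting set function is monotone and subadditive but not additive: `{0}` and `{2}` each cost `1.0`, while `{0, 2}` costs `1.5`.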

We will call the couple $(\mathcal{G},\gamma)$ (a subset $\mathcal{G}$ of $\mathcal{P}(\Omega)$ and a set function $\gamma$ on $\mathcal{G}$) a formatter of the outer measure $\mu^*$ defined by (2.8). As has been shown, the formatter and, subsequently, the outer measure induce the σ-algebra $\Sigma^*$, on which $\mu^*$ is a complete measure. When constructing a measure space $(\Omega,\Sigma^*,\mu_0^*)$ by $(\mathcal{G},\gamma)$, the major goal is to extend $\gamma$ from $\mathcal{G}$ to a measure, say $\mu$, acting on the smallest σ-algebra $\Sigma(\mathcal{G})$ generated by $\mathcal{G}$. This can be achieved by restricting $(\Sigma^*,\mu_0^*)$ to $(\Sigma(\mathcal{G}),\mu)$, given that $(\Sigma^*,\mu_0^*)$ itself is an extension of $(\mathcal{G},\gamma)$. The latter, however, is not guaranteed by the above construction unless we impose some restrictions on the formatter $(\mathcal{G},\gamma)$, for even though $(\mathcal{G},\gamma)$ produces $(\Omega,\Sigma^*,\mu_0^*)$, $\mathcal{G}$ need not have all its elements $\mu^*$-measurable. In other words, $\mathcal{G}$ need not be a subset of $\Sigma^*$. In addition, $\mu_0^*$ need not coincide with $\gamma$ on $\mathcal{G}$. For example, if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then, according to Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $\bigcup_{n=1}^{\infty} C_n$ is a decomposition of $G$ and

$$\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n).$$

Hence, in order that $\mu^*(G) = \gamma(G)$, $\gamma$ must be σ-additive on $\mathcal{G}$, which, in general, it is not.

Consequently, we call $(\Sigma^*,\mu_0^*)$ (produced by $(\mathcal{G},\gamma)$ in (2.8)) the complete Carathéodory extension of $(\mathcal{G},\gamma)$ if $\mathcal{G} \subseteq \Sigma^*$ and $\mathrm{Res}_{\mathcal{G}}\,\mu_0^* = \gamma$. If $(\Sigma^*,\mu_0^*)$ is the complete Carathéodory extension of $(\mathcal{G},\gamma)$, then the formatter $(\mathcal{G},\gamma)$ is said to be extendible and the corresponding restriction of $(\Sigma^*,\mu_0^*)$ to $(\Sigma(\mathcal{G}),\mu)$ is referred to as the Carathéodory extension of $(\mathcal{G},\gamma)$.

As mentioned above, one of the most important questions is what the formatter $(\mathcal{G},\gamma)$ should be in order to be extendible and, consequently, to generate the Carathéodory extension. By now, we have a fairly large choice of systems of sets and set functions on them, ranging from semi-rings to σ-algebras and from elementary contents to measures. The idea is, however, to select as rudimentary a formatter $(\mathcal{G},\gamma)$ as possible, one that is tame and suited to most common practical applications and constructions and such that $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$. In particular, this means that the elements of $\mathcal{G}$ have to be $\mu^*$-measurable. The theorem below, which is a crucial step in the whole extension procedure, implies that $(\mathcal{G},\gamma)$ can be a ring and a premeasure to serve as a reasonable extendible formatter.

2.9 Theorem.

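(i) Let $(\mathcal{G},\gamma)$ be a semi-ring and an elementary content, respectively, in $\Omega$, which produce the outer measure $\mu^*$ and the σ-algebra $\Sigma^*$ of $\mu^*$-measurable subsets of $\Omega$. Then $\mathcal{G} \subseteq \Sigma^*$.

(ii) If, in addition, $\gamma$ is σ-additive on $\mathcal{G}$, then $\gamma = \mathrm{Res}_{\mathcal{G}}\,\mu^*$ and therefore $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$.

Proof.

(i) We have to show that $\mathcal{G} \subseteq \Sigma^*$, i.e., that any element $G \in \mathcal{G}$ $\mu^*$-separates all subsets of $\Omega$. Take any subset $Q \subseteq \Omega$ with $\mathcal{C}_Q \neq \emptyset$, since, otherwise, the proof would be trivial, and let $C = \{C_n\}$ be any (countable) cover of $Q$ from $\mathcal{C}_Q$. For a $G \in \mathcal{G}$ and $C_n \in C$,

$$C_n = (C_n \cap G) \cup (C_n \setminus G). \tag{2.9}$$

Since $\mathcal{G}$ is a semi-ring, $C_n \cap G$ is an element of $\mathcal{G}$ and $C_n \setminus G$ can be represented as a finite union of pairwise disjoint elements of $\mathcal{G}$, say $\bigcup_{j=1}^{N_n} S_{jn}$. Consequently, (2.9) can be rewritten as

$$C_n = (C_n \cap G) \cup \bigcup_{j=1}^{N_n} S_{jn}$$

and, by finite additivity of $\gamma$,

$$\gamma(C_n) = \gamma(C_n \cap G) + \sum_{j=1}^{N_n} \gamma(S_{jn}). \tag{2.9a}$$

Now, suppose $\sum_{n=1}^{\infty} \gamma(C_n) < \infty$. Then, summing up all equations in (2.9a) over $n$ gives

$$\sum_{n=1}^{\infty} \gamma(C_n) = \sum_{n=1}^{\infty} \gamma(C_n \cap G) + \sum_{n=1}^{\infty} \gamma(S_n), \tag{2.9b}$$

where $\{S_n\}$ is the reordered sequence $\{S_{jn}, j = 1, \ldots, N_n, n = 1, 2, \ldots\}$. As $Q = (Q \cap G) \cup (Q \cap G^c)$, obviously, $\{C_n \cap G\} \subseteq \mathcal{G}$ and $\{S_n\} \subseteq \mathcal{G}$ are covers of $Q \cap G$ and $Q \cap G^c$, respectively. Consequently,

$$\sum_{n=1}^{\infty} \gamma(C_n \cap G) \geq \mu^*(Q \cap G)$$

and

$$\sum_{n=1}^{\infty} \gamma(S_n) \geq \mu^*(Q \cap G^c),$$

and then by (2.9b),

$$\sum_{n=1}^{\infty} \gamma(C_n) \geq \mu^*(Q \cap G) + \mu^*(Q \cap G^c).$$

Since this inequality holds for every cover $C$ of $Q$, it should also hold for the infimum to yield

$$\mu^*(Q) \geq \mu^*(Q \cap G) + \mu^*(Q \cap G^c). \tag{2.9c}$$

If $\sum_{n=1}^{\infty} \gamma(C_n) = \infty$, then the equation symbol in (2.9b) must be replaced by "$\geq$" to yield (2.9c) again. The inverse inequality is due to (2.3). Therefore, $G$ separates all subsets of $\Omega$ and, consequently, $\mathcal{G} \subseteq \Sigma^*$.

(ii) By Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that

$$G = \bigcup_{n=1}^{\infty} C_n \quad \text{(a decomposition)} \quad \text{and} \quad \mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n).$$

Hence, if $\gamma$ is σ-additive, $\mu^*$ coincides with $\gamma$ on $\mathcal{G}$.

These two facts warrant that $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$. □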

2.10 Remarks.

(i) One should bear in mind that, while $(\mathcal{G},\gamma)$ can be an extendible formatter for the outer measure $\mu^*$, $\mathcal{G}$ is not really a generator for $\Sigma^*$, as the latter need not be the smallest σ-algebra containing $\mathcal{G}$. We would like to make a clear distinction between these two terms. Recall that a family $\mathcal{G} \subseteq \mathcal{P}(\Omega)$ is said to be a generator of another family $\mathcal{T}(\mathcal{G}) \subseteq \mathcal{P}(\Omega)$ with a property P if $\mathcal{T}(\mathcal{G})$ is the intersection of all supercollections of $\mathcal{G}$ on each of which property P holds. In our case, $\Sigma^*$ will eventually contain the smallest σ-algebra $\Sigma = \Sigma(\mathcal{G})$ and, in general, $\mu^*$ needs to be further restricted to this σ-algebra. From Theorem 2.9, we conclude that any elementary content $\gamma$ on a semi-ring $\mathcal{G}$ which is σ-additive can be extended to a measure $\mu = \mathrm{Res}_{\Sigma(\mathcal{G})}\,\mu^*$ (acting on the smallest σ-algebra $\Sigma$ generated by $\mathcal{G}$). In other words, if $\gamma$ is a σ-additive elementary content on a semi-ring $\mathcal{G}$, then there exists at least one extension, namely, Carathéodory's extension.

(ii) From the proof of Theorem 2.9, it is obvious that a semi-ring with a σ-additive elementary content on it is one of the most economical systems suitable for the Carathéodory extension. However, it is often more prudent to work with premeasures on rings. In practice, to start with, one can first extend a semi-ring with an elementary content to the smallest ring with the content, using the procedure of Theorem 2.5 (Chapter 4) and Proposition 2.11 below.

(iii) Another reasonable question arises: in how many different ways can a formatter $(\mathcal{G},\gamma)$ be extended to a measure on $\Sigma(\mathcal{G})$? Theorem 2.13 below states that with some relatively minor restriction (given in Remark 2.12) on the set function $\gamma$, the uniqueness of Carathéodory's extension is guaranteed. □

We will begin with one useful extension of an elementary content on a semi-ring to a content on the smallest ring containing the semi-ring.

2.11 Proposition. There is exactly one content on $\mathcal{R}(\mathcal{S})$ which coincides with the elementary content on $\mathcal{S}$. (See Problem 2.3.)

2.12 Remark. In Definition 1.1 (vii) we introduced the notion of σ-finiteness of a set function. Sometimes it is more convenient to use another definition of σ-finiteness, which is equivalent to 1.1 (vii) for a large class of set functions. Namely, the condition of having a monotone increasing sequence $\{G_n\} \uparrow \Omega$ from $\mathcal{G}$ with $\gamma(G_n) < \infty$ for all $n$ can be replaced by the equivalent condition that there is an at most countable partition $\{\Omega_1, \Omega_2, \ldots\} \subseteq \mathcal{G}$ of $\Omega$ ($\Omega = \bigcup_n \Omega_n$ with the $\Omega_n$ disjoint) such that $\gamma(\Omega_n) < \infty$ for all $n$. For instance, rings with contents clearly provide a basis for such equivalence. For a semi-ring with an elementary content, the first definition yields the second one, as we can arrange from $\{G_n\} \uparrow \Omega$ a countable decomposition; the converse is not true. Another related notion we are going to use in the sequel is σ-finiteness of a set. Let $(\Omega,\Sigma,\mu)$ be a measure space. A measurable set $A$ is said to be σ-finite if $\mathrm{Res}_{\Sigma \cap A}\,\mu$ is σ-finite. □

2.13 Theorem. Let g be a fl -stable generator of the cr-algebra E(g) in 51 such that g contains a monotone increasing sequence {B,} TG!. Let p1 and p2 be two measures on C(Q), which are a-finite on {B,} and which coincide on g. Then p1 = p2 on C(Q. :

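Proof. Let $A \in \mathcal{G}$ be such that $\mu_1(A) = \mu_2(A) < \infty$, and let $\mathcal{D}_A = \{B \in \Sigma(\mathcal{G}) : \mu_1(A \cap B) = \mu_2(A \cap B)\}$. We show that $\mathcal{D}_A$ is a Dynkin system:

a) $A \in \mathcal{G}$ implies that $\Omega \in \mathcal{D}_A$.

b) Let $D \in \mathcal{D}_A$. Then $A \cap D^c = A \setminus D = A \setminus (A \cap D)$, which implies that

$$\mu_1(A \cap D^c) = \mu_1(A) - \mu_1(A \cap D) = \mu_2(A) - \mu_2(A \cap D) = \mu_2(A \cap D^c),$$

and this leads to $D^c \in \mathcal{D}_A$.

c) Let $\{D_n\}$ be a sequence of disjoint sets from $\mathcal{D}_A$. Then

$$\mu_1\Big(A \cap \sum_{n=1}^{\infty} D_n\Big) = \sum_{n=1}^{\infty} \mu_1(A \cap D_n) = \sum_{n=1}^{\infty} \mu_2(A \cap D_n) = \mu_2\Big(A \cap \sum_{n=1}^{\infty} D_n\Big).$$

Hence $\sum_{n=1}^{\infty} D_n \in \mathcal{D}_A$, and therefore $\mathcal{D}_A$ is a Dynkin system. Since obviously $\mathcal{G} \subseteq \mathcal{D}_A$, it follows that $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_A$. Also, since $\mathcal{G}$ is $\cap$-stable, it follows that $\mathcal{D}(\mathcal{G})$ is a $\sigma$-algebra. Hence, we have

$$\Sigma(\mathcal{G}) = \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_A \subseteq \Sigma(\mathcal{G}),$$

leading to $\mathcal{D}_A = \Sigma(\mathcal{G})$. In particular, we proved that $\mu_1(A \cap B) = \mu_2(A \cap B)$ for all $B \in \Sigma(\mathcal{G})$.

Now let $\{B_n\}$ be a monotone increasing sequence of sets from $\mathcal{G}$ convergent to $\Omega$. Then $\Sigma(\mathcal{G}) = \mathcal{D}_{B_n}$ for all $n = 1, 2, \ldots$, and for all $B \in \Sigma(\mathcal{G})$,

$$\mu_1(B \cap B_n) = \mu_2(B \cap B_n).$$

Since $\{B_n \cap B\} \uparrow B$ and since $\mu_i(B \cap B_n) < \infty$, by Lemma 1.6,

$$\mu_1(B) = \lim_{n \to \infty} \mu_1(B \cap B_n) = \lim_{n \to \infty} \mu_2(B \cap B_n) = \mu_2(B). \qquad \square$$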

Now, by means of Theorem 2.13 we easily deduce the following significant statement.

2.14 Corollary. Let $\gamma$ be a $\sigma$-finite and $\sigma$-additive elementary content on a semi-ring $\mathcal{G}$. Then the Carathéodory extension of $\gamma$ to a measure on the $\sigma$-algebra $\Sigma(\mathcal{G})$ is a unique extension.

The lemmas below will be used for various purposes and, in particular, will lead to a relationship between the completion $(\Omega, \bar{\Sigma}, \bar{\mu})$ of a measure space $(\Omega, \Sigma, \mu)$ and the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.

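2.15 Lemma. Let $(\Omega, \mathcal{G}, \gamma)$ be an extendible formatter of the outer measure $\mu^*$, and let $\mathcal{G}_\sigma$ be the collection of all at most countable unions of elements from $\mathcal{G}$. Then, for each $Q \subseteq \Omega$ and each $\varepsilon > 0$, there is a set $G_\varepsilon \in \mathcal{G}_\sigma$ such that $G_\varepsilon \supseteq Q$ and

$$\mu^*(G_\varepsilon) \le \mu^*(Q) + \varepsilon. \qquad (2.15)$$

Proof. Because $\mu^*$ is generated by $(\mathcal{G}, \gamma)$,

$$\mu^*(Q) = \inf\Big\{\sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathcal{C}_Q(\mathcal{G})\Big\}, \qquad (2.15a)$$

where $\mathcal{C}_Q(\mathcal{G})$ denotes the set of all countable covers of $Q$ by elements of $\mathcal{G}$. If $\mu^*(Q) = \infty$, then inequality (2.15) holds trivially. Suppose $\mu^*(Q) < \infty$. Then, by the definition of an infimum and from (2.15a), for every $\varepsilon > 0$, there is $\{G_n\} \in \mathcal{C}_Q(\mathcal{G})$ such that

$$\sum_{n=1}^{\infty} \gamma(G_n) \le \mu^*(Q) + \varepsilon.$$

Now, we make use of the fact that $(\mathcal{G}, \gamma)$ is an extendible formatter. This implies that not only $\mathcal{G} \subseteq \Sigma^*$, but also $\mathcal{G}_\sigma \subseteq \Sigma^*$. For each $k$, subadditivity gives

$$\mu^*\Big(\bigcup_{n=1}^{k} G_n\Big) \le \sum_{n=1}^{k} \gamma(G_n) \le \mu^*(Q) + \varepsilon. \qquad (2.15b)$$

Since $\bigcup_{n=1}^{k} G_n$ is monotone increasing in $k$ and $\mu^*\big(\bigcup_{n=1}^{k} G_n\big) < \infty$ for all $k$, by continuity from below (Lemma 1.6),

$$\mu^*\Big(\bigcup_{n=1}^{\infty} G_n\Big) = \lim_{k \to \infty} \mu^*\Big(\bigcup_{n=1}^{k} G_n\Big).$$

Passing to the limit in (2.15b), which holds true for all $k$, we prove (2.15) with $G_\varepsilon = \bigcup_{n=1}^{\infty} G_n$ being the desired set. $\square$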

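Lemma 2.16. Let $\mu^*$ be an outer measure, $\Sigma^*$ the $\sigma$-algebra of all $\mu^*$-measurable sets, and $A$ any subset of $\Omega$. If there is a $\mu^*$-measurable set $B$ such that $B \supseteq A$ and $\mu^*(B \setminus A) = 0$, then $A \in \Sigma^*$.

Proof. Since $B \in \Sigma^*$, it should $\mu^*$-separate any $Q \subseteq \Omega$:

$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \cap B^c). \qquad (2.16)$$

Now, because $A \subseteq B$, we can easily show that

$$Q \cap A^c = (Q \cap B^c) \cup (Q \cap (B \setminus A)). \qquad (2.16a)$$

From $Q \cap (B \setminus A) \subseteq B \setminus A$, it follows that $\mu^*(Q \cap (B \setminus A)) = 0$. From (2.16a),

$$\mu^*(Q \cap B^c) \le \mu^*(Q \cap A^c) \le \mu^*(Q \cap B^c) + \mu^*(Q \cap (B \setminus A)) = \mu^*(Q \cap B^c).$$

Consequently, we can replace $\mu^*(Q \cap B^c)$ in (2.16) by $\mu^*(Q \cap A^c)$. Finally, noticing that $Q \cap B \supseteq Q \cap A$, we have that

$$\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap A^c),$$

and this is the desired inequality. $\square$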

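Lemma 2.17. Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, $\mu_0^*$ be $\operatorname{Res}_{\Sigma^*}\mu^*$, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$ such that $\mu_0^*(A^*) < \infty$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supseteq A^*$ and $\mu_0^*(B \setminus A^*) = 0$.

Proof. Since $\mu_0^*(A^*) < \infty$, $\mathcal{C}_{A^*}(\mathcal{G}) \ne \emptyset$. From Lemma 2.15, for every $\varepsilon > 0$, say $\varepsilon = \frac{1}{k}$, there is a $G_k = \bigcup_{n=1}^{\infty} G_n^k \supseteq A^*$, $G_k \in \mathcal{G}_\sigma$, such that $\mu_0^*(G_k) \le \mu_0^*(A^*) + \frac{1}{k}$. The latter yields that

$$\mu_0^*(G_k \setminus A^*) \le \tfrac{1}{k}. \qquad (2.17)$$

Obviously, $\bigcap_{k=1}^{m} G_k$ is still a superset of $A^*$ and since $\bigcap_{k=1}^{m} G_k \subseteq G_m$, it follows that

$$\mu_0^*(D_m) \le \tfrac{1}{m},$$

where $D_m = \big(\bigcap_{k=1}^{m} G_k\big) \setminus A^* \in \Sigma^*$. The sequence $\{D_m\}$ is clearly monotone nonincreasing and $\mu_0^*(D_1) < \infty$. Therefore, by continuity from above (see Theorem 1.7 (i)) of $\mu_0^*$ and because of (2.17),

$$\mu_0^*\Big(\Big(\bigcap_{k=1}^{\infty} G_k\Big) \setminus A^*\Big) = \lim_{m \to \infty} \mu_0^*(D_m) = 0.$$

The set $\bigcap_{k=1}^{\infty} G_k$ obviously meets the requirements on the set $B$ "promised" in the statement and we are done with the proof. $\square$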

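Corollary 2.18. Let $\mu^*$ be the outer measure generated by a $\sigma$-finite extendible formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supseteq A^*$ and $\mu^*(B \setminus A^*) = 0$.

Proof. Since $(\mathcal{G}, \gamma)$ is $\sigma$-finite, there is a partition $\{H_1, H_2, \ldots\} \subseteq \mathcal{G}$ of $\Omega$ such that $\gamma(H_k) < \infty$. If $A^* \in \Sigma^*$, then $\{A_k^* = A^* \cap H_k,\ k = 1, 2, \ldots\}$ is a $\mu^*$-measurable partition of $A^*$, with $\mu^*(A_k^*) < \infty$ for every $k$, and to each of which we can apply Lemma 2.17 and have a set $B_k \in \Sigma(\mathcal{G})$, with $B_k \supseteq A_k^*$ and $\mu^*(B_k \setminus A_k^*) = 0$.

Notice that since

$$\Big(\bigcup_{k=1}^{\infty} B_k\Big) \setminus A^* \subseteq \bigcup_{k=1}^{\infty} (B_k \setminus A_k^*),$$

it holds true that

$$\mu^*\Big(\Big(\bigcup_{k=1}^{\infty} B_k\Big) \setminus A^*\Big) \le \sum_{k=1}^{\infty} \mu^*(B_k \setminus A_k^*) = 0.$$

The statement follows after setting $B = \bigcup_{k=1}^{\infty} B_k$ $(\in \Sigma(\mathcal{G}))$. $\square$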

Now, with the aid of the above propositions, we can finally answer the question about the relationship between the completion $(\Omega, \bar{\Sigma}, \bar{\mu})$ of a measure space $(\Omega, \Sigma, \mu)$ and the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.

2.19 Theorem. Let $(\mathcal{G}, \gamma)$ be an extendible formatter for $(\Omega, \Sigma^*, \mu_0^*)$ and a generator for the measure space $(\Omega, \Sigma = \Sigma(\mathcal{G}), \mu = \operatorname{Res}_{\Sigma}\mu^*)$ whose completion is $(\Omega, \bar{\Sigma}, \bar{\mu})$. Then,

(i) $\bar{\Sigma} \subseteq \Sigma^*$.

(ii) If $(\mathcal{G}, \gamma)$ is $\sigma$-finite, then $\bar{\Sigma} = \Sigma^*$ and $\bar{\mu} = \mu_0^*$.

Proof. (i) Obviously, $\bar{\Sigma} \subseteq \Sigma^*$ if and only if any element $\bar{A}$ of $\bar{\Sigma}$ is of the form $A \cup N$, where $A \in \Sigma$, $N$ is $\mu$-negligible, and $\bar{A}$ is $\mu^*$-measurable. According to Lemma 2.16, $A \cup N$ would be $\mu^*$-measurable, if there is a $\mu^*$-measurable set $B$ such that $B \supseteq A \cup N$ and $\mu^*(B \setminus (A \cup N)) = 0$. By Definition 2.5 (i) of a $\mu$-negligible set, $N$ must have a $\Sigma$-measurable $\mu$-null superset, say $N_0$. (Note that even though, by Problem 2.10, $\mu^*(N) = 0$ and $\mu^*(A \cup N) = \mu^*(A)$, this does not warrant that $A \cup N \in \Sigma^*$.) Since $A \cup N_0$ is a superset of $A \cup N$ and, by Problem 2.11, $(A \cup N_0) \setminus (A \cup N)$ is a $\mu^*$-null set, $B = A \cup N_0$ meets all prerequisites of Lemma 2.16, which makes $A \cup N$ indeed $\mu^*$-measurable. This proves part (i) of the theorem.

(ii) Because of part (i), we need to show that $\Sigma^* \subseteq \bar{\Sigma}$, i.e., that each $A^*$ can be represented as the union of a $\mu$-measurable set and a $\mu$-negligible set. By Problem 2.12, for any $A^* \in \Sigma^*$, there is a $\Sigma$-measurable subset $B$ of $A^*$ such that $\mu^*(A^* \setminus B) = 0$. Obviously, $A^*$ can be decomposed as $B$ and the $\mu^*$-measurable null set $A^* \cap B^c$. It only remains to show that $A^* \cap B^c$ is $\mu$-negligible.

By Corollary 2.18, for $A^*$, there is a set $C \in \Sigma$ such that $C \supseteq A^*$ and $\mu^*(C \setminus A^*) = 0$. The set-difference $C \setminus B = (C \setminus A^*) + (A^* \setminus B)$, as the union of two $\mu^*$-null sets, is a $\mu^*$-null set and, therefore, a $\mu$-null set (as $C \setminus B \in \Sigma$). This proves that $A^* \cap B^c$ is $\mu$-negligible.

Now, we show that $\bar{\mu} = \mu_0^*$. (Recall that they are equal on $\Sigma$.) Since $\bar{\Sigma} = \Sigma^*$, $A^* = A \cup N$, where $A \in \Sigma$ and $N$ is $\mu$-negligible, and

$$\bar{\mu}(A^*) = \mu(A) = \mu^*(A). \qquad (2.19)$$

On the other hand, there is a $\mu$-null superset of $N$ to yield $\mu^*(N) = 0$ due to monotonicity of $\mu^*$. Finally, from the inequalities

$$\mu^*(A^*) \le \mu^*(A) + \mu^*(N) = \mu^*(A) \quad \text{and} \quad \mu^*(A^* = A \cup N) \ge \mu^*(A),$$

it follows that $\mu^*(A^*) = \mu^*(A)$ and this, along with (2.19), yields that $\bar{\mu}(A^*) = \mu^*(A^*)$ for each $A^* \in \Sigma^* = \bar{\Sigma}$. $\square$

Example 2.20. If $(\Omega, \Sigma, \mu)$ is a probability space, it follows from Theorem 2.19 that the completion of $(\Sigma, \mu)$ coincides with $(\Sigma^*, \mu_0^*)$ produced by $(\Sigma, \mu)$ or by a "smaller generator" $(\mathcal{G}, \gamma)$ of $(\Sigma, \mu)$. $\square$

A noteworthy question arises: if we have a semi-ring and a $\sigma$-additive elementary content, would it make any difference whether we first extend them to the smallest ring and a premeasure, according to Proposition 2.11, and then use the Carathéodory extension to arrive at the smallest $\sigma$-algebra and a measure on it, or apply the Carathéodory extension directly to that semi-ring and $\sigma$-additive elementary content? The same question applies, say, to a ring with a premeasure and the generated $\sigma$-algebra with a measure. The difference, if any, can apparently take place at the expense of two outer measures, induced by a formatter and its extension.

2.21 Theorem. Let $(\Omega, \mathcal{G}, \gamma_0)$ be an extendible formatter of the outer measure $\mu^*$ and the $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable sets, and let $(\mathcal{E} = \mathcal{E}(\mathcal{G}), \gamma)$ be an extension of $(\mathcal{G}, \gamma_0)$ and an extendible formatter of the outer measure $\nu^*$ and the $\sigma$-algebra $\Sigma_\nu^*$ of $\nu^*$-measurable sets, such that $\mathcal{E} \subseteq \Sigma^*$ and $\gamma = \operatorname{Res}_{\mathcal{E}}\mu^*$. Then, $\nu^* = \mu^*$ on $\mathcal{P}(\Omega)$ and $\Sigma^* = \Sigma_\nu^*$.

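Proof. Let $Q \subseteq \Omega$. Since $\mathcal{C}_Q(\mathcal{G}) \subseteq \mathcal{C}_Q(\mathcal{E})$, obviously

$$\nu^*(Q) \le \mu^*(Q), \qquad (2.21)$$

which yields the equation $\nu^* = \mu^*$ on the subcollection of sets $Q \in \mathcal{P}(\Omega)$ with $\nu^*(Q) = \infty$. Suppose $\nu^*(Q) < \infty$. Then, for every $\varepsilon > 0$, there is a cover $\{E_n\} \in \mathcal{C}_Q(\mathcal{E})$ with

$$\sum_{n=1}^{\infty} \gamma(E_n) \le \nu^*(Q) + \frac{\varepsilon}{2}. \qquad (2.21a)$$

Since $\gamma = \mu^*$ on $\mathcal{E}$ and $\gamma(E_n) = \mu^*(E_n) < \infty$, for each $\varepsilon 2^{-n-1} > 0$, there is a cover $\{G_{nk},\ k = 1, 2, \ldots\} \in \mathcal{C}_{E_n}(\mathcal{G})$, such that

$$\sum_{k=1}^{\infty} \gamma_0(G_{nk}) \le \mu^*(E_n) + \varepsilon 2^{-n-1}. \qquad (2.21b)$$

Because $\{G_{nk},\ n, k = 1, 2, \ldots\} \in \mathcal{C}_Q(\mathcal{G})$, from (2.21a) and (2.21b),

$$\mu^*(Q) \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \gamma_0(G_{nk}) \le \sum_{n=1}^{\infty} \big(\gamma(E_n) + \varepsilon 2^{-n-1}\big) \le \nu^*(Q) + \varepsilon. \qquad (2.21c)$$

Finally, letting $\varepsilon \downarrow 0$ in (2.21c) leads to the inverse of inequality (2.21) and proves that $\mu^* = \nu^*$ on $\mathcal{P}(\Omega)$. Since the outer measure alone determines the $\sigma$-algebra of separating sets, $\mu^* = \nu^*$ yields that $\Sigma^* = \Sigma_\nu^*$, which completes the proof. $\square$

An important consequence of Theorem 2.21 is the following.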

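2.22 Corollary. Let $(\mathcal{S}, \gamma_0)$ be a semi-ring and a $\sigma$-additive and $\sigma$-finite elementary content in $\Omega$, and let $(\mathcal{E} = \mathcal{E}(\mathcal{S}), \gamma)$ be an extension of $(\mathcal{S}, \gamma_0)$ such that $\mathcal{E} \subseteq \Sigma(\mathcal{S})$. Let $(\Sigma_s = \Sigma(\mathcal{S}), \mu_s)$ and $(\Sigma_e = \Sigma(\mathcal{E}), \mu_e)$ be the Carathéodory extensions with their respective outer measures $\mu_s^*$ and $\mu_e^*$ and $\sigma$-algebras $\Sigma_s^*$ and $\Sigma_e^*$ of measurable sets. Then, the following hold true:

1) $\Sigma = \Sigma_s = \Sigma_e$.

2) $\mu_s = \mu_e$ on $\Sigma$.

3) $\mu_s^* = \mu_e^* = \mu^*$.

4) $\Sigma^* = \Sigma_s^* = \Sigma_e^*$.

Proof. 1) From $\mathcal{E} \supseteq \mathcal{S}$, we have $\Sigma_e \supseteq \Sigma_s$. From $\mathcal{S} \subseteq \mathcal{E} \subseteq \Sigma_s$, it follows that $\Sigma_e \subseteq \Sigma_s$.

2) Now the measures $\mu_s$ and $\mu_e$ act on the same $\sigma$-algebra $\Sigma$ and coincide on the semi-ring $\mathcal{S}$. Since $\gamma_0$ is $\sigma$-finite on $\mathcal{S}$, by Corollary 2.14, $\mu_s = \mu_e$ on $\Sigma$.

3) With

$$\mathcal{E} \subseteq \Sigma \subseteq \Sigma_s^*$$

and, consequently,

$$\gamma = \operatorname{Res}_{\mathcal{E}}\mu_e = \operatorname{Res}_{\mathcal{E}}\mu_s = \operatorname{Res}_{\mathcal{E}}\mu_s^*,$$

we meet all conditions of Theorem 2.21 to have $\mu_s^* = \mu_e^* = \mu^*$.

4) $\Sigma^* = \Sigma_s^* = \Sigma_e^*$ also by Theorem 2.21. $\square$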

For instance, $\mathcal{E}$ can be the ring generated by $\mathcal{S}$, with $\gamma$ the extension of the elementary content $\gamma_0$ in accordance with Proposition 2.11; or $\mathcal{E}$ can be an algebra with $\gamma$ as a premeasure; or $\mathcal{E}$ can even be the $\sigma$-algebra $\Sigma(\mathcal{S})$. In particular, it follows that, once the Carathéodory extension from $(\mathcal{S}, \gamma_0)$ to $(\Sigma(\mathcal{S}), \mu)$ is rendered, another Carathéodory extension of $(\mathcal{E}, \gamma)$ would be redundant. Another consequence of Theorem 2.21 is the uniqueness of outer measures generated by measures.

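Corollary 2.23. Let $\mu$ be a measure on a $\sigma$-algebra $\Sigma$, which produces the outer measure $\mu^*$ with the $\sigma$-algebra $\Sigma^*$ of measurable sets. If there is another outer measure $\bar{\mu}^*$, then $\bar{\mu}^* = \mu^*$ on $\mathcal{P}(\Omega)$ and $\bar{\Sigma}^* = \Sigma^*$.

Proof. This is a direct application of Theorem 2.21 with the following identification of the above characteristics:

1) Let $\bar{\mu}$ be a measure on $\bar{\Sigma}$ such that $\operatorname{Res}_{\Sigma}\bar{\mu} = \mu$. Then $\bar{\mu}$ can serve as an extension of $(\Sigma, \mu)$.

2) $\bar{\Sigma} \subseteq \Sigma^*$.

3) $\bar{\mu} = \operatorname{Res}_{\bar{\Sigma}}\mu^*$. $\square$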

Remark 2.24. Corollary 2.23 is useful in various applications of Carathéodory's extension. Suppose $\gamma_1$ and $\gamma_2$ are two elementary contents coinciding on a semi-ring $\mathcal{S}$ and $\sigma$-finite on $\mathcal{S}$. By Corollary 2.14, their respective Carathéodory extensions $\mu_1$ and $\mu_2$ must coincide on $\Sigma(\mathcal{S})$. Let $\mu_1^*$ and $\mu_2^*$ be the corresponding outer measures, according to Corollary 2.22, produced by $\gamma_1$ and $\gamma_2$ or by $\mu_1$ and $\mu_2$ (regardless). By Corollary 2.23, $\mu_1^* = \mu_2^*$ on $\mathcal{P}(\Omega)$ and $\Sigma_1^* = \Sigma_2^*$.

As in Theorem 2.21, by comparing two measures generated by a set function acting on a collection of sets and by its extension, we ended up comparing the two corresponding outer measures. It seems reasonable to raise another question: what if an outer measure produces another outer measure? Would this make any difference? More specifically, can the restriction $\mu_0^*$ of an outer measure $\mu^*$ to $\Sigma^*$ become a formatter of another outer measure, different from $\mu^*$? Note that this is a different scenario from the one considered in Theorem 2.21, since here $\mu^*$ is not supposed to be generated by a formatter and it "acts on its own." The following example shows us this distinction.

2.25 Example. Consider $\mathcal{P}(\Omega)$, $\mu^*$, $\Sigma^*$, and $\mu_0^*$ in Example 2.4 (i): $\Omega = \{a, b, c\}$, $A = \{a\}$, $A^c = \{b, c\}$, $P = \{b\}$, $\Sigma^* = \{\emptyset, \Omega, A, A^c\}$, and $\mu_0^* = \operatorname{Res}_{\Sigma^*}\mu^*$. Then, generate the outer measure $\nu^*$ by $(\Sigma^*, \mu_0^*)$. So, we have: $\mu^* = \nu^*$ on $\Sigma^*$ and

$$\nu^*(P) = \mu^*(A^c) = 3 \ (> \mu^*(P) = 2), \qquad \nu^*(S) = \mu^*(\Omega) = 4 \ (> \mu^*(S) = 3). \qquad \square$$
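Example 2.25 can be checked by brute force on the three-point set. The sketch below is only an illustration: Example 2.4 (i) is not reproduced in this section, so the values of $\mu^*$ on $\{a\}$, $\{c\}$, $\{a,c\}$ and the choice $S = \{a, b\}$ are assumptions, picked to agree with the four values quoted above. The helper names (`measurable_sets`, `induced_outer_measure`) are likewise hypothetical.

```python
from itertools import combinations

OMEGA = frozenset("abc")


def powerset(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Assumed outer measure: only mu*({b}) = 2, mu*({b,c}) = 3, mu*(S) = 3 and
# mu*(Omega) = 4 are quoted in the text; the rest is a hypothetical completion.
MU_STAR = {
    frozenset(): 0,
    frozenset("a"): 1, frozenset("b"): 2, frozenset("c"): 2,
    frozenset("ab"): 3, frozenset("ac"): 3, frozenset("bc"): 3,
    frozenset("abc"): 4,
}


def measurable_sets(mu_star, omega):
    """Sets E that mu*-separate every Q (Caratheodory's criterion)."""
    subsets = powerset(omega)
    return [e for e in subsets
            if all(mu_star[q] == mu_star[q & e] + mu_star[q - e]
                   for q in subsets)]


def induced_outer_measure(family, set_fn, omega):
    """Outer measure generated by (family, set_fn): the cheapest cover.

    On a finite set every countable cover can be replaced by a finite
    subfamily without increasing its cost, so brute force suffices.
    """
    nu = {}
    for q in powerset(omega):
        best = float("inf")
        for r in range(1, len(family) + 1):
            for cover in combinations(family, r):
                if q <= frozenset().union(*cover):
                    best = min(best, sum(set_fn[c] for c in cover))
        nu[q] = best
    return nu


SIGMA_STAR = measurable_sets(MU_STAR, OMEGA)
NU_STAR = induced_outer_measure(SIGMA_STAR, MU_STAR, OMEGA)

for q in powerset(OMEGA):
    print(sorted(q), MU_STAR[q], NU_STAR[q])
```

With these assumed values the measurable sets come out as $\{\emptyset, \Omega, A, A^c\}$, the two outer measures agree there, and $\nu^*$ exceeds $\mu^*$ on $P$ and on $S$, which is the behavior the example describes.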

As we see, in most cases $\nu^*$ is strictly greater than $\mu^*$ on $\mathcal{P}(\Omega)$.

As we learn from Problem 2.1, if $\nu^*$ is an outer measure induced by $\mu_0^*$, then always $\mu^* \le \nu^*$ on $\mathcal{P}(\Omega)$. The equality $\mu^* = \nu^*$ requires some restrictions, such as those in the following proposition.

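2.26 Proposition. Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$, $\mu_0^* = \operatorname{Res}_{\Sigma^*}\mu^*$, and let $\nu^*$ be the outer measure produced by $(\Sigma^*, \mu_0^*)$. The equation $\mu^* = \nu^*$ holds true on $\mathcal{P}(\Omega)$ if and only if for every $Q \in \mathcal{P}(\Omega)$, there exists a set $A^* \in \Sigma^*$ such that $A^* \supseteq Q$ and $\mu^*(A^*) = \mu^*(Q)$. $\square$

(See Problem 2.17.)

2.27 Remark. If $\mu^*$ is generated by an extendible formatter $(\mathcal{G}, \gamma)$, then clearly $\mu^* = \nu^*$, due to Theorem 2.21, as $(\mu_0^*, \Sigma^*)$ can serve as an extension of $(\mathcal{G}, \gamma)$. Alternatively, if $Q \in \mathcal{P}(\Omega)$, according to Lemma 2.15, for each positive $\varepsilon$, there is a set $G_\varepsilon \in \mathcal{G}_\sigma$ (the collection of all countable unions of elements from $\mathcal{G}$) such that $G_\varepsilon \supseteq Q$ and $\mu^*(G_\varepsilon) \le \mu^*(Q) + \varepsilon$. We assume that $\nu^*$ is the outer measure generated by $\mu_0^*$. Since $\mu^* = \nu^*$ on $\Sigma^*$ and $G_\varepsilon \in \Sigma^*$, we have $\mu^*(G_\varepsilon) = \nu^*(G_\varepsilon)$ and, by monotonicity, $\nu^*(G_\varepsilon) \ge \nu^*(Q)$. Thus, we have

$$\nu^*(Q) \le \nu^*(G_\varepsilon) = \mu^*(G_\varepsilon) \le \mu^*(Q) + \varepsilon,$$

which yields $\nu^*(Q) \le \mu^*(Q)$. The inverse inequality is due to Problem 2.1.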

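2.28 Theorem. Let $(\Omega, \Sigma, \mu)$ be a measure space such that $\Sigma = \Sigma(\mathcal{S})$ with $\mathcal{S}$ being a semi-ring, and let $\mu$ be $\sigma$-finite on $\mathcal{S}$. Then, given $A \in \Sigma(\mathcal{S})$ and $\varepsilon > 0$, there is a disjoint countable cover $\{S_n\} \subseteq \mathcal{S}$ of $A$ that "$\varepsilon$-approximates" $A$, i.e. such that $A \subseteq \sum_{n=1}^{\infty} S_n$ and $\mu\big(\big(\sum_{n=1}^{\infty} S_n\big) \setminus A\big) < \varepsilon$.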

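Proof. Let $\gamma = \operatorname{Res}_{\mathcal{S}}\mu$ and let $\mu^*$ be the outer measure produced by $(\mathcal{S}, \gamma)$. Then, $\mu$ is the unique Carathéodory extension of $\gamma$ from $\mathcal{S}$ to $\Sigma(\mathcal{S})$, according to Corollary 2.14, and $\mu = \operatorname{Res}_{\Sigma}\mu^*$.

Case 1. Let $\mu(A) = \mu^*(A) < \infty$. Then, by (2.8) (of Proposition 2.8), for each $\varepsilon > 0$, there is a sequence $\{G_n\} \in \mathcal{C}_A(\mathcal{S})$ such that

$$\sum_{n=1}^{\infty} \mu(G_n) < \mu(A) + \varepsilon.$$

Since $\mu\big(\bigcup_{n=1}^{\infty} G_n\big) \le \sum_{n=1}^{\infty} \mu(G_n)$, we have that

$$\mu\Big(\Big(\bigcup_{n=1}^{\infty} G_n\Big) \setminus A\Big) < \varepsilon.$$

… for each $\varepsilon > 0$, there is an $N$ such that

$$\mu\Big(\sum_{k=n+1}^{\infty} G_k\Big) < \varepsilon, \quad \text{for all } n > N,$$

thereby leading to …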

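PROBLEMS

2.1 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$, $\mu_0^* = \operatorname{Res}_{\Sigma^*}\mu^*$, and $\nu^*$ be the outer measure induced by $\mu_0^*$. Show that $\mu^* \le \nu^*$ on $\mathcal{P}(\Omega)$.

2.2 Let $(\mathcal{G}, \gamma)$ be a formatter of the outer measure $\mu^*$ defined by (2.8). Show that if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then for each $G \in \mathcal{G}_\sigma$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^{\infty} C_n$ and $\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n)$.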

2.3


2.4

Let p be a finite measure on (R,E) and let be any subcollection of C. Show that, for any fixed subset Q C R, it is true that

2.5

Show that the original definition of a-finiteness 1.1 (vii) implies the second definition of a-finiteness for semi-rings and elementary contents mentioned in Remark 2.12.

2.6

Let p* be an outer measure on ?(R) and {A,} a sequence of disjoint p*-measurable sets. Show that for any Q C R,

2.7

Let N E N, ( i . . a p-null set) and let B E E. Show that U B ) = p(B\N) =

2.8

Show that defined in Definition 2.5 (ii) is a o-algebra, P is a measure, that this extension does not depend upon representations of sets of C, and that ( R , ~ , P )is complete.

2.9

Show that the measure space defined in Example 1.2 (iv) is complete.

z

256

CHAPTER 5. MEASURES

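2.10 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $N \subseteq \Omega$ be such that $\mu^*(N) = 0$. Show that for any subset $Q \subseteq \Omega$, $\mu^*(Q \cup N) = \mu^*(Q)$.

2.11 Show that $(A \cup N_0) \setminus (A \cup N)$ in part (i) of Theorem 2.19 is a $\mu^*$-null set.

2.12 Let $\mu^*$ be the outer measure generated by an extendible $\sigma$-finite formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \subseteq A^*$ and $\mu^*(A^* \setminus B) = 0$.

2.13 Let $(\Omega, \Sigma_0, \mu_0)$ be a completion of a measure space $(\Omega, \Sigma, \mu)$. Define for each $A \subseteq \Omega$

$$\underline{\mu}(A) = \sup\{\mu(B) : B \in \Sigma,\ B \subseteq A\} \quad \text{and} \quad \overline{\mu}(A) = \inf\{\mu(B) : B \in \Sigma,\ A \subseteq B\}.$$

Show that

a) if $A \in \Sigma_0$, then $\underline{\mu}(A) = \overline{\mu}(A) = \mu_0(A)$;

b) if $\underline{\mu}(A) = \overline{\mu}(A) < \infty$, then $A \in \Sigma_0$.

2.14 Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $a \in \Omega$. Show that for $\{a\} \in \Sigma$ the measure space $(\Omega, \Sigma, \varepsilon_a)$ is complete if and only if $\Sigma = \mathcal{P}(\Omega)$.

2.15 (Generalization of Problem 1.12.) Let $(\Omega, \Sigma, \mu)$ be a measure space, $\mathcal{G}_\infty = \{G \in \Sigma : \mu(G) < \infty\}$, $\Sigma_\infty = \{Q \subseteq \Omega : Q \cap G \in \Sigma,\ \forall G \in \mathcal{G}_\infty\}$, and let $\mu$ be $\sigma$-finite. Show that $\Sigma_\infty = \Sigma$.

2.16 In the condition of Problem 1.13, show that if $\mu$ is complete, then so is $\mu_\infty$.

2.17 Prove Proposition 2.26.

2.18 Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G}, \gamma)$ on a non-empty set $\Omega$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that a subset $N \subseteq \Omega$ is negligible if and only if $\mu^*(N) = 0$.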

NEW TERMS:

outer measure
monotonicity of outer measure
subadditivity of outer measure
$\mu^*$-measurable set
$\mu^*$-separability
Carathéodory's Extension Theorem
$\mu_0^*$-measure
$\Sigma^*$-$\sigma$-algebra
restriction of a function
extension of a function
$\mu$-null (null) set
$\mathcal{N}_\mu$-set
$\mu$-negligible (negligible) set
extension of a measure
completion of a measure
completion of a measure space
restriction of outer measure to $\Sigma^*$-algebra
Carathéodory's extension
formatter of an outer measure
complete Carathéodory's extension
extendible formatter
extendibility of a formatter, criterion of
$\sigma$-finiteness of a set function
Carathéodory's extension, uniqueness of

3. LEBESGUE AND LEBESGUE-STIELTJES MEASURES

In this section, we will use the results of the previous section for the construction of Lebesgue and Lebesgue-Stieltjes measures. We have learned that to warrant the Carathéodory extension, a given formatter should be at least a semi-ring with a $\sigma$-additive elementary content, which applies to some special cases of formatters in Euclidean spaces. In Theorem 3.1 below, we will show that the Lebesgue content is $\sigma$-additive on the ring $\mathcal{R}(\mathbb{R}^n)$, which will clearly yield that the Lebesgue elementary content is also $\sigma$-additive on the semi-ring of half-open intervals. Although it is possible to prove this statement directly (cf. Problem 3.25, with no prior extension and no $\emptyset$-continuity arguments, as in Theorem 3.1), we prefer first to extend the elementary content to the ring, as we want to exploit the equivalence of $\emptyset$-continuity and $\sigma$-additivity. The latter, as we know, can be observed on set families not lesser than rings.

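Theorem 3.1. The Lebesgue content $\lambda_c$ on the ring $\mathcal{R}(\mathbb{R}^n)$ is $\sigma$-additive, i.e. a premeasure.

Proof. Since the Lebesgue content $\lambda_c$ is finite on $\mathcal{R}$, by Proposition 1.7 (ii), $\lambda_c$ is a premeasure if it is $\emptyset$-continuous. We shall be using an equivalent version of $\emptyset$-continuity:

For every monotone decreasing sequence $\{A_n\}{\downarrow} \subseteq \mathcal{R}$ with $\lambda_c(A_1) < \infty$, the assumption that $\lim_{n \to \infty} \lambda_c(A_n)$ (which clearly exists) is strictly positive must yield that $\bigcap_{n=1}^{\infty} A_n \ne \emptyset$.

Let $\{A_n\}$ be any such monotone decreasing sequence with

$$\varepsilon = \lim_{n \to \infty} \lambda_c(A_n) > 0. \qquad (3.1)$$

It is readily seen that (3.1) implies that for each $n$, $A_n \ne \emptyset$, and therefore, by Cantor's Theorem 5.4, Chapter 2,

$$\bigcap_{n=1}^{\infty} \bar{A}_n \ne \emptyset.$$

However, the nonempty intersection of the closures of the $A_n$'s need not yield that the intersection $\bigcap_{n=1}^{\infty} A_n$ is nonempty either. To overcome this difficulty we will construct a sequence of compact subsets of the $A_n$'s with the desired property.

Now, since the $A_n$'s belong to $\mathcal{R}$, each $A_n$ can be represented as a finite union of disjoint half-open parallelepipeds, say $\sum_{s=1}^{m} P_s$ (for brevity let us drop the index $n$), such that $\lambda_c(P_s) > 0$. Then for each value of $\varepsilon$ and for every $P_s$, there is a half-open parallelepiped $D_s$ whose closure is a proper subset of $P_s$ and such that

$$\lambda_c(P_s) - \lambda_c(D_s) \le \frac{\varepsilon}{m\,2^{n}}. \qquad (3.1a)$$

Bound (3.1a) yields that

$$\lambda_c(B_n) \ge \lambda_c(A_n) - \frac{\varepsilon}{2^{n}}, \qquad (3.1b)$$

where $B_n = \sum_{s=1}^{m} D_s$. Obviously, $\bar{B}_n \subseteq A_n$. It seems like we are done with the sequence $\{\bar{B}_n\}$. However, the claim that $\bigcap_{n=1}^{\infty} \bar{B}_n \ne \emptyset$ is unwarranted, as $\{B_n\}$ need not be monotone decreasing. Therefore, we define

$$C_n = \bigcap_{k=1}^{n} B_k,$$

which forms a monotone nonincreasing sequence of sets term-wise dominated by $\{A_n\}$. Now, we need to show that $C_n \ne \emptyset$. We shall be able to prove the much stronger statement that $\lambda_c(C_n) > 0$ for all $n$. Namely, we will prove that

$$\lambda_c(C_n) \ge \lambda_c(A_n) - \varepsilon\,(1 - 2^{-n}), \qquad (3.1c)$$

which, because of $\lambda_c(A_n) \ge \varepsilon$, would yield the desired

$$\lambda_c(C_n) \ge \frac{\varepsilon}{2^{n}}. \qquad (3.1d)$$

We prove (3.1c) by induction. For $n = 1$, (3.1c) holds true, since from (3.1b), $\lambda_c(C_1) = \lambda_c(B_1) \ge \lambda_c(A_1) - \frac{\varepsilon}{2}$. Now we assume that (3.1c) holds for some $n \ge 1$ and show the validity of (3.1c) for $n + 1$. Because of $C_{n+1} = B_{n+1} \cap C_n$ and Proposition 1.5 (ii),

$$\lambda_c(B_{n+1} \cup C_n) = \lambda_c(B_{n+1}) + \lambda_c(C_n) - \lambda_c(C_{n+1}). \qquad (3.1e)$$

Due to (3.1e), the inequality $\lambda_c(B_{n+1}) \ge \lambda_c(A_{n+1}) - \frac{\varepsilon}{2^{n+1}}$ (from (3.1b) for $n + 1$), and the assumption that (3.1c) holds true for some fixed $n$, we have

$$\lambda_c(C_{n+1}) \ge \lambda_c(A_{n+1}) - \frac{\varepsilon}{2^{n+1}} + \lambda_c(A_n) - \varepsilon\,(1 - 2^{-n}) - \lambda_c(B_{n+1} \cup C_n).$$

Since obviously $B_{n+1} \cup C_n \subseteq A_n$, we have $\lambda_c(A_n) \ge \lambda_c(B_{n+1} \cup C_n)$, and hence

$$\lambda_c(C_{n+1}) \ge \lambda_c(A_{n+1}) - \varepsilon\,(1 - 2^{-(n+1)}).$$

This proves (3.1c) and (3.1d) and thereby yields that $\{\bar{C}_n\}$ is a monotone nonincreasing sequence of nonempty compact sets; hence, by Cantor's Theorem 5.4, Chapter 2,

$$\emptyset \ne \bigcap_{n=1}^{\infty} \bar{C}_n \subseteq \bigcap_{n=1}^{\infty} \bar{B}_n \subseteq \bigcap_{n=1}^{\infty} A_n.$$

Consequently, it shows that $\lambda_c$ is indeed a premeasure on the ring $\mathcal{R}$. $\square$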

3.2 Remarks and Definitions.

(i) Theorem 3.1 states that the Lebesgue content on $\mathcal{R}(\mathcal{S})$ in $\mathbb{R}^n$ is $\sigma$-additive. This, obviously, implies that the Lebesgue elementary content is also $\sigma$-additive on $\mathcal{S}$.

(ii) In Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. Now, by the use of Proposition 2.11, Corollary 2.14, and Theorem 3.1, we can have the couple $(\mathcal{R}, \lambda_c)$ or, in light of Remark (i), even $(\mathcal{S}, \lambda_0)$ as an extendible formatter of the outer measure $\lambda^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Lebesgue outer measure. The $\sigma$-algebra $\Sigma^* \subseteq \mathcal{P}(\mathbb{R}^n)$ of all $\lambda^*$-measurable sets, in notation $\mathcal{L}^*$, called the Lebesgue $\sigma$-algebra of measurable sets, along with $\lambda_0^* = \operatorname{Res}_{\mathcal{L}^*}\lambda^*$, called the Lebesgue measure, will form a complete measure space, according to Proposition 2.7. The further restriction $\lambda = \operatorname{Res}_{\Sigma(\mathcal{S})}\lambda^*$ of the Lebesgue outer measure to the smallest $\sigma$-algebra generated by $\mathcal{S}$ (which, according to Theorem 2.7, Chapter 4, is identical to the smallest $\sigma$-algebra generated by the usual topology) or, equivalently, by $\mathcal{R}$, known as the Borel $\sigma$-algebra $\mathcal{B}$ on $\mathbb{R}^n$, is referred to as the Borel-Lebesgue measure.

By noticing that there exists a monotone increasing sequence $\{(-k, k]^n\} \uparrow \mathbb{R}^n$ of half-open squares with

$$\lambda_0\big((-k, k]^n\big) = (2k)^n < \infty,$$

we conclude that $\lambda_0$ is $\sigma$-finite on $\mathcal{S}$ and, therefore, by Corollary 2.14, the Borel-Lebesgue measure $\lambda$ is unique on $\mathcal{B}$. By Remark 2.24, the Lebesgue outer measure $\lambda^*$ and hence the Lebesgue measure $\lambda_0^*$ are also unique on $\mathcal{P}(\mathbb{R}^n)$ and $\mathcal{L}^*$, respectively. Finally, by Theorem 2.19 (ii), the completion of the Borel-Lebesgue measure $\lambda$ coincides with the Lebesgue measure $\lambda_0^*$ on $\mathcal{L}^*$ and the corresponding completion of the Borel $\sigma$-algebra coincides with the $\sigma$-algebra $\mathcal{L}^*$ of Lebesgue measurable sets.

Both Lebesgue and Borel-Lebesgue measures have their strengths and weaknesses. The Borel-Lebesgue measure acts on the Borel $\sigma$-algebra, which stems from the usual topology and preserves some topological properties. The Borel-Lebesgue measure is also an element of a very important class of Borel measures. However, unlike Lebesgue measure, Borel-Lebesgue measure is not complete.

3.3 Definitions.

(i) Let $\mathcal{B}$ be a Borel $\sigma$-algebra in $\Omega$. Any measure $\mu$ on $\mathcal{B}$ is called a Borel measure and the triple $(\Omega, \mathcal{B}, \mu)$ is called a Borel measure space.

(ii) A Borel measure $\mu$ on $(\mathbb{R}^n, \mathcal{B})$ is said to be a Borel-Lebesgue-Stieltjes measure if $\mu(B) < \infty$ for any $d_e$-bounded Borel set $B$. Clearly, any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite.

(iii) Let $\mu$ be a Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n, \mathcal{B})$. Now, in light of Carathéodory's construction we can use the couple $(\mathcal{B}, \mu)$ as an extendible formatter of the outer measure $\mu^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Borel outer measure. The $\sigma$-algebra $\Sigma^* \subseteq \mathcal{P}(\mathbb{R}^n)$ of all $\mu^*$-measurable sets will be denoted by $\mathcal{B}_\mu^*$ and called the Lebesgue-Stieltjes $\sigma$-algebra of measurable sets. The corresponding restriction $\mu_0^* = \operatorname{Res}_{\mathcal{B}_\mu^*}\mu^*$ will be called the Lebesgue-Stieltjes measure. In the literature on measure theory, Lebesgue-Stieltjes measures are often confused with Borel-Lebesgue-Stieltjes measures.

In addition to the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}$ on $\mathbb{R}^n$, we present another construction of a Borel-Lebesgue-Stieltjes measure, for simplicity letting the dimension $n = 1$. In Example 1.2 (iii) we introduced the Lebesgue-Stieltjes elementary content $\mu_f^0$ on the semi-ring $\mathcal{S}$ of all half-open intervals $(a, b] \subseteq \mathbb{R}$, by means of an extended distribution function (i.e., a monotone nondecreasing, right-continuous function) $f : (\mathbb{R}, d_e) \to (\mathbb{R}, d_e)$, as $\mu_f^0((a, b]) = f(b) - f(a)$. Observe that $\mu_f^0$ reduces to the Lebesgue elementary content if $f(x) = x$. According to Proposition 2.11, $\mu_f^0$ can uniquely be extended to the Lebesgue-Stieltjes content $\mu_f$ on the ring $\mathcal{R}(\mathcal{S})$ of "figures." The following is to show that $\mu_f$ is $\sigma$-additive.
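As a small numerical illustration of the formula $\mu_f^0((a, b]) = f(b) - f(a)$, the sketch below evaluates the elementary content for three extended distribution functions. The function names and the particular choices of $f$ (identity, a unit jump at zero, and their sum) are assumptions made for the example, not constructions taken from the text.

```python
def ls_content(f, a, b):
    """Elementary content of the half-open interval (a, b]: f(b) - f(a)."""
    if a > b:
        raise ValueError("need a <= b")
    return f(b) - f(a)


def figure_content(f, intervals):
    """Content of a figure given as disjoint half-open intervals (a, b]."""
    return sum(ls_content(f, a, b) for a, b in intervals)


def identity(x):
    """f(x) = x recovers the Lebesgue elementary content (interval length)."""
    return x


def jump_at_zero(x):
    """Right-continuous unit step: puts mass 1 at the point 0."""
    return 1.0 if x >= 0 else 0.0


def mixed(x):
    """Length plus a unit point mass at 0."""
    return x + jump_at_zero(x)


print(ls_content(identity, 1, 4))        # length of (1, 4]
print(ls_content(jump_at_zero, -1, 0))   # (-1, 0] contains the jump point
print(ls_content(jump_at_zero, 0, 1))    # (0, 1] does not contain it
print(figure_content(mixed, [(-1, 0), (0, 2)]), ls_content(mixed, -1, 2))
```

Right-continuity is what makes the jump at $0$ belong to $(-1, 0]$ rather than to $(0, 1]$, and splitting $(-1, 2]$ at $0$ leaves the total content unchanged, as finite additivity on figures requires.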

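3.4 Theorem. Let $\mu_f$ be a Lebesgue-Stieltjes content on the ring $\mathcal{R}(\mathcal{S})$ induced by a monotone nondecreasing right-continuous function $f$. Then $\mu_f$ is a premeasure.

Proof. Since $\mu_f$ is finite on $\mathcal{R}(\mathcal{S})$, as in Theorem 3.1, it is sufficient to show that $\mu_f$ is $\emptyset$-continuous. Let $\{R_n\}$ be a sequence of sets from $\mathcal{R}(\mathcal{S})$ monotonically decreasing to $\emptyset$. We prove that $\lim_{n \to \infty} \mu_f(R_n) = 0$.

We assume that $R_n \subseteq C$, $n = 1, 2, \ldots$, where $C$ is a compact set in $(\mathbb{R}, \tau_e)$. A set $R_n \in \mathcal{R}$ is a figure if it is a finite union of disjoint intervals of type $(a, b]$. Because of right-continuity of $f$, it can be easily shown that, for each fixed $\varepsilon > 0$ and for any figure $R_n$, there is a subfigure $B_n \subseteq R_n$ such that $\bar{B}_n \subseteq R_n$ and such that

$$\mu_f(R_n) - \mu_f(B_n) < \varepsilon 2^{-n}.$$

It also follows that $\bigcap_{n=1}^{\infty} \bar{B}_n = \emptyset$. We claim that there is an $r$ such that $\bigcap_{k=1}^{r} \bar{B}_k = \emptyset$. To see this, observe that

$$\{C \setminus \bar{B}_n = C \cap (\bar{B}_n)^c;\ n = 1, 2, \ldots\}$$

is an open cover of $C$ in the relative topology $(C, \tau_e \cap C)$. Since compactness is weakly hereditary and $C$ is closed, it follows that $C$ is also compact in $\tau_e \cap C$. Thus, the above cover reduces to a finite subcover, for example, $C \setminus \bar{B}_1, \ldots, C \setminus \bar{B}_r$, yielding that

$$C = \bigcup_{k=1}^{r} (C \setminus \bar{B}_k) = C \setminus \bigcap_{k=1}^{r} \bar{B}_k.$$

Thus, $\bigcap_{k=1}^{r} \bar{B}_k = \emptyset$ and $\bigcap_{k=1}^{r} B_k = \emptyset$. Now, for all $n > r$,

$$\bigcap_{k=1}^{n} B_k = \emptyset$$

and

$$R_n = R_n \setminus \bigcap_{k=1}^{n} B_k = \bigcup_{k=1}^{n} (R_n \setminus B_k).$$

Since $\{R_n\}$ is monotone decreasing, it follows that

$$R_n \subseteq \bigcup_{k=1}^{n} (R_k \setminus B_k).$$

Observe that this is the desired inclusion implying the estimate $\mu_f(R_n) < \varepsilon$. This inclusion is due to the inclusion $R_n \setminus B_k \subseteq R_k \setminus B_k$, which holds for all $k \le n$ (as long as $n < \infty$); this is why the countable intersection above had to be reduced to a finite intersection of the sets $B_k$, $k = 1, \ldots, r$. Thus we have

$$\mu_f(R_n) \le \sum_{k=1}^{n} \mu_f(R_k \setminus B_k) < \sum_{k=1}^{n} \varepsilon 2^{-k} < \varepsilon,$$

which shows that $\mu_f(R_n) \to 0$. $\square$

Notice that it can alternatively be shown (Problem 3.26) that the Lebesgue-Stieltjes elementary content is $\sigma$-additive with no prior extension to the Lebesgue-Stieltjes content and bypassing $\emptyset$-continuity.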

3.5 Remarks.

(i) Using the same arguments as in Remarks and Definitions 3.2, we will extend the Lebesgue-Stieltjes elementary content $\mu_f^0$ (or content $\mu_f$) from the semi-ring $\mathcal{S}$ (or ring $\mathcal{R}(\mathcal{S})$, respectively) to the Lebesgue-Stieltjes measure $\mu_f^*$ on the $\sigma$-algebra $\Sigma^*$ ($= \mathcal{B}_f^*$) of Lebesgue-Stieltjes measurable sets and then reduce it to the unique measure $\mu_f$, which is clearly a Borel-Lebesgue-Stieltjes measure on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$.

(ii) When dealing with Borel measures, it is common to observe a certain property of a $\sigma$-finite Borel measure $\mu_1$ on the semi-ring $\mathcal{S}$ in $\mathbb{R}^n$ and extend this property of $\mu_1$ from $\mathcal{S}$ to the Borel $\sigma$-algebra $\mathcal{B}$, arriving at another Borel measure $\mu_2$. Since $\mu_1$ and $\mu_2$ coincide on $\mathcal{S}$, by Corollary 2.14, $\mu_1 = \mu_2$ on $\mathcal{B}$. Consequently, by Remark 2.24, the corresponding outer measures $\mu_1^*$ and $\mu_2^*$ must coincide on $\mathcal{P}(\mathbb{R}^n)$, as well as their restrictions on $\mathcal{B}_1^* = \mathcal{B}_2^*$. Note, however, that $\mathcal{B}_\mu^*$ is not a general notation like $\mathcal{B}$ is, for it is not induced by the usual topology and it is related to a particular Borel measure $\mu$ on $\mathcal{B}$.

(iii) We have learned that if $f$ is an extended distribution function (see the definition in Example 1.2 (iii)), then it induces a Borel-Lebesgue-Stieltjes measure on $\mathcal{B}$. Conversely, a Borel-Lebesgue-Stieltjes measure $\mu$ generates an extended distribution function. If $\mu$ is a finite Borel-Lebesgue-Stieltjes measure on $\mathcal{B}$, then we can set $f(x) = \mu((-\infty, x])$ and such an $f$ is a distribution function. Indeed, take a sequence $x_1 > x_2 > \cdots \to x$. Then $f(x_n) - f(x) = \mu((x, x_n]) \to 0$, by $\emptyset$-continuity of $\mu$ (Theorem 1.7 (i)), which shows that $f$ is right-continuous. Since $\mu$ is a finite measure, $f$ is bounded. Finally, if $\{x_n\}$ is any monotone decreasing sequence convergent to $-\infty$ (such as $\{-n\}$), then, again by $\emptyset$-continuity of $\mu$, it follows that $\mu((-\infty, x_n]) \to 0$ and thus, $f(x_n) \to 0$. If $\mu$ is an arbitrary Borel-Lebesgue-Stieltjes measure, we can define $f(0) = 0$ and

$$f(x) = \begin{cases} \mu((0, x]), & x > 0, \\ -\mu((x, 0]), & x < 0. \end{cases}$$

Similarly, one can show that $f$ is an extended distribution function. (See Problem 3.3.) If $\mathfrak{B} = \mathfrak{B}(\mathbb{R}, \mathcal{B})$ denotes the set of all Borel-Lebesgue-Stieltjes measures on $(\mathbb{R}, \mathcal{B})$, then it can be shown that any two extended distribution functions $f_1$ and $f_2$ that induce $\mu \in \mathfrak{B}$ can differ only in an additive constant (see Problem 3.4). The latter generates an equivalence relation, say $\mathcal{E}$. Therefore, if $\mathfrak{D}_e$ denotes the set of all extended distribution functions, for each $\mu \in \mathfrak{B}$, there is a unique equivalence class $\{f; \mu\}$ of all such extended distribution functions that induce $\mu$, and $\{f; \mu\} = \{f + c : c \in \mathbb{R}\}$. Let $\mathfrak{D}_e / \mathcal{E} := \{\{f; \mu\} : \mu \in \mathfrak{B}\}$ be the corresponding quotient set of $\mathfrak{D}_e$. Then, there is a bijective map from the set $\mathfrak{B}$ onto the set $\mathfrak{D}_e / \mathcal{E}$.
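The passage from a finite measure to its distribution function can be tried out on a purely discrete example. The following sketch assumes a hypothetical finite measure made of three point masses (the dictionary `POINT_MASSES` is an invention for this illustration) and uses exact rational arithmetic so that the identities can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical finite Borel measure on R: point masses at -1, 0 and 2.
POINT_MASSES = {
    Fraction(-1): Fraction(1, 4),
    Fraction(0): Fraction(1, 2),
    Fraction(2): Fraction(1, 4),
}


def mu_half_open(a, b):
    """mu((a, b]) for the discrete measure above."""
    return sum(m for x, m in POINT_MASSES.items() if a < x <= b)


def f(x):
    """Distribution function f(x) = mu((-inf, x])."""
    return sum(m for p, m in POINT_MASSES.items() if p <= x)


# f recovers the measure of half-open intervals: mu((a, b]) = f(b) - f(a).
for a, b in [(-3, -1), (-1, 0), (0, 1), (-2, 5)]:
    print((a, b), mu_half_open(a, b), f(b) - f(a))

# Approaching 0 from the right keeps the value f(0) (right-continuity),
# while approaching from the left gives the smaller value f(0) - 1/2.
print([f(Fraction(1, k)) for k in (1, 10, 100)], f(0))
print([f(Fraction(-1, k)) for k in (10, 100)])
```

Adding a constant to `f` would leave every difference `f(b) - f(a)` unchanged, which is the additive-constant ambiguity described above for extended distribution functions.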

As regards the subset $\mathfrak{B}_0$ of all finite Borel-Lebesgue-Stieltjes measures, obviously each one of them generates a unique distribution function and there is a bijective map between $\mathfrak{B}_0$ and the set $\mathfrak{D}$ ($\subseteq \mathfrak{D}_e$) of all distribution functions. To make all distinctions between distribution and extended distribution functions lucid, the reader may find it expedient to go over Problem 3.9.

We will return to the Lebesgue measure $\lambda_0^*$ on $\mathcal{L}^*$. First, we prove a lemma about negligible sets. One of the interesting consequences of this result is that in $\mathbb{R}^n$, all Borel sets having a dimension less than $n$ are null sets.

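3.6 Lemma. A set $N \subseteq \mathbb{R}^n$ is $\lambda$-negligible if and only if for each $\varepsilon > 0$ there is a countable cover $\{I_k\} \subseteq \mathcal{S}$ of $N$ by semi-open intervals such that

$$\sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon.$$

Proof. Let $N$ be $\lambda$-negligible. Then, by Problem 2.18, $\lambda^*(N) = 0$ and

$$\lambda^*(N) = \inf\Big\{\sum_{k=1}^{\infty} \lambda_0(I_k) : \{I_k\} \in \mathcal{C}_N\Big\},$$

where $\mathcal{C}_N$ is the set of all countable covers of $N$ by semi-open intervals and it is not empty, since otherwise $\lambda^*(N)$ would equal $\infty$. By the definition of an infimum, for each $\varepsilon > 0$, there is a cover $\{I_k\} \in \mathcal{C}_N$ such that

$$\sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon,$$

which proves the first part of the statement. Conversely, let $\varepsilon > 0$ and let $\{I_k\} \subseteq \mathcal{S}$ be a countable cover of $N$ with the property that $\sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon$. Then,

$$\lambda^*(N) \le \sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon,$$

and hence, $\varepsilon$ being arbitrary, $\lambda^*(N) = 0$ and, by Problem 2.18, $N$ is a $\lambda$-negligible set. $\square$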
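The covering criterion of Lemma 3.6 is easy to see at work on a countable set in dimension one. The sketch below is an illustration under stated assumptions: it takes a finite initial segment of an enumeration of the rationals in $[0, 1]$ (the enumeration order and the helper names are choices made here) and covers the $k$-th point by a half-open interval of length $\varepsilon 2^{-(k+1)}$, so that the total length stays below $\varepsilon$ no matter how many points are listed.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_den):
    """Enumerate the rationals p/q in [0, 1] with q <= max_den, no repeats."""
    seen = []
    for q in range(1, max_den + 1):
        for p in range(0, q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.append(r)
    return seen


def cover(points, eps):
    """Cover the k-th point x_k by (x_k - eps / 2**(k+1), x_k], k = 1, 2, ..."""
    return [(x - eps / 2 ** (k + 1), x) for k, x in enumerate(points, start=1)]


def total_length(intervals):
    """Sum of the lengths b - a of the half-open intervals (a, b]."""
    return sum(b - a for a, b in intervals)


eps = Fraction(1, 1000)
points = rationals_in_unit_interval(12)
intervals = cover(points, eps)

print(len(points), float(total_length(intervals)), float(eps))
```

The geometric weights give a total length of $\varepsilon\,(1/2 - 2^{-(N+1)})$ for $N$ points, which is bounded by $\varepsilon/2$ uniformly in $N$; this is the reason the same construction works for the full countable set.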

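3.7 Lemma. Let $f : \mathbb{R} \to \mathbb{R}$ be an additive function, continuous at zero. Then, $f$ is linear.

Proof. First note that

$$f(0) = f(0 + 0) = f(0) + f(0). \qquad (3.7a)$$

This yields that $f(0) = 0$. Then, from

$$0 = f(0) = f(x + (-x)) = f(x) + f(-x) \qquad (3.7b)$$

it follows that $f(x) = -f(-x)$ and thus $f$ is odd. Now, let $n$ be any positive integer. Then, since $f$ is additive,

$$f(nx) = n f(x). \qquad (3.7c)$$

If $n$ is a negative integer, then, from (3.7b-c),

$$f(nx) = -f(-nx) = -(-n) f(x) = n f(x).$$

Hence, for each $n \in \mathbb{Z}$,

$$f(nx) = n f(x), \qquad (3.7d)$$

which yields that, for $n \ne 0$,

$$f\Big(\frac{x}{n}\Big) = \frac{1}{n} f(x). \qquad (3.7f)$$

Combining (3.7d) and (3.7f) we have that for each integer $m$,

$$f\Big(\frac{m}{n}\,x\Big) = \frac{m}{n} f(x).$$

In other words, for each rational number $q$,

$$f(qx) = q f(x). \qquad (3.7g)$$

Since $f$ is continuous at zero and because $f$ is additive and odd, we have from

$$f(x - y) = f(x) + f(-y) = f(x) - f(y)$$

that $f$ is continuous on $\mathbb{R}$. Now, let $r \in \mathbb{R}$. Then, there is a sequence $\{q_n\}$ of rationals convergent to $r$. Due to continuity of $f$,

$$\lim_{n \to \infty} f(q_n) = f(r). \qquad (3.7h)$$

On the other hand, $f(q_n \cdot 1) = q_n f(1)$ by (3.7g), and (3.7h) leads to

$$f(r) = \lim_{n \to \infty} q_n f(1) = r f(1).$$

This shows that $f$ is a linear function $f(x) = cx$, where $c = f(1)$. $\square$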

3.8 Corollary. Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuous at zero and additive for each variable separately. Then $f(x_1, \ldots, x_n) = c\,x_1 \cdots x_n$, where $c = f(1, \ldots, 1)$.

Proof. If $x_2, \ldots, x_n$ are fixed, then by Lemma 3.7,

$$f(x_1, x_2, \ldots, x_n) = x_1 f(1, x_2, \ldots, x_n).$$

Applying the same procedure successively to the other variables we have the statement. $\square$

3.9 Definition. A Borel measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$ is said to be translation-invariant, if for each Borel set $B \in \mathcal{B}(\mathbb{R}^n)$ and $x \in \mathbb{R}^n$, $\mu(B + x) = \mu(B)$, where $B + x = \{x + y : y \in B\}$.

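We will see in Section 4 that the Borel-Lebesgue and Lebesgue measures are translation-invariant. The following theorem states that any translation-invariant Borel measure is a multiple of the Borel-Lebesgue measure.

3.10 Theorem. Let $\mu$ be a translation-invariant Borel measure on $\mathcal{B}(\mathbb{R}^n)$. Then $\mu = c\lambda$, where $\lambda$ is the Borel-Lebesgue measure on $\mathcal{B}(\mathbb{R}^n)$ and $c = \mu(C)$ ($C$ stands for a unit cube).

Proof. For each $x \in \mathbb{R}$, define

$$I_x = \begin{cases} [0, x), & x > 0 \\ \emptyset, & x = 0 \\ [x, 0), & x < 0 \end{cases}$$

and

$$\operatorname{sgn} x = \begin{cases} 1, & x > 0 \\ -1, & x < 0. \end{cases}$$

Denote

$$f(x_1, \dots, x_n) = \operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big)\, \mu(I_{x_1} \times \cdots \times I_{x_n}). \tag{3.10}$$

We show that $f$ defined in (3.10) is additive and continuous in each variable separately. Without loss of generality, we show it with respect to $x_1$. Let $x_1 = x + y$.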

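Case 1. Suppose $x > 0$ and $y > 0$. Then,

and

where

$$R_1 = I_x \times I_{x_2} \times \cdots \times I_{x_n} \quad \text{and} \quad R_2 = [x, x + y) \times I_{x_2} \times \cdots \times I_{x_n}.$$

Since $x$, $y$, and $x + y$ are all positive,

$$\operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big) = \operatorname{sgn}(x \cdot x_2 \cdots x_n) = \operatorname{sgn}(y \cdot x_2 \cdots x_n). \tag{3.10a}$$

From (3.10a),

and since $\mu$ is translation invariant,

Case 2. Suppose $x + y > 0$ and $x > 0$, $y < 0$. Then,

$$\operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big) = \operatorname{sgn}(x \cdot x_2 \cdots x_n) = -\operatorname{sgn}(y \cdot x_2 \cdots x_n). \tag{3.10b}$$

Since $I_{x+y} = [0, x + y) = [0, x) \setminus [x + y, x)$, $\lambda([x + y, x)) = \lambda([y, 0))$, and because $\mu$ is translation invariant, using (3.10b) we have that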

The other combinations of $x$ and $y$ are left for the reader. (See Problem 3.20.)

Now, we prove continuity of $f$ at zero. Let $\{a_k\}$ be a sequence convergent to zero from the right. Then $\{a_k\} \subseteq \mathbb{R}_+$ and the sequence of sets $\{I_{a_k}\}$ is such that

The latter yields that

$$\bigcap_{k=1}^{\infty} \{I_{a_k} \times I_{x_2} \times \cdots \times I_{x_n}\} = I_0 \times I_{x_2} \times \cdots \times I_{x_n}.$$

By the definition, $I_0 = \emptyset$, and by continuity from above of $\mu$, we have that

$$\lim_{k \to \infty} f(a_k, x_2, \dots, x_n) = 0.$$

Similarly, by continuity from below of $\mu$, we have that

$$\lim_{k \to \infty} f(a_k, x_2, \dots, x_n) = 0$$

for $\{a_k\} \uparrow 0$. In addition, $f(0, x_2, \dots, x_n) = 0$ by the definition of $f$. By Corollary 3.8,

where $C = [0,1) \times \cdots \times [0,1)$. On the other hand,

which, along with (3.10c), gives

Note that

Equations (3.10d) and (3.10e) tell us that for any rectangle $R$ all of whose sides lie on the corresponding coordinate axes,

For an arbitrarily positioned rectangle $R$ all of whose sides are parallel to the corresponding coordinate axes, (3.10f) still holds true due to the translation invariance of $\mu$.

By $\mu_0 = \mu(C)\lambda_0$ we define an elementary content on the semi-ring $\mathcal{S}$ of half-open rectangles. Then, by $\tilde{\mu} = \mu(C)\lambda$ we also have a Borel measure on $\mathcal{B}$. Now, we have three Borel measures on $\mathcal{B}$: $\mu$, $\tilde{\mu}$, and the (unique, as $\mathcal{S}$ is $\cap$-stable) extension of $\mu_0$ from $\mathcal{S}$ to $\mathcal{B}$. All three coincide on $\mathcal{S}$ and therefore must be equal on $\mathcal{B}$. □

3.11 Example (Cantor ternary set). Consider the following family of subsets of $[0,1]$. Let

$$\Omega = C_0 = [0,1], \quad G_1 = (\tfrac{1}{3}, \tfrac{2}{3}), \quad C_1 = C_0 \setminus G_1, \quad G_2 = (\tfrac{1}{9}, \tfrac{2}{9}) \cup (\tfrac{7}{9}, \tfrac{8}{9}), \quad C_2 = C_1 \setminus G_2, \ \dots,$$

as depicted in Figure 3.1 below:

Figure 3.1

Therefore, each $C_n$ is the union of $2^n$ closed intervals, while each $G_n$ is the union of $2^{n-1}$ open intervals. Also,

and $(C_n)$ is a monotone decreasing sequence of sets. The Cantor set is defined as

$$C = \bigcap_{n=0}^{\infty} C_n$$

and it can be characterized as follows.

1) $C$ is closed as the intersection of closed sets.

2) Each $C_n$ contains $2^n$ closed disjoint intervals $F_1(n), \dots, F_{2^n}(n)$. Each of these intervals is a term of the monotone decreasing sequence $\{F_k(n)\}\downarrow$ with $d(F_k(n)) = \lambda(F_k(n)) \downarrow 0$, $n \to \infty$. By applying Cantor's Theorem 5.4, Chapter 2, we conclude that, for all $k = 1, 2, \dots$,

$$\bigcap_{n=1}^{\infty} F_k(n)$$

consists of exactly one point. In other words, $C$ is a union of isolated points and therefore nowhere dense.

5) The Lebesgue measure of $G_n$ is $\lambda(G_n) = \tfrac{1}{2}\left(\tfrac{2}{3}\right)^n$, since

6) Thus

Hence $\lambda(C) = 1 - 1 = 0$ and therefore $C$ is a Borel $\lambda$-null set.

7) $C$ is not empty, since $C$ contains all boundary points of the sets $C_n$, which are $0, 1, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{9}, \tfrac{2}{9}, \tfrac{7}{9}, \tfrac{8}{9}, \dots$. The boundary points have the following ternary representations:

$1 = 1.0$ (or) $= 0.22222\dots$ (in triadic representation)

$\tfrac{1}{3} = 0.1$ (or) $= 0.02222\dots$

Each set $C_n$ has exactly $2^n$ boundary points, each of which has a unique triadic representation consisting of all $n$-tuples of digits 0 or 2. Observe that $2^n = |\mathcal{P}(A)|$, where $A$ is an $n$-element set. Therefore, $C$ is equivalent to the set of all subsets of natural numbers, which has the cardinality of the continuum. In other words, $C \sim \mathbb{R}$. Therefore, the Cantor set is an example of a noncountable Borel $\lambda$-null set.
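The construction of the sets $C_n$ in Example 3.11 can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the helper names `cantor_step`, `cantor_levels` and `total_length` are hypothetical. It builds $C_0, \dots, C_5$ by removing open middle thirds and confirms that $C_n$ consists of $2^n$ intervals of total length $(2/3)^n$, so that $\lambda(G_n) = \lambda(C_{n-1}) - \lambda(C_n) = \tfrac{1}{2}(\tfrac{2}{3})^n$.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Remove the open middle third of every closed interval (a, b) in the list."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result


def cantor_levels(n):
    """Return [C_0, C_1, ..., C_n], each as a list of closed-interval endpoints."""
    levels = [[(Fraction(0), Fraction(1))]]
    for _ in range(n):
        levels.append(cantor_step(levels[-1]))
    return levels


def total_length(intervals):
    """Sum of the lengths of disjoint intervals, i.e. their Lebesgue measure."""
    return sum(b - a for a, b in intervals)


levels = cantor_levels(5)
for n, c_n in enumerate(levels):
    # C_n is a union of 2^n closed intervals of total length (2/3)^n.
    assert len(c_n) == 2 ** n
    assert total_length(c_n) == Fraction(2, 3) ** n

# Measure of the set removed at step n: lambda(G_n) = lambda(C_{n-1}) - lambda(C_n).
removed = [total_length(levels[n - 1]) - total_length(levels[n]) for n in range(1, 6)]
```

Since $\lambda(C_n) = (2/3)^n \to 0$, the sketch is consistent with $\lambda(C) = 0$ in item 6).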

PROBLEMS

3.1

Let $H = \{x = (x_1, \dots, x_n) \in \mathbb{R}^n : x_i = a \in \mathbb{R}\}$ be a hyperplane orthogonal to the $i$th coordinate axis. Show that $H$ is a $\lambda$-null Borel set. [Hint: 1) Show that $H$ is closed in $(\mathbb{R}^n, \tau_e)$ and hence Borel. 2) Find a relevant countable cover of $H$ by rectangles from $\mathcal{S}$ and apply Lemma 3.6.]

3.2

Show that each countable subset of $\mathbb{R}^n$ is a Borel $\lambda$-null set. [Hint: Use Problem 3.1.]

3.3

Show that $f$ defined by (*) in Remark 3.5 (iii) is an extended distribution function.

3.4

Let $f_1$ and $f_2$ be two extended distribution functions and let $\mu_1$ and $\mu_2$ be the corresponding Borel-Lebesgue-Stieltjes measures induced by these functions. Show that $\mu_1 = \mu_2$ if and only if $f_1 - f_2 = c$, where $c$ is a constant function.

3.5

Let $\mathcal{D}_e$ be the set of all extended distribution functions. Show that $\mathcal{D}_e$ is a semilinear space over $\mathbb{R}_+$.

3.6

Let $f_1$ and $f_2$ be two extended distribution functions. If $\mu_1$ and $\mu_2$ are the corresponding Borel-Lebesgue-Stieltjes measures induced by $f_1$ and $f_2$, show that for any nonnegative scalars $a_1$ and $a_2$, $\mathcal{D}(a_1\mu_1 + a_2\mu_2) = \{a_1 f_1 + a_2 f_2; \mu\}$, where $\mathcal{D}$ is defined in Remark 3.5 (iii).

3.7

Let $f: \mathbb{R} \to \mathbb{R}$ be an extended distribution function and let $\mu_f$ be the corresponding Borel-Lebesgue-Stieltjes measure on $\mathcal{B}(\mathbb{R})$. Show that

a) $\mu_f((a, b)) = f(b-) - f(a)$

b) $\mu_f([a, b]) = f(b) - f(a-)$

c) $\mu_f([a, b)) = f(b-) - f(a-)$

d) $f$ is continuous if and only if $\mu_f(\{x\}) = 0$, $x \in \mathbb{R}$.

3.8

Let $f$ be the extended distribution function on $\mathbb{R}$ given by

and let $\mu_f$ be the corresponding Borel-Lebesgue-Stieltjes measure. Evaluate the measure of the following sets:

3.9

Let $f$ be a distribution function and let $\mu_f$ denote the Borel-Lebesgue-Stieltjes measure induced by $f$. Justify with a proof or give a counter-argument:

a) Must $f$ be an extended distribution function?

b) Suppose $g$ is a function defined by (*) of Remark 3.5 (iii). Is $g$ a distribution function? If your answer is yes, is $g = f$?

3.10

Let $\mu$ be an atomic measure ($\mu = \sum_{i=0}^{\infty} a_i \varepsilon_{b_i}$).

1) Is $\mu$ always a Borel-Lebesgue-Stieltjes measure? If it is not, give a condition under which $\mu$ is a Borel-Lebesgue-Stieltjes measure.

2) Find in this case $\{f; \mu\}$.

3) Plot one such $f$.

3.11

Consider the Borel $\sigma$-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R}^n)$ generated by the usual topology. Show that, for any Borel set $B \in \mathcal{B}$ and any point $x \in \mathbb{R}^n$, $B + x = \{z \in \mathbb{R}^n : z = y + x,\ y \in B\} \in \mathcal{B}$. [Hint: Show that $\Sigma_x = \{A \in \mathcal{B} : A + x \in \mathcal{B}\}$ is a $\sigma$-algebra.]

3.12

Let $(\Omega, \mathcal{B}, \mu)$ be a Borel measure space, such that the Borel $\sigma$-algebra $\mathcal{B}$ is generated by a Hausdorff topological space $\tau$, and $\mu$ is a finite Borel measure. For any subset $Q \subseteq \Omega$ denote by $\mathcal{K}(Q)$ the collection of all compact subsets of $Q$. Show that a subcollection $\mathcal{M} \subseteq \mathcal{B}$ of all sets $B \in \mathcal{B}$ such that

is a monotone system in $\Omega$, i.e. a subcollection of those Borel sets that can be approximated "from below" by compact subsets is a monotone system.

3.13

Let $(\Omega, \mathcal{B}, \mu)$ be a special case of the Borel measure space introduced in Problem 3.12, namely, let $\Omega = \mathbb{R}^n$ and let the Borel $\sigma$-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R}^n)$ be generated by the usual topology $\tau_e$. Again assume that $\mu$ is a finite Borel measure. Show that in this case every Borel set $B$ can be approximated from below by a compact subset $K \subseteq B$; i.e. for every $\varepsilon > 0$ there is a compact subset $K(B) \subseteq B$, such that

3.14

Generalize Problem 3.13 allowing $\mu$ to be a $\sigma$-finite Borel measure.

3.15

Let $(\mathbb{R}^n, \mathcal{B}, \mu)$ be a Borel measure space, where the Borel $\sigma$-algebra $\mathcal{B}$ is generated by the usual topology $\tau_e$ in $\mathbb{R}^n$ and $\mu$ is a finite Borel measure on $\mathcal{B}$. Show that every Borel set can be "approximated from above" by an open set, i.e. if $C(B)$ is the collection of all open supersets of $B$, then

3.16

Show that there is a non-Borel set in $\mathcal{P}(\mathbb{R})$.

[Hints: We call $x, y \in \mathbb{R}$ equivalent ($x \sim y$) if and only if $x - y \in \mathbb{Q}$ (rational numbers). For every real number $x \in \mathbb{R}$ we assign another real number $y$ to the class $A_x$ if and only if $x - y \in \mathbb{Q}$.

1) Show that $\sim$ is indeed an equivalence relation on $\mathbb{R}$.

Let $\mathbb{R}|_\sim$ be the quotient set of $\mathbb{R}$ modulo $\sim$. Using the Axiom of Choice we select one element from each class of $\mathbb{R}|_\sim$ that belongs to the set $(0,1]$. Denote by $A$ the collection of all such elements.

2) Show that such a selection is possible taking into account the Axiom of Choice; i.e. it can be shown that for all $x \in \mathbb{R}$, $A_x \cap (0,1] \neq \emptyset$.

3) Show that the set $A$ has the following properties:

$\mathbb{R}$ can be restored from $A$ as

4) Finally, let $\tilde{Q} = \mathbb{Q} \cap (0,1]$. Then $\bigcup_{q \in \tilde{Q}} (q + A) \subseteq (0,2]$. If $A$ is a Borel set (and this is the assumption that will lead to a contradiction), then by Problem 3.11, $x + A$ is a Borel set too; and by the translation-invariance of the Borel-Lebesgue measure $\lambda$, $\lambda(x + A) = \lambda(A)$, implying that

$$\lambda\Big(\bigcup_{q \in \tilde{Q}} (q + A)\Big) = \sum_{q \in \tilde{Q}} \lambda(q + A) \le \lambda((0,2]) = 2.$$

Thus the above series is finite; and since the $\lambda(q + A)$ values are equal for all $q \in \tilde{Q}$, each of them must be zero, which implies that $\lambda(q + A) = 0$ for all $q \in \mathbb{Q}$. But

$$\mathbb{R} = \sum_{q \in \mathbb{Q}} (q + A) \ \Longrightarrow\ \lambda(\mathbb{R}) = \sum_{q \in \mathbb{Q}} \lambda(q + A) = 0,$$

which is an absurdity. Thus, our assumption that $A$ is a Borel set was wrong.]

3.17

Let $\lambda$ denote the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$. Show that for each Borel set $B$ and $\varepsilon > 0$, there is a countable cover of $B$ by disjoint semi-open cubes $\{C_k\}$ such that

$$\sum_{k=1}^{\infty} \lambda_0(C_k) - \lambda(B) < \varepsilon.$$

In particular,

[Hint: Use Problem 3.15.]

3.18

Show that if $N$ is a negligible set in $(\mathbb{R}^n, \mathcal{B}, \lambda)$, for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that

3.19

Show that if $N$ is a subset of $\mathbb{R}^n$, and for each $\varepsilon > 0$, there is a countable cover of $N$ by semi-open (not necessarily disjoint) cubes such that $\sum_{k=1}^{\infty} \lambda_0(C_k)$ [...] $0$, there is a finite cover of $B$ by semi-open rectangles $D_1, \dots, D_N$ such that

3.23

For any $\varepsilon > 0$, construct an open set $D$ in $(\mathbb{R}, \tau_e)$, which is dense in $\mathbb{R}$ and with $\lambda(D) \le \varepsilon$.

3.24

Is every $\sigma$-finite Borel measure on $\mathcal{B}(\mathbb{R}^n)$ also a Borel-Lebesgue-Stieltjes measure?

3.25

Give a direct proof, alternative to Theorem 3.1, that the Lebesgue elementary content $\lambda_0$ is $\sigma$-additive on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$, not using any prior extension to the Lebesgue content $\lambda^0$ on $\mathcal{R}(\mathbb{R}^n)$, as Theorem 3.1 does.

3.26

Give a direct proof, alternative to Theorem 3.4, that the Lebesgue-Stieltjes elementary content is $\sigma$-additive, with no prior extension to the Lebesgue-Stieltjes content and bypassing $\emptyset$-continuity.

NEW TERMS:

Lebesgue outer measure
Lebesgue $\sigma$-algebra
Lebesgue measure
Borel-Lebesgue measure
Borel measure
Borel measure space
Borel-Lebesgue-Stieltjes measure
Borel outer measure
Lebesgue-Stieltjes measure
Lebesgue-Stieltjes content
distribution function
extended distribution function
translation-invariant Borel measure
Cantor ternary set
measure of a hyperplane
non-Borel set

4. IMAGE MEASURES

In Remark 3.5 (iii) we saw how Borel-Lebesgue-Stieltjes measures can be generated by measurable functions belonging to the class of so-called extended distribution functions. In this section we will also generate measures by elements of the far more general $\mathcal{C}^{-1}$-class of measurable functions. The very process of generating the measure is totally different from that in Remark 3.5 (iii), and the two notions should not be confused with each other. Section 3, Chapter 4, is a relevant prerequisite to this material.

4.1 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f: (\Omega, \Sigma) \to (\Omega', \Sigma')$ be a measurable function. Then the set function $A' \mapsto \mu f^*(A') = \mu(f^*(A'))$ on $\Sigma'$ is a measure. □ (See Problem 4.1.)

4.2 Definition and Notation. The measure $\mu f^*$ in Proposition 4.1 induced by a measurable function $f$ is called an image measure. Notice that directly from Definition 2.1 (viii), $\mu f^*(A')$ can alternatively be viewed as $\mu(\{\omega \in \Omega : f(\omega) \in A'\})$ or, shortly, as $\mu\{f \in A'\}$. □

4.3 Proposition. Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be such that $L(x) = \alpha x + b$, where $\alpha \in \mathbb{R} \setminus \{0\}$ and $b \in \mathbb{R}^n$. Then the Borel-Lebesgue measure $\lambda$ on $\mathcal{B}(\mathbb{R}^n)$ has the property $\lambda L^* = \frac{1}{|\alpha|^n}\lambda$. Specifically, if $\alpha = 1$ we have $\lambda L^* = \lambda$, which shows that the Borel-Lebesgue measure is translation-invariant.

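Proof.

1) Let $f(x) = \alpha x$ (called a homothetic function), where $x \in \mathbb{R}^n$ and $\alpha\ (\neq 0) \in \mathbb{R}$. Let $\lambda$ be the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}$. We show that

$$\lambda f^* = \frac{1}{|\alpha|^n}\lambda.$$

Take $(a, b] \in \mathcal{S}$. Then,

$$f^*((a, b]) = \begin{cases} \prod_{i=1}^{n} \left(\frac{a_i}{\alpha}, \frac{b_i}{\alpha}\right], & \alpha > 0 \\ \prod_{i=1}^{n} \left[\frac{b_i}{\alpha}, \frac{a_i}{\alpha}\right), & \alpha < 0, \end{cases}$$

which implies that

$$\lambda f^*((a, b]) = \frac{1}{\alpha^n}\lambda((a, b]) \quad \text{for } \alpha > 0$$

and

$$\lambda f^*((a, b]) = \frac{1}{\alpha^n}(-1)^n \lambda((a, b]) \quad \text{for } \alpha < 0,$$

and thus

$$\lambda f^* = \frac{1}{|\alpha|^n}\lambda \quad \text{on } \mathcal{S}.$$

As a continuous map relative to the usual topology, $f$ is Borel and, consequently, $\lambda f^*$ is a Borel measure on $\mathcal{B}$. Obviously, $\frac{1}{|\alpha|^n}\lambda$ is also a Borel measure on $\mathcal{B}$. Since $\lambda f^*$ and $\frac{1}{|\alpha|^n}\lambda$ are $\sigma$-finite on $\mathcal{S}$ and coincide on $\mathcal{S}$ (being a $\cap$-stable generator of $\mathcal{B}$), and since $\frac{1}{|\alpha|^n}\lambda_0$ is $\sigma$-additive, by Corollary 2.14, they should also coincide on $\mathcal{B}$.

2) Let $g(x) = x + b$. Similarly, we can show (see Problem 4.2) that $\lambda g^*$ is a Borel measure on $\mathcal{B}$ and that $\lambda g^* = \lambda$. Therefore, $\lambda$ is translation-invariant. Finally,

The proposition is therefore proved.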
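Part 1 of the proof can be illustrated numerically on a single half-open box. The sketch below is an illustration under stated assumptions, not the author's construction: `volume` and `preimage_endpoints` are hypothetical helper names, and a box is represented only by its per-coordinate endpoints. It checks that the preimage of $(a, b]$ under $f(x) = \alpha x$ has volume $\lambda((a, b]) / |\alpha|^n$ for both signs of $\alpha$.

```python
from fractions import Fraction
from math import prod


def volume(lo, hi):
    """Lebesgue measure of a box with per-coordinate endpoints lo[i] <= hi[i]."""
    return prod(h - l for l, h in zip(lo, hi))


def preimage_endpoints(a, b, alpha):
    """Endpoints of the preimage of the box (a, b] under f(x) = alpha * x, alpha != 0.

    For alpha < 0 the endpoints swap (the box becomes closed on the left),
    which does not change its volume.
    """
    lo = [min(ai / alpha, bi / alpha) for ai, bi in zip(a, b)]
    hi = [max(ai / alpha, bi / alpha) for ai, bi in zip(a, b)]
    return lo, hi


# A box in R^3 and two scaling factors, one positive and one negative.
a = [Fraction(0), Fraction(1), Fraction(-2)]
b = [Fraction(2), Fraction(4), Fraction(3)]
for alpha in (Fraction(3), Fraction(-1, 2)):
    lo, hi = preimage_endpoints(a, b, alpha)
    # lambda f*((a, b]) equals lambda((a, b]) / |alpha|^n with n = 3.
    assert volume(lo, hi) == volume(a, b) / abs(alpha) ** len(a)
```

The check only covers the generator $\mathcal{S}$; the extension to all Borel sets is the uniqueness argument given in the proof.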

4.4 Remark. Proposition 4.3 tells us that the Borel-Lebesgue measure is invariant under translation, which is a sort of motion defined as $T_b(x) = x + b$. In the two-dimensional Euclidean space we know another form of motion, called rotation. A figure under rotation $R$ and subsequent translation $T_b$ is transformed into a congruent one. We can show that an arbitrary Borel set in $\mathbb{R}^n$ rotated and then translated preserves its volume. (See Corollary 2.3, Chapter 7.) In the $n$-dimensional Euclidean space, instead of rotation, we use an orthogonal transformation. More precisely, in $\mathbb{R}^n$ an orthogonal transformation is in the form of an orthogonal $n \times n$ matrix; recall that an $n \times n$ nonsingular matrix $R$ is orthogonal if $RR^T = R^T R = I$ (the identity matrix). The composition $M = T_b \circ R$ is an example of a motion. Generally, a bijective map $M$ from one metric space $(X, d)$ onto another metric space $(Y, \rho)$ is called an isometry if it preserves the distance, i.e. if for every pair $x, y \in X$, $d(x, y) = \rho(M(x), M(y))$. Two such metric spaces are said to be isometric. A motion is a special case of isometry when $Y = X$, $\rho = d$. In the Euclidean space $(\mathbb{R}^n, d_e)$, a motion $M$ can be represented by the composition $T_b \circ R$, where $T_b$ is a translation map and $R$ is an orthogonal matrix. As a continuous map, the motion is also Borel. It can be shown (see Problem 4.10) that the Borel-Lebesgue measure is motion-invariant, i.e. $\lambda M^* = \lambda$.

4.5 Examples.

(i) Let $(\Omega, \Sigma, \mu)$ be a probability space and let $(\Omega', \Sigma')$ be a measurable space. Then any $\Sigma$-$\Sigma'$-measurable function $f: \Omega \to \Omega'$ is called a random variable. The corresponding image measure $\mu f^*$ is called the probability distribution (of the random variable $f$). Observe that in probability theory, a probability measure is denoted by $P$ and a random variable is denoted by upper case letters like $X$, $Y$ or $Z$. In most applications, $\Omega'$ is the numeric set $\mathbb{R}^n$ or a subset of $\mathbb{R}^n$, and $\Sigma'$ is the corresponding Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$ or its trace on the subset. We would like to emphasize that a measurable function, say $X$, can only be a random variable if it is associated with a particular probability measure $P$, along with which it induces the probability distribution. The latter specifies the random variable. In other words, measurable functions may share the same measurable space, but as far as probability theory goes, they differ if they induce different probability distributions (or more precisely, different classes of probability distributions categorized by their parameters).

(ii) Let $(\Omega, \Sigma, P)$ be a probability space and $X: \Omega \to \{0, 1, \dots\}$ be a random variable such that $PX^*$ is a Poisson measure $\pi_\lambda$. Then the random variable $X$ is called a Poisson random variable. Similarly, a random variable $X: \Omega \to \{0, \dots, n\}$ is called binomial if $PX^*$ is a binomial measure $\beta_{n,p}$. A random variable $X$ is called (discrete) uniformly distributed if $PX^* = \sum_{k=0}^{n} \frac{1}{n+1}\varepsilon_k$. As it was pointed out in (i), $X: \Omega \to \{0, \dots, n\}$ is just a measurable function (which can be uniform or binomial), and it becomes a random variable upon specification of its distribution $PX^*$ or, even earlier, the probability measure $P$. These are examples of so-called discrete random variables. The construction of probability distributions of continuous random variables (i.e., those whose ranges are continuums) requires integration and the concept of a density. The latter will be developed in Chapters 6 and 8.
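For a finite probability space the image measure $PX^*$ of Example 4.5 can be computed directly from its definition $PX^*(A') = P\{X \in A'\}$. The following is a small sketch under assumptions that are not in the text: the measure is stored as a dictionary from points to masses, the function name `image_measure` is hypothetical, and the two-dice sample space is chosen only for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def image_measure(mu, f):
    """Image measure of a finite discrete measure mu (dict: point -> mass) under f.

    The mass of a point y in the target space is mu(f^{-1}({y})).
    """
    out = defaultdict(Fraction)
    for omega, mass in mu.items():
        out[f(omega)] += mass
    return dict(out)


# Probability space: two fair dice with the uniform measure P.
P = {(i, j): Fraction(1, 36) for i in range(1, 7) for j in range(1, 7)}

# Random variable X(omega) = sum of the two faces; dist is its distribution PX*.
dist = image_measure(P, lambda omega: omega[0] + omega[1])
```

Here `dist` plays the role of the probability distribution: the same function paired with a different measure `P` would induce a different distribution, which is the point made in (i).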

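PROBLEMS

4.1

4.2

Prove part 2 of Proposition 4.3.

4.3

Let $(\Omega, \Sigma, \mu)$ be a measure space with $\Omega = \mathbb{R}$, $\Sigma = \{A \subseteq \mathbb{R} : \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$, and let $\mu(A) = 0$ for $A \preceq \mathbb{N}$ and $\mu(A) = 1$ for $A^c \preceq \mathbb{N}$. Let $\Omega' = \{0, 1\}$, $\Sigma' = \mathcal{P}(\Omega')$. Define $[\Omega, \Omega', f]$ as

$$f(x) = \begin{cases} 0, & \text{if } x \text{ is rational} \\ 1, & \text{if } x \text{ is irrational.} \end{cases}$$

Prove that $f$ is $\Sigma$-$\Sigma'$-measurable and determine $\mu f^*$.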

4.4

What are the traces of Borel $\sigma$-algebras on $\Omega' = \{0, 1, \dots\}$ and $\Omega' = \{0, 1, \dots, n\}$ introduced in Example 4.5 (ii)?

4.5

Let $\mathcal{M}(\mathbb{R}^n)$ be the collection of all motions on $(\mathbb{R}^n, d_e)$. Show that $(\mathcal{M}(\mathbb{R}^n), \circ)$, where $\circ$ is the composition operator, forms a group with unity.

4.6

Let $f$ be a homothetic function ($f(x) = \alpha x$) defined in Proposition 4.3, part 1, $\lambda$ the Borel-Lebesgue measure on $\mathcal{B}(\mathbb{R}^n)$, $\lambda^*$ the Lebesgue outer measure, $\mathcal{L}^*$ the $\sigma$-algebra of Lebesgue measurable sets, and $\lambda_0^*$ the Lebesgue measure on $\mathcal{L}^*$. Let $\mu^*$ be the outer measure generated by the image measure $\lambda f^*$, $\mathcal{B}_\mu^*$ the $\sigma$-algebra of all $\mu^*$-measurable sets, and $\mu_0^* = \operatorname{Res}_{\mathcal{B}_\mu^*}\mu^*$. Show that:

c) $\mathcal{B}_\mu^* = \mathcal{L}^*$.

4.7

Generalize Problem 4.6 by letting $f$ be the affine map $f(x) = \alpha x + b$, $\alpha \neq 0$, $b \in \mathbb{R}^n$.

4.8

Show that the Lebesgue measure $\lambda_0^*$ on $\mathcal{L}^*$ is translation-invariant.

4.9

Let $\mu$ be a translation-invariant Borel measure on $\mathcal{B}(\mathbb{R}^n)$ and let $\mu^*$ be the outer measure produced by $(\mathcal{B}(\mathbb{R}^n), \mu)$. Show that:

a) $\mu^* = \mu(C)\lambda^*$, where $C$ is the unit cube.

b) $\mathcal{B}_\mu^* = \mathcal{L}^*$.

4.10

Show that the Borel-Lebesgue measure is motion-invariant. (See also Chapter 7.)

NEW TERMS:

$\mathcal{C}^{-1}$-class of functions
image measure
homothetic function
orthogonal transformation
isometry
isometric metric spaces
motion
motion-invariant measure
random variable
probability distribution
Poisson random variable
binomial random variable
discrete random variable
translation-invariance of Lebesgue measure

5. EXTENDED REAL-VALUED MEASURABLE FUNCTIONS

5.1 Definitions and Notations.

(i) Recall (Section 3, Chapter 4) that $\mathcal{C}^{-1}(\Omega, \Sigma; \Omega', \Sigma')$ denotes the collection of all measurable functions from a measurable space $(\Omega, \Sigma)$ to a measurable space $(\Omega', \Sigma')$. If $\Omega' = \mathbb{R}$ and $\Sigma' = \mathcal{B}(\mathbb{R})$, then $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ will denote the class of all real-valued measurable functions on a measurable space $(\Omega, \Sigma)$. The class of all complex-valued functions will be denoted by $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C}) = \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C}, \mathcal{B}(\mathbb{C}))$. Using the notion of product measures (Section 6, Chapter 6) we can show that a function $f = u + iv \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ if and only if $u, v \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$.

(ii) In Examples 1.2 (iv) and 10.19 (i), Chapter 3, we constructed a topology on the extended real line $\bar{\mathbb{R}}$ via "two-point compactification." The formed topological space $(\bar{\mathbb{R}}, \bar{\tau}_e)$ included all open sets of $(\mathbb{R}, \tau_e)$ and, in addition, open sets of types

where $O \in \tau_e$. The corresponding Borel $\sigma$-algebra $\mathcal{B}(\bar{\mathbb{R}}) = \Sigma(\bar{\tau}_e)$, therefore, consists of all sets of $\mathcal{B}(\mathbb{R})$ and combinations of unions of Borel sets with the sets $\{+\infty\}$ and $\{-\infty\}$. In this section, we will be concerned with the class of all extended real-valued functions $f: \Omega \to \bar{\mathbb{R}}$ which are $\Sigma$-$\mathcal{B}(\bar{\mathbb{R}})$-measurable, where $\Sigma$ is a $\sigma$-algebra in $\Omega$. We denote such a class by $\mathcal{C}^{-1}(\Omega, \Sigma; \bar{\mathbb{R}})$ (or sometimes shortly by $\mathcal{C}^{-1}$ if a measurable space $(\Omega, \Sigma)$ is previously specified).

We give a simple criterion for measurability of $\mathcal{C}^{-1}$-functions.

5.2 Proposition. A function $f \in \mathcal{C}^{-1}$ is measurable if and only if, for [...] $r\}$, $r \in \mathbb{Q}$.

Conversely, if

then there exists an $r_0 \in \mathbb{Q}$ such that $\omega_0 \in \{f(\omega) < r_0\} \cap \{g(\omega) > r_0\}$ and $f(\omega_0) < r_0 < g(\omega_0)$, implying that $f(\omega_0) < g(\omega_0)$. Thus $\omega_0 \in \{f(\omega) < g(\omega)\}$. Now the statement follows from Proposition 5.3 (ii, iii). □

5.5 Definitions. In the situations below we will deal with spaces of measurable functions that have not occurred before. We discuss the following constructions.

(i) Let $\mathbb{F}$ be a field and let $\mathcal{H}$ be a vector lattice over $\mathbb{F}$ and a commutative ring with unity. Observe that $(\mathcal{H}, \mathbb{F})$ is an algebra and $(\mathcal{H}, \cdot)$ is a multiplicative Abelian semigroup with unity (i.e. a group that perhaps fails to have multiplicative inverses); call it shortly an $\mathcal{H}$-space over $\mathbb{F}$. Throughout the remainder of this book, as an $\mathcal{H}$-space, we shall consider a class of functions (extended real- or complex-valued over the field $\mathbb{R}$ or $\mathbb{C}$). For instance, the space of all continuous functions is an $\mathcal{H}$-space over $\mathbb{R}$. [Note that the term $\mathcal{H}$-space is not common in real analysis literature and is restricted to the use in this book.]

(ii) Let $\bar{\mathbb{R}}^\Omega$ be the set of all functions from $\Omega$ to $\bar{\mathbb{R}}$ (as we defined it in Section 5, Chapter 1), and let $(\bar{\mathbb{R}}^\Omega, \tau_p)$ be the topology of pointwise convergence (cf. Definition 5.11, Chapter 3) generated by the compact topology $(\bar{\mathbb{R}}, \bar{\tau}_e)$ in each of the factor spaces. Let us call $(\bar{\mathbb{R}}^\Omega, \tau_p)$ the extended topology of pointwise convergence. Let $\mathcal{H}$ be a subset of $(\bar{\mathbb{R}}^\Omega, \tau_p)$ such that it is an $\mathcal{H}$-space over $\mathbb{R}$. We call $\mathcal{H}$ a closed $\mathcal{H}$-space if $(\mathcal{H}, \tau_p)$ contains the limits of all $\tau_p$-convergent sequences. In other words, it contains the limit of every pointwise convergent sequence (observe that since $(\bar{\mathbb{R}}, \bar{\tau}_e)$ is Hausdorff, any pointwise limit is unique). For instance, the space of all continuous functions is not a closed $\mathcal{H}$-space.

(iii) Consider the subspace $(\mathcal{C}^{-1}, \tau_p)$ of all measurable functions structured in terms of the extended topology of pointwise convergence. The next theorem states that until now $(\mathcal{C}^{-1}, \tau_p)$ is the widest known class of functions, second to $\bar{\mathbb{R}}^\Omega$.

5.6 Proposition. $(\mathcal{C}^{-1}, \tau_p)$ is a closed $\mathcal{H}$-space over $\mathbb{R}$, that is, for any $f, g \in \mathcal{C}^{-1}$ and for $\{f_n : n = 1, 2, \dots\} \subseteq \mathcal{C}^{-1}$:

(ii) $f \pm g \in \mathcal{C}^{-1}$.

(iii) $f \cdot g \in \mathcal{C}^{-1}$.

(iv) $\sup\{f_n\} \in (\mathcal{C}^{-1}, \tau_p)$ and $\inf\{f_n\} \in (\mathcal{C}^{-1}, \tau_p)$; specifically, it follows that $\mathcal{C}^{-1}$ is a lattice, and thus with any $f \in \mathcal{C}^{-1}$, also $|f| \in \mathcal{C}^{-1}$.

(vi) If $f_n \to f$ in the extended topology of pointwise convergence, then $f \in (\mathcal{C}^{-1}, \tau_p)$.

Proof.

(i) is obvious.

(ii) By (i), $a - g \in \mathcal{C}^{-1}$, implying that

$$\{f + g < a\} = \{f < a - g\} \in \Sigma.$$

Therefore, by Problem 5.2 (i), $f \pm g \in \mathcal{C}^{-1}$.

(iii) $\{f^2 \ge a\} = \Omega$ $(a \le 0)$ $\Rightarrow$ $\{f^2 \ge a\} \in \Sigma$, and

$$\{f^2 \ge a\} = \{f \ge \sqrt{a}\} \cup \{f \le -\sqrt{a}\} \quad (a > 0) \ \Rightarrow\ f^2 \in \mathcal{C}^{-1}.$$

The statement follows from the representation $fg = \tfrac{1}{4}(f + g)^2 - \tfrac{1}{4}(f - g)^2$.

(iv) We show that

$\omega_0 \in \{\sup\{f_n\} \le a\}$ if and only if $\sup\{f_n(\omega_0)\} \le a$ or equivalently

or, equivalently,

The latter implies that

$$\{\sup\{f_n\} \le a\} = \bigcap_{n=1}^{\infty} \{f_n \le a\}.$$

Let $\{f_n\} \subseteq \mathcal{C}^{-1}$. Then $\{-f_n\} \subseteq \mathcal{C}^{-1}$. The statement follows from

Now if $f \in \mathcal{C}^{-1}$, it implies that

$$|f| = \sup\{f, -f\} \in \mathcal{C}^{-1}.$$

(v) This statement directly follows from (iv).

(vi) $\lim_{n \to \infty} f_n = f$ if and only if $\overline{\lim}\, f_n = \underline{\lim}\, f_n = f$, and the statement follows from (v). □

PROBLEMS

5.1

Prove Corollary 5.3 (i) and (ii).

5.2

Prove that for $f, g \in \mathcal{C}^{-1}$,

(i) $\{f \le g\} \in \Sigma$.

(ii) $\{f = g\} \in \Sigma$.

(iii) $\{f \neq g\} \in \Sigma$.

5.3

Let f , g ~ ~ - ' . S h o w t h a t w ~ c o s ( f ~ ( ~ ) + 4 g ( w ) ) ~ ~ - ' .

5.4

Show that if $f^3 \in \mathcal{C}^{-1}$, then $f \in \mathcal{C}^{-1}$. Show that if $f^2 \in \mathcal{C}^{-1}$, then $f$ need not be in $\mathcal{C}^{-1}$.

5.5

Let $f, g \in \mathcal{C}^{-1}$ and let $A \in \Sigma$. Show that

5.6

Let $f: (a, b] \to \mathbb{R}$ be a

a) monotone function,

b) convex function,

c) function with at most countably many discontinuities.

Show that in each case, $f$ is $\mathcal{B}((a, b])$-$\mathcal{B}(\mathbb{R})$-measurable.

5.7

Prove the statement: $f \in \mathcal{C}^{-1}$ if and only if $\{f > d\} \in \Sigma$ for all $d \in D$, where $D \subseteq \mathbb{R}$ is any dense set in $\mathbb{R}$.

5.8

Show that if $f$ has a derivative at each point of $\mathbb{R}$, then this derivative is Borel-measurable.

NEW TERMS:

extended real-valued function
measurability of an extended real-valued function, criteria of
$\mathcal{H}$-space
extended topology of pointwise convergence
closed $\mathcal{H}$-space

6. SIMPLE FUNCTIONS

The present section is a direct precursor to integration, which we develop in the next chapter. The integral itself will be first defined for simple functions valued in a finite set of nonnegative reals.

6.1 Definition. We consider the following subclass of functions from $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$, which we call nonnegative simple functions, and denote this subclass by $\Psi_+(\Omega, \Sigma) = \Psi_+(\Omega, \Sigma; \mathbb{R})$. An element $s$ is said to belong to $\Psi_+$ or to be nonnegative simple if:

c) $s$ takes on only finitely many real values. □

6.2 Remarks.

(i) Let $s \in \Psi_+(\Omega, \Sigma)$. If there is an $n$-tuple of nonnegative real numbers $\{a_1, \dots, a_n\}$ and a finite decomposition $\sum_{k=1}^{n} A_k$ of $\Omega$ such that $s(\omega) = a_k$ for all $\omega \in A_k$, then the function $s$ (as in Figure 6.1) can obviously be represented as

$$s = \sum_{k=1}^{n} a_k \mathbf{1}_{A_k}. \tag{6.1}$$

Figure 6.1

In some cases we may need to deal with different decompositions of $\Omega$. Consequently, there are in general different finite representations or expansions of $s \in \Psi_+$ of type (6.1). However, there is obviously a unique one where (6.1) contains all different values $\{a_1, \dots, a_n\}$ of $s$. We wish to call such a representation (expansion) canonic.

(ii) For the upcoming material we will need some modifications of $\mathcal{H}$-spaces introduced in Definition 5.5. Let $\mathbb{F}$ be a field. Recall that $\mathbb{F}_+ \subseteq \mathbb{F}$ is called the semifield if all axioms of the field hold except for #4 (the existence of additive inverses, see Definition 7.5, Chapter 1). If $\mathcal{H}$ is a linear space over $\mathbb{F}$, the corresponding restriction $(\mathcal{H}; \mathbb{F}_+)$ is called a semi-linear space. If, in addition, $(\mathcal{H}; \mathbb{F})$ is a vector lattice, then $(\mathcal{H}; \mathbb{F}_+)$ is called a semi-linear lattice. Similarly, we can define corresponding restrictions of rings and algebras over $\mathbb{F}_+$, calling them quasirings and quasialgebras. If $\mathcal{H}$ is a semi-linear lattice over a semifield $\mathbb{F}_+$ and a commutative quasiring with unity over $\mathbb{F}_+$, then we call the pair $(\mathcal{H}; \mathbb{F}_+)$ a semi-$\mathcal{H}$-space.

(iii) In Chapter 8 (Section 4), we will also be using the notion of a simple function, which is just as in Definition 6.1, except that such functions are not necessarily nonnegative. The set of all such simple functions will be denoted by $\Psi(\Omega, \Sigma) = \Psi(\Omega, \Sigma; \mathbb{R})$.

6.3 Proposition. $(\Psi_+(\Omega, \Sigma); \mathbb{R}_+; \cdot)$ is a semi-$\mathcal{H}$-space. In other words, if $s, t \in \Psi_+$, then:

(ii) $s \cdot t \in \Psi_+$.

(iii) $\sup(s, t) \in \Psi_+$.

(iv) $\inf(s, t) \in \Psi_+$.

(See Problem 6.1.)

We denote by $(\bar{\Psi}_+(\Omega, \Sigma), \tau_p)$ the subspace of all extended real-valued nonnegative functions $f \in \mathcal{C}^{-1}$ for each of which there exists a monotone nondecreasing sequence $\{s_n\} \subseteq \Psi_+$ of nonnegative simple functions such that $f = \sup\{s_n\}$ in the topology of pointwise convergence. By Proposition 5.6 (iv), $\bar{\Psi}_+ \subseteq \mathcal{C}^{-1}$, i.e. $\bar{\Psi}_+$ consists of only measurable functions. The following proposition asserts that $\bar{\Psi}_+$ is a semi-$\mathcal{H}$-space and that it is the closure of $\Psi_+$ with respect to the topology of pointwise convergence.

6.4 Proposition. $\bar{\Psi}_+(\Omega, \Sigma)$ is a semi-$\mathcal{H}$-space over $\mathbb{R}_+$, i.e. if $f, g \in \bar{\Psi}_+(\Omega, \Sigma)$, then:

(ii) $f \cdot g \in \bar{\Psi}_+$.

(iii) $\sup(f, g) \in \bar{\Psi}_+$.

(iv) $\inf(f, g) \in \bar{\Psi}_+$.

Proof.

(i) Let $f = \sup\{s_n\}$, $g = \sup\{t_n\}$. Then

$$af + bg = a \sup s_n + b \sup t_n = \sup\{a s_n\} + \sup\{b t_n\}$$

and $a s_n, b t_n \in \Psi_+$. Furthermore, $\{a s_n\}$ and $\{b t_n\}$ are monotone nondecreasing.

(ii) The proof is similar to that for (i).

(iii) Let $w_n = \sup(s_n, t_n)$. Then obviously, $\sup\{w_n\}$ exists and equals $\sup\{f, g\}$ (why?).

(iv) The proof is similar to that for (iii). □

6.5 Theorem. Let $\mathcal{C}_+^{-1} = \mathcal{C}^{-1}(\Omega, \Sigma; \bar{\mathbb{R}}_+)$, i.e., the subclass of all nonnegative extended real-valued functions. Then $\mathcal{C}_+^{-1} = \bar{\Psi}_+$ and it is a closed semi-$\mathcal{H}$-space.

Proof. Evidently, $\bar{\Psi}_+ \subseteq \mathcal{C}_+^{-1}$. Therefore, we are left to prove that $\mathcal{C}_+^{-1} \subseteq \bar{\Psi}_+$. We will show that, for every $f \in \mathcal{C}_+^{-1}$, there is a monotone nondecreasing sequence $\{s_n\}$ of nonnegative simple functions from $\Psi_+$ such that $\sup\{s_n\} = f$. The latter is at the heart of the following construction. Let

For instance,

In other words,

and

Therefore, all sets $A_i(n)$, $i = 0, \dots, n2^n$, are disjoint and obviously $\Sigma$-measurable. Let us define

Both $f$ and $s_n$ are depicted in Figure 6.2.

Figure 6.2

Clearly $s_{n+1} \ge s_n$. Besides, $s_n(\omega) \le f(\omega) < s_n(\omega) + 2^{-n}$ for all $\omega \in \Omega$ with $f(\omega) < n$, and $f(\omega) > n$ for all $\omega \in \Omega$ with $f(\omega) = \infty$. Functions $s_n$ and $s_{n+1}$ are drawn in Figure 6.3.
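The displayed formulas of the construction did not survive in this copy, so the sketch below assumes the standard dyadic construction, which matches the surviving statements (index range $i = 0, \dots, n2^n$ and the bound $s_n \le f < s_n + 2^{-n}$ where $f < n$): $A_i(n) = \{i2^{-n} \le f < (i+1)2^{-n}\}$ for $i < n2^n$, $A_{n2^n}(n) = \{f \ge n\}$, and $s_n = \sum_i i2^{-n}\mathbf{1}_{A_i(n)}$. The function name `s_n` is hypothetical and it acts on a single value $f(\omega)$.

```python
import math


def s_n(f_value, n):
    """Value s_n(omega) of the n-th dyadic simple function, given f(omega) >= 0.

    Assumed construction: s_n = i / 2^n on {i/2^n <= f < (i+1)/2^n} for i < n * 2^n,
    and s_n = n on {f >= n} (this includes f = +infinity).
    """
    if f_value >= n:
        return float(n)
    return math.floor(f_value * 2 ** n) / 2 ** n


# Check monotonicity and the approximation bound on a few sample values of f.
for x in (0.0, 0.3, 2.75, 7.1):
    for n in range(1, 12):
        assert s_n(x, n) <= s_n(x, n + 1) <= x
        if x < n:
            assert x - s_n(x, n) < 2 ** -n
```

For a point with $f(\omega) = \infty$ the values $s_n(\omega) = n$ increase without bound, so $\sup\{s_n\} = f$ holds there as well.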

[...] $> 0$ be any small number.

Thus $s_n \ge s(1 - \varepsilon)\mathbf{1}_{B_n}$. By Proposition 1.3 (ii, iii),

CHAPTER 6. ELEMENTS OF INTEGRATION

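By the definition of $\{s_n\}$, it follows that $\{B_n\} \uparrow \Omega$, which implies that $\{A_i \cap B_n\} \uparrow A_i$. Therefore, by continuity from below of $\mu$ (Lemma 1.6, Chapter 5),

$$\int s\, d\mu = \sum_{i=1}^{m} a_i \mu(A_i) = \sum_{i=1}^{m} a_i \lim_{n \to \infty} \mu(A_i \cap B_n) = \lim_{n \to \infty} \sum_{i=1}^{m} a_i \mu(A_i \cap B_n) = \lim_{n \to \infty} \int s \mathbf{1}_{B_n}\, d\mu.$$

The last equation is due to the relationship

Thus,

$$\sup\Big\{\int s_n\, d\mu\Big\} = \lim_{n \to \infty} \int s_n\, d\mu \ge (1 - \varepsilon) \int s\, d\mu,$$

which proves the statement because the inequality holds for each $\varepsilon > 0$. □

1.6 Corollary. For $\{s_n\}\uparrow, \{t_n\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$, it holds that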

(See Problem 1.2.)

Let us now turn to the integral of the functions from the more general class $\mathcal{C}_+^{-1} = \mathcal{C}^{-1}(\Omega, \Sigma; \bar{\mathbb{R}}_+)$, which we first became familiar with in Theorem 6.5, Chapter 5.

1.7 Definition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}$. By Theorem 6.5, Chapter 5, there is a monotone nondecreasing sequence $\{s_n\}\uparrow \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. Hence, it is plausible to define

$$\int f\, d\mu = \sup\Big\{\int s_n\, d\mu\Big\}$$

and call it the integral of (an extended real-valued nonnegative function) $f$ with respect to measure $\mu$. By Corollary 1.6, the value of the integral, $\int f\, d\mu$, is unique.

Analogous to Proposition 1.3 (ii, iii), we have:

1.8 Proposition. The integral introduced in Definition 1.7 is a positive, linear, monotone nondecreasing functional on $\mathcal{C}_+^{-1}$.

Proof. Let $f, g \in \mathcal{C}_+^{-1}$ and $a, b \in \mathbb{R}_+$. Then

1. Integration on $\mathcal{C}^{-1}(\Omega, \Sigma)$

yield that

$$\int (af + bg)\, d\mu = \sup\Big\{\int (a s_n + b t_n)\, d\mu\Big\},$$

which, by Proposition 1.3 (ii), equals

Now let $f \le g$. Then we have $\sup s_n \le \sup t_n$; hence $s_k \le \sup t_n$. Thus, by Lemma 1.5,

$$\int s_k\, d\mu \le \sup \int t_n\, d\mu,$$

and finally,

$$\int f\, d\mu = \sup \int s_k\, d\mu \le \sup \int t_n\, d\mu = \int g\, d\mu.$$

1.9 Examples.

(i) Let $\varepsilon_a$ be a point mass on a measurable space $(\Omega, \Sigma)$ for some $a \in \Omega$ and let $s \in \Psi_+(\Omega, \Sigma)$ be such that $s(a) = a_{i_0}$, for some $i_0 \in \{1, \dots, n\}$. Then

Now let $f \in \mathcal{C}_+^{-1}(\Omega, \Sigma)$. Then there is a sequence $\{s_n\}\uparrow \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. Thus

Similarly, if $\mu = c\varepsilon_a$ (for some $c > 0$), $\int f\, d\mu = c f(a)$.

(ii) Let $\mu = \sum_{i=0}^{n} c_i \varepsilon_{a_i}$. By Problem 1.3,

Specifically, if $c_i = \binom{n}{i} p^i (1 - p)^{n-i}$, then $\mu$ is the binomial measure $\beta_{n,p}$. (See Example 1.8 (iii), Chapter 5.) Furthermore, if $f(x) = e^{tx}$, for $t \in \mathbb{R}$, then the transform of the binomial measure

$$\int f\, d\beta_{n,p} = \sum_{i=0}^{n} c_i e^{ti} = (1 + pe^t - p)^n$$

is a function in $t$ and is referred to as the moment generating function. In the general definition, $t$ is allowed to run over the complex plane $\mathbb{C}$.
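The closed form of the moment generating function can be checked against the defining sum $\sum_i c_i f(a_i)$ for a discrete measure with atoms $a_i = i$. The sketch below is only a numerical illustration; the function names are hypothetical and the parameter values in the usage line are arbitrary.

```python
from math import comb, exp, isclose


def binomial_weights(n, p):
    """Masses c_i = C(n, i) p^i (1 - p)^(n - i) of the binomial measure at the atoms i = 0..n."""
    return [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]


def integral_discrete(f, atoms, weights):
    """Integral of f with respect to the discrete measure sum_i c_i * eps_{a_i}."""
    return sum(c * f(a) for a, c in zip(atoms, weights))


def mgf_binomial(n, p, t):
    """Moment generating function of the binomial measure, computed as the sum of c_i e^{t i}."""
    return integral_discrete(lambda x: exp(t * x), range(n + 1), binomial_weights(n, p))


# The sum agrees with the closed form (1 + p e^t - p)^n up to rounding.
n, p, t = 10, 0.3, 0.5
assert isclose(mgf_binomial(n, p, t), (1 + p * exp(t) - p) ** n, rel_tol=1e-12)
```

The agreement is an instance of the binomial theorem applied to $(pe^t + (1 - p))^n$.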

(iii) Let $(\Omega, \Sigma, \mu)$ be the measure space with $\Omega = [0,1]$, $\Sigma = \mathcal{B}([0,1])$, and $\mu = \lambda$ (the Borel-Lebesgue measure on $\mathcal{B}([0,1])$). Let $C$ be the Cantor set and let $G_n$ be the open sets of Borel-Lebesgue measure $\tfrac{1}{2}\left(\tfrac{2}{3}\right)^n$ (introduced in Example 3.11, Chapter 5). Let us define the function

We are going to evaluate the integral $\int_{[0,1]} f(x)\,\lambda(dx)$ (with respect to the Borel-Lebesgue measure). First of all, we have to identify the function $f$, which can be represented in the form $f = \sup\{s_n\}$, where

Clearly, $s_n \in \Psi_+([0,1], \mathcal{B} \cap [0,1])$ and $f(x) = \sup\{s_n(x)\}$. Thus $f \in \mathcal{C}_+^{-1}([0,1], \mathcal{B}([0,1]))$ and hence

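(iv) Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega, \Sigma)$. Then $\mu = \sum_{n=1}^{\infty} \mu_n$ is a measure on $\Sigma$; and for an $A \in \Sigma$, the integral of the indicator function $\mathbf{1}_A$ is

$$\int \mathbf{1}_A\, d\mu = \mu(A) = \sum_{n=1}^{\infty} \mu_n(A) = \sum_{n=1}^{\infty} \int \mathbf{1}_A\, d\mu_n.$$

Let $s \in \Psi_+(\Omega, \Sigma)$. Then

$$\int s\, d\mu = \sum_{k=1}^{m} a_k \mu(A_k) = \sum_{k=1}^{m} a_k \sum_{n=1}^{\infty} \mu_n(A_k) = \sum_{n=1}^{\infty} \sum_{k=1}^{m} a_k \mu_n(A_k) = \sum_{n=1}^{\infty} \int s\, d\mu_n.$$

Now, for $f \in \mathcal{C}_+^{-1}$, we have $f = \sup\{s_j\}$ such that $\{s_j\}\uparrow \subseteq \Psi_+$. Let $b_{jn} = \sum_{i=1}^{n} \int s_j\, d\mu_i$. Since $\{b_{jn}\}$ is monotone increasing,

which yields that

Therefore,

Thus we showed that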

Now we further enlarge the class of integrable functions by considering arbitrary extended real-valued measurable functions of $\mathcal{C}^{-1}(\Omega,\Sigma)$. For each $f \in \mathcal{C}^{-1}$, and with $0$ being the function identically equal to zero on $\Omega$, denote

$f^+ = \sup\{f,0\}$ and $f^- = -\inf\{f,0\} = (-f)^+$

(cf. Definition 7.7, Chapter 1). Clearly (see also Problem 7.16, Chapter 1),

$f = f^+ - f^-$ and $|f| = f^+ + f^-.$
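The decomposition of a function into its positive and negative parts can be illustrated pointwise. The sketch below is only an illustration under the assumption that functions are ordinary Python callables on the reals; the helper names `positive_part` and `negative_part` are hypothetical.

```python
def positive_part(f):
    """Return the function f^+ = sup{f, 0}."""
    return lambda x: max(f(x), 0.0)

def negative_part(f):
    """Return the function f^- = -inf{f, 0} = (-f)^+."""
    return lambda x: max(-f(x), 0.0)

def sample_function(x):
    """An arbitrary sign-changing test function (an assumption for the demo)."""
    return x**3 - x

if __name__ == "__main__":
    f_plus = positive_part(sample_function)
    f_minus = negative_part(sample_function)
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        # f = f^+ - f^-  and  |f| = f^+ + f^-
        print(x, sample_function(x), f_plus(x) - f_minus(x), f_plus(x) + f_minus(x))
```

At every point at most one of $f^+$, $f^-$ is nonzero, which is why both identities hold exactly.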

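By Proposition 6.6, Chapter 5, $f^+$ and $f^-$ are also elements of $\mathcal{C}^{-1}$ (more precisely, elements of $\mathcal{C}_+^{-1}$) if and only if $f \in \mathcal{C}^{-1}$.

1.10 Definitions.

(i) Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$). If at least one of the integrals, $\int f^+\,d\mu$ or $\int f^-\,d\mu$, is finite, we say that the integral of $f$ with respect to measure $\mu$ exists and denote this integral by

$\int f\,d\mu = \int f^+\,d\mu - \int f^-\,d\mu.$ (1.10)

We also denote

$L(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \{f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}}) : \int f\,d\mu \text{ exists}\}.$ (1.10a)

If both of the integrals of the functions $f^+$ and $f^-$ are finite, we say that the function $f$ is $\mu$-integrable and again denote the integral of $f$ by formula (1.10). The subset of $\mathcal{C}^{-1}$ of all $\mu$-integrable functions is denoted by $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, i.e.

$L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \{f \in \mathcal{C}^{-1}(\Omega,\Sigma) : \int f^+\,d\mu < \infty \text{ and } \int f^-\,d\mu < \infty\}.$ (1.10b)

Note that

$\int |f|\,d\mu = \int f^+\,d\mu + \int f^-\,d\mu,$ (1.10c)

which is due to $|f| = f^+ + f^-$ and Proposition 1.8. In light of (1.10c), (1.10b) can be rewritten as

$L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \{f \in \mathcal{C}^{-1}(\Omega,\Sigma) : \int |f|\,d\mu < \infty\}.$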

If a measurable space is specified, the notation $f \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ or $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ will be shortened to $f \in L(\mu)$ or $f \in L^1(\mu)$.

(ii) If $\Omega = \mathbb{R}^n$, $\Sigma = \mathfrak{B}^n$, and $\mu$ is the Borel-Lebesgue measure $\lambda$ and if the integral of the function $f$ in (1.10) exists, it is called the Lebesgue integral of $f$. If $f$ is $\lambda$-integrable, we write $f \in L^1(\lambda)$.

(iii) If $\Omega = \mathbb{R}$, $\Sigma = \mathfrak{B}$ and $\mu = \mu_F$ (a Borel-Lebesgue-Stieltjes measure induced by an extended distribution function $F$), and if $g \in L(\Omega,\Sigma,\mu_F;\overline{\mathbb{R}})$, then the integral in (1.10) is called the Lebesgue-Stieltjes integral of $g$; and we will write $g \in L^1(\mu_F)$ if $g$ is $\mu_F$-integrable.

(iv) Let $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be the space of all extended real-valued random variables on a probability space $(\Omega,\Sigma,P)$. From Example 4.5 (i), Chapter 5, we recall that for any random variable $X \in \mathcal{C}^{-1}(\Omega,\Sigma)$ on $(\Omega,\Sigma,P)$, the image measure $PX^*$ is the probability distribution of $X$. If $X \in L^1(\Omega,\Sigma,P;\overline{\mathbb{R}})$, then the numeric value $\int X\,dP$ is called the expectation of the random variable $X$, in notation, $\mathbb{E}[X]$. Observe that $\mathbb{E}[X]$ makes sense only if $X$ is $P$-integrable, i.e., if $\int |X|\,dP < \infty$. [It is now becoming clear why in textbooks on probability, the expectation $\mathbb{E}[X]$ is defined only when $\mathbb{E}[|X|] < \infty$.] $\square$

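1.11 Proposition. The integral is a linear, monotone nondecreasing functional on the space $L^1(\Omega,\Sigma,\mu)$. $\square$

(See Problem 1.6.)

1.12 Proposition.

(i) $L^1(\Omega,\Sigma,\mu;\mathbb{R})$ is a vector lattice over $\mathbb{R}$, i.e., for every pair $f, g \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$, the functions $\sup\{f,g\}$ and $\inf\{f,g\}$ belong to $L^1(\Omega,\Sigma,\mu;\mathbb{R})$.

(ii) If $f \in L^1$, then $\left|\int f\,d\mu\right| \le \int |f|\,d\mu$.

Proof.

(i) $|\sup\{f,g\}| \le |f| + |g|$ and $|\inf\{f,g\}| \le |f| + |g|$. The statement is now due to Problems 1.7 and 1.8.

(ii) Obviously, $|f| \ge f$ and $|f| \ge -f$. Thus, by Proposition 1.11, we have

$\int |f|\,d\mu \ge \int f\,d\mu$ and $\int |f|\,d\mu \ge -\int f\,d\mu,$

and hence

$\int |f|\,d\mu \ge \left|\int f\,d\mu\right|.$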

1.13 Notations. Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $A \in \Sigma$. Then, we denote

$\int_A f\,d\mu = \int f\,\mathbf{1}_A\,d\mu.$

Specifically, it follows that

Now we will need the notion of "properties that hold almost everywhere."

1.14 Definitions and Remarks.

(i) Let $(\Omega,\Sigma,\mu)$ be a measure space. A property $\Pi$ (of points of $\Omega$) is said to hold almost everywhere (a.e.) or $\mu$-almost everywhere ($\mu$-a.e.) if there is a ($\mu$-null) set $N \in \mathcal{N}_\mu$ (see Definition 2.5 (i), Chapter 5) such that $\Pi$ holds for all points of $N^c$. Notice that this definition does not preclude property $\Pi$ from holding on $N$ or on a subset of $N$. It merely says that $\Pi$ may fail on a negligible subset of $N$.

(ii) Two measurable functions $f$ and $g$ are said to be equal ($\mu$-)a.e. if $f = g$ on the complement of a $\mu$-null set $N$. Observe that $\{f \ne g\} \subseteq N$. Recall that, by Problem 5.2 (iii), Chapter 5, the set $\{f \ne g\}$ is measurable. Therefore, if $f = g$ a.e., then the set $\{f \ne g\} \in \mathcal{N}_\mu$, i.e., is $\mu$-null.

(iii) Let $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be the set of all measurable functions on $\Omega$ and let $\mu$ be a measure on $\Sigma$. Let $[f]_\mu$ denote the set of all functions that are pairwise equal $\mu$-a.e. on $\Omega$. Specifically, $[0]_\mu$ denotes the set of all measurable functions which equal zero $\mu$-a.e. on $\Omega$. Clearly, the $\mu$-almost everywhere property of equality of functions induces an equivalence relation (say $E$) on the set $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Then

denotes the quotient set $\{[f]_\mu : f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})\}$ and it is called the quotient set modulo $\mu$. In light of these considerations, any two functions $f$ and $g$ such that $f = g$ $\mu$-a.e. on $\Omega$ are also said to be equal modulo $\mu$ and we will write $f = g$ (mod $\mu$), or $f \in [g]_\mu$, or equivalently, $f - g \in [0]_\mu$. $\square$

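1.15 Lemma. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Then $\int f\,d\mu = 0$ if and only if $f \in [0]_\mu$.

Proof. Denote $N = \{f > 0\}$ (which is an element of $\Sigma$).

(i) Let $f \in [0]_\mu$. Then $N \in \mathcal{N}_\mu$. Let $s_n = n\mathbf{1}_N$ ($\in \Psi_+$), $n = 1,2,\ldots$. Therefore, $\int s_n\,d\mu = n\mu(N) = 0$, for all $n$. Denote $s = \sup\{s_n\}$. Then, by Theorem 6.5, Chapter 5, $s \in \mathcal{C}_+^{-1}$ and

$\int s\,d\mu = \sup_n \int s_n\,d\mu = 0.$

Finally, $f = s_n = 0$ on $N^c$. While $f$ is arbitrary on $N$ and, in particular, not necessarily $\infty$, we have that $s_n \uparrow \infty$ on $N$. Consequently, $f \le s$ on $\Omega$ which, by monotonicity (Proposition 1.8), yields

$0 \le \int f\,d\mu \le \int s\,d\mu = 0$

and hence $\int f\,d\mu = 0$.

(ii) Now let $\int f\,d\mu = 0$. Denote

$N_n = \{f \ge \tfrac{1}{n}\}.$

Obviously, $N_n \in \Sigma$ and $N_n \uparrow N$, where

$N = \bigcup_{n=1}^{\infty} N_n.$

By continuity from below of $\mu$,

$\lim_{n\to\infty} \mu(N_n) = \mu(N).$ (1.15)

Clearly, $f \ge \frac{1}{n}\mathbf{1}_{N_n}$. Again, by monotonicity (Proposition 1.8), we have that

$0 = \int f\,d\mu \ge \int \tfrac{1}{n}\mathbf{1}_{N_n}\,d\mu = \tfrac{1}{n}\mu(N_n),$

which leads to $\mu(N_n) = 0$, $n = 1,2,\ldots$. From (1.15) it follows that $\mu(N) = 0$ and hence $N \in \mathcal{N}_\mu$. Therefore, $f \in [0]_\mu$.

1.16 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, g \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f = g$ (mod $\mu$). Then

$\int f\,d\mu = \int g\,d\mu.$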

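Proof. By Problem 5.2 (iii), Chapter 5, we have that $N = \{f \ne g\} \in \Sigma$. Therefore, by the above assumption regarding $f$ and $g$, $N \in \mathcal{N}_\mu$ and the functions $f\mathbf{1}_N$ and $g\mathbf{1}_N$ are elements of the quotient set $[0]_\mu$. By Lemma 1.15, it follows that

$\int f\mathbf{1}_N\,d\mu = \int g\mathbf{1}_N\,d\mu = 0.$

On the other hand, if $A = N^c$, then

$\int f\,d\mu = \int f\mathbf{1}_A\,d\mu + \int f\mathbf{1}_N\,d\mu = \int f\mathbf{1}_A\,d\mu.$

Similarly,

$\int g\,d\mu = \int g\mathbf{1}_A\,d\mu.$

The statement follows from $f\mathbf{1}_A = g\mathbf{1}_A$, $\forall\,\omega \in \Omega$. Indeed, on the set $N$, $f\mathbf{1}_A = g\mathbf{1}_A = 0$, while on $N^c$ we have that $f = g$. $\square$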

1.17 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $|f| \le g$ a.e. Then $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ implies that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

Proof. Let $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. Then by Proposition 6.6, Chapter 5, we have that

$g' = \sup\{g, |f|\} \in \mathcal{C}^{-1}.$

Clearly, $|f| \le g'$ everywhere and $g' = g$ (mod $\mu$) (show it), and by Problem 1.17, $g' \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. Then, by Problem 1.8, $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

1.18 Proposition. Let $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ and $f$ or $g \in L^1(\Omega,\Sigma,\mu)$. Then

$\int_A f\,d\mu = \int_A g\,d\mu$, for each $A \in \Sigma$, (1.18)

yields that $f = g$ (mod $\mu$).

(See Problem 1.27.) Theorem 1.19 and Corollary 1.20 modify and, to some extent, refine Proposition 1.18.

1.19 Theorem. If $\mu$ is $\sigma$-finite, $f, g \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, and

$\int_A f\,d\mu \le \int_A g\,d\mu$, for each $A \in \Sigma$, (1.19)

then $f \le g$ $\mu$-a.e. on $\Omega$.

Proof. a) Let $\mu$ be finite. Denote

Then, since by our assumption,

$\int_A f\,d\mu \le \int_A g\,d\mu$ for each $A \in \Sigma$, we have


= M: =

S g d p +A p ( A n ) . An

On the other hand,

Therefore, from (1.19a) and because $\int_{A_n} g\,d\mu$ is finite, $L \ge M$, which yields that $\mu(A_n) = 0$, for each $n$. Thus,

On the other hand, from

$\bigcup_{n=1}^{\infty} A_n = \{f > g\} \cap \{g \text{ is finite}\}$

we conclude that $\mu\{f > g:\ g \text{ is finite}\} = 0$.

Letting $B_n = \{g = -\infty,\ f \ge -n\}$ we have

$-\infty\cdot\mu(B_n) = \int_{B_n} g\,d\mu \ge \int_{B_n} f\,d\mu \ge -n\mu(B_n)$

and therefore,

$-\infty\cdot\mu(B_n) \ge -n\mu(B_n)$

or, equivalently, $n\mu(B_n) \ge \infty\cdot\mu(B_n)$. This holds true if and only if $\mu(B_n) = 0$ (as the consequence of the agreement that $\infty\cdot 0 = 0$). Thus,

In summary, we proved that $\mu\{f > g\} = 0$, which implies that $f \le g$ $\mu$-a.e. on $\Omega$.

b) Now, let $\mu$ be $\sigma$-finite and let $\mu_n = \mathrm{Res}_{\Omega_n}\mu$. Then

and hence $f \le g$ $\mu$-a.e. on $\Omega_n$. The rest of this case is obvious. The reader can easily conclude that

1.20 Corollary. If $\mu$ is $\sigma$-finite, $f, g \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, and

$\int_A f\,d\mu = \int_A g\,d\mu$, for each $A \in \Sigma$, (1.20)

then $f = g$ $\mu$-a.e. on $\Omega$.

(For a pertinent discussion, see Problem 1.28.) Finally, we would like to formulate the proposition below, which will often be cited in the sequel and whose proof we assign to the reader as Problem 1.19.

1.21 Proposition. Each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$. $\square$

PROBLEMS

1.1

1.2 Prove Corollary 1.6, i.e., for $\{s_n\}\uparrow,\ \{t_n\}\uparrow\ \subseteq \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$ it holds that

$\sup_n \int s_n\,d\mu = \sup_n \int t_n\,d\mu.$

[Hint: Use the fact that $s_k \le \sup\{t_n\}$ and $t_k \le \sup\{s_n\}$.]

1.3 Show that for $\mu = \sum_{i=0}^{n} c_i\,\varepsilon_{a_i}$, the corresponding value of the integral of any bounded measurable function $f$ is

$\int f\,d\mu = \sum_{i=0}^{n} c_i f(a_i).$

1.4 Let $\pi_\lambda$ be a Poisson measure and let $f \in \mathcal{C}_+^{-1}(\mathbb{R},\mathfrak{B};\mathbb{R})$. Show that

1.5 Under the condition of Problem 1.4 assume that

and find in each case the integral of $f$ with respect to measure $\pi_\lambda$.

Prove Proposition 1.11.

Let $Q$ be a non-Borel subset of $\mathbb{R}$ (such as one in Problem 3.16, Chapter 5) and let $C$ denote the Cantor ternary set. Define the function

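Is $f$ Lebesgue measurable, i.e. $f \in \mathcal{C}^{-1}(\mathbb{R}, L^*, \lambda)$?

Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Show that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ if and only if there exists $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ such that $|f| \le g$.

Show that $L^1$ is a linear space over $\mathbb{R}$.

Show that

Show that $\{L(\Omega,\Sigma,\varepsilon_a;\mathbb{R}) : a \in \Omega\} = \{L^1(\Omega,\Sigma,\varepsilon_a;\mathbb{R}) : a \in \Omega\}$.

Let $(\Omega,\Sigma,\mu)$ be a complete measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Suppose that $g: \Omega \to \overline{\mathbb{R}}$ is an extended real-valued function. Show that if $g = f$ (mod $\mu$), then $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. [Hint: Show that $\{g < c\} \in \Sigma$, $\forall c \in \mathbb{R}$.]

Let $(\Omega,\Sigma,\mu)$ be a complete measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Suppose that $\lim f_n$ exists and $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$, where $f$ is an extended real-valued function. Show that $f \in \mathcal{C}^{-1}$.

Prove that $f = g$ (mod $\mu$) if and only if $f^+ = g^+$ (mod $\mu$) and $f^- = g^-$ (mod $\mu$).

Show that $f \in [0]_\mu$ if and only if $f^+, f^- \in [0]_\mu$.

Show that if $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ then $f \in [0]_\mu$ yields that $\int f\,d\mu = 0$. Does the converse hold true?

Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f - g \in [0]_\mu$. Show that $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ and that $\int f\,d\mu = \int g\,d\mu$.

Generalize Proposition 1.16 assuming that $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and that $f \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ (i.e., that $\int f\,d\mu$ exists).

1.19 Show that each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$.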

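[Hint: Let $A = \{|f| = \infty\}$. Show that $a\mu(A) < \infty$, $\forall a \in \mathbb{R}_+$, and then show that $\lim_{n\to\infty} n\mu(A) < \infty$ implies that $\mu(A) = 0$.]

Show that for $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, $\int_A f\,d\mu = 0$ for each $A \in \Sigma$ if and only if $f \in [0]_\mu$.

Show by a counterexample that $L^1$ is not an %-space.

Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Show that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ if and only if for each $\varepsilon > 0$, there is a function $g \in L^1(\Omega,\Sigma,\mu;\mathbb{R}_+)$ such that

Let $(\Omega,\Sigma,\mu)$ be a measure space and $c > 0$. Show that for each $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$,

$\int f\,d(c\mu) = c\int f\,d\mu.$

Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$, $\{c_n\}$ be a sequence of positive real numbers, and let $\mu = \sum_{n=1}^{\infty} c_n\mu_n$, which is a measure on $(\Omega,\Sigma)$. Show that for every $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$,

$\int f\,d\mu = \sum_{n=1}^{\infty} c_n \int f\,d\mu_n.$

Let $\mu$ and $\nu$ be two measures on $(\Omega,\Sigma)$ such that $\mu \le \nu$. Show that for each $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) \cap L^1(\Omega,\Sigma,\nu;\overline{\mathbb{R}})$, the integral $\int f\,d(\nu - \mu)$ makes sense, $f \in L^1(\Omega,\Sigma,\nu - \mu;\overline{\mathbb{R}})$, and that

Let $\mu$ and $\nu$ be two measures on $(\Omega,\Sigma)$ such that $\mu \le \nu$. Show that for each $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$, $\int f\,d\mu \le \int f\,d\nu$.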

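$-\infty$. Then, $\sup\{\int f_n\,d\mu\} = \int \sup\{f_n\}\,d\mu$. (See Problem 2.2.)

2.4 Lemma (Fatou). Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\} \subseteq \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then

$\int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$

Proof. By Proposition 5.6 (v), Chapter 5, $\liminf f_n \in \mathcal{C}_+^{-1}$ and by Proposition 5.6 (iv), Chapter 5,

$g_n = \inf_{k \ge n} f_k \in \mathcal{C}_+^{-1}.$

Clearly, the sequence $\{g_n\}$ is monotone nondecreasing and hence $\sup\{g_n\} = \liminf f_n$, and $g_n \le f_k$, for all $k \ge n$. By monotonicity of the integral,

$\int g_n\,d\mu \le \int f_k\,d\mu, \quad k \ge n,$

which implies that

$\int g_n\,d\mu \le \inf_{k \ge n} \int f_k\,d\mu.$

Finally, by the Monotone Convergence Theorem,

$\int \liminf f_n\,d\mu = \sup_n \int g_n\,d\mu \le \sup_n \inf_{k \ge n} \int f_k\,d\mu = \liminf \int f_n\,d\mu.$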
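The inequality in Fatou's lemma can be strict. A minimal sketch, using the standard example $f_n = n\,\mathbf{1}_{(0,1/n)}$ on $[0,1]$ (this example and all function names are assumptions made for illustration, not taken from the text): every $f_n$ has integral 1, while $f_n(x) = 0$ for all sufficiently large $n$ at each fixed $x$, so the lower limit function integrates to 0.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n * indicator of the open interval (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Lebesgue integral of the simple function f_n: value n times length 1/n."""
    return n * Fraction(1, n)

def eventually_zero(x, extra=1000):
    """Check f_n(x) == 0 for a long run of indices n beyond 1/x."""
    start = 1 if x == 0 else int(1 / x) + 1
    return all(f(n, x) == 0 for n in range(start, start + extra))

if __name__ == "__main__":
    samples = [Fraction(0), Fraction(1, 7), Fraction(1, 2), Fraction(9, 10)]
    print([eventually_zero(x) for x in samples])   # pointwise limit is 0
    print([integral_f(n) for n in (1, 10, 100)])   # each integral equals 1
```

Here the left side of Fatou's inequality is 0 and the right side is 1, so no integrable dominating function can exist for this sequence.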

2.5 Definition. Let $f, \{f_n\} \subseteq L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. The sequence $\{f_n\}$ is said to converge to $f$ in mean if

$\lim_{n\to\infty} \int |f_n - f|\,d\mu = 0.$

We now formulate and prove one of the central results in the theory of integration. As with the Monotone Convergence Theorem, the following theorem enables us to interchange the limit and the integral for a pointwise convergent sequence of functions. However, it does not require that the sequence be monotone nondecreasing and nonnegative. On the other hand, the sequence needs an integrable dominating function, and thus it is not a generalization of the Monotone Convergence Theorem.

2.6 Theorem (Lebesgue's Dominated Convergence Theorem). Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ be a (pointwise) a.e. convergent sequence. Suppose that there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that $|f_n| \le g$, $n = 1, 2, \ldots$. Then the following are true.

(i) There exists at least one function $f \in \mathcal{C}^{-1}$, such that $|f| < \infty$, to which the sequence $\{f_n\}$ converges a.e. in the topology of pointwise convergence.

(ii) $f \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$ and $\{f_n\} \subseteq L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$;

(iii) The sequence $\{f_n\}$ converges to $f$ in mean, i.e.,

$\lim_{n\to\infty} \int |f_n - f|\,d\mu = 0.$

(iv) $\lim_{n\to\infty} \int f_n\,d\mu = \int f\,d\mu.$

Proof.

(i) By our assumption, there is a negligible set $\Pi$ such that $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in \Pi^c$ and there is a $\mu$-null set $N_1 \supseteq \Pi$. Therefore, $N_1^c \subseteq \Pi^c$ and $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in N_1^c$. Since $g \in L^1(\Omega,\Sigma,\mu)$, by Proposition 1.21, it follows that $g$ is finite $\mu$-a.e. on $\Omega$, i.e. there is a $\mu$-null set $N_2$ such that $g(\omega) < \infty$ for all $\omega \in N_2^c$. Define the function

$f = \lim_{n\to\infty} f_n \mathbf{1}_A,$ (2.6)

where $A = (N_1 \cup N_2)^c$. Clearly, $f_n$ converges to $f$ pointwise $\mu$-a.e. on $\Omega$ and hence, by Proposition 5.6 (iii) and (vi), Chapter 5, $f \in \mathcal{C}^{-1}$. Indeed, since $f_n$ and $\mathbf{1}_A \in \mathcal{C}^{-1}$, it follows that $f_n\mathbf{1}_A \in \mathcal{C}^{-1}$ and that $f_n\mathbf{1}_A \to f$ in the topology of pointwise convergence; the latter implies that $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$.

(ii) From (2.6) it follows that on set $A$, $\lim_{n\to\infty} f_n = f$; in addition, $\{f_n\}$ is dominated by a finite function $g$ on $A$. Thus, $|f| \le g$ on $A$ and, due to (2.6), $f = 0$ on $A^c$. Hence, $|f| \le g$ everywhere on $\Omega$.

By Proposition 1.17 and since $|f| < \infty$, $f \in L^1(\Omega,\Sigma,\mu)$. Also by Proposition 1.17, $\{f_n\} \subseteq L^1(\Omega,\Sigma,\mu)$.

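(iii) We prove that $f_n$ is convergent in mean to $f$, i.e.,

$\lim_{n\to\infty} \int |f - f_n|\,d\mu = 0.$

Let $g_n = |f - f_n|$ ($\in \mathcal{C}_+^{-1}(\Omega,\Sigma)$, why?). Then, $0 \le g_n \le |f| + g$. Since $|f| + g \in L^1(\Omega,\Sigma,\mu)$, it follows that $g_n \in L^1(\Omega,\Sigma,\mu)$, again by Problem 1.8. [Observe that since linearity of the integral holds just on $L^1$, we do need to show that $g_n \in L^1$, which would lead to

$\int (|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu - \int g_n\,d\mu.$]

Applying Fatou's lemma to the sequence $\{|f| + g - g_n\}$, we have:

$\int \liminf (|f| + g - g_n)\,d\mu \le \liminf \int (|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu - \limsup \int g_n\,d\mu.$ (2.6a)

Since $f_n \to f$ a.e., then $g_n \to 0$ a.e., and hence $|f| + g - g_n \to |f| + g$ a.e., which implies that

$\liminf (|f| + g - g_n) = |f| + g$ (mod $\mu$).

By Proposition 1.16,

$\int \liminf (|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu,$

which, together with inequality (2.6a), yields

$\int (|f| + g)\,d\mu \le \int (|f| + g)\,d\mu - \limsup \int g_n\,d\mu$

or, equivalently,

$\limsup_{n\to\infty} \int g_n\,d\mu \le 0.$ (2.6b)

Because $g_n \ge 0$, (2.6b) reveals that

$\lim_{n\to\infty} \int g_n\,d\mu = 0$

and thus $\lim_{n\to\infty} \int |f - f_n|\,d\mu = 0$, which proves (iii). Now (iv) follows from Problem 2.6.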

2.7 Examples.

(i) We evaluate $\lim_{n\to\infty} \int_0^1 nx(1-x)^n\,dx$. First observe that the sequence $\{nx(1-x)^n\}$ is convergent to the function 0 pointwise on $[0,1]$. However, it is an easy exercise to show that the sequence $\{nx(1-x)^n\}$ does not converge to 0 uniformly; if it did, we could interchange the limit and the integral directly. (See Problem 3.12 of the next section.) Fortunately, the functions $nx(1-x)^n$ are uniformly bounded by 1. Therefore, function 1 can be taken as a pertinent integrable majorant function in the Lebesgue Dominated Convergence Theorem. This enables us to interchange the limit and the integral and conclude that

$\lim_{n\to\infty} \int_0^1 nx(1-x)^n\,dx = 0.$

(We can verify this result by direct computation of the integral

$\int_0^1 nx(1-x)^n\,dx = \frac{n}{(n+1)(n+2)}$

and then passing to the limit.)

(ii) Calculate $\lim_{n\to\infty} \int_0^n \left(1 + \frac{x}{n}\right)^n e^{-2x}\,\lambda(dx)$. Clearly,

$\left(1 + \tfrac{x}{n}\right)^n \mathbf{1}_{[0,n]}(x)\,e^{-2x} \le e^{x}e^{-2x} = e^{-x}, \quad x \ge 0.$

Hence, by the Lebesgue Dominated Convergence Theorem,

$\lim_{n\to\infty} \int_0^n \left(1 + \tfrac{x}{n}\right)^n e^{-2x}\,\lambda(dx) = \int \lim_{n\to\infty} \left(1 + \tfrac{x}{n}\right)^n \mathbf{1}_{[0,n]}(x)\,e^{-2x}\,\lambda(dx) = \int_0^\infty e^{-x}\,\lambda(dx) = 1.$
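Both limits in Examples 2.7 can be observed numerically. The sketch below is an illustration only: it approximates the integrals by a composite midpoint rule (an assumption made here for simplicity, treating them as Riemann integrals), and the function names and step counts are hypothetical choices.

```python
from math import exp, log1p

def midpoint(f, a, b, steps):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def example_i(n, steps=20_000):
    """Numerical value of the integral of n*x*(1-x)^n over [0, 1]."""
    return midpoint(lambda x: n * x * (1 - x) ** n, 0.0, 1.0, steps)

def example_i_exact(n):
    """Closed form n / ((n+1)(n+2)) of the same integral."""
    return n / ((n + 1) * (n + 2))

def example_ii(n, steps_per_unit=200):
    """Numerical value of the integral of (1 + x/n)^n * exp(-2x) over [0, n]."""
    def integrand(x):
        # written as exp(n*log(1 + x/n) - 2x) to avoid overflow for large n
        return exp(n * log1p(x / n) - 2 * x)
    return midpoint(integrand, 0.0, float(n), steps_per_unit * n)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, example_i(n), example_i_exact(n), example_ii(n))
```

The first column of values should shrink toward 0 and the last should increase toward 1, in line with the conclusions drawn from the Dominated Convergence Theorem.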

2.8 Remark. Note that we treated $\int_0^1 nx(1-x)^n\,dx$ in Example 2.7 (i) informally both as Lebesgue (L) and Riemann (R) integrals (since they are identical in this case), although the formal relationship between the two will be developed and discussed in Section 3. The same applies to Example 2.7 (ii). In Problems 2.9-2.11 we will also assume that the Lebesgue integrals are equal to Riemann integrals.

Another useful application of Lebesgue's Dominated Convergence Theorem 2.6 leads to the possibility of interchanging the derivative and integral whenever we need to differentiate a function under the integral. The only obstacle in using Theorem 2.6 is that it is formulated for sequences, while the derivative is defined as a limit along nets or filters. To overcome this predicament we will utilize the arguments of Example 9.7 (ii), Chapter 3, where the limit of a function, originally introduced along a filter base, reduces to the topological limit along countable neighborhood bases whenever we deal with first countable spaces (which we frequently do, as far as derivatives in metric spaces, in particular, in Euclidean spaces, are concerned). This enables us to treat derivatives as limits along sequences (as was pointed out in that example) and finally apply the Lebesgue Dominated Convergence Theorem. This is the subject of Theorem 2.9, which the reader should be able to prove. (See Problem 2.14.)

2.9 Theorem. Let $f \in \mathcal{C}^{-1}(\Omega \times [a,b], \Sigma'; \mathbb{R})$ ($a < b \in \mathbb{R}$) be a Borel measurable function and for each $t \in [a,b]$, $f(\cdot,t) \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$.

(i) If there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\mathbb{R})$) such that $g \ge 0$, and that $|f(\omega,t)| \le g(\omega)$, $t \in [a,b]$, $\omega \in \Omega$, and if the function $t \mapsto f(\cdot,t)$ is continuous at some $\xi \in [a,b]$ uniformly for all $\omega$, then the integral of parameter


is continuous at 0 for all n, we have from Fatou's gdp 5

On the other hand, since

gdv

Lemma 2.17,

< m,

that yields the assertion.

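PROBLEMS

2.1 Prove Corollary 2.2.

2.2 Generalize the Monotone Convergence Theorem: Let $\{f_n\}\uparrow\ \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \ge g$ for all $n$ and suppose that $\int g\,d\mu > -\infty$. Prove that

$\sup\{\int f_n\,d\mu\} = \int \sup\{f_n\}\,d\mu.$

2.3 Show that if $\int g\,d\mu = -\infty$, the Generalized Monotone Convergence Theorem need not hold.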

2.4

Let { f , ] J . s C - ' and g ~ ~ - l s u cthat h f,_ lim p(A,).

for all n. If

.

E E. Prove that

[Hint: Apply Fatou's Lemma 2.4 to the sequence of functions $\{\mathbf{1}_{A_n}\}$ and use Problem 3.8, Chapter 1; then apply DeMorgan's law to prove the second inequality.]

2.6 Show that if $f_n \to f$ in mean then

$\lim_{n\to\infty} \int f_n\,d\mu = \int f\,d\mu.$

2.7 Generalize Fatou's Lemma 2.4 in the following way. Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $g \le f_n$ for all $n$. Let $\int g^-\,d\mu < \infty$. Show that

$\int \liminf f_n\,d\mu \le \liminf \int f_n\,d\mu.$

2.8 Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \le g$ for all $n$. Let $\int g^+\,d\mu < \infty$. Show that

2.9 Let

Show that $f_n \to 0$ $\lambda$-a.e. in the topology of pointwise convergence. Explain why

S limn,, 2.10

Let

n2x,

o < x ~ ;

( x - ) ,

1 2 iii x sn

0,

2 xZz.

Show that

$\int \lim_{n\to\infty} f_n\,\lambda(dx) < \lim_{n\to\infty} \int f_n\,\lambda(dx).$

2.11 Use Lebesgue's Dominated Convergence Theorem 2.6 to prove that for all $a > 0$,

$\lim_{n\to\infty} \frac{n!\,n^{a-1}}{a(a+1)\cdots(a+n-1)} = \Gamma(a),$

where $\Gamma(a)$ is known to be the gamma function and it is expressed as the improper Riemann integral

$\Gamma(a) = \int_0^\infty x^{a-1}e^{-x}\,dx.$ (P2.11a)

2.12 Give an example of a monotone nonincreasing sequence of measures convergent to a set function $\mu$ setwise such that $\mu$ is not a measure.

2.13 Prove Fatou's Lemma 2.17.

2.14 Prove Theorem 2.9. [Hint: Use Theorem 2.6, the Mean Value Theorem, and Example 9.7 (ii), Chapter 3.]

NEW TERMS:
Monotone Convergence Theorem for functions
Beppo Levi's Corollary
Monotone Convergence Theorem, Generalized
Fatou's Lemma for functions
convergence in mean
Lebesgue's Dominated Convergence Theorem for functions
interchanging derivative and integral
Fatou's Lemma for measures
Lebesgue's Dominated Convergence Theorem for measures
setwise convergence of measures
setwise limit of measures
setwise convergence, criterion of
Monotone Convergence Theorem for measures
Fatou's Lemma for measures and nonnegative functions
Fatou's Lemma for measures and functions
Lebesgue's Dominated Convergence Theorem for measures and functions
gamma function

3. LEBESGUE AND RIEMANN INTEGRALS ON $\mathbb{R}$

In this section we will develop integration techniques in $L^1(\mathbb{R},\mathfrak{B},\lambda;\mathbb{R})$ (see Definition 1.10 (ii)). The principal idea is to reduce the Lebesgue integral to the Riemann integral whenever it is possible, in combination with the main convergence theorems. The Riemann notion of an integral, introduced in 1854, refined the notion originated by Cauchy in 1823. We begin with the concept of the Riemann integral of a bounded function on a compact interval suggested by the Frenchman Gaston (in some sources, Jean-Gaston) Darboux (1842-1917) in 1875. Although the construction below is self-contained, the reader is encouraged to go back to Example 9.9 (vi), Chapter 3, for topological preliminaries of this construction.

Let $I = [a,b]$ be a compact interval in $\mathbb{R}$. By Definition 1.7 (ii), Chapter 1 (see also Example 9.9 (vi), Chapter 3), a partition of $[a,b]$ is any ordered $n$-tuple $P = P(n) = P(a_0,\ldots,a_n)$ with

$P = \{a_0,\ldots,a_n \in [a,b]: a = a_0 < a_1$

6. Product Measures and Fubini's Theorem

Then, $C_n(\theta,1) = L^*_{(r,x_0)}(C_n(r,x_0))$. Observe that $L^*_{(r,x_0)} = E(\frac{1}{r}) \circ M(-x_0)$, where $E$ means expansion with factor $\frac{1}{r}$ and $M$ stands for the parallel motion (here with the shift $-x_0$). On the other hand,

$= \{(x_3,\ldots,x_n): \sum_{i=3}^{n} x_i^2 \le 1 - x_1^2 - x_2^2\}$, with $(x_1,x_2) \in C_2(\theta,1)$,

$= L^*_{n-2}\big((1 - x_1^2 - x_2^2)^{-1/2},\theta\big)\big(C_{n-2}(\theta,1)\big),$

where $(x_1,x_2) \in C_2(\theta,1)$. This yields

Now,

(by Fubini's theorem and by (6.18b))

The interior (second) integral is, due to Proposition 5.3, 1), Chapter 5, and the above observation, equal to

By Proposition 5.3, 2), Chapter 5, and by Theorem 4.1, the last integral equals $(1 - x_1^2 - x_2^2)^{(n-2)/2}\,V_{n-2}$. Therefore,

(by Fubini's theorem)

This is a Lebesgue integral of a continuous function on the unit ball and it can be reduced to a Riemann integral by using conventional techniques for Riemann integrals. For example, the double integral above is then

and thus

$V_n = V_{n-2}\,\frac{2\pi}{n}, \quad n = 2,3,\ldots.$ (6.18d)

Let $V_0 = 1$. Then, $V_1 = 2$ (as the Lebesgue measure of the interval $[-1,1]$). By (6.18c), $V_2 = \pi$ (that agrees with the definition of $V_0$). Furthermore,

$V_3 = V_1\,\frac{2\pi}{3} = \frac{4\pi}{3}$, and $V_4 = V_2\,\frac{2\pi}{4} = \frac{\pi^2}{2}.$
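The recursion (6.18d) is easy to run. The following minimal sketch computes $V_n$ from $V_0 = 1$ and $V_1 = 2$ and compares it with the standard closed form $\pi^{n/2}/\Gamma(n/2 + 1)$; that closed form is quoted here as a well-known fact and is an assumption about what formula (6.18) states, and the function names are hypothetical.

```python
from math import pi, gamma

def unit_ball_volume(n):
    """Volume V_n of the unit ball in R^n via V_n = V_{n-2} * 2*pi/n."""
    if n == 0:
        return 1.0
    if n == 1:
        return 2.0
    return unit_ball_volume(n - 2) * 2 * pi / n

def unit_ball_volume_closed(n):
    """Standard closed form pi^(n/2) / Gamma(n/2 + 1)."""
    return pi ** (n / 2) / gamma(n / 2 + 1)

if __name__ == "__main__":
    for n in range(0, 11):
        print(n, unit_ball_volume(n), unit_ball_volume_closed(n))
```

The printed values peak near $n = 5$ and then decrease toward 0, a familiar consequence of the factor $2\pi/n$ dropping below 1 for $n \ge 7$.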

The validity of formulas (6.18) and (6.18a) is then easily shown by induction and the use of (6.18d).

(ii) We show that Fubini's theorem need not hold when at least one of the measures, $\mu_1$ or $\mu_2$, is not $\sigma$-finite. Let $(\Omega_i,\Sigma_i) = ([0,1],\mathfrak{B}([0,1]))$, $i = 1,2$, $\mu_1 = \mathrm{Res}_{[0,1]}\lambda$, and $\mu_2(A) = |A|$, if $A$ is finite and $\mu_2(A) = \infty$, if $A$ is infinite, where $A \in \Sigma_2$. Denote by

$D = \{(x,y) \in [0,1]^2 : x = y\}$

the diagonal of the square (see Figure 6.2).

Figure 6.2

We show that $D \in \Sigma_1 \otimes \Sigma_2 = \mathfrak{B}^2([0,1]^2)$. Let

and

Then $D \in \Sigma_1 \otimes \Sigma_2$ for $D = \bigcap_{n=1}^{\infty} A_n$. Now we find

$\int_{[0,1]} \mu_2(D_x)\,\lambda(dx) = \lambda([0,1]) = 1$ (since $\mu_2(D_x) = 1$).

On the other hand,

So as we see, Fubini's theorem or, more precisely, the second equation in (6.10) of Theorem 6.10, does not hold.

(iii) Let $(\mathbb{N},\mathcal{P}(\mathbb{N}),\gamma)$ be the counting measure space introduced in Example 1.2 (viii), Chapter 5, for more general measure spaces. We will consider a sequence $\{s_n\}$ of nonnegative simple functions on $\mathbb{N}$ as

where $\{a_k\}$ is a nonnegative sequence of reals, so that

Hence the integral of $g$ will turn to a series:

This is readily extendible to a series with real-valued terms. In other words, the integral of a sequence $\{a_n\} \subseteq \mathbb{R}$ with respect to the counting measure $\gamma$ is represented by the series in (6.18e). Let $\{f_n\}$ be a sequence of nonnegative functions of $\mathcal{C}_+^{-1}(\Omega,\Sigma)$ and let $\mu$ be a $\sigma$-finite measure on $\Sigma$. Since the above counting measure $\gamma$ is $\sigma$-finite, the function $f$ (where $f(n,\omega) = f_n(\omega)$) obviously meets the conditions of Tonelli's Theorem 6.14: $f \in \mathcal{C}_+^{-1}(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma)$. Consequently, the sections

are $\mathcal{P}(\mathbb{N})$- and $\Sigma$-measurable, respectively, and

$\sum_{n=0}^{\infty} \int f_n\,d\mu = \int \sum_{n=0}^{\infty} f_n\,d\mu.$ (6.18f)

(6.18f) is a nice illustration of Tonelli's Theorem. However, it is a slightly weaker alternative to Beppo Levi's Corollary 2.2, since the latter does not require $\mu$ to be $\sigma$-finite. Now, let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$. To use an analog of Fubini's Theorem, we need to make sure that $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$, or, alternatively, apply the above procedure initially to the sequence $\{|f_n|\}$ instead. Then, from (6.18f) we can get

Should now $\int |f|\,d\gamma \otimes \mu$ or their equivalents, $\sum_{n=0}^{\infty} \int |f_n|\,d\mu$ or $\int \sum_{n=0}^{\infty} |f_n|\,d\mu$, be finite, then it would yield that

and therefore, Fubini's formula (6.18f) would hold true, now for an arbitrary sequence of measurable functions $\{f_n\}$. Notice that, since

is a necessary condition for $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$, it automatically implies that

would be alternative necessary conditions for $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$ (although the latter is by no way a necessary condition for Fubini's Theorem). This version of Fubini's Theorem can compete with Generalized Monotone Convergence Theorem 2.4 and Lebesgue's Dominated Convergence Theorem 2.6 in some applications.

(iv) As an illustration to the last application of Fubini's Theorem, consider a random variable $X$ on a probability space $(\Omega,\Sigma,P)$. The function $m(\theta) = \mathbb{E}[e^{\theta X}]$ (normally, complex-valued) is known to be the moment generating function of $X$. If we expand $e^{\theta X}$ in the Maclaurin series,

$e^{\theta X} = \sum_{n=0}^{\infty} \frac{\theta^n X^n}{n!},$

we will have with

$m(\theta) = \mathbb{E}\Big[\sum_{n=0}^{\infty} \frac{\theta^n X^n}{n!}\Big]$

a scenario of the application of Fubini's Theorem discussed in Example (iii). Hence we have to make sure that, in light of (6.18h), the series

$\sum_{n=0}^{\infty} \frac{|\theta|^n}{n!}\,\mathbb{E}[|X|^n]$

is convergent in some vicinity of $\theta = 0$. [The latter holds for many practical cases, provided that $\mathbb{E}[|X|^n] < \infty$ for all $n$.] Assuming that all absolute moments of $X$ exist and the above series converges, the application of Fubini's Theorem (6.18f) yields that

$m(\theta) = \sum_{n=0}^{\infty} \frac{\theta^n}{n!}\,\mathbb{E}[X^n]$

as a Taylor series expansion of $m(\theta)$ in terms of all moments of the random variable $X$, and consequently that

$\mathbb{E}[X^n] = m^{(n)}(0).$
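The expansion of the moment generating function in terms of moments can be checked on a concrete case. The sketch below assumes, purely for illustration, that $X$ is uniformly distributed on $[0,1]$, so that $\mathbb{E}[X^n] = 1/(n+1)$ and $m(\theta) = (e^\theta - 1)/\theta$; the choice of distribution, the truncation level, and the function names are assumptions not taken from the text.

```python
from math import exp, factorial

def mgf_uniform_closed(theta):
    """m(theta) = E[exp(theta*X)] for X uniform on [0, 1]."""
    return 1.0 if theta == 0 else (exp(theta) - 1) / theta

def mgf_uniform_series(theta, terms=40):
    """Truncated series sum_n theta^n / n! * E[X^n], with E[X^n] = 1/(n+1)."""
    return sum(theta**n / factorial(n) / (n + 1) for n in range(terms))

if __name__ == "__main__":
    for theta in (-2.0, 0.0, 0.5, 3.0):
        print(theta, mgf_uniform_closed(theta), mgf_uniform_series(theta))
```

Because $|X| \le 1$ here, the series of absolute moments converges for every $\theta$, so the interchange of expectation and summation is justified on the whole real line in this particular case.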

(v) Consider Borel-Lebesgue measure $\lambda^2 = \lambda \otimes \lambda$ on the Borel $\sigma$-algebra $\mathfrak{B}^2$. Let $A = \mathbb{Q} \times \mathbb{R}$. According to Problem 3.1, Chapter 5, $A$ is a countable union of Borel-null sets. Thus,

$\int \mathbf{1}_A\,d\lambda^2 = \lambda^2(A) = 0.$

On the other hand, the section $(\mathbf{1}_A)_{\omega_1}$ is not $\lambda$-integrable for all $\omega_1 \in \mathbb{Q}$. This is, however, in agreement with Fubini's Theorem that the function $(\mathbf{1}_A)_{\omega_1}$ is $\lambda$-integrable only for almost all $\omega_1 \in \mathbb{R}$.

(vi) Now we discuss yet another application often occurring in probability theory. Let $\mu_F$ and $\mu_G$ be finite Borel-Lebesgue-Stieltjes measures induced by distribution functions $F, G \in \mathfrak{D}(\mathbb{R},\mathbb{R}_+)$ (see Remark 3.5 (iii), Chapter 5). Recall that

$\lim_{x\to-\infty} F(x) = \lim_{x\to-\infty} G(x) = 0.$

From Problem 3.7, Chapter 5, given a compact interval $I = [a,b]$, we have that

Let $T_u = \{(x,y) \in [a,b]^2 : y > x\}$ and $T_l = \{(x,y) \in [a,b]^2 : y \le x\}$, which are the upper and lower triangles of the square $I^2$, respectively. Now we calculate the measure of $I^2$ under $\mu_F \otimes \mu_G$ by using Theorem 6.10 in terms of Lebesgue-Stieltjes integrals:

Equating (6.18i) and (6.18j) we arrive at

Interchanging the roles of $F$ and $G$ we have

Hence, from (6.18k) and (6.18l) we establish the following integration by parts formula for Lebesgue-Stieltjes integrals:

PROBLEMS

6.1 Let $\mathcal{G}_0$ be the set of all measurable rectangles $A = A_1 \times \cdots \times A_n$, $A_i \in \Sigma_i$. Denote by $\mathcal{A}(\mathcal{G}_0)$ the algebra generated by $\mathcal{G}_0$ and by $\mathcal{C}(\mathcal{G}_0)$ the collection of all finite unions of disjoint rectangles of $\mathcal{G}_0$. Show that $\mathcal{A}(\mathcal{G}_0) = \mathcal{C}(\mathcal{G}_0)$.

6.2 Prove that the section is commutative with respect to all set operations.

6.3 Show the validity of assertion a) in Lemma 6.7.

6.4 Show that a rectangle $R_1 \times R_2 \in \Sigma_1 \otimes \Sigma_2$, where $R_1$ and $R_2$ are not empty, if and only if $R_1 \in \Sigma_1$ and $R_2 \in \Sigma_2$.

6.5 Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces. Show that the product measure $\mu_1 \otimes \mu_2$ is $\sigma$-finite.

6.6 Let $(\mathbb{R}^n,\mathfrak{B}^n,\lambda^n)$ and $(\mathbb{R}^k,\mathfrak{B}^k,\lambda^k)$ be the Borel-Lebesgue measure spaces. Show that

6.7 Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be measure spaces with $\sigma$-finite measures and let $A \in \Sigma_1 \otimes \Sigma_2$. Show that the following statements are equivalent:

6.8 Let $A \subseteq \Omega_1 \times \Omega_2$ and let $\omega_1 \in \Omega_1$. Show that $(\mathbf{1}_A)_{\omega_1} = \mathbf{1}_{A_{\omega_1}}$.

6.9 Show that $f_{\omega_1}^*(A_3) = (f^*(A_3))_{\omega_1}$, $A_3 \subseteq \Omega_3$.

6.10 Prove Proposition 6.13. [Hint: Apply Lemma 6.7 and Problem 4.]

Let $A, B \subseteq \Omega_1 \times \Omega_2$ be two disjoint sets and let $\alpha, \beta \in \mathbb{R}$. Show that

Let $f \in \mathcal{C}_+^{-1}(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ and let $\{s_n\}\uparrow\ \subseteq \Psi_+(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ such that $f = \sup\{s_n\}$. Show that $f_{\omega_1} = \sup\{(s_n)_{\omega_1}\}$. [Hint: Apply Theorem 6.5, Chapter 5, and Problem 6.10.]

Show that $|f|_\omega = |f_\omega|$, $(f^+)_\omega = (f_\omega)^+$, and $(f^-)_\omega = (f_\omega)^-$.

Let $\Sigma_1$ and $\Sigma_2$ be $\sigma$-algebras on $\Omega_1$ and $\Omega_2$, respectively. Show that $\Sigma_1 \odot \Sigma_2$ is a semi-ring.

Let $\mathcal{S}_1$ and $\mathcal{S}_2$ be semi-rings on $\Omega_1$ and $\Omega_2$, respectively. Is $\mathcal{S}_1 \odot \mathcal{S}_2$ also a semi-ring?

What will the smallest algebra generated by $\Sigma_1 \odot \Sigma_2$ from Problem 6.14 look like?

Let $\mu_i$ and $\nu_i$ be finite measures on a measurable space $(\Omega_i,\Sigma_i)$, $i = 1,2$. Show that if $\mu_i$

$\ge 2$ events from $\mathcal{G}$, the following relation holds true:

$P\{A_{i_1} \cap \cdots \cap A_{i_n}\} = P(A_{i_1}) \cdots P(A_{i_n}).$ (7.1a)

Observe that, if $\mathcal{G}$ is an independent family of events then the Dynkin system generated by $\mathcal{G}$ is also independent (see Problem 7.1). If, in addition, $\mathcal{G}$ is $\cap$-stable, then $\mathfrak{D}(\mathcal{G})$ is an independent $\sigma$-algebra.

(ii) Let $\mathfrak{G} = \{\mathcal{G}_i;\ i \in I\} \subseteq \Sigma$ be an indexed collection of families of events. $\mathfrak{G}$ is called independent if, for any finite subset $\{i_1,\ldots,i_n\} \subseteq I$, $n \ge 2$, and for any choice of $A_{i_k} \in \mathcal{G}_{i_k}$, $k = 1,\ldots,n$, the events $A_{i_1},\ldots,A_{i_n}$ are independent.

(iii) Let $\mathfrak{X} = \{X_i;\ i \in I\}$ be an indexed collection of random variables on $(\Omega,\Sigma,P)$. $\mathfrak{X}$ is called independent if the corresponding collection $\{\sigma(X_i);\ i \in I\}$ of $\sigma$-algebras generated by these random variables is independent.

(iv) Let $X_i: \Omega \to \Omega_i$, $i = 1,\ldots,n$, be $\Sigma$-$\Sigma_i$ random variables on $(\Omega,\Sigma,P)$. Then we denote $\bigotimes_{i=1}^{n} X_i = (X_1,\ldots,X_n): \Omega \to \Omega_1 \times \cdots \times \Omega_n$ and call it the product map.

(v) It appears (Problem 7.2) that the product map $\bigotimes_{i=1}^{n} X_i$ is $\Sigma$-$\bigotimes_{i=1}^{n} \Sigma_i$-measurable. Therefore, by letting

we can define a probability measure on $\big(\prod_{i=1}^{n} \Omega_i, \bigotimes_{i=1}^{n} \Sigma_i\big)$ and call it the joint distribution of random variables $X_1,\ldots,X_n$. $\square$

Let $P_{X_i} = PX_i^*$ be the distribution of the random variable $X_i$, $i = 1,\ldots,n$. This is a probability measure on $\Sigma_i$. Then, according to the previous section, we can construct the triple

On the other hand, we already have another measure $P_{\otimes X_i}$ on $\big(\prod_{i=1}^{n} \Omega_i, \bigotimes_{i=1}^{n} \Sigma_i\big)$, which, in general, need not be a product measure. The following statement clarifies the matter.

7.2 Proposition. The joint probability distribution $P_{\otimes X_i}$ is a product measure and equals $\bigotimes_{i=1}^{n} P_{X_i}$ if and only if the random variables $X_1,\ldots,X_n$ are independent. $\square$

(See Problem 7.3.) Note that the treatment of the product $P_{\otimes X_i}$ of more than finitely many independent random variables is more complicated; such a treatment involves the product of infinitely many $\sigma$-algebras and measures.

Another important application of product measures and Fubini's theorem is the notion of "convolution" of measures.

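7.3 Definition. Let $\mathfrak{B}_+(\mathbb{R}^k,\mathfrak{B}(\mathbb{R}^k))$ be the set of all finite Borel measures on $\mathfrak{B}^k = \mathfrak{B}(\mathbb{R}^k)$. Clearly, $\mathfrak{B}_+(\mathbb{R}^k,\mathfrak{B}(\mathbb{R}^k))$ is a semi-linear space over $\mathbb{R}_+$. Let $\mu = \bigotimes_{i=1}^{n} \mu_i$, where $\mu_i \in \mathfrak{B}_+$ (it is easily seen that $\mu \in \mathfrak{B}_+(\mathbb{R}^{kn},\mathfrak{B}^{kn})$). Consider the linear measurable map $L_n: \mathbb{R}^{kn} \to \mathbb{R}^k$ as

$L_n(x_1,\ldots,x_n) = \sum_{i=1}^{n} x_i.$

Then the image measure $\mu L_n^*$ is called the convolution of measures $\mu_1,\ldots,\mu_n$ and it is denoted by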

7.4 Properties of Convolution. l k k k k (i) Let f E C z (W ,% ) and let pl,p2 E B*(W ,% ). Then

=

Sf

0

L2(xl,x2)dpl 8 p2(x1,x2) (by Fubini's theorem)

For f E C -'(IRk,ak) and p1,p2 E B,, we require that f = f + - f be p1*p2- (or p2*p1-) integrable to have (7.4a) valid. Specifically, let f = lA,A E %k. Then,

380


(since lAo L =

where T(xl) = xl

+ x2 and

TL(y) = y - 1 2 )

Applying Fubini's Theorem to (7.4a) (i-e. interchanging the integration) we also get

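But the expression on the right is exactly $(\mu_2 * \mu_1)(A)$, which implies commutativity of the convolution.

(ii) If $\mu_1, \mu_2, \nu \in \mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$, then $\mu_1 + \mu_2 \in \mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$, and we have by (7.4b) that

$$(\mu_1 + \mu_2) * \nu = \mu_1 * \nu + \mu_2 * \nu.$$

Thus, the convolution is distributive.

(iii) Let $\{\nu, \mu_n\} \subseteq \mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$ be such that $\sum_{n=1}^{\infty} \mu_n \in \mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$. Then, by the same argument as in (ii), we get

$$\Big(\sum_{n=1}^{\infty} \mu_n\Big) * \nu = \sum_{n=1}^{\infty} (\mu_n * \nu),$$

i.e. the convolution is also $\sigma$-distributive.

(iv) Let $\mu_1$ and $\mu_2$ be as above and let $\beta \in \mathbb{R}_+ \setminus \{0\}$. Then, it is easily seen that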

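7.5 Examples.

(i) Let $\mu_1 = \varepsilon_a$ and $\mu_2 = \varepsilon_b \in \mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$. Then by (2.2),

$$(\varepsilon_a * \varepsilon_b)(A) = \varepsilon_a(A - b) = \varepsilon_{a+b}(A),$$

since $\varepsilon_a(A - b) = 1_{A-b}(a) = 1_A(a + b)$ for fixed $a, b, A$. We can therefore write

$$\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}. \tag{7.5a}$$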

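(ii) Let $\beta_{m,p}$ and $\beta_{n,p}$ be binomial measures introduced in Example 1.8 (iii), Chapter 5. We find the convolution of these measures by applying (7.4c)-(7.5a):

$$\beta_{m,p} * \beta_{n,p} = \sum_{i=0}^{m} \sum_{k=0}^{n} \binom{m}{i} \binom{n}{k} p^{i+k} (1-p)^{m+n-i-k}\, \varepsilon_{i+k}.$$

Denoting $i + k = j$ and renumbering the second sum, we have

$$\beta_{m,p} * \beta_{n,p} = \sum_{j=0}^{m+n} \Big[ \sum_{i} \binom{m}{i} \binom{n}{j-i} \Big] p^{j} (1-p)^{m+n-j}\, \varepsilon_j.$$

The middle sum is $\binom{m+n}{j}$ by a known combinatorial identity. Therefore, $\beta_{m,p} * \beta_{n,p} = \beta_{m+n,p}$.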

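(iii) Convolution of atomic measures. Let $\mu_1 = \sum_{i=0}^{\infty} \alpha_i \varepsilon_i$ and $\mu_2 = \sum_{j=0}^{\infty} \beta_j \varepsilon_j \in \mathfrak{B}_*$. Then

$$\mu_1 * \mu_2 = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \alpha_i \beta_j\, \varepsilon_{i+j}.$$

Substituting $k$ for $i + j$ we have

$$\mu_1 * \mu_2 = \sum_{k=0}^{\infty} \Big( \sum_{i=0}^{k} \alpha_i \beta_{k-i} \Big) \varepsilon_k. \tag{7.5b}$$

The expression $\sum_{i=0}^{k} \alpha_i \beta_{k-i} = \gamma_k$ is known as the convolution in the product of power series.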


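(iv) Consider the following special case of (iii) by taking $\mu_1 = \pi_a$ and $\mu_2 = \pi_b$, Poisson measures with parameters $a$ and $b$ (introduced in Example 1.8 (iv), Chapter 5). By formula (7.5b), we therefore have

$$\pi_a * \pi_b = \sum_{k=0}^{\infty} \exp(-(a+b))\, \frac{1}{k!} \sum_{i=0}^{k} \binom{k}{i} a^i b^{k-i}\, \varepsilon_k = \sum_{k=0}^{\infty} \exp(-(a+b))\, \frac{(a+b)^k}{k!}\, \varepsilon_k = \pi_{a+b}.$$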
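The closed forms obtained in Examples 7.5 (ii) and (iv), that binomial measures with a common $p$ and Poisson measures are closed under convolution, can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the helper names (`convolve`, `binomial_weights`, `poisson_weights`) and all parameter values are hypothetical, and the Poisson series is truncated to finitely many terms.

```python
import math

def convolve(alpha, beta):
    """Return gamma with gamma[k] = sum_i alpha[i] * beta[k - i], as in (7.5b)."""
    gamma = [0.0] * (len(alpha) + len(beta) - 1)
    for i, a_i in enumerate(alpha):
        for j, b_j in enumerate(beta):
            gamma[i + j] += a_i * b_j
    return gamma

def binomial_weights(n, p):
    """Weights of the binomial measure with parameters n and p on {0, ..., n}."""
    return [math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]

def poisson_weights(a, terms):
    """First `terms` weights of the Poisson measure with parameter a."""
    return [math.exp(-a) * a**k / math.factorial(k) for k in range(terms)]

# Binomial case (hypothetical parameters m = 3, n = 5, p = 0.3).
bin_conv = convolve(binomial_weights(3, 0.3), binomial_weights(5, 0.3))
bin_error = max(abs(x - y) for x, y in zip(bin_conv, binomial_weights(8, 0.3)))

# Poisson case (hypothetical parameters a = 1.5, b = 2.0). The k-th weight of
# the convolution only involves indices <= k, so the first TERMS weights are
# not affected by the truncation of the series.
TERMS = 40
poi_conv = convolve(poisson_weights(1.5, TERMS), poisson_weights(2.0, TERMS))[:TERMS]
poi_error = max(abs(x - y) for x, y in zip(poi_conv, poisson_weights(3.5, TERMS)))

print(bin_error, poi_error)
```

Both errors should be at the level of floating-point rounding, which is consistent with (but of course does not prove) the two identities.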

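7.6 Remark. Let $f \ge 0$ be an element of $L^1(\mathbb{R}^k, \mathcal{B}^k, \lambda^k)$ and let $\mu$ be the measure generated by $\int f\, d\lambda^k$. Therefore, $\mu \in \mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$. Let $\nu$ be another measure from $\mathfrak{B}_*(\mathbb{R}^k, \mathcal{B}^k)$. We wonder about the convolution $\mu * \nu$. We have by (7.4a) and with $L_2(x,y) = x + y$ that

$$(\mu * \nu)(A) = \iint 1_A(x + y)\, f(x)\, \lambda^k(dx)\, \nu(dy) = \iint 1_A(x)\, f(x - y)\, \lambda^k(dx)\, \nu(dy) = \int_A \varphi(x)\, \lambda^k(dx) \quad \text{[by Fubini's theorem]},$$

where $\varphi(x) = \int f(x - y)\, \nu(dy) \ge 0$ is, by Fubini's theorem, $\mathcal{B}^k$-measurable and $\int \varphi\, d\lambda^k$ is obviously finite. Therefore, $\varphi \in L^1(\lambda^k)$ and we conclude that $\mu * \nu$ has a density relative to $\lambda^k$. We rewrite the above equation in the form

$$\mu * \nu = \int \varphi\, d\lambda^k. \tag{7.6a}$$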

7.7 Definition. The function $\varphi$ in (7.6a) is called the convolution of the function $f$ and the measure $\nu$ and it will be denoted by $\varphi = f * \nu$. □

Observe that, since $\mu * \nu = \nu * \mu$, we have that $f * \nu = \nu * f$.
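For an atomic measure $\nu$ the convolution of a function and a measure reduces to a finite weighted sum of shifted copies of $f$. The following Python sketch illustrates this in dimension one; the density `f`, the atoms and masses in `NU`, and the midpoint rule used for the integral are all assumptions chosen for the example, not part of the text.

```python
def f(x):
    """Density of the uniform distribution on [0, 1) (an assumed example)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

# nu = 0.25 * eps_0 + 0.75 * eps_2, stored as (atom, mass) pairs (assumed values).
NU = [(0.0, 0.25), (2.0, 0.75)]

def conv_function_measure(x):
    """(f * nu)(x) = integral of f(x - y) nu(dy); a finite sum for atomic nu."""
    return sum(mass * f(x - atom) for atom, mass in NU)

def midpoint_integral(func, left, right, cells):
    """Midpoint-rule approximation of the integral of func over [left, right]."""
    width = (right - left) / cells
    return sum(func(left + (i + 0.5) * width) for i in range(cells)) * width

# The total mass of f * nu should equal (integral of f) * nu(R) = 1 * 1.
total_mass = midpoint_integral(conv_function_measure, -1.0, 4.0, 5000)
print(conv_function_measure(0.5), conv_function_measure(2.5), total_mass)
```

Here $f * \nu$ is the density of a mixture: a copy of $f$ with weight 0.25 and a copy shifted by 2 with weight 0.75.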

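7.8 Remark. Let $f, g \in L^1(\lambda^k)$ and $f, g \ge 0$. Let $\mu = \int f\, d\lambda^k$ and $\nu = \int g\, d\lambda^k$. Then $\mu, \nu \in \mathfrak{B}_*$; and we obtain, by using Proposition 5.3, that

$$(f * \nu)(x) = \int f(x - y)\, \nu(dy) = \int f(x - y)\, g(y)\, \lambda^k(dy).$$

Now we have from (7.6a),

$$\mu * \nu = \int (f * g)\, d\lambda^k,$$

where we denoted

$$(f * g)(x) = \int f(x - y)\, g(y)\, \lambda^k(dy). \tag{7.8a}$$

The function $f * g$ is obviously integrable and we call it the convolution of $f$ and $g$. □

The definition of the convolution for functions can be extended from (7.8a) to real-valued integrable functions. However, we shall refrain from connecting it with the convolution for measures, since it will require a background on signed measures (in Chapter 8).

7.9 Example. In probability theory, the convolution finds its application for the distribution of the sum of independent random variables. Let $X_1,\ldots,X_n$ be independent random variables valued in $\mathbb{R}^k$ with their distributions $P_{X_i}$, $i = 1,\ldots,n$. Let $S = \sum_{i=1}^{n} X_i$ and $L_n$ be as in Definition 7.3. Then

$$S = L_n \circ \bigotimes_{i=1}^{n} X_i$$

and thus

$$P_S = P_{\otimes X_i} L_n^{*}.$$

Since $X_1,\ldots,X_n$ are independent, by Proposition 7.2,

$$P_{\otimes X_i} = \bigotimes_{i=1}^{n} P_{X_i}$$

and, following Definition 7.3, we obtain that

$$P_S = P_{X_1} * \cdots * P_{X_n}.$$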
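The conclusion of Example 7.9, that the distribution of a sum of independent random variables is the convolution of their distributions, can be illustrated on two fair dice. The Python sketch below is a minimal check under assumptions: the dictionaries and the helper `convolve` are hypothetical names, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

def convolve(p, q):
    """Convolution of two atomic measures given as {atom: mass} dictionaries."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, 0) + px * qy
    return out

# Distribution of one fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Distribution of the sum, obtained as the convolution P_{X_1} * P_{X_2}.
sum_by_convolution = convolve(die, die)

# The same distribution obtained directly as the image of the product measure
# under the map (x_1, x_2) -> x_1 + x_2.
sum_by_enumeration = {}
for a, b in product(range(1, 7), repeat=2):
    sum_by_enumeration[a + b] = sum_by_enumeration.get(a + b, 0) + Fraction(1, 36)

print(sum_by_convolution == sum_by_enumeration)
```

The second computation uses independence explicitly: each pair of faces receives the product mass 1/36.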


PROBLEMS

7.1 Let $(\Omega, \Sigma, P)$ be a probability space and let $\mathcal{G}$ be an independent family of events. Show that the Dynkin system $\mathcal{D}(\mathcal{G})$ generated by $\mathcal{G}$ is also independent.

7.2 Show that the product map $\bigotimes_{i=1}^{n} X_i$ introduced in Definition 7.1 (iv) is $\Sigma$-$\bigotimes_{i=1}^{n} \Sigma_i$-measurable.

7.3 Prove Proposition 7.2.

7.4 Let $L_+$ denote the space of all real-valued nonnegative $\lambda^k$-integrable functions. Show that $(L_+, *)$, with the binary operator $*$ defined by (7.8a), is a semi-%-space (over the semifield $\mathbb{R}_+$) in light of Remark 6.2 (ii), Chapter 5.

7.5 Let $\gamma_{a,\sigma^2}$ denote the probability distribution of a normal random variable $X$ with density $f(a, \sigma^2)$ defined in Example 5.10 (iii) and let $\gamma_{b,\delta^2}$ be the probability distribution of another normal random variable $Y$. Show that, if $X$ and $Y$ are independent, then $P_{X+Y} = \gamma_{a+b,\, \sigma^2 + \delta^2}$.

NEW TERMS:
independent random variables
independent family of random variables
product map
joint distribution of random variables
convolution of measures
convolution of measures, properties of
convolution of point masses
convolution of binomial measures
convolution of atomic measures
convolution of Poisson measures
convolution of a function and measure
convolution of functions
sum of independent random variables
sum of independent normal random variables

Chapter 7

Calculus in Euclidean Spaces

This chapter is about Lebesgue integration in Euclidean spaces and deals primarily with change of variables techniques. As a mandatory preliminary and for consistency, it begins (in Section 1) with differentiation. Any standard analysis textbook can serve as an alternative refresher. Although the Euclidean space is the chief application in this chapter, for didactical purposes, we allow ourselves to introduce certain concepts for Banach spaces.

1. DIFFERENTIATION

1.1 Definition. Let $X$ and $X'$ be Banach spaces. A function $F: X \to X'$ is said to satisfy a Lipschitz condition on a subset $\Omega \subseteq X$, if there is a constant $K$, called the Lipschitz constant on $\Omega$, such that $\|F(x) - F(y)\| \le K \|x - y\|$ for all $x, y \in \Omega$. Clearly, a function $F$ that satisfies a Lipschitz condition on a set $\Omega$ is uniformly continuous on $\Omega$. If the Lipschitz constant is zero, then $F = \text{const}$ on $\Omega$.

1.2 Remarks. Let $X = \mathbb{R}^n$ and $X' = \mathbb{R}^m$, both endowed with Euclidean norms, and let $L$ be a linear operator from $\mathbb{R}^n$ into $\mathbb{R}^m$. It is known that any linear operator can be expressed by a matrix. Conversely, any $m \times n$ matrix, say $A$, represents a linear operator, so that $L(x) = Ax$, for each $x \in \mathbb{R}^n$. The Euclidean or Frobenius norm of matrix $A = (a_{ij})$ is defined as

$$|||A|||_e = \Big( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \Big)^{1/2}.$$

There are a few other norms we are going to use in the sequel. Before we introduce them, note that the notation $|||\cdot|||$ for a matrix norm is used when, besides the usual properties of the norm, a matrix norm is submultiplicative, i.e., if

$$|||AB||| \le |||A|||\, |||B|||. \tag{1.2a}$$

One can show (Problem 1.1) that the Frobenius norm is submultiplicative. The matrix supremum norm is defined as

$$\|A\|_{\sup} = \max_{i,j} |a_{ij}|.$$

It is not submultiplicative. (A counterexample is $\|A^2\|_{\sup} > \|A\|_{\sup}^2$, where $A$ is the $2 \times 2$ matrix with $a_{ij} = 1$ for all $i$ and $j$.) The maximum row sum matrix norm is defined as

$$|||A|||_{\infty} = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|.$$

One can show (Problem 1.2) that the maximum row sum matrix norm is submultiplicative. We will outline the following properties of matrix-vector norms. We assume that $A$ is an $m \times n$ matrix and $x \in \mathbb{R}^n$. (i)

II AxII .
0,

for all $x \in B$, where $B$ is a Borel subset of $\Omega$. Then,

Proof. (i) Suppose $B$ is an open and bounded set such that $\bar{B} \subseteq \Omega$. We prove (2.5a) under the assumption that (2.5) holds true for all $x \in \bar{B}$. Denote $\Phi(x) = (F^{-1})'(F(x))$. If $F^{-1}(y) = (g_1(y),\ldots,g_n(y))^T$, then obviously

2. Change of Variables

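Since $\Phi(x_0)F(x)$ represents a linear map applied to $F(x)$, by the chain rule,

$$[\Phi(x_0)F(x)]' = \Phi(x_0)F'(x).$$

By Example 1.20, $(F^{-1})'(F(x_0))\, F'(x_0) = I$. Thus,

$$[\Phi(x_0)F(x) - x]' = \Phi(x_0)F'(x) - I = \Phi(x_0)\,[F'(x) - F'(x_0)],$$

and this turns out to be the product of matrices $(F^{-1})'(F(x_0))$ and $F'(x) - F'(x_0)$. Since the Frobenius norm is submultiplicative (see (1.2a)),

$$|||[\Phi(x_0)F(x) - x]'|||_e \le |||\Phi(x_0)|||_e\, |||F'(x) - F'(x_0)|||_e. \tag{2.5b}$$

Since $\Phi$ is continuous and $\bar{B}$ is compact, $\Phi$ is bounded on $\bar{B}$ (in terms of the Frobenius norm) and so it is on $B$. Hence, there is an $M > 0$ such that

$$|||\Phi(x)|||_e \le M \quad \text{for all } x \in B. \tag{2.5c}$$

As a $C^1$-map, $F'$ is continuous on $\bar{B}$ and because $\bar{B}$ is compact, $F'$ is therefore uniformly continuous, i.e., for every $\varepsilon > 0$, there is a $\delta > 0$ such that, for all $x, y \in \bar{B}$ with $\|x - y\| < \delta$,

$$|||F'(x) - F'(y)|||_e < \frac{\varepsilon}{M}. \tag{2.5d}$$

Combining (2.5c) and (2.5d) we have from (2.5b) that

$$|||[\Phi(x_0)F(x) - x]'|||_e < \varepsilon \quad \text{given } \|x - x_0\| < \delta. \tag{2.5e}$$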


By Problem 2.10, Chapter 4, $B$, as an open set, can be represented as at most a countable union of disjoint semi-open cubes $\{C_k\}$ with edges parallel to the coordinate axes. Obviously, we can assume that the edge of each cube does not exceed $2\delta$ or, otherwise, we can subdivide the edges accordingly if necessary. Now, if $x_0$ is the center of such a cube, then $\|x - x_0\| \le \delta$ for any $x$ from the cube. From Problem 1.13,

0. Let $\varepsilon = \tfrac{1}{2} S_0$. Then, $C_1$ is such that $\nu(C_1) \ge \tfrac{1}{2} S_0$. Now, if $B_1 = A \setminus C_1$ is $\nu$-negative, then we are done with the proof. Indeed, $\nu(B_1) = \nu(A) - \nu(C_1)$, by Proposition 1.3 (i), and because $\nu(C_1) > 0$, $\nu(B_1) < \nu(A)$. Otherwise, there is at least one subset of $B_1$ whose measure is strictly positive. Continuing with the same procedure, at step $n$ we arrive at a set $B_n$, which is either a $\nu$-negative set satisfying $\nu(B_n) < \nu(A)$ or it admits at least one subset with a positive value under $\nu$. This again leads to a positive real number $S_n$ and the existence of a nontrivial set $C_{n+1}$ such that $\nu(C_{n+1}) \ge \tfrac{1}{2} S_n > 0$. If for no $n$, $B_n$ defined above is negative, then we set

$$N = A \setminus \sum_{n=1}^{\infty} C_n.$$

We show that $N$ is a negative subset of $A$ claimed in the statement of

1. Signed and Complex Measures

the lemma. From

we see that both $\nu(N)$ and $\sum_{n=1}^{\infty} \nu(C_n)$ are finite. The latter implies that $\nu(C_n)$ and, consequently, $S_n$, dominated by $2\nu(C_{n+1})$, are vanishing. (Notice that, because $\nu\big(\sum_{n=1}^{\infty} C_n\big) = \sum_{n=1}^{\infty} \nu(C_n) > 0$, $N \ne \emptyset$.) This in turn yields that $N$ is negative. Indeed, from the definition of $S_n$, for every measurable subset $E$ of $B_n$, $\nu(E) \le S_n$. Since $B_n \supseteq N$, it follows that for every measurable set $D$, $\nu(N \cap D) \le S_n \downarrow 0$. Finally, that $\nu(N) < \nu(A)$ is obvious.

$0\}$ and $I = \{f < 0\}$.

(ii) If $\mu$ and $\rho$ are two positive measures (of which at least one is finite), the difference $\nu = \mu - \rho$ is a signed measure. However, it is not, in general, the Jordan decomposition of $\nu$. Let $\nu$ be a signed measure on the measurable space $(\Omega, \Sigma)$. Denote by $\nu_E = \mathrm{Res}_{\Sigma \cap E}\, \nu$, where $E$ is a measurable set. To obtain the Jordan decomposition of $\nu = \mu - \rho$, we need any Hahn decomposition of $\Omega$ with respect to $\nu$. Say, $(P, N)$ is one. Then, from Corollary 1.7,

$$\nu^+ = \nu_P \quad \text{and} \quad \nu^- = -\nu_N.$$

We can also make use of formulas (i) and (ii) of Proposition 1.9 to determine the positive and negative variations.



(iii) Let $\varepsilon_0$ be the point mass and $P$ a probability measure on $(\mathbb{R}, \mathcal{B})$. We find a Hahn decomposition of the signed measure $\nu = P - \varepsilon_0$. We show that $I = \{0\}$ is a $\nu$-minimal (and negative) set discussed in Remark 1.10. For an $A \in \mathcal{B}$, $\nu(A \cap I^c) = P(A \cap I^c) \ge 0$, and either

$$\nu(A \cap I) = P(\{0\}) - \varepsilon_0(\{0\}) = P(\{0\}) - 1, \quad \text{with } 0 \in A,$$

or

$$\nu(A \cap I) = 0, \quad \text{with } 0 \notin A,$$

which implies that $\nu(A \cap I) \le 0$. Using relations (1.11) and (1.11a), we have the Jordan decomposition of $\nu$: $\nu^+(A) = \nu(A \cap I^c) = P(A \cap I^c)$, and $\nu^-(A) = 1_A(0)(1 - P(\{0\}))$. Note that $\{0\} = I$ is the set where $\nu$ attains its minimum.

(iv) Let $\nu = \lambda - \mu$, where $\lambda$ is the Lebesgue measure on $(\mathbb{R}, \mathcal{B})$ and $\mu$ is the geometric measure defined as

Clearly, $N = \{1, 2, \ldots\}$ is a negative set relative to $\nu$, whereas $P = N^c$ is a positive set. Thus, $(P, N)$ is a Hahn decomposition of $\mathbb{R}$ relative to $\nu$ and, consequently, for every Borel set $A$,

$$\nu^+(A) = (\lambda - \mu)(A \cap \{1, 2, \ldots\}^c) \quad \text{and} \quad \nu^-(A) = (\mu - \lambda)(A \cap \{1, 2, \ldots\})$$

represent the Jordan decomposition of $\nu$. Since $N$ is a $\lambda$-null set, the latter reduces to

$$\nu^-(A) = \mu(A \cap \{1, 2, \ldots\}).$$

Therefore, $\nu$ attains its minimum at $N$ and its value is $-1$, while the maximum value of $\nu$ is $\infty$ and it is attained at $N^c$. □
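On a finite set, a Hahn decomposition and the Jordan decomposition can be read off from the point masses. The Python sketch below mimics Example (iii) with a signed measure of the form $P - \varepsilon_0$; the four-point set, the uniform choice of $P$, and all function names are assumptions made only for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# nu = P - eps_0 on the finite set {0, 1, 2, 3}, with P uniform (assumed example).
POINTS = [0, 1, 2, 3]
weights = {w: Fraction(1, 4) for w in POINTS}
weights[0] -= 1  # subtract the point mass at 0

def nu(A):
    """Value of the signed measure on a subset A of POINTS."""
    return sum((weights[w] for w in A), Fraction(0))

# A Hahn decomposition: points of negative mass versus the rest.
negative_set = {w for w in POINTS if weights[w] < 0}
positive_set = set(POINTS) - negative_set

def nu_plus(A):
    """Positive variation: nu restricted to the positive set."""
    return nu(set(A) & positive_set)

def nu_minus(A):
    """Negative variation: minus nu restricted to the negative set."""
    return -nu(set(A) & negative_set)

# Brute-force check that nu attains its minimum on the negative set.
all_subsets = [set(c) for r in range(len(POINTS) + 1) for c in combinations(POINTS, r)]
minimum = min(nu(A) for A in all_subsets)

print(negative_set, nu_plus({0, 1}), nu_minus({0, 1}), minimum)
```

In this toy case $\nu^-(\{0,1\}) = 1 - P(\{0\}) = 3/4$, matching the formula for $\nu^-$ in Example (iii).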

The next embellishment of the notion of measure is a complex measure.


CHAPTER 8. ANALYSIS IN ABSTRACT SPACES

1.12 Definition. Let $(\Omega, \Sigma)$ be a measurable space. A set function $\nu$ on $\Sigma$ is said to be a complex measure if:

(i) $\nu$ is valued in $\mathbb{C}$. [Notice that being valued in $\mathbb{C}$, $\nu$ cannot have infinite values, and therefore, of those signed or positive measures only finite ones can be qualified as complex measures.]

(ii) $\nu(\emptyset) = 0$.

(iii) $\nu$ is $\sigma$-additive. [Analogously to the signed measures (see Remark 1.2 (ii)), $\sigma$-additivity of $\nu$,

$$\lim_{n \to \infty} \Big| \nu\Big(\sum_{k=1}^{\infty} A_k\Big) - \sum_{k=1}^{n} \nu(A_k) \Big| = 0$$

(where $|\cdot|$ stands for the two-dimensional Euclidean norm), implies that the series $\sum_{n=1}^{\infty} \nu(A_n)$ is also absolutely convergent.]

The triple $(\Omega, \Sigma, \nu)$ is referred to as a complex measure space. Now, we use a similar concept in Proposition 1.9 (iii) to define the total variation of a complex measure.

(i) Given a complex measure space $(\Omega, \Sigma, \nu)$, the complex measure $\nu$ can be represented as $\nu = \nu_1 + i\nu_2$, where $\nu_1$ and $\nu_2$ are finite signed measures on $\Sigma$. Hahn decompositions should then be applied for $\nu_1$ and $\nu_2$ and their corresponding Jordan decompositions will yield

$$\nu = \nu_1^+ - \nu_1^- + i(\nu_2^+ - \nu_2^-), \tag{1.13}$$

with $\nu_1^+$, $\nu_1^-$, $\nu_2^+$ and $\nu_2^-$ being positive finite measures. We will call (1.13) the Jordan decomposition of the complex measure $\nu$.

(ii) For each measurable set $A$, the total variation $|\nu|(A)$ of a complex measure $\nu$ is defined as

$$|\nu|(A) = \sup\Big\{ \sum_{k=1}^{n} |\nu(A_k)| : \{A_1,\ldots,A_n\} \text{ a finite measurable partition of } A \Big\}.$$
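For a complex measure on a small finite set the supremum in the definition of the total variation can be evaluated by brute force over all partitions. The Python sketch below does this for a four-point set; the point masses in `MASSES` and the helper `set_partitions` are assumptions introduced for the illustration. It also compares the result with the sum of the total variations of the real and imaginary parts.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]

# A complex measure on a four-point set, given by its point masses (assumed values).
MASSES = {"a": 1 + 2j, "b": -3 + 0.5j, "c": -1j, "d": 2 - 2j}

def nu(block):
    """Value of the complex measure on a block of points."""
    return sum(MASSES[w] for w in block)

partitions = list(set_partitions(list(MASSES)))
total_variation = max(sum(abs(nu(block)) for block in p) for p in partitions)

# By the triangle inequality the finest partition should attain the supremum.
pointwise_sum = sum(abs(m) for m in MASSES.values())

# Sum of the total variations of the real and imaginary parts on the whole set.
bound = sum(abs(m.real) + abs(m.imag) for m in MASSES.values())

print(len(partitions), total_variation, pointwise_sum, bound)
```

On a finite set the supremum is thus simply the sum of the moduli of the point masses, and it does not exceed the sum of the variations of the real and imaginary parts.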

1.14 Proposition. The total variation of a complex measure $\nu$ on $(\Omega, \Sigma)$ is a finite positive measure on $(\Omega, \Sigma)$.

Proof. Let $\{A_1,\ldots,A_n\}$ be a measurable partition of a set $A \in \Sigma$.


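Because

$$\sqrt{(a - b)^2 + (c - d)^2} \le a + b + c + d$$

for nonnegative real numbers $a, b, c, d$, and due to Proposition 1.9 (iii) we have

$$\sum_{k=1}^{n} |\nu(A_k)| \le \sum_{k=1}^{n} \big( |\nu_1|(A_k) + |\nu_2|(A_k) \big) = |\nu_1|(A) + |\nu_2|(A)$$

and therefore,

$$|\nu|(A) \le |\nu_1|(A) + |\nu_2|(A). \tag{1.14}$$

Consequently, the total variation of any measurable set is a real nonnegative number. Obviously, $|\nu|(\emptyset) = 0$. Now we show that $|\nu|$ is an additive set function. Let $A$ and $B$ be disjoint measurable sets and let $\{E_1,\ldots,E_n\}$ be a measurable partition of $A + B$. Then,

$$\nu(E_k) = \nu(E_k \cap A) + \nu(E_k \cap B), \quad k = 1,\ldots,n,$$

and the triangle inequality of the Euclidean norm yield that

$$\sum_{k=1}^{n} |\nu(E_k)| \le \sum_{k=1}^{n} |\nu(E_k \cap A)| + \sum_{k=1}^{n} |\nu(E_k \cap B)| \le |\nu|(A) + |\nu|(B)$$

and therefore,

$$|\nu|(A + B) \le |\nu|(A) + |\nu|(B). \tag{1.14a}$$

The inverse inequality is due to the following. Given a measurable partition $\{E_1,\ldots,E_n\}$ of $A + B$, it holds true that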



Applying the supremum twice to the left-hand side of the above inequality we arrive at the desired inverse to inequality (1.14a). Hence, we showed that the total variation of $\nu$ is a finite content on $\Sigma$. Finally, by Proposition 1.7 (ii), $|\nu|$ is $\sigma$-additive if it is $\emptyset$-continuous. This readily follows from (1.14) and the fact that $\nu_1^+$, $\nu_1^-$, $\nu_2^+$, and $\nu_2^-$, as positive measures, are $\emptyset$-continuous. □

1.15 Remarks.

(i) Notice that there is a slight difference in the definition of the total variation of a signed measure and a complex measure, but according to Problem 1.13, they agree in the case of finite signed measures.

(ii) While the set $\mathfrak{S}(\Omega, \Sigma)$ of all signed measures is not a linear space (the sum of two signed measures need not be a signed measure, as we can arrive at $\infty - \infty$), the space $\mathfrak{C}(\Omega, \Sigma)$ of all complex measures (over the field $\mathbb{C}$) is. It is easy to verify that $\|\nu\|$ defined as $|\nu|(\Omega)$ is a norm and therefore upgrades $\mathfrak{C}(\Omega, \Sigma)$ to a normed linear space. It can be shown (Problem 1.14) that $(\mathfrak{C}(\Omega, \Sigma), \|\cdot\|)$ is even a Banach space.

(iii) In the subspace $\mathfrak{S}_*(\Omega, \Sigma)$ of finite signed measures we can introduce the partial order relation as $\nu_1 \le \nu_2$ if and only if $\nu_1(A) \le \nu_2(A)$ for each measurable set $A$. The lattice operations are defined as follows:

$$(\nu \vee \rho)(A) = \sup\{\nu(E) + \rho(A \setminus E) : E \in \Sigma \cap A\},$$

$$(\nu \wedge \rho)(A) = \inf\{\nu(E) + \rho(A \setminus E) : E \in \Sigma \cap A\}.$$

It is obvious that $\nu \wedge \rho \le \nu \le \nu \vee \rho$ and $\nu \wedge \rho \le \rho \le \nu \vee \rho$. It can also be readily shown (Problem 1.15) that

and therefore, $(\mathfrak{S}_*(\Omega, \Sigma), \|\cdot\|, \le)$ is a Banach lattice.

The following is an embellishment of the integral notion of real- and complex-valued functions with respect to signed and complex measures.

1.16 Definitions.

(i) Let $[\Omega, \mathbb{C}, f = (u, v)]$ be a complex-valued function. Given a $\sigma$-algebra $\Sigma$ in $\Omega$, $f$ is measurable if for every Borel set $B \in \mathcal{B}(\mathbb{R}^2)$, $f^*(B) \in \Sigma$, as usual. By using the projection operators one can easily show that $f$ is $\Sigma$-$\mathcal{B}(\mathbb{R}^2)$ measurable if and only if $u$ and $v$ are $\Sigma$-$\mathcal{B}(\mathbb{R})$ measurable. Now, given a positive measure $\mu \in \mathfrak{M}(\Omega, \Sigma)$, we say that $f \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$ or $f$ is $\mu$-integrable if $|f| \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$. Since $|f| \le |u| + |v| \le 2|f|$, $f$ is integrable if and only if both

Proof. The proof of the theorem includes three objectives: 1) Show that given [g], E L(R,

z,p;R) 1 , I dg],

E5 (:

, i.e.,

that

for each g E [g],, v = gdp defines a signed measure, absolutely continuous with respect to p. This is readily done. Since g E L, v is a signed measure. The proof that v 0. If v(A fl EC) < oo, then A fl ECE

r, and thus

E U (A n EC) E r. The latter yields that

and this contradicts p(E) = S. Thus v(A n EC) = oo. b) Let p ( A n EC)= 0. Then since v

- 1. Due to case 3, Let R = c:= for each n, there is a G niln-3 + -measurable function [a,, R + ,i j J , such

2. Absolute Continuity that

for all A E E. Denoting gn = ij la

n

we have

and thus 00

where, by the Monotone Convergence Theorem, g = En = lgn. Therefore, given two positive measures p and v such that p is afinite, v is arbitrary, and v 0. Then for each n = 1, 2,. .., the conditional probability

(i)

defines the probability measure $P_{H_n}$ on the new measurable space $(\Omega, \Sigma \cap H_n)$, where

Thus, the expected value of $X$ with respect to measure $P_{H_n}$ is then

$$E[X \mid H_n] = \int X\, dP_{H_n} = \frac{1}{P(H_n)} \int_{H_n} X\, dP,$$

which is called the conditional expectation of $X$ given the hypothesis $H_n$. Observe that the value $E[X \mid H_n]$ is a constant (random variable). Now consider the random variable

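which is $\Sigma_0$-$\mathcal{B}$-measurable, where $\Sigma_0 = \sigma(\{H_n\})$ is a $\sigma$-algebra generated by the sequence of hypotheses $\{H_n\}$. Obviously, $\Sigma_0 = \{\Omega, \emptyset, A = \sum_{i \in I} H_i : I \subseteq \mathbb{N}\}$. Hence, for every $A \in \Sigma_0$ (which is a union of some $H_i$'s):

$$\int_A X_0\, dP = \sum_{i \in I} E[X \mid H_i]\, P(H_i) = \sum_{i \in I} \int_{H_i} X\, dP = \int_A X\, dP.$$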

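The random variable $X_0$ is then a version of the conditional expectation $E[X \mid \Sigma_0]$ that belongs to the class $[X_0]_P$.

(ii) We consider a special case of the above example. Let $\Omega = [0,1)$, $\Sigma = \mathcal{B} \cap [0,1)$ and $P = \mathrm{Res}_{\Sigma}\, \lambda$ (where $\lambda$ denotes the Borel-Lebesgue measure). As decomposition, take $\Omega = \sum_{k=1}^{n} H_k$, where $H_k = \big[\tfrac{k-1}{n}, \tfrac{k}{n}\big)$. Let $X(\omega) = \omega$, for all $\omega \in \Omega$. Then,

$$P(H_k) = \frac{1}{n}$$

and

$$\int_{H_k} X\, dP = \frac{2k - 1}{2n^2}.$$

Thus, from (2.7),

$$E[X \mid H_k] = \frac{2k - 1}{2n}$$

and from (2.7a),

$$X_0 = \sum_{k=1}^{n} \frac{2k - 1}{2n}\, 1_{H_k}$$

as a version of the conditional expectation $E[X \mid \Sigma_0]$, where $\Sigma_0 = \sigma(H_1,\ldots,H_n)$.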

(iii) Let $X$ and $Y$ be two random variables on a probability space $(\Omega, \Sigma, P)$. Then, $\Sigma_0 = \sigma(Y)$ is a sub-$\sigma$-algebra of $\Sigma$ generated by $Y$. The corresponding conditional expectation of $X$ given $\Sigma_0$ is denoted $E[X \mid Y]$ or $E^{Y}[X]$. □

2.8 Remarks.

(i) Observe that from (2.6a) and (2.6b) it does not follow that $E^{\Sigma_0}[X] = X$ (mod $P$), because $X$ need not be $\Sigma_0$-measurable. However, $E^{\Sigma_0}[X] = X$ (mod $P$) if $X$ is $\Sigma_0$-measurable (see Problem 2.10).

(ii) Note that if two random variables $X$ and $Y$ belong to the same equivalence class, we would normally write $X = Y$ (mod $P$) or $X = Y$ $P$-a.e. on $\Omega$. In probability, however, the latter is usually denoted by $X = Y$ $P$-a.s. on $\Omega$ or just $X = Y$ a.s. (reads almost surely).
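The special case of the unit interval split into $n$ equal cells, with $X(\omega) = \omega$, lends itself to an exact computation. The Python sketch below is a hedged illustration: the value `N_CELLS = 4` and the helper names are assumptions, and the integral of $X$ over a cell is computed from the antiderivative $\omega^2/2$ rather than numerically.

```python
from fractions import Fraction

N_CELLS = 4  # number of hypotheses H_1, ..., H_n (assumed value)

def cell(k):
    """Endpoints of H_k = [(k - 1)/n, k/n)."""
    return Fraction(k - 1, N_CELLS), Fraction(k, N_CELLS)

def integral_of_x(a, b):
    """Exact integral of X(omega) = omega over [a, b) w.r.t. Lebesgue measure."""
    return (b * b - a * a) / 2

# Conditional expectations E[X | H_k] = (1 / P(H_k)) * integral of X over H_k.
cond_exp = {}
for k in range(1, N_CELLS + 1):
    a, b = cell(k)
    cond_exp[k] = integral_of_x(a, b) / (b - a)

def integral_of_x0(indices):
    """Integral of X_0 = sum_k E[X | H_k] 1_{H_k} over the union of the H_k, k in indices."""
    return sum(cond_exp[k] * (cell(k)[1] - cell(k)[0]) for k in indices)

def integral_of_x_over(indices):
    """Integral of X over the same union of cells."""
    return sum(integral_of_x(*cell(k)) for k in indices)

print(cond_exp)
print(integral_of_x0([1, 3]), integral_of_x_over([1, 3]))
```

The two integrals agree on every union of cells, which is the defining property of a version of the conditional expectation given the $\sigma$-algebra generated by the cells.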

After a short break from the Radon-Nikodym Theorem for signed measures, we return to this theme with a version of Radon-Nikodym's Theorem for complex measures. This is readily done as follows. Firstly, given a $\sigma$-finite positive measure $\mu \in \mathfrak{M}(\Omega, \Sigma)$, we will denote by $\mathfrak{C}^{\mu}$ the set of all complex measures on $(\Omega, \Sigma)$ that are absolutely continuous with respect to $\mu$.

Let $\nu \in \mathfrak{C}^{\mu}$ and let $\nu = \nu_1 + i\nu_2$. Since $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$ and $\nu_1$ and $\nu_2$ are finite signed measures, according to case 5a of the Radon-Nikodym Theorem, there are two equivalence classes $[g_1]_\mu$ and $[g_2]_\mu$ of Radon-Nikodym densities from the factor space $L^1(\Omega, \Sigma, \mu; \mathbb{R})|_\mu$, so that, for any elements $g_1$ and $g_2$ of their respective classes,

$$\nu_1(A) = \int_A g_1\, d\mu \quad \text{and} \quad \nu_2(A) = \int_A g_2\, d\mu, \quad \text{for each } A \in \Sigma,$$

thereby making $[g]_\mu = [g_1]_\mu \times [g_2]_\mu \in L^1(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ (see Definition 1.16) the desired Radon-Nikodym derivative. The uniqueness of $[g]_\mu$ is based on that for signed measures. Summarizing the above arguments we have:

2.9 Theorem (Radon-Nikodym for complex measures). Let $\mu \in \mathfrak{M}(\Omega, \Sigma)$ be a $\sigma$-finite measure. Then $[L^1(\Omega, \Sigma, \mu; \mathbb{C})|_{\mu},\ \mathfrak{C}^{\mu},\ \int]$ is a bijective map.

Finally, with the reader's help (Problem 2.1) we will establish a small, but useful result in

2.10 Proposition. Let $\nu$ be a signed measure and $\mu$ be a positive measure. Then $\nu$

C

0~x1 a.s.

is a linear space over the field C. Does the same


NEW TERMS:
absolutely continuous signed measure
Radon-Nikodym density of a signed measure
Radon-Nikodym Theorem for a signed measure
Radon-Nikodym derivative of a signed measure
probability density function
probability distribution function
continuous random variable
conditional expectation given a $\sigma$-hypothesis
version of the conditional expectation
conditional probability of an event given a $\sigma$-hypothesis
conditional expectation given a random variable
almost surely equality
Radon-Nikodym Theorem for a complex measure
chain rule


3. SINGULARITY

The singularity (which we introduced in Section 5, Chapter 6, for positive measures) is a sort of opposite notion to continuity.

3.1 Definition and Notation. Let $\nu$ and $\mu$ be two signed or complex measures on a measurable space $(\Omega, \Sigma)$. $\nu$ is said to be singular with respect (or orthogonal) to $\mu$, in notation, $\nu \perp \mu$, if there is a measurable partition $(\Omega_1, \Omega_2)$ of $\Omega$ such that $|\nu|(\Omega_1) = |\mu|(\Omega_2) = 0$. Clearly, $(\mathfrak{S}, \perp)$ is a symmetric relation. Therefore, $\nu$ and $\mu$ are to be called mutually singular or just singular. [Because the total variations of complex measures coincide with that for finite signed measures (Problem 1.13) and the total variations of signed and positive measures are equal, the above definition of singularity agrees with that for positive measures.] A signed or complex measure orthogonal to the Lebesgue measure is called just singular. Given a signed measure space $(\Omega, \Sigma, \nu)$, we will denote by $\mathfrak{S}_{\nu}^{\perp}(\Omega, \Sigma)$ the subset of all signed measures in $\mathfrak{S}(\Omega, \Sigma)$ orthogonal to $\nu$.

We establish a few major properties of singular measures.

3.2 Proposition. Let $\mu$ be a positive measure and $\nu$ and $\rho$ be signed measures on the measurable space $(\Omega, \Sigma)$. The following hold true:

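(i) If $\nu = \nu^+ - \nu^-$ is the Jordan decomposition, then $\nu^+ \perp \nu^-$.

(ii) If $\nu \in \mathfrak{S}_{\mu}^{\perp}$ and $\rho \in \mathfrak{S}_{\mu}^{\perp}$, then $\nu + \rho,\ \nu - \rho \in \mathfrak{S}_{\mu}^{\perp}$.

(iii) $\nu \perp \mu$ if and only if $\nu^+ \perp \mu$ and $\nu^- \perp \mu$.

(iv) If $\nu \ll \mu$ and $\rho \in \mathfrak{S}_{\mu}^{\perp}$, then $\nu \perp \rho$.

(v) If $\nu \ll \mu$ and $\nu \perp \mu$, then $\nu = 0$.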

Proof. We leave (i) for the reader. (Problem 3.1.)

(ii) By the definition, there are two measurable sets $A$ and $B$ such that $\mu(A) = \mu(B) = 0$ and $|\nu|(A^c) = |\rho|(B^c) = 0$. Then, by Problem 1.11, $\nu \equiv 0$ on $\Sigma \cap A^c$ and $\rho \equiv 0$ on $\Sigma \cap B^c$. Consequently, $\nu$, $\rho$, $\nu + \rho$, and $\nu - \rho$ are identically zero, each one on $\Sigma \cap (A^c \cap B^c)$. Again, applying Problem 1.11, we see that the measures $|\nu + \rho|$ and $|\nu - \rho|$ attain zero on the set $A^c \cap B^c$. On the other hand, obviously, $\mu(A \cup B) = 0$.

(iii) a) $\nu \perp \mu$ implies that $|\nu|(A) = \nu^+(A) + \nu^-(A) = \mu(A^c) = 0$ for some $A$ and therefore $\nu^+(A) = \nu^-(A) = 0$.

b) If $\nu^+ \perp \mu$ and $\nu^- \perp \mu$, then by (ii), $|\nu| = \nu^+ + \nu^- \perp \mu$.

(iv) Since p Ip, there is a set A E E such that p(A) = I p I (AC)= 0. By Proposition 2.10, 1 v I 1 are said to be conjugate exponents if

4. LP Spaces


Now we prove the Hölder inequality for the semi-norm $\|\cdot\|_p$ on $L^p(\Omega, \Sigma, \mu; \mathbb{C})$.

4.4 Proposition (Hölder's Inequality). Let $1 < p < \infty$ and $q$ be its conjugate exponent, and let $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ and $g \in L^q(\Omega, \Sigma, \mu; \mathbb{C})$. Then, $fg \in L^1$ and

$$\|fg\|_1 \le \|f\|_p\, \|g\|_q. \tag{4.4}$$

Proof. By Problem 1.5, Chapter 2,

$$|fg| \le \frac{|f|^p}{p} + \frac{|g|^q}{q}.$$

Hence, $|fg|$ is bounded by integrable functions and

$$\|fg\|_1 \le \frac{\|f\|_p^p}{p} + \frac{\|g\|_q^q}{q}. \tag{4.4a}$$

If one of the values $\|f\|_p$ or $\|g\|_q$ vanishes or is infinity (or any combination), then (4.4) holds. Assume that neither of them is zero or infinity. Then (4.4a) still holds with $f$ replaced by $f / \|f\|_p$ and $g$ by $g / \|g\|_q$. This yields (4.4).

Observe that for the special case $p = q = 2$, Hölder's inequality reduces to the frequently used Cauchy-Schwarz inequality. (In addition to (4.4), we have $fg \in L^1$ and $f, g \in L^2$.) Now, we are ready to prove the triangle inequality, known as Minkowski's inequality.

4.5 Proposition (Minkowski's Inequality). Let $1 \le p < \infty$ and $f, g \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$. Then $f + g \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ and

$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{4.5}$$

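Proof. For $p = 1$, (4.5) reduces to the known triangle inequality for $L^1$ space. Assume that $1 < p < \infty$ and denote by $q$ its conjugate exponent. We have

$$|f + g|^p \le |f|\,|f + g|^{p-1} + |g|\,|f + g|^{p-1}. \tag{4.5a}$$

Since obviously $pq - q = p$ and because the space $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is linear,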



and hence

Consequently, $|f + g|^{p-1} \in L^q$.

Now we apply the Hölder inequality to $f, g \in L^p$ and to $|f + g|^{p-1} \in L^q$ to have $|f|\,|f + g|^{p-1}$ and $|g|\,|f + g|^{p-1}$ as $L^1$-functions and

$$\big\| |f|\,|f + g|^{p-1} \big\|_1 = \int |f|\,|f + g|^{p-1}\, d\mu \le \|f\|_p\, \|f + g\|_p^{p/q} \quad \text{(since } pq - q = p\text{)}, \tag{4.5b}$$

with

$$\big\| |g|\,|f + g|^{p-1} \big\|_1 \le \|g\|_p\, \|f + g\|_p^{p/q}. \tag{4.5c}$$

Applying the norm (integral operator) to (4.5a-c) we have

$$\|f + g\|_p^p \le \big( \|f\|_p + \|g\|_p \big)\, \|f + g\|_p^{p/q}.$$

Dividing both sides of the last inequality by $\|f + g\|_p^{p/q}$ (of course, we assume $\|f + g\|_p > 0$, or else the triangle inequality holds true immediately) and due to $p - (p/q) = 1$ we have the above assertion. □
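Hölder's and Minkowski's inequalities can be sanity-checked on a finite measure space, where every integral is a weighted sum. The Python sketch below uses a four-point space; the weights, the two functions, the exponent $p = 3$ and the helper `lp_norm` are all assumptions chosen for illustration.

```python
def lp_norm(values, weights, p):
    """L^p seminorm of a function on a finite measure space with point weights."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

# A measure on four points and two functions on it (assumed values).
weights = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.5, -1.0, 2.0]

p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1

# Hoelder: ||fg||_1 <= ||f||_p * ||g||_q.
holder_lhs = lp_norm([a * b for a, b in zip(f, g)], weights, 1.0)
holder_rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p.
minkowski_lhs = lp_norm([a + b for a, b in zip(f, g)], weights, p)
minkowski_rhs = lp_norm(f, weights, p) + lp_norm(g, weights, p)

print(holder_lhs, holder_rhs)
print(minkowski_lhs, minkowski_rhs)
```

Changing `p` (with `q` recomputed) or the data should leave both inequalities intact, in line with Propositions 4.4 and 4.5.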

Proof of Theorem 4.3. Notice that $\|\alpha f\|_p = |\alpha|\, \|f\|_p$ satisfies property (ii) of the norm in Theorem 7.3, Chapter 2. Property (iii) of the same theorem is subject to the Minkowski inequality. And finally, $f = 0$ implies $\|f\|_p = 0$. The converse however gives a weaker condition: $\|f\|_p = 0$ yields $f = 0$ $\mu$-a.e. Theorem 4.3 is therefore proved. □

4.6 Remark. To make $(L^p, \|\cdot\|_p)$ a normed space we will pass to equivalence classes in the same way as in Sections 1 and 5 of Chapter 6 and Section 2 of the present chapter. Recall that the $\mu$-almost everywhere property of equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ and thus on $L^p$. Consequently, $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is also a quotient set. Then, $[0]_\mu$ is a linear subspace and $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is the (quotient) space, with the origin $0 = [0]_\mu$, generated by $E$, and $\|\cdot\|_p$ is now a norm on $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$. Indeed, by Lemma 1.15, Chapter 6, we see that $\|f\|_p = 0$ implies that $f \in [0]_\mu$.

4.7 Definition. A sequence $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is said to converge in the $p$th mean to a function $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ (or just $L^p$-converge to $f$) if

$$\lim_{n \to \infty} \|f_n - f\|_p = 0.$$

We will also denote it by $f_n \xrightarrow{L^p} f$.

Problems 4.2 and 4.3 (which are essentially due to Riesz) state that if an $L^p$-sequence $\{f_n\}$ converges to an $L^p$-function $f$, then the convergence of $\{\|f_n\|_p\}$ to $\|f\|_p$ is equivalent to the convergence of $\{f_n\}$ to $f$ in the $p$th mean. Below we state and prove a more general version of the Lebesgue Dominated Convergence Theorem than Theorem 2.6, Chapter 6, for $(L^p(\Omega, \Sigma, \mu), \|\cdot\|_p)$-space.

4.8 Theorem (Lebesgue's Dominated Convergence Theorem). Let $(\Omega, \Sigma, \mu)$ be a measure space and $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$) be an a.e. convergent sequence, a.e. dominated by an $L^p(\Omega, \Sigma, \mu; \mathbb{R}_+)$-function $g$, more precisely, $|f_n| \le g$ for each $n$ $\mu$-a.e. Then the following are true:

(ii) there is an $L^p(\Omega, \Sigma, \mu; \mathbb{C})$-function $f$ such that $\{f_n\}$ converges to $f$ a.e. in the topology of pointwise convergence;

(iii) $f_n \xrightarrow{L^p} f$;

Proof. As usual, denote by $\mathcal{N} = \mathcal{N}_\mu$ the subfamily of all measurable $\mu$-null sets. Since $\{f_n\}$ is a.e. convergent pointwise, there is $M \in \mathcal{N}$ such that $\lim_{n \to \infty} f_n(\omega)$ exists for all $\omega \in M^c$. Denote by $L(\omega)$ the value of this limit. Since $g^p \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$, by Proposition 1.21, Chapter 6, there is $N \in \mathcal{N}$ such that $g(\omega) < \infty$ on $N^c$. Furthermore, there is a set $O_n \in \mathcal{N}$ such that $|f_n| \le g$ for all $\omega \in O_n^c$. Let $O = \bigcup_{n=1}^{\infty} O_n$. Then, clearly $O \in \mathcal{N}$. Denote $A = M^c \cap N^c \cap O^c$ and $f = L 1_A$. Then, $f_n \to f$ $\mu$-a.e., $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$. Because $|f_n| \le g < \infty$ on $A$, $|f| \le g$ a.e., $|f| < \infty$ and hence $f$ is $\mathbb{C}$-valued. By Proposition 1.17, Chapter 6, we have that



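$|f|^p \in L^1$ and, consequently, $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$. Let $g_n = |f_n - f|^p$ and $h = (|f| + g)^p$. Then, the sequence $\{g_n\}$ is nonnegative and is dominated by $h$. Since $|f| + g \in L^p(\Omega, \Sigma, \mu; \mathbb{R}_+)$, $h \in L^1$. Therefore, $g_n \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$. Applying Fatou's Lemma to $h - g_n$ we have

$$\int \liminf_{n \to \infty}\, (h - g_n)\, d\mu \le \liminf_{n \to \infty} \int (h - g_n)\, d\mu = \int h\, d\mu - \limsup_{n \to \infty} \int g_n\, d\mu. \tag{4.8}$$

Since $g_n \to 0$ a.e., $h - g_n \to h$ a.e. and therefore $\liminf (h - g_n) = h$ a.e. This and (4.8) yield

$$\limsup_{n \to \infty} \int g_n\, d\mu \le 0.$$

Because $g_n \ge 0$, we have

$$\lim_{n \to \infty} \int g_n\, d\mu = \lim_{n \to \infty} \|f_n - f\|_p^p = 0.$$

Finally, $\|f_n\|_p \to \|f\|_p$ is due to Problem 4.2.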

We are going to show that the space $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is complete with respect to the seminorm $\|\cdot\|_p$ and hence the quotient space $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is Banach.

4.9 Theorem (Riesz-Fischer). Let $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ (or $L^p(\Omega, \Sigma, \mu; \mathbb{R})$) be a Cauchy sequence with respect to the seminorm $\|\cdot\|_p$. Then, there exists $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ such that $f_n \xrightarrow{L^p} f$.

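Proof. Let $\{f_n\}$ be an $L^p$-Cauchy sequence. Then, given $\varepsilon = 2^{-k}$, there is an $N_k$ such that for all indices $n_k, n_{k+1} \ge N_k$,

$$\|f_{n_{k+1}} - f_{n_k}\|_p < 2^{-k}. \tag{4.9}$$

Hence, there is a subsequence $\{f_{n_k}\}$ whose terms satisfy (4.9). Denote

$$g_k = f_{n_{k+1}} - f_{n_k} \quad \text{and} \quad g = \sum_{k=1}^{\infty} |g_k|$$

and apply the inequality of Problem 4.1 to the sequence $\{|g_k|\}$. Then we have from (4.9):

$$\|g\|_p \le \sum_{k=1}^{\infty} \|g_k\|_p \le \sum_{k=1}^{\infty} 2^{-k} = 1. \tag{4.9a}$$

Thus, $g \in L^p$ or, equivalently, $g^p \in L^1$. By Proposition 1.21, Chapter 6, $g^p$ and, therefore, $g$ is finite $\mu$-a.e. The latter implies that the partial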

sums

$$f_{n_1} + \sum_{k=1}^{m} g_k = f_{n_{m+1}}$$

and hence the subsequence $\{f_{n_k}\}$ converge $\mu$-a.e. on $\Omega$. Furthermore,

$$|f_{n_k}| \le |f_{n_1}| + g$$

and since (due to (4.9a)) $g \in L^p(\Omega, \Sigma, \mu; \mathbb{R}_+)$, the subsequence $\{f_{n_k}\}$ is dominated by an $L^p$-integrable nonnegative function $|f_{n_1}| + g$. All other conditions of the Lebesgue Dominated Convergence Theorem 4.8 (applied to the subsequence $\{f_{n_k}\}$) are met. Consequently, there is a function $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ to which $\{f_{n_k}\}$ converges $\mu$-a.e., both in the topology of pointwise convergence and in the $p$th mean. Finally, $\{f_n\}$, being an $L^p$-Cauchy sequence, by Problem 3.9, Chapter 2, must converge to the same limit function $f$ (as its subsequence $\{f_{n_k}\}$) in the $p$th mean. □

Notice that the function $f$ to which $\{f_n\}$ converges in the $p$th mean is defined uniquely $\mu$-a.e. Therefore, the Riesz-Fischer theorem states that the quotient space $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is Banach. As a byproduct, the theorem provides a subsequence $\{f_{n_k}\}$ of $\{f_n\}$, which converges to $f$ $\mu$-a.e. in the topology of pointwise convergence. The theorem does not state, however, that $\{f_n\}$ also converges to $f$ $\mu$-a.e. pointwise. (The reader is encouraged to provide a counterexample where such an option is not the case, see Problem 4.6.) Below is what we can afford.

4.10 Proposition. If an $L^p(\Omega, \Sigma, \mu; \mathbb{C})$-Cauchy sequence $\{f_n\}$ converges $\mu$-a.e. pointwise to a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$, then $f \in L^p$ and $f_n \xrightarrow{L^p} f$.

Proof. By Riesz-Fischer Theorem 4.9, there is an $L^p$-function $\tilde{f}$ such that $f_n \xrightarrow{L^p} \tilde{f}$ and there is a subsequence $\{f_{n_k}\} \subseteq \{f_n\}$ such that $f_{n_k} \to \tilde{f}$ a.e. pointwise. On the other hand, by our assumption, $f_{n_k} \to f$ a.e. pointwise. Therefore, $f \in [\tilde{f}]_\mu$ and the rest of the statement is again due to the Riesz-Fischer Theorem.

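4.11 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space such that $\mu$ is finite, and let $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$. If $1 \le p \le q < \infty$, then

$$\|f\|_p \le \|f\|_q\, [\mu(\Omega)]^{\frac{1}{p} - \frac{1}{q}} \tag{4.11}$$

and therefore $L^q(\Omega, \Sigma, \mu; \mathbb{C}) \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$.

Proof. We assume that $p < q$ or else (4.11) is trivial. Then denote $a = q/p$ and $b = a/(a-1) = q/(q-p)$. Then, $a$ and $b$ are conjugate exponents with $a > 1$. Since $\mu$ is finite, the constant function $1 \in L^b(\Omega, \Sigma, \mu; \mathbb{R})$. Now apply Hölder's inequality to $|f|^p$ and to $1$ with respect to the conjugate exponents $a$ and $b$:

$$\int |f|^p \cdot 1\, d\mu \le \Big[ \int |f|^{pa}\, d\mu \Big]^{1/a} \Big[ \int 1^b\, d\mu \Big]^{1/b}$$

or, equivalently,

$$\|f\|_p^p \le \Big[ \int |f|^{pa}\, d\mu \Big]^{1/a} [\mu(\Omega)]^{1/b} = \|f\|_q^p\, [\mu(\Omega)]^{1 - \frac{p}{q}}$$

(since $pa = q$, $1/a = p/q$ and $1/b = 1 - p/q$), that proves (4.11).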

4.12 Examples.

(i) Consider an important special case. If $\mu$ is a probability measure in Proposition 4.11, then the result applied to a random variable $X$ can be interpreted as follows. The existence of the moment of $n$th order implies the existence of all lower moments of $X$.

(ii) The statement of Proposition 4.11 that, for $p < q$,

$$L^q(\Omega, \Sigma, \mu; \mathbb{C}) \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$$

need not hold if $\mu$ is not finite. For example, let $\Omega = [1, \infty)$ and let $\mu$ be the counting measure concentrated on the set $\{1, 2, \ldots\}$, i.e. $\mu = \sum_{n=1}^{\infty} \varepsilon_n$. Let $f(x) = \frac{1}{x}$. Then,

$$\int f^2\, d\mu = \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$$

and thus $f \in L^2$. However, it is easily seen that $f \notin L^1$. □
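The counterexample in (ii) can be made tangible by looking at partial sums: for the counting measure on the positive integers and the function $1/x$, the integral of $f^2$ is the series $\sum 1/n^2$, while the integral of $f$ is the harmonic series. The Python sketch below only illustrates the trend at a few cut-offs; the helper name and the cut-offs are assumptions, and finite sums of course cannot prove divergence.

```python
import math

def partial_integral(p, terms):
    """Integral of |f|^p over {1, ..., terms} for f(x) = 1/x and the counting measure."""
    return sum((1.0 / n) ** p for n in range(1, terms + 1))

CUTOFFS = [10, 1000, 100000]  # assumed truncation points

l2_partial = [partial_integral(2, n) for n in CUTOFFS]
l1_partial = [partial_integral(1, n) for n in CUTOFFS]

# The p = 2 sums stay below pi^2 / 6, the p = 1 sums grow like log(n).
print(l2_partial, math.pi ** 2 / 6)
print(l1_partial, [math.log(n) for n in CUTOFFS])
```

The bounded sums for $p = 2$ and the logarithmically growing sums for $p = 1$ are consistent with $f \in L^2$ and $f \notin L^1$.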

The theorem below states that the space of all real-valued integrable "extended" simple functions is dense in $L^p$. We need the following notation. Let $\Psi^p(\Omega, \Sigma, \mu; \mathbb{R}) = \Psi(\Omega, \Sigma; \mathbb{R}) \cap L^p(\Omega, \Sigma, \mu; \mathbb{R})$ denote the subset of all real-valued simple $L^p$-integrable functions. (See Remark 6.2 (iii), Chapter 5, on simple functions.)

4.13 Theorem. The real subspace $\Psi^p$ is dense in $(L^p, \|\cdot\|_p)$.

Proof. $\Psi^p \subseteq L^p$, by the definition. Now, given an $f \in L^p$, by Theorem 6.5, Chapter 5, for $f^+$ and $f^-$ there are monotone nondecreasing sequences $\{s_n^+\} \uparrow f^+$ and $\{s_n^-\} \uparrow f^-$. Since $f \in L^p$, so are $f^+, f^- \in L^p$ and, consequently, $\{s_n^+\}, \{s_n^-\} \subseteq L^p$ and $\{s_n = s_n^+ - s_n^-\} \subseteq L^p$. By (4.2),

$$|f - s_n|^p \le 2^p\big(|f|^p + |s_n|^p\big) \le 2^{p+1}|f|^p$$

and since $f \in L^p$, we have that $\{f - s_n\} \subseteq L^p$. Therefore, the sequence $\{|f - s_n|^p\}$ is dominated by an $L^1$-function $2^{p+1}|f|^p$. We also know that $\{f - s_n\}$ converges to the function $0$ pointwise. Hence, the sequence $\{f - s_n\}$ meets all criteria of the Lebesgue Dominated Convergence Theorem. As a result, there is an $L^p(\Omega,\Sigma,\mu;\mathbb{R})$-function, say $f^*$, to which $\{f - s_n\}$ converges a.e. pointwise. Hence $f^* \in [0]_\mu$, and by setting $f^* = 0$, we have $\lim_{n\to\infty}\|f - s_n\|_p = 0$ or that $s_n \xrightarrow{L^p} f$. In other words,

$$\overline{\Psi^p} = L^p.$$ □
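The approximation in the proof can be watched numerically. The sketch below uses the standard dyadic approximants $\min\{\lfloor 2^n t\rfloor / 2^n,\, n\}$ for the positive and negative parts; this particular construction is an assumption, since the sequence supplied by Theorem 6.5, Chapter 5, is not reproduced here. The finite measure space and the names `dyadic_part`, `simple_approx` and `lp_distance` are likewise illustrative only.

```python
import math

def dyadic_part(t, n):
    """n-th dyadic simple approximant of a nonnegative number t."""
    return min(math.floor(t * 2 ** n) / 2 ** n, float(n))

def simple_approx(value, n):
    """s_n = s_n^+ - s_n^-, built from the positive and negative parts."""
    return dyadic_part(max(value, 0.0), n) - dyadic_part(max(-value, 0.0), n)

def lp_distance(values, weights, n, p):
    """||f - s_n||_p on a finite measure space with atom masses `weights`."""
    total = sum(w * abs(v - simple_approx(v, n)) ** p
                for v, w in zip(values, weights))
    return total ** (1.0 / p)

values = [-4.3, -0.7, 0.0, 0.2, 1.9, 3.14159]    # f at each atom
weights = [0.5, 1.0, 2.0, 0.25, 1.5, 0.75]       # mass of each atom
p = 3

errors = [lp_distance(values, weights, n, p) for n in range(1, 13)]
for n, err in enumerate(errors, start=1):
    print(f"n={n:2d}  ||f - s_n||_p = {err:.6f}")
```

In this sketch $|s_n| \le |f|$ at every atom and $|f - s_n|$ is nonincreasing in $n$, so the printed distances decrease to $0$, in line with the dominated convergence argument above.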

4.14 Remarks.

(i) Given an $L^p$-function $f$, we proved the existence of an "extended" sequence $\{s_n\}$ of simple functions such that $\{|s_n|\}$ is monotone increasing to $|f|$ and $\{s_n\}$ converges to $f$ pointwise.

(ii) Notice that not only is $\overline{\Psi} = \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ in $(\mathcal{C}^{-1}(\Omega,\Sigma),\tau)$ (i.e., in the topology of pointwise convergence), but, as we showed, the subspace $\Psi^p$ of $\Psi$ is dense in $(L^p, \|\cdot\|_p)$.

(iii) A minor adjustment to Theorem 4.13 allows us to claim that the subspace $\Psi^p(\Omega,\Sigma,\mu;\mathbb{C}) = \Psi(\Omega,\Sigma;\mathbb{C}) \cap L^p(\Omega,\Sigma,\mu;\mathbb{C})$ of all complex-valued simple $L^p$-integrable functions is dense in $L^p(\Omega,\Sigma,\mu;\mathbb{C})$. (Problem 4.8.)

The following topic on $\mu$-a.e. bounded measurable or "$L^\infty$-functions" occurs often in applications and is going to be explored. We will also see how the $L^\infty$-space fits in the $L^p$-family.

4.15 Definition. Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ or $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. A positive real number $M$ is said to be an essential bound for $f$ if $|f| \le M$ $\mu$-a.e. on $\Omega$. If $f$ has an essential bound it is naturally called essentially bounded. □

We would like to point out the difference between $\mu$-a.e. finite and essentially bounded functions. For instance, the function $\frac{1}{x} \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ is finite $\lambda$-a.e. on $\mathbb{R}$, i.e. everywhere, except for $0$, whereas it is not essentially bounded. Moreover, the "repaired" version of $\frac{1}{x}$ becomes finite (and an element of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$), but still not essentially bounded.

4.16 Definition and Notation. If a measurable function $f$ on $(\Omega,\Sigma,\mu)$ is essentially bounded, then the infimum of all essential bounds for $f$ is called the ($\mu$-) essential supremum of $f$ and it is denoted by $\|f\|_\infty$ or by $\operatorname{ess\,sup}\{|f|\}$. More formally,

$$\|f\|_\infty = \inf\{M > 0 : |f| \le M \ \ \mu\text{-a.e. on } \Omega\}.$$

The subset of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$) of all essentially bounded functions is denoted by $L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ (or $L^\infty(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, resp.). (Of course, if $f$ is not essentially bounded, it would make sense to set $\|f\|_\infty = \infty$. However, since we are going to use $\|\cdot\|_\infty$ as the norm within $L^\infty$, we do not need such an extension.) □

It is easy to see that $L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ is a vector space over the field $\mathbb{C}$, while $L^\infty(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is a "quasi"-vector space over $\mathbb{R}$. The properties below justify $\|\cdot\|_\infty$ as a semi-norm on $L^\infty$.
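Before turning to these properties, a small computation may help separate the essential supremum from the ordinary supremum. The sketch assumes a finite measure space given by atoms with masses `weights`, where atoms of mass zero form a null set; in that setting the infimum of all essential bounds is the largest value of $|f|$ on the atoms of positive mass. The function name `essential_supremum` is hypothetical.

```python
def essential_supremum(values, weights):
    """Essential supremum of |f| on a finite measure space.

    Atoms of measure zero form a null set, so the values of f there
    do not affect any essential bound and are ignored.
    """
    charged = [abs(v) for v, w in zip(values, weights) if w > 0]
    return max(charged, default=0.0)

values = [1.0, -3.0, 100.0, 2.0]    # f at each atom
weights = [0.5, 0.25, 0.0, 1.0]     # the third atom has measure zero

print(max(abs(v) for v in values))           # ordinary supremum of |f|: 100.0
print(essential_supremum(values, weights))   # essential supremum of |f|: 3.0
```

The value $100$ is attained only on a null set, so every $M \ge 3$ is an essential bound and the essential supremum equals $3$.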

4.17 Proposition. Given two measurable functions $f$ and $g$ on $(\Omega,\Sigma,\mu)$ and a scalar $a \in \mathbb{C}$, the following are valid:

(i) $|f| \le \|f\|_\infty$ $\mu$-a.e. on $\Omega$.

(ii) $\|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.

(iii) $|f| \le |g|$ $\mu$-a.e. on $\Omega$ implies that $\|f\|_\infty \le \|g\|_\infty$.

(iv) $f \in [g]_\mu \Rightarrow \|f\|_\infty = \|g\|_\infty$.

(v) $\|af\|_\infty = |a|\,\|f\|_\infty$.

(vi) $\|f\|_\infty = 0 \Leftrightarrow f \in [0]_\mu$.

(vii) $\|a\|_\infty = |a|$.

(viii) $\|fg\|_\infty \le \|f\|_\infty\,\|g\|_\infty$.

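Proof.

(i) Given $\varepsilon = \frac{1}{n}$, there is an essential bound $M_n$ such that

$$M_n < \|f\|_\infty + \tfrac{1}{n}.$$

Hence, the set $\{|f| > M_n\} \in \mathcal{N}_\mu$, and along with this, the set $\{|f| > \|f\|_\infty + \frac{1}{n}\} \in \mathcal{N}_\mu$. Therefore,

$$\{|f| > \|f\|_\infty\} = \bigcup_{n=1}^{\infty}\{|f| > \|f\|_\infty + \tfrac{1}{n}\} \in \mathcal{N}_\mu.$$

(ii) By (i),

$$|f+g| \le |f| + |g| \le \|f\|_\infty + \|g\|_\infty \quad \mu\text{-a.e.}$$

Hence, $\|f\|_\infty + \|g\|_\infty$ is an essential bound for $f+g$. Thus, $\|f+g\|_\infty$ being the infimum of all essential bounds,

$$\|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty.$$

(iii) Because of (i) and our assumption, we have $|f| \le |g| \le \|g\|_\infty$ a.e. Therefore, $\|g\|_\infty$ is an essential bound for $f$ and

$$\|f\|_\infty \le \|g\|_\infty.$$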
