An Introduction to Nonstandard Real Analysis


Author: Albert E. Hurd | Peter A. Loeb


This is a volume in PURE AND APPLIED MATHEMATICS: A Series of Monographs and Textbooks. Editors: SAMUEL EILENBERG AND HYMAN BASS

A list of recent titles in this series appears at the end of this volume.


ALBERT E. HURD, Department of Mathematics, University of Victoria, Victoria, British Columbia, Canada

PETER A. LOEB Department of Mathematics University of Illinois Urbana, Illinois

1985

ACADEMIC PRESS, INC. (Harcourt Brace Jovanovich, Publishers)

Orlando San Diego New York London Toronto Montreal Sydney Tokyo

COPYRIGHT © 1985 BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC. Orlando, Florida 32887

United Kingdom Edition published by

ACADEMIC PRESS INC. (LONDON) LTD. 24/28 Oval Road, London NW1 7DX

Library of Congress Cataloging in Publication Data

Main entry under title: An introduction to nonstandard real analysis. Includes bibliographical references and index. 1. Mathematical analysis, Nonstandard. I. Hurd, A. E. (Albert Emerson), DATE. II. Loeb, P. A. A299.82.158 1985 515 84-24563 ISBN 0-12-362440-1 (alk. paper)

PRINTED IN THE UNITED STATES OF AMERICA

85 86 87 88    9 8 7 6 5 4 3 2 1

Dedicated to the memory of ABRAHAM ROBINSON


Contents

Preface

Chapter I  Infinitesimals and The Calculus
I.1 The Hyperreal Number System as an Ultrapower
I.2 *-Transforms of Relations
I.3 Simple Languages for Relational Systems
I.4 Interpretation of Simple Sentences
I.5 The Transfer Principle for Simple Sentences
I.6 Infinite Numbers, Infinitesimals, and the ...
I.7 The Hyperintegers
I.8 Sequences and Series
I.9 Topology on the Reals
I.10
I.11
I.12 Riemann Integration
I.13
I.14 Two Applications to Differential Equations
I.15 Proof of the Transfer Principle

Chapter II  Nonstandard Analysis on Superstructures
II.1 Superstructures
II.2 Languages and Interpretation for Superstructures
II.3 Monomorphisms between Superstructures: The Transfer Principle
II.4 The Ultrapower Construction for Superstructures
II.5 Hyperfinite Sets, Enlargements, and ...
II.6 Internal and External Entities; Comp...
II.7 The Permanence Principle
II.8 κ-Saturated Superstructures

Chapter III  Nonstandard Theory of Topological Spaces
III.1 Basic Definitions and Results
III.2 Compactness
III.3 Metric Spaces
III.4 Normed Vector Spaces and Banach Spaces
III.5 Inner-Product Spaces and Hilbert Spaces
III.6 Nonstandard Hulls of Metric Spaces
III.7 Compactifications
III.8 Function Spaces

Chapter IV  Nonstandard Integration Theory
IV.1 Standardizations of Internal Integration Structures
IV.2 Measure Theory for Complete Integration Structures
IV.3 Integration on R^n; the Riesz Representation Theorem
IV.4 Basic Convergence Theorems
IV.5 The Fubini Theorem
IV.6 Applications to Stoch...

Appendix  Ultrafilters

References

List of Symbols

Index

Preface

The notion of an infinitesimal has appeared off and on in mathematics since the time of Archimedes. In his formulation of the calculus in the 1670s, the German mathematician Gottfried Wilhelm Leibniz treated infinitesimals as ideal numbers, rather like imaginary numbers, which were smaller in absolute value than any ordinary positive real number but which nevertheless obeyed all of the usual laws of arithmetic. Leibniz regarded infinitesimals as a useful fiction which facilitated mathematical computation and invention.

Although it gained rapid acceptance on the continent of Europe, Leibniz’s method was not without its detractors. In commenting on the foundations of calculus as developed both by Leibniz and Newton, Bishop George Berkeley wrote, “And what are these same evanescent increments? They are neither finite quantities, nor quantities infinitely small, nor yet nothing. May we not call them the ghosts of departed quantities?” The question was, How can there be a positive number which is smaller than every positive real number without being zero? Despite this unanswered question, the infinitesimal calculus was developed by Euler and others during the eighteenth and nineteenth centuries into an impressive body of work. It was not until the late nineteenth century that an adequate definition of limit replaced the calculus of infinitesimals and provided a rigorous foundation for analysis. Following this development, the use of infinitesimals gradually faded, persisting only as an intuitive aid to conceptualization.

There the matter stood until 1960 when Abraham Robinson gave a rigorous foundation for the use of infinitesimals in analysis. More specifically, Robinson showed that the set of real numbers can be regarded as a subset of a larger set of “numbers” (called hyperreal numbers) which contains infinitesimals and also, with appropriately defined arithmetic operations, satisfies all of the arithmetic rules obeyed by the ordinary real numbers. Even more, he demonstrated that the relational structure over the reals (sets, relations, etc.) can be extended to a similar structure over the hyperreals in such a way that all statements true in the real structure remain true, with a suitable interpretation, in the hyperreal structure. This latter property, known as the transfer principle, is the pivotal result of Robinson’s discovery.


Robinson’s invention, called nonstandard analysis, is more than a justification of the method of infinitesimals. It is a powerful new tool for mathematical research. Rather quickly it became apparent that every mathematical structure has a nonstandard model from which knowledge of the original structure can be gained by applications of the appropriate transfer principle. In the twenty-five years since Robinson’s discovery, the use of nonstandard models has led to many new insights into traditional mathematics, and to solutions of unsolved problems in areas as diverse as functional analysis, probability theory, complex function theory, potential theory, number theory, mathematical physics, and mathematical economics.

Robinson’s first proof of the existence of hyperreal structures was based on a result in mathematical logic (the compactness theorem). It was perhaps this aspect of his work, more than any other, which made it difficult to understand for those not adept at mathematical logic. At present, the most common demonstration of the existence of nonstandard models uses an “ultrapower” construction. But the use of ultrapowers is not restricted to nonstandard analysis. Indeed, the construction of ultrapower extensions of the real numbers dates back to the 1940s with the work of Edwin Hewitt [17] and others, and the use of ultrapowers to study Banach spaces [10, 16] has become an important tool in modern functional analysis. Nonstandard analysis is a far-reaching generalization of these applications of ultrapowers. One essential difference between the method of ultrapowers and the method of nonstandard analysis is the consistent use of the transfer principle in the latter. To present this principle one needs a certain amount of mathematical logic, but the logic is used in an essential way only in stating and proving the transfer principle, and not in applying nonstandard analysis. We hope to demonstrate that the amount of logic needed is minimal, and that the advantages gained in the use of the transfer principle are substantial.

The aim of this book is to make Robinson’s discovery, and some of the subsequent research, available to students with a background in undergraduate mathematics. In its various forms, the manuscript was used by the second author in several graduate courses at the University of Illinois at Urbana-Champaign. The first chapter and parts of the rest of the book can be used in an advanced undergraduate course. Research mathematicians who want a quick introduction to nonstandard analysis will also find it useful. The main additions of this book to the contributions of previous textbooks on nonstandard analysis [12, 37, 42, 46] are the first chapter, which eases the reader into the subject with an elementary model suitable for the calculus, and the fourth chapter on measure theory in nonstandard models.

A more complete discussion of this book’s four chapters must begin by noting H. Jerome Keisler’s major contribution to nonstandard analysis in the form of his 1976 textbook, “Elementary Calculus” [23] together with the instructor’s volume, “Foundations of Infinitesimal Calculus” [24]. Keisler’s book is an excellent calculus text (see the second author’s review [30]) which makes that part of nonstandard analysis needed for the calculus available to freshman students. Keisler’s approach uses equalities and inequalities to transfer properties from the real number system to the hyperreal numbers. In our first chapter, we have modified that approach to an equivalent one by formulating a simple transfer principle based on a restricted language.

The first chapter begins by using ultrafilters on the set of natural numbers to construct a simple ultrapower model of the hyperreal numbers. A formal language is then developed in which only two kinds of sentences are used to transfer properties from the real number system to the larger, hyperreal number system. The rest of the chapter is devoted to extensive applications of this simple transfer principle to the calculus and to more-advanced real analysis including differential equations. By working through these applications, the reader should acquire a good feeling for the basics of nonstandard analysis by the end of the chapter. Anyone who begins this book with no background in mathematical logic should have no problem with the logic in the first chapter and hence should easily pick up the background needed to proceed. Indeed, it is our hope that such a reader will grow quite impatient with the restrictions on the language we impose in the first chapter, and thus be more than ready for the general language introduced in Chapter II and used in the rest of the book. We will not comment on what might be in the mind of a logician at that point.

Chapter II extends the context of Chapter I to “higher-order” models appropriate to the discussion of sets of sets, sets of functions, etc., and covers the notions of internal and external sets and saturation. These topics, together with a general language and transfer principle, are held in abeyance until the second chapter so that the beginner can master the subject in reasonably easy steps.
They are, however, essential to the applications of nonstandard analysis in modern mathematics. External constructions, such as the nonstandard hulls discussed in Chapter III and the standard measure spaces on nonstandard models described in Chapter IV, have been the principal tools through which new results in standard mathematics have been obtained using nonstandard analysis.

The general theory of Chapter II is applied in Chapter III to topological spaces. These are sets with an additional structure giving the notion of nearness. The presentation assumes no familiarity with topology but is rather brisk, so that acquaintance with elementary topological ideas would be useful. The chapter includes discussions of compactness and of metric, normed, and Hilbert spaces. We present a brief discussion of nonstandard hulls of metric spaces, which are important in nonstandard technique. Some of the more advanced topics in Kelley’s “General Topology,” such as function spaces and compactifications, are also included.

Finally, in Chapter IV, we introduce the reader to nonstandard measure theory, certainly one of the most active and fruitful areas of present-day research in nonstandard analysis. With measure theory one extends the notion of the Riemann integral. We shall take a “functional” approach to the integral on nonstandard spaces. This approach will produce both classical results in standard integration theory and some new results which have already proved quite useful in probability theory, mathematical physics, and mathematical economics. The development in this chapter does not assume familiarity with measure theory beyond the Riemann integral. Most of the results in [27, 29, 32, 33] are presented without further reference. We note here that the measures and measure spaces constructed on nonstandard models in Chapter IV are often referred to in the literature as Loeb measures and Loeb spaces.

With one exception (Section I.15), every section of the book has exercises. In designing the text, we have assumed the active participation of the reader, so some of the exercises are details of proofs in the text. At the back of the book there is a list of the notation used, together with the page where the notation is introduced. Of course, we freely use the symbols $\in$, $\cup$, and $\cap$ for set membership, union, and intersection. We have starred sections that can be skipped at the first reading.

Every item in the book has three numbers: the number of the chapter (I, II, III, or IV), the number of the section, and the number of the item in the section. Thus, Theorem IV.2.3 is the third item in the second section of the fourth chapter. In referring to an item, we shall omit the chapter number for items in the same chapter as the reference, and the section number for items in the same section as the reference.

CHAPTER I

Infinitesimals and The Calculus

Our aim in this chapter is to introduce the reader to nonstandard analysis in the familiar context of the calculus. It was in this context that the concept of an infinitesimal was used by Leibniz and his followers to define the derivative, thus launching the infinitesimal calculus on its spectacular development. The notion of an infinitesimal is a cornerstone in all applications of nonstandard methods to analysis, and so an understanding of this chapter is basic to the rest of the book. Moreover, such an understanding will make the technical elaborations of the later chapters easier to appreciate.

In spite of the many technical advantages attending the use of infinitesimals as developed by Leibniz, the notion of infinitesimal was always controversial. The main question was whether infinitesimals actually existed. Since an infinitesimal real number was supposed to be smaller in absolute value than any ordinary positive one, it was clear that no infinitesimal other than zero could be an ordinary real number. Leibniz regarded them as “numbers” in some ideal world. Further, he implicitly made the important but somewhat vague hypothesis that the infinitesimals satisfied the same rules as the ordinary real numbers.

Consider how this hypothesis would work in the calculation of the derivative of the function $e^x$. Leibniz would write

$$\frac{de^x}{dx} = \frac{e^{x+dx} - e^x}{dx} = e^x\left(\frac{e^{dx} - 1}{dx}\right) = e^x,$$

where $dx$ is an infinitesimal. A separate calculation (Example 11.3.2) would show that $(e^{dx} - 1)/dx = 1$. We will learn in this chapter that the foregoing calculation is correct as long as the equality signs are replaced by $\simeq$, where $a \simeq b$ means that $a$ and $b$ are infinitesimally close. Two facts should be noted:

(a) We need to be able to add infinitesimals to ordinary real numbers. This implies that both infinitesimals and ordinary reals are contained in a larger set of “numbers” for which the operations of arithmetic are defined.


(b) The function $e^x$ needs to be extended to this larger set of numbers in such a way that the law of exponents is satisfied.

The example of the previous paragraph shows that to make Leibniz’s approach to the calculus rigorous we must

A. construct a set $^*R$ of “numbers” and define operations of addition, multiplication, and linear ordering on $^*R$ so that (i) the field $R$ of real numbers (or an isomorphic copy of $R$) is embedded as a subfield of $^*R$ and (ii) the laws of ordinary arithmetic are valid in $^*R$,
B. show how functions and relations on $R$ are extended to functions and relations on $^*R$, thus extending the “relational” structure on $R$ to one on $^*R$,
C. ensure that statements true in the relational structure on $R$ are “extended” to statements true in the relational structure on $^*R$.

A set $^*R$ having the properties mentioned in A is developed in §I.1 using ultrafilters. We show in the Appendix that the existence of ultrafilters follows from Zorn’s lemma, a form of the axiom of choice. In §I.2 we show how relations and functions on $R$ are extended to relations and functions on $^*R$. To deal with C we must develop a very modest amount of mathematical logic (§§I.3 and I.4) in order to make precise what is meant by the words “statement” and “true.” The sense in which true statements for $R$ “extend” to true statements for $^*R$ is made precise in the transfer principle, which is stated in §I.5. This principle is at the heart of nonstandard methods as developed by Abraham Robinson. Its proof is deferred to §I.15 since it is not necessary to know the proof in order to apply the transfer principle. In the intervening sections we show how to use the transfer principle to prove results in the calculus. The proofs are usually similar to those developed in the early days of the calculus except for the role played by mathematical logic.

As noted in the Preface, we have used a very simple formal language in this chapter in order to facilitate the initiation of readers not familiar with formal languages. Consideration of a more elaborate language and nonstandard model is deferred until Chapter II.
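The opening derivative calculation can be restated with $\simeq$ in place of the final equality. The following is only a sketch, assuming that $x$ is an ordinary real number (so that $e^x$ is finite) and taking the relation $(e^{dx} - 1)/dx \simeq 1$ from the separate calculation cited in the text as given:

```latex
\begin{align*}
\frac{e^{x+dx} - e^{x}}{dx}
  &= \frac{e^{x}\,e^{dx} - e^{x}}{dx}
     && \text{(law of exponents for the extended function)} \\
  &= e^{x}\,\frac{e^{dx} - 1}{dx}
     && \text{(ordinary arithmetic in the larger number system)} \\
  &\simeq e^{x}
     && \text{(a finite number times a quantity } \simeq 1\text{)} .
\end{align*}
```

The first two steps are genuine equalities and use exactly the two facts (a) and (b) noted in the text; only the last step is an infinitesimal approximation.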

I.1 The Hyperreal Number System as an Ultrapower

We assume that anyone reading this book is familiar with the real number system as a complete linearly ordered field $\mathcal{R}$ = (R, +, [...] if r < 0.


This absolute value has all of the properties of the familiar absolute value in $R$. In Exercise 8 the reader is asked to show that if $r = [\langle r_i \rangle]$ then $|r| = [\langle |r_i| \rangle]$.

Next we want to show that $\mathcal{R}$ can be embedded isomorphically as a linearly ordered subfield of $^*\mathcal{R}$. To be precise, we define a mapping $* : R \to {}^*R$ as follows.

1.9 Definition If $r \in R$, we define $*(r) = {}^*r$, where $^*r = [\langle r, r, \ldots \rangle] \in {}^*R$.

1.10 Theorem The mapping $*$ is an order-preserving isomorphism of $R$ into $^*R$.

Proof: The mapping $*$ is 1-1, for if $^*r = {}^*s$ then $[\langle r, r, \ldots \rangle] = [\langle s, s, \ldots \rangle]$ and so $r = s$. It is a trivial matter to show that $*$ preserves the field and order properties. For example, the equation $[\langle r, r, \ldots \rangle] + [\langle s, s, \ldots \rangle] = [\langle r + s, r + s, \ldots \rangle]$ establishes $^*(r + s) = {}^*r + {}^*s$. The details are left to the reader (Exercise 9). □

Of particular interest are the standard numbers in $^*R$; these are the images of elements of $R$ under $*$.

1.11 Definition If $A \subseteq R$ then (A), is the set of all elements $^*a$, where $a \in A$; (R), is the set of standard numbers in $^*R$.

Finally we want to show that $^*R$ contains numbers other than standard numbers. In order to do so we use for the first time the assumption that $\mathcal{U}$ is a free ultrafilter. Consider the number $\omega = [\langle 1, 2, 3, \ldots \rangle]$. This number cannot equal any standard number $^*r = [\langle r, r, r, \ldots \rangle]$, for the set $\{i \in N : r = i\}$ consists of at most one natural number, and a free ultrafilter contains no finite set. Thus $^*R$ is a strictly larger set than (R),. In §I.6, $\omega$ will be called an infinite number. Similarly the number $\omega^{-1} = [\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots \rangle]$ is not in (R), and is called an infinitesimal. We will see that there are many other distinct infinite and infinitesimal numbers in $^*R$.

To sum up, we have shown that the structure $^*\mathcal{R}$ is at least an ordered field. The proof of this fact has involved simple but tedious manipulations involving the ultrafilter $\mathcal{U}$. One might ask whether other properties of $\mathcal{R}$ are likewise true of $^*\mathcal{R}$. For example, $\mathcal{R}$ has the property that if $r < s$ then there is a number $t$ so that $r < t < s$. It turns out that $^*\mathcal{R}$ also has this property (Exercise 10). After checking this and a few more properties, one begins to suspect that all reasonable statements that are true in $\mathcal{R}$ are also true in $^*\mathcal{R}$ if the statements are suitably interpreted. This is the content of the transfer principle, which will be stated in a simple form in §I.5 and proved at the end of the chapter. With the transfer principle the proofs of Theorem 1.7 and similar results become trivial.
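The argument that $\omega$ is not a standard number only needs two index sets, one finite and one cofinite. The sketch below is an illustration rather than a construction of an ultrafilter: it relies on the standard fact that a free ultrafilter on N contains every cofinite set and no finite set, so membership is already decided for sets of these two kinds. The class `IndexSet` and the helper names are hypothetical, introduced only for this example.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class IndexSet:
    """A subset of N = {1, 2, 3, ...} that is either finite or cofinite.

    If cofinite is False, `members` lists the elements of the set.
    If cofinite is True, `members` lists the elements of its complement.
    """
    members: frozenset
    cofinite: bool = False


def in_free_ultrafilter(a: IndexSet) -> bool:
    # Every free ultrafilter on N contains all cofinite sets and no finite
    # set, so for these sets the answer does not depend on the ultrafilter.
    return a.cofinite


def agree_with_constant(r: float) -> IndexSet:
    """{i in N : i == r}: where <1, 2, 3, ...> agrees with <r, r, r, ...>."""
    if r >= 1 and float(r).is_integer():
        return IndexSet(frozenset({int(r)}))
    return IndexSet(frozenset())


def exceeds_constant(r: float) -> IndexSet:
    """{i in N : i > r}, stored through its finite complement {i : i <= r}."""
    bound = max(0, math.floor(r))
    return IndexSet(frozenset(range(1, bound + 1)), cofinite=True)


# omega differs from the standard number *5, and omega > *5.5.
omega_equals_5 = in_free_ultrafilter(agree_with_constant(5))
omega_exceeds_5_5 = in_free_ultrafilter(exceeds_constant(5.5))
```

Here `omega_equals_5` is false for every real constant in place of 5, while `omega_exceeds_5_5` stays true for every real bound, which is the sense in which $\omega$ lies above all standard numbers.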


Exercises I.1

1. Show that $(\hat{R}, \oplus, \odot)$ is a ring with identity $\langle 1, 1, 1, \ldots \rangle$ and zero $\langle 0, 0, 0, \ldots \rangle$.
2. Fix an ultrafilter $\mathcal{U}$ in a set $I$ and show that if $A_1, A_2, \ldots, A_n$ are a finite number of subsets of $I$ with $A_i \cap A_j = \emptyset$ for $i \neq j$ and $\bigcup A_i\ (1 \le i \le n) = I$, then one and only one of the sets $A_i$ is in $\mathcal{U}$.
3. Complete the proof of Lemma 1.4.
4. Show that if $r = [\langle r_i \rangle]$ and $s = [\langle s_i \rangle]$, then $r = s$ (equality of equivalence classes as sets) if and only if $\langle r_i \rangle = \langle s_i \rangle$ a.e.
5. Show that parts (ii) and (iii) of Definition 1.6 are independent of the representatives chosen for the equivalence classes. Also show that $r \le s$ if and only if $\{i \in N : r_i \le s_i\} \in \mathcal{U}$.
6. Prove that $^*\mathcal{R}$ is a ring.
7. Establish the properties ($\alpha$) and ($\beta$) of the ordering $<$ which are stated in the proof of Theorem 1.7.
8. Show that if $r = [\langle r_i \rangle]$ then $|r| = [\langle |r_i| \rangle]$.
9. Complete the proof of Theorem 1.10.
10. For any $r, s \in {}^*R$ with $r < s$, show that there exists a $t \in {}^*R$ with $r < t < s$.
11. Show directly (without using Theorem 1.7) that if $r < s$ and $s < t$, then $r < t$.
12. Show that $|r + s| \le |r| + |s|$ and $|rs| = |r||s|$ for all $r, s \in {}^*R$.
13. Show that there are infinitely many distinct elements of $^*R$ greater than $\omega = [\langle 1, 2, 3, \ldots \rangle]$.
14. Show that an ultrafilter $\mathcal{U}$ is free iff it is not fixed at any $x \in I$.
15. Show that if one lets $\mathcal{U}$ be an ultrafilter fixed at some $n \in N$ in the construction of $^*\mathcal{R}$, then $^*\mathcal{R}$ is isomorphic to $\mathcal{R}$.

I.2 *-Transforms of Relations

In order to do calculus we must introduce sets and functions into discussions involving $\mathcal{R}$ and $^*\mathcal{R}$. Of course, sets and functions are just special types of relations. We will show how to extend relations from $\mathcal{R}$ to $^*\mathcal{R}$. The procedure generalizes what we have done for the relation $<$.

2.1 Definition For any set $S$, the set $S^n = S \times S \times \cdots \times S$ ($n$ factors) is the set of ordered $n$-tuples $\langle a^1, a^2, \ldots, a^n \rangle$, $a^i \in S$. An $n$-ary relation $P$ on $S$ is a subset of $S^n$. If $\langle a^1, \ldots, a^n \rangle$ is an element of $P$ we write either $\langle a^1, \ldots, a^n \rangle \in P$ or $P\langle a^1, \ldots, a^n \rangle$. The complement of a relation $P$ is the relation $P' = (S \times S \times \cdots \times S) - P$. In particular, a subset $A$ of $S$ is a unary relation on $S$, and we write $c \in A$ or $A\langle c \rangle$ if $c$ is in $A$. The domain of $P$ is the subset of $S^{n-1}$ consisting of those $(n-1)$-tuples $\langle a^1, \ldots, a^{n-1} \rangle$ in $S^{n-1}$ for which there exists an $a \in S$ so that $P\langle a^1, \ldots, a^{n-1}, a \rangle$. The set of all such elements $a$ is called the range of $P$. We write dom $P$ and range $P$ for the domain and range of $P$. An $S$-valued function $f$ of $n$ variables on $S$ is an $(n+1)$-ary relation with the special property that if $f\langle a^1, a^2, \ldots, a^n, a \rangle$ and $f\langle a^1, a^2, \ldots, a^n, b \rangle$ then $a = b$. Here $a$ is called the image of $\langle a^1, \ldots, a^n \rangle$ under $f$. We also frequently write $f(a^1, \ldots, a^n) = a$ if $f\langle a^1, \ldots, a^n, a \rangle$ (notice the different brackets). If $f(a^1, \ldots, a^n) = f(b^1, \ldots, b^n)$ implies $\langle a^1, \ldots, a^n \rangle = \langle b^1, \ldots, b^n \rangle$, then we say that $f$ is one-to-one (1-1) or injective.
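The notions in Definition 2.1 can be tried out on a small finite set. The following is a minimal sketch, assuming relations are modelled as Python sets of tuples; the function names and the example relations `less` and `succ` are illustrative choices, not notation from the text.

```python
from itertools import product


def complement(P, S, n):
    """Complement P' = S^n - P of an n-ary relation P on S."""
    return set(product(S, repeat=n)) - set(P)


def domain(P):
    """(n-1)-tuples that are related by P to at least one last coordinate."""
    return {t[:-1] for t in P}


def range_of(P):
    """All last coordinates that occur in tuples of P."""
    return {t[-1] for t in P}


def is_function(P):
    """True if no (n-1)-tuple is related to two different last coordinates."""
    seen = {}
    for t in P:
        if seen.setdefault(t[:-1], t[-1]) != t[-1]:
            return False
    return True


def is_injective(f):
    """For a function f given as a relation: distinct arguments, distinct images."""
    images = [t[-1] for t in f]
    return len(images) == len(set(images))


S = {0, 1, 2}
less = {(a, b) for a in S for b in S if a < b}   # a binary relation on S
succ = {(a, (a + 1) % 3) for a in S}             # a function of one variable on S
```

On this example `less` is a relation but not a function (0 is related to both 1 and 2), while `succ` is a one-to-one function whose domain is all of `S`.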


As examples, we see that $=$ and $<$ are 2-ary (usually written “binary”) relations on $R$; we will usually write $a = b$ and $a < b$ rather than $=\langle a, b \rangle$ and [...] 2 (4.1) is true in $\mathcal{R}$. Likewise, the sentence

(4.2)    $(\forall x)(\forall y)[x > 0 \wedge y > 0 \to xy > 0]$

is true in $\mathcal{R}$, and expresses the fact that the product of positive real numbers is positive. Note that

(4.3)    $(\forall x)[(\sqrt{x} \ge -1) \to (\sqrt{x} \ge 0)]$

is true in $\mathcal{R}$, since $\sqrt{a}$ is interpretable only for $a \ge 0$, and if $a \ge 0$ then $\sqrt{a} \ge -1$ and $\sqrt{a} \ge 0$. On the other hand,

(4.4)    $(\forall x)[\underline{R}\langle x \rangle \to \sqrt{x} \ge 0]$

is not true in $\mathcal{R}$, since $\sqrt{a}$ is not defined for all real numbers $a$.

Many examples will be presented in this and the succeeding sections which will provide practice in deciding on the truth in $\mathcal{S}$ of simple sentences. It is even more important to be able to translate an informal mathematical statement about a relational system $\mathcal{S}$ (involving English phrases like “for all,” “there exists,” “and,” and “or”) into a sentence in $L_{\mathcal{S}}$ which has the same interpretation. The rest of this section is devoted to some examples and remarks concerning this problem.

A basic problem in translation is that the simple language of this chapter involves only formal analogues of the phrases “for all,” “and,” and “implies” and these must occur in certain formal combinations in a simple sentence, whereas informal mathematical statements often involve phrases like “there exists,” “or,” and “not.” It is not always easy to decide whether there exists a corresponding sentence in our simple language having the same interpretation. Fortunately, the specific translations which are necessary to do calculus will not cause any difficulty. Some typical examples follow.

Sometimes the translation is direct. Take, for example, the statement “The distributive law holds in $R$,” or, more precisely, “For all real numbers $x$, $y$, and $z$, $x(y + z) = xy + xz$.” Some simple sentences in $L_{\mathcal{R}}$ which each correspond to this statement are

(4.5)    $(\forall x)(\forall y)(\forall z)[\underline{R}\langle x \rangle \wedge \underline{R}\langle y \rangle \wedge \underline{R}\langle z \rangle \to x(y + z) = xy + xz]$

and

(4.6)    $(\forall x)(\forall y)(\forall z)[1 = 1 \to x(y + z) = xy + xz]$.

In the latter sentence we have used the fact that the only substitutions allowed for variables in $L_{\mathcal{R}}$ are names of elements in $R$. We often, however, write $\underline{R}\langle x \rangle$ for clarity.

Mathematical statements involving “not” attached to a given $n$-ary relation $P$ on $S$ can often be restated using the complement $P'$ of $P$. For example, corresponding to the true statement “2 is not less than 1” is the atomic sentence $\underline{L}'\langle 2, 1 \rangle$, where $L$ is the relation $\{\langle x, y \rangle \in R^2 : x < y\}$ and $L'$ is the complement of $L$ given by $\{\langle x, y \rangle \in R^2 : x \ge y\}$. Note, however, that if a term in $\underline{P}\langle \tau^1, \ldots, \tau^n \rangle$ is not interpretable in $\mathcal{S}$, then neither $\underline{P}\langle \tau^1, \ldots, \tau^n \rangle$ nor $\underline{P}'\langle \tau^1, \ldots, \tau^n \rangle$ is true in $\mathcal{S}$.

Statements involving “or” are sometimes more difficult to deal with. Consider the law of trichotomy for the ordering on $R$, “For all real $x$, either $x > 0$ or $x = 0$ or $x < 0$.” This can be translated by any one of the three simple sentences

$(\forall x)[\underline{R}\langle x \rangle \wedge x \not< 0 \wedge x \ne 0 \to x > 0]$,
$(\forall x)[\underline{R}\langle x \rangle \wedge x \not< 0 \wedge x \not> 0 \to x = 0]$,
$(\forall x)[\underline{R}\langle x \rangle \wedge x \not> 0 \wedge x \ne 0 \to x < 0]$.

[...]
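The difference between sentences (4.3) and (4.4) comes from how non-interpretable terms are treated. The sketch below is a hedged illustration, reading (4.3) as $(\forall x)[\sqrt{x} \ge -1 \to \sqrt{x} \ge 0]$ and (4.4) as $(\forall x)[\underline{R}\langle x \rangle \to \sqrt{x} \ge 0]$, and assuming the rule stated in the text that an atomic sentence with a non-interpretable term is not true. The helper names are hypothetical, and a finite sample can refute a universal sentence but cannot prove it.

```python
import math


def sqrt_term(a):
    """Interpret the term sqrt(a); return None when it is not interpretable."""
    return math.sqrt(a) if a >= 0 else None


def atomic_true(relation, *terms):
    """An atomic sentence is true only if every term is interpretable
    and the resulting tuple belongs to the relation."""
    if any(t is None for t in terms):
        return False
    return relation(*terms)


def holds_on(samples, antecedent, consequent):
    """Check (forall x)[antecedent -> consequent] on a finite sample:
    whenever the antecedent is true, the consequent must be true too."""
    return all(consequent(a) for a in samples if antecedent(a))


def ge(u, v):
    return u >= v


def is_real(u):
    return True


samples = [-4.0, -1.0, 0.0, 0.25, 9.0]

# (4.3): the antecedent already fails to be true when sqrt(x) is undefined.
s43 = holds_on(samples,
               lambda a: atomic_true(ge, sqrt_term(a), -1.0),
               lambda a: atomic_true(ge, sqrt_term(a), 0.0))

# (4.4): the antecedent R<x> is true for every real x, including x < 0.
s44 = holds_on(samples,
               lambda a: atomic_true(is_real, a),
               lambda a: atomic_true(ge, sqrt_term(a), 0.0))
```

On this sample `s43` holds while `s44` fails at the negative entries, matching the explanation in the text: in (4.4) a negative number satisfies the antecedent but leaves the consequent without an interpretation.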

How could the $\psi_i$ be defined?

The ideas just presented do not constitute a general translation scheme between statements and simple sentences, but will suffice for the problems presented in this chapter. In the next chapter we present a richer formal language for more general mathematical structures which will involve formal analogues of “there exists,” “or,” and “not,” and so will avoid Skolem functions. We have restricted ourselves to simple sentences in this chapter because the transfer principle is easier to state and prove for these sentences and because this restriction allows a more gradual introduction to the general techniques of nonstandard analysis.

Exercises I.4

1. Show in detail that the sentence (4.1) is true when interpreted in $\mathcal{R}$.
2. Show that the sentences (4.9) express the fact that $B$ is the range of the function $f$ of $n$ variables. In doing so, define the Skolem functions $\psi_i$, 1 [...] 0 there is a $\delta > 0$ so that $|x - a| < \delta$ implies $|f(x) - f(a)| < \varepsilon$.
9. Let $A$, $B$, and $C$ denote unary relations defining subsets of $R$. Write simple sentences whose interpretation in $\mathcal{R}$ asserts that
(a) $A \subset B$,
(b) $A = B$,
(c) $C = A \cap B$,
(d) $C = A \cup B$.

I.5 The Transfer Principle for Simple Sentences

We are now able to state accurately the transfer principle for simple sentences in $L_{\mathcal{R}}$. The proof will be deferred to the end of the chapter. In the intervening sections we will present many applications of the principle which should convince the reader that it is a very powerful tool. Moreover, it will be clear that one need not know the proof of the transfer principle to apply it successfully. A transfer principle for more general sentences and more general mathematical structures will be presented in Chapter II.

We first introduce the notion of the *-transform of a sentence in $L_{\mathcal{R}}$. Here, we adopt the following conventions.

5.1 Conventions
(a) If $\underline{r}$ is a name in $L_{\mathcal{R}}$ of $r \in R$ then $\underline{r}$ is also a name in $L_{^*\mathcal{R}}$ of $^*r \in {}^*R$ (remember that we identify $r$ and $^*r$).
(b) If $\underline{P}$ is a name in $L_{\mathcal{R}}$ of the relation $P$ on $R$ then $^*\underline{P}$ is a name in $L_{^*\mathcal{R}}$ of the relation $^*P$ on $^*R$. In particular,
(c) if $\underline{f}$ is a name in $L_{\mathcal{R}}$ of the function $f$ on $R$ then $^*\underline{f}$ is a name in $L_{^*\mathcal{R}}$ of the function $^*f$ on $^*R$.
(d) The symbols [...]

[...] 0 infinitesimal in $^*R\}$ = $\bigcap \{y \in {}^*R : |x - y| < \varepsilon,\ \varepsilon > 0 \text{ in } R\}$.
12. Show that if $x_i \in {}^*R$, $1 \le i \le n$, then $\sqrt{x_1^2 + \cdots + x_n^2} \simeq 0$ iff $x_i \simeq 0$ for all $i$, $1 \le i \le n$.
13. Show that if $a$ and $b$ are finite numbers in $^*R$ with $b \ne 0$, and $n$ is infinite in $^*N$, then $a + nb$ is infinite.


I.7 The Hyperintegers

The set of integers, which we denote by $Z$, and the set $N$ of natural numbers play central roles in analysis. We therefore pay particular attention to the structure of the *-transforms $^*Z$ and $^*N$ of these sets; we will call elements of $^*Z$ and $^*N$ hyperintegers and hypernatural numbers, respectively. In the literature, $^*Z$ and $^*N$ are often called the nonstandard integers and nonstandard natural numbers, respectively. The first obvious fact is the following.

7.1 Proposition $^*Z$ is a linearly ordered subring of $^*R$.

Proof: To show that $^*Z$ is a subring of $^*R$, we need only check that it is closed under addition and multiplication. This fact follows from the interpretation in $^*R$ of the *-transform of the simple sentence

(7.1)    $(\forall x)(\forall y)[\underline{Z}\langle x \rangle \wedge \underline{Z}\langle y \rangle \to \underline{Z}\langle x + y \rangle \wedge \underline{Z}\langle x \cdot y \rangle]$,

which is true in $\mathcal{R}$. Finally, notice that $^*Z$ inherits the linear ordering on $^*R$. □

In $\mathcal{R}$ there is a greatest integer function $[\,\cdot\,] : R \to Z$ which satisfies

(7.2)    $[x] \le x < [x] + 1$

for all $x \in R$. Therefore the extended function $^*[\,\cdot\,] : {}^*R \to {}^*Z$ satisfies $^*[x] \le x < {}^*[x] + 1$ for all $x \in {}^*R$ by the transfer principle. Thus we have

7.2 Proposition For each $x \in {}^*R$ there is an element $k \in {}^*Z$ so that $k \le x < k + 1$.
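0 so that

(9.1) $(\forall x)[\underline{R}\langle x\rangle \wedge |x - a| < \varepsilon \to \underline{A}\langle x\rangle]$

is true in $\mathcal{R}$. By transfer, if $x \in {}^*R$ and $|x - a| < \varepsilon$ then $x \in {}^*A$. In particular, if $|x - a| \simeq 0$ then $x \in {}^*A$ and so $m(a) \subset {}^*A$. Conversely, suppose that $m(a) \subset {}^*A$ for each $a \in A$. If $A$ is not open, there exists an $a \in A$ so that for each $n \in N$ we can find an $x_n \in A'$ with $|x_n - a| < 1/n$. Define a Skolem function $\psi: N \to R$ by $\psi(n) = x_n$, where $x_n$ is a specifically chosen element of $A'$ with $|x_n - a| < 1/n$. Then the sentence

(9.2) $(\forall n)[\underline{N}\langle n\rangle \to \underline{A'}\langle \psi(n)\rangle \wedge |\psi(n) - a| < 1/n]$

is true in $\mathcal{R}$. By transfer, for all $n \in {}^*N$, $^*\psi(n) \in {}^*A'$ and $|{}^*\psi(n) - a| < 1/n$. In particular, for $n = \omega$ where $\omega$ is infinite, the number $x_\omega = {}^*\psi(\omega)$ satisfies $x_\omega \in {}^*A'$ and $|x_\omega - a| < 1/\omega \simeq 0$, i.e., $x_\omega \in m(a)$ (contradiction).

(ii) This assertion can be proved by noting that, by definition, $A$ is closed iff $A'$ is open (exercise). □

9.2 Theorem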

(i) If $\{A_i : i \in I\}$ is a collection of open sets in $R$, then $\bigcup A_i\ (i \in I)$ is open.
(ii) If $A_1, \ldots, A_n$ are open in $R$, then $\bigcap A_i\ (1 \le i \le n)$ is open.
(iii) If $\{A_i : i \in I\}$ is a collection of closed sets in $R$, then $\bigcap A_i\ (i \in I)$ is closed.
(iv) If $A_1, \ldots, A_n$ are closed in $R$, then $\bigcup A_i\ (1 \le i \le n)$ is closed.

Proof: We prove (i) and (ii) and leave the proofs of (iii) and (iv) to the reader.

(i) Let $x \in \bigcup A_i\ (i \in I)$. Then $x \in A_j$ for some $j \in I$ and so $m(x) \subset {}^*A_j$ by 9.1(i). Thus $m(x) \subset \bigcup {}^*A_i\ (i \in I) \subseteq {}^*[\bigcup A_i\ (i \in I)]$, the last inclusion by Proposition 5.8(iii). This shows that $\bigcup A_i\ (i \in I)$ is open by 9.1(i).

(ii) Let $x \in \bigcap A_i\ (1 \le i \le n)$. Then $x \in A_i$ and so $m(x) \subset {}^*A_i$ for each $i$, $1 \le i \le n$, by 9.1(i). Thus $m(x) \subset {}^*A_1 \cap \cdots \cap {}^*A_n = {}^*[\bigcap A_i\ (1 \le i \le n)]$, the last equality by Proposition 5.8(ii). Thus $\bigcap A_i\ (1 \le i \le n)$ is open by 9.1(i). □


1.9 Topology on the Reals

Recall that a point $x \in R$ is an accumulation point of a set $A \subseteq R$ if, for every $n \in N$, there is a point $y$ in $A$ different from $x$ with $|y - x| < 1/n$. The set of accumulation points of $A$ is denoted by $\hat{A}$, and the closure of $A$ is the set $\bar{A} = A \cup \hat{A}$.

9.3 Proposition A point $x \in R$ is an accumulation point of $A \subseteq R$ iff there is a $y \ne x$ in $^*A$ with $y \simeq x$.

Proof: Suppose that $x$ is an accumulation point of $A$. Then for each $n \in N$ we can find a $y \ne x$ in $A$ with $|x - y| < 1/n$. Let $\psi: N \to A$ be a Skolem function obtained by associating a $y \in A$ with each $n \in N$ so that the sentence

(9.3) $(\forall n)[\underline{N}\langle n\rangle \to \psi(n) \ne x \wedge \underline{A}\langle \psi(n)\rangle \wedge |x - \psi(n)| < 1/n]$

is true in $\mathcal{R}$. By transfer we see that, for each $n \in {}^*N$, $^*\psi(n) \ne x$, $^*\psi(n) \in {}^*A$, and $|x - {}^*\psi(n)| < 1/n$. We need only choose $y = {}^*\psi(\omega) \in {}^*A$ for $\omega \in {}^*N_\infty$. The converse is left to the reader. □

9.4 Proposition The closure $\bar{A}$ of a set $A$ in $R$ consists of those $x \in R$ for which $m(x) \cap {}^*A$ is not empty.

Proof: If $x \in \bar{A}$ then $x \in A$ or $x \in \hat{A}$. If $x \in A$ then $x \in {}^*A$ and $x \in m(x)$. If $x \in \hat{A}$ then $m(x) \cap {}^*A$ is not empty by Proposition 9.3. The converse is established by reversing the argument. □

Proposition 9.4 can be expressed in a more graphic way. The standard part map $\mathrm{st}: G(0) \to R$ defines a mapping, also denoted by st, from subsets of $G(0)$ to subsets of $R$ by the obvious definition: for each $B \subset G(0)$, $\mathrm{st}(B) = \{\mathrm{st}(y) : y \in B\} = \{x \in R : \text{there exists a } y \in B \text{ with } y \simeq x\}$. Proposition 9.4 can be restated as asserting that $\mathrm{st}({}^*A \cap G(0)) = \bar{A}$ for any subset $A$ of $R$, and thus it shows how to construct the closure of any set $A$ by constructing the *-transform of $A$ and then collapsing back to $R$ by a standard part operation. In this form, Proposition 9.4 is a prototype of similar results obtained in more complicated situations later in this book.

9.5 Theorem For any subsets $A$ and $B$ of $R$,

(a) $A \subseteq \bar{A}$,
(b) $\bar{\bar{A}} = \bar{A}$,
(c) $\overline{A \cup B} = \bar{A} \cup \bar{B}$,
(d) $\bar{A}$ is closed,
(e) if $B$ is closed and $A \subseteq B$ then $\bar{A} \subseteq B$,
(f) if $A$ is closed then $\bar{A} = A$.

Proof: (a) Immediate from the definition. (b) $\bar{A} \subseteq \bar{\bar{A}}$ from (a). If $x \in \bar{\bar{A}}$ but $x \notin \bar{A}$ then $x$ is an accumulation point of $\bar{A}$. Thus, for any $n \in N$, there is a $y \in \bar{A}$ with $|x - y| < 1/n$; by Proposition 9.4 there is a $z \in {}^*A$ with $|x - z| < 1/n$. On the other hand, if $x \notin \bar{A}$ there is an $n \in N$ so that $|x - z| > 1/n$ for all $z \in A$. By transfer (check) this is true for all $z \in {}^*A$ (contradiction). (d) If $b \notin \bar{A}$ then $m(b) \cap {}^*\bar{A} = \emptyset$, for otherwise $b \in \bar{\bar{A}}$ by 9.4, and then $b \in \bar{A}$ by part (b). Parts (c), (e), and (f) are left as exercises. □

Next we present an important characterization of compactness due to Robinson. Recall that, by definition, the collection $A_i\ (i \in I)$ of sets is a covering of the set $A \subseteq R$ if $A \subset \bigcup A_i\ (i \in I)$, and that $A$ is compact if each covering $A_i\ (i \in I)$ by open sets contains a finite subcovering $A_i\ (i \in I')$ (i.e., $I' \subset I$ is finite). To obtain Robinson's characterization we need the following standard result.

9.6 Lemma Each covering of $A \subset R$ by open sets $A_i\ (i \in I)$ contains a finite subcovering if each covering of $A$ by a collection of open intervals $(a_n, b_n)$ with rational end points contains a finite subcovering.

Proof: Let $A_i\ (i \in I)$ be a covering of $A$ by open sets. If $x \in A$ then $x \in A_j$ for some $j \in I$. Since the rationals are dense in $R$ and $A_j$ is open, we can find rationals $a$ and $b$ so that $x \in (a, b) \subset A_j$ (why?). The corresponding countable collection covers $A$. Select a finite subcovering from this latter covering. Each interval in the finite subcovering is contained in some $A_i$, and so we may find a finite collection of the $A_i\ (i \in I)$ which also covers $A$. □

9.7 Robinson's Theorem The set $A \subset R$ is compact iff for each $y \in {}^*A$ there is an $x \in A$ with $x \simeq y$, i.e., every point in $^*A$ is near a point in $A$.

Proof: Suppose that $A$ is compact but $y \in {}^*A$ is not near any $x \in A$. Then for each $x \in A$ there is a $\delta_x > 0$ in $R$ such that $|x - y| \ge \delta_x$. Since $A$ is compact we can extract a finite subcovering $A_{x_i} = \{z \in R : |x_i - z| < \delta_{x_i}\}$ $(i = 1, 2, \ldots, n)$ from the covering of $A$ by the sets $A_x = \{z \in R : |x - z| < \delta_x\}$ $(x \in A)$. It follows that

(9.4) $(\forall y)[\underline{A}\langle y\rangle \wedge |x_1 - y| \ge \delta_{x_1} \wedge \cdots \wedge |x_{n-1} - y| \ge \delta_{x_{n-1}} \to |x_n - y| < \delta_{x_n}]$

is true in $\mathcal{R}$. Transferring to $^*\mathcal{R}$, we obtain a contradiction with the fact that $y \in {}^*A$ and $|x_i - y| \ge \delta_{x_i}$ for $i = 1, 2, \ldots, n$.

Assume now that a covering $A_i\ (i \in I)$ contains no finite subcovering. By Lemma 9.6 there exists a covering of $A$ by a countable collection $I_n = \{x \in R : a_n < x < b_n\}$, $n \in N$, of open intervals with rational end points which has no finite subcovering. Thus there is a Skolem function $\psi: N \to A$ so that

(9.5)

$(\forall n)(\forall k)[\underline{N}\langle n\rangle \wedge \underline{N}\langle k\rangle \wedge k \le n \wedge \cdots$

10.2 Proposition Let $f$ be defined on $A$ and choose $a \in A$. Then the limit $\lim_{x \to a} f(x)$ exists iff $^*f(x) \simeq {}^*f(y)$ for all $x, y \in {}^*A$ with $x \simeq a$, $y \simeq a$ but $x \ne a$, $y \ne a$.

Proof: Exercise. □

10.3 Theorem If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then

(a) $\lim_{x \to a} (f + g)(x) = L + M$,
(b) $\lim_{x \to a} (fg)(x) = LM$,
(c) $\lim_{x \to a} (f/g)(x) = L/M$ if $M \ne 0$.

Proof: Exercise. □



10.4 Proposition Let $f$ be defined on $A \subseteq R$. Then $f$ is continuous at $a \in A$ iff $^*f(x) \simeq f(a)$ for all $x \in {}^*A$ with $x \simeq a$, i.e., $^*f(m(a) \cap {}^*A) \subseteq m(f(a))$.

Proof: Immediate from 10.1 and the definition of continuity. □

Proposition 10.4 says that if $f$ is continuous at $x \in A$, and $x + \Delta x \in {}^*A$ where $\Delta x \simeq 0$, then $\Delta y = {}^*f(x + \Delta x) - f(x) \simeq 0$. For example, if $f(x) = x^2$, then $\Delta y = (x + \Delta x)^2 - x^2 = 2x\,\Delta x + (\Delta x)^2 \simeq 0$.

10.5 Theorem If $f$ and $g$ are defined on $A$ and continuous at $a \in A$, then so are $f + g$, $fg$, and [if $g(a) \ne 0$] $f/g$.

Proof: Immediate from 10.3 and 10.4. □

The preceding propositions can be used to prove the intermediate and extreme value theorems.

10.6 Intermediate Value Theorem If $f$ is continuous on the closed and bounded interval $[a, b]$ and $f(a) < d < f(b)$ for some $d$, then there exists a $c \in (a, b)$ with $f(c) = d$.

Proof: Consider the points $x_k = a + k(b - a)/n$, $0 \le k \le n$. Considering the values of $f$ at $x_k$, we see that there exists a Skolem function $\psi: N \to [a, b)$ satisfying $f(\psi(n)) < d$ and $f(\psi(n) + (b - a)/n) \ge d$ (check). Hence the sentence

(10.3) $(\forall n)[\underline{N}\langle n\rangle \to a \le \psi(n) < b \wedge f(\psi(n)) < d \wedge f(\psi(n) + (b - a)/n) \ge d]$

is true in $\mathcal{R}$. Transferring to $^*\mathcal{R}$, and letting $n \in {}^*N_\infty$, we have

(10.4) $^*f({}^*\psi(n)) < d$ and $^*f({}^*\psi(n) + (b - a)/n) \ge d$.

Let $c = \mathrm{st}({}^*\psi(n)) = \mathrm{st}({}^*\psi(n) + (b - a)/n)$. By continuity we have $f(c) \le d$ and $f(c) \ge d$, and hence $f(c) = d$. Also $c$ cannot equal either $a$ or $b$, since otherwise $f(c) = f(a)$ or $f(b)$. □
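The grid argument in this proof can be imitated numerically with a finite $n$ in place of an infinite one. The following is only a minimal sketch under stated assumptions, not the nonstandard argument itself: the helper name `grid_crossing` and the sample function `cube` are hypothetical and chosen for illustration. For a standard $n$ the helper returns a grid point playing the role of $\psi(n)$, and as $n$ grows the returned points approach a $c$ with $f(c) = d$.

```python
def grid_crossing(f, a, b, d, n):
    """Return the first grid point x_k = a + k(b-a)/n with
    f(x_k) < d <= f(x_{k+1}); assumes f(a) < d < f(b)."""
    step = (b - a) / n
    for k in range(n):
        x_k = a + k * step
        x_next = a + (k + 1) * step
        if f(x_k) < d <= f(x_next):
            return x_k
    raise ValueError("no crossing found; check that f(a) < d < f(b)")


def cube(x):
    # Hypothetical sample function, continuous on [0, 2].
    return x ** 3


# Finite stand-ins for psi(n); the values approach the solution of x**3 = 2.
approximations = [grid_crossing(cube, 0.0, 2.0, 2.0, n) for n in (10, 1000, 100000)]
print(approximations)
```

Each returned point lies within one grid step $(b - a)/n$ of a solution, which is the finite counterpart of taking the standard part when $n$ is infinite.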

10.7 Extreme Value Theorem If $f$ is continuous on the closed and bounded interval $[a, b]$, then there exists a $c \in [a, b]$ so that $f(c) \ge f(x)$ for all $x \in [a, b]$.

Proof: For each $n \in N$ construct the points $x_{n,k} = a + k(b - a)/n$, $0 \le k \le n$. There is a Skolem function $\psi: N \to N \cup \{0\}$ satisfying $\psi(n) \le n$ such that, for each $n \in N$, $f(x_{n,\psi(n)}) \ge f(x_{n,k})$, $0 \le k \le n$, since the finite set of numbers $f(x_{n,k})$, $0 \le k \le n$, has a maximum for some $k$ satisfying $0 \le k \le n$. By transfer, $^*f(x_{n,{}^*\psi(n)}) \ge {}^*f(x_{n,k})$, $0 \le k \le n$, for $k \in {}^*N$ and $n$ fixed and infinite. Then $c = \mathrm{st}(x_{n,{}^*\psi(n)})$ satisfies the conditions of the theorem. To see this, fix $d \in [a, b]$. Then $d \simeq x_{n,k}$ for some $k \in {}^*N$ with $0 \le k \le n$ (exercise), so, using continuity, $f(d) \simeq {}^*f(x_{n,k}) \le {}^*f(x_{n,{}^*\psi(n)}) \simeq f(c)$. If $f(d) \simeq f(c)$ then $f(d) = f(c)$ since both numbers are real. Otherwise $f(d) < f(c)$. □

Proposition 10.4 shows that $f$ is continuous on $A$ iff $^*f(m(a) \cap {}^*A) \subseteq m(f(a))$ for all $a \in A$. Uniform continuity on $A$ results if an analogous condition holds for all $a \in {}^*A$.

10.8 Proposition The function $f$ is uniformly continuous on a set $A$ iff $^*f(m(a) \cap {}^*A) \subseteq m({}^*f(a))$ for all $a \in {}^*A$; i.e., $a, b \in {}^*A$ and $a \simeq b$ implies $^*f(a) \simeq {}^*f(b)$.

Proof: Recall that $f$ is uniformly continuous on $A$ iff, given $\varepsilon > 0$ in $R$, there exists a $\delta > 0$ in $R$ so that, for all $a \in A$, $|f(x) - f(a)| < \varepsilon$ if $|x - a| < \delta$ and $x \in A$. Suppose that $f$ is uniformly continuous on $A$, let $\varepsilon > 0$ in $R$ be given, and find the corresponding $\delta > 0$ in $R$. Then the sentence

(10.5) $(\forall a)(\forall b)[\underline{A}\langle a\rangle \wedge \underline{A}\langle b\rangle \wedge |a - b| < \delta \to |f(a) - f(b)| < \varepsilon]$

is true in $\mathcal{R}$. By transfer, for all $a$ and $b$ in $^*A$, $|a - b| < \delta$ implies $|{}^*f(a) - {}^*f(b)| < \varepsilon$. In particular, this is true for any $\varepsilon > 0$ in $R$ if $a \simeq b$, and hence $a, b \in {}^*A$ and $a \simeq b$ implies $^*f(a) \simeq {}^*f(b)$.

Conversely, suppose $f$ is not uniformly continuous on $A$. Then there is an $\varepsilon > 0$ in $R$ so that, for each $n \in N$, there are points $\psi_1(n) = a_n \in A$ and $\psi_2(n) = b_n \in A$ with $|a_n - b_n| < 1/n$ but $|f(a_n) - f(b_n)| \ge \varepsilon$. By transfer of the appropriate sentence (the reader is invited to write one down), for each $n \in {}^*N$ there are points $a_n$ and $b_n \in {}^*A$ with $|a_n - b_n| < 1/n$ but $|{}^*f(a_n) - {}^*f(b_n)| \ge \varepsilon$. With $n \in {}^*N_\infty$, we have $a_n \simeq b_n$ but $^*f(a_n) \not\simeq {}^*f(b_n)$. □

10.9 Examples

1. $\lim_{x \to 3} x^2 = 9$ since if $h \simeq 0$, we have $(3 + h)^2 = 9 + 6h + h^2 \simeq 9$.

2. $\lim_{h \to 0} \{[(x + h)^2 - x^2]/h\} = 2x$ since if $h \simeq 0$, $h \ne 0$, $[(x + h)^2 - x^2]/h = 2x + h \simeq 2x$.

3. $\lim_{x \to \infty} (\sqrt{x + 1} - \sqrt{x}) = 0$ since for $h$ positive infinite in $^*R$

$\sqrt{h + 1} - \sqrt{h} = \dfrac{(\sqrt{h + 1} - \sqrt{h})(\sqrt{h + 1} + \sqrt{h})}{\sqrt{h + 1} + \sqrt{h}} = \dfrac{1}{\sqrt{h + 1} + \sqrt{h}} \simeq 0.$

4. $f(x) = 1/x$ is continuous on $(0, 1)$ since if $a \in (0, 1)$ and $h \simeq 0$, $1/a - 1/(a + h) = h/a(a + h) \simeq 0$. However, $f$ is not uniformly continuous on $(0, 1)$ since if $n \in {}^*N_\infty$, $1/n$ and $1/(n - 1)$ are in $^*(0, 1)$ and $1/n \simeq 1/(n - 1)$ but $^*f(1/n) - {}^*f(1/(n - 1)) = 1 \not\simeq 0$.
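Example 4 can be mimicked with a large finite $n$ standing in for an infinite hypernatural number. The sketch below uses exact rational arithmetic; the function name `gap_and_jump` is an assumption made for illustration only. The inputs $1/n$ and $1/(n - 1)$ become arbitrarily close, while the values of $f(x) = 1/x$ at those inputs always differ by exactly 1, which is the finite shadow of the failure of uniform continuity on $(0, 1)$.

```python
from fractions import Fraction


def gap_and_jump(n):
    """For f(x) = 1/x return (|1/(n-1) - 1/n|, |f(1/n) - f(1/(n-1))|) exactly."""
    x = Fraction(1, n)
    y = Fraction(1, n - 1)
    gap = abs(y - x)            # distance between the two inputs
    jump = abs(1 / x - 1 / y)   # distance between the two values of f
    return gap, jump


for n in (10, 10**3, 10**6):
    gap, jump = gap_and_jump(n)
    print(n, gap, jump)  # the gap shrinks like 1/(n(n-1)); the jump stays 1
```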

Proposition 10.8 can be used effectively to prove standard results.

10.10 Theorem If $f$ is continuous on the compact set $A$, then $f$ is uniformly continuous on $A$.

Proof: If $x, y \in {}^*A$ and $x \simeq y$, then both $x$ and $y$ are near a standard point $a \in A$ since $A$ is compact (Theorem 9.7). Thus $^*f(x) \simeq f(a) \simeq {}^*f(y)$ by continuity (Proposition 10.4), so $f$ is uniformly continuous by Proposition 10.8. □

10.11 Theorem If $A \subset R$ is compact and $f$ is continuous on $A$, then $f(A)$ is compact.

Proof: If $y \in {}^*[f(A)] = {}^*f({}^*A)$ (Proposition 5.6) then there is an $x \in {}^*A$ with $^*f(x) = y$. Since $A$ is compact there is a point $a \in A$ with $x \simeq a$ (Theorem 9.7). Then $^*f(x) = y \simeq f(a)$ since $f$ is continuous at $a$, and so $f(A)$ is compact by Theorem 9.7. □

10.12 Theorem Suppose that $f$ is uniformly continuous on each bounded subset of its domain $A$. Then $f$ has a unique extension $g$ defined on $\bar{A}$ (i.e., $f$ agrees with $g$ on $A$) such that $g$ is uniformly continuous on every bounded subset of $\bar{A}$.

Proof: Every standard point $y \in \bar{A}$ is near a finite point $x \in {}^*A$ and we define $g(y) = \mathrm{st}({}^*f(x))$. This definition is independent of the $x$ we choose since if $x' \simeq y$ then $x \simeq x'$, and both $x$ and $x'$ are in $^*B$, where $B = A \cap [-|y| - 1, |y| + 1]$ is bounded. Therefore, $^*f(x) \simeq {}^*f(x')$ by uniform continuity on $B$. We leave as an exercise the proof that $^*f(x)$ is finite. If $C = A \cap [-2n, 2n]$, $n \in N$, then, given $\varepsilon > 0$, there exists a $\delta > 0$ so that $|f(x) - f(x')| < \varepsilon/2$ if $|x - x'| < \delta$ and $x, x' \in C$. By transfer, $|{}^*f(x) - {}^*f(x')| < \varepsilon/2$ for all $x, x' \in {}^*C$ satisfying $|x - x'| < \delta$. Now if $y, y' \in \bar{A} \cap [-n, n]$ are such that $|y - y'| < \delta/2$ and $y \simeq x$, $y' \simeq x'$ for some $x, x' \in {}^*C$, then $|x - x'| < \delta$, and so $|g(y) - g(y')| \simeq |{}^*f(x) - {}^*f(x')| < \varepsilon/2$. Thus, $|g(y) - g(y')| \le \varepsilon/2 < \varepsilon$. Uniqueness is left to the reader. □

Theorem 10.12 can be used to extend the exponential function $f(x) = a^x$, $a > 0$ in $R$, defined on the rationals $Q$ to the reals $R = \bar{Q}$. The function $a^x$, $x \in Q$, satisfies the following properties.


1.10 Limits and Continuity

10.13 Properties of Exponents If $a$ and $b$ are positive reals and $q$ and $r$ are rational then

(i) $1^q = 1$,
(ii) $a^q a^r = a^{q+r}$, $a^{-q} = 1/a^q$,
(iii) $(a^q)^r = a^{qr}$,
(iv) $a^q b^q = (ab)^q$,
(v) $a < b$ and $q > 0$ implies $a^q < b^q$,
(vi) $1 < a$ and $q < r$ implies $a^q < a^r$,
(vii) $a \ge 0$ and $q \ge 1$ implies $(a + 1)^q \ge aq + 1$.

The useful inequality (vii) follows by noting that, for $x \ge 0$, $(x + 1)^q - qx - 1$ has a minimum at $x = 0$. Properties (i) through (vi) are obvious. To extend $f(x) = a^x$, $a > 0$, $x \in Q$, to $R$ we need only show that $f$ is uniformly continuous on bounded subsets of $Q$. That is, we need the following lemma.

10.14 Lemma If $a > 0$ in $R$, then $a^p \simeq a^q$ if $p \simeq q$ in $^*Q \cap G(0)$.

Proof: We may suppose that $p > q$ and $a \ge 1$ [if $0 < a < 1$ consider $a^q = (1/a)^{-q}$]. Let $b = a^{p-q} - 1$; we must show that $b \simeq 0$. By transfer from 10.13(vi), $b \ge 0$, and, by transfer from 10.13(vii),

(10.6) $a = (b + 1)^{1/(p-q)} \ge b/(p - q) + 1 \ge 1,$

so $b/(p - q)$ is a finite number $\beta$, and hence $b = (p - q)\beta \simeq 0$. □

This argument is due to Keisler [23]. It is easy to show that properties 10.13 are satisfied by the extension $g(x)$, $x \in R$, of $f(x) = a^x$, $x \in Q$. For example, $g(y + y') \simeq {}^*f(q + q') = {}^*f(q)\,{}^*f(q') \simeq g(y)g(y')$ if $q \simeq y$, $q' \simeq y'$, and $q, q' \in {}^*Q$; this establishes the first part of 10.13(ii) for $g$ since $g$ is real-valued.

Most of the results in this section can be extended to functions $f$ of $n$ variables defined on subsets of $R^n$ simply by using the definition of nearness for points in $^*R^n$ introduced in the previous section. The details are left to the reader.

Exercises 1.10

1. Prove parts (b)-(d) of Proposition 10.1.
2. Prove Proposition 10.2.
3. Prove Theorem 10.3.
4. Complete the proof of Theorem 10.7 by showing that for each $d \in [a, b]$ there is a $k \in {}^*N$ with $0 \le k \le n$ such that $d \simeq x_{n,k}$.



5. Prove that if $f$ is uniformly continuous on a bounded set $B \subset R$, then $^*f(x)$ is finite for each $x \in {}^*B$.
6. Prove uniqueness in Theorem 10.12.
7. Show that there are infinite rational numbers $p$ and $q$ with $p \simeq q$ such that $2^p \not\simeq 2^q$. Where is the assumption that $p, q \in G(0)$ used in the proof of Lemma 10.14?
8. Let

$f(x) = \begin{cases} \sin(1/x), & 0 < x \le 1, \\ 0, & x = 0. \end{cases}$

(a) Show that $f(x)$ is not continuous on $[0, 1]$. (b) Show that the function $xf(x)$ is uniformly continuous on $[0, 1]$.

9. Show that the function $f(x) = x^2$ on $(0, \infty)$ is continuous but not uniformly continuous.
10. Show that $\lim_{x \to a} f(x) = L$ iff for each sequence $\langle s_n \rangle$ with $s_n \to a$ and $s_n \ne a$, $n \in N$, we have $\lim_{n \to \infty} f(s_n) = L$.
11. Prove that if $f$ is uniformly continuous on $R$ and $\langle s_n \rangle$ is a Cauchy sequence then $\langle f(s_n) \rangle$ is a Cauchy sequence.
12. Suppose that $f$ is continuous on $R$ and satisfies $\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = 0$. Prove that $f$ is uniformly continuous.
13. Suppose that $f$ is defined on a compact set $A$ in $R$. Prove that $f$ is continuous iff the graph $\{(x, f(x)) \in R^2 : x \in A\}$ of $f$ is compact.
14. Show that if the function $f$ is continuous on the set $A$ then the zero set $\{x \in A : f(x) = 0\}$ of $f$ is closed.
15. Suppose that the function $f$ on the closed bounded interval $[a, b]$ is monotone [e.g., x 0 (h 0, $e^h \simeq 1$ by the continuity of $e^x$ (which we assume here) and so $bh \simeq 0$ and $1/bh$ is infinite. Then

$e = \lim_{x \to \infty} \left(1 + \dfrac{1}{x}\right)^x \simeq (1 + bh)^{1/bh} = (e^h)^{1/bh} = e^{1/b}.$

Hence $b \simeq 1$, and $[{}^*f(x + h) - f(x)]/h \simeq e^x$ if $h > 0$. A similar argument works for $h < 0$, showing that $f'(x) = e^x$ (this argument is due to Keisler [23]).

11.4 Theorem If $f$ is differentiable at $x \in (a, b)$, then $f$ is continuous at $x$.

Proof: By Proposition 10.1, $f(x + h) - f(x) \simeq f'(x)h$ for $h \simeq 0$, and so $f(x + h) \simeq f(x)$ for all $h \simeq 0$; i.e., $f$ is continuous at $x$. □

11.5 Theorem If $f$, defined on $(a, b)$, achieves a relative maximum or minimum at $x \in (a, b)$ and is differentiable at $x$, then $f'(x) = 0$.

Proof: Suppose that $f$ achieves a relative minimum at $x$. Then, for all $h$ sufficiently small and positive (negative), we have $[f(x + h) - f(x)]/h \ge 0$ ($\le 0$). By transfer of the appropriate sentence, we see that $[{}^*f(x + h) - f(x)]/h \ge 0$ ($\le 0$) if $h \simeq 0$ and $h > 0$ ($h < 0$). Thus $f'(x) = 0$ from 11.1 and 6.7(iv). □

Rolle’s theorem and the mean value theorem can be deduced in the standard way from this result and the extreme value theorem.


1.11 Differentiation

$= g(x)f'(x) + f(x)g'(x)$ by 11.1, 11.4 (applied to $g$), and 6.7. The result follows from Proposition 11.1. □

At this point it is natural to introduce differentials in the spirit of Leibniz. Denoting the nonzero infinitesimal $h$ by $\Delta x$, we have

$[{}^*f(x + \Delta x) - f(x)]/\Delta x \simeq f'(x)$

if $f$ is differentiable at $x$. We call

$\Delta y = {}^*f(x + \Delta x) - f(x)$

the increment of $f$ at $x$ corresponding to the increment $\Delta x$. The differential of $f$ at $x$ corresponding to $\Delta x$ is defined to be $dy = f'(x)\,\Delta x$. Notice that $\varepsilon = \Delta y/\Delta x - f'(x)$ is infinitesimal, and so

(11.1) $\Delta y = f'(x)\,\Delta x + \varepsilon\,\Delta x = dy + \varepsilon\,\Delta x.$
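The decomposition of the increment into a differential plus an error term can be checked exactly for a simple function with small rational increments standing in for infinitesimals. This is a hedged illustration only: the function $f(t) = t^2$ and the helper name `increment_parts` are assumptions chosen for the example. For this $f$ the error ratio $\varepsilon$ works out to be exactly $\Delta x$, so it shrinks along with the increment.

```python
from fractions import Fraction


def increment_parts(x, dx):
    """For f(t) = t**2 return (delta_y, dy, eps) with delta_y = dy + eps*dx."""
    delta_y = (x + dx) ** 2 - x ** 2   # increment of f at x
    dy = 2 * x * dx                    # differential f'(x) * dx
    eps = delta_y / dx - 2 * x         # error ratio; equals dx for this f
    return delta_y, dy, eps


x = Fraction(3)
for dx in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    delta_y, dy, eps = increment_parts(x, dx)
    print(dx, delta_y, dy, eps)
```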

11.7 Theorem (Chain Rule) Let $h(t) = f(g(t))$ be the composite of $f$ and $g$. If $g'(t)$ exists and $f'(g(t))$ exists [so that $g$ is defined in an interval about $t$ and $f$ is defined in an interval about $g(t)$], then $h'(t)$ exists and $h'(t) = f'(g(t))g'(t)$.

Proof: Let $x = g(t)$ and $y = h(t) = f(x)$. By (11.1),

(11.2) $\Delta y = f'(x)\,\Delta x + \varepsilon\,\Delta x, \quad \varepsilon \simeq 0,$

for any infinitesimal $\Delta x$. Setting $\Delta x = {}^*g(t + \Delta t) - g(t)$, where $\Delta t$ is any nonzero infinitesimal, and dividing by $\Delta t$, we get $\Delta y/\Delta t = f'(x)(\Delta x/\Delta t) + \varepsilon(\Delta x/\Delta t)$. The result follows by taking standard parts. □

11.8 Inverse Function Theorem Let $f$ be continuous and strictly increasing (or decreasing) on $(a, b)$ and let $g$ be the inverse of $f$. If $f$ is differentiable at $x \in (a, b)$ with $f'(x) \ne 0$, then $g$ is differentiable at $y = f(x)$, and $g'(y) = 1/f'(x)$.

Proof: Let $\Delta y \simeq 0$, $\Delta y \ne 0$, and set $\Delta x = {}^*g(y + \Delta y) - g(y)$. Then $\Delta x$ is infinitesimal and nonzero since $g$ is continuous (why?) and one-to-one. Since $f'(x) \ne 0$,

(11.3) $\dfrac{1}{f'(x)} \simeq \dfrac{\Delta x}{{}^*f(x + \Delta x) - f(x)} = \dfrac{\Delta x}{(y + \Delta y) - y} = \dfrac{\Delta x}{\Delta y}.$

Since this is true for all nonzero infinitesimals $\Delta y$, $g'(y)$ exists and equals $1/f'(x)$. □

Partial derivatives of functions of several variables are defined as usual. For notational convenience, we confine ourselves to functions $z = f(x, y)$ of two variables; the extension to functions of $n$ variables is obvious. The partial derivatives $f_x$ and $f_y$ are defined by $f_x(a, b) = g'(a)$ and $f_y(a, b) = h'(b)$, where $g(x) = f(x, b)$, $h(y) = f(a, y)$. Assuming that the partial derivatives exist, we define the increment $\Delta z$ and total differential $dz$ by

(11.4) $\Delta z = {}^*f(a + \Delta x, b + \Delta y) - f(a, b),$

and

(11.5) $dz = f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y,$

respectively, where $\Delta x$ and $\Delta y$ are arbitrary numbers in $^*R$. Note that both $\Delta z$ and $dz$ depend on $a$, $b$, $\Delta x$, and $\Delta y$. We say that $f$ is differentiable at $(a, b)$ if

(11.6) $\Delta z = dz + \varepsilon\,\Delta x + \delta\,\Delta y$

for any infinitesimals $\Delta x$ and $\Delta y$ and corresponding $\varepsilon \simeq 0$ and $\delta \simeq 0$.

11.9 Theorem If $f_x$ and $f_y$ are continuous at $(a, b)$, then $f$ is differentiable at $(a, b)$.


Proof: If $\Delta x$ and $\Delta y$ are nonzero standard numbers, then

(11.7) $f(a + \Delta x, b + \Delta y) - f(a, b) = [f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b)] + [f(a + \Delta x, b) - f(a, b)].$

Using the mean value theorem, we have

(11.8) $f(a + \Delta x, b) - f(a, b) = f_x(u, b)\,\Delta x, \qquad f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b) = f_y(a + \Delta x, v)\,\Delta y,$

where $|a - u| \le |\Delta x|$, $|b - v| \le |\Delta y|$. Hence

(11.9) $f(a + \Delta x, b + \Delta y) - f(a, b) = f_x(u, b)\,\Delta x + f_y(a + \Delta x, v)\,\Delta y.$

Since this equation is true for all standard $\Delta x$ and $\Delta y$ we have by transfer (check; you must use Skolem functions) that for $\Delta x \simeq 0$, $\Delta y \simeq 0$,

(11.10) $\Delta z = {}^*f_x(u, b)\,\Delta x + {}^*f_y(a + \Delta x, v)\,\Delta y$

for $u, v \in {}^*R$ with $|a - u| \le |\Delta x|$, $|b - v| \le |\Delta y|$. The result follows since $^*f_x(u, b) \simeq f_x(a, b)$ and $^*f_y(a + \Delta x, v) \simeq f_y(a, b)$. □

Exercises 1.11

1. Prove Theorem 11.6, parts (i) and (iii).
2. Why is the inverse function $g$ in Theorem 11.8 continuous?
3. Use Proposition 11.2 to show that if $f'$ exists then it is continuous on $[a, b]$ if and only if for each $x \in {}^*[a, b]$ and each $\Delta x$ with $\Delta x \simeq 0$ and $x + \Delta x \in {}^*[a, b]$, we have $\Delta y = {}^*f(x + \Delta x) - {}^*f(x) = {}^*f'(x)\,\Delta x + \varepsilon\,\Delta x$, where $\varepsilon \simeq 0$. That is, at any $x \in {}^*[a, b]$, $\Delta y = dy + \varepsilon\,\Delta x$ with $\varepsilon \simeq 0$ when $\Delta x \simeq 0$.
4. Consider the example $f(x) = x^2 \sin(1/x)$, $x \ne 0$, $f(0) = 0$, to see what happens in Exercise 3 if $f'$ exists but is not continuous.
5. (Darboux's Theorem) A function $f$ on $[a, b]$ may possess a derivative $f'$ on $[a, b]$ that is not continuous. Prove that if $f'(a) < c$ 0; prove that $f'(x) = 0$ for some $x \in (a, b)$. (iii) Reduce the problem to (ii) by using an appropriate function.]
6. (Hyperreal Mean Value Theorem) Let $f$ be differentiable on $(a, b)$. Assuming the standard mean value theorem (i.e., if $x < y$ are points in $(a, b)$ then there is a $c$, $x < c < y$, with $f'(c) = [f(y) - f(x)]/(y - x)$), show that if $x < y$ in $^*(a, b)$ then there is a $c \in {}^*(a, b)$, $x < c < y$, with $^*f'(c) = [{}^*f(y) - {}^*f(x)]/(y - x)$.
7. Let $f$ be twice differentiable on $(a, b)$. Prove that if $f'(c) = 0$ and $f''(c) < 0$ [$f''(c) > 0$] for some $c \in (a, b)$ then $f$ has a local maximum [minimum] at $c$. (Hint: Use Exercise 6.)



8. (Behrens [5]). A real-valued function $f$ defined in a neighborhood of $c \in R$ is uniformly differentiable at $c$ with derivative $f'(c)$ if, for each $\varepsilon > 0$ in $R$, there is a $\delta > 0$ in $R$ so that

$\left| \dfrac{f(x) - f(y)}{x - y} - f'(c) \right| < \varepsilon$

for all $x, y \in (c - \delta, c + \delta)$ with $x \ne y$.

(a) Show that $f$ is uniformly differentiable at $c$ iff there exists an $a \in R$ with

$a \simeq \dfrac{{}^*f(x) - {}^*f(y)}{x - y}$

for all $x, y \in {}^*R$ with $x \simeq y \simeq c$ and $x \ne y$, and that in this case $f'(c) = a$.
(b) Show that if $f$ has a derivative on an open interval $(a, b)$ containing $c$, then $f'$ is continuous at $c$ iff $f$ is uniformly differentiable at $c$. [Hint: see the proof of Proposition 11.2.]
(c) Give an example of a function $f$ which is uniformly differentiable at a point $c$, but every neighborhood of $c$ contains a point where $f$ is not differentiable.
(d) Show that if $f$ is uniformly differentiable at $c$ then $f$ is continuous on some neighborhood of $c$.
(e) Show that if $f$ is increasing on an interval $(a, b)$ and $f$ is uniformly differentiable at $x \in (a, b)$ with $f'(x) \ne 0$, then the inverse function $g$ is uniformly differentiable at $y = f(x)$ and $g'(y) = 1/f'(x)$.

1.12 Riemann Integration

Nonstandard analysis is a natural tool for developing the theory of Riemann integration on an interval $[a, b]$, and this section contains a few relevant results. We concentrate on integration of continuous functions on intervals $[a, b]$. The presentation in this section owes much to Keisler [23].

12.1 Definition Let $f$ be a continuous function on $[a, b] \subset R$, $a < b$. A partition $P$ of $[a, b]$ is a set $\{x_0, x_1, \ldots, x_n\}$, where $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$. The upper, lower, and ordinary Riemann sums $\overline{S}_a^b(f, P)$, $\underline{S}_a^b(f, P)$, and $S_a^b(f, P)$ of $f$ with respect to $P$ on $[a, b]$ are defined by

$\overline{S}_a^b(f, P) = \sum_{i=1}^{n} M_i\,\Delta x_i, \qquad \underline{S}_a^b(f, P) = \sum_{i=1}^{n} m_i\,\Delta x_i,$

and

$S_a^b(f, P) = \sum_{i=1}^{n} f(x_{i-1})\,\Delta x_i,$

where $M_i$ and $m_i$ are the maximum and minimum of $f$ on $[x_{i-1}, x_i]$ and $\Delta x_i = x_i - x_{i-1}$, $1 \le i \le n$. If $P$ is given by setting $x_k = a + k\,\Delta x$, $0 \le k \le n - 1$, where $\Delta x$ is a fixed positive number and $n$ is the greatest integer for which $a + (n - 1)\,\Delta x < b$, then we write $\overline{S}_a^b(f, \Delta x)$, $\underline{S}_a^b(f, \Delta x)$, and $S_a^b(f, \Delta x)$ for the upper, lower, and ordinary Riemann sums, and say that $P$ is determined by $\Delta x$. Here, $\Delta x_n = b - x_{n-1} \le \Delta x$. If $a = b$, all Riemann sums are set equal to 0.

The partition $P_2$ is a refinement of $P_1$ if $P_1 \subseteq P_2$. It is easy to see that if $P_2$ is a refinement of $P_1$, then

$\underline{S}_a^b(f, P_1) \le \underline{S}_a^b(f, P_2) \le S_a^b(f, P_2) \le \overline{S}_a^b(f, P_2) \le \overline{S}_a^b(f, P_1).$

The common refinement $P_3$ of $P_1$ and $P_2$ is given by $P_3 = P_1 \cup P_2$. Since

$\underline{S}_a^b(f, P_1) \le \underline{S}_a^b(f, P_3) \le \overline{S}_a^b(f, P_3) \le \overline{S}_a^b(f, P_2),$

it follows that any lower Riemann sum is less than or equal to any upper Riemann sum.

12.2 Definition The function $f$ on $[a, b]$ is said to be Riemann integrable on $[a, b]$ with integral $\int_a^b f(x)\,dx$ if (i) $\underline{S}_a^b(f, P) \le \int_a^b f(x)\,dx \le \overline{S}_a^b(f, P)$ for any partition $P$ of $[a, b]$ and (ii) given any $\varepsilon > 0$ in $R$ there is a partition $P$ so that $\overline{S}_a^b(f, P) - \underline{S}_a^b(f, P) < \varepsilon$.

We now set out to show that a continuous function $f$ is Riemann integrable. Although we do not have an extension of the set of partitions of $[a, b]$ in this chapter, we can fix $f$ and extend the Riemann sums determined by positive numbers $\Delta x \in R$ to $\Delta x \in {}^*R$. In the following result, $^*\overline{S}_a^b(f, \cdot)$ and $^*\underline{S}_a^b(f, \cdot)$ denote the extensions to $^*R$ of such sums $\overline{S}_a^b(f, \cdot)$ and $\underline{S}_a^b(f, \cdot)$.

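12.3 Proposition Let $f$ be continuous on $[a, b]$, and let $\Delta x$ be a positive infinitesimal in $^*R$. Then $^*\overline{S}_a^b(f, \Delta x) \simeq {}^*\underline{S}_a^b(f, \Delta x)$.

Proof: Given $\Delta x > 0$ in $R$,

$\overline{S}_a^b(f, \Delta x) - \underline{S}_a^b(f, \Delta x) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i \le \sum_{i=1}^{n} B\,\Delta x_i = B \sum_{i=1}^{n} \Delta x_i = B(b - a),$

where $B = \max_{1 \le i \le n} (M_i - m_i)$. Thus to each $\Delta x \in R^+$ correspond two points $\psi(\Delta x)$ and $\varphi(\Delta x)$ in $[a, b]$ with $|\psi(\Delta x) - \varphi(\Delta x)| \le \Delta x$ and

(12.1) $\overline{S}_a^b(f, \Delta x) - \underline{S}_a^b(f, \Delta x) \le [f(\psi(\Delta x)) - f(\varphi(\Delta x))](b - a).$

For positive $\Delta x \simeq 0$ in $^*R$ there is a $c \in [a, b]$ with $^*\psi(\Delta x) \simeq c \simeq {}^*\varphi(\Delta x)$, and hence $^*f({}^*\psi(\Delta x)) \simeq {}^*f({}^*\varphi(\Delta x))$ by the continuity of $f$ at $c$. The result follows by transfer of (12.1). □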
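The shrinking gap between upper and lower sums can be observed for finite $\Delta x$. The sketch below is an illustration under a simplifying assumption that is not part of the text: the integrand is increasing, so on each subinterval the maximum sits at the right end point and the minimum at the left end point. The names `partition`, `upper_lower_sums`, and `square` are hypothetical. The partition is the one determined by $\Delta x$ in the sense of Definition 12.1, with a possibly shorter last subinterval.

```python
from fractions import Fraction


def partition(a, b, dx):
    """Partition of [a, b] determined by dx: x_k = a + k*dx while x_k < b, then b."""
    points = []
    x = a
    while x < b:
        points.append(x)
        x += dx
    points.append(b)
    return points


def upper_lower_sums(f, a, b, dx):
    """Upper and lower Riemann sums for an increasing f (assumption):
    the max on each subinterval is at its right end, the min at its left end."""
    pts = partition(a, b, dx)
    pairs = list(zip(pts, pts[1:]))
    upper = sum(f(right) * (right - left) for left, right in pairs)
    lower = sum(f(left) * (right - left) for left, right in pairs)
    return upper, lower


def square(t):
    # Hypothetical sample integrand, increasing on [0, 1].
    return t * t


for dx in (Fraction(1, 4), Fraction(1, 100), Fraction(3, 1000)):
    up, lo = upper_lower_sums(square, Fraction(0), Fraction(1), dx)
    print(dx, float(up), float(lo), float(up - lo))
```

For an increasing integrand the gap is at most $(f(b) - f(a))\,\Delta x$, so it tends to 0 with $\Delta x$; this is the standard counterpart of the two sums being infinitely close when $\Delta x$ is infinitesimal.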

12.4 Corollary Let $f$ be continuous on $[a, b]$. Then $f$ is Riemann integrable and $\int_a^b f(x)\,dx \simeq {}^*S_a^b(f, \Delta x)$ for any positive infinitesimal $\Delta x$.

From Corollary 12.4 it follows that $\int_a^b f(x)\,dx = \lim_{\Delta x \to 0^+} S_a^b(f, \Delta x)$. In the following we will write $S_a^b(f, \Delta x)$ and $^*S_a^b(f, \Delta x)$ as $\sum_a^b f(x)\,\Delta x$ and $\sum_a^b {}^*f(x)\,\Delta x$, respectively. By convention we set $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$ and $\int_a^a f(x)\,dx = 0$.

12.5 Theorem Let $f$ and $g$ be continuous on $[a, b]$. Then

(i) $\int_a^b cf(x)\,dx = c \int_a^b f(x)\,dx$ for $c \in R$,
(ii) $\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$,
(iii) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$ if $a \le c \le b$,
(iv) if $f(x) \le g(x)$ on $[a, b]$ then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$,
(v) if $m \le f(x) \le M$ on $[a, b]$ then $m(b - a) \le \int_a^b f(x)\,dx \le M(b - a)$.

Proof: We prove (iii) and (iv) and leave the remaining proofs to the reader.

(iii) For each natural number $n$, if $\Delta x = (c - a)/n > 0$ then $\sum_a^c f(x)\,\Delta x + \sum_c^b f(x)\,\Delta x = \sum_a^b f(x)\,\Delta x$. The result follows by taking standard parts of the terms in the transferred equality when $n \in {}^*N_\infty$.

(iv) For each standard $\Delta x > 0$, $\sum_a^b f(x)\,\Delta x \le \sum_a^b g(x)\,\Delta x$. Thus by transfer $\sum_a^b {}^*f(x)\,\Delta x \le \sum_a^b {}^*g(x)\,\Delta x$, where $\Delta x > 0$ is infinitesimal. The result follows from Theorem 6.7(iv). □

12.6 Theorem If $f$ is continuous on $[a, b]$, then the function $F(x) = \int_a^x f(t)\,dt$, defined for $x \in [a, b]$, is differentiable. Moreover, $F'(c) = f(c)$ for each $c \in [a, b]$, where $F'(c)$ is the right- or left-hand derivative if $c = a$ or $b$.

Proof: Fix $c \in [a, b)$. For any standard $h \in (0, b - c)$ we have, using 12.5(iii) and (v), that $f(x_1)h \le F(c + h) - F(c) \le f(x_2)h$, where $f$ has a minimum and maximum on $[c, c + h]$ at $x_1$ and $x_2$, respectively. Thus there are Skolem functions $\psi, \varphi: (0, b - c) \to [c, b)$ with $\psi(h), \varphi(h) \in [c, c + h]$ so that $f(\psi(h))h \le F(c + h) - F(c) \le f(\varphi(h))h$ for all $h \in (0, b - c)$. By transfer, $^*f({}^*\psi(h))h \le {}^*F(c + h) - F(c)$
0 and the corresponding partition $P$, let $\varphi$ and $\psi$ be Skolem functions such that, for $1 \le i \le n - 1$, $a + (i - 1)\,\Delta x \le \varphi(i, \Delta x) \le a + i\,\Delta x$ and $a + (i - 1)\,\Delta x \le \psi(i, \Delta x) \le a + i\,\Delta x$ while $a + (n - 1)\,\Delta x \le \varphi(n, \Delta x) \le b$ and $a + (n - 1)\,\Delta x \le \psi(n, \Delta x) \le b$. Let $S(\Delta x) = \sum_{i=1}^{n} f(\varphi(i, \Delta x))\,g(\psi(i, \Delta x))\,\Delta x$. Show that $\lim_{\Delta x \to 0} S(\Delta x) = \int_a^b f(x)g(x)\,dx$.

1.13 Sequences of Functions

A sequence of functions on $A \subset R$ is a map $f: N \times A \to R$. As usual we denote $f(n, x)$ by $f_n(x)$ ($n \in N$, $x \in A$). We will use nonstandard analysis to study the convergence of such sequences.

13.1 Proposition The sequence $\langle f_n \rangle$, $f_n: A \to R$, $n \in N$, converges pointwise to the function $f: A \to R$ iff $^*f_n(x) \simeq f(x)$ for all $x \in A$ and all infinite $n \in {}^*N$.

Proof: The sequence $\langle f_n \rangle$ converges pointwise iff for each fixed $x \in A$ the sequence $\langle f_n(x) \rangle$ converges to $f(x)$. The result then follows from 8.1. □

13.2 Proposition The sequence $\langle f_n \rangle$, $f_n: A \to R$, converges uniformly to the function $f: A \to R$ iff $^*f_n(x) \simeq {}^*f(x)$ for all $x \in {}^*A$ and all infinite $n \in {}^*N$.


Proof: Recall that $\langle f_n \rangle$ converges uniformly to $f$ iff, given $\varepsilon > 0$ in $R$, there exists a $k \in N$ so that $|f_n(x) - f(x)| < \varepsilon$ for all $x \in A$ if $n \ge k$. Suppose then that $\langle f_n \rangle$ converges uniformly to $f$ and find the $k$ corresponding to a specified $\varepsilon > 0$. Then the sentence

(13.1) $(\forall n)(\forall x)[\underline{N}\langle n\rangle \wedge \underline{A}\langle x\rangle \wedge n \ge k \to |f_n(x) - f(x)| < \varepsilon]$

is true in $\mathcal{R}$. By transfer, $|{}^*f_n(x) - {}^*f(x)| < \varepsilon$ for all $n \in {}^*N$, $n \ge k$, and all $x \in {}^*A$. In particular, this is true for all infinite $n$, no matter what $\varepsilon > 0$ we choose. Hence $^*f_n(x) \simeq {}^*f(x)$ for all infinite $n \in {}^*N$ and all $x \in {}^*A$. The converse is left to the reader. □

13.3 Dini's Theorem Suppose that the sequence $\langle f_n \rangle$ of continuous functions on the compact set $A \subset R$ is monotone [i.e., $f_n(x) \le f_m(x)$ or $f_n(x) \ge f_m(x)$ for all $n \ge m$, $x \in A$] and converges pointwise to the continuous function $f$. Then the convergence is uniform.

Proof: We may suppose that $f(x) = 0$, $x \in A$ (simply by considering the sequence $f_n - f$), and that $f_n$ decreases (otherwise consider $-f_n$). By transfer we see that $^*f_n(x) \le {}^*f_m(x)$ for all $n \ge m$ in $^*N$ and all $x \in {}^*A$. Fix $x \in {}^*A$. Since $A$ is compact there is a $y \in A$, $y \simeq x$. Then, for each $n \in {}^*N_\infty$ and standard $m$, $0 \le {}^*f_n(x) \le {}^*f_m(x) \simeq f_m(y)$, and since $\lim_{m \to \infty} f_m(y) = 0$ it follows that $^*f_n(x) \simeq 0$. □

13.4 Theorem If $\langle f_n \rangle$ converges uniformly to $f$ on $A$, $a \in R$ is a limit point of $A$, and $\lim_{x \to a} f_n(x) = s_n$ exists for all $n \in N$, then $\langle s_n \rangle$ converges and $\lim_{x \to a} f(x) = \lim_{n \to \infty} s_n$.

Proof: Let $\varepsilon > 0$ in $R$ be specified. Then there is a $k \in N$ so that $|f_n(x) - f(x)| < \varepsilon/4$, and hence $|f_n(x) - f_m(x)| < \varepsilon/2$, for all $x \in A$ and all $n, m \ge k$ by uniform convergence of $\langle f_n \rangle$ to $f$ on $A$. By transfer as in 13.2, $|{}^*f_n(x) - {}^*f(x)| < \varepsilon/4$ and $|{}^*f_n(x) - {}^*f_m(x)| < \varepsilon/2$ for all $n, m \ge k$ and all $x \in {}^*A$. Since $s_n \simeq {}^*f_n(x)$ if $x \simeq a$, $x \in {}^*A$, we have $|s_n - s_m| \simeq |{}^*f_n(x) - {}^*f_m(x)| < \varepsilon$ if $n, m \ge k$, and so $\langle s_n \rangle$ is a Cauchy sequence and converges, say, to $L$. It follows (letting $x \simeq a$ and $n \ge k$) that

$|{}^*f(x) - L| \le |{}^*f(x) - {}^*f_n(x)| + |{}^*f_n(x) - s_n| + |s_n - L| \le \varepsilon/4 + \text{infinitesimal} + 2\varepsilon < 3\varepsilon,$

and hence $^*f(x) \simeq L$. □


I. Infinitesimals and the Calculus

13.5 Corollary If the functions $f_n$ are continuous on $A$ and $\langle f_n \rangle$ converges uniformly to $f$ on $A$, then $f$ is continuous on $A$.

We end this section with a proof of the Arzelà-Ascoli theorem, a result which has many important applications in analysis. The theorem asserts that from a uniformly bounded, equicontinuous sequence $\langle f_n \rangle$ of functions on a closed bounded interval $[a, b] \subset R$ it is possible to select a subsequence which converges uniformly on $[a, b]$ to a continuous function $f$. That the result is not true for an arbitrary sequence of continuous functions is shown by the sequence in which $f_n(x) = x^n$ on $[0, 1]$. Here $\langle f_n \rangle$ actually converges pointwise (but not uniformly) to the discontinuous function

$f(x) = \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases}$
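The failure of uniform convergence for $f_n(x) = x^n$ can be seen with finite $n$: for every $n$ there is a point of $[0, 1)$ at which $f_n$ takes the value $1/2$, even though the pointwise limit there is 0. The following is a minimal sketch; the helper name `sup_witness` is an assumption made for illustration, and the computation is in floating point, so the equalities hold only up to rounding.

```python
def sup_witness(n):
    """Return a point x in [0, 1) with x**n = 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)


for n in (1, 10, 1000, 10**6):
    x = sup_witness(n)
    print(n, x, x ** n)  # x creeps toward 1 while f_n(x) stays near 1/2
```

Since $|f_n(x) - f(x)| = 1/2$ at these points for every $n$, the supremum of $|f_n - f|$ over $[0, 1]$ never drops below $1/2$.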

13.6 Definition The sequence $\langle f_n\rangle$ of functions on $[a, b]$ is uniformly bounded if there exists an $M$ so that $|f_n(x)| \le M$ for all $x \in [a, b]$ and all $n \in N$. The sequence $\langle f_n\rangle$ of functions on $[a, b]$ is equicontinuous if, given $\varepsilon > 0$, there is a $\delta > 0$ (independent of $x$, $y$, and $n$) so that $|f_n(x) - f_n(y)| < \varepsilon$ for all $n \in N$ and all $x, y \in [a, b]$ such that $|x - y| < \delta$. (Each $f_n$, then, is uniformly continuous on $[a, b]$.)

13.7 Arzelà-Ascoli Theorem If $\langle f_n\rangle$ is a uniformly bounded and equicontinuous sequence of functions on the closed and bounded interval $[a, b]$, then there is a subsequence $\langle f_{n_k}\rangle$ which converges uniformly to a continuous function $f$ on $[a, b]$.

Proof: Let $\varepsilon > 0$ be given and find the corresponding $\delta > 0$ from the equicontinuity of the sequence. Then the sentence

$$(\forall n)(\forall x)(\forall y)\bigl[n \in N \wedge x \in [a, b] \wedge y \in [a, b] \wedge |x - y| < \delta \rightarrow |f_n(x) - f_n(y)| < \varepsilon\bigr]$$

is true in $\mathcal{R}$. By transfer, $|{}^*f_n(x) - {}^*f_n(y)| < \varepsilon$ for all $n \in {}^*N$ and all $x, y \in {}^*[a, b]$ with $|x - y| < \delta$. Since $\varepsilon > 0$ is arbitrary, we see that ${}^*f_n(x) \simeq {}^*f_n(y)$ for any $n \in {}^*N$ as long as $x \simeq y$.

Now let $n = \omega$ be a fixed infinite natural number. By an argument similar to that of the first paragraph we see that $|{}^*f_\omega(x)| \le M$ for any $x \in {}^*[a, b]$, so that ${}^*f_\omega(x)$ is near-standard for $x \in [a, b]$. Define $f(x) = {}^\circ({}^*f_\omega(x))$, $x \in [a, b]$. We claim that $f$ is uniformly continuous. For let $\varepsilon > 0$ be given and find the $\delta > 0$ corresponding to $\varepsilon/2$ from equicontinuity. Then if $x, y \in [a, b]$


and $|x - y| < \delta$, we have

$$|f(x) - f(y)| \le |f(x) - {}^*f_\omega(x)| + |{}^*f_\omega(x) - {}^*f_\omega(y)| + |{}^*f_\omega(y) - f(y)|.$$

The first and last terms on the right are infinitesimal by definition of $f$, and the middle term is $< \varepsilon/2$ by the argument of the first paragraph, and so $|f(x) - f(y)| < \varepsilon$.

Finally we show that a subsequence of $\langle f_n\rangle$ converges uniformly to $f$ on $[a, b]$. To do this it suffices to show that for all $\varepsilon > 0$ and all $n \in N$ there is an $m > n$ so that $|f_m(x) - f(x)| < \varepsilon$ for all $x \in [a, b]$ (why?). Suppose this statement is not true. Then there exists an $\varepsilon_0 > 0$ and an $n_0 \in N$ so that for each $m > n_0$ we can find an $x \in [a, b]$ with $|f_m(x) - f(x)| \ge \varepsilon_0$. Thus there exists a Skolem function $\psi: \{n_0, n_0 + 1, \ldots\} \to [a, b]$ so that the statement

$$(\forall m)\bigl[m \in N \wedge m \ge n_0 \rightarrow |f_m(\psi(m)) - f(\psi(m))| \ge \varepsilon_0\bigr]$$

is true. By transfer, given $\omega \in {}^*N_\infty$, we have $\omega \ge n_0$, and so there exists an $x \in {}^*[a, b]$ [equal to ${}^*\psi(\omega)$] such that $|{}^*f_\omega(x) - {}^*f(x)| \ge \varepsilon_0$. But by compactness of $[a, b]$ and Robinson's theorem, this $x$ is infinitesimally close to a $y \in [a, b]$, and so

$$|{}^*f_\omega(x) - {}^*f(x)| \le |{}^*f_\omega(x) - {}^*f_\omega(y)| + |{}^*f_\omega(y) - f(y)| + |f(y) - {}^*f(x)|.$$

Each term on the right is infinitesimal, the last by the continuity of $f$. This contradiction proves the theorem. □

A general form of the Arzelà-Ascoli theorem will be given in §III.8 of Chapter III.

Exercises I.13

1. Finish the proof of Proposition 13.2.
2. Prove Corollary 13.5 directly from Proposition 13.2.
3. Give an example to show that an equicontinuous sequence need not be uniformly bounded.
4. Let $\langle f_n\rangle$ be a sequence on $[a, b]$. Show that $\langle f_n\rangle$ is an equicontinuous sequence if and only if for any $n \in {}^*N$ and any pair $x, y \in {}^*[a, b]$ with $x \simeq y$ we have ${}^*f_n(x) \simeq {}^*f_n(y)$. (Hint: For the necessity see the proof of Theorem 13.7.)
5. Let $\langle f_n\rangle$ be a sequence of continuous functions on $[a, b]$ which converges uniformly to $f$. Show that $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.

I.14 Two Applications to Differential Equations

As our first application we prove the Cauchy-Peano existence theorem for ordinary differential equations. A nonstandard proof was first presented by A. Robinson [a].


14.1 Cauchy-Peano Existence Theorem Let $f$ be continuous and satisfy $|f(x, y)| \le M$ on the rectangle $B = \{(x, y) \in R^2 : |x - x_0| \le a,\ |y - y_0| \le b\}$. Then there exists a function $\phi$ with continuous first derivative, defined on the closed interval $I = \{x \in R : |x - x_0| \le c\}$, where $c = \min(a, bM^{-1})$, and satisfying $\phi(x_0) = y_0$ and $\phi'(x) = f(x, \phi(x))$ for $x \in I$.

Proof: We begin, as in [40], by constructing a family of polygonal approximations. It suffices to construct a solution on $[x_0, x_0 + c]$. Divide $[x_0, x_0 + c]$ into $n$ equal parts by the points $x_k = x_0 + kc/n$, $0 \le k \le n$, and define $\phi_n$ by the equations

$$\phi_n(x_0) = y_0, \qquad \phi_n(x) = \phi_n(x_k) + f(x_k, \phi_n(x_k))(x - x_k) \tag{14.1}$$

for $x_k < x \le x_{k+1}$, $0 \le k \le n - 1$. For any $n \in N$, the graph of $\phi_n$ lies in $B$ since $|f(x, y)| \le M$. Moreover, $|\phi_n(x) - \phi_n(x')| \le M|x - x'|$ for any $x, x' \in [x_0, x_0 + c]$. Thus the following statement is true in $\mathcal{R}$:

(14.2) For all $n \in N$, $x, x' \in [x_0, x_0 + c]$, we have $|\phi_n(x) - y_0| \le b$ and $|\phi_n(x) - \phi_n(x')| \le M|x - x'|$.

By transfer, for all $n \in {}^*N$ and $x, x' \in {}^*[x_0, x_0 + c]$,

$$|{}^*\phi_n(x) - y_0| \le b \tag{14.3}$$

and

$$|{}^*\phi_n(x) - {}^*\phi_n(x')| \le M|x - x'|. \tag{14.4}$$

We now let $n = \omega \in {}^*N_\infty$, and note that ${}^*\phi_\omega(x)$ is finite for all $x \in {}^*[x_0, x_0 + c]$ by (14.3). We may therefore define the standard function $\phi$ on $[x_0, x_0 + c]$ by $\phi(x) = \mathrm{st}({}^*\phi_\omega(x))$. Now $\phi$ is continuous since, for standard $x$ and $x'$ in $[x_0, x_0 + c]$, $|\phi(x) - \phi(x')| \simeq |{}^*\phi_\omega(x) - {}^*\phi_\omega(x')| \le M|x - x'|$ by (14.4). Therefore

$${}^*\phi(y) \simeq \phi(\mathrm{st}(y)) \simeq {}^*\phi_\omega(\mathrm{st}(y)) \simeq {}^*\phi_\omega(y) \tag{14.5}$$

if $y \in {}^*[x_0, x_0 + c]$. Since $f$ is continuous and hence uniformly continuous on the closed bounded set $B$,

$${}^*f(x, {}^*\phi(x)) \simeq {}^*f(x, {}^*\phi_\omega(x)) \tag{14.6}$$

for all $x \in {}^*[x_0, x_0 + c]$ (exercise). Now if $x \in [x_0, x_0 + c]$, then $x_k < x \le x_{k+1}$ for some $k \in {}^*N$, $0 \le k \le \omega - 1$, whence $x_k \simeq x$ and so

$$\phi(x) \simeq {}^*\phi_\omega(x_k) = y_0 + \sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi_\omega(x_i))(x_{i+1} - x_i)$$

by transfer from (14.1). Thus

$$\phi(x) \simeq y_0 + \sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi(x_i))(x_{i+1} - x_i)$$

since

$$\Bigl|\sum_{i=0}^{k-1} \bigl[{}^*f(x_i, {}^*\phi_\omega(x_i)) - {}^*f(x_i, {}^*\phi(x_i))\bigr](x_{i+1} - x_i)\Bigr| \le c \max_{0 \le i \le k-1} |{}^*f(x_i, {}^*\phi_\omega(x_i)) - {}^*f(x_i, {}^*\phi(x_i))|,$$

which is infinitesimal by (14.6) (where have we used the transfer principle in this argument?). Since

$$\sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi(x_i))(x_{i+1} - x_i) = \sum_{i=0}^{k-1} {}^*f(x_i, {}^*\phi(x_i))\,\Delta x,$$

where $\Delta x = c/\omega$, it follows from Corollary 12.4 that

$$\phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt.$$

Therefore $\phi$ has a continuous derivative and $\phi'(x) = f(x, \phi(x))$. □
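The polygonal approximations $\phi_n$ of (14.1) are easy to compute for standard $n$. The following is a minimal sketch under stated assumptions: the function name `euler_polygon` and the test problem $y' = -y$, $y(0) = 1$ on the rectangle with $a = b = 1$ (so that $M = 2$ and $c = 1/2$) are our own illustrative choices, not taken from the text. It only computes the values at the nodes $x_k$; the nonstandard argument itself (taking $n = \omega$ infinite and standard parts) is of course not something a program can carry out.

```python
import math

def euler_polygon(f, x0, y0, c, n):
    """Node values of the polygonal approximation phi_n of (14.1) on [x0, x0 + c]."""
    h = c / n
    xs = [x0 + k * h for k in range(n + 1)]
    ys = [y0]
    for k in range(n):
        # phi_n(x_{k+1}) = phi_n(x_k) + f(x_k, phi_n(x_k)) * (x_{k+1} - x_k)
        ys.append(ys[k] + f(xs[k], ys[k]) * (xs[k + 1] - xs[k]))
    return xs, ys

# Assumed test problem: f(x, y) = -y, y(0) = 1, with a = b = 1, M = 2, c = 1/2.
M, c = 2.0, 0.5
xs, ys = euler_polygon(lambda x, y: -y, 0.0, 1.0, c, 1000)

# Compare the last node value with the exact solution exp(-x) at x = c.
print(ys[-1], math.exp(-c))
```

For this test problem the node values satisfy the two bounds recorded in (14.2), namely $|\phi_n(x) - y_0| \le b$ and $|\phi_n(x) - \phi_n(x')| \le M|x - x'|$, which are the statements transferred in the proof.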

The standard proof of this result [S] uses the Arzelà-Ascoli theorem, 13.7. The reader is referred to any standard text on differential equations for a discussion of a (Lipschitz) condition that ensures uniqueness of the solution.

Lastly we use nonstandard techniques to derive the wave equation for a vibrating string. We assume that the magnitude of the tension $T$ and the density $\rho$ of the string are constant along the string. Given an infinitesimal segment of length $\Delta s$ from $P_1$ to $P_2$ on the string as shown in the figure, the vertical force on the segment is $T({}^*\sin\theta_2 - {}^*\sin\theta_1)$ and its mass is $\rho\,\Delta s$. If the nearest standard point is $x$, and we are considering the vertical position $y$ as a twice continuously differentiable function of $x$ and $t$, then by Newton's law,

$$T({}^*\sin\theta_2 - {}^*\sin\theta_1) = \rho\,\Delta s\,(y_{tt}(x, t) + \varepsilon),$$

where $\varepsilon \simeq 0$, so that

$$\frac{T}{\rho}\,\frac{{}^*\sin\theta_2 - {}^*\sin\theta_1}{\Delta s} \simeq y_{tt}(x, t).$$

We want to show that $({}^*\sin\theta_2 - {}^*\sin\theta_1)/\Delta s \simeq y_{xx}(x, t)$. Often in deriving the wave equation the assumption that $\Delta y$ is uniformly small is made, and then the expression $(\sin\theta_2 - \sin\theta_1)/\Delta s$ is replaced by $(\tan\theta_2 - \tan\theta_1)/\Delta x$, where $\Delta x^2 + \Delta y^2 = \Delta s^2$. Since $\Delta s$ is infinitesimal, as is ${}^*\sin\theta_2 - {}^*\sin\theta_1$, it is not clear why this replacement is justified. Let us instead fix $t$ and consider small changes $\Delta x_1$ and $\Delta x_2$ at $P_1$ and $P_2$ resulting in changes $\Delta y_i$, $\Delta s_i$, $i = 1, 2$, respectively. We may take $\Delta x_i$, $i = 1, 2$, so small that

$$\frac{\Delta x_i}{\Delta s_i} = {}^*x_s(P_i, t) + \varepsilon_i, \qquad \frac{\Delta y_i}{\Delta s_i} = {}^*\sin\theta_i + \varepsilon_{i+2}, \qquad \frac{\Delta y_i}{\Delta x_i} = {}^*y_x(P_i, t) + \varepsilon_{i+4}$$

for $i = 1, 2$, where $\varepsilon_j/\Delta s \simeq 0$ for $1 \le j \le 6$. Then, omitting the $*$, we have

$$\frac{\sin\theta_2 - \sin\theta_1}{\Delta s} \simeq \frac{\dfrac{\Delta y_2}{\Delta x_2}\dfrac{\Delta x_2}{\Delta s_2} - \dfrac{\Delta y_1}{\Delta x_1}\dfrac{\Delta x_1}{\Delta s_1}}{\Delta s} \simeq \frac{y_x(P_2, t)\,x_s(P_2, t) - y_x(P_1, t)\,x_s(P_1, t)}{\Delta s}.$$

Since we are assuming that $y(x, t)$ is twice continuously differentiable, it follows as in Proposition 11.2 that

$$\frac{\sin\theta_2 - \sin\theta_1}{\Delta s} \simeq \frac{\partial}{\partial s}\bigl[y_x x_s\bigr] = y_{xx}(x, t)[x_s(x, t)]^2 + x_{ss}(x, t)\,y_x(x, t).$$

The wave equation now results if $y$ is “uniformly small” in the sense that $\partial x/\partial s$ can be taken to be 1 and $\partial^2 x/\partial s^2$ can be taken to be 0.

There are many potential applications of nonstandard analysis to differential equations. For example, for applications to singular perturbations, the reader is referred to the work of the Strasbourg group ([6] and the papers referenced there).
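Returning to the vibrating string, the final step of the derivation can be written out explicitly. The following is only a sketch under the "uniformly small" assumption stated above ($x_s = 1$, $x_{ss} = 0$); the last line uses the fact that two standard numbers which are infinitesimally close must be equal.

```latex
\begin{align*}
\frac{T}{\rho}\,\frac{{}^*\sin\theta_2 - {}^*\sin\theta_1}{\Delta s}
  &\simeq y_{tt}(x,t)
  && \text{(Newton's law)} \\
\frac{{}^*\sin\theta_2 - {}^*\sin\theta_1}{\Delta s}
  &\simeq y_{xx}(x,t)\,[x_s(x,t)]^2 + x_{ss}(x,t)\,y_x(x,t)
   = y_{xx}(x,t)
  && \text{(with } x_s = 1,\ x_{ss} = 0\text{)} \\
y_{tt}(x,t) &= \frac{T}{\rho}\,y_{xx}(x,t)
  && \text{(both sides are standard)}
\end{align*}
```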


Exercises I.14

1. In the proof of Theorem 14.1, Eq. (14.6) states that ${}^*f(x, {}^*\phi(x)) \simeq {}^*f(x, {}^*\phi_\omega(x))$ for all $x \in {}^*[x_0, x_0 + c]$. Show that this is correct.
2. Fill in the details in Theorem 14.1 on the use of the transfer principle to show that $\phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt$.
3. Show that Theorem 14.1 goes through if we replace $f(x_k, \phi_n(x_k))$ in (14.1) by $M_{n,k}$ where $\min_{(x,y)\in A} f(x, y) \le M_{n,k} \le \max_{(x,y)\in A} f(x, y)$ and [definition of the set $A$ not recoverable].
4. The conditions in Theorem 14.1 do not guarantee a unique solution $\phi$ to $\phi' = f(x, y)$. Use infinitesimal partitions and the idea in Exercise 3 to obtain the solutions $\phi(x) = 0$ and $\phi(x) = x^3$ to the equation $\phi' = 3\phi^{2/3}$, $\phi(0) = 0$.
5. Generalize Theorem 14.1 to the vector situation. Let $x$ denote a point in $R$ and $\mathbf{y}$ denote a point in $R^n$. Let $\mathbf{f}$ be defined on $\{(x, \mathbf{y}) \in R \times R^n : |x - x_0| \le a,\ \|\mathbf{y} - \mathbf{y}_0\| \le b\}$ where $x_0 \in R$, $\mathbf{y}_0 \in R^n$, and $\|\cdot\|$ denotes the usual distance in $R^n$. Consider the system $\boldsymbol{\phi}' = \mathbf{f}(x, \boldsymbol{\phi})$, $\boldsymbol{\phi}(x_0) = \mathbf{y}_0$, where $\boldsymbol{\phi}(x) = (\phi_1(x), \ldots, \phi_n(x))$ and $\boldsymbol{\phi}'(x) = (\phi_1'(x), \ldots, \phi_n'(x))$. Find conditions on the vector function $\mathbf{f}$ so that a solution $\boldsymbol{\phi}$ to this system exists in a certain interval about $x_0$.

I.15 Proof of the Transfer Principle

Recall that the only functions and relations that are in ${}^*\mathcal{R}$ are extensions of standard functions and relations. We assume that each constant $c$ in $L_{{}^*\mathcal{R}}$ names an element of ${}^*R$, and if $c$ names an element of $R$ then $c$ is in $L_{\mathcal{R}}$. Recall Definitions 3.8, 4.1, and 5.2, which give the following inductive definition of a constant term which is interpretable in ${}^*\mathcal{R}$:

(i) A constant $c$ in $L_{{}^*\mathcal{R}}$ is interpretable in ${}^*\mathcal{R}$ and is interpreted as the element it names.
(ii) If $\underline{f}$ is the name of a function $f$ of $n$ variables on $R$ and $\tau^1, \ldots, \tau^n$ are constant terms interpretable in ${}^*\mathcal{R}$ as $r^1, \ldots, r^n$, respectively, and if the $n$-tuple $\langle r^1, \ldots, r^n\rangle$ is in the domain of the nonstandard extension ${}^*f$ of $f$, then ${}^*\underline{f}(\tau^1, \ldots, \tau^n)$ is a constant term interpretable in ${}^*\mathcal{R}$ as ${}^*f(r^1, \ldots, r^n)$.

We now want to associate with each constant term in $L_{{}^*\mathcal{R}}$ a fixed sequence of constant terms in $L_{\mathcal{R}}$. We will denote the sequence for a constant term $\tau$ by $\langle T_\tau(n)\rangle$ or just $T_\tau$. A sequence $T_\tau$ is defined for all terms $\tau$, interpretable or not, by the following inductive definition:

(α) For each $r \in {}^*R$ we choose a definite sequence $\langle r_n\rangle$ from $R$ so that $r = [\langle r_n\rangle]$. If $r \in R$, we choose $r_n = r$ for all $n$. If $c$ is a constant in $L_{{}^*\mathcal{R}}$ that names $r$, we set $T_c(n) = \underline{r}_n$, where $\underline{r}_n$ is a name in $L_{\mathcal{R}}$ of $r_n \in R$ for all $n \in N$. If $r \in R$, we set $T_c(n) = c$ for all $n$.
(β) If $\tau = {}^*\underline{f}(\tau^1, \ldots, \tau^k)$ where $\underline{f}$ is a name of the function $f$ of $k$ variables on $R$ and the $\tau^i$ are constant terms in $L_{{}^*\mathcal{R}}$, $1 \le i \le k$, then $T_\tau(n) = \underline{f}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$.

Conditions (α) and (β) serve to define $T_\tau$ inductively for all constant terms in $L_{{}^*\mathcal{R}}$. We are now able to prove a simple form of a theorem due to Łoś (pronounced “Wash”).

15.1 Theorem

(A) If $\tau$ is a constant term in $L_{{}^*\mathcal{R}}$ and $\langle r_n\rangle$ is a sequence of numbers in $R$, then $\tau$ is interpretable in ${}^*\mathcal{R}$ and names $[\langle r_n\rangle]$ iff $T_\tau(n)$ is almost everywhere (a.e.) interpretable in $\mathcal{R}$ and names $r_n$ a.e. [i.e., for all $n$ in a set $U$ in $\mathcal{U}$, $T_\tau(n)$ is interpretable and names $r_n$].
(B) If $\tau^1, \ldots, \tau^k$ are constant terms in $L_{{}^*\mathcal{R}}$ and ${}^*\underline{P}(\tau^1, \ldots, \tau^k)$ is an atomic sentence in $L_{{}^*\mathcal{R}}$, then ${}^*\underline{P}(\tau^1, \ldots, \tau^k)$ holds in ${}^*\mathcal{R}$ iff $\underline{P}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$ holds a.e. in $\mathcal{R}$.

Proof: (A) The proof is by induction on the complexity of the terms (as defined by 3.8 and 5.2).

(i) If $\tau = c$ where $c$ is a constant naming an element of ${}^*R$, then $c$ names $[\langle r_n\rangle]$ iff $T_c(n)$ a.e. names $r_n$ by definition of $T_c$ in (α).
(ii) Let $\tau = {}^*\underline{f}(\tau^1, \ldots, \tau^k)$, where $\underline{f}$ is a name of the function $f$ of $k$ variables and $\tau^1, \ldots, \tau^k$ are constant terms for which (A) is true; i.e., given $j$, $1 \le j \le k$, and a sequence $\langle r^j_n\rangle$, $r^j_n \in R$, $\tau^j$ is interpretable in ${}^*\mathcal{R}$ and names $[\langle r^j_n\rangle]$ iff $T_{\tau^j}(n)$ is a.e. interpretable and names $r^j_n$ a.e. Let $\langle s_n\rangle$ be a sequence in $R$. Then the following statements are equivalent:

(a) The term $\tau = {}^*\underline{f}(\tau^1, \ldots, \tau^k)$ is interpretable in ${}^*\mathcal{R}$ and names $[\langle s_n\rangle]$.
(b) There exist elements $[\langle r^1_n\rangle], \ldots, [\langle r^k_n\rangle]$ in ${}^*R$ such that, for $1 \le j \le k$, $\tau^j$ is interpretable as $[\langle r^j_n\rangle]$, the $k$-tuple $\langle[\langle r^1_n\rangle], \ldots, [\langle r^k_n\rangle]\rangle$ is in the domain of ${}^*f$, and ${}^*f([\langle r^1_n\rangle], \ldots, [\langle r^k_n\rangle]) = [\langle s_n\rangle]$.
(c) There exist sequences $\langle r^1_n\rangle, \ldots, \langle r^k_n\rangle$ in $R$ and a set $U \in \mathcal{U}$ such that, for each $m \in U$, if $1 \le j \le k$ then $T_{\tau^j}(m)$ is interpretable as $r^j_m$, the $k$-tuple $\langle r^1_m, \ldots, r^k_m\rangle$ is in the domain of $f$, and $f(r^1_m, \ldots, r^k_m) = s_m$.
(d) $\underline{f}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$ is a.e. interpretable as $s_n$ in $\mathcal{R}$.
(e) $T_\tau(n)$ is a.e. interpretable in $\mathcal{R}$ as $s_n$.

Thus (A) is true by induction.

(B) To prove (B), let $\underline{P}$ be a name for the $k$-ary relation $P$ on $R$, and let $\tau^1, \ldots, \tau^k$ be constant terms in $L_{{}^*\mathcal{R}}$. Then the following are equivalent statements:

(a) ${}^*\underline{P}(\tau^1, \ldots, \tau^k)$ holds in ${}^*\mathcal{R}$.
(b) There are elements $[\langle r^1_n\rangle], \ldots, [\langle r^k_n\rangle]$ in ${}^*R$ such that $\tau^j$ is interpretable as $[\langle r^j_n\rangle]$, $1 \le j \le k$, and the $k$-tuple $\langle[\langle r^1_n\rangle], \ldots, [\langle r^k_n\rangle]\rangle$ is in ${}^*P$.
(c) There are sequences $\langle r^1_n\rangle, \ldots, \langle r^k_n\rangle$ in $R$ and a set $U \in \mathcal{U}$ such that, for each $m \in U$, $T_{\tau^j}(m)$ is interpretable as $r^j_m$ for $1 \le j \le k$ and the $k$-tuple $\langle r^1_m, \ldots, r^k_m\rangle$ is in $P$.
(d) $\underline{P}(T_{\tau^1}(n), \ldots, T_{\tau^k}(n))$ holds a.e. in $\mathcal{R}$.

This establishes (B). □

We are now in a position to prove the transfer principle. If $\Phi$ is an atomic sentence which holds in $\mathcal{R}$ then 15.1 shows immediately that ${}^*\Phi$ holds in ${}^*\mathcal{R}$. Suppose that $\Phi$ is of the form

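$$(\forall x_1)\cdots(\forall x_n)\Bigl[\bigwedge_{i=1}^{k} \underline{P}_i(\tau^i_1, \ldots, \tau^i_{m_i}) \rightarrow \bigwedge_{j=1}^{l} \underline{Q}_j(\sigma^j_1, \ldots, \sigma^j_{p_j})\Bigr]$$

and $\Phi$ holds in $\mathcal{R}$. Let ${}^*\tau^i_j$ and ${}^*\sigma^i_j$ be the *-transforms of $\tau^i_j$ and $\sigma^i_j$ and replace the variables $x_1, \ldots, x_n$ in ${}^*\tau^i_j$ and ${}^*\sigma^i_j$ with constant symbols $c_1, \ldots, c_n$ from $L_{{}^*\mathcal{R}}$. Assume that with this replacement ${}^*\underline{P}_i({}^*\tau^i_1, \ldots, {}^*\tau^i_{m_i})$ holds in ${}^*\mathcal{R}$ for each $i$, $1 \le i \le k$. Using 15.1, we see that there is a set $U \in \mathcal{U}$ such that if $n \in U$ then $\underline{P}_i(T_{\tau^i_1}(n), \ldots, T_{\tau^i_{m_i}}(n))$ holds in $\mathcal{R}$ for each $i$, $1 \le i \le k$. But then, since $\Phi$ holds in $\mathcal{R}$, $\underline{Q}_j(T_{\sigma^j_1}(n), \ldots, T_{\sigma^j_{p_j}}(n))$ holds in $\mathcal{R}$ for $1 \le j \le l$ and $n \in U$. By 15.1 again, ${}^*\underline{Q}_j({}^*\sigma^j_1, \ldots, {}^*\sigma^j_{p_j})$ holds in ${}^*\mathcal{R}$ for each $j$, $1 \le j \le l$, and we are through.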
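The inductive clauses (α) and (β) defining $T_\tau$ amount to evaluating a term coordinate by coordinate along the chosen representing sequences. The following is a minimal sketch of that idea only: the constructors `const`, `standard`, `app` and the evaluator `T` are hypothetical names of our own, terms are modeled as nested Python tuples, and nothing here represents the ultrafilter $\mathcal{U}$ or the notion of "almost everywhere", which cannot be computed.

```python
import operator

def const(seq):
    """A constant of the nonstandard language, given by its chosen sequence n -> r_n."""
    return ("const", seq)

def standard(r):
    """A constant naming a standard real r: the constant sequence r_n = r."""
    return ("const", lambda n: r)

def app(f, *terms):
    """The term *f(tau^1, ..., tau^k) for a standard function f of k variables."""
    return ("app", f, terms)

def T(term, n):
    """The real number named by T_tau(n), following clauses (alpha) and (beta)."""
    if term[0] == "const":
        return term[1](n)                        # clause (alpha)
    _, f, args = term
    return f(*(T(arg, n) for arg in args))       # clause (beta)

# omega is represented by the sequence <n>; tau stands for omega * omega + 1.
omega = const(lambda n: n)
tau = app(operator.add, app(operator.mul, omega, omega), standard(1))
print([T(tau, n) for n in range(1, 5)])
```

In this picture Theorem 15.1(A) says that $\tau$ names the class of a sequence $\langle s_n\rangle$ exactly when the coordinatewise values agree with $s_n$ on a set in the ultrafilter.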

CHAPTER II

Nonstandard Analysis on Superstructures

In order to proceed to analysis more general than the calculus, we will need to consider mathematical systems which contain entities corresponding to sets of sets, sets of functions, and so on. For example, we might want to prove theorems involving the set of open subsets of $R$, or the set of all continuous functions on $R$. Such entities, regarded as objects in themselves, are not contained in any relational system based on $R$. Beginning with a basic set $X$, we can construct a superstructure $V(X)$ which contains all of the entities normally encountered in the mathematics of $X$ by successively taking subsets. This chapter is devoted to nonstandard analysis in this general setting. In particular, we consider mathematical logic for superstructures in §II.2, and the transfer principle in §II.3. The language presented is more general than that of Chapter I, and this will allow us to avoid Skolem functions and proofs by contradiction in applying the transfer principle in the rest of this book. We generalize the ultrapower construction and *-mapping for ${}^*\mathcal{R}$ to superstructures in §II.4, obtaining a superstructure $V({}^*X)$ and a map $*: V(X) \to V({}^*X)$. In §II.5 we show how to choose the ultrafilter in the construction of §II.4 to ensure that $V({}^*X)$ is an enlargement, a notion which is fundamental to nonstandard analysis as developed by Abraham Robinson. The notions of internal and external entities and sentences are developed in §II.6. These notions are important in being able to recognize when a sentence $\Psi$ about $V({}^*X)$ is of the form ${}^*\Phi$ for some sentence $\Phi$ about $V(X)$. We will often use such a corresponding “downward transfer principle” in succeeding chapters. In §II.7 we present the permanence principle, which involves the idea of internal formulas, and is useful in many proofs. Finally in §II.8 we survey the theory of saturated superstructures, a concept which was introduced by W. A. J. Luxemburg [36] and is very important in some of the recent applications of nonstandard analysis.

II.1 Superstructures


In the succeeding chapters of this book, we will need to consider mathematical systems which contain entities corresponding to sets of sets, sets of functions, etc. Such sets, regarded as objects in themselves, are not contained in any relational system based on $R$; there are no names for them in the language of relational systems. More generally, we are led to work with a set $X$ and all of the sets which can be obtained inductively from $X$ in a finite number of steps by successively taking subsets of the preceding set, as indicated in the following definition. The resulting structure is called a superstructure over $X$. We will always assume that $X$ contains the natural numbers $N$ in order later to be able to define ordered $n$-tuples (Definition 1.2).

1.1 Definition Let $X$ be a nonempty set containing at least the natural numbers $N$. The power set $\mathcal{P}(X)$ of $X$ is the set of all subsets of $X$ (including the empty set $\emptyset$). The $n$th cumulative power set $V_n(X)$ of $X$ is defined recursively by

$$V_0(X) = X, \qquad V_{n+1}(X) = V_n(X) \cup \mathcal{P}(V_n(X)).$$

The superstructure over $X$ is the set

$$V(X) = \bigcup_{n=0}^{\infty} V_n(X).$$

The elements of $V(X)$ are called entities, and the entities in $X$ are also called individuals. The entities in $V_n(X) - V_{n-1}(X)$ are of rank $n$.
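The recursion in Definition 1.1 can be carried out literally for a small finite set of individuals. The following is a minimal sketch under a loud assumption: the toy set $X = \{0, 1\}$ does not contain $N$ as the definition requires, and is used only so that the levels are small enough to enumerate. The helper names `power_set` and `cumulative` are ours; sets are modeled by Python `frozenset` values and individuals by integers.

```python
from itertools import combinations

def power_set(s):
    """All subsets of the finite set s, each as a frozenset."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def cumulative(X, n):
    """V_n(X) for a finite set X of individuals, following Definition 1.1."""
    level = frozenset(X)                     # V_0(X) = X
    for _ in range(n):
        level = level | power_set(level)     # V_{k+1}(X) = V_k(X) u P(V_k(X))
    return level

X = {0, 1}  # toy individuals only; the text requires X to contain N
V1, V2 = cumulative(X, 1), cumulative(X, 2)
print(len(V1), len(V2))
```

Here $V_1$ has $2 + 2^2 = 6$ elements, and $V_2$ consists of the 2 individuals together with the $2^6 = 64$ subsets of $V_1$, since the four subsets of $X$ already in $V_1$ reappear among the subsets of $V_1$.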

For example, let $X = N$, the set of natural numbers. Then some entities in $V_1(N)$ are 7, $\{7\}$, and the set $\{2, 4, 6, \ldots\}$ of even numbers. Similarly, some entities in $V_2(N)$ are 7, $\{1, 3, 5, \ldots\}$, and the set of all finite subsets of $N$.

As usual in set theory we use the symbol $\in$ to stand for “is an element of” and $\notin$ to stand for “is not an element of.” Similarly, if $x, y \in V(X)$, we write $x \subseteq y$ if $z \in x$ implies $z \in y$; we write $x = y$ if $x \subseteq y$ and $y \subseteq x$ and write $x \ne y$ otherwise. In particular, we write $x \subset y$ if $x \subseteq y$ but $x \ne y$. Notice that an entity may simultaneously be a subset of, and an element of, another entity; in particular, $V_n(X) \subseteq V_{n+1}(X)$ and $V_n(X) \in V_{n+1}(X)$ for all $n$. We always assume that the individuals have no members; i.e., if $x \in X$ then $x \ne \emptyset$ and the statement $t \in x$ is false.

The choice of the basic set $X$ is always somewhat arbitrary and depends on the context. If, for example, we want to study the real number system, and do not need to consider the manner of construction of each real number (as, for example, an equivalence class of Cauchy sequences of rational numbers), then we may take $X = R$. If, on the other hand, we want to study

the real numbers as equivalence classes of Cauchy sequences of rationals, then we might take $X = Q$ (the rational numbers).

We next show how to describe relations and functions in the set theory of $V(X)$. The basic step is to define an ordered $n$-tuple set-theoretically, and the rest follows as in Definition I.2.1. We start with the definition of an ordered pair and make a distinction between ordered pairs and two-tuples.

1.2 Definition An ordered pair $(a, b)$ is the set $\{\{a\}, \{a, b\}\}$. For $n \ge 2$, an ordered $n$-tuple $\langle x_1, \ldots, x_n\rangle$ of elements $x_1, x_2, \ldots, x_n$ is defined by $\langle x_1, \ldots, x_n\rangle = \{(1, x_1), \ldots, (n, x_n)\}$, where for each $k \in N$, $1 \le k \le n$, $(k, x_k)$ is an ordered pair. If $c_1, c_2, \ldots, c_n$ and $c$ are sets, we define

$$c_1 \times c_2 \times \cdots \times c_n = \{\langle x_1, \ldots, x_n\rangle : x_i \in c_i\ (i = 1, \ldots, n)\}$$

and $c^n = c \times \cdots \times c$ ($n$ factors). For $n \ge 2$, an $n$-ary relation $P$ on $c_1 \times c_2 \times \cdots \times c_n$ is a subset of $c_1 \times c_2 \times \cdots \times c_n$. $P$ is a relation in $V(X)$ if each $c_i \in V_k(X)$ $(i = 1, \ldots, n)$ for some fixed integer $k$. If $P$ is a 2-ary relation on $c_1 \times c_2$ we will call it a binary relation. In this case we define the domain and range of $P$ by

$$\operatorname{dom} P = \{x_1 \in c_1 : \text{there exists } x_2 \in c_2 \text{ such that } \langle x_1, x_2\rangle \in P\}$$

and

$$\operatorname{range} P = \{x_2 \in c_2 : \text{there exists } x_1 \in c_1 \text{ such that } \langle x_1, x_2\rangle \in P\}.$$

Similarly, if $b \subseteq c_1$ we define the image of $b$ under $P$ by

$$P[b] = \{x_2 \in c_2 : \text{there exists } x_1 \in b \text{ such that } \langle x_1, x_2\rangle \in P\},$$

and the inverse image of $b \subseteq c_2$ under $P$ is the set

$$P^{-1}[b] = \{x_1 \in c_1 : \text{there exists } x_2 \in b \text{ with } \langle x_1, x_2\rangle \in P\}.$$

If $b$ is a singleton set, i.e., $b = \{x\}$, we will usually write $P[x]$ and $P^{-1}[x]$ for $P[b]$ and $P^{-1}[b]$.

A function $f$ from $a$ to $b$, which we denote by $f: a \to b$, is a subset of $a \times b$ (and hence a binary or 2-ary relation) such that, for each $x \in a$, there is exactly one $y \in b$ such that $\langle x, y\rangle \in f$. The element $y$ is called the image of $x$ and is denoted by $f(x)$. The set $a$ is the domain of $f$, and we say that $f$ is defined on $a$. If $f(x) = f(y)$ implies $x = y$ for all $x, y \in a$, we say that $f$ is one-to-one (1-1) or injective. If range $f = b$ we say that $f$ maps $a$ onto $b$ or that $f$ is surjective. A function $g: c \to b$ is an extension of $f: a \to b$ (and $f$ is the restriction of $g$ to $a$) if $c \supseteq a$ and $g(x) = f(x)$ for all $x \in a$; we write $f = g|a$ in this case. If $a \subseteq c_1 \times \cdots \times c_n$ we may say that $f$ is a function of $n$ variables and may write $f(x)$ as $f(x_1, \ldots, x_n)$, where $x = \langle x_1, \ldots, x_n\rangle$, $x_i \in c_i$. (Note that $f(x)$ will be different from $f[x]$; e.g., if $f(x) = x^2$ on $R$ then $f(2) = 4$ and $f[2] = \{4\}$.)

The set-theoretic definition of ordered $n$-tuple in Definition 1.2 is justified by the following lemma, which expresses the definitive property of ordered $n$-tuples.

1.3 Lemma $\langle x_1, \ldots, x_n\rangle = \langle y_1, \ldots, y_n\rangle$ (set equality) iff $x_i = y_i$ $(i = 1, \ldots, n)$.

Proof: We prove the lemma for $n = 2$ and leave the rest of the proof to the reader. It is immediate that if $x_1 = y_1$ and $x_2 = y_2$ then $(1, x_1) = (1, y_1)$ and $(2, x_2) = (2, y_2)$ so $\langle x_1, x_2\rangle = \langle y_1, y_2\rangle$. Suppose, conversely, that $\langle x_1, x_2\rangle = \langle y_1, y_2\rangle$. Then

$$\{\{\{1\}, \{1, x_1\}\}, \{\{2\}, \{2, x_2\}\}\} = \{\{\{1\}, \{1, y_1\}\}, \{\{2\}, \{2, y_2\}\}\}. \tag{1.1}$$

Suppose first that $x_1 = 1$. Then $\{\{1\}, \{1, x_1\}\} = \{\{1\}, \{1\}\} = \{\{1\}\}$ and so, from (1.1), $\{\{1\}, \{1, y_1\}\} = \{\{1\}\}$, so $\{1\} = \{1, y_1\}$ and hence $x_1 = y_1 = 1$. Suppose now that $x_1 \ne 1$. From (1.1), $\{\{1\}, \{1, x_1\}\} = \{\{1\}, \{1, y_1\}\}$, whence $\{1, x_1\} = \{1, y_1\}$ and so $x_1 = y_1$. Similarly $x_2 = y_2$. □
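The set-theoretic encoding of Definition 1.2 and the conclusion of Lemma 1.3 can be checked by brute force on small examples. The following is a minimal sketch, assuming integers as individuals and Python `frozenset` values as sets; the function names `pair` and `ntuple` are ours. A finite check of this kind illustrates the lemma but of course does not prove it.

```python
from itertools import product

def pair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}} of Definition 1.2."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def ntuple(*xs):
    """Ordered n-tuple <x_1, ..., x_n> = {(1, x_1), ..., (n, x_n)}."""
    return frozenset(pair(k, x) for k, x in enumerate(xs, start=1))

# Lemma 1.3 on all 3-tuples over {0, 1, 2, 3}: tuples are equal as sets
# exactly when they agree in every coordinate.
values = range(4)
lemma_holds = all(
    (ntuple(*xs) == ntuple(*ys)) == (xs == ys)
    for xs in product(values, repeat=3)
    for ys in product(values, repeat=3)
)
print(lemma_holds)
```

The check also makes the distinction drawn in the text visible: the ordered pair `pair(1, 2)` and the 2-tuple `ntuple(1, 2)` are different sets.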

From Definition 1.2 we see that $n$-ary relations on $V_k(X)$ and functions with domain and range in $V_k(X)$ for some $k \in N$ are entities in $V(X)$. Indeed, if $x, y \in V_k(X)$ then the ordered pair $(x, y) \in V_{k+2}(X)$. Thus if $x_1, \ldots, x_n, x_{n+1} \in V_k(X)$ then $\langle x_1, \ldots, x_n\rangle \in V_{k+3}(X)$, and the 2-tuple $\langle\langle x_1, \ldots, x_n\rangle, x_{n+1}\rangle \in V_{k+6}(X)$. Therefore, any relation $P$ on a set $c_1 \times \cdots \times c_n$, $c_i \in V_k(X)$ $(i = 1, \ldots, n)$, is an element of $V_{k+4}(X)$, and a function of $n$ variables on $c_1 \times \cdots \times c_n$ with range in $V_k(X)$ is in $V_{k+7}(X)$; a function on just $c_1$ is again in $V_{k+4}(X)$. Thus superstructures are at least rich enough to contain entities corresponding to the usual relations and functions occurring in mathematical systems. We conclude this section with some examples.

1.4 Examples

1. Let $X = R$ and let $\mathcal{I}$ be the set of all finite closed intervals in $R$; i.e., $I \in \mathcal{I}$ iff $I = \{x \in R : a \le x \le b,\ a, b \in R\} = [a, b]$. Then $\mathcal{I} \in V_2(R)$.
2. We define a relation $P$ on $N \times \mathcal{I}$, where $N$ is the set of natural numbers, by “$\langle n, y\rangle \in P$ iff $n \in y$.” Clearly $P$ is in $V_6(R)$ since $N$ and $\mathcal{I}$ are in $V_2(R)$.
3. The relation $\mu$ defined on $\mathcal{I} \times R^+$ (where $R^+$ is the set of positive real numbers) by “$\langle y, r\rangle \in \mu$ iff $y = [a, b]$ and $r = |b - a|$” is a function on $\mathcal{I}$ which measures the length of each interval $I \in \mathcal{I}$; it is in $V_6(R)$.
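The rank bookkeeping used before Examples 1.4 (a pair raises the level by 2, an $n$-tuple by 3, a 2-tuple whose first entry is itself an $n$-tuple by 6) can be checked mechanically on finite sets. The following is a minimal sketch, assuming integers as individuals and `frozenset` values as sets; `pair`, `ntuple` and `rank` are our own helper names, and `rank` computes the least $n$ with the entity in $V_n(X)$.

```python
def pair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}} of Definition 1.2."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def ntuple(*xs):
    """Ordered n-tuple <x_1, ..., x_n> = {(1, x_1), ..., (n, x_n)}."""
    return frozenset(pair(k, x) for k, x in enumerate(xs, start=1))

def rank(entity):
    """0 for an individual; otherwise 1 + the largest rank of a member (1 for the empty set)."""
    if not isinstance(entity, frozenset):
        return 0
    return 1 + max((rank(member) for member in entity), default=0)

# With individuals (k = 0) as entries:
print(rank(pair(5, 7)))                   # pair of individuals
print(rank(ntuple(5, 7, 9)))              # 3-tuple of individuals
print(rank(ntuple(ntuple(5, 7), 9)))      # 2-tuple <<x_1, x_2>, x_3>
```

For entries of rank $k = 0$ the three printed ranks are $k + 2$, $k + 3$ and $k + 6$, in agreement with the levels quoted in the text; for larger $k$ the text's levels are upper bounds.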


Exercises II.1

1. Complete the proof of Lemma 1.3.
2. If $a \in V_k(X)$, $k \ge 1$, and $b \subseteq a$ then for what $n$ is $b \in V_n(X)$?
3. If $a, b \in V_k(X) - V_0(X)$, $k \ge 1$, then for what $n$ are $a \cup b$, $a \cap b$, and $a - b$ in $V_n(X)$?
4. Show that a relation $P$ is in $V(X)$ iff dom $P$ and range $P$ are in $V(X)$.
5. Let $\mathcal{I}$ denote the collection of all finite closed intervals on the real line (i.e., sets of the form $[a, b]$). For each $I \in \mathcal{I}$ let $\mu(I) = b - a$ (i.e., the length of $I$). For which value of $n$ is $\mu \in V_n(R)$?

II.2 Languages and Interpretation for Superstructures

In this section we introduce a suitable language for superstructures and show how to interpret sentences in this language. Let $V(X)$ be a given superstructure. The symbols of the language $\mathcal{L}_X$ for $V(X)$ consist of the following.

2.1 Connectives The symbols $\neg$, $\wedge$, $\vee$, $\rightarrow$, and $\leftrightarrow$, to be interpreted later as “not,” “and,” “or,” “implies,” and “if and only if,” respectively.

2.2 Quantifiers The symbols $\forall$ and $\exists$, to be interpreted as “for all” and “there exists,” respectively.

2.3 Parentheses The symbols $[$, $]$, $($, $)$, and $\langle$, $\rangle$, to be used for bracketing.

2.4 Constant Symbols At least one symbol $\underline{a}$ for each element $a$ of $V(X)$. For simplicity of notation we will identify $a$ and its symbol $\underline{a}$. The context will clear up any possible confusion.

2.5 Variable Symbols A countable collection of symbols like $x, y, x_1, x_2, \ldots$, to be used as “variables.”

2.6 Equality Symbol The symbol $=$, to be interpreted as “equals” [it denotes set-theoretic equality for elements of $V(X) - V_0(X)$].

2.7 Predicate Symbol The symbol $\in$, to be interpreted as “is an element of.” We use the same symbol that was used informally in §II.1.

Notice that the language $\mathcal{L}_X$ is richer than the language $L_{\mathcal{S}}$ for a relational system $\mathcal{S}$ based on $X$ in that $\mathcal{L}_X$ has the symbols $\neg$, $\vee$, $\leftrightarrow$, and also $\exists$. However, $\mathcal{L}_X$ is poorer in having no terms. Sentences in $\mathcal{L}_X$ are built up inductively using the symbols just introduced. The basic building blocks are the atomic formulas, introduced in the following definition.

2.8 Definition A formula of $\mathcal{L}_X$ is built up inductively using the following rules:

(a) If $x_1, \ldots, x_n$, $x$ and $y$ are either constants or variables, the expressions $x \in y$, $x = y$, $\langle x_1, x_2, \ldots, x_n\rangle \in y$, $\langle x_1, \ldots, x_n\rangle = y$, $\langle\langle x_1, \ldots, x_n\rangle, x\rangle \in y$, and $\langle\langle x_1, \ldots, x_n\rangle, x\rangle = y$ are formulas, called atomic formulas.
(b) If $\Phi$ and $\Psi$ are formulas, then so are $\neg\Phi$, $\Phi \wedge \Psi$, $\Phi \vee \Psi$, $\Phi \rightarrow \Psi$, and $\Phi \leftrightarrow \Psi$.
(c) If $x$ is a variable symbol, $y$ is either a variable or a constant symbol, and $\Phi$ is a formula which does not already contain an expression of the form $(\forall x \in z)$ or $(\exists x \in z)$ (with the same variable symbol $x$), then $(\forall x \in y)\Phi$ and $(\exists x \in y)\Phi$ are formulas.

A variable occurs in the scope of a quantifier if whenever a variable $x$ occurs in $\Phi$, then $x$ is contained in a formula $\Psi$ which occurs in $\Phi$ in the form $(\forall x \in z)\Psi$ or $(\exists x \in z)\Psi$ ($z$ may be either a variable or a constant); it is then said to be bound, and otherwise it is called free. A sentence is a formula in which all variables are bound.

For example, the expression $(\forall x \in b)[x \in y \wedge \langle y, a\rangle = b]$, where $a$ and $b$ are constants and $x$ and $y$ are variables, is a formula in $\mathcal{L}_X$ but not a sentence, since the variable $y$ is free. The formula $(\forall x \in y)(\exists z \in c)[\langle x, z\rangle = c]$ is likewise not a sentence, but $(\exists y \in a)(\forall x \in y)(\exists z \in c)[\langle x, y\rangle = c \wedge \langle y, z\rangle = d]$ is a sentence.

We now indicate how to interpret a given sentence $\Phi$ in $\mathcal{L}_X$ in the superstructure $V(X)$. That is, we show how to decide whether $\Phi$ is true or false in $V(X)$.

2.9 Definition

(a) The atomic sentences $a \in b$, $\langle a_1, \ldots, a_n\rangle \in b$, $\langle\langle a_1, \ldots, a_n\rangle, c\rangle \in b$ and $a = b$, $\langle a_1, \ldots, a_n\rangle = b$, $\langle\langle a_1, \ldots, a_n\rangle, c\rangle = b$ are true (hold) in $V(X)$ if, respectively, the entity (corresponding to) $a$, $\langle a_1, \ldots, a_n\rangle$, or $\langle\langle a_1, \ldots, a_n\rangle, c\rangle$ is an element of, or identical to, $b$.
(b) If $\Phi$ and $\Psi$ are sentences then
(i) $\neg\Phi$ is true if $\Phi$ is not true (does not hold),
(ii) $\Phi \wedge \Psi$ is true if both $\Phi$ and $\Psi$ are true,
(iii) $\Phi \vee \Psi$ is true if at least one of $\Phi$ and $\Psi$ is true,
(iv) $\Phi \rightarrow \Psi$ is true if either $\Psi$ is true or $\Phi$ is not true,
(v) $\Phi \leftrightarrow \Psi$ is true if $\Phi$ and $\Psi$ are either both true or both not true.

(c) Let $\Phi = \Phi(x)$ be a formula in which $x$ is the only free variable, and $b$ is a constant symbol. Then
(i) $(\forall x \in b)\Phi$ is true if, for all entities $a \in b$, when the symbol corresponding to the entity $a$ is substituted for $x$ in $\Phi$, the resulting formula, which we denote by $\Phi(a)$, is true,
(ii) $(\exists x \in b)\Phi$ is true if there exists an entity $a \in b$ so that $\Phi(a)$ is true.

Theoretically, the induction scheme for interpretation of sentences implied by 2.9 could be rather involved. But it is nothing more than the obvious one consistent with the specified (and usual) interpretation of the logical symbols involved. For most sentences $\Phi$ in $\mathcal{L}_X$ which we will later encounter, it will be easy to check if $\Phi$ is true in $V(X)$. For example, let $\mathcal{L}_R$ be the language for $V(R)$, and let $\hat{S}$ be the ternary relation for sum (i.e., $\langle a, b, c\rangle \in \hat{S}$ if $a + b = c$), $\hat{P}$ be the ternary relation for product (i.e., $\langle a, b, c\rangle \in \hat{P}$ if $ab = c$), and $R_0$ be the set $R - \{0\}$ of nonzero reals. Then the sentence

$$(\forall x \in R_0)(\exists y \in R)[\langle x, y, 1\rangle \in \hat{P}] \tag{2.1}$$

is true in $V(R)$. For let $\Phi(x) = (\exists y \in R)[\langle x, y, 1\rangle \in \hat{P}]$. Then $(\forall x \in R_0)\Phi$ is true if, for all nonzero $a \in R$, $(\exists y \in R)[\langle a, y, 1\rangle \in \hat{P}]$ is true. This sentence is true if there is a number $b$ so that $\langle a, b, 1\rangle \in \hat{P}$. But this is true with $b = a^{-1}$. Thus the sentence states that “every nonzero real number has an inverse.”

As in Chapter I, it is important to be able to translate an ordinary mathematical statement into a sentence in the language $\mathcal{L}_X$, since in the next section we will show how to write down the “transform” of such a sentence, which can then be interpreted in an appropriate “nonstandard” superstructure. As an example, consider the distributive law for the real numbers. To simplify matters we introduce the following notational convention.

2.10 Convention If $f$ is the symbol for a function of $n$ variables we may write $x_{n+1} = f(x_1, \ldots, x_n)$ for the atomic formula $\langle\langle x_1, \ldots, x_n\rangle, x_{n+1}\rangle \in f$. If $f$ is a function of one variable, we may write $y = f(x)$ for $\langle x, y\rangle \in f$.
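The truth definition 2.9(c) for bounded quantifiers can be exercised mechanically when the sets involved are finite. The following is a minimal sketch under a loud assumption: since $R$ cannot be enumerated, the integers modulo $m$ serve as a finite stand-in, with Python tuples standing in for the 3-tuples of the text. The function names `product_relation` and `every_nonzero_has_inverse` are our own. The sentence evaluated has the same shape as (2.1).

```python
def product_relation(m):
    """Finite stand-in for the product relation: triples (a, b, c) with a*b = c (mod m)."""
    return {(a, b, (a * b) % m) for a in range(m) for b in range(m)}

def every_nonzero_has_inverse(m):
    """Evaluate (forall x in R0)(exists y in R)[<x, y, 1> in P] following Definition 2.9(c)."""
    R = range(m)
    R0 = [a for a in R if a != 0]
    P = product_relation(m)
    # (forall x in b) Phi becomes all(...); (exists y in b) Phi becomes any(...).
    return all(any((x, y, 1) in P for y in R) for x in R0)

print(every_nonzero_has_inverse(7))   # integers mod 7 form a field
print(every_nonzero_has_inverse(6))   # 2 has no inverse mod 6
```

The same sentence is thus true in one structure and false in another, which is the point of Definition 2.9: truth is always relative to the entities named by the constants.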

Noting that the ternary relations $\hat{S}$ and $\hat{P}$ define functions $S$ and $P$ of two variables [e.g., $S(a, b) = c$ if $\langle a, b, c\rangle \in \hat{S}$], we may express the distributive law by the sentence

$$(\forall x \in R)(\forall y \in R)(\forall z \in R)(\forall x_1 \in R) \cdots (\forall x_4 \in R)\bigl[[S(y, z) = x_1] \wedge [P(x, x_1) = x_2] \wedge [P(x, y) = x_3] \wedge [P(x, z) = x_4] \rightarrow [S(x_3, x_4) = x_2]\bigr]. \tag{2.2}$$

Notice that since our language $\mathcal{L}_R$ does not have terms, the sentence (2.2) is somewhat more involved than the corresponding sentence

$$(\forall x)(\forall y)(\forall z)\bigl[R(x) \wedge R(y) \wedge R(z) \rightarrow E(P(x, S(y, z)), S(P(x, y), P(x, z)))\bigr] \tag{2.3}$$

in the language $L_{\mathcal{R}}$ of Chapter I. [Remember that, in $L_{\mathcal{R}}$, $x(y + z) = xy + xz$ is shorthand for the term $E(x(y + z), xy + xz)$, where $E$ is the symbol for the equality relation.] However, it is still easy to check that (2.2) is true in $V(R)$.

For another example, consider the statement that $f: R \to R$ is continuous at the point $x = a$, or, more precisely, given $\varepsilon > 0$ there exists a $\delta > 0$ so that whenever $|x - a| < \delta$, $|f(x) - f(a)| < \varepsilon$. To translate this into a sentence in $\mathcal{L}_R$, let $R^+$ denote the entity in $V(R)$ which is the set of strictly positive real numbers. Let $\rho$ be the function of two variables corresponding to distance (so that $\langle\langle x, y\rangle, z\rangle \in \rho$ iff $|x - y| = z$), and $I$ be the binary relation of strict inequality (so that $\langle x, y\rangle \in I$ iff $x < y$). Then the corresponding sentence in $\mathcal{L}_R$ is

$$(\forall \varepsilon \in R^+)(\exists \delta \in R^+)(\forall x \in R)(\forall x_1 \in R)(\forall x_2 \in R)(\forall x_3 \in R)\bigl[[\rho(x, a) = x_1 \wedge \langle x_1, \delta\rangle \in I \wedge f(x) = x_2 \wedge f(a) = b \wedge \rho(x_2, b) = x_3] \rightarrow [\langle x_3, \varepsilon\rangle \in I]\bigr]. \tag{2.4}$$

Check that the interpretation of this sentence is true in $V(R)$ iff $f$ is continuous at $x = a$ and $f(a) = b$.

Since, with a little practice, the translation of ordinary mathematical statements in a given superstructure $V(X)$ into sentences in the language $\mathcal{L}_X$ will be routine, we adopt the following convention in the rest of this book.

2.11 Convention A sentence in $\mathcal{L}_X$ will often be written as a sentence in the language $L_{\mathcal{S}}$ of Chapter I, where $\mathcal{S}$ is a relational system over $X$, or even as a sentence in ordinary mathematical language, when the translation into a sentence in $\mathcal{L}_X$ is clear. We will also abbreviate $(\forall x_1 \in c) \cdots (\forall x_n \in c)$ by $(\forall x_1, \ldots, x_n \in c)$.

Thus, for example, the sentence in $\mathcal{L}_R$ which is equivalent to (2.4) is the sentence

$$(\forall \varepsilon \in R^+)(\exists \delta \in R^+)(\forall x \in R)\bigl[|x - a| < \delta \rightarrow |f(x) - f(a)| < \varepsilon\bigr]. \tag{2.5}$$

Similarly, the sentence in $\mathcal{L}_R$ which is equivalent to the semiformal sentence $(\forall x \in a)[x \subseteq b]$ is the sentence $(\forall x \in a)(\forall y \in x)[y \in b]$.

Exercises II.2

1. Write out the commutative and associative laws of addition for $R$ in the form of sentence (2.2).
2. Write out a sentence in the form (2.4) which means that $\lim_{x \to a} f(x) = L$ in $R$.
3. Write out a sentence in the form (2.4) which means that the derivative $f'(a)$ exists and equals $L$.
4. Formulate a sentence in $\mathcal{L}_R$ which expresses the Archimedean property of the real number system (i.e., for each $x \in R$ there is an $m \in N$ so that $m \ge x$).
5. Write sentences in $\mathcal{L}_R$ expressing the fact that a collection $\mathcal{F}$ of subsets of $R$ is a filter.
6. Let $X$ be any set. Write sentences in $\mathcal{L}_X$ which express the facts that a function $f: A \to B$ is surjective (i.e., onto) and injective (i.e., one-to-one) respectively.

11.3 Monomorphisms between Superstructures:

The Transfer Principle In $1.5 we stated that the relational systems W and *W are connected by a transfer principle. To be precise, we stated that, with the *-mapping and the associated *-transform of simple sentences defined in 61.5, if (0 is any simple sentence which is true in W then *(0 is true in *W. In this section we generalize this relationship to superstructures. The basic properties of the new mapping *, which was introduced by Robinson and Zakon [45,48], are abstracted in the notion of a monomorphism. In the next section we show that with each superstructure V ( X ) one can associate a superstructure V(*X) and a monomorphism *: V ( X )+ V(*X). Let X and Y be two sets of individuals with associated superstructures V ( X )and V( Y) and languages Pxand Py, respectively. We will again assume that there is at least one constant symbol in zxand Pyfor each entity in V ( X )and V(Y),respectively, and identify the constant symbols with the corresponding entities. The context should settle any possible confusion. A constant symbol in Y Xnames something in V ( X ) and the same is true for constant symbols in pY.


Now let $*: V(X) \to V(Y)$ be a one-to-one mapping (injection). For $a \in V(X)$ we write $*(a) = {}^*a$. We assume that for each $a \in V(X)$ the symbol ${}^*a$ is in $\mathcal{L}_Y$ and names $*(a)$.

3.1 Definition If $\Phi$ is a formula (or sentence) in $\mathcal{L}_X$, the *-transform ${}^*\Phi$ of $\Phi$ is the formula (or sentence) in $\mathcal{L}_Y$ obtained from $\Phi$ by replacing each constant symbol $c$ in $\Phi$ with the symbol ${}^*c$ in $\mathcal{L}_Y$ associated with the entity $*(c)$.

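For example, given the set $R$ of real numbers, we assume that a superstructure $V({}^*R)$ over a set ${}^*R$ and a monomorphism $*: V(R) \to V({}^*R)$ exist (this will be established in the next section). Then the *-transform of the sentence (2.1) of the last section is the sentence

$$(\forall x \in {}^*R_0)(\exists y \in {}^*R)[\langle x, y, {}^*1\rangle \in {}^*\mathcal{P}] \tag{3.1}$$

and the *-transform of (2.4) is

$$(\forall \varepsilon \in {}^*R^+)(\exists \delta \in {}^*R^+)(\forall x \in {}^*R)(\forall x_1 \in {}^*R)(\forall x_2 \in {}^*R)(\forall x_3 \in {}^*R)\big[[{}^*p(x, {}^*a) = x_1 \wedge \langle x_1, \delta\rangle \in {}^*I \wedge {}^*f(x) = x_2 \wedge {}^*f({}^*a) = {}^*b \wedge {}^*p(x_2, {}^*b) = x_3] \to [\langle x_3, \varepsilon\rangle \in {}^*I]\big]. \tag{3.2}$$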

3.2 Definition The injection $*: V(X) \to V(Y)$ is called a monomorphism if

(i) ${}^*\emptyset = \emptyset$, where $\emptyset$ is the empty set,
(ii) $a \in X$ implies ${}^*a \in Y$, and $n \in N$ implies ${}^*n = n$ (recall that $N \subseteq X$ and $N \subseteq Y$ by assumption),
(iii) $a \in V_{n+1}(X) - V_n(X)$ implies ${}^*a \in V_{n+1}(Y) - V_n(Y)$, $n \geq 0$,
(iv) if $a \in {}^*V_n(X)$, $n \geq 1$, and $b \in a$, then $b \in {}^*V_{n-1}(X)$,
(v) (transfer principle) for any sentence $\Phi$ in $\mathcal{L}_X$, $\Phi$ holds in $V(X)$ iff ${}^*\Phi$ holds in $V(Y)$.

Property (iv) is called strictness by Zakon [48]. We will later interpret it to say that elements of "internal" sets are internal. Because of (ii) we may, and will, assume that $X$ is actually a subset of $Y$ and ${}^*a = a$ for $a \in X$ [this is the analogue of Convention 2.5(c) of Chapter I]. The transfer principle as stated is redundant: if from $\Phi$ holding in $V(X)$ one can conclude that ${}^*\Phi$ holds in $V(Y)$, then when $\neg\Phi$ holds in $V(X)$, the sentence ${}^*(\neg\Phi)$, i.e., $\neg({}^*\Phi)$, holds in $V(Y)$, so the converse implication follows from the forward one. The principle that ${}^*\Phi$ holding in $V(Y)$ implies that $\Phi$ holds in $V(X)$ will sometimes be called the downward transfer principle.

We now suppose that $*: V(X) \to V(Y)$ is a monomorphism and collect together some elementary results that follow easily from the transfer principle; the proofs are good illustrations of the use of that principle.



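3.3 Theorem

(a) Let $a, b, a_1, \ldots, a_n$ be fixed entities in $V(X)$. Then

(i) ${}^*\{a_1, \ldots, a_n\} = \{{}^*a_1, \ldots, {}^*a_n\}$,
(ii) ${}^*\langle a_1, \ldots, a_n\rangle = \langle {}^*a_1, \ldots, {}^*a_n\rangle$,
(iii) $a \in b$ iff ${}^*a \in {}^*b$,
(iv) $a = b$ iff ${}^*a = {}^*b$,
(v) $a \subseteq b$ iff ${}^*a \subseteq {}^*b$,
(vi) ${}^*\big(\bigcup_{i=1}^n a_i\big) = \bigcup_{i=1}^n {}^*a_i$ and ${}^*\big(\bigcap_{i=1}^n a_i\big) = \bigcap_{i=1}^n {}^*a_i$,
(vii) ${}^*(a_1 \times a_2 \times \cdots \times a_n) = {}^*a_1 \times \cdots \times {}^*a_n$.

(b) If $P$ is a relation on $a_1 \times \cdots \times a_n$ then ${}^*P$ is a relation on ${}^*a_1 \times \cdots \times {}^*a_n$, and, for $n = 2$, ${}^*(\mathrm{dom}\, P) = \mathrm{dom}\, {}^*P$ and ${}^*(\mathrm{range}\, P) = \mathrm{range}\, {}^*P$.

(c) If $f$ is a mapping from $a$ into $b$ then ${}^*f$ is a mapping from ${}^*a$ into ${}^*b$, and ${}^*[f(c)] = {}^*f({}^*c)$ for each $c \in a$. Also $f$ is one-to-one iff ${}^*f$ is one-to-one.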

Proof: (a)(i) Let $b = \{a_1, \ldots, a_n\}$ and transform the sentence $(\forall x \in b)[x = a_1 \vee x = a_2 \vee \cdots \vee x = a_n]$, as well as the sentences $a_1 \in b, \ldots, a_n \in b$.

(a)(ii) Exercise 1.

(a)(iii) Clear.

(a)(iv) Clear.

(a)(v) The sentence $(\forall x \in a)[x \in b]$ is true in $V(X)$ iff its *-transform $(\forall x \in {}^*a)[x \in {}^*b]$ is true in $V(Y)$. The interpretation of the latter sentence is that ${}^*a \subseteq {}^*b$.

(a)(vi) We show that ${}^*(a \cup b) = {}^*a \cup {}^*b$; it then follows by induction that ${}^*\big(\bigcup_{i=1}^n a_i\big) = \bigcup_{i=1}^n {}^*a_i$. The proof that ${}^*\big(\bigcap_{i=1}^n a_i\big) = \bigcap_{i=1}^n {}^*a_i$ is similar (Exercise 2). Let $c = a \cup b$. The sentence $(\forall x \in c)[x \in a \vee x \in b]$ is true in $V(X)$, so its *-transform $(\forall x \in {}^*c)[x \in {}^*a \vee x \in {}^*b]$ is true in $V(Y)$. The interpretation of the latter sentence is that ${}^*(a \cup b) \subseteq {}^*a \cup {}^*b$. Similarly, the interpretation of the *-transforms of the sentences $(\forall x \in a)[x \in c]$ and $(\forall x \in b)[x \in c]$ shows that ${}^*(a \cup b) \supseteq {}^*a \cup {}^*b$.

(a)(vii) We show that ${}^*(a \times b) = {}^*a \times {}^*b$; the proof for $n > 2$ is similar. Interpretation of the *-transforms of the sentences $(\forall z \in (a \times b))(\exists x \in a)(\exists y \in b)[\langle x, y\rangle = z]$ and $(\forall x \in a)(\forall y \in b)(\exists z \in (a \times b))[\langle x, y\rangle = z]$ shows that ${}^*(a \times b) \subseteq {}^*a \times {}^*b$ and ${}^*a \times {}^*b \subseteq {}^*(a \times b)$.

(b) That ${}^*P$ is a relation on ${}^*a_1 \times \cdots \times {}^*a_n$ follows by interpretation of the *-transform of $(\forall x \in P)(\exists x_1 \in a_1) \cdots (\exists x_n \in a_n)[\langle x_1, \ldots, x_n\rangle = x]$. To show that, for $n = 2$, ${}^*(\mathrm{dom}\, P) \subseteq \mathrm{dom}\, {}^*P$, interpret the *-transform of the sentence $(\forall x \in \mathrm{dom}\, P)(\exists y \in a_2)[\langle x, y\rangle \in P]$. The proof of the fact that ${}^*(\mathrm{dom}\, P) \supseteq \mathrm{dom}\, {}^*P$ is left to the reader (Exercise 3).


(c) ${}^*f$ is a relation on ${}^*a \times {}^*b$ by (b). To show that ${}^*f$ is a mapping, interpret the *-transform of the sentence $(\forall x \in a)(\forall y \in b)(\forall z \in b)[[\langle x, y\rangle \in f \wedge \langle x, z\rangle \in f] \to y = z]$, which is true in $V(X)$. The rest of the proof of (c) is left as Exercise 3. □

The results in Theorem 3.3 are quite general in nature. To be more concrete we consider, as examples, the interpretation of the sentences (3.1) and (3.2). Remember that the sentence (2.1) of which (3.1) is the *-transform holds in $V(R)$ because of the fact that there exists a multiplicative inverse of each nonzero element in the field $\mathcal{R}$, and (2.1) is a formal expression of that mathematical statement. Clearly (3.1) should be a formal expression of a similar fact about $V({}^*R)$. To see this, note that the ternary relation $\mathcal{P}$ defines a function $P$ of two variables since the product of two real numbers is uniquely defined. By parts (b) and (c) of Theorem 3.3 we see that ${}^*P$ is a function from ${}^*R^2$ to ${}^*R$. Thus for each $a, b \in {}^*R$ the number $c \in {}^*R$ such that $\langle\langle a, b\rangle, c\rangle \in {}^*P$ is uniquely defined and is called the *-product of $a$ and $b$. We denote $c$ by $a \cdot b$ or $ab$. Now (3.1) is true by transfer in $V({}^*R)$ since (2.1) is true in $V(R)$, and its interpretation establishes the existence for each $a \neq 0$ in ${}^*R$ of a number $y \in {}^*R$ so that $a \cdot y = 1$. One can similarly show by transfer that $y$ is unique.

Consider now the interpretation of (3.2). Proceeding as above, we see that (3.2) is equivalent to the ordinary mathematical statement "Given $\varepsilon > 0$ in ${}^*R$ there is a $\delta > 0$ in ${}^*R$ so that, for all $x \in {}^*R$, $|x - a| < \delta$ implies $|{}^*f(x) - {}^*f(a)| < \varepsilon$." (The absolute value $|x|$ for $x \in {}^*R$ is the extension of the usual absolute value in $R$.) Notice that here $\varepsilon$ and $\delta$ are allowed to be any positive numbers in ${}^*R$ (even infinitesimal). The function ${}^*f$ will be said to be *-continuous at $a$ if it satisfies (3.2), which will be the case, by transfer, if $f$ is continuous at $a$.
In §I.2 we noted that if $B$ was a subset of $R$ then ${}^*B$ was an extension of $B$ (regarded as embedded in ${}^*R$). This fact is again true in the present context. For if $b \in \mathcal{P}(X)$ and $a \in b$ then ${}^*a \in {}^*b$ by Theorem 3.3(a)(iii). But since $a \in X$ we have ${}^*a = a$ and so $a \in {}^*b$, and hence $b \subseteq {}^*b$. One might expect that this fact is true in general, i.e., that $a \in b$ implies $a \in {}^*b$ for any entities $a, b \in V(X)$, but in general Theorem 3.3(a)(iii) is the best we can do, as shown by the following example.

3.4 Example Let $\mathcal{I}$ denote the set of closed bounded intervals in $R$; each $I \in \mathcal{I}$ is of the form $I = \{x \in R : a \leq x \leq b\} = [a, b]$ with $a, b \in R$. Then $\mathcal{I} \in V_2(R)$.

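Thus the following statements are true in $V(R)$:

$$(\forall x \in \mathcal{I})(\exists a, b \in R)(\forall y \in R)[a \leq y \leq b \leftrightarrow y \in x],$$

$$(\forall a, b \in R)(\exists x \in \mathcal{I})(\forall y \in R)[a \leq y \leq b \leftrightarrow y \in x].$$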



By transfer, assuming a monomorphism $*: V(R) \to V({}^*R)$, we see that if $I \in {}^*\mathcal{I}$ then there exist numbers $a, b \in {}^*R$ so that $I = \{x \in {}^*R : a \leq x \leq b\}$. Even if $a$ and $b$ are standard (i.e., in $R$), if $a \neq b$ such an interval is not identical to an interval in $\mathcal{I}$, since it contains non-standard reals between $a$ and $b$. Thus ${}^*\mathcal{I}$ contains the transform ${}^*I = \{x \in {}^*R : a \leq x \leq b\}$ of each standard interval $I = [a, b]$, $a, b \in R$, and also all other intervals of the form $\{x \in {}^*R : a \leq x \leq b\}$ where either $a$ or $b$ or both are non-standard. Notice, in particular, that $\mathcal{I}$ is not embedded in ${}^*\mathcal{I}$, i.e., only singleton sets in $\mathcal{I}$ lie in ${}^*\mathcal{I}$. This situation is indicative of what happens in general when one forms ${}^*b$ for an entity $b$ of rank higher than one.

The fact that the languages $\mathcal{L}_X$ and $\mathcal{L}_Y$ contain the existential quantifier $\exists$ allows alternative proofs of many of the results established in Chapter I. In particular, we may use $\exists$ to do the work done by Skolem functions in Chapter I. To illustrate, consider the following proof of the sufficiency of the condition in Proposition 8.1 of Chapter I, which states that if $\langle s_n\rangle$ is a standard sequence and ${}^*s_n \simeq L \in R$ for all infinite $n$, then $s_n$ converges to $L$. We present the proof in a hybrid of the languages $L_{\mathcal{R}}$ and $\mathcal{L}_R$. Translation into the language $\mathcal{L}_R$ is left to the interested reader.

Suppose then that ${}^*s_n \simeq L$ for all infinite positive integers $n \in {}^*N$. Let $\varepsilon > 0$ be a fixed standard real. Since $|{}^*s_n - L|$ is infinitesimal for all infinite positive integers, the statement

$$(\forall n \in {}^*N)[n \geq \omega \to |L - {}^*s_n| < \varepsilon] \tag{3.3}$$

is a sentence in $\mathcal{L}_{{}^*R}$ which is true for any infinite positive integer $\omega$. However, (3.3) is not the *-transform of a sentence in $\mathcal{L}_R$, since it involves the constant $\omega$, which does not name the image $*(a)$ of an element $a \in V(R)$. But since (3.3) is true, the sentence

$$(\exists m \in {}^*N)(\forall n \in {}^*N)[n \geq m \to |L - {}^*s_n| < \varepsilon] \tag{3.4}$$

is also true in $V({}^*R)$ and is the *-transform of

$$(\exists m \in N)(\forall n \in N)[n \geq m \to |L - s_n| < \varepsilon], \tag{3.5}$$

which is then true in $V(R)$ by virtue of the transfer principle. Since (3.5) is true for any $\varepsilon > 0$, we see that $s_n$ converges to $L$.

3.5 Remark Comparison of this proof with that in Chapter I shows that we have avoided a proof by contradiction, and the use of Skolem functions. Another and more important aspect of this new technique of proof is that we construct a true sentence ${}^*\Phi$ in $\mathcal{L}_{{}^*X}$ which is the *-transform of a sentence $\Phi$ in $\mathcal{L}_X$, so $\Phi$ is true by transfer down and yields the desired result. In Chapter I we used the transfer principle only in the upward direction, i.e., from $L_{\mathcal{R}}$ to $L_{{}^*\mathcal{R}}$. The construction of ${}^*\Phi$ is often accomplished, as above, by writing down a sentence $\Psi$ in $\mathcal{L}_{{}^*X}$ which is true but involves entities, like the $\omega$ above, which do not occur in the *-transforms of sentences $\Phi$ in $\mathcal{L}_X$, and then appropriately adding the existential quantifier to convert $\Psi$ to a sentence of the form ${}^*\Phi$ for some $\Phi$ in $\mathcal{L}_X$. The proof above may seem surprising since we infer the existence of a standard integer $m$ satisfying

$$(\forall n \in N)[n \geq m \to |L - s_n| < \varepsilon] \tag{3.6}$$

from the existence of the infinite integer $\omega$ satisfying (3.3). Since similar proofs will occur in the rest of this book it is important to be able to recognize when a sentence $\Psi$ in $\mathcal{L}_{{}^*X}$ is or is not of the form ${}^*\Phi$ for some sentence $\Phi$ in $\mathcal{L}_X$. This question will be dealt with in §II.6.

Exercises II.3

1. Prove Theorem 3.3(a)(ii).
2. Show that ${}^*\big(\bigcap_{i=1}^n a_i\big) = \bigcap_{i=1}^n {}^*a_i$.
3. Finish the proof of parts (b) and (c) of Theorem 3.3.
4. Use the downward transfer principle to prove the sufficiency of the condition in Proposition 10.1(a) of Chapter I.
5. Use the downward transfer principle to prove the sufficiency of the condition in Proposition 10.8 of Chapter I.
6. Use the transfer principle to show that the set $N$ of standard natural numbers is not an element of ${}^*\mathcal{P}(N)$.
7. Let $*: V(X) \to V(Y)$ be a monomorphism. Show that if $f \in V(X)$ maps $a$ onto $b$ then ${}^*f$ maps ${}^*a$ onto ${}^*b$.

II.4 The Ultrapower Construction for Superstructures

In this section we show how to generalize the construction of ${}^*R$ in Chapter I by constructing, for any superstructure $V(X)$, a superstructure $V({}^*X)$ on an appropriate set ${}^*X$ and a monomorphism $*: V(X) \to V({}^*X)$. We begin with an ultrafilter $\mathcal{U}$ on an index set $I$ (see the Appendix); both $I$ and $\mathcal{U}$ will be fixed in the construction of $V({}^*X)$, but in later sections we will choose them to have additional properties. Now let $V(X) = \bigcup_{n=0}^{\infty} V_n(X)$ be a given superstructure.

4.1 Definition Let $S$ be an entity in $V(X)$. The set of all maps $a: I \to S$ is denoted by $\prod S$; we write $a(i) = a_i$ for $i \in I$. The maps $a$ and $b$ in $\prod S$ are equivalent (with respect to $\mathcal{U}$), and we write $a =_{\mathcal{U}} b$, iff $\{i \in I : a_i = b_i\} \in \mathcal{U}$ (the equality is set-theoretic except when $S \subseteq X$, in which case it is identity). If $a =_{\mathcal{U}} b$ we say that $a_i = b_i$ almost everywhere (a.e.). The relation $=_{\mathcal{U}}$ is an equivalence relation on $\prod S$. The set of associated equivalence classes is denoted by $\prod_{\mathcal{U}} S$, and is called the ultrapower of $S$ (with respect to $\mathcal{U}$). The equivalence class in $\prod_{\mathcal{U}} S$ containing $a \in \prod S$ is denoted by $[a]$. Let $V_{-1}(X) = \emptyset$, the empty set. The bounded ultrapower of $V(X)$ is the set

$$\prod\nolimits_{\mathcal{U}}^{0} V(X) = \bigcup_{n=0}^{\infty} \prod\nolimits_{\mathcal{U}} [V_n(X) - V_{n-1}(X)].$$

We define the map $e: V(X) \to \prod_{\mathcal{U}}^{0} V(X)$ by $e(a) = [\bar{a}]$, where $\bar{a}_i = a$ for all $i \in I$.

The proof that $=_{\mathcal{U}}$ is an equivalence relation on $\prod S$ is similar to the proof of Lemma 1.4 of Chapter I and is left as an exercise. We see immediately from Definitions 1.3 and 1.5 of Chapter I that ${}^*R = \prod_{\mathcal{U}} R$, where $\mathcal{U}$ is the ultrafilter of §I.1. The map $e$ is a generalization of the map $*: R \to {}^*R$ of Definition 1.9 of Chapter I. $\prod_{\mathcal{U}}^{0} V(X)$ is called a bounded ultrapower since, for each $[a] \in \prod_{\mathcal{U}}^{0} V(X)$, $a_i \in V_k(X) - V_{k-1}(X)$, $i \in I$, for some fixed $k \in N$; thus, there is a uniform upper bound to the rank of $a_i$, $i \in I$.

We now want to construct from $\prod_{\mathcal{U}}^{0} V(X)$ a superstructure $V({}^*X)$ over a set ${}^*X$, and an associated mapping $M: \prod_{\mathcal{U}}^{0} V(X) \to V({}^*X)$. We will finally define the mapping $*: V(X) \to V({}^*X)$ as the composition of $e$ and $M$, and show that $*$ is a monomorphism. In the literature, $M$ is called a Mostowski collapsing function. First we must define ${}^*X$. In analogy with the definition of ${}^*R$ we put

$${}^*X = \prod\nolimits_{\mathcal{U}} X = \prod\nolimits_{\mathcal{U}} V_0(X). \tag{4.1}$$

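Now $V({}^*X)$ is completely determined and we proceed to the definition of $M: \prod_{\mathcal{U}}^{0} V(X) \to V({}^*X)$. We define $M$ successively on $\prod_{\mathcal{U}} [V_n(X) - V_{n-1}(X)]$ by induction. By (4.1) $\prod_{\mathcal{U}} V_0(X) = {}^*X$, and by definition $V_0({}^*X) = {}^*X$, and so we define $M$ to be the identity on ${}^*X$, i.e.,

$$M(a) = a, \qquad a \in \prod\nolimits_{\mathcal{U}} V_0(X) = {}^*X. \tag{4.2}$$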

For higher levels we need the following definition.

4.2 Definition If $[a], [b] \in \prod_{\mathcal{U}}^{0} V(X)$, then $[a] \in_{\mathcal{U}} [b]$ iff $\{i \in I : a_i \in b_i\} \in \mathcal{U}$.

The reader should check that $\in_{\mathcal{U}}$ is well defined (exercise). To motivate the definition of $M$ on $\prod_{\mathcal{U}} [V_1(X) - V_0(X)]$ we let $X = R$ and recall from §I.2 the definition of ${}^*A$, where $A$ is a subset of $R$. By Definition 2.2 of Chapter I (with $I = N$), ${}^*A$ consists of those elements $[a]$ of ${}^*R$ for which $\{i \in I : a_i \in A\} \in \mathcal{U}$. For our more general situation, the subset $A$ is mapped by $e$ to the element $e(A) = [\bar{A}]$ in $\prod_{\mathcal{U}} V_1(R)$. Note that $[\bar{A}]$ is not a subset of ${}^*R$. We want ${}^*A$ to be a subset of ${}^*R$ and to consist of precisely those elements $[a] \in {}^*R$ for which $[a] \in_{\mathcal{U}} [\bar{A}]$. Since $*$ will be the composition of $e$ and $M$, it follows that we should put

$$M([\bar{A}]) = \{[a] \in {}^*R : [a] \in_{\mathcal{U}} [\bar{A}]\} = \{M([a]) \in V_0({}^*R) : [a] \in \prod\nolimits_{\mathcal{U}} V_0(R) \text{ and } [a] \in_{\mathcal{U}} [\bar{A}]\}.$$

The general definition is now clear.

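4.3 Definition We define $M: \prod_{\mathcal{U}}^{0} V(X) \to V({}^*X)$ inductively by

$$M([b]) = [b] \quad \text{for } [b] \in \prod\nolimits_{\mathcal{U}} V_0(X),$$

$$M([b]) = \Big\{M([a]) : [a] \in \bigcup_{k=0}^{n-1} \prod\nolimits_{\mathcal{U}} [V_k(X) - V_{k-1}(X)] \text{ and } [a] \in_{\mathcal{U}} [b]\Big\} \quad \text{for } [b] \in \prod\nolimits_{\mathcal{U}} [V_n(X) - V_{n-1}(X)],\ n \geq 1.$$

The important properties of $M$ and $e$ are collected together in the following result.

4.4 Lemma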

(i) $e$ and $M$ are one-to-one maps; i.e., $a = b$ iff $e(a) = e(b)$, and $[a] = [b]$ iff $M([a]) = M([b])$.
(ii) $e$ maps $X$ into ${}^*X$; $M$ maps ${}^*X$ onto ${}^*X$.
(iii) $e$ maps $V_{n+1}(X) - V_n(X)$ into $\prod_{\mathcal{U}} [V_{n+1}(X) - V_n(X)]$; $M$ maps $\prod_{\mathcal{U}} [V_{n+1}(X) - V_n(X)]$ into $V_{n+1}({}^*X) - V_n({}^*X)$.
(iv) $a \in b$ iff $e(a) \in_{\mathcal{U}} e(b)$; $[a] \in_{\mathcal{U}} [b]$ iff $M([a]) \in M([b])$.
(v) $e(X) = [\bar{X}]$ and $M([\bar{X}]) = {}^*X$.
(vi) Let $[a], [b] \in \prod_{\mathcal{U}}^{0} V(X)$ and put $c_i = \{a_i, b_i\}$, $i \in I$. Then $[c] \in \prod_{\mathcal{U}}^{0} V(X)$ and $M([c]) = \{M([a]), M([b])\}$. Similar statements hold with $\{\ \}$ replaced by $\langle\ \rangle$ and $=$ replaced by $\in$, and also for three or more terms.
(vii) If $[b] \in_{\mathcal{U}} e(a)$, $a \in V_n(X) - V_{n-1}(X)$, then $[b] \in_{\mathcal{U}} e(V_{n-1}(X))$.

Proof: We leave the proof of (ii)-(v) and (vii) as exercises.

(i) To show $e$ is one-to-one let $a \neq b \in V(X)$. Then $e(a) \neq e(b)$ since $\bar{a}_i \neq \bar{b}_i$ for all $i \in I$, and $\emptyset \notin \mathcal{U}$. To show $M$ is one-to-one, we consider only the case that $[a]$ and $[b]$ are in $\prod_{\mathcal{U}}^{0} V(X) - {}^*X$, and $[a] \neq [b]$ in $\prod_{\mathcal{U}}^{0} V(X)$. Let $U_a = \{i \in I : \text{there exists } u_i \in a_i \text{ with } u_i \notin b_i\}$ and $U_b = \{i \in I : \text{there exists } v_i \in b_i \text{ with } v_i \notin a_i\}$. If neither $U_a$ nor $U_b$ is in $\mathcal{U}$, then $I - (U_a \cup U_b)$ is in $\mathcal{U}$ and $a_i = b_i$ for almost all $i \in I$. But this is impossible. Assume, therefore, that $U_a \in \mathcal{U}$. Choose $u_i \in a_i - b_i$ for each $i \in U_a$ and let $u_i$ be a fixed $u_{i_0}$ otherwise. Then $M([u]) \in M([a])$ and $M([u]) \notin M([b])$. The rest is left to the reader.

(vi) We prove the first statement and leave the rest to the reader. Now $M([c]) = \{M([y]) : y_i \in \{a_i, b_i\} \text{ a.e.}\}$. If $y_i \in \{a_i, b_i\}$ a.e., let $A = \{i \in I : y_i = a_i\}$ and $B = \{i \in I : y_i = b_i\}$. Then $A \cup B \in \mathcal{U}$, and so either $A \in \mathcal{U}$ or $B \in \mathcal{U}$ since $\mathcal{U}$ is an ultrafilter. Thus

$$M([c]) = \{M([y]) : y_i = a_i \text{ a.e.}\} \cup \{M([y]) : y_i = b_i \text{ a.e.}\} = \{M([a]), M([b])\}. \quad \square$$

With $*$ defined as the composition of $e$ and $M$, we now show that $*: V(X) \to V({}^*X)$ is a monomorphism. To do so we need the following fundamental result; in the proof we use the axiom of choice.

4.5 Theorem (Łoś) If $\Phi(x_1, \ldots, x_n)$ is a formula in $\mathcal{L}_X$ with $x_1, \ldots, x_n$ its only free variables, and $[a_1], \ldots, [a_n] \in \prod_{\mathcal{U}}^{0} V(X)$, then ${}^*\Phi(M([a_1]), \ldots, M([a_n]))$ is true in $V({}^*X)$ iff

$$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}.$$

Proof: 1. We first establish the result when $\Phi$ is an atomic formula. If $\Phi$ is of the form $x \in y$ or $x = y$, where $x$ and $y$ are either constants or variables, the result is immediate from 4.4(i) and 4.4(iv). The result for $\Phi$ of the form $\langle x_1, \ldots, x_n\rangle \in x_{n+1}$, $\langle x_1, \ldots, x_n\rangle = x_{n+1}$, $\langle\langle x_1, \ldots, x_n\rangle, x\rangle \in x_{n+1}$, and $\langle\langle x_1, \ldots, x_n\rangle, x\rangle = x_{n+1}$ can be proved by induction using 4.4(vi) (Exercise 4).

2. Suppose now that the theorem has been established for the formulas $\Phi(x_1, \ldots, x_n)$ and $\Psi(x_1, \ldots, x_n)$. We would like to prove it for the formulas $\neg\Phi$, $\Phi \wedge \Psi$, $\Phi \vee \Psi$, and $\Phi \to \Psi$. We do so for the first two and leave the proofs for the last two as exercises (Exercise 4); recall, however, that $\Phi \vee \Psi$ is equivalent to $\neg[(\neg\Phi) \wedge (\neg\Psi)]$.

(i) For $\neg\Phi$ note that the following are equivalent:

${}^*(\neg\Phi)(M([a_1]), \ldots, M([a_n]))$ is true;
$\neg{}^*\Phi(M([a_1]), \ldots, M([a_n]))$ is true;
$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \notin \mathcal{U}$;
$\{i \in I : \neg\Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$ (since $\mathcal{U}$ is an ultrafilter).

(ii) For $\Phi \wedge \Psi$ note that the following are equivalent:

${}^*(\Phi \wedge \Psi)(M([a_1]), \ldots, M([a_n]))$ is true;
${}^*\Phi(M([a_1]), \ldots, M([a_n])) \wedge {}^*\Psi(M([a_1]), \ldots, M([a_n]))$ is true;
$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$ and $\{i \in I : \Psi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$;
$\{i \in I : \Phi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \cap \{i \in I : \Psi(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$ (since $\mathcal{U}$ is a filter);
$\{i \in I : (\Phi \wedge \Psi)(a_1(i), \ldots, a_n(i)) \text{ is true}\} \in \mathcal{U}$.

3. Suppose the result is true for a formula of the form $\Phi(x_1, \ldots, x_n, y)$. We want to show it is true for formulas of the form $(\exists y \in c)\Phi$, $(\exists y \in z)\Phi$, $(\forall y \in c)\Phi$, and $(\forall y \in z)\Phi$, where $c$ is a constant and $z$ is a variable. We consider the case $(\exists y \in c)\Phi$ and leave the case $(\exists y \in z)\Phi$ to the reader (Exercise 4). For the quantifier $\forall$, replace $(\forall y \in c)\Phi$ with $\neg(\exists y \in c)\neg\Phi$ and $(\forall y \in z)\Phi$ with $\neg(\exists y \in z)\neg\Phi$. Suppose ${}^*(\exists y \in c)\Phi(M([a_1]), \ldots, M([a_n]), y)$ holds in $V({}^*X)$, i.e.,

$$(\exists y \in {}^*c)\,{}^*\Phi(M([a_1]), \ldots, M([a_n]), y)$$

holds in $V({}^*X)$. Thus we can find $M([a]) \in V({}^*X)$ so that $(M([a]) \in {}^*c) \wedge {}^*\Phi(M([a_1]), \ldots, M([a_n]), M([a]))$ holds in $V({}^*X)$. Using step 2, this is equivalent to

$$\{i \in I : a(i) \in c \wedge \Phi(a_1(i), \ldots, a_n(i), a(i)) \text{ is true}\} \in \mathcal{U}.$$

Hence also the larger set $\{i \in I : (\exists y \in c)\Phi(a_1(i), \ldots, a_n(i), y) \text{ is true}\}$ is in $\mathcal{U}$. Conversely, let $\{i \in I : (\exists y \in c)\Phi(a_1(i), \ldots, a_n(i), y) \text{ is true}\} = U$ belong to $\mathcal{U}$. Then, for each $i \in U$, we can use the axiom of choice to choose some $a(i) \in c$ for which $\Phi(a_1(i), \ldots, a_n(i), a(i))$ is true, and for $i \in I - U$ put $a(i) = d \in c$, where $d$ is a fixed element of $c$, so that $\{i \in I : a(i) \in c \wedge \Phi(a_1(i), \ldots, a_n(i), a(i))\} \in \mathcal{U}$. Changing $a(i)$ on the complement of a set in $\mathcal{U}$ if necessary, we may assume that $a(i) \in V_k(X) - V_{k-1}(X)$ for some $k \in N$ and all $i$ (Exercise 6). Now the map $a: I \to c$ defines $[a] \in \prod_{\mathcal{U}}^{0} V(X)$ and the steps of the previous paragraph can be retraced, yielding the result.

4. The general result now follows by induction based on Definition 2.9 (Exercise 4). □

4.6 Theorem The map $*: V(X) \to V({}^*X)$ defined by $* = M \circ e$ is a monomorphism.

Proof: We prove (v) of Definition 3.2 and leave the remaining proofs as exercises. Let $\Phi$ be a sentence in $\mathcal{L}_X$. Then $\Phi$ has no free variables, so ${}^*\Phi$ is true in $V({}^*X)$ iff $\{i \in I : \Phi \text{ is true}\} \in \mathcal{U}$ by Theorem 4.5. But the set $\{i \in I : \Phi \text{ is true}\}$ is either $I$ [if $\Phi$ is true in $V(X)$] or $\emptyset$ [if $\Phi$ is not true in $V(X)$], so ${}^*\Phi$ is true if and only if $\Phi$ is true.

Whenever nonstandard analysis is applied in any concrete situation in the rest of this book, we will start with a superstructure $V(S)$ based on a suitable set $S$, and then use a superstructure $V({}^*S)$ and a monomorphism $*: V(S) \to V({}^*S)$ constructed with an ultrafilter $\mathcal{U}$ as in this section. Usually the monomorphism will not be mentioned explicitly, but we will always choose $\mathcal{U}$ in such a way that $V({}^*S)$ has a special property, that of being an enlargement. This will guarantee that ${}^*S$ is large enough to contain "infinite" entities. We turn to this question in the next section.

Exercises II.4

1. Prove that $=_{\mathcal{U}}$ is an equivalence relation on $\prod S$.
2. Show that the relation $\in_{\mathcal{U}}$ of Definition 4.2 is well defined.
3. Finish the proof of Lemma 4.4.
4. Finish the proof of Theorem 4.5.
5. Finish the proof of Theorem 4.6.
6. Show that if $a(i) \in V_n(X)$ for a fixed $n$ and all $i \in I$, then, for some $k \leq n$ and all $i \in U$ for some $U \in \mathcal{U}$, $a(i) \in V_k(X) - V_{k-1}(X)$.

II.5 Hyperfinite Sets, Enlargements, and Concurrent Relations

In §I.1 we showed that ${}^*R$ was strictly larger than $R$ (regarded as embedded in ${}^*R$) by exhibiting elements like $[\langle 1, 2, 3, \ldots\rangle]$ in ${}^*R$ which were not equal to any element of $R$. The demonstration involved the fact that the ultrafilter $\mathcal{U}$ on $N$ was free, i.e., it contained the cofinite filter $\mathcal{F}_N$. In the general case it is interesting to determine the conditions under which ${}^*X$ is strictly larger than $X$. It should be recalled that, by assumption, $X$ contains $N$ and hence is infinite. The following result shows that ${}^*X = X$ and hence $V({}^*X) = V(X)$ when $\mathcal{U}$ is a principal (nonfree) ultrafilter on $I$; thus we get nothing new in this case.

5.1 Lemma If $\mathcal{U}$ is a principal ultrafilter on $I$ then ${}^*X$ (as constructed in §II.4) equals $X$ (regarded as embedded in ${}^*X$).

Proof: A principal ultrafilter $\mathcal{U}$ is generated by a single element $i_0 \in I$; i.e., $\mathcal{U}$ consists of all sets $U \subseteq I$ which contain $i_0$ (see the Appendix). If $[a] \in {}^*X$ and $a_{i_0} = \alpha$, then $[a] = [\bar{\alpha}]$, where $\bar{\alpha}_i = \alpha$ for all $i \in I$. Thus $[a] \in X$, where $X$ is regarded as embedded in ${}^*X$. □
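The collapse described in Lemma 5.1 can be checked by brute force on a small case. The following is only an illustrative sketch under stated assumptions: the index set `I`, the two-element stand-in `X` for the set of individuals, and the helper names `principal_ultrafilter` and `equivalent` are hypothetical choices made for this example and do not come from the text.

```python
from itertools import product

def principal_ultrafilter(index_set, i0):
    """Membership test for the principal ultrafilter on index_set generated by i0."""
    index_set = frozenset(index_set)

    def contains(subset):
        subset = frozenset(subset)
        return subset <= index_set and i0 in subset

    return contains

def equivalent(a, b, in_ultrafilter):
    """Maps a, b (dicts index -> value) are equivalent iff they agree on a set in the ultrafilter."""
    agreement = {i for i in a if a[i] == b[i]}
    return in_ultrafilter(agreement)

I = [0, 1, 2]        # hypothetical finite index set
X = ["p", "q"]       # hypothetical stand-in for the individuals
i0 = 1               # generator of the principal ultrafilter
in_U = principal_ultrafilter(I, i0)

# Every map a: I -> X is equivalent to the constant map with value a(i0).
all_maps = [dict(zip(I, values)) for values in product(X, repeat=len(I))]
collapses = all(
    equivalent(a, {i: a[i0] for i in I}, in_U) for a in all_maps
)

# Pick one representative per equivalence class and count the classes.
classes = []
for a in all_maps:
    if not any(equivalent(a, rep, in_U) for rep in classes):
        classes.append(a)
num_classes = len(classes)
```

In this toy case the number of equivalence classes equals the number of elements of `X`, which is the finite shadow of the statement that the ultrapower adds nothing new when the ultrafilter is principal.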

We will next show how to choose an index set and an ultrafilter of subsets of the index set so that the ${}^*X$ constructed as in §II.4 is strictly larger than $X$, and so that $V({}^*X)$ has other desirable properties; the most important is that of being an enlargement. We begin by introducing the notion of a hyperfinite or *-finite set.

5.2 Definition If $A \in V_n(X) - V_0(X)$ for some $n$, we denote by $\mathcal{P}_F(A)$ the set of all finite subsets of $A$. $\mathcal{P}_F(A)$ is in $V(X)$, and we call the image ${}^*\mathcal{P}_F(A) \in V({}^*X)$ (with respect to a monomorphism $*$) the set of hyperfinite or *-finite subsets of ${}^*A$. The set of all hyperfinite subsets is the set $\bigcup_{n=1}^{\infty} {}^*\mathcal{P}_F(V_n(X))$.

Any elementary mathematical result that holds for finite sets extends to a similar result for hyperfinite sets by the transfer principle. An example of a hyperfinite set is the set $J \subseteq {}^*N$ of positive integers not exceeding some $j \in {}^*N$. To see that $J$ is hyperfinite consider the collection $\mathcal{S} \subseteq \mathcal{P}_F(N)$ of all finite subsets of $N$ of the form $\{1, 2, \ldots, j\}$ for some $j \in N$ (a set of this form is called an initial segment). Then ${}^*\mathcal{S} \subseteq {}^*\mathcal{P}_F(N)$ contains sets of the form $\{n \in {}^*N : n \leq j\}$ for some $j \in {}^*N$. The following result shows that these hyperfinite sets are in some sense the prototype.

5.3 Theorem If $B \in {}^*V_k(X)$, $k \in N$, is a hyperfinite set, then there is an initial segment $J = \{n \in {}^*N : n \leq j\}$ for some $j \in {}^*N$ and a one-to-one, onto mapping $f: J \to B$ in ${}^*V_{k+4}(X)$.

Proof: Suppose $B \in {}^*\mathcal{P}_F(A)$, where $A \in V_n(X)$, $n \geq 1$. Now the following statement (in semiformal language) is true in $V(X)$:

$$(\forall B \in \mathcal{P}_F(A))(\exists j \in N)(\exists f \in V_{n+4}(X))[f \text{ maps } J \text{ one-to-one onto } B, \text{ where } J = \{n \in N : n \leq j\}]$$

[the reader should check that the sentence in square brackets can be translated into a sentence in $\mathcal{L}_X$ (exercise)]. The result follows by transfer. □

Because of Theorem 5.3 we will often write a hyperfinite set $B$ as $B = \{b_1, b_2, \ldots, b_j\}$, where $b_k = f(k)$, $k \in J$, and $f$ is the function of the theorem. It should be noted that the dots in this representation cover somewhat more ground than they do in the standard case, and that this representation is really an abbreviation of the setup in Theorem 5.3. Hyperfinite sets are an important tool in nonstandard analysis by virtue of the fact that many standard mathematical structures can be "approximated" by hyperfinite structures in a natural way. We will illustrate this fact later in this section.



5.4 Definition Entities in $V(X)$, and entities which are of the form ${}^*b$ for some $b \in V(X)$, are called standard; all others are called non-standard.

5.5 Examples 1. Each individual in $X \subseteq {}^*X$ is standard. 2. In Example 3.4 the intervals $I \in {}^*\mathcal{I}$ of the form $I = \{x \in {}^*R : a \leq x \leq b\}$, where $a, b \in R$ and $a < b$, are themselves standard entities even though they contain non-standard numbers. An interval $\{x \in {}^*R : \alpha \leq x \leq \beta\}$, where $0 < \alpha < \beta$ and $\alpha$ and $\beta$ are infinitesimal, is a non-standard entity.

5.6 Definition The superstructure $V({}^*X)$ [with respect to a monomorphism $*: V(X) \to V({}^*X)$] is called an enlargement of $V(X)$ if for each set $A \in V(X)$ there is a set $B \in {}^*\mathcal{P}_F(A)$ such that ${}^*a \in B$ for each $a \in A$, i.e., $B$ contains the standard entities in ${}^*A$.

We have already seen that a hyperfinite set of the form $\{n \in {}^*N : 1 \leq n \leq j\}$, where $j \in {}^*N_\infty$, contains every standard natural number. Definition 5.6 is a generalization for arbitrary sets in $V(X)$. We will now show that for a given superstructure $V(X)$ it is possible to choose an index set $J$ and a free ultrafilter $\mathcal{V}$ on $J$ so that the associated superstructure $V({}^*X)$, constructed as in §II.4 using $J$ and $\mathcal{V}$, is an enlargement of $V(X)$. It will follow as a corollary, since $X$ is infinite, that ${}^*X$ is strictly larger than $X$. The proofs of Lemma 5.7 and Theorem 5.8 may be skipped on first reading of the chapter.

Let $J$ be the set of all nonempty finite subsets of $V(X)$. It follows that $a \in J$ iff there is a $b \in V(X) - V_0(X)$ and $a \in \mathcal{P}_F(b) - \{\emptyset\}$ (why?). If $a \in J$ we define

$$J_a = \{b \in J : a \subseteq b\}.$$

5.7 Lemma The collection $\mathcal{F} = \{A \subseteq J : \text{there exists } a \in J \text{ such that } J_a \subseteq A\}$ is a free filter on $J$.

Proof: It is easy to show that $\mathcal{F}$ is a filter. For example, if $A_1, A_2 \in \mathcal{F}$ there exist $a_1, a_2 \in J$ so that $A_i \supseteq J_{a_i}$ ($i = 1, 2$). Since $A_1 \cap A_2 \supseteq J_{a_1} \cap J_{a_2} = J_{a_1 \cup a_2}$, $A_1 \cap A_2 \in \mathcal{F}$. The rest is left as an exercise. To show $\mathcal{F}$ is free, let $a \in J$. Then there is an element $b \in J$ so that $a \cap b = \emptyset$. Since $a \notin J_b$, $J - \{a\} \supseteq J_b$, so $\mathcal{F}$ is free. □
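The identity $J_{a_1} \cap J_{a_2} = J_{a_1 \cup a_2}$ used in the proof of Lemma 5.7 can be confirmed exhaustively on a small example. This is a sketch under an explicit assumption: a four-element set `universe` stands in for $V(X)$, and the helper names are hypothetical. A finite stand-in cannot exhibit freeness, because the whole universe belongs to every $J_b$; only the intersection identity is checked.

```python
from itertools import combinations

def nonempty_finite_subsets(universe):
    """All nonempty subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]

def J_sub(a, J):
    """The set J_a = {b in J : a is a subset of b}."""
    return {b for b in J if a <= b}

universe = {1, 2, 3, 4}          # hypothetical finite stand-in for V(X)
J = nonempty_finite_subsets(universe)

# Check J_{a1} ∩ J_{a2} == J_{a1 ∪ a2} for every pair a1, a2 in J.
identity_holds = all(
    J_sub(a1, J) & J_sub(a2, J) == J_sub(a1 | a2, J)
    for a1 in J
    for a2 in J
)
```

The identity is what makes the sets $J_a$ closed under finite intersection, so the family of their supersets is a filter.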

Now let $\mathcal{V}$ be an ultrafilter on $J$ with $\mathcal{V} \supseteq \mathcal{F}$ (such an ultrafilter exists by Theorem A.5 of the Appendix).


5.8 Theorem If $V({}^*X)$ is constructed from $V(X)$ using $\mathcal{V}$ and $J$ then it is an enlargement of $V(X)$.

Proof: Let $A$ be a set in $V(X)$. We define a map $\Gamma: J \to \mathcal{P}_F(A)$ by $\Gamma_a = a \cap A$, and let $B = M([\Gamma])$. Then $B \in {}^*\mathcal{P}_F(A)$. If $x \in A$ then $J_{\{x\}} = \{a \in J : x \in a\}$, so $\{a \in J : x \in a \cap A\} \in \mathcal{V}$. Thus $[\bar{x}] \in_{\mathcal{V}} [\Gamma]$ and so ${}^*x \in B$. □

Robinson's original definition of enlargement (see Theorem 5.10 below) made use of the notion of concurrent relation and was the cornerstone of his development of nonstandard analysis.

5.9 Definition A binary relation $P$ is concurrent (finitely satisfiable) on $A \subseteq \mathrm{dom}\, P$ if for each finite set $\{x_1, \ldots, x_n\}$ in $A$ there is a $y \in \mathrm{range}\, P$ so that $\langle x_i, y\rangle \in P$, $1 \leq i \leq n$. $P$ is concurrent if it is concurrent on $\mathrm{dom}\, P$.

Examples of concurrent relations are the relation $\leq$ in $N$ and $\subseteq$ in $\mathcal{P}_F(N)$.

5.10 Theorem The following are equivalent:

(i) $V({}^*X)$ is an enlargement of $V(X)$.
(ii) For each concurrent relation $P \in V(X)$ there is an element $b \in \mathrm{range}\, {}^*P$ so that $\langle {}^*x, b\rangle \in {}^*P$ for all $x \in \mathrm{dom}\, P$.

Proof: (i) ⇒ (ii): Let $B \in {}^*\mathcal{P}_F(\mathrm{dom}\, P)$ be such that, for each $x \in \mathrm{dom}\, P$, ${}^*x \in B$. Since the sentence

$$(\forall w \in \mathcal{P}_F(\mathrm{dom}\, P))(\exists y \in \mathrm{range}\, P)(\forall x \in w)[\langle x, y\rangle \in P]$$

is true in $V(X)$ by concurrence of $P$, its *-transform is true in $V({}^*X)$. Thus there exists an element $b \in \mathrm{range}\, {}^*P$ so that $\langle z, b\rangle \in {}^*P$ for each $z \in B$, and in particular for each ${}^*x$ with $x \in \mathrm{dom}\, P$. (ii) ⇒ (i): Exercise. □
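Definition 5.9 can be made concrete for the relation $\leq$ on $N$: any finite set of naturals has a common upper bound, while no single standard natural bounds all of $N$. The sketch below only illustrates that finite-satisfiability pattern; the helper names and the sample list are hypothetical choices for this example, not notation from the text.

```python
def witness_leq(xs):
    """For the relation <= on N, max(xs) is related to every x in the finite set xs."""
    return max(xs)

def is_witness(relation, xs, y):
    """True when (x, y) lies in the relation for every x in xs."""
    return all(relation(x, y) for x in xs)

def leq(x, y):
    return x <= y

sample = [3, 17, 5, 42]          # hypothetical finite subset of N
y = witness_leq(sample)
ok_leq = is_witness(leq, sample, y)

# No standard y works for all of N at once: y + 1 already fails.
no_global = not leq(y + 1, y)
```

This is why the element $b$ of Theorem 5.10(ii) for the relation $\leq$ must be an infinite natural number: it has to dominate every standard $x$ simultaneously.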

5.11 Corollary If $Y \in V(X)$ contains an infinite number of entities and $V({}^*X)$ is an enlargement, then ${}^*Y$ contains entities which are not standard. In particular, if $A \subseteq X$ is infinite then ${}^*A$ properly contains $A$.

Proof: The relation $P$ on $Y \times Y$ defined by "$\langle a, b\rangle \in P$ iff $a \neq b$" is concurrent since $Y$ is infinite. By 5.10(ii) there is a $b \in {}^*Y$ such that $b \neq {}^*x$ for all $x \in Y$. □

Corollary 5.11 gives another proof of the existence, in an enlargement $V({}^*R)$ of $V(R)$, of non-standard numbers, but it holds in much more general situations.

5.12 Definition A set $\mathcal{S}$ of subsets of an entity $A \in V(X)$ is called exhausting if, for each finite subset $F \subseteq A$, there is an $S \in \mathcal{S}$ with $F \subseteq S$.



5.13 Proposition If $\mathcal{S}$ is an exhausting set of subsets of $A \in V(X)$ and $V({}^*X)$ is an enlargement, then there is a set $C \in {}^*\mathcal{S}$ containing all the standard entities in ${}^*A$.

Proof: Let $B$ be a hyperfinite subset of ${}^*A$ such that ${}^*a \in B$ for each $a \in A$. Then, by transfer of the exhausting property, there is a $C \in {}^*\mathcal{S}$ with $B \subseteq C$. □

In spite of its simplicity, Proposition 5.13 turns out to be a very powerful tool in nonstandard analysis. The typical application runs as follows. Suppose $A$ is an infinite set with some additional mathematical structure; for example, $A$ could be an infinite graph, or a Hilbert space. Suppose further that $A$ can be exhausted by a family $\mathcal{S}$ of substructures (finite subgraphs, finite-dimensional inner-product spaces, etc.) so that for each $S \in \mathcal{S}$ a certain result can be proved. One wants to establish a corresponding result for $A$. Using Proposition 5.13, we can find a set $C \in {}^*\mathcal{S}$ containing all of the standard elements in ${}^*A$, and by transfer the *-transform of the given result is true for $C$. The problem then is to show how the validity of the *-transform of the result on $C$ induces the validity of the result on $A$. This last step can be quite difficult but is often easier than proving the result by standard methods. This method of proof was the basis of the first successful attack, by Bernstein and Robinson [7], on an invariant subspace problem in Hilbert space proposed by Smith and Halmos.

We illustrate the technique by proving a result in infinite graph theory due to de Bruijn and Erdős. (See also the related paper by Luxemburg [35].) The application indicates how nonstandard analysis is applicable in areas other than analysis. A graph $\langle A, E\rangle$ consists of a set $A$ of vertices and a binary relation $E$ on $A \times A$ which is symmetric (i.e., $\langle x, y\rangle \in E$ implies $\langle y, x\rangle \in E$). If $\langle x, y\rangle \in E$ we say that $x$ and $y$ are connected by an edge. $\langle A, E\rangle$ is infinite if $A$ is infinite. $\langle A, E\rangle$ is $k$-colorable if there exists a map $f: A \to \{1, 2, \ldots, k\}$ (the set of "colors") such that if $\langle a, b\rangle \in E$ then $f(a) \neq f(b)$, i.e., no two vertices which are connected by an edge are given the same color. If $B \subseteq A$ then the subgraph $\langle B, E|B\rangle$ is defined by "$\langle x, y\rangle \in E|B$ iff $x, y \in B$ and $\langle x, y\rangle \in E$"; i.e., $B$ inherits its edges from $E$.

5.14 Theorem (De Bruijn-Erdős [13]) If each finite subgraph of an infinite graph $\langle A, E\rangle$ is $k$-colorable, then $\langle A, E\rangle$ is $k$-colorable.

Proof: We work in the superstructure $V(A \cup N)$. Let $\mathcal{S}$ denote the set of all finite subsets of $A$ (obviously exhausting). For each $F \in \mathcal{S}$ the graph $\langle F, E|F\rangle$ is $k$-colorable, so the following is true in $V(A \cup N)$:

$$(\forall F \in \mathcal{S})(\exists f_F: F \to \{1, 2, \ldots, k\})(\forall x, y \in F)[\langle x, y\rangle \in E \to f_F(x) \neq f_F(y)]. \tag{5.1}$$


By the definition of enlargement, there exists a $B \in {}^*\mathcal{S}$ so that $B \supseteq A$. By transfer of (5.1) we see that there is a map (coloring) $f_B: B \to {}^*\{1, 2, \ldots, k\}$ ($= \{1, 2, \ldots, k\}$) so that if $\langle x, y\rangle \in {}^*E$ then $f_B(x) \neq f_B(y)$. We now restrict $f_B$ to $A$ to get a map $f: A \to \{1, 2, \ldots, k\}$. $f$ is a coloring since it inherits the property "$\langle x, y\rangle \in E$ implies $f(x) \neq f(y)$" from $f_B$ (check). □

Intuitively, the proof of 5.14 given above is obvious; we have simply covered $A$ by a *-finite and hence $k$-colorable graph $B$ and then restricted the coloring. A similar technique can be used to give easy proofs of more intricate theorems in infinite graph theory.

In closing this section we note that the results of Chapter I for ${}^*R$ remain valid for an enlargement of $V(R)$. To get more we need to consider the notions of internal and external entities in $V({}^*R)$; these are introduced in the next section.
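The final step of the proof of 5.14, restricting a coloring of a covering graph $B$ to the vertex set $A$, has an ordinary finite analogue that can be run directly. The sketch below is only that finite analogue, not the nonstandard argument: the 5-cycle, the brute-force search `find_coloring`, and the other names are hypothetical choices made for illustration.

```python
from itertools import product

def find_coloring(vertices, edges, k):
    """Brute-force search for a proper coloring with colors 1..k; returns a dict or None."""
    vertices = list(vertices)
    for colors in product(range(1, k + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[x] != f[y] for (x, y) in edges):
            return f
    return None

def restrict(f, subset):
    """Restriction of the map f to the given subset of its domain."""
    return {v: f[v] for v in subset}

def induced_edges(edges, subset):
    """Edges of the subgraph induced on subset (the relation E|B of the text)."""
    return [(x, y) for (x, y) in edges if x in subset and y in subset]

# Hypothetical example: a 5-cycle plays the role of the covering graph B.
B = [0, 1, 2, 3, 4]
E = [(i, (i + 1) % 5) for i in range(5)]
E = E + [(y, x) for (x, y) in E]          # make the edge relation symmetric

f_B = find_coloring(B, E, 3)              # a 3-coloring of B
A = [0, 1, 2]                             # sub-vertex set to restrict to
f = restrict(f_B, A)

# The restricted map is still a proper coloring of the induced subgraph.
proper_on_A = all(f[x] != f[y] for (x, y) in induced_edges(E, A))
two_colorable = find_coloring(B, E, 2) is not None
```

The restriction step needs no further search: any edge inside `A` is also an edge of `B`, so the inequality of colors carries over unchanged.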

Exercises II.5

1. Show that i f j is infinite then J = { n E N : n S j } E *@)F(N)- 9F(*N). 2. Show that in general *@&I) 2 @F(*A) whereas @(*A) 2 *@(A). 3. Check the translation into a sentence in Yxof the informal sentence in the proof of Theorem 5.3. 4. Show that the family 9in Lemma 5.7 is a filter given that Al, A , E .% =A , n A,€.%. 5. Prove that (ii) =s (i) in Theorem 5.10. 6. Show that if {O,:a E A} is an open covering of a set S c R but no finite subcollection covers S, then there is a y E *Ssuch that y ;74 x for all x E S. 7. Give another proof of the existence of infinite natural numbers in an enlargement V(*R)of V(R) by using the concurrent relation 0 in R so that @(b)holds for all b with Ibl s r in *R.

Proof: We prove the results in the case of the natural numbers $\mathbb{N}$ and leave the proofs for the real case (parentheses) of (i) and (ii) to the reader.

(i) Let $A = \{x \in {}^*\mathbb{N} : \neg\Phi(x) \text{ holds in } V({}^*X)\}$. Then $A$ is internal by Theorem 6.4 (internal definition principle) and $A \subseteq {}^*\mathbb{N}_\infty$ by hypothesis. If $A = \emptyset$ we are through. Otherwise $A$ is bounded below and hence has a least element $l$ by Lemma 6.8; we may take $k = l - 1$.

11.7 The Permanence Principle


(ii) Given the internal set $A$ defined as in (i), $A \subset \mathbb{N}$ and $A$ is bounded above and hence has a largest element $l$ by Lemma 6.8; we can take $k = l + 1$.

(iii) Let $A = \{x \in {}^*\mathbb{N} - \{0\} : \Phi(y) \text{ holds for all } y \text{ with } |y| \le 1/x\}$ and use (ii). □

7.2 Corollary (Spillover Principle) Let $A$ be an internal subset of ${}^*\mathbb{R}$.

(i) If $A$ contains all standard natural numbers then $A$ contains an infinite natural number.
(ii) If $A$ contains all infinite natural numbers then $A$ contains a standard natural number.
(iii) If $A$ contains the positive infinitesimals then $A$ contains a standard positive real number.

Theorem 7.1 can be used to give yet another proof of the fact that if $\langle s_n : n \in \mathbb{N}\rangle$ is a standard sequence and ${}^*s_n \simeq L \in \mathbb{R}$ for all infinite $n$, then $\lim s_n = L$ (see §II.3). Let $\varepsilon > 0$ be a fixed number in $\mathbb{R}$. Then $|{}^*s_n - L| < \varepsilon$ for all infinite $n$. Applying Theorem 7.1(ii) with $\Phi(b)$ the internal statement "$|{}^*s_b - L| < \varepsilon$", we see that there is a $k \in \mathbb{N}$ so that $|{}^*s_b - L| < \varepsilon$ for all $b \ge k$ in ${}^*\mathbb{N}$ and, in particular, $|s_b - L| < \varepsilon$ for all $b \ge k$ in $\mathbb{N}$ since ${}^*s_b = s_b$ if $b \in \mathbb{N}$. This establishes the desired result. The following result has many applications.

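7.3 Theorem (Robinson's Sequential Lemma) Let $\langle s_n : n \in {}^*\mathbb{N}\rangle$ be an internal ${}^*\mathbb{R}$-valued sequence such that $s_n \simeq 0$ for each $n \in \mathbb{N}$. Then there is an infinite natural number $\omega$ so that $s_n \simeq 0$ for all natural numbers $n \le \omega$.

Proof: The sequence $\langle n s_n : n \in {}^*\mathbb{N}\rangle$ is internal. Apply 7.1(i) with $\Phi(n)$ the internal formula "$|n s_n| \le 1$" to obtain an $\omega \in {}^*\mathbb{N}_\infty$ so that $|s_n| \le 1/n$ if $n \le \omega$. Thus $s_n \simeq 0$ if $n \in {}^*\mathbb{N}_\infty$ and $n \le \omega$, and so $s_n \simeq 0$ for all $n \le \omega$. □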

One should beware of assertions similar to Theorem 7.3 which sound plausible but are not true. For example, it is not true that if $s_n \simeq 0$ for all infinite $n$ then there exists a finite $k$ so that $s_n \simeq 0$ for all $n \ge k$, as the example $s_n = 1/n$ shows.

As an application of Theorem 7.3 we give another proof of the fact (Corollary I.13.5) that if the sequence $\langle f_n(x) : n \in \mathbb{N}\rangle$ of continuous real-valued functions on the interval $[a, b]$ converges uniformly then the limit $f(x)$ is continuous on $[a, b]$. Let $x_0 \in [a, b]$; we need to show that ${}^*f(x) \simeq f(x_0)$ if $x \simeq x_0$. But ${}^*f_n(x) \simeq {}^*f_n(x_0)$ for each $n \in \mathbb{N}$, and so ${}^*f_\omega(x) \simeq {}^*f_\omega(x_0)$ for some infinite $\omega$ by Theorem 7.3. But ${}^*f_\omega(x) \simeq {}^*f(x)$ for all $x \in {}^*[a, b]$ by Proposition 13.2 of Chapter I, and we are through.



Robinson [41] applied Theorem 7.3 in a more significant context in giving a nonstandard construction for Banach limits of bounded sequences. Suppose $\langle s_n : n \in \mathbb{N}\rangle$ is a bounded sequence, i.e., $|s_n| \le M$ for some real $M > 0$. We would like to attach a "limit" to $\langle s_n\rangle$ even though it might not converge in the usual sense. For example, the sequence $t_n = (s_1 + s_2 + \cdots + s_n)/n$ $(n = 1, 2, \ldots)$ of Cesàro means sometimes converges when $\langle s_n\rangle$ does not converge and defines a limit called the Cesàro sum of the sequence $\langle s_n\rangle$. Any generalized limit should satisfy the properties in the following definition.
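To make the remark about Cesàro means concrete, the sketch below computes $t_n$ exactly for the divergent bounded sequence $s_n = (-1)^n$. The function name `cesaro_means` and the choice of example are assumptions made for illustration only; exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def cesaro_means(s, n_terms):
    """Return [t_1, ..., t_N] with t_n = (s(1) + ... + s(n)) / n."""
    means = []
    partial_sum = Fraction(0)
    for n in range(1, n_terms + 1):
        partial_sum += s(n)
        means.append(partial_sum / n)
    return means


def s(n):
    """The bounded, non-convergent sequence s_n = (-1)^n."""
    return Fraction((-1) ** n)


t = cesaro_means(s, 1000)
```

Here the partial sums alternate between $-1$ and $0$, so $|t_n| \le 1/n$ and the Cesàro means tend to $0$ although $\langle s_n\rangle$ itself has no limit.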

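7.4 Definition Let $l_\infty$ denote the set of standard bounded sequences. A map $L\colon l_\infty \to \mathbb{R}$ is called a Banach limit if

(i) $L(a\sigma + b\tau) = aL(\sigma) + bL(\tau)$ $(a, b \in \mathbb{R},\ \sigma, \tau \in l_\infty)$,
(ii) if $\sigma = \langle s_n \mid n \in \mathbb{N}\rangle$ then $\liminf s_n \le L(\sigma) \le \limsup s_n$,
(iii) if $\sigma = \langle s_n \mid n \in \mathbb{N}\rangle$ and $\tau = \langle t_n \mid n \in \mathbb{N}\rangle$, where $t_n = s_{n+1}$, then $L(\sigma) = L(\tau)$.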
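As a small worked illustration of how the three conditions interact (an example added here, not taken from the text), consider $\sigma = \langle (-1)^n \rangle$. Its shift $\tau$ has terms $t_n = (-1)^{n+1}$, so $\tau = -\sigma$, and conditions (iii) and (i) already force the value of any Banach limit on $\sigma$:

```latex
\begin{align*}
L(\sigma) &= L(\tau)      && \text{by (iii), since } t_n = s_{n+1} \\
          &= L(-\sigma)   && \text{since } (-1)^{n+1} = -(-1)^n \\
          &= -L(\sigma)   && \text{by (i) with } a = -1,\ b = 0,
\end{align*}
\text{hence } L(\sigma) = 0, \text{ which is consistent with (ii): }
\liminf s_n = -1 \le 0 \le 1 = \limsup s_n .
```

The value agrees with the Cesàro sum of the same sequence.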

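To obtain a Banach limit, we let the summation operators $\sum_{i=1}^{\omega}$, for $\omega \in {}^*\mathbb{N}$, extend the standard summation operators $\sum_{i=1}^{n}$, $n \in \mathbb{N}$.

7.5 Theorem Fix $\omega \in {}^*\mathbb{N}_\infty$, and let $L(\sigma) = {}^{\circ}\bigl((1/\omega)\sum_{i=1}^{\omega} {}^*s_i\bigr)$ for each $\sigma = \langle s_n : n \in \mathbb{N}\rangle$ in $l_\infty$. Then $L$ is a Banach limit.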

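Proof: The mapping $L$ clearly satisfies 7.4(i). Given $\sigma = \langle s_n : n \in \mathbb{N}\rangle$, let $M = \sup\{|s_n| : n \in \mathbb{N}\}$. For a given $m \in \mathbb{N}$,

\[
\left| \frac{1}{\omega}\sum_{i=1}^{\omega} {}^*s_i - \frac{1}{\omega - m}\sum_{i=m+1}^{\omega} {}^*s_i \right| \le \frac{m}{\omega(\omega - m)}(\omega - m)M + \frac{1}{\omega}\,mM \simeq 0.
\]

By Theorem 7.3, there is an $m \in {}^*\mathbb{N}_\infty$ so that

\[
L(\sigma) \simeq \frac{1}{\omega - m}\sum_{i=m+1}^{\omega} {}^*s_i. \tag{7.1}
\]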

Fix $\varepsilon > 0$ in $\mathbb{R}$. We see immediately from Definition 8.16 of Chapter I that for each $n \in {}^*\mathbb{N}$ with $m + 1 \le n \le \omega$

\[
\liminf s_n - \varepsilon < {}^*s_n < \limsup s_n + \varepsilon.
\]


By the transfer of the usual properties of an average applied to (7.1),

\[
\liminf s_n - \varepsilon \le L(\sigma) \le \limsup s_n + \varepsilon.
\]

Since $\varepsilon$ is arbitrary, we obtain 7.4(ii). The rest of the proof is left to the reader. □

Exercises II.7

1. Prove the real case of (i) and (ii) of Theorem 7.1.
2. Assume that $A$ is an internal set in ${}^*\mathbb{N}$ such that, for some infinite integer $\gamma$, if $n$ is infinite and $n \le \gamma$ in ${}^*\mathbb{N}$ then $n \in A$. Show that, for some finite $m \in \mathbb{N}$, if $n \in \mathbb{N}$ and $m \le n$ then $n \in A$.
3. Prove that the mapping $L$ of Theorem 7.5 satisfies property (iii) of Definition 7.4, i.e., $L$ is invariant under finite translations.
4. Use the permanence principle to show that if $f$ is a standard function and $|{}^*f(x) - L| \simeq 0$ for all $x \simeq 1$ but $x \neq 1$, then $\lim_{x \to 1} f(x) = L$.
5. Let $\langle s_n : n \in {}^*\mathbb{N}\rangle$ be an internal ${}^*\mathbb{R}$-valued sequence, and suppose that there is an $M > 0$ in $\mathbb{R}$ so that $|s_n| \le M$ for all $n \in \mathbb{N}$. Show that there is an $\omega \in {}^*\mathbb{N}_\infty$ so that $|s_n| \le M$ for all $n \le \omega$ in ${}^*\mathbb{N}$.
6. Show that the assertion in Exercise 5 is not true if "$|s_n| \le M$" is replaced by "$s_n$ is finite."
7. A filter $\mathcal{F} \in V(X) - X$ has a countable subbasis if there is a countable family $\{A_i : i \in \mathbb{N}\}$ of entities in $\mathcal{F}$ so that for each $F \in \mathcal{F}$ there is a sequence $i_1, \ldots, i_n$ with $\bigcap A_{i_k}\ (1 \le k \le n) \subset F$. Suppose that $B$ is an internal set in $V({}^*X)$ and $\mathcal{F}$ has a countable subbasis. Show that if $B \cap {}^*F \neq \emptyset$ for all $F \in \mathcal{F}$ then $B \cap \mu(\mathcal{F}) \neq \emptyset$, where $\mu(\mathcal{F})$ is the intersection monad of $\mathcal{F}$ introduced in Exercise II.5.9.
8. Let $*\colon V(\mathbb{R}) \to V({}^*\mathbb{R})$ be comprehensive, and let $S = \{n_k : k \in \mathbb{N}\}$ be a countable set contained in ${}^*\mathbb{N}_\infty$.

(a) Show that $S$ has a lower bound in ${}^*\mathbb{N}_\infty$. [Hint: Regard $S$ as a sequence, i.e., a map $h\colon \mathbb{N} \to {}^*\mathbb{N}$ with $h(k) = n_k$. Use comprehensiveness to extend $h$ to an internal map $g\colon {}^*\mathbb{N} \to {}^*\mathbb{N}$ and apply the spillover principle to the set $A = \{m \in {}^*\mathbb{N} : g(k) > m \text{ for all } k < m\}$.] For decreasing sequences $n_k$ this was presented by DuBois-Reymond and proved in our context by Robinson.
(b) Use the transfer principle applied to $g$ to show that $S$ has an upper bound in ${}^*\mathbb{N}_\infty$.

9. Show that if $f$ is an internal function on an internal set $A$ in some superstructure $V({}^*X)$, and $f$ is finite-valued, then there exists a standard $n \in \mathbb{N}$ so that $|f(x)| \le n$ for all $x \in X$. Give an example to show that the assertion is not necessarily true if $f$ is not internal.



10. (*-Convergence and S-Convergence) An internal ${}^*\mathbb{R}$-valued sequence $\langle s_n : n \in {}^*\mathbb{N}\rangle$ is (i) *-convergent to $L \in {}^*\mathbb{R}$ if for each $\varepsilon > 0$ in ${}^*\mathbb{R}$ there is an $m \in {}^*\mathbb{N}$ so that $n > m$ implies $|s_n - L| < \varepsilon$, (ii) S-convergent to $L \in {}^*\mathbb{R}$ if $s_n \simeq L$ for all $n \in {}^*\mathbb{N}_\infty$.

(a) Show that if $s_n = {}^*t_n$, where $\langle t_n\rangle$ is a standard sequence converging to $L$, then $\langle s_n\rangle$ is *-convergent and S-convergent to $L$.
(b) Show that there are internal sequences which are *-convergent but not S-convergent and vice versa.
(c) Show that if $\langle s_n\rangle$ is S-convergent to a finite $L \in {}^*\mathbb{R}$ then there is an $m \in \mathbb{N}$ so that $s_n$ is finite for $n \ge m$ and the standard sequence $\langle {}^{\circ}s_n : n \in \mathbb{N}\rangle$ converges to ${}^{\circ}L$.
(d) Show that if $s_n = {}^*t_n$, where $\langle t_n\rangle$ is a standard sequence, then $\langle s_n\rangle$ is S-convergent to a finite $L$ iff there exists an infinite $\omega \in {}^*\mathbb{N}_\infty$ so that ${}^*s_n \simeq L$ for every $n \in {}^*\mathbb{N}_\infty$ with $n \le \omega$.

II.8 $\kappa$-Saturated Superstructures

Theorem 7.5 of the last section is a good example of a result in which a standard entity (a Banach limit) is obtained by performing a standardizing operation on an internal entity [in this case, taking the standard part of the internal sum $(1/\omega)\sum {}^*s_i\ (1 \le i \le \omega)$]. Similar applications of nonstandard analysis often occur in more complicated circumstances, and sometimes the internal structure in a given extension $V({}^*X)$ of a superstructure $V(X)$ is not rich enough to produce a desired result. A specific example arose from a result of Robinson, which was that if $X$ is a metric space and $B$ an internal subset of ${}^*X$ in an enlargement $V({}^*X)$, then the standard part of $B$ is closed (definitions and results will be presented in Chapter III). It was natural to ask whether the result was still true if $X$ was not metric. An example due to H. J. Keisler showed that the answer was negative if $V({}^*X)$ was only an enlargement of $V(X)$ [36, Example 3.4.3]. Luxemburg [36, Theorem 3.4.2] showed that the result does go through if $V({}^*X)$ is large enough to satisfy a generalization of the property of an enlargement, valid for internal concurrent binary relations on an appropriate set $A$ in $V({}^*X)$. $V({}^*X)$ is called $\kappa$-saturated, where $\kappa$ is a cardinal number, if this generalization holds for all sets $A$ in $V({}^*X)$ with the cardinality of $A$ less than $\kappa$ (Definition 8.1).

It is not necessary for the reader to be very knowledgeable about the theory of cardinal numbers for arbitrary sets in order to apply the theory. In a typical application we will begin with an internal concurrent binary relation on $A$; then we can assert that the results of the section will be applicable if $V({}^*X)$ is sufficiently large. Sufficiently large means that $V({}^*X)$ is $\kappa$-saturated,


where $\kappa > \operatorname{card} A$, but this is irrelevant in the application as long as we are assured that $\kappa$-saturated structures exist (Theorem 8.2).

Let $V(X)$ be a given superstructure and $*\colon V(X) \to V({}^*X)$ a monomorphism. We write $\operatorname{card} A$ to denote the cardinality, in the standard sense, of a set $A$.

8.1 Definition $V({}^*X)$ is $\kappa$-saturated if, for each internal binary relation $P \in V({}^*X)$ which is concurrent (Definition 5.9) on some (not necessarily internal) set $A$ in $V({}^*X)$ with $\operatorname{card} A < \kappa$, there exists an element $y \in \operatorname{range} P$ so that $(x, y) \in P$ for all $x \in A$.

H. J. Keisler [21,22] characterized those ultrafilters $\mathcal{U}$ such that the superstructure $V({}^*X)$ constructed from a given superstructure $V(X)$, using $\mathcal{U}$ as in §II.4, is $\kappa$-saturated; he called them $\kappa$-good ultrafilters. In [21] Keisler established the existence of $\kappa$-good ultrafilters on the assumption of the generalized continuum hypothesis. This assumption was subsequently removed by Kunen. Thus we have the following result.

8.2 Theorem Given any superstructure $V(X)$ and cardinal $\kappa$ there is a $\kappa$-saturated superstructure $V({}^*X)$ and a monomorphism $*\colon V(X) \to V({}^*X)$.

For the proof of this and related results the interested reader is referred to the papers mentioned above and also to the book by Stroyan and Luxemburg [46], where the desired structures are constructed as limits of ultrapowers. In any applications it will not be necessary to know the details of the proof. It follows from Theorem 5.10 that if $\kappa > \operatorname{card} V(X)$ then $V({}^*X)$ is an enlargement.

In applying Theorem 8.2 it is important to note that the set $A$ of Definition 8.1 need not be internal, although the binary relation $P$ must be internal and so the elements of $A$ are internal. For a successful application, however, we do need an upper bound on the cardinality of $A$ which is independent of the particular construction of $V({}^*X)$. For example, suppose that $P$ is the binary relation on ${}^*\mathbb{R} \times {}^*\mathcal{P}_F(\mathbb{R})$ defined by "$(x, B) \in P$ iff the *-finite set $B$ contains $x$." Then $P$ is concurrent on any subset $A \subseteq {}^*\mathbb{R}$. However, it is not possible to apply Theorem 8.2 and Definition 8.1 with $A = {}^*\mathbb{R}$; i.e., it is not possible to find a *-finite subset of ${}^*\mathbb{R}$ which contains all numbers of ${}^*\mathbb{R}$, no matter how large $\kappa$ is. For then ${}^*\mathbb{R}$ itself would be a *-finite set and hence, by transfer down, $\mathbb{R}$ would be finite. The error occurs in trying to apply the result to the set $A = {}^*\mathbb{R}$, whose cardinality depends on the construction of the extension $V({}^*\mathbb{R})$ and is not fixed in advance.

In [36] Luxemburg developed a general theory of monads in enlargements and $\kappa$-saturated extensions. In the following we present several of his important results.



8.3 Definition Let $*\colon V(X) \to V({}^*X)$ be a monomorphism, and let $\mathcal{A}$ be an entity in $V(X)$. The (intersection) monad $\mu(\mathcal{A})$ of $\mathcal{A}$ (with respect to $*$) is the set $\mu(\mathcal{A}) = \bigcap {}^*A\ (A \in \mathcal{A})$.

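Monads $\mu(\mathcal{A})$ are most important when $\mathcal{A}$ is a filter $\mathcal{F}$, i.e., when $\emptyset \notin \mathcal{F}$, $F$ and $G$ in $\mathcal{F}$ implies $F \cap G \in \mathcal{F}$, and $F \in \mathcal{F}$ and $G \supseteq F$ implies $G \in \mathcal{F}$. The next result generalizes the permanence principle.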

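8.4 Theorem (Luxemburg) Let $*\colon V(X) \to V({}^*X)$ be a monomorphism, and assume that $V({}^*X)$ is $\kappa$-saturated. Fix a filter $\mathcal{F} \in V(X)$ with $\operatorname{card} \mathcal{F} < \kappa$; then

(a) given an internal set $B \in V({}^*X)$, if ${}^*F \cap B \neq \emptyset$ for all $F \in \mathcal{F}$, then $\mu(\mathcal{F}) \cap B \neq \emptyset$,
(b) given an internal subset $\mathcal{A}$ of ${}^*\mathcal{F}$ such that every standard element of ${}^*\mathcal{F}$ is an element of $\mathcal{A}$, there exists an element $E \in \mathcal{A}$ such that $E \subset \mu(\mathcal{F})$,
(c) given an internal subset $\mathcal{A}$ of ${}^*\mathcal{F}$ such that $E \in {}^*\mathcal{F}$ and $E \subset \mu(\mathcal{F})$ implies $E \in \mathcal{A}$, there exists an element $F \in \mathcal{F}$ such that ${}^*F \in \mathcal{A}$.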

Proof: (a) Define an internal relation $P$, with domain ${}^*\mathcal{F}$ and range contained in $B$, by "$(F, x) \in P$ if $x \in B \cap F$." Then $P$ is concurrent on the collection of standard elements of ${}^*\mathcal{F}$, and this collection has the same cardinality as $\mathcal{F}$. Therefore there is a $y \in B$ so that $y \in B \cap {}^*F$ for each $F \in \mathcal{F}$, i.e., $y \in \mu(\mathcal{F}) \cap B$.

(b) Define an internal relation $P$, with domain ${}^*\mathcal{F}$ and range contained in $\mathcal{A}$, by "$(F, G) \in P$ if $G \in \mathcal{A}$ and $G \subseteq F$." Then $P$ is concurrent on the collection of standard elements of ${}^*\mathcal{F}$ (why?), so there is an $E \in \mathcal{A}$ such that $E \subseteq {}^*F$ for each $F \in \mathcal{F}$, i.e., $E \subset \mu(\mathcal{F})$.

(c) Let $\mathcal{A}$ satisfy the condition of (c). If $\mathcal{A}$ does not contain a standard element ${}^*F \in {}^*\mathcal{F}$ then the internal set ${}^*\mathcal{F} - \mathcal{A} \subset {}^*\mathcal{F}$ contains all standard elements of ${}^*\mathcal{F}$ and so by (b) there exists an element $E \in {}^*\mathcal{F} - \mathcal{A}$ with $E \subset \mu(\mathcal{F})$. But then $E \in \mathcal{A}$ by the hypothesis on $\mathcal{A}$ (contradiction). □

Several exercises in the preceding sections have dealt with situations in which, without saturation, the statement (a) of Theorem 8.4 may or may not hold. The results can be summarized as follows: The statement does not hold in general if $B$ is not internal (Exercise II.5.11), but does hold if $B$ is standard (Exercise II.5.10) or if $B$ is internal and $\mathcal{F}$ has a countable basis (Exercise II.7.7). (See Theorem 8.6.) An example due to H. J. Keisler (see Example 2.7.4 in [36]) shows that the statement need not hold if $B$ is internal but $V({}^*X)$ is only an ultrapower enlargement.

We note finally that an internal version of comprehensiveness holds in $\kappa$-saturated extensions.


8.5 Theorem Let $V({}^*X)$ be a $\kappa$-saturated extension of $V(X)$. Assume $C$ is a (not necessarily internal) set of entities in $V_n({}^*X)$ for some $n \in \mathbb{N}$ with $\operatorname{card} C < \kappa$, and $D$ is an internal set in $V({}^*X)$. For any mapping $\phi\colon C \to D$, there is an internal extension $\tilde{\phi}\colon \tilde{C} \to D$ of $\phi$ [i.e., $\tilde{\phi}$ is internal, $\tilde{C}$ contains $C$, and $\tilde{\phi}(a) = \phi(a)$ if $a \in C$]. If $C = \{{}^*a : a \in C_0\}$ we may take $\tilde{C} = {}^*C_0$.

Proof: Let $P$ be the binary relation "$(\psi_1, \psi_2) \in P$ iff $\psi_2$ is an extension of $\psi_1$" [i.e., $\operatorname{dom} \psi_2 \supseteq \operatorname{dom} \psi_1$ and $\psi_2(a) = \psi_1(a)$ if $a \in \operatorname{dom} \psi_1$] defined on the set of internal mappings with values in $D$. Let $A$ be the set of all internal mappings $f_x\colon \{x\} \to \phi(x)$, $x \in C$. That is, each element of $A$ is a set consisting of exactly one element from $\phi$. Then $\operatorname{card} A = \operatorname{card} C < \kappa$ and $P$ is concurrent on $A$ (check). Thus there exists an internal map $\tilde{\phi}$ with values in $D$ which extends each $f_x$, $x \in C$, and so $\operatorname{dom} \tilde{\phi} = \tilde{C} \supseteq C$ and $\tilde{\phi}(a) = \phi(a)$, $a \in C$. The rest is left as an exercise (Exercise 1). □

There is a converse of Theorem 8.5 when $\kappa = \aleph_1$, where $\aleph_1$ is the first cardinal number bigger than $\operatorname{card} \mathbb{N}$.

8.6 Theorem $V({}^*X)$ is a denumerably comprehensive extension of $V(X)$ (Definition 6.11) if and only if $V({}^*X)$ is $\aleph_1$-saturated.

Proof: Exercise. □

8.7 Corollary An extension $V({}^*X)$ constructed as in §II.4 is $\aleph_1$-saturated.

Proof: Follows from Theorems 6.13 and 8.6. □

Corollary 8.7 shows that assuming $\aleph_1$-saturation in an application of nonstandard analysis is not assuming very much. Later in this book we assume a stronger form of saturation (larger $\kappa$) only in the proof of Theorem 1.22 of Chapter III (which is not used afterward) and in the proofs of the last few results in §IV.3, where $\kappa$-saturation is used in a more significant way.

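Exercises II.8

1. Show that if the set $C$ in Theorem 8.5 has the form $\{{}^*a : a \in C_0\}$ then one may take $\tilde{C} = {}^*C_0$ in the conclusion of the theorem.
2. Prove Theorem 8.6.
3. Let $V({}^*\mathbb{R})$ be a $\kappa$-saturated extension of $V(\mathbb{R})$ with $\operatorname{card} \mathcal{P}(\mathbb{R}) < \kappa$. Let $B$ be an internal subset of ${}^*\mathbb{R}$ and $\operatorname{st}(B) = \{x \in \mathbb{R} : \text{there exists a } y \in B \text{ with } \operatorname{st}(y) = x\}$. Use Theorem 8.4(a) to show that $\operatorname{st}(B)$ is closed in $\mathbb{R}$.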



4. (Luxemburg [36]) Suppose that $V({}^*X)$ is a $\kappa$-saturated extension of $V(X)$ with $\kappa > \operatorname{card} V(X)$. Let $A \in V(X)$ contain an infinite number of elements. If $\mathcal{A} \subset {}^*(\mathcal{P}_F(A))$ is internal and moreover, $E \in \mathcal{A}$ for every *-finite subset $E \subset {}^*A$ with the property that $A = \{a \in V(X) : {}^*a \in E\}$, then there exists a finite subset $\{a_1, \ldots, a_n\} \subset A$ so that $\{{}^*a_1, \ldots, {}^*a_n\} \in \mathcal{A}$. (Hint: Apply Theorem 8.4 to the Fréchet filter of $A$.)

CHAPTER III

Nonstandard Theory of Topological Spaces

In Chapter I we showed how the notion of continuity for real-valued functions of a real variable could be characterized in terms of the nonstandard concept of nearness [$f$ is continuous at $x$ if ${}^*f(y) \simeq f(x)$ for all $y \simeq x$]. On the real line, nearness and the associated concept of monad are characterized in terms of the distance function, so that $x \simeq y$ if $|x - y| \simeq 0$. We also characterized open and closed sets in terms of monads. In this chapter we will show how these notions can be extended to more general settings.

In the standard development of topology one usually begins with a set $X$ possessing a collection $\mathcal{T}$ of (open) subsets satisfying the abstract analogues of conditions (i) and (ii) of Theorem 9.2 in Chapter I. The pair $(X, \mathcal{T})$ is called a topological space. The notions of continuity can then be defined just in terms of the open sets; i.e., a function $f\colon X \to Y$ is continuous if $f^{-1}(V)$ is open in $X$ for every set $V$ which is open in $Y$. In the nonstandard theory developed here, we will show how the collection $\mathcal{T}$ on $X$ can be used to characterize nearness and monad and so allow a simple development of the theory of topological spaces analogous to that of Chapter I.

One of the most useful results in the nonstandard development is a characterization of compact spaces (the analogues of closed bounded sets on the real line) due to Abraham Robinson. This development is presented in §III.2, with an elaboration in §III.7. Sections III.3, III.4, and III.5 are devoted to the nonstandard theory of metric, normed, and inner-product spaces, which are of central importance in much of analysis. In §III.6 we show how one may begin with a standard metric space $X$ and construct a (standard) metric space on the nonstandard set ${}^*X$, leading to the so-called nonstandard hull of a metric space. This construction plays a central role in some recent applications of nonstandard analysis to the theory of Banach spaces by Henson and Moore (see [16] for a



review). The section ends with a discussion of some results in the theory of function spaces, and includes a generalization of the Arzela-Ascoli theorem of Chapter I.

III.1 Basic Definitions and Results

A topological space is a pair $(X, \mathcal{T})$, where $X$ is a set and $\mathcal{T}$ is a family of subsets of $X$ satisfying the conditions in the following definition.

1.1 Definition A family $\mathcal{T}$ of subsets of $X$, called open sets, is a topology for $X$ if

(a) $\emptyset, X \in \mathcal{T}$;
(b) $U, V \in \mathcal{T}$ implies $U \cap V \in \mathcal{T}$ (and thus every finite intersection of open sets is open),
(c) $U_i \in \mathcal{T}$ $(i \in I)$ implies $\bigcup U_i\ (i \in I) \in \mathcal{T}$, i.e., every arbitrary union of open sets is open.

Closed sets are complements of open sets. Often we call $X$ rather than $(X, \mathcal{T})$ the topological space.
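For a finite set the three conditions of Definition 1.1 can be verified by brute force. The sketch below is illustrative only: the function names `is_topology` and `powerset` and the sample families are assumptions, not part of the text. It relies on the fact that for a finite family, closure under pairwise unions and intersections already gives closure under all (necessarily finite) unions and intersections; it does not check that the members of the family are subsets of $X$.

```python
from itertools import combinations


def is_topology(X, T):
    """Check conditions (a)-(c) of Definition 1.1 for a finite family T on X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    if frozenset() not in T or X not in T:          # condition (a)
        return False
    # conditions (b) and (c), pairwise; enough because T is finite
    return all((U & V) in T and (U | V) in T for U, V in combinations(T, 2))


def powerset(X):
    """All subsets of a finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


X = {1, 2, 3}
discrete = powerset(X)            # every subset is open
trivial = [set(), X]              # only the empty set and X are open
broken = [set(), {1}, {2}, X]     # fails (c): {1} | {2} = {1, 2} is missing
```

With these samples, `is_topology` accepts `discrete` and `trivial` and rejects `broken`.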

The usual family of open subsets of $\mathbb{R}$, defined in the proof of Proposition 9.1 of Chapter I, is a topology for $\mathbb{R}$ (Theorem I.9.2). We will presently see that there are many topologies for $\mathbb{R}$ as for most sets. With each topology we will associate corresponding notions of convergence and continuity, using only the open sets. In order to develop a nonstandard theory, we first generalize the notions of nearness and monad which were central to the work in Chapter I. We begin with a few basic definitions.

1.2 Definition Let $(X, \mathcal{T})$ be a topological space. A set $U$ is a neighborhood of a point $x \in X$ if $U$ contains an open set $V$ which contains $x$. The neighborhood system $\mathcal{N}_x$ of $x$ is the set of all neighborhoods of $x$. We denote the system of open neighborhoods of $x \in X$ by $\mathcal{T}_x$. A collection $\mathcal{B} \subseteq \mathcal{T}$ is a base for $\mathcal{T}$ if each set in $\mathcal{T}$ is a union of sets in $\mathcal{B}$ or, equivalently, if for each $x \in X$ and each $U \in \mathcal{T}_x$ there is a $V \in \mathcal{T}_x \cap \mathcal{B}$ with $V \subseteq U$. (For example, open intervals form a base for the usual open sets in $\mathbb{R}$.) A collection $\mathcal{W}$ is called a subbase for $\mathcal{T}$ if the collection of finite intersections of members of $\mathcal{W}$ is a base for $\mathcal{T}$. Similarly $\mathcal{B}_x \subseteq \mathcal{N}_x$ is a (neighborhood) base at $x$ if for each $U \in \mathcal{N}_x$ there is a $V \in \mathcal{B}_x$ with $V \subseteq U$; $\mathcal{W}_x \subseteq \mathcal{N}_x$ is a subbase at $x$ if the collection of finite intersections of members of $\mathcal{W}_x$ is a base at $x$. If $\mathcal{T}$ and $\mathcal{S}$ are topologies for $X$, then $\mathcal{T}$ is weaker than $\mathcal{S}$ (and $\mathcal{S}$ is stronger than $\mathcal{T}$) if $\mathcal{T} \subseteq \mathcal{S}$.

From now on we work in an enlargement $V({}^*S)$ of a superstructure $V(S)$, where $V(S)$ contains the standard space $X$ under consideration, so $\mathcal{T} \in V(S)$ as well. In this section we will not use the fact that if $x \in X$ then $x$ may contain elements. Therefore, we will write $x$ instead of ${}^*x$ for the nonstandard extension of $x$.

1.3 Definition The sets in ${}^*\mathcal{T}$ are called *-open subsets of ${}^*X$. The monad of $x \in X$ is the subset $m(x) = \bigcap {}^*U\ (U \in \mathcal{T}_x)$ of ${}^*X$. A point $y \in {}^*X$ is near $x \in X$, and $x$ is the standard part of $y$, if $y \in m(x)$; then we write $y \simeq x$ and $x = \operatorname{st}(y)$. The set of near-standard points is the set $\operatorname{ns}({}^*X) = \bigcup m(x)\ (x \in X)$. A point $y \in {}^*X$ is called remote if it is not near-standard. An easy exercise shows that $m(x) = \bigcap {}^*U\ (U \in \mathcal{N}_x)$.

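1.4 Proposition If $\mathcal{W}_x$ is a local subbase at $x$, then $m(x) = \bigcap {}^*U\ (U \in \mathcal{W}_x)$.

Proof: $\bigcap {}^*U\ (U \in \mathcal{W}_x) \supseteq \bigcap {}^*U\ (U \in \mathcal{N}_x)$ since $\mathcal{W}_x \subseteq \mathcal{N}_x$. On the other hand, for each $U \in \mathcal{N}_x$ there exist $V_i \in \mathcal{W}_x$ $(1 \le i \le n)$ with $\bigcap V_i\ (1 \le i \le n) \subset U$, and so $\bigcap {}^*V_i\ (1 \le i \le n) \subseteq {}^*U$ by transfer. Hence $\bigcap {}^*V\ (V \in \mathcal{W}_x) \subseteq \bigcap {}^*U\ (U \in \mathcal{N}_x)$. □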

1.5 Examples

1. Discrete topology. $(X, \mathcal{T})$ is discrete if $\{x\}$ is open for each $x \in X$. In this case $m(x) = \{x\}$ for each $x \in X$.
2. Trivial topology. $(X, \mathcal{T})$ is trivial if $\mathcal{T} = \{\emptyset, X\}$. In this case $m(x) = {}^*X$ for each $x \in X$.
3. Usual topology on $\mathbb{R}$. The open sets in $\mathbb{R}$ as defined in §I.9 constitute a topology. The monads as defined here and in Definition 6.4 of Chapter I are identical [where we assume that ${}^*\mathbb{R}$ and $V({}^*\mathbb{R})$ are obtained from the same ultrafilter]. This follows immediately from Proposition 1.4 since the set $\mathcal{B}_x$ of symmetric open intervals about $x$ forms a local base by the definition of open set in $\mathbb{R}$. A subbase for the topology is formed by intervals of the form $(-\infty, b)$, $(a, +\infty)$ with $a, b \in \mathbb{R}$.
4. Half-open interval topology on $\mathbb{R}$. Let $\mathcal{T}$ be the topology for $\mathbb{R}$ which has as base the set $\mathcal{B}$ of half-open intervals $[a, b) = \{x : a \le x < b\}$, where $a$ and $b$ are real. Here $m(x) = \{y \in {}^*\mathbb{R} : x \le y,\ x \simeq y\}$ (Exercise 1).



5. Finite complement topology. For simplicity let $X = \mathbb{N}$ (any infinite set would do), and let $\mathcal{T}$ be the collection consisting of the empty set and those subsets of $\mathbb{N}$ whose complements are finite. It is an easy standard exercise to show that $\mathcal{T}$ is a topology. Here $m(x) = \{x\} \cup {}^*\mathbb{N}_\infty$ (Exercise 1).
6. Product topology. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces. Then $X \times Y$ can be made into a topological space as follows: A set $W \subseteq X \times Y$ is open if to each $(x, y) \in W$ there correspond sets $U \in \mathcal{T}_x$, $V \in \mathcal{S}_y$ so that $U \times V \subseteq W$, i.e., products of open sets form a base for the topology (check that this defines a topology). The resulting topology is called the product topology and is denoted by $\mathcal{T} \times \mathcal{S}$. If $m_X$, $m_Y$, and $m$ denote monads in $(X, \mathcal{T})$, $(Y, \mathcal{S})$, and $(X \times Y, \mathcal{T} \times \mathcal{S})$, respectively, then $m((x, y)) = m_X(x) \times m_Y(y)$, $x \in X$, $y \in Y$ (Exercise 1).

The following facts should be noted in comparing the usual monads for $\mathbb{R}$ and monads in a general topological space $(X, \mathcal{T})$:

(a) The concept of nearness is derived from that of monad and not vice versa as in Definition 6.4 of Chapter I.
(b) We have defined monads only for standard points in ${}^*X$.
(c) Nearness is not in general an equivalence relation on ${}^*X$ [this is, of course, because of (b)].

The monad $m(x)$ always contains $x$. That $m(x)$ will in general contain points other than $x$ follows from the following basic lemma, the proof of which requires that $V({}^*S)$ be an enlargement.

1.6 Proposition For each $x \in X$ there is a *-open set $V \in {}^*\mathcal{T}_x$ with $V \subseteq m(x)$.

Proof: The binary relation $P$ on $\mathcal{T}_x \times \mathcal{T}_x$ defined by $P(U, V)$ if $V \subseteq U$ is concurrent. For if $U_1, \ldots, U_n \in \mathcal{T}_x$ then $V = U_1 \cap \cdots \cap U_n$ satisfies $P(U_i, V)$, $1 \le i \le n$. Since $V({}^*S)$ is an enlargement, Theorem 5.10 of Chapter II guarantees the existence of an element $V \in {}^*\mathcal{T}_x$, so that $V \subseteq {}^*U$ for all $U \in \mathcal{T}_x$ and hence $V \subseteq m(x)$. □

1.7 Proposition Let $A$ be a subset of $X$. Then

(i) $A$ is open iff $m(x) \subset {}^*A$ for each $x \in A$,
(ii) $A$ is closed iff $m(x) \cap {}^*A = \emptyset$ for each $x$ in the complement $A'$ of $A$.

Proof: (i) Suppose $A$ is open and let $x \in A$. By definition there exists an open set $U \in \mathcal{T}_x$ with $U \subseteq A$. By transfer $m(x) \subseteq {}^*U \subseteq {}^*A$.


Conversely, suppose $m(x) \subseteq {}^*A$ for $x \in A$. By Proposition 1.6 there exists a $V \in {}^*\mathcal{T}_x$ with $V \subset m(x) \subset {}^*A$. Thus the internal sentence $(\exists V \in {}^*\mathcal{T}_x)[V \subseteq {}^*A]$ is true and so, by downward transfer, there exists a set $V_x \in \mathcal{T}_x$ with $V_x \subseteq A$. Thus $A$ is open since $A = \bigcup V_x\ (x \in A)$.

(ii) This follows immediately from (i) and the definition of a closed set: $A$ is closed if $A'$ is open. □

1.8 Definition A point $x$ is an accumulation point of the set $A \subseteq X$ if every open neighborhood of $x$ contains points of $A$ other than $x$. We let $\hat{A}$ denote the set of accumulation points of $A$; the set $\bar{A} = A \cup \hat{A}$ is the closure of $A$. $A$ is dense in $B$ if $\bar{A} = B$.

1.9 Proposition A point $x$ is an accumulation point of $A$ iff $m(x)$ contains a point $y \in {}^*A$ different from $x$.

Proof: If $x$ is an accumulation point of $A$ then the sentence $(\forall U \in \mathcal{T}_x)(\exists y \in U \cap A)[y \neq x]$ is true for $V(X)$, and hence, by transfer, each $U \in {}^*\mathcal{T}_x$ contains a point $y \neq x$ in ${}^*A$. This is true, in particular, of the *-open set $V$ of Proposition 1.6, and so there is a $y \in m(x) \cap {}^*A$ with $y \neq x$. Conversely, suppose that $m(x)$ contains a point $y \neq x$ in ${}^*A$. Then, for a fixed $U \in \mathcal{T}_x$, ${}^*U$ contains a point $y \neq x$ in ${}^*A$. Thus the internal sentence $(\exists y \in {}^*(U \cap A))[y \neq x]$ is true, and it follows by downward transfer that there exists a $y \in U \cap A$ with $y \neq x$. □

1.10 Proposition The closure $\bar{A}$ of $A \subseteq X$ consists of those $x \in X$ for which $m(x) \cap {}^*A \neq \emptyset$. The closure of $A$ is the smallest closed set containing $A$. Thus $A = \bar{A}$ if $A$ is closed.

Proof: Exercise. 0

Let $\mathcal{T}$ and $\mathcal{S}$ be two topologies for a set $X$ with associated monads $m_{\mathcal{T}}(x)$ and $m_{\mathcal{S}}(x)$ $(x \in X)$. An easy exercise shows that $\mathcal{T}$ is weaker than $\mathcal{S}$ iff $m_{\mathcal{T}}(x) \supseteq m_{\mathcal{S}}(x)$ for each $x \in X$.

We noted in §I.6 that if $x$ and $y$ are distinct standard real numbers then $m(x) \cap m(y)$ is empty. Therefore, we say that $\mathbb{R}$ is a Hausdorff space. This property is not true in general for topological spaces. Properties of spaces which deal with the relationship between monads of distinct points are called separation properties. Some of the more important separation properties are presented next; the most important of these is the Hausdorff property.


1.11 Definition The space $(X, \mathcal{T})$ is

(a) $T_0$ if, for each pair $x, y$ of distinct points in $X$, there is an open neighborhood of one not containing the other,
(b) $T_1$ if $\{x\}$ is closed for each $x \in X$,
(c) Hausdorff (or $T_2$) if whenever $x \neq y$ in $X$ there are disjoint open neighborhoods $U$ and $V$ of $x$ and $y$.

There are more separation properties (e.g., regularity and normality) which we will consider in the exercises.

1.12 Proposition The topological space $(X, \mathcal{T})$ is

(a) $T_0$ iff whenever $x, y \in X$ and both $x \in m(y)$ and $y \in m(x)$ then $x = y$,
(b) $T_1$ iff whenever $x, y \in X$ and $x \in m(y)$ then $x = y$,
(c) Hausdorff iff monads of distinct points in $X$ are disjoint.

Proof: We prove (c) and leave the other proofs as exercises. Suppose $(X, \mathcal{T})$ is Hausdorff and $x, y \in X$ are distinct. Then there exist $U \in \mathcal{T}_x$, $V \in \mathcal{T}_y$ with $U \cap V = \emptyset$. Therefore, ${}^*U \cap {}^*V = \emptyset$, and since $m(x) \subseteq {}^*U$ and $m(y) \subseteq {}^*V$, we have $m(x) \cap m(y) = \emptyset$. Conversely, if $m(x) \cap m(y) = \emptyset$ then by Proposition 1.6 there exist $U \in {}^*\mathcal{T}_x$, $V \in {}^*\mathcal{T}_y$ with $U \cap V = \emptyset$. By downward transfer of the appropriate sentence (check), there exist $U \in \mathcal{T}_x$, $V \in \mathcal{T}_y$ with $U \cap V = \emptyset$. □
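Proposition 1.12 can be tried out on finite spaces, where the situation is elementary: for a finite set $X$ one has ${}^*U = U$ for every $U \subseteq X$, so the monad of $x$ is just the intersection of the open sets containing $x$. The sketch below rests on that assumption; the function names and the three sample spaces are illustrative and do not come from the text.

```python
def monad(X, T, x):
    """Intersection of all open sets containing x (the monad when X is finite)."""
    m = set(X)
    for U in T:
        if x in U:
            m &= U
    return frozenset(m)


def is_T0(X, T):
    """Proposition 1.12(a): x in m(y) and y in m(x) only when x == y."""
    return all(x == y or not (x in monad(X, T, y) and y in monad(X, T, x))
               for x in X for y in X)


def is_T1(X, T):
    """Proposition 1.12(b): x in m(y) only when x == y."""
    return all(x == y or x not in monad(X, T, y) for x in X for y in X)


def is_hausdorff(X, T):
    """Proposition 1.12(c): monads of distinct points are disjoint."""
    return all(x == y or not (monad(X, T, x) & monad(X, T, y))
               for x in X for y in X)


def singletons_closed(X, T):
    """Definition 1.11(b), checked directly: X - {x} is open for every x."""
    opens = {frozenset(U) for U in T}
    return all(frozenset(X) - {x} in opens for x in X)


X = {0, 1}
sierpinski = [set(), {1}, {0, 1}]        # T0 but not T1
discrete = [set(), {0}, {1}, {0, 1}]     # Hausdorff
trivial = [set(), {0, 1}]                # not even T0 (two points)
```

On these samples the monad test for $T_1$ agrees with the direct check that singletons are closed, as the proposition predicts. The finite case is degenerate (a finite $T_1$ space is discrete), so this illustrates the statements rather than the nonstandard proof.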

If $(X, \mathcal{T})$ is Hausdorff then there is only one standard point $\operatorname{st}(y)$ associated with each $y \in \operatorname{ns}({}^*X)$. It is defined by $\operatorname{st}(y) = x$, $y \in m(x)$. Thus for Hausdorff spaces we have a well-defined map $\operatorname{st}\colon \operatorname{ns}({}^*X) \to X$ called the standard part map, which has many applications (e.g., see §IV.3 below).

1.13 Examples

1. The discrete topology is Hausdorff, and every subset is both open and closed.
2. The trivial topology of a space with two or more points is not $T_0$.
3. The finite complement topology on $\mathbb{N}$ is $T_1$ but not Hausdorff by Proposition 1.12. Also a set (other than $\mathbb{N}$ itself) is closed in the finite complement topology iff it is finite. For if $A$ is finite then ${}^*A = A$, and if $x \in A'$ then $m(x) \cap {}^*A = (\{x\} \cup {}^*\mathbb{N}_\infty) \cap A = \emptyset$. On the other hand, if $A$ is infinite then ${}^*A \cap {}^*\mathbb{N}_\infty \neq \emptyset$ by 6.11 of Chapter I, and $m(x) \cap {}^*A \neq \emptyset$ for any $x$.

So far we have used a topology $\mathcal{T}$ to define associated monads $m(x)$, $x \in X$. Conversely, it is possible to start with a collection $k(x)$, $x \in X$, of subsets of ${}^*X$ with $x \in k(x)$, and define an associated family $\mathcal{T}$ as follows: $U \in \mathcal{T}$ if $k(x) \subseteq {}^*U$ for each $x \in U$. An easy exercise shows that $\mathcal{T}$ is a topology. If $\hat{k}(x)$ $(x \in X)$ are the monads of $\mathcal{T}$ then clearly $k(x) \subseteq \hat{k}(x)$ for all $x \in X$, but set equality does not necessarily hold (see Exercise 6). The sets $k(x)$ will be called pseudomonads; the concept will be used in §III.8.

Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces with monads $m(x)$ $(x \in X)$ and $\bar{m}(y)$ $(y \in Y)$, respectively. To discuss continuity of mappings $f\colon X \to Y$ we work in an enlargement containing ${}^*X$ and ${}^*Y$ and thus ${}^*\mathcal{T}$, ${}^*\mathcal{S}$ and all mappings ${}^*f\colon {}^*X \to {}^*Y$, etc. The symbol $\simeq$ will be used for the relation of nearness in both $({}^*X, {}^*\mathcal{T})$ and $({}^*Y, {}^*\mathcal{S})$; the context should clear up any ambiguities.

1.14 Definition The map $f\colon X \to Y$ is continuous at $x \in X$ if to each $V \in \mathcal{S}_{f(x)}$ there corresponds a $U \in \mathcal{T}_x$ with $f[U] \subset V$. $f$ is continuous on $X$ if it is continuous at each $x \in X$. A one-to-one map $f$ from $X$ onto $Y$ is a homeomorphism if $f$ and $f^{-1}$ are continuous.

1.15 Proposition The map $f\colon X \to Y$ is continuous at $x \in X$ iff ${}^*f(y) \simeq f(x)$ for each $y \simeq x$. That is, ${}^*f[m(x)] \subseteq \bar{m}(f(x))$.

Proof: Suppose $f$ is continuous at $x \in X$, and let $V$ be any open neighborhood of $f(x)$. Find a corresponding $U \in \mathcal{T}_x$ from the definition of continuity so that $f[U] \subseteq V$. If $y \simeq x$ then $y \in {}^*U$ by 1.7(i), so ${}^*f(y) \in {}^*V$ since ${}^*f[{}^*U] \subset {}^*V$ by transfer. Thus ${}^*f(y) \in {}^*V$ for each $V \in \mathcal{S}_{f(x)}$, i.e., ${}^*f(y) \simeq f(x)$. The converse is left to the reader. □

Proposition 1.15 shows that for real-valued functions of a real variable, Definition 1.14 is equivalent to the $\varepsilon$-$\delta$ definition of continuity.

1.16 Theorem The map $f: X \to Y$ is continuous on $X$ iff $f^{-1}[V] \in \mathcal{T}$ for each $V \in \mathcal{S}$.

Proof: Fix $x \in X$, suppose $f$ is continuous, and let $V \in \mathcal{S}_{f(x)}$. Then $^*f[m(x)] \subseteq \bar m(f(x)) \subseteq {}^*V$ by continuity at $x$ and the fact that $V$ is open. It follows that $m(x) \subseteq {}^*f^{-1}[^*V] = {}^*(f^{-1}[V])$ (check), and so $f^{-1}[V]$ is open by Proposition 1.7(i). The converse is left to the reader. □
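On a finite space the criterion of Theorem 1.16 can be checked by direct enumeration. The following is a minimal sketch under that assumption; the function name `is_continuous`, the representation of a map as a dictionary, and the two-point example space are all illustrative choices, not part of the text.

```python
def is_continuous(f, x_open_sets, y_open_sets):
    """Check the criterion of Theorem 1.16 on finite spaces.

    f is a dict from the points of X to the points of Y; the two other
    arguments list the open sets of X and of Y. The map is continuous
    iff the preimage of every open set of Y is open in X.
    """
    x_opens = {frozenset(U) for U in x_open_sets}
    for V in y_open_sets:
        preimage = frozenset(x for x in f if f[x] in V)
        if preimage not in x_opens:
            return False
    return True


# Hypothetical example: the two-point space {0, 1} whose open sets
# are the empty set, {1}, and {0, 1}.
sierpinski = [set(), {1}, {0, 1}]
identity = {0: 0, 1: 1}
swap = {0: 1, 1: 0}

print(is_continuous(identity, sierpinski, sierpinski))
print(is_continuous(swap, sierpinski, sierpinski))
```

The swap map fails the test because the preimage of the open set {1} is {0}, which is not open in this example space.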

The reader will have noticed that the proofs of the results 1.6-1.16 are considerably simpler than the proofs of the corresponding results in §§I.9 and I.10. This is mainly because the richer language of Chapter II allows us to avoid proofs by contradiction which use Skolem functions.

If $Y$ is a subset of the topological space $(X, \mathcal{T})$, then $\mathcal{T}$ induces a topology called the relative topology $\mathcal{T}_Y$ on $Y$. A subset $U \subseteq Y$ belongs to $\mathcal{T}_Y$ iff $U = V \cap Y$ for some $V \in \mathcal{T}$. It is easy to see that the monads in $(Y, \mathcal{T}_Y)$ are given by $\hat m(y) = m(y) \cap {}^*Y$, $y \in Y$, where $m(y)$ is the monad of $y$ in $(X, \mathcal{T})$ (check). The characterizations of relative openness, relative closedness, continuity, etc., are the obvious modifications of those we have just proved with $\hat m$ replacing $m$.

Next we define the important notion of a weak topology. Suppose that $X$ is a set and $(X_i, \mathcal{T}_i)$ ($i \in I$) is a family of topological spaces. We work in an enlargement containing $^*X$ and $^*Y$ where $Y = \bigcup X_i$ ($i \in I$). We let $m_i(y)$ ($i \in I$, $y \in X_i$) denote the monads of $y$ in $(X_i, \mathcal{T}_i)$. Let $\{\varphi_i: X \to X_i : i \in I\}$ be a family of mappings.

1.17 Definition The weak topology $\mathcal{T}$ on $X$ for the family $\{\varphi_i : i \in I\}$ is the topology generated from the subbase $\mathcal{S}$ consisting of all inverse images of the form $\varphi_i^{-1}[U]$, $U \in \mathcal{T}_i$; i.e., $\mathcal{T}$ consists of all sets obtained by taking arbitrary unions of finite intersections of sets in $\mathcal{S}$.
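For a finite set $X$ the construction in Definition 1.17 can be carried out mechanically, since arbitrary unions reduce to finite ones. The sketch below assumes finite spaces throughout; the names `close_under` and `weak_topology`, and the two example maps, are hypothetical and serve only to illustrate the definition.

```python
def close_under(sets, op):
    """Close a finite family of frozensets under a binary operation."""
    family = set(sets)
    changed = True
    while changed:
        changed = False
        for a in list(family):
            for b in list(family):
                c = op(a, b)
                if c not in family:
                    family.add(c)
                    changed = True
    return family


def weak_topology(X, maps):
    """Weak topology on a finite set X for a family of maps.

    maps is a list of pairs (phi, opens): phi is a dict from X to X_i
    and opens lists the open sets of X_i.
    """
    X = frozenset(X)
    # Subbase: all inverse images of open sets under the given maps.
    subbase = {frozenset(x for x in X if phi[x] in V)
               for phi, opens in maps for V in opens}
    # Base: finite intersections (X itself is the empty intersection).
    base = close_under(subbase | {X}, frozenset.intersection)
    # Topology: unions of base sets (the empty set is the empty union).
    return close_under(base | {frozenset()}, frozenset.union)


# Hypothetical example: two maps from {0, 1, 2, 3} into the two-point
# space with open sets {}, {1}, {0, 1}.
sierpinski = [set(), {1}, {0, 1}]
parity = {x: x % 2 for x in range(4)}
half = {x: x // 2 for x in range(4)}
topology = weak_topology(range(4), [(parity, sierpinski), (half, sierpinski)])
print(sorted(sorted(U) for U in topology))
```

In this example the subbase consists of the empty set, {1, 3}, {2, 3}, and the whole set, so the resulting topology also contains {3} (an intersection) and {1, 2, 3} (a union).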

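The weak topology is the weakest topology which makes all the maps $\varphi_i$ continuous (Exercise 8).

1.18 Proposition If $m(x)$ ($x \in X$) is a monad of the weak topology, then
$$m(x) = \{ y \in {}^*X : {}^*\varphi_i(y) \in m_i(\varphi_i(x)) \text{ for all } i \in I \}.$$

Proof: Let the right-hand side of the equation be denoted by $k(x)$. If $x \in X$ then for $i \in I$ the sets $\varphi_i^{-1}[U]$, $U \in \mathcal{T}^i_{\varphi_i(x)}$, are open neighborhoods of $x$, so
$$m(x) \subseteq \bigcap_{i \in I} \Bigl\{ y \in {}^*X : y \in \bigcap \{ {}^*(\varphi_i^{-1}[U]) : U \in \mathcal{T}^i_{\varphi_i(x)} \} \Bigr\} = \bigcap_{i \in I} \Bigl\{ y \in {}^*X : y \in {}^*\varphi_i^{-1}\Bigl[ \bigcap \{ {}^*U : U \in \mathcal{T}^i_{\varphi_i(x)} \} \Bigr] \Bigr\} = k(x).$$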

On the other hand, if $V \in \mathcal{T}_x$ is a neighborhood in the base of $\mathcal{T}$ generated by the subbase $\mathcal{S}$, then $V$ is a finite intersection of sets of the form $\varphi_i^{-1}[U_i]$, $U_i \in \mathcal{T}^i_{\varphi_i(x)}$. Clearly $k(x) \subseteq {}^*\varphi_i^{-1}[^*U_i]$ for each $U_i \in \mathcal{T}^i_{\varphi_i(x)}$ and so $k(x) \subseteq {}^*V$. It follows that $k(x) \subseteq m(x)$, and we are through. □

1.19 Definition: The Product Topology Let $(X_i, \mathcal{T}_i)$ ($i \in I$) be a family of topological spaces. Then the product $X = \prod X_i$ ($i \in I$) is defined to be the set of all mappings $x$ on $I$ with $x(i) \in X_i$ for $i \in I$. The product topology $\mathcal{T}$ for $X$ is the weak topology generated by the mappings $\varphi_i: X \to X_i$ defined by $\varphi_i(x) = x(i)$.

To see what $^*X$ is, note that each $x \in X$ is of the form $x: I \to \bigcup X_i$ ($i \in I$) with $x(i) \in X_i$. The *-transform of the collection $\{X_i : i \in I\}$ includes new sets $X_i$ for $i \in {}^*I - I$. Thus, by transfer, each $x \in {}^*X$ is of the form $x: {}^*I \to {}^*[\bigcup X_i\ (i \in I)]$ with $x(^*i) \in {}^*X_i$ if $i \in I$, whereas if $i$ is not standard, then $x(i) \in X_i$, but $X_i$ need not be the extension of a standard set. If $x \in X$, and $m(x)$ denotes the monad in $\mathcal{T}$, then by Proposition 1.18
$$m(x) = \{ y \in {}^*X : y(i) \in m_i(x(i)) \text{ for all standard } i \text{ in } {}^*I \}.$$
That is, the monad is determined by just the standard indices in $^*I$.

1.20 Theorem The topological product of Hausdorff spaces is Hausdorff.

Proof: Let $X = \prod X_i$, where the $(X_i, \mathcal{T}_i)$ are Hausdorff with monads $m_i(x)$. Let $\mathcal{T}$ be the product topology with monad $m(x)$. If $x, y \in X$ with $m(x) \cap m(y) \neq \emptyset$, let $z \in m(x) \cap m(y)$. Then $z(i) \in m_i(x(i)) \cap m_i(y(i))$ for each $i \in I$, and so $x(i) = y(i)$ for each $i \in I$ since $(X_i, \mathcal{T}_i)$ is Hausdorff, i.e., $x = y$. □

We end this section with a result which is valid under the assumption that $X$ is in $V(S)$ for some $S$ and $V(^*S)$ is $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}$. This result was mentioned at the beginning of §II.8 as a good example of the use of saturation in nonstandard analysis. It will be referred to again in §IV.3.

1.21 Definition Let $(X, \mathcal{T})$ be a topological space with monads $m(x)$, $x \in X$. The standard part $\operatorname{st}(A)$ of a set $A \subseteq {}^*X$ is the set of all $x \in X$ for which there exists a $y \in A$ with $y \in m(x)$.

*1.22 Theorem Assume $X \in V(S)$ and $V(^*S)$ is $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}$. If $B \subseteq {}^*X$ is internal then $\operatorname{st}(B)$ is closed.

Proof: Suppose $z$ is an accumulation point of $^\circ B = \operatorname{st}(B)$. If $U \in \mathcal{T}_z$ then there exists a point $x \in {}^\circ B$ with $x \in U$. Since $x \in {}^\circ B$ there exists a $y \in B$ with $y \in m(x)$, and hence $y \in {}^*U$ since $U$ is open. Thus $^*U \cap B \neq \emptyset$ for all $U \in \mathcal{T}_z$. Since $V(^*S)$ is $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}_z$, we see from Theorem 8.4(a) of Chapter II that $\mu(\mathcal{T}_z) \cap B \neq \emptyset$, where $\mu(\mathcal{T}_z)$ is the intersection monad of the filter $\mathcal{T}_z$ (Definition 8.3 of Chapter II). Clearly $\mu(\mathcal{T}_z) = m(z)$, and so $z \in {}^\circ B$, and we are through. □

Note that if $A \subseteq {}^*X$ then $\operatorname{st}(A) = \operatorname{st}(A \cap \operatorname{ns}(^*X))$. Also note that Proposition 1.10 can be interpreted to say that $\bar A = \operatorname{st}(^*A \cap \operatorname{ns}(^*X))$, and so Theorem 1.22 is a generalization of Proposition 1.10. Theorem 1.22 was established for metric spaces by Robinson using an enlargement [42, Theorem 4.3.3], and in the general case (assuming saturation) by Luxemburg [36, Theorem 3.4.2]. An example due to Keisler shows that Theorem 1.22 is not true if $V(^*S)$ is not $\kappa$-saturated with $\kappa > \operatorname{card} \mathcal{T}$ [36, Example 3.4.3].

Exercises III.1

1. Verify the statements in Examples 1.5.4-6.
2. Prove Proposition 1.10.
3. Prove (a) and (b) of Proposition 1.12.
4. Prove that a topology $\mathcal{T}$ is weaker than a topology $\mathcal{S}$ on $X$ iff $m_{\mathcal{T}}(x) \supseteq m_{\mathcal{S}}(x)$ for each $x \in X$, where $m_{\mathcal{T}}$ and $m_{\mathcal{S}}$ denote the monads for $\mathcal{T}$ and $\mathcal{S}$, respectively.
5. A $T_1$ space is normal if for any two disjoint closed sets $A$ and $B$ there are disjoint open sets $U$ and $V$ with $A \subseteq U$ and $B \subseteq V$. A $T_1$ space is regular if the same condition holds for all $A$ and $B$, where $A$ is a point (actually a set consisting of a point) and $B$ is a closed set. Give a nonstandard condition for regularity and normality.
6. (a) Let $k(x)$ be a subset of $^*X$ for each $x \in X$. Define a collection $\mathcal{T}$ of subsets of $X$ as follows: $U \in \mathcal{T}$ iff $k(x) \subseteq {}^*U$ for each $x \in U$. Show that $\mathcal{T}$ is a topology for $X$. Also show that if $\hat m(x)$ is the $\mathcal{T}$-monad of $x \in X$ then $k(x) \subseteq \hat m(x)$.
(b) Fix an infinitesimal $\varepsilon > 0$ in $^*R$ and for each $x \in R$ let $k(x)$ be the pseudomonad $\{y \in {}^*R : |y - x| < \varepsilon\}$. Show that a set $U$ is open in $R$ in the usual sense if and only if, for each $x \in U$, $k(x) \subseteq {}^*U$. Clearly $k(x) \neq m(x)$ for each $x \in R$.
(c) Let $X$ be any set. Let $\mathcal{B}_x$, $x \in X$, be a collection of subsets of $X$ satisfying the following:
(i) If $V \in \mathcal{B}_x$ then $x \in V$,
(ii) If $V_1, V_2 \in \mathcal{B}_x$, there exists a $V \in \mathcal{B}_x$ with $V \subseteq V_1 \cap V_2$,
(iii) If $y \in U \in \mathcal{B}_x$, then there is a $V \in \mathcal{B}_y$ with $V \subseteq U$.
Use the sets $k(x) = \bigcap {}^*U$ ($U \in \mathcal{B}_x$) to define a topology $\mathcal{T}$ as in 6(a). Show that $\mathcal{B}_x$ is a neighborhood base in $\mathcal{T}$ for each $x \in X$.
7. Finish the proof of Proposition 1.15.
8. Show that the weak topology is the weakest topology making the corresponding functions continuous. (See Definition 1.17.)
9. Let $A$ be a subset of a topological space $X$. A point $x$ is an interior point of $A$ iff $A$ is a neighborhood of $x$. The set of interior points of $A$ is denoted by $A^\circ$. A point $x$ is a boundary point of $A$ if $x$ is not interior to $A$ and not interior to $A'$. The set of boundary points of $A$ is denoted by $\partial A$. Show that
(a) $x \in A^\circ$ iff $m(x) \subseteq {}^*A$,
(b) $x \in \partial A$ iff $m(x) \cap {}^*A \neq \emptyset$ and $m(x) \cap {}^*A' \neq \emptyset$.
10. Let $A$ be a subset of a topological space $X$. Use Exercise 9 and the text material to establish the following results:
(a) $\partial A = \bar A \cap \overline{A'} = \bar A - A^\circ$,
(b) $X - \partial A = A^\circ \cup (A')^\circ$,
(c) $\bar A = A \cup \partial A$, $A^\circ = A - \partial A$,
(d) $A$ is closed iff $A \supseteq \partial A$,
(e) $A$ is open iff $A \cap \partial A = \emptyset$.
11. Let $(X \times Y, \mathcal{T} \times \mathcal{S})$ be the product of $(X, \mathcal{T})$ and $(Y, \mathcal{S})$. Show that if $A \subseteq X$ and $B \subseteq Y$ then
(a) $\overline{A \times B} = \bar A \times \bar B$,
(b) $(A \times B)^\circ = A^\circ \times B^\circ$,
(c) $\partial(A \times B) = (\partial A \times \bar B) \cup (\bar A \times \partial B)$.
12. Let $Y$ be a subset of $(X, \mathcal{T})$ with relative topology $\mathcal{T}_Y$. If $A \subseteq Y$ show that
(a) $A$ is $\mathcal{T}_Y$-closed iff it is the intersection of $Y$ and a $\mathcal{T}$-closed set.
(b) A point $y \in Y$ is a $\mathcal{T}_Y$-accumulation point of $A$ iff it is a $\mathcal{T}$-accumulation point.
13. (a) Let $(X_1, \mathcal{T}_1)$, $(X_2, \mathcal{T}_2)$, and $(X_3, \mathcal{T}_3)$ be topological spaces. Show that a function $f: X_1 \to X_2$ is continuous iff, for each subset $A \subseteq X_1$, $f[\bar A] \subseteq \overline{f[A]}$.
(b) Show that if $f: X_1 \to X_2$ and $g: X_2 \to X_3$ are continuous, then the composite function $h = g \circ f$ defined by $h(x) = g(f(x))$ for $x \in X_1$ is continuous.
14. Let $\mathcal{T}$ be the product topology on $X = \prod X_i$ ($i \in I$) where $(X_i, \mathcal{T}_i)$ are topological spaces. If $A_i \subseteq X_i$ for each $i \in I$, show that $\overline{\prod A_i\ (i \in I)} = \prod \bar A_i$ ($i \in I$), so that the product of closed sets is closed.
15. (a) A sequence $(x_n : n \in N)$ in a space $(X, \mathcal{T})$ converges to $x \in X$ if for every neighborhood $U$ of $x$ there is an $m$ so that $x_n \in U$ if $n \geq m$. Show that $(x_n)$ converges to $x$ iff $^*x_\omega \in m(x)$ for all infinite $\omega$.
(b) Let $(x_n : n \in N)$ be a sequence in $X = \prod X_i$ ($i \in I$), where the $(X_i, \mathcal{T}_i)$ are topological spaces. Show that $(x_n)$ converges to $x \in X$ iff $(\varphi_i(x_n))$ converges to $\varphi_i(x)$ for each $i \in I$, where the $\varphi_i$ are as in Definition 1.19.
16. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces with $(Y, \mathcal{S})$ being Hausdorff. Suppose that $f, g: X \to Y$ are continuous. Show that $\{x : f(x) = g(x)\}$ is closed.



III.2 Compactness

A cornerstone of topology is the notion of compactness, which is defined as follows.

2.1 Definition A collection $\mathcal{A} = \{A_i : i \in I\}$ of sets is a cover of (or covers) $A \subseteq X$ if $A \subseteq \bigcup A_i$ ($i \in I$). A subcover of $\mathcal{A}$ is a subcollection of $\mathcal{A}$ which also covers $A$. $A$ is a compact subset of a topological space $(X, \mathcal{T})$ if each open cover, that is, each cover of $A$ by open sets $U_i$ ($i \in I$), contains a finite subcover.

Probably the most useful result in nonstandard analysis is the following pointwise characterization of compactness due to Robinson.

2.2 Robinson's Theorem Let $(X, \mathcal{T})$ be a topological space. Then $A \subseteq X$ is compact iff every $y \in {}^*A$ is near a standard point $x \in A$.

Proof: Suppose $A$ is compact but that there is a point $y \in {}^*A$ which is not contained in the monad of any $x \in A$. Then each $x \in A$ possesses an open neighborhood $U_x$ with $y \notin {}^*U_x$. The covering $\{U_x : x \in A\}$ of $A$ has a finite subcovering $\{U_1, \ldots, U_n\}$; i.e., $U_1 \cup \cdots \cup U_n \supseteq A$. By transfer $^*U_1 \cup \cdots \cup {}^*U_n \supseteq {}^*A$. This contradicts the fact that $y \in {}^*A$ but $y \notin {}^*U_i$, $1 \leq i \leq n$. Conversely, suppose that $A$ is not compact. Then there is an open covering $\mathcal{A} = \{U_i : i \in I\}$ of $A$ which has no finite subcover. The binary relation $P$ on $\mathcal{A} \times A$ defined by $P(U, x)$ iff $x \notin U$ is concurrent (check). By Theorem 5.10(ii) of Chapter II there is a point $y \in {}^*A$ with $y \notin {}^*U$ for all $U \in \mathcal{A}$. If $x \in A$ then $x \in U$ for some $U \in \mathcal{A}$, but $y \notin {}^*U$ so $y \notin m(x)$. □

2.3 Examples
1. In the discrete topology the only compact subsets are finite.
2. All subsets in the trivial topology are compact.
3. In the finite complement topology for $N$, every subset $A$ is compact. For if $A \neq \emptyset$ and $y \in {}^*A$ then either $y \in A$ or $y \in {}^*N_\infty$ (Corollary 7.6 of Chapter I). In the first case $y \in m(y)$, and in the second case $y \in m(x)$ for any $x \in N$ and, in particular, for some $x \in A$. Recall that a set must be finite to be closed in this topology, so there are compact subsets which are not closed in this non-Hausdorff topology.

We use Robinson's theorem to give proofs of the following standard results.


2.4 Theorem If $X$ is compact in the topology $\mathcal{T}$ and $A \subseteq X$ is closed, then $A$ is compact.

Proof: Let $y \in {}^*A$. Since $X$ is compact there is an $x \in X$ with $y \in m(x)$, whence $x \in A$ by 1.7(ii), so $A$ is compact. □

2.5 Theorem If $(X, \mathcal{T})$ is Hausdorff and $A \subseteq X$ is compact, then $A$ is closed.

Proof: Let $x \in A'$ and suppose that $y \in m(x)$, $y \in {}^*A$. Since $A$ is compact, $y \in m(\hat x)$ for some $\hat x \in A$, but then $m(x) \cap m(\hat x) \neq \emptyset$, contradicting the fact that $(X, \mathcal{T})$ is Hausdorff. □

2.6 Theorem If $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ are topological spaces and $f: X \to Y$ is continuous, then $f[K]$ is compact for each compact $K \subseteq X$.

Proof: Exercise. □

2.7 Theorem If $(X, \mathcal{T})$ is compact, $(Y, \mathcal{S})$ is Hausdorff, and $f: X \to Y$ is continuous, then
(i) $f$ is closed (i.e., takes closed sets onto closed sets),
(ii) if $f$ is one-to-one then it is a homeomorphism.

Proof: (i) Follows from 2.4-2.6. (ii) We may assume that $f[X] = Y$. We need only show that $f$ is open (i.e., takes open sets onto open sets). But if $U$ is open in $X$, then $U'$ is closed. Since $f$ is one-to-one, $f[U] = Y - f[U']$, which is open by (i). □

The real power of Robinson's theorem is illustrated by the proofs of the following standard results. The standard proofs of these results as given in Kelley [20] are somewhat involved.

2.8 Tychonoff's Theorem If $(X_i, \mathcal{T}_i)$ ($i \in I$) are compact spaces and $X = \prod X_i$ ($i \in I$), then $X$ is compact in the product topology $\mathcal{T}$.

Proof: Let $y \in {}^*X$. Then $y(i) \in {}^*X_i$ for (standard) $i \in I$ and so $y(i)$ is near a standard point $x_i \in X_i$ for each $i \in I$. That is, $y(i) \in m_i(x_i)$, where $m_i(x_i)$ denotes the monad of $x_i$ in $(X_i, \mathcal{T}_i)$. By 1.19, $y \in m(x)$, where $m(x)$ is the monad in $\mathcal{T}$ of the point $x \in X$ defined by $x(i) = x_i$. □

Tychonoff's theorem is used in many proofs in analysis. One can usually replace these standard proofs by simpler nonstandard ones which use Robinson's theorem directly (for example, see the proof of Alaoglu's theorem, 4.22, below).

*2.9 Alexander's Theorem If $\mathcal{S}$ is a subbase for the topology of $(X, \mathcal{T})$ and every cover of $X$ by members of $\mathcal{S}$ has a finite subcover, then $X$ is compact.

Proof: (Hirschfeld [15]) Suppose $X$ is not compact. By 2.2 there exists a $y \in {}^*X$ which is not near-standard and so for each $x \in X$ there is an open set $U_x$ with $x \in U_x$ and $y \notin {}^*U_x$. We may take each $U_x$ from the base generated by $\mathcal{S}$, so that $U_x$ is a finite intersection of members $V$ of $\mathcal{S}$; one of the $^*V$ must then omit $y$, so we may as well assume that $U_x \in \mathcal{S}$ for each $x$. Then the covering $\{U_x : x \in X\}$ cannot have a finite subcover $U_1, \ldots, U_n$, for in that case $^*X = {}^*U_1 \cup \cdots \cup {}^*U_n$ and $y \in {}^*U_i$ for some $i$, $1 \leq i \leq n$ (contradiction). □

Exercises III.2

1. Prove Theorem 2.6.
2. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces and suppose that $(Y, \mathcal{S})$ is compact Hausdorff. Show that $f: X \to Y$ is continuous iff the graph $G_f = \{(x, f(x)) \in X \times Y : x \in X\}$ of $f$ is closed in $X \times Y$.
3. Let $X$ have the topologies $\mathcal{T}$ and $\mathcal{S}$, and suppose that $(X, \mathcal{T})$ is compact Hausdorff. Show that (a) if $\mathcal{S}$ is strictly contained in $\mathcal{T}$ then $\mathcal{S}$ is not Hausdorff, (b) if $\mathcal{T}$ is strictly contained in $\mathcal{S}$ then $\mathcal{S}$ is not compact.
4. Show that if $(X, \mathcal{T})$ is compact then there is a hyperfinite set $F \subseteq {}^*X$ with $X \subseteq F \subseteq \operatorname{ns}(^*X)$ such that $X = \operatorname{st}(F)$.
5. Suppose that $(X, \mathcal{T})$ is compact. Show that if $(A_n : n \in N)$ is a sequence of nonempty closed subsets of $X$ which is monotone, i.e., $A_1 \supseteq A_2 \supseteq \cdots$, then $\bigcap A_n$ ($n \in N$) $\neq \emptyset$.
6. The following problem is derived from a result of A. Abian (see [1]): Let $p_n$ be a sequence of polynomials and $x_n$ a sequence of variables so that, for each $n$, $p_n = p_n(x_1, x_2, \ldots, x_n)$ is a function of the first $n$ variables. Let $I_n$ be a sequence of closed and bounded intervals in $R$. Assume that for each $n$ there are values $a_i^n \in I_i$ for $1 \leq i \leq n$ such that, for each $i \leq n$, $p_i(a_1^n, a_2^n, \ldots, a_i^n) = 0$. Show that there are values $a_i \in I_i$ for $1 \leq i < \infty$ such that, for each $n \in N$, $p_n(a_1, a_2, \ldots, a_n) = 0$.
7. (Luxemburg [36]). Let $(X, \mathcal{T})$ be a regular Hausdorff space (see Exercise 1.5). If $A$ is an internal set in a $\kappa$-saturated enlargement of $V(X)$ where $\kappa > \operatorname{card} \mathcal{T}$, and $A \subseteq \operatorname{ns}(^*X)$, then $\operatorname{st}(A) = \{x \in X : \text{there exists } y \in A \text{ with } x = \operatorname{st}(y)\}$ is compact.


III.3 Metric Spaces

The most important topologies which occur in analysis are those associated with a metric or distance function. The corresponding spaces are called metric spaces.

3.1 Definition A metric space is a pair $(X, d)$, where $X$ is a set and $d$ is a map from $X \times X$ into the nonnegative reals satisfying (for all $x, y, z \in X$)
(a) $d(x, y) = 0$ iff $x = y$,
(b) $d(x, y) = d(y, x)$,
(c) (triangle inequality) $d(x, z) \leq d(x, y) + d(y, z)$.

Each metric space $(X, d)$ can be made into a topological space $(X, \mathcal{T}_d)$ by specifying that a set $U \in \mathcal{T}_d$ if, for each $x \in U$, there is an $\varepsilon > 0$ in $R$ so that the open $\varepsilon$-ball $B_\varepsilon(x) = \{y \in X : d(x, y) < \varepsilon\} \subseteq U$. The resulting collection $\mathcal{T}_d$ is a topology (standard exercise). When the metric $d$ and associated topology $\mathcal{T}_d$ are understood we simply call $X$ rather than $(X, d)$ or $(X, \mathcal{T}_d)$ the metric space. Note that the open $\varepsilon$-balls about a point $x \in X$ form a local base at $x$.

3.2 Examples
1. $R$ is a metric space with the usual metric $d(x, y) = |x - y|$ for $x, y \in R$.
2. $R$ is a metric space with the metric $d(x, y) = |x - y|/(1 + |x - y|)$ (check).
3. Let $X$ be any set and define $d(x, y) = 1$ if $x \neq y$ and $d(x, y) = 0$ otherwise. It is easy to see that $d$ is a metric; it is called the discrete metric.
4. $R^n$ is a metric space under each of the following metrics [where $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$]:
(a) $d_1(x, y) = \sum_{i=1}^{n} |x_i - y_i|$,
(b) $d_\infty(x, y) = \max\{|x_i - y_i| : 1 \leq i \leq n\}$.
Properties (a) and (b) of Definition 3.1 are trivial; to check property (c) for metric (a) we have [with $z = (z_1, \ldots, z_n)$]
$$d_1(x, z) = \sum_{i=1}^{n} |x_i - z_i| \leq \sum_{i=1}^{n} \bigl(|x_i - y_i| + |y_i - z_i|\bigr) = d_1(x, y) + d_1(y, z).$$
The triangle inequality (c) for metric (b) is left as an exercise.
5. Let $l_\infty$ (also often denoted by $l^\infty$) be the set of bounded sequences $x = (x_1, x_2, \ldots)$. Then $l_\infty$ is a metric space under the metric defined by $d_\infty(x, y) = \sup\{|x_i - y_i| : i \in N\}$ with $y = (y_1, y_2, \ldots)$. Note that $d_\infty(x, y)$ is finite for any $x, y \in l_\infty$ since, for any $i$, $|x_i - y_i| \leq |x_i| + |y_i|$ and so
$$\sup\{|x_i - y_i| : i \in N\} \leq \sup\{|x_i| : i \in N\} + \sup\{|y_i| : i \in N\}.$$
To check the triangle inequality (c) we have [with $z = (z_1, z_2, \ldots)$]
$$|x_i - z_i| \leq |x_i - y_i| + |y_i - z_i| \leq \sup\{|x_i - y_i| : i \in N\} + \sup\{|y_i - z_i| : i \in N\} = d_\infty(x, y) + d_\infty(y, z).$$
The result follows by taking sup over $i \in N$ on the left.

The nonstandard analysis of metric spaces will be carried out in an enlargement $V(^*S)$ of a superstructure $V(S)$ that contains $X$. We always assume that $S$ contains the set of real numbers $R$. In proving abstract theorems concerning a metric space $(X, d)$ we will write $x$ instead of $^*x$ for an element of $^*X$. In concrete examples, it might be important to investigate in more detail the structure of the elements of $^*X$. For example, if $X = l_\infty$ then we could take $S = R$, in which case elements of $l_\infty$ would appear as bounded real-valued functions on the integers. Often the set $S$ in a particular example will not be specified; the reader should be able to fill in the details. By transfer, the *-transform $^*d$ of $d$ satisfies the conditions of Definition 3.1 with $^*d$ replacing $d$ for all $x, y, z \in {}^*X$.
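The metrics of Examples 3.2.1-4 can be spot-checked numerically against the conditions of Definition 3.1. The sketch below is only a sanity check on a finite sample of points and proves nothing in general; the function names and the sample points are illustrative assumptions, and exact rational arithmetic is used to avoid rounding effects.

```python
import itertools
from fractions import Fraction


def d_usual(x, y):
    return abs(x - y)


def d_bounded(x, y):
    return abs(x - y) / (1 + abs(x - y))


def d_discrete(x, y):
    return 0 if x == y else 1


def d_1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def d_inf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def satisfies_metric_axioms(d, points):
    """Test conditions (a)-(c) of Definition 3.1 on a finite sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):       # condition (a)
            return False
        if d(x, y) != d(y, x):               # condition (b)
            return False
        if d(x, z) > d(x, y) + d(y, z):      # condition (c)
            return False
    return True


# Hypothetical sample points: a few rationals and a few vectors in R^2.
reals = [Fraction(k, 3) for k in range(-4, 5)]
vectors = list(itertools.product([Fraction(0), Fraction(1, 2), Fraction(-1)], repeat=2))

for name, d, pts in [("usual", d_usual, reals), ("bounded", d_bounded, reals),
                     ("discrete", d_discrete, reals),
                     ("d_1", d_1, vectors), ("d_inf", d_inf, vectors)]:
    print(name, satisfies_metric_axioms(d, pts))
```

For contrast, the squared difference $(x - y)^2$ fails the triangle inequality on the same sample, for instance at $x = 0$, $y = 1/3$, $z = 2/3$.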

3.3 Definition Let $(X, d)$ be a metric space. Two points $x$ and $y$ in $^*X$ are near if $^*d(x, y) \simeq 0$. We write $x \simeq y$ if $x$ and $y$ are near and $x \not\simeq y$ otherwise. The monad of $x \in {}^*X$ is the set $m(x) = \{y \in {}^*X : y \simeq x\}$. Two points $x, y$ in $^*X$ are in the same galaxy if $^*d(x, y)$ is finite. The principal galaxy of $^*X$ is the one containing the standard points, and is denoted by $\operatorname{fin}(^*X)$. Points in $\operatorname{fin}(^*X)$ are called finite.

An easy exercise shows that for standard points $x \in X$ the monad $m(x)$ of Definition 3.3 coincides with the monad obtained from the associated topology $\mathcal{T}_d$. The metric monads, however, are defined for all points $x \in {}^*X$. It is also easy to see that the relation $\simeq$ is an equivalence relation.

3.4 Examples
1. In the metric of Example 3.2.1, $x \simeq y$ iff $x - y$ is infinitesimal.
2. In each of the metrics on $R^n$ defined in Example 3.2.4, $x \simeq y$ iff $x_i - y_i$ is infinitesimal for $1 \leq i \leq n$ (exercise).
3. Each element of $^*l_\infty$ is an internal function $x: {}^*N \to {}^*R$, and we usually write $x(i) = x_i$ and $x = (x_i : i \in {}^*N)$. The standard elements in $^*l_\infty$ are of the form $^*y$, where $y = (y_i : i \in N)$ is an element of $l_\infty$. Each $x \in {}^*l_\infty$ is *-bounded in the sense that there exists an $M \in {}^*R$ (which could be infinite) so that $|x_i| \leq M$ for each $i \in {}^*N$ (exercise). In passing, note that there are external functions $z: {}^*N \to {}^*R$ which are also *-bounded; an example occurs when $z_i = 1$ for $i \in N$ and $z_i = 0$ for $i \in {}^*N_\infty$.

The real-valued function on $\mathcal{P}(N) \times l_\infty$ defined by $(A, x) \to \sup\{|x_i| : i \in A\}$ extends by transfer to a $^*R$-valued function on $^*\mathcal{P}(N) \times {}^*l_\infty$. We again denote the value of this extended function by $\sup\{|x_i| : i \in A\}$, where $A \subseteq {}^*N$ is internal and $x \in {}^*l_\infty$. Properties of the extended sup function can be obtained by transfer. For example, if $A$ and $B$ are internal subsets of $^*N$ and $A \subseteq B$ then $\sup\{|x_i| : i \in A\} \leq \sup\{|x_i| : i \in B\}$. For each $x, y \in {}^*l_\infty$ we have $^*d_\infty(x, y) = \sup\{|x_i - y_i| : i \in {}^*N\}$.

The monads in $^*l_\infty$ are easily characterized. We claim that if $x, y \in {}^*l_\infty$ then $x \simeq y$ iff $x_i \simeq y_i$ for all $i \in {}^*N$. For suppose $x \simeq y$. Then, for any $i \in {}^*N$, $|x_i - y_i| \leq \sup\{|x_i - y_i| : i \in {}^*N\} \simeq 0$. The converse is left as an exercise. The finite elements in $^*l_\infty$ are those $x = (x_i : i \in {}^*N)$ for which there exists a finite $M$ (and hence even a standard $M$) in $^*R$ so that $|x_i| \leq M$ for all $i \in {}^*N$. The value of $M$ depends on $x$.

All of the results of §§III.1 and III.2 are available for the topological space $(X, \mathcal{T}_d)$ associated with a metric space $(X, d)$. We concentrate in this section on some results which are special to metric spaces. The first few revolve around the notion of uniformity.

3.5 Definition Let $(X, d)$ and $(Y, \bar d)$ be two metric spaces and $A$ a subset of $X$.
(a) A map $f: A \to Y$ is uniformly continuous on $A$ if, given $\varepsilon > 0$ in $R$, there exists a $\delta > 0$ in $R$ so that $\bar d(f(x), f(y)) < \varepsilon$ for all $x, y \in A$ for which $d(x, y) < \delta$.
(b) A sequence of maps $f_n: A \to Y$, $n \in N$, converges uniformly on $A$ to $f: A \to Y$ if, given $\varepsilon > 0$, there exists a $k \in N$ so that $\bar d(f_n(x), f(x)) < \varepsilon$ for all $n \geq k$ in $N$ and all $x \in A$.

In the following results, $(X, d)$ and $(Y, \bar d)$ are metric spaces and $A$ is a subset of $X$. We use $\simeq$ to denote nearness in both $^*X$ and $^*Y$, letting the context settle any ambiguity.

3.6 Proposition The map $f: A \to Y$ is uniformly continuous on $A$ iff $^*f(x) \simeq {}^*f(y)$ whenever $x, y \in {}^*A$ and $x \simeq y$.

Proof: Let $f$ be uniformly continuous on $A$. Find the $\delta > 0$ for a prescribed $\varepsilon > 0$ from 3.5(a). By transfer, $^*\bar d(^*f(x), {}^*f(y)) < \varepsilon$ for all $x, y \in {}^*A$ for which $^*d(x, y) < \delta$. In particular, $^*\bar d(^*f(x), {}^*f(y)) < \varepsilon$ for all $x, y \in {}^*A$ for which $x \simeq y$. This is true for any $\varepsilon > 0$ in $R$, and so $^*f(x) \simeq {}^*f(y)$ for all $x, y \in {}^*A$ for which $x \simeq y$.

Conversely, suppose $^*f(x) \simeq {}^*f(y)$ whenever $x, y \in {}^*A$ and $x \simeq y$. Let $\varepsilon > 0$ in $R$ be given. Then the internal sentence
$$(\exists \delta \in {}^*R)\bigl[\delta > 0 \wedge (\forall x, y \in {}^*A)\bigl[{}^*d(x, y) < \delta \rightarrow {}^*\bar d(^*f(x), {}^*f(y)) < \varepsilon\bigr]\bigr]$$
is true in $V(^*S)$ (choose $\delta$ to be infinitesimal). That $f$ is uniformly continuous follows by transfer to $V(S)$. □

3.7 Proposition The sequence $f_n: A \to Y$ converges uniformly on $A$ to $f: A \to Y$ iff $^*f_n(x) \simeq {}^*f(x)$ for all $n \in {}^*N_\infty$ and all $x \in {}^*A$.

Proof: Exercise. □

3.8 Theorem If $f: A \to Y$ is continuous and $A$ is compact, then $f$ is uniformly continuous on $A$.

Proof: Let $x, y \in {}^*A$ with $x \simeq y$. Then $x$ and $y$ are near a standard point $z \in A$ since $A$ is compact, and $^*f(x) \simeq f(z) \simeq {}^*f(y)$ since $f$ is continuous at $z$. The result follows from Proposition 3.6. □
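The criterion of Proposition 3.6 also shows quickly when uniform continuity fails on a noncompact set. For $f(x) = x^2$ on $R$, take an infinite $\omega$ and put $x = \omega$, $y = \omega + 1/\omega$; then $x \simeq y$ but $^*f(y) - {}^*f(x) = 2 + 1/\omega^2 \not\simeq 0$. The following sketch traces the standard shadow of this computation along finite indices $n$; the helper name `gap` is a hypothetical choice and exact rationals are used so that the identity is visible without rounding.

```python
from fractions import Fraction


def gap(f, n):
    """Return (|x - y|, |f(x) - f(y)|) for x = n and y = n + 1/n."""
    x = Fraction(n)
    y = x + Fraction(1, n)
    return abs(x - y), abs(f(x) - f(y))


def square(t):
    return t * t


for n in (1, 10, 1000):
    distance, image_distance = gap(square, n)
    # The points get arbitrarily close while their images stay
    # at distance 2 + 1/n^2, which never drops below 2.
    print(n, distance, image_distance)
```

On a compact set such as $[0, 1]$ no such pair can exist, in agreement with Theorem 3.8.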

3.9 Theorem If $f_n: A \to Y$ is a sequence of continuous functions which converge uniformly on $A$ to $f: A \to Y$, then $f$ is continuous.

Proof: Let $x \in A$ and $y \in {}^*A$ with $y \simeq x$. We need to show that $^*f(y) \simeq f(x)$. Now $^*f_n(y) \simeq {}^*f_n(x)$ for each $n \in N$ and so, by Theorem 7.3 of Chapter II, $^*f_\omega(y) \simeq {}^*f_\omega(x)$ for some $\omega \in {}^*N_\infty$. By Proposition 3.7, $^*f_\omega(y) \simeq {}^*f(y)$ and $^*f_\omega(x) \simeq {}^*f(x)$, so $^*f(y) \simeq {}^*f(x) = f(x)$. □

Next we present the notion of a complete metric space. To do so we need the obvious generalizations of the definitions in §I.8.

3.10 Definition Let $(X, d)$ be a metric space, and let $(s_n : n \in N)$ be a sequence of points in $X$. Then
(i) $(s_n)$ converges to $s$ if, given $\varepsilon > 0$ in $R$, there is a $k \in N$ so that $d(s_n, s) < \varepsilon$ if $n \geq k$,
(ii) $(s_n)$ is a Cauchy sequence if, given $\varepsilon > 0$ in $R$, there is a $k \in N$ so that $d(s_n, s_m) < \varepsilon$ if $n, m \geq k$,
(iii) $s$ is a limit point of $(s_n)$ if, for each $\varepsilon > 0$ in $R$ and each $k \in N$, there is an $n > k$ so that $d(s_n, s) < \varepsilon$.

The reader will easily be able to prove that $(s_n)$ converges to $s$ iff $^*s_n \simeq s$ for all $n \in {}^*N_\infty$, $(s_n)$ is a Cauchy sequence iff $^*s_n \simeq {}^*s_m$ for all $n, m \in {}^*N_\infty$, and $s$ is a limit point of $(s_n)$ iff $^*s_n \simeq s$ for some $n \in {}^*N_\infty$.

3.11 Definition $(X, d)$ is complete if each Cauchy sequence in $X$ converges to a point in $X$.

3.12 Examples
1. The set $R$ with the usual metric is complete by 8.5 of Chapter I.
2. Any set $X$ with the discrete metric is complete.
3. $R^n$ with each metric of Example 3.2.4 is complete. For example, let $(x^k)$ be a Cauchy sequence in $(R^n, d_\infty)$. Then for each $i$, $1 \leq i \leq n$, $|x_i^k - x_i^l| \leq d_\infty(x^k, x^l)$. Thus, $(x_i^k)$ is a Cauchy sequence for each $i$ and so converges to a point $x_i$ in $R$. The point $x = (x_1, \ldots, x_n)$ in $R^n$ is the limit of $x^k$ in $R^n$.

We now use nonstandard analysis to prove some abstract theorems on completeness. The nonstandard characterization of completeness requires the following notion.

3.13 Definition Let $(X, d)$ be a metric space. A point $y \in {}^*X$ is a pre-near-standard point if for every standard $\varepsilon > 0$ there is a standard $x \in X$ with $^*d(x, y) < \varepsilon$.

3.14 Proposition A metric space $(X, d)$ is complete iff every pre-near-standard point $y \in {}^*X$ is near-standard.

Proof: Suppose $(X, d)$ is complete. If $y$ is pre-near-standard, find a sequence $s_n \in X$ so that $^*d(y, s_n) < 1/n$. Then $(s_n)$ is a Cauchy sequence with limit $s$ and $y \simeq {}^*s_n \simeq s$ if $n \in {}^*N_\infty$. Conversely, suppose every pre-near-standard point is near-standard, and let $(s_n)$ be a Cauchy sequence. Given $\varepsilon > 0$, find the associated $k \in N$ from Definition 3.10. Then $^*d(^*s_n, s_k) < \varepsilon$ if $n \in {}^*N_\infty$. Thus $^*s_n$ is pre-near-standard for every $n \in {}^*N_\infty$ and each such $^*s_n$ must be near-standard to the same $s \in X$ (check). The sequence $(s_n)$ must converge to $s$. □

3.15 Corollary A closed subset $(A, d)$ of a complete metric space $(X, d)$ is complete.

Proof: Let $y$ be a pre-near-standard point in $^*A$. Then $y \simeq x$ for some $x \in X$ since $(X, d)$ is complete. But $x \in A$ by Proposition 1.10 since $A$ is closed. □

Using this characterization, we will show that it is possible to adjoin "ideal" elements to a metric space $(X, d)$ so that the result is a complete metric space in which $(X, d)$ is densely embedded.

3.16 Definition Let $(X, d)$ be a metric space. A metric space $(\hat X, \hat d)$ is a completion of $(X, d)$ if $(\hat X, \hat d)$ is complete, there is an isometric embedding $\varphi: X \to \hat X$ [i.e., $d(x, y) = \hat d(\varphi(x), \varphi(y))$ for all $x, y \in X$, whence $\varphi$ is one-to-one], and $\varphi[X]$ is dense in $\hat X$.

3.17 Theorem Any metric space $(X, d)$ has a completion $(\hat X, \hat d)$.

Proof: We let $X'$ be the pre-near-standard points in $^*X$, and $\hat X$ be the equivalence classes of $X'$ under the relation of nearness $\simeq$ (an equivalence relation); thus the elements of $\hat X$ are monads $m(x')$ of pre-near-standard points $x' \in {}^*X$. Also define $\hat d(m(x'), m(y')) = \operatorname{st}(^*d(x', y'))$ [note that $^*d(x', y')$ is finite for any pre-near-standard points $x', y'$]. This metric is independent of the pre-near-standard points chosen to represent the elements of $\hat X$, for if $x' \simeq x'_1$ and $y' \simeq y'_1$ then $^*d(x', y') \simeq {}^*d(x'_1, y'_1)$ (Exercise 6). The map $\varphi: X \to \hat X$ defined by $\varphi(x) = m(x)$ is obviously an isometric embedding. Also $\varphi[X]$ is dense in $\hat X$. For if $m(x') \in \hat X$, where $x'$ is pre-near-standard, then given $\varepsilon > 0$ there exists an $x \in X$ so that $^*d(x', x) < \varepsilon$ and then
$$\hat d(m(x'), m(x)) = \operatorname{st}(^*d(x', x)) \leq \varepsilon.$$
To show completeness, let $(m(x'_n) : n \in N)$ be a Cauchy sequence in $(\hat X, \hat d)$, with $x'_n \in X'$. Since each $x'_n \in X'$, there are elements $x_n \in X$ with $^*d(x_n, x'_n) < 1/n$ for each $n \in N$. Given $\varepsilon > 0$ in $R$, there exists a $k \in N$ so that $\hat d(m(x'_n), m(x'_m)) < \varepsilon$ and hence $^*d(x'_n, x'_m) < \varepsilon$ if $n, m \geq k$. Then $d(x_n, x_m) = {}^*d(x_n, x_m) \leq 2/n + \varepsilon$ if $m \geq n \geq k$ in $N$ by the triangle inequality. Again by transfer, $^*d(^*x_n, {}^*x_m) \leq 2/n + \varepsilon$ if $m \geq n \geq k$ in $^*N$. In particular, if $\omega \in {}^*N_\infty$, $^*d(x_n, {}^*x_\omega) \leq 2/n + \varepsilon$ if $n \geq k$, and so $^*x_\omega$ is pre-near-standard. Therefore $^*d(x'_n, {}^*x_\omega) \leq {}^*d(x'_n, x_n) + {}^*d(x_n, {}^*x_\omega) < 3/n + \varepsilon$ if $n \geq k$, yielding $\hat d(m(x'_n), m(^*x_\omega)) \leq 3/n + \varepsilon$ if $n \geq k$. Thus $(m(x'_n))$ converges to $m(^*x_\omega)$. □

As an example, note that the rationals $Q$ form a metric space under the usual metric $d(x, y) = |x - y|$, $x, y \in Q$. The completion $(\hat Q, \hat d)$ is isomorphic to the real metric space $(R, d)$.

Recall that a subset of the real line is compact iff it is closed and bounded. In arbitrary metric spaces there is a similar relationship between compactness, completeness, and total boundedness, the last being a generalization of boundedness.

3.18 Definition A metric space $(X, d)$ is totally bounded if, to each $\varepsilon > 0$ in $R$, there corresponds a finite covering $\{B_\varepsilon(x_i) : 1 \leq i \leq n\}$ by open $\varepsilon$-balls [each $B_\varepsilon(x) = \{y \in X : d(x, y) < \varepsilon\}$].

3.19 Proposition A metric space $(X, d)$ is totally bounded iff every point of $^*X$ is pre-near-standard.

Proof: Suppose $(X, d)$ is totally bounded. Let $\varepsilon > 0$ be given and find the corresponding points $x_i$, $1 \leq i \leq n$, so that $X = \bigcup B_\varepsilon(x_i)$ ($1 \leq i \leq n$). By transfer, $^*X = \bigcup {}^*B_\varepsilon(x_i)$ ($1 \leq i \leq n$), and so every point of $^*X$ is pre-near-standard. The converse is left to the reader. □
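The finite coverings of Definition 3.18 can be illustrated on a finite sample of points by a greedy selection of ball centers. This is a sketch under the assumption that only a finite sample is inspected, so it illustrates the definition and does not establish total boundedness of any infinite space; the function name `greedy_eps_net` and the sampled interval are hypothetical.

```python
from fractions import Fraction


def greedy_eps_net(points, d, eps):
    """Pick centers so that every sample point lies in an open
    eps-ball about some center; centers are pairwise at distance >= eps."""
    centers = []
    for p in points:
        if all(d(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


def d_usual(x, y):
    return abs(x - y)


# Hypothetical sample: the points k/100 of the interval [0, 1].
sample = [Fraction(k, 100) for k in range(101)]
eps = Fraction(1, 4)
centers = greedy_eps_net(sample, d_usual, eps)
print(centers)

# Every sample point is within eps of some chosen center.
print(all(any(d_usual(p, c) < eps for c in centers) for p in sample))
```

A point that is skipped lies within $\varepsilon$ of an earlier center, and a point that is kept is its own center, so the balls about the chosen centers cover the sample.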

3.20 Theorem A metric space $(X, d)$ is compact iff it is complete and totally bounded.

Proof: Suppose $(X, d)$ is compact. Then every point $y \in {}^*X$ is near a point in $X$, so $(X, d)$ is complete and totally bounded by 3.14 and 3.19, respectively. Conversely, suppose $(X, d)$ is complete and totally bounded. If $y \in {}^*X$, then $y$ is pre-near-standard by 3.19 and hence near-standard by 3.14. □

One might expect that "totally bounded" may be replaced by "bounded" in this theorem, where boundedness is defined as follows.

3.21 Definition A set $A$ in a metric space $(X, d)$ is bounded if there is a point $x_0 \in X$ and a number $M$ so that $d(x, x_0) \leq M$ for all $x \in A$.

Example 3.2.2 and the following example show that boundedness is not enough for Theorem 3.20.

3.22 Example Let $B_1 = \{x \in l_\infty : d_\infty(x, \theta) \leq 1\}$ be the "unit ball" in $(l_\infty, d_\infty)$ where $\theta = (0, 0, \ldots)$. It is easy to see that $B_1$ is closed and hence is complete when regarded as a metric space with the metric induced by $d_\infty$ (Exercises 8, 14). Also, $B_1$ is obviously bounded. Now consider the element $x = (x_i : i \in {}^*N) \in {}^*B_1$ which is zero except at some infinite integer $\omega$ where $x_\omega = 1$. Then $x$ is not near-standard. For if $x \simeq {}^*y$ for some standard $y = (y_i : i \in N)$ then $0 = x_i \simeq y_i$ for at least all $i \in N$, and so $y_i = 0$ for all $i \in N$. By transfer, $^*y_i = 0$ for all $i \in {}^*N$, and so $^*y_\omega \not\simeq x_\omega$.
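A standard counterpart of the point $x$ in Example 3.22 is the family of sequences $e_i$ that are 1 in position $i$ and 0 elsewhere: they all lie in $B_1$, yet any two of them are at distance 1, so no finite family of balls of radius $1/2$ can cover $B_1$. The sketch below checks this on finite truncations only; the helper names and the truncation length are assumptions made for illustration.

```python
def unit_vector(i, length):
    """Truncation of the sequence that is 1 at index i and 0 elsewhere."""
    return tuple(1 if j == i else 0 for j in range(length))


def d_sup(x, y):
    """Sup metric on truncated sequences of equal length."""
    return max(abs(a - b) for a, b in zip(x, y))


length = 6
zero = tuple(0 for _ in range(length))
vectors = [unit_vector(i, length) for i in range(length)]

# Each e_i lies in the unit ball about the zero sequence ...
print(all(d_sup(v, zero) <= 1 for v in vectors))
# ... but distinct e_i, e_j are always exactly distance 1 apart.
print({d_sup(u, v) for u in vectors for v in vectors if u != v})
```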


To end this section we consider another compactness criterion, which is especially important in applications. In many situations one can obtain a sequence $(x_n)$ of points (in a given topological space $X$) which has certain desirable properties, e.g., giving better and better approximate solutions to a set of equations. One would like to assert that a subsequence of the given sequence converges to a point in the space (in order, e.g., to produce an exact solution). Though the criterion of compactness in the sense of §III.2 is not always of help in constructing such a subsequence, if the assertion is nevertheless always true we call the space sequentially compact.

3.23 Definition A topological space $X$ is sequentially compact if from each sequence $(x_n)$ in $X$ it is possible to select a subsequence which converges to a point $x \in X$.

It turns out that compactness is equivalent to sequential compactness in a metric space. Unfortunately this is not true in general topological spaces, as we shall see in §III.7.

3.24 Theorem A metric space $(X, d)$ is compact iff it is sequentially compact.

Proof: (i) Suppose that $(X, d)$ is compact and let $(x_n)$ be a sequence in $X$. By Exercise 9 there is a point $x_0$ which is a limit point of $(x_n)$. We will show that some subsequence of $(x_n)$ converges to $x_0$. Consider the open ball $B_1 = \{x \in X : d(x, x_0) < 1\}$. Since $x_0$ is a limit point of $(x_n)$ there is an $x_{n_1} \in B_1$. Similarly there is an $x_{n_2}$ in $B_{1/2} = \{x \in X : d(x, x_0) < 1/2\}$ with $n_2 > n_1$. Continuing this process inductively, we obtain a subsequence $(x_{n_k})$ with $x_{n_k} \in B_{1/k} = \{x \in X : d(x, x_0) < 1/k\}$; clearly $(x_{n_k})$ converges to $x_0$.
(ii) Suppose $(X, d)$ is sequentially compact. Then it is obvious that $(X, d)$ is complete, so that if $(X, d)$ is not compact, it must not be totally bounded. Thus there exists some $\varepsilon > 0$ so that no finite collection $\{B_\varepsilon(y_i) : 1 \leq i \leq n\}$ covers $X$. Let $x_1 \in X$ be a given point. Then there is an $x_2$ with $d(x_1, x_2) \geq \varepsilon$. Similarly there is an $x_3$ with $d(x_1, x_3) \geq \varepsilon$ and $d(x_2, x_3) \geq \varepsilon$. Continuing in this way, we construct a sequence $(x_n)$ with $d(x_n, x_m) \geq \varepsilon$ for any distinct $n, m \in N$. Clearly $(x_n)$ can have no convergent subsequence. □
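The inductive selection in part (i) of the proof can be traced on a concrete sequence. The following is a minimal sketch, assuming the sequence is given as a function of its index and that a limit point is already known; the name `extract_subsequence`, the `search_limit` guard, and the example sequence $x_n = (-1)^n + 1/n$ with limit point 1 are all illustrative.

```python
from fractions import Fraction


def extract_subsequence(x, x0, d, terms, search_limit=10_000):
    """Choose indices n_1 < n_2 < ... with d(x(n_k), x0) < 1/k."""
    indices = []
    n = 0
    for k in range(1, terms + 1):
        n += 1  # start strictly after the previously chosen index
        while d(x(n), x0) >= Fraction(1, k):
            n += 1
            if n > search_limit:
                raise ValueError("no suitable index found within search_limit")
        indices.append(n)
    return indices


def d_usual(a, b):
    return abs(a - b)


def x(n):
    # Hypothetical example sequence with limit points 1 and -1.
    return Fraction((-1) ** n) + Fraction(1, n)


indices = extract_subsequence(x, Fraction(1), d_usual, terms=4)
print(indices)
print([x(n) for n in indices])
```

The search for each index terminates precisely because $x_0$ is a limit point: by Definition 3.10(iii), every ball about $x_0$ contains terms of arbitrarily large index.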

The procedure used in part (i) of the proof in going from a limit point to a convergent subsequence does not work in a general topological space. It uses in an essential way the fact that the neighborhood system of x has a countable base. A topological space is said to satisfy the first axiom of cou~tubilityif the neighborhood system of each point has a countable base. Included in such spaces are the metric spaces. Clearly, a subset A in a metric


or first countable space is closed iff $A$ contains the limit of any convergent sequence in $A$.

Exercises III.3

1. Show that $d_\infty(x, y)$ satisfies the triangle inequality.
2. Show that for the metrics on $R^n$ defined in Example 3.2.4, $x \simeq y$ iff $x_i \simeq y_i$ for $1 \le i \le n$.
3. Show that for each $x \in {}^*l_\infty$ there is an $M \in {}^*N$ such that $|x_i| \le M$ for all $i \in {}^*N$.
4. Prove that if $x, y$ are internal sequences and $x_i \simeq y_i$ for all $i \in {}^*N$ then $\sup\{|x_i - y_i| : i \in {}^*N\} \simeq 0$.
5. Prove Proposition 3.7.
6. (a) Show that if $a, b, c$ are points in a metric space $(X, d)$ then $|d(a, c) - d(b, c)| \le d(a, b)$.
   (b) Show that if $x' \simeq x_1'$ and $y' \simeq y_1'$ in $({}^*X, {}^*d)$ then ${}^*d(x', y') \simeq {}^*d(x_1', y_1')$.
7. Show that if $(X, d)$ is a metric space and each point of ${}^*X$ is pre-near-standard then $(X, d)$ is totally bounded.
8. Show that $B_1 = \{x \in l_\infty : d_\infty(x, \theta) \le 1\}$ is closed.
9. Show that a sequence in a compact metric space has a limit point.
10. Let $(x_n)$ be a sequence in a compact metric space $(X, d)$. Fix $\omega \in {}^*N_\infty$. Use the downward transfer principle and the fact that $x_\omega$ is near-standard to prove there is a subsequence $x_{n_k}$ that converges to $\mathrm{st}(x_\omega)$.
11. Use Theorem 3.24 to prove Robinson's result: If $(X, d)$ is a metric space and $A$ is an internal set in ${}^*X$ such that each $a \in A$ is near-standard, then $\mathrm{st}(A) = \{x \in X : \text{there exists an } a \in A \text{ with } x \simeq a\}$ is compact. (The generalization for regular topological spaces (Exercise 2.7) is due to Luxemburg [36].)
12. Prove that a Cauchy sequence in a metric space $(X, d)$ is bounded.
13. Use Exercise 12 to show that $(X, d)$ is complete if every finite point in ${}^*X$ is near-standard.
14. Show that $(l_\infty, d_\infty)$ is complete.
15. ($\varepsilon\delta$-Continuity, *-Continuity, and S-Continuity) Let $(X, d)$ be a metric space, $A$ be a subset of ${}^*X$, and $f: A \to {}^*R$ be a function. We say that $f$ is $\varepsilon\delta$-continuous (*-continuous) at $x \in A$ if, for each $\varepsilon > 0$ in $R$ (${}^*R$), there is a $\delta > 0$ in $R$ (${}^*R$) such that $|f(x) - f(y)| < \varepsilon$ if $y \in A$ and ${}^*d(x, y) < \delta$. We say that $f$ is S-continuous at $x \in A$ if $f(y) \simeq f(x)$ for every $y \in A$ with $y \simeq x$.
   (a) Let $A = {}^*X$ and $f = {}^*g$, where $g: X \to R$. Show that if $g$ is continuous at each $x \in X$, then $f$ is *-continuous and S-continuous at each $x \in {}^*X$.



   (b) Show that if $f$ is $\varepsilon\delta$-continuous at $x \in A$ then $f$ is S-continuous at $x \in A$ but not necessarily vice versa.
   (c) Suppose that $f$ is internal. Show that $f$ is S-continuous at $x \in A$ iff $f$ is $\varepsilon\delta$-continuous at $x$. (Hint: Use the spillover principle.)
   (d) Show that there are internal functions $f$ on ${}^*R$ which are *-continuous but not S-continuous at zero and vice versa. (Hint: Look for examples on $X = R$ with the usual metric.)
16. Let $A$ be an internal set in ${}^*X$ where $(X, d)$ is a metric space and let $f: A \to {}^*R$ be internal. Show that $f$ is S-continuous at each point $x \in A$ iff, for every (standard) $\varepsilon > 0$ in $R$, there is a $\delta > 0$ in $R$ such that $|f(x) - f(y)| < \varepsilon$ for all $x, y \in A$ for which ${}^*d(x, y) < \delta$. (Hint: Again use the spillover principle.)
17. Let $X$ be a compact metric space. Suppose that the internal function $f: {}^*X \to {}^*R$ is S-continuous at each point of ${}^*X$ and finite at each $x \in X$. Let $g$ be defined by $g(x) = {}^\circ f(x)$ for $x \in X$. Then $g$ is continuous on $X$ and ${}^*g(x) \simeq f(x)$ for all $x \in {}^*X$.
18. Two metrics on $X$ are equivalent if they define the same topology. Show that the metrics $d$ and $d'$ are equivalent if there exist positive (nonzero) constants $\alpha$ and $\beta$ in $R$ so that $\alpha d(x, y) \le d'(x, y) \le \beta d(x, y)$ for all $x, y \in X$.
19. Let $I = [0, 1] \subset R$ and let $X$ be the set of all continuous functions $f: I \to I$ such that $|f(x) - f(y)| \le |x - y|$. Define $d(f, g) = \sup\{|f(x) - g(x)| : x \in I\}$ for $f, g \in X$.
   (a) Show that $(X, d)$ is a metric space.
   (b) Show that $(X, d)$ is compact.
20. Use Robinson's theorem to show that the set of elements $x$ of $l_1$ with $\|x\|_1 \le 1$ (the unit ball) is not compact.
21. (Lebesgue covering lemma) If $U_1, \dots, U_n$ is an open covering of a compact metric space $(X, d)$, then there is an $\varepsilon > 0$ in $R$ such that the $\varepsilon$-ball $B_\varepsilon(x)$ about any $x \in X$ is entirely contained in one of the sets $U_i$, $1 \le i \le n$.

III.4 Normed Vector Spaces and Banach Spaces

The space $R$ is not only a metric space with the usual metric; it is also equipped with operations of addition and multiplication, and the distance function $d(x, y) = |x - y|$ involves these operations. In this section we generalize this simple example. The metric spaces will have the additional structure of a vector space, and the metric will come from a generalization of the absolute value. Many theorems and exercises are standard. As in §III.3, the nonstandard analysis will be carried out in an enlargement $V({}^*S)$ of a suitable superstructure $V(S)$. The choice of $S$ will depend on the context and will not be mentioned explicitly.

4.1 Definition A (real)† vector space is a set $X$ on which are defined operations of vector addition ($+$) and scalar multiplication ($\cdot$) (so that we form the sum $x + y$ of two vectors $x, y \in X$ and the scalar multiple $a \cdot x$ of the vector $x \in X$ by $a \in R$). These operations satisfy the following conditions (as usual we often omit the dot in scalar multiplication):

(i) $x + y = y + x$ for all $x, y \in X$.
(ii) $(x + y) + z = x + (y + z)$ for all $x, y, z \in X$.
(iii) There is a vector $\theta \in X$ called the zero vector so that $x + \theta = x$ for all $x \in X$.
(iv) $a(x + y) = ax + ay$ if $a \in R$ and $x, y \in X$.
(v) $(a + b)x = ax + bx$ if $a, b \in R$ and $x \in X$.
(vi) $a(bx) = (ab)x$ if $a, b \in R$ and $x \in X$.
(vii) $0 \cdot x = \theta$, $1 \cdot x = x$ for all $x \in X$.

We write $(-1)x = -x$, so that $x + (-x) = \theta$ by (v) and (vii). The set $Y \subseteq X$ is a (linear) subspace of $X$ if $x, y \in Y$ and $a, b \in R$ imply $ax + by \in Y$. An easy exercise shows that the element $\theta$ is unique. A subspace $Y$ of a vector space $X$ is itself a vector space with the inherited operations of addition and scalar multiplication.

4.2 Definition A norm on a vector space $X$ is a nonnegative real-valued function $\|\cdot\|: X \to R$ satisfying

(a) $\|x\| = 0$ iff $x = \theta$,
(b) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality),
(c) $\|ax\| = |a|\,\|x\|$.

A normed vector space $(X, \|\cdot\|)$ is a metric space if we define the metric $d$ by $d(x, y) = \|x - y\|$ (exercise). If the normed vector space is complete in this metric it is called a Banach space. A subspace $Y \subseteq X$ is closed if it is closed in the topology defined by the norm. The reader should easily be able to prove that the norm function $\|\cdot\|: X \to R$ is continuous when $X$ has the topology induced by $d$. Note also that a closed subspace of a Banach space is complete (Corollary 3.15) and hence a Banach space.

† Much of this and the succeeding section obtains (with some obvious modifications) if the real numbers are replaced by complex numbers in the definition of vector space.

4.3 Examples

1. $R^n$ can be made into a vector space in the following standard way: If $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, and $a \in R$ we define $x + y = (x_1 + y_1, \dots, x_n + y_n)$, $ax = (ax_1, \dots, ax_n)$, and $\theta = (0, 0, \dots, 0)$. $R^n$ is a normed space under each of the following definitions of a norm (exercise):

(a) $\|x\|_1 = \sum_{i=1}^n |x_i|$,
(b) $\|x\|_\infty = \sup\{|x_i| : 1 \le i \le n\}$.

2. The space $l_1$. The space $R^\infty$ of infinite sequences of real numbers is a vector space with the following definitions of addition and scalar multiplication: If $x = (x_1, x_2, \dots)$, $y = (y_1, y_2, \dots)$, and $a \in R$, we define $x + y = (x_1 + y_1, x_2 + y_2, \dots)$ and $ax = (ax_1, ax_2, \dots)$ (check). Let $l_1$ be the set of elements $x = (x_1, x_2, \dots)$ in $R^\infty$ for which $\|x\|_1 = \sum_{i=1}^\infty |x_i|$ is finite. Then $l_1$ is a linear subspace of $R^\infty$ and $\|\cdot\|_1$ is a norm on $l_1$ (regarded as a vector space). For example, to check the triangle inequality 4.2(b) and the fact that $l_1$ is closed under $+$, we have (with $x = (x_1, x_2, \dots)$ and $y = (y_1, y_2, \dots)$)

\[ \sum_{i=1}^n |x_i + y_i| \le \sum_{i=1}^n |x_i| + \sum_{i=1}^n |y_i| \le \|x\|_1 + \|y\|_1, \]

and the results follow by taking the limit as $n \to \infty$ on the left. Properties 4.2(a) and 4.2(c) are immediate.

Finally we show that $l_1$ is complete and so is a Banach space. Let $(x^k : k \in N)$ be a Cauchy sequence in $l_1$ with $x^k = (x_1^k, x_2^k, \dots)$. Then given $\varepsilon > 0$ there is an $n \in N$ so that $\|x^k - x^l\|_1 \le \varepsilon$ if $k, l \ge n$. Since Cauchy sequences are bounded there exists a number $A$ so that $\|x^k\|_1 \le A$ for all $k \in N$. Let $\omega$ be an infinite integer; by transfer we have ${}^*\|x^\omega\|_1 \le A$. Now $|x_i^k| \le \|x^k\|_1$ for all $k$, and so by transfer $|x_i^\omega| \le A$. Let $x_i = \mathrm{st}(x_i^\omega)$. We will show that $x = (x_i) \in l_1$ and $(x^k)$ converges to $x$. For any $k$ and $L$ we have

\[ \sum_{i=1}^L |x_i^k| \le \|x^k\|_1 \le A, \]

and so by transfer

\[ \sum_{i=1}^L |x_i| \simeq \sum_{i=1}^L |x_i^\omega| \le A. \]

This shows that $x \in l_1$. Finally, for any $k$, $l$, and $L$,

\[ \sum_{i=1}^L |x_i - x_i^l| \le \sum_{i=1}^L |x_i - x_i^k| + \|x^k - x^l\|_1. \]

By transfer, with $k = \omega$, we have

\[ \sum_{i=1}^L |x_i - x_i^l| \le \underbrace{\sum_{i=1}^L |x_i - x_i^\omega|}_{\text{infinitesimal}} + {}^*\|x^\omega - x^l\|_1. \]

The right-hand side is $< 2\varepsilon$ if $l \ge n$. Since this is true for any $L \in N$, we conclude that $\|x - x^l\|_1 \le 2\varepsilon$ if $l \ge n$.

3. The space $l_\infty$ is a Banach space under the norm defined by $\|x\|_\infty = \sup\{|x_i| : i \in N\}$, where $x = (x_1, x_2, \dots)$ (Exercise III.3.14).

4. The space $c_0$. The space $c_0$ consists of those $x = (x_i : i \in N) \in l_\infty$ for which $\lim_{i \to \infty} x_i = 0$. It is easy to see that $c_0$ is a closed linear subspace of $l_\infty$ and hence a Banach space.

5. The spaces $B(S)$ and $C(S)$. Let $S$ be an arbitrary set. We denote by $B(S)$ the set of all bounded functions on $S$. Then $B(S)$ is a vector space with the usual definitions of addition and scalar multiplication of functions, that is, if $f, g \in B(S)$ and $a \in R$, we put $(f + g)(x) = f(x) + g(x)$ and $(af)(x) = af(x)$ for $x \in S$; we take $\theta$ to be the function that is identically zero. $B(S)$ is a Banach space under the norm defined by $\|f\|_\infty = \sup\{|f(x)| : x \in S\}$ (Exercise 3). If $S$ is a topological space we define $C(S)$ to be the subset of $B(S)$ consisting of continuous functions. Then $C(S)$ is a closed subspace of $B(S)$ (Exercise 4), and hence a Banach space.
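The norm axioms of Definition 4.2 for the two norms on $R^n$ in Example 4.3.1 can be spot-checked numerically. The following Python sketch is an illustration only: the helper names, the dimension, the sample ranges, and the tolerance are assumptions, and random sampling is a sanity check rather than a proof.

```python
import random


def norm_1(x):
    # ||x||_1 = sum of |x_i|
    return sum(abs(t) for t in x)


def norm_inf(x):
    # ||x||_inf = sup of |x_i|
    return max(abs(t) for t in x)


def add(x, y):
    return [a + b for a, b in zip(x, y)]


def scale(a, x):
    return [a * t for t in x]


def check_norm_axioms(norm, dim=5, trials=1000, tol=1e-12, seed=0):
    """Sample random vectors and test 4.2(b), 4.2(c), and norm(theta) = 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        a = rng.uniform(-3.0, 3.0)
        # (b) triangle inequality, up to rounding error
        if norm(add(x, y)) > norm(x) + norm(y) + tol:
            return False
        # (c) absolute homogeneity, up to rounding error
        if abs(norm(scale(a, x)) - abs(a) * norm(x)) > tol:
            return False
    # one direction of (a): the zero vector has norm zero
    return norm([0.0] * dim) == 0.0


if __name__ == "__main__":
    print(check_norm_axioms(norm_1), check_norm_axioms(norm_inf))
```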

Let $(X, \|\cdot\|)$ be a normed space. From now on we will follow the usual convention of denoting the *-transform of the norm $\|\cdot\|$ on ${}^*X$ by $\|\cdot\|$ rather than ${}^*\|\cdot\|$; the context will clear up any possible confusion. We see immediately that the (norm) monad of a point $x \in {}^*X$ is the set $m(x) = \{y \in {}^*X : \|y - x\| \simeq 0\}$. It is also almost immediate that $m(x) = \{y \in {}^*X : y = x + z,\ z \in m(\theta)\}$, so that all monads are translates of the monad about zero (Exercise 5). The finite points in ${}^*X$ (Definition 3.3) are those $x \in {}^*X$ for which $\|x\|$ is finite. Next we come to the basic notion of linear operator.

4.4 Definition Let $X$ and $Y$ be vector spaces. A map $T: X \to Y$ is called a linear operator if $T(ax + by) = aTx + bTy$ for all $a, b \in R$ and $x, y \in X$. The set of all such linear operators is denoted by $L(X, Y)$. Let $X$ and $Y$ be normed vector spaces. (Since there is no possibility of confusion we denote the norms and zeros on both by $\|\cdot\|$ and $\theta$, respectively.) A linear operator $T: X \to Y$ is bounded if the number $\|T\| = \sup\{\|Tx\| : \|x\| \le 1\}$ is finite. This number is called the norm of $T$. Then $\|Tx\| \le \|T\|\,\|x\|$ for all $x \in X$ (check). The set of all bounded linear operators $T: X \to Y$ is denoted by $B(X, Y)$. If $Y = R$ (with the usual operations of addition and multiplication and usual norm) then a linear operator $T$ is called a linear functional. In what follows, we will often write $x$ and $T(x)$ for the nonstandard extensions ${}^*x$ and ${}^*T(x)$ of $x$ and $T(x)$; ${}^*x$ and ${}^*T(x)$ may, however, have nonstandard elements.

4.5 Example Define a map $T: l_1 \to l_1$ as follows: if $x = (x_1, x_2, x_3, \dots)$ then $Tx = (0, x_1, x_2, \dots)$. Then $T$ is linear, one-to-one, and bounded (in fact $\|Tx\| = \|x\|$ for all $x \in l_1$). However, $T$ does not map $l_1$ onto $l_1$.

4.6 Theorem (Robinson) Let $T \in L(X, Y)$, where $X$ and $Y$ are normed spaces. The following are equivalent:

(i) $T$ is bounded.
(ii) ${}^*T: {}^*X \to {}^*Y$ takes finite points to finite points.
(iii) ${}^*T$ takes the monad of $\theta$ into the monad of $\theta$.
(iv) ${}^*T$ takes near-standard points to near-standard points. In fact, if $z \in {}^*X$ is near $x \in X$ then ${}^*Tz$ is near $Tx$.

Proof: (i) $\Rightarrow$ (ii): Suppose $\|Tx\| \le M\|x\|$ for all $x \in X$. By transfer $\|{}^*Tx\| \le M\|x\|$ for all $x \in {}^*X$ and (ii) follows.

(ii) $\Rightarrow$ (iii): Proceed by contradiction. Suppose $x \in m(\theta)$ but $\|{}^*Tx\| \not\simeq 0$. Then the element $z = x/\|x\| \in {}^*X$ is finite with norm 1 (here and in the following we use freely the transfers of the properties in Definitions 4.1, 4.2, and 4.4) but ${}^*Tz = (1/\|x\|)\,{}^*Tx$ is not finite since $\|x\| \simeq 0$ but $\|{}^*Tx\| \not\simeq 0$.

(iii) $\Rightarrow$ (iv): Let $x \in X$ and $z \in m(x)$, so $x - z \in m(\theta)$. Then ${}^*T(x - z) = Tx - {}^*Tz \in m(\theta)$, so ${}^*Tz$ is near $Tx$.

(iv) $\Rightarrow$ (i): Proceed by contradiction. If $T$ is not bounded then there exists a sequence $(x_n \in X : n \in N)$ so that $\|x_n\| = 1$ but $\|Tx_n\| > n$ for $n \in N$ (check). Then $\|{}^*Tx_\omega\|$ is infinite for some infinite natural number $\omega$. Now $z = x_\omega/\sqrt{\|{}^*Tx_\omega\|}$ is near-standard since it belongs to $m(\theta)$, but $\|{}^*Tz\| = \sqrt{\|{}^*Tx_\omega\|}$ is not finite, so ${}^*Tz$ cannot be near-standard. □

It is easy to see that a linear operator is continuous if and only if it is continuous at $\theta$ (Exercise 6). Therefore we have the following result.

4.7 Corollary $T \in L(X, Y)$ is bounded iff it is continuous.

Proof: Use 4.6(iii) and 1.15. □

4.8 Corollary If $T \in B(X, Y)$, then the null space $N(T) = \{x \in X : Tx = \theta\}$ is a closed linear subspace of $X$.

Proof: Exercise. □
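The operator norm of Definition 4.4 can be made concrete in finite dimensions. The sketch below is an illustration under stated assumptions, not part of the text: it takes $X = R^3$ and $Y = R^2$ with the sup norm, a hypothetical matrix, and helper names of our choosing. Since $x \mapsto \|Tx\|_\infty$ is convex, its supremum over the unit ball is attained at a vertex (a vector of signs), and the result agrees with the largest absolute row sum.

```python
from itertools import product


def apply_matrix(matrix, x):
    # T x for the linear operator T given by a matrix (list of rows)
    return [sum(a * t for a, t in zip(row, x)) for row in matrix]


def sup_norm(x):
    return max(abs(t) for t in x)


def operator_norm_by_vertices(matrix):
    """sup{ ||Tx||_inf : ||x||_inf <= 1 }, evaluated at the vertices of the cube."""
    n = len(matrix[0])
    return max(
        sup_norm(apply_matrix(matrix, signs))
        for signs in product((-1.0, 1.0), repeat=n)
    )


def max_abs_row_sum(matrix):
    # Closed form of the same operator norm for the sup norm
    return max(sum(abs(a) for a in row) for row in matrix)


if __name__ == "__main__":
    T = [[1.0, -2.0, 0.0], [3.0, 0.5, -1.0]]  # hypothetical example
    print(operator_norm_by_vertices(T), max_abs_row_sum(T))
    x = [0.2, -0.7, 0.4]
    # ||Tx|| <= ||T|| ||x||
    print(sup_norm(apply_matrix(T, x)) <= max_abs_row_sum(T) * sup_norm(x))
```

In finite dimensions every linear operator is bounded, so this example shows only how $\|T\|$ is computed; unbounded operators require infinite-dimensional spaces.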


One of the most important results concerning bounded linear operators on Banach spaces is the uniform boundedness theorem. The proof is entirely standard.

4.9 Uniform Boundedness Theorem Let $X$ be a Banach space, $Y$ a normed vector space, and $\mathcal{F} \subset L(X, Y)$ a family of bounded linear operators. Suppose that for each $x \in X$ there is a constant $M_x$ so that $\|Tx\| \le M_x$ for all $T \in \mathcal{F}$. Then there is a constant $M$ so that $\|T\| \le M$ for all $T \in \mathcal{F}$, i.e., the operators in $\mathcal{F}$ are uniformly bounded.

Proof: Suppose that $T \in L(X, Y)$. Note that if $\|Tx\| \le M$ for all $x$ in the closed ball $B_\varepsilon(x_0) = \{x \in X : \|x - x_0\| \le \varepsilon\}$ then $T$ is bounded and $\|T\| \le 2M/\varepsilon$. The proof of this fact is left to the reader.

Now we proceed by contradiction. Let $x_0 \in X$ and $\varepsilon_0 > 0$ be given. Then there is an $x_1 \in B_{\varepsilon_0}(x_0)$ and a $T_1 \in \mathcal{F}$ so that $\|T_1 x_1\| > 1$. For otherwise $\|Tx\| \le 1$ for all $x \in B_{\varepsilon_0}(x_0)$ and all $T \in \mathcal{F}$, and then $\|T\| \le 2/\varepsilon_0$ for all $T \in \mathcal{F}$ by the remark in the first paragraph. By continuity we can find an $\varepsilon_1$ with $0 < \varepsilon_1 < 1/2$, and $B_{\varepsilon_0}(x_0) \supseteq B_{\varepsilon_1}(x_1)$ so that $\|T_1 x\| > 1$ for all $x \in B_{\varepsilon_1}(x_1)$. Inductively we can find a sequence $\{B_{\varepsilon_n}(x_n) : n \in N\}$ with $B_{\varepsilon_n}(x_n) \supseteq B_{\varepsilon_{n+1}}(x_{n+1})$ and $\lim \varepsilon_n = 0$, and a sequence $T_n \in \mathcal{F}$ so that $\|T_n x\| > n$ for all $x \in B_{\varepsilon_n}(x_n)$. Now $(x_n)$ is a Cauchy sequence since $\lim \varepsilon_n = 0$. Let $x \in X$ be the limit of $(x_n)$ (here we use the completeness of $X$). Then $x \in B_{\varepsilon_n}(x_n)$ for every $n$, so $\|T_n x\| > n$ for all $n$, contradicting the assumption. □

As a corollary we can prove the following result.

4.10 Theorem Let $X$ be a Banach space and $Y$ a normed vector space, and suppose that $(T_n : n \in N)$ is a sequence in $B(X, Y)$ such that for each $x \in X$ there is an element $y_x$ with $\lim_{n \to \infty} T_n x = y_x$ (limit in norm). Then the mapping $T$ given by $Tx = y_x$ is in $B(X, Y)$.

Proof: An easy exercise shows that the map $T: X \to Y$ is linear. Since $\|\cdot\|$ is a continuous function, $\lim_{n \to \infty} \|T_n x\| = \|Tx\|$, and thus for each $x$ there exists an $M_x$ so that $\|T_n x\| \le M_x$ for all $n$. By the uniform boundedness theorem there is an $M \in N$ with $\|T_n\| \le M$ for all $n \in N$, so $\|Tx\| = \lim \|T_n x\| \le M\|x\|$ and $T$ is bounded. □

Next we study an important class of bounded linear operators, the compact operators. These operators occur in many applications. There is an extensive analysis of equations in Banach spaces involving these operators; it is called the Fredholm theory.



4.11 Definition Let $X$ and $Y$ be normed vector spaces. An operator $T \in L(X, Y)$ is compact if $\overline{T[B]}$ is compact for every norm-bounded set $B \subset X$.

4.12 Theorem (Robinson) $T \in L(X, Y)$ is compact iff ${}^*T$ takes finite points to near-standard points.

Proof: Suppose $T$ is compact and let $x \in {}^*X$ be finite, i.e., $\|x\| < M$ for some standard $M > 0$. The ball $B = \{x \in X : \|x\| \le M\}$ is bounded and so $\overline{T[B]}$ is compact. Thus every point of ${}^*(T[B]) = {}^*T[{}^*B]$ is near-standard by Robinson's theorem, 2.2. Since $x \in {}^*B$ we conclude that ${}^*Tx$ is near-standard. Conversely, suppose that ${}^*T$ maps finite points into near-standard points, and let $B$ be a bounded set. By Theorem 2.4 we need only show that $T[B] \subseteq K$ for some compact set $K$. Let $K = \{y \in Y : y \simeq y' \text{ for some } y' \in {}^*(T[B])\} = \mathrm{st}({}^*T[{}^*B])$. Then $T[B] \subset K$ and $K$ is compact by Exercise III.3.11. □

We see immediately from 4.6 and 4.12 that compact operators are bounded. Theorem 4.12 can be used to establish the compactness of many operators, as the following example shows.

4.13 Example: Integral Operators (Robinson [42, Theorem 7.1.7]) Let $T: C([0, 1]) \to C([0, 1])$ be defined by

\[ Tf(x) = \int_0^1 K(x, y) f(y)\, dy, \]

where $K(x, y)$ is a continuous function on $[0, 1] \times [0, 1]$. The reader should check that $T$ is a linear operator. To show that $Tf$ is continuous notice that if $|f(x)| \le M$ for all $x \in [0, 1]$ then

\[ (4.1) \qquad |Tf(x) - Tf(y)| \le \int_0^1 |K(x, t) - K(y, t)|\,|f(t)|\, dt \le M \max\{|K(x, t) - K(y, t)| : (x, t), (y, t) \in [0, 1] \times [0, 1]\}, \]

and $\max|K(x, t) - K(y, t)|$ can be made as small as desired if $|x - y|$ is sufficiently small by the uniform continuity of $K(x, t)$. Also note that $|K(x, t)| \le K$ for all $(x, t) \in [0, 1] \times [0, 1]$ for some constant $K$, and so, for any $x \in [0, 1]$,

\[ (4.2) \qquad |Tf(x)| \le K \max\{|f(t)| : t \in [0, 1]\}. \]

To show that $T$ is compact we need to show that ${}^*Tf$ is near-standard for each finite $f$. Let $f \in {}^*C([0, 1])$ be finite. This means that there is a finite standard $M$ so that $|f(t)| \le M$ for all $t \in {}^*[0, 1]$.


From the transfer of (4.2) we see that $|{}^*Tf(x)| \le KM$ for all $x \in {}^*[0, 1]$, i.e., ${}^*Tf$ is finite, and we may define a function $\psi$ on $[0, 1]$ by $\psi(x) = \mathrm{st}({}^*Tf(x))$, $x \in [0, 1]$. To complete the proof we will show that $\psi$ is continuous and ${}^*Tf$ is near ${}^*\psi$. From the transfer of (4.1) we have $|{}^*Tf(x) - {}^*Tf(y)| \le M \max\{|{}^*K(x, t) - {}^*K(y, t)| : (x, t), (y, t) \text{ in } {}^*[0, 1] \times {}^*[0, 1]\}$. Thus ${}^*Tf(x) \simeq {}^*Tf(y)$ whenever $x, y \in {}^*[0, 1]$ and $x \simeq y$ by the uniform continuity of $K(x, t)$ (Theorem 10.10 and Proposition 10.8 of Chapter I). Let $\varepsilon > 0$ be a fixed standard real, and let $D = \{\delta \in {}^*R,\ \delta > 0 : x, y \in {}^*[0, 1] \text{ and } |x - y| < \delta \text{ implies } |{}^*Tf(x) - {}^*Tf(y)| < \varepsilon\}$. Then $D$ contains all positive infinitesimals by the above remark, and so contains a standard $\delta > 0$ by Corollary 7.2(iii) of Chapter II. Now if $x, y \in [0, 1]$ then

\[ |\psi(x) - \psi(y)| \le |\psi(x) - {}^*Tf(x)| + |{}^*Tf(x) - {}^*Tf(y)| + |{}^*Tf(y) - \psi(y)|. \]

The first and last terms are infinitesimal, so that $|\psi(x) - \psi(y)| < 2\varepsilon$ if $|x - y| < \delta$, where $\delta$ is chosen as above; thus $\psi$ is continuous. To show that ${}^*\psi$ is near ${}^*Tf$ notice that ${}^*\psi(x) \simeq {}^*Tf(x)$ for all standard $x$ by the definition of $\psi$ and the fact that ${}^*\psi$ is an extension of $\psi$. If $x \in {}^*[0, 1]$ then

\[ |{}^*Tf(x) - {}^*\psi(x)| \le |{}^*Tf(x) - {}^*Tf({}^\circ x)| + |{}^*Tf({}^\circ x) - {}^*\psi({}^\circ x)| + |{}^*\psi({}^\circ x) - {}^*\psi(x)|, \]

and all terms on the right are infinitesimal by the preceding remarks and the continuity of $\psi$.

A word of caution here. The reader may think that the above proof is needlessly complicated since we could replace $\psi$ by $\psi(x) = \mathrm{st}(Tf(x))$ for all $x$ in ${}^*[0, 1]$ rather than $[0, 1]$, in which case it would be obvious that $\psi$ is near $Tf$. Unfortunately the $\psi$ defined this way is usually external and thus not a standard element in ${}^*C([0, 1])$. Notice also that an internal finite $f \in {}^*C([0, 1])$ can be quite wild; e.g., $f(x) = \sin \omega x$, where $\omega$ is infinite.

The set of bounded linear operators can be made into a linear space in an obvious way. If $T, S \in B(X, Y)$ and $a \in R$ we define $(T + S)(x) = T(x) + S(x)$ and $(aT)(x) = aT(x)$. It is then not hard to see that the operator norm on $B(X, Y)$ makes $B(X, Y)$ into a normed vector space (Exercise 8).
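Returning to Example 4.13, the estimates (4.1) and (4.2) can be observed numerically for a rapidly oscillating $f$. The Python sketch below is an illustration only and makes several assumptions not in the text: the kernel $K(x, y) = \cos(x - y)$, the test function $\sin(10^4 t)$ as a finite stand-in for $\sin \omega x$ with $\omega$ infinite, and a midpoint-rule approximation of the integral. For this kernel $|K| \le 1$ and $|K(x, t) - K(y, t)| \le |x - y|$, so with $M = 1$ the bounds read $|Tf(x)| \le 1$ and $|Tf(x) - Tf(y)| \le |x - y|$.

```python
import math


def midpoints(n):
    # Midpoints of n equal subintervals of [0, 1]
    return [(j + 0.5) / n for j in range(n)]


def apply_integral_operator(kernel, f, x, n=2000):
    """Midpoint-rule approximation of (Tf)(x) = integral_0^1 K(x, y) f(y) dy."""
    return sum(kernel(x, t) * f(t) for t in midpoints(n)) / n


def kernel(x, y):
    # Hypothetical continuous kernel on [0, 1] x [0, 1]
    return math.cos(x - y)


def wild(t, omega=1.0e4):
    # Bounded by M = 1 but oscillating rapidly
    return math.sin(omega * t)


if __name__ == "__main__":
    tf_a = apply_integral_operator(kernel, wild, 0.30)
    tf_b = apply_integral_operator(kernel, wild, 0.31)
    print(abs(tf_a) <= 1.0)           # discrete analogue of (4.2)
    print(abs(tf_a - tf_b) <= 0.01 + 1e-12)   # discrete analogue of (4.1)
```

Both bounds hold exactly for the Riemann sums as well, because the quadrature weights are nonnegative and sum to 1; the oscillation of $f$ plays no role, which is the point of the equicontinuity estimate.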

4.14 Theorem Let $X$ be a normed vector space and $Y$ a Banach space. Then the normed vector space $B(X, Y)$ is complete and hence a Banach space. The set of compact operators in $B(X, Y)$ forms a closed linear subspace.

Proof: Let $(T_n \in B(X, Y) : n \in N)$ be a Cauchy sequence. Then, for each $x \in X$, $T_n x$ is a Cauchy sequence and hence converges to an element $y_x$ by completeness of $Y$. We define $T$ by $Tx = \lim T_n x$. Then $T$ is linear (check) and bounded since $\lim \|T_n\| = \|T\|$ (check). Finally, we show that $T_n$ converges to $T$ in norm. For given $\varepsilon > 0$ there is an $N$ so that $\|T_n x - T_m x\| \le \|T_n - T_m\|\,\|x\| < \varepsilon\|x\|$ if $n, m \ge N$. Thus $\|T_n x - Tx\| \le \varepsilon\|x\|$ if $n \ge N$, and so $\|T_n - T\| \le \varepsilon$ for $n \ge N$, and we are through.

An easy exercise shows that the set of compact operators is a linear subspace of $B(X, Y)$. To show that it is closed, let $(T_n)$ be a sequence of compact operators converging to an operator $T \in B(X, Y)$. If $y \in {}^*X$ is finite then it belongs to ${}^*B$, where $B = \{x \in X : \|x\| \le M\}$ for some standard real $M > 0$. Now note that $T_n x$ converges to $Tx$ uniformly on the ball $B$, i.e., for any $\varepsilon > 0$ there is an $m(\varepsilon) \in N$ so that $\|T_n x - Tx\| < \varepsilon$ for all $n \ge m(\varepsilon)$ and all $x \in B$. Thus $\|{}^*T_{n_0} x - {}^*Tx\| < \varepsilon/2$ for $n_0 \ge m(\varepsilon/2)$ in $N$ and all $x \in {}^*B$. Since $T_{n_0}$ is compact, ${}^*T_{n_0} y$ is near a standard $z \in Y$ and so $\|{}^*Ty - z\| < \varepsilon$ by the triangle inequality. Since $\varepsilon$ is arbitrary, ${}^*Ty$ is pre-near-standard. Since $Y$ is complete, it follows from Proposition 3.14 that ${}^*Ty$ is near-standard. □

The standard proof of the closedness of the set of compact operators usually involves the selection of infinite subsequences with certain desirable properties.

4.15 Corollary The space of bounded linear functionals on a normed vector space $X$ is a Banach space.

The Banach space of this corollary is used sufficiently often for us to introduce some notation.

4.16 Definition The Banach space of bounded linear functionals on a normed linear space $X$ is called the dual space of $X$ and is denoted by $X'$. The dual of $X'$ is denoted by $X''$ and is called the second dual of $X$. Similarly for $X'''$, etc.

It is sometimes difficult to characterize the dual of a given Banach space, but the following example is an easy case.

4.17 Example: $l_1' = l_\infty$ Our aim is to define a mapping $T: l_\infty \to l_1'$, which is linear, 1-1, onto, and satisfies $\|Ty\| = \|y\|_\infty$ for $y \in l_\infty$. Let $y = (y_i : i \in N) \in l_\infty$ and define $Ty: l_1 \to R$ by $Ty(x) = \sum_{i=1}^\infty x_i y_i$ for $x = (x_i) \in l_1$. Then $Ty$ is linear, and

\[ |Ty(x)| \le \sup\{|y_i| : i \in N\} \sum_{i=1}^\infty |x_i| = \|y\|_\infty \|x\|_1, \]

so $Ty$ is a bounded linear functional on $l_1$ with $\|Ty\| \le \|y\|_\infty$. We next show that $\|Ty\| \ge \|y\|_\infty$. We may assume $\|y\|_\infty > 0$. Given a positive $\varepsilon < \|y\|_\infty$, there is an $n_0$ so that $|y_{n_0}| > \|y\|_\infty - \varepsilon$. Now define $x = (x_i) \in l_1$ by $x_i = 0$ for $i \ne n_0$ and $x_{n_0} = y_{n_0}/|y_{n_0}|$. Then $\|x\|_1 = 1$ and $|Ty(x)| = |y_{n_0}| > \|y\|_\infty - \varepsilon$, so $\|Ty\| \ge \|y\|_\infty$. We also see that $T$ is 1-1, since if $Ty = \theta$ then $\|y\|_\infty = 0$ so $y = \theta$.

It only remains to show that $T$ is onto. Let $f \in l_1'$. If $e^n \in l_1$ is defined by $e^n = (\delta_i^n)$, where $\delta_i^n = 0$ if $i \ne n$ and $\delta_n^n = 1$, then $\|e^n\|_1 = 1$ for all $n \in N$. Put $f(e^n) = y_n \in R$. Then $|y_n| \le \|f\|$, and so $y = (y_i) \in l_\infty$. Now the functional $Ty$ attached to $y$ as in the first paragraph agrees with $f$ on the elements $e^n$. A simple limiting argument (check) shows that $Ty = f$, and so $l_1' = l_\infty$.
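The two inequalities of Example 4.17 can be seen on finitely supported sequences. The Python sketch below is a truncated illustration under stated assumptions: sequences are represented by finite lists of equal length, the function names are ours, and because a finite list attains its supremum the $\varepsilon$ of the text is not needed.

```python
def pairing(y, x):
    """T_y(x) = sum_i x_i y_i for finite lists x, y of equal length."""
    return sum(a * b for a, b in zip(x, y))


def norm_1(x):
    return sum(abs(t) for t in x)


def norm_inf(y):
    return max(abs(t) for t in y)


def norming_vector(y):
    """A vector x with ||x||_1 = 1 and |T_y(x)| = |y_{n0}|, |y_{n0}| maximal."""
    n0 = max(range(len(y)), key=lambda i: abs(y[i]))
    x = [0.0] * len(y)
    x[n0] = 1.0 if y[n0] >= 0 else -1.0  # plays the role of y_{n0}/|y_{n0}|
    return x


if __name__ == "__main__":
    y = [0.5, -2.0, 1.25]  # hypothetical element, truncated to three terms
    x = norming_vector(y)
    print(norm_1(x), pairing(y, x), norm_inf(y))
    z = [1.0, 1.0, 1.0]
    # |T_y(z)| <= ||y||_inf ||z||_1
    print(abs(pairing(y, z)) <= norm_inf(y) * norm_1(z))
```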

In the case of a general normed vector space $X$, it is not at all obvious that $X'$ contains any elements other than $\theta$. The following result, which is basic to the study of duality, shows that $X'$ always contains many elements.

4.18 Hahn-Banach Theorem Let $X$ be a vector space and suppose that a given function $p: X \to R$ satisfies $p(x + y) \le p(x) + p(y)$ and $p(ax) = ap(x)$ for each $a \ge 0$ in $R$ and $x, y \in X$. Suppose that $f$ is a linear functional defined on a subspace $S$ of $X$ with $f(x) \le p(x)$ for all $x \in S$. Then there is a linear functional $F$ on $X$ which extends $f$ [i.e., $F(x) = f(x)$ for all $x \in S$] and satisfies $F(x) \le p(x)$ for $x \in X$.

Proof: Let $g$ and $h$ be linear functionals, each defined on a linear subspace of $X$. We say that $g$ extends $h$ and write $h \prec g$ if the domain of $g$ contains the domain of $h$ and $g = h$ on $\mathrm{dom}\, h$. The relation $\prec$ partially orders the set of linear functionals. Consider the set of all extensions $g$ of $f$ which satisfy $g(x) \le p(x)$, for $x$ in the domain of $g$. Applying Zorn's lemma (see the Appendix) to this set, partially ordered by $\prec$, we see that there is a maximal extension $F$. We need only show that the domain $X_0$ of $F$ is all of $X$. Suppose this is not the case, i.e., there is a vector $y$ in $X$ but not in $X_0$. Then $F$ may be extended to a functional $g$ on the subspace $Z \supset X_0$ consisting of elements of the form $ay + x_0$, $x_0 \in X_0$, $a \in R$, by putting $g(ay + x_0) = ag(y) + F(x_0)$. Now $g$ is specified uniquely by $g(y)$, and we need to show that $g(y)$ can be chosen so that $g(x) \le p(x)$ for all $x \in Z$ in order to get a contradiction. For $x_1, x_2 \in X_0$ we have $F(x_2) - F(x_1) = F(x_2 - x_1) \le p(x_2 - x_1) \le p(x_2 + y) + p(-y - x_1)$, which yields $-p(-y - x_1) - F(x_1) \le p(x_2 + y) - F(x_2)$. Since the left is independent of $x_2$ and the right is independent of $x_1$ there is a constant $c \in R$ so that

(i) $c \le p(x_2 + y) - F(x_2)$,
(ii) $-p(-y - x_1) - F(x_1) \le c$

for all $x_1, x_2 \in X_0$. We now put $g(y) = c$. Then for $x = ay + x_0 \in Z$ the inequality $g(x) = g(ay + x_0) = ac + F(x_0) \le p(ay + x_0)$ follows by replacing $x_2$ by $x_0/a$ in (i) if $a > 0$ and $x_1$ by $x_0/a$ in (ii) if $a < 0$. □
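The one-step extension in the proof can be traced in a small case. The Python sketch below is an illustration under assumptions of our own choosing: $X = R^2$, $p$ the $l_1$ norm, $X_0 = \{(s, 0)\}$ with $F(s, 0) = s$, and $y = (0, 1)$. It estimates the interval of admissible values $c = g(y)$ from conditions (i) and (ii) by sampling $X_0$ on a grid; in general such sampling only gives an outer estimate of the true interval, though here the extremes are attained at $s = 0$.

```python
def p(v):
    # Sublinear functional: the l_1 norm on R^2 (an assumed example)
    return abs(v[0]) + abs(v[1])


def F(s):
    # F on X_0 = {(s, 0)}: F((s, 0)) = s, so F <= p on X_0
    return s


def admissible_interval(y, grid):
    """Estimate [sup of left side of (ii), inf of right side of (i)] over sampled X_0."""
    lower = max(-p((-y[0] - s, -y[1])) - F(s) for s in grid)
    upper = min(p((s + y[0], y[1])) - F(s) for s in grid)
    return lower, upper


def extended_functional(a, s, c):
    # g(a*y + (s, 0)) = a*c + F((s, 0))
    return a * c + F(s)


if __name__ == "__main__":
    grid = [k * 0.5 for k in range(-10, 11)]
    y = (0.0, 1.0)
    lower, upper = admissible_interval(y, grid)
    print(lower, upper)
    c = 0.5  # any value in [lower, upper] should work
    ok = all(
        extended_functional(a, s, c) <= p((s + a * y[0], a * y[1])) + 1e-12
        for a in grid
        for s in grid
    )
    print(ok)
```

The interval is not a single point in this example, which reflects the fact that Hahn-Banach extensions need not be unique.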

4.19 Corollary If $X$ is a normed vector space and $x \in X$, $x \ne \theta$, then there is an $x' \in X'$ so that $x'(x) = \|x\|$ and $\|x'\| = 1$.

Proof: Standard exercise. □

We now show that $X$ can be isometrically and isomorphically embedded in $X''$.

4.20 Theorem Let $X$ be a normed vector space and define a map $T: X \to X''$ by $Tx(x') = x'(x)$ for all $x' \in X'$. Then $T$ is a linear and norm-preserving embedding. If $X$ is a Banach space then $T[X]$ is a closed linear subspace of $X''$.

Proof: The reader should check that $T$ is linear. That $Tx$ is bounded (as we have implied in the statement of the theorem) follows since $|Tx(x')| = |x'(x)| \le \|x\|\,\|x'\|$, and we see that $\|Tx\| \le \|x\|$. The result will be established when we show that $\|Tx\| \ge \|x\|$. This is trivial if $x = \theta$, so suppose $x \ne \theta$. From Corollary 4.19 there exists an $x' \in X'$ so that $\|x'\| = 1$ and $x'(x) = \|x\|$. Thus $\|x\| = |x'(x)| = |Tx(x')| \le \|Tx\|\,\|x'\| = \|Tx\|$. The rest is left to the reader. □

Because of Theorem 4.20 we identify $X$ with $T[X]$ and regard $X$ as a subspace of $X''$ in the rest of this section without further explicit comment.

We end this section with a consideration of compactness properties in Banach spaces. We have seen in Example 3.22 that the closed unit ball in $l_\infty$ is not norm-compact. This situation turns out to be typical of all infinite-dimensional spaces. In fact one can prove that a closed ball in a Banach space is norm-compact iff the space is finite-dimensional [14, Theorem IV.3.5]. It follows that no set in an infinite-dimensional Banach space $X$ containing a closed ball can be norm-compact. Since this severely limits the sets which can be norm-compact we look for other topologies on a Banach space in which closed balls are compact.

4.21 Definition Let $X$ be a normed vector space. The weak topology on $X$ is the topology whose neighborhood system at a generic point $x \in X$ is generated by the subbase consisting of sets of the form $U(x; x', \varepsilon) = \{y \in X : |x'(y) - x'(x)| < \varepsilon\}$ for some $x' \in X'$. Let $X'$ be the dual space of a normed vector space $X$. The weak* topology on $X'$ is the topology whose neighborhood system at a generic point $x' \in X'$ is generated by the subbase consisting of sets of the form $V(x'; x, \varepsilon) = \{y' \in X' : |x(y') - x(x')| < \varepsilon\}$ for some $x \in X$ (regarded as embedded in $X''$).

Notice that in the definition of the subbase for the weak* topology we take only those $x \in X$ and not all $x'' \in X''$. This turns out to make a crucial difference. An easy exercise, which we leave to the reader, shows that the monads of points $x \in X$ and $x' \in X'$ in the weak and weak* topologies, respectively, are given by

\[ m_w(x) = \{y \in {}^*X : {}^*x'(y) \simeq {}^*x'({}^*x) = x'(x) \text{ for all (standard) } x' \in X'\}, \]
\[ m_{w^*}(x') = \{y' \in {}^*X' : {}^*x(y') \simeq {}^*x({}^*x') = x(x') \text{ for all (standard) } x \in X\}. \]

Using the Hahn-Banach theorem, we can show that the weak and weak* topologies are Hausdorff (exercise).

4.22 Alaoglu's Theorem The closed unit ball in $X'$ is compact in the weak* topology.

Proof: Let $B$ be the unit ball in $X'$. We must show that corresponding to every $y' \in {}^*B$ there is a point $x' \in B$ so that ${}^*x(y') \simeq x(x')$ for all $x \in X$. Fix $y' \in {}^*B$ and define a functional $x'$ on $X$ by $x'(x) = \mathrm{st}(y'({}^*x))$, $x \in X$. Then ${}^*x(y') \simeq x(x')$ for all standard $x \in X$. The linearity of $x'$ is obvious, and, finally, $x' \in B$ since $|x'(x)| \le {}^\circ(\|y'\|\,\|{}^*x\|) \le \|x\|$ by transfer ($y' \in {}^*B$ so $\|y'\| \le 1$). □

The same result can be proved for a ball of any radius and also follows directly from Theorems 4.22 and 2.6. We obtain as a consequence the following corollary.

4.23 Corollary A norm-bounded and weak*-closed subset of $X'$ is compact.

Proof: Use Theorems 4.22 and 2.4. □

One might expect a similar result to be true for subsets of $X$ in the weak topology. However, it turns out that the unit ball in $X$ is weakly compact iff $X$ is reflexive, which means that $X = X''$ [14, Theorem V.4.7]. Considering the importance of sequential compactness as emphasized in §III.3, we would like to know when the unit ball $B$ in a Banach space $X$ is weakly sequentially compact. A deep theorem due to Eberlein and Šmulian asserts that $B$ is weakly sequentially compact iff $B$ is weakly compact (iff $X$ is reflexive by the above remark). A nonstandard proof of this result can be found in [47].



4.24 Example We will show that the unit sphere in $l_\infty'$ is not weak* sequentially compact even though it is weak* compact by Alaoglu's theorem. Consider the sequence $e^n \in l_1$ (regarded as embedded in $l_1'' = l_\infty'$) defined by $e^n = (\delta_i^n : i \in N)$. Then $\|e^n\|_1 = 1$. Suppose that $(e^n)$ has a convergent subsequence $(e^{n_k})$. Define the element $x = (x_i : i \in N) \in l_\infty$ by $x_i = 1$ if $i = n_k$ and $k$ is even, and $x_i = 0$ otherwise. Then $e^{n_k}(x) = 1$ if $k$ is even, and 0 if $k$ is odd, so the sequence $(e^{n_k}(x))$ does not converge, i.e., $(e^{n_k})$ does not converge in the weak* topology. Note that by compactness (check) the sequence $e^n$ has a weak* limit point $y$, but we cannot select a convergent subsequence since the neighborhood system at $y$ does not have a countable base.
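The test element $x$ of Example 4.24 is easy to write down for any proposed subsequence. The Python sketch below is a finite illustration only: the candidate indices $n_k = k^2$ and the function names are assumptions, and only the first ten terms are inspected, whereas the argument in the text concerns the whole sequence.

```python
def evaluate(n, x):
    """e^n applied to x: picks out the n-th coordinate of x (1-based)."""
    return x(n)


def alternating_test_element(subsequence_indices):
    """The element x of l_infinity with x_i = 1 if i = n_k for even k, else 0."""
    marked = {n for k, n in enumerate(subsequence_indices, start=1) if k % 2 == 0}
    return lambda i: 1.0 if i in marked else 0.0


if __name__ == "__main__":
    indices = [k * k for k in range(1, 11)]  # hypothetical subsequence n_k = k^2
    x = alternating_test_element(indices)
    values = [evaluate(n, x) for n in indices]
    print(values)  # alternates between 0.0 and 1.0
```

Whatever increasing indices are supplied, the values alternate, so no subsequence of $(e^n)$ can converge when tested against the corresponding $x$.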

An extensive study of the structure of Banach spaces using nonstandard methods has been developed by Henson and Moore [16]. This study uses in an essential way the notion of the nonstandard hull of a Banach space. We present the definition of the nonstandard hull of a metric space in §III.6 to help the interested reader to understand these results.

Exercises III.4

1. Show that $d(x, y) = \|x - y\|$ is a metric.
2. Show that $\|\cdot\|_1$ and $\|\cdot\|_\infty$ are norms on $R^n$.
3. Show that $B(S)$ with the sup norm $\|\cdot\|_\infty$ is a Banach space.
4. Show that $C(S)$ is a closed subspace of $B(S)$ if $S$ is a topological space.
5. Show that for a normed space all monads are translates of the monad of zero.
6. Show that a linear operator is continuous if and only if it is continuous at $\theta$.
7. Prove Corollary 4.8.
8. Show that the operator norm on $B(X, Y)$ makes $B(X, Y)$ into a normed vector space.
9. Show that the set of compact operators is a linear subspace of $B(X, Y)$.
10. Show that the weak and weak* topologies are Hausdorff.
11. Discuss the relationship between Alaoglu's theorem and the Tychonoff product theorem.
12. Two norms on a space $X$ are equivalent if the corresponding metrics they define are equivalent.
   (a) Show that the norms $\|\cdot\|$ and $|||\cdot|||$ on $X$ are equivalent iff there exist positive (nonzero) constants $\alpha$ and $\beta$ in $R$ so that $\alpha\|x\| \le |||x||| \le \beta\|x\|$ for all $x \in X$.
   (b) Show that any two norms on $R^n$ are equivalent. (Hint: Show that any norm $\|\cdot\|$ is equivalent to $\|\cdot\|_1$. To do so you need only show that $\|x\|/\|x\|_1$ and $\|x\|_1/\|x\|$ are finite for all $x \ne \theta$ in ${}^*R^n$. Write $x = \sum x_i e_i$, and get estimates.)


13. Let $X$ be a vector space with a topology $\mathcal{T}$. $X$ is a topological vector space if both vector addition (as a map $X \times X \to X$) and scalar multiplication (as a map $R \times X \to X$) are continuous. Let $m(a)$ denote the monad of $a \in R$ and $\mu(x)$ denote the monad of $x \in X$. Show that if $X$ is a topological vector space (of more than one dimension) then
   (a) $\mu(x) + \mu(y) = \mu(x) + y = \mu(x + y) = x + y + \mu(\theta)$,
   (b) $m(a)x \subset m(a)\mu(x) = a\mu(x) = \mu(ax)$,
   (c) $\mathcal{T}$ is Hausdorff iff $\mu(\theta) \cap X = \{\theta\}$,
   (d) if $X$ is a topological vector space with topologies $\mathcal{T}_1$ and $\mathcal{T}_2$ having monads $\mu_1$ and $\mu_2$ then $\mathcal{T}_1 = \mathcal{T}_2$ iff $\mu_1(\theta) = \mu_2(\theta)$.

III.5 Inner-Product Spaces and Hilbert Spaces

In this section we consider those normed spaces and Banach spaces in which the norm is derived from an inner product. Most of the results and proofs of this section are standard. The canonical example of an inner product occurs in Euclidean space $R^n$, where the scalar product of $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ is $(x, y) = \sum_{i=1}^n x_i y_i$. The angle $\theta$ between two nonzero vectors $x$ and $y$ is given by the familiar formula $\cos\theta = (x, y)/\|x\|\,\|y\|$. The scalar product is generalized to vector spaces as follows.

5.1 Definition Let $H$ be a vector space. An inner product on $H$ is a map $(\cdot\,,\cdot)\colon H \times H \to R$ which satisfies (for all $x, y, z$ in $H$ and $a, b \in R$)

(i) $(x, y) = (y, x)$,
(ii) $(ax + by, z) = a(x, z) + b(y, z)$,
(iii) $(x, x) \ge 0$, and $(x, x) = 0$ iff $x = \theta$.

A vector space with an inner product is called an inner-product space. A norm on $H$ is obtained by setting $\|x\| = (x, x)^{1/2}$ (exercise). If $H$ is complete in this norm it is called a Hilbert space. To prove that $\|\cdot\|$ is a norm on $H$ one uses the following basic result.

5.2 Schwarz's Inequality For any $x, y$ in an inner-product space $H$, $|(x, y)| \le \|x\|\,\|y\|$.

Proof: Let $x$ and $y$ be given. For any real $\lambda$ we have $(x + \lambda y, x + \lambda y) = \|x\|^2 + 2\lambda(x, y) + \lambda^2\|y\|^2 \ge 0$. Thus the quadratic expression in $\lambda$ given by [...]
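The discriminant step of the Schwarz argument can be sketched as follows. This is the usual textbook completion, offered on the assumption that the proof proceeds through the quadratic in the real parameter; the names $q$ and $\lambda$ are our own notation.

```latex
% Sketch (assumed standard argument): a real quadratic that is nonnegative
% for every real \lambda cannot have two distinct real roots.
\begin{align*}
q(\lambda) &= \|y\|^2\lambda^2 + 2(x,y)\lambda + \|x\|^2 \;\ge\; 0
   \quad\text{for all real } \lambda, \\
% hence the discriminant of q is nonpositive
4(x,y)^2 - 4\|x\|^2\|y\|^2 &\le 0, \\
|(x,y)| &\le \|x\|\,\|y\| .
\end{align*}
```

If $\|y\| = 0$ the expression is linear in $\lambda$, and nonnegativity for all $\lambda$ forces $(x, y) = 0$, so the inequality holds trivially in that case as well.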


5.3 Corollary $(x, y)$ is continuous on $H \times H$ as a function of the variables $x, y \in H$.

Proof: Exercise. □

5.4 Examples

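1. In the linear space $R^n$ we define the inner product of $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ by $(x, y) = \sum_{i=1}^n x_i y_i$. The reader should check that this defines an inner product on $R^n$. From Schwarz's inequality,

$$\Bigl|\sum_{i=1}^n x_i y_i\Bigr| \le \Bigl(\sum_{i=1}^n x_i^2\Bigr)^{1/2}\Bigl(\sum_{i=1}^n y_i^2\Bigr)^{1/2}.$$

2. The space $l_2$. Let $l_2$ denote the space of all infinite sequences $x = (x_1, x_2, \ldots)$ for which $\sum_{i=1}^\infty x_i^2 < \infty$. If $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$ are two such sequences, we define $(x, y) = \sum_{i=1}^\infty x_i y_i$. To check that $(x, y)$ is finite for $x, y \in l_2$ we have, for each $n$,

$$\sum_{i=1}^n |x_i y_i| \le \Bigl(\sum_{i=1}^n x_i^2\Bigr)^{1/2}\Bigl(\sum_{i=1}^n y_i^2\Bigr)^{1/2} \le \Bigl(\sum_{i=1}^\infty x_i^2\Bigr)^{1/2}\Bigl(\sum_{i=1}^\infty y_i^2\Bigr)^{1/2},$$

and so $\sum_{i=1}^\infty |x_i y_i|$ converges. Using the fact that $(\sum_{i=1}^\infty x_i^2)^{1/2} = \|x\|$ is a norm, we can now easily check that $l_2$ is a linear space. We will see later that all separable Hilbert spaces are isomorphic to $l_2$.

Using the inner product, we can introduce a notion of orthogonality in an inner-product space.

5.5 Definition If $H$ is an inner-product space then $x$ and $y$ in $H$ are orthogonal if $(x, y) = 0$, in which case we write $x \perp y$. If $S \subseteq H$ then $S^\perp = \{x \in H : x \perp z \text{ for all } z \in S\}$.

5.6 Proposition For any $S \subseteq H$, $S^\perp$ is a closed linear subspace of $H$.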

Proof: Let $x, y \in S^\perp$ and $a, b \in R$. Then, for any $z \in S$, $(ax + by, z) = a(x, z) + b(y, z) = 0$, so $S^\perp$ is a linear subspace. To show closure, let $x \in {}^*(S^\perp)$ and $x \simeq y \in H$. Then $(y, z) \simeq (x, z) = 0$ for all $z \in S$ by the continuity of the inner product, and so $y \in S^\perp$. Thus $S^\perp$ is closed. □

Since the norm on an inner-product space $H$ is derived from the inner product, we might expect that it has some special properties. It turns out that it is completely characterized by the following law.

5.7 Parallelogram Law A normed space $(H, \|\cdot\|)$ is an inner-product space iff for all $x, y \in H$

$$\|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$

Proof: Suppose $H$ is an inner-product space. Then

$$\|x - y\|^2 + \|x + y\|^2 = (x - y, x - y) + (x + y, x + y)$$
$$= \|x\|^2 - (y, x) - (x, y) + \|y\|^2 + \|x\|^2 + (y, x) + (x, y) + \|y\|^2$$
$$= 2\|x\|^2 + 2\|y\|^2.$$

The converse, which we omit, sets $(x, y) = \tfrac14\{\|x + y\|^2 - \|x - y\|^2\}$. □
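As a consistency check on the formula used for the converse, the following sketch verifies that in a real inner-product space the polarization expression does recover the inner product. It assumes only the axioms of Definition 5.1 and the norm $\|x\| = (x,x)^{1/2}$; it does not prove the omitted converse, which must start from a norm alone.

```latex
% Sketch: expand both squared norms using bilinearity and symmetry.
\begin{align*}
\|x+y\|^2 &= (x+y,\,x+y) = \|x\|^2 + 2(x,y) + \|y\|^2, \\
\|x-y\|^2 &= (x-y,\,x-y) = \|x\|^2 - 2(x,y) + \|y\|^2, \\
\tfrac14\bigl\{\|x+y\|^2 - \|x-y\|^2\bigr\} &= \tfrac14\cdot 4\,(x,y) = (x,y).
\end{align*}
```

Adding the first two lines instead of subtracting them gives the parallelogram law itself.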

Using this simple result, we now establish a sequence of results which are fundamental to all further analysis of Hilbert spaces.

5.8 Definition A subset $K$ of a vector space $H$ is convex if whenever $x, y \in K$ then $ax + (1 - a)y \in K$ for all real $a \in [0, 1]$.

In the proof of the next result we use completeness in an essential way.

5.9 Theorem If $K$ is a closed convex subset of a Hilbert space $H$, then there is a unique element $x_0 \in K$ so that $\|x_0\| \le \|x\|$ for all $x \in K$, i.e., $K$ has a unique element of smallest norm.

Proof: Let $d = \inf\{\|x\| : x \in K\}$. Then for each $\delta > 0$ there is an $x \in K$ so that $d \le \|x\| < d + \delta$. By transfer, with $\delta$ infinitesimal, there is a $y \in {}^*K$ with $\|y\| \simeq d$. We now show that $y$ is near-standard. Since $K$ is complete by Corollary 3.15, it is enough to show that $y$ is pre-near-standard (see Proposition 3.14). Fix $\varepsilon > 0$ in $R$. By transfer from the parallelogram law,

(5.1) $\|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2$

for any $x \in K$. If $x \in K$ then since $y \in {}^*K$, $(x + y)/2 \in {}^*K$, so $\|x + y\|^2 = 4\|(x + y)/2\|^2 \ge 4d^2$. It follows from (5.1) that $\|x - y\|^2 \le 2\|x\|^2 + 2d^2 - 4d^2 + \eta = 2\|x\|^2 - 2d^2 + \eta$, where $\eta$ is infinitesimal and $x \in K$. But we can find an $x \in K$ so that $\|x\|^2 < d^2 + \varepsilon/4$, and we get $\|x - y\|^2 < \varepsilon/2 + \eta < \varepsilon$.


Thus $y$ is pre-near-standard, so $y$ is near some $x_0 \in H$. The point $x_0 \in K$ since $K$ is closed, and $\|x_0\| = d$ by the continuity of the norm. The uniqueness is another application of the parallelogram law (exercise). □

5.10 Theorem Let $E$ be a closed subspace of the Hilbert space $H$ with $E \ne H$. There are unique linear operators $P\colon H \to E$, $Q\colon H \to E^\perp$ so that $x = Px + Qx$ for all $x \in H$. Further, $Px = x$ iff $x \in E$ and $Qx = x$ iff $x \in E^\perp$. $P$ and $Q$ are called the projections of $H$ onto $E$ and $E^\perp$, respectively.

Proof: For $x \in H$ let $K = x + E = \{x + y : y \in E\}$. Then $K$ is convex and closed (check). Let $Qx$ be the unique element of smallest norm in $K$ (existing by 5.9), and put $Px = x - Qx$. Then it is clear that $x = Px + Qx$ and $Px \in E$. To show that $(Qx, z) = 0$ for all $z \in E$, we put $Qx = y$. Assuming without loss of generality that $\|z\| = 1$, we have

$$\|y\|^2 \le \|y - az\|^2 = (y - az, y - az) = \|y\|^2 - 2a(y, z) + |a|^2$$

for every $a \in R$, yielding $0 \le -2a(y, z) + |a|^2$. If $a = (y, z)$ this gives $0 \le -|(y, z)|^2$, and so $(y, z) = 0$. The uniqueness of $P$ and $Q$ follows from the fact that $E \cap E^\perp = \{\theta\}$. For if $x = x_1 + x_2$ with $x_1 \in E$, $x_2 \in E^\perp$, then $x_1 - Px = Qx - x_2$ and $x_1 - Px \in E$, $Qx - x_2 \in E^\perp$, so $x_1 = Px$ and $x_2 = Qx$. The rest of the proof is left to the reader. □

The culmination of the preceding sequence of results is the following theorem, which probably has more applications than any other result on Hilbert spaces.

5.11 Riesz Representation Theorem To each bounded linear functional $L$ on $H$ there corresponds a unique element $y \in H$ so that $L(x) = (x, y)$ for each $x \in H$, and $\|L\| = \|y\|$.

Proof: We may assume that $L$ is not identically zero (otherwise take $y = \theta$). Let $E = \{x \in H : Lx = 0\}$. Then $E$ is a closed linear subspace (check) and $E^\perp \ne \{\theta\}$, so we may choose $z \ne \theta$ in $E^\perp$. Then, for any $x \in H$, $x - (Lx/Lz)z \in E$, so $(x, z) - (Lx/Lz)(z, z) = 0$. Thus $Lx = (x, [Lz/(z, z)]z)$, and we take $y = [Lz/(z, z)]z$. The rest is left as an exercise. □

5.12 Corollary A Hilbert space $H$ is self-dual; i.e., the dual space of $H$ may be identified with $H$.
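To see the construction in the Riesz proof at work, here is a small worked instance in $R^n$ with the scalar product of Example 5.4.1. The functional $L$ and the coefficient vector $c$ are illustrative choices of ours, not taken from the text.

```latex
% Sketch: L(x) = \sum_i c_i x_i on R^n, with c = (c_1,\dots,c_n) \neq \theta.
\begin{align*}
E &= \{x \in R^n : Lx = 0\} = \{x : (x,c) = 0\},
  \qquad E^{\perp} = \{tc : t \in R\}, \\
% choose z = c in E^\perp; then Lz = (c,c) = (z,z)
y &= \frac{Lz}{(z,z)}\,z = \frac{(c,c)}{(c,c)}\,c = c, \\
L(x) &= (x,c) = (x,y) \quad\text{for all } x \in R^n .
\end{align*}
```

Schwarz's inequality gives $|L(x)| \le \|c\|\,\|x\|$, with equality at $x = c$, which is consistent with $\|L\| = \|y\|$.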

Next we investigate the generalization to Hilbert space of a familiar notion in $R^n$, that of an orthonormal basis. In $R^n$ the vectors $e_1 = (1, 0, 0, \ldots, 0)$, $e_2 = (0, 1, 0, 0, \ldots, 0)$, $\ldots$, $e_n = (0, 0, \ldots, 0, 1)$ have the property that $\|e_i\| = 1$, $(e_i, e_j) = \delta_i^j$ (the Kronecker $\delta$-function), and any vector $x \in R^n$ can be written uniquely as $x = \sum_{i=1}^n a_i e_i$. The set $\{e_i\}$ is called an orthonormal basis. In Hilbert spaces we will see that orthonormal bases exist and that any vector can be expressed in a limiting sense in terms of the orthonormal basis.

5.13 Definition A set $S = \{e_i : i \in I\}$ of nonzero vectors in an inner-product space $H$ is orthonormal if $e_i \perp e_j$ for $i \ne j$ and $\|e_i\| = 1$ for all $i \in I$. $S$ is maximal (or complete) if it is not properly contained in any other orthonormal set. Given any $x \in H$ the numbers $\hat x(i) = (x, e_i)$ are called the Fourier coefficients of $x$ relative to the orthonormal set $S = \{e_i\}$.

If $H$ is a nontrivial inner-product space (i.e., contains more than the zero vector $\theta$) then there is at least one orthonormal set in $H$ obtained by taking a single nonzero vector $x \in H$ and forming the normalized vector $e = x/\|x\|$. The existence of maximal orthonormal sets then follows from the following more general result.

5.14 Theorem Every orthonormal set $S \subset H$ is contained in a maximal orthonormal set $\tilde S \subset H$.

Proof: Let $\mathcal{S}$ be the collection of all orthonormal sets in $H$ containing $S$, and partially order $\mathcal{S}$ by set inclusion $\subseteq$. $\mathcal{S}$ is nonempty since it contains $S$. We use Zorn's lemma (see the Appendix) to show the existence of a maximal orthonormal set. Let $\mathcal{C} \subset \mathcal{S}$ be any chain in $\mathcal{S}$. Then the set $\tilde S = \bigcup\{S' : S' \in \mathcal{C}\}$ is an orthonormal set, for if $x_1, x_2 \in \tilde S$ with $x_1 \ne x_2$, then $x_1 \in S_1$ and $x_2 \in S_2$ for some $S_1, S_2 \in \mathcal{C}$. Since $\mathcal{C}$ is a chain, either $S_1 \subseteq S_2$ or $S_2 \subseteq S_1$. In either case $x_1$ and $x_2$ are in some $S' \in \mathcal{C}$, so $x_1 \perp x_2$. Thus $\tilde S$ is orthonormal and is an upper bound for $\mathcal{C}$ in $\mathcal{S}$. By Zorn's lemma there is a maximal orthonormal set. □

With a little more work it is possible to prove that any two maximal orthonormal sets can be put in one-to-one correspondence (i.e., have the same cardinality), but we will not need this fact. The reader should prove (exercise) that $S$ is a maximal orthonormal set iff $x \in H$ and $x \perp S$ implies that $x = \theta$. This fact will be used in the proof of Theorem 5.19.

5.15 Example The vectors $e_i = (\delta_i^j : j \in N)$ in $l_2$ form a maximal orthonormal set, for if $x = (x_j : j \in N) \in l_2$ and $(x, e_i) = x_i = 0$ for all $i \in N$, then $x = \theta$.

In the following we will deal only with inner-product spaces $H$ which are (norm) separable (i.e., $H$ contains a countable set which is dense in the topology induced by the norm). In this case any orthonormal set is either finite or countable, for if $\{e_i : i \in I\}$ is orthonormal and $i \ne j$, then $\|e_i - e_j\|^2 = (e_i - e_j, e_i - e_j) = \|e_i\|^2 + \|e_j\|^2 = 2$ since $(e_i, e_j) = 0$. Conversely, if any orthonormal set in $H$ is either finite or countable then $H$ is separable (exercise). Since the following results are easy if $H$ is finite-dimensional (i.e., contains a finite maximal orthonormal set), we will assume in the following that the inner-product space $H$ contains a countable orthonormal set which we arrange in a sequence $\langle e_i : i \in N\rangle$. Without loss of generality we have chosen $I = N$. Now let $x \in H$ and $\langle a_i : i \in N\rangle$ be a sequence of real numbers. Then

(5.2) $\Bigl\|x - \sum_{i=1}^n a_i e_i\Bigr\|^2 = \|x\|^2 - \sum_{i=1}^n |\hat x(i)|^2 + \sum_{i=1}^n |a_i - \hat x(i)|^2$.

From this we obtain the following results.
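The computation behind the results that follow can be sketched as below. It assumes that the identity referred to as (5.2) is the standard expansion of the squared distance from $x$ to a finite linear combination of the $e_i$, with $\hat x(i) = (x, e_i)$ as in Definition 5.13.

```latex
% Sketch: expand using bilinearity and (e_i, e_j) = \delta_i^j,
% then complete the square in each a_i.
\begin{align*}
\Bigl\|x - \sum_{i=1}^n a_i e_i\Bigr\|^2
 &= \|x\|^2 - 2\sum_{i=1}^n a_i (x,e_i)
    + \sum_{i=1}^n\sum_{j=1}^n a_i a_j (e_i,e_j) \\
 &= \|x\|^2 - 2\sum_{i=1}^n a_i \hat x(i) + \sum_{i=1}^n a_i^2 \\
 &= \|x\|^2 - \sum_{i=1}^n \hat x(i)^2
    + \sum_{i=1}^n \bigl(a_i - \hat x(i)\bigr)^2 .
\end{align*}
```

Only the last sum depends on the $a_i$, and it vanishes exactly when $a_i = \hat x(i)$; with that choice the left-hand side is nonnegative, so $\sum_{i=1}^n \hat x(i)^2 \le \|x\|^2$ for every $n$.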

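5.16 Best Approximation Theorem Let $\langle e_i : i \in N\rangle$ be an orthonormal sequence in an inner-product space $H$. For any $x \in H$, any $n \in N$, and any real numbers $a_1, \ldots, a_n$,

$$\Bigl\|x - \sum_{i=1}^n \hat x(i) e_i\Bigr\| \le \Bigl\|x - \sum_{i=1}^n a_i e_i\Bigr\|,$$

i.e., the best norm approximation to $x$ by a linear combination of the $e_i$ is given by choosing the coefficients to be the Fourier coefficients.

Proof: The right-hand side of (5.2) is minimized if $a_i = (x, e_i)$. □

5.17 Bessel's Inequality For any $x \in H$,

$$\sum_{i=1}^\infty |\hat x(i)|^2 \le \|x\|^2.$$

Proof: Exercise. □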

Bessel's inequality has the following interpretation. For any $x \in H$ we can consider the sequence $\langle \hat x(i) : i \in N\rangle$ of Fourier coefficients of $x$ relative to a given orthonormal sequence $S = \langle e_i : i \in N\rangle$. Then Bessel's inequality shows that this sequence is in $l_2$. Thus for a fixed countable orthonormal sequence $\langle e_i\rangle$ we obtain a mapping $T\colon H \to l_2$ defined by $Tx = \langle \hat x(i) : i \in N\rangle$. It is easy to check that $T$ is a linear mapping. The next result, which requires that $H$ be complete, shows that $T$ maps $H$ onto $l_2$.

5.18 Riesz-Fischer Theorem Let $\langle e_i : i \in N\rangle$ be a countable orthonormal sequence in the Hilbert space $H$. Then each element of $l_2$ is of the form $\hat x$ for some $x \in H$.

Proof: Let $\langle a_i : i \in N\rangle$ be a sequence in $l_2$ so that $\sum_{i=1}^\infty a_i^2 < \infty$. Then the sequence $x_n = \sum_{i=1}^n a_i e_i$ is a Cauchy sequence in $H$ since $x_m - x_n = \sum_{i=n+1}^m a_i e_i$ if $m > n$, and so $\|x_m - x_n\|^2 = \sum_{i=n+1}^m a_i^2$. Since $H$ is complete, there is an element $x$ which is the (norm) limit of $x_n$. By the continuity of the inner product, $(x, e_i) = \lim_n (x_n, e_i) = a_i$ for any $i \in N$. □

The element $x$ which is obtained in Theorem 5.18 is often written $x = \sum_{i=1}^\infty a_i e_i$. One consequence of the next theorem is that if $\langle e_i : i \in N\rangle$ is a maximal orthonormal sequence, then the associated map $T$ is one-to-one, and so an $x \in H$ can be written in only one way as $\sum_{i=1}^\infty a_i e_i$. For this reason a maximal orthonormal set (also called a complete orthonormal set) is sometimes called an orthonormal basis. It should be emphasized that this notion of basis must be understood in a limiting sense and not in the algebraic sense of vector space theory.

5.19 Theorem The orthonormal sequence $\langle e_i : i \in N\rangle$ in the Hilbert space $H$ is maximal iff each $x \in H$ can be written uniquely as $x = \sum_{i=1}^\infty (x, e_i)e_i$.

Proof: Suppose $\langle e_i\rangle$ is maximal and $x \in H$. Then by the Riesz-Fischer theorem there is an element $y \in H$ so that $y = \sum_{i=1}^\infty (x, e_i)e_i$ and so $(y, e_i) = (x, e_i)$ for all $i \in N$. But then $(x - y, e_i) = 0$ for $i \in N$, so $x = y$ by the remark following Theorem 5.14. Conversely, suppose each $x \in H$ can be written as $x = \sum_{i=1}^\infty (x, e_i)e_i$. If $\langle e_i\rangle$ is not maximal there is an $x \ne \theta$ in $H$ so that $(x, e_i) = 0$ for $i \in N$. But then $x = \sum_{i=1}^\infty (x, e_i)e_i = \theta$ (contradiction). □

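5.20 Theorem (Parseval's Identities) If $\langle e_i : i \in N\rangle$ is a maximal orthonormal sequence in the Hilbert space $H$ then

(i) $\sum_{i=1}^\infty |(x, e_i)|^2 = \|x\|^2$ for all $x \in H$;
(ii) $\sum_{i=1}^\infty (x, e_i)(y, e_i) = (x, y)$ for all $x, y \in H$.

Proof: We leave the proof of (ii) as an exercise. To prove (i) we see by Bessel's inequality that $\sum_{i=1}^\infty |(x, e_i)|^2 \le \|x\|^2$. On the other hand, given $\varepsilon > 0$ there is a $z = \sum_{i=1}^n (x, e_i)e_i$ so that $\|x - z\| < \varepsilon$, whence $\|x\| \le \|z\| + \varepsilon$. Thus

$$(\|x\| - \varepsilon)^2 \le \|z\|^2 = \sum_{i=1}^n |(x, e_i)|^2 \le \sum_{i=1}^\infty |(x, e_i)|^2$$

(taking $\varepsilon < \|x\|$), and the result follows since $\varepsilon$ is arbitrary. □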

The results above can now be used to show that $l_2$ is essentially the only separable infinite-dimensional Hilbert space.

5.21 Theorem Given a maximal orthonormal sequence $S = \langle e_i : i \in N\rangle$ in a separable Hilbert space $H$, the associated map $T\colon H \to l_2$ is one-to-one, onto, and satisfies $(x, y) = (Tx, Ty)$ for all $x, y \in H$, and so $T$ is a Hilbert space isomorphism.

Proof: Use 5.18-5.20. □

We end this section with an application of nonstandard analysis to prove a theorem concerning compact operators in Hilbert space. Much more can be done in this direction. In particular, Bernstein and Robinson [4] first proved that so-called polynomially compact operators have nontrivial invariant subspaces using refinements of the technique used here. We are going to prove that every compact operator on a separable Hilbert space $H$ can be approximated arbitrarily closely by an operator of "finite rank."

5.22 Definition An operator $Q\colon H \to H$ is of finite rank if there is a finite-dimensional subspace $E \subset H$ so that $Qx \in E$ for each $x \in H$.

Since every separable Hilbert space $H$ is isomorphic to $l_2$ we will identify $H$ and $l_2$ in the following discussion. Thus we will assume that an orthonormal sequence $\langle e_i\rangle$ is given and represent any $x \in H$ as either $x = \sum_{i=1}^\infty a_i e_i$ or $\langle a_1, a_2, \ldots\rangle$. First we need the following lemma.

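5.23 Lemma If $x = \langle a_i : i \in {}^*N\rangle \in {}^*l_2$ is near-standard, then $\sum |a_i|^2$ ($i \in {}^*N$, $i \ge \omega$) is infinitesimal for any $\omega \in {}^*N_\infty$.

Proof: If $y = \langle b_i : i \in N\rangle \in l_2$ then $\lim_{k \to \infty} \sum |b_i|^2$ ($i \in N$, $i \ge k$) $= 0$, so $\sum |{}^*b_i|^2$ ($i \in {}^*N$, $i \ge \omega$) $\simeq 0$ for any infinite $\omega$. Now since $x \in {}^*l_2$ is near-standard there is a $y \in l_2$ with $\|x - {}^*y\|^2 = \sum |a_i - {}^*b_i|^2$ ($i \in {}^*N$) $\simeq 0$. By the transfer of the triangle inequality in $l_2$,

$$\Bigl(\sum_{i \ge \omega} |a_i|^2\Bigr)^{1/2} \le \Bigl(\sum_{i \ge \omega} |a_i - {}^*b_i|^2\Bigr)^{1/2} + \Bigl(\sum_{i \ge \omega} |{}^*b_i|^2\Bigr)^{1/2},$$

and both terms on the right are infinitesimal. □

5.24 Theorem Let $T\colon H \to H$ be a compact linear operator. For each $\varepsilon > 0$ there is an operator $Q$ of finite rank so that $\|T - Q\| < \varepsilon$.

Proof: For each $k \in {}^*N$ (finite or infinite) we define a projection operator $P_k\colon {}^*H \to {}^*H$ by $P_k x = \langle a_1, a_2, \ldots, a_k, 0, 0, \ldots\rangle$ when $x = \langle a_i : i \in {}^*N\rangle$. Then $P_k$ is linear and $\|P_k x\| \le \|x\|$ for any $x \in {}^*H$. Also, $\|(I - P_k)x\|^2 = \sum |a_i|^2$ ($i \in {}^*N$, $i \ge k + 1$), and so, by Lemma 5.23, $\|(I - P_k)x\|$ is infinitesimal for $k$ infinite and $x$ near-standard. Since $T$ is compact, ${}^*Tx$ is near-standard for every $x \in {}^*H$ with $\|x\| \le 1$; it follows that $\|{}^*T - P_k{}^*T\|$ is infinitesimal for all infinite $k$. Now let $\varepsilon > 0$ in $R$ be given. The internal set $A = \{n \in {}^*N : \|{}^*T - P_n{}^*T\| < \varepsilon\}$ contains all infinite natural numbers, and so contains a finite (standard) integer $m$ by Corollary 7.2(ii) of Chapter II. Thus $\|{}^*T - P_m{}^*T\| < \varepsilon$. Transferring down shows that $\|T - P_m T\| < \varepsilon$. Finally, the operator $Q = P_m T$ is of finite rank since its range is contained in the subspace $E$ generated by $\{e_1, \ldots, e_m\}$. □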
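A concrete instance may help fix ideas about Theorem 5.24. The diagonal operator below is an illustrative choice of ours (it is a standard example of a compact operator on $l_2$, not one discussed in the text), and $P_m$ denotes the truncation to the first $m$ coordinates as in the proof.

```latex
% Sketch: T<a_1, a_2, ...> = <a_1/1, a_2/2, a_3/3, ...> on l_2.
\begin{align*}
(T - P_m T)x &= \langle 0,\dots,0,\ a_{m+1}/(m+1),\ a_{m+2}/(m+2),\ \dots\rangle, \\
\|(T - P_m T)x\|^2 &= \sum_{i>m} \frac{a_i^2}{i^2}
   \;\le\; \frac{1}{(m+1)^2}\sum_{i>m} a_i^2
   \;\le\; \frac{\|x\|^2}{(m+1)^2}, \\
% equality holds at x = e_{m+1}
\|T - P_m T\| &= \frac{1}{m+1}.
\end{align*}
```

So $Q = P_m T$ has finite rank and $\|T - Q\| < \varepsilon$ as soon as $m + 1 > 1/\varepsilon$. By transfer the same formula holds for $k \in {}^*N$, and for infinite $k$ the norm $1/(k+1)$ is infinitesimal, in line with the step of the proof that precedes the appeal to Corollary 7.2(ii).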

This result can be used as a starting point for the Fredholm theory of compact operators.

Exercises III.5

1. Show that if $(\cdot\,,\cdot)$ is an inner product on a vector space $H$ then the map $\|\cdot\|\colon H \to R^+$ defined by $\|x\| = (x, x)^{1/2}$ is a norm on $H$.
2. Prove Corollary 5.3.
3. Show that the element $x_0$ of Theorem 5.9 is unique.
4. Complete the proof of Theorem 5.10.
5. Finish the proof of Theorem 5.11.
6. Show that $S$ is a maximal orthonormal set iff $x \in H$ and $x \perp S \Rightarrow x = \theta$.
7. Show that if any orthonormal set in an inner-product space $H$ is either finite or countable, then $H$ is separable.
8. Prove Theorem 5.17.
9. Prove Theorem 5.20(ii).
10. Establish the following converse to Lemma 5.23. If $x = \langle a_i : i \in {}^*N\rangle \in {}^*l_2$, $\|x\|^2 = \sum |a_i|^2$ ($i \in {}^*N$) is finite, and $\sum |a_i|^2$ ($i \in {}^*N$, $i > \omega$) $\simeq 0$ for all infinite $\omega$, then $x$ is near-standard.
11. The Hilbert cube is the set of all $x = \langle x_i\rangle \in l_2$ such that $|x_i| \le 1/i$, $i \in N$. Show that the Hilbert cube is compact.
12. Let $H$ be a Hilbert space and let $B(H)$ denote the normed space of all bounded linear operators $A\colon H \to H$. A subbase for the weak operator topology on $B(H)$ is formed by the collection of all sets of the form $\{A : |((A - A_0)x, y)| < \delta\}$, $A_0 \in B(H)$, $x, y \in H$ and $\delta > 0$ in $R$. Show that the monad of $A_0$ in $B(H)$ in the weak operator topology is given by $\mu(A_0) = \{A \in {}^*B(H) : (Ax, y) \simeq (A_0 x, y)$ for all standard $x, y \in H\}$.
13. (Standard) A bilinear form on $H$ is a map $B\colon H \times H \to R$ such that $B(x, \cdot)$ is linear for each $x \in H$ and $B(\cdot, y)$ is linear for each $y \in H$. $B$ is bounded if there exists $M \in R$ such that $|B(x, y)| \le M\|x\|\,\|y\|$ for all $x, y \in H$. Show that if $B$ is a bounded bilinear form, then there exists an operator $T \in B(H)$ such that $B(x, y) = (Tx, y)$ for all $x, y \in H$.
14. Use Exercises 5.12 and 5.13 to show that the unit ball in $B(H)$ is compact in the weak operator topology.

III.6 Nonstandard Hulls of Metric Spaces

In this short section we introduce the reader to the concept of the nonstandard hull of a metric space. This notion was introduced by Luxemburg [36] and has proved to be a powerful tool in the nonstandard analysis of Banach spaces, as indicated by the survey paper of Henson and Moore [16]. The technique of nonstandard analysis, as applied to the theory of Banach spaces, is essentially equivalent to the use of Banach space ultrapowers, a technique which originated with Dacunha-Castelle and Krivine [10] and is now used extensively. Nonstandard methods, however, are more intuitive and usually easier to apply, especially when they involve concepts, such as the internal cardinality of a *-finite set, which are not easy to express in the ultraproduct setting.

In this section we will assume that the nonstandard analysis is carried out in a $\kappa$-saturated enlargement where $\kappa > \aleph_0$. Suppose that $(X, d)$ is a metric space. Recall that the principal galaxy $G = \mathrm{fin}({}^*X)$ is the set of points in ${}^*X$ each of which is at a finite distance from a point in $X$ (regarded as embedded in ${}^*X$). If $a, b \in {}^*X$ we say as usual that $a \simeq b$ if ${}^*d(a, b) \simeq 0$. Let $\hat X$ denote the equivalence classes of $G$ under the equivalence relation $\simeq$. Alternatively, $\hat X$ is the set of monads, where each monad $m(a) = \{b \in G : {}^*d(a, b) \simeq 0\}$ for $a \in G$ (notice that if $a \in G$ and $b \simeq a$ then $b \in G$). Since ${}^*d(a, b)$ is finite for any $a, b \in G$, we can define

$$\hat d(x, y) = \mathrm{st}({}^*d(a, b)) \quad\text{when } x = m(a) \text{ and } y = m(b) \text{ in } \hat X.$$
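That the distance on the hull does not depend on the chosen representatives can be sketched as follows. The notation $\hat d$ for the hull metric and $a', b'$ for alternative representatives is ours; the only tool assumed is the transferred triangle inequality for ${}^*d$.

```latex
% Sketch: suppose a \simeq a' and b \simeq b' with a, a', b, b' in G.
\begin{align*}
{}^*d(a,b) &\le {}^*d(a,a') + {}^*d(a',b') + {}^*d(b',b), \\
{}^*d(a',b') &\le {}^*d(a',a) + {}^*d(a,b) + {}^*d(b,b'), \\
\bigl|{}^*d(a,b) - {}^*d(a',b')\bigr|
   &\le {}^*d(a,a') + {}^*d(b,b') \simeq 0, \\
\operatorname{st}\bigl({}^*d(a,b)\bigr) &= \operatorname{st}\bigl({}^*d(a',b')\bigr).
\end{align*}
```

Both standard parts exist because ${}^*d$ is finite on $G$, so the value $\hat d(m(a), m(b))$ is determined by the monads alone.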


6.1 Proposition $(\hat X, \hat d)$ is a metric space.

Proof: Exercise. □

6.2 Definition $(\hat X, \hat d)$ is called the nonstandard hull of $(X, d)$.

We now use saturation to prove that $(\hat X, \hat d)$ is complete [even if $(X, d)$ is not]. Our construction is like that of Theorem 3.17, but here $\hat X$ consists of monads of finite points and not just pre-near-standard points.

6.3 Theorem Suppose that ${}^*X$ lies in a $\kappa$-saturated superstructure with $\kappa > \aleph_0$. Then $(\hat X, \hat d)$ is a complete metric space.

Proof: Let $\langle m(a_i) : i \in N\rangle$ be a Cauchy sequence in $(\hat X, \hat d)$. Then for each $k \in N$ there is an $n(k) \in N$ so that ${}^*d(a_i, a_j) < 1/k$ if $i$ and $j$ are both $> n(k)$; we can assume without loss of generality that $n(k) \to \infty$ as $k \to \infty$. Let $\phi(i) = a_i$. By Theorem 8.5 of Chapter II, the map $\phi\colon N \to {}^*X$ can be extended to an internal map $\phi\colon \tilde N \to {}^*X$, where $\tilde N \subseteq {}^*N$ is internal and contains $N$ and so contains some infinite integer. We would like to show that there is some infinite integer $m'$ in $\tilde N$ so that ${}^*d(a_i, a_{m'}) < 1/k$ for all $i \in N$ with $i > n(k)$, where $a_{m'} = \phi(m')$. For any $k \in N$ the set $E(k) = \{m \in \tilde N : {}^*d(a_i, a_j) < 1/k$ for all $i, j \in \tilde N$ satisfying $n(k) < i \le m$, $n(k) < j \le m\}$ is internal and contains $N$. Therefore $E(k)$ also contains $\{m \in {}^*N : m \le m_k\}$ for some infinite integer $m_k$, and we may assume that $m_{k+1} \le m_k$ for all $k \in N$. Again by Theorem 8.5 of Chapter II we may extend the sequence $\langle m_k\rangle$ to an internal decreasing mapping from an internal set $\tilde N \subset {}^*N$ into ${}^*N$. Since $m_k > k$ for each finite $k \in N$, there is an infinite $\omega$ with $m_\omega \ge \omega$ and $m_\omega \in E(k)$ for all $k \in N$. Let $m' = m_\omega$. Then ${}^*d(a_i, a_{m'}) < 1/k$ for all $i \in N$ with $i > n(k)$. It follows that $a_{m'}$ is finite and $\langle m(a_i)\rangle$ converges to $m(a_{m'})$. □

If our metric space $(X, d)$ is a normed vector space with norm $\|\cdot\|$, the nonstandard hull can be made into a normed vector space in an obvious way. For in this case $G$ consists of all $x \in {}^*X$ for which $\|x\|$ is finite, and so $G$ is a vector space over the reals. We define addition and scalar multiplication of elements in $\hat X$ by

$$m(x) + m(y) = m(x + y), \quad x, y \in G, \qquad\text{and}\qquad a\,m(x) = m(ax).$$

Also we define a norm in $\hat X$ by $|||m(x)||| = \mathrm{st}\|x\|$, $x \in G$. It is easy to check that $\hat d(m(x), m(y)) = |||m(x) - m(y)|||$. From Theorem 6.3 we see that $(\hat X, |||\cdot|||)$ is a Banach space. The details are left to the reader.

Exercises III.6

1. Prove Proposition 6.1; in particular, show $\hat d$ is well defined.
2. Show that if $(X, \|\cdot\|)$ is a normed space, then $(\hat X, |||\cdot|||)$ is a Banach space.
3. Show that there is an isometric embedding of a Banach space into its nonstandard hull.
4. Consider the sequence $\langle e_n\rangle$ which is 0 for $n \ne \omega \in {}^*N_\infty$ and 1 for $n = \omega$ to show that the mapping in Exercise 3 is not onto for $l_2$.
5. Consider the sequence $\langle x_n\rangle$, where $x_n = 1/\omega$ for $1 \le n \le \omega$, $\omega \in {}^*N_\infty$, and $x_n = 0$ for $n > \omega$, to show that the mapping in Exercise 3 is not onto for $l_1$.

*III.7 Compactifications

In this section we show how some Hausdorff spaces $(X, \mathcal{T})$ can be embedded as dense subsets of compact Hausdorff spaces $(Y, \mathcal{S})$. That is, there exists a 1-1 map $\psi\colon X \to \mathrm{range}\,\psi \subseteq Y$ so that $\psi$ is a homeomorphism and $\mathrm{range}\,\psi$ is dense in $Y$. In this case $(Y, \mathcal{S})$ is called a compactification of $(X, \mathcal{T})$. We usually identify $X$ and $\mathrm{range}\,\psi$, and so regard $X$ as a subset of $Y$; we will denote $Y$ by $\bar X$. A given space $X$ typically has many compactifications. For example, if one adjoins 0 and 1 to $(0, 1)$ one obtains the compact interval $[0, 1]$. Adjoining a single point to both ends of $(0, 1)$ gives a circle. Similarly the plane can be made into a sphere by adjoining a single point. We are interested here in compactifying a space $X$ so that certain continuous functions on $X$ have continuous extensions to $\bar X$. What, for example, should one adjoin to $(0, 1]$ to make $\sin(1/x)$ continuous on the resulting compact space?

7.1 Definition Let $Q$ be a family of (perhaps not uniformly) bounded, continuous, real-valued functions on $(X, \mathcal{T})$. $(\bar X, \mathcal{S})$ is called a $Q$-compactification of $(X, \mathcal{T})$ if it is a compactification for which
(a) each $f \in Q$ has a continuous extension $\bar f$ to $(\bar X, \mathcal{S})$,
(b) if $x$ and $y$ are different points in $\bar X - X$ there is an $f \in Q$ whose extension $\bar f$ separates $x$ and $y$, i.e., $\bar f(x) \ne \bar f(y)$.
We sometimes write $X^Q$ for $\bar X$.

In order to construct a $Q$-compactification we need to suppose that $Q$ contains sufficiently many functions.


7.2 Definition A family $Q$ of continuous functions distinguishes points and closed sets if, for each closed set $A \subset X$ and each $x \in X - A$, there is an $f \in Q$ so that $f(x) \notin \overline{f[A]}$.

It should be noted that not all Hausdorff spaces $X$ admit sufficiently many continuous real-valued functions to distinguish points and closed sets. There are enough functions if $X$ is completely regular [20]. The compactifications of this section will be constructed from ${}^*X$. The original work on this construction was done by Gonshor [15], Luxemburg [36], Machover and Hirschfeld [37], and Robinson [43].

Let $Q$ be a family of bounded, continuous, real-valued functions on $(X, \mathcal{T})$. Assuming that $Q$ distinguishes points and closed sets, we construct $X^Q$ as follows. We call two points $y, z \in {}^*X$ equivalent, and write $y \sim z$, if ${}^*f(y) \simeq {}^*f(z)$ for all $f \in Q$. It is easy to see (check) that $\sim$ is an equivalence relation. The equivalence class containing $x \in {}^*X$ is denoted by $[x]$, and the set of all equivalence classes is $\bar X$. Next we show that if $x \in X$, then $[x] = m(x)$, the monad of $x$. First note that if $y \in m(x)$, then $y \sim x$ since each $f \in Q$ is continuous. On the other hand, if $U$ is an open set containing $x$, then there is an $f \in Q$ so that $f(x) \notin \overline{f[X - U]}$. Thus $f^{-1}[R - \overline{f[X - U]}]$ is an open set containing $x$ and contained in $U$. We conclude that $[x] = m(x)$.

We extend each $f \in Q$ to a function on $\bar X$ (again denoted by $f$) by setting $f([y]) = \mathrm{st}({}^*f(y))$, $y \in {}^*X$ (check that $f$ is well defined). The set of extended functions is again denoted by $Q$. The topology $\mathcal{S}$ on $\bar X$ is the weak topology for the functions in $Q$. Thus $U$ is open in $\bar X$ iff for each $[y] \in U$ there is a finite set $\{f_1, \ldots, f_n\} \subseteq Q$ and a positive number $\varepsilon$ in $R$ so that $\{[z] \in \bar X : |f_i([z]) - f_i([y])| < \varepsilon,\ 1 \le i \le n\} \subset U$. In order that we may treat $\bar X$ as an element of the original superstructure $V(X)$ (which will be used in the proof of compactness), we may think of each point in $\bar X$ as a function on $Q$ by the definition $[y](f) = f([y])$. Distinct points of $\bar X$ give distinct functions on $Q$. The standard construction of $\bar X$ is based on such a family of functions on $Q$. It is often helpful, however, to think of $\bar X$ as a quotient of ${}^*X$ as we have done.
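A small worked instance of the quotient construction may be useful here. The choice $X = R$ and $Q = \{\arctan\}$ is ours, made only for illustration; $\arctan$ is bounded and continuous, and since it is a homeomorphism of $R$ onto $(-\pi/2, \pi/2)$ it distinguishes points and closed sets.

```latex
% Sketch: equivalence classes of *R under y ~ z iff *arctan(y) \simeq *arctan(z).
\begin{align*}
y \text{ finite}:&\quad {}^*\!\arctan(y) \simeq \arctan(\operatorname{st} y)
      \in (-\pi/2,\ \pi/2), \qquad [y] = m(\operatorname{st} y), \\
y \text{ positive infinite}:&\quad {}^*\!\arctan(y) \simeq \pi/2
      \quad\text{(one class, written } +\infty\text{)}, \\
y \text{ negative infinite}:&\quad {}^*\!\arctan(y) \simeq -\pi/2
      \quad\text{(one class, written } -\infty\text{)}.
\end{align*}
```

Under these assumptions the set of classes is $R \cup \{+\infty, -\infty\}$, and the extended function $[y] \mapsto \mathrm{st}({}^*\!\arctan(y))$ maps it one-to-one onto $[-\pi/2, \pi/2]$, so the resulting space is the two-point compactification of the line.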
Let $X^Q$ be constructed as above from a set $Q$ of bounded, continuous, real-valued functions on $(X, \mathcal{T})$ which separate points and closed sets.


7.3 Theorem $(X^Q, \mathcal{S})$ is a $Q$-compactification of $(X, \mathcal{T})$.

Proof: Let $X^Q$ be denoted by $\bar X$. Define the map $\psi\colon X \to \bar X$ by $\psi(x) = [x]$, $x \in X$. Now $[x] = m(x)$ for $x \in X$, so the map $\psi$ is 1-1 by 1.12(c) since $X$ is Hausdorff.

To show $\psi$ is a homeomorphism, we must show that $\psi$ and $\psi^{-1}$ are continuous. An easy exercise shows that $\psi$ is continuous. To see that $\psi^{-1}$ is continuous, we must show that if $x \in X$ and $V$ is an open neighborhood of $x$ in $\mathcal{T}$, then there is a $U \in \mathcal{S}_x$ so that $U \cap X \subseteq V$ (we regard $X$ as contained in $\bar X$). Let $f \in Q$ be such that $f(x) \notin \overline{f[A]}$, where $A = X - V$. Then there is an $\varepsilon > 0$ in $R$ so that $\{z \in X : |f(z) - f(x)| < \varepsilon\} \subseteq V$ (why?); we let $U = \{z \in \bar X : |f(z) - f(x)| < \varepsilon\}$.

To show $\psi[X]$ is dense in $\bar X$, let $[y] \in \bar X - \psi[X]$, and let $U \in \mathcal{S}$ be given by $U = \{[z] \in \bar X : |f_i([z]) - f_i([y])| < \varepsilon,\ 1 \le i \le n\}$. We must show that $[x] \in U$ for some $x \in X$. Let $a_i = f_i([y])$, $1 \le i \le n$. Then the set $\{x \in {}^*X : |{}^*f_i(x) - a_i| < \varepsilon,\ 1 \le i \le n\}$ is not empty (indeed it contains $y$). By downward transfer, the set $\{x \in X : |f_i(x) - a_i| < \varepsilon,\ 1 \le i \le n\}$ is not empty, and we are through.

To show that $\bar X$ is compact we consider a mapping $T$ on $\bar X$. For each $[y] \in \bar X$, $T([y])$ is the function from $Q$ into $R$ defined by setting $T([y])(f) = f([y])$ for each $f \in Q$. Let $A$ be the range of $T$; then $T$ is a 1-1 mapping from $\bar X$ onto $A$. We make $T$ a homeomorphism by letting $U$ be open in $A$ iff $T^{-1}[U]$ is open in $\bar X$. Thus a typical neighborhood of an $a \in A$ is given by a finite set $\{f_1, \ldots, f_n\} \subset Q$ and an $\varepsilon > 0$ in $R$: it consists of those $b \in A$ with $|a(f_i) - b(f_i)| < \varepsilon$, $1 \le i \le n$. Since $X$ is dense in $\bar X$, each such neighborhood contains a $T([x])$ for some $x \in X$; i.e., $|a(f_i) - f_i(x)| < \varepsilon$ for $1 \le i \le n$. To show that $\bar X$ is compact, we need only show that $A$ is compact. Fix $b \in {}^*A$. Let $\varepsilon$ be a positive infinitesimal in ${}^*R$, and let $Q_F$ be a hyperfinite subset of ${}^*Q$ such that ${}^*f \in Q_F$ for each $f \in Q$. By the transfer principle, there is an $x \in {}^*X$ such that $|b(f) - f(x)| < \varepsilon$ for each $f \in Q_F$. Let $c = T([x])$. For each $f \in Q$, $c(f) = T([x])(f) \simeq {}^*f(x) \simeq b({}^*f)$, so $b$ is in the monad of the standard point $c \in A$. Thus $A$ is compact.

Finally, by the construction, each member of $Q$ has a continuous extension to $\bar X$, and the family of extensions separates the points of $\bar X$. □
It is not hard to see that if $Q_1$ and $Q_2$ are two families as described above with $Q_1 \subseteq Q_2$, then there is a continuous map $\zeta$ from $X^{Q_2}$ onto $X^{Q_1}$ such [...] It follows that [...] that $c\} \in \mathcal{F}$, we have a contradiction. □

Exercises III.7

1. Let $(X, \mathcal{T})$ be locally compact, and let $\bar X$ denote the one-point compactification of $X$. Let $A$ be an internal set of near-standard points in ${}^*X$. Use the fact that $\mathrm{st}[A]$ is closed in $X$ and a closed subset of $\bar X$ is compact to show that $\mathrm{st}[A]$ is compact.
2. Show directly that the one-point compactification of a locally compact Hausdorff space is compact.
3. Show that, for $\omega \in {}^*N_\infty$, $\{A \subset N : \omega \in {}^*A\}$ is a free ultrafilter.
4. What is the $Q$-compactification of $(0, 1)$ when $Q = \{f(x) = x\}$?
5. What is the $Q$-compactification of $(0, 1)$ when $Q = \{f(x) = x,\ g(x) = \sin(1/x)\}$?
6. Show that $X$ is open in a compactification $\bar X$ if and only if $X$ is locally compact.



*III.8 Function Spaces

Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be Hausdorff topological spaces and $F$ be a family of mappings from $X$ into $Y$. This section will be concerned with two questions:

(a) For which topologies $\mathcal{G}$ on $F$ is the map $\phi_A\colon F \times A \to Y$ defined by $\phi_A(f, x) = f(x)$ continuous for all subsets $A \subseteq X$ in a certain family $\mathcal{K}$? Such a topology $\mathcal{G}$ is said to be jointly continuous with respect to $\mathcal{K}$.
(b) For which topologies on the space $M$ of all mappings from $X$ into $Y$ is the closure of $F$ compact?

To answer these questions, we consider two important topologies, the topology of pointwise convergence and the compact-open topology. For a standard treatment the reader is referred to Kelley [20, Chapter 7]. Our treatment follows suggestions of Hirschfeld [18]. The nonstandard analysis will be done in an enlargement of a structure containing $X$ and $Y$. Monads in $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ will be denoted by $m_X(x)$ ($x \in X$) and $m_Y(y)$ ($y \in Y$), respectively, but we will denote nearness in both $X$ and $Y$ by $\simeq$ as in §III.1. With each subset $A \subseteq X$ we associate an important pseudomonad $k_A(f)$ ($f \in M$) on the space $M$ of all maps from $X$ into $Y$ by setting

(8.1) $k_A(f) = \{g \in {}^*M : g(x') \simeq f(x)$ for all $x \in A$ and $x' \in {}^*A$ with $x' \simeq x\}$.

The following result provides a nonstandard answer to question (a).

8.1 Proposition Let $\mathcal{G}$ be a topology for $F$ with associated monads $m(f)$ ($f \in F$). Then $\mathcal{G}$ is jointly continuous with respect to $\mathcal{K}$ iff $m(f) \subset \bigcap\{k_A(f) : A \in \mathcal{K}\}$ for all $f \in F$.

Proof: We need only show that, for each $A \in \mathcal{K}$, $\phi_A$ is continuous iff $m(f) \subset k_A(f)$ for all $f \in F$. But for $f \in F$ and $x \in A$, the monad of $(f, x)$ in ${}^*F \times {}^*A$ is $m(f) \times m_A(x)$, where $m_A(x) = m_X(x) \cap {}^*A$. Thus $\phi_A$ is continuous at each $(f, x) \in F \times A$ $\Leftrightarrow$ ${}^*\phi_A(m(f) \times m_A(x)) \subset m_Y(\phi_A(f, x))$ for each $f \in F$, $x \in A$ $\Leftrightarrow$ if $f \in F$, $x \in A$, then whenever $g \in m(f)$ and $y \simeq x$, $y \in {}^*A$, we have $g(y) \simeq f(x)$ $\Leftrightarrow$ $m(f) \subseteq k_A(f)$ for each $f \in F$. □

8.2 Definition

(a) The topology of pointwise convergence $\mathcal{P}$ on $M$ is the weak topology for the family $\{\phi_x : x \in X\}$ of evaluation maps $\phi_x\colon M \to Y$ defined by $\phi_x(f) = f(x)$. The monads for $\mathcal{P}$ are denoted by $p(f)$ ($f \in M$).
(b) The compact-open topology $\mathcal{C}$ on $M$ is generated by the subbase consisting of all sets of the form $W(K, U) = \{g \in M : g[K] \subset U\}$, where $K$ is compact in $(X, \mathcal{T})$ and $U$ is open in $(Y, \mathcal{S})$. We let $c(f)$ ($f \in M$) denote the monads of $\mathcal{C}$.

From 1.18 we see that

(8.2) $p(f) = \{g \in {}^*M : g(x) \simeq f(x)$ for all standard points $x \in X\}$.

8.3 Proposition Let $\mathcal{K}$ be the family of compact subsets of $(X, \mathcal{T})$. Then, for each $f \in M$, $k_X(f) \subseteq \bigcap\{k_A(f) : A \in \mathcal{K}\} \subseteq c(f) \subseteq p(f)$.

Proof: (i) $k_X(f) \subseteq k_A(f)$ for any $A \subseteq X$, and the first containment follows. (ii) Let $K$ be compact in $(X, \mathcal{T})$ and $U$ be an open set in $(Y, \mathcal{S})$ containing $f[K]$. If $g \in \bigcap\{k_A(f) : A \in \mathcal{K}\}$, then $g \in k_K(f)$, so $g(y) \simeq f(x)$ for all $x \in K$ and all $y \in {}^*K$ with $y \simeq x$. Since $U$ is open, $g(y) \in {}^*U$ for all $y \in {}^*K$ with $y \simeq x \in K$. But this includes all $y \in {}^*K$ since $K$ is compact, and so $g[{}^*K] \subseteq {}^*U$, i.e., $g \in {}^*W(K, U)$. Thus $\bigcap\{k_A(f) : A \in \mathcal{K}\} \subseteq {}^*W(K, U)$ for any $K$ and $U$ with $f[K] \subset U$, and the second containment follows. (iii) A subbase for $\mathcal{P}$ consists of sets of the form $W(\{x\}, U)$, and so $\mathcal{P}$ is weaker than $\mathcal{C}$ and the third containment follows. □
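The last containment in Proposition 8.3 can be proper, as the following sketch suggests. The spaces $X = [0,1]$, $Y = R$, the zero function $f$, and the internal "spike" $g$ are illustrative choices of ours; $\omega$ is any infinite element of ${}^*N$, and $p(f)$, $c(f)$ denote the monads of the pointwise and compact-open topologies.

```latex
% Sketch: an internal map that is pointwise near f = 0 but not near f
% in the compact-open topology.
\begin{align*}
g(x) &= \begin{cases} 1, & x = 1/\omega, \\ 0, & x \in {}^*[0,1],\ x \neq 1/\omega, \end{cases} \\
g(x) &= 0 = f(x) \ \text{for every standard } x \in [0,1]
      \ \Longrightarrow\ g \in p(f) \ \text{by (8.2)}, \\
K &= [0,1],\quad U = (-\tfrac12,\ \tfrac12):\qquad
      f[K] \subset U,\quad g(1/\omega) = 1 \notin {}^*U, \\
g &\notin {}^*W(K,U) \ \Longrightarrow\ g \notin c(f).
\end{align*}
```

The same $g$ fails the defining condition of $k_X(f)$, since $1/\omega \simeq 0$ while $g(1/\omega) = 1 \not\simeq f(0)$.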

8.4 Theorem Each topology which is jointly continuous with respect to the family of compact subsets of $X$ is stronger than $\mathcal{C}$.
Proof: Immediate from 8.1 and 8.3. □

8.5 Theorem Assume $F \subseteq M$ is closed with respect to $\mathcal{P}$. Then $F$ is compact in $(M,\mathcal{P})$ if for each $x$ the set $\{f(x) : f \in F\}$ has compact closure in $Y$.
Proof: Our condition guarantees that, for any $x \in X$, every point in $^*\{f(x) : f \in F\} = \{g(x) : g \in {}^*F\}$ is near a standard point in $Y$. Given $g \in {}^*F$, let $f(x)$ be defined for each $x \in X$ by setting $f(x) = y_x$, where $y_x$ is a point in $Y$ with $y_x \simeq g(x)$ [such a point is unique since $(Y,\mathcal{S})$ is Hausdorff]. Then $f \in M$ and $f(x) \simeq g(x)$ for all $x \in X$, i.e., $g \in p(f)$. Since $g \in {}^*F$ and $F \subseteq M$ is closed, $f \in F$. Thus each $g \in {}^*F$ is near a standard $f \in F$. □

The fact that $\{f(x) : f \in F\}$ has compact closure for each $x \in X$ is an essential ingredient in obtaining a function $f \in F$ from a function $g \in {}^*F$. The argument of Theorem 8.5 does not work, however, for the compact-open topology since the condition $g(x) \simeq f(x)$ for all $x \in X$ is not sufficient to guarantee that $g \in c(f)$. If, however, $g(x') \simeq f(x)$ for all $x \in X$ and $x' \in {}^*X$ with $x' \simeq x$, then $g \in k_X(f) \subseteq c(f)$ (by Proposition 8.3) and compactness follows. A standard condition guaranteeing that this holds is the following from Kelley [20].

8.6 Definition The family $F$ is evenly continuous if for each $x \in X$, $y \in Y$ and each open neighborhood $U$ of $y$, there are neighborhoods $V$ of $x$ and $W$ of $y$ so that for all $f \in F$ with $f(x) \in W$, we have $f[V] \subseteq U$.

8.7 Proposition The family $F$ is evenly continuous iff the following condition holds: Given $x \in X$ and $y \in Y$, if $g \in {}^*F$ and $g(x) \simeq y$, then $g(x') \simeq y$ for all $x' \simeq x$ in $^*X$.
Proof: Assume first that $F$ is evenly continuous. Fix a neighborhood $U \in \mathcal{S}_y$ and the corresponding sets $V \in \mathcal{T}_x$ and $W \in \mathcal{S}_y$ given by Definition 8.6. Since $g(x) \simeq y$, $g(x) \in {}^*W$, so by transfer $g[{}^*V] \subseteq {}^*U$. In particular, $g(x') \in {}^*U$ if $x' \simeq x$. This last statement is true for any $U \in \mathcal{S}_y$, and so $g(x') \simeq y$ if $x' \simeq x$. To prove the converse, fix $U \in \mathcal{S}_y$ and let $V$ and $W$ be *-open sets in $^*\mathcal{T}_x$ and $^*\mathcal{S}_y$, respectively, with $V \subseteq m_X(x)$ and $W \subseteq m_Y(y)$. Now if $g \in {}^*F$ and $g(x) \in W$, then $g(x) \simeq y$. By assumption, for all $x' \in V$, $g(x') \in m_Y(y) \subseteq {}^*U$. The rest follows by downward transfer. □

As a corollary we get a generalized Ascoli theorem due to Kelley [20].

8.8 Ascoli Theorem If $F \subseteq M$ is closed in $\mathcal{C}$ and evenly continuous, and $\{f(x) : f \in F\}$ has compact closure for each $x \in X$, then $F$ is compact in $(M,\mathcal{C})$.
Proof: Immediate from the discussion preceding Definition 8.6. □

For the rest of this section we assume that $(Y,\mathcal{S})$ is a metric space with metric $d$. In this context, a notion which is closely related to even continuity is the notion of equicontinuity, which has already been presented in the real-variable case in Definition I.13.6.

8.9 Definition A family $F \subseteq M$ is called equicontinuous on $X$ if, for each $x \in X$ and each $\varepsilon > 0$ in $R$, there is a $V \in \mathcal{T}_x$ such that, for any $f \in F$, if $x' \in V$, then $d(f(x'), f(x)) < \varepsilon$.

8.10 Proposition The family $F \subseteq M$ is equicontinuous on $X$ iff, for any $x \in X$ and any $g \in {}^*F$, $g(x') \simeq g(x)$ whenever $x' \simeq x$.
Proof: Exercise. □

If $F$ is the family of functions $x \mapsto n + nx$ ($n \in N$), then $F$ is evenly continuous but not equicontinuous on $[0,1]$. By Propositions 8.7 and 8.10, any equicontinuous family $F \subseteq M$ is evenly continuous. If $F \subseteq M$ is a family of continuous functions, then the compact-open topology in $F$ is the same as the topology of uniform convergence on compact sets, or the topology of compact convergence. For the latter topology, a typical basic open neighborhood of $f \in F$ is of the form $\{g \in F : d(f(x), g(x)) < \varepsilon \text{ for all } x \in K\}$ for some compact $K \subseteq X$ and $\varepsilon > 0$ in $R$ (see [20, p. 229]). It follows from Theorem 8.8 that if $F$ is an equicontinuous family in $M$ (whence each $f \in F$ is continuous), and $F$ is closed in $M$ with respect to the topology of uniform convergence on compact sets with $\{f(x) : f \in F\}$ having compact closure in $Y$ for each $x \in X$, then $F$ is compact with respect to the topology of uniform convergence on compact sets. Moreover, for an equicontinuous family $F$, the topology of pointwise convergence is jointly continuous on compact sets (exercise), and hence coincides with the topology of uniform convergence on compact sets.

Exercises III.8

1. Use Theorem 8.5 to prove Alaoglu's theorem, 4.22.
2. Prove Proposition 8.10.
3. (a) Show that the set of real-valued continuous functions on $R$ (with the usual topology) is closed with respect to the topology of uniform convergence on compact sets. (b) Show that part (a) is no longer true if we replace the usual topology on $R$ with a topology $\mathcal{T}$ such that $\{r\} \in \mathcal{T}$ for each $r \neq 0$ in $R$, and $U$ is an open neighborhood of 0 if $0 \in U$ and $R - U$ is countable. [Hint: What are the compact sets? Is $g$ continuous if $g(0) = 1$ and $g(r) = -1$ for $r \neq 0$?]
4. Show that if $(Y,\mathcal{S})$ is a metric space and $F$ is an equicontinuous family, then the topology of pointwise convergence is jointly continuous on compact sets and hence coincides with the topology of uniform convergence on compact sets.
5. Let $C$ denote the set of real-valued continuous functions on $I = [0,1]$. Then the map $d : C \times C \to R^+$ defined by $d(f,g) = \max\{|f(x) - g(x)| : x \in I\}$ is a metric on $C$. Show that the compact-open topology on $C$ coincides with the metric topology.
6. Show that the space $C(X,Y)$ of continuous mappings from $(X,\mathcal{T})$ to $(Y,\mathcal{S})$ with the compact-open topology is Hausdorff if $(Y,\mathcal{S})$ is Hausdorff.

CHAPTER IV

Nonstandard Integration Theory

In trying to apply the theory of the Riemann integral we are faced with the following technical problem. Suppose we are given a converging infinite series $\sum_{i=1}^{\infty} f_i(x) = f(x)$ of functions on $[a,b]$ and are asked to calculate $\int_a^b f(x)\,dx$. The answer is often simple if we can write

$\int_a^b f(x)\,dx = \sum_{i=1}^{\infty} \int_a^b f_i(x)\,dx.$

Thus we need to find conditions under which integration and infinite summation can be interchanged. Equivalently [letting $g_n(x) = \sum_{i=1}^{n} f_i(x)$] we need conditions under which, if $g(x) = \lim_{n\to\infty} g_n(x)$, then

$\lim_{n\to\infty} \int_a^b g_n(x)\,dx = \int_a^b g(x)\,dx$

for a sequence $\{g_n(x)\}$ of Riemann-integrable functions on $[a,b]$. It turns out that we can reduce the discussion to sequences $\{g_n(x)\}$ which are monotone increasing, i.e., $g_{n+1}(x) \ge g_n(x)$ for all $n \in N$ [this is the case if $f_n(x) \ge 0$ for all $n \in N$]. Thus, assuming that $\{g_n(x)\}$ is a monotone increasing sequence of integrable functions and $g_n(x)$ converges to $g(x)$ on $[a,b]$, we need conditions which ensure that $g(x)$ is integrable and the above equation holds. A result of this type is known as a monotone convergence theorem. Unfortunately, the conditions under which a monotone convergence theorem holds for Riemann integration are quite restrictive (for example, it holds if the sequence $\{g_n\}$ converges uniformly on $[a,b]$). This fact led Lebesgue [26] and others to generalize the process of integration in such a way that the conditions for a monotone convergence theorem were considerably relaxed. The procedure was to generalize the concept of the length of an interval so that one could measure the "length" of a very general subset of $[a,b]$ called a measurable set. The theory of integration then developed systematically from this "measure theory."


An alternative approach was developed by P. Daniell [11]. He began with the general notions of a lattice $L$ of functions on a set $X$ and an integral $I$ on $L$. As indicated in Definition 1.2, a lattice of functions is a linear space which is also closed under the operation of taking absolute values, and an integral $I$ on $L$ is a linear functional which is also positive [i.e., $f \ge 0$ implies $I(f) \ge 0$]. Daniell showed that if $I$ satisfied the additional continuity condition "If $\{f_n\}$ decreases to 0 then $I(f_n)$ decreases to 0," $(L,I)$ could be enlarged to a structure $(\hat L,\hat I)$ which satisfied the monotone convergence theorem. Our nonstandard approach to integration follows the Daniell approach except that we begin with an "internal" integration structure $(L,I)$ on an internal set $X$ in some enlargement. We show that, without any continuity assumption, we can construct from $(L,I)$ a standard integration structure $(\hat L,\hat I)$ on the same internal set $X$, and that structure satisfies the monotone convergence theorem. In §IV.2 we show that the usual measure-theoretic approach can be recovered from any structure $(\hat L,\hat I)$ satisfying the monotone convergence theorem. The usual Lebesgue theory on $R^n$ is developed in §IV.3 by using the standard part map to carry results on $^*R^n$ down to $R^n$. Some important convergence theorems which hold in any structure for which the monotone convergence theorem is valid are developed in §IV.4. A nonstandard approach to the Fubini theorem, which is an analogue of the iterated integration procedure for the Riemann integral, is developed in §IV.5. Finally, in §IV.6 we apply the nonstandard integration theory developed in the previous sections to study several important stochastic processes, including the Poisson process and Brownian motion. These processes are represented as processes on a *-finite probability space and indicate the usefulness of an integration theory on nonstandard sets.
References to the original work on nonstandard integration theory will be given in the body of this chapter, with the exception, as noted in the Preface, of [27,29,32,33] by the second author.

IV.l Standardizations of Internal Integration Structures

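The Riemann integral for continuous functions on an interval $[a,b]$ (see §I.12) has the properties

(1.1) $\int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx$ for all real $\alpha, \beta$,

(1.2) $\int_a^b f(x)\,dx \ge 0$ if $f(x) \ge 0$ on $[a,b]$.
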
Implicit in (1.1) is the fact that a linear combination of continuous functions is continuous. It is also true that $|f|$ is continuous if $f$ is continuous. A general theory of integration should specify (A) a class $L$ of "integrable" functions on a space $X$ corresponding to the continuous functions on $[a,b]$ in the above example, and (B) a real-valued function $I$ on $L$ whose value at $f \in L$ we denote by $If$ (a numerical-valued function on a set of functions is usually called a functional). Here $If$ corresponds to the Riemann integral of $f$. In general, the analogues of the properties above should be satisfied. We abstract these properties in the notion of an integration structure. It consists of a lattice of functions and a positive linear functional on this lattice as in Definition 1.2 below. This definition incorporates the standard (real) and nonstandard (hyperreal) notions of integration structures since we want to consider internal analogues of integration structures when the functions are internal and hyperreal-valued.

Our main objective in this section is to show how, beginning with an internal integration structure $(L,I)$ on an internal set $X$, we can construct a real integration structure $(\hat L,\hat I)$ on the same internal set $X$ by a process called standardization. The important fact is that the real integration structures so obtained satisfy a closure property called the monotone convergence theorem. This theorem states roughly that a monotone increasing sequence $(f_n)$ of functions in $\hat L$, whose integrals $\hat I f_n$ are uniformly bounded, converges to a function $f \in \hat L$, and $\hat I f$ is the limit of $(\hat I f_n)$. It is the basic tool in all further developments of integration theory. We begin with a definition summarizing standard notation.

1.1 Definition Let $X$ be a set and $E \subseteq X$. The functions $\chi_E$, 1, and 0 on $X$ are defined by

$\chi_E(x) = 1$ if $x \in E$, $\chi_E(x) = 0$ if $x \notin E$,

$1 = \chi_X$, and $0 = \chi_\emptyset$, where $\emptyset$ is the empty set. If $f$ and $g$ are functions on $X$, we write $f \le g$ if $f(x) \le g(x)$ for all $x \in X$; we define $\alpha f$, $f + g$, $fg$, $f/g$ (if $g$ does not vanish at any point in $X$), and $|f|$ as usual by assigning the values $\alpha f(x)$, $f(x) + g(x)$, $f(x)g(x)$, $f(x)/g(x)$, and $|f(x)|$ at $x \in X$.

1.2 Definition A set $L$ of real- or hyperreal-valued functions on a set $X$ is a real (hyperreal) lattice if

(a) $f, g \in L$ implies $\alpha f + \beta g \in L$ for all real (hyperreal) $\alpha, \beta$,
(b) $f \in L$ implies $|f| \in L$.

A real- or hyperreal-valued function $I$ on $L$ is called a real (hyperreal) positive linear functional (p.l.f.) if

(c) $I(\alpha f + \beta g) = \alpha If + \beta Ig$ for all $f, g \in L$ and real (hyperreal) $\alpha, \beta$,
(d) $If \ge 0$ if $f \ge 0$.

The pair $(L,I)$ then forms a real (hyperreal) integration structure on $X$. The integration structure $(\hat L,\hat I)$ on $X$ is an extension of the integration structure $(L,I)$ if $L \subseteq \hat L$ and $\hat I f = If$ when $f \in L$. If the sets $X$ and $L$ (and hence all $f \in L$) and the functional $I$ are internal in some enlargement $V(^*S)$ of a superstructure $V(S)$, then we say that $(L,I)$ is an internal integration structure. A lattice $L$ always contains 0 (check), and is also closed under the operations of taking maxima and minima, defined as follows.

1.3 Definition If $f$ and $g$ are (real- or hyperreal-valued) functions defined on $X$, we define the maximum and minimum of $f$ and $g$ by

$\max(f,g) = f \vee g = (f + g + |f - g|)/2,$
$\min(f,g) = f \wedge g = (f + g - |f - g|)/2,$

and the positive and negative parts of $f$ by $f^+ = f \vee 0$, $f^- = (-f) \vee 0$.

Clearly, if $L$ is a lattice and $f, g \in L$ then $f \vee g, f \wedge g \in L$. Conversely, if $L$ is a set of functions on $X$ which is closed under linear combinations and for which $f, g \in L$ implies $f \vee g$ and $f \wedge g \in L$, then $L$ is a lattice (Exercise 1). Notice that if $f, g \in L$ and $f \ge g$, then the inequality $If \ge Ig$ follows from 1.2(d). This fact will be used frequently in the development. The following are examples of real integration structures of real-valued functions.

1.4 Examples

1. Let $C[a,b]$ denote the set of all continuous real-valued functions on the finite interval $[a,b] \subset R$. Define the linear functional $\int_a^b$ on $C[a,b]$ by $\int_a^b f = \int_a^b f(x)\,dx$ (Riemann integral). Then $(C[a,b], \int_a^b)$ is a real integration structure on $[a,b]$ (exercise). Note that $1 \in C[a,b]$.

2. Let $C_c(R)$ denote the set of all continuous real-valued functions $f$ on $R$ with compact support, where the support of $f$ is the set

$\mathrm{supp}\, f = \{x : f(x) \neq 0\}.$

(a) Let $\int$ denote the functional on $C_c(R)$ defined by $\int f = \int_a^b f(x)\,dx$ if $\mathrm{supp}\, f \subseteq [a,b]$. (The definition of $\int$ is independent of the choice of $a$ and $b$ satisfying this condition.) Then $(C_c(R), \int)$ is a real integration structure (exercise). Note that $1 \notin C_c(R)$.
(b) Let $\{\ldots, x_{-2}, x_{-1}, x_0, x_1, \ldots\}$ be a countable set of points in $R$ with no limit point. For each $f \in C_c(R)$ let $\Sigma f = \sum_{i=-\infty}^{\infty} f(x_i)$. Then $(C_c(R), \Sigma)$ is a real integration structure on $R$ (exercise).

3. A step function on $R$ is a function $f$ of the form $f = \sum_{i=1}^{n} c_i \chi_{E_i}$, where the sets $E_i$ are disjoint finite intervals (open, closed, or semiopen; this includes the case where the end points are equal and $E_i$ is thus a single point). Let $S(R)$ denote the set of step functions on $R$. Define the functional $\$$ on $S(R)$ by $\$ f = \sum_{i=1}^{n} c_i (b_i - a_i)$ if $f = \sum_{i=1}^{n} c_i \chi_{E_i}$ and $E_i$ has the end points $a_i$ and $b_i$, $a_i \le b_i$. Then $(S(R), \$)$ is a real integration structure on $R$ (exercise).

4. With $Y = \{x_1, \ldots, x_n\}$ a finite set, let $B(Y)$ denote the set of all real-valued functions on $Y$. If $a_1, \ldots, a_n$ are fixed real numbers with $a_i > 0$, $1 \le i \le n$, define the functional $\Sigma$ on $B(Y)$ by $\Sigma f = \sum_{i=1}^{n} a_i f(x_i)$. Then $(B(Y), \Sigma)$ is a real integration structure on $Y$ (exercise).

5. With $Y$ any nonempty set, let $B_0(Y)$ denote the set of all real-valued functions on $Y$, each of which is zero except for finitely many $x \in Y$. If $a$ is a positive real-valued function on $Y$, let $\Sigma_0$ denote the functional on $B_0(Y)$ defined by $\Sigma_0 f = \sum_{i=1}^{n} a(x_i) f(x_i)$, where $\mathrm{supp}\, f = \{x_1, \ldots, x_n\}$. Then $(B_0(Y), \Sigma_0)$ is a real integration structure on $Y$ (exercise). If $Y$ is a finite set, this example degenerates to Example 1.4.4.
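As an informal illustration of Example 1.4.4 (not part of the original text), the following Python sketch checks conditions 1.2(c) and 1.2(d) and the maximum formula of Definition 1.3 on one small finite set. The set `Y`, the weights, and the sample functions `f` and `g` are hypothetical values chosen only for the check; exact rational arithmetic is used so that the equalities are tested without rounding.

```python
from fractions import Fraction

# Hypothetical finite set Y and positive weights a_i (illustrative values only).
Y = ["x1", "x2", "x3"]
weights = {"x1": Fraction(1, 2), "x2": Fraction(2), "x3": Fraction(3, 4)}

def integral(f):
    """The functional of Example 1.4.4: the sum of a_i * f(x_i) over Y."""
    return sum(weights[x] * f[x] for x in Y)

def combine(alpha, f, beta, g):
    """Pointwise linear combination alpha*f + beta*g."""
    return {x: alpha * f[x] + beta * g[x] for x in Y}

def absolute(f):
    """Pointwise absolute value |f|."""
    return {x: abs(f[x]) for x in Y}

def maximum(f, g):
    """f v g computed via the formula (f + g + |f - g|)/2 of Definition 1.3."""
    diff = absolute(combine(1, f, -1, g))
    return {x: (f[x] + g[x] + diff[x]) / 2 for x in Y}

f = {"x1": Fraction(3), "x2": Fraction(-1), "x3": Fraction(4)}
g = {"x1": Fraction(-2), "x2": Fraction(5), "x3": Fraction(0)}
alpha, beta = Fraction(2), Fraction(-3)

# Linearity, condition 1.2(c), for this particular choice of f, g, alpha, beta.
linear_ok = integral(combine(alpha, f, beta, g)) == alpha * integral(f) + beta * integral(g)
# Positivity, condition 1.2(d), checked on the nonnegative function |f|.
positive_ok = integral(absolute(f)) >= 0
# The lattice formula agrees with the pointwise maximum.
max_ok = maximum(f, g) == {x: max(f[x], g[x]) for x in Y}
```

A check on sample data of this kind is of course no substitute for the proof requested in the exercise; it only makes the definitions concrete.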

The next proposition, easily proved using the transfer principle, shows that each standard real integration structure on a set Y (in particular, each of Examples 1.4) gives rise to an internal integration structure on * Y by transfer. We now fix an enlargement of a structure containing Y, with the associated monomorphism *. 1.5 Proposition If (L, I) is a real integration structure on a set Y, then (*L,*I) is an internal integration structure on X = *Y.

Proof: Exercise. 0 There are internal integration structures which cannot be obtained from a real integration structure by using Proposition 1.5, as the following example shows. 1.6 Hyperfinite Integration Structures Let X be an internal *-finite set {x,, . . . ,x,) in an enlargement V(*S) of some superstructure Y(S). Let

B,(X) denote the set of all hyperreal-valued internal functions on X. With {al,, . . ,a,} a fixed set of hyperreal nonnegative numbers of the same internal cardinality as X, let denote the hyperreal functional on B,(X) defined by f= a&,), where the summation is the extension of finite

c;"= c,



summation. Then this set of functions together with this functional forms a hyperreal integration structure on $X$ (Exercise 5).

Such "hyperfinite" integration structures have recently been used as the starting point in an extensive nonstandard treatment of Brownian motion and other stochastic processes. An introduction to this theory is presented in §IV.6.

Now let $(L,I)$ be an internal hyperreal integration structure on an internal set $X$ in an enlargement $V(^*S)$ of a superstructure $V(S)$ containing the reals. Our main objective in this section is to construct a real integration structure $(\hat L,\hat I)$ on the same internal set $X$ so that the monotone convergence theorem is valid. $(\hat L,\hat I)$ will be called the standardization of $(L,I)$. To prove the convergence theorem and other results we need to assume that $V(^*S)$ is $\aleph_1$-saturated. Thus we assume from now on without further explicit comment that any internal structure $(L,I)$ being standardized lies in an $\aleph_1$-saturated enlargement $V(^*S)$ of a superstructure $V(S)$. $\hat L$ is now defined as follows.

1.7 Definition Let $(L,I)$ be an internal integration structure on an internal set $X$. We define the set $L_0$ of null functions to be the set of hyperreal-valued (possibly external) functions $g$ on $X$ such that, for each $\varepsilon > 0$ in $R$, there is a $\psi \in L$ with $|g| \le \psi$ and ${}^\circ I\psi < \varepsilon$. Further we define $\hat L$ to be the set of real-valued functions $f$ on $X$ such that $f = \phi + g$, where $\phi \in L$, ${}^\circ I|\phi| < \infty$, and $g \in L_0$.

*

1.8 Lemma

(a) I f f = 4

f = 4? #with

44 - 4) = 0.(b) If

(41 v 4 2 )

fi E L

and

+ g E 2 with 4 E L , "I141< 00,

g E Lo, and we also have

4~L , # E Lo, then '1141 < m and 4 - 4E Lo,so I4 - 14 = with

fi = q5i + gi, 4i E L, giE Lo ( i = 1,2),

(flA h ) - (41A 4 2 )

then

(fl

vf2)

-

are in Lo.

Proof: (a) Since 6 - 4 = g - # E Lo, we have I"I(8 - 4)l I'118 - 41 = 0 and_ I"I($l- oIlc$l I I '114 - 41 (Exercise 6). It follows that "Z(4l< co and 1'14 -

= 0.

E > 0 in R, there is a (why?). From the inequalities

(b) Given

(41 v 4 2 )

it follows that

(fl

-

*

I,+E L with lgil < J/ ( i = 1,2) and "I$
0 in R we may find an m E N so that, for n 2 m in N, B - E < ff,,5 B, and hence B - E < 14,,< B + E for any E > 0 in R. We now use saturation again. As in the proof of Theorem 1.14 we can extend the sequence ( & E L : n E N) to (&E L:nE*N) so that it is still increasing (if necessary repeat some 4 E L for all n 2 some k in *N,).Thus, for some infinite w, 4m2 $,, for each n E N and '14" = sup{olc$,,:n E N} (Exercise 8). We need only show that f - 4mE Lo. Fix E > 0 in R, and for each n E N choose a $,, E L with Ig,( I $,, and I$,, < E/2". Again by Kl-saturation we may extend the sequence ($,,:n E N)

+


to ($,,:n E *N) so that, for some infinite k E *N, $,, 2 0 and l$,, < ~ / for 2 ~ each n < k. Let $ = $,,. Then I $ < E and

+

If=,

(1.3) 4 n I4 n for each n E N, so that (4n - 4 m )

$n

54n

+ gn sf

-$sf- 4 m 5 ~

(1

4

+ ~ ) ( 4 m+ J/)

+m (1 +

We may choose n E N so large that -2E

Also, I(&&

+

n(x - a) 1, 1 - n(x - b),

aIx_.

e

-

Exercises I V.1

1. Show that if L is closed under linear combinations and f , g E L f v g E L and f A g E L, thenfE L E L. 2. (Standard) Show that the structures in Examples 1.4 are real integration structures. 3. Let X = 1, and y = (y,) E 1,; assume that y, 2 0 for all i E N. Show that (X,I) is an integration structure if we define Ix = ( x , y ) for all x E X. 4. Prove Proposition 1.5. 5. Show that the structure in 1.6 is a hyperreal inteBration structure. 6. (a) Show that, for functions 4 and in L, I"Il+l - "Il4ll 5 - 41. (b) Show that if 4 E L n Lo, then " I [ $ / = 0 7. Prove that i is a real lattice. 8. In the proof of Theorem 1.15, show that for some infinite o E *N, 4u 2 for all n E N,and = sup{014n:nE N}. 9. Show that one cannot in general replace (1 + E)(+, + I)) with (4" + $) in the right-hand side of Eq. (1.3) in the proof of Theorem 1.15. 10. Let (L, I) be an internal integration structure on the internal set X and suppose that the function f is real-valued and nonnegative. Show that f € L o iff f~ i on X and ff = 0. 11. (Comparison Theorem) Let (L, I)and (L', 1')be two internal integration structures on the internal set X.Suppose that Lo G Lb and that for each 4 E L there exists a I) E L' so that 14 'v I'I) and 4 - $ E Lb. Show that L E and~ if =?f for all f E L .

If1

4

OI$,,,

O I I 4


12. Use Exercise 11 to show that if (L, I) and (L', Z') are the *-transfers (*C,(R), *j)and (*S(R),*$)of the structures in Examples 1.4.2(a) and 1.4.3 f) = (2,f'). respectively, then (i, 13. In the standardization of Example 1.18.1 give an example of a function g E Lo which takes infinitely large values. 14. (Standard) A collection of subsets of a set X is a ring if A, B E Y implies that A v B and A - B E 9. A function v : Y + R + is a jinitely additive measure on S if v(A u B) = v(A) v(B) for A, B E Y with A n B = 0.

+

(a) Show that if A, B E Y then A n B and A AB = ( A - B) u ( B - A) E Y . (b) Show that if d is any collection of subsets of X then there is a unique ring Y containing 8. (Hint: Y is the intersection of all rings containing 8.) (c) Show that the set L of all linear combinations of characteristic functions of disjoint sets in Y is a lattice. (d) Show that if 4 = aizA,E L, we may unambiguously define I4 aiv(Ai) and that ( L , I ) is an integration structure.

=cl=l

15. Develop the internal analogues of the notions in Exercise 14.

IV.2 Measure Theory for Complete Integration Structures

In the last section we showed that the monotone convergence theorem holds for the integration structure $(\hat L,\hat I)$ obtained from an internal structure $(L,I)$ by standardization. In this section we develop a measure theory for any integration structure $(\hat L,\hat I)$ for which the monotone convergence theorem is valid. Such structures will be called complete.

2.1 Definition A real integration structure $(\hat L,\hat I)$ on a set $X$ is complete if whenever $(f_n \in \hat L : n \in N)$ is a monotone increasing sequence for which

(a) $\lim_{n\to\infty} f_n(x) = f(x)$ exists for all $x \in X$,
(b) $\sup\{\hat I f_n : n \in N\} = \lim_{n\to\infty} \hat I f_n < \infty$,

then $f \in \hat L$ and $\hat I f = \lim_{n\to\infty} \hat I f_n$.

Throughout this section $(\hat L,\hat I)$ will denote a complete integration structure. Our first objective is to introduce a set $M$ of functions which includes the set $\hat L$. The functions in $M$ are called measurable functions. Roughly speaking, measurable functions will have the same regularity as functions in $\hat L$ but may not have finite integrals. We will find that products of measurable functions are measurable, a useful fact that is not in general true for functions in $\hat L$. We then extend the functional $\hat I$ to a subset $\hat L_1$ of $M$, and obtain a real integration structure which is an extension of $(\hat L,\hat I)$. We will also study the basic properties of those sets, called measurable, whose characteristic functions are in $M$. This leads to a discussion of measure theory which is often taken as the starting point for a standard development of integration theory and is important in many areas of analysis; in particular, it is basic to probability theory. We will show that the two approaches are equivalent. Most of the proofs are standard except at the end of the section where we establish connections with §IV.1.

The functions in $M$ will be extended real-valued functions; that is, they may take the values $+\infty$ and $-\infty$. Thus we make the following definition.

2.2 Definition The extended real number system is the set $\bar R = R \cup \{-\infty, +\infty\}$. By convention $-\infty < x$ and $x < +\infty$ for all $x \in R$. The rules of arithmetic for $R$ are supplemented by the following rules: If $x \in R$ then

$(\pm\infty) + (\pm\infty) = x + (\pm\infty) = (\pm\infty) + x = \pm\infty,$
$(\pm\infty)(\pm\infty) = +\infty, \quad (\pm\infty)(\mp\infty) = -\infty,$
$x(\pm\infty) = (\pm\infty)x = \pm\infty$ if $x > 0$, $= 0$ if $x = 0$, $= \mp\infty$ if $x < 0$,
$x/(\pm\infty) = 0$ for all $x \in R$.

If a set $A \subseteq \bar R$ is not bounded above we define $\sup A = +\infty$, and if $A$ is not bounded below we define $\inf A = -\infty$, with a similar convention for lim sup and lim inf. As usual, we often denote $+\infty$ by $\infty$. Notice that we have not defined $(\pm\infty) + (\mp\infty)$, $(\pm\infty)/(\pm\infty)$, or $(\pm\infty)/(\mp\infty)$.
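The conventions of Definition 2.2 can be sketched in Python as follows. This is an illustrative aid rather than anything from the original text: the function names `ext_add`, `ext_mul`, and `ext_div` are hypothetical, and floating-point infinity (`math.inf`) stands in for $\pm\infty$. The one place where the definition departs from ordinary IEEE floating-point behaviour is the product $0 \cdot (\pm\infty)$, which the definition sets equal to 0 and which plain floating-point multiplication evaluates to NaN.

```python
import math

INF = math.inf

def ext_add(x, y):
    """Sum in the extended reals; (+inf) + (-inf) is left undefined."""
    if math.isinf(x) and math.isinf(y) and x != y:
        raise ValueError("(+inf) + (-inf) is not defined")
    return x + y

def ext_mul(x, y):
    """Product in the extended reals, with the convention 0 * (+-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y

def ext_div(x, y):
    """Quotient; x / (+-inf) = 0 for real x, and (+-inf)/(+-inf) is left undefined.

    For a finite nonzero divisor this simply falls back on ordinary division,
    which goes beyond the cases listed in Definition 2.2.
    """
    if math.isinf(y):
        if math.isinf(x):
            raise ValueError("(+-inf) / (+-inf) is not defined")
        return 0.0
    return x / y
```

For instance, `ext_mul(0, INF)` returns `0.0`, whereas `math.isnan(0 * INF)` is true for the unmodified floating-point product.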

2.3 Definition $\hat L^+$ denotes the set of nonnegative functions in $\hat L$. We denote by $M^+$ the set of nonnegative $\bar R$-valued functions $h$ on $X$ such that $h \wedge f \in \hat L$ for each $f \in \hat L$. If $h \in M^+$ we define

$Jh = \sup\{\hat I(h \wedge f) : f \in \hat L\}.$

$J$ is an $\bar R$-valued function on $M^+$. We denote by $M$ the set of $\bar R$-valued functions $h$ on $X$ whose positive and negative parts $h^+ = h \vee 0$ and $h^- = -h \vee 0$ are both in $M^+$. If $h \in M$ and either $Jh^+$ or $Jh^-$ is finite, we define

$Jh = Jh^+ - Jh^-.$
2.4 Remarks

1. Since i is a lattice we see that M 3 2, and it is easy to check that if h E i then j h = fh. 2. In defining M' and j h for h E M', we may assume that f E L', where itis the set of nonnegative functions in t.That is, fix h 2 0 and suppose that h ~ i f ofr allf~ E i'. Then iff = f + - f - E i,we have h ~ ( f ' - f - ) = (h ~f') - f- E i.Similarly, j h = sup{f(h~ f ) : Ef i'} for h E M'. 3. An easy calculation shows that j h = sup{@:O ~f I h, f E i} for h E M'. This formula will be used later without explicit comment. 4. Suppose that (i, f) is obtained by standardization from (L,I). For h E M + ,j h may be less than the supremum of the integrals 'Z4 for 4 E L, 0 I 4 Ih. For example, let X = {x, y}, and let L be the internal set of *R-valued functions on X. For 4 E L define 14 = &(x) + w$~(y),where o E *N,. Then eachf E vanishes at y, 1 E M,,f1 = 1, but sup{"l4:4 E L, 0 I4 Il } = 03.

+

2.5 Proposition If $h_1, h_2 \in M^+$ and $\alpha \in R^+$, then $h_1 + h_2$, $\alpha h_1$, $h_1 \wedge h_2$, and $h_1 \vee h_2$ are in $M^+$. Also $J(h_1 + h_2) = Jh_1 + Jh_2$, $J(\alpha h_1) = \alpha Jh_1$ for $\alpha \in R^+$, and $Jh_1 \le Jh_2$ if $h_1 \le h_2$.

Proof: Let $f \in \hat L^+$. Then

$(h_1 + h_2) \wedge f = [(h_1 \wedge f) + (h_2 \wedge f)] \wedge f \in \hat L.$

For $\alpha > 0$, $(\alpha h_1 \wedge f) = \alpha(h_1 \wedge (1/\alpha)f) \in \hat L$. Similarly $h_1 \wedge h_2$ and $h_1 \vee h_2 \in M^+$. For any $f \in \hat L^+$, the reader should check that $(h_1 + h_2) \wedge f \le (h_1 \wedge f) + (h_2 \wedge f)$. Thus

$\hat I((h_1 + h_2) \wedge f) \le \hat I(h_1 \wedge f) + \hat I(h_2 \wedge f) \le Jh_1 + Jh_2.$

Taking the supremum on the left-hand side, we obtain

$J(h_1 + h_2) \le Jh_1 + Jh_2.$

On the other hand, suppose $f_1, f_2 \in \hat L$ and $f_1 \le h_1$, $f_2 \le h_2$. Then $f_1 + f_2 \le h_1 + h_2$, so

$\hat I f_1 + \hat I f_2 = \hat I(f_1 + f_2) \le J(h_1 + h_2),$

and hence $Jh_1 + Jh_2 \le J(h_1 + h_2)$. Thus $J(h_1 + h_2) = Jh_1 + Jh_2$. The rest is left as an exercise. □

Our next result extends the monotone convergence theorem to $(M^+, J)$. In considering its meaning remember that $J$ is an extended real-valued function and takes on the value $+\infty$ for many functions in $M^+$.

2.6 Monotone Convergence Theorem for $(M^+, J)$ If $(h_n \in M^+ : n \in N)$ is an increasing sequence in $M^+$, then $h = \sup h_n \in M^+$ and $Jh = \sup\{Jh_n : n \in N\} = \lim_{n\to\infty} Jh_n$.
Proof: Let $f \in \hat L^+$. Then $h_n \wedge f \in \hat L$ for each $n$, the sequence $(h_n \wedge f : n \in N)$ increases to $h \wedge f$, and $\sup\{\hat I(h_n \wedge f) : n \in N\} \le \hat I f < \infty$. By completeness, $h \wedge f \in \hat L$ and $\hat I(h \wedge f) = \lim \hat I(h_n \wedge f)$. Thus $h \in M^+$ and

$Jh = \sup\{\hat I(h \wedge f) : f \in \hat L^+\} = \sup\{\sup\{\hat I(h_n \wedge f) : f \in \hat L^+\} : n \in N\} = \sup\{Jh_n : n \in N\}.$ □

It is now natural to restrict our attention to those functions in $M$ whose integrals are finite.

2.7 Definition We define $\hat L_1$ to be the set of $\bar R$-valued functions $h \in M$ for which $Jh$ is finite and $\hat L_1^+$ to be the set of nonnegative functions in $\hat L_1$.

The functions in $\hat L_1$ are extended real-valued functions. For this reason they cannot, in general, be added without encountering difficulties with expressions of the form $\infty - \infty$. We can, however, restrict ourselves to the real-valued functions in $\hat L_1$ and obtain an integration structure. Later we will show that with any function $f \in \hat L_1$ is associated a real-valued function $f' \in \hat L_1$ (which equals $f$ almost everywhere; see §IV.4) such that $Jf = Jf'$.

2.8 Proposition The set of real-valued functions in $\hat L_1$ together with $J$ forms a complete integration structure on $X$, $\hat L_1 \supseteq \hat L$, and $Jf = \hat I f$ if $f \in \hat L$.

Proof: Exercise. □

2.9 Remarks
1. To show that a given function $h$ is in $\hat L_1$ it suffices to show that $h \in M$ and $|h| \le g$ for some $g \in \hat L_1^+$ (exercise).
2. If $1 \in \hat L$, then every real-valued function in $\hat L_1$ is in $\hat L$ (exercise).
3. In general, $\hat L_1$ properly contains $\hat L$. In Example 1.18.3, $\hat L$ consists of all real-valued functions which vanish except perhaps at $x_0$, while $\hat L_1$ consists of all $\bar R$-valued functions $f$ which are finite at $x_0$, and $Jf = f(x_0)$. In particular, $1 \in \hat L_1 - \hat L$.

To proceed we need to make a further assumption on $L$ due, in the standard development of the subject, to Marshall Stone.

2.10 Definition A lattice $L$ (real or hyperreal) is Stonian if $\phi \in L$ implies $\phi \wedge 1 \in L$. An integration structure $(L,I)$ is Stonian if $L$ is Stonian.

2.11 Remarks
1. If $1 \in L$, then $L$ is Stonian.
2. If $\hat L$ is Stonian, then $1 \in M^+$.
3. Each of the real lattices in Examples 1.4 is a Stonian lattice.
4. If $L$ is a real Stonian lattice on a standard set $Y$, then $^*L$ is an internal Stonian lattice on $X = {}^*Y$.
5. If $L$ is Stonian, then $\phi \wedge \alpha \in L$ for any $\alpha > 0$ since $\phi \wedge \alpha = \alpha((1/\alpha)\phi \wedge 1)$.

2.12 Proposition If ( L , I ) is an internal Stonian integration structure on the internal set X, then the standardization (&f) is a Stonian integration structure. Proof: Let E > 0 in R and f E 2 be given. By Theorem 1.14 there are func< 00, and tions $ 1 ,t,h2 E L so that $1 I f I$ 2 , - $J < E. Then $ 1 A 1 I f A 1 5 $ 2 A 1 and 1 - $1 A 1) I‘f($2 - $1) < E, SO f~ 1 E by Theorem 1.14. 0

The above results show that all of the examples of integration structures encountered so far have been Stonian. In the rest of this chapter we will assume without further explicit comment that all integration structures are Stonian. To lead into our discussion of measurable sets we give an alternative characterization of measurable function in terms “good” sets which are defined as follows. 2.13 Definition We let XA E

t+.

2.14 Proposition If A AE9.

2 denote the collection of all sets A E X = {x E

X : f ( x ) > a}, where f

E

i’

for which

and a > 0, then



Proof: By considering (l/a)fwe may assume a = 1. Then f = f - f A 1 E i, and if B = {x E X : f ( x ) > 0 } then A = E. Also 1 A nf E i, f ( l A nf) I f(1 AS)I&! for all n E N , and so ze = lim(1 A nf) E by completeness. 0

2.15 Proposition M’ consists of all nonnegative extended real-valued functions h such that h A nzA E for each n E N and A E 2. Given h E M’, h = sup{f(h A nXA):nE N , A E 2).

I,,.

Proof: Givenf 2 0 in L, let A, = { x E X : f ( x )> l/n}, n E N. Then Ei by 2.14, and the result follows from completeness and the fact that h ~ = f limn+a [ h ~ n ~ ~and , ~h f~ ]n z ~ I , h~~ fn z ” I , h. 0

We are now ready to consider the notions of measurable set and measure. These notions were the starting point of the integration theory developed by Lebesgue. He proposed attaching a real number $\mu(A)$, called the measure of $A$, to a subset $A$ of a set $X$. The measure of a subset can be thought of as a generalization of the length of an interval on the real line, or the area of a rectangle in the plane. Thus it is natural to require that the measure of a disjoint union of sets is the sum of the measures of the sets, at least for finite unions. Unfortunately it is usually impossible to define $\mu$ on all subsets of a given set $X$. The best we can expect is that the subsets, called measurable, on which $\mu$ is defined are closed under countable unions and complements, and that the measure is “countably additive”. The general definitions of measurable sets and measure as presented by Lebesgue are as follows.

2.16 Definition A collection $\mathcal M$ of subsets of a set $X$ is called a σ-algebra if

(a) $X \in \mathcal M$,
(b) $A \in \mathcal M$ implies that the complement $A'$ of $A$ is in $\mathcal M$,
(c) $A_i \in \mathcal M$ $(i \in \mathbf N)$ implies $\bigcup A_i\ (i \in \mathbf N) \in \mathcal M$.

Each set in $\mathcal M$ is called measurable, and $(X, \mathcal M)$ is called a measurable space. A nonnegative function $\mu : \mathcal M \to \overline{\mathbf R}^{+}$ is called a measure on $\mathcal M$ if $\mu(\emptyset) = 0$ and

(d) for each collection $\{A_i \in \mathcal M : i \in \mathbf N\}$ which is disjoint (i.e., $A_i \cap A_j = \emptyset$ if $i \ne j$) we have $\mu(\bigcup A_i\ (i \in \mathbf N)) = \sum \mu(A_i)\ (i \in \mathbf N)$.

This property is called countable additivity. A measure $\mu$ on $\mathcal M$ is complete if

(e) whenever $A \in \mathcal M$ with $\mu(A) = 0$ and $B \subseteq A$, then $B \in \mathcal M$ (and thus $\mu(B) = 0$ since $\mu(B) \le \mu(A - B) + \mu(B) = \mu(A)$).

The triple $(X, \mathcal M, \mu)$ is called a measure space.
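As a concrete check of Definition 2.16, the following Python sketch tests conditions (a)-(c) for a family of subsets of a finite set, where closure under countable unions reduces to closure under pairwise unions. The function names, the four-point set, and the choice of counting measure are illustrative assumptions and do not come from the text.

```python
from itertools import combinations

def is_sigma_algebra(X, family):
    """Check (a)-(c) of Definition 2.16 for a family of subsets of a finite set X."""
    fam = {frozenset(A) for A in family}
    X = frozenset(X)
    if X not in fam:                              # (a) X belongs to the family
        return False
    if any(X - A not in fam for A in fam):        # (b) closed under complements
        return False
    # (c) on a finite set, pairwise unions suffice for countable unions
    return all(A | B in fam for A, B in combinations(fam, 2))

def counting_measure(A):
    """Counting measure: the number of points of A (an illustrative choice of measure)."""
    return len(A)

X = {1, 2, 3, 4}
M = [set(), {1, 2}, {3, 4}, X]       # a sigma-algebra on X
not_M = [set(), {1}, X]              # fails (b): the complement {2, 3, 4} is missing

print(is_sigma_algebra(X, M), is_sigma_algebra(X, not_M))
# additivity (d) on the disjoint pair {1, 2}, {3, 4}
print(counting_measure({1, 2} | {3, 4}) == counting_measure({1, 2}) + counting_measure({3, 4}))
```

On infinite sets condition (c) cannot be checked by enumeration, which is why the sketch is restricted to the finite case.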



2.17 Remarks
1. $\emptyset = X'$.
2. If $\{A_i : i \in \mathbf N\} \subseteq \mathcal M$ then, by De Morgan's law, $\bigcap A_i\ (i \in \mathbf N) = (\bigcup A_i'\ (i \in \mathbf N))' \in \mathcal M$.
3. Finite unions and intersections of sets in $\mathcal M$ are again in $\mathcal M$.
4. If $A, B \in \mathcal M$ then $A - B = A \cap B'$ and the symmetric difference $A \,\Delta\, B = (A - B) \cup (B - A)$ are in $\mathcal M$.
5. If $\mu$ is a measure on $(X, \mathcal M)$, then for any collection $\{A_n \in \mathcal M : n \in \mathbf N\}$ we have $\mu(\bigcup_{n=1}^{\infty} A_n) \le \sum_{n=1}^{\infty} \mu(A_n)$. If $A_1 \subseteq A_2$ then $\mu(A_1) \le \mu(A_2)$. (Exercise).
6. The term “complete” for measures is not related to completeness for integration structures.

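Now we will show how to use a complete integration structure $(\hat L, \hat J)$ on $X$ to introduce a measure theory on $X$.

2.18 Definition A set $A \subseteq X$ is measurable with respect to $(\hat L, \hat J)$ if $\chi_A \in \hat M^+$. The collection of these measurable sets is denoted by $\hat{\mathcal M}$. For each $A \in \hat{\mathcal M}$ define $\hat\mu(A) = \hat J\chi_A$.

Note that $\hat{\mathcal L} \subseteq \{A \in \hat{\mathcal M} : \hat\mu(A) < \infty\}$.

2.19 Theorem $\hat{\mathcal M}$ is a σ-algebra on $X$ and $\hat\mu$ is a measure on $\hat{\mathcal M}$.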

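Proof: (a) By Remark 2.11.2, $1 = \chi_X \in \hat M^+$.
(b) If $A \in \hat{\mathcal M}$ then $\chi_A \in \hat M^+$ and so $\chi_{A'} = 1 - \chi_A \in \hat M^+$.
(c) Suppose $A_i \in \hat{\mathcal M}$ $(i \in \mathbf N)$ and put $A = \bigcup_{i=1}^{\infty} A_i$ and $B_n = \bigcup_{i=1}^{n} A_i$. Then $(\chi_{B_n} : n \in \mathbf N)$ is an increasing sequence of functions in $\hat M^+$. Since $\chi_A = \lim \chi_{B_n}$, $\chi_A \in \hat M^+$ by the monotone convergence theorem, 2.6, and hence $A \in \hat{\mathcal M}$.
(d) In the notation of (c) we have

$$\begin{aligned}
\hat\mu(A) = \hat J\chi_A &= \lim_{n\to\infty} \hat J\chi_{B_n} && \text{by monotone convergence} \\
&= \lim_{n\to\infty} \sum_{i=1}^{n} \hat J\chi_{A_i} && \text{since the } \{A_i\} \text{ are disjoint} \\
&= \sum_{i=1}^{\infty} \hat\mu(A_i). && \square
\end{aligned}$$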



2.20 Examples

1. Let $(\hat L, \hat J)$ be the standardization of $(L, I) = ({}^*C_c(\mathbf R), {}^*\!\int)$ on $X = {}^*\mathbf R$ (see Example 1.18.1).

(a) $\hat{\mathcal L}$ contains all intervals of finite length, including intervals of infinitesimal length and (the degenerate case) single points [see Example 1.18.1(c)].
(b) $\hat{\mathcal M}$ contains each interval in ${}^*\mathbf R$ (exercise).
(c) The set $G$ of finite numbers in ${}^*\mathbf R$ is in $\hat{\mathcal M}$ (exercise).
(d) The set of numbers infinitesimally close to any $a \in \mathbf R$ is in $\hat{\mathcal L}$ (exercise).

2. In Example 1.18.3, $\hat{\mathcal L}$ consists of $\{x_0\}$, and $\hat{\mathcal M}$ consists of all sets.

In the standard developments of integration, one begins with a measure on a σ-algebra $\mathcal M$. Using $\mathcal M$, one then defines the notions of measurable function and associated integral. We now present this development. Our eventual aim is to show that if we begin with the $\hat{\mathcal M}$ and $\hat\mu$ obtained from $(\hat L, \hat J)$ then the measurable functions and integrals obtained from the standard development coincide with those obtained from $(\hat L, \hat J)$. In the next few results $\mu$ will be a measure on an arbitrary σ-algebra $\mathcal M$.

2.21 Definition An extended real-valued function $h$ on $X$ is measurable with respect to $\mathcal M$ if $A_a = \{x \in X : h(x) > a\} \in \mathcal M$ for each $a \in \mathbf R$. The set of functions $h$ which are measurable with respect to $\mathcal M$ is denoted by $M$.

We will see presently that $M = \hat M$ in our situation, but a few results must first be established. We want to show that each $h \in M$ is the limit of a sequence of functions in $M$, each of which takes only finitely many values.

2.22 Definition A function $v \in M$ is simple if it takes only finitely many distinct real values $a_1, \dots, a_n$, and the sets $A_i = \{x \in X : v(x) = a_i\} \in \mathcal M$ $(i = 1, \dots, n)$. The representation $v(x) = \sum_{i=1}^{n} a_i\chi_{A_i}(x)$ is called the reduced representation of $v$.

2.23 Proposition Each nonnegative function $h \in M$ is the limit of a monotonically increasing sequence $(v_n \in M : n \in \mathbf N)$ of nonnegative simple functions.

Proof: Define

$$v_n(x) = \begin{cases} (k-1)/2^n, & \text{if } (k-1)/2^n \le h(x) < k/2^n,\ 1 \le k \le n2^n, \\ n, & \text{if } h(x) \ge n \end{cases}$$

(drawing a picture helps here). Then $0 \le h(x) - v_n(x) \le 1/2^n$ if $h(x) \le n$, and $v_n = n$ if $h(x) > n$. Also $v_n$ increases monotonically to $h$. □

In the standard development of integration that we are following, the integral of a nonnegative function $h \in M$ is defined as follows.

2.24 Definition Let the measure $\mu$ on $\mathcal M$ be given. If $v = \sum_{i=1}^{n} a_i\chi_{A_i}$ is a simple function with each $a_i \ge 0$, we define the integral of $v$ by $\int v\,d\mu = \sum_{i=1}^{n} a_i\mu(A_i)$. One can show that the integral is well defined (Exercise 7). If $h \in M$ is nonnegative we define the integral of $h$ by

$$\int h\,d\mu = \sup\Bigl\{\int v\,d\mu : v \text{ simple},\ 0 \le v \le h\Bigr\}.$$

If $h \in M$ and $h = h^+ - h^-$ we define $\int h\,d\mu = \int h^+\,d\mu - \int h^-\,d\mu$ if one of the integrals is finite.
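The dyadic construction of Proposition 2.23 and the simple-function integral of Definition 2.24 can be sketched in Python with exact rational arithmetic. The three-point measure space, the masses in `mu`, and the sample function `h` below are hypothetical illustrations, and the helper names are not from the text.

```python
import math
from fractions import Fraction

def v(n, hx):
    """Value of the n-th dyadic simple approximation at a point where h equals hx >= 0."""
    if hx >= n:
        return Fraction(n)
    # largest multiple of 2**-n that does not exceed hx
    return Fraction(math.floor(hx * 2**n), 2**n)

def integral_simple(values, mu):
    """Integral of a simple function: sum of a_i * mu(A_i) over its distinct values a_i.

    values: dict point -> value of the simple function
    mu:     dict point -> mass of that point (a hypothetical finite measure space)
    """
    levels = {}
    for x, a in values.items():
        levels[a] = levels.get(a, 0) + mu[x]      # accumulates mu(A_i), A_i = {v = a_i}
    return sum(a * m for a, m in levels.items())

mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
h = {0: Fraction(1, 3), 1: Fraction(5, 2), 2: Fraction(7)}

# integrals of v_1, ..., v_8; they increase towards the integral of h
approx = [integral_simple({x: v(n, hx) for x, hx in h.items()}, mu) for n in range(1, 9)]
exact = sum(hx * mu[x] for x, hx in h.items())
print(approx[-1], exact)
```

Because each $v_n$ lies below $h$ and the sequence increases, the printed approximations never exceed the exact value, in line with the supremum in Definition 2.24.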

We now show that our development of integration coincides with this standard development.

2.25 Theorem Let $(\hat L, \hat J)$ be a complete integration structure with measurable functions $\hat M$, and let $M$ be the functions measurable with respect to the σ-algebra $\hat{\mathcal M}$ obtained from $(\hat L, \hat J)$. Then an $\overline{\mathbf R}$-valued function $h$ is in $\hat M^+$ iff it is in $M^+$, and $\hat J h = \int h\,d\hat\mu$, where $\hat\mu$ is the measure obtained from $(\hat L, \hat J)$.

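Proof: Assume that $h \in \hat M^+$. For $a > 0$ let $A = \{x \in X : h(x) > a\}$; fix $m > a$ in $\mathbf N$ and $C \in \hat{\mathcal L}$. For any $n \in \mathbf N$, $\chi_A \wedge n\chi_C = \chi_{A \cap C}$, and

$$A \cap C = \{x \in X : h \wedge m\chi_C > a\} \in \hat{\mathcal L}$$

by 2.14 and 2.15, so $A \in \hat{\mathcal M}$. Moreover,

$$\{x \in X : h(x) > 0\} = \bigcup_{n \in \mathbf N} \{x \in X : h(x) > 1/n\} \in \hat{\mathcal M},$$

and so $h \in M^+$. Now assume that $h \in M^+$, and fix $C \in \hat{\mathcal L}$ and $n \in \mathbf N$. Then $h \wedge n\chi_C$ is the limit of an increasing sequence of simple functions from $\hat L$ by 2.23. Thus $h \wedge n\chi_C \in \hat L$ by completeness, so $h \in \hat M^+$. To show that $\hat J h = \int h\,d\hat\mu$, note that $\int v\,d\hat\mu = \hat J v$ for nonnegative simple functions and that

$$\int h\,d\hat\mu = \sup\{\hat J v : v \text{ simple},\ 0 \le v \le h\} \le \sup\{\hat J f : f \in \hat M,\ 0 \le f \le h\} = \hat J h.$$

But if $f \in \hat L$ and $0 \le f \le h$, then there exists an increasing sequence $(v_n : n \in \mathbf N)$ of simple functions with $0 \le v_n \le f$ and $\lim_{n\to\infty} v_n = f$, so that

$$\hat J f = \lim_{n\to\infty} \hat J v_n \le \int h\,d\hat\mu.$$

Hence $\hat J h \le \int h\,d\hat\mu$. □

2.26 Corollary $M = \hat M$ and $\hat J h = \int h\,d\hat\mu$ for all $h \in \hat M$ for which $\hat J$ is defined.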

Proof: To show that $\hat M \subseteq M$ let $h = h^+ - h^- \in \hat M$. By Theorem 2.25, $h^+ \in \hat M^+ = M^+$ and $h^- \in \hat M^+ = M^+$, and so $h \in M$. To show that $M \subseteq \hat M$ we proceed in the same way, using the fact that if $f, g \in M^+$ and $fg = 0$, then $f - g \in M$. To prove this we have

$$\{x \in X : f(x) - g(x) > a\} = \begin{cases} \{x \in X : f(x) > a\} & \text{if } a \ge 0, \\ \{x \in X : g(x) < -a\} & \text{if } a < 0. \end{cases}$$

Now $\{x \in X : f(x) > a\} \in \hat{\mathcal M}$. Also

$$\{x \in X : g(x) < -a\} = \{x \in X : g(x) \ge -a\}' = \Bigl(\bigcap \{x \in X : g(x) > -a - 1/n\}\ (n \in \mathbf N)\Bigr)'$$

is in $\hat{\mathcal M}$ by Theorem 2.19. □

2.27 Notation Let $(\hat L, \hat J)$ be a complete integration structure with associated sets and measure $\hat M$, $\hat{\mathcal M}$, and $\hat\mu$. With Corollary 2.26 in mind we will denote the value of $\hat J$ at $h \in \hat M$ by the standard notation $\int h\,d\hat\mu$.

We can now show that the set of measurable functions is closed under many limiting and algebraic operations.

2.28 Proposition If $(h_n \in \hat M : n \in \mathbf N)$ is a sequence of functions in $\hat M$, then the functions $h$, $H$, $\tilde h$, $\tilde H$ defined by

$$h(x) = \inf\{h_n(x) : n \in \mathbf N\}, \qquad H(x) = \sup\{h_n(x) : n \in \mathbf N\},$$
$$\tilde h(x) = \liminf h_n(x), \qquad \tilde H(x) = \limsup h_n(x)$$

are in $\hat M$.

Proof: Since $\{x \in X : H(x) > a\} = \bigcup_{n=1}^{\infty} \{x \in X : h_n(x) > a\}$ we see that $H \in \hat M$ by 2.26. Then $h \in \hat M$ since $\inf\{h_n\} = -\sup\{-h_n\} \in \hat M$. Finally $\tilde h = \sup_n\{\inf\{h_m : m \ge n\}\} \in \hat M$, and similarly $\tilde H \in \hat M$. □



2.29 Proposition If $f, g \in \hat M$ and $H$ is a continuous function on the plane $\mathbf R^2$, then the function $h$ defined by $h(x) = H(f(x), g(x))$ is in $\hat M$. In particular, $f + g$ and $fg \in \hat M$.

Proof: Since $H$ is continuous, the sets $U_a = \{(u, v) : H(u, v) > a\}$ are open, and so each can be written as a union of open boxes:

… $\{x \in X : {}^\circ\phi > 1/n\} \subseteq \{x \in X : n(\phi \wedge 1/n) \ge 1\}$, and so $({}^\circ\phi \wedge n) - ({}^\circ\phi \wedge 1/n) \in \hat L$ and ${}^\circ I((\phi \wedge n) - (\phi \wedge 1/n)) = \hat J(({}^\circ\phi \wedge n) - ({}^\circ\phi \wedge 1/n))$ by Theorem 1.16. Now by our assumption and Theorem 2.6,

$${}^\circ I\phi = \lim_{n\to\infty} {}^\circ I((\phi \wedge n) - (\phi \wedge 1/n)) = \lim_{n\to\infty} \hat J(({}^\circ\phi \wedge n) - ({}^\circ\phi \wedge 1/n)) = \hat J({}^\circ\phi).$$

If, on the other hand, $I\phi$ is finite, then the second and third in this string of equalities hold as before. If we also have ${}^\circ I\phi = \hat J({}^\circ\phi)$, then $I\phi \simeq I((\phi \wedge \omega) - (\phi \wedge 1/\omega))$, and so $I(\phi - (\phi \wedge \omega)) \simeq 0$ and $I(\phi \wedge 1/\omega) \simeq 0$ for each $\omega \in {}^*\mathbf N_\infty$. □

Note that the condition $I(|\phi| \wedge 1/\omega) \simeq 0$ is automatically satisfied for any $\phi \in L$ and $\omega \in {}^*\mathbf N_\infty$ if $1 \in L$ and ${}^\circ I(1) < \infty$.

Exercises IV.2

1. (Standard) Finish the proof of Proposition 2.5.
2. (Standard) Prove Proposition 2.8.
3. (Standard) Show that if $h \in \hat M$ and $|h| \le g$ for some $g \in \hat L^+$ then $h \in \hat L_1$.
4. (Standard) Show that if $1 \in \hat L$, then every real-valued function in $\hat L_1$ is in $\hat L$.
5. (Standard) Show that if $(X, \mathcal M, \mu)$ is a measure space and $A_n \in \mathcal M$, $n \in \mathbf N$, then $\mu(\bigcup_{n=1}^{\infty} A_n) \le \sum_{n=1}^{\infty} \mu(A_n)$, and if $A_1 \subseteq A_2$ then $\mu(A_1) \le \mu(A_2)$.
6. Verify the statements in (b)-(d) of Example 2.20.1.
7. (Standard) Show that, for a simple function $v$, $\int v\,d\mu$ is well defined in Definition 2.24. That is, if $v = \sum a_i\chi_{A_i} = \sum b_j\chi_{B_j}$, $a_i \ge 0$, $b_j \ge 0$, show that $\sum a_i\mu(A_i) = \sum b_j\mu(B_j)$.
8. Show that the measure $\hat\mu$ obtained from the standardization $(\hat L, \hat J)$ of an internal integration structure $(L, I)$ is complete (Definition 2.16(e)). [Hint: Use Exercise IV.1.10.]

9. (Standard) Show that if $f, g \in \hat M$ and $E \in \hat{\mathcal M}$, then the function $h$ defined by … is in $\hat M$.
10. (Standard) Given a sequence $(f_n)$ of measurable functions, show that the set $E$ of points where $\lim_{n\to\infty} f_n(x)$ exists is measurable. [Hint: Consider $\limsup f_n$ and $\liminf f_n$.]
11. Prove that if $(L, I)$ is an internal integration structure with standardization $(\hat L, \hat J)$, then for each $\varepsilon > 0$ and $A \in \hat{\mathcal L}$ there is a $\phi \in L$ with $0 \le \phi \le \chi_A$ and $\hat\mu(A) - {}^\circ I(\phi) < \varepsilon$. In particular, if $\hat\mu(A) > 0$ there is a $\phi \in L$ with $0 \le \phi \le \chi_A$ and $I(\phi) > \hat\mu(A)/2$.
12. (Standard) Show that $\hat{\mathcal M}$ consists of those sets $C$ such that $C \cap A \in \hat{\mathcal L}$ for each $A \in \hat{\mathcal L}$.
13. Let $S$ be an internal hyperfinite subset of an internal set $X$. If $\mathcal A$ is the set of internal subsets of $X$, define the function $\nu : \mathcal A \to {}^*\mathbf R^+$ by $\nu(A) = |A \cap S|/|S|$, where $|\cdot|$ denotes internal cardinality.
(a) Show that $\nu$ is finitely additive, i.e., $\nu(A \cup B) = \nu(A) + \nu(B)$ for $A, B \in \mathcal A$ and $A \cap B = \emptyset$.
(b) Show how you may use the theory of §IV.1 to define a measure $\mu$ on a σ-algebra $\mathcal M$ of subsets of $X$ (see 1.6 in particular) so that $\mathcal M \supseteq \mathcal A$ and $\mu(A) = {}^\circ\nu(A)$ for $A \in \mathcal A$. Note that $0 \le \mu(A) \le 1$ for all $A \in \mathcal M$.
14. (Nonmeasurable sets) Consider Exercise 13 where $X = \{n \in {}^*\mathbf N : 0 \le n < \omega\}$, $\omega \in {}^*\mathbf N_\infty$. Define an operation $\oplus$ on $X$ by $n \oplus m = n + m$ if $n + m < \omega$, and $n \oplus m = n + m - \omega$ if $n + m \ge \omega$. Call $n$ and $m$ in $X$ equivalent if there is a standard $k \in \mathbf N$ with either $n \oplus k = m$ or $m \oplus k = n$ (this is an equivalence relation). Using the axiom of choice, choose one point from each equivalence class to form a set $B$. Show that $B \notin \mathcal M$. (Hint: Show that $X = \bigcup [(B \oplus n) \cup (B \oplus (\omega - n))]\ (n \in \mathbf N)$.)
15. Let $(L, I)$ be an internal lattice. Give an example of a function $\phi \in L$ for which $I\phi$ is finite but $\phi$ is not S-integrable.
16. Let $(L, I)$ be an internal lattice.
(a) Show that if $f, g \in L$, $g$ is S-integrable, and $|f| \le |g|$, then $f$ is S-integrable.
(b) Show that if $f \in L$ is S-integrable and $g \in L$ satisfies $|g| \le n$ for some $n \in \mathbf N$, then $fg$ is S-integrable.
(c) Show that if $f, g$ are S-integrable and $a, b \in {}^*\mathbf R$ are finite, then $af + bg$ is S-integrable.


17. Modify the proof of Proposition 2.33 to show that for $\phi \ge 0$ in $L$, $\hat J({}^\circ\phi) \le {}^\circ I\phi$. (Hint: we may assume ${}^\circ I\phi < \infty$.)
18. Use Theorems 1.14 and 1.17 to show that if $(L, I)$ is an internal Stonian integration structure, then the function 1 is in $\hat L$ iff $1 \in L$ and ${}^\circ I(1) < \infty$.
19. State and prove Proposition 2.35 with the additional simplifying assumption that the function $1 \in L$ and ${}^\circ I(1) < \infty$.
20. Let $(L, I)$ be the hyperfinite integration structure of Example 1.6, and let $(\hat L, \hat J)$ be the standardization of $(L, I)$, with associated $\hat{\mathcal L}$, $\hat{\mathcal M}$, $\hat\mu$, etc. Assume that $\sum a_i\ (i \in I)$ is finite.
(a) Show that $A \in \hat{\mathcal L}$ iff for every $\varepsilon > 0$ in $\mathbf R$ there exist internal subsets $B$ and $C$ of $X$ such that $B \subseteq A \subseteq C$ and $\sum a_i\ (i \in C - B) < \varepsilon$.
(b) Show that $A \in \hat{\mathcal M}$ iff there is an internal set $B$ such that $\hat\mu((A - B) \cup (B - A)) = 0$. (Hint: use $\aleph_1$-saturation and the permanence principle.)

IV.3 Integration on $\mathbf R^n$; the Riesz Representation Theorem

Let $X$ be any open or closed subset of $\mathbf R^n$ and suppose that $I_0$ is a positive linear functional (p.l.f.) on the lattice $C_c(X)$ of continuous functions with compact support on $X$ (of course $C_c(X) = C(X)$ if $X$ is compact). For example, $I_0(f)$ could denote the Riemann integral of $f \in C_c(X)$ or, more generally, the Riemann-Stieltjes integral of $f$ with respect to an increasing integrator. In particular, $I_0(f)$ could be evaluation of $f$ at some point $x_0 \in X$. We want to use the theory developed in the previous sections to define a measure space $(X, \mathcal M_X, \mu_X)$ and a corresponding complete integration structure $(L_X, J_X)$ on $X$ which is an extension of the structure $(C_c(X), I_0)$. Most of these results are easy to prove and are left as exercises. The measure $\mu_X$ will be shown to satisfy an additional condition known as regularity. This and other associated results are more technical, and can be skipped if desired. All of the above results taken together yield the Riesz representation theorem.

With minor modifications except in one place, the results and proofs of this section carry over to the case that $X$ is any locally compact Hausdorff space. One essential difficulty arises in the proof of Lemma 3.8, which, for the general case, requires Urysohn's lemma [20]. Also, if $X$ is not compact a “countability” condition is needed for the general case to show “outer regularity.” Without further explicit comment, the nonstandard analysis in this section will be carried out in a $\kappa$-saturated enlargement $V({}^*\mathbf R)$ of $V(\mathbf R)$. We assume that $\kappa \ge \aleph_1$. For a general space $X$ we would need $\kappa > \operatorname{card} \mathcal T$, where $\mathcal T$ is the collection of open sets in $X$.



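Let $(L, I)$ be the internal integration structure $({}^*C_c(X), {}^*I_0)$ on ${}^*X$, with $(\hat L, \hat J)$, $(\hat L_1, \hat J)$, $\hat M$, $\hat{\mathcal M}$, $\hat\mu$ denoting the objects constructed from $(L, I)$ by the procedures of §§IV.1 and IV.2. Recall that if $G$ denotes the near-standard elements in ${}^*X$ then the standard part map $\mathrm{st} : G \to X$ maps $G$ onto $X$. The basic idea of this section is to use the standard part map to lift functions from $X$ to ${}^*X$ as follows.

3.1 Definition For each $\overline{\mathbf R}$-valued function $f$ on $X$ we define the function $\tilde f$ on ${}^*X$ by

$$\tilde f(x) = \begin{cases} f(\mathrm{st}(x)), & x \in G, \\ 0, & x \in {}^*X - G, \end{cases}$$

and for each $A \subseteq X$ we define

$$\tilde A = \mathrm{st}^{-1}(A) \cap {}^*X.$$

3.2 Remarks
1. $\tilde f$ is constant on the monads of standard points in ${}^*X$, and zero at all points which are remote (i.e., not near-standard). In particular, $\tilde f(x) = 0$ if $x \in {}^*X$ and the norm of $x$ is infinite.
2. $\widetilde{af} = a\tilde f$, $\widetilde{f + g} = \tilde f + \tilde g$, $\widetilde{f \vee g} = \tilde f \vee \tilde g$, $\widetilde{f \wedge g} = \tilde f \wedge \tilde g$ (exercise).
3. $\widetilde{\chi_A} = \chi_{\tilde A}$ (exercise).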

We now obtain measure-theoretic structures on $X$ with the following definition.

3.3 Definition We let $M_X = \{f : \tilde f \in \hat M\}$ and define $J_X$ by putting $J_X(f) = \hat J(\tilde f)$ when $\hat J(\tilde f)$ is defined. For each set $A \subseteq X$ with $\tilde A \in \hat{\mathcal M}$, i.e., $\chi_A \in M_X$, we set $\mu_X(A) = \hat\mu(\tilde A)$; the set $\mathcal M_X = \{A \subseteq X : \tilde A \in \hat{\mathcal M}\}$. We let $L_X$ denote the real-valued functions $f$ in $M_X$ for which $J_X f$ is defined and finite.

3.4 Proposition $(L_X, J_X)$ is a complete integration structure which extends $(C_c(X), I_0)$. Moreover, $(X, \mathcal M_X, \mu_X)$ is a measure space such that $f \in M_X$ iff $f$ is $\mathcal M_X$-measurable, and $\int f\,d\mu_X = J_X f$ when $J_X f$ is defined.

Proof: That $(L_X, J_X)$ is an integration structure is left as an exercise. To show that $(L_X, J_X)$ extends $(C_c(X), I_0)$, let $f \in C_c(X)$. By the uniform continuity of $f$, ${}^*f(y) \simeq {}^*f(x)$ if $y \simeq x$, and ${}^*f$ is zero at any remote point since $f$ has compact support. Thus $\tilde f = {}^\circ({}^*f)$. By the obvious extension of Example 1.18.1(b), $\tilde f \in \hat L$ and $\hat J\tilde f = \hat J({}^\circ({}^*f)) = {}^\circ I({}^*f) = I_0 f$.


To show that $(L_X, J_X)$ is complete, let $(f_n)$ be a monotone increasing sequence of functions in $L_X$ for which $\lim_{n\to\infty} f_n(x) = f(x)$ exists for all $x \in X$ and $\sup\{J_X f_n : n \in \mathbf N\} < \infty$. Then $(\tilde f_n)$ is a monotone increasing sequence of functions in $\hat L_1$ and $\sup\{\hat J\tilde f_n : n \in \mathbf N\} < \infty$. Also $\lim_{n\to\infty} \tilde f_n(z) = \tilde f(z)$ for all $z \in {}^*X$ (check), so $\tilde f \in \hat L_1$ and $\hat J\tilde f = \lim \hat J\tilde f_n$ by the monotone convergence theorem for $(\hat L_1, \hat J)$. Therefore $f \in L_X$ and $J_X f = \lim_{n\to\infty} J_X f_n$. The rest is left to the reader (Exercise 2); the equality $\int f\,d\mu_X = J_X f$ follows from the corresponding fact for simple functions. □

When we start with $I_0$ being the p.l.f. given by ordinary Riemann integration, then $\mathcal M_X$ is called the class of Lebesgue-measurable sets and $\mu_X$ is called Lebesgue measure. In that case we write $\int f\,d\mu_X$ as $\int f\,dx$.

3.5 Examples In the following examples we consider the case in which $X = \mathbf R$ and $I_0$ is given by Riemann integration.

1. The characteristic function of any bounded interval in $X$ is in $L_X$ (i.e., these intervals are in $\mathcal M_X$). This follows from Example 1.18.1(c). The corresponding result for bounded rectangles holds if $X = \mathbf R^n$.
2. Next we show that $L_X$ contains the function

$$f(x) = \begin{cases} 1/\sqrt{x}, & 0 < x \le 1, \\ 0, & \text{otherwise}, \end{cases}$$

and hence contains unbounded functions. If $A = (0, 1]$ then $\chi_A$ and hence $n\chi_A$ are in $L_X$ by Example 1. Thus $f_n = n\chi_A \wedge 1/\sqrt{x} \in L_X$ by the lattice property. Now the sequence $(f_n)$ is monotone increasing and converges to $f$. An easy calculation shows that $J_X f_n \le 2$, so the result follows from completeness.
3. If $E \in \mathcal M_X$ is bounded then $\mu_X(E) < \infty$ (Exercise 3). This again generalizes to $X = \mathbf R^n$.

The following results give more detailed information about $\mathcal M_X$ and $\mu_X$ and center about the notion of regularity, which is defined as follows.

*3.6 Notation Let $\mathcal K$ and $\mathcal T$ be the collections of subsets of $X$ that are compact and open in $X$, respectively. Recall that, for $X \subseteq \mathbf R^n$, $V \subseteq X$ is open in $X$ if $V = X \cap W$ for some open $W \subseteq \mathbf R^n$. A set $K$ is compact in $X$ iff it is compact in $\mathbf R^n$. We write $K \prec f$ if $K \in \mathcal K$, $f \in C_c(X)$, $0 \le f \le 1$, and $f(x) = 1$ for all $x \in K$. We write $f \prec V$ if $V \in \mathcal T$, $f \in C_c(X)$, $0 \le f \le 1$, and $\operatorname{supp} f \subseteq V$. The notation $K \prec f \prec V$ means that $K \prec f$ and $f \prec V$.



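*3.7 Definition A measure $\mu$ on a σ-algebra $\mathcal M \supseteq \mathcal K \cup \mathcal T$ of subsets of a metric space $X$ is inner regular if

(a) $\mu(A) = \sup\{\mu(K) : K \subseteq A,\ K \in \mathcal K\}$, $A \in \mathcal M$,

outer regular if

(b) $\mu(A) = \inf\{\mu(V) : A \subseteq V,\ V \in \mathcal T\}$, $A \in \mathcal M$,

and regular if it is both inner and outer regular.

We first show that $\mathcal M_X \supseteq \mathcal K \cup \mathcal T$. To do so we need the following fact about continuous functions.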

*3.8 Lemma Suppose $K \in \mathcal K$, $V \in \mathcal T$, and $K \subset V$. Then there exists a function $f \in C_c(X)$ so that $K \prec f \prec V$.

… $\varepsilon > 0$ in $\mathbf R$ be given. We may choose $\psi_1, \psi_2 \in L$ with $0 \le \psi_1 \le \chi_{\tilde V} \le \psi_2 \le 1$ and $I(\psi_2 - \psi_1) < \varepsilon/3$ by Theorem 1.14 (the inequality $\psi_2 \le 1$ uses the fact that $L$ is Stonian). Let $\mathcal K_0 = \{K \in \mathcal K : K \subset V\}$. For each $K \in \mathcal K_0$ let $a_K = \inf\{{}^\circ I(\psi_2 \wedge {}^*f) : K \prec$ …

… $\varepsilon > 0$ in $\mathbf R$ be fixed. By regularity there is a $V \supseteq K$ with $\mu(V) < \mu(K) + \varepsilon$. Let $f$ satisfy $K \prec f \prec V$. Then … This is true for any $\varepsilon > 0$, so that $\mu_X(K) \le \mu(K)$. Similarly $\mu(K) \le \mu_X(K)$, and the uniqueness follows. The completeness of $\mu_X$ follows easily from the completeness of $\hat\mu$ (see Exercise IV.2.8) and is left as an exercise. □

Exercises IV.3

1. Prove the validity of Remarks 3.2.2 and 3.2.3.
2. Show that $(L_X, J_X)$ as defined in Definition 3.3 is an integration structure, and finish the proof of Proposition 3.4.
3. Show that if $E \in \mathcal M_X$ is bounded, then $\mu_X(E) < \infty$.
4. Show that if $X$ is an open or closed subset of $\mathbf R^n$ and $K \subset X$ is compact, then there is an open set $V$ in $X$ (i.e., $V = X \cap W$ for some open $W \subseteq \mathbf R^n$) such that $K \subset V$ and the closure of $V$ is both compact and contained in $X$.
5. Finish the proof of Proposition 3.9 by showing that if $\mu_X(V) = \infty$, then $\mu_X(V) = \sup\{I_0 f : f \prec V\}$.
6. Assume that $X$ is compact in $\mathbf R^n$, and deduce Proposition 3.9 from Proposition 3.10.
7. Prove Corollary 3.11.
8. Show that if $X$ is open or closed in $\mathbf R^n$, then there is an increasing sequence $\langle W_n \rangle$ of sets open in $X$ with $X = \bigcup W_n\ (n \in \mathbf N)$ and each $\overline{W}_n$ compact and contained in $X$.
9. Prove that $\mu_X$ is a complete measure in Theorem 3.13.

IV.4 Basic Convergence Theorems

… $B > 0$ so that $\mu(\{x : |f(x)| > B\}) = 0$. Similarly, we say that $f = g$ a.e. if there is a set $A \in \mathcal M$ with $\mu(A) = 0$ and $\{x : f(x) \ne g(x)\} \subseteq A$. If $\mu$ is a complete measure or $f$ and $g$ are measurable, we need only specify that $\mu(\{x : f(x) \ne g(x)\}) = 0$. The relation of equality a.e. is easily seen to be an equivalence relation (Exercise 1). The basic fact is that sets of measure zero can be ignored as far as integration is concerned, as indicated by the following results.

4.2 Theorem
(a) If $f \in M$ is zero a.e. then $\int f\,d\mu = 0$.
(b) If $f \in M^+$ and $\int f\,d\mu = 0$ then $f = 0$ a.e.

Proof: Let $E = \{x : f(x) \ne 0\}$; then $E \in \mathcal M$.

(a) Suppose first that $f \in M^+$ and $\mu(E) = 0$. Letting $v_n = n\chi_E$, we have $v_n \in M^+$ and $\int v_n\,d\mu = n\mu(E) = 0$. With $h = \lim v_n$ it follows from Theorem 2.6 that $h \in M^+$ and $\int h\,d\mu = \sup\{\int v_n\,d\mu : n \in \mathbf N\} = 0$. Finally $f \le h$, and hence $0 \le \int f\,d\mu \le \int h\,d\mu = 0$, so that $\int f\,d\mu = 0$. For general $f$ we write $f = f^+ - f^-$. If $f = 0$ a.e. then $f^+$ and $f^-$ are both zero a.e., and the result follows by linearity of the integral.
(b) The sets $E_n = \{x : f(x) \ge 1/n\}$ are in $\mathcal M$ and $E = \bigcup E_n\ (n \in \mathbf N)$. Since $f \ge (1/n)\chi_{E_n}$, we have $0 = \int f\,d\mu \ge (1/n)\mu(E_n) \ge 0$, so $\mu(E_n) = 0$. Hence $\mu(E) = 0$ by countable subadditivity (Remark 2.17.5). □

4.3 Corollary If $f, g \in M$ and $f = g$ a.e. then $\int f\,d\mu = \int g\,d\mu$.

Proof: If $E = \{x : f(x) = g(x)\}$, then $\int f\chi_{X-E}\,d\mu = \int g\chi_{X-E}\,d\mu = 0$ by 4.2(a), and so $\int f\,d\mu = \int f\chi_E\,d\mu = \int g\chi_E\,d\mu = \int g\,d\mu$. □

4.4 Theorem If $f \in M$ and $\int |f|\,d\mu < \infty$, then $f$ is finite a.e.

Proof: Let $E = \{x : |f(x)| = \infty\}$. Then $E \in \mathcal M$ (check) and $n\chi_E \le |f|$, and so $n\mu(E) \le \int |f|\,d\mu < \infty$ for any $n \in \mathbf N$. We conclude that $\mu(E) = 0$. □

Most of the results in §IV.2 can be improved by replacing assumptions which hold everywhere by corresponding assumptions holding almost everywhere. We illustrate this by proving a final version of the monotone convergence theorem.

4.5 Lebesgue's Monotone Convergence Theorem Let $f_n$ $(n \in \mathbf N)$ and $g$ belong to $M$. If $f_n \ge g$ a.e. where $\int g\,d\mu > -\infty$, and $f_n \le f_{n+1}$ a.e. for all $n \in \mathbf N$, then $f_n$ converges a.e. to a function $f \in M$ and $\lim \int f_n\,d\mu = \int f\,d\mu$.

Proof: By combining the countably many sets (where $f_n < g$, $f_n > f_{n+1}$) into one set $E$ of measure zero, we may set each $f_n$ and $g$ equal to 0 on $E$ without changing the integrals. We may also assume that $0 \ge g(x) > -\infty$ for all $x$ (check), so $-\infty < \int g\,d\mu \le 0$. The result now follows from the monotone convergence theorem applied to $f_n - g$. □

4.6 Fatou's Lemma If $(f_n)$ is a sequence of nonnegative measurable functions, then $\int (\liminf f_n)\,d\mu \le \liminf \int f_n\,d\mu$.

Proof: If $g_n = \inf f_i\ (i \ge n)$, then $g_n \in M^+$ and $(g_n : n \in \mathbf N)$ is an increasing sequence which converges to $\liminf f_n$. Also, if $n \le m$, then $g_n \le f_m$, so $\int g_n\,d\mu \le \int f_m\,d\mu$; hence $\int g_n\,d\mu \le \liminf \int f_m\,d\mu$. Therefore $\int (\liminf f_n)\,d\mu = \lim_{n\to\infty} \int g_n\,d\mu \le \liminf \int f_n\,d\mu$ by the monotone convergence theorem. □
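The inequality in Fatou's Lemma can be strict. A standard illustration, which is not taken from the text, is $f_n = n\chi_{(0,1/n)}$ with Lebesgue measure on $[0,1]$: every integral equals 1 while $\liminf f_n = 0$ pointwise. The Python sketch below checks both facts with exact rationals; the function names and the finite window of indices are assumptions made for the illustration.

```python
import math
from fractions import Fraction

def f(n, x):
    """f_n = n * (indicator of the open interval (0, 1/n))."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Lebesgue integral of f_n over [0, 1]: height n times length 1/n."""
    return n * Fraction(1, n)

def tail_is_zero(x, extra=50):
    """Check f_n(x) == 0 for the `extra` indices starting at n = ceil(1/x), for x > 0."""
    start = math.ceil(1 / x)
    return all(f(n, x) == 0 for n in range(start, start + extra))

print([integral_f(n) for n in (1, 10, 100)])   # each integral equals 1
print(tail_is_zero(Fraction(1, 7)))            # f_n(1/7) vanishes from n = 7 on
```

Here the left side of Fatou's inequality is 0 and the right side is 1, so no hypothesis short of domination or monotonicity turns the inequality into an equality.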

4.7 Lebesgue's Dominated Convergence Theorem Suppose that $(f_n)$ is a sequence of measurable functions which converges a.e. to a measurable function $f$. If there is a nonnegative function $g \in L_1$ so that $|f_n| \le g$ a.e. for each $n \in \mathbf N$, then $f \in L_1$ and $\int f\,d\mu = \lim \int f_n\,d\mu$.

Proof: Fix a set $E \in \mathcal M$ with $\mu(E) = 0$ so that $(f_n)$ converges to $f$ except possibly on the set $E$, and $|f_n| \le g$ except possibly on the set $E$. If $f_n' = f_n\chi_{X-E}$, $f' = f\chi_{X-E}$, and $g' = g\chi_{X-E}$, then the sequence $(f_n')$ of measurable functions converges everywhere to $f'$, $|f_n'| \le g'$ on $X$, and finally $\int f\,d\mu = \int f'\,d\mu$ and $\int f_n'\,d\mu = \int f_n\,d\mu$ by Corollary 4.3. Since $|f'| \le g'$ and $f' \in M$, $f' \in L_1$, as is each of the functions $f_n'$. Now $g' + f_n' \ge 0$, and so by Fatou's Lemma

$$\int g'\,d\mu + \int f'\,d\mu = \int (g' + f')\,d\mu \le \liminf \int (g' + f_n')\,d\mu = \int g'\,d\mu + \liminf \int f_n'\,d\mu.$$

Hence $\int f'\,d\mu \le \liminf \int f_n'\,d\mu$. Similarly, applying Fatou's lemma to $g' - f_n' \ge 0$, we obtain

$$\int g'\,d\mu - \int f'\,d\mu = \int (g' - f')\,d\mu \le \liminf \int (g' - f_n')\,d\mu = \int g'\,d\mu - \limsup \int f_n'\,d\mu.$$

Thus $\limsup \int f_n'\,d\mu \le \int f'\,d\mu$, and the result follows. □

The rest of this section will center on various convergence properties of sequences of measurable functions without special concern for the convergence of their integrals. The first of these is the famous result of Egoroff which states that a.e. convergence “almost” implies uniform convergence. To be specific we introduce the following definition.

4.8 Definition A sequence $(f_n)$ converges almost uniformly if for each $\varepsilon > 0$ there exists a set $E \in \mathcal M$ with $\mu(E) < \varepsilon$ so that $(f_n)$ converges uniformly on $E'$.

4.9 Egoroff's Theorem If $\mu(X)$ is finite and $(f_n)$ converges a.e. to $f$ on $X$ then $(f_n)$ converges almost uniformly to $f$.

Proof: For each $k$ and $n$ define the set $E_{kn} \in \mathcal M$ by $E_{kn} = \bigcap_{m=n}^{\infty} \{x : |f_m(x) - f(x)| < 1/k\}$. Notice that if $E$ is the set on which $(f_n)$ converges to $f$ then for each $k$ we have $\bigcup E_{kn}\ (n \in \mathbf N) \supseteq E$. For fixed $k$ we have $E_{kn} \subseteq E_{km}$ if $n \le m$, and so $\lim_{n\to\infty} \mu(E_{kn}) = \mu(\bigcup E_{kn}\ (n \in \mathbf N)) \ge \mu(E) = \mu(X)$. Since $\mu(X)$ is finite, for a given $\varepsilon > 0$ we see that with each $k \in \mathbf N$ is associated an $n_k \in \mathbf N$ so that $\mu(E_{kn_k}') < \varepsilon/2^k$. If $F = \bigcap E_{kn_k}\ (k \in \mathbf N)$ then $\mu(F') \le \sum_{k=1}^{\infty} \mu(E_{kn_k}') < \sum_{k=1}^{\infty} \varepsilon/2^k = \varepsilon$. Finally we show that $(f_n)$ converges uniformly on $F$. Let $\varepsilon > 0$ be given and find a $k$ so that $1/k < \varepsilon$. Then $|f_m(x) - f(x)| < \varepsilon$ for all $m \ge n_k$ if $x \in E_{kn_k}$. Since $F \subseteq E_{kn_k}$ we have uniform convergence on $F$. □


Another type of convergence which is important in probability theory is that of convergence in measure.

4.10 Definition A sequence $(f_n)$ of measurable real-valued functions on $X$ converges in measure to a real-valued function $f$ if for every real $\varepsilon > 0$ we have $\lim_{n\to\infty} \mu(\{x : |f_n(x) - f(x)| \ge \varepsilon\}) = 0$. Similarly $(f_n)$ is Cauchy in measure if for each $\varepsilon > 0$ we have $\lim_{n,m\to\infty} \mu(\{x : |f_n(x) - f_m(x)| \ge \varepsilon\}) = 0$.

It is easy to see that if $(f_n)$ is convergent in measure to $f$ then it is Cauchy in measure. Recall that Egoroff's theorem has been established only for sets of finite measure (see Exercise 2). The following result shows that, in general, almost uniform convergence is stronger than both convergence a.e. and convergence in measure.

4.11 Theorem If a sequence $(f_n)$ converges to $f$ almost uniformly then it converges a.e. and in measure.

Proof: For each $k \in \mathbf N$ let $(f_n)$ converge uniformly to $f$ on $F_k$ where $\mu(F_k') < 1/k$. Then $(f_n)$ converges on $F$ where $F = \bigcup F_k\ (1 \le k < \infty)$ and $\mu(F') \le \mu(F_k') < 1/k$ for each $k \in \mathbf N$, so that $\mu(F') = 0$. Thus $(f_n)$ converges a.e. To prove convergence in measure let $\varepsilon > 0$ be given and choose $k$ with $1/k < \varepsilon$. Since $f_n$ converges uniformly on $F_k$, there is an $m$ (depending on $k$) such that $\{x : |f_n(x) - f(x)| \ge \varepsilon\} \subseteq F_k'$ for all $n \ge m$. Thus $\mu(\{x : |f_n(x) - f(x)| \ge \varepsilon\}) < 1/k < \varepsilon$ for all $n \ge m$, and the result follows. □

The following example shows that a sequence can converge in measure but fail to converge at any point.

4.12 Example Represent each $n \in \mathbf N$ as $n = k + 2^m$, $m \ge 0$, $0 \le k < 2^m$, and define $f_n(x)$ on $[0, 1]$ to be $\chi_{[k2^{-m},\,(k+1)2^{-m}]}$ (the reader should draw some pictures). Then for any $x \in [0, 1]$ and any $n_0$ there is an $m_1 \ge n_0$ and an $m_2 \ge n_0$ so that $f_{m_1}(x) = 0$ and $f_{m_2}(x) = 1$. Thus $f_n$ does not converge at any point. On the other hand, given $\varepsilon > 0$, the Lebesgue measure of $\{x : |f_n(x)| > \varepsilon\}$ is at most $2/n$, so that $f_n \to 0$ in measure.

In this example it is possible to select a subsequence of $(f_n)$ which converges a.e. This is true in general, as we now show.

4.13 Theorem If $(f_n)$ converges in measure to $f$, then there is a subsequence $(f_{n_k})$ which converges almost uniformly and hence a.e. to $f$.



Proof: Given $k$ we can find an $n_k$ so that $\mu(\{x : |f_n(x) - f(x)| \ge 2^{-k}\}) < 2^{-k}$ for $n \ge n_k$. We may assume that $n_{k+1} > n_k$. Now let $E_k = \{x : |f_{n_k}(x) - f(x)| \ge 2^{-k}\}$. Given $\varepsilon$, let $m$ be chosen so that $2^{-m+1} < \varepsilon$. If $x \notin \bigcup_{k=m}^{\infty} E_k = A$ then $|f_{n_k}(x) - f(x)| < 2^{-k}$ for $k \ge m$, so $f_{n_k}(x)$ converges uniformly to $f(x)$ on $A'$. But $\mu(A) \le \sum_{k=m}^{\infty} \mu(E_k) \le \sum_{k=m}^{\infty} 2^{-k} = 2^{-m+1} < \varepsilon$, and the result follows. □
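The sequence of Example 4.12 and the subsequence promised by Theorem 4.13 can be explored with a short Python sketch. It assumes the indexing $n = k + 2^m$ with $m \ge 0$ and $0 \le k < 2^m$; the helper names and the choice $n_j = 2^j$ for the subsequence are illustrative assumptions, not the construction used in the proof.

```python
from fractions import Fraction

def interval(n):
    """Endpoints of [k/2**m, (k+1)/2**m], where n = k + 2**m with 0 <= k < 2**m."""
    m = n.bit_length() - 1          # largest m with 2**m <= n
    k = n - 2**m
    return Fraction(k, 2**m), Fraction(k + 1, 2**m)

def f(n, x):
    """f_n: the characteristic function of the n-th dyadic interval."""
    left, right = interval(n)
    return 1 if left <= x <= right else 0

def measure_support(n):
    """Lebesgue measure of {x : f_n(x) != 0}, i.e. 2**-m."""
    left, right = interval(n)
    return right - left

x = Fraction(1, 3)
# Within each dyadic level m >= 1 the sequence takes both values 0 and 1 at x.
levels = [{f(n, x) for n in range(2**m, 2**(m + 1))} for m in range(1, 8)]
# Along the subsequence n_j = 2**j the values at x are eventually 0.
sub = [f(2**j, x) for j in range(2, 12)]
print(levels[0], sub)
```

The subsequence $f_{2^j} = \chi_{[0,\,2^{-j}]}$ converges to 0 at every $x > 0$ but not at $x = 0$, which is consistent with convergence almost everywhere rather than everywhere.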

Exercises IV.4

1. (Standard) Show that the relation $\equiv$ on the set of functions on a measure space $(X, \mathcal M, \mu)$ defined by $f \equiv g$ iff $f = g$ a.e. is an equivalence relation.
2. (Standard) Show that Egoroff's theorem does not hold for Lebesgue measure on all of $\mathbf R$.
3. (Standard) Show that if for each $n \in \mathbf N$, $f_n \in L_1$ and $\sum_{n=1}^{\infty} \int |f_n|\,d\mu < \infty$, then the series $\sum f_n$ converges absolutely and almost everywhere to an integrable function $f$ and $\int f\,d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu$.
4. (Standard) Show that if $\lim_{n\to\infty} \int |f_n - f|\,d\mu = 0$ then $f_n$ converges to $f$ in measure.

In the following problems, $(L, I)$ will be an internal integration structure and $(\hat L, \hat J)$ the complete integration structure of §IV.1 with associated measurable structure of §IV.2.

5. Show that if $g \in L_0$ then $g \simeq 0$ $\hat\mu$-a.e. (Hint: Assuming $g \ge 0$, for any $\varepsilon > 0$, there is a $\psi \in L$ with $0 \le g \le \psi$ and $I\psi < \varepsilon$. Use Proposition 2.33, Exercise 2.17, and the fact that $\{x : g \ge 1/n\} \subseteq \{x : \psi \ge 1/n\} \subseteq \{x : {}^\circ\psi \ge 1/2n\}$.)
6. (Lifting of Measurable Functions) Assume that $1 \in L$. A function $f$ is in $\hat M$ iff there exists a $\psi \in L$ such that ${}^\circ\psi = f$ $\hat\mu$-a.e. If $f$ is bounded then $\psi$ can be obtained with the same bound and $\int f\,d\hat\mu = {}^\circ I\psi$. (Hint: Use Proposition 2.32 and Exercise 5.) Any function $\psi \in L$ satisfying these conditions is called a lifting of $f$.
7. (Lifting of Integrable Functions) Assume that $1 \in \hat L$. Show that $f \in \hat L_1$ iff $f$ has an S-integrable lifting $\psi$, in which case $\int f\,d\hat\mu = {}^\circ I\psi$.

IV.5 The Fubini Theorem

A familiar process in the theory of Riemann integration for functions of several variables is that of iterated integration. If, for example, $f(x, y)$ is a continuous function on the set $[a, b] \times [c, d]$ in $\mathbf R \times \mathbf R$ then we have the equality

$$\int_a^b \Bigl(\int_c^d f(x, y)\,dy\Bigr)\,dx = \int_c^d \Bigl(\int_a^b f(x, y)\,dx\Bigr)\,dy.$$

The purpose of this section is to establish a nonstandard version of this equality in the contexts of the earlier sections of this chapter. The general result is known as the Fubini theorem, after its originator, G. Fubini. The nonstandard version is then applied to establish a Fubini theorem for integration structures on Euclidean spaces.

First some notation. We will be dealing with integration structures (internal or standard) on product spaces $U \times V$ (internal or standard). These structures will typically be denoted by $(L_W, I_W)$. We will also be given integration structures $(L_U, I_U)$ and $(L_V, I_V)$ on $U$ and $V$, respectively. Given a function $f \in L_W$ we may find that $f(u, \cdot) \in L_V$ for $u \in U$, in which case $I_V f$ is a function of $u$. If $g = I_V f$ is also in $L_U$ then we denote its integral $I_U g$ by $I_U I_V f$ (a slight abuse of notation since we are suppressing variables).

5.1 Definition Let $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ be integration structures on $U$, $V$, and $W = U \times V$, respectively. If the integration structures are standard, we say that a function $f \in L_W$ has the strong Fubini property with respect to $I_U$, $I_V$, and $I_W$ if

(i) $f(u, \cdot) \in L_V$ for all $u \in U$ and $f(\cdot, v) \in L_U$ for all $v \in V$,
(ii) $I_V f$ is in $L_U$ and $I_U f$ is in $L_V$,
(iii) $I_W f = I_U I_V f = I_V I_U f$.

If “all” in (i) is replaced by “almost all” (i.e., the conditions hold a.e.), and (ii) and (iii) hold if $I_V f$ and $I_U f$ are set equal to zero when not otherwise defined, then we say that $f$ has the Fubini property. If the integration structures are internal and (i), (ii) and (iii) hold without exception, we say that $f$ has the internal strong Fubini property.

To begin we need the following basic result.

5.2 Lemma Let $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ with $W = U \times V$ be real complete integration structures on $U$, $V$, and $W$, respectively. Suppose that each function $f_n \in L_W$ in the sequence $\{f_n : n \in \mathbf N\}$ has the Fubini property with respect to $I_U$, $I_V$, and $I_W$, and $\{f_n\}$ is a monotone increasing sequence converging to a real-valued $f$. Also suppose that $\sup\{I_W f_n : n \in \mathbf N\} < \infty$. Then $f$ has the Fubini property with respect to $I_U$, $I_V$, and $I_W$.

Proof: Exercise. □

We next establish results concerning the standardizations $(\hat L_U, \hat J_U)$, $(\hat L_V, \hat J_V)$, and $(\hat L_W, \hat J_W)$ of internal integration structures $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ on the internal sets $U$, $V$, and $W = U \times V$, respectively, in an $\aleph_1$-saturated enlargement. These will be used to establish results on Euclidean spaces via the results of §IV.3. We assume that the function 1 (i.e., the function which is identically 1) is in $L_W$ and that ${}^\circ I_W 1 < \infty$. This will allow us to apply Theorem 1.16 when $\phi \in L_W$ by taking $\psi = 1$. We also assume that each function in $L_W$ has the internal strong Fubini property (as is the case, for example, with Riemann integration of continuous functions). In particular, 1 is in $L_U$ and $L_V$, and ${}^\circ I_U 1 < \infty$ and ${}^\circ I_V 1 < \infty$.

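5.3 Lemma Suppose that $\phi$ is a finite-valued function in $L_W$. Then ${}^{\circ}\phi$ has the strong Fubini property with respect to $\hat I_U$, $\hat I_V$, and $\hat I_W$.

Proof: Since, by assumption, $\phi(u, \cdot) \in L_V$ for each $u \in U$, we see that ${}^{\circ}\phi(u, \cdot) \in \hat L_V$ by Theorem 1.16. Similarly, using Theorem 1.16 where necessary, we have $\hat I_V({}^{\circ}\phi) = {}^{\circ}I_V \phi$ in $\hat L_U$, $\hat I_U({}^{\circ}\phi) = {}^{\circ}I_U \phi$ in $\hat L_V$, and $\hat I_W({}^{\circ}\phi) = {}^{\circ}I_W(\phi) = {}^{\circ}I_U I_V(\phi) = \hat I_U\, {}^{\circ}I_V(\phi) = \hat I_U \hat I_V({}^{\circ}\phi)$. The same argument with $U$ and $V$ reversed yields the result. □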

For the next lemma we use the fact (Exercise IV.1.10) that if $h$ is real-valued and nonnegative, then $h$ is a null function (Definition 1.7) with respect to an integration structure $(L, I)$ iff $h \in \hat L$ and $\hat I(h) = 0$.

5.4 Lemma Suppose that $h$ is a bounded real-valued null function on $W$. Then $h$ has the Fubini property with respect to $\hat I_U$, $\hat I_V$, and $\hat I_W$.

Proof: We may assume that $h \ge 0$ by considering $h = h^+ - h^-$ and using the fact that the Fubini property is preserved under sums (exercise). Then we have $0 \le h \le K$ for some standard integer $K$. Since $h$ is null there is a decreasing sequence $\langle \phi_n : n \in N \rangle$ of functions $\phi_n \in L_W$ with $h \le \phi_n \le K$ for all $n$, and $\lim {}^{\circ}I_W(\phi_n)$ $(n \in N)$ $= 0$. Since $h$ is real-valued there is a real-valued $H \in \hat L_W$ to which the sequence $\langle {}^{\circ}\phi_n \rangle$ monotonically decreases, and $0 \le h \le H$. Now $H$ also has the strong Fubini property by Lemmas 5.3 and 5.2 (appropriately modified), and $\hat I_W(H) = 0$. It follows from Theorem 4.2 that for almost all $u \in U$ (in the measure induced by $(\hat L_U, \hat I_U)$), $\hat I_V H(u, \cdot) = 0$, whence $h(u, \cdot)$ is null on $V$. Therefore $\hat I_U \hat I_V h = 0$. The same argument works with $U$ and $V$ reversed, and we conclude that the Fubini property holds for $h$. □

Our main theorem generalizes a result of H. J. Keisler [25, p. 7].

5.5 Nonstandard Fubini Theorem Let $(L_U, I_U)$, $(L_V, I_V)$, and $(L_W, I_W)$ be internal integration structures on the internal sets $U$, $V$, and $W = U \times V$, respectively, with 1 in $L_W$ and ${}^{\circ}I_W 1 < \infty$. Assume that every finite-valued function $\phi$ in $L_W$ has the internal strong Fubini property with respect to $I_U$, $I_V$, and $I_W$. Then any $f \in \hat M_W$ for which $\hat I_W |f| < \infty$ has the Fubini property with respect to $\hat I_U$, $\hat I_V$, and $\hat I_W$.

Proof: Using the fact that the Fubini property is preserved under sums and writing $f = f^+ - f^-$, we may assume that $f$ is positive. Also, we may assume that $f$ is bounded by first proving the result for $f \wedge n$ and using Lemma 5.2 to pass to the limit. Suppose then that $f \in \hat L_W$ is a bounded nonnegative function. Then $f$ has a decomposition $f = \phi + h$ with $\phi \in L_W$ bounded and $h$ a bounded null function (check). Now $f = {}^{\circ}\phi + (\phi - {}^{\circ}\phi) + h$, and since the null function $(\phi - {}^{\circ}\phi) + h$ is real-valued, the theorem follows from 5.3 and 5.4. □

We will now apply Theorem 5.5 to prove a Fubini theorem for integration structures in Euclidean spaces. In the following, $X$ and $Y$ will denote closed and bounded (and thus compact) subsets of $R^n$ and $R^m$, respectively, and $Z = X \times Y$. Notice that 1 belongs to $C(X)$, $C(Y)$, and $C(Z)$. Given positive linear functionals $I_X$, $I_Y$, and $I_Z$ on $C(X)$, $C(Y)$, and $C(Z)$, we obtain integration structures $(C(X), I_X)$, $(C(Y), I_Y)$, and $(C(Z), I_Z)$. These structures have *-transforms on ${}^*X$, ${}^*Y$, and ${}^*Z$, namely, $({}^*C(X), {}^*I_X)$, $({}^*C(Y), {}^*I_Y)$, and $({}^*C(Z), {}^*I_Z)$, respectively. For example, ${}^*C(X)$ is the set of all *-continuous functions on ${}^*X$. Using the techniques of §IV.1 and IV.3, we find that these internal structures induce integration structures $(\hat L_X, \hat I_X)$, $(\hat L_Y, \hat I_Y)$, and $(\hat L_Z, \hat I_Z)$ on ${}^*X$, ${}^*Y$, and ${}^*Z$, which in turn induce integration structures $(L_X, J_X)$, $(L_Y, J_Y)$, and $(L_Z, J_Z)$ on $X$, $Y$, and $Z$, respectively. The latter structures extend $(C(X), I_X)$, $(C(Y), I_Y)$, and $(C(Z), I_Z)$. The reader should recall (Remark 2.9.2) that every real-valued function in is in t,. We remark that for $f \in C(Z)$ the equality of the iterated integrals always holds [34, 16B, p. 44]. If that common value is $I_Z f$, then the strong Fubini property holds for $f$.

5.6 Standard Fubini Theorem Assume that $X$ and $Y$ are compact. Suppose that each $f \in C(Z)$ has the strong Fubini property with respect to $I_X$, $I_Y$, and $I_Z$. Then each $f \in M_Z$ such that $J_Z |f| < \infty$ has the Fubini property with respect to $J_X$, $J_Y$, and $J_Z$.

Proof: It suffices to prove the result for $f$ bounded and hence in $L_Z$. The assumptions of Theorem 5.5 are satisfied with ${}^*X = U$, ${}^*Y = V$, and ${}^*Z = W$, since the strong Fubini property for each $f \in C(Z)$ transfers to the internal strong Fubini property for each $\phi \in {}^*C(Z)$. Let $f \in L_Z$. Then $\tilde f \in \hat L_Z$ has the Fubini property with respect to $\hat I_X$, $\hat I_Y$, and $\hat I_Z$. If ${}^{\circ}x_1 = {}^{\circ}x_2$ then $\tilde f(x_1, y) = \tilde f(x_2, y)$ for all $y \in {}^*Y$. Thus there is a standard set $A \subset X$ such that $\tilde f(x, \cdot) \in \hat L_Y$ for all $x \in {}^*X - \tilde A$.
Also $\tilde A$ is null in ${}^*X$ so $A$ is null in $X$. If $x \in X - A$ then $\tilde f(x, \cdot) = \widetilde{f(x, \cdot)}$ on ${}^*Y$ so $J_Y f(x, \cdot) = \hat I_Y \tilde f(x, \cdot)$. Set $J_Y f(x, \cdot) = 0$ for $x \in A$. Since $\widetilde{J_Y f}(x) = \hat I_Y \tilde f(x, \cdot)$ for $x \in {}^*X - \tilde A$, we have $J_Y f(x, \cdot) \in L_X$ and $J_X J_Y f = \hat I_X \hat I_Y \tilde f = \hat I_Z \tilde f = J_Z f$. The same argument with the roles of $X$ and $Y$ reversed gives the result. □

We have established the Fubini theorem for the case that $X$ and $Y$ are compact subsets of $R^n$ and $R^m$, respectively. The extension of this result for the case that $X$ and $Y$ are both open or both closed in $R^n$ and $R^m$ is a standard exercise, which we leave to the reader (Exercise 3).

Exercises IV.5

1. Prove Lemma 5.2.
2. Show that the Fubini property is preserved under sums.
3. Use Theorem 5.6, Exercise IV.3.8, and the obvious extension of Lemma 5.2 (for the case of R-valued functions) to establish Fubini's theorem for integrable $f$ on $X \times Y$, when $X$ and $Y$ are both open or both closed in $R^n$ and $R^m$, respectively.
4. (Nonstandard version of Tonelli's theorem) In the notation of this section, assume that $1 \in L_W$ with ${}^{\circ}I_W 1 < \infty$, and the other assumptions of Theorem 5.5 hold. Show that if $f \in \hat M_W^{+}$, then (a) $f(u, \cdot) \in \hat M_V$ for a.e. $u \in U$, and $f(\cdot, v) \in \hat M_U$ for a.e. $v \in V$, (b)

0 and $k \ge 0$ in $N$. Given $t_0 \in I$ and $t > 0$ with $t$ finite and $t_0 + t \in I$, let $C_{t_0}$ be the event “$b_j \in [t_0, t_0 + 1/q)$”, and let $D_{t_0}$ be the event “if $j + 1 \le i \le j + k$, $b_i \in [t_0, t_0 + t)$, and $b_{j+k+1} \notin [t_0, t_0 + t)$.” Let $y' = y - j$. Given $C_{t_0}$, the conditional probability of getting a given point of the remaining $y'$ points in $[t_0, t_0 + t)$ is

trl q2 -

t q - to

=-=-tog

It y - toI

+

-

It y’ + j - t o l ’

Therefore, for all finite t o , and hence for all to < T for some infinite T, the conditional probability

~ ( ~ t o ~ C , o N ) ~ ( ~ t o ) On the other hand, $\sum_{t_0} P(C_{t_0}) = 1$, and so $(\lambda t)^k e^{-\lambda t}/k!$. That is, the $P$-probability of having exactly $k$ more distinguishable


IV.6 Applications to Stochastic Processes

points in the interval of length $t$ after the $j$th point is $(\lambda t)^k e^{-\lambda t}/k!$. Since

the $\hat P$-probability of having only a finite number of distinguishable points in any finite interval $[0, t]$ is 1. Moreover, since $\lim_{t \to 0} e^{-\lambda t} = 1$, the $\hat P$-probability of having point $b_{j+1}$ infinitely close to $b_j$ is 0. Since this is true for each $j \ge 1$ in $N$, it follows that the $\hat P$-probability of having two distinguishable points in the same monad is 0.

We now let $E \subset \Omega$ denote the set of measure zero consisting of those $\omega$ for which $N_t(\omega)$ is infinite for some finite $t$ or for which two or more distinguishable points fall in the same monad. Since $E \in \hat{\mathcal{A}}$, we define a new probability space $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde P)$ by putting $\tilde\Omega = \Omega - E$, $\tilde{\mathcal{A}} = \{A : A \subset \tilde\Omega, A \in \hat{\mathcal{A}}\}$, and $\tilde P(A) = \hat P(A)$ for $A \in \tilde{\mathcal{A}}$. We now use $N_t(\omega)$ to define a process $\{\tilde N_t : t \in R^+\}$ on $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde P)$. For $\omega \in \tilde\Omega$ and $t \in R^+$ we put $\tilde N_t(\omega) = \sup N_s(\omega)$ $(s \approx t,\ s \in I)$. By the above remarks, $\tilde N_t(\omega)$ is finite and integer-valued for any $\omega \in \tilde\Omega$ and $t \in R^+$, and $\tilde N_t(\omega) = N_s(\omega)$ for some $s \in I$, $s \approx t$. We leave it to the reader to show that $\tilde N_t(\omega)$ is right continuous (Exercise 3) and that (6.3) and (6.4) are satisfied. Thus $\{\tilde N_t\}$ is a Poisson process on $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde P)$.
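The Poisson probabilities obtained above, namely $(\lambda t)^k e^{-\lambda t}/k!$ for exactly $k$ points in an interval of length $t$ with rate $\lambda$, can be compared numerically with the familiar finite approximation in which $n$ points are placed independently, each landing in the interval with probability $\lambda t / n$. This is only the standard binomial analogue and is not the nonstandard construction of the example. The function names and the values of `lam`, `t`, `k`, and `n` are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k, lam, t):
    """Probability of exactly k points in an interval of length t at rate lam."""
    return (lam * t) ** k * math.exp(-lam * t) / math.factorial(k)


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials, each with chance p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


lam, t, k = 2.0, 1.5, 3  # assumed rate, interval length, and point count
for n in (10**2, 10**4, 10**6):
    p = lam * t / n  # chance that a single point lands in the interval
    print(n, binomial_pmf(k, n, p), poisson_pmf(k, lam, t))
```

As `n` grows, the binomial values approach the Poisson value, which is the finite shadow of the passage to an infinite number of points in the text.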

6.8 Example (Anderson's Construction of Brownian Motion)

Brownian motion is a stochastic process which is intended to model the behavior of a particle (for example, a small particle suspended in water). The particle is subject to random disturbances (for example, collisions with the water molecules) which cause its position to change with time. For simplicity, we consider the one-dimensional case, and denote the random position of the particle on the real line at time $t \ge 0$ by $X(t)$. Again for simplicity we follow the particle only for a unit time interval. Then $\{X_t : t \in [0, 1]\}$ is to be a stochastic process on an as yet unspecified probability space $(\Omega, \mathcal{A}, P)$. A (standard) Brownian motion $\{X_t : t \in [0, 1]\}$ must satisfy the following conditions:

(6.5) $X_0 = 0$,

(6.6) if $s_1 < t_1 \le s_2 < t_2 \le \cdots \le s_n < t_n$ are points in $[0, 1]$ then the random variables $X(t_1) - X(s_1)$, $X(t_2) - X(s_2)$, ..., $X(t_n) - X(s_n)$ are independent random variables, which we denote by $X_{t_1} - X_{s_1}$, etc.,

(6.7) if $t > s$ are points in $[0, 1]$ then $P(\{\omega \in \Omega : X_t(\omega) - X_s(\omega) \le a\}) = \psi(a/\sqrt{t - s})$, where $\psi(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{x} e^{-u^2/2}\,du$.
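For readers who wish to evaluate the distribution function in condition (6.7) numerically, the following sketch may be useful. It relies on the standard identity $\psi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$ and on the usual reading of (6.7), in which the increment $X_t - X_s$ has mean 0 and variance $t - s$. The function names `psi` and `increment_cdf` are hypothetical and do not appear in the text.

```python
import math


def psi(x):
    """Normal distribution function with mean 0 and variance 1.

    Uses the identity psi(x) = (1 + erf(x / sqrt(2))) / 2.
    """
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def increment_cdf(a, s, t):
    """P(X_t - X_s <= a) for 0 <= s < t <= 1, assuming variance t - s."""
    if not (0.0 <= s < t <= 1.0):
        raise ValueError("need 0 <= s < t <= 1")
    return psi(a / math.sqrt(t - s))


print(psi(0.0), increment_cdf(0.5, 0.25, 0.5))
```

By symmetry `psi(0.0)` is one half, and `increment_cdf(a, s, t)` tends to 0 or 1 as `t - s` shrinks with `a` fixed and nonzero, which reflects how tightly the position at time `t` is determined by the position at a nearby time `s`.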


IV. Nonstandard Integration Theory

Condition (6.5) locates the particle at the origin at $t = 0$. Condition (6.6) says that the probability of a change in position of the particle in any time interval $(s_i, t_i]$ is unaffected by the changes in position in other disjoint intervals. Condition (6.7) indicates how closely the position of the particle at time $t$ can be determined if its position at time $s$ is known. The probability distribution function $\psi(x)$ is known as the normal distribution with mean 0 and variance 1. One should note that $\psi(x/\sigma) = (1/\sigma\sqrt{2\pi}) \int_{-\infty}^{x} e^{-u^2/2\sigma^2}\,du$, which is the normal distribution with mean 0 and variance $\sigma^2$.

In [2], Robert M. Anderson used the measure space construction of §IV.2 to obtain, among other things, a nonstandard representation of Brownian motion. We give here a brief account of some of his results, which is necessarily incomplete since we refer to his nonstandard version of the central limit theorem (Theorem 6.11), which is crucial to the development. The central limit theorem is one of the deeper results in probability theory and to prove it here would lead us too far from the main theme of these examples.

A Brownian motion can now be defined as follows. Fix $q = C!$, an infinite factorial in ${}^*N$, and let $(\Omega, \mathcal{A}, P)$ be the internal space for infinite coin tossing of Example 6.6 (with $\Omega$ being all sequences $\omega = (\omega_1, \ldots, \omega_q)$, and $\omega_i = +1$ or $-1$) constructed from the internal integration structure $(L, I)$. Let $(\Omega, \hat{\mathcal{A}}, \hat P)$ be the corresponding standardization of $(\Omega, \mathcal{A}, P)$ constructed from $(\hat L, \hat I)$ as in Example 6.6. Let $x(t, \cdot)$ denote the internal random variable (function in $L$) defined by setting

$$x(t, \omega) = \frac{1}{\sqrt{q}} \sum_{i=1}^{[qt]} \omega_i,$$

where $\sum_{i=1}^{0} \omega_i = 0$. Here $[qt]$ denotes the largest element of ${}^*N$ less than or equal to $qt$. Thus, for any $\omega = (\omega_1, \omega_2, \ldots, \omega_q)$, the particle located by $x(t, \omega)$ starts at the origin at $t = 0$ [i.e., $x(0, \omega) = 0$], and at each time $t_i = i/q$ $(i = 1, 2, 3, \ldots, q)$ the particle moves to the right or left a distance $1/\sqrt{q}$, depending on whether $\omega_i$ is $+1$ or $-1$; at times lying between the $t_i$ the particle remains fixed. The resulting motion is an internal analogue of a standard "symmetric random walk."

We now define $\beta(t, \omega) = {}^{\circ}x(t, \omega)$ for $t \in [0, 1]$ and $\omega \in \Omega$. We will show that $\beta(t, \cdot)$ is a Brownian motion on $(\Omega, \hat{\mathcal{A}}, \hat P)$. To do so we need the following results.
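The internal walk can be imitated on a computer only with finitely many steps, so the following sketch replaces the infinite $q$ by an ordinary positive integer. It illustrates the scaling by $1/\sqrt{q}$ in the symmetric random walk described above and is not Anderson's construction. The names `walk` and `sample_endpoint`, the value `q = 400`, the sample size, and the random seed are all assumptions made for the illustration.

```python
import math
import random


def walk(q, omega, t):
    """Position at time t of the walk with q steps driven by the signs in omega.

    Computes (1 / sqrt(q)) * (omega[0] + ... + omega[floor(q * t) - 1]);
    the empty sum is 0, so the walk starts at the origin.
    """
    steps = math.floor(q * t)
    return sum(omega[:steps]) / math.sqrt(q)


def sample_endpoint(q, rng):
    """Draw q fair coin tosses (+1 or -1) and return the walk's position at t = 1."""
    omega = [rng.choice((-1, 1)) for _ in range(q)]
    return walk(q, omega, 1.0)


rng = random.Random(0)  # fixed seed so that runs are repeatable
q = 400                 # finite stand-in for the infinite q of the text
samples = [sample_endpoint(q, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(round(mean, 3), round(var, 3))
```

For large `q` the sample mean and variance of the endpoint should be close to 0 and 1, in line with condition (6.7) for $s = 0$ and $t = 1$. Because `math.floor(q * t)` is computed in floating point, values of `t` that are not exactly representable may fall one step short.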

6.9 Definition An internal random variable on $(\Omega, \mathcal{A}, P)$ is a function $X \in L$. A collection $\{X_i : i \in I\}$ of internal random variables is *-independent if for every *-finite internal subcollection $\{X_1, \ldots, X_m\}$ ($m \in {}^*N$) and every internal

$\{X_i : i \in I\}$ is S-independent if, for every finite subcollection $\{X_1, \ldots, X_m\}$ ($m \in N$) and every $m$-tuple $(a_1, \ldots, a_m) \in R^m$, (6.8) holds with $=$ replaced by $\approx$.

6.10 Lemma Suppose $\{X_i : i \in I\}$ is S-independent. Then $\{{}^{\circ}X_i : i \in I\}$ is an independent collection of random variables on $(\Omega, \hat{\mathcal{A}}, \hat P)$.

Proof Suppose m E N, (a1,. &{W:~X,,(~ ISkjA 1 2m

&,a)> m

+nP

({

k

w : min E m ,< -lSkSA 1 2m

@I) 2m

F(u)

2 1 - sup inf F(Q,,,) 2 1 - sup inf4ne-fiI4" = 1. m

n

m

n

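Fix $\omega \in \Omega$. If for some $t \in {}^*[0, 1]$ we have ${}^{\circ}x(t, \omega) = +\infty$ or ${}^{\circ}x(t, \omega) = -\infty$, then $\omega \in \Omega_{m,n}$ for all standard $m$ and $n \in N$, whence $\omega \notin \Omega'$. If for some $s$ and $t \in {}^*[0, 1]$ with $s \approx t$ we have ${}^{\circ}|x(s, \omega) - x(t, \omega)| = a > 0$, then for $m > 2/a$ we have $\omega \in \Omega_{m,n}$ for all $n \in N$ (exercise), whence $\omega \notin \Omega'$.

Now suppose $\omega \in \Omega'$. By the preceding paragraph, $\beta(t, \omega)$ is finite for all $t \in [0, 1]$. Fix $\varepsilon > 0$ in $R$. Then the set $\{n \in {}^*N : |t - s| < 1/n \Rightarrow |x(t, \omega) - x(s, \omega)| < \varepsilon/2\}$ is internal and contains all infinite $n$. Hence it contains a finite $n$ by II.7.2(ii). Thus if $|t - s| < 1/n$, $|x(t, \omega) - x(s, \omega)| < \varepsilon/2$ and hence $|\beta(t, \omega) - \beta(s, \omega)| < \varepsilon$. It follows that $\beta(\cdot, \omega)$ is continuous on $[0, 1]$. □

Exercises IV.6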

1. (Standard) Let $X_i$ be defined on the space $\Omega_n$ of Example 6.6 by $X_i(\omega) = e_i$ if $\omega = (e_1, e_2, \ldots, e_n)$. Show that the random variables $S_k = \sum_{i=1}^{k} X_i$, $1 \le k \le n$, have independent increments, i.e., if $1 \le k_1 < k_2 < k_3 < k_4$
