Peter A. Loeb · Manfred P.H. Wolff Editors
Nonstandard Analysis for the Working Mathematician Second Edition
Editors
Peter A. Loeb, Department of Mathematics, University of Illinois, Urbana, IL, USA
Manfred P.H. Wolff, Mathematical Institute, University of Tübingen, Tübingen, Germany
ISBN 978-94-017-7326-3
ISBN 978-94-017-7327-0 (eBook)
DOI 10.1007/978-94-017-7327-0
Library of Congress Control Number: 2015946066
Primary: 03H05, 03H10, 03H15, 11U10, 12L15, 26E35, 28E05, 46S20, 47S20, 54J05
Secondary: 11B05, 11B13, 11B30, 11B75, 46B08, 46B20, 47A10, 47A58, 47D06, 47H09, 47H10, 54D30, 60G51, 60H07, 60J65, 91A06, 91B99
Springer Dordrecht Heidelberg New York London
© Springer Science+Business Media Dordrecht 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer Science+Business Media B.V. Dordrecht is part of Springer Science+Business Media (www.springer.com)
Preface
This book is addressed to mathematicians working in analysis, probability, and various applications. The aim is to provide an understandable introduction to the basic theory of nonstandard analysis in Part I, and then illuminate some of the most striking applications. Much of the book, in particular Part I, can be used in a graduate course; problems are posed in those chapters. After Part I, each chapter takes up a different field for the application of nonstandard analysis, beginning with a gentle introduction that even nonexperts can read with profit. The remainder of each chapter is then addressed to experts, showing how to use nonstandard analysis in the search for solutions of open problems and how to obtain rich new structures that produce deep insight into the field under consideration. The applications discussed here are in functional analysis including operator theory, topology applied to compactifications, probability theory including stochastic processes, economics including game theory and financial mathematics, and combinatorial number theory. In all of these areas, the intuitive notion of an infinitely small or infinitely large quantity plays an essential and helpful role in the creative process. For example, Brownian motion is often thought of as a random walk with infinitesimal increments; the spectrum of a selfadjoint operator is viewed as the set of “almost eigenvalues”; an ideal economy consists of an infinite number of agents each having an infinitesimal influence on the economy. Already at the level of calculus, one often views the integral as an infinitely large sum of infinitesimal quantities. Of course, the notion of an infinitesimal quantity has been used in mathematics for over 2200 years. 
Although it was one of the leading ideas employed during the period of mathematical development from Leibniz and Newton to Cauchy, it eluded rigorous treatment until the work of Abraham Robinson between 1957 and 1966, culminating in 1966 with the publication of his book “Non-standard Analysis”. That work finally established a rigorous foundation for the use of infinitesimals in mathematics. More precisely, Robinson constructed an ordered field extension *ℝ of the reals, ℝ, so that all of the sequences, functions and, indeed, relations of real analysis had a unique extension to the equivalent structure built from *ℝ, and all statements true for the real structure remained true for the extended objects in the
extended structure. This essential property, known as the Transfer Principle, is the pivotal result of Robinson’s nonstandard analysis. With subsequent contributions to this new discipline from many mathematicians in the late 1960s, including a new result in standard functional analysis obtained by Robinson together with A.R. Bernstein, it became clear that nonstandard analysis was much more than a foundation of infinitesimal calculus. It promised to become a powerful tool in all branches of analysis. Then, an important breakthrough was initiated by W.A.J. Luxemburg’s 1969 paper developing nonstandard hulls, and was extended with the work of C.W. Henson and L.C.R. Moore. The prior construction of ultrapowers of the reals begun by E. Hewitt and generalized to ultraproducts of Banach spaces by J.L. Krivine in the mid-1960s had become a powerful tool in functional analysis. Nonstandard functional analysis using nonstandard hulls, discussed here in Manfred Wolff’s Chap. 4, is a far reaching generalization of these applications of ultraproducts. Chapter 4 deals with old and new applications of nonstandard analysis to the theory of Banach spaces and linear operators. In particular it considers the structure theory of Banach spaces, basic operator theory, strongly continuous semigroups of operators, approximation theory of operators and their spectra, and the Fixed Point Property. The early 1970s produced another breakthrough with P.A. Loeb’s construction of a measure space out of a “hyperfinite” discrete analogue of finite probability spaces. The simplest example was based on a coin toss of length H for an infinitely large natural number H. Immediately, Loeb’s general construction was successfully used by R.M. Anderson to formulate Brownian motion exactly as a random walk with infinitesimal increments based on that coin toss. The general procedure to extract a measure space out of some given “internal” one is now called Loeb measure theory. 
This theory is briefly introduced at the end of Chap. 3, and it is fully explored starting from first principles in Horst Osswald’s Chap. 6. Applications to stochastic processes, including the Itô integral as well as the Malliavin calculus, are further detailed in Chap. 7, while Yeneng Sun’s Chap. 8 contains an application solving the measurability problem that arises when one considers an infinite number of equally weighted independent agents, and more generally, a continuum of independent random variables. Most of the results of Chap. 8 come from Sun’s recent papers and are based on the richness of the Loeb construction applied to product spaces. Successful applications of nonstandard analysis have occurred in many applied areas such as mathematical physics (an example is L. Arkeryd’s research starting in the early 1980s on gas kinetics and the Boltzmann equation) and mathematical economics. Work in the latter area was initiated with the seminal 1975 paper of D.J. Brown and A. Robinson on nonstandard exchange economies. There is a need in economic theory for models of economies with a very large number of equally weighted agents, each of which has only a negligible influence on the economy. Standard models have taken as the set of agents the unit interval [0,1] supplied with Lebesgue measure. A more natural model is the “hyperfinite set” of agents used by Brown and Robinson. Yeneng Sun’s Chap. 9 of this book describes many uses of
this model when combined with Loeb measure theory; it also shows why Lebesgue spaces must fail for many economic applications. Other applications of nonstandard analysis include the exploitation of compactness using the “S-topology” on the nonstandard extension of a topological space to obtain quite general and intuitive constructions of compactifications; see Chap. 5 by Insall, Loeb and Marciniak. Important and extensive applications of nonstandard analysis to combinatorial number theory are the subject of Jin and Di Nasso’s Chaps. 10 and 11 in this book. For the reader just learning nonstandard analysis, we point again to our Part I that begins with a simple form of nonstandard analysis, suitable for the results of calculus and basic real analysis. The presentation is intended to give the reader a feeling for the fundamental arguments of nonstandard analysis with a minimum use of model theory. The reader who begins with no background in mathematical logic should easily pick up what is needed to continue. Chapter 2 of Part I is devoted to general nonstandard analysis and presents the heart of Robinson’s theory. It is written so that the interested reader learns all that is needed for later applications without being forced to read detailed model theoretic constructions, some of which are postponed to the appendix of the chapter. Part I concludes in Chap. 3 with further applications. The authors have been asked from time to time about the relation between Robinson’s nonstandard analysis and the subject of “internal set theory,” initiated by Edward Nelson in the 1977 Bulletin of the American Mathematical Society. A good recent development of that framework with applications can be found in Nader Vakil’s text cited here in the references to Chap. 2. What is the difference? The Robinson framework adds to the standard mathematical “world” a second “nonstandard” mathematical world. 
There are no infinitely large integers in the standard world, but they do exist in the nonstandard world. Robinson chose the name “nonstandard analysis” because the nonstandard world is used to analyze the standard one. Internal set theory, on the other hand, works with only the nonstandard world, but recognizes some elements of that world as being “standard”. Important developments in the Robinson framework have taken objects formed in the nonstandard world and adjoined them to the standard world. For example, equivalence classes of “remote points” become compactifying boundary points of standard topological spaces. Quotients of points in the nonstandard extension of Banach spaces become new Banach spaces in the standard world. Measure spaces formed from nonstandard point-sets become rich measure spaces in the standard world. These constructions do not make sense in internal set theory because there is no standard world. In working through the foundations for nonstandard analysis presented in this book, the reader will gain many new and helpful insights into the enterprise of mathematics. Once these foundations are understood, research formulated in the framework of internal set theory can be easily understood with just some translation of terminology. The editors have found, however, that the reverse is not generally true. Therefore, the reader may best be served by starting here at least with Part I, or with a similar introduction to Robinson’s theory.
The editors would like to thank the contributors to this second edition for their outstanding contributions to this project. We also thank Erik Talvila, who made numerous helpful suggestions for improvements of the first edition. Finally, the editors dedicate this book to one of the founders of nonstandard analysis; he is our mentor, colleague, and friend, W.A.J. (Wim) Luxemburg.
Peter A. Loeb, Champaign-Urbana, IL, USA
Manfred P.H. Wolff, Tübingen, Germany
February 2015
Contents
Part I
An Introduction to Nonstandard Analysis
1 Simple Nonstandard Analysis and Applications
Peter A. Loeb
  1.1 Introduction
  1.2 A Simple Construction of a Nonstandard Number System
  1.3 A Simple Language
  1.4 Interpretation of the Language L
  1.5 Transfer Principle for R
  1.6 The Nonstandard Real Numbers
  1.7 Sequences
  1.8 Topology on the Reals
  1.9 Limits and Continuity
  1.10 Differentiation
  1.11 Riemann Integration
  References
2 An Introduction to General Nonstandard Analysis
Peter A. Loeb
  2.1 Superstructures
  2.2 Language for Superstructures
  2.3 Interpretation of the Language for Superstructures
  2.4 Monomorphisms and the Transfer Principle
  2.5 Ultrapower Construction of Superstructures and Monomorphisms
  2.6 Special Index Sets Yielding Enlargements
  2.7 A Result in Infinite Graph Theory
  2.8 Internal and External Sets
  2.9 Saturation
  References
3 Topology and Measure Theory
Peter A. Loeb
  3.1 Metric and Topological Spaces
  3.2 Continuous Mappings
  3.3 Convergence
  3.4 More on Topologies
  3.5 Compact Spaces
  3.6 Product Spaces
  3.7 Relative Topologies
  3.8 Uniform Continuity and Uniform Spaces
  3.9 Nonstandard Hulls
  3.10 Compactifications
  3.11 The Base and Antibase Operators
  3.12 Measure and Probability Theory
    3.12.1 The Martingale Convergence Theorem
    3.12.2 Representing Measures in Potential Theory
  References
Part II
Functional Analysis
4 Banach Spaces and Linear Operators
Manfred P.H. Wolff
  4.1 Introduction
  4.2 Basic Nonstandard Analysis of Normed Spaces
    4.2.1 Internal Normed Spaces and Their Nonstandard Hull
    4.2.2 Standard Continuous and Internal S–continuous Linear Operators
    4.2.3 Special Banach Spaces and Their Nonstandard Hulls
    4.2.4 Notes
  4.3 Advanced Theory of Banach Spaces
    4.3.1 A Brief Excursion to Locally Convex Vector Spaces
    4.3.2 General Banach Spaces
    4.3.3 Banach Lattices
    4.3.4 Notes
  4.4 Elementary Theory of Linear Operators
    4.4.1 Compact Operators
    4.4.2 Fredholm Operators
    4.4.3 Notes
  4.5 Spectral Theory of Operators
    4.5.1 Basic Definitions and Facts
    4.5.2 The Spectrum of an S–bounded Internal Operator
    4.5.3 The Spectrum of Compact Operators and the Essential Spectrum
    4.5.4 Closed Operators and Pseudoresolvents
    4.5.5 Notes
  4.6 Selected Applications
    4.6.1 Strongly Continuous Semigroups
    4.6.2 Approximation of Operators and of Their Spectra
    4.6.3 Super Properties
    4.6.4 The Fixed Point Property
    4.6.5 References to Further Applications of Nonstandard Analysis to Operator Theory
    4.6.6 Notes
  References
Part III
Compactifications
5 General and End Compactifications
Matt Insall, Peter A. Loeb and Małgorzata Aneta Marciniak
  5.1 Introduction
  5.2 General Compactifications
  5.3 End Compactifications
  5.4 Product Spaces
  References
Part IV
Measure and Probability Theory
6 Measure Theory and Integration
Horst Osswald
  6.1 Introduction
  6.2 Loeb Measures
    6.2.1 Loeb Measure Spaces
    6.2.2 Loeb Measures over Gaußian Measures
    6.2.3 Loeb Measurable Functions
    6.2.4 Loeb Spaces over the Product of Internal Spaces
    6.2.5 The Hyperfinite Time Line T
    6.2.6 Lebesgue Measure as a Counting Measure
    6.2.7 Adapted Loeb Spaces
  6.3 Standard Integrability for Internal Measures
    6.3.1 The Definition of S-integrability and Equivalent Conditions
    6.3.2 μ_L-integrability and S_μ-integrability
    6.3.3 Integrable Functions Defined on ℕ^n × Λ × [0, 1[^m
    6.3.4 Standard Part of the Conditional Expectation
    6.3.5 Characterization of S-integrability
    6.3.6 Keisler’s Fubini Theorem
    6.3.7 Hyperfinite Representation of the Tensor Product
    6.3.8 On Symmetric Functions
  6.4 Internal and Standard Martingales
    6.4.1 Stopping Times and Doob’s Upcrossing Result
    6.4.2 The Maximum Inequality
    6.4.3 Doob’s Inequality
    6.4.4 The Burkholder Davis Gundy Inequalities
    6.4.5 S-integrability of Internal Martingales
    6.4.6 S-continuity of Internal Martingales
    6.4.7 The Standard Part of Internal Martingales
  References
7 Stochastic Analysis
Horst Osswald
  7.1 Introduction
  7.2 The Itô Integral for the Brownian Motion
    7.2.1 The S-Continuity of the Internal Integral
    7.2.2 The S-Square-Integrability of the Internal Itô Integral
    7.2.3 Adaptedness and Predictability
    7.2.4 The Standard Itô Integral
    7.2.5 Integrability of the Itô Integral
    7.2.6 The Wiener Measure
  7.3 The Iterated Integral
    7.3.1 The Definition of the Iterated Integral
    7.3.2 On Products of Iterated Integrals
    7.3.3 The Continuity of the Standard Iterated Integral Process
    7.3.4 The W_{C_H}-Measurability of the Iterated Itô Integral
    7.3.5 I_n^M(f) is a Continuous Version of the Standard Part of I_n^M(F)
    7.3.6 Continuous Versions of Iterated Integral Processes
  7.4 Beginning of Malliavin Calculus
    7.4.1 Chaos Decomposition
    7.4.2 A Lifting Theorem for Functionals in L^2_W(C_L)
    7.4.3 Computation of the Kernels
    7.4.4 The Kernels of the Product of Wiener Functionals
    7.4.5 The Malliavin Derivative
    7.4.6 A Commutation Rule for Derivative and Limit
    7.4.7 The Clark-Ocone Formula
    7.4.8 A Lifting Theorem for the Derivative
    7.4.9 The Skorokhod Integral
    7.4.10 Product and Chain Rules for the Malliavin Derivative
  7.5 Stochastic Integration for Symmetric Poisson Processes
    7.5.1 Orthogonal Increments
    7.5.2 From Internal Random Walks to the Standard Poisson Integral
    7.5.3 Iterated Integrals
    7.5.4 Multiple Integrals
    7.5.5 The σ-Algebra D Generated by the Wiener-Lévy Integrals
  7.6 Malliavin Calculus for Poisson Processes
    7.6.1 Chaos
    7.6.2 Malliavin Derivative
    7.6.3 Exchange of Derivative and Limit
    7.6.4 The Clark-Ocone Formula
    7.6.5 The Skorokhod Integral
    7.6.6 Smooth Representations
    7.6.7 The Product Rule
    7.6.8 The Chain Rule
  References
8 New Understanding of Stochastic Independence
Yeneng Sun
  8.1 The General Context
  8.2 The Specific Problems
  8.3 Difficulties in the Classical Framework
  8.4 The Resolution
  8.5 Exact Law of Large Numbers
  8.6 Converse Law of Large Numbers
  8.7 Almost Equivalence of Pairwise and Mutual Independence
  8.8 Duality of Independence and Exchangeability
  8.9 Grand Unification of Multiplicative Properties
  8.10 Discrete Interpretations
  8.11 Notes
  References
Part V
Economics and Nonstandard Analysis
9 Nonstandard Analysis in Mathematical Economics
Yeneng Sun
  9.1 Introduction
  9.2 Distribution and Integration of Correspondences
    9.2.1 Distribution of Correspondences
    9.2.2 Integration of Correspondences
  9.3 Nash Equilibria in Games with Many Players
    9.3.1 General Existence of Nash Equilibria in the Loeb Setting
    9.3.2 Nonexistence of Nash Equilibria in the Lebesgue Setting
  9.4 Nash Equilibria in Finite Games with Incomplete Information
    9.4.1 Nonexistence of Nash Equilibria for Games with Information
    9.4.2 Approximate Nash Equilibria for Large Finite Games and Idealizations
    9.4.3 General Existence of Nash Equilibria for Games with Information
  9.5 Exact Law of Large Numbers and Independent Set-Valued Processes
  9.6 Competitive Equilibria in Random Economies
  9.7 General Risk Analysis and Asset Pricing
    9.7.1 General Risk Analysis for Large Markets
    9.7.2 The Equivalence of Exact No Arbitrage and APT Pricing
  9.8 Independent Universal Random Matching
  9.9 Notes
  References
Part VI
Combinatorial Number Theory
10 Density Problems and Freiman’s Inverse Problems
Renling Jin
  10.1 Introduction
  10.2 Applications to Density Problems
    10.2.1 Sumset Phenomenon
    10.2.2 Plünnecke Type of Inequalities for Densities
  10.3 Applications to Freiman’s Inverse Problems
    10.3.1 Freiman’s Inverse Problem for Cuts
    10.3.2 Freiman’s 3|A| − 3 + b Conjecture
    10.3.3 Freiman’s Inverse Problem for Upper Asymptotic Density
  References
11 Hypernatural Numbers as Ultrafilters
Mauro Di Nasso
  11.1 Introduction
  11.2 The u-equivalence
  11.3 Hausdorff S-topologies and Hausdorff Ultrafilters
  11.4 Regular and Good Ultrafilters
  11.5 Ultrafilters Generated by Pairs
  11.6 Hyper-Shifts
  11.7 Nonstandard Characterizations in the Space (βℕ, ⊕)
  11.8 Idempotent Ultrafilters
  11.9 Final Remarks and Open Questions
  References
Index
Part I
An Introduction to Nonstandard Analysis
Chapter 1
Simple Nonstandard Analysis and Applications
Peter A. Loeb
P.A. Loeb, Department of Mathematics, University of Illinois, 1409 West Green Street, Urbana, IL 61801, USA. © Springer Science+Business Media Dordrecht 2015. P.A. Loeb and M.P.H. Wolff (eds.), Nonstandard Analysis for the Working Mathematician, DOI 10.1007/978-94-017-7327-0_1
1.1 Introduction
The notion of an infinitesimal number has been used in mathematical arguments since before the time of Archimedes, some 2200 years ago. We understand infinitesimals in terms of a scale set by ordinary numbers. For a positive number to be infinitesimal, it must be greater than 0 and yet smaller than any positive number you might write using a decimal expansion. Infinitesimals played a fundamental role in the development of the calculus, starting in the late 1600s with the work of Newton and Leibniz. They, in particular Leibniz, used infinitesimal numbers to define the derivative and the integral. Even today, the modern definitions of limits, derivatives and integrals can be simplified with the use of infinitesimal numbers.
Consider, for example, calculating the total force of water on a dam. If we use infinitesimals, we can calculate the total force by cutting the dam up into horizontal strips of infinitesimal width. In each strip, the pressure changes by only an infinitesimal amount. By taking the pressure at any point in the strip and multiplying by the area of a rectangle that approximates the strip, we will get the actual force on the strip except for an infinitesimal error. The sum of these errors over all of the strips will still be infinitesimal. Therefore, the total force on the dam is the nearest ordinary number to the sum of the approximations we obtain for each strip. Here we come to a question we will answer later: What do we mean by the sum over all of the strips?
It is helpful to think in terms of infinitesimals in branches of mathematics beyond calculus. For example, Brownian motion is the random motion of microscopic particles suspended in a liquid or gas. Under a microscope you can clearly see this zig-zag random motion caused by the collision of the particles with molecules in the suspending medium. An important part of mathematical probability is the construction
of mathematical models for this motion. A very convenient model has each particle performing a random walk with steps of infinitesimal length. That is, one divides time into infinitesimal intervals, and in each interval, the particle moves in a straight line over an infinitesimal distance. At the end of each time interval, the particle chooses at random a new direction for its motion. Many probabilists do not know how to make a mathematically correct version of this construction. Nevertheless, they still think of Brownian motion in these terms. For them, it is a useful fiction, but in this book, we will see how to turn that fiction into rigorous mathematics. Infinitesimals have always been regarded as a useful fiction that facilitates mathematical computation and invention. This fiction has always had its critics. Nevertheless, infinitesimals were used during the 18th and 19th centuries to flesh out what we now call “the Calculus”. Then, in the late 19th century, this theoretical foundation, which seemed unlikely ever to be made clear, was replaced with what is now called the ε-δ method. In the middle of the last century, mathematicians began removing infinitesimals from all mathematics courses. Thus infinitesimals gradually faded from use, persisting only as an intuitive aid to conceptualization by physicists and engineers, and even mathematicians when setting up multiple integrals. In 1960, the mathematical logician Abraham Robinson gave a clear, mathematically correct foundation for the use of infinitesimals in all branches of mathematics. (See [6].) Robinson’s foundation started with a mathematical object such as the real number system or the real number system together with another object such as a topological space or a Hilbert space. For now, we don’t have to worry about what the second object might be. As befit a logician, Robinson set up a formal language to express facts about the mathematical system with which he was working. 
That system formed what Robinson called a standard model for the theorems expressed in his formal language. One thinks of the standard model as a world that exists in some Platonic sense. The theorems in the language are correct statements about this world. What Robinson showed was that there is another mathematical object, called a nonstandard model, in which there exist positive infinitesimal numbers, and yet all of the theorems in the language are correct statements about this model as well. Informally, what we have are two worlds, the standard and the nonstandard, and the theorems about the first are also correct statements about the second. One uses the nonstandard model to analyze the standard model, and for this reason, Robinson called his method and results “nonstandard analysis.” One way to explain Robinson’s existence result is to invoke a theorem of Gödel. Take a name not used for anything in the standard number system—for example, Bach. To the theorems about the standard number system we add new statements: “Bach is bigger than 1”, “Bach is bigger than 2”, etc. We add one such statement for each natural number. The standard number system is not a model for the collection of theorems augmented by these statements about Bach. It is, however, a model for any finite subset of the augmented collection. To see this, all one has to do is look for the biggest number mentioned in a given finite set of statements and let Bach be the name of a number that is even bigger. Since every finite subset of our augmented collection of statements has a model, namely the ordinary numbers, it follows from a result of Gödel that the entire augmented collection of statements has a model. That
is, there is a number system extending the standard one in which Bach is the name of an infinite positive number. Bach’s reciprocal, i.e., 1 divided by Bach, is then an infinitesimal number. Another approach to understanding Robinson’s result is to construct a simple number system with infinitesimals using sequences of real numbers. A sequence of positive real numbers tending to zero, such as the sequence 1, 1/2, 1/3, 1/4, . . . , becomes a positive infinitesimal. A sequence of real numbers increasing to infinity, such as the sequence 1, 2, 3, . . . , becomes an infinite nonstandard number. We will say more about this in the next section. For applications, it is usually simpler just to work with the properties of a nonstandard number system, keeping in mind that any theorem for the ordinary numbers is also a theorem, when properly interpreted, for the enlarged number system. What does it mean to say “when properly interpreted”? Briefly, we can’t formally specify what we mean when we say “all” subsets of a given set. Even for the set of natural numbers, the idea of all subsets cannot be formalized. This inability to formalize the notion of “all subsets” means that when interpreting theorems in the nonstandard model, we can cheat. We don’t interpret the word “all” to really mean “all”. We work instead with what are called internal sets, and interpret “all sets” to mean all internal sets. We will postpone the problem of what is meant by “all” and the explanation in terms of internal sets until the next chapter of this book. We will first consider in this chapter a simplified formulation of nonstandard analysis. Experience shows that a simplified formulation is often a necessary beginning for non-logicians who wish to apply this powerful mathematical tool. Our formulation is based on a modification of H.J. Keisler’s 1976 calculus book [3], which uses infinitesimals. 
Keisler’s approach is based on Robinson’s foundations, and our version of that approach first appeared in the book [2] coauthored with A.E. Hurd. In Keisler’s calculus book, the nonstandard real numbers are an extension, with infinitesimals, of the usual number system. Every function defined for the standard real numbers also works for the enlarged set of numbers. If two finite systems of equalities and inequalities have the same solutions for the real number system, then they have the same solutions for the enlarged number system. These rules, which suffice for the calculus and basic real analysis, will be modified to form the simple introduction to nonstandard analysis presented in this chapter.
Of fundamental importance for many of the applications in this chapter and applications later in this book is a generalization of the notion of a finite set. In the standard world, a set is finite if it can be enumerated with natural numbers, finishing with a largest natural number. Once we have extended the real numbers, we will have also extended the set of natural numbers, thus obtaining nonstandard natural numbers that are bigger than any ordinary natural number. Such numbers are called infinite or “unlimited” natural numbers. The number “Bach” mentioned earlier is an example. If a set can be formally enumerated with the standard and nonstandard natural numbers up to an unlimited natural number, then the set is called a hyperfinite set. Hyperfinite sets are infinite sets, but they have all of the formal properties of finite sets. In particular, since we can sum any finite set of real numbers, the summing
function in the nonstandard model also gives an answer for any hyperfinite set. Thus we have a meaning for a sum of infinitesimal numbers used in integration theory. Before we continue with our simple framework for real analysis, here is a brief description of some of the applications of nonstandard analysis that will be considered later in this book. Hyperfinite sets play a central role in such applications to many areas of mathematics beyond the calculus. In probability, for example, it is easy to analyze a finite coin toss; it is much harder to analyze an infinite coin toss. On the other hand, in a nonstandard real number system, where there are unlimited natural numbers, we can choose such a number, again call it Bach, and, at least in our imagination, we can perform a hyperfinite coin toss. That is, we can toss the coin Bach times. Now any particular outcome has probability equal to one divided by two raised to the power Bach. Moreover, this hyperfinite coin tossing space contains all standard infinite coin tosses. The problem is, to make contact with the rest of mathematics, one needs to superimpose on the set of hyperfinite coin tosses a standard probability space structure using standard numbers. In 1975 [4], the author showed how this could be done and developed a measure theory for hyperfinite probability spaces. In 1976 in [1], Robert Anderson used this measure theory for coin tossing spaces to construct a model for Brownian motion and the Itô stochastic integral. In Anderson’s model, the underlying set of elementary outcomes for Brownian motion is a set of random walks with infinitesimal steps. One divides time up into infinitesimal intervals, and at the beginning of each interval tosses a coin. If the toss is a head, one moves to the right; if the toss is a tail, one moves to the left. The step size is the square root of the time change. This is how to make rigorous the probabilist’s intuitive model for Brownian motion. 
In the last 40 years, the theory of hyperfinite measure spaces has been used to obtain new mathematical results in probability theory and many other areas of mathematics including potential theory, number theory, mathematical physics, and mathematical economics and finance. In mathematical economics, which will be discussed by Yeneng Sun later in this book, a central problem is to study equilibria in economies with a very large number of individuals when each individual has only a negligible influence on the economy. For this, it is quite natural to consider an economy with a hyperfinite number of individuals, each individual having only an infinitesimal influence on the economy. In particular, hyperfinite measure theory and its extension to more general nonstandard measure spaces have been used by Yeneng Sun to finally make rigorous the treatment of a continuum of independent random variables and traders in an economy.
The aim of this book is to make Robinson’s discovery and some of the subsequent research available to the working mathematician. As noted, the reader who begins with this chapter of the book should easily pick up what is needed to go on. To accommodate readers with little or no background in mathematical logic, we have begun, as in the first chapter of [2], with a simple construction of a nonstandard number system associated with a very simple system of logic. The simplicity of the system of logic helps to illuminate the way logic is used to obtain mathematical results. It also simplifies the proof of the “transfer principle”, which is the basic property of
nonstandard number systems. The skills the reader needs for more advanced work are developed through applications of this simple system to calculus and elementary real analysis. In the next chapter of this book, we develop the full theoretical background needed to discuss modern developments and applications.
1.2 A Simple Construction of a Nonstandard Number System

First, we extend the real numbers R with an ordered field ∗R containing infinitesimals. We denote the natural numbers by N and the set of all subsets of N by P(N). We will construct ∗R from sequences of real numbers. It is well known that sequences of real numbers with pointwise operations do not form a field. For example, let E be the set of even natural numbers, and let O be the set of odd natural numbers. The product of characteristic functions χE · χO is identically 0, and should represent 0. To have a field, therefore, one of the sequences χE or χO must also represent 0; we may decide which it should be. We will make this decision and all other such decisions in one step using a “free ultrafilter”.
Definition 1.2.1 A free ultrafilter in N is a collection U ⊂ P(N) such that
(1) ∅ ∉ U,
(2) A ∈ U & B ∈ U => A ∩ B ∈ U,
(3) A ⊂ N & A ∉ U => N \ A ∈ U,
(4) A a finite subset of N => N \ A ∈ U.
Properties 1, 2, and 3 make U an ultrafilter. They imply that any set containing a set in U is also in U. If we replace Property 3 with this weaker property, then, given Property 4, we have just a free filter in N. Ultrafilters are either free or fixed. There is a fixed ultrafilter for each n ∈ N; it is the collection {A ⊂ N : n ∈ A}. A free ultrafilter, on the other hand, contains the filter consisting of all complements of finite subsets of N. An easy application of Zorn’s Lemma shows that such an ultrafilter exists. The fixed ultrafilter consisting of all sets containing the singleton {n} corresponds to unit mass at n. Free ultrafilters on N correspond to 0-1 valued finitely additive measures on P(N) that vanish on finite sets.
Problem: Show that for any finite partition A1, . . . , An of N and any ultrafilter U in N, one and only one of the sets Ai is in the ultrafilter U.
Answer: Note that if V ∈ U and V = A ∪ B with A ∩ B = ∅, then either A ∈ U or B = V ∩ (N \ A) ∈ U, but not both since ∅ ∉ U. Now, not more than one of the Ai’s can be in U since ∅ ∉ U. Either A1 or ∪i>1 Ai is in U. If the latter, then either A2 is in U or ∪i>2 Ai ∈ U. Continuing, we see that if Ai ∉ U for i < n, then An ∈ U.
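As a concrete check of the partition property, the following worked example uses the fixed ultrafilter at the point 3 and the partition of N into the even numbers E and the odd numbers O. The choice of the point 3 and the symbol U_3 are ours, made only for illustration. A free ultrafilter cannot be exhibited this explicitly, so this is a sketch of the mechanics rather than of the ultrafilter used in the construction.

```latex
\[
  \mathcal{U}_3 = \{ A \subseteq \mathbb{N} : 3 \in A \}, \qquad
  \mathbb{N} = E \cup O, \quad E \cap O = \emptyset .
\]
% Properties (1)-(3) of Definition 1.2.1 hold for U_3:
\begin{align*}
  &3 \notin \emptyset \;\Longrightarrow\; \emptyset \notin \mathcal{U}_3, \\
  &3 \in A \text{ and } 3 \in B \;\Longrightarrow\; 3 \in A \cap B, \\
  &3 \notin A \;\Longrightarrow\; 3 \in \mathbb{N} \setminus A .
\end{align*}
% Partition property: 3 is odd, so exactly one piece lies in U_3.
\[
  O \in \mathcal{U}_3, \qquad E \notin \mathcal{U}_3 .
\]
% Property (4) fails, so U_3 is fixed rather than free:
\[
  \{3\} \text{ is finite, but } \mathbb{N} \setminus \{3\} \notin \mathcal{U}_3 .
\]
```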
Definition 1.2.2 Given an ultrafilter U, we say a property holds a.e. if it holds on some set V ∈ U. We set < ri > ≡ < si > when ri = si a.e., that is, when {i ∈ N : ri = si} is in U.
It is clear that the relation ≡ is an equivalence relation. In particular, if < ri > ≡ < si > and < si > ≡ < ti >, then < ri > ≡ < ti >, since {i ∈ N : ri = ti} ⊇ {i ∈ N : ri = si} ∩ {i ∈ N : si = ti}.
Definition 1.2.3 We will write [< ri >] for the equivalence class containing the sequence < ri >, and we will use ∗R to denote the collection of equivalence classes. The set ∗R is called the set of nonstandard real numbers or hyperreal numbers. Its formation using an ultrafilter is called an ultrapower construction.
Returning to the example given at the beginning of this section, we now have either χE ≡ 0 or χO ≡ 0. Thus, there is no longer a problem with zero divisors. We next define the operations + and · together with an order relation < for ∗R.
Definition 1.2.4 Given real sequences < ri > and < si >, we set
[< ri >] + [< si >] = [< ri + si >],
[< ri >] · [< si >] = [< ri · si >],
|[< ri >]| = [< |ri| >],
[< ri >] < [< si >] if ri < si a.e.
Proposition 1.2.5 The operations + and ·, the mapping | · |, and the ordering < are independent of the choice of representing sequences.
Proof The proof is left to the reader.
Note that the set of real numbers R (these are also called the standard numbers) is imbedded in the set of nonstandard real numbers ∗R via the map c ↦ [< c >]. For example, 5 is mapped to the equivalence class [< 5 >] containing the constant sequence < 5 >. We write ∗c for [< c >], but we will later drop the star. We will call ∗c the extension of c. Similarly, we have an extension for every n-tuple of real numbers ⟨c1, . . . , cn⟩ given by ⟨∗c1, . . . , ∗cn⟩.
Proposition 1.2.6 The structure (∗R, +, ·, <) is an ordered field.
Proof To see that the order is total, consider [< ri >] and [< si >]: exactly one of the sets {i ∈ N : ri < si}, {i ∈ N : ri = si}, {i ∈ N : ri > si} is in U, since these sets partition N. The rest is left to the reader.
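To see Definitions 1.2.2 and 1.2.4 at work, here is a short worked computation with two specific sequences indexed by i = 1, 2, 3, . . . The names ε and ω for their equivalence classes are introduced only for this illustration. Each step uses nothing beyond the fact that a free ultrafilter contains every set with finite complement.

```latex
\[
  \varepsilon = [\langle 1/i \rangle], \qquad \omega = [\langle i \rangle] .
\]
% Product: computed term by term.
\[
  \varepsilon \cdot \omega = [\langle (1/i) \cdot i \rangle]
  = [\langle 1 \rangle] = {}^{*}1 .
\]
% Order: the inequality 1/i < i fails only at i = 1.
\[
  \{ i \in \mathbb{N} : 1/i < i \} = \mathbb{N} \setminus \{1\} \in \mathcal{U}
  \;\Longrightarrow\; \varepsilon < \omega .
\]
% Changing finitely many terms does not change the equivalence class.
\[
  \langle 0, 0, 0, 1/4, 1/5, \dots \rangle \equiv \langle 1/i \rangle
  \quad \text{since} \quad
  \{ i : \text{the terms agree} \} = \mathbb{N} \setminus \{1, 2, 3\} \in \mathcal{U} .
\]
```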
Problem: Show that if r = [< ri >], then |r| = [< |ri| >] is the absolute value of r in the usual sense.
Answer: If r ≥ 0, then ri ≥ 0 a.e., so |r| = r = [< ri >] = [< |ri| >]. If r ≤ 0, then ri ≤ 0 a.e., so |r| = −r = [< −ri >] = [< |ri| >].
Problem: Show that if r = [< ri >], s = [< si >], and r < s, then there is a t ∈ ∗R with r < t < s.
Answer: Since ri < si a.e., ri < (ri + si)/2 < si a.e. An element t ∈ ∗R which works is t = [< (ri + si)/2 >].
Problem: Show that there are infinitely many elements in ∗R greater than the number ω := [< 1, 2, . . . , n, . . . >].
Answer: For each m ∈ N with m ≥ 2, mω = [< m, 2m, . . . , mn, . . . >] > ω, and these elements are distinct for distinct m.
Roughly speaking, a property will hold for ∗R if for any finite set of sequences of real numbers, it holds a.e. on N. To extend this principle to properties involving functions, we extend each real-valued function f defined on R to a function defined on ∗R; the value of the extended function at [< ri >] is [< f(ri) >].
Definition 1.2.7 For any r ∈ ∗R,
(1) r is infinite or unlimited (positive or negative) if |r| > n for every standard n ∈ N;
(2) r is finite or limited if |r| < n for some standard n ∈ N;
(3) r is infinitesimal if |r| < 1/n for every standard n ∈ N.
Note that 0 is the only standard infinitesimal. The equivalence class [< 1/i >] is infinitesimal and [< i >] is a positive, unlimited number in ∗R. We have already extended the functions +, ·, | · | and the relation <.
Definition 1.2.10 A function f of n variables is an (n + 1)-ary relation such that if ⟨a^1, . . . , a^n, b⟩ ∈ f and ⟨a^1, . . . , a^n, c⟩ ∈ f, then b = c. We will usually write b = f(a^1, . . . , a^n) instead of ⟨a^1, . . . , a^n, b⟩ ∈ f. The domain of f is the set of all ⟨a^1, . . . , a^n⟩ for which ∃b with ⟨a^1, . . . , a^n, b⟩ ∈ f. The range of f is the set of all b such that ∃⟨a^1, . . . , a^n⟩ with ⟨a^1, . . . , a^n, b⟩ ∈ f.
Example 1.2.11 The operations + and · are functions of 2 variables. However, we usually write 5 + 7 = 12 instead of ⟨5, 7, 12⟩ ∈ + or +(5, 7) = 12. When using a formal language, however, as in the next section, one should remember that 5 + 7 = 12 is shorthand for a more formal statement.
Definition 1.2.12 The ∗-transform of an n-ary relation P is the relation ∗P where ⟨[< ri^1 >], . . . , [< ri^n >]⟩ ∈ ∗P if ⟨ri^1, . . . , ri^n⟩ ∈ P for almost all i.
It is a good exercise to show that the ∗-transform of an n-ary relation is well-defined, that is, it is independent of the choice of representatives of the equivalence classes. Note that the previous extensions of +, ·, <, | · | all follow this general pattern. The ∗-transform of equality is true equality in ∗R, because the elements of ∗R are the equivalence classes. For a unary relation A, ∗A ⊃ A in the sense that for each a ∈ A, the equivalence class [< a >] containing the constant sequence < a > is also in ∗A. In general, we have the following fact.
Proposition 1.2.13 If P is an n-ary relation, then ∗P extends P; i.e., every n-tuple in P is in ∗P (or rather, the extension of the n-tuple is in ∗P).
Proof The proof is clear.
Example 1.2.14 The extension of the unit interval ∗[0, 1] contains all nonstandard reals between 0 and 1. If f is a function of n variables, then ∗f extends f. Moreover, if D is the domain of f, then ∗D is the domain of ∗f.
We will also work with the characteristic functions of relations. Given an n-ary relation P, we set χP(x1, . . . , xn) = 1 if ⟨x1, . . . , xn⟩ ∈ P and χP(x1, . . . , xn) = 0 if ⟨x1, . . . , xn⟩ ∉ P. An important consequence of the fact that we are working with a free ultrafilter, and not just a free filter, is given by the following result.
Proposition 1.2.15 Given an n-ary relation P, the extension of its characteristic function, ∗χP, is equal to the characteristic function of ∗P. That is, ∗χP = χ∗P. This means that ∗χP(x1, . . . , xn) = 1 if ⟨x1, . . . , xn⟩ ∈ ∗P, and ∗χP(x1, . . . , xn) = 0 otherwise.
Proof We will present the proof for a unary relation P; the general proof is similar. For any sequence < ri >, the sequence < χP(ri) > is defined and consists of just 0’s and 1’s. The set of natural numbers N is the union of the two disjoint sets
{i ∈ N : χP(ri) = 1} = {i ∈ N : P⟨ri⟩} and
{i ∈ N : χP(ri) = 0} = {i ∈ N : P′⟨ri⟩},
where P′ denotes the complement of P. Therefore, either ∗χP([< ri >]) = 0 or ∗χP([< ri >]) = 1 in ∗R. Here we have used the fact that U is an ultrafilter. It follows that ∗χP is a characteristic function on ∗R. We also see that ∗χP([< ri >]) = 1 if and only if P⟨ri⟩ holds a.e., and this is true if and only if ∗P⟨[< ri >]⟩ holds on ∗R, so ∗χP = χ∗P.
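The following sketch applies the extension of a function and Proposition 1.2.15 at one concrete point. The point ω = [⟨i⟩], the function f(x) = x², and the set E of even natural numbers are our choices for illustration. Which of the two cases below occurs depends on the ultrafilter, and nothing here determines it.

```latex
% Extension of f(x) = x^2, evaluated term by term:
\[
  {}^{*}f(\omega) = [\langle f(i) \rangle] = [\langle i^{2} \rangle]
  = \omega \cdot \omega .
\]
% Extension of the characteristic function of the even numbers:
\[
  {}^{*}\chi_{E}(\omega) = [\langle \chi_{E}(i) \rangle]
  = [\langle 0, 1, 0, 1, \dots \rangle]
  = \begin{cases}
      1 & \text{if } E \in \mathcal{U}
          \quad (\text{so } \omega \in {}^{*}E), \\
      0 & \text{if } O = \mathbb{N} \setminus E \in \mathcal{U}
          \quad (\text{so } \omega \notin {}^{*}E).
    \end{cases}
\]
```

With only the free filter of sets with finite complement, neither E nor O would be in the filter, and the class of ⟨0, 1, 0, 1, . . .⟩ would be neither 0 nor 1.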
1.3 A Simple Language

Recall that the real numbers R can be constructed as a collection of equivalence classes of Cauchy sequences of rational numbers. One soon forgets the construction, however, and works just with the properties of R. For the same reason, except for proofs of some fundamental properties of ∗R, it is best to put aside the construction of ∗R as a collection of equivalence classes of real-valued sequences and work not with the construction, but with the properties of ∗R. Our simplified formulation of nonstandard analysis in this chapter treats only elements of ∗R and the n-ary relations on ∗R. We will distinguish those relations that are functions. In the next chapter, we will expand this simple approach to handle full mathematical structures.
For this discussion, we need a formal language for R and another formal language for ∗R. We construct both as a single language by using the symbol S to stand for either number system. The language we construct in this chapter is restricted so that beginners can concentrate on formal sentences having just two forms. This restriction simplifies the interpretation and proof of the important “transfer principle”. It also helps in learning to exploit the difference between a mathematical object, such as the number system S, and formal statements in the language about such a mathematical object.
Our restricted language L is built from the following logical symbols:
Connectives: ∧, → (“and”, “implies”).
Quantifier: ∀ (“for all”).
Parentheses: [, ], {, }, ⟨, ⟩.
Variables: x, y, x1, a, n, etc. We need only a countable number of these. (Only a finite number will actually appear on these pages.)
Constants: For each s ∈ S we have a name s. We may have more than one name for s.
We have a relation symbol P for each relation, and in particular, a function symbol f for each function. These symbols are called the names of the relations and the functions.
We omit the underline for numbers and also for +, ·, <, … 0] is valid, but (∀x)[R⟨x⟩ → ln(x) = ln(x)] is not valid. The distributive law can be expressed as follows: (∀x)(∀y)(∀z)[1 = 1 → x · (y + z) = x · y + x · z]. We will often use a double arrow ←→ to abbreviate two separate formal sentences: in the first sentence, the left side of the double implication implies the right, and in the second sentence, the right side implies the left. Similarly, we will write a compound sentence such as

(∀x1) · · · (∀xn) [ ⋀_{j=1}^{l} ( Qj ⟶ σj ) ]

to stand for a compound sentence such as

(∀x1) · · · (∀xn) [ 5 = 5 ⟶ ⋀_{j=1}^{l} ( Qj ⟶ σj ) ],

where the left side of the implication is always true. Such a sentence holds or is valid in S if for each replacement of the variables x1, . . . , xn with constant symbols s^1, . . . , s^n, all of the resulting atomic statements Qi ⟶ σi are valid in S.
We have no formal “not” in our simple language. If τ^1, . . . , τ^n are interpretable in S, then instead of saying P⟨τ^1, . . . , τ^n⟩ is not true, we can use the complement P′ and say that P′⟨τ^1, . . . , τ^n⟩ is true. The reader should note, however, that if a term in an atomic sentence P⟨τ^1, . . . , τ^n⟩ is not interpretable, then neither the atomic sentence P⟨τ^1, . . . , τ^n⟩ nor the atomic sentence P′⟨τ^1, . . . , τ^n⟩ is true in S. The conjunction “or” is also handled with some difficulty. For example, the Trichotomy Law is handled as follows: (∀x)[R⟨x⟩ ∧ x ≮ 0 ∧ x ≠ 0 → x > 0].
We do not have the existential quantifier ∃ in our simple language. This can be worked around using the model itself. If there is always a unique object that exists, then one can use the name of the object. For example, instead of writing (∀x)(∃y)[y + x = 0], we can write (∀x)[−x + x = 0]. Even when there is no unique object that exists, there is always a choice function ψ in the model that picks one of the objects that exists. For example, given any real number, there is always a larger number. If ψ picks one of those larger numbers, then the following sentence holds: (∀x)[R⟨x⟩ → ψ(x) > x]. Such a function ψ is called a Skolem function. It picks one of the things that exists whether that thing is unique or not. If φ is the Skolem function that picks the unique multiplicative inverse, then the following sentence holds in R: (∀x)[R⟨x⟩ ∧ x ≠ 0 → x · φ(x) = 1]. Informally, one can write (∀x)[R⟨x⟩ ∧ x ≠ 0 → (∃y)[x · y = 1]]. The reader should realize, however, that this is only an abbreviation for the appropriate sentence in our formal language using a Skolem function to pick an element that works to make the statement valid.
Example 1.4.4 Let B be the range of a function f of n variables. The fact that B is the range of f is stated by the following two simple sentences taken together:
(∀x1)(∀x2) · · · (∀xn)[f(x1, . . . , xn) = f(x1, . . . , xn) → B⟨f(x1, . . . , xn)⟩],
(∀y)[B⟨y⟩ → f(ψ1(y), . . . , ψn(y)) = y].
In the first sentence, f(x1, . . . , xn) = f(x1, . . . , xn) is valid if and only if for all choices of values for x1, . . . , xn, f(x1, . . . , xn) is defined. In the second of these sentences, each of the functions ψi picks a particular value of xi that works. Informally, we can write this second sentence as
(∀y)[B⟨y⟩ → (∃x1) · · · (∃xn)[f(x1, . . . , xn) = y]].
Problem: Write a simple sentence stating that a given nonempty set A ⊆ N has a first element.
Answer: Let m be the name of the first element of A. Here, we are using our knowledge about the actual set being described by our formal language. That is, we know all about A, and we know the name of the first element m. A sentence that works is (∀x)[A⟨x⟩ → A⟨m⟩ ∧ m ≤ x]. We cannot abbreviate with an existential quantifier here, because that would mean there is a Skolem function acting on sets of numbers that produces the number that exists. In this chapter, we are not considering functions acting on sets.
Problem: Characterize the fact that A is the domain of a real-valued function f of n variables.
Answer: (∀x1) · · · (∀xn)[A⟨x1, . . . , xn⟩ ←→ R⟨f(x1, . . . , xn)⟩].
Problem: Express with a sentence of L the continuity of a real-valued function f with domain A ⊆ R at a ∈ A.
Answer: For this, we must use a positive function δ of one variable. That variable is restricted to the positive real numbers. We assume we know all about this Skolem function δ that works for f and a; in particular, we know its domain. The sentence we want is (∀x)(∀ε)[A⟨x⟩ ∧ |x − a| < δ(ε) → |f(x) − f(a)| < ε]. Again, informally, we can abbreviate the above sentence with the sentence (∀x)(∀ε > 0)(∃δ > 0)[A⟨x⟩ ∧ |x − a| < δ → |f(x) − f(a)| < ε].
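To make the role of the Skolem function δ concrete, here is one explicit choice for the function f(x) = x² with domain A = R at a point a. The formula for δ is our illustration, not part of the text above; any other positive function that makes the continuity sentence valid would serve equally well.

```latex
% One possible Skolem function for f(x) = x^2 at the point a:
\[
  \delta(\varepsilon) = \min\Bigl\{ 1,\ \frac{\varepsilon}{2|a| + 1} \Bigr\}
  \qquad (\varepsilon > 0).
\]
% Check: suppose |x - a| < \delta(\varepsilon). Then |x - a| < 1, so
\[
  |x + a| \le |x - a| + 2|a| < 2|a| + 1,
\]
% and therefore
\[
  |f(x) - f(a)| = |x - a| \, |x + a|
  < \frac{\varepsilon}{2|a| + 1} \, (2|a| + 1) = \varepsilon .
\]
```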
1.5 Transfer Principle for ∗R

We can form two languages L_R and L_∗R from the language L by letting S be R and ∗R, respectively. We next show how to transform each simple sentence in L_R into a simple sentence in L_∗R. Here are the rules.
CONVENTIONS:
(1) The name c of c ∈ R also names ∗c; we identify c and ∗c.
(2) If P names the relation P, ∗P names ∗P. In particular, if f names a function f, ∗f names ∗f. We leave off ∗ for some relations and functions such as <, … 1 is similar.
Recall that for a given ρ ∈ ∗R, we say that ρ is unlimited or infinite if |ρ| > n for all standard n ∈ N, ρ is limited or finite if |ρ| < n for some standard n ∈ N, and ρ is infinitesimal if |ρ| < 1/n for all standard n ∈ N. The number 0 is the only real infinitesimal.
Example 1.6.4 Note that ⋃_{n∈N} ∗[−n, n] is the set of finite or limited numbers in ∗R, while ∗(⋃_{n∈N} [−n, n]) = ∗R. On the other hand, ⋂_{n∈N} ∗[−1/n, 1/n] is the set of all infinitesimal numbers in ∗R, while ∗(⋂_{n∈N} [−1/n, 1/n]) = ∗{0} = {0}.
Proposition 1.6.5 Let B denote a subset of R^n. Then ∗B ∩ R^n = B.
Proof We give the proof for n = 1. If r is a real number not in B, then the atomic sentence r ∉ B holds for B and thus for the extension of B. Conversely, B ⊆ ∗B by Proposition 1.2.13.
Definition 1.6.6 An n-tuple ⟨a1, . . . , an⟩ is standard if each ai ∈ R.
Proposition 1.6.7 A number ρ ∈ ∗R is positive and unlimited if and only if 1/ρ is strictly positive and infinitesimal; ρ ∈ ∗R is negative and unlimited if and only if 1/ρ is strictly negative and infinitesimal.
Proof The preservation of positivity (negativity) under x ↦ 1/x follows by transfer. Moreover, the fact that for all standard natural numbers n, |x| > n iff |1/x| < 1/n follows by transfer of the sentence (∀x)(∀m)[N⟨m⟩ ∧ |x| > m ←→ N⟨m⟩ ∧ |1/x| < 1/m].
Theorem 1.6.8 The following properties hold for ∗R:
(i) Finite sums, differences, products of limited numbers are limited.
(ii) Finite sums, differences, products of infinitesimal numbers are infinitesimal.
(iii) The infinitesimal numbers form an ideal in the ring of limited numbers; i.e., the product of a limited and an infinitesimal number is infinitesimal.
(iv) The limited and the infinitesimal numbers form vector spaces over R.
Proof If |ρ| < n and |τ| < m for n, m ∈ N, then n + m + n · m bounds the absolute values of the sum, difference and product of ρ and τ. Fix ρ and τ infinitesimal, and fix α limited in ∗R. Given any n ∈ N, |ρ| < 1/(2n) and |τ| < 1/(2n), so |ρ + τ| < 1/n. There is an m ∈ N such that |α| < m. Since |ρ| < 1/(m · n), |α · ρ| < m/(m · n) = 1/n. The rest is left to the reader.
Definition 1.6.9 We say that x and y are infinitesimally close or infinitely close if x − y is infinitesimal. Here one writes x ≃ y or x ≈ y. We say that x and y are finitely close if x − y is finite or limited. Here we write x ∼ y. (Both ≃ and ∼ are equivalence relations.) The equivalence class for ≃ containing x is called the monad of x and written m(x). That is, m(x) = {y ∈ ∗R : y ≃ x}. The equivalence class for ∼ containing x is called the galaxy of x and written G(x).
Remark 1.6.10 The monad of 0, m(0), is the set of infinitesimals; moreover, for all x ∈ ∗R, m(x) = x + m(0). Similarly, G(x) = {y ∈ ∗R : y ∼ x}. The galaxy of 0, G(0), is the set of limited elements of ∗R. It is also denoted by Fin(∗R). For each x ∈ ∗R, G(x) = x + G(0).
Proposition 1.6.11 Between any two monads is another monad; between any two galaxies is another galaxy.
Proof If x and y are in different monads, then (x + y)/2 is between x and y. Moreover, (x + y)/2 − y = (x − y)/2 is not infinitesimal. Similarly, x is not in the monad of (x + y)/2. A similar proof works for galaxies.
Remark 1.6.12 We usually center monads at standard real numbers, and speak of the monad of r for r ∈ R.
Theorem 1.6.13 Every limited ρ ∈ ∗R is in the monad of a unique r ∈ R.
Proof Fix a limited ρ ∈ ∗R, and set A := {s ∈ R : s ≤ ρ}. Here, we have used ρ to define a subset of the ordinary real numbers that is nonempty and bounded above, because ρ is limited. Let r be the least upper bound in R of the set A. Assume |r − ρ| is not infinitesimal. Then for some n ∈ N, 1/n ≤ |r − ρ|. In this case, if r < ρ, then r + 1/n is still in A, so r is not an upper bound of A. On the other hand, if r > ρ, then r − 1/n is an upper bound of A. This again contradicts the definition of r. It follows that r ≃ ρ. If we also have s ∈ R and s ≃ ρ, then r − s ≃ 0. Since 0 is the only infinitesimal in R, s = r.
Definition 1.6.14 If ρ is limited, then the unique real number r with ρ ≃ r is called the standard part of ρ. We write r = st(ρ) or r = ◦ρ. The mapping st : G(0) → R is called the standard part map.
The standard part map will be quite important in all later parts of this book.
Theorem 1.6.15 The standard part of a sum, difference, product, or quotient of two limited numbers is, respectively, the sum, difference, product or quotient of the standard parts of those numbers, with the exception that a denominator must not be infinitesimal. If ρ ≤ τ, then ◦ρ ≤ ◦τ.
Proof Fix limited numbers r + ε and s + δ, where r and s are real numbers, and ε and δ are infinitesimal (possibly 0). Then:
(r + ε) ± (s + δ) = (r ± s) + (ε ± δ) ≃ r ± s,
(r + ε) · (s + δ) = (r · s) + (r · δ) + ε · (s + δ) ≃ r · s.
To establish the rule for quotients, we assume that r > 0, and we note that for some n, m ∈ N, 1/n < r² − 1/m, and of course 1/m > |r · ε| ≃ 0, so 1/n < r² + ε · r, whence n > 1/(r² + ε · r) > 0. Since 1/(r² + ε · r) is limited,
1/r − 1/(r + ε) = ε/(r² + r · ε) ≃ 0.
The proof for r < 0 is similar, and the rest follows from the product rule. If (r + ε) ≤ (s + δ), then r ≤ s + (δ − ε) < s + 1/n for any n ∈ N. It follows that r ≤ s.
Corollary 1.6.16 The quotient G(0)/m(0) is isomorphic to the field R.
Proof The set m(0) is the kernel of the linear map st taking the vector space (over R) G(0) onto R.
Definition 1.6.17 For any subset A of R, we will write ∗A∞ to denote the set ∗A \ G(0).
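As an illustration of Theorem 1.6.15 and of its restriction on denominators, the following computation takes an arbitrary nonzero infinitesimal ε (a symbol introduced here only for the example) and evaluates the standard part of a quotient whose denominator is infinitesimal. The quotient rule cannot be applied directly, so the expression is first simplified in ∗R.

```latex
\begin{align*}
  \frac{(2 + \varepsilon)^{2} - 4}{\varepsilon}
    &= \frac{4\varepsilon + \varepsilon^{2}}{\varepsilon}
     = 4 + \varepsilon, \\
  \operatorname{st}\!\left( \frac{(2 + \varepsilon)^{2} - 4}{\varepsilon} \right)
    &= \operatorname{st}(4 + \varepsilon)
     = \operatorname{st}(4) + \operatorname{st}(\varepsilon)
     = 4 + 0 = 4 .
\end{align*}
```

The value does not depend on which nonzero infinitesimal ε is chosen.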
Theorem 1.6.18 The only limited elements of ∗N are the standard natural numbers, so ∗N∞ = ∗N \ N. If A is an infinite subset of N, then ∗A contains arbitrarily large unlimited elements. In particular, ∗A ∩ ∗N∞ is not empty.
Proof If s ∈ ∗N and s is limited, then by definition, for some standard n ∈ N, s ≤ n. Now the transfer of the sentence (∀x)[N⟨x⟩ ∧ x ≤ n ∧ x ≠ 1 ∧ · · · ∧ x ≠ n − 1 → x = n] forces s to equal a standard natural number between 1 and n. For the second part, if A ⊆ N is infinite, then there is a Skolem function ψ such that (∀n)[N⟨n⟩ → ψ(n) ≥ n ∧ A⟨ψ(n)⟩]. The transfer of this sentence says that there are arbitrarily large elements of ∗A.
Definition 1.6.19 The set ∗N is called the set of nonstandard natural numbers or hypernatural numbers. The extension of the integers ∗Z is called the set of nonstandard integers. (It is formed from ∗N as Z is formed from N.)
Remark 1.6.20 We have the following easy to establish facts.
(1) The ∗-transform of the greatest integer function [·] continues to work in the way described by the sentence (∀x)[R⟨x⟩ → Z⟨[x]⟩ ∧ [x] ≤ x < [x] + 1].
(2) If A = {a1, . . . , an} is a finite set in R, then ∗A = A. This follows by transfer of the sentence (∀x)[A⟨x⟩ ∧ x ≠ a1 ∧ · · · ∧ x ≠ an−1 → x = an].
(3) The real numbers R = (∗Q ∩ G(0))/m(0).
(4) Not every subset of ∗R is the extension of a standard one. For example, N cannot be the extension of a finite subset A of N, since then ∗A = A; and if A is an infinite subset of N, then ∗A contains unlimited elements.
Problem: Show ∗cos is periodic with period 2π.
Answer: Transfer the sentence (∀x)[R⟨x⟩ → cos(x) = cos(x + 2π)]. To show that no smaller positive number is a period, transfer the sentence (∀r ∈ (0, 2π))(∃x ∈ R)[cos(x + r) ≠ cos(x)]. This is actually an abbreviation of the following sentence using a Skolem function ψ: (∀r)[χ(0,2π)(r) = 1 → cos(ψ(r) + r) ≠ cos(ψ(r))].
Problem: Show that a bound M on the range of a function f remains in effect for ∗f.
Answer: Transfer the sentence (∀x)[f(x) = f(x) → |f(x)| ≤ M].
Problem: Show that the ∗-transform of the set {f > c} is the set {∗f > c}.
Answer: Let A be the set {f > c}. Transfer the sentence (∀x)[A⟨x⟩ ←→ f(x) > c].
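The Skolem-function argument in the proof of Theorem 1.6.18 can be followed explicitly for a specific infinite set. In this sketch, A is taken to be the set E of even natural numbers and ψ(n) = 2n; both choices are ours, made only for illustration.

```latex
% Standard sentence, valid in R, with the Skolem function \psi(n) = 2n:
\[
  (\forall n)\bigl[\, N\langle n \rangle \to
    \psi(n) \ge n \,\wedge\, E\langle \psi(n) \rangle \,\bigr] .
\]
% Transfer of this sentence and of (\forall n)[N<n> -> \psi(n) = 2 n],
% applied to an unlimited \eta in {}^{*}N_\infty:
\[
  {}^{*}\psi(\eta) = 2\eta \ge \eta, \qquad 2\eta \in {}^{*}E .
\]
% Since 2\eta \ge \eta and \eta is unlimited, 2\eta is unlimited:
\[
  2\eta \in {}^{*}E \cap {}^{*}\mathbb{N}_{\infty} .
\]
```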
1.7 Sequences
Since a sequence is a function from N into R, it has an extension that maps ∗N into ∗R. We write ⟨s_n⟩ for the original sequence and ⟨∗s_n⟩ for its extension. Note that for all n ∈ N, ∗s_n = s_n. The results in this section are due to Robinson [6].
Theorem 1.7.1 A sequence ⟨s_n⟩ has limit L if and only if for all η ∈ ∗N_∞, ∗s_η ≃ L. That is, s_n → L iff L = st(∗s_η) ∀η ∈ ∗N_∞.
Proof Assume s_n → L. Given an ε > 0 in R, there is a k ∈ N for which the sentence
    (∀n)[N⟨n⟩ ∧ n ≥ k → |s_n − L| < ε]
holds for R. It follows by transfer that ∀η ∈ ∗N_∞, |∗s_η − L| < ε. Since ε is arbitrary in R+, |∗s_η − L| ≃ 0 ∀η ∈ ∗N_∞. Now assume it is not true that s_n → L. Then there is an ε > 0 and a Skolem function ψ : N → N such that the following sentence holds for R:
    (∀k)[N⟨k⟩ → ψ(k) ≥ k ∧ |s_{ψ(k)} − L| ≥ ε].
It follows by transfer that there are unlimited η ∈ ∗N such that |∗s_η − L| ≥ ε.
Example 1.7.2 The sequence ⟨1/n : n ∈ N⟩ becomes ⟨1/n : n ∈ ∗N⟩. For each unlimited η, 1/η ≃ 0, so 1/n → 0.
Theorem 1.7.3 Assume s_n → L and t_n → M. Then (i) s_n + t_n → L + M, (ii) s_n · t_n → L · M, (iii) s_n/t_n → L/M provided M ≠ 0.
Proof By assumption, for any unlimited η ∈ ∗N, ∗s_η ≃ L and ∗t_η ≃ M, so ∗s_η + ∗t_η ≃ L + M and ∗s_η · ∗t_η ≃ L · M; moreover, ∗s_η/∗t_η ≃ L/M provided M ≠ 0.
Theorem 1.7.4 A sequence ⟨s_n⟩ is bounded if and only if for all η ∈ ∗N_∞, ∗s_η is limited.
Proof If ⟨s_n⟩ is bounded, then there is an M > 0 such that (∀n)[N⟨n⟩ → |s_n| ≤ M] holds for R. Its transfer shows that ∀η ∈ ∗N, ∗s_η is limited. If ⟨s_n⟩ is not bounded, then there is a function ψ : N → N such that ∀n ∈ N, ψ(n) ≥ n and |s_{ψ(n)}| ≥ n. By transfer, if η ∈ ∗N_∞, then λ = ∗ψ(η) is unlimited and |∗s_λ| ≥ η. Thus, ∗s_λ is unlimited.
Problem: Show that a sequence ⟨s_n⟩ is Cauchy if and only if for all η, γ ∈ ∗N_∞, ∗s_η ≃ ∗s_γ.
Answer: Assume ⟨s_n⟩ is Cauchy. Given ε > 0 in R, ∃k_ε such that
    (∀n)(∀m)[N⟨n⟩ ∧ N⟨m⟩ ∧ n ≥ k_ε ∧ m ≥ k_ε → |s_n − s_m| < ε].
By transfer we see that for η, γ ∈ ∗N_∞, |∗s_η − ∗s_γ| < ε. Since ε is arbitrary, ∗s_η − ∗s_γ ≃ 0. Now assume that ⟨s_n⟩ is not Cauchy. Then there is an ε > 0 in R and functions ϕ and ψ from N into N such that
    (∀k)[N⟨k⟩ → ϕ(k) ≥ k ∧ ψ(k) ≥ k ∧ |s_{ϕ(k)} − s_{ψ(k)}| ≥ ε].
By transfer, ∃η, γ ∈ ∗N_∞ with |∗s_η − ∗s_γ| ≥ ε.
Problem: Show that a Cauchy sequence ⟨s_n⟩ must be bounded.
Answer: Suppose ⟨s_n⟩ is a sequence that is not bounded. Then for some Skolem function ψ,
    (∀k)[N⟨k⟩ → N⟨ψ(k)⟩ ∧ ψ(k) > k ∧ |s_{ψ(k)} − s_k| ≥ 1]
holds in R. By transfer, if η ∈ ∗N_∞, then λ = ∗ψ(η) > η and |∗s_η − ∗s_λ| ≥ 1, whence ⟨s_n⟩ is not Cauchy.
Problem: Suppose s_n → L, t_n → M, and s_n ≤ t_n ∀n. Show that L ≤ M.
Answer: Fix η ∈ ∗N_∞. Note that by transfer ∗s_η ≤ ∗t_η, and take standard parts.
Problem: Assume (s_n − 1)/(s_n + 1) → 0. Show that s_n → 1.
Answer: For any η ∈ ∗N_∞, there is an ε ≃ 0 such that ∗s_η − 1 = ε · ∗s_η + ε. It follows that (1 − ε) · ∗s_η = 1 + ε, whence ∗s_η = (1 + ε)/(1 − ε) ≃ 1.
Problem: Show that a sequence ⟨s_n⟩ converges if and only if it is Cauchy.
Answer: If s_n → L, then for unlimited η and γ, ∗s_η ≃ L ≃ ∗s_γ. If ⟨s_n⟩ is Cauchy, pick η ∈ ∗N_∞. Since ⟨s_n⟩ is bounded, ∗s_η is limited, so we may let L = st(∗s_η). Now for any γ ∈ ∗N_∞, ∗s_γ ≃ ∗s_η ≃ L.
Theorem 1.7.5 A real number L is a limit point of a sequence ⟨s_n⟩ if and only if there is an η ∈ ∗N_∞ with ∗s_η ≃ L.
Proof Assume that L is a limit point of ⟨s_n⟩. There is a function ψ : R+ × N → N such that
    (∀ε)(∀k)[ε > 0 ∧ N⟨k⟩ → N⟨ψ(ε, k)⟩ ∧ ψ(ε, k) ≥ k ∧ |s_{ψ(ε,k)} − L| < ε].
By transfer, if ε > 0 but ε ≃ 0 and η ∈ ∗N_∞, then λ = ∗ψ(ε, η) ∈ ∗N_∞ and ∗s_λ ≃ L. Conversely, if L is not a limit point of ⟨s_n⟩, then ∃ε > 0 and a k ∈ N such that
    (∀n)[N⟨n⟩ ∧ n ≥ k → |s_n − L| ≥ ε]
holds in R. It follows by transfer that for all η ∈ ∗N_∞, |∗s_η − L| ≥ ε.
Example 1.7.6 Let s_n = (−1)^n (1 − 1/n). For any unlimited η, ∗s_η ≃ 1 for η even and ∗s_η ≃ −1 for η odd.
Theorem 1.7.7 (Bolzano-Weierstrass) Every bounded sequence has a limit point.
Proof If ⟨s_n⟩ is bounded and η ∈ ∗N_∞, then ∗s_η is limited. Let L = st(∗s_η). Now, there is an unlimited element of ∗N, namely η, such that ∗s_η ≃ L.
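The two limit points in Example 1.7.6 can be illustrated numerically. The following is only a sketch: a large finite index stands in for an unlimited η (an assumption made for illustration, since no machine integer is unlimited), and the helper names `s` and `tail_values` are hypothetical.

```python
def s(n: int) -> float:
    """Term s_n = (-1)^n (1 - 1/n) of Example 1.7.6."""
    sign = 1 if n % 2 == 0 else -1
    return sign * (1 - 1 / n)


def tail_values(index: int):
    """Return s at an even index >= `index` and at the following odd index.

    A large finite index is used only as a stand-in for an unlimited
    hypernatural; it illustrates, but does not prove, the statement.
    """
    even = index if index % 2 == 0 else index + 1
    return s(even), s(even + 1)


if __name__ == "__main__":
    even_value, odd_value = tail_values(10**6)
    print(even_value, odd_value)  # close to 1 and close to -1
```

Even indices approach 1 and odd indices approach −1, matching the two standard parts named in the example.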
1.8 Topology on the Reals
Recall that the monad m(a) of a real number a ∈ R consists of the points infinitesimally close to a in ∗R. Many of the arguments in this section generalize to first countable topological spaces, where the monad of a point x is the set m(x) := ∩{∗V : V a standard open neighborhood of x}. The compactness results of this section generalize to R^n.
Theorem 1.8.1 Let A be a subset of R. (i) A is open if and only if for all a ∈ A, m(a) ⊂ ∗A. (That is, points an infinitesimal distance away from a are still in ∗A.) (ii) A is closed if and only if for all a ∈ R \ A, m(a) ∩ ∗A = ∅.
Proof Clearly, (ii) follows from (i). If A is open and a ∈ A, then for some δ > 0,
    (∀x)[R⟨x⟩ ∧ |x − a| < δ → A⟨x⟩]
holds for R. If x ∈ ∗R and x ∈ m(a), i.e. x ≃ a, then |x − a| < δ, so by transfer, x ∈ ∗A. If A is not open, then ∃a ∈ A and a sequence ⟨s_n⟩ such that
    (∀n)[N⟨n⟩ → (R \ A)⟨s_n⟩ ∧ |s_n − a| < 1/n].
By transfer, for η ∈ ∗N_∞, ∗s_η ≃ a and ∗s_η ∈ ∗(R \ A) = ∗R \ ∗A, whence m(a) is not contained in ∗A.
Problem: Show that arbitrary unions and finite intersections of open sets are open, and arbitrary intersections and finite unions of closed sets are closed.
Answer: If a ∈ A_1 ∩ A_2 ∩ · · · ∩ A_n, all open, then m(a) ⊂ ∗A_i for i = 1, . . . , n, so m(a) ⊂ ∗A_1 ∩ · · · ∩ ∗A_n = ∗(A_1 ∩ · · · ∩ A_n). The rest is clear.
Theorem 1.8.2 A point c is an accumulation point of A ⊆ R if and only if there is an x ∈ ∗A with x ≠ c but x ≃ c.
Proof If c is an accumulation point of A, then there is a sequence ⟨s_n⟩ such that
    (∀n)[N⟨n⟩ → A⟨s_n⟩ ∧ 0 < |s_n − c| < 1/n].
For the desired point, let x = ∗s_η for some η ∈ ∗N_∞. If c is not an accumulation point of A, then ∃ε > 0 in R such that
    (∀x)[0 < |x − c| < ε → (R \ A)⟨x⟩],
so m(c) ∩ (∗A \ {c}) = ∅.
Theorem 1.8.3 The closure Ā of A ⊆ R is the set {x ∈ R : m(x) ∩ ∗A ≠ ∅}.
Proof If x ∈ A or x ∈ R is an accumulation point of A, then m(x) ∩ ∗A ≠ ∅. If m(x) ∩ ∗A ≠ ∅ and x ∉ A, then x is an accumulation point of A.
Remark 1.8.4 The closure Ā = st(∗A ∩ G(0)).
We next give Robinson's criterion for compactness. For our elementary setting, we need a standard-analysis result not needed in the general setup of nonstandard analysis. That result generalizes to R^n. We should note that Robinson's Theorem is also valid for spaces that are not second countable.
Proposition 1.8.5 A set A ⊂ R is compact if and only if each covering of A by open intervals with rational end points has a finite subcovering.
Proof Fix an arbitrary open covering of A. Given a ∈ A, there is an open U in the covering and an open interval with rational end points (α, β) such that a ∈ (α, β) ⊂ U. This gives a covering of A by intervals with rational end points. If this covering has a finite subcover {(α_i, β_i) : 1 ≤ i ≤ n}, then for each i, there is a set U_i in the original cover with (α_i, β_i) ⊆ U_i. Clearly, A ⊆ ∪_{1≤i≤n} U_i. The rest is clear.
Definition 1.8.6 We say that ρ ∈ ∗A is near-standard in A if °ρ = st(ρ) ∈ A.
Theorem 1.8.7 (Robinson) A set A ⊂ R is compact if and only if for all ρ ∈ ∗A, there is an a ∈ A with ρ ≃ a. That is, every point of ∗A is near-standard in A.
Proof Assume A is compact but ∃ρ ∈ ∗A not in the monad of any standard point of A. Then ∀a ∈ A, |ρ − a| is a non-infinitesimal number, so ∃δ_a > 0 in R such that |ρ − a| ≥ δ_a. Since A is compact, there is a finite set {a_1, . . . , a_n} ⊂ A and numbers {δ_1, . . . , δ_n} ⊂ R+ with δ_i = δ_{a_i} such that the following sentence holds for R:
    (∀x)[A⟨x⟩ ∧ |x − a_1| ≥ δ_1 ∧ · · · ∧ |x − a_{n−1}| ≥ δ_{n−1} → |x − a_n| < δ_n].
By transfer, |ρ − a_n| < δ_n, which contradicts the choice of δ_n = δ_{a_n}. It follows that if A is compact, each ρ ∈ ∗A is near-standard in A. Now assume there are sequences ⟨α_n⟩ and ⟨β_n⟩ of rational numbers such that for each n, α_n < β_n, and the open intervals (α_n, β_n) form a covering of A with no finite subcovering. For each n ∈ N, let ψ(n) be an element of A not in the first n intervals. Then we have
    (∀n)(∀k)[N⟨n⟩ ∧ N⟨k⟩ ∧ k ≤ n ∧ α_k < ψ(n) → β_k ≤ ψ(n)],
    (∀n)(∀k)[N⟨n⟩ ∧ N⟨k⟩ ∧ k ≤ n ∧ β_k > ψ(n) → α_k ≥ ψ(n)].
By transfer, for η ∈ ∗N_∞, ρ = ∗ψ(η) ∉ ∗(α_k, β_k) for any standard k ∈ N. Since for each a ∈ A, ∃k ∈ N with a ∈ m(a) ⊂ ∗(α_k, β_k), ρ is not in the monad of any standard point of A.
Theorem 1.8.8 (Heine-Borel) A set A ⊂ R is compact if and only if it is closed and bounded.
Proof Assume A is not closed. Then for some x ∈ R \ A, m(x) ∩ ∗A ≠ ∅, i.e., m(x) contains a point y ∈ ∗A. Since st(y) = x ∉ A, A is not compact. Assume A is not bounded; that is, for each n ∈ N, there is a point φ(n) ∈ A with |φ(n)| ≥ n. Then by transfer, ∗A contains an unlimited element, which is not in the monad of any standard point. Now assume that A is closed and that A is bounded by a constant M. Then (∀x)[A⟨x⟩ → |x| ≤ M] holds in R. Given ρ ∈ ∗A, |ρ| ≤ M, so ρ has a standard part, call it r. Since m(r) ∩ ∗A contains ρ and A is closed, r ∈ A.
Remark 1.8.9 All of these notions are easily extended to R^n. For example, we have ⟨x_1, . . . , x_n⟩ ≃ ⟨y_1, . . . , y_n⟩ if and only if for all i, x_i ≃ y_i.
Problem: Does °x ≤ °y ⇒ x ≤ y?
Answer: No. If ε > 0 is infinitesimal, then 0 < ε, while 0 = °ε ≤ °0 = 0. It is true that °x < °y ⇒ x < y. That is, if ρ = r + ε and τ = t + δ, with r < t in R and ε ≃ 0, δ ≃ 0, then t − r is not infinitesimal, so ρ = r + ε < τ = t + δ.
Problem: Show that if A and B are compact in R, so is A + B.
Answer: Given y ∈ ∗(A + B), ∃α ∈ ∗A and β ∈ ∗B such that y = α + β. Since st(α) ∈ A and st(β) ∈ B, st(y) exists and is in A + B.
1.9 Limits and Continuity
It is quite natural to treat limits and continuity using infinitesimals.
Theorem 1.9.1 Suppose a is an accumulation point of A and f : A → R. Then lim_{x→a} f(x) exists and equals L ∈ R if and only if for all x ∈ ∗A with x ≃ a but x ≠ a, ∗f(x) ≃ L; that is, ∗f[(m(a) ∩ ∗A) \ {a}] ⊆ m(L).
Proof Assume that lim_{x→a} f(x) = L. Fix ε > 0 in R. Then ∃δ > 0 such that
    (∀x)[A⟨x⟩ ∧ 0 < |x − a| < δ → |f(x) − L| < ε]
holds for R. By transfer, if x ∈ (m(a) ∩ ∗A) \ {a}, then |∗f(x) − L| < ε, and this is true for any ε > 0 in R, whence ∗f(x) ≃ L. Now assume that L is not the limit of f(x) as x → a. Then there is an ε > 0 in R and a sequence ⟨s_n⟩ such that
    (∀n)[N⟨n⟩ → A⟨s_n⟩ ∧ 0 < |s_n − a| < 1/n ∧ |f(s_n) − L| ≥ ε]
holds in R. By transfer, if η ∈ ∗N_∞, then ∗s_η ∈ (m(a) ∩ ∗A) \ {a} but ∗f(∗s_η) ∉ m(L).
Theorem 1.9.2 If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M, then (i) lim_{x→a} (f + g)(x) = L + M, (ii) lim_{x→a} (f · g)(x) = L · M, (iii) lim_{x→a} (f/g)(x) = L/M if M ≠ 0.
Proof The result follows by taking standard parts of ∗f(x) and ∗g(x) when x ∈ (m(a) ∩ ∗A) \ {a}.
Theorem 1.9.3 If f is defined on A, then f is continuous at a ∈ A if and only if for all x ∈ m(a) ∩ ∗A, ∗f(x) ≃ f(a). That is, if and only if for Δx = x − a ≃ 0, we have Δy = ∗f(x) − f(a) ≃ 0.
Proof The result follows from the previous result for limits.
Example 1.9.4 If f(x) = x^2 and Δx = x − a ≃ 0, then Δy = (a + Δx)^2 − a^2 = 2a · Δx + Δx^2 ≃ 0.
Corollary 1.9.5 The sum, product, and quotient of functions continuous at a are continuous at a, provided that for the quotient, the denominator does not vanish at a.
Theorem 1.9.6 (Intermediate Value Theorem) If f is continuous on [a, b] and f(a) < d < f(b), then there is a c ∈ (a, b) with f(c) = d. A similar result holds if f(b) < d < f(a).
Proof For each n ∈ N, partition [a, b] into steps of length (b − a)/n. There is a first time in going from a partition point a + (k − 1)(b − a)/n to a partition point a + k(b − a)/n that f crosses from below d to a level at or above d. This gives a sequence ⟨s_n⟩ such that
    (∀n)[N⟨n⟩ → a ≤ s_n < b ∧ f(s_n) < d ∧ d ≤ f(s_n + (b − a)/n)].
For η ∈ ∗N_∞, ∗s_η is limited, and c = st(∗s_η) ∈ [a, b]. Moreover, since ∗s_η ≃ ∗s_η + (b − a)/η, c = st(∗s_η + (b − a)/η). By continuity, f(c) ≤ d and f(c) ≥ d, whence f(c) = d. For the rest, replace f with −f.
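The sequence ⟨s_n⟩ used in the proof of the Intermediate Value Theorem can be computed for finite n. The sketch below scans the partition points from left to right and returns the left endpoint of the first step on which f passes from below d to a level at or above d. The function name `first_crossing` and the sample function are assumptions chosen for illustration; floating-point arithmetic is also an assumption of the sketch, not part of the argument.

```python
from typing import Callable


def first_crossing(f: Callable[[float], float], a: float, b: float,
                   d: float, n: int) -> float:
    """Return s_n: the left endpoint of the first step [left, right] of
    length (b - a)/n with f(left) < d <= f(right).

    Assumes f(a) < d < f(b), as in Theorem 1.9.6.
    """
    step = (b - a) / n
    for k in range(1, n + 1):
        left = a + (k - 1) * step
        right = a + k * step
        if f(left) < d <= f(right):
            return left
    raise ValueError("no crossing found; check that f(a) < d < f(b)")


if __name__ == "__main__":
    # For f(x) = x^2 on [0, 2] and d = 2, s_n lies within one step of sqrt(2).
    print(first_crossing(lambda x: x * x, 0.0, 2.0, 2.0, 1000))
```

As n grows, s_n and s_n + (b − a)/n squeeze the crossing point; the proof replaces "n grows" by a single unlimited η and takes the standard part.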
Theorem 1.9.7 (Extreme Value Theorem) If f is continuous on [a, b], then there is a point c ∈ [a, b] with f(c) ≥ f(x) for all x ∈ [a, b].
Proof For each n ∈ N, construct the points x_{n,k} = a + (k/n) · (b − a), 0 ≤ k ≤ n. Choose ψ(n) so that f(x_{n,ψ(n)}) ≥ f(x_{n,k}) ∀k with 0 ≤ k ≤ n. Fix η ∈ ∗N_∞. By transfer, ∗f(∗x_{η,∗ψ(η)}) ≥ ∗f(∗x_{η,k}) for any k ∈ ∗Z with 0 ≤ k ≤ η. Moreover, c = st(∗x_{η,∗ψ(η)}) ∈ [a, b]. If d ∈ [a, b], then for the appropriate Skolem function ϕ determined by d we have ∗ϕ(η) ∈ ∗Z, 0 ≤ ∗ϕ(η) ≤ η, and
    a ≤ ∗x_{η,∗ϕ(η)} ≤ d < ∗x_{η,∗ϕ(η)} + (b − a)/η.
Since f is continuous, f(c) ≃ ∗f(∗x_{η,∗ψ(η)}) ≥ ∗f(∗x_{η,∗ϕ(η)}) ≃ f(d). Since f(c) and f(d) are real, f(c) ≥ f(d).
Remark 1.9.8 With the general approach to nonstandard analysis that we will present in the next chapter, we would be able to simplify the above proof by using a function that picks the biggest element from any hyperfinite set.
Theorem 1.9.9 A function f is uniformly continuous on a set A if and only if for each x and y in ∗A with x ≃ y, ∗f(x) ≃ ∗f(y).
Proof Assume f is uniformly continuous on A. Given ε > 0, ∃δ > 0 so that
    (∀x)(∀y)[A⟨x⟩ ∧ A⟨y⟩ ∧ |x − y| < δ → |f(x) − f(y)| < ε].
Thus if x ≃ y in ∗A, then |∗f(x) − ∗f(y)| < ε. Since ε is arbitrary in R+, we have ∗f(x) ≃ ∗f(y). Now assume that f is not uniformly continuous on A. Then ∃ε > 0 and a pair of sequences ⟨a_n⟩ and ⟨b_n⟩ such that
    (∀n)[N⟨n⟩ → A⟨a_n⟩ ∧ A⟨b_n⟩ ∧ |a_n − b_n| < 1/n ∧ |f(a_n) − f(b_n)| ≥ ε].
By transfer, for η ∈ ∗N_∞, ∗a_η ≃ ∗b_η but |∗f(∗a_η) − ∗f(∗b_η)| ≥ ε.
Example 1.9.10 The function f(x) = 1/x is continuous on (0, 1), since for a ∈ (0, 1) and h ≃ 0, 1/(a + h) ≃ 1/a. However, f is not uniformly continuous on (0, 1) since for η ∈ ∗N_∞, 1/η ≃ 1/(2η) ≃ 0, but ∗f(1/η) = η and ∗f(1/(2η)) = 2η.
Theorem 1.9.11 If A ⊂ R is compact and f is continuous on A, then f is uniformly continuous on A.
Proof Fix x ≃ y in ∗A. Since A is compact, a = st(x) exists and lies in A by Theorem 1.8.7. Then a = st(y). By continuity, ∗f(x) ≃ f(a) ≃ ∗f(y).
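The failure of uniform continuity in Example 1.9.10 can be checked with exact rational arithmetic. In this sketch a finite n stands in for the unlimited η (an assumption for illustration only), and the function name `gap_and_jump` is hypothetical.

```python
from fractions import Fraction


def gap_and_jump(n: int):
    """For f(x) = 1/x, return (|x - y|, |f(x) - f(y)|) at x = 1/n, y = 1/(2n)."""
    x = Fraction(1, n)
    y = Fraction(1, 2 * n)
    gap = abs(x - y)            # equals 1/(2n): shrinks as n grows
    jump = abs(1 / x - 1 / y)   # equals n: grows as n grows
    return gap, jump


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, gap_and_jump(n))
```

The distance between the two arguments tends to 0 while the distance between the values grows without bound, which is the finite shadow of 1/η ≃ 1/(2η) with ∗f(1/η) and ∗f(1/(2η)) not infinitesimally close.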
1.10 Differentiation
Theorem 1.10.1 Let f be defined on an interval [a, a + δ) (or (a − δ, a]) for some positive δ. Then f has a right-hand (left-hand) derivative at a if for all strictly positive (negative) h ≃ 0, (∗f(a + h) − f(a))/h is finite and has a standard part independent of h. The right-hand (left-hand) derivative is that standard part.
Proof This follows immediately from the nonstandard criterion for a limit.
As usual, we say that f has a derivative at a if f has both a right-hand and left-hand derivative at a taking the same value, or if a is the end point of the domain of f and f has the appropriate one-sided derivative at a. In this case we write f′(a) for the derivative of f at a.
Theorem 1.10.2 Let f be defined on [a, b]. Then f has a continuous derivative on [a, b] if and only if for every c ∈ [a, b] and for all x, x′, y, y′ ∈ ∗[a, b] ∩ m(c), with x ≠ y and x′ ≠ y′, we have
    (∗f(x) − ∗f(y))/(x − y) ≃ (∗f(x′) − ∗f(y′))/(x′ − y′) ∈ G(0).
In this case,
    f′(c) = st((∗f(x) − ∗f(y))/(x − y)).
Proof If f has a continuous derivative on [a, b] and for c ∈ [a, b] we have x < y in the monad of c, then by the transfer of the mean value theorem, there is a z with x < z < y such that (∗f(x) − ∗f(y))/(x − y) = ∗f′(z). Here a Skolem function gives z as the output for the given input x and y. Since f′ is continuous, ∗f′(z) ≃ f′(c), and the result follows. Assume now the conditions for difference quotients hold. Fixing c ∈ [a, b] and letting y and y′ take the value c, we conclude that f′(c) exists. Since c is arbitrary, f′ exists on [a, b]. To show f′ is continuous, we again pick any c ∈ [a, b] and a point y ∈ m(c) ∩ ∗[a, b]; we must show that ∗f′(y) ≃ f′(c). Using an appropriate Skolem function and the Transfer Principle, we see that there are points x ≃ c and x′ ≃ y such that
    f′(c) ≃ (∗f(x) − f(c))/(x − c) and ∗f′(y) ≃ (∗f(x′) − ∗f(y))/(x′ − y),
so f′(c) ≃ ∗f′(y).
Example 1.10.3 If f(x) = x^2, then for any limited x and non-zero Δx ≃ 0, Δy = (x + Δx)^2 − x^2 = 2x · Δx + (Δx)^2, so Δy/Δx = 2x + Δx ≃ 2x.
Remark 1.10.4 If f′(c) exists for c ∈ [a, b], then when Δx ≃ 0 but Δx ≠ 0, setting dy = f′(c) · Δx, we have
    Δy = ∗f(c + Δx) − f(c) = f′(c) · Δx + ε · Δx = dy + ε · Δx
where ε ≃ 0. If Δx = 0, this formula for Δy holds for any limited value of ε. In any case, since dy ≃ 0, we have Δy ≃ 0.
Remark 1.10.5 A corollary of the last theorem is that f′ is continuous on [a, b] if and only if for every x ∈ ∗[a, b] (be it standard or nonstandard), if Δx ≃ 0, then Δy = dy + ε · Δx where ε ≃ 0.
The standard rules of differentiation are easily established using infinitesimals. For example, the chain rule has the following proof: If y = f(x) and x = g(t), with c = g(b), then when Δt ≃ 0, we have Δx ≃ 0, so
    Δy = (f′(c) + ε)Δx = (f′(c) + ε) · (g′(b) + δ) · Δt,
where ε ≃ 0 and δ ≃ 0. If Δx = 0 but Δt ≠ 0, then g′(b) + δ = 0. Since g′(b) is a standard number, g′(b) = 0. In any case, it follows that Δy/Δt ≃ f′(c) · g′(b).
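The decomposition Δy = dy + ε · Δx of Remark 1.10.4 can be made concrete for f(x) = x^2 from Example 1.10.3. The sketch below uses exact rationals, with a small nonzero rational Δx standing in for an infinitesimal (an assumption for illustration); the name `increment_split` is hypothetical.

```python
from fractions import Fraction


def increment_split(x: Fraction, dx: Fraction):
    """For f(x) = x^2, return (dy, eps) with delta_y = dy + eps * dx.

    Here dy = f'(x) * dx = 2 * x * dx, and eps is the error term of
    Remark 1.10.4; for this particular f it works out to eps = dx.
    """
    if dx == 0:
        raise ValueError("dx must be nonzero")
    delta_y = (x + dx) ** 2 - x ** 2
    dy = 2 * x * dx
    eps = (delta_y - dy) / dx
    return dy, eps


if __name__ == "__main__":
    print(increment_split(Fraction(3), Fraction(1, 10**9)))
```

Because ε equals Δx exactly for this f, it is infinitesimal whenever Δx is, which is the content of Δy/Δx = 2x + Δx ≃ 2x.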
1.11 Riemann Integration
The following treatment of Riemann integration is derived from the treatment in H.J. Keisler's book [3]. Further developments by this author that allow the results given here to be presented in a standard calculus course without using infinitesimals are outlined in [5].
Given a continuous function f on [a, b], we let S̄_a^b(f, P), S̲_a^b(f, P), and S_a^b(f, P) denote the upper, lower, and ordinary Riemann sum for f with respect to a partition P of [a, b]. For an ordinary Riemann sum, we take the evaluation of f at the left endpoint of each interval. We work with partitions of [a, b] that are determined by positive values of Δx; the partition points in (a, b) are of the form a + k · Δx for 1 ≤ k ≤ n − 1. For the last interval [x_{n−1}, b], we have b − x_{n−1} ≤ Δx. For each value of Δx, we denote the corresponding Riemann sums by S̄_a^b(f, Δx), S̲_a^b(f, Δx), and S_a^b(f, Δx), respectively. These functions of a, b, and Δx can be extended to ∗R.
For a continuous function f on [a, b] and Δx > 0, let M_i − m_i be the difference between the maximum and the minimum of f on the i-th partition interval [x_{i−1}, x_i]. Let E(Δx) = max_i (M_i − m_i). A property that is equivalent to the uniform continuity of f on [a, b] is the fact that lim_{Δx→0+} E(Δx) = 0. This fact allows a simple proof, both standard and nonstandard, that the Riemann integral of f on [a, b] exists.
Theorem 1.11.1 Let f be a continuous function on [a, b]. If Δx is a positive infinitesimal, then ∗S̄_a^b(f, Δx) ≃ ∗S̲_a^b(f, Δx).
Proof There are Skolem functions ϕ and ψ that given a positive input Δx give as output two points ϕ(Δx) and ψ(Δx) in [a, b] with |ϕ(Δx) − ψ(Δx)| ≤ Δx and
    S̄_a^b(f, Δx) − S̲_a^b(f, Δx) ≤ [f(ψ(Δx)) − f(ϕ(Δx))] · (b − a).
(To see this, find the first subinterval I_i on which is found the maximum difference M_i − m_i between the maximum and the minimum of f on each subinterval; let ψ(Δx)
be the point at which the maximum M_i is taken, and let ϕ(Δx) be the point where the minimum m_i is taken in that same subinterval.) Given a positive infinitesimal Δx, there is a point c ∈ [a, b] such that ∗ϕ(Δx) ≃ c ≃ ∗ψ(Δx). Since f is continuous at c, it follows that
    ∗S̄_a^b(f, Δx) − ∗S̲_a^b(f, Δx) ≤ [∗f(∗ψ(Δx)) − ∗f(∗ϕ(Δx))] · (b − a) ≃ 0.
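The bound used in the proof of Theorem 1.11.1 can be checked for a finite Δx. The sketch below is restricted to increasing functions, an assumption made so that the maximum and minimum on each subinterval sit at its endpoints and can be computed exactly; the names `partition` and `sums_for_increasing` are hypothetical.

```python
from fractions import Fraction


def partition(a, b, dx):
    """Points a, a + dx, a + 2dx, ... below b, followed by b itself.

    The last interval may be shorter than dx, as in the text.
    """
    points = [a]
    k = 1
    while a + k * dx < b:
        points.append(a + k * dx)
        k += 1
    points.append(b)
    return points


def sums_for_increasing(f, a, b, dx):
    """Return (lower sum, upper sum, E(dx)) for an increasing function f.

    For increasing f the minimum on [l, r] is f(l) and the maximum is f(r).
    """
    pts = partition(a, b, dx)
    steps = list(zip(pts, pts[1:]))
    lower = sum(f(l) * (r - l) for l, r in steps)
    upper = sum(f(r) * (r - l) for l, r in steps)
    spread = max(f(r) - f(l) for l, r in steps)  # E(dx) = max_i (M_i - m_i)
    return lower, upper, spread


if __name__ == "__main__":
    lo, up, e = sums_for_increasing(lambda t: t * t, Fraction(0), Fraction(1),
                                    Fraction(1, 100))
    print(lo, up, up - lo, e)
```

For f(t) = t^2 on [0, 1] the difference between the upper and lower sums is bounded by E(Δx) · (b − a), and both quantities shrink with Δx.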
Corollary 1.11.2 If f is continuous on [a, b], then f is Riemann integrable there, and for any positive infinitesimal Δx, ∫_a^b f(x)dx = st[∗S_a^b(f, Δx)].
Proof For any partition P, we always have S̲_a^b(f, P) ≤ S_a^b(f, P) ≤ S̄_a^b(f, P). If we assume that the area under the curve exists, then every upper sum is too big and every lower sum is too small. It follows that for Δx ≃ 0, the corresponding upper and lower sums are infinitesimally close to the area, as is the Riemann sum ∗S_a^b(f, Δx). If we do not assume that the area exists, then one needs the standard fact that for any pair of partitions P_1 and P_2 with common refinement P,
    S̲_a^b(f, P_1) ≤ S̲_a^b(f, P) ≤ S̄_a^b(f, P) ≤ S̄_a^b(f, P_2).
Therefore, there is at least one real number that is below every upper sum and above every lower sum. Since for Δx ≃ 0, the upper and lower sums are infinitesimally close, that number is unique (we denote it by ∫_a^b f(x)dx); it equals st[∗S_a^b(f, Δx)] for any positive infinitesimal Δx.
We proceed now on a somewhat more informal level that can be justified by the full machinery of the next chapter.
Theorem 1.11.3 If f is continuous on [a, b] and F(x) = ∫_a^x f(t)dt, then for every c in [a, b], F′(c) = f(c).
Proof Given c ∈ [a, b) and a positive h ≃ 0,
    (∗F(c + h) − F(c))/h = (1/h) ∫_c^{c+h} ∗f(t) dt.
This value is between the maximum and minimum value of ∗f(t) on [c, c + h], and so it is infinitesimally close to f(c). A similar proof works for c ∈ (a, b] and a negative h ≃ 0.
Proposition 1.11.4 Given a partition by Δx ≃ 0 and a corresponding sum of the form S = Σ_i ε_i · Δx_i with each ε_i ≃ 0, the sum S ≃ 0.
Proof Let ε = max_i |ε_i|. Then |S| ≤ Σ_i |ε_i| · Δx_i ≤ ε · Σ_i Δx_i = ε · (b − a) ≃ 0.
Theorem 1.11.5 (Keisler's Infinite Sum Theorem) Let S be a standard quantity such that for a continuous function f on [a, b] and a partition of ∗[a, b] by Δx ≃ 0, S = Σ_i ΔS_i = Σ_i (∗f(x_i) · Δx_i + ε_i · Δx_i) with each ε_i ≃ 0. Then S = ∫_a^b f(x)dx.
Proof Since Σ_i ε_i · Δx_i ≃ 0, S ≃ Σ_i ∗f(x_i) · Δx_i ≃ ∫_a^b f(x)dx. Since S and ∫_a^b f(x)dx are real numbers, they are equal.
Remark 1.11.6 This is a nonstandard form of a simplified version of Duhamel's Principle (see [5, 7]). The standard form replaces the infinitesimal condition with the condition that for each Δx > 0 and each i, |ΔS_i − f(x_{i−1}) · Δx_i| ≤ E(Δx) · Δx where E(Δx) is a function of Δx with limit 0 at Δx = 0.
Theorem 1.11.7 (Fundamental Theorem of Calculus) If F has a continuous derivative f on [a, b], then ∫_a^b f(x)dx = F(b) − F(a).
Proof Let y = F(x). For a partition of ∗[a, b] by Δx ≃ 0 there are infinitesimals ε_i such that
    F(b) − F(a) = Σ_i Δy_i = Σ_i (dy_i + ε_i · Δx_i) ≃ Σ_i dy_i = Σ_i ∗f(x_i) · Δx_i ≃ ∫_a^b f(x)dx.
Since F(b) − F(a) and ∫_a^b f(x)dx are real numbers, they are equal.
Example 1.11.8 To compute the total force F of water on the bottom half of a circular window, one must consider that for a partition by horizontal strips, the maximum length l is at the top of the strip while the maximum pressure p is at the bottom. It does not follow, a priori, that upper and lower Riemann sums bound the quantity. By continuity, for a partition of the depth by an infinitesimal Δy, the force ΔF_i on a strip corresponding to the interval [y_i, y_i + Δy] equals l(y_i) · p(y_i) · Δy + ε_i · Δy where ε_i ≃ 0. Therefore, F = ∫(l · p)dy.
Here is one more application of our simple theory. It is a typical construction using nonstandard partitions yielding a standard continuous object. The proof is due to Robinson.
Theorem 1.11.9 (Cauchy-Peano Existence Theorem) Let f be continuous on the rectangle [x_0 − a, x_0 + a] × [y_0 − b, y_0 + b] with |f| ≤ M there. Let c = min(a, b/M), and let I_c = [x_0 − c, x_0 + c]. Then there is a function Φ ∈ C^1(I_c) with Φ(x_0) = y_0 and Φ′(x) = f(x, Φ(x)) for all x ∈ I_c.
Proof We will establish the result for I = [x_0, x_0 + c]. Given n ∈ N, let x_k = x_0 + (k/n) · c for 0 ≤ k ≤ n. Let Φ_n be the polygonal path given by Φ_n(x_0) = y_0, and for 0 ≤ k ≤ n − 1 and x_k < x ≤ x_{k+1},
    Φ_n(x) = Φ_n(x_k) + f(x_k, Φ_n(x_k)) · (x − x_k).
Here, we are writing Φ_n(x) for Φ(n, x). Now the following holds for R:
    (∀n)(∀x)(∀z)[N⟨n⟩ ∧ I⟨x⟩ ∧ I⟨z⟩ → Φ_n(x_0) = y_0 ∧ |Φ_n(x) − Φ_n(z)| ≤ M · |x − z|].
It follows by transfer that for each n ∈ ∗N and each x ∈ ∗I, |∗Φ_n(x) − y_0| ≤ M · |x − x_0| ≤ M · c ≤ b. Fix η ∈ ∗N_∞; ∗Φ_η is finite valued. For each x ∈ I, set Φ(x) = st(∗Φ_η(x)); we will show that Φ is the desired solution. To show Φ is uniformly continuous, note that for x, z ∈ I,
    |Φ(x) − Φ(z)| ≃ |∗Φ_η(x) − ∗Φ_η(z)| ≤ M · |x − z|.
Moreover, ∀x ∈ ∗I,
    ∗Φ(x) ≃ Φ(st(x)) ≃ ∗Φ_η(st(x)) ≃ ∗Φ_η(x);
that is, ∗Φ is uniformly infinitesimally close to ∗Φ_η. Now given x ∈ I, there is a k ∈ ∗N with 0 ≤ k ≤ η − 1 and a corresponding x_k with x_k ≤ x ≤ x_{k+1} so that for Δx = c/η,
    Φ(x) ≃ ∗Φ_η(x_k) = y_0 + Σ_{i=0}^{k−1} ∗f(x_i, ∗Φ_η(x_i)) · Δx
         ≃ y_0 + Σ_{i=0}^{k−1} ∗f(x_i, ∗Φ(x_i)) · Δx
         ≃ y_0 + ∫_{x_0}^{x} f(t, Φ(t))dt.
Therefore, Φ has a continuous derivative with Φ′(x) = f(x, Φ(x)) for all x ∈ I.
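The polygonal paths Φ_n in the proof are the familiar Euler polygons, and they can be computed for finite n. The following sketch builds the vertices of Φ_n on [x_0, x_0 + c]; the function name `euler_polygon` and the test equation y′ = y are assumptions chosen for illustration, and a large finite n only approximates what the unlimited η delivers exactly after taking standard parts.

```python
import math
from typing import Callable, List, Tuple


def euler_polygon(f: Callable[[float, float], float], x0: float, y0: float,
                  c: float, n: int) -> Tuple[List[float], List[float]]:
    """Vertices (x_k, Phi_n(x_k)), 0 <= k <= n, of the polygonal path with
    x_k = x0 + (k/n) * c and
    Phi_n(x_{k+1}) = Phi_n(x_k) + f(x_k, Phi_n(x_k)) * (c/n).
    """
    dx = c / n
    xs, ys = [x0], [y0]
    for k in range(n):
        ys.append(ys[-1] + f(xs[-1], ys[-1]) * dx)
        xs.append(x0 + (k + 1) * dx)
    return xs, ys


if __name__ == "__main__":
    # y' = y, y(0) = 1 on the rectangle [-1, 1] x [0, 2], where |f| <= M = 2,
    # so c = min(a, b/M) = min(1, 1/2) = 1/2.
    xs, ys = euler_polygon(lambda x, y: y, 0.0, 1.0, 0.5, 10000)
    print(ys[-1], math.exp(0.5))
```

The computed path stays inside the rectangle, as the bound |Φ_n(x) − y_0| ≤ M · c ≤ b predicts, and its endpoint is close to the exact solution value.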
References
1. R.M. Anderson, A non-standard representation for Brownian motion and Itô integration. Isr. J. Math. 25, 15–46 (1976)
2. A.E. Hurd, P.A. Loeb, An Introduction to Nonstandard Real Analysis (Academic Press, Orlando, 1985)
3. H.J. Keisler, Elementary Calculus, An Infinitesimal Approach, 1st edn. 1976, 2nd edn. 1986 (Prindle, Weber & Schmidt, Boston, 1976)
4. P.A. Loeb, Conversion from nonstandard to standard measure spaces and applications in probability theory. Trans. Am. Math. Soc. 211, 113–122 (1975)
5. P.A. Loeb, A lost theorem of calculus. Math. Intell. 24, 15–18 (2002)
6. A. Robinson, Non-Standard Analysis (North-Holland, Amsterdam, 1966)
7. A.E. Taylor, Advanced Calculus (Ginn & Company, Boston, 1955)
Chapter 2
An Introduction to General Nonstandard Analysis Peter A. Loeb
P.A. Loeb, Department of Mathematics, University of Illinois, 1409 West Green Street, Urbana, IL 61801, USA
© Springer Science+Business Media Dordrecht 2015. P.A. Loeb and M.P.H. Wolff (eds.), Nonstandard Analysis for the Working Mathematician, DOI 10.1007/978-94-017-7327-0_2
2.1 Superstructures
In this chapter, we develop the general framework of nonstandard analysis and the necessary logic for the transfer principle. We will begin each section with a brief summary for readers who want to postpone the technical details until a later reading. The summary will note any important definitions and results of the section that the reader should know before going on. For example, Definition 2.1.1 describing a superstructure and Remark 2.1.3 are important in this section. A reader who wants to get quickly to later applications may skip Sects. 2.5, 2.7, and 2.9.
The reader who has read the first chapter of this book will appreciate that Skolem functions will no longer be needed to replace the existential quantifier. The results obtained in the last chapter using our simple transfer principle will still be valid, since the transfer principle used here extends that simple one. The outline of this chapter is similar to that of Chap. 2 of the author's book with Albert E. Hurd, [4].
To work with general mathematical analysis, we need to consider sets, sets of sets, etc. All of these are constructed starting with a set of individuals. We think of an individual as an object different from a set. In particular, an individual contains no elements. We build our universe from the set X of individuals using the power set operation P. The set X will always contain the natural numbers N; usually it will contain R.
Definition 2.1.1 Fix a set X containing N. Let V_0(X) = X, and for each n ∈ N, let V_n(X) = V_{n−1}(X) ∪ P(V_{n−1}(X)). The superstructure over X is the set V(X) = ∪_{n=0}^{∞} V_n(X). Entities in X are said to be of rank 0, and for n ≥ 1, entities in V_n(X) \ V_{n−1}(X) are said to be of rank n.
Example 2.1.2 Individuals are of rank 0. The number 7 and the set {7} are in V_1(X). The number 7, the set {7}, and the set of all finite subsets of N are in V_2(X). Note that V_1(X) ∈ V_2(X) and V_1(X) ⊂ V_2(X).
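The recursion in Definition 2.1.1 can be run on a toy example. In the sketch below a two-element set of individuals {0, 1} replaces X; this is purely an assumption to keep the levels finite (the definition itself requires N ⊆ X), and the names `power_set`, `levels`, and `rank` are hypothetical.

```python
from itertools import chain, combinations


def power_set(s: frozenset) -> frozenset:
    """All subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def levels(individuals: frozenset, depth: int) -> list:
    """Return [V_0, V_1, ..., V_depth] with V_n = V_{n-1} united with P(V_{n-1})."""
    v = [individuals]
    for _ in range(depth):
        v.append(v[-1] | power_set(v[-1]))
    return v


def rank(entity, v: list) -> int:
    """Smallest n with entity in V_n, searching only the computed levels."""
    for n, level in enumerate(v):
        if entity in level:
            return n
    raise ValueError("entity not found in the computed levels")


if __name__ == "__main__":
    v = levels(frozenset({0, 1}), 2)
    print([len(level) for level in v])      # sizes of V_0, V_1, V_2
    print(v[1] in v[2], v[1] < v[2])        # V_1 is both an element and a subset of V_2
```

With two individuals the levels have 2, 6, and 66 elements, and the last line reproduces the observation of Example 2.1.2 that V_1(X) is simultaneously an element and a proper subset of V_2(X).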
Remark 2.1.3 In this chapter's appendix, written by Horst Osswald, it will be shown that members of the set X of individuals can be coded so that they contain no elements of V(X). That is, if b ∈ X, there is no a with a ∈ b. For example, the equivalence class of Cauchy sequences of rational numbers with limit 7 is in V(X). It is not, however, the same as the object in X we will call 7. Given a superstructure V(X) and an entity b in V(X), we will assume that only entities a in V(X) can satisfy the relation a ∈ b. If we speak of an element a, with a ∈ b, a will automatically be in V(X).
Definition 2.1.4 An ordered pair ⟨a, b⟩ is the set {{a}, {a, b}}.
Example 2.1.5 The ordered pair ⟨3, π⟩ is the set {{3}, {3, π}}. The ordered pair ⟨3, 3⟩ is the set {{3}}.
Definition 2.1.6 For n ≥ 1, an ordered n-tuple ⟨x_1, . . . , x_n⟩ of entities x_1, . . . , x_n is the set of ordered pairs {⟨1, x_1⟩, . . . , ⟨n, x_n⟩}. If c_1, . . . , c_n are sets, we have c_1 × · · · × c_n = {⟨x_1, . . . , x_n⟩ : x_i ∈ c_i for 1 ≤ i ≤ n}. Moreover, c^n = c × · · · × c (n factors). For n ≥ 2, an n-ary relation P on c_1 × · · · × c_n is a subset of c_1 × · · · × c_n. We identify a 1-tuple ⟨x_1⟩ with the corresponding element x_1, so a 1-ary relation on c is a subset of c.
Given a relation P as above, there will be a k such that each c_i is in V_k(X). The relation P itself will be in V_{k+4}(X). Functions are relations with the usual restriction. Suppose f is a function of n variables, i.e., the domain consists of n-tuples ⟨x_1, . . . , x_n⟩ from c_1 × · · · × c_n, with each variable in the domain and the range taking values in V_k(X). Then f is an element of V_{k+7}(X) since a typical element of f is a two-tuple of the form ⟨⟨x_1, . . . , x_n⟩, x_{n+1}⟩.
Lemma 2.1.7 The n-tuple ⟨x_1, . . . , x_n⟩ = ⟨y_1, . . . , y_n⟩ as sets if and only if x_i = y_i for i = 1, . . . , n.
Proof The proof is left to the reader. (The only problem is if x_i = i.)
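Definitions 2.1.4 and 2.1.6 can be modelled directly with immutable sets. This is a minimal sketch, assuming Python integers as stand-ins for individuals; the names `pair` and `ntuple` are hypothetical.

```python
def pair(a, b) -> frozenset:
    """Ordered pair <a, b> coded as the set {{a}, {a, b}} (Definition 2.1.4)."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def ntuple(*xs):
    """Ordered n-tuple coded as {<1, x_1>, ..., <n, x_n>} (Definition 2.1.6).

    A 1-tuple is identified with its single entry, as in the text.
    """
    if not xs:
        raise ValueError("an n-tuple needs n >= 1 entries")
    if len(xs) == 1:
        return xs[0]
    return frozenset(pair(i, x) for i, x in enumerate(xs, start=1))


if __name__ == "__main__":
    print(pair(3, 3) == frozenset({frozenset({3})}))  # <3, 3> collapses to {{3}}
    print(ntuple(1, 2) == ntuple(2, 1))               # order matters
```

The collapse of ⟨3, 3⟩ to {{3}} is the same phenomenon that Lemma 2.1.7 flags when x_i = i: the pair ⟨i, x_i⟩ then has a single element, yet the coding still distinguishes tuples with different entries.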
2.2 Language for Superstructures
In this section we describe the construction of formal statements in a formal language L_X about a superstructure V(X). Given X, the language L_X for the superstructure V(X) over X has the following symbols:
Connectives: ¬, ∧, ∨, →, ←→
Quantifiers: ∀, ∃
Parentheses: [, ], (, ), ⟨, ⟩
Constant Symbols: At least one name for each entity in V(X).
Variable Symbols: A countable number of them will do.
Equality Symbol: =. Denotes equality for elements of X and set equality otherwise.
Set membership: ∈.
We will not have terms in our language.
Definition 2.2.1 A formula of L_X is built up inductively with the following rules:
(a) If x_1, . . . , x_n, x, and y are either constants or variables, then the following are formulas called atomic formulas: x ∈ y; x = y; ⟨x_1, . . . , x_n⟩ ∈ y; ⟨x_1, . . . , x_n⟩ = y; ⟨⟨x_1, . . . , x_n⟩, x⟩ ∈ y; ⟨⟨x_1, . . . , x_n⟩, x⟩ = y.
(b) If Φ and Ψ are formulas, so are ¬Φ, Φ ∧ Ψ, Φ ∨ Ψ, Φ → Ψ, and Φ ←→ Ψ.
(c) If x is a variable symbol and y is either a variable symbol or a constant symbol and Φ is a formula that does not already contain a formula of the form (∀x ∈ z)Ψ or (∃x ∈ z)Ψ, or (∀y ∈ z)Ψ or (∃y ∈ z)Ψ, then (∀x ∈ y)Φ and (∃x ∈ y)Φ are formulas.
Definition 2.2.2 A variable x is bound in a formula Φ if it occurs in Φ and every occurrence lies within a formula of the form (∀x ∈ z)Ψ or (∃x ∈ z)Ψ. Here, z may be a constant or a variable. A variable occurring in a formula Φ but not bound in Φ is called a free variable in Φ. A sentence in L_X is a formula in which all variables are bound.
2.3 Interpretation of the Language for Superstructures
In this section we give the rules for interpreting the formal language L_X. The rules are as follows:
(a) The atomic sentences a ∈ b, ⟨a_1, . . . , a_n⟩ ∈ b, ⟨⟨a_1, . . . , a_n⟩, c⟩ ∈ b are true or hold in V(X) if the entities corresponding to the names a, ⟨a_1, . . . , a_n⟩, or, respectively, ⟨⟨a_1, . . . , a_n⟩, c⟩ belong to the object named by b. The atomic sentences a = b, ⟨a_1, . . . , a_n⟩ = b, ⟨⟨a_1, . . . , a_n⟩, c⟩ = b are true or hold in V(X) if the entities corresponding to the names a, ⟨a_1, . . . , a_n⟩, or, respectively, ⟨⟨a_1, . . . , a_n⟩, c⟩ are identical to the object named by b.
(b) If Φ and Ψ are sentences, then
  (i) ¬Φ is true in V(X) if Φ is not true (does not hold) in V(X);
  (ii) Φ ∧ Ψ is true in V(X) if both Φ and Ψ are true in V(X);
  (iii) Φ ∨ Ψ is true in V(X) if either Φ or Ψ is true in V(X);
  (iv) Φ → Ψ is true if either Φ is not true or Ψ is true in V(X);
  (v) Φ ←→ Ψ is true if Φ and Ψ are either both true or both not true in V(X).
(c) Let Φ = Φ(x) be a formula in which x either does not occur and all variables are bound or x is the only free variable. Given a constant a, we will write Φ(a) for Φ with all occurrences of x replaced by a. Let b be a constant naming a set β ∈ V(X).
  (i) (∀x ∈ b)Φ is true in V(X) if for all entities α ∈ β, Φ(a) is true in V(X), where a is any constant naming α.
  (ii) (∃x ∈ b)Φ is true in V(X) if there is an entity α ∈ β such that Φ(a) is true in V(X), where a is any constant naming α.
Remark 2.3.1 Although we do not have terms in our formal language, we will use x_{n+1} = f(x_1, ..., x_n) as shorthand for ⟨⟨x_1, ..., x_n⟩, x_{n+1}⟩ ∈ f and y = f(x) as shorthand for ⟨x, y⟩ ∈ f.

Example 2.3.2 The sentence that says that every nonzero real number has a multiplicative inverse has the following form, using R to denote the set of real numbers and P to denote the product function:
(∀x ∈ R)[¬(x = 0) → (∃y ∈ R)[⟨⟨x, y⟩, 1⟩ ∈ P]].
With our shorthand, this sentence has the form
(∀x ∈ R)[¬(x = 0) → (∃y ∈ R)[P(x, y) = 1]].
Using S to denote the sum function, the distributive law has the form
(∀x ∈ R)(∀y ∈ R)(∀z ∈ R)(∀α ∈ R)(∀β ∈ R)(∀γ ∈ R)(∀δ ∈ R)[[[S(y, z) = α] ∧ [P(x, α) = β] ∧ [P(x, y) = γ] ∧ [P(x, z) = δ]] → [S(γ, δ) = β]].

Remark 2.3.3 Note that we are missing the composition of terms. We will usually use ordinary mathematical sentences employing terms. One should keep in mind, however, that these are shorthand for more complicated parts of sentences of L_X.

Example 2.3.4 To say that a function f defined on R is continuous at a, let R⁺ be the symbol used to denote the strictly positive real numbers. Let ρ denote the distance function, i.e., ρ(x, y) = |x − y|, and let I denote strict inequality, i.e., ⟨x, y⟩ ∈ I iff x < y. With f(a) = b, the sentence
(∀ε ∈ R⁺)(∃δ ∈ R⁺)(∀x ∈ R)(∀α ∈ R)(∀β ∈ R)(∀γ ∈ R)[[ρ(x, a) = α ∧ ⟨α, δ⟩ ∈ I ∧ f(x) = β ∧ f(a) = b ∧ ρ(β, b) = γ] → [⟨γ, ε⟩ ∈ I]]
is abbreviated with the sentence
(∀ε ∈ R⁺)(∃δ ∈ R⁺)(∀x ∈ R)[|x − a| < δ → |f(x) − f(a)| < ε].

Problem: Write sentences in the language for the real numbers expressing the commutative and associative laws for addition.
Answer: Let S denote the sum function. The commutative law for addition is expressed by
(∀x ∈ R)(∀y ∈ R)(∀α ∈ R)(∀β ∈ R)[[[S(x, y) = α] ∧ [S(y, x) = β]] → [α = β]].
The associative law for addition is expressed by
(∀x ∈ R)(∀y ∈ R)(∀z ∈ R)(∀α ∈ R)(∀β ∈ R)(∀γ ∈ R)(∀δ ∈ R)[[[S(y, z) = α] ∧ [S(x, α) = β] ∧ [S(x, y) = γ] ∧ [S(γ, z) = δ]] → [β = δ]].

Problem: Write a sentence in the language for the real numbers saying that for a given function f, lim_{x→a, x∈A} f(x) = L.
Answer: We use R⁺ to denote the strictly positive real numbers, ρ to denote the distance function, i.e., ρ(x, y) = |x − y|, and I to denote strict inequality, i.e., ⟨x, y⟩ ∈ I iff x < y. The sentence has the form
(∀ε ∈ R⁺)(∃δ ∈ R⁺)(∀x ∈ A)(∀α ∈ R)(∀β ∈ R)(∀γ ∈ R)[[¬[x = a] ∧ ρ(x, a) = α ∧ ⟨α, δ⟩ ∈ I ∧ f(x) = β ∧ ρ(β, L) = γ] → [⟨γ, ε⟩ ∈ I]].
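To see how the term-free form of such laws unpacks, one can check the commutative and associative sentences directly against the graph of an operation. The sketch below is a hedged illustration only: it replaces R by the integers modulo 4 so that the quantifiers range over a finite set, and the names graph, commutative, and associative are introduced here for the purpose. Each shorthand S(x, y) = α is read as the membership ⟨⟨x, y⟩, α⟩ ∈ S.

```python
from itertools import product


def graph(op, R):
    """Graph of a binary operation: all pairs ((x, y), op(x, y)) with x, y in R."""
    return {((x, y), op(x, y)) for x in R for y in R}


def commutative(S, R):
    """(forall x, y, alpha, beta)[[S(x,y)=alpha and S(y,x)=beta] -> alpha=beta]."""
    return all(alpha == beta
               for x, y, alpha, beta in product(R, repeat=4)
               if ((x, y), alpha) in S and ((y, x), beta) in S)


def associative(S, R):
    """The associative law in the relational form used in the text."""
    return all(beta == delta
               for x, y, z, alpha, beta, gamma, delta in product(R, repeat=7)
               if ((y, z), alpha) in S and ((x, alpha), beta) in S
               and ((x, y), gamma) in S and ((gamma, z), delta) in S)


R = range(4)                                  # toy stand-in for the reals
add = graph(lambda x, y: (x + y) % 4, R)
sub = graph(lambda x, y: (x - y) % 4, R)

print(commutative(add, R), associative(add, R))   # addition satisfies both laws
print(commutative(sub, R), associative(sub, R))   # subtraction satisfies neither
```

The auxiliary variables α, β, γ, δ play the role of the intermediate values of composed terms, which is exactly what the missing composition of terms forces in the formal language.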
2.4 Monomorphisms and the Transfer Principle

Just as in Chap. 1, where we worked with R and ∗R, we now will work back and forth between X and its nonstandard extension ∗X. We will in fact work with the superstructures V(X) and V(∗X) using a “∗-mapping”, which is a one-to-one mapping from V(X) into, but not onto, V(∗X). The abstraction of the basic properties of the mapping ∗ originates with the work of Robinson and Zakon in [10]. For the moment, we write Y for ∗X. We assume X and Y are two sets of individuals. We associate the superstructure V(X) and language L_X with X and the superstructure V(Y) and language L_Y with Y. Each constant symbol in L_X names something in V(X), and there is at least one such symbol for each entity in V(X). A similar statement is true for Y. We work with a one-to-one map ∗ from V(X) into V(Y). For each a ∈ V(X), we write ∗a for ∗(a). In this chapter, we will use the same symbol for ∗a and for its name. The main results and definitions in this section are Definitions 2.4.1 and 2.4.3 along with Remark 2.4.4, Theorem 2.4.5, and Example 2.4.8.

Definition 2.4.1 If Φ is a formula in L_X, the ∗-transform of Φ, ∗Φ, is the formula in L_Y obtained by replacing each constant c in Φ with ∗c.

Example 2.4.2 The ∗-transform of the sentence that says there is a multiplicative inverse in the real numbers is
(∀x ∈ ∗R)[¬(x = ∗0) → (∃y ∈ ∗R)[⟨⟨x, y⟩, ∗1⟩ ∈ ∗P]].

Definition 2.4.3 (Robinson–Zakon [10]) The injection ∗ from V(X) into V(Y) is called a monomorphism if the following conditions hold.
(i) ∗(∅) = ∅, where ∅ denotes the empty set.
(ii) If a ∈ X, then ∗a ∈ Y.
(iii) If a has rank n, ∗a has rank n.
(iv) If a ∈ ∗V_n(X) for n ≥ 1 and b ∈ a, then b ∈ ∗V_{n−1}(X).
(v) (Transfer Principle) For each sentence Φ in L_X, if Φ holds in V(X) then ∗Φ holds in V(Y).
Remark 2.4.4 It can be shown (the reader is invited to construct the proof) that Properties (i)–(iv) follow from the Transfer Principle, i.e., Property (v). We have listed them here as separate properties to help in understanding the notion of a monomorphism. We use them in the next section to help with the particular monomorphism constructed there. Property (iv) will be interpreted to say, among other things, that elements of “internal sets” are internal. Given Property (ii), we will assume from now on that ∗a = a for each individual a ∈ X; in particular, for n ∈ N, ∗n = n. That is, we will assume that X ⊆ Y. We will write a for both a ∈ X and ∗a ∈ Y. The Transfer Principle has the following consequence.

Theorem 2.4.5 (Downward Transfer Principle) For each sentence Φ in L_X, if ∗Φ holds in V(Y) then Φ holds in V(X).
Proof If ¬Φ holds in V(X), then ∗(¬Φ) = ¬(∗Φ) holds in V(Y).

We sometimes state the transfer principle as follows: Φ holds in V(X) if and only if ∗Φ holds in V(Y). Recall that the domain of a binary relation P is the set of all x for which there is a y with ⟨x, y⟩ ∈ P. The range of P is the set of all y for which there is an x with ⟨x, y⟩ ∈ P.

Proposition 2.4.6 We have the following properties of the ∗-mapping.
(a) Let a, b, a_1, ..., a_n be fixed entities in V(X). Then
(i) ∗{a_1, ..., a_n} = {∗a_1, ..., ∗a_n};
(ii) ∗⟨a_1, ..., a_n⟩ = ⟨∗a_1, ..., ∗a_n⟩;
(iii) a ∈ b iff ∗a ∈ ∗b;
(iv) a = b iff ∗a = ∗b;
(v) a ⊆ b iff ∗a ⊆ ∗b;
(vi) For n ∈ N, ∗(∪_{i=1}^n a_i) = ∪_{i=1}^n ∗a_i and ∗(∩_{i=1}^n a_i) = ∩_{i=1}^n ∗a_i;
(vii) ∗(a_1 × a_2 × ... × a_n) = ∗a_1 × ∗a_2 × ... × ∗a_n.
(b) If P is a relation on a_1 × a_2 × ... × a_n, then ∗P is a relation on ∗a_1 × ∗a_2 × ... × ∗a_n. Moreover, if n = 2, and a and b are the domain and range of P, then ∗a and ∗b are the domain and range of ∗P.
(c) If f is a function from a into b, then ∗f is a function from ∗a into ∗b, such that ∀c ∈ a, ∗[f(c)] = ∗f(∗c). If f maps a onto b, ∗f maps ∗a onto ∗b. If f is one-to-one (i.e., injective), so is ∗f.
Proof of Part a: (i) Let s = {a_1, ..., a_n}, and transform the sentence (∀x ∈ s)[x = a_1 ∨ ... ∨ x = a_n]. Also transform the sentences a_1 ∈ s, ..., a_n ∈ s.
(ii) ∗⟨a_1, ..., a_n⟩ = ∗{{{1}, {1, a_1}}, ..., {{n}, {n, a_n}}}
= {∗{{1}, {1, a_1}}, ..., ∗{{n}, {n, a_n}}}
= {{∗{1}, ∗{1, a_1}}, ..., {∗{n}, ∗{n, a_n}}}
= {{{1}, {1, ∗a_1}}, ..., {{n}, {n, ∗a_n}}}.
(iii) and (iv) are clear.
(v) Transform (∀x ∈ a)[x ∈ b].
(vi) Left to the reader.
(vii) We show ∗(a × b) = ∗a × ∗b. Transform (∀z ∈ (a × b))(∃x ∈ a)(∃y ∈ b)[⟨x, y⟩ = z] to show that ∗(a × b) ⊆ ∗a × ∗b. Transform (∀x ∈ a)(∀y ∈ b)(∃z ∈ (a × b))[⟨x, y⟩ = z] to show ∗a × ∗b ⊆ ∗(a × b).

Proof of Part b: To show ∗P is a relation on ∗a_1 × ∗a_2 × ... × ∗a_n, transform the sentence (∀x ∈ P)(∃x_1 ∈ a_1) ... (∃x_n ∈ a_n)[⟨x_1, ..., x_n⟩ = x]. Thus we know for n = 2 and a and b the domain and range of P, that ∗a contains the domain of ∗P and ∗b contains the range of ∗P. To show that ∗a and ∗b are the domain and range of ∗P, transform (∀x ∈ a)(∃y ∈ b)[⟨x, y⟩ ∈ P] and (∀y ∈ b)(∃x ∈ a)[⟨x, y⟩ ∈ P].

Proof of Part c: Let f be a function from a into b. By the transform of (∀x ∈ a)(∀y ∈ b)(∀z ∈ b)[[⟨x, y⟩ ∈ f ∧ ⟨x, z⟩ ∈ f] → [y = z]] and Part b, ∗f is a function from ∗a into ∗b. If ⟨c, d⟩ ∈ f, then ⟨∗c, ∗d⟩ ∈ ∗f, whence ∀c ∈ a, ∗[f(c)] = ∗f(∗c). The rest is left to the reader.
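The proof of Part (a)(ii) relies on the coding of an n-tuple ⟨a_1, ..., a_n⟩ as the set {{{1}, {1, a_1}}, ..., {{n}, {n, a_n}}}. The following Python sketch, with hypothetical helper names encode and decode, shows that the entries and their order can be recovered from this set. It assumes the entries are hashable Python values that are not booleans or floats equal to one of the indices.

```python
def encode(*items):
    """Code <a1, ..., an> as the set {{{1}, {1, a1}}, ..., {{n}, {n, an}}}."""
    return frozenset(
        frozenset({frozenset({k}), frozenset({k, a})})
        for k, a in enumerate(items, start=1)
    )


def decode(code):
    """Recover (a1, ..., an) from a set built by encode."""
    entries = {}
    for pair in code:
        index_set = min(pair, key=len)           # the singleton {k}
        (k,) = index_set
        rest = max(pair, key=len) - index_set    # {a_k}, empty when a_k == k
        entries[k] = next(iter(rest)) if rest else k
    return tuple(entries[k] for k in sorted(entries))


print(decode(encode(5, 7, 2)))
print(encode(3, 4) == encode(4, 3))
```

Because each entry is tagged by its position, reordering the entries changes the coded set, which is what makes the coded object an ordered tuple.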
When a ∈ X, we will, as already noted, associate a with ∗a. In general, however, as the next example shows, we will have to be more careful. For this and the next example, we assume ∗ is a monomorphism between V(R) and V(∗R).

Example 2.4.7 Let I denote the set of closed and bounded intervals in R. Then I ∈ V_2(R). Moreover, the following sentences are true for V(R):
(∀x ∈ I)(∃a, b ∈ R)(∀y ∈ R)[a ≤ y ≤ b ←→ y ∈ x]
(∀a, b ∈ R)[a ≤ b → (∃x ∈ I)(∀y ∈ R)[a ≤ y ≤ b ←→ y ∈ x]]
Therefore, ∗I contains the extensions of standard intervals as well as new ones. For example, if ε ≃ 0 is positive, then [ε, 2ε] is in ∗I but it is not the extension of any standard interval. Even for a standard interval [a, b] with a ≠ b, there are points in ∗[a, b] not in the original interval [a, b]. Thus we cannot directly associate [a, b] and ∗[a, b].

Example 2.4.8 The Downward Transfer Principle can often be used to avoid arguments by contradiction. For example, to show that s_n → L if ∀η ∈ ∗N_∞, ∗s_η ≃ L, we note that for a given ε > 0, the ∗-transform of the sentence
Φ = (∃k ∈ N)(∀n ∈ N)[n ≥ k → |s_n − L| < ε]
is true in V(∗R); just let k ∈ ∗N_∞. It follows that Φ holds in V(R).

Problem: Assume ∗ is a monomorphism between V(R) and V(∗R), and use downward transfer to show that f is uniformly continuous on A if ∀x ∈ ∗A, ∀y ∈ ∗A with x ≃ y, we have ∗f(x) ≃ ∗f(y).
Answer: Fix ε > 0. We want to show the truth of the sentence
Φ = (∃δ ∈ R⁺)(∀x ∈ A)(∀y ∈ A)[|x − y| < δ → |f(x) − f(y)| < ε].
Now ∗Φ is true for V(∗R); just take δ ≃ 0 with δ > 0. Therefore, Φ is true for V(R).
2.5 Ultrapower Construction of Superstructures and Monomorphisms

We now fix V(X), an index set I, and an ultrafilter 𝒰 in I. From these we will construct a new superstructure V(∗X) from V(X) and a corresponding monomorphism. We may have I = N; we may even have 𝒰 fixed at some i_0 ∈ I, i.e., U ∈ 𝒰 if and only if i_0 ∈ U. If 𝒰 is fixed in this way, we will see that we get nothing new. In the next section we will see that by assuming additional properties for I and 𝒰 we obtain a monomorphism with the desired properties for a nonstandard extension. Because an ultrafilter corresponds to a finitely additive measure on the power set of I, with that measure taking only the values 0 and 1, we say a property holds almost everywhere or a.e. if it holds on some U ∈ 𝒰. Given S ∈ V(X), we write ∏S for the set of all maps from I into S. For each such map a, we write a_i for a(i). Two maps a and b are equivalent (with respect to 𝒰), and we write a =_𝒰 b, if a_i = b_i a.e. (that is, a_i = b_i for all i in some U ∈ 𝒰). The relation =_𝒰 is an equivalence relation. We write ∏_𝒰 S for the set of equivalence classes, and [a] for the equivalence class containing the mapping a. If b ∈ V(X), we write b̄ for the constant function b̄_i = b. Let V_{−1}(X) denote the empty set. The reader should at least note the following two definitions and related remark.

Definition 2.5.1 The bounded ultrapower ∏⁰_𝒰 V(X) of V(X) is defined by setting
∏⁰_𝒰 V(X) := ∪_{n=0}^∞ ∏_𝒰 [V_n(X) \ V_{n−1}(X)].
We denote by e the mapping from V(X) into ∏⁰_𝒰 V(X) defined at each b ∈ V(X) by e(b) = [b̄]. When [a], [b] ∈ ∏⁰_𝒰 V(X), we write [a] ∈_𝒰 [b] if a_i ∈ b_i a.e.

Remark 2.5.2 Recall that 5̄ is the constant function 5 on I, and [5̄] is the corresponding equivalence class. While it is true that [5̄] ∈_𝒰 e(N), the relation ∈_𝒰 is not the set membership relation we want. We need another map “M” so that we can replace e(N) in ∏⁰_𝒰 V(X) with the set of numbers
M(e(N)) = {[r] ∈ ∏_𝒰 X : r_i ∈ N a.e.} = {[r] ∈ ∏_𝒰 X : [r] ∈_𝒰 [N̄]}.
Similarly, e(𝒫(N)) is the equivalence class formed from the constant sequence 𝒫(N), 𝒫(N), ..., 𝒫(N), .... This, however, is not a collection of sets of numbers. While it is true that [N̄] ∈_𝒰 e(𝒫(N)), this is not the set membership we want. Once we have used the map M at level 1 to get true sets of numbers, we will then use M to replace e(𝒫(N)) with a true collection of sets of numbers, M(e(𝒫(N))). In particular, the set ∗N = M(e(N)), which we will think of as the set of nonstandard natural numbers, will be a member of the collection of sets M(e(𝒫(N))) in the usual sense.

Definition 2.5.3 Let ∗X = ∏_𝒰 X = ∏_𝒰 V_0(X), and let V(∗X) be the associated superstructure built on the set of individuals ∗X. The Mostowski Collapsing Function M is a mapping of ∏⁰_𝒰 V(X) into V(∗X) defined by induction on the level n as follows:
(i) For each element [a] ∈ ∏_𝒰 V_0(X) = ∗X, M([a]) = [a].
(ii) Given [b] ∈ ∏_𝒰 [V_n(X) \ V_{n−1}(X)], n ≥ 1, we set
M([b]) = {M([a]) : [a] ∈ ∪_{k=0}^{n−1} ∏_𝒰 [V_k(X) \ V_{k−1}(X)] and [a] ∈_𝒰 [b]}.
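The a.e. relations used in this section can be tried out on a finite index set. The sketch below is a toy under loud assumptions: on a finite index set every ultrafilter is fixed at some index, so the example only shows the degenerate case in which nothing new is obtained, and the names principal_ultrafilter, almost_everywhere, equal_U, and member_U are introduced here for illustration. Since an ultrafilter is closed under supersets, a property holds on some member of the ultrafilter exactly when the set of indices where it holds is itself a member, and that is what almost_everywhere tests.

```python
from itertools import combinations


def principal_ultrafilter(index_set, i0):
    """All subsets of the finite index_set that contain i0."""
    items = sorted(index_set)
    subsets = (frozenset(c) for r in range(len(items) + 1)
               for c in combinations(items, r))
    return {s for s in subsets if i0 in s}


def almost_everywhere(U, index_set, pred):
    """True if the set of indices where pred holds belongs to U."""
    return frozenset(i for i in index_set if pred(i)) in U


def equal_U(U, index_set, a, b):
    """a and b agree almost everywhere."""
    return almost_everywhere(U, index_set, lambda i: a[i] == b[i])


def member_U(U, index_set, a, b):
    """a_i is an element of b_i almost everywhere."""
    return almost_everywhere(U, index_set, lambda i: a[i] in b[i])


I = range(3)
U = principal_ultrafilter(I, i0=1)

a = (7, 5, 9)                        # a map from I into the individuals
const_5 = (5, 5, 5)                  # the constant map with value 5
evens = frozenset({0, 2, 4})
b = (evens, frozenset({5, 6}), evens)

print(equal_U(U, I, a, const_5))     # a agrees with the constant map at index 1
print(member_U(U, I, a, b))          # a_1 = 5 lies in b_1 = {5, 6}
print(member_U(U, I, const_5, (evens, evens, evens)))
```

With the ultrafilter fixed at index 1, the map a is equivalent to a constant map, so its class is the image under e of an element of V(X). A free ultrafilter, which requires an infinite index set, is what produces new classes.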
We finish this section by showing that the map ∗ = M ◦ e : V(X) → V(∗X) is a monomorphism. On first reading, the reader may wish to skip the following technical proofs and go on to the next section.

Proposition 2.5.4 We have the following properties for ∗, e, and M:
(i) e and M are 1:1 maps, so ∗ = M ◦ e is a 1:1 map of V(X) into V(∗X).
(ii) e maps X into ∗X, and the restriction M|∗X is the identity map (by definition).
(iii) e(X) = [X̄] (by definition), and M([X̄]) = ∗X, so ∗(X) = ∗X.
(iv) If ∅ is the empty set, then for no [a] is it true that [a] ∈_𝒰 e(∅) = [∅̄], so ∗(∅) = M(e(∅)) = ∅.
(v) a ∈ b in V(X) iff e(a) ∈_𝒰 e(b); [a] ∈_𝒰 [b] iff M([a]) ∈ M([b]).
(vi) For n ≥ 1, e maps V_n(X) \ V_{n−1}(X) into ∏_𝒰 [V_n(X) \ V_{n−1}(X)] and M maps ∏_𝒰 [V_n(X) \ V_{n−1}(X)] into V_n(∗X) \ V_{n−1}(∗X). It follows that if a has rank n ≥ 0, ∗(a) has rank n.
(vii) If M([b]) ∈ ∗(V_n(X)) and M([a]) ∈ M([b]), then M([a]) ∈ ∗(V_{n−1}(X)).

Proof (i) If a ≠ b, then ∀i ∈ I, ā_i ≠ b̄_i, so e(a) = [ā] ≠ [b̄] = e(b). Therefore, e is 1:1. To show M is 1:1, assume [a] ≠ [b]. Since M is the identity map on ∗X, if either [a] or [b] is in ∗X, M([a]) ≠ M([b]). Otherwise, ∃U ∈ 𝒰 such that either a_i \ b_i ≠ ∅ ∀i ∈ U or b_i \ a_i ≠ ∅ ∀i ∈ U; assume the latter. Then ∃[c] such that c_i ∈ b_i \ a_i a.e. Therefore, M([c]) ∈ M([b]), but M([c]) ∉ M([a]), so M([a]) ≠ M([b]). Thus M is 1:1.
(ii), (iii), (iv), and (v) are clear.
(vi) By definition, if [a] ∈ ∏_𝒰 V_0(X) = ∗X, then M([a]) = [a] ∈ V_0(∗X) = ∗X. Fix n ≥ 1, and assume that for each k < n, if [b] ∈ ∏_𝒰 [V_k(X) \ V_{k−1}(X)], then M([b]) ∈ V_k(∗X) \ V_{k−1}(∗X). Given [b] ∈ ∏_𝒰 [V_n(X) \ V_{n−1}(X)] and M([a]) ∈ M([b]), we have [a] ∈ ∪_{k=0}^{n−1} ∏_𝒰 [V_k(X) \ V_{k−1}(X)], so by assumption, M([a]) ∈ V_{n−1}(∗X). Moreover, ∃[c] such that ∀i ∈ I, c_i ∈ V_{n−1}(X) \ V_{n−2}(X) and c_i ∈ b_i. Therefore, M([c]) ∈ V_{n−1}(∗X) \ V_{n−2}(∗X), and thus, M([b]) ∈ V_n(∗X) \ V_{n−1}(∗X).
(vii) Assume M([b]) ∈ ∗(V_n(X)) and M([a]) ∈ M([b]). Then [b] ∈_𝒰 e(V_n(X)) and [a] ∈_𝒰 [b]. Therefore, there is a set U ∈ 𝒰 such that ∀i ∈ U, a_i ∈ b_i ∈ V_n(X). For these same i, a_i ∈ V_{n−1}(X), whence [a] ∈_𝒰 e(V_{n−1}(X)), so M([a]) ∈ ∗(V_{n−1}(X)).

Now, except for the transfer principle, we have shown that ∗ = M ◦ e is a monomorphism from V(X) into V(∗X). We will write ∗a for ∗(a). The next proposition is used to establish the transfer principle.

Proposition 2.5.5 Let [a], [a^1], ..., [a^n], [b], and [c] be elements of the bounded ultrapower ∏⁰_𝒰 V(X); here, any or all of these may be of the form e(d) = [d̄].
(i) M([a]) (= or ∈) M([c]) iff a_i (= or ∈) c_i a.e.
(ii) {M([a^1]), ..., M([a^n])} (= or ∈) M([c]) iff {a^1_i, ..., a^n_i} (= or ∈) c_i a.e.
(iii) ⟨M([a^1]), ..., M([a^n])⟩ (= or ∈) M([c]) iff ⟨a^1_i, ..., a^n_i⟩ (= or ∈) c_i a.e.
(iv) ⟨⟨M([a^1]), ..., M([a^n])⟩, M([b])⟩ (= or ∈) M([c]) iff ⟨⟨a^1_i, ..., a^n_i⟩, b_i⟩ (= or ∈) c_i a.e.
Proof We will assume that n = 2 for the proof of (ii) and (iii).
(i) (for =) Since M is 1:1, M([a]) = M([c]) iff [a] = [c] iff a_i = c_i a.e.
(i) (for ∈) M([a]) ∈ M([c]) iff [a] ∈_𝒰 [c] iff a_i ∈ c_i a.e.
(ii) (for =) Assume {a_i, b_i} = c_i a.e. Then
M([c]) = {M([y]) : y_i ∈ {a_i, b_i} a.e.} = {M([y]) : y_i = a_i a.e.} ∪ {M([y]) : y_i = b_i a.e.} = {M([a]), M([b])}.
Conversely, if M([c]) = {M([a]), M([b])}, then a_i ∈ c_i a.e. and b_i ∈ c_i a.e. If d_i ∈ c_i a.e., then either d_i = a_i a.e. or d_i = b_i a.e.; that is, c_i has only two points a.e. Therefore, c_i = {a_i, b_i} a.e.
(ii) (for ∈) Choose representatives for [a] and [b], and let d_i = {a_i, b_i} ∀i ∈ I. Then M([d]) = {M([a]), M([b])}. Therefore, {a_i, b_i} ∈ c_i a.e. iff d_i ∈ c_i a.e. iff M([d]) ∈ M([c]) iff {M([a]), M([b])} ∈ M([c]).
(iii) (for =) Choose representatives for [a] and [b], and ∀i ∈ I, let d_i = {{1}, {1, a_i}} and e_i = {{2}, {2, b_i}}. By Part (ii), M([d]) = {{M([1̄])}, {M([1̄]), M([a])}} = {{1}, {1, M([a])}} since M([1̄]) = 1. A similar result holds for M([e]). Again by Part (ii), ⟨a_i, b_i⟩ = c_i a.e. iff {d_i, e_i} = c_i a.e. iff {M([d]), M([e])} = M([c]) iff ⟨M([a]), M([b])⟩ = M([c]).
(iii) (for ∈) This is the same as (ii) (for ∈) with { } replaced with ⟨ ⟩.
(iv) Apply (iii) to ⟨d_i, b_i⟩ and ⟨M([d]), M([b])⟩, where d_i = ⟨a^1_i, ..., a^n_i⟩ a.e., so M([d]) = ⟨M([a^1]), ..., M([a^n])⟩.
Notation. If Φ(x_1, ..., x_n) is a formula with variables x_1, ..., x_n, either free in Φ or not appearing in Φ, and c_1, ..., c_n are constants, then Φ(c_1, ..., c_n) is Φ with each x_i appearing in Φ replaced by c_i. To establish the transfer principle for ∗, we need a theorem of Łoś (pronounced “Wash”).

Theorem 2.5.6 (Łoś) Let Φ(x_1, ..., x_n) be a formula in the language L_X with x_1, ..., x_n a set of variables containing all of the free variables in Φ. Fix [a^1], ..., [a^n] in ∏⁰_𝒰 V(X). Then ∗Φ(M([a^1]), ..., M([a^n])) holds in V(∗X) iff Φ(a^1_i, ..., a^n_i) holds in V(X) for almost all i ∈ I.

Remark 2.5.7 Note that the previous proposition has established the result for atomic formulas. For example, if Φ(x, y) is the formula x ∈ y, then M([a]) ∈ M([c]) iff a_i ∈ c_i a.e. If Φ(x) is the formula x ∈ N, then M([a]) ∈ ∗N = M([N̄]) iff a_i ∈ N a.e. If Φ(x_1, ..., x_n, y) is the formula {x_1, ..., x_n, 5, A} ∈ y, where A is a standard set, then we have {M([a^1]), ..., M([a^n]), M([5̄]), M([Ā])} ∈ M([c]) iff {a^1_i, ..., a^n_i, 5, A} ∈ c_i a.e.

Proof of Łoś’ Theorem. The proof is by an induction argument similar to the induction used in the construction of formulas. Remember, we are establishing an equivalence; i.e., one sentence holds for V(∗X) iff related sentences indexed by i ∈ I hold a.e. for V(X).
(1) The previous proposition has established the equivalence for atomic formulas.
(2) Assume the equivalence has been established for the formulas Φ(x_1, ..., x_n) and Ψ(x_1, ..., x_n). Given this assumption, we establish the equivalence for ¬Φ and Φ ∧ Ψ. This will also establish the equivalence for Φ ∨ Ψ = ¬(¬Φ ∧ ¬Ψ), Φ → Ψ = ¬Φ ∨ Ψ, and Φ ←→ Ψ = [Φ → Ψ] ∧ [Ψ → Φ]. For ¬Φ, we have
∗(¬Φ)(M([a^1]), ..., M([a^n])) = ¬(∗Φ)(M([a^1]), ..., M([a^n])).
The latter is true in V(∗X) iff {i ∈ I : Φ(a^1_i, ..., a^n_i) holds} ∉ 𝒰 iff {i ∈ I : ¬Φ(a^1_i, ..., a^n_i) holds} ∈ 𝒰 since 𝒰 is an ultrafilter. (Note we need an equivalence here, not just the implication Φ a.e. → ∗Φ.) For Φ ∧ Ψ, note that the following are equivalent:
∗(Φ ∧ Ψ)(M([a^1]), ..., M([a^n]))
∗Φ(M([a^1]), ..., M([a^n])) ∧ ∗Ψ(M([a^1]), ..., M([a^n]))
{i : Φ(a^1_i, ..., a^n_i) holds} ∈ 𝒰 and {i : Ψ(a^1_i, ..., a^n_i) holds} ∈ 𝒰
{i : Φ(a^1_i, ..., a^n_i) holds} ∩ {i : Ψ(a^1_i, ..., a^n_i) holds} ∈ 𝒰
{i : (Φ ∧ Ψ)(a^1_i, ..., a^n_i) holds} ∈ 𝒰.
We have used the fact that a superset of a set in 𝒰 is in 𝒰.
(3) Assume we have the equivalence for Φ(x_1, ..., x_n, z, y), and d is a constant and z a variable that either does not appear in Φ or is free in Φ. We will establish the result for (∃y ∈ d)Φ and (∃y ∈ z)Φ. This will give the result since (∀y ∈ d)Φ = ¬(∃y ∈ d)¬Φ and (∀y ∈ z)Φ = ¬(∃y ∈ z)¬Φ. For (∃y ∈ z)Φ, we must fix [a^1], ..., [a^n], [c] ∈ ∏⁰_𝒰 V(X) and then show that
(∃y ∈ M([c])) ∗Φ(M([a^1]), ..., M([a^n]), M([c]), y)
holds in V(∗X) if and only if (∃y ∈ c_i)Φ(a^1_i, ..., a^n_i, c_i, y) holds in V(X) for almost all i ∈ I. (The proof for (∃y ∈ d)Φ is a special case of this where we replace M([c]) with M([d̄]) and c_i with d; i.e., the constant sequence d̄ replaces c.) Assume that
(∃y ∈ M([c])) ∗Φ(M([a^1]), ..., M([a^n]), M([c]), y)
holds in V(∗X). Then there is an M([a]) such that
[M([a]) ∈ M([c])] ∧ [∗Φ(M([a^1]), ..., M([a^n]), M([c]), M([a]))]
holds in V(∗X), so
{i ∈ I : (a_i ∈ c_i) ∧ Φ(a^1_i, ..., a^n_i, c_i, a_i) holds} ∈ 𝒰.
Therefore, the larger set
{i ∈ I : (∃y ∈ c_i)Φ(a^1_i, ..., a^n_i, c_i, y) holds} ∈ 𝒰.
To show the converse, assume there is a set U_0 ∈ 𝒰 such that U_0 = {i ∈ I : (∃y ∈ c_i)Φ(a^1_i, ..., a^n_i, c_i, y)}. For each i ∈ I, pick an element a_i ∈ c_i, but make the choice so that for all i ∈ U_0, Φ(a^1_i, ..., a^n_i, c_i, a_i) holds. There is a U_1 ∈ 𝒰 such that for some m ∈ N and all i ∈ U_1, a_i ∈ V_m(X) \ V_{m−1}(X). Choose any α ∈ V_m(X) \ V_{m−1}(X) and replace a_i with α for i ∉ U_1. Then for this sequence a = ⟨a_i : i ∈ I⟩,
{i ∈ I : (a_i ∈ c_i) ∧ Φ(a^1_i, ..., a^n_i, c_i, a_i) holds} ⊇ U_0 ∩ U_1 ∈ 𝒰.
Therefore,
[M([a]) ∈ M([c])] ∧ [∗Φ(M([a^1]), ..., M([a^n]), M([c]), M([a]))]
holds in V(∗X). It follows that
(∃y ∈ M([c])) ∗Φ(M([a^1]), ..., M([a^n]), M([c]), y)
holds in V(∗X).
(4) The theorem now follows by induction.

Theorem 2.5.8 The map ∗ : V(X) → V(∗X) defined by ∗ = M ◦ e is a monomorphism.
Proof It only remains to show that if Φ is a sentence in L_X that is true for V(X), then ∗Φ is true for V(∗X). Since Φ is a sentence, Φ has no free variables, only constants and bound variables. Since Φ is true for all i ∈ I, ∗Φ is true for V(∗X).
Problem: Recall that given [a], [b] ∈ ∏⁰_𝒰 V(X), we write [a] ∈_𝒰 [b] if a_i ∈ b_i a.e. Show that this relation is well-defined, that is, that it is independent of the choice of representative from [a] and [b].
Answer: Assume a_i and a′_i represent [a], and b_i and b′_i represent [b]. Also assume there is a set U ∈ 𝒰 such that ∀i ∈ U, a_i ∈ b_i. We know that there are sets V and W in 𝒰 such that for all i ∈ V we have a_i = a′_i, while for all i ∈ W we have b_i = b′_i. Now U ∩ V ∩ W is in 𝒰, and for all i in this set, a′_i ∈ b′_i.

Problem: Given a mapping a from I into V_n(X), show that for some k ≤ n, a_i ∈ V_k(X) \ V_{k−1}(X) a.e.
Answer: For each k ≤ n, let I_k = {i ∈ I : a_i ∈ V_k(X) \ V_{k−1}(X)}. Then I is the disjoint finite union of the sets I_k, 0 ≤ k ≤ n, so one and only one of the sets I_k is in 𝒰.
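The last answer uses the fact that an ultrafilter selects exactly one block of any finite partition of the index set. The following Python sketch checks this by brute force. It is a toy under stated assumptions: the index set is finite, so the ultrafilter is necessarily fixed at an index, and the names powerset, is_ultrafilter, and rank_blocks as well as the list of ranks are made up for illustration.

```python
from itertools import combinations


def powerset(index_set):
    """All subsets of a finite index set, as frozensets."""
    items = sorted(index_set)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_ultrafilter(U, index_set):
    """Brute-force check of the ultrafilter axioms on a finite index set."""
    full = frozenset(index_set)
    subsets = powerset(index_set)
    proper = full in U and frozenset() not in U
    meets = all((A & B) in U for A in U for B in U)
    upward = all(B in U for A in U for B in subsets if A <= B)
    decides = all((A in U) != ((full - A) in U) for A in subsets)
    return proper and meets and upward and decides


def rank_blocks(ranks):
    """I_k = {i : ranks[i] == k} for each rank k that occurs."""
    blocks = {}
    for i, k in enumerate(ranks):
        blocks.setdefault(k, set()).add(i)
    return {k: frozenset(s) for k, s in blocks.items()}


I = range(6)
U = {A for A in powerset(I) if 4 in A}     # the ultrafilter fixed at index 4
ranks = [0, 1, 1, 2, 0, 2]                 # hypothetical rank of a_i for each i
blocks = rank_blocks(ranks)
winners = [k for k, block in blocks.items() if block in U]

print(is_ultrafilter(U, I), winners)
```

Exactly one rank k has its block I_k in the ultrafilter, here the block containing index 4. The same dichotomy, that a set or its complement belongs to the ultrafilter but not both, is what drives the negation step in the proof of the theorem of Łoś.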
2.6 Special Index Sets Yielding Enlargements

We have shown how to construct a monomorphism ∗ using a superstructure V(X), an index set I, and an ultrafilter 𝒰. To get more, however, we need additional assumptions. The construction of an enlargement starting with the material after Example 2.6.5 and ending with Theorem 2.6.7 can be omitted on first reading. First we note that if 𝒰 is not free, then 𝒰 is fixed at some i_0 ∈ I; i.e., the singleton set {i_0} ∈ 𝒰. In this case, every sequence a is equivalent to the constant sequence with value a_{i_0}, since a_i = a_{i_0} a.e. Therefore, ∗X = X and we get nothing new. We assume from now on that the ultrafilter 𝒰 is free; i.e., for every i ∈ I, the set {i} ∉ 𝒰.

Definition 2.6.1 All entities in V(X) and entities in V(∗X) of the form ∗b for b ∈ V(X) are called standard.

Example 2.6.2 The sets [0, 1] and ∗[0, 1] are standard entities even though ∗[0, 1] contains nonstandard numbers.

Note that in interpreting a sentence ∗Φ, the only entities α that arise are related by a finite ∈ chain to a standard entity ∗b. Such an entity α is of the form M([a]). There are, however, entities in V(∗X) that are not of this form. An example is the set N. We will show that if A_i is a sequence of sets such that for each j ∈ N, the equivalence class containing the constant sequence j̄ is in A_i for all i in some element of the ultrafilter, then the equivalence class [A_i] contains unlimited natural numbers. We now consider “hyperfinite” sets; these are extremely important for the applications of nonstandard analysis.

Definition 2.6.3 For each A ∈ V(X) \ X, let 𝒫_F(A) denote the finite subsets of A. The collection of hyperfinite or ∗-finite sets in V(∗X) is
∪_{n=0}^∞ ∗𝒫_F(V_n(X)) = ∪{∗𝒫_F(A) : A ∈ V(X) \ X}.
Definition 2.6.4 Given a superstructure V(X) and a monomorphism ∗, we say that V(∗X) is an enlargement of V(X) if for each set A ∈ V(X) there is a set B ∈ ∗𝒫_F(A) such that for every a ∈ A, ∗a ∈ B, i.e., B contains all of the standard entities in ∗A. It follows from the transfer principle that if A is not a finite set, then there are elements of ∗A that are not in B.

Example 2.6.5 For η ∈ ∗N_∞, the “initial segment” {1, 2, ..., η} is a hyperfinite set; it contains N.

Fix V(X). We now construct an index set J and an ultrafilter 𝒰 on J such that the corresponding superstructure V(∗X) is an enlargement of V(X). Since X is an infinite set (containing N), it will follow that ∗X ≠ X. Let J be the collection of all nonempty finite sets belonging to V(X). Note that each element of J is in V_n(X) for some n. For each a ∈ J, let J_a = {b ∈ J : a ⊆ b}. Let ℱ be the collection of all subsets of J such that for each A ∈ ℱ there is an a ∈ J with J_a ⊆ A.

Proposition 2.6.6 The collection ℱ is a free filter on J.
Proof For each A ∈ ℱ there is an a ∈ J with a ∈ J_a ⊆ A, so A ≠ ∅. Given A and B in ℱ and a, b ∈ J with J_a ⊆ A and J_b ⊆ B, J_a ∩ J_b = J_{a∪b} ⊆ A ∩ B, and any subset of J containing a set in ℱ is in ℱ. Therefore, ℱ is a filter. To show ℱ is free, fix a ∈ J and find some b ∈ J with a ∩ b = ∅. Then J_b itself is in ℱ, and the set a is not in J_b.

Now use Zorn’s Lemma to obtain a free ultrafilter 𝒰 on J such that 𝒰 ⊃ ℱ. Note that for each a ∈ J, J_a ∈ 𝒰.

Theorem 2.6.7 If V(∗X) is constructed from V(X) using J and 𝒰, then it is an enlargement of V(X).
Proof Let A be a nonempty set belonging to V(X). Choose an a_0 ∈ A. Define a map Γ : J → 𝒫_F(A) by setting Γ_a = (a ∩ A) ∪ {a_0} for each a ∈ J. Since A ∈ V_m(X) for some m ∈ N, there is an n ∈ N and a set U_0 ∈ 𝒰 such that Γ_a has rank n for all a ∈ U_0. Choose any a_1 ∈ U_0, and replace Γ_a with Γ_{a_1} for a ∉ U_0. Let B = M([Γ]). Then B ∈ ∗𝒫_F(A) since ∀a ∈ J, Γ_a ∈ 𝒫_F(A). Fix c ∈ A. We must show that ∗c ∈ B. Since the singleton {c} is a finite subset of A, J_{{c}} ∈ 𝒰, so c ∈ Γ_a for all a ∈ J_{{c}} ∩ U_0, that is, a.e. Therefore, ∗c ∈ B.

Definition 2.6.8 A binary relation P is concurrent or finitely satisfiable on a set A contained in its domain if for each n ∈ N and each finite set {x_1, ..., x_n} ⊆ A there is a y in the range of P such that ⟨x_i, y⟩ ∈ P for 1 ≤ i ≤ n. A relation P is concurrent if it is concurrent on all of its domain.

Example 2.6.9 The relations ≤ in N and ⊆ in 𝒫_F(N) are concurrent relations.

Concurrent relations are of interest because of the following property.

Theorem 2.6.10 Given X and a monomorphism ∗, the following are equivalent:
(i) V(∗X) is an enlargement of V(X).
(ii) Given any concurrent relation P ∈ V(X), there is an element c in the range of ∗P such that ⟨∗a, c⟩ ∈ ∗P for every a in the domain of P.
Proof (i→ii) Let A be the domain of P, and let B ⊆ ∗A be a hyperfinite set containing ∗a for each a ∈ A. By transfer of the concurrency condition for P, there is a c in the range of ∗P such that ⟨b, c⟩ ∈ ∗P for each b ∈ B; in particular, ⟨∗a, c⟩ ∈ ∗P for each a ∈ A.
(ii→i) Fix a set A ∈ V(X). Since ⊆ in 𝒫_F(A) is a concurrent relation, there is a hyperfinite set B that contains the extension of every standard finite subset of A. In particular, ∀a ∈ A, ∗{a} = {∗a} ⊂ B, so ∗a ∈ B.
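The combinatorial facts behind Proposition 2.6.6 and Theorem 2.6.7 can be checked on a toy index set. In the sketch below, J is the collection of nonempty subsets of a four-element universe rather than all nonempty finite sets in V(X), so the filter generated by the sets J_a is not free here; only the identity J_a ∩ J_b = J_{a∪b} and the key step of the enlargement proof are illustrated. The names nonempty_subsets, cone, and gamma are hypothetical, with gamma standing for the map a ↦ (a ∩ A) ∪ {a_0}.

```python
from itertools import combinations


def nonempty_subsets(universe):
    """All nonempty subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def cone(J, a):
    """J_a = {b in J : a is a subset of b}."""
    return {b for b in J if a <= b}


def gamma(a, A, a0):
    """The finite subset (a intersect A) together with a0, as in Theorem 2.6.7."""
    return (a & A) | {a0}


J = nonempty_subsets(range(4))       # toy replacement for the index set J
A = frozenset({1, 2, 3})
a0 = 1

# J_a and J_b meet in J_{a union b}, so the sets J_a form a filter base.
base_ok = all(cone(J, a) & cone(J, b) == cone(J, a | b) for a in J for b in J)

# For c in A, c lies in gamma(a) for every index a in J_{{c}}.
catches = all(c in gamma(a, A, a0)
              for c in A for a in cone(J, frozenset({c})))

print(base_ok, catches)
```

In the actual construction, J_{{c}} belongs to the ultrafilter, so the second check is the statement that c belongs to the chosen finite set almost everywhere.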
Corollary 2.6.11 If A is an infinite set in V(X) and V(∗X) is an enlargement, then there is a nonstandard b ∈ ∗A.
Proof The relation ≠ is concurrent in A.

Remark 2.6.12 The results of the last chapter are valid for an enlargement of a superstructure V(X) when X contains R.

Problem: Assume {O_α : α ∈ A} is an open covering of S ⊆ R with no finite subcovering. Show that there is a point b ∈ ∗S such that b is not in the monad of any x ∈ S.
Answer: The relation P such that ⟨O_α, y⟩ ∈ P if y ∈ S \ O_α is concurrent, so ∃b ∈ ∗S such that b ∈ ∗S \ ∗O_α ∀α ∈ A. If x ∈ S, then x ∈ O_α for some α, and so m(x) ⊆ ∗O_α. It follows that b ∉ m(x) for any x ∈ S.

Problem: Let A be a collection of sets with A ∈ V(X). Assume A has the finite intersection property. That is, the intersection of any finite number of elements of A is not empty. Suppose that V(∗X) is an enlargement. Show that the intersection monad μ(A) := ∩_{a∈A} ∗a is not empty.
Answer: Take a hyperfinite B ⊆ ∗A with ∗a ∈ B for each a ∈ A. Then ∃y ∈ ∩_{b∈B} b ⊆ ∩_{a∈A} ∗a.
2.7 A Result in Infinite Graph Theory

As an application of the existence of an enlargement, we give an easy proof of a result of de Bruijn and Erdős [3].

Definition 2.7.1 A graph (A, E) consists of a set A of “vertices” and a binary, symmetric relation E on A × A. If ⟨x, y⟩ ∈ E, we say that x and y are connected by an edge. The graph (A, E) is k-colorable if there is a map f : A → {1, ..., k} (the set of k colors) such that if ⟨a, b⟩ ∈ E, then f(a) ≠ f(b). If B ⊆ A and E|B is the restriction of the relation E to B × B, then (B, E|B) is called a subgraph of (A, E). The cardinality of a graph (A, E) is that of the set A.

Theorem 2.7.2 (de Bruijn and Erdős) If each finite subgraph of an infinite graph is k-colorable, then the graph itself is k-colorable.
Proof Let (A, E) be the infinite graph, and let ℱ denote the set of all finite subsets of A. The following sentence, given here in shorthand, holds for the original superstructure:
(∀F ∈ ℱ)(∃g : F → {1, ..., k})(∀x, y ∈ F)[⟨x, y⟩ ∈ E → g(x) ≠ g(y)].
Let B be a hyperfinite subset of ∗A such that ∀a ∈ A, ∗a ∈ B. Then B ∈ ∗ℱ, so ∃g : B → {1, ..., k} such that ∀x, y ∈ B, if ⟨x, y⟩ ∈ ∗E, then g(x) ≠ g(y). In particular, ∀a, b ∈ A, if ⟨a, b⟩ ∈ E, then ⟨∗a, ∗b⟩ ∈ ∗E, so g(∗a) ≠ g(∗b). Let f denote the restriction of g to A; that is, f(a) = g(∗a). Then f is a k-coloring of A.
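The hypothesis of the theorem concerns finite subgraphs only, and for a given finite subgraph it can be tested by exhaustive search. The following Python sketch is such a test, not the nonstandard argument itself; the names find_coloring, restrict, and path_edges, and the two example graphs, are assumptions of this illustration. It looks for a map g : F → {1, ..., k} with g(x) ≠ g(y) whenever ⟨x, y⟩ ∈ E.

```python
from itertools import product


def find_coloring(vertices, edges, k):
    """Return a map vertex -> color in 1..k with no monochromatic edge, or None."""
    vs = sorted(vertices)
    for colors in product(range(1, k + 1), repeat=len(vs)):
        f = dict(zip(vs, colors))
        if all(f[a] != f[b] for a, b in edges):
            return f
    return None


def restrict(edges, B):
    """E|B: the edges with both endpoints in B."""
    return {(a, b) for a, b in edges if a in B and b in B}


def path_edges(lo, hi):
    """Symmetric edge relation n ~ n+1 on the integers lo..hi."""
    forward = {(n, n + 1) for n in range(lo, hi)}
    return forward | {(b, a) for a, b in forward}


# A 5-cycle needs three colors.
cycle5 = {(i, (i + 1) % 5) for i in range(5)}
cycle5 |= {(b, a) for a, b in cycle5}
print(find_coloring(range(5), cycle5, 2) is None)
print(find_coloring(range(5), cycle5, 3) is not None)

# Every finite window of the path on the integers is 2-colorable.
window = set(range(-3, 4))
print(find_coloring(window, restrict(path_edges(-10, 10), window), 2) is not None)
```

For the path on all of the integers, the parity map n ↦ 1 + (n mod 2) is a 2-coloring of the whole graph, in line with the conclusion of the theorem. The search is exponential in the number of vertices and is meant only for very small examples.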
2.8 Internal and External Sets

In this section we make the important distinction between internal and external objects in V(∗X). We also establish some additional properties that will be used in applications of nonstandard analysis. The reader may wish to skip over the proofs of Theorems 2.8.4 and 2.8.11 on first reading.

Definition 2.8.1 An entity a in V(∗X) is called internal if for some standard set b ∈ V(X), a ∈ ∗b. All other entities in V(∗X) are called external.

This means that the internal entities are the elements of standard entities. Of course, if b ∈ V(X), then b ∈ V_{n+1}(X) for some n, so if a ∈ ∗b, then a ∈ ∗V_n(X). Thus internal entities are elements of ∗V_n(X) for some n. On the other hand, suppose that b is internal but not a standard entity, and that a ∈ b. Then b ∈ ∗V_{n+1}(X) for some n, so by transfer of the sentence (∀y ∈ V_{n+1}(X))(∀x ∈ y)[x ∈ V_n(X)], it follows that a ∈ ∗V_n(X), whence a is internal. In interpreting the transfer ∗Φ of a sentence Φ, only constants naming standard objects are used, so only internal entities come up in the interpretation. If a ∈ ∗b, then we can obtain information about a by transferring sentences of the form (∀x ∈ b)[...]. If a is external, however, then the transfer principle does not yield information in this way about a.

Example 2.8.2 To show that the set N is external in V(∗X), let us suppose the contrary, which means that N ∈ ∗𝒫(N). Then the set ∗N \ N = ∗N_∞ is also internal by transfer of the sentence (∀A ∈ 𝒫(N))[N \ A ∈ 𝒫(N)]. It follows by transfer of the sentence
(∀A ∈ 𝒫(N))[A ≠ ∅ → (∃m ∈ A)(∀k ∈ A)[m ≤ k]]
that there is a first element m in ∗N_∞. In this case, m − 1 is the last element of N, which is impossible. This contradiction is a proof that N is external.

For applications of the Transfer Principle, it is clearly important to know when an entity in V(∗X) is internal. We have already given the proof for the following criterion.

Theorem 2.8.3 The set of internal elements of V(∗X) is the set ∪_{n=0}^∞ ∗V_n(X).

This result is not much help in determining when a set is internal. The next result is considerably more useful. Recall that L_{∗X} is the language for V(∗X). A formula in L_{∗X} is called standard if all of the constants are names of standard entities; it is called internal if all of the constants are names of internal entities.
P.A. Loeb
Theorem 2.8.4 (Keisler's Internal Definition Principle) Let Φ(x) be an internal formula in L_{∗X} for which x is the only free variable. Let A be an internal set. Then the set {a ∈ A : Φ(a) holds in V(∗X)} is internal.

Proof Let c1, . . . , cn be the constants in Φ(x); denote Φ(x) by Φ(c1, . . . , cn, x). Fix k ∈ N so that A, c1, . . . , cn are all in ∗V_k(X). The sentence in L_X
(∀x1, . . . , xn, y ∈ V_k(X))(∃z ∈ V_{k+1}(X))(∀x ∈ V_k(X))[x ∈ z ←→ [x ∈ y ∧ Φ(x1, . . . , xn, x)]]
holds in V(X). Its transfer says that {a ∈ A : Φ(a) holds} ∈ ∗V_{k+1}(X).

Theorem 2.8.5 If A and B are internal, so are A ∪ B, A ∩ B, A\B, A × B.

Proof For ∪, assume that A, B ∈ ∗V_{n+1}(X) and transfer
(∀W, Y ∈ V_{n+1}(X))(∃Z ∈ V_{n+1}(X))(∀x ∈ V_n(X))[x ∈ Z ←→ x ∈ W ∨ x ∈ Y].
For ∩, replace ∨ with ∧. For \, replace x ∈ W ∨ x ∈ Y with x ∈ W ∧ x ∉ Y. The proof for × is left to the reader.

We have already shown that N and ∗N∞ are external. As a consequence, we have the following result, where Z denotes the integers.

Theorem 2.8.6 In an enlargement of a structure V(X) with R ⊆ X, the sets N, ∗N∞, R, Z, ∗Z∞, ∗R∞, m(0) are all external.

Proof It follows from the fact that N and ∗N∞ are external that Z and ∗Z∞ are also external. Since Z = R ∩ ∗Z, R is external. Since ∗N∞ = ∗R∞ ∩ ∗N, ∗R∞ is external. Since for x ≠ 0, x ∈ m(0) iff 1/x ∈ ∗R∞, m(0) is external.

Remark 2.8.7 One should not think that external entities are "bad" in any sense. They are just not the subject of the transfer principle. A review of the last chapter shows the utility of such external objects as m(0), ∗N∞, and the standard part map.

Problem: Recall that P_F denotes the finite power set operation. Show that in general, P_F(∗A) ⊆ ∗P_F(A), but ∗P(A) ⊆ P(∗A).

Answer: A truly finite subset of ∗A is internal, so it is in ∗P_F(A). If, for example, the set has three elements, then the following sentence is true by transfer:
(∀x ∈ ∗A)(∀y ∈ ∗A)(∀z ∈ ∗A)(∃w ∈ ∗P_F(A))[{x, y, z} = w].
Thus P F (∗ A) ⊆ ∗ P F (A). Not every hyperfinite set is finite, so the inclusion does not go the other way. If S ∈ ∗ P(A), then by the transfer principle, ∀s ∈ S, s ∈ ∗ A,
2 An Introduction to General Nonstandard Analysis
so S ∈ P(∗ A). Since external subsets of ∗ A are not in ∗ P(A), we can only say that ∗ P(A) ⊆ P(∗ A). Problem: It was shown in the last chapter that (∗ R, ∗ +, ∗ ·, ∗ card(T ).
Theorem 2.9.4 A nonstandard superstructure V(∗X) is κ-saturated if and only if for each internal set C and every (internal or external) κ-small collection ℬ consisting of internal subsets of C such that ℬ has the finite intersection property, there is an a ∈ ⋂{B : B ∈ ℬ}.

Proof Assume first that V(∗X) is κ-saturated, that ℬ ⊆ ∗P(C) is κ-small, and that ℬ has the finite intersection property. Then the relation P := {⟨B, a⟩ | a ∈ B ∈ ∗P(C)} is internal and concurrent on ℬ ⊆ domain(P). By the assumption, there exists an a such that ⟨B, a⟩ ∈ P for each B ∈ ℬ. Now assume that P is an internal relation, and that P is concurrent on a κ-small set A ⊆ domain(P). For each a ∈ A set S_a := {b | ⟨a, b⟩ ∈ P}. Then ℬ := {S_a | a ∈ A} is κ-small and has the finite intersection property. By the assumption, there is a b ∈ ⋂ℬ. It follows that ⟨a, b⟩ ∈ P for each a ∈ A.

Theorem 2.9.5 For V(∗X), ℵ1-saturation is equivalent to being denumerably comprehensive.

Proof First, assume that V(∗X) is at least ℵ1-saturated. Fix an internal set S, and let ⟨a_n : n ∈ N⟩ be an ordinary sequence of elements from S. We must show that this sequence can be extended to an internal sequence ⟨a_n : n ∈ ∗N⟩ in S. For each n ∈ N, let B_n be the collection of internal maps F from ∗N into S such that ∀i ≤ n, F(i) = a_i. Then B_n is internal. Moreover, ℬ := {B_n | n ∈ N} is countable, hence ℵ1-small, and has the finite intersection property. Therefore, there exists an internal F : ∗N → S with F ∈ ⋂ℬ, whence F(n) = a_n for each n ∈ N. The proof of the converse is left to the reader.

In the following we assume V(∗X) to be κ-saturated, where κ ≥ ℵ1.

Proposition 2.9.6 If A is an infinite but κ-small set, then A is external.

Proof Assume that A is internal. Then {⟨a, b⟩ | a, b ∈ A and a ≠ b} is internal and concurrent on A. Therefore, there exists an element b ∈ A such that a ≠ b for all a ∈ A, which is impossible.

Proposition 2.9.7 Let κ > card(V(X)) and let A be a set in V(X). Let ∗[A] := {∗a : a ∈ A}.
Then ∗[A] ⊆ ∗A, and ∗[A] ≠ ∗A if and only if A is infinite.

Proof If a ∈ A, then by transfer, ∗a ∈ ∗A. Thus, ∗[A] ⊆ ∗A. Suppose A is infinite. Since ∗[A] is infinite and κ-small, by the previous result, ∗[A] is external. Since ∗A is internal, ∗[A] ≠ ∗A. On the other hand, if A is finite, say A = {a1, . . . , ak}, then ∗A = {∗a1, . . . , ∗ak} = ∗[A].
Part (i) of the next result shows that each κ-small internal cover of an internal set A contains a finite subcover of A; Part (ii) implies that the standard part of an internal finitely additive measure on an internal algebra is always σ-additive:
Proposition 2.9.8 Fix an internal set C.
(i) If A ⊆ C is internal and ℬ is a κ-small subset of ∗P(C) such that A ⊆ ⋃ℬ, then there exist finitely many B1, . . . , Bk ∈ ℬ with A ⊆ B1 ∪ · · · ∪ Bk.
(ii) If (A_k)_{k∈N} is a strictly increasing or strictly decreasing sequence of internal subsets of C, then ⋃_{k∈N} A_k, respectively ⋂_{k∈N} A_k, is external.

Proof (i) Assume that the assertion fails. Then the internal relation R := {⟨B, a⟩ | a ∈ A, a ∉ B ∈ ∗P(C)} is concurrent on ℬ. Since V(∗X) is κ-saturated, A ⊄ ⋃ℬ, a contradiction.
(ii) Assume that B := ⋃_{k∈N} A_k is internal. Since N is κ-small, by (i), there exists an m ∈ N such that B ⊆ A1 ∪ · · · ∪ Am = Am ⊊ A_{m+1} ⊆ B, which is a contradiction. The rest follows from De Morgan's Law and the fact that the complement of an external set is external.

Corollary 2.9.9 The sets N, ∗N∞, ∗R∞, m(0) are external.

Proof We have seen that N is external. This also follows from the fact that N is a κ-small set. It now follows that ∗N∞ = ⋂_{k∈N} {m ∈ ∗N | k < m} is external. With a similar proof, it follows that ∗R∞ and m(0) are external.

Problem: Assume V(∗X) is ℵ1-saturated. Also assume that A is an internal set such that N ∩ A is infinite. Show that for each unlimited natural number M there exists an unlimited natural number K with K ≤ M and K ∈ A.

Answer: Fix an unlimited M ∈ ∗N. For each n ∈ N, let b_n := {k ∈ ∗N | n ≤ k ≤ M and k ∈ A}. The b_n's have the finite intersection property, so there is a K ∈ ⋂_{n∈N} b_n. It follows that K ∈ ∗N∞ ∩ A and K ≤ M.

We conclude this section by listing the numbers of the major tools needed for applications. They are the following: Part (v) of 2.4.3, 2.4.5, 2.6.4, 2.8.4, 2.8.8, 2.8.9, 2.8.12, 2.9.1. For further background, there is the initiating work of [9] and then [7]. We also note the text of Vakil [13], which employs a weaker form of nonstandard analysis discussed in this book's preface.
Appendix: Nonstandard Models
Horst Osswald
Mathematisches Institut der Universität München, Theresienstr. 39, D-80333 München, Germany

As promised in Sect. 2.9, we shall now show that for each superstructure V of cardinality κ there exists a κ+-saturated superstructure W and a monomorphism ∗ from V into W. (κ+ is the smallest cardinality greater than κ.) By the way, in model theory a monomorphism is often called an elementary embedding.
The proof is elementary. This means that the ambitious existence of countably incomplete κ+ good ultrafilter is not used as for example in the proof of Theorem 6.1.8 in the book of Chang and Keisler [2]. The idea of the proof is similar to the proof in algebra that each field k has an algebraically closed extension K . In the algebraic proof one has to find roots of polynomials; here we have to find elements in the intersection of families of internal sets. Roughly speaking, in algebra one constructs a suitable transfinite increasing chain (kα )α=< y, ∗ (T )ϕ > .
nearstandard
Proposition 4.2.3 (iv) yields the assertion. Conversely, if T′ is compact, then so is T″ by the first part of our proof. But T = T″|_E, where E is identified with its canonical image in E″.
4.4.2 Fredholm Operators

We adhere to the notions introduced in the preceding section. Let E, F be standard Banach spaces and denote the space of compact operators from E to F by K(E, F). For T ∈ L(E, F) consider the operator Q_F ∘ T̂, where Q_F denotes the quotient mapping from F̂ onto F̃ = F̂/F. Its kernel ker(Q_F ∘ T̂) contains E, hence there is a unique induced operator T̃ : Ẽ → F̃. The mapping T → T̃ is easily seen to be linear and continuous. Moreover T̃ = 0 iff T ∈ K(E, F). Therefore, we obtain the following proposition.

Proposition 4.4.5 The map T → T̃ induces an isometric linear representation of the Calkin space L(E, F)/K(E, F) into L(Ẽ, F̃). If in addition E = F then this representation is multiplicative from the Calkin algebra L(E)/K(E) into L(Ẽ).

An operator T ∈ L(E, F) is called a Fredholm operator if dim(ker T) + dim(F/T(E)) < ∞. A well-known standard argument shows that a Fredholm operator T has closed range. More precisely, E = ker(T) ⊕ E1 and F = T(E) ⊕ F2 are topological decompositions, and T1 = T|_{E1} is bijective from E1 onto T(E). Set
M.P.H. Wolff
S : F → E by S(y) = T1^{-1}(y) for y ∈ T(E) and S(z) = 0 for z ∈ F2. Then S satisfies T S T = T. This in turn gives the following version of Atkinson's Theorem:

Theorem 4.4.6 Let T be an element of L(E, F). The following assertions are equivalent:
(i) T is Fredholm.
(ii) T̃ is bijective.
(iii) T + K(E, F) is invertible in L(E, F)/K(E, F).

Proof We use the notations of the paragraph preceding the theorem.
(i) ⇒ (ii): ker(T) = N and F2 are finite dimensional, hence N̂ = N and F̂2 = F2 (see Corollary 4.2.10). This gives the decompositions Ê = N ⊕ Ê1 and F̂ = T̂(Ê) ⊕ F2, so T̃ is invertible with inverse S̃, where S is as in the paragraph preceding the theorem.
(ii) ⇒ (i): If dim(ker(T)) = ∞ then ker T̂ \ E ≠ ∅, hence T̃ is not injective. Moreover T(E) is closed by Proposition 4.2.30. Finally, assume that dim(F/T(E)) = ∞. By Theorem 4.2.20 there exists a hyperfinite dimensional subspace H of ∗F containing F as an external subspace. Set G = T(E). By transfer the ∗-linear hull L of H and ∗G is internally closed and L/∗G is hyperfinite dimensional, in particular L ≠ ∗F. Thus by Proposition 4.2.6 there exists y ∈ ∗F\L of norm 1 satisfying d(y, L) ≥ 1/2. Since T̃ is bijective, F̂/F = Ĝ/F. So there exist g ∈ ∗G, g finite, and z ∈ F such that ĝ + z = ŷ, hence y ≈ g + z ∈ L, a contradiction to d(y, L) ≥ 1/2.
(i) ⇒ (iii): Let S be as in the paragraph preceding the theorem. (i) implies (ii), i.e. T̃ is invertible in L(Ẽ, F̃). But then S̃ = T̃^{-1} follows, hence (iii) holds.
(iii) ⇒ (ii) follows from Proposition 4.4.5.

The index of a Fredholm operator T is defined to be
ind(T) = dim(ker(T)) − dim(F/T(E)) = dim(ker(T)) − dim(ker(T′)).
Obviously ind(I) = ind(I + A) = 0 for all A of finite rank. Moreover ind(T) = 0 for all bijective T. The famous multiplication formula ind(ST) = ind(S) + ind(T) can be proved completely by methods from pure linear algebra (see e.g. [42], Theorem 1.4.8). The following theorem is an immediate consequence of Theorem 4.4.6.

Theorem 4.4.7 The set F(E, F) of all Fredholm operators from E to F is open in L(E, F), and the index is continuous on F(E, F).

Proof By Theorem 4.4.6, F(E, F) is the inverse image (with respect to the quotient mapping) of the set of invertible elements in L(E, F)/K(E, F), which is known to be open. Let now T have index 0.
Then E = ker(T )⊕E 1 , F = T (E)⊕F2 as previously introduced. Also, ind(T ) = 0 implies dim(ker(T )) = dim(F2 ) , hence there exists a (necessarily continuous) linear bijection V¯ : ker(T ) → F2 . Set V (x) = V¯ (P x) where P is the projection onto ker(T ) with kernel E 1 . Then T + V is invertible. Now
4 Banach Spaces and Linear Operators
let R ∈ ∗F(E, F) satisfy R ≈ ∗T. Thus, R + ∗V ≈ ∗(T + V), whence R + ∗V is invertible. Now R = (R + ∗V)(I − (R + ∗V)^{-1} ∗V), hence

∗ind(R) = ∗ind((R + ∗V)(I − (R + ∗V)^{-1} ∗V)) = 0 + ∗ind(I − (R + ∗V)^{-1} ∗V) = 0,

since (R + ∗V)^{-1} ∗V is of finite rank. Therefore, ind^{-1}({0}) is open in F(E, F). Since ind is a homomorphism into Z, the assertion follows.

Corollary 4.4.8 If K is a compact operator and z ≠ 0 is a scalar, then ind(zT − K) = ind(T) holds.

Proof The index is constant on the path t → zT − tK (0 ≤ t ≤ 1).
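As a finite-dimensional sanity check of the index defined above (an illustration that is not part of the text), the following sketch computes ind(T) = dim ker(T) − dim(F/T(E)) for a matrix T. The helper names `rank` and `index` are hypothetical, and exact rational arithmetic is an assumption made to avoid rounding issues. Between finite-dimensional spaces every operator is Fredholm and the index reduces to dim E − dim F.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def index(rows):
    """ind(T) = dim ker(T) - dim(F/T(E)) for T: E -> F given as a dim(F) x dim(E) matrix."""
    dim_f, dim_e = len(rows), len(rows[0])
    rk = rank(rows)
    return (dim_e - rk) - (dim_f - rk)


# Hypothetical example: T maps a 3-dimensional E into a 2-dimensional F.
T = [[1, 2, 3], [2, 4, 6]]
print(rank(T), index(T))
```

Here the kernel is 2-dimensional and the range has codimension 1, so the index is 1 = dim E − dim F, independently of the particular matrix.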
4.4.3 Notes The nonstandard analysis of compact operators was initiated by A. Robinson and A.R. Bernstein [6] who solved the invariant subspace problem for polynomially compact operators. The easy proof of Theorem 4.4.4 is taken from [54], cf. also [37]. As was pointed out above the first treatment of the theory of Fredholm operators by means of Fréchet products was given by Sadovskii [55]. Section 4.4.2 is to some extent within the spirit of that paper. These ideas came up again, apparently independent of [55], in [7, 9]. For the nonstandard analysis of semi-Fredholm operators see [63, 76].
4.5 Spectral Theory of Operators

4.5.1 Basic Definitions and Facts

Let E be a Banach space and let T be a bounded linear operator on E. The resolvent set is ρ(T) = {z ∈ C : (z − T) is bijective}, and on ρ(T) the resolvent R(z, T) is defined by R(z, T) = (z − T)^{-1}. Notice that R(z, T) is continuous by the closed graph theorem. Moreover, ρ(T) is open, and R(·, T) is holomorphic, satisfying the famous resolvent equation

R(z, T) − R(y, T) = (y − z) R(z, T) R(y, T).    (4.5)

The complement of ρ(T) is called the spectrum σ(T). It is compact since for |z| > ‖T‖ the Neumann series Σ_{n≥0} T^n z^{-(n+1)} converges to R(z, T). This also implies that lim_{|z|→∞} R(z, T) = 0. Moreover, r(T) = sup{|z| : z ∈ σ(T)} is called the spectral radius of T. As a consequence of Liouville's theorem the spectrum is never empty.
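The resolvent equation (4.5) can be checked numerically in the simplest setting. The sketch below is an illustration under assumptions not made in the text: T is a 2×2 matrix, and the helper names `inv2`, `mul2`, `resolvent` and `resolvent_identity_defect` are hypothetical.

```python
def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def resolvent(z, t):
    """R(z, T) = (z - T)^(-1) for a 2x2 matrix T and z outside its spectrum."""
    return inv2([[z - t[0][0], -t[0][1]], [-t[1][0], z - t[1][1]]])


def resolvent_identity_defect(z, y, t):
    """Largest entry of R(z,T) - R(y,T) - (y - z) R(z,T) R(y,T)."""
    rz, ry = resolvent(z, t), resolvent(y, t)
    prod = mul2(rz, ry)
    return max(abs(rz[i][j] - ry[i][j] - (y - z) * prod[i][j])
               for i in range(2) for j in range(2))


# Hypothetical example with spectrum {2, 3}; both test points lie in the resolvent set.
T = [[2.0, 1.0], [0.0, 3.0]]
print(resolvent_identity_defect(1j, 5.0 + 2.0j, T))
```

The printed defect should be of the order of floating-point rounding error, reflecting that (4.5) holds exactly.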
Now z is called an eigenvalue if ker(z − T) ≠ {0}. The space ker(z − T) is the space of eigenvectors corresponding to z. The set σp(T) of all eigenvalues of T is called the point spectrum of T. A value z is an approximate eigenvalue if inf{‖(z − T)x‖ : ‖x‖ = 1} = 0. The set of all approximate eigenvalues forms the approximate point spectrum σa(T) of T. It is closed in σ(T). A point z ∈ σ(T) is called a Riesz point of T if it is a pole of R(·, T) for which the residue

Q = (1/(2πi)) ∮_{|v−z|=δ} R(v, T) dv

is of finite rank. A Riesz point z is always isolated in the spectrum of T, and moreover it is an eigenvalue with an eigenspace of dimension equal to the rank of the residue.

A very useful notion was introduced by L. Trefethen [65]: For ε > 0 we define ρε(T) = {z ∈ ρ(T) : ‖R(z, T)‖ < 1/ε}. The set σε(T) = C\ρε(T) forms the so-called ε-pseudospectrum. It has started to play an important role in modern numerical analysis. The formula for the Neumann series (see above) produces for |z| > ‖T‖ the inequality ‖R(z, T)‖ ≤ 1/(|z| − ‖T‖). It follows that for every ε > 0, the set {z : |z| > ‖T‖ + ε} is contained in ρε(T). Notice that if T is a normal operator on a Hilbert space H, the equality ‖R(z, T)‖ = 1/d(z, σ(T)) holds for all z ∈ ρ(T), whence σε(T) is easily determined.

The Laurent series of R(λ, T) at infinity is the series Σ_{n=0}^{∞} T^n/λ^{n+1}; its radius of convergence is given by r(T) = lim sup_n ‖T^n‖^{1/n}. Therefore r(T) = sup{|λ| : λ ∈ σ(T)}. It is called the spectral radius of T. The following formula is well known:

Proposition 4.5.1 r(T) = lim_n ‖T^n‖^{1/n}.

Proof We shall use the formula ‖ST‖ ≤ ‖S‖ ‖T‖ at various places. Let s := inf_n ‖T^n‖^{1/n}. Then for ε > 0 there exists k ∈ N such that s ≤ ‖T^k‖^{1/k} < s + ε/2. Let N ≈ ∞ be arbitrary. Then there exist a unique q ≤ k − 1 and P ≈ ∞ such that N = Pk + q. We obtain

‖T^N‖ ≤ ‖T^k‖^P · ‖T^q‖ = (‖T^k‖^{1/k})^{kP} · ‖T^q‖ < (s + ε/2)^{kP} · ‖T^q‖.

Taking the Nth root we obtain

‖T^N‖^{1/N} ≤ (s + ε/2)^{1/(1 + q/(kP))} · ‖T‖^{q/N} < s + ε,

because q/N ≈ 0 and q/(kP) ≈ 0. Since N ≈ ∞ was arbitrary we have lim sup_n ‖T^n‖^{1/n} ≤ s + ε, from which the assertion follows as ε > 0 was arbitrary.
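Proposition 4.5.1 can be observed numerically on a small matrix. The following sketch is only an illustration: the 2×2 example, the choice of the maximum-row-sum norm (one submultiplicative operator norm among many), and the helper names are assumptions, not taken from the text.

```python
def mat_mul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def op_norm_inf(m):
    """Maximum absolute row sum, a submultiplicative operator norm."""
    return max(sum(abs(x) for x in row) for row in m)


def gelfand_sequence(t, n_max):
    """Return the list of ||T^n||^(1/n) for n = 1, ..., n_max."""
    values, power = [], t
    for n in range(1, n_max + 1):
        values.append(op_norm_inf(power) ** (1.0 / n))
        power = mat_mul(power, t)
    return values


# Hypothetical example: spectral radius 0.5, but norm 1.5.
T = [[0.5, 1.0], [0.0, 0.5]]
seq = gelfand_sequence(T, 200)
print(seq[0], seq[9], seq[199])
```

For this matrix ‖T^n‖ = 0.5^n (1 + 2n), so the sequence starts at 1.5 and decreases slowly towards the spectral radius 0.5 while never falling below it.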
4.5.2 The Spectrum of an S-bounded Internal Operator

Let E denote an internal Banach space, and let T be an internally bounded operator on E, i.e. T ∈ L(E). By transfer we may define all the notions above also for T; we want to investigate the connection between σ(T) and σ(T̂) in the case that T is S-bounded (see Sect. 4.4.2). To this end we introduce the external sets ρb(T) = {z ∈ ρ(T) : ‖R(z, T)‖ finite} and ρ∞(T) = {z ∈ ρ(T) : ‖R(z, T)‖ infinitely large}. By assumption the operator norm of T is finite. In the following we denote the standard part map on Fin(∗C) by z → °z.

Theorem 4.5.2 Let T̂ denote the nonstandard hull on Ê of the S-bounded internal operator T on E. Then the following assertions hold:
(i) σa(T̂) = {°z : inf{‖(z − T)x‖ : ‖x‖ = 1} ≈ 0}, and σa(T̂) consists only of eigenvalues.
(ii) ρ(T̂) = {°z : z ∈ ρb(T)}.
(iii) Fix 0 < ε′ < ε with both numbers standard. Then σε′(T̂) ⊂ {°z : z ∈ σε(T)} ⊂ σε(T̂).

Proof (i) (I) Assume that inf{‖(z − T)x‖ : ‖x‖ = 1} =: α ≈ 0. Then by transfer, for 0 < η ≈ 0 there exists an x of norm 1 satisfying ‖(z − T)x‖ ≤ α + η. But then T̂x̂ = °z x̂.
(II) Assume now that α := inf{‖(z − T)x‖ : ‖x‖ = 1} ≉ 0. Then ‖(°z − T̂)x̂‖ > °α/2 for all x̂ with ‖x̂‖ = 1, which implies by definition that °z ∉ σa(T̂).
(ii) This follows from Lemma 4.2.28.
(iii) (I) Let z ∈ σε(T) be arbitrary. If z ∈ σ(T) then °z ∈ σ(T̂) by Lemma 4.2.28, hence °z ∈ σε(T̂). Therefore, assume z ∈ ρ(T). Then ‖(z − T)^{-1}‖ ≥ 1/ε, and Lemma 4.2.28 gives ‖(°z − T̂)^{-1}‖ ≥ 1/ε, since ε is standard, so {°z : z ∈ σε(T)} ⊂ σε(T̂).
(II) Assume that z is standard and z ∉ {°v : v ∈ σε(T)}. Then z ∉ σε(T), hence ‖(z − T)^{-1}‖ < 1/ε < 1/ε′. Since ε′ is standard, Lemma 4.2.28 yields the assertion.

Remark 4.5.3 Obviously {°z : z ∈ σa(T)} ⊂ σa(T̂), but equality does not hold in general. For example, let N ∈ ∗N\N be arbitrary and consider the internal operator T given on C^N by

T(e_k) = e_{k−1} for k ≥ 2, T(e_1) = 0

((e_n) denotes the canonical basis). If one takes the usual scalar product norm on C^N then T is a partial isometry, in particular bounded by 1. On the other hand T is internally nilpotent, hence σ(T) = {0}. But σa(T̂) = {z ∈ C : |z| ≤ 1} since T̂ induces the shift on the invariant subspace of the Hilbert space Ĉ^N spanned by the vectors {ê_k : k = r [N/2], r ∈ N}, where [N/2] is the greatest integer less than N/2.
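A finite analogue of Remark 4.5.3 can be explored with ordinary floating-point arithmetic. The sketch below is an illustration only: it replaces the infinite N of the remark by a large standard n, and the trial vector x = (1, z, ..., z^{n−1}) as well as the helper names are assumptions. For this vector (z − T)x = z^n e_n, so the residual ‖(z − T)x‖/‖x‖ is at most |z|^n, which is tiny for |z| < 1 although 0 is the only eigenvalue of the nilpotent shift.

```python
def shift(x):
    """T e_k = e_(k-1) for k >= 2 and T e_1 = 0, acting on coordinate lists."""
    return x[1:] + [0.0]


def norm(x):
    """Euclidean norm of a real or complex coordinate list."""
    return sum(abs(c) ** 2 for c in x) ** 0.5


def residual(z, n):
    """||(z - T)x|| / ||x|| for the trial vector x = (1, z, ..., z^(n-1))."""
    x = [z ** k for k in range(n)]
    tx = shift(x)
    return norm([z * a - b for a, b in zip(x, tx)]) / norm(x)


def is_nilpotent_on(x):
    """Apply the shift len(x) times and report whether the result is zero."""
    for _ in range(len(x)):
        x = shift(x)
    return all(c == 0 for c in x)


print(residual(0.9, 10), residual(0.9, 200))
print(is_nilpotent_on([1.0, 2.0, 3.0, 4.0]))
```

As n grows, every z with |z| < 1 behaves more and more like an approximate eigenvalue, which is the finite shadow of the statement that the hull T̂ has the whole closed unit disc as approximate point spectrum.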
Corollary 4.5.4 (i) If z ∈ σ(T) and |z| = r(T) then °z is an eigenvalue of T̂. In particular r(T̂) ≥ °r(T).
(ii) Let T be a standard bounded operator on the standard Banach space E, and let ∗T̂ denote the nonstandard hull of ∗T. Then
(a) σ(T) = σ(∗T̂);
(b) σa(T) = σp(∗T̂);
(c) σε(T) = σε(∗T̂).

Proof (i) If z ∈ σ(T) and |z| = r(T), then R(v, T) is unbounded near z (i.e. for |v| > r(T) and v ≈ z) since otherwise z would not be a singularity of R(·, T). So there exists a v ≈ z with v ∈ ρ∞(T), and the assertion follows.

Problem 4.5.5 Prove assertion (ii). Hint: Use the foregoing theorem as well as Corollary 4.2.26.

That r(T̂) > °r(T) may happen is shown by the operator constructed in Remark 4.5.3.

Let T be a bounded linear internal operator on the internal Banach space E. A point z ∈ ∗C is called an S-Riesz point if it is a Riesz point with residue of standard finite rank. As before, by B(z, r) we denote the set B(z, r) = {v ∈ C : |v − z| < r}. We have the following theorem:

Theorem 4.5.6 Let T be an S-bounded operator on the internal Banach space E. Moreover, let z ∈ C be a Riesz point of T̂ with residue of rank r. Then there exists a standard δ such that the set σ(T) ∩ ∗B(z, δ) is not empty and consists of at most r S-Riesz points z_1, . . . , z_k with z_j ≈ z and Σ_{j=1}^{k} rank(z_j) = r, where rank(z_j) denotes the rank of the residue of z_j.

Proof (I) Let δ = inf{|z − v| : v ∈ σ(T̂)\{z}}/2. Set η = δ/2. Consider the annulus K = {v ∈ C : η ≤ |v − z| ≤ δ}. If a = sup{‖(v − T̂)^{-1}‖ : v ∈ K} and M = 2a then ∗K ⊂ ρ_{1/M}(T). To see this, assume the contrary. Then there exists a v ∈ ∗K ∩ σ_{1/M}(T). Hence °v ∈ σ_{1/M}(T̂), a contradiction to the choice of M.
(II) By the Transfer Principle the spectral projection

Q = (1/(2πi)) ∮_{|v−z|=δ} R(v, T) dv

exists, and ‖Q‖ < δM is finite.

Claim: Q̂ = (1/(2πi)) ∮_{|v−z|=δ} R(v, T̂) dv = Res(R(z, T̂)).

Proof of the claim: By Lemma 4.2.28, the nonstandard hull of R(v, T) equals R(v, T̂) for all v ∈ K. Moreover, since ∗K ⊂ ρ_{1/M}(T), the resolvent equation (see Eq. 4.5) yields ‖R(v, T) − R(w, T)‖ ≤ |v − w| M²; in particular R(·, T) is uniformly S-continuous on ∗K. Hence, the Riemann sums

R_m = (1/(2πi)) · (2πi δ/m) Σ_{k=0}^{m−1} R(z + δ exp(2πik/m), T) exp(2πik/m)

satisfy R_m ≈ Q for all infinitely large m. On the other hand R̂_m → (1/(2πi)) ∮_{|v−z|=δ} R(v, T̂) dv. This proves the claim.
(III) Now dim Q̂(Ê) = r < ∞. But Proposition 4.2.10 then implies dim Q(E) = r, and the assertion follows by the Transfer Principle applied to T|_{Q(E)}.
4.5.3 The Spectrum of Compact Operators and the Essential Spectrum

The spectral theory of compact operators is completely described in the following theorem. It is classical material.

Theorem 4.5.7 Let T be a standard compact linear operator on the Banach space E. Then its spectrum σ(T) is at most countable, and each z ≠ 0 in σ(T) is isolated and an eigenvalue with finite dimensional eigenspace.

Proof By Corollary 4.4.8, for all λ ≠ 0, λI − T is a Fredholm operator and the index ind(λI − T) vanishes, since ind(λI) = 0. Thus if 0 ≠ λ ∈ σ(T) then λ is an eigenvalue and the corresponding eigenspace E_λ is finite dimensional. Now let λ ≠ 0 be an eigenvalue of T. Assume that λ is not isolated and let (λ_n)_n be a sequence in σ(T) converging to λ where λ_n ≠ λ_m for n ≠ m. Without loss of generality assume inf_n |λ_n| =: η > 0. Each λ_n is an eigenvalue. Let x_n be a normalized eigenvector corresponding to λ_n. Then {x_n : n ∈ N} is linearly independent. Let M_n be the linear hull of {x_1, . . . , x_n}. Then obviously T(M_n) ⊂ M_n. Moreover, for y ∈ M_{n+1}\M_n, Ty − λ_{n+1}y ∈ M_n holds: for y = βx_{n+1} + u with u ∈ M_n we have Ty = λ_{n+1}βx_{n+1} + Tu, where Tu ∈ M_n. Set y_1 = x_1. By Corollary 4.2.9, to every n ≥ 2 there exists y_n ∈ M_n of norm 1 satisfying d(y_n, M_{n−1}) > 1/2. Let n, p ∈ N be arbitrary. Then

Ty_{n+p} − Ty_n = Ty_{n+p} − λ_{n+p}y_{n+p} + λ_{n+p}y_{n+p} − Ty_n = λ_{n+p}y_{n+p} − v,

where v ∈ M_{n+p−1}. Hence ‖Ty_{n+p} − Ty_n‖ ≥ |λ_{n+p}| d(y_{n+p}, M_{n+p−1}) ≥ η/2. By Lemma 4.2.16, Ty_N is a remote point for every N ≈ ∞. This contradicts Proposition 4.4.1, since T is compact.
Definition 4.5.8 Let T be a bounded linear operator on the Banach space E. Then the essential spectrum σess (T ) is the set of λ ∈ C such that λI − T is not a Fredholm operator.
In order to avoid trivial cases let E be infinite dimensional. If T is compact then σess(T) = {0} by Theorem 4.5.7. Moreover λ ∈ σess(T) iff λ ∈ σ(T̃). Here T̃ is the equivalence class of T in the Calkin algebra. Theorem 4.4.6 now gives the following result:

Proposition 4.5.9 Let T be a bounded linear operator on the infinite dimensional Banach space E. Then σess(T) ≠ ∅ and ress(T) := sup{|λ| : λ ∈ σess(T)} ≤ r(T).

Problem 4.5.10 (a) Prove the proposition. Hint: Consider the operator T̃ on Ẽ, and use Theorem 4.4.6 as well as ‖T‖ ≥ ‖T̃‖ and the formula for the spectral radius, see Proposition 4.5.1.
(b) Let T be a bounded linear operator on E, such that T^n is compact for some n ∈ N. Show that Theorem 4.5.7 holds also in this case. Hint: Use Proposition 4.5.1 for T̃.
4.5.4 Closed Operators and Pseudoresolvents

The nonstandard hull of a closed operator

Let (A, D(A)) be a closed densely defined operator on the Banach space E. More precisely, D(A) is a dense subspace and the graph G(A) = {(x, Ax) : x ∈ D(A)} is closed in E × E. Its resolvent set ρ(A) is defined as ρ(A) = {λ ∈ C : (λ − A) is bijective onto E}. If λ ∈ ρ(A), then (λ − A)^{-1} =: R(λ, A) is continuous by the closed graph theorem. Now, ρ(A) is open (but it might be empty). As in the case of a bounded operator the resolvent R(·, A) : ρ(A) ∋ λ → R(λ, A) is holomorphic and satisfies the resolvent equation (see Eq. (4.5)). The complement of ρ(A) is the spectrum σ(A). The point spectrum σp(A), the approximate point spectrum σa(A), and the ε-pseudospectrum σε(A) are defined as previously (see p. xxx).

In general it is quite difficult to define the nonstandard hull of A in Ê because Ĝ(A) is no longer the graph of a mapping in Ê. To solve the problem we recapitulate the notion of a pseudoresolvent introduced by E. Hille (see [81], VIII.4): Let D ⊂ C be not empty and R : D → L(E) be a function satisfying the resolvent equation R(u) − R(v) = (v − u)R(u)R(v). Then all operators R(u) have a common null space denoted by N(R) and a common range, denoted by R(E). Moreover R(u)R(v) = R(v)R(u) holds for all u, v ∈ D.

Theorem 4.5.11 (standard, see [81], p. 21) (i) Let R : D → L(E) be a pseudoresolvent and assume that there exists a sequence (λ_n) ⊂ D with lim |λ_n| = ∞ such that (λ_n R(λ_n)) is bounded. Then the closure cl R(E) of the common range satisfies cl R(E) = {x ∈ E : lim λ_n R(λ_n)x = x} and N(R) ∩ cl R(E) = {0}.
(ii) A pseudoresolvent is the resolvent of a closed densely defined linear operator A iff N(R) = {0}. Then R(E) is the domain of definition of A and A = uI − R(u)^{-1}.

In the following let us assume that (A, D(A)) is closed, densely defined and that ρ(A) ≠ ∅. Moreover let us assume that there is a sequence (λ_n) ⊂ ρ(A) such that lim |λ_n| = ∞ and (λ_n R(λ_n, A)) is bounded. These hypotheses are satisfied for A selfadjoint in a Hilbert space or for A the generator of a strongly continuous semigroup, for in the latter case uR(u, A) converges strongly to the identity for u → ∞ (for details see the next section). Now by the Transfer Principle ρ(A) ∋ λ → R̂(λ, A) =: R̂(λ) defines a pseudoresolvent on Ê for which (λ_n R̂(λ_n))_n is bounded. Therefore, we define Ê_R = cl R̂(Ê). The space Ê_R is invariant under R̂(u) for all u, and R̂(u)|_{Ê_R} is injective with dense range D(Â) := R̂(u)(Ê_R). We now set Â = uI − (R̂(u)|_{Ê_R})^{-1}, and call this the nonstandard hull of the closed operator (A, D(A)).

Remark 4.5.12 If A has compact resolvent then by Proposition 4.4.1, Ê_R = E and Â = A.
General spectral theory of pseudoresolvents

In order to avoid the above special assumptions on the resolvent of A and still retain the power of nonstandard hulls, the singular set of a pseudoresolvent R was introduced in [52], following an idea of E. Hille [26]. This singular set coincides with the spectrum of the closed operator A if R(λ) = (λ − A)^{-1}. It enables one to avoid the construction of the nonstandard hull of a closed operator.

Let R : D → L(E) be a pseudoresolvent and fix λ0 ∈ D. Then set Dmax = {λ ∈ C : λ = λ0 or (λ0 − λ)^{-1} ∈ ρ(R(λ0))}, and define Rmax on Dmax by Rmax(λ) = R(λ0)(I − (λ0 − λ)R(λ0))^{-1}. The singular set sing(R) is defined to be the complement of Dmax in C. The connection with the spectrum of bounded operators is given by the two formulas

σ(R(λ))\{0} = {(λ − μ)^{-1} : μ ∈ sing(R)},    (4.6)
sing(R) = {λ − 1/μ : μ ∈ σ(R(λ))\{0}}.    (4.7)
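Formula (4.7) can be tried out in the finite-dimensional case R(λ) = (λ − A)^{-1}, where the singular set is just the set of eigenvalues of A. The sketch below is an illustration under that assumption; the 2×2 example and the helper names `inv2`, `eig2` and `singular_set_via_resolvent` are hypothetical.

```python
import cmath


def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def eig2(m):
    """Both eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]


def singular_set_via_resolvent(a, lam):
    """Evaluate {lam - 1/mu : mu in spectrum of R(lam), mu != 0} for R(lam) = (lam - A)^(-1)."""
    r = inv2([[lam - a[0][0], -a[0][1]], [-a[1][0], lam - a[1][1]]])
    return [lam - 1 / mu for mu in eig2(r) if mu != 0]


# Hypothetical example: A has eigenvalues 1 and 2; lam = 5 lies in the resolvent set.
A = [[0.0, 1.0], [-2.0, 3.0]]
print(sorted(z.real for z in singular_set_via_resolvent(A, 5.0)))
```

Up to rounding, the output recovers the eigenvalues 1 and 2 of A, and the result does not depend on the choice of λ in the resolvent set.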
In complete analogy to the spectral theory of bounded operators we introduce the point spectrum as well as the approximate point spectrum of the pseudoresolvent as follows:

sing_p(R) = {α ∈ C : ∃x [‖x‖ = 1 and (λ − α)R(λ)x = x]},    (4.8)
sing_a(R) = {α ∈ C : inf{‖((λ − α)R(λ) − I)x‖ : ‖x‖ = 1} = 0}.    (4.9)

The resolvent equation (4.5) shows that this definition is independent of the particular λ. Now fix some λ ∈ Dmax. An easy calculation shows that

sing_p(R) = {λ − 1/μ : μ ∈ σp(R(λ))\{0}},  sing_a(R) = {λ − 1/μ : μ ∈ σa(R(λ))\{0}}.    (4.10)
Moreover, if R(λ) = (λ − A)^{-1} for some closed operator A then sing_p(R) = σp(A) and sing_a(R) = σa(A). The nonstandard hull of a (standard) pseudoresolvent is obviously a pseudoresolvent. More precisely the following proposition holds:

Proposition 4.5.13 ([52], Proposition 2.1) Let R : D → L(E) be a pseudoresolvent. The following assertions hold:
(i) R̂ : D → L(Ê), λ → R̂(λ) is a pseudo-resolvent with ‖R̂(λ)‖ = ‖R(λ)‖.
(ii) Dmax(R) = Dmax(R̂) and sing(R) = sing(R̂).
(iii) λ0 ∈ C is a pole of R iff it is a pole of R̂, and then the orders of the poles are equal. In particular sing(R) ∩ ∂Dmax ⊂ sing_p(R̂).
(iv) sing_a(R) = sing_p(R̂).

Proof (i) follows by a careful application of the Transfer Principle. (ii) follows from Eq. 4.3 by an application of the Transfer Principle to the fixed operator R(λ). For (iii) and (iv) use Eq. (4.10) and apply Corollary 4.5.4 (ii) to the fixed operator R(λ).

As a corollary we obtain

Corollary 4.5.14 Let A be a closed densely defined operator on E. Let there exist an unbounded sequence (λ_n) in ρ(A) such that (λ_n R(λ_n, A))_n is bounded. Then the following assertions hold:
(i) σ(A) = σ(Â).
(ii) σa(A) = σp(Â).
4.5.5 Notes The spectral theory of internal S–bounded operators as presented here is due to the author (cf. also [51, 77]). Corollary 4.5.4 is partly new. Part (ii) (a) and (b) of it
however are very well-known and trace back (within the framework of Fréchet products) to Quigley (see [53]). These facts were rediscovered by Berberian [4], Lotz (cf. [57] V.1), and others and have been used extensively since then. Theorem 4.5.6 is new. The corresponding result within the framework of ultraproducts may be found in [51]. Section 4.5.3 is well-known. The proofs seem to be new, though they are based on the standard ones (cf. [81]; for a nonstandard treatment see also [54]). Section 4.5.4 is new. The corresponding treatment of closed operators within the theory of ultraproducts is due to Krupa [33]. The use of pseudo-resolvents in the spectral theory of closed operators goes back to Hille (see [26]). Its present form is taken from [52] where it is developed within the theory of ultraproducts.
4.6 Selected Applications

4.6.1 Strongly Continuous Semigroups

A typical example of the preceding notions is the generator (A, D(A)) of a bounded strongly continuous semigroup T = (T_t)_{t≥0} of operators T_t on E. Let us recall this notion in a little more detail: T is called a strongly continuous semigroup if T_{s+t} = T_s T_t for all s, t ≥ 0 and if also for all x ∈ E, lim_{t→0} T_t x = x holds. In this case, its generator (A, D(A)) is defined by x ∈ D(A) iff Ax := lim_{t→0} (1/t)(T_t x − x) exists. The semigroup property implies that t → T_t x is norm continuous for every x ∈ E. Moreover x ∈ D(A) ⇒ T_t x ∈ D(A) and (T_t x)′ = A T_t x = T_t A x. (For details on strongly continuous semigroups see e.g. [48].)

In what follows the hypothesis that T be bounded is not severely restrictive since in the general case there exists an M > 0 such that t → exp(−Mt)T_t is strongly continuous and bounded as well.

Since T is bounded, by

R(z)(x) := ∫_0^∞ e^{−tz} T_t x dt

there is a bounded linear operator defined for z ∈ C with Re(z) > 0. This turns out to be a pseudoresolvent with

lim_{u→∞} u R(u)x = x

(as is easily seen, cf. part (II) of the proof of the next theorem). Therefore, we can apply Theorem 4.5.11. Since the kernel N(R) = {0}, by the formula above, R(·) is the resolvent of the closed densely defined operator B = u − (R(u))^{-1} (which is independent of u > 0). The easily proved formula

T_t R(u)x = e^{ut} R(u)x − e^{ut} ∫_0^t e^{−us} T_s x ds    (4.11)
shows first of all that the map t → T_t R(u) is continuous with respect to the operator norm, and secondly that (T_t R(u)x)′|_{t=0} = uR(u)x − x = A R(u)x holds, whence B = A. It follows that (A, D(A)) satisfies the assumptions that were made in the previous section in order to construct Â.

If (A, D(A)) is the generator of a bounded strongly continuous semigroup, Â can be characterized in a manner different from that given in the previous section. We adhere to the notions and notations given there.

Theorem 4.6.1 (cf. [73]) Let (A, D(A)) be the generator of a bounded strongly continuous semigroup T = (T_t)_{t≥0}. Then the following assertions hold:
(i) Ê_R = {x̂ : t → ∗T_t x is S-continuous}. Moreover, the restriction t → T̂_t|_{Ê_R} is a strongly continuous semigroup with generator Â.
(ii) D(Â) = {x̂ ∈ Ê_R : ∃x ∈ x̂ [x ∈ ∗D(A), ∗Ax is finite, and t → ∗T_t ∗Ax is S-continuous]}.

Proof (i) Let H = {x ∈ Fin(∗E) : t → ∗T_t x is S-continuous} and set G = Ĥ. Denote the bound of (T_t) by M. Let u > 0 be given.
(I) Since t → T_t R(u, A) is continuous with respect to the operator norm, so is t → ∗T_t ∗R(u, A). So t → ∗T_t ∗R(u, A)y is S-continuous (cf. Lemma 4.4.2) for every y ∈ Fin(∗E), hence R̂(u, A)(Ê) ⊂ G, which in turn implies that Ê_R ⊂ G.
(II) Conversely, let t → ∗T_t x be S-continuous. For each standard ε > 0 there exists a standard δ such that ‖∗T_t x − x‖ < ε for all t ∈ ∗R with 0 ≤ t < δ. Because ∫_0^∞ u exp(−ut) dt = 1, we have by the Transfer Principle

‖u ∗R(u, A)x − x‖ = ‖u ∫_0^∞ exp(−ut)(∗T_t x − x) dt‖
  ≤ u ∫_0^δ exp(−ut) ‖∗T_t x − x‖ dt + (M + 1)‖x‖ ∫_δ^∞ u exp(−ut) dt
  ≤ ε + (M + 1)‖x‖ exp(−uδ).

This estimate shows that lim_{u→∞} u R̂(u, A)x̂ = x̂, whence x̂ ∈ Ê_R. Thus, G ⊂ Ê_R holds. That t → T̂_t|_{Ê_R} is a strongly continuous semigroup is obvious. The rest of (i) follows from the formula 4.11 above applied to this semigroup.
(ii) By definition D(Â) = R̂(u)(Ê_R) for some u > 0. By (i), x̂ ∈ D(Â) ⇒ x = ∗R(u, A)y for some y ∈ H. But then ∗Ax = ∗Ax − ux + ux = ux − y is finite and in H.
4 Banach Spaces and Linear Operators
Conversely, assume that $x \in {}^*D(A)$, and that $x$ as well as ${}^*Ax = y$ are in $H$. Then a short calculation shows that $x = {}^*R(u, A)(ux - y)$. Since $ux - y \in H$, we obtain $\hat{x} = \hat{R}(u, A)(u\hat{x} - \hat{y})$, where $u\hat{x} - \hat{y} \in \hat{H} = \hat{E}_R$ by (i).

The space on which $(\hat{T}_t)$ is strongly continuous might be strictly larger than $\hat{E}_R$, as the following example shows.

Example 4.6.2 (cf. [73]) Let $E$ be the space $\ell^2(\mathbb{Z})$, and consider the group action $T_t(f)(k) = \exp(itk) f(k)$ for $t \in \mathbb{R}$. It is easily seen that $R(z, A)$ is compact, whence we have $\hat{E}_R = E$. On the other hand, for every $\varepsilon > 0$ and $\{t_1, \ldots, t_n\}$ there exists a $k \in \mathbb{Z}$ such that $\sup\{|\exp(it_j k) - 1| : j \le n\} < \varepsilon$ (see [25], Theorem 26.14). Take a hyperfinite set $M \subset {}^*\mathbb{R}$ containing all of $\mathbb{R}$ and choose $\varepsilon \approx 0$ and $k \in {}^*\mathbb{Z}$ such that $\sup\{|\exp(itk) - 1| : t \in M\} < \varepsilon$. Set $f = e_k = (\delta_{k,l})_{l \in {}^*\mathbb{Z}}$. Then $\hat{T}_t(\hat{f}) = \hat{f}$ holds, but $t \mapsto {}^*T_t(f)$ is not S-continuous, since for $t = \pi/k \approx 0$ we have ${}^*T_t(f) = -f$.

Corollary 4.6.3 Let $T = (T_t)_{t \ge 0}$ be a bounded strongly continuous semigroup on the (standard) Banach space $E$. The following assertions are equivalent:
(i) $T$ is uniformly continuous, i.e. the map $t \mapsto T_t$ is continuous from $\mathbb{R}_+$ into $L(E)$, equipped with the operator norm.
(ii) For all $x \in \mathrm{Fin}({}^*E)$ the map ${}^*\mathbb{R}_+ \ni t \mapsto {}^*T_t(x)$ is S-continuous.
(iii) The generator $A$ is bounded.

Proof (i) ⇒ (ii): Obvious.
(ii) ⇒ (iii): By Theorem 4.6.1, $\hat{E}_R = \hat{E}$, so $\hat{R}(\lambda, A)$ is a topological isomorphism. But then $\hat{A} = \lambda I - \hat{R}(\lambda, A)^{-1}$ is bounded.
(iii) ⇒ (i): Since $A$ is bounded and closed, it is everywhere defined. This implies that $R(\lambda, A)$ is bijective, and since $t \mapsto T_t R(\lambda, A)$ is continuous with respect to the operator norm, the assertion follows.
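Formula (4.11) and the limit $\lim_{u\to\infty} uR(u)x = x$ can be sanity-checked in the simplest possible case. The following sketch assumes $E = \mathbb{C}$ and $T_t = e^{at}$ with $\mathrm{Re}(a) \le 0$, so that $R(u) = 1/(u - a)$; the function names and the Simpson quadrature are illustrative choices, not part of the text.

```python
import cmath

def semigroup(a, t):
    """Scalar semigroup T_t = exp(a t) on E = C (bounded when Re(a) <= 0)."""
    return cmath.exp(a * t)

def resolvent(a, u):
    """Closed form of the Laplace transform of exp(a s): R(u) = 1 / (u - a)."""
    return 1.0 / (u - a)

def simpson(g, lo, hi, steps=2000):
    """Composite Simpson rule for a complex-valued integrand (steps must be even)."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return total * h / 3

def defect_411(a, u, t):
    """Absolute difference between the two sides of (4.11), applied to x = 1."""
    lhs = semigroup(a, t) * resolvent(a, u)
    integral = simpson(lambda s: cmath.exp(-u * s) * semigroup(a, s), 0.0, t)
    rhs = cmath.exp(u * t) * (resolvent(a, u) - integral)
    return abs(lhs - rhs)

if __name__ == "__main__":
    a = -1 + 2j
    print(defect_411(a, 3.0, 0.7))            # close to zero, up to quadrature error
    for u in (1e1, 1e3, 1e6):
        print(u, abs(u * resolvent(a, u) - 1))  # tends to 0 as u grows
```

In this scalar case the generator is multiplication by $a$, which is bounded, in line with Corollary 4.6.3.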
4.6.2 Approximation of Operators and of Their Spectra

General Theory

Another field for which nonstandard functional analysis works well is approximation theory. We give some examples concerning the approximation of spectra (for another application in this field see [51]). To this end we recall the definition of the distance from the bounded set $A$ to the set $B$ in $\mathbb{C}$, given by $d(A, B) = \sup_{a \in A} \inf\{|b - a| : b \in B\}$. In the following assume that a sequence $(S_n)_{n \in \mathbb{N}}$ approximates, in a vague sense, the not necessarily continuous but at least densely defined and closed operator $A$. We shall look for the possible convergence of parts of the spectrum of $S_n$ to some part of the spectrum of $A$. If $S_n$ is defined on the Banach space $E$ and $T$ is a bounded operator such that $\lim \|T - S_n\| = 0$, then it is not hard to prove that $\lim d(\sigma(S_n), \sigma(T)) = 0$
holds. But we point out that $\limsup d(\sigma(T), \sigma(S_n)) \neq 0$ may happen (see [32], p. 210). A slightly weaker form of the assertion turns out to hold in a much more general setting, as we now show. First of all let us recall the notion of discrete convergence from approximation theory (cf. [47]): Let $E$ and $F_n$ be Banach spaces. Let $E_1$ be a dense subspace of $E$ and for each $n$ let $P_n : E_1 \to F_n$ be an arbitrary linear operator. The quadruple $(E, E_1, (F_n), (P_n))$ is called an approximation scheme if $\lim_n \|P_n u\|_n = \|u\|$ holds for all $u \in E_1$. Notice that in contrast to [47] we do not require the $P_n$ to be continuous. The mappings $P_n$ are, so to speak, asymptotically isometric, and this is enough for applications. A sequence $(u_n)_n$ with $u_n \in F_n$ converges discretely to $u \in E_1$ if $\lim_n \|u_n - P_n u\|_n = 0$. In that case we write $u = d\text{-}\lim_n u_n$. Let $E_0 \subset E_1$ be another dense subspace of $E$ and let $A : E_0 \to E_1$ be a linear operator. A sequence $(S_n)_n$ of linear operators $S_n$ on $F_n$ approximates $A$ discretely on $E_0$ if, loosely speaking, "$d\text{-}\lim_n S_n P_n u = Au$" holds for all $u \in E_0$. More precisely this means
\[ \lim_{n \to \infty} \|S_n P_n u - P_n A u\|_n = 0 \quad \text{for all } u \in E_0. \tag{4.12} \]
We denote this approximation by $A = d\text{-}\lim S_n$. Strong convergence of a sequence of bounded operators is a special case of this notion: set $E = E_1 = F_n$ and $P_n = I$ for all $n$. In order to clarify these notions of discrete approximation, we prove the following lemma:

Lemma 4.6.4 Let $(E, E_1, (F_n), (P_n))$ be a fixed approximation scheme. Then the following assertions hold:
(i) For $N \in {}^*\mathbb{N} \setminus \mathbb{N}$ the operator $\hat{P}_N|_{E_1} =: \tilde{P}_N$ is well-defined and embeds $E_1$ isometrically into $\hat{F}_N$. Its unique extension to an isometry from $E$ into $\hat{F}_N$ will also be denoted by $\tilde{P}_N$.
(ii) Let $(A, D(A))$ be a densely defined operator in $E$ and let $D(A) \subset E_1$ satisfy $A(D(A)) \subset E_1$. Assume that $(S_n)_n$ approximates $A|_{D(A)}$ discretely and moreover assume that $(S_n P_n x)_n$ is bounded for every $x \in D(A)$. Let $N$ be as above. Then $\tilde{P}_N A|_{D(A)} = \hat{S}_N \tilde{P}_N|_{D(A)}$.

Proof (i) If $u$ is in $E_1$ then $\|P_N u\|_N \approx \|u\|$ by hypothesis. Hence (i) follows.
(ii) Since $(S_n)_n$ is bounded, $S_N$ is S-continuous. The remainder follows from Eq. (4.12).

Example 4.6.5 (1) Let $F_n = \mathbb{C}^n$ with the scalar product norm $\|(x_1, \ldots, x_n)\|_n^2 = \sum_{j=1}^n |x_j|^2 / n$. Let $E = L^2([0, 1])$, $E_1 = \{f \in E : f \text{ continuous}, f(0) = f(1)\}$, and $E_0 = \{f \in E_1 : f' \in E_1\}$. Set $Af = f'$ and
\[ P_n : E_1 \to F_n, \quad f \mapsto \big(f(k/n)\big)_{0 \le k \le n-1}. \]
Define $S_n = n(T_n - I_n)$, where $I_n$ is the identity on $F_n$ and $T_n$ is the shift given by
\[ T_n(x)_k = \begin{cases} x_{k+1} & 1 \le k \le n-1, \\ x_1 & k = n. \end{cases} \]
Then $A = d\text{-}\lim S_n$. Notice that $(S_n)_n$ is not bounded, as is easily seen.

(2) Let $E = \ell^2(\mathbb{N})$ be the usual Hilbert space and let $T$ be the shift given by $(Tf)(k) = f(k + 1)$. Then $\sigma(T) = \sigma_a(T) = \{z \in \mathbb{C} : |z| \le 1\}$. Set $F_n = E$, $P_n = I$ and
\[ (S_n f)(k) = \begin{cases} f(k+1) & k \le n-1, \\ 0 & \text{otherwise.} \end{cases} \]
Then $(S_n)$ converges strongly to $T$. Notice that $\sigma(S_n) = \{0\}$, so the spectra do not converge in any reasonable sense.

(3) Let $S$ be the adjoint $T^*$ of $T$. Then $\sigma_a(S) = \mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$. Set $F_n = \mathbb{C}^n$, equipped with the usual scalar product norm, take $P_n f = (f(1), \ldots, f(n))$ and $S_n(x_1, \ldots, x_n) = (x_n, x_1, \ldots, x_{n-1})$. Obviously $(S_n)$ converges discretely to $S$. In this case not even the $\varepsilon$-pseudospectrum of $S_n$ behaves well. In fact, we have $\limsup_n d(\sigma(S), \sigma_\varepsilon(S_n)) = 1 - \varepsilon$.

In the following, let $(A, D(A))$ be a fixed closed densely defined operator on the Banach space $E$. A subspace $E_0 \subset D(A)$ is a core of $A$ if the closure of $A|_{E_0}$ equals $A$. If $E_0$ is a core of $A$ and $\lambda \in \sigma_a(A)$, then there exists a sequence $(x_n)_n$ of normalized elements in $E_0$ such that $\lim_n \|(\lambda - A)x_n\| = 0$ holds.

Theorem 4.6.6 ([78]) Let $(E, E_1, (F_n), (P_n))$ be a fixed approximation scheme. Let $(S_n)$ be a sequence of bounded operators on $F_n$ which discretely approximates the operator $A$ on the core $E_0 \subset E_1$ of $A$. Then for every $\varepsilon > 0$ we have
\[ \sigma_a(A) \subset \bigcup_{n \in \mathbb{N}} \bigcap_{k \ge n} \sigma_\varepsilon(S_k). \]
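Before turning to the proof, the statement can be illustrated with the truncated shifts $S_n$ of Example 4.6.5 (2). The sketch below assumes that $\lambda \in \sigma_\varepsilon(S)$ whenever some $x \ne 0$ satisfies $\|(\lambda - S)x\| < \varepsilon \|x\|$ (which is how the pseudospectrum is used in the proof); the trial vector $x = (1, \lambda, \ldots, \lambda^{n-1}, 0, \ldots)$ and all function names are illustrative choices.

```python
import cmath
import math

def truncated_shift(x, n):
    """Apply S_n of Example 4.6.5 (2) to a finitely supported sequence.

    x holds x(1), ..., x(len(x)); (S_n x)(k) = x(k+1) for k <= n-1, else 0.
    """
    out = [0j] * len(x)
    for k in range(1, len(x) + 1):          # k is the 1-based index
        if k <= n - 1 and k + 1 <= len(x):
            out[k - 1] = x[k]               # x[k] is x(k+1) in 1-based terms
    return out

def norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def relative_residual(lam, n):
    """||(lam - S_n) x|| / ||x|| for the trial vector x(k) = lam**(k-1), k <= n."""
    x = [lam ** j for j in range(n)]
    sx = truncated_shift(x, n)
    r = [lam * a - b for a, b in zip(x, sx)]
    return norm(r) / norm(x)

def is_nilpotent_on(x, n):
    """Check that n applications of S_n annihilate x (so 0 is the only eigenvalue)."""
    for _ in range(n):
        x = truncated_shift(x, n)
    return all(c == 0 for c in x)

if __name__ == "__main__":
    print(is_nilpotent_on([1, 2, 3, 4, 5, 6, 7], 5))
    for n in (10, 100, 400):
        # Both residuals shrink with n, although sigma(S_n) = {0} throughout.
        print(n, relative_residual(0.5, n), relative_residual(cmath.exp(1j), n))
```

So every $\lambda$ in the closed unit disc eventually lies in $\sigma_\varepsilon(S_n)$, even though the spectra $\sigma(S_n)$ themselves stay at $\{0\}$.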
Proof Let $\varepsilon > 0$ (standard) be given and let $\lambda \in \sigma_a(A)$ be arbitrary. Because $E_0$ is a core of $A$ there exists $x \in E_0$ of norm 1 such that $\|(\lambda - A)x\| < \varepsilon/2$. By hypothesis, for all $N \approx \infty$ and for all $y \in E_1$ we have $\|P_N y\| \approx \|y\|$, which implies in particular $\|P_N x\| \approx 1$ as well as $\|P_N(\lambda - A)x\| \approx \|(\lambda - A)x\| < \varepsilon/2$. Moreover, $P_N(\lambda - A)x \approx (\lambda - S_N)P_N x$ by hypothesis, whence $\|(\lambda - S_N)P_N x\| < \varepsilon/2$, which in turn implies that $\lambda \in \sigma_\varepsilon(S_N)$. Let $\mathcal{A} = \{n \in \mathbb{N} : \forall k \ge n\ [\|(\lambda - S_k)P_k x\| < \varepsilon/2]\}$. By what is proved so far, ${}^*\mathcal{A}$ contains ${}^*\mathbb{N} \setminus \mathbb{N}$. The assertion now follows from the Spillover Principle (Theorem 2.8.12 (ii)) and the Downward Transfer Principle.

The next corollary holds in particular if all $F_n$ are Hilbert spaces and if moreover all operators $S_n$ are normal.

Corollary 4.6.7 Assume that in addition to the hypotheses of the theorem, we have $\|R(\lambda, S_n)\| \le 1/d(\lambda, \sigma(S_n))$ for each $n \in \mathbb{N}$ and all $\lambda \in \rho(S_n)$. Then for every compact subset $K \subset \mathbb{C}$
\[ \lim_{n \to \infty} d(\sigma_a(A) \cap K, \sigma(S_n)) = 0. \]
Proof Let $\lambda \in {}^*(\sigma_a(A) \cap K)$ be arbitrary and let $\varepsilon > 0$ be a standard real number. Because $K$ is compact, the standard part ${}^\circ\lambda =: z$ exists and is in $\sigma_a(A)$, since this set is closed. By the theorem there exists $n(\varepsilon, z)$ such that $\|R(z, S_n)\| \ge 2/\varepsilon$ holds for all $n \ge n(\varepsilon, z)$. By hypothesis this implies that $d(z, \sigma(S_n)) \le \varepsilon/2$ for all these $n$. The Transfer Principle implies that $d(\lambda, \sigma(S_N)) \approx d(z, \sigma(S_N)) < \varepsilon$ for all $N \approx \infty$, and the assertion follows.

The proofs of the next two corollaries are obvious.

Corollary 4.6.8 If all $F_n$ are Hilbert spaces and all $S_n$ are unitary operators, then $\sigma_a(A) \subset \{z \in \mathbb{C} : |z| = 1\}$.

Corollary 4.6.9 If $E$ and $F_n$ are Hilbert spaces and $A$ as well as all $S_n$ are normal, then $\lim_{n \to \infty} d(\sigma(A) \cap K, \sigma(S_n)) = 0$ for every compact subset $K$ of $\mathbb{C}$.

Consider the first example in 4.6.5. $S_n$ is normal and
\[ \sigma(S_n) = \Big\{ \frac{\exp(2i\pi k/n) - 1}{1/n} : k = 1, \ldots, n \Big\}. \]
It follows that $\{2i\pi k : k \in \mathbb{Z}\} = \sigma(A)$ (a fact to be proved much more easily directly).
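The spectrum of $S_n = n(T_n - I_n)$ quoted above can be checked directly. The sketch below verifies that $v_k = (\omega^j)_{0 \le j < n}$ with $\omega = e^{2\pi i k/n}$ is an eigenvector for $\lambda_k = n(\omega - 1)$, and that $\lambda_k$ approaches $2\pi i k$ for fixed $k$ as $n$ grows; the function names are illustrative, and indices are taken 0-based for convenience.

```python
import cmath
import math

def apply_S(x):
    """Apply S_n = n (T_n - I_n), with T_n the cyclic shift (T_n x)_k = x_{k+1}."""
    n = len(x)
    return [n * (x[(k + 1) % n] - x[k]) for k in range(n)]

def eigenvalue(k, n):
    """Candidate eigenvalue n (exp(2 pi i k / n) - 1) of S_n."""
    return n * (cmath.exp(2j * math.pi * k / n) - 1)

def residual(k, n):
    """Largest entry of S_n v - lambda v for v_j = omega**j, omega = exp(2 pi i k / n)."""
    omega = cmath.exp(2j * math.pi * k / n)
    v = [omega ** j for j in range(n)]
    lam = eigenvalue(k, n)
    return max(abs(a - lam * b) for a, b in zip(apply_S(v), v))

if __name__ == "__main__":
    print(residual(3, 50))                       # rounding-level residual
    for n in (10, 1000, 10 ** 6):
        print(n, abs(eigenvalue(1, n) - 2j * math.pi))  # shrinks roughly like 1/n
```

A Taylor expansion gives $n(e^{2\pi i k/n} - 1) = 2\pi i k + O(k^2/n)$, which matches the decay seen in the printed differences.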
Compact Operators

When compact operators are approximated, a much better estimate for the spectrum is possible. In this section we follow [79]. Let $(E, E_1, (F_n), (P_n))$ be a fixed discrete approximation scheme.

Definition 4.6.10 A sequence $(x_n) \in \prod F_n$ is called discretely compact (d-compact for short) if for every $\varepsilon > 0$ there exists a finite set $Y(\varepsilon) \subset E_1$ depending on $\varepsilon$ such that
\[ \limsup_n d(x_n, P_n(Y(\varepsilon))) < \varepsilon. \]
A subset $X \subset \prod F_n$ is called uniformly d-compact if for every $\varepsilon > 0$ there exists a finite set $Y(\varepsilon) \subset E_1$ such that
\[ \limsup_n d(x_n, P_n(Y(\varepsilon))) < \varepsilon \]
holds for all $(x_n) \in X$.

Remark 4.6.11 There are various similar notions in the literature, see e.g. [2, 16], which turn out to be less general than ours. For a discussion see [79], pp. 227–228.

The characterization of d-compactness by nonstandard analysis reads as follows:
Proposition 4.6.12 A sequence $(x_n)_n \in \prod F_n$ is discretely compact if and only if to every $N \approx \infty$ there exists $x \in E$ such that $\tilde{P}_N x = \hat{x}_N$.

Proof Let $(x_n)_n$ be discretely compact and let $N \approx \infty$. To each $r \in \mathbb{N}$ there exists a finite set $Y_r$ with $\limsup_n d(x_n, P_n(Y_r)) < 2^{-r}$, in particular $d(x_N, P_N({}^*Y_r)) < 2^{-r}$. Choose $y_r \in Y_r$ satisfying $d(x_N, P_N y_r) < 2^{-r}$. Since $P_N$ is almost an isometry, the sequence $(y_r)_r$ is a Cauchy sequence in $E_1$, hence convergent to some $x \in E$. It follows that $\hat{x}_N = \tilde{P}_N x$. Conversely, assume that $(x_n)_n$ is not discretely compact. Then there exists $\varepsilon > 0$ such that to every finite $Y \subset E_1$ as well as to every $m \in \mathbb{N}$ there exists $n \ge m$ with $d(x_n, P_n(Y)) > \varepsilon$. Now take a hyperfinite $Y \subset {}^*E_1$ with $E_1 \subset Y$ (externally). Let $M \approx \infty$. Then there exists $N \ge M$ with $d(x_N, P_N(Y)) > \varepsilon$. In particular, for $y \in E_1$ it follows that $\|x_N - P_N y\| > \varepsilon$, so there is no standard $x$ with $\tilde{P}_N x = \hat{x}_N$.

The characterization of uniform d-compactness is a little more cumbersome and will not be needed in the sequel.

Let us consider again Example 1 in 4.6.5, but now choose
\[ A_n = \frac{1}{n} \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 0 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 1 \end{pmatrix}. \]
The sequence $(A_n P_n f)_n$ converges discretely to the image of $f$ under the Volterra operator $f \mapsto (x \mapsto \int_0^x f(u)\,du)$. The next proposition shows that the set $\{(A_n P_n f)_n : \|f\| = 1\}$ is uniformly d-compact.

Proposition 4.6.13 Let the linear operator $A : E_1 \to E_1$ be approximated by the sequence $(A_n)$ of operators $A_n : F_n \to F_n$. Then $A$ is compact if and only if the set $X = \{(A_n P_n x)_n : \|x\| = 1\}$ is uniformly d-compact.

Proof Let $A$ be compact, and let $\varepsilon > 0$ be given. Then there exists a finite set $Y$ with $d(A(B(0, 1)), Y) < \varepsilon/2$, where $B(0, 1)$ denotes the unit ball in $E_1$. Let $N \approx \infty$ be arbitrary. Then for all (standard) $x$ of norm 1 we have $P_N Ax \approx A_N P_N x$ as well as $d(P_N Ax, P_N Y) < \varepsilon/2$, hence $d(A_N P_N x, P_N(Y)) < \varepsilon$, which implies that $X$ is uniformly d-compact. Conversely, let $X$ be uniformly d-compact. Then to $\varepsilon > 0$ there exists a finite set $Y \subset E_1$ with $\limsup_n d(A_n P_n x, P_n(Y)) < \varepsilon/2$ for all $x \in E_1$ of norm 1. Let $N \approx \infty$ be arbitrary. Then for each $x$ of norm 1 we have $d(A_N P_N x, P_N(Y)) < \varepsilon/2$ as well as $P_N Ax \approx A_N P_N x$. Since $P_N$ is almost an isometry we obtain $d(Ax, Y) < \varepsilon$ for all $x$ of norm 1, and the assertion follows.

Now we describe the approximation of the spectrum of $A$ whenever $A$ is compact. Let the compact operator $A$ be discretely approximated by $(A_n)$. We define
\[ D = \{ z \in \mathbb{C} : \exists (x_n)_n,\ \|x_n\|_n = 1 \text{ for all } n,\ (x_n)_n \text{ d-compact and } \liminf_n \|(z - A_n)x_n\|_n = 0 \}. \]
Theorem 4.6.14 Let $(A_n)_n$ be a uniformly bounded sequence of operators $A_n$ on $F_n$ that approximates discretely the compact operator $A$. Then $D \subset \sigma(A) \subset D \cup \{0\}$.

Proof To $z \in D$ there exists a normalized d-compact sequence $(x_n)_n$ satisfying $\liminf_n \|(z - A_n)x_n\|_n = 0$. Therefore there exists $N \approx \infty$ with $A_N x_N \approx z x_N$. Since $(x_n)_n$ is d-compact there exists a (standard) $y \in E$ with $\hat{x}_N = \tilde{P}_N y$. So we obtain
\[ \tilde{P}_N A y = \hat{A}_N \tilde{P}_N y = \hat{A}_N \hat{x}_N = z \hat{x}_N = z \tilde{P}_N y. \]
It follows that $z \in \sigma(A)$, since $\tilde{P}_N$ is an isometry. Conversely, let $0 \ne z \in \sigma(A)$ be arbitrary and let $y$ be a normalized eigenvector. Then the sequence $(x_n)_n$ given by $x_n = P_n y = \frac{1}{z} P_n A y$ is d-compact by Proposition 4.6.13. Moreover, for $N \approx \infty$ we get $A_N x_N = A_N P_N y \approx P_N A y = z P_N y = z x_N$, which implies even $\lim_n \|(z - A_n)x_n\|_n = 0$.

More results are obtained by refining the notion of discrete convergence, see [78].
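The discrete convergence of the matrices $A_n$ to the Volterra operator, mentioned before Proposition 4.6.13, can be observed numerically. The following sketch assumes the scheme of Example 4.6.5 (1), i.e. $P_n f = (f(k/n))_{0 \le k \le n-1}$ and $\|x\|_n^2 = \sum_j |x_j|^2/n$; the test function $f(x) = \cos(2\pi x)$ and all function names are choices made here for illustration.

```python
import math

def P(f, n):
    """Sampling operator P_n f = (f(k/n)) for k = 0, ..., n-1."""
    return [f(k / n) for k in range(n)]

def apply_A(x):
    """Apply A_n = (1/n) * (lower triangular matrix of ones) via running sums."""
    n = len(x)
    out, running = [], 0.0
    for value in x:
        running += value
        out.append(running / n)
    return out

def norm_n(x):
    """Scaled Euclidean norm ||x||_n with ||x||_n^2 = sum |x_j|^2 / n."""
    return math.sqrt(sum(abs(v) ** 2 for v in x) / len(x))

def discrete_error(f, Vf, n):
    """|| A_n P_n f - P_n (V f) ||_n, where Vf is the exact Volterra image of f."""
    lhs = apply_A(P(f, n))
    rhs = P(Vf, n)
    return norm_n([a - b for a, b in zip(lhs, rhs)])

def f(x):
    return math.cos(2 * math.pi * x)

def Vf(x):
    # Integral of cos(2 pi u) over [0, x].
    return math.sin(2 * math.pi * x) / (2 * math.pi)

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, discrete_error(f, Vf, n))   # decays roughly like 1/n
```

Since $A_n$ is lower triangular with all diagonal entries equal to $1/n$, its only eigenvalue is $1/n$, which tends to $0$; this is at least compatible with Theorem 4.6.14 and the standard fact that the Volterra operator has spectrum $\{0\}$.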
4.6.3 Super Properties

Roughly speaking, a property concerning subsets of a given Banach space or concerning operators is called a super property if the corresponding object in the nonstandard hull has this property. We already know the notion of superreflexivity. Moreover, compactness of an operator is also a super property (see Proposition 4.4.1). We now turn to two other notions which are super properties. Recall that an operator $T \in L(E)$ is called uniformly stable if $\{T^n : n \in \mathbb{N}\}$ is relatively compact with respect to the operator norm. It is called stable if the orbits $o(T, x) := \{T^n x : n \in \mathbb{N}\}$ are relatively norm compact for every $x \in E$.

Definition 4.6.15 Let $E$ be a Banach space and let $T$ be an operator whose powers are uniformly bounded ($\sup_n \|T^n\| < \infty$).
(i) $T$ is called superstable if $\hat{T}$ is stable on $\hat{E}$.
(ii) $T$ is called superergodic if $\hat{T}$ is mean ergodic.

Recall first of all that in Banach spaces precompactness and relative compactness are equivalent notions, and furthermore that a sequence $(u_n)_n$ in a Banach space $F$, say, is precompact if
\[ \forall \varepsilon > 0\ \exists L \in \mathbb{N}\ \forall n \in \mathbb{N}\ \big[ d(u_n, \{u_1, \ldots, u_L\}) < \varepsilon \big]. \tag{4.13} \]
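This formula is better known than the following equivalent one:
\[ \forall \varepsilon > 0\ \forall \varphi \in \mathbb{N}^{\mathbb{N}}\ \exists L \in \mathbb{N}\ \big[ d(u_{\varphi(L)}, \{u_1, \ldots, u_L\}) < \varepsilon \big]. \tag{4.14} \]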
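For the reader's convenience, here is one way to see the equivalence of (4.13) and (4.14); this short argument is a sketch added for illustration and is not taken from the cited sources.

```latex
\textbf{(4.13) $\Rightarrow$ (4.14).} Fix $\varepsilon > 0$ and $\varphi \in \mathbb{N}^{\mathbb{N}}$,
and choose $L$ as in (4.13). Applying (4.13) with $n = \varphi(L)$ gives
\[
  d\bigl(u_{\varphi(L)}, \{u_1, \ldots, u_L\}\bigr) < \varepsilon .
\]

\textbf{(4.14) $\Rightarrow$ (4.13).} Suppose (4.13) fails for some $\varepsilon > 0$, i.e.
\[
  \forall L \in \mathbb{N}\ \exists n \in \mathbb{N}\
  \bigl[\, d(u_n, \{u_1, \ldots, u_L\}) \ge \varepsilon \,\bigr].
\]
Define $\varphi \in \mathbb{N}^{\mathbb{N}}$ by
\[
  \varphi(L) := \min\bigl\{ n \in \mathbb{N} : d(u_n, \{u_1, \ldots, u_L\}) \ge \varepsilon \bigr\}.
\]
Then $d\bigl(u_{\varphi(L)}, \{u_1, \ldots, u_L\}\bigr) \ge \varepsilon$ for every $L \in \mathbb{N}$,
so (4.14) fails for this $\varepsilon$ and this $\varphi$.
```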
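Let $B_E$ be the closed unit ball of the Banach space $E$. Let $\varepsilon > 0$ be arbitrary and let $\varphi \in \mathbb{N}^{\mathbb{N}}$ be a given sequence. Then for $m, k \in \mathbb{N}$ with $k \le m$ we define $A_{\varepsilon,\varphi}(m, k) = \{x \in B_E : \|T^{\varphi(m)} x - T^k x\| < \varepsilon\}$. Set
\[ A_{\varepsilon,\varphi}(m) = \bigcup_{k=1}^{m} A_{\varepsilon,\varphi}(m, k) = \{x \in B_E : d(T^{\varphi(m)} x, \{T x, T^2 x, \ldots, T^m x\}) < \varepsilon\}. \]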
With the aid of the two equivalent formulas (4.13) and (4.14) it is not too hard to prove the following two statements:
(i) $T$ is uniformly stable iff for all $\varepsilon > 0$ and all sequences $\varphi$ there exist $m \in \mathbb{N}$ and $k \le m$ such that $B_E = A_{\varepsilon,\varphi}(m, k)$.
(ii) $T$ is stable iff for all $\varepsilon > 0$ and all sequences $\varphi$ we have $B_E = \bigcup_{m \in \mathbb{N}} A_{\varepsilon,\varphi}(m)$.
With this in mind we now give a characterization of superstable operators which is independent of the nonstandard hull.

Theorem 4.6.16 The following assertions are equivalent:
(i) $T$ is superstable.
(ii) $\hat{T}$ is superstable.
(iii) For all $\varepsilon > 0$ and all $\varphi \in \mathbb{N}^{\mathbb{N}}$ there exists an $L \in \mathbb{N}$ such that
\[ B_E = \bigcup_{m=1}^{L} A_{\varepsilon,\varphi}(m). \]
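Remark 4.6.17 Condition (iii) and the discussion above show the position of the notion of superstability among the other notions of stability.

Proof (i) → (iii): By definition $\hat{T}$ is stable. Using formula (4.14) we obtain
\[ \forall \varepsilon > 0\ \forall \varphi \in \mathbb{N}^{\mathbb{N}}\ \forall \hat{x} \in B_{\hat{E}}\ \exists m \in \mathbb{N}\ \big[ d(\hat{T}^{\varphi(m)} \hat{x}, \{\hat{T}\hat{x}, \ldots, \hat{T}^m \hat{x}\}) < \varepsilon \big]. \]
We fix $\varepsilon$ and $\varphi$. Then we can "lift" the remainder of the formula and get
\[ \forall x \in {}^*B(E)\ \exists^{st} m \in \mathbb{N}\ \big[ {}^*d({}^*T^{\varphi(m)} x, \{{}^*T x, \ldots, {}^*T^m x\}) < \varepsilon \big], \]
where $\exists^{st}$ means "there exists a standard". But this latter formula is equivalent to
\[ \exists^{st\,fin} M \subset \mathbb{N}\ \forall x \in {}^*B(E)\ \exists m \in M\ \big[ {}^*d({}^*T^{\varphi(m)} x, \{{}^*T x, \ldots, {}^*T^m x\}) < \varepsilon \big] \]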
(see Nelson's algorithm [44]), where $\exists^{st\,fin} M$ means "there exists a standard finite set $M$". Now by the Transfer Principle this latter formula holds in our standard model (obviously without the exponent "standard"). Now set $L = \max(M)$. Then the following formula holds:
\[ \exists L \in \mathbb{N}\ \forall x \in B(E)\ \exists m \le L\ \big[ d(T^{\varphi(m)} x, \{T x, \ldots, T^m x\}) < \varepsilon \big], \]
which is equivalent to (iii).

(iii) → (i): Let $\varepsilon, \varphi$ be given. Then (iii) holds also for $\eta = \varepsilon/2$. Moreover, $\widehat{{}^*A_{\eta,\varphi}(m)} \subset A^{\hat{E}}_{\varepsilon,\varphi}(m)$, where the upper index $\hat{E}$ means the set formed in $\hat{E}$ for $\hat{T}$. The Transfer Principle yields
\[ B_{\hat{E}} = \bigcup_{m=1}^{L(\eta)} \widehat{{}^*A_{\eta,\varphi}(m)} \subset \bigcup_{m=1}^{L(\eta)} A^{\hat{E}}_{\varepsilon,\varphi}(m), \]
which implies that $\hat{T}$ is stable by the paragraph preceding the theorem. So $T$ is superstable.

(i) → (ii): (i) implies (iii), which gives $B(\hat{E}) = \bigcup_{m=1}^{L} A^{\hat{E}}_{\varepsilon,\varphi}(m)$ (see the proof of (iii) → (i)). But this in turn yields (ii) because of (iii) → (i). The remainder is obvious.

The importance of superstability is underlined by the following theorem:

Theorem 4.6.18 ([49]) (i) If $T$ is superstable then $\{z \in \sigma(T) : |z| = 1\} =: \sigma_1(T)$ is at most countable.
(ii) If $\sigma_1(T)$ is at most countable and $E$ is superreflexive then $T$ is superstable.

For the proof we refer to [49].

In a similar manner we can characterize superergodicity (see [80]). First of all, every power bounded operator $T$ on a reflexive Banach space $E$ is necessarily ergodic, because the unit ball is weakly compact. Hence every power bounded operator $T$ on a superreflexive Banach space $E$ is superergodic, because the nonstandard hull $\hat{E}$ of $E$ is reflexive by Theorem 4.3.27. So the notion makes sense only in non-superreflexive spaces.

Let $T$ be a power bounded operator on the Banach space $E$. We set $M_n(T) = \frac{1}{n} \sum_{k=0}^{n-1} T^k$. Moreover, for every $\varepsilon > 0$ and for every sequence $\varphi \in \mathbb{N}^{\mathbb{N}}$ set
\[ B_{\varepsilon,\varphi}(m) = \{x \in B_E : d(M_{\varphi(m)} x, \{M_1 x, \ldots, M_m x\}) < \varepsilon\}. \]
Similarly to our proof of Theorem 4.6.16 we obtain the following proposition:
Proposition 4.6.19 Let $T$ be a power bounded operator on the Banach space $E$. The following assertions are equivalent:
(i) $T$ is superergodic.
(ii) $\hat{T}$ is superergodic on $\hat{E}$.
(iii) For every $\varepsilon > 0$ and $\varphi \in \mathbb{N}^{\mathbb{N}}$ there exists $L \in \mathbb{N}$ such that $B_E = \bigcup_{m=1}^{L} B_{\varepsilon,\varphi}(m)$.
M. Yahdi [80] has shown by examples that the notion of superergodicity lies strictly between those of ergodicity and uniform ergodicity. Finally, let us point out that analogous results can also be obtained for strongly continuous semigroups, see [49].
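The Cesàro means $M_n(T)$ defined above are easy to experiment with in finite dimensions. The sketch below assumes the usual meaning of mean ergodicity (convergence of $M_n(T)x$ for every $x$) and takes a diagonal unitary $T$ on $\mathbb{C}^3$ as a toy example; the function name and the chosen eigenvalues are illustrative only.

```python
import cmath

def cesaro_mean_diag(eigs, x, n):
    """M_n(T) x = (1/n) * sum_{k=0}^{n-1} T^k x for the diagonal operator T = diag(eigs)."""
    out = []
    for lam, xi in zip(eigs, x):
        total, power = 0j, 1 + 0j
        for _ in range(n):
            total += power      # adds lam**k for k = 0, ..., n-1
            power *= lam
        out.append(total * xi / n)
    return out

if __name__ == "__main__":
    eigs = [1.0, cmath.exp(1j), -1.0]   # power bounded: every |eigenvalue| equals 1
    x = [1.0, 1.0, 1.0]
    for n in (10, 1000, 10000):
        # The means approach (1, 0, 0), the projection of x onto the fixed space of T.
        print(n, [abs(c) for c in cesaro_mean_diag(eigs, x, n)])
```

For an eigenvalue $\lambda \ne 1$ of modulus one, the geometric sum gives $|M_n(\lambda)| \le 2/(n|1 - \lambda|)$, which explains the decay of the last two coordinates.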
4.6.4 The Fixed Point Property

Due to lack of space we can only give a brief glimpse into this field. Almost since the beginning of its development, ultraproduct methods have been used heavily (see [31, 39, 60]). To our knowledge the only paper where nonstandard analysis is used is that by A. Wiśnicki [71]. We give a short report on the paper [69], which gives an interesting insight into the theory. In the following let $T$ be a self mapping on the bounded closed convex subset $C$ of the Banach space $E$. It is called nonexpansive if $\|T(x) - T(y)\| \le \|x - y\|$ for all $x, y \in C$. It is called a contraction, as usual, if it is Lipschitz continuous with Lipschitz constant $L(T) < 1$. The subset $C$ is said to have the fixed point property if every nonexpansive map on it has a fixed point. A Banach space $E$ has the fixed point property, or FFP for short, if every closed bounded convex subset has this property. In [39] B. Maurey used the ultraproduct technique to prove the fixed point property for every reflexive subspace of $L^1([0, 1])$. We start with the following lemma:

Lemma 4.6.20 Let $C$ be a bounded closed convex subset of the Banach space $E$, and let $T : C \to C$ be nonexpansive. Then there exists a sequence $(x_n)_n$ with $\lim_n \|T(x_n) - x_n\| = 0$. Such a sequence is called an approximate fixed point sequence.

Proof Fix a point $x_0$ and for every $n \in \mathbb{N}$ consider the contraction $T_n(x) = \frac{1}{n} x_0 + (1 - \frac{1}{n}) T(x)$. Let $x_n$ be its unique fixed point. Then $(x_n)_n$ is the desired sequence, since $\|T(x_n) - x_n\| = \frac{1}{n}\|x_0 - T(x_n)\| \le \frac{1}{n}\,\mathrm{diam}(C)$.

We need still another notion from the geometry of Banach spaces: a closed subset $A$ is called metrically convex if to every two points $x, y \in A$ there exists $z \in A$ such that $\|x - y\| = \|x - z\| + \|z - y\|$.
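The construction in the proof of Lemma 4.6.20 can be tried out on a toy example. The sketch below assumes $C$ is the closed unit disc in the Euclidean plane and $T$ a rotation (an isometry, hence nonexpansive); this particular $T$ does have a fixed point, so the example only illustrates how the sequence $(x_n)_n$ is produced and how the defect $\|T(x_n) - x_n\|$ decays. All names and parameters are illustrative.

```python
import math

THETA = 1.0  # rotation angle in radians (an arbitrary choice)

def T(p):
    """Rotation of the plane by THETA; maps the closed unit disc onto itself."""
    x, y = p
    c, s = math.cos(THETA), math.sin(THETA)
    return (c * x - s * y, s * x + c * y)

def T_n(p, n, x0):
    """The contraction T_n(x) = (1/n) x0 + (1 - 1/n) T(x) from the proof."""
    tx, ty = T(p)
    a = 1.0 / n
    return (a * x0[0] + (1 - a) * tx, a * x0[1] + (1 - a) * ty)

def fixed_point_of_T_n(n, x0, tol=1e-13, max_iter=10 ** 6):
    """Banach iteration for the unique fixed point of T_n."""
    p = x0
    for _ in range(max_iter):
        q = T_n(p, n, x0)
        if math.dist(p, q) < tol:
            return q
        p = q
    return p

def defect(n, x0=(1.0, 0.0)):
    """||T(x_n) - x_n|| for the fixed point x_n of T_n."""
    x_n = fixed_point_of_T_n(n, x0)
    return math.dist(T(x_n), x_n)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, defect(n))   # bounded by diam(C) / n = 2 / n
```

The bound $\|T(x_n) - x_n\| \le \mathrm{diam}(C)/n$ used in the comment follows from $x_n = \frac{1}{n} x_0 + (1 - \frac{1}{n}) T(x_n)$.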
In the following we consider the nonstandard hull $\hat{C} \subset \hat{E}$ as well as the nonstandard hull $\hat{T}$ of $T$. Obviously $\hat{C}$ is bounded, closed, and convex if $C$ possesses these properties. Moreover, $\hat{T}$ is nonexpansive if $T$ is. Note that on account of the lemma above $\hat{T}$ always has fixed points. The next theorem is due to B. Maurey [39]. It is essential in this field:

Theorem 4.6.21 (B. Maurey) The set $\mathrm{Fix}(\hat{T})$ of fixed points of $\hat{T}$ is metrically convex.

Proof Let $\hat{a}, \hat{b}$ be two different fixed points of $\hat{T}$. Choose $a \in \hat{a}$, $b \in \hat{b}$ arbitrarily and set $\|a - b\| = \lambda$, as well as $\|a - T(a)\| = \xi_1$, $\|b - T(b)\| = \xi_2$. Then $\xi_3 := \xi_1 + \xi_2$ is infinitesimal, hence by Robinson's Sequential Lemma (Theorem 2.8.13), applied to the sequence $(s_n) = (n^2 \xi_3)_n$, there exists $N \approx \infty$ such that $N^2 \xi_3 < 1$. We set $\eta = 1/N$ and $D = \{z \in {}^*C : \|a - z\|, \|b - z\| \le \frac{\lambda}{2} + \eta\}$. $D$ is internal by Keisler's Internal Definition Principle (Theorem 2.8.4) and it is internally closed.

Claim: $x_0 := (a + b)/2 \in D$, so $D \ne \emptyset$.
Proof: $\|a - x_0\| = \frac{1}{2}\|a - b\| = \frac{\lambda}{2} = \|b - x_0\|$.

Claim: The map $S : D \to {}^*C$, given by $S(z) = \frac{\eta}{2}(a + b) + (1 - \eta)T(z)$, maps $D$ into itself.
Proof: We have
\[ a - S(z) = (1 - \eta)(a - T(a)) + (1 - \eta)(T(a) - T(z)) + \frac{\eta}{2}(a - b). \]
Using $\|T(a) - T(z)\| \le \|a - z\| \le \lambda/2 + \eta$, since $z \in D$, we obtain
\[ \|a - S(z)\| \le (1 - \eta)\xi_1 + (1 - \eta)\Big(\frac{\lambda}{2} + \eta\Big) + \frac{\eta}{2}\lambda = (1 - \eta)\xi_1 + (1 - \eta)\eta + \frac{\lambda}{2}. \]
But $\xi_1 < 1/N^2 = \eta^2$, hence $(1 - \eta)(\xi_1 + \eta) < \eta(1 - \eta)(1 + \eta) = \eta(1 - \eta^2)$, which implies $\|a - S(z)\| \le \lambda/2 + \eta$. Similarly one proves the other inequality, and the claim follows.

Applying the Transfer Principle to the contraction $S$ we get a unique fixed point $x$ of $S$. Then $x \in D$, and $\hat{x}$ is a fixed point of $\hat{T}$ (indeed $x - T(x) = \eta\,(\frac{a+b}{2} - T(x)) \approx 0$, since $C$ is bounded) satisfying
\[ {}^\circ\lambda = \|\hat{a} - \hat{b}\| \le \underbrace{\|\hat{a} - \hat{x}\|}_{\le\, {}^\circ\lambda/2} + \underbrace{\|\hat{x} - \hat{b}\|}_{\le\, {}^\circ\lambda/2} \le {}^\circ\lambda. \]
Definition 4.6.22 A Banach space $E$ has property $(S_m)$ if for every metrically convex subset $A$ of the unit sphere $S(E)$ with $\mathrm{diam}(A) \le 1$ there exists $\xi \in E'$ such that $\xi$ is strictly positive on $A$.
In [69] it is proved that a Banach space which is uniformly noncreasy (UNC for short) possesses property $(S_m)$. This class of Banach spaces contains the uniformly convex as well as the uniformly smooth spaces. A UNC Banach space is superreflexive and its nonstandard hull is also UNC. So in particular it possesses property $(S_m)$. Therefore the following result is important:

Theorem 4.6.23 (A. Wiśnicki [69]) Let $E$ be a superreflexive Banach space and assume that its nonstandard hull possesses $(S_m)$. Then $E$ has FFP.

In order to prove the theorem we have to recall some facts of the general fixed point theory: Suppose that $E$ does not have FFP. Then there exists a closed convex set $K$ with more than one point and a nonexpansive mapping $T$ on it without fixed points with the following properties:
(i) $\mathrm{conv}(T(K)) = K$.
(ii) $K$ is diametral, i.e. $\mathrm{diam}(K) = \sup_{y \in K} \|x - y\|$ for all $x \in K$.
(iii) Every approximate fixed point sequence $(x_n)_n$ satisfies $\lim_n \|x_n - x\| = \mathrm{diam}(K)$ for all $x \in K$.
By rescaling and shifting one can assume that $0 \in K$, moreover that $\mathrm{diam}(K) = 1$, and finally that, whenever $K$ is weakly relatively compact, there exists an approximate fixed point sequence converging weakly to 0. Now let $\hat{E}$ be the nonstandard hull in a polysaturated extension (see Sect. 2.9) of the standard world. Then using (iii) we can prove the following facts:
(a) $\mathrm{diam}(\mathrm{Fix}(\hat{T})) = \mathrm{diam}(\hat{K}) = \mathrm{diam}(K)$.
(b) $\hat{K}$ and $\mathrm{Fix}(\hat{T})$ are diametral, and $\hat{x} \in \mathrm{Fix}(\hat{T})$ implies $\|\hat{x}\| = \mathrm{diam}(K)$.
We can now use all that we have discussed above to prove Wiśnicki's theorem.

Proof Assume that $E$ does not possess the FFP. Then there exists a closed convex subset $K$, say, and a nonexpansive mapping $T$ on it without fixed points and with the properties listed above. Let $(x_n)_n$ be an approximate fixed point sequence of $T$. Without loss of generality we may assume that $(x_n)_n$ converges weakly to 0 (notice that all bounded closed convex subsets are weakly compact since $E$ is reflexive). Moreover, we can also assume $\mathrm{diam}(K) = 1$.
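Hence $\mathrm{Fix}(\hat{T})$ is a subset of the unit sphere which is metrically convex by Theorem 4.6.21. Since $\hat{E}$ possesses $(S_m)$ there exists an element $\eta \in (\hat{E})'$ which is strictly positive on $\mathrm{Fix}(\hat{T})$. By Theorem 4.3.27, $\eta = \hat{\xi}$ for an element $\xi \in \mathrm{Fin}({}^*E')$. Because of Theorem 4.3.15, $\xi$ is weakly nearstandard to a standard $\varphi \in E'$. Then $\langle x_N, \varphi \rangle \approx 0$ for all $N \approx \infty$. Applying Robinson's Sequential Lemma (Theorem 2.8.13) to the sequence $(s_n)_n$ given by $s_n := \langle x_n, \xi - \varphi \rangle$ (notice that $s_n \approx 0$ for all standard $n$) we obtain $N \approx \infty$ such that $\langle x_N, \xi \rangle - \langle x_N, \varphi \rangle = \langle x_N, \xi - \varphi \rangle \approx 0$, hence $0 \approx \langle x_N, \xi \rangle$ because $0 \approx \langle x_N, \varphi \rangle$. But this implies $\hat{x}_N \in \mathrm{Fix}(\hat{T})$ as well as $\langle \hat{x}_N, \hat{\xi} \rangle = 0$, a contradiction.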
Remark 4.6.24 As discussed already above, this theorem implies that all UNC Banach spaces possess FFP, a result due to S. Prus [46].
4.6.5 References to Further Applications of Nonstandard Analysis to Operator Theory

(1) There is an easy generalization of the notion of compact operators based on Proposition 4.4.1: Let $F$ be a closed subspace of $\hat{E}$ containing $E$, and let $T|F$ be the class of continuous operators $T$ such that $\hat{T}(\hat{E}) \subset F$. The problem is how to characterize these operators in standard terms. A fruitful application was given in [58], where $E$ is considered to be a Banach lattice and $F$ is the closed ideal of $\hat{E}$ generated by $E$. This application considerably generalized earlier results on operators on atomic Banach lattices. This research is continued in a very interesting manner by V. G. Troitsky, see [66].
(2) Let $E$ be an ordered Banach space, and let $(T_t)$ be a semigroup of positive operators. Assume that $(T_t)$ asymptotically dominates another semigroup $(S_t)$. More precisely, this means that $\lim_{t \to \infty} d(T_t x - S_t x, E_+) = 0$ holds for all $x$ in the positive cone $E_+$ of $E$. Which properties of $(T_t)$ are shared by $(S_t)$? Examples are stability, asymptotic almost periodicity and spectral properties (at least if additional assumptions are made concerning the underlying space $E$). For some important results see e.g. [14, 50–52].
(3) There are also some sophisticated applications to semigroups of operators on $C^*$-algebras and their applications to the mathematical theory of many particle systems, see e.g. [68, 72, 75].
4.6.6 Notes

The first to use Fréchet products in the context of spectral theory of strongly continuous semigroups seems to have been R. Derndinger [12]. Since then ultrapower techniques have been used quite frequently (see e.g. [43]). For more recent results see the papers mentioned in the previous subsection. The research on the connection between discrete convergence of operators and the convergence of spectra in this generality is due to the author [77, 78]. More concrete results have been obtained in the context of approximation of pseudo-differential operators by Gordon et al. [2]. For another application of nonstandard analysis to approximation theory see [74]. There is a far-reaching generalization of discrete convergence due to B. Silbermann and his school, see e.g. [59]. Nonstandard analysis, however, has not been applied there so far, though it seems very promising. An extensive nonstandard analytical treatment of concrete closed operators, e.g. of differential operators, is to be found in [1].
Acknowledgments I would like to thank Prof. C. W. Henson, University of Illinois at Champaign-Urbana, who gave me the important reference to the work of D. Dacunha-Castelle and J. L. Krivine, and who read very carefully an earlier version of Sects. 4.1 and 4.2; Prof. E. Gordon, University of Nishni Novgorod and now Eastern Illinois University, with whom I discussed his own approach to the theory of discrete approximation; Dr. H. Ploss, University at Vienna, for many helpful discussions on the theory of strongly continuous semigroups, some of which prevented me from unsightly errors; and Prof. Dr. Eduard Emel'yanov from the Middle East Technical University at Ankara, who carefully read the final version of the first edition, eliminating some more misprints and errors.
References 1. S. Albeverio, J.E. Fenstad, R. Høegh-Krohn, T. Lindstrøm, Nonstandard Methods in Stochastic Analysis and Mathematical Physics (Academic Press, Orlando, 1986) 2. S. Albeverio, E. Gordon, A. Khrennikov, Finite dimensional approximations of operators in the spaces of functions on locally compact abelian groups. Acta Appl. Math. 64, 33–73 (2000) 3. H. Ando, U. Haagerup, Ultraproducts of von Neumann algebras. J. Funct. Anal. 266, 6842–6913 (2014) 4. S.K. Berberian, Approximate proper vectors. Proc. Am. Math. Soc. 13, 111–114 (1962) 5. S. Baratella, S.A. Ng, Some properties of nonstandard hulls of Banach algebras. Bull. Belg. Math. Soc. Simon Stevin 18, 31–38 (2011) 6. A.R. Bernstein, A. Robinson, Solution of an invariant subspace problem of K.T. Mith and P.R. Halmos. Pac. J. Math. 16, 421–431 (1966) 7. S. Buoni, R. Harte, A.W. Wickstead, Upper and lower Fredholm spectra I. Proc. Am. Math. Soc. 66, 309–314 (1977) 8. A. Connes, Noncommutative Geometry (Academic Press, New York, 1994) 9. J.J.M. Chadwick, A.W. Wickstead, A quotient of ultrapowers of Banach spaces and semiFredholm operators. Bull. Lond. Math. Soc. 9, 321–325 (1977) 10. D. Dacunha-Castelle, J.L. Krivine, Application des ultraproducts a l’etude des espaces et des algebres de Banach. Studia Math. 41, 315–334 (1995) 11. M. Davis, Applied Nonstandard Analysis (Wiley, New York, 1977) 12. R. Derndinger, Über das Spektrum positiver Generatoren. Math. Z. 172, 281–293 (1980) 13. N. Dunford, J. Schwartz, Linear Operators Part I (Interscience Publishers, New York, 1958) 14. E.Y. Emel’yanov, U. Kohler, F. Räbiger, M.P.H. Wolff, Stability and almost periodicity of asymptotically dominated semigroups of positive operators. Proc. Am. Math. Soc. 129, 2633–2642 (2001) 15. P. Enflo, J. Lindenstrauss, G. Pisier, On the “three space problem”. Math. Scand. 36, 199–210 (1975) 16. E.I. Gordon, A.G. Kusraev, S.S. Kutadeladze, Infinitesimal Analysis (Kluver Academic Publisher, Dordrecht, 2002) 17. G. 
Greiner, Zur Perron-Frobenius Theorie stark stetiger Halbgruppen. Math. Z. 177, 401–423 (1981) 18. U. Groh, Uniformly ergodic theorems for identity preserving Schwarz maps on W -algebras. J. Oper. Theory 11, 395–402 (1984) 19. S. Heinrich, Ultraproducts in Banach space theory. J. Reine Angew. Math. 313, 72–104 (1980) 20. S. Heinrich, C.W. Henson, L.C. Moore, A note on elementary equivalence of C(K ) spaces. J. Symb. Log. 52, 368–373 (1987) 21. C.W. Henson, L.C. Moore, The nonstandard theory of topological vector spaces. Trans. Am. Math. Soc. 172, 405–435 (1972) 22. C.W. Henson, Nonstandard hulls of Banach spaces. Isr. J. Math. 25, 108–144 (1976) 23. C.W. Henson, L.C. Moore, Nonstandard Analysis and the theory of Banach spaces, in Nonstandard Analysis—Recent Developments, ed. by A.E. Hurd (Springer, Berlin, 1983), pp. 27–112
24. C.W. Henson, J. Iovino, Ultraproducts in analysis, in Analysisand Logic, ed. by C. Finet (e.) et al. Report on three minicourses given at the international conference "Analyse et logique" Mons, Belgium, 25-29 August, Cambridge University Press, London Mathematical Society Lecture Notes Series , vol. 262 (2002), pp. 1-110 25. E. Hewitt, K.H. Ross, Abstract Harmonic Analysis I (Springer, Berlin, 1963) 26. E. Hille, R.S. Phillips, Functional Analysis and Semi-groups, vol. 31 (American Mathematical Society Colloquium Publications, Providence, 1957) 27. T. Hinokuma, M. Ozawa, Conversion from nonstandard matrix algebras to standard factors of type I I1 . Ill. Math. J. 37, 1–13 (1993) 28. A.E. Hurd, P.A. Loeb, An Introduction to Nonstandard Real Analysis (Academic Press, Orlando, 1985) 29. R.C. James, Characterizations of reflexivity. Studia Math. 23, 205–216 (1963/64) 30. G. Janssen, Restricted ultraproducts of finite von Neumann algebras, in Contributions to NonStandard Analysis. Studies in Logic and the Foundations of Mathematics, vol. 69. ed. by W.A.J. Luxemburg, A. Robinson (North Holland, Amsterdam, 1972), pp. 101–114 31. M.A. Khamsi, B. Sims, Ultra-methods in metric fixed point theory, in Handbook of Metric Fixed Point Theory, ed. by W. Kirk, B. Sims (Kluwer Academic Publishers, Dordrecht, 2001), pp. 177–199 32. T. Kato, Perturbation Theory of Operators (Springer, Berlin, 1976) 33. A. Krupa, On various generalizations of the notion of an F-power to the case of unbounded operators. Bull. Pol. Acad. Sci. Math. 38, 159–166 (1990) 34. H.E. Lacey, The Isometric Theory of Classical Banach Spaces (Springer, New York, 1974) 35. J. Lindenstrauss, H. Rosenthal, The L p spaces. Isr. J. Math. 7, 325–349 (1969) 36. W.A.J. Luxemburg, A general theory of monads, in Applications of Model Theory to Algebra, Analysis and Probability, ed. by W.A.J. Luxemburg (Holt, Rinehart and Winston, New York, 1969), pp. 18–86 37. W.A.J. 
Luxemburg, Near–standard compact internal linear operators, in: Developments in nonstandard mathematics, eds. by N. J. Cutland et al. Pitman Res. Notes Math. Ser. vol. 336 (London, 1995) pp. 91–98 38. A. Martínez-Abejón, An elementary proof of the principle of local reflexivity. Proc. Am. Math. Soc. 127, 1397–1398 (1999) 39. B. Maurey, Points fixes des contractions de certains faiblement compact de L 1 . Seminaire d’analyses fonctionelle 1980–81 (Ecole Polytechnique, Palaiseau, 1981) 40. P. Meyer-Nieberg, Banach Lattices (Springer, New York, 1991) 41. G. Mittelmeyer, M.P.H. Wolff, Über den Absolutbetrag auf komplexen Vektorverbänden. Math. Z. 137, 87–92 (1974) 42. G.J. Murphy, C ∗ -Algebras and Operator Theory (Academic Press, Boston, 1990) 43. R. Nagel (ed.), One–Parameter Semigroups of Positive Operators. Lecture Notes in Mathematics vol. 1184 (Springer, Berlin, 1986) 44. E. Nelson, Internal set theory: a new approach to nonstandard analysis. Bull. Am. Math. Soc. 83, 1165–1198 (1977) 45. Siu-Ah Ng, Nonstandard methods in Functional Analysis. Lecture and Notes (World Scientific, Singapore, 2010) 46. S. Prus, Banach spaces which are uniformly noncreasy, in Proceedings of 2nd World Congress of Nonlinear Analysis (Athens 1996). Nonlinear Anal. 30, 2317–2324 (1997) 47. H.J. Reinhardt, Analysis of Approximation Methods for Differential and Integral Equations (Springer, Berlin, 1985) 48. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations (Springer, Berlin, 1983) 49. F. Räbiger, M.P.H. Wolff, Superstable semigroups of operators. Indag. Math., N.S. 6, 481–494 (1995) 50. F. Räbiger, M.P.H. Wolff, Spectral and asymptotic properties of dominated operators. J. Aust. Math. Soc. (Series A) 63, 16–31 (1997)
51. F. Räbiger, M.P.H. Wolff, On the approximation of positive operators and the behaviours of the spectra of the approximants. Integral Equ. Oper. Theory 28, 72–86 (1997) 52. F. Räbiger, M.P.H. Wolff, Spectral and asymptotic properties of resolvent-dominated operators. J. Aust. Math. Soc. 68, 181–201 (2000) 53. C.E. Rickart, General Theory of Banach Algebras (Van Nostrand, Princeton, 1960) 54. A. Robert, Functional analysis and NSA, in Developments in Nonstandard Mathematics, Pitman Research Notes in Mathematics Series, vol. 336, ed. by N.J. Cutland, et al. (Springer, London, 1995), pp. 73–90 55. B.N. Sadovskii, Limit-compact and condensing operators. Uspehi Math. Nauk 27, 81–146 (1972) 56. S. Sakai, C ∗ -Algebras (Springer, Berlin, 1971) 57. H.H. Schaefer, Banach Lattices and Positive Operators (Springer, Berlin, 1974) 58. A. Schepp, M.P.H. Wolff, Semicompact operators. Indag. Math. New Ser. 1, 115–125 (1990) 59. M. Seidel, B. Silbermann, Banach algebras of operator sequences. Op. Matrices 6, 385–432 (2012) 60. B. Sims, “Ultra”-techniques in Banach Space Theory. Queen’s Papers in Pure and Applied Mathematics , vol. 60 (Queen’s University, Kingston, 1982) 61. K.D. Stroyan, W.A.J. Luxemburg, Introduction to the Theory of Infinitesimals (Academic Press, New York, 1976) 62. Yeneng Sun, A Banach space in which a ball is contained in the range of some countable additive measure is superreflexive. Canad. Math. Bull. 33, 45–49 (1990) 63. D.G. Tacon, Generalized semi-fredholm transformations. J. Aust. Math. Soc. Ser. A 34, 60–70 (1983) 64. M. Takesaki, Theory of Operator Algebras (Springer, Berlin, 1978) 65. L.N. Trefethen, Pseudospectra of matrices, in: Numerical Analysis, Proceedings of the 14th Dundee Conference 1991 ed. by D.F. Griffiths, Pitman (London, 1993) pp. 234–264 66. V.G. Troitsky, Measures of noncompactness of operators on Banach lattices. Positivity 8, 165– 178 (2004) 67. G. 
Vainikko, Funktionalanalysis der Diskretisierungsmethoden (Teubner, Leipzig, 1976) 68. R. Werner, M.P.H. Wolff, Classical mechanics as quantum mechanics with infinitesimal ℏ. Phys. Lett. Ser. A 202, 155–159 (1995) 69. A. Wiśnicki, Towards the fixed point property for superreflexive spaces. Bull. Aust. Math. Soc. 64, 435–444 (2001) 70. A. Wiśnicki, On the structure of fixed point sets of nonexpansive mappings, in Proceedings of the 3rd Polish Symposium on Nonlinear Analysis. Lecture Notes in Nonlinear Analysis, vol. 3 (2002), pp. 169–174 71. A. Wiśnicki, The super fixed point property for asymptotically nonexpansive mappings. Fundam. Math. 217, 265–277 (2012) 72. M.P.H. Wolff, R. Honegger, On the algebra of fluctuation operators of a quantum meanfield system. Quantum Probab. Relat. Top. IX, 401–410 (1994) 73. M.P.H. Wolff, Spectral theory of group representations and their nonstandard hull. Isr. J. Math. 48, 205–224 (1984) 74. M.P.H. Wolff, An application of spectral calculus to the problem of saturation in approximation theory. Note di Matematica XII, 291–300 (1992) 75. M.P.H. Wolff, A nonstandard analysis approach to the theory of quantum meanfield systems, in Advances in Analysis, Probability and Mathematical Physics: Contributions of Nonstandard Analysis, ed. by S. Albeverio, W.A.J. Luxemburg, M.P.H. Wolff (Kluwer Academic Publisher, Dordrecht, 1995), pp. 228–246 76. M.P.H. Wolff, An introduction to nonstandard functional analysis, in Proceedings of Nonstandard Analysis–Theory and Applications, ed. by L.O. Arkeryd, N.J. Cutland, C.W. Henson (Kluwer Academic Publisher, Dordrecht, 1997), pp. 121–151 77. M.P.H. Wolff, On the approximation of operators and the convergence of the spectra of the approximants. Op. Theory: Adv. Appl. 13, 279–283 (1998)
78. M.P.H. Wolff, Discrete approximation of unbounded operators and the convergence of the spectra of the approximants. J. Approx. Theory 113, 229–244 (2001) 79. M.P.H. Wolff, Discrete approximation of compact operators and approximation of their spectra, in Nonstandard Methods and Applications in Mathematics, Lecture Notes in Logic, vol. 25, ed. by N.J. Cutland, Mauro Di Nasso, David A. Ross (A.K. Peters Ltd., Wellesley, 2006), pp. 224–231 80. M. Yahdi, Super-ergodic operators. Proc. Am. Math. Soc. 134, 2613–2620 (2006) 81. K. Yosida, Functional Analysis, 2nd edn. (Springer, Berlin, 1968)
Part III
Compactifications
Chapter 5
General and End Compactifications Matt Insall, Peter A. Loeb and Małgorzata Aneta Marciniak
M. Insall, Department of Mathematics and Statistics, Missouri University of Science and Technology, 400 W. 12th St., Rolla, MO 65409-0020, USA
P.A. Loeb, Department of Mathematics, University of Illinois, 1409 West Green Street, Urbana, IL 61801, USA
M.A. Marciniak, Department of Mathematics, Engineering and Computer Science, LaGuardia Community College, CUNY, 31-10 Thomson Avenue, Long Island City, NY 11101, USA
© Springer Science+Business Media Dordrecht 2015. P.A. Loeb and M.P.H. Wolff (eds.), Nonstandard Analysis for the Working Mathematician, DOI 10.1007/978-94-017-7327-0_5

5.1 Introduction

Here we follow the work in [7], where nonstandard analysis [9, 12, 15] is used to extend and simplify previous works in the literature on compactifications. Recall that the monad of a standard point x is the intersection of all nonstandard extensions of standard open neighborhoods of x. Points in the monad of some standard point are called nearstandard; a remote point is any point in the nonstandard extension of the space that is not nearstandard. A space has at least one remote point if and only if it is not compact.

A central theme of this chapter is the following: A compactification of a regular space is produced by any equivalence relation on the set of remote points. The new points of the resulting compactification are the equivalence classes of remote points. This yields a compact space containing the original point set as a dense subset. The relative topology on that dense subset is in some cases, however, weaker than the original topology.

Compactifications constructed in the literature often employ a continuous map from the original space into a compact space. Forming a compactification by attaching appropriate points is incisive; it allows a better understanding of the relationship between the original space and the set of compactifying points. For example, given a family of bounded real-valued functions on the original space, as was considered by the second author in [10], one can call two remote points equivalent if the nonstandard extension of each of the functions in the family has infinitesimal variation on the two-point set. This leads to compactifications such as the Stone-Čech compactification, requiring only regularity, not complete regularity, of the original space.

A continuum, i.e., a compact, connected, metrizable space, arises as a compactification in many natural situations. One approach to such compactifications is the theory of topological ends (see [3]). Using nonstandard analysis, we illuminate that theory and extend it. Simple examples of end compactifications are the two-point compactification of the real line and the one-point compactification of the complex plane.

The notion of topological ends was introduced by Freudenthal in [3] to formalize the intuitive notion of a “hole” in a noncompact space. Freudenthal used sequences of connected open sets having nonempty compact boundary with empty intersection of the sets in the sequence. That approach was recently extended by the third author in [11] and then by the first and third authors in [8] using nested nets of open sets with the same properties. The work in [7] on which this chapter is based shows that the special assumptions of these previous papers can be eliminated, so that our results apply to all regular, connected and locally connected topological spaces.
Moreover, the definition presented here extends the notion of ends in those works in the sense that some spaces with intuitive “holes” fail to have ends (in the senses of the older works) at the locations of the “holes”, but with the newer definition given here (from [7]) they have ends there and only there.¹ Among other results from [7] is the fact that a product space with two or more noncompact factors has only one end. Again, a simple example is the complex plane.

The literature [5, 6] shows varied uses of ends and various methods of producing end compactifications. For example, Halin (see [5]) introduced graph-theoretical ends using equivalence classes of rays at infinity. Those ends are, in general, distinct from Freudenthal’s ends. Other literature, described in [7], includes work of Diestel [1] and Goldbring [4] as well as work that foreshadows [7] such as that of Salbany and Todorov [15, 16], and of course Robinson [13].

¹ One might consider applying nonstandard methods to the notion of ends presented in [8], leading one to use enlargements of directed sets and nets. The first author looked at this briefly with Mr. Tom Cuchta, but preliminary investigations suggest that the resulting theory is essentially equivalent, at greater complexity cost, to the work in [7].
5.2 General Compactifications

Let $(Z, T)$ be a topological space. We assume that $Z$ is regular, by which we mean that for each $p \in Z$, the singleton set $\{p\}$ is a closed set, and for each open neighborhood $U$ of $p$ there is an open neighborhood $V \subseteq U$ of $p$ whose closure, $\overline{V}$, is contained in $U$. We also assume that $Z$ is noncompact. If there is a compact subset $K_0$ of our original space such that the interior of $K_0$ is not regular, then we assume that $Z$ is the complement of the interior of $K_0$. We now fix a $\kappa$-saturated nonstandard extension of $(Z, T)$, where $\kappa$ is greater than the cardinality of the topology $T$. Recall that compact subsets of $Z$ are closed in $Z$.

Since we are not requiring that $Z$ be a metric space, all reference to monads means topological monads, rather than metric monads. That is, for a point $p \in Z$, the monad of $p$ consists of those points in ${}^*Z$ that are in the nonstandard extension of every standard open neighborhood of $p$. In case the topology is determined by a metric, the metric monads and topological monads of standard points coincide, but this makes the notion of “remote point” require some care because one can define monads of nonstandard points as well, and these monads may be different for the metric case and the topological case.

Definition 5.2.1 Given $x \in {}^*Z$, $x$ is remote if $x$ is not in the topological monad of any standard point of $Z$. Equivalently, $x$ is remote if and only if it is not nearstandard.

Note that remote points need not be “far away” from standard points. For example, if a single standard point $p$ is removed from the complex plane, the remaining points of the monad of $p$ are then remote points in the nonstandard extension of the resulting space.

Let there be given an equivalence relation, $\sim$, on the set of remote points of ${}^*Z$. We often refer to the relationship $x \sim y$ between two remote points $x$ and $y$ by saying that $x$ and $y$ are equivalent. In general, equivalence classes are external.
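As a concrete illustration of Definition 5.2.1, consider the open unit interval with its usual topology. This example is our own choice and is not taken from [7]; it is only a sketch, using the standard fact that a limited hyperreal has a unique standard part and writing $\simeq$ for "infinitely close".

```latex
\begin{align*}
Z &= (0,1), \qquad {}^*Z = \{x \in {}^*\mathbb{R} : 0 < x < 1\}, \\
x \in {}^*Z \text{ is nearstandard in } Z
  &\iff \operatorname{st}(x) \in (0,1), \\
x \in {}^*Z \text{ is remote}
  &\iff x \simeq 0 \ \text{ or } \ x \simeq 1 .
\end{align*}
```

So the remote points here are the positive infinitesimals together with the points infinitely close to, but less than, 1; none of them is "far away" from the standard points of $Z$.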
In the following definition, the point set $Z$ is the set in the standard model.

Definition 5.2.2 A point of $Z$ will be called an s-point. An equivalence class of remote points under the given equivalence relation $\sim$ will be called an r-point; it will be denoted by $[x]_\sim$ (where $x$ is some representative of the equivalence class). Let $Y$ be the point set consisting of all s-points and r-points, topologized as follows:

1. The nonstandard extension of $Z$, ${}^*Z$, is supplied with the S-topology.
2. The mapping $\varphi : {}^*Z \to Y$ is given by
\[
\varphi(x) =
\begin{cases}
\operatorname{st} x & \text{if $x$ is nearstandard,} \\
[x]_\sim & \text{if $x$ is remote.}
\end{cases}
\]
3. The neighborhood filter base $B(y)$ at a point $y$ of $Y$ is given as follows:
\[
B(y) =
\begin{cases}
\{\varphi({}^*U) : U \in T,\ y = [x]_\sim \subseteq {}^*U\} & \text{if $y = [x]_\sim$ is an r-point,} \\
\{\varphi({}^*U) : U \in T,\ z \in U\} & \text{if $y = z$ is an s-point.}
\end{cases}
\]
In [7] it is shown that the above is a valid topologization of the set $Y$. That is, a set $O \subseteq Y$ is declared “open” if it contains a member of the filter base of each of its points, or equivalently, provided that
\[
O = \bigcup_{p \in O} \{\varphi({}^*U) : \varphi({}^*U) \in B(p),\ \varphi({}^*U) \subseteq O\}.
\]
Here is the relevant proposition with the proof left to the reader (or see [7]):

Proposition 5.2.3 The collection $B(p)$ at a point $p \in Y$ is in fact a filter base.

We let $T_Y$ denote the topology on $Y$, i.e., the collection of open sets, generated by the neighborhood filter bases. Note that the members of the neighborhood filter bases are not in general open sets. To make use of the topology $T_Y$, it is necessary to make certain observations about the topological properties of the space $(Y, T_Y)$.

Proposition 5.2.4 Let $A \subseteq Z$ be nonempty. Then any s-point that is the standard part of a point in ${}^*A$ is a point of the $T$-closure of $A$. Any s-point in the $T$-closure of $A$ is also a point of the $T_Y$-closure of $A$. In fact, $\varphi({}^*A)$ is a subset of the $T_Y$-closure of $A$.

Proof That standard parts of points of ${}^*A$ are points of the $T$-closure of $A$ is clear. Assume that $x \in Z$ is a point in the $T$-closure of $A$. Any $T_Y$-open set $W$ that contains $x$ also contains a set $\varphi({}^*U)$ for which $U \in T$ includes the point $x$. Consequently, some point $z$ is in $U \cap A$. Since $z \in \varphi({}^*U) \cap A \subseteq W \cap A$, $x$ is in the $T_Y$-closure of $A$. Thus any s-point in $\varphi({}^*A)$ is in the $T_Y$-closure of $A$. If $p$ is an r-point in $\varphi({}^*A)$, let $W \in T_Y$ contain $p$. There is a $V \in T$ with the equivalence class corresponding to $p$ contained in ${}^*V$ and $\varphi({}^*V) \subseteq W$. Because $p \in \varphi({}^*A)$, ${}^*A$ includes some remote point $y$ in the equivalence class $p = [y]_\sim$, and hence $\varphi(y) = p$. By the choice of $V$, $y \in {}^*V$. Since ${}^*V \cap {}^*A \neq \emptyset$, downward transfer yields that $V \cap A$ includes some point $x$, whence $x$ is in $A$ and also in $\varphi({}^*V) \subseteq W$. Thus every point in $\varphi({}^*A)$ is a point of the $T_Y$-closure of $A$.

There is a simple example in [7] of a space $Z$ with an equivalence relation on the remote points and an open set $U$ for which the $T_Y$-closure includes an r-point not in $\varphi({}^*U)$.

The next result, taken from [7], is essentially the fact that the space $Y$ we have constructed is a compactification of the original space $Z$.
The proof relies upon a result of Salbany and Todorov [15], noted in Chap. 3 of this book, that ${}^*Z$, with the S-topology, is a compact space.

Theorem 5.2.5 The map $\varphi$ is a continuous surjection from ${}^*Z$ onto $Y$, whence $Y$ is compact. Moreover, the point set $Z$ is dense in $Y$ supplied with the $T_Y$-topology. In general, the $T$-topology on $Z$ is stronger than the relative $T_Y$-topology on $Z$.

Definition 5.2.6 Let $A \subseteq {}^*Z$. Then $A$ is not equivalence class splitting if for each remote point $y \in A$, we have $[y]_\sim \subseteq A$.
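To see Definition 5.2.2 and Theorem 5.2.5 at work in the simplest case, one can declare all remote points equivalent. The following sketch is our own illustration, not a construction quoted from [7]. It assumes Robinson's criterion (a set $K \subseteq Z$ is compact if and only if every point of ${}^*K$ is in the monad of a point of $K$) together with the first statement of Proposition 5.2.4; the symbol $\infty$ for the single r-point is our notation.

```latex
\begin{align*}
& x \sim y \ \text{for all remote } x, y
  \quad\Longrightarrow\quad
  Y = Z \cup \{\infty\}, \qquad
  \infty := \{\text{all remote points of } {}^*Z\}; \\
& \text{for } U \in T: \quad
  \varphi({}^*U) \in B(\infty)
  \iff \infty \subseteq {}^*U \\
& \phantom{\text{for } U \in T: \quad \varphi({}^*U) \in B(\infty)}
  \iff \text{every point of } {}^*(Z \setminus U) \text{ is nearstandard} \\
& \phantom{\text{for } U \in T: \quad \varphi({}^*U) \in B(\infty)}
  \iff Z \setminus U \text{ is compact}.
\end{align*}
```

For the last equivalence, note that $Z \setminus U$ is closed, so the standard part of any nearstandard point of ${}^*(Z \setminus U)$ already lies in $Z \setminus U$. Under these assumptions the basic neighborhoods of the single r-point come from the open sets with compact complement, as in the familiar one-point compactification.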
As indicated in [7], the following central result follows immediately from the regularity of $(Z, T)$ and the definition of members of $T_Y$.

Theorem 5.2.7 Let $O \subseteq Z$ be open, and let $A_O \subseteq {}^*O$ be a set of remote points that is not equivalence class splitting. Assume that:
1. To each $x \in O$ corresponds a $V_x \in T$ with $x \in V_x$ such that the $T$-closure of $V_x$ is contained in $O$ and the remote points in ${}^*V_x$ are contained in $A_O$.
2. For every $a \in A_O$, there is a $T$-open set $V_a$ such that the $T$-closure of $V_a$ is contained in $O$, every remote point in ${}^*V_a$ is in $A_O$, and the equivalence class of $a$, $[a]_\sim$, is contained in ${}^*V_a$.
Let $W = \bigcup_{x \in O} \varphi({}^*V_x) \cup \bigcup_{a \in A_O} \varphi({}^*V_a)$. Then
1. $W \in T_Y$,
2. $W \cap Z = O$,
3. $x \in O \Rightarrow x \in W \subseteq \varphi({}^*O) \in B(x)$, and
4. $p = \varphi(a),\ a \in A_O \Rightarrow p \in W \subseteq \varphi({}^*O) \in B(p)$.
If the above hold for all open subsets of $Z$, then the $T$-topology on $Z$ is actually the relative $T_Y$-topology on $Z$.

The following two corollaries are taken from [7]. The easy proofs are left to the reader.

Corollary 5.2.8 Assume that $O \in T$ has the property that for each $x \in O$ there is a $V_x \in T$ with $x \in V_x$ such that the $T$-closure of $V_x$ is contained in $O$ and ${}^*V_x$ is not equivalence class splitting. Set $W = \bigcup_{x \in O} \varphi({}^*V_x)$. Then $W \in T_Y$ and $W \cap Z = O$. If each $O \in T$ satisfies the above assumption, then the $T$-topology on $Z$ equals the relative $T_Y$-topology on $Z$.

Corollary 5.2.9 If $T$ is a locally compact topology on $Z$, then the $T$-topology on $Z$ equals the relative $T_Y$-topology on $Z$.

Local connectivity of the space $(Y, T_Y)$ is at times an important consideration in what follows. In particular, we have the following result using the above notation.

Corollary 5.2.10 If $O \subseteq Z$ is an open and connected set, and if each of the sets $V_x$, $x \in O$, is connected, then $W$ is connected.

Proof Any set containing a connected set $S$ and contained in the closure of $S$ is connected. Therefore, the result follows from Proposition 5.2.4.
As noted in [7], the last corollary could also be proved using the following connectivity result. Proposition 5.2.11 The nonstandard extension of any connected open subset of Z is also connected, in the S-topology.
In [10], the second author of this chapter described methods of imbedding a Hausdorff space into a compact space so that each function in a given family of continuous functions on the original space has a continuous extension to the compactification, and the family of extensions separates the points of the “remainder”, i.e., the set of new points in the compact space. The following theorem from [7] describes how our approach here, using nonstandard methods and equivalence relations on the remote points, relates to that earlier work. The proof of this theorem, as presented in [7], is very straightforward in our setting. Moreover, applying this theorem to the class of all bounded, continuous, real-valued functions on a regular space $Z$ gives an extension of the Stone-Čech compactification construction to include regular non-completely regular spaces, so that it is not necessary to imbed $Z$ in a product space to obtain the compactification. As remarked in [7], application of this construction to R. Arens’ example of a regular but not completely regular space (see [2], p. 154) produces an example where the $T$-topology is stronger than the relative $T_Y$-topology.

Theorem 5.2.12 Let $Q$ be a collection of bounded, continuous, real-valued functions on $Z$, and call two remote points $x$ and $y$ of the enlargement ${}^*Z$ equivalent provided that for each $f \in Q$, ${}^*f(x) - {}^*f(y)$ is infinitesimal. Let $Y$ be the compactification for this equivalence relation. For each r-point $p$ in $Y$, and each $f \in Q$, set $\tilde{f}(p)$ equal to the standard part of ${}^*f(x)$ for all $x$ in the equivalence class corresponding to $p$. Then $\tilde{f} : Y \to \mathbb{R}$ is a continuous extension of $f$. Moreover the set of extensions of members of $Q$ separates the r-points of $Y$.
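A small worked instance may help fix the idea of Theorem 5.2.12. The choices $Z = \mathbb{R}$ and $Q = \{\arctan\}$ are ours, made only for illustration; $\tilde{f}$ denotes the extension of $f$ to the r-points, and $p_+$, $p_-$ are names we introduce for the two resulting equivalence classes.

```latex
\begin{align*}
& Z = \mathbb{R}, \qquad Q = \{\arctan\}, \qquad
  \text{remote points of } {}^*\mathbb{R} = \text{unlimited hyperreals}; \\
& x > 0 \text{ unlimited} \implies {}^*\!\arctan(x) \simeq \tfrac{\pi}{2},
  \qquad
  x < 0 \text{ unlimited} \implies {}^*\!\arctan(x) \simeq -\tfrac{\pi}{2}; \\
& \text{r-points: } \quad
  p_+ = \{x : x > 0 \text{ unlimited}\}, \qquad
  p_- = \{x : x < 0 \text{ unlimited}\}; \\
& \tilde{f}(p_+) = \tfrac{\pi}{2}, \qquad \tilde{f}(p_-) = -\tfrac{\pi}{2}
  \qquad (f = \arctan).
\end{align*}
```

In this sketch the compactification adds exactly two points, and the extended function separates them. If instead one takes $Q = \{x \mapsto 1/(1+x^2)\}$, every unlimited $x$ gives an infinitesimal value, so all remote points are equivalent and only one point is added.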
5.3 End Compactifications

The title of this section is not a political slogan. We work with a noncompact topological space $(Z, T)$ that is regular, connected and locally connected. If there is a compact subset $K_0$ of our original space such that, on the interior of $K_0$, regularity or local connectivity fails, then we assume that $Z$ is the complement of the interior of $K_0$. Recall that any component $W$ of the complement of a compact set is an open set because any $x \in W$ has a connected open neighborhood that is contained in $W$. Let $\kappa$ be a fixed cardinal number that is greater than the cardinality of the topology $T$. Throughout this and the next section, we will work in a $\kappa$-saturated nonstandard extension of $(Z, T)$. Let us say that a set $A \subseteq {}^*Z$ is remote provided that each of its points is a remote point in ${}^*Z$.

Definition 5.3.1 We say that two remote points $x, y$ in ${}^*Z$ are equivalent, and we write $x \sim y$, if there is a remote, internally connected set $A$ with $x \in A$ and $y \in A$.

The following is an easy observation, following from the fact that the union of two connected sets containing a common point is connected.
Proposition 5.3.2 The relation $\sim$ is an equivalence relation in the set of remote points.

In general, the equivalence classes under this equivalence relation are external. The following definition is our incisive revision of the notions of ends that have been previously considered by Freudenthal in [3], and by the first and third authors of the present chapter of this book in [8]. It is this notion, easily available and highly intuitive only in the nonstandard setting, that provides such a natural expansion of the results of those previous works on end compactifications.

Definition 5.3.3 We call the equivalence class containing a remote point $x \in {}^*Z$ the end of $Z$ represented by $x$.

We apply here the results of the previous section on general compactifications. In particular, we employ the notion of a not equivalence class splitting subset of ${}^*Z$.

Definition 5.3.4 We call the compactification corresponding to the equivalence relation, $\sim$, of this section the end compactification of $Z$. We call an open set $O \in T$ non end-splitting, or just NES, if ${}^*O$ is not equivalence class splitting with respect to the equivalence classes forming ends.
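The definitions above can be checked by hand on the real line, which the Introduction cites as having a two-point end compactification. The following derivation is our own sketch. It assumes only the standard fact that the connected subsets of $\mathbb{R}$ are the intervals, so that by transfer the internally connected subsets of ${}^*\mathbb{R}$ are the internal intervals; the names $e_+$ and $e_-$ are ours.

```latex
\begin{align*}
& \text{Remote points of } {}^*\mathbb{R}: \ \text{the unlimited hyperreals.} \\
& 0 < x < y \text{ unlimited}
  \implies [x, y] \text{ is a remote, internally connected set}
  \implies x \sim y. \\
& x < 0 < y \text{ unlimited}
  \implies \text{every internal interval containing } x \text{ and } y
  \text{ contains } 0
  \implies x \not\sim y. \\
& \text{Ends of } \mathbb{R}: \quad
  e_+ = \{x : x > 0 \text{ unlimited}\}, \qquad
  e_- = \{x : x < 0 \text{ unlimited}\}.
\end{align*}
```

Both classes are external, in line with the remark after Proposition 5.3.2, and attaching them to $\mathbb{R}$ gives the two-point compactification.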
Example 5.3.5 Let Z = 0, 1/3 ∪ 2/3, 1 × [0, 1] in the real plane, and let $T$ be the subspace topology on $Z$. As explained in [7], it follows that
1. $Z$ has a unique end, $e$.
2. $e$ is the set of all points in the nonstandard extension of the square with positive but infinitesimal second coordinate and horizontal coordinate between $1/3$ and $2/3$ but not in the monad of either.
3. The standard points $(1/3, 0)$ and $(2/3, 0)$ in $Z$ cannot be separated from each other by disjoint open neighborhoods in $Y$.
4. The standard points $(1/3, 0)$ and $(2/3, 0)$ in $Z$ cannot be separated from $e$ by disjoint open neighborhoods in $Y$.
5. The topology $T$ is strictly stronger than the relative $T_Y$-topology on $Z$.
6. The net method of [8] does not work to define the end in $Z$.

Recall that a regular space is locally compact if and only if each of its compact subsets is contained in an open set for which the closure is compact. This standard fact is employed in the essentially standard proof of the following result (see [7]).

Theorem 5.3.6 Let $K$ be a compact subset of $Z$ contained in an open set $W \subseteq Z$ with compact closure $\overline{W}$. Then all but a finite number of components of $Z \setminus K$ are contained in the compact set $\overline{W}$.

It follows that if $Z$ is locally compact, then the complement of any compact subset of $Z$ has only a finite number of components with nonstandard extension containing remote points. The following example involves a connected and locally connected space that is not locally compact.
Example 5.3.7 Let
\[
Z = \{0\} \cup \bigcup_{n \in \mathbb{N}} I_n,
\]
where for each natural number $n$, $I_n = (0, 1_n]$ is a homeomorphic copy of the half-open interval $(0, 1]$. We take as a base for the neighborhood system of each $p \in I_n$ the usual open base inherited from the real line. A typical element of the base for the neighborhood system of $0$ is given by a standard positive $\varepsilon < 1$; it is the set
\[
O_\varepsilon := \{0\} \cup \bigcup_{n \in \mathbb{N}} \{x \in I_n : x < \varepsilon\}.
\]
For each $H \in {}^*\mathbb{N} \setminus \mathbb{N}$, the non-infinitesimal points in $I_H$ form an end. These are the only ends. The monad of $0$ is given by
\[
\mu(0) = \{0\} \cup \bigcup_{n \in {}^*\mathbb{N}} \{x \in I_n : x \simeq 0\}.
\]
The nonstandard extension of any standard open neighborhood $U$ of $0$ in $Z$ has nonempty intersection with every end, so to form a member of $B(0)$, $\varphi({}^*U)$ must contain the nonstandard interval $I_H$ for every unlimited $H \in {}^*\mathbb{N}$, and thus $I_n$ for all $n \geq m_U$ for some $m_U \in \mathbb{N}$. Therefore, the end compactification of $Z$ is not Hausdorff, but a Hausdorff quotient is obtainable by mapping every end to $0$. Note that the number of ends in the compactification depends on the cardinality of ${}^*\mathbb{N}$ in the selected nonstandard extension.

We give the statement and proof of this next proposition much as it is given in [7] for open sets. We formulate it more generally, but of course it still holds for open sets. The result demonstrates that Robinson’s nonstandard criterion for compactness can be ostensibly weakened for boundaries.

Proposition 5.3.8 Suppose $A \subseteq Z$ and that $p \in {}^*\partial A$ is in the monad of some standard point $x \in Z$. Then actually, $x \in \partial A$. It follows that the boundary $\partial A$ of a nonempty subset $A$ of $Z$ is compact if and only if every point $\alpha \in {}^*\partial A$ is in the monad of some standard point of $Z$.

Proof Under the conditions of the proposition, let $U$ be any standard open neighborhood of $x$; then $p \in {}^*U$. Since $p \in {}^*\partial A$, the open set ${}^*U$ contains points both inside and outside ${}^*A$, so by downward transfer, $U$ contains points both inside and outside $A$. It follows that $x \in \partial A$, whence $p$ is nearstandard in ${}^*\partial A$. Therefore, if all points of ${}^*\partial A$ are in monads of standard points of $Z$, then they are in monads of standard points of $\partial A$, so $\partial A$ is compact. The converse clearly holds.

Connectedness and, in particular, components play a significant role in investigations relating ends to compactifications. We state without proof the following result
about non end-splitting components, leaving the proof to the reader, or the reader may refer to [7].

Proposition 5.3.9 Any connected component of the complement of a compact set is non end-splitting.

The assumption of local compactness simplifies the study of ends, so the following proposition is useful.

Proposition 5.3.10 If $Z$ is locally compact, and if $x$ and $y$ are non-equivalent remote points in ${}^*Z$, then there is a compact set $K \subseteq Z$ such that $x$ and $y$ are in the nonstandard extensions of different components of $Z \setminus K$.

Proof Suppose, to the contrary, that for every compact set $K$, $x$ and $y$ are in the nonstandard extension of the same component of $Z \setminus K$. Using local compactness and saturation, it follows that there is a nonstandard compact set $C$ containing all nearstandard points such that $x$ and $y$ are in the same component of ${}^*Z \setminus C$. That component is internally connected and remote, so $y \sim x$, contrary to assumption.

Since $Z$ is connected, the only nonempty subset of $Z$ with empty boundary is $Z$. Clearly, $Z$ is not end-splitting. In general, we have the following result for non-trivial open subsets of $Z$. We refer the reader to [7] for the proof of this result.

Theorem 5.3.11 A nonempty open set has a compact boundary if and only if it is non end-splitting.

In [8], the first and third authors defined ends of topological spaces in terms of nets of open sets ordered by reverse inclusion (i.e., $V \geq U$ iff $V \subseteq U$). The sets forming the net each have nonempty boundary and have empty intersection of the closures. For any such net, each member $U$ contains an open subset with compact boundary. That is, each such net is refined by some net of nonempty open sets, each having compact boundary.

Corollary 5.3.12 Let $\{U_\alpha\}_\alpha$ be a net of nonempty open subsets of $Z$ with compact boundary directed by downward inclusion and having empty intersection of the closures. For any remote point, $x$, in $\bigcap_\alpha {}^*U_\alpha$, the end represented by $x$ is contained in $\bigcap_\alpha {}^*U_\alpha$.
Proposition 5.3.13 Under the assumptions of the above corollary, let $x$ and $y$ be remote points in $\bigcap_\alpha {}^*U_\alpha$. Then $y$ is in the end represented by $x$. Therefore this end is the same as the end determined by the method of [8] using the net $\{U_\alpha\}_\alpha$.

Proof Because of saturation, $x$ and $y$ are in some common internally connected open set $V \subset \bigcap_\alpha {}^*U_\alpha$. If $z \in Z$ (i.e., if $z$ is a standard point of ${}^*Z$), then for some index $\alpha$, $z \in Z \setminus \overline{U_\alpha}$. Consequently, no point of $V$ is in the monad of $z$. That is, no point of $V$ is nearstandard. Thus, $x$ and $y$ are equivalent, and determine the same end.
The above corollary and proposition, both from [7], illuminate the relationship between the approach in [8] and the one here using nonstandard methods. In particular, if for any such net, each member $U$ contains an open subset with compact boundary (as is assumed in [8]), then the ends obtained in [8] and our ends defined using nonstandard methods are in a natural one-to-one correspondence. Example 5.3.5 demonstrates a case for which the assumption of [8] is not satisfied but nonstandard ends still can be used. Here is an example with an infinite number of “ends that are near”.

Example 5.3.14 Let $\mathbb{C}$ denote the complex plane, topologized with the usual topology, let $\mathbb{Q}$ denote the set of rational numbers on the real axis, and let $I$ be the open real interval $(0, 1)$. Set $Z = \mathbb{C} \setminus (\mathbb{Q} \cap I)$. The space $Z$ is not locally compact, and the nonstandard rational numbers strictly between 0 and 1 are not included in ${}^*Z$. Nonstandard methods sharpen the analysis in [8] of the space $Z$. As is true for the complex plane itself, nonstandard points outside the extension of every standard bounded set are remote and form a single end. This can be proved, for example, using our next section since $\mathbb{C}$ is topologically the product of two copies of the real line. Now if $\gamma$ is an irrational number between 0 and 1, then points of ${}^*\mathbb{C}$ that have not been removed in the monad of $\gamma$ are mapped by $\varphi$ onto $\gamma$. Points in ${}^*\mathbb{C}$ in the monad of a removed rational number $q \in (0, 1)$ are remote and form a single end. Open discs in $Z$ that are symmetric about the real axis and intersect that axis in a set with irrational minimum and maximum values are NES. The resulting end compactification is homeomorphic with the extended complex plane, i.e., the Riemann Sphere.
5.4 Product Spaces

Direct products of sets play a significant role in much of mathematics. For example, they are involved in defining the notion of an algebra, and hence in the study of topological algebras, including but not limited to, the classical structures of analysis, such as topological groups, topological rings, topological vector spaces, and such. Here we describe some connections of our previous sections to products of topological spaces and the compactifications of such product spaces.

For each $\alpha$ in some indexing set $I$, let $X_\alpha$ be a given topological space, and consider the corresponding space $\prod_{\alpha \in I} X_\alpha$ supplied with the product topology. Recall that the projection mapping defined from such a product space onto a factor of that space is an open mapping, meaning that it takes any open subset of the product onto an open subset of that factor. For any point $p$ in $\prod_{\alpha \in I} X_\alpha$, let $p_\alpha$ denote the projection of $p$ onto $X_\alpha$. That is, $p = (p_\alpha)_{\alpha \in I}$. Let $\operatorname{st}(I)$ denote the collection of all standard indices in ${}^*I$. For each $\alpha \in \operatorname{st}(I)$, let $\mu_\alpha(p_\alpha)$ be the monad of $p_\alpha$ in $X_\alpha$. The monad of $p$ in the product space is the product
5 General and End Compactifications
175
μα ( pa ) ×
X α.
α∈∗ Ist(I )
α∈st(I )
It is not difficult to see then that a point $p$ in the nonstandard extension of a product is remote if and only if there exists some $\alpha \in st(I)$ such that $p_\alpha$ is remote. To apply our results on ends, we now assume that each $X_\alpha$ is connected and locally connected.

Proposition 5.4.1 Fix $\beta \in I$ and $A \subseteq X_\beta$. Then $A \times \prod_{\alpha\neq\beta} X_\alpha$ is connected if and only if $A$ is a connected subset of $X_\beta$.

Proof Let $U$ and $V$ be a pair of nonempty open sets in the product space that form a disconnection of $A \times \prod_{\alpha\neq\beta} X_\alpha$. Note that the projection of $U$ and of $V$ onto $X_\beta$ and each $X_\alpha$ must be nonempty, and $U \cup V$ covers $A \times \prod_{\alpha\neq\beta} X_\alpha$. For each $\alpha \neq \beta$, the projections of $U$ and of $V$ onto $X_\alpha$ have a nonempty intersection since $X_\alpha$ is connected. Moreover, the projections on $X_\beta$ form a disconnection of $A$. The desired result follows.

Theorem 5.4.2 Assume that at least two spaces in the family $X_\alpha$, $\alpha \in I$, are noncompact. Then the product space $\prod_{\alpha\in I} X_\alpha$ has exactly one end.
Proof We reproduce the proof given in [7]. Given a remote point $p_\beta$ in ${}^*X_\beta$ where $\beta \in st(I)$, the product $p_\beta \times {}^*\prod_{\alpha\neq\beta} X_\alpha$ is internally connected and contains no nearstandard point. Similarly, given a remote point $p_\gamma$ in ${}^*X_\gamma$ where $\gamma \in st(I)$ and $\gamma \neq \beta$, the product $p_\gamma \times {}^*\prod_{\alpha\neq\gamma} X_\alpha$ is internally connected and contains no nearstandard point. These two connected sets have points in common, namely those with projection $p_\beta$ in ${}^*X_\beta$ and projection $p_\gamma$ in ${}^*X_\gamma$. Therefore, in terms of the equivalence relation for ends, all points of $p_\beta \times {}^*\prod_{\alpha\neq\beta} X_\alpha$ are equivalent to all points of $p_\gamma \times {}^*\prod_{\alpha\neq\gamma} X_\alpha$. Moreover, as noted above, a point is remote in ${}^*\prod_{\alpha\in I} X_\alpha$ if and only if the projection $p_\beta$ is remote for at least one $\beta \in st(I)$.
Now let p be any point in the nonstandard extension of the product, and assume that exactly one index, β, corresponds to a noncompact factor of the product space in question. Then p is remote if and only if pβ is remote. Also, it is plain to see that an internal set in the nonstandard extension of the product is (internally) connected and contains no nearstandard points if and only if its projection onto ∗ X β shares that property. Consequently, we have
Theorem 5.4.3 Suppose in the above setting $X_\alpha$ is compact for each $\alpha \neq \beta$ and $X_\beta$ is not compact. Then the number of ends for $\prod_{\alpha\in I} X_\alpha$ is the number of ends for $X_\beta$.
Remark 5.4.4 Work that remains in our research on compactifications includes applications to pre-topologies (see [14]), to box products and proximity spaces, and to compactifications of topological algebras.
References
1. R. Diestel, D. Kühn, Graph-theoretical versus topological ends of graphs. J. Comb. Theory Ser. B 87, 197–206 (2003)
2. J. Dugundji, Topology (Allyn and Bacon Inc., Boston, 1966)
3. H. Freudenthal, Über die Enden topologischer Räume und Gruppen. Math. Z. 33, 692–713 (1931)
4. I. Goldbring, Ends of groups: a nonstandard perspective. J. Log. Anal. 3(7), 1–28 (2011)
5. R. Halin, Über unendliche Wege in Graphen. Math. Ann. 157, 125–137 (1964)
6. H. Hopf, Enden offener Räume und unendliche diskontinuierliche Gruppen. Comment. Math. Helv. 16, 81–100 (1943/4)
7. M. Insall, P.A. Loeb, M.A. Marciniak, End compactifications and general compactifications. J. Log. Anal. 6(7), 1–16 (2014). doi:10.4115/jla.2014.6.7
8. M. Insall, M.A. Marciniak, Nets defining ends of topological spaces. Top. Proc. 40, 1–11 (2012)
9. H.J. Keisler, An infinitesimal approach to stochastic analysis. Mem. Am. Math. Soc. 48, 297 (1984)
10. P.A. Loeb, Compactifications of Hausdorff spaces. Proc. Am. Math. Soc. 22, 627–634 (1969)
11. M.A. Marciniak, Holomorphic extensions in smooth toric surfaces. J. Geom. Anal. 22, 911–933 (2012)
12. A. Robinson, Non-standard Analysis (North-Holland, Amsterdam, 1966)
13. A. Robinson, Compactification of groups and rings and nonstandard analysis. J. Symb. Log. 34, 576–588 (1969)
14. J. Šlapal, A digital pretopology and one of its quotients. Top. Proc. 39, 13–15 (2012)
15. S. Salbany, T. Todorov, Nonstandard analysis in topology: nonstandard and standard compactifications. J. Symb. Log. 65, 1836–1840 (2000)
16. S. Salbany, T. Todorov, Lecture Notes: Nonstandard Analysis in Topology, arXiv:1107.3323
Part IV
Measure and Probability Theory
Chapter 6
Measure Theory and Integration Horst Osswald
6.1 Introduction

Loeb measures have been applied in various fields of real analysis. In his fundamental paper [15] Peter Loeb has given the first applications to probability theory. Also developed at that time (and published later in [16]) was an application constructing representing measures in potential theory. (See Sect. 3.12.2.)

Fix an unlimited positive integer $H$ and put $T := \left\{\frac{1}{H}, \frac{2}{H}, \ldots, H\right\}$. This set $T$ is infinite, but ${}^*$finite, and can be interpreted as a "time line", which is closely related to the continuous time line $[0, \infty[$, because each real number between 0 and $\infty$ is infinitely close to some $t \in T$. The sample space ${}^*\mathbb{R}^T$ of our probability spaces is also fixed, where ${}^*\mathbb{R}^T$ is the internal set of all internal $H^2$-tuples of elements in ${}^*\mathbb{R}$. We also fix an internal stochastic process $B : {}^*\mathbb{R}^T \times T \to {}^*\mathbb{R}$, defined by $B(X, t) := \sum_{s\in T,\, s\le t} X_s$. However, in case of the infinite-dimensional Brownian motion, ${}^*\mathbb{R}$ is replaced by a hyperfinite-dimensional Euclidean space. The value $B(X, t)$ can be understood as the profit (or loss) at time $t$ during the game $X \in {}^*\mathbb{R}^T$. This profit or loss depends on the special probability measure we have to choose on the sample space ${}^*\mathbb{R}^T$.

Let us start with Peter Loeb's construction of Poisson processes from a hyperfinite model of tossing an extremely manipulated coin. Fix a standard $\beta > 0$. Let $l^1$ be the internal Borel probability measure on ${}^*\mathbb{R}$, concentrated on $\{0, 1\}$, where $l^1(\{0\}) = 1 - \frac{\beta}{H}$ and $l^1(\{1\}) = \frac{\beta}{H}$. Let $l$ be the internal $H^2$-fold product of $l^1$ on ${}^*\mathbb{R}^T$. Note that for each $X \in \{0, 1\}^T$
$$l(\{X\}) = \left(1 - \frac{\beta}{H}\right)^{m} \cdot \left(\frac{\beta}{H}\right)^{H^2 - m},$$
H. Osswald (B) Mathematisches Institut der Universität München, Theresienstr. 39, 80333 Munich, Germany e-mail:
[email protected] © Springer Science+Business Media Dordrecht 2015 P.A. Loeb and M.P.H. Wolff (eds.), Nonstandard Analysis for the Working Mathematician, DOI 10.1007/978-94-017-7327-0_6
where $m$ is the number of zeros occurring in $X$. Using that $(X, t) \mapsto \sum_{s\le t} \left(X_s - E_{l^1} x\right)$ is an $l$-square integrable martingale, we will see that for $l_L$-almost all $X \in {}^*\mathbb{R}^T$ and all $r \in [0, \infty[$
$$B_l(X, r) := \lim_{{}^\circ s \downarrow r} {}^\circ B(X, s) \qquad (+)$$
is well defined. Here $\lim_{{}^\circ s \downarrow r}$ denotes the right hand limit at $r$. Moreover, Loeb has shown that $B_l$ is a Poisson process of rate $\beta$. We will see that $B_l$ is a càdlàg process, which means that it is right continuous and has left hand limits $l_L$-almost surely.

The next convincing example of the usefulness of Loeb measures is Bob Anderson's [2] construction of Brownian motion from a hyperfinite model of tossing a coin that is now unbiased. Let $a^1$ be the internal Borel probability measure on ${}^*\mathbb{R}$, concentrated on $\left\{-\frac{1}{\sqrt{H}}, \frac{1}{\sqrt{H}}\right\}$, setting
$$a^1\left(\left\{\tfrac{1}{\sqrt{H}}\right\}\right) = \frac{1}{2} = a^1\left(\left\{-\tfrac{1}{\sqrt{H}}\right\}\right).$$
Let $a$ be the $H^2$-fold product of $a^1$ on ${}^*\mathbb{R}^T$. Then for all $X \in \left\{\frac{1}{\sqrt{H}}, -\frac{1}{\sqrt{H}}\right\}^T$
$$a(\{X\}) := \frac{1}{2^{(H^2)}}.$$
Anderson has shown that for $a_L$-almost all $X \in {}^*\mathbb{R}^T$ and all limited $t \in T$,
$$B_a : {}^*\mathbb{R}^T \times [0, \infty[ \to \mathbb{R}, \quad (X, {}^\circ t) \mapsto {}^\circ B(X, t) \qquad (++)$$
is well defined and continuous. Moreover, he has proved that $B_a$ is the one-dimensional Brownian motion. In view of this result, the process $B$ is not only a model for tossing an unbiased coin, but $B$ also describes a random walk, that is, the motion of a particle moving along the real axis in steps of size $\frac{1}{\sqrt{H}}$, with the probability of a left or a right step being equal to $\frac{1}{2}$. Of course, we could have defined $B_a$ using the equation under (+). Then $B_a$, defined by (++), would be a continuous version of $B_a$, defined by (+). The internal process $B$ is a Brownian motion for the internal measure $a$ with an infinitesimal error, see Exercise 11.5 in [21]. Cutland [6] has used a measure $C$ on ${}^*\mathbb{R}^T$ for which $B$ is a correct internal Brownian motion, see Proposition 11.2.1 in [21]: let $C$ be the internal $H^2$-fold product of the internal centered Gaussian measure $C^1$ on ${}^*\mathbb{R}$ of variance $\frac{1}{H}$, i.e.,
$$C^1(A) := \int_A e^{-\frac{H}{2} x^2}\, dx\, \sqrt{\frac{H}{2\pi}}.$$
Cutland has shown that the process BC := Ba is C L -a.s. well defined, continuous and the one-dimensional Brownian motion. Here is the difference between BC and Ba , BC is well defined C L -a.s., while Ba is well defined a L -a.s. However, both processes are the Brownian motion. Similar results hold for arbitrary one-dimensional Lévy processes. Let L and be two possibly very different Lévy processes. It is shown in Chap. 15 of [21] that there exist internal Borel probability measures m 1 , μ1 on ∗ R such that L can be identified with Bm and with Bμ . Here m and μ are again the H 2 -fold internal products of m 1 , μ1 , respectively, and Bm and Bμ are given by equation (+). Then Bm is well defined m L -almost surely and Bμ is well defined μ L -almost surely, where m L and μ L are the Loeb probability measures over m and μ. Note that the processes Bm and Bμ are standard parts almost surely of the same internal process B; almost surely with respect to possibly very different measures. The processes Bμ and can be identified and, of course, Bm and L, because they satisfy the same Lévy triplet. Lévy triplets characterize Lévy processes via the Lévy Khintchine formula, the Fourier transformation of Lévy processes. Lévy triplets for one-dimensional Lévy processes are of the form (a, C, ρ) where a, C ∈ R with C ≥ 0 and ρ is a Borel measure on R, the so-called Lévy measure. It is in general infinite and can be obtained from a suitable Loeb measure on ∗ R. While the Lebesgue measure is infinite in the neighborhood of infinite, the Lévy measure is infinite in the neighborhood of zero, if the Lévy measure itself is infinite. In case of Brownian motion the Lévy triplet is (a, C, ρ) = (0, 1, A → 0), in case of the Poisson process of rate β (a, C, ρ) = (β, 0, A −→ β · 1 A (1)). Each Lévy triplet can be satisfied by a process B◦ , defined under (+), where ◦ is the internal product on ∗ RT of a certain internal probability measure ◦1 on ∗ R. 
For details see the work of Lindstrøm [14] or [21], Chap. 15. In Sect. 6.2.6 we will see that Lebesgue measure on [0, ∞[ is equivalent to a Loeb counting measure on T . All the preceding expositions are striking examples of the usefulness of Loeb measures. In Chap. 7 Malliavin calculus will be studied for the infinite-dimensional Brownian motion, extending Cutland’s [6] and Cutland and Ng’s [7] results for the onedimensional case, following [21]. We also study Malliavin calculus for symmetric Poisson processes, which are, from a certain point of view, more subtle than the non-symmetric Poisson processes. In the following fix a superstructure V of cardinality κ such that at least the real numbers are individuals and a monomorphism ∗ from V into a κ+ saturated nonstandard model W . Given a measure μ, we will say that a property holds μ-a.e. (almost everywhere) if it holds outside a set of μ-measure 0. If μ is a probability measure, we shall write μ-a.s. (almost surely) instead of μ-a.e.
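The two coin-tossing models of this introduction can be imitated on a computer with a large but finite H. The following is only a rough sketch under that assumption: a finite H cannot reproduce the hyperfinite construction, and the function names (`partial_sums`, `poisson_coin_path`, `anderson_walk_path`, `value_at`) are hypothetical helpers introduced here for illustration. The path of partial sums plays the role of B(X, ·) on the grid T = {1/H, 2/H, ..., H}.

```python
import random

def partial_sums(increments):
    """Return the running sums, i.e. the values B(X, k/H) for k = 1, 2, ..."""
    path, total = [], 0.0
    for x in increments:
        total += x
        path.append(total)
    return path

def poisson_coin_path(H, beta, rng):
    """H*H tosses of a manipulated coin: 1 with probability beta/H, else 0."""
    return partial_sums(1 if rng.random() < beta / H else 0 for _ in range(H * H))

def anderson_walk_path(H, rng):
    """H*H tosses of a fair coin: steps +1/sqrt(H) or -1/sqrt(H)."""
    step = H ** -0.5
    return partial_sums(step if rng.random() < 0.5 else -step for _ in range(H * H))

def value_at(path, H, t):
    """Value of the path at the largest grid time k/H not exceeding t (t >= 1/H)."""
    return path[int(t * H) - 1]

if __name__ == "__main__":
    rng = random.Random(1)      # fixed seed, chosen arbitrarily
    H, beta = 50, 2.0           # illustrative finite stand-ins
    jumps = value_at(poisson_coin_path(H, beta, rng), H, 1.0)
    position = value_at(anderson_walk_path(H, rng), H, 1.0)
    print("number of jumps up to time 1:", jumps)
    print("walk position at time 1:", position)
```

For large H the number of jumps up to time 1 is binomially distributed with mean β, which is the finite shadow of the Poisson process of rate β, while the walk position at time 1 has mean 0 and variance 1, as one expects for Brownian motion.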
6.2 Loeb Measures

Loeb spaces $(\Lambda, L_\mu(\mathcal{C}), \mu_L)$ are, in general, finite σ-additive complete measure spaces in the usual standard sense. They enjoy the following properties. The σ-algebra $L_\mu(\mathcal{C})$ is generated by an internal algebra $\mathcal{C}$. By saturation, this algebra is rich enough to guarantee that each element of $L_\mu(\mathcal{C})$ is equivalent to an element of $\mathcal{C}$ (see (+) below). Moreover, the σ-additive measure $\mu_L$ on $L_\mu(\mathcal{C})$ is infinitely close to an internal measure $\mu$ defined on the generating set $\mathcal{C}$ (see (++) below). In particular, if $\mathcal{C}$ is ${}^*$finite, then $\mu_L$ may be infinitely close to a counting measure $\mu$. In Sect. 6.2.6 we will see that, for instance, the Lebesgue measure on $[0, \infty[^n$ is σ-isomorphic to a now infinite Loeb measure on a ${}^*$finite set. Therefore, the Lebesgue measure can be treated in a certain sense as though it were a counting measure.
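The remark that Lebesgue measure behaves like a counting measure can be made plausible with a finite grid. The sketch below is an illustration only, assuming a finite integer H in place of an unlimited one; the function name `counting_measure` is hypothetical. Each grid point k/H of T carries weight 1/H, and the weight of the grid points in an interval ]a, b] differs from the length b − a by at most 1/H.

```python
from fractions import Fraction

def counting_measure(H, a, b):
    """Total weight of the grid points k/H (1 <= k <= H*H) lying in ]a, b].

    Every grid point carries weight 1/H; exact rational arithmetic is used.
    """
    points = sum(1 for k in range(1, H * H + 1) if a < Fraction(k, H) <= b)
    return Fraction(points, H)

if __name__ == "__main__":
    H = 100
    a, b = Fraction(1, 3), Fraction(5, 2)
    weight = counting_measure(H, a, b)
    print("counting measure:", weight, " length:", b - a)
    print("difference:", abs(weight - (b - a)))
```

With an unlimited H the difference 1/H is infinitesimal, which is the informal reason why the standard part of the counting measure can recover interval lengths.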
6.2.1 Loeb Measure Spaces

Let $\Lambda$ be an internal nonempty set in $W$ and let $\mathcal{C}$ be an internal algebra on $\Lambda$. Recall that $W$ is a nonstandard model, introduced at the beginning of Sect. 2.10. By internal induction, one can show that for each $k \in {}^*\mathbb{N}$ and each internal $k$-tuple $(A_1, \ldots, A_k)$ in $\mathcal{C}$, $A_1 \cup \cdots \cup A_k \in \mathcal{C}$ and $A_1 \cap \cdots \cap A_k \in \mathcal{C}$. Assume that $\mu$ is an internal finitely additive measure defined on $\mathcal{C}$ with values in ${}^*[0, S]$ for some $S \in \mathbb{R}$. Again by internal induction, $\mu(A_1 \cup \cdots \cup A_k) = \mu(A_1) + \cdots + \mu(A_k)$ for each $k \in {}^*\mathbb{N}$ and each internal $k$-tuple $(A_1, \ldots, A_k)$ in $\mathcal{C}$ such that $A_i \cap A_j = \emptyset$ for $i \neq j$. Moreover, the set function
$${}^\circ\mu : \mathcal{C} \ni A \mapsto {}^\circ(\mu(A))$$
is a finitely additive measure on the algebra $\mathcal{C}$ with values in $[0, S]$. From Proposition 2.9.7 (1) it follows that ${}^\circ\mu$ is even σ-additive on the algebra $\mathcal{C}$. By Carathéodory's extension theorem, ${}^\circ\mu$ can be extended to a measure on the σ-algebra $\sigma(\mathcal{C})$ generated by $\mathcal{C}$, that is, $\sigma(\mathcal{C})$ is the intersection of all σ-algebras on $\Lambda$ containing the elements of $\mathcal{C}$. The completion of this measure is called the Loeb measure associated with $\mu$.

We shall now present a more informative construction of Loeb measures, combining both methods in the articles of Loeb [15, 17]: An arbitrary, possibly external, subset $N \subseteq \Lambda$ is called a $\mu_L$-nullset if
$$\mu^{outer}(N) := \inf\left\{{}^\circ\mu(A) \mid A \in \mathcal{C},\ N \subseteq A\right\} = 0.$$
Set
$$\mathcal{N}_{\mu_L} := \{N \subseteq \Lambda \mid N \text{ is a } \mu_L\text{-nullset}\}.$$
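Lemma 6.2.1
(a) Each subset of a $\mu_L$-nullset is a $\mu_L$-nullset.
(b) The set of $\mu_L$-nullsets is closed under countable unions.

Proof (a) is obvious. To prove (b), assume that $N_1, \ldots, N_k, \ldots \in \mathcal{N}_{\mu_L}$. In order to show that $\bigcup_{k\in\mathbb{N}} N_k \in \mathcal{N}_{\mu_L}$, fix an $\varepsilon \in \mathbb{R}^+$. For each $k \in \mathbb{N}$ there exists an $A_k \in \mathcal{C}$ with $N_k \subseteq A_k$ and ${}^\circ\mu(A_k) < \frac{\varepsilon}{2^k}$, thus $\mu(A_k) < \frac{\varepsilon}{2^k}$. Set $B_k := A_1 \cup \cdots \cup A_k$. Then $\mu(B_k) < \varepsilon$. Now we apply Theorem 2.10.18. Let $(B_k)_{k\in{}^*\mathbb{N}}$ be an internal extension of $(B_k)_{k\in\mathbb{N}}$; recall that $\mu(B_k) < \varepsilon$ for all $k \in \mathbb{N}$. By the Spillover Principle, there exists an unlimited $K \in {}^*\mathbb{N}$ with $B_k \subseteq B_{k+1}$ and $\mu(B_k) < \varepsilon$ for all $k \le K$. Since $\bigcup_{k\in\mathbb{N}} N_k \subseteq \bigcup_{k\in\mathbb{N}} A_k \subseteq B_K$ and $\mu(B_K) < \varepsilon$, we conclude that $\bigcup_{k\in\mathbb{N}} N_k$ is a $\mu_L$-nullset.

In the following we will use the techniques in the proof of Lemma 6.2.1 over and over again. Then we will often simply write: by saturation …

If $B \subseteq \Lambda$ and $A \in \mathcal{C}$, then $A$ is called a $\mu_L$-approximation of $B$ if the symmetric difference $A \Delta B := (A \setminus B) \cup (B \setminus A)$ of $A$ and $B$ is a $\mu_L$-nullset. We define
$$L_\mu(\mathcal{C}) := \{B \subseteq \Lambda \mid B \text{ has a } \mu_L\text{-approximation } A \in \mathcal{C}\}, \qquad (+)$$
$$\mu_L(B) := {}^\circ\mu(A) \text{ if } A \text{ is a } \mu_L\text{-approximation of } B. \qquad (++)$$

Theorem 6.2.2 (Loeb [15])
(1) $\mu_L$ is well defined, that is, $\mu_L$ does not depend on the chosen $\mu_L$-approximation.
(2) $L_\mu(\mathcal{C})$ is a σ-algebra with $\mathcal{C} \subseteq L_\mu(\mathcal{C})$.
(3) $\mu_L : L_\mu(\mathcal{C}) \to [0, S]$ is σ-additive.
(4) $L_\mu(\mathcal{C})$ is complete.
(5) A subset $B \subseteq \Lambda$ belongs to $L_\mu(\mathcal{C})$ iff for each $\varepsilon \in \mathbb{R}^+$ there exist $A, A' \in \mathcal{C}$ such that $A \subseteq B \subseteq A'$ and $\mu(A' \setminus A) < \varepsilon$.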
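Proof (1) Assume that $A$ and $A'$ are $\mu_L$-approximations of $B \subseteq \Lambda$. Then $A \Delta A' \in \mathcal{N}_{\mu_L}$, because $A \Delta A' \subseteq (A \Delta B) \cup (B \Delta A') \in \mathcal{N}_{\mu_L}$. Since $A \Delta A' \in \mathcal{C}$, we have $\mu(A \Delta A') \approx 0$. Since $A$ is the disjoint union of $A \setminus A'$ and $A' \setminus (A' \setminus A)$, we have $\mu(A) = \mu(A \setminus A') + \mu(A') - \mu(A' \setminus A) \approx \mu(A')$. Therefore, $\mu(A) \approx \mu(A')$, thus, ${}^\circ\mu(A) = {}^\circ\mu(A')$.

(2) Since each $A \in \mathcal{C}$ is a $\mu_L$-approximation of $A$, we obtain $\mathcal{C} \subseteq L_\mu(\mathcal{C})$. We will now show that $L_\mu(\mathcal{C})$ is a σ-algebra. Since $\Lambda \in \mathcal{C}$, $\Lambda \in L_\mu(\mathcal{C})$. Fix $B, B' \in L_\mu(\mathcal{C})$ with $\mu_L$-approximations $A, A' \in \mathcal{C}$ of $B, B'$, respectively. Then $A \setminus A'$ is a $\mu_L$-approximation of $B \setminus B'$ and $A \cap A'$ is a $\mu_L$-approximation of $B \cap B'$, because of the lemma and because
$$\left((A \setminus A') \Delta (B \setminus B')\right) \cup \left((A \cap A') \Delta (B \cap B')\right) \subseteq (A \Delta B) \cup (A' \Delta B') \in \mathcal{N}_{\mu_L}.$$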
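Fix a sequence $(B_k)_{k\in\mathbb{N}}$ in $L_\mu(\mathcal{C})$ such that $B_i \cap B_j = \emptyset$ for $i \neq j$, and for each $B_k$ fix a $\mu_L$-approximation $A_k$. We may assume that $A_i \cap A_j = \emptyset$ for $i \neq j$, because if $C_k$ is a $\mu_L$-approximation of $B_k$, $k \in \mathbb{N}$, then $C_k \setminus (C_1 \cup \cdots \cup C_{k-1})$ is a $\mu_L$-approximation of $B_k$. The latter is the case, because (recall that the $B_i$ are pairwise disjoint)
$$B_k \Delta \left(C_k \setminus (C_1 \cup \cdots \cup C_{k-1})\right) \subseteq \bigcup_{i=1}^{k} (B_i \Delta C_i) \in \mathcal{N}_{\mu_L}.$$
Since $\sum_{i=1}^{k} {}^\circ\mu(A_i) = {}^\circ\mu\left(\bigcup_{i=1}^{k} A_i\right) \le {}^\circ\mu(\Lambda) \le S$ for all $k \in \mathbb{N}$,
$$s := \sum_{k=1}^{\infty} {}^\circ\mu(A_k) \in [0, S].$$
By saturation (see the proof of Lemma 6.2.1), there is an $A \in \mathcal{C}$ with $\bigcup_{k\in\mathbb{N}} A_k \subseteq A$ and $\mu(A) < s + \frac{1}{k}$ for each $k \in \mathbb{N}$. It follows that $\mu(A) \approx s$. In order to show that $A$ is a $\mu_L$-approximation of $\bigcup_{k\in\mathbb{N}} A_k$, fix $\varepsilon \in \mathbb{R}^+$. We may choose $k \in \mathbb{N}$ such that $s - {}^\circ\mu(A^k) < \varepsilon$ where $A^k := A_1 \cup \cdots \cup A_k$. It follows that
$$A \Delta \bigcup_{k\in\mathbb{N}} A_k = A \setminus \bigcup_{k\in\mathbb{N}} A_k \subseteq A \setminus A^k \in \mathcal{C} \quad\text{and}\quad \mu(A \setminus A^k) \approx s - {}^\circ\mu(A^k) < \varepsilon.$$
Moreover, $A$ is a $\mu_L$-approximation of $\bigcup_{k\in\mathbb{N}} B_k$, because, by Lemma 6.2.1,
$$A \Delta \bigcup_{k\in\mathbb{N}} B_k \subseteq \bigcup_{k\in\mathbb{N}} (A_k \Delta B_k) \cup \left(A \Delta \bigcup_{k\in\mathbb{N}} A_k\right) \in \mathcal{N}_{\mu_L}.$$
This proves that $\bigcup_{k\in\mathbb{N}} B_k \in L_\mu(\mathcal{C})$.

(3) We shall now show that $\mu_L$ is σ-additive. Choose $B_k$ and $A_k$, $k \in \mathbb{N}$, and $A$ as in the proof of (2). Then
$$\mu_L\left(\bigcup_{k\in\mathbb{N}} B_k\right) = {}^\circ\mu(A) = s = \sum_{k=1}^{\infty} {}^\circ\mu(A_k) = \sum_{k=1}^{\infty} \mu_L(B_k).$$

(4) Assume that $\mu_L(B) = 0$ and $N \subseteq B$. Fix a $\mu_L$-approximation $A$ of $B$. Then $A$ is also a $\mu_L$-approximation of $N$, because $A, A \Delta B \in \mathcal{N}_{\mu_L}$ and $N \Delta A \subseteq A \cup (A \Delta B)$. It follows that $N \in L_\mu(\mathcal{C})$.

(5) "⇒" Fix $B \in L_\mu(\mathcal{C})$, a $\mu_L$-approximation $C \in \mathcal{C}$ of $B$, and an $\varepsilon \in \mathbb{R}^+$. Then there exists a $D \in \mathcal{C}$ with $C \Delta B \subseteq D$ and $\mu(D) < \varepsilon$. We now have $C \setminus D \subseteq B \subseteq C \cup D$ and $\mu\left((C \cup D) \setminus (C \setminus D)\right) \le \mu(D) < \varepsilon$.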
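"⇐" Suppose that for each $n \in \mathbb{N}$ there are $A_n, A'_n \in \mathcal{C}$ with $A_n \subseteq B \subseteq A'_n$ and such that $\mu(A'_n \setminus A_n) < \frac{1}{n}$. By saturation, there is an $A \in \mathcal{C}$ with $A_n \subseteq A \subseteq A'_n$ for each $n \in \mathbb{N}$. Fix $\varepsilon \in \mathbb{R}^+$ and $n \in \mathbb{N}$ with $\frac{1}{n} < \varepsilon$. Then $A \Delta B \subseteq A'_n \setminus A_n \in \mathcal{C}$ and $\mu(A'_n \setminus A_n)$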
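t. Then there exists a decreasing sequence $(k_n)_{n\in\mathbb{N}}$ in $T$ with $t < {}^\circ k_n$ and $\lim_{n\to\infty} {}^\circ k_n = t$ such that for each $n \in \mathbb{N}$ there is an $A_n \in \mathcal{C}_{k_n}$ with $\mu_L(A_n \Delta C) = 0$. By saturation, there exist a $k \in T$ with $r \le k \le k_n$ and an $A \in \mathcal{C}_k$ such that $\mu(A \Delta A_1) < \frac{1}{n}$ for each $n \in \mathbb{N}$, thus $\mu(A \Delta A_1) \approx 0$. Since $r \approx k$ and $\mu_L(A \Delta C) \le \mu_L(A \Delta A_1) + \mu_L(A_1 \Delta C) = 0$, $C \in c_r$.

(4) It is obvious that each $N \in \mathcal{N}_{\mu_L}$ belongs to $c_0$.

The filtration $(c_r)_{r\in[0,\infty[}$ on $L_\mu(\mathcal{C})$ is called the standard part of the internal filtration $(\mathcal{C}_t)_{t\in T}$ on $\mathcal{C}$, and the quadruple $(\Lambda, L_\mu(\mathcal{C}), \mu_L, (c_r)_{r\in[0,\infty[})$ is called the adapted Loeb space over $(\Lambda, \mathcal{C}, \mu, (\mathcal{C}_t)_{t\in T})$.

Problems

Let $\mu^1$ be an internal Borel probability measure on ${}^*\mathbb{R}$ and let $\mu$ be the $H^2$-fold product of $\mu^1$ on the set $\mathcal{B}$ of internal Borel sets on ${}^*\mathbb{R}^T$. We assume that $\mu$ is absolutely continuous with respect to the internal Lebesgue measure $\lambda^T$, that is, $\lambda^T(B) = 0$ implies $\mu(B) = 0$ for each $B \in \mathcal{B}$. Define for each $t \in T$
$$\mathcal{B}_t := \left\{B \times {}^*\mathbb{R}^{T\setminus T_t} \mid B \text{ is an internal Borel set in } {}^*\mathbb{R}^{T_t}\right\}.$$
Let $(b_r)_{r\in[0,\infty[}$ be the standard part of the internal filtration $(\mathcal{B}_t)_{t\in T}$.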
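Fix $r \in [0, \infty[$. Following Keisler's idea [12], we define an equivalence relation $\sim_r$ on ${}^*\mathbb{R}^T$ by setting
$$X \sim_r Y :\Leftrightarrow \forall s \in T \left(s \approx r \Rightarrow (X_i)_{i\le s} = (Y_i)_{i\le s}\right).$$
Now define
$$B \in b_r^\sim :\Leftrightarrow B \in L_\mu(\mathcal{B}) \text{ and } \forall X, Y \in {}^*\mathbb{R}^T \left(X \sim_r Y \Rightarrow (X \in B \Leftrightarrow Y \in B)\right).$$
For each subset $D \subseteq {}^*\mathbb{R}^T$ and each $t \in T$ we define $D^t := \pi_t[D] \times {}^*\mathbb{R}^{\{s\in T \mid t<s\}}$, where $\pi_t\left((X_s)_{s\in T}\right) := (X_s)_{s\le t}$.

(1) Prove that $b_r^\sim$ is a σ-algebra.
(2) Prove: If $s \le t$ in $T$, then $D^t \subseteq D^s$ and $D \subseteq D^s$.
(3) Prove: If $r \in [0, \infty[$ and $r < {}^\circ s$ and $D \in b_r^\sim$, then $D = D^s$.
(4) Prove: For each $r \in [0, \infty[$, $b_r = b_r^\sim \vee \mathcal{N}_{\mu_L}$.
(5) Let $f : {}^*\mathbb{R}^T \to \mathbb{R}$ be $L_\mu(\mathcal{B})$-measurable. Prove that $f$ is $b_r$-measurable if $f(X) = f(Y)$ for all $X, Y \in {}^*\mathbb{R}^T$ with $X \sim_r Y$.
(6) Prove that for each $b_r$-measurable $f : {}^*\mathbb{R}^T \to \mathbb{R}$ there exists a $b_r^\sim$-measurable $g : {}^*\mathbb{R}^T \to \mathbb{R}$ with $f = g$ $\mu_L$-a.s.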
6.3 Standard Integrability for Internal Measures

In this paragraph we establish the integration theory on Loeb spaces. Fix an internal measure space $(\Lambda, \mathcal{C}, \mu)$ such that $\mu(\Lambda)$ is provisionally limited, and fix the corresponding Loeb space $(\Lambda, L_\mu(\mathcal{C}), \mu_L)$. Obviously, if $\mathcal{C}$ is a ${}^*$σ-algebra we have the notion "$\mu$-integrability", which is nothing more than the usual integrability "copied" from the standard model to the nonstandard model by the Transfer Principle.

6.3.1 The Definition of S-integrability and Equivalent Conditions

Fix an internal Banach space $B$ with internal norm $|\cdot|$. A $\mathcal{C}$-measurable function $F : \Lambda \to B$ (by definition, $F$ is then internal) is called $S_\mu$-integrable if for all unlimited $K \in {}^*\mathbb{N}$
$$\int_{\{|F|\ge K\}} |F|\, d\mu \approx 0.$$
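The role of the tail integrals in this definition can be seen in a finite analogue. The sketch below is an illustration only and assumes a uniform probability measure on a finite set with H points instead of a hyperfinite space; the name `tail_integral` is a hypothetical helper. A function that puts all of its mass into one spike of height H keeps tail integral 1 for every threshold K up to H, so in the hyperfinite setting (H unlimited, K unlimited with K ≤ H) it would fail the condition, whereas a bounded function has vanishing tails.

```python
from fractions import Fraction

def tail_integral(values, K):
    """Integral of |F| over the set {|F| >= K}.

    `values` lists the (integer) values of F on a finite set carrying the
    uniform probability measure, so each point has weight 1/len(values).
    """
    n = len(values)
    return sum(Fraction(abs(v), n) for v in values if abs(v) >= K)

if __name__ == "__main__":
    H = 10_000
    spike = [H] + [0] * (H - 1)   # integral 1, concentrated on one point
    bounded = [3] * H             # limited function
    for K in (10, 100, 1000):
        print(K, tail_integral(spike, K), tail_integral(bounded, K))
```

The spike has integral 1 but standard part 0 almost everywhere, which is the kind of mismatch between internal integral and Loeb integral that S-integrability is designed to exclude.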
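For example, if $F$ is $\mathcal{C}$-measurable and limited, i.e., $|F|$ is limited, then $F$ is $S_\mu$-integrable. Fix a standard $p \in [1, \infty[$ and define
$$SL^p(\mu, B) := \left\{F : \Lambda \to B \mid |F|^p \text{ is } S_\mu\text{-integrable}\right\}.$$
In the case $p = 2$, we call the elements of $SL^2(\mu, B)$ $S_\mu$-square integrable. If $B = {}^*\mathbb{R}$, then we write $SL^p(\mu)$ instead of $SL^p(\mu, {}^*\mathbb{R})$.

Lemma 6.3.1 Assume that $F : \Lambda \to {}^*\mathbb{R}$ is $\mathcal{C}$-measurable and $F \ge 0$.
(a) If $\int F^p\, d\mu$ is limited, then $F$ is limited $\mu_L$-a.e.
(b) If $\int F^p\, d\mu \approx 0$, then $F \approx 0$ $\mu_L$-a.e.

Proof (a) Since $U := \{F \text{ is unlimited}\} = \bigcap_{n\in\mathbb{N}} \{F \ge n\}$, we see that $U \in L_\mu(\mathcal{C})$. Assume that $\varepsilon := \mu_L(U) > 0$. Then $\mu(\{F \ge n\}) > \frac{\varepsilon}{2}$ for each $n \in \mathbb{N}$. By the Spillover Principle, there is an unlimited $K \in {}^*\mathbb{N}$ with $\mu(\{F \ge K\}) > \frac{\varepsilon}{2}$. We obtain
$$\int F^p\, d\mu \ge \int_{\{F \ge K\}} F^p\, d\mu > K^p \cdot \frac{\varepsilon}{2},$$
which is unlimited. This contradiction proves (a).

(b) Now assume that $\int F^p\, d\mu \approx 0$. Then $n \cdot \int F^p\, d\mu < \frac{1}{n}$ for all $n \in \mathbb{N}$. By the Spillover Principle, there is an unlimited $K \in {}^*\mathbb{N}$ such that $K \cdot \int F^p\, d\mu < \frac{1}{K}$ is limited. By (a), applied to $K^{1/p} \cdot F$, the function $K^{1/p} \cdot F$ is limited $\mu_L$-a.e. Since $K^{1/p}$ is unlimited, $F \approx 0$ $\mu_L$-a.e.

Proposition 6.3.2 (Anderson [2], Loeb [15]) Let $F : \Lambda \to {}^*\mathbb{R}^+$ be $\mathcal{C}$-measurable. The following statements (1),…,(5) are equivalent:
(1) $F$ is $S_\mu$-integrable.
(2) $\lim_{n\to\infty} {}^\circ\int_{\{n \le F\}} F\, d\mu = 0$.
(3) For each $A \in \mathcal{C}$, $\int_A F\, d\mu$ is limited in any case, and $\int_A F\, d\mu \approx 0$ if $\mu(A) \approx 0$.
(4) $\int F\, d\mu$ is limited and for each $\varepsilon \in \mathbb{R}^+$ there exists a $\delta \in \mathbb{R}^+$ such that $\int_A F\, d\mu < \varepsilon$ for all $A \in \mathcal{C}$ with $\mu(A) < \delta$.
(5) There exists a function $g : \mathbb{N} \to \mathbb{N}$ such that for all $n \in \mathbb{N}$
$$\int_{\{g(n) \le F\}} F\, d\mu$$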
t ∧ Mt ≤ c} ∈ Ct . Mtτ ≤ c = i∈Tt
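In order to prove the martingale property, fix $t \in T$ with $t < H$ and $A \in \mathcal{C}_t$. Then, by rules for the conditional expectation and the martingale property,
$$E\,1_A\left(M^\tau_{t+\frac{1}{H}} - M^\tau_t\right) = E\,1_{A\cap\{\tau>t\}}\left(M_{t+\frac{1}{H}} - M_t\right) = E\,E^{\mathcal{C}_t}\,1_{A\cap\{\tau>t\}}\left(M_{t+\frac{1}{H}} - M_t\right) = E\,1_{A\cap\{\tau>t\}}\,E^{\mathcal{C}_t}\left(M_{t+\frac{1}{H}} - M_t\right) = 0.$$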
Later we will use Doob’s upcrossing result, which is also an application of stopping times in connection with martingales. Fix again a (Ct )t∈T -martingale M and numbers a < b in ∗ R. Define stopping times τ1 , . . . , τn , . . . as follows:
$$\tau_1(X) := \inf\{t \in T \mid M_t(X) \le a\},$$
$$\tau_2(X) := \inf\{t \in T,\ \tau_1(X) < t \mid M_t(X) \ge b\},$$
$$\tau_3(X) := \inf\{t \in T,\ \tau_2(X) < t \mid M_t(X) \le a\},$$
$$\tau_4(X) := \inf\{t \in T,\ \tau_3(X) < t \mid M_t(X) \ge b\},$$
and so on. Recall that $\inf\{\cdot\} = H + \frac{1}{H}$ if $\{\cdot\} = \emptyset$. Let $N(X)$ be the number of elements $i \in {}^*\mathbb{N}$ such that $\tau_i(X) \le H$. Now
$$U_{[a,b]}(X) := \begin{cases} \frac{N(X)}{2} & \text{if } N(X) \text{ is even,} \\ \frac{N(X)-1}{2} & \text{if } N(X) \text{ is odd,} \end{cases}$$
is called the number of upcrossings of the interval $[a, b]$ by $M$. The proof in standard terms of the following result, due to Doob, can be found in the book of Ash [3], Theorem 7.4.2. We use transfer of the standard result into the nonstandard setting.

Theorem 6.4.3 (Doob [8])
$$E\,U_{[a,b]} \le \frac{1}{b-a}\, E\,(M_H - a)^+,$$
where $y^+ := \max\{y, 0\}$.
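The definition of the upcrossing number through the alternating stopping times can be traced on a finite path. The following sketch is an illustration under the assumption of a finite time line and the fair ±1 random walk as martingale; the names `upcrossings` and `doob_check` are hypothetical helpers. It counts completed passages from a value ≤ a to a later value ≥ b, which equals N(X)/2 rounded down, and then compares both sides of the upcrossing inequality by enumerating all paths.

```python
from fractions import Fraction
from itertools import product

def upcrossings(path, a, b):
    """Number of upcrossings of [a, b] (with a < b) by a finite path."""
    count, below = 0, False
    for value in path:
        if not below and value <= a:
            below = True      # an odd stopping time tau_1, tau_3, ... is reached
        elif below and value >= b:
            below = False     # an even stopping time completes one upcrossing
            count += 1
    return count

def doob_check(n, a, b):
    """Exact values of E U_[a,b] and E (S_n - a)^+ / (b - a) for the fair
    +-1 walk S_1, ..., S_n; a and b are integers with a < b."""
    total_up, total_pos = 0, 0
    for steps in product((-1, 1), repeat=n):
        path, s = [], 0
        for x in steps:
            s += x
            path.append(s)
        total_up += upcrossings(path, a, b)
        total_pos += max(path[-1] - a, 0)
    paths = 2 ** n
    return Fraction(total_up, paths), Fraction(total_pos, paths * (b - a))

if __name__ == "__main__":
    lhs, rhs = doob_check(10, -1, 1)
    print("E U =", lhs, " bound =", rhs, " inequality holds:", lhs <= rhs)
```

The enumeration is feasible only for short walks (2^n paths); it is meant as a sanity check of the definition, not as a proof.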
6.4.2 The Maximum Inequality

The maximum inequality can be used to prove Doob's inequality (see the standard proof of Proposition 2.3.1 and Theorem 2.4.2 in [21]).

Proposition 6.4.4 (Doob [8]) Fix $p \in [1, \infty[$ and a non-negative submartingale $N$ such that $N_t \in L^p(\mu)$ for all $t \in T$. Then for each $c \ge 0$ and for each $t \in T$
$$c \cdot \mu\left\{\max_{s\in T_t} N_s^p \ge c\right\} \le E\,1_{\left\{\max_{s\in T_t} N_s^p \ge c\right\}}\, N_t^p.$$
6.4.3 Doob’s Inequality Doob’s inequality and the Burkholder Davis Gundy inequalities in the next section belong to the most important tools in stochastic analysis. Here are the results.
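Theorem 6.4.5 (Doob [8]) Fix an internal submartingale $M : \Lambda \times T \to {}^*\mathbb{R}^+_0$ and $p > 1$ and suppose that $M_H^p$ is integrable. Then
$$\left\|\max_{t\in T} M_t\right\|_p \le \frac{p}{p-1}\, \|M_H\|_p,$$
where $\|F\|_p$ is a shorthand for $\left(\int F^p\, d\mu\right)^{\frac{1}{p}}$.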
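Doob's inequality can be checked exactly on a small example. The sketch below is an illustration only: it assumes a finite time line and takes the non-negative submartingale |S_k| built from the fair ±1 walk S_k; the name `doob_lp_sides` is a hypothetical helper. Raising both sides of the inequality to the p-th power, it compares E max_k |S_k|^p with (p/(p−1))^p E |S_n|^p by enumerating all paths.

```python
from itertools import product

def doob_lp_sides(n, p):
    """Return (E max_k |S_k|^p, (p/(p-1))^p * E |S_n|^p) for the fair
    +-1 walk S_1, ..., S_n, computed by enumerating all 2^n paths (p > 1)."""
    lhs = rhs = 0.0
    for steps in product((-1, 1), repeat=n):
        s, running_max = 0, 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        lhs += running_max ** p
        rhs += abs(s) ** p
    paths = 2 ** n
    return lhs / paths, (p / (p - 1)) ** p * rhs / paths

if __name__ == "__main__":
    for p in (1.5, 2, 3):
        lhs, rhs = doob_lp_sides(10, p)
        print(f"p={p}: E max^p = {lhs:.4f} <= {rhs:.4f}")
```

For p = 2 the right-hand side is 4·E S_n² = 4n, so the example also shows how much room the constant p/(p−1) leaves for a short walk.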
6.4.4 The Burkholder Davis Gundy Inequalities

Recall that we call a function $\Phi : {}^*[0, \infty[ \to {}^*[0, \infty[$ with $\Phi(0) = 0$ strongly increasing if there exists a sequence $(a_n)_{n\in{}^*\mathbb{N}}$ in $[0, \infty[$ with $a_0 = 0$, $1 \le a_1$ and $4a_n < a_{n+1}$ such that for all $x \in [0, \infty[$ and all $n \in {}^*\mathbb{N}$,
$$\Phi(x) := (n + 1)x - (a_1 + \cdots + a_n) \quad\text{if } a_n \le x < a_{n+1}.$$
Moreover, we have used that $a_n$ is limited iff $n \in \mathbb{N}$, which is not necessary for the proof of the next result. For a standard proof with all details we refer to the book [21].

Theorem 6.4.6 (Burkholder, Davis and Gundy [5]) Fix $p \in [1, \infty[$. Then there exist standard real constants $c_p$ and $d_p$, depending on $p$, such that for all martingales $M$ on a discrete time line and all strongly increasing functions $\Phi$
$$E\,\Phi \circ \left(M_H^\sim\right)^p \le E\,\Phi \circ \left(c_p \cdot [M]_H^{\frac{p}{2}}\right) \quad\text{and}\quad E\,\Phi \circ [M]_H^{\frac{p}{2}} \le E\,\Phi \circ \left(d_p \cdot \left(M_H^\sim\right)^p\right).$$
6.4.5 S-integrability of Internal Martingales

Let us use the notation of the previous section. It is important to apply the internal versions of Doob's inequality and the Burkholder Davis Gundy inequalities to internal martingales concerning external properties, like S-integrability and S-continuity. Here is the first example:

Theorem 6.4.7 For each internal $(\mathcal{C}_t)_{t\in T}$-martingale $M$ and each $p \in [1, \infty[$ we obtain for all $\sigma \in \mathbb{N}$
$$[M]_\sigma^{\frac{p}{2}} \in SL^1(\mu) \quad\text{iff}\quad \left(M_\sigma^\sim\right)^p := \max_{s\in T_\sigma} |M_s|^p \in SL^1(\mu).$$

Proof Suppose that $[M]_\sigma^{\frac{p}{2}} \in SL^1(\mu)$. By Theorem 6.3.14, there exists a witness $\Phi$ for the $S_\mu$-integrability of $c_p \cdot [M]_\sigma^{\frac{p}{2}}$. By Theorem 6.4.6,
$$E\,\Phi \circ \left(M_\sigma^\sim\right)^p \le E\,\Phi \circ \left(c_p \cdot [M]_\sigma^{\frac{p}{2}}\right) \text{ is limited.}$$
Therefore, $\Phi$ is also a witness for the $S_\mu$-integrability of $\left(M_\sigma^\sim\right)^p$. It follows that $\left(M_\sigma^\sim\right)^p \in SL^1(\mu)$. The proof of the reverse implication is similar.
6.4.6 S-continuity of Internal Martingales In this section we will mention an important result, due to Hoover and Perkins, namely that, under a mild condition, a martingale is S-continuous a.s. if its quadratic variation is S-continuous a.e. To formulate this important theorem is quite simple, but the proof is not at all simple. See the proof of Theorem 10.14.2 in [21]. Theorem 6.4.8 (Hoover and Perkins [10]) Fix a (Ct )t∈T -martingale M such that EMt2 is limited for all limited t ∈ T . If [M] is S-continuous μ L -a.s., then M is S-continuous μ L -a.s. Hoover and Perkins also prove the reverse implication, but we do not need this result.
6.4.7 The Standard Part of Internal Martingales

Our aim now is to convert, under mild conditions, internal martingales, defined on the ${}^*$finite set $T$, to càdlàg standard martingales, defined on the continuous timeline $[0, \infty[$. Conversely, we lift standard martingales to internal ones. Recall that a function $f$, defined on $[0, \infty[$, is called càdlàg if the following conditions hold:
(C 1) $f$ is continuous from the right, which means that $\lim_{s\downarrow r} f_s = f_r$ for all $r \in [0, \infty[$.
(C 2) $f$ has left hand limits, which means that $\lim_{s\uparrow r} f_s$ exists for all $r \in\, ]0, \infty[$.

We start with a lemma which constructs from a certain internal function $F : T \to {}^*\mathbb{R}$ a càdlàg function ${}^\circ F : [0, \infty[ \to \mathbb{R}$, which may be called the standard part of $F$.

Lemma 6.4.9 Fix an internal $F : T \to {}^*\mathbb{R}$ such that $F_t$ is limited for all limited $t \in T$. Set $F_0 := 0$. Fix $r \in [0, \infty[$.
(a) Then lim◦ s↓r
◦ (F ) s
exists iff ◦
sup
lim
k→∞
r
0 in case lim◦ s↑r ◦ (Fs ) exists,
◦l
F
r−
◦
:= ◦ lim
s↑r,s∈T
(Fs ) and
◦l
F
0−
:= 0.
Pay attention for a short moment to the difference between ◦ F r − :=
lim
s↑r,s∈[0,r [
◦l F r − and
◦ F s.
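(b) There exists an $\tilde r \in {}^*]r, \infty[$, $\tilde r \approx r$, such that $({}^\circ F)_r \approx F_s$ for all $s \in T$ with $s \ge \tilde r$ and $s \approx r$. If $({}^{\circ l}F)_{r-}$ exists with $r > 0$, then there exists an $\tilde r \in {}^*]0, r[$, $\tilde r \approx r$, such that $({}^{\circ l}F)_{r-} \approx F_s$ for all $s \in T$ with $s \le \tilde r$ and $s \approx r$.
(c) ${}^\circ F$ is càdlàg, provided $({}^{\circ l}F)_{r-}$ also exists for all $r \in\, ]0, \infty[$. Then
$$({}^{\circ l}F)_{r-} = ({}^\circ F)_{r-}.$$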
Proof The proof of Part (a) is left to the reader. (b) By the assumption and (a), there exists a strictly monotone increasing function g : N → N with g < h such that for all n ∈ N and for all k ∈ N with k > g(n) = ∗ g(n), 1 1 1 ⇒ Fs − ◦ F (r ) < . (+) ∀s ∈ T r + ≤ s ≤ r + ∗ k g(n) n By the Spillover Principle, for each n ∈ N there exists an unlimited K n such that (+) is true for all unlimited K ≤ K n , when we replace k by K . There exists an unlimited K ∞ ≤ K n for all n ∈ N. It follows that (+) is true for all n ∈ N, when we replace r := r + K1∞ . Note that (b) is true for this r . The proof of the second k by K ∞ . Set statement of Part (b) is similar. (c) To prove that ◦ F is right continuous in r , fix a sequence (rn )n∈N in ]r, ∞[, converging to r . Fix r r as in (b). By (b), there are tn ∈ T with tn rn and Ftn (◦ F)rn . By Theorem 2.10.18, there exists and internal extension (tn )n∈ ∗ N of
∗ r ≤ tn for all n ∈ ∗ N. There exists (tn )n∈N with ◦ an unlimited N∞ ∈ N such ◦that tn r for all unlimited n ≤ N∞ . Assume that Frn n∈N does not converge to Fr . Then there exists a k ∈ N such that Ftn − ◦ Fr ≥ k1 for infinitely many n ∈ N. By the Spillover Principle there exists an unlimited N ≤ N∞ with Ft N − ◦ Fr ≥ k1 and t N r . However, r ≤ t N , thus, Ft N ◦ Fr ,which is a contradiction. The proof ◦ that F has left hand limits in ]0, ∞[ with ◦l F r − = (◦ F)r − is similar.
Theorem 6.4.10 (Hoover and Perkins [10]). Let M : × T → ∗ R be an internal (Ct )t∈T -martingale such that Eμ |Mt | is limited for all limited t ∈ T . (a) There exists a set U of μ L -measure 1 such that (◦ M)r (X ), ◦l M r − (X ) exist for all X ∈ U and all r ∈ [0, ∞[. Moreover, ◦ M is a càdlàg process. (b) Fix r ∈ [0, ∞[. Then there exist rl , rr ∈ ∗ [0, ∞[, rl , rr r , such that Mt is a lifting of (◦ M)r for all t ∈ T with t ≥ rr , t r and Mt is a lifting of (◦ M)r − for all t ∈ T with t ≤ rl , t r . Proof (a) Fix σ ∈ N. By Proposition 6.4.4, we have for all n ∈ N, 1 μ max |Ms | ≥ n ≤ Eμ |Mσ | . s∈Tσ n It follows that μ L maxs∈Tσ |Ms | is unlimited = 0. Let Uσ be the set of all X such that Ms (X ) is limited for all s ≤ σ. Fix a < b in Q and let U[a,b] be the number of upcrossings of [a, b] by (Ms )s≤σ . By Theorem 6.4.3, Eμ U[a,b] ≤ 1 + with (Mσ − a)+ = (Mσ − a) ∨ 0. Since Eμ (Mσ − p)+ b−a Eμ (Mσ − a) is limited, U[a,b] is standard finite μ L -a.s. Therefore, we may choose U σ such Uσ . that U[a,b] (X ) is finite for all X ∈ Uσ and for all a < b in Q. Set U = Now fix r ∈ [0, ∞[ and X ∈ U . Assume that lim◦ s↓r By Lemma 6.4.9 there exist a < b in Q such that lim
inf
k→∞ r < ◦ s ≤ r + 1 k
◦
(Ms (X )) < a < b < lim
k→∞
◦ (M
s (X ))
◦
sup r
0. Lemma 6.4.9(c) implies that ◦ M is a càdlàg process. (b) Let Y be a C-measurable lifting of (◦ M)r . Since lim◦ s↓r ◦ (Ms ) = ◦ Y μ L -a.s., we have for all m ∈ N, lim◦ s↓r μ L |◦ Y − ◦ (Ms )| ≥ m1 = 0. It follows that 1 ◦ | |Y lim = 0. ≥ − M μ s ◦ s↑r m Now we proceed in the same way as in the proof of Lemma 6.4.9 and find a strictly monotone increasing functions g : N → N and an unlimited N∞ such that for all m ∈ N
r+
1 1 1 1 ⇒ μ |Y − Ms | ≥ < . <s ≤r+ ∗ N∞ g(m) m m
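It follows that $\mu_L\{Y \not\approx M_s\} = 0$ for all $s > r + \frac{1}{N_\infty}$ with $s \approx r$. All these $M_s$ are liftings of $({}^\circ M)_r$, thus $r_r := r + \frac{1}{N_\infty}$ fulfils Part (b). The proof of the existence of $r_l$ is similar.

Under the assumptions of Theorem 6.4.10
$${}^\circ M : \Lambda \times [0, \infty[ \to \mathbb{R}, \quad (X, r) \mapsto \lim_{{}^\circ s \downarrow r} {}^\circ(M(X, s))$$
is called the standard part of $M$. ${}^\circ M(X, \cdot)$ exists for $\mu_L$-almost all $X$.

Corollary 6.4.11 Let $M : \Lambda \times T \to {}^*\mathbb{R}$ be an internal $(\mathcal{C}_t)_{t\in T}$-martingale such that $M_t \in SL^1(\mu)$ for all limited $t$. Let $(c_r)_{r\in[0,\infty[}$ be the standard part of $(\mathcal{C}_t)_{t\in T}$ (see Theorem 6.2.15). Then ${}^\circ M$ is a càdlàg $(c_r)_{r\in[0,\infty[}$-martingale.

Proof Fix $r \in [0, \infty[$ and an $\tilde r \approx r$ such that for all $s \approx r$, $\tilde r \le s$, $M_s$ is a lifting of ${}^\circ M_r$. Now ${}^\circ M_r$ is $\mu_L$-integrable, because all these $M_s$ are S-integrable. It also follows that ${}^\circ M_r$ is $c_r$-measurable. In order to prove the martingale property, fix $u \in [0, \infty[$ with $u > r$, and $B \in c_r$. We have to prove that $\int_B {}^\circ M_u\, d\mu_L = \int_B {}^\circ M_r\, d\mu_L$. We may choose $s$ with $s \approx r$, $\tilde r \le s$ such that there is an $A \in \mathcal{C}_s$ with $\mu_L(A \Delta B) = 0$. Fix $t \approx u$ such that $M_t$ is a lifting of ${}^\circ M_u$. Then we obtain
$$\int_B {}^\circ M_u\, d\mu_L = \int_A {}^\circ M_t\, d\mu_L \approx \int_A M_t\, d\mu = \int_A M_s\, d\mu \approx \int_B {}^\circ M_r\, d\mu_L.$$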
Now we construct from certain $(c_r)_{r\in[0,\infty[}$-martingales $m$ internal martingales $M$, whose standard part ${}^\circ M$ is $m$. In particular, ${}^\circ M$ is a càdlàg version of $m$, i.e., for each $r \in [0, \infty[$ there exists a set $U_r$ of $\mu_L$-measure 1 such that $m_r(X) = ({}^\circ M)_r(X)$ for all $X \in U_r$.

Theorem 6.4.12 Suppose that $m : \Lambda \to \mathbb{R}$ is integrable. Let $M \in SL^1(\mu)$ be a lifting of $m$. Define $m_r := E^{c_r} m$ for $r \in [0, \infty[$ and $M_t := E^{\mathcal{C}_t} M$ for $t \in T$. Then for all $r \in [0, \infty[$
$$({}^\circ M)_r := \lim_{{}^\circ s \downarrow r} {}^\circ M_s = m_r \quad \mu_L\text{-a.s.}$$
Proof Note that $(m_r)_{r \in [0,\infty[}$ is a $(c_r)_{r \in [0,\infty[}$-martingale, that $(M_t)_{t \in T}$ is an internal $(C_t)_{t \in T}$-martingale, and that $M_t \in SL^1(\mu)$ for all $t \in T$. By Theorem 6.4.10, $\lim_{{}^\circ s \downarrow r} {}^\circ(M_s) = ({}^\circ M)_r$ exists for all $r \in [0, \infty[$ $\mu_L$-a.s. and defines a càdlàg martingale. It remains for us to prove that $({}^\circ M)_r = m_r$ $\mu_L$-a.s. Let $B \in c_r$. In the previous theorem we have seen that there is an $s \approx r$ such that $M_s$ is a lifting of $({}^\circ M)_r$ and such that there exists an $A \in C_s$ with $\mu_L(A \triangle B) = 0$. We obtain
$$\int_B ({}^\circ M)_r \, d\mu_L \approx \int_A M_s \, d\mu = \int_A M \, d\mu \approx \int_A m \, d\mu_L = \int_B m_r \, d\mu_L.$$
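The step $\int_A M_s \, d\mu = \int_A M \, d\mu$ above rests on the tower property of conditional expectations, which is also what makes $M_t := E^{C_t} M$ a martingale. The following minimal sketch checks this property on a toy example. The eight-point uniform probability space, the block partitions standing in for the filtration, and the helper name `cond_exp` are illustrative assumptions, not objects from the text.

```python
from fractions import Fraction


def cond_exp(values, block_size):
    """Conditional expectation under the uniform measure, given the partition
    of the index set into consecutive blocks of length `block_size`."""
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        mean = sum(block, Fraction(0)) / len(block)
        out.extend([mean] * len(block))   # constant on each block
    return out


# An arbitrary integrable M on an eight-point space (hypothetical data).
M = [Fraction(v) for v in (3, -1, 4, 1, -5, 9, 2, -6)]
M_fine = cond_exp(M, 2)     # later time: finer partition
M_coarse = cond_exp(M, 4)   # earlier time: coarser partition

# Martingale property: conditioning the later value on the earlier
# sigma-algebra returns the earlier value.
assert cond_exp(M_fine, 4) == M_coarse
```

Exact rational arithmetic is used so that the martingale identity holds as an equality rather than up to rounding.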
Much to my regret, applications of measure and probability theory on Loeb spaces cannot be covered exhaustively. Many beautiful areas where Loeb spaces have been applied successfully will not be considered here. Examples are Hoover and Perkins' general martingale theory and stochastic differential equations, Perkins' approach to local time, Cutland, Kopp and Willinger's stochastic approach to financial mathematics (see, however, Chap. 9), Lindstrøm's work on Brownian motion on fractals, Lindstrøm's work on finite-dimensional Lévy processes, Hoover and Keisler's probability logic, and so on.

We now proceed, in the next chapter, to applications of Nonstandard Analysis to the foundations of the Malliavin calculus on abstract Wiener spaces and, as an example, for symmetric Poisson processes instead of more general Lévy processes as in [21]. Using the ∗-extension ${}^*\mathbb{N}$ of the set $\mathbb{N}$ of natural numbers, we have extended the notion of "finiteness" so that stochastic analysis, even in the infinite-dimensional case, is very similar to the elementary calculus on finite-dimensional Euclidean spaces, extending the approach of Cutland and Ng [7] to one-dimensional Brownian motion.

Problems

Recall the notation of Sect. 6.4.7.
(1) Prove Part (a) of Lemma 6.4.9.
(2) Prove that ${}^\circ F$ has left-hand limits in $]0, \infty[$ with ${}^{\circ l}F_{r^-} = ({}^\circ F)_{r^-}$.
(3) Assume that ${}^{\circ l}F_{r^-}$ exists with $r > 0$. Prove that there exists an $r_l \in {}^*]0, r[$, $r_l \approx r$, such that ${}^{\circ l}F_{r^-} \approx F_s$ for all $s \in T$ with $s \le r_l$ and $s \approx r$.
(4) Let $l^1$ and $l$ be the measures defined in Sect. 6.1. Prove that for $l_L$-almost all $X \in {}^*\mathbb{R}^T$ and all $r \in [0, \infty[$
$$B_l(X, r) := \lim_{{}^\circ s \downarrow r} {}^\circ B(X, s)$$
exists and is a càdlàg process. Hint: $(X, t) \mapsto \sum_{s \le t} (X_s - E_{l^1} x)$ is an $l$-square integrable $(\mathcal{B}_t)_{t \in T}$-martingale, where $(\mathcal{B}_t)_{t \in T}$ is defined in the problems to Sect. 6.2.
References

1. S. Albeverio, J.E. Fenstad, R. Høegh-Krohn, T. Lindstrøm, Nonstandard Methods in Stochastic Analysis and Mathematical Physics (Academic Press, Orlando, 1986)
2. R.M. Anderson, A nonstandard representation of Brownian motion and Itô integration. Isr. J. Math. 25, 15–46 (1976)
3. R.B. Ash, Real Analysis and Probability (Academic Press, New York, 1972)
4. J. Berger, H. Osswald, Y. Sun, J.L. Wu, On nonstandard product measure spaces. Ill. J. Math. 46, 319–330 (2002)
5. D.L. Burkholder, B.J. Davis, R.F. Gandy, Integral inequalities of convex functions of operators on martingales, in Proceedings of 6th Berkeley Symposium, vol. 2 (University of California Press, Berkeley, 1970), pp. 223–240
6. N. Cutland, Infinitesimals in action. J. Lond. Math. Soc. 35, 202–216 (1987)
7. N. Cutland, S.-A. Ng, A nonstandard approach to the Malliavin calculus, in Advances in Analysis, Probability and Mathematical Physics: Contributions of Nonstandard Analysis, ed. by S. Albeverio, W.A.J. Luxemburg, M.P.H. Wolff (Kluwer Academic Publishers, Dordrecht, 1995), pp. 149–170
8. J.L. Doob, Stochastic Processes (Wiley, New York, 1965)
9. L. Gross, Abstract Wiener spaces, in Proceedings of 5th Berkeley Symposium on Mathematical Statistics Probability Part I (University of California Press, Berkeley, 1965), pp. 31–41 (1988)
10. D.L. Hoover, E.A. Perkins, Nonstandard construction of the stochastic integral and applications to stochastic differential equations I and II. Trans. Am. Math. Soc. 275, 1–58 (1983)
11. A.E. Hurd, P.A. Loeb, An Introduction to Nonstandard Real Analysis (Academic Press, Orlando, 1985)
12. H.J. Keisler, An infinitesimal approach to stochastic analysis. Mem. Am. Math. Soc. 48 (1984)
13. T. Lindstrøm, Hyperfinite stochastic integration I, II, III, and addendum. Math. Scand. 46, 265–333 (1980)
14. T. Lindstrøm, Hyperfinite Lévy processes. Stochastics 76(6), 517–548 (2004)
15. P.A. Loeb, Conversion from nonstandard to standard measure spaces and applications in probability theory. Trans. Am. Math. Soc. 211, 113–122 (1975)
16. P.A. Loeb, Applications of nonstandard analysis to ideal boundaries in potential theory. Isr. J. Math. 25, 154–187 (1976)
17. P.A. Loeb, A functional approach to nonstandard measure theory. Contemp. Math. 26, 251–261 (1984)
18. P.A. Loeb, H. Osswald, Nonstandard integration theory in topological vector lattices. Mh. Math. 124, 53–82 (1997)
19. W.A.J. Luxemburg, A general theory of monads, in Application of Model Theory, Algebra, Analysis and Probability, ed. by W.A.J. Luxemburg (Holt, Rinehart and Winston, New York, 1969), pp. 18–86
20. H. Osswald, Vector valued Loeb measures and the Lewis integral. Math. Scand. 68, 247–268 (1991)
21. H. Osswald, Malliavin Calculus for Lévy Processes and Infinite-Dimensional Brownian Motion, vol. 191, Cambridge Tracts in Mathematics (Cambridge University Press, Cambridge, 2012)
22. K.D. Stroyan, J.M. Bayod, Foundations of Infinitesimal Stochastic Analysis. North-Holland Studies in Logic, vol. 119 (North-Holland Publishing Co., Amsterdam, 1986)
23. Y.N. Sun, A theory of hyperfinite processes, the complete removal of individual uncertainty via exact LLN. J. Math. Econ. 29, 419–503 (1998)
Chapter 7
Stochastic Analysis Horst Osswald
7.1 Introduction

In this chapter we apply the Saturation Principle to the profound mathematical theory of stochastic analysis for one-dimensional symmetric Poisson processes and for Brownian motion with values in abstract Wiener spaces $(\mathbb{H}, \mathbb{B})$. Here $\mathbb{H}$ is a separable Hilbert space inside the superstructure $V$ with norm $\|\cdot\|$ and $\mathbb{B}$ is the completion of $\mathbb{H}$ with respect to a Gross measurable norm $|\cdot|$ on $\mathbb{H}$ (see Sect. 6.2.2). The Borel measure on $\mathbb{B}$ is a Gaußian measure, induced by a Gaußian measure on the cylinder sets of $\mathbb{H}$. The prototype of an abstract Wiener space is the Fréchet space of convergent sequences, endowed with the topology of pointwise convergence, over the Hilbert space of square summable sequences. It is well known that for each separable Banach space $\mathbb{B}$ there exists a Hilbert space $\mathbb{H}$ such that $(\mathbb{H}, \mathbb{B})$ becomes an abstract Wiener space (see the Lecture Notes of Kuo [22]).

Since the Hilbert space is the most important part of an abstract Wiener space, we have established ∗finite-dimensional representations of separable Hilbert spaces. We have already seen that for each separable Hilbert space $\mathbb{H}$ there exists a ∗finite-dimensional space $\mathbb{F}$ with ${}^*[\mathbb{H}] \subseteq \mathbb{F} \subseteq {}^*\mathbb{H}$. The scalar product $\langle {}^\circ F, {}^\circ G \rangle$ in $\mathbb{H}^{\otimes d}$ of the standard parts ${}^\circ F, {}^\circ G$ of functions $F, G$ in $\mathbb{F}^{\otimes d}$ is infinitely close to the scalar product $\langle F, G \rangle$ in $\mathbb{F}^{\otimes d}$ (see Sect. 6.3.7). If $\mathbb{H} = \mathbb{R}$, then $\mathbb{F}$ is simply ${}^*\mathbb{R}$.

The time line $T := \{\frac{1}{H}, \frac{2}{H}, \ldots, H\}$, the sample space $\mathbb{F}^T$ for our probability spaces and the internal stochastic process $B : \mathbb{F}^T \times T \to \mathbb{F}$ with $B(X, t) := \sum_{s \le t} X_s$ are always fixed. Recall that $\mathbb{F}^T$ denotes the set of all internal $H^2$-tuples of elements in $\mathbb{F}$. The internal filtration $(\mathcal{B}_t)_{t \in T}$ on $\mathbb{F}^T$ is also fixed forever, where
$$\mathcal{B}_t = \left\{ A \times \mathbb{F}^{T \setminus T_t} \mid A \text{ is an internal Borel set in } \mathbb{F}^{T_t} \right\}.$$

H. Osswald, Mathematisches Institut der Universität München, Theresienstr. 39, 80333 Munich, Germany e-mail:
[email protected] © Springer Science+Business Media Dordrecht 2015 P.A. Loeb and M.P.H. Wolff (eds.), Nonstandard Analysis for the Working Mathematician, DOI 10.1007/978-94-017-7327-0_7
Recall that $T_t := \{s \in T \mid s \le t\}$. Sometimes we shall use the filtration $(\mathcal{B}_{t^-})_{t \in T}$ on $\mathbb{F}^T$ with $\mathcal{B}_0 = \{\mathbb{F}^T, \emptyset\}$. The standard part $(b_r)_{r \in [0,\infty[}$ of $(\mathcal{B}_t)_{t \in T}$ is defined in Sect. 6.2.7. Notice that $(b_r)_{r \in [0,\infty[}$ is also the standard part of $(\mathcal{B}_{t^-})_{t \in T}$.

Let us turn now to the case $\mathbb{H} = \mathbb{R}$ and $\mathbb{F} = {}^*\mathbb{R}$. In Sect. 6.1, we have already mentioned that there exists a large class of internal Borel probability measures $p^1$ on ${}^*\mathbb{R}$ such that the standard part $B_p$ of the process $B$ under $p^1$ exists, which means the following: Let $p_L$ be the Loeb measure over the $H^2$-fold product $p$ of $p^1$ on ${}^*\mathbb{R}^T$. Then the process
$$B_p : {}^*\mathbb{R}^T \times [0, \infty[ \to \mathbb{R}, \quad B_p(X, r) := \lim_{{}^\circ s \downarrow r} {}^\circ(B(X, s))$$
is $p_L$-almost surely well defined and a one-dimensional càdlàg Lévy process. In Sect. 15 of the book [36], where Lindstrøm's [25] beautiful approach to arbitrary finite-dimensional Lévy processes is modified to our concept, the reader can find a proof of the following fact: for any one-dimensional standard Lévy process $L : \Omega \times [0, \infty[ \to \mathbb{R}$ on an arbitrary probability space there exists an internal Borel probability measure $l^1$ on ${}^*\mathbb{R}$ such that $L$ coincides with the standard part of $B$ under $l^1$. We have already mentioned in Sect. 6.1 that we identify two Lévy processes if they have the same so-called Lévy triplet. Essentially, in his work [25] Lindstrøm has proved the preceding result for Lévy processes of arbitrary finite dimension, not only for the one-dimensional case. However, in the book [36] and here, stochastic integration for infinite-dimensional Brownian motion is studied. The sample space for $d$-dimensional Lévy processes with $d \in \mathbb{N}$ would be $({}^*\mathbb{R}^d)^T$. In the case of infinite-dimensional Brownian motion the sample space is $({}^*\mathbb{R}^\omega)^T$, where $\omega$ is an unlimited number in ${}^*\mathbb{N}$.

In the nonstandard approach to Malliavin calculus for Lévy processes there is a further interesting advantage: as we will see, it is possible to orthonormalize the increments of a Lévy process, not only the whole process itself as in the literature (see for example the work of Nualart and Schoutens [30]). In the case of Brownian motion as a continuous process the increments are always 0. My five-year-old grandson Silas said "0 is nothing". However, using infinitesimals representing 0, there exist different kinds of 0, small infinitesimals and large infinitesimals, which are different from 0 except 0 itself. All these infinitesimals collapse to 0 in standard analysis, but they provide the possibility to orthonormalize the increments of a process in the internal setting.
In his work [25] Lindstrøm uses large infinitesimals, which he calls "splitting infinitesimals", to establish Lévy triplets and to prove that each Lévy process can be split into its continuous and pure jump parts (see also [36] in the setting here). Here are two further examples of Lévy processes in addition to the examples in Sect. 6.1.
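Examples (1) Fix $\mathbb{H} = \mathbb{R}$, $\mathbb{F} = {}^*\mathbb{R}$ and a standard $\beta > 0$. Let the internal Borel probability measure $\pi^1$ on ${}^*\mathbb{R}$ be concentrated on $\{0, 1, -1\}$, defined by
$$\pi^1(\{0\}) := 1 - \frac{\beta}{H}, \quad \pi^1(\{1\}) := \frac{\beta}{2H}, \quad \pi^1(\{-1\}) := \frac{\beta}{2H}.$$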
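Let $\pi$ be the $H^2$-fold product of $\pi^1$ on the Borel sets of ${}^*\mathbb{R}^T$. Let us prove that the process $B$ is a $(\mathcal{B}_t)_{t \in T}$-martingale. To this end fix $t \in T$ and $C \in \mathcal{B}_t$. Then $C = A \times {}^*\mathbb{R}^{T \setminus T_t}$ for some internal Borel set $A$ in ${}^*\mathbb{R}^{T_t}$. Now
$$\int_C B_{t + \frac{1}{H}} \, d\pi = \int_C B_t \, d\pi + \int_C X_{t + \frac{1}{H}} \, d\pi = \int_C B_t \, d\pi + \pi(C) \int_{{}^*\mathbb{R}} x \, d\pi^1$$
with $\int_{{}^*\mathbb{R}} x \, d\pi^1 = 0$. This proves that $E^{\mathcal{B}_t} B_{t + \frac{1}{H}} = B_t$ $\pi$-a.s. Moreover, we have for all $t \in T$
$$E_\pi B_t^2 = E_\pi \Big( \sum_{s \in T_t} X_s^2 + \sum_{s \ne r \in T_t} X_s \cdot X_r \Big) = t \cdot H \cdot E_{\pi^1} x^2 + t \cdot H \cdot (t \cdot H - 1) \, E_{\pi^1} x \cdot E_{\pi^1} y = t \cdot H \cdot E_{\pi^1} x^2 = \beta \cdot t.$$
It follows that $E_\pi B_t^2$ is limited for all limited $t$, thus $B_t \in SL^1(\pi)$. By Corollary 6.4.11, the standard part $B_\pi$ of $B$ exists $\pi_L$-a.s. and is a càdlàg $(b_r)_{r \in [0,\infty[}$-martingale. This process $B_\pi$ is a symmetric Poisson process of rate $\beta$. The first nonstandard approach to Poisson processes is due to Loeb [26].

(2) Fix an abstract Wiener space $(\mathbb{H}, \mathbb{B})$, a ∗finite representation $\mathbb{F}$ of $\mathbb{H}$ and an orthonormal basis $(e_i)_{i \le \omega}$ of $\mathbb{F}$. The scalar product on $\mathbb{F}$ is denoted by $\langle \cdot, \cdot \rangle$. Let $\Gamma^\omega$ be the internal centered Gaußian measure on $\mathbb{F}$ of variance $\frac{1}{H}$, i.e., for all internal Borel sets $A$ in $\mathbb{F}$
$$\Gamma^\omega(A) := \int_A e^{-\frac{H}{2} \|X\|^2} \, d\big(\langle X, e_i \rangle\big)_{i \le \omega} \sqrt{\frac{H}{2\pi}}^{\,\omega} := \int_{\{(x_i)_{i \le \omega} \in {}^*\mathbb{R}^\omega \,\mid\, \sum_{i=1}^{\omega} x_i e_i \in A\}} e^{-\frac{H}{2} \sum_{i \le \omega} x_i^2} \, d(x_i)_{i \le \omega} \sqrt{\frac{H}{2\pi}}^{\,\omega}.$$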
Let $\Gamma$ be the internal $H^2$-fold product of $\Gamma^\omega$ on the internal Borel sets of $\mathbb{F}^T$. Now the standard part $B_\Gamma : \mathbb{F}^T \times [0, \infty[ \to \mathbb{B}$ of $B$ is defined for all limited $t \in T$ by
$$B_\Gamma(X, {}^\circ t) := {}^\circ(B(X, t)).$$
$B_\Gamma$ is $\Gamma_L$-almost surely well defined and a continuous Brownian motion with values in the Banach space $\mathbb{B}$. Here ${}^\circ(B(X, t))$ denotes the standard part of $B(X, t)$ with
respect to the Gross measurable norm $|\cdot|$ on $\mathbb{B}$. Moreover, for each continuous function $f : [0, \infty[ \to \mathbb{B}$ one can construct an $X \in \mathbb{F}^T$ such that $f = B_\Gamma(X, \cdot)$. The proofs of all these results can be found in Chap. 11 of the book [36]. This construction of a Brownian motion extends the work of Cutland and Ng [8] for one-dimensional Brownian motion. The first nonstandard approach to Brownian motion is due to Anderson [3].

Now an answer should be given to the question: what is a Banach space valued Brownian motion? Recall that a process $b$ with values in a $d$-dimensional Hilbert space $G$, $d \in \mathbb{N}$, is a $d$-dimensional Brownian motion if each component of $b$ is a Brownian motion and these components run independently on orthogonal axes. To be more precise, fix an orthonormal basis $(e_i)_{i \le d}$ of $G$. Then $b$ is a Brownian motion if each $\langle b, e_i \rangle$ is a one-dimensional Brownian motion and $\langle b, e_i \rangle$ is independent of $\langle b, e_j \rangle$ for $i \ne j$.

Let us turn now to the infinite-dimensional case. Not only is $\mathbb{H}$ dense in $\mathbb{B}$ under $|\cdot|$; the topological dual $\mathbb{B}'$ of $\mathbb{B}$ is also dense in $\mathbb{H}' = \mathbb{H}$ under the original norm $\|\cdot\|$ on $\mathbb{H}$ (see Proposition 4.3.6 in [36]). Now a process $b$ with values in $\mathbb{B}$ is a Brownian motion if $\varphi \circ b$ is a one-dimensional Brownian motion for each $\varphi \in \mathbb{B}'$ with $\|\varphi\| = 1$, and $\varphi \circ b$ and $\psi \circ b$ are independent for $\varphi \perp \psi$. Note that this notion of infinite-dimensional Brownian motion really extends the finite-dimensional case.

The one-dimensional and two-dimensional centered Gaußian measures on ${}^*\mathbb{R}$ and ${}^*\mathbb{R}^2$ of variance $\frac{1}{H}$ are denoted by $\Gamma^1$ and $\Gamma^2$, respectively.

Our main concern is to present, at moderate speed, an introduction to some basic facts of the beautiful and powerful theory of stochastic analysis. Following the book [36], we imagine a journey on a slowly moving Brownian particle through the Itô and iterated Itô integral to the chaos decomposition theorem, which is the key to the Malliavin calculus. We study the Brownian motion $B_\Gamma$ and the symmetric Poisson process $B_\pi$.
A detailed nonstandard approach to Malliavin calculus for $B_\Gamma$ and for more general Lévy processes can be found in the book [36]. In the standard approach, chaotic representations of Lévy functionals often serve as the basis for the Malliavin calculus for Lévy processes. We refer to the books or articles [5, 6, 22, 27, 29, 38, 41–43, 45] for infinite-dimensional Brownian motion, and to [9, 23, 29, 30, 40, 42] for finite-dimensional Lévy processes. Smolyanov and von Weizsäcker [39] and Bogachev [5] have developed an approach to the Malliavin calculus, applying differentiability of measures. As far as I know, T. Lindstrøm [24], Keisler [21], and Hoover and Perkins [19] were the first to apply nonstandard methods to stochastic analysis. The standard theory of stochastic analysis for infinite-dimensional Brownian motion can be found, for example, in the work of Duncan and Varaiya and of Duncan (see [15, 16]).
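The second-moment computation in Example (1), $E_\pi B_t^2 = \beta \cdot t$, can be checked exactly for a finite stand-in of the unlimited number $H$. The sketch below is only an illustration under that assumption: it propagates the law of the partial sums by convolution, so no sampling error is involved. The function name `second_moment_of_walk` and the chosen values of $H$ and $\beta$ are hypothetical.

```python
from fractions import Fraction


def second_moment_of_walk(H: int, beta: Fraction, steps: int) -> Fraction:
    """Exact E[B^2] after `steps` independent increments drawn from pi^1, where
    pi^1({0}) = 1 - beta/H and pi^1({1}) = pi^1({-1}) = beta/(2H)."""
    p_jump = beta / (2 * H)       # probability of +1 (and of -1)
    p_stay = 1 - beta / H         # probability of 0
    dist = {0: Fraction(1)}       # law of the empty sum
    for _ in range(steps):
        nxt = {}
        for value, prob in dist.items():
            for inc, p in ((0, p_stay), (1, p_jump), (-1, p_jump)):
                nxt[value + inc] = nxt.get(value + inc, Fraction(0)) + prob * p
        dist = nxt
    return sum(prob * value * value for value, prob in dist.items())


H = 50                      # finite stand-in for the unlimited H (assumption)
beta = Fraction(3, 2)       # rate of the limiting symmetric Poisson process
for k in (1, 10, 100):      # k steps correspond to the time t = k/H
    t = Fraction(k, H)
    assert second_moment_of_walk(H, beta, k) == beta * t
```

The identity holds exactly for every finite $H$ with $\beta \le H$, which is why the internal computation transfers without any limiting argument.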
7.2 The Itô Integral for the Brownian Motion

In order to define the Itô integral with respect to the Brownian motion $B_\Gamma$ as a continuous process, we first introduce the internal Riemann-Stieltjes integral with respect to the internal Brownian motion $B$ and then prove that this integral is S-continuous, provided that the integrand is $S_{\Gamma \otimes \nu}$-square integrable and $(\mathcal{B}_{t^-})_{t \in T}$-adapted. Therefore, it can be converted to a continuous process. We use the notation of the preceding chapter. We drop the index at the norm $\|\cdot\|$ and the scalar product $\langle \cdot, \cdot \rangle$ if it is clear in which space $\mathbb{H}$ or ${}^*\mathbb{H}$ or $\mathbb{F}$ we take the norm or scalar product.

Fix an internal $\Phi : \mathbb{F}^T \times T \to \mathbb{F}$. We define $\int \Phi \Delta B : \mathbb{F}^T \times T \to {}^*\mathbb{R}$ by setting
$$\int \Phi \Delta B(X, t) := \sum_{s \in T_t} \Phi_s(X)(X_s) := \sum_{s \in T_t} \langle \Phi_s(X), X_s \rangle,$$
where we have identified $\mathbb{F}$ with its dual in the first equality. Note that $\int \Phi \Delta B(X, t) = \sum_{(s,i) \in T_t \times \omega} \langle \Phi_s(X), e_i \rangle \cdot \langle X_s, e_i \rangle$, where $(e_i)_{i \in \omega}$ is an internal orthonormal basis of $\mathbb{F}$. Let us identify $n \in {}^*\mathbb{N}$ with the set $\{1, \ldots, n\}$. We will now see
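that for $(\mathcal{B}_{t^-})_{t \in T}$-adapted and S-square integrable $\Phi$ the standard part ${}^\circ\!\int \Phi \Delta B$ of $\int \Phi \Delta B$ is the Itô integral of the standard part ${}^\circ\Phi$ of $\Phi$ with respect to the $\mathbb{B}$-valued Brownian motion $B_\Gamma$. Moreover, in the non-adapted case, ${}^\circ\!\int \Phi \Delta B$ becomes the Skorokhod integral.

We will also see that, if $\Phi$ is $(\mathcal{B}_{t^-})_{t \in T}$-adapted and locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$, then $\int \Phi \Delta B$ can be converted to a continuous process $\int \Phi \, dB_\Gamma$, defined on $\mathbb{F}^T \times [0, \infty[$ by setting for all limited $s \in T$,
$$\int \Phi \, dB_\Gamma(\cdot, {}^\circ s) := {}^\circ\!\int \Phi \Delta B(\cdot, s).$$
This process is $\Gamma_L$-a.s. well defined. If, in addition, $\Phi$ is a lifting of a standard function $g : \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$ (i.e., $g(X, {}^\circ t) \approx \Phi(X, t)$ for $(\Gamma \otimes \nu)_L$-almost all $(X, t)$), then the stochastic integral $\int g \, dB_\Gamma$ of $g$ is identical to $\int \Phi \, dB_\Gamma$.

Remark 7.2.1 However, there exist many $\Phi$ which are not liftings of standard functions: Note that the following function $F$ is locally in $SL^2(\nu, \mathbb{F})$, thus locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$, and is $(\mathcal{B}_{t^-})$-adapted, but $F$ is not a lifting of a standard function: Let $a \in \mathbb{H}$, $a \ne 0$. Set
$$F(t) := \begin{cases} {}^*a & \text{if } t \cdot H \text{ is odd} \\ 0 & \text{if } t \cdot H \text{ is even.} \end{cases}$$
Therefore, we have here an extension of the standard integration theory.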
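One way to get a feeling for why the function $F$ of Remark 7.2.1 cannot be a lifting is to measure, inside a standard interval, the set of times where $F$ equals ${}^*a$ and the set where it equals 0. The sketch below does this for a finite stand-in of $H$; it is a heuristic illustration, not a proof, and the helper name `nu_measure_where` and the value of $H$ are assumptions. Both level sets carry half of the $\nu$-measure of each interval, so $F$ is close to neither value on almost all of it.

```python
from fractions import Fraction


def nu_measure_where(H: int, lo: Fraction, hi: Fraction, want_odd: bool) -> Fraction:
    """nu-measure (counting measure times 1/H) of the set
    {t in T : lo < t <= hi and t*H is odd (or even)}, with T = {1/H, ..., H}."""
    count = 0
    for k in range(1, H * H + 1):
        t = Fraction(k, H)
        if lo < t <= hi and (k % 2 == 1) == want_odd:
            count += 1
    return Fraction(count, H)


H = 100   # finite stand-in for the unlimited H (assumption)
# On ]1, 2] the times with F = *a and the times with F = 0 each have measure 1/2.
assert nu_measure_where(H, Fraction(1), Fraction(2), want_odd=True) == Fraction(1, 2)
assert nu_measure_where(H, Fraction(1), Fraction(2), want_odd=False) == Fraction(1, 2)
```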
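7.2.1 The S-Continuity of the Internal Integral

The following lemma will be used over and over again. The proof uses the rules for working with the conditional expectation and the fact that for centered real Gaußian measures $\gamma$ of variance $\sigma$
$$\int_{\mathbb{R}} x^{2k-1} \, d\gamma = 0 = \int_{\mathbb{R}^2} x \cdot y \, d\gamma^2 \quad \text{and} \quad \int_{\mathbb{R}} x^{2k} \, d\gamma = \sigma^k (2k)!!,$$
where $n!! = (n-1) \cdot (n-3) \cdot (n-5) \cdots 1$ for even $n \in \mathbb{N}$. Let us identify $X_t$ with the projection $\mathbb{F}^T \to \mathbb{F}$, $X \mapsto X_t$. Fix an internal orthonormal basis $E := (e_i)_{i \in \omega}$ of $\mathbb{F}$.

Lemma 7.2.2 Fix $e, f \in E$. Then we obtain $\Gamma$-a.s. for each $t \in T$:
(a) $E^{\mathcal{B}_{t^-}} \langle X_t, e \rangle^{2k-1} = 0$ for all $k \in {}^*\mathbb{N}$.
(b) $E^{\mathcal{B}_{t^-}} (\langle X_t, e \rangle \cdot \langle X_t, f \rangle) = 0$ if $e \ne f$.
(c) $E^{\mathcal{B}_{t^-}} (\langle X_t, e \rangle \cdot \langle X_s, f \rangle) = 0$ if $s < t$. This is true for all $(e, f) \in E^2$.
(d) $E^{\mathcal{B}_{t^-}} \langle X_t, e \rangle^{2n} = \frac{(2n)!!}{H^n}$.
(e) $E^{\mathcal{B}_{t^-}} \left( \langle X_t, e \rangle^2 - \frac{1}{H} \right)^2 = \frac{2}{H^2}$.
(f) $E^{\mathcal{B}_{t^-}} \left( \langle X_t, e \rangle^2 - \frac{1}{H} \right) \left( \langle X_t, f \rangle^2 - \frac{1}{H} \right) = 0$ for $e \ne f$.
(g) $E^{\mathcal{B}_{t^-}} \left( \langle X_t, e \rangle^2 - \frac{1}{H} \right) \left( \langle X_s, f \rangle^2 - \frac{1}{H} \right) = 0$ for $s < t$.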
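The following lemma is a simple application of this lemma, which will be used over and over again.

Lemma 7.2.3 Fix $(\mathcal{B}_{t^-})_{t \in T}$-adapted $\Phi, \Psi : \mathbb{F}^T \times T \to \mathbb{F}$ with $\Phi_t, \Psi_t \in L^2(\Gamma, \mathbb{F})$ for all $t \in T$. Then
(a) for all $\sigma \in T$,
$$E \left( \Big(\int \Phi \Delta B\Big)_\sigma \cdot \Big(\int \Psi \Delta B\Big)_\sigma \right) = E \int_{T_\sigma} \langle \Phi_s, \Psi_s \rangle \, d\nu(s) \in {}^*\mathbb{R},$$
in particular, $E \big(\int \Phi \Delta B\big)_\sigma^2 = E \int_{T_\sigma} \|\Phi_s\|^2 \, d\nu(s) < \infty$ in ${}^*\mathbb{R}$.
(b) $\int \Phi \Delta B$ is an internal $(\mathcal{B}_t)_{t \in T}$-martingale.

Proof (a) We obtain, using the preceding lemma,
$$E \left( \Big(\int \Phi \Delta B\Big)_\sigma \cdot \Big(\int \Psi \Delta B\Big)_\sigma \right) = E \sum_{s,t \in T_\sigma, (e,f) \in E^2} \langle \Phi_s, e \rangle \cdot \langle \Psi_t, f \rangle \cdot \langle X_s, e \rangle \cdot \langle X_t, f \rangle = \alpha + \beta + \gamma,$$
where
$$\alpha = E \sum_{s \ne t \in T_\sigma, (e,f) \in E^2} \langle \Phi_s, e \rangle \cdot \langle \Psi_t, f \rangle \cdot \langle X_s, e \rangle \cdot \langle X_t, f \rangle = 0,$$
because for $s < t$,
$$E \big( \langle \Phi_s, e \rangle \cdot \langle \Psi_t, f \rangle \cdot \langle X_s, e \rangle \cdot \langle X_t, f \rangle \big) = E \big( \langle \Phi_s, e \rangle \cdot \langle \Psi_t, f \rangle \cdot \langle X_s, e \rangle \cdot E^{\mathcal{B}_{t^-}} \langle X_t, f \rangle \big) = 0.$$
$$\beta = E \sum_{s \in T_\sigma, e \ne f \in E} \langle \Phi_s, e \rangle \cdot \langle \Psi_s, f \rangle \cdot \langle X_s, e \rangle \cdot \langle X_s, f \rangle = E \sum_{s \in T_\sigma, e \ne f \in E} \langle \Phi_s, e \rangle \cdot \langle \Psi_s, f \rangle \cdot E^{\mathcal{B}_{s^-}} \big( \langle X_s, e \rangle \cdot \langle X_s, f \rangle \big) = 0.$$
$$\gamma = E \sum_{s \in T_\sigma, e \in E} \langle \Phi_s, e \rangle \cdot \langle \Psi_s, e \rangle \cdot \langle X_s, e \rangle^2 = E \sum_{s \in T_\sigma, e \in E} \langle \Phi_s, e \rangle \cdot \langle \Psi_s, e \rangle \cdot E^{\mathcal{B}_{s^-}} \langle X_s, e \rangle^2 = E \sum_{s \in T_\sigma, e \in E} \langle \Phi_s, e \rangle \cdot \langle \Psi_s, e \rangle \frac{1}{H} = E \int_{T_\sigma} \langle \Phi_s, \Psi_s \rangle \, d\nu(s).$$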
This proves Part (a). The proof of Part (b) is similar. We use the following trick, essentially due to Lindstrøm (see [2]): In order to
handle the quadratic variation of B successfully, we modify the timeline T to the timeline T := {(s, i) | s ∈ T, i ∈ {1, . . . , ω}}. On T we use the lexicographic order, denoted by 0 and an unlimited M ∈ ∗ N such that
ε ≤ A := E − M = σ
E
2 (s )2 (ei ) − sM (ei ) EBs − ·s , ei 2 =
i∈ω,s∈Tσ
E Tσ
By Theorem 6.3.15, L -a.s. Therefore,
Tσ
2 s 2 − sM dν(s).
s 2 dν(s) ∈ SL1 (), thus,
s 2 dν(s) = Tσ
Tσ
Tσ
s 2 dν(s) is limited
M 2 s dν(s) L -a.s.
This proves that A 0, which is a contradiction.
Theorem 7.2.7 B is S-continuous L -a.s. Proof We use Lemma 6.3.1 and Theorem 6.3.15. There exists a U ∈ L (B) with L (U ) = 1 and such that for all X ∈ U and all σ ∈ N: 2
(i) tk (X ) Tt ks (X ) dν(s) is limited for all k ∈ N, and t ∈ Tσ .
(ii) Tσ s (X )2 dν(s) is limited, thus there exists a k ∈ N with s (X ) = ks (X ) for alls ∈ Tσ . 2
2 (iii) t → kt (X )F ∈ SL1 (νσ ), therefore, Ts \Tt rk (X ) dν(r ) 0 if s t and t ≤ s ∈ Tσ . We obtain for all X ∈ U and all limited s, t ∈ T with s t and s ≤ t: 2 2 k k k s (X ) = s (X ) r (X ) dν(r ) r (X ) dν(r ) t (X ), Ts
Tt
where we may assume that s, t ∈ Tσ . Moreover, t (X ) is nearstandard for all limited
t ∈ T and all X ∈ U . By Lemma 7.2.6, B is S-continuous L -a.s.
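7.2.2 The S-Square-Integrability of the Internal Itô Integral

Proposition 7.2.8 Fix a $(\mathcal{B}_{t^-})_{t \in T}$-adapted $\Phi : \mathbb{F}^T \times T \to \mathbb{F}$ locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$ and $\sigma \in \mathbb{N}$. Then
(a) $\max_{t \in T_\sigma} \left| \int \Phi \Delta B(\cdot, t) \right| \in SL^2(\Gamma)$.
(b) $\int \Phi \Delta B \restriction \mathbb{F}^T \times T_\sigma \in SL^2(\Gamma \otimes \nu_\sigma)$.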
Proof (a) We use the notation in the preceding section. Note
that it suffices to prove that maxs∈T ,s≤(σ,ω) Bs ∈ SL2 (). Since B is a (Bt )t∈T 2
martingale and E B (σ,ω) is limited, by Theorem 6.4.7, it suffices to show
that B (σ,ω) ∈ SL1 (). Therefore, it suffices to prove that σ ∈ SL1 (). Now, by Lemma 7.2.6 Parts (d) and (e), limk→∞ ◦ E( − k )σ = 0 and σk ∈ SL1 (). It follows that σ ∈ SL1 () (see Corollary 6.3.3 Part (γ)). (b) follows from (a). Corollary 7.2.9 Fix a (Bt − )t∈T -adapted : FT × T → F in SL2 ( ⊗ ν, F). Then
$$\max_{t \in T} \left| \Big(\int \Phi \Delta B\Big)_t \right| \in SL^2(\Gamma),$$
thus,
$$\int \Phi \Delta B \in SL^2(\Gamma \otimes \nu).$$
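Proof Note that, by Doob's inequality, for $\sigma \in \mathbb{N}$,
$${}^\circ E \max_{t \in T \setminus T_\sigma} \left( \Big(\int \Phi \Delta B\Big)_t - \Big(\int \Phi \Delta B\Big)_\sigma \right)^2 \le 4 \cdot {}^\circ E \Big( \sum_{t \in T \setminus T_\sigma} \Phi_t(X)(X_t) \Big)^2 = 4 \cdot {}^\circ E \sum_{t \in T \setminus T_\sigma} \|\Phi_t\|_{\mathbb{F}}^2 \frac{1}{H} \to_{\sigma \to \infty} 0.$$
Assume that this convergence fails. Then there exist a standard $\varepsilon > 0$ and infinitely many $\sigma \in \mathbb{N}$ with $E \sum_{t \in T \setminus T_\sigma} \|\Phi_t\|^2 \frac{1}{H} \ge \varepsilon$. By the Spillover Principle there exists an unlimited $S \in {}^*\mathbb{N}$ with $E \sum_{t \in T \setminus T_S} \|\Phi_t\|^2 \frac{1}{H} \ge \varepsilon$, which contradicts the $S_{\Gamma \otimes \nu}$-square-integrability of $\Phi$. Therefore, for all $\sigma \in \mathbb{N}$,
$$\left( \int_A \max_{t \in T} \Big(\int \Phi \Delta B\Big)_t^2 \, d\Gamma \right)^{\frac{1}{2}} \le \left( \int_A \max_{t \in T_\sigma} \Big(\int \Phi \Delta B\Big)_t^2 \, d\Gamma \right)^{\frac{1}{2}} + \left( \int_A \max_{t \in T \setminus T_\sigma} \Big(\int \Phi \Delta B\Big)_t^2 \, d\Gamma \right)^{\frac{1}{2}}.$$
By Proposition 7.2.8(a), the first summand is limited and infinitesimal if $\Gamma(A) \approx 0$. The second summand equals
$$\left( \int_A \max_{t \in T \setminus T_\sigma} \left( \Big(\int \Phi \Delta B\Big)_t - \Big(\int \Phi \Delta B\Big)_\sigma + \Big(\int \Phi \Delta B\Big)_\sigma \right)^2 d\Gamma \right)^{\frac{1}{2}} \le \left( \int_A \max_{t \in T \setminus T_\sigma} \left( \Big(\int \Phi \Delta B\Big)_t - \Big(\int \Phi \Delta B\Big)_\sigma \right)^2 d\Gamma \right)^{\frac{1}{2}} + \left( \int_A \Big(\int \Phi \Delta B\Big)_\sigma^2 \, d\Gamma \right)^{\frac{1}{2}},$$
which is limited and can be made smaller than any given standard $\varepsilon > 0$ if $\Gamma(A) \approx 0$. This proves the assertion.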
7.2.3 Adaptedness and Predictability

Recall from Sect. 6.2.7 the construction of the standard part $(b_r)_{r \in [0,\infty[}$ of the internal filtration $(\mathcal{B}_t)_{t \in T}$. Notice that $(b_r)_{r \in [0,\infty[}$ is also the standard part of $(\mathcal{B}_{t^-})_{t \in T}$. In order to define the standard Itô integral for $(b_r)_{r \in [0,\infty[}$-adapted square integrable integrands with values in $\mathbb{H}$ we use the fact that $(b_r)_{r \in [0,\infty[}$-predictable processes coincide with $(b_r)_{r \in [0,\infty[}$-adapted processes (see Corollary 5.4.2 in [36]). Here are some details.

Fix $r, s$ in $[0, \infty[$ with $r \le s$. Sets of the form $C \times ]r, s]$, $C \times ]r, \infty[$ with $C \in b_r$ or sets $C \times [0, s]$, $C \times [0, \infty[$ with $C \in b_0$ are called $(b_r)_{r \in [0,\infty[}$-predictable rectangles. Let $\mathcal{P}$ be the σ-algebra generated by the predictable rectangles. A process $f$, defined on $\mathbb{F}^T \times [0, \infty[$, is called $(b_r)_{r \in [0,\infty[}$-predictable if $f$ is $\mathcal{P}$-measurable. A measurable set $C \in L_\Gamma(\mathcal{B}) \otimes \mathrm{Leb}[0, \infty[$, where $\mathcal{B}$ is the internal Borel algebra on $\mathbb{F}^T$, is called $(b_r)_{r \in [0,\infty[}$-adapted if for each $r \in [0, \infty[$ the section $C(\cdot, r) := \{X \in \mathbb{F}^T \mid (X, r) \in C\}$ belongs to $b_r$. Let $\mathcal{A}$ be the σ-algebra of $(b_r)_{r \in [0,\infty[}$-adapted sets. A process $f$, defined on $\mathbb{F}^T \times [0, \infty[$, is called $(b_r)_{r \in [0,\infty[}$-adapted if $f$ is $\mathcal{A}$-measurable. If the predictable or adapted sets are augmented by the $\Gamma_L \otimes \lambda$-nullsets, then for each process $f$, measurable with respect to the extended σ-algebra, there exists a process $g$, measurable with respect to the coarser σ-algebra, such that $f = g$ $\Gamma_L \otimes \lambda$-a.e. (see Proposition 5.1.2 in [36]).

The proof of the following lifting result uses the equivalence of "adapted" and "predictable" for the filtration $(b_r)_{r \in [0,\infty[}$ (see Corollary 5.4.2 in [36]). It is an application of Corollary 6.3.11. Let us prove the theorem in detail, because it is an example of the usefulness of that corollary.

Theorem 7.2.10 Let $\varphi : \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$ be $(b_r)_{r \in [0,\infty[}$-adapted and (locally) in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$. Then $\varphi$ has a $(\mathcal{B}_{t^-})_{t \in T}$-adapted lifting $\Phi : \mathbb{F}^T \times T \to \mathbb{F}$ (locally) in $SL^2(\Gamma \otimes \nu, \mathbb{F})$.
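Proof By the standard Corollary 5.4.2 in [36], we may assume that $\varphi$ is $(b_r)_{r \in [0,\infty[}$-predictable. In order to prove the result for the locally $L^2$-space, fix $\sigma \in \mathbb{N}$ and let $L_\sigma$ be the Hilbert space of $(b_r)_{r \in [0,\sigma]}$-predictable square integrable functions $f : \mathbb{F}^T \times [0, \sigma] \to \mathbb{H}$. Let $M$ be the set of all these functions having a $(\mathcal{B}_{t^-})_{t \in T_\sigma}$-adapted lifting $F : \mathbb{F}^T \times T_\sigma \to \mathbb{F}$ in $SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$. Obviously, $M$ is a linear space. In order to prove that $M$ is complete, fix a Cauchy sequence $(f_n)_{n \in \mathbb{N}}$ in $M$ and let $F_n$ be a $(\mathcal{B}_{t^-})_{t \in T_\sigma}$-adapted lifting of $f_n$ in $SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$. Let $G \in SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$ be a lifting of $g := \lim_{n \to \infty} f_n$ in $L^2(\Gamma_L \otimes \lambda)$. Then we obtain, using the shorthand $\rho := \Gamma \otimes \nu_\sigma$,
(a) ${}^\circ\!\int_{\mathbb{F}^T \times T_\sigma} \|F_n - G\|_{\mathbb{F}}^2 \, d\rho = \int_{\mathbb{F}^T \times [0,\sigma]} \|f_n - g\|_{\mathbb{H}}^2 \, d\Gamma_L \otimes \lambda \to_{n \to \infty} 0$,
(b) ${}^\circ\!\int_{\mathbb{F}^T \times T_\sigma} \|F_n - F_m\|_{\mathbb{F}}^2 \, d\rho = \int_{\mathbb{F}^T \times [0,\sigma]} \|f_n - f_m\|_{\mathbb{H}}^2 \, d\Gamma_L \otimes \lambda \to_{n,m \to \infty} 0$.
Part (a) implies that for all $\varepsilon \in \mathbb{R}^+$,
(c) ${}^\circ\rho\{\|F_n - G\|_{\mathbb{F}} \ge \varepsilon\} = {}^\circ\rho\left\{\frac{1}{\varepsilon^2} \|F_n - G\|_{\mathbb{F}}^2 \ge 1\right\} \le \frac{1}{\varepsilon^2} \, {}^\circ\!\int_{\mathbb{F}^T \times T_\sigma} \|F_n - G\|_{\mathbb{F}}^2 \, d\rho \to_{n \to \infty} 0$.
This transition (c) from "measure to integration" is sometimes called the Tchebychev inequality. It follows that there exists a strictly monotone increasing $h : \mathbb{N} \to \mathbb{N}$ such that for all $k \in \mathbb{N}$ and for all $n \ge h(k)$
(i) $\int_{\mathbb{F}^T \times T_\sigma} \|F_n - G\|_{\mathbb{F}}^2 \, d\Gamma \otimes \nu_\sigma < \frac{1}{k}$,
(ii) $\int_{\mathbb{F}^T \times T_\sigma} \|F_n - F_{h(k)}\|_{\mathbb{F}}^2 \, d\Gamma \otimes \nu_\sigma < \frac{1}{k}$,
(iii) $\Gamma \otimes \nu_\sigma \left\{ \|F_n - G\|_{\mathbb{F}} \ge \frac{1}{k} \right\} < \frac{1}{k}$.
By Theorem 2.10.18, there exists an internal extension $(F_n)_{n \in {}^*\mathbb{N}}$ of $(F_n)_{n \in \mathbb{N}}$ such that all $F_n$ are $(\mathcal{B}_{t^-})_{t \in T_\sigma}$-adapted. By the Spillover Principle, for all $k \in \mathbb{N}$ there exists an unlimited $n_k \in {}^*\mathbb{N}$ such that (i), (ii), (iii) are true for all unlimited $n \le n_k$. There exists an unlimited $N$ with $N \le n_k$ for all $k \in \mathbb{N}$. Set $F := F_N$. Then (i), (ii), (iii) are true for all $k \in \mathbb{N}$, when we replace $F_n$ by $F$. By (ii) and Corollary 6.3.3 (γ), $F \in SL^2(\Gamma \otimes \nu_\sigma)$. By (iii), $F$ is a lifting of $g$. Since $F$ is $(\mathcal{B}_{t^-})_{t \in T_\sigma}$-adapted, $M$ is complete.

To prove that $M = L_\sigma$, let $B \times ]r, u]$ be a predictable rectangle with $B \in b_r$ and $r < u \in [0, \sigma]$. Then there exist an $s \approx r$, $s \in T_\sigma$, and a $\Gamma$-approximation $A \in \mathcal{B}_{s^-}$ of $B$. Let $v \approx u$, $v \in T_\sigma$. Set $Y := A \times (]s, v] \cap T_\sigma)$ and $X := B \times (\mathrm{st}^{-1}[]r, u]] \cap T_\sigma)$. Then
$$X \triangle Y \subseteq (A \triangle B \times T_\sigma) \cup \big( \mathbb{F}^T \times (\{t \in T_\sigma \mid t \approx r\} \cup \{t \in T_\sigma \mid t \approx u\}) \big)$$
is a $(\Gamma \otimes \nu)_L$-nullset. It follows that ${}^*a \cdot 1_Y$ belongs to $SL^2(\Gamma \otimes \nu, \mathbb{F})$ and is a $(\mathcal{B}_{t^-})_{t \in T_\sigma}$-adapted lifting of $a \cdot 1_{B \times ]r,u]}$ for $a \in \mathbb{H}$, thus, $a \cdot 1_{B \times ]r,u]} \in M$. In the same way one can see that $a \cdot 1_{B \times [0,u]} \in M$ if $B \in b_0$. By the standard Proposition 5.6.1 in [36], $M = L_\sigma$. It follows that $1_{\mathbb{F}^T \times [0,\sigma]} \cdot \varphi$ has a $(\mathcal{B}_{t^-})_{t \in T_\sigma}$-adapted lifting $\Phi_\sigma$ in $SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$ for all $\sigma \in \mathbb{N}$. We may assume that $\Phi_\sigma(a) = \Phi_{\sigma+1}(a)$ for all $a \in \mathbb{F}^T \times T_\sigma$ and that $\Phi_\sigma(a) = 0$ for $a \notin \mathbb{F}^T \times T_\sigma$. It follows that $\Phi_\sigma$ is $(\mathcal{B}_{t^-})_{t \in T}$-adapted.

Let $(\Phi_\sigma)_{\sigma \in {}^*\mathbb{N}}$ be an internal extension of $(\Phi_\sigma)_{\sigma \in \mathbb{N}}$ such that all $\Phi_\sigma$ are $(\mathcal{B}_{t^-})_{t \in T}$-adapted. Then there exists an unlimited $N \in {}^*\mathbb{N}$ such that for all $\sigma \in {}^*\mathbb{N}$, $\sigma \le N$, $\Phi_\sigma = 1_{\mathbb{F}^T \times T_\sigma} \cdot \Phi_N$. Note that $1_{\mathbb{F}^T \times T_\sigma} \cdot \Phi_N$ is a lifting of $\varphi$ locally in $SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$ for all unlimited $\sigma \in {}^*\mathbb{N}$, $\sigma \le N$. This proves that, if $\varphi : \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$ is $(b_r)_{r \in [0,\infty[}$-predictable and locally in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$, then $\varphi$ has a $(\mathcal{B}_{t^-})_{t \in T}$-adapted lifting locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$.

Now assume that $\varphi \in L^2(\Gamma_L \otimes \lambda, \mathbb{H})$ and that $\varphi$ is $(b_r)_{r \in [0,\infty[}$-predictable. By Theorem 6.3.10 and Corollary 6.3.22, $\varphi$ has a lifting $\Psi \in SL^2(\Gamma \otimes \nu, \mathbb{F})$. Since for $\sigma \in \mathbb{N}$,
$${}^\circ\!\int_{\mathbb{F}^T \times T} \left\| 1_{\mathbb{F}^T \times T_\sigma} \Phi_N - \Psi \right\|_{\mathbb{F}}^2 d\Gamma \otimes \nu = \int_{\mathbb{F}^T \times [0,\infty[} \left\| 1_{\mathbb{F}^T \times [0,\sigma]} \varphi - \varphi \right\|_{\mathbb{H}}^2 d\Gamma_L \otimes \lambda \to_{\sigma \to \infty} 0,$$
we have $\int_{\mathbb{F}^T \times T} \left\| 1_{\mathbb{F}^T \times T_{N_\infty}} \cdot \Phi_N - \Psi \right\|_{\mathbb{F}}^2 d\Gamma \otimes \nu \approx 0$ for some unlimited $N_\infty \in {}^*\mathbb{N}$, $N_\infty \le N$. Set $\Phi := 1_{\mathbb{F}^T \times T_{N_\infty}} \Phi_N$ and note that $\Phi$ is a $(\mathcal{B}_{t^-})_{t \in T}$-adapted lifting of $\varphi$ in $SL^2(\Gamma \otimes \nu, \mathbb{F})$ (use the triangle inequality for $L^2$-spaces).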
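7.2.4 The Standard Itô Integral

Fix a $(\mathcal{B}_{t^-})_{t \in T}$-adapted $\Phi$ locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$. By Theorem 7.2.7, we may convert $\int \Phi \Delta B$ to a continuous stochastic process $\int \Phi \, dB_\Gamma$ on the timeline $[0, \infty[$: define for $\Gamma_L$-almost all $X \in \mathbb{F}^T$ and all limited $t \in T$,
$$\int \Phi \, dB_\Gamma : \mathbb{F}^T \times [0, \infty[ \ni (X, {}^\circ t) \mapsto {}^\circ\!\int \Phi \Delta B(X, t).$$
Now fix $\varphi$ and $\Phi$ as in Theorem 7.2.10. We define
$$\int \varphi \, dB_\Gamma := \int \Phi \, dB_\Gamma.$$
This process $\int \varphi \, dB_\Gamma$ is called the Itô integral of $\varphi$. One has to prove that $\int \varphi \, dB_\Gamma$ is well defined $\Gamma_L$-a.s., i.e., it does not depend on the chosen lifting. This follows from:

Lemma 7.2.11 Suppose that $\Phi$ is locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$ and a lifting of the 0-function. Then $\Gamma_L$-a.s., $\big(\int \Phi \Delta B\big)_t \approx 0$ for all limited $t \in T$.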
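Proof Fix $\sigma \in \mathbb{N}$ and an orthonormal basis $E$ of $\mathbb{F}$. By Lemma 7.2.2 and Doob's inequality, we obtain
$$E \max_{t \in T_\sigma} \Big(\int \Phi \Delta B\Big)_t^2 \le 4 \cdot E \Big(\int \Phi \Delta B\Big)_\sigma^2 = 4 \cdot E \Big( \sum_{s \in T_\sigma, e \in E} \langle \Phi_s, e \rangle \cdot \langle X_s, e \rangle \Big)^2 = 4 \cdot E \sum_{s \in T_\sigma, e \in E} \langle \Phi_s, e \rangle^2 \cdot \langle X_s, e \rangle^2$$
$$= 4 E \sum_{s \in T_\sigma, e \in E} \langle \Phi_s, e \rangle^2 \, E^{\mathcal{B}_{s^-}} \langle X_s, e \rangle^2 = 4 E \int_{T_\sigma} \|\Phi_s\|^2 \, d\nu(s) \approx 0.$$
By Lemma 6.3.1 (b), $\max_{t \in T_\sigma} \big(\int \Phi \Delta B\big)_t^2 \approx 0$ $\Gamma_L$-a.s. Therefore, $\big(\int \Phi \Delta B\big)_t^2 \approx 0$ for all limited $t \in T$ $\Gamma_L$-a.s.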
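The chain of estimates in this proof combines Doob's inequality with the isometry of Lemma 7.2.3. The sketch below checks both steps by complete enumeration of paths. As a loudly stated assumption, the Gaussian increments of the text are replaced by coin-flip increments $\pm 1/\sqrt{H}$ with a small finite $H$, so that expectations become exact finite sums; only mean 0, variance $1/H$, independence and adaptedness enter the two steps that are checked. The function name `lemma_chain` and the chosen integrand are hypothetical.

```python
import itertools
import math


def lemma_chain(H: int, steps: int, phi):
    """Return (E max_k I_k^2, 4 E I_n^2, 4 E sum_k phi_k^2 / H) for
    I_k = sum_{j <= k} phi_j * X_j, where X_j = +-1/sqrt(H) are fair coin flips
    and phi_j = phi(earlier increments) is adapted."""
    dx = 1.0 / math.sqrt(H)
    weight = 0.5 ** steps                      # probability of each path
    e_max = e_end = e_norm = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=steps):
        incs = [s * dx for s in signs]
        integral = max_sq = norm = 0.0
        for k in range(steps):
            f = phi(incs[:k])                  # uses strictly earlier increments only
            integral += f * incs[k]
            max_sq = max(max_sq, integral * integral)
            norm += f * f / H                  # integration against nu: weight 1/H
        e_max += weight * max_sq
        e_end += weight * integral * integral
        e_norm += weight * norm
    return e_max, 4.0 * e_end, 4.0 * e_norm


e_max, four_e_end, four_e_norm = lemma_chain(H=4, steps=10, phi=lambda past: 1.0 + sum(past))
assert e_max <= four_e_end + 1e-12                            # Doob's inequality
assert math.isclose(four_e_end, four_e_norm, rel_tol=1e-9)    # isometry
```

If the integrand is a lifting of the 0-function, the last quantity is infinitesimal, and the inequality forces the maximum of the internal integral to be infinitesimal as well, which is the content of the lemma.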
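We want to define the Itô integral as a random variable for $(b_r)_{r \in [0,\infty[}$-adapted processes $\varphi : \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$ in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$: If $\Phi : \mathbb{F}^T \times T \to \mathbb{F}$ is a $(\mathcal{B}_{t^-})_{t \in T}$-adapted lifting of $\varphi$ in $SL^2(\Gamma \otimes \nu, \mathbb{F})$, define
$$\int^V \varphi \, dB_\Gamma : \mathbb{F}^T \to \mathbb{R}, \quad X \mapsto {}^\circ \sum_{s \in T} \langle \Phi(X, s), X_s \rangle.$$
Note that $\int^V \varphi \, dB_\Gamma$ is $\Gamma_L$-a.s. well defined. Now suppose that $\varphi : [0, \infty[ \to \mathbb{H}$ is in $L^2(\lambda, \mathbb{H})$, thus, $\varphi$ is deterministic. We set
$$I(\varphi) := \int^V \varphi \, dB_\Gamma$$
and call $I(\varphi)$ the Wiener integral of $\varphi$. For internal $\Phi : T \to \mathbb{F}$ define
$$I(\Phi) := \sum_{s \in T} \langle \Phi(s), X_s \rangle.$$
$I(\Phi)$ is called the internal Wiener integral of $\Phi$.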
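For a deterministic integrand the internal Wiener integral is a sum of independent centered Gaußian variables, so its variance is $\sum_s \Phi(s)^2 \frac{1}{H}$, the integral of $\Phi^2$ against $\nu$. The following sketch, for the one-dimensional case and a finite stand-in of $H$, compares this variance with $\int \varphi^2 \, d\lambda$ when $\Phi(s) := \varphi(s)$ is used as a lifting of a continuous $\varphi$ on $[0, \sigma]$. The function name `internal_wiener_variance` and the example integrand are assumptions for illustration.

```python
def internal_wiener_variance(phi, H: int, sigma: int) -> float:
    """Variance of I(Phi) = sum_{s in T_sigma} Phi(s) * X_s with independent
    X_s ~ N(0, 1/H) and Phi(s) := phi(s) on T_sigma = {1/H, 2/H, ..., sigma}.
    Independence gives Var I(Phi) = sum Phi(s)^2 / H."""
    return sum(phi(k / H) ** 2 for k in range(1, sigma * H + 1)) / H


# phi(r) = r on [0, 1]: the L^2(lambda)-norm squared is 1/3.
variance = internal_wiener_variance(lambda r: r, H=10_000, sigma=1)
assert abs(variance - 1.0 / 3.0) < 1e-4
```

The difference between the two sides is of order $1/H$, which becomes infinitesimal when $H$ is unlimited.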
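7.2.5 Integrability of the Itô Integral

Fix a $(b_r)_{r \in [0,\infty[}$-adapted $\varphi$ locally in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$ and a $(\mathcal{B}_{t^-})_{t \in T}$-adapted lifting $\Phi$ locally in $SL^2(\Gamma \otimes \nu, \mathbb{F})$, according to Theorem 7.2.10.

Theorem 7.2.12 Fix $\sigma \in \mathbb{N}$.
(a) The Itô integral of $\varphi$ is a continuous $(b_r)_{r \in [0,\infty[}$-martingale, with
$$E \sup_{r \in [0,\sigma]} \Big(\int \varphi \, dB_\Gamma\Big)_r^2 < \infty.$$
(b) Suppose that $(\varphi_k)_{k \in \mathbb{N}}$ is a sequence of $(b_r)_{r \in [0,\infty[}$-adapted functions $\varphi_k : \mathbb{F}^T \times [0, \sigma] \to \mathbb{H}$, converging to $\varphi : \mathbb{F}^T \times [0, \sigma] \to \mathbb{H}$ in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$. Then the Itô integral of $\varphi$ exists and
$$\lim_{k \to \infty} E \sup_{r \in [0,\sigma]} \Big(\int (\varphi_k - \varphi) \, dB_\Gamma\Big)_r^2 = 0.$$
(c) Suppose that $(\varphi_k)_{k \in \mathbb{N}}$ is a sequence of $(b_r)_{r \in [0,\infty[}$-adapted functions $\varphi_k : \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$, converging to $\varphi : \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$ in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$. Then the Itô integral of $\varphi$ exists and
$$\lim_{k \to \infty} E \Big(\int^V (\varphi_k - \varphi) \, dB_\Gamma\Big)^2 = 0.$$

Proof (a) We have already seen that $\int \varphi \, dB_\Gamma$ is continuous $\Gamma_L$-a.s. By Proposition 7.2.8,
$$E \sup_{r \in [0,\sigma]} \Big(\int \varphi \, dB_\Gamma\Big)_r^2 \approx E \max_{t \in T_\sigma} \Big(\int \Phi \Delta B\Big)_t^2 \text{ is limited.}$$
Since $\big(\int \varphi \, dB_\Gamma\big)_{{}^\circ t} \approx \big(\int \Phi \Delta B\big)_t$ $\Gamma_L$-a.s. for limited $t \in T$ and $\big(\int \Phi \Delta B\big)_t$ is $\mathcal{B}_t$-measurable, $\big(\int \varphi \, dB_\Gamma\big)_{{}^\circ t}$ is $\mathcal{B}_t \vee N_{\Gamma_L}$-measurable, thus $\big(\int \varphi \, dB_\Gamma\big)_{{}^\circ t}$ is $b_{{}^\circ t}$-measurable. To prove the martingale property, fix $r < u$ in $[0, \infty[$ and $B \in b_r$. Then there exist an $s \in T$, $s \approx r$, and a $\Gamma$-approximation $A \in \mathcal{B}_s$ of $B$. Let $t \approx u$. By Proposition 7.2.8, Lemma 7.2.3 (b) and Theorem 6.3.4,
$$\int_B \Big(\int \varphi \, dB_\Gamma\Big)_u d\Gamma_L = \int_A {}^\circ\Big(\int \Phi \Delta B\Big)_t d\Gamma_L \approx \int_A \Big(\int \Phi \Delta B\Big)_t d\Gamma = \int_A \Big(\int \Phi \Delta B\Big)_s d\Gamma \approx \int_B \Big(\int \varphi \, dB_\Gamma\Big)_r d\Gamma_L.$$
It follows that $E^{b_r} \big(\int \varphi \, dB_\Gamma\big)_u = \big(\int \varphi \, dB_\Gamma\big)_r$.

(b) Assume that $\Phi_k : \mathbb{F}^T \times T_\sigma \to \mathbb{F}$ in $SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$ is a $(\mathcal{B}_{t^-})_{t \in T}$-adapted lifting of $\varphi_k$. Then $\lim_{k,l \to \infty} {}^\circ\!\int_{\mathbb{F}^T \times T_\sigma} \|\Phi_k - \Phi_l\|^2 \, d\Gamma \otimes \nu = 0$ and, as in the proof of Lemma 7.2.11, we see that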
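$$\lim_{k,l \to \infty} {}^\circ E \max_{t \in T_\sigma} \Big(\int (\Phi_k - \Phi_l) \Delta B\Big)_t^2 = 0.$$
By Corollary 6.3.11, there is a $(\mathcal{B}_{t^-})_{t \in T}$-adapted function $\Phi : \mathbb{F}^T \times T_\sigma \to \mathbb{F}$ in $SL^2(\Gamma \otimes \nu_\sigma, \mathbb{F})$ such that
(i) $\lim_{k \to \infty} {}^\circ\!\int_{\mathbb{F}^T \times T_\sigma} \|\Phi_k - \Phi\|^2 \, d\Gamma \otimes \nu = 0$ and
(ii) $\lim_{k \to \infty} {}^\circ E \max_{t \in T_\sigma} \big(\int (\Phi_k - \Phi) \Delta B\big)_t^2 = 0$.
Use the proof of "(a) ⇒ (c)" in the proof of Theorem 7.2.10 to see that $\Phi$ is a lifting of $\varphi$ and
$$E \sup_{r \in [0,\sigma]} \Big(\int (\varphi_k - \varphi) \, dB_\Gamma\Big)_r^2 = {}^\circ E \max_{t \in T_\sigma} \Big(\int (\Phi_k - \Phi) \Delta B\Big)_t^2 \to_{k \to \infty} 0.$$
The proof of (c) is left to the reader.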
7.2.6 The Wiener Measure

We take the σ-algebra generated by the Lévy and Wiener integrals as a basis for the chaos expansion. First we study the Wiener case. Therefore, recall the notation and the results of Sect. 7.1. Let $C_{\mathbb{B}}$ be the Fréchet space of continuous functions from $[0, \infty[$ into $\mathbb{B}$. In the introduction, Example (2), we have mentioned that the function $\kappa : \mathbb{F}^T \to C_{\mathbb{B}}$, $X \mapsto B_\Gamma(X, \cdot)$ is a surjective mapping from $\mathbb{F}^T$ onto $C_{\mathbb{B}}$. The Borel algebra on the Banach space $\mathbb{B}$ is generated by the cylinder sets of the form $\{a \in \mathbb{B} \mid \varphi(a) < c\}$, where $\varphi$ belongs to the topological dual of $\mathbb{B}$ and $c \in \mathbb{R}$. The Borel σ-algebra $\mathcal{B}(C_{\mathbb{B}})$ is generated by sets of the form $\{f \in C_{\mathbb{B}} \mid f(r) \in D\}$ where $r \in [0, \infty[$ and $D$ is a cylinder set in $\mathbb{B}$. Let $\mathcal{W}_{C_{\mathbb{B}}}$ be the σ-algebra on $\mathbb{F}^T$ generated by $\kappa$, augmented by the $\Gamma_L$-nullsets, i.e.,
$$\mathcal{W}_{C_{\mathbb{B}}} = \left\{ \kappa^{-1}[B] \mid B \text{ is a Borel set in } C_{\mathbb{B}} \right\} \vee N_{\Gamma_L}.$$
By Proposition 11.8.6 in [36], $\mathcal{W}_{C_{\mathbb{B}}}$ is generated by the real random variables ${}^\circ\langle a, B_t \rangle$ where $a \in \mathbb{H}$ and $B$ is the fixed internal process, defined in the introduction. We may identify $a \in \mathbb{H}$ with ${}^*a \in \mathbb{F}$. Therefore, $\mathcal{W}_{C_{\mathbb{B}}}$ does not depend on $\mathbb{B}$; it only depends on the Hilbert space part $\mathbb{H}$ of the abstract Wiener space $(\mathbb{H}, \mathbb{B})$. Therefore we may define $\mathcal{W}_{C_{\mathbb{H}}} := \mathcal{W}_{C_{\mathbb{B}}}$.
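The image measure $W := W_{C_{\mathbb{B}}}$ of $\Gamma_L$ by $\kappa$ is called the Wiener measure on $\mathcal{B}(C_{\mathbb{B}})$. The preceding results confirm that the Hilbert space part of an abstract Wiener space is the most important part. Since $\kappa$ is a measure preserving bijection from $\mathbb{F}^T$ onto $C_{\mathbb{B}}$, where we take the σ-algebra $\mathcal{W}_{C_{\mathbb{H}}}$ on $\mathbb{F}^T$ and the Borel σ-algebra on $C_{\mathbb{B}}$, we may identify the $L^p$ spaces $L^p_{\mathcal{W}_{C_{\mathbb{H}}}}(\Gamma_L)$ and $L^p_{\mathcal{B}(C_{\mathbb{B}})}(W)$, because they are canonically isometrically isomorphic. The same holds for $L^p_{\mathcal{W}_{C_{\mathbb{H}}} \otimes L^n}(\Gamma_L \otimes (\nu^n)_L)$ and $L^p_{\mathcal{B}(C_{\mathbb{B}}) \otimes \mathrm{Leb}[0,\infty[^n}(W \otimes \lambda^n)$. However, the space $L^p_{L_\Gamma(\mathcal{B}(\mathbb{F}^T))}(\Gamma_L)$ is much bigger than $L^p_{\mathcal{W}_{C_{\mathbb{H}}}}(\Gamma_L)$.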
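Proposition 7.2.13 Let $J$ be the set of Lebesgue-measurable functions $f: [0, \infty[ \to \mathbb{H}$ in $L^2(\lambda, \mathbb{H})$ with compact support. Then $\mathcal{W}_{C_{\mathbb{H}}} = \{I(f) \mid f \in J\} \vee \mathcal{N}_{\Gamma_L}$.

Proof We first prove “⊆”: Recall that the sets $\{\varphi \circ b_{\mathbb{B}}(\cdot, r) < c\}$ generate $\mathcal{W}_{C_{\mathbb{H}}}$, where $\varphi \in \mathbb{B}' \subseteq \mathbb{H}$, $r \in [0, \infty[$ and $c \in \mathbb{R}$. Set $g(s) := 1_{[0,r]}(s) \cdot \varphi$. Let $t \in T$ with $t \approx r$ and define $G(s) := 1_{T_t}(s) \cdot {}^*\varphi$. Then $G: T \to \mathbb{F}$ is a lifting of $g$ in $SL(\nu_t, \mathbb{F})$ and $g \in J$. Now $\Gamma_L$-a.s.,

$$I(g) \approx I(G) = \sum_{s \le t} \langle {}^*\varphi, \cdot_s \rangle = \langle {}^*\varphi, B_t \rangle \approx \varphi \circ b_{\mathbb{B}}(\cdot, r).$$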
For the reverse inclusion we prove that each $I(g)$ with $g \in J$ is $\mathcal{W}_{C_{\mathbb{H}}}$-measurable. Since $g$ is the limit in $L^2(\lambda, \mathbb{H})$ of simple functions and the Borel sets of $[0, \infty[$ are generated by sets of the form $[0, r]$, it suffices to prove that $I(g)$ is $\mathcal{W}_{C_{\mathbb{H}}}$-measurable for each $g$ with $g(s) = 1_{[0,r]}(s) \cdot \varphi$, where $\varphi \in \mathbb{H}$. Since $\mathbb{B}'$ is dense in $\mathbb{H}$ under the Hilbert space norm $\|\cdot\|$ on $\mathbb{H}$, each $\varphi \in \mathbb{H}$ can be approximated by functions in $\mathbb{B}'$, and since, by Theorem 7.2.12 (b), $I$ is continuous, it suffices to prove that $I(g)$ is $\mathcal{W}_{C_{\mathbb{H}}}$-measurable for each $g$ with $g(s) = 1_{[0,r]}(s) \cdot \varphi$, where $\varphi \in \mathbb{B}'$. But we have already seen that $I(g) = \varphi \circ b_{\mathbb{B}}(\cdot, r)$ $\Gamma_L$-a.s.

Problems
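(1) Prove that $B$ is an internal $(\mathcal{B}_t)_{t\in T}$-martingale.
(2) Prove that the function $F$, defined in Remark 7.2.1, is not a lifting of a standard function $f: [0, \infty[ \to \mathbb{R}$.
(3) Suppose that $(\varphi_k)_{k\in\mathbb{N}}$ is a sequence of $(\mathfrak{b}_t)_{t\in[0,\infty[}$-adapted functions $\varphi_k: \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$, converging to $\varphi: \mathbb{F}^T \times [0, \infty[ \to \mathbb{H}$ in $L^2(\Gamma_L \otimes \lambda, \mathbb{H})$. Prove that the Itô integral of $\varphi$ exists and
$$\lim_{k\to\infty} E \Big(\int_V (\varphi_k - \varphi)\, db_{\mathbb{B}}\Big)^2 = 0.$$
(4) Prove Lemma 7.2.5.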
7.3 The Iterated Integral

Following [36] to a large extent, we introduce the iterated Itô integral as the standard part of an internal iterated Itô integral, which can be easily defined. Methods for constructing standard objects from internal ones are sometimes called pushing down techniques. Recall that, conversely, lifting results construct internal objects from standard ones. The constructions of Brownian motion and the Itô integral, and the construction in Chap. 6 of martingales on the continuous time line from internal martingales on the discrete time line $T$, are examples of pushing down results. Now we push down the internal iterated integral to the standard iterated Itô integral, following the construction of the Itô integral in Sect. 7.2. We will see later that each functional in $L^2(W_{C_{\mathbb{B}}})$ can be written as an orthogonal series of iterated Itô integrals. This result is the key to introducing the Malliavin calculus. In the following we use the notation of Sect. 6.3.7.
7.3.1 The Definition of the Iterated Integral

First of all we introduce a mixture of deterministic and random functions in connection with iterated integrals: Fix an internal $F : T^{n+m} \to \mathbb{F}^{\otimes(n+m)}$ with $n, m \in \mathbb{N}_0$. Define $I_{n,m}(F): \mathbb{F}^T \times T^m \to \mathbb{F}^{\otimes m}$, setting $I_{n,m}(F)(X, s) :=$
Ft,s (X t1 , . . . , X tn , ·) =
t∈T