The Structure of Lebesgue Integration Theory G. TEMPLE
OXFORD AT THE CLARENDON PRESS 1971
Oxford University Press, Ely House, London W. 1
GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON CAPE TOWN SALISBURY IBADAN NAIROBI DAR ES SALAAM LUSAKA ADDIS ABABA BOMBAY CALCUTTA MADRAS KARACHI LAHORE DACCA KUALA LUMPUR SINGAPORE HONG KONG TOKYO
©OXFORD UNIVERSITY PRESS 1971
PRINTED IN GREAT BRITAIN
Preface

The purpose of this work is to introduce the principles and techniques of the theory of integration in the general and simple form that we owe primarily to Lebesgue, de la Vallée-Poussin, and W. H. Young. It is addressed to those who are already familiar with the elementary calculus of differentiation and integration as applied to the standard functions of algebraical and trigonometrical type. Some slight acquaintance with the topology of open and closed sets may also now be presumed in most first-year undergraduates, for whom the book is written, but it is not essential. I have endeavoured to provide an account of the essentials of the theory and practice of Lebesgue integration that are indispensable in analysis, in theoretical physics, and in the theory of probability in a form that can be readily assimilated by students reading for honours in mathematics, physics, or engineering.

To realize this purpose is a serious and important pedagogical problem, for the theory of Lebesgue integration occupies a strange, ambivalent position in the minds of mathematicians confronted with the challenge of planning a syllabus for undergraduates. Then Lebesgue integration appears to be at once indispensable and unattainable, desirable and impracticable. As compared with 'Riemann' integration, so strongly entrenched in university courses of analysis, the subject of 'Lebesgue' integration possesses three great advantages: it is applicable to a much larger class of functions, the properties of the integral are much easier to establish, and the applications of the theory are made with much greater facility. And yet the Lebesgue theory is almost universally regarded as too difficult for inclusion in undergraduate instruction, and in spite of the numerous excellent expositions of the Lebesgue theory, there is still a need for a strictly elementary account of the subject, which will make it readily accessible and utilizable in an undergraduate course.
First of all let us squarely face the ineluctable problems which confront the writer who aspires to provide what the French so happily call 'une œuvre de la haute vulgarisation'. It must be admitted that there are undoubtedly two arduous passages in the traditional approach to the Lebesgue theory. The first is the theory of measure and the second is the theory of the differentiation of an indefinite integral.

The first difficulty can be evaded by framing a direct definition of the integral, independently of the theory of measure. This has been done by L. C. Young, by O. Perron, and by F. Riesz and B. Sz-Nagy. For a number of cogent reasons I have resisted the temptation to follow this seductive deviation from the traditional route. In the first place in the systematic and structural account of the Lebesgue theory the theory of measure is conceptually prior to the general theory of integration, since it is in fact the theory of the integration of the simple functions whose range consists of just two numbers, zero and unity. In the second place it is impossible to avoid the concept of sets of points of 'zero measure'. In the third place the theory of measure is indispensable in such important applications as ergodic theory and the theory of probability.

In the present introduction to integration theory the theory of measure has therefore been retained, but an attempt has been made to simplify and shorten the exposition by translating the traditional account from geometrical into analytical language. This is easily accomplished by systematically representing a set of points $E$ by its characteristic function $\chi(E)$, a device suggested by Ch. J. de la Vallée-Poussin. The theory has also been simplified by replacing 'open sets' as the central concept, by enumerable collections of intervals which may be closed, open, or half-open.

The second serious difficulty in the Lebesgue theory is the differentiation of an indefinite integral.
In elementary calculus an integral $\phi(x)$ of a function $f(x)$ is descriptively defined as a function whose derivative is the integrand $f(x)$. To students familiar with this concept it must be a sad and disheartening
experience to realize that the corresponding property of the Lebesgue integral is almost the last to be established, and that it requires the formidable apparatus of the covering theorems of Vitali or of Riesz, or the theory of réseaux developed by de la Vallée-Poussin. In the method of treatment proposed in this book the theory of differentiation is based on the analytical discussion given by F. Riesz and B. Sz-Nagy (1953) and its geometrical expression in the 'rising sun' theorem of F. Boas (1960).

The removal of these two well-known difficulties in the Lebesgue theory is scarcely an adequate excuse for the publication of yet another introduction to this much-introduced subject. The real justification lies in the more exacting demands now made on authors of mathematical works. The revolution in mathematical theory, which is still proceeding, has also provoked a revolution in mathematical teaching, and has imposed new canons of exposition. The two essential and necessary conditions which a modern textbook must attempt to satisfy are those associated with the key-words 'motivation' and 'structure'.

An exposition of any branch of mathematics must now provide the student with an adequate motivation, that is with a line of thought which leads naturally, and almost inevitably and automatically, from the elementary concepts and methods he already possesses to the more general and abstract ideas and techniques of the theory that he proposes to study. The motivation reveals the inadequacy of our present knowledge, it poses urgent and important questions we are as yet unable to solve, and, in its highest achievements, it restates these questions in a form which suggests what methods must be devised for a solution.
The appetites excited by motivation must then be satisfied by a systematic exposition in which the whole of the theory is dominated by a few simple principles that endow the subject with a definite structure that can be described in advance before the student is committed to a detailed study. Thus a structure is not so much a set of definitions and theorems as a programme that directs the advance of the whole subject. These are high ideals for any writer and the present book must
be regarded as an experiment designed to determine how far these ideals can be realized in an introduction to the subject of integration. This is a subject which now offers most appropriate material for such an investigation. Numerous accounts of the theory of integration have been published, each of them furnishing its own special insight and technique. It now seems possible to extract the essential motivating ideas and structural principles that unify the whole theory.

The motivating ideas, described in Chapter 2, lead from the Archimedean 'method of exhaustion' to the general concept of an integral. The structural principles are two in number, here termed the principle of bracketing and the principle of monotony. The principle of bracketing is a method of induction which enables us to extend a class of functions which are susceptible of integration, by using the concept of upper and lower integrals. We can thus ascend from the concept of area or volume to the Lebesgue measure and the Lebesgue integral. To carry out this programme we need the principle of monotony by which all sequences are reduced to monotone sequences and all functions are reduced to monotone functions. The problems of convergence and of integration are thus reduced to their simplest possible form.

Both of these principles are firmly embedded in the literature, especially in the writings of W. H. Young and L. C. Young. The main contribution of the present work is the exhibition of these concepts as the essential structure of the Lebesgue theory of measure, of integration, of differentiation, and of convergence.

I must express my indebtedness to friends and colleagues who have read this book in proof or in typescript, and who have given me valuable advice and help. I will mention especially Dr. J. D. M. Wright, who read the book in proof, and Dr. A. Ingleton, who, with great patience and much friendly criticism, has given me invaluable help in correcting and improving the successive drafts of the work.
Oxford 20 January 1971
G.T.
Contents

1. MOTIVATION
1.1. Introduction
1.2. The object of the Lebesgue theory
1.3. The achievement of Lebesgue
1.4. The techniques of Lebesgue theory
1.5. Alternative theories

2. THE CONCEPT OF AN INTEGRAL
2.1. Introduction
2.2. 'Primitives'
2.3. Areas
2.4. The Lebesgue integral
2.5. Lebesgue measure
2.6. The structure of the Lebesgue theory of integration
2.7. Exercises

3. THE TECHNIQUES OF LEBESGUE THEORY
3.1. The starting-point
3.2. The method of bracketing
3.3. The Riemann integral
3.4. Monotone sequences
3.5. Infinite integrals
3.6. The Dini derivatives
3.7. Exercises

4. INDICATORS
4.1. Introduction
4.2. Boolean convergence
4.3. Open and closed sets
4.4. Covering theorems
4.5. The indicator of a function
4.6. Exercises

SETS OF ZERO MEASURE

5. DIFFERENTIATION OF MONOTONE FUNCTIONS
5.1. Introduction
5.2. Sets of points of measure zero
5.3. The Cantor set of points
5.4. Average metric density
5.5. The problem of differentiation
5.6. The 'rising sun' lemma
5.7. The differentiation of monotone functions
5.8. The differentiation of series of monotone functions
5.9. Exercises

LEBESGUE THEORY IN ONE DIMENSION

6. GEOMETRIC MEASURE OF OUTER AND INNER SETS
6.1. Introduction
6.2. Elementary sets
6.3. Bounded outer sets
6.4. Unbounded outer sets
6.5. The principle of complementarity
6.6. Inner sets
6.7. Exercises

7. LEBESGUE MEASURE
7.1. Introduction
7.2. Outer and inner measure
7.3. Lebesgue measure
7.4. Examples of measurable sets
7.5. Unbounded sets
7.6. Non-measurable sets
7.7. Criteria for measurability
7.8. Monotone sequences of sets
7.9. Exercises

8. THE LEBESGUE INTEGRAL OF BOUNDED, MEASURABLE FUNCTIONS
8.1. Introduction
8.2. Measurable functions
8.3. Measure functions
8.4. Simple functions
8.5. Lebesgue bracketing functions
8.6. The Lebesgue-Young integral
8.7. The Lebesgue integral as a positive, linear continuous functional
8.8. The differentiability of the indefinite Lebesgue integral
8.9. Exercises

9. LEBESGUE INTEGRAL OF SUMMABLE FUNCTIONS
9.1. Introduction
9.2. Summable functions
9.3. The Lebesgue integral of summable functions as a positive linear 'continuous' functional
9.4. The Lebesgue integral as a primitive
9.5. Exercises

LEBESGUE THEORY IN d DIMENSIONS

10. MULTIPLE INTEGRALS
10.1. Introduction
10.2. Elementary sets in d dimensions
10.3. Lebesgue theory in d dimensions
10.4. Fubini's theorem stated
10.5. Fubini's theorem for indicators
10.6. Fubini's theorem for summable functions
10.7. Tonelli's theorem
10.8. Product sets
10.9. The geometric definition of the Lebesgue integral
10.10. Fubini's theorem in d dimensions
10.11. Exercises

11. THE LEBESGUE-STIELTJES INTEGRAL
11.1. Introduction
11.2. The weighted measure
11.3. The Lebesgue representation of a Stieltjes integral
11.4. The Lebesgue-Stieltjes integral in one dimension
11.5. The Lebesgue-Stieltjes integral in two dimensions
11.6. Exercises

12. EPILOGUE
12.1. The generality of the Lebesgue integral
12.2. The descriptive definition of the Lebesgue integral
12.3. Measure functions
12.4. The Young integral
12.5. References

INDEX
1
Motivation
1.1. Introduction

The study of any branch of mathematics is essentially a guided research, and, before he commits himself to such a project, the prudent student will require satisfactory answers to three or four questions:

(i) What is the object of the investigation?
(ii) What measure of success will be attained?
(iii) What methods of investigation will be needed? and
(iv) What other lines of investigation are available?
We proceed to answer these questions so far as they relate to the study of the Lebesgue theory of integration and differentiation.
1.2. The object of the Lebesgue theory

The object of the Lebesgue theory is stated clearly by Henri Lebesgue in the paper published in 1902 which is his thesis, certainly the most famous and influential doctoral thesis ever written. There he states his purpose to give the most precise and general definitions of three mathematical concepts: the integral of a function, the length of a curve, and the area of a curved surface. But why seek for the most general definition of an integral? Why not be content with the integral as defined in elementary treatises on the calculus? There are two reasons for a divine discontent. (i) The greater generality of a definition is obtained by greater abstraction and therefore with greater simplicity. (ii) The familiar world of the well-behaved 'tame' functions of elementary calculus is not 'closed', and most limiting processes take us out of these comfortable surroundings into a strange
world of 'wild' functions where the elementary concepts of integration are no longer valid.

In view of the notorious difficulty of the Lebesgue theory the first reason may appear an idle paradox. But for the purpose of comprehending the real nature of an integral we know far too much about the properties of particular functions such as $x^n$, $\sin x$ and $\cos x$, $\exp x$ and $\log x$, and the multitude of known facts is an embarrassment. As soon as we begin to generalize and abstract we are no longer concerned with these trivial details and we can concentrate on the essential features of the problem, for example, is the function to be integrated monotone or continuous? The general definition of an integral given by Lebesgue is in fact essentially simple because it depends only on the most general properties of the function to be integrated. (In fairness to the student it must, however, be admitted that simplicity in mathematics, like simplicity of character, is an ideal to be achieved only by unremitting toil.)

The construction of 'wild' functions from 'tame' functions by means of limiting processes is now an integral part of analysis with its own recognizable techniques such as the 'principle of the condensation of singularities'. Thus, from the function
$$f(x) = \begin{cases} x\cos\ln|x| & (x \neq 0), \\ 0 & (x = 0), \end{cases}$$

which has no derivative at the origin ($x = 0$), we can construct the function

$$\phi(x) = \ldots,$$

where $z_1, z_2, \ldots$ are the rational numbers between 0 and 1. The function $\phi(x)$ is continuous but has no derivative at any of the infinite number of points $x = z_n$, $n = 1, 2, \ldots$.

The contemplation of such wild or pathological functions was repugnant to many classical analysts, such as Poincaré and Hermite (Saks 1937, p. iv) but their construction has two definite advantages. At the beginning of our studies it demonstrates the inadequacy of elementary analysis, and at the end
of our studies it can show that the results obtained are the 'best possible'.
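
The failure of the tame function $x\cos\ln|x|$ to possess a derivative at the origin can be illustrated numerically. The following is only a sketch, not part of the original argument; the names `f` and `difference_quotient` are illustrative. Since $\{f(h) - f(0)\}/h = \cos\ln|h|$, the quotient takes the value $+1$ along one sequence of values of $h$ tending to zero and $-1$ along another, so that it can have no limit.

```python
import math


def f(x: float) -> float:
    """x*cos(ln|x|) for x != 0, and 0 at x = 0."""
    return x * math.cos(math.log(abs(x))) if x != 0.0 else 0.0


def difference_quotient(h: float) -> float:
    """(f(0 + h) - f(0)) / h, which equals cos(ln|h|) for h != 0."""
    return (f(h) - f(0.0)) / h


# Along h = exp(-2*pi*k) the quotient is cos(-2*pi*k) = +1;
# along h = exp(-(2k+1)*pi) it is cos(-(2k+1)*pi) = -1.
quotients_plus = [difference_quotient(math.exp(-2 * math.pi * k)) for k in range(1, 5)]
quotients_minus = [difference_quotient(math.exp(-(2 * k + 1) * math.pi)) for k in range(1, 5)]

if __name__ == "__main__":
    print(quotients_plus)
    print(quotients_minus)
```

Both sequences of $h$ tend to zero, yet the two lists of quotients stay near $+1$ and $-1$ respectively.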
1.3. The achievement of Lebesgue

In this book we are concerned mainly with Lebesgue's attempt to provide the most general definition of the integral of a function of one or more variables, $x$, or $x_1, x_2, \ldots, x_n$. First let us consider the problem of integration of bounded functions over a bounded interval. In this case the method of Lebesgue yields the most general possible definition, i.e. it applies to the widest possible class of functions. These are the functions described by Lebesgue as 'measurable'. The scope of Lebesgue's definition of the integral

$$\int_a^b f(x)\,dx$$

is established by the proof that if $f(x)$ is bounded and measurable, then the indefinite Lebesgue integral

$$\phi(x) = \int_a^x f(t)\,dt$$

possesses 'almost everywhere' a derivative $\phi'(x)$ equal to $f(x)$. Without anticipating the precise definition of the phrase 'almost everywhere', it is sufficient to state that it allows $\phi(x)$ to have no derivative at a set of points that may be enumerable, as in the case of the wild function quoted above, or even non-enumerable. Thus the object of Lebesgue's theory is completely attained so far as bounded functions over a bounded interval are concerned.

In the case of unbounded functions or unbounded intervals the success of the Lebesgue theory is only partial. In fact the Lebesgue theory now applies directly only to the limited class of functions described as 'summable', and does not apply to the class of integrals which are not 'absolutely convergent'. Thus the integral

$$\int_0^\infty \left|\frac{\sin x}{x}\right| dx$$
is infinite, and in consequence, the integral

$$\int_0^\infty \frac{\sin x}{x}\,dx$$

is not directly integrable by the methods of Lebesgue, although it may be defined by the limiting process

$$\lim_{a \to \infty} \int_0^a \frac{\sin x}{x}\,dx,$$

which is not, strictly speaking, part of the Lebesgue theory.

Finally, we must emphasize that the Lebesgue theory applies to functions of several variables and to their integrals, not only over 'domains' but also over sets of points that belong to the class described as 'measurable'.
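
The contrast between the two integrals of $\sin x / x$ can be seen in a rough numerical experiment. The sketch below is an illustration only; the midpoint rule and the names `midpoint_integral`, `sinc`, and `integrals_up_to` are assumptions of this example, not part of the theory. It integrates over $[0, n\pi]$ and shows the signed integral settling near $\pi/2$ (a standard value of the limit) while the integral of the absolute value continues to grow.

```python
import math


def midpoint_integral(g, a, b, steps):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def sinc(x):
    """sin(x)/x; never evaluated at x = 0 because midpoints are used."""
    return math.sin(x) / x


def integrals_up_to(n_half_periods, steps_per_half_period=200):
    """Approximate the integrals of sin(x)/x and |sin(x)/x| over [0, n*pi]."""
    b = n_half_periods * math.pi
    steps = n_half_periods * steps_per_half_period
    signed = midpoint_integral(sinc, 0.0, b, steps)
    absolute = midpoint_integral(lambda x: abs(sinc(x)), 0.0, b, steps)
    return signed, absolute


results = {n: integrals_up_to(n) for n in (10, 100, 200)}

if __name__ == "__main__":
    for n, (signed, absolute) in results.items():
        print(n, signed, absolute)
```

Each half-period $[(k-1)\pi, k\pi]$ contributes at least $2/(k\pi)$ to the integral of the absolute value, so its growth is comparable with that of the harmonic series.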
1.4. The techniques of Lebesgue theory

The achievements of the Lebesgue theory are finely described in all the standard works on the subject, but it is not always made clear that this success of the theory is attained by the skilful use of two simple techniques, which may be described as the 'method of bracketing' and the 'method of monotony'. These methods will be expounded in detail later but they may be summarily described as follows:

The method of monotony consists in the reduction of problems of convergence to the study of monotone sequences, which are familiar and simple instruments of calculation.

The method of bracketing consists in using the integrals of certain tame functions to integrate certain wild functions by 'bracketing' a wild function between a pair of tame functions. Thus two integrable tame functions $\lambda(x)$ and $\mu(x)$ may be said to bracket a wild function $f(x)$ with tolerance $\epsilon > 0$, if

$$\lambda(x) \le f(x) \le \mu(x)$$

and if

$$\int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx < \epsilon.$$

The integrals of $\lambda(x)$ and $\mu(x)$ may then be regarded as approximations to the (as yet) unknown and undefined integral of $f(x)$.
If there exists such a pair of integrable bracketing functions for any prescribed tolerance $\epsilon > 0$, we have at our disposal the method to define and to evaluate the integral of $f(x)$ to any desired degree of approximation. These two techniques are all that is necessary to construct the whole of the Lebesgue theory of integration.
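
The method of bracketing can be sketched in a few lines of code for the simplest case. The example assumes a non-decreasing integrand (here $x^2$ on $[0, 1]$, chosen only for illustration) and equal sub-intervals; the names `bracketing_integrals` and `bracket_within` are hypothetical. The lower and upper step functions take the values of $f$ at the left and right ends of each sub-interval, and the number of sub-intervals is doubled until the two integrals differ by less than the prescribed tolerance.

```python
def bracketing_integrals(f, a, b, n):
    """Integrals of the lower and upper step functions on n equal sub-intervals.

    Assumes f is non-decreasing on [a, b], so that the left-hand value is a
    lower bound and the right-hand value an upper bound on each sub-interval.
    """
    h = (b - a) / n
    xs = [a + p * h for p in range(n + 1)]
    lower = sum(h * f(xs[p]) for p in range(n))
    upper = sum(h * f(xs[p + 1]) for p in range(n))
    return lower, upper


def bracket_within(f, a, b, tolerance):
    """Double the number of sub-intervals until upper - lower < tolerance."""
    n = 1
    while True:
        lower, upper = bracketing_integrals(f, a, b, n)
        if upper - lower < tolerance:
            return n, lower, upper
        n *= 2


n_used, lower_integral, upper_integral = bracket_within(lambda x: x * x, 0.0, 1.0, 0.01)

if __name__ == "__main__":
    print(n_used, lower_integral, upper_integral)
```

For this integrand the two step-function integrals enclose the value $1/3$, and either of them approximates it to within the tolerance.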
1.5. Alternative theories

The account of the Lebesgue integral given in the following chapters follows the mature thought of Lebesgue as expounded in the second edition (1928) of his book modestly entitled Leçons sur l'intégration. In this account the theory of measure is developed as a preliminary to the theory of integration, but there are alternative methods of which the student should be informed.

Broadly speaking, these alternative methods fall into two classes: in one class integration is reduced to measure theory, and, in the other, measure theory is subsumed into the theory of integration.

Lebesgue himself adopted the first method in the first edition (1904) of his Leçons and this method is followed by Burkill (1953) in his Cambridge Tract. In this method geometry reigns supreme and the integral $\int_a^b f(x)\,dx$ of a non-negative function $f(x)$ is defined as the two-dimensional measure of the set of points $(x, y)$ such that $(a \le x \le b,\ 0 \le y \le f(x))$.

In the second method the integral is defined directly and the measure of a set of points $E$ is then defined as the integral of their 'indicator' $\alpha(x, E)$, i.e. the function that is equal to unity if $x \in E$ or to zero if $x \notin E$. There are various techniques for a direct definition of the Lebesgue integral and we briefly refer to three of these:

(i) the method of monotone sequences invented by W. H. Young (1910) and expounded by L. C. Young (1927);
(ii) the modification of the Darboux-Riemann method described by Saks (1937, p. 3) and employed by Williamson (1962, p. 39);
(iii) the use of sequences of step functions $\{\phi_n(x)\}$, with some generalized type of convergence. This is the method described by Riesz and Nagy (1953) and employed by Ingleton (1965).

Each of these methods has its own advantages, and we may apply to them the words of Kipling:

There are nine and sixty ways of constructing tribal lays,
And-every-single-one-of-them-is-right.
2
The concept of an integral
2.1. Introduction

In the classical treatises on the various branches of mathematics the fundamental definitions and axioms are enunciated at the very beginning and followed by a systematic explanation of their logical consequences. But an introductory account of the Lebesgue theory cannot exhibit this classical perfection, for the fundamental definition of the Lebesgue integral is not a datum to be unquestionably accepted but a quaesitum that has to be achieved. The fact that we know the name of the entity ('the Lebesgue integral') that we have to discover must not blind us to the fact that, at the beginning of our search, we do not know anything more about it, except that we hope it will prove to be a generalization of the integral that we have met in elementary calculus.

In this puzzling and paradoxical situation how can we plan a systematic investigation? The answer is provided by the distinction between 'constructive' and 'descriptive' definitions. Our task is to develop a constructive definition of the Lebesgue integral that will guarantee its real existence in the world of mathematics. The constructive definition will be the end of our search. But at the very beginning we can give a descriptive definition of the Lebesgue integral by enumerating some of the properties that it must possess. These properties will be the most general and fundamental properties of the integrals that we have encountered in elementary analysis. We shall find that these properties, together with the two techniques of bracketing and of monotony, almost inevitably decide the path that leads to a constructive definition of the Lebesgue integral.

We therefore begin by disengaging the general concept of an integral from the material provided by the elements of the differential and integral calculus. In fact, elementary calculus
does provide two definitions of an integral, viz. as a 'primitive' and as an area. By examining the concept of a primitive we shall obtain a descriptive definition of an integral and by examining the concept of area we shall see, in general terms, how a constructive definition can be achieved.
2.2. 'Primitives'

A primitive is a correlative of a derivative, i.e. if two real functions, $\phi(x)$ and $f(x)$, of a real variable $x$, defined and bounded in the interval $(a, b) = \{x : a < x < b\}$, are so related that

$$\frac{\phi(x+h) - \phi(x)}{h} \to f(x) \quad \text{as } h \to 0$$
for all values of $x$ and $x+h$ in the given interval, then $f(x)$ is the derivative of $\phi(x)$ and $\phi(x)$ is a primitive of $f(x)$. Any other primitive of $f(x)$ differs from $\phi(x)$ only by an additive constant $c$, and has the form $\phi(x) + c$.

If the primitive function $\phi(x)$ is prescribed, and if it has been chosen from the severely restricted class of functions that do possess derivatives, then the definition given above does prescribe a definite limiting process for calculating $f(x)$ (in principle) to any prescribed degree of accuracy. The definition is therefore 'constructive'.

But if it is the derivative $f(x)$ that is prescribed, then the definition is purely 'descriptive' and provides no determinate means of calculating the primitive $\phi(x)$, which in fact is usually found (when it exists!) by ingenious artifices and patient experimentation, aided by a well-stocked memory of lists of elementary functions and their derivatives. However, the familiar functions

$$f(x) = \exp(-x^2) \quad \text{or} \quad f(x) = x^{-1}\sin x$$

provide examples of integrands whose primitives cannot be expressed by any finite combination of elementary functions.

The relation of a primitive to a derivative is essentially a 'local' property, i.e. the numerical value of the derivative $f(x)$ at a specified point $x = \xi$ depends only on the values of the primitive $\phi(x)$ in an arbitrarily small neighbourhood $\xi - \epsilon < x < \xi + \epsilon$ of
the point $\xi$. In more technical language, differentiability at a point $x = \xi$ is a local property of a function $f(x)$ because it is a property of the 'restriction' of $f(x)$ to any neighbourhood of $x = \xi$, i.e. of any function $f(x, \epsilon)$ such that $f(x, \epsilon) = f(x)$ when $\xi - \epsilon < x < \xi + \epsilon$, for some $\epsilon > 0$.

The fundamental global properties of the differential relation are the following, which we designate by (N), (L), and (P).

(N) The integral of the unit function, $f(x) = 1$, over the interval $[a, b]$ is $b - a$, i.e. the function $\phi(x) = x - a$ is a primitive of the unit function $f(x) = 1$. This property is often described as the Lebesgue normalizing condition.

(L) If $\phi_1(x)$ and $\phi_2(x)$ are respectively primitives of $f_1(x)$ and $f_2(x)$ in the same interval $[a, b]$ then $c_1\phi_1(x) + c_2\phi_2(x)$ is a primitive of $c_1 f_1(x) + c_2 f_2(x)$ in the same interval, for all real numbers $c_1$ and $c_2$.

(P) If $\phi(x)$ is a primitive of $f(x)$ in an interval $[a, b]$, and if $f(x)$ is non-negative in this interval, then $\phi(x) \ge \phi(a)$.

The outstanding question that remains is the continuity of the differential relation. We can in fact prove that if the sequence of derivatives $\{\phi_n'(x)\}$ ($n = 1, 2, \ldots$) converges uniformly in a closed interval $[a, b]$ to a derivative $\phi'(x)$ as $n$ tends to infinity, then the corresponding sequence of primitives $\{\phi_n(x) - \phi_n(a)\}$ converges to $\phi(x) - \phi(a)$. This is a very restricted species of continuity for it requires that (i) the limit of the sequence $\{\phi_n'(x)\}$ should be itself a derivative $\phi'(x)$, and that (ii) the convergence of $\phi_n'(x)$ to $\phi'(x)$ should be uniform.

At the present stage of our investigation we cannot foresee how much these conditions can be relaxed in a descriptive definition of an integral. We shall in fact establish three different (but related) conditions, which are each sufficient to ensure that

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx,$$
where $\{f_n(x)\}$ is a sequence of integrable functions with limit function $f(x)$, viz.

(i) the condition of 'bounded convergence', i.e. $|f_n(x)| < K$ for each $x$ in $[a, b]$ and for each $n$, $K$ being a constant independent of $x$ and $n$ (Theorem 8.7.7);
(ii) the condition of 'monotone convergence', i.e. $0 \le f_n(x) \le f_{n+1}(x)$ for each $x$ in $[-\infty, \infty]$ and for each $n$ (Theorem 9.4.2);
(iii) the condition of 'dominated convergence', i.e. $|f_n(x)| < \phi(x)$ for each $x$ in $[-\infty, \infty]$ and for each $n$, $\phi(x)$ being integrable over $[-\infty, \infty]$ (Theorem 9.3.7).

(The theorems quoted are even more general, for they refer to integrals over measurable sets of points rather than over intervals.)

We are now in a position to give a descriptive definition of an integral. The definite integral of $f(x)$ over an interval $[a, b]$ must be (when it exists!) a real number that depends upon the values assumed by $f(x)$ in this interval. It is therefore a 'functional'. Our investigation of the properties of a primitive suggests that the definite integral of $f(x)$ over an interval $[a, b]$ should be a positive, linear, functional satisfying the Lebesgue normalizing condition. Since the properties (P), (L), and (N) are possessed by the primitives of derivatives, these properties are consistent with one another and can be taken as a descriptive definition of an integral.
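
Why some such condition is needed can be seen from a pair of simple sequences on $[0, 1]$. The sketch below is an illustration under stated assumptions: the two sequences, the midpoint rule, and the names `power`, `spike`, and `midpoint_integral` are choices made for this example and do not come from the text. The sequence $x^n$ is bounded by 1 and its integrals tend to zero, the integral of its limit function. The sequence equal to $n$ on $(0, 1/n)$ and zero elsewhere also tends to zero at every point, but it is neither bounded, monotone, nor dominated by an integrable function, and its integrals remain equal to 1.

```python
def midpoint_integral(g, a, b, steps):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def power(n):
    """f_n(x) = x**n on [0, 1]; |f_n(x)| <= 1 for every n."""
    return lambda x: x ** n


def spike(n):
    """g_n(x) = n on (0, 1/n) and 0 elsewhere; tends to 0 at every point."""
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0


# STEPS is a multiple of every n used below, so the point 1/n falls on a
# boundary between cells and the midpoint rule integrates each spike exactly
# (up to rounding).
STEPS = 10_000

power_integrals = {n: midpoint_integral(power(n), 0.0, 1.0, STEPS) for n in (1, 10, 100)}
spike_integrals = {n: midpoint_integral(spike(n), 0.0, 1.0, STEPS) for n in (1, 10, 100)}

if __name__ == "__main__":
    print(power_integrals)
    print(spike_integrals)
```

In the first case the limit of the integrals equals the integral of the limit; in the second the limit of the integrals is 1 while the integral of the limit function is 0.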
2.3. Areas

In elementary analysis a constructive definition of the concept of 'area' is obtained by the method of 'exhaustion' invented by the Greek mathematicians Eudoxus and Archimedes. The method is most simply described by considering a bounded, positive, non-decreasing function $f(x)$ defined in an
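interval $[a, b]$, and the region $R$ in the $(x, y)$-plane specified by the relations

$$a \le x \le b, \qquad 0 \le y \le f(x).$$

The function $f(x)$ is not assumed to be continuous. (Naturally the same method is applicable to bounded, positive, non-increasing functions.) We divide the interval $[a, b]$ by a finite number of points

$$a = x_0 < x_1 < \ldots < x_n = b,$$

and bracket $f(x)$ between the step functions $\lambda(x) = f(x_p)$ and $\mu(x) = f(x_{p+1})$ in each interval $x_p \le x < x_{p+1}$ ($p = 0, 1, \ldots, n-1$). The integrals

$$\int_a^b \lambda(x)\,dx \quad \text{and} \quad \int_a^b \mu(x)\,dx$$

are the areas of the inscribed and escribed polygons respectively, i.e.

$$\int_a^b \lambda(x)\,dx = \sum_{p=0}^{n-1} (x_{p+1} - x_p) f(x_p),$$

$$\int_a^b \mu(x)\,dx = \sum_{p=0}^{n-1} (x_{p+1} - x_p) f(x_{p+1}).$$

Hence

$$\int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx = \sum_{p=0}^{n-1} (x_{p+1} - x_p)\{f(x_{p+1}) - f(x_p)\},$$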
[FIG. 1. Graph of $y = f(x)$ over the interval from $a$ to $b$ on the $x$-axis, with the end-points of the curve marked $A$ and $B$.]
and, if the maximum length of the intervals $(x_p, x_{p+1})$ for $p = 0, 1, 2, \ldots, n-1$, is $\epsilon$, then

$$0 \le x_{p+1} - x_p \le \epsilon$$

and

$$\int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx \le \epsilon \sum_{p=0}^{n-1} \{f(x_{p+1}) - f(x_p)\} = \epsilon\{f(b) - f(a)\}.$$
Thus, in the terminology of § 1.4, the function $f(x)$ is bracketed by the step functions $\lambda(x)$ and $\mu(x)$ with a tolerance $\epsilon\{f(b) - f(a)\}$, which can be made arbitrarily small by sufficiently increasing the number of sub-intervals into which the interval $[a, b]$ is divided. Now

$$0 \le \int_a^b \lambda(x)\,dx \le \int_a^b \mu(x)\,dx \le \sum_{p=0}^{n-1} (x_{p+1} - x_p) f(b) = (b-a) f(b).$$
Hence, if we consider all possible divisions of the interval $[a, b]$, the collection of integrals $\Lambda = \int_a^b \lambda(x)\,dx$ has an upper bound, while the collection of integrals $M = \int_a^b \mu(x)\,dx$ has a lower bound. Therefore the integrals $\Lambda$ have a supremum or least upper bound, $\sup \Lambda$, while the integrals $M$ have an infimum or greatest lower bound, $\inf M$. Also

$$0 \le \inf M - \sup \Lambda \le \epsilon\{f(b) - f(a)\}$$

for all $\epsilon > 0$. Therefore

$$\sup \Lambda = \inf M = A, \text{ say}.$$

Hence there is a unique number $A$ such that, for any division of the interval $[a, b]$,

$$\Lambda \le A \le M,$$

and this number is therefore defined to be the integral $\int_a^b f(x)\,dx$. This we may call the 'Archimedean' integral. Clearly

$$(b-a) f(a) \le \int_a^b f(x)\,dx \le (b-a) f(b),$$

so that the Archimedean integral of a monotone function satisfies the mean value theorem, and hence is a positive functional.
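
The estimates of this section can be checked exactly on a small example. The sketch below is illustrative only: the particular function (non-decreasing on $[0, 1]$ with a jump at $x = 1/2$), the unequal division, and the names `step_integrals` and `refine` are assumptions of the example. Exact rational arithmetic is used so that the inequalities are verified without rounding error. The value of the integral, found by elementary calculation, is 1.

```python
from fractions import Fraction as F


def f(x):
    """A positive, non-decreasing function on [0, 1] with a jump at x = 1/2."""
    return x if x < F(1, 2) else x + 1


def step_integrals(points):
    """Integrals of the lower and upper step functions for a given division."""
    pairs = list(zip(points, points[1:]))
    lower = sum((right - left) * f(left) for left, right in pairs)
    upper = sum((right - left) * f(right) for left, right in pairs)
    return lower, upper


def refine(points):
    """Insert the midpoint of every sub-interval of the division."""
    out = []
    for left, right in zip(points, points[1:]):
        out += [left, (left + right) / 2]
    return out + [points[-1]]


division = [F(0), F(1, 10), F(1, 4), F(3, 10), F(11, 20), F(4, 5), F(1)]
lower, upper = step_integrals(division)

# epsilon is the maximum length of the sub-intervals.
eps = max(right - left for left, right in zip(division, division[1:]))
bound = eps * (f(division[-1]) - f(division[0]))

finer_lower, finer_upper = step_integrals(refine(division))

if __name__ == "__main__":
    print(lower, upper, upper - lower, bound)
    print(finer_lower, finer_upper)
```

The difference of the two integrals does not exceed $\epsilon\{f(b) - f(a)\}$, and refining the division raises the lower integral and reduces the upper one, in agreement with the argument by which $\sup \Lambda$ and $\inf M$ are shown to coincide.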
2.4. The Lebesgue integral

The success of the method of exhaustion applied to the Archimedean integral clearly depends upon the closeness with which the bracketing functions $\lambda(x)$ and $\mu(x)$ approximate to the integrand $f(x)$. In fact since

$$\lambda(x) = f(x_p) \le f(x) \le f(x_{p+1}) = \mu(x)$$

in the interval $x_p \le x < x_{p+1}$, it follows that

$$\mu(x) - \lambda(x) = f(x_{p+1}) - f(x_p),$$

i.e. the variation of $f(x)$ in this interval. If $f(x)$ is non-decreasing and continuous in $[a, b]$ then, for any prescribed tolerance $\epsilon > 0$ we can choose a division of the interval such that $f(x_{p+1}) - f(x_p)$
$\{\phi_n'(x)\}$ is non-negative, monotonic in $x$, monotonic in $n$, and converges to zero as $n \to \infty$ in $[a, b]$, prove that

$$\phi_n(x) - \phi_n(a) \to 0 \quad \text{as } n \to \infty \qquad \text{(Denjoy)}.$$

2. If the function $f(x)$ possesses a derivative $f'(x)$ at each point of the interval $a \le x \le b$, $f'(a) = \alpha$, $f'(b) = \beta$, $\beta \neq \alpha$, and $\gamma$ lies between $\alpha$ and $\beta$, then there is a point $c$ between $a$ and $b$ such that $f'(c) = \gamma$ (Darboux).

3. Deduce from the properties (N), (L), (P) of the differential relation (§ 2.2) that, if $\phi(x)$ is a primitive of $f(x)$ in the interval $[a, b]$, and if

$$L \le f(x) \le M \quad \text{for } a \le x \le b,$$

then $L(b-a) \le \phi(b) - \phi(a) \le M(b-a)$. Hence deduce Rolle's theorem.

4. If $f(x)$ is a continuous non-decreasing function in $[a, b]$ and $c$ is any constant, prove that

$$\int_a^b \{c - f(x)\}\,dx = c(b-a) - \int_a^b f(x)\,dx.$$
3
The techniques of Lebesgue theory
3.1. The starting-point
In seeking for the widest possible generalization of the concept of an integral there are two directions that we may follow: we may try to generalize the concept of a primitive or we may try to generalize the concept of area. The first method has been followed by Perron, Ward, and Henstock, the second method by Riemann, Borel, Young, and Lebesgue. In following the method of Lebesgue the starting-point is necessarily the concept of the area of a rectangle and its immediate extension to the integrals of step functions such as the bracketing functions of § 2.3. To change the metaphor, no other material is available for the construction of the Lebesgue integral than the integrals of step functions. It follows that the only techniques available for the process of construction are those already employed in constructing the functions of analysis from step functions. These are the familiar algebraical techniques of addition, multiplication, and their inverses, together with the analytical techniques of limiting processes.
3.2. The method of bracketing
The two techniques of generalization characteristic of Lebesgue theory are the method of bracketing and the method of monotonic convergence. The method of bracketing has been briefly discussed in § 1.4 as a technique for extending the concept of integration, but it is interesting to indicate the very extensive class of functionals to which it can be applied and to expose the inherent restrictions in this method. Briefly we shall show that the method of bracketing can be applied to the class of functionals $I(f)$, which are characterized by the property that, if $f(x) \le g(x)$, then $I(f) \le I(g)$, but that bracketing is necessarily a 'closure' operation that can only be applied once to extend a given class of functionals.
DEFINITION 3.2.1. A functional $I(f)$, defined for a set $F$ of functions $f(x)$ defined in an interval $(a \le x \le b)$, is said to be 'monotonic' if $I(f) \le I(g)$ whenever $f(x) \le g(x)$ for all $x$ in $[a, b]$, and $f(x)$, $g(x)$ belong to the class $F$.
DEFINITION 3.2.2. A function $\phi(x)$ is said to be 'bracketed' by two functions $\lambda(x, \varepsilon)$, $\mu(x, \varepsilon)$ (with tolerance $\varepsilon > 0$) if $\lambda$ and $\mu$ belong to the domain $F$ of a monotonic functional $I(f)$ and if
\[ \lambda(x, \varepsilon) \le \phi(x) \le \mu(x, \varepsilon), \qquad I(\mu) - I(\lambda) \le \varepsilon. \]
THEOREM 3.2.1. If, for each prescribed tolerance $\varepsilon > 0$, the function $\phi(x)$ is bracketed by two functions $\lambda(x, \varepsilon)$, $\mu(x, \varepsilon)$ belonging to the domain $F$ of a functional $I(\lambda)$ then the supremum of $I(\lambda)$ and the infimum of $I(\mu)$ both exist and are equal.
Let $\lambda_0$, $\mu_0$ be any fixed pair of bracketing functions. Then
\[ \lambda_0 \le \phi \le \mu_0, \qquad \lambda \le \phi \le \mu. \]
Hence $I(\lambda) \le I(\mu_0)$ and $I(\mu) \ge I(\lambda_0)$. Thus the numbers $I(\lambda)$ are bounded above, and the numbers $I(\mu)$ are bounded below. Hence the numbers $I(\lambda)$ have a supremum or least upper bound, $\sup I(\lambda)$, and the numbers $I(\mu)$ have an infimum or greatest lower bound, $\inf I(\mu)$. For any prescribed tolerance $\varepsilon > 0$ there exist bracketing functions $\lambda$ and $\mu$ such that
\[ I(\mu) - I(\lambda) \le \varepsilon. \]
Hence
\[ 0 \le \inf I(\mu) - \sup I(\lambda) \le \varepsilon, \quad \text{for all } \varepsilon > 0, \]
and therefore
\[ \sup I(\lambda) = \inf I(\mu). \]
DEFINITION 3.2.3. With the notation and terminology of Theorem 3.2.1, the 'bracketed' functional $I^*(\phi)$ is defined to be
\[ I^*(\phi) = \sup I(\lambda) = \inf I(\mu). \]
THEOREM 3.2.2. If $\phi(x)$ belongs to the domain $F$ of the functional $I(\phi)$, then
\[ I^*(\phi) = I(\phi). \]
For
\[ \sup I(\lambda) = I(\phi) = \inf I(\mu). \]
Hence the bracketed functional $I^*(\phi)$ of a function of the class $F$ is equal to the functional $I(\phi)$. Thus the bracketed functional is a valid and self-consistent extension of the original functional, which we may call the bracketed extension.
It is important to notice at once the intrinsic limitation of the method of bracketing. Let $F$ be the domain of a monotonic functional $I(f)$ and $\Phi$ the domain of the bracketed functional $I^*(\phi)$. Is it possible to construct a further extension of the original functional by the use of functions $\psi(x)$ which are bracketed by functions from the class $\Phi$? The answer is in the negative.
THEOREM 3.2.3. If, for any prescribed tolerance $\varepsilon > 0$, a function $\psi(x)$ is bracketed by two functions $\phi_1(x)$ and $\phi_2(x)$ of the set $\Phi$, then $\psi(x)$ also belongs to the set $\Phi$.
For there exist bracketing functions $\phi_1$, $\phi_2$ of set $\Phi$ such that
\[ \phi_1 \le \psi \le \phi_2 \quad \text{and} \quad I^*(\phi_2) - I^*(\phi_1) \le \varepsilon. \]
Also there exist bracketing functions $\lambda_1$, $\mu_1$, $\lambda_2$, $\mu_2$ of set $F$ such that
\[ \lambda_1 \le \phi_1 \le \mu_1, \qquad \lambda_2 \le \phi_2 \le \mu_2 \]
and
\[ I^*(\mu_1) - I^*(\lambda_1) \le \varepsilon, \qquad I^*(\mu_2) - I^*(\lambda_2) \le \varepsilon. \]
Note also that
\[ I^*(\lambda_2) \le I^*(\phi_2) \quad \text{and} \quad I^*(\phi_1) \le I^*(\mu_1). \]
Then $\lambda_1 \le \psi \le \mu_2$ and
\[ I^*(\mu_2) - I^*(\lambda_1) = \{I^*(\mu_2) - I^*(\phi_2)\} + \{I^*(\phi_1) - I^*(\lambda_1)\} + \{I^*(\phi_2) - I^*(\phi_1)\} \]
\[ \le \{I^*(\mu_2) - I^*(\lambda_2)\} + \{I^*(\mu_1) - I^*(\lambda_1)\} + \{I^*(\phi_2) - I^*(\phi_1)\} \le 3\varepsilon. \]
Thus $\psi$ is bracketed with arbitrary tolerance $3\varepsilon$, by functions $\lambda_1$ and $\mu_2$ of the class $F$. Hence $\psi$ belongs to the same set $\Phi$ as the functions $\phi_1$, $\phi_2$, and we have not succeeded in making any further extension of the domain of the bracketed functional $I^*(\phi)$. The operation of bracketing is thus similar to the closure operation in point-set topology inasmuch as both are idempotent operations, i.e. the repetition of the operation produces no further extension of the set to which they are applied.
3.3. The Riemann integral
As we have shown in § 2.3 bracketing by step functions furnishes the integral of bounded, monotonic functions. In fact the scope of this technique is much larger and it provides the simplest definition of the Riemann integral. As before we divide the interval $[a, b]$ by a finite number of points $a = x_0 < x_1 < x_2 < \dots < x_n = b$. Let $L_p$ and $M_p$ be the greatest lower bound and the least upper bound respectively of a function $f(x)$ in the interval $x_p \le x < x_{p+1}$, for $p = 0, 1, 2, \dots, n-1$. Let $F_p$ be any number in the range $[L_p, M_p]$. Then the sum
\[ S = \sum_{p=0}^{n-1} F_p\,(x_{p+1} - x_p) \]
is a Riemann approximation to the integral of $f(x)$ in $[a, b]$. Let $\varepsilon = \max(x_{p+1} - x_p)$ for $p = 0, 1, 2, \dots, n-1$. As $\varepsilon \to 0$, $n \to \infty$ and the corresponding sums $S$ may converge to a limit. If so, this limit is the Riemann integral $R\int_a^b f(x)\,dx$.
It is, however, clear that $f(x)$ is bracketed by the step functions
\[ \lambda(x) = L_p, \qquad \mu(x) = M_p \qquad (x_p \le x < x_{p+1}), \]
and that
\[ \lambda(x) \le f(x) \le \mu(x), \]
while $\Lambda = \int_a^b \lambda(x)\,dx =$
if a
max( a, b).
This result is necessary to establish the consistency of our definition of the monotone extension of 1( ).
3.6. The Dini derivatives
In discussing the differentiation of an indefinite Lebesgue integral (§ 5.1) we shall have to take explicit cognisance of the fact that a continuous function does not necessarily possess a unique derivative everywhere. The elementary example $f(x) = |x|$ shows that $f(x)$ may not be differentiable at the origin, where
\[ \frac{f(x+h) - f(x)}{h} \to +1 \text{ or } -1, \quad \text{for } x = 0, \]
according as $h$ tends to zero through positive or negative values. By various ingenious methods it is possible to construct continuous functions that are not differentiable at any rational point, or indeed at any point whatsoever, and we refer to The theory of functions by E. C. Titchmarsh (§§ 11.21-11.23) for an account of what has been called the 'morbid pathology' of analysis.
When a function $f(x)$ fails to possess a derivative in the ordinary sense, i.e. when the incrementary ratio
\[ G(x, h) = \frac{f(x+h) - f(x)}{h} \]
does not tend to a unique limit as $|h| \to 0$, we can employ the peak and chasm functions of § 3.4 to define what are commonly called the 'Dini' derivatives, after the Italian mathematician who introduced them into analysis. There is this difference that in § 3.4 we were considering the limits of a set of numbers $f_n(x)$ which were defined for integral values of $n$, whereas now we are concerned with the limits of a set of numbers $G(x, h)$ which are defined for values of $h$ in a continuous interval, $-\delta < h < \delta$.
Let $\pi(x, \delta)$ be the supremum of $G(x, h)$ and $\chi(x, \delta)$ be the infimum value of $G(x, h)$ in the domain $0 < h < \delta$. Then as $\delta \to 0$, $\pi(x, \delta)$ is non-increasing, and $\chi(x, \delta)$ is non-decreasing. Therefore $\pi(x, \delta)$ and $\chi(x, \delta)$ each tend monotonically to unique limits as $\delta \to 0$. These are upper and lower Dini derivatives of $f(x)$ on the right, usually denoted by
\[ D^+f(x) = \lim_{\delta \to 0} \pi(x, \delta) = \limsup G(x, h), \]
\[ D_+f(x) = \lim_{\delta \to 0} \chi(x, \delta) = \liminf G(x, h) \qquad (0 < h < \delta,\ \delta \to 0). \]
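The definitions of $\pi(x, \delta)$ and $\chi(x, \delta)$ can be approximated by sampling the incrementary ratio. This is a rough numerical sketch only: the names `dini_right` and `dini_left`, the sampling grid, and the second test function $x \sin(1/x)$ are assumptions made for illustration, and sampling can only approximate a supremum or infimum.

```python
import math


def _ratios(f, x, hs):
    # incrementary ratio G(x, h) = (f(x + h) - f(x)) / h on the sample points
    return [(f(x + h) - f(x)) / h for h in hs]


def dini_right(f, x, delta, samples=20000):
    """Sampled approximations to (pi(x, delta), chi(x, delta)) for 0 < h < delta."""
    hs = [delta / (1.0 + 0.001 * j) for j in range(1, samples + 1)]
    r = _ratios(f, x, hs)
    return max(r), min(r)


def dini_left(f, x, delta, samples=20000):
    """The same for -delta < h < 0 (derivatives on the left)."""
    hs = [-delta / (1.0 + 0.001 * j) for j in range(1, samples + 1)]
    r = _ratios(f, x, hs)
    return max(r), min(r)


def oscillating(x):
    # continuous at 0, but G(0, h) = sin(1/h) oscillates between -1 and +1
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0


print(dini_right(abs, 0.0, 0.1), dini_left(abs, 0.0, 0.1))
print(dini_right(oscillating, 0.0, 0.01))
```

For $|x|$ at the origin the sampled upper and lower values coincide on each side, whereas for the oscillating example they stay apart however small $\delta$ is taken.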
$I(f_n)$ exists,
\[ \sum_{p=1}^{n} f_p \to f \quad \text{as } n \to \infty, \]
and $I(f)$ exists, are sufficient to ensure that
\[ \sum_{p=1}^{\infty} I(f_p) = I(f). \]
1. Prove that the integral of a step function is a monotonic, positive, linear, absolute functional.
2. If $I^*(\phi)$ is the bracketed extension of $I(f)$, prove that (i) if $I$ is positive, so also is $I^*$, (ii) if $I$ is additive, so also is $I^*$, (iii) if $I$ is multiplicative, so also is $I^*$, (iv) if $I$ is linear, so also is $I^*$, (v) if $I$ is linear and absolute, so also is $I^*$, (vi) if $I$ is positive, linear, additive, and completely additive, so also is $I^*$.
3. The Riemann integral $R(f)$ can be defined as the bracketed extension of the monotone functional $I(f)$ for step functions in a bounded interval. Examine which of the properties listed above are possessed by the Riemann integral.
4. Show that the Riemann integral over a bounded interval of non-negative, non-decreasing functions $f(x)$ is a completely additive functional (Denjoy 1941-9, p. 428), i.e. if $f_n \ge 0$, $f_n(x)$ is non-decreasing in $x$, $f_n(x)$ is uniformly bounded, then $\sum_{p=1}^{n} f_p(x)$ converges to a non-decreasing function $f(x)$ as $n \to \infty$, and
\[ \sum_{p=1}^{\infty} I(f_p) = I(f). \]
5. (i) If $f_n(x)$ possesses a non-negative derivative $f_n'(x)$ for each integer $n$ and each $x$ in $(a, b)$, and if
\[ s_n(x) = \sum_{p=1}^{n} f_p(x) \]
converges to a differentiable sum function $s(x)$ in $(a, b)$, show that the series $\sum_{p=1}^{\infty} f_p'(x)$ converges in $(a, b)$ to a function $A(x)$ such that $A(x) \le s'(x)$. (Show that $s(x+h) - s(x) \ge \sum_{p=1}^{n} \{f_p(x+h) - f_p(x)\}$.)
(ii) Show that there is a sequence of integers $p_n$ such that the series
\[ \sum_{n=1}^{\infty} \{s(x) - s_{p_n}(x)\} \]
converges for each $x$ in $(a, b)$. Hence deduce that
\[ s_{p_n}'(x) \to s'(x) \quad \text{and} \quad s_n'(x) \to s'(x) \]
as $n \to \infty$. (This is a mild form of a theorem due to Fubini. Note that $s(x) - s_{p_n}(x) \le s(b) - s_{p_n}(b)$.)
6. If $\phi(x) = f(x) + g(x)$, prove that
\[ D_+f + D_+g \le D_+\phi \le D^+\phi \le D^+f + D^+g. \]
By considering the functions $f(x) = |x|$, $g(x) = x - |x|$, show that the signs $\le$ cannot be replaced by $=$.
7. If $f(x)$ has a unique derivative $f'(x)$ prove that
\[ D^+(f+g) = f'(x) + D^+g. \]
4
Indicators
4.1. Introduction
We have it on the authority of Henri Poincare (Œuvres de Laguerre, tome 1, Preface, p. x, Paris, 1898) that 'in the mathematical sciences a good notation has the same philosophical importance as a good classification in the natural sciences'. In the theory of sets of points there are many advantages in adopting the notation due to Charles de la Vallee-Poussin (1916), by which the whole of the theory is expressible in analytical form, rather than in the usual geometrical language. In particular, it is unnecessary to memorize formulae for the manipulation of special symbols for the intersection and complements of sets of points.
DEFINITION 4.1.1. The 'characteristic function' $\chi(x, E)$ of a set of points $E$ in a space $R$ is defined by the relations
\[ \chi(x, E) = \begin{cases} 1 & (x \in E), \\ 0 & (x \notin E), \end{cases} \]
$x$ being any point of $R$. In view of the numerous meanings that have been given to the adjective 'characteristic', we shall follow the lead of modern books on probability theory and call the function $\chi(x, E)$ the 'indicator' of the set $E$. Thus the indicator of the whole space $R$ is the function $f(x) = 1$ and the indicator of the 'empty set' is $f(x) = 0$. When we are discussing the properties of some specified set $E$ we may write $\chi(x)$ for $\chi(x, E)$ and often we may further abbreviate $\chi(x)$ to the single letter $\chi$. Just as we commonly speak of 'the point $(x, y)$', meaning the point with coordinates $(x, y)$, so we shall speak of the 'set' or 'set of points $\chi(x, E)$', meaning the set of points $E$ with indicator $\chi(x, E)$.
The fundamental property of indicators is given by
THEOREM 4.1.1. The necessary and sufficient condition that a function $f(x)$, i.e. a mapping from the space $S$ of the points $x$ to the space $R$ of the real numbers, should be an indicator is that
\[ \{f(x)\}^2 = f(x) \quad \text{for each } x. \]
Since the elements $a$ of a Boolean algebra are characterized by the relation $a^2 = a$ it seems appropriate to give the theory of indicators the name of Boolean analysis. Thus point-set topology can be expressed in the form of Boolean analysis, and we shall proceed to summarize the relevant properties of sets of points in this form. The first fundamental relation in point-set theory is that of 'inclusion'.
DEFINITION 4.1.2. A set $\alpha$ is 'included' in a set $\beta$, or 'covered' by a set $\beta$ if each point of $\alpha$ is also a point of $\beta$.
THEOREM 4.1.2. The necessary and sufficient condition that the set $\alpha$ should be 'covered' by the set $\beta$ is that
\[ \alpha(x) = \alpha(x)\beta(x) \quad \text{for each } x. \]
For, if $\alpha$ is covered by $\beta$ then, by Definition 4.1.2, $\alpha(x) = 1$ implies that $\beta(x) = 1$, whence
\[ \alpha(x) \le \beta(x) \quad \text{for each } x. \]
Now
\[ 0 \le \alpha(x) \le 1 \quad \text{and} \quad 0 \le \beta(x) \le 1, \]
whence
\[ 0 \le \alpha(1-\beta) \le \beta(1-\beta) = 0, \]
and
\[ \alpha(1-\beta) = 0, \quad \text{i.e.} \quad \alpha(x) = \alpha(x)\beta(x). \]
Conversely, if $\alpha = \alpha\beta$ then, either $\alpha = 0$ or $\beta = 1$ and in either case $\alpha \le \beta$.
The second fundamental relation is that of 'disjunction'.
DEFINITION 4.1.3. A finite or enumerable collection of sets, $\alpha_1, \alpha_2, \dots$, is said to be 'disjoint' if $\alpha_p \alpha_q = 0$ for $p \ne q$, i.e. if no two different sets have a point in common.
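The three relations stated so far (the idempotent property of Theorem 4.1.1, the covering criterion of Theorem 4.1.2, and disjointness) translate directly into arithmetic on 0/1-valued functions. The sketch below is an illustration only: it assumes a small finite space of integers, and the helper name `indicator` and the particular sets are hypothetical.

```python
def indicator(points):
    """Indicator of a set, here a finite Python set used purely for illustration."""
    members = frozenset(points)
    return lambda x: 1 if x in members else 0


space = range(10)                    # a small finite stand-in for the space R
alpha = indicator({2, 3})
beta = indicator({1, 2, 3, 5})
gamma = indicator({7, 8})

# Theorem 4.1.1: an indicator satisfies f(x)^2 = f(x) for each x
is_idempotent = all(alpha(x) ** 2 == alpha(x) for x in space)

# Theorem 4.1.2: alpha is covered by beta iff alpha(x) = alpha(x) * beta(x)
alpha_in_beta = all(alpha(x) == alpha(x) * beta(x) for x in space)
beta_in_alpha = all(beta(x) == beta(x) * alpha(x) for x in space)

# Definition 4.1.3: alpha and gamma are disjoint iff their product vanishes
alpha_gamma_disjoint = all(alpha(x) * gamma(x) == 0 for x in space)

print(is_idempotent, alpha_in_beta, beta_in_alpha, alpha_gamma_disjoint)
```

No set-theoretic operations appear in the checks themselves, only products and equalities of the indicator values, which is the point of the notation.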
4.2. Boolean convergence
Since it is one of the distinguishing features of the Lebesgue theory of measure to consider enumerable collections of sets of points, we proceed at once to consider the conditions for the convergence of a sequence of indicators $\{\chi_n(x)\}$ ($n = 1, 2, \dots$). In classical analysis the condition for the convergence of a sequence $\{s_n\}$ ($n = 1, 2, \dots$) to a limit $s$ as $n$ tends to infinity is that, if $\varepsilon$ is any prescribed tolerance, then $|s_n - s|$
rxn+l > ···• prove that the intersection of all the rxn is not empty.
6. Show that the indicator of the rational points in $[0, 1]$ can be expressed in the form
\[ \chi(x) = \lim_{m \to \infty} \Bigl\{ \lim_{n \to \infty} (\cos m!\,\pi x)^{2n} \Bigr\}. \]
SETS OF ZERO MEASURE
5
Differentiation of monotone functions
5.1. Introduction
In Chapter 1 in order to provide a motivation for the search for the most general concept of integration we introduced the ideas of 'tame' and 'wild' functions. The tame functions of elementary calculus are the bounded continuous functions with finite derivatives everywhere. Then there are functions such as
\[ \operatorname{sgn} x, \quad |x|, \quad [x], \]
which are tame everywhere except at one point (in these cases the origin). From these we can construct functions such as
\[ \sum_{p=1}^{q} |x - p/q| \quad (p, q \text{ integers}), \qquad \sum_{p=1}^{q} \operatorname{sgn}(x - p/q), \]
which are tame everywhere except at a finite number of points; and functions such as
which is tame everywhere except at the rational points x = r 1 , r 2 , •••• Thus it appears that there are degrees of wildness and that one way of quantifying the wildness of a function is by giving some general specification of the points at which it ceases to be tame. Such a criterion of wildness should have some practical utility and be related to the general concept of integration. Thus if the 'wild' points of a wild function could be neglected in constructing its integral, the function could be described as 'wild but harmless'.
For example all physicists would regard $|x|$ as a primitive of $\operatorname{sgn} x$ disregarding the discontinuity in $\operatorname{sgn} x$ at the origin. Similarly any finite number of discontinuities in an otherwise continuous integral are commonly ignored. But how far can we go in neglecting wild points? What is the largest collection of wild points that can be safely ignored in integration? The answer to this question is furnished by Lebesgue's theory of sets of points of zero measure.
5.2. Sets of points of measure zero
The length of an interval $I$ will be denoted by $|I|$ and the area of a rectangle $R$ by $|R|$.
DEFINITION 5.2.1. A set of points $E$ on the real axis is said to have zero one-dimensional measure if, to each positive number $\varepsilon$ there corresponds an enumerable collection of intervals $\{I_n\}$ ($n = 1, 2, \dots$), which cover the set $E$ and whose total length
\[ I = \sum_{n=1}^{\infty} |I_n| \]
does not exceed $\varepsilon$. We note that the intervals may possibly be overlapping, and that it is immaterial whether the intervals are open, closed, or half-open. Also, since the terms $|I_1|, |I_2|, \dots$ of the series are each positive, the sum $I$ is independent of the order of enumeration.
DEFINITION 5.2.2. A set of points $E$ in the $(x, y)$-plane is said to have zero two-dimensional measure, if to each positive number $\varepsilon$ there corresponds an enumerable collection of rectangles $\{R_n\}$ (with edges parallel to the lines $x = 0$ or $y = 0$) which cover the set $E$ and whose total area
\[ R = \sum_{n=1}^{\infty} |R_n| \]
does not exceed $\varepsilon$. As before, the rectangles may possibly be overlapping and may or may not include their edges or vertices. When there is no danger of confusion we shall use the shorter description 'sets of zero measure', or 'null sets' for sets with zero one-dimensional measure.
When a function $f(x)$ possesses a property $P$ for each point of an interval $(a, b)$ except for the points of a set of measure zero it is customary to say that '$f(x)$ possesses the property $P$ almost everywhere in $(a, b)$' or to write '$f(x)$ possesses the property $P$ p.p. in $(a, b)$' (p.p. being the abbreviation for presque partout). The essential point in these definitions, due effectively to Lebesgue, is the use of an enumerable collection of intervals (or rectangles) to cover the set of points $E$. It is obvious that, in one dimension, a single point, or a finite number of points has zero measure, and we shall prove that the same is true for an enumerable set of points.
THEOREM 5.2.1. Any enumerable set of points $E = \{x_1, x_2, \dots\}$ has zero measure.
For $E$ is covered by the enumerable collection of intervals
\[ I_n = \Bigl\{x;\ x_n - \frac{\varepsilon}{2^{n+1}} < x < x_n + \frac{\varepsilon}{2^{n+1}}\Bigr\} \]
whose total length is
\[ \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n}} = \varepsilon. \]
COROLLARY. The set of rational points, i.e. the points whose coordinates $x_1, x_2, \dots$ are each a rational number, is enumerable and therefore has zero measure.
THEOREM 5.2.2. If $E_1, E_2, \dots$ is an enumerable collection of sets, each of measure zero, then their union $E$ is also of measure zero.
For the set $E_k$ can be covered by an enumerable collection of intervals $(a_{kn}, b_{kn})$ ($n = 1, 2, \dots$) of total length less than $\varepsilon/2^k$, where $\varepsilon$ is any arbitrary tolerance. Hence the union of $E_1, E_2, \dots$ can be covered by an enumerable collection of intervals of total length less than $\varepsilon$.
The question naturally arises whether a set of points of zero measure is necessarily enumerable. The answer is in the negative as is shown by the example of the 'Cantor set' in the next section.
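The covering used in the proof of Theorem 5.2.1 can be written out for a finite initial segment of an enumeration of the rational points of $[0, 1]$. This is a sketch under assumptions: the enumeration by increasing denominator and the names `rationals_in_unit_interval` and `cover` are illustrative choices, and exact rational arithmetic is used so that the total length can be compared with $\varepsilon$ without rounding.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_q):
    """First terms of one possible enumeration of the rationals in [0, 1],
    ordered by denominator (an assumed enumeration; any other would do)."""
    seen, out = set(), []
    for q in range(1, max_q + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out


def cover(points, eps):
    """Intervals I_n = (x_n - eps/2^(n+1), x_n + eps/2^(n+1)), n = 1, 2, ..."""
    return [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
            for n, x in enumerate(points, start=1)]


eps = Fraction(1, 100)
points = rationals_in_unit_interval(30)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)

print(len(points), float(total), float(eps))
```

However many points of the enumeration are taken, the total length of the covering intervals stays below the prescribed $\varepsilon$, since the $n$-th interval has length $\varepsilon/2^n$.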
5.3. The Cantor set of points
To construct the Cantor set of points we proceed as follows. From the unit interval, $0 \le x \le 1$, we remove in succession $E_1$, the middle third, $\tfrac{1}{3} < x < \tfrac{2}{3}$, $E_2$, the middle thirds of the remaining intervals, viz.
(i) (ii)
l (iii)
f.r
0, the set $C$ has zero measure. The set $C$, first constructed by Cantor, is not, however, enumerable. To prove this we observe that any number $x$ in the interval $0 \le x \le 1$ can be expressed in the form
\[ x = \sum_{n=1}^{\infty} \frac{a_n}{3^n}, \]
where each numerator $a_n$ is either 0, 1, or 2. The points of $E_1$ can be expressed as
\[ x = \sum_{n=2}^{\infty} \frac{a_n}{3^n} + \frac{1}{3}; \]
the points of $E_2$ as
\[ x = \sum_{n=3}^{\infty} \frac{a_n}{3^n} + \frac{1}{3^2} + \Bigl(0 \text{ or } \frac{2}{3}\Bigr); \]
the points of $E_3$ as
\[ x = \sum_{n=4}^{\infty} \frac{a_n}{3^n} + \frac{1}{27} + \Bigl(0 \text{ or } \frac{2}{3} \text{ or } \frac{2}{9} \text{ or } \frac{2}{3} + \frac{2}{9}\Bigr). \]
In general, the points of $E_n$ are those for which $x$ can be expressed in the form
\[ x = \sum_{k=1}^{\infty} \frac{a_k}{3^k}, \]
where $a_n = 1$, $a_k = 0$ or 2 for $k < n$, and $a_k = 0$, 1, or 2 for $k > n$.
Hence the points of the Cantor set $C$, i.e. the points of the interval $(0, 1)$ that remain after the removal of $E_1, E_2, \dots$, can be represented only in the form
\[ x = \sum_{k=1}^{\infty} \frac{a_k}{3^k}, \]
where each $a_k$ is either 0 or 2.
If possible let these points be enumerable. Then they will form a sequence $\{x_n\}$, with $n = 1, 2, \dots$, and $x_n$ expressible in the form
\[ x_n = \sum_{k=1}^{\infty} \frac{a_{n,k}}{3^k}, \]
where each $a_{n,k}$ is either 0 or 2. Now consider the point
\[ \xi = \sum_{k=1}^{\infty} \frac{\theta_k}{3^k}, \]
where $\theta_k = 2 - a_{k,k}$. This point lies in the interval $(0, 1)$ and belongs to the Cantor set since $\theta_k = 0$ or 2. But $\theta_k$ is always different from $a_{k,k}$. Hence $\xi$ is different from each number $x_n$, for $n = 1, 2, \dots$, i.e. the number $\xi$ that belongs to $C$ is not included in the given enumeration of the points of $C$. We have thus arrived at a contradiction, which proves that the Cantor set $C$ is not enumerable.
5.4. Average metric density
A simple and useful criterion to prove that a set of points $E$ has zero measure (Boas 1960, p. 64), is conveniently expressed in terms of the concept of the 'average metric density' of the set $E$ in an interval $I$.
DEFINITION 5.4.1. If the subset of $E$ that lies in the interval $I$, of length $|I|$, can be covered by an enumerable collection of intervals of total length not greater than $p\,|I|$, then we say that the 'average metric density' of the set $E$ in the interval $I$ is not greater than $p$. Clearly $0 \le p \le 1$.
THEOREM 5.4.1. If the average metric density of a bounded set $E$ is not greater than a number $p$ less than unity, for all intervals $I$, then $E$ is a set of measure zero.
Consider the subset of $E$ in an interval $(a, b)$. This subset can be covered by an enumerable collection of intervals $(a_k, b_k)$ ($k = 1, 2, \dots$) of total length not greater than $p(b-a)$. Now the subset of $E$ in $(a_k, b_k)$ can be covered by an enumerable collection of intervals $(a_{kl}, b_{kl})$ ($l = 1, 2, \dots$) of total length not greater than $p(b_k - a_k)$. Hence the subset of $E$ in $(a, b)$ can be covered by an enumerable collection of intervals $(a_{kl}, b_{kl})$ ($k, l = 1, 2, \dots$) of total length not greater than
\[ \sum_{k} p\,(b_k - a_k) \le p^2 (b-a). \]
By mathematical induction it follows that the subset of $E$ in $(a, b)$ can be covered by an enumerable collection of intervals of total length not greater than $p^n(b-a)$, where $n$ is any positive integer. Since $n$ is arbitrary, the subset of $E$ in $(a, b)$ has zero measure.
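The geometric decay $p^n(b-a)$ in this induction can be seen concretely on the Cantor set of § 5.3, where each stage of the construction replaces every remaining interval by two intervals of one third of its length. The sketch below is illustrative only: the function names are hypothetical, and it simply tabulates the closed intervals left after removing $E_1, \dots, E_n$, whose total length is $(2/3)^n$.

```python
from fractions import Fraction


def cantor_stage(n):
    """Closed intervals of [0, 1] remaining after the middle thirds
    E_1, ..., E_n have been removed; together they cover the Cantor set."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))      # left third is kept
            refined.append((b - third, b))      # right third is kept
        intervals = refined
    return intervals


def total_length(intervals):
    return sum(b - a for a, b in intervals)


for n in range(6):
    stage = cantor_stage(n)
    print(n, len(stage), total_length(stage))
```

Each stage is a finite cover of the Cantor set, and the total length falls below any prescribed tolerance once $n$ is large enough, in the same way as the bound $p^n(b-a)$ with $p = 2/3$.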
5.5. The problem of differentiation
We are now in a position to enunciate the central result of the Lebesgue theory of differentiation as follows. If the function $f(x)$ is continuous and non-decreasing in the interval $(a, b)$, then $f(x)$ possesses a derivative $f'(x)$ at all points of this interval, with the exception of a set of points $Z$ of zero measure. This remarkable result, which finds so many applications, not only in the theory of integration but also in the whole of analysis and differential geometry, was first discovered by Lebesgue (1904, p. 128). The line of argument that we shall follow is that given by F. Riesz (1953, pp. 6-7) as simplified by R. P. Boas (1960, p. 134). The main strategy is to show that the following chain of inequalities holds almost everywhere in $(a, b)$, viz.
\[ 0 \le D^+f(x) \le D_-f(x) \le D^-f(x) \le D_+f(x) \le D^+f(x) < \infty. \]
It then follows at once that the four Dini derivatives of $f(x)$ are finite and equal almost everywhere, i.e. $f(x)$ possesses a finite derivative almost everywhere in $(a, b)$. To establish these inequalities it is sufficient to prove that, if $f(x)$ is any continuous, non-decreasing function, then
\[ D^+f(x) < \infty \quad \text{p.p.} \]
and
\[ D^+f(x) \le D_-f(x) \quad \text{p.p.} \qquad \text{(I)} \]
For, if $y = -x$ and $g(y) = -f(x)$, then
\[ g(y+h) = -f(-y-h) = -f(x-h), \]
\[ \frac{g(y+h) - g(y)}{h} = \frac{f(x-h) - f(x)}{-h}. \]
Hence
\[ D^+g(y) = D^-f(x) \quad \text{and} \quad D_-g(y) = D_+f(x). \]
But $g(y+h) - g(y) = f(x) - f(x-h)$, so that the function $g(y)$, like $f(x)$, is continuous and non-decreasing in $y$. Therefore (I) implies that
\[ D^+g(y) \le D_-g(y) \quad \text{p.p.,} \]
whence
\[ D^-f(x) \le D_+f(x) \quad \text{p.p.} \qquad \text{(II)} \]
Now by the very definition of the Dini derivatives
\[ D_-f(x) \le D^-f(x) \quad \text{and} \quad D_+f(x) \le D^+f(x). \qquad \text{(III)} \]
Hence, by combining the inequalities (I), (II), and (III), we find the desired result:
\[ D^+f(x) \le D_-f(x) \quad \text{p.p.} \quad \text{(I)} \]
\[ \le D^-f(x) \quad \text{(III)} \]
\[ \le D_+f(x) \quad \text{p.p.} \quad \text{(II)} \]
\[ \le D^+f(x). \quad \text{(III)} \]
The problem is thus resolved into the proof of the two inequalities
\[ D^+f(x) < \infty \quad \text{p.p.} \quad \text{and} \quad D^+f(x) \le D_-f(x) \quad \text{p.p.} \]
for any continuous, non-decreasing function $f(x)$.
5.6. The 'rising sun' lemma
Various methods have been devised to construct enumerable collections of intervals covering the points at which $D^+f(x) < \infty$ or $D^+f(x) \le D_-f(x)$, such as those invented by Vitali (Burkill 1953, p. 46), de la Vallee-Poussin (1916), and Rajchman and Saks (Titchmarsh, p. 358). But the method devised by F. Riesz has the advantage of a simple geometric interpretation and we shall therefore adopt it here, following the vivid account given by R. P. Boas (1960, p. 134). The geometrical significance of the 'rising sun' lemma is easily grasped if we regard the graph, $y = f(x)$, of a continuous function $f(x)$, as the profile of a series of parallel ridges (parallel to the $z$-axis!) of a mountain range, illuminated by the horizontal rays of the rising sun, at infinity on the $x$-axis (Fig. 2).
FIG. 2
Some points on the ridges are in the sunshine and some are in the shadows cast by ridges on their right. The points in the shadows occupy a number of hollows, such as $x' < x < x''$ in which $f(x) < f(x'')$, whence $f(x') \le f(x'')$. In order to give precision and rigour to these geometrical intuitions we frame the following definition and theorem.
DEFINITION 5.6.1. If $f(x)$ is continuous in the interval $[a, b]$, a point $x$ is said to be shaded, or dominated, by a point $\xi$ if
\[ a \le x < \xi \le b \quad \text{and} \quad f(x) < f(\xi). \]
THEOREM 5.6.1 (the 'rising sun' lemma). The dominated points form an enumerable collection of disjoint open intervals $(a_k, b_k)$ ($k = 1, 2, \dots$) such that if
(x)
< .\
(x) $< D^+\phi(x)$ is the union of the sets $E_{\lambda,\mu}$. Since $\lambda$ and $\mu$ are rational these sets are enumerable. Hence, by Theorem 5.7.2, the set $E$ has measure zero. Therefore, almost everywhere,
\[ 0 \le D^+\phi(x) \le D_-\phi(x) \]
\[ \le D^-\phi(x) \quad \text{(III in § 5.5)} \]
\[ \le D_+\phi(x) \quad \text{(II in § 5.5)} \]
\[ \le D^+\phi(x) \quad \text{(III in § 5.5)} \]
$\phi(x)$ is differentiable almost everywhere in $(a, b)$.
5.8. The differentiation of series of monotone functions
Denjoy's theorem (§ 3.7, Exercise 4) on the integration of a monotone sequence of monotone functions has a companion theorem due to Fubini (Riesz and Sz-Nagy 1953, p. 12), on the differentiation of a convergent series of monotone functions.
THEOREM 5.8.1. If $f_n(x)$ is a continuous, non-decreasing function of $x$ in $[a, b]$ for each value of $n$ $(1, 2, \dots)$ and if the series
\[ \sum_{n=1}^{\infty} f_n(x) \]
converges to a sum function $s(x)$ at each point of $[a, b]$, then almost everywhere in $[a, b]$, the series of derivatives
\[ \sum_{n=1}^{\infty} f_n'(x) \]
exists and converges to the derivative $s'(x)$.
Apart from a set of values of $x$ of measure zero the non-decreasing functions $f_n(x)$, and the partial sums
\[ s_n(x) = \sum_{k=1}^{n} f_k(x), \]
together with the sum function $s(x)$, have finite, non-negative derivatives in $(a, b)$. To study the convergence of the sequence
\[ s_n'(x) = \sum_{k=1}^{n} f_k'(x) \]
we note that
\[ \frac{s(x+h) - s(x)}{h} \ge \frac{s_n(x+h) - s_n(x)}{h} \quad (h > 0), \]
whence
\[ s'(x) \ge s_n'(x) \quad \text{p.p. in } (a, b). \]
The functions $f_n'(x)$ are non-negative, whence $\{s_n'(x)\}$ is a non-decreasing sequence. We have just shown that it is bounded by $s'(x)$. Hence the sequence $\{s_n'(x)\}$ is convergent almost everywhere in $[a, b]$. To show that its limit is $s'(x)$, we select a sub-sequence $s_{n(k)}(x)$, such that
\[ 0 \le s(b) - s_{n(k)}(b) \le 1/2^k \quad (k = 1, 2, \dots). \]
Then
\[ s(x) - s_{n(k)}(x) = \sum_{p = n(k)+1}^{\infty} f_p(x), \]
whence $s(x) - s_{n(k)}(x)$ is a non-decreasing function of $x$. Hence
\[ 0 \le s(x) - s_{n(k)}(x) \le s(b) - s_{n(k)}(b) \le 1/2^k. \]
Therefore the series
\[ \sum_{k=1}^{\infty} \{s(x) - s_{n(k)}(x)\} \]
is a convergent series of non-decreasing functions.
We can apply the result established above for the similar series $\sum_{k=1}^{\infty} f_k(x)$, viz. that the differentiated series is convergent, almost everywhere in $[a, b]$. Thus the series
\[ \sum_{k=1}^{\infty} \{s'(x) - s_{n(k)}'(x)\} \]
is convergent almost everywhere in $(a, b)$. Therefore
\[ s'(x) - s_{n(k)}'(x) \to 0 \quad \text{as } k \to \infty. \]
But the sequence $\{s_n'(x)\}$ is non-decreasing in $n$. Hence
\[ s'(x) - s_n'(x) \to 0 \quad \text{as } n \to \infty. \]
5.9. Exercises
1. Show that the least upper bound of the average metric density of a bounded set $E$ (taken over all sub-intervals of an interval $I$) is either 0 or 1.
2. If
$E_n$ is the union of $E_{mn}$ for $m = 1, 2, \dots, 2^{n-1}$, and $E$ is the union of the enumerable collection $E_1, E_2, \dots$, show that Cantor's set $C$ is the complement of $E$ with respect to $[0, 1]$.
3. Show that a necessary and sufficient condition that a set of points $E$ should have measure zero is that there should exist an enumerable collection of intervals $I_1, I_2, \dots$ of finite total length $\sum_{n=1}^{\infty} |I_n|$ such that each point of $E$ is interior to an infinite number of these intervals (Riesz-Nagy 1953, p. 6).
4. If $f(x)$ is non-increasing and not necessarily continuous in $(a, b)$, show that the limits
\[ f(x+0) = \lim_{h \to 0} f(x+h) \quad (h > 0), \qquad f(x-0) = \lim_{h \to 0} f(x-h) \quad (h > 0) \]
exist at each point $x$ in $(a, b)$. Define $f(a-0)$ and $f(b+0)$ to be $f(a)$ and $f(b)$ respectively. Let
\[ F(x) = \max\{f(x-0),\ f(x),\ f(x+0)\}. \]
A point $x = s$ in $(a, b)$ is said to be dominated if there exists a point $x = \xi$ such that and Show that the dominated points form an open set, $E$. If $(a_k, b_k)$ is an open interval belonging to $E$, prove that
\[ f(a_k + 0) \le F(b_k). \]
Show that the points at which $f(x) \ne F(x)$ are enumerable. Hence deduce that $f(x)$ is differentiable almost everywhere in $(a, b)$.
LEBESGUE THEORY IN ONE DIMENSION
6
Geometric measure of outer sets and inner sets
6.1. Introduction
It appears from the considerations advanced in § 2.4 that the whole problem of the integration of bounded functions $f(x)$ can be reduced to the problem of the integration of the indicators $\alpha(x, t, f)$ of the sets of points at which $f(x) > t$. We therefore embark on a systematic method of integrating indicators, guided by the descriptive definition of an integral developed in Chapter 2.
When the integral
\[ \int \alpha(x)\,dx \]
has been defined for an indicator $\alpha$, its value will be a generalization of the length of an interval that we shall call the geometric measure $g(\alpha)$ of the set $\alpha$. To define this integral we shall employ the method of bracketing (§ 3.2) and shall therefore need to construct outer and inner bracketing functions $\mu(x)$, $\lambda(x)$, such that $\lambda(x) \le \alpha(x) \le \mu(x)$. These bracketing functions will be the indicators of certain outer and inner sets of points, which we now proceed to define.
6.2. Elementary sets
The theory of measure in one dimension as given by Lebesgue and by de la Vallee-Poussin is expressed almost entirely in terms of open and closed sets. The exception is the fundamental union-intersection theorem (6.2.5 and 6.3.9), which requires the use of general intervals that may or may not include either of their extremities. Kolmogorov and Fomin (1961) and Williamson (1962) have simplified the original Lebesgue theory by the systematic use of general intervals throughout the theory, and we shall follow their example in order to give explicit proofs of the basic theorems in a form that can readily be extended to spaces of two or more dimensions.
DEFINITION 6.2.1. In one dimension an 'interval' is a set of points $x$ such that $a \prec x \prec b$, where the symbol $\prec$
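and each coefficient, $a_s$ or $b_t$, is either 0 or 1.
Now
\[ \sigma = (\sigma - \sigma\tau) + \sigma\tau, \qquad \tau = (\tau - \sigma\tau) + \sigma\tau, \]
and the elementary sets $\sigma - \sigma\tau$, $\tau - \sigma\tau$, $\sigma\tau$ are disjoint. Let $\{\gamma_s\}$ be the finite collection of disjoint intervals in these elementary sets. Then
\[ \sigma = \sum_s a_s \gamma_s, \qquad \tau = \sum_s b_s \gamma_s, \]
and
\[ \sigma \cup \tau = (\sigma - \sigma\tau) + (\tau - \sigma\tau) + \sigma\tau = \sum_s \gamma_s, \]
where each coefficient is either 0 or 1.
DEFINITION 6.2.3. The geometric measure $g(\alpha)$ of an interval $\alpha$ is its length, and is zero if $\alpha$ is a single point, or the empty set 0.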
The geometric measure of an elementary set should clearly be defined in terms of the geometric measures of its component intervals, but we must take note of the fact that an elementary set can be resolved into the union of disjoint intervals in an infinite number of ways. For example, the interval $(0 \le x \le 1)$ can be expressed as the union of the intervals $(0 \le x < s)$, $(x = s)$, and $(s < x \le 1)$. We therefore need the following theorems.
THEOREM 6.2.3. If the interval $\alpha$ is the union of a finite number of disjoint intervals $\{\alpha_s\}$ then
$$g(\alpha) = \sum_s g(\alpha_s).$$
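For we can enumerate the intervals so that $\alpha_{s+1}$ lies to the right of $\alpha_s$ and adjoins $\alpha_s$.
THEOREM 6.2.4. If the elementary set $\sigma$ has two different representations,
$$\sigma = \sum_s \alpha_s \quad\text{and}\quad \sigma = \sum_t \beta_t,$$
as the union of a finite number of disjoint intervals, then
$$\sum_s g(\alpha_s) = \sum_t g(\beta_t).$$
For
$$\sigma = \sigma^2 = \sum_{s,t} \alpha_s \beta_t$$
and
$$\alpha_s = \alpha_s \sigma = \sum_t \alpha_s \beta_t.$$
Hence, by Theorem 6.2.3,
$$g(\alpha_s) = \sum_t g(\alpha_s \beta_t),$$
whence
$$\sum_s g(\alpha_s) = \sum_{s,t} g(\alpha_s \beta_t).$$
Similarly
$$g(\beta_t) = \sum_s g(\alpha_s \beta_t),$$
whence
$$\sum_t g(\beta_t) = \sum_{s,t} g(\alpha_s \beta_t) = \sum_s g(\alpha_s).$$
We can now frame
DEFINITION 6.2.4. The geometric measure $g(\alpha)$ of an elementary set $\alpha$ is
$$g(\alpha) = g(\alpha_1) + g(\alpha_2) + \dots,$$
where $\alpha_1, \alpha_2, \dots$ are the components of $\alpha$ in any finite representation.
The main instrument in proving theorems about geometric measure is the 'union-intersection' theorem, the simplest form of which is the following.
THEOREM 6.2.5. If $\alpha$ and $\beta$ are each elementary sets then
$$g(\alpha \cup \beta) + g(\alpha\beta) = g(\alpha) + g(\beta).$$
By Theorem 6.2.2, $\alpha$ and $\beta$ can be represented in the form
$$\alpha = \sum_s a_s \gamma_s, \qquad \beta = \sum_s b_s \gamma_s,$$
where $\{\gamma_s\}$ is a finite collection of disjoint intervals, and each coefficient, $a_s$ or $b_s$, is either 0 or 1. Hence
$$\alpha \cup \beta = \sum_s (a_s + b_s - a_s b_s)\gamma_s, \qquad \alpha\beta = \sum_s a_s b_s \gamma_s,$$
so that
$$g(\alpha \cup \beta) + g(\alpha\beta) = \sum_s (a_s + b_s)\, g(\gamma_s).$$
Whence the theorem follows at once.
COROLLARIES.
(i) If $\alpha$ and $\beta$ are disjoint elementary sets, then
$$g(\alpha \cup \beta) = g(\alpha) + g(\beta).$$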
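(ii) By induction it follows that if $\alpha_1, \alpha_2, \dots, \alpha_n$ is a finite collection of disjoint elementary sets, then
$$g\Big(\bigcup_{s=1}^{n} \alpha_s\Big) = g(\alpha_1) + g(\alpha_2) + \dots + g(\alpha_n).$$
(iii) If $\sigma$ and $\tau$ are elementary sets and $\sigma$ is covered by $\tau$, then
$$g(\sigma) \le g(\tau).$$
For, on writing $\alpha = \sigma$, $\beta = \tau - \sigma$ in the union-intersection theorem, we find that
$$\alpha \cup \beta = \tau, \qquad \alpha\beta = 0,$$
and
$$g(\tau) = g(\sigma) + g(\tau - \sigma) \ge g(\sigma).$$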
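The union-intersection theorem for elementary sets can be checked numerically. The following is only an illustrative sketch, not part of the original text: it assumes that every interval is half-open, $[a, b)$ (a simplification of the general intervals used above, which does not change any length), and the helper names `normalize`, `union`, `intersection`, and `g` are hypothetical.

```python
from fractions import Fraction

def normalize(intervals):
    """Merge half-open intervals [a, b) into a sorted list of disjoint ones."""
    merged = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if merged and a <= merged[-1][1]:
            # Overlapping or adjoining: extend the previous component.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def union(s, t):
    """Union of two elementary sets given as lists of intervals."""
    return normalize(s + t)

def intersection(s, t):
    """Intersection: pairwise overlaps of the component intervals."""
    pieces = [(max(a, c), min(b, d)) for a, b in s for c, d in t]
    return normalize(pieces)

def g(s):
    """Geometric measure: total length of the disjoint components."""
    return sum(b - a for a, b in normalize(s))

# Assumed sample data: alpha = [0,2) + [3,5), beta = [1,4).
alpha = [(Fraction(0), Fraction(2)), (Fraction(3), Fraction(5))]
beta = [(Fraction(1), Fraction(4))]

lhs = g(union(alpha, beta)) + g(intersection(alpha, beta))
rhs = g(alpha) + g(beta)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared for equality rather than to within rounding error.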
6.3. Bounded outer sets
In the terminology that we have just adopted the theory of measure before Lebesgue may be briefly described as follows. The 'content' of a set of points $\alpha$ was defined as the greatest lower bound of the geometric measures $g(\sigma)$ of all elementary sets $\sigma$ covering $\alpha$. But this is clearly a rather crude measure for, according to this definition, both the rational points and the irrational points in the interval (0, 1) have the same content, unity, although the irrational points are vastly more numerous. In the Lebesgue theory the concept of 'content' is replaced by the concept of 'measure', and this is achieved by covering the set of points $\alpha$, not by a finite collection of disjoint intervals, but by an enumerable collection of disjoint intervals. This provides a much finer measure of a set of points, which we now proceed to explore.
DEFINITION 6.3.1. An outer set is the union of an enumerable collection of disjoint intervals, called a representation of the outer set.
THEOREM 6.3.1. The union of an enumerable collection of disjoint outer sets is an outer set.
If the outer set $\alpha_n$ is the union of the enumerable disjoint intervals $\{a_{np}\}$ $(p = 1, 2, \dots)$ then $\alpha$, the union of the outer sets $\{\alpha_n\}$, is the union of the intervals $\{a_{np}\}$ $(n, p = 1, 2, \dots)$. Now
$$a_{mp} = \alpha_m a_{mp} \quad\text{and}\quad a_{nq} = \alpha_n a_{nq} \quad\text{(by Theorem 4.1.2)},$$
whence
$$a_{mp} a_{nq} = (\alpha_m a_{mp})(\alpha_n a_{nq}) = (\alpha_m \alpha_n)(a_{mp} a_{nq}) = 0 \quad\text{if } m \ne n.$$
Hence the collection of intervals $\{a_{np}\}$ is disjoint, whence $\alpha$ is an outer set.
THEOREM 6.3.2. The union of an enumerable collection of (not necessarily disjoint) intervals is an outer set.
If the set $\alpha$ is the union of the bounded, enumerable, non-disjoint intervals $\{\alpha_n\}$ $(n = 1, 2, \dots)$ then by the covering Theorem 4.4.1, $\alpha$ is also the union of the enumerable, disjoint sets $\{\beta_n\}$ where $\beta_1 = \alpha_1$ and
$$\beta_n = \alpha_n (1 - \alpha_1)(1 - \alpha_2) \dots (1 - \alpha_{n-1}) \quad (n > 1).$$
But, by Theorem 6.2.1, $(1 - \alpha_1), (1 - \alpha_2), \dots, (1 - \alpha_{n-1})$ are elementary sets, whence $\beta_n$ is an elementary set, i.e. the union of a finite number of disjoint intervals. Hence $\alpha$ is the union of an enumerable collection of disjoint intervals, and is therefore an outer set.
THEOREM 6.3.3. The intersection of any two outer sets is an outer set.
If the outer sets $\alpha$ and $\beta$ are respectively the unions of the enumerable collections of disjoint intervals $\{\alpha_m\}$ and $\{\beta_n\}$, then the intersection $\alpha\beta$ is the union of the enumerable intervals $\{\alpha_m \beta_n\}$. But
$$(\alpha_m \beta_n)(\alpha_p \beta_q) = (\alpha_m \alpha_p)(\beta_n \beta_q) = 0 \quad\text{unless } m = p \text{ and } n = q.$$
Hence the intervals $\alpha_m \beta_n$ are disjoint, and therefore the intersection $\alpha\beta$ is an outer set.
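The disjointification step used in the proof of Theorem 6.3.2, in which each interval has all earlier intervals removed from it, can be sketched for a finite list as follows. This is an illustration under stated assumptions only: intervals are taken to be half-open $[a, b)$, and the function names `subtract` and `disjointify` are hypothetical.

```python
from fractions import Fraction

def subtract(piece, cut):
    """Remove the half-open interval `cut` from `piece`; return 0, 1 or 2 intervals."""
    (a, b), (c, d) = piece, cut
    if d <= a or b <= c:          # no overlap: piece is unchanged
        return [piece]
    parts = []
    if a < c:                     # part of piece to the left of cut
        parts.append((a, c))
    if d < b:                     # part of piece to the right of cut
        parts.append((d, b))
    return parts

def disjointify(intervals):
    """Return beta_1, beta_2, ...: beta_n is alpha_n minus all earlier alpha_k."""
    betas = []
    for n, alpha_n in enumerate(intervals):
        pieces = [alpha_n]
        for earlier in intervals[:n]:
            pieces = [p for piece in pieces for p in subtract(piece, earlier)]
        betas.append(pieces)
    return betas

# Assumed sample data: three overlapping intervals whose union is [0, 6).
alphas = [(Fraction(0), Fraction(3)),
          (Fraction(2), Fraction(5)),
          (Fraction(1), Fraction(6))]
betas = disjointify(alphas)
total = sum(b - a for beta in betas for a, b in beta)
print(betas, total)
```

Each `betas[n]` is a finite list of disjoint intervals (an elementary set), and the total length of all the pieces equals the length of the union of the original intervals.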
THEOREM 6.3.4. The union of any two outer sets is an outer set.
In the notation of Theorem 6.3.3, the union $\alpha \cup \beta$ of the outer sets $\alpha$ and $\beta$ is the union of the enumerable collection of intervals $\{\alpha_m\}$ and $\{\beta_n\}$. Hence, by Theorem 6.3.2, $\alpha \cup \beta$ is an outer set.
We must next establish the existence of geometric measures for outer sets.
THEOREM 6.3.5. If $\sigma$ is a bounded outer set with enumerable disjoint components $\{\sigma_n\}$ then the series
$$\sum_{n=1}^{\infty} g(\sigma_n)$$
is convergent.
If $\sigma$ is covered by an interval $\omega$, then the union $\tau_n$ of the finite collection of disjoint intervals $\sigma_1, \sigma_2, \dots, \sigma_n$ is an elementary set also covered by $\omega$. Hence, by Corollary (iii) to Theorem 6.2.5,
$$\sum_{s=1}^{n} g(\sigma_s) = g(\tau_n) \le g(\omega).$$
Hence the partial sums of the series $\sum_{n=1}^{\infty} g(\sigma_n)$ are bounded. Since each term is non-negative, the series is therefore convergent.
THEOREM 6.3.6. If $\sigma$ and $\tau$ are bounded outer sets with the representations
$$\sigma = \sum_{s=1}^{\infty} \alpha_s \quad\text{and}\quad \tau = \sum_{t=1}^{\infty} \beta_t$$
and if $\sigma$ is covered by $\tau$, then
$$\sum_{s=1}^{\infty} g(\alpha_s) \le \sum_{t=1}^{\infty} g(\beta_t).$$
… $< t$, then … and, by Theorem 6.3.7,
$$g(a_s) \le g(a_t).$$
Hence the function $g(a_s)$ converges to a unique limit as $s \to \infty$, which may be finite or infinite.
DEFINITION 6.4.1. The geometric measure $g(a)$ of the unbounded outer set $a$ is the limit
$$g(a) = \lim_{s \to \infty} g(a_s).$$
6.5. The principle of complementarity
So far we have defined and studied only outer sets of points, the indicators of which are to be the upper bracketing functions in our definition of the Lebesgue measure. We must now examine inner sets, the indicators of which are to be the lower bracketing functions. In the original Lebesgue theory the outer sets were open sets and the inner sets were closed sets. We need to modify this theory in view of the generalization of outer sets that we adopted in § 6.3. The motivation of the Lebesgue theory was the 'principle of complementarity', which we can describe as follows. If $\sigma$ and $\tau$ are complementary with respect to an interval $\omega$ then
$$\sigma + \tau = \omega \quad\text{and}\quad \sigma\tau = 0.$$
Now let $\mu$ be an outer set covering $\sigma$ and $\nu$ be an outer set covering $\tau$. Then
$$\tau \le \nu \quad\text{and}\quad \omega - \nu \le \omega - \tau = \sigma.$$
Hence $\sigma$ is bracketed by the sets
$$\lambda = \omega - \nu \quad\text{and}\quad \mu.$$
The lower bracketing set $\lambda$ is the complement of the upper bracketing set $\nu$ with respect to the interval $\omega$. We shall take the upper bracketing sets to be 'outer sets' covering the prescribed set $\sigma$; and we shall take the lower bracketing sets to be 'inner sets' which are the complements of outer sets with respect to an interval. But it is necessary to give a self-consistent definition of inner sets $\lambda$ and of their geometric measure $g(\lambda)$ and to prove that if a given set $\sigma$ is bracketed by an outer set $\mu$ and an inner set $\lambda$ then $g(\lambda) \le g(\mu)$. We therefore proceed to establish the necessary definitions and theorems.
6.6. Inner sets
THEOREM 6.6.1. If $\lambda$ is the complement of an outer set $\mu_1$ with respect to an interval $\omega_1$, then it is also the complement of some outer set $\mu_2$ with respect to any other interval $\omega_2$ which covers $\lambda$.
(1) We first prove that $\lambda$ is the complement of an outer set $\mu$ with respect to the interval $\omega = \omega_1 \omega_2$.
Since $\lambda$ is covered by $\omega_1$ and by $\omega_2$, it is also covered by $\omega$. Let $\omega = \lambda + \mu$. Then
$$\mu = \omega - \lambda = \omega - (\omega_1 - \mu_1) = \mu_1 - (\omega_1 - \omega).$$
Now $\omega = \omega_1 \omega_2 \le \omega_1$. Thus $\omega_1 - \omega$ is a set, which clearly consists of one or two intervals. Also $\omega_1 - \omega \le \mu_1$, whence
$$(\omega_1 - \omega)\mu_1 = (\omega_1 - \omega),$$
and
$$\mu = \mu_1 (1 - \omega_1 + \omega).$$
But $1 - \omega_1 + \omega$ is a set consisting of two or three intervals, and $\mu_1$ is the union of an enumerable collection of disjoint intervals. Hence so also is $\mu$, i.e. $\mu$ is an outer set.
(2) In the general case, let $\omega_2 = \lambda + \mu_2$, whence
$$\mu_2 = \omega_2 - \omega + \mu.$$
But $\mu$ is covered by $\omega$ and $\omega$ by $\omega_2$, so that
$$\mu = \mu\omega = \mu\omega_2, \qquad (\omega_2 - \omega)\mu = 0.$$
Thus $\mu_2$ is the union of two disjoint sets, viz.:
(i) $\omega_2 - \omega$, which consists of one or two intervals, and
(ii) $\mu$, which we have proved to be an outer set.
Therefore $\mu_2$ is also an outer set.
DEFINITION 6.6.1. Any set $\lambda$ that is the complement of an outer set $\mu$ with respect to some interval $\omega$ is an 'inner set'.
DEFINITION 6.6.2. The geometric measure of an inner set $\lambda$ with respect to a finite interval $\omega$ covering $\lambda$ is defined to be
$$g_\omega(\lambda) = g(\omega) - g(\omega - \lambda).$$
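A small numerical check of this definition may be helpful. The sketch below is illustrative only and rests on assumptions not made in the text: the inner set is taken to be a finite union of disjoint half-open intervals (the simplest possible case), and the names `length`, `complement_in`, and `g_inner` are hypothetical. It evaluates $g(\omega) - g(\omega - \lambda)$ for two different covering intervals.

```python
from fractions import Fraction

def length(intervals):
    """Total length of a list of disjoint intervals."""
    return sum(b - a for a, b in intervals)

def complement_in(omega, pieces):
    """omega - lambda, for sorted disjoint half-open pieces of lambda inside omega."""
    left, right = omega
    gaps = []
    for a, b in pieces:
        if left < a:
            gaps.append((left, a))   # gap before this piece
        left = b
    if left < right:
        gaps.append((left, right))   # gap after the last piece
    return gaps

def g_inner(omega, pieces):
    """Geometric measure of lambda with respect to the covering interval omega."""
    return (omega[1] - omega[0]) - length(complement_in(omega, pieces))

# Assumed sample data: lambda = [1,2) + [3,4), covered by two different intervals.
lam = [(Fraction(1), Fraction(2)), (Fraction(3), Fraction(4))]
g1 = g_inner((Fraction(0), Fraction(5)), lam)
g2 = g_inner((Fraction(-2), Fraction(10)), lam)
print(g1, g2)
```

Both covering intervals give the same value, which is what one would hope for if the definition is to be independent of the choice of $\omega$.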
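THEOREM 6.6.2. If $g_{\omega_1}(\lambda)$ and $g_{\omega_2}(\lambda)$ are the geometric measures of an inner set $\lambda$ with respect to two intervals $\omega_1$ and $\omega_2$ each covering $\lambda$, then
$$g_{\omega_1}(\lambda) = g_{\omega_2}(\lambda).$$
We apply the union-intersection theorem (6.3.9) to the sets
$$\alpha = \omega_1 \quad\text{and}\quad \beta = \omega_2 - \lambda.$$
Their union is
$$\alpha \cup \beta = \omega_1 + (\omega_2 - \lambda) - (\omega_1 \omega_2 - \omega_1 \lambda).$$
But $\lambda$ is covered by $\omega_1$, whence $\omega_1 \lambda = \lambda$, and
$$\alpha \cup \beta = \omega_1 + \omega_2 - \omega_1 \omega_2 = \omega_1 \cup \omega_2.$$
The intersection of $\alpha$ and $\beta$ is
$$\alpha\beta = \omega_1 (\omega_2 - \lambda) = \omega_1 \omega_2 - \lambda.$$
Hence
$$g(\omega_1) + g(\omega_2 - \lambda) = g(\alpha) + g(\beta) = g(\alpha \cup \beta) + g(\alpha\beta) = g(\omega_1 \cup \omega_2) + g(\omega_1 \omega_2 - \lambda).$$
Similarly
$$g(\omega_2) + g(\omega_1 - \lambda) = g(\omega_1 \cup \omega_2) + g(\omega_1 \omega_2 - \lambda).$$
Therefore
$$g(\omega_1) + g(\omega_2 - \lambda) = g(\omega_2) + g(\omega_1 - \lambda)$$
and
$$g(\omega_1) - g(\omega_1 - \lambda) = g(\omega_2) - g(\omega_2 - \lambda).$$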
DEFINITION 6.6.3. The geometric measure of a bounded inner set $\lambda$ is $g(\lambda) = g(\omega) - g(\omega - \lambda)$, where $\omega$ is any finite interval covering $\lambda$.
THEOREM 6.6.3. If the bounded set $\sigma$ is bracketed by an outer set $\mu$ and an inner set $\lambda$ then $g(\lambda) \le g(\mu)$.
For
$$\lambda \le \sigma \le \mu$$
and there is a finite interval $\omega$ covering $\lambda$ and $\mu$ such that
$$\nu = \omega - \lambda$$
is an outer set (by Definition 6.6.1). Hence
$$\mu \cup \nu = \mu + \nu - \mu\nu = \mu + \omega - \lambda - \mu\omega + \mu\lambda = \omega,$$
since
$$\mu = \mu\omega \quad\text{and}\quad \lambda = \mu\lambda.$$
Therefore by the union-intersection theorem (6.3.9)
$$g(\omega) = g(\mu \cup \nu) = g(\mu) + g(\nu) - g(\mu\nu) \le g(\mu) + g(\nu).$$
But
$$g(\lambda) = g(\omega) - g(\nu),$$
whence
$$g(\lambda) \le g(\mu).$$
It is almost trivial, but nevertheless necessary, to establish the following theorem.
THEOREM 6.6.4. If $\sigma$ is any bounded set, then there always exist outer sets $\mu$ and inner sets $\lambda$ such that $\lambda \le \sigma \le \mu$.
Since $\sigma$ is bounded there is an interval $\mu$ covering $\sigma$, and any point of $\sigma$ is an interval $\lambda$ covered by $\sigma$. Then $\lambda$ and $\mu$ are inner and outer sets bracketing $\sigma$.
6.7. Exercises
1. Show that a single point $x = a$ is both an outer set and an inner set. Show that the empty set is both an outer set and an inner set.
2. A function $f(x)$ is semi-continuous at $x = \xi$
(i) on the right, if $f(\xi + h) \to f(\xi)$ as $h \to 0$ for $h > 0$, or
(ii) on the left, if $f(\xi - h) \to f(\xi)$ as $h \to 0$ for $h > 0$.
Show that $\alpha(x)$ is the indicator of an outer set if $\alpha(x)$ is semi-continuous (on either right or left or both) at each point $\xi$ where $\alpha(\xi) = 1$. Show that $\alpha(x)$ is the indicator of an inner set if $\alpha(x)$ is semi-continuous at each point $\xi$ where $\alpha(\xi) = 0$.
3. Show that the intersection of a finite or an enumerable collection of inner sets is also an inner set (which may be empty) (see Theorem 6.3.2).
4. Show that the intersection of a finite number of outer sets is also an outer set.
7
Lebesgue measure
7.1. Introduction
In the preceding chapter we have developed the theory of the geometric measure of outer and inner sets and have shown that, if $\sigma$ is any bounded set of points, then there exist outer and inner sets $\mu$ and $\lambda$ such that
$$\lambda \le \sigma \le \mu \quad\text{and}\quad g(\lambda) \le g(\mu).$$
Our guiding principles have been those originally adopted by Lebesgue, viz.:
(i) the use of enumerable collections of disjoint intervals $\{\mu_n\}$ for the upper bracketing function $\mu$, and
(ii) the use of the principle of complementarity to define the lower bracketing function $\lambda$.
Lebesgue himself never gave a complete and formal account of his theory of measure and integration and the first systematic treatment was given by de la Vallée-Poussin. In his account, which we follow here, the central position is occupied by the 'union-intersection theorem'. In this chapter we develop the theory of measure as the theory of the integration of the indicators $\alpha(x)$ of bounded linear sets of points.
7.2. Outer and inner measure
DEFINITION 7.2.1. The outer measure $m^*(\sigma)$ of a set of points $\sigma$ is the greatest lower bound of the geometric measures $g(\mu)$ of outer sets $\mu$ covering $\sigma$, i.e.
$$m^*(\sigma) = \inf g(\mu) \quad\text{for } \mu \ge \sigma.$$
DEFINITION 7.2.2. The inner measure $m_*(\sigma)$ of a set of points $\sigma$ is the least upper bound of the geometric measures $g(\lambda)$ of the inner sets $\lambda$ covered by $\sigma$, i.e.
$$m_*(\sigma) = \sup g(\lambda) \quad\text{for } \lambda \le \sigma.$$
THEOREM 7.2.1. The inner measure of any set $\sigma$ is not greater than its outer measure, i.e.
$$m_*(\sigma) \le m^*(\sigma).$$
For, with the notation of the preceding definitions,
$$g(\lambda) \le g(\mu) \quad\text{(by Theorem 6.6.3)},$$
whence
$$m_*(\sigma) = \sup g(\lambda) \le \inf g(\mu) = m^*(\sigma).$$
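The union-intersection theorem (6.3.9) for the geometric measure of outer sets can now be generalized to the outer and inner measures of any pair of sets.
THEOREM 7.2.2. If $\sigma$ and $\tau$ are any pair of sets, then
$$m^*(\sigma) + m^*(\tau) \ge m^*(\sigma \cup \tau) + m^*(\sigma\tau).$$
From Definition 7.2.1, if $\varepsilon$ is any prescribed positive tolerance, there exist outer sets $\alpha$ and $\beta$ such that
$$\sigma \le \alpha, \qquad \tau \le \beta,$$
and
$$m^*(\sigma) \ge g(\alpha) - \varepsilon, \qquad m^*(\tau) \ge g(\beta) - \varepsilon.$$
Now
$$\sigma \cup \tau = 1 - (1 - \sigma)(1 - \tau) \le 1 - (1 - \alpha)(1 - \beta) = \alpha \cup \beta,$$
and
$$\sigma\tau \le \alpha\beta.$$
But $\alpha \cup \beta$ and $\alpha\beta$ are outer sets. Therefore
$$m^*(\sigma \cup \tau) \le g(\alpha \cup \beta) \quad\text{and}\quad m^*(\sigma\tau) \le g(\alpha\beta).$$
By the union-intersection theorem (6.3.9)
$$g(\alpha \cup \beta) + g(\alpha\beta) = g(\alpha) + g(\beta),$$
whence
$$m^*(\sigma \cup \tau) + m^*(\sigma\tau) \le m^*(\sigma) + m^*(\tau) + 2\varepsilon.$$
Since this is true for all $\varepsilon > 0$ it follows that
$$m^*(\sigma \cup \tau) + m^*(\sigma\tau) \le m^*(\sigma) + m^*(\tau).$$
THEOREM 7.2.3. If $\sigma$ and $\tau$ are any pair of sets, then
$$m_*(\sigma \cup \tau) + m_*(\sigma\tau) \ge m_*(\sigma) + m_*(\tau).$$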
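Let $\omega$ be any finite interval covering $\sigma$ and $\tau$, and
$$\sigma' = \omega - \sigma, \qquad \tau' = \omega - \tau.$$
Then
$$\sigma' \cup \tau' = \omega - \sigma\tau \quad\text{and}\quad \sigma'\tau' = \omega - \sigma \cup \tau.$$
Hence
$$m_*(\sigma \cup \tau) = g(\omega) - m^*(\sigma'\tau'), \qquad m_*(\sigma\tau) = g(\omega) - m^*(\sigma' \cup \tau'),$$
and
$$m_*(\sigma \cup \tau) + m_*(\sigma\tau) \ge 2g(\omega) - m^*(\sigma') - m^*(\tau') \quad\text{(by Theorem 7.2.2)}$$
$$= m_*(\sigma) + m_*(\tau).$$
COROLLARY. If $\sigma_1, \sigma_2, \dots, \sigma_n$ are disjoint sets, then
$$m_*(\sigma_1 + \sigma_2 + \dots + \sigma_n) \ge m_*(\sigma_1) + m_*(\sigma_2) + \dots + m_*(\sigma_n).$$
For, if $\sigma$ and $\tau$ are disjoint, then $m_*(\sigma\tau) = 0$ and
$$m_*(\sigma + \tau) \ge m_*(\sigma) + m_*(\tau),$$
and the corollary follows by induction.
THEOREM 7.2.4. If the set $\sigma$ is covered by the set $\tau$, i.e. if $\sigma \le \tau$, then
$$m^*(\sigma) \le m^*(\tau) \quad\text{and}\quad m_*(\sigma) \le m_*(\tau).$$
If $\varepsilon$ is any prescribed tolerance there exists an outer set $\nu$ such that
$$\tau \le \nu \quad\text{and}\quad g(\nu) - \varepsilon \le m^*(\tau).$$
But $\sigma \le \tau \le \nu$, whence
$$m^*(\sigma) \le g(\nu) \le m^*(\tau) + \varepsilon.$$
Since this is true for each $\varepsilon > 0$ it follows that
$$m^*(\sigma) \le m^*(\tau).$$
Also, if $\omega$ is an interval covering $\tau$,
$$m_*(\sigma) = g(\omega) - m^*(\omega - \sigma), \qquad m_*(\tau) = g(\omega) - m^*(\omega - \tau),$$
and
$$\omega - \tau \le \omega - \sigma,$$
whence
$$m_*(\tau) - m_*(\sigma) = m^*(\omega - \sigma) - m^*(\omega - \tau) \ge 0,$$
by the first part of this theorem.
The following theorem is needed later (§ 7.9, Exercise 4) in constructing a criterion for measurability.
THEOREM 7.2.5. If $\sigma$ and $\tau$ are any two sets and $\delta$ is their symmetric difference,
$$\delta = \sigma \cup \tau - \sigma\tau = \sigma + \tau - 2\sigma\tau,$$
then
$$|m^*(\sigma) - m^*(\tau)| \le m^*(\delta).$$
We note that
$$\tau \cup \delta = \tau + (\sigma + \tau - 2\sigma\tau) - \tau(\sigma + \tau - 2\sigma\tau) = \sigma + \tau(1 - \sigma) \ge \sigma.$$
Hence, by Theorem 7.2.4,
$$m^*(\sigma) \le m^*(\tau \cup \delta),$$
and by the union-intersection Theorem (7.2.2),
$$m^*(\tau \cup \delta) + m^*(\tau\delta) \le m^*(\tau) + m^*(\delta).$$
Therefore
$$m^*(\sigma) - m^*(\tau) \le m^*(\delta) - m^*(\tau\delta) \le m^*(\delta).$$
Similarly
$$m^*(\tau) - m^*(\sigma) \le m^*(\delta).$$
Whence the theorem follows.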
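The inequality of Theorem 7.2.5 can be illustrated on finite unions of intervals. The sketch below is not from the text and makes two assumptions: the sets are lists of disjoint half-open intervals, and their outer measure is taken to be their total length (a fact established only later, in § 7.4). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def length(s):
    """Total length of a list of disjoint half-open intervals."""
    return sum(b - a for a, b in s)

def intersect(s, t):
    """Pairwise overlaps; disjoint whenever s and t are each disjoint."""
    return [(max(a, c), min(b, d))
            for a, b in s for c, d in t
            if max(a, c) < min(b, d)]

def sym_diff_measure(s, t):
    """Measure of delta = sigma + tau - 2 sigma tau."""
    return length(s) + length(t) - 2 * length(intersect(s, t))

# Assumed sample sets.
sets = [
    [(Fraction(0), Fraction(3))],
    [(Fraction(2), Fraction(6))],
    [(Fraction(0), Fraction(1)), (Fraction(4), Fraction(5))],
]

# Check |m(sigma) - m(tau)| <= m(delta) for every ordered pair.
checks = [abs(length(s) - length(t)) <= sym_diff_measure(s, t)
          for s, t in product(sets, repeat=2)]
print(checks)
```

For the first two sample sets the measures are 3 and 4, while the symmetric difference has measure 5, so the inequality holds with room to spare.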
7.3. Lebesgue measure
If $\lambda$ and $\mu$ are inner and outer sets bracketing a given bounded set $\sigma$, then
$$g(\lambda) \le m_*(\sigma) \le m^*(\sigma) \le g(\mu).$$
Thus if $g(\lambda)$ and $g(\mu)$ are to be regarded as lower and upper approximations to the integral of $\sigma(x)$, the tolerance of this bracketing process, $g(\mu) - g(\lambda)$, cannot be less than $m^*(\sigma) - m_*(\sigma)$, and it will not succeed in providing a definition of $\int \sigma(x)\,dx$ unless $m_*(\sigma) = m^*(\sigma)$. Hence we are forced to the following definition.
DEFINITION 7.3.1. A bounded set $\sigma$ is measurable, in the sense of Lebesgue, if $m_*(\sigma) = m^*(\sigma)$, and the common value of its outer and inner measures is the Lebesgue measure $m(\sigma)$ of $\sigma$, i.e.
$$m(\sigma) = m_*(\sigma) = m^*(\sigma).$$
We note at once
THEOREM 7.3.1. If the set $\sigma$ bounded by an interval $\omega$ is measurable then so also is the complementary set $\tau = \omega - \sigma$, and
$$m(\sigma) + m(\tau) = g(\omega).$$
For, by Definition 7.2.2,
$$m_*(\tau) = g(\omega) - m^*(\sigma) \quad\text{and}\quad m_*(\sigma) = g(\omega) - m^*(\tau),$$
whence
$$m^*(\tau) - m_*(\tau) = m^*(\sigma) - m_*(\sigma) = 0,$$
and
$$m(\sigma) + m(\tau) = g(\omega).$$
We can now complete the series of union-intersection theorems as follows.
THEOREM 7.3.2. If $\sigma$ and $\tau$ are bounded measurable sets, then so are their intersection $\sigma\tau$ and their union $\sigma \cup \tau$, and
$$m(\sigma \cup \tau) + m(\sigma\tau) = m(\sigma) + m(\tau).$$
By the union-intersection theorems for outer and inner measure, (7.2.2) and (7.2.3),
$$m^*(\sigma \cup \tau) + m^*(\sigma\tau) \le m^*(\sigma) + m^*(\tau) = m_*(\sigma) + m_*(\tau) \le m_*(\sigma \cup \tau) + m_*(\sigma\tau).$$
Hence
$$\{m^*(\sigma \cup \tau) - m_*(\sigma \cup \tau)\} + \{m^*(\sigma\tau) - m_*(\sigma\tau)\} \le 0.$$
But, by Theorem 7.2.1, the bracketed expressions are each non-negative. They are therefore each equal to zero, i.e.
$$m^*(\sigma \cup \tau) = m_*(\sigma \cup \tau) \quad\text{and}\quad m^*(\sigma\tau) = m_*(\sigma\tau).$$
Thus the union, $\sigma \cup \tau$, and the intersection, $\sigma\tau$, are each measurable. Therefore, using the union-intersection theorems again,
$$m(\sigma \cup \tau) + m(\sigma\tau) \le m(\sigma) + m(\tau) \le m(\sigma \cup \tau) + m(\sigma\tau),$$
whence
$$m(\sigma \cup \tau) + m(\sigma\tau) = m(\sigma) + m(\tau).$$
COROLLARY. (i) If $\sigma$ and $\tau$ are disjoint measurable sets then
$$m(\sigma + \tau) = m(\sigma) + m(\tau).$$
By induction, if $\sigma_1, \sigma_2, \dots, \sigma_n$ is a finite collection of disjoint measurable sets then
$$m(\sigma_1 + \sigma_2 + \dots + \sigma_n) = m(\sigma_1) + m(\sigma_2) + \dots + m(\sigma_n).$$
It is not quite trivial to notice that
(ii) If $\sigma$ and $\tau$ are measurable and $\sigma$ covers $\tau$, then
$$m(\sigma) \ge m(\tau).$$
For $\eta = \sigma(1 - \tau)$ is measurable and $m(\sigma) = m(\tau) + m(\eta) \ge m(\tau)$ (cf. Theorem 7.2.4).
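THEOREM 7.3.3. If $\{\sigma_n\}$ is any enumerable collection of sets of points with bounded union $\sigma$, then
$$m^*(\sigma) \le \sum_{n=1}^{\infty} m^*(\sigma_n).$$
By Definition 7.2.1, for any prescribed tolerance $\varepsilon > 0$, there is an outer set $\alpha_n$ such that
$$\sigma_n \le \alpha_n \quad\text{and}\quad g(\alpha_n) \le m^*(\sigma_n) + \varepsilon/2^n.$$
The outer set $\alpha_n$ is the union of an enumerable collection of disjoint intervals $\{\alpha_{np}\}$ $(p = 1, 2, \dots)$, whence $\alpha$, the union of all the outer sets $\{\alpha_n\}$, is the union of an enumerable collection of intervals $\{\alpha_{np}\}$ $(n, p = 1, 2, \dots)$. These intervals are not necessarily disjoint, but by the corollary to the covering Theorem 4.4.1 there exists an enumerable collection of disjoint elementary sets $\{\beta_{np}\}$, such that
$$\bigcup \beta_{np} = \bigcup \alpha_{np} \quad\text{and}\quad \beta_{np} \le \alpha_{np}.$$
Since $\sigma = \bigcup \sigma_n$ is covered by $\bigcup \alpha_n$, it is also covered by $\bigcup \alpha_{np}$ and by $\bigcup \beta_{np}$ $(n, p = 1, 2, \dots)$. Hence
$$m^*(\sigma) \le g\Big(\bigcup_{n,p} \beta_{np}\Big) = \sum_{n,p=1}^{\infty} g(\beta_{np}) \le \sum_{n,p=1}^{\infty} g(\alpha_{np}) \quad\text{(by Theorem 6.3.7)}$$
$$= \sum_{n=1}^{\infty} g(\alpha_n) \le \sum_{n=1}^{\infty} m^*(\sigma_n) + \sum_{n=1}^{\infty} \varepsilon/2^n = \sum_{n=1}^{\infty} m^*(\sigma_n) + \varepsilon.$$
This is true for all $\varepsilon > 0$, so that the theorem is established.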
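The allocation of tolerances $\varepsilon/2^n$ used in this proof can be illustrated on the simplest enumerable collection, a sequence of single points. The sketch below is an illustration only, under stated assumptions: it enumerates the rationals in (0, 1) by increasing denominator, covers the $n$-th one by an open interval of length $\varepsilon/2^n$, and sums the lengths. The function names and the choice $\varepsilon = 1/100$ are hypothetical.

```python
from fractions import Fraction

def rationals_in_unit_interval():
    """Enumerate the rationals p/q in (0, 1) in lowest terms, by increasing q."""
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:   # skip repeats such as 2/4 = 1/2
                yield r
        q += 1

def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2**n."""
    return [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
            for n, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
gen = rationals_in_unit_interval()
points = [next(gen) for _ in range(20)]
intervals = cover(points, eps)

# Sum of the lengths; this bounds the measure of the union from above,
# since the covering intervals may overlap.
total_length = sum(b - a for a, b in intervals)
print(total_length, total_length < eps)
```

However many points are taken, the sum of the lengths stays below $\varepsilon$, because $\sum \varepsilon/2^n = \varepsilon$.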
THEOREM 7.3.4. If $\{\sigma_n\}$ is any enumerable collection of disjoint sets of points with union $\sigma$, then
$$m_*(\sigma) \ge \sum_{n=1}^{\infty} m_*(\sigma_n).$$
By Theorem 7.2.4, since $\sigma$ covers the union of $\sigma_1, \sigma_2, \dots, \sigma_n$,
$$m_*(\sigma) \ge m_*\Big(\bigcup_{\nu=1}^{n} \sigma_\nu\Big) \ge m_*(\sigma_1) + m_*(\sigma_2) + \dots + m_*(\sigma_n),$$
by the corollary to Theorem 7.2.3. Since this is true for all $n$, it follows that
$$m_*(\sigma) \ge \sum_{n=1}^{\infty} m_*(\sigma_n).$$
Finally, by combining Theorems 7.3.3 and 7.3.4 we obtain
THEOREM 7.3.5. If $\{\sigma_n\}$ is any enumerable collection of disjoint measurable sets, with union $\sigma$, then $\sigma$ is measurable and
$$m(\sigma) = \sum_{n=1}^{\infty} m(\sigma_n).$$
The corollary to Theorem 7.3.2 states that the Lebesgue measure is an additive functional in the sense that, if $\sigma_1, \sigma_2, \dots, \sigma_n$ is a finite collection of disjoint measurable sets, then
$$m(\sigma_1 + \sigma_2 + \dots + \sigma_n) = m(\sigma_1) + m(\sigma_2) + \dots + m(\sigma_n).$$
We have now proved that the Lebesgue measure is completely additive in the sense that, if $\{\sigma_n\}$ is an enumerable collection of disjoint measurable sets then their union is measurable and
$$m(\sigma_1 + \sigma_2 + \dots) = m(\sigma_1) + m(\sigma_2) + \dots.$$
The Lebesgue measure $m(\sigma)$ of a measurable set $\sigma$ is therefore a positive, additive, continuous functional of $\sigma$ and we shall prove (Theorem 7.4.2) that it satisfies the normalizing condition that if $\sigma$ is an interval, then $m(\sigma)$ is the length of the interval. Hence the Lebesgue measure $m(\sigma)$ can rightly be taken to be the integral of the indicator $\sigma$.
7.4. Examples of measurable sets
The simplest sets of points are finite or enumerable collections of points and we can prove at once
THEOREM 7.4.1. Any enumerable collection of points $E$ is measurable and has Lebesgue measure zero.
When $E$ contains only one point $\xi$ it is covered by the interval
$$\xi - \tfrac{1}{2}\varepsilon < x < \xi + \tfrac{1}{2}\varepsilon$$
with geometric measure $\varepsilon$. Hence
$$0 \le m_*(E) \le m^*(E) \le \varepsilon$$
for every $\varepsilon > 0$, and so $m_*(E) = m^*(E) = 0$. Thus $E$ is measurable and $m(E) = 0$. Hence, by the completely additive property of Lebesgue measure (Theorem 7.3.5), an enumerable collection $E$ of points $\{E_n\}$ is measurable and
$$m(E) = \sum_{n=1}^{\infty} m(E_n) = 0.$$
Thus a set of points with 'zero one-dimensional' measure, according to Definition 5.2.1, has Lebesgue measure zero in accordance with Definition 7.3.1. In order of increasing complexity the next set of points to be considered is the interval.
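THEOREM 7.4.2. Any interval $\alpha$ is measurable and its Lebesgue measure is equal to its geometric measure, i.e.
$$m(\alpha) = g(\alpha).$$
Since $\alpha$ is covered by $\alpha$ it follows from Definition 7.2.1 that $m^*(\alpha) \le g(\alpha)$. Also, if $\alpha$ is covered by an interval $\omega$ then
$$m_*(\alpha) = g(\omega) - m^*(\omega - \alpha) \ge g(\omega) - g(\omega - \alpha).$$
Hence
$$m^*(\alpha) - m_*(\alpha) \le g(\alpha) + g(\omega - \alpha) - g(\omega) = 0.$$
Thus $\alpha$ is measurable. Now let $\sigma$ be any outer set covering $\alpha$. Then, by Theorem 6.3.7, $g(\alpha) \le g(\sigma)$, whence
$$m^*(\alpha) = \inf g(\sigma) \ge g(\alpha).$$
But
$$m^*(\alpha) \le g(\alpha).$$
Therefore
$$m(\alpha) = m^*(\alpha) = g(\alpha).$$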
THEOREM 7.4.3. Any outer set is measurable and its Lebesgue measure is equal to its geometric measure.
If $\sigma$ is the union of enumerable disjoint intervals $\{\sigma_n\}$ then by the complete additivity of Lebesgue measure (Theorem 7.3.5) $\sigma$ is measurable and
$$m(\sigma) = \sum_{n=1}^{\infty} m(\sigma_n) = \sum_{n=1}^{\infty} g(\sigma_n) \quad\text{(by Theorem 7.4.2)}$$
$$= g(\sigma) \quad\text{(by Definition 6.3.2)}.$$
THEOREM 7.4.4. Any inner set is measurable and its Lebesgue measure is equal to its geometric measure.
By Theorem 7.3.1, if the inner set $\tau$ is the complement of an outer set $\sigma$ with respect to an interval $\omega$, then
$$m_*(\tau) = g(\omega) - m^*(\sigma), \qquad m^*(\tau) = g(\omega) - m_*(\sigma).$$
But
$$m_*(\sigma) = m^*(\sigma) = m(\sigma)$$
by Theorem 7.4.3; whence
$$m_*(\tau) = m^*(\tau) = g(\omega) - m(\sigma) = g(\omega) - g(\sigma) = g(\tau),$$
by Definition 6.6.2.
Next we shall consider 'null sets', i.e. sets of points of zero measure.
THEOREM 7.4.5. The necessary and sufficient condition that a set $\alpha$ should be measurable and have measure zero is that $m^*(\alpha) = 0$.
The condition is obviously necessary, for if $\alpha$ has measure zero then $m^*(\alpha) = m(\alpha) = 0$. The condition is also sufficient, for the relations
$$0 \le m_*(\alpha) \le m^*(\alpha) = 0$$
show that
$$m_*(\alpha) = m^*(\alpha) = 0.$$
The definition that we gave in Chapter 5 (5.2.1) is therefore completely in accord with the definition based on the theory of Lebesgue measure.
7.5. Unbounded sets
Hitherto we have restricted ourselves to bounded sets of points $\alpha$ and we have therefore been able to assert the existence of upper bracketing functions $\mu$ such that $\mu$ is an enumerable collection of intervals which cover $\alpha$. In the case of unbounded sets, bounded upper bracketing functions may not be available and we therefore rely on the method of monotony to provide a definition of measure.
DEFINITION 7.5.1. If $\sigma_s$ denotes the interval
$$-s < x < s,$$
an unbounded set $\alpha$ is measurable if the bounded sets $\alpha\sigma_s$ are measurable for each value of $s$, and the measure of $\alpha$ is defined to be the limit of the non-decreasing function $m(\alpha\sigma_s)$, i.e.
$$m(\alpha) = \lim_{s \to \infty} m(\alpha\sigma_s).$$
This measure $m(\alpha)$ may possibly be infinite.
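As an illustration of this limiting process, consider an unbounded set whose truncations have measures that increase to a finite limit. The sketch below is not from the text: the set (the union of the intervals $[n, n + 2^{-n})$ for $n = 1, 2, \dots$), the function name `truncated_measure`, and the cut-off `terms` are all assumptions made for the example.

```python
from fractions import Fraction

def truncated_measure(s, terms=60):
    """Measure of the part of the set inside (-s, s).

    The set is the union of [n, n + 2**-n) for n = 1, 2, ...; only the
    first `terms` intervals are examined, which is enough while s <= terms.
    """
    total = Fraction(0)
    for n in range(1, terms + 1):
        a, b = Fraction(n), n + Fraction(1, 2 ** n)
        lo, hi = max(a, -s), min(b, s)   # intersect with (-s, s)
        if lo < hi:
            total += hi - lo
    return total

values = [truncated_measure(Fraction(s)) for s in (1, 2, 5, 10, 40)]
print(values)
```

The values are non-decreasing in $s$ and approach $\sum 2^{-n} = 1$, which would then be the measure assigned to this unbounded set.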
7.6. Non-measurable sets
At this stage the patient reader may well be inclined to inquire if the use of both upper and lower bracketing functions and the introduction of both outer and inner measures is really necessary, since the measure of a bounded measurable set $\alpha$ can be defined simply as
$$m(\alpha) = \inf g(\mu)$$
for all outer sets $\mu$ covering $\alpha$. It might therefore appear that only the outer measure and the upper bracketing function need be defined. The answer is that such a definition would apply only to measurable sets and that measurable sets are defined only by reference to their outer and inner measures. We have not proved that all sets are measurable and hence we have relied on the bracketing process to identify the measurable sets. But the awkward question then arises: are there really any non-measurable sets, i.e. sets such that $m_*(\alpha) \ne m^*(\alpha)$? The reply to this question falls into three parts:
(i) Examples of non-measurable sets have been constructed on the assumption that the axiom of choice of set theory is valid.
(ii) It has been proved by Solovay (1970) that the existence of non-measurable sets cannot be established if the axiom of choice is disallowed.
(iii) But the sets usually encountered in analysis are all measurable.
We shall therefore not pursue this matter further but refer the reader to the discussion in Burkill (1953), McShane (1947), and Williamson (1962).
7.7. Criteria for measurability
The preceding considerations suggest the desirability of constructing some simple and practical criteria of the measurability of sets. To do this we shall compare the set $\sigma$ to be examined for measurability with certain outer sets $\alpha$, which closely approximate to $\sigma$ in the sense that the outer measure $m^*(\sigma \,\Delta\, \alpha)$ of the symmetric difference of $\sigma$ and $\alpha$ can be made arbitrarily small. We shall thus obtain some insight into the 'structure' of a measurable set $\sigma$ and we shall prove that for any tolerance $\varepsilon > 0$ there exists an outer set $\alpha$ such that $m^*(\sigma \,\Delta\, \alpha) < \varepsilon$. The basic theorem follows at once from the definition of outer and inner measures:
THEOREM 7.7.1. If $\alpha$ and $\beta$ are complementary with respect to an interval $\omega$ then the necessary and sufficient condition that $\alpha$ and $\beta$ should be measurable is that for any $\varepsilon > 0$ there should exist outer sets $\mu$ and $\nu$ such that
$$\alpha \le \mu, \qquad \beta \le \nu, \quad\text{and}\quad g(\mu) + g(\nu) - g(\omega) < \varepsilon.$$
By Definitions 7.2.1 and 7.2.2,
$$m^*(\alpha) \le g(\mu), \qquad m^*(\beta) \le g(\nu), \qquad m_*(\alpha) = m_*(\omega - \beta) = g(\omega) - m^*(\beta).$$
Hence
$$m^*(\alpha) - m_*(\alpha) \le g(\mu) - g(\omega) + m^*(\beta) \le g(\mu) + g(\nu) - g(\omega),$$
and similarly
$$m^*(\beta) - m_*(\beta) \le g(\mu) + g(\nu) - g(\omega).$$
Thus the conditions of the theorem are sufficient to ensure the measurability of $\alpha$ and $\beta$. Also, if $\alpha$ is measurable, so also is $\beta$ by Theorem 7.3.1. Hence by definition for any tolerance $\varepsilon > 0$ there exist outer sets $\mu$ and $\nu$ such that $\alpha \le \mu$, $\beta \le \nu$ and …
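… such that $\lambda \le \sigma$ and $m^*(\sigma - \lambda) \le \varepsilon$.
Since $\sigma$ covers $\lambda$, $\sigma\lambda = \lambda$. The union of $\eta = \sigma - \lambda$ and of $\lambda$ is
$$\eta \cup \lambda = (\sigma - \lambda) + \lambda - (\sigma\lambda - \lambda) = \sigma.$$
The intersection of $\eta$ and $\lambda$ is
$$\eta\lambda = \sigma\lambda - \lambda = 0.$$
Hence by the union-intersection theorems
$$m^*(\eta) + m^*(\lambda) \ge m^*(\sigma) \quad\text{and}\quad m_*(\eta) + m_*(\lambda) \le m_*(\sigma),$$
whence
$$m^*(\sigma) - m_*(\sigma) \le \{m^*(\eta) - m_*(\eta)\} + \{m^*(\lambda) - m_*(\lambda)\} \le m^*(\eta).$$
The condition is therefore sufficient to ensure that
$$m^*(\sigma) = m_*(\sigma).$$
The condition is also necessary. For $\lambda$ is measurable, and, if $\sigma$ is measurable, so also is $\eta$ (by the corollary to Theorem 7.3.2) and $m(\sigma) = m(\lambda) + m(\eta)$. But, by Definition 7.2.2, for every given tolerance $\varepsilon > 0$, there exists an inner set $\lambda$ such that
$$\lambda \le \sigma, \qquad m(\sigma) \le m(\lambda) + \varepsilon,$$
whence
$$m(\eta) \le \varepsilon.$$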
7.8. Monotone sequences of sets
By expressing Theorem 7.3.5 in terms of monotone sequences of measurable sets, instead of a series of disjoint sets, we can generalize it to apply to any convergent sequence of measurable sets.
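THEOREM 7.8.1. If $\{\sigma_n\}$ is a bounded non-decreasing sequence of measurable sets, then $\sigma = \lim_{n \to \infty} \sigma_n$ is also measurable and
$$m(\sigma) = \lim_{n \to \infty} m(\sigma_n).$$
Since $\{\sigma_n\}$ is bounded, there is an interval $\omega$ covering all the sets $\sigma_n$. Then $\omega - \sigma_n$ is measurable by Theorem 7.3.1, and
$$\tau_n = \sigma_n (\omega - \sigma_{n-1}) = \sigma_n - \sigma_{n-1} \quad (n = 2, 3, \dots)$$
is measurable by Theorem 7.3.2. Now, if $m < n$, $\tau_m \le \sigma_m \le \sigma_{n-1}$, i.e. $\tau_m (\omega - \sigma_{n-1}) = 0$, whence
$$\tau_m \tau_n = 0.$$
Thus the sets $\{\tau_n\}$ are disjoint. But
$$\sigma = \sigma_1 + \sum_{n=2}^{\infty} \tau_n$$
and
$$\sigma_1 \tau_n = 0 \quad\text{for } n = 2, 3, \dots.$$
Hence, by Theorem 7.3.5,
$$m(\sigma) = m(\sigma_1) + \sum_{n=2}^{\infty} m(\tau_n) = m(\sigma_1) + \lim_{N \to \infty} \sum_{n=2}^{N} \{m(\sigma_n) - m(\sigma_{n-1})\},$$
by Theorem 7.3.2. Therefore
$$m(\sigma) = \lim_{n \to \infty} m(\sigma_n).$$
COROLLARY. If $\{\sigma_n\}$ is a bounded non-increasing sequence of measurable sets then $\sigma = \lim_{n \to \infty} \sigma_n$ is also measurable and
$$m(\sigma) = \lim_{n \to \infty} m(\sigma_n).$$
For we have only to apply Theorem 7.8.1 to the non-decreasing sequence $\{\omega - \sigma_n\}$.
We can now use the peak and chasm functions of Definition 3.4.2 to discuss the measurability of any bounded convergent sequence of measurable sets.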
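THEOREM 7.8.2. If $\{\sigma_n\}$ is a bounded convergent sequence of measurable sets with limit $\sigma$ then $\sigma$ is also measurable and
$$m(\sigma) = \lim_{n \to \infty} m(\sigma_n).$$
The associated peak and chasm sequences of Definition 3.4.2 are $\{\pi_n\}$ and $\{\chi_n\}$ where
$$\pi_n = \bigcup_{k=n}^{\infty} \sigma_k \quad\text{and}\quad \chi_n = \sigma_n \sigma_{n+1} \dots.$$
Now
$$\pi_n \ge \pi_{n+1},$$
whence the sequence $\{\pi_n\}$ converges to a limit $\pi$ and, by the corollary to Theorem 7.8.1,
$$m(\pi) = \lim m(\pi_n) \ge \limsup m(\sigma_n).$$
Similarly if each $\sigma_n$ is bounded by an interval $\omega$ we have the complementary relations
$$\omega - \chi = \omega - \lim \chi_n = \lim (\omega - \chi_n)$$
and
$$m(\omega - \chi) = \lim m(\omega - \chi_n) \ge \limsup m(\omega - \sigma_n),$$
i.e.
$$m(\chi) \le \liminf m(\sigma_n).$$
But
$$\chi = \pi = \sigma,$$
hence
$$m(\sigma) \le \liminf m(\sigma_n) \le \limsup m(\sigma_n) \le m(\sigma),$$
and
$$m(\sigma) = \lim m(\sigma_n).$$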
7.9. Exercises
1. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\varepsilon > 0$, there exist open sets $\beta$ and $\alpha$ such that
$$\sigma \le \beta, \qquad \beta - \sigma \le \alpha, \quad\text{and}\quad m(\alpha) \le \varepsilon.$$
2. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\varepsilon > 0$, there exists an open set $\beta$ such that
$$m^*(\beta - \sigma) \dots$$
3. … $\varepsilon > 0$, there exist an elementary set $\alpha$ and two other sets $\eta_1$, $\eta_2$ such that
$$\sigma = \alpha + \eta_1 - \eta_2$$
and
$$m^*(\eta_1) < \varepsilon, \qquad m^*(\eta_2) < \varepsilon \quad\text{(Lebesgue)}.$$
4. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\varepsilon > 0$, there exists an elementary set $\alpha$ such that $m^*(\delta) \le \varepsilon$, where $\delta = \alpha + \sigma - 2\alpha\sigma$ is the symmetric difference of $\alpha$ and $\sigma$ (Kolmogorov and Fomin).
5. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any set $\tau$,
$$m^*(\tau) = m^*(\tau\sigma) + m^*(\tau\sigma'),$$
where $\sigma' = 1 - \sigma$.
8
The Lebesgue integral of bounded, measurable functions
8.1. Introduction
In the preceding chapters (6 and 7) we have discussed measurable sets of points and the integration of indicators. Now, as we have indicated in § 2.4, the strategy of Lebesgue integration is to bracket the integrand $f(x)$ by a pair of functions of the form
$$\lambda(x) = \sum_{p=0}^{n-1} t_p\, \alpha_p(x), \qquad \mu(x) = \sum_{p=0}^{n-1} t_{p+1}\, \alpha_p(x),$$
where $\alpha_p(x)$ is the indicator of the set of points at which
$$t_p < f(x) \le t_{p+1}.$$
The problem of defining the integral of $f(x)$ is thus reduced to the problem of integrating the bracketing functions $\lambda(x)$ and $\mu(x)$, and this depends upon the problem of integrating the indicators $\alpha_p(x)$. The problem of integration is therefore soluble, at least for bounded functions $f(x)$ and a finite interval of integration, when the indicators $\alpha_p(x)$ are integrable, i.e. when the sets of points
$$t_p < f(x) \le t_{p+1}$$
are measurable for all values of $t_0, t_1, \dots, t_n$.
In pre-Lebesguean analysis integrals in one dimension were defined only over finite or infinite intervals, but it is one of the remarkable and important characteristics of the Lebesgue integral that it can be defined just as easily over any measurable set of points $E$. It is a great convenience to adopt this more general definition from the beginning. To illustrate the concept of integration over a measurable set $E$ we may anticipate the results established later in this chapter (§ 8.6) and say that, if $f(x)$ is integrable over the interval $[a, b]$ and if $\chi(x, E)$ is the indicator of a measurable set of points $E$ lying in this interval, then the product $f(x)\chi(x, E)$ is also integrable over $[a, b]$ and the integral of $f(x)$ over the set $E$ is
$$\int_E f(x)\,dx = \int_a^b f(x)\chi(x, E)\,dx.$$
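The bracketing strategy described at the start of this section can be sketched numerically for one simple integrand. The example below is an illustration only, under stated assumptions: it takes $f(x) = x^2$ on $[0, 1]$, uses equally spaced levels $t_p = p/n$, and relies on the fact that for this increasing function the set where $t_p < f(x) \le t_{p+1}$ is an interval of length $\sqrt{t_{p+1}} - \sqrt{t_p}$. The function name `bracketing_sums` is hypothetical.

```python
import math

def bracketing_sums(n):
    """Lower and upper bracketing sums for f(x) = x**2 on [0, 1].

    The range [0, 1] is cut at levels t_p = p / n, and the set where
    t_p < f(x) <= t_{p+1} has measure sqrt(t_{p+1}) - sqrt(t_p).
    """
    levels = [p / n for p in range(n + 1)]
    lower = upper = 0.0
    for t_lo, t_hi in zip(levels, levels[1:]):
        measure = math.sqrt(t_hi) - math.sqrt(t_lo)
        lower += t_lo * measure    # contribution to the lower sum
        upper += t_hi * measure    # contribution to the upper sum
    return lower, upper

lower, upper = bracketing_sums(1000)
print(lower, upper, upper - lower)
```

The two sums enclose the value 1/3, and their difference is the level spacing $1/n$ multiplied by the total measure of the interval, so it can be made as small as desired by refining the partition of the range.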
In uniting these two ideas, of the bracketing process and of integration over a measurable set, we shall therefore begin by studying bounded functions $f(x)$ that are defined on a bounded measurable set $E$, and are such that the subsets of $E$ at which …
… $E(f > t)$, $E(f < t)$, $E(f \ge t)$, $E(f \le t)$, $E(f = t)$ the sets of points in a measurable set $E$ at which $f(x) > t$, $f(x) < t$, $f(x) \ge t$, $f(x) \le t$, $f(x) = t$, respectively.
DEFINITION 8.2.2. A function $f(x)$ is 'measurable in a measurable set $E$' if the set $E(f > t)$ is measurable for each value of $t$.
THEOREM 8.2.1. If the set $E(f > t)$ is measurable for each value of $t$, so also are the sets $E(f < t)$, $E(f \ge t)$, $E(f \le t)$, $E(f = t)$.
The set $E(f \ge t)$ is the limit of the convergent sequence of measurable sets $E(f > t - 1/n)$ for $n = 1, 2, \dots$. Hence, by Theorem 7.8.2, the set $E(f \ge t)$ is measurable. The sets $E(f < t)$ and $E(f \ge t)$ are complementary with respect to the set $E$. Hence, by Theorem 7.3.1, the set $E(f < t)$ is measurable. Similarly, the set $E(f \le t)$ is measurable. Finally, the set $E(f = t)$ is the intersection of the sets $E(f \ge t)$ and $E(f \le t)$, whence, by Theorem 7.3.2, the set $E(f = t)$ is measurable.
THEOREM 8.2.2. If $f(x)$ and $g(x)$ are each measurable in a set $E$, so also are the functions
$$f(x) + c, \qquad cf(x), \qquad |f(x)|, \quad\text{and}\quad f^2(x),$$
where $c$ is any real number.
The relations
$$E(f + c > t) = E(f > t - c)$$
and
$$E(cf > t) = E(f > t/c) \quad (c > 0)$$
are sufficient to show that $f + c$ is measurable in $E$ for all $c$, and that $cf$ is measurable in $E$ if $c$ is positive. Also the set of points $E(-f > t)$ is the same as the set $E(f < -t)$, which is measurable by Theorem 8.2.1. Hence $-f$ is measurable in $E$, and $cf$ is measurable in $E$ whether $c$ is positive or negative.
The same theorem shows that the sets $E(f > \sqrt{t})$ and $E(f < -\sqrt{t})$ are both measurable if $t$ is non-negative, whence their union $E(f^2 > t)$ is measurable (for negative $t$ the set $E(f^2 > t)$ is $E$ itself), i.e. $f^2$ is measurable in $E$. Similarly the set $E(|f| > t)$ is the union of the measurable sets $E(f > t)$ and $E(f < -t)$, whence $|f|$ is measurable in $E$.
In order to establish the measurability of $f + g$ we need the following lemma.
LEMMA. If $f(x)$ and $g(x)$ are bounded, measurable functions in $E$, then the set of points $E(f > g)$ is measurable.
Let {rn} be any enumeration of the rational numbers. Then any point gin Eat whichf(g) > g(g) lies in the intersection In of two sets of the type E(f > r n) and E(g < r n). Hence E(f > g) is the union of the enumerable, measurable sets {In} and is therefore measurable. THEOREM 8.2.3. Iff and g are measurable in E, so also aref+g, af+bg, fg, max(f, g), min(f, g).
For E(f+g > t) = E(f > t−g). By Theorem 8.2.2, −g and t−g are measurable, whence by the lemma, f+g is measurable. Also by Theorem 8.2.2, af and bg are measurable, whence af+bg is measurable. Now
\[ fg = \tfrac14 (f+g)^2 - \tfrac14 (f-g)^2 \]
and by Theorem 8.2.2, (f+g)², (f−g)² are measurable, whence fg is measurable. If φ = max(f, g) and ψ = min(f, g) then
\[ \phi = \tfrac12 |f-g| + \tfrac12 (f+g), \qquad \psi = \tfrac12 (f+g) - \tfrac12 |f-g|. \]
Since f+g, f−g, |f+g|, |f−g| are measurable, so also are φ and ψ.
THEOREM 8.2.4. If {fn(x)} is a sequence of functions measurable in E, then the limits
\[ \limsup_{n\to\infty} f_n(x) \quad\text{and}\quad \liminf_{n\to\infty} f_n(x) \]
are also measurable in E.
Let
\[ M(x) = \sup_n f_n(x), \qquad L(x) = \inf_n f_n(x). \]
The set E(M > t) is the union of the enumerable collection of measurable sets E(fn > t), and the set E(L < t) = E(−L > −t) is the union of the enumerable collection of measurable sets E(−fn > −t). Hence M(x) and L(x) are each measurable in E. Now let
\[ M_n(x) = \sup\{f_n(x), f_{n+1}(x), \dots\} \]
and
\[ L_n(x) = \inf\{f_n(x), f_{n+1}(x), \dots\}. \]
Then
\[ L_{n+1}(x) \ge L_n(x) \quad\text{and}\quad M_n(x) \ge M_{n+1}(x). \]
Hence
\[ \liminf_{n\to\infty} f_n(x) = \lim_{n\to\infty} L_n(x), \qquad \limsup_{n\to\infty} f_n(x) = \lim_{n\to\infty} M_n(x), \]
and Ln(x) and Mn(x) are each measurable in E. Hence, as before, so also are lim Ln(x) and lim Mn(x), that is, liminf fn(x) and limsup fn(x).
COROLLARY. If {fn(x)} is a sequence of functions measurable in E and fn(x) converges pointwise to a limit f(x) in E then f(x) is measurable in E.
For
\[ f(x) = \limsup_{n\to\infty} f_n(x) = \liminf_{n\to\infty} f_n(x). \]
Finally, we prove
THEOREM 8.2.5. If f(x) is continuous in the interval [a, b] then it is also measurable in [a, b].
The set of points in [a, b] at which f(x) > t is open, and is therefore an outer set, which is measurable by Theorem 7.4.3.
All the functions of classical analysis can be constructed from the functions f(x) = 1 and f(x) = x by the algebraic processes of addition and multiplication together with the analytic process of taking the limit of a convergent sequence. Hence all such functions are measurable. The question of the existence of non-measurable functions, like the question of the existence of non-measurable sets of points depends upon the truth or falsehood of the axiom of choice, but we shall not explore this branch of the morbid pathology of functions.
8.3. Measure functions
DEFINITION 8.3.1. If f(x) is measurable in a set E, its 'measure function' mE(t, f) is the measure of the set of points in E at which f(x) > t.
THEOREM 8.3.1. The measure function mE(t, f) is a non-increasing function of t.
For, if s < t, then the set of points E(f > t) is covered by the set E(f > s), whence
\[ m_E(t, f) \le m_E(s, f), \]
by Theorem 7.3.2, Corollary (ii).
COROLLARY. If f(x) is measurable in a set E which lies in a finite interval I and if α ≤ f(x) ≤ β in E, then the measure function mE(t, f) maps the bounded domain α ≤ t ≤ β into a bounded range (since 0 ≤ mE(t, f) ≤ |I|), whence mE(t, f) is integrable by § 2.5.
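As a numerical illustration of Definition 8.3.1 and Theorem 8.3.1, the following sketch estimates the measure function of f(x) = x² on E = [0, 1] by counting grid midpoints at which f(x) > t. The function name `measure_function`, the choice of f, and the grid size are illustrative assumptions, not part of the text; for this particular f the exact value is 1 − √t for 0 ≤ t ≤ 1.

```python
import math

def measure_function(f, a, b, t, n=20_000):
    """Grid estimate of m_E(t, f): the measure of {x in [a, b] : f(x) > t}."""
    h = (b - a) / n
    # Count the midpoints of n equal cells at which f exceeds the level t.
    hits = sum(1 for i in range(n) if f(a + (i + 0.5) * h) > t)
    return hits * h

def square(x):
    return x * x

levels = [0.0, 0.25, 0.5, 0.75, 1.0]
estimates = [measure_function(square, 0.0, 1.0, t) for t in levels]
exact = [1.0 - math.sqrt(t) for t in levels]  # closed form for f(x) = x^2 on [0, 1]

for t, est, ex in zip(levels, estimates, exact):
    print(f"t = {t:4.2f}   estimate = {est:.4f}   1 - sqrt(t) = {ex:.4f}")
```

The printed estimates decrease as t increases, in agreement with Theorem 8.3.1, and they stay between 0 and the length of the interval, as in the Corollary.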
8.4. Simple functions
The functions λ(x) and μ(x) which Lebesgue introduced to bracket a bounded function f(x) belong to the class of 'simple' functions, which may be formally described in the following definitions and theorems.
DEFINITION 8.4.1. A simple function σ(x) is one which is zero outside some finite interval (a, b) and whose range is a finite collection of distinct real numbers s1, s2, ..., sm.
THEOREM 8.4.1. The simple function σ(x) of Definition 8.4.1 can be expressed in the form
\[ \sigma(x) = \sum_{p=1}^{m} s_p \alpha_p, \]
where αp = αp(x) is the indicator of the points at which σ(x) = sp and where
\[ \sum_{p=1}^{m} \alpha_p = 1. \]
For if x is any prescribed point, then σ(x) has one and only one of the values s1, s2, ..., sm, say sk, and
\[ \sum_{p=1}^{m} s_p \alpha_p(x) = s_k = \sigma(x). \]
To define the integral of the simple function σ(x) we could follow the same methods as those we used to define the integral of an indicator function (Chapter 7), but a few moments' reflection should satisfy the reader that the final result would be expressed by
DEFINITION 8.4.2. If the simple function σ(x) is measurable, then its integral is
\[ \int \sigma(x)\,dx = \sum_{p=1}^{m} s_p \int \alpha_p(x)\,dx = \sum_{p=1}^{m} s_p m_p, \]
where mp is the measure of the set of points αp at which σ(x) = sp.
To justify this definition we prove
THEOREM 8.4.2. The integral ∫σ(x) dx of a simple function σ(x) is a positive, linear functional on the space of simple functions.
For, if σ(x) is non-negative, then sp ≥ 0 for p = 1, 2, ..., m, whence
\[ \int \sigma(x)\,dx \ge 0. \]
Now let the simple functions σ(x) and τ(x) have representations
\[ \sigma(x) = \sum_p s_p \alpha_p, \qquad \tau(x) = \sum_q t_q \beta_q, \]
with
\[ \sum_p \alpha_p = 1, \qquad \sum_q \beta_q = 1. \]
Then
\[ \sigma(x)+\tau(x) = \sum_p s_p \alpha_p + \sum_q t_q \beta_q = \sum_{p,q} (s_p+t_q)\,\alpha_p \beta_q. \]
Also
\[ \sum_{p,q} \alpha_p \beta_q = \sum_p \alpha_p \sum_q \beta_q = 1. \]
The set of values (sp+tq) may not all be distinct, but they can be grouped into a finite collection of distinct values u1, u2, ... and then Σ αp βq summed over all p and q for which sp+tq = uk will be the indicator χ(x, uk) of the points at which σ(x)+τ(x) = uk. Hence the sum of two simple functions is a simple function. Now, by Definition 8.4.2,
\[ \int \{\sigma(x)+\tau(x)\}\,dx = \sum_k u_k \int \chi(x, u_k)\,dx = \sum_{p,q} (s_p+t_q) \int \alpha_p \beta_q\,dx. \]
The set of points at which σ(x) = sp is the union of the finite collection of disjoint sets at which σ(x) = sp and τ(x) = tq for q = 1, 2, .... Hence, by Theorem 7.3.2, Corollary (i),
\[ \int \alpha_p\,dx = \sum_q \int \alpha_p \beta_q\,dx. \]
Similarly
\[ \int \beta_q\,dx = \sum_p \int \alpha_p \beta_q\,dx. \]
Therefore
\[ \int \{\sigma(x)+\tau(x)\}\,dx = \sum_{p,q} s_p \int \alpha_p \beta_q\,dx + \sum_{p,q} t_q \int \alpha_p \beta_q\,dx = \sum_p s_p \int \alpha_p\,dx + \sum_q t_q \int \beta_q\,dx = \int \sigma(x)\,dx + \int \tau(x)\,dx. \]
Similarly we can show that, if c and k are any real numbers, cσ(x), kτ(x) and cσ(x)+kτ(x) are simple functions, and that
\[ \int \{c\sigma(x)+k\tau(x)\}\,dx = c \int \sigma(x)\,dx + k \int \tau(x)\,dx. \]
Thus the integral of a simple function is a linear functional.
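The additivity argument above can be checked on a small example. The sketch below handles only step functions, that is, simple functions whose level sets are finite unions of intervals, which is a special case chosen for illustration; the names `integral_simple` and `add_simple` and the sample data are assumptions, not part of the text. The integral is computed as in Definition 8.4.2, as the sum of each distinct value times the measure of the set on which it is taken.

```python
from collections import defaultdict

def integral_simple(pieces):
    """Integral of a step function given as (left, right, value) on disjoint intervals.

    Follows Definition 8.4.2: sum over distinct values s_p of s_p * m_p.
    """
    measure_of_value = defaultdict(float)  # m_p for each distinct value s_p
    for left, right, value in pieces:
        measure_of_value[value] += right - left
    return sum(s * m for s, m in measure_of_value.items())

def add_simple(sigma, tau):
    """Sum of two step functions, built on the common refinement of their breakpoints."""
    points = sorted({p for left, right, _ in sigma + tau for p in (left, right)})

    def value_at(pieces, x):
        # The function is zero outside its listed intervals.
        return sum(v for left, right, v in pieces if left <= x < right)

    result = []
    for left, right in zip(points, points[1:]):
        mid = (left + right) / 2
        result.append((left, right, value_at(sigma, mid) + value_at(tau, mid)))
    return result

sigma = [(0.0, 1.0, 2.0), (1.0, 3.0, -1.0)]  # integral 2*1 + (-1)*2 = 0
tau = [(0.5, 2.0, 4.0)]                      # integral 4*1.5 = 6

lhs = integral_simple(add_simple(sigma, tau))
rhs = integral_simple(sigma) + integral_simple(tau)
print(lhs, rhs)
```

The common refinement plays the part of the products αp βq in the proof: on each refined cell both functions are constant, so the sum takes the single value sp+tq there.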
8.5. Lebesgue bracketing functions
The following definition is effectively the same as that given by Lebesgue.
DEFINITION 8.5.1. If E is a measurable set of points and if f(x) is a bounded function and its range, A < f(x) ≤ B, is divided at the finite number of points t0, t1, ..., tn so that A = t0 ...
... λ(x) is the lower Lebesgue bracketing function for f(x) as given in Definition 8.5.1, then λ(x) is also a lower bracketing function for φ(x). For we have only to choose A and B so that
\[ A < f(x) \le \phi(x) \le B. \]
Hence
\[ \int_E \phi(x)\,dx \ge \int_E \lambda(x)\,dx, \]
and
\[ \int_E \phi(x)\,dx \ge \sup \int_E \lambda(x)\,dx = \int_E f(x)\,dx. \]
The mean value theorem enables us to infer upper and lower bounds for the integral of f(x) from upper and lower bounds for f(x). It is obvious that there cannot be any exact converse of this theorem, but there are two theorems that are converses of Corollary (ii) of Theorem 8.7.2.
THEOREM 8.7.4 (the null integral theorem). (i) If f(x) is bounded, measurable and non-negative in a measurable set E, and
\[ \int_E f(x)\,dx = 0, \]
then f(x) = 0 p.p. in E.
Let En be the set of points in E at which f(x) ≥ 1/n. Then En is measurable and
\[ 0 \le \frac{1}{n}\, m(E_n) \le \int_E f(x)\,dx = 0 \]
(by Theorem 8.7.1) whence m(En) = 0. Now the set of points P in E at which f(x) > 0 is the union of the enumerable sets {En} (n = 1, 2, ...). Hence the measure of the set P is zero, i.e. f(x) = 0 p.p. in E.
(ii) If f(x) is bounded and measurable in an interval I and
\[ \int_J f(x)\,dx = 0 \]
for any interval J covered by I then f(x) = 0 p.p. in I.
Let f(x) > 0 in a set P in I. Since P is measurable it covers an inner set Q, and f(x) > 0 in Q. The complement, I−Q, is an outer set, i.e. an enumerable collection of intervals En. Hence
\[ \int_Q f(x)\,dx = \int_I f(x)\,dx - \sum_{n=1}^{\infty} \int_{E_n} f(x)\,dx = 0. \]
Therefore, by part (i), f(x) = 0 p.p. in Q, which is a contradiction, whence the result follows.
So far we have followed the concise exposition of de la Vallée-Poussin rather closely, but we can remove a certain artificiality from his proof of the following addition theorems by using the properties of simple functions.
THEOREM 8.7.5 (the addition theorem for functions). If the functions f1 and f2 are bounded and measurable in E so also is f1+f2 and
\[ \int_E (f_1+f_2)\,dx = \int_E f_1\,dx + \int_E f_2\,dx. \]
If λ1, μ1 and λ2, μ2 are lower and upper Lebesgue bracketing functions for f1 and f2 respectively, then λ1+λ2 and μ1+μ2 are bracketing functions for f1+f2, although they are not Lebesgue bracketing functions. Nevertheless they are simple functions, whence, by Theorem 8.4.2,
\[ \int_E \lambda_1\,dx + \int_E \lambda_2\,dx = \int_E (\lambda_1+\lambda_2)\,dx. \]
But, by Theorem 8.7.3,
\[ \int_E (\lambda_1+\lambda_2)\,dx \le \int_E (f_1+f_2)\,dx; \]
whence
\[ \int_E \lambda_1\,dx + \int_E \lambda_2\,dx \le \int_E (f_1+f_2)\,dx, \]
and, similarly,
\[ \int_E \mu_1\,dx + \int_E \mu_2\,dx \ge \int_E (f_1+f_2)\,dx. \]
Now we can choose λ1, λ2, μ1, μ2 so that, given any tolerance ε > 0,
\[ \int_E \mu_1\,dx - \epsilon < \int_E f_1\,dx < \int_E \lambda_1\,dx + \epsilon, \]
\[ \int_E \mu_2\,dx - \epsilon < \int_E f_2\,dx < \int_E \lambda_2\,dx + \epsilon. \]
Combining these with the two preceding inequalities,
\[ \int_E f_1\,dx + \int_E f_2\,dx - 2\epsilon < \int_E (f_1+f_2)\,dx < \int_E f_1\,dx + \int_E f_2\,dx + 2\epsilon. \]
Since this is true for all ε > 0,
\[ \int_E (f_1+f_2)\,dx = \int_E f_1\,dx + \int_E f_2\,dx. \]
We can now show that the Lebesgue integral is a linear functional.
THEOREM 8.7.6. If f(x) and g(x) are bounded and measurable on a measurable set E, and if a, b are any real numbers, then
\[ \int_E (af+bg)\,dx = a \int_E f\,dx + b \int_E g\,dx. \]
If a ≥ 0, and if λ, μ are lower and upper Lebesgue bracketing functions to f(x), then aλ, aμ are lower and upper Lebesgue bracketing functions to af(x). Hence
\[ \int_E af\,dx = a \int_E f\,dx \quad (a \ge 0). \]
Also, by Theorem 8.7.5,
\[ \int_E (-af)\,dx + \int_E (af)\,dx = 0, \]
whence
\[ \int_E (-af)\,dx = - \int_E af\,dx = -a \int_E f\,dx. \]
Therefore
\[ \int_E af\,dx = a \int_E f\,dx, \]
whether a is positive or negative (or zero). Finally, by the same Theorem 8.7.5,
\[ \int_E (af+bg)\,dx = \int_E af\,dx + \int_E bg\,dx = a \int_E f\,dx + b \int_E g\,dx. \]
COROLLARY. By induction it follows that if f1, f2, ..., fn is any finite collection of functions, each bounded and measurable on a measurable set E, and if c1, c2, ..., cn are any real numbers, then
\[ \int_E \sum_{k=1}^{n} c_k f_k\,dx = \sum_{k=1}^{n} c_k \int_E f_k\,dx. \]
The extension of this result to an enumerable collection of functions is one of the most powerful and attractive features of the Lebesgue theory. We shall begin by proving the 'theorem of bounded convergence', which applies to a sequence {fn(x)} (n = 1, 2, ...), such that, at each point x of a bounded and measurable set E, |fn(x)| is less than a constant K independent of n, and fn(x) converges to a limit function f(x).
THEOREM 8.7.7. If the sequence {fn(x)} converges boundedly to f(x) in a bounded and measurable set E, then
\[ \int_E f_n(x)\,dx \to \int_E f(x)\,dx \quad\text{as } n \to \infty. \]
Choose any positive number ε. Let E1 be the set of points at which |f(x)−fn(x)| < ε for all n, and let Ek+1 be the set of points at which
\[ |f(x)-f_k(x)| \ge \epsilon \quad\text{and}\quad |f(x)-f_n(x)| < \epsilon \ \text{ for } n \ge k+1. \]
Then the sets {Ek} (k = 1, 2, ...) are disjoint and their union is E. Hence, by Theorem 8.7.2,
\[ \int_E f_n(x)\,dx = \sum_{k=1}^{\infty} \int_{E_k} f_n(x)\,dx. \]
By the addition theorems (8.7.2) and (8.7.5),
\[ \int_E f(x)\,dx - \int_E f_n(x)\,dx = \int_E \{f(x)-f_n(x)\}\,dx = \sum_{k=1}^{n} \int_{E_k} \{f(x)-f_n(x)\}\,dx + \int_{F_n} \{f(x)-f_n(x)\}\,dx, \]
where Fn = E−E1−E2−...−En. By the mean value theorem (8.7.1),
\[ \Bigl| \int_{E_k} \{f(x)-f_n(x)\}\,dx \Bigr| < \epsilon\, m(E_k) \quad\text{if } n \ge k+1, \]
whence
\[ \Bigl| \sum_{k=1}^{n} \int_{E_k} \{f(x)-f_n(x)\}\,dx \Bigr| < \epsilon \sum_{k=1}^{n} m(E_k) \ \dots \]
...
\[ m(y)-m(x) = \int_x^y \chi(t)\,dt \quad (y > x), \]
therefore
\[ 0 \le m(y)-m(x) \le y-x, \]
whence m(x) is continuous and is a non-decreasing function of x. Therefore, by Theorem 5.7.5, m(x) possesses a derivative m'(x) almost everywhere, and 0 ≤ m'(x) ≤ 1. It only remains to prove that, almost everywhere, m'(x) = χ(x). Let
\[ m_n(x) = n\{m(x+1/n)-m(x)\} \quad (n = 1, 2, \dots). \]
Then mn(x) converges boundedly to m'(x) almost everywhere in any interval [a, b]. Therefore by the theorem of bounded convergence (8.7.7), if a ≤ c ≤ b,
\[ \int_a^c m_n(x)\,dx \to \int_a^c m'(x)\,dx \quad\text{as } n \to \infty. \]
But
\[ \int_a^c m_n(x)\,dx = n \int_a^c m(x+1/n)\,dx - n \int_a^c m(x)\,dx = n \int_{a+1/n}^{c+1/n} m(x)\,dx - n \int_a^c m(x)\,dx = n \int_c^{c+1/n} m(x)\,dx - n \int_a^{a+1/n} m(x)\,dx, \]
for all n > (b−c)^{-1}. Now, by the mean value theorem (8.7.1), if
\[ A = \inf m(x) \quad\text{and}\quad B = \sup m(x) \]
for a ≤ x ≤ a+1/n, then
\[ A/n \le \int_a^{a+1/n} m(x)\,dx \le B/n. \]
But, since m(x) is continuous,
\[ A \to m(a) \quad\text{and}\quad B \to m(a) \quad\text{as } n \to \infty. \]
Therefore
\[ n \int_a^{a+1/n} m(x)\,dx \to m(a) \quad\text{as } n \to \infty. \]
Similarly
\[ n \int_c^{c+1/n} m(x)\,dx \to m(c) \quad\text{as } n \to \infty. \]
Therefore
\[ \int_a^c m'(x)\,dx = m(c)-m(a) = \int_a^c \chi(x)\,dx, \]
for any interval [a, c]. Hence, by Theorem 8.7.4,
\[ m'(x) = \chi(x) \quad\text{p.p. in } [a, b]. \]
A point ξ is said to be a point of metric density of the set E with indicator χ(x) if the measure function
\[ m(x) = \int_a^x \chi(t)\,dt \]
has a derivative m'(ξ) at the point ξ. Hence we have proved that almost all points are points of metric density and that the value of the metric density is 1 if x ∈ E or 0 if x ∉ E.
We can now examine the differentiability of the Lebesgue integral of a bounded, measurable function f(x) over a bounded interval [a, b].
THEOREM 8.8.2. Almost everywhere in [a, b] the integral
\[ \phi(x) = \int_a^x f(t)\,dt \quad (a \ \dots \]
... t, or in the form
\[ \int_E f(x)\,dx = B\,m(E) - \int_A^B \mu_E(t, f)\,dt, \]
where μE(t, f) is the measure of the set of points in E at which f(x) < t. These expressions cannot be immediately generalized to unbounded functions or to unbounded intervals, for in the first term either A or B or both will be infinite. This particular difficulty disappears if we restrict ourselves to non-negative functions, for, if f(x) ≥ 0, then
\[ \int_E f(x)\,dx = \int_0^B m_E(t, f)\,dt. \]
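The formula for non-negative functions can be checked numerically in a case where both sides are known. The sketch below assumes f(x) = x² on E = [0, 1], for which B = 1 and the measure function is 1 − √t; the function name `young_integral` and the midpoint rule are illustrative choices, not part of the text. The ordinary integral of x² over [0, 1] is 1/3, and integrating the measure function over the range of f should reproduce it.

```python
import math

def young_integral(measure, upper, cells=20_000):
    """Midpoint-rule estimate of the integral of a measure function over [0, upper]."""
    h = upper / cells
    return sum(measure((i + 0.5) * h) for i in range(cells)) * h

def measure_of_square(t):
    # Measure of {x in [0, 1] : x^2 > t}, valid for 0 <= t <= 1 (assumed example).
    return 1.0 - math.sqrt(t)

approx = young_integral(measure_of_square, 1.0)
print(approx, 1.0 / 3.0)
```

The integration here runs over the range of values of f rather than over its domain, which is the feature that later lets the d-dimensional integral be reduced to a one-dimensional one.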
We are now left with the problem of defining the Young integral for an unbounded measurable function f(x) and an unbounded measurable set E. By Definition 7.5.1 the integrand mE(t, f) is given in terms of the measure function for the part Es of the set E that lies in the interval −s ≤ x ≤ s, as
\[ m_E(t, f) = \lim_{s\to\infty} m_{E_s}(t, f). \]
Hence mE(t, f), like the measure function for Es, is a non-increasing function of t. The difficulties in framing a definition are that the upper limit B may be infinite and the integrand mE(t, f) may not be bounded. However, it is clear that we can restrict ourselves to the case when mE(t, f) is bounded in any interval 0 < ε ≤ t ...
... φ(x) is summable over E, then f(x) is summable over E.
For
\[ m_E(t, f) \le m_E(t, \phi), \]
and
\[ \int_\epsilon^B m_E(t, f)\,dt \le \int_\epsilon^B m_E(t, \phi)\,dt. \]
Hence the integral on the left must converge as ε → 0 and B → ∞.
DEFINITION 9.2.2. The Lebesgue integral over E of a non-negative function f(x) summable over E is
\[ \int_E f(x)\,dx = \lim_{B\to\infty,\ \epsilon\to 0} \int_\epsilon^B m_E(t, f)\,dt. \]
Lebesgue integral of summable functions
Consider, for example, the function
\[ f(x) = \frac{\sin^2 x}{x^2} \ \text{ if } x \ne 0, \qquad f(0) = 0, \]
for which B = 1. The set of points at which f(x) > t is obviously covered by the set of points at which x^{-2} > t. Hence, if E is the semi-infinite interval 0 ≤ x ≤ ∞, then mE(t, f) ...
...
For the set E{x; f(x) > t, x ∈ J} is the intersection of the measurable sets E{x; f(x) > t, x ∈ I} and J, and hence is measurable. Also
\[ E\{x;\ f(x) > t,\ x \in I\} \ge E\{x;\ f(x) > t,\ x \in J\}; \]
whence
\[ m_I(t, f) \ge m_J(t, f) \]
and
\[ \int_0^\infty m_I(t, f)\,dt \ge \int_0^\infty m_J(t, f)\,dt. \]
Two important consequences of these definitions relate to sets of zero measure.
THEOREM 9.2.4. If f(x) is any function, bounded or unbounded, measurable or not, and if E is any set of measure zero, then
\[ \int_E f(x)\,dx = 0. \]
For
\[ m_E(t, f^+) \le m(E) = 0 \]
and
\[ m_E(t, f^-) \le m(E) = 0. \]
Hence
\[ \int_\epsilon^B m_E(t, f^+)\,dt = 0 = \int_\epsilon^B m_E(t, f^-)\,dt, \]
and therefore
\[ \int_E f(x)\,dx = 0, \]
by Definition 9.2.5.
THEOREM 9.2.5. If f(x) is summable over a measurable set E and Z is a subset of E of measure zero, then
\[ \int_E f(x)\,dx = \int_F f(x)\,dx, \]
where F is the complement of Z with respect to E.
We need only prove the theorem for f+(x). Now
\[ m_E(t, f^+) = m_F(t, f^+) + m_Z(t, f^+) \]
and
\[ m_Z(t, f^+) = 0. \]
Hence
\[ \int_E f^+(x)\,dx = \int_0^\infty m_E(t, f^+)\,dt = \int_0^\infty m_F(t, f^+)\,dt = \int_F f^+(x)\,dx. \]
Hence in calculating a Lebesgue integral over E we can always neglect the contribution from a set of points in E of measure zero.
DEFINITION 9.2.6. Two functions f(x) and g(x), each summable over E, are said to be 'equivalent' in E if f(x) = g(x) almost everywhere in E.
THEOREM 9.2.6. If f(x) and g(x) are each summable over E and f(x) and g(x) are equivalent in E, then
\[ \int_E f(x)\,dx = \int_E g(x)\,dx. \]
9.3. The Lebesgue integral of summable functions as a positive, linear, 'continuous' functional
To justify the definition of the Lebesgue integral of a summable function we must show that, as in the case of a bounded measurable function, it is a positive, linear, continuous functional, but now 'continuity' is taken in the very general sense of the theorem of 'dominated convergence' (9.3.7).
THEOREM 9.3.1. If f(x) is non-negative and summable over a measurable set E, then
\[ \int_E f(x)\,dx \ge 0. \]
For in this case
\[ \int_E f(x)\,dx = \int_E f^+(x)\,dx = \int_0^\infty m_E(t, f^+)\,dt \ge 0. \]
Hence the Lebesgue integral of a summable function is a positive functional.
THEOREM 9.3.2 (the addition theorem for sets). If the bounded measurable set E is the union of a finite or enumerable collection of disjoint, measurable sets En (n = 1, 2, ...) and if f(x) is summable over each set En then it is also summable over E and
\[ \int_E f(x)\,dx = \sum_{n=1}^{\infty} \int_{E_n} f(x)\,dx. \]
It is clearly sufficient to prove this theorem for f+(x). Let F be the set of points in E at which 0 ...
...
\[ E_1 = E\{x;\ \epsilon > g_1(x), g_2(x), \dots\}, \]
\[ E_2 = E\{x;\ g_1(x) \ge \epsilon > g_2(x), g_3(x), \dots\}, \]
\[ E_{k+1} = E\{x;\ g_k(x) \ge \epsilon > g_{k+1}(x), g_{k+2}(x), \dots\}. \]
Then the sets {Ek} (k = 1, 2, ...) form an enumerable collection of disjoint, measurable sets whose union is E. Hence by the addition theorem for summable functions (9.3.3),
\[ \int_E g_n = \sum_{k=1}^{n} \int_{E_k} g_n + \int_{F_n} g_n, \]
where
\[ F_n = E - E_1 - E_2 - \dots - E_n. \]
Since 0 ≤ gn < ε in each of the sets E1, E2, ..., En it follows that
\[ 0 \le \int_{E_k} g_n \le \epsilon\, m(E_k) \quad\text{if } k = 1, 2, \dots, n. \]
Also
\[ 0 \le \int_{F_n} g_n \le 2 \int_{F_n} \phi, \]
and
\[ \int_{F_n} \phi = \int_E \phi - \int_{E_1} \phi - \int_{E_2} \phi - \dots - \int_{E_n} \phi \to 0 \quad\text{as } n \to \infty. \]
Therefore
\[ \int_{E-F_n} g_n < \epsilon\, m(E_1) + \epsilon\, m(E_2) + \dots + \epsilon\, m(E_n) \le \epsilon\, m(E), \]
whence
\[ \limsup_{n\to\infty} \int_E g_n \le \epsilon\, m(E). \]
Since m(E) is finite and this inequality is true for all ε > 0,
\[ \int_E g_n \to 0 \quad\text{as } n \to \infty. \]
Thus
\[ \Bigl| \int_E f_n(x)\,dx - \int_E f(x)\,dx \Bigr| \le \int_E g_n(x)\,dx \to 0 \quad\text{as } n \to \infty. \]
COROLLARY. The theorem of dominated convergence is also true for any measurable set E, bounded or unbounded.
Let Es be the intersection of E and the finite interval −s ≤ x ≤ s. Then
\[ \int_E |f-f_n| = A_{n,s} + B_{n,s}, \]
where
\[ A_{n,s} = \int_{E-E_s} |f-f_n| \quad\text{and}\quad B_{n,s} = \int_{E_s} |f-f_n|. \]
For all n,
\[ 0 \le A_{n,s} \le 2 \int_E \phi - 2 \int_{E_s} \phi. \]
Hence there exists an integer σ(ε) such that An,s ≤ ε for s ≥ σ(ε) and for all n. Also there exists an integer ν(s, ε) such that Bn,s ≤ ε for n ≥ ν(s, ε). Hence there exists an integer ν(σ, ε) dependent only on ε such that An,σ+Bn,σ ≤ 2ε for n > ν(σ, ε). Therefore
\[ \lim_{n\to\infty} \int_E |f-f_n| = 0 \]
and the theorem is established.
Thus the Lebesgue integral of summable functions is a continuous functional.
The existence of a summable dominant function φ(x), such that |fn(x)| ≤ φ(x) and φ(x) is summable over an unbounded measurable interval E, is sufficient but not necessary for the convergence of the sequence
\[ \int_E f_n(x)\,dx \quad\text{to the limit}\quad \int_E \lim f_n(x)\,dx. \]
A simple counter example is given by (n-! < x otherwise.
If φ(x) ≥ |fn(x)| for all n, then φ(x) ≥ x^{-1} and hence no summable dominant function exists.
9.4. The Lebesgue integral as a primitive
We have already proved in Theorem 8.8.2 that if f(x) is a bounded, measurable function in an interval [a, b] then the indefinite Lebesgue integral
\[ \phi(x) = \int_a^x f(t)\,dt \]
possesses, almost everywhere in [a, b], a derivative φ'(x) equal to f(x). We can now extend this result to summable functions. We need a preliminary lemma, due to Fatou, on sequences of functions {fn(x)} which converge to a limit f(x) (but which do not possess dominated convergence).
THEOREM 9.4.1 (Fatou's lemma). If {fn(x)} is any sequence of non-negative functions, each summable in a bounded measurable set E, and if fn(x) → f(x) pointwise in E, and
\[ \liminf_{n\to\infty} \int_E f_n(x)\,dx < \infty, \]
then f(x) is summable over E and
\[ \liminf_{n\to\infty} \int_E f_n(x)\,dx \ge \int_E f(x)\,dx. \]
Let fn,k = min(fn, k) (k = 1, 2, ...). Then, as n → ∞, fn,k → min(f, k) = φk, say. Hence, by Theorem 8.7.7,
\[ \int_E f_{n,k} \to \int_E \phi_k. \]
But fn,k ≤ fn, whence, by Theorem 9.3.5,
\[ \int_E f_{n,k} \le \int_E f_n. \]
Therefore
\[ \liminf \int_E f_n \ge \liminf \int_E f_{n,k} = \lim \int_E f_{n,k} = \int_E \phi_k. \]
Since this is true for all k and since
\[ \int_0^k m(t, f)\,dt = \int_E \phi_k \le \liminf \int_E f_n, \]
it follows that f is summable and that
\[ \int_E f \le \liminf \int_E f_n. \]
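This theorem is the 'best possible' with the prescribed restrictions on the functions fn(x), for we can easily construct an example where the sign of inequality must be taken. Let
\[ f_n(x) = \begin{cases} n^2 x & (0 \le x \le 1/n), \\ 0 & (1/n < x \ \dots). \end{cases} \]
Then fn(x) → 0 at each point x, so that f(x) = 0, while the integral of fn(x) is ½ for every n. Thus
\[ \liminf \int f_n = \tfrac12 > 0 = \int f. \]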
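The strict inequality in Fatou's lemma can be observed numerically. The sketch below assumes the sequence fn(x) = n²x on [0, 1/n] with fn(x) = 0 for x > 1/n, taken over E = [0, 1]; the zero branch, the interval E, and the helper names are assumptions made for the illustration. Each integral equals ½, while at any fixed point the values are eventually zero.

```python
def f_n(n, x):
    """n^2 * x on [0, 1/n]; zero for x > 1/n (the zero branch is an assumption)."""
    return n * n * x if x <= 1.0 / n else 0.0

def midpoint_integral(func, a, b, cells):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    h = (b - a) / cells
    return sum(func(a + (i + 0.5) * h) for i in range(cells)) * h

ns = [1, 2, 4, 8, 16]
# 1600 cells, so that 1/n falls on a cell boundary for every n in ns.
integrals = [midpoint_integral(lambda x, n=n: f_n(n, x), 0.0, 1.0, 1600) for n in ns]
values_at_point = [f_n(n, 0.3) for n in ns]

print(integrals)        # each close to 0.5
print(values_at_point)  # zero once 1/n < 0.3
```

The limit function is therefore zero with integral zero, although every term of the sequence has integral ½, so the inequality in Theorem 9.4.1 cannot in general be replaced by equality.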
A special case of Fatou's lemma (9.4.1) is worthy of note, viz. the 'monotone convergence theorem'.
THEOREM 9.4.2. If {fn(x)} is any monotone, non-decreasing sequence of non-negative functions, each summable in a bounded, measurable set E, fn(x) → f(x) pointwise in E, and
\[ \lim_{n\to\infty} \int_E f_n(x)\,dx < \infty, \]
then f(x) is summable over E and
\[ \int_E f_n(x)\,dx \to \int_E f(x)\,dx. \]
For by Fatou's lemma (9.4.1), f(x) is summable over E. Hence
\[ \int_E f_n(x)\,dx \le \int_E f(x)\,dx \le \liminf_{n\to\infty} \int_E f_n(x)\,dx \quad\text{(by 9.4.1)} \quad = \lim_{n\to\infty} \int_E f_n(x)\,dx, \]
since the sequence {∫ fn(x) dx} is monotonic; therefore
\[ \lim_{n\to\infty} \int_E f_n(x)\,dx = \int_E f(x)\,dx. \]
Before studying the differentiation of an indefinite integral, it is a simpler problem to study the integration of a derivative. The surprising result is given by
THEOREM 9.4.3. If φ(x) is continuous and non-decreasing in [a, b] then its derivative φ'(x) is summable over (a, b) and
\[ \int_a^b \phi'(x)\,dx \le \phi(b)-\phi(a). \]
By Theorem 5.7.5 the incrementary ratio
\[ \phi(x, n) = n\phi(x+1/n) - n\phi(x) \quad (n = 1, 2, \dots) \]
converges to a limit φ'(x) as n → ∞ almost everywhere in (a, b), i.e. at a set of points E with measure m(E) = b−a. Hence, by Fatou's lemma, φ'(x) is summable over E and
\[ \int_E \phi'(x)\,dx \le \liminf_{n\to\infty} \int_E \phi(x, n)\,dx. \]
Now, by Theorem 9.2.5,
\[ \int_a^b \phi'(x)\,dx = \int_E \phi'(x)\,dx, \]
while
\[ \int_E \phi(x, n)\,dx = \int_a^b \phi(x, n)\,dx = n \int_a^b \phi(x+1/n)\,dx - n \int_a^b \phi(x)\,dx = n \int_b^{b+1/n} \phi(x)\,dx - n \int_a^{a+1/n} \phi(x)\,dx \]
if we define φ(x) as equal to φ(b) in the interval b ≤ x ≤ b+1/n. Now
\[ n \int_b^{b+1/n} \phi(x)\,dx = \phi(b), \]
and
\[ n \int_a^{a+1/n} \phi(x)\,dx \ge \phi(a). \]
Hence
\[ \int_a^b \phi'(x)\,dx \le \liminf_{n\to\infty} \int_a^b \phi(x, n)\,dx \le \phi(b)-\phi(a). \]
Once again we note that this is the 'best possible' result, for there are continuous and non-decreasing functions φ(x) for which
\[ \int_a^b \phi'(x)\,dx < \phi(b)-\phi(a). \]
To investigate the differentiability of the indefinite Lebesgue integral
\[ \phi(x) = \int_a^x f(t)\,dt \]
we can, as usual, restrict ourselves to a non-negative function f(x), but we need to verify the continuity of φ(x) when f(x) is unbounded but summable.
THEOREM 9.4.4. If f(x) is non-negative and summable over the interval [a, b], then the Lebesgue integral
\[ \phi(x) = \int_a^x f(t)\,dt \]
is continuous for all x in [a, b].
Let
\[ f_n(x) = \begin{cases} f(x) & (f(x) \le n), \\ n & (f(x) > n). \end{cases} \]
Then fn(x) → f(x) as n → ∞, and, by Theorem 9.3.7,
\[ \int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx \quad\text{as } n \to \infty. \]
Hence to any tolerance ε > 0 there corresponds an integer n such that
\[ 0 \le \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx < \tfrac12 \epsilon. \]
Now fn(x) ≤ f(x), whence, if a ≤ α < β ≤ b, then
\[ 0 \le \int_\alpha^\beta f(x)\,dx - \int_\alpha^\beta f_n(x)\,dx < \tfrac12 \epsilon \]
and
\[ 0 \le \int_\alpha^\beta f_n(x)\,dx \le n(\beta-\alpha). \]
Hence, if β−α < ε/(2n), then
\[ 0 \le \phi(\beta)-\phi(\alpha) < \epsilon, \]
i.e. φ(x) is continuous for all x in [a, b].
THEOREM ...
\[ \phi(x) = \int_a^x f(t)\,dt \]
possesses almost everywhere in (a, b) a derivative φ'(x) equal to f(x).
The function f(x)−fn(x) is non-negative, and summable, and the integral
\[ \int_a^x \{f(t)-f_n(t)\}\,dt = \phi(x)-\phi(a) - \int_a^x f_n(t)\,dt \]
is a non-decreasing, continuous function of x. Hence, by Theorem 5.7.5, it possesses almost everywhere a derivative which is non-negative. By the same theorem the integrals
\[ \int_a^x f(t)\,dt \quad\text{and}\quad \int_a^x f_n(t)\,dt \]
are differentiable almost everywhere. Now
\[ \phi'(x) \ge \frac{d}{dx} \int_a^x f_n(t)\,dt + \frac{d}{dx} \int_a^x \{f(t)-f_n(t)\}\,dt \quad\text{p.p.} \]
and f(t) ≥ fn(t). Therefore, by Theorem 8.8.2,
\[ \phi'(x) \ge \frac{d}{dx} \int_a^x f_n(t)\,dt = f_n(x) \quad\text{p.p.} \]
and
\[ \phi'(x) \ge \lim_{n\to\infty} f_n(x) = f(x) \quad\text{p.p.} \]
But, by Theorem 9.3.3,
\[ \int_a^b \{\phi'(x)-f(x)\}\,dx = \int_a^b \phi'(x)\,dx - \int_a^b f(x)\,dx \le 0. \]
Hence, by the null integral theorem (8.7.4),
\[ \phi'(x) = f(x) \quad\text{p.p. in } (a, b). \]
9.5. Exercises
1. Show that the function
\[ f(x) = \frac{\sin x}{x} \quad (x \ne 0), \qquad f(0) = \dots \]
is not summable over the interval (0 ≤ x ≤ ∞).
2. If the non-negative functions fn(x) (n = 1, 2, ...) are each summable over a measurable set E, and if fn(x) ≤ fn+1(x), prove that the limit function
\[ f(x) = \lim_{n\to\infty} f_n(x) \]
is summable over E and that
\[ \int_E f_n(x)\,dx \to \int_E f(x)\,dx \quad\text{as } n \to \infty. \]
3. If fn(x) denotes the truncated function
\[ f_n(x) = \begin{cases} f(x) & \text{if } 0 \le f(x) \le n, \\ n & \text{if } n < f(x) \end{cases} \quad (n = 1, 2, \dots), \]
prove that (f+g)n ≤ fn+gn ≤ (f+g)2n, and deduce the addition theorem (9.3.3) for functions.
4. The function
\[ f(x) = 2x \sin(1/x^2) - (2/x) \cos(1/x^2) \]
is the derivative of x² sin(1/x²). Explain why f(x) is not summable over [0, 1].
5. Examine the sequence of integrals
\[ \int_0^\infty f_n(x)\,dx \]
in the light of the theorems of dominated and monotone convergence if
(i) fn(x) = nx e^{−nx},
(ii) fn(x) = 2n² e^{−n²x²},
(iii) fn(x) = φ(nx),
(iv) fn(x) = φ(x−n),
where φ(x) is summable over (0, ∞).
LEBESGUE THEORY IN d DIMENSIONS
10
Multiple integrals
10.1. Introduction
It is possible to develop the theory of Lebesgue measure and integration in d-dimensional Euclidean space from the very beginning, but in the interests of intelligibility we have so far restricted our exposition to one dimension. We have, however, phrased the terminology, the notation, the definitions, and the theorems so that most of them are applicable to d-dimensional space.
10.2. Elementary sets in d dimensions
DEFINITION 10.2.1. In d dimensions, with coordinates (x1, x2, ..., xd), an 'interval' is the Cartesian product of the linear intervals
ak -< xk -< bk (k = 1, 2, ..., d)
where ak, bk are finite or infinite numbers for all k.
DEFINITION 10.2.2. An 'elementary set', with indicator α, is the union of a finite number of disjoint intervals, with indicators α1, α2, ..., αn, so that
αj αk = 0 if j ≠ k,
and α = α1+α2+...+αn. The intervals {αs} are called the 'components' of α.
THEOREM 10.2.1. The intersection, union, difference, and symmetric difference of two elementary sets are also elementary sets.
The intersection of two intervals
...
is an interval of the form
...
The proof for the intersection of two elementary sets then follows as in Theorem 6.2.1.
If the interval τ is covered by an interval ω, then the complement ω−τ is an elementary set. For if τ is the interval (ak -< xk -< bk) (k = 1, 2, ..., d) then the 2d hyperplanes xk = ak and xk = bk divide the interval ω into 3^d intervals, one of which may be taken to be τ and the remainder of which form an elementary set, i.e. the complement ω−τ is an elementary set.
If the elementary set τ is covered by another elementary set σ then the complement σ−τ is an elementary set. For
\[ \sigma - \tau = \sigma - \sigma\tau = \sum_s \sigma_s (1-\tau), \]
where {σs} are the component intervals of σ. Now σs and 1−τ are elementary sets, and so is their intersection σs(1−τ). Also σs(1−τ)·σt(1−τ) = 0 if s ≠ t, i.e. the elementary sets σs(1−τ) are disjoint. Thus σ−τ is the union of a finite number of disjoint intervals and is therefore an elementary set.
It follows as in Theorem 6.2.1 that, if σ and τ are elementary sets, so also is their union σ∪τ, the differences σ−στ, τ−στ, and the symmetric difference σ△τ. Thus elementary sets in d dimensions form an algebra, as in one dimension.
THEOREM 10.2.2. If σ and τ are any pair of elementary sets then there exists a finite collection of disjoint intervals {γs} such that
\[ \sigma = \sum_s a_s \gamma_s, \qquad \tau = \sum_t b_t \gamma_t, \qquad \sigma \cup \tau = \sum_s \gamma_s, \]
where each coefficient as or bt is either 0 or 1.
The proof is exactly the same as in Theorem 6.2.2.
DEFINITION 10.2.3. The geometric measure g(α) of an interval α (ak -< xk -< bk) (k = 1, 2, ..., d) is its area (if d = 2), volume (if d = 3), or hypervolume (if d > 3), and
\[ g(\alpha) = \prod_{k=1}^{d} (b_k-a_k). \]
THEOREM 10.2.3. If the interval ω is the union of a finite number N of disjoint intervals {αs} then
\[ g(\omega) = \sum_{s=1}^{N} g(\alpha_s). \]
... xk = bk^(s) divide the interval ω into a finite number of intervals, ω1, ω2, ..., which can be enumerated so that the interior of ωs coincides with the interior of αs for s = 1, 2, ..., N. Hence
\[ g(\omega) = \sum_{s=1}^{N} g(\alpha_s). \]
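A small computation illustrates Definition 10.2.3 and the additivity stated in Theorem 10.2.3. The function name `geometric_measure` and the particular three-dimensional interval, cut into three disjoint pieces, are assumptions chosen for the example.

```python
from math import prod

def geometric_measure(bounds):
    """g of an interval given as (a_k, b_k) pairs, one pair per coordinate."""
    return prod(b - a for a, b in bounds)

# The interval 0 <= x1 <= 2, 0 <= x2 <= 3, 1 <= x3 <= 2.
omega = [(0.0, 2.0), (0.0, 3.0), (1.0, 2.0)]

# Three intervals with disjoint interiors whose union is omega.
parts = [
    [(0.0, 0.5), (0.0, 3.0), (1.0, 2.0)],
    [(0.5, 2.0), (0.0, 1.0), (1.0, 2.0)],
    [(0.5, 2.0), (1.0, 3.0), (1.0, 2.0)],
]

total = sum(geometric_measure(p) for p in parts)
print(geometric_measure(omega), total)
```

Both printed values agree, since the hypervolume of the whole interval is the sum of the hypervolumes of the pieces.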
10.3. Lebesgue theory in d dimensions
From this point onwards the d-dimensional theory follows, almost word for word, the one-dimensional theory of §§ 6.2, 6.3, 6.4, 6.5, and 6.6. Outer and inner measure are then defined and discussed exactly as in Chapter 7, and the Lebesgue integral as in Chapters 8 and 9, with one exception noted below and with the convention that the symbols
\[ \int\!\!\int \dots \int f(x)\,dx_1\,dx_2 \dots dx_d, \qquad \int_a^b f(x)\,dx, \qquad\text{or}\qquad \int_R f(x) \]
now represent the integral of f(x1, x2, ..., xd) over the interval R
\[ a_k \le x_k \le b_k \quad (k = 1, 2, \dots, d). \]
The reduction of the d-dimensional theory to the one-dimensional case is facilitated by Lebesgue's concept of integration over a measurable set and by Young's integral
\[ \int_E f(x)\,dx = \int m_E(t, f)\,dt, \]
which reduces the d-dimensional integral of f(x) over E to the one-dimensional integral of its measure function over the range of f(x).
The one exception is that we must exclude the results concerning the differentiability of the indefinite Lebesgue integral
\[ \phi(t) = \int_a^t f(x)\,dx, \]
which is crucially dependent on the monotone and continuous character of φ(t) as a function of the single variable t for non-negative functions f(x).
There is, however, one new problem that arises in the multidimensional theory, which is most clearly exhibited in the case of two dimensions. This is the problem of the relation between the integral
\[ \int_R f(x, y)\,dx\,dy \]
over the rectangle R (a ≤ x ≤ b, p ≤ y ≤ q) and the integrals
\[ g(y) = \int_a^b f(x, y)\,dx, \qquad h(x) = \int_p^q f(x, y)\,dy, \qquad \int_p^q g(y)\,dy, \qquad \int_a^b h(x)\,dx. \]
Elementary calculus suggests that if f(x, y) is bounded and continuous in R, then the multiple integral can be expressed as a repeated integral in the forms
\[ \int_R f(x, y)\,dx\,dy = \int_p^q g(y)\,dy = \int_a^b h(x)\,dx, \]
but there are a number of well-known examples which show that these relations are not true for all unbounded functions:
(1) If
\[ f(x, y) = \frac{x^2-y^2}{(x^2+y^2)^2} \]
except at the origin, where f = 0, and if R is the square (0 ≤ x ≤ 1, 0 ≤ y ≤ 1), then
\[ g(y) = -(1+y^2)^{-1}, \qquad h(x) = (1+x^2)^{-1}, \]
\[ \int_0^1 g(y)\,dy = -\tfrac14 \pi, \qquad \int_0^1 h(x)\,dx = \tfrac14 \pi. \]
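The two repeated integrals in example (1) can be checked numerically. The sketch below takes f(x, y) = (x² − y²)/(x² + y²)² on the unit square, evaluates the inner integrals at the sample point 0.5 by a midpoint rule, and then integrates the closed forms of h and g; the helper name `midpoint` and the sample point are illustrative assumptions.

```python
import math

def f(x, y):
    return (x * x - y * y) / (x * x + y * y) ** 2

def midpoint(func, a, b, cells=2000):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    h = (b - a) / cells
    return sum(func(a + (i + 0.5) * h) for i in range(cells)) * h

# Inner integrals at a sample point: h(0.5) should be 1/(1 + 0.25) = 0.8,
# and g(0.5) should be -0.8.
h_half = midpoint(lambda y: f(0.5, y), 0.0, 1.0)
g_half = midpoint(lambda x: f(x, 0.5), 0.0, 1.0)

# Outer integrals of the closed forms h(x) = 1/(1+x^2) and g(y) = -1/(1+y^2).
int_h = midpoint(lambda x: 1.0 / (1.0 + x * x), 0.0, 1.0)
int_g = midpoint(lambda y: -1.0 / (1.0 + y * y), 0.0, 1.0)

print(h_half, g_half)
print(int_h, int_g, math.pi / 4)
```

The two repeated integrals come out as +π/4 and −π/4, so the order of integration matters for this unbounded integrand.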
The set of points (x, y) at which f(x, y) > t > 0 occupies the interior of half of one of the loops of the lemniscate t r² = cos 2θ, whence
\[ m(t, f) = \tfrac14 t^{-1} \]
and the Young integral
\[ \int_0^\infty m(t, f)\,dt \]
is not convergent, so that f(x, y) is not summable over R.
(2) If φ(z) = p z^p (1+z^{2p}) (p > 1), z = xy, f(x, y) = dφ(z)/dz, and R is the rectangle (0 < x < ∞, 0 < y ...
... > 0 there correspond outer sets σ = σ(x) and τ = τ(y) such that α ≤ σ, β ≤ τ and
\[ m(\sigma)-\epsilon \le m(\alpha) \le m(\sigma), \qquad m(\tau)-\epsilon \le m(\beta) \le m(\tau). \]
Now σ and τ are respectively the unions of enumerable disjoint intervals {σp}, {τq}, so that
\[ \sigma = \sum \sigma_p, \qquad \tau = \sum \tau_q, \]
and σp * τq is a rectangle whose edges are the intervals σp and τq. Hence
\[ m(\sigma_p * \tau_q) = m(\sigma_p)\,m(\tau_q) \]
and
\[ m(\sigma * \tau) = \sum m(\sigma_p)\,m(\tau_q) = \sum m(\sigma_p) \sum m(\tau_q) = m(\sigma)\,m(\tau). \]
The product set α * β is covered by the open set σ * τ, whence
\[ m^*(\alpha * \beta) \le m(\sigma * \tau) = m(\sigma)\,m(\tau) \le \{m(\alpha)+\epsilon\}\{m(\beta)+\epsilon\}. \]
This is true for each ε > 0, therefore
\[ m^*(\alpha * \beta) \le m(\alpha)\,m(\beta). \]
Now the sets α and β are each bounded. Hence there are intervals A and B, lying on the x- and y-axes respectively, such that α ≤ σ ≤ A and β ≤ τ ≤ B. The complement of the set α * β with respect to the rectangle A * B is expressible as the union of three disjoint sets as
\[ A * B - \alpha * \beta = (A-\alpha) * (B-\beta) + (A-\alpha) * \beta + \alpha * (B-\beta). \]
By the result established above for outer measures
\[ m^*\{(A-\alpha) * (B-\beta)\} \le m(A-\alpha)\,m(B-\beta), \]
\[ m^*\{(A-\alpha) * \beta\} \le m(A-\alpha)\,m(\beta), \]
\[ m^*\{\alpha * (B-\beta)\} \le m(\alpha)\,m(B-\beta). \]
Therefore
\[ m^*(A * B - \alpha * \beta) \le m(A)\,m(B) - m(\alpha)\,m(\beta), \]
and
\[ m_*(\alpha * \beta) = m(A * B) - m^*(A * B - \alpha * \beta) \ge m(\alpha)\,m(\beta). \]
Thus
\[ m(\alpha)\,m(\beta) \le m_*(\alpha * \beta) \le m^*(\alpha * \beta) \le m(\alpha)\,m(\beta). \]
Hence α * β is measurable and m(α * β) = m(α)m(β).
10.9. The geometric definition of the Lebesgue integral
The definition of the Lebesgue integral given by its inventor in his thesis and first paper (1902) was geometric rather than analytic in character, and it provides an illuminating approach, as can be seen from the clear and concise account given by Burkill (1953). The relation between the geometric and analytic definitions is given by the following definition and theorem.
DEFINITION 10.9.1. If $f(x)$ is a non-negative function defined in a set $E$, then the 'ordinate set' of $f(x)$ over $E$ is the set of points $(x, y)$ such that $x \in E$ and $0 \leq y < f(x)$.
THEOREM 10.9.1. If $f(x)$ is non-negative, bounded, and measurable over a bounded, measurable set $E$, then the Lebesgue integral $\int_E f(x)\,dx$ is equal to the two-dimensional measure of the ordinate set of $f(x)$ over $E$.
In the notation of § 8.5 let the range $[0, B]$ of $f(x)$ be divided by the finite number of points
$$0 = t_0 < t_1 < t_2 < \dots < t_n = B.$$
Let $\alpha(x, y)$ be the indicator of the ordinate set of $f(x)$ over $E$, and let $\eta_p(y)$ be the indicator of the set $t_p \leq y < t_{p+1}$. Let
$$\lambda(x, y) = \sum_{p=0}^{n-1} \alpha(x, t_{p+1}) \times \eta_p(y)$$
and
$$\mu(x, y) = \sum_{p=0}^{n-1} \alpha(x, t_p) \times \eta_p(y).$$
Then
$$\lambda(x, y) \leq \alpha(x, y) \leq \mu(x, y).$$
The sets $\alpha(x, t_p) \times \eta_p(y)$ $(p = 0, 1, 2, \dots, n-1)$ are disjoint, and by Theorem 10.8.1 on product sets,
$$m\{\alpha(x, t_p) \times \eta_p(y)\} = m\{\alpha(x, t_p)\}\,m\{\eta_p(y)\} = m(t_p, f)(t_{p+1}-t_p),$$
where $m(t, f)$ is the measure of the set $\{x;\ f(x) > t\}$. Hence the measure of the set of points with indicator $\mu(x, y)$ is
$$m(\mu) = \sum_{p=0}^{n-1} m(t_p, f)(t_{p+1}-t_p).$$
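Similarly
$$m(\lambda) = \sum_{p=0}^{n-1} m(t_{p+1}, f)(t_{p+1}-t_p)$$
and
$$m(\mu) - m(\lambda) = \sum_{p=0}^{n-1} \{m(t_p, f) - m(t_{p+1}, f)\}(t_{p+1}-t_p) \leq \epsilon \sum_{p=0}^{n-1} \{m(t_p, f) - m(t_{p+1}, f)\} \leq \epsilon\, m(E),$$
if $\epsilon = \max(t_{p+1}-t_p)$ for $p = 0, 1, 2, \dots, n-1$. Therefore
$$m(\lambda) \leq \int_0^B m(t, f)\,dt \leq m(\mu)$$
and, since $\alpha \leq \mu$,
$$m^*(\alpha) \leq m(\mu).$$
If $\omega$ is an interval covering $\alpha$, then
$$\omega - \alpha \leq \omega - \lambda$$
and, by the preceding result,
$$m^*(\omega-\alpha) \leq m(\omega-\lambda) = m(\omega) - m(\lambda),$$
whence
$$m(\lambda) \leq m(\omega) - m^*(\omega-\alpha) = m_*(\alpha).$$
Now
$$m^*(\alpha) - m_*(\alpha) \leq m(\mu) - m(\lambda) \leq \epsilon\, m(E).$$
Since this holds for each $\epsilon > 0$,
$$m^*(\alpha) = m_*(\alpha).$$
Thus the ordinate set $\alpha$ is measurable. Hence
$$m(\lambda) \leq m(\alpha) \leq m(\mu)$$
and therefore
$$m(\alpha) = \int_0^B m(t, f)\,dt.$$
If the Lebesgue integral is defined geometrically as in Theorem 10.9.1, then it is possible to give very compact proofs of the convergence theorems (8.7.7 and 9.3.7) and of Fubini's theorem, as in Burkill's monograph (1953).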
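The identity at the end of the proof of Theorem 10.9.1, namely that the area of the ordinate set equals the integral of the measure function $m(t, f)$ over the range of $f$, can be checked numerically. The following is only a rough sketch under stated assumptions: the function $f(x) = x^2$ on $[0, 1]$, the helper names `measure_function` and `young_integral`, and the grid sizes are all illustrative choices, not taken from the text.

```python
def measure_function(f, a, b, t, n=2_000):
    """Approximate m(t, f): the length of {x in [a, b] : f(x) > t},
    estimated by counting midpoints of n equal sub-intervals."""
    h = (b - a) / n
    return h * sum(1 for k in range(n) if f(a + (k + 0.5) * h) > t)


def young_integral(f, a, b, upper, levels=200):
    """Midpoint-rule approximation of the integral of m(t, f) for 0 <= t <= upper."""
    dt = upper / levels
    return dt * sum(measure_function(f, a, b, (p + 0.5) * dt) for p in range(levels))


# Assumed example: f(x) = x^2 on [0, 1], whose range is [0, 1].
f = lambda x: x * x
area = young_integral(f, 0.0, 1.0, 1.0)
print(area)  # should be close to 1/3, the elementary integral of x^2 over [0, 1]
```

Here $m(t, f) = 1 - \sqrt{t}$, so the exact value is $1 - 2/3 = 1/3$; the sketch slices the ordinate set horizontally rather than vertically, which is the point of the geometric definition.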
10.10. Fubini's theorem in d dimensions The generalization of Fubini's theorem to d dimensions is proved in exactly the same way as in two dimensions and it is sufficient to state the results without proof if we use the customary compact notation.
The d-dimensional space $R$ is the Cartesian product of the p-dimensional space $R^{(p)}$ and the q-dimensional space $R^{(q)}$ ($d = p+q$). The vector $x = (x_1, x_2, \dots, x_p)$ is a point in $R^{(p)}$ ...
and the two repeated integrals are equal.
10.11. Exercises
1. $f(x, \lambda)$ possesses a partial derivative
$$g(x, \lambda) = \partial f(x, \lambda)/\partial\lambda$$
if $a \leq x \leq b$ and $\alpha \leq \lambda \leq \beta$. $g(x, \lambda)$ is a bounded and measurable function of $(x, \lambda)$. Prove that
$$h(\lambda) = \int_a^b g(x, \lambda)\,dx$$
exists and that
$$\int_\alpha^\lambda h(t)\,dt = \int_a^b \{f(x, \lambda) - f(x, \alpha)\}\,dx.$$
Deduce that
$$h(\lambda) = \frac{\partial}{\partial\lambda} \int_a^b f(x, \lambda)\,dx \quad \text{p.p. in } \lambda.$$
2. Use Fubini's theorem to prove that
$$\int_0^\infty e^{-x^2}\,dx \int_0^\infty e^{-y^2}\,dy = \tfrac{1}{4}\pi,$$
expressing the multiple integral as a Young integral.
3. If $g(x)$ and $h(x)$ are summable functions and
$$G(x) = \int_{-\infty}^x g(s)\,ds, \qquad H(x) = \int_{-\infty}^x h(s)\,ds,$$
prove that
$$\int_{-\infty}^{\infty} g(x)H(x)\,dx + \int_{-\infty}^{\infty} h(x)G(x)\,dx = \lim_{t \to \infty} G(t)H(t).$$
4. If $f(x, y)$ is bounded and measurable in the interval ($a \leq x \leq b$, $p \leq y \leq q$) and is non-increasing in $y$ for each fixed value of $x$, prove directly that
$$\int_a^b dx \int_p^q f(x, y)\,dy = \int_p^q dy \int_a^b f(x, y)\,dx.$$
5. If $f(x)$ is non-negative and bounded over $E$, and if the ordinate set of $f(x)$ over $E$ is measurable with measure $m$, show that $f(x)$ is measurable over $E$ and that $m = \int_E f(x)\,dx$ (Williamson, pp. 54, 55).
6. Prove Theorems 10.8.1 and 10.9.1 for unbounded measurable sets $E$.
11
The Lebesgue-Stieltjes integral
11.1. Introduction
The concept of the Stieltjes integral can be illustrated by the problem of calculating the quantity of heat $Q$ required to raise the temperature of a given heterogeneous body by 1 °C, given the specific heat at each point of the body (Lebesgue 1928, p. xii). Let the body be divided into a finite number of parts of masses $m_1, m_2, \dots, m_n$, and let $\underline{c}_p$ and $\overline{c}_p$ be the infimum and the supremum of the specific heat at points in the part with mass $m_p$. Then, by the definition of specific heat, $Q$ is intermediate in value between the sums
$$\Lambda = \sum_{p=1}^{n} \underline{c}_p m_p \qquad \text{and} \qquad M = \sum_{p=1}^{n} \overline{c}_p m_p.$$
As in§ 3.3 we can consider the collections of numbers A and M for all subdivisions of the body into a finite number of parts, and we can define sup A and inf M as the lower and upper Stieltjes integrals of the specific heat over the mass of the body. If these bounds are equal we can define their common value to be the Stieltjes integral of the specific heat over the mass of the body. This process is clearly analogous to that by which we obtained the lower and upper Darboux integrals(§ 3.3) and the Riemann integral, and the result is often called the Riemann-Stieltjes integral. It is subject to the same criticism as the Riemann integral and we shall therefore pass on at once and construct the analogue of the Lebesgue integral and thus obtain the Lebesgue-Stieltjes integral. For this purpose we can adopt the whole of the Lebesgue theory if we make one small but vital change at the very beginning and replace the geometric measure of an interval by what we shall call the 'weighted measure'. Thus in the example quoted
at the beginning of this section the primary concept would be not the volumes of the parts into which the body is divided but the masses of those parts. Now the physical concept of mass has one important property that is analogous to the mathematical concept of measure, viz. it is an additive function of the parts into which a body is divided, i.e. if a body of mass $m$ is divided into a finite number of parts with masses $m_1, m_2, \dots, m_n$, then $m = m_1 + m_2 + \dots + m_n$. But the concept of mass differs from the concept of measure inasmuch as we can have masses that are concentrated into surfaces, lines, or points, whereas the three-dimensional measure of a plane, or straight line, or a point is zero. (We do not speak of three-dimensional measure of a surface or a curve in general, because there are pathological examples that invalidate the corresponding plausible assertion.) This 'grittiness' or lack of smoothness in a mass distribution necessitates a rather careful definition of the concept of weighted measure. In three-dimensional space and even in two-dimensional space this leads to rather tiresome complications and we shall therefore first restrict ourselves to the Lebesgue-Stieltjes integral in one dimension.
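The lower and upper sums of the heat example in § 11.1 can be made concrete with a small numerical sketch. Everything specific below is an assumption chosen for illustration: the body is a rod occupying $[0, 1]$, its density is $1 + x$, and its specific heat is $1 + x^2$ (increasing, so that the infimum and supremum on each part sit at its end points).

```python
def mass(u, v):
    """Mass of the piece [u, v] of a rod with (assumed) density 1 + x."""
    return (v - u) + (v * v - u * u) / 2.0


def specific_heat(x):
    """Assumed specific heat; it is increasing in x on [0, 1]."""
    return 1.0 + x * x


def stieltjes_sums(n):
    """Lower and upper sums for a division of the rod [0, 1] into n equal parts."""
    lower = upper = 0.0
    for p in range(n):
        u, v = p / n, (p + 1) / n
        m_p = mass(u, v)
        lower += specific_heat(u) * m_p  # infimum of the specific heat on [u, v]
        upper += specific_heat(v) * m_p  # supremum of the specific heat on [u, v]
    return lower, upper


for n in (10, 100, 1000):
    print(n, stieltjes_sums(n))
```

For these assumed data the heat required is $\int_0^1 (1+x^2)(1+x)\,dx = 25/12$, and the two sums close in on this value from below and above as the division is refined. Note that the sums weight each part by its mass, not by its length.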
11.2. The weighted measure
Whether we are considering an open interval $(a, x)$ or a closed interval $[a, x]$ its weighted measure $w(x)$ is a non-negative, monotone, non-decreasing function of $x$. In any interval $a \leq x \leq b$, such a function is necessarily continuous at each point with the possible exception of a finite or enumerable set of points $\{x_n\}$ $(n = 1, 2, \dots)$. At each of these points there exist the limits
$$w(x_n - 0) = \lim w(x_n - h) \qquad \text{and} \qquad w(x_n + 0) = \lim w(x_n + h)$$
as $h$ tends to zero through positive values. Hence we are led to the following definitions.
DEFINITION 11.2.1. The weight function $w(x)$ is a non-negative, non-decreasing function of $x$.
DEFINITION 11.2.2. The weighted measure of a point $t$ is
$$w(t+0) - w(t-0).$$
(This is zero unless $t$ is one of the points of discontinuity $x_1, x_2, \dots$.) The weighted measure of an open interval $(a, b)$ is $w(b-0) - w(a+0)$. The weighted measure of a closed interval $[a, b]$ is $w(b+0) - w(a-0)$. The weighted measure of a half-open interval $[a, b)$ is $w(b-0) - w(a-0)$.
The theory of weighted measure can now be developed by exact analogy with the theory of geometric measure (Chapter 6).
DEFINITION 11.2.3. The weighted measure $w(\sigma)$ of an elementary set $\sigma$, consisting of a finite number of disjoint intervals $\sigma_1, \sigma_2, \dots, \sigma_n$, is the sum
$$w(\sigma) = \sum_{p=1}^{n} w(\sigma_p).$$
DEFINITION 11.2.4. The weighted measure $w(\sigma)$ of an outer set $\sigma$ consisting of an enumerable collection of disjoint intervals is the sum
$$w(\sigma) = \sum_{n=1}^{\infty} w(\sigma_n).$$
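The four interval formulae of Definition 11.2.2 differ only in which one-sided limits of the weight function are used, and a short sketch shows how point masses on the end points are counted. The weight function below is an assumption made for illustration: $w(x) = x$ plus hypothetical jumps of 0.5 at $x = 1$ and 0.25 at $x = 2$ (the name `JUMPS` and the helper functions are not from the text).

```python
# Hypothetical point masses: position -> size of the jump of w at that point.
JUMPS = {1.0: 0.5, 2.0: 0.25}


def w_left(t):
    """w(t - 0): the continuous part t plus all jumps strictly to the left of t."""
    return t + sum(m for x, m in JUMPS.items() if x < t)


def w_right(t):
    """w(t + 0): the continuous part t plus all jumps at or to the left of t."""
    return t + sum(m for x, m in JUMPS.items() if x <= t)


def measure_point(t):
    return w_right(t) - w_left(t)


def measure_open(a, b):
    return w_left(b) - w_right(a)


def measure_closed(a, b):
    return w_right(b) - w_left(a)


def measure_half_open(a, b):
    return w_left(b) - w_left(a)


print(measure_point(1.0))           # 0.5, the jump at x = 1
print(measure_open(1.0, 2.0))       # 1.0, neither end-point mass is counted
print(measure_closed(1.0, 2.0))     # 1.75, both end-point masses are counted
print(measure_half_open(1.0, 2.0))  # 1.5, only the mass at x = 1 is counted
```

The closed interval $[1, 2]$ is the disjoint union of the point 1, the open interval $(1, 2)$ and the point 2, and the measures add up accordingly: $0.5 + 1.0 + 0.25 = 1.75$. Half-open intervals likewise fit together without counting any point mass twice.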
It may now be verified as in Theorem 6.3.6 for Lebesgue measure that, with these definitions, if $\sigma$ and $\tau$ are bounded outer sets with the representations
$$\sigma = \sum_{s=1}^{\infty} \alpha_s \qquad \text{and} \qquad \tau = \sum_{t=1}^{\infty} \beta_t,$$
and if $\sigma$ is covered by $\tau$, then
$$\sum_{s=1}^{\infty} w(\alpha_s) \leq \sum_{t=1}^{\infty} w(\beta_t).$$
DEFINITION 11.2.5. The outer weighted measure $w^*(\alpha)$ of a bounded set of points $\alpha$ is the infimum of the weighted measures $w(\sigma)$ of the outer sets $\sigma$ which cover $\alpha$, i.e.
$$w^*(\alpha) = \inf w(\sigma) \quad \text{for } \sigma \geq \alpha.$$
DEFINITION 11.2.6. The inner weighted measure $w_\omega(\alpha)$ of a bounded set of points $\alpha$ with respect to an interval $\omega$ which covers $\alpha$ is
$$w_\omega(\alpha) = w(\omega) - w^*(\omega - \alpha).$$
THEOREM 11.2.1. The value of $w_\omega(\alpha)$ is independent of $\omega$.
DEFINITION 11.2.7. The inner weighted measure $w_*(\alpha)$ of a bounded set of points is the value of $w_\omega(\alpha)$ for any interval $\omega$ which covers $\alpha$.
DEFINITION 11.2.8. A bounded set of points $\alpha$ is said to have Stieltjes measure $\mu_w(\alpha)$ with respect to the weight function $w(x)$ if the outer and inner weighted measures of $\alpha$ are equal, and the value of the Stieltjes measure is defined to be
$$\mu_w(\alpha) = w^*(\alpha) = w_*(\alpha).$$
For brevity we often say that, under these conditions, 'the set $\alpha$ is measurable ($w$)'.
THEOREM 11.2.2. The Stieltjes measure $\mu_w(\alpha)$ is a positive, additive, continuous functional of the indicator $\alpha(x)$, i.e.
$$\mu_w(\alpha) \geq 0,$$
$$\mu_w(\alpha_1 + \alpha_2) = \mu_w(\alpha_1) + \mu_w(\alpha_2) \quad \text{if } \alpha_1\alpha_2 = 0,$$
and
$$\mu_w\Big(\sum_{n=1}^{\infty} \alpha_n\Big) = \sum_{n=1}^{\infty} \mu_w(\alpha_n)$$
if $\alpha_1, \alpha_2, \dots$ is an enumerable collection of disjoint sets each bounded by the same interval $I$.
The Stieltjes measure $\mu_w(\alpha)$ therefore possesses many of the properties of an integral, and is therefore commonly written in the form
$$\mu_w(\alpha) = \int \alpha(x)\,dw(x).$$
In particular, for an interval $I$,
$$\mu_w(I) = \int_I dw(x).$$
DEFINITION 11.2.9. A bounded function $f(x)$ will be said to have Stieltjes measure function $\mu_w(t, f)$ (or to be measurable $w$) if the set of points $E\{x;\ f(x) > t\}$ has Stieltjes measure $\mu_w(t, f)$ for each value of $t$.
As in § 8.5 we can introduce the upper and lower Lebesgue bracketing functions $\mu(x)$ and $\lambda(x)$, corresponding to a partition
$$A = t_0 < t_1 < t_2 < \dots < t_n = B$$
of the range of $f(x)$ for $a \leq x \leq b$. Let
$$\epsilon = \max(t_{p+1} - t_p) \quad \text{for } p = 0, 1, 2, \dots, n-1.$$
THEOREM 11.2.3.
$$\int_a^b \mu(x)\,dw(x) - \int_a^b \lambda(x)\,dw(x) \leq \epsilon\{\mu_w(A, f) - \mu_w(B, f)\}.$$
DEFINITION 11.2.10. The Lebesgue-Stieltjes integral of $f(x)$ with respect to the weight function $w(x)$ over the interval $[a, b]$ is
$$\int_a^b f(x)\,dw(x) = \inf \int_a^b \mu(x)\,dw(x) = \sup \int_a^b \lambda(x)\,dw(x),$$
for all Lebesgue bracketing functions $\lambda(x)$ and $\mu(x)$.
THEOREM 11.2.4.
$$\int_a^b f(x)\,dw(x) = \mu_w(A, f)\,A + \int_A^B \mu_w(t, f)\,dt.$$
This identifies the Lebesgue-Stieltjes integral with the 'Young-Stieltjes' integral of the monotone measure function $\mu_w(t, f)$. We can then extend the definition to unbounded functions and unbounded intervals of integration as before, by considering separately the positive and negative parts of $f(x)$.
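The formula of Theorem 11.2.4 is easiest to check when the weight function consists of point masses only, since the Lebesgue-Stieltjes integral is then simply a weighted sum of values of $f$. The sketch below makes several assumptions for illustration: the masses and their positions, the function $f(x) = x^2 + 1$, and the choice of $A$ strictly below every value of $f$ and $B$ equal to its largest value; the helper names are hypothetical.

```python
def stieltjes_measure(masses, f, t):
    """mu_w(t, f): total mass carried by the points x at which f(x) > t."""
    return sum(m for x, m in masses.items() if f(x) > t)


def direct_integral(masses, f):
    """Integral of f with respect to a pure point-mass weight: sum of f(x) * mass."""
    return sum(f(x) * m for x, m in masses.items())


def young_stieltjes(masses, f, A, B, levels=100_000):
    """mu_w(A, f) * A plus a midpoint-rule estimate of the integral of mu_w(t, f) over [A, B]."""
    dt = (B - A) / levels
    area = dt * sum(stieltjes_measure(masses, f, A + (p + 0.5) * dt) for p in range(levels))
    return stieltjes_measure(masses, f, A) * A + area


# Assumed data: three point masses and a function taking the values 1, 2, 5 on them.
masses = {0.0: 0.5, 1.0: 0.25, 2.0: 1.0}
f = lambda x: x * x + 1.0

direct = direct_integral(masses, f)
layered = young_stieltjes(masses, f, A=0.5, B=5.0)
print(direct, layered)  # the two values should agree up to the discretization error
```

With these data the direct sum is $0.5 \cdot 1 + 0.25 \cdot 2 + 1 \cdot 5 = 6$, while the right-hand side of the theorem is $1.75 \cdot 0.5 + 5.125 = 6$. The term $\mu_w(A, f)\,A$ accounts for the slab of the ordinate set lying below the level $A$.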
11.3. The Lebesgue representation of a Stieltjes integral Lebesgue has shown that the Lebesgue-Stieltjes integral in one dimension can be represented as an ordinary Lebesgue integral by a simple transformation of the independent variable from x to the 'Lebesgue inverse function' of the weight function w(x). The weight function w(x) has no unique inverse in the ordinary sense, for, to a prescribed value of y, there may correspond a
whole interval, $\alpha \leq x \leq \beta$, of points at which $w(x) = y$ as in Fig. 3 (a). However, since $w(x)$ is non-decreasing in any bounded interval $[a, b]$, it is measurable in the sense of Lebesgue, and its measure function $g(y)$, the Lebesgue measure of the set of points $E\{x;\ w(x) > y\}$, does provide a species of inverse function for $w(x)$ (see Fig. 3 (b)).
[FIG. 3. Panel (a): the weight function $w(x)$, with the labels $a$, $\alpha$, $\beta$, $b$ and $w(a)$, $y$, $w(b)$. Panel (b): the Lebesgue inverse function $g(y)$, with the labels $w(a)$, $y$, $w(b)$ and $b-\alpha$, $b-\beta$.]
The Lebesgue inverse function $g(y)$ is a non-increasing function which is discontinuous at each value $\eta$ of $y$ which corresponds to an interval $[\alpha, \beta]$ of $x$ in which the weight function $w(x)$ remains constant, for $g(\eta+0) = b-\beta$ and $g(\eta-0) = b-\alpha$.
In the Lebesgue transformation we introduce the function $\phi(y)$ defined by the relation
$$f[g(y)] = \phi(y).$$
The indicator function of $\phi(y)$, say $\beta(y, t)$, is defined by the relations
$$\beta(y, t) = \begin{cases} 1 & (\phi(y) > t), \\ 0 & (\phi(y) \leq t). \end{cases}$$
If $\alpha(x, t)$ is the indicator of $f(x)$, then these relations imply that
$$\alpha(g(y), t) = \beta(y, t).$$
Hence the lower Lebesgue bracketing function for $f(x)$ is
$$\lambda[g(y)] = \sum_{p=0}^{n-1} t_p\{\beta(y, t_p) - \beta(y, t_{p+1})\}$$
and its Stieltjes integral is
$$\int_a^b \lambda(x)\,dw(x) = \sum_{p=0}^{n-1} t_p\{m(t_p) - m(t_{p+1})\},$$
where
$$m(t_p) = \int \beta(y, t_p)\,dy,$$
which is the Lebesgue measure of the set $E\{y;\ \phi(y) > t_p\}$.
Therefore, the supremum of $\int_a^b \lambda(x)\,dw(x)$ is the Lebesgue integral of $\phi(y)$ over the range $y = w(a)$ to $y = w(b)$. We could similarly identify the infimum of $\int_a^b \mu(x)\,dw(x)$. But the common value of these bounds is the Lebesgue-Stieltjes integral of $f(x)$ with respect to $w(x)$. Therefore
$$\int_a^b f(x)\,dw(x) = \int_{w(a)}^{w(b)} f[g(y)]\,dy.$$
In particular, if $f(x) = 1$ in the interval $(a, s)$ and is zero elsewhere, then
$$\int_a^s f(x)\,dw(x) = \int_{w(a)}^{w(s)} 1\,dy = w(s) - w(a).$$
11.4. The Lebesgue-Stieltjes integral in one dimension
The preceding section has been devoted to the Lebesgue-Stieltjes integral of a bounded function $f(x)$ with respect to a non-negative, non-decreasing weight function $w(x)$. The theory is easily extended to weight functions that are of bounded variation, and we shall merely state the relevant definitions and theorems, leaving the proofs to the reader.
THEOREM 11.4.1. If $w_1(x)$ and $w_2(x)$ are two non-negative, non-decreasing weight functions, so also is their sum
$$w(x) = w_1(x) + w_2(x),$$
and if a bounded function $f(x)$ has Stieltjes measure for an interval $I$ with respect to each of the weight functions $w_1(x)$ and $w_2(x)$, then it has Stieltjes measure with respect to $w(x)$ and
$$\int_I f(x)\,dw(x) = \int_I f(x)\,dw_1(x) + \int_I f(x)\,dw_2(x).$$
DEFINITION 11.4.1. A function $w(x)$ of the real variable $x$ is said to be 'of bounded variation' in an interval $I$ if there exists a pair of non-negative, non-decreasing functions, $p(x)$ and $n(x)$, such that $w(x) = p(x) - n(x)$ for $x \in I$.
THEOREM 11.4.2. If $w(x)$ has bounded variation in $I$ and $[p_1(x), n_1(x)]$, $[p_2(x), n_2(x)]$ are two pairs of non-negative, non-decreasing functions such that
$$w(x) = p_1(x) - n_1(x) = p_2(x) - n_2(x),$$
and if the bounded function $f(x)$ has Stieltjes measure for an interval $I$ with respect to each of the weight functions $p_1(x)$, $p_2(x)$, $n_1(x)$, $n_2(x)$, then
$$\int_I f(x)\,dp_1(x) - \int_I f(x)\,dn_1(x) = \int_I f(x)\,dp_2(x) - \int_I f(x)\,dn_2(x).$$
DEFINITION 11.4.2. The Lebesgue-Stieltjes integral of the bounded function $f(x)$ with respect to the weight function $w(x)$ of Definition 11.4.1 for an interval $I$ is
$$\int_I f(x)\,dw(x) = \int_I f(x)\,dp_1(x) - \int_I f(x)\,dn_1(x)$$
in the notation of Theorem 11.4.2.
11.5. The Lebesgue-Stieltjes integral in two dimensions
The theory of the Lebesgue-Stieltjes integral in two or more dimensions will be found, in summary form, in modern books on statistics and the theory of probability (e.g. Moran 1968, p. 203 and Kingman and Taylor 1966, p. 95) and, in more detail, in serious works on integration (e.g. McShane 1947, chap. vii). The crux of the theory is the introduction of a weight function $w(I)$ which is a mapping of the intervals in Euclidean space of $d$ dimensions into the real axis, and which is (i) non-negative; (ii) additive, in the sense that if the interval $I$ is the union of two disjoint intervals $I_1$, $I_2$ then
$$w(I) = w(I_1) + w(I_2).$$
The main difficulty arises from the possible discontinuities in the weight function. From the point of view of the physicist the weight function may represent a discrete distribution of mass at a number of isolated points. The question then arises, does the weight function $w(I)$ include any point masses on the boundaries of $I$? And, if so, how is the additive character of $w(I)$ preserved? From the point of view of the mathematician the possibility of defining the weight function for any bounded open set $\Omega$ depends upon the representation of $\Omega$ as the union of an enumerable collection of disjoint intervals $I_n$, and hence the intervals $I_n$ must not all be open intervals, or all closed intervals. After meditating on these difficulties the reader may be prepared to settle for the following definitions, which allow a
fairly concise account of the theory. In order to exhibit the essence of the theory we shall restrict ourselves to two dimensions. The extension to three or more dimensions is an obvious generalization, which is fully discussed in the references given above.
DEFINITION 11.5.1. In the convenient terminology of Ingleton (1965) a 'standard' interval is the set of points $(x, y)$ in the half-open rectangle $a \leq x < b$, $c \leq y$ ... 0 it follows that
$$\int_a^b f(x)\,dx = \int_0^M m(t, f)\,dt.$$
This identifies the Lebesgue integral of $f(x)$ over $(a, b)$ with the W. H. Young integral of its measure function $m(t, f)$ over the range $(0, M)$ of $f(x)$. The significance of this investigation is that it proves that the domain of the Lebesgue functional, as defined in 12.2.1, is included in the space of bounded measurable functions, i.e. the functions $f(x)$ whose indicator $\alpha(x, t, f)$ is a Lebesgue function. Of course it remains to be proved that all such bounded measurable functions are in fact Lebesgue functions and, to do this, we need the constructive definitions of the preceding chapters. The use of monotone sequences then allows us to extend the
Epilogue
definition of the Lebesgue integral to unbounded functions and to unbounded intervals as in Chapter 9, but such integrals are necessarily absolutely convergent. If, however, we are prepared to waive condition (A) and admit non-absolutely convergent integrals, then the space of integrable functions can be considerably extended. In this work of analytic exploration the initial advances were made by A. Denjoy (1941-9) by means of constructive definitions involving transfinite induction. Later, O. Perron gave a descriptive definition by adapting the method of bracketing to give directly upper and lower bounds to the integral (rather than the integrand, as in our exposition). Outstanding advances in this field have been made by A. J. Ward and R. Henstock and are described in the latter author's book Theory of integration (1963). The intentions of the author of the present work will be amply fulfilled if this introductory account of the Lebesgue integral has stimulated the reader to study the more formal and profound account of the subject. The following works are especially recommended for the undergraduate. The original paper and books by Lebesgue and the exposition by de la Vallee-Poussin are somewhat terse, and chapters x, xi, and xii of The theory of functions by E. C. Titchmarsh (Clarendon Press, Oxford) will be found to provide a most illuminating commentary. The geometric theory of the Lebesgue integral is expounded with great clarity and conciseness in J. C. Burkill's Cambridge Tract. A rather more advanced treatment is given with greater emphasis on topological and set-theoretic concepts in Lebesgue integration by J. H. Williamson. By contrast, the treatment by A. N. Kolmogorov and S. V. Fomin in the translation published by Academic Press (New York and London, 1961) with the title Measure, Lebesgue integrals and Hilbert space, may perhaps be described as more algebraical in presentation. A forthcoming book by A. W. 
Ingleton will provide yet another attractive line of approach to integration theory which may be roughly characterized as the method of functional analysis.
12.5. References
BOAS, R. P. (1960). A primer of real functions. Wiley.
BURKILL, J. C. (1953). The Lebesgue integral. Cambridge University Press.
DENJOY, A. (1941-9). Leçons sur le calcul des coefficients. Gauthier-Villars, Paris.
HARTMAN, S., and MIKUSINSKI, J. (1961). The theory of Lebesgue measure and integration. Pergamon Press, Oxford.
HENSTOCK, R. (1963). Theory of integration. Butterworths, London.
HOBSON, E. W. (1927). The theory of functions of a real variable and the theory of Fourier's series, vols. i and ii. Cambridge University Press.
INGLETON, A. W. (1965). Notes on integration. The Mathematical Institute, Oxford.
KINGMAN, J. F. C., and TAYLOR, S. J. (1966). Introduction to measure and probability. Cambridge University Press.
KOLMOGOROV, A. N., and FOMIN, S. V. (1961). Measure, Lebesgue integrals and Hilbert space. Academic Press, New York.
LEBESGUE, H. (1902). Intégrale, longueur, aire. Annali Mat. pur. appl. 3, 231-359.
--- (1904, 1928). Leçons sur l'intégration. Gauthier-Villars, Paris.
McSHANE, E. J. (1947). Integration. Princeton University Press.
MORAN, P. A. P. (1968). An introduction to probability theory. Clarendon Press, Oxford.
RIESZ, F., and SZ-NAGY, B. (1953). lA'