Mathematical Surveys and Monographs
Volume 62
Gaussian Measures
Vladimir I. Bogachev
American Mathematical Society
Editorial Board
Georgia M. Benkart
Peter Landweber
Tudor Stefan Ratiu, Chair
Michael Renardy
Translated from the original Russian manuscript by Vladimir I. Bogachev.
1991 Mathematics Subject Classification. Primary 28C20, 60B11; Secondary 60G15, 60H07.
ABSTRACT. This book presents a systematic exposition of the modern theory of Gaussian measures. The basic properties of finite and infinite dimensional Gaussian distributions, including their linear and nonlinear transformations, are discussed. The book is intended for graduate students and researchers in probability theory, mathematical statistics, functional analysis, and mathematical physics. It contains many examples and exercises. The bibliography contains 844 items; detailed bibliographical comments and a subject index are included.
Library of Congress Cataloging-in-Publication Data
Bogachev, V. I. (Vladimir Igorevich), 1961-
[Gaussovskie mery. English]
Gaussian measures / Vladimir I. Bogachev.
p. cm. (Mathematical surveys and monographs, ISSN 0076-5376 ; v. 62)
Includes bibliographical references and index.
ISBN 0-8218-1054-5 (hc : alk. paper)
1. Gaussian measures. I. Title. II. Series: Mathematical surveys and monographs ; no. 62.
QA312.B64 1998
515'.42-dc21
98-27239
CIP
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication (including abstracts) is permitted only under license from the American Mathematical Society.
Requests for such permission should be addressed to the Assistant to the Publisher, American Mathematical Society, P.O. Box 6248, Providence, Rhode Island 02940-6248. Requests can also be made by e-mail to reprint-permission@ams.org.
© 1998 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at URL: http://www.ams.org/
Contents

Preface

Chapter 1. Finite Dimensional Gaussian Distributions
1.1. Gaussian measures on the real line
1.2. Multivariate Gaussian distributions
1.3. Hermite polynomials
1.4. The Ornstein-Uhlenbeck semigroup
1.5. Sobolev classes
1.6. Hypercontractivity
1.7. Several useful estimates
1.8. Convexity inequalities
1.9. Characterizations of Gaussian measures
1.10. Complements and problems

Chapter 2. Infinite Dimensional Gaussian Distributions
2.1. Cylindrical sets
2.2. Basic definitions
2.3. Examples
2.4. The Cameron-Martin space
2.5. Zero-one laws
2.6. Separability and oscillations
2.7. Equivalence and singularity
2.8. Measurable seminorms
2.9. The Ornstein-Uhlenbeck semigroup
2.10. Measurable linear functionals
2.11. Stochastic integrals
2.12. Complements and problems

Chapter 3. Radon Gaussian Measures
3.1. Radon measures
3.2. Basic properties of Radon Gaussian measures
3.3. Gaussian covariances
3.4. The structure of Radon Gaussian measures
3.5. Gaussian series
3.6. Supports of Gaussian measures
3.7. Measurable linear operators
3.8. Weak convergence of Gaussian measures
3.9. Abstract Wiener spaces
3.10. Conditional measures and conditional expectations
3.11. Complements and problems

Chapter 4. Convexity of Gaussian Measures
4.1. Gaussian symmetrization
4.2. Ehrhard's inequality
4.3. Isoperimetric inequalities
4.4. Convex functions
4.5. H-Lipschitzian functions
4.6. Correlation inequalities
4.7. The Onsager-Machlup functions
4.8. Small ball probabilities
4.9. Large deviations
4.10. Complements and problems

Chapter 5. Sobolev Classes over Gaussian Measures
5.1. Integration by parts
5.2. The Sobolev classes $W^{p,r}$ and $D^{p,r}$
5.3. The Sobolev classes $H^{p,r}$
5.4. Properties of Sobolev classes and examples
5.5. The logarithmic Sobolev inequality
5.6. Multipliers and Meyer's inequalities
5.7. Equivalence of different definitions
5.8. Divergence of vector fields
5.9. Gaussian capacities
5.10. Measurable polynomials
5.11. Differentiability of H-Lipschitzian functions
5.12. Complements and problems

Chapter 6. Nonlinear Transformations of Gaussian Measures
6.1. Auxiliary results
6.2. Measurable linear automorphisms
6.3. Linear transformations
6.4. Radon-Nikodym densities
6.5. Examples of equivalent measures and linear transformations
6.6. Nonlinear transformations
6.7. Examples of nonlinear transformations
6.8. Finite dimensional mappings
6.9. Malliavin's method
6.10. Surface measures
6.11. Complements and problems

Chapter 7. Applications
7.1. Trajectories of Gaussian processes
7.2. Infinite dimensional Wiener processes
7.3. Logarithmic gradients
7.4. Spherically symmetric measures
7.5. Infinite dimensional diffusions
7.6. Complements and problems

Appendix A. Locally Convex Spaces, Operators, and Measures
A.1. Locally convex spaces
A.2. Linear operators
A.3. Measures and measurability

Bibliographical Comments
References
Index
Preface

The modern theory of Gaussian measures lies at the intersection of the theory of random processes, functional analysis, and mathematical physics and is closely connected with diverse applications in quantum field theory, statistical physics, financial mathematics, and other areas of science. The study of Gaussian measures combines ideas and methods from probability theory, nonlinear analysis, geometry, linear operators, and topological vector spaces in a beautiful and nontrivial way. The goal of this book is to present the modern theory of Gaussian measures.

Chapter 1 contains basic facts about Gaussian measures on $\mathbb{R}^n$. In addition to the standard probabilistic facts the reader will find here a discussion of the Hermite polynomials, the Ornstein-Uhlenbeck semigroup, the Sobolev classes with Gaussian weights, the logarithmic Sobolev and Poincaré inequalities, and convexity of Gaussian measures. These analytic tools play a fundamental role in the theory. Principal results belonging to linear topological theory are discussed in Chapters 2 and 3. These include classical theorems about equivalence and singularity, zero-one laws, Cameron-Martin spaces, measurable linear functionals, and the topological properties of supports. Chapter 4 contains some inequalities and estimates related to the convexity of Gaussian measures, such as Gaussian isoperimetric inequalities, and Ehrhard's and Anderson's inequalities. These inequalities are applied to the study of exponential integrability, probabilities of small balls, and large deviations. Nonlinear problems are discussed in Chapters 5 and 6, where, in particular, the Sobolev classes over Gaussian measures and Gaussian capacities are investigated. Chapter 6 deals with transformations of Gaussian measures. In addition, we give an introduction to the Malliavin calculus. In Chapter 7 we discuss some properties of finite and infinite dimensional Gaussian processes and certain related diffusion processes.
These results, apart from being of interest in their own right, provide a good illustration of the ideas and methods of the previous chapters.

It is worth noting here that one of the fundamental ideas in the theory of Gaussian measures is that the various centered Radon Gaussian measures are realizations of one and the same "canonical" Gaussian measure: the countable product of the standard normal Gaussian distributions on the line. This canonical measure $\gamma$ is defined on the space $\mathbb{R}^\infty$ of all real sequences. The Cameron-Martin space of $\gamma$ is the classical Hilbert space $l^2$. The space $\mathbb{R}^\infty$ has a relatively poor collection of continuous linear functionals (consisting only of the functionals that depend on finitely many variables). However, the set of measurable linear functionals is much broader: it can be identified with $l^2$: for any $(c_n) \in l^2$ the series $\sum_{n=1}^{\infty} c_n x_n$ converges $\gamma$-almost everywhere, and every measurable linear functional admits such a representation. Although the Cameron-Martin space $l^2$ has measure zero, every continuous linear functional on it (even every continuous linear operator) admits a unique measurable linear extension to all of $\mathbb{R}^\infty$. Having in mind this basic example, one can understand better the "coordinate" interpretation of the results presented in this book. Certainly, there are problems in which the reduction to $\mathbb{R}^\infty$ is useless. For instance, this is the case in most of the problems concerning sample path properties of Gaussian processes. Nevertheless, readers who prefer not to dwell on topological subtleties connected with infinite dimensional spaces may assume (and the isomorphism theorem from Chapter 3 gives a full justification for this) that our discussion concerns Hilbert spaces or the space $\mathbb{R}^\infty$. The "unique" Gaussian measure mentioned above is often encountered in the guise of the Wiener measure on the space of continuous trajectories; in this case very interesting objects arise that have no natural analogues in the other isomorphic representations.

All auxiliary results from functional analysis and general topology used in the text are collected in the Appendix. Prerequisites include basics of Lebesgue integration, multivariate calculus, and probability theory.
Formulas and assertions (theorems, definitions, remarks, etc.) are numbered independently of their type within each section so that the number of an assertion or a formula is preceded by the chapter number and section number. The book contains a large number of problems (exercises). The role of the
problems (in addition to the usual one) is to present a disordered collection of interesting facts and to unburden the main text from some technical details of proofs. Many problems are provided with hints; this, however, has no relation to the level of difficulty (some problems are simple exercises, but the others are really hard results borrowed from the current research literature). The bibliography does not exhaust all the publications relating to the theory of Gaussian measures. However, together with the bibliographical comments, it enables the reader to get a sufficiently complete vision of the history of the subject as well as entertain a more thorough bibliographical research. One can use this book as a source for several one- or two-semester courses (of
different levels) for graduate students. For example, Chapter 1 can form a core for an introductory course on finite dimensional Gaussian distributions. A course on infinite dimensional Gaussian distributions can be based on Chapters 2 and 3. Chapters 5 and 6 may be helpful for lecturing on nonlinear stochastic analysis.

This book is an expanded and improved version of the Russian original based on the author's lectures at the Department of Mechanics and Mathematics of Moscow State University. Parts of the book were written during my visits to other universities and mathematical institutes, in particular, in Rome, Paris, Bonn, Bielefeld, Pisa, Warwick, Stockholm, Edmonton, Minneapolis, and Haifa. I am very grateful to L. Accardi, S. Albeverio, J. Baxter, G. Da Prato, G. Dell'Antonio, D. Elworthy, F. Götze, N. Jain, N. Krylov, P. Malliavin, E. Mayer-Wolf, B. Øksendal, M. Röckner, B. Schmuland, M. Zakai, and other colleagues from these institutions for the excellent working conditions. I have had many profitable discussions regarding various subjects treated here with V. Bentkus, S.G. Bobkov, N.V. Krylov, M. Ledoux, M.A. Lifshits, Yu.V. Prohorov, A.N. Shiryaev, A.V. Skorohod, O.G. Smolyanov, A.M. Stepin, V.N. Sudakov, V.V. Ulyanov, H. von Weizsäcker, A.Yu. Zaitsev, and O. Zeitouni. My special thanks are due to D.E. Aleksandrova, V. Bentkus, A.V. Kolesnikov, E.P. Krugova, N.N. Nedikov, O.V. Pugachev, T.S. Rybnikova, B. Schmuland, and N.A. Tolmachev, who read the drafts of the book and made many critical remarks.
CHAPTER 1
Finite Dimensional Gaussian Distributions

The connection of literature purposes with the purely scientific ones, the desire to occupy imagination and at the same time to enrich life with ideas and knowledge create considerable difficulties in composing the different parts of the book and hamper the unity of exposition.
Alexander von Humboldt, Views of Nature

1.1. Gaussian measures on the real line
We start our discussion of Gaussian measures by recalling the identity
\[
\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{+\infty} \exp\Bigl(-\frac{(t-a)^2}{2\sigma^2}\Bigr)\,dt = 1, \tag{1.1.1}
\]
known for all real $a$ and $\sigma > 0$. A standard way of verifying this equality is to evaluate the double integral $\iint \exp(-x^2 - y^2)\,dx\,dy$ in polar coordinates and apply Fubini's theorem.
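Identity (1.1.1) is easy to check numerically. The following is a minimal sketch, not part of the original text: the function names `gaussian_density`, `trapezoid`, and `total_mass` are illustrative, and the integral over the whole line is approximated by a trapezoidal rule on a window of twelve standard deviations around the mean (an assumption that makes the neglected tails far smaller than rounding error).

```python
import math


def gaussian_density(t, a, sigma):
    """Density of the Gaussian measure with mean a and variance sigma**2."""
    return math.exp(-(t - a) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def trapezoid(func, lo, hi, steps=4000):
    """Composite trapezoidal rule for func on the interval [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h


def total_mass(a, sigma, width=12.0):
    """Approximate the left-hand side of (1.1.1) on [a - width*sigma, a + width*sigma]."""
    return trapezoid(lambda t: gaussian_density(t, a, sigma),
                     a - width * sigma, a + width * sigma)


if __name__ == "__main__":
    for a, sigma in [(0.0, 1.0), (3.0, 0.5), (-2.0, 4.0)]:
        print(a, sigma, total_mass(a, sigma))
```

The same helper can be reused to approximate the mean and the variance by integrating $t\,p(t,a,\sigma^2)$ and $(t-a)^2 p(t,a,\sigma^2)$.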
1.1.1. Definition. A Borel probability measure $\gamma$ on $\mathbb{R}^1$ is called Gaussian if it is either the Dirac measure $\delta_a$ at a point $a$ or has density
\[
p(\,\cdot\,, a, \sigma^2)\colon\ t \mapsto \frac{1}{\sigma\sqrt{2\pi}} \exp\Bigl(-\frac{(t-a)^2}{2\sigma^2}\Bigr)
\]
with respect to Lebesgue measure. In the latter case the measure $\gamma$ is called nondegenerate. The parameters $a$ and $\sigma^2$ are called the mean and the variance of $\gamma$, respectively.
The quantity $\sigma$ is called the mean-square deviation. For any Dirac measure (i.e., a probability measure concentrated at a point) we put $\sigma = 0$. The mean and the variance of a Gaussian measure $\gamma$ admit the following representation:
\[
a = \int_{-\infty}^{+\infty} t\,\gamma(dt), \qquad \sigma^2 = \int_{-\infty}^{+\infty} (t-a)^2\,\gamma(dt).
\]
The measure with density $p(\,\cdot\,, 0, 1)$ is called standard. A mean zero Gaussian measure is called centered or symmetric.
1.1.2. Definition. A Gaussian random variable is a random variable with Gaussian distribution.
A Gaussian random variable with a centered distribution is called centered or symmetric. Clearly, an arbitrary Gaussian random variable with mean $a$ and variance $\sigma^2$ can be represented as $\sigma\xi + a$, where $\xi$ is a random variable with the standard Gaussian distribution. Gaussian distributions are often called normal. Using equality (1.1.1) it is easy to find the Fourier transform (the characteristic functional) of the Gaussian measure $\gamma$ with parameters $(a, \sigma^2)$. We have
\[
\widetilde{\gamma}(y) := \int_{\mathbb{R}^1} \exp(iyx)\,\gamma(dx) = \exp\Bigl(iay - \frac{1}{2}\sigma^2 y^2\Bigr).
\]
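The closed form of the Fourier transform, $\exp(iay - \frac{1}{2}\sigma^2 y^2)$, can be compared with a direct numerical integration of $\exp(iyx)$ against the Gaussian density. The sketch below is only an illustration under stated assumptions: the names `char_function_numeric` and `char_function_closed` are hypothetical, and the integral is truncated to twelve standard deviations around the mean.

```python
import cmath
import math


def char_function_numeric(y, a, sigma, width=12.0, steps=4000):
    """Trapezoidal approximation of the integral of exp(i*y*x) against N(a, sigma^2)."""
    lo, hi = a - width * sigma, a + width * sigma
    h = (hi - lo) / steps
    total = 0j
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = math.exp(-(x - a) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * cmath.exp(1j * y * x) * density
    return total * h


def char_function_closed(y, a, sigma):
    """Closed form exp(i*a*y - sigma^2 * y^2 / 2)."""
    return cmath.exp(1j * a * y - 0.5 * sigma ** 2 * y ** 2)


if __name__ == "__main__":
    for y, a, sigma in [(0.7, 1.5, 2.0), (-1.2, 0.0, 1.0)]:
        print(char_function_numeric(y, a, sigma), char_function_closed(y, a, sigma))
```

Note that the factor $\frac{1}{2}$ in the exponent is essential: without it the two values disagree already for $y = 1$, $a = 0$, $\sigma = 1$.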
The normal (standard Gaussian) distribution function $\Phi$ is defined by the relation
\[
\Phi(t) = \int_{-\infty}^{t} p(s, 0, 1)\,ds.
\]
The inverse function $\Phi^{-1}$ is defined on $(0, 1)$. It is convenient to employ the following convention: $\Phi^{-1}(0) = -\infty$ and $\Phi^{-1}(1) = +\infty$. The rate of decreasing of $1 - \Phi$ at infinity is estimated as follows.
1.1.3. Lemma. For any $t > 0$, one has
\[
\frac{1}{\sqrt{2\pi}} \Bigl(\frac{1}{t} - \frac{1}{t^3}\Bigr) e^{-t^2/2} \le 1 - \Phi(t) \ \dots
\]
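The distribution function $\Phi$, its inverse with the convention $\Phi^{-1}(0) = -\infty$, $\Phi^{-1}(1) = +\infty$, and the lower tail estimate can be explored with the Python standard library. This is a hedged sketch: the helper names `Phi`, `Phi_inv`, `tail`, and `lower_bound` are illustrative, and the lower bound is taken in the standard form $\frac{1}{\sqrt{2\pi}}\bigl(\frac{1}{t} - \frac{1}{t^3}\bigr)e^{-t^2/2} \le 1 - \Phi(t)$, which is an assumption about how the lemma reads.

```python
import math
from statistics import NormalDist

_STANDARD = NormalDist(0.0, 1.0)


def Phi(t):
    """Standard Gaussian distribution function."""
    return _STANDARD.cdf(t)


def Phi_inv(u):
    """Inverse of Phi on [0, 1] with the convention Phi_inv(0) = -inf, Phi_inv(1) = +inf."""
    if u == 0.0:
        return -math.inf
    if u == 1.0:
        return math.inf
    return _STANDARD.inv_cdf(u)


def tail(t):
    """1 - Phi(t), computed through erfc to avoid cancellation for large t."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def lower_bound(t):
    """Assumed lower estimate (1/t - 1/t^3) * exp(-t^2/2) / sqrt(2*pi) for t > 0."""
    return (1.0 / t - 1.0 / t ** 3) * math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(t, lower_bound(t), tail(t))
```

For large $t$ the two printed columns agree to several digits, which reflects how sharp the estimate is at infinity.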
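1: (iii) for any numbers $\lambda_1, \dots, \lambda_n$, one has
\[
\prod_{i=1}^{n} \sqrt{k_i!}\, H_{k_i}(\lambda_i) = \frac{\partial^{k_1 + \dots + k_n}}{\partial t_1^{k_1} \cdots \partial t_n^{k_n}} \exp\Bigl(\sum_{i=1}^{n} t_i \lambda_i - \frac{1}{2} \sum_{i=1}^{n} t_i^2\Bigr)\Big|_{t_1 = \dots = t_n = 0}.
\]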
PROOF. Claims (ii) and (iii) are easily deduced from the definition and (1.3.1). Integrating the function $\exp(tx - \frac{1}{2}t^2)\exp(sx - \frac{1}{2}s^2)$ in $x$ with respect to the standard Gaussian density we get $\exp(ts)$. On the other hand, using (1.3.1), we get
\[
\sum_{k, n \ge 0} (n!\,k!)^{-1/2}\, t^n s^k\, (H_n, H_k)_{L^2(\gamma_1)}.
\]
Comparing the coefficients at $t^n s^k$, we see that $H_k$ are mutually orthogonal and have unit norms. Note that $H_k$ is a polynomial of degree $k$. Hence the linear span of $H_0, \dots, H_k$ coincides with the space of polynomials of degree at most $k$. Suppose that $f \in L^2(\gamma_1)$ is orthogonal to all $H_k$. Then $f$ is orthogonal to all polynomials. Hence all derivatives at zero of the function
\[
F(z) = \int_{\mathbb{R}^1} \exp(izx) f(x)\,\gamma_1(dx)
\]
vanish together with $F(0)$, which implies that $F$ is identically zero, since $F$ is holomorphic. Therefore, $f = 0$ a.e. □
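The orthonormality established in this proof can be checked numerically. The sketch below assumes that $H_k$ are the normalized Hermite polynomials $H_k = He_k/\sqrt{k!}$, where $He_k$ are the probabilists' Hermite polynomials generated by the recurrence $He_{j+1}(x) = x\,He_j(x) - j\,He_{j-1}(x)$; this normalization is consistent with the generating function $\exp(tx - \frac{1}{2}t^2) = \sum_k (k!)^{-1/2} t^k H_k(x)$ used above. The function names are illustrative.

```python
import math


def hermite(k, x):
    """Normalized Hermite polynomial H_k(x) = He_k(x) / sqrt(k!) (assumed normalization)."""
    prev, cur = 0.0, 1.0  # He_{-1} (unused placeholder) and He_0
    for j in range(k):
        prev, cur = cur, x * cur - j * prev  # He_{j+1} = x*He_j - j*He_{j-1}
    return cur / math.sqrt(math.factorial(k))


def gauss_integral(func, width=12.0, steps=2400):
    """Trapezoidal approximation of the integral of func against the standard Gaussian measure."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(x) * math.exp(-x * x / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


def inner(m, n):
    """Inner product (H_m, H_n) in L^2 of the standard Gaussian measure."""
    return gauss_integral(lambda x: hermite(m, x) * hermite(n, x))


def generating_series(t, x, terms=40):
    """Partial sum of sum_k t^k H_k(x) / sqrt(k!)."""
    return sum(t ** k * hermite(k, x) / math.sqrt(math.factorial(k)) for k in range(terms))


if __name__ == "__main__":
    for m in range(4):
        print([round(inner(m, n), 10) for n in range(4)])
    print(generating_series(0.5, 1.3), math.exp(0.5 * 1.3 - 0.125))
```

The printed matrix is, up to rounding, the identity matrix, and the partial sum of the series matches $\exp(tx - \frac{1}{2}t^2)$.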
1.3.3. Corollary. The system of the Hermite polynomials
\[
H_{k_1, \dots, k_n}, \qquad k_i = 0, 1, \dots,
\]
on $\mathbb{R}^n$ is an orthonormal basis in $L^2(\gamma_n)$, where $\gamma_n$ is the standard Gaussian measure on $\mathbb{R}^n$.

PROOF. This follows from the fact that, given measures $\mu_i$, $i = 1, \dots, n$, on spaces $X_i$ and orthonormal bases $\{\varphi^i_j\}$ in $L^2(\mu_i)$, the system of functions $\varphi^1_{j_1}(x_1) \cdots \varphi^n_{j_n}(x_n)$ forms an orthonormal basis in $L^2\bigl(\bigotimes_{i=1}^{n} \mu_i\bigr)$. □
Using Hermite polynomials, one constructs a decomposition of $L^2(\gamma_n)$ into the direct sum of mutually orthogonal subspaces $X_k$ of polynomials. For any $k = 0, 1, \dots$, denote by $X_k$ the closed linear subspace in $L^2(\gamma_n)$ generated by the Hermite polynomials $H_{k_1, \dots, k_n}$ with $k_1 + \dots + k_n = k$ (note that these polynomials form an orthonormal basis in $X_k$). Denote by $I_k$ the orthogonal projection in $L^2(\gamma_n)$ onto $X_k$.

1.3.4. Proposition. The subspaces $X_k$ are mutually orthogonal and $L^2(\gamma_n)$ is their direct sum.

PROOF. According to Corollary 1.3.3, the linear span of the subspaces $X_k$ is dense in $L^2(\gamma_n)$. Since the polynomials of the form $H_{k_1}(x_1) \cdots H_{k_n}(x_n)$ with $\sum_{i=1}^{n} k_i = k$ form a basis in $X_k$, it suffices to note that any two such polynomials of different degrees are orthogonal. □
Let $E$ be a finite dimensional Hilbert space. Then the space $X_k(E)$ of $E$-valued Hermite polynomials is defined as the linear span of the mappings $f \cdot v$, where $f \in X_k$, $v \in E$, in the space $L^2(\gamma_n, E)$ of $E$-valued mappings. It is straightforward to see that $X_k(E)$ is a closed subspace in the Hilbert space $L^2(\gamma_n, E)$. The orthogonal projection of $L^2(\gamma_n, E)$ onto $X_k(E)$ is denoted by $I_k$ as in the scalar case.
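1.3.5. Remark. Suppose that $f \in C^\infty(\mathbb{R}^1)$ is such that $f^{(k)} \in L^2(\gamma_1)$ for every $k \ge 0$. Then
\[
I_k(f) = \Bigl(\int f H_k\,d\gamma_1\Bigr) H_k = \frac{1}{\sqrt{k!}} \Bigl(\int f^{(k)}\,d\gamma_1\Bigr) H_k.
\]
Indeed, this follows directly from the definition of $H_k$ and the $k$-fold integration by parts. Thus, we have
\[
f = \sum_{k=0}^{\infty} \frac{1}{\sqrt{k!}}\, (f^{(k)}, 1)_{L^2(\gamma_1)}\, H_k.
\]

1.4. The Ornstein-Uhlenbeck semigroup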
Let $\gamma$ be a centered Gaussian measure on $\mathbb{R}^n$. The Ornstein-Uhlenbeck semigroup $(T_t)_{t \ge 0}$ is defined on $L^p(\gamma)$ by the following formula (known as the Mehler formula):
\[
T_t f(x) = \int_{\mathbb{R}^n} f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\bigr)\,\gamma(dy).
\]
Certainly, it has to be verified that these operators are well-defined. This is done as follows. We know that $\gamma$ is the image of the measure $\gamma \otimes \gamma$ on $\mathbb{R}^n \times \mathbb{R}^n$ under the mapping $(x, y) \mapsto e^{-t}x + \sqrt{1 - e^{-2t}}\,y$. Therefore, for every $f \in L^p(\gamma)$ we have
\[
\int |f(x)|^p\,\gamma(dx) = \int\!\!\int \bigl|f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\bigr)\bigr|^p\,\gamma(dx)\,\gamma(dy). \tag{1.4.1}
\]
By Fubini's theorem, it follows from (1.4.1) that
\[
x \mapsto \int_{\mathbb{R}^n} \bigl|f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\bigr)\bigr|^p\,\gamma(dy)
\]
is $\gamma$-integrable. By Hölder's inequality, we obtain that $T_t f \in L^p(\gamma)$. Moreover,
\[
\|T_t f\|_{L^p(\gamma)} \le \|f\|_{L^p(\gamma)}.
\]
Note that for every $f \in L^1(\gamma)$ we have by the change of variables formula
\[
\int_{\mathbb{R}^n} T_t f(x)\,\gamma(dx) = \int_{\mathbb{R}^n} f(x)\,\gamma(dx), \tag{1.4.2}
\]
which means that $\gamma$ is an invariant measure for the semigroup $(T_t)_{t \ge 0}$. Note that since $T_t f(x)$ converges to $\int f\,d\gamma$ as $t \to \infty$ for every bounded continuous function $f$, it follows that $\gamma$ is the unique invariant probability measure for $(T_t)_{t \ge 0}$. By the same expression we define the action of $T_t$ on the vector-valued functions $f \in L^p(\gamma, E)$, where $E$ is a separable Hilbert space.
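The Mehler formula and the invariance relation (1.4.2) can be illustrated on the real line with the standard Gaussian measure. The following is a minimal sketch under those assumptions; the names `gauss_integral` and `ou` are hypothetical, and integrals against $\gamma$ are approximated by a trapezoidal rule on $[-10, 10]$. For $f(x) = x^2$ and $f(x) = \cos x$ the values of $T_t f$ are known in closed form, namely $e^{-2t}x^2 + 1 - e^{-2t}$ and $\exp(-\frac{1}{2}(1 - e^{-2t}))\cos(e^{-t}x)$, which gives something to compare against.

```python
import math


def gauss_integral(func, width=10.0, steps=400):
    """Trapezoidal approximation of the integral of func against the standard Gaussian measure."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(y) * math.exp(-y * y / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


def ou(t, f):
    """Return the function T_t f given by the Mehler formula (one-dimensional case)."""
    c = math.exp(-t)
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    return lambda x: gauss_integral(lambda y: f(c * x + s * y))


if __name__ == "__main__":
    t, x = 0.7, 1.3
    square = lambda u: u * u
    print(ou(t, square)(x), math.exp(-2 * t) * x * x + 1 - math.exp(-2 * t))
    # Invariance (1.4.2): both integrals are equal to 1 for f(u) = u^2.
    print(gauss_integral(ou(t, square)), gauss_integral(square))
```

As $t$ grows, `ou(t, square)(x)` approaches $\int x^2\,d\gamma = 1$ for every fixed $x$, in line with the convergence $T_t f \to \int f\,d\gamma$.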
Simple additional considerations lead to the following result (the definitions of the notions connected with operator semigroups can be found in Appendix).
1.4.1. Theorem. For every $p \ge 1$, the family $(T_t)_{t \ge 0}$ is a strongly continuous semigroup of operators on $L^p(\gamma)$ with the operator norms $\|T_t\|_{\mathcal{L}(L^p(\gamma))} = 1$. The operators $T_t$ are symmetric nonnegative on $L^2(\gamma)$. In addition, if $E$ is a separable Hilbert space, then $(T_t)_{t \ge 0}$ is a strongly continuous semigroup of operators with unit norms on the space $L^p(\gamma, E)$ of $E$-valued mappings.
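PROOF. Since $T_t 1 = 1$ and $\|T_t\|_{\mathcal{L}(L^p(\gamma))} \le 1$ for all $t \ge 0$, the operators $T_t$ have norm 1 on $L^p(\gamma)$. In order to prove the equality $T_{t+s} f = T_t T_s f$ note that the measure $\gamma$ is the image of the measure $\gamma \otimes \gamma$ under the mapping
\[
(y, z) \mapsto \frac{e^{-s}\sqrt{1 - e^{-2t}}}{\sqrt{1 - e^{-2t-2s}}}\,y + \frac{\sqrt{1 - e^{-2s}}}{\sqrt{1 - e^{-2t-2s}}}\,z.
\]
Therefore,
\[
\begin{aligned}
T_t(T_s f)(x) &= \int T_s f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\bigr)\,\gamma(dy) \\
&= \int\!\!\int f\bigl(e^{-s}e^{-t}x + e^{-s}\sqrt{1 - e^{-2t}}\,y + \sqrt{1 - e^{-2s}}\,z\bigr)\,\gamma(dz)\,\gamma(dy) \\
&= \int f\bigl(e^{-t-s}x + \sqrt{1 - e^{-2t-2s}}\,w\bigr)\,\gamma(dw) \\
&= T_{t+s} f(x).
\end{aligned}
\]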
The symmetry of $T_t$ on $L^2(\gamma)$ is verified in a similar manner. Namely, the inner product
\[
(T_t f, g)_{L^2(\gamma)} = \int\!\!\int f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\bigr)\, g(x)\,\gamma(dy)\,\gamma(dx)
\]
can be written, by the change of variables
\[
u = e^{-t}x + \sqrt{1 - e^{-2t}}\,y, \qquad v = -\sqrt{1 - e^{-2t}}\,x + e^{-t}y
\]
and the Gaussian property, as
\[
\int\!\!\int f(u)\, g\bigl(e^{-t}u - \sqrt{1 - e^{-2t}}\,v\bigr)\,\gamma(du)\,\gamma(dv),
\]
which is $(f, T_t g)_{L^2(\gamma)}$ by the symmetry of $\gamma$. Now the equality $T_t = T_{t/2} T_{t/2}$ implies that $T_t$ is nonnegative. For every bounded continuous function $f$, the mapping $t \mapsto T_t f$ with values in $L^p(\gamma)$ is continuous, which is readily seen from the Lebesgue theorem. If $f \in L^p(\gamma)$, then there exists a sequence of bounded continuous functions $f_k$ convergent to $f$ in $L^p(\gamma)$. By virtue of what has already been proved,
\[
\sup_{t \ge 0} \|T_t f - T_t f_k\|_{L^p(\gamma)} \to 0, \qquad k \to \infty,
\]
whence the strong continuity of our semigroup. The same reasoning applies to vector-valued mappings. □
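The semigroup property, the symmetry, and the nonnegativity proved above can be tested numerically in dimension one for the standard Gaussian measure. This is a sketch under those assumptions, with illustrative helper names (`gauss_integral`, `ou`, `inner_product`) and arbitrarily chosen test functions; the nested integrals are approximated by a trapezoidal rule, so the comparison holds only up to a small numerical error.

```python
import math


def gauss_integral(func, width=10.0, steps=400):
    """Trapezoidal approximation of the integral of func against the standard Gaussian measure."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(y) * math.exp(-y * y / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


def ou(t, f):
    """Return T_t f defined by the Mehler formula (one-dimensional case)."""
    c = math.exp(-t)
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    return lambda x: gauss_integral(lambda y: f(c * x + s * y))


def inner_product(f, g):
    """Inner product of f and g in L^2 of the standard Gaussian measure."""
    return gauss_integral(lambda x: f(x) * g(x))


def f(x):
    return math.sin(x) + x * x


def g(x):
    return math.exp(0.3 * x)


if __name__ == "__main__":
    t, s, x = 0.4, 0.9, 0.7
    print(ou(t, ou(s, f))(x), ou(t + s, f)(x))                   # semigroup property
    print(inner_product(ou(t, f), g), inner_product(f, ou(t, g)))  # symmetry
    print(inner_product(ou(t, f), f))                            # nonnegativity
```

The last printed value equals $\|T_{t/2} f\|^2_{L^2(\gamma)}$, which is the reason it cannot be negative.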
1.4.2. Corollary. For all $f \in L^p(\gamma)$, one has
\[
\lim_{t \to \infty} \Bigl\| T_t f - \int f\,d\gamma \Bigr\|_{L^p(\gamma)} = 0.
\]
PROOF. For bounded continuous functions $f$ this claim follows from the Lebesgue theorem. In the general case, it suffices to find a sequence of bounded continuous functions $f_j$ convergent to $f$ in $L^p(\gamma)$ and note that
\[
\sup_{t \ge 0} \|T_t f - T_t f_j\|_{L^p(\gamma)} \ \dots
\]
... on the Hilbert space $L^2(\gamma_n, E)$ of $E$-valued mappings, where $E$ is any separable Hilbert space.
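1.4.5. Proposition. The domain of definition of $L$ is
\[
D(L) = \Bigl\{ f\colon\ \sum_{k=1}^{\infty} k^2 \|I_k(f)\|^2_{L^2(\gamma_n)} < \infty \Bigr\}.
\]
On this domain, one has
\[
Lf = -\sum_{k=1}^{\infty} k I_k(f). \tag{1.4.3}
\]
The analogous statement holds true for the operator $L$ on the space $L^2(\gamma_n, E)$, where $E$ is any separable Hilbert space.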
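PROOF. Suppose that $f$ is in the domain of $L$, i.e., the mapping $t \mapsto T_t f$ to $L^2(\gamma_n)$ is differentiable at zero. Then
\[
I_k L f = \frac{d}{dt}\, e^{-kt} I_k(f)\Big|_{t=0} = -k I_k(f),
\]
whence $\sum_k k^2 \|I_k(f)\|^2_{L^2(\gamma_n)} = \|Lf\|^2_{L^2(\gamma_n)} < \infty$. In addition, we get (1.4.3). Conversely, let $f$ be a function such that this series converges. By virtue of the estimate
\[
|t^{-1}(e^{-kt} - 1)| \le k,
\]
we get as $t \to 0$
\[
\Bigl\| \frac{T_t f - f}{t} + \sum_{k=1}^{\infty} k I_k(f) \Bigr\|^2_{L^2(\gamma_n)} = \sum_{k=1}^{\infty} \bigl| t^{-1}(e^{-kt} - 1) + k \bigr|^2\, \|I_k(f)\|^2_{L^2(\gamma_n)} \to 0.
\]
Hence the mapping $t \mapsto T_t f$ is differentiable at zero. □

The operator $L$ is called the Ornstein-Uhlenbeck operator.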
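The proof uses the fact that $T_t$ acts on the $k$-th chaos $X_k$ as multiplication by $e^{-kt}$, so that $L$ acts there as multiplication by $-k$. In dimension one this can be observed numerically on the Hermite polynomials themselves. The sketch below makes two assumptions that are not spelled out in the text at this point: $H_k = He_k/\sqrt{k!}$ with the probabilists' polynomials $He_k$, and $\gamma$ is the standard Gaussian measure on the line. The names `hermite`, `ou`, and `generator_quotient` are illustrative.

```python
import math


def hermite(k, x):
    """Normalized Hermite polynomial H_k(x) = He_k(x) / sqrt(k!) (assumed normalization)."""
    prev, cur = 0.0, 1.0
    for j in range(k):
        prev, cur = cur, x * cur - j * prev  # He_{j+1} = x*He_j - j*He_{j-1}
    return cur / math.sqrt(math.factorial(k))


def gauss_integral(func, width=10.0, steps=400):
    """Trapezoidal approximation of the integral of func against the standard Gaussian measure."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(y) * math.exp(-y * y / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


def ou(t, f):
    """Return T_t f defined by the Mehler formula (one-dimensional case)."""
    c = math.exp(-t)
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    return lambda x: gauss_integral(lambda y: f(c * x + s * y))


def generator_quotient(f, x, t=1e-4):
    """Difference quotient (T_t f(x) - f(x)) / t approximating L f(x) for small t."""
    return (ou(t, f)(x) - f(x)) / t


if __name__ == "__main__":
    x, t = 0.9, 0.6
    for k in range(5):
        h_k = lambda u, k=k: hermite(k, u)
        print(k, ou(t, h_k)(x), math.exp(-k * t) * h_k(x), generator_quotient(h_k, x), -k * h_k(x))
```

The difference quotient only approximates $L H_k = -k H_k$ up to an error of order $t$, which is why the comparison in the last two columns is approximate.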
1.5. Sobolev classes
Let $\Omega \subset \mathbb{R}^n$ be an open set, $p \ge 1$, $r \in \mathbb{N}$. Everywhere below the symbol $C_0^\infty(\Omega)$ stands for the set of all infinitely differentiable functions with compact supports in $\Omega$. By $C_b^\infty(\mathbb{R}^n)$ we denote the space of all infinitely differentiable functions on $\mathbb{R}^n$ with bounded derivatives of all orders, and by $S(\mathbb{R}^n)$ its subspace that consists of all functions whose products with all polynomials are in $C_b^\infty(\mathbb{R}^n)$. Recall that the Sobolev class $W^{p,r}(\Omega)$ (an alternative notation is ...) is defined as the space of all functions $f \in L^p(\Omega)$ whose generalized partial derivatives (the derivatives in the sense of distributions) up to order $r$ are in $L^p(\Omega)$. By definition, the generalized partial derivative (or Sobolev derivative) with respect to the variable $x_i$ is a function $\partial_{x_i} f$ which is integrable on $\Omega$ and for every $\varphi \in C_0^\infty(\Omega)$ satisfies the following integration by parts formula:
\[
\int_\Omega \partial_{x_i}\varphi(x)\, f(x)\,dx = -\int_\Omega \varphi(x)\, \partial_{x_i} f(x)\,dx.
\]
Denote by $W^{p,r}_{\mathrm{loc}}(\mathbb{R}^n)$ the set of all functions $f$ on $\mathbb{R}^n$ such that $\zeta f \in W^{p,r}(\mathbb{R}^n)$ for each $\zeta \in C_0^\infty(\mathbb{R}^n)$. It is known that every locally Lipschitzian function $f$ belongs to $W^{p,1}_{\mathrm{loc}}(\mathbb{R}^n)$ for all $p \ge 1$. In addition, if $\psi$ is Lipschitzian on $\mathbb{R}^1$ and $f \in W^{p,1}(\mathbb{R}^n)$, then $\psi \circ f \in W^{p,1}_{\mathrm{loc}}(\mathbb{R}^n)$ and $\partial_{x_i}(\psi \circ f) = \psi'(f)\,\partial_{x_i} f$.
Every function $f \in W^{p,1}_{\mathrm{loc}}(\mathbb{R}^n)$ has a version $f_0$ with the following property: for every vector $h$, the function $t \mapsto f_0(x + th)$ is locally absolutely continuous for almost every $x \in \mathbb{R}^n$ (see [297, Corollary of Theorem 5.5, Ch. 2, §5]). According to the Sobolev embedding theorem (see, e.g., [3, Theorem 5.4]), if $j$ and $m$ are nonnegative integers and $mp > n$, then every function in $W^{p,j+m}_{\mathrm{loc}}(\mathbb{R}^n)$ has a modification with $j$ continuous derivatives, i.e., there is a natural embedding $W^{p,j+m}_{\mathrm{loc}}(\mathbb{R}^n) \subset C^j(\mathbb{R}^n)$. Also, one has $W^{1,j+n}_{\mathrm{loc}}(\mathbb{R}^n) \subset C^j(\mathbb{R}^n)$. For a detailed exposition of the theory of Sobolev spaces we refer the reader to the books [3], [536].

For the theory of Gaussian measures, it is natural to introduce analogous classes with a Gaussian weight. Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$. Denote by $\mathcal{H}_l$ the space of all $l$-linear functions on $\mathbb{R}^n$ with the Hilbert-Schmidt norm
\[
\|L\|^2_{\mathcal{H}_l} = \sum_{i_1, \dots, i_l = 1}^{n} L(e_{i_1}, \dots, e_{i_l})^2,
\]
where $\{e_i\}$ is the standard basis in $\mathbb{R}^n$.
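1.5.1. Definition. Let $p \ge 1$ and $r \in \mathbb{N}$. The Sobolev space $W^{p,r}(\gamma_n)$ is the completion of the space $C_0^\infty(\mathbb{R}^n)$ with respect to the Sobolev norm
\[
\|f\|_{p,r} = \Bigl\{ \int |f(x)|^p\,\gamma_n(dx) + \sum_{l=1}^{r} \int \|D^l f(x)\|^p_{\mathcal{H}_l}\,\gamma_n(dx) \Bigr\}^{1/p}.
\]
Here the derivatives $D^l f$ are considered as mappings with values in the spaces $\mathcal{H}_l$.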
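Let $f \in W^{p,r}(\gamma_n)$. By definition, there exists a sequence $f_j \in C_0^\infty(\mathbb{R}^n)$ convergent to $f$ in $L^p(\gamma_n)$ such that the sequences $\{D^l f_j\}$, $l = 1, \dots, r$, are fundamental in $L^p(\gamma_n, \mathcal{H}_l)$. Put
\[
D^l f := \lim_{j \to \infty} D^l f_j.
\]
For the first derivative $Df$ we also use the symbol $\nabla f$. It is readily verified that this definition does not depend on our choice of an approximating sequence. Indeed, for the sake of simplicity, let us consider the case $r = 1$. Suppose that a sequence $\{\varphi_j\}$ of smooth compactly supported functions converges to zero in $L^p(\gamma_n)$ such that $\nabla\varphi_j \to G$ in $L^p(\gamma_n, \mathbb{R}^n)$. Let us show that $G = 0$ almost everywhere. To this end, it suffices to prove that $\zeta G = 0$ almost everywhere for every $\zeta \in C_0^\infty(\mathbb{R}^n)$. Thus, we may assume that the functions ...

... $c > 0$. Put $\varphi = f^2$. Then $\nabla f = \frac{1}{2}\varphi^{-1/2}\nabla\varphi$ and the inequality to be proven is equivalent to the following one:
\[
\int_{\mathbb{R}^n} \varphi \log\varphi\,d\gamma_n - \int_{\mathbb{R}^n} \varphi\,d\gamma_n\, \log \int_{\mathbb{R}^n} \varphi\,d\gamma_n \le \frac{1}{2} \int_{\mathbb{R}^n} \frac{|\nabla\varphi|^2}{\varphi}\,d\gamma_n. \tag{1.6.2}
\]
Since we have
\[
T_t\varphi \log T_t\varphi \to \int_{\mathbb{R}^n} \varphi\,d\gamma_n\, \log \int_{\mathbb{R}^n} \varphi\,d\gamma_n, \qquad t \to \infty,
\]
the left side of (1.6.2) can be represented as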
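\[
-\int_0^\infty \Bigl( \frac{d}{dt} \int_{\mathbb{R}^n} T_t\varphi \log T_t\varphi\,d\gamma_n \Bigr)\,dt,
\]
which, by virtue of the semigroup property, can be written as
\[
-\int_0^\infty \Bigl( \int_{\mathbb{R}^n} (L T_t\varphi) \log T_t\varphi\,d\gamma_n \Bigr)\,dt - \int_0^\infty \Bigl( \int_{\mathbb{R}^n} \frac{d}{dt} T_t\varphi\,d\gamma_n \Bigr)\,dt.
\]
Since the second term vanishes due to (1.4.2), we get
\[
-\int_0^\infty \Bigl( \int_{\mathbb{R}^n} (L T_t\varphi) \log T_t\varphi\,d\gamma_n \Bigr)\,dt.
\]
Applying (1.5.2), we rewrite the same expression as
\[
\int_0^\infty \Bigl( \int_{\mathbb{R}^n} \bigl(\nabla T_t\varphi, \nabla(\log T_t\varphi)\bigr)\,d\gamma_n \Bigr)\,dt = \int_0^\infty \Bigl( \int_{\mathbb{R}^n} \frac{1}{T_t\varphi}\, |\nabla T_t\varphi|^2\,d\gamma_n \Bigr)\,dt.
\]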
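Put
\[
F(t) = \int_{\mathbb{R}^n} \frac{1}{T_t\varphi}\, |\nabla T_t\varphi|^2\,d\gamma_n.
\]
Using the identity $\nabla T_t\varphi = e^{-t} T_t\nabla\varphi$ and Lemma 1.4.3, which gives the estimate
\[
(T_t\partial_{x_i}\varphi)^2 \le T_t\varphi \cdot T_t\Bigl(\frac{1}{\varphi}(\partial_{x_i}\varphi)^2\Bigr),
\]
we get
\[
F(t) = e^{-2t} \sum_{i=1}^{n} \int_{\mathbb{R}^n} \frac{1}{T_t\varphi}\,(T_t\partial_{x_i}\varphi)^2\,d\gamma_n \le e^{-2t} \sum_{i=1}^{n} \int_{\mathbb{R}^n} T_t\Bigl(\frac{1}{\varphi}(\partial_{x_i}\varphi)^2\Bigr)\,d\gamma_n = e^{-2t} \int_{\mathbb{R}^n} \frac{1}{\varphi}\,|\nabla\varphi|^2\,d\gamma_n,
\]
which yields (1.6.2) after integration in $t$ over $(0, \infty)$.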
Suppose now that $f \in W^{2,1}(\gamma_n)$ and $f \ge 0$. Then the logarithmic Sobolev inequality follows by Fatou's theorem, since there is a sequence $\{\varphi_j\} \subset C_b^\infty(\mathbb{R}^n)$ such that $\varphi_j \ge 1/j$ and $\varphi_j \to f$ in $W^{2,1}(\gamma_n)$ and almost everywhere. In order to apply Fatou's theorem, note that $t^2 \log t \ge -1$ if $t \ge 0$ and that $\varphi_j(x)^2 \log\varphi_j(x) \to 0$ if $\varphi_j(x) \to 0$. Finally, for an arbitrary function $f \in W^{2,1}(\gamma_n)$, the logarithmic Sobolev inequality follows from the fact that $|f| \in W^{2,1}(\gamma_n)$ and $|\nabla |f|| = |\nabla f|$ almost everywhere. □
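The inequality established in this proof can be tried out numerically in dimension one. The sketch below takes it in the form $\int \varphi\log\varphi\,d\gamma_1 - \int\varphi\,d\gamma_1 \log\int\varphi\,d\gamma_1 \le \frac{1}{2}\int |\varphi'|^2/\varphi\,d\gamma_1$ for smooth positive $\varphi$, which is how (1.6.2) is read here (an assumption). The helper names and the three test functions are illustrative choices. For $\varphi(x) = e^{ax}$ both sides equal $\frac{1}{2}a^2 e^{a^2/2}$, so exponentials give equality.

```python
import math


def gauss_integral(func, width=12.0, steps=1200):
    """Trapezoidal approximation of the integral of func against the standard Gaussian measure."""
    h = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(x) * math.exp(-x * x / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


def entropy_side(phi):
    """Left-hand side: integral of phi*log(phi) minus (integral of phi) * log(integral of phi)."""
    mean = gauss_integral(phi)
    return gauss_integral(lambda x: phi(x) * math.log(phi(x))) - mean * math.log(mean)


def energy_side(phi, dphi):
    """Right-hand side: one half of the integral of |phi'|^2 / phi."""
    return 0.5 * gauss_integral(lambda x: dphi(x) ** 2 / phi(x))


# Pairs (phi, phi') of positive test functions and their derivatives.
EXAMPLES = {
    "1 + x^2": (lambda x: 1.0 + x * x, lambda x: 2.0 * x),
    "2 + sin x": (lambda x: 2.0 + math.sin(x), lambda x: math.cos(x)),
    "exp(0.8 x)": (lambda x: math.exp(0.8 * x), lambda x: 0.8 * math.exp(0.8 * x)),
}

if __name__ == "__main__":
    for name, (phi, dphi) in EXAMPLES.items():
        print(name, entropy_side(phi), energy_side(phi, dphi))
```

For the first two examples the inequality is strict; for the exponential the two printed numbers coincide up to rounding.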
Note that the logarithmic Sobolev inequality can be written in terms of the Ornstein-Uhlenbeck operator L as
f f2loglfldy" 1, one has )'E[a"
IE
f (S) - IEf
lE
2f,W(*1)Ir=A1r1
(f,
r]
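where $M_r = \int_{-\infty}^{+\infty} |t|^r p(t, 0, 1)\,dt$. In particular, for the standard Gaussian measure $\gamma_n$ we get
\[
\int_{\mathbb{R}^n} \Bigl| f(x) - \int_{\mathbb{R}^n} f\,d\gamma_n \Bigr|^r\,\gamma_n(dx) \le M_r \Bigl(\frac{\pi}{2}\Bigr)^r \int_{\mathbb{R}^n} |\nabla f(x)|^r\,\gamma_n(dx). \tag{1.7.4}
\]
Moreover, the integrability of $f$ follows from the existence of the integrals on the right-hand sides of these inequalities.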
PROOF. Suppose first that the function $f$ is integrable and apply estimate (1.7.1) to $\Psi(x) = |x|^r$. Then it remains to note that, for every linear functional $l$, the random variable $l(\eta)$ is Gaussian with variance $\sigma_\eta(l)^2$.

In order to show the integrability of the function $f$ with $|\nabla f| \in L^r(\gamma_n)$ we may replace it by $|f|$ and assume that $f$ is nonnegative. Let us take a sequence of functions $\varphi_j$ defined on $\mathbb{R}^1$ as follows: $\varphi_j(t) = t$ if $|t| \le j$, $\varphi_j(t) = j\,\mathrm{sgn}(t)$ if $|t| > j$. Put $f_j = \varphi_j(f)$. Then for the functions $f_j$ estimate (1.7.4) is valid with the right-hand sides uniformly bounded in $j$, since $|\nabla f_j| \le |\nabla f|$. By Fatou's theorem, we obtain
\[
\liminf_{j \to \infty} \Bigl| f_j - \int f_j\,d\gamma_n \Bigr| < \infty
\]
almost everywhere. Since $f_j \to f$ pointwise, one has $\liminf_{j \to \infty} \int f_j\,d\gamma_n < \infty$, whence, applying again Fatou's theorem, we get the integrability of $f$. □
1.7.4. Corollary. Let y,, be the standard Gaussian measure on R', k = I and
If (x) - f(y)I 0. one has
/
-t. (x:
II
If(x)-1 fdynl
r
>r)
eH(t)
- -A 2e` 21
for all t > 0. Hence,
log G(0) < H(0) _ - J H'(s) ds
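0 and $\varphi \in D(\Gamma)$. Suppose that
\[
\int_X \varphi e^{\varphi}\,d\mu - \int_X e^{\varphi}\,d\mu\, \Bigl(\log \int_X e^{\varphi}\,d\mu\Bigr) \le \int_X \Gamma(\varphi)^2 e^{\varphi}\,d\mu \tag{1.7.10}
\]
for all $\varphi \in D(\Gamma)$ such that all the integrals above exist. Then, for every $\varphi \in D(\Gamma)$ with $\int \varphi\,d\mu = 0$ and all $\alpha > 1$, one has
\[
\int_X e^{\varphi}\,d\mu \le \Bigl(\int_X \exp\bigl(\alpha\Gamma(\varphi)^2\bigr)\,d\mu\Bigr)^{1/(\alpha - 1)}. \tag{1.7.11}
\]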
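PROOF. We may assume that the right-hand side in (1.7.11) is finite. Let us apply (1.7.9) to the functions $f = e^{\varphi}$ and $g = \exp(\alpha\Gamma(\varphi)^2 - \beta)$, where $\beta = \log \int_X \exp(\alpha\Gamma(\varphi)^2)\,d\mu$. Then (1.7.9) yields
\[
\int_X \varphi e^{\varphi}\,d\mu - \int_X e^{\varphi}\,d\mu\, \Bigl(\log \int_X e^{\varphi}\,d\mu\Bigr) \ge \int_X e^{\varphi}\bigl[\alpha\Gamma(\varphi)^2 - \beta\bigr]\,d\mu. \tag{1.7.12}
\]
Denoting the left-hand side of (1.7.12) by $H(e^{\varphi})$, we obtain from (1.7.12) and (1.7.10) that
\[
H(e^{\varphi}) \ge \alpha H(e^{\varphi}) - \beta \int_X e^{\varphi}\,d\mu,
\]
whence
\[
H(e^{\varphi}) \ \dots
\]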
1/2, one has
fexp[i_J fdY"] df. < (fexp(aIVf2)dn)'"2' JV
IV
.
(1.7.15)
Rn
PROOF. It suffices to consider the case where $\int f\,d\gamma_n = 0$. Let us take for $D(\Gamma)$ the Sobolev class $W^{2,1}(\gamma_n)$ and put $\Gamma(\varphi) = |\nabla\varphi|/\sqrt{2}$.
Applying the logarithmic Sobolev inequality to the function $f = e^{\varphi/2}$, we get (1.7.10). Hence Theorem 1.7.8 applies.
Note that letting $\alpha = 1$ we get
$$\int \exp\Bigl[f - \int f\,d\gamma_n\Bigr]\,d\gamma_n \le \int \exp\bigl(|\nabla f|^2\bigr)\,d\gamma_n.$$

If $A$ has no interior, then the measure of its closure is zero according to what has been proven above. Suppose that $A$ has inner points. Let $\Pi$ be the $(n-1)$-dimensional hyperplane orthogonal to the vector $e_1 = (1, 0, \ldots, 0)$. By Fubini's theorem, it suffices to show that for almost every point $t$ on the first coordinate axis, the set $(t + \Pi) \cap \partial A$ has zero $(n-1)$-dimensional measure. If $t + \Pi$ contains an inner point of the set $A$, then this follows from the inductive assumption, since in this case $A \cap (t + \Pi)$ is a convex set in $t + \Pi$ with nonempty interior relative to $t + \Pi$. It remains to note that there exist at most two points $t$ for which the intersection defined above is nonempty, but has no interior. Indeed, according to the first claim of this lemma, such points are boundary points for the projection of the interior of $A$ onto the coordinate line. However, the interior of $A$ is a convex open set, hence its projection is a convex open set on the straight line, which may have at most two boundary points. □
Let $f$ and $g$ be two Borel functions on $\mathbb{R}^n$. Let us fix $\lambda \in (0,1)$ and put
$$h(f,g)(x) = \sup_{y \in \mathbb{R}^n} f\Bigl(\frac{x-y}{\lambda}\Bigr)^{\lambda} g\Bigl(\frac{y}{1-\lambda}\Bigr)^{1-\lambda}.$$
Note that the function $h$ is measurable, since for any Borel function $\varphi$ on the plane, the set $\{x\colon \sup_y \varphi(x,y) > c\}$ is the projection of the Borel set $\{(x,y)\colon \varphi(x,y) > c\}$, i.e., is a Souslin set (see Appendix). Denote by $\|f\|_1$ the norm of $f$ in $L^1(\mathbb{R}^n)$.
1.8.2. Theorem. Let $f$ and $g$ be nonnegative Borel functions on $\mathbb{R}^n$ such that $\|f\|_1 > 0$, $\|g\|_1 > 0$. Then the following inequality holds true:
$$\|h(f,g)\|_1 \ge \|f\|_1^{\lambda}\,\|g\|_1^{1-\lambda}. \qquad (1.8.1)$$
PROOF. It suffices to prove (1.8.1) for bounded functions. Let us use induction in $n$. Let $n = 1$. Since $f = F \sup|f|$, $g = G \sup|g|$, where $\sup|F| = \sup|G| = 1$, estimate (1.8.1) reduces to the inequality $\|h(F,G)\|_1 \ge \|F\|_1^{\lambda}\|G\|_1^{1-\lambda}$. By the concavity of logarithm, it suffices to show that
$$\|h(F,G)\|_1 \ge \lambda\|F\|_1 + (1-\lambda)\|G\|_1. \qquad (1.8.2)$$
Put $A(t) = \{F > t\}$, $B(t) = \{G > t\}$, $C(t) = \{h(F,G) > t\}$. Note that $\lambda A(t) + (1-\lambda)B(t) \subset C(t)$ for $t \in (0,1)$, and, in addition, $A(t)$ and $B(t)$ are nonempty (for $t \ge 1$ these sets are empty). Indeed, if $F(a) > t$, $G(b) > t$, then $h(F,G)(\lambda a + (1-\lambda)b) > t$, which is seen from the formula for $h(F,G)$ (it suffices to substitute $x = \lambda a + (1-\lambda)b$, $y = (1-\lambda)b$ into that expression). Therefore, Lebesgue measure $m$ of the set $C(t)$ is not less than $\lambda m(A(t)) + (1-\lambda)m(B(t))$. This estimate is the simplest (one dimensional) case of the well-known Brunn-Minkowski inequality
$$m(\lambda A + (1-\lambda)B) \ge \lambda m(A) + (1-\lambda)m(B),$$
which is readily verified for finite unions of intervals, and then extends to arbitrary nonempty Borel sets (see [120, Chapter 2]). Using Problem 1.10.11, we get (1.8.2).
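Suppose that (1.8.1) is proven for $n - 1$. Put
$$x = (y,z), \quad y \in \mathbb{R}^1, \quad z \in \mathbb{R}^{n-1}, \qquad f_0(z) = \int_{-\infty}^{+\infty} f(y,z)\,dy, \quad g_0(z) = \int_{-\infty}^{+\infty} g(y,z)\,dy.$$
It is not hard to prove that $f_0$ and $g_0$ are Borel functions. Let us fix $w \in \mathbb{R}^{n-1}$. Then
$$h(f,g)(y,z) \ge \sup_{s} f\Bigl(\frac{y-s}{\lambda}, \frac{z-w}{\lambda}\Bigr)^{\lambda} g\Bigl(\frac{s}{1-\lambda}, \frac{w}{1-\lambda}\Bigr)^{1-\lambda}.$$
Applying (1.8.1) to the functions $t \mapsto f(t, (z-w)/\lambda)$, $t \mapsto g(t, w/(1-\lambda))$ of the real variable, we get
$$\int_{-\infty}^{+\infty} h(f,g)(y,z)\,dy \ge \Bigl(\int_{-\infty}^{+\infty} f\Bigl(y, \frac{z-w}{\lambda}\Bigr)\,dy\Bigr)^{\lambda} \Bigl(\int_{-\infty}^{+\infty} g\Bigl(y, \frac{w}{1-\lambda}\Bigr)\,dy\Bigr)^{1-\lambda},$$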
1.8.
Convexity inequalities
27
whence $\int h(f,g)(y,z)\,dy \ge h(f_0,g_0)(z)$, since in the previous estimate one can take sup over $w$. Integrating in $z$, using the inductive assumption and Fubini's theorem, we arrive at (1.8.1). □
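Inequality (1.8.1) can be checked numerically in dimension one. The sketch below is only an illustration under stated assumptions: the two test functions $e^{-u^2}$ and $e^{-4v^2}$, the value $\lambda = 1/2$, the truncation to $[-6, 6]$, and all function names are choices made here. The supremum in the definition of $h(f,g)$ is replaced by a maximum over a grid, which can only underestimate it.

```python
import math

def integrate(func, lo, hi, steps):
    """Midpoint rule for the integral of func over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * dx) for k in range(steps)) * dx

def h_value(f, g, lam, x, lo=-6.0, hi=6.0, steps=600):
    """Grid approximation (from below) of
    sup_y f((x - y)/lam)**lam * g(y/(1 - lam))**(1 - lam)."""
    dy = (hi - lo) / steps
    best = 0.0
    for k in range(steps + 1):
        y = lo + k * dy
        value = f((x - y) / lam) ** lam * g(y / (1.0 - lam)) ** (1.0 - lam)
        best = max(best, value)
    return best

def both_sides(f, g, lam, lo=-6.0, hi=6.0, steps=400):
    """Return (||h(f,g)||_1, ||f||_1**lam * ||g||_1**(1 - lam))."""
    lhs = integrate(lambda x: h_value(f, g, lam, x), lo, hi, steps)
    rhs = integrate(f, lo, hi, steps) ** lam * integrate(g, lo, hi, steps) ** (1.0 - lam)
    return lhs, rhs

def f(u):
    return math.exp(-u * u)

def g(v):
    return math.exp(-4.0 * v * v)

lhs, rhs = both_sides(f, g, 0.5)
print(lhs >= rhs)
```

For this pair the supremum can be computed by hand: $h(f,g)(x) = \exp(-1.6x^2)$, so the left side is $\sqrt{\pi/1.6}$ while the right side is $\sqrt{\pi/2}$.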
Recall that the Brunn-Minkowski inequality states that
$$m_n(\lambda A + (1-\lambda)B)^{1/n} \ge \lambda m_n(A)^{1/n} + (1-\lambda)m_n(B)^{1/n}$$
for any Borel sets $A$, $B$ in $\mathbb{R}^n$, where $m_n$ is Lebesgue measure (see a proof in [120, Chapter 2], [237]). Theorem 1.8.2 implies this inequality in the case $m_n(A) = m_n(B)$. Indeed, let us take $f = I_A$ and $g = I_B$. Note that $h(I_A, I_B) = I_{\lambda A + (1-\lambda)B}$, since $h(I_A, I_B)(x) = 1$ if and only if there is $y \in \mathbb{R}^n$ such that $y \in (1-\lambda)B$ and $x - y \in \lambda A$. Hence we get from (1.8.1) that $m_n(\lambda A + (1-\lambda)B) \ge m_n(A)^{\lambda} m_n(B)^{1-\lambda}$.

We say that a nonnegative function $\varrho$ on $\mathbb{R}^n$ is log-concave (logarithmically concave) if
$$\varrho(\lambda x + (1-\lambda)y) \ge \varrho(x)^{\lambda}\varrho(y)^{1-\lambda}, \quad \forall\, x, y \in \mathbb{R}^n, \ \forall\, \lambda \in [0,1].$$
Such a function can be written as $\varrho = \exp(-V)$, where $V\colon \mathbb{R}^n \to (-\infty, +\infty]$ is a convex function. The function $V$ is finite and convex on the convex set $\mathrm{dom}(V) = \{V < \infty\}$ and is $+\infty$ outside $\mathrm{dom}(V)$.
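1.8.3. Corollary. Suppose that $\varrho$ is a log-concave function on $\mathbb{R}^n$. Let
$$\varrho_0(u) = \int_{\mathbb{R}^{n-k}} \varrho(u,v)\,dv, \quad u \in \mathbb{R}^k.$$
Then $\varrho_0$ is log-concave on $\mathbb{R}^k$.
PROOF. For every $u_1, u_2 \in \mathbb{R}^k$, $\lambda \in (0,1)$, and $v, w \in \mathbb{R}^{n-k}$, we have
$$\varrho(\lambda u_1 + (1-\lambda)u_2, w) \ge \varrho\Bigl(u_1, \frac{w-v}{\lambda}\Bigr)^{\lambda} \varrho\Bigl(u_2, \frac{v}{1-\lambda}\Bigr)^{1-\lambda}.$$
Hence, letting $\varrho_1(w) = \varrho(u_1,w)$, $\varrho_2(w) = \varrho(u_2,w)$, we get $\varrho(\lambda u_1 + (1-\lambda)u_2, w) \ge h(\varrho_1, \varrho_2)(w)$. It remains to apply Theorem 1.8.2. □

The next theorem gives the logarithmic concavity for Gaussian measures.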
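1.8.4. Theorem. Let $\mu$ be a probability measure on $\mathbb{R}^n$ with a log-concave density $\varrho$. Then, for any Borel sets $A$ and $B$ and every $\lambda \in [0,1]$, one has
$$\mu(\lambda A + (1-\lambda)B) \ge \mu(A)^{\lambda}\mu(B)^{1-\lambda}. \qquad (1.8.3)$$
In particular, this holds true for the standard Gaussian measure.
PROOF. Put $f = \varrho I_A$, $g = \varrho I_B$, and $C = \lambda A + (1-\lambda)B$. By condition, we have $\varrho((x-y)/\lambda)^{\lambda}\varrho(y/(1-\lambda))^{1-\lambda}$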
0.
PROOF. Note that $\gamma(\{f < t\}) \ge \gamma(\{f < t\} - a)$ for all $t$, since $\{f < t\}$ is a convex symmetric set. Hence,
$$\gamma(x\colon f(x) \ge t) \le \gamma(x\colon f(x+a) \ge t),$$
which yields the desired estimate if $f$ is nonnegative (see Problem 1.10.11). If $f \ge c$, where $c$ is a constant, then the claim follows by passing to the nonnegative function $f + |c|$. Finally, in the general case, it suffices to note that the functions $f_k = \max(f, -k)$ satisfy the conditions of this corollary, are bounded from below, and their integrals with respect to the measures $\gamma$ and $\gamma_a$ converge to the corresponding integrals of $f$. The second (more general) claim is derived in a similar manner from inequality (1.8.5).
1.8.7. Example. Let $\gamma$ be a centered Gaussian measure on $\mathbb{R}^n$. Then
$$\int_{\mathbb{R}^n} |x|^p\,\gamma(dx) \le \int_{\mathbb{R}^n} |x + a|^p\,\gamma(dx), \quad \forall\, a \in \mathbb{R}^n, \ p \ge 1.$$
1.8.8. Lemma. Let C,.... , {" be random variables defined on a probability space (0, P) such that the distribution of the vector (l;l,... , t;") is a centered Gaussian measure with the covariance matrix A. Suppose that
Xo = 0. Then the
E X,_ 1, where X; is the linear span of following estimate holds true:
/
Pw:
1
i I and assume that (1.8.6) is proved for all smaller n. Put
R.
n-I t=1
where en 1 X,,_I. Then Con is a centered Gaussian random variable with variance > d2. In addition, CO and are independent. By Fubini's theorem, n
Pmi k
1 - crk,
= e`, where c > 0.
whence E(t;/7r)Ijr i< < rk 2Esin2(rkC/2) < c/2. By Fatou's theorem. we conclude that
0
oc.
Let us mention a classical result due to H. Cramér [173]; its proof can be found in several books, e.g., in [114, Ch. 2].
1.9.4. Theorem. Let $\xi$ and $\eta$ be two independent random variables such that $\xi + \eta$ is normal. Then $\xi$ and $\eta$ are normal. In other words, if the convolution of two probability measures is Gaussian, then each of them is Gaussian as well.
A useful characterization of the Gaussian property was found by G. Pólya [618]. We shall derive Pólya's theorem from the preceding results, which were obtained, however, much later.
1.9.5. Theorem. Let $\xi$ and $\eta$ be two independent random variables such that $\xi$, $\eta$, and $(\xi + \eta)/\sqrt{2}$ have equal distributions. Then $\xi$ and $\eta$ are centered Gaussian.
PROOF. Let $\mu$ be the distribution of $\xi$ and let $\nu$ be the measure symmetric to $\mu$ about the origin. Observe that two independent random variables $\xi_1$ and $\eta_1$ with distribution $\mu * \nu$ satisfy the initial condition and have a symmetric distribution. By Theorem 1.9.3, $\xi_1$ is Gaussian. Applying then Theorem 1.9.4 we conclude that $\xi$ is Gaussian as well. An alternative proof is to show that $\xi$ has second moment and apply the central limit theorem. □
The following characterization of Gaussian functions (not necessarily probability densities) is obtained by E. Carlen [148].
1.9.6. Theorem. Let $f \in L^p(\mathbb{R}^{2n})$, where $p \in [1, \infty]$. Suppose that $f$ is a product function in the coordinates $(x, y)$ as well as in the coordinates
$$\Bigl(\frac{x+y}{\sqrt{2}}, \frac{x-y}{\sqrt{2}}\Bigr),$$
i.e., there exist functions $\varphi_1, \varphi_2$ and $\psi_1, \psi_2$ in $L^p(\mathbb{R}^n)$ such that
$$f(x,y) = \varphi_1(x)\varphi_2(y) = \psi_1\Bigl(\frac{x+y}{\sqrt{2}}\Bigr)\psi_2\Bigl(\frac{x-y}{\sqrt{2}}\Bigr) \quad \text{a.e.}$$
Then $f$, $\varphi_1$, $\varphi_2$, $\psi_1$, and $\psi_2$ are Gaussian functions, i.e.,
$$\varphi_1(x) = \exp[-K(x) + i l_1(x) + l_2(x) + c],$$
where $c$ is a complex constant, $K$ is a nonnegative quadratic form, and $l_1$, $l_2$ are real linear functions on $\mathbb{R}^n$, and the functions $\varphi_2$, $\psi_1$, $\psi_2$ are of this form.
PROOF. First suppose that each of the functions $\varphi_i$ and $\psi_i$ is nonnegative. Note that even in this case the result is not covered by the previous theorems if $p > 1$. Let $(P_t)_{t \ge 0}$ be the heat semigroup on $L^p(\mathbb{R}^{2n})$ defined by
$$P_t\varphi(x) = \int_{\mathbb{R}^{2n}} \varphi(y) p_t(x - y)\,dy, \quad \varphi \in L^p(\mathbb{R}^{2n}),$$
where $p_t$ is the density of the centered Gaussian measure with covariance $tI$ on $\mathbb{R}^{2n}$ and $t > 0$. Denote by $(Q_t)_{t \ge 0}$ the heat semigroup on $L^p(\mathbb{R}^n)$. It is easily verified that the functions $P_t f$, $Q_t\varphi_i$ and $Q_t\psi_i$ are smooth and strictly positive and that, letting $u = (x+y)/\sqrt{2}$, $v = (x-y)/\sqrt{2}$, we have
$$P_t f(x,y) = Q_t\varphi_1(x)\,Q_t\varphi_2(y) = Q_t\psi_1(u)\,Q_t\psi_2(v).$$
Therefore, $\partial_{x_j}\partial_{y_k} \log P_t f = 0$. Using that
$$2\partial_{x_j}\partial_{y_k} = \partial_{u_j}\partial_{u_k} - \partial_{u_j}\partial_{v_k} + \partial_{v_j}\partial_{u_k} - \partial_{v_j}\partial_{v_k}$$
and that $\log P_t f(x,y) = \log Q_t\psi_1(u) + \log Q_t\psi_2(v)$, we see that $\log Q_t\psi_1$ and $\log Q_t\psi_2$ are second order polynomials. By the integrability condition,
$$Q_t\psi_i = \exp[-K_i + \Lambda_i + c_i],$$
where $K_i$ is a positive definite quadratic form, $\Lambda_i$ is linear and $c_i \in \mathbb{R}^1$. In particular, $P_t f$ is integrable, which implies the integrability of $f$ by virtue of Fubini's theorem. It remains to note that $P_t f \to f$ in $L^1(\mathbb{R}^{2n})$ as $t \to 0$. To remove the assumption that $f$ is nonnegative, let us apply the previous case to $|P_t f| = |Q_t\varphi_1|\,|Q_t\varphi_2| = |Q_t\psi_1|\,|Q_t\psi_2|$. This gives $P_t f = \exp[iS_t + K_t]$, where $S_t$ is a smooth real function and $K_t$ is a negative definite quadratic form. Now the same reasoning with the logarithmic differentiation as above applies. Letting $t$ tend to zero, we get the claim. □
1.9.7. Remark. It is clear from the proof of Theorem 1.9.6 that its conclusion remains valid if we consider the coordinates
$(x\sin\theta + y\cos\theta,\ x\cos\theta - y\sin\theta)$ with $\theta \ne \pi k/2$, $k \in \mathbb{N}$, in place of $\theta = \pi/4$.
1.9.8. Corollary. If the condition mentioned in Theorem 1.9.2 is satisfied for some $\varphi \ne \pi k/2$, $k \in \mathbb{N}$, then $\xi$ is Gaussian.
1.10. Complements and problems
Other characterizations of Gaussian measures

Characterizations of the type given by Theorem 1.9.1 were investigated by G. Darmois [184], V. P. Skitovich [705], and later by many other authors (see references in [114] and [502]). The next result is called the Darmois-Skitovich theorem. See [502] for its proof. Observe that Theorem 1.9.1 and Theorem 1.9.2 follow from this result.
1.10.1. Theorem. Let $\xi_1, \ldots, \xi_n$ be independent random variables and let $\alpha_i$ and $\beta_i$, $i = 1, \ldots, n$, be nonzero real numbers such that the random variables $\sum_{i=1}^n \alpha_i\xi_i$ and $\sum_{i=1}^n \beta_i\xi_i$ are independent. Then the $\xi_i$'s are Gaussian.
A wide class of Gaussian characterizations stems from the fact that certain classical inequalities (e.g., Young's inequality) have Gaussian functions as extremals. For example, as shown in [485], if $G(x,y)$ is a Gaussian kernel on $\mathbb{R}^n \times \mathbb{R}^n$, i.e., $G(x,y) = \exp[Q(x,y) + l(x,y)]$, where $Q$ is a complex-valued quadratic form on $\mathbb{R}^{2n}$ and $l$ is a complex-valued linear function on $\mathbb{R}^{2n}$, then the norm of the integral operator between $L^p(\mathbb{R}^n)$ and $L^q(\mathbb{R}^n)$, generated by the kernel $G$, is attained only on Gaussian functions, i.e., functions of the form $c\exp[B(x) + f(x)]$, where $B$ is a quadratic form and $f$ is linear (both complex-valued). For related results, see [46] and [148].
We shall give an interesting characterization of Gaussian measures via the Poincaré inequality (found in [105] in the one dimensional case and generalized considerably in [156] whose proof we reproduce). Let $\mu$ be a probability measure on $\mathbb{R}^n$ such that the norm is in $L^2(\mu)$ and
$$\int x_i x_j\,\mu(dx) = \delta_{ij}.$$
Put
U(p)=sup
ff2dp
1fIVfl2dp
,
fECl(R")nL2(p)
1.10.2. Proposition. One has $U(\mu) = 1$ if and only if $\mu$ is the standard Gaussian measure.
PROOF. Note that $U(\mu) \ge 1$, since we can take $f(x) = x_1$. Clearly, we have $U(\gamma_n) = 1$ by the Poincaré inequality. To prove the converse, put $f(x) = x_j + \lambda g(x)$,
where $g \in C_0^{\infty}(\mathbb{R}^n)$ and $\lambda \in \mathbb{R}^1$. Since $U(\mu) = 1$, one has $\int [x_j + \lambda g(x)]^2\,\mu(dx)$
0, a1 +
+ak = 1, k E NJ,
(2.1.1)
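$$\mathrm{absconv}\,A = \{\alpha_1 a_1 + \cdots + \alpha_k a_k\colon\ a_i \in A,\ \alpha_i \in \mathbb{R}^1,\ |\alpha_1| + \cdots + |\alpha_k| \le 1,\ k \in \mathbb{N}\}. \qquad (2.1.2)$$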
Recall that we call cylindrical sets (or cylinders) the sets in a locally convex space $X$ with the dual $X^*$, which have the form
$$C = \{x \in X\colon (l_1(x), \ldots, l_n(x)) \in C_0\}, \quad l_i \in X^*, \qquad (2.1.3)$$
where $C_0 \in \mathcal{B}(\mathbb{R}^n)$ is called a base of $C$. Certainly, this representation is not unique (e.g., the space $\mathbb{R}^3$ can be represented as the preimage under a projection of $\mathbb{R}^1$ as well as of $\mathbb{R}^2$). The linear space $L = \bigcap_{i=1}^n \mathrm{Ker}\, l_i$ has finite codimension $k \le n$. If we choose in $X$ a $k$-dimensional linear subspace $L_0$, algebraically complementary to $L$, then the set $C$ is written as $C = C_1 + L$, where $C_1$ is some Borel subset of $L_0$.
Notation. (i) Denote by $\mathcal{E}(X)$ the $\sigma$-field generated by all cylindrical subsets of $X$. In other words, $\mathcal{E}(X)$ is the minimal $\sigma$-field with respect to which all continuous linear functionals on $X$ are measurable.
Chapter 2. Infinite Dimensional Distributions
40
(ii) Let $F$ be some family of functions on a set $X$. Let us denote by $\mathcal{E}(X,F)$ the minimal $\sigma$-field of subsets of $X$ with respect to which all functions $f \in F$ are measurable. In other words, $\mathcal{E}(X,F)$ is generated by the sets of the form $\{f < c\}$, $f \in F$, $c \in \mathbb{R}^1$. In particular, $\mathcal{E}(X) = \mathcal{E}(X, X^*)$.
(iii) The image of the measure $\mu$ on a measurable space $(X, \mathcal{M}_X)$ under a measurable mapping $f\colon X \to Y$ to a measurable space $(Y, \mathcal{M}_Y)$ is denoted by $\mu \circ f^{-1}$ and defined by the equality
$$\mu \circ f^{-1}(B) := \mu(f^{-1}(B)), \quad B \in \mathcal{M}_Y.$$
If $X$ is a locally convex space, $Y = X$ and $\mathcal{M}_X = \mathcal{E}(X)$ or $\mathcal{M}_X = \mathcal{B}(X)$, then for any $h \in X$ the mapping $x \mapsto x + h$ is measurable. The image of the measure $\mu$ under this mapping is denoted by $\mu_h$ and is called the shift of the measure to the vector $h$. By definition, $\mu_h(A) = \mu(A - h)$.
It is clear that $\mathcal{E}(X)$ is contained in the Borel $\sigma$-field $\mathcal{B}(X)$, but may not coincide with it (e.g., this is the case for the product of the continuum of the real lines). However, for separable Fréchet spaces (in particular, for separable Banach spaces) the equality $\mathcal{E}(X) = \mathcal{B}(X)$ holds true (see Appendix). For example, this equality is true for the countable product of the real lines $\mathbb{R}^{\infty}$. For any measure (nonnegative countably additive function) $\mu$ on $\mathcal{E}(X)$, we denote by $\mathcal{E}(X)_{\mu}$ the Lebesgue completion of $\mathcal{E}(X)$ with respect to $\mu$. In other words, $A \in \mathcal{E}(X)_{\mu}$ precisely when there exist sets $B_1, B_2 \in \mathcal{E}(X)$ such that $B_1 \subset A \subset B_2$, $\mu(B_2 \setminus B_1) = 0$.
2.1.1. Lemma. Sets of the form
$$\{x \in \mathbb{R}^{\infty}\colon (x_1, \ldots, x_n) \in B\}, \quad B \in \mathcal{B}(\mathbb{R}^n), \ n \in \mathbb{N},$$
generate $\mathcal{B}(\mathbb{R}^{\infty}) = \mathcal{E}(\mathbb{R}^{\infty})$.
PROOF. This follows from Theorem A.3.7 in Appendix, since the coordinate functions are continuous and separate the points of the space $\mathbb{R}^{\infty}$. □
2.1.2. Lemma. A set $E$ belongs to $\mathcal{E}(X)$ precisely when it has the form
$$E = \{x \in X\colon (l_1(x), \ldots, l_n(x), \ldots) \in B\}, \quad \text{where } l_i \in X^*, \ B \in \mathcal{B}(\mathbb{R}^{\infty}). \qquad (2.1.4)$$
PROOF. It is readily verified that the sets of the form (2.1.4) form a $\sigma$-field. Since this $\sigma$-field contains all cylindrical sets, it contains $\mathcal{E}(X)$. On the other hand, for any fixed countable collection $\{l_i\} \subset X^*$, the family of all sets $B \in \mathcal{B}(\mathbb{R}^{\infty})$ for which the set $E$ given by formula (2.1.4) belongs to $\mathcal{E}(X)$ is a $\sigma$-field as well. Since this $\sigma$-field contains all sets of the form $\{x \in \mathbb{R}^{\infty}\colon (x_1, \ldots, x_n) \in C_0\}$, $C_0 \in \mathcal{B}(\mathbb{R}^n)$, by virtue of Lemma 2.1.1 it coincides with $\mathcal{B}(\mathbb{R}^{\infty})$. □
2.1.3. Lemma. Let $C$ be a cylinder in a locally convex space $X$ with a compact base. Then, for any continuous linear operator $T\colon X \to \mathbb{R}^d$, the set $T(C)$ is closed.
PROOF. Let $C = P^{-1}(K)$, where $K$ is a compact set in $\mathbb{R}^n$ and $P\colon X \to \mathbb{R}^n$ is a continuous linear mapping. Without any loss of generality, we may assume that $\mathbb{R}^n = P(X)$. In this case there exists an $n$-dimensional linear subspace $Y \subset X$,
2.1.
Cylindrical sets
41
which is an algebraic complement to $\mathrm{Ker}\,P$. The mapping $P$ is an algebraic, hence also topological, isomorphism of the finite dimensional spaces $Y$ and $\mathbb{R}^n$. Therefore, the set $Q = Y \cap P^{-1}(K)$ is compact in $X$. In addition, $C = Q + \mathrm{Ker}\,P$. It remains to note that $T(Q)$ is compact, hence $T(C) = T(Q) + T(\mathrm{Ker}\,P)$ is a closed set. □
Note that a finite dimensional projection of a cylindrical set may fail to be a Borel set (even for sets in the plane; see the section about Souslin sets in Appendix).
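2.1.4. Lemma. Let $\mu$ be a measure on $\mathcal{E}(X)$, where $X$ is a locally convex space. Then, for any set $A \in \mathcal{E}(X)$, its linear span, its convex hull, and its absolutely convex hull are measurable with respect to $\mu$ (i.e., belong to $\mathcal{E}(X)_{\mu}$).
PROOF. According to formulas (2.1.1) and (2.1.2) and an analogous formula for the linear span, it suffices to show that for any collection of sets $C_i \in \mathcal{E}(X)$ and any Borel set $\Lambda \subset \mathbb{R}^k$, the set
$$A = \{\lambda_1 c_1 + \cdots + \lambda_k c_k, \ (\lambda_1, \ldots, \lambda_k) \in \Lambda, \ c_i \in C_i\}$$
is measurable with respect to $\mu$. By Lemma 2.1.2, there exist sets $B_i \in \mathcal{B}(\mathbb{R}^{\infty})$ and a countable family $\{f_i\} \subset X^*$ such that $C_i = f^{-1}(B_i)$, where $f = (f_i)\colon X \to \mathbb{R}^{\infty}$. Therefore, it suffices to prove our claim for the measure $\nu := \mu \circ f^{-1}$ on $\mathbb{R}^{\infty}$ and the sets
$$B = \{\lambda_1 b_1 + \cdots + \lambda_k b_k, \ (\lambda_1, \ldots, \lambda_k) \in \Lambda, \ b_i \in B_i\} \subset \mathbb{R}^{\infty}.$$
Indeed, if we find sets $E_1, E_2 \in \mathcal{B}(\mathbb{R}^{\infty})$ such that $E_1 \subset B \subset E_2$ and $\nu(E_2 \setminus E_1) = 0$, then
$$f^{-1}(E_1) \subset f^{-1}(B) = A \subset f^{-1}(E_2) \quad \text{and} \quad \mu\bigl(f^{-1}(E_2) \setminus f^{-1}(E_1)\bigr) = 0.$$
For the measure $\nu$, the claim follows from Theorem A.3.15 in Appendix, since $B_1 \times \cdots \times B_k \times \Lambda$ is a Borel set in the corresponding product space, and the mapping $(x_1, \ldots, x_k, \lambda) \mapsto \lambda_1 x_1 + \cdots + \lambda_k x_k$ is continuous. □
2.1.5. Lemma. Let $\mu$ be a measure on $\mathcal{E}(X)$, where $X$ is a locally convex space. Then, for any set $A \in \mathcal{E}(X)_{\mu}$ and any $\varepsilon > 0$, there exists a set $E$ of the form (2.1.4) such that $B$ is compact in $\mathbb{R}^{\infty}$, $E \subset A$ and $\mu(A \setminus E) < \varepsilon$. In addition, there exists a cylinder $C_{\varepsilon}$ with a compact base such that $\mu(A \,\triangle\, C_{\varepsilon}) < \varepsilon$.
PROOF. By definition, there exists a set $S \in \mathcal{E}(X)$ with $S \subset A$ and $\mu(A \setminus S) = 0$. According to (2.1.4), the set $S$ has the form
$$S = f^{-1}(S_0), \quad S_0 \in \mathcal{B}(\mathbb{R}^{\infty}), \quad f = (f_i)\colon X \to \mathbb{R}^{\infty}, \quad f_i \in X^*.$$
By Theorem A.3.11 in Appendix applied to the measure $\mu \circ f^{-1}$ on $\mathbb{R}^{\infty}$, there exists a compact set $B \subset S_0$ such that $\mu \circ f^{-1}(S_0 \setminus B) < \varepsilon$. Now one can put $E = f^{-1}(B)$.
In a similar manner, there exists a cylinder $D$ such that $\mu(A \,\triangle\, D) < \varepsilon/2$. Let $D = P^{-1}(B_0)$, where $P\colon X \to \mathbb{R}^n$ is a continuous linear operator and $B_0 \in \mathcal{B}(\mathbb{R}^n)$. Since $\nu := \mu \circ P^{-1}$ is a Borel measure on $\mathbb{R}^n$, there exists a compact set $K \subset B_0$ for which $\nu(B_0 \setminus K) < \varepsilon/2$. Letting $C_{\varepsilon} = P^{-1}(K)$, we get $\mu(A \,\triangle\, C_{\varepsilon}) < \varepsilon$. □
2.1.6. Lemma. Let $\mu$ be a measure on $\mathcal{E}(X)$, where $X$ is a locally convex space. Then, for any convex $\mu$-measurable set $A$ and any $\varepsilon > 0$, there exists a convex cylindrical set $C_{\varepsilon}$ with a compact base such that $\mu(A \,\triangle\, C_{\varepsilon}) < \varepsilon$. If $A$ is absolutely convex, then $C_{\varepsilon}$ can be found with the same property.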
PROOF. By definition, there exists a set $B \in \mathcal{E}(X)$ such that $B \subset A$ and $\mu(B) = \mu(A)$. By Lemma 2.1.5, we may assume that $B$ has the form $B = \bigcup_{n=1}^{\infty} B_n$, where $B_n = f^{-1}(K_n)$, the sets $K_n$ are compact in $\mathbb{R}^{\infty}$ and $f\colon X \to \mathbb{R}^{\infty}$ has the form $f = (f_i)$, $f_i \in X^*$. In addition, there exists a cylindrical set $A_{\varepsilon}$ containing $A$ such that $\mu(A_{\varepsilon}) - \mu(A) < \varepsilon$. Let $A_{\varepsilon}$ have the form
$$A_{\varepsilon} = P^{-1}(C_0), \quad C_0 \in \mathcal{B}(\mathbb{R}^d),$$
where $P\colon X \to \mathbb{R}^d$ is a continuous linear mapping. By the convexity of $A$, we have $\mathrm{conv}\,B \subset A$. According to Lemma 2.1.3, the sets $P(B_n)$ are closed in $\mathbb{R}^d$. Hence the convex hull of their union is a Borel set. This means that
$$E := P(\mathrm{conv}\,B) \in \mathcal{B}(\mathbb{R}^d).$$
Note that $E$ is convex and that $B \subset P^{-1}(E) \subset A_{\varepsilon}$, since $E \subset P(A) \subset P(A_{\varepsilon})$. Thus, we have approximated $A$ by the convex cylindrical set $P^{-1}(E)$ up to $\varepsilon$ with respect to the measure $\mu$. It remains to note that there exists a compact set $K$ in $E$ with $\mu \circ P^{-1}(E \setminus K) < \varepsilon$. Since $\mathrm{conv}\,K$ is convex and compact in $E$ (for $E$ is a convex set in $\mathbb{R}^d$), the cylinder $C_{\varepsilon} := P^{-1}(\mathrm{conv}\,K)$ approximates $A$ up to $2\varepsilon$. The reasoning is similar for absolutely convex sets. □
2.2. Basic definitions
2.2.1. Definition. (i) Let $E$ be a linear space and let $F$ be some linear space of linear functions on $E$, separating the points in $E$. A probability measure $\gamma$ on $\mathcal{E}(E,F)$ is called Gaussian if, for any $f \in F$, the measure $\gamma \circ f^{-1}$ is Gaussian.
(ii) Let $X$ be a locally convex space. A probability measure $\gamma$ defined on the $\sigma$-field $\mathcal{E}(X)$, generated by $X^*$, is called Gaussian if, for any $f \in X^*$, the induced measure $\gamma \circ f^{-1}$ on $\mathbb{R}^1$ is Gaussian. The measure $\gamma$ is called centered (or symmetric) if all the measures $\gamma \circ f^{-1}$, $f \in X^*$, are centered.
(iii) A random vector is called Gaussian if it induces a Gaussian measure.
Note that although in case (i) the space $E$ is not supposed to be endowed with a topology, this case in fact reduces to (ii), since one can introduce on $E$ the locally convex topology $\sigma(E,F)$, generated by the family of seminorms $p_f(x) = |f(x)|$, $f \in F$, as explained in Appendix. In this case $F$ turns out to be the dual to $E$ with this topology, and we arrive at case (ii). Some of the results discussed below do not use any topology on the space with measure, while a number of fundamental results involve explicitly the topological structure. For this reason, for the sake of uniformity of exposition, everywhere below we consider Gaussian measures on locally convex spaces. However, nothing from the theory of locally convex spaces is used in this chapter, except for the notion of a continuous linear functional. Note that one need not require in the definition that the measure $\gamma$ be a probability measure: this is obvious for nonnegative measures and is true even for signed measures (see Problem 2.12.15).
It follows directly from the definition that the image of a Gaussian measure under a continuous affine mapping is a Gaussian measure (recall that an affine mapping is a linear mapping plus a constant vector).
2.2.2. Lemma. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $T\colon X \to Y$ be a linear mapping to a locally convex space such that $l \circ T \in X^*$ for any $l \in Y^*$. Then the measure $\gamma \circ T^{-1}$ is Gaussian on $Y$. The same is true for the affine mapping $x \mapsto Tx + a$, where $a \in Y$.
Recall that the Fourier transform (the characteristic functional) of a measure $\mu$ defined on the $\sigma$-field $\mathcal{E}(X)$ in a locally convex space $X$ is given by the formula
$$\widetilde{\mu}\colon X^* \to \mathbb{C}, \quad \widetilde{\mu}(f) = \int_X \exp(i f(x))\,\mu(dx).$$
2.2.3. Example. For every measure $\mu$ on $\mathcal{E}(X)$ and every $h \in X$, we have
$$\widetilde{\mu_h}(l) = e^{il(h)}\,\widetilde{\mu}(l), \quad \forall\, l \in X^*.$$
PROOF. It suffices to note that, by the change of variables formula (see Appendix), we have
$$\int_X f(x)\,\mu_h(dx) = \int_X f(x+h)\,\mu(dx)$$
for every $\mathcal{E}(X)$-measurable bounded function $f$. □
Some additional information about the Fourier transforms is presented in Appendix. Here we only need the fact that any two measures on $\mathcal{E}(X)$ with equal Fourier transforms coincide.
2.2.4. Theorem. A measure $\gamma$ on a locally convex space $X$ is Gaussian if and only if its Fourier transform has the form
$$\widetilde{\gamma}(f) = \exp\Bigl(iL(f) - \frac12 B(f,f)\Bigr), \qquad (2.2.1)$$
where $L$ is a linear function on $X^*$ and $B$ is a symmetric bilinear function on $X^*$ such that the quadratic form $f \mapsto B(f,f)$ is nonnegative.
PROOF. Let $\gamma$ be a Gaussian measure. It follows from the definition that $f \in L^2(\gamma)$ for any $f \in X^*$. Hence one can put
$$L(f) = \int_X f(x)\,\gamma(dx), \quad f \in X^*,$$
$$B(f,g) = \int_X [f(x) - L(f)]\,[g(x) - L(g)]\,\gamma(dx), \quad f, g \in X^*.$$
It is clear that $L$ is a linear function and $B$ is a bilinear symmetric function. In addition, the quadratic form $f \mapsto B(f,f)$ is nonnegative. Now equality (2.2.1) follows by the elementary properties of the one dimensional Gaussian distributions.
Conversely, let $\gamma$ be a measure on $\mathcal{E}(X)$ with the Fourier transform given by (2.2.1). By the change of variables formula, we get
$$\int \exp(iyt)\,\gamma \circ f^{-1}(dt) = \int_X \exp(iy f(x))\,\gamma(dx) = \exp\Bigl[iyL(f) - \frac12 y^2 B(f,f)\Bigr],$$
whence it follows that $\gamma$ is Gaussian. □
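Formula (2.2.1) can be illustrated in the finite dimensional case $X = \mathbb{R}^2$ by comparing an empirical Fourier transform with the closed form. The following is a minimal Monte Carlo sketch; the mean vector, the lower triangular factor `chol` of the covariance matrix, the sample size, and all function names are assumptions made for the illustration, with $L(f) = \langle f, a\rangle$ and $B(f,f) = \langle Kf, f\rangle$ for the measure with mean $a$ and covariance $K$.

```python
import cmath
import random

def sample_gaussian(mean, chol, rng):
    """Draw one sample mean + chol @ z with z standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [mean[i] + sum(chol[i][j] * z[j] for j in range(len(z)))
            for i in range(len(mean))]

def empirical_fourier_transform(f, mean, chol, n, seed=0):
    """Monte Carlo estimate of the integral of exp(i f(x)) against gamma."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n):
        x = sample_gaussian(mean, chol, rng)
        total += cmath.exp(1j * sum(fi * xi for fi, xi in zip(f, x)))
    return total / n

def formula_fourier_transform(f, mean, chol):
    """Right-hand side of (2.2.1): exp(i L(f) - B(f, f) / 2)."""
    L = sum(fi * mi for fi, mi in zip(f, mean))
    # B(f, f) = |chol^T f|^2 because the covariance is K = chol chol^T.
    v = [sum(chol[i][j] * f[i] for i in range(len(f))) for j in range(len(f))]
    B = sum(c * c for c in v)
    return cmath.exp(1j * L - 0.5 * B)

mean = [1.0, -0.5]
chol = [[1.0, 0.0], [0.5, 2.0]]
f = [0.3, -0.2]
estimate = empirical_fourier_transform(f, mean, chol, 20000)
exact = formula_fourier_transform(f, mean, chol)
print(abs(estimate - exact))
```

The discrepancy is of the order of the Monte Carlo error, roughly $n^{-1/2}$ for $n$ samples.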
2.2.5. Corollary. A Gaussian measure $\gamma$ on a locally convex space $X$ is centered precisely when $\gamma(A) = \gamma(-A)$ for any $A \in \mathcal{E}(X)$. This is equivalent to the relation $L = 0$ in (2.2.1).
PROOF. It suffices to note that the Fourier transform of the measure $A \mapsto \gamma(-A)$ is the function which is complex conjugate to $\widetilde{\gamma}$ and that measures on $\mathcal{E}(X)$ with equal Fourier transforms coincide. □
2.2.6. Corollary. The product $\gamma_1 \otimes \gamma_2$ of two Gaussian measures $\gamma_1$ and $\gamma_2$ on locally convex spaces $X_1$ and $X_2$ is a Gaussian measure on $X_1 \times X_2$. In the case where $X_1 = X_2 = X$, the convolution $\gamma_1 * \gamma_2$ of the measures $\gamma_1$ and $\gamma_2$, defined as the image of the measure $\gamma_1 \otimes \gamma_2$ under the mapping $X \times X \to X$, $(x,y) \mapsto x + y$ (see Appendix), is Gaussian as well.
2.2.7. Definition. Let $X$ be a locally convex space and let $\mu$ be a measure on $\mathcal{E}(X)$ such that $X^* \subset L^2(\mu)$. The element $a_{\mu}$ in the algebraic dual $(X^*)'$ to $X^*$, defined by the formula
$$a_{\mu}(f) = \int_X f(x)\,\mu(dx), \qquad (2.2.2)$$
is called the mean of $\mu$.
The operator $R_{\mu}\colon X^* \to (X^*)'$, defined by the formula
$$R_{\mu}(f)(g) = \int_X [f(x) - a_{\mu}(f)]\,[g(x) - a_{\mu}(g)]\,\mu(dx), \qquad (2.2.3)$$
is called the covariance operator of $\mu$. The corresponding quadratic form on $X^*$ is called the covariance of $\mu$.
Notation. Let $\gamma$ be a Gaussian measure on a locally convex space $X$. We denote by $X_{\gamma}^*$ the closure of the set
$$\{f - a_{\gamma}(f), \ f \in X^*\},$$
embedded into $L^2(\gamma)$, with respect to the norm of $L^2(\gamma)$. The space obtained in this way and equipped with the inner product from $L^2(\gamma)$ is called the reproducing kernel Hilbert space of the measure $\gamma$. Put
$$|h|_{H(\gamma)} = \sup\{l(h)\colon l \in X^*, \ R_{\gamma}(l)(l) \le 1\},$$
$$H(\gamma) = \{h \in X\colon |h|_{H(\gamma)} < \infty\}, \quad U_H = \{h\colon |h|_{H(\gamma)} \le 1\}.$$
The space $H(\gamma)$ is called the Cameron-Martin space. In the literature, it is also called the reproducing kernel Hilbert space and denoted by RKHS, but we use neither this term nor notation. The mapping $R_{\gamma}$ is also defined on $X_{\gamma}^*$:
$$R_{\gamma}\colon X_{\gamma}^* \to (X^*)', \quad R_{\gamma}(f)(g) := \int_X f(x)\,[g(x) - a_{\gamma}(g)]\,\gamma(dx), \quad f \in X_{\gamma}^*, \ g \in X^*.$$
Clearly, for any centered measure, this is merely the extension of $R_{\gamma}$ from $X^*$ to $X_{\gamma}^*$. However, independently of whether the measure $\gamma$ is centered or not, for each $f \in X^*$, the functional $R_{\gamma}(f)$ coincides with the functional $R_{\gamma}(f - a_{\gamma}(f))$ generated by the element $f - a_{\gamma}(f)$ of the space $X_{\gamma}^*$ (although the functional $f$ itself may not belong to this space if $a_{\gamma} \ne 0$).
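In the finite dimensional case the definition of $|h|_{H(\gamma)}$ can be made concrete. The sketch below assumes $X = \mathbb{R}^2$ and a centered Gaussian measure with a nondegenerate covariance matrix $K$, so that $R_{\gamma}(l)(l) = \langle Kl, l\rangle$; under this assumption the supremum equals $\langle K^{-1}h, h\rangle^{1/2}$ by the Cauchy-Bunyakovsky inequality. The matrix, the vector $h$, and the function names are illustrative.

```python
import math

def cameron_martin_norm_by_sup(K, h, steps=20000):
    """Approximate sup{ l(h) : <K l, l> <= 1 } by scanning directions of l.

    Each direction d on the unit circle is rescaled so that <K l, l> = 1,
    which is where the supremum of the linear functional is attained.
    """
    best = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        d = (math.cos(theta), math.sin(theta))
        q = K[0][0] * d[0] * d[0] + 2.0 * K[0][1] * d[0] * d[1] + K[1][1] * d[1] * d[1]
        scale = 1.0 / math.sqrt(q)
        best = max(best, scale * (d[0] * h[0] + d[1] * h[1]))
    return best

def cameron_martin_norm_closed_form(K, h):
    """sqrt(<K^{-1} h, h>) for a symmetric positive definite 2x2 matrix K."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    inv = [[K[1][1] / det, -K[0][1] / det],
           [-K[1][0] / det, K[0][0] / det]]
    quad = sum(h[i] * inv[i][j] * h[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)

K = [[2.0, 1.0], [1.0, 1.0]]
h = (1.0, 2.0)
print(cameron_martin_norm_by_sup(K, h), cameron_martin_norm_closed_form(K, h))
```

For this $K$ and $h$ one has $\langle K^{-1}h, h\rangle = 5$, so both values should be close to $\sqrt{5}$.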
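In the case where the functional $R_{\gamma}(f)$, where $f \in X_{\gamma}^*$, is generated by an element of the initial space $X$, this element is denoted by the same symbol as the functional. In this case
$$\langle R_{\gamma}(f), l\rangle = \langle l, R_{\gamma}(f)\rangle, \quad \forall\, l \in X^*.$$
It is clear that in relation (2.2.1) we have $B(f,f) = R_{\gamma}(f)(f)$ for all $f \in X^*$. Put
$$\sigma(f) = \sqrt{R_{\gamma}(f)(f)} = \|f - a_{\gamma}(f)\|_{L^2(\gamma)}, \quad f \in X^*.$$
If $f \in X_{\gamma}^*$, we put
$$\sigma(f) = \Bigl(\int_X f(x)^2\,\gamma(dx)\Bigr)^{1/2}.$$
Finally, the element $R_{\gamma}(f)$ is also denoted by $R_{\gamma}f$.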
2.2.8. Lemma. Every element $g \in X_{\gamma}^*$ is a centered Gaussian random variable with variance $\sigma(g)^2 = \|g\|^2_{L^2(\gamma)}$.
PROOF. The claim follows from Problem 1.10.13, but for the reader's convenience we include the proof. Let $\{f_n\}$ be a sequence of elements of $X^*$ for which the sequence $\{f_n - a_{\gamma}(f_n)\}$ converges to $g$ in $L^2(\gamma)$. Then it converges to $g$ in measure, whence
$$\int_X \exp(itg(x))\,\gamma(dx) = \lim_{n\to\infty} \int_X \exp\bigl(it[f_n(x) - a_{\gamma}(f_n)]\bigr)\,\gamma(dx) = \lim_{n\to\infty} \exp\Bigl[-\frac12 t^2\sigma(f_n)^2\Bigr].$$
Therefore, there exists the limit $d^2 = \lim_{n\to\infty} \sigma(f_n)^2$. This means that the measure $\gamma \circ g^{-1}$ is Gaussian with variance $d^2 = \sigma(g)^2$ and zero mean. □
2.2.9. Remark. The examples of Gaussian measures discussed below show that the elements $a_{\gamma}$ and $R_{\gamma}(f)$ for $f \in X^*$ are not always represented by vectors from the space $X$. However, it will be shown in the next chapter that this cannot happen for Radon Gaussian measures. In particular, such examples are not possible for Gaussian measures on separable Banach spaces.
2.2.10. Proposition. A probability measure $\gamma$ on the $\sigma$-field $\mathcal{E}(X)$ in a locally convex space $X$ is centered Gaussian precisely when for every real $\varphi$ the image of the measure $\gamma \otimes \gamma$ on $X \times X$ under the mapping
$$X \times X \to X, \quad (x,y) \mapsto x\sin\varphi + y\cos\varphi,$$
coincides with $\gamma$.
PROOF. The necessity of this condition is verified by the aid of formula (2.2.1). The proof of the sufficiency reduces to the one dimensional case. Indeed, the Fourier transform of the image of the measure $\nu \otimes \nu$, where $\nu = \gamma \circ f^{-1}$, under the mapping of the form indicated above from $\mathbb{R}^2$ to $\mathbb{R}^1$ has the form
$$t \mapsto \int\!\!\int \exp[it(u\sin\varphi + v\cos\varphi)]\,\nu(du)\,\nu(dv) = \int\!\!\int \exp[it(f(x)\sin\varphi + f(y)\cos\varphi)]\,\gamma(dx)\,\gamma(dy) = \int \exp[itf(z)]\,\gamma(dz) = \widetilde{\nu}(t).$$
Hence one can use Theorem 1.9.2. □
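The one dimensional identity used in this proof says that $\widetilde{\nu}(t\sin\varphi)\,\widetilde{\nu}(t\cos\varphi) = \widetilde{\nu}(t)$ for all $t$ and $\varphi$. The sketch below evaluates the defect of this identity for two closed-form characteristic functions: a centered Gaussian one, where the defect vanishes, and the Laplace one, $1/(1+t^2)$, where it does not. The choice of the two laws, of $\sigma$, and of the grid of $t$ values is an assumption made for illustration.

```python
import math

def gaussian_cf(t, sigma=1.5):
    """Characteristic function of the centered Gaussian law with variance sigma^2."""
    return math.exp(-0.5 * sigma * sigma * t * t)

def laplace_cf(t):
    """Characteristic function of the standard Laplace law (not Gaussian)."""
    return 1.0 / (1.0 + t * t)

def rotation_defect(cf, phi, ts):
    """Largest value of |cf(t sin phi) cf(t cos phi) - cf(t)| over the grid ts.

    The product is the Fourier transform of the image of nu x nu under
    (u, v) -> u sin(phi) + v cos(phi), so a zero defect means that this
    image has the same Fourier transform as nu on the grid.
    """
    return max(abs(cf(t * math.sin(phi)) * cf(t * math.cos(phi)) - cf(t)) for t in ts)

ts = [0.1 * k for k in range(-50, 51)]
phi = math.pi / 4.0
print(rotation_defect(gaussian_cf, phi, ts))
print(rotation_defect(laplace_cf, phi, ts))
```

The Gaussian defect is zero up to rounding because $\sin^2\varphi + \cos^2\varphi = 1$ in the exponent; for the Laplace law the defect at $t = 1$, $\varphi = \pi/4$ is already $|4/9 - 1/2|$.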
2.2.11. Corollary. Assume that $X$ is a locally convex space equipped with the $\sigma$-field $\mathcal{E}(X)$. A measurable mapping $\xi$ with values in $X$ is a centered Gaussian random vector in $X$ if and only if for every pair $(\xi_1, \xi_2)$ of independent copies of $\xi$ and every real number $\varphi$, the mappings
$$\xi_1\sin\varphi + \xi_2\cos\varphi, \quad \xi_1\cos\varphi - \xi_2\sin\varphi$$
are independent copies of $\xi$.
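There is a useful characterization of the Gaussian property by means of the triple convolutions. Let us introduce the following notation. Let $X$ be a linear space. For any $\varphi \in \mathbb{R}^1$ put
$$\alpha = \frac13 + \frac23\cos\varphi, \quad \beta = \frac13 - \frac13\cos\varphi - \frac{1}{\sqrt{3}}\sin\varphi, \quad \gamma = \frac13 - \frac13\cos\varphi + \frac{1}{\sqrt{3}}\sin\varphi.$$
Note that $\alpha + \beta + \gamma = 1$ and that the matrix
$$T_{\varphi} = \begin{pmatrix} \alpha & \beta & \gamma \\ \gamma & \alpha & \beta \\ \beta & \gamma & \alpha \end{pmatrix}$$
is orthogonal: it defines the rotation by angle $\varphi$ around the axis with the direction vector $(1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$. This matrix generates the operator
$$T_{\varphi,X}\colon X^3 \to X^3, \quad (x,y,z) \mapsto T_{\varphi,X}(x,y,z) = (\alpha x + \beta y + \gamma z, \ \gamma x + \alpha y + \beta z, \ \beta x + \gamma y + \alpha z).$$
2.2.12. Theorem. Let $X$ be a locally convex space and let $\mu$ be a Gaussian measure on $X$. Then the measure $\mu^3 = \mu \otimes \mu \otimes \mu$ is invariant with respect to $T_{\varphi,X}$ for any $\varphi$. Conversely, if $\mu$ is a probability measure on $\mathcal{E}(X)$ such that the measure $\mu^3$ is invariant with respect to $T_{\varphi,X}$ for some $\varphi \ne 2k\pi/3$, then $\mu$ is a Gaussian measure.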
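The stated properties of the matrix $T_{\varphi}$ are easy to confirm numerically. The following sketch builds the coefficients from the formulas $\alpha = \frac13 + \frac23\cos\varphi$, $\beta = \frac13 - \frac13\cos\varphi - \frac{1}{\sqrt3}\sin\varphi$, $\gamma = \frac13 - \frac13\cos\varphi + \frac{1}{\sqrt3}\sin\varphi$ and checks that the rows sum to one, that the matrix is orthogonal, and that its trace equals $1 + 2\cos\varphi$, as it must for a rotation by angle $\varphi$. The function names and the sample angle are illustrative.

```python
import math

def rotation_coefficients(phi):
    """Coefficients alpha, beta, gamma of the circulant matrix T_phi."""
    alpha = 1.0 / 3.0 + 2.0 / 3.0 * math.cos(phi)
    beta = 1.0 / 3.0 - math.cos(phi) / 3.0 - math.sin(phi) / math.sqrt(3.0)
    gamma = 1.0 / 3.0 - math.cos(phi) / 3.0 + math.sin(phi) / math.sqrt(3.0)
    return alpha, beta, gamma

def t_matrix(phi):
    """Matrix with rows (a, b, c), (c, a, b), (b, c, a)."""
    a, b, c = rotation_coefficients(phi)
    return [[a, b, c], [c, a, b], [b, c, a]]

def gram(m):
    """Product of m with its transpose; the identity for an orthogonal m."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

phi = 0.7
T = t_matrix(phi)
print(sum(rotation_coefficients(phi)))  # row sum, should be close to 1
print(gram(T))                          # should be close to the identity
print(apply(T, [1.0, 1.0, 1.0]))        # the axis (1, 1, 1) is fixed
```

At $\varphi = 2\pi/3$ the coefficients become $(0, 0, 1)$, so $T_{\varphi}$ is a cyclic permutation of the coordinates; this is the case excluded in the converse statement.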
PROOF. The first claim follows from the equality $\alpha + \beta + \gamma = 1$, which yields that the Fourier transform of the measure $\mu^3 \circ T_{\varphi,X}^{-1}$ coincides with the Fourier transform of the measure $\mu^3$. For the proof of the second claim let us fix $f \in X^*$ and put $\nu = \mu \circ f^{-1}$. Then
$$\nu^3 = \nu^3 \circ T_{\varphi,\mathbb{R}}^{-1},$$
where the operator $T_{\varphi,\mathbb{R}}$ on $\mathbb{R}^3$ is given by the matrix $T_{\varphi}$. To simplify the subsequent calculations we put $F = \widetilde{\nu}$ and $T_{\varphi,\mathbb{R}}(x,y,z) = (x_{\varphi}, y_{\varphi}, z_{\varphi})$. Then the equality of the Fourier transforms of the measures $\nu^3$ and $\nu^3 \circ T_{\varphi,\mathbb{R}}^{-1}$ yields the identity
$$F(x)F(y)F(z) = F(x_{\varphi})F(y_{\varphi})F(z_{\varphi}). \qquad (2.2.4)$$
Note that $F$ does not vanish. Indeed, otherwise by the continuity of $F$ there exists a point $a$ such that $F(a) = 0$ and $F$ has no zeros in $(-a, a)$. However, equality (2.2.4) applied to the vector $(a, 0, 0)$ yields
$$0 = F(a) = F(\alpha a)F(\beta a)F(\gamma a),$$
which leads to a contradiction with our choice of $a$, since by condition $q = \sup(|\alpha|, |\beta|, |\gamma|) < 1$. Let us pick $\delta > 0$ such that for all $x$ from the closed interval $I = [-\delta, \delta]$ the values $F(x)$ belong to a simply connected region $U$ in the complex plane whose closure is compact and does not contain the origin. In this region $U$ we can find a branch of logarithm and put $G(x) = \mathrm{Log}\,F(x)$, $x \in I$. Note that $F(x_{\varphi}) \in U$ for all $x \in I$, since $x_{\varphi} \in I$. By virtue of (2.2.4) we have
$$G(x) + G(y) + G(z) = G(x_{\varphi}) + G(y_{\varphi}) + G(z_{\varphi}), \quad \forall\, x, y, z \in I. \qquad (2.2.5)$$
This identity implies that $G$ is infinitely differentiable. Indeed, integrating (2.2.5) in $y$ over $I$ we get
$$2\delta G(x) + 2\delta G(z) + \int_{-\delta}^{\delta} G(y)\,dy = \int_{-\delta}^{\delta} \bigl[G(\alpha x + \beta y + \gamma z) + G(\gamma x + \alpha y + \beta z) + G(\beta x + \gamma y + \alpha z)\bigr]\,dy.$$
It follows from the condition that the numbers $\alpha$, $\beta$, $\gamma$ are nonzero. Hence the first integral on the right can be rewritten as
$$\frac{1}{\beta}\int_{\alpha x - \beta\delta}^{\alpha x + \beta\delta} G(s + \gamma z)\,ds,$$
which is a function differentiable in $x$. Two other integrals admit analogous representations. Thus, the function $G$ is differentiable. By induction one readily verifies its infinite differentiability. Differentiating three times the equality
$$G(x) = G(\alpha x) + G(\beta x) + G(\gamma x),$$
which follows from (2.2.5), we get
$$G'''(x) = \alpha^3 G'''(\alpha x) + \beta^3 G'''(\beta x) + \gamma^3 G'''(\gamma x).$$
By virtue of the equality $\alpha^2 + \beta^2 + \gamma^2 = 1$ one has the estimate
[al3+1313+[y13 0.
(2.3.4)
n=1
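The convergence of the series in (2.3.4) for some $\varepsilon > 0$ is equivalent to the equality $\gamma(l^\infty) = 1$.
PROOF. Let series (2.3.4) converge for some $\varepsilon > 0$. Then $\sigma_n \to 0$. Therefore, for any $\delta > 0$, there exists $r > 0$ such that $\prod_{n=1}^{\infty} \gamma_n([-r, r]) > 1 - \delta$. Indeed, according to (1.1.2), one has
$$\frac{2}{\sqrt{2\pi}}\Bigl(\frac{\sigma_n}{r} - \frac{\sigma_n^3}{r^3}\Bigr)\exp\Bigl(-\frac{r^2}{2\sigma_n^2}\Bigr) \le 1 - \gamma_n([-r,r]) \le \frac{2}{\sqrt{2\pi}}\,\frac{\sigma_n}{r}\exp\Bigl(-\frac{r^2}{2\sigma_n^2}\Bigr). \qquad (2.3.5)$$
By the convergence of the series in (2.3.4), we readily get that
$$\lim_{r\to\infty} \sum_{n=1}^{\infty} \frac{\sigma_n}{r}\exp\Bigl(-\frac{r^2}{2\sigma_n^2}\Bigr) = 0.$$
This means that $\gamma(l^\infty) = 1$. Conversely, if $\gamma(l^\infty) = 1$, then for some $r > 0$, one has $\prod_{n=1}^{\infty} \gamma_n([-r, r]) > 0$. By estimate (2.3.5), this implies the convergence of the series $\sum_{n=1}^{\infty} (\sigma_n/r)\exp(-r^2/(2\sigma_n^2))$, hence the convergence of the series in (2.3.4) for sufficiently large $\varepsilon$. If the series in (2.3.4) converges for all $\varepsilon > 0$, then one can construct a sequence of positive numbers $\varepsilon_n \to 0$ such that
$$\sum_{n=1}^{\infty} \exp\Bigl(-\frac{\varepsilon_n}{2\sigma_n^2}\Bigr) < \infty.$$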
As was proved above, the product $\nu$ of the one dimensional Gaussian measures with variances $\sigma_n^2/\varepsilon_n$ is concentrated on $l^\infty$. Since the measure $\gamma$ is the image of $\nu$ under the linear mapping $(x_n) \mapsto (\sqrt{\varepsilon_n}\,x_n)$, which sends $l^\infty$ to $c_0$, we get $\gamma(c_0) = 1$. If the series in (2.3.4) diverges for some $\varepsilon > 0$, then a ball $U$ of a sufficiently small radius in the space $c_0$ has $\gamma$-measure zero. Then $\gamma(U + h) = 0$ for any vector $h$ with finitely many nonzero coordinates, since the shift by such a vector transforms $\gamma$ into an equivalent measure. Since $c_0$ can be covered by a countable union of such shifted balls, we get $\gamma(c_0) = 0$. Finally, note that both claims could be deduced from the Borel-Cantelli lemma (see [697, p. 253, Ch. II, §10]), using the independence of the coordinate functions as random variables on $(\mathbb{R}^\infty, \gamma)$ (see Problem 2.12.27).
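2.3.8. Example. Let $\mu_\alpha$, $\alpha \in A$, be a family of Gaussian measures on locally convex spaces $X_\alpha$. Then the product-measure $\mu = \bigotimes_{\alpha\in A} \mu_\alpha$ is a Gaussian measure on $X = \prod_{\alpha} X_\alpha$. The Cameron-Martin space of $\mu$ is the Hilbert direct sum of the spaces $H(\mu_\alpha)$, i.e.,
$$H(\mu) = \Bigl\{h = (h_\alpha) \in X\colon\ h_\alpha \in H(\mu_\alpha),\ |h|^2_{H(\mu)} = \sum_{\alpha} |h_\alpha|^2_{H(\mu_\alpha)} < \infty\Bigr\}.$$
The space $X^*_\mu$ is the collection of all functions of the form
$$f(x) = \sum_{n=1}^{\infty} f_{\alpha_n}(x_{\alpha_n}), \quad \sum_{n=1}^{\infty} \|f_{\alpha_n}\|^2 < \infty, \quad \alpha_n \in A,\ f_{\alpha_n} \in X^*_{\mu_{\alpha_n}},$$
and $a_\mu(f) = \sum_{\alpha} a_{\mu_\alpha}(f_\alpha)$ for any $f = (f_\alpha) \in X^*$.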
PROOF. The first claim follows from Corollary 2.2.6 and the fact that the dual to $X$ is the direct sum of the spaces $(X_\alpha)^*$, i.e., the general form of a continuous linear functional on $X$ is $f(x) = \sum_{\alpha} f_\alpha(x_\alpha)$, where $x = (x_\alpha)$ and only finitely many $f_\alpha \in (X_\alpha)^*$ are nonzero. By the definition of a product-measure (see Appendix), this yields the announced description of $X^*_\mu$. Further, for any $f$ of the foregoing form, one has $R_\mu(f) = (R_{\mu_\alpha}(f_\alpha))$, whence the claim about $H(\mu)$. The expression for $a_\mu$ is obvious.
The concept of a Gaussian measure is closely related to that of a Gaussian random process. The latter is defined as a family $\xi = (\xi_t)_{t\in T}$ of random variables such that all their finite linear combinations are Gaussian. Indeed, the measure induced by such a process on the path space $\mathbb{R}^T$ with the topology of pointwise convergence is Gaussian.
2.3.9. Proposition. (i) Let $\xi$ be a Gaussian random process on a set $T$. This means that for all $t_1, \ldots, t_n \in T$, the random vector $(\xi_{t_1}, \ldots, \xi_{t_n})$ has a Gaussian distribution. Then the measure $\mu^\xi$ induced by $\xi$ on the space of all functions $\mathbb{R}^T$ with the topology of pointwise convergence is Gaussian.
(ii) Let $T$ be a nonempty set and let $X = \mathbb{R}^T$. Then every function $F$ of the form (2.2.1), i.e.,
$$f \mapsto \exp\Bigl(iL(f) - \frac{1}{2}B(f,f)\Bigr),$$
2.3.
Examples
53
where $L$ is a linear function on $(\mathbb{R}^T)^*$ and $B$ is a symmetric bilinear function on $(\mathbb{R}^T)^*$ such that $f \mapsto B(f, f)$ is nonnegative, is the Fourier transform of some Gaussian measure on $X$.
(iii) Let $\gamma$ be a Gaussian measure on the space $X = \mathbb{R}^T$ with the topology of pointwise convergence. Then $a_\gamma \in X$ and $R_\gamma(X^*_\gamma) \subset X$.
PROOF. The first claim follows directly from the fact that every continuous linear functional on $\mathbb{R}^T$ can be written as $x \mapsto c_1 x(t_1) + \ldots + c_n x(t_n)$ with an appropriate choice of $c_i$ and $t_i$. In order to get (ii), it suffices to apply Kolmogorov's theorem on the existence of a measure with consistent finite dimensional projections (see Theorem A.3.21 in Appendix). The consistency condition required in this theorem is satisfied. Indeed, let $t_1, \ldots, t_n \in T$. It is easily seen that the function
$$y \mapsto \exp\Bigl(i\sum_{j=1}^{n} y_j L(\delta_{t_j}) - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} y_i y_j B(\delta_{t_i}, \delta_{t_j})\Bigr)$$
is the Fourier transform of a unique Gaussian measure $P_{t_1,\ldots,t_n}$ on $\mathbb{R}^n$. Let $\pi_n$ be the natural projection of $\mathbb{R}^{n+1}$ to $\mathbb{R}^n$. Then the Fourier transform of the measure $P_{t_1,\ldots,t_{n+1}}\circ\pi_n^{-1}$ is the function $y \mapsto \widetilde{P}_{t_1,\ldots,t_{n+1}}(y\circ\pi_n)$, which coincides with $\widetilde{P}_{t_1,\ldots,t_n}(y)$, since the functional $y\circ\pi_n$ on $\mathbb{R}^{n+1}$ is given by the inner product with the vector $(y_1, \ldots, y_n, 0)$. Thus, $P_{t_1,\ldots,t_{n+1}}\circ\pi_n^{-1} = P_{t_1,\ldots,t_n}$. The last claim is obvious from the description of the dual to $\mathbb{R}^T$ as the linear span of $\delta_t$, $t \in T$, where $\delta_t(x) = x(t)$.
The function $B$ appearing in representation (2.2.1) for the measure $\gamma$ on $\mathbb{R}^T$ corresponding to the Gaussian random process $\xi$ is uniquely determined by the covariance function $K$ of the process, which is defined by the formula
$$K(s,t) = \mathrm{cov}(\xi_s, \xi_t) = \mathbb{E}\bigl[(\xi_s - \mathbb{E}\xi_s)(\xi_t - \mathbb{E}\xi_t)\bigr].$$
The covariance function is connected with the form $B$ by the equality
$$K(s,t) = B(\delta_s, \delta_t), \qquad (2.3.6)$$
where $\delta_s(x) = x(s)$. This is clear from the fact that linear combinations of such functionals exhaust the dual to $\mathbb{R}^T$. In order that the function $K(\cdot\,,\cdot)$ on $T \times T$ generate a nonnegative quadratic form $B$ on $(\mathbb{R}^T)^*$ by means of (2.3.6) it is necessary and sufficient that for all finite collections $t_1, \ldots, t_n$ of points in $T$, the matrix $(K(t_i, t_j))_{i,j\le n}$ be nonnegative definite. Such functions on $T \times T$ are called nonnegative definite. The simplest example is $K(t, t) = 1$, $K(s, t) = 0$ if $s \ne t$; this corresponds to the product of one dimensional Gaussian measures.
Now we shall see that the previous example is universal in the sense that every Gaussian measure can be regarded as a measure on an appropriate product of the real lines.
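The nonnegative definiteness criterion for covariance functions can be tried out numerically on finite collections of points. The following is only an illustrative sketch: the helper names `gram` and `is_nonnegative_definite` are hypothetical, and the test is a Cholesky-type factorization with a numerical tolerance, which is one of several possible ways to check the condition for a symmetric matrix.

```python
import math


def gram(kernel, points):
    """Matrix (K(t_i, t_j)) for a finite collection of points."""
    return [[kernel(s, t) for t in points] for s in points]


def is_nonnegative_definite(a, tol=1e-10):
    """Cholesky-type test for a symmetric matrix given as a list of rows.

    The factorization a = L L^T succeeds (zero pivots allowed, provided the
    rest of the corresponding column also vanishes) exactly when the
    symmetric matrix is nonnegative definite, up to the tolerance.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if d < -tol:
            return False
        pivot = math.sqrt(d) if d > tol else 0.0
        low[j][j] = pivot
        for i in range(j + 1, n):
            r = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if pivot == 0.0:
                if abs(r) > math.sqrt(tol):
                    return False
            else:
                low[i][j] = r / pivot
    return True


points = [0.1, 0.4, 0.9]
# K(s, t) = min(s, t) is a nonnegative definite function on [0, 1].
print(is_nonnegative_definite(gram(min, points)))
# K(t, t) = 1, K(s, t) = 0 for s != t: product of one dimensional measures.
print(is_nonnegative_definite(gram(lambda s, t: 1.0 if s == t else 0.0, points)))
# |s - t| is symmetric but not nonnegative definite.
print(is_nonnegative_definite(gram(lambda s, t: abs(s - t), [0.0, 1.0])))
```

A check on finitely many points can only refute nonnegative definiteness of a function on $T \times T$; it does not prove it.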
2.3.10. Example. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $T = X^*$. Define an embedding $X \to \mathbb{R}^T$ by the formula $x \mapsto x(\cdot)$, $x(t) := t(x)$, $t \in T$. Then the image of the measure $\gamma$ under this embedding is a Gaussian measure $\gamma'$ on $\mathbb{R}^T$ and
$$\gamma'\bigl(y \in \mathbb{R}^T\colon\ (y(t_1), \ldots, y(t_n)) \in B\bigr) = \gamma\bigl(x \in X\colon\ (t_1(x), \ldots, t_n(x)) \in B\bigr), \quad t_i \in X^*,\ B \in \mathcal{B}(\mathbb{R}^n).$$
PROOF. Every continuous linear functional on $\mathbb{R}^T$ has the following form: $x \mapsto c_1 x(t_1) + \ldots + c_n x(t_n)$, $c_i \in \mathbb{R}^1$, $t_i \in T$. The composition of such a functional with the embedding $X \to \mathbb{R}^T$ is continuous on $X$. Hence this functional has Gaussian distribution with respect to the measure $\gamma'$. The last claim is true by construction.
2.3.11. Example. Suppose that in the situation of Proposition 2.3.9(ii) we have $T = [0,1]$, $L = 0$, $B(\delta_a, \delta_b) = \min(a, b)$, where $\delta_a(x) = x(a)$, and let $B$ be extended by linearity to $(\mathbb{R}^T)^*$ (i.e., to the linear combinations of the functionals $\delta_a$). It is readily verified that the associated quadratic form is nonnegative. The corresponding measure $P^W$ on the path space is called the Wiener measure. For this measure $P^W$ the following equality holds true:
$$\int_X |x(t) - x(s)|^4\, P^W(dx) = 3|t - s|^2. \qquad (2.3.7)$$
In addition, $(P^W)^*(C[0,1]) = 1$ and the measure defined by the formula
$E \mapsto P^W(E_0)$, $E = E_0 \cap C[0,1]$, $E_0 \in \mathcal{E}(\mathbb{R}^T)$, is Gaussian on $C[0,1]$ (it is also called the Wiener measure and denoted by $P^W$).
PROOF. It follows from the definition that the functional $x \mapsto x(t) - x(s)$ on $(\mathbb{R}^T, P^W)$ is a centered Gaussian random variable with variance $|t - s|$, since
$$B(\delta_t - \delta_s, \delta_t - \delta_s) = t + s - 2\min(t, s) = |t - s|.$$
This gives (2.3.7). According to Theorem A.3.22 in Appendix, the space $C[0,1]$ has full outer $P^W$-measure. Hence the formula
$E \cap C[0,1] \mapsto P^W(E)$, $E \in \mathcal{E}(\mathbb{R}^T)$, defines a probability measure on the $\sigma$-field of the sets of such a form. However, by Theorem A.3.7, this $\sigma$-field coincides with the Borel $\sigma$-field of $C[0,1]$. It is clear that $P^W$ is a Gaussian measure on $C[0,1]$ as well, since each element from $C[0,1]^*$ is the limit of a sequence of linear combinations of Dirac's functionals $\delta_t$ in the $*$-weak topology (see Problem A.3.36 in Appendix).
The process considered in the previous example is called a Wiener process or a Brownian motion. In a similar manner one defines a Wiener process on any closed interval. For a Wiener process one can take the function $w_t(\omega) = \omega(t)$, $t \in [0, 1]$, on the probability space $(\Omega, \mathcal{B}, P^W)$, where $\Omega = C[0,1]$, $\mathcal{B} = \mathcal{B}(C[0,1])$. The following are the characteristic properties of the Wiener process:
1) the increments $w_{t_1} - w_{t_0}, \ldots, w_{t_n} - w_{t_{n-1}}$ are independent for all $t_0 < t_1 < \ldots < t_n$;
2) $w_t - w_s$ for $s < t$ has Gaussian distribution with mean 0 and variance $t - s$;
3) the trajectories $t \mapsto w_t(\omega)$ are continuous for a.e. $\omega$.
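The first two properties, together with equality (2.3.7), are consequences of the covariance $\min(s, t)$ alone and can be checked by direct arithmetic. The sketch below is only an illustration; the function names are hypothetical and are not notation from the text. For jointly Gaussian increments, zero covariance is equivalent to independence, which is what the second helper computes.

```python
def wiener_cov(s, t):
    """Covariance function K(s, t) = min(s, t) of the Wiener process."""
    return min(s, t)


def increment_variance(s, t):
    """Variance of w_t - w_s, i.e. B(delta_t - delta_s, delta_t - delta_s)."""
    return wiener_cov(t, t) + wiener_cov(s, s) - 2.0 * wiener_cov(s, t)


def increment_covariance(a, b, c, d):
    """Covariance of the increments w_b - w_a and w_d - w_c."""
    return (wiener_cov(b, d) - wiener_cov(b, c)
            - wiener_cov(a, d) + wiener_cov(a, c))


def gaussian_fourth_moment(variance):
    """Fourth moment of a centered Gaussian variable with the given variance."""
    return 3.0 * variance ** 2


s, t = 0.2, 0.7
var = increment_variance(s, t)
print(var, abs(t - s))                       # both are |t - s| up to rounding
print(gaussian_fourth_moment(var))           # right-hand side of (2.3.7)
print(increment_covariance(0.1, 0.3, 0.3, 0.8))  # nonoverlapping increments
```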
2.3.12. Corollary. The Fourier transform of the measure $P^W$ on $C[0,1]$ has the following representation:
$$\widetilde{P^W}(\lambda) = \exp\Bigl(-\frac{1}{2}\int_0^1\!\!\int_0^1 \min(s, t)\,\lambda(ds)\,\lambda(dt)\Bigr),$$
where the dual to $C[0,1]$ is identified with the space of all Borel signed measures $\lambda$ on $[0, 1]$.
PROOF. If $\lambda$ is a finite linear combination of Dirac's measures $\delta_{t_i}$, then the claim follows from the construction of $P^W$. It remains to be noted that for every Borel measure $\lambda$ on $[0,1]$ there is a sequence of measures $\lambda_j$ that are finite linear combinations of Dirac's measures and converge weakly to $\lambda$ in the sense that $\int \psi\, d\lambda_j \to \int \psi\, d\lambda$ for every continuous function $\psi$ on $[0,1]$ (see Problem A.3.36 in Appendix).
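Note also that for any cylindrical set
$$C = \bigl\{x \in C[0,1]\colon\ (x(t_1), \ldots, x(t_n)) \in B\bigr\}, \quad B \in \mathcal{B}(\mathbb{R}^n),$$
where $0 < t_1 < t_2 < \ldots < t_n$, one has the equality
$$P^W(C) = c_n \int_B \exp\Bigl(-\frac{1}{2} Z(u_1, \ldots, u_n)\Bigr)\, du_1 \ldots du_n,$$
where
$$Z(u_1, \ldots, u_n) = \frac{u_1^2}{t_1} + \frac{(u_2 - u_1)^2}{t_2 - t_1} + \ldots + \frac{(u_n - u_{n-1})^2}{t_n - t_{n-1}},$$
$$c_n = \bigl[(2\pi)^n t_1 (t_2 - t_1) \cdots (t_n - t_{n-1})\bigr]^{-1/2}.$$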
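The density above can be compared with the usual Gaussian density built from the covariance matrix $(\min(t_i, t_j))$. The following sketch does this for $n = 2$, where the inverse covariance matrix is written out by hand; the function names are illustrative assumptions.

```python
import math


def wiener_density(times, u):
    """c_n * exp(-Z(u)/2) for 0 < t_1 < ... < t_n, written via increments."""
    z = 0.0
    norm = 1.0
    prev_t, prev_u = 0.0, 0.0
    for t, x in zip(times, u):
        dt = t - prev_t
        z += (x - prev_u) ** 2 / dt
        norm *= 2.0 * math.pi * dt
        prev_t, prev_u = t, x
    return math.exp(-0.5 * z) / math.sqrt(norm)


def bivariate_density_from_covariance(t1, t2, u1, u2):
    """Centered Gaussian density on R^2 with covariance [[t1, t1], [t1, t2]]."""
    det = t1 * t2 - t1 * t1
    # Quadratic form of the inverse covariance matrix, computed explicitly.
    quad = (t2 * u1 * u1 - 2.0 * t1 * u1 * u2 + t1 * u2 * u2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


a = wiener_density([0.3, 0.8], [0.5, -0.2])
b = bivariate_density_from_covariance(0.3, 0.8, 0.5, -0.2)
print(a, b)  # the two expressions agree up to rounding
```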
2.3.13. Remark. Another frequently used construction of the Wiener measure on $C[0,1]$ is this. Let $\{\xi_n\}$ be a sequence of independent standard Gaussian random variables on a probability space $(\Omega, P)$ and $\{\varphi_n\}$ an orthonormal basis in the space $L^2[0,1]$. Put $e_n(t) = \int_0^t \varphi_n(s)\,ds$. Then for a.e. $\omega$, the series
$$x(t, \omega) := \sum_{n=1}^{\infty} \xi_n(\omega) e_n(t)$$
converges uniformly in $t \in [0,1]$. The measure $P_0$ induced by this random series in $C[0,1]$, i.e., the image of $P$ under the mapping $\omega \mapsto x(\cdot\,, \omega)$, coincides with $P^W$. We shall derive this fact from more general results in Chapter 3; however, direct proofs are available, see, e.g., [430]. Given the convergence of this series, the equality $P_0 = P^W$ is immediate. Indeed, it suffices to show that every continuous linear functional $l$ on $C[0,1]$ has equal distributions with respect to the two measures. Moreover, it suffices to consider $l(x) = \sum_{i=1}^{k} c_i x(t_i)$. Then $l$ is centered Gaussian with respect to both measures. The variance of $l$ with respect to $P_0$ is
$$\sum_{n=1}^{\infty} \Bigl(\sum_{i=1}^{k} c_i e_n(t_i)\Bigr)^2 = \sum_{i,j=1}^{k} c_i c_j \sum_{n=1}^{\infty} e_n(t_i) e_n(t_j).$$
This is exactly the variance of $l$ with respect to $P^W$, since
$$\sum_{n=1}^{\infty} e_n(t_i) e_n(t_j) = \sum_{n=1}^{\infty} (I_{[0,t_i]}, \varphi_n)_{L^2[0,1]} (I_{[0,t_j]}, \varphi_n)_{L^2[0,1]} = (I_{[0,t_i]}, I_{[0,t_j]})_{L^2[0,1]} = \min(t_i, t_j).$$
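The identity $\sum_n e_n(s) e_n(t) = \min(s, t)$ can be observed numerically for a concrete basis. The sketch below assumes the particular choice $\varphi_n(s) = \sqrt{2}\cos((n - 1/2)\pi s)$, which is one orthonormal basis of $L^2[0,1]$; the remark itself does not fix a basis, and the function names are hypothetical.

```python
import math


def e_n(n, t):
    """e_n(t) = integral over [0, t] of sqrt(2) * cos((n - 1/2) * pi * s) ds."""
    freq = (n - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(freq * t) / freq


def partial_covariance(s, t, terms):
    """Partial sum of the series sum_n e_n(s) * e_n(t)."""
    return sum(e_n(n, s) * e_n(n, t) for n in range(1, terms + 1))


for terms in (10, 100, 2000):
    # The partial sums approach min(0.3, 0.7) = 0.3 as the number of terms grows.
    print(terms, partial_covariance(0.3, 0.7, terms))
```

The terms are of order $n^{-2}$, so the error after $N$ terms is of order $1/N$.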
Let us evaluate the eigenvectors and eigenvalues of the operator $R_W$ with kernel $\min(t, s)$ on $L^2[0, 1]$, which is the covariance operator of the Wiener measure $P^W$ considered on $L^2[0, 1]$. Since this is a symmetric compact operator, its eigenvectors form an orthonormal basis in $L^2[0, 1]$. Let $\int_0^1 \min(t,s) f_\lambda(s)\,ds = \lambda f_\lambda(t)$ a.e. Then
$$\int_0^t s f_\lambda(s)\,ds + t\int_t^1 f_\lambda(s)\,ds = \lambda f_\lambda(t) \quad \text{a.e.}$$
It is easy to deduce from this relationship that $f_\lambda$ admits a twice continuously differentiable modification satisfying the differential equation
$$\lambda f_\lambda''(t) = -f_\lambda(t), \quad f_\lambda(0) = f_\lambda'(1) = 0.$$
Indeed, if $\lambda \ne 0$, then $f_\lambda$ has an absolutely continuous version, whose derivative satisfies a.e. the equation
$$\int_t^1 f_\lambda(s)\,ds = \lambda f_\lambda'(t).$$
If $\lambda = 0$, then by a similar reasoning we conclude that $f_\lambda = 0$ a.e. Therefore, we get a countable collection
$$f_n(t) = \sqrt{2}\,\sin\frac{t}{\sqrt{\lambda_n}}, \quad \lambda_n = \pi^{-2}\Bigl(n - \frac{1}{2}\Bigr)^{-2}, \quad n \in \mathbb{N}.$$
In particular, this evaluation shows that the embedding to $L^2[0,1]$ of the space $H(P^W)$, which coincides as already explained with $\sqrt{R_W}(L^2[0, 1])$, is not a nuclear operator (but, clearly, it is a Hilbert-Schmidt operator). It will be shown in the next chapter that the space $H(P^W)$ does not depend on whether $P^W$ is considered on $C[0,1]$ or on $L^2[0,1]$, so that we describe the Cameron-Martin space of the Wiener measure on $C[0, 1]$ as well.
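The eigenvalue computation can be sanity-checked numerically. The sketch below applies the operator with kernel $\min(t, s)$ to $f_n$ by a midpoint quadrature rule (an assumption of this illustration, as are the function names), and looks at the sums of $\lambda_n$ and of $\sqrt{\lambda_n}$, which govern the Hilbert-Schmidt and nuclear properties of the embedding mentioned above.

```python
import math


def eigenvalue(n):
    """lambda_n = pi^{-2} (n - 1/2)^{-2}."""
    return 1.0 / (math.pi * (n - 0.5)) ** 2


def eigenfunction(n, t):
    """f_n(t) = sqrt(2) sin(t / sqrt(lambda_n)) = sqrt(2) sin((n - 1/2) pi t)."""
    return math.sqrt(2.0) * math.sin((n - 0.5) * math.pi * t)


def apply_covariance_operator(f, t, steps=4000):
    """Midpoint-rule approximation of the integral of min(t, s) f(s) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(min(t, (k + 0.5) * h) * f((k + 0.5) * h) for k in range(steps))


t = 0.6
for n in (1, 2, 3):
    lhs = apply_covariance_operator(lambda s: eigenfunction(n, s), t)
    rhs = eigenvalue(n) * eigenfunction(n, t)
    print(n, lhs, rhs)  # the two columns agree up to quadrature error

# The eigenvalues are summable (their sum is the integral of min(t, t), i.e. 1/2),
# while the sums of sqrt(lambda_n) grow like log(N) / pi without bound.
print(sum(eigenvalue(n) for n in range(1, 5001)))
print(sum(math.sqrt(eigenvalue(n)) for n in range(1, 10001)))
```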
2.3.14. Lemma. The space $H(P^W)$ coincides with the Sobolev class $W_0^{2,1}[0,1]$ of the functions $f$ on $[0, 1]$ such that $f$ is absolutely continuous, $f' \in L^2[0,1]$ and $f(0) = 0$. In addition, $|f|_{H(P^W)} = \|f'\|_{L^2[0,1]}$.
PROOF. Since $H(P^W)$ is the same for $P^W$ on $C[0,1]$ and $L^2[0,1]$, we shall deal with $L^2[0,1]$ in this proof. The function $f$ belongs to $\sqrt{R_W}(L^2[0,1])$ precisely when its coefficients $c_n$ in the basis $\{f_n\}$ satisfy the condition $\sum_{n=1}^{\infty} c_n^2 (n - 1/2)^2 < \infty$. The latter is equivalent to the existence of a function $g \in L^2[0,1]$ such that
$$(g, \psi_n)_2 = \pi c_n \Bigl(n - \frac{1}{2}\Bigr),$$
where $\psi_n(t) = \sqrt{2}\cos(t/\sqrt{\lambda_n})$ (note that $\{\psi_n\}$ is an orthonormal system). If such a function exists, then its primitive is $f$. Conversely, if $f$ is absolutely continuous, $f(0) = 0$ and $g = f' \in L^2[0,1]$, then we get
$$\pi\Bigl(n - \frac{1}{2}\Bigr) c_n = -\int_0^1 f(t)\psi_n'(t)\,dt = \int_0^1 g(t)\psi_n(t)\,dt = (g, \psi_n)_2.$$
Since the system $\{\psi_n\}$ is orthonormal, one has $\sum_{n=1}^{\infty} (g, \psi_n)_2^2 < \infty$. It is readily verified that $|f|_{H(P^W)} = \|f'\|_{L^2[0,1]}$. To this end, it suffices to note that $\{\psi_n\}$ is an orthonormal basis in $L^2[0, 1]$. For another proof see Example 3.3.9.
The Wiener process in $\mathbb{R}^n$ is defined as $w_t = (w_t^1, \ldots, w_t^n)$, where $w^k$ are independent real Wiener processes. The measure $P^W$ induced by this process on $C([0,T], \mathbb{R}^n)$ (or on $C([a,b], \mathbb{R}^n)$, $L^2([a,b], \mathbb{R}^n)$, etc.) is called the Wiener measure as well. The corresponding Cameron-Martin space coincides with the space $W_0^{2,1}([0, T], \mathbb{R}^n)$ of all absolutely continuous $\mathbb{R}^n$-valued functions $h$ on $[0, T]$ such that $h(0) = 0$ and $|h'| \in L^2[0,T]$. The Cameron-Martin norm is given by $\|h'\|_{L^2([0,T],\mathbb{R}^n)}$. These facts can be either deduced from the one dimensional case or proved by a similar reasoning.
2.3.15. Example. (i) The fractional Brownian motion is defined as a centered Gaussian random process $\xi^\alpha$ on $[0, T]$ (where $T \in (0, \infty]$) with the covariance function
$$K(s, t) = \frac{s^\alpha + t^\alpha - |s - t|^\alpha}{2}, \quad \alpha \in (0, 2].$$
(ii) The stationary Ornstein-Uhlenbeck process $\xi = (\xi_t)_{t\in T}$ on a set $T \subset \mathbb{R}^1$ is defined as a centered Gaussian process with the covariance function
$K(s, t) = e^{-|t-s|}$. The stationary Ornstein-Uhlenbeck process can be expressed by means of the Wiener process by virtue of the formula $\xi_t = e^{-t} w_{e^{2t}}$. For this process, every random variable $\xi_t$ is standard Gaussian.
(iii) Let $\beta > 0$ and $\sigma > 0$. The Ornstein-Uhlenbeck process with the parameters $\beta$ and $\sigma$ and the initial normal distribution having mean $m_0$ and variance $\sigma_0^2$ is defined as a Gaussian process with mean $m(t) = e^{-\beta t} m_0$ and the covariance function
$$\mathrm{cov}(t, s) = e^{-(t+s)\beta}\Bigl[\sigma_0^2 + \frac{\sigma^2}{2\beta}\bigl(e^{2\beta\min(t,s)} - 1\bigr)\Bigr].$$
If $\sigma = \sigma_0 = 1$, $m_0 = 0$ and $\beta = 1/2$, we arrive at the stationary Ornstein-Uhlenbeck process. Below we shall find a representation of the Ornstein-Uhlenbeck process by means of stochastic integrals.
(iv) The fractional Ornstein-Uhlenbeck process is a centered Gaussian process with the covariance function
$$K(t, s) = \exp(-|t - s|^\alpha), \quad 0 < \alpha < 2.$$
(v) The Brownian bridge is the process $w_t^0 := w_t - t w_1$ on $T = [0, 1]$. Its covariance function has the form $\min(s, t) - st$.
(vi) The Wiener field (or Brownian sheet) is a centered Gaussian process on $T = [0,1]^d$ with the covariance function
$$K(s, t) = \prod_{i=1}^{d} \min(s_i, t_i).$$
The Wiener field introduced in [157] is also called the Chentsov-Wiener field.
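Several of the covariance functions in this example can be related to $\min(s, t)$ by elementary arithmetic. The sketch below, with hypothetical function names, checks three such relations: the fractional Brownian covariance with $\alpha = 1$ reduces to $\min(s, t)$, the process $e^{-t} w_{e^{2t}}$ has covariance $e^{-|t-s|}$, and $w_t - t w_1$ has covariance $\min(s, t) - st$.

```python
import math


def fbm_cov(s, t, alpha):
    """Covariance of the fractional Brownian motion with parameter alpha."""
    return 0.5 * (s ** alpha + t ** alpha - abs(s - t) ** alpha)


def time_changed_wiener_cov(s, t):
    """Covariance of exp(-t) * w_{exp(2t)}, computed from cov(w_a, w_b) = min(a, b)."""
    return math.exp(-s - t) * min(math.exp(2.0 * s), math.exp(2.0 * t))


def bridge_cov_from_wiener(s, t):
    """Covariance of w_s - s*w_1 and w_t - t*w_1, expanded by bilinearity."""
    return min(s, t) - s * min(1.0, t) - t * min(s, 1.0) + s * t * 1.0


def sheet_cov(s, t):
    """Covariance of the Wiener field at points s, t of [0, 1]^d."""
    result = 1.0
    for si, ti in zip(s, t):
        result *= min(si, ti)
    return result


s, t = 0.3, 0.8
print(fbm_cov(s, t, 1.0), min(s, t))
print(time_changed_wiener_cov(s, t), math.exp(-abs(t - s)))
print(bridge_cov_from_wiener(s, t), min(s, t) - s * t)
print(sheet_cov((0.2, 0.5), (0.4, 0.3)))
```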
In the case where the parametric set $T$ of a random process $\xi = (\xi_t)_{t\in T}$ on a complete probability space $(\Omega, \mathcal{F}, P)$ is endowed with a $\sigma$-field $\mathcal{T}$, one can talk of the measurability of the process in both variables. The process $\xi$ is called measurable if the function $(t, \omega) \mapsto \xi_t(\omega)$ is measurable with respect to the $\sigma$-field $\mathcal{T}\otimes\mathcal{F}$. The next result shows that any measurable Gaussian process with the trajectories from $L^p$ induces a Gaussian measure on $L^p$.
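2.3.16. Example. Let $(T, \mathcal{T}, \sigma)$ be a measurable space with a measure $\sigma$ and let $\xi = (\xi_t)_{t\in T}$ be a measurable Gaussian process on $T$. Suppose that
$$\int_T |\xi(t, \omega)|^p\, \sigma(dt) < \infty \ \text{ for a.e. } \omega, \ \text{ where } p > 1. \qquad (2.3.8)$$
Then the formula
$$\gamma(B) = P\bigl(\omega\colon\ \xi(\cdot\,, \omega) \in B\bigr), \quad B \in \mathcal{E}(L^p(\sigma)), \qquad (2.3.9)$$
defines a Gaussian measure on $L^p(\sigma)$. In addition,
$$\int_T |\xi(t, \cdot)|^p\, \sigma(dt) \in L^r(P) \ \text{ for all } r \in [1, \infty). \qquad (2.3.10)$$
Finally, (2.3.8) is equivalent to the condition
$$\int_T \bigl[K_\xi(t, t)^{p/2} + |m(t)|^p\bigr]\, \sigma(dt) < \infty, \qquad (2.3.11)$$
where $K_\xi$ is the covariance function of the process $\xi$ and $m$ is its mean.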
PROOF. Let $1/q + 1/p = 1$. Recall that $L^q(\sigma) = L^p(\sigma)^*$. Let us show that for any $\psi \in L^q(\sigma)$, the function
$$\eta(\omega) = \int_T \xi(t, \omega)\psi(t)\, \sigma(dt)$$
is a Gaussian random variable. The measurability of this function is readily deduced from the measurability of $\xi$ in both variables and Fubini's theorem. Let $\psi_n = \psi I_{A_n}$, $A_n = \{t\colon |\psi(t)|^2\, \mathbb{E}\xi(t,\omega)^2 \le n\}$. It suffices to prove our claim for $\psi_n$ replacing $\psi$ for each $n \in \mathbb{N}$, since the corresponding random variables converge a.s. to the initial one. In this case $\eta \in L^2(P)$. Denote by $G$ the closed linear subspace in $L^2(P)$ generated by the random variables $\xi(t, \cdot)$, $t \in T$.
2.4.
The Cameron-Martin space
59
Since $G$ consists of Gaussian random variables, it suffices to show that $\eta \in G$. Let $\eta = \eta_1 + \eta_2$, where $\eta_1 \in G$, $\eta_2 \perp G$. By Fubini's theorem
$$\mathbb{E}\eta_2^2 = \mathbb{E}\eta\eta_2 = \int_T \Bigl[\int_\Omega \xi(t, \omega)\eta_2(\omega)\, P(d\omega)\Bigr]\psi(t)\, \sigma(dt) = 0,$$
whence $\eta_2 = 0$. Therefore, the right-hand side of (2.3.9) is a Gaussian measure on $L^p(\sigma)$. The claim about the integrability in (2.3.10) follows from Fernique's theorem 2.8.5 and its Corollary 2.8.6, proved below. In order to apply Fernique's theorem note that if $\sigma$ is separable (i.e., $L^1(\sigma)$ is separable), then the norm of the space $X = L^p(\sigma)$ is $\mathcal{E}(X)$-measurable. In the general case, it follows by the measurability of the process that there exists a sequence of functions $g_k \in L^q(\sigma)$ with unit norms such that
$$\|\xi(\cdot\,, \omega)\|_{L^p(\sigma)} = \sup_k \int_T \xi(t, \omega) g_k(t)\, \sigma(dt) \quad P\text{-a.e.}$$
Indeed, it suffices to show this for the processes $\xi_n(t, \omega) = I_{B_n}(t)\xi(t, \omega)$, where $B_n = \{t\colon \mathbb{E}|\xi(t, \omega)|^p \le n\}$. Hence we may assume that the process $\xi(t, \omega)$ is in $L^p(\sigma\otimes P)$. Approximating this process in $L^p(\sigma\otimes P)$ by finite sums of functions $\varphi(t)\theta(\omega)$, where $\varphi \in L^p(\sigma)$ and $\theta \in L^p(P)$, we get a sequence $\{\varphi_n\} \subset L^p(\sigma)$ such that, for almost all $\omega$, the function $\xi(\cdot\,, \omega)$ belongs to the closed separable subspace $E$ of $L^p(\sigma)$ generated by $\{\varphi_n\}$. It remains to note that the norm on $E$ is given by $\sup_n |l_n|$ for some $l_n \in E^*$ with unit norms.
Let us prove (2.3.11). If { is centered, then (2.3.11) follows from (2.3.8), since by the Gaussian property, there exists c> > 0 such that
xF(t.t) p
= (EIt;(t.w),,)pl2 0 such that EIW,w} - m(t)IP < c2 (EIC(t,w) -
m(t)1.,)p12
= c24Kt(t,
t)ip12.
11
2.4. The Cameron-Martin space
2.4.1. Lemma. A vector $h$ from $X$ belongs to the Cameron-Martin space $H(\gamma)$ precisely when there exists $g \in X^*_\gamma$ with $h = R_\gamma(g)$. In this case, $|h|_{H(\gamma)} = \|g\|_{L^2(\gamma)}$.
PROOF. If $|h|_{H(\gamma)} < \infty$, then by the Riesz theorem there exists an element $g \in X^*_\gamma$ such that
$f(h) = (f - a_\gamma(f), g)_{L^2(\gamma)}$ for all $f \in X^*$, whence $h = R_\gamma(g)$. Conversely, we have $|h|_{H(\gamma)} = \|g\|_{L^2(\gamma)} < \infty$ for every such $h$.
If $h = R_\gamma g$, then we say that the element $g$ is associated with the vector $h$ or is generated by $h$. In this case we use the notation
$\widehat{h} := g$. The relationship determining $\widehat{h}$ is
$$f(h) = \int_X \bigl[f(x) - a_\gamma(f)\bigr]\widehat{h}(x)\, \gamma(dx), \quad \forall f \in X^*.$$
The Cameron-Martin space $H(\gamma)$ is equipped with the inner product
$$(h, k)_{H(\gamma)} := (\widehat{h}, \widehat{k})_{L^2(\gamma)} = \int_X \widehat{h}(x)\widehat{k}(x)\, \gamma(dx).$$
The corresponding norm is $|h|_{H(\gamma)} = \|\widehat{h}\|_{L^2(\gamma)}$. Note that the inner product of any Euclidean space $H$ is uniquely determined by its norm, since $4(x, y) = |x + y|^2 - |x - y|^2$. It is clear from the previous lemma that if $R_\gamma(X^*_\gamma) \subset X$, then $H(\gamma) = R_\gamma(X^*_\gamma)$.
In this case $H(\gamma)$ with the norm $|R_\gamma(f)|_{H(\gamma)} = \sqrt{R_\gamma(f)(f)}$ turns out to be a Hilbert space, and the mapping $R_\gamma$ is an isomorphism between $X^*_\gamma$ and $H(\gamma)$.
If $a_\gamma = 0$, then $|R_\gamma(f)|_{H(\gamma)} = \|f\|_{L^2(\gamma)}$.
2.4.2. Proposition. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $g \in X^*_\gamma$. Then the measure $\nu$ on $X$ given by the density
$$\varrho(x) = \exp\Bigl(g(x) - \frac{1}{2}\sigma(g)^2\Bigr)$$
with respect to the measure $\gamma$ is a Gaussian measure with the Fourier transform
$$\widetilde{\nu}(f) = \exp\Bigl(iR_\gamma(g)(f) + ia_\gamma(f) - \frac{1}{2}\sigma(f)^2\Bigr). \qquad (2.4.1)$$
PROOF. Since $g$ is Gaussian, the function $\exp|g|$ is integrable with respect to the measure $\gamma$. Hence the function $\varrho$ defines a finite measure $\nu$. Let $f \in X^*$. Put
$k = \exp\bigl[ia_\gamma(f) - \frac{1}{2}\sigma(g)^2\bigr]$. Let us consider the following function of real argument $z$:
$$\varphi(z) = k\int_X \exp\bigl[i\bigl(f(x) - a_\gamma(f) - zg(x)\bigr)\bigr]\, \gamma(dx).$$
By virtue of the inclusion f - a. (f) - zg E X. one has n. Since
x n=1
x
n-Z
x
n-2
Ifn(x)I y(am) :5 n=1
we get the linear space
$$L = \Bigl\{x \in X\colon\ \text{the series } \sum_{n=1}^{\infty} n^{-2} f_n(x) \text{ converges}\Bigr\},$$
having full measure and belonging to $\mathcal{E}(X)$. By construction, $h$ does not belong to the set $L$.
Finally, if $X^*_\gamma$ is infinite dimensional, then there exists an infinite orthonormal sequence $\{f_n\} \subset X^*$. It is easy to see that for every $M > 0$ the set
$$\Bigl\{x \in X\colon\ \sup_n |f_n(x)| \le M\Bigr\}$$
has $\gamma$-measure zero. Therefore, the set
$$\Bigl\{x \in X\colon\ \sum_{n=1}^{\infty} f_n(x)^2 < \infty\Bigr\}$$
has $\gamma$-measure zero as well. In addition, it contains $H(\gamma)$, since $f_n(h) = (f_n, g)_{L^2(\gamma)}$ if $h = R_\gamma(g)$, $g \in X^*_\gamma$.
2.4.8. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$. For any $p \in (1, \infty)$, $r > p$ and any function $f \in L^r(\gamma)$, the mapping $(H(\gamma), |\cdot|_{H(\gamma)}) \to L^p(\gamma)$, $h \mapsto f(\cdot + h)$, is continuous. If $p = 1$, then the same is true for all functions $f \in L^\infty(\gamma)$.
PROOF. This result is deduced from Corollary 2.4.3 as follows. For every cylindrical function $f$ of the form $f(x) = \varphi(l_1(x), \ldots, l_n(x))$, where $\varphi \in C_b(\mathbb{R}^n)$ and $l_i \in X^*$, the mapping above is continuous by Lebesgue's theorem. Let $f \in L^r(\gamma)$, where $r > p$. There exists a sequence $\{f_j\}$ of cylindrical functions of the indicated type that converges to $f$ in $L^r(\gamma)$. There exist numbers $t$ and $s$ such that $t > 1$, $tp = r$, $t^{-1} + s^{-1} = 1$. Recall that the element $\widehat{h} = R_\gamma^{-1} h \in X^*_\gamma$ corresponding to the vector $h$ is a centered Gaussian random variable on $(X, \gamma)$. Hence,
$$\int_X \exp(s\widehat{h})\, d\gamma = \exp\Bigl(\frac{s^2}{2}|h|^2_{H(\gamma)}\Bigr).$$
According to Corollary 2.4.3 and Hölder's inequality, we get
$$\int_X |f(x + h) - f_j(x + h)|^p\, \gamma(dx) = \int_X |f(z) - f_j(z)|^p \exp\Bigl(\widehat{h}(z) - \frac{1}{2}|h|^2_{H(\gamma)}\Bigr)\, \gamma(dz)$$
$$\le \Bigl\{\int_X \exp\Bigl(s\widehat{h}(z) - \frac{s}{2}|h|^2_{H(\gamma)}\Bigr)\, \gamma(dz)\Bigr\}^{1/s} \Bigl\{\int_X |f(z) - f_j(z)|^r\, \gamma(dz)\Bigr\}^{1/t}$$
$$= \exp\Bigl(\frac{s - 1}{2}|h|^2_{H(\gamma)}\Bigr) \Bigl\{\int_X |f(z) - f_j(z)|^r\, \gamma(dz)\Bigr\}^{1/t},$$
which tends to zero, as $j \to \infty$, uniformly in $h \in H(\gamma)$ with $|h|_{H(\gamma)} \le R$, for every fixed $R > 0$. Therefore, the mapping $h \mapsto f(\cdot + h)$ is continuous. In the case when $p = 1$ and $f \in L^\infty(\gamma)$, the proof is analogous with the only difference that the sequence $\{f_j\}$ should be taken to be convergent to $f$ in measure with $\sup_j |f_j| \le \|f\|_{L^\infty(\gamma)}$, and instead of Hölder's inequality one has to use the Lebesgue-Vitali theorem on uniformly integrable sequences.
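The change of variables used in the proof can be illustrated in the simplest case. The sketch below assumes that $\gamma$ is the standard Gaussian measure on the real line, so that $H(\gamma) = \mathbb{R}^1$, $\widehat{h}(z) = hz$ and $|h|_{H(\gamma)} = |h|$; the integrals are approximated by a midpoint rule on a truncated interval, and all function names are hypothetical.

```python
import math


def gauss_density(x):
    """Density of the standard Gaussian measure on the real line."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(func, lo=-12.0, hi=12.0, steps=20000):
    """Midpoint rule; the Gaussian tails outside [lo, hi] are negligible here."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def shifted_integral(f, h):
    """Integral of f(x + h) with respect to the standard Gaussian measure."""
    return integrate(lambda x: f(x + h) * gauss_density(x))


def weighted_integral(f, h):
    """Integral of f(z) * exp(h*z - h^2/2) with respect to the same measure."""
    return integrate(lambda z: f(z) * math.exp(h * z - 0.5 * h * h) * gauss_density(z))


h = 0.7
print(shifted_integral(math.cos, h), weighted_integral(math.cos, h))

# Exponential moment used before Hoelder's inequality: exp(s^2 |h|^2 / 2).
s = 2.0
print(integrate(lambda z: math.exp(s * h * z) * gauss_density(z)),
      math.exp(0.5 * s * s * h * h))
```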
2.4.9. Corollary. Let $\gamma$ be a Gaussian measure on a locally convex space and let $A$ be a set of positive $\gamma$-measure. Then there exists $c > 0$ such that $cU_H \subset A - A$, where $U_H$ is the closed unit ball in the space $H(\gamma)$.
PROOF. The function $h \mapsto \gamma((A + h) \cap A)$ is positive at zero and continuous by Theorem 2.4.8 applied to the indicator function of $A$, since
$$\gamma((A + h) \cap A) = \int_X I_A(x - h) I_A(x)\, \gamma(dx).$$
Therefore, there exists $c > 0$ such that $\gamma((A + h) \cap A) > 0$ for all $h \in cU_H$. Clearly, $h \in A - A$ for such $h$.
2.4.10. Proposition. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ such that its Cameron-Martin space $H(\gamma)$ is separable with respect to the norm $|\cdot|_{H(\gamma)}$. Suppose that $\Omega$ is a set of full $\gamma$-measure. Then, for $\gamma$-a.e. $x$, the set
$H(\gamma) \cap (\Omega - x)$ is dense in the space $H(\gamma)$ with respect to the norm $|\cdot|_{H(\gamma)}$.
PROOF. Let us take an orthonormal basis $\{e_n\}$ in $H(\gamma)$. Denote by $\{h_n\}$ the countable set of all finite linear combinations of the vectors $e_n$ with rational coefficients. Since $h_n \in H(\gamma)$, one has $\gamma(\Omega + h_n) = 1$. Hence $\gamma(\Omega \cap (\Omega + h_n)) = 1$. Therefore, the set $\Omega_n := \{x \in \Omega\colon x + h_n \in \Omega\}$ has measure 1 as well. Put $Y = \bigcap_{n=1}^{\infty} \Omega_n$. Then $\gamma(Y) = 1$. If $x \in Y$, then $\Omega - x$ contains $h_n$ for every $n$, whence the conclusion.
2.5. Zero-one laws
2.5.1. Lemma. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$. The collection of the functions of the form $H_{k_1}(l_1) \cdots H_{k_n}(l_n)$, where the $H_{k_i}$'s are Hermite polynomials and $l_i \in X^*$, is dense in $L^2(\gamma)$.
PROOF. The functions of the form $\varphi(l_1, \ldots, l_n)$, where $\varphi$ is a bounded Borel function on $\mathbb{R}^n$, are dense in $L^2(\gamma)$. Hence the claim follows from the corresponding finite dimensional result.
2.5.2. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ such that $R_\gamma(X^*) \subset X$. Suppose that a set $A \in \mathcal{E}(X)_\gamma$ satisfies the condition
$$\gamma(A + h) = \gamma(A), \quad \forall h \in R_\gamma(X^*).$$
Then either $\gamma(A) = 1$ or $\gamma(A) = 0$. In addition, if $f$ is a $\gamma$-measurable function such that for every $h \in R_\gamma(X^*)$ one has
$f(x + h) = f(x)$ $\gamma$-a.e., then $f$ coincides a.e. with a constant.
PROOF. To simplify notation let us suppose that $a_\gamma = 0$. Let $h_1 = R_\gamma(l_1), \ldots, h_n = R_\gamma(l_n)$, where $l_i \in X^*$ and the vectors $h_i$ are orthonormal in $H(\gamma)$. By condition and the Cameron-Martin formula, the function
$$F(t_1, \ldots, t_n) = \gamma(A - t_1 h_1 - \ldots - t_n h_n) = \int_A \exp\Bigl(\sum_{i=1}^{n} t_i l_i(x) - \frac{1}{2}\Bigl|\sum_{i=1}^{n} t_i h_i\Bigr|^2_{H(\gamma)}\Bigr)\, \gamma(dx)$$
is constant. Therefore, for any collection $m_1, \ldots, m_n$ of nonnegative integers not vanishing simultaneously, we have
$$\frac{\partial^{m_1 + \ldots + m_n} F}{\partial t_1^{m_1} \cdots \partial t_n^{m_n}}(0, \ldots, 0) = 0.$$
Since the vectors $h_i$ are orthonormal, Lemma 1.3.2(iii) yields
$$\int_X H_{m_1}(l_1(x)) \cdots H_{m_n}(l_n(x))\, I_A(x)\, \gamma(dx) = 0,$$
where $H_k$ is the $k$-th Hermite polynomial. Thus, the function $I_A$ is orthogonal to all the polynomials $H_{m_1}(l_1) \cdots H_{m_n}(l_n)$, where $m_i$ are not zero simultaneously and $l_i \in X^*$ are mutually orthogonal in $X^*_\gamma$. This means that $I_A$ is a constant. This constant can be only 0 or 1. Considering the sets $\{f < c\}$, we get the claim for functions.
It is clear that it is sufficient to require the condition of Theorem 2.5.2 for all vectors from $R_\gamma(Y)$, where a linear space $Y$ is dense in $X^*$ with the $L^2$-norm.
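The role of the Hermite polynomials in this proof rests on their generating function: the derivatives in $t$ of $\exp(tx - t^2/2)$ at $t = 0$ are Hermite polynomials in $x$. The sketch below assumes the normalization $H_k = He_k/\sqrt{k!}$, where $He_k$ are the probabilists' Hermite polynomials given by the three-term recurrence; this normalization is an assumption of the illustration (Lemma 1.3.2 is not reproduced in this section), and the function names are hypothetical.

```python
import math


def hermite(k, x):
    """Normalized Hermite polynomial He_k(x) / sqrt(k!), via the recurrence
    He_{j+1}(x) = x * He_j(x) - j * He_{j-1}(x), He_0 = 1, He_1 = x."""
    prev, cur = 1.0, x
    if k == 0:
        return 1.0
    for j in range(1, k):
        prev, cur = cur, x * cur - j * prev
    return cur / math.sqrt(math.factorial(k))


def gauss_inner_product(f, g, lo=-12.0, hi=12.0, steps=4000):
    """Midpoint-rule approximation of the L^2 inner product with respect to
    the standard Gaussian measure on the real line."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f(x) * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)


def generating_series(t, x, terms=25):
    """Partial sum of sum_k t^k / sqrt(k!) * H_k(x), which expands exp(t*x - t^2/2)."""
    return sum(t ** k / math.sqrt(math.factorial(k)) * hermite(k, x)
               for k in range(terms))


# Orthonormality of H_0, ..., H_3 with respect to the standard Gaussian measure.
for m in range(4):
    print([round(gauss_inner_product(lambda x: hermite(m, x),
                                     lambda x: hermite(n, x)), 6)
           for n in range(4)])

print(generating_series(0.3, 0.8), math.exp(0.3 * 0.8 - 0.5 * 0.3 ** 2))
```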
2.5.3. Corollary. Let the measure $\gamma$ be the same as in Theorem 2.5.2 and let $A$ be a $\gamma$-measurable set such that
$$\gamma(A \setminus (A + h)) = 0, \quad \forall h \in R_\gamma(X^*). \qquad (2.5.1)$$
Then either $\gamma(A) = 1$ or $\gamma(A) = 0$.
PROOF. Let $h \in R_\gamma(X^*)$. Then $-h \in R_\gamma(X^*)$. By (2.5.1), we have the equality
$\gamma(A \setminus (A - h)) = 0$. Since $\gamma_{-h} \sim \gamma$, we get $\gamma((A + h) \setminus A) = 0$, which together with (2.5.1) yields $\gamma(A + h) = \gamma(A)$.
Note that in both claims the measurability of $A + h$ follows automatically from that of $A$, since the measures $\gamma_{-h}$ and $\gamma$ are equivalent. The following more general fact can be easily seen from the proof of the previous theorem.
2.5.4. Corollary. Suppose that in Theorem 2.5.2 the space $H(\gamma)$ is separable, $\{e_n\}$ is its orthonormal basis and a set $A$ measurable with respect to the measure $\gamma$ has the property that
$A + re_n = A$ up to a set of $\gamma$-measure zero for all rational $r$ and all $n \in \mathbb{N}$. Then either $\gamma(A) = 0$ or $\gamma(A) = 1$.
It follows from Theorem 2.5.2 and Corollary 2.4.9 that for any measurable linear subspace $L$ in a locally convex space with a Gaussian measure $\gamma$ (satisfying the condition in Theorem 2.5.2) the following alternative takes place: either $\gamma(L) = 0$ or $\gamma(L) = 1$. Our next theorem shows that the same is true for any Gaussian measure. We use the argument suggested by Fernique [709] for the larger class of stable measures. Recall that an affine subspace in a linear space $X$ is a set of the form $v + L$, where $v \in X$ and $L \subset X$ is a linear subspace.
2.5.5. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $L$ be a $\gamma$-measurable affine subspace in $X$. Then either $\gamma(L) = 1$ or $\gamma(L) = 0$.
PROOF. It suffices to prove the claim for linear subspaces $L$. Further, we may assume that $L \in \mathcal{E}(X)$. Indeed, if $L$ has positive measure, then, by Lemma 2.1.5, there exist a compact set $K \subset \mathbb{R}^\infty$ and a continuous linear mapping $F\colon X \to \mathbb{R}^\infty$ such that the set $A = F^{-1}(K) \subset L$ has positive measure. Clearly, the linear span $S$ of $K$ is Borel, hence $L_0 = F^{-1}(S)$ is a positive measure linear subspace from $\mathcal{E}(X)$. Suppose first that $\gamma$ is centered. Let $\xi$ and $\eta$ be two independent random vectors in $X$ with distribution $\gamma$ (e.g., one can take the measure $P = \gamma\otimes\gamma$ on $X \times X$ and put $\xi(x, y) = x$, $\eta(x, y) = y$ for $(x, y) \in X \times X$). Put
$$\Omega_n = \{\xi \notin L\} \cap \{\eta + n\xi \in L\}, \quad n \in \mathbb{N}.$$
By the linearity of $L$, $\xi(\omega)$ and $\eta(\omega)$ belong to $L$ precisely when $\xi$ and $\eta + n\xi$ do. Hence, taking into account that $\eta + n\xi$ has the same distribution as $\sqrt{1 + n^2}\,\xi$, we get
$$P(\Omega_n) = P(\eta + n\xi \in L) - P(\{\xi \in L\} \cap \{\eta + n\xi \in L\})$$
$$= P(\sqrt{1 + n^2}\,\xi \in L) - P(\{\xi \in L\} \cap \{\eta \in L\}) = P(\xi \in L) - P(\xi \in L)^2.$$
Using again the linearity of $L$, we see that the sets $\Omega_n$, $n \in \mathbb{N}$, are disjoint. Therefore, $P(\xi \in L) - P(\xi \in L)^2 = 0$, whence the desired conclusion.
In the general case, let $\gamma_1$ be the image of $\gamma$ under the mapping $x \mapsto -x$. Then $\gamma_0 = \gamma * \gamma_1$ is a centered Gaussian measure. If $\gamma(L) > 0$, then, clearly, $\gamma_0(L) > 0$. Therefore, $\gamma_0(L) = 1$. Since $\gamma_1(L) = \gamma(L) > 0$, there is $x \in L$ such that $\gamma(L) = \gamma(L - x) = 1$.
2.5.6. Corollary. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ and let $L$ be a linear subspace in $X$. Then for any $a \notin L$ one has $\gamma_*(L + a) = 0$, where $\gamma_*$ is the inner measure.
PROOF. Let $B \subset L + a$, $B \in \mathcal{E}(X)$. By Lemma 2.1.4, the linear span $L_0$ of the set $B - a \subset L$ is measurable with respect to $\gamma$. According to Theorem 2.5.5, the measure of the set $L_0 + a$ is either 0 or 1. The latter, however, is impossible, since $\gamma(L_0 - a) = \gamma(L_0 + a)$ (due to the symmetry of $\gamma$), but $(L_0 - a) \cap (L_0 + a) = \emptyset$. It remains to note that $B \subset L_0 + a$.
2.5.7. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ such that $R_\gamma(X^*) \subset X$. If $G$ is a $\gamma$-measurable additive subgroup in $X$, then either $\gamma(G) = 0$ or $\gamma(G) = 1$.
PROOF. It suffices to show that if $\gamma(G) > 0$, then $G + H(\gamma) = G$. Indeed, for any $h \in H(\gamma)$, according to Corollary 2.4.9, there exists $n \in \mathbb{N}$ such that $n^{-1}h \in G - G$. Hence $h \in G$, i.e., $H(\gamma) \subset G$.
2.6.
Separability and oscillations
67
2.5.8. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ with mean $a_\gamma$ and let $G$ be an additive $\gamma$-measurable subgroup in $X$ of full measure. Then $2a_\gamma \in G$ and $H(\gamma) \subset G$.
PROOF. Let $\gamma_0 = \gamma_{-a_\gamma}$. Then the measure $\gamma_0$ is symmetric, hence
$$1 = \gamma(G) = \gamma_0(a_\gamma + G) = \gamma_0(-(a_\gamma + G)) = \gamma_0(-a_\gamma + G).$$
Therefore, the set $(-a_\gamma + G) \cap (a_\gamma + G)$ is nonempty, which gives the first claim. For the proof of the second one let us note that, for any $h \in H(\gamma)$, the set $h - a_\gamma + G$ has full measure, hence the set $(h - a_\gamma + G) \cap (-a_\gamma + G)$ is nonempty.
We shall return to the zero-one law in the next section and in Chapter 5, where we discuss measurable polynomials.
2.6.
Separability and oscillations
2.6.1. Definition. Let $(\Omega, \mathcal{B}, P)$ be a probability space and let $T$ be a metric space. A random process $\xi(t, \omega)$, $t \in T$, $\omega \in \Omega$, is called separable if there exist an at most countable set $S \subset T$ (called a separant of the process) and a set $\Omega_0 \subset \Omega$ such that $P(\Omega_0) = 1$ and, for every open set $U \subset T$ and every closed set $Z \subset \mathbb{R}^1$, the following equality holds true:
$$\{\omega \in \Omega_0\colon\ \xi(t, \omega) \in Z,\ \forall t \in U\} = \{\omega \in \Omega_0\colon\ \xi(t, \omega) \in Z,\ \forall t \in U \cap S\}. \qquad (2.6.1)$$
Note that $S$ is automatically dense in $T$, in particular, $T$ is separable. Equality (2.6.1) is equivalent to the following condition: for every $\omega \in \Omega_0$, $t \in T$ and $\varepsilon > 0$, letting $d$ be the distance in $T$, the point $\xi(t, \omega)$ belongs to the closure of the set $\{\xi(s, \omega),\ s \in S,\ d(s, t) < \varepsilon\}$.
Let $f$ be a function on a space $T$ equipped with a metric or semimetric $d$ (recall that in the definition of a semimetric there is no condition that $d(x, y) = 0$ implies $x = y$). The oscillation of the function $f$ at a point $\tau$ is defined as the quantity (possibly, infinite)
$$\alpha_f(\tau) = \lim_{\delta\to 0}\ \sup_{t,s\in K(\tau,\delta)} |f(t) - f(s)| = \lim_{\delta\to 0}\Bigl[\sup_{t\in K(\tau,\delta)} f(t) - \inf_{t\in K(\tau,\delta)} f(t)\Bigr],$$
where $K(\tau, \delta)$ denotes the open ball in $T$ of radius $\delta$ centered at $\tau$. In a similar manner one defines the oscillation on a set $M \subset T$:
$$\alpha_f(M) = \lim_{\delta\to 0+} \sup\{|f(t) - f(s)|,\ t, s \in M_\delta,\ d(s, t) < \delta\},$$
where $M_\delta$ is the open $\delta$-neighborhood of $M$, i.e., the collection of all points $t \in T$ such that there exists $t' \in M$ with $d(t, t') < \delta$. It is easily verified that a function on a metric space is continuous at a given point precisely when its oscillation at this point is zero. In the case where the function $f$ is random, its oscillation at every fixed $\tau$ or $M$ turns out to be a function on a probability space. Hence the question about the measurability of this function arises. A simple sufficient condition of measurability is given in the next lemma.
2.6.2. Lemma. Let (T, d) be a separable metric space and let (1, B. P) be a t E T, w E i2) is a separable random probability space. Suppose that = process on T. Then the functions
-
w'
are measurable for every x and every set M C T.
PROOF. Let $S \subset T$ be a countable separant of the process $\xi$. It is readily seen from the definition of separability that, for every open set $U \subset T$, for all $\omega \in \Omega_0$, one has
$$\sup_{t\in U}\xi_t(\omega) = \sup_{t\in U\cap S}\xi_t(\omega), \qquad \inf_{t\in U}\xi_t(\omega) = \inf_{t\in U\cap S}\xi_t(\omega), \tag{2.6.2}$$
and the functions defined by the right-hand sides of these equalities are measurable, since S is countable. Hence,
$$\sup_{t,s\in U}|\xi_t(\omega)-\xi_s(\omega)| = \sup_{t\in U}\xi_t(\omega) - \inf_{t\in U}\xi_t(\omega)$$
is a P-measurable function. Therefore, the function sup t,sE ti, d(t,s) 1/n) < n=1
Hence, for every fixed $t$, with probability 1 the sequence $\xi_{s_n}$ converges to $\xi_t$. Now let us put $\eta_t = \limsup_{n} \xi_{s_n}$. It is clear that we get a modification of $\xi$. Since $d$ separates the points $s_j$, we have $\eta_s = \xi_s$ for $s \in S$. The verification of the separability of the version constructed is left to the reader as Problem 2.12.22.
It follows from the proof above that for a separable centered Gaussian process $(\xi_t)_{t\in T}$ with a separant $T_0$, any countable set $T_1$ dense in $(T, d)$, where $d(t,s) = \sqrt{E|\xi_t - \xi_s|^2}$, serves as a separant. Indeed, it suffices to take a separable version $\tilde\xi$ of $\xi$, for which $T_1$ is a separant, and note that $P(\xi_t = \tilde\xi_t,\ \forall\, t \in T_0 \cup T_1) = 1$.

2.7. Equivalence and singularity
2.7.1. Lemma. Let $\mu$ be a Gaussian measure on a locally convex space $X$, $A\in\mathcal{E}(X)$ and $\mu(A)>0$. If the measure $\nu = I_A\mu/\mu(A)$ is Gaussian, then $\mu(A) = 1$.
PROOF. According to Example 2.3.10, the claim reduces to the case $X = \mathbb{R}^T$, in which, as we know, one has the inclusion $R_\mu(X_\mu^*)\subset X$. Indeed, this reduction is a direct corollary of the equality in Example 2.3.10. Let $h\in H(\mu)$. For sufficiently small $t$, according to Theorem 2.4.8, we have $\mu(A\cap(A+th))>0$. Hence the measures $\nu$ and $\nu_{th}$ are not mutually singular. Indeed, otherwise there is a subset $B\subset A$ such that $\nu(B) = 1$ and $\nu_{th}(B) = 0$. By definition, this means that $\mu(B) = \mu(A)$ and $\mu(A\cap(B-th)) = 0$. Then $\mu((A+th)\cap B) = 0$, since $h\in H(\mu)$. Hence $\mu(A\cap(A+th)) = 0$, which is a contradiction. Therefore, $h\in H(\nu)$, whence with 0. nx JJJ
i=1
111
Since the set $E_1$ has full measure, we have $\gamma(E_c) = 0$, provided $c \ne 1$. Therefore, the mapping $x \mapsto cx$ takes some measurable sets to nonmeasurable ones. A similar effect occurs in the following surprising example discovered by Cameron and Martin in [144].
2.7.7. Example. Let $f\colon (0,\infty)\to[0,1]$ be an arbitrary function. Then there exists a $\gamma$-measurable set $E\subset X$ such that $cE$ is measurable for every $c>0$ and
$$\gamma(cE) = f(c), \quad \forall\, c>0.$$
PROOF. Let $r\colon (0,+\infty)\to[-\infty,+\infty]$ be an arbitrary function. Put
$$A_t = \{x\in X\colon\ \xi_1(x) > t\}, \quad t\in[-\infty,+\infty].$$
It is clear that $cA_t = A_{ct}$ for all $c>0$. Let
$$E = \bigcup_{t>0}\bigl(A_{r(t)}\cap E_t\bigr).$$
Note that
$$cE = \bigcup_{t>0}\bigl(cA_{r(t)}\cap cE_t\bigr) = \bigcup_{t>0}\bigl(A_{cr(t)}\cap E_{c^2t}\bigr).$$
Hence $cE$ is the union of the measurable set $A_{cr(c^{-2})}\cap E_1$ and a subset of the set $X\setminus E_1$, which has measure zero. Therefore, the set $cE$ is measurable with respect to $\gamma$ and
$$\gamma(cE) = \gamma\bigl(A_{cr(c^{-2})}\bigr) = \frac{1}{\sqrt{2\pi}}\int_{cr(c^{-2})}^{\infty}\exp\Bigl(-\frac{1}{2}s^2\Bigr)\,ds.$$
It remains to choose the function $r$ in order to get $1 - \Phi(cr(c^{-2})) = f(c)$ for all $c > 0$. Obviously, this is possible.
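The last step of the proof can be made explicit. Solving $1 - \Phi(cr(c^{-2})) = f(c)$ with $t = c^{-2}$ suggests the choice $r(t) = \sqrt{t}\,\Phi^{-1}\bigl(1 - f(t^{-1/2})\bigr)$, with the values $\pm\infty$ where $f$ equals 0 or 1. The Python sketch below checks this choice numerically; it is an illustration only, the helper names (`quantile`, `upper_tail`, `make_r`, `measure_of_cE`) and the two sample functions are our assumptions, and $\Phi$ is taken to be the standard normal distribution function.

```python
import math
from statistics import NormalDist
from typing import Callable

STD = NormalDist()  # standard Gaussian law on the real line (assumed meaning of Phi)


def quantile(p: float) -> float:
    """Inverse of Phi, extended by -inf at p = 0 and +inf at p = 1."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return STD.inv_cdf(p)


def upper_tail(s: float) -> float:
    """1 - Phi(s), with the conventions for s = -inf and s = +inf."""
    if s == math.inf:
        return 0.0
    if s == -math.inf:
        return 1.0
    return 1.0 - STD.cdf(s)


def make_r(f: Callable[[float], float]) -> Callable[[float], float]:
    """Return r(t) = sqrt(t) * Phi^{-1}(1 - f(t^{-1/2})) for t > 0."""
    def r(t: float) -> float:
        c = 1.0 / math.sqrt(t)
        return quantile(1.0 - f(c)) / c
    return r


def measure_of_cE(f: Callable[[float], float], c: float) -> float:
    """Value 1 - Phi(c r(c^{-2})), which the proof identifies with gamma(cE)."""
    r = make_r(f)
    return upper_tail(c * r(c ** -2.0))


# Two hypothetical target functions with values in [0, 1].
smooth_f = lambda c: c / (1.0 + c)
jump_f = lambda c: 0.0 if c < 1.0 else 1.0

smooth_values = [measure_of_cE(smooth_f, c) for c in (0.5, 2.0, 3.0)]
jump_values = [measure_of_cE(jump_f, c) for c in (0.5, 2.0, 3.0)]
```

The sketch only verifies the scalar identity for $r$; the construction of the set $E$ itself is, of course, not something a finite computation can reproduce.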
2.8. Measurable seminorms

2.8.1. Definition. Let $\gamma$ be a Gaussian measure on a locally convex space $X$. A function $q$ measurable with respect to $\mathcal{E}(X)_\gamma$ is called an $\mathcal{E}(X)_\gamma$-measurable seminorm if there exists an $\mathcal{E}(X)_\gamma$-measurable linear subspace $X_0 \subset X$ of full $\gamma$-measure such that $q$ on $X_0$ is a seminorm in the usual sense.
It is clear that any $\mathcal{E}(X)_\gamma$-measurable seminorm can be redefined on a set of $\gamma$-measure zero in such a way that it will be a seminorm in the usual sense on the whole space. Indeed, let $Y$ be an algebraic complement to $X_0$ in $X$. Put $q_0(x + y) := q(x)$ if $x \in X_0$, $y \in Y$. Then $q_0$ has the required properties. Note that sometimes modifications of measurable seminorms in our sense are also called measurable seminorms. However, we shall not employ this terminology.
2.8.2. Example. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let a sequence $\{f_n\} \subset X^*$ be such that $q(x) = \sup_n |f_n(x)| < \infty$ almost everywhere with respect to $\gamma$. Then $q$ is an $\mathcal{E}(X)_\gamma$-measurable seminorm.
PROOF. Let $X_0 = \{x\colon q(x) < \infty\}$. According to Problem A.3.34 in Appendix, the set $X_0$ is in $\mathcal{E}(X)$. In addition, it is a linear space of full measure on which $q$ is a seminorm. The measurability of $q$ is obvious.
It will be shown in Chapter 4 that every $\gamma$-measurable seminorm on a locally convex space $X$ with a centered Gaussian measure $\gamma$ coincides almost everywhere with a function of the form
$$q(x) = \sup_n\,[f_n(x) + a_n], \quad f_n \in X_\gamma^*,\ a_n \ge 0.$$
2.8.3. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $q$ be an $\mathcal{E}(X)_\gamma$-measurable seminorm on $X$. Then the restriction of $q$ to the space $H(\gamma)$ with the norm $|\cdot|_{H(\gamma)}$ is continuous.
PROOF. The set $V = \{x\colon q(x) < n\}$ has positive measure for some $n\in\mathbb{N}$. According to Corollary 2.4.9, for a sufficiently small positive number $r$, the set $V - V$ contains the ball of radius $r$ from $H(\gamma)$. Hence the seminorm $q$ is bounded on the unit ball in $H(\gamma)$, which is equivalent to the continuity of $q$.
2.8.4. Proposition. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $q_1$ and $q_2$ be two $\gamma$-measurable seminorms such that $q_1 = q_2$ a.e. Then $q_1 = q_2$ on $H(\gamma)$.
PROOF. It follows from the definition that there is a linear subspace $Y$ of full measure on which both $q_1$ and $q_2$ are seminorms in the usual sense. Let $E := \{x\in Y\colon q_1(x) = q_2(x)\}$ and let $h\in H(\gamma)$. Then, for every $n\in\mathbb{N}$, one has $\gamma(E - nh) = 1$, since $\gamma(E) = 1$. Hence the intersection of $E$ with $\bigcap_{n=1}^\infty(E-nh)$ has measure 1, in particular, there is some $x$ in this intersection. This means that $q_1(x+nh) = q_2(x+nh)$ for all $n\in\mathbb{N}$. Clearly, $h\in Y$. Since any seminorm on the two dimensional space spanned by $x$ and $h$ is continuous, we deduce from the equality $q_1(n^{-1}x+h) = q_2(n^{-1}x+h)$ that $q_1(h) = q_2(h)$.
The following result about the exponential integrability of seminorms is the celebrated Fernique theorem [241].
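2.8.5. Theorem. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ and let $q$ be an $\mathcal{E}(X)_\gamma$-measurable seminorm. Let us pick $\tau$ such that $c = \gamma(q \le \tau) > 1/2$. Put
$$\alpha = \frac{1}{24\tau^2}\log\frac{c}{1-c}$$
if $c < 1$. Then
$$\int_X \exp(\alpha q^2)\,d\gamma \le I(\tau, c) < \infty,$$
where $I(\tau,c)$ depends only on $\tau$ and $c$. If $c = 1$, then $q \in L^\infty(\gamma)$.
PROOF. Let $\tau$, $t$ be two arbitrary numbers. According to Proposition 2.2.10 and the change of variables formula (see formula (A.3.1) in Appendix) we have
$$\gamma(q\le\tau)\,\gamma(q>t) = \iint_{q(x)\le\tau,\ q(y)>t}\gamma(dx)\,\gamma(dy) = \iint_{q\left(\frac{u-v}{\sqrt2}\right)\le\tau,\ q\left(\frac{u+v}{\sqrt2}\right)>t}\gamma(du)\,\gamma(dv) \le \iint_{q(u)>\frac{t-\tau}{\sqrt2},\ q(v)>\frac{t-\tau}{\sqrt2}}\gamma(du)\,\gamma(dv),$$
since $q(u) > \frac{t-\tau}{\sqrt2}$ and $q(v) > \frac{t-\tau}{\sqrt2}$, provided $q\bigl(\frac{u-v}{\sqrt2}\bigr)\le\tau$ and $q\bigl(\frac{u+v}{\sqrt2}\bigr)>t$. Indeed, since $q$ is a seminorm, one has
$$q(u) \ge \frac{q(u+v) - q(u-v)}{2}, \quad \forall\, u, v,$$
and the same lower bound holds for $q(v)$. Therefore, we arrive at the following inequality which is crucial in the proof:
$$\gamma(q\le\tau)\,\gamma(q>t) \le \Bigl(\gamma\Bigl(q > \frac{t-\tau}{\sqrt2}\Bigr)\Bigr)^2. \tag{2.8.1}$$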
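Inequality (2.8.1) and the constant $\alpha$ of the theorem can be checked by hand in the simplest situation. The sketch below assumes (this is our choice, not part of the text) that $X = \mathbb{R}^1$, $\gamma$ is the standard Gaussian measure and $q(x) = |x|$, so that $\gamma(q \le r)$ and $\gamma(q > s)$ are given by the error function; the function names are hypothetical. In this case $\int \exp(\alpha x^2)\,d\gamma = (1-2\alpha)^{-1/2}$ is finite precisely when $\alpha < 1/2$.

```python
import math

SQRT2 = math.sqrt(2.0)


def prob_q_le(r: float) -> float:
    """gamma(|x| <= r) for the standard Gaussian measure on the line."""
    return math.erf(r / SQRT2) if r > 0.0 else 0.0


def prob_q_gt(s: float) -> float:
    """gamma(|x| > s); equals 1 for s <= 0."""
    return math.erfc(s / SQRT2) if s > 0.0 else 1.0


def inequality_2_8_1_holds(tau: float, t: float) -> bool:
    """Check gamma(q <= tau) gamma(q > t) <= gamma(q > (t - tau)/sqrt 2)^2."""
    lhs = prob_q_le(tau) * prob_q_gt(t)
    rhs = prob_q_gt((t - tau) / SQRT2) ** 2
    return lhs <= rhs


def fernique_alpha(tau: float) -> float:
    """alpha = log(c / (1 - c)) / (24 tau^2) with c = gamma(q <= tau) > 1/2."""
    c = prob_q_le(tau)
    one_minus_c = prob_q_gt(tau)
    if not c > 0.5:
        raise ValueError("tau must satisfy gamma(q <= tau) > 1/2")
    return math.log(c / one_minus_c) / (24.0 * tau ** 2)


def exponential_moment(alpha: float) -> float:
    """Integral of exp(alpha x^2) against N(0, 1); valid for alpha < 1/2."""
    return (1.0 - 2.0 * alpha) ** -0.5


grid = [0.25 * k for k in range(25)]  # values 0, 0.25, ..., 6
all_hold = all(inequality_2_8_1_holds(tau, t) for tau in grid for t in grid)
alphas = [fernique_alpha(tau) for tau in (0.7, 1.0, 2.0, 3.0, 5.0)]
moments = [exponential_moment(a) for a in alphas]
```

For these values of $\tau$ the constant $\alpha$ stays well below the one dimensional threshold $1/2$, which is consistent with the statement of the theorem but says nothing about its sharpness.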
Since $q < \infty$ almost everywhere, there exists a nonnegative number $\tau$ such that $c := \gamma(q \le \tau) > 1/2$.
Then Po :=
< 1.
'Y(q 0 such that
for every n and n
$E\exp\bigl(\alpha\,(\sup_n q(X_n))^2\bigr) < \infty$. Unlike the finite dimensional case, a measurable seminorm on a space with a centered Gaussian measure can coincide almost everywhere with a nonzero constant.
2.8.9. Example. Let $\gamma$ be the countable product of the standard Gaussian measures on the real line regarded on $\mathbb{R}^\infty$. Put
$$q_n(x) = \Bigl(\frac{1}{n}\sum_{i=1}^n x_i^2\Bigr)^{1/2}.$$
Then $q = \limsup_{n\to\infty} q_n$ is a $\gamma$-measurable seminorm such that $q = 1$ almost everywhere.
PROOF. By virtue of the law of large numbers, we have $q_n \to 1$ almost everywhere. The set
$$L = \{x\colon\ \limsup_{n\to\infty} q_n(x) < \infty\}$$
is a linear subspace, since every function $q_n$ is a seminorm. In addition, $q$ is a seminorm on this space. The measurability of $L$ and $q$ is obvious.
The following theorem extends Anderson's inequality to the infinite dimensional case.
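Before passing to that theorem, Example 2.8.9 can be illustrated by a small simulation. The Python sketch below is an illustration under assumptions of ours: it takes $q_n(x) = (n^{-1}\sum_{i\le n}x_i^2)^{1/2}$, draws one pseudo-random sample of the first coordinates with a fixed seed, and uses hypothetical names. It shows $q_n$ approaching 1 along the sample and checks the seminorm properties (homogeneity and the triangle inequality) on finite sections.

```python
import math
import random
from typing import Sequence


def q_n(x: Sequence[float]) -> float:
    """Seminorm q_n evaluated on the first n = len(x) coordinates."""
    return math.sqrt(sum(v * v for v in x) / len(x))


rng = random.Random(12345)  # fixed seed: the run is reproducible, not "typical"
N = 200000
sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
other = [rng.gauss(0.0, 1.0) for _ in range(N)]

# q_n along one sample path for increasing n; the last value is close to 1.
values = {n: q_n(sample[:n]) for n in (10, 1000, N)}

# Homogeneity: q_n(3x) = 3 q_n(x), so dilations move the limit from 1 to 3.
scaled = q_n([3.0 * v for v in sample])

# Triangle inequality on the finite section.
summed = q_n([a + b for a, b in zip(sample, other)])
triangle_ok = summed <= q_n(sample) + q_n(other) + 1e-12
```

The homogeneity check reflects the phenomenon used above: the set where $q = 1$ has full measure, while its dilation by a factor $c \ne 1$ is contained in the set where $q = c$, which has measure zero.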
2.8.10. Theorem. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ and let $A \in \mathcal{E}(X)_\gamma$ be an absolutely convex set. Then, for any $a \in X$ such that $A + a \in \mathcal{E}(X)_\gamma$, the following inequality holds true:
$$\gamma(A+a)\le\gamma(A).$$
More generally, if $A + ta \in \mathcal{E}(X)_\gamma$ for all $t \in [0, 1]$, then
$$\gamma(A+a)\le\gamma(A+ta), \quad \forall\, t\in[0,1].$$
PROOF. Applying Lemma 2.1.6 to the measure $\gamma + \gamma_{-a}$ and using the finite dimensional Anderson inequality, we get the claim.
2.8.11. Corollary. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ and let $f$ be a function on $X$ such that the sets $\{f \le c\}$, $c \in \mathbb{R}^1$, are convex symmetric (e.g., let $f$ be convex symmetric). If $f$ and $f(\,\cdot + a)$ are $\gamma$-integrable, then
$$\int_X f(x)\,\gamma(dx) \le \int_X f(x+a)\,\gamma(dx), \quad \forall\, a\in X.$$
More generally, if $f(\,\cdot+ta)\in L^1(\gamma)$ for all $t\ge0$, then the function
$$t\mapsto\int_X f(x+ta)\,\gamma(dx)$$
is nondecreasing on $[0,+\infty)$. In particular, this is true for every $\mathcal{E}(X)$-measurable seminorm $f$.
PROOF. The same reasoning as in the finite dimensional Corollary 1.8.6 applies.
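Both statements are easy to verify in the one dimensional case, where everything is explicit. The Python sketch below assumes (our choice) the standard Gaussian measure on the line, the absolutely convex set $A = [-1, 1]$ and the seminorm $f(x) = |x|$; the function names are hypothetical. It uses the closed form $\int |x + a|\,\gamma(dx) = \sqrt{2/\pi}\,e^{-a^2/2} + a\,(2\Phi(a) - 1)$.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian measure on the real line


def shifted_interval_measure(radius: float, shift: float) -> float:
    """gamma([-radius, radius] + shift)."""
    return STD.cdf(shift + radius) - STD.cdf(shift - radius)


def mean_abs_shifted(shift: float) -> float:
    """Integral of |x + shift| against N(0, 1), via the closed form."""
    return (math.sqrt(2.0 / math.pi) * math.exp(-0.5 * shift * shift)
            + shift * (2.0 * STD.cdf(shift) - 1.0))


shifts = [0.0, 0.5, 1.0, 1.5, 2.0]  # the values t * a with a = 2 and t in [0, 1]
measures = [shifted_interval_measure(1.0, s) for s in shifts]
means = [mean_abs_shifted(s) for s in shifts]

# Theorem 2.8.10: t -> gamma(A + ta) does not increase on [0, 1].
measures_nonincreasing = all(measures[i] >= measures[i + 1]
                             for i in range(len(measures) - 1))
# Corollary 2.8.11: t -> integral of f(x + ta) does not decrease.
means_nondecreasing = all(means[i] <= means[i + 1]
                          for i in range(len(means) - 1))
```

The first list starts at $\gamma([-1,1]) \approx 0.6827$ and decreases, the second starts at $\sqrt{2/\pi} \approx 0.7979$ and increases.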
2.9. The Ornstein-Uhlenbeck semigroup

Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$. In the same way as in the finite dimensional case the Ornstein-Uhlenbeck semigroup $(T_t)_{t\ge0}$ on $L^p(\gamma)$ is defined by the formula
$$T_tf(x) = \int_X f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\,\gamma(dy). \tag{2.9.1}$$
The same formula defines the Ornstein-Uhlenbeck semigroup $(T_t)_{t\ge0}$ on the spaces $L^p(\gamma, E)$, $p\ge1$, of mappings with values in a separable Banach space $E$. The next theorem is proved by the same reasoning as the one used in the finite dimensional case.
2.9.1. Theorem. For every $p\ge1$, the family $(T_t)_{t\ge0}$ is a strongly continuous semigroup on $L^p(\gamma)$ with the operator norm $\|T_t\|_{\mathcal{L}(L^p(\gamma))} = 1$. The operators $T_t$ are nonnegative on $L^2(\gamma)$. If $E$ is a separable Banach space, then $(T_t)_{t\ge0}$ is a strongly continuous semigroup on $L^p(\gamma, E)$, $p\ge1$.
Denote by $E_n$ the closed linear subspace in $L^2(\gamma)$ generated by the Hermite polynomials of the form $H_{k_1}(l_1)\cdots H_{k_m}(l_m)$, $l_i\in X^*$, having degree $k_1+\cdots+k_m \le n$. Let $\mathcal{X}_0$ be the one dimensional space of constants and let $\mathcal{X}_k$ denote the orthogonal complement of $E_{k-1}$ in $E_k$, $k\in\mathbb{N}$. Then we get the following decomposition of $L^2(\gamma)$ into a direct sum of mutually orthogonal subspaces:
$$L^2(\gamma) = \bigoplus_{k=0}^\infty \mathcal{X}_k, \tag{2.9.2}$$
called the Wiener chaos decomposition. Denoting by $I_k(F)$ the projections of $F\in L^2(\gamma)$ onto $\mathcal{X}_k$, we get the decomposition
$$F = \sum_{k=0}^\infty I_k(F). \tag{2.9.3}$$
In each subspace $\mathcal{X}_k$ one can choose explicitly an orthonormal basis. To this end, let us take an arbitrary orthonormal basis $\{\xi_\alpha\}_{\alpha\in A}$ in $X_\gamma^*$ and consider the Hermite polynomials
$$H_{\alpha_1,\ldots,\alpha_m;k_1,\ldots,k_m} := H_{k_1}(\xi_{\alpha_1})\cdots H_{k_m}(\xi_{\alpha_m}), \quad \alpha_i\in A,\ k_i = 0,1,\ldots$$
By virtue of Lemma 2.5.1 and the properties of the finite dimensional Hermite polynomials, the collection of the functions we get is an orthonormal basis in $L^2(\gamma)$, and, for every $n$, its subset formed by the polynomials of degree $k_1+\cdots+k_m = n$ is a basis in $\mathcal{X}_n$. Unlike the finite dimensional case, if $k > 0$, then the space $\mathcal{X}_k$ is infinite dimensional.
Clearly, if the Hilbert space $X_\gamma^*$ is separable (which is always the case for Radon Gaussian measures, as we shall see later), then the basis constructed above is countable. In addition, it is possible to choose an orthonormal basis in $X_\gamma^*$ consisting of elements $\xi_n \in X^*$. Note that in any case one can do this for the
restriction of $\gamma$ to any countably generated $\sigma$-field $\mathcal{E}_1 \subset \mathcal{E}(X)$ (recall that every set from $\mathcal{E}(X)$ is determined by some countable collection of continuous linear functionals). By analogy one defines the spaces $\mathcal{X}_k(E)$ of mappings with values in a separable Hilbert space $E$, i.e., $\mathcal{X}_k(E)$ is the closure of the linear span of the mappings $x \mapsto f(x)v$, $f \in \mathcal{X}_k$, $v \in E$, in $L^2(\gamma, E)$. The following statement is easily obtained from the corresponding finite dimensional theorem.
2.9.2. Theorem. For all $t \ge 0$ and $f \in L^2(\gamma)$, one has
$$T_tf = \sum_{k=0}^\infty e^{-kt}I_k(f).$$
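In dimension one the theorem says that the normalized Hermite polynomials are eigenfunctions of $T_t$ with eigenvalues $e^{-kt}$. The following Python sketch checks this numerically for $k \le 3$. It is an illustration under assumptions of ours: $\gamma$ is the standard Gaussian measure on the line, the Hermite polynomials are normalized as $H_k = He_k/\sqrt{k!}$, the integral in (2.9.1) is approximated by the trapezoidal rule on $[-10, 10]$, and all function names are hypothetical.

```python
import math
from typing import Callable, Sequence


def gauss_integral(g: Callable[[float], float],
                   half_width: float = 10.0, steps: int = 4000) -> float:
    """Trapezoidal approximation of the integral of g against N(0, 1)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)


def ou_semigroup(f: Callable[[float], float], t: float, x: float) -> float:
    """T_t f(x) from formula (2.9.1) in dimension one."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    return gauss_integral(lambda y: f(a * x + b * y))


# Normalized Hermite polynomials H_0, ..., H_3 (assumed normalization).
HERMITE = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: (x * x - 1.0) / math.sqrt(2.0),
    lambda x: (x ** 3 - 3.0 * x) / math.sqrt(6.0),
]


def max_mehler_error(t: float, points: Sequence[float]) -> float:
    """Largest deviation |T_t H_k(x) - exp(-k t) H_k(x)| over k and x."""
    return max(abs(ou_semigroup(H, t, x) - math.exp(-k * t) * H(x))
               for k, H in enumerate(HERMITE) for x in points)


error = max_mehler_error(0.7, [-1.5, 0.0, 0.3, 2.0])
```

By linearity, the same check applies to any polynomial of degree at most 3 written in this basis.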
We shall see in Chapter 5 that the Ornstein-Uhlenbeck semigroup has a strong smoothing property (obvious in the finite dimensional case). Here we consider a simple example.
2.9.3. Example. Let $f \in L^p(\gamma)$, $p > 1$. Then, for any $t > 0$ and for $\gamma$-a.e. $x$, the function $h \mapsto T_tf(x + h)$ on the space $H(\gamma)$ with its natural Cameron-Martin norm is continuous. The same is true if $f \in L^p(\gamma, E)$, where $E$ is a separable Hilbert space.
PROOF. The function $y \mapsto f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\bigr)$ is in $L^p(\gamma)$ for $\gamma$-a.e. $x$. Hence it suffices to show that, for every function $g \in L^p(\gamma)$, the function
$$\varphi\colon\ h \mapsto \int_X g(y+h)\,\gamma(dy)$$
is continuous on $H(\gamma)$. Writing $\varphi$ by the aid of the Cameron-Martin formula as
$$\varphi(h) = \int_X g\,\exp\Bigl(\hat h - \frac{1}{2}|h|^2_{H(\gamma)}\Bigr)\,d\gamma,$$
we notice that, for any $q > 1$, the mapping
$$h \mapsto \exp\Bigl(\hat h - \frac{1}{2}|h|^2_{H(\gamma)}\Bigr)$$
with values in $L^q(\gamma)$ is continuous on the balls in $H(\gamma)$. This yields the continuity of $\varphi$. The vector case is analogous.
2.10. Measurable linear functionals

Recall that a function $f$ on a linear space is said to be affine if $f = l + c$, where $l$ is linear and $c \in \mathbb{R}^1$. By analogy one defines affine mappings with values in linear spaces.
2.10.1. Definition. Let $\gamma$ be a Gaussian measure on a locally convex space $X$. A function $f$ on $X$ is called a measurable linear functional (or, more precisely, $\gamma$-measurable linear functional) on $(X, \gamma)$ if there exist a linear subspace $L$ of full $\gamma$-measure and a $\gamma$-measurable and linear in the usual sense function $f_0$ on $L$ such that $f = f_0$ $\gamma$-a.e. The notion of a $\gamma$-measurable affine function is defined by analogy.
Note that one can always redefine a $\gamma$-measurable linear functional $f\colon X \to \mathbb{R}^1$ in such a way that it will be linear on all of $X$ (such a version will be called proper linear). Indeed, using any Hamel basis in $X$, we can extend $f_0$ to a linear function on $X$. Clearly, all such extensions are $\gamma$-equivalent functions. It is clear that the absolute value of a measurable proper linear functional is a measurable seminorm. Let us consider the following instructive example.
2.10.2. Example. Let $\gamma$ be the countable product of the standard Gaussian measures on the real line and $X = \mathbb{R}^\infty$. Then, for any $(c_n) \in l^2$, the series $\sum_{n=1}^\infty c_nx_n$ converges $\gamma$-a.e., hence defines a measurable linear functional on $X$. However, only for finite sequences $(c_n)$ is this functional continuous.
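The lack of continuity can be seen on concrete vectors. The sketch below assumes the particular sequence $c_n = 1/n$ (our choice) and uses hypothetical names. The vectors $x^{(N)} = c_N^{-1}e_N$ converge to zero coordinatewise, i.e., in the topology of $\mathbb{R}^\infty$, yet the functional takes the value 1 on each of them. The partial sums of $\sum c_n^2$, which give the variances of the partial sums of the series, stay bounded (here they tend to $\pi^2/6$), and this is what makes the series converge in $L^2(\gamma)$.

```python
import math
from typing import Dict


def c(n: int) -> float:
    """Assumed square-summable coefficients c_n = 1/n."""
    return 1.0 / n


def functional(x: Dict[int, float]) -> float:
    """Sum of c_n x_n for a vector x with finitely many nonzero coordinates,
    stored as a dictionary {index: value}."""
    return sum(c(n) * v for n, v in x.items())


def spike(N: int) -> Dict[int, float]:
    """The vector e_N / c_N: a single nonzero coordinate at position N."""
    return {N: 1.0 / c(N)}


# The functional equals 1 on every spike ...
values = [functional(spike(N)) for N in (1, 10, 100, 1000)]
# ... although every fixed coordinate of spike(N) vanishes once N is large.
third_coordinates = [spike(N).get(3, 0.0) for N in (10, 100, 1000)]

# Variance of the partial sum over n <= 100000 under the product measure.
variance_partial = sum(c(n) ** 2 for n in range(1, 100001))
```

The same construction works for any $(c_n)$ with infinitely many nonzero terms, which is the content of the last sentence of the example.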
2.10.3. Proposition. Let $\gamma$ be a centered Gaussian measure on $X$. Any $\gamma$-measurable linear functional $f$ on $X$ is a centered Gaussian random variable with variance $\|f\|^2_{L^2(\gamma)}$.
PROOF. The random variables $\xi(\omega) = f(x)$ and $\eta(\omega) = f(y)$, where $\omega = (x,y)\in X\times X$, are independent on $(X\times X, \gamma\otimes\gamma)$ and have the same distribution as $f$ on $(X,\gamma)$. By linearity of $f$ and the Gaussian property, one has
$$E\exp\Bigl[it\,\frac{\xi+\eta}{\sqrt2}\Bigr] = \iint \exp\Bigl[it f\Bigl(\frac{x+y}{\sqrt2}\Bigr)\Bigr]\gamma(dx)\,\gamma(dy) = E\exp[it\xi].$$
Therefore, $(\xi+\eta)/\sqrt2$ has the same distribution as $\xi$, which implies that $\xi$ is centered Gaussian (see Theorem 1.9.5). The following important result is a direct corollary of Theorem 2.8.3.
2.10.4. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $l$ be a $\gamma$-measurable proper linear function on $X$. Then the restriction of $l$ to the space $H(\gamma)$ with the norm $|\cdot|_{H(\gamma)}$ is continuous.
The reader should be warned that in this theorem one has to deal with proper linear functions, since in the typical (infinite dimensional) situations $H(\gamma)$ has $\gamma$-measure zero, hence on this set measurable functions can be redefined in an arbitrary way. It is remarkable and nontrivial that (as we shall see below) any proper linear $\gamma$-measurable function is uniquely determined by its restriction to the set $H(\gamma)$. On the other hand, we shall see that in the infinite dimensional case there exist linear functions which are not measurable with respect to Gaussian measures.
2.10.5. Lemma. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ and let $f \in X_\gamma^*$ be a proper linear function such that $R_\gamma f \in X$. Then
$$f(h) = (R_\gamma f, h)_{H(\gamma)} = \int_X f(x)\hat h(x)\,\gamma(dx), \quad \forall\, h \in H(\gamma). \tag{2.10.1}$$
PROOF. The second equality in (2.10.1) holds true by the definition of the norm in $H(\gamma)$. For the proof of the first one let us take a sequence $\{f_n\}\subset X^*$ convergent to $f$ in $L^2(\gamma)$. Let $h = R_\gamma k \in X$, $k \in X_\gamma^*$. By definition, we have $k = \hat h$ and
$$f_n(h) = f_n(R_\gamma k) = \int_X f_n(x)k(x)\,\gamma(dx).$$
The right-hand side of this formula tends to the right-hand side of the equality we want to prove. Passing to a subsequence, we may assume that $f_n \to f$ almost everywhere. Since the set $X_0 = \{x\colon f(x) = \lim_{n\to\infty} f_n(x)\}$ is a linear subspace of full measure, by virtue of Theorem 2.4.7 it contains $H(\gamma)$. Therefore, $f_n(h) \to f(h)$ for all $h \in H(\gamma)$.
2.10.6. Corollary. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$. If a sequence of $\gamma$-measurable proper linear functions $f_n$ converges to zero in measure $\gamma$, then the continuous linear functionals $f_n|_{H(\gamma)}$ converge to zero in the norm of $H(\gamma)^*$.
PROOF. By the Gaussian property, the convergence in measure implies the convergence in $L^2(\gamma)$. Hence the claim follows from the Cauchy-Minkowski inequality applied to the right-hand side of (2.10.1) and the fact that $|h|_{H(\gamma)} = \|\hat h\|_{L^2(\gamma)}$ for any $h\in H(\gamma)$.
2.10.7. Theorem. Let $\gamma$ be a Gaussian measure on a locally convex space $X$, $f$ and $g$ two $\gamma$-measurable linear functionals. Then they are either distinct almost everywhere or equal almost everywhere. If $\gamma$ is a centered measure with $R_\gamma(X_\gamma^*) \subset X$ and $f$ and $g$ are proper linear, then the second of the two possibilities above takes place precisely when $f = g$ on $H(\gamma)$.
PROOF. We may assume that $f$ and $g$ are proper linear. Then $L := \{f = g\}$ is a measurable linear space. According to the zero-one law, either $\gamma(L) = 0$ or $\gamma(L) = 1$. In the latter case $H(\gamma) \subset L$ by Theorem 2.4.7. The last claim follows from Theorem 2.5.5.
2.10.8. Corollary. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ such that $R_\gamma(X_\gamma^*) \subset X$. Suppose that $l$ is a linear $\gamma$-measurable function (or a $\gamma$-measurable proper linear functional) such that it vanishes on some dense subset of the space $H(\gamma)$ (equipped with the norm $|\cdot|_{H(\gamma)}$). Then $l = 0$ $\gamma$-almost everywhere.
PROOF. This follows directly from Theorem 2.10.4 and Theorem 2.10.7.
2.10.9. Theorem. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ such that $R_\gamma(X_\gamma^*) \subset X$. Then the following conditions are equivalent:
(i) $f$ is a $\gamma$-measurable linear functional;
(ii) $f \in X_\gamma^*$;
(iii) there exists a sequence $\{f_n\} \subset X^*$ convergent to $f$ in measure.
PROOF. It is clear that condition (ii) implies (iii). Let $\{f_n\} \subset X^*$ converge to $f$ in measure. There exists a subsequence $\{f_n\}$ (denoted by the same symbol), which converges to $f$ $\gamma$-a.e. Denote by $L$ the set of all the points $x \in X$ such that there exists $\lim_{n\to\infty} f_n(x)$. Then $L$ is a linear subspace of full $\gamma$-measure. Let us define $f_0$ on $L$ by
$$f_0(x) = \lim_{n\to\infty} f_n(x).$$
Thus, (iii) implies (i) (clearly, (iii) implies also (ii)).
Suppose now that $f$ is a measurable linear functional on $(X,\gamma)$. We shall assume from the very beginning that $f = f_0$, where $f_0$ is a linear functional from Definition 2.10.1. By Fernique's theorem, one has $f\in L^2(\gamma)$. According to Theorem 2.10.4, the functional $f$ is continuous on the Hilbert space $H(\gamma)$. Let us choose an orthonormal basis $\{e_\alpha\}$ in $H(\gamma)$ and denote by $\{e_n\}$ its part (which is at most countable), on which $f$ does not vanish. Let $e_n = R_\gamma k_n$, $k_n\in X_\gamma^*$. The vectors $k_n$ form an orthonormal sequence in $X_\gamma^*$, hence we can consider the element
$$g = \sum_{n=1}^\infty f(e_n)k_n \in X_\gamma^*.$$
Note that $f = g$ on $H(\gamma)$, since for all $e\in\{e_\alpha\}$, not belonging to $\{e_n\}$, by virtue of Lemma 2.10.5 one has $k_n(e) = (e_n, e)_{H(\gamma)} = 0$.
According to Theorem 2.10.7, $f = g$ almost everywhere with respect to the measure $\gamma$. Thus, $f \in X_\gamma^*$. An analogous result holds true for measurable affine functions $g$, since $g = f + c$, where $f$ is a measurable linear function and $c \in \mathbb{R}^1$. Note that without extra assumptions conditions (ii) and (iii) above are equivalent and imply (i).
2.10.10. Corollary. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ with the separable space $X_\gamma^*$. Then there exists an orthonormal basis $\{\xi_n\}$ in $X_\gamma^*$ consisting of elements of $X^*$. In addition, for any $\gamma$-measurable linear functional $l$, there exists a sequence $(c_n) \in l^2$ such that
$$l(x) = \sum_{n=1}^\infty c_n\xi_n(x) \quad \gamma\text{-a.e.}$$
2.10.11. Theorem. Let $\gamma$ be a centered Gaussian measure on a locally convex space $X$ such that $R_\gamma(X_\gamma^*)\subset X$. Then, for any continuous linear functional $l$ on the Hilbert space $H(\gamma)$, there exists a unique (up to equivalence) measurable linear functional $l_0$ on $X$ that coincides with $l$ on $H(\gamma)$. In addition, $\|l_0\|_{L^2(\gamma)} = \|l\|_{H(\gamma)^*}$.
PROOF. Let us take an arbitrary orthonormal basis $\{e_\alpha\}$ in $H(\gamma)$ and denote by $\{e_n\}$ its part (which is at most countable), on which $l$ does not vanish. Let $e_n = R_\gamma\xi_n$, where $\xi_n\in X_\gamma^*$. We can take for $\xi_n$ the corresponding proper linear versions. The vectors $\xi_n$ form an orthonormal sequence in the Hilbert space $X_\gamma^*$. Put $c_n = l(e_n)$. Then $(c_n)\in l^2$. Therefore, we get the measurable linear functional
$$l_0 = \sum_{n=1}^\infty c_n\xi_n.$$
Since $\xi_n(e_k) = \delta_{nk}$, one has $l_0(e_n) = c_n = l(e_n)$. In addition, for any element $e\in\{e_\alpha\}\setminus\{e_n\}$ with $e = R_\gamma k$, where $k\in X_\gamma^*$, we have $l_0(e) = 0$, since $\xi_n(e) = (e_n, e)_{H(\gamma)} = 0$ by Lemma 2.10.5. Uniqueness follows from Theorem 2.10.7.
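In the finite dimensional case the objects of Lemma 2.10.5 and Theorem 2.10.11 reduce to matrix algebra, which allows a direct check. The sketch below assumes (our choice) that $X = \mathbb{R}^2$ and $\gamma$ is the centered Gaussian measure with the nondegenerate covariance matrix $K$ given in the code; then $R_\gamma f = Kf$, $(h,k)_{H(\gamma)} = (K^{-1}h, k)$ and $\|f\|^2_{L^2(\gamma)} = (Kf, f)$. All names are hypothetical. The norm of $f$ restricted to $H(\gamma)$ is estimated by scanning the unit sphere of $H(\gamma)$, parametrized as $h = Lu$ with $K = LL^{T}$ and $|u| = 1$.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]
Mat = Tuple[Vec, Vec]

K: Mat = ((2.0, 1.0), (1.0, 1.0))  # assumed covariance matrix (positive definite)


def mat_vec(m: Mat, v: Vec) -> Vec:
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def dot(u: Vec, v: Vec) -> float:
    return u[0] * v[0] + u[1] * v[1]


def inverse(m: Mat) -> Mat:
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] / det, -m[0][1] / det), (-m[1][0] / det, m[0][0] / det))


def cholesky(m: Mat) -> Mat:
    """Lower triangular L with L L^T = m for a 2 x 2 positive definite m."""
    l11 = math.sqrt(m[0][0])
    l21 = m[1][0] / l11
    l22 = math.sqrt(m[1][1] - l21 * l21)
    return ((l11, 0.0), (l21, l22))


K_INV = inverse(K)


def cm_inner(h: Vec, k: Vec) -> float:
    """Inner product of the Cameron-Martin space: (K^{-1} h, k)."""
    return dot(h, mat_vec(K_INV, k))


def l2_norm(f: Vec) -> float:
    """Norm of the linear functional x -> (f, x) in L^2(gamma)."""
    return math.sqrt(dot(f, mat_vec(K, f)))


def dual_norm_on_H(f: Vec, steps: int = 3600) -> float:
    """Approximate sup of |f(h)| over the unit sphere of H(gamma)."""
    L = cholesky(K)
    best = 0.0
    for j in range(steps):
        theta = 2.0 * math.pi * j / steps
        h = mat_vec(L, (math.cos(theta), math.sin(theta)))  # |h|_H = 1
        best = max(best, abs(dot(f, h)))
    return best


f: Vec = (1.0, -3.0)
h: Vec = (0.5, 2.0)
lemma_lhs = dot(f, h)                    # f(h)
lemma_rhs = cm_inner(mat_vec(K, f), h)   # (R f, h)_H
norm_l2 = l2_norm(f)
norm_dual = dual_norm_on_H(f)
```

With these data $f(h) = -5.5$ on both sides of (2.10.1), and both norms are close to $\sqrt5$, up to the discretization of the sphere.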
2.10.12. Corollary. Suppose that the conditions of Theorem 2.10.11 are satisfied and that $\{f_n\}$ is a sequence of $\gamma$-measurable linear functionals. Then the following conditions are equivalent: (i) $f_n \to 0$ in $L^2(\gamma)$; (ii) $f_n \to 0$ in measure; (iii) the restrictions of proper linear versions of the $f_n$'s to $H(\gamma)$ converge to zero in the norm of $H(\gamma)^*$. In addition, for either of these conditions it suffices that $f_n \to 0$ on a set of positive measure.
PROOF. Clearly, (i) implies (ii). If (ii) is satisfied, then the functions $f_n \in X_\gamma^*$ are centered Gaussian random variables. Hence their convergence to zero in $\gamma$-measure yields the convergence in $L^2(\gamma)$. By Lemma 2.10.5, we get (iii). By the same lemma, $\|f_n\|_{H(\gamma)^*} = \|f_n\|_{L^2(\gamma)}$ for proper linear versions, which gives the implication (iii) $\Rightarrow$ (i). Finally, the convergence of $f_n$ to zero on a set of positive measure implies, by the zero-one law, the convergence almost everywhere, since the set of convergence is a measurable linear subspace.
The results of this section show that, for a centered Gaussian measure $\gamma$ satisfying the condition $R_\gamma(X_\gamma^*)\subset X$, the Cameron-Martin space $H(\gamma)$ is naturally isomorphic to the dual of $X_\gamma^*$ in the sense that all continuous linear functionals on $X_\gamma^*$ have the form $l\mapsto l_0(h)$, where $h\in H(\gamma)$ and $l_0$ is a proper linear modification of $l$. With this identification, the norm of any measurable linear functional in $L^2(\gamma)$ is equal to the norm of the restriction of its proper linear version on the Hilbert space $H(\gamma)$. Measurable linear functionals on the classical Wiener space are discussed in the next section.

2.11. Stochastic integrals
Let us discuss measurable linear functionals on the Wiener space $(C[0,1], P^W)$. Recall that for every function $\varphi\in L^2[0,1]$, the Paley-Wiener-Zygmund stochastic integral
$$J\varphi = \int_0^1\varphi(t)\,dw_t$$
with respect to the Wiener process $w_t$ is defined as follows. If $\varphi$ is a step function of the form
$$\varphi(t) = \sum_{i=1}^n c_iI_{(t_i,t_{i+1}]}(t), \quad t_1 = 0\le t_2\le\ldots\le t_{n+1} = 1,$$
then we put
$$J\varphi = \int_0^1\varphi(t)\,dw_t = \sum_{i=1}^n c_i(w_{t_{i+1}} - w_{t_i}).$$
It is clear that $J\varphi$ is a Gaussian random variable with mean zero and variance
$$\sum_{i=1}^n c_i^2(t_{i+1}-t_i) = \int_0^1\varphi(t)^2\,dt.$$
Therefore, $J$ has a unique extension to an isometry from $L^2[0,1]$ to the subspace in $L^2(P^W)$ generated by all centered Gaussian random variables. Note that the Paley-Wiener-Zygmund stochastic integral cannot be defined as a Stieltjes integral, since the trajectories $w_t$ have unbounded variation almost surely. However, if we choose $(C[0,1], P^W)$ as a probability space for $w_t$, then $J\varphi$ becomes a measurable linear functional. Indeed, the functionals
$$l_n\colon\ w\mapsto\sum_{i=1}^n c_i(w_{t_{i+1}} - w_{t_i})$$
are continuous on $C[0,1]$ and $J\varphi$ is obtained as their limit. Conversely, let $l$ be a $P^W$-measurable linear functional. Put $h = R_{P^W}l$. As shown above, $h\in W_0^{2,1}[0,1]$, hence $\varphi = h'$ belongs to $L^2[0,1]$. In order to show that 1 = J 0}. Then $\gamma(A) = 1/2$, although
$$A + h = A, \quad \forall\, h\in H(\gamma).$$
PROOF. It is clear that $R_\gamma(f)\ne0$, hence $\gamma(A) = 1/2$. In addition, $A + h = A$ for all $h\in H(\gamma)$, since $f(h) = (f, R_\gamma^{-1}(h)) = 0$ for $h\in H(\gamma)$.
2.12.5. Remark. Let $X$ be a locally convex space and let $\gamma$ be any measure on $\mathcal{E}(X)$. The Lebesgue theorem implies the sequential continuity of the Fourier transform of $\gamma$, provided the space $X^*$ is equipped with the $*$-weak topology $\sigma(X^*, X)$. For a Gaussian measure $\gamma$, this means the sequential continuity of the linear function $L\colon f\mapsto\int f\,d\gamma$ and the covariance function $f\mapsto R_\gamma(f)(f)$ on $X^*$ with the $*$-weak topology. In Problem 2.12.17 it is suggested to prove that, except for the case when the measure $\gamma$ is concentrated on a finite dimensional subspace, the covariance function cannot be continuous in the $*$-weak topology on $X^*$. However, for the Radon Gaussian measures discussed in Chapter 3 both functions are continuous in the Mackey topology on $X^*$ (defined in Appendix).
The Hellinger integral and product-measures

The following result enables one to get in certain cases a sufficiently precise estimate of the variation distance between equivalent Gaussian measures, since for such measures, in principle, one can evaluate the Hellinger integral.
2.12.6. Proposition. Let $\mu$ and $\nu$ be two probability measures on a measurable space $(\Omega, \mathcal{B})$ and let $\lambda$ be a measure on $\mathcal{B}$ such that p. 0 one has
-x I
I -`2-n. -1
It is readily verified that this is possible. According to Problem 2.12.25, we have $\nu_h \sim \nu$ for all $h \in l^2$. Let $\mu$ be a Gaussian measure on $\mathbb{R}^\infty$. If $\mu$ is not singular with respect to $\nu$, then $l^2 \subset H(\mu)$, since otherwise $\mu_h \perp \mu$ for some $h \in l^2$, whence it follows that $\nu$ has no component absolutely continuous with respect to $\mu$. Note also that $\nu(l^1) = 1$, which implies $\mu(l^1) = 1$ by the zero-one law. Let $a$ be the mean of $\mu$ and $\mu_0 = \mu_{-a}$. Then $\mu_0$ is a centered Gaussian measure with $H(\mu_0) = H(\mu)$. It is clear that $\mu_0(l^1) = 1$ (this follows, e.g., from Theorem 2.5.8). Denote by $\gamma$ the countable product of the standard Gaussian measures on $\mathbb{R}^1$. We know that $H(\gamma) = l^2$. According to Theorem 3.3.4 of the next chapter, the containment $l^2 \subset H(\mu_0)$ and the equality $\mu_0(l^1) = 1$ yield $\gamma(l^1) = 1$, which is false. Instead of using the measure $\gamma$ one could apply the previous theorem.
Oscillation constants

In many problems one has to deal with random variables of the form
$$\alpha(\xi) = \limsup_{n\to\infty}|\xi_n|,$$
where $\xi = (\xi_n)$ is a sequence of random variables. If $\alpha(\xi)$ is a constant (possibly, $+\infty$), then it is called the oscillation constant of $\xi$. A proof of the next result and an interesting discussion of the oscillation constants can be found in [119, Ch. 8].
2.12.11. Theorem. Let $\xi = (\xi_n)$ be a sequence of centered Gaussian random variables with variances $\sigma_n^2$ tending to zero. Then it has oscillation constant $\alpha$ (possibly, infinite). If, in addition, $\sup_n|\xi_n| < \infty$ a.s., then for almost all $\omega$, the set of all cluster points of the sequence $\{\xi_n(\omega)\}$ coincides with $[-\alpha,\alpha]$.
Miscellaneous remarks

Finally, it is worth noting that in some problems it is useful to consider extensions of the Ornstein-Uhlenbeck semigroup $T_t$ and the Gaussian measures $\gamma(\lambda^{-1/2}\,\cdot\,)$ to the complex plane (respectively, in $t$ and in $\lambda$). In the first case we get the Fourier-Wiener transform, which can be written formally as
$$\mathcal{F}(f)(x) = T_{-i\pi/2}f(x) = \int_X f(ix+\sqrt2\,y)\,\gamma(dy).$$
Note that if $f$ is a polynomial of degree $n$ and $t$ is a complex number, then $T_tf$ can be defined as $\sum_{k=0}^n e^{-kt}I_k(f)$. In particular, for $t = -i\pi/2$, we get
$$\mathcal{F}(f) = \sum_{k=0}^n i^kI_k(f),$$
which shows that $\mathcal{F}$ extends to a unitary operator on the complex space $L^2(\gamma)$. The Fourier-Wiener transform on $L^2(\gamma_n)$ is unitary isomorphic to the Fourier transform on $L^2(\mathbb{R}^n)$ (provided the latter is defined with the normalization making it unitary). For a related discussion, see [141], [142], [178], [337], [339], [381], [446], [678]. In the second case the analytic continuation of the integrals
$$\lambda\mapsto\int_X f(\sqrt\lambda\,x)\,\gamma(dx)$$
to the point $i$ enables one to define the Feynman integral for some classes of functions $f$. For example, if $f = e^{il}$, $l\in X^*$, then the foregoing integral equals $\tilde\gamma(\sqrt\lambda\,l) = \exp(-\lambda\sigma(l)^2/2)$, and the result of the analytic continuation gives $\exp(-i\sigma(l)^2/2)$, i.e., "Feynman measure" can be regarded as a "complex cylindrical measure" with the Fourier transform $\exp(-iQ)$, where $Q$ is a nonnegative quadratic form. Concerning this and other approaches to Feynman integrals, see [13], [134], [146], [178], [256], [291], [403], [703], [717].
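The action of the Fourier-Wiener transform on polynomials can be checked directly in dimension one. The Python sketch below is an illustration under assumptions of ours: $\gamma$ is the standard Gaussian measure on the line, the transform is taken in the form $\int f(ix + \sqrt2\,y)\,\gamma(dy)$, the Hermite polynomials are normalized as $H_k = He_k/\sqrt{k!}$, and the integral is approximated by the trapezoidal rule with complex arithmetic. The names are hypothetical. The expected outcome is $\mathcal{F}(H_k) = i^kH_k$.

```python
import math
from typing import Callable, Sequence


def gauss_integral(g: Callable[[float], complex],
                   half_width: float = 10.0, steps: int = 4000) -> complex:
    """Trapezoidal approximation of the integral of g against N(0, 1)."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)


def fourier_wiener(f: Callable[[complex], complex], x: float) -> complex:
    """F(f)(x) = integral of f(i x + sqrt(2) y) against N(0, 1) in y."""
    return gauss_integral(lambda y: f(1j * x + math.sqrt(2.0) * y))


# Normalized Hermite polynomials, evaluated at complex arguments.
HERMITE = [
    lambda z: 1.0,
    lambda z: z,
    lambda z: (z * z - 1.0) / math.sqrt(2.0),
    lambda z: (z ** 3 - 3.0 * z) / math.sqrt(6.0),
]


def max_eigen_error(points: Sequence[float]) -> float:
    """Largest deviation |F(H_k)(x) - i^k H_k(x)| over k <= 3 and x."""
    return max(abs(fourier_wiener(H, x) - (1j ** k) * H(x))
               for k, H in enumerate(HERMITE) for x in points)


error = max_eigen_error([-1.5, 0.0, 0.7, 2.0])
```

Since the eigenvalues $i^k$ have modulus one, the check is consistent with the unitarity of $\mathcal{F}$ on the span of these polynomials.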
Problems

2.12.12. Prove that the convex and absolutely convex hulls of a cylindrical set with a compact base in a locally convex space are closed. Hint: use that the convex hull of a compact set in $\mathbb{R}^n$ is compact.
2.12.13. Show that no single-point set belongs to $\mathcal{E}(\mathbb{R}^T)$ if $T$ is uncountable.
2.12.14. a) Show that there exist Borel sets $A$ and $B$ in $\mathbb{R}^1$ such that the set $A + B$ is not Borel. Hint: see [718]. b) Show that there exist closed sets $A, B\subset\mathbb{R}^\infty$ such that $A + B$ is not Borel. Hint: take a closed set $S$ in the hyperplane $L = \{x_1 = 0\}\subset\mathbb{R}^\infty$ and a continuous function $f\colon S\to\mathbb{R}^1$ such that $f(S)$ is not Borel; note that the set $A = \{(f(x), x),\ x\in S\}$ is closed and $A + L$ is not Borel, since $(A+L)\cap\mathbb{R}^1e_1 = f(S)e_1$, where $e_1 = (1,0,0,\ldots)$.
2.12.15. Let $\mu$ be a countably additive function on $\mathcal{E}(X)$ such that $\mu\circ l^{-1}$ is a Gaussian measure for every $l\in X^*$. Then $\mu$ is a probability (hence Gaussian) measure.
2.12.16. Prove that the functions mentioned in Example 2.3.15 are indeed covariance functions, i.e., correspond to nonnegative quadratic forms.
2.12.17. Show that if the covariance function of a Gaussian measure $\gamma$ on a locally convex space $X$ is continuous in the $*$-weak topology on $X^*$, then the measure $\gamma$ is concentrated on a finite dimensional subspace. Hint: note that a seminorm $q$ bounded on the set $\{x\colon |f_i(x)| < 1,\ f_i\in X^*,\ i = 1,\ldots,n\}$ has the form $q_0(f_1,\ldots,f_n)$, where $q_0$ is a seminorm on $\mathbb{R}^n$.
2.12.18. Let $\gamma$ on $\mathbb{R}^T$ be an uncountable product of $T$ copies of the standard Gaussian measures on the line. Show that $H(\gamma) = R_\gamma(X_\gamma^*) = l^2(T)$ as a Hilbert space ($l^2(T)$ is the space of all mappings $h\colon T\to\mathbb{R}^1$ such that $h(t)$ does not vanish for at most countably many $t$ and $\sum_t h(t)^2 < \infty$). Note that $H(\gamma)$ is nonseparable.
2.12.19. Let $(T, m)$ be a measurable space and $\mu$ a Radon Gaussian measure on $L^2(m)$ with the covariance operator $K$. Show that $\sqrt{K}$ is given by a symmetric kernel $Q\in L^2(T^2, m\otimes m)$ and Ih(t)12 0, i.e., $|x(t) - x(s)|\le C(x)|t-s|^\alpha$. Show that $P^W(H^\alpha[0,1]) = 1$ if $\alpha < 1/2$ and $P^W(H^\alpha[0,1]) = 0$ if $\alpha\ge1/2$. Hint: use Theorem A.3.22 and the equality $\int|x(t)-x(s)|^{2m}\,P^W(dx) = (2m-1)!!\,|t-s|^m$ for any $m\in\mathbb{N}$; note, in addition, that $(x(t_i) - x(s_i))/|t_i-s_i|^{1/2}$ are independent standard Gaussian random variables for disjoint closed intervals
2.12.21. Show that two probability measures $\mu$ and $\nu$ on a $\sigma$-field $\mathcal{B}$ are mutually singular precisely when $\|\mu-\nu\| = 2$. An equivalent condition: $H(\mu,\nu) = 0$.
2.12.22. Prove the separability of the process $\eta_t$ from the proof of Proposition 2.6.5.
2.12.23. A measure $\mu$ on $\mathcal{B}(X)$ is called continuous along a vector $h$ (V. A. Romanov) if $\lim_{t\to0}\|\mu_{th}-\mu\| = 0$. Show that if a measure $\mu$ is continuous along $h$, then so is any measure $\nu$ absolutely continuous with respect to $\mu$. Hint: note that this claim is true for the measures of the form $I_B\mu$ and extend it to the measures of the form $f\mu$, $f\in L^1(\mu)$.
2.12.24. Let $f$ be a smooth probability density on $\mathbb{R}^n$ with the integrable partial derivatives. Prove the estimate
$$\int|f(x+h) - f(x)|\,dx \le \int|\partial_hf(x)|\,dx, \quad \forall\, h\in\mathbb{R}^n.$$
Deduce the estimate
$$\int|f(x+h)-f(x)|\,dx \le \Bigl(\int\frac{|\partial_hf(x)|^2}{f(x)}\,dx\Bigr)^{1/2}.$$
$1 - \varepsilon$. The algebra of functions of the form $F(f_1,\ldots,f_n)$, where $F$ is a polynomial on $\mathbb{R}^n$, separates the points in $K_\varepsilon$. Hence by the Stone-Weierstrass theorem, there exist $n \in \mathbb{N}$ and a polynomial $F$ on $\mathbb{R}^n$ such that $|F(f_1,\ldots,f_n) - g| < \varepsilon$ on $K_\varepsilon$, whence, letting $G := F(f_1,\ldots,f_n)$, we get $\gamma(x\colon |G(x) - g(x)| > 2\varepsilon) < \varepsilon$. It remains to note that, by the independence, we have
r
xX
dy=
Jei9d.r JeG d'Y = 7(g)2. X
X
On the other hand, the left-hand side converges to 1, whence $\sigma(g)^2 = 0$, which is a contradiction.
Notation. Throughout this chapter we employ the same notation as in the preceding one. In particular, $H(\gamma)$ is the Cameron-Martin space of a Radon Gaussian measure $\gamma$, $|\cdot|_{H(\gamma)}$ is the natural Hilbert norm in $H(\gamma)$, $U_H$ stands for the closed unit ball in $H(\gamma)$. The symbols $X_\gamma^*$, $R_\gamma$, $a_\gamma$, $\gamma_h$, $\hat h$, and $\sigma(f)$ retain their meanings as well.
3.2. Basic properties of Radon Gaussian measures 3.2.1. Lemma. Let X be a complete locally convex space and let ,7 be a Radon Gaussian measure on X. Then the functions a,: f i--+ a, (f) and f F-+ R., (9) (f ) i.e., the for any g E X are continuous on X" in the Mackey topology topology of the uniform convergence on weakly compact convex sets in X.
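PROOF. Let $\delta > 0$. There exists a compact set $K$ such that $\gamma(K) > 1 - \delta$. By virtue of Proposition A.1.6 in Appendix, the closed convex hull $Q$ of the set $K$ is compact (hence weakly compact). Then, the condition $|f| \le \delta$ on $Q$ implies the estimate
$$|1 - \widetilde{\gamma}(f)| \le \int_Q |1 - e^{if}|\,d\gamma + 2\delta \le 3\delta.$$
Let $\widetilde{\gamma}(f) = r e^{i\theta}$, where $r > 0$, $\theta \in [0, 2\pi[$. Then $|1 - r| \le 3\delta$ and $\cos\theta \ge (1 + r^2 - 9\delta^2)(2r)^{-1}$. Hence, for every $\varepsilon > 0$, there is $\delta > 0$ such that $|a_\gamma(f)| \le \varepsilon$ and $\sigma(f)^2 \le \varepsilon$ whenever $\sup_Q |f| \le \delta$. In addition, $|R_\gamma(f)(g)| \le \sigma(f)\sigma(g) \le \sigma(g)\sqrt{\varepsilon}$. By definition, this means the continuity in the topology $\tau_M(X^*, X)$.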
It will be shown below that the conclusion of this lemma remains valid for arbitrary locally convex spaces.
3.2.2. Lemma. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$ which is continuously and linearly embedded into a locally convex space $Y$. Then the set $H(\gamma)$ is independent of whether $\gamma$ is considered on $X$ or on $Y$.
PROOF. Denote by $\gamma^Y$ the measure $\gamma$ considered on $Y$. Clearly, it is a Radon Gaussian measure. Then $H(\gamma) \subset H(\gamma^Y)$ by virtue of equality (2.4.3). Assume that $h \in H(\gamma^Y)$. Since the compact sets from $X$ are compact in $Y$, the set $X$ is measurable with respect to $\gamma^Y$ and has full measure. Then it contains $h$, since it is a linear subspace and $\gamma^Y_h \sim \gamma^Y$. Therefore, $\gamma_h \sim \gamma$, whence $h \in H(\gamma)$.
Recall that for Gaussian measures which are not Radon the situation is different (see Remark 2.12.3).
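3.2.3. Theorem. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$. Then its mean is an element of $X$ and $R_\gamma(X^*_\gamma) \subset X$. In addition,
$$H(\gamma) = R_\gamma(X^*_\gamma) = \{h \in X\colon \gamma_h \sim \gamma\} = \{h \in X\colon |h|_{H(\gamma)} < \infty\}.$$
Finally, for any $f \in X^*$, one has
$$R_\gamma(f)(f) = \int_X \bigl(f(x) - a_\gamma(f)\bigr)^2\,\gamma(dx) = \sup_{h \in U_H} f(h)^2. \tag{3.2.1}$$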
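Equality (3.2.1) can be checked by hand in finite dimensions. The following Python sketch assumes $X = \mathbb{R}^2$ and $\gamma = N(0, K)$ with a positive definite matrix $K$, so that $H(\gamma) = \mathbb{R}^2$ with $|h|^2_{H(\gamma)} = (K^{-1}h, h)$. The matrix `K`, the functional `f`, and all helper names are illustrative assumptions and do not come from the text. The sketch compares $\sigma(f)^2 = (Kf, f)$ with a numerical supremum of $f(h)^2$ over the unit ball of $H(\gamma)$.

```python
import math

def cholesky_2x2(K):
    """Lower-triangular L with K = L L^T for a symmetric positive definite 2x2 matrix."""
    (a, b), (_, c) = K
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return ((l11, 0.0), (l21, l22))

def variance_of_functional(K, f):
    """sigma(f)^2 = f^T K f for the centered Gaussian measure N(0, K) on R^2."""
    return sum(f[i] * K[i][j] * f[j] for i in range(2) for j in range(2))

def sup_over_unit_ball(K, f, steps=20000):
    """Approximate sup of f(h)^2 over {h : h^T K^{-1} h <= 1}.

    Every h with h^T K^{-1} h = 1 has the form h = L u with |u| = 1, and the
    supremum of a squared linear functional is attained on the boundary,
    so it is enough to scan the unit circle."""
    L = cholesky_2x2(K)
    best = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        u = (math.cos(theta), math.sin(theta))
        h = (L[0][0] * u[0], L[1][0] * u[0] + L[1][1] * u[1])
        best = max(best, (f[0] * h[0] + f[1] * h[1]) ** 2)
    return best

K = ((2.0, 0.6), (0.6, 1.0))   # illustrative covariance matrix (assumption)
f = (1.0, -3.0)                # illustrative linear functional (assumption)
print(variance_of_functional(K, f), sup_over_unit_ball(K, f))
```

The two printed numbers should agree up to the resolution of the angular grid; the Cholesky parametrization is only one convenient way to describe the unit ball of $H(\gamma)$ in this toy setting.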
PROOF. If $X$ is complete, then, according to Lemma 3.2.1, for any $g \in X^*_\gamma$, the functions $a_\gamma$ and $R_\gamma(g)$ are continuous on $X^*$ in the Mackey topology $\tau_M(X^*, X)$, hence they are represented by elements of $X$ (see Theorem A.1.1 in Appendix). In the general case this means that the claims to be proved are valid for the measure $\gamma$ considered on the completion $\overline{X}$ of the space $X$ (see Appendix). This measure is Radon on $\overline{X}$. Note that $\overline{X}^* = X^*$ (since $X$ is dense in $\overline{X}$) and $\overline{X}^*_\gamma = X^*_\gamma$. Let us consider the symmetrization $\gamma'$ of the measure $\gamma$ defined by the formula
$$\gamma'(A) = (\gamma * \gamma_1)(\sqrt{2}\,A),$$
where $\gamma_1(A) = \gamma(-A)$. Recall that the convolution of two Radon measures $\mu$ and $\nu$ is defined by the formula
$$\mu * \nu(B) = \int \mu(B - x)\,\nu(dx).$$
In this case $\widetilde{\mu * \nu} = \widetilde{\mu}\,\widetilde{\nu}$ (see Appendix). The Fourier transform of the measure $\gamma'$, as one readily verifies, has the form
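$$\widetilde{\gamma'}(f) = \exp\bigl(-\tfrac12\sigma(f)^2\bigr) = |\widetilde{\gamma}(f)|.$$
Therefore, $\gamma'$ is a Radon Gaussian measure with zero mean. Considering it on $\overline{X}$, we get that $\gamma = \gamma'_a$, where $a \in \overline{X}$ (note that $\widetilde{\gamma}(l) = e^{il(a)}\,\widetilde{\gamma'}(l)$). It follows from the equality $\gamma(X) = \gamma'(X) = 1$ that $a \in X$.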
In order to prove the inclusion $R_\gamma(g) \in X$ for $g \in X^*_\gamma$, let us consider the measure $\mu = \exp\bigl(g - \tfrac12\sigma(g)^2\bigr)\cdot\gamma$. According to Proposition 2.4.2, this measure is Gaussian with mean $b = a_\gamma + R_\gamma(g)$. As shown above, $b \in X$, whence the claim follows. Equality (3.2.1) follows from what has already been proved by Lemma 2.4.1. □
According to Remark 3.11.20 below, the previous theorem may be invalid for non Radon measures.
3.2.4. Corollary. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$. Then the closed unit ball $U_H$ from $H(\gamma)$ is compact in $X$.
PROOF. The claim follows from Proposition 2.4.6, since, by virtue of Corollary 2.4.9, for some $c > 0$, the set $cU_H$ is contained in the compact set $K - K$, where $K$ is any compact set of positive measure. □
Chapter 3.
102
Radon Measures
3.2.5. Corollary. The Fourier transform and covariance of a Radon Gaussian measure $\gamma$ are continuous on $X^*$ with the Mackey topology $\tau_M(X^*, X)$.
PROOF. The continuity in the Mackey topology follows from (3.2.1) and the fact that $U_H$ is a convex compact set in $X$.
3.2.6. Corollary. Let $\mu$ and $\nu$ be two centered Radon Gaussian measures on a locally convex space $X$ such that $H(\mu) = H(\nu)$ and $|h|_{H(\mu)} = |h|_{H(\nu)}$ for all $h \in H(\mu)$. Then $\mu = \nu$.
PROOF. For every $f \in X^*$, one has by (3.2.1)
$$R_\mu(f)(f) = \sup_{h \in U} f(h)^2,$$
where $U$ is the unit ball in the Hilbert space $H = H(\mu) = H(\nu)$ with the norm $|\cdot|_{H(\mu)}$. The same is true for $R_\nu(f)(f)$. Therefore, $R_\mu(f)(f) = R_\nu(f)(f)$, whence $\widetilde{\mu}(f) = \widetilde{\nu}(f)$ and $\mu = \nu$.
3.2.7. Theorem. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$. Then the Hilbert spaces $X^*_\gamma$ and $H(\gamma)$ are separable.
PROOF. We may assume that the measure $\gamma$ is centered. Let $K$ be a compact set of positive $\gamma$-measure. Note that $X^* = \bigcup_{n=1}^\infty nK^\circ$, where
$$K^\circ = \Bigl\{f \in X^*\colon \sup_{z \in K} |f(z)| \le 1\Bigr\}.$$
Therefore, it suffices to prove the separability of K° in L2(y). By induction, one can find a sequence { f,,) C K° such that, setting Xn = span(f1e... ,fn), Xo = 0, we get 1
dist(fn, Xn_)) > dist(K°, Xn_)) = d,,, where dist(A, B) = SuPQEAinfbEBlla - blIL2(,)-
Since the finite dimensional spaces X. are separable, then the separability of K° follows from the relationship lim do = 0, n-.oc
which we now prove. Suppose that the reasoning is analogous). Then
0 (in the case
oc
dist(fn,X,,_1)>d>0, Vn. This means that the Gaussian vector 1.8.8. Hence we have the inequality
y(x: supIf,(x)I < 1 i n. Let r < k be the maximal number with the following property: there exist r elements in If,.... . f k } such that adding these elements to g, j... , gn , one gets a linearly independent. system. We may assume that these are fl.... , fr. Then, according to what has already been proved. we have (g,
g.. fl,... ,fr)_1(B1 xII') _ (gl,....gn)-(81) = (gl,... ,g., ft.... . fn)-1(B2) = (91.....gn. fl.... . fr)_1(B3)
for some B3 E
Rn+r
Hence B1 x IR' = B3, whence we conclude that two different
representations of $C$ again lead to the equal values for $\nu(C)$. The additivity of $\nu$ is obvious. Let us show that it satisfies the condition in Theorem A.3.19 in Appendix. Let $\varepsilon > 0$. Let us find a compact set $K$ such that $\gamma(K) > 1 - \varepsilon$. The set $S = (K - K)/2$ is compact. Hence it suffices to establish the estimate $\nu(C) \ge 1 - 2\varepsilon$ for any cylindrical set $C$ containing $S$. Let $g_i \in G$, $i = 1, \ldots, k$. We shall show that $\nu_k(T(S)) \ge 1 - 2\varepsilon$, where $T\colon X \to \mathbb{R}^k$, $Tx = (g_1(x), \ldots, g_k(x))$. Put $\mu = \gamma \circ T^{-1}$.
Then µ(y) = exp(-ZB(y)where y= (y,,.... yk) and k
B(y) = V 1
Ui9i > l["(EY.9iJ.
Hence there exists a symmetric Gaussian measure $\sigma$ on $\mathbb{R}^k$ with $\mu = \nu_k * \sigma$. For some $a \in \mathbb{R}^k$ one has
$$\mu(T(K)) = \nu_k * \sigma(T(K)) \le \nu_k(T(K) - a),$$
since the integral of the right-hand side in $a$ with respect to $\sigma$ is the left-hand side. Now, by virtue of the easily verified relationship
$$(T(K) - a) \cap (-T(K) + a) \subset \frac{T(K) - T(K)}{2} = T(S)$$
we get
$$\nu_k(T(S)) \ge \nu_k\bigl\{(T(K) - a) \cap (-T(K) + a)\bigr\} \ge \nu_k(T(K) - a) + \nu_k(-T(K) + a) - 1 = 2\nu_k(T(K) - a) - 1 \ge 2\mu(T(K)) - 1 \ge 1 - 2\varepsilon.$$
Therefore, by Theorem A.3.19 in Appendix, $\nu$ extends uniquely to a Radon measure.
□
3.3.2. Corollary. If the convolution of Gaussian measures $\mu$ and $\nu$ is tight and the measure $\mu$ is symmetric, then $\mu$ and $\nu$ are tight.
PROOF. Let Q and Q be the covariances of µ and v, respectively. The convolution of the measure µ * v with its image under the mapping x -. -x is a tight centered Gaussian measure having the covariance
Hence µ is tight
by the previous theorem. Let K be a compact set such that µ' (K) > 1 - e and (µ * v)'(K) > 1 - E. Put S = K - K. Since S is compact, it suffices to be shown that v' (S) > 1 - 2E. Let C be a cylindrical set with C n S = 0. We shall show that v(C) < 2E. There exists a cylindrical set E such that K C E and C n (E - E) = 0. If v(C) > 2E, then v(E - E) < 1 - 2E, whence
µ*v(E)=fv(E -x)p(dx) e+J v(E-x)p(dx)<e+v(E-E) 0 such that V f E X';
(ii) $H(\nu) \subset H(\mu)$;
(iii) $\nu(L) = 1$ for every $\nu$-measurable linear subspace $L$ of full $\mu$-measure.
3.3.
Gaussian covariances
107
PROOF. It is clear from the definition of the Cameron-Martin norm that the estimate in (i) yields the estimate clhIH(.) for all h E X and the containment in (ii). Conversely, if H(v) C H(p), then, by the closed graph theorem, there for all h E H(v). The estimate in (i) follows is c > 0 such that IhIH(,.) < from equality (3.2.1) for u and v in place of -y. i.e., sup{ f (h)2: IhIH(,,) < 1},
f EX*.
(3.3.1)
Suppose that $H(\nu) \subset H(\mu)$. By the closed graph theorem, there exists a constant $C$ such that, for any $h \in H(\nu)$, the estimate $|h|_{H(\mu)} \le C|h|_{H(\nu)}$ holds true. Replacing the measure $\mu$ by its homothetic image, we may assume that $C = 1$. Let $L$ be a $\nu$-measurable linear subspace with $\mu(L) = 1$. We shall consider $\mu$ as a Radon measure on $L$ with the induced topology. The Fourier transform of the measure $\mu$ at the point $f \in L^*$ equals $\exp\bigl(-\tfrac12 R_\mu(f)(f)\bigr)$. Let us consider the following function on the space $L^*$:
$$\varphi(f) = \exp\bigl(-\tfrac12 W(f)\bigr), \qquad W(f) = \sup\{f(h)^2\colon |h|_{H(\nu)} \le 1\}.$$
Clearly, $W$ is a nonnegative quadratic form on $L^*$. By equality (3.3.1) applied to the space $L$, we get the estimate $W(f) \le R_\mu(f)(f)$. According to Theorem 3.3.1, there exists a Radon Gaussian measure $\lambda$ on $L$ with the Fourier transform $\varphi$. Regarded as a measure on $X$, $\lambda$ has the same Fourier transform as $\nu$. Therefore, $\lambda = \nu$ on $X$. In particular, $\nu(L) = 1$.
Conversely, assume that condition (iii) is fulfilled. Suppose that there exists an element $h \in H(\nu)$ which is not in $H(\mu)$. By Theorem 2.4.7, one can find a linear subspace $L \in \mathcal{E}(X)$ of full $\mu$-measure, not containing $h$. According to the same theorem, $\nu(L) < 1$ (since otherwise $H(\nu) \subset L$), which is a contradiction.
3.3.5. Theorem. Let $\mu$ be a centered Radon Gaussian measure on a locally convex space $X$. Then, for any Hilbert space $E$ continuously embedded into $H(\mu)$, there exists a centered Radon Gaussian measure $\lambda$ on $X$ with $H(\lambda) = E$.
PROOF. It suffices to note that the same reasoning as in the proof of Theorem 3.3.4 can be applied to $E$ replacing $H(\nu)$ and $L = X$.
3.3.6. Theorem. Let $\mu$ and $\nu$ be two centered Radon Gaussian measures on a locally convex space $X$. Then the following conditions are equivalent:
ff2dv 1. On the other hand, (S'f,S'g)E < 1, Vg E U, whence IS'fIE < 1, which is a contradiction. This observation, which is due to Vakhania (see [796], [798], [800]), is also true for non Gaussian measures possessing covariance operators. Apart from being interesting in its own right, this result may be helpful for evaluating the Cameron-Martin space (and its analogues for non Gaussian measures).
3.3.9. Example. Let $P^W$ be the Wiener measure considered on $L^2[0,1]$. Then its covariance operator can be written as $SS^*$, where $Sf(t) = \int_0^t f(s)\,ds$. Indeed, in this case $S^*f(t) = \int_t^1 f(s)\,ds$. It remains to use the identity
$$\int_0^1 \min(t, s) f(s)\,ds = \int_0^t \int_s^1 f(u)\,du\,ds,$$
which is verified by differentiating in $t$. Therefore, we arrive at the description of $H(P^W)$ as the primitives of the elements of $L^2[0,1]$.
We saw in Corollary 2.3.2 that the exact analogue of Proposition 1.2.2 is not valid for infinite dimensional Hilbert spaces. However, there exist infinite dimensional locally convex spaces $X$ such that every nonnegative continuous quadratic form on the dual space with the Mackey topology is the covariance of a Radon Gaussian measure on $X$ and the converse is also true. The class of such spaces contains, e.g., the dual spaces to the nuclear barrelled locally convex spaces (which is a special case of the well-known Minlos theorem [549]; see also [800, Theorem VI.4.4] and Problem 3.11.36). An important example is the space of distributions $\mathcal{S}'(\mathbb{R}^n)$. In Section 3.11 we shall consider another example, where the Gaussian covariances admit an explicit description: the spaces $L^p$.
3.4. The structure of Radon Gaussian measures
The following fundamental result is due to Tsirelson [774], [775].
3.4.1. Theorem. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$. Then there exists a sequence of metrizable compact sets $K_n$ such that
$$\gamma\Bigl(\bigcup_{n=1}^\infty K_n\Bigr) = 1.$$
PROOF. Let $\varepsilon > 0$. We shall prove that there exists a compact set $K$ such that $\gamma(K) > 1 - \varepsilon$, and the initial topology on $K$ is metrizable. The main idea of the whole construction is as follows. We may assume that the measure $\gamma$ has mean zero and that $X$ is the minimal closed linear subspace of full measure (the existence of the minimal closed linear subspace of full measure follows from the zero-one law and the existence of the topological support). We shall find an absorbing set $T \subset X^*$ (i.e., a set whose homothetic images cover $X^*$) and equip it with some separable metric $d$. Then we shall find a compact set $K \subset X$ with $\gamma(K) > 1 - \varepsilon$ such that, for any $x \in K$, the function $F_x\colon g \mapsto g(x)$ on $(T, d)$ is continuous. Suppose that all this is done. Then the initial topology of $X$ is metrizable on $K$. Indeed, the initial topology on $K$ coincides with the weak topology $\sigma(X, X^*)$, since $K$ is compact. Let $Q$ be a
countable set dense in $(T, d)$. Every functional $f \in T$ is the pointwise limit on $K$ of some sequence of elements $f_n \in Q$, since, by virtue of our choice of $T$, for any $x \in K$, the function $g \mapsto g(x)$ is continuous on $(T, d)$. Therefore, $f_n(x) \to f(x)$ for any $x \in K$. Since the elements of $T$ separate the points in the set $K$, this implies that the elements in $Q$ separate the points in $K$ as well. Therefore, $K$ is a metrizable compact space (see Proposition A.1.7 in Appendix).
Let us now choose $T$. This will be the set
$$T = \Bigl\{l \in X^*\colon \sup_{x \in S} |l(x)| \le 1\Bigr\},$$
where $S$ is an arbitrary compact subset of $X$ of positive measure (such a set exists, since $\gamma$ is a Radon measure). In other words, $T$ is the polar of $S$. Let us define a metric $d$ on $T$ by the formula $d(f, g) = \|f - g\|_{L^2(\gamma)}$. Note that $d$ is indeed a metric, since the equality $f(x) - g(x) = 0$ for $\gamma$-a.e. $x$ implies that $f(x) - g(x) \equiv 0$ (otherwise there exists a closed hyperplane of full measure, which contradicts our assumption that $X$ is the linear support of $\gamma$). Let
S'={xEX:
sup ll(x)I I - t and t(z) _ V°(t,z) whenever z E K, t E T. Let us show that we can do this. On the space C = Cb(T) of bounded continuous functions on (T, d). one can introduce a metric p such that (C, p) is separable and the mappings F, : f - f (t), t E T, are continuous in the metric p. Having chosen a countable everywhere dense set To C T and a countable basis {Un} of the topology in (T, d), for such a metric we take the function
$$\rho(f, g) = \sup_n 2^{-n}\bigl[\,|M_n(f) - M_n(g)| + |m_n(f) - m_n(g)|\,\bigr],$$
where
$$M_n(f) = \sup_{t \in U_n \cap T_0} \arctan f(t), \qquad m_n(f) = \inf_{t \in U_n \cap T_0} \arctan f(t).$$
We could also deal with kS' for k sufficiently large in place of E; then the corresponding sample paths are uniformly bounded, and there would be no need for arctan in the definition of $M_n$ and $m_n$. Note that the functions $F_t$ generate the Borel
$\sigma$-field of $(C, \rho)$, since they separate the points and $(C, \rho)$ is separable. The mapping taking $x$ to $\xi^0(\cdot, x) \in C$ is Borel. By the separability of $C$ and the Radon property of $\gamma$, there exists a compact set $K_\varepsilon \subset X$ such that $\gamma(K_\varepsilon) > 1 - \varepsilon$ and this mapping is continuous on $K_\varepsilon$ (this is proved in the same manner as the classical Lusin theorem). Recall that $\xi^0$ is a version of $\xi$, hence $\gamma\bigl(x\colon \xi^0(t, x) = t(x)\bigr) = 1$ for any $t \in T$. For every $t \in T$, the set
$$K(t) = \{x \in K_\varepsilon\colon \xi^0(t, x) = t(x)\}$$
has measure $\gamma(K_\varepsilon)$. By virtue of our choice of $K_\varepsilon$ and the continuity of the functionals $t$, all these sets are compact. Since the restriction of $\gamma$ to $K_\varepsilon$ is a Radon measure, the intersection of the compact sets $K(t)$ has measure $\gamma(K_\varepsilon)$ as well (see Problem A.3.35 in Appendix). This intersection is taken for $K$. □
3.4.2. Corollary. Any sequentially continuous function is measurable with respect to every Radon Gaussian measure.
3.4.3. Corollary. Let $\gamma$ be a Radon Gaussian measure on a sequentially complete locally convex space $X$. Then, for every $\varepsilon > 0$, there exists an absolutely convex metrizable compact set $K_\varepsilon$ such that $\gamma(K_\varepsilon) > 1 - \varepsilon$.
PROOF. It suffices to use the fact that the absolutely convex closed hull of a metrizable compact set in a sequentially complete locally convex space is compact metrizable (see Proposition A.1.6 in Appendix).
The reader is warned that a countable union of metrizable compact sets may fail to be metrizable, hence the results above do not imply the existence of a metrizable support (see Example 3.6.3 below). Tsirelson's theorem yields the following isomorphism theorem.
3.4.4. Theorem. Let $\mu$ and $\nu$ be two centered Radon Gaussian measures on locally convex spaces $X$ and $Y$, respectively, such that $\dim H(\mu) = \dim H(\nu)$. Then the measurable spaces $(X, \mu)$ and $(Y, \nu)$ are linearly isomorphic in the following sense: there exist Borel Souslin linear subspaces $X_0 \subset X$ and $Y_0 \subset Y$ and a one-to-one Borel linear mapping $T\colon X_0 \to Y_0$ such that $\mu(X_0) = \nu(Y_0) = 1$ and $\mu \circ T^{-1} = \nu$.
PROOF. It suffices to prove the claim for infinite dimensional $H(\mu)$. Let $\gamma$ be the countable product of the standard Gaussian measures on the real line. Let us choose a metrizable compact set $K \subset X$ of positive $\mu$-measure. There exists a sequence $\{f_n\} \subset X^*$ separating the points of the linear span $X_1$ of the set $K$. By means of the process of orthogonalization we may assume that such a sequence $\{f_n\}$ is an orthonormal basis in $X^*_\mu$. The mapping $F\colon X_1 \to \mathbb{R}^\infty$, $Fx = (f_n(x))_{n=1}^\infty$, is linear, continuous, and injective. It is clear that $\mu \circ F^{-1} = \gamma$. The space $X_1$ is a countable union of metrizable compact sets, in particular, it is Souslin. All these properties are inherited by its image $F(X_1)$, which has full measure with respect to $\gamma$. For the measure $\nu$, we find a space $Y_1 \subset Y$ and a continuous linear mapping $G\colon Y_1 \to \mathbb{R}^\infty$ with the analogous properties. Put $Z = F(X_1) \cap G(Y_1)$. Then $Z$ is a $\sigma$-compact linear space of full $\gamma$-measure. Put $X_0 = F^{-1}(Z)$, $Y_0 = G^{-1}(F(X_0))$. The mapping $T = G^{-1}F\colon X_0 \to Y_0$ is linear and one-to-one. Note that, by Theorem A.3.15 in Appendix, both mappings $F$ and $G$ take Borel sets to Borel ones. Therefore, the mappings $T$ and $T^{-1}$ are Borel, and the sets $X_0$ and
$Y_0$ are Borel and Souslin. It is clear that $\mu(X_0) = 1$. Therefore, $\gamma(F(X_0)) = 1$, whence $\nu(Y_0) = 1$. Hence $\mu \circ T^{-1} = \nu$. □
An important open problem related to our preceding considerations is whether one can always find in any locally convex space with a Radon Gaussian measure a convex compact set of positive measure (which is possible, as we already know, for all sequentially complete spaces). For solving this problem, it suffices to understand whether every full measure linear subspace in $l^2$ or $\mathbb{R}^\infty$ with a Gaussian measure contains a convex compact set of positive measure. A similar problem remains open for convex sets of positive measure (note that for non Gaussian measures both questions are answered negatively).
3.5. Gaussian series
This section is devoted to the representations of Gaussian vectors by series of independent one dimensional Gaussian vectors.
3.5.1. Theorem. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, $H = H(\gamma)$, $\{e_n\}$ an orthonormal basis in $H$, $\{\xi_n\}$ a sequence of independent standard Gaussian random variables on a probability space $(\Omega, P)$, and $A \in \mathcal{L}(H)$. Then the series
$$\sum_{n=1}^\infty \xi_n A e_n \tag{3.5.1}$$
converges a.s. in $X$. The distribution of its sum is a Radon Gaussian measure $\lambda$ with the covariance
$$R_\lambda(f)(f) = \bigl(A^* R_\gamma(f), A^* R_\gamma(f)\bigr)_H.$$
If $A = I$, then
$$x = \sum_{n=1}^\infty \hat{e}_n(x)\, e_n \quad \gamma\text{-a.e.} \tag{3.5.2}$$
PROOF. One can reduce this theorem to the case $A = I$; however, we shall consider at once the general situation. By Theorem 3.3.1, there exists a centered Radon Gaussian measure $\lambda$ on $X$ with the Fourier transform
$$g \mapsto \exp\Bigl(-\tfrac12 \bigl(A^* R_\gamma(g), A^* R_\gamma(g)\bigr)_H\Bigr).$$
Let $S$ be a random vector with the distribution $\lambda$, defined on the same probability space as $\{\xi_n\}$. This can always be achieved. For example, if the space $X^*_\lambda$ is infinite dimensional, one can take for $(\Omega, P)$ the space $(X, \lambda)$, then take for $\{\xi_n\}$ any orthonormal basis in $X^*_\lambda$ and put $S(x) = x$ (if the space $X^*_\lambda$ is finite dimensional, then so is the operator $A$ and the claim is trivial). For any $g \in X^*$, the series $\sum_{n=1}^\infty \xi_n g(Ae_n)$ converges a.s. and its sum has the same centered Gaussian distribution as $g(S)$. Indeed, this follows from the equality of the variances:
$$\sum_{n=1}^\infty g(Ae_n)^2 = \sum_{n=1}^\infty \bigl(R_\gamma(g), Ae_n\bigr)_H^2 = \sum_{n=1}^\infty \bigl(A^* R_\gamma(g), e_n\bigr)_H^2 = \bigl(A^* R_\gamma(g), A^* R_\gamma(g)\bigr)_H.$$
Note that it suffices to prove the convergence of series (3.5.1) in the completion of $X$, since $S$ with probability 1 is in $X$. Hence we may assume from the very beginning that $X$ is complete. Then there exists an absolutely convex metrizable compact set $K$ such that $\lambda(K) = P(S \in K) > 0$. The union of the sets $nK$ is a linear subspace of positive measure. Therefore, by the zero-one law
$$\lambda\Bigl(\bigcup_n nK\Bigr) = P\Bigl(S \in \bigcup_n nK\Bigr) = 1.$$
The initial topology coincides with the weak one on the set $K$ due to its compactness. Since $K$ is metrizable, there exists a countable family of continuous linear functionals $\{g_i\}$, generating the topology of $K$, hence the topology of every set $nK$ as well. Unfortunately, this family may fail to generate the topology of the union of the $nK$'s (e.g., in the space $l^2$ with the weak topology, the coordinate functions generate the topology of every ball, but not of the whole space). Therefore, we have to deduce the convergence of our series in $X$ from the condition that, with probability 1, the sequence $\{g_i(S_n)\}$ converges for every $i$, where
$$S_n(\omega) = \sum_{i=1}^n \xi_i(\omega) A e_i.$$
The easiest way to do this is to prove that, with probability 1, the sequence $S_j$ remains in one of the sets $nK$. Denoting by $q$ the Minkowski functional of $K$, it suffices to verify that
$$\sup_j q(S_j) < \infty \quad \text{a.s.}$$
To this end, note that the sequence of the finite dimensional vectors $S_j$ is a martingale with respect to the sequence of the $\sigma$-fields generated by $\xi_1, \ldots, \xi_j$. Then the sequence $q(S_j)$ is a submartingale (see Problem A.3.38). The covariances of the vectors $S_n$ are given by the equalities
$$R_{S_n}(g)(g) = \bigl(P_n A^* R_\gamma(g), A^* R_\gamma(g)\bigr)_H,$$
where $P_n$ is the orthogonal projection in $H$ to the linear span of the vectors $e_1, \ldots, e_n$. These covariances are majorized by the covariance of $S$, since we have $|P_n h|_H \le |h|_H$, $\forall\, h \in H$. According to Theorem 3.3.6, for any symmetric convex set $C$, one has
$$P(S_n \in C) \ge P(S \in C).$$
This means that $\mathbb{E}\,q(S_n) \le \mathbb{E}\,q(S)$, where the right-hand side is finite by Fernique's theorem. By virtue of Theorem A.3.6 in Appendix, we get the desired claim.
Let us prove (3.5.2). It suffices to verify this equality for almost all $x$ from a full measure set $E$ that is the union of a sequence of metrizable compact sets in $X$. We know that there exists a sequence of continuous linear functionals $g_i$ separating the points in $E$. Since the series on the right in (3.5.2) converges for a.e. $x \in E$, it suffices to show that, for every $i \in \mathbb{N}$, one has
$$g_i(x) = \sum_{n=1}^\infty \hat{e}_n(x)\, g_i(e_n) \quad \text{a.e. on } E. \tag{3.5.3}$$
To this end, it remains to observe that $g_i(e_n) = (g_i, \hat{e}_n)_{L^2(\gamma)}$, which yields (3.5.3), since $\{\hat{e}_n\}$ is an orthonormal basis in $X^*_\gamma$.
Note that the previous theorem yields the equality
$$\int_X f(x)\,\gamma(dx) = \int_\Omega f\Bigl(\sum_{n=1}^\infty \xi_n(\omega)\, e_n\Bigr)\,P(d\omega), \tag{3.5.4}$$
where $\{\xi_n\}$ is any sequence of independent standard Gaussian random variables on a probability space $(\Omega, P)$.
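As an illustration of series (3.5.1) with $A = I$, one may take the Wiener measure, whose Cameron-Martin space consists of the primitives of elements of $L^2[0,1]$ (Example 3.3.9). The Python sketch below assumes the orthonormal basis $e_n(t) = \sqrt{2}\,\sin((n - \tfrac12)\pi t)/((n - \tfrac12)\pi)$ of that space (the primitives of the orthonormal cosine system $\sqrt{2}\cos((n - \tfrac12)\pi t)$ in $L^2[0,1]$); this choice of basis and all function names are assumptions made for the example. The covariance of the partial sum $\sum_{n \le N} \xi_n e_n$ at times $s, t$ equals $\sum_{n \le N} e_n(s) e_n(t)$ and should approach the Wiener covariance $\min(s, t)$.

```python
import math

def basis_function(n, t):
    """e_n(t) = sqrt(2) sin((n - 1/2) pi t) / ((n - 1/2) pi), n = 1, 2, ..."""
    w = (n - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(w * t) / w

def partial_sum_covariance(s, t, terms):
    """Covariance at (s, t) of sum_{n <= terms} xi_n e_n for independent N(0,1) xi_n."""
    return sum(basis_function(n, s) * basis_function(n, t)
               for n in range(1, terms + 1))

def max_error(terms, grid=11):
    """Largest deviation from min(s, t) on a uniform grid of [0, 1]^2."""
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(partial_sum_covariance(s, t, terms) - min(s, t))
               for s in points for t in points)

for terms in (10, 100, 1000):
    print(terms, max_error(terms))
```

The deviation decays roughly like the tail of $\sum_n (n - \tfrac12)^{-2}$, so the convergence of the covariances is slow; the sketch checks only second moments, not the almost sure convergence asserted by the theorem.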
3.5.2. Corollary. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ with the infinite dimensional Cameron-Martin space, $\{\xi_n\}$ an orthonormal basis in $X^*_\gamma$, and $e_n = R_\gamma(\xi_n)$. Then, for any function $f \in L^p(\gamma)$, the function
$$f_n(x) = \int_X f(P_n x + S_n y)\,\gamma(dy), \tag{3.5.5}$$
where
$$P_n x = \sum_{i=1}^n \xi_i(x)\, e_i, \qquad S_n y = \sum_{i=n+1}^\infty \xi_i(y)\, e_i,$$
serves as the conditional expectation of $f$ with respect to the $\sigma$-field generated by $\xi_1, \ldots, \xi_n$. In particular, $f_n \to f$ in $L^p(\gamma)$ and $\gamma$-almost everywhere. The same is true for the Bochner integrable mappings with values in Banach spaces.
PROOF. For any bounded Borel function g of the form g(x) = g(Px), by Theorem 3.5.1, we have the equality
f f(x)g(Pnx)7(d) = f f
ynen) g(EY e) R. Therefore,
JfI?R For all n such that
Iffdn
1 - e and v(S) > 1 - e, where K and S are compact sets, then Q = K + S is compact as well and, in addition, Jiz(Q - x) v(dx) > (I - e)2,
p * v(Q) > S
since K C Q - x whenever x E S. By construction, one has E ro for y c- r. Indeed, the centered measure y_Q, coincides with the convolution of the images of
the measure y under the mappings x " x/f and x'-. -x/f. Let us now choose a compact set K such that y(K) > 1/2 and y(K +a,) > 1/2 for ally E r. Then
a,EK-K,since (K+a,)f1K#0.
To prove the last claim, let us note that the formula for the Fourier transform of a Gaussian measure shows that the weak convergence of $\gamma_n$ to $\gamma$ implies the weak convergence of $a_{\gamma_n}$ to $a_\gamma$ in $X$. Since on the compact sets in $X$ the weak topology coincides with the initial one, the last claim follows. □
3.8.10. Corollary. A family of Radon Gaussian measures is uniformly tight precisely when the set of their means is relatively compact and the set of the corresponding centered measures is uniformly tight.
3.8.11. Theorem. Let $\{\gamma_n\}$ be a uniformly tight sequence of centered Gaussian Radon measures on a locally convex space $X$. Then, for any continuous seminorm $q$ on $X$, there exists $\alpha > 0$ such that
$$\sup_n \int_X \exp\bigl(\alpha q(x)^2\bigr)\,\gamma_n(dx) < \infty.$$
If, in addition, the sequence $\{\gamma_n\}$ converges weakly to a measure $\gamma_0$, then, for any $\kappa < \alpha$, one has
$$\lim_{n \to \infty} \int_X \exp\bigl(\kappa q(x)^2\bigr)\,\gamma_n(dx) = \int_X \exp\bigl(\kappa q(x)^2\bigr)\,\gamma_0(dx).$$
In particular, for any positive $r$, one has
$$\lim_{n \to \infty} \int_X q(x)^r\,\gamma_n(dx) = \int_X q(x)^r\,\gamma_0(dx).$$
PROOF. Let $Q$ be a compact set such that $\gamma_n(Q) > 3/4$ for all $n$. Denote by $K$ the closed absolutely convex hull of $Q$ and by $q_K$ the Minkowski functional of the set $K$, defined on the linear span of $K$. Then $\gamma_n(K) > 3/4$ for all $n \in \mathbb{N}$. Since $q$ is a continuous seminorm, we get
$$\sup_{x \in K} q(x) = S < \infty.$$
Therefore, $q(x) \le S q_K(x)$ for all $x$ from the linear span of $K$, which has full measure by virtue of the zero-one law. According to Fernique's theorem 2.8.5, there exists a positive number $k$ such that
$$\sup_n \int_X \exp\bigl(k q_K(x)^2\bigr)\,\gamma_n(dx) < \infty.$$
This gives the first claim. The remaining claims follow from Lemma 3.8.7.
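In dimension one the statement of Theorem 3.8.11 can be made fully explicit. The Python sketch below assumes $X = \mathbb{R}$, $q(x) = |x|$ and $\gamma_n = N(0, \sigma_n^2)$, for which $\int \exp(a x^2)\,\gamma_n(dx) = (1 - 2a\sigma_n^2)^{-1/2}$ when $2a\sigma_n^2 < 1$ and the integral is infinite otherwise. The particular sequence $\sigma_n = 1 + 1/n$, the exponent $a = 0.1$ and the function names are illustrative assumptions. Uniform tightness amounts here to $\sup_n \sigma_n < \infty$, and any $a < (2\sup_n \sigma_n^2)^{-1}$ gives a uniform bound.

```python
import math

def exp_moment_closed_form(a, sigma):
    """Integral of exp(a x^2) against N(0, sigma^2); finite only if 2 a sigma^2 < 1."""
    if 2.0 * a * sigma * sigma >= 1.0:
        return math.inf
    return 1.0 / math.sqrt(1.0 - 2.0 * a * sigma * sigma)

def exp_moment_quadrature(a, sigma, half_width=40.0, steps=80000):
    """Midpoint-rule approximation of the same integral on [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        total += math.exp(a * x * x - x * x / (2.0 * sigma * sigma))
    return total * dx / (sigma * math.sqrt(2.0 * math.pi))

sigmas = [1.0 + 1.0 / n for n in range(1, 200)]   # sigma_n -> 1, sup sigma_n = 2
a = 0.1                                           # below 1 / (2 * 2**2) = 0.125
values = [exp_moment_closed_form(a, s) for s in sigmas]
print(max(values), values[-1], exp_moment_closed_form(a, 1.0))
print(exp_moment_quadrature(a, 1.0))
```

The first line prints the uniform bound, the value for the last $\sigma_n$ in the list, and the value for the limit measure $N(0, 1)$; the quadrature is an independent check of the closed form.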
The next result gives a simple and natural approximation of a Gaussian measure by its finite dimensional projections.
3.8.12. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and $\{e_n\}$ an orthonormal basis in $H(\gamma)$. Put
$$P_n x = \sum_{i=1}^n \hat{e}_i(x)\, e_i.$$
Then the sequence of Gaussian measures $\gamma \circ P_n^{-1}$ converges weakly to $\gamma$.
PROOF. Let $f$ be a bounded continuous function on $X$. We know that $P_n x \to x$ in $X$ for $\gamma$-a.e. $x$. Hence, by Lebesgue's theorem, we get
$$\lim_{n \to \infty} \int_X f(x)\,\gamma \circ P_n^{-1}(dx) = \lim_{n \to \infty} \int_X f(P_n x)\,\gamma(dx) = \int_X f(x)\,\gamma(dx),$$
which is our claim. □
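Weak convergence of $\gamma \circ P_n^{-1}$ to $\gamma$ implies, in particular, convergence of the Fourier transforms at every continuous linear functional. The Python sketch below checks this consequence in a diagonal model that is assumed only for illustration: $\gamma$ is the centered Gaussian measure on $l^2$ whose covariance operator has eigenvalues $\lambda_i = i^{-2}$ in the standard basis, so that $P_n$ is the truncation to the first $n$ coordinates and the Fourier transform of $\gamma \circ P_n^{-1}$ at $f = (f_i)$ equals $\exp(-\tfrac12 \sum_{i \le n} \lambda_i f_i^2)$. The functional $f_i = 1/i$ and the function names are hypothetical choices.

```python
import math

def eigenvalue(i):
    """lambda_i = i^{-2}: eigenvalues of the assumed diagonal covariance operator."""
    return 1.0 / i ** 2

def functional_coordinate(i):
    """f_i = 1/i: coordinates of an illustrative continuous functional on l^2."""
    return 1.0 / i

def projected_char_functional(n):
    """Fourier transform of gamma o P_n^{-1} at f: exp(-1/2 sum_{i<=n} lambda_i f_i^2)."""
    variance = sum(eigenvalue(i) * functional_coordinate(i) ** 2
                   for i in range(1, n + 1))
    return math.exp(-0.5 * variance)

# Fourier transform of gamma itself at f, using sum_{i>=1} i^{-4} = pi^4 / 90.
limit_value = math.exp(-0.5 * math.pi ** 4 / 90.0)

for n in (1, 10, 100, 2000):
    print(n, projected_char_functional(n), limit_value)
```

The values decrease monotonically to the limit because each added coordinate contributes a nonnegative amount to the variance of $f$ under the projected measure.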
For some spaces, constructive conditions for the weak compactness of families of Gaussian measures are known.
3.8.
Weak convergence
133
3.8.13. Example. Let $X$ be a separable Hilbert space.
(i) A family $\Gamma$ of centered Gaussian measures on $X$ with the weak topology is uniformly tight precisely when
$$\sup_{\gamma \in \Gamma} \int \|x\|^2\,\gamma(dx) < \infty.$$
(ii) A family $\Gamma$ of centered Gaussian measures on $X$ with its Hilbert norm is uniformly tight (equivalently, relatively weakly compact) precisely when there exists an injective nonnegative compact operator $T$ on $X$ such that $\gamma(T(X)) = 1$ for all $\gamma \in \Gamma$ and
$$\sup_{\gamma \in \Gamma} \int \|T^{-1}x\|^2\,\gamma(dx) < \infty.$$
An equivalent condition: $\sup_{\gamma \in \Gamma} \operatorname{trace}(T^{-1} K_\gamma T^{-1}) < \infty$, where $K_\gamma$ is the covariance operator of the measure $\gamma$.
(iii) A sequence of centered Gaussian measures $\gamma_n$ with covariance operators $K_n$ converges weakly to the centered Gaussian measure $\gamma$ with covariance operator $K$ if and only if $\{K_n^{1/2}\}$ converges to $K^{1/2}$ in the Hilbert-Schmidt norm.
(iv) A family $\Gamma$ of centered Gaussian measures on $X$ is uniformly tight (equivalently, relatively weakly compact) if and only if the corresponding covariance operators $K_\gamma$ satisfy the following conditions: $\sup_{\gamma \in \Gamma} \operatorname{trace} K_\gamma < \infty$ and for some (and then for any) orthonormal basis $\{e_n\}$ in $X$ the series $\sum_{n=1}^\infty (K_\gamma e_n, e_n)$ converges uniformly in $\gamma \in \Gamma$.
PROOF. (i) The existence of a weakly compact set $K$ of measure greater than $3/4$ for all $\gamma \in \Gamma$ implies the existence of a ball with this property. Hence the desired uniform estimate follows from Fernique's theorem. The converse follows from Chebyshev's inequality
$$\gamma\bigl(\|x\| > R\bigr) \le R^{-2} \int \|x\|^2\,\gamma(dx)$$
and the weak compactness of the closed balls in any Hilbert space.
(ii) Let $K$ be a compact set such that $\gamma(K) > 3/4$ for all $\gamma$ from $\Gamma$. Choosing an arbitrary orthonormal basis in $X$, we may assume that we deal with $l^2$. Note that $K$ is contained in some compact ellipsoid $Q$ of the form
$$Q = \Bigl\{(x_n)\colon \sum_{n=1}^\infty t_n^{-2} x_n^2 \le 1\Bigr\},$$
where $t_n \to 0$, $t_n > 0$. Indeed, by virtue of compactness of $K$, one has
$$\lim_{n \to \infty} \sup_{x \in K} \sum_{i=n}^\infty x_i^2 = 0.$$
Hence we can pick an increasing sequence of natural numbers $N_n$ such that
x
E x2 r) <e, Vn,m>N. By definition, one has
y(xE X: ItPnx - PmxIlx > e) =v(xE H: I,Pnx - PmxIIx > e). For any orthogonal projection P E P(H) with P 1 PN, by virtue of the relationships P,,vP = 0 and limk-x P,g+kPx = Px, one has
3.9.
Abstract Wiener spaces
139
+ E H: IIPXIL, > e) = lim v(x E H: {I(PvT,i - P, )Pxll, > F). A
Finally, by Theorem 3.3.6, we get
+ E H: II(Pv+k - Pv)Pxli., > e) < v(.t E H: II(P.,-+x - P,)xJJ.r > e)
0, let us take a positive measure set B such that f (x) E (y - e, y + e), V x E B, for some real y. Then B - B contains some interval (-r, r). Hence g(h) < (-2e, 2e) if h E (-r, r),
because $g(h) = f(x + h) - f(x)$ for some $x \in B$ such that $x + h \in B$. This implies that $g(h) = ch$ for some real $c$. By Fubini's theorem, there exists $x_0$ such that $f(x_0 + h) = f(x_0) + ch$ for a.e. $h$. This yields an affine version of $f$. □
3.11.8. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $f$ be a $\gamma$-measurable function such that, for every fixed $h \in H(\gamma)$, one has $f(x + h) - f(x) = g(h)$ $\gamma$-a.e., where $g(h)$ is some number. Then $f = c + f_0$ $\gamma$-a.e., where $c \in \mathbb{R}^1$ and $f_0 \in X^*_\gamma$.
PROOF. Let $\{e_i\}$ be an orthonormal basis in $H(\gamma)$. According to Lemma 3.11.7, for every $i$, the function $f$ admits a version $\varphi_i$ that is affine along the lines $x + \mathbb{R}^1 e_i$. By the zero-one law, $\partial_{e_i}\varphi_i = c_i$ a.e. for some number $c_i$. By Proposition 3.5.5,
$$f(x) = c_1\hat{e}_1(x) + \cdots + c_n\hat{e}_n(x) + \psi_n(x) \quad \text{a.e.},$$
where $\psi_n(x) = g_n\bigl(\hat{e}_{n+1}(x), \hat{e}_{n+2}(x), \ldots\bigr)$ for some Borel function $g_n$ on $\mathbb{R}^\infty$. Note that $\sum_i c_i^2 < \infty$. Indeed, otherwise
$$\int \exp(itf)\,d\gamma = \exp\Bigl(-\tfrac12 t^2 \sum_{i=1}^n c_i^2\Bigr) \int \exp(it\psi_n)\,d\gamma \to 0, \qquad \forall\, t \ne 0,$$
as $n \to \infty$, which is a contradiction, since $\gamma \circ f^{-1}$ is a probability measure on $\mathbb{R}^1$. Hence the series $\sum_i c_i \hat{e}_i$ converges almost everywhere, whence the convergence of the sequence $\{\psi_n\}$. By the zero-one law, $\{\psi_n\}$ converges a.e. to some constant, whence our claim. □
3.11.9. Corollary. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $T$ be a $\gamma$-measurable mapping with values in a separable Fréchet space $Y$ such that, for every fixed $h \in H(\gamma)$, one has $T(x + h) - T(x) = g(h)$ $\gamma$-a.e., where $g(h)$ is some element of $Y$. Then $T$ has an affine modification.
The following example shows that the continuity of a linear function $l\colon X \to \mathbb{R}^1$ on the Hilbert space $H(\gamma)$ does not imply its measurability with respect to $\gamma$. This example was constructed by H. von Weizsäcker assuming the continuum hypothesis; N. Nedikov constructed an analogous example assuming Martin's axiom and the negation of the continuum hypothesis. We do not know whether such an example can be constructed without extra set-theoretical assumptions.
3.11.10. Example. Let $\gamma$ be a centered Gaussian measure on a separable Banach space $X$ (or a centered Radon Gaussian measure on a locally convex space) such that $H(\gamma)$ is infinite dimensional. Suppose that the continuum hypothesis is true. Then there exists a linear function $l$ on $X$, equal to zero on $H(\gamma)$, but nonmeasurable with respect to the measure $\gamma$.
PROOF. Let $X$ be a separable Banach space. Since measurable proper linear functionals vanishing on $H(\gamma)$ equal zero almost everywhere, it suffices to extend the identically zero functional from $H(\gamma)$ to a linear function $l$ on $X$ in such a way that $l(x_\alpha) = 1$, where $\{x_\alpha\}$ is a family of vectors linearly independent of $H(\gamma)$ with 0.
3.11.
Complements and problems
147
Such a family is constructed by the aid of the transfinite induction. To this end, we establish a one-to-one correspondence between the family K of all compact sets in X of positive y-measure (this family has cardinality c) and the interval of ordinals I = [0, w1), where w1 is the first uncountable ordinal, which corresponds, by virtue of the continuum hypothesis, to the cardinality of K equal c. Let 3 E I and assume that, for all a 0 such that EIStf P < C(IE 2)P/2 = CKt(t, t)P/2.
Supports and r-additivity of Gaussian measures Let us mention the following nice result conjectured by A. Tortrat and proved by Talagrand 17521. Note that the r-additivity can be defined also for a measure p on the or-field £(X) in a locally convex space X as the property that, for every increasing net of open sets UQ E £(X) with U. U, E £(X), there holds the equality p(UQ UQ) = limo p(UQ). It is known (see [800. Theorem 1.3.21), that in this case p extends uniquely to a Borel measure which is r-additive on 8(X) (i.e., the corresponding property is preserved for all open sets U,).
3.11.16. Theorem. Every Gaussian measure on $\mathbb{R}^T$ is $\tau$-additive.
3.11.17. Remark. The property stated in the previous theorem may be useful, since any Gaussian measure on $\mathcal{E}(X)$ extends naturally to a Gaussian measure on the space $\mathbb{R}^T$ with $T = X^*$ (see Example 2.3.10). A useful property of $\tau$-additive measures is that they have topological supports, since any union of open sets of measure zero has measure zero in the case of a $\tau$-additive measure.
3.11.18. Remark. Let $X$ be a Banach space and let $\mu$ be a probability measure on $\mathcal{E}(X)$ which is $\tau$-additive (e.g., Radon) for the weak topology of $X$. Then, by a result of R. Phillips (see [800, Ch. I, §5, Corollary 4]), it admits an extension to a Radon measure in the norm topology of $X$. In particular, if $X$ is a reflexive Banach space (e.g., $L^p(\sigma)$, $p \in (1, \infty)$), then every measure $\mu$ on $\mathcal{E}(X)$ has a Radon extension in the norm topology (see [800, Ch. I, §5, Corollary 5]).
The reader should be warned that there exist Borel Gaussian measures which are not $\tau$-additive (see [753, p. 183]). Example 3.11.11 yields non-Radon Gaussian measures on separable normed spaces.
3.11.19. Example. Assume that the continuum hypothesis is true. Then there exists a Borel Gaussian measure $\mu$ on a separable Euclidean space $X$ such that $\mu(K) = 0$ for every compact set $K \subset X$.
PROOF. Let $\gamma$ be an arbitrary nondegenerate centered Gaussian measure on $l^2$. As shown in the proof of Example 3.11.11, there exists a linear functional $l$ on $l^2$ such that $\gamma_*(l^{-1}(0)) = 0$ and $\gamma^*(l^{-1}(0)) = 1$. Let us put $X = l^{-1}(0)$. Clearly, $X$ is a separable Euclidean space (with the inner product from $l^2$). We shall take for $\mu$ the restriction of $\gamma$ to $X$. That is, for every $B \in \mathcal{B}(l^2)$, we put $\mu(X \cap B) = \gamma(B)$. Note that if $X \cap B_1 = X \cap B_2$, then $\gamma(B_1) = \gamma(B_2)$, since $X$ has outer measure 1. Hence $\mu$ is well-defined. Clearly, it is a Borel measure, since every set $C \in \mathcal{B}(X)$ has the form $C = X \cap B$ for some $B \in \mathcal{B}(l^2)$ (note that this is true for all open sets by the definition of the topology in $X$). Every continuous linear functional on $X$ is a restriction of a (unique) continuous linear functional on $l^2$, which shows that $\mu$ is centered Gaussian on $X$. Finally, $\mu(K) = 0$ for every compact set $K \subset X$, since $K$ is compact in $l^2$ as well and hence $\gamma(K) \le \gamma_*(l^{-1}(0)) = 0$.
3.11.20. Remark. Let $X$ be any separable Banach space and let $l$ be the linear functional constructed in Example 3.11.11. The previous example shows that, for every Gaussian measure $\gamma$ on $X$ not concentrated on a finite dimensional subspace, the corresponding measure $\mu$ on $X_0 = \mathrm{Ker}\, l$ has the following properties: its Cameron-Martin space $H(\mu)$ is not Hilbert (i.e., is not complete) and there exists $f \in X_0^*$ such that $R_\mu f \notin X_0$. Indeed, by construction, $X_0$ contains no continuously embedded infinite dimensional Hilbert spaces (since it contains no compact sets with linear spans of uncountable dimension). As a result, $X_0$ cannot contain $R_\mu(X^*)$, since there exists an infinite dimensional Hilbert space $E$ continuously and densely embedded into $X^*$, which would give a continuously embedded infinite dimensional Hilbert space $R_\mu(E) \subset H(\mu) \subset X_0$. It is interesting to note that this is true for every Gaussian measure on $X_0$ not concentrated on a finite dimensional subspace, since every such measure can be obtained as the restriction of some Gaussian measure on $X$ as described in the previous example.
3.11.21. Remark. Note that Lemma 2.7.1, stating that, for a Gaussian measure $\mu$, the measure $\mu(A)^{-1}\mu|_A$ may be Gaussian only in the case $\mu(A) = 1$, remains valid (with the same proof) also for any Borel set $A$, provided $\mu$ is Radon.
In connection with convergence of Gaussian series, let us mention the following result, which is a special case of a general theorem due to J. Hoffmann-Jørgensen and S. Kwapień. Its proof can be found in [800, Ch. V, §6].
3.11.22. Theorem. Let $X$ be a separable Banach space not containing closed linear subspaces linearly homeomorphic to $c_0$. Suppose that $\{X_n\}$ is a sequence of independent centered Gaussian vectors in $X$ such that $\sup_n \mathbb{E}\bigl\|\sum_{i=1}^n X_i\bigr\|^p < \infty$ for some $p > 0$. Then the series $\sum_{n=1}^\infty X_n$ converges almost surely in $X$ and
$$\mathbb{E}\Bigl\|\sum_{n=1}^\infty X_n\Bigr\|^p < \infty \tag{3.11.4}$$
is true for every $p > 0$. In particular, the series in (3.11.4) converges almost surely if $\sup_n \bigl\|\sum_{i=1}^n X_i\bigr\| < \infty$ a.s. Conversely, the almost sure convergence of this series implies (3.11.4).
Note that this result applies to all reflexive separable Banach spaces, e.g., to $L^p[a, b]$, $1 < p < \infty$. Simple examples show that it is not valid for the space $c_0$ (see, e.g., [800, Ch. V, §5]).
Recall that a Banach space $X$ is said to be a space of cotype 2 if $\sum_{n=1}^\infty \|x_n\|^2 < \infty$ for any sequence $\{x_n\} \subset X$ such that the series $\sum_{n=1}^\infty \varepsilon_n x_n$ converges a.s., where the $\varepsilon_n$'s are independent random variables with $P(\varepsilon_n = 1) = P(\varepsilon_n = -1) = 1/2$. If the condition $\sum_{n=1}^\infty \|x_n\|^2 < \infty$ implies the a.e. convergence of the series $\sum_{n=1}^\infty \varepsilon_n x_n$, then $X$ is said to be a space of type 2. For example, the spaces $L^p$ have cotype $\max(p, 2)$ and type $\min(p, 2)$. The proof of the following result can be found in [164], [499], [557], [800].
3.11.23. Theorem. If $\gamma$ is a Radon Gaussian measure on a Banach space $X$ of cotype 2, then $\gamma$ has a Hilbert support. In particular, this is true for $X = l^p$ with $1 < p \le 2$. Conversely, if every Radon Gaussian measure on $X$ has a Hilbert support, then $X$ has cotype 2.
3.11.24. Remark. Let $\gamma$ be a centered Gaussian measure on a separable Banach space $X$. Recall that an operator $S \in \mathcal{L}(X^*, X)$ is called nuclear if $Sf = \sum_{n=1}^\infty \lambda_n f(a_n) b_n$, where $\|a_n\| = \|b_n\| = 1$ and $\sum_{n=1}^\infty |\lambda_n| < \infty$. We know that the covariance operator $R_\gamma$ is nuclear. In addition, it is symmetric and nonnegative, i.e., $(g, R_\gamma f) = (f, R_\gamma g)$ and $(f, R_\gamma f) \ge 0$ for all $f, g \in X^*$. However, not every nuclear, symmetric and nonnegative operator $S \in \mathcal{L}(X^*, X)$ is the covariance of a Gaussian measure on $X$. It is known that the class of all nuclear, symmetric and nonnegative operators between $X^*$ and $X$ coincides with the class of the covariance operators of Gaussian measures on $X$ precisely when $X$ is a space of type 2 (see, e.g., [164], [472, Ch. 9], [501], and the references therein).
Measurable linear extensions and the second quantization
3.11.25. Lemma. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $A$ and $B$ be two bounded linear operators on the Hilbert space $H = H(\gamma)$ such that $AA^* + BB^* = I$. Then the image of the measure $\gamma \otimes \gamma$ under the mapping $(x, y) \mapsto Ax + By$, $X \times X \to X$, is $\gamma$.
PROOF. Clearly, the mapping above is $(\mathcal{B}(X \times X), \mathcal{E}(X))$-measurable and, for every $f \in X^*$, the functional $g\colon (x, y) \mapsto f(Ax) + f(By)$ is linear $\gamma \otimes \gamma$-measurable. In addition, the functional $g$ is a Gaussian variable with mean zero and variance
$$\int f(Ax)^2\, \gamma(dx) + \int f(By)^2\, \gamma(dy) = |R_\gamma(f \circ A)|_H^2 + |R_\gamma(f \circ B)|_H^2.$$
The latter coincides with $|A^* R_\gamma f|_H^2 + |B^* R_\gamma f|_H^2$, since, according to Lemma 3.7.8, $R_\gamma(f \circ A) = A^* R_\gamma f$ and similarly for $B$. It remains to note that, for every $h \in H$, one has by condition
$$|A^* h|_H^2 + |B^* h|_H^2 = (AA^* h, h)_H + (BB^* h, h)_H = (h, h)_H.$$
Thus, $g$ has variance $|R_\gamma f|_H^2 = \sigma(f)^2$, whence the desired conclusion.
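Now, for any operator $T \in \mathcal{L}(H)$ with $\|T\|_{\mathcal{L}(H)} \le 1$, we define the operator $\Gamma(T)$ on $L^p(\gamma)$, $p \ge 1$, by the relation
$$\Gamma(T) f(x) = \int_X f(T^* x + S y)\, \gamma(dy), \quad f \in L^p(\gamma),$$
where $S = (I - T^* T)^{1/2}$. Note that $\Gamma(T)$ equals $T_t$ if $T = e^{-t} I$. The construction of the second quantization originates in quantum field theory.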
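As a hedged illustration of the remark on $T = e^{-t}I$, the following worked computation is a sketch under one assumption: $S$ is taken to be the nonnegative operator with $T^*T + S^2 = I$, which is what the hypothesis $AA^* + BB^* = I$ of Lemma 3.11.25 requires for $A = T^*$ and $B = S$. Under that assumption the definition reduces to Mehler's formula for the Ornstein-Uhlenbeck semigroup.

```latex
\begin{align*}
T = e^{-t} I \;&\Longrightarrow\; T^* = e^{-t} I, \qquad
  S = (I - T^* T)^{1/2} = \sqrt{1 - e^{-2t}}\, I, \\
T^* T + S^2 &= e^{-2t} I + (1 - e^{-2t})\, I = I, \\
\Gamma(T) f(x) &= \int_X f\bigl(e^{-t} x + \sqrt{1 - e^{-2t}}\, y\bigr)\, \gamma(dy) = T_t f(x).
\end{align*}
```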
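3.11.26. Proposition. Let $T \in \mathcal{L}(H)$ and $\|T\|_{\mathcal{L}(H)} \le 1$. Then, for every $p \ge 1$, one has $\|\Gamma(T)\|_{\mathcal{L}(L^p(\gamma))} = 1$.
PROOF. According to Lemma 3.11.25, the function
$$x \mapsto \int_X |f(T^* x + S y)|^p\, \gamma(dy)$$
is defined for $\gamma$-a.e. $x$, is $\gamma$-integrable, and its integral equals $\|f\|_{L^p(\gamma)}^p$. It remains to apply Hölder's inequality to $|\Gamma(T) f(x)|^p$.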
In relation with measurable extensions of linear mappings on the Cameron-Martin space $H$ of a centered Radon Gaussian measure $\gamma$ one might ask about continuous extensions. Clearly, even for linear functionals no continuous extension may exist. However, a more natural question here is as follows. Given an operator $A \in \mathcal{L}(H)$, is it possible to find some space $E$ supporting $\gamma$ in such a way that $A$ extends to a continuous operator on $E$? For example, if $X$ is a Hilbert space, then, for any given $f \in X^*_\gamma$, there is a Hilbert space $E$ continuously embedded into $X$ such that $\gamma(E) = 1$ and $f$ has a version continuous on $E$. Indeed, we may assume that $X = l^2$ and that the covariance operator of $\gamma$ is diagonal with the eigenvalues $k_n$. Then $f$ is given by $\sum_{n=1}^\infty t_n x_n$, where $\sum_{n=1}^\infty k_n t_n^2 < \infty$. We can take for $E$ the completion of the space of finite sequences with respect to the norm $\|x\|_E = \sqrt{(x, x) + f(x)^2}$. The following result in this direction can be deduced from the constructions in [92]. Its proof is suggested as Problem 3.11.48. In Chapter 7 we shall mention another result of the same sort.
3.11.27. Proposition. Let $H$ be a separable Hilbert space and let $A \in \mathcal{L}(H)$. Then there exists an injective nonnegative Hilbert-Schmidt operator $T$ on $H$ such that $A$ is continuous with respect to the norm $\|x\|_E := |Tx|_H$. In particular, $A$ extends to a continuous linear operator on the completion $E$ of the space $(H, \|\cdot\|_E)$ and the standard cylindrical Gaussian measure of $H$ is countably additive on $E$.
Another possibility to get continuity is to use a stronger topology.
3.11.28. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a Banach space $X$ (or, more generally, on a Fréchet space) and let $A\colon X \to Y$ be a $\gamma$-measurable linear mapping with values in a separable normed space $Y$. Then there exist a full measure linear subspace $E \subset X$ and a linear operator $A_0$ on $E$ such that: (i) $E$ is separable with respect to some norm $\|\cdot\|_E$, the natural embedding $(E, \|\cdot\|_E) \to X$ is continuous and the Borel sets with respect to the norm $\|\cdot\|_E$ are $\gamma$-measurable; (ii) $\|A_0(x)\|_Y \le \|x\|_E$ for every $x \in E$ and $A = A_0$ a.e.
PROOF. We may assume that $A$ is a proper linear version (which we take for $A_0$). Let us take for $E$ any separable Banach space $(B, \|\cdot\|_B)$ continuously embedded into $X$ and having full measure (we know that such a space always exists). Finally, let us equip $E$ with the norm $\|x\|_E := \|x\|_B + \|Ax\|_Y$. It remains to be noted that $E$ is separable with respect to this norm, which follows from the separability of the set $\{(x, Ax) : x \in B\}$ in the separable normed space $B \times Y$. In order to see the measurability of the Borel sets with respect to the new norm, note that the open balls with respect to this norm are measurable by the measurability of $\|\cdot\|_B$ and $A$. By the separability of $E$, this implies the measurability of all open subsets of $E$, hence of all Borel subsets.
Tensor products
It is worth noting that abstract Wiener spaces admit the usual multiplication. It is less obvious that, as stated in the following result from [160], they admit the tensor multiplication. A proof can be found also in [150], [472]. Recall that the tensor product $H_1 \otimes_2 H_2$ of two Hilbert spaces $H_1$ and $H_2$ is defined as the completion of their algebraic tensor product $H_1 \otimes H_2$ with respect to the inner product $(a \otimes b, c \otimes d)_2 = (a, c)_{H_1} (b, d)_{H_2}$. The tensor product $X_1 \otimes_\varepsilon X_2$ of two Banach spaces $X_1$ and $X_2$ is defined as the completion of their algebraic tensor product with respect to the norm $\|x\|_\varepsilon := \sup\{|(y_1 \otimes y_2)(x)| : \|y_i\|_{X_i^*} \le 1\}$.
3.11.29. Proposition. If $(i_1, H_1, X_1)$ and $(i_2, H_2, X_2)$ are abstract Wiener spaces, then $(i_1 \otimes i_2, H_1 \otimes_2 H_2, X_1 \otimes_\varepsilon X_2)$ is also an abstract Wiener space. If $\gamma_i$ is a Gaussian measure on $X_i$ with the Cameron-Martin space $H_i$, $i = 1, 2$, then the corresponding Gaussian measure on $X_1 \otimes_\varepsilon X_2$ is denoted by $\gamma_1 \otimes_\varepsilon \gamma_2$. In addition, one has R,, rc)=0, Ve>0. .,k-x then this sequence converges in measure to some measurable mapping.
3.11.39. Let $\xi$ and $\eta$ be two independent centered Gaussian random vectors in a locally convex space $X$ inducing Radon measures $P_\xi$ and $P_\eta$. Suppose that $P_{\xi+\eta} \ll P_\xi$. Construct an example showing that the equality $P_\eta(H(P_\xi)) = 0$ may be true. Prove that the following conditions are equivalent: (i) $P_\eta(H(P_\xi)) = 1$; (ii) the vector $(\xi + \eta, \eta)$ in $X \times X$ induces a measure that is absolutely continuous with respect to the measure induced by $(\xi, \eta)$. Hint: see [609], where a more general case is considered, when $\eta$ may be non-Gaussian.
3.11.40. Show that if a convex function $f$ on a Fréchet space $X$ is measurable with respect to every Radon Gaussian measure, then it is continuous. In the case of an arbitrary locally convex space show that the function $f$ is bounded on all bounded sets.
3.11.41. Let $\gamma$ be a centered Gaussian measure on a separable Hilbert space $X$ with covariance operator $K$ and let $A \in \mathcal{L}(X)$. Prove the following relationships:
$$\int_X (Ax, x)\, \gamma(dx) = \mathrm{trace}\, AK, \qquad \int_X (Ax, x)^2\, \gamma(dx) = [\mathrm{trace}\, AK]^2 + 2\, \mathrm{trace}\, (AK)^2.$$
If, in addition, $A \ge 0$, use Chebyshev's inequality to prove the estimates
$$\gamma\{x : (Ax, x) > 1\} \le \mathrm{trace}\, AK, \qquad \gamma\{x : |(Ax, x) - \mathrm{trace}\, AK| > 1\} \le 2\, \mathrm{trace}\, (AK)^2.$$
Hint: use Problem 1.10.17.
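The two moment identities of Problem 3.11.41 can be sanity-checked in finite dimensions. The sketch below is only an illustration, not a proof: it assumes $X = \mathbb{R}^3$, a symmetric matrix `A`, and a symmetric positive definite matrix `K` chosen arbitrarily, and it evaluates the moments exactly through the Wick (Isserlis) rule for fourth moments of a centered Gaussian vector. The helper names are hypothetical.

```python
from itertools import product

def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(P):
    return sum(P[i][i] for i in range(len(P)))

def quadratic_form_moments(A, K):
    """Return E(Ax,x) and E(Ax,x)^2 for x ~ N(0, K), using the Wick rule
    E[x_i x_j x_k x_l] = K_ij K_kl + K_ik K_jl + K_il K_jk."""
    idx = range(len(A))
    m1 = sum(A[i][j] * K[i][j] for i, j in product(idx, repeat=2))
    m2 = sum(A[i][j] * A[k][l]
             * (K[i][j] * K[k][l] + K[i][k] * K[j][l] + K[i][l] * K[j][k])
             for i, j, k, l in product(idx, repeat=4))
    return m1, m2

# Arbitrary illustrative data (assumptions, not taken from the text).
K = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.25], [0.0, 0.25, 0.5]]   # covariance
A = [[1.0, 0.2, 0.0], [0.2, 0.5, 0.1], [0.0, 0.1, 0.3]]     # symmetric operator

AK = matmul(A, K)
m1, m2 = quadratic_form_moments(A, K)
first_identity_gap = abs(m1 - trace(AK))
second_identity_gap = abs(m2 - (trace(AK) ** 2 + 2 * trace(matmul(AK, AK))))
print(first_identity_gap, second_identity_gap)
```

Both printed gaps should be at the level of floating-point rounding. The symmetry of `A` is used in the second identity.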
3.11.42. In the situation of Problem 3.11.41, assuming that $2\sqrt{K} A \sqrt{K} < I$, prove the equality
$$\int_X e^{(Ax, x)}\, \gamma(dx) = \bigl[\det(I - 2\sqrt{K} A \sqrt{K})\bigr]^{-1/2} = \bigl[\det(I - 2AK)\bigr]^{-1/2}.$$
Hint: use Problem 1.10.18.
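In the one-dimensional case the determinant formula of Problem 3.11.42 reads $\int e^{ax^2}\, \gamma(dx) = (1 - 2ak)^{-1/2}$ for a centered Gaussian measure with variance $k$ and $2ak < 1$. The following sketch compares a Simpson-rule approximation of the integral with this closed form. The values of `a` and `k`, the truncation interval, and the function name are illustrative assumptions.

```python
import math

def gaussian_exp_quadratic_1d(a, k, half_width=40.0, steps=40000):
    """Simpson approximation of the integral of exp(a*x^2) against N(0, k),
    truncated to [-half_width, half_width]; requires 2*a*k < 1."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = 2.0 * half_width / steps

    def integrand(x):
        return math.exp(a * x * x - x * x / (2.0 * k)) / math.sqrt(2.0 * math.pi * k)

    total = integrand(-half_width) + integrand(half_width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-half_width + i * h)
    return total * h / 3.0

a, k = 0.2, 1.5                       # 2*a*k = 0.6 < 1
numeric = gaussian_exp_quadratic_1d(a, k)
closed_form = (1.0 - 2.0 * a * k) ** -0.5
print(numeric, closed_form)
```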
3.11.43. In the situation of Problem 3.11.41, for any y,,... ,y2k E X, show that f(x,y,)...
(x,l/2k)7 (dx) = G12k1(0)(y1.....112k).
where G(x) =exp[-1(Kx,x)]
x
3.11.44. Let $X$ be a metric space. Show that a sequence of Radon probability measures $\mu_n$ converges weakly to a probability measure $\mu$ precisely when (3.8.1) holds true for every bounded uniformly continuous function $f$ on $X$. Moreover, show that it suffices to have (3.8.1) for every bounded Lipschitzian function $f$. Hint: use Theorem 3.8.2 for the first claim and then use the fact that every bounded uniformly continuous function $f$ on $X$ can be uniformly approximated by bounded Lipschitzian functions.
3.11.45. Construct an example of a sequence of Radon Gaussian measures on a locally convex space, which converges weakly to a Gaussian measure, which is not tight (hence is not Radon). Hint: take the measure y constructed in Example 3.6.3 and consider
it on 1" with the topology o(1", (1'°)'); use Theorem A.3.9 in Appendix to show that the finite dimensional projections µ" of -y converge weakly to y.
3.11.46. (A.V. Kolesnikov). Let $\gamma$ be the measure on $\mathbb{R}^\infty$ that is the countable product of the standard Gaussian measures on the real line. Show that to every Gaussian measure $\mu$ on $l^2$, one can associate a Borel mapping $\xi_\mu\colon \mathbb{R}^\infty \to l^2$ such that $\xi_{\nu_n} \to \xi_\nu$ in measure $\gamma$ provided a sequence of Gaussian measures $\nu_n$ converges weakly to $\nu$. Hint: letting $K_\mu$ be the covariance operator of $\mu$ and $a_\mu$ its mean, consider the mapping $\xi_\mu := a_\mu + \widehat{Q}_\mu$, where $Q_\mu = \sqrt{K_\mu}$ is a Hilbert-Schmidt operator on $l^2$ and the measurable linear extension corresponds to the measure $\gamma$.
3.11.47. Let $\mu$ and $\nu$ be two centered Radon Gaussian measures on a locally convex space $X$ such that $H(\mu) \subset H(\nu)$. Prove that $H(\mu * \nu) = H(\nu)$. Hint: use Theorem 3.3.4 to show that ... , $f \in X^*$, for some $c > 0$.
3.11.48. Prove Proposition 3.11.27. Hint: consider the case where $\|A\| < 1$, take an injective nonnegative Hilbert-Schmidt operator $S$ and consider the norm
$$\|x\|_E = \Bigl(\sum_{n=0}^\infty |S A^n x|_H^2\Bigr)^{1/2}$$
and note that $\sum_{n=0}^\infty (A^n)^* S^2 A^n$ is a trace class operator.
CHAPTER 4
Convexity of Gaussian Measures
What has been said should be enough to make clear that in the terrain of analysis a rich vein of gold had been struck, comparatively easy to exploit and not soon to be exhausted.
H. Weyl. David Hilbert and his mathematical work
4.1. Gaussian symmetrization
In this section we briefly describe the operation of symmetrization for Gaussian measures on $\mathbb{R}^n$. Denote by $\gamma_n$ the standard Gaussian measure on $\mathbb{R}^n$; the symbol $\gamma_k$ will be also used to denote its orthogonal projections onto $k$-dimensional linear subspaces in $\mathbb{R}^n$ (not necessarily equal to $\mathbb{R}^k$). Recall that $\Phi$ stands for the standard Gaussian distribution function. Let $1 \le k \le n$, let $L$ be a linear subspace in $\mathbb{R}^n$ of dimension $n - k$ and let $e \perp L$ be a unit vector. These objects generate the mapping $S(L, e)$, which to every closed set $A \subset \mathbb{R}^n$ associates the set $S(L, e)(A)$ defined as follows: for every $x \in L$, the section of the set $S(L, e)(A)$ by the $k$-dimensional affine subspace $x + L^\perp$ is
$$\{y : (y, e) \ge r\} \cap (x + L^\perp),$$
where $r = r(x)$ is such that
$$\gamma_k\bigl(\{y : (y, e) \ge r\} \cap (x + L^\perp)\bigr) = \gamma_k\bigl(A \cap (x + L^\perp)\bigr), \tag{4.1.1}$$
and in the case where the measure on the right-hand side of (4.1.1) is 0, we put $S(L, e)(A) \cap (x + L^\perp) = \emptyset$, and in the case where it is 1, we put $S(L, e)(A) \cap (x + L^\perp) = x + L^\perp$ (this corresponds to $r = +\infty$ and $r = -\infty$ in the formula for the section). Thus,
$$S(L, e)(A) = \bigcup_{x \in L} \bigl(\{y : (y, e) \ge r(x)\} \cap (x + L^\perp)\bigr).$$
The mapping $S(L, e)$ is called a Gaussian $k$-symmetrization with respect to $L$ in the direction of $e$. Gaussian $k$-symmetrizations for open sets are defined analogously with the open half-spaces $\{x : (x, e) > r\}$ instead of the closed ones. Clearly, it is also possible to define $k$-symmetrizations for all sets (using again open half-spaces), but we do not need this.
4.1.1. Example. Let $k = n$. Then $L = \{0\}$ and
$$S(\{0\}, e)(A) = \{x : (x, e) \ge \Phi^{-1}(1 - \gamma_n(A))\}$$
is a half-space for every closed set $A$.
It is straightforward to verify the following properties of Gaussian symmetrizations.
4.1.2. Lemma. For arbitrary closed (or open) sets $A$ and $B$, one has:
(i) If $A \subset B$, then $S(L, e)(A) \subset S(L, e)(B)$;
(ii) $\mathbb{R}^n \setminus S(L, -e)(A) = S(L, e)(\mathbb{R}^n \setminus A)$;
(iii) $S(L, e)(A + v) = S(L, e)(A) + v$ for all $v \in L$. In particular, if $L_0 \subset L$ is a linear space such that $A + L_0 = A$, then $S(L, e)(A) + L_0 = S(L, e)(A)$;
(iv) $S(L, e)(A) = S(L, e)(A) + (L + \mathbb{R}^1 e)^\perp$ and $S(L, e)(A) + \lambda e \subset S(L, e)(A)$ for all $\lambda > 0$;
(v) If $\{A_i\}$ is an increasing sequence of open sets and $A = \bigcup_{i=1}^\infty A_i$, then
$$S(L, e)(A) = \bigcup_{i=1}^\infty S(L, e)(A_i).$$
In particular, $S(L, e)(A)$ is open for open $A$ and closed for closed $A$.
(vi) If $B = B + L^\perp$, then $\gamma_n(A \cap B) = \gamma_n(S(L, e)(A) \cap B)$. In particular,
$$\gamma_n(S(L, e)(A)) = \gamma_n(A). \tag{4.1.2}$$
Note that in order to show in (v) that $S(L, e)(A)$ is open for open $A$, it suffices to verify this for open cubes with the edges parallel to the coordinate axes assuming that $L = \mathbb{R}^{n-k}$ and $e = e_{n-k+1}$. For such a cube $Q$, the set $S(L, e)(Q)$ is an open half-cylinder whose base is the projection of $Q$ to $L$. Now, for closed $A$, the set $S(L, e)(A)$ is closed by (ii). Note that the stability of the classes of open and closed sets under the operation $S(L, e)$ enables one to consider compositions of Gaussian symmetrizations on closed or on open sets.
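The $n$-symmetrization of Example 4.1.1 is easy to illustrate on the real line, where it replaces a closed set by the closed half-line $[r, +\infty)$ of the same Gaussian measure. The sketch below assumes $n = 1$, $e = 1$, and closed intervals as test sets (all illustrative choices); it computes the threshold $r = \Phi^{-1}(1 - \gamma_1(A))$ with the standard-library `statistics.NormalDist` and checks the measure-preservation property (4.1.2) and the monotonicity property (i) of Lemma 4.1.2 on one pair of nested intervals.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian distribution on the real line

def interval_measure(a, b):
    """gamma_1([a, b]) = Phi(b) - Phi(a)."""
    return STD.cdf(b) - STD.cdf(a)

def half_line_threshold(measure):
    """r such that gamma_1([r, +inf)) = measure, i.e. r = Phi^{-1}(1 - measure)."""
    return STD.inv_cdf(1.0 - measure)

# A = [-0.5, 1] is contained in B = [-1, 2].
mass_A = interval_measure(-0.5, 1.0)
r_A = half_line_threshold(mass_A)
mass_of_ray = 1.0 - STD.cdf(r_A)      # measure of S(A) = [r_A, +inf)

mass_B = interval_measure(-1.0, 2.0)
r_B = half_line_threshold(mass_B)     # S(B) = [r_B, +inf)

print(mass_A, mass_of_ray, r_A, r_B)
```

Since $A \subset B$, the threshold `r_B` lies to the left of `r_A`, so that $S(A) \subset S(B)$.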
4.1.3. Lemma. Let $L_1$ and $L_2$ be two linear subspaces in $\mathbb{R}^n$ such that the orthogonal complements of $L_1 \cap L_2$ in $L_1$ and in $L_2$ are mutually orthogonal. Then
$$S(L_1, e) \circ S(L_2, e) = S(L_2, e) \circ S(L_1, e) = S(L_1 \cap L_2, e).$$
PROOF. Put $L_1 = L + M_1$, $L_2 = L + M_2$, where $L = L_1 \cap L_2$, and $L$, $M_1$, and $M_2$ are mutually orthogonal. Denote by $E$ the orthogonal complement to $L + M_1 + M_2 + \mathbb{R}^1 e$. Let $S_1 = S(L_1, e)$, $S_2 = S(L_2, e)$. By Lemma 4.1.2(iv), for any closed $A$, one has
$$S_1(A) = S_1(A) + (L_1 + \mathbb{R}^1 e)^\perp = S_1(A) + M_2 + E,$$
$$S_2(A) = S_2(A) + (L_2 + \mathbb{R}^1 e)^\perp = S_2(A) + M_1 + E.$$
Since $M_1 \subset L_1$, the second equality implies (due to properties (iii) and (iv) in Lemma 4.1.2) that $S_1 \circ S_2(A) = S_1 \circ S_2(A) + M_1$. Therefore, applying the first equality to $S_2(A)$ replacing $A$, we arrive at the representation
$$S_1 \circ S_2(A) = S_1 \circ S_2(A) + M_2 + E = S_1 \circ S_2(A) + M_1 + M_2 + E = S_1 \circ S_2(A) + (L + \mathbb{R}^1 e)^\perp.$$
Thus, the set $S_1 \circ S_2(A)$ is invariant with respect to $(L + \mathbb{R}^1 e)^\perp$. By definition, the same is true for $S(L, e)(A)$. Both sets are sent into themselves by the translations $x \mapsto x + \lambda e$, $\lambda > 0$. Therefore, the intersection of each of them with the affine subspace $F_x = x + L^\perp$ is a half-space in $F_x$ with the inner normal $e$. Note that, letting $k = \dim L^\perp$, we have that the $k$-dimensional standard Gaussian measure $\gamma_k$ of the set $S_1 \circ S_2(A) \cap F_x$ equals
$$\gamma_k(S_2(A) \cap F_x) = \gamma_k(A \cap F_x),$$
since $F_x$ is invariant under $(M_1 + L)^\perp$ and $(M_2 + L)^\perp$, being invariant under $L^\perp$. The right-hand side of the last equality coincides with $\gamma_k(S(L, e)(A) \cap F_x)$. Therefore, we get the equality of the two half-subspaces:
$$S_1 \circ S_2(A) \cap F_x = S(L, e)(A) \cap F_x.$$
Taking the union in $x$, we obtain $S_1 \circ S_2(A) = S(L, e)(A)$. Hence $S_2 \circ S_1(A)$ is the same set. □
The next lemma enables one to reduce $k$-symmetrizations of high orders to the two dimensional ones.
4.1.4. Lemma. Let $n \ge 3$ and $k \ge 2$. For every $k$-symmetrization $S = S(L, e)$, there exist 2-symmetrizations $S_1, \ldots, S_{k-1}$ such that $S = S_1 \circ \cdots \circ S_{k-1}$.
PROOF. We may assume that $k > 2$. Let $h \in (L + \mathbb{R}^1 e)^\perp$ be a unit vector. Put $L_1 = (\mathbb{R}^1 h + \mathbb{R}^1 e)^\perp$, $L_2 = L + \mathbb{R}^1 h$. Then the conditions of Lemma 4.1.3 are fulfilled (note that $L_1 \cap L_2 = L$) and we get $S = S(L_1, e) \circ S(L_2, e)$. It remains to be noted that $S(L_1, e)$ is a 2-symmetrization and that $S(L_2, e)$ is a $(k-1)$-symmetrization. Hence, repeating this $k - 2$ times, we get the composition of $k - 1$ 2-symmetrizations. □
In turn, the 2-symmetrizations are obtained as the limits (in a sense to be made precise) of compositions of 1-symmetrizations. Let $\{e_j\}$, $j = 0, 1, \ldots$, be a sequence of unit vectors in $\mathbb{R}^2$ defined inductively as follows: $e_0 = (0, 1)$, $e_1 = (1, 0)$, and the angle $\theta_{j+1}$ between $e_{j+1}$ and $-e_0$ is $\theta_j / 2$. Then $e_j \to -e_0$, $(e_j, e_{j-1} + e_0) = 0$, and $(e_j, e_0) \le 0$.
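The stated properties of the directions $e_j$ can be checked numerically. The sketch below rests on one reading of the construction, which is an assumption: for $j \ge 1$ the angle between $e_j$ and $-e_0$ is $\theta_j = \pi / 2^j$ (so $\theta_1 = \pi/2$ and each angle is half the previous one), and $e_j$ lies in the half-plane of vectors with nonnegative first coordinate. The function names are hypothetical.

```python
import math

def direction(j):
    """Unit vector e_j; for j >= 1 its angle to -e_0 = (0, -1) is pi / 2**j."""
    if j == 0:
        return (0.0, 1.0)
    theta = math.pi / 2 ** j
    return (math.sin(theta), -math.cos(theta))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

e = [direction(j) for j in range(12)]
e0 = e[0]

# (e_j, e_{j-1} + e_0) should vanish for every j >= 1.
orthogonality = [dot(e[j], (e[j - 1][0] + e0[0], e[j - 1][1] + e0[1]))
                 for j in range(1, 12)]
# (e_j, e_0) should be nonpositive for j >= 1.
inner_with_e0 = [dot(e[j], e0) for j in range(1, 12)]
# |e_j - (-e_0)| should decrease to zero.
distance_to_minus_e0 = [math.hypot(e[j][0], e[j][1] + 1.0) for j in range(1, 12)]

print(max(abs(v) for v in orthogonality), distance_to_minus_e0[-1])
```

The orthogonality relation follows from $\cos(\theta_{j-1} - \theta_j) = \cos\theta_j$ when $\theta_{j-1} = 2\theta_j$.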
4.1.5. Lemma. Let $S_j = S(e_j^\perp, e_j)$ and $T_j = S_j \circ \cdots \circ S_0$. Suppose $A$ is a closed set in $\mathbb{R}^2$. If $x$ belongs to $T_j(A)$, then the whole cone
$$C(x, e_0, e_j) = \{x + t e_0 + s e_j,\ t, s \ge 0\}$$
is contained in $T_j(A)$.
PROOF. We shall use induction in $j$. For $j = 0$ the claim follows from Lemma 4.1.2. Suppose it is proved for some $j \ge 0$. Let us consider the closed interval
Ij ={Aej-Aeo, -1 0. Applying this fact to the set Af n E3, we get
S,+1(MnE3) J Si+1(Ti(A)nE.+(,3-a)(eo+ej)) +r13
=U{C(x,eo.ei): xES,+1(T,(A)nEa)}nE3. Taking the union in ;3 > a, we arrive at the inclusion
S,+1(M) D U{C(x,eo,e,): x E S,+1(T,(A) n EQ)}. Therefore, if x E S,+1(T;(A) n E0) = T,+1(A) n E0, one has
C(x, eo, e,) C S,+1(M) C SJ+1 o T; (A) = Ti(A),
0
whence the claim.
4.1.6. Corollary. Put $S = S(0, e_1)$. Let $A$ be a compact set in $\mathbb{R}^2$ such that $0 < \gamma_2(A) < 1/2$. Then, for every $R > 0$, one has
$$\lim_{j \to \infty} T_j(A) \cap \{x : |x| < R\} = S(A) \cap \{x : |x| < R\},$$
where the convergence of sets $B_j \to B$ is understood in the sense of the Hausdorff metric, i.e., given $\varepsilon > 0$, for all $j$ sufficiently large, the sets $B_j$ and $B$ are contained in the $\varepsilon$-neighborhoods of each other.
PROOF. The claim is deduced from the previous lemma making use of the fact that the angle between $e_0$ and $e_j$ tends to $\pi$ so that the corresponding cones approach half-planes with the boundary parallel to $\mathbb{R}^1 e_0$, whereas the set $S(A)$ is a half-plane with the boundary line orthogonal to $e_1$. See [222], [491] for details. □
4.1.7. Theorem. Let $U$ be the closed unit ball in $\mathbb{R}^n$ and let $S(L, e)$ be any $k$-symmetrization in $\mathbb{R}^n$. Then:
(i) For every closed set $A \subset \mathbb{R}^n$, one has
$$S(L, e)(A) + rU \subset S(L, e)(A + rU), \quad \forall\, r > 0; \tag{4.1.3}$$
(ii) For any convex open or closed set $A \subset \mathbb{R}^n$, the set $S(L, e)(A)$ is convex.
PROOF. (i) Throughout this proof we put $A^r = A + rU$ for every set $A$.
Step 1. First we prove that relationship (4.1.3) holds true in the case $n = 1$. Let us consider the special case where $n = 1$, $L = \{0\}$, $e = 1$, and $A$ is a closed interval. Put $a = \gamma_1(A)$. Note that even this case is not completely obvious. The set $S(A)^r$ is the ray $[\Phi^{-1}(1 - a) - r, \infty)$, whereas the set $S(A^r)$ is the ray $[\Phi^{-1}(1 - \beta), \infty)$, where $\beta = \gamma_1(A^r)$. We shall show that $\Phi^{-1}(1 - \beta) \le \Phi^{-1}(1 - a) - r$. To this end, holding $a$ and $r$ fixed, we shall maximize $\Phi^{-1}(1 - \beta)$. Clearly, this is equivalent to minimization of $\beta$. Any interval of $\gamma_1$-measure $a$ can be written as $A_x = [x, y(x)]$, where $x \in [-\infty, \Phi^{-1}(1 - a)]$ and $y(x) = \Phi^{-1}(a + \Phi(x))$. Then the $\gamma_1$-measure $\beta(x)$ of $[x - r, y(x) + r]$ equals $\Phi(y(x) + r) - \Phi(x - r)$. Differentiating in $x$, we get $\beta'(x) = p(x) p(y(x) + r) / p(y(x)) - p(x - r)$, where $p$ is the standard Gaussian density. An easy evaluation yields $\beta'(x) = \sqrt{2\pi}\, p(x) p(r) (e^{-ry} - e^{rx})$. Clearly, the function $\beta$ attains its maximum at $x_0$ determined from the equation $x_0 = -y(x_0)$ (this corresponds to the symmetric interval). The minimum of $\beta$ is attained at the points $-\infty$ and $\Phi^{-1}(1 - a)$, which correspond to the rays $A_x = [-\infty, \Phi^{-1}(a)]$ and $[\Phi^{-1}(1 - a), \infty]$. In this case, by virtue of the identity $\Phi^{-1}(1 - a) = -\Phi^{-1}(a)$, (4.1.3) turns into an equality.
Let us continue by induction. Suppose that (4.1.3) is true for any $n$ intervals
(possibly, unbounded) and both possible choices of unit vectors $e$ and $-e$. Let $A = \bigcup_{i=1}^{n+1} A_i$, where the $A_i$'s are $n + 1$ disjoint intervals numbered in the natural order. We may assume that the sets $A_i^r$ have disjoint interiors. Otherwise we can make these intervals shorter without changing $\bigl(\bigcup_{i=1}^{n+1} A_i\bigr)^r$. Put $J = \bigcup_{i=2}^{n} A_i$. We shall denote $S(0, 1)$ by $S$ and $S(0, -1)$ by $S_-$. Let us replace $A_1$ and $A_{n+1}$ by the rays $S_-(A_1)$ and $S(A_{n+1})$, which gives
$$S_-(A) = S_-\Bigl(\bigcup_{i=1}^{n+1} A_i\Bigr) = S_-\bigl(S_-(A_1) \cup J \cup S(A_{n+1})\bigr),$$
$$S_-(A^r) = S_-\bigl(S_-(A_1^r) \cup J^r \cup S(A_{n+1}^r)\bigr).$$
Using that $S_-(A_1)^r \subset S_-(A_1^r)$ and $S(A_{n+1})^r \subset S(A_{n+1}^r)$, we get
$$S_-\bigl(S_-(A_1)^r \cup J^r \cup S(A_{n+1})^r\bigr) \subset S_-(A^r).$$
Let us consider the set $I := S_-(A_1)^r \cup J^r \cup S(A_{n+1})^r$. Its complement $D$ is a union of at most $n$ intervals, hence the inductive assumption yields $S(D)^r \subset S(D^r)$, whence
$$S_-\bigl(S_-(A_1) \cup J \cup S(A_{n+1})\bigr) = S_-(\mathbb{R}^1 \setminus D^r) = \mathbb{R}^1 \setminus S(D^r) \subset \mathbb{R}^1 \setminus S(D)^r = \mathbb{R}^1 \setminus (\mathbb{R}^1 \setminus S_-(I))^r.$$
Hence,
$$S_-\bigl(S_-(A_1) \cup J \cup S(A_{n+1})\bigr)^r \subset S_-(I).$$
Therefore,
$$S_-(A)^r = S_-\bigl(S_-(A_1) \cup J \cup S(A_{n+1})\bigr)^r \subset S_-(I) \subset S_-(A^r).$$
The inclusion for $S$ is verified in a similar manner. Obviously, the claim is also true for open intervals. Then it remains true for all open sets in $\mathbb{R}^1$, which, in turn, implies its validity for all closed sets, since every closed set $A$ is the intersection of its open $\varepsilon$-neighborhoods.
Step 2. Suppose that every $k$-symmetrization in $\mathbb{R}^k$ satisfies (4.1.3). Then the same is true for every $k$-symmetrization in $\mathbb{R}^n$, whenever $n > k$. Indeed, for every
$x \in L$, we put $E_x := x + L^\perp$. It suffices to show that
$$(S(A) + rU) \cap E_x \subset S(A + rU) \cap E_x, \quad \forall\, x \in L. \tag{4.1.4}$$
The right-hand side equals $S((A + rU) \cap E_x)$. The left-hand side can be represented as
$$(S(A) + rU) \cap E_x = \bigcup_{y \in L} (S(A \cap E_y) + rU) \cap E_x.$$
Every nonempty member in this union is the shift to $x - y$ of the $k$-dimensional $\delta$-neighborhood of the set $S(A \cap E_y)$ with $\delta = \sqrt{r^2 - |x - y|^2}$. By assumption, this $\delta$-neighborhood in the $k$-dimensional affine subspace $E_y$ is contained in the set $S((A \cap E_y) + \delta U_k)$, where $U_k = U \cap L^\perp$ is the $k$-dimensional unit ball in $L^\perp$. In turn, the set $S((A \cap E_y) + \delta U_k) + x - y$ is contained in $S([(A \cap E_y) + rU] \cap E_x)$, since
$$(A \cap E_y) + \delta U_k + x - y \subset [(A \cap E_y) + rU] \cap E_x$$
and
$$S\bigl((A \cap E_y) + \delta U_k + x - y\bigr) = S\bigl((A \cap E_y) + \delta U_k\bigr) + x - y$$
by the inclusion $x - y \in L$ and Lemma 4.1.2(iii). Therefore,
$$(S(A \cap E_y) + rU) \cap E_x \subset S([(A \cap E_y) + rU] \cap E_x).$$
Taking the union in $y \in L$, we arrive at (4.1.4).
Step 3. Using Corollary 4.1.6 and Step 1, one verifies that every 2-symmetrization on $\mathbb{R}^2$ satisfies (4.1.3). Applying Step 2 and Lemma 4.1.4, we get the general result (see [222] and [491] for more details).
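The one-dimensional estimate of Step 1 can be explored numerically. The sketch below is an illustration under stated assumptions: $n = 1$, intervals $A_x = [x, y(x)]$ of fixed measure $a$, and sample values of $a$, $r$ and $x$ chosen arbitrarily. For each interval it evaluates the difference $(\Phi^{-1}(1 - a) - r) - \Phi^{-1}(1 - \beta(x))$, which Step 1 asserts to be nonnegative. The function name is hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian distribution on the real line

def enlargement_gap(x, a, r):
    """For the interval A_x = [x, y(x)] of gamma_1-measure a, return
    (Phi^{-1}(1 - a) - r) - Phi^{-1}(1 - beta(x)),
    where beta(x) is the measure of the r-neighborhood [x - r, y(x) + r]."""
    y = STD.inv_cdf(a + STD.cdf(x))            # right end point y(x)
    beta = STD.cdf(y + r) - STD.cdf(x - r)     # measure of the enlarged interval
    return (STD.inv_cdf(1.0 - a) - r) - STD.inv_cdf(1.0 - beta)

a, r = 0.3, 0.5
# Left end points must satisfy a + Phi(x) < 1, i.e. x < Phi^{-1}(1 - a) ~ 0.524.
left_end_points = [-3.0, -2.0, -1.0, -0.5, 0.0, 0.3, 0.5]
gaps = [enlargement_gap(x, a, r) for x in left_end_points]
print(gaps)
```

The gaps are smallest for intervals close to rays (far-left or far-right choices of `x`), in line with the statement that the minimum of $\beta$ corresponds to the rays.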
Let us prove statement (ii). We shall consider first a 1-symmetrization $S = S(L, e)$, where $e$ is a unit vector in $\mathbb{R}^n$ and $L = e^\perp$. For any $x \in L$, put $E_x = x + \mathbb{R}^1 e$. The set $A \cap E_x$ is either a ray or an interval, and the set $S(A) \cap E_x$ is a ray. We have to verify that, for any $x, y \in L$ and $\lambda \in [0, 1]$, one has
$$\lambda (S(A) \cap E_x) + (1 - \lambda)(S(A) \cap E_y) \subset S(A) \cap E_{\lambda x + (1 - \lambda) y}.$$
This inclusion is equivalent to convexity of $S(A)$. We have $A \cap E_x = x + [a_x, b_x] e$, $A \cap E_y = y + [a_y, b_y] e$, $S(A) \cap E_x = x + [c_x, \infty) e$, $S(A) \cap E_y = y + [c_y, \infty) e$. By the definition of symmetrization, one has $\Phi(b_x) - \Phi(a_x) = 1 - \Phi(c_x)$ and the same for $y$. Hence the inclusion to be proved reduces to the estimate
$$-\lambda c_x - (1 - \lambda) c_y = \lambda \Phi^{-1}\bigl(\Phi(b_x) - \Phi(a_x)\bigr) + (1 - \lambda) \Phi^{-1}\bigl(\Phi(b_y) - \Phi(a_y)\bigr) \le \Phi^{-1}\bigl(\Phi(\lambda b_x + (1 - \lambda) b_y) - \Phi(\lambda a_x + (1 - \lambda) a_y)\bigr).$$
This estimate follows from Problem 4.10.14. The convexity is preserved by the compositions of 1-symmetrizations, hence the claim is true for all 2-symmetrizations (which follows by a construction analogous to that of Lemma 4.1.5). Applying Lemma 4.1.4, we get the general case. □
4.1.8. Corollary. For every closed set $A$ in $\mathbb{R}^n$, we have
$$\gamma_n\bigl(S(L, e)(A) + rU\bigr) \le \gamma_n\bigl(S(L, e)(A + rU)\bigr), \quad \forall\, r > 0. \tag{4.1.5}$$
In particular, applying this to the $n$-symmetrization, we get that among all sets of equal $\gamma_n$-measure, the half-spaces have the minimal measure of the $r$-neighborhood.
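The last statement of Corollary 4.1.8 can be written as an explicit lower bound. The following worked derivation is a sketch for a closed set $A$ with $0 < \gamma_n(A) < 1$ and the $n$-symmetrization $S = S(\{0\}, e)$ of Example 4.1.1; it uses only (4.1.2), (4.1.5) and the identity $\Phi^{-1}(1 - a) = -\Phi^{-1}(a)$.

```latex
\begin{align*}
S(A) &= \{x : (x, e) \ge \Phi^{-1}(1 - \gamma_n(A))\}
      = \{x : (x, e) \ge -\Phi^{-1}(\gamma_n(A))\}, \\
S(A) + rU &= \{x : (x, e) \ge -\Phi^{-1}(\gamma_n(A)) - r\}, \\
\gamma_n(A + rU) &= \gamma_n\bigl(S(A + rU)\bigr)
      \ge \gamma_n\bigl(S(A) + rU\bigr)
      = \Phi\bigl(\Phi^{-1}(\gamma_n(A)) + r\bigr).
\end{align*}
```

The first equality in the last line is (4.1.2) applied to the closed set $A + rU$, and the inequality is (4.1.5).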
4.2. Ehrhard's inequality
Convexity of Gaussian measures plays an important role in diverse applications. Quantitative theorems of this sort state that for some function $\psi$ the inequality
$$\gamma(\lambda A + (1 - \lambda) B) \ge \psi(\lambda, \gamma(A), \gamma(B)), \quad \forall\, \lambda \in [0, 1],$$
holds true for all $A$ and $B$ from a certain class of sets. Theorem 1.8.4 is an example of this kind. The following fundamental result, Ehrhard's inequality, is derived by using the properties of Gaussian symmetrizations. We shall employ our usual convention $\Phi^{-1}(0) = -\infty$ and $\Phi^{-1}(1) = +\infty$.
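4.2.1. Theorem. Let $A$ and $B$ be two convex sets in $\mathbb{R}^n$. Then one has for all $\lambda \in [0, 1]$:
$$\Phi^{-1}\{\gamma_n(\lambda A + (1 - \lambda) B)\} \ge \lambda \Phi^{-1}\{\gamma_n(A)\} + (1 - \lambda) \Phi^{-1}\{\gamma_n(B)\}. \tag{4.2.1}$$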
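PROOF. As it was noted above, the interior of a convex set has the same measure as the set itself. Therefore, it suffices to prove our claim for open convex sets $A$ and $B$. We shall consider $\mathbb{R}^n$ as a hyperplane in $\mathbb{R}^{n+1}$. Then $A$ and $B$ are convex subsets of $\mathbb{R}^{n+1}$, whence the convexity of the set $A + e_{n+1}$, where $e_{n+1} = (0, 0, \ldots, 0, 1)$. Denote by $C$ the convex hull of the sets $A + e_{n+1}$ and $B$ in the space $\mathbb{R}^{n+1}$. Then, for every $\lambda \in [0, 1]$, we get
$$C_\lambda := \mathbb{R}^n \cap (C - \lambda e_{n+1}) = \lambda A + (1 - \lambda) B.$$
Let us consider the function
$$f(\lambda) = \Phi^{-1}\{\gamma_n(C_\lambda)\} = \Phi^{-1}\{\gamma_n(\lambda A + (1 - \lambda) B)\}$$
on the closed interval $[0, 1]$. Let us take a unit vector $e \perp e_{n+1}$ and consider the $n$-symmetrization $S$ in $\mathbb{R}^{n+1}$ with respect to the one dimensional subspace $\mathbb{R}^1 e_{n+1}$ in the direction of $e$. The set $S(C)$ is convex. By the definition of symmetrization, $-\Phi^{-1}\{\gamma_n(C_\lambda)\}$ is the number $r = r_\lambda$, for which
$$(\lambda e_{n+1} + \mathbb{R}^n) \cap S(C) = \{x \in \mathbb{R}^{n+1} : (x, e) > r\} \cap (\lambda e_{n+1} + \mathbb{R}^n),$$
since $\gamma_n(C_\lambda) = \Phi(-r_\lambda)$. By the convexity of $S(C)$, one has
$$(1 - \lambda)\bigl(\mathbb{R}^n \cap S(C)\bigr) + \lambda\bigl((e_{n+1} + \mathbb{R}^n) \cap S(C)\bigr) \subset (\lambda e_{n+1} + \mathbb{R}^n) \cap S(C),$$
whence $(1 - \lambda)(r_0, \infty) + \lambda (r_1, \infty) \subset (r_\lambda, \infty)$, i.e., $r_\lambda \le \lambda r_1 + (1 - \lambda) r_0$. Therefore, $f(\lambda) \ge \lambda f(1) + (1 - \lambda) f(0)$, which is our claim. □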
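Ehrhard's inequality (4.2.1) can be illustrated on the real line, where convex sets are intervals and $\lambda A + (1 - \lambda) B$ is again an interval whose end points are the corresponding convex combinations. The sketch below assumes $n = 1$ and two arbitrarily chosen bounded intervals; it is a numerical sanity check, not a proof, and the function names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian distribution on the real line

def measure(interval):
    """gamma_1 of a closed interval given as a pair (left, right)."""
    left, right = interval
    return STD.cdf(right) - STD.cdf(left)

def combine(A, B, lam):
    """The Minkowski combination lam*A + (1 - lam)*B of two intervals."""
    return (lam * A[0] + (1 - lam) * B[0], lam * A[1] + (1 - lam) * B[1])

def ehrhard_slack(A, B, lam):
    """Left-hand side minus right-hand side of (4.2.1) for n = 1."""
    lhs = STD.inv_cdf(measure(combine(A, B, lam)))
    rhs = lam * STD.inv_cdf(measure(A)) + (1 - lam) * STD.inv_cdf(measure(B))
    return lhs - rhs

A = (-1.0, 0.5)
B = (0.0, 3.0)
slacks = [ehrhard_slack(A, B, k / 10) for k in range(1, 10)]
print(slacks)
```

Every slack is expected to be nonnegative. At $\lambda = 0$ and $\lambda = 1$ the two sides coincide, so those end points are omitted.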
It is shown in [465] that for the validity of Ehrhard's inequality it is enough to require the convexity of only one of the two sets $A$ and $B$, but it remains open whether this inequality holds true for arbitrary pairs of measurable sets. As the reader may have noticed, the theorem above (as well as Theorem 1.8.4) remains valid for an arbitrary Gaussian measure if we replace $U$ by the corresponding ellipsoid of concentration. However, we pass directly to the infinite dimensional generalizations.
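4.2.2. Theorem. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$. Then, for arbitrary Borel sets $A$ and $B$ and all $\lambda \in [0, 1]$, one has
$$\gamma_*(\lambda A + (1 - \lambda) B) \ge \gamma(A)^\lambda \gamma(B)^{1 - \lambda}. \tag{4.2.2}$$
If, in addition, $A$ and $B$ are convex, then
$$\Phi^{-1}\{\gamma_*(\lambda A + (1 - \lambda) B)\} \ge \lambda \Phi^{-1}\{\gamma(A)\} + (1 - \lambda) \Phi^{-1}\{\gamma(B)\}. \tag{4.2.3}$$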
PROOF. First we show how to get (4.2.2) from the corresponding finite dimensional estimate. Note that both inequalities in question are valid for every Gaussian measure on $\mathbb{R}^n$, since any such measure is the image of the standard Gaussian measure under an affine mapping, and the affine mappings transform linear combinations of sets into the linear combinations of their images. Since the $\gamma$-measure of every measurable set equals the supremum of the measures of enclosed metrizable compact sets, it suffices to verify (4.2.2) in the case where $A$ and $B$ are metrizable and compact. In this case $C = \lambda A + (1 - \lambda) B$ is also compact metrizable. We may assume that at least one of the two sets $A$ or $B$ has positive measure. The linear span $E$ of the sets $A$ and $B$ contains all the sets we are interested in and is a Souslin space of full measure. As it was pointed out above, $E$ is the union of a sequence of metrizable compact sets, and there is a sequence $\{f_n\} \subset E^*$ separating the points of $E$. From now on we consider the measure $\gamma$ on $E$. Using the orthogonalization, we may assume that the sequence $\{f_n\}$ is an orthonormal basis in $X^*_\gamma$. If this sequence is finite, the whole thing reduces to the finite dimensional case discussed above. If it is infinite, then the injective continuous linear mapping $T\colon x \mapsto (f_n(x))_{n=1}^\infty$, $E \to \mathbb{R}^\infty$, transforms the measure $\gamma$ into the countable product of the standard Gaussian measures on $\mathbb{R}^1$, which reduces the claim to this product. Therefore, we assume further that $E = \mathbb{R}^\infty$ and that $\gamma$ is the product of the standard one dimensional Gaussian distributions.
Let us put $P_n x = (x_1, x_2, \ldots, x_n, 0, 0, \ldots)$. Note that, for every compact set $K \subset \mathbb{R}^\infty$, there holds the equality
$$K = \bigcap_{n=1}^\infty P_n^{-1}(P_n(K)). \tag{4.2.4}$$
Indeed, $K$ is contained in the right-hand side of (4.2.4). Let $x \notin K$. Then there is $n$ such that $P_n x \notin P_n(K)$. Otherwise, for each $n$, there is a point $k^n \in K$ with $P_n k^n = P_n x$. Since $K$ is a metrizable compact set, the sequence $\{k^n\}$ contains a subsequence convergent to some element $k \in K$. Then $k_i = x_i$ for all $i \in \mathbb{N}$ (recall that we deal with the coordinate convergence in $\mathbb{R}^\infty$), whence $k = x$, which is a contradiction. Hence equality (4.2.4) is established.
Let $\varepsilon > 0$. Applying (4.2.4) to the sets $A$, $B$, and $C$, and taking into account that the sets $P_n^{-1}(P_n(K))$ decrease as $n$ increases, we get that for some $n$ the measures of the cylindrical sets $A_n = P_n^{-1}(P_n(A))$, $B_n = P_n^{-1}(P_n(B))$, and $C_n = P_n^{-1}(P_n(C))$ differ from the measures of $A$, $B$, and $C$, respectively, by not more than $\varepsilon$. In addition, $C_n = \lambda A_n + (1-\lambda)B_n$. Applying the finite dimensional inequality to the measure $\gamma_n = \gamma\circ P_n^{-1}$, we get
$\gamma(C) \ge \gamma(C_n) - \varepsilon \ge \gamma(A_n)^{\lambda}\gamma(B_n)^{1-\lambda} - \varepsilon \ge (\gamma(A) - \varepsilon)^{\lambda}(\gamma(B) - \varepsilon)^{1-\lambda} - \varepsilon$,
which yields (4.2.2), since $\varepsilon$ is arbitrary.
Now let us turn to (4.2.3). The previous reasoning does not work here, since the corresponding finite dimensional inequality was proved for convex sets, whereas (4.2.4) is not true, in general, for noncompact sets, and the convex hull of a compact set in $\mathbb{R}^\infty$ need not be compact. In order to bypass this difficulty, we shall reduce everything to the case of $\mathbb{R}^\infty$ as we did above. For every $\varepsilon > 0$, there exist metrizable compact sets $A_1 \subset A$ and $B_1 \subset B$ such that $\gamma(A) < \gamma(A_1) + \varepsilon$ and $\gamma(B) < \gamma(B_1) + \varepsilon$. In view of this, it suffices to prove inequality (4.2.3) for the convex hulls of $A_1$ and $B_1$ instead of $A$ and $B$ (note that $\lambda\,\mathrm{conv}\,A_1 + (1-\lambda)\,\mathrm{conv}\,B_1 \subset \lambda A + (1-\lambda)B$). Therefore, we may assume that $A = \mathrm{conv}\,A_1$ and $B = \mathrm{conv}\,B_1$. Put $C = \lambda A + (1-\lambda)B$. The set $C_1 = \lambda A_1 + (1-\lambda)B_1$ is compact metrizable as well as the union $Z = A_1 \cup B_1 \cup C_1$. If $\gamma(Z) = 0$, then (4.2.3) is true. Hence we may assume that $\gamma(Z) > 0$. Then the linear span $E$ of the set $Z$ contains all the sets we are interested in and is a Souslin subspace of full measure. Now we consider the measure $\gamma$ on $E$. We find again a sequence $\{f_n\} \subset E^*$ separating the points of $E$. As above, this sequence gives an injective linear embedding of $E$ into $\mathbb{R}^\infty$,
which reduces everything to the case where $E = \mathbb{R}^\infty$ and $\gamma$ is the product of the standard one dimensional Gaussian distributions. Put $P_n x = (x_1, x_2, \ldots, x_n, 0, 0, \ldots)$ and $Q_n x = x - P_n x$. For every Borel set $D$, let us define the functions
$I_n(D)(x) = \int I_D(P_n x + Q_n y)\,\gamma(dy) = \mu_n(D - P_n x)$,
where $\mu_n = \gamma\circ Q_n^{-1}$. Inequality (4.2.2) (which has already been proved) implies the inclusion
$\lambda\{I_n(A) > 1/2\} + (1-\lambda)\{I_n(B) > 1/2\} \subset \{I_n(\lambda A + (1-\lambda)B) > 1/2\}$.
Indeed, if $\mu_n(A - P_n x) > 1/2$ and $\mu_n(B - P_n y) > 1/2$, then estimate (4.2.2), applied to the measure $\mu_n$, gives
$\log\mu_n\bigl(\lambda A + (1-\lambda)B - P_n(\lambda x + (1-\lambda)y)\bigr) = \log\mu_n\bigl(\lambda(A - P_n x) + (1-\lambda)(B - P_n y)\bigr) \ge \lambda\log\mu_n(A - P_n x) + (1-\lambda)\log\mu_n(B - P_n y) > -\log 2$,
whence $\mu_n\bigl(\lambda A + (1-\lambda)B - P_n(\lambda x + (1-\lambda)y)\bigr) > 1/2$. Therefore,
$\gamma\bigl(\lambda\{I_n(A) > 1/2\} + (1-\lambda)\{I_n(B) > 1/2\}\bigr) \le \gamma\bigl(\{I_n(\lambda A + (1-\lambda)B) > 1/2\}\bigr)$.
Note that the set $\{I_n(D) > 1/2\}$ is convex, provided so is $D$. Indeed, if $\mu_n(D - P_n x) > 1/2$ and $\mu_n(D - P_n z) > 1/2$, then, by convexity of $D$, we get for all $\lambda \in [0, 1]$ that
$\lambda(D - P_n x) + (1-\lambda)(D - P_n z) = D - \lambda P_n x - (1-\lambda)P_n z$,
whence in the same manner as above with the aid of (4.2.2) we arrive at the estimate
$\mu_n(D - \lambda P_n x - (1-\lambda)P_n z) > 1/2$.
It remains to note that the convex sets $\{I_n(A) > 1/2\}$ and $\{I_n(B) > 1/2\}$ are cylindrical, and, in addition, by Theorem 3.5.2, for any measurable set $D$, the sequence $\{I_n(D)\}$ converges to $I_D$ in $L^2(\gamma)$, hence in measure $\gamma$. Applying this theorem to $A$ and $B$ and using the finite dimensional Ehrhard inequality, we get estimate (4.2.3). □
This theorem implies, in particular, the infinite dimensional Anderson inequality already familiar to us. In Problem 4.10.15 it is suggested to derive it from Ehrhard's inequality.
4.2.3. Corollary. In the situation of the previous theorem, for every symmetric convex set $A$ and every vector $a$ such that $A$ and $A + a$ are $\gamma$-measurable, one has $\gamma(A + a) \le \gamma(A)$. More generally, if $A + ta$ is $\gamma$-measurable for any $t \in [0, 1]$, then
$\gamma(A + a) \le \gamma(A + ta)$, $\forall\, t \in [0, 1]$. (4.2.5)
Further, the function $t \mapsto \int f(x + ta)\,\gamma(dx)$ is nondecreasing on $(0, +\infty)$, provided $f$ is such that the sets $\{f \le c\}$, $c \in \mathbb{R}^1$, are symmetric and convex, and $f(\cdot + ta)$ is $\gamma$-integrable for any $t \ge 0$.
4.2.4. Corollary. Let $A$ be a $\gamma$-measurable convex set of positive measure. Then the topological support of the restriction of $\gamma$ to $A$ is convex.
PROOF. First of all note that the Radon measure $\gamma|_A$ has topological support $S$. Suppose that $x, y \in S$ and $\lambda \in (0, 1)$. We have to show that for every convex neighborhood of zero $V$, the set $B = (\lambda x + (1-\lambda)y + V) \cap A$ has positive measure. The sets $C = (x + V) \cap A$ and $D = (y + V) \cap A$ are convex and have positive $\gamma$-measures. Since $\lambda C + (1-\lambda)D \subset B$, the claim follows by Theorem 4.2.2. □
4.2.5. Remark. Note that if $A$ and $B$ belong to $\mathcal{E}(X)$, then their linear combinations are automatically measurable (see Chapter 2), hence there is no need to use the inner measure. In particular, this is true if $X$ is a separable Fréchet space. The same remains in force for those pairs of Borel sets, at least one of which is contained in a Souslin linear subspace of full $\gamma$-measure.
In the proof of Theorem 4.2.2, the convexity was employed for approximating convex sets by convex cylinders; such approximations have been manufactured for $\mathcal{E}(X)_\gamma$-measurable sets in Chapter 2. This method (unlike the construction from Chapter 2) is simple and constructive; for this reason, we formulate the corresponding statement as a separate result.
4.2.6. Proposition. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$ and let $A \in \mathcal{E}(X)_\gamma$ be a convex set. Let us take any orthonormal basis $\{e_i\}$ in $H(\gamma)$ and put
$f_n(x) := \int_X I_A\bigl(P_n x + (I - P_n)y\bigr)\,\gamma(dy)$, where $P_n x = \sum_{i=1}^{n}\hat{e}_i(x)e_i$.
Then the sets $A_n := \{f_n > 1/2\}$ are convex cylinders and $\gamma(A \,\triangle\, A_n) \to 0$. If $A$ is absolutely convex, then so are the sets $A_n$.
PROOF. We may assume that $\gamma$ is centered. In addition, we may assume that $A$ is a Borel convex set (the convex hull of some sequence of compact sets). Since $f_n \to I_A$ in $L^2(\gamma)$ by Corollary 3.5.2, we have $\gamma(A \,\triangle\, A_n) \to 0$. Denote by $\mu_n$ the image of $\gamma$ under the mapping $I - P_n$. Then $f_n(x) = \mu_n(A - P_n x)$. Therefore, the condition $x, z \in A_n$ can be written as $\mu_n(A - P_n x) > 1/2$, $\mu_n(A - P_n z) > 1/2$. By the convexity of $A$, we have
$\lambda(A - P_n x) + (1-\lambda)(A - P_n z) = A - \lambda P_n x - (1-\lambda)P_n z$, $\forall\,\lambda \in (0, 1]$.
By virtue of inequality (4.2.2) applied to the measure $\mu_n$ we conclude that $\mu_n(A - \lambda P_n x - (1-\lambda)P_n z) > 1/2$, that is, $\lambda x + (1-\lambda)z \in A_n$, i.e., $A_n$ is convex. Finally, if $A$ is absolutely convex, then, due to the symmetry of $\mu_n$, the functions $f_n$ above are even, hence the sets $A_n$ are symmetric about the origin. □
4.3. Isoperimetric inequalities
Our next application of Gaussian symmetrizations is a proof of the following isoperimetric inequality (which can be proved without symmetrizations, but not as simply). Recall that we put $\Phi^{-1}(0) = -\infty$ and $\Phi^{-1}(1) = +\infty$.
4.3.1. Theorem. Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$ and let $U$ be the closed unit ball in $\mathbb{R}^n$ centered at the origin. For every measurable set $A \subset \mathbb{R}^n$, the following inequality holds true:
$\Phi^{-1}(\gamma_n(A + rU)) \ge \Phi^{-1}(\gamma_n(A)) + r$, $\forall\, r \ge 0$. (4.3.1)
PROOF. Clearly, it suffices to prove this estimate for compact sets $A$. Let us apply to $A$ an arbitrary $n$-symmetrization $S$. Then the sets $S(A)$ and $S(A) + rU$ are half-spaces. Since the Gaussian measure is preserved by the symmetrizations, by virtue of Theorem 4.1.7, we get
$\gamma_n(A + rU) = \gamma_n(S(A + rU)) \ge \gamma_n(S(A) + rU) = \Phi\bigl(\Phi^{-1}\{\gamma_n(S(A))\} + r\bigr) = \Phi\bigl(\Phi^{-1}\{\gamma_n(A)\} + r\bigr)$,
since the half-space $S(A) + rU$ is obtained from the half-space $S(A)$ by shifting by a vector of length $r$ orthogonal to the boundary. □
Note that inequality (4.3.1) can be written as $\gamma_n(A + rU) \ge \Phi(a + r)$, where $a = \Phi^{-1}(\gamma_n(A))$. Therefore, $\Phi(a + r)$ is the measure of the set $\Pi + rU$, where $\Pi$ is a half-space having the same measure as $A$. If we define the surface measure of $A$ as the limit of the ratio $r^{-1}(\gamma_n(A + rU) - \gamma_n(A))$ as $r \to 0$, then (4.3.1) shows that the half-spaces possess the minimal surface measures among the sets of given positive measure. For the formulation of one more infinite dimensional extension we need the following technical result.
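The inequality (4.3.1) can be illustrated numerically in dimension one, where $U = [-1, 1]$ and the $r$-enlargement of an interval $[lo, hi]$ is simply $[lo - r, hi + r]$. The following sketch assumes this one dimensional setting; the function names are illustrative. It computes the gap between the two sides of (4.3.1), which should be nonnegative.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian measure on the real line


def gauss(lo, hi):
    """Standard Gaussian measure of the interval [lo, hi]."""
    return STD.cdf(hi) - STD.cdf(lo)


def isoperimetric_gap(lo, hi, r):
    """Phi^{-1}(gamma(A + rU)) - (Phi^{-1}(gamma(A)) + r) for A = [lo, hi]."""
    enlarged = gauss(lo - r, hi + r)
    return STD.inv_cdf(enlarged) - (STD.inv_cdf(gauss(lo, hi)) + r)


# Gap for A = [-1, 1] and r = 0, 0.25, ..., 3.
gaps = [isoperimetric_gap(-1.0, 1.0, k / 4) for k in range(13)]
```

For a symmetric interval the gap is strictly positive for $r > 0$, reflecting the fact that an interval has a larger Gaussian boundary than the half-line of the same measure.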
4.3.2. Lemma. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$ and let $Q$ be a Borel set in the Hilbert space $H(\gamma)$. Then, for every Borel set $B \subset X$, the set $B + Q$ is $\gamma$-measurable.
PROOF. Observe that $B + Q$ is a Souslin set if $X$ is a separable Banach space, hence is $\gamma$-measurable. We know that in the general case there is a metrizable compact subset $K$ in $X$ of positive $\gamma$-measure. Let $T_n$ be the cube in $\mathbb{R}^n$ given by $|x_i| \le n$, $i = 1, \ldots, n$. Denote by $K_n$ the image of the metrizable compact set $T_n \times K^n$ under the mapping
$((t_1, \ldots, t_n), (k_1, \ldots, k_n)) \mapsto k_1 t_1 + \ldots + k_n t_n$.
Then $K_n$ is also compact metrizable. Recall that $H(\gamma)$ is a separable Hilbert space. Therefore, the set $(B \cap K_n) + Q$ is a Souslin subset of $X$ as a continuous image of the Borel set $(B \cap K_n) \times Q$ in the complete separable metric space $K_n \times H(\gamma)$ (see Appendix). The corresponding continuous mapping has the form $(k, h) \mapsto k + h$. Therefore, for every $n$, the set $(B \cap K_n) + Q$ is $\gamma$-measurable (see Theorem A.3.15).
The union $L = \bigcup_{n=1}^{\infty} K_n$ is a linear subspace. In addition, $\gamma(L) > 0$. By the zero-one law, one has $\gamma(L) = 1$. In particular, $L$ contains $H(\gamma)$, hence $Q \subset L$. It remains to note that, letting
$C = \bigcup_{n=1}^{\infty}\bigl((B \cap K_n) + Q\bigr)$,
we have $C \subset B + Q$ and $(B + Q)\setminus C \subset X\setminus L$. Indeed, let $x = b + q$, where $b \in B$, $q \in Q$. If $x \in L$, then $b \in L$, since $Q \subset L$ and $L$ is a linear subspace. Hence $b \in B \cap K_n$ for some $n$, whence $x \in (B \cap K_n) + Q \subset C$ and $x \notin (B + Q)\setminus C$. Therefore, $\gamma\bigl((B + Q)\setminus C\bigr) = 0$. □
4.3.3. Theorem. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$, $A$ a $\gamma$-measurable set and let $U_H$ be the closed unit ball in the Hilbert space $H = H(\gamma)$. Then
$\gamma(A + rU_H) \ge \Phi(a + r)$, $\forall\, r \ge 0$,
where $a$ is chosen in such a way that $\Phi(a) = \gamma(A)$. If $\gamma(A) \ge 1/2$, then
$\gamma(A + rU_H) \ge \Phi(r) \ge 1 - \frac{1}{2}\exp\bigl(-\frac{1}{2}r^2\bigr)$. (4.3.2)
PROOF. We use the same reasoning as in the proof of inequality (4.2.2), reduce everything to the case of product-measures on $\mathbb{R}^\infty$, and then apply the finite dimensional inequality proved in Theorem 4.3.1. Note that in (4.3.2) we use the estimate $\Phi(t) \ge 1 - \frac{1}{2}\exp\bigl(-\frac{1}{2}t^2\bigr)$, $\forall\, t \ge 0$, which is easily verified. □
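The elementary estimate $\Phi(t) \ge 1 - \frac{1}{2}\exp(-t^2/2)$ used in (4.3.2) is equivalent to the tail bound $1 - \Phi(t) \le \frac{1}{2}\exp(-t^2/2)$ for $t \ge 0$. A minimal numerical check on a grid, with the tail computed through the complementary error function to avoid cancellation (the grid and the function names are assumptions of this sketch), might look as follows.

```python
import math


def gaussian_tail(t):
    """1 - Phi(t), computed via erfc so that large t does not lose precision."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def tail_bound(t):
    """Right-hand side (1/2) * exp(-t^2 / 2) of the tail estimate."""
    return 0.5 * math.exp(-0.5 * t * t)


# Grid t = 0, 0.01, ..., 10; equality holds at t = 0.
grid = [k / 100 for k in range(1001)]
bound_holds = all(gaussian_tail(t) <= tail_bound(t) for t in grid)
```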
4.3.4. Corollary. For every positive $\alpha$, there exist $r_0(\alpha) > 0$ and a real number $c(\alpha)$ such that, for all $r \ge r_0(\alpha)$, the following inequality holds true:
$\gamma(A + rU_H) \ge 1 - \exp\Bigl(-\frac{r^2}{2} + c(\alpha)r\Bigr)$ (4.3.3)
provided $\gamma(A) \ge \alpha > 0$.
Now, following Borell [99], we shall apply the isoperimetric inequality to the estimates of the distribution functions of certain nonlinear functionals. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$ with the Cameron-Martin space $H = H(\gamma)$. For any given $d > 0$, let us denote by $S_d(\gamma)$ the class of all $\gamma$-measurable functions $f$ satisfying the following condition: there exist a real function $A_f$ on the closed unit ball $U_H$ in $H$ and a sequence of $\gamma$-measurable functions $g_n$ such that one has
(i) $\sup_{h \in U_H,\ t \ge n}|t^{-d}f(x + th) - A_f(h)| \le g_n(x)$ a.e.,
(ii) $\lim_{n\to\infty} g_n = 0$ a.e.
[...]
$\lim_{t\to+\infty} t^{-2/d}\log\gamma(f > t) = -\frac{1}{2}\Bigl(\sup_{h \in U_H}|A_f(h)|\Bigr)^{-2/d}$. (4.3.4)
Finally,
$\exp(\alpha|f|^{2/d}) \in L^1(\gamma)$, $\forall\,\alpha < \frac{1}{2}\Bigl(\sup_{h \in U_H}|A_f(h)|\Bigr)^{-2/d}$. (4.3.5)
PROOF. It suffices to prove our claims for centered measures. We may assume that the functions $g_n$ are Borel. By Egorov's theorem, there exists a Borel set $A \subset X$ with $\gamma(A) > 1/2$ such that
$\lim_{n\to\infty}\ \sup_{|h|_H \le 1,\ t \ge n,\ x \in A}|t^{-d}f(x + th) - A_f(h)| = 0$. (4.3.6)
In particular, there exist $n_0 \in \mathbb{N}$, $C > 0$ and a positive measure Borel set $B \subset A$ such that $|f(x)| \le C$ for all $x \in B$ and $|f(x + nh) - A_f(h)| \le 1$ for all $x \in B$, $h \in U_H$ and $n \ge n_0$. By Theorem 2.4.8, there exists $\delta \in (0, 1)$ such that [...] $\int f(x + n_0 h)\,\nu(dx)$ [...] $> 0$ was arbitrary, we get (4.3.5). Clearly, (4.3.8) implies that
$\limsup_{t\to\infty} t^{-2/d}\log\gamma(f > t) \le -\frac{1}{2}M^{-2/d}$.
It remains to be shown that
$\liminf_{t\to\infty} t^{-2/d}\log\gamma(f > t) \ge -\frac{1}{2}M^{-2/d}$. (4.3.9)
We shall assume that $M > 0$ (otherwise the claim is trivial). Let us fix $\varepsilon \in (0, M/2)$. Since $A_f$ is continuous and positively homogeneous, there exists a vector $h \in U_H$ such that $|h|_H = 1$, $\hat{h} \in X^*$ and $A_f(h) > M - \varepsilon$. There exists a full measure Borel set $A$ such that conditions (i) and (ii) from the definition of $S_d(\gamma)$ are satisfied pointwise on $A$. Let us put $\pi(x) = x - \hat{h}(x)h$. Denote by $\gamma_1$ the image of $\gamma$ under the mapping $x \mapsto \hat{h}(x)h$. Note that $\gamma = \gamma_1 * \gamma\circ\pi^{-1}$ (if we write $X$ as the product of the one dimensional space $\mathbb{R}^1 h$ and the hyperplane $\pi(X) = \hat{h}^{-1}(0)$, then $\gamma$ becomes the product of $\gamma_1$ and $\gamma\circ\pi^{-1}$). Hence there exists $t_1 \in \mathbb{R}^1$ such that $\gamma\circ\pi^{-1}(A - t_1 h) = 1$. Since the sequence of Borel functions $g_n(\pi(x) + t_1 h)$ converges to zero pointwise on $\pi^{-1}(A - t_1 h)$, Egorov's theorem yields the existence of a Borel set $B \subset A - t_1 h$ (in fact, we can choose $B \subset \pi(X) \cap (A - t_1 h)$, since $\gamma\circ\pi^{-1}$ is concentrated on the hyperplane $\pi(X)$) such that $\gamma\circ\pi^{-1}(B) > 0$ and
$\lim_{n\to\infty}\ \sup_{x \in B} g_n(\pi(x) + t_1 h) = 0$.
In particular, there exists $n_\varepsilon \in \mathbb{N}$ such that
$\sup_{t \ge n}|t^{-d}f(\pi(x) + t_1 h + th) - A_f(h)| \le \varepsilon$, $\forall\, x \in \pi^{-1}(B)$, $\forall\, n \ge n_\varepsilon$.
Noting that $\hat{h}(h) = 1$ and hence $\pi(x) + t_1 h + (\hat{h}(x) - t_1)h = x$, we arrive at the following relationship by taking $t = \hat{h}(x) - t_1$ in the previous estimate:
$\{x \in \pi^{-1}(B)\colon \hat{h}(x) \ge n_\varepsilon + t_1\} \subset \{x\colon f(x) \ge (M - \varepsilon)|\hat{h}(x) - t_1|^d\}$.
Therefore, for any $t \ge n_\varepsilon^d(M - \varepsilon)$, we obtain
$\gamma(x\colon f(x) > t) \ge \gamma\Bigl(x \in \pi^{-1}(B)\colon \hat{h}(x) > t_1 + \Bigl(\frac{t}{M - \varepsilon}\Bigr)^{1/d}\Bigr)$.
Note that for any $r \in \mathbb{R}^1$ one has
$\gamma\bigl(\pi^{-1}(B) \cap \{\hat{h} > r\}\bigr) = \gamma\bigl(\pi^{-1}(B)\bigr)\gamma\bigl(\{\hat{h} > r\}\bigr)$.
Indeed,
$\gamma_1\bigl([\pi^{-1}(B) \cap \{\hat{h} > r\}] - y\bigr) = \gamma_1(\{\hat{h} > r\}) = \gamma(\{\hat{h} > r\})$
if $y \in \pi^{-1}(B)$ and this measure is zero if $y \notin \pi^{-1}(B)$, since $\gamma_1$ is concentrated on the line $\mathbb{R}^1 h$ and $\pi(h) = 0$, so $th + y \in \pi^{-1}(B)$ precisely when $y \in \pi^{-1}(B)$. Hence,
$\gamma(x\colon f(x) > t) \ge \gamma\Bigl(x\colon \hat{h}(x) > t_1 + \Bigl(\frac{t}{M - \varepsilon}\Bigr)^{1/d}\Bigr)\,\gamma\circ\pi^{-1}(B)$,
whence (4.3.9) follows at once, since $\varepsilon > 0$ can be taken arbitrarily small. □
4.3.6. Example. Let $f$ be a $\gamma$-measurable function such that $f(\lambda h) = \lambda^d f(h)$ for some $d > 0$ and all $h \in H$, $\lambda > 0$. Assume, in addition, that there exists a full measure linear subspace (or an additive subgroup) $X_0 \subset X$ such that $f(x + y) \le f(x) + f(y)$ for all $x, y \in X_0$. Then $f \in S_d(\gamma)$ and $A_f(h) = f(h)$.
PROOF. Since $H \subset X_0$, we have $f(x + h) - f(x) \le f(h)$ and $f(x + h) - f(h) \ge -f(-x)$ for all $x \in X_0$ and all $h \in H$. Hence, for all $t > 0$ and $h \in H$, one has
$|t^{-d}f(x + th) - f(h)| = t^{-d}|f(x + th) - f(th)|$ [...]
[...] $> 0$ and all $x \in X$, $\lambda > 0$. Then $f \in S_d(\gamma)$ and $A_f(h) = f(h)$ for all $h \in H$.
PROOF. Set
$g_n(x) = \sup_{h \in U_H,\ t \ge n}|f(t^{-1}x + h) - f(h)|$.
Using the compactness of $U_H$ in $X$ and the continuity of $f$, it is straightforward to verify that $\lim_{n\to\infty} g_n(x) = 0$ for every $x \in X$. □
We shall discuss other examples in Chapter 5 in the section on measurable polynomials.
4.4. Convex functions
In this section we discuss distributions of nonlinear functions on Gaussian spaces. The material covered here is a link between the preceding chapters, which dealt with linear objects, and Chapters 5 and 6, devoted to nonlinear problems. First we formulate a general result concerning the structure of the distributions of convex functionals. A measurable function $f$ on a locally convex space with a Radon Gaussian measure $\gamma$ is said to be a $\gamma$-measurable convex function if it has a modification $f_0\colon X \to \mathbb{R}^1$ which is convex in the usual sense, i.e.,
$f_0(\lambda x + (1-\lambda)y) \le \lambda f_0(x) + (1-\lambda)f_0(y)$, $\forall\,\lambda \in [0, 1]$, $\forall\, x, y \in X$.
Concave functions are defined by analogy with "$\ge$" replacing "$\le$". [...] $\gamma_n(A \cap B)$.
(4.6.8)
In particular, if $B$ is spherically symmetric, then $\gamma_n(A \cap B) \ge \gamma_n(A)\gamma_n(B)$.
PROOF. We shall see that in fact (4.6.8) is true for all spherically symmetric measures and all measurable sets $A$ and $B$ such that the sets $A_\varphi = A \cap \mathbb{R}^1\varphi$ and $B_\varphi = B \cap \mathbb{R}^1\varphi$ are symmetric intervals for every $\varphi$ from the unit sphere $S$. To this end, let us write $\gamma_n$ as $\sigma \otimes p$, where $\sigma$ is the normalized Lebesgue surface measure on $S$ and $p$ is some probability measure on $(0, \infty)$. We observe that $\sigma$ coincides with the image of $m$ under the mapping $U \mapsto U\varphi$ with values in $S$, for any fixed $\varphi \in S$. Therefore,
$\int \gamma_n(A \cap U(B))\,m(dU) = \int_S\int_S p(A_\varphi \cap B_\psi)\,\sigma(d\varphi)\,\sigma(d\psi)$.
Now we observe that $p(A_\varphi \cap B_\psi) \ge p(A_\varphi)p(B_\psi)$, since $A_\varphi \cap B_\psi$ is either $A_\varphi$ or $B_\psi$. Noting that
$\gamma_n(A) = \int_S p(A_\varphi)\,\sigma(d\varphi)$
and similarly for $B$, we get (4.6.8). □
It is interesting to note that inequality (4.6.2) that we would like to have for log-concave functions is valid for convex functions. Let us give a precise formulation following [350].
4.6.6. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $f, g \in L^2(\gamma)$ be convex functions such that $I_1(f) = 0$. Then
$\int_X fg\,d\gamma \ge \int_X f\,d\gamma\int_X g\,d\gamma$. (4.6.9)
PROOF. Let us consider first the case where $\gamma = \gamma_n$ is the standard Gaussian measure on $\mathbb{R}^n$. Suppose that $f, g \in W^{2,2}(\gamma_n)$. Let $(W_t)_{t \ge 0}$, $W_t = (W_t^1, \ldots, W_t^n)$, be a standard Wiener process in $\mathbb{R}^n$. According to Proposition 2.11.1, we have
$f(W_1) = \mathbb{E}f(W_1) + \int_0^1 \nabla P_{1-t}f(W_t)\,dW_t$ a.e.
Using an analogous representation for $\partial_{x_i}P_{1-t}f$, the commutativity between $P_t$ and $\partial_{x_i}$, and the fact that $P_{t-s}P_{1-t} = P_{1-s}$, we arrive at the equality
$\partial_{x_i}P_{1-t}f(W_t) = \mathbb{E}[\partial_{x_i}P_{1-t}f(W_t)] + \sum_{j=1}^{n}\int_0^t \partial_{x_j}\partial_{x_i}P_{1-s}f(W_s)\,dW_s^j$.
Note that
$\mathbb{E}[\partial_{x_i}P_{1-t}f(W_t)] = P_t(\partial_{x_i}P_{1-t}f)(0) = (\partial_{x_i}P_1 f)(0) = \int_{\mathbb{R}^n} y_i f(y)\,\gamma(dy) = 0$,
since
$\partial_{x_i}P_1 f(x) = \partial_{x_i}\int_{\mathbb{R}^n} f(x + y)\,\gamma(dy) = \int_{\mathbb{R}^n}(z_i - x_i)f(z)\,p_n(z - x)\,dz$,
where $p_n$ denotes the density of $\gamma_n$. Clearly, we may assume that $I_1(g) = 0$, passing to the function $g - I_1(g)$. In addition, we may assume that $f$ and $g$ have zero integrals with respect to $\gamma$, passing to the functions $f - c_1$, $g - c_2$. Therefore, we arrive at the following representation:
$f(W_1) = \sum_{i,j=1}^{n}\int_0^1\int_0^t \partial_{x_j}\partial_{x_i}P_{1-s}f(W_s)\,dW_s^j\,dW_t^i$.
An analogous representation holds for $g$. It remains to note that $\int fg\,d\gamma = \mathbb{E}f(W_1)g(W_1) \ge 0$. Indeed, the matrices $(A^{ij}) = (\partial_{x_i}\partial_{x_j}P_{1-s}f(x))$ and $(B^{ij}) = (\partial_{x_i}\partial_{x_j}P_{1-s}g(x))$ are nonnegative definite, which follows from the convexity of the functions $P_t f$ and $P_t g$ (implied by the convexity of $f$ and $g$). Therefore,
$\sum_{i,j=1}^{n}A^{ij}B^{ij} = \mathrm{trace}(AB) \ge 0$,
since, letting $e_i$ be eigenvectors of $A$ corresponding to eigenvalues $\alpha_i \ge 0$, we have $\mathrm{trace}(AB) = \sum_{i=1}^{n}\alpha_i(Be_i, e_i) \ge 0$. If $f$ and $g$ are not supposed to be in $W^{2,2}(\gamma_n)$, we consider the approximations $T_t f$ and $T_t g$, $t > 0$, where $(T_t)_{t \ge 0}$ is the Ornstein-Uhlenbeck semigroup. It is readily verified that $T_t f$ and $T_t g$ are convex and $I_1(T_t f) = e^{-t}I_1(f) = 0$. Since $T_t f, T_t g \in W^{2,2}(\gamma_n)$, we arrive at (4.6.9) in the limit as $t \to 0$. In order to complete the proof in the infinite dimensional case, we apply our standard finite dimensional approximations from Corollary 3.5.2, which give convex cylindrical functions $f_n$ and $g_n$ according to Remark 3.5.3. □
There is a version of the correlation inequality for positively correlated random variables.
4.6.7. Proposition. Let $\gamma$ be a Gaussian measure on $\mathbb{R}^n$ such that all elements of its covariance matrix $K = (K_{ij})$ are nonnegative and let $f$ and $g$ be two functions in $L^2(\gamma)$ that are nondecreasing in every variable. Then
$\int_{\mathbb{R}^n} fg\,d\gamma \ge \int_{\mathbb{R}^n} f\,d\gamma\int_{\mathbb{R}^n} g\,d\gamma$. (4.6.10)
PROOF. Suppose first that $f$ and $g$ are bounded and nonnegative. We prove the claim by induction on $n$. Let $n = 1$. In this case, we do not need $\gamma$ to be Gaussian. We may assume that $\|g\|_{L^1(\gamma)} = 1$. Then $\mu = g\cdot\gamma$ is a probability measure. According to Problem 1.10.11, it suffices to show that
$\mu(\{f > t\}) \ge \gamma(\{f > t\})$, $\forall\, t \ge 0$. (4.6.11)
If the set $J = \{f > t\}$ is nonempty, then it is a ray $(r, +\infty)$ or $[r, +\infty)$. Let us consider the probability measure $\nu = \gamma(J)^{-1}\gamma|_J$. For every $s \ge 0$, one has $\nu(\{g > s\}) \ge \gamma(\{g > s\})$, since the set $\{g > s\}$ is also a ray (or empty) and its intersection with $J$ is either $J$ or $\{g > s\}$. Hence $\int g\,d\nu \ge \int g\,d\gamma$, whence (4.6.11) follows by Problem 1.10.11. Suppose that (4.6.10) is proven for some $n \ge 1$. We may assume that $\gamma$ is centered and is not concentrated at the origin. According to a well-known theorem of Perron, the maximal eigenvalue $k$ of $K$ corresponds to an eigenvector $e = (a_1, \ldots, a_n)$ with nonnegative coordinates $a_i$ (see, e.g., [670, Appendix]). We may assume that $a_n > 0$ and that $k = 1$. Denote by $\gamma_0$ the projection of $\gamma$ to $\mathbb{R}^{n-1}$ regarded as a hyperplane $L$ in $\mathbb{R}^n$. Then the elements of the covariance matrix of $\gamma_0$ are nonnegative. The conditional measures $\gamma^y$ on the lines $y + \mathbb{R}^1 e_n$, $y = (y_1, \ldots, y_{n-1}, 0) \in L$, are Gaussian with means $m(y) = a_1 y_1 + \ldots + a_{n-1}y_{n-1}$ and variance 1 (see Theorem 3.10.2). Note that if $y = (y_1, \ldots, y_{n-1}, 0)$, $z = (z_1, \ldots, z_{n-1}, 0) \in L$ and $y_i \ge z_i$, $i = 1, \ldots, n-1$, then $m(y) \ge m(z)$. By the one dimensional case, we have
$\int_{\mathbb{R}^n} fg\,d\gamma = \int_{\mathbb{R}^{n-1}}\int_{y + \mathbb{R}^1 e_n} f(y_1, \ldots, y_{n-1}, t)\,g(y_1, \ldots, y_{n-1}, t)\,\gamma^y(dt)\,\gamma_0(dy)$
$\ge \int_{\mathbb{R}^{n-1}}\Bigl(\int_{y + \mathbb{R}^1 e_n} f(y_1, \ldots, y_{n-1}, t)\,\gamma^y(dt)\Bigr)\Bigl(\int_{y + \mathbb{R}^1 e_n} g(y_1, \ldots, y_{n-1}, t)\,\gamma^y(dt)\Bigr)\gamma_0(dy)$.
The expression on the right is the integral against $\gamma_0$ of the product of the functions
$\int_{\mathbb{R}^1} f(y_1, \ldots, y_{n-1}, t + m(y))\,\gamma_1(dt)$, $\int_{\mathbb{R}^1} g(y_1, \ldots, y_{n-1}, t + m(y))\,\gamma_1(dt)$
of $y$, which are nondecreasing in every variable $y_i$. Hence we can apply the inductive assumption, whence the claim. If $f$ and $g$ are bounded, but not necessarily nonnegative, we apply (4.6.10) to the functions $f + c$ and $g + c$ with $c$ sufficiently large and observe that this yields the claim for $f$ and $g$. For unbounded $f$ and $g$, we use (4.6.10) for the nondecreasing functions $f_k = \max(\min(f, k), -k)$ and $g_k = \max(\min(g, k), -k)$ and pass to the limit as $k \to \infty$. □
Other proofs of this proposition can be found in [36], [769] (see also Problem 4.10.23).
4.7. The Onsager-Machlup functions
Let $\mu$ be a measure on a metric space $X$ and let $K(x, \varepsilon)$ be the closed ball of radius $\varepsilon$ centered at $x$. In diverse applications one has to study the existence of the following limit:
$\lim_{\varepsilon\to0}\frac{\mu(K(a, \varepsilon))}{\mu(K(b, \varepsilon))} = I(a, b)$.
If this limit exists, then the function $F(a, b) = \log I(a, b)$ is called the Onsager-Machlup function (where we put $\log(\infty) = \infty$, $\log(0) = -\infty$). We only discuss here the Gaussian measures, since many results on the Onsager-Machlup functions are based on the Gaussian case. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $H = H(\gamma)$ be its Cameron-Martin space with the Hilbert norm $|\cdot|_H$, and let $q$ be a $\gamma$-measurable seminorm on $X$. We shall assume that a version of $q$ is chosen that is a seminorm in the usual sense. For vectors $h \in H$ we shall investigate the existence of the limit
$\lim_{\varepsilon\to0}\frac{\gamma(V_\varepsilon + h)}{\gamma(V_\varepsilon)}$, $V_\varepsilon = \{x\colon q(x) \le \varepsilon\}$. (4.7.1)
We shall assume that
$\gamma(V_\varepsilon) > 0$, $\forall\,\varepsilon > 0$
(otherwise one can assign 1 to the corresponding limit). As we already know, this condition is indeed a certain restriction on $q$, since it can happen in infinite dimensional spaces that $\gamma(V_1) = 0$ whereas $\gamma(V_2) > 0$ (see Example 3.6.3).
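It will be shown below that the limit in (4.7.1) exists. Note that by virtue of the Cameron-Martin formula (2.4.2), the following equality holds true:
$\frac{\gamma(V_\varepsilon + h)}{\gamma(V_\varepsilon)} = \exp\Bigl(-\frac{1}{2}|h|_H^2\Bigr)\frac{1}{\gamma(V_\varepsilon)}\int_{V_\varepsilon}\exp(\hat{h}(x))\,\gamma(dx)$, (4.7.2)
where $\hat{h}$ is the measurable linear functional generated by $h$. Let us introduce some additional notation:
$J_\varepsilon(f) = \frac{1}{\gamma(V_\varepsilon)}\int_{V_\varepsilon}\exp(f(x))\,\gamma(dx)$, $V_\varepsilon = \{q \le \varepsilon\}$,
$E_q = \{f \in X^*_\gamma\colon \lim_{\varepsilon\to0}J_\varepsilon(f) \text{ exists}\}$,
$F_q = \{f \in X^*_\gamma\colon \lim_{\varepsilon\to0}J_\varepsilon(f) = 1\}$.
By virtue of (4.7.2), the existence of the limit in (4.7.1) is equivalent to the inclusion $\hat{h} \in E_q$.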
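In the simplest case the limit (4.7.1) can be observed directly. The sketch below assumes the standard Gaussian measure on the real line with $q(x) = |x|$, so that $V_\varepsilon = [-\varepsilon, \varepsilon]$, $|h|_H = |h|$, and the expected limit is $\exp(-h^2/2)$; the function name `shifted_ball_ratio` and the chosen values of $h$ and $\varepsilon$ are illustrative.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian measure on the real line


def shifted_ball_ratio(h, eps):
    """gamma(V_eps + h) / gamma(V_eps) for V_eps = [-eps, eps]."""
    shifted = STD.cdf(h + eps) - STD.cdf(h - eps)
    centered = STD.cdf(eps) - STD.cdf(-eps)
    return shifted / centered


h = 1.5
limit = math.exp(-0.5 * h * h)  # exp(-|h|^2 / 2) predicted by (4.7.2)
ratios = [(eps, shifted_ball_ratio(h, eps)) for eps in (1.0, 0.1, 0.01, 0.001)]
```

As $\varepsilon$ decreases, the computed ratios approach `limit`, in agreement with (4.7.2) when $J_\varepsilon(\hat{h}) \to 1$.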
A trivial example of an element from $F_q$ is any function $f$ that satisfies the condition $|f| \le Cq$ $\gamma$-a.e. for some number $C$; in this case, one has $\exp(-C\varepsilon) \le J_\varepsilon(f) \le \exp(C\varepsilon)$. Clearly, $F_q$ may not coincide with $E_q$. For example, this is the case if $\gamma$ is the standard Gaussian measure on $\mathbb{R}^2$ and $q(x) = |x_1|$, $x = (x_1, x_2)$: the corresponding limit for $f(x) = x_2$ equals $\int_{-\infty}^{+\infty} e^{x_2}p(x_2, 0, 1)\,dx_2$. As we shall see below, the reason why $E_q$ and $F_q$ may not coincide is similar in the general case.
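4.7.1. Lemma. For all $f \in X^*_\gamma$, the following inequality holds true: [...] local Lipschitz estimate implies that $E_q$ and $F_q$ are closed in $X^*_\gamma$. Indeed, let $f_n \to f$ in $X^*_\gamma$, $f_n \in E_q$ and $\delta > 0$. It follows from the estimate proven above that, for $n$ sufficiently large, one has $|J_\varepsilon(f_n) - J_\varepsilon(f)| < \delta$ for all $\varepsilon > 0$. By condition, there is $\varepsilon_0 > 0$ such that $|J_\varepsilon(f_n) - J_{\varepsilon'}(f_n)|$ [...] Let
$Q = c + \sum_{n=1}^{\infty}c_n\xi_n + \sum_{n=1}^{\infty}\alpha_n(\xi_n^2 - 1)$. (4.8.4)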
Note that if $Q$ is given by (4.8.4), then $\exp Q$ is integrable if and only if $\alpha_n < 1/2$ for all $n$. In addition,
$\mathbb{E}\Bigl[\exp\Bigl(c + \sum_{n=1}^{\infty}c_n\xi_n + \sum_{n=1}^{\infty}\alpha_n(\xi_n^2 - 1)\Bigr)\Bigr] = \exp\Bigl(c + \frac{1}{2}\sum_{n=1}^{\infty}\frac{c_n^2}{1 - 2\alpha_n}\Bigr)\prod_{n=1}^{\infty}\frac{\exp(-\alpha_n)}{\sqrt{1 - 2\alpha_n}}$. (4.8.5)
This is readily verified by direct evaluation, taking into account that the product on the right in (4.8.5) converges if $\sum_{n=1}^{\infty}\alpha_n^2 < \infty$ and $\alpha_n < 1/2$.
Put $J_\varepsilon = \mathbb{E}(\exp Q \mid q \le \varepsilon)$. It will be shown below that the finite and positive limit $\lim_{\varepsilon\to0}J_\varepsilon$ exists for every choice of $q$ precisely when $\sum_{n=1}^{\infty}|\alpha_n| < \infty$. If $\alpha_n^+$ and $\alpha_n^-$ are, respectively, the positive and negative elements of the sequence $\{\alpha_n\}$, then the convergence of one of the two series $\sum_{n=1}^{\infty}\alpha_n^-$, $\sum_{n=1}^{\infty}\alpha_n^+$ together with the divergence of the other one implies that $\lim_{\varepsilon\to0}J_\varepsilon = 0$, respectively, $\lim_{\varepsilon\to0}J_\varepsilon = +\infty$. Finally, if both series diverge, then one can choose $q$ in such a way that there will be no definite limit of $J_\varepsilon$ as $\varepsilon \to 0$.
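Each factor in the product formula (4.8.5) comes from a one dimensional Gaussian integral, namely $\mathbb{E}\exp(c\xi + \alpha(\xi^2 - 1)) = \exp\bigl(\frac{c^2}{2(1 - 2\alpha)} - \alpha\bigr)(1 - 2\alpha)^{-1/2}$ for a standard Gaussian $\xi$ and $\alpha < 1/2$. The sketch below compares this closed form with a plain trapezoid approximation of the integral; the truncation interval, the number of steps and the function names are assumptions of the sketch.

```python
import math


def lhs(c, alpha, half_width=30.0, steps=20000):
    """Trapezoid approximation of E exp(c*xi + alpha*(xi**2 - 1)), xi ~ N(0, 1)."""
    step = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, steps) else 1.0
        # Integrand: exp(c*x + alpha*(x^2 - 1)) times the unnormalized density.
        total += weight * math.exp(c * x + alpha * (x * x - 1.0) - 0.5 * x * x)
    return total * step / math.sqrt(2.0 * math.pi)


def rhs(c, alpha):
    """Closed form of a single factor; requires alpha < 1/2."""
    scale = 1.0 - 2.0 * alpha
    return math.exp(0.5 * c * c / scale - alpha) / math.sqrt(scale)


pairs = [(lhs(c, a), rhs(c, a)) for c, a in [(0.0, 0.3), (0.7, 0.3), (1.0, -2.0)]]
```

The two columns of `pairs` agree to high accuracy, which is the direct evaluation referred to after (4.8.5) in the case of a single coordinate.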
It follows from what has been said that the problem of investigating the limit in (4.8.2) is equivalent to the problem of the study of
$\lim_{\varepsilon\to0}\frac{1}{\gamma(\{q \le \varepsilon\})}\int_{\{q \le \varepsilon\}}\exp Q\,d\gamma$ (4.8.6)
for a centered Radon Gaussian measure $\gamma$ and a $\gamma$-measurable second order polynomial $Q$ (or, equivalently, a random variable (4.8.4)).
4.8.1. Lemma. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $V$ be a $\gamma$-measurable absolutely convex set of positive measure, and let $Q_1, Q_2 \in X_0 \oplus X_2$ be nonnegative. Then [...] $\int\exp(-Q_2)\,d\gamma$ [...] $\alpha_n \ge 0$, $\sum_{n=1}^{\infty}\alpha_n < \infty$ and $\{\xi_n\}$ is an orthonormal basis in $X^*_\gamma$. Clearly, we may assume that $c = 0$ and that $\exp Q_1$ is integrable (otherwise the claim is trivial). The measure
$\nu := \|\exp Q_1\|^{-1}_{L^1(\gamma)}\exp Q_1\cdot\gamma$
is centered Gaussian (see Example 5.10.17) and its covariance majorizes the covariance of $\gamma$. Indeed, let $f \in X^*_\gamma$. Then $f = \sum_{n=1}^{\infty}c_n\xi_n$, where $(c_n) \in l^2$. The integrals of $f^2$ with respect to $\nu$ and $\gamma$ equal the sums of the corresponding integrals of the $c_n^2\xi_n^2$'s by the independence of the random variables $\xi_n$. Therefore, the claim reduces to the one dimensional case in which it is obvious. Hence $\nu(V) \le \gamma(V)$, which is our claim. The case $Q_1 = 0$ is analogous. □
4.8.2. Corollary. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $V$ be a $\gamma$-measurable absolutely convex set of positive measure, and let $Q = \sum_{n=1}^{\infty}\alpha_n(\xi_n^2 - 1)$, where $\alpha_n \ge 0$, [...] sequence in $X^*_\gamma$. Then [...] $\exp Q\,d\gamma$ [...]
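[...] which yields $\exp(-2c\varepsilon^2) \le \exp Q \le \exp(2c\varepsilon^2)$ on $V_\varepsilon$. Moreover, the general case reduces to the case where $Q_2 = 0$. Indeed, letting
$J_\varepsilon(\gamma, Q) = \frac{1}{\gamma(V_\varepsilon)}\int_{V_\varepsilon}\exp Q\,d\gamma$,
and taking the centered Gaussian measure $\nu := \|\exp(-Q_2)\|^{-1}_{L^1(\gamma)}\exp(-Q_2)\cdot\gamma$, we have
$J_\varepsilon(\gamma, Q_1 - Q_2) = \frac{J_\varepsilon(\nu, Q_1)}{J_\varepsilon(\nu, Q_2)}$,
since
$\gamma(V_\varepsilon) = \int_{V_\varepsilon}\exp(Q_2)\,d\nu\int_X\exp(-Q_2)\,d\gamma$.
In addition, the $\xi_n$'s are orthogonal in $L^2(\nu)$ and $\|\xi_n\|^2_{L^2(\nu)} = (1 + 2\beta_n)^{-1}$. Hence $Q_2 = \sum_{n=1}^{\infty}\beta_n(1 + 2\beta_n)^{-1}\zeta_n^2$, where $\{\zeta_n\}$ is an orthonormal basis in $X^*_\nu$, i.e., $Q_2$ satisfies the initial condition with respect to $\nu$.
Let $S_n = \sum_{i=1}^{n}\alpha_i\xi_i^2$. Let us fix $q > 1$. By Lebesgue's theorem, there exists $n$ such that
$\int_X\exp(2Q_1 - 2S_n)\,d\gamma \le q^2$.
Using Lemma 4.8.1 and the fact that $Q_1 - S_n \ge 0$, we get the estimate
$1 \le J_\varepsilon(\gamma, Q_1) = J_\varepsilon(\gamma, Q_1 - S_n + S_n)$ [...] $J_\varepsilon(\gamma, 2S_n)$ [...]
Since $q > 1$ was arbitrary, this estimate shows that it suffices to get the equality $\lim_{\varepsilon\to0}J_\varepsilon(\gamma, 2S_n) = 1$. By induction, this reduces to the case $\alpha_1 = \ldots = \alpha_{n-1} = 0$, since the measure $\nu := \|\exp(2S_{n-1})\|^{-1}_{L^1(\gamma)}\exp(2S_{n-1})\cdot\gamma$ is centered Gaussian and
$J_\varepsilon(\gamma, 2S_n) = J_\varepsilon(\nu, 2\alpha_n\xi_n^2)\,J_\varepsilon(\gamma, 2S_{n-1})$.
Set [...]. Then $\sigma(\xi)^2 < 1/8$ by the integrability of $\exp(8Q_1)$. Let us fix again $q > 1$. Let us take a functional $\eta$ from the linear span of $\{\xi_n\}$ such that $\sigma(\eta)^2 < 1/8$ and [...]. Then we find $\varepsilon_1 > 0$ such that
$J_\varepsilon(\gamma, 4\eta^2) \le q$, $\forall\,\varepsilon \in (0, \varepsilon_1)$.
Since $\xi^2 \le 2(\xi - \eta)^2 + 2\eta^2$, estimate (4.8.7) implies that, for any $\varepsilon \in (0, \varepsilon_1)$, we have $J_\varepsilon(\gamma, 4(\xi - \eta)^2)$ [...] Since $q > 1$ was arbitrary, the claim is proved under the additional assumptions made above. The case where $f \ne 0$ reduces to the case considered above. Indeed, by the Cauchy inequality, we have
$|J_\varepsilon(\gamma, Q + f) - J_\varepsilon(\gamma, Q)| = \gamma(V_\varepsilon)^{-1}\Bigl|\int_{V_\varepsilon}\exp Q\,(\exp f - 1)\,d\gamma\Bigr|$ [...]
[...] $\lambda > 1$ such that the function $\exp(\lambda Q_1)$ is still $\gamma$-integrable. The functions $G(z, \varepsilon) = J_\varepsilon(\gamma, zQ)$ are holomorphic in $z$ if $0 < \mathrm{Re}\,z < \lambda$ and uniformly bounded in $\varepsilon > 0$ and $z$ with $0 < \mathrm{Re}\,z < \lambda$. Since they converge to 1 as $\varepsilon \to 0$ for all sufficiently small real $z$, the convergence takes place for all $z$ in $[0, 1]$ by Vitali's theorem (see [767, Ch. V, § 5.21]). □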
4.8.4. Remark. It is seen from the proof given above that if $Q = \sum_{n=1}^{\infty}\alpha_n\xi_n^2$, where $\sum_{n=1}^{\infty}|\alpha_n| < \infty$ and $\xi_n \in X^*_\gamma$ are such that $\sigma(\xi_n)^2$ [...] such that, letting $\delta_i = \delta$ and $r_i = r$, we get
$J_n(\varepsilon, \delta_1, \ldots, \delta_{m(n)}, r_1, \ldots, r_{m(n)})$ [...] $> 0$, due to our choice of $\alpha_i$. Along one sequence of values of $\varepsilon$ the quantities $J_\varepsilon(\gamma, Q)$ tend to zero, whereas along another they tend to a positive number. In the general case, if condition (4.8.10) is satisfied, we can assume that the numbers $\alpha_i^+$ and $|\alpha_i^-|$ are less than 1/2. Choosing two subsequences $n_k$ and $m_k$ such that
$-2^{-k} < \sum_{j=1}^{n_k}\alpha_j^+ + \sum_{j=1}^{m_k}\alpha_j^- < 2^{-k}$,
one can consider a construction that is completely analogous to the one described above and differs only in a more complicated notation. In particular, instead of $e^{-n}$ on the right in (4.8.11) we get $\exp\bigl(-\sum_j\alpha_j^+\bigr)$, and in (4.8.12) in place of 1 we get $\exp\bigl(-\sum_j\alpha_j^+ - \sum_j\alpha_j^-\bigr)$. Clearly, one can get an example where $J_\varepsilon(\gamma, Q)$ has as cluster points 0 and $+\infty$; then $J_\varepsilon(\gamma, Q)$ oscillates in the interval $(0, \infty)$. □
Note that for a suitable sequence of positive numbers $q_n$, the measure $\gamma$ can be regarded also on a separable Banach space $Z$ of the sequences $x = (x_n)$ such that $\|x\| = \sup_n|q_n x_n| < \infty$ and $\lim_{n\to\infty}|q_n x_n| = 0$. If $\{q_n\}$ decreases sufficiently fast, then the sets $\{q \le \varepsilon\}$ are compact in $Z$. Therefore, the limit in (4.8.2) may not exist for equivalent Gaussian measures even in the case of compact sets $\{q \le \varepsilon\}$. The reason is that the corresponding Radon-Nikodym density does not always satisfy the condition in Theorem 4.8.3. The situation may be different if the norm $q$ is prescribed. It follows from [498, Proposition 8 and Remark, p. 288] (see also [484]) that there exist a centered Gaussian measure $\gamma$ on $X = l^2$ and a sequence $(\alpha_n) \in l^2$ with $\sum_{n=1}^{\infty}|\alpha_n| = \infty$ such that, letting $q(x) = \|x\|$, one gets a positive finite limit in (4.8.6). Finally, note that in Section 5.12 below, we shall give representation-free formulations of the results in this section; in addition, we shall consider the case where $q$ may be a seminorm on $H$.
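4.9. Large deviations
Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $U_H$ be the closed unit ball in the Hilbert space $H = H(\gamma)$, and let
$\gamma^\varepsilon(A) = \gamma(\varepsilon^{-1}A)$, $\varepsilon > 0$.
For any two sets $A, B \subset X$, we put
$A \ominus B = \{x \in A \mid x + B \subset A\}$.
Denote by $\mathcal{BAL}$ the Ben Arous-Ledoux class of all Borel subsets $V$ of the space $X$ such that $\liminf_{\varepsilon\to0}\gamma^\varepsilon(V) > 0$.
Note that if $V \in \mathcal{BAL}$, then $\lambda V \in \mathcal{BAL}$ for every $\lambda > 0$. Any Borel set $V$ such that $\lambda V \subset V$ for all $\lambda > 0$ and $\gamma(V) > 0$ belongs to $\mathcal{BAL}$. Put
$I(A) = \frac{1}{2}\inf_{x \in A}I(x)^2$, $I(x) = |x|_H$,
where $I(x) = \infty$ if $x \notin H$.
Let us introduce the following functionals on $\mathcal{B}(X)$:
$r(A) = \sup\{r \ge 0\colon \exists\, V \in \mathcal{BAL},\ (rU_H + V) \cap A = \emptyset\}$,
where $r(A) = 0$ if for no $r$ such a set $V$ exists, and
$s(A) = \inf\{s \ge 0\colon \exists\, V \in \mathcal{BAL},\ (A \ominus V) \cap sU_H \text{ is nonempty}\}$,
where $s(A) = +\infty$ if for no $s$ such a set $V$ exists.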
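4.9.1. Theorem. For every Borel set $A$ in $X$, one has
$\limsup_{\varepsilon\to0}\varepsilon^2\log\gamma^\varepsilon(A) \le -\frac{1}{2}r(A)^2$, (4.9.1)
$\liminf_{\varepsilon\to0}\varepsilon^2\log\gamma^\varepsilon(A) \ge -\frac{1}{2}s(A)^2$. (4.9.2)
PROOF. Let $r > 0$ be such that $(rU_H + V) \cap A$ is empty for some $V \in \mathcal{BAL}$. Then
$\gamma^\varepsilon(A) \le 1 - \gamma(\varepsilon^{-1}rU_H + \varepsilon^{-1}V)$.
Since $V \in \mathcal{BAL}$, there is $\alpha > 0$ such that $\gamma(\varepsilon^{-1}V) \ge \alpha$ for all sufficiently small positive $\varepsilon$. Therefore, by virtue of estimate (4.3.3), we get the existence of a number $c(\alpha)$, which depends only on $\alpha$, such that for all sufficiently small positive $\varepsilon$ one has
$\gamma^\varepsilon(A) \le \exp\Bigl(-\frac{r^2}{2\varepsilon^2} + c(\alpha)\frac{r}{\varepsilon}\Bigr)$ [...]
[...] $\varepsilon > 0$, one has
$\gamma^\varepsilon(A) \ge \gamma^\varepsilon(V + h)$.
According to the Cameron-Martin formula, we get
$\gamma^\varepsilon(V + h) = \exp\Bigl(-\frac{|h|_H^2}{2\varepsilon^2}\Bigr)\int_{\varepsilon^{-1}V}\exp\bigl(-\varepsilon^{-1}\hat{h}(x)\bigr)\,\gamma(dx)$.
Since $V \in \mathcal{BAL}$, there is $\alpha$ such that $\gamma(\varepsilon^{-1}V) \ge \alpha > 0$ for any sufficiently small $\varepsilon$. By Jensen's inequality, we obtain
$\int_{\varepsilon^{-1}V}\exp\bigl(-\varepsilon^{-1}\hat{h}(x)\bigr)\,\gamma(dx) \ge \gamma(\varepsilon^{-1}V)\exp\Bigl[-\frac{1}{\varepsilon\gamma(\varepsilon^{-1}V)}\int_{\varepsilon^{-1}V}\hat{h}(x)\,\gamma(dx)\Bigr]$.
Applying the Cauchy inequality, we arrive at
$\int_{\varepsilon^{-1}V}\hat{h}(x)\,\gamma(dx) \le |h|_H$.
Therefore, for all sufficiently small $\varepsilon > 0$, there holds the inequality
$\gamma^\varepsilon(A) \ge \alpha\exp\Bigl(-\frac{|h|_H^2}{2\varepsilon^2} - \frac{|h|_H}{\alpha\varepsilon}\Bigr)$.
This yields the estimate
$\liminf_{\varepsilon\to0}\varepsilon^2\log\gamma^\varepsilon(A) \ge -|h|_H^2/2 \ge -s^2/2$.
Since one can choose for $s$ any number bigger than $s(A)$, estimate (4.9.2) follows. □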
4.9.2. Lemma. The following inequalities hold true:
\[ \frac{1}{2} r(A)^2 \ge I(A) \quad \text{for every closed set } A \subset X, \]
\[ \frac{1}{2} s(A)^2 \le I(A) \quad \text{for every open set } A \subset X. \]
PROOF. Let $A$ be closed and $I(A) > 0$. Let $0 < r < I(A)$. It follows from the definition of $I(A)$ that $\sqrt{2r}\,U_H \cap A = \emptyset$. Since $\sqrt{2r}\,U_H$ is compact in $X$ and $A$ is closed, there is an absolutely convex neighborhood of zero $V$ such that $(\sqrt{2r}\,U_H + V) \cap A$ is still empty. It is clear that $V \in \mathcal{BAL}$. Therefore, $r(A) \ge \sqrt{2r}$. This implies the first inequality. Let $A$ be open and $I(A) < \infty$. Then one can pick a vector $h \in A \cap H$. There exists an absolutely convex open set $V$ such that $V + h \subset A$, which means that the set $(A \ominus V) \cap |h|_H U_H$ is nonempty. Therefore, $s(A) \le |h|_H$, whence the second inequality. $\Box$
This lemma implies the following classical large deviations principle.
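4.9.3. Corollary. For every Borel set $A$ in $X$, the following estimates hold true:
\[ -I(A^\circ) \le \liminf_{\varepsilon\to 0} \varepsilon^2 \log \gamma^\varepsilon(A) \le \limsup_{\varepsilon\to 0} \varepsilon^2 \log \gamma^\varepsilon(A) \le -I(\overline{A}), \]
where $A^\circ$ is the interior of $A \subset X$ and $\overline{A}$ is its closure.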
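The corollary can be illustrated numerically in the simplest case. The following minimal sketch assumes that $\gamma$ is the standard Gaussian measure on the real line (so that $H = \mathbb{R}^1$ and $|x|_H = |x|$) and that $A = [1, +\infty)$, for which $I(A^\circ) = I(\overline{A}) = 1/2$; the function name `scaled_log_tail` is introduced only for this illustration.

```python
import math

def scaled_log_tail(eps):
    """eps^2 * log gamma^eps(A) for the standard Gaussian measure on the line and A = [1, +inf)."""
    # gamma^eps(A) = gamma(A / eps) = P(X >= 1/eps); erfc keeps accuracy far in the tail
    tail = 0.5 * math.erfc(1.0 / (eps * math.sqrt(2.0)))
    return eps * eps * math.log(tail)

rate = 0.5  # I(A) = inf_{x >= 1} x^2 / 2 in this one dimensional setting
for eps in (0.4, 0.2, 0.1, 0.05):
    # the second column should approach -rate as eps decreases
    print(eps, scaled_log_tail(eps), -rate)
```

The convergence is slow (the error is of order $\varepsilon^2 \log(1/\varepsilon)$), which is typical for logarithmic asymptotics.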
4.10.
Complements and problems
197
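4.9.4. Example. Let $X$ be a separable Banach space with the norm $\|\cdot\|$ and let $\sigma$ be the minimal positive number with $U_H \subset \{x \in X\colon \|x\| \le \sigma\}$ (which is the norm of the natural embedding $H \to X$). Then, for both sets $A = \{x\colon \|x\| > 1\}$ and $A = \{x\colon \|x\| \ge 1\}$, we have $I(A) = \sigma^{-2}/2$. Therefore,
\[ \lim_{\varepsilon\to 0} \varepsilon^2 \log \gamma\bigl(x\colon \|x\| \ge \varepsilon^{-1}\bigr) = -\frac{1}{2\sigma^2}. \]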
Let us mention a result from [451] related to large deviations. Suppose that $\gamma$ is a centered Radon Gaussian measure on a locally convex space $X$ with the Cameron-Martin space $H$ and let $A$ and $B$ be two Borel subsets of $X$. Put
\[ d(A, B) = \inf\{|a - b|_H\colon a \in A,\ b \in B,\ a - b \in H\} \]
if $(A - B) \cap H$ is nonempty; otherwise $d(A, B) := \infty$. Note that $d(A, B) < \infty$ if $A$ and $B$ have positive measures (since then $\gamma(A + H) = \gamma(B + H) = 1$).
4.9.5. Theorem. Suppose that $\gamma(A) > 0$ and $\gamma(B) > 0$. Then
\[ \limsup_{t\to 0} 4t \log \int_A T_t I_B\, d\gamma \le -d(A, B)^2. \]
See [234] for related results. The constructions described above can be applied to Laplace approximations. Let us mention a result from [54].
4.9.6. Theorem. Let F E L' (-y). Then, for every Borel subset A in X, one has 1
-cj < lims'upe2log J exp(-E2F(ex)) y(dx)) < -c2, A
where
C2=inf 9ER(
t+1$(An{Fa}=S({f >a}), 'yn({S(f) 1? See [458] for some positive results. In relation to the correlation inequality, let us mention the following result.
4.10.3. Theorem. Let $\gamma$ be a centered Gaussian measure on $\mathbb{R}^n$ with covariance $K = (K_{i,j})_{i,j=1}^n$ and let $\gamma_\lambda$ be the centered Gaussian measure with covariance $K(\lambda) = (K_{i,j}(\lambda))_{i,j=1}^n$, where $K_{i,j}(\lambda) = K_{i,j}$ if $i, j \ge 2$ or $i = j = 1$ and $K_{1,j}(\lambda) = K_{j,1}(\lambda) = \lambda K_{1,j}$ if $j \ge 2$. Then
\[ G_n(a_1, \dots, a_n, K(\lambda)) := \gamma_\lambda\bigl(x\colon |x_i| < a_i,\ i = 1, \dots, n\bigr) \]
is monotonically nondecreasing in $\lambda \in [0, 1]$ for any fixed $a_1, \dots, a_n$.
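The statement can be checked numerically in the plane. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $n = 2$, unit variances and $K_{1,2} = 0.8$, so that $K(\lambda)$ has off-diagonal entry $0.8\lambda$, and evaluates the rectangle probability by conditioning on the first coordinate and applying Simpson's rule. The helper names `Phi`, `phi` and `rect_prob` are hypothetical.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def rect_prob(a1, a2, rho, n=2000):
    """P(|X1| < a1, |X2| < a2) for a centered Gaussian pair with unit variances
    and correlation rho (|rho| < 1); composite Simpson rule in x1, n even."""
    s = math.sqrt(1.0 - rho * rho)
    def g(x):
        # given X1 = x, the variable X2 is Gaussian with mean rho * x and variance 1 - rho^2
        return phi(x) * (Phi((a2 - rho * x) / s) - Phi((-a2 - rho * x) / s))
    h = 2.0 * a1 / n
    total = g(-a1) + g(a1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(-a1 + i * h)
    return total * h / 3.0

# G_2(a_1, a_2, K(lambda)) for lambda = 0, 0.25, ..., 1
values = [rect_prob(1.0, 1.5, lam * 0.8) for lam in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(values)
```

At $\lambda = 0$ the coordinates are independent, so the first value coincides with the product of the two one dimensional probabilities; the subsequent values should not decrease.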
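PROOF. Suppose first that $K$ is nondegenerate. Using that $K$ is positive definite, it is readily verified that $K(\lambda)$ is nondegenerate for all $\lambda \in [0, 1]$. Denote by $f_n(K(\lambda), \,\cdot\,)$ the density of $\gamma_\lambda$. Put $A = \prod_{i=1}^n [-a_i, a_i]$, $B = \prod_{i=2}^n [-a_i, a_i]$. Interchanging differentiation in $\lambda$ and integration, using the chain rule and Lemma 1.2.4, we get
\begin{align*}
\frac{d}{d\lambda} G_n(a_1, \dots, a_n, K(\lambda))
&= \int_A \sum_{j=2}^n K_{1,j} \frac{\partial}{\partial K_{1,j}} f_n(K(\lambda), x_1, \dots, x_n)\, dx_1 \cdots dx_n \\
&= 2 \int_B \int_0^{a_1} \sum_{j=2}^n K_{1,j} \frac{\partial^2}{\partial x_1 \partial x_j} f_n(K(\lambda), x_1, \dots, x_n)\, dx_1 \cdots dx_n \\
&= 2 \int_B \sum_{j=2}^n K_{1,j} \frac{\partial}{\partial x_j} f_n(K(\lambda), a_1, x_2, \dots, x_n)\, dx_2 \cdots dx_n,
\end{align*}
since
\[ \int_B \sum_{j=2}^n K_{1,j} \frac{\partial}{\partial x_j} f_n(K(\lambda), 0, x_2, \dots, x_n)\, dx_2 \cdots dx_n = 0 \]
by the symmetry of $B$ and $f_n$. The function $f_n(K(\lambda), a_1, x_2, \dots, x_n)$ can be written as
\[ \psi(a_1) f_{n-1}\bigl(K_0(\lambda), x_2 - a_1 K_{1,2}(\lambda), \dots, x_n - a_1 K_{1,n}(\lambda)\bigr) \]
with some positive $\psi(a_1)$ and positive definite $(n-1) \times (n-1)$-matrix $K_0(\lambda)$. Therefore, it suffices to show that
\[ \int_B \sum_{j=2}^n K_{1,j}\, \psi(a_1) \frac{\partial}{\partial x_j} f_{n-1}\bigl(K_0(\lambda), x_2 - a_1 K_{1,2}(\lambda), \dots, x_n - a_1 K_{1,n}(\lambda)\bigr)\, dx_2 \cdots dx_n \]
is nonnegative. This follows by Theorem 1.8.5, since the foregoing expression coincides with the derivative in $t$ of the function $t \mapsto \mu(B - tk)$, where $\mu$ is the centered Gaussian measure on $\mathbb{R}^{n-1}$ with density $f_{n-1}(K_0(\lambda), \,\cdot\,)$ and $k = (a_1 K_{1,2}(\lambda), \dots, a_1 K_{1,n}(\lambda))$. If $K$ is degenerate, the claim follows by approximating $K$ by the sequence $K + m^{-1} I$. $\Box$
4.10.4. Corollary. Suppose that $K(\lambda)$ depends on $\lambda = (\lambda_1, \dots, \lambda_n)$ in the following way: $K_{i,i}(\lambda) = K_{i,i}$ and $K_{i,j}(\lambda) = \lambda_i \lambda_j K_{i,j}$ if $i \ne j$. Then
\[ G_n(a_1, \dots, a_n, K(\lambda)) := \gamma_\lambda\bigl(x\colon |x_i| < a_i,\ i = 1, \dots, n\bigr) \]
is monotonically nondecreasing in every $\lambda_i \in [0, 1]$ for any fixed $a_1, \dots, a_n$. In particular,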
y(x:
(x,I f y(x: lx,I 0 such that µf (x E Co(Id): '\1 (X) < 25) > exp(-Ce-d/^3), Ve > 0.
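4.10.7. Corollary. Suppose that
\[ \mathbb{E}|\xi_t - \xi_s|^2 \le |t - s|^\delta, \quad \forall\, t, s \in I^d, \]
where $0 < \delta < 2$. Assume that $0 < \alpha < \delta/2$. Then there exists a constant $C > 0$ such that
\[ \mu^\xi\bigl(x \in C_0(I^d)\colon \|x\|_\alpha \le \varepsilon\bigr) \ge \exp\bigl(-C\varepsilon^{-2d/(\delta - 2\alpha)}\bigr), \quad \forall\, \varepsilon > 0. \]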
Let $\{h_{kl}\}$ be the Haar system, i.e., $h_{00} = 1$ and, for $k \ge 0$, $1 \le l \le 2^k$,
\[ h_{kl}(t) = \begin{cases} 2^{k/2} & \text{if } (l - 1)2^{-k} \le t < (l - 1/2)2^{-k}, \\ -2^{k/2} & \text{if } (l - 1/2)2^{-k} \le t < l\,2^{-k}, \\ 0 & \text{otherwise.} \end{cases} \]
It is known that the Haar system is an orthonormal basis in $L^2[0,1]$. Letting
\[ \varphi_{kl}(t) = \int_0^t h_{kl}(u)\, du, \]
we get the Schauder basis in $C_0[0,1]$, i.e., every function $x \in C_0[0,1]$ is given by the uniformly convergent series
\[ x(t) = x_{00}\varphi_{00}(t) + \sum_{k=0}^\infty \sum_{l=1}^{2^k} x_{kl}\varphi_{kl}(t). \]
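The orthonormality of the first Haar functions and the tent shape of the Schauder functions are easy to verify numerically. The following sketch is only an illustration; the names `haar`, `schauder` and `inner` are introduced here and do not come from the text. The midpoint rule on a dyadic grid integrates products of Haar functions exactly (up to rounding) as long as the grid is finer than the supports involved.

```python
def haar(k, l, t):
    """Haar function h_{kl}(t) for k >= 0, 1 <= l <= 2**k (h_00 = 1 is handled separately)."""
    left, mid, right = (l - 1) / 2**k, (l - 0.5) / 2**k, l / 2**k
    if left <= t < mid:
        return 2.0 ** (k / 2)
    if mid <= t < right:
        return -(2.0 ** (k / 2))
    return 0.0

def schauder(k, l, t):
    """phi_{kl}(t) = integral of h_{kl} over [0, t]: a tent supported on [(l-1)2^-k, l 2^-k]."""
    left, mid, right = (l - 1) / 2**k, (l - 0.5) / 2**k, l / 2**k
    if t <= left or t >= right:
        return 0.0
    if t <= mid:
        return 2.0 ** (k / 2) * (t - left)
    return 2.0 ** (k / 2) * (right - t)

def inner(f, g, m=8):
    """L^2[0,1] inner product by the midpoint rule on 2**m dyadic cells."""
    n = 2**m
    return sum(f((i + 0.5) / n) * g((i + 0.5) / n) for i in range(n)) / n

# h_00 together with h_{kl} for k = 0, ..., 3
system = [lambda t: 1.0] + [
    (lambda t, k=k, l=l: haar(k, l, t)) for k in range(4) for l in range(1, 2**k + 1)
]
# largest deviation of the Gram matrix from the identity matrix
gram_error = max(
    abs(inner(f, g) - (1.0 if i == j else 0.0))
    for i, f in enumerate(system)
    for j, g in enumerate(system)
)
print(gram_error)
```

The peak of $\varphi_{kl}$ equals $2^{-k/2-1}$, which explains the weights $2^{-(1/2-\alpha)k}$ appearing in estimates of Hölder-type norms through the coefficients $x_{kl}$.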
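4.10.8. Theorem. Suppose that $d = 1$, $M > 0$, $0 < \delta < 2$, $0 \le \alpha < \delta/2$ and
\[ \mathbb{E}|\xi_t - \xi_s|^2 \le |t - s|^\delta, \quad \forall\, t, s \in [0,1]. \tag{4.10.1} \]
Let $q$ be a $\mu^\xi$-measurable seminorm on $C[0,1]$ such that
\[ q(x) \le M \sum_{k=0}^\infty 2^{-(1/2-\alpha)k} \sup_{1 \le l \le 2^k} |x_{kl}|, \quad \forall\, x \in C[0,1]. \]
Then there exists a constant $C > 0$ such that
\[ \mu^\xi\bigl(x\colon q(x) \le \varepsilon\bigr) \ge \exp\bigl(-C\varepsilon^{-2/(\delta - 2\alpha)}\bigr), \quad \forall\, \varepsilon > 0. \]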
Proofs are given in [725]. For related results, see [724], [691]. In some cases, the estimates given above are sharp. Namely, as shown in [725], if $d = 1$ and
\[ \mathbb{E}|\xi_t - \xi_s|^2 = r(|t - s|)^2, \]
then under certain additional technical conditions on $r$ and $q$, one has
A' (q(x) < e)
h),F)0,1) is the fractional Ornstein-Uhlenbeck process, i.e. the centered Gaussian process on [0, 1] with the covariance function exp(-It - sI6). Then there exist positive numbers c and c' such that -ce-2/(6-2a) < log 14 (IIxII, < E) < -C E-2/(6-2a)
Chapter 4. Convexity
4.10.10. Example. Let $p > 1$, $\alpha \in (0, 1/2)$, and let $\|\cdot\|_\alpha$ be the Hölder norm of order $\alpha$ on $C[0,1]$ with the Wiener measure $P^W$. Then there exist positive
numbers c i. = 1,... , 4, such that -cle-2 < log Pn
e) < < log P" (IIxII. < e)
0 such that
y(A fB(x,r)) Urn
y(B(x.r))
= 0 y-a.e.,
where B(x, r) is the closed ball of radius r centered at x. In [6241 there is an example of a function f E L1(y)rsuch that liminf1--y (B(x,s)) -0
'
J
f(y)y(dy): xEX,
0<s 5/2. Related problems are discussed in [623), [626], [766].
Problems
4.10.14. Show that the function $f(x, y) = \Phi^{-1}(\Phi(y) - \Phi(x))$ is concave on the half-space $\{x < y\}$ in $\mathbb{R}^2$.
4.10.15. Derive Anderson's inequality from Ehrhard's inequality. 4.10.16. ([486]). Let -y be a Radon Gaussian measure on a locally convex space X, let A be a 'y-measurable convex set of positive measure, and let B be an open convex set
such that $\gamma(A \cap B) = 0$. Show that there exists a continuous linear functional $f$ on $X$ such that
\[ \inf_{x \in B} f(x) \ge \operatorname*{ess\,sup}_{x \in A} f(x), \]
i.e., $\inf_{x \in B} f(x) \ge f(y)$ for a.e. $y \in A$. Hint: take the topological support $S$ of the measure $\gamma|_A$, which is convex by Corollary 4.2.4, and apply the Hahn-Banach separation theorem (see [670, Theorem II.9.1]) to the convex sets $B$ and $S$.
4.10.17. Derive Corollary 4.4.2 from Theorem 4.4.1, using the properties of concave functions and the function 4'.
4.10.18. Let $\xi_t$, $t \in T$, be a centered Gaussian process such that $\sup_{t \in T} |\xi_t| < \infty$ a.e. Prove that the set $\{\xi_t,\ t \in T\}$ is relatively compact in $L^2(P)$. Hint: otherwise there exist $d > 0$ and a sequence $\{\xi_{t_n}\}$ such that $\mathbb{E}|\xi_{t_n} - \eta|^2 \ge d^2$ for all $\eta$ from the linear span of $\xi_{t_i}$, $i < n$; use Lemma 1.8.8 to show that $P(\sup_t |\xi_t| \le M) = 0$ for every $M > 0$.
4.10.19. Let $0 < r < p < \infty$. Show that there is a number $K(r, p)$ with the following property: for any Banach space $X$ and every Radon Gaussian measure $\gamma$ on $X$, one has
\[ \Bigl(\int \|x\|^p\, \gamma(dx)\Bigr)^{1/p} \le K(r, p) \Bigl(\int \|x\|^r\, \gamma(dx)\Bigr)^{1/r}. \]
Hint: use Example 4.5.8 in the case $1 \le r < p < \infty$; in the case $0 < r < p = 1$ apply the Cauchy inequality to the function $\|x\| = \|x\|^{r/2}\|x\|^{1 - r/2}$ and use the result for $p = 2 - r$.
4.10.20. Show that if $\mu$ is a Radon Gaussian measure on a locally convex space $X$ and $f$ is a $\mu$-measurable function such that, for $\mu$-a.e. $x$, the function $t \mapsto f(x + th)$ is continuous for every $h$ from $H(\mu)$, then a median $M(f)$ is unique. Hint: see Proposition 6.7.7 below.
4.10.21. Let $\mu$ be the same measure as in the previous problem and let $F$ be a Borel mapping from $X$ into a normed space $Y$ such that $\|F(x + h) - F(x)\|_Y \le |h|_{H(\mu)}$,
Vh E H(p), for p-a.e. x. Prove that
E V(p) for a < 1/2. Hint: show
that the function f (x) = IIF(x)II, satisfies the condition of Theorem 4.5.7.
4.10.22. Let $\mu$ and $\nu$ be two Gaussian measures on a separable Banach space $X$ such that $\mu(V) = \nu(V)$ for every closed ball $V$ in $X$. Show that $\mu = \nu$. Hint: note that $\mu_1 = \mu * (\mu * \nu)$ and $\nu_1 = \nu * (\mu * \nu)$ also coincide on all balls; use Problem 3.11.47 to show that $H(\mu_1) = H(\nu_1)$ as Hilbert spaces and then apply Corollary 4.7.5 and Corollary 4.7.8 to show that $\mu_1 = \nu_1$. (Note: the result holds true for non-Gaussian measures, but the proof is much more difficult, see [626].)
4.10.23. An alternate way of proving Proposition 4.6.7 is this: letting $(S_t)_{t \ge 0}$ be the semigroup with generator $L\varphi(x) = \Delta\varphi(x)/2 - (Kx, \nabla\varphi(x))$, show that the function $S_t f$ is nondecreasing in every variable; then use the identity
f fgdry -
J fd7 f fd'Y=f f (VStf,Vg)d-ydt. 0
Hint: see [36].
4.10.24. Let $\mu$ be a Borel probability measure on an infinite dimensional separable Hilbert space $X$ with the closed unit ball $U$. Show that $\inf_a \mu(U + a) = 0$, provided $\mu$ has no atom at zero. Hint: fix $\varepsilon > 0$, take an absolutely convex compact set $K$ and $\delta > 0$ such that $\mu(K) > 1 - \varepsilon$ and $\mu(\delta U) < \varepsilon$, then take $v \notin K$ with norm less than $\delta^2/2$; note that there is a closed hyperplane $E$ such that $v + E$ does not intersect $K$ and prove the existence of a unit vector $a$ such that $(U + a) \cap K \subset \delta U$.
4.10.25. Let $\gamma$ be a Radon Gaussian measure and let $f$ be a $\gamma$-integrable convex function. Show that $\int f\, d\gamma \ge M(f)$, where $M(f)$ is the median of $f$. Hint: consider the one dimensional case and use Theorem 4.10.2; another proof is given in [456].
4.10.26. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $q$ be a $\gamma$-measurable seminorm such that $\gamma(q \le \varepsilon) > 0$ for any $\varepsilon > 0$. Show that there exist a full measure Borel linear subspace $X_0 \subset X$ and Borel linear functionals $g_n$ on $X_0$ such that $q = \sup_n |g_n|$ a.e. and note that $\sup_n |g_n|$ is a Borel seminorm on $X_0$. Hint: use Proposition 4.4.3 and Proposition 3.11.1; observe that $a_n = 0$ in (4.4.1).
CHAPTER 5
Sobolev Classes over Gaussian Measures

It is my conviction that it will be possible to prove these existence theorems by means of a general principle... provided also if need be that the notion of a solution shall be suitably extended.
D. Hilbert. Mathematical problems. The general problem of boundary values

5.1. Integration by parts
In this section, we discuss integration by parts formulae for Gaussian measures. Let us start with the differentiation of functions. The various types of differentiability can be described by the following scheme of the differentiation with respect
to a class of sets M. Let X and Y be two locally convex spaces and let M be a certain class of nonempty subsets of X.
5.1.1. Definition. A mapping $F\colon X \to Y$ is said to be differentiable with respect to $\mathcal{M}$ at the point $x$ if there exists a continuous linear mapping from $X$ to $Y$, denoted by $DF(x)$, such that, for every fixed set $M \in \mathcal{M}$, one has uniformly in $h$ from $M$:
\[ \lim_{t\to 0} \frac{F(x + th) - F(x)}{t} = DF(x)h. \]
Taking for $\mathcal{M}$ the collection of all finite sets, we get the Gâteaux differentiability. If $\mathcal{M}$ is the class of all compact subsets, we arrive at the compact differentiability (which, for normed spaces, is called the Hadamard differentiability). Finally, if $X$, $Y$ are normed spaces and $\mathcal{M}$ consists of all bounded sets, then we get the definition of the Fréchet differentiability. In the finite dimensional spaces the Hadamard differentiability is equivalent to that of Fréchet and is stronger than the Gâteaux differentiability. For locally Lipschitzian mappings between normed spaces, the Gâteaux and Hadamard differentiabilities coincide (Problem 5.12.22). In infinite dimensional Banach spaces, the Fréchet differentiability is strictly stronger than the Hadamard differentiability. For example, the function
\[ f\colon L^1[0,1] \to \mathbb{R}^1, \quad f(x) = \int_0^1 \sin x(t)\, dt, \tag{5.1.1} \]
is everywhere Hadamard differentiable, but nowhere Fréchet differentiable. The same is true for the mapping
\[ F\colon L^2[0,1] \to L^2[0,1], \quad F(x)(t) = \sin x(t). \tag{5.1.2} \]
If $E$ is a linear subspace in $X$ equipped with some stronger locally convex topology, then one defines the differentiability along $E$ (in the corresponding sense) at the point $x$ as the differentiability at $h = 0$ of the mapping $h \mapsto F(x + h)$ from
$E$ to $Y$ in the corresponding sense. The derivative along $E$ is denoted by $D_E F$. If $E$ is one dimensional, this gives the usual partial derivative $\partial_h F$, defined by the formula
\[ \partial_h F(x) = \lim_{t\to 0} \frac{F(x + th) - F(x)}{t}. \]
The derivative $D_E F$ is a mapping from $X$ to the space $\mathcal{L}(E, Y)$ of all continuous linear mappings from $E$ to $Y$. Therefore, if we equip $\mathcal{L}(E, Y)$ with some locally convex topology, we can consider the second derivative $D_E^2 F$ and so on. Suppose that $E$ and $Y$ are normed spaces and that $\mathcal{L}(E, Y)$ is equipped with the operator norm. If the Fréchet derivative $D_E F(x)$ exists everywhere, then $D_E F$ is again a mapping with values in a normed space. Thus, the $n$-fold Fréchet derivative $D_E^n F$ can be defined inductively as $D_E(D_E^{n-1}F)$ (or as $D_E^{n-1}(D_E F)$). The mapping $D_E^n F$ can be regarded as taking values in the normed space $\mathcal{L}_n(E, Y)$ of all continuous $n$-linear $Y$-valued mappings $\Psi$ with the norm $\sup\{\|\Psi(h_1, \dots, h_n)\|_Y,\ \|h_i\|_E \le 1\}$.
$n + 1$. Then formula (5.1.3) for $f_n = \theta_n(f)$ and the Lebesgue theorem imply the validity of this formula for $f$. In the case of a locally absolutely continuous bounded function $f$ the claim follows from the classical integration by parts formula for closed intervals, since by the integrability of $\varrho$ there exist two sequences of numbers $a_j \to -\infty$ and $b_j \to +\infty$ such that $|\varrho(a_j)| + |\varrho(b_j)| \to 0$. For functions $f$ that are everywhere differentiable, the proof is somewhat more tedious, since such functions need not be locally absolutely continuous. However, this proof is brought to the end using the integration by parts formula for absolutely continuous functions and the obvious observation that $f$ is absolutely continuous on every closed interval where $\varrho$ has no zeros, since on such intervals the function $f'$ is integrable. $\Box$
5.1.3. Definition. A Radon (possibly, signed) measure $\mu$ on a locally convex space $X$ is called differentiable along a vector $h \in X$ in the sense of Fomin if there exists a function $\beta_h^\mu \in L^1(\mu)$ such that, for all smooth cylindrical functions $f$, the following integration by parts formula holds true:
\[ \int_X \partial_h f(x)\, \mu(dx) = -\int_X f(x)\beta_h^\mu(x)\, \mu(dx). \tag{5.1.4} \]
The function $\beta_h^\mu$ is called the logarithmic derivative of the measure $\mu$ along $h$. The measure $\beta_h^\mu \cdot \mu$ is denoted by $d_h\mu$ and is called the derivative of $\mu$ along $h$. By induction one defines higher derivatives $d_h^n\mu$ and mixed derivatives $d_{h_1} \cdots d_{h_n}\mu$.
5.1.4. Example. A measure $\mu$ on the real line is differentiable along $1$ precisely when it has a locally absolutely continuous density $\varrho$ with respect to Lebesgue measure and $\varrho' \in L^1(\mathbb{R}^1)$. In this case $\beta_1^\mu = \varrho'/\varrho$.
PROOF. If a density $\varrho$ with the aforementioned properties exists, then the Fomin differentiability follows from the classical integration by parts formula. Conversely, if there exists $\beta_1^\mu$, then it is easily verified that the function
\[ \varrho(t) = \int_{-\infty}^t \beta_1^\mu(s)\, \mu(ds) \]
serves as a density for $\mu$, whence the existence of an absolutely continuous density follows. $\Box$
5.1.5. Remark. Originally, Fomin defined the differentiability of $\mu$ along $h$ as follows: for every Borel set $A$, there exists the limit
\[ \lim_{t\to 0} \frac{\mu(A + th) - \mu(A)}{t}. \tag{5.1.5} \]
This definition is equivalent to the one given above and is a special case of the definition introduced earlier by Pitcher [608]. Definition 5.1.3 admits a straightforward generalization to nonconstant vector fields (in particular, for measures on manifolds).
In infinite dimensions, a nonzero measure cannot be differentiable along all vectors. However, Gaussian measures possess large collections of vectors of differentiability.
5.1.6. Proposition. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$. Then $H(\gamma)$ coincides with the collection of all vectors of differentiability. In addition, the measure $\gamma$ is infinitely differentiable along $H(\gamma)$. If $h \in H(\gamma)$, then
\[ \beta_h^\gamma(x) = -\hat h(x). \]
PROOF. Let $f \in \mathcal{FC}^\infty$. By the Cameron-Martin formula, one has
\[ \int \frac{f(x + th) - f(x)}{t}\, \gamma(dx) = \int f(x) \frac{r(t, x) - 1}{t}\, \gamma(dx), \]
where $r(t, x) = \exp\bigl(t\hat h(x) - \tfrac{1}{2}t^2|h|^2_{H(\gamma)}\bigr)$. The left-hand side of this expression tends to $\int \partial_h f\, d\gamma$ as $t \to 0$, whereas the right-hand side tends to $\int \hat h f\, d\gamma$ by the Lebesgue theorem. The infinite differentiability along $h$ follows from the formula above. If $h \notin H(\gamma)$, then the measures $\gamma_{th}$ and $\gamma$ are mutually singular for all $t > 0$, whence one readily deduces that $\gamma$ is not differentiable along $h$ (see, e.g., (5.1.5) or Problem 5.12.24). $\Box$
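In the one dimensional case the proposition reduces to the classical identity $\int f'\, d\gamma = \int x f(x)\, \gamma(dx)$ for the standard Gaussian measure, i.e., $\beta_1^\gamma(x) = -x$. The following minimal sketch checks it by quadrature for the test function $f = \sin$; the helper `gauss_integral` and the choice of $f$ are assumptions made only for this illustration.

```python
import math

def gauss_integral(func, radius=10.0, steps=4000):
    """Composite Simpson approximation of the integral of func against the
    standard Gaussian measure on the line (the tails beyond radius are negligible)."""
    h = 2.0 * radius / steps
    def weighted(x):
        return func(x) * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    total = weighted(-radius) + weighted(radius)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * weighted(-radius + i * h)
    return total * h / 3.0

# f = sin, so f' = cos; beta(x) = -x is the logarithmic derivative along h = 1
lhs = gauss_integral(math.cos)                        # integral of f' with respect to gamma
rhs = -gauss_integral(lambda x: math.sin(x) * (-x))   # minus the integral of f * beta
print(lhs, rhs, math.exp(-0.5))
```

Both sides equal $e^{-1/2}$, the value of $\int \cos x\, \gamma(dx)$.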
5.1.7. Corollary. The mapping $S\colon h \mapsto \gamma_h$ from the Cameron-Martin space $H(\gamma)$ to the Banach space of all Radon measures on $X$ equipped with the variation norm is real-analytic. In addition, $\|S^{(k)}(a)\| \le k^{k/2}$ whenever $|a|_{H(\gamma)} \le 1$. In particular, all functions $\lambda \mapsto \gamma(A + \lambda h)$, $h \in H(\gamma)$, extend to entire functions on the complex plane.
PROOF. The analyticity follows from the estimate
\[ \|d_h^k\gamma\| \le k^{k/2}, \quad |h|_{H(\gamma)} \le 1, \]
which is valid for the total variation of the derivative of any order $k$ along $h$. Indeed, we may assume that $\gamma$ has zero mean. Then $\gamma = \gamma_1 * \cdots * \gamma_k$, where the identical measures $\gamma_i$ coincide with the image of the measure $\gamma$ under the mapping $x \mapsto k^{-1/2}x$. It is easily verified that one has $d_h^k\gamma = d_h\gamma_1 * \cdots * d_h\gamma_k$. It remains to note that $\|d_h\gamma_i\| = \sqrt{k}\,\|d_h\gamma\| \le \sqrt{k}\,\|\hat h\|_{L^2(\gamma)}$, whence $\|d_h^k\gamma\| \le k^{k/2}$.
$\log|f(t)|$ 1. By virtue of the logarithmic Sobolev inequality, we get
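\[ \int_{-\infty}^{+\infty} t^2 f(t)^2 p(t, 0, 1)\, dt \le \int_{-\infty}^{+\infty} \bigl[8f'(t)^2 + t^2 e^{t^2/4}\bigr] p(t, 0, 1)\, dt, \]
whence our claim, since $\int t^2 e^{t^2/4} p(t, 0, 1)\, dt = 2\sqrt{2} < 8$. $\Box$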
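5.1.13. Theorem. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $h, k \in H(\gamma)$, and let $f$ and $g$ be two functions from $L^2(\gamma)$, absolutely continuous on the lines $x + \mathbb{R}^1 h$ and $x + \mathbb{R}^1 k$ for a.e. $x$. Suppose that $\partial_h f, \partial_h g, \partial_k f, \partial_k g \in L^2(\gamma)$. Then $\hat h f, \hat k g \in L^2(\gamma)$ and the following equality is valid:
\[ \int_X \bigl[f(x)\hat h(x) - \partial_h f(x)\bigr] \bigl[g(x)\hat k(x) - \partial_k g(x)\bigr]\, \gamma(dx) = (h, k)_{H(\gamma)} \int_X f(x) g(x)\, \gamma(dx) + \int_X \partial_k f(x)\, \partial_h g(x)\, \gamma(dx). \tag{5.1.8} \]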
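Equality (5.1.8) can be tested numerically in the simplest situation. The sketch below assumes that $\gamma$ is the standard Gaussian measure on the line and $h = k = 1$, so that $\hat h(x) = \hat k(x) = x$ and $(h, k)_{H(\gamma)} = 1$; the test functions $f(x) = \sin x + x^2$ and $g(x) = e^{x/2}$ and the helper `gauss_integral` are chosen only for illustration.

```python
import math

def gauss_integral(func, radius=12.0, steps=6000):
    """Composite Simpson approximation of the integral of func against the
    standard Gaussian measure on the line."""
    h = 2.0 * radius / steps
    def weighted(x):
        return func(x) * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    total = weighted(-radius) + weighted(radius)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * weighted(-radius + i * h)
    return total * h / 3.0

# test functions and their derivatives
f = lambda x: math.sin(x) + x * x
df = lambda x: math.cos(x) + 2.0 * x
g = lambda x: math.exp(0.5 * x)
dg = lambda x: 0.5 * math.exp(0.5 * x)

# left-hand side of (5.1.8) with h = k = 1
lhs = gauss_integral(lambda x: (f(x) * x - df(x)) * (g(x) * x - dg(x)))
# right-hand side: (h, k) * int f g + int f' g'
rhs = gauss_integral(lambda x: f(x) * g(x)) + gauss_integral(lambda x: df(x) * dg(x))
print(lhs, rhs)
```

The two printed numbers should agree up to the quadrature error.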
PROOF. By virtue of the condition and Lemma 5.1.12, both parts of equality (5.1.8) make sense. Similarly to the previous proof, the claim reduces to the two dimensional case (for $e_1$ we take the vector $ch$ if $h \ne 0$, then find a vector $e_2 \perp e_1$ in the plane containing $h$ and $k$, and complement these two vectors to a basis in $H(\gamma)$). Thus, we may assume that $\gamma$ is the standard Gaussian measure $\gamma_2$ on $\mathbb{R}^2$. If $h$ and $k$ are independent, then it follows by the condition that $f$ and $g$ are elements of the Sobolev space $W^{2,1}(\gamma_2)$, and $\partial_h f$, $\partial_k f$, $\partial_h g$, $\partial_k g$ coincide with the Sobolev derivatives. Then (5.1.8) is valid for all $h, k \in \mathbb{R}^2$, provided that the corresponding partial derivatives are understood in the Sobolev sense. Therefore, it suffices to prove (5.1.8) for the basis vectors $h = e_1$ and $k = e_2$. If $f, g \in C_0^\infty(\mathbb{R}^2)$, then (5.1.8) follows from the usual integration by parts formula for the standard Gaussian density $p$ on $\mathbb{R}^2$ taking into account the equality $\partial_{e_1}\partial_{e_2} g = \partial_{e_2}\partial_{e_1} g$. In the general case we find two sequences $\{f_j\}$ and $\{g_j\}$ of compactly supported smooth functions convergent in $W^{2,1}(\gamma_2)$ to $f$ and $g$, respectively. Using Lemma 5.1.12, we get (5.1.8) by passing to the limit in the equalities for $f_j$ and $g_j$. If the vectors $h$ and $k$ are linearly dependent, then we arrive at the one dimensional case, which is considered in a similar manner. Note that if $\partial_k g$ is absolutely continuous along $h$, $\partial_h g$ is absolutely continuous along $k$ and $\partial_h\partial_k g = \partial_k\partial_h g \in L^2(\gamma)$, then (5.1.8) follows by the integration by parts formula. Indeed, the left-hand side can be written as
\begin{align*}
&\int \bigl[\partial_h f\, g\hat k + f\,\partial_h g\,\hat k + fg\,\partial_h\hat k - \partial_h f\,\partial_k g - f\,\partial_h\partial_k g - \partial_h f\, g\hat k + \partial_h f\,\partial_k g\bigr]\, d\gamma \\
&\quad = \int \bigl[f\,\partial_h g\,\hat k + fg\,\partial_h\hat k - f\,\partial_h\partial_k g\bigr]\, d\gamma \\
&\quad = \int \bigl[\partial_k f\,\partial_h g + f\,\partial_k\partial_h g + fg\,\partial_h\hat k - f\,\partial_h\partial_k g\bigr]\, d\gamma,
\end{align*}
which is (5.1.8), since $\partial_h\hat k = (h, k)_{H(\gamma)}$ for a proper linear version of $\hat k$. $\Box$
5.2. The Sobolev classes $W^{p,r}$ and $D^{p,r}$
Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $H = H(\gamma)$. There are three different ways of defining Sobolev spaces over the Gaussian space $(X, \gamma)$: as certain completions, by means of generalized derivatives, and with the aid of integral representations. In this and the next sections, the necessary definitions are introduced; the relations between them are discussed in the subsequent sections. Recall that the space of all Hilbert-Schmidt operators (see Appendix) between Hilbert spaces $H$ and $E$ equipped with the Hilbert-Schmidt norm is denoted by $\mathcal{H}(H, E)$. The symbol $\mathcal{H}_k$ stands for the Hilbert space of all $k$-linear Hilbert-Schmidt forms on a Hilbert space $H$ (see Appendix). This space is naturally isomorphic to the space $\mathcal{H}(H, \mathcal{H}_{k-1})$.

Classes $W^{p,r}$
For all $p \ge 1$ and $r \in \mathbb{N}$, the Sobolev norm $\|\cdot\|_{W^{p,r}}$ is defined by the formula
\[ \|f\|_{W^{p,r}} = \sum_{k=0}^r \Bigl(\int_X \Bigl[\sum_{i_1, \dots, i_k \ge 1} \bigl(\partial_{i_1} \cdots \partial_{i_k} f(x)\bigr)^2\Bigr]^{p/2} \gamma(dx)\Bigr)^{1/p}, \tag{5.2.1} \]
where $\partial_i$ stands for the partial derivative along the vector $e_i$ from an arbitrary orthonormal basis $\{e_n\}$ in $H$. Denote by $W^{p,r}(\gamma)$ the completion of the linear space $\mathcal{FC}^\infty$ with respect to the norm $\|\cdot\|_{W^{p,r}}$. Note that the same norm can be written as
\[ \sum_{k=0}^r \|D_H^k f\|_{L^p(\gamma, \mathcal{H}_k)}. \]
In a similar way one defines the Sobolev spaces $W^{p,r}(\gamma, E)$ of mappings with values in a Hilbert space $E$. The corresponding norms are denoted by the same symbol $\|\cdot\|_{W^{p,r}}$.
Note that if two sequences $\{f_j\}$ and $\{g_j\}$ from $\mathcal{FC}^\infty$ are fundamental in the norm $\|\cdot\|_{W^{p,r}}$, where $r \in \mathbb{N}$, $p \ge 1$, and converge in $L^p(\gamma)$ to $f$, then the sequences $\{D_H^k f_j\}$ and $\{D_H^k g_j\}$ have equal limits (denoted by $D_H^k f$) in $L^p(\gamma, \mathcal{H}_k)$, $k \le r$.
Indeed, suppose first that $p > 1$. For any fixed $h \in H$ and any $l \in X^*$, we get by the integration by parts formula
\[ \int e^{il}\, \partial_h f_j\, d\gamma = -il(h) \int e^{il} f_j\, d\gamma + \int e^{il} f_j \hat h\, d\gamma, \]
which tends to
\[ -il(h) \int e^{il} f\, d\gamma + \int e^{il} f \hat h\, d\gamma. \]
The same is true for $g_j$. Since $l$ was arbitrary, this means that $\partial_h f_j$ and $\partial_h g_j$ have equal limits in $L^p(\gamma)$, whence the equality of the limits of $D_H f_j$ and $D_H g_j$. For higher derivatives and vector-valued mappings the reasoning is similar. In the case $p = 1$ derivatives are also well-defined, but the reasoning above should be modified. Namely, by the same argument for every $\varphi \in C_b^\infty(\mathbb{R}^1)$, we get a common limit of $D_H(\varphi \circ f_j)$ and $D_H(\varphi \circ g_j)$, whence the claim. This observation enables one to define the derivatives $D_H^k f$ (called the Sobolev derivatives) for all $f \in W^{p,r}(\gamma)$ (and similarly for $f \in W^{p,r}(\gamma, E)$, where $E$ is a separable Hilbert space). Finally, let us put
\[ W^\infty(\gamma) = \bigcap_{p > 1,\ r \in \mathbb{N}} W^{p,r}(\gamma), \quad W^\infty(\gamma, E) = \bigcap_{p > 1,\ r \in \mathbb{N}} W^{p,r}(\gamma, E). \]
5.2.1. Remark. (i) The classes $W^{p,n}(\gamma)$ are stable under the compositions with $C_b^\infty$-functions; if $n = 1$, they are stable under the compositions with Lipschitzian functions. Indeed, if a sequence of smooth cylindrical functions $f_j$ converges to $f$ in the space $W^{p,n}(\gamma)$ and $\varphi \in C_b^\infty(\mathbb{R}^1)$, then the sequence $\varphi \circ f_j$ converges to $\varphi \circ f$ in $L^p(\gamma)$ and is fundamental in $W^{p,n}(\gamma)$, which is easily seen by the chain rule.
(ii) Note that, by the density of $X^*$ in $X^*_\gamma$, the classes $W^{p,r}(\gamma)$ do not alter if we replace $X^*$ by $X^*_\gamma$ in the definition of $\mathcal{FC}^\infty$.
5.2.2. Remark. Suppose that f E W'-'(-y). Let {e"} be an orthonormal basis in H and 8, := 8e,. Then by the same reasoning as in Remark 1.3.5 one has k
Jk(f) _
1)
-- f 8i ...8,kkf(x)7(dx)
Characterization via partial derivatives
Let $E$ be a Hilbert space.
5.2.3. Definition. Let $F\colon X \to E$ be a $\gamma$-measurable mapping.
(i) $F$ is said to be ray absolutely continuous if for every $h \in H$, there exists a mapping $F_h\colon X \to E$ such that $F = F_h$ $\gamma$-a.e. and, for every $x \in X$, the mapping $t \mapsto F_h(x + th)$ is absolutely continuous.
(ii) $F$ is called stochastically Gâteaux differentiable if there exists a measurable mapping $D_H F$ from $X$ to $\mathcal{H}(H, E)$ such that for any $h \in H$, the expression
\[ \frac{F(x + th) - F(x)}{t} - D_H F(x)(h) \]
tends to zero in measure $\gamma$ as $t \to 0$. The mapping $D_H F$ is called the stochastic derivative of $F$.
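The $n$-fold derivative $D_H^n F$ is defined inductively.
5.2.4. Definition. Let $1 \le p < \infty$. Denote by $D^{p,1}(\gamma, E)$ the space of all mappings $f \in L^p(\gamma, E)$ such that $f$ is ray absolutely continuous, stochastically Gâteaux differentiable and $D_H f \in L^p(\gamma, \mathcal{H}(H, E))$. Let us equip $D^{p,1}(\gamma, E)$ with the norm
\[ \|f\|_{D^{p,1}} = \|f\|_{L^p(\gamma, E)} + \|D_H f\|_{L^p(\gamma, \mathcal{H}(H, E))}. \]
Then for $n = 2, 3, \dots$ we define the spaces $D^{p,n}(\gamma, E)$ inductively by the following formula:
\[ D^{p,n}(\gamma, E) = \bigl\{f \in D^{p,n-1}(\gamma, E)\colon D_H f \in D^{p,n-1}(\gamma, \mathcal{H}(H, E))\bigr\}. \]
The corresponding norms are given by the equalities
\[ \|f\|_{D^{p,n}} = \|f\|_{L^p(\gamma, E)} + \sum_{k=1}^n \|D_H^k f\|_{L^p(\gamma, \mathcal{H}_k(H, E))}. \]
Let us put $D^{p,n}(\gamma) := D^{p,n}(\gamma, \mathbb{R}^1)$ and
\[ D^\infty(\gamma) = \bigcap_{p > 1,\ r \in \mathbb{N}} D^{p,r}(\gamma), \quad D^\infty(\gamma, E) = \bigcap_{p > 1,\ r \in \mathbb{N}} D^{p,r}(\gamma, E). \]
5.2.5. Remark. Let $f \in D^{p,n}(\gamma)$ and $\varphi \in C_0^\infty(\mathbb{R}^1)$ (or, more generally, $\varphi \in C_b^\infty(\mathbb{R}^1)$). Then it is readily seen from the definition that $\varphi \circ f \in D^{p,n}(\gamma)$. If $n = 1$, then $\varphi \circ f \in D^{p,1}(\gamma)$ for any Lipschitzian function $\varphi$. In addition, $D_H(\varphi \circ f) = \varphi'(f) D_H f$. If $f \in D^\infty(\gamma)$ and $\varphi \in C^\infty(\mathbb{R}^1)$ is such that the derivatives of $\varphi$ have at most polynomial growth at infinity, then $\varphi \circ f \in D^\infty(\gamma)$.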
5.2.6. Lemma. The spaces $\bigl(D^{p,n}(\gamma, E), \|\cdot\|_{D^{p,n}}\bigr)$ are complete.
PROOF. To simplify notation we only consider the space D-'(-r). Suppose that t f j) is a Cauchy sequence with respect to the norm 11 11D) , . Passing to a subsequence we may assume that it converges almost everywhere to some function f E L' (y) and that the sequence DH f j converges to some mapping G E L' (y. H). Moreover, we may assume that I1f,-1 - f,IID= 1; in this case the generalized derivatives DNF, k = 1, ... , n, take values in the spaces LP(-y, CO- If we are in the situation described in item (i) of the previous remark, then the classes GP-" (-y, E) are defined inductively. The natural norm in the class GP."(-y, E) is defined by n
IIfIlca." =E k=0
5.2.10. Remark. We observe that by the symmetry of the usual derivatives of smooth functions, the mappings $D_H^k f$, $k \ge 2$, take values in the space of symmetric $k$-linear forms (or symmetric operators). In other words, $\partial_{h_1} \cdots \partial_{h_k} f$ does not
depend (as a measurable mapping) on the permutations of hl,... , hk, h; E H. p > 1.
It will be shown below that
5.3. The Sobolev classes $H^{p,r}$
In this section, $\gamma$ is a centered Radon Gaussian measure on a locally convex space $X$ and $H = H(\gamma)$ is its Cameron-Martin space. Recall (see Chapter 2) that the Ornstein-Uhlenbeck semigroup $(T_t)_{t \ge 0}$ is defined by the formula
\[ T_t f(x) = \int_X f\bigl(e^{-t}x + \sqrt{1 - e^{-2t}}\, y\bigr)\, \gamma(dy). \]
Let $L$ be the generator of the Ornstein-Uhlenbeck semigroup (called also the Ornstein-Uhlenbeck operator) on $L^2(\gamma)$ (see Appendix and Chapter 2). The proof of Proposition 1.4.5 did not use the finite dimensionality of the space, hence it remains valid in the general case.
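The formula for $T_t$ is easy to test on a Hermite polynomial. The sketch below assumes the standard Gaussian measure on the line and the second Hermite polynomial $x^2 - 1$, which spans the second Wiener chaos in this case; the function names `ou_semigroup` and `hermite2` are introduced only for the illustration. Since $L$ acts as multiplication by $-k$ on the $k$-th chaos, one expects $T_t(x^2 - 1) = e^{-2t}(x^2 - 1)$.

```python
import math

def ou_semigroup(f, t, x, radius=10.0, steps=4000):
    """T_t f(x) = int f(e^{-t} x + sqrt(1 - e^{-2t}) y) gamma(dy) for the standard
    Gaussian measure gamma on the line, computed by the composite Simpson rule."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    h = 2.0 * radius / steps
    def w(y):
        return f(a * x + b * y) * math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    total = w(-radius) + w(radius)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * w(-radius + i * h)
    return total * h / 3.0

hermite2 = lambda x: x * x - 1.0  # second Hermite polynomial (not normalized)

t, x = 0.7, 1.3
# the two printed values should coincide up to the quadrature error
print(ou_semigroup(hermite2, t, x), math.exp(-2.0 * t) * hermite2(x))
```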
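5.3.1. Proposition. The operator $L$ has domain of definition
\[ D(L) = \Bigl\{f\colon \sum_{k=1}^\infty k^2 \|I_k(f)\|^2_{L^2(\gamma)} < \infty\Bigr\}, \]
on which it is given by
\[ Lf = -\sum_{k=1}^\infty k I_k(f). \]
Let $r > 0$. Put
\[ V_r f = \Gamma(r/2)^{-1} \int_0^\infty t^{r/2 - 1} e^{-t} T_t f\, dt, \quad f \in L^p(\gamma), \]
where
\[ \Gamma(\alpha) = \int_0^\infty t^{\alpha - 1} e^{-t}\, dt. \]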
By the same formula we define $V_r$ on $L^p(\gamma, E)$, where $E$ is any separable Hilbert (or Banach) space.
5.3.2. Lemma. For any $p \ge 1$, the mapping $V_r$ is a bounded linear operator on $L^p(\gamma)$ with norm 1. The same is true for $L^p(\gamma, E)$, where $E$ is a separable Banach space.
PROOF. It suffices to apply the estimate $\|T_t f\|_{L^p(\gamma)} \le \|f\|_{L^p(\gamma)}$. $\Box$
Let $f = \sum_{n=0}^\infty I_n(f)$ be the Wiener chaos decomposition. Then
\[ V_r(f) = \Gamma(r/2)^{-1} \sum_{n=0}^\infty \int_0^\infty t^{r/2 - 1} e^{-t} e^{-nt}\, dt\, I_n(f) = \sum_{n=0}^\infty (n + 1)^{-r/2} I_n(f), \]
whence
\[ V_r = (I - L)^{-r/2}. \tag{5.3.1} \]
Note that (5.3.1) can be shown in a different way. Namely, for every nonpositive self-adjoint operator $L$ on a separable Hilbert space $E$ generating the semigroup $(S_t)_{t \ge 0}$ one has
\[ (I - L)^{-\alpha} = \Gamma(\alpha)^{-1} \int_0^\infty t^{\alpha - 1} e^{-t} S_t\, dt, \quad \alpha > 0. \tag{5.3.2} \]
Indeed, by the spectral theorem, it suffices to prove (5.3.2) for the operator $L$ of multiplication by a nonpositive measurable function $\psi$ on the space $L^2(m)$, where $m$ is a probability measure on some measurable space. Then $S_t$ is the multiplication operator by $e^{t\psi}$ and $(I - L)^{-\alpha}$ is the multiplication by $(1 - \psi)^{-\alpha}$. Hence the claim follows from the equality
\[ (1 - y)^{-\alpha} = \Gamma(\alpha)^{-1} \int_0^\infty t^{\alpha - 1} e^{-t} e^{ty}\, dt, \]
valid for every $y \le 0$. Applying this representation to the Ornstein-Uhlenbeck semigroup on $L^2(\gamma)$, we arrive at (5.3.1).
5.3.3. Corollary. For every $p \ge 1$, $(V_r)_{r \ge 0}$ is a strongly continuous semigroup of contractions on $L^p(\gamma)$.
PROOF. If $f$ is a bounded measurable function, then $V_r f = (I - L)^{-r/2} f$ is again a bounded function. Since $(I - L)^{-r/2}(I - L)^{-s/2} = (I - L)^{-r/2 - s/2}$ on $L^2(\gamma)$, the semigroup property follows on every $L^p(\gamma)$. It suffices to verify the continuity of the mapping $r \mapsto V_r f$ for every bounded measurable $f$. Since $|V_r f| \le \sup|f|$, the $L^p$-continuity follows from the $L^2$-continuity. The strong continuity of $(V_r)_{r \ge 0}$ on $L^2(\gamma)$ is clear, since $V_r = (I - L)^{-r/2}$ and the operator $(I - L)^{-1}$ is unitary equivalent to the multiplication by $(1 + \psi)^{-1}$ for some nonnegative function $\psi$ on a suitable space $L^2(m)$. $\Box$
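The scalar identity behind (5.3.1) and (5.3.2) can be confirmed by quadrature. In the sketch below the function `chaos_multiplier` (a name introduced here) computes $\Gamma(r/2)^{-1}\int_0^\infty t^{r/2-1}e^{-t}e^{-nt}\,dt$, i.e., the factor by which $V_r$ multiplies the $n$-th Wiener chaos; the substitution $t = u^2$ is an implementation choice that removes the singularity at zero, and the sketch assumes an integer $r \ge 1$.

```python
import math

def chaos_multiplier(r, n, upper=8.0, steps=4000):
    """Gamma(r/2)^{-1} * int_0^inf t^{r/2-1} e^{-t} e^{-n t} dt for integer r >= 1,
    computed with the substitution t = u^2 and the composite Simpson rule."""
    def g(u):
        # t^{r/2-1} dt becomes 2 u^{r-1} du, and e^{-t} e^{-n t} becomes e^{-(n+1) u^2}
        return 2.0 * u ** (r - 1) * math.exp(-(n + 1) * u * u)
    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0 / math.gamma(r / 2.0)

for r in (1, 2, 3):
    for n in (0, 1, 5):
        # the two columns should agree: quadrature value versus (n + 1)^{-r/2}
        print(r, n, chaos_multiplier(r, n), (n + 1) ** (-r / 2.0))
```

The agreement for several $r$ and $n$ also illustrates the semigroup property, since $(n + 1)^{-r/2}(n + 1)^{-s/2} = (n + 1)^{-(r + s)/2}$.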
The operators $V_r$ can be included in a two-parametric family as follows. For any $\alpha > 0$, put
\[ V_r^{(\alpha)} f = \frac{1}{\Gamma(r/2)} \int_0^\infty t^{r/2 - 1} e^{-\alpha t} T_t f\, dt. \]
The same arguments as above show that $\alpha^{r/2} V_r^{(\alpha)}$ is a contraction on $L^p(\gamma)$ and $\lim_{\alpha\to\infty} \alpha^{r/2} V_r^{(\alpha)} f = f$ in $L^p(\gamma)$ for every $f \in L^p(\gamma)$. In addition, $V_r^{(\alpha)} = (\alpha I - L)^{-r/2}$ on $L^2(\gamma)$. Note the following identity:
\[ V_r^{(\alpha)} = \bigl(I + (\beta - \alpha)V_2^{(\alpha)}\bigr)^{r/2} V_r^{(\beta)}, \quad |\beta - \alpha| < \alpha, \tag{5.3.3} \]
where $\bigl(I + (\beta - \alpha)V_2^{(\alpha)}\bigr)^{r/2}$ is defined by means of the series $\sum_{n=0}^\infty c_n\bigl((\beta - \alpha)V_2^{(\alpha)}\bigr)^n$ corresponding to the expansion $(1 + x)^{r/2} = \sum_{n=0}^\infty c_n x^n$ (the corresponding operator series converges in norm by the estimate $|\beta - \alpha| < \alpha$). It suffices to verify identity (5.3.3) on bounded measurable functions, hence on $L^2(\gamma)$, where it is obvious by the spectral theorem, since $-L$ is unitary equivalent to the multiplication in some $L^2(m)$ by a nonnegative function $\psi$ and $V_r^{(\alpha)}$ is the multiplication by $(\alpha + \psi)^{-r/2}$.
Note that the mappings $V_r$ are injective on $L^p(\gamma)$, $p \ge 1$. In the case $p > 1$ this follows from the symmetry of $T_t$ on $L^2(\gamma)$, which implies that the dual operator to $V_r$ on $L^p(\gamma)$ is $V_r$ on $L^q(\gamma)$, where $1/q + 1/p = 1$. In addition, the image of $V_r$ is dense in $L^q(\gamma)$, since it contains the spaces $\mathcal{X}_k$. This implies that $\operatorname{Ker} V_r = 0$.
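For any $p \ge 1$, the same follows from (5.3.3) with $\beta = 1$ and $\alpha \to \infty$ taking into account that $\alpha^{r/2} V_r^{(\alpha)} f \to f$ in $L^p(\gamma)$ as $\alpha \to \infty$. Therefore, the space
\[ H^{p,r}(\gamma) := V_r(L^p(\gamma)), \quad \|f\|_{H^{p,r}} := \|V_r^{-1} f\|_{L^p(\gamma)}, \]
is complete. Put
\[ H^\infty(\gamma) = \bigcap_{p > 1,\ r > 1} H^{p,r}(\gamma), \quad H^{p,\infty}(\gamma) = \bigcap_{r > 1} H^{p,r}(\gamma). \]
The norm $\|\cdot\|_{H^{p,r}}$ is often denoted by $\|\cdot\|_{p,r}$. In a similar manner one defines the classes $H^{p,r}(\gamma, E)$, $H^\infty(\gamma, E)$, $H^{p,\infty}(\gamma, E)$, where $E$ is any separable Hilbert space. It follows from (5.3.1) that
\[ \|f\|_{H^{2,r}} = \|(I - L)^{r/2} f\|_{L^2(\gamma)}, \quad f \in H^{2,r}(\gamma). \]
5.3.4. Example. Let $f \in \mathcal{X}_n$. Then $f \in H^{p,r}(\gamma)$ for all $p > 1$ and $r > 0$ and
\[ \|f\|_{H^{p,r}} = (n + 1)^{r/2} \|f\|_{L^p(\gamma)}. \]
PROOF. Note that $V_r(f) = (n + 1)^{-r/2} f$. $\Box$
5.3.5. Lemma. One has $H^{p_2,r}(\gamma) \subset H^{p_1,r}(\gamma)$ if $p_2 > p_1$ and $H^{p,r_2}(\gamma) \subset H^{p,r_1}(\gamma)$ if $r_2 > r_1$.
PROOF. The first claim is obvious. The second one follows from the equality $V_{r_2} = V_{r_1} V_{r_2 - r_1}$ and the continuity of the operator $V_{r_2 - r_1}$ on $L^p(\gamma)$. $\Box$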
It will be shown below that $H^{p,n}(\gamma) = W^{p,n}(\gamma)$ for any $p > 1$ and $n \in \mathbb{N}$, hence the classes $H^{p,n}(\gamma)$ are stable under the compositions with $C_b^\infty$-functions (if $n = 1$, they are stable under the compositions with Lipschitzian functions).
5.4. Properties of Sobolev classes and examples
An important feature of the Sobolev classes over Gaussian measures is their invariance with respect to the measurable linear isomorphisms. For this reason, infinite dimensional spaces with completely different geometric properties possess identical collections of functions smooth in the Sobolev sense. Recall that the situation is different for functions differentiable in the Fréchet sense. On many Banach spaces there are no nontrivial Fréchet differentiable functions with bounded supports (hence there are no nonzero Fréchet differentiable functions tending to zero at infinity). To this class belong such spaces as $C[0,1]$ and $l^1$. It is known that, for a separable Banach space $X$, the existence of a nontrivial Fréchet differentiable function $f$ with bounded support is equivalent to the separability of $X^*$ (see [194, Ch. II, Theorem 5.3]). If such a function with the locally Lipschitzian derivative exists on both $X$ and $X^*$, then $X$ is linearly homeomorphic to a Hilbert space (see [194, Ch. V, Corollary 3.6]). Finally, the situation is absolutely hopeless with the continuous functions having compact supports: such functions may be nontrivial only on finite dimensional locally convex spaces. In contrast to what has been said, on general spaces there are many functions from Sobolev classes; in particular, there exist nontrivial Sobolev class functions with compact supports. Throughout this section $\gamma$ is a centered Radon Gaussian measure on a locally convex space $X$.
5.4.1. Proposition.
(i) Any function $f \in W^{p,r}(\gamma)$ has a version $f_0$ with the following property: given linearly independent vectors $h_1,\dots,h_n \in H$, for $\gamma$-a.e. $x$, the function $(t_1,\dots,t_n) \mapsto f_0(x + t_1h_1 + \dots + t_nh_n)$ is in $W^{p,r}_{\rm loc}(\mathbb{R}^n)$ and is locally absolutely continuous in each $t_j$ for a.e. values of the remaining variables. The same is true if $f \in W^{p,r}(\gamma, E)$, where $E$ is a separable Hilbert space.
(ii) Let $f \in W^\infty(\gamma)$ and let $\{e_n\}$ be an orthonormal basis in $H$. Then $f$ has a modification $f_0$ such that $(t_1,\dots,t_n) \mapsto f_0(x + t_1e_1 + \dots + t_ne_n)$ is in $W^\infty_{\rm loc}(\mathbb{R}^n)$ and is infinitely differentiable for all $x \in X$ and $n \in \mathbb{N}$.
PROOF. Let us take a sequence of smooth cylindrical functions $f_j$ such that $f_j \to f$ a.e. and $\|f - f_j\|_{W^{p,r}} < 2^{-j}$. Letting
$$S_j(x) = \sum_{k=0}^{r} [\ldots],$$
one gets the integrable series $\sum_{j=1}^\infty S_j(x)$. Let us put $f_0(x) = \lim_{j\to\infty} f_j(x)$ for all $x$
where this limit exists and zero at all other points. We shall show that $f_0$ is a version of $f$ with the desired properties. Let $X = E \oplus Y$, where $E$ is the linear span of $h_1,\dots,h_n$ and $Y$ is a closed linear subspace. By Corollary 3.10.3, there exist Gaussian conditional measures $\gamma^y$, $y \in Y$, on the affine subspaces $y + E$ such that $\gamma^y$ has a Gaussian density with respect to the natural Lebesgue measure on $E + y$. Let $\nu$ be the image of $\gamma$ under the projection to $Y$. Then, for $\nu$-a.e. $y \in Y$, the series $\sum_{j=1}^\infty S_j(x)$ is integrable with respect to $\gamma^y$. For every such $y$, we have $\lim_{j\to\infty}\|S_j\|_{L^1(\gamma^y)} = 0$, and the function $(t_1,\dots,t_n) \mapsto f_0(y + t_1h_1 + \dots + t_nh_n)$ belongs to the Sobolev class $W^{p,r}$ with respect to the nondegenerate Gaussian measure $\gamma^y$ on $\mathbb{R}^n$ identified with $y + E$, whence the first claim in (i). The second claim follows by a similar reasoning applied to the one dimensional conditional
measures, taking into account that if a sequence of smooth functions on the line converges almost everywhere and is fundamental in $W^{1,1}(\gamma_1)$, then it converges pointwise to a function that is locally absolutely continuous. The vector case is analogous.
In (ii) we may assume that $\gamma$ is the countable product of the standard Gaussian measures on $\mathbb{R}^1$ and that $\{e_n\}$ is the standard basis in $l^2$. Denote by $H_n$ the linear span of $e_1,\dots,e_n$. Let $\{f_j\}$ be a sequence of smooth cylindrical functions convergent to $f$ in every space $W^{p,r}(\gamma)$. Denote by $\Omega$ the set of all points $x$ such that $f(x) = \lim_{j\to\infty} f_j(x)$ and
$$\liminf_{j\to\infty} \int_{\mathbb{R}^n} \|D_H^rf_j(x+u)\|_{\mathcal{H}_r}\,\gamma_n(du) < \infty, \quad \forall\, n, r \in \mathbb{N}.$$
By the condition and Fatou's theorem, we have $\gamma(\Omega) = 1$. There is a full measure subset $\Omega_0 \subset \Omega$ such that, for every $x \in \Omega_0$, the set $(\Omega_0 - x) \cap H_n$ is dense in $H_n$ for every $n$. By the Sobolev embedding theorem, for every $x \in \Omega_0$, the functions $h \mapsto f_j(x + h)$ on $H_n$ converge to $f$ uniformly on every ball, and the same is true for all their derivatives along $H_n$, whence our claim.
5.4.2. Proposition. Let $F \in G^{p,1}(\gamma, E)$, where $E$ is a separable Banach space, and let $h \in H$ be a nonzero vector. Then $F$ has a modification which is locally absolutely continuous on the lines $x + \mathbb{R}^1h$, and the partial derivative of this modification along $h$ exists $\gamma$-a.e. and coincides with $\partial_hF$ (the generalized partial derivative) a.e.
PROOF. Let $Y$ be a closed hyperplane complementing $\mathbb{R}^1h$. With the aid of the conditional measures on the lines $y + \mathbb{R}^1h$, $y \in Y$, it is easily verified that the mapping
$$F_0(x) = F(y) + \int_0^t \partial_hF(y + sh)\,ds, \quad x = y + th,\ y \in Y, \tag{5.4.1}$$
is a modification of $F$ with the desired properties. In particular, the existence of $\partial_hF_0$ a.e. and the equality $\partial_hF_0 = \partial_hF$ a.e. follow by [214, Theorem III.12.8].
5.4.3. Corollary. Let $f \in G^{p,n}(\gamma)$ and $\varphi \in C_b^\infty(\mathbb{R}^1)$ (or, more generally, $\varphi \in C_b^n(\mathbb{R}^1)$). Then $\varphi\circ f \in G^{p,n}(\gamma)$. If $n = 1$, then $\varphi\circ f \in G^{p,1}(\gamma)$ for any Lipschitzian function $\varphi$. In addition, $D_H(\varphi\circ f) = \varphi'(f)D_Hf$.
PROOF. We shall consider only the case $n = 1$, because the general claim is deduced from this case by induction using the reasoning below. According to Proposition 5.4.2, for every $h \in H$, the function $f$ has a modification that is locally absolutely continuous along $h$. For this modification, the function $\varphi\circ f$ is also locally absolutely continuous along $h$ and has almost everywhere the partial derivative $\varphi'(f)\partial_hf$, where $\partial_hf$ is the generalized partial derivative of $f$. The integration by parts formula implies that $\varphi'(f)\partial_hf$ is the generalized partial derivative of $\varphi(f)$. Since $\varphi$ is Lipschitzian, the mapping $\varphi'(f)D_Hf$ can be taken for $D_H(\varphi\circ f)$.
5.4.4. Lemma. Suppose that $f_n \in W^{p,r}(\gamma, E)$, where $p > 1$ and $E$ is a separable Hilbert space. If $f_n \to f$ a.e. and $\sup_n\|f_n\|_{W^{p,r}(\gamma,E)} < \infty$, then $f \in W^{p,r}(\gamma, E)$. The same is true for the class $H^{p,r}(\gamma, E)$.
PROOF. Using that $L^p$, $p > 1$, is uniformly convex (see [195, Ch. 3]), it is readily verified that the space $L^p(\gamma, E)$ is uniformly convex as well. Hence $L^p(\gamma, E)$ has the Banach-Saks property (see [195, p. 78, Theorem 1]), i.e., every bounded sequence $\{\varphi_n\}$ in this space has a subsequence $\{\varphi_{i_n}\}$ such that the sequence $n^{-1}(\varphi_{i_1} + \dots + \varphi_{i_n})$ converges in $L^p(\gamma, E)$. This property applied to the sequences $\{D_H^kf_n\}$, $k = 0,1,\dots,r$, yields the claim. In the case of $H^{p,r}(\gamma, E)$ we write $f_n = V_rg_n$ and apply the Banach-Saks property to the sequence $\{g_n\}$ that is bounded in $L^p(\gamma, E)$.
5.4.5. Proposition. Let $\{e_n\}$ be an orthonormal basis in $H$ and let $E^{\mathcal{A}_n}$ be the conditional expectation with respect to the $\sigma$-field $\mathcal{A}_n$ generated by $\hat e_1,\dots,\hat e_n$. If $f$ belongs to one of the classes $W^{p,r}(\gamma)$, $p \ge 1$, or $G^{p,r}(\gamma)$, $p > 1$, then $E^{\mathcal{A}_n}f$ does also, and $\{E^{\mathcal{A}_n}f\}$ converges to $f$ with respect to the corresponding Sobolev norm. If $r \ge 2$, then $LE^{\mathcal{A}_n}f = E^{\mathcal{A}_n}Lf$. Finally, one has (5.4.2)
The same is true for mappings with values in a separable Hilbert space.
PROOF. We shall deal with the version of $E^{\mathcal{A}_n}f$ given by (3.5.5). Clearly, (5.4.2) is true for smooth cylindrical $f$. Hence it extends to $f \in G^{p,r}(\gamma)$, by using that for any function $\varphi$ that depends on $f_1,\dots,f_n \in X^*$, one has $\partial_hE^{\mathcal{A}_n}\varphi = \partial_{Ph}E^{\mathcal{A}_n}\varphi$, where $Ph$ is the projection of $h$ to the linear span of $R_\gamma f_i$, $i \le n$. Then Corollary 3.5.2 yields the claim for $G^{p,r}(\gamma)$, $p > 1$. In a similar manner it is proved for $W^{p,r}(\gamma)$, $p \ge 1$. It remains to note that
$$T_tE^{\mathcal{A}_n}f = E^{\mathcal{A}_n}T_tf, \quad \forall\, f \in L^p(\gamma), \tag{5.4.3}$$
which is equivalent to the equality
$$\int_X\!\int_X f\bigl(e^{-t}P_nx + \sqrt{1-e^{-2t}}\,P_nz + S_ny\bigr)\,\gamma(dy)\,\gamma(dz) = \int_X\!\int_X f\bigl(e^{-t}P_nx + e^{-t}S_nz + \sqrt{1-e^{-2t}}\,y\bigr)\,\gamma(dy)\,\gamma(dz),$$
where $P_nx = \hat e_1(x)e_1 + \dots + \hat e_n(x)e_n$ and $S_n = I - P_n$. The foregoing equality holds true, since the images of $\gamma\otimes\gamma$ under the mappings $(y,z) \mapsto \sqrt{1-e^{-2t}}\,P_nz + S_ny$ and $(y,z) \mapsto \sqrt{1-e^{-2t}}\,y + e^{-t}S_nz$ coincide. Indeed, both images are centered Gaussian measures, and an easy calculation shows that the variance of any $\ell = \sum_{j=1}^\infty c_j\hat e_j$ is
$$\sum_{j=1}^\infty c_j^2 - e^{-2t}\sum_{j=1}^n c_j^2$$
with respect to both measures. By (5.4.3) we get $LE^{\mathcal{A}_n}f = E^{\mathcal{A}_n}Lf$ (if $r \ge 2$) and $V_rE^{\mathcal{A}_n}f = E^{\mathcal{A}_n}V_rf$, $\forall\, f \in L^p(\gamma)$, which yields the claim for $H^{p,r}(\gamma)$. The vector case is analogous.
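The variance computation behind (5.4.3) can be checked numerically in a finite dimensional truncation. The following is a minimal sketch, assuming that $\gamma$ is replaced by the standard Gaussian measure on $\mathbb{R}^N$ and $\{e_j\}$ by the standard basis; the function names are illustrative and do not come from the text.

```python
import math

def variance_first_map(c, n, t):
    """Variance of l = sum_j c_j e_j under the image of gamma x gamma by
    (y, z) -> sqrt(1 - e^{-2t}) P_n z + S_n y (truncated to len(c) coordinates)."""
    a = 1.0 - math.exp(-2.0 * t)
    head = sum(cj * cj for cj in c[:n])  # coordinates j <= n, fed by P_n z
    tail = sum(cj * cj for cj in c[n:])  # coordinates j > n, fed by S_n y
    return a * head + tail

def variance_second_map(c, n, t):
    """Same variance under (y, z) -> sqrt(1 - e^{-2t}) y + e^{-t} S_n z."""
    a = 1.0 - math.exp(-2.0 * t)
    total = sum(cj * cj for cj in c)     # contribution of y on all coordinates
    tail = sum(cj * cj for cj in c[n:])  # contribution of S_n z
    return a * total + math.exp(-2.0 * t) * tail

def variance_closed_form(c, n, t):
    """The value sum_j c_j^2 - e^{-2t} sum_{j<=n} c_j^2 stated in the proof."""
    total = sum(cj * cj for cj in c)
    head = sum(cj * cj for cj in c[:n])
    return total - math.exp(-2.0 * t) * head

# Hypothetical coefficients c_j of the functional l; any finite list works.
c = [0.7, -1.2, 0.4, 2.0, -0.3]
for n in range(len(c) + 1):
    print(n,
          variance_first_map(c, n, 0.35),
          variance_second_map(c, n, 0.35),
          variance_closed_form(c, n, 0.35))
```

Since both image measures are centered Gaussian, agreement of these variances for all coefficient vectors is exactly what identifies the two measures in the truncated setting.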
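5.4.6. Proposition. Let $E$ be a separable Hilbert space. Then:
(i) $D^{p,n}(\gamma, E) \subset G^{p,n}(\gamma, E)$ and $W^{p,n}(\gamma, E) \subset G^{p,n}(\gamma, E)$ for any $p \in [1,\infty)$ and $n \in \mathbb{N}$;
(ii) The derivatives in the sense of $D^{p,n}(\gamma, E)$ (respectively, $W^{p,n}(\gamma, E)$) serve as the generalized derivatives in the sense of $G^{p,n}(\gamma, E)$;
(iii) $D^{p,n}(\gamma, E) = W^{p,n}(\gamma, E)$ for all $p \in [1,\infty)$ and $n \in \mathbb{N}$.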
PROOF. The inclusion $D^{p,n}(\gamma, E) \subset G^{p,n}(\gamma, E)$, which makes sense if $E$ is Hilbert, follows from the integration by parts formula (5.1.6). In addition, for any $f \in D^{p,1}(\gamma, E)$ and $h \in H$, the derivative $\partial_hf$ in the sense of $D^{p,1}(\gamma, E)$ serves as the generalized derivative in the sense of $G^{p,1}(\gamma, E)$. If $f \in W^{p,1}(\gamma, E)$, then there is a sequence of smooth cylindrical mappings $f_n$ convergent to $f$ in $W^{p,1}(\gamma, E)$. This enables one to pass to the limit in the equality
$$\int_X \partial_h\varphi\, f_n\,d\gamma = -\int_X \varphi\,\partial_hf_n\,d\gamma + \int_X \varphi\,\hat h\,f_n\,d\gamma$$
for smooth cylindrical functions $\varphi$. Hence $f \in G^{p,1}(\gamma, E)$ and $\partial_hf$ is its generalized derivative. By induction, we get the claim for all $n \in \mathbb{N}$. Note that we can use an analogue of (5.4.1) to define versions of $D_H^{n-1}F$, since $D_H^{n-1}F$ takes values in the separable space of Hilbert-Schmidt mappings, hence the corresponding integral exists in the usual sense (as a Bochner integral). The inclusion $W^{p,1}(\gamma, E) \subset D^{p,1}(\gamma, E)$ follows by Proposition 5.4.1. By induction, one gets $W^{p,n}(\gamma, E) \subset D^{p,n}(\gamma, E)$ for any $n \in \mathbb{N}$.
Let us show the inverse inclusion. To this end, note that every $F \in D^{p,r}(\gamma, E)$ is approximated by $\{P_nF\}$, where $P_n$ is the orthogonal projection to the linear span of the first $n$ elements of any fixed orthonormal basis in $E$. Hence the verification reduces to the case where $E = \mathbb{R}^1$. If $p > 1$, then $F \in G^{p,r}(\gamma)$. By Proposition 5.4.5, the functions $E^{\mathcal{A}_n}F$ converge to $F$ in $G^{p,r}(\gamma)$. Therefore, our claim reduces to the finite dimensional case, in which it follows by Proposition 1.5.2. If $p = 1$, then we may assume that $F$ is bounded, for it is approximated by $\varphi_j\circ F$, where $\{\varphi_j\}$ is a sequence of smooth functions with uniformly bounded derivatives of any order such that $\varphi_j(t) = t$ if $t \in [-j, j]$ and $\varphi_j(t) = j\,\mathrm{sgn}\,t$ if $t \notin [-j-1, j+1]$. By the same reasoning as in Proposition 5.4.5, it is proved that the functions $E^{\mathcal{A}_n}F$ have generalized derivatives $D_H^kE^{\mathcal{A}_n}F$, $k = 0,\dots,r$, that converge to the respective derivatives of $F$ in $L^p(\gamma, \mathcal{H}_k)$. Therefore, our claim reduces again to the finite dimensional case.
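5.4.7. Corollary. Let $E$ be a separable Hilbert space. Then
$$D^{p,n}(\gamma, E) = \bigl\{F \in G^{p,n}(\gamma, E)\colon\ D_H^kF \in L^p(\gamma, \mathcal{H}_k),\ k \le n\bigr\}.$$
In particular, $D^{p,1}(\gamma) = G^{p,1}(\gamma)$ for any $p > 1$.
The same reasoning as in Example 2.9.3 proves the following useful property of $(T_t)_{t\ge0}$.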
5.4.8. Proposition. Let $E$ be a separable Hilbert space and let $f \in L^p(\gamma, E)$, where $p > 1$. Then, for any $t > 0$, one has:
(i) For a.e. $x$, the mapping $h \mapsto T_tf(x + h)$, $H \to E$, is infinitely Fréchet differentiable. In addition, for every $h \in H$ and $\gamma$-a.e. $x$, one has
$$\partial_hT_tf(x) = \frac{e^{-t}}{\sqrt{1-e^{-2t}}}\int_X f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\hat h(y)\,\gamma(dy). \tag{5.4.4}$$
(ii) If $p \ge 2$, the derivatives $D_H^r(T_tf)$ are Hilbert-Schmidt mappings, and, moreover, the mappings $h \mapsto D_H^r(T_tf)(x + h)$, $H \to \mathcal{H}_r(H, E)$, are continuous with respect to the Hilbert-Schmidt norms on $\mathcal{H}_r(H, E)$.
(iii) If $f \in \bigcap_{p>1} L^p(\gamma, E)$, then $T_tf$ belongs to the classes $D^\infty(\gamma, E)$, $H^\infty(\gamma, E)$, and $W^\infty(\gamma, E)$.
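Formula (5.4.4) can be tested numerically in the simplest situation. The sketch below assumes $X = H = \mathbb{R}^1$ with the standard Gaussian measure and $h = 1$, so that $\hat h(y) = y$; the quadrature routine `gauss_expect` and the other names are illustrative helpers, not notation from the text.

```python
import math

def gauss_expect(g, half_width=10.0, steps=4000):
    """Approximate the integral of g against the standard Gaussian measure on R
    by the trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

def ou_semigroup(f, t, x):
    """Mehler formula: T_t f(x) = E f(e^{-t} x + sqrt(1 - e^{-2t}) y)."""
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    return gauss_expect(lambda y: f(math.exp(-t) * x + s * y))

def ou_derivative(f, t, x):
    """Right-hand side of (5.4.4) in dimension one with h = 1."""
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    return math.exp(-t) / s * gauss_expect(
        lambda y: f(math.exp(-t) * x + s * y) * y)

t, x = 0.5, 0.3
formula = ou_derivative(math.sin, t, x)
# For f = sin one has T_t f(x) = sin(e^{-t} x) exp(-(1 - e^{-2t}) / 2),
# so the derivative in x is available in closed form.
exact = (math.exp(-t) * math.cos(math.exp(-t) * x)
         * math.exp(-0.5 * (1.0 - math.exp(-2.0 * t))))
eps = 1e-5
finite_diff = (ou_semigroup(math.sin, t, x + eps)
               - ou_semigroup(math.sin, t, x - eps)) / (2.0 * eps)
print(formula, exact, finite_diff)
```

The three printed numbers should agree up to quadrature and finite difference error; note that the right-hand side of (5.4.4) never differentiates $f$ itself.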
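PROOF. (i) Let $h \in H$. By Fubini's theorem, for $\gamma$-a.e. $x$, the mapping
$$g_x\colon\ y \mapsto f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)$$
is in $L^p(\gamma, E)$. Denote the set of all such points by $\Omega$. For any $x \in \Omega$, the Cameron-Martin formula yields
$$T_tf(x + \lambda h) = \int_X f\Bigl(e^{-t}x + \sqrt{1-e^{-2t}}\bigl[y + \lambda e^{-t}(1-e^{-2t})^{-1/2}h\bigr]\Bigr)\gamma(dy) = \int_X f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\Lambda(t, \lambda h, y)\,\gamma(dy),$$
where
$$\Lambda(t, \lambda h, y) = \exp\Bigl\{\frac{\lambda e^{-t}}{\sqrt{1-e^{-2t}}}\,\hat h(y) - \frac{\lambda^2e^{-2t}}{2(1-e^{-2t})}\,|h|_H^2\Bigr\}.$$
Differentiating in $\lambda$ at zero (which is possible, since $g_x \in L^p(\gamma)$ and $\lambda \mapsto \Lambda(t, \lambda h, \cdot\,)$ is differentiable as a mapping with values in $L^q(\gamma)$, $q = p(p-1)^{-1}$), we arrive at (5.4.4). The infinite Fréchet differentiability of $T_tf$ along $H$ follows from the infinite Fréchet differentiability of the mapping $h \mapsto \Lambda(t, h, \cdot\,)$, $H \to L^q(\gamma)$. The latter is readily verified by use of the equality
$$\partial_{h_1}\cdots\partial_{h_n}\Lambda(t, h, x) = Q\Bigl(\frac{e^{-t}}{\sqrt{1-e^{-2t}}}\,\hat h_1(x), \dots, \frac{e^{-t}}{\sqrt{1-e^{-2t}}}\,\hat h_n(x)\Bigr)\Lambda(t, h, x),$$
where $Q$ is a polynomial on $\mathbb{R}^n$ whose coefficients are polynomials in the quantities $e^{-2t}(1-e^{-2t})^{-1}(h, h_j)_H$ and $e^{-2t}(1-e^{-2t})^{-1}(h_i, h_j)_H$. Therefore, the $L^q$-norm of the derivative $\partial_{h_1}\cdots\partial_{h_n}\Lambda(t, h, x)$ is uniformly bounded in $h_1,\dots,h_n$ from the unit ball of $H$.
(ii) Now let $p \ge 2$. By equation (5.4.4), we have for a.e. $x$
$$\sum_{n=1}^\infty |\partial_{e_n}T_tf(x)|_E^2 = \frac{e^{-2t}}{1-e^{-2t}}\sum_{n=1}^\infty \Bigl|\int_X f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\hat e_n(y)\,\gamma(dy)\Bigr|_E^2,$$
where $\{e_n\}$ is an orthonormal basis in $H$. Therefore, by Bessel's inequality, we have
$$\sum_{n=1}^\infty |\partial_{e_n}T_tf(x)|_E^2 \le \frac{e^{-2t}}{1-e^{-2t}}\int_X \bigl|f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\bigr|_E^2\,\gamma(dy).$$
Hence $D_HT_tf(x) \in \mathcal{H}(H, E)$ for a.e. $x$ and
$$\|D_HT_tf\|_{L^p(\gamma,\mathcal{H}(H,E))} \le \frac{e^{-t}}{\sqrt{1-e^{-2t}}}\,\|f\|_{L^p(\gamma,E)}.$$
[...] $p > 1$ and $n \in \mathbb{N}$. By Lemma 5.4.4, we get the claim.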
5.4.9. Example. Let $f$ be a bounded Borel mapping on $X$ with values in a separable Hilbert space $E$. Put
$$F(x) = \int_X f(x + y)\,\gamma(dy).$$
Then $F \in D^\infty(\gamma, E)$.
PROOF. As shown above, the function $F$ is infinitely differentiable along $H$ and
$$\partial_hF(x) = \int_X f(x + y)\hat h(y)\,\gamma(dy), \quad h \in H.$$
Let $\{h_n\}$ be an orthonormal basis in $H$. Since $\{\hat h_n\}$ is an orthonormal sequence in $L^2(\gamma)$, by virtue of Bessel's inequality, we get from the equality above
$$\sum_{n=1}^\infty |\partial_{h_n}F(x)|_E^2 \le \int_X |f(x + y)|_E^2\,\gamma(dy) \le \sup|f|_E^2.$$
Hence $F \in D^{p,1}(\gamma, E)$ for all $p > 1$ and $\|D_HF(x)\|_{\mathcal{H}(H,E)} \le \sup|f|_E$, whence the claim follows by induction.
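The smoothing effect in Example 5.4.9 is visible already for a discontinuous $f$ on the real line. The following sketch assumes $X = H = \mathbb{R}^1$ with the standard Gaussian measure, $h = 1$ and $f = \mathrm{sgn}$, in which case $F(x) = \mathrm{erf}(x/\sqrt2)$ and $\partial_hF(x) = \sqrt{2/\pi}\,e^{-x^2/2}$; the helper names are hypothetical.

```python
import math

def gauss_expect(g, half_width=10.0, steps=20000):
    """Trapezoidal approximation of the integral of g against the standard
    Gaussian measure on R (a fine grid is used because g may jump)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

def sign(u):
    """A bounded Borel function that is not differentiable at zero."""
    return 1.0 if u >= 0.0 else -1.0

def smoothed(f, x):
    """F(x) = integral of f(x + y) gamma(dy)."""
    return gauss_expect(lambda y: f(x + y))

def smoothed_derivative(f, x):
    """d_h F(x) = integral of f(x + y) y gamma(dy) for h = 1."""
    return gauss_expect(lambda y: f(x + y) * y)

x = 0.3
F_value = smoothed(sign, x)
dF_value = smoothed_derivative(sign, x)
F_exact = math.erf(x / math.sqrt(2.0))
dF_exact = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)
print(F_value, F_exact)
print(dF_value, dF_exact)
```

In this one dimensional case the Bessel bound of the proof reduces to $|\partial_hF(x)| \le \sup|f| = 1$, which the computed derivative respects.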
5.4.10. Example. (i) Let $f$ be a $\gamma$-measurable function. Suppose that there exists a number $C$ such that one has for a.e. $x$
$$|f(x + h) - f(x)| \le C|h|_H, \quad \forall\, h \in H.$$
Then $f \in D^{p,1}(\gamma)$ for any $p > 1$. Recall that functions of this type are called $H$-Lipschitzian.
(ii) Let $B \in \mathcal{B}(X)$. Put
$$d_B(x) := \inf\{|h|_H\colon\ x + h \in B,\ h \in H\},$$
and $d_B(x) = 0$ if $(x + H) \cap B = \emptyset$. Then $d_B$ is an $H$-Lipschitzian function with $C = 1$ and $d_B \in D^{p,1}(\gamma)$ for any $p > 1$.
PROOF. (i) We shall deal with a version of $f$ that is Lipschitzian along $H$ with constant $C$ (see Lemma 4.5.2). As we already know, $f \in L^p(\gamma)$ for all $p > 1$ (see Theorem 4.5.7). It remains to note that $\partial_hf$ exists a.e. and $|\partial_hf| \le C|h|_H$.
(ii) [...] For any $t > 0$, the set $\{d_B < t\}$ can be written as $(B + tV) \cup A$, where $V$ is the open centered unit ball in $H$ and $A = X\setminus(B + H)$. It was proved in Chapter 3 that there exists a linear space $E$ of full measure obtained as the union of a sequence of metrizable compact sets $K_n$. Since $H \subset E$, the set $(B + tV) \cup A$
up to a set of measure zero coincides with $(B_E + tV) \cup A_E$, where $B_E = B \cap E$, $A_E = A \cap E = E\setminus(B_E + H)$. It remains to use the fact that the sets $B_E$, $tV$, and $H$ are Souslin, hence so are the sets $B_E + H$ and $B_E + tV$, whence the measurability of $A_E$ and $(B_E + tV) \cup A_E$.
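5.4.11. Example. Let $E$ be a separable Hilbert space, $G \in D^{2,1}(\gamma, E)$ and let $\varphi$ be a bounded $H$-Lipschitzian function. Then $\varphi G \in D^{2,1}(\gamma, E)$ and
$$D_H(\varphi G) = D_H\varphi\otimes G + \varphi D_HG,$$
where, given $u \in H$, $v \in E$, the symbol $u\otimes v$ stands for the operator $h \mapsto (u, h)_Hv$. In particular,
$$\|D_H(\varphi G)(x)\|_{\mathcal{H}(H,E)} \le |D_H\varphi(x)|_H|G(x)|_E + |\varphi(x)|\,\|D_HG(x)\|_{\mathcal{H}(H,E)} \quad \text{a.e.}$$
PROOF. Let $h \in H$. Then the equality $\partial_h(\varphi G) = \partial_h\varphi\,G + \varphi\,\partial_hG$ follows from the previous example. It remains to note that $\|u\otimes v\|_{\mathcal{H}(H,E)} = |u|_H|v|_E$ and that $D_H\varphi\otimes G \in L^2(\gamma, \mathcal{H}(H, E))$.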
5.4.12. Proposition. Let $K \subset X$ be a compact set and let $U \supset K$ be an open set. Then there exists a function $f \in H^\infty(\gamma)$ such that $f$ is infinitely Fréchet differentiable along $H$, $0 \le f \le 1$, $f = 1$ on $K$, $f = 0$ outside some closed totally bounded set $S \subset U$ ($S$ is compact if $X$ is complete), and $\sup|D_Hf|_H < \infty$.
PROOF. We construct first a function $g \in H^\infty(\gamma)$ such that $0 \le g \le 1$, $g \ge 2/3$ on $K$ and $g \le 3/5$ outside some closed totally bounded set $S \subset U$. To this end, let us denote by $V$ the closed absolutely convex hull of an arbitrary compact set $Q \supset K$ with measure greater than $1/2$. We shall deal further with the linear span $E$ of the set $V$, setting all the functions zero outside $E$ (note that $\gamma(E) = 1$ by the zero-one law). Clearly, $K \subset E$. Since $K$ is compact, there exists an absolutely convex neighborhood of zero $W$ in $X$ such that $K + W \subset U$. In addition, there exists $r \in (0,1)$ such that $2rV \subset W$. Put
$$\psi(x) = \inf\{p_V(x - y),\ y \in K\}, \quad x \in E.$$
The function $\psi$ is the distance to the set $K$ with respect to the norm $p_V$. Since the sets $K + cV$ are closed by the compactness of $K$, we get that $\psi$ is a Borel function. Let
$$\varphi(x) = 1 - r^{-1}\theta(\psi(x)),$$
where $\theta(t) = t$ if $t \in [0, r]$ and $\theta(t) = r$ if $t > r$. Then $\varphi(x) = 1$ whenever $x \in K$, and $\varphi(x) = 0$ for all $x$ outside $K + rV$. In addition, $0 \le \varphi \le 1$. Let us pick $m \in \mathbb{N}$, for which $\gamma(mV) > 8/9$. Finally, put $g(x) = T_t\varphi(x)$, where $(T_t)_{t\ge0}$ is the Ornstein-Uhlenbeck semigroup and $t > 0$ is chosen in such a way that $1 - e^{-t} < r/8$ and $\sqrt{1-e^{-2t}}\,m < r/8$. With this choice, for any $x \in K$, one has $e^{-t}x \in K + rV/8$ and, for any $y \in mV$, one has $\sqrt{1-e^{-2t}}\,y \in rV/8$, whence $e^{-t}x + \sqrt{1-e^{-2t}}\,y \in K + rV/4$, which yields
$$\varphi\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr) \ge 3/4.$$
Therefore, the estimate $\gamma(mV) > 8/9$ implies $g(x) \ge 2/3$. In a similar manner, for any $x \in E\setminus(K + rV)$ and $y \in mV$, we get $e^{-t}x \notin K + 5rV/7$, since $e^t - 1 < r/7$, whence $e^{-t}x + \sqrt{1-e^{-2t}}\,y \notin K + (5/7 - 1/8)rV$ due to our choice of $t$. Then
$$\varphi\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr) \le \frac27 + \frac18.$$
[...] The function thus constructed is Lipschitzian along $H$, since so is the function $p_V$, which is a measurable seminorm, and all the subsequent functions arise as the results of compositions with Lipschitzian functions and $T_t$. By Proposition 5.4.8, $f$ is infinitely Fréchet differentiable along $H$.
5.4.13. Remark. As follows from the proof, if the compact set $K$ is metrizable, then the support of $f$ can be made metrizable as well; for any sequentially complete space, the support of $f$ can be made compact metrizable. If $K$ is an absolutely convex compact set of positive measure, then our construction enables one to find a function $f \in H^\infty(\gamma)$ such that $0 \le f \le 1$, $f = 1$ on $K$ and $f = 0$ outside $2K$. Note also that this construction yields the existence of $H^1(\gamma)$-partitions of unity on $X$ (cf. [300]).
5.4.14. Corollary. Let $Z \subset X$ be a closed set, $W$ a neighborhood of zero in $X$ and $U = Z + W$. Then there exists a function $f \in H^1(\gamma)$ such that $0 \le f \le 1$, $f = 1$ on $Z$ and $f = 0$ outside $U$.
PROOF. We may assume that $X$ is complete, passing to its completion and then restricting the function we constructed to the initial space. Let us take an absolutely convex compact set $V$ of positive measure. Let $Z_n = Z \cap \{n - 1 \le p_V \le n\}$. The sets $Z_n$ are compact, and one can take for them the functions $f_n$ constructed in the proof of Proposition 5.4.12. Put $f = \sum_{n=1}^\infty f_n$. Problem 5.12.37 suggests verifying that this series defines a function with the desired properties.
Another interesting example arises in the theory of stochastic differential equations. Let $A\colon \mathbb{R}^d \to \mathcal{L}(\mathbb{R}^d)$ and $B\colon \mathbb{R}^d \to \mathbb{R}^d$ be mappings of the class $C_b^\infty$ and let $\xi_t$ be the solution of the stochastic differential equation
$$d\xi_t = A(\xi_t)\,dW_t + B(\xi_t)\,dt,$$
where $\{W_t\}$ is a standard Wiener process in $\mathbb{R}^d$. Denote by $\mu^\xi$ the measure generated by $\xi_t$ on the path space $X = C([0,1], \mathbb{R}^d)$. It is known (see [361, Ch. IV] or [504, Ch. 4]) that there exists a Borel mapping $F\colon X \to X$ such that $\mu^\xi = \gamma\circ F^{-1}$, where $\gamma$ is the Wiener measure on $X$ (the distribution of the process $\{W_t\}$).
5.4.15. Example. Under the foregoing assumptions, the mappings $\delta_t\circ F$, $t \in [0,1]$, turn out to be elements of the class $W^\infty(\gamma, \mathbb{R}^d)$. In particular, choosing $(X, \gamma)$ for a probability space for $(W_t)_{t\ge0}$, we get that, for any fixed $t_0$, the mapping $\omega \mapsto \xi_{t_0}(\omega)$ belongs to $W^\infty(\gamma, \mathbb{R}^d)$.
The proof of this claim can be found in [361].
5.4.16. Example. Let $F \in G^{p,1}(\gamma)$ be such that $D_HF = 0$. Then $F$ coincides with some constant a.e. The same is true for the class $W^{p,1}(\gamma)$.
PROOF. It suffices to note that, for a.e. $x$ and any $h \in H$, the function $t \mapsto F(x + th)$ coincides a.e. with some constant.
5.4.17. Example. Let $A$ be a set such that $I_A \in G^{p,1}(\gamma)$. Then we have either $\gamma(A) = 0$ or $\gamma(A) = 1$.
PROOF. Let $\varphi$ be a Lipschitzian function with $\varphi(t) = t^2$ on $[0,1]$. Since $I_A^2 = I_A$, Corollary 5.4.3 yields the equality $2I_AD_HI_A = D_HI_A$, whence $D_HI_A = 0$ a.e., and it remains to apply Example 5.4.16.
Finally, note that the classes $G^{p,1}(\gamma, H)$ are larger than $D^{p,1}(\gamma, H)$ (see also Problem 5.12.28).
5.4.18. Example. Let $\gamma$ be the product of the standard Gaussian measures on the real line and let $F\colon \mathbb{R}^\infty \to H = l^2$ be given by $F(x) = \bigl(2^{-n}f(2^nx_n)\bigr)_{n=1}^\infty$, where the function $f$ on $\mathbb{R}^1$ is defined as follows: $f(t) = 1 - |t|$ if $|t| \le 1$ and $f$ is extended periodically to all of $\mathbb{R}^1$. Then
$$|F(x + h) - F(x)|_H \le |h|_H, \quad \forall\, x \in \mathbb{R}^\infty,\ \forall\, h \in H,$$
and $F \in G^{2,1}(\gamma, H)$, but $F \notin D^{2,1}(\gamma, H)$.
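The Lipschitz estimate in Example 5.4.18 can be explored numerically. The sketch below assumes a truncation to the first $N = 20$ coordinates and random test points; `tent`, `F` and the sampling scheme are illustrative choices, not part of the text.

```python
import math
import random

def tent(t):
    """f(t) = 1 - |t| on [-1, 1], extended to R with period 2."""
    return 1.0 - abs((t + 1.0) % 2.0 - 1.0)

def F(x):
    """First len(x) coordinates of F(x) = (2^{-n} f(2^n x_n)), n = 1, 2, ..."""
    return [2.0 ** -(n + 1) * tent(2.0 ** (n + 1) * xn)
            for n, xn in enumerate(x)]

def norm(v):
    """Euclidean norm, standing in for the l^2 norm after truncation."""
    return math.sqrt(sum(a * a for a in v))

rng = random.Random(0)
N = 20
worst = 0.0
for _ in range(1000):
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]
    h = [rng.gauss(0.0, 1.0) / (k + 1) for k in range(N)]
    shifted = F([xi + hi for xi, hi in zip(x, h)])
    diff = [a - b for a, b in zip(shifted, F(x))]
    worst = max(worst, norm(diff) / norm(h))
print(worst)  # largest observed ratio |F(x+h) - F(x)| / |h|
```

Each coordinate $2^{-n}f(2^n\,\cdot\,)$ is 1-Lipschitzian, so the observed ratio cannot exceed one up to rounding. The failure of the Hilbert-Schmidt property comes from the fact that the coordinate derivatives have absolute value one almost everywhere, which the truncation does not display.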
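PROOF. We have
$$|F(x + h) - F(x)|_H^2 = \sum_{n=1}^\infty 2^{-2n}\bigl|f(2^nx_n + 2^nh_n) - f(2^nx_n)\bigr|^2 \le \sum_{n=1}^\infty h_n^2.$$
[...]
[...] $(T_t)_{t\ge0}$ is hypercontractive: whenever $p > 1$, $q > 1$, one has
$$\|T_tf\|_q \le \|f\|_p$$
for all $t > 0$ such that $e^{2t} \ge (q-1)/(p-1)$.
5.5.4. Corollary. Let $p > 2$. Then the operator $I_n\colon f \mapsto I_n(f)$ from $L^2(\gamma)$ to $L^p(\gamma)$ is continuous and
$$\|I_n(f)\|_p \le (p-1)^{n/2}\|f\|_2. \tag{5.5.5}$$
In addition, for every $p \in (1,\infty)$, the operators $I_n$ are continuous on $L^p(\gamma)$ and
$$\|I_n\|_{\mathcal{L}(L^p(\gamma))} \le (M-1)^{n/2}, \tag{5.5.6}$$
where $M = \max\bigl(p, p(p-1)^{-1}\bigr)$.
PROOF. The same reasoning as in the finite dimensional case applies.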
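Estimate (5.5.5) can be illustrated on the real line, where the $n$-th chaos is spanned by the Hermite polynomial of degree $n$. The following sketch assumes the standard Gaussian measure on $\mathbb{R}^1$ and the probabilists' Hermite polynomials $He_n$, for which $\|He_n\|_2 = \sqrt{n!}$; all function names are illustrative.

```python
import math

def gauss_expect(g, half_width=12.0, steps=6000):
    """Trapezoidal approximation of the integral of g against the standard
    Gaussian measure on R."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x), computed by the recurrence
    He_{k+1}(x) = x He_k(x) - k He_{k-1}(x)."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, x * cur - k * prev
    return cur

def lp_norm(g, p):
    """L^p norm of g with respect to the standard Gaussian measure."""
    return gauss_expect(lambda y: abs(g(y)) ** p) ** (1.0 / p)

rows = []
for n in range(1, 5):
    for p in (3.0, 4.0, 6.0):
        lhs = lp_norm(lambda y, n=n: hermite(n, y), p)
        rhs = (p - 1.0) ** (n / 2.0) * math.sqrt(math.factorial(n))
        rows.append((n, p, lhs, rhs))
        print(n, p, lhs, rhs)
```

In every row the third entry, $\|He_n\|_p$, should not exceed the fourth, $(p-1)^{n/2}\|He_n\|_2$.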
5.5.5. Corollary. Let $f \in \mathcal{X}_n$ and $p > 2$. Then, for any $r \in \mathbb{N}$, one has
$$\|f\|_{H^{2,r}} \le \|f\|_{H^{p,r}} \le (p-1)^{n/2}\|f\|_{H^{2,r}} = (p-1)^{n/2}(n+1)^{r/2}\|f\|_{L^2(\gamma)}. \tag{5.5.7}$$
In particular, all norms $\|\cdot\|_{H^{p,r}}$ are equivalent to the $L^2$-norm on $\mathcal{X}_n$.
PROOF. Inequality (5.5.7) follows from Example 5.3.4. This inequality implies the equivalence of the aforementioned norms.
5.5.6. Remark. Note that the Ornstein-Uhlenbeck semigroup $(T_t)_{t\ge0}$ is hypercontractive on the spaces $L^p(\gamma, E)$ of mappings with values in a separable Hilbert (or Banach) space $E$. Indeed, if $q$ and $p$ are related as above, one has for any $F \in L^p(\gamma, E)$:
$$\|T_tF\|_{L^q(\gamma,E)} \le \bigl\|T_t|F|_E\bigr\|_{L^q(\gamma)} \le \bigl\||F|_E\bigr\|_{L^p(\gamma)} = \|F\|_{L^p(\gamma,E)}.$$
Therefore, the corollaries of the hypercontractivity established above are valid for vector-valued mappings. If $E$ is a separable Hilbert space, then (5.5.2) yields
$$\int_X \Bigl|f - \int_X f\,d\gamma\Bigr|_E^2\,d\gamma \le \int_X \|D_Hf\|_{\mathcal{H}(H,E)}^2\,d\gamma, \quad f \in W^{2,1}(\gamma, E).$$
Below we return to vector Poincaré inequalities.
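A quick numerical illustration of the Poincaré inequality stated above may be helpful. The sketch assumes the scalar one dimensional case, i.e., the standard Gaussian measure on $\mathbb{R}^1$ and $E = \mathbb{R}^1$, with derivatives supplied by hand; the helper names are hypothetical.

```python
import math

def gauss_expect(g, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of g against the standard
    Gaussian measure on R."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

def poincare_sides(f, df):
    """Return (variance of f, integral of |f'|^2) under the Gaussian measure."""
    mean = gauss_expect(f)
    variance = gauss_expect(lambda y: (f(y) - mean) ** 2)
    energy = gauss_expect(lambda y: df(y) ** 2)
    return variance, energy

examples = {
    "sin": (math.sin, math.cos),
    "identity": (lambda y: y, lambda y: 1.0),          # equality case
    "cube": (lambda y: y ** 3, lambda y: 3.0 * y * y),
}
results = {name: poincare_sides(f, df) for name, (f, df) in examples.items()}
for name, (variance, energy) in results.items():
    print(name, variance, energy)
```

For the identity function both sides equal one, which shows that the constant in the inequality cannot be improved.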
5.5.7. Corollary. Let $f \in \mathcal{X}_d$. For any $\alpha \in (0, d/(2e))$, there holds the inequality
$$\gamma\bigl(x\colon\ |f(x)| > t\|f\|_2\bigr) \le c(\alpha, d)\exp(-\alpha t^{2/d}),$$
where
$$c(\alpha, d) = \exp\alpha + \frac{d}{d - 2e\alpha}.$$
PROOF. We may assume that $\|f\|_2 = 1$. By the estimate $\|f\|_p \le (p-1)^{d/2}$, we get an estimate of the integral $\int_X \exp\bigl(\alpha|f|^{2/d}\bigr)\,d\gamma$ [...]
5.5.11. Theorem. Suppose that $\gamma$ is the same measure as in Theorem 5.5.1, $p \ge 1$ and $E$ is a separable Hilbert space. Let $f \in W^{p,1}(\gamma, E)$ be such that
$$N_p := \int_X \|D_Hf\|_{\mathcal{H}(H,E)}^p\,d\gamma < \infty.$$
Then
$$\int_X \Bigl|f(x) - \int_X f\,d\gamma\Bigr|_E^p\,\gamma(dx) \le (\pi/2)^pM_pN_p, \tag{5.5.8}$$
where $M_p = \int |t|^p\,p(0,1,t)\,dt$.
PROOF. The desired estimates follow from the finite dimensional Corollary 1.7.3 (see the proof of Theorem 4.5.7).
5.6. Multipliers and Meyer's inequalities
As above, $\gamma$ stands for a centered Radon Gaussian measure on a locally convex space $X$ and $H = H(\gamma)$ is its Cameron-Martin space. Note that the operators $T_t$ and $L$ are special cases of operators of the form
$$\Psi_\varphi f = \sum_{n=0}^\infty \varphi(n)I_n(f). \tag{5.6.1}$$
We shall find conditions under which $\Psi_\varphi$ is bounded on $L^p(\gamma)$.
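To make the form (5.6.1) concrete, one can realize such operators on polynomials of one variable. The sketch below assumes the standard Gaussian measure on $\mathbb{R}^1$, where $I_n(f) = c_nHe_n$ with $c_n = \mathbb{E}[f\,He_n]/n!$; it compares the multiplier $\varphi(n) = e^{-tn}$ with the Mehler formula for $T_t$ and the multiplier $\varphi(n) = -n$ with $Lf = f'' - xf'$. All names are illustrative.

```python
import math

def gauss_expect(g, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of g against the standard
    Gaussian measure on R."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x)."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, x * cur - k * prev
    return cur

def chaos_coefficients(f, degree):
    """Coefficients c_n = E[f He_n] / n! for n = 0, ..., degree."""
    return [gauss_expect(lambda y, n=n: f(y) * hermite(n, y)) / math.factorial(n)
            for n in range(degree + 1)]

def apply_multiplier(phi, coeffs, x):
    """Evaluate sum_n phi(n) c_n He_n(x), the one dimensional form of (5.6.1)."""
    return sum(phi(n) * c * hermite(n, x) for n, c in enumerate(coeffs))

f = lambda y: y ** 3
coeffs = chaos_coefficients(f, 3)
t, x = 0.4, 1.1
s = math.sqrt(1.0 - math.exp(-2.0 * t))
via_multiplier = apply_multiplier(lambda n: math.exp(-t * n), coeffs, x)
via_mehler = gauss_expect(lambda y: f(math.exp(-t) * x + s * y))
generator = apply_multiplier(lambda n: -float(n), coeffs, x)
direct = 6.0 * x - x * (3.0 * x * x)   # f'' - x f' for f(y) = y^3
print(via_multiplier, via_mehler)
print(generator, direct)
```

The expansion is exact here because $f$ is a polynomial of degree three; for general $f$ a truncated expansion would only approximate the operator.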
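5.6.1. Lemma. For every $p > 1$ and $N \in \mathbb{N}$, the following inequality holds true for every $f \in L^p(\gamma)$:
$$\|T_t(f - I_0(f) - \dots - I_{N-1}(f))\|_{L^p(\gamma)} \le K(N, p)e^{-Nt}\|f\|_{L^p(\gamma)}.$$
[...] $p > 2$. Let $s$ be such that $p = e^{2s} + 1$. If $t > s$, we have by the hypercontractivity inequality
$$\|T_sT_{t-s}(f - I_0(f) - \dots - I_{N-1}(f))\|_{L^p(\gamma)}^2 \le \|T_{t-s}(f - I_0(f) - \dots - I_{N-1}(f))\|_{L^2(\gamma)}^2 = \Bigl\|\sum_{k=N}^\infty e^{-k(t-s)}I_k(f)\Bigr\|_{L^2(\gamma)}^2 = \sum_{k=N}^\infty e^{-2k(t-s)}\|I_k(f)\|_{L^2(\gamma)}^2 \le e^{-2N(t-s)}\|f\|_{L^2(\gamma)}^2.$$
[...]
[...] Then the operator $\Psi_\varphi$ defined by equality (5.6.1) is bounded on $L^p(\gamma)$ for all $p \in (1,\infty)$.
PROOF. Put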
$$\Psi_\varphi = \sum_{k=0}^{N-1}\varphi(k)I_k + \sum_{k=N}^\infty \varphi(k)I_k = S + R.$$
We know that $S$ is bounded on $L^p(\gamma)$ by the continuity of every $I_k$. Set
$$A_N(f) = \int_0^\infty T_t\bigl[f - I_0(f) - \dots - I_{N-1}(f)\bigr]\,dt = \sum_{k=N}^\infty k^{-1}I_k(f).$$
This operator is well-defined at least for smooth cylindrical $f$. By Lemma 5.6.1, we have
$$\|A_N(f)\|_{L^p(\gamma)} \le K(N, p)N^{-1}\|f\|_{L^p(\gamma)}.$$
Note that for every $n \in \mathbb{N}$, one has
$$A_N^n(f) = \sum_{k=N}^\infty k^{-n}I_k(f) = \int_{(0,\infty)^n} T_{t_1+\dots+t_n}\bigl[f - I_0(f) - \dots - I_{N-1}(f)\bigr]\,dt_1\cdots dt_n.$$
Therefore,
$$\|A_N^n(f)\|_{L^p(\gamma)} \le K(N, p)\|f\|_{L^p(\gamma)}\int_{(0,\infty)^n} e^{-N(t_1+\dots+t_n)}\,dt_1\cdots dt_n = K(N, p)N^{-n}\|f\|_{L^p(\gamma)}.$$
Using the identity
$$R(f) = \sum_{k=N}^\infty \Bigl(\sum_{n=0}^\infty a_nk^{-n}\Bigr)I_k(f) = \sum_{n=0}^\infty a_nA_N^n(f),$$
we arrive at the inequality
$$\|R(f)\|_{L^p(\gamma)} \le K(N, p)\sum_{n=0}^\infty |a_n|N^{-n}\|f\|_{L^p(\gamma)},$$
whence the desired conclusion.
Note the following commutation relationship between $D_H$ and $\Psi_\varphi$:
$$D_H\Psi_\varphi F = \Psi_\theta D_HF, \tag{5.6.2}$$
where $\theta(n) = \varphi(n + 1)$. This identity is readily verified for the elements of $\mathcal{X}_k$ and then extends to all elements of $W^{2,1}(\gamma)$. The same is true for the mappings with values in any separable Hilbert space $E$. In particular,
$$D_H(I - L)^{-1/2}F = (2I - L)^{-1/2}D_HF.$$
More generally, for any integer $k$ and any $F \in W^{2,r}(\gamma)$ with $r \ge |k|$, one has
$$D_H(I - L)^{k/2}F = (2I - L)^{k/2}D_HF. \tag{5.6.3}$$
This identity holds true also for vector-valued mappings $F \in W^{2,r}(\gamma, E)$, where $E$ is a separable Hilbert space. For the proof it suffices to verify (5.6.3) for Hermite polynomials.
Our next aim is to establish the so called Meyer's equivalence of different norms on Gaussian Sobolev classes. Recall that the Hilbert transform of a function $f \in C_0^\infty(\mathbb{R}^1)$ is defined by
$$\mathcal{H}f(x) = \int_{-\infty}^{\infty} \frac{f(x + t) - f(x - t)}{t}\,dt.$$
It is known (see [844, Ch. XVI, p. 256, Theorem (3.8)]) that $\mathcal{H}$ is a continuous linear operator on every $L^p(\mathbb{R}^1)$, $p > 1$. By analogy, one defines the Hilbert transform for functions on $\mathbb{R}^1$ with values in a separable Hilbert space $E$. Then it is continuous on $L^p(\mathbb{R}^1, E)$ (see [121], where this result is proved even for a wider class of Banach spaces; the Hilbert space case makes no essential difference with the scalar case).
5.6.3. Lemma. For every $p \in (1,\infty)$, there exists a constant $N_p$ such that
$$\|D_H(I - L)^{-1/2}g\|_{L^p(\gamma,H)} \le N_p\|g\|_{L^p(\gamma)} \tag{5.6.4}$$
for every smooth cylindrical function $g$. Moreover, for any smooth cylindrical mapping $G$ with values in a separable Hilbert space $E$ one has
$$\|D_H(I - L)^{-1/2}G\|_{L^p(\gamma,\mathcal{H}(H,E))} \le N_p\|G\|_{L^p(\gamma,E)}. \tag{5.6.5}$$
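PROOF. We shall start with the scalar case. For any $(x, y) \in X\times X$ and $\theta \in \mathbb{R}^1$, we write
$$R_\theta(x, y) = (x_\theta, y_\theta), \quad x_\theta := x\cos\theta + y\sin\theta, \quad y_\theta := -x\sin\theta + y\cos\theta.$$
By virtue of formula (5.4.4), for any $h \in H$ and any bounded $g$, one has
$$\partial_hT_tg(x) = \frac{e^{-t}}{\sqrt{1-e^{-2t}}}\int_X g\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\hat h(y)\,\gamma(dy).$$
Therefore, using (5.3.1) and making the substitution $\cos\theta = e^{-t}$, we get
$$\partial_h(I - L)^{-1/2}g(x) = \frac{1}{\sqrt\pi}\int_0^\infty t^{-1/2}e^{-t}\partial_h(T_tg)(x)\,dt = \frac{1}{\sqrt\pi}\int_0^{\pi/2}\!\int_X \cos\theta\,|\log\cos\theta|^{-1/2}\,\hat h(y)g(x_\theta)\,\gamma(dy)\,d\theta.$$
Since $\gamma$ is symmetric, we have
$$\int_X \hat h(y)g(x_\theta)\,\gamma(dy) = -\int_X \hat h(y)g(x_{-\theta})\,\gamma(dy).$$
Thus, we arrive at the following representation:
$$\partial_h(I - L)^{-1/2}g(x) = \frac{1}{\sqrt\pi}\int_0^{\pi/2}\!\int_X K(\theta)\hat h(y)\bigl[g(x_\theta) - g(x_{-\theta})\bigr]\,\gamma(dy)\,d\theta,$$
where $K(\theta) = \frac12\cos\theta\,|\log\cos\theta|^{-1/2}$. Note that $K_0(\theta) := K(\theta) - (\sqrt2\,\theta)^{-1}$ is a bounded function on $(0, \pi/2)$. This is readily verified taking into account that the function $|\log z|^{-1/2} - (1 - z)^{-1/2}$ is bounded on $(0,1)$. Therefore, the operator
$$Tf(s) = \int_0^{\pi/2} K(\theta)\bigl[f(s - \theta) - f(s + \theta)\bigr]\,d\theta,$$
defined on smooth functions $f$, extends to a continuous operator from $L^p[-\pi, \pi]$ to $L^p[0, \pi/2]$ with norm $c_p$. Indeed, if we put $f = 0$ outside $[-\pi, \pi]$, then $T - \mathcal{H}/\sqrt2$ is bounded, since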
x/2
(fT-4)f(s)= f vl'2-Ko(0)[f(a-0)-f(s+0)]de+ f f(s-0) a f(s+0) de s/2
0
which gives a bounded operator on $L^p[-\pi, \pi]$. Let us apply this result to the function $f(s) = g(x_s)$, where $x$ and $y$ are regarded as fixed parameters and $g$ is a smooth cylindrical function. This yields the following estimate:
$$\int_0^{\pi/2} |Tf(s)|^p\,ds \le c_p^p\int_{-\pi}^{\pi} |f(s)|^p\,ds. \tag{5.6.6}$$
Put
$$S(x, y) := \int_0^{\pi/2} K(\theta)\bigl[g(x_\theta) - g(x_{-\theta})\bigr]\,d\theta.$$
Since $\gamma\otimes\gamma$ is invariant under the transformations $R_s$, we have
$$\int_X\!\int_X |S(x, y)|^p\,\gamma(dx)\,\gamma(dy) = \frac2\pi\int_0^{\pi/2}\!\int_X\!\int_X |S(x, y)|^p\,\gamma(dx)\,\gamma(dy)\,ds = \frac2\pi\int_X\!\int_X\!\int_0^{\pi/2} |S(R_s(x, y))|^p\,ds\,\gamma(dx)\,\gamma(dy). \tag{5.6.7}$$
Using the relation
$$S(R_s(x, y)) = \int_0^{\pi/2} K(\theta)\bigl[g(x_{s+\theta}) - g(x_{s-\theta})\bigr]\,d\theta,$$
which follows from the identity $R_s\circ R_\theta = R_{s+\theta}$, and estimate (5.6.6), we see that the right-hand side of (5.6.7) is estimated by
$$\frac2\pi\,c_p^p\int_X\!\int_X\!\int_{-\pi}^{\pi} |g(x_\theta)|^p\,d\theta\,\gamma(dx)\,\gamma(dy) = 4c_p^p\int_X |g(x)|^p\,\gamma(dx).$$
Using the notation introduced above, for any $h \in H$, one has
$$\partial_h(I - L)^{-1/2}g(x) = \frac{1}{\sqrt\pi}\int_X S(x, y)\hat h(y)\,\gamma(dy). \tag{5.6.8}$$
Let us take an orthonormal basis $\{h_n\}$ in $H$. For every fixed $x$, we denote by $F(x, \cdot\,)$ the orthogonal projection of the function $y \mapsto S(x, y)$ to the space $\mathcal{X}_1$, i.e., $F(x, y) = I_1(S(x, \cdot\,))(y)$. Since $\{\hat h_n\}$ is an orthonormal basis in $\mathcal{X}_1$, we obtain from equality (5.6.8):
$$|D_H(I - L)^{-1/2}g(x)|_H^2 = \frac1\pi\sum_{n=1}^\infty \Bigl(\int_X S(x, y)\hat h_n(y)\,\gamma(dy)\Bigr)^2 = \frac1\pi\sum_{n=1}^\infty \Bigl(\int_X F(x, y)\hat h_n(y)\,\gamma(dy)\Bigr)^2 = \frac1\pi\int_X F(x, y)^2\,\gamma(dy).$$
Using that $F(x, \cdot\,)$ is a centered Gaussian random variable and applying estimate (1.6.5) to the function $F(x, \cdot\,) \in \mathcal{X}_1$ for every fixed $x$, we get $\|F(x, \cdot\,)\|_{L^2(\gamma)} \le K_p\|F(x, \cdot\,)\|_{L^p(\gamma)}$ [...]
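5.7. Equivalence of definitions

[...] $p > 1$ and $r \in \mathbb{N}$. There exist positive constants $m_{p,r}$ and $M_{p,r}$ such that, for any smooth cylindrical function $f$, one has
$$m_{p,r}\|D_H^rf\|_{L^p(\gamma,\mathcal{H}_r)} \le \|(I - L)^{r/2}f\|_{L^p(\gamma)} \le M_{p,r}\bigl[\|D_H^rf\|_{L^p(\gamma,\mathcal{H}_r)} + \|f\|_{L^p(\gamma)}\bigr]. \tag{5.7.1}$$
Analogous estimates hold true for $E$-valued mappings, where $E$ is a separable Hilbert space.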
PROOF. According to (5.6.2), for any $F \in W^{2,2}(\gamma, E)$, where $E$ is a separable Hilbert space, one has
$$D_H^2(I - L)^{-1}F = D_H(2I - L)^{-1/2}D_H(I - L)^{-1/2}F = D_H(I - L)^{-1/2}\Psi_\varphi D_H(I - L)^{-1/2}F,$$
where $\varphi(n) = \sqrt{1 + n}/\sqrt{2 + n}$. Therefore
$$\|D_H^2(I - L)^{-1}F\|_{L^p(\gamma,\mathcal{H}_2(H,E))} \le C^2K\|F\|_{L^p(\gamma,E)},$$
where $C$ is the norm of $D_H(I - L)^{-1/2}$ and $K$ is the norm of $\Psi_\varphi$. Hence,
$$\|D_H^2G\|_{L^p(\gamma,\mathcal{H}_2(H,E))} \le C^2K\|(I - L)G\|_{L^p(\gamma,E)}$$
for all smooth cylindrical $E$-valued mappings $G$. Continuing in this way, we arrive at the estimates
$$\|D_H^kG\|_{L^p(\gamma,\mathcal{H}_k(H,E))} \le C(p, k)\|(I - L)^{k/2}G\|_{L^p(\gamma,E)}, \quad k \in \mathbb{N}.$$
On the other hand, by (5.6.2) and Corollary 5.6.4, one has
$$\|(I - L)G\|_{L^p(\gamma,E)} = \|(I - L)^{1/2}(I - L)^{1/2}G\|_{L^p(\gamma,E)} \le \lambda_p\|(I - L)^{1/2}G\|_{L^p(\gamma,E)} + \lambda_p\|D_H(I - L)^{1/2}G\|_{L^p(\gamma,\mathcal{H}(H,E))}$$
$$= \lambda_p\|(I - L)^{1/2}G\|_{L^p(\gamma,E)} + \lambda_p\|(2I - L)^{1/2}D_HG\|_{L^p(\gamma,\mathcal{H}(H,E))}$$
$$= \lambda_p\|(I - L)^{1/2}G\|_{L^p(\gamma,E)} + \lambda_p\|(2I - L)^{1/2}(I - L)^{-1/2}(I - L)^{1/2}D_HG\|_{L^p(\gamma,\mathcal{H}(H,E))}$$
$$\le \lambda_p^2\|G\|_{L^p(\gamma,E)} + (\lambda_p^2 + \lambda_p^2M)\|D_HG\|_{L^p(\gamma,\mathcal{H}(H,E))} + \lambda_p^2M\|D_H^2G\|_{L^p(\gamma,\mathcal{H}_2(H,E))},$$
where $M$ is the norm of the operator $\Psi_\psi$ with $\psi(n) = \sqrt{n + 2}/\sqrt{n + 1}$. Continuing in this way and writing $(I - L)^{k/2}$ as $(I - L)^{1/2}(I - L)^{(k-1)/2}$, we estimate the norm of $(I - L)^{k/2}G$ via the norms of $D_H^jG$, $j \le k$. Finally, the norms of $D_H^jG$, $1 \le j \le k - 1$, can be estimated via the norms of $G$ and $D_H^kG$ by virtue of the Poincaré inequality. Indeed, let us suppose first that $k = 2$ and that $G_n$ are scalar smooth cylindrical functions uniformly bounded in $L^p$ together with the second $H$-derivatives. Denote by $v_n$ the integrals of $D_HG_n$. By the Poincaré inequality, the sequence $D_HG_n - v_n$ is bounded in $L^p(\gamma, H)$. Put $F_n = G_n - \hat v_n$. Then $D_HF_n = D_HG_n - v_n$. In addition, the sequence
$$\int_X F_n\,d\gamma = \int_X G_n\,d\gamma$$
is bounded. Therefore, by the Poincaré inequality, the sequence $\{F_n\}$ is bounded in $L^p(\gamma)$, whence the boundedness of $\{\hat v_n\}$ in $L^p(\gamma)$. Since the $\hat v_n$'s are centered Gaussian random variables, they are uniformly bounded in $L^2(\gamma)$, which implies the boundedness of $\{v_n\}$ in $H$. This shows that $\{D_HG_n\}$ is bounded in $L^p(\gamma, H)$. The same reasoning applies to smooth cylindrical mappings $G_n$ with values in any separable Hilbert space $E$. In this case $v_n \in \mathcal{H}(H, E)$ and, according to Problem 4.10.19, the quantities $\|\hat v_n\|_{L^p(\gamma,E)}$ are estimated via $C_p\|\hat v_n\|_{L^2(\gamma,E)}$, hence via $C_p\|v_n\|_{\mathcal{H}(H,E)}$. Moreover, it is easy to see from our reasoning that for every $\varepsilon > 0$, there exists $C(\varepsilon)$ such that
$$\|D_HG\|_{L^p(\gamma,\mathcal{H}(H,E))} \le \varepsilon\|G\|_{L^p(\gamma,E)} + C(\varepsilon)\|D_H^2G\|_{L^p(\gamma,\mathcal{H}_2(H,E))} \tag{5.7.2}$$
for every smooth cylindrical $E$-valued mapping $G$. Now, by induction, we arrive at the estimates
$$\|D_H^jG\|_{L^p(\gamma,\mathcal{H}_j(H,E))} \le \varepsilon\|G\|_{L^p(\gamma,E)} + C_k(\varepsilon)\|D_H^kG\|_{L^p(\gamma,\mathcal{H}_k(H,E))}, \tag{5.7.3}$$
where $j = 1,\dots,k - 1$, which bring the proof to the end.
5.7.2. Theorem. Let $E$ be a separable Hilbert space and let $p > 1$, $r \in \mathbb{N}$. Then the classes $W^{p,r}(\gamma, E)$, $H^{p,r}(\gamma, E)$, and $D^{p,r}(\gamma, E)$ coincide and their norms are equivalent (in addition, the corresponding notions of derivative coincide).
PROOF. Meyer's equivalence together with the density of cylindrical mappings in $L^p(\gamma, E)$ yields the equality of the classes $W^{p,r}(\gamma, E)$ and $H^{p,r}(\gamma, E)$ and the equivalence of their norms. It remains to apply Proposition 5.4.6.
We observe that estimates (5.7.3) extend to the mappings $G \in W^{p,k}(\gamma, E)$.
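5.7.3. Corollary. The classes $W^\infty(\gamma)$, $H^\infty(\gamma)$, $D^\infty(\gamma)$ coincide. If $E$ is a separable Hilbert space, then the classes $W^\infty(\gamma, E)$, $H^\infty(\gamma, E)$, and $D^\infty(\gamma, E)$ coincide. In addition, $f = \sum_{n=0}^\infty I_n(f)$ belongs to $H^{2,r}(\gamma, E)$ for every $r \in \mathbb{N}$ precisely when
$$\sum_{n=0}^\infty n^r\|I_n(f)\|_{L^2(\gamma,E)}^2 < \infty \quad \text{for all } r \ge 1.$$
PROOF. By the previous theorem, $W^\infty(\gamma) = H^\infty(\gamma) = D^\infty(\gamma)$. Let $f \in H^{2,r}(\gamma)$ for every $r \in \mathbb{N}$. Then we get $\sum_n n^r\|I_n(f)\|_{L^2(\gamma)}^2 < \infty$. Conversely, suppose that this series converges for $r = 4k$, where $k \in \mathbb{N}$. Then $\|I_n(f)\|_{L^2(\gamma)} \le c(k)n^{-2k}$. Since $\|I_n(f)\|_{H^{2,k}} \le (n + 1)^{k/2}\|I_n(f)\|_{L^2(\gamma)}$, the series $\sum_{n=0}^\infty I_n(f)$ converges in the space $H^{2,k}(\gamma)$, whence $f \in H^{2,k}(\gamma)$.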
5.7.4. Corollary. For any separable Hilbert space $E$ and any $k \in \mathbb{N}$, one has
$$\mathcal{X}_k(E) \subset W^\infty(\gamma, E) = D^\infty(\gamma, E) = H^\infty(\gamma, E).$$
The derivative in all Sobolev classes defined above is denoted by $D_H$ or by $\nabla_H$. According to Theorem 5.7.2, the definitions of derivative in $W^{p,r}$ and $D^{p,r}$ lead to one and the same mapping up to a modification (in addition, the derivative along $H$ turns out to be well-defined also for the classes for which it had not been introduced originally). Equality (1.5.2) extends easily to the infinite dimensional case.
5.7.5. Corollary. Let $g \in W^{2,1}(\gamma)$, $f \in W^{2,2}(\gamma)$. Then
$$\int_X (D_Hg, D_Hf)_H\,d\gamma = -\int_X gLf\,d\gamma.$$
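The integration by parts identity of Corollary 5.7.5 is easy to test in dimension one. The sketch below assumes the standard Gaussian measure on $\mathbb{R}^1$, where $Lf = f'' - xf'$, and uses the hand-picked pair $g(y) = y^2$, $f(y) = \cos y$, for which both sides equal $-2e^{-1/2}$; the helper names are illustrative.

```python
import math

def gauss_expect(g, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of g against the standard
    Gaussian measure on R."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

# Test functions with their derivatives written out explicitly.
g = lambda y: y * y
dg = lambda y: 2.0 * y
f = math.cos
df = lambda y: -math.sin(y)
d2f = lambda y: -math.cos(y)

def ou_generator(y):
    """L f(y) = f''(y) - y f'(y), the one dimensional Ornstein-Uhlenbeck operator."""
    return d2f(y) - y * df(y)

lhs = gauss_expect(lambda y: dg(y) * df(y))            # integral of (Dg, Df)
rhs = -gauss_expect(lambda y: g(y) * ou_generator(y))  # minus integral of g L f
print(lhs, rhs, -2.0 * math.exp(-0.5))
```

Note that only first derivatives of $g$ enter the left-hand side, in accordance with the different smoothness assumptions on $g$ and $f$.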
In a similar manner we have the infinite dimensional generalization of Proposition 1.5.6 (which was in fact justified above).
5.7.6. Corollary. For any $f \in D^{2,1}(\gamma, E) = W^{2,1}(\gamma, E) = H^{2,1}(\gamma, E)$, where $E$ is a separable Hilbert space, one has
$$D_HT_tf = e^{-t}T_tD_Hf.$$
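The commutation rule between the derivative and the Ornstein-Uhlenbeck semigroup, $D_HT_tf = e^{-t}T_tD_Hf$, can be checked numerically. This is a minimal sketch assuming the standard Gaussian measure on $\mathbb{R}^1$ and $f(y) = y^3$, for which $\frac{d}{dx}T_tf(x) = 3e^{-3t}x^2 + 3e^{-t}(1 - e^{-2t})$; the function names are hypothetical.

```python
import math

def gauss_expect(g, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of g against the standard
    Gaussian measure on R."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * g(y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)

def ou_semigroup(f, t, x):
    """Mehler formula: T_t f(x) = E f(e^{-t} x + sqrt(1 - e^{-2t}) y)."""
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    return gauss_expect(lambda y: f(math.exp(-t) * x + s * y))

f = lambda y: y ** 3
df = lambda y: 3.0 * y * y

t, x = 0.7, -0.4
eps = 1e-4
# Left-hand side: derivative of T_t f, by a central finite difference.
lhs = (ou_semigroup(f, t, x + eps) - ou_semigroup(f, t, x - eps)) / (2.0 * eps)
# Right-hand side: e^{-t} T_t applied to the derivative of f.
rhs = math.exp(-t) * ou_semigroup(df, t, x)
exact = (3.0 * math.exp(-3.0 * t) * x * x
         + 3.0 * math.exp(-t) * (1.0 - math.exp(-2.0 * t)))
print(lhs, rhs, exact)
```

The factor $e^{-t}$ is the reason why iterating this rule gives better and better bounds on higher derivatives of $T_tf$ for large $t$.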
5.7.7. Lemma. Let $f \in W^{p,1}(\gamma)$, $p \ge 1$. Then $D_Hf = 0$ a.e. on the set $\{f = 0\}$. The same is true for the mappings from the class $W^{p,1}(\gamma, E)$ taking values in any separable Hilbert space $E$.
PROOF. This property is obvious if we use the characterization of the Sobolev classes via the directional differentiability. Indeed, it suffices to show that, for any $h \in H$, having chosen a version of $f$ that is absolutely continuous along $h$, one has $\partial_hf = 0$ a.e. on the set $\{f = 0\}$. With the aid of the conditional measures on the lines parallel to $h$, the claim reduces to the one dimensional case, in which it follows from the definition of the derivative and the fact that almost all points of any measurable set on the real line are its limit points (hence our derivative equals zero almost everywhere where it exists on the set $\{f = 0\}$). The vector case follows from the scalar one.
In the same manner as in the finite dimensional case, one defines the local Sobolev classes on a locally convex space $X$ with a centered Radon Gaussian measure $\gamma$.
5.7.8. Definition. Let $W^{p,r}_{\mathrm{loc}}(\gamma)$, $p > 1$, $r \in \mathbb{N}$, be the class of all functions $f$ for which there exist an increasing sequence of measurable sets $X_n$ (called localizing) and a sequence of functions $\psi_n \in W^{\infty}(\gamma)$ with the following properties:
1) $\gamma\bigl(\bigcup_{n=1}^{\infty} X_n\bigr) = 1$ and, for $\gamma$-a.e. $x$, the union of the interiors of the sets $(X_n - x) \cap H$ considered with the topology from $H$ equals $H$;
2) $\psi_n|_{X_n} = 1$;
3) $\psi_n f \in W^{p,r}(\gamma)$.
In a similar manner one defines the classes of vector-valued mappings $W^{p,r}_{\mathrm{loc}}(\gamma, E)$, where $E$ is a separable Hilbert space, and the classes $H^{p,r}_{\mathrm{loc}}(\gamma)$, $H^{p,r}_{\mathrm{loc}}(\gamma, E)$.
By analogy we define the classes $G^{p,r}_{\mathrm{loc}}(\gamma, E)$ (replacing in the definition above the condition $\psi_n \in W^{\infty}(\gamma)$ by $\psi_n \in G^{q,k}(\gamma)$ for all $q > 1$ and $k \in \mathbb{N}$). Note that the classes $H^{p,r}_{\mathrm{loc}}(\gamma, E)$ can be defined for any $r > 0$. The following result is readily deduced from Lemma 5.7.7.
5.7.9. Lemma. Let $f \in W^{p,r}_{\mathrm{loc}}(\gamma)$. If $n > m$, then one has $D_H^k(\psi_n f) = D_H^k(\psi_m f)$ a.e. on $X_m$, for all $k = 1, \ldots, r$, and $D_H^k f := \lim_{n \to \infty} D_H^k(\psi_n f)$ does not depend (up to a modification) on a choice of a localizing sequence $\{X_n\}$ and a sequence $\{\psi_n\}$ with the properties stated in the definition. The same is true for the classes $G^{p,r}_{\mathrm{loc}}(\gamma)$ and for vector-valued mappings.
5.7.10. Lemma. Let $f \in W^{p,1}_{\mathrm{loc}}(\gamma)$ be such that $f$ and $|D_H f|_H$ belong to $L^p(\gamma)$ for some $p > 1$. Then $f \in W^{p,1}(\gamma)$. The same is true for the classes $G^{p,1}_{\mathrm{loc}}(\gamma)$ and for vector-valued mappings.
PROOF. By Corollary 5.4.7, it suffices to show that $f \in G^{p,1}(\gamma)$, that is, $(D_H f, h)_H$ is its generalized derivative for every $h \in H$. Using the functions $\psi_n \in W^{\infty}(\gamma)$ from the definition of the local Sobolev classes and Proposition 5.4.2, we get the versions of $f_n = \psi_n f$ that are locally absolutely continuous on the lines $x + \mathbb{R}^1 h$. Suppose that $x$ belongs to the union of the interiors of the sets $(X_n - x) \cap H$. Then there exists an interval $(-\varepsilon, \varepsilon)$ such that $x + th \in X_n$ for all $t \in (-\varepsilon, \varepsilon)$ and all sufficiently large $n$. Then $f(x + th)$ coincides with $f_n(x + th)$ for all $t \in (-\varepsilon, \varepsilon)$, hence is locally absolutely continuous. In addition, the derivative of $f(x + th)$ in $t$ coincides for a.e. $t$ with $(D_H f_n(x + th), h)_H = (D_H f(x + th), h)_H$.
Since $\mathbb{R}^1 h$ is covered by the interiors of the sets $(X_n - x) \cap H$, we conclude that $t \mapsto f(x + th)$ is locally absolutely continuous on $\mathbb{R}^1$ and its derivative equals $(D_H f(x + th), h)_H$ for a.e. $t$. Therefore, the integration by parts formula applies and yields the inclusion $f \in G^{p,1}(\gamma)$. The same reasoning proves the claim for the classes $G^{p,1}_{\mathrm{loc}}(\gamma)$. The vector case follows from the scalar one. □
5.7.11. Example. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $E$ be a separable Hilbert space, and let $f \in W^{p,1}_{\mathrm{loc}}(\gamma, E)$ be such that
\[ \int_X \|D_H f(x)\|^p_{\mathcal{H}(H,E)} \, \gamma(dx) = N < \infty. \]
Then $f \in W^{p,1}(\gamma, E)$ and estimate (5.5.8) holds true.
PROOF. Suppose first that $E = \mathbb{R}^1$. Let $f_n = g_n(f)$, where $g_n(t) = t$ if $|t| \le n$ and $g_n(t) = n \operatorname{sgn} t$ if $|t| > n$. Then $f_n \in W^{p,1}(\gamma)$. It is readily verified that $|D_H f_n|_H \le |D_H f|_H$. Estimate (5.5.8) for $f_n$ follows from Theorem 5.5.11. Since $f_n \to f$ a.e. and the integrals of $|D_H f_n|_H^p$ are uniformly bounded, we get by Fatou's theorem and (5.5.8) that the sequence $\int_X f_n \, d\gamma$ is bounded. This yields the boundedness of $\{f_n\}$ in $W^{p,1}(\gamma)$, whence $f \in W^{p,1}(\gamma)$. Moreover, we get simultaneously the desired inequality for $f$. In the vector case, it suffices to note that $f_0 = |f|_E$ belongs to $W^{p,1}_{\mathrm{loc}}(\gamma)$ and $|D_H f_0(x)|_H \le \|D_H f(x)\|_{\mathcal{H}(H,E)}$ a.e. Hence $f_0 \in L^p(\gamma)$, whence the claim. □
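The truncations $g_n$ used in this proof are easy to experiment with. The following Python sketch is only an illustration under our own assumptions: the function name `truncate` and the sample points are hypothetical and not part of the text. It checks on a few points that $g_n$ is 1-Lipschitz, which is the elementary fact behind the estimate $|D_H f_n|_H \le |D_H f|_H$, and that $g_n(t) = t$ as soon as $n \ge |t|$.

```python
def truncate(t: float, n: float) -> float:
    """Return g_n(t): t itself if |t| <= n, otherwise n * sgn(t)."""
    return max(-n, min(n, t))


# Hypothetical sample points, chosen only for illustration.
samples = [-7.5, -2.0, 0.0, 0.3, 4.0, 12.0]

# g_n is 1-Lipschitz: |g_n(s) - g_n(t)| <= |s - t| for every pair of samples.
is_one_lipschitz = all(
    abs(truncate(s, 3.0) - truncate(t, 3.0)) <= abs(s - t)
    for s in samples
    for t in samples
)

# For a fixed t the truncation stabilises: g_n(t) = t once n >= |t|.
stabilised = all(truncate(t, 20.0) == t for t in samples)
```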
5.7.12. Example. Let $\gamma$ and $E$ be the same as in the previous example and let $f \in W^{p,r}_{\mathrm{loc}}(\gamma, E)$, $r \in \mathbb{N}$ and $p > 1$, be such that
\[ \int_X \|D_H^r f(x)\|^p_{\mathcal{H}_r(H,E)} \, \gamma(dx) < \infty. \]
Then $f \in W^{p,r}(\gamma, E)$.
PROOF. Follows by induction from the previous example. □
5.8. Divergence of vector fields
Let $\mu$ be a Radon measure on a locally convex space $X$ and let $v\colon X \to X$ be a measurable mapping, which we shall call a vector field. The differentiability along a field is defined by means of the integration by parts formula. For a function $f$ on $X$, we put
\[ \partial_v f(x) = (f'(x), v(x)) = \lim_{t \to 0} \frac{f(x + t v(x)) - f(x)}{t} \]
whenever this limit exists.
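In finite dimensions the derivative along a field can be approximated by the difference quotient from its definition. The sketch below is a hedged illustration, not the book's construction: the helper `derivative_along_field`, the step size and the test point are our own assumptions. It takes $f(x) = |x|^2$ on $\mathbb{R}^3$ and the field $v(x) = x$, for which $\partial_v f(x) = (f'(x), x) = 2|x|^2$.

```python
from typing import Callable, List

Vector = List[float]


def derivative_along_field(
    f: Callable[[Vector], float],
    v: Callable[[Vector], Vector],
    x: Vector,
    t: float = 1e-6,
) -> float:
    """Difference quotient (f(x + t v(x)) - f(x)) / t for a small step t."""
    shifted = [xi + t * vi for xi, vi in zip(x, v(x))]
    return (f(shifted) - f(x)) / t


def squared_norm(x: Vector) -> float:
    return sum(xi * xi for xi in x)


def identity_field(x: Vector) -> Vector:
    return list(x)


x0 = [1.0, -2.0, 0.5]
approx = derivative_along_field(squared_norm, identity_field, x0)
exact = 2.0 * squared_norm(x0)  # (f'(x), x) = 2|x|^2 for f(x) = |x|^2
```

The difference quotient carries an error of order $t$, so the two values agree only approximately.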
5.8.1. Definition. The measure $\mu$ is called differentiable along the field $v$ if there exists a function $\beta_v^{\mu} \in L^1(\mu)$ such that
\[ \int_X \partial_v f(x) \, \mu(dx) = -\int_X f(x) \beta_v^{\mu}(x) \, \mu(dx), \quad \forall f \in \mathcal{FC}^{\infty}. \tag{5.8.1} \]
The function $\beta_v^{\mu}$ is called the logarithmic derivative of $\mu$ along the field $v$ or the divergence of $v$ with respect to $\mu$. In the latter case it is denoted also by $\delta v$.
5.8.2. Example. Let $X$ be an infinite dimensional locally convex space. Then no Gaussian measure on $X$ is differentiable along the vector field $v(x) = x$.
PROOF. Suppose that $\gamma$ is a Gaussian measure on $X$ differentiable along $v$. To simplify notation, we shall assume that $\gamma$ is centered. Let $\{e_i\}$ be an orthonormal basis in $H(\gamma)$. Put $P_n x = \sum_{i=1}^{n} \hat{e}_i(x) e_i$. Let us denote by $\sigma_n$ the $\sigma$-field generated
smooth function 1 $\mu$-a.e. on $U$}, and $C_T(U) = +\infty$ if there is no such $f$. Then, for every subset $A$ of $X$, we put $C_T(A) = \inf\{C_T(U)\colon A \subset U,\ U \text{ is open}\}$. It is clear that in the definition of the capacity $C_T$ on open sets one could consider only nonnegative functions, since $T|f| \ge Tf$. We shall deal further with the operators $V_r$. The corresponding capacities are denoted by $C_{p,r}$ and called Gaussian capacities. However, almost all results in this section are valid for general capacities $C_T$.
5.9.2. Lemma. The capacity $C_T$ is subadditive, i.e., $C_T(A \cup B) \le C_T(A) + C_T(B)$ for any two sets $A$ and $B$. In addition,
\[ C_T(B) \ge \|T\|^{-1}_{\mathcal{L}(L^p(\mu))} \mu(B)^{1/p}, \quad \forall B \in \mathcal{B}(X). \]
PROOF. It suffices to consider open sets $A$ and $B$. For any $\varepsilon > 0$, let $f$ and $g$ be functions from $L^p(\mu)$ such that $\|f\|_p \le C_T(A) + \varepsilon$, $\|g\|_p \le C_T(B) + \varepsilon$, $Tf|_A \ge 1$ and $Tg|_B \ge 1$ a.e. Put $h = \max(|f|, |g|)$. Then $\|h\|_p \le (\|f\|_p^p + \|g\|_p^p)^{1/p} \le \|f\|_p + \|g\|_p$, $Th|_A \ge Tf|_A \ge 1$ and $Th|_B \ge Tg|_B \ge 1$ a.e., since $Th \ge Tf$ and $Th \ge Tg$ a.e., due to our assumption. Therefore, $C_T(A \cup B) \le \|h\|_p$, whence the first claim. Let us prove the second claim. Let $B$ be open and let $f \in L^p(\mu)$ be such that $Tf|_B \ge 1$ a.e. Hence $\|T\|_{\mathcal{L}(L^p(\mu))} \|f\|_p \ge \|Tf\|_p \ge \mu(B)^{1/p}$. Therefore, this estimate is valid for the infimum. □
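The norm estimate for $h = \max(|f|, |g|)$ used in this proof rests on the pointwise inequality $h^p \le |f|^p + |g|^p$. As a purely illustrative check, the Python sketch below evaluates both inequalities for a discrete measure with four atoms; the weights, the values of $f$ and $g$, and the helper `lp_norm` are hypothetical choices of ours.

```python
from typing import Sequence


def lp_norm(values: Sequence[float], weights: Sequence[float], p: float) -> float:
    """L^p norm with respect to the discrete measure sum_i weights[i] * delta_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


# Hypothetical data: a probability measure on four atoms and two functions.
weights = [0.1, 0.2, 0.3, 0.4]
f = [1.5, -0.2, 0.7, 0.0]
g = [-0.3, 1.1, 0.7, 2.0]
p = 3.0

h = [max(abs(a), abs(b)) for a, b in zip(f, g)]

norm_f = lp_norm(f, weights, p)
norm_g = lp_norm(g, weights, p)
norm_h = lp_norm(h, weights, p)

# Middle term of the chain ||h||_p <= (||f||_p^p + ||g||_p^p)^(1/p) <= ||f||_p + ||g||_p.
middle = (norm_f ** p + norm_g ** p) ** (1.0 / p)
```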
If $p = 1$ and $T = I$, then $C_T(B) = \mu(B)$, but typically $C_T$ is not countably additive even on the Borel $\sigma$-field.
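5.9.3. Proposition. The capacity $C_T$ has the following properties:
(i) If compact sets $K_n$ decrease, then $\lim_{n \to \infty} C_T(K_n) = C_T\bigl(\bigcap_{n=1}^{\infty} K_n\bigr)$.
(ii) For arbitrary sets $A_n$, one has $C_T\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) \le \sum_{n=1}^{\infty} C_T(A_n)$.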
PROOF. (i) Indeed, for any open set $U$ containing the compact set $\bigcap_{n=1}^{\infty} K_n$, one can find $n$ such that $K_n \subset U$. In order to prove claim (ii), it suffices to consider open sets $A_n$. In addition, we may assume that the series of their capacities converges. Put $A = \bigcup_{n=1}^{\infty} A_n$. Let $\varepsilon > 0$. Let us choose $f_n \in L^p(\mu)$ such that $Tf_n \ge 1$ a.e. on $A_n$ and (5.9.1) $\|f_n\|_p$
$0$ and any function $\varphi \in C_b(X)$, one has
\[ C_T(x\colon T\varphi(x) > r) \le r^{-1} \|\varphi\|_p, \]
since $T\varphi / r > 1$ on the open set $\{x\colon T\varphi(x) > r\}$. Hence, for every $r > 0$, we get
\[ C_T(|F_n - F_k| > r) \to 0, \quad n, k \to \infty. \]
Therefore, for any $n$, there exists an integer $k_n$ such that
\[ C_T(|F_i - F_j| > 2^{-n}) < 2^{-n}, \quad \forall i, j \ge k_n. \]
Let us verify that the sequence $\{G_n\} = \{F_{k_n}\}$ converges to $F$ quasi-everywhere. It suffices to show that this is a Cauchy sequence quasi-everywhere. Let $\varepsilon > 0$. Let us choose $n_1$ such that $1/2^{n_1} < \varepsilon$. Put
\[ X_{\varepsilon} = \bigcap_{j > n_1} \{x\colon |G_j - G_{j+1}| \le 2^{-j}\}. \]
By virtue of the subadditivity of $C_T$ and the choice of $G_n$, one has
\[ C_T(X \setminus X_{\varepsilon}) \le \sum_{j > n_1} 2^{-j} \le 2^{-n_1} < \varepsilon. \]
Obviously, on the set $X_{\varepsilon}$, one has $|G_{j+1} - G_j| \le 2^{-j}$ for all $j > n_1$. Therefore, the series $\sum_{j=0}^{\infty} (G_{j+1} - G_j)$, $G_0 = 0$, converges on $X_{\varepsilon}$, which proves the convergence of $\{G_n\}$ by the equality $\sum_{j=0}^{n-1} (G_{j+1} - G_j) = G_n$. Since $\varepsilon$ is arbitrary, we get the quasi-everywhere convergence. This reasoning shows that, for any $\varepsilon > 0$, there exists a closed set $X_{\varepsilon}$ with $C_T(X \setminus X_{\varepsilon}) < \varepsilon$, on which the convergence is uniform. Hence the limit $F^*$ of the sequence $\{G_n\}$ is quasicontinuous. Moreover, it follows from our reasoning that, for any fixed $r > 0$ and $\varepsilon \in (0, r)$, there exist a closed set $X_{\varepsilon}$ with $C_T(X \setminus X_{\varepsilon}) < \varepsilon$, on which $F^*$ is continuous, and a function $g \in C_b(X)$ such that $\|f - g\|_p < \varepsilon$ and $|F^* - Tg| < \varepsilon$ on $X_{\varepsilon}$. Then $C_T(F^* > r) \le C_T(X_{\varepsilon} \cap \{F^* > r\}) + \varepsilon$
r-e})+£r-£)+£ 1 p-a.e. on B, whence Cr(B) CT(B). Suppose first that h is nonnegative a.e. and let c > 0. One can find a closed set Z, on which h is continuous, such that Cr(X \Z) < E and h > 1 on B n Z. Note that the set
$G = (Z \cap \{h > 1 - \varepsilon\}) \cup (X \setminus Z)$ is open and $B \subset G$. As we have already proved, there exists a nonnegative function $h_0 = u_{X \setminus Z}$ such that $h_0 \ge 1$ on $X \setminus Z$ a.e. and $\|T^{-1} h_0\|_p = C_T(X \setminus Z) < \varepsilon$. Since the functions $h$ and $h_0$ are nonnegative a.e., we get $h + h_0 \ge 1 - \varepsilon$ a.e. on $G$. Therefore,
$C_T(B)$ $h$ a.e., since $|g| \ge g$. Hence $h_1 \ge h$ quasi-everywhere by Lemma 5.9.5. Thus, $h_1|_B \ge 1$ quasi-everywhere and $\|T^{-1} h_1\|_p = \|T^{-1} h\|_p = C_T(B)$. The uniqueness result implies that $h_1 = h$, hence $h \ge 0$ quasi-everywhere. □
The function $u_B$ from the previous theorem is called the equilibrium potential of the set $B$.
5.9.7. Corollary. Suppose that the conditions in Theorem 5.9.6 are satisfied. Then, for any increasing sequence of sets $B_n$, one has
\[ \lim_{n \to \infty} C_T(B_n) = C_T\Bigl(\bigcup_{n=1}^{\infty} B_n\Bigr). \]
PROOF. Let $u_n$ be the equilibrium potential of $B_n$. The sequence $\{T^{-1} u_n\}$ is bounded in $L^p(\mu)$. Hence it has a subsequence $\{T^{-1} u_{n_k}\}$, whose sequence $\{S_n\}$ of the arithmetic means converges in $L^p(\mu)$ to some function $g$. Put $u = Tg$. Clearly, the union $D$ of the sets $B_n \cap \{u_n < 1\}$ has capacity zero, since each of these sets
has zero capacity. On the set $B \setminus D$, one has $Tg \ge 1$. Therefore, $Tg \ge 1$ quasi-everywhere on $B$, whence $C_T(B)$
$0$. Let $K$ be a compact set with $C(X \setminus K) < \varepsilon$. Note that $K$ is compact in the topology $\tau$, hence the initial topology coincides with $\tau$ on $K$. In particular, there exists a $\tau$-open set $V$ such that $K \cap V = K \cap U$. Therefore,
\[ C_{\tau}(U) \le C_{\tau}(U \cap K) + C_{\tau}(X \setminus K) = C_{\tau}(U \cap K) + C(X \setminus K) \]
$0$, $p > 1$. Then, for any $\varepsilon > 0$, there exists a metrizable compact set $K_{\varepsilon}$ such that $C_{p,r}(X \setminus K_{\varepsilon}) < \varepsilon$. In addition, $C_{p,r}(B) = \sup\{C_{p,r}(K)\colon K \subset B \text{ is metrizable compact}\}$ for any Borel set $B$ (the same is true for any Souslin set $B$).
PROOF. First we consider the case where $X$ is complete. In this case, as we know, there exists a metrizable absolutely convex compact set $K$ such that $\gamma(K) > 1/2$. Denote by $g$ the Minkowski functional of the set $K$ defined by zero outside the linear span $E$ of $K$. Since $\gamma(E) > 0$, one has $\gamma(E) = 1$ by the zero-one law. By Fernique's theorem, one has $g \in \bigcap_{p \ge 1} L^p(\gamma)$. Put
\[ G(x) := Jg(x) := T_{(\log 2)/2}\, g(x) = \int_X g\Bigl(\frac{x + y}{\sqrt{2}}\Bigr) \gamma(dy), \]
where $(T_t)_{t \ge 0}$ is the Ornstein-Uhlenbeck semigroup. Let $K_n$ be the closure of the set $\{G \le n\} \cap E$. Note that
\[ \bigl|\sqrt{2}\, Jg(x) - g(x)\bigr| \le \int_X g(y) \, \gamma(dy) = d, \quad x \in E, \]
whence $\{Jg$
$1$. Since $n^{-1} G > 1$ a.e. on the open set $X \setminus K_n$, we have
\[ C_{p,r}(X \setminus K_n) \le n^{-1} \|G\|_{H^{p,r}}, \]
whence the tightness of $C_{p,r}$ follows. Let now $X$ be an arbitrary locally convex space and let $Z$ be its completion. We shall consider the measure $\gamma$ on $Z$. Let $D$ be a compact subset of $X$ of positive $\gamma$-measure and let $E$ be the linear span of $D$. The set $E$, as it has already been noted above, is $\sigma$-compact. In addition, by the zero-one law, it has full measure. The indicator function $I_M$ of the set $M = Z \setminus E$ equals zero a.e. and $JI_M = I_M$ pointwise. Since $JI_M \in H^{p,r}(\gamma)$, we get $\|JI_M\|_{H^{p,r}} = 0$. Therefore, $C_{p,r}(M) = C_{p,r}(I_M = 1) = C_{p,r}(JI_M = 1)$
$\bigcap_{p \ge 1} L^p(\gamma)$.
PROOF. It suffices to prove our claim for centered measures. It is readily verified that the set Ald of all x such that the degree of the corresponding polynomial
mapping is at most d is measurable. In fact, A/d coincides up to a measure zero set with the intersection of the sets {x: 'I(* (x)) = 0}. I E I', where {en} is an orthonormal basis in H. Since the sets Ald are invariant with respect to the shifts to the elements of H, we get, by the zero one law, that the first of them that has nonzero measure is a set of full measure. Now we use induction in d. For d = 0 the claim is true, since in this case, for every I E I', the function 1(41) is constant along H, hence coincides a.e. with some constant. Then %P coincides a.e. with some element from Y (recall that IF is a countable separating family in Y'). Suppose the claim is true for some d > 1. Let us consider the mapping DH it: X --' C(H. Y). The mapping D 4 with values in C(H, Y) is polynomial along H of degree less than d. In addition, C(H,Y) satisfies the condition of this proposition. Indeed, let (h;} be a countable set everywhere dense in H. Then the countable family of the functionals of the form g: A - I(Ah,), I E I', separates
the points in C(H,Y). For every such functional, the function g(4k) = 1(8h,') is y-measurable as the pointwise limit of n[I(41(x+n-1h;)) - l(a(x))], since the functions x - 1(e(x+n-1ha)) are measurable by the inclusion h, E H. In addition, the function II4'(x)jly is measurable as well, since it coincides with sup.,1 (11(x)) for some sequence {ll} C Y' with llljlly < 1. In a similar manner one gets the measurability of the function By the inductive assumption, we get II D, WIIc(H.y) E f1P>1LP(y). This implies the claim. Indeed, let :p,,(x) = sups n, then D f is not constant.
5.10.7. Theorem. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$, let $Y$ be a separable Fréchet (e.g., separable Banach) space, and let $F\colon X \to Y$ be a $\gamma$-measurable mapping. Then the following conditions are equivalent:
(i) There exists a sequence of continuous polynomial mappings $F_n\colon X \to Y$ of degree $d$ convergent to $F$ $\gamma$-a.e.
(ii) There exists a sequence of continuous polynomial mappings $F_n\colon X \to Y$ of degree $d$ convergent to $F$ in measure, i.e., for every continuous seminorm $p$ on $Y$ and every $\varepsilon > 0$, one has $\lim_{n \to \infty} \gamma\{x\colon p(F_n(x) - F(x)) > \varepsilon\} = 0$.
(iii) For every $l \in Y^*$, one has $l \circ F \in \bigoplus_{k=0}^{d} \mathcal{X}_k$.
(iv) There exist a nonnegative integer $d$ and a version $F_0$ of $F$ such that for every $x \in X$, the mapping $h \mapsto F_0(x + h)$, $H \to Y$, is a continuous polynomial of degree $d$.
If either of the conditions above is fulfilled, then one can find a Borel version of $F$ satisfying (iv).
PROOF. It suffices to consider centered measures. Clearly, (i) is equivalent to (ii) (since any sequence of mappings with values in a metric space that converges in measure has a subsequence convergent almost everywhere), and either of these two conditions implies (iii), since $l \circ F_n$ is a continuous polynomial of degree $d$ for every functional $l \in Y^*$. Suppose that (iii) is fulfilled. By Theorem 3.6.5, there exists a separable Banach space $E$ compactly embedded into $Y$ such that $\gamma \circ F^{-1}(E) = 1$. The closed unit ball of $E$ (which is compact in $Y$) is denoted by $K$. Since there is a sequence of continuous linear functionals on $Y$ separating the points, we may assume that $Y$ is embedded into $\mathbb{R}^{\infty}$. Then $F$ as a mapping with values in $\mathbb{R}^{\infty}$ can be written as $F = (F_1, \ldots, F_n, \ldots)$, where $F_n$ are $\gamma$-measurable polynomials of degree $d$. By Lemma 5.10.1, every $F_n$ has a Borel version $G_n$ such that, for every $x \in X$, $h \mapsto G_n(x + h)$ is a continuous polynomial of degree $d$ on $H$. Clearly, $G = (G_n)_{n=1}^{\infty}$ is a version of $F$. In particular, $G(x) \in E$ for $\gamma$-a.e. $x$. Obviously, the polynomial mappings $(G_1, \ldots, G_n, 0, 0, \ldots)$ converge to $G$ pointwise (hence in measure) if they are regarded as $\mathbb{R}^{\infty}$-valued mappings. Each of these mappings is the limit of
a sequence of continuous finite dimensional polynomial mappings of degree $d$ convergent in measure, since $G_n \in \bigoplus_{k=0}^{d} \mathcal{X}_k$. Therefore, $G$ is a limit of a sequence of continuous finite dimensional polynomial mappings $Q_n\colon X \to \mathbb{R}^{\infty}$ of degree $d$ convergent in measure. Passing to a subsequence, we may assume that $Q_n(x) \to G(x)$ a.e. in the topology of $\mathbb{R}^{\infty}$. We shall show that there is a Borel version $F_0$ of $G$ (hence of $F$) such that $F_0(x + h) \in E$ for all $h \in H$ and $x \in X$. Let us assume first that the polynomial mappings $Q_n$ are homogeneous. Let us denote by $W_n$ the corresponding continuous symmetric $d$-linear forms on $X^d$. Note that the forms $W_n$ are uniquely determined and can be evaluated by (5.10.1) as follows:
\[ W_n(x_1, \ldots, x_d) = \frac{1}{d!} \sum_{\varepsilon_1, \ldots, \varepsilon_d \in \{0,1\}} (-1)^{d - \varepsilon_1 - \cdots - \varepsilon_d} Q_n(x_0 + \varepsilon_1 x_1 + \cdots + \varepsilon_d x_d), \tag{5.10.2} \]
where $x_0 \in X$ is an arbitrary element. Let us put
\[ B := \Bigl\{x\colon \exists \lim_{n \to \infty} Q_n(x) \in E\Bigr\}, \]
where the limit is taken in $\mathbb{R}^{\infty}$. It is readily verified that $B$ is a Borel set. Clearly, $\gamma(B) = 1$. Let us put $G_0(x) = \lim_{n \to \infty} Q_n(x)$ if $x \in B$ and $G_0(x) = 0$ if $x \notin B$. Since the mappings $Q_n$ are homogeneous, we have $\lambda B = B$ and $G_0(\lambda x) = \lambda^d G_0(x)$ for any $x \in B$ and $\lambda > 0$. Therefore, for every rational number $r$, we get
\[ 1 = \gamma(B) = \gamma\bigl((r^2 + 1)^{-1/2} B\bigr) = \int_X \gamma(B - rx) \, \gamma(dx), \]
since the measure $\gamma$ equals the image of $\gamma \otimes \gamma$ under the mapping
\[ (x, y) \mapsto r(1 + r^2)^{-1/2} x + (1 + r^2)^{-1/2} y. \]
Hence there exists a set $C_r \in \mathcal{B}(X)$ such that $\gamma(B - rx) = 1$ for all $x \in C_r$. Letting $C = \bigcap_r C_r$, we have $\gamma(B - rx) = 1$ for all $x \in C$ and all rationals $r$. Clearly, $C$ is a full measure Borel set. Passing to a full measure subset, we may assume that $C$ is a countable union of metrizable compact sets. Then we have
\[ \gamma\Bigl(\bigcap_{r,s} (B - rx - sh)\Bigr) = 1, \quad \forall x \in C,\ \forall h \in H, \]
where the intersection is taken over all rationals $r$ and $s$. Let us now fix $x \in C$ and $h \in H$. By choosing $x_0 \in \bigcap_{r,s} (B - rx - sh)$, we get $x_0 + rx + sh \in B$ for all rationals $r$ and $s$, i.e., $\lim_{n \to \infty} Q_n(x_0 + rx + sh) \in E$ for all rational $r$ and $s$. It follows by identity (5.10.2) that, for every $j \in \{0, \ldots, d\}$, there exists the limit
\[ V_j(x, \ldots, x, h, \ldots, h) = \lim_{n \to \infty} W_n(x, \ldots, x, h, \ldots, h), \]
where $(x, \ldots, x, h, \ldots, h)$ stands for the vector in $X^d$ with the $d - j$ first components $x$ and the last $j$ components $h$. Indeed, $W_n(x, \ldots, x, h, \ldots, h)$ can be written as
\[ \sum_{m, q \in \{0, \ldots, d\}} c_{m,q,j}\, Q_n(x_0 + mx + qh), \]
where the coefficients $c_{m,q,j}$ are some absolute constants depending only on $m$, $q$, $j$, $d$. Therefore, we get
\[ V_j(x, \ldots, x, h, \ldots, h) = \sum_{m, q \in \{0, \ldots, d\}} c_{m,q,j}\, G_0(x_0 + mx + qh). \]
This shows that $V_j(x, \ldots, x, h, \ldots, h) \in E$ for all $x \in C$ and all $h \in H$. By construction, for every $x \in C$, the mapping $h \mapsto V_j(x, \ldots, x, h, \ldots, h)$, $H \to E$, is a Borel homogeneous polynomial. According to Example 5.10.2, this mapping is continuous (hence it is also continuous as a mapping to $Y$). Note also that $x + H \subset B$ for every $x \in C$. Indeed, if $x \in C$ and $h \in H$, then $Q_n(x + h)$ is a finite linear combination of $W_n(x, \ldots, x, h, \ldots, h)$ with some absolute coefficients independent of $n$. Hence $\lim_{n \to \infty} Q_n(x + h)$ exists in $\mathbb{R}^{\infty}$ and belongs to $E$ (since the limits $\lim_{n \to \infty} W_n(x, \ldots, x, h, \ldots, h)$ belong to $E$). It remains to redefine $G_0$ by zero outside $C + H$, which gives a version $F_0$ with the desired properties (note that $C + H$ is a Borel set, since both $C$ and $H$ are countable unions of compact metrizable sets).
Let us consider the general case where $Q_n$ may not be homogeneous. Then $Q_n = \sum_{k=0}^{d} Q_{n,k}$, where $Q_{n,k}$ is a homogeneous of order $k$ continuous polynomial mapping. We shall assume that $H$ is infinite dimensional (otherwise the claim is trivial). Let us put $p_n = n^{-1} \sum_{i=1}^{n} \hat{e}_i^{\,2}$, where $\{e_i\}$ is an orthonormal basis in $H$ with $\hat{e}_i \in X^*$. As we know, $p_n \to 1$ a.e. Therefore, there exists a sequence $\{i_n\}$ such that $(p_{i_n} - 1)[Q_{n,0} + \cdots + Q_{n,d-2}] \to 0$ in measure. Thus, $p_{i_n}[Q_{n,0} + \cdots + Q_{n,d-2}] + Q_{n,d-1} + Q_{n,d} \to G$ in measure. Repeating this procedure and noting that $p_{i_n} Q_{n,k}$ is a continuous polynomial mapping of order $k + 2$, we arrive at the situation with $Q_{n,k} = 0$, $k = 0, \ldots, d - 2$. Passing to subsequences, we may assume that the corresponding mappings converge almost everywhere. In this situation, letting $A$ be the set of convergence of $Q_{n,d}(x) + Q_{n,d-1}(x)$, we note that $(-A) \cap A$ has full measure. As a consequence, $Q_{n,d}(x) - Q_{n,d-1}(x)$ converges a.e., hence $\{Q_{n,d}\}$ and $\{Q_{n,d-1}\}$ converge a.e., which reduces the claim to the homogeneous case considered above. By Proposition 5.10.3, $\|F_0\|_E \in L^2(\gamma)$.
Suppose now that (iv) is fulfilled. Clearly, then (iii) is fulfilled as well. Let us take a version $F_0$ constructed in (iii) and taking values in a separable Banach space $E$ continuously embedded into $Y$. It suffices to approximate $F$ in measure by continuous polynomial mappings $F_n$ taking values in $E$. Let $\{e_n\}$ be an orthonormal basis in $H$ such that $\hat{e}_n \in X^*$. Since $E$ is a separable Banach space and $\|F_0\|_E$ is integrable as shown above, we have $F_0 \in L^1(\gamma, E)$. Hence we can take for $F_n$ the finite dimensional approximations constructed in Corollary 3.5.2, i.e.,
\[ F_n(x) = \int_X F_0\Bigl(\sum_{i=1}^{n} \hat{e}_i(x) e_i + \sum_{i=n+1}^{\infty} \hat{e}_i(y) e_i\Bigr) \gamma(dy). \]
Note that $F_n$ is a continuous polynomial mapping, since $\sum_{i=1}^{n} \hat{e}_i(x) e_i + \sum_{i=n+1}^{\infty} \hat{e}_i(y) e_i \in C$ for every $x \in X$ and every $y \in C$, where $C$ is the full measure set constructed above. With this choice, we have $\|F_n - F\|_E \to 0$ in all $L^p(\gamma)$. □
5.10.8. Definition. Denote by $\mathcal{P}_d(\gamma, Y)$ the class of mappings satisfying either of conditions (i) - (iv) in Theorem 5.10.7. Let us put $\mathcal{P}_d(\gamma) := \mathcal{P}_d(\gamma, \mathbb{R}^1)$. The mappings from the class $\mathcal{P}_d(\gamma, Y)$ are called $\gamma$-measurable polynomial mappings of degree $d$.
Using this new notation, we have $\mathcal{P}_d(\gamma) = \mathcal{X}_0 \oplus \cdots \oplus \mathcal{X}_d$.
Now we prove two zero-one law type results for polynomial mappings.
5.10.9. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $Y$ be a separable Fréchet space, and let $\{F_n\} \subset \mathcal{P}_d(\gamma, Y)$. Then the following sets have measures zero or one:
\[ L := \Bigl\{x\colon \lim_{n \to \infty} F_n(x) \text{ exists}\Bigr\}, \qquad M := \Bigl\{x\colon \lim_{n \to \infty} F_n(x) = 0\Bigr\}. \]
PROOF. Clearly, $L$ and $M$ are $\gamma$-measurable. Suppose that $\gamma(L) > 0$. Let us choose an orthonormal basis $\{e_i\}$ in $H(\gamma)$. Let $L_i$ be the set of all points $x \in L$ such that the set $\{t \in \mathbb{R}^1\colon x + te_i \in L\}$ has positive Lebesgue measure. Obviously, $\gamma(L \setminus L_i) = 0$. Hence, letting $L_0 = \bigcap_{i=1}^{\infty} L_i$, one has $\gamma(L_0) = \gamma(L) > 0$. We shall work with versions of $F_n$ which are polynomial of degree $d$ along all vectors $e_i$. Note that if a sequence of polynomials of degree $d$ on $\mathbb{R}^1$ converges at $d + 1$ points, then it converges pointwise. Therefore, for any $x \in L_0$, one has $x + te_i \in L_0$ for all $t \in \mathbb{R}^1$ and $i \in \mathbb{N}$. By the zero-one law, $\gamma(L_0) = 1$. The same reasoning applies to the set $M$. □
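The key elementary fact in this proof is that a polynomial of degree $d$ on the line is determined by its values at $d + 1$ points, and that its value at any other point is a fixed linear combination of those values. Convergence at the nodes therefore forces convergence everywhere. The Python sketch below illustrates this with Lagrange interpolation; the nodes, the coefficients and the perturbation $1/n$ are hypothetical data chosen by us for the example.

```python
from typing import List, Sequence


def lagrange_eval(nodes: Sequence[float], values: Sequence[float], t: float) -> float:
    """Evaluate at t the unique polynomial of degree len(nodes) - 1 through the data."""
    total = 0.0
    for i, (ti, yi) in enumerate(zip(nodes, values)):
        weight = 1.0
        for j, tj in enumerate(nodes):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += yi * weight
    return total


def poly(coeffs: Sequence[float], t: float) -> float:
    """Evaluate sum_k coeffs[k] * t**k."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


nodes = [0.0, 1.0, 2.0, 3.0]           # d + 1 = 4 nodes for degree d = 3
limit_coeffs = [1.0, -2.0, 0.5, 0.25]  # hypothetical limit polynomial
t_far = 5.0                            # a point outside the nodes

# The limit polynomial is recovered at t_far from its node values alone.
recovered = lagrange_eval(nodes, [poly(limit_coeffs, s) for s in nodes], t_far)

# A sequence of cubics whose node values converge also converges at t_far.
errors: List[float] = []
for n in (10, 100, 1000):
    coeffs_n = [c + 1.0 / n for c in limit_coeffs]
    node_values = [poly(coeffs_n, s) for s in nodes]
    errors.append(abs(lagrange_eval(nodes, node_values, t_far) - poly(limit_coeffs, t_far)))
```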
5.10.10. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $F\colon X \to (Y, \mathcal{A})$ be a $\gamma$-measurable mapping with values in a linear space $Y$ equipped with a $\sigma$-field $\mathcal{A}$. Suppose that $\{e_n\}$ is an orthonormal basis in $H(\gamma)$ such that, for every $n \in \mathbb{N}$ and $\gamma$-a.e. $x$, the mapping $t \mapsto F(x + te_n)$ is a polynomial. Then, for any linear subspace $L \subset Y$ such that $L \in \mathcal{A}$, one has
\[ \gamma(x \in X\colon F(x) \in L) = 0 \quad \text{or} \quad \gamma(x \in X\colon F(x) \in L) = 1. \]
In particular, this is true if $Y$ is a separable Fréchet space, $L \in \mathcal{B}(Y)$ is a linear subspace and $F \in \mathcal{P}_d(\gamma, Y)$.
PROOF. Suppose that the set $\Omega = F^{-1}(L)$ has positive $\gamma$-measure. The claim is trivial if $X = \mathbb{R}^1$; moreover, in this case $\Omega = \mathbb{R}^1$. Indeed, if $F$ is a polynomial on the real line with values in $Y$ such that $F(t) \in L$ for infinitely many values of $t$, then $F(t) \in L$ for all $t$. In the general case, let us put
\[ \Omega_n = \{x \in \Omega\colon \operatorname{mes}(t \in \mathbb{R}^1\colon F(x + te_n) \in L) > 0\}, \]
where "mes" is Lebesgue measure. Clearly, $\gamma(\Omega_n) = \gamma(\Omega)$ (this is easily seen from Theorem 3.10.2 about conditional measures). Letting $\Omega_0 = \bigcap_{n=1}^{\infty} \Omega_n$, we have $\gamma(\Omega_0) = \gamma(\Omega) > 0$. By the one dimensional case, $x + te_n \in \Omega$ for all $x \in \Omega_0$, $t \in \mathbb{R}^1$ and $n \in \mathbb{N}$. Hence $\Omega_0 + \mathbb{R}^1 e_n = \Omega_0$. By the zero-one law, $\gamma(\Omega_0) = 1$. □
5.10.11. Corollary. Suppose that $(T, \mathcal{M}, \sigma)$ is a separable measurable space and that $(\xi_t)_{t \in T}$ is a measurable centered Gaussian process. Let $Q(t, z) = \sum_{k=1}^{n} c_k(t) z^k$, where the $c_k$'s are measurable functions on $T$. Then
\[ \text{either} \quad \int_T |Q(t, \xi_t)| \, \sigma(dt) < \infty \ \text{a.e.} \quad \text{or} \quad \int_T |Q(t, \xi_t)| \, \sigma(dt) = \infty \ \text{a.e.} \]
The latter is equivalent to $\int_T \sum_{k=1}^{n} K_{\xi}(t, t)^{k/2} |c_k(t)| \, \sigma(dt) = \infty$, where $K_{\xi}$ is the covariance function of $\xi$.
PROOF. Let us put $\varrho(t) = \bigl(1 + \sum_{k=1}^{n} |c_k(t)| K_{\xi}(t, t)^{k/2}\bigr)^{-1}$ and consider the measure $\lambda = \varrho \cdot \sigma$. We shall apply Proposition 5.10.10 to the separable Banach spaces $X = L^n(\lambda)$, $L = L^1(\sigma)$ and $Y = L^1(\lambda)$. According to Example 3.11.14, the process $\xi$ generates a centered Gaussian measure on the space $X = L^n(\lambda)$. Let us denote this measure by $\mu_{\xi}$. The mapping $F\colon X \to Y$, $F(x)(t) = Q(t, x(t))$, is continuous and polynomial of degree $n$. The space $L$ is continuously embedded into $Y$. Hence $\mu_{\xi} \circ F^{-1}(L)$ is either 0 or 1. In the latter case, $\|F\|_L \in L^1(\mu_{\xi})$. Let $\gamma_1$ be the standard Gaussian measure on the line. By the equivalence of all norms on the finite dimensional space of polynomials of degree $n$ on $\mathbb{R}^1$, there is $d_0 > 0$ such that $\|q\|_{L^1(\gamma_1)} \ge d_0 \sum_{k=1}^{n} |a_k|$, whenever $q(s) = \sum_{k=1}^{n} a_k s^k$. Therefore,
\[ \int_X \|F(x)\|_L \, \mu_{\xi}(dx) = \mathbb{E} \int_T |Q(t, \xi_t)| \, \sigma(dt) = \int_T \int_{-\infty}^{+\infty} \bigl|Q\bigl(t, K_{\xi}(t, t)^{1/2} s\bigr)\bigr| \, \gamma_1(ds) \, \sigma(dt) \ge d_0 \sum_{k=1}^{n} \int_T |c_k(t)| K_{\xi}(t, t)^{k/2} \, \sigma(dt). \]
Conversely, if the integral on the right is finite, then the integral on the left is finite as well. □
5.10.12. Example. Let $(w_t)_{t \ge 0}$ be a standard Wiener process, let $f$ be a measurable function on $[0, 1]$, and let $Q(s) = \sum_{k=m}^{n} c_k s^k$, where $c_k \in \mathbb{R}^1$ and $m \ge 1$. Then
\[ \text{either} \quad \int_0^1 |f(t) Q(w_t)| \, dt < \infty \ \text{a.e.} \quad \text{or} \quad \int_0^1 |f(t) Q(w_t)| \, dt = \infty \ \text{a.e.} \]
The latter is equivalent to $\int_0^1 |f(t)| t^{m/2} \, dt = \infty$.
Let us discuss second order measurable polynomials on a locally convex space $X$ with a centered Radon Gaussian measure $\gamma$ having the Cameron-Martin space $H = H(\gamma)$. It follows from Corollary 5.10.4 that the class of all measurable functions that are continuous second order polynomials along $H$ is precisely $\mathcal{P}_2(\gamma)$. In the finite dimensional case, any second order polynomial is written as $Q + l + c$, where $Q$ is a quadratic form, $l$ is a linear function, and $c$ is a number. The same representation is valid for continuous second order polynomials in infinite dimensions. However, for measurable polynomials in infinite dimensions, there is no natural way of separating "quadratic forms" from constants. Indeed, let $\{\xi_n\} \subset X^*$ be an orthonormal sequence in $X^*_{\gamma}$. Then the continuous finite dimensional quadratic forms $S_n = n^{-1} \sum_{k=1}^{n} \xi_k^2$ converge a.e. and in $L^2(\gamma)$ to 1. Thus, we arrive at the class $\mathcal{X}_0 \oplus \mathcal{X}_2$ as a reasonable candidate for the space of measurable quadratic forms in infinite dimensions.
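The convergence of the quadratic forms $n^{-1} \sum_{k=1}^{n} \xi_k^2$ to the constant 1 is the law of large numbers for squares of independent standard Gaussian variables, and it can be observed in a small simulation. The sketch below is only an illustration: the sample sizes, the seed and the helper `quadratic_average` are our own assumptions.

```python
import random
from typing import Dict


def quadratic_average(n: int, rng: random.Random) -> float:
    """Return n^{-1} * sum of squares of n independent N(0, 1) samples."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n


rng = random.Random(0)  # fixed seed, so that repeated runs give the same numbers
averages: Dict[int, float] = {n: quadratic_average(n, rng) for n in (10, 1000, 100000)}
```

Since each $\xi_k^2$ has variance 2, the average over $n$ terms deviates from 1 by an amount of order $\sqrt{2/n}$, which is about $0.0045$ for $n = 100000$.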
5.10.13. Proposition. Let $F \in \mathcal{P}_2(\gamma)$. Then there exist an orthonormal basis $\{\xi_n\} \subset X^*_{\gamma}$, two sequences $\{\alpha_n\} \in l^2$ and $\{c_n\} \in l^2$ and a number $c$ such that
\[ F = c + \sum_{n=1}^{\infty} c_n \xi_n + \sum_{n=1}^{\infty} \alpha_n (\xi_n^2 - 1) \quad \gamma\text{-a.e.}, \tag{5.10.3} \]
where both series converge $\gamma$-a.e. and in all $L^p(\gamma)$. Conversely, given $\{\xi_n\}$, $\{\alpha_n\}$, $\{c_n\}$ and $c$ with the aforementioned properties, both series in (5.10.3) converge $\gamma$-a.e. and in all $L^p(\gamma)$ and $F \in \mathcal{P}_2(\gamma)$.
PROOF. By Proposition 5.10.6, $D_H^2 F(x) = A$ a.e., where $A$ is a symmetric Hilbert-Schmidt operator on $H$. Note that $D_H F(x) = A(x) + v$ a.e. for some constant vector $v \in H$, since $D_H A = A$. Let $\{e_n\}$ be an orthonormal basis in $H$ consisting of the eigenvectors of $A$ corresponding to eigenvalues $a_n$. Then one has $(a_n) \in l^2$. Put $\xi_n = \hat{e}_n$, $\alpha_n = a_n/2$, $c = (F, 1)_{L^2(\gamma)}$, and $c_n = (v, e_n)_H$. Then $\{c_n\} \in l^2$. Let us define $F_0$ by means of the right-hand side in (5.10.3). Since $\mathbb{E}(\xi_n^2 - 1) = 0$ and $\sum_{n=1}^{\infty} \mathbb{E}|\alpha_n (\xi_n^2 - 1)|^2 < \infty$, the corresponding series converge almost everywhere by a classical result due to Kolmogorov and Khintchine (see [697, Ch. IV, §2, Theorem 1]). By Corollary 5.5.8, we have convergence in all $L^p(\gamma)$.
Clearly, $D_H F_0(x) = A(x) + v$ a.e., for one has $\partial_{e_n} F_0(x) = c_n + 2\alpha_n \xi_n(x)$ a.e. and $a_n e_n = A e_n$. Since $F$ and $F_0$ have equal integrals, we get $F = F_0$ a.e. The last claim has already been proved. Note that $F = c + \hat{v} - \tfrac{1}{2} \delta A$. □
5.10.14. Remark. It follows from the results above that the elements of $\mathcal{X}_0 \oplus \mathcal{X}_2$ ("measurable quadratic forms") are precisely the functions $F \in W^{2,1}(\gamma)$ orthogonal to $\mathcal{X}_1$ in $L^2(\gamma)$ such that $D_H^2 F(x) = A$ a.e. for some symmetric Hilbert-Schmidt operator $A$ on $H$. This corresponds to $c_n = 0$ in (5.10.3).
5.10.15. Remark. Note that. if F E Xo 6X2 is nonnegative, then in (5.10.3) one has X
an > 0.
>0.
Ean 0 such that exp(rQ) E LI(y). Moreover, this is true for any e < IID, Q(IC('H).
PROOF. This is readily seen from Proposition 5.10.13 and equality (4.8.5). 0
5.10.19. Example. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$ and let $\varrho \in L^1(\gamma)$ be such that the measure $\mu := \varrho \cdot \gamma$ is Gaussian. Then $\log \varrho$ is a second order measurable polynomial, i.e., $\log \varrho \in \mathcal{P}_2(\gamma)$.
PROOF. The claim is obvious in the finite dimensional case. Let $\{e_n\}$ be an orthonormal basis in $H(\gamma)$, where $\xi_n = \hat{e}_n \in X^*$, let $P_n(x) = \sum_{i=1}^{n} \xi_i(x) e_i$, and let $\varrho_n$ be the conditional expectation of $\varrho$ with respect to the $\sigma$-field generated by $P_n$ and the measure $\gamma$ (see Corollary 3.5.2). Note that $\varrho_n = r_n \circ P_n$ a.e., where $r_n$ is the density of the measure $\mu \circ P_n^{-1}$ with respect to $\gamma \circ P_n^{-1}$. By the finite dimensional case, $r_n = \exp F_n$, where $F_n$ is a second order polynomial on $\mathbb{R}^n$. Recall that $\{\varrho_n\}$
is a martingale convergent to $\varrho$ in $L^1(\gamma)$ and almost everywhere. Therefore, the continuous second order polynomials $F_n \circ P_n$ converge a.e. to $\log \varrho$ (note that $\varrho > 0$ a.e.). Hence $\log \varrho \in \mathcal{P}_2(\gamma)$. □
Suppose we want to investigate the distribution of a continuous quadratic form $Q$ on a locally convex space $X$ with a centered Radon Gaussian measure $\gamma$. An algorithm suggested by the preceding discussion is this: we write the restriction of $Q$ to the Cameron-Martin space $H = H(\gamma)$ as $(Ah, h)_H$, where $A$ is a symmetric Hilbert-Schmidt operator, and find the eigenvalues of $A$. To be more specific, let $X = C[0, 1]$, $\gamma = P^W$, and let
\[ Q(x) = \int_0^1 p(t) x(t)^2 \, dt, \quad \text{where } p \in L^1[0, 1]. \]
Then the restriction of $Q$ to $H(P^W) = W_0^{2,1}[0, 1]$ is given by $(Ah, h)_{H(P^W)}$, where
\[ Ah(t) = \int_0^t \int_s^1 p(u) h(u) \, du \, ds, \]
which is verified by the integration by parts formula (recall that $(\varphi, \psi)_{H(P^W)} = (\varphi', \psi')_{L^2[0,1]}$). Since $(Ah)'' = -ph$, $Ah(0) = 0$ and $(Ah)'(1) = 0$, the eigenvalue equation $Ah = \lambda h$ leads to the boundary value problem
\[ \lambda h''(t) = -p(t) h(t), \quad h(0) = 0, \quad h'(1) = 0. \]
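For the simplest weight $p \equiv 1$ the operator becomes $Ah(t) = \int_0^1 \min(t, u) h(u) \, du$, and the eigenvalue problem $\lambda h'' = -h$ with $h(0) = 0$ and $h'(1) = 0$ (the second condition comes from $(Ah)'(1) = 0$) has the solutions $h_k(t) = \sin((k - 1/2)\pi t)$ with $\lambda_k = ((k - 1/2)\pi)^{-2}$. The following Python sketch compares the largest of these, $4/\pi^2$, with a numerical approximation. The discretization by a Riemann sum, the grid size and the use of power iteration are illustrative assumptions of ours, not a method taken from the text.

```python
import math


def top_eigenvalue_min_kernel(size: int, iterations: int = 30) -> float:
    """Approximate the largest eigenvalue of h -> int_0^1 min(t, u) h(u) du.

    The integral is replaced by a Riemann sum on the grid t_i = i / size and
    the eigenvalue is found by power iteration, an illustrative choice.
    """
    grid = [(i + 1) / size for i in range(size)]
    vec = [1.0] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        image = [sum(min(t, u) * v for u, v in zip(grid, vec)) / size for t in grid]
        norm_vec = math.sqrt(sum(c * c for c in vec))
        norm_image = math.sqrt(sum(c * c for c in image))
        eigenvalue = norm_image / norm_vec
        vec = [c / norm_image for c in image]  # renormalise to avoid underflow
    return eigenvalue


approx = top_eigenvalue_min_kernel(200)
exact = 4.0 / math.pi ** 2  # lambda_1 = ((1 - 1/2) * pi) ** (-2)
```

The two numbers differ by a discretization error that decreases as the grid is refined.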
Measurable polynomials on the classical Wiener space can be described as the multiple stochastic integrals defined as follows. We shall take the Wiener space $(C[0, 1], P^W)$ as a probability space. Let $u \in L^2([0, 1]^n)$, where $[0, 1]$ is equipped with Lebesgue measure. Since $I_n(u) \in \mathcal{X}_n$ for simple kernels $u$, the same is true for arbitrary kernels (see the definition of the multiple stochastic integral $I_n(u)$ in Section 2.11). Thus, $I_n(u)$ is a measurable polynomial on the space $(C[0, 1], P^W)$. Conversely, any $P^W$-measurable polynomial can be written as a sum of multiple stochastic integrals. This can be seen in several different ways. For example, one can verify that the linear span of the multiple integrals of the simple functions is dense in $\mathcal{X}_n$ and then use (2.11.10) to show that if a sequence $\{I_n(u_j)\}$, where the $u_j$'s are simple kernels, converges in $L^2(P^W)$, then $\{u_j\}$ converges in $L^2([0, 1]^n)$. Another possibility is to show that, given an orthonormal basis $\{\varphi_i\}$ in $L^2[0, 1]$, the Hermite polynomial $\prod_{i=1}^{r} H_{n_i}(\xi_i)$, where $\xi_i = \int_0^1 \varphi_i(t) \, dw_t$ and $n_1 + \cdots + n_r = n$, coincides, up to a factor, with $I_n(u)$, where $u$ is expressed as a certain product of the functions $\varphi_i(t_j)$ (see [399, §6.6]).
5.11. Differentiability of H-Lipschitzian functions
The classical Rademacher theorem states that any Lipschitzian mapping $F$ from $\mathbb{R}^n$ to $\mathbb{R}^k$ is Fréchet differentiable almost everywhere. This result has no direct extensions to the infinite dimensional case. The principal reason is not the lack of infinite dimensional analogues of Lebesgue measure, but just the existence of Lipschitzian mappings between Hilbert spaces that have no points of the Fréchet differentiability at all (although as shown in [625], any real-valued Lipschitzian function on a Hilbert space has a point of the Fréchet differentiability). However, the Rademacher theorem can be reformulated (in the finite dimensional case) in
equivalent ways that admit infinite dimensional generalizations. We discuss here one of such possibilities. The proof of the following theorem is completely analogous to the proof of Theorem 5.11.2 below.
5.11.1. Theorem. Let $X$ be a separable normed space and let $F\colon X \to Y$ be a locally Lipschitzian mapping with values in a Banach space $Y$ with the Radon-Nikodym property. Then $F$ is Gateaux differentiable (and Hadamard differentiable) everywhere, except, possibly, at the points of some Borel set, which is zero with respect to every nondegenerate Radon Gaussian measure on $X$.
In particular, this implies the Fréchet differentiability along any compactly embedded normed space $E$. According to Problem 5.12.23, the Gâteaux differentiability in Theorem 5.11.1 cannot be replaced by the Fréchet one along the whole space $X$. However, if one admits differentiability along a smaller subspace, it is quite natural to impose the Lipschitz condition only along this subspace. This leads to the following question. Let $F$ be a mapping from a locally convex space $X$ to a Banach space $Y$ with the Radon-Nikodym property, measurable with respect to a nondegenerate Radon Gaussian measure $\mu$ on $X$, such that, for $\mu$-almost all $x$, one has the estimate
$$\|F(x+h) - F(x)\|_Y \le C\|h\|_E, \quad \forall h \in E,$$
where $E$ is some normed space continuously embedded into $X$. Is $F$ differentiable along $E$ $\mu$-a.e.? It is shown below that this question is answered positively for the Gâteaux differentiability and negatively for the Fréchet differentiability.
5.11.2. Theorem. Let $\mu$ be a Radon Gaussian measure on a locally convex space $X$, $H = H(\mu)$, let $Y$ be a separable Banach space with the Radon-Nikodym property, and let $F\colon X \to Y$ be a measurable mapping such that $\mu$-a.e. one has
$$\|F(x+h) - F(x)\|_Y \le C|h|_H, \quad \forall h \in H. \tag{5.11.1}$$
Then:
(i) there is a set $\Omega$ with $\Omega + H(\mu) = \Omega$ and $\mu(\Omega) = 1$ such that (5.11.1) holds true for every $x \in \Omega$; in particular, there exists a modification of $F$ satisfying (5.11.1) for all $x$;
(ii) $\mu$-a.e. there exists the Gâteaux derivative $D_HF$ and $\|D_HF(x)\|_{\mathcal{L}(H,Y)} \le C$ for $\mu$-a.e. $x$;
(iii) $F \in W^{p,1}(\mu, Y)$ for all $p \in (1, \infty)$.
PROOF. Put $H = H(\mu)$. Assertion (i) has already been proved in Lemma 4.5.2. Let $\{e_n\}$ be an orthonormal basis in $H$ and let $H_n$ be the linear span of its first $n$ elements. Let $X_n$ be any closed subspace in $X$ algebraically complementing $H_n$. On the finite dimensional subspaces $y + H_n$, $y \in X_n$, one can choose conditional Gaussian measures absolutely continuous with respect to the natural Lebesgue measures on these subspaces. Therefore, by virtue of the finite dimensional Rademacher theorem (which applies due to the Radon-Nikodym property of $Y$), the Gâteaux derivatives $D_{H_n}F$ exist $\mu$-a.e. Let $M$ be the set of all those points $x$ at which all Gâteaux derivatives $D_{H_n}F(x)$, $n \in \mathbb{N}$, exist. Let us show that $F$ is Gâteaux differentiable along $H$ at any point $a \in M$. Note that on the linear span $L$ of all subspaces $H_n$ we have a well-defined linear mapping $G\colon h \mapsto \partial_hF(a)$, which is continuous by virtue of (5.11.1), hence extends uniquely to an operator
$G \in \mathcal{L}(H, Y)$. Let $h \in H$. Let us choose a sequence $\{h_n\} \subset L$ with $|h_n - h|_H \to 0$. The estimate
$$\Bigl\|\frac{F(a+th_n) - F(a)}{t} - \frac{F(a+th) - F(a)}{t}\Bigr\|_Y \le C|h_n - h|_H$$
implies that, as $n \to \infty$, the vectors
$$\frac{F(a+th_n) - F(a)}{t} - \partial_{h_n}F(a)$$
converge to the limit
$$\frac{F(a+th) - F(a)}{t} - G(h)$$
uniformly in $t$. Therefore,
$$\lim_{t\to0}\frac{F(a+th) - F(a)}{t} = G(h),$$
which means the Gâteaux differentiability. Clearly,
$\|D_HF\|_{\mathcal{L}(H,Y)} \le C$ a.e.

Let us prove (iii). By the integration by parts formula, it suffices to show that $F \in L^p(\mu, Y)$, because then the Gâteaux derivative $D_HF\colon X \to \mathcal{L}(H,Y)$ serves as the generalized derivative. Note that the mapping $\partial_hF$ is measurable for every $h \in H$, since it coincides $\mu$-a.e. with the limit of the measurable mappings $n\bigl(F(x + n^{-1}h) - F(x)\bigr)$. Since $Y$ is a separable Banach space, it suffices to show that the $Y$-norm of $F$ is in $L^p(\mu)$. This follows from Theorem 4.5.7, since the function $x \mapsto \|F(x)\|_Y$ is $H$-Lipschitzian.

It is clear from the proof given that the statement remains valid if $H$ is replaced by an arbitrary normed space $E$ that is linearly embedded into $X$ in such a way that $E$ contains a countable everywhere dense set from $H$.

5.11.3. Corollary. If the conditions in Theorem 5.11.2 are satisfied, then, for any normed space $B$ compactly embedded into $H$, the Fréchet derivative $D_BF$ exists $\mu$-almost everywhere.
In general, the last corollary is not valid for the space $H$ itself.

5.11.4. Example. Let $X = \mathbb{R}^\infty$, $\mu = \bigotimes_{n=1}^\infty\mu_n$, where $\mu_n$ is the standard Gaussian measure on the real line, let $H = l^2$, and let $F\colon X \to l^2$, $F(x) = (f_n(x_n))_{n=1}^\infty$, where the $f_n$'s are $2^{1-n}$-periodic functions on the real line such that $f_n(t) = t$ if $t \in [0, 2^{-n})$, $f_n(t) = 2^{1-n} - t$ if $t \in [2^{-n}, 2^{1-n})$. Then $H = H(\mu)$ and $F$ is Lipschitzian along $H$, but at no point is Fréchet differentiable along $H$.

This example can be easily modified to make $F$ everywhere Gâteaux differentiable along $H$. Certainly, we could take some Hilbert space $X$ instead of $\mathbb{R}^\infty$. In [86] there is an example of a probability measure $\mu$ on $\mathbb{R}^\infty$ such that it is quasi-invariant along $H = l^2$ and the function $f(x) = \sup_n|x_n|$ is $\mu$-a.e. finite and Lipschitzian along $H$, but $\mu$-a.e. is not Fréchet differentiable along $H$. It was conjectured in [230] that a similar example exists also for a Gaussian measure. Although this conjecture seems likely to be true, we have no such examples.
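The mechanism behind Example 5.11.4 can be checked numerically on finitely many coordinates. The sketch below is only an illustration under stated assumptions: it takes $F(x) = (f_n(x_n))$ with the sawtooth functions $f_n$ of period $2^{1-n}$, truncates to 12 coordinates, and uses hypothetical helper names (`sawtooth`, `frechet_remainder_ratio`, `increment_norm`). For the increments $h_n = 2^{1-n}e_n$, which tend to zero in $l^2$, periodicity gives $F(x + h_n) = F(x)$, while the only candidate derivative (the diagonal of slopes $\pm1$) sends $h_n$ to a vector of the same norm, so the Fréchet remainder ratio stays equal to 1.

```python
import math

def sawtooth(n, t):
    """f_n: 2^(1-n)-periodic, equal to t on [0, 2^-n) and 2^(1-n) - t on [2^-n, 2^(1-n))."""
    period = 2.0 ** (1 - n)
    r = t % period                      # lies in [0, period) for positive period
    return r if r <= period / 2 else period - r

def sawtooth_slope(n, t):
    """Derivative of f_n away from the kinks: +1 on the rising part, -1 on the falling part."""
    period = 2.0 ** (1 - n)
    return 1.0 if (t % period) < period / 2 else -1.0

def F(x):
    """Truncation of F(x) = (f_n(x_n)) to the coordinates present in x."""
    return [sawtooth(n, x_n) for n, x_n in enumerate(x, start=1)]

def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))

def increment_norm(x, h):
    """|F(x + h) - F(x)| in the l^2 norm."""
    shifted = [a + b for a, b in zip(x, h)]
    return l2_norm([a - b for a, b in zip(F(shifted), F(x))])

def frechet_remainder_ratio(x, n):
    """|F(x + h) - F(x) - DF(x)h| / |h| for h = 2^(1-n) e_n, DF(x) = diag(f_n'(x_n))."""
    step = 2.0 ** (1 - n)
    shifted = list(x)
    shifted[n - 1] += step
    diff = [a - b for a, b in zip(F(shifted), F(x))]
    diff[n - 1] -= sawtooth_slope(n, x[n - 1]) * step
    return l2_norm(diff) / step
```

Since each $f_n$ is 1-Lipschitzian, `increment_norm(x, h)` never exceeds the $l^2$ norm of `h`, whereas `frechet_remainder_ratio(x, n)` does not decay as `n` grows.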
5.11.5. Remark. Recall that, by Corollary 4.5.4, if condition (5.11.1) is fulfilled for every $h \in H$ on a full measure set dependent on $h$, then $F$ has a modification for which (5.11.1) is fulfilled for all $x \in X$ and $h \in H$ simultaneously.

The following example is borrowed from [230].
5.11.6. Example. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $S \subset X^*$ be compact in the topology $\sigma(X^*, X)$. Then, for $\gamma$-a.e. $x$, the function $f \mapsto f(x)$ on $S$ attains its maximum at a unique point.
PROOF. Put $\varphi(x) = \sup_{f\in S}f(x)$. The set $S$ is bounded in $X^*_\gamma$ by Problem 4.10.18, hence $\sup_{f\in S}f(h) \le C|h|_H$ for some $C$. Therefore, the function $\varphi$ is $H$-Lipschitzian. Let $x$ belong to the full measure set of the points where there exists the Gâteaux derivative of $\varphi$ along $H$. We observe that there is a unique $g \in S$ such that $g(x) = \varphi(x)$. Indeed, suppose that there is $k \in S$ such that $k \ne g$ and $k(x) = \varphi(x)$. Then $g(h) > k(h)$ for some $h \in H$. We have $\varphi(x+th) \ge g(x+th)$ and $\varphi(x-th) \ge k(x-th)$, whence
$$\partial_h\varphi(x) \ge \lim_{t\to0}t^{-1}[g(x+th) - g(x)] = g(h).$$
On the other hand, $\partial_h\varphi(x)$ [...] 1. Suppose that $\|D_HF(x)\|_{\mathcal{L}(H,Y)} \le C$ a.e. for some constant $C$. Then $F$ admits a modification $F_0$ such that
$$\|F_0(x+h) - F_0(x)\|_Y \le C|h|_H, \quad \forall x \in X,\ \forall h \in H.$$
In particular, $\exp(\alpha\|F\|_Y^2) \in L^1(\gamma)$ for all $\alpha < (2C^2)^{-1}$.
PROOF. Let $h \in H$. By Proposition 5.4.2, there is a modification $F_0$ of $F$ such that the mapping $G_x\colon t \mapsto F_0(x+th)$ is locally absolutely continuous. In addition, for a.e. $x$, the derivative of this modification in $t$ coincides with $\partial_hF(x+th)$ (the generalized derivative) for a.e. $t$. Since $\|\partial_hF\|_Y \le C|h|_H$ a.e., we conclude that $G_x$ is Lipschitzian with constant $C|h|_H$, in particular, $\|F_0(x+h) - F_0(x)\|_Y \le C|h|_H$. Since $F_0 = F$ a.e. and $h \in H(\gamma)$, we obtain that $F_0(x+h) = F(x+h)$ a.e., whence $\|F(x+h) - F(x)\|_Y \le C|h|_H$ a.e. It remains to apply Lemma 4.5.4.

5.11.8. Corollary. Let $f \in$ [...] be such that $|D_Hf|_H \le C$ a.e. Then $f$ has a modification $f_0$ with $|f_0(x+h) - f_0(x)| \le C|h|_H$ for all $x \in X$ and $h \in H(\gamma)$.
5.11.9. Remark. It should be noted that $\|D_HF\|_{\mathcal{H}}$ need not be uniformly bounded a.e. even if $F \in W^{2,1}(\gamma, H)$ is $H$-Lipschitzian. Indeed, let $\gamma$ be the countable product of the standard Gaussian measures on $\mathbb{R}^1$ and $H = l^2$. Put $F(x) = (2^{-n}f_n(x_n))_{n=1}^\infty$, where $f_n \in C_0^\infty(\mathbb{R}^1)$, $0 \le f_n \le 1$, $|f_n'| \le 2^n$, the set $\{t\colon f_n'(t) = 2^n\}$ contains a nontrivial interval $J_n$ for every $n \in \mathbb{N}$, and $\|f_n'\|_{L^2(\gamma_1)} \le 1$, where $\gamma_1$ is the standard Gaussian measure on $\mathbb{R}^1$. Then $F$ is $H$-Lipschitzian and belongs to $W^{2,1}(\gamma, H)$, since $\|D_HF(x)\|^2_{\mathcal{H}} = \sum_{n=1}^\infty 2^{-2n}|f_n'(x_n)|^2$ is $\gamma$-integrable. However, $\|D_HF\|_{\mathcal{H}}$ is unbounded on any full measure set, since $\|D_HF(x)\|^2_{\mathcal{H}} \ge k$ on the set $\{x\colon (x_1, \dots, x_k) \in J_1\times\dots\times J_k\}$ having positive $\gamma$-measure.
Now, following [420], we discuss an interesting modification of the classical problem of extending a Lipschitzian mapping $f$, defined on a subset $A$ in a normed (or just metric) space $X$ and taking values in a normed space $Y$. One of the first results in this direction was obtained by McShane [539], who proved that any real Lipschitzian function $f$ on an arbitrary subset of a metric space $X$ extends with the preservation of the Lipschitz constant to the whole space (the corresponding extension is given by a simple explicit formula). The situation is more complicated for multidimensional mappings. For example, it may happen (see [183]) that $X$ and $Y$ are Banach spaces, but some mapping $f\colon A \to Y$ has no Lipschitzian extensions at all (even without any restriction on its Lipschitz constant). One of the best known positive results is Valentine's theorem (see [183]), which states that any Lipschitzian mapping $f$, defined on a subset of a Hilbert space $X$ and taking values in a Hilbert space $Y$, has an extension to all of $X$ with the same Lipschitz constant.
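The explicit formula behind McShane's extension is easy to try out on a finite set. The following is a minimal sketch, assuming a finite subset $A$ of a metric space and the upper-envelope form $F(x) = \max_{y\in A}[f(y) - \lambda d(x,y)]$; the function name `mcshane_extension` and its parameters are illustrative choices, not notation from the text.

```python
def mcshane_extension(points, values, lip, dist):
    """Return F(x) = max over y in A of [f(y) - lip * dist(x, y)].

    points : the finite set A
    values : f(y) for y in A, in the same order
    lip    : a Lipschitz constant of f on A
    dist   : the metric of the ambient space
    """
    def extended(x):
        return max(v - lip * dist(x, y) for y, v in zip(points, values))
    return extended


# Usage: extend a function given at three points of the real line.
A = [0.0, 1.0, 3.0]
f_on_A = [0.0, 1.0, 0.5]          # Lipschitzian on A with constant 1
F = mcshane_extension(A, f_on_A, 1.0, lambda x, y: abs(x - y))
```

The extension agrees with $f$ on $A$ because $f(x) \ge f(y) - \lambda d(x,y)$ for $x, y \in A$, and it is $\lambda$-Lipschitzian as a maximum of $\lambda$-Lipschitzian functions.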
5.11.10. Theorem. Let $X$ be a Souslin topological vector space (e.g., a separable Fréchet space), $E \subset X$ a linear subspace equipped with a norm $\|\cdot\|_E$ such that the ball $U_E = \{h \in E\colon \|h\|_E \le 1\}$ is a Souslin set in $X$, let $A \subset X$ be some Souslin set, and let $f\colon A \to \mathbb{R}^1$ be a function which has the following properties:
a) the sets $\{x \in A\colon f(x) > c\}$ are Souslin;
b) for all $h \in E$ and $x \in A$ such that $x + h \in A$, one has
$$|f(x+h) - f(x)| \le \lambda\|h\|_E. \tag{5.11.2}$$
Then the function $f$ extends to a function $F\colon X \to \mathbb{R}^1$, having property a) (in particular, universally measurable, i.e., measurable with respect to every Borel measure
on $X$) and satisfying inequality (5.11.2) for all $x \in X$, $h \in E$.

PROOF. Put $A_x = (x + E)\cap A$. Obviously, the function defined by the formula
$$F(x) = \sup_{y\in A_x}\bigl[f(y) - \lambda\|x-y\|_E\bigr],$$
whenever $A_x$ is nonempty, and $F(x) = 0$ if $A_x$ is empty, has property (5.11.2) and, for each $x \in A$, coincides with $f(x)$ (since $f(x) \ge f(y) - \lambda\|x-y\|_E$ if $x, y \in A$). We have to prove that the sets $\{x\colon F(x) > c\}$ are Souslin (whence it follows that the function $F$ is universally measurable). Note that the foregoing formula is a straightforward generalization of McShane's formula [539]. It is clear that in the definition of $F$, when taking sup, one can replace $A_x$ by $A$. Therefore, $F(x) = 0$ if $x$ does not belong to the set $X_0 = A + E$, which is Souslin as the image of the Souslin set $A\times E$ in $X\times X$ under the continuous mapping $(x,y) \mapsto x + y$ (note that the fact that the unit ball in $E$ is a Souslin set in $X$ implies that so are all the balls in $E$, hence also $E$ itself). Let us extend $\|\cdot\|_E$ to $X$, letting $\|x\|_E = +\infty$ if $x \notin E$. For any number $r$, the set $\{x \in X\colon \|x\|_E \le r\}$ is Souslin, since it coincides with the ball of radius $r$ in $E$. By the continuity of the mapping $X\times X \to X$, $(x,y) \mapsto x - y$, the set $\{(x,y) \in X\times X\colon \|x-y\|_E \le r\}$ is Souslin for any number $r$. Hence, for all $c \in \mathbb{R}^1$, the set $\{(x,y) \in X_0\times A\colon f(y) - \lambda\|x-y\|_E > c\}$ is Souslin as well, since it is the union over all rational $r$ of the Souslin sets
$$\{(x,y) \in X_0\times A\colon f(y) > r\}\cap\{(x,y) \in X_0\times A\colon r > c + \lambda\|x-y\|_E\}.$$
Put $G(x,y) = f(y) - \lambda\|x-y\|_E$, $x \in X_0$, $y \in A$. It is clear that
$$\{x \in X_0\colon F(x) > c\} = p\bigl(\{(x,y) \in X_0\times A\colon G(x,y) > c\}\bigr),$$
where $p\colon X\times X \to X$ is the natural projection onto the first factor. Therefore, the set $\{x \in X_0\colon F(x) > c\}$ is Souslin.
A related but weaker statement was proved in [792], where it was shown that if $f$ is a function, measurable with respect to a Gaussian measure $\mu$ on a separable Banach space $X$, defined on a $\mu$-measurable set $A$ and satisfying on it the Lipschitz condition along the Cameron-Martin space $H$, then there exists a $\mu$-measurable function on $X$, satisfying the same Lipschitz condition and equal to $f$ $\mu$-a.e. on $A$. This statement follows from the theorem above, since $f$ has a Borel modification, satisfying the same condition on some Borel set $B \subset A$ with $\mu(B) = \mu(A)$. In fact, due to the existence of Souslin supports for Gaussian measures, this result extends to general locally convex spaces $X$.

Let us mention several open problems related to the results presented above.

(i) Does there exist a Lipschitzian function $f$ on a separable Hilbert space, whose set of all points of Fréchet differentiability is zero with respect to all nondegenerate Gaussian measures?
(ii) Let $\mu$ be a centered Gaussian measure on a separable Hilbert space $X$ and let $f$ be a real Borel function on $X$, which is Lipschitzian along $H = H(\mu)$. Can it happen that the set of all points of Fréchet differentiability of $f$ along $H$ is $\mu$-zero?
(iii) Let $B$ be a Borel set in a locally convex (say, in a separable Hilbert) space $X$ equipped with a Gaussian measure $\gamma$ with $H(\gamma) = H$. What can be said about the points of Fréchet differentiability of the function $d_B$, defined in Example 5.4.10, along $H$?
5.12. Complements and problems

Generalized Poincaré inequalities

In Chapter 1, several generalizations of the Poincaré inequality were presented. All those results extend immediately to the infinite dimensional case. For the reader's convenience, let us give the corresponding formulations. The next result follows from Proposition 1.10.3 (certainly, it can be proved directly by the same reasoning).
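5.12.1. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $f \in W^{2,n}(\gamma)$. Then, denoting by $\mathbb{E}$ the integrals with respect to $\gamma$, we have
$$\mathbb{E}f^2 - (\mathbb{E}f)^2 = \sum_{k=1}^{n-1}\frac{(-1)^{k+1}}{k!}\mathbb{E}\bigl(\|D_H^kf\|^2_{\mathcal{H}_k}\bigr) + \frac{(-1)^{n+1}}{(n-1)!}\int_0^\infty 2e^{-2nt}\,\mathbb{E}\bigl(\|T_tD_H^nf\|^2_{\mathcal{H}_n}\bigr)\,dt.$$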
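5.12.2. Corollary. Suppose that $f \in W^{2,2n}(\gamma)$. Then
$$\sum_{k=1}^{2n}\frac{(-1)^{k+1}}{k!}\mathbb{E}\bigl(\|D_H^kf\|^2_{\mathcal{H}_k}\bigr) \le \mathbb{E}f^2 - (\mathbb{E}f)^2 \le \sum_{k=1}^{2n-1}\frac{(-1)^{k+1}}{k!}\mathbb{E}\bigl(\|D_H^kf\|^2_{\mathcal{H}_k}\bigr).$$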
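For polynomials on the real line with the standard Gaussian measure, the two-sided bounds of Corollary 5.12.2 can be verified exactly. The sketch below assumes the corollary is read as: the alternating sums $\sum_{k=1}^{m}\frac{(-1)^{k+1}}{k!}\mathbb{E}|f^{(k)}|^2$ bound the variance from above for odd $m$ and from below for even $m$. Polynomials are stored as coefficient lists, and all function names are illustrative.

```python
from math import factorial

def gaussian_moment(j):
    """E Z^j for a standard normal Z: 0 for odd j, (j-1)!! for even j."""
    if j % 2:
        return 0
    m = 1
    for i in range(j - 1, 0, -2):
        m *= i
    return m

def expectation(coeffs):
    """E f(Z) for f(x) = sum_k coeffs[k] * x**k."""
    return sum(c * gaussian_moment(k) for k, c in enumerate(coeffs))

def derivative(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

def square(coeffs):
    out = [0] * max(2 * len(coeffs) - 1, 0)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(coeffs):
            out[i + j] += a * b
    return out

def variance(coeffs):
    return expectation(square(coeffs)) - expectation(coeffs) ** 2

def alternating_sum(coeffs, m):
    """sum_{k=1}^{m} (-1)^(k+1) / k! * E |f^(k)(Z)|^2."""
    total, g = 0.0, list(coeffs)
    for k in range(1, m + 1):
        g = derivative(g)
        total += (-1) ** (k + 1) / factorial(k) * expectation(square(g))
    return total
```

For $f(x) = x^3$ the variance is 15, while the partial sums are 27, 9, 15 for $m = 1, 2, 3$, so the sums with $m = 1$ and $m = 2$ bracket the variance and the sum becomes exact once $m$ reaches the degree.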
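5.12.3. Corollary. Suppose that $f \in \bigcap_{k\ge1}W^{2,k}(\gamma)$ and $\frac{1}{k!}\mathbb{E}\bigl(\|D_H^kf\|^2_{\mathcal{H}_k}\bigr) \to 0$. Then
$$\mathbb{E}f^2 - (\mathbb{E}f)^2 = \sum_{k=1}^\infty\frac{(-1)^{k+1}}{k!}\mathbb{E}\bigl(\|D_H^kf\|^2_{\mathcal{H}_k}\bigr).$$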
The following is the infinite dimensional version of Proposition 1.10.6.
5.12.4. Proposition. Let y be a centered Radon Gaussian measure on a locally convex space X and let f E 11'4 2(y) be positive a.e. Then, for every s E [0, 1]. one has l' r f (Lf)2dy+s 1 f IDHfI2 dy-s(1-s) 1 IDf21
f X
X
f
dy,
X
X
provided all the integrals exist. If f E
a
11'2.2(7) is positive
a.e.. then
dy 5 f (Lf)2dy+sf /IDfl. dy,
X
X
X
provided the last integral exists.
Beckner's finite dimensional result [47] mentioned in Chapter 1 extends automatically to the infinite dimensional case and yields the following generalized Poincaré inequalities: let $f \in W^{2,1}(\gamma)$, where $\gamma$ is a centered Radon Gaussian measure on a locally convex space $X$, and let $1 \le p \le 2$, $e^{-t} = \sqrt{p-1}$. Then
$$\int_X|f|^2\,d\gamma - \int_X|T_tf|^2\,d\gamma \le (2-p)\int_X|D_Hf|_H^2\,d\gamma,$$
$$\int_X|f|^2\,d\gamma - \Bigl(\int_X|f|^p\,d\gamma\Bigr)^{2/p} \le (2-p)\int_X|D_Hf|_H^2\,d\gamma.$$
1, there exists an injective symmetric compact operator $K$ on $\mathcal{H}_n$ such that $\forall f \in F$ $D_H^nf \in K(\mathcal{H}_n)$ $\gamma$-a.e. and
$$\sup_{f\in F}\|K^{-1}D_H^nf\|_{L^p(\gamma,\mathcal{H}_n)} < \infty.$$
Let the class $W^\infty(\gamma)$ be equipped with its topology of a Fréchet space by means of the seminorms $\|\cdot\|_{p,n}$, $p, n \in \mathbb{N}$. The set $F \subset W^\infty(\gamma)$ is relatively compact if and only if the following two conditions are satisfied:
(i) $\sup_{f\in F}\|f\|_{p,r} < \infty$, $\forall p > 1$, $r \ge 1$;
(ii) for any $n \ge 1$, there exists an injective symmetric compact operator $K$ on $\mathcal{H}_n$ such that $\forall f \in F$ $D_H^nf \in K(\mathcal{H}_n)$ $\gamma$-a.e. and $\sup_{f\in F}$ [...] $< \infty$.
It should be noted that the Sobolev classes can be defined for derivatives along Hilbert subspaces $E \subset X$ different from $H(\gamma)$, i.e., one can consider the completions of $\mathcal{FC}^\infty$ with respect to the norm $\|f\|_{p,k,E} :=$ [...], provided the closability condition is satisfied, i.e., any sequence that is Cauchy with respect to the norm $\|\cdot\|_{p,k,E}$ and converges to zero in $L^p(\gamma)$ converges to zero with respect to the norm $\|\cdot\|_{p,k,E}$. Embedding theorems for these more general classes are obtained in [271]. The corresponding capacities are of interest as well (in general, such capacities are not tight). These objects arise, e.g., in connection with linear stochastic differential equations. In general, there is no log-Sobolev inequality for such classes. For more details, see [271], [276].
Negligible sets

Let us recall several concepts of a "zero-set" in the infinite dimensional case. Since in this case there are no reasonable substitutes for Lebesgue measure (nor is there any preference in the choice of, say, a distinguished nondegenerate Gaussian measure among the continuum of mutually singular measures), one has to introduce this concept without making use of any specific fixed measure. One of the definitions of this sort is due to Christensen [165], who suggested calling a Borel set $A$ in a Banach space $X$ universally zero if there exists a nonzero Borel measure $\mu$ such that $\mu(A + x) = 0$ for all $x$. Certainly, this definition applies also to locally convex spaces, so that in this subsection we deal with a locally convex space $X$. Another definition is due to Aronszajn [22], who introduced the following class $\mathcal{A}_0$ of exceptional Borel sets. For every vector $e$ in a locally convex space $X$, let us denote by $\mathcal{A}^e$ the class of all Borel sets $A$ such that $\operatorname{mes}(t\colon x + te \in A) = 0$ for every $x \in X$, where mes is Lebesgue measure. For any sequence $\{e_n\}$ in $X$, let $\mathcal{A}^{\{e_n\}}$ be the class of all sets of the form $A = \bigcup_{n=1}^\infty A_n$, where $A_n \in \mathcal{A}^{e_n}$ for all $n$.

5.12.7. Definition. A set $A$ is called exceptional ($A \in \mathcal{A}_0$) if it belongs to the class $\mathcal{A}^{\{e_n\}}$ for every sequence $\{e_n\}$ with the dense linear span in $X$.

The corresponding class is smaller than that of Christensen, but both coincide with the class of all Borel sets of Lebesgue measure zero in the finite dimensional spaces. Then Phelps [599] introduced the class $\mathcal{G}_0$ of Gaussian null sets.
5.12.8. Definition. A Borel set $A$ is called a Gaussian null set if it is zero for every nondegenerate (i.e., having full support) Radon Gaussian measure on $X$.

According to [599], $\mathcal{A}_0 \subset \mathcal{G}_0$, but it remains open whether this inclusion is strict. Finally, in [69] the following definition was introduced.

5.12.9. Definition. A Borel set $A$ is called negligible if it is zero for every Radon measure which is differentiable along vectors from a dense set. The class of all Borel negligible sets is denoted by $\mathcal{P}_0$.

The classes introduced so far can be extended in such a way that in the finite dimensional case they will embrace all sets of Lebesgue measure zero (not necessarily Borel). To this end, let us denote by $\mathcal{L}$ the class of all sets in $X$ which are measurable with respect to every Radon measure on $X$ which is differentiable along vectors from a dense subspace (dependent on the measure). Then Definitions 5.12.7 - 5.12.9 extend naturally to $\mathcal{L}$ (in particular, in the definitions of $\mathcal{A}^e$ and $\mathcal{A}^{\{e_n\}}$ the sets from $\mathcal{L}$ are now admissible). Let us denote the classes obtained in this way by $\mathcal{A}$, $\mathcal{G}$, and $\mathcal{P}$, respectively. Then the following relationships hold true (see [70], [71], [72] for the proof).
5.12.10. Theorem. One has $\mathcal{A}_0 \subset \mathcal{G}_0 = \mathcal{P}_0$ and $\mathcal{A} = \mathcal{G} = \mathcal{P}$.

Note, in particular, that $\mathcal{G}_0 \subset \mathcal{A}$; however, it is open whether the sets $A_n$ in the corresponding decomposition can be chosen in $\mathcal{B}(X)$ (and not only in $\mathcal{L}$). Similar classes $\mathcal{A}_G$ and $\mathcal{G}_G$ arise if, instead of $\mathcal{L}$, we consider the class $\mathcal{L}_G$ of all sets measurable with respect to all nondegenerate Radon Gaussian measures on $X$. Then, by virtue of the same reasoning as in [71], [72], one has $\mathcal{A}_G = \mathcal{G}_G$. We have no examples distinguishing the classes $\mathcal{L}$ and $\mathcal{L}_G$ (or the classes $\mathcal{G}_G$ and $\mathcal{P}$). The class of negligible (or Gaussian null) sets is invariant with respect to affine isomorphisms of $X$ and possesses a lot of other useful properties of finite dimensional Lebesgue zero sets (see Theorem 5.11.1). However, it is not stable with respect to nonlinear diffeomorphisms (see Chapter 6). Let us mention an interesting open problem posed in [834]: is it true that the image of any Borel negligible set under a Lipschitzian mapping in a Banach space is universally zero in the sense of Christensen?
Laplacian $\Delta_H$

Let $X$ be a locally convex space with a centered Radon Gaussian measure $\gamma$. In applications, besides the Ornstein-Uhlenbeck semigroup, one encounters the semigroup $(P_t)_{t\ge0}$ defined on the Banach space $C_u(X)$ of uniformly continuous bounded functions on $X$ (with the sup-norm) by the formula
$$P_tf(x) = \int_X f(x + \sqrt{t}\,y)\,\gamma(dy).$$

5.12.11. Proposition. $(P_t)_{t\ge0}$ is a strongly continuous semigroup on $C_u(X)$.

PROOF. The semigroup property follows from Proposition 2.2.10 and equality $\alpha^2 + \beta^2 = 1$, where $\alpha = t^{1/2}(t+s)^{-1/2}$, $\beta = s^{1/2}(t+s)^{-1/2}$. Let $f$ be a bounded uniformly continuous function on $X$ and let $\varepsilon > 0$. There exists an absolutely convex neighborhood of zero $V$ such that $|f(x) - f(z)| \le \varepsilon$ whenever $x - z \in V$. Clearly, $|P_tf(x) - P_tf(z)| \le \varepsilon$. Further, there is $n$ such that $\gamma(nV) > 1 - \varepsilon$. If $0 < t < n^{-2}$, then $\sqrt{t}\,y \in V$ for every $y \in nV$, whence, for every $x \in X$,
$|f(x) - P_tf(x)|$ [...] 1, then the function
$$h \mapsto \int_X f(x + h + \sqrt{t}\,y)\,\gamma(dy)$$
is infinitely Fréchet differentiable on $H$ and its derivative of order $n$ is an $n$-linear Hilbert-Schmidt mapping. Precise estimates of the Hilbert-Schmidt norms of these derivatives are found in [474].
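The semigroup property of $P_tf(x) = \int f(x + \sqrt{t}\,y)\,\gamma(dy)$ can be checked exactly in the simplest setting. The sketch below assumes $X = \mathbb{R}^1$ with the standard Gaussian measure and applies $P_t$ to polynomials (which are unbounded and hence lie outside $C_u(X)$; they are used only because the Gaussian moments make the computation exact). The names `gaussian_moment` and `heat_semigroup` are illustrative.

```python
from math import comb

def gaussian_moment(j):
    """E Z^j for a standard normal Z: 0 for odd j, (j-1)!! for even j."""
    if j % 2:
        return 0
    m = 1
    for i in range(j - 1, 0, -2):
        m *= i
    return m

def heat_semigroup(coeffs, t):
    """Coefficients of P_t f, where f(x) = sum_k coeffs[k] * x**k and
    P_t f(x) = E f(x + sqrt(t) Z), expanded with the binomial formula."""
    out = [0.0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(0, k + 1, 2):          # odd moments of Z vanish
            out[k - j] += c * comb(k, j) * t ** (j // 2) * gaussian_moment(j)
    return out


# Usage: P_t x^2 = x^2 + t, and P_t P_s = P_{t+s} on a quartic.
quartic = [1.0, -2.0, 0.5, 3.0, 1.0]
lhs = heat_semigroup(heat_semigroup(quartic, 0.3), 0.7)
rhs = heat_semigroup(quartic, 1.0)
```

Here `lhs` and `rhs` agree coefficient by coefficient, which reflects the identity $\alpha^2 + \beta^2 = 1$ used in the proof: the sum of independent centered Gaussian vectors with variances $s$ and $t$ is Gaussian with variance $s + t$.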
Measures of finite energy

The collection of sets of capacity zero is much smaller than that of measure zero. For example, a closed hyperplane in a locally convex space $X$ with a nondegenerate Gaussian measure $\gamma$ has measure zero, but positive capacities. On the other hand, there exist measures mutually singular with $\gamma$, but vanishing on the sets of capacity zero (we shall see in Chapter 6 that surface measures generated by sufficiently nondegenerate Sobolev class functions have such a property). Therefore, it is of interest to describe all Radon measures vanishing on every set of $C_{p,r}$-capacity zero (such measures are called measures of finite $C_{p,r}$-energy). Measures of finite energy can be characterized by means of positive generalized functions. More precisely, we have the following two results.
5.12.12. Theorem. Let $\nu$ be a nonnegative Radon measure on $X$ such that $\nu(A) = 0$ for each Borel set $A$ with $C_{p,r}(A) = 0$. Then there is a strictly positive bounded Borel function $\varrho$ on $X$ such that the functional
$$f \mapsto \int_X f(x)\varrho(x)\,\nu(dx)$$
is continuous on $H^{p,r}(\gamma)$.
PROOF. Let $h(t) = \sup\{\nu(A)\colon C_{p,r}(A) \le t\}$. Note that $\lim_{t\to0}h(t) = 0$. Indeed, otherwise for some $c > 0$ there is a sequence of Borel sets $A_n$ such that $C_{p,r}(A_n) \le 2^{-n}$ and $\nu(A_n) \ge c$. This leads to a contradiction, since for the set $B = \limsup A_n = \bigcap_{n\ge1}\bigcup_{k\ge n}A_k$ one has $\nu(B) \ge \limsup\nu(A_k) \ge c$ and $C_{p,r}(B) \le \sum_{k\ge n}C_{p,r}(A_k) \le 2^{1-n}$ for any $n$, whence $C_{p,r}(B) = 0$.
We shall apply the following result due to Maurey (see, e.g., [800, Lemma 5.5, §5, Ch. VI]). Let $C$ be a convex set of $\nu$-measurable nonnegative functions which is bounded in the space $L^0(\nu)$ of all $\nu$-measurable functions equipped with the metric
$$\int_X\frac{|f(x) - g(x)|}{1 + |f(x) - g(x)|}\,\nu(dx).$$
Boundedness of a set $M$ means that, for every ball $W$ centered at zero, there is $\lambda > 0$ such that $\lambda M \subset W$. Then there exists a strictly positive measurable function $\varrho$ such that
$$\sup_{f\in C}\int_X f(x)\varrho(x)\,\nu(dx) \le 1.$$
Let us take for $C$ the set
$$C = \{f \in H^{p,r}(\gamma)\colon \|f\|_{p,r} \le 1,\ f \ge 0\}.$$
Note that all functions in $H^{p,r}(\gamma)$ are $\nu$-measurable. Indeed, $\gamma$-equivalent functions coincide as elements of $H^{p,r}(\gamma)$, and every function $f \in H^{p,r}(\gamma)$ possesses a quasicontinuous modification $g$. By condition, given $\varepsilon > 0$, there is $\delta < \varepsilon$ such that $\nu(A) < \varepsilon$ provided $C_{p,r}(A) < \delta$. Since there is a closed set $Z$ with $C_{p,r}(X\setminus Z) < \delta$ on which $g$ is continuous, we get $\nu(X\setminus Z) < \varepsilon$, whence the measurability of $g$ with respect to $\nu$. Note that $\nu(x\colon f(x) > r) \le h(r)$ if $\|f\|_{p,r} \le r^2$, since $C_{p,r}(x\colon f(x) > r) \le r^{-1}\|f\|_{p,r} \le r$. Thus,
$$\int_X\frac{f(x)}{1 + f(x)}\,\nu(dx) \le h(r) + r, \quad \forall f \in r^2C,$$
which, together with the continuity of $h$ at zero, means that $C$ is bounded in the space $L^0(\nu)$. Let $\varrho$ be a function from Maurey's lemma cited above. Finally, note that, for any function $f$ with $\|f\|_{p,r} \le 1$, the function $g = V_r(|V_r^{-1}(f)|)$ is nonnegative, $\|g\|_{p,r} = \|f\|_{p,r} \le 1$, and $|f| \le g$ quasi-everywhere. Hence,
$$\int_X|f(x)|\varrho(x)\,\nu(dx) \le \int_X g(x)\varrho(x)\,\nu(dx) \le 1.$$
Therefore, the functional $f \mapsto \int_X f(x)\varrho(x)\,\nu(dx)$ is continuous on $H^{p,r}(\gamma)$.
5.12.13. Theorem. Let $\Psi$ be a linear functional on $\mathcal{FC}^\infty$ continuous with respect to the norm of $H^{p,r}(\gamma)$ and nonnegative in the sense that $\langle\Psi, \varphi\rangle \ge 0$ for every nonnegative smooth cylindrical function $\varphi$. Then there exists a nonnegative Radon measure $\nu_\Psi$ on $X$ such that
$$\langle\Psi, \varphi\rangle = \int_X\varphi(x)\,\nu_\Psi(dx), \quad \forall\varphi \in \mathcal{FC}^\infty.$$
See [740], [412] for the proof of this result and its extensions.

5.12.14. Remark. A Borel set $B$ has $C_{p,r}$-capacity zero if and only if every Radon measure of finite $(p,r)$-energy vanishes on $B$ (see [412] for a proof).
More on measurable polynomials

The arguments used in the proof of Theorem 5.10.7 yield the following result relating measurable polynomials to the classes $S_d(\gamma)$ introduced in Section 4.3. Note that according to the results above, every real-valued $\gamma$-measurable polynomial of degree $d$ has a version $F_0$ that belongs to $S_d(\gamma)$ and $A^d(h)$ coincides with $\partial_h^dF_0(x)$ (which is a constant).
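5.12.15. Corollary. Let $\gamma$ be a Radon Gaussian measure on a locally convex space $X$, let $Y$ be a separable Fréchet space, and let $F \in \mathcal{P}_d(\gamma, Y)$. Suppose that $\varphi\colon Y \to (-\infty, +\infty]$ is such that $\varphi(x+y) \le \varphi(x) + \varphi(y)$ and $\varphi(\lambda x) = \lambda\varphi(x)$ for $\lambda \ge 0$ and that $\varphi(F(x)) + \varphi(-F(x)) < +\infty$ $\gamma$-a.e. Then $F$ has a version $F_0$ such that $\varphi\circ F_0 \in S_d(\gamma)$ and $A^d = \varphi\circ F_d$, where $F_d$ is the $d$-homogeneous part of the polynomial mapping
$$h \mapsto \int_X F(x+h)\,\gamma(dx).$$
Finally, one has
$$\lim_{t\to\infty}t^{-2/d}\log\gamma(\varphi\circ F > t) = -\frac12\Bigl(\sup_{h\in U_H}\varphi(F_d(h))\Bigr)^{-2/d}, \tag{5.12.1}$$
$$\exp\bigl(\alpha|\varphi\circ F|^{2/d}\bigr) \in L^1(\gamma), \quad \forall\alpha < \Bigl(2\sup_{h\in U_H}|\varphi(F_d(h))|^{2/d}\Bigr)^{-1}. \tag{5.12.2}$$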
Let us give a formulation of the results on the small ball comparisons obtained in Chapter 4 in terms of the second derivative along $H$.

5.12.16. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $Q = \sum_{n=1}^\infty\alpha_n\eta_n^2$, where $\eta_n \in X^*_\gamma$, [...] $\sum_{n=1}^\infty|\alpha_n| < \infty$. Then
$$\int_X Q(x)\,\gamma(dx) = \frac12\operatorname{trace}D_H^2Q. \tag{5.12.3}$$
In particular, this is true if $Q$ is a sequentially continuous quadratic form on $X$.
PROOF. Recall that the last claim has already been shown (see (5.10.4)). Clearly, for any $n \in \mathbb{N}$ and $h \in H$, one has $\partial_h\eta_n^2 = 2(\eta_n, \hat h)_{L^2(\gamma)}\eta_n$. Let $\{e_i\}$ be an orthonormal basis in $H$. Recall that $D_H^2Q$ is a nuclear Hilbert-Schmidt operator (see Remark 5.10.15). Then (5.12.3) follows by the absolute convergence of the series
$$\sum_{n=1}^\infty\alpha_n\|\eta_n\|^2_{L^2(\gamma)} = \sum_{i=1}^\infty\sum_{n=1}^\infty\alpha_n(\eta_n, \hat e_i)^2_{L^2(\gamma)},$$
where the Parseval equality was used.
5.12.17. Theorem. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $q$ be a $\gamma$-measurable seminorm such that $V_\varepsilon = \{q \le \varepsilon\}$ has positive measure for all $\varepsilon > 0$, and let $Q \in \mathcal{P}_2(\gamma)$ be such that $D_H^2Q$ has summable (possibly, to $-\infty$ or $+\infty$) trace
$$\operatorname{trace}D_H^2Q = \sum_{n=1}^\infty 2\alpha_n^+ + \sum_{n=1}^\infty 2\alpha_n^-,$$
where $2\alpha_n^+$ and $2\alpha_n^-$ are, respectively, the positive and negative eigenvalues of $D_H^2Q$, and at least one of these two series converges to a finite number. Then
$$\lim_{\varepsilon\to0}\frac{1}{\gamma(V_\varepsilon)}\int_{V_\varepsilon}\exp Q(x)\,\gamma(dx) = \exp\Bigl(I_0(Q) - \frac12\operatorname{trace}D_H^2Q\Bigr). \tag{5.12.4}$$
The limit in (5.12.4) is 1 if $Q$ is a sequentially continuous quadratic form.

As follows from the results in Chapter 4, if $Q$ does not satisfy the aforementioned condition, then, for a suitable norm $q$, there is no limit in (5.12.4).
5.12.18. Example. The assertion in Theorem 5.12.17 (hence also in Theorem 4.8.3) is valid if $Q = c + f + Q_0$, where $c \in \mathbb{R}^1$, $f \in X^*_\gamma$, and $Q_0$ is a sequentially continuous quadratic form on $X$.

Let us now discuss the case where $q$ is a seminorm which may not be a norm on $H$. To this end, let us denote by $\mathcal{F}$ the $\sigma$-field generated by the sequence $\{f_n\}$ that determines the seminorm $q$, and by $\mathbb{E}^{\mathcal{F}}$ the conditional expectation with respect to $\mathcal{F}$. In addition, let us denote by $\mathcal{G}$ the $\sigma$-field generated by the orthogonal complement to $\{f_n\}$ in $X^*_\gamma$, and by $\mathbb{E}^{\mathcal{G}}$ the corresponding conditional expectation. Note that if $q$ is a norm on $H$, then $\mathbb{E}^{\mathcal{F}}f = f$ and $\mathbb{E}^{\mathcal{G}}$ coincides with the expectation with respect to $\gamma$. In terms of the restriction of $q$ to $H$, the $\sigma$-field $\mathcal{F}$ is the one generated by $\hat h$, $h \in Z^\perp$, where $Z = \operatorname{Ker}q$, and $\mathcal{G}$ is the one generated by $\hat h$, $h \in Z$. The proof of the following result is found in [84].
5.12.19. Proposition. Suppose that the conditions in Theorem 5.12.17 are satisfied except that now $q$ is a $\gamma$-measurable seminorm which may not be a norm on $H$. Then
linJ.(7+Q) = 11eXp(lE Q)IIL (.i exp(-Ztrace
(5.12.5)
This limit is finite and positive if is a nuclear operator. In particular. if p is a Gaussian measure on X such that dµ/dy = exp Q, then there exists the limit lira p({q 5
s}).
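According to [569], for any strictly positive absolutely continuous function $\varphi$ on $[0,1]$ such that $\varphi'$ has bounded variation on $[0,1]$ and $\int_0^1\varphi(t)^{-2}\,dt = 1$, one has
$$\lim_{\varepsilon\to0}\frac{P^W\bigl(x\colon |x(t)| \le \varepsilon\varphi(t),\ \forall t \in [0,1]\bigr)}{P^W\bigl(x\colon \sup_t|x(t)| \le \varepsilon\bigr)} = \sqrt{\frac{\varphi(1)}{\varphi(0)}}. \tag{5.12.6}$$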
This result can be deduced from Theorem 4.8.3. To this end. note that the nominator on the left can be written as Ptt' (x: q(Tx) < 1 } and F` 1(N) have measure zero and find a full measure Borel set Yo C Bo\F-1(N). As above, we put
Yk=F(Yk_1)nYoifk> 1, Yk = F-1(Yk+i)nY0 ifk 0), which is increasing to Y. Then the sets To(B) = To(B - nh) increase to the set T0(Y). Since y o T-1(To(B)) = y(B) = 1/2 and y o T- (To(Y)) = y(Y) = 1 by virtue of the injectivity of To = T on Y, we get, by the equivalence of measures, that y(To(B)) < I and y(To(Y)) = 1, which is a contradiction.
As has already been noted, one cannot omit Lusin's condition (N) in (i). However, if this condition is not imposed, it follows from the proof that $T$ has a modification with Lusin's property (N) (which thereby satisfies condition (ii) in Corollary 6.3.3).
6.3.4. Corollary. Let $T$ be a $\gamma$-measurable linear mapping such that its proper linear version has property (E) on $H$. Suppose that $f$ is an $H$-Lipschitzian function. Then $f\circ T$ is $H$-Lipschitzian as well and $D_H(f\circ T)(x) = T^*D_Hf(Tx)$. If $f$ takes values in a separable Hilbert space $E$ and is $H$-Lipschitzian, then $f\circ T$ is also and
$$D_H(f\circ T)(x) = D_Hf(Tx)T.$$

PROOF. The function $f\circ T$ is measurable, since $T$ is an absolutely continuous transformation of $\gamma$. By condition and property (E), we get $|f\circ T(x+h) - f\circ T(x)| \le C|Th|_H$
1 such that $I + pK + pK^* + pK^*K > 0$, we have
$$\|\Lambda_K\|_{L^p(\gamma)} = \exp\Bigl(-\frac12\|K\|^2_{\mathcal{H}}\Bigr)\,|\det\nolimits_2(I+K)|\,\det\nolimits_2(I + pK + pK^* + pK^*K)^{-1/(2p)}. \tag{6.4.4}$$
PROOF. By Proposition 3.7.10, the integral of $|K(x)|_H^2$ equals $\|K\|^2_{\mathcal{H}}$. Hence
$$Q(x) := \delta K(x) - \frac12|K(x)|_H^2 + \frac12\|K\|^2_{\mathcal{H}}$$
belongs to $\mathcal{X}_2$. By Proposition 5.10.13, $Q = \sum_{n=1}^\infty\alpha_n(\xi_n^2 - 1)$, where the $2\alpha_n$'s are the eigenvalues of $D_H^2Q$ and $\{\xi_n\}$ is some orthonormal basis in $H$. According to
6.4.
Radon-Nikodym densities
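equality (4.8.5), we have
$$\int_X\exp(pQ)\,d\gamma = \prod_{n=1}^\infty\frac{\exp(-p\alpha_n)}{(1 - 2p\alpha_n)^{1/2}}$$
if $2p\alpha_n < 1$. This is exactly $|\det\nolimits_2(I - pD_H^2Q)|^{-1/2}$. On the other hand, $D_H\delta K(x) = -Kx - K^*x$ according to equality (5.8.7). Hence $D_HQ(x) = -Kx - K^*x - K^*Kx$ and $D_H^2Q = -K - K^* - K^*K$.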
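Formula (6.4.4) can be sanity-checked in dimension one, where every quantity is explicit. The sketch below rests on several assumptions that should be read as such: $\gamma$ is the standard Gaussian measure on $\mathbb{R}^1$, $K$ is multiplication by a number $k$, $\delta K(x) = k - kx^2$, $\det_2(1+k) = (1+k)e^{-k}$, and $\Lambda_K = |\det_2(I+K)|\exp(\delta K - \frac12|Kx|^2)$. The integral is approximated by a plain trapezoidal rule, and all function names are illustrative.

```python
import math

def gaussian_integral(g, half_width=12.0, steps=20_000):
    """Trapezoidal approximation of the integral of g against N(0, 1)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

def density(k, x):
    """Lambda_K(x) for K = k on R^1 (assumed form, see the lead-in)."""
    det2 = (1.0 + k) * math.exp(-k)
    return abs(det2) * math.exp(k - k * x * x - 0.5 * (k * x) ** 2)

def lp_norm_closed_form(k, p):
    """Right-hand side of (6.4.4) in dimension one."""
    a = 2.0 * p * k + p * k * k            # pK + pK* + pK*K for a scalar K
    det2_k = (1.0 + k) * math.exp(-k)
    det2_a = (1.0 + a) * math.exp(-a)
    return math.exp(-0.5 * k * k) * abs(det2_k) * det2_a ** (-1.0 / (2.0 * p))
```

With `k = 0.4` and `p = 1.5`, the numerical value of `gaussian_integral(lambda x: density(k, x) ** p) ** (1 / p)` matches `lp_norm_closed_form(k, p)`, and the integral of `density(k, x)` itself equals 1, as it must for a probability density.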
6.4.5. Theorem. Let $S = (I+K)^{-1}$. Then one has
$$\frac{d(\gamma\circ T^{-1})}{d\gamma}(x) = \frac{1}{\Lambda_K(Sx)}, \qquad \frac{d(\gamma\circ S^{-1})}{d\gamma}(x) = \Lambda_K(x). \tag{6.4.5}$$

PROOF. The existence of densities is already known. In order to show (6.4.5)
suppose first that $\gamma$ is the standard Gaussian measure on $\mathbb{R}^n$. Then $\delta K(x) = \operatorname{trace}K - (Kx, x)$. Now the expression on the right in (6.4.5) for $(\Lambda_K\circ S)^{-1}$ can be written as
$$|\det T|^{-1}\exp\Bigl[(KT^{-1}x, T^{-1}x) + \frac12(KT^{-1}x, KT^{-1}x)\Bigr] = |\det T|^{-1}\exp\Bigl[\frac12(x,x) - \frac12(T^{-1}x, T^{-1}x)\Bigr],$$
which is the expression for $d(\gamma\circ T^{-1})/d\gamma$ given by the classical Ostrogradsky-Jacobi formula (see the calculation in the proof of Theorem 6.6.3 below, where nonlinear transformations are considered). In the infinite dimensional case, let us take an orthonormal basis $\{e_n\}$ in $H$ and put $K_n = P_nKP_n$. The finite dimensional operators $K_n$ converge to $K$ in the Hilbert-Schmidt norm. For all sufficiently large $n$, the operators $T_n = I + K_n$ are invertible. Put $S_n = T_n^{-1}$. Then the densities $\varrho_n = d(\gamma\circ T_n^{-1})/d\gamma$ and $r_n = d(\gamma\circ S_n^{-1})/d\gamma$ are given by (6.4.5) and converge a.e. to the expressions on the right-hand sides in (6.4.5) for $T$ and $S$, respectively. It remains to note that these densities are uniformly integrable. Indeed, by (6.4.4), there exists $p > 1$ such that the sequence $\{r_n\}$ is bounded in $L^p(\gamma)$. The same is true for $\{\varrho_n\}$, since $T = S^{-1}$, and $S$ satisfies the same condition as $T$.
6.4.6. Theorem. Two centered Radon Gaussian measures $\mu$ and $\nu$ on a locally convex space $X$ are equivalent precisely when $H(\mu)$ and $H(\nu)$ coincide as sets and there exists an invertible operator $C \in \mathcal{L}(H(\mu))$ such that $CC^* - I \in \mathcal{H}(H(\mu))$ and
$$|h|_{H(\nu)} = |C^{-1}h|_{H(\mu)} \quad\text{for all } h \in H(\mu). \tag{6.4.6}$$
If $C - I \in \mathcal{H}(H(\mu))$, then one has
$$\frac{d\nu}{d\mu}(x) = \frac{1}{\Lambda_{C-I}(C^{-1}x)}. \tag{6.4.7}$$
Finally, if $\mu \sim \nu$, one can find a symmetric operator $C$ with the aforementioned properties.
PROOF. Suppose that $\mu \sim \nu$. Then $H(\mu)$ and $H(\nu)$ coincide as sets and there exists an invertible operator $C \in \mathcal{L}(H)$, where $H := H(\mu)$, such that $(h, k)_{H(\nu)} = (C^{-1}h, C^{-1}k)_H$ for all $h, k \in H$. Then $\nu$ coincides with the image of $\mu$ under the
Chapter 6. Nonlinear Transformations
292
measurable linear mapping C. By virtue of the equivalence of these two measures,
CC' - I E R. Clearly, we can always take for C a nonnegative operator: just replace C by
C
I E 71, hence formula (6.4.5) applies. The existence of
an operator C with property (E) on H(µ) such that (6.4.6) holds true yields that
$\nu = \mu\circ C^{-1}$, whence $\nu\sim\mu$.

6.4.7. Corollary. Let $\mu$ and $\nu$ be two equivalent centered Radon Gaussian measures on a locally convex space $X$. Then there exist an orthonormal basis $\{e_n\}$ in $H(\mu)$ and a sequence $\{\lambda_n\}$ of real numbers not equal to $-1$ such that $\sum_{n=1}^\infty\lambda_n^2 < \infty$ and, for an arbitrary sequence of independent standard Gaussian random variables $\xi_n$ on a probability space $(\Omega, P)$, one has
\[
\mu = P\circ\Bigl(\sum_{n=1}^\infty\xi_ne_n\Bigr)^{-1} \quad\text{and}\quad \nu = P\circ\Bigl(\sum_{n=1}^\infty(1+\lambda_n)\xi_ne_n\Bigr)^{-1}.
\]
PROOF. It suffices to take a symmetric operator $C$ in the previous theorem and use the eigenbasis $\{e_n\}$ of $C - I$ with $(C - I)e_n = \lambda_ne_n$. Then $\sum_{n=1}^\infty\lambda_n^2 < \infty$, the aforementioned series converge in $X$ almost surely and their distributions coincide with $\mu$ and $\nu$, respectively.
6.4.8. Corollary. Two centered Radon Gaussian measures $\mu$ and $\nu$ on a locally convex space $X$ are equivalent if and only if there exists an invertible symmetric nonnegative operator $T$ on $H(\mu)$ such that $T - I\in\mathcal{H}(H(\mu))$ and
\[
(f,g)_{L^2(\nu)} = (TR_\mu f, R_\mu g)_{H(\mu)}, \quad\forall f,g\in X^*. \tag{6.4.8}
\]
An equivalent condition: the norms $\|f\|_{L^2(\mu)}$ and $\|f\|_{L^2(\nu)}$ are equivalent on $X^*$ and the quadratic form $(f,f)_{L^2(\nu)} - (f,f)_{L^2(\mu)}$ on $X^*_\mu$ is generated by a Hilbert-Schmidt operator on $X^*_\mu$.
PROOF. Let $\mu\sim\nu$. Then there exists an invertible symmetric operator $C\in\mathcal{L}(H(\mu))$ with $C - I\in\mathcal{H}(H(\mu))$, and $\nu$ is the image of $\mu$ under the mapping $C$. By Lemma 3.7.8, we have
\[
(f,g)_{L^2(\nu)} = (f\circ C, g\circ C)_{L^2(\mu)} = (R_\mu(f\circ C), R_\mu(g\circ C))_{H(\mu)} = (C^*R_\mu f, C^*R_\mu g)_{H(\mu)}. \tag{6.4.9}
\]
Hence we can put $T = CC^* = C^2$. It is readily seen that $T - I\in\mathcal{H}(H(\mu))$. Clearly, $T$ is invertible. Conversely, suppose that (6.4.8) holds, where $T$ is invertible nonnegative and $T - I\in\mathcal{H}(H(\mu))$. Put $C = \sqrt{T}$. Then writing (6.4.9) backwards, we get $(f\circ C, g\circ C)_{L^2(\mu)} = (f,g)_{L^2(\nu)}$, which implies that $\nu = \mu\circ C^{-1}$, whence we get $\nu\sim\mu$.
6.4.9. Remark. If $\mu$ and $\nu$ are not centered, then $\mu\sim\nu$ precisely when $a_\mu - a_\nu\in H(\mu)$ and the centered measures $\mu_{-a_\mu}$ and $\nu_{-a_\nu}$ satisfy the conditions in the theorem above; otherwise $\mu\perp\nu$.
Let us give a coordinate representation of the Radon-Nikodym densities of equivalent Gaussian measures.
6.4.10. Corollary. Let $\mu$ and $\nu$ be two equivalent Radon Gaussian measures on a locally convex space $X$. Then $d\nu/d\mu = \exp F$, where $F$ is a $\mu$-measurable second order polynomial which admits the following representation:
\[
F(x) = c + \sum_{n=1}^\infty c_n\xi_n(x) + \sum_{n=1}^\infty\alpha_n\bigl(\xi_n(x)^2 - 1\bigr) \quad\mu\text{-a.e.}, \tag{6.4.10}
\]
where $c\in\mathbb{R}^1$, $\sum_{n=1}^\infty c_n^2 < \infty$, $\sum_{n=1}^\infty\alpha_n^2 < \infty$, $\alpha_n < 1/2$, $\{\xi_n\}$ is an orthonormal basis in $X^*_\mu$, and both series converge a.e. and in $L^2(\mu)$. Conversely, if $F$ has such a representation, then $\exp F\in L^1(\mu)$ and the measure $\|\exp F\|^{-1}_{L^1(\mu)}\exp F\cdot\mu$ is Gaussian.
PROOF. Follows from the results above combined with the description of measurable second order polynomials obtained in Chapter 5. Note that the integrability of $\exp F$ is equivalent to the condition $\alpha_n < 1/2$. The necessity of this condition is obvious by the one dimensional case and Fubini's theorem. The sufficiency is readily seen from the fact that $\mathbb{E}\exp\bigl(\alpha(\xi^2 - 1)\bigr) = e^{-\alpha}(1 - 2\alpha)^{-1/2}$ for a standard Gaussian random variable $\xi$ and that the product $\prod_{n=1}^\infty e^{-\alpha_n}(1 - 2\alpha_n)^{-1/2}$ converges if $\alpha_n < 1/2$ and $\sum_{n=1}^\infty\alpha_n^2 < \infty$.
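The one dimensional identity used in this proof, $\mathbb{E}\exp(\alpha(\xi^2-1)) = e^{-\alpha}(1-2\alpha)^{-1/2}$ for $\alpha < 1/2$, can be confirmed by direct quadrature. The sketch below is only an illustration: the function names, the truncation interval and the step count are arbitrary choices, and the trapezoidal rule stands in for the exact Gaussian integral.

```python
import math

def expectation_numeric(alpha, half_width=30.0, steps=60_000):
    """Trapezoidal approximation of E exp(alpha * (xi^2 - 1)) for xi ~ N(0, 1).

    The Gaussian weight is merged into the exponent to avoid overflow.
    Intended for alpha well below 1/2, where the integrand decays quickly.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(alpha * (t * t - 1.0) - 0.5 * t * t)
    return total * h / math.sqrt(2.0 * math.pi)

def expectation_closed_form(alpha):
    """e^{-alpha} (1 - 2 alpha)^{-1/2}, valid for alpha < 1/2."""
    return math.exp(-alpha) / math.sqrt(1.0 - 2.0 * alpha)

for alpha in (0.25, 0.0, -1.0):
    print(alpha, expectation_numeric(alpha), expectation_closed_form(alpha))
```

As $\alpha$ approaches $1/2$ the closed form blows up, which is the one dimensional reason for the condition $\alpha_n < 1/2$.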
In the case where $X$ is a separable Hilbert space and $\mu$ and $\nu$ have covariance operators $K_\mu$ and $K_\nu$, we get $H(\mu) = \sqrt{K_\mu}(X)$. Assuming that $K_\mu$ and $K_\nu$ have dense ranges (which can always be achieved by passing to the closure of $H(\mu)$ in $X$), one can write $C$ in the form $C = \sqrt{K_\nu}\,\sqrt{K_\mu}^{-1}$. On the other hand, $C = \sqrt{K_\mu}\,C_0\sqrt{K_\mu}^{-1}$, where $C_0 = \sqrt{K_\mu}^{-1}\sqrt{K_\nu}\in\mathcal{L}(X)$ is an invertible operator. Since the operator $\sqrt{K_\mu}$ is an isometry of the Hilbert spaces $X$ and $H(\mu)$, we conclude that $C$ on $H(\mu)$ has property (E) precisely when $C_0$ does on $X$. Therefore, the equivalence of the measures $\mu$ and $\nu$ is characterized by the continuity and invertibility of the operator $\sqrt{K_\mu}^{-1}\sqrt{K_\nu}$ together with the condition
\[
\sqrt{K_\mu}^{-1}K_\nu\sqrt{K_\mu}^{-1} - I\in\mathcal{H}(X). \tag{6.4.11}
\]
Obviously, one can interchange the roles of $\mu$ and $\nu$. Let us summarize our observations as follows.
6.4.11. Corollary. Let $X$ be a separable Hilbert space and let $\mu$ and $\nu$ be two Gaussian measures on $X$ with covariance operators $K_\mu$ and $K_\nu$ and means $a_\mu$ and $a_\nu$, respectively. Then $\mu\sim\nu$ precisely when $a_\mu - a_\nu\in\sqrt{K_\mu}(X)$ and there exists an invertible operator $C$ on the space $X$ such that $CC^* - I$ is a Hilbert-Schmidt operator on $X$ and
\[
\sqrt{K_\nu} = \sqrt{K_\mu}\,C. \tag{6.4.12}
\]
Otherwise $\mu\perp\nu$. The existence of such an operator is equivalent to the existence of a Hilbert-Schmidt operator $D$ on $X$ without eigenvalue $-1$ such that $K_\nu - K_\mu = \sqrt{K_\mu}\,D\sqrt{K_\mu}$. If $a_\mu = a_\nu = 0$, then
\[
\frac{d\nu}{d\mu} = \exp\Bigl(\frac12\sum_{n=1}^\infty\bigl[(\alpha_n - 1)\alpha_n^{-1}\eta_n^2 - \log\alpha_n\bigr]\Bigr), \tag{6.4.13}
\]
where the $\alpha_n$'s are the eigenvalues of the symmetric operator $(I + G)(I + G^*)$ corresponding to the eigenbasis $\{\varphi_n\}$ and $\eta_n$ is the element in $X^*_\mu$ generated by the vector $\sqrt{K_\mu}\varphi_n$.
PROOF. We know that the claim reduces to the case where $a_\mu = a_\nu = 0$ (see Chapter 2, in particular, Proposition 2.7.3). Let $\mu\sim\nu$. We may assume again that both $\mu$ and $\nu$ are nondegenerate. Then the operator $C = \sqrt{K_\mu}^{-1}\sqrt{K_\nu}$ is invertible and $CC^* - I\in\mathcal{H}(X)$. If the closure of $H(\mu)$ is a proper subspace $X_0$ in $X$, then we put $C = I$ on the orthogonal complement of $X_0$, which is consistent with (6.4.12), since $\sqrt{K_\mu} = \sqrt{K_\nu} = 0$ on $X_0^\perp$ by the symmetry of these two operators. Let us prove (6.4.13). Since
\[
D := (I + G)(I + G^*) - I
\]
is a symmetric Hilbert-Schmidt operator on $X$, we have $\sum_{n=1}^\infty(\alpha_n - 1)^2 < \infty$. The sequence $\sqrt{K_\mu}\varphi_n$ is an orthonormal basis in $H(\mu)$, hence $\{\eta_n\}$ is an orthonormal basis in $X^*_\mu$. Together with the convergence of the series
\[
\sum_{n=1}^\infty\bigl|\log\alpha_n - (\alpha_n - 1)\alpha_n^{-1}\bigr|
\]
this yields the convergence in $L^2(\mu)$ of the series on the right in (6.4.13). It is straightforward to see that the function $\varrho$ defined by the right-hand side in (6.4.13) is in $L^1(\mu)$. Let us show that the measure $\lambda = \varrho\cdot\mu$ coincides with $\nu$. Note that
\[
(\eta_n, \eta_j)_{L^2(\nu)} = ((I + D)\varphi_n, \varphi_j)_X = \alpha_n(\varphi_n, \varphi_j)_X. \tag{6.4.14}
\]
Indeed, if $u, v\in\sqrt{K_\mu}(X)$, then, denoting by $I(x)$ the element of $X^*_\mu$ generated by $\sqrt{K_\mu}x$, we get (cf. Remark 2.3.3)
\[
(I(u), I(v))_{L^2(\nu)} = (K_\nu K_\mu^{-1/2}u, K_\mu^{-1/2}v)_X = ((I + D)u, v)_X. \tag{6.4.15}
\]
We observe that if $u_j\to u$ in $X$, then $I(u_j)\to I(u)$ in $L^2(\mu)$, hence $I(u_j)\to I(u)$ in $L^2(\nu)$. Therefore, (6.4.15) holds true for any $u, v\in X$; in particular, we get (6.4.14). Now let $\xi = \sum_{n=1}^\infty c_n\eta_n$, where $\sum_{n=1}^\infty c_n^2 < \infty$. Then by the independence of the $\eta_n$'s on $(X, \mu)$ and (6.4.14), simple calculations yield
\[
\int_X\exp(i\xi)\,\varrho\,d\mu = \exp\Bigl(-\frac12\sum_{n=1}^\infty\alpha_nc_n^2\Bigr) = \int_X\exp(i\xi)\,d\nu,
\]
whence $\lambda = \nu$.
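Each factor of the product density in (6.4.13) is the density of a centered Gaussian law with variance $\alpha_n$ with respect to the standard one. The following sketch (hypothetical helper names; the sample values of $\alpha$ and $t$ are arbitrary) compares the factor $\exp\bigl(\tfrac12[(\alpha-1)\alpha^{-1}t^2 - \log\alpha]\bigr)$ with the ratio of the two normal densities on the line.

```python
import math

def normal_pdf(t, variance):
    """Density of the centered Gaussian law with the given variance."""
    return math.exp(-t * t / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def density_factor(t, alpha):
    """One factor of the product formula: exp(((alpha - 1)/alpha * t^2 - log alpha) / 2)."""
    return math.exp(0.5 * ((alpha - 1.0) / alpha * t * t - math.log(alpha)))

for alpha in (0.5, 2.0):
    for t in (-1.3, 0.0, 0.8):
        ratio = normal_pdf(t, alpha) / normal_pdf(t, 1.0)
        print(alpha, t, ratio, density_factor(t, alpha))
```

The agreement of the two columns is just the identity $\alpha^{-1/2}\exp(-t^2/(2\alpha) + t^2/2) = \exp\bigl(\tfrac12[(1 - \alpha^{-1})t^2 - \log\alpha]\bigr)$.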
Note that if $K_\mu$ is injective, then the operator $D$ introduced above equals $D = K_\mu^{-1/2}K_\nu K_\mu^{-1/2} - I$. Then the equivalence condition can be restated as the inclusions $D\in\mathcal{H}(X)$ and $a_\mu - a_\nu\in\sqrt{K_\mu}(X)$ together with the invertibility of $I + D$.

In the just considered Hilbert case, there is a sufficient condition for the equivalence which does not involve the square roots of the covariance operators. Suppose that $H(\mu) = H(\nu)$ and, in addition, that $K_\nu = (I + Q)K_\mu$, where $Q\in\mathcal{H}(X)$ and the operator $I + Q$ is invertible. Then $\mu\sim\nu$. Indeed, let $D = K_\mu^{-1/2}K_\nu K_\mu^{-1/2} - I$. Since $Q = K_\nu K_\mu^{-1} - I$, then $Q = \sqrt{K_\mu}\,D\sqrt{K_\mu}^{-1}$, whence, in the eigenbasis $\{e_n\}$ of the operator $K_\mu$, we get
\[
\sum_{n=1}^\infty(D^2e_n, e_n) = \sum_{n=1}^\infty\bigl(D^2\sqrt{K_\mu}^{-1}e_n, \sqrt{K_\mu}\,e_n\bigr) = \sum_{n=1}^\infty(Q^2e_n, e_n) < \infty,
\]
i.e., (6.4.11) is fulfilled, since $D$ is symmetric.
6.5. Examples of equivalent measures and linear transformations

Let us consider the case where one of the two equivalent Gaussian measures is the Wiener measure $P^W$ on the space $C[0,1]$ or $L^2[0,1]$.
6.5.1. Example. A Gaussian measure $\nu$ on $L^2[0,1]$ is equivalent to the Wiener measure $P^W$ if and only if $a_\nu\in H(P^W)$ and its covariance operator $R$ is an integral operator with a kernel $K_\nu$ (the covariance function of the corresponding Gaussian process) of the form
\[
K_\nu(t,s) = \min(t,s) + \int_0^t\int_0^sQ(u,v)\,du\,dv, \tag{6.5.1}
\]
where $Q\in L^2([0,1]^2)$ is a symmetric function such that the corresponding integral operator has no eigenvalue $-1$. In this case, for a.e. $(t,s)$, one has the equality $Q(t,s) = \partial_t\partial_sK_\nu(t,s)$.

PROOF. It suffices to consider the case where $a_\nu = 0$. Suppose that $\nu\sim P^W$. Let us apply Corollary 6.4.7 and take the corresponding orthonormal basis $\{e_n\}$ in $H(P^W) = W_0^{2,1}[0,1]$ and the sequence $\{\lambda_n\}$. Then the functions $\psi_n(t) = e_n'(t)$ form an orthonormal basis in $L^2[0,1]$. Put
\[
Q(u,v) = \sum_{n=1}^\infty(\lambda_n^2 + 2\lambda_n)\psi_n(u)\psi_n(v)
\]
and note that this series converges in $L^2([0,1]^2)$. The integral operator with kernel $Q$ has no eigenvalue $-1$, since $\lambda_n^2 + 2\lambda_n\ne-1$ due to our condition $\lambda_n\ne-1$. Let $\{\xi_n\}$ be independent standard Gaussian random variables. Since the series $\sum_{n=1}^\infty\xi_n(\omega)e_n(t)$ and $\sum_{n=1}^\infty(1 + \lambda_n)\xi_n(\omega)e_n(t)$ converge in $L^2(P, C[0,1])$ and, for almost every fixed $\omega$, converge in $C[0,1]$, we get that the covariance of $\nu$ is given by the kernel
\[
\sum_{n=1}^\infty(1 + \lambda_n)^2e_n(t)e_n(s),
\]
whereas the function $\min(t,s)$ (the covariance function of the Wiener process) is obtained if $\lambda_n\equiv0$. Therefore,
\[
K_\nu(t,s) - \min(t,s) = \int_0^t\int_0^sQ(u,v)\,du\,dv.
\]
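The series representation of the covariance kernel can be made concrete with one explicit basis. In the sketch below the functions $e_n(t) = \sqrt2\sin((n-\tfrac12)\pi t)/((n-\tfrac12)\pi)$ are an assumption made for illustration (they vanish at zero and their derivatives $\sqrt2\cos((n-\tfrac12)\pi t)$ form an orthonormal basis in $L^2[0,1]$); the text itself works with an arbitrary basis supplied by Corollary 6.4.7. With $\lambda_n\equiv0$ the partial sums approximate $\min(t,s)$.

```python
import math

def e(n, t):
    """Assumed basis function e_n(t) = sqrt(2) sin((n - 1/2) pi t) / ((n - 1/2) pi)."""
    freq = (n - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(freq * t) / freq

def covariance_kernel(t, s, lam=lambda n: 0.0, terms=2000):
    """Partial sum of sum_n (1 + lambda_n)^2 e_n(t) e_n(s)."""
    return sum((1.0 + lam(n)) ** 2 * e(n, t) * e(n, s)
               for n in range(1, terms + 1))

# lambda_n = 0 for all n: the Wiener covariance min(t, s)
print(covariance_kernel(0.3, 0.7), min(0.3, 0.7))
# a square-summable perturbation lambda_n = 1/n gives a different kernel
print(covariance_kernel(0.3, 0.7, lam=lambda n: 1.0 / n))
```

The truncation error of the first sum is bounded by the tail of $\sum 2/((n-\tfrac12)\pi)^2$, so about $10^{-4}$ for 2000 terms.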
Conversely, suppose we have (6.5.1), where the integral operator with kernel $Q$ on $L^2[0,1]$ has eigenvalues $\lambda_n\ne-1$ and eigenbasis $\{\varphi_n\}$. Then the functions $e_n(t) = \int_0^t\varphi_n(s)\,ds$ form an orthonormal basis in $H(P^W) = W_0^{2,1}[0,1]$ and
\[
K_\nu(t,s) = \sum_{n=1}^\infty e_n(t)e_n(s) + \sum_{n=1}^\infty\lambda_ne_n(t)e_n(s), \tag{6.5.2}
\]
where both series converge uniformly. Let $\{\xi_n\}$ be a sequence of independent standard Gaussian random variables. Let us observe that $\lambda_n + 1 > 0$. Indeed, by condition, the quadratic form $Q_0$ with kernel $K_\nu$ is nonnegative on $L^2[0,1]$. It is readily seen that $Q_0$ extends to a continuous nonnegative quadratic form $Q_1$ on the space $C[0,1]^*$ (identified with the space of all signed measures on $[0,1]$) given by
\[
Q_1(m) = \int_0^1\int_0^1\min(t,s)\,m(ds)\,m(dt) + \int_0^1\int_0^1\int_0^t\int_0^sQ(u,v)\,du\,dv\,m(ds)\,m(dt).
\]
Hence, for any $h\in W_0^{2,1}[0,1]$, defining a functional $f$ on $C[0,1]$ by $x\mapsto-(x, h'')_{L^2[0,1]} + x(1)h'(1)$, we have $Q_1(f)\ge0$. This functional coincides $P^W$-a.e. with the stochastic integral of $h'$. Hence the quadratic form with kernel $\min(t,s)$ on $C[0,1]^*$ evaluated at $f$ gives $\|h'\|^2_{L^2[0,1]}$. Integrating by parts in the term involving $Q$, we obtain
\[
Q_1(f) = \|h'\|^2_{L^2[0,1]} + \int_0^1\int_0^1Q(t,s)h'(t)h'(s)\,dt\,ds.
\]
Taking $h' = \varphi_n$, we get $1 + \lambda_n\ge0$. Clearly, the measure $\nu_0$ obtained as the distribution of the sum $\sum_{n=1}^\infty\sqrt{1 + \lambda_n}\,\xi_ne_n$ is equivalent to the Wiener measure (the latter corresponds to $\lambda_n\equiv0$). Now it remains to note that $\nu = \nu_0$, since by (6.5.2) the covariance of $\nu_0$ is given by the same kernel $K_\nu$ as the covariance of $\nu$.

If $s$ is fixed, then the function $t\mapsto K_\nu(t,s)$ is absolutely continuous and its derivative for a.e. $t$ equals
\[
1 + \int_0^sQ(u,t)\,du \ \text{ if } t < s, \qquad \int_0^sQ(u,t)\,du \ \text{ if } t > s.
\]
The result is absolutely continuous in $s$ on the intervals $(0,t)$ and $(t,1)$ and its derivative is $Q(s,t)$ for a.e. $s$ in these two intervals.
6.5.2. Example. Let $\tau$ be a continuously differentiable function with strictly positive derivative on $[0,1]$ such that $\tau(0) = 0$ and $\tau(1) = 1$. Put
\[
Tx(t) = \frac{1}{\sqrt{\tau'(t)}}\,x(\tau(t)).
\]
Then the measure $\nu = P^W\circ T^{-1}$, i.e., the distribution of the process $w_{\tau(t)}/\sqrt{\tau'(t)}$, is equivalent to the Wiener measure $P^W$ if and only if the function $\tau'$ is absolutely continuous and $\tau''\in L^2[0,1]$.
PROOF. Let $\tau'\in W^{2,1}[0,1]$ and $g(t) = 1/\sqrt{\tau'(t)}$. Then $g\in W^{2,1}[0,1]$. Denote by $\theta$ the inverse function to $\tau$. Let us show that the operator $T$ on $H(P^W) = W_0^{2,1}[0,1]$ has property (E). If $h\in W_0^{2,1}[0,1]$, then $Th\in W_0^{2,1}[0,1]$ and
\[
(Th)'(t) = -\tfrac12\tau''(t)(\tau'(t))^{-3/2}h(\tau(t)) + (\tau'(t))^{1/2}h'(\tau(t)) = g'(t)h(\tau(t)) + (\tau'(t))^{1/2}h'(\tau(t)).
\]
In addition, given $y\in W_0^{2,1}[0,1]$, it is easily verified that
\[
T^{-1}y(t) = \sqrt{\tau'(\theta(t))}\,y(\theta(t)) = \frac{1}{\sqrt{\theta'(t)}}\,y(\theta(t))
\]
is a function in $W_0^{2,1}[0,1]$. Therefore, $T$ is invertible on $W_0^{2,1}[0,1]$. Since the operator $V$ on $L^2[0,1]$ given by $Vx(t) = \sqrt{\tau'(t)}\,x(\tau(t))$ is orthogonal, the operator $U$ given by
\[
Uh(t) = \int_0^tVh'(s)\,ds
\]
is orthogonal on $W_0^{2,1}[0,1]$. In order to see that $\nu\sim P^W$, it remains to show that $U - T$ is a Hilbert-Schmidt operator on $H(P^W)$. Indeed, then $TT^* - I$ is a Hilbert-Schmidt operator. We have
\[
[(U - T)h]'(t) = -g'(t)h(\tau(t)), \quad\forall h\in H(P^W),
\]
whence
\[
|(U - T)h|^2_{H(P^W)}\le\|g'\|^2_{L^2[0,1]}\max_t|h(t)|^2,
\]
which gives the desired inclusion $U - T\in\mathcal{H}(H(P^W))$, since we can take an orthonormal basis $\{e_n\}$ in $H(P^W)$ with $\sum_{n=1}^\infty\max_t|e_n(t)|^2 < \infty$. Conversely, assume that $\nu\sim P^W$. Then the covariance function of the process $w_{\tau(t)}/\sqrt{\tau'(t)}$ equals $\min(\tau(t), \tau(s))/\sqrt{\tau'(t)\tau'(s)}$ and has the form indicated in Example 6.5.1. This shows that the function $1/\sqrt{\tau'(t)}$ is in $W^{2,1}[0,1]$, which is equivalent to the inclusion $\tau'\in W^{2,1}[0,1]$, since $\tau'$ is continuous and strictly positive.
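The orthogonality of $V$ used in this proof rests on the change of variables $u = \tau(t)$. A minimal numerical sketch follows; the time change $\tau(t) = (t + t^2)/2$, the test functions and the quadrature parameters are assumptions made for illustration only. It checks that $V$ preserves $L^2[0,1]$ inner products (the surjectivity part of orthogonality is not tested).

```python
import math

def tau(t):
    """Assumed time change with tau(0) = 0, tau(1) = 1 and tau' > 0."""
    return 0.5 * (t + t * t)

def tau_prime(t):
    return 0.5 + t

def trapezoid(func, a=0.0, b=1.0, steps=20_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

def V(x):
    """Vx(t) = sqrt(tau'(t)) x(tau(t))."""
    return lambda t: math.sqrt(tau_prime(t)) * x(tau(t))

x = lambda u: math.cos(3.0 * u)
y = lambda u: u * u

inner_xy = trapezoid(lambda u: x(u) * y(u))
inner_VxVy = trapezoid(lambda t: V(x)(t) * V(y)(t))
print(inner_xy, inner_VxVy)
```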
Let us evaluate the Radon-Nikodym density of the measure $\mu^\xi$ induced by the standard Ornstein-Uhlenbeck process $\xi_t$ on $[0,1]$ with $\xi_0 = 0$ with respect to the Wiener measure $P^W$. Certainly, Girsanov's theorem discussed below yields immediately the equality
\[
\frac{d\mu^\xi}{dP^W}(w) = \exp\Bigl(-\frac12\int_0^1w_t\,dw_t - \frac18\int_0^1w_t^2\,dt\Bigr) = \exp\Bigl(-\frac14w_1^2 + \frac14 - \frac18\int_0^1w_t^2\,dt\Bigr), \tag{6.5.3}
\]
since by Ito's formula one has
\[
\int_0^tw_s\,dw_s = \frac12w_t^2 - \frac12t.
\]
However, in order to have some exercise, we shall follow a longer way based on our general theorems. The measure $\mu^\xi$ on $C[0,1]$ is the image of $P^W$ under the linear mapping $T$ defined from the integral equation
\[
Tx(t) = x(t) - \frac12\int_0^tTx(s)\,ds.
\]
This equation is uniquely solvable and the inverse linear operator $S$ is given by
\[
Sx(t) = x(t) + \frac12\int_0^tx(s)\,ds.
\]
It is easily seen that the operator $Q = S - I$ is nuclear on $H = H(P^W)$ and its complexification has no eigenvalues. It remains to note that $|Qx|^2_H = \|x\|^2_{L^2[0,1]}/4$ and that
\[
\delta Q(x) = -\sum_{n=1}^\infty(Qx, e_n)_H\,\widehat{e}_n(x) = -\frac12\sum_{n=1}^\infty(x, e_n')_{L^2[0,1]}\int_0^1e_n'(s)\,dx(s) = -\frac12\int_0^1x(s)\,dx(s),
\]
where $\{e_n\}$ is any orthonormal basis in $H$. Now (6.4.7) yields (6.5.3).
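The pair of operators $T$ and $S$ can be illustrated on a grid. The sketch below is a discretization chosen for illustration, not a construction from the text: integrals are replaced by the trapezoidal rule, the input is the deterministic path $x(t) = t$, and the function names are hypothetical. For this input the integral equation reduces to $y' = 1 - y/2$, $y(0) = 0$, so that $Tx(t) = 2(1 - e^{-t/2})$.

```python
import math

def apply_T(x, h):
    """Solve y_i = x_i - (1/2) I_i, where I_i is the trapezoidal integral of y on [0, t_i]."""
    y = [x[0]]
    integral = 0.0
    for i in range(1, len(x)):
        # y_i enters its own integral, so solve the linear equation
        # y_i * (1 + h/4) = x_i - integral/2 - (h/4) * y_{i-1}
        y_i = (x[i] - 0.5 * integral - 0.25 * h * y[-1]) / (1.0 + 0.25 * h)
        integral += 0.5 * h * (y[-1] + y_i)
        y.append(y_i)
    return y

def apply_S(y, h):
    """S y_i = y_i + (1/2) I_i with the same trapezoidal integral."""
    out = [y[0]]
    integral = 0.0
    for i in range(1, len(y)):
        integral += 0.5 * h * (y[i - 1] + y[i])
        out.append(y[i] + 0.5 * integral)
    return out

N = 1000
h = 1.0 / N
grid = [i * h for i in range(N + 1)]
x = list(grid)                      # the path x(t) = t
Tx = apply_T(x, h)
STx = apply_S(Tx, h)

err_inverse = max(abs(a - b) for a, b in zip(STx, x))
err_exact = max(abs(v - 2.0 * (1.0 - math.exp(-0.5 * t))) for v, t in zip(Tx, grid))
print(err_inverse, err_exact)
```

Because both maps use the same quadrature, `apply_S` inverts `apply_T` up to rounding, while the comparison with the closed form only holds up to the $O(h^2)$ discretization error.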
6.6. Nonlinear transformations

Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $H = H(\gamma)$. In diverse theoretical problems and applications one encounters the problem of investigating the images of the measure $\gamma$ under nonlinear mappings $T\colon X\to X$. One of the best studied is the situation where $T$ has the following special form:
\[
T(x) = x + F(x),
\]
where $F\colon X\to H$ is a sufficiently regular mapping. It is instructive to have in mind the situation where $\gamma$ is the countable product of the standard Gaussian measures on the line and $F\colon\mathbb{R}^\infty\to l^2$, or $\gamma$ is the classical Wiener measure and $F(x)(t) = \int_0^tu(x)(s)\,ds$, where $u\colon C[0,1]\to L^2[0,1]$. In fact, all the results below are invariant under measurable linear isomorphisms, so that it would be enough to consider one of these two concrete cases. As the investigation of the linear case shows, one should expect that certain conditions connected with Hilbert-Schmidt operators on $H$ will arise. It is intuitively clear that the smooth mappings of the form above behave locally as linear mappings $I + D_HF$. Therefore, a natural candidate to fit Hilbert-Schmidt type conditions is the derivative $D_HF$. This expectation is justified. The principal results of this section state that a mapping $I + F$ transforms $\gamma$ into an equivalent measure if $F$ satisfies certain technical conditions and $I + D_HF$ is invertible on $H$.

We shall start with a lemma which is of independent interest. In this lemma and some subsequent results, we make use of the following trivial observation: if $\gamma$ is a Radon Gaussian measure with the Cameron-Martin space $H$ and $F\colon X\to H$ is a $\gamma$-measurable mapping, then $\gamma\circ T^{-1}$, where $T = I + F$, is a Radon measure concentrated on a linear subspace of $X$ that is a countable union of metrizable compact sets. Indeed, let $Y$ be a full measure linear subspace that is a countable union of metrizable compact subsets of $X$. Then $Y$ has full measure with respect to $\gamma\circ T^{-1}$. Clearly, any Borel measure on $Y$ is Radon (see Appendix).
6.6.1. Lemma. Let $T = I + F\colon X\to X$, where $F\colon X\to H$ is a $\gamma$-measurable mapping such that
IF(x+h)-F(x)lH c)=
r
J
Andy-»0 asn.k--.x, b'c>0.
Hence there exists a Borel mapping $G_0\colon X\to H$ such that $|G_n - G_0|_H\to0$ in measure $\gamma$. Passing to a subsequence, we get the convergence almost everywhere. Put $S_0 = I + G_0$ and note that by Lemma 6.1.8 we have $\gamma\circ S_0^{-1}\ll\gamma$. By the equivalence of the measures $\gamma$ and $\gamma\circ S_n^{-1}$, $n = 0, 1, \ldots$, the full measure set $\Omega_0$ contains a set $\Omega_1$ with $\gamma(\Omega_1) = 1$ such that $S_n(\Omega_1)\subset\Omega_0$ for every $n = 0, 1, \ldots$ On the set $\Omega_1$, one has
\[
|F_n\circ S_n - F_0\circ S_0|_H\le|F_n\circ S_n - F_n\circ S_0|_H + |F_n\circ S_0 - F_0\circ S_0|_H\le\lambda|G_n - G_0|_H + |F_n\circ S_0 - F_0\circ S_0|_H\to0.
\]
On the other hand, $F_n\circ S_n = -G_n\to-G_0$, whence $F_0\circ S_0 = -G_0$ a.e., i.e., $-F_0(x + G_0(x)) = G_0(x)$ a.e. Since the equation $y = -F_0(x + y)$ is uniquely solvable for a.e. $x$ (see the previous lemma), we have $G = G_0$ a.e. Thus, $|G_n - G|_H\to0$ in measure. For the initial version, we have $(I + G)(T(x)) = (I + G)(T_0(x)) = x$ a.e. In order to show that $T((I + G)(x)) = x$ a.e., it suffices to note that the preimage under $I + G$ of the full measure set $\Omega_2 = \{x\colon T(x) = T_0(x)\}$ has full measure. This follows from Lemma 6.1.8, which yields the relationship $\gamma\circ(I + G)^{-1}\ll\gamma$.

In order to formulate the principal results of this section, as in the linear case, we shall need the concept of the regularized Fredholm-Carleman determinant for the operators of the form $I + K$, $K\in\mathcal{H}$. Let us introduce the following notation for mappings $F$ of the class $W^{2,1}_{\mathrm{loc}}(\gamma, H)$:
\[
\Lambda_F(x) := \bigl|\det{}_2(I + D_HF(x))\bigr|\exp\Bigl[\delta F(x) - \frac12|F(x)|_H^2\Bigr].
\]
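6.6.3. Theorem. Let $F\colon X\to H$ be a $\gamma$-measurable mapping such that
\[
|F(x + h) - F(x)|_H\le\lambda|h|_H, \quad\forall h\in H \text{ for } \gamma\text{-a.e. } x, \tag{6.6.4}
\]
where $\lambda < 1$. Suppose that $D_HF(x)$ is a Hilbert-Schmidt operator for a.e. $x$ and that $\gamma$-a.e. one has $\|D_HF(x)\|_{\mathcal{H}}\le M < \infty$. Then:
(i) There exists a full measure set $\Omega$ such that $\Omega + H = \Omega$, $T = I + F\colon\Omega\to\Omega$ is one-to-one and onto, $T(X\setminus\Omega)\subset X\setminus\Omega$. In addition, the inverse mapping $S$ (in the sense that $T(S(x)) = S(T(x)) = x$ for all $x\in\Omega$) has the form $S = I + G$, where $G$ satisfies the condition
\[
|G(x + h) - G(x)|_H\le\lambda(1 - \lambda)^{-1}|h|_H, \quad\forall h\in H, \ \forall x\in\Omega,
\]
and $\|D_HG(x)\|_{\mathcal{H}}\le M(1 - \lambda)^{-1}$;
(ii) The measure $\gamma\circ T^{-1}$ is equivalent to $\gamma$ and the density of the measure $\gamma\circ T^{-1}$ with respect to $\gamma$ has the form
\[
\frac{d(\gamma\circ T^{-1})}{d\gamma}(x) = \Lambda_G(x) = \frac{1}{\Lambda_F(T^{-1}(x))}. \tag{6.6.5}
\]
In addition,
\[
\frac{d(\gamma\circ S^{-1})}{d\gamma}(x) = \Lambda_F(x). \tag{6.6.6}
\]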
PROOF. The existence of the inverse mapping follows from Lemma 6.6.1. We shall now verify all other claims. We shall make use of Theorem 4.5.7 about the exponential integrability of $H$-Lipschitzian mappings, which gives, in particular, the inclusion $|F|_H\in L^2(\gamma)$. Let $\{e_n\}$ be an orthonormal basis in $H$ such that $\widehat{e}_n\in X^*$. We know that the linear mapping $J\colon x\mapsto(\widehat{e}_n(x))$ identifies $\gamma$ with the product $\mu$ of the standard Gaussian measures on the space $\mathbb{R}^\infty$. In particular, there exists a $\mu$-measurable linear mapping $L$ with $JL(y) = y$ $\mu$-a.e. Since $J$ gives an isomorphism of the spaces $H(\gamma)$ and $H(\mu) = l^2$, the mapping $T_0\colon y\mapsto y + JF(Ly)$ on $\mathbb{R}^\infty$ has the same properties as $T$. Indeed, for $F_0(y) := JF(Ly)$ one has
\[
|F_0(y + h) - F_0(y)|_{H(\mu)} = |F(Ly + Lh) - F(Ly)|_H\le\lambda|Lh|_H = \lambda|h|_{H(\mu)}, \quad\forall h\in H(\mu).
\]
In a similar manner, $\|D_{H(\mu)}F_0(y)\|_{\mathcal{H}(H(\mu))} = \|D_HF(Ly)\|_{\mathcal{H}(H)}\le M$ $\mu$-a.e. Therefore, it suffices to prove our statements assuming that $\gamma = \mu$.

First we shall consider the case where $F$ has the form $F = (\varphi_1, \ldots, \varphi_n, 0, 0, \ldots)$ with $\varphi_i\in C_b^\infty(\mathbb{R}^n)$. In this case, by Fubini's theorem, everything reduces to the finite dimensional case, and it remains to apply the classical Ostrogradsky-Jacobi formula for the diffeomorphism $T = I + F$ on $\mathbb{R}^n$. This formula, applied to the standard Gaussian measure $\gamma_n$ with density $p_n$ and an arbitrary smooth bounded function $\psi$, yields
\[
\int_{\mathbb{R}^n}\psi(T(x))\,p_n(x)\,dx = \int_{\mathbb{R}^n}\psi(y)\,p_n(T^{-1}(y))\,\bigl|\det T'(T^{-1}(y))\bigr|^{-1}\,dy.
\]
In order to derive from this formula the desired expression for the density of the induced measure, it suffices to apply Lemma 6.1.3 and notice that $|Tz|^2 - |z|^2 = 2(F(z), z) + |F(z)|^2$ and $\det_2(I + F') = \det(I + F')\exp[-\operatorname{trace}F']$. Therefore,
\[
\bigl|\det(I + F'(x))\bigr|\exp\Bigl[-(F(x), x) - \tfrac12|F(x)|^2\Bigr] = \bigl|\det{}_2(I + F'(x))\bigr|\exp\Bigl[\operatorname{trace}F'(x) - (F(x), x) - \tfrac12|F(x)|^2\Bigr] = \bigl|\det{}_2(I + F'(x))\bigr|\exp\Bigl[\delta F(x) - \tfrac12|F(x)|^2\Bigr].
\]
Note that in the case we consider, by construction, the mapping $G$ also has the form $G = (g_1, \ldots, g_n, 0, 0, \ldots)$, where $G_0 = (g_1, \ldots, g_n)$ is a smooth mapping on $\mathbb{R}^n$ with the Lipschitz constant $\lambda(1 - \lambda)^{-1}$. Since the mapping $S_0 = I + G_0$ on $\mathbb{R}^n$ is the inverse to $T_0 = I + F_0$, where $F_0 = (\varphi_1, \ldots, \varphi_n)$, one has $S_0'(T_0)T_0' = I$ on $\mathbb{R}^n$, whence
\[
G_0'(T_0) = -F_0'(I + F_0')^{-1}.
\]
It is readily seen that $\|(I + F_0')^{-1}\|_{\mathcal{L}(\mathbb{R}^n)}\le(1 - \lambda)^{-1}$, since $\|F_0'\|_{\mathcal{L}(\mathbb{R}^n)}\le\lambda < 1$. Therefore, the Hilbert-Schmidt norm of the operator $G_0'(T_0)$ is estimated by the number $M(1 - \lambda)^{-1}$. Since $T_0$ is a diffeomorphism, one has $\|G_0'\|_{\mathcal{H}(\mathbb{R}^n)}\le M(1 - \lambda)^{-1}$.

Finally, in order to finish our discussion of the finite dimensional mappings, note that all the arguments given above are applicable to Lipschitzian mappings instead of smooth ones, with the only difference that the corresponding equalities and estimates involving the derivatives make sense and are valid almost everywhere instead of everywhere (see [237, §3.2]). Certainly, this can be obtained as a corollary of the smooth case by the aid of suitable smooth approximations.
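In dimension one the change of variables behind these formulas can be tested directly. The sketch below is an illustration under assumptions not taken from the text: $F(x) = \tfrac12\sin x$ (so the Lipschitz constant is $1/2$), $g = \cos$, and a trapezoidal quadrature on a truncated interval. It checks the identity $\int g(T(x))\Lambda_F(x)\,\gamma(dx) = \int g(x)\,\gamma(dx)$, which expresses that $\Lambda_F$ is the density of $\gamma\circ S^{-1}$ when $T = I + F$ is a bijection with inverse $S$.

```python
import math

LAM = 0.5  # assumed Lipschitz constant of F, must be < 1

def F(x):
    return LAM * math.sin(x)

def F_prime(x):
    return LAM * math.cos(x)

def Lambda_F(x):
    """|det_2(1 + F')| exp(delta F - F^2/2) with det_2(1 + a) = (1 + a) e^{-a}
    and delta F(x) = F'(x) - x F(x) in dimension one."""
    det_2 = (1.0 + F_prime(x)) * math.exp(-F_prime(x))
    delta_F = F_prime(x) - x * F(x)
    return abs(det_2) * math.exp(delta_F - 0.5 * F(x) ** 2)

def gaussian_integral(func, half_width=12.0, steps=48_000):
    """Trapezoidal approximation of the integral of func against N(0, 1)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(t) * math.exp(-0.5 * t * t)
    return total * h / math.sqrt(2.0 * math.pi)

g = math.cos
lhs = gaussian_integral(lambda x: g(x + F(x)) * Lambda_F(x))
rhs = gaussian_integral(g)
print(lhs, rhs, math.exp(-0.5))
```

Both integrals should be close to $e^{-1/2}$, the exact value of $\int\cos x\,\gamma(dx)$ for the standard Gaussian measure on the line.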
Let us now turn to the infinite dimensional mappings. Recall that we deal with $X = \mathbb{R}^\infty$ and $H = l^2$. We may replace $F$ by its version and assume that $F$ is Lipschitzian along $H$ with constant $\lambda$ for every $x$. We shall approximate our mapping $F$ by the finite dimensional mappings of the form
\[
F_n(x) = \int P_nF(x_1, x_2, \ldots, x_n, y_{n+1}, y_{n+2}, \ldots)\,\gamma(dy),
\]
where $P_n$ is the orthogonal projection in $H$ onto the linear span of $e_1, \ldots, e_n$. Note that $|F_n - F|_H\to0$ in $L^2(\gamma)$ and almost everywhere. Indeed, $F_n$ can be written in the form $F_n = P_n\mathbb{E}_nF$, where $\mathbb{E}_nF$ is the conditional expectation of $F$ with respect to the $\sigma$-field generated by the first $n$ coordinate functions. Since $|F - P_nF|_H\to0$ pointwise, then, by the Lebesgue theorem, the same is true also in $L^2(\gamma)$. According to Theorem A.3.5 in Appendix, we get
1.
6.6.7. Theorem. Let $T(x) = x + F(x)$, where $F\colon X\to H$ is a mapping of the class $H$-$C^1_{\mathrm{loc}}$ such that the set $\Omega = \{r > 0\}$ from Definition 6.6.4 has full measure (e.g., let $F\in H$-$C^1(\gamma, H)$). Put
\[
M = \{x\in\Omega\colon\det{}_2(I + D_HF(x))\ne0\}.
\]
Then there exists a partition of $M$ into disjoint measurable sets $M_n$ such that on $M_n$ one has $T = T_n := I + F_n$, where, for every $n$, the mapping $F_n\in W^{2,1}(\gamma, H)$ is bounded and Lipschitzian along $H$, and, moreover, $T_n$ is bijective and transforms $\gamma$ into an equivalent measure. In addition, for any bounded measurable function $g$, one has the equality
\[
\int_Xg(T_n(x))\,\Lambda_{F_n}(x)\,\gamma(dx) = \int_Xg(x)\,\gamma(dx). \tag{6.6.8}
\]
Further, the set $T^{-1}(x)\cap M$ has at most countable cardinality $N(x, M)$ for almost every $x$ and, for any bounded function $f$, there hold the equalities
\[
\int_{T(M_n)}f(x)\,\gamma(dx) = \int_{M_n}f(T(x))\,\Lambda_F(x)\,\gamma(dx),
\]
\[
\int_Xf(x)N(x, M)\,\gamma(dx) = \int_Mf(T(x))\,\Lambda_F(x)\,\gamma(dx).
\]
Finally, $\gamma|_M\circ T^{-1}\ll\gamma$ and the following equality is valid:
\[
\frac{d(\gamma|_M\circ T^{-1})}{d\gamma}(x) = \sum_{y\in T^{-1}(x)\cap M}\frac{1}{\Lambda_F(y)}.
\]
PROOF. We shall derive this theorem from the previous one, showing that "locally" $T$ is a composition of the mappings of the type considered in that theorem with a linear mapping transforming $\gamma$ into an equivalent measure. Note first that, as is easily verified, the composition $T_3 = T_1\circ T_2$ of two mappings $T_1$ and $T_2$ of the form $T_i = I + F_i$, $F_i\in W^{2,1}(\gamma, H)$, satisfying the condition $\gamma\circ T_i^{-1}\ll\gamma$, has the following properties: $\gamma\circ T_3^{-1}$ [...]

[...] is a standard Wiener process in $\mathbb{R}^d$. It is known (see, e.g., [361] or [504, Ch. 7]) that $\mu^\xi$ is equivalent to the Wiener measure $P^W$ on $C([0,T], \mathbb{R}^d)$ and its Radon-Nikodym density is given by
\[
\Lambda(w) = \exp\Bigl(\int_0^TB(w_t)\,dw_t - \frac12\int_0^T|B(w_t)|^2\,dt\Bigr). \tag{6.7.1}
\]
This result is a special case of Girsanov's theorem (see Bibliographical Comments). More generally, the equivalence holds for the measures $\mu^\xi$ and $\mu^\eta$ generated by the diffusions $\xi$ and $\eta$ on $\mathbb{R}^d$ governed by the stochastic differential equations with one and the same diffusion coefficient $A$, which is a Lipschitzian matrix-valued mapping on $\mathbb{R}^d$, equal initial values, and different drift coefficients $B_1$ and $B_2$, respectively, provided some technical conditions are satisfied, in particular, if $B_1(x) - B_2(x) = A(x)C(x)$, where $C$ is a mapping satisfying certain conditions (see [504, Ch. 7]). As was mentioned before Example 5.4.15, there exists a Borel transformation $F$ of the Wiener space such that $\xi_t(w) = F(w)(t)$. Hence $\mu^\xi$ is the image of the Wiener measure under the mapping $T$ defined by
\[
T(w)(t) = w(t) + \int_0^tB(F(w)(s))\,ds,
\]
which can be written in the form $T = I + G$, where $G$ takes values in the Cameron-Martin space $H$ (which consists of the absolutely continuous functions vanishing at zero and having square-integrable derivative). However, the mapping $G$ may fail to be differentiable along $H$, so the results in the previous section do not imply Girsanov's theorem.

If in the previous example $\sigma = \mathrm{const} > 0$ and $\xi_0 = 0$, then, according to Girsanov's theorem, the measure $\mu^\xi$ is equivalent to the Gaussian measure $\mu^{\sigma w}$ generated by the process $\sigma w_t$ (i.e., $\mu^{\sigma w}$ is a homothetic image of $P^W$). In particular, the measures $\mu^\xi$ and $P^W$ are equivalent if $\sigma = 1$. One can verify that in this case the measure $\mu^\xi$ is differentiable along all vectors from the Cameron-Martin space of $P^W$ (see [608], [93]). Pitcher [608] conjectured that there is no differentiability for nonconstant $\sigma$. His conjecture was proved in [74], [77] (see also Problem 6.11.14).
Let us apply Corollary 6.6.8 to derive Girsanov's theorem in the case of the identity diffusion matrix. To simplify notation we consider the one dimensional case; however, the considerations below apply to the multidimensional case as well. Let $\xi_t$ be the diffusion governed by the stochastic differential equation
\[
d\xi_t = dw_t + B(\xi_t)\,dt, \quad\xi_0 = 0.
\]
Suppose first that $B\in C_0^\infty(\mathbb{R}^1)$. We shall assume that a probability space for the Wiener process is the classical Wiener space. The measure $\mu^\xi$ on $C[0,1]$ is the image of the Wiener measure $P^W$ under the mapping $T(w)(t) = \xi_t(w)$ given by the integral equation
\[
T(w)(t) = w(t) + \int_0^tB(T(w)(s))\,ds.
\]
This integral equation is uniquely solvable on $[0,1]$, since for any continuous function $\varphi$, the mapping
\[
V(x)(t) = \varphi(t) + \int_a^tB(x(s))\,ds
\]
is a contraction on $C[a,b]$ provided $|b - a|\sup|B'| < 1$. The inverse mapping $S$ to $T$ is given by
\[
S(x)(t) = x(t) - \int_0^tB(x(s))\,ds.
\]
Clearly, the mapping $G := S - I$ takes values in $H = H(P^W)$ and is infinitely Frechet differentiable. For every $x$, the operator $D_HG(x)$ is nuclear. We have
\[
\partial_hG(x)(t) = -\int_0^tB'(x(s))h(s)\,ds, \quad h\in C[0,1].
\]
It is straightforward to see that the complexification of the operator $D_HG(x)$ has no nonzero eigenvalues, since the associated linear differential equation $\lambda h'(t) = -B'(x(t))h(t)$, $h(0) = 0$, has only zero solution in the complexification of $W_0^{2,1}[0,1]$. Therefore, $\det_2(I + D_HG(x)) = 1$ and $\operatorname{trace}_HD_HG(x) = 0$. In addition,
\[
|G(x)|_H^2 = \int_0^1|B(x(s))|^2\,ds.
\]
Letting $\{e_n\}$ be an orthonormal basis in $H$, we have that $\{e_n'\}$ is an orthonormal basis in $L^2[0,1]$, whence the following equality in $L^2[0,1]$:
\[
B(x(s)) = \sum_{n=1}^\infty(B\circ x, e_n')_{L^2[0,1]}\,e_n'(s).
\]
Hence,
\[
\sum_{n=1}^\infty(G(x), e_n)_H\,\widehat{e}_n(x) = -\sum_{n=1}^\infty(B\circ x, e_n')_{L^2[0,1]}\int_0^1e_n'(s)\,dx(s) = -\int_0^1B(x(s))\,dx(s),
\]
so that $\delta G(x) = \int_0^1B(x(s))\,dx(s)$. Now the formula for $\Lambda_G(x)$ gives the same expression as Girsanov's theorem:
\[
\frac{d\mu^\xi}{dP^W}(x) = \exp\Bigl(\int_0^1B(x(t))\,dx(t) - \frac12\int_0^1|B(x(t))|^2\,dt\Bigr).
\]
In a more general case, where $B$ is, say, just bounded Borel, we take a sequence of uniformly bounded smooth functions $B_j$ convergent to $B$ a.e. and verify that the corresponding Radon-Nikodym densities are uniformly integrable. Clearly, they converge a.e. to the desired expression.
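The contraction argument for the integral equation and the explicit form of the inverse mapping can be imitated on a grid. The following sketch makes several assumptions for illustration: the drift is $B = \sin$ (bounded with $\sup|B'| = 1$, though not compactly supported), the interval is $[0, 0.5]$ so that $|b - a|\sup|B'| < 1$, the input path $\varphi(t) = \sin 3t$ is deterministic, and integrals are replaced by left Riemann sums. It iterates the discretized mapping $V$ to its fixed point and then applies the discretized $S$.

```python
import math

def B(x):
    """Assumed drift, chosen only for illustration; sup |B'| = 1."""
    return math.sin(x)

def V(x, phi, h):
    """Discretized V(x)(t_i) = phi(t_i) + h * sum_{j < i} B(x_j)."""
    out = []
    integral = 0.0
    for i in range(len(x)):
        out.append(phi[i] + integral)
        integral += h * B(x[i])
    return out

def solve_T(phi, h, iterations=80):
    """Picard iteration for the fixed point x = V(x)."""
    x = list(phi)
    for _ in range(iterations):
        x = V(x, phi, h)
    return x

def apply_S(x, h):
    """Discretized S(x)(t_i) = x_i - h * sum_{j < i} B(x_j)."""
    out = []
    integral = 0.0
    for i in range(len(x)):
        out.append(x[i] - integral)
        integral += h * B(x[i])
    return out

N = 500
h = 0.5 / N                       # grid on [0, 0.5]
phi = [math.sin(3.0 * i * h) for i in range(N + 1)]
x = solve_T(phi, h)

residual = max(abs(a - b) for a, b in zip(V(x, phi, h), x))
inverse_error = max(abs(a - b) for a, b in zip(apply_S(x, h), phi))
print(residual, inverse_error)
```

On this grid the discretized $V$ is a contraction with constant at most $1/2$ in the sup-norm, so 80 iterations reach the fixed point up to rounding, and `apply_S` recovers the input path.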
6.7.3. Example. Let $\gamma$ be a centered Gaussian measure on a separable Banach space $X$, let $H = H(\gamma)$, and let $F\colon X\to H$ be a continuously Frechet differentiable mapping (certainly, $H$ is equipped with its natural norm). Suppose that $I + F'(x)$ is injective on $H$ for every $x$. Then $\gamma\circ(I + F)^{-1}\ll\gamma$.

PROOF. Since $F'(x)\in\mathcal{L}(X, H)$, then $F'(x)|_H$ is a Hilbert-Schmidt operator (see Proposition 3.7.10). It follows from our assumption that $I + F'(x)$ is invertible on $H$. The condition implies the continuous differentiability of the mappings $h\mapsto F(x + h)$, $H\to H$. According to Remark 3.7.13, the mapping $A\mapsto A|_H$ is continuous from $\mathcal{L}(X, H)$ to $\mathcal{H}(H)$. By the continuity of the mapping $x\mapsto F'(x)$, $X\to\mathcal{L}(X, H)$, the mapping $x\mapsto F'(x)|_H$ from $X$ to $\mathcal{H}(H)$ is continuous.

6.7.4. Example. Let $\gamma$ be the countable product of the standard Gaussian measures on the line and let $T = I + F$, where $F\colon\mathbb{R}^\infty\to l^2$,
\[
F(x) = \bigl(f_n(x_1, \ldots, x_{n-1})\bigr)_{n=1}^\infty, \qquad \sum_{n=1}^\infty f_n(x_1, \ldots, x_{n-1})^2\le C,
\]
and the $f_n$'s are Borel functions with $f_1 = 0$. Then $T$ is one-to-one and, letting $S = T^{-1}$, one has $\gamma\circ T^{-1}\sim\gamma$,
\[
\frac{d(\gamma\circ T^{-1})}{d\gamma} = \frac{1}{\varrho\circ T^{-1}}, \qquad \frac{d(\gamma\circ S^{-1})}{d\gamma} = \varrho,
\]
\[
\varrho(x) = \exp\Bigl(-\sum_{n=1}^\infty f_n(x_1, \ldots, x_{n-1})x_n - \frac12\sum_{n=1}^\infty f_n(x_1, \ldots, x_{n-1})^2\Bigr). \tag{6.7.2}
\]
PROOF. Note that for every $y\in\mathbb{R}^\infty$, the equation $T(x) = y$ has a unique solution: $x_1 = y_1$, $x_2 = y_2 - f_2(y_1)$, $x_3 = y_3 - f_3(y_1, y_2 - f_2(y_1))$, etc. In particular, $S = T^{-1}$ has the same structure as $T$. Let $F_n = (f_1, \ldots, f_n, 0, 0, \ldots)$. If the functions $f_j$ are smooth, then the claim for $I + F_n$ in place of $I + F$ follows by the Ostrogradsky-Jacobi formula employed in the proof of Theorem 6.6.3 (of course, it is a special case of that theorem), since $\partial_{x_j}f_j = 0$ and $\det_2(I + D_HF_n) = 1$. By a simple approximation argument, the claim is true for any bounded Borel $F_n$. Now observe that the sequence
\[
\xi_n = -\sum_{j=1}^nf_j(x_1, \ldots, x_{j-1})x_j
\]
is a martingale with respect to the sequence of $\sigma$-fields $\mathcal{A}_n$ generated by $x_1, \ldots, x_n$, for $\xi_n = \mathbb{E}^{\mathcal{A}_n}\xi$, where $\mathbb{E}^{\mathcal{A}_n}$ is the corresponding conditional expectation and $\xi = -\sum_{j=1}^\infty f_j(x_1, \ldots, x_{j-1})x_j$ (this series converges in $L^2$). Denote by $\mathbb{E}$ the expectation with respect to $\gamma$ and put $\zeta_n = \sum_{j=1}^nf_j(x_1, \ldots, x_{j-1})^2$. Our reasoning applies to $2F$ in place of $F$, hence $\mathbb{E}\exp[2\xi_n - 2\zeta_n] = 1$ and $\mathbb{E}\exp(2\xi_n)\le\exp(2C)$, whence the uniform integrability of $\exp(\xi_n - \tfrac12\zeta_n)$. Therefore, $\exp(\xi_n - \tfrac12\zeta_n)\to\varrho$ in $L^1(\gamma)$. By Lemma 6.1.8 applied to the transformations $(I + F_n)^{-1}$ convergent to $S$, we get $d(\gamma\circ S^{-1})/d\gamma = \varrho$. By Lemma 6.1.3, one has $\gamma\circ T^{-1}\sim\gamma$ and $d(\gamma\circ T^{-1})/d\gamma = 1/(\varrho\circ T^{-1})$.
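The triangular structure of $T$ in Example 6.7.4 makes the inverse computable coordinate by coordinate. The sketch below truncates to three coordinates; the particular functions $f_2$, $f_3$ and the helper names are hypothetical choices for illustration. It implements $T$, the recursive inverse $S$, and the logarithm of the density (6.7.2) restricted to these coordinates.

```python
import math

# f_n depends only on the earlier coordinates x_1, ..., x_{n-1}; f_1 = 0
fs = [
    lambda p: 0.0,
    lambda p: math.sin(p[0]),
    lambda p: 0.5 * math.cos(p[0] * p[1]),
]

def apply_T(fs, x):
    """T(x)_n = x_n + f_n(x_1, ..., x_{n-1})."""
    return [x[n] + fs[n](x[:n]) for n in range(len(x))]

def apply_S(fs, y):
    """Solve T(x) = y recursively: x_n = y_n - f_n(x_1, ..., x_{n-1})."""
    x = []
    for n in range(len(y)):
        x.append(y[n] - fs[n](x))  # x holds the already recovered coordinates
    return x

def log_rho(fs, x):
    """log of the density in (6.7.2), truncated to len(x) coordinates."""
    values = [fs[n](x[:n]) for n in range(len(x))]
    linear = sum(v * xn for v, xn in zip(values, x))
    quadratic = sum(v * v for v in values)
    return -linear - 0.5 * quadratic

x0 = [0.3, -1.2, 2.0]
y0 = apply_T(fs, x0)
print(apply_S(fs, y0), log_rho(fs, x0))
```

Since each $f_n$ ignores $x_n$ itself, the Jacobian matrix of $F$ is strictly lower triangular, which is the finite dimensional reason why $\det_2(I + D_HF_n) = 1$.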
As an application of nonlinear transformations of Gaussian measures we shall discuss an interesting construction of quasiinvariant measures on the groups of diffeomorphisms suggested in [685]. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ with a smooth boundary and let $\mathrm{Diff}^k(\Omega)$ be the class of all $C^k$-diffeomorphisms of the closure of $\Omega$. A proof of the following result can be found in [685], [414].

6.7.5. Lemma. Let $m$ and $k$ be two integers such that $k > 3m + 1$. Then, for every $i = 1, \ldots, n$, there exist real numbers $c_0, \ldots, c_m$ such that the differential operator $Q$ defined on $\mathrm{Diff}^k(\Omega)$ by
\[
Q(f) = \sum_{j=0}^mc_j\,\partial_i^j\bigl[(f')^{-1}\partial_i^{k-j}f\bigr]
\]
has the following properties:
(i) $Q(f) = (f')^{-1}\partial_i^kf\,+$ terms of lower orders in $\partial_i$;
(ii) for any $\varphi$ and $f$ in $\mathrm{Diff}^k(\Omega)$, the expression $Q(\varphi\circ f) - Q(f)$ as a differential operator in $f$ has order less than $k - m$.
Let us fix two integers k and in = 21 such that 2m > n, 2k > 3m - 2. Denote by Lk the space of those elements f in the Sobolev space 4i'2.2k+m(Sl 1R,") that vanish at the boundary of Sl together with the first k derivatives (more precisely, which belong to
(lR" 1R") when assigned zero values outside of Sl). The space
Lk has a natural Banach norm. The affine space I + Lk, where I is the identity mapping on 11, is equipped with the topology induced from Lk. One can check that
the intersection Gk = (I + Lk) nDiff2k(Sl) is an open subset of I + Lk with this induced topology.
Let $E = W_0^{2,m}(\Omega,\mathbb{R}^n)$ be the Sobolev space of $\mathbb{R}^n$-valued mappings which vanish with the first $m$ derivatives on the boundary of $\Omega$. It is well-known that $E$ is a closed subspace of the Hilbert space $W^{2,m}(\Omega,\mathbb{R}^n)$ (with its natural norm) and that the operator $(-1)^m\Delta^m$ has the inverse $T$ which is a nonnegative Hilbert-Schmidt operator. Hence there is a centered Gaussian measure $\gamma$ on $E$ with covariance $T$. The main idea of [685] is to transport $\gamma$ to $\mathrm{Diff}^{2k}(\Omega)$ by means of the nonlinear differential operators constructed in Lemma 6.7.5. To this end, note that by virtue of that lemma, there is a differential operator $Q$ on $G_k$ with values in $W^{2,m}(\Omega,\mathbb{R}^n)$ such that the principal part of $Q(f)$ is $(f')^{-1}Lf$, where $Lf = \sum_{i=1}^n \partial^{2k}f/\partial x_i^{2k}$, and, for every $\varphi\in\mathrm{Diff}^{2k}(\Omega)$, the expression $Q(\varphi\circ f) - Q(f)$ contains only those derivatives of $f$ which have order not greater than $2k - m$. The map $Q$ is differentiable and its derivative at the point $I$ is $L$. By a classical theorem in partial differential equations, $L$ is a linear isomorphism between $L_k$ and $W^{2,m}(\Omega,\mathbb{R}^n)$. By the inverse function theorem, there is a neighborhood $W$ of $I$ in $G_k$ such that $Q\colon W\to Q(W)$ is a diffeomorphism.
Let us take a probability measure $\nu$ defined by $\nu = \varrho\cdot\gamma$, where $\varrho$ is a smooth function on $E$ whose support is a ball in $Q(W)$. Put $\mu_0(A) = \nu(Q(A\cap W))$.
Denote by $G$ the subgroup of $\mathrm{Diff}^{2k}(\Omega)$ which consists of the elements $g$ such that $g - I\in W_0^{2,2k+2m}(\Omega,\mathbb{R}^n)$. Let $\{g_i\}$ be a countable dense subset of $G$ with $g_i\in\mathrm{Diff}^{2k+2m+2}(\Omega)$. Finally, let us put
$$\mu(A) = \sum_{i=1}^{\infty} c_i\,\mu_0(g_iA), \quad\text{where } c_i > 0 \text{ and } \sum_{i=1}^{\infty} c_i = 1.$$
6.7.6. Proposition. The measure $\mu$ on $\mathrm{Diff}^{2k}(\Omega)$ is left-quasiinvariant under the action of the subgroup $G$ and the measure $\mu * \lambda$, where $\lambda(A) = \mu(A^{-1})$, is left- and right-quasiinvariant under the action of $G$.
The proof is based on the fact that the mapping $T_\varphi = Q\circ L_\varphi\circ Q^{-1}$, where $L_\varphi(f) = \varphi\circ f$, has the following special form:
T, f - f =
oQ-'f) - Q(Q-1 f) =T112Am 2P,(Q-1f) where P,, is a differential operator of order less than 2k, so that A"'12Pr,g E W02"(0). Thus, T, is a mapping which, for
0, y(F-'(M)) > 0.
(6.7.3)
Then, for every $n$ and $\gamma$-a.e. $x$, the set $A_n := \{t\colon x + te_n\in F^{-1}(m, M)\}$
has Lebesgue measure zero. Indeed, let $C_n$ be the set of those $x$ for which $A_n$ is not a Lebesgue zero set, and let $X_n$ be a closed hyperplane in $X$ such that $X = X_n + \mathbb{R}^1e_n$. Denote by $\pi\colon X\to X_n$ the natural linear projection and put $\nu = \gamma\circ\pi^{-1}$. We know that on the straight lines $y + \mathbb{R}^1e_n$, $y\in X_n$, there exist conditional Gaussian measures $\gamma^y$, not concentrated at single points. Hence, for $\nu$-almost all $y\in X_n$, the set $A_n$ has Lebesgue measure zero. Since $C_n = (C_n\cap X_n) + \mathbb{R}^1e_n$, then, by the definition of $\nu$, this is equivalent to the equality $\gamma(C_n) = 0$. By virtue of the continuity of $F$ on $x + \mathbb{R}^1e_n$ for a.e. $x$, we get that, for a.e. $x$, the following alternative takes place:
either $F(x + te_n)\le m$, $\forall t\in\mathbb{R}^1$, or $F(x + te_n)\ge M$, $\forall t\in\mathbb{R}^1$.
In addition, for every such $x$, if one of these two cases takes place for $n = 1$, then the same case takes place for all $n$. In other words, the space $X$ up to a set of measure zero is decomposed into two sets
$$D_1 = \{x\colon F(x + te_n)\le m,\ \forall t\in\mathbb{R}^1,\ \forall n\in\mathbb{N}\},\qquad D_2 = \{x\colon F(x + te_n)\ge M,\ \forall t\in\mathbb{R}^1,\ \forall n\in\mathbb{N}\}.$$
Clearly, these are measurable sets invariant with respect to the shifts along the vectors $\{e_n\}$. By virtue of the zero-one law, one of them has measure 0 and the other one has measure 1. This contradicts (6.7.3).

6.8. Finite dimensional mappings

Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $F\colon X\to\mathbb{R}^n$ be a sufficiently regular mapping. What can be said about the induced measure $\mu = \gamma\circ F^{-1}$? Is it absolutely continuous with respect to Lebesgue measure $\lambda$? Does it have a bounded density? Is it possible to choose a smooth version of this density? Certainly, some additional assumptions of nondegeneracy of $F$ are needed, since otherwise the measure $\mu$ may have atoms. For example, if $X = \mathbb{R}^1$, $n = 1$ and $F$ is a smooth function, then a necessary and sufficient condition for the absolute continuity of $\mu$ is that $\gamma(x\colon F'(x) = 0) = 0$. The proof of the following result is found in [237, § 3.2].
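The one dimensional criterion can be illustrated numerically. The following is only a sketch under stated assumptions: the function `flat_bump` (a smooth function vanishing on the negative half-line) and the helper `atom_mass_at_zero` are hypothetical names chosen for this illustration and do not come from the text. Since the derivative of `flat_bump` vanishes on a set of standard Gaussian measure 1/2, the image measure should carry an atom of mass 1/2 at the origin.

```python
import math
import random


def flat_bump(x):
    """Smooth function on the line that vanishes identically for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def atom_mass_at_zero(num_samples=100_000, seed=0):
    """Monte Carlo estimate of the mass that the image of the standard
    Gaussian measure under flat_bump puts at the single point 0.

    flat_bump(x) == 0 exactly when x <= 0, so we count those samples
    (counting by x avoids floating-point underflow of exp(-1/x))."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(num_samples) if rng.gauss(0.0, 1.0) <= 0.0)
    return hits / num_samples


if __name__ == "__main__":
    print("estimated atom at 0:", atom_mass_at_zero())
```

The estimate is close to 1/2, so the image measure is not absolutely continuous, in agreement with the criterion.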
6.8.1. Theorem. Let $F\colon\mathbb{R}^n\to\mathbb{R}^n$ be a Lipschitzian mapping. Then, for every bounded Borel function $g$ on $\mathbb{R}^n$ and every measurable set $A\subset\mathbb{R}^n$, the following identity is true:
$$\int_A g(F(x))\,|\det F'(x)|\,dx = \int_{\mathbb{R}^n} g(y)\,N(F|A, y)\,dy, \tag{6.8.1}$$
where $N(F|A, y)$ is the total number of elements in $A\cap F^{-1}(y)$.
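Identity (6.8.1) can be checked numerically in dimension one. The sketch below is an illustration only, with choices that are assumptions of the example and not of the text: $F(x) = x^2$ (Lipschitzian on the bounded set used here), $A = [-1, 2]$, $g = \cos$, and a plain midpoint rule. On this set the multiplicity $N(F|A, y)$ equals 2 for $0 < y \le 1$ and 1 for $1 < y \le 4$.

```python
import math


def midpoint(func, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def multiplicity(y):
    """N(F|A, y) for F(x) = x**2 and A = [-1, 2]: number of x in A with x**2 == y."""
    if y < 0.0 or y > 4.0:
        return 0
    if y == 0.0:
        return 1
    return 2 if y <= 1.0 else 1


def left_side():
    # integral over A of g(F(x)) |F'(x)| dx
    return midpoint(lambda x: math.cos(x * x) * abs(2.0 * x), -1.0, 2.0)


def right_side():
    # integral of g(y) N(F|A, y) dy, split at the jump of the multiplicity
    part1 = midpoint(lambda y: math.cos(y) * multiplicity(y), 0.0, 1.0)
    part2 = midpoint(lambda y: math.cos(y) * multiplicity(y), 1.0, 4.0)
    return part1 + part2


if __name__ == "__main__":
    print(left_side(), right_side(), math.sin(1.0) + math.sin(4.0))
```

Both sides agree with the closed form $\sin 1 + \sin 4$ up to the quadrature error.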
6.8.2. Corollary. Let $A\subset\mathbb{R}^n$ be a measurable set and let $F = (F_1,\dots,F_n)\colon\mathbb{R}^n\to\mathbb{R}^n$ be a measurable mapping that a.e. on $A$ has the first order partial derivatives. Suppose that $\det\bigl((\nabla F_i,\nabla F_j)\bigr)_{i,j=1}^{n}\ne 0$ a.e. on $A$. Then Lebesgue measure on $A$ is transformed by $F$ into an absolutely continuous measure.
PROOF. In the case where $F$ is Lipschitzian, the claim follows directly from identity (6.8.1). In fact, this identity means that the measure $(|\det F'|\,dx)|_A\circ F^{-1}$ has density $N(F|A, x)$. In the general case, we may assume that $F$ has the first order partial derivatives at every point of $A$. We shall use the well-known fact (see [237, 3.1.4]) that $F$ is approximately differentiable at all points of $A$, and hence, according to [237, 3.1.8], $A$ can be represented as a countable union of measurable sets $A_j$ such that the restriction of $F$ to each of the $A_j$'s is Lipschitzian. Since $F$ can be extended from $A_j$ to $\mathbb{R}^n$ as a Lipschitzian mapping $F_j$, and the partial derivatives of $F_j$ almost everywhere on $A_j$ coincide with the partial derivatives of $F$, we get the claim.
6.8.3. Theorem. Let $\{a_i\}$ be an arbitrary sequence in $H(\gamma)$ and let $F\colon X\to\mathbb{R}^d$ be a $\gamma$-measurable mapping with the following property: for $\gamma$-a.e. $x$, there exist vectors $v_1(x),\dots,v_d(x)$ in $\{a_i\}$ such that the vectors
$$\lim_{t\to 0}\frac{F(x + tv_i(x)) - F(x)}{t}$$
exist and are linearly independent. Then the measure y o F` on Rd is absolutely continuous. In particular, this is true if F is a mapping whose components belong to the Sobolev class 14 "(,), and the mapping DH F(x) is surjective almost everywhere.
PROOF. It suffices to show the absolute continuity of the measures $\gamma|_B\circ F^{-1}$, where $B$ is the set of all points $x$ such that the vectors $\partial_{a_i}F(x)$, $i = 1,\dots,d$, are linearly independent. Let $E$ be the linear span of $a_1,\dots,a_d$, and $Y$ a closed linear subspace complementary to $E$. There exist the conditional Gaussian measures $\gamma^y$ given by densities on the planes $E + y$, $y\in Y$. By Corollary 6.8.2, the measures $\gamma^y|_{B\cap(E+y)}\circ F^{-1}$ are absolutely continuous. Hence, for every Lebesgue measure zero set $Z\subset\mathbb{R}^d$, we have $\gamma^y(B\cap F^{-1}(Z)) = 0$, whence $\gamma(B\cap F^{-1}(Z)) = 0$. Note that, given an orthonormal basis $\{e_n\}$ in $H(\gamma)$, the condition $D_HF(x)(H) = \mathbb{R}^d$ implies that $\mathbb{R}^d$ is spanned by $D_HF(x)(e_{i_1}),\dots,D_HF(x)(e_{i_d})$ for some $i_1,\dots,i_d$. Now the last claim follows from the existence of a modification of $F$ that is locally absolutely continuous on the lines parallel to $e_{i_1},\dots,e_{i_d}$ (see Proposition 5.4.1).
Unlike the finite dimensional case, the condition in Theorem 6.8.3 is not necessary. There exists an example (see [409], [410]) of a function $F\colon C[0,1]\to\mathbb{R}$, where $C[0,1]$ is provided with the Wiener measure $P^W$, such that $F$ is infinitely Fréchet differentiable, but the measure $P^W|_{\{F'=0\}}\circ F^{-1}$ has a smooth density. In this sense, there is no direct infinite dimensional analogue of Sard's theorem (see, however, [479] and Theorem 6.11.5 below).
6.8.4. Corollary. Let $F\colon X\to\mathbb{R}^d$ be a measurable mapping such that for a.e. $x$, the mapping $h\mapsto F(x + h)$ is locally Lipschitzian on $H(\gamma)$, and the set of all points $x$, where the Gateaux derivative $D_HF(x)$ exists but is not surjective, has measure zero. Then the measure $\gamma\circ F^{-1}$ is absolutely continuous.
6.8.5. Corollary. Let $\gamma$ be a non-atomic symmetric Gaussian measure on a separable Banach space $X$. Then the norm $q$ of $X$ has an absolutely continuous distribution on $(X,\gamma)$.
PROOF. We may assume that $H$ is dense in $X$ replacing $X$ by the closure of $H$ (which has full measure by Theorem 3.6.1). Recall that the Gateaux derivative $q'$ exists $\gamma$-a.e. (see Theorem 5.11.1). Clearly, $q'$ does not vanish at all points where it exists (note that $q'(x)(x) = q(x) > 0$ if $q'(x)$ exists). Then $D_Hq(x)\ne 0$ at all such points $x$, since $D_Hq(x)(h) = q'(x)(h)$, $h\in H$, and $H$ is dense in $X$. Hence $D_Hq(x)\ne 0$ a.e.
6.8.6. Corollary. Let $Q = \sum_{n=0}^{\infty} Q_n$, where the function $Q_n$ is a $\gamma$-measurable polynomial of degree not bigger than $n$, and the series converges in $L^2(\gamma)$. Suppose that
$$\sum_{n=0}^{\infty}\lambda^n\|Q_n\|_{L^2(\gamma)} < \infty$$
for some $\lambda > 1$. Then either the measure $\gamma\circ Q^{-1}$ is absolutely continuous or $Q = \mathrm{const}$ $\gamma$-a.e.
PROOF. Let $\{e_n\}$ be an orthonormal basis in $H(\gamma)$. For every $n$, the conditional
measures on the lines y + )Ede, have Gaussian densities for all y E Y, where Y is a fixed hyperplane complementary to We,,. Hence, for a.e. x and every closed interval (a, b}, we have Q(x + ten) = F_',, Qn (x + ten) and /b
A- J Qn(x+ten)2dt 1, the function W(t) = Q(x+th) possesses the mean
square approximations vpn by polynomials of degree n such that the quantities < 1. According CnM2: = IIV-YnIIL'[n.bj satisfy the condition n-x to Bernstein's theorem (see, e.g., 1785, p. 399, Ch. VI, 6.9.151), this implies that V has a real-analytic modification. Thus, for any fixed n, the function Q has a limsup(en(p)2)1
modification that is real-analytic on the lines $x + \mathbb{R}^1e_n$ and hence is either constant or has the derivative with at most countably many zeros. The set of all $x$ for which such a modification is constant for all $n$ is invariant under the shifts along the vectors $te_n$ and hence its measure is either 1 (and then $Q = \mathrm{const}$ a.e.) or 0. In the latter case, the measure $\gamma\circ Q^{-1}$ is absolutely continuous.
6.8.7. Example. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $F\in\bigoplus_{k=0}^{d}\mathcal{X}_k$. Then either $F$ is a constant or the measure $\gamma\circ F^{-1}$ is absolutely continuous.
6.9. Malliavin's method

In this section, we discuss basic ideas of a general approach to the study of regularity of the finite dimensional images of Gaussian measures suggested by P. Malliavin and called now the Malliavin calculus.
6.9.1. Example. Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$ and let $f$ be a polynomial on $\mathbb{R}^n$ without critical points (i.e., $\nabla f$ has no zeros). Then the measure $\mu = \gamma_n\circ f^{-1}$ has a smooth density which together with all its derivatives decreases at infinity faster than any power of $|x|^{-1}$.
PROOF. Let us consider the Fourier transform of the measure $\mu$, which, by the change of variables formula, has the form
$$\widetilde\mu(t) = \int\exp(itf)\,d\gamma_n.$$
Clearly, the function $\widetilde\mu$ is infinitely differentiable. Let us show that, for every $k\in\mathbb{N}$, the function $t^k\widetilde\mu(t)$ is bounded. The idea of the proof is to employ the vector field $v = \nabla f$. Denoting by $\partial_v$ the operator of differentiation along the field $v$ and by $p$ the standard Gaussian density on $\mathbb{R}^n$, by the aid of a formal integration by parts, we get
$$it\,\widetilde\mu(t) = \int\partial_v\bigl(e^{itf}\bigr)\frac{1}{\partial_vf}\,d\gamma_n = -\int e^{itf}\,\mathrm{div}\Bigl(\frac{p\,v}{\partial_vf}\Bigr)\,dx.$$
Let us now note that the function $\mathrm{div}(p\,v/\partial_vf)$ has the form $\Delta^{-2}Qp$, where $\Delta = \partial_vf = |\nabla f|^2$ is a polynomial without zeros, and $Q$ is some polynomial. This integration by parts is justified if the function $\Delta^{-2}Qp$ is integrable. The latter is indeed true, since, by the Seidenberg-Tarski theorem (see [347, p. 368, Example A.2.7]), there exist two positive numbers $C$ and $\alpha$ such that $\Delta(x)\ge C|x|^{-\alpha}$, whenever $|x|\ge 1$. Thus, $|t\,\widetilde\mu(t)|\le\|\Delta^{-2}Qp\|_{L^1(\mathbb{R}^n)}$. Repeating the procedure described, we get the boundedness of all the functions $t^k\widetilde\mu(t)$. The same reasoning applies to the functions $f^rp$, $r\in\mathbb{N}$, replacing $p$, which completes the proof. □
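The integration by parts used in this proof can be checked numerically in dimension one. The following is a sketch under assumptions made only for the illustration: the polynomial $f(x) = x^3 + x$ (which has no critical points), the value $t = 1.3$, and a trapezoid rule on $[-8, 8]$, outside of which the Gaussian density is negligible. In dimension one $v = f'$ and $\partial_vf = (f')^2$, so the identity reads $it\,\widetilde\mu(t) = -\int e^{itf}\,(p/f')'\,dx$.

```python
import cmath
import math


def f(x):
    return x ** 3 + x          # polynomial without critical points


def df(x):
    return 3.0 * x * x + 1.0   # f'(x) >= 1 everywhere


def d2f(x):
    return 6.0 * x


def gauss_density(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def trapezoid(func, a=-8.0, b=8.0, n=80_000):
    """Trapezoid rule for a complex-valued integrand on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        total += func(a + k * h)
    return total * h


def left_side(t):
    # i t times the Fourier transform of the image measure
    return 1j * t * trapezoid(lambda x: cmath.exp(1j * t * f(x)) * gauss_density(x))


def right_side(t):
    def div_term(x):
        # derivative of p / f', i.e. div(p v / d_v f) in dimension one
        p = gauss_density(x)
        return (-x * p * df(x) - p * d2f(x)) / df(x) ** 2
    return -trapezoid(lambda x: cmath.exp(1j * t * f(x)) * div_term(x))


if __name__ == "__main__":
    print(left_side(1.3), right_side(1.3))
```

The two complex numbers agree up to quadrature error, which is the one dimensional instance of the displayed formula.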
Trying in the infinite dimensional case to act according to the same plan, we face at once the obvious difficulty that the last equality in the integration by parts formula used above makes no sense due to the lack of infinite dimensional analogues of Lebesgue measure. Certainly, this difficulty is overcome if one defines the action of the differential operators directly on measures. In fact, this is the essence of the
theory of differentiable measures of Fomin and the Malliavin calculus. However, a more delicate problem arises of finding vector fields for which the integration by parts is possible. A simple example: the function $f(x) = (x, x)$ on an infinite dimensional Hilbert space $X$. As shown in Chapter 5, there is no Gaussian measure on $X$ differentiable along the vector field $v = \nabla f\colon x\mapsto 2x$. For this reason, for the Gaussian measure $\gamma$ with the covariance operator $K$, it is natural to take the field $u = K\nabla f$, along which, as we know, the measure $\gamma$ is differentiable. Therefore, similarly to the finite dimensional case, it remains to verify the integrability of the function $(2Kx, x)^{-1}$.
6.9.2. Theorem. Let $F = (F_1,\dots,F_n)\colon X\to\mathbb{R}^n$ be a mapping such that $F_i\in W^\infty(\gamma)$, $i = 1,\dots,n$, and
$$\frac{1}{\Delta}\in\bigcap_{p>1}L^p(\gamma),\quad\text{where } \Delta = \det\bigl((D_HF_i, D_HF_j)_H\bigr),\ H = H(\gamma).$$
Then the measure $\mu := \gamma\circ F^{-1}$ has a density from the class $S(\mathbb{R}^n)$.
PROOF. Put $\sigma_{ij} = (D_HF_i, D_HF_j)_H$ and denote by $\sigma^{ij}$ the elements of the matrix inverse to $(\sigma_{ij})$. By condition, $\sigma^{ij}\in W^\infty(\gamma)$. For any smooth function $\varphi$ on $\mathbb{R}^n$, we denote by $\partial_i\varphi$ the partial derivative in $x_i$. According to the chain rule, one has the equality
$$(\partial_i\varphi)\circ F = \sum_{k,j=1}^{n}\sigma^{ik}\sigma_{kj}\,(\partial_j\varphi)\circ F = \sum_{k=1}^{n}\sigma^{ik}\,\partial_{v_k}(\varphi\circ F),$$
where $v_k := D_HF_k$. Integrating this equality and making use of the change of variables formula, we arrive at the relationship
$$\int_{\mathbb{R}^n}\partial_i\varphi(y)\,\mu(dy) = \sum_{k=1}^{n}\int_X\sigma^{ik}(x)\,\partial_{v_k}(\varphi\circ F)(x)\,\gamma(dx).$$
By the aid of the integration by parts formula, each term on the right-hand side is transformed into $-\int\varphi\circ F(x)\,g_{ik}(x)\,\gamma(dx)$, where
$$g_{ik} = \sigma^{ik}\,\delta v_k + \partial_{v_k}\sigma^{ik}\in W^\infty(\gamma).$$
Therefore, the generalized partial derivative of the measure $\mu$ in $x_i$ is the bounded measure $\sum_{k=1}^{n}(g_{ik}\cdot\gamma)\circ F^{-1}$. Replacing $\partial_i\varphi$ by the partial derivatives of higher orders and repeating the procedure described, we conclude that all generalized partial derivatives of $\mu$ are bounded measures. This yields the existence of a smooth density $p$ of $\mu$. The same reasoning applies if the measure $\gamma$ is replaced by any measure $\varrho\cdot\gamma$, where $\varrho\in W^\infty(\gamma)$. This implies that $p\in S(\mathbb{R}^n)$.
6.9.3. Remark. It is seen from the proof of the previous theorem that, in order to get only some finite differentiability of the density, it suffices to impose the existence of a sufficiently large number of derivatives of $F$ and a sufficiently high order of integrability of the function $\Delta^{-1}$.
It also follows from this proof that the Sobolev norms of the density of the measure $\mu$ are estimated via the norms $\|F_i\|_{p,r}$ and $\|\Delta^{-1}\|_{L^p(\gamma)}$ for sufficiently large $p$ and $r$. In addition, if the measure $\gamma$ is replaced by the measure $\varrho\cdot\gamma$, where $\varrho\in W^\infty(\gamma)$, then the Sobolev norms of the smooth density $k_\varrho$ of the measure $(\varrho\cdot\gamma)\circ F^{-1}$ are estimated via $\|F_i\|_{p,r}$, $\|\varrho\|_{p,r}$, and $\|\Delta^{-1}\|_{L^p(\gamma)}$ for sufficiently large $p$ and $r$. This observation may be useful for constructing generalized functions on $X$. In addition, it will be used below in the study of the surface measures.
6.9.4. Example. Let F E W2.2 (y) be such that D F # 0 a.e. and ILFI + IIDN F(x)Ilx
E L'(y)
ID,, F12
Then the measure y o F'' has a density p of bounded variation (in particular, p is bounded). Moreover, if E La(y), then IIL-(-,) 1 and the principal coefficient 1, the following estimate holds: mes
$(t\colon\ |f(t)| < \varepsilon)\le 2d\,\varepsilon^{1/d}$,
where mes is Lebesgue measure. In addition, for any nondegenerate Gaussian measure $\nu$ on $\mathbb{R}^n$ and every polynomial $G$ of degree $d$ on $\mathbb{R}^n$ which has the form $G(x) = c + \sum_{j=1}^{n} g_j(x_j)$, $x = (x_1,\dots,x_n)$, where $c\ge 0$ and the $g_j$'s are nonnegative polynomials on $\mathbb{R}^1$ with the principal terms $x_j^d$, the following estimate holds:
$$\nu(x\colon\ G(x) < \varepsilon)\le c(\nu)(2d)^n\varepsilon^{n/d},$$
where $c(\nu)$ depends only on $\nu$.
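The one dimensional estimate is easy to test numerically. The sketch below is an illustration under assumptions of the example only: the monic cubic $f(t) = t^3 - t$, the level $\varepsilon = 0.1$, and a grid approximation of Lebesgue measure on $[-2, 2]$ (outside this interval $|f|\ge 6$). The helper names are hypothetical.

```python
def sublevel_measure(func, eps, a, b, n=100_000):
    """Grid approximation of the Lebesgue measure of {t in [a, b]: |func(t)| < eps}."""
    h = (b - a) / n
    count = sum(1 for k in range(n) if abs(func(a + (k + 0.5) * h)) < eps)
    return count * h


def polynomial_bound(degree, eps):
    """Right-hand side 2 d eps^(1/d) of the estimate for monic polynomials."""
    return 2.0 * degree * eps ** (1.0 / degree)


def monic_cubic(t):
    return t ** 3 - t


if __name__ == "__main__":
    measure = sublevel_measure(monic_cubic, 0.1, -2.0, 2.0)
    print(measure, "<=", polynomial_bound(3, 0.1))
```

For this polynomial the sublevel set consists of three short intervals around the roots $-1$, $0$, $1$ with total length about $0.4$, well below the bound $6\cdot 0.1^{1/3}\approx 2.78$.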
PROOF. Let us prove the first estimate by induction in $d$, noting that it is obvious for $d = 1$. If a polynomial $f$ of degree $d > 1$ has a zero $a$, then $f(t) = (t - a)g(t)$, where $g$ is a polynomial of degree $d - 1$, which, by the assumption of induction, satisfies the corresponding inequality. Since the set $\{|f| < \varepsilon\}$ is contained in the union of the sets $[a - \varepsilon^{1/d}, a + \varepsilon^{1/d}]$ and $\{|g| < \varepsilon^{1-1/d}\}$, its measure does not exceed $2\varepsilon^{1/d} + 2(d-1)\varepsilon^{(1-1/d)/(d-1)} = 2d\,\varepsilon^{1/d}$, whence the desired estimate. If $f$ has no zeros, say, is positive, then in the case $f > \varepsilon$ the estimate trivially holds, and the case where $f = \varepsilon$ has a root is reduced to the considered one by passing to $f - \varepsilon$. In order to prove the second estimate, it suffices to show that it holds true with $c(\nu) = 1$ for the standard Gaussian measure, which is easily achieved by applying Fubini's theorem and the one dimensional estimate proven above.
6.9.8. Example. Let $Q$ be the polynomial on $C[0,1]$ with the Wiener measure $P^W$ defined by the formula
$$Q(x) = \int_0^1 q(x(t))\,dt,$$
where $q$ is a nonconstant polynomial on the real line. Then the measure $P^W\circ Q^{-1}$ has a smooth density.
PROOF. Indeed, for any fixed $n$, we can split $[0,1]$ into $n$ equal closed intervals $I_j$ and choose smooth nonzero functions $h_j$ with supports in $I_j$. Taking for $X_0$ an arbitrary closed linear subspace in $C[0,1]$, algebraically complementing the linear span of the $h_j$'s, and writing $x = x_0 + c_1h_1 + \dots + c_nh_n$, $x_0\in X_0$, we obtain (since the supports of the $h_j$'s are disjoint) a decomposition of $Q$ with the properties required in Corollary 6.9.6.
We shall say that a quadratic form $Q$ on a Hilbert space $H$ is infinite dimensional if $Q(x) = (Ax, x)$, where $A\in\mathcal{L}(H)$ is a symmetric operator such that $\dim A(H) = \infty$.
6.9.9. Corollary. Let a quadratic form $Q$ on $X$ (in the usual algebraic sense) be measurable with respect to $\gamma$. Suppose that $Q$ is infinite dimensional on $H(\gamma)$. Then the measure $\gamma\circ Q^{-1}$ has a density from the Schwartz class $S(\mathbb{R}^1)$. The analogous claim is true for $Q = (Q_1,\dots,Q_n)\colon X\to\mathbb{R}^n$, where the $Q_i$'s are quadratic forms whose nontrivial linear combinations satisfy the foregoing condition.
The proof can be found in [77], [93] (see also Problem 6.11.25). It is an open problem whether there always exists a smooth (or bounded) density of the measure $\gamma\circ F^{-1}$, where $\gamma$ is a nondegenerate Gaussian measure on an infinite dimensional Hilbert space $X$ and $F$ is a continuous polynomial such that $F'\ne 0$ (or, more generally, the set $\{F' = 0\}$ is a finite dimensional manifold). A closely related problem is the study of the behavior of the quantity $\gamma(G < \varepsilon)$ for small $\varepsilon$ and a nonnegative polynomial $G$. It is worth noting in this connection that, in the infinite dimensional case, the set of zeros of a continuous polynomial may have a rather complicated structure (see Problem 6.11.19).
6.10. Surface measures

By the aid of Malliavin's method one can define surface measures generated by a Gaussian measure $\gamma$ on sufficiently regular surfaces. In order not to overshadow the essence of the problem by minor technical details, we shall assume that the surface $S\subset X$ is given by the equation $F(x) = 0$, where $F\in W^\infty(\gamma)$ and
$$\frac{1}{|D_HF|_H}\in\bigcap_{p>1}L^p(\gamma), \tag{6.10.1}$$
moreover, from the very beginning we shall take for $F$ its $C_\infty$-quasicontinuous version, i.e., a version that is $C_{p,r}$-quasicontinuous for all $p$, $r$ (see Problem 5.12.44 and Chapter 5). What should one mean by a surface measure on $S$? A natural interpretation is to take the $\varepsilon$-neighborhood of the surface, divide its measure by $2\varepsilon$ and let $\varepsilon$ tend to zero. Since a metric is, in general, defined only on $H(\gamma)$, a suitable candidate for the $\varepsilon$-neighborhood is the set $S + \varepsilon U_H$. However, for some technical reasons, it is more convenient to define the surface measure $\gamma_S$ as follows. As shown
above, the measure $\gamma\circ F^{-1}$ has a smooth density $k$. Let
$$O = \{y\colon k(y) > 0\}.$$
On the sets $S_y := F^{-1}(y)$, $y\in O$, we shall choose in a special way conditional measures $\gamma^y$ and, for any bounded Borel function $\varphi$, we shall put
$$\int_{S_y}\varphi(s)\,\gamma_{S_y}(ds) := k(y)\int_{S_y}\varphi(s)\,|D_HF(s)|_H\,\gamma^y(ds). \tag{6.10.2}$$
It will be shown below that this relationship defines some measure, which vanishes on the sets having zero $C_{p,r}$-capacity for all $p, r > 1$. Hence this measure is independent of our choice of a $C_\infty$-quasicontinuous version of $F$. Note also that since the measure $\gamma$ has a Souslin support $E$, which is embedded injectively into $\mathbb{R}^\infty$ by means of a continuous linear mapping, we may assume that $X = \mathbb{R}^\infty$ (we may assume also that we deal with a Hilbert space). This observation may simplify some technical details.
Let us find an absolutely convex compact set $K$ such that $\gamma(K) > 1/2$. We shall fix a function $\theta\in W^\infty(\gamma)$, which equals 1 on $K$ and is 0 outside $2K$ (the existence of such a function is established in Chapter 5). Put $\theta_j(x) = \theta(j^{-1}x)$. We denote by $g^*$ a $C_\infty$-quasicontinuous version of $g\in W^\infty(\gamma)$. For a function $g\in W^\infty(\gamma)$, let us denote by $k_g$ the smooth density of the measure $(g\cdot\gamma)\circ F^{-1}$ (which exists by virtue of condition (6.10.1)). We shall assume that $0\in O$, and give a construction for $y = 0$; for an arbitrary $y\in O$ the construction is completely analogous.
6.10.1. Lemma. Let us take a sequence of smooth probability densities $\varphi_j$ on the real line of the form $\varphi_j(t) = j\varphi_0(jt)$, where $\varphi_0$ is a smooth probability density with support in $[-2, 2]$, and $\varphi_0 = 1$ on $[-1, 1]$. Let us consider the measures
$$\nu_{n,j} = \varrho_{n,j}\cdot\gamma,\qquad \varrho_{n,j}(x) = \theta_n(x)\,\frac{\varphi_j(F(x))}{k(F(x))}.$$
Then, for every $n$ fixed, as $j\to\infty$, the measures $\nu_{n,j}$ converge weakly to some measure $\nu_n$ concentrated on $S$. In addition, the measures $\nu_n$ vanish on the sets which have zero $C_{p,r}$-capacity for all $p, r > 1$. Finally, for any function $g\in W^\infty(\gamma)$, one has the equality
$$\int_X g^*(x)\,\nu_n(dx) = \frac{k_{g\theta_n}(0)}{k(0)}. \tag{6.10.3}$$
PROOF. For any fixed $n$, the measures $\nu_{n,j}$ are uniformly bounded and have common compact support. Therefore, it suffices to prove the convergence of the integrals $\int g\,d\nu_{n,j}$, as $j\to\infty$, for all continuous $g\in W^\infty(\gamma)$. These integrals are written in the form
$$\int_X g(x)\theta_n(x)\,\frac{\varphi_j(F(x))}{k(F(x))}\,\gamma(dx) = \int k_{g\theta_n}(y)\,\frac{\varphi_j(y)}{k(y)}\,dy.$$
The right-hand side tends to $k_{g\theta_n}(0)/k(0)$ as $j\to\infty$ by virtue of our choice of $\varphi_j$. In addition, this gives formula (6.10.3).
Let now $A$ be a set of zero $C_{p,r}$-capacity for all $p, r > 1$ and let $\varepsilon > 0$. We can find an open set $U\supset A$, for which $C_{2,2}(U) < \varepsilon$. There exists a function $f_\varepsilon\in W^{2,2}(\gamma)$ such that $f_\varepsilon\ge 1$ a.e. on $U$ and $\|f_\varepsilon\|_{2,2}\le\varepsilon$. Then
$$(I_U\cdot\gamma)\circ F^{-1}\le(f_\varepsilon\cdot\gamma)\circ F^{-1} = k_{f_\varepsilon}\,dx,$$
whence $\nu_n(U)\le k_{f_\varepsilon}(0)/k(0)$. It remains to note that $k_{f_\varepsilon}(0)$ tends to zero as $\varepsilon\to 0$, since, by virtue of Example 6.9.4, there holds the estimate $|k_f(0)|\le\mathrm{const}\,\|f\|_{2,2}$. □
6.10.2. Theorem. The sequence $\{\nu_n\}$ is uniformly tight and converges weakly to some probability measure $\gamma^0$ concentrated on $S$ and vanishing on all sets having zero $C_{p,r}$-capacity for all $p, r > 1$. In addition, for all $g\in W^\infty(\gamma)$, the following equality holds true:
$$\int_X g^*(x)\,\gamma^0(dx) = \frac{k_g(0)}{k(0)}. \tag{6.10.4}$$
PROOF. Equality (6.10.3) shows that the sequence of nonnegative measures $\nu_n$ is bounded. In addition, it is uniformly tight. Indeed, the sequence $g_m = 1 - \theta_m$ tends to zero with respect to any norm $\|\cdot\|_{p,r}$. Moreover, $\sup_n\|g_m\theta_n\|_{p,r}\to 0$ as $m\to\infty$. According to Remark 6.9.3 and Example 6.9.5, $\sup_n k_{g_m\theta_n}(0)\to 0$ as $m\to\infty$. Since $g_m = 1$ outside the compact set $2mK$ and $g_m\ge 0$, then equality (6.10.3) implies the uniform tightness of the sequence $\{\nu_n\}$. Hence, for the proof of its weak convergence, it suffices to verify the convergence of the integrals $\int g\,d\nu_n$ for all bounded functions $g\in W^\infty(\gamma)$. It remains to note that $k_{g\theta_n}(y)\to k_g(y)$ for any $y\in O$. This follows from Remark 6.9.3 and Example 6.9.5. In addition, relationship (6.10.4) is established, whence it is seen that the measure $\gamma^0$ is a probability. Finally, the fact that the measure $\gamma^0$ vanishes on the sets having zero $C_{p,r}$-capacity for all $p, r > 1$ is proved in the same manner as in the previous lemma.
In a similar manner the conditional measures $\gamma^y$ arise. Now, by the aid of (6.10.2), we shall introduce the surface measures $\gamma_{S_y}$. Let us put $S = S_0$.
6.10.3. Corollary. For any function $g\in W^\infty(\gamma)$, one has the equality
$$\int_S g^*(x)\,\gamma_S(dx) = \lim_{\varepsilon\to 0}\frac{1}{2\varepsilon}\int_{|F|<\varepsilon} g(x)\,|D_HF(x)|_H\,\gamma(dx). \tag{6.10.5}$$
PROOF. For all $u\in W^\infty(\gamma)$, by virtue of the existence of a smooth density of
the measure (u y) o F-', one has k,.(0) = lim
'
J IFI 0. In the case of a smooth mapping this means that F is a local diffeomorphism. The density of the image-measure is given by the formula (2x)-"I2ldetVF(F_1(x))t-l
4(x) =
exp(-ZIIxiil)
Therefore, the question is: when is $|\det\nabla F| = 1$? Geometrically the equivalence of the initial condition to the latter one is obvious (since $F$ preserves the norm, it preserves the standard Gaussian measure if and only if it preserves the spherical Lebesgue measure). The latter is equivalent to the preservation of the Lebesgue volume by $F$, which is precisely our condition on the Jacobian. Note that
$$\nabla F(x) = U(x) + (\nabla U(x))(x) = U(x)\bigl[I + U^*(x)(\nabla U(x))(x)\bigr].$$
Since $|\det U(x)| = 1$, our initial condition reduces to the identity
$$\det\bigl[I + U^*(x)(\nabla U(x))(x)\bigr] = 1. \tag{6.11.1}$$
A simple sufficient condition for the last identity is the following one: for every $h\in\mathbb{R}^n$, the operator $\nabla(U^*(x)h)$ is nilpotent. Indeed, differentiating the identity $U^*(x)U(x) = I$, we get
$$U^*(x)\nabla U(x) + [\nabla U^*(x)]\,U(x) = 0.$$
Hence the operator $U^*(x)\nabla U(x)(h)$ is nilpotent for any $h$. Letting $h = x$, we arrive at (6.11.1).
Recall that an operator $A\in\mathcal{L}(H)$ is called quasinilpotent if $\lim_{n\to\infty}\|A^n\|^{1/n} = 0$. If such an operator is compact, then it is called a Volterra operator. It is known that a nuclear Volterra operator has zero trace (see [296, Ch. III, Theorem 8.4]).
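The finite dimensional fact behind this sufficient condition is that $\det(I + N) = 1$ for every nilpotent matrix $N$, since all eigenvalues of $N$ vanish. A minimal sketch with a hypothetical $3\times 3$ example (the matrix `N` below is chosen only for illustration and is conjugated by an integer matrix so that it is not triangular):

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def identity_plus(m):
    n = len(m)
    return [[m[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]


# Strictly upper triangular matrix (nilpotent), conjugated by an integer
# matrix P with det P = 1, so that N = P T P^{-1} is nilpotent but not triangular.
T = [[0, 2, -1], [0, 0, 3], [0, 0, 0]]
P = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
P_INV = [[1, 0, 0], [-1, 1, 0], [1, -1, 1]]
N = matmul(matmul(P, T), P_INV)

if __name__ == "__main__":
    print("N =", N)
    print("N^3 =", matmul(N, matmul(N, N)))
    print("det(I + N) =", det(identity_plus(N)))
```

Here $N^3 = 0$ and the determinant of $I + N$ equals 1 exactly, in integer arithmetic.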
Let U: X -. C(H) be a mapping such that, for every h E H, the mapping Uh: x U(x)h belongs to W1 (-y, H). 6.11.2. Theorem. Suppose that U(x) is an orthogonal operator a.e. and that, for every h E H, the operator DHU(x)h is quasinilpotent a.e. Then, for every orthonormal basis {en } in H, the functions b(Uen) are independent standard Gaussian random variables on (X, -y). Therefore, the mapping
$$x\mapsto T(x) = \sum_{n=1}^{\infty}\delta(Ue_n)(x)\,e_n$$
is well-defined and $\gamma\circ T^{-1} = \gamma$.
6.11.3. Remark. Let $\xi$ and $\eta$ be independent standard Gaussian random variables and $\varphi$ a nonnegative Borel function. According to an unproven conjecture by Cantelli (see [516, p. 316]), if $\xi + \varphi(\xi)\eta$ is Gaussian, then
I.
To simplify notation, we write $f_j(y)$ instead of $f_j(y_1,\dots,y_{j-1})$ (i.e., we extend $f_j$ to $\mathbb{R}^n$ as a function depending only on the first $j - 1$ coordinates). We shall deal with specific versions of $f_j$. Namely, we fix Borel versions of $f_j(y)$ with the following properties: $(y, x)\mapsto f_j(y)(x)$ is a Borel function on some Borel set $D\subset\mathbb{R}^{j-1}\times X$
such that, for every y E R'-1, the section Dy := {x E D: (y,x) E D} is a Borel linear subspace with y(D5) = 1 and x - f,(y)(x) is linear on Du. In order to construct such a version, let us take an orthonormal basis 1C.1 in X, such that (,} C X' and define D as the domain of convergence of the series is a Borel function Clearly. ii;n: y n=1 x on IR-1-1 and E rj,n(y)2 < oc for every y. Then D is a Borel set with the desired n-l properties. With this choice, T becomes a Borel mapping. The next very interesting result was found in [473], [818] in a bit less general form, but with a similar proof.
6.11.4. Proposition. Suppose that, for every $y = (y_1,\dots,y_n)\in\mathbb{R}^n$, the elements $f_1, f_2(y_1),\dots,f_n(y_1,\dots,y_{n-1})$ are orthonormal in $X^*_\gamma$. Then $\gamma\circ T^{-1}$ is the standard Gaussian measure on $\mathbb{R}^n$. In addition, it is possible to choose Gaussian conditional measures $\gamma^y$, $y\in\mathbb{R}^n$, on $X$, i.e., every measure $\gamma^y$ is concentrated on the set $T^{-1}(y)$, for every $B\in\mathcal{B}(X)$ the function $y\mapsto\gamma^y(B)$ is measurable on $\mathbb{R}^n$ and
$$\gamma(B) = \int_{\mathbb{R}^n}\gamma^y(B)\,\gamma\circ T^{-1}(dy). \tag{6.11.2}$$
Finally, $\gamma^y$ has mean $a_y = \sum_{j=1}^{n} y_jR_\gamma f_j(y)$ and covariance operator
$$R_y\colon f\mapsto R_\gamma f - \sum_{j=1}^{n} f(R_\gamma f_j(y))\,R_\gamma f_j(y).$$
PROOF. The first claim is verified by induction in $n$. Suppose it is true for $n - 1$. Let us evaluate the Fourier transform of the measure $\gamma\circ T^{-1}$. This reduces to evaluating the integral
$$\int_X\exp\bigl(i[y_1g_1(x) + \dots + y_ng_n(x)]\bigr)\,\gamma(dx).$$
By the change of variables formula, it suffices to consider the case where $X = \mathbb{R}^\infty$, $\gamma$ is the product of the standard Gaussian measures and $g_1(x) = x_1$. If we fix $x_1$ and integrate with respect to $(x_2, x_3,\dots)$, then we get $\exp(iy_1x_1)\exp(-[y_2^2 + \dots + y_n^2]/2)$, which follows from the inductive assumption (recall that, for $x_1$ fixed, $g_2 = f_2(x_1)$ is a constant element of $X^*_\gamma$, $g_3(x) = f_3(x_1, g_2)(x)$, and so on). Integrating in $x_1$, we get $\exp(-[y_1^2 + \dots + y_n^2]/2)$, whence the first claim.
Now let us prove the second claim. Note that, although $T$ may not be linear, the sets $T^{-1}(y)$ are affine subspaces $T_y^{-1}(y)$ of codimension $n$, where $T_y$ is the linear mapping generated by the functionals $f_1, f_2(y_1),\dots,f_n(y_1,\dots,y_{n-1})$. Clearly, $T_y^{-1}(y) = a_y + \mathrm{Ker}\,T_y$, since $T_y(a_y) = y$, which follows from the relationship
$$f_j(y)\bigl(R_\gamma f_k(y)\bigr) = \delta_{jk}.$$
Thus, in order to conclude that the announced measure $\gamma^y$ is concentrated on $T^{-1}(y)$, it suffices to note that $R_y(X^*)\subset\mathrm{Ker}\,T_y$. Finally, let us verify (6.11.2). To this end, given $f\in X^*$, it has to be shown that
$$\widetilde\gamma(f) = \int_{\mathbb{R}^n}\widetilde{\gamma^y}(f)\,\gamma\circ T^{-1}(dy).$$
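By definition,
$$\widetilde{\gamma^y}(f) = \exp\Bigl[i\sum_{j=1}^{n} y_jf(R_\gamma f_j(y))\Bigr]\exp\Bigl[-\frac12R_\gamma(f)(f) + \frac12\sum_{j=1}^{n} f(R_\gamma f_j(y))^2\Bigr].$$
Let us integrate this expression in $y$ with respect to the standard Gaussian measure on $\mathbb{R}^n$. Integrating first in $y_n$ and using that the elements $f_j$ do not depend on $y_n$, we get
$$\exp\Bigl[i\sum_{j=1}^{n-1} y_jf(R_\gamma f_j(y)) - \frac12R_\gamma(f)(f) + \frac12\sum_{j=1}^{n} f(R_\gamma f_j(y))^2\Bigr]\exp\Bigl[-\frac12f(R_\gamma f_n(y))^2\Bigr]$$
$$= \exp\Bigl[i\sum_{j=1}^{n-1} y_jf(R_\gamma f_j(y))\Bigr]\exp\Bigl[-\frac12R_\gamma(f)(f) + \frac12\sum_{j=1}^{n-1} f(R_\gamma f_j(y))^2\Bigr].$$
Integrating then in $y_{n-1},\dots,y_1$, we get $\exp(-R_\gamma(f)(f)/2) = \widetilde\gamma(f)$, which proves our claim. □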
Let us mention an infinite dimensional version of Sard's theorem obtained in [795] (see also [287], [447], [452]).
6.11.5. Theorem. Let $F\colon X\to H$ be a Borel mapping, which belongs to the class $W^{\infty,1}(\gamma, H)$. Suppose that there exist a Borel set $\Omega$ with $\gamma(\Omega) > 0$ and a Borel function $r\colon\Omega\to(0,\infty)$ such that, for every $x\in\Omega$, the mapping $h\mapsto F(x + h)$, $H\to H$, is differentiable on the open ball $\{h\in H\colon |h|_H < r(x)\}$ and the mapping $h\mapsto D_HF(x + h)$ with values in $\mathcal{H}(H)$ is continuous. Put $T(x) = x + F(x)$. Then for any Borel set $B\subset X$, one has
$$\gamma\bigl(T(B\cap\Omega)\bigr)\le\int_{B\cap\Omega}|\Lambda_F(x)|\,\gamma(dx). \tag{6.11.3}$$
In particular, the image of the set {x E 11: det
0} has measure zero.
Note that if $\gamma(\Omega) = 1$ (e.g., if $F$ belongs to the class $\mathcal{H}C^1(\gamma, H)$) and if $T$ has Lusin's property (N), then (6.11.3) yields
$$\gamma(T(B))\le\int_B|\Lambda_F(x)|\,\gamma(dx). \tag{6.11.4}$$
This is not true in general if T does not have Lusin's property (N), but T has a version which satisfies (6.11.4).
Distributions of nonlinear functionals

The following result was proved in [496]. A shorter proof was suggested in [75].
6.11.6. Theorem. Let y be a nondegenerate Radon Gaussian measure on a Banach space X and M(t, x) = y(y: Ily - xfl 0.
Let $X$ be a Banach space whose norm $q$ satisfies the condition
$$\delta_q(\varepsilon)\ge C\varepsilon^\alpha, \tag{6.11.5}$$
where $C, \alpha > 0$. Suppose that $\gamma$ is a Radon Gaussian measure on $X$ such that $\dim H(\gamma) > \alpha$. Then $q$ has a bounded density of distribution on $(X,\gamma)$ (see [642]). For example, condition (6.11.5) is fulfilled for the spaces $L^p$, $1 < p < \infty$, with $\alpha = \max(p, 2)$. As shown in [642], condition (6.11.5) cannot be replaced by the weaker condition of the uniform convexity. It is known [640], [641] that on every infinite dimensional separable Banach space $X$, for every $\varepsilon > 0$, one can define a new norm $\|\cdot\|_\varepsilon$ and a centered Gaussian measure $\gamma_\varepsilon$ such that $(1 - \varepsilon)\|x\|_X\le\|x\|_\varepsilon\le(1 + \varepsilon)\|x\|_X$, and the function $F(r) = \gamma_\varepsilon(x\colon\|x\|_\varepsilon\le r)$ has derivative $F'$ unbounded near zero. If $X$ is Hilbert, then, in addition, the norms $\|\cdot\|_\varepsilon$ can be taken infinitely Fréchet differentiable outside the origin with bounded derivatives on the unit sphere. Let us mention one more result concerning the distribution of the norm (see [189], [780], [751]).
6.11.7. Theorem. Let $X$ be a Banach space such that its norm $q$ has $k$ Lipschitzian Fréchet derivatives on the unit sphere and satisfies condition (6.11.5). Let $\gamma$ be a centered Radon Gaussian measure on $X$. If $\dim H(\gamma) = \infty$, then the function $M\colon t\mapsto\gamma(x\colon q(x)\le t)$ is $k$ times differentiable and $M^{(k)}$ is absolutely continuous. In addition, the function $Q\colon x\mapsto\gamma(U + x)$, where $U$ is the unit ball in $X$, has $k$ continuous Fréchet derivatives. Moreover, the mapping $(t, x)\mapsto\gamma(tU + x)$ is $k$-fold continuously Fréchet differentiable. The aforementioned condition is fulfilled, in particular, for the spaces $L^{2n}[a, b]$, $n\in\mathbb{N}$.
6.11.8. Theorem. Let $F\in\mathcal{P}_2(\gamma)$ be such that $F\notin\mathcal{P}_1(\gamma)$ and let $A = D_H^2F$. The following two conditions are equivalent:
(i) either the quadratic form $Q\colon h\mapsto(Ah, h)_H$ is strictly positive or strictly negative definite on some two dimensional plane in $H$, or $F = g_1g_2 + g_3 + c$, where $g_i$ are measurable linear functionals, $c$ is a real number, and $g_3$ does not belong to the linear span of $g_1$ and $g_2$;
(ii) the induced measure $\mu := \gamma\circ F^{-1}$ admits a bounded density.
Moreover, in this case $\mu$ admits a density of bounded variation.
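The role of the two dimensional plane in condition (i) can be seen on the simplest quadratic functionals of a standard Gaussian vector. The sketch below is only an illustration with hypothetical helper names: for $F = x_1^2$ (positive definite on a line only) the ratio $\gamma(F\le\varepsilon)/\varepsilon$ blows up as $\varepsilon\to 0$, so no bounded density exists, while for $F = x_1^2 + x_2^2$ the ratio stays below $1/2$. Both probabilities are computed from the standard closed forms $\mathrm{erf}(\sqrt{\varepsilon/2})$ and $1 - e^{-\varepsilon/2}$.

```python
import math


def ratio_rank_one(eps):
    """gamma(x1^2 <= eps) / eps for a standard Gaussian x1."""
    return math.erf(math.sqrt(eps / 2.0)) / eps


def ratio_rank_two(eps):
    """gamma(x1^2 + x2^2 <= eps) / eps for independent standard Gaussians."""
    return -math.expm1(-eps / 2.0) / eps


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        print(eps, ratio_rank_one(eps), ratio_rank_two(eps))
```

The first column of ratios grows like $\varepsilon^{-1/2}$, whereas the second one converges to the value $1/2$ of the bounded density at the origin.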
PROOF. Assume that condition (i) holds true. To be specific, suppose that Q is strictly positive definite on the plane spanned by the vectors el and e2. We can always choose these vectors in such a way that Q(ei) = Q(e2) = 1 and (Ael,e2)u = 0.
Note that $\partial_{e_1} F = f_1 + c_1$, $\partial_{e_2} F = f_2 + c_2$, where $f_1, f_2$ are measurable linear functionals and $c_1, c_2 \in \mathbb{R}^1$. We shall deal with proper linear versions of $f_1$ and $f_2$. Then we get
$\partial_{e_1}\partial_{e_2} F = f_2(e_1) = (D_H^2 F e_1, e_2)_H = (Ae_1, e_2)_H = 0$ a.e. (6.11.6)
In a similar manner,
$\partial_{e_1}^2 F = \partial_{e_2}^2 F = 1$ a.e. (6.11.7)
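Put $v = \partial_{e_1} F\, e_1 + \partial_{e_2} F\, e_2$, $G = \partial_v F = (f_1 + c_1)^2 + (f_2 + c_2)^2$. We shall prove that there is $C > 0$ such that
$\int \varphi'(t)\, \mu(dt) \le C \sup_t |\varphi(t)|$, $\forall\, \varphi \in C_0^\infty(\mathbb{R}^1)$. (6.11.8)
This implies the existence of a density of bounded variation. In order to get (6.11.8), we shall prove that
$\int_X \varphi'(F(x))\, \frac{G(x)}{G(x) + \varepsilon}\, \gamma(dx) \le C \sup_t |\varphi(t)|$, $\forall\, \varphi \in C_0^\infty(\mathbb{R}^1)$, (6.11.9)
with some $C$ independent of $\varepsilon$. Since $G > 0$ a.e., this yields (6.11.8). Let $\varphi \in C_0^\infty(\mathbb{R}^1)$ be fixed. Integrating by parts, the left-hand side of (6.11.9) can be rewritten as
$\int_X \partial_v(\varphi \circ F)\, \frac{1}{G + \varepsilon}\, d\gamma = -\int_X \varphi \circ F \Bigl[ -\frac{\partial_v G}{(G + \varepsilon)^2} + \frac{\delta v}{G + \varepsilon} \Bigr] d\gamma.$
Using (6.11.6) and (6.11.7), we get $\partial_v G = 2G$ and $\delta v = 2 - (f_1 + c_1)\widehat{e}_1 - (f_2 + c_2)\widehat{e}_2$. Therefore, it remains to be shown that the $L^1$-norms of the functions
$\frac{2\varepsilon}{(G + \varepsilon)^2} - \frac{(f_1 + c_1)\widehat{e}_1}{G + \varepsilon} - \frac{(f_2 + c_2)\widehat{e}_2}{G + \varepsilon}$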
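are uniformly bounded in $\varepsilon > 0$. It follows from our choice of $f_1, f_2, e_1, e_2$ that the measure $\nu$ on $\mathbb{R}^2$ induced by the mapping $(f_1 + c_1, f_2 + c_2)$ is nondegenerate and has a density bounded by some $M > 0$. Using the change of variables formula and then polar coordinates, we get
$\int_X \frac{2\varepsilon}{(G + \varepsilon)^2}\, d\gamma = \int_{\mathbb{R}^2} \frac{2\varepsilon}{(x^2 + y^2 + \varepsilon)^2}\, \nu(dx\,dy) \le 2\pi M \int_0^\infty \frac{2\varepsilon r}{(r^2 + \varepsilon)^2}\, dr = 2\pi M.$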
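For convenience, the radial integral in the last estimate can be evaluated explicitly by the substitution $u = r^2 + \varepsilon$. This is a routine check added here, not part of the original argument; $M$ and $\nu$ are as in the text.

```latex
\int_0^\infty \frac{2\varepsilon r}{(r^2+\varepsilon)^2}\,dr
  = \varepsilon \int_\varepsilon^\infty \frac{du}{u^2}
  = \varepsilon \cdot \frac{1}{\varepsilon}
  = 1,
\qquad\text{hence}\qquad
\int_{\mathbb{R}^2} \frac{2\varepsilon}{(x^2+y^2+\varepsilon)^2}\,\nu(dx\,dy)
  \le M \int_0^{2\pi}\!\!\int_0^\infty \frac{2\varepsilon r}{(r^2+\varepsilon)^2}\,dr\,d\theta
  = 2\pi M .
```

In particular, the bound does not depend on $\varepsilon$.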
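Let us estimate the norm of $A_1 := (f_1 + c_1)\widehat{e}_1 (G + \varepsilon)^{-1}$. Let $\widehat{e}_1 = a_1 f_1 + a_2 f_2 + g$, where $a_i$ are some reals independent of $\varepsilon$ and $g$ is a measurable linear functional orthogonal in $L^2(\gamma)$ to $f_1$ and $f_2$. Then $g$ is independent of $f_1$ and $f_2$, and since $G$ is a function of $f_1$ and $f_2$, we get
$\|A_1\|_{L^1(\gamma)} \le |a_1| \Bigl\| \frac{(f_1 + c_1) f_1}{G + \varepsilon} \Bigr\|_{L^1(\gamma)} + |a_2| \Bigl\| \frac{(f_1 + c_1) f_2}{G + \varepsilon} \Bigr\|_{L^1(\gamma)} + \|g\|_{L^1(\gamma)} \Bigl\| \frac{f_1 + c_1}{G + \varepsilon} \Bigr\|_{L^1(\gamma)}.$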
Since $G = (f_1 + c_1)^2 + (f_2 + c_2)^2$, in order to get a uniform bound, it suffices to show that $|f_1 + c_1| G^{-1}$ is integrable. Noting that the density of $\nu$ is majorized by $k \exp(-\theta x^2 - \theta y^2)$ for some positive $k$ and $\theta$, we get, using again the change of variables formula and polar coordinates,
$\int_X \frac{|f_1 + c_1|}{G + \varepsilon}\, d\gamma = \int_{\mathbb{R}^2} \frac{|x|}{x^2 + y^2 + \varepsilon}\, \nu(dx\,dy) \le 2\pi k \int_0^\infty \frac{r^2}{r^2 + \varepsilon} \exp(-\theta r^2)\, dr,$
which is uniformly bounded in $\varepsilon$. In a similar manner one estimates the term $(f_2 + c_2)\widehat{e}_2 (G + \varepsilon)^{-1}$. If the second possibility mentioned in (i) occurs, then $g_3 = c_1 g_1 + c_2 g_2 + g_4$, where $g_4$ is a $\gamma$-measurable functional independent of $g_1$ and $g_2$; hence the distribution of $F$ is the convolution of a nondegenerate Gaussian measure with a probability measure, whence the existence of a bounded density
follows. We observe that the previous consideration could be reduced at once to the two dimensional case if we write $F = c + \sum_{n=1}^\infty c_n \xi_n + \sum_{n=1}^\infty \alpha_n (\xi_n^2 - 1)$, where $c \in \mathbb{R}^1$, $\sum_{n=1}^\infty c_n^2 < \infty$, $\sum_{n=1}^\infty \alpha_n^2 < \infty$, and $\xi_n = \widehat{e}_n$ for some orthonormal basis $\{e_n\}$ in $H(\gamma)$. However, this would not ease the calculations above.
Let us show that if condition (i) is not satisfied, then $\mu$ cannot have a bounded density. First of all note that $Q$ has rank either 1 or 2 (if $Q$ has rank more than 2, then condition (i) is satisfied, and $Q$ cannot have rank zero, since then $F \in \mathcal{P}_1(\gamma)$). If $Q$ has rank 1, then, for some $m$, we get $F = \alpha_m (\xi_m^2 - 1) + \sum_{n=1}^\infty c_n \xi_n + c$, where $\alpha_m \ne 0$. Since condition (i) is not satisfied, one has $c_n = 0$ if $n \ne m$. Now the claim follows from the fact that for any reals $k$ and $c$, the distribution of the polynomial $x^2 + kx + c$ on $\mathbb{R}^1$ with the standard Gaussian measure has no bounded density. If $Q$ has rank 2, but is not definite, in a similar manner we obtain that $F = g_1 g_2 + g_3 + c$ a.e., where $g_i$ are nontrivial measurable linear functionals and $c \in \mathbb{R}^1$. Again, since condition (i) is not satisfied, $g_3$ is a linear combination of $g_1$ and $g_2$. Hence it remains to verify that the distribution of the polynomial $xy + \alpha x + \beta y + c$ on $\mathbb{R}^2$ does not admit a bounded density (see Problem 6.11.24). □
6.11.9. Corollary. Let $Q$ be a $\gamma$-measurable function which is a quadratic form on $X$ in the usual algebraic sense. A necessary and sufficient condition for the boundedness of the distribution density of $Q$ is the existence of a two dimensional linear subspace $L$ in $H(\gamma)$, on which the form $Q$ is either strictly positive definite or strictly negative definite (moreover, in this case the distribution density of $Q$ automatically has bounded variation).
Let us mention also the following nice result from [450].
6.11.10. Theorem. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $f_1, \ldots, f_n$ be $\gamma$-measurable polynomials. Put $F = (f_1, \ldots, f_n)$. Denote by $\mathcal{J}$ the class of all polynomials $Q$ on $\mathbb{R}^n$ such that $Q(F) = 0$ $\gamma$-a.e. Let $Z = \{z \in \mathbb{R}^n\colon Q(z) = 0,\ \forall\, Q \in \mathcal{J}\}$. Then the measure $\gamma \circ F^{-1}$ is absolutely continuous with respect to the natural Lebesgue measure on the algebraic variety $Z$. In particular, the measure $\gamma \circ F^{-1}$ is absolutely continuous if and only if $\mathcal{J} = \{0\}$.
Problems
6.11.11. (Rademacher-Ellis theorem [227], [632]). Let $\mu$ be a Borel measure on a Souslin space $X$ without points of positive measure. Prove that a Borel (or, more generally, $\mu$-measurable) mapping $F\colon X \to X$ satisfies Lusin's condition (N) if and only if it takes every $\mu$-measurable set into a $\mu$-measurable set. Hint: one implication follows from the measurability of the Borel images of Souslin sets (see Appendix); to get the converse, show that every set of positive measure with respect to a non-atomic Borel measure $\mu$ on a Souslin space $X$ contains a nonmeasurable subset (use that $(X, \mu)$ is isomorphic as a measurable space to a closed interval with Lebesgue measure).
6.11.12. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$, let $H = H(\gamma)$, and let $K \in \mathcal{H}(H)$ be a symmetric operator with the eigenvalues bigger than $-1$. Show that
$\bigl(\det\nolimits_2 (I + K)\bigr)^{-1/2} = \int_X \exp\bigl(\tfrac{1}{2}\delta K(x)\bigr)\, \gamma(dx).$ (6.11.10)
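A quick numerical sanity check of (6.11.10) in the simplest situation may be helpful. The sketch below assumes $X = H = \mathbb{R}^1$ with the standard Gaussian measure and $K$ equal to multiplication by a number $\alpha > -1$, so that $\delta K(x) = \alpha - \alpha x^2$ and $\det_2(1 + \alpha) = (1 + \alpha)e^{-\alpha}$. The function names and the quadrature parameters are illustrative choices, not taken from the text.

```python
import math

def det2_scalar(alpha):
    """Carleman-Fredholm determinant det_2(1 + alpha) = (1 + alpha) * exp(-alpha)."""
    return (1.0 + alpha) * math.exp(-alpha)

def gaussian_integral(alpha, half_width=12.0, n=20_000):
    """Midpoint-rule approximation of the integral of exp((alpha - alpha*x^2)/2)
    with respect to the standard Gaussian measure on the real line.

    The two exponentials are merged into one, exp(alpha/2 - (1 + alpha) x^2 / 2),
    to avoid overflow for negative alpha.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += math.exp(0.5 * alpha - 0.5 * (1.0 + alpha) * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

def check(alpha):
    """Return (left side, right side) of (6.11.10) for K = alpha on R^1."""
    return det2_scalar(alpha) ** -0.5, gaussian_integral(alpha)

if __name__ == "__main__":
    for a in (-0.5, 0.3, 2.0):
        lhs, rhs = check(a)
        print(f"alpha={a:+.1f}  lhs={lhs:.8f}  rhs={rhs:.8f}")
```

The truncation interval $[-12, 12]$ is adequate only when $\alpha$ is not too close to $-1$; closer to $-1$ the integrand decays slowly and `half_width` would have to be increased.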
Hint: take an orthonormal basis in $H$ consisting of the eigenvectors of $K$ and reduce the claim to the one dimensional case: then $Kx = \alpha x$, $\delta K(x) = \alpha - \alpha x^2$, and the integral on the right in (6.11.10) equals $\exp(\alpha/2)(1 + \alpha)^{-1/2}$, which is the left side.
6.11.13. Let $A, B \in C_b^\infty(\mathbb{R}^1)$, $A \ge c > 0$, and let $\xi$ be the diffusion process on $[0, 1]$ governed by the stochastic differential equation $d\xi_t = A(\xi_t)\,dw_t + B(\xi_t)\,dt$, $\xi_0 = 0$, where $w_t$ is a Wiener process. Prove that there exists a continuous mapping $\Phi\colon C[0, 1] \to C[0, 1]$ such that $\xi = \Phi(w)$ a.e. Show that this may be false for a diffusion in $\mathbb{R}^2$. Hint: in the one dimensional case use Itô's formula to find a diffeomorphism $\varphi$ such that the process $\eta_t = \varphi(\xi_t)$ is a diffusion with the unit diffusion coefficient and a smooth bounded drift $v$. Show that the unique solution $\eta_t$ of the integral equation $\eta_t(w) = \eta_0 + w(t) + \int_0^t v(\eta_s(w))\, ds$ depends continuously on $w \in C[0, 1]$. In order to construct a counter-example in $\mathbb{R}^2$, show that the functional
$S(w) = \int_0^1 w_1(t)\, dw_2(t) - \int_0^1 w_2(t)\, dw_1(t)$, $w = (w_1, w_2)$,
where $w_1(t)$ and $w_2(t)$ are two independent standard Wiener processes, has no continuous modification (use [361, representation (6.6), §6, Ch. 6]). Deduce that the same is true for each of the two integrals separately, which implies that the functional $f(w) = \int_0^1 \varphi(w_1(t))\, dw_2(t)$ has no continuous modification on the unit ball in $C([0, 1], \mathbb{R}^2)$, provided $\varphi$ is a smooth function such that $|\varphi| \le 1$ and $\varphi(x) = x$ on $[-1, 1]$. Finally, consider the matrix-valued mapping $A(x) = (A_{ij})$ with $A_{11}(x) = 1$, $A_{22}(x) = 1$, $A_{12}(x) = \varphi(x_2)/2$, and $A_{21}(x) = 0$.
6.11.14. Let $A$ and $B$ be twice continuously differentiable functions on $\mathbb{R}^1$ with bounded derivatives and let $\xi_t$ be the diffusion process given by the stochastic differential equation $d\xi_t = A(\xi_t)\,dw_t + B(\xi_t)\,dt$, $\xi_0 = c$, $c \in \mathbb{R}^1$. Prove that if $A$ is not constant, then the measure $\mu^\xi$ induced by $\xi$ on $C[0, 1]$ has no nonzero vectors of continuity in the sense of Problem 2.12.23 (in particular, $\mu^\xi$ has no nonzero vectors of differentiability in the sense of Definition 5.1.3). Hint: assuming that $\mu^\xi$ is continuous along $h$, show that $h$ is Hölder continuous of order 1/2 (see [77]); consider the function
$L(t, x) = \limsup_{\delta \to 0} |x(t + \delta) - x(t)|\, \bigl(2\delta \log\log(1/\delta)\bigr)^{-1/2}$, $t \in [0, 1]$, $x \in C[0, 1]$.
Show that $\mu^\xi\bigl(x\colon L(t, x) = |A(x(t))|\bigr) = 1$ for every $t$ (see equality (7.1.4) in Chapter 7). Notice that $L(t, x + \lambda h) = L(t, x)$ for every $\lambda \in \mathbb{R}^1$ and choose $t$ in such a way that $A(x(t) + \lambda h(t)) \ne A(x(t))$ for sufficiently small positive $\lambda$ and all $x$ from a positive measure set.
6.11.15. Let $\xi_t$ be the diffusion process in $\mathbb{R}^n$ governed by the stochastic differential equation $d\xi_t = A(t)\,dw_t + [B(t)\xi_t + C(t)]\,dt$, $\xi_0 = x \in \mathbb{R}^n$, where $A$ and $B$ are bounded Borel matrix-valued mappings and $C$ is a bounded Borel vector-valued mapping on $[0, 1]$. Show that the process $\xi$ is Gaussian. Hint: use that the affine transformations of Gaussian vectors are Gaussian and that the limit of a sequence of Gaussian vectors is Gaussian.
6.11.16. Let $A$ and $B$ be twice continuously differentiable functions on $\mathbb{R}^1$ with bounded derivatives. Prove that the diffusion process given by the stochastic differential equation $d\xi_t = A(\xi_t)\,dw_t + B(\xi_t)\,dt$, $\xi_0 = c$, $c \in \mathbb{R}^1$, is Gaussian if and only if $A = \mathrm{const}$ and $B(x) = ax + b$, where $a, b \in \mathbb{R}^1$. Prove the multidimensional analogue in the case where $A$ takes values in the positive matrices (the claim is false without this restriction). Hint: use Problem 6.11.14 to reduce the problem to the case $A = 1$. In that case use Girsanov's theorem to show that the distribution $\mu^\xi$ of the process $\xi$ in $C[0, 1]$ is equivalent to the Wiener measure $P^W$ and its Radon-Nikodym density is $\exp Q(w)$, where $Q$ is given
by (6.7.1). Notice that $Q \in \mathcal{P}_2(P^W)$, since $\mu^\xi$ is Gaussian, whence $\partial_h Q \in \mathcal{P}_1(P^W)$ for any $h \in C_0^\infty[0, 1]$. Noting that the same is true for any interval $[0, \tau]$, $\tau \in (0, 1)$, replacing $[0, 1]$, and evaluating $\partial_h Q$ explicitly, conclude that $B'' = 0$.
6.11.17. Suppose that $f$ is a diffeomorphism of $\mathbb{R}^1$ such that $\gamma_1 \circ f^{-1} = \gamma_1$, where $\gamma_1$ is the standard Gaussian measure. Show that $f(x) = x$ or $f(x) = -x$. Hint: show that $f(0) = 0$; assuming that $f'(0) > 0$, use the change of variables formula to show that the inverse function $g$ satisfies the ordinary differential equation $g'(y) = \exp\bigl(\tfrac{1}{2} g(y)^2 - \tfrac{1}{2} y^2\bigr)$.
6.11.18. Show that there exists a nonlinear homeomorphism $f$ of the real line such that $\gamma_1 \circ f^{-1} = \gamma_1$, where $\gamma_1$ is the standard Gaussian measure.
6.11.19. (S.V. Konyagin). Construct an example of a continuous polynomial $F$ of degree 4 on a separable Hilbert space such that the cardinality of the disjoint connected closed components of the set $F^{-1}(0)$ is continuum. Moreover, an arbitrary Souslin set $A$ on the real line can be obtained as the orthogonal projection of such a set $F^{-1}(0)$. Hint: given a closed set $T \subset l^2$, whose complement is the union of the open balls with centers $a^{(n)}$ and radii $r_n$, consider a polynomial $F$ on $l^2 \times l^2$ defined by the formula
$F(x, y) = \sum_{n=1}^\infty 2^{-n} \bigl( c_n (r_n^2 - \|x - a^{(n)}\|^2) + y_n^2 \bigr)^2,$
where $c_n = 2^{-n} (1 + r_n^2 + \|a^{(n)}\|^2)^{-1}$. Verify that the projection of $F^{-1}(0)$ to the first factor coincides with $T$. Show that $A$ can be obtained as the orthogonal projection of a closed set in $l^2$.
6.11.20. Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$ and let $F = (F_1, \ldots, F_d)\colon \mathbb{R}^n \to \mathbb{R}^d$ be a mapping such that the $F_i$'s are polynomials, and the matrix with the entries $\sigma_{ij} = (\nabla F_i, \nabla F_j)$, $i, j = 1, \ldots, d$, is nondegenerate at every point. Then the induced measure $\gamma_n \circ F^{-1}$ on $\mathbb{R}^d$ has a smooth density from the Schwartz class $S(\mathbb{R}^d)$. Hint: use that $|\det(\sigma_{ij})|^{-1} \in L^r(\gamma_n)$ for all $r > 1$ by the Seidenberg-Tarski theorem.
6.11.21. Prove Corollary 6.10.4.
6.11.22. Construct an example of a measurable mapping $F\colon X \to H$ such that $F$ is smooth along $H$, $\|D_H F\|_{\mathcal{L}(H)} \le 1/2$, but the measure $\gamma \circ (I + F)^{-1}$ is singular with respect to $\gamma$. Hint: find $F$ of the form $F(x)$ on $\mathbb{R}^\infty$ with the measure $\gamma$ which is the countable product of the standard Gaussian measures on the real line.
6.11.23. Show that the quadratic form
$Q(w) = \int_0^1 \int_0^t g(t, s)\, dw_s\, dw_t$
has a bounded distribution density if and only if there is a two dimensional plane $L \subset C_0^\infty[0, 1]$ such that $Q$ on $L$ is either strictly positive definite or strictly negative definite.
6.11.24. Let $\xi_1$ and $\xi_2$ be two independent standard Gaussian random variables. Show that $\xi_1^2 - \xi_2^2$ has no bounded density of distribution. Show that $\xi_1 - \xi_2^2$ has a $C_b^\infty$-density of distribution.
6.11.25. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ and let $F = (F_1, \ldots, F_n)$, where $F_i \in \mathcal{X}_0 \oplus \mathcal{X}_1 \oplus \mathcal{X}_2$. Suppose that all nontrivial linear combinations of the operators $D_H^2 F_1, \ldots, D_H^2 F_n$ have infinite dimensional ranges. Show that $\gamma \circ F^{-1}$ has a density from $S(\mathbb{R}^n)$.
CHAPTER 7
Applications

First, I believe, an introduction must be given at the beginning of the speech... Second is statement of the facts with any direct evidence to support it, third is indirect evidence, fourth is what seems probable, and I believe confirmation and supplementary confirmation are spoken of by that outstanding master-artist in words, the man from Byzantium.
Plato. The Phaedrus

No philosophy will prove the connection necessary for all sciences so well as a specialized investigation of a part of a science whatever it be. Here at every step one encounters something that cannot be understood without knowing one thing or another; and it is sometimes a long way to go for a reference. It is here that it becomes most clear that the sciences not only border one with another, but strike and penetrate one into another. However, in specialized investigations, it is a method and a direction which is principal.
N. I. Pirogov. Letters from Heidelberg
7.1. Trajectories of Gaussian processes
Let $(\xi_t)_{t \in T}$ be a Gaussian process on a set $T$. The trajectories (or the sample paths) of the process $\xi$ are functions on $T$, hence it is natural to raise questions about their properties such as boundedness or continuity (in the case where $T$ is a topological space). Since the distribution of the process is uniquely determined by its mean and covariance, one can expect that there exists a characterization of the boundedness or continuity of the sample paths in terms of these characteristics. The first result in this direction was obtained by Kolmogorov, who posed the problem in the general case. This problem has proved to be very difficult, and only recently have the efforts of many years resulted in a solution that can be considered sufficiently complete. In order to simplify the formulations we consider below only centered Gaussian processes. The principal idea in this circle of problems is to consider the semimetric
$d(s, t) = \sqrt{\mathbb{E}|\xi_s - \xi_t|^2}$, $s, t \in T$,
and the associated metric entropy $H(T, d, \varepsilon)$, which is defined as $H(T, d, \varepsilon) = \log N(T, d, \varepsilon)$, where $N(T, d, \varepsilon)$ is the minimal number of points in an $\varepsilon$-net in $T$ with respect to the semimetric $d$ (i.e., points $a_i$ such that the open balls of radius $\varepsilon$ centered at $a_i$ cover $T$). Therefore, one is concerned with the case where $T$ is completely bounded with respect to $d$. Note that normally $d$ is a metric (i.e.,
$\xi_t = \xi_s$ a.e. if $d(s, t) = 0$), but we do not assume this. The expression
$J(T, d) := \int_0^\infty \sqrt{H(T, d, \varepsilon)}\, d\varepsilon$
is called the Dudley integral. It is clear that, for any completely bounded (with respect to $d$) space $T$, the convergence of this integral is equivalent to its convergence at zero.
7.1.1. Example. Let $w_t$ be a Wiener process on $[0, 1]$. Then $d(s, t) = \sqrt{|t - s|}$ and $N(T, d, \varepsilon) = [1/(2\varepsilon^2)] + 1$, where $[c]$ is the entire part of $c$. Hence the integrand has the logarithmic singularity at zero, and the Dudley integral converges.
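The convergence claimed in this example can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the covering number of $[0, 1]$ by open $d$-balls of radius $\varepsilon$ (intervals of length $2\varepsilon^2$) to be $[1/(2\varepsilon^2)] + 1$ and approximates the integral of $\sqrt{\log N}$ by a midpoint rule. The function names and the grid size are hypothetical choices.

```python
import math

def covering_number(eps):
    """Number of open d-balls of radius eps needed to cover [0, 1] when
    d(s, t) = sqrt(|t - s|): each ball is an interval of length 2 * eps**2."""
    return math.floor(1.0 / (2.0 * eps * eps)) + 1

def dudley_integral(n=20_000):
    """Midpoint-rule approximation of the integral of sqrt(log N(eps)) over (0, 1].

    For eps > 1/sqrt(2) a single ball covers [0, 1], so the integrand is zero
    there and nothing is lost by stopping at eps = 1.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        eps = (i + 0.5) * h
        total += math.sqrt(math.log(covering_number(eps)))
    return total * h

if __name__ == "__main__":
    for n in (5_000, 20_000, 80_000):
        print(n, dudley_integral(n))
```

The printed values stabilize as the grid is refined, in agreement with the integrability of $\sqrt{\log(1/\varepsilon)}$ at zero.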
Making use of the metric entropy one can estimate the supremum of the Gaussian process $\xi_t$. Put
$\mathrm{Sup}(T) = \sup\bigl\{ \mathbb{E}\bigl(\sup_{t \in F} \xi_t\bigr),\ F \subset T,\ \mathrm{card}\, F < \infty \bigr\}.$
In the case of a separable process there is no need to employ finite subsets of $T$.
7.1.2. Theorem. There exist two positive numbers $C_1$ and $C_2$ such that, for any centered Gaussian process $\xi_t$, the following inequality is valid:
$C_1 \sup_{\varepsilon > 0} \varepsilon \sqrt{H(T, d, \varepsilon)} \le \mathrm{Sup}(T) \le C_2 J(T, d).$ (7.1.1)
Let us now introduce another metric characteristic. We fix a natural number $q > 1$. Let $\mathcal{A} = (\mathcal{A}_n)_{n \in \mathbb{N}}$ be a decreasing sequence of finite partitions of $T$ into parts of the diameters at most $2q^{-n}$. For every $t \in T$, let us denote by $A_n(t)$ the unique element of the partition $\mathcal{A}_n$ which contains $t$. Suppose that the elements $A \in \mathcal{A}_n$ of every partition are equipped with weights $\alpha_n(A)$ such that $\sum_{A \in \mathcal{A}_n} \alpha_n(A) \le 1$ for every $n$. Put
$\theta_{\mathcal{A}, \alpha} = \sup_{t \in T} \sum_{n=1}^\infty q^{-n} \Bigl( \log \frac{1}{\alpha_n(A_n(t))} \Bigr)^{1/2}.$
Finally, denote by $\theta(T)$ the infimum of the quantities $\theta_{\mathcal{A}, \alpha}$ over all possible partitions $\mathcal{A} = (\mathcal{A}_n)_{n \in \mathbb{N}}$ and weights $\alpha = (\alpha_n(A))_{A \in \mathcal{A}_n}$. The relationship $\theta(T) < \infty$ is called the majorizing measures condition. This condition is related indeed to measures on $T$. A probability Borel measure $\mu$ on $T$ is called majorizing if
$\sup_{t \in T} \int_0^\infty \Bigl( \log \frac{1}{\mu(B(t, \varepsilon))} \Bigr)^{1/2} d\varepsilon < \infty,$
where $B(t, \varepsilon)$ is the open ball of radius $\varepsilon$ centered at $t$. It is known that there exist two positive constants $K_1(q)$ and $K_2(q)$ such that
$K_1(q)\, \theta(T) \le \inf_\mu \sup_{t \in T} \int_0^\infty \Bigl( \log \frac{1}{\mu(B(t, \varepsilon))} \Bigr)^{1/2} d\varepsilon \le K_2(q)\, \theta(T),$
where the infimum is taken over all probability measures $\mu$. The following theorem characterizes the regularity of Gaussian processes in terms of the majorizing measures.
7.1.3. Theorem. Let $\xi = (\xi_t)_{t \in T}$ be a centered Gaussian process.
(i) For any $q > 1$, there exists a positive constant $C(q)$ such that $\mathrm{Sup}(T) \le C(q)\, \theta(T)$. If, in addition, $\theta_{\mathcal{A}, \alpha} < \infty$ for some sequence of partitions $\mathcal{A}$ with weights $\alpha$ and
$\lim_{k \to \infty} \sup_{t \in T} \sum_{n \ge k} q^{-n} \Bigl( \log \frac{1}{\alpha_n(A_n(t))} \Bigr)^{1/2} = 0,$ (7.1.2)
then the process $\xi$ has a modification, whose almost all trajectories are uniformly continuous on $(T, d)$.
(ii) There exists a number $q_0 > 2$ such that, for any $q > q_0$, there exists a number $C(q) > 0$ (independent of $\xi$), for which the following inequality is valid:
$\theta(T) \le C(q)\, \mathrm{Sup}(T).$
If, in addition, the process $\xi$ is continuous a.s., then condition (7.1.2) is fulfilled for some $\mathcal{A}$ and $\alpha$.
The following fact is very useful in applications.
7.1.4. Corollary. The convergence of the Dudley integral of a Gaussian process $\xi$ implies the existence of a continuous modification of $\xi$.
7.1.5. Example. Let $\xi = (\xi_t)_{t \in T}$ be a centered Gaussian process on a set $T \subset \mathbb{R}^n$ such that there exist positive numbers $C$ and $\delta$ such that
$\mathbb{E}|\xi_t - \xi_s|^2 \le C\, \bigl|\log |t - s|\bigr|^{-1-\delta}.$
Then $\xi$ has a modification with continuous trajectories on $T$ equipped with the metric from $\mathbb{R}^n$.
PROOF. In this case the function $H(T, d, \varepsilon)$ is estimated by $\mathrm{const}\, \varepsilon^{-2/(1+\delta)}$, hence the Dudley integral converges. □
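The entropy bound used in this proof can be spelled out as follows. This is a sketch of the omitted step under the additional assumption (not stated explicitly above) that $T$ is bounded; $K$ and $\rho(\varepsilon)$ are auxiliary symbols introduced only here.

```latex
% d(s,t)^2 <= C |log|t-s||^{-(1+\delta)}, so for |t-s| < 1:
d(s,t) < \varepsilon
  \quad\text{whenever}\quad
  |t-s| < \rho(\varepsilon) := \exp\!\bigl(-(C\varepsilon^{-2})^{1/(1+\delta)}\bigr).
% A bounded set in R^n is covered by at most K \rho^{-n} Euclidean balls of radius \rho:
N(T,d,\varepsilon) \le K\,\rho(\varepsilon)^{-n},
\qquad
H(T,d,\varepsilon) \le \log K + n\,(C\varepsilon^{-2})^{1/(1+\delta)}
  \le \mathrm{const}\;\varepsilon^{-2/(1+\delta)} .
% Hence the integrand of the Dudley integral is integrable at zero:
\sqrt{H(T,d,\varepsilon)} \le \mathrm{const}\;\varepsilon^{-1/(1+\delta)},
\qquad \frac{1}{1+\delta} < 1 .
```

For $\delta = 0$ the exponent would be exactly $1$ and this argument would no longer give convergence.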
Proofs of the aforementioned results and a more detailed discussion can be found in [471], [472], [491].
Concerning various interesting properties of the Brownian paths, see [369], [481]. For example, almost every Brownian path has no points of differentiability and has unbounded variation on every interval. The fact that the Brownian path has infinite variation (almost surely) on every interval $[a, b]$ follows immediately from the theorem on the quadratic variation (see Problem 7.6.12), which states that
$\lim_{n \to \infty} \sum_{i=1}^{2^n} |w_{i 2^{-n}} - w_{(i-1) 2^{-n}}|^2 = 1$ a.e. (7.1.3)
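Equality (7.1.3) is easy to observe in simulation. The following sketch is an illustration only: it samples the $2^n$ independent increments of a Wiener path over the dyadic partition of $[0, 1]$ at a single level $n$ (rather than refining one fixed path) and returns both the sum of squared increments and the sum of absolute increments. The function name and the seed are arbitrary choices.

```python
import math
import random

def dyadic_sums(level, seed=0):
    """Simulate the increments of a Wiener path over the 2**level dyadic
    subintervals of [0, 1] and return (sum of squares, sum of absolute values)."""
    rng = random.Random(seed)
    n = 2 ** level
    sd = math.sqrt(1.0 / n)  # each increment is N(0, 2**-level)
    increments = [rng.gauss(0.0, sd) for _ in range(n)]
    quadratic = sum(dx * dx for dx in increments)
    absolute = sum(abs(dx) for dx in increments)
    return quadratic, absolute

if __name__ == "__main__":
    for level in (6, 10, 14):
        q, a = dyadic_sums(level)
        print(f"n={level:2d}  sum of squares={q:.4f}  sum of |increments|={a:.2f}")
```

The sum of squares stays close to 1, while the sum of absolute increments grows roughly like $\sqrt{2^{n+1}/\pi}$, which is the numerical counterpart of the unbounded variation of the path.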
Indeed, for any path of bounded variation the corresponding limit (on $[a, b]$ replacing $[0, 1]$) is zero. The following more subtle result is proved in [257].
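7.1.6. Proposition. Let $f\colon \mathbb{R}^1 \times [0, \infty) \to \mathbb{R}^1$ be a continuous function. Suppose that for every $T > 0$,
$\lim_{n \to \infty} \sum_{i=0}^{[2^n T] - 1} \bigl| f\bigl(w_{(i+1) 2^{-n}}, (i+1) 2^{-n}\bigr) - f\bigl(w_{i 2^{-n}}, i 2^{-n}\bigr) \bigr|^2 = 0$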
in probability. Then $f(x, t) = f(0, t)$ for all $x \in \mathbb{R}^1$ and $t \ge 0$. In particular, this is the case if $f(w_t, t)$ a.s. has locally bounded variation.
An extension of Lévy's theorem to more general Gaussian processes is found in [45].
Another interesting property of a Brownian path $w_t$ is expressed by the following Khintchine loglog-law (see, e.g., [369, Ch. 1]): for every fixed $t \ge 0$, one has
$P\Bigl( \omega\colon \limsup_{\delta \to 0+} \frac{|w_{t+\delta}(\omega) - w_t(\omega)|}{\sqrt{2\delta \log\log \delta^{-1}}} = 1 \Bigr) = 1.$ (7.1.4)
It is worth noting in this connection that a sample modulus of continuity of $w_t$ satisfies a.s. the condition $\delta_w(h) = O\bigl(\sqrt{h \log h^{-1}}\bigr)$. More precisely, as shown by Lévy (see [369, Ch. 1]),
$P\Bigl( \omega\colon \limsup_{t - s \to 0+} \frac{|w_t(\omega) - w_s(\omega)|}{\sqrt{2|t - s| \log |t - s|^{-1}}} = 1 \Bigr) = 1.$
P(w: limsup [wt+h(w) h-o
- w,(w)[
[h[
> 0) = 1.
Chung, Erdős, and Sirao [168] showed that if $h$ is an increasing continuous function such that $t^{-1/2} h(t)$ is decreasing for small $t$, then the convergence of the integral
$J(h) = \int_0 t^{-7/2} h^3(t) \exp\Bigl( -\frac{h^2(t)}{2t} \Bigr) dt$
is necessary and sufficient for the equality
$P\bigl( \omega\colon \max_{|t - s| \le \varepsilon} |w_t(\omega) - w_s(\omega)| \le h(\varepsilon),\ \varepsilon \to 0+ \bigr) = 1.$
Moreover, if $J(h) = \infty$, then the probability above is zero.

7.2. Infinite dimensional Wiener processes
We shall now discuss the concept of a Wiener process in infinite dimensional locally convex spaces. Note that if $W_t$ is a standard Wiener process in $\mathbb{R}^n$, then for any unit vector $v \in \mathbb{R}^n$, the process $(v, W_t)$ is a one dimensional Wiener process. Hence one might try to define a Wiener process in a separable Hilbert space $X$ as a continuous process $W_t$ with values in $X$ such that, for every unit vector $v \in X$, the real process $(v, W_t)_X$ is Wiener. However, such a process does not exist if $X$ is infinite dimensional. Indeed, let $u$ and $v$ be two orthogonal unit vectors in $X$. Then
$\mathbb{E}\Bigl[ \Bigl( \frac{u + v}{\sqrt{2}}, W_t \Bigr)_X^2 \Bigr] = t = \frac{1}{2} \mathbb{E}\bigl[ (u, W_t)_X^2 \bigr] + \frac{1}{2} \mathbb{E}\bigl[ (v, W_t)_X^2 \bigr].$
Therefore, $\mathbb{E}\bigl[ (u, W_t)_X (v, W_t)_X \bigr] = 0$. Let $\{e_n\}$ be an orthonormal basis in $X$. Then the orthogonal jointly Gaussian random variables $(e_n, W_t)_X$ are independent and $\mathbb{E}\bigl[ (e_n, W_t)_X^2 \bigr] = t$. By virtue of a classical result, the series
$(W_t, W_t)_X = \sum_{n=1}^\infty (e_n, W_t)_X^2$
diverges a.s., which is a contradiction. Nevertheless, the idea suggested can be embodied as follows. Let $X$ be a locally convex space, let $H$ be a separable Hilbert space continuously and densely embedded into $X$, and let $j_H\colon X^* \to H$ be the embedding defined as follows. For any $k \in X^*$, the functional $h \mapsto {}_{X^*}\langle k, h \rangle_X$ is continuous on $H$. Hence there exists a vector $j_H(k) \in H$ such that, for all $h \in H$, one has
$(j_H(k), h)_H = {}_{X^*}\langle k, h \rangle_X.$ (7.2.1)
For example, if $H = H(\gamma)$, where $\gamma$ is a centered Radon Gaussian measure on the space $X$, then $j_H(f) = R_\gamma f$ for every $f \in X^*$.
7.2.1. Definition. Let $X$ and $H$ be the same as above. A continuous random process $(W_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, P)$ with values in $X$ is called a Wiener process associated with $H$ if, for every $k \in X^*$ with $|j_H(k)|_H = 1$, the one dimensional process $\langle k, W_t \rangle$ is Wiener.
Let $\mathcal{F}_t \subset \mathcal{F}$, $t \ge 0$, be an increasing family of $\sigma$-fields. A Wiener process $(W_t)_{t \ge 0}$ is called an $\mathcal{F}_t$-Wiener process if, for all $t, s \ge r$, the random vector $W_t - W_s$ is independent of $\mathcal{F}_r$ and the random vector $W_t$ is $\mathcal{F}_t$-measurable.
Note that in the case $X = H = \mathbb{R}^n$ we arrive at the usual definition. In connection with this definition, the following two problems arise.
a) Let $X$ be a locally convex space. Is it possible to find $H$ such that there exists an associated Wiener process in $X$?
b) Let $H \subset X$ be given. Does there exist an associated Wiener process in $X$?
The first question has a positive answer for many spaces.
7.2.2. Proposition. A Wiener process in a locally convex space $X$ exists precisely when there exists a separable Hilbert space $E$, continuously and densely embedded into $X$. If $X$ is sequentially complete, then this is equivalent to the existence of a bounded sequence in $X$, whose linear span is dense.
PROOF. The necessity of this condition is obvious. Suppose that $X$ is a separable Hilbert space with an orthonormal basis $\{\varphi_n\}$. Let us choose numbers $t_n > 0$ such that $\sum_{n=1}^\infty t_n < \infty$. Let
Jh1Hn / u(t)2 dt, 0
0
0
and, for any $u \in D(A_2)$, one has
$-\int_0^1 u''(t) u(t)\, dt = \int_0^1 u'(t)^2\, dt - u'u \big|_0^1 = \int_0^1 u'(t)^2\, dt + u(1)^2 + u(0)^2 \ge \int_0^1 u(t)^2\, dt.$
Since the embedding $H \subset X$ is a Hilbert-Schmidt mapping, there exists an injective nonnegative self-adjoint Hilbert-Schmidt operator $Q$ on $X$ such that $Q(X) = H$ and $|h|_H = |Q^{-1} h|_X$. It is clear that
$Q(X) = H \subset T_1(X) \cap T_2(X)$ and $T_1^{-2} h = T_2^{-2} h$, $\forall\, h \in H$.
Since $Q^2(X) \subset Q(X)$ and $T_i^2(X) \subset T_i(X)$, $i = 1, 2$, our claim follows from Example 7.3.4 if we choose for the $\mu_i$'s the centered Gaussian measures on $X$ with the covariance operators $T_i^2$, $i = 1, 2$, and take $A = R^*$, where $R = T_1^{-2} Q^2 = T_2^{-2} Q^2$. The theorem is proven.
7.3.8. Corollary. Let $\mu = \alpha \mu_1 + (1 - \alpha) \mu_2$, where $\mu_1$ and $\mu_2$ are the centered Gaussian measures from the previous theorem and $\alpha \in (0, 1)$. Then $\mu$ is a non-Gaussian probability measure, but $\beta_H^\mu = A$ is a bounded linear operator.
The situation described in Theorem 7.3.7 is not possible if $H$ is the Cameron-Martin space of the two measures.
7.3.9. Proposition. Let $\mu$ be a Radon probability measure on a locally convex space $X$, let $A$ be a continuous linear operator on $X$, and let $\beta_H^\mu = A$ $\mu$-a.e. Suppose that every set from $\mathcal{E}(X)$ up to a set of measure zero coincides with a set from the $\sigma$-field generated by the functionals $k \circ A$, $k \in X^*$. Then $\mu$ is a centered Gaussian measure. In particular, this is true if the operator $A$ is injective.
PROOF. Let $k \in X^*$. Put $l = k \circ A$ and $h = j_H(k)$. Suppose that $\beta_H^\mu(x)$ equals $Ax$ $\mu$-a.e. Then $\beta_h^\mu(x)$ coincides $\mu$-a.e. with $-l(x)$. Let us show that $l$ is a centered Gaussian random variable on $(X, \mu)$. To this end, note that, by virtue of the integration by parts formula, one has
$\frac{d}{dt} \widetilde{\mu}(tl) = i \int l(x) \exp(itl(x))\, \mu(dx) = i \int \partial_h \exp(itl(x))\, \mu(dx) = -t\, l(h)\, \widetilde{\mu}(tl),$
whence $\widetilde{\mu}(tl) = \exp\bigl(-\tfrac{1}{2} l(h) t^2\bigr)$. Thus, all functionals $k \circ A$, $k \in X^*$, are centered Gaussian random variables on $(X, \mu)$. By condition, this implies that $\mu$ is the centered Gaussian measure with covariance $Q(k \circ A) = \langle k, A j_H(k) \rangle$. The last claim follows from Proposition A.3.12 in Appendix.
Let $\mu$ and $\nu$ be two equivalent probability measures on $X$ constructed according to Example 7.3.8 such that $\beta_H^\mu = \beta_H^\nu = A$, where $A$ is a continuous linear mapping on $X$. Let $f_1, \ldots, f_n$ be continuous linear functionals on $X$ such that $j_H(f_1), \ldots, j_H(f_n)$ are orthonormal in $H$ and let $Y = \bigcap_{i=1}^n \mathrm{Ker}\, f_i$. Put $P =$
$(f_1, \ldots, f_n)\colon X \to \mathbb{R}^n$ and $L = \mathrm{Ker}\, P$. Denote by $\pi_Y$ the natural projection of $X$ to $Y$ and put $\mu_Y = \mu \circ \pi_Y^{-1}$, $\nu_Y = \nu \circ \pi_Y^{-1}$. Then the measures $\mu$ and $\nu$ have equal Gaussian conditional measures $\mu^y = \nu^y$ on the subspaces $L + y$, $y \in Y$. Indeed, it is known (see [82], [91]) that, for almost all (with respect to both $\mu_Y$ and $\nu_Y$) points $y \in Y$, the measures $\mu^y$ have the logarithmic gradients, associated with the subspace of $H$ spanned by $j_H(f_1), \ldots, j_H(f_n)$, given by $\beta^y(x) = P\beta(x)$, $x \in y + L$. This implies that $\mu^y = \nu^y$ is a Gaussian measure. For other examples of this phenomenon in the theory of Gibbs measures, see [286]. See also [565] for related examples.
7.4. Spherically symmetric measures
In infinite dimensions, there is no analogue of Lebesgue measure, hence it is more difficult to define nontrivial symmetries of measures. It is easily seen that Dirac's measure at zero is the only spherically symmetric probability measure on an infinite dimensional Hilbert space $X$. However, one can consider $H$-spherically symmetric measures on $X$ as measures that are invariant with respect to the action of the group of orthogonal operators on $H$. A more precise definition is as follows.
Let a separable Hilbert space H be densely and continuously embedded into a locally convex space X as described above.
7.4.1. Definition. A probability Radon measure $\mu$ on $X$ is called $H$-spherically symmetric if its Fourier transform has the form
$\widetilde{\mu}(l) = \varphi\bigl(|j_H(l)|_H\bigr)$, $\forall\, l \in X^*$,
where $\varphi$ is a function on $\mathbb{R}^1$.
7.4.2. Theorem. Let $\mu^1$ be a centered Radon Gaussian measure on a locally convex space $X$ with the infinite dimensional Cameron-Martin space $H$. Then a measure $\mu$ on $\mathcal{E}(X)$ without atom at zero is $H$-spherically symmetric if and only if it is the mixture of the Gaussian measures $\mu^t\colon B \mapsto \mu^1(t^{-1} B)$, i.e.,
$\mu(B) = \int_0^\infty \mu^t(B)\, \sigma(dt)$, $B \in \mathcal{E}(X)$, (7.4.1)
where $\sigma$ is some probability measure on $(0, \infty)$.
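PROOF. By condition, $\widetilde{\mu}(f) = \varphi\bigl(|j_H(f)|_H\bigr)$, where $\varphi$ is a function on $[0, \infty)$. By the Lebesgue theorem, $\varphi$ is continuous. Since $\widetilde{\mu}$ is nonnegative definite, Schoenberg's theorem applies (see [800, Theorem 4.2, Ch. IV]), which yields the representation
$\varphi(|h|) = \int_0^\infty \exp\bigl(-\tfrac{1}{2} t^2 |h|^2\bigr)\, \sigma(dt)$
with some probability measure $\sigma$ on $(0, \infty)$. This yields
$\widetilde{\mu}(f) = \int_0^\infty \exp\bigl(-\tfrac{1}{2} t^2 |j_H(f)|_H^2\bigr)\, \sigma(dt) = \int_0^\infty \widetilde{\mu^t}(f)\, \sigma(dt),$
whence the conclusion. □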
7.4.3. Corollary. Let $H$ be a separable infinite dimensional Hilbert space continuously and linearly embedded into a locally convex space $X$. Suppose that there exists a Radon probability measure $\mu$ on $X$ that is $H$-spherically symmetric and has
no atom at zero. Then there exists a centered Radon Gaussian measure $\mu^1$ with the Cameron-Martin space $H$ such that (7.4.1) holds.
PROOF. Suppose first that $X$ is complete. As above, we get the representation
$\varphi(|h|) = \int_0^\infty \exp\bigl(-\tfrac{1}{2} t^2 |h|^2\bigr)\, \sigma(dt)$
with some probability measure $\sigma$ on $(0, \infty)$. Let $\varepsilon > 0$. Let us find $r > 0$ such that $\sigma((r, \infty)) > 1 - \varepsilon$. By the additional completeness assumption, there is an absolutely convex compact set $K$ with $\mu(K) > 1 - \varepsilon$. Now let $C$ be any cylinder containing $K$. Denote by $\mu^1$ the cylindrical additive set function with the Fourier transform $\exp\bigl(-|j_H(f)|_H^2 / 2\bigr)$ and note that (7.4.1) holds true for the cylindrical sets $B$. There exists a cylindrical absolutely convex set $Q$ with a compact base such that $K \subset Q \subset C$ (see the proof of Lemma 2.1.6). Clearly, $\mu^t(Q) \le \mu^r(Q)$ if $t \ge r$. Hence $1 - \varepsilon < \mu(Q) \le \varepsilon + \mu^r(Q)$, which gives $\mu^r(C) > 1 - 2\varepsilon$. Therefore, $(\mu^1)^*(K/r) \ge 1 - 2\varepsilon$. It remains to apply Theorem A.3.19 in Appendix. In the general case (where $X$ may not be complete), one can consider $\mu$ on a completion $Y$ of $X$ and get the corresponding Radon Gaussian measure $\mu^1$ on $Y$. Then $\mu^1(X) = 1$. Indeed, letting $K \subset X$ be any compact set of positive $\mu$-measure and denoting by $L$ its linear span, we get by the zero-one law that $\mu^1(L) = 1$, since otherwise $\mu^t(L) = 0$ for every $t > 0$. □
There is no similar characterization in the finite dimensional case, since such a mixture has a positive density. However, Problem 7.6.16 and the next result enable one to describe differentiable $H$-spherically symmetric measures both in finite and infinite dimensional spaces as the measures possessing logarithmic gradients, associated with $H$, of the form $\beta_H^\mu(x) = c(x) x$, where $c$ is a real function.
7.4.4. Proposition. Let $\mu$ be a probability Radon measure on $X$ differentiable along all vectors from $j_H(X^*)$. Suppose that
$\beta_H^\mu(x) = c(x) x,$ (7.4.2)
where $c$ is a measurable real function on $X$. Then $\mu$ is $H$-spherically symmetric. If $H$ is infinite dimensional and $\gamma$ is a centered Radon Gaussian measure on $X$ with the Cameron-Martin space $H$, then there exists a probability measure $\sigma$ on $(0, \infty)$ such that
$\mu(B) = \int_0^\infty \gamma(tB)\, \sigma(dt).$ (7.4.3)
Conversely, if an $H$-spherically symmetric measure $\mu$ has the logarithmic gradient $\beta_H^\mu$, then (7.4.2) holds true.
PROOF. First of all, note that it suffices to prove the first claim for all finite dimensional projections of $\mu$, which have the form $P_n x = (l_1(x), \ldots, l_n(x))$, where the functionals $l_i \in X^*$ are such that the vectors $e_i = j_H(l_i)$ are orthogonal in $H$ (moreover, it suffices to consider only two dimensional projections). Let $B_n$ be the conditional expectation of the mapping $P_n \beta_H^\mu$ with respect to the $\sigma$-field generated by $l_1, \ldots, l_n$. Then it is readily seen that $B_n(x) = \beta_n(P_n x)$, where $\beta_n$ is the logarithmic gradient of the measure $\mu \circ P_n^{-1}$ on $\mathbb{R}^n$ (generated by the space $\mathbb{R}^n$). Clearly, this conditional expectation has the form $x \mapsto c_n(x) P_n x$,
where $c_n$ is the corresponding expectation of the function $c$ (it suffices to note that $P_n \beta_H^\mu(x) = c(x) P_n x$). In particular, $c_n(x) = d_n(P_n x)$, where $d_n$ is a function on $\mathbb{R}^n$. It remains to make use of Problem 7.6.16.
Suppose now that an $H$-spherically symmetric measure $\mu$ has the logarithmic gradient $\beta_H^\mu$. In the case of $\mathbb{R}^n$ the validity of our claim is easily seen from the fact that a spherically symmetric absolutely continuous measure has a density which depends only on the norm of the argument. Suppose that $H$ is infinite dimensional. By Corollary 7.4.3, there exists a centered Radon Gaussian measure $\gamma$, for which (7.4.3) holds true. It follows that the measure $\mu$ is concentrated on a Souslin linear subspace $E$ and that there exists a sequence $\{l_i\} \subset X^*$ separating the points in $E$. We may assume that the vectors $j_H(l_i)$ form an orthonormal basis in $H$. The projections $\mu \circ P_n^{-1}$ considered above also have the logarithmic gradients $\beta_n$ (associated with the spaces $P_n(H)$). Since these projections are spherically symmetric, one gets $\beta_n(y) = d_n(y) y$. Let us fix $i \in \mathbb{N}$. It is readily verified that the conditional expectations $g_n := \mathbb{E}^n \langle l_i, \beta_H^\mu \rangle$ with respect to the $\sigma$-field generated by $l_1, \ldots, l_n$ coincide with $(d_n \circ P_n) l_i$ whenever $n \ge i$. On the other hand, the sequence $\{g_n\}$ converges in measure to $\langle l_i, \beta_H^\mu \rangle$. Since the set $\mathrm{Ker}\, l_i$ has $\mu$-measure zero, the sequence $\{d_n \circ P_n\}$ converges in measure to some function $c$, which, thereby, does not depend on $i$. Thus, $\langle l_i, \beta_H^\mu(x) \rangle = c(x) l_i(x)$ a.e., whence one gets the desired relationship (7.4.2). □
7.5. Infinite dimensional diffusions

Let $X$ be a locally convex space, let $H$ be a separable Hilbert space continuously and densely embedded into $X$, and let $B\colon X \to X$ be a Borel mapping. Let us consider the following stochastic differential equation:

$$d\xi_t = dW_t + \tfrac{1}{2}B(\xi_t)\,dt, \quad \xi_0 = x. \tag{7.5.1}$$

By a solution we mean a random process $\xi = (\xi_t)_{t \ge 0}$ (called a diffusion process) in $X$ such that there exist a filtration $\mathcal{F} = \{\mathcal{F}_t\}_{t \ge 0}$, with respect to which the process $\xi$ is adapted, and an $\mathcal{F}$-Wiener process $(W_t)_{t \ge 0}$ in $X$, associated with $H$, such that, for all $t \ge 0$, one has a.s.

$$\xi_t = x + W_t + \frac{1}{2}\int_0^t B(\xi_s)\,ds.$$
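As an illustration of the integral equation above, here is a minimal sketch in the one dimensional special case $X = H = \mathbb{R}$ with the drift $B(x) = -x$. The Euler-Maruyama discretization and the function names `em_variance` and `em_sample` are assumptions made for this example only; they are not taken from the text. The sketch compares the variance of the discretized process at time $t$ with the value $1 - e^{-t}$ of the exact solution.

```python
import math
import random


def em_variance(t, n_steps):
    """Exact variance of the Euler-Maruyama iterate for
    d(xi) = dW - 0.5 * xi dt started from a deterministic point."""
    h = t / n_steps
    var = 0.0
    for _ in range(n_steps):
        # xi_{k+1} = (1 - h/2) xi_k + sqrt(h) Z with Z standard normal
        var = (1.0 - 0.5 * h) ** 2 * var + h
    return var


def em_sample(x0, t, n_steps, rng):
    """One Euler-Maruyama sample of xi_t with xi_0 = x0 and B(x) = -x."""
    h = t / n_steps
    xi = x0
    for _ in range(n_steps):
        xi += -0.5 * xi * h + math.sqrt(h) * rng.gauss(0.0, 1.0)
    return xi


# Variance of the scheme against the closed-form value 1 - exp(-t).
print(em_variance(1.0, 1000), 1.0 - math.exp(-1.0))
```

The mean of the exact solution started at $x$ is $e^{-t/2}x$, so averaging many calls of `em_sample` gives a second, Monte Carlo, consistency check.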
In the finite dimensional case this corresponds to the concept of a weak solution. It is also possible to define a strong solution, namely, to require that the foregoing conditions be fulfilled for any a priori given Wiener process $(W_t)_{t \ge 0}$ with the filtration $\mathcal{F}_t = \sigma(W_s\colon s \le t)$. There exist some other interpretations of a solution of equation (7.5.1). It should be noted that in infinite dimensional spaces (say, in infinite dimensional Banach spaces) equation (7.5.1) may fail to have solutions even for a bounded continuous mapping $B$ (see examples in [79]). If $X$ is a Banach space, then the Lipschitz continuity of $B$ is sufficient for the existence of a strong solution of (7.5.1). Note the following important special case of equation (7.5.1): $B(x) = -x$. The solution of this equation exists and is called an Ornstein-Uhlenbeck process. Let us recall several analytic objects related to the concept of a Markov process in a topological space $X$ (the concept itself will not be used below; basic information concerning Markov processes can be found in [822]). Let $\mu$ be a Radon
coincide. Let us show that $T_{t/2} f(x) = \mathbb{E} f(\xi_t^x)$ for every $f \in B_b(X)$ and all $x \in X$, $t \ge 0$. By the first equality in (7.5.5), it suffices to prove that the law $\nu$ of the random variable

$$W_t - \frac{1}{2}\int_0^t e^{(s-t)/2}\,W_s\,ds$$

equals the image of $\gamma$ under the mapping $y \mapsto \sqrt{1 - e^{-t}}\,y$. Clearly, $\nu$ is a centered Gaussian measure. Let $l \in X^*$ be such that $|j_H(l)|_H = 1$. We have to evaluate the variance of the random variable

$$l(W_t) - \frac{1}{2}\int_0^t e^{(s-t)/2}\,l(W_s)\,ds.$$

Since $l(W_t)$ is the standard Wiener process, we can deal with the one dimensional case. Then by formula (2.11.8), the variance of this random variable equals that of $\int_0^t e^{(s-t)/2}\,dW_s$, which is $\int_0^t e^{s-t}\,ds = 1 - e^{-t}$. This is exactly the variance of the image of $\gamma$ under the mapping $y \mapsto \sqrt{1 - e^{-t}}\,l(y)$, whence our claim.
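The variance computation above can be checked numerically without appealing to stochastic integration. The following sketch is an illustration under stated assumptions, not part of the original argument: it expands the variance of $W_t - \frac{1}{2}\int_0^t e^{(s-t)/2}W_s\,ds$ using only the covariance $\mathbb{E}W_sW_u = \min(s,u)$ of the one dimensional Wiener process and evaluates the resulting integrals by the midpoint rule. The function name `variance_by_quadrature` and the grid size are hypothetical choices.

```python
import math


def variance_by_quadrature(t, n=200):
    """Midpoint-rule value of
    Var(W_t - 0.5 * int_0^t exp((s - t)/2) W_s ds)
    computed from Cov(W_s, W_u) = min(s, u)."""
    h = t / n
    s = [(k + 0.5) * h for k in range(n)]
    w = [math.exp((sk - t) / 2.0) for sk in s]
    # Cross term: 2 * 0.5 * int_0^t exp((s - t)/2) Cov(W_t, W_s) ds, Cov = s.
    single = sum(wk * sk for wk, sk in zip(w, s)) * h
    # Quadratic term: double integral of the weights against min(s, u).
    double = 0.0
    for i in range(n):
        for j in range(n):
            double += w[i] * w[j] * min(s[i], s[j])
    double *= h * h
    return t - single + 0.25 * double


print(variance_by_quadrature(1.0), 1.0 - math.exp(-1.0))
```

Both printed numbers should agree to several decimal places, in line with the value $1 - e^{-t}$ obtained in the proof.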
Note that $\gamma$ is a unique probability measure on $\mathcal{E}(X)$ invariant for the Ornstein-Uhlenbeck semigroup $(T_t)_{t \ge 0}$. Indeed, for any bounded continuous cylindrical function $f$ and every $x$, one has $T_t f(x) \to \int f\,d\gamma$ as $t \to \infty$. If $\mu$ is an invariant probability measure, then $\int f\,d\mu = \int T_t f\,d\mu$ converges to $\int f\,d\gamma$, whence $\int f\,d\mu = \int f\,d\gamma$.
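The convergence $T_t f(x) \to \int f\,d\gamma$ used in this uniqueness argument can be observed numerically in dimension one. The sketch below assumes the standard Mehler form $T_t f(x) = \int f(e^{-t}x + \sqrt{1 - e^{-2t}}\,y)\,\gamma(dy)$ of the Ornstein-Uhlenbeck semigroup, which is consistent with the scaling $\sqrt{1 - e^{-t}}$ for $T_{t/2}$ in the preceding proof; formula (7.5.5) itself is not reproduced in this excerpt. The function name `mehler` and the quadrature parameters are hypothetical.

```python
import math


def mehler(f, x, t, n=4000, cutoff=8.0):
    """Midpoint-rule approximation of the (assumed) Mehler formula
    T_t f(x) = E f(exp(-t) x + sqrt(1 - exp(-2t)) Z), Z standard normal."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        y = -cutoff + (k + 0.5) * h
        total += f(a * x + b * y) * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)


# For f = cos one has E cos(a + b Z) = cos(a) exp(-b^2 / 2), so the limit
# as t grows is the Gaussian integral of cos, namely exp(-1/2).
for t in (0.1, 1.0, 5.0, 20.0):
    print(t, mehler(math.cos, 3.0, t))
print(math.exp(-0.5))
```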
Let $X$ be a locally convex space and let $H \subset X$ be a continuously and densely embedded separable Hilbert space. To every mapping $B\colon X \to X$, one can associate the elliptic operator $L$ defined on $\mathcal{FC}^\infty$ by the equality

$$Lf = \Delta_H f + {}_{X^*}\langle f', B\rangle_X, \tag{7.5.7}$$

where

$$\Delta_H f(x) := \sum_{n=1}^{\infty} \partial_{e_n}^2 f(x), \tag{7.5.8}$$

and $\{e_n\}$ is an orthonormal basis in $H$. Note that the sum in (7.5.8) does not depend on a concrete choice of an orthonormal basis in $H$. For any function $f$ of the form $f = \psi(l_1, \dots, l_n)$, where $\psi \in C_b^\infty(\mathbb{R}^n)$, $l_i \in X^*$, one has

$${}_{X^*}\langle f', B\rangle_X = \sum_{i=1}^{n} \partial_{x_i}\psi(l_1, \dots, l_n)\,{}_{X^*}\langle l_i, B\rangle_X.$$

If $X$ is a Hilbert space and $H = T(X)$, where $T$ is a nonnegative injective Hilbert-Schmidt operator with eigenvectors $h_n$, forming an orthonormal basis in $X$, and eigenvalues $t_n$, then the vectors $e_n = t_n h_n$ form an orthonormal basis in $H$ and

$$Lf = \sum_{n=1}^{\infty} t_n^2\,\partial_{h_n}^2 f + \sum_{n=1}^{\infty} B_n\,\partial_{h_n} f,$$
where $B_n = (B, h_n)_X$. The proof of the following theorem is given in [91].
7.5.2. Theorem. Let $\mu$ be a Radon probability measure on a locally convex space $X$ and let $H \subset X$ be a continuously and densely embedded separable Hilbert space.

(i) Let

$${}_{X^*}\langle l, B\rangle_X \in L^2(\mu), \quad \forall\, l \in X^*. \tag{7.5.9}$$

Suppose that the operator $L$ given by (7.5.7) is symmetric on $\mathcal{FC}^\infty \subset L^2(\mu)$. Then the logarithmic gradient $\beta_H^\mu$ exists and coincides with $B$ $\mu$-a.e.

(ii) If the logarithmic gradient $\beta_H^\mu$ exists and the mapping $B = \beta_H^\mu$ satisfies condition (7.5.9), then the corresponding operator $L$ is symmetric. In addition, for any $f \in \mathcal{FC}^\infty$, one has $(Lf, f)_{L^2(\mu)} \le 0$.
7.5.3. Remark. According to the Friedrichs theorem, statement (ii) implies that the operator $L$ has a nonpositive self-adjoint extension, i.e., it extends to the generator of a symmetric Markov semigroup on $L^2(\mu)$. If there is a Radon Gaussian measure on $X$ with the Cameron-Martin space $H$, then, according to [15], there exists a diffusion process with invariant measure $\mu$, for which the generator of the transition semigroup coincides with $\frac{1}{2}L$ on $\mathcal{FC}^\infty$. Therefore, the previous theorem shows that the logarithmic gradients of measures are, up to factor 2, the drifts of the symmetrizable diffusions.
We already know from the previous chapters that stochastic differential equations are closely related to nonlinear transformations of Gaussian measures. Hence it is natural to ask about the conditions of the absolute continuity of the distributions of diffusion processes and their transition probabilities and invariant measures with respect to Gaussian measures. Under very broad assumptions, the transition probabilities and invariant measures of the diffusion processes on $\mathbb{R}^n$ given by equation (7.5.1) are absolutely continuous with respect to the standard Gaussian measure. The situation is completely different in the infinite dimensional case. First of all, typically, the transition probabilities $P(t, x, \cdot\,)$ are mutually singular for different $t$. For instance, this happens in the case of the Wiener process $W_t$, where the transition probability $P(t, 0, \cdot\,)$ is the image of the Gaussian measure $\gamma$ equal to the distribution of $W_1$ under the mapping $x \mapsto \sqrt{t}\,x$. Secondly, the transition probabilities and invariant measures may be mutually singular with respect to all Gaussian measures (see [91]). We discuss here the interesting and important special case, where $B(x) = -x + v(x)$ with a vector field $v\colon X \to H$ (which corresponds to "small perturbations" of the Ornstein-Uhlenbeck process). Let $H$ be, as above, the Hilbert space associated with $W_t$, let $\gamma$ be the distribution of $W_1$, and let

$$B(x) = -x + v(x), \quad v\colon X \to H.$$

We shall assume that condition (7.5.9) is satisfied. The study of invariant measures of the diffusion generated by (7.5.1) is closely related to the elliptic equation

$$L^*\mu = 0,$$
which is understood in the following weak sense:

$$\int_X Lf(x)\,\mu(dx) = 0, \quad \forall\, f \in \mathcal{FC}^\infty. \tag{7.5.10}$$

Under very broad assumptions, any invariant measure of the diffusion given by equation (7.5.1) satisfies equation (7.5.10) (e.g., this is true if $\sup |v|_H < \infty$). The following result is due to [694].

7.5.4. Theorem. Suppose that $\sup_x |v(x)|_H < \infty$. Then there exists a process $(\xi_t^x)_{t \ge 0,\, x \in X}$ satisfying equation (7.5.1), and, in addition, this process has an invariant probability measure $\mu$ equivalent to the measure $\gamma$.
We turn now to the results concerning the regularity of solutions of elliptic equation (7.5.10), which give some information about the invariant measures of diffusion processes (7.5.1). In the case where $X = \mathbb{R}^n$ and $B$ is a smooth mapping, the classical Weyl theorem states that every solution of equation (7.5.10) is an absolutely continuous measure with a smooth density with respect to Lebesgue measure. The following extension of Weyl's result was obtained in [91]. Note that unlike the classical situation, the corresponding differential operator is not defined on all distributions, since the coefficient $B$ is only integrable with respect to a solution $\mu$. In particular, this result applies to singular drifts $B$ which need not be locally integrable with respect to Lebesgue measure. For example, the measure with density $x^2\exp(-x^2/2)$ satisfies the corresponding equation with $B(x) = 2x^{-1} - x$.
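The one dimensional example with the singular drift can be checked directly against the weak formulation (7.5.10). The following sketch is only an illustration: the helper `weak_form`, the test function $f(x) = \sin 2x + \cos x$ and the quadrature parameters are hypothetical choices, and the density is left unnormalized since a constant factor does not affect the equation. It evaluates $\int (f'' + Bf')\,p\,dx$ by the midpoint rule for $p(x) = x^2 e^{-x^2/2}$.

```python
import math


def weak_form(drift, df, d2f, n=20000, cutoff=10.0):
    """Midpoint approximation of int (f'' + B f') p dx on [-cutoff, cutoff]
    with p(x) = x^2 exp(-x^2 / 2); df and d2f are f' and f''."""
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        # With n even, the midpoints never coincide with the singularity x = 0.
        x = -cutoff + (k + 0.5) * h
        p = x * x * math.exp(-0.5 * x * x)
        total += (d2f(x) + drift(x) * df(x)) * p
    return total * h


def df(x):
    return 2.0 * math.cos(2.0 * x) - math.sin(x)


def d2f(x):
    return -4.0 * math.sin(2.0 * x) - math.cos(x)


# Logarithmic derivative p'/p = 2/x - x: the weak equation holds.
print(weak_form(lambda x: 2.0 / x - x, df, d2f))
# A different drift, B(x) = -x, does not annihilate the same test function.
print(weak_form(lambda x: -x, df, d2f))
```

The first value should vanish up to rounding, since $\int f''p\,dx = -\int f'p'\,dx$ and $Bp = p'$; the second one should be clearly nonzero.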
7.5.5. Proposition. Let $\mu$ be a probability measure on $\mathbb{R}^n$ such that (7.5.10) is fulfilled, where $L\varphi = \Delta\varphi + (\nabla\varphi, B)$ and $B$ is a Borel vector field on $\mathbb{R}^n$ with $|B| \in L^2(\mu)$. Then $\mu$ has a density $p \in W^{1,1}(\mathbb{R}^n)$. In particular, $\mu$ is differentiable along all vectors from $\mathbb{R}^n$. In addition, the following estimate holds true:

$$\int \frac{|\nabla p|^2}{p}\,dx \le \int |B(x)|^2\,\mu(dx). \tag{7.5.11}$$

Moreover, $\nabla p/p$ is the orthogonal projection of $B$ onto the closure of the set of the gradients of the functions from $C_0^\infty(\mathbb{R}^n)$ in the space $L^2(\mu, \mathbb{R}^n)$.
7.5.6. Theorem. Let $X$, $H$, and $\gamma$ be the same as above. Suppose that a probability measure $\mu$ on $X$ satisfies equation (7.5.10), where

$$B(x) = -x + v(x), \quad v\colon X \to H, \quad |v|_H \in L^2(\mu).$$

Then:

(i) the measure $\mu$ is absolutely continuous with respect to the measure $\gamma$ and its Radon-Nikodym derivative $\varrho$ has the form $\varrho = \varphi^2$, where the function $\varphi$ is in the Sobolev class $W^{2,1}(\gamma)$;

(ii) the measure $\mu$ is differentiable along all vectors $h \in H$ and $\beta_H^\mu(x) = -x + u(x)$, where $u\colon X \to H$ is the orthogonal projection of $v$ onto the closure of the set $\{D_H f \mid f \in \mathcal{FC}^\infty\}$ in the Hilbert space $L^2(\mu, H)$.

PROOF. (i) We may assume that $u = v$, since $\mu$ satisfies equation (7.5.10) with $B_1(x) = -x + u(x)$. This follows from the equality

$$\int_X {}_{X^*}\langle f', u - v\rangle_X\,d\mu = \int_X (D_H f, u - v)_H\,d\mu = 0, \quad \forall\, f \in \mathcal{FC}^\infty.$$
Let $\{e_n\}$ be an orthonormal basis in $H$ such that $e_n = j_H(l_n)$, $l_n \in X^*$. We shall temporarily consider both measures $\gamma$ and $\mu$ on the $\sigma$-field generated by the functionals $\{l_n\}$ (replacing $v$ by the corresponding conditional expectation). For any $F \in L^2(\mu, H)$, the sequence $F_n = \mathbb{E}_\mu[F \mid \sigma_n]$ of the conditional expectations of $F$ with respect to the $\sigma$-fields $\sigma_n$ generated by $l_1, \dots, l_n$ converges to $F$ in $L^2(\mu, H)$ (see [800, Ch. II, Theorem 4.1]). Let $H_n$ be the linear span of $e_1, \dots, e_n$ and $P_n\colon X \to H_n$, $P_n x = l_1(x)e_1 + \dots + l_n(x)e_n$. The space $H_n$ is equipped with the inner product from $H$. Note that $l_i(e_j) = (e_i, e_j)_H$ for all $i, j$. Hence $P_n|_H$ is the orthogonal projection in $H$ onto $H_n$ and $|P_n h - h|_H \to 0$, as $n \to \infty$, for all $h \in H$. Therefore, for the mappings $F_n$ defined above, one has $P_n F_n \to F$ in $L^2(\mu, H)$. Indeed,

$$\int |F(x) - P_n F_n(x)|_H^2\,\mu(dx) \le 2\int |F(x) - P_n F(x)|_H^2\,\mu(dx) + 2\int |P_n F(x) - P_n F_n(x)|_H^2\,\mu(dx)$$
$$\le 2\int |F(x) - P_n F(x)|_H^2\,\mu(dx) + 2\int |F(x) - F_n(x)|_H^2\,\mu(dx) \to 0,$$

since the first term on the right-hand side tends to 0 by the Lebesgue theorem. Put
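$$v_n := \mathbb{E}_\mu[P_n v \mid \sigma_n] = P_n\mathbb{E}_\mu[v \mid \sigma_n], \quad b_n := \mathbb{E}_\mu[P_n B \mid \sigma_n].$$

Therefore, $v_n \to v$ in $L^2(\mu, H)$ as $n \to \infty$.

Note that $b_n(x) = -P_n x + v_n(x)$. Let $\mu_n := \mu \circ P_n^{-1}$. There exist Borel mappings $\hat{b}_n\colon H_n \to H_n$ such that $b_n = \hat{b}_n \circ P_n$ $\mu$-a.e. It is easily verified (this verification is found in [91, Proposition 3.3]) that the measure $\mu_n$ on $H_n$ satisfies the equation $L_n^*\mu_n = 0$, where

$$L_n u = \sum_{i=1}^{n} \partial_{e_i}^2 u + (\nabla_{H_n} u, \hat{b}_n)_{H_n}, \quad \forall\, u \in C_b^\infty(H_n).$$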
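According to Proposition 7.5.5, the measure $\mu_n$ has a density $f_n$ with respect to the standard Gaussian measure $\gamma_n$ on $H_n$. Let $q_n$ be the standard Gaussian density on $H_n$ (recall that $H_n$ is equipped with the inner product from $H$). By virtue of Proposition 7.5.5 one has $p_n := f_n q_n \in W^{1,1}(H_n)$ and $\beta_{H_n}^{\mu_n} = \nabla_{H_n} p_n / p_n$. In addition, $|\beta_{H_n}^{\mu_n}|_{H_n} \in L^2(\mu_n)$. Therefore,

$$\beta_{H_n}^{\mu_n}(z) = -z + \frac{\nabla_{H_n} f_n(z)}{f_n(z)} \quad \text{for } \mu_n\text{-a.e. } z \in H_n. \tag{7.5.12}$$

On the other hand, according to Proposition 7.5.5, one has

$$\beta_{H_n}^{\mu_n}(z) = \hat{b}_n(z) + d_n(z) \quad \text{for } \mu_n\text{-a.e. } z \in H_n, \tag{7.5.13}$$

where the mappings $d_n\colon H_n \to H_n$ are such that

$$\int_{H_n} (\nabla_{H_n} f(z), d_n(z))_{H_n}\,\mu_n(dz) = 0, \quad \forall\, f \in C_b^\infty(H_n). \tag{7.5.14}$$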
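By virtue of relationships (7.5.12) and (7.5.13), we get

$$v_n(x) = b_n(x) + P_n x = \frac{\nabla_{H_n} f_n(P_n x)}{f_n(P_n x)} - d_n(P_n x) \quad \mu\text{-a.e.}$$

By (7.5.14) one has

$$\int_X \Bigl(\frac{\nabla_{H_n} f_n(P_n x)}{f_n(P_n x)},\, d_n(P_n x)\Bigr)_H\,\mu(dx) = 0. \tag{7.5.15}$$

Indeed, (7.5.15) is derived from (7.5.12) as follows. By [91, Theorem 2.8], there exist functions $q_i \in C_b^\infty(H_n)$, $i \in \mathbb{N}$, such that

$$\nabla_{H_n} q_i \to \beta_{H_n}^{\mu_n} \quad \text{in } L^2(\mu_n, H_n).$$

The mapping $S\colon z \mapsto -z$ on $H_n$ coincides with $\nabla_{H_n} Q$, where $Q(z) = -\frac{1}{2}(z, z)_{H_n}$. It is easily verified that the mapping $S$ is also in the closure of $\{\nabla_{H_n} f \mid f \in C_b^\infty(H_n)\}$ in $L^2(\mu_n, H_n)$, making use of the fact that $S \in L^2(\mu_n, H_n)$ (the latter follows from the square integrability of $b_n$ and $v_n$ with respect to $\mu$). Since $v_n \to v$ in $L^2(\mu, H)$, then, by virtue of (7.5.15), there exist two mappings $d$ and $G$ from $L^2(\mu, H)$ such that $d_n \circ P_n \to d$ and $(\nabla_{H_n} f_n) \circ P_n / f_n \circ P_n \to G$ in $L^2(\mu, H)$. It is easily seen from (7.5.14) that the mapping $d$ is orthogonal to $\{D_H f\colon f \in \mathcal{FC}^\infty\}$, hence also to $G$. Since we assume that $v = u$, we get that $d$ is orthogonal to $v$, whence $d = 0$. Thus,

$$\frac{(\nabla_{H_n} f_n) \circ P_n}{f_n \circ P_n} \to v \quad \text{in } L^2(\mu, H), \quad n \to \infty. \tag{7.5.16}$$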
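We shall now use the logarithmic Sobolev inequality. Since $\sqrt{f_n} \in W^{2,1}(\gamma_n)$ by virtue of Proposition 7.5.5, we may put

$$\varphi_n := \sqrt{f_n \circ P_n} \in W^{2,1}(\gamma)$$

and apply the logarithmic Sobolev inequality to $\varphi_n$. Moreover,

$$\int_X f_n(P_n x)\,\gamma(dx) = \int_{H_n} f_n(x)\,\gamma_n(dx) = 1,$$

$$4\int_X |D_H\varphi_n(x)|_H^2\,\gamma(dx) = 4\int_{H_n} \bigl|\nabla_{H_n}\sqrt{f_n}(x)\bigr|_{H_n}^2\,\gamma_n(dx) = \int_{H_n} \Bigl|\frac{\nabla_{H_n} f_n(x)}{f_n(x)}\Bigr|_{H_n}^2 f_n(x)\,\gamma_n(dx)$$
$$= \int_{H_n} \Bigl|\frac{\nabla_{H_n} f_n(x)}{f_n(x)}\Bigr|_{H_n}^2\,\mu_n(dx) = \int_X \Bigl|\frac{\nabla_{H_n} f_n(P_n x)}{f_n(P_n x)}\Bigr|_H^2\,\mu(dx),$$

where the use of the chain rule is easily justified by replacing $f_n$ by $f_n + \varepsilon$ and letting $\varepsilon$ tend to 0. By virtue of (7.5.16), the norms of the mappings $(\nabla_{H_n} f_n) \circ P_n / f_n \circ P_n$ in $L^2(\mu, H)$ are uniformly bounded by some constant $C$. Therefore,

$$\sup_n \int_X \varphi_n(x)^2 \log|\varphi_n(x)|\,\gamma(dx) \le C^2/4.$$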
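This estimate implies the uniform integrability of the functions $f_n \circ P_n = \varphi_n^2$ on $(X, \gamma)$. Since $(f_n \circ P_n)_{n \in \mathbb{N}}$ is a martingale with respect to $\{\sigma_n\}$ and the measure $\gamma$, we conclude that this martingale converges to some function $\varrho \in L^1(\gamma)$.

Put $\lambda := \varrho \cdot \gamma$. Then, for $\lambda_n := \varphi_n^2 \cdot \gamma$ and all $l = c_1 l_1 + \dots + c_m l_m$, we get

$$\int_X \exp(il)\,d\lambda_n \to \int_X \exp(il)\,d\lambda.$$

On the other hand, as $n \ge m$, one has

$$\int_X \exp(il)\,d\lambda_n = \int_X \exp(il)\,f_n \circ P_n\,d\gamma = \int_{H_n} \exp(il)\,f_n\,d\gamma_n = \int_{H_n} \exp(il)\,d\mu_n = \int_X \exp(il)\,d\mu.$$

Therefore, $\mu = \lambda = \varrho \cdot \gamma$ on the $\sigma$-field $\mathcal{E}(\{l_i\})$ introduced above. It follows that $\mu = \varrho \cdot \gamma$ on $\mathcal{E}(X)$, since the sequence $\{l_i\}$ was arbitrary and, for the measure $\gamma$ (hence also for $\varrho \cdot \gamma$), the sets from $\mathcal{E}(X)$ coincide with sets from $\mathcal{E}(\{l_i\})$ up to sets of measure zero. Both measures are Radon, which yields that $\mu = \varrho \cdot \gamma$ on $\mathcal{B}(X)$. Since $\varphi_n \to \sqrt{\varrho} =: \varphi$ in measure $\gamma$ and one has the estimate

$$\|D_H\varphi_n\|_{L^2(\gamma, H)}^2 \le C^2/4,$$

it follows that $\varphi \in W^{2,1}(\gamma)$, whence claim (i).

(ii) Let $h \in H$. By Lemma 5.1.12 we get $\varphi\hat{h} \in L^2(\gamma)$. This yields the equality

$$\beta_h^\mu = -\hat{h} + 2(h, \varphi^{-1}D_H\varphi)_H.$$

Taking vectors of the form $h = j_H(k)$, $k \in X^*$, we arrive at the equality

$$\beta_H^\mu(x) = -x + 2\varphi^{-1}D_H\varphi(x) \quad \mu\text{-a.e.}$$

The equality $2\varphi^{-1}D_H\varphi = u$ follows from relationship (7.5.16) taking into account that the sequence $\varphi_n$ converges to $\varphi$ in measure $\gamma$ and is bounded in $W^{2,1}(\gamma)$ (which implies that the arithmetic means of its subsequence converge to $\varphi$ in $W^{2,1}(\gamma)$). $\Box$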
7.6. Complements and problems

Gaussian comparisons

In the study of the trajectories of Gaussian processes several different types of comparison of their covariances have proved to be efficient. We encountered one of these types in Chapter 3, where the covariances were compared by means of the usual ordering $W \le V$ for quadratic forms. Another type of comparison, suggested by Slepian [713], makes use of the process itself and is transparent in the well-known Slepian inequality [713].

7.6.1. Theorem. Let $\xi$ and $\eta$ be two centered separable Gaussian processes on a set $T$ with covariance functions $K_\xi$ and $K_\eta$. Suppose that $K_\xi(t, t) = K_\eta(t, t)$ and $K_\xi(s, t) \le K_\eta(s, t)$ for all $s, t \in T$. Then one has

$$P\Bigl(\sup_T \xi_t > M\Bigr) \ge P\Bigl(\sup_T \eta_t > M\Bigr), \quad \forall\, M \in \mathbb{R}^1.$$
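The simplest instance of the Slepian inequality is a two-point index set $T$. The sketch below is an illustration under stated assumptions: for a centered Gaussian pair with unit variances and correlation $\rho$ it uses the classical orthant formula $P(X_1 \le 0, X_2 \le 0) = \frac{1}{4} + \frac{\arcsin\rho}{2\pi}$ for the level $M = 0$, and a seeded Monte Carlo estimate for other levels. The function names are hypothetical.

```python
import math
import random


def prob_max_positive(rho):
    """P(max(X1, X2) > 0) for a standard Gaussian pair with correlation rho,
    obtained from the orthant probability 1/4 + arcsin(rho) / (2 pi)."""
    return 0.75 - math.asin(rho) / (2.0 * math.pi)


def prob_max_exceeds_mc(rho, level, n_samples, rng):
    """Monte Carlo estimate of P(max(X1, X2) > level)."""
    c = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1, x2 = z1, rho * z1 + c * z2  # pair with correlation rho
        if max(x1, x2) > level:
            hits += 1
    return hits / n_samples


# A smaller covariance gives a larger probability for the maximum.
print(prob_max_positive(-0.5), prob_max_positive(0.5))
```

In the notation of the theorem, the pair with the smaller correlation plays the role of $\xi$ and the pair with the larger correlation plays the role of $\eta$.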
The Slepian inequality was generalized by Gordon [306]-[308]. Slepian and Gordon inequalities are special cases of the following comparison theorem proved in [391]; related results were obtained in [610], [611]; see also the proof in [471, Theorem 3.11].

7.6.2. Theorem. Let $\xi = (\xi_1, \dots, \xi_n)$ and $\eta = (\eta_1, \dots, \eta_n)$ be centered Gaussian vectors in $\mathbb{R}^n$ and let

$$A = \{(i, j)\colon \mathbb{E}\xi_i\xi_j < \mathbb{E}\eta_i\eta_j\}, \quad B = \{(i, j)\colon \mathbb{E}\xi_i\xi_j > \mathbb{E}\eta_i\eta_j\}.$$

Suppose that a function $f$ from a local Sobolev class on $\mathbb{R}^n$ is such that $\partial_{x_i}\partial_{x_j} f \ge 0$ if $(i, j) \in A$ and $\partial_{x_i}\partial_{x_j} f \le 0$ if $(i, j) \in B$ (more generally, these inequalities can be interpreted in the sense of generalized functions; then we need not require that $f$ be locally Sobolev). Then $\mathbb{E} f(\xi) \le \mathbb{E} f(\eta)$.

One more natural way of comparing the covariances of centered Gaussian processes $\xi$ and $\eta$ was used by Sudakov, Fernique, Marcus, and Shepp (see [732], [243], [526]), who considered the following condition: $\mathbb{E}|\xi_s - \xi_t|^2 \le \mathbb{E}|\eta_s - \eta_t|^2$ for all $s, t \in T$. According to [526], if the process $\eta$ on $[0, 1]$ has continuous trajectories, then the process $\xi$ does also. An analogous statement is true for the processes on separable metric spaces. Proofs can be found in [119, §9]. Interesting connections between Gaussian measures, the path properties of Gaussian processes, and the geometry of Banach spaces, in particular, applications of the various inequalities discussed above, are found in [306], [307], [308], [547], and [605].
Logarithmic gradients and linear stochastic equations

Let us mention several additional results concerning linear logarithmic gradients and linear stochastic differential equations (see [82]). As was mentioned above, a probability measure $\mu$ with linear logarithmic gradient $A$ is invariant for some diffusion process $\xi$ with drift $A/2$ (and this process is Gaussian). Since not every process generated by a linear stochastic differential equation has an invariant measure, the question arises concerning a characterization of the linear mappings which are logarithmic gradients of measures. The next result is proved in [82].

7.6.3. Proposition. Let $\gamma$ be a centered nondegenerate Radon Gaussian measure on a locally convex space $X$ and let $A\colon X \to X$ be a $\gamma$-measurable linear mapping. Suppose that $X$ is sequentially complete. Then the following conditions are equivalent:

(i) There exists a separable Hilbert space $H$ densely embedded into $X$ such that $j_H(X^*) \subset H(\gamma)$ and $A = \beta_H^\gamma$;

(ii) The function $(f, g) \mapsto -\langle f, A^*g\rangle$ on $X^*$ is an inner product, where $A^*\colon X^* \to H(\gamma)$, $(A^*k, h)_{H(\gamma)} = \langle k, Ah\rangle$, i.e., $(f, g) \mapsto -\langle g, AR_\gamma(f)\rangle$ is an inner product on $X^*$.
The next result can be proved along the same lines as Proposition 7.3.9.
7.6.4. Proposition. Let $\gamma$ be a Gaussian measure on a locally convex space $X$ and let $\mu$ be a probability measure on $\mathcal{E}(X)$ differentiable along some linear subspace $D \subset H(\gamma)$ such that, for all $h \in D$, the functions $\beta_h^\gamma$ and $\beta_h^\mu$ admit equal modifications which are continuous linear functionals. Assume, in addition, that such functionals separate the points in $X$. Then $\gamma = \mu$.

This result can be used to get an infinite dimensional analogue of Proposition 1.10.2 characterizing Gaussian measures.
7.6.5. Proposition. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space $X$ with the Cameron-Martin space $H$ and let $\{\xi_n\} \subset X^*$ be an orthonormal basis in $X^*_\gamma$. Suppose that $\mu$ is a Radon probability measure on $X$ such that $X^* \subset L^2(\mu)$, $(\xi_i, \xi_j)_{L^2(\mu)} = \delta_{ij}$, and

$$U(\mu) := \sup\Bigl\{\frac{\int f^2\,d\mu}{\int |D_H f|_H^2\,d\mu}\colon f \in \mathcal{F}\Bigr\} = 1,$$

where $\mathcal{F}$ is the collection of all functions $f \in L^2(\mu)$ of the form $f = \varphi(l_1, \dots, l_n)$, $\varphi \in C^\infty(\mathbb{R}^n)$, $l_i \in X^*$. Then $\mu = \gamma$.

PROOF. Put $e_n = R_\gamma(\xi_n)$. The same reasoning as in Proposition 1.10.2 shows that

$$\int_X \xi_n g\,d\mu = \int_X \partial_{e_n} g\,d\mu, \quad \forall\, g \in \mathcal{F}.$$

Since $\{e_n\}$ is a basis in $H$ and $e_n = j_H(\xi_n)$, we conclude that

$$\int_X \xi g\,d\mu = \int_X \partial_{j_H(\xi)} g\,d\mu, \quad \forall\, \xi \in X^*,\ \forall\, g \in \mathcal{F}.$$

This yields the equality $\beta_H^\mu(x) = -x$. Therefore, by Proposition 7.6.4 (or Proposition 7.3.9), $\mu = \gamma$. $\Box$

A class of Gaussian diffusion processes on infinite dimensional spaces $X$ that is important for applications is connected with the equations of the form

$$dX_t = dW_t + AX_t\,dt, \quad X_0 = x, \tag{7.6.1}$$
where $W_t$ is a Wiener process, associated with a Hilbert space $H \subset X$, and $A$ is the generator of a strongly continuous semigroup $(T_t)_{t \ge 0}$ on $H$. One of the first problems arising in this connection is the interpretation of (7.6.1), since $H$ has measure zero with respect to the distributions of $W_t$. This problem arises even in the case where the semigroup $(T_t)_{t \ge 0}$ is defined on the whole space $X$, since the domain of definition of the generator may be very narrow. The following result from [92] enables one to overcome this difficulty.

7.6.6. Theorem. Let $(T_t)_{t \ge 0}$ be a strongly continuous semigroup with generator $(A, D(A))$ on a separable Hilbert space $H$. Then $H$ can be embedded linearly and continuously into some Hilbert space $E$ in such a way that $H$ is dense in $E$, $(T_t)_{t \ge 0}$ extends to a strongly continuous semigroup $(T_t^E)_{t \ge 0}$ on $E$, and $H$ turns out to be embedded into the domain $D(A^E)$ of the generator of the extended semigroup (equipped with the Hilbert norm $\|A^E x\|_E + \|x\|_E$) by means of a Hilbert-Schmidt operator. Moreover, it is possible to choose $E$ in such a way that the natural embedding of $H$ to $D((A^E)^2)$ is a Hilbert-Schmidt operator.
7.6.7. Corollary. If the conditions in Theorem 7.6.6 are satisfied, then there exists a continuous Gaussian process $(X_t^x)_{t \ge 0}$ with values in $E$ such that, for all $x \in D(A^E)$, one has $X_t^x \in D(A^E)$ and equation (7.6.1) is satisfied with $A = A^E$, where $(W_t)_{t \ge 0}$ is a Wiener process in $E$ associated with $H$. In addition,

$$X_t^x = T_t^E x + W_t + \int_0^t A^E T_{t-s}^E W_s\,ds, \quad t \ge 0.$$
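The representation in Corollary 7.6.7 has a transparent scalar analogue, which can be checked pathwise. The sketch below is an assumption-laden illustration, not the infinite dimensional construction: it takes $E = H = \mathbb{R}$, $A^E = a$ (a real number), $T_t^E = e^{at}$, simulates one discretized Wiener path, and compares the Euler solution of $dX_t = dW_t + aX_t\,dt$ with a Riemann-sum version of $e^{at}x + W_t + \int_0^t a e^{a(t-s)}W_s\,ds$. The function name `euler_vs_mild` is hypothetical.

```python
import math
import random


def euler_vs_mild(a, x0, t, n, rng):
    """Return (Euler solution at time t, value of the integral formula)
    computed along the same discretized Wiener path."""
    h = t / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + math.sqrt(h) * rng.gauss(0.0, 1.0))
    # Euler scheme for dX = dW + a X dt.
    x = x0
    for k in range(n):
        x += a * x * h + (w[k + 1] - w[k])
    # Left Riemann sum for int_0^t a exp(a (t - s)) W_s ds.
    conv = sum(a * math.exp(a * (t - k * h)) * w[k] for k in range(n)) * h
    mild = math.exp(a * t) * x0 + w[n] + conv
    return x, mild


print(euler_vs_mild(-1.0, 1.5, 1.0, 1000, random.Random(2)))
```

The two returned numbers differ only by discretization error, which shrinks as the number of steps `n` grows.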
Applications to partial differential equations

We have already encountered probabilistic representations of solutions of various partial differential equations by means of Gaussian functional integrals. Another example of this type is the so called Feynman-Kac formula. Let us consider the Cauchy problem

$$\frac{\partial u(t, x)}{\partial t} = \frac{1}{2}\Delta u(t, x) + V(x)u(t, x), \quad u(0, x) = f(x).$$

The Feynman-Kac formula is the following path integral representation of the solution of this Cauchy problem (valid under certain conditions on $V$, of course):

$$u(t, x) = \int f(w(t) + x)\exp\Bigl(\int_0^t V(w(s) + x)\,ds\Bigr)\,P^W(dw).$$

For related information, see [263], [386], [387], [406], [639], [704].
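The path integral above lends itself to a direct Monte Carlo approximation. The following sketch is one possible discretization, given under stated assumptions: the dimension is one, the Wiener path is sampled on a uniform grid, and the time integral of $V$ is replaced by a left-endpoint Riemann sum. The function name `feynman_kac_mc` and all parameter values are hypothetical. For a constant potential $V \equiv c$ and $f = \cos$ the solution is $u(t, x) = e^{(c - 1/2)t}\cos x$, which serves as a reference value.

```python
import math
import random


def feynman_kac_mc(f, potential, x, t, n_paths, n_steps, rng):
    """Monte Carlo estimate of E[f(w(t) + x) exp(int_0^t V(w(s) + x) ds)]
    for a one dimensional Wiener process w started at 0."""
    h = t / n_steps
    total = 0.0
    for _ in range(n_paths):
        w = 0.0
        integral = 0.0
        for _ in range(n_steps):
            integral += potential(w + x) * h  # left-endpoint rule
            w += math.sqrt(h) * rng.gauss(0.0, 1.0)
        total += f(w + x) * math.exp(integral)
    return total / n_paths


c, x, t = -0.3, 0.4, 1.0
estimate = feynman_kac_mc(math.cos, lambda y: c, x, t, 20000, 10, random.Random(4))
print(estimate, math.exp((c - 0.5) * t) * math.cos(x))
```

For a nonconstant potential the same routine applies, but the Riemann sum then introduces a discretization bias in addition to the sampling error.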
Limit theorems

Gaussian measures play an important role in limit theorems. Let us make several remarks about one of the most important of them, the central limit theorem (abbreviated CLT). Let $X$ be a locally convex space and let $\{X_n\}$ be a sequence of $X$-valued independent centered random vectors with one and the same Radon distribution $\mu$. Put

$$S_n = \frac{X_1 + \dots + X_n}{\sqrt{n}}.$$

Note that the distribution of $S_n$ coincides with the measure $\mu^{*n}$ defined by the equality $\mu^{*n}(A) = (\mu * \dots * \mu)(n^{1/2}A)$, where the convolution is $n$-fold. The central limit theorem concerns the following two problems: 1) convergence of the sequence of random vectors $S_n$ (in a suitable sense); 2) if this sequence converges to some random element $Y$, then what is the rate of convergence on a certain class of sets? More precisely, let $\mathfrak{M}$ be a fixed class of subsets of $X$ (say, a certain class of balls in a Banach space). Then the problem is to estimate the quantities

$$\Delta_n(\mathfrak{M}) = \sup_{M \in \mathfrak{M}} |P(S_n \in M) - P(Y \in M)|.$$

For example, a typical problem of this sort is to estimate

$$\Delta_n(f, r) = |P(f(S_n) < r) - P(f(Y) < r)|,$$

where $f$ is some function on $X$ (usually a norm or a smooth function). We shall consider only Radon probability measures $\mu$ such that

$$\int_X l(x)^2\,\mu(dx) < \infty, \quad \forall\, l \in X^*.$$

In this case we say that the measure $\mu$ has the weak second moment. A measure $\mu$ on $X$ has the strong second moment if

$$\int_X q(x)^2\,\mu(dx) < \infty$$

for every continuous seminorm $q$ on $X$.
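To make the quantities $\Delta_n$ introduced above concrete, here is a small sketch for the real line. It is an illustration under stated assumptions: $X = \mathbb{R}$, the summands take the values $\pm 1$ with probability $1/2$, the limit $Y$ is standard Gaussian, and the class of sets consists of the half-lines $(-\infty, r]$. The function names are hypothetical. Since the distribution function of $S_n$ is a step function, the supremum over $r$ is attained at its atoms, taking both one-sided limits into account.

```python
import math


def normal_cdf(r):
    """Distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(r / math.sqrt(2.0)))


def kolmogorov_distance(n):
    """sup over r of |P(S_n <= r) - P(Y <= r)| for
    S_n = (X_1 + ... + X_n) / sqrt(n) with independent signs X_i = +-1."""
    cdf = 0.0
    dist = 0.0
    for k in range(n + 1):
        atom = (2 * k - n) / math.sqrt(n)  # value of S_n with k plus signs
        phi = normal_cdf(atom)
        dist = max(dist, abs(cdf - phi))  # left limit at the atom
        cdf += math.comb(n, k) / 2 ** n
        dist = max(dist, abs(cdf - phi))  # value at the atom
    return dist


for n in (10, 100, 1000):
    print(n, kolmogorov_distance(n), kolmogorov_distance(n) * math.sqrt(n))
```

The last printed column stays bounded, which is the one dimensional rate $n^{-1/2}$ for this particular distribution.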
7.6.8. Definition. Let $X$ be a locally convex space.

(i) A probability measure $\mu$ with mean $m$ on $X$ is called pre-Gaussian if it has the weak second moment and there exists a Gaussian measure $\gamma$ with mean $m$ on $X$ such that

$$\int_X fg\,d\mu = \int_X fg\,d\gamma, \quad \forall\, f, g \in X^*.$$

(ii) A probability measure $\mu$ with zero mean on $X$ is said to satisfy the central limit theorem (CLT) if the sequence $\{\mu^{*n}\}$ is uniformly tight. A probability measure $\mu$ with mean $m$ is said to satisfy the CLT if the measure $\mu_{-m}$ with zero mean satisfies the CLT.

(iii) The space $X$ is called a space with the CLT property if every probability measure $\mu$ on $X$ with zero mean and the strong second moment satisfies the CLT; $X$ is called a space with the strict CLT property if the CLT is fulfilled for every probability measure $\mu$ on $X$ with zero mean and the weak second moment.
Note that if $X$ is a separable Fréchet space, then, as the next lemma shows, the definition of the CLT given in (ii) becomes equivalent to the classical one requiring the weak convergence of the sequence $\{\mu^{*n}\}$ to a Gaussian measure.

7.6.9. Lemma. Let $\mu$ be a probability measure with zero mean on a locally convex space $X$. If the sequence $\{\mu^{*n}\}$ is uniformly tight, then it converges weakly to some centered Radon Gaussian measure $\gamma$. In addition, $\mu$ is a pre-Gaussian measure.
The proof is left as Problem 7.6.21. On the space $X = \mathbb{R}^n$, every probability measure with the weak second moment satisfies the CLT. Certainly, such a measure also has the strong second moment. The situation is different in the infinite dimensional
case. For instance, the space C[0,1] does not have the CLT property. Moreover, there exists a pre-Gaussian measure with compact support in C[0,1], which does not satisfy the CLT. On the other hand, there exists a probability measure with compact support in C[0,1], which is not pre-Gaussian. Finally, there exists a measure on C[0,1], which satisfies the CLT, but has no strong second moment (see [592] concerning such examples). It is known that any Hilbert space has the CLT property. Since in a Hilbert space the covariance operator of a probability measure p is nuclear precisely when p has the strong second moment, we see that in this case the class of all pre-Gaussian measures coincides with the class of all measures satisfying the CLT (and also with the class of all probability measures having the strong second moment). As the space C[0,1] shows, these three classes of measures may be all different for general Banach spaces. The equality of all the three classes characterizes Hilbert spaces (more precisely, Banach spaces linearly homeomorphic to Hilbert spaces). In other words, a Banach space is linearly homeomorphic to a
Hilbert space if and only if the existence of the strong second moment of a probability measure is equivalent to the validity of the CLT for this measure. It is known that every probability measure with the strong second moment on a Banach space $X$ satisfies the CLT precisely when $X$ is a space of type 2. Therefore, on any non-Hilbert space of type 2 there exists a measure which satisfies the CLT, but has no strong second moment. If every measure on $X$ satisfying the CLT has the strong second moment, then $X$ is known to be a space of cotype 2; moreover, this property is a full characterization of the spaces of cotype 2. Note also that $X$ has cotype 2 precisely when every pre-Gaussian measure on $X$ satisfies the CLT. Proofs of these assertions and the corresponding references can be found in [592, Ch. 3], [472, Ch. 10]. Let us mention several properties of locally convex spaces with the strict CLT property. This property was introduced in [73], where the proofs can be found.
7.6.10. Theorem. A Banach space $X$ has the strict CLT property precisely when $\dim X < \infty$. The strict CLT property is inherited by the closed subspaces and is retained by the strict inductive limits of increasing sequences of closed subspaces, by countable products, arbitrary direct sums, and the countable projective limits.
7.6.11. Example. Let $X$ be the dual to a complete nuclear barrelled locally convex space $Y$. Then $X$ with the strong topology has the strict CLT property. For example, this is true if $X$ is the dual to a nuclear Fréchet space. The following spaces have the strict CLT property: $C^\infty[a, b]$, $\mathcal{S}(\mathbb{R}^k)$, $\mathcal{S}(\mathbb{R}^k)'$, $\mathbb{R}^\infty$.

A detailed survey of the results connected with the problem of estimating the rate of convergence in the central limit theorem can be found in [592], [58]. An important achievement in this area was the proof of the estimate $Cn^{-1/2}$ in the case where $U$ is a ball in a Hilbert space and the vector $X_1$ has the strong third moment (assuming the existence of the sixth moment this estimate was first obtained by F. Götze, and the improvement involving only the third moment is due to V. Yurinsky). It has been recently shown by V. Bentkus and F. Götze [67] that if $X_1$ takes values in a Hilbert space and has the strong fourth moment, then one has the estimate $Cn^{-1}$, provided the topological support of the limit Gaussian measure has dimension at least 9. V. Bentkus discovered that, for general Banach spaces, no moment restrictions enable one to get an estimate better than $Cn^{-1/6}$ even if the distribution of the norm with respect to the limit Gaussian measure has a bounded density. If no assumptions are made concerning the distribution of the norm, then the rate of convergence on balls can be arbitrarily slow (see [641]). It remains an open problem what the rate of convergence on balls is in the case where the limit Gaussian measure is the Wiener measure on $C[0, 1]$.

The central limit theorem is but one problem in the growing area of limit theorems for infinite dimensional random elements. Extensive and interesting material, including the study of convergence of sums of independent random vectors, is presented in the works cited in Bibliographical Comments. One of the related questions is the law of the iterated logarithm (see [191], [305], [435], [442], [472], [476], [487], [726]). In its simplest formulation it states that if $\{\xi_n\}$ is a sequence of independent random vectors in a separable Banach space $X$ with one and the same centered Gaussian distribution $\gamma$, then with probability 1 the sequence $\sum_{j=1}^{n} \xi_j / \sqrt{2n\log\log n}$ has as a cluster point (in the topology of $X$) every element of the unit ball $U$ of the Hilbert space $H(\gamma)$. The same is true for a locally convex space $X$ if $\gamma$ is a Radon measure (see [96], [753]).
A random vector $\xi$ with values in a locally convex space $X$ is called (see [494], [771]) stable of order $\alpha \in (0, 2]$ if, for every $n$, there exists a vector $a_n \in X$ such that, for any independent copies $\xi_1, \dots, \xi_n$ of the vector $\xi$, the random vector $n^{-1/\alpha}(\xi_1 + \dots + \xi_n) - a_n$ has the same distribution as $\xi$. A measure is called stable if it is the distribution of a stable random vector. Stable random vectors of order 2 are precisely Gaussian vectors. The distribution of any stable vector is a mixture of Gaussian measures (see [749]).
Problems

7.6.12. Let $S_n = \sum_{k=1}^{n} \xi_{n,k}^2$, where for every $n \in \mathbb{N}$, $\xi_{n,1}, \dots, \xi_{n,n}$ are independent centered Gaussian random variables with variance $n^{-1}$. Show that $S_n \to 1$ in the square mean and deduce (7.1.3). Hint: show that $\mathbb{E}(S_n - 1)^2 = n\mathbb{E}(\xi_{n,1}^2 - 1/n)^2 = 2/n$; to prove (7.1.3) use the martingale convergence theorem (see [337, §2.2, Theorem 2.3]).

7.6.13. Let $f$ be a continuous function on $\mathbb{R}^1$ such that the function $t \mapsto f(w_t(\omega))$ has bounded variation on $[0, 1]$ for a.e. $\omega$, where $(w_t)_{t \ge 0}$ is a standard Wiener process. Show that $f$ is constant. Hint: see [257].

7.6.14. Let $\gamma$ be a centered Radon Gaussian measure on a locally convex space, let $H$ be the Cameron-Martin space of $\gamma$, and let $f \in W^{2,1}(\gamma)$. Put $\mu = f \cdot \gamma$. Show that $\beta_H^\mu(x) = -x + D_H f(x)/f(x)$.

7.6.15. Show that in the situation of Proposition 7.3.9 the measure $\mu$ is a unique probability measure with the logarithmic gradient $A$ generated by $H$.

7.6.16. Show that a measure $\mu$ on $\mathbb{R}^n$ with the logarithmic gradient $\beta^\mu$ (generated by $\mathbb{R}^n$) is spherically symmetric if and only if there exists a real function $c(\cdot)$ on $\mathbb{R}^n$ such that $\beta^\mu(x) = c(x)x$ $\mu$-a.e. In addition, every such function $c$ admits a spherically symmetric modification. Hint: show that, for every orthogonal operator $S$, one has a.e. the equality $p(Sx) = p(x)$, where $p$ is a density of $\mu$. To this end, letting $T_t$ be the group of rotations in angle $t$ in a fixed two dimensional plane, verify that $\partial_t p(T_t x) = 0$ for a.e. $t$ and a.e. $x$, choosing a modification of $p$ such that the function $t \mapsto p(T_t x)$ is absolutely continuous for a.e. $x$.

7.6.17. Prove that the measure $\mu$ defined by equality (7.4.3) has the logarithmic gradient associated with $H$ precisely when $\int_0^\infty t^{-1}\,\sigma(dt) < \infty$. Hint: the necessity of this condition (noted in [568]) reduces to the one dimensional case by taking the measure $\mu \circ l^{-1}$, where $l \in X^*$ is not zero. The sufficiency part is trivial.

7.6.18. Is the standard Ornstein-Uhlenbeck process a martingale?

7.6.19. Let $(\xi_n(t))$ be a sequence of martingales on a common probability space with a given filtration. Show that the process $\sup_n \xi_n(t)$ is a submartingale with respect to the same filtration.

7.6.20. Prove that if a probability measure $\mu$ on a locally convex space $X$ is stable of some order and convex (i.e., satisfies (4.2.2)), then it is Gaussian. Hint: reduce the statement to the one dimensional case and use the fact that any convex measure has the second moment, whereas among the stable measures only Gaussian measures have this property (see [697, Ch. III, §5]).

7.6.21. Prove Lemma 7.6.9. Hint: make use of the relative weak compactness of the sequence of measures $\mu^{*n}$ and the uniqueness of its possible weak limit, which follows from the central limit theorem for the one dimensional projections.
7.6.22. Let $X$ be a separable Banach space, which contains a closed linear subspace linearly homeomorphic to the space $c_0$. Show that there exists a probability measure $\mu$ on $X$ with compact support such that $\mu$ is mutually singular with every pre-Gaussian measure on $X$. In particular, this is true for the space $X = C[0,1]$. Hint: use the method of Example 2.12.10; see also [93].
7.6.23. Let $\gamma$ be the countable product of the standard Gaussian measures on the real line considered on the Hilbert space $X$ of all sequences $(x_n)$ with the finite norm $\bigl(\sum_{n=1}^\infty n^{-2}x_n^2\bigr)^{1/2}$. Define a probability measure $\mu$ on $X$ by
$\mu(A) = \int_0^\infty \gamma(A/t)\,p(t)\,dt,$
where $p$ is any positive probability density on $(0,\infty)$ such that $\int t^2 p(t)\,dt = \infty$. Show that for any finite collection of $\mu$-measurable linear functionals $l_1,\dots,l_n$ and any Borel mapping $\psi\colon \mathbb{R}^n \to X$, one has
$\int_X \bigl\|x - \psi\bigl(l_1(x),\dots,l_n(x)\bigr)\bigr\|^2\,\mu(dx) = \infty.$
Prove that, more generally, the same is true if $l_j(x)$ are replaced by measurable functions of the form $l_j(x) = f_j\bigl(x, l_1(x),\dots,l_{j-1}(x)\bigr)$, where $f_1 = l_1$ is a measurable linear functional, $f_j\colon X\times\mathbb{R}^{j-1}\to\mathbb{R}$, $(x,y)\mapsto f_j(x,y)$ is measurable linear in $x$ and Borel in $y$. Hint: use Anderson's inequality and Proposition 6.11.4.
APPENDIX A
Locally Convex Spaces, Operators, and Measures
We do not understand many matters not because our concepts are weak, but because these matters do not belong to the circle of our concepts.
Kozma Prutkov
A.1. Locally convex spaces
Basic definitions
Proofs of the facts presented below and additional information concerning locally convex spaces can be found in [670], [220]. A nonnegative function $p$ on a real linear space $X$ is called a seminorm if $p(\lambda x) = |\lambda|\,p(x)$ and $p(x+y) \le p(x) + p(y)$ for all reals $\lambda$ and all vectors $x, y \in X$. A real linear space $X$ is called a locally convex space if it is equipped with a family of seminorms $P = (p_\alpha)_{\alpha\in A}$ on $X$ separating the points (i.e., for every nonzero element $x \in X$ there exists an index $\alpha \in A$ such that $p_\alpha(x) > 0$). The topology on $X$ generated by the family $P$ consists of the open sets which are arbitrary unions of the basis neighborhoods of the form
$\{x\colon p_{\alpha_i}(x-a) < \varepsilon_i,\ i = 1,\dots,n\}$, $\alpha_i \in A$, $a \in X$, $\varepsilon_i > 0$, $n \in \mathbb{N}$.
Clearly, different families of seminorms can define one and the same topology. A normed space is a special case of a locally convex space. The topological dual to a locally convex space $X$ (the space of all continuous linear functionals on $X$) is denoted by $X^*$. Sometimes we use also the algebraic dual $X'$ which is the space of all (not necessarily continuous) linear functionals on $X$. However, the term dual is reserved for the topological dual throughout this book. Every locally convex space $X$ has a Hamel basis, i.e., a collection of linearly independent vectors $\{v_\alpha\}$ such that every element in $X$ is a finite linear combination of the vectors $v_\alpha$. A mapping $A\colon X \to Y$ between linear spaces $X$ and $Y$ is called affine if $A = L + a$, where $L\colon X \to Y$ is a linear mapping and $a \in Y$ is a fixed vector. A typical example of a locally convex space arising in the theory of random processes is the space $\mathbb{R}^T$ of all real functions on a nonempty set $T$ equipped with the topology of pointwise convergence, or, in other words, the topology generated by the family of seminorms
$p_t(x) = |x(t)|, \quad t \in T.$
The space $\mathbb{R}^T$ is called the product of $T$ copies of $\mathbb{R}^1$. In particular, if $T$ is the set $\mathbb{N}$ of all natural numbers, then the corresponding space is denoted by $\mathbb{R}^\infty$. The dual to $\mathbb{R}^T$ coincides with the linear span of the functionals $x \mapsto x(t)$, $t \in T$ (see [670, p. 137, Theorem IV.4.3]); this is clear from the fact that a linear functional $f$ bounded on the neighborhood $\{x\colon |x(t_i)| < c,\ i \le n\}$ is a linear combination of the functionals $\delta_{t_i}\colon x \mapsto x(t_i)$, $i \le n$, since it is zero on $\bigcap_{i} \mathrm{Ker}\,\delta_{t_i}$. The linear span of a set $A$ in a linear space is denoted by $\mathrm{span}\,A$. For any sets $A$ and $B$ in a linear space $X$ and any scalar $\lambda$, we put
$\lambda A := \{\lambda a \mid a \in A\}, \quad A + B := \{a + b \mid a \in A,\ b \in B\}.$
A set $A$ in a locally convex space $X$ is called bounded if, for every neighborhood of zero $V$ in $X$, there exists $\lambda > 0$ such that $A \subset \lambda V$. This is equivalent to the boundedness on $A$ of every continuous linear functional.
A set $A$ in a locally convex space is called symmetric if $A = -A$. A set $A$ in a locally convex space is called convex if $\lambda a + (1-\lambda)b \in A$ for all $\lambda \in [0,1]$ and $a, b \in A$. A convex set $A$ is called absolutely convex (or convex balanced) if $\lambda A \subset A$ for every scalar $\lambda$ with $|\lambda| \le 1$. Clearly, this is equivalent to the convexity and symmetry of $A$. The convex hull of a set $A$ is the minimal convex set (denoted by $\mathrm{conv}\,A$) containing $A$. The absolutely convex hull $\mathrm{absconv}\,A$ of a set $A$ is defined analogously. The closed absolutely convex hull of a set $A$ is the minimal absolutely convex closed set containing $A$. We say that a locally convex space $(X, \tau_X)$ is continuously embedded into a locally convex space $(Y, \tau_Y)$ if $X$ is a linear subspace in $Y$ and the natural embedding $(X, \tau_X) \to (Y, \tau_Y)$ is continuous. If, in addition, $X$ is dense in $Y$, then we say that $X$ is densely embedded. Let $E$ be a linear space and let $F$ be a linear subspace in the space of all linear functionals on $E$, separating the points in $E$ (i.e., for every two different elements in $E$, there is a functional from $F$ taking different values on these elements). Denote by $\sigma(E,F)$ the locally convex topology on $E$ generated by the family of seminorms
$p_f(x) = |f(x)|, \quad f \in F.$
This is the topology of pointwise convergence on $F$ if the elements of $E$ are considered as functionals on $F$. Two typical examples: the weak topology $\sigma(X, X^*)$ on the locally convex space $X$ and the *-weak topology $\sigma(X^*, X)$ on its dual. An important property of the topology $\sigma(E,F)$ is that the dual to $(E, \sigma(E,F))$ coincides (as a linear space) with $F$, i.e., every linear functional that is continuous in the topology $\sigma(E,F)$ has the form $x \mapsto f(x)$, $f \in F$. In particular, any linear functional $F$ on the space $X^*$ that is continuous in the topology $\sigma(X^*, X)$ has the form $F(f) = f(a)$ for some $a \in X$. The Mackey topology $\tau_M(X^*, X)$ on $X^*$ is defined by means of the seminorms
$p_K(f) = \sup_{x \in K} |f(x)|, \quad K \in \mathcal{K},$
where $\mathcal{K}$ is the family of all absolutely convex $\sigma(X, X^*)$-compact subsets of $X$. A proof of the following Mackey theorem can be found in [670, p. 131, Ch. IV, 3.2, Corollary 1].
A.1.1. Theorem. Every linear functional $F$ on $X^*$ continuous in the Mackey topology $\tau_M(X^*, X)$ has the form $F(f) = f(a)$ for some $a \in X$.
A topological space $T$ is called metrizable if the topology of $T$ is generated by a metric. A locally convex space is metrizable precisely when its topology is generated by a countable family of seminorms. A complete metrizable locally convex space is called a Fréchet space. For example, the countable product of the real lines $\mathbb{R}^\infty$ is a Fréchet space. Any Banach space (i.e., complete normed space) is a Fréchet space. The most typical examples of Banach spaces encountered in the theory of Gaussian measures are: the space $l^\infty$ of all bounded sequences $x = (x_n)$ with $\|x\| = \sup_n |x_n|$, its closed subspace $c_0$ consisting of all sequences converging to zero, the spaces $L^p(\mu)$, where $p \in [1, \infty]$, and the space $C[a,b]$ of all continuous functions on $[a,b]$ with the sup-norm. Every locally convex space $X$ is completely regular, i.e., for every point $x \in X$ and every neighborhood $U$ of $x$, there exists a continuous function $f\colon X \to [0,1]$ such that $f(x) = 1$ and $f = 0$ outside $U$ (it suffices to be able to construct such a function for $x = 0$ and any neighborhood of the form $U = \{p < 1\}$, where $p$ is a continuous seminorm; in this case one can put $f(z) = 1 - p(z)$ if $z \in U$ and $f(z) = 0$ if $z \notin U$).
A.1.2. Lemma. Let $K$ be a compact set in a completely regular topological space $X$ and let $U$ be an open set containing $K$. Then:
(i) There exists a continuous function $f\colon X \to [0,1]$ equal to 1 on $K$ and 0 outside $U$;
(ii) Every continuous function $\varphi$ on $K$ extends to a continuous function $\psi$ on $X$ such that $\sup_X |\psi| = \sup_K |\varphi|$ and $\psi = 0$ outside $U$.
PROOF. A proof of (i) can be found in [220, p. 19]. For the proof of (ii) it suffices to find a continuous extension of $\varphi$ to $X$ with preservation of maximum and multiply it by the function from (i). The Stone-Weierstrass theorem implies the existence of a bounded continuous function $g$ on $X$ equal to $\varphi$ on $K$. Now we can replace $g$ by the function $\theta(g)$, where $\theta(t) = t$ if $|t| \le \sup|\varphi|$ and $\theta(t) = \sup|\varphi| \cdot \mathrm{sign}\,t$ if $|t| > \sup|\varphi|$. □
Recall that a mapping $F$ between topological spaces is called sequentially continuous if $F(x_n) \to F(x)$ whenever $x_n \to x$.
A partially ordered set $A$ is called directed if, for every $\alpha$ and $\beta$ from $A$, there is $\gamma \in A$ such that $\alpha \le \gamma$ and $\beta \le \gamma$. A net of elements of the set $X$ is a subset $\{x_\alpha\}_{\alpha\in A} \subset X$ indexed by a directed set $A$. The concept of a net generalizes that of a sequence. A net $\{x_\alpha\}_{\alpha\in A}$ in a locally convex space $X$ is called fundamental (or Cauchy) if it is fundamental with respect to every seminorm $q$ from some family of seminorms generating the topology of $X$ (i.e., for every $\varepsilon > 0$, there exists $\lambda \in A$ such that $q(x_\alpha - x_\beta) < \varepsilon$ for all $\alpha \ge \lambda$, $\beta \ge \lambda$).
A.1.3. Definition.
(i) A locally convex space $X$ is called sequentially complete if every Cauchy sequence in $X$ converges.
(ii) A locally convex space $X$ is called complete if every fundamental net in $X$ converges.
(iii) A subset $A$ of a locally convex space $X$ is called sequentially closed if it contains the limit of every convergent sequence of its elements.
In a similar manner one defines the completeness and the sequential completeness for subsets of $X$. It is clear that every complete locally convex space is sequentially complete. An infinite dimensional Hilbert space with the weak topology gives an example of a sequentially complete locally convex space which is not complete (Problem A.3.25). Similarly to metric spaces, locally convex spaces possess completions.
A.1.4. Theorem. Every locally convex space $X$ has a completion $\widetilde{X}$, i.e., there exist a complete locally convex space $\widetilde{X}$, a linear subspace $X_0$ everywhere dense in $\widetilde{X}$ and a linear homeomorphism $h\colon X \to X_0$.
The product $X \times Y$ of locally convex spaces $X$ and $Y$ possesses the natural structure of a locally convex space: the corresponding family of seminorms is defined by $(x,y) \mapsto p(x) + q(y)$, where $p$ and $q$ are representatives of the families of seminorms defining the topologies of $X$ and $Y$, respectively.
Convex sets and compact sets
Let us describe a construction connected with convex sets which finds numerous applications in measure theory on linear spaces. Let $A$ be an absolutely convex set in a locally convex space $X$. Denote by $E_A$ the linear span of $A$. Put
$p_A(x) = \inf\{r > 0\colon x \in rA\}, \quad x \in E_A.$
The function $p_A$ on $E_A$ is called the Minkowski functional (or the gauge function) of the set $A$.
A.1.5. Theorem. Let $A$ be an absolutely convex sequentially closed bounded set in a locally convex space $X$. Then the function $p_A$ is a norm on $E_A$, whose closed unit ball is $A$. In addition, the natural embedding of $(E_A, p_A)$ into $X$ is continuous. If $A$ is sequentially complete, then $(E_A, p_A)$ is a Banach space.
The proof can be found in [220, p. 444, Lemma 6.5.2].
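As an illustration of the Minkowski functional, the following minimal sketch evaluates $p_A$ numerically in the plane. It assumes that $A$ is given by a membership test (the hypothetical argument `contains`), that $A$ is absolutely convex, closed and bounded, and that $p_A(x)$ does not exceed the search bound `r_hi`; the ellipse `in_ellipse` is chosen only as an example and does not come from the text.

```python
def gauge(x, contains, r_hi=1e6, tol=1e-12):
    """Approximate p_A(x) = inf{r > 0 : x in r*A} by bisection.

    `contains(y)` is an assumed membership test for an absolutely convex,
    closed, bounded set A; absolute convexity makes "x in r*A" monotone in r.
    """
    if all(c == 0.0 for c in x):
        return 0.0
    lo, hi = 0.0, r_hi
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if contains(tuple(c / mid for c in x)):  # x in mid*A  <=>  x/mid in A
            hi = mid
        else:
            lo = mid
    return hi


def in_ellipse(y):
    """Example set A: the ellipse with semi-axes 2 and 3 (an illustrative choice)."""
    return (y[0] / 2.0) ** 2 + (y[1] / 3.0) ** 2 <= 1.0
```

For this ellipse the gauge is the norm $\sqrt{(x_1/2)^2 + (x_2/3)^2}$, whose closed unit ball is $A$ itself, in agreement with Theorem A.1.5.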
Let us formulate a number of results about compact sets in locally convex spaces that we use in the main text.
A.1.6. Proposition. In any complete locally convex space, the closed absolutely convex hull of a compact set is compact.
The previous statement may fail for not necessarily complete spaces. Part (ii) of the next proposition is due to [644]. The proof below is borrowed from [90].
A.1.7. Proposition.
(i) The metrizability of a compact space $K$ is equivalent to the existence of a sequence of continuous functions separating the points in $K$. A compact set $K$ in a locally convex space $X$ is metrizable if and only if there exists a sequence $\{l_n\} \subset X^*$ separating the points of $K$.
(ii) The closed absolutely convex hull $\widetilde{K}$ of any metrizable compact set $K$ in a locally convex space $X$ is metrizable; if $X$ is sequentially complete, then $\widetilde{K}$ is a metrizable compact space.
PROOF. (i) Clearly, on any metrizable compact set there is a sequence of continuous functions separating the points. Recall that if on a set $K$ one has two Hausdorff topologies $\tau_1$ and $\tau_2$ with respect to which $K$ is compact and the natural embedding $(K, \tau_1) \to (K, \tau_2)$ is continuous, then this mapping is a homeomorphism. Therefore, if continuous functions $f_n$ separate the points of a compact set $K$, the metric
$\varrho(x,y) = \sum_{n=1}^{\infty} 2^{-n}\,\frac{|f_n(x) - f_n(y)|}{1 + |f_n(x) - f_n(y)|}$
generates the initial topology of $K$. This simple observation implies also that on a compact set $K$ in a locally convex space $X$ the weak topology coincides with the initial one and, in addition, that the weak topology coincides with every topology
on $K$ generated by any family of continuous linear functionals separating the points in $K$. Therefore, in the case where there is a countable family with this property, the corresponding topology is defined by the aforementioned metric. Conversely, if a compact set $K$ in a locally convex space $X$ is metrizable, then the weak topology on $K$ has a countable base of the form
$\{x\colon |l_j(x - a)| < k^{-1},\ j = 1,\dots,n(a)\}$, $a \in A$, $l_j \in X^*$, $k \in \mathbb{N}$,
where $A \subset K$ is an at most countable set. Therefore, there exists an at most countable family of continuous linear functionals separating the points in $K$.
(ii) Let $K$ be a metrizable compact set in a locally convex space $X$. Assume first that $X$ is complete. According to the Riesz theorem, the dual to $C(K)$ is identified with the space of signed Borel measures on $K$. By the Banach-Alaoglu theorem, the closed unit ball $U$ in $C(K)^*$ is compact in the *-weak topology. Since the space $C(K)$ is separable (see Problem A.3.23), by virtue of (i), $U$ is compact metrizable in the *-weak topology. Let us consider the mapping
$I\colon U \to X, \quad I(m) = \int_K x\,m(dx),$
where the integral is understood in the sense of Pettis (see Section A.3 below), and its existence follows from the completeness of $X$ (see the proof of Lemma A.3.20 below). It is easy to see that this mapping is continuous if $U$ is equipped with the *-weak topology and $X$ is given the weak topology. Therefore, the absolutely convex set $I(U)$ is weakly compact in $X$. Moreover, by the metrizability of $U$, this set is metrizable (see [231, Theorem 4.4.15]). Clearly, $I(U)$ contains the closed absolutely convex hull of $K$, since $K \subset I(U)$ by virtue of the equality $k = I(\delta_k)$, where $\delta_k$ is the probability measure at the point $k$. Therefore, the closure of the absolutely convex hull of $K$ is a metrizable compact set as a closed subset of a metrizable compact space (in fact, as can be easily shown, $I(U)$ coincides with the closed absolutely convex hull of $K$). It remains to note that the first claim from (ii) follows now from the existence of a completion of $X$. □
A.2. Linear operators
Bounded operators
Recall some well-known facts from the theory of linear operators. We consider below only real spaces.
The range of a linear operator $A$ on a space $X$ is denoted by $A(X)$. $\mathrm{Ker}\,A$ stands for the kernel of the operator $A$ (the preimage of zero). Denote by $\mathcal{L}(X,Y)$ the space of all continuous linear operators from a locally convex space $X$ to a locally convex space $Y$. Let $\mathcal{L}(X) := \mathcal{L}(X,X)$. If $X$ and $Y$ are normed spaces, then $\mathcal{L}(X,Y)$ is equipped with the operator norm $\|\cdot\|_{\mathcal{L}(X,Y)}$. An operator $A$ on a normed space is called compact if it takes the unit ball to a precompact set. The space of all compact operators from $X$ to $Y$ is denoted by $\mathcal{K}(X,Y)$. Put $\mathcal{K}(X) := \mathcal{K}(X,X)$. The following useful result is called the closed graph theorem (see [670, p. 78, Theorem III.2.3]).
A.2.1. Theorem. Let $X$ and $Y$ be two Fréchet spaces (e.g., Banach spaces). A linear mapping $A\colon X \to Y$ is continuous if and only if its graph $\{(x, Ax),\ x \in X\}$
is closed in $X \times Y$. In particular, if Banach spaces $X$ and $Y$ are continuously linearly embedded into a locally convex space $Z$ and $X \subset Y$, then the natural embedding $X \to Y$ is continuous.
Let $H$ be a Hilbert space. In the definitions and statements below, for the sake of simplicity of formulations, we use notation which implicitly means that the spaces in question are infinite dimensional; clearly, in the finite dimensional case we have in mind finite bases, etc.
A.2.2. Definition. An operator $A \in \mathcal{L}(H)$ is called symmetric if $(Ax, y) = (x, Ay)$ for all $x, y \in H$. A symmetric operator $A \in \mathcal{L}(H)$ is called nonnegative if $(Ax, x) \ge 0$ for all $x \in H$.
Note that in real spaces (unlike the complex ones) the positivity of the quadratic form $(Ax, x)$ does not imply the symmetry of $A$. For every nonnegative operator $B \in \mathcal{L}(H)$, there exists a unique nonnegative operator $C \in \mathcal{L}(H)$, denoted by $\sqrt{B}$, such that $C^2 = B$. For $A \in \mathcal{L}(H)$ we put $|A| := \sqrt{A^*A}$. Note that for any $x \in H$ one has
$(|A|x, |A|x) = (A^*Ax, x) = (Ax, Ax).$
An operator $K \in \mathcal{L}(H)$ is compact precisely when so is the operator $|K|$. According to the Hilbert-Schmidt theorem, for any compact symmetric linear operator $A$ on a separable Hilbert space, there exists an orthonormal basis $\{e_n\}$ such that $Ae_n = \alpha_n e_n$, where the numbers $\alpha_n$ tend to zero.
A.2.3. Definition. An operator on a Hilbert space which preserves the inner product is called isometric (or an isometry). A linear operator is called orthogonal if it is invertible and preserves the inner product.
For example, the operator $x \mapsto (0, x_1, x_2, \dots)$ on $l^2$ is isometric, but not orthogonal. The polar decomposition of an operator $A$ is the representation
$A = U|A|,$
where $U \in \mathcal{L}(H)$ is a linear isometry on the closure of $|A|(H)$ given by $U(|A|x) = Ax$ and zero on the orthogonal complement of $|A|(H)$. Note that $U$ is well-defined, which follows from the fact that if $|A|v = 0$, then $Av = 0$. The operator $U$ is called a partial isometry. If an operator $A$ is injective (i.e., has zero kernel) and has dense range, then $U$ is an orthogonal operator. The polar decomposition can be written also in the form
$A = \sqrt{AA^*}\,V,$
where $V$ is the operator adjoint to the partial isometry from the polar decomposition for $A^*$. This representation yields the following simple, but useful fact: for every operator $A \in \mathcal{L}(H)$ with dense range, one can find an injective nonnegative symmetric operator $B$ such that $B(H) = A(H)$. Indeed, by the factorization by the kernel of $A$ this claim reduces to the case where $A$ is injective. The range of the symmetric nonnegative operator $B = \sqrt{AA^*}$ is dense as well, hence, as one can easily verify, this operator is injective. In addition, $B(H) = A(H)$, which follows from the formula above, since $V$ is orthogonal.
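The polar decomposition can be made concrete in the simplest finite dimensional case. The sketch below is only an illustration under stated assumptions: $H = \mathbb{R}^2$, the matrix $A$ is invertible, and $|A| = \sqrt{A^*A}$ is computed by the closed-form square root of a $2\times 2$ positive definite matrix (a consequence of the Cayley-Hamilton identity, not a method taken from the text). All function names are hypothetical.

```python
import math


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(P):
    return [[P[j][i] for j in range(2)] for i in range(2)]


def inverse(P):
    d = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / d, -P[0][1] / d], [-P[1][0] / d, P[0][0] / d]]


def sqrt_spd(M):
    """Square root of a symmetric positive definite 2x2 matrix M.

    Uses sqrt(M) = (M + s*I) / t with s = sqrt(det M), t = sqrt(tr M + 2*s).
    """
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]


def polar(A):
    """Return (U, |A|) with A = U|A| for an invertible 2x2 matrix A."""
    abs_a = sqrt_spd(matmul(transpose(A), A))  # |A| = sqrt(A* A)
    U = matmul(A, inverse(abs_a))              # U = A |A|^{-1}, orthogonal here
    return U, abs_a
```

Since $A$ is assumed injective with full range, the factor $U$ returned here is orthogonal, which matches the remark above about injective operators with dense range.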
A.2.4. Proposition. Let $E$ be also a Hilbert space. Then $\mathcal{K}(H,E)$ coincides with the closure of the class of the finite dimensional continuous operators with respect to the operator norm.
A.2.5. Definition. Let $H$ and $E$ be two Hilbert spaces. An operator $A \in \mathcal{L}(H,E)$ is called a Hilbert-Schmidt operator if the series
$\sum_{\alpha} \|Ae_\alpha\|_E^2 \qquad \text{(A.2.1)}$
converges for some orthonormal basis $\{e_\alpha\}$ in $H$.
If the space $H$ is nonseparable, then the membership of $A$ in the class of Hilbert-Schmidt operators means that $A$ is zero on the orthogonal complement of some separable subspace $H_0 \subset H$ and $\sum_{n=1}^{\infty} \|Ae_n\|_E^2 < \infty$ [...]
[...] $\varepsilon > 0$, there exist a diagonal operator $D_\varepsilon$ and a symmetric Hilbert-Schmidt operator $S_\varepsilon$ on $H$ such that $A = D_\varepsilon + S_\varepsilon$ and $\|S_\varepsilon\|_{\mathcal{H}(H)} \le \varepsilon$.
A.2.16. Lemma. Let $E$, $H$ be two Hilbert spaces, $A \in \mathcal{L}(E,H)$. Suppose that $H$ is separable and the operator $A$ is injective. Then $E$ is separable as well.
PROOF. The set $A^*(H^*)$ is dense in $E^*$ by the injectivity of $A$. Hence the space $E^*$ is separable, which implies the separability of $E$.
Semigroups and unbounded operators
A linear mapping $A$ defined on a dense linear subspace $D(A)$ in a Hilbert space $H$ and taking values in $H$ is called a densely defined linear operator. A densely defined operator is called closed if its graph is a closed set in $H \times H$. If $A$ is a densely defined linear operator, then the domain $D(A^*)$ of the operator $A^*$ is defined as the set of all vectors $y \in H$ such that the functional $x \mapsto (Ax, y)$ is continuous on $D(A)$ with the norm from $H$. By the Riesz theorem, there exists $z \in H$ such that $(Ax, y) = (x, z)$ for all $x \in D(A)$. By definition, $A^*y = z$. Note that the set $D(A^*)$ may coincide with $\{0\}$.
We say that $A$ is a symmetric operator in a (real) Hilbert space $H$ if $A$ is a linear mapping from a dense linear subspace $D(A) \subset H$ (called the domain of $A$) to $H$ such that $(Ax, y) = (x, Ay)$ for all $x, y \in D(A)$. For any symmetric operator, the adjoint operator is defined at least on $D(A)$, hence is also densely defined.
A.2.17. Definition. A symmetric linear operator is called self-adjoint if it coincides with its adjoint (i.e., $D(A^*) = D(A)$ and $A^* = A$ on this domain).
Unlike the case of bounded operators, a symmetric operator may fail to be self-adjoint.
A.2.18. Definition. Let $X$ be a Banach space. A family $(T_t)_{t\ge 0} \subset \mathcal{L}(X)$ is called a strongly continuous semigroup on $X$ if $T_0 = I$, $T_{t+s} = T_tT_s$ for all $t, s \ge 0$, and, for every $x \in X$, the mapping $t \mapsto T_tx$ from $[0, \infty)$ to $X$ is continuous.
One of the fundamental results in the theory of operator semigroups states that the linear subspace
$D(L) := \Bigl\{h \in X\colon \lim_{t\to 0} \frac{T_th - h}{t} \text{ exists in the norm of } X\Bigr\}$
is dense in $X$ (see [214, p. 620, Lemma VIII.1.8]). In addition, the linear operator $L$ defined on $D(L)$ by the equality
$Lh = \lim_{t\to 0} \frac{T_th - h}{t}$
is closed. This operator is called the generator of the semigroup $(T_t)_{t\ge 0}$.
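A minimal numerical sketch of a semigroup and its generator follows. As an assumed example (not taken from the text), it uses the rotations of the plane $T_t$, $t \ge 0$, for which the generator is the bounded operator $Lh = (-h_2, h_1)$ and $D(L)$ is the whole space; the function names are hypothetical.

```python
import math


def T(t, x):
    """Rotation of the plane by angle t: an assumed example of a semigroup T_t."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def generator(h):
    """Generator of the rotation semigroup: L h = d/dt T_t h at t = 0."""
    return (-h[1], h[0])


def difference_quotient(t, h):
    """The quotient (T_t h - h) / t whose limit as t -> 0 defines L h."""
    y = T(t, h)
    return ((y[0] - h[0]) / t, (y[1] - h[1]) / t)
```

For small $t$ the difference quotient is close to `generator(h)`, and `T(t + s, x)` agrees with `T(t, T(s, x))` up to rounding, which are the two defining properties used above.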
A.3. Measures and measurability
Measures and integrals
Concerning the facts from the Lebesgue integration theory mentioned below, see [697, Ch. II]. The term "measure" means a countably additive bounded nonnegative measure on a $\sigma$-field of sets $\mathcal{M}$. For two measures $\mu$ and $\nu$ on $\mathcal{M}$, the absolute continuity of $\mu$ with respect to $\nu$ is denoted by $\mu \ll \nu$. [...] For $p \ge 1$, by $L^p(\mu)$ we denote the Banach spaces of $\mu$-measurable functions whose absolute values are integrable in power $p$. The norm in $L^p(\mu)$ is denoted by $\|\cdot\|_{L^p(\mu)}$ or by $\|\cdot\|_p$. The integral of a function $f$ over a set $A$ with respect to a measure $\mu$ is denoted by $\int_A f(x)\,\mu(dx)$ or by $\int_A f\,d\mu$. In the case of integrating over the whole space the limits of integration are sometimes omitted. The indicator function of a set $A$ is denoted by $I_A$ ($I_A(x) = 1$ if $x \in A$, $I_A(x) = 0$ if $x \notin A$). If $\mu$ is a measure and $A$ is a $\mu$-measurable set, then the measure $\mu|_A := I_A \cdot \mu$ (i.e., $\mu|_A(B) = \mu(A \cap B)$) is called the restriction of $\mu$ to the set $A$. It is known that every signed measure $m$ (a countably additive real function on a $\sigma$-field $\mathcal{B}$ in a space $\Omega$) can be written as $m = m^+ - m^-$, where $m^+$ and $m^-$ are mutually singular nonnegative measures on $\mathcal{B}$ called the positive and negative parts of $m$, respectively. The quantity $\|m\| := m^+(\Omega) + m^-(\Omega)$ is called the total variation of $m$ (or, shortly, the variation of $m$). The variation is a norm on the linear space of all signed measures on $\mathcal{B}$ making it into a Banach space. The variation distance $\|\mu - \nu\|$ between two nonnegative measures $\mu$ and $\nu$ on a $\sigma$-field $\mathcal{B}$ can be written as
$\|\mu - \nu\| = \sup\{|\mu(B) - \nu(B)| + |\mu(\Omega\setminus B) - \nu(\Omega\setminus B)|,\ B \in \mathcal{B}\}.$
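The supremum formula for the variation distance can be checked directly on a finite space. The sketch below assumes that measures on a finite set $\Omega$ are stored as dictionaries mapping points to masses (a representation chosen here only for illustration) and compares $m^+(\Omega) + m^-(\Omega)$ for $m = \mu - \nu$ with a brute-force supremum over all subsets $B$.

```python
from fractions import Fraction
from itertools import combinations


def total_variation(mu, nu):
    """||mu - nu|| = m^+(Omega) + m^-(Omega) for m = mu - nu on a finite set."""
    points = set(mu) | set(nu)
    return sum(abs(mu.get(k, 0) - nu.get(k, 0)) for k in points)


def sup_formula(mu, nu):
    """Brute-force sup over B of |mu(B) - nu(B)| + |mu(Omega\\B) - nu(Omega\\B)|."""
    omega = sorted(set(mu) | set(nu))

    def mass(m, subset):
        return sum(m.get(k, 0) for k in subset)

    best = 0
    for r in range(len(omega) + 1):
        for B in combinations(omega, r):
            rest = [k for k in omega if k not in B]
            value = (abs(mass(mu, B) - mass(nu, B))
                     + abs(mass(mu, rest) - mass(nu, rest)))
            best = max(best, value)
    return best


# Two example probability measures on {a, b, c, d} (illustrative data).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
nu = {k: Fraction(1, 4) for k in "abcd"}
```

The supremum is attained on the set where $\mu - \nu$ is positive, here $B = \{a, b\}$, which is the Hahn decomposition behind the formula.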
Let $\mu_n$ be probability measures defined on $\sigma$-fields $\mathcal{B}_n$ in spaces $X_n$. Put $X = \prod_{n=1}^{\infty} X_n$. Let $\mathcal{B} = \bigotimes_{n=1}^{\infty} \mathcal{B}_n$ be the $\sigma$-field generated by all sets of the form
$B = B_1 \times B_2 \times \cdots \times B_n \times X_{n+1} \times X_{n+2} \times \cdots,$
where $B_i \in \mathcal{B}_i$. Recall that the countable product $\bigotimes_{n=1}^{\infty} \mu_n$ is a probability measure $\mu$ on $\mathcal{B}$ (called a product-measure) defined by $\mu(B) = \mu_1(B_1)\cdots\mu_n(B_n)$ for the sets $B$ of the form above. It is readily seen that this set function is well-defined. A well-known theorem in measure theory states that $\mu$ is countably additive (which is not obvious) on the algebra generated by such sets, hence it uniquely extends to a measure on $\mathcal{B}$ denoted also by $\bigotimes_{n=1}^{\infty} \mu_n$ and called the product of the $\mu_n$'s. The construction of countable products enables one to define arbitrary products $\bigotimes_\alpha \mu_\alpha$ of probability measures on $\sigma$-fields $\mathcal{B}_\alpha$ in spaces $X_\alpha$. To this end, it suffices to note that the $\sigma$-field $\bigotimes_\alpha \mathcal{B}_\alpha$ generated by the sets $\prod_\alpha C_\alpha$, where $C_\alpha \in \mathcal{B}_\alpha$ and only finitely many of the $C_\alpha$'s differ from $X_\alpha$, consists of the sets of the form $E = C \times Y$, where $C \in \bigotimes_{n=1}^{\infty} \mathcal{B}_{\alpha_n}$ and $Y = \prod_{\beta \ne \alpha_n} X_\beta$. Hence we may put
$\bigotimes_\alpha \mu_\alpha(E) = \bigotimes_{n=1}^{\infty} \mu_{\alpha_n}(C).$
Let $\mu$ be a measure on a measurable space $(X, \mathcal{B})$, let $Y$ be a space with a $\sigma$-field $\mathcal{E}$, and let $f\colon X \to (Y, \mathcal{E})$ be a $\mu$-measurable mapping (i.e., $f^{-1}(\mathcal{E}) \subset \mathcal{B}_\mu$; such mappings are also called $(\mathcal{B}_\mu, \mathcal{E})$-measurable). Then the measure
$\mu\circ f^{-1}\colon A \mapsto \mu(f^{-1}(A))$
on $\mathcal{E}$ is called the image of the measure $\mu$ under the mapping $f$. A function $\varphi$ on $Y$ is integrable with respect to $\mu\circ f^{-1}$ precisely when the function $\varphi\circ f$ is $\mu$-integrable on $X$. In this case the following identity, called the change of variables formula, holds true:
$\int_Y \varphi(y)\,\mu\circ f^{-1}(dy) = \int_X \varphi(f(x))\,\mu(dx). \qquad \text{(A.3.1)}$
A.3.1. Definition. A family $\mathcal{F} \subset L^1(\mu)$ is called uniformly integrable if
$\lim_{C\to\infty}\ \sup_{f\in\mathcal{F}} \int_{\{|f| > C\}} |f(x)|\,\mu(dx) = 0.$
A sufficient condition for the uniform integrability is the estimate
$\sup_{f\in\mathcal{F}} \int |f(x)| \log|f(x)|\,\mu(dx) < \infty.$
Concerning the uniform integrability we refer the reader to Meyer's book [541, Chapter II, §2] and Shiryaev's book [697, Chapter II]. The next classical result is called the Vitali-Lebesgue theorem.
A.3.2. Theorem. Let $\{f_n\} \subset L^1(\mu)$ be a sequence convergent almost everywhere (or in measure) to a function $f$. If the sequence $\{f_n\}$ is uniformly integrable, then it converges to $f$ in the norm of $L^1(\mu)$.
Let $(X, \mathcal{M}, \mu)$ be a space with measure and let $\mathcal{A} \subset \mathcal{M}$ be one more $\sigma$-field. Recall that for every integrable function $f$ there exists a function $\mathbb{E}^{\mathcal{A}}f$ measurable with respect to $\mathcal{A}$ such that
$\int \mathbb{E}^{\mathcal{A}}f(x)\,g(x)\,\mu(dx) = \int f(x)\,g(x)\,\mu(dx)$
for every bounded function $g$ measurable with respect to $\mathcal{A}$. The function $\mathbb{E}^{\mathcal{A}}f$ is called the conditional expectation of $f$ with respect to $\mathcal{A}$.
A.3.3. Definition. Let $(X, \mathcal{M}, \mu)$ be a probability space, $T \subset \mathbb{R}^1$ and let $\mathcal{A}_t$, $t \in T$, be an increasing family of $\sigma$-fields contained in $\mathcal{M}$. The family $\{f_t\}_{t\in T}$ of $\mu$-integrable functions is called a martingale with respect to $\{\mathcal{A}_t\}$ if
$\mathbb{E}^{\mathcal{A}_s} f_t = f_s, \quad \forall\, t, s \in T,\ s \le t.$
If in the relationship above we replace the sign "$=$" by "$\ge$", then we get the definition of a submartingale.
A.3.4. Example. Let $f \in L^1(P)$, where $(\Omega, \mathcal{F}, P)$ is a probability space, and let $\{\mathcal{A}_t\}_{t\in T} \subset \mathcal{F}$ be an increasing family of $\sigma$-fields. Then the family $\{\mathbb{E}^{\mathcal{A}_t} f\}_{t\in T}$ is a martingale with respect to $\{\mathcal{A}_t\}_{t\in T}$.
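The martingale of conditional expectations can be illustrated on a finite probability space. The sketch below makes several assumptions that are not in the text: $\Omega = \{0,\dots,7\}$ with the uniform measure, the $\sigma$-field $\mathcal{A}_n$ is generated by consecutive blocks of $8/2^n$ points, and a function on $\Omega$ is stored as a list of its eight values.

```python
from fractions import Fraction


def cond_exp(f, block):
    """Conditional expectation w.r.t. the sigma-field generated by consecutive
    blocks of `block` points (uniform measure): average f over each block."""
    out = []
    for start in range(0, len(f), block):
        avg = sum(f[start:start + block], Fraction(0)) / block
        out.extend([avg] * block)
    return out


# An arbitrary integrable function on Omega = {0, ..., 7} (illustrative data).
f = [Fraction(v) for v in (3, 1, 4, 1, 5, 9, 2, 6)]

# f_n = E^{A_n} f for the increasing sigma-fields A_0, ..., A_3.
martingale = [cond_exp(f, 8 >> n) for n in range(4)]
```

Averaging `martingale[n + 1]` over the coarser blocks of size `8 >> n` returns `martingale[n]`, which is the martingale identity of Definition A.3.3 in this discrete setting.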
The following two results obtained by Doob (the last statement in the next theorem is due to P. Lévy) play an important role in probability theory. Their proofs can be found in [541, Ch. V, §3] or [697, Ch. VII, §3].
A.3.5. Theorem. Let $\{f_n\}$ be a martingale on a probability space $(X, \mathcal{M}, P)$ with respect to an increasing sequence of $\sigma$-fields $\{\mathcal{A}_n\}$ in $\mathcal{M}$. If the family $\{f_n\}$ is uniformly integrable, then there exists a function $f \in L^1(P)$, measurable with respect to the $\sigma$-field $\mathcal{A}_\infty$ generated by $\{\mathcal{A}_n\}$, such that $\mathbb{E}^{\mathcal{A}_n} f = f_n$ for all $n$. In addition, $f_n \to f$ almost everywhere and in $L^1(P)$. If $f \in L^r(P)$, where $r > 1$, then there is the convergence in $L^r(P)$ as well. Conversely, if $f \in L^1(P)$ is measurable with respect to $\mathcal{A}_\infty$, then $\{\mathbb{E}^{\mathcal{A}_n} f\}$ is a uniformly integrable martingale convergent to $f$ a.e. and in $L^1(P)$.
The next result is Doob's inequality.
A.3.6. Theorem. Let $\{f_n\}$ be a submartingale with respect to an increasing sequence of $\sigma$-fields such that the functions $f_n$ are nonnegative and $\sup_n \|f_n\|_{L^2(P)} < \infty$ [...]
[...] $\varepsilon > 0$, there exists a compact set $K \subset B$ such that $\mu(B\setminus K) < \varepsilon$. A measure $\mu$ is called tight if condition (iii) is satisfied for $B = X$.
A.3.11. Theorem. Every Borel measure on any complete separable metric space is Radon.
If $\mu$ is a Borel (e.g., Radon) measure on a topological space $X$, then by $\mu$-measurable sets we always mean the elements of $\mathcal{B}(X)_\mu$ (the Lebesgue completion of $\mathcal{B}(X)$ with respect to $\mu$). Any Radon measure on a locally convex space $X$ is uniquely determined by its values on $\mathcal{E}(X)$.
A.3.12. Proposition. Suppose that $\mu$ is a Radon measure on a locally convex space $X$. Then, for every $\mu$-measurable set $A$, there is a set $B \in \mathcal{E}(X)$ such that $\mu(A \,\triangle\, B) = 0$. Moreover, if $G \subset X^*$ is an arbitrary linear subspace separating the points in $X$, then such a set $B$ can be chosen in $\mathcal{E}(X, G)$.
PROOF. Let $\varepsilon > 0$. Let us find compact sets $K \subset A$ and $S \subset X$ such that $\mu(K) > \mu(A) - \varepsilon$ and $\mu(S) > \mu(X) - \varepsilon$. We may assume that $K \subset S$. There exists an open set $U \supset K$ with $\mu(U) < \mu(K) + \varepsilon$. Since on the compact set $S$ the initial topology coincides with the weak topology, there is a set $V$ open in the weak topology such that $V \cap S = U \cap S$. By the compactness of $K$ one can find a set $W$ which is a finite union of open cylindrical sets such that $K \subset W \subset V$. Then we have $W \in \mathcal{E}(X)$ and
$\mu(W \,\triangle\, A) \le \mu(W\setminus K) + \varepsilon \le \mu(V\setminus K) + \varepsilon < 3\varepsilon,$
whence our claim. The same proof works for $G$ replacing $X^*$, since the topology $\sigma(X, G)$ coincides with $\sigma(X, X^*)$ on $S$.
A.3.13. Corollary. Let $\mu$ be a Radon measure on a locally convex space $X$. Then the collection of all bounded cylindrical functions on $X$ is dense in $L^p(\mu)$ for every $p \in [1, \infty)$. In addition, the linear space $T$ generated by the functions of the form $\exp(if)$, $f \in X^*$, is dense in the complex spaces $L^p(\mu)$. Moreover, both claims remain valid if $X^*$ is replaced by any linear subspace $G \subset X^*$ separating the points in the space $X$.
A.3.14. Definition. Let $\mu$ be a Borel measure on a topological space $X$. A closed set $S_\mu \subset X$ is said to be the topological support of $\mu$ if $\mu(X\setminus S_\mu) = 0$ and there is no smaller closed set with this property.
Every Radon measure has the topological support (see Problem A.3.35). In measure theory an important role is played by Souslin sets defined as the images of complete separable metric spaces under continuous mappings to Hausdorff topological spaces. Hausdorff topological spaces that are continuous images of complete separable metric spaces are called Souslin spaces. Non-Borel sets of this kind were discovered by M. Ya. Souslin. For example, the orthogonal projection of a Borel set in $\mathbb{R}^2$ to $\mathbb{R}^1$ may fail to be Borel, but it is a Souslin set. It is known (see [185]) that there exist an infinitely differentiable function $f\colon \mathbb{R}^1 \to \mathbb{R}^1$ and a Borel set $B \subset \mathbb{R}^1$ such that $f(B)$ is not Borel. N. N. Lusin established the measurability of Souslin sets. In the next theorem we have collected the most important properties of Souslin sets frequently used in measure theory. Their proofs can be found in [674, p. 124, Chapter II, Corollary 1; p. 95, Theorem 2; p. 103, Corollary 3; p. 107, Corollary 16] or in [344].
A.3.15. Theorem. Suppose that $X$ and $Y$ are Hausdorff topological spaces and $f\colon X \to Y$ is a mapping. Then:
(i) Every Souslin set in $Y$ is measurable with respect to every Radon measure on $Y$.
(ii) If $X$ is a complete separable metric space and $f$ is continuous, then $f(B)$ is a Souslin set in $Y$ for every Borel set $B \subset X$. If, in addition, $f$ is injective, then $f(B)$ is Borel in $Y$.
(iii) If $X$ and $Y$ are Souslin spaces and $f$ is a Borel mapping, then the images and preimages of Souslin sets are Souslin. If $f$ is injective, then $f(B)$ is Borel in $Y$ for every Borel set $B$ in $X$.
It is known that every Borel measure on any Souslin space is Radon (see [674, p. 122]). On every Souslin space $X$, there exists a countable set of continuous functions separating the points. Therefore, all compact sets in Souslin spaces are metrizable. Hence every Borel measure on any Souslin space is concentrated on a countable union of metrizable compact sets. Note also that all Borel subsets of Souslin spaces are Souslin. It is worth mentioning that a space which is a countable union of its Souslin subspaces is Souslin itself. However, the complement of a Souslin set may fail to be Souslin; moreover, if the complement of a Souslin set is Souslin, then this set is Borel (see [674, Corollary 1, p. 101]). A Souslin space may be nonmetrizable (for example, the space $l^2$ with the weak topology). However, sequentially closed sets in Souslin spaces are Borel (see [674, Corollary 1, p. 109]). In particular, any sequentially continuous function on a Souslin space is Borel. A proof of the following result can be found in [674, Lemma 18, p. 108].
A.3.16. Proposition. Let $X$ be a Souslin space. Then $\mathcal{B}(X)$ is generated by some countable family of sets. In addition, $\mathcal{B}(X) = \mathcal{E}(X, \{f_n\})$ for every sequence of Borel functions $f_n$ separating the points in $X$. Finally, if $X$ is a Souslin locally convex space, then such a sequence can be chosen in any set $G \subset X^*$ separating the points in $X$.
A very important object connected with a measure on a locally convex space is its Fourier transform.
A.3.17. Definition. Let $X$ be a locally convex space and let $\mu$ be a measure on $\mathcal{E}(X)$. The Fourier transform $\widetilde{\mu}$ of the measure $\mu$ is defined by the formula

$$\widetilde{\mu}\colon X^* \to \mathbb{C}, \qquad \widetilde{\mu}(f) = \int_X \exp\bigl(i f(x)\bigr)\, \mu(dx). \tag{A.3.2}$$

A.3.18. Proposition. Any two measures on $\mathcal{E}(X)$ with equal Fourier transforms coincide.
According to Corollary A.3.13, any two Radon measures with equal Fourier transforms are equal. The following theorem may be useful for constructing Radon measures.

A.3.19. Theorem. Let $X$ be a locally convex space, let $G$ be a linear subspace in $X^*$ separating the points in $X$, and let $\mu$ be an additive nonnegative function on the algebra $\mathcal{R}_G$ of all cylindrical sets generated by $G$. Suppose that $\mu$ has the following property: for any $\varepsilon > 0$, there exists a compact set $K_\varepsilon \subset X$ such that $\mu(C) < \varepsilon$ for every cylindrical set $C \in \mathcal{R}_G$ which is disjoint from $K_\varepsilon$. Then $\mu$ uniquely extends to a Radon measure on $X$.

Let $X$ be a locally convex space and let $\mu$ and $\nu$ be two measures on $\mathcal{E}(X)$. Then the measure $\mu \otimes \nu$ is defined on $\mathcal{E}(X \times X)$ (note that $\mathcal{E}(X \times X) = \mathcal{E}(X) \otimes \mathcal{E}(X)$, which can be easily deduced from the equality $(X \times X)^* = X^* \times X^*$). The image of this measure under the mapping $X \times X \to X$, $(x, y) \mapsto x + y$, is called the convolution of the measures $\mu$ and $\nu$ and is denoted by $\mu * \nu$. It is easily verified that, letting $\lambda = \mu * \nu$, one has $\widetilde{\lambda} = \widetilde{\mu}\,\widetilde{\nu}$. With the aid of Theorem A.3.19 one readily proves that the product
of Radon measures $\mu_i$, $i = 1, \dots, n$, on locally convex spaces $X_i$ is uniquely extended to a Radon measure $\mu$ on $X_1 \times \dots \times X_n$. More generally, if $\mu_n$ are Radon probability measures on locally convex spaces $X_n$, then the product-measure $\bigotimes_{n=1}^{\infty} \mu_n$ extends uniquely to a Radon measure on $X = \prod_{n=1}^{\infty} X_n$. By the product of Radon measures we always mean the result of this extension. Certainly, for reasonable spaces (e.g., separable metric or Souslin), there is no need to consider extensions, since the product-measure is defined on the Borel $\sigma$-field of the product space from the very beginning. It is known (see [800, p. 60, Ch. I, Theorem 4.1]) that if $\mu$ is the aforementioned product of two Radon measures $\mu_1$ and $\mu_2$, then for every $B \in \mathcal{B}(X_1 \times X_2)$, the function $x_2 \mapsto \mu_1(B_{x_2})$, where $B_{x_2} = \{x_1 \in X_1 \colon (x_1, x_2) \in B\}$, is Borel and

$$\mu(B) = \int_{X_2} \mu_1(B_{x_2})\, \mu_2(dx_2).$$

In particular, if $\mu$ and $\nu$ are Radon measures on a locally convex space $X$, then their convolution $\mu * \nu$ extends uniquely to a Radon measure (again denoted by $\mu * \nu$). By the convolution of Radon measures we always mean the result of this extension. In this case, according to [800, p. 64, Ch. I, Proposition 4.4], for every $B \in \mathcal{B}(X)$, the function $x \mapsto \mu(B - x)$ is Borel and the following equality holds true:

$$\mu * \nu(B) = \int_X \mu(B - x)\, \nu(dx).$$
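The convolution formula and the multiplicativity of the Fourier transform can be checked by hand in the simplest situation. The following is a minimal sketch, assuming $X = \mathbb{R}$ and measures with finitely many atoms stored as dictionaries; the names `convolve`, `fourier` and `measure_of` are illustrative and do not come from the text.

```python
import cmath
from collections import defaultdict

def convolve(mu, nu):
    """Image of the product measure under (x, y) -> x + y (finite atoms)."""
    lam = defaultdict(float)
    for x, p in mu.items():
        for y, q in nu.items():
            lam[x + y] += p * q
    return dict(lam)

def fourier(mu, t):
    """Fourier transform at the functional x -> t * x."""
    return sum(w * cmath.exp(1j * t * x) for x, w in mu.items())

def measure_of(mu, B):
    """Measure of a finite set B of points."""
    return sum(w for x, w in mu.items() if x in B)

# Two illustrative probability measures on the integers (assumed data).
mu = {0: 0.5, 1: 0.5}
nu = {0: 0.25, 2: 0.75}
lam = convolve(mu, nu)

# Left and right sides of  (mu * nu)(B) = integral of mu(B - x) nu(dx).
B = {1, 2}
lhs = measure_of(lam, B)
rhs = sum(q * measure_of(mu, {b - x for b in B}) for x, q in nu.items())
```

Here `lhs` and `rhs` agree, and `fourier(lam, t)` equals `fourier(mu, t) * fourier(nu, t)` up to rounding for any real `t`. Integer atoms are used only so that the sums of points are exact dictionary keys.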
Pettis integral

Let $f\colon (X, \mathcal{B}(X)_\mu) \to (X, \mathcal{E}(X))$ be a measurable mapping on a locally convex space $X$ with a Radon measure $\mu$. The element $m \in X$ is called the Pettis integral of the mapping $f$ if for every $l$ from $X^*$ the function $l(f)$ is integrable with respect to $\mu$ and its integral equals $l(m)$. Put $\int_X f(x)\, \mu(dx) := m$.

A.3.20. Lemma. Let $\mu$ be a Radon probability measure on a sequentially complete locally convex space $X$ concentrated on a metrizable compact set $K$. Then any sequentially continuous linear mapping $A\colon X \to X$ has Pettis integral which is an element of the closed convex hull of the compact set $A(K)$.

PROOF. According to Problem A.3.36, there exists a sequence of probability measures $\mu_n$ with finite supports in $K$ that converges weakly to the measure $\mu$. For the measures $\mu_n$, the Pettis integrals $I_n := \int_K Ax\, \mu_n(dx)$ obviously exist and are elements of the convex hull $Q$ of the compact set $A(K)$. Since for any $l \in X^*$ the function $l \circ A$ is continuous on the metrizable compact set $K$ (being sequentially continuous), by construction, the sequence $l(I_n)$ converges to $\int_K l(Ax)\, \mu(dx)$. Hence the sequence $\{I_n\}$ is a Cauchy sequence in the weak topology. Note that if $X$ is complete, then the closure of $Q$ is compact, and the initial topology coincides with the weak one on this closure. Hence $\{I_n\}$ converges to some point $m \in X$. Clearly, $m$ is the Pettis integral of $A$.
Suppose now that $X$ is only sequentially complete. The compact set $A(K)$ is metrizable (as a continuous image of a metrizable compact space, see [231,
Theorem 4.4.15]). By virtue of Proposition A.1.7, the closure of $Q$ is a metrizable compact set as well. Therefore, we conclude again that the sequence $\{I_n\}$ converges to some point $m$, which is the Pettis integral of $A$. $\Box$
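For a measure with finite support, the Pettis integral of a linear mapping is simply a weighted sum of the images of the atoms, and so it lies in the convex hull of those images. The following is a minimal sketch, assuming $X = \mathbb{R}^2$, a linear mapping given by a matrix, and functionals given by coordinate pairings; the atoms, the matrix and the helper names are illustrative.

```python
def apply(A, x):
    """Apply the matrix A (tuple of rows) to the vector x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def pair(l, x):
    """Value l(x) of the linear functional with coefficient vector l."""
    return sum(li * xi for li, xi in zip(l, x))

def pettis_integral(A, atoms):
    """m = sum_i w_i * A x_i for a measure with atoms (x_i, w_i)."""
    m = [0.0] * len(A)
    for x, w in atoms:
        Ax = apply(A, x)
        for k in range(len(m)):
            m[k] += w * Ax[k]
    return tuple(m)

# Illustrative probability measure with three atoms and a linear map A.
atoms = [((1.0, 0.0), 0.2), ((0.0, 1.0), 0.3), ((-1.0, 2.0), 0.5)]
A = ((2.0, 1.0), (0.0, -1.0))
m = pettis_integral(A, atoms)

# Defining property: l(m) equals the integral of l(Ax) for every functional l.
functionals = [(1.0, 0.0), (0.0, 1.0), (3.0, -2.0)]
gaps = [pair(l, m) - sum(w * pair(l, apply(A, x)) for x, w in atoms)
        for l in functionals]
```

Every entry of `gaps` vanishes up to rounding. Since the weights are nonnegative and sum to one, `m` is a convex combination of the points `apply(A, x)`, in line with the lemma.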
If $X$ is a separable Banach space and $(\Omega, \mu)$ is a space with measure, then for measurable mappings $f\colon \Omega \to X$ satisfying the condition $\|f\|_X \in L^1(\mu)$, the notion of the Bochner integral is defined by analogy with the Lebesgue integral for scalar functions (see [214, Ch. III]). In this case the Pettis integral exists as well and coincides with the Bochner one. Let us denote by $L^p(\mu, X)$ the Banach space of all $\mu$-measurable $X$-valued mappings $f$ such that

$$\|f\|_{L^p(\mu, X)} := \Bigl\{ \int_\Omega \|f(x)\|_X^p\, \mu(dx) \Bigr\}^{1/p} < \infty.$$

This notation is also used in the case when $X$ is a normed space, but then it is additionally required that $f$ be Bochner integrable.
Random vectors

Let $(\Omega, \mathcal{F}, P)$ be a probability space, $X$ a locally convex space. The symbol $\mathbb{E}$ is used to denote the expectation of a random variable $\xi$ on $\Omega$ (i.e., $\mathbb{E}\xi$ is the Lebesgue integral of the measurable function $\xi$). A measurable mapping $\xi\colon \Omega \to (X, \mathcal{E}(X))$ is called a random vector in $X$. The measure $P_\xi(C) = P(\xi^{-1}(C))$ is called the distribution (or the law) of $\xi$. Clearly, every probability measure on $\mathcal{E}(X)$ can be obtained in such a form (with the identity mapping $\xi(x) = x$). If we have a family of probability measures $\mu_n$ on $X$, then there is a family of independent random vectors $\xi_n$ on one and the same probability space $\Omega$ such that $P_{\xi_n} = \mu_n$ (take $\Omega = \prod_{n=1}^{\infty} X_n$, $X_n = X$, $P = \bigotimes_{n=1}^{\infty} \mu_n$, $\xi_n(\omega) = \omega_n$).

A random process $(\xi_t)_{t \in T}$ is, by definition, a collection of random variables on a probability space $(\Omega, \mathcal{F}, P)$. In this case

$$C_{t_1, \dots, t_n, B} = \{\omega \colon (\xi_{t_1}(\omega), \dots, \xi_{t_n}(\omega)) \in B\} \in \mathcal{F} \tag{A.3.3}$$

for every Borel set $B \in \mathcal{B}(\mathbb{R}^n)$ and any $t_1, \dots, t_n \in T$. Therefore, we can define a measure on the algebra $\mathcal{R}(\mathbb{R}^T)$ of the cylindrical sets of the form (A.3.3) by the formula

$$\mu^\xi(C_{t_1, \dots, t_n, B}) = P\bigl((\xi_{t_1}, \dots, \xi_{t_n}) \in B\bigr).$$

This measure is automatically countably additive, hence it uniquely extends to a countably additive measure on $\mathcal{E}(\mathbb{R}^T)$ denoted by $\mu^\xi$ and called the distribution of the process $\xi$ in the functional space (or the measure generated by $\xi$). Conversely, any probability measure $\mu$ on $\mathcal{E}(\mathbb{R}^T)$ is the distribution of the random process $\xi_t(\omega) = \omega(t)$ if we take $\Omega = \mathbb{R}^T$ and $P = \mu$. Note that for any finite collection $t_1, \dots, t_n \in T$, the formula above defines a probability measure $P_{t_1, \dots, t_n}$ on $\mathbb{R}^n$ called the finite dimensional distribution of $\xi$. It is clear that if $\{s_1, \dots, s_k\} \subset \{t_1, \dots, t_n\}$, i.e., $s_i = t_{j_i}$, $i = 1, \dots, k$, then the image of $P_{t_1, \dots, t_n}$ under the natural projection $\mathbb{R}^n \to \mathbb{R}^k$, $(x_1, \dots, x_n) \mapsto (x_{j_1}, \dots, x_{j_k})$, coincides with $P_{s_1, \dots, s_k}$ (i.e., the projections are consistent). The following result is a celebrated theorem due to Kolmogorov [421] (see its proof also in [822, Ch. 5]).
A.3.21. Theorem. Suppose that for every finite set of points $t_1, \dots, t_n \in T$, a probability measure $P_{t_1, \dots, t_n}$ on $\mathbb{R}^n$ is given such that the aforementioned consistency property is satisfied. Then there exists a probability measure $P$ whose finite dimensional projections are exactly $P_{t_1, \dots, t_n}$.
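The consistency property is easy to test on a toy family of finite dimensional distributions. The following is a minimal sketch, assuming a process with independent coordinates taking values in $\{0, 1\} \subset \mathbb{R}$; the success probability `bias(t)` is a hypothetical choice made only for illustration.

```python
from itertools import product

def bias(t):
    """Hypothetical probability that the coordinate at time t equals 1."""
    return 1.0 / (1.0 + t)

def fdd(times):
    """Finite dimensional distribution on {0,1}^n, independent coordinates."""
    dist = {}
    for xs in product((0, 1), repeat=len(times)):
        p = 1.0
        for t, x in zip(times, xs):
            p *= bias(t) if x == 1 else 1.0 - bias(t)
        dist[xs] = p
    return dist

def project(dist, keep):
    """Image of dist under (x_1, ..., x_n) -> (x_j for j in keep)."""
    out = {}
    for xs, p in dist.items():
        ys = tuple(xs[j] for j in keep)
        out[ys] = out.get(ys, 0.0) + p
    return out

times = (0.5, 1.0, 2.0, 3.5)
keep = (0, 2)  # select s_1 = t_1 and s_2 = t_3 (zero-based positions)
projected = project(fdd(times), keep)
direct = fdd(tuple(times[j] for j in keep))
```

The dictionaries `projected` and `direct` agree up to rounding, which is the consistency required in the theorem for this particular family.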
Another result of Kolmogorov enables us to construct measures on the space $C[a, b]$ (its proof can be found in [822, Ch. 5]).

A.3.22. Theorem. Let $\xi_t$, $t \in [a, b]$, be a random process such that for some $\alpha > 1$, $C > 0$, and $\varepsilon > 0$, one has

$$\mathbb{E}|\xi_t - \xi_s|^\alpha \le C|t - s|^{1+\varepsilon}, \qquad \forall\, t, s \in [a, b].$$

Then there exists a random process $\eta_t$, $t \in [a, b]$, with continuous trajectories such that for each $t$ one has $\eta_t = \xi_t$ a.s. In particular, the process $\eta_t$ has the same finite dimensional distributions as $\xi_t$ (hence $\mu^\eta = \mu^\xi$). In addition, $(\mu^\xi)^*(C[a, b]) = 1$. Moreover, $(\mu^\xi)^*(H^\delta[a, b]) = 1$ for every $\delta \in (0, \varepsilon/\alpha)$, where $H^\delta[a, b]$ is the set of all functions satisfying the Hölder condition of order $\delta$.
Note that the same is true for the processes with values in separable Banach spaces (see [289, Ch. 3, §5]).
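As an illustration of how the moment condition is used, consider the standard example of a Wiener process $w_t$. This is a sketch resting on the well-known Gaussian moment formula, not a statement taken from this section; it shows which Hölder exponents the theorem yields.

```latex
% Worked example (assumption: w_t is a standard Wiener process on [a,b],
% so that w_t - w_s is centered Gaussian with variance |t - s|).
\[
  \mathbb{E}\,|w_t - w_s|^{2k} = (2k-1)!!\,|t-s|^{k}, \qquad k = 1, 2, \dots
\]
% For k >= 2 the hypothesis of the theorem holds with
\[
  \alpha = 2k, \qquad C = (2k-1)!!, \qquad \varepsilon = k - 1 > 0,
\]
% so the Hoelder exponents covered are
\[
  0 < \delta < \frac{\varepsilon}{\alpha} = \frac{1}{2} - \frac{1}{2k}.
\]
% For k = 2: E|w_t - w_s|^4 = 3|t-s|^2, which gives every delta < 1/4;
% letting k grow covers every delta < 1/2.
```

The case $k = 1$ gives only $\varepsilon = 0$ and is therefore not sufficient, which is why higher moments are needed.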
Problems

A.3.23. (i) Let $K$ be a metrizable compact space. Show that the space $C(K)$ with the sup-norm is separable. (ii) Show that the product $I^{\mathfrak{c}}$ of the continuum of segments is separable, but the Banach space $C(I^{\mathfrak{c}})$ is not.

A.3.24. Prove that if $X$ is a separable normed space, then there exists a countable family of continuous linear functionals separating the points in $X$, hence $X^*$ is separable in the $*$-weak topology. The converse is not true.

A.3.25. (i) Show that any reflexive Banach space is sequentially complete in the weak topology. (ii) Let $X$ be an infinite dimensional normed space. Show that $X$ is not complete in the weak topology (i.e., there exists a net which is Cauchy in the weak topology, but has no limit in the weak topology). In addition, the space $X^*$ is not complete in the $*$-weak topology.

A.3.26. Construct an example of a locally convex space $X$ such that there exists a sequentially continuous linear functional on $X$ that is not continuous.

A.3.27. Let $K$ be a convex compact set in a separable metrizable locally convex space $X$. Show that $K$ is the intersection of a sequence of closed half-spaces; if $K$ is absolutely convex, then $p_K$ has the form $p_K(x) = \sup_i l_i(x)$ for some sequence $(l_i) \subset X^*$.

A.3.28. Let $X$ be a Banach space. Prove that if a linear mapping $A\colon X \to X$ is continuous from the weak topology to the norm topology, then the range of $A$ is finite dimensional.

A.3.29. Show that there exists no continuous norm on the space $\mathbb{R}^\infty$ with its natural topology.

A.3.30. Let $X$ be a Hilbert space, let $Y$ be a locally convex space, and let $L \in \mathcal{L}(X, Y)$. (i) Show that $L$ takes the closed unit ball to a closed set. (ii) Show that if $K \in \mathcal{K}(X, Y)$, then $K$ maps the closed unit ball to a compact set.

A.3.31. (i) Give an example of two nonnegative nuclear operators on a Hilbert space which have dense ranges intersecting only at zero. (ii) Let $K$ be a compact operator on an infinite dimensional separable Hilbert space $H$. Prove that there exists a compact
operator $S$ such that its range is dense in $H$, but intersects the range of $K$ only at zero. (iii) Let $X$ be a Banach space and $A \in \mathcal{L}(X)$. Show that if the set $A(X)$ is dense and does not coincide with $X$, then there exists an operator $B \in \mathcal{L}(X)$ such that the set $B(X)$ is dense and intersects $A(X)$ only at zero. Hint: see [682].
A.3.32. Let $K$ be a compact set in a Hilbert space $H$. Show that $K$ is contained in a compact ellipsoid of the form $A(U)$, where $A$ is a symmetric compact operator on $H$ and $U$ is the unit ball of $H$. Hint: it suffices to consider separable $H$ with an orthonormal basis $\{e_n\}$; construct an increasing sequence of natural numbers $k(n)$, $n \in \mathbb{N}$, such that $\sum_{j \ge k(n)} (x, e_j)^2 < 2^{-n}$ for all $x \in K$ and $n \ge 1$. Let $Ae_j = a_j e_j$, where $a_j = 1$ if
ISBN 0-8218-1054-5