Exercises in Probability A Guided Tour from Measure Theory to Random Processes, via Conditioning Derived from extensive teaching experience in Paris, this book presents 100 exercises in probability. The exercises cover measure theory and probability, independence and conditioning, Gaussian variables, distributional computations, convergence of random variables and an introduction to random processes. For each exercise the authors have provided a detailed solution as well as references for preliminary and further reading. There are also many insightful notes that set the exercises in context. Students will find these exercises extremely useful for easing the transition between simple and complex probabilistic frameworks. Indeed, many of the exercises here will lead the student on to the frontier of research topics in probability. Along the way, attention is drawn to a number of traps into which students of probability often fall. This book is ideal for independent study or as the companion to a course in advanced probability theory.
CAMBRIDGE SERIES IN STATISTICAL AND PROBABILISTIC MATHEMATICS

Editorial Board
R. Gill (Department of Mathematics, Utrecht University)
B. D. Ripley (Department of Statistics, University of Oxford)
S. Ross (Department of Industrial Engineering, University of California, Berkeley)
M. Stein (Department of Statistics, University of Chicago)
D. Williams (Department of Mathematical Sciences, University of Wales, Swansea)

This series of high quality upper-division textbooks and expository monographs covers all aspects of stochastic applicable mathematics. The topics range from pure and applied statistics to probability theory, operations research, optimization and mathematical programming. The books contain clear presentations of new developments in the field and also of the state of the art in classical methods. While emphasizing rigorous treatment of theoretical methods, the books also contain applications and discussions of new techniques made possible by advances in computational practice.

Already Published
1. Bootstrap Methods and Their Application, by A. C. Davison and D. V. Hinkley
2. Markov Chains, by J. Norris
3. Asymptotic Statistics, by A. W. van der Vaart
4. Wavelet Methods for Time Series Analysis, by Donald B. Percival and Andrew T. Walden
5. Bayesian Methods: An Analysis for Statisticians and Interdisciplinary Researchers, by Thomas Leonard and John S. J. Hsu
6. Empirical Processes in M-Estimation, by Sara van de Geer
7. Numerical Methods of Statistics, by John F. Monahan
8. A User’s Guide to Measure Theoretic Probability, by David Pollard
9. The Estimation and Tracking of Frequency, by B. G. Quinn and E. J. Hannan
10. Data Analysis and Graphics Using R, by J. Maindonald and J. Braun
11. Statistical Models, by A. C. Davison
12. Semiparametric Regression, by D. Ruppert, M. P. Wand and R. J. Carroll
Exercises in Probability
A Guided Tour from Measure Theory to Random Processes, via Conditioning
L. Chaumont and M. Yor
Université Pierre et Marie Curie, Paris VI

Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge, United Kingdom

Cambridge University Press
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011–4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
http://www.cambridge.org

© Cambridge University Press 2003
This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2003 Third printing, with corrections, 2005 Printed in the United States of America Typeface Computer Modern 10/13 pt.
System LATEX 2ε [tb]
A catalogue record for this book is available from the British Library ISBN 0 521 82585 7 hardback
To Paul-André Meyer, in memoriam
Contents

Preface
Some frequently used notations

1 Measure theory and probability
1.1 Sets which do not belong in a strong sense, to a σ-field
1.2 Some criteria for uniform integrability
1.3 When does weak convergence imply the convergence of expectations?
1.4 Conditional expectation and the Monotone Class Theorem
1.5 $L^p$-convergence of conditional expectations
1.6 Measure preserving transformations
1.7 Ergodic transformations
1.8 Invariant σ-fields
1.9 Extremal solutions of (general) moments problems
1.10 The log normal distribution is moments indeterminate
1.11 Conditional expectations and equality in law
1.12 Simplifiable random variables
1.13 Mellin transform and simplification
Solutions for Chapter 1

2 Independence and conditioning
2.1 Independence does not imply measurability with respect to an independent complement
2.2 Complement to Exercise 2.1: further statements of independence versus measurability
2.3 Independence and mutual absolute continuity
2.4 Size-biased sampling and conditional laws
2.5 Think twice before exchanging the order of taking the supremum and intersection of σ-fields!
2.6 Exchangeability and conditional independence: de Finetti’s theorem
2.7 Too much independence implies constancy
2.8 A double paradoxical inequality
2.9 Euler’s formula for primes and probability
2.10 The probability, for integers, of being relatively prime
2.11 Bernoulli random walks considered at some stopping time
2.12 cosh, sinh, the Fourier transform and conditional independence
2.13 cosh, sinh, and the Laplace transform
2.14 Conditioning and changes of probabilities
2.15 Radon–Nikodym density and the Acceptance–Rejection Method of von Neumann
2.16 Negligible sets and conditioning
2.17 Gamma laws and conditioning
2.18 Random variables with independent fractional and integer parts
Solutions for Chapter 2

3 Gaussian variables
3.1 Constructing Gaussian variables from, but not belonging to, a Gaussian space
3.2 A complement to Exercise 3.1
3.3 On the negative moments of norms of Gaussian vectors
3.4 Quadratic functionals of Gaussian vectors and continued fractions
3.5 Orthogonal but non-independent Gaussian variables
3.6 Isotropy property of multidimensional Gaussian laws
3.7 The Gaussian distribution and matrix transposition
3.8 A law whose n-samples are preserved by every orthogonal transformation is Gaussian
3.9 Non-canonical representation of Gaussian random walks
3.10 Concentration inequality for Gaussian vectors
3.11 Determining a jointly Gaussian distribution from its conditional marginals
Solutions for Chapter 3

4 Distributional computations
4.1 Hermite polynomials and Gaussian variables
4.2 The beta–gamma algebra and Poincaré’s Lemma
4.3 An identity in law between reciprocals of gamma variables
4.4 The Gamma process and its associated Dirichlet processes
4.5 Gamma variables and Gauss multiplication formulae
4.6 The beta–gamma algebra and convergence in law
4.7 Beta–gamma variables and changes of probability measures
4.8 Exponential variables and powers of Gaussian variables
4.9 Mixtures of exponential distributions
4.10 Some computations related to the lack of memory property of the exponential law
4.11 Some identities in law between Gaussian and exponential variables
4.12 Some functions which preserve the Cauchy law
4.13 Uniform laws on the circle
4.14 Trigonometric formulae and probability
4.15 A multidimensional version of the Cauchy distribution
4.16 Some properties of the Gauss transform
4.17 Unilateral stable distributions (1)
4.18 Unilateral stable distributions (2)
4.19 Unilateral stable distributions (3)
4.20 A probabilistic translation of Selberg’s integral formulae
4.21 Mellin and Stieltjes transforms of stable variables
4.22 Solving certain moment problems via simplification
Solutions for Chapter 4

5 Convergence of random variables
5.1 Convergence of sum of squares of independent Gaussian variables
5.2 Convergence of moments and convergence in law
5.3 Borel test functions and convergence in law
5.4 Convergence in law of the normalized maximum of Cauchy variables
5.5 Large deviations for the maximum of Gaussian vectors
5.6 A logarithmic normalization
5.7 A $\sqrt{n \log n}$ normalization
5.8 The Central Limit Theorem involves convergence in law, not in probability
5.9 Changes of probabilities and the Central Limit Theorem
5.10 Convergence in law of stable(μ) variables, as μ → 0
5.11 Finite dimensional convergence in law towards Brownian motion
5.12 The empirical process and the Brownian bridge
5.13 The Poisson process and Brownian motion
5.14 Brownian bridges converging in law to Brownian motions
5.15 An almost sure convergence result for sums of stable random variables
Solutions for Chapter 5

6 Random processes
6.1 Solving a particular SDE
6.2 The range process of Brownian motion
6.3 Symmetric Lévy processes reflected at their minimum and maximum; E. Csáki’s formulae for the ratio of Brownian extremes
6.4 A toy example for Westwater’s renormalization
6.5 Some asymptotic laws of planar Brownian motion
6.6 Windings of the three-dimensional Brownian motion around a line
6.7 Cyclic exchangeability property and uniform law related to the Brownian bridge
6.8 Local time and hitting time distributions for the Brownian bridge
6.9 Partial absolute continuity of the Brownian bridge distribution with respect to the Brownian distribution
6.10 A Brownian interpretation of the duplication formula for the gamma function
6.11 Some deterministic time-changes of Brownian motion
6.12 Random scaling of the Brownian bridge
6.13 Time-inversion and quadratic functionals of Brownian motion; Lévy’s stochastic area formula
6.14 Quadratic variation and local time of semimartingales
6.15 Geometric Brownian motion
6.16 0-self similar processes and conditional expectation
6.17 A Taylor formula for semimartingales; Markov martingales and iterated infinitesimal generators
6.18 A remark of D. Williams: the optional stopping theorem may hold for certain “non-stopping times”
6.19 Stochastic affine processes, also known as “Harnesses”
6.20 A martingale “in the mean over time” is a martingale
6.21 A reinforcement of Exercise 6.20
Solutions for Chapter 6

Where is the notion N discussed?
Final suggestions: how to go further?
References
Index
Preface

Originally, the main body of these exercises was developed for, and presented to, the students in the Magistère des Universités Parisiennes between 1984 and 1990; the audience consisted mainly of students from the Écoles Normales, and the spirit of the Magistère was to blend “undergraduate probability” ($\stackrel{?}{=}$ random variables, their distributions, and so on ...) with a first approach to “graduate probability” ($\stackrel{?}{=}$ random processes). Later, we also used these exercises, and added some more, either in the Préparation à l’Agrégation de Mathématiques, or in more standard Master courses in Probability.

In order to fit the exercises (related to the lectures) in with the two levels alluded to above, we systematically tried to strip a number of results (which had recently been published in research journals) of their random processes apparatus, and to exhibit, in the form of exercises, their random variables skeleton. Of course, this kind of reduction may be done in almost every branch of mathematics, but it seems to be a quite natural activity in probability theory, where a random phenomenon may be either studied on its own (in a “small” probability world), or as a part of a more complete phenomenon (taking place in a “big” probability world); to give an example, the classical central limit theorem, in which only one Gaussian variable (or distribution) occurs in the limit, appears, in a number of studies, as a one-dimensional “projection” of a central limit theorem involving processes, in which the limits may be several Brownian motions, the former Gaussian variable appearing now as the value at time 1, say, of one of these Brownian motions.

This being said, the aim of these exercises was, and still is, to help a student with a good background in measure theory, say, but starting to learn probability theory, to master the main concepts in basic (?) probability theory, in order that, when reaching the next level in probability, i.e. graduate studies (so called, in France: Diplôme d’Études Approfondies), he/she would be able to recognize, and put aside, difficulties which, in fact, belong to the “undergraduate world”, in order to concentrate better on the “graduate world” (of course, this is nonsense, but some analysis of the level of a given difficulty is always helpful...).

Among the main basic concepts alluded to above, we should no doubt list the notions of independence, and conditioning (Chapter 2) and the various modes of convergence of random variables (Chapter 5). It seemed logical to start with a short Chapter 1 where measure theory is deeply mixed with the probabilistic aspects. Chapter 3 is entirely devoted to some exercises on Gaussian variables: of course, no one teaching or studying probability will be astonished, but we have always been struck, over the years, by the number of mistakes which Gaussian type computations seem to lead many students to. A number of exercises about various distributional computations, with some emphasis on beta and gamma distributions, as well as stable laws, are gathered in Chapter 4, and finally, perhaps as an eye opener, a few exercises involving random processes are found in Chapter 6, where, as an exception, we felt freer to refer to more advanced concepts.

However, the different chapters are not autonomous, as it is not so easy – and it would be quite artificial – to separate strictly the different notions, e.g. convergence, particular laws, conditioning, and so on.... Nonetheless, each chapter focusses mainly around the topic indicated in its title.

As often as possible, some comments and references are given after an exercise; both aim at guiding the reader’s attention towards the “bigger picture” mentioned above; furthermore, each chapter begins with a “minimal” presentation, which may help the reader to understand the global “philosophy” of this chapter, and/or some of the main tools necessary to solve the exercises there. But, for a more complete collection of important theorems and results, we refer the reader to the list of textbooks in probability – perhaps slightly slanted towards books available in France! – which is found at the end of the volume. Appended to this list, we have indicated on one page some (usually, three) among these references where the notion N is treated; we tried to vary these sources of references.

A good proportion of the exercises may seem, at first reading, “hard”, but we hope the solutions – not to be read too quickly before attempting seriously to solve the exercises! – will help; we tried to give almost every ε–δ needed! We have indicated with one star * exercises which are of standard difficulty, and with two stars ** the more challenging ones. We have given references, as much as we could, to related exercises in the literature. Internal references from one exercise to another should be eased by our marking in bold face of the corresponding numbers of these exercises in the Comments and references, Hint, and so on ...

Our thanks go to Dan Romik, Koichiro Takaoka, and at a later stage, Alexander Cherny, Jan Obloj, Adam Osekowski, for their many comments and suggestions for improvements. We are also grateful to K. Ishiyama who provided us with the picture featured on the cover of our book which represents the graph of densities of the time average of geometric Brownian motion, see Exercise 6.15 for the corresponding discussion.

As a final word, let us stress that we do not view this set of exercises as being “the” good companion to a course in probability theory (the reader may also use the books of exercises referred to in our bibliography), but rather we have tried to present some perhaps not so classical aspects....

Paris and Berkeley, August 2003
Some frequently used notations

a.e.: almost everywhere.
a.s.: almost surely.
r.v.: random variable.
i.i.d.: independent and identically distributed (r.v.s).
Question x of Exercise a.b: Our exercises are divided into questions, to which we may refer in different places to compare some results.
*Exercise: Exercise of standard difficulty.
**Exercise: Challenging exercise.
$P|_{\mathcal{A}} \ll Q|_{\mathcal{A}}$: $P$ is absolutely continuous with respect to $Q$, when both probabilities are considered on the σ-field $\mathcal{A}$. When the choice of $\mathcal{A}$ is obvious, we write only $P \ll Q$.
$\frac{dP}{dQ}\big|_{\mathcal{A}}$: denotes the Radon–Nikodym density of $P$ with respect to $Q$, on the σ-field $\mathcal{A}$, assuming $P|_{\mathcal{A}} \ll Q|_{\mathcal{A}}$; again, $\mathcal{A}$ is suppressed if there is no risk of confusion.
$P \otimes Q$: denotes the tensor product of the two probabilities $P$ and $Q$.
$X(P)$: denotes the image of the probability $P$ by the r.v. $X$.
$\nu_n \xrightarrow{w} \nu$: indicates that the sequence of positive measures $(\nu_n)$ on $\mathbb{R}$ (or $\mathbb{R}^n$) converges weakly towards $\nu$.
An n-sample $X_n$ of the r.v. $X$: denotes an n-dimensional r.v. $(X_1, \ldots, X_n)$, whose components are i.i.d., distributed as $X$.
$\varepsilon$: Bernoulli (two valued) r.v.
$N$ or $G$: Standard centred Gaussian variable, with variance 1: $P(N \in dx) = e^{-x^2/2}\, \frac{dx}{\sqrt{2\pi}}$, $(x \in \mathbb{R})$.
$T$: Standard stable(1/2) $\mathbb{R}_+$-valued variable: $P(T \in dt) = \frac{dt}{\sqrt{2\pi t^3}} \exp\left(-\frac{1}{2t}\right)$, $(t > 0)$.
$Z$: Standard exponential variable: $P(Z \in dt) = e^{-t}\, dt$, $(t > 0)$.
$Z_a$ $(a > 0)$: Standard gamma($a$) variable: $P(Z_a \in dt) = t^{a-1} e^{-t}\, \frac{dt}{\Gamma(a)}$, $(t > 0)$.
$Z_{a,b}$ $(a, b > 0)$: Standard beta($a, b$) variable: $P(Z_{a,b} \in dt) = t^{a-1} (1-t)^{b-1}\, \frac{dt}{\beta(a,b)}$, $(t \in (0, 1))$.
It may happen that, for convenience, we use some different notation for these classical variables.
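As a quick sanity check on the densities listed above, the following sketch (not part of the original text) integrates them numerically with a plain midpoint rule. The function names, integration ranges, and parameter values $a = 2.5$ and $(a, b) = (2, 3)$ are illustrative assumptions. It also compares $P(T \le 1)$ with $P(1/N^2 \le 1)$, using the classical fact that the stable(1/2) density above is that of $1/N^2$.

```python
import math

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def gaussian_density(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def stable_half_density(t):
    return math.exp(-1 / (2 * t)) / math.sqrt(2 * math.pi * t ** 3)

def gamma_density(a):
    return lambda t: t ** (a - 1) * math.exp(-t) / math.gamma(a)

def beta_density(a, b):
    # beta(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)
    beta_ab = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return lambda t: t ** (a - 1) * (1 - t) ** (b - 1) / beta_ab

# Total masses, truncated to ranges that carry almost all of the mass.
mass_gaussian = integrate(gaussian_density, -10.0, 10.0)
mass_gamma = integrate(gamma_density(2.5), 0.0, 60.0)
mass_beta = integrate(beta_density(2.0, 3.0), 0.0, 1.0)

# P(T <= 1) versus P(1/N^2 <= 1) = P(|N| >= 1) = erfc(1/sqrt(2)).
cdf_stable_at_1 = integrate(stable_half_density, 0.0, 1.0)
cdf_inverse_square_at_1 = math.erfc(1 / math.sqrt(2))

print(mass_gaussian, mass_gamma, mass_beta)
print(cdf_stable_at_1, cdf_inverse_square_at_1)
```

The total mass of the stable(1/2) density is not checked directly, because its tail decays like $t^{-3/2}$ and a truncated integral would converge slowly; comparing distribution functions at one point avoids that issue.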
Chapter 1
Measure theory and probability

Aim and contents

This chapter contains a number of exercises, aimed at familiarizing the reader with some important measure theoretic concepts, such as: Monotone Class Theorem (Williams [63], II.3, II.4, II.13), uniform integrability (which is often needed when working with a family of probabilities, see Dellacherie and Meyer [16]), $L^p$ convergence (Jacod and Protter [29], Chapter 23), conditioning (this will be developed in a more probabilistic manner in the following chapters), absolute continuity (Fristedt and Gray [25], p. 118). We would like to emphasize the importance for every probabilist to stand on a “reasonably” solid measure theoretic (back)ground for which we recommend, e.g., Revuz [49]. Exercise 1.11 plays a unifying role, and highlights the fact that the operation of taking a conditional expectation is a contraction (in $L^2$, but also in every $L^p$) in a strong sense.
**
1.1 Sets which do not belong in a strong sense, to a σ-field

Let $(\Omega, \mathcal{F}, P)$ be a complete probability space. We consider two $(\mathcal{F}, P)$ complete sub-σ-fields of $\mathcal{F}$, $\mathcal{A}$ and $\mathcal{B}$, and a set $A \in \mathcal{A}$.

The aim of this exercise is to study the property:
$$0 < P(A \mid \mathcal{B}) < 1, \quad P \text{ a.s.} \tag{1.1.1}$$

1. Show that (1.1.1) holds if and only if there exists a probability $Q$, which is equivalent to $P$ on $\mathcal{F}$, and such that
(a) $0 < Q(A) < 1$, and
(b) $\mathcal{B}$ and $A$ are independent under $Q$.
Hint: If (1.1.1) holds, we may consider, for $0 < \alpha < 1$, the probability:
$$Q_\alpha = \left( \alpha \frac{1_A}{P(A \mid \mathcal{B})} + (1 - \alpha) \frac{1_{A^c}}{P(A^c \mid \mathcal{B})} \right) \cdot P.$$

2. Assume that (1.1.1) holds. Define $\mathcal{B}^A = \mathcal{B} \vee \sigma(A)$. Let $0 < \alpha < 1$, and $Q$ be a probability which satisfies (a) and (b) together with:
(c) $Q(A) = \alpha$, and
(d) $\frac{dQ}{dP}\big|_{\mathcal{F}}$ is $\mathcal{B}^A$-measurable.
Show then the existence of a $\mathcal{B}$-measurable r.v. $Z$, which is $> 0$, $P$ a.s., and such that:
$$E_P(Z) = 1, \quad \text{and} \quad Q = Z \left( \alpha \frac{1_A}{P(A \mid \mathcal{B})} + (1 - \alpha) \frac{1_{A^c}}{P(A^c \mid \mathcal{B})} \right) \cdot P.$$
Show that there exists a unique probability $\hat{Q}$ which satisfies (a) and (b), together with (c), (d) and (e), where:
(e) $\hat{Q}|_{\mathcal{B}} = P|_{\mathcal{B}}$.

3. We assume, in this and the two next questions, that $\mathcal{A} = \mathcal{B}^A$, but it is not assumed a priori that $A$ satisfies (1.1.1). Show then that $A' \in \mathcal{A}$ satisfies (1.1.1) iff the two following conditions are satisfied:
(f) there exists $B \in \mathcal{B}$ such that: $A' = (B \cap A) \cup (B^c \cap A^c)$, up to a negligible set, and
(g) $A$ satisfies (1.1.1).
Consequently, if $A$ does not satisfy (1.1.1), then there exists no set $A' \in \mathcal{A}$ which satisfies (1.1.1).

4. We assume, in this question and in the next one, that $\mathcal{A} = \mathcal{B}^A$, and that $A$ satisfies (1.1.1). Show that, if $\mathcal{B}$ is not $P$-trivial, then there exists a σ-field $\mathcal{A}'$ such that $\mathcal{B} \subsetneq \mathcal{A}' \subsetneq \mathcal{A}$, and that no set in $\mathcal{A}'$ satisfies (1.1.1).

5. (i) We further assume that, under $P$, $A$ is independent of $\mathcal{B}$, and that: $P(A) = \frac{1}{2}$. Show that $A' \in \mathcal{A}$ satisfies (1.1.1) iff $A'$ is $P$-independent of $\mathcal{B}$, and $P(A') = \frac{1}{2}$.
(ii) We now assume that, under $P$, $A$ is independent of $\mathcal{B}$, and that: $P(A) = \alpha$, with: $\alpha \notin \{0, \frac{1}{2}, 1\}$. Show that $A'$ (belonging to $\mathcal{A}$, and assumed to be non-trivial) is independent of $\mathcal{B}$ iff $A' = A$ or $A' = A^c$.
(iii) Finally, we only assume that $A$ satisfies (1.1.1). Show that, if $A'$ ($\in \mathcal{A}$) satisfies (1.1.1), then the equality $\mathcal{A} = \mathcal{B}^{A'}$ holds.

Comments and references: The hypothesis (1.1.1) made at the beginning of the exercise means that $A$ does not belong, in a strong sense, to $\mathcal{B}$. Such a property plays an important role in:
J. Azéma and M. Yor: Sur les zéros des martingales continues, Séminaire de Probabilités XXVI, Lecture Notes in Mathematics, 1526, 248–306, Springer 1992.

**
1.2 Some criteria for uniform integrability

Consider, on a probability space $(\Omega, \mathcal{A}, P)$, a set $H$ of r.v.s with values in $\mathbb{R}_+$, which is bounded in $L^1$, i.e. $\sup_{X \in H} E(X) < \infty$.

Recall that $H$ is said to be uniformly integrable if the following property holds:
$$\sup_{X \in H} \int_{(X > a)} X \, dP \xrightarrow[a \to \infty]{} 0. \tag{1.2.1}$$

To each variable $X \in H$ associate the positive, bounded measure $\nu_X$ defined by:
$$\nu_X(A) = \int_A X \, dP \quad (A \in \mathcal{A}).$$

Show that the property (1.2.1) is equivalent to each of the three following properties:
(i) the measures $(\nu_X, X \in H)$ are equi-absolutely continuous with respect to $P$, i.e. they satisfy the criterion:
$$\forall \varepsilon > 0, \ \exists \delta > 0, \ \forall A \in \mathcal{A}, \quad P(A) \le \delta \Rightarrow \sup_{X \in H} \nu_X(A) \le \varepsilon, \tag{1.2.2}$$
(ii) for any sequence $(A_n)$ of sets in $\mathcal{A}$, which decrease to $\emptyset$:
$$\lim_{n \to \infty} \sup_{X \in H} \nu_X(A_n) = 0, \tag{1.2.3}$$
(iii) for any sequence $(B_n)$ of disjoint sets of $\mathcal{A}$,
$$\lim_{n \to \infty} \sup_{X \in H} \nu_X(B_n) = 0. \tag{1.2.4}$$
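To make the tail quantity in (1.2.1) concrete, here is a small sketch that is not part of the original exercise. It evaluates $\sup_n E[X_n 1_{(X_n > a)}]$ exactly for two illustrative two-point families, both bounded in $L^1$: $X_n = n 1_{(U < 1/n)}$, which is not uniformly integrable, and $Y_n = n 1_{(U < 1/n^2)}$, which is bounded in $L^2$ and uniformly integrable. The families, the helper names and the truncation of the supremum to $n \le$ `n_max` are assumptions made for the illustration.

```python
from fractions import Fraction

def tail_mass(law, a):
    """E[X 1{X > a}] for a finite law given as (value, probability) pairs."""
    return sum(value * prob for value, prob in law if value > a)

def spike(n):
    """Law of X_n = n * 1{U < 1/n}: E(X_n) = 1 for every n."""
    p = Fraction(1, n)
    return [(Fraction(n), p), (Fraction(0), 1 - p)]

def damped_spike(n):
    """Law of Y_n = n * 1{U < 1/n^2}: E(Y_n^2) = 1 for every n."""
    p = Fraction(1, n * n)
    return [(Fraction(n), p), (Fraction(0), 1 - p)]

def sup_tail(family, a, n_max=1000):
    """Supremum over 1 <= n <= n_max of E[X_n 1{X_n > a}]."""
    return max(tail_mass(family(n), a) for n in range(1, n_max + 1))

for a in (10, 100, 500):
    print(a, sup_tail(spike, a), sup_tail(damped_spike, a))
```

For the first family the supremum stays equal to 1 however large $a$ is, so (1.2.1) fails; for the second it equals $1/(a + 1)$ for integer $a$, which tends to 0.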
Comments and references:
(a) The equivalence between properties (1.2.1) and (1.2.2) is quite classical; their equivalence with (1.2.3) and a fortiori (1.2.4) may be less known. These equivalences play an important role in the study of weak compactness in:
C. Dellacherie, P.A. Meyer and M. Yor: Sur certaines propriétés des espaces $H^1$ et BMO, Séminaire de Probabilités XII, Lecture Notes in Mathematics, 649, 98–113, Springer, 1978.
(b) De la Vallée-Poussin’s lemma is another very useful criterion for uniform integrability (see Meyer [40]; one may also consult: C. Dellacherie and P.A. Meyer [16]). The lemma asserts that $(X_i, i \in I)$ is uniformly integrable if and only if there exists a strictly increasing function $\Phi : \mathbb{R}_+ \to \mathbb{R}_+$, such that $\frac{\Phi(x)}{x} \to \infty$, as $x \to \infty$ and $\sup_{i \in I} E[\Phi(X_i)] < \infty$. (Prove that the condition is sufficient!) This lemma is often used (in one direction) with $\Phi(x) = x^2$, i.e. a family $(X_i, i \in I)$ which is bounded in $L^2$ is uniformly integrable. See Exercise 5.8 for an application.

*
1.3 When does weak convergence imply the convergence of expectations?

Consider, on a probability space $(\Omega, \mathcal{A}, P)$, a sequence $(X_n)$ of r.v.s with values in $\mathbb{R}_+$, which are uniformly integrable, and such that:
$$X_n(P) \xrightarrow[n \to \infty]{w} \nu.$$
1. Show that $\nu$ is carried by $\mathbb{R}_+$, and that $\int \nu(dx)\, x < \infty$.
2. Show that $E(X_n)$ converges, as $n \to \infty$, towards $\int \nu(dx)\, x$.
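The uniform integrability assumption cannot simply be dropped. The sketch below, which is not from the original text, uses the illustrative sequence $X_n = n$ with probability $1/n$ and $X_n = 0$ otherwise. Since $P(X_n \neq 0) = 1/n \to 0$, the laws $X_n(P)$ converge weakly to the Dirac mass at 0, yet $E(X_n) = 1$ for every $n$. The test function $x \mapsto x/(1+x)$ is one arbitrary bounded continuous choice.

```python
from fractions import Fraction

def expectation(f, n):
    """E[f(X_n)] for X_n = n with probability 1/n and 0 otherwise."""
    p = Fraction(1, n)
    return p * f(Fraction(n)) + (1 - p) * f(Fraction(0))

def bounded_test(x):
    """A bounded continuous function on R_+ with bounded_test(0) = 0."""
    return x / (1 + x)

def identity(x):
    """The unbounded function whose expectation is E(X_n)."""
    return x

for n in (1, 10, 100, 1000):
    # First column tends to bounded_test(0) = 0, second column stays at 1.
    print(n, expectation(bounded_test, n), expectation(identity, n))
```

Here $E[f(X_n)] = 1/(n+1)$ tends to $f(0) = 0$ for the bounded test function, while the expectations do not converge to $\int \nu(dx)\, x = 0$.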
Comments:
(a) Recall that, if $(\nu_n; n \in \mathbb{N})$ is a sequence of probability measures on $\mathbb{R}^d$ (for simplicity), and $\nu$ is also a probability on $\mathbb{R}^d$, then:
$$\nu_n \xrightarrow[n \to \infty]{w} \nu \quad \text{if and only if} \quad \langle \nu_n, f \rangle \xrightarrow[n \to \infty]{} \langle \nu, f \rangle$$
for every bounded, continuous function $f$.
(b) When $\nu_n \xrightarrow[n \to \infty]{w} \nu$, the question often arises whether $\langle \nu_n, f \rangle \xrightarrow[n \to \infty]{} \langle \nu, f \rangle$ also for some $f$’s which may be either unbounded, or discontinuous. Examples of such situations are dealt with in Exercises 5.3 and 5.8.
(c) Recall Scheffé’s lemma: if $(X_n)$ and $X$ are $\mathbb{R}_+$-valued r.v.’s, with $X_n \xrightarrow[n \to \infty]{(P)} X$, and $E[X_n] \xrightarrow[n \to \infty]{} E[X]$, then $X_n \xrightarrow[n \to \infty]{} X$ in $L^1(P)$, hence the $X_n$’s are uniformly integrable, thus providing a partial converse to the statement in this exercise.

*
1.4 Conditional expectation and the Monotone Class Theorem

Consider, on a probability space $(\Omega, \mathcal{F}, P)$, a sub-σ-field $\mathcal{G}$. Assume that there exist two r.v.s, $X$ and $Y$, with $X$ $\mathcal{F}$-measurable and $Y$ $\mathcal{G}$-measurable such that, for every Borel bounded function $g : \mathbb{R} \to \mathbb{R}_+$, one has:
$$E[g(X) \mid \mathcal{G}] = g(Y).$$
Prove that: $X = Y$ a.s.
Hint: Look at the title!
Comments: For a deeper result, see Exercise 1.11.

**
1.5 $L^p$-convergence of conditional expectations

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X \in L^p(\Omega, \mathcal{F}, P)$, $X \ge 0$, for some $p \ge 1$.
1. Let $\mathbb{H}$ be the set of all sub-σ-fields of $\mathcal{F}$. Prove that the family of r.v.s $\{E[X \mid \mathcal{G}]^p : \mathcal{G} \in \mathbb{H}\}$ is uniformly integrable. (We refer to Exercise 1.2 for the definition of uniform integrability.)
2. Show that if a sequence of r.v.s $(Y_n)$, with values in $\mathbb{R}_+$, is such that $(Y_n^p)$ is uniformly integrable and $(Y_n)$ converges in probability to $Y$, then $(Y_n)$ converges to $Y$ in $L^p$.
3. Let $(\mathcal{B}_n)$ be a monotone sequence of sub-σ-fields of $\mathcal{F}$. We denote by $\mathcal{B}$ the limit of $(\mathcal{B}_n)$, that is $\mathcal{B} = \vee_n \mathcal{B}_n$ if $(\mathcal{B}_n)$ increases or $\mathcal{B} = \cap_n \mathcal{B}_n$ if $(\mathcal{B}_n)$ decreases. Prove that
$$E(X \mid \mathcal{B}_n) \xrightarrow{L^p} E(X \mid \mathcal{B}).$$
Hint: First, prove the result in the case $p = 2$.
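The convergence in question 3 can be observed on a concrete increasing sequence. The following sketch is an illustration only, not a solution: it takes $\Omega = [0, 1)$ with Lebesgue measure, $X(\omega) = \omega^2$ and $\mathcal{B}_n$ generated by the dyadic intervals of length $2^{-n}$ (all of these are assumptions chosen for the example). Then $\vee_n \mathcal{B}_n$ is the Borel σ-field, so $E(X \mid \mathcal{B}) = X$, and the squared $L^2$ distance can be computed exactly.

```python
from fractions import Fraction

def squared_l2_error(n):
    """E[(X - E[X | B_n])^2] for X(w) = w^2 on [0, 1) with Lebesgue measure,
    where B_n is generated by the dyadic intervals [k/2^n, (k+1)/2^n)."""
    N = 2 ** n
    # On the k-th interval, E[X | B_n] is the average of w^2 over the
    # interval, namely ((k+1)^3 - k^3) / (3 N^2); each interval has mass 1/N.
    second_moment_of_projection = sum(
        Fraction((k + 1) ** 3 - k ** 3, 3 * N * N) ** 2 for k in range(N)
    ) / N
    # Orthogonality of the projection: E[(X - m)^2] = E[X^2] - E[m^2],
    # and E[X^2] = 1/5 for X(w) = w^2.
    return Fraction(1, 5) - second_moment_of_projection

for n in range(9):
    print(n, float(squared_l2_error(n)))
```

The printed errors decrease towards 0 as $n$ grows, which is the $p = 2$ case suggested by the hint, seen on one example.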
Comments and references: These three questions are very classical. We present the end result (of question 3.) as an exercise, although it is an important and classical part of the Martingale Convergence Theorem (see the reference hereafter). We wish to emphasize that here, nonetheless, as for many other questions the $L^p$ convergence results are much easier to obtain than the corresponding almost sure one, which is proved in J. Neveu [43] and D. Williams [63].

*
1.6 Measure preserving transformations

Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $T : (\Omega, \mathcal{F}) \to (\Omega, \mathcal{F})$ be a transformation which preserves $P$, i.e. $T(P) = P$.
1. Prove that, if $X : (\Omega, \mathcal{F}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is almost $T$-invariant, i.e. $X(\omega) = X(T(\omega))$, $P$ a.s., then, for any bounded function $\Phi : (\Omega \times \mathbb{R}, \mathcal{F} \otimes \mathcal{B}(\mathbb{R})) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, one has:
$$E[\Phi(\omega, X(\omega))] = E[\Phi(T(\omega), X(\omega))]. \tag{1.6.1}$$
2. Conversely, prove that, if (1.6.1) is satisfied, then, for every bounded function $g : (\mathbb{R}, \mathcal{B}(\mathbb{R})) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, one has:
$$E[g(X) \mid T^{-1}(\mathcal{F})] = g(X(T(\omega))), \quad P \text{ a.s.} \tag{1.6.2}$$
3. Prove that (1.6.1) is satisfied if and only if $X$ is almost $T$-invariant.
Hint: Use Exercise 1.4.
*
1.7 Ergodic transformations

Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $T : (\Omega, \mathcal{F}) \to (\Omega, \mathcal{F})$ be a transformation which preserves $P$, i.e. $T(P) = P$. We denote by $\mathcal{J}$ the invariant σ-field of $T$, i.e.
$$\mathcal{J} = \{A \in \mathcal{F} : 1_A(T\omega) = 1_A(\omega)\}.$$
$T$ is said to be ergodic if $\mathcal{J}$ is $P$-trivial.
1. Prove that $T$ is ergodic if the following property holds:
(a) for every $f, g$ belonging to a vector space $H$ which is dense in $L^2(\mathcal{F}, P)$,
$$E[f \, (g \circ T^n)] \xrightarrow[n \to \infty]{} E(f) E(g),$$
where $T^n$ is the composition product of $T$ by itself, $(n - 1)$ times: $T^n = T \circ T \circ \cdots \circ T$.
2. Prove that, if there exists an increasing sequence $(\mathcal{F}_k)_{k \in \mathbb{N}}$ of sub-σ-fields of $\mathcal{F}$ such that:
(b) $\vee_k \mathcal{F}_k = \mathcal{F}$,
(c) for every $k$, $T^{-1}(\mathcal{F}_k) \subseteq \mathcal{F}_k$,
(d) for every $k$, $\bigcap_n (T^n)^{-1}(\mathcal{F}_k)$ is $P$-trivial,
then the property (a) is satisfied. Consequently, the properties (b)–(c)–(d) imply that $T$ is ergodic.
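Property (a) can be checked by hand on a standard example. The sketch below is not taken from the text: it assumes the doubling map $T(\omega) = 2\omega \bmod 1$ on $[0, 1)$, which preserves Lebesgue measure, and the single pair $f(\omega) = g(\omega) = \omega$. It computes $E[f \, (g \circ T^n)]$ exactly; this illustrates the decorrelation in (a) for one pair of functions only and does not by itself verify (a) on a dense vector space.

```python
from fractions import Fraction

def correlation(n):
    """Exact value of E[f * (g o T^n)] for f(w) = g(w) = w and T(w) = 2w mod 1.

    On [k/N, (k+1)/N) with N = 2**n one has T^n(w) = N*w - k, and the
    integral of w * (N*w - k) over that interval equals (k/2 + 1/3) / N**2.
    """
    N = 2 ** n
    return sum(Fraction(k, 2) + Fraction(1, 3) for k in range(N)) / N ** 2

for n in range(8):
    # The limit E(f) E(g) equals 1/4.
    print(n, correlation(n), float(correlation(n)) - 0.25)
```

Summing the closed form gives $E[f \, (g \circ T^n)] = \frac{1}{4} + \frac{1}{12 \cdot 2^n}$, which converges to $E(f)E(g) = \frac{1}{4}$ geometrically fast.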
*
1.8 Invariant σ-fields
Consider, on a probability space (Ω, F, P ), a measurable transformation T which preserves P , i.e. T (P ) = P . Let g be an integrable random variable, i.e. g ∈ L1 (Ω, F, P ). Prove that the two following properties are equivalent: (i) for every f ∈ L∞ (Ω, F, P ), E[f g] = E[(f ◦ T )g] (ii) g is almost T -invariant, i.e. g = g ◦ T , P a.s. Hint: One may use the following form of the ergodic theorem: n 1 L1 f ◦ T i −−− − −→ E[f | J ] , n→∞ n i=1
where J is the invariant σ-field of T . Comments and references on Exercises 1.6, 1.7, 1.8: (a) These are featured at the very beginning of every book on Ergodic Theory. See, for example, K. Petersen [45] and P. Billingsley [6].
(b) Of course, many examples of ergodic transformations are provided in books on Ergodic Theory. Let us simply mention that if $(B_t)$ denotes Brownian motion, then the scaling operation $B \mapsto \frac{1}{\sqrt{c}} B_{c\,\cdot}$ is ergodic for $c \neq 1$. Can you prove this result? Actually, the same result holds for the whole class of stable processes, as proved in Exercise 5.15.
(c) Exercise 1.11 yields a proof of (i) ⇒ (ii) which does not use the Ergodic Theorem.
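The form of the ergodic theorem quoted in the hint to Exercise 1.8 can be watched at work on a finite example. The sketch below rests on assumptions that are not in the text: Ω = {0, ..., 5} with the uniform probability, T a permutation with two cycles, and an arbitrary function f. In that setting E[f | J] is simply the average of f over the cycle containing ω. All names are hypothetical.

```python
from fractions import Fraction

perm = [2, 3, 4, 5, 0, 1]          # hypothetical T: w -> w + 2 mod 6, two 3-cycles
f = [5, 1, 0, 2, 4, 9]             # an arbitrary function on Omega = {0,...,5}

def cesaro_average(w, n):
    """(1/n) * sum_{i=1}^{n} f(T^i(w)), computed exactly."""
    total, point = 0, w
    for _ in range(n):
        point = perm[point]
        total += f[point]
    return Fraction(total, n)

def conditional_expectation(w):
    """E[f | J](w) under the uniform P: the average of f over the orbit of w."""
    orbit, point = [w], perm[w]
    while point != w:
        orbit.append(point)
        point = perm[point]
    return Fraction(sum(f[v] for v in orbit), len(orbit))

def sup_error(n):
    """Largest gap over Omega between the Cesaro average and E[f | J]."""
    return max(abs(cesaro_average(w, n) - conditional_expectation(w)) for w in range(6))

for n in (1, 10, 100, 1000):
    print(n, float(sup_error(n)))
```

The gap is exactly zero whenever n is a multiple of the cycle length and is of order 1/n otherwise, so the convergence here holds pointwise and hence in L1.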
**
1.9 Extremal solutions of (general) moments problems
Consider, on a measurable space (Ω, F), a family Φ = (ϕi )i∈I of real-valued random variables, and let c = (ci )i∈I be a family of real numbers. Define MΦ,c to be the family of probabilities P on (Ω, F) such that: (a) Φ ⊂ L1 (Ω, F, P ) ; (b) for every i ∈ I, EP (ϕi ) = ci . A probability measure P in MΦ,c is called extremal if whenever P = αP1 +(1−α)P2 , with 0 < α < 1 and P1 , P2 ∈ MΦ,c , then P = P1 = P2 . 1. Prove that, if P ∈ MΦ,c , then P is extremal in MΦ,c , if, and only if the vector space generated by 1 and Φ is dense in L1 (Ω, F, P ). 2.
(i) Prove that, if P is extremal in $M_{\Phi,c}$, and $Q \in M_{\Phi,c}$ is such that $Q \ll P$ and $\frac{dQ}{dP}$ is bounded, then Q = P .
(ii) Prove that, if P is not extremal in $M_{\Phi,c}$, and $Q \in M_{\Phi,c}$ is such that $Q \ll P$, with $0 < \varepsilon \le \frac{dQ}{dP} \le C < \infty$, for some ε, C > 0, then Q is not extremal in $M_{\Phi,c}$.
3. Let T be a measurable transformation of (Ω, F), and define MT to be the family of probabilities P on (Ω, F) which are preserved by T , i.e. T (P ) = P . Prove that, if P ∈ MT , then P is extremal in MT if, and only if, T is ergodic under P .
Comments and references: (a) The result of question 1 appears to have been obtained independently by:
M.A. Naimark: Extremal spectral functions of a symmetric operator. Bull. Acad. Sci. URSS. Sér. Math., 11, 327–344, (1947). (see e.g. N.I. Akhiezer: The Classical Moment Problem and Some Related Questions in Analysis. Publishing Co., New York, p. 47, 1965), and
R. Douglas: On extremal measures and subspace density, Michigan Math. J., 11, 243–246, (1964). II Proc. Amer. Math. Soc., 17, 1363–1365, (1966).
It is often used in the study of indeterminate moment problems, see e.g.
Ch. Berg: Recent results about moment problems. Probability measures on groups and related structures, XI (Oberwolfach, 1994), pp. 1–13, World Sci. Publishing, River Edge, NJ, 1995.
Ch. Berg: Indeterminate moment problems and the theory of entire functions. Proceedings of the International Conference on Orthogonality, Moment Problems and Continued Fractions (Delft, 1994). J. Comput. Appl. Math., 65, no. 1–3, 27–55, (1995).
(b) Some variants are presented in:
E.B. Dynkin: Sufficient statistics and extreme points. Ann. Probab. 6, no. 5, 705–730, (1978).
For some applications to martingale representations as stochastic integrals, see:
M. Yor: Sous-espaces denses dans $L^1$ et $H^1$ et représentations des martingales, Séminaire de Probabilités XII, Lecture Notes in Mathematics, 649, 264–309, Springer, 1978.
(c) The next exercise gives the most classical example of a non-moments determinate probability on IR. It is those particular moments problems which motivated the general statement of Naimark–Douglas.
*
1.10 The log normal distribution is moments indeterminate
Let $N_{\sigma^2}$ be a centred Gaussian variable with variance $\sigma^2$. Associate to $N_{\sigma^2}$ the log normal variable: $X_{\sigma^2} = \exp(N_{\sigma^2})$.
1. Compute the density of $X_{\sigma^2}$; its expression gives an explanation for the term “log normal”.
2. Prove that for every $n \in \mathbb{Z}$, and $p \in \mathbb{Z}$,
$$E\left[X_{\sigma^2}^{\,n} \sin\left(\frac{p\pi}{\sigma^2} N_{\sigma^2}\right)\right] = 0. \qquad (1.10.1)$$
3. Show that there exist infinitely many probability laws μ on IR+ such that:
(i) for every $n \in \mathbb{Z}$,
$$\int \mu(dx)\, x^n = \exp\left(\frac{n^2\sigma^2}{2}\right).$$
(ii) μ has a bounded density with respect to the law of $\exp(N_{\sigma^2})$.
Comments and references:
(a) This exercise and its proof go back to T. Stieltjes’ fundamental memoir:
T.J. Stieltjes: Recherches sur les fractions continues. Reprint of Ann. Fac. Sci. Toulouse 9, (1895), A5–A47. Reprinted in Ann. Fac. Sci. Toulouse Math., 6, no. 4, A5–A47 (1995).
There are many other examples of elements of $M_{\sigma^2}$, including some with countable support; see, e.g., Stoyanov ([56], p. 104).
(b) In his memoir, Stieltjes also provides other similar elementary proofs for different moment problems. For instance, for any a > 0, if $Z_a$ denotes a gamma variable, then for c > 2, the law of $(Z_a)^c$ is not moments determinate. See, e.g., J.M. Stoyanov [56], § 11.4.
(c) There are some sufficient criteria which bear upon the sequence of moments $m_n = E[X^n]$ of an r.v. X and ensure that the law of X is determined uniquely from the $(m_n)$ sequence. (In particular, the classical sufficient Carleman criterion asserts that if $\sum_n (m_{2n})^{-1/(2n)} = \infty$, then the law of X is moments determinate.) But these are unsatisfactory in a number of cases, and the search continues. See, for example, the following.
J. Stoyanov: Krein condition in probabilistic moment problems. Bernoulli 6, no. 5, 939–949 (2000).
A. Gut: On the moment problem. Bernoulli, 8, no. 3, 407–421 (2002).
*
1.11 Conditional expectations and equality in law
Let $X \in L^1(\Omega, \mathcal{F}, P)$, and G be a sub-σ-field of F. The objective of this exercise is to prove that if X and $Y \overset{\text{(def)}}{=} E[X \mid \mathcal{G}]$ have the same distribution, then X is G measurable (hence X = Y ).
1. Prove the result if X belongs to $L^2$.
2. Prove that for every a, b ∈ IR+ ,
$$E[(X \wedge a) \vee (-b) \mid \mathcal{G}] = (Y \wedge a) \vee (-b), \qquad (1.11.1)$$
and conclude.
3. Prove the result of Exercise 1.4 using the previous question.
4. In Exercise 1.8, prove, without using the Ergodic Theorem, that (i) implies (ii).
5. Let X and Y belong to $L^1$; prove that if E[X | Y ] = Y and E[Y | X] = X, then X = Y , a.s.
Comments and references:
(a) This exercise, in the generality of question 1, was proposed by A. Cherny. As is clear from questions 2, 3, 4 and 5, it has many potential applications. See also Exercise 2.6 where it is used to prove de Finetti’s representation theorem of exchangeable sequences of r.v.s.
(b) Question 5 is proposed as Exercise (33.2) on p. 62 in D. Williams’ book [62].
*
1.12 Simplifiable random variables
An r.v. Y which takes its values in IR+ is said to be simplifiable if the following property holds: if $XY \overset{\text{(law)}}{=} YZ$, with X and Z taking their values in IR+ , and X, resp. Z, independent of Y , then $X \overset{\text{(law)}}{=} Z$.
1. Prove that, if Y takes its values in IR+ \ {0}, and if the characteristic function of (log Y ) has only isolated zeros, then Y is simplifiable.
2. Give an example of an r.v. Y which is not simplifiable.
3. Suppose Y is simplifiable, and satisfies: $Y \overset{\text{(law)}}{=} AB$, where on the right hand side A and B are independent, and neither of them is a.s. constant. Prove that A cannot be factorized as: $A \overset{\text{(law)}}{=} YC$, with Y and C independent.
Comments and references:
(a) To prove that two r.v.s are identical in law, it is sometimes very convenient to first multiply both these variables by a third independent r.v. and then to simplify this variable as in the present exercise. Many applications of this idea are shown in Chapter 4, see for instance Exercise 4.16. To simplify an equality in law as above, the positivity of the variables is crucial as the following example shows: let $\varepsilon_1$ and $\varepsilon_2$ be two independent symmetric Bernoulli variables, then $\varepsilon_1\varepsilon_2 \overset{\text{(law)}}{=} \varepsilon_1$ does not imply that $\varepsilon_2 = 1$ !! For some variants of these questions, see Durrett [19], p. 107, as well as Feller [20], section XV, 2.a.
(b) An r.v. H is said to be infinitely divisible if for every n ∈ IN, there exist $H_1^{(n)}, \ldots, H_n^{(n)}$ which are i.i.d. and such that $H \overset{\text{(law)}}{=} H_1^{(n)} + \cdots + H_n^{(n)}$. The famous Lévy–Khintchine formula asserts that E[exp(iλH)] = exp ψ(λ), for a function ψ (of a very special form). In particular, the characteristic function of H has no zeros; we may use this result, in the setup of question 1, for Y such that H = log Y is infinitely divisible.
(c) Exercise 7, p. 295 in A.N. Shiryaev [55] exhibits three independent r.v.s such that $U + V \overset{\text{(law)}}{=} W + V$, but U and W do not have the same law. Hence, exp(V ) is not simplifiable.
*
1.13 Mellin transform and simplification
The Mellin transform of (the distribution of) an IR+ valued r.v. Z is the function: $s \mapsto E[Z^s]$, as defined on IR+ (it may take the value +∞). Let X, Y, Z be three independent r.v.s taking values in IR+ , and such that:
(i) $XY \overset{\text{(law)}}{=} ZY$,
(ii) for 0 ≤ s ≤ ε, with some ε > 0, $E[(XY)^s] < \infty$,
(iii) P (Y > 0) > 0.
Show that: $X \overset{\text{(law)}}{=} Z$.
Comments and references:
(a) The advantage of this last Exercise (and its result) over the previous one is that one need not worry about the characteristic function of (log Y ). The (small) cost is that we assume X, Y, Z have (small enough) moments. In our applications (e.g. in Chapter 4), we shall be able to use both criteria of Exercises 1.12 and 1.13.
(b) It may be worth emphasizing here (informally) that the Mellin transform is injective (on the set of probabilities on IR+ ), whereas its restriction to IN is not. (Please give precise statements and keep them in mind!) See Chapter VI of Widder [61].
1. Measure theory and probability – solutions
Solutions for Chapter 1
Solution to Exercise 1.1
1. Suppose that (1.1.1) holds, then we shall prove that for every α ∈ (0, 1), the probability $Q_\alpha$ satisfies (a) and (b). First, note that $Q_\alpha$ is equivalent to P . Indeed, let N ∈ F be such that $Q_\alpha(N) = 0$, then
$$E_P\left[\frac{1_{A \cap N}}{P(A \mid \mathcal{B})}\right] = 0 \quad\text{and}\quad E_P\left[\frac{1_{A^c \cap N}}{P(A^c \mid \mathcal{B})}\right] = 0.$$
Since $0 < P(A \mid \mathcal{B}) < 1$ and $0 < P(A^c \mid \mathcal{B}) < 1$, P a.s., the preceding identities imply that $P(A \cap N) = 0$ and $P(A^c \cap N) = 0$. Hence P (N ) = 0. The converse is obvious. On the other hand,
$$Q_\alpha(A) = \alpha E_P\left[\frac{1_A}{P(A \mid \mathcal{B})}\right] = \alpha \in (0, 1).$$
To prove the independence, note that $E_P\left[\frac{1_{A \cap B}}{P(A \mid \mathcal{B})}\right] = P(B)$, whenever $B \in \mathcal{B}$. Therefore $Q_\alpha(A \cap B) = \alpha P(B)$ and we easily verify that $Q_\alpha(A)Q_\alpha(B) = \alpha P(B)$.
Suppose now that (a) and (b) hold and set $N_0 = \{\omega \in \Omega : P(A \mid \mathcal{B}) = 0\}$, and $N_1 = \{\omega \in \Omega : P(A \mid \mathcal{B}) = 1\}$. We have $P(A \cap N_0 \mid \mathcal{B}) = 1_{N_0} P(A \mid \mathcal{B}) = 0$, thus $P(A \cap N_0) = 0$ and $Q_\alpha(A \cap N_0) = 0$. But $N_0 \in \mathcal{B}$, thus $Q_\alpha(A \cap N_0) = 0 = Q_\alpha(A)Q_\alpha(N_0)$ and $Q_\alpha(N_0) = 0$. This is equivalent to $P(N_0) = 0$. To prove that $P(N_1) = 0$, it suffices to consider $P(A^c \cap N_1 \mid \mathcal{B})$ and to proceed as above.
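2. Let Qα be the probability defined in 1, and set $Z_\alpha = \frac{dQ}{dQ_\alpha}\Big|_{\mathcal{F}}$. First, we show that $E_{Q_\alpha}(Z_\alpha \mid \mathcal{B}) = Z_\alpha$, Qα a.s. It suffices to prove that for every F-measurable r.v. X ≥ 0,
$$E_Q(X) = E_{Q_\alpha}\bigl(E_{Q_\alpha}(Z_\alpha \mid \mathcal{B})\,X\bigr).$$
By (d), Zα is BA-measurable, thus all we have to prove is:
$$Q(A) = E_{Q_\alpha}\bigl(E_{Q_\alpha}(Z_\alpha \mid \mathcal{B})\,1_A\bigr).$$
It follows from the definition of Qα that:
$$E_{Q_\alpha}\bigl(E_{Q_\alpha}(Z_\alpha \mid \mathcal{B})\,1_A\bigr) = \alpha E_P\left[\frac{1_A\, E_{Q_\alpha}(Z_\alpha \mid \mathcal{B})}{P(A \mid \mathcal{B})}\right].$$
Furthermore, since Zα is BA-measurable,
$$E_{Q_\alpha}\bigl(E_{Q_\alpha}(Z_\alpha \mid \mathcal{B})\,1_A\bigr) = \alpha E_P\bigl(E_{Q_\alpha}(Z_\alpha \mid \mathcal{B})\bigr) = \alpha E_P(Z_\alpha).$$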
On the other hand, we have:
$$Q(A) = E_{Q_\alpha}(Z_\alpha 1_A) = \alpha E_P\left[Z_\alpha \frac{1_A}{P(A \mid \mathcal{B})}\right] = \alpha E_P(Z_\alpha).$$
Put Z = E(Zα | B), then from the above, we verify that EP (Z) = 1. Since Q and Qα are equivalent, Zα > 0, Qα a.s. Therefore, Z > 0, Qα a.s., which is equivalent to Z > 0, P a.s. We have proven that Z satisfies the required conditions.
Now, let Q and $\hat{Q}$ be two probabilities which satisfy (a) and (b) together with (c), (d) and (e). By (c) and (e), it is obvious that the restrictions of Q and $\hat{Q}$ to BA coincide. This implies that $\frac{dQ}{dP}\Big|_{\mathcal{F}} = \frac{d\hat{Q}}{dP}\Big|_{\mathcal{F}}$, P a.s. and we conclude that $Q = \hat{Q}$ on the σ-field F.
3. Suppose that (1.1.1) holds. Each element of BA is of the form (B1 ∩ A) ∪ (B2 ∩ Ac ), where B1 and B2 belong to B. Indeed, the set {(B1 ∩ A) ∪ (B2 ∩ Ac ) : B1 , B2 ∈ B} is a σ-field which contains BA (we leave the proof to the reader). Put A′ = (B1 ∩ A) ∪ (B2 ∩ Ac ), then by (1.1.1), we have
$$0 < 1_{B_1} P(A \mid \mathcal{B}) + 1_{B_2} P(A^c \mid \mathcal{B}) < 1, \qquad P \text{ a.s.},$$
and thus B1 = B2c , up to a negligible set. The converse is obvious. 4. With A = {A ∈ A : (1.1.1) is not satisfied for A}, it is not difficult to prove that A is a σ-field. Moreover, it is clear that B ⊆ A ⊂ A. Now, let B ∈ B be non trivial, then B ∩ A ∈ A and B ∩ A ∈ / B, thus B ⊆A . / 5. (i) If A satisfies (1.1.1) then, by 3, there exists B ∈ B such that A = (B ∩ A) ∪ (B c ∩ Ac ), thus P (A| B) = 1B P (A) + 1B c P (Ac ) = 1/2. The converse is obvious. (ii) A ∈ A , therefore there exist B1 and B2 such that A = (B1 ∩ A) ∪ (B2 ∩ Ac ) and A is independent of B iff P (A | B) = 1B1 P (A) + 1B2 P (Ac ) is constant. Since A is non-trivial and P (A) ∈ / {0, 1/2, 1}, this holds if and only if B1 = ∅ and B2 = Ω or B1 = Ω and B2 = ∅.
(iii) Since A ∈ A , it is clear that B A ⊆ A. Moreover, by 3, there exists B ∈ B such that A = (B ∩ A) ∪ (B c ∩ Ac ). Then, we can prove that A = (B ∩ A ) ∪ (B c ∩ A c ) A and A ∈ BA , thus A⊆B . /
Solution to Exercise 1.2
(1.2.1)=⇒(1.2.2): Pick $\varepsilon > 0$, then on the one hand, according to (1.2.1), there exists b > 0 such that:
$$\sup_{X \in H} \nu_X(X > b) \le \varepsilon/2.$$
On the other hand, for every $A \in \mathcal{A}$ such that $P(A) \le \varepsilon/(2b)$,
$$\sup_{X \in H} \nu_X((X \le b) \cap A) \le bP(A) \le \varepsilon/2.$$
Finally, (1.2.2) follows from the inequality:
$$\sup_{X \in H} \nu_X(A) \le \sup_{X \in H} \nu_X((X \le b) \cap A) + \sup_{X \in H} \nu_X(X > b) \le \varepsilon.$$
(1.2.2)=⇒(1.2.4): If $(B_n)$ is a sequence of disjoint sets in $\mathcal{A}$ then $\sum_{n \ge 0} P(B_n) \le 1$, and thus $\lim_{n\to\infty} P(B_n) = 0$. Therefore, for ε and δ as in (1.2.2), there exists N ≥ 1 such that $P(B_n) \le \delta$ for every n ≥ N , hence $\sup_{X \in H} \int_{B_n} X\, dP \le \varepsilon$.
(1.2.4)=⇒(1.2.3): Suppose that (1.2.3) is not verified. Let $(A_n)$ be a sequence of sets in $\mathcal{A}$ which decreases to ∅ and such that $\lim_{n\to\infty} \left(\sup_{X \in H} \nu_X(A_n)\right) = \varepsilon > 0$. For every X ∈ H, $\lim_{n\to\infty} \nu_X(A_n) = 0$, thus there exist $X^1 \in H$ and $n_1 > 1$ such that $\nu_{X^1}(A_1 \setminus A_{n_1}) \ge \varepsilon/2$; put $B_1 = A_1 \setminus A_{n_1}$. Furthermore, there exist $n_2 > n_1$ and $X^2 \in H$ such that $\nu_{X^2}(A_{n_1} \setminus A_{n_2}) \ge \varepsilon/2$; put $B_2 = A_{n_1} \setminus A_{n_2}$. We can construct, in this manner, a sequence $(B_n)$ of disjoint sets of $\mathcal{A}$ such that $\sup_{X \in H} \nu_X(B_n) \ge \varepsilon/2$, and (1.2.4) is not verified.
(1.2.3)=⇒(1.2.1): Suppose that (1.2.1) does not hold, then there exists $\varepsilon > 0$ such that for every n ∈ IN, we can find $X^n \in H$ which verifies $\nu_{X^n}(X^n > 2^n) \ge \varepsilon$. Put $A_n = \cup_{p \ge n} (X^p > 2^p)$, then $(A_n)$ is a decreasing sequence of sets of $\mathcal{A}$ such that $\sup_{X \in H} \nu_X(A_n) \ge \varepsilon$. Moreover, $\lim_{n\to\infty} P(A_n) = 0$; indeed, for every n ≥ 1,
$$P(A_n) \le \sum_{p=n}^{\infty} P(X^p \ge 2^p) \le C \sum_{p=n}^{\infty} \frac{1}{2^p} \to 0, \qquad \text{as } n \to \infty.$$
This proves that $(A_n)$ decreases to a negligible set A. Finally, put $A'_n = A_n \setminus A$, then $(A'_n)$ is a sequence of $\mathcal{A}$ which contradicts (1.2.3).
Solution to Exercise 1.3
1. Since each of the laws $\nu_n \overset{\text{(def)}}{=} X_n(P)$ is carried by IR+ , then for every bounded, continuous function f which vanishes on IR+ , and for every n ∈ IN, we have: $E(f(X_n)) = 0$. By the weak convergence of $\nu_n$ to ν, we get $\int f(x)\,\nu(dx) = 0$ and this proves that ν is carried by IR+ .
To prove that $\int x\,\nu(dx) < \infty$, note that
$$\int x\,\nu(dx) = \lim_{a \uparrow \infty} \int (x \wedge a)\,\nu(dx) = \lim_{a \uparrow \infty} \lim_{n \to \infty} \int (x \wedge a)\,\nu_n(dx) \le \sup_n E[X_n] < \infty,$$
since the $X_n$s are uniformly integrable.
2. For any a ≥ 0, write
$$\left|E[X_n] - \int x\,\nu(dx)\right| \le \bigl|E[X_n] - E[X_n \wedge a]\bigr| + \left|E[X_n \wedge a] - \int (x \wedge a)\,\nu(dx)\right| + \left|\int (x \wedge a)\,\nu(dx) - \int x\,\nu(dx)\right|.$$
Since the r.v.s $X_n$ are uniformly integrable, for any ε > 0, we can find a such that for every n, $|E[X_n] - E[X_n \wedge a]| \le \varepsilon/3$ and $\left|\int (x \wedge a)\,\nu(dx) - \int x\,\nu(dx)\right| \le \varepsilon/3$. Moreover, from the convergence in law, there exists N such that for every n ≥ N , $\left|E[X_n \wedge a] - \int (x \wedge a)\,\nu(dx)\right| \le \varepsilon/3$.
Solution to Exercise 1.4
First solution: From the Monotone Class Theorem, the identity: $E[g(X) \mid \mathcal{G}] = g(Y)$ extends to: $E[G(X, Y) \mid \mathcal{G}] = G(Y, Y)$, for every (bounded) Borel function G : IR × IR → IR+ . Hence taking $G(x, y) = 1_{\{x \neq y\}}$ yields the result.
Second solution: Let a ≥ 0. From the hypothesis, we deduce:
$$E\bigl((X 1_{\{|X| \le a\}} - Y 1_{\{|Y| \le a\}})^2 \mid \mathcal{G}\bigr) = 0, \qquad \text{a.s.}$$
So, $E\bigl((X 1_{\{|X| \le a\}} - Y 1_{\{|Y| \le a\}})^2\bigr) = 0$, hence $X 1_{\{|X| \le a\}} = Y 1_{\{|Y| \le a\}}$, a.s., for any a ≥ 0.
Solution to Exercise 1.5
1. Thanks to the criterion for uniform integrability (1.2.2) in Exercise 1.2, it suffices to show that the sets: $\{\{E[X \mid \mathcal{G}]^p > a\} : \mathcal{G} \in \mathbb{H}\}$ have small probabilities as a → ∞ uniformly in $\mathbb{H}$. But this follows from
$$P(E[X \mid \mathcal{G}]^p > a) \le P(E[X^p \mid \mathcal{G}] > a) \le \frac{1}{a} E[X^p].$$
Note that instead of dealing with only one variable, $X \in L^p$, X ≥ 0, we might also consider a family $\{X_i, i \in I\}$ of r.v.s such that $\{X_i^p, i \in I\}$ is uniformly integrable. Then, again, the set $\{E[X_i \mid \mathcal{G}]^p, i \in I, \mathcal{G} \in \mathbb{H}\}$ is uniformly integrable.
2. Let ε > 0, then
$$E[|Y_n - Y|^p] \le E\bigl[|Y_n - Y|^p 1_{\{|Y_n - Y|^p \le \varepsilon\}}\bigr] + E\bigl[|Y_n - Y|^p 1_{\{|Y_n - Y|^p > \varepsilon\}}\bigr] \le \varepsilon + 2^{p-1}\bigl(E[Y_n^p 1_{\{|Y_n - Y|^p > \varepsilon\}}] + E[Y^p 1_{\{|Y_n - Y|^p > \varepsilon\}}]\bigr),$$
where the last inequality comes from $|x + y|^p \le 2^{p-1}(|x|^p + |y|^p)$. When n goes to ∞, the terms $E[Y_n^p 1_{\{|Y_n - Y|^p > \varepsilon\}}]$ and $E[Y^p 1_{\{|Y_n - Y|^p > \varepsilon\}}]$ converge to 0, since $(Y_n^p)$ is uniformly integrable and $P(|Y_n - Y|^p > \varepsilon)$ converges to 0 as n goes to ∞.
3. In this question, it suffices to deal with the case p = 2. Indeed, suppose the result is true for p = 2 and consider the general case where p ≥ 1. First suppose that p ∈ [2, ∞) and let $X \in L^p(\Omega, \mathcal{F}, P)$. This implies that $X \in L^2(\Omega, \mathcal{F}, P)$ and $E(X \mid \mathcal{B}_n) \in L^2(\Omega, \mathcal{F}, P)$ for every n ∈ IN. Since $E(X \mid \mathcal{B}_n) \xrightarrow{L^2} E(X \mid \mathcal{B})$, as n → ∞, the sequence of r.v.s $(E(X \mid \mathcal{B}_n))$ converges in probability to $E(X \mid \mathcal{B})$ and from question 1, the sequence $(E[X \mid \mathcal{B}_n]^p)$ is uniformly integrable. Therefore, from question 2, $(E(X \mid \mathcal{B}_n))$ converges in $L^p$ to $E(X \mid \mathcal{B})$.
If p ∈ [1, 2) then there exists a sequence $(X_k) \in L^2$ such that $X_k \xrightarrow{L^p} X$, as k → ∞. Assume that for each k ∈ IN, $E(X_k \mid \mathcal{B}_n) \xrightarrow{L^2} E(X_k \mid \mathcal{B})$ as n → ∞; then from Hölder’s inequality, $E(X_k \mid \mathcal{B}_n) \xrightarrow{L^p} E(X_k \mid \mathcal{B})$ as n → ∞. Let ε > 0, and k such that $\|X - X_k\|_{L^p} \le \varepsilon$. There exists $n_0$ such that for all $n \ge n_0$, $\|E(X_k \mid \mathcal{B}_n) - E(X_k \mid \mathcal{B})\|_{L^p} \le \varepsilon$, and we have
$$\|E(X \mid \mathcal{B}_n) - E(X \mid \mathcal{B})\|_{L^p} \le \|E(X \mid \mathcal{B}_n) - E(X_k \mid \mathcal{B}_n)\|_{L^p} + \|E(X_k \mid \mathcal{B}_n) - E(X_k \mid \mathcal{B})\|_{L^p} + \|E(X_k \mid \mathcal{B}) - E(X \mid \mathcal{B})\|_{L^p} \le 3\varepsilon.$$
Now we prove the result in the case p = 2. At first, assume that $(\mathcal{B}_n)$ is an increasing sequence of σ-fields. There exist $Y \in L^2(\Omega, \mathcal{B}, P)$ and $Z \in L^2(\Omega, \mathcal{B}, P)^{\perp}$ such that X = Y + Z. For every n ∈ IN, $Z \in L^2(\Omega, \mathcal{B}_n, P)^{\perp}$, hence $E(X \mid \mathcal{B}_n) = E(Y \mid \mathcal{B}_n)$. Put $Y_n = E(Y \mid \mathcal{B}_n)$, then for m ≤ n, $E[(Y_n - Y_m)^2] = E[Y_n^2] - E[Y_m^2]$, so $E[Y_n^2]$ increases and is bounded, so this sequence of reals converges. From Cauchy’s criterion, the sequence $(Y_n)$ converges in $L^2$ towards an r.v. $\tilde{Y}$. Now we show that $\tilde{Y} = Y$: for any k ≤ n and $\Gamma_k \in \mathcal{B}_k$, $E[Y_n 1_{\Gamma_k}] = E[Y 1_{\Gamma_k}]$. But the left hand side converges as n → ∞ towards $E[\tilde{Y} 1_{\Gamma_k}]$, so that $E[\tilde{Y} 1_{\Gamma_k}] = E[Y 1_{\Gamma_k}]$. Finally, we verify from the Monotone Class Theorem that $\{\Gamma \in \mathcal{B} : E[\tilde{Y} 1_{\Gamma}] = E[Y 1_{\Gamma}]\}$ is equal to the σ-field $\mathcal{B}$, hence $\tilde{Y} = Y$.
Assume now that $(\mathcal{B}_n)$ decreases. For any integers n and m such that m > n, the r.v.s $X - E[X \mid \mathcal{B}_n]$ and $E[X \mid \mathcal{B}_m] - E[X \mid \mathcal{B}_n]$ are orthogonal in $L^2(\Omega, \mathcal{F}, P)$, and from the decomposition: $X - E[X \mid \mathcal{B}_m] = X - E[X \mid \mathcal{B}_n] + E[X \mid \mathcal{B}_n] - E[X \mid \mathcal{B}_m]$,
we have
$$\|X - E[X \mid \mathcal{B}_m]\|^2_{L^2} = \|X - E[X \mid \mathcal{B}_n]\|^2_{L^2} + \|E[X \mid \mathcal{B}_n] - E[X \mid \mathcal{B}_m]\|^2_{L^2}.$$
This equality implies that the sequence $(\|X - E[X \mid \mathcal{B}_n]\|_{L^2})$ increases with n and is bounded by $2\|X\|_{L^2}$, hence it converges. Furthermore, the same equality implies that $\|E[X \mid \mathcal{B}_m] - E[X \mid \mathcal{B}_n]\|_{L^2}$ tends to 0 as n and m go to ∞: we proved that $(E[X \mid \mathcal{B}_n])$ is a Cauchy sequence in $L^2(\Omega, \mathcal{F}, P)$, hence it converges. Call Y the limit in $L^2$ of $(E[X \mid \mathcal{B}_n])$. Note that Y is $\mathcal{B}$-measurable. For any $B \in \mathcal{B}$ and n ∈ IN, we have $E[X 1_B] = E[E[X \mid \mathcal{B}_n] 1_B]$. Letting n go to ∞ on the right hand side, we obtain $E[X 1_B] = E[Y 1_B]$, hence $Y = E[X \mid \mathcal{B}]$.
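The increasing case of question 3 can be visualised with a dyadic filtration. The sketch below is an illustration under assumptions that are not in the text: Ω is a grid of 2^M equally likely cells, B_n is generated by the 2^n dyadic blocks of cells (so B_M is the full σ-field of the grid), and X is an arbitrary function on the grid. The names `cond_exp` and `l2_distance` are hypothetical helpers.

```python
import math

M = 10                       # finest dyadic level: Omega is split into 2**M equal cells
N_CELLS = 2 ** M
# Hypothetical X: the values of sin(2*pi*u) + u**2 at the cell midpoints of [0, 1)
X = [math.sin(2 * math.pi * (i + 0.5) / N_CELLS) + ((i + 0.5) / N_CELLS) ** 2
     for i in range(N_CELLS)]

def cond_exp(values, level):
    """E[X | B_level]: replace X by its average on each of the 2**level dyadic blocks."""
    block = len(values) // 2 ** level
    out = []
    for start in range(0, len(values), block):
        mean = sum(values[start:start + block]) / block
        out.extend([mean] * block)
    return out

def l2_distance(u, v):
    """L^2 norm of u - v under the uniform probability on the cells."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

errors = [l2_distance(X, cond_exp(X, n)) for n in range(M + 1)]
for n, err in enumerate(errors):
    print(n, round(err, 6))
```

The printed distances decrease with the level, as the orthogonality argument of the solution predicts, and vanish at level M where B_M contains all the information about X.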
Solution to Exercise 1.6
1. Since T preserves P , we have: E[Φ(ω, X(ω))] = E[Φ(T (ω), X(T (ω)))] . When X is almost T -invariant, the right-hand side is E[Φ(T (ω), X(ω))].
2. Let g be such a function. Every $T^{-1}(\mathcal{F})$-measurable function is of the form φ ◦ T where φ is an F-measurable function. Therefore, it suffices to prove that for any IR+ -valued F-measurable function φ, E[φ(T (ω))g(X(ω))] = E[φ(T (ω))g(X(T (ω)))] . Since T preserves P , it is equivalent to prove that E[φ(T (ω))g(X(ω))] = E[φ(ω)g(X(ω))] . But this identity is exactly (1.6.1), with Φ(ω, x) = φ(ω)g(x).
3. If (1.6.1) is satisfied, then by question 2, (1.6.2) holds and by Exercise 1.4, X = X ◦ T a.s. The converse was proved in question 1.
Solution to Exercise 1.7
1. Property (a) implies that for every f and g in $L^2$,
$$E[f\,(g \circ T^n)] \xrightarrow[n\to\infty]{} E(f)E(g). \qquad (1.7.a)$$
Indeed, let $(f_k)$ and $(g_k)$ be two sequences of H which converge respectively towards f and g in $L^2$ and write
$$|E[f(g \circ T^n)] - E[f]E[g]| \le |E[f(g \circ T^n)] - E[f_k(g_k \circ T^n)]| + |E[f_k(g_k \circ T^n)] - E[f_k]E[g_k]| + |E[f_k]E[g_k] - E[f]E[g]|.$$
Since T preserves P , then for any n,
$$|E[f(g \circ T^n)] - E[f_k(g_k \circ T^n)]| = |E[(f - f_k)(g_k \circ T^n)] + E[f(g \circ T^n - g_k \circ T^n)]|$$
$$\le E[(f - f_k)^2]^{1/2} E[(g_k \circ T^n)^2]^{1/2} + E[f^2]^{1/2} E[(g \circ T^n - g_k \circ T^n)^2]^{1/2}$$
$$= E[(f - f_k)^2]^{1/2} E[g_k^2]^{1/2} + E[f^2]^{1/2} E[(g - g_k)^2]^{1/2},$$
so that for any ε and n, there exists K, such that for all k ≥ K, both terms $|E[f(g \circ T^n)] - E[f_k(g_k \circ T^n)]|$ and $|E[f_k]E[g_k] - E[f]E[g]|$ are less than ε. The result is then a consequence of the fact that for any k, the term $|E[f_k(g_k \circ T^n)] - E[f_k]E[g_k]|$ converges towards 0 as n → ∞.
Now let $g \in L^2(\mathcal{J})$, then $g \circ T^n = g$ and (1.7.a) yields E[f g] = E[f ]E[g], for any $f \in L^2(\mathcal{F})$. This implies g = E[g], hence $\mathcal{J}$ is trivial.
2. Let $H = \cup_{k \ge 0} L^2(\Omega, \mathcal{F}_k, P)$, then from Exercise 1.5, H is dense in $L^2(\Omega, \mathcal{F}, P)$. Now we prove that H satisfies property (a). Let g ∈ H, then there exists k ∈ IN such that $g \in L^2(\Omega, \mathcal{F}_k, P)$. Moreover, from (c), $((T^n)^{-1}(\mathcal{F}_k))_{n \ge 0}$ is a decreasing sequence of σ-fields. Let f ∈ H, then from (d) and Exercise 1.5, $E[f \mid (T^n)^{-1}(\mathcal{F}_k)] \xrightarrow[n\to\infty]{} E[f]$, in $L^2(\Omega, \mathcal{F}, P)$. Put $f_n = E[f \mid (T^n)^{-1}(\mathcal{F}_k)]$ and $g_n = g \circ T^n$, then we have,
$$|E[f_n g_n] - E[f]E[g_n]| \le \|f_n - E[f]\|_{L^2} \|g_n\|_{L^2}$$
and since $E[g_n] = E[g]$ and $\|g \circ T^n\|_{L^2} = \|g\|_{L^2}$, one has the required convergence.
Solution to Exercise 1.8
1. By (ii) and the invariance of P under T , we have: $E[fg] = E[(f \circ T) \cdot (g \circ T)] = E[(f \circ T) \cdot g]$, and thus (ii) implies (i). Suppose that (i) holds, then by applying this property to f , $f \circ T$, $f \circ T^2$, ..., $f \circ T^n$, successively, we get: $E[fg] = E[(f \circ T^n) \cdot g]$ for every $n \in \mathbb{N}^*$, hence,
$$E[fg] = E\left[\frac{1}{n} \sum_{p=1}^{n} f \circ T^p \cdot g\right].$$
Since $f \in L^\infty$, we can apply Lebesgue’s theorem of dominated convergence together with the Ergodic Theorem to get
$$\lim_{n\to\infty} E\left[\frac{1}{n} \sum_{p=1}^{n} f \circ T^p \cdot g\right] = E\bigl[E[f \mid \mathcal{J}]\,g\bigr].$$
Consequently, one has $E[fg] = E[E[f \mid \mathcal{J}]\,g]$, for every $f \in L^\infty$. This identity is equivalent to
$$E[fg] = E\bigl[E[f \mid \mathcal{J}]\,E[g \mid \mathcal{J}]\bigr] = E\bigl[f\,E[g \mid \mathcal{J}]\bigr],$$
for every $f \in L^\infty$, which implies that $g = E[g \mid \mathcal{J}]$, a.s. This last statement is equivalent to $g = g \circ T$, a.s.
Solution to Exercise 1.9
1. Call L the vector space generated by 1 and Φ. First, assume that L is dense in $L^1(\Omega, \mathcal{F}, P)$ and that P = αP1 + (1 − α)P2 , for α ∈ (0, 1) and $P_1, P_2 \in M_{\Phi,c}$. We easily derive from the previous relation that L is dense in $L^1(\Omega, \mathcal{F}, P_i)$, i = 1, 2. Moreover, it is clear by (b) that P1 and P2 agree on L, hence it follows that P1 = P2 .
Conversely, assume that L is not dense in $L^1(\Omega, \mathcal{F}, P)$. Then from the Hahn–Banach theorem, there exists $g \in L^\infty(\Omega, \mathcal{F}, P)$ with $P(g \neq 0) > 0$, such that $\int g f\, dP = 0$, for every f ∈ L. We may assume that $\|g\|_\infty \le 1/2$, then put P1 = (1 − g)P and P2 = (1 + g)P . Clearly, P1 and P2 belong to $M_{\Phi,c}$ and we have $P = \frac{1}{2}(P_1 + P_2)$ but these probabilities are not equal to P since $P(g \neq 0) > 0$.
2. (i) From question 1, the vector space $\tilde{\Phi}$ generated by 1 and Φ is dense in $L^1(\Omega, \mathcal{F}, P)$. Since $\frac{dQ}{dP}$ is bounded, $\tilde{\Phi}$ is also dense in $L^1(\Omega, \mathcal{F}, Q)$.
(ii) Under the hypothesis, $L^1(\Omega, \mathcal{F}, Q)$ and $L^1(\Omega, \mathcal{F}, P)$ are identical.
3. This study is a particular moments problem with
$$\Phi \overset{\text{(def)}}{=} \{f - f \circ T ;\ f \in b(\Omega, \mathcal{F})\},$$
and the constants $c_f = 0$. So, P ∈ MT is extremal if and only if Φ ∪ {1} spans a dense space in $L^1(P)$, or equivalently the only functions g in $L^\infty(\Omega, \mathcal{F}, P)$ such that:
$$\text{for all } f \in b(\Omega, \mathcal{F}), \qquad E[fg] = E[(f \circ T)g] \qquad (1.9.a)$$
are the constants. But, in Exercise 1.8, we proved that (1.9.a) is equivalent to the fact that g is almost T -invariant; thus P ∈ MT is extremal if and only if $\mathcal{J}$ is P -trivial, that is T is ergodic under P .
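The perturbation argument of question 1 (P1 = (1 − g)P and P2 = (1 + g)P) can be carried out by hand on a three-point space. The sketch below is a toy instance under assumptions that are not in the text: Ω = {0, 1, 2}, P uniform, a single constraint function φ(ω) = ω with c = 1, and a hand-picked g orthogonal to 1 and φ. Here span(1, φ) has dimension 2 in the 3-dimensional space L1(P), so it is not dense and P should not be extremal.

```python
from fractions import Fraction as F

# Hypothetical toy model: Omega = {0, 1, 2}, P uniform, one constraint phi(w) = w.
omega = [0, 1, 2]
P = [F(1, 3), F(1, 3), F(1, 3)]
phi = [F(0), F(1), F(2)]

# g is bounded by 1/2, not identically zero, and orthogonal (under P) to 1 and phi.
g = [F(1, 4), F(-1, 2), F(1, 4)]

def integral(h, Q):
    """Integral of the function h against the measure Q on omega."""
    return sum(hw * qw for hw, qw in zip(h, Q))

def times(h, k):
    """Pointwise product of two functions on omega."""
    return [hw * kw for hw, kw in zip(h, k)]

# The two perturbed measures used in the solution.
P1 = [(1 - gw) * pw for gw, pw in zip(g, P)]
P2 = [(1 + gw) * pw for gw, pw in zip(g, P)]

print("orthogonality:", integral(g, P), integral(times(g, phi), P))
print("total masses:", sum(P1), sum(P2))
print("constraint values:", integral(phi, P), integral(phi, P1), integral(phi, P2))
print("P is the midpoint:", all(p == (p1 + p2) / 2 for p, p1, p2 in zip(P, P1, P2)))
```

Both perturbed measures are probabilities with the same mean as P and differ from P, which exhibits P as a non-trivial convex combination inside the constraint set.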
Solution to Exercise 1.10
1. The density of $X_{\sigma^2}$ is readily obtained as:
$$\frac{1}{y\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(\log y)^2}{2\sigma^2}\right), \qquad y > 0.$$
2. We obtain the equality (1.10.1) by applying the formula
$$E[\exp(zN_{\sigma^2})] = \exp\left(\frac{z^2\sigma^2}{2}\right)$$
(which is valid for every $z \in \mathbb{C}$) with z = n + iρ, $\rho = \frac{p\pi}{\sigma^2}$ and $n, p \in \mathbb{Z}$.
3. Let for instance μ be the law of $\exp(N_{\sigma^2})$ under the probability measure
$$\left(1 + \sum_p c_p \sin\left(\frac{p\pi}{\sigma^2} N_{\sigma^2}\right)\right) dP,$$
where $(c_p)$ is any sequence of reals such that $\sum_p |c_p| \le 1$, then question 2 asserts that (i) is satisfied. Moreover, (ii) is trivially satisfied.
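Identity (1.10.1) and the moment formula can be checked numerically. The sketch below is a rough illustration, not part of the solution: it assumes σ² = 1 and approximates Gaussian expectations with a plain trapezoidal rule on a wide interval; the helper names and the integration parameters are hypothetical choices.

```python
import math

SIGMA2 = 1.0  # hypothetical choice of the variance sigma^2

def gaussian_expectation(h, half_width=25.0, step=0.005):
    """Trapezoidal approximation of E[h(N)], N centred Gaussian with variance SIGMA2."""
    n_steps = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n_steps + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n_steps) else 1.0
        density = math.exp(-x * x / (2 * SIGMA2)) / math.sqrt(2 * math.pi * SIGMA2)
        total += weight * h(x) * density
    return total * step

def moment(n):
    """E[X^n] for X = exp(N), i.e. E[exp(n N)]."""
    return gaussian_expectation(lambda x: math.exp(n * x))

def perturbation(n, p):
    """E[X^n sin(p*pi*N / sigma^2)], the left-hand side of (1.10.1)."""
    return gaussian_expectation(
        lambda x: math.exp(n * x) * math.sin(p * math.pi * x / SIGMA2))

for n in (-2, 0, 1, 3):
    print(n, moment(n), math.exp(n * n * SIGMA2 / 2), perturbation(n, 1))
```

For each n the first two printed columns should agree closely and the last one should be numerically negligible, which is what makes the perturbed laws of question 3 share all integer moments with the log normal law.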
Solution to Exercise 1.11
1. From the hypothesis, we deduce that $E[(X - Y)^2] = E[X^2] - E[Y^2] = 0$.
2. We first note that $E[X \wedge a \mid \mathcal{G}] \le Y \wedge a$, but since X ∧ a and Y ∧ a have the same law, this inequality is in fact an equality. Likewise, we obtain (1.11.1). The same argument as for question 1 now yields:
$$(X \wedge a) \vee (-b) = (Y \wedge a) \vee (-b) \qquad \text{a.s.},$$
and finally, letting a and b tend to +∞, we obtain X = Y , a.s.
3. We easily reduce the proof to the case where X and Y are bounded. Then the hypothesis implies simultaneously that X and Y have the same law, and E[X | G] = Y , so that we can apply the above result.
4. Under the hypothesis (i) of Exercise 1.8, we deduce $E[g \mid T^{-1}(\mathcal{F})] = g \circ T$. Hence the above result yields g = g ◦ T , a.s.
5. It is easily deduced from the hypothesis that the identity (1.11.1) is satisfied, hence X = Y , a.s.
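The one-line argument of question 1 hides a short computation, which may be spelled out as follows. This is a sketch of the standard argument; it only uses that Y = E[X | G] is G-measurable, that Y is square integrable because it has the same law as X, and that X and Y therefore have the same second moment.

```latex
\begin{align*}
E[XY] &= E\bigl[E[XY \mid \mathcal{G}]\bigr]
       = E\bigl[Y\,E[X \mid \mathcal{G}]\bigr]
       = E[Y^2], \\
E[(X-Y)^2] &= E[X^2] - 2E[XY] + E[Y^2]
            = E[X^2] - E[Y^2]
            = 0 .
\end{align*}
```

The last equality holds because X and Y have the same law, and it gives X = Y a.s., so that X is G-measurable up to a negligible set.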
Solution to Exercise 1.12
1. The main difficulty in this question lies in the fact that we cannot define “log X” and “log Z”, since X and Z may actually be zero on non-negligible sets. We rewrite the hypothesis $XY \overset{\text{(law)}}{=} ZY$ trivially as
$$1_{\{X>0\}} XY \overset{\text{(law)}}{=} 1_{\{Z>0\}} ZY,$$
and since P (Y = 0) = 0, we can write, for any $\lambda \neq 0$:
$$E\bigl[1_{\{X>0\}} \exp(i\lambda(\log X + \log Y))\bigr] = E\bigl[1_{\{Z>0\}} \exp(i\lambda(\log Z + \log Y))\bigr].$$
From the independence hypothesis, we obtain
$$E\bigl[1_{\{X>0\}} \exp(i\lambda \log X)\bigr]\, E[\exp(i\lambda \log Y)] = E\bigl[1_{\{Z>0\}} \exp(i\lambda \log Z)\bigr]\, E[\exp(i\lambda \log Y)],$$
from which we easily deduce $X \overset{\text{(law)}}{=} Z$.
2. Applying the Fourier inverse transform, we can check that the characteristic function of the density $f(x) = \frac{1}{\pi}\frac{1 - \cos x}{x^2}$ is given by
$$\varphi(t) = \begin{cases} 1 - |t| & \text{for } |t| \le 1, \\ 0 & \text{for } |t| > 1, \end{cases}$$
and that the characteristic function of the density $g(x) = \frac{1}{\pi}\frac{(1 - \cos x)(1 + \cos 2x)}{x^2}$ is $\psi(t) = \varphi(t) + \frac{1}{2}(\varphi(t - 2) + \varphi(t + 2))$, for all t ∈ IR. Let X and Y be r.v.’s with values in IR+ \ {0} such that log X has density g and log Y has density f . Let Z be an independent copy of Y , then the equation $\psi(t)\varphi(t) = \varphi^2(t)$, for all t ∈ IR, ensures that $XY \overset{\text{(law)}}{=} ZY$. Nonetheless, X and Z have different laws, so Y is not a simplifiable variable.
3. Assume $A \overset{\text{(law)}}{=} YC$, then, we have: $Y \overset{\text{(law)}}{=} YCB$, but since Y is simplifiable, we deduce: $1 \overset{\text{(law)}}{=} CB$, hence CB = 1, a.s. This is impossible since C and B are assumed to be independent, and B is not constant.
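The key identity of question 2, ψ(t)ϕ(t) = ϕ²(t) for all t although ψ ≠ ϕ, is easy to confirm on a grid. The sketch below simply codes the two characteristic functions given in the solution; the grid and the variable names are arbitrary choices made for the illustration.

```python
def phi(t):
    """Triangular characteristic function of the density (1 - cos x) / (pi x^2)."""
    return max(0.0, 1.0 - abs(t))

def psi(t):
    """Characteristic function of the density (1 - cos x)(1 + cos 2x) / (pi x^2)."""
    return phi(t) + 0.5 * (phi(t - 2) + phi(t + 2))

grid = [k / 100 for k in range(-500, 501)]   # t in [-5, 5]

# psi * phi and phi^2 coincide: phi vanishes wherever the two shifted triangles live.
max_gap = max(abs(psi(t) * phi(t) - phi(t) ** 2) for t in grid)

# psi and phi themselves differ, e.g. around t = 2 or t = -2.
differs = any(abs(psi(t) - phi(t)) > 1e-9 for t in grid)

print("max |psi*phi - phi^2| on the grid:", max_gap)
print("psi differs from phi somewhere:", differs)
```

The identity holds because the supports of ϕ and of its translates by ±2 overlap only at the points t = ±1, where ϕ vanishes.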
Solution to Exercise 1.13
We deduce from the hypothesis that: $E[X^s] = E[Z^s]$, for 0 ≤ s ≤ ε. Thus, the laws of X and Z have the same Mellin transforms on [0, ε], hence these laws are equal (see the comments at the end of the statement of this exercise).
A relevant reference: H. Georgii [68] makes use of a number of the arguments employed throughout our exercises, especially in the present Chapter 1. Thus, as further reading, it may be interesting to look at the discussions in [68] related to extremal Gibbs measures.
Chapter 2
Independence and conditioning
“Philosophy” of this chapter
(a) A probabilistic model {(Ω, F, P ); (Xi )i∈I } consists of setting together in a mathematical way different sources of randomness, i.e. the r.v.s (Xi )i∈I usually have some complicated joint distribution. It is always a simplification, and thus a progress, to replace this “linked” family by an “equivalent” family (Yj )j∈J of independent random variables, where by equivalence we mean the equality of their σ-fields: σ(Xi , i ∈ I) = σ(Yj , j ∈ J) up to negligible sets.
(b) Assume that the set of indices I splits into I1 + I2 , and that we know the outcomes {Xi (ω); i ∈ I1 }. This modifies deeply our perception of the randomness of the system, which is now reduced to understanding the conditional law of (Xi )i∈I2 , given (Xi )i∈I1 . This is the main theme of D. Williams’ book [64].
(c) Again, it is of great interest, even after this conditioning with respect to (Xi )i∈I1 , to be able to replace the family (Xi )i∈I2 by an “equivalent” family (Yj )j∈J2 , which consists of independent variables, conditionally on (Xi )i∈I1 . Note that the terms “independence” and “conditioning” come from our everyday language and are very suitable as translations of the corresponding probabilistic concepts. However, some of our exercises aim at pointing out some traps which may originate from this common language meaning.
(d) The Markov property (in a general framework) asserts the conditional independence of the “past” and “future” σ-fields given the “present” σ-field. It provides an unending source of questions closely related to the topic of this chapter. The elementary articles
F.B. Knight: A remark on Markovian germ fields. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 15, 291–296 (1970)
K.L. Chung: Some universal field equations. Séminaire de Probabilités, VI, pp. 90–97. Lecture Notes in Math., Vol. 258, Springer, Berlin, 1972.
give the flavor of such studies.
*
2.1 Independence does not imply measurability with respect to an independent complement
1. Assume that, on a probability space (Ω, F, P ), there exist a symmetric Bernoulli variable ε (that is, ε satisfies: P (ε = +1) = P (ε = −1) = 1/2), and an r.v. X, which are independent. Show that εX and ε are independent iff X is symmetric (that is: $X \overset{\text{(law)}}{=} -X$).
2. Construct, on an adequate probability space (Ω, F, P ), two independent σ-fields A and B, and an r.v. Y such that:
(i) Y is A ∨ B-measurable;
(ii) Y is independent of B;
(iii) Y is not measurable with respect to A.
3. Construct, on an adequate probability space (Ω, F, P ), two independent σ-fields A and B, and a non-constant r.v. Z such that:
(j) Z is independent of A;
(jj) Z is independent of B;
(jjj) Z is A ∨ B-measurable.
4. Let G be a Gaussian subspace of $L^2(\Omega, \mathcal{F}, P)$ which admits the direct sum decomposition: G = G1 ⊕ G2 . (Have a brief look at Chapter 3, if necessary . . . ) Define A = σ(G1 ), and B = σ(G2 ).
(a) Show that there is no variable Y ∈ G, Y ≠ 0, such that the hypotheses (i)–(ii)–(iii) are satisfied.
(b) Show that there is no variable Z ∈ G, Z ≠ 0, such that the hypotheses (j)–(jj)–(jjj) are satisfied.
Comments and references: This exercise is an invitation to study the notion and properties (starting from the existence) of an independent complement to a given sub-σ-field G in a probability space (Ω, F, P ). We refer the reader to the famous article
V.A. Rohlin: On the fundamental ideas of measure theory. Mat. Sbornik N.S., 25 (67), 107–150 (1949).
(Be aware that this article is written in a language which is closer to Ergodic Theory than to Probability Theory.)
*
2.2 Complement to Exercise 2.1: further statements of independence versus measurability
Consider, on a probability space (Ω, F, P ), three sub-σ-fields A, B, C, which are (F, P ) complete. Assume that: (i) A ⊆ B ∨ C, and (ii) A and C are independent.
1. Show that if the hypotheses (i) A ⊆ B ∨ C, and (ii)’ A ∨ B is independent of C are satisfied, then A is included in B.
2. Show that, if (ii)’ is not satisfied, it is not always true that A is included in B.
3. Show that if, besides (i) and (ii), the property (iii): B ⊆ A is satisfied, then: A = B.
**
2.3 Independence and mutual absolute continuity
Let (Ω, F, P) be a probability space and G be a sub-σ-field of F.

1. Let Γ ∈ F. Prove that the following properties are equivalent:
(i) Γ is independent of G under P,
(ii) for every probability Q on (Ω, F), equivalent to P, with dQ/dP G-measurable, Q(Γ) = P(Γ).

2. Let Q be a probability on (Ω, F) which is equivalent to P and consider the following properties:
(j) dQ/dP is G-measurable,
(jj) for every set Γ ∈ F independent of G under P, Q(Γ) = P(Γ).
Prove that (j) implies (jj).

3. Prove that in general, (jj) does not imply (j).
Hint: Let G = {∅, Ω, {a}, {a}^c}, assuming that {a} ∈ F, and P({a}) > 0. Show that if X, F-measurable, is independent from G, then X is constant. Prove that if G ⊂ F, with G ≠ F, then there exists Q which satisfies (jj), but not (j).
Comments and references:

(a) This exercise emphasizes the difficulty of characterizing the set of events which are independent of a given σ-field. Another way to ask question 2 is: “Does the set of events which are independent of G characterize G?” Corollary 4 of the following paper is closely related to our problem:
M. Émery and W. Schachermayer: On Vershik’s standardness criterion and Tsirelson’s notion of cosiness. Séminaire de Probabilités, XXXV, Lecture Notes in Mathematics, 1755, 265–305, Springer, Berlin, 2001.

(b) It is tempting to think that if G has no atoms, then (jj) implies (j). But this is not true: M. Émery kindly gave us some examples of σ-fields G without atoms, with G strictly included in F, and such that any event in F independent from G is trivial.

*
2.4 Size-biased sampling and conditional laws
Let X_1, . . . , X_n be n independent, equidistributed r.v.s which are a.s. strictly positive. Define S_n = X_1 + X_2 + · · · + X_n, and assume n ≥ 3. Let, moreover, J be an r.v. taking values in {1, 2, . . . , n} such that:
$$P(J = j \mid X_1, \dots, X_n) = X_j / S_n .$$
We define the (n − 1)-dimensional r.v. $X^*_{(n-1)} = (X^*_1, \dots, X^*_{n-1})$ as follows:
$$X^*_i = \begin{cases} X_i, & \text{if } i < J \\ X_{i+1}, & \text{if } i \ge J \end{cases} \qquad (i \le n-1).$$
Show that, given $S^*_{n-1} \equiv X^*_1 + \cdots + X^*_{n-1}$, the r.v.s $X^*_{(n-1)}$ and $X_J$ are independent, and that, moreover, the conditional law of $X^*_{(n-1)}$, given $S^*_{n-1} = s$, is identical to the conditional law of $X_{(n-1)} = (X_1, \dots, X_{n-1})$, given $S_{n-1} = s$.
Comments and references: This result is the first step in proving the inhomogeneous Markov property for $\big(S_n - \sum_{j=1}^{m} \tilde X_j,\ m = 1, 2, \dots, n\big)$, where $(\tilde X_1, \dots, \tilde X_n)$ is a size-biased permutation of (X_1, . . . , X_n), see, e.g., p. 22 in
M. Perman, J. Pitman and M. Yor: Size-biased sampling of Poisson point processes and excursions. Probab. Theory and Related Fields, 92, no. 1, 21–39 (1992)
for the definition of such a random permutation. See also
L. Gordon: Estimation for large successive samples with unknown inclusion probabilities. Adv. in Appl. Math. 14, no. 1, 89–122 (1993) and papers cited there for more on size-biased sampling of independent and identically distributed sequences, as well as Lemma 10 and Proposition 11 in J. Pitman: Partition structures derived from Brownian motion and stable subordinators. Bernoulli 3, no. 1, 79–96 (1997) for some related results. We thank J. Pitman for his suggestions about this exercise.
2.5 Think twice before exchanging the order of taking the supremum and intersection of σ-fields!
**
Let (Ω, F, P) be a probability space, C a sub-σ-field of F, and (D_n)_{n∈ℕ} a decreasing sequence of sub-σ-fields of F. C and D_n, for every n, are assumed to be (F, P) complete.

1. Prove that if, for every n, C and D_1 are conditionally independent given D_n, then
$$\bigcap_n (\mathcal C \vee \mathcal D_n) = \mathcal C \vee \Big(\bigcap_n \mathcal D_n\Big) \tag{2.5.1}$$
holds.

2. If there exists a sub-σ-field E_n of D_n such that C and D_1 are conditionally independent given E_n, then C and D_1 are conditionally independent given D_n. Consequently, if C and D_1 are independent, then (2.5.1) holds.

3. The sequence (D_n)_{n∈ℕ} and the σ-field C are said to be asymptotically independent if, for every bounded F-measurable r.v. X and every bounded C-measurable r.v. C, one has:
$$E[E(X \mid \mathcal D_n)\, C] \xrightarrow[n \to \infty]{} E(X)E(C). \tag{2.5.2}$$
Prove that the condition (2.5.2) holds iff $\bigcap_n \mathcal D_n$ and C are independent.

4. Let Y_0, Y_1, . . . be independent symmetric Bernoulli r.v.s. For n ∈ ℕ, define X_n = Y_0 Y_1 . . . Y_n and set C = σ(Y_1, Y_2, . . .), D_n = σ(X_k : k > n). Prove that (2.5.1) fails in this particular case.
Hint: Prove that ∩_n (C ∨ D_n) = C ∨ σ(Y_0); but C ∨ (∩_n D_n) = C, since ∩_n D_n is trivial.
Comments and references:

(a) The need to determine germ σ-fields $\mathcal G \overset{(def)}{=} \bigcap_n \mathcal D_n$ occurs very naturally in many problems in Probability Theory, often leading to 0–1 laws, i.e. G is trivial.

(b) A number of authors (including the present authors, separately!!) gave wrong proofs of (2.5.1) under various hypotheses. This seems to be one of the worst traps involving σ-fields.

(c) A necessary and sufficient criterion for (2.5.1) to hold is presented in:
H. von Weizsäcker: Exchanging the order of taking suprema and countable intersection of σ-algebras. Ann. I.H.P., 19, 91–100 (1983).
However, this criterion is very difficult to apply in any given set-up.

(d) The conditional independence hypothesis made in question 1 above is presented in:
T. Lindvall and L.C.G. Rogers: Coupling of multidimensional diffusions by reflection. Ann. Prob., 14, 860–872 (1986).

(e) The following papers discuss, in the framework of a stochastic equation, instances where (2.5.1) may hold or fail.
M. Yor: De nouveaux résultats sur l’équation de Tsirelson. C. R. Acad. Sci. Paris Sér. I Math., 309, no. 7, 511–514 (1989).
M. Yor: Tsirelson’s equation in discrete time. Probab. Theory and Related Fields, 91, no. 2, 135–152 (1992).

(f) A simpler question than the one studied in the present exercise is whether the following σ-fields are equal:
$$(\mathcal A_1 \vee \mathcal A_2) \cap \mathcal A_3 \quad \text{and} \quad (\mathcal A_1 \cap \mathcal A_3) \vee (\mathcal A_2 \cap \mathcal A_3) \tag{2.5.3}$$
$$(\mathcal A_1 \cap \mathcal A_2) \vee \mathcal A_3 \quad \text{and} \quad (\mathcal A_1 \vee \mathcal A_3) \cap (\mathcal A_2 \vee \mathcal A_3). \tag{2.5.4}$$
With the help of the Bernoulli variables ε_1, ε_2, ε_3 = ε_1 ε_2 and the σ-fields A_i = σ(ε_i), i = 1, 2, 3, already considered in Exercise 2.2, one sees that the σ-fields in (2.5.3) and (2.5.4) may not be equal.

* 2.6 Exchangeability and conditional independence: de Finetti’s theorem

A sequence of random variables (X_n)_{n≥1} is said to be exchangeable if for any permutation σ of the set {1, 2, . . .}
$$(X_1, X_2, \dots) \overset{(law)}{=} (X_{\sigma(1)}, X_{\sigma(2)}, \dots).$$
Let (X_n)_{n≥1} be such a sequence and G be its tail σ-field, i.e. G = ∩_n G_n, with G_n = σ{X_n, X_{n+1}, . . .}, for n ≥ 1.

1. Show that for any bounded Borel function Φ,
$$E[\Phi(X_1) \mid \mathcal G] \overset{(law)}{=} E[\Phi(X_1) \mid \mathcal G_2].$$

2. Show that the above identity actually holds almost surely.

3. Show that the r.v.s X_1, X_2, . . . are conditionally independent given the tail σ-field G.

Comments and references:

(a) The result proved in this exercise is the famous de Finetti’s Theorem, see e.g.
B. De Finetti: La prévision : ses lois logiques, ses sources subjectives. Ann. Inst. H. Poincaré, 7, 1–66 (1937).
It essentially says that any sequence of exchangeable random variables is a “mixture” of i.i.d. random variables.
D.J. Aldous: Exchangeability and related topics. École d’été de probabilités de Saint-Flour, XIII–1983, 1–198, Lecture Notes in Mathematics, 1117, Springer, Berlin, 1985.
O. Kallenberg: Foundations of Modern Probability. Second edition. Springer-Verlag, New York, 2002. (See Theorem 11.10, p. 212.)
See also P.A. Meyer’s discussion in [40] of the Hewitt-Savage theorem.

(b) Question 2 is closely related to Exercise 1.4, and/or to Exercise 1.11.

(c) De Finetti’s theorem extends to the continuous time setting in the following form: any càdlàg exchangeable process is a mixture of Lévy processes. See Proposition 10.5 of Aldous’ course cited above. See also Exercise 6.19 for some applications.

*
2.7 Too much independence implies constancy
Let (Ω, F, P) be a probability space on which two real valued r.v.s X and Y are defined. The aim of questions 1, 2, 3 and 4 is to show that the property:
$$\text{(I)} \qquad \begin{cases} (X - Y) \text{ and } X \text{ are independent} \\ (X - Y) \text{ and } Y \text{ are independent} \end{cases}$$
can only be satisfied if X − Y is a.s. constant.
1. Prove the result when X and Y have a second moment.

In the following, we make no integrability assumption on either X or Y.

2. Let ϕ (resp. H) be the characteristic function of X (resp. X − Y). Show that, if (I) is satisfied, then the identity
$$\varphi(x)\left(1 - |H(x)|^2\right) = 0, \quad \text{for every } x \in \mathbb{R}, \tag{2.7.1}$$
holds.

3. Show that if (2.7.1) is satisfied, then:
$$|H(x)| = 1, \quad \text{for } |x| < \varepsilon, \text{ and } \varepsilon > 0 \text{ sufficiently small}. \tag{2.7.2}$$

4. Show that if (2.7.2) is satisfied, then X − Y is a.s. constant.

In the same vein, we now discuss how much constraint may be put on the conditional laws of either of the components of a two dimensional r.v., given the other.

5. Is it possible to construct a pair of r.v.s (X, Y) which satisfy the following property, for all x, y ∈ ℝ:
$$\text{(J)} \qquad \begin{cases} \text{conditionally on } X = x,\ Y \text{ is distributed as } N(x, 1) \\ \text{conditionally on } Y = y,\ X \text{ is distributed as } N(y, 1) \end{cases} \ ?$$
(Here, and in the sequel, N(a, b) denotes a Gaussian variable with mean a and variance b.)

6. Prove the existence of a pair (X, Y) of r.v.s such that
$$\text{(K)} \qquad \begin{cases} X \text{ is distributed as } N(a, \sigma^2) \\ \text{conditionally on } X = x,\ Y \text{ is distributed as } N(x, 1). \end{cases}$$
Compute explicitly the joint law of (X, Y). Compute the law of X, conditionally on Y = y.

Comments and references: See Exercise 3.11 for a unification of questions 5 and 6.

*
2.8 A double paradoxical inequality
Give an example of a pair of random variables X, Y taking values in (0, ∞) such that: (i) E[X | Y ] < ∞ , E[Y | X] < ∞ , a.s., (ii) E[X | Y ] > Y , E[Y | X] > X , a.s.
Hint: Assume that $P\big(\tfrac{Y}{X} \in \{\tfrac12, 2\}\big) = 1$, and P(X ∈ {1, 2, 4, . . .}) = 1. More precisely, denote $p_n = P(X = 2^n, Y = 2^{n-1})$, $q_n = P(X = 2^n, Y = 2^{n+1})$ and find necessary and sufficient conditions on (p_n), (q_n) for (i) and (ii) to be satisfied. Finally, find for which values of a the preceding discussion applies when $p_n = q_n = (\text{const.})\, a^n$.
Comments and references: That (i) and (ii) may be realized for a pair of nonintegrable variables is hinted at in D. Williams ([64], p.401), but this exercise has been suggested to us by B. Tsirel’son. For another variant, see Exercise (33.2) on p. 62 in D. Williams [62]. See also question 5 of Exercise 1.11 for a related result. *
2.9 Euler’s formula for primes and probability
Let N denote the set of positive integers n, n ≠ 0, and P the set of prime numbers (1 does not belong to P). We write: a|b if a divides b.

To a real number s > 1, we associate $\zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^s}$ and we define the probability P_s on N by the formula:
$$P_s(\{n\}) = \frac{1}{\zeta(s)\, n^s}.$$

1. Define, for any p ∈ P, the random variables ρ_p by the formula: $\rho_p(n) = 1_{\{p \mid n\}}$. Show that the r.v.s (ρ_p, p ∈ P) are independent under P_s, and prove Euler’s identity:
$$\frac{1}{\zeta(s)} = \prod_{p \in \mathcal P} \left(1 - \frac{1}{p^s}\right).$$
Hint: $\frac{1}{\zeta(s)} = P_s(\{1\})$.

2. We write the decomposition of n ∈ N as a product of powers of prime numbers, as follows:
$$n = \prod_{p \in \mathcal P} p^{\alpha_p(n)},$$
thereby defining the r.v.s (α_p, p ∈ P). Prove that, for any p ∈ P, the variable α_p is geometrically distributed, with parameter q, that is:
$$P_s(\alpha_p = k) = q^k (1 - q) \qquad (k \in \mathbb{N}).$$
Compute q. Prove that the variables (α_p, p ∈ P) are independent.
Comments and references: This is a very well-known exercise involving probabilities on the integers; for a number of variations on this theme, see for example, the following.
P. Diaconis and L. Smith: Honest Bernoulli excursions. J. App. Prob., 25, 464–477 (1988).
S.W. Golomb: A class of probability distributions on the integers. J. Number Theory, 2, 189–192 (1970).
M. Kac: Statistical independence in probability, analysis and number theory. The Carus Mathematical Monographs, No. 12. Published by the Mathematical Association of America. Distributed by John Wiley and Sons, Inc., New York, xiv+93 pp. (1959).
Ph. Nanopoulos: Loi de Dirichlet sur ℕ* et pseudo-probabilités. C. R. Acad. Sci. Paris Sér. A-B, 280, no. 22, Aiii, A1543–A1546 (1975).
M. Schroeder: Number Theory in Science and Communications. Springer Series in Information Sciences. Springer, 1986.
G.D. Lin and C.-Y. Hu: The Riemann zeta distribution. Bernoulli, 7, no. 5, 817–828 (2001).
This exercise is also discussed in D. Williams [64].

*
2.10 The probability, for integers, of being relatively prime
Let (Aj )j≤m be a finite sequence of measurable events of a probability space (Ω, F, P ) and define ρk = 1≤i1 0.

4. Assume that the equivalent properties stated in question 1 hold. If ν(dt) is a positive σ-finite measure on ℝ_+, we define:
$$L_\nu(\ell) = \int_{\mathbb{R}_+} \nu(dt)\, e^{-t\ell} \quad \text{and} \quad S_\nu(m) = \int_{\mathbb{R}_+} \nu(dt)\, \frac{m}{1 + tm}.$$
Then, prove that:
$$E[X L_\nu(L)] = S_\nu(E[X]).$$
* 2.18 Random variables with independent fractional and integer parts

Let Z be a standard exponential variable, i.e. $P(Z \in dt) = e^{-t}\,dt$. Define {Z} and [Z] to be respectively the fractional part and the integer part of Z.

1. Prove that {Z} and [Z] are independent, and compute their distributions explicitly.

2. Consider X a positive random variable whose law is absolutely continuous. Let P(X ∈ dt) = ϕ(t) dt. Find a density ϕ such that: (i) {X} and [X] are independent; (ii) {X} is uniformly distributed on [0, 1].
Hint: Use the previous question and make an adequate change of probability from $e^{-t}\,dt$ to ϕ(t) dt.

3. We make the same hypothesis and use the same notations as in question 2. Characterize the densities ϕ such that {X} and [X] are independent.
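Before attempting a proof of question 1 of Exercise 2.18, it can be reassuring to check the claimed independence numerically. The following is only a sketch: the helper names `cdf`, `joint`, `p_int` and `p_frac` are ours, and the law of {Z} is obtained by a truncated sum over the integer parts (the cut-off `k_max` is an assumption of the sketch, not part of the exercise).

```python
import math

def cdf(t):
    """Distribution function of a standard exponential variable."""
    return 1.0 - math.exp(-t) if t > 0 else 0.0

def joint(k, u):
    """P([Z] = k, {Z} <= u) = P(k <= Z <= k + u), for 0 <= u <= 1."""
    return cdf(k + u) - cdf(k)

def p_int(k):
    """P([Z] = k)."""
    return joint(k, 1.0)

def p_frac(u, k_max=60):
    """P({Z} <= u), summed over integer parts 0..k_max (truncated sum)."""
    return sum(joint(k, u) for k in range(k_max + 1))

# Largest discrepancy between the joint law and the product of the marginals
# over a small grid of values (k, u).
max_gap = max(abs(joint(k, u) - p_int(k) * p_frac(u))
              for k in range(10) for u in (0.1, 0.25, 0.5, 0.9))
```

A discrepancy `max_gap` at the level of floating point rounding is consistent with the independence to be proved; it is of course no substitute for the proof.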
2. Independence and conditioning – solutions
Solutions for Chapter 2
Solution to Exercise 2.1

1. Let f and g be two bounded measurable functions on (Ω, F, P). Then,
$$E(f(\varepsilon X) g(\varepsilon)) = \frac12 \left[ E(f(X) g(1)) + E(f(-X) g(-1)) \right]$$
and the variables εX and ε are independent if and only if
$$E(f(X)) g(1) + E(f(-X)) g(-1) = E(f(X)) (g(1) + g(-1))$$
for every f and g as above. If X is symmetric then this identity holds. Conversely, let g be such that g(1) = 0 and g(−1) ≠ 0, then the above identity shows that X is symmetric.

2. Let (Ω, F, P) be a probability space on which there exist A, B ∈ F such that A, B are independent and P(A) = P(B) = 1/2. Defining $X = 1_A - 1_{A^c}$, $\varepsilon = 1_B - 1_{B^c}$, A = σ(X) = {∅, Ω, A, A^c} and B = σ(ε) = {∅, Ω, B, B^c}, then A and B are independent sub-σ-fields of F. Moreover, the r.v.s X and ε are independent symmetric Bernoulli variables, so from question 1, Y = εX and ε are independent. Hence Y is A ∨ B-measurable and independent of B, but Y is not measurable with respect to A since σ(Y) = {∅, Ω, (A ∩ B) ∪ (A^c ∩ B^c), (A ∩ B^c) ∪ (A^c ∩ B)} and neither of the sets (A ∩ B) ∪ (A^c ∩ B^c) and (A ∩ B^c) ∪ (A^c ∩ B) can be equal to A.

3. Take Z = Y = εX, A, and B defined in the previous question. By applying question 1 to the pair of variables (εX, ε) and then to the pair of variables (εX, X), we obtain that Z (which is obviously A ∨ B-measurable) is both independent of A and independent of B.

4. (a) Let Y_1 ∈ G_1 and Y_2 ∈ G_2 be such that Y = Y_1 + Y_2 and assume that Y is independent of Y_2. Since Y and Y_2 are centred, the variables Y_1 + Y_2 and Y_2 are
orthogonal. Therefore, Y_2 = 0 a.s. and (iii) does not hold (we have supposed that Y ≠ 0).
(b) We answer the last point in the same manner: let Z_1 ∈ G_1 and Z_2 ∈ G_2 be such that Z = Z_1 + Z_2. If Z is independent of both Z_1 and Z_2 then Z_1 = 0 and Z_2 = 0 a.s.

Comments on the solution: Since any solution to question 3 also provides a solution to question 2, it is of some interest to modify question 2 as follows: (i) and (ii) are unchanged; (iii) is changed to (iii)’: Y is neither measurable with respect to A, nor independent from A. (Solution: take A = σ(G), B = σ(ε), where G is Gaussian, centred, ε is Bernoulli, independent from G; then Y = Gε solves the modified question 2.)
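The construction used in questions 2 and 3 of Exercise 2.1 lives on a four-point space, so it can be checked by brute force. The sketch below is ours and makes the following assumptions: Ω is taken to be {−1, +1}² with the uniform probability, an outcome is a pair (x, e) of values of X and ε, and the helper names `law`, `independent` and `is_function_of` are introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes (x, e): x is the value of X, e the value of epsilon.
omega = list(product([1, -1], repeat=2))
prob = {w: Fraction(1, 4) for w in omega}

def X(w): return w[0]
def eps(w): return w[1]
def Y(w): return w[0] * w[1]          # Y = epsilon * X

def law(*rvs):
    """Joint law of the given random variables: dict value-tuple -> probability."""
    out = {}
    for w in omega:
        key = tuple(rv(w) for rv in rvs)
        out[key] = out.get(key, Fraction(0)) + prob[w]
    return out

def independent(U, V):
    """Check P(U = u, V = v) = P(U = u) P(V = v) for all values u, v."""
    joint, lu, lv = law(U, V), law(U), law(V)
    return all(joint.get((u, v), Fraction(0)) == lu[(u,)] * lv[(v,)]
               for (u,) in lu for (v,) in lv)

def is_function_of(U, V):
    """True if U is sigma(V)-measurable, i.e. constant on each level set of V."""
    seen = {}
    for w in omega:
        if seen.setdefault(V(w), U(w)) != U(w):
            return False
    return True

y_indep_eps = independent(Y, eps)                                  # (ii) and (jj)
y_indep_x = independent(Y, X)                                      # (j)
y_meas_x = is_function_of(Y, X)                                    # negation of (iii)
y_meas_both = is_function_of(Y, lambda w: (X(w), eps(w)))          # (i) and (jjj)
```

On this space, `y_indep_eps`, `y_indep_x` and `y_meas_both` come out true while `y_meas_x` is false, in agreement with the solution above.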
Solution to Exercise 2.2 1. All variables considered below are assumed to be bounded (so that no integrability problem arises). Since A, B and C are complete, it suffices to prove that for every A-measurable r.v., X, one has E(X | B) = X, a.s. Let Y and Z be two r.v.s respectively B-measurable and C-measurable then
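$$\begin{aligned} E(E(X \mid \mathcal B \vee \mathcal C)\, YZ) &= E(XYZ) \\ &= E(XY)E(Z), \quad \text{by (ii)'}, \\ &= E(E(X \mid \mathcal B)\, Y)\, E(Z) \\ &= E(E(X \mid \mathcal B)\, YZ), \quad \text{since } \mathcal B \text{ and } \mathcal C \text{ are independent.} \end{aligned}$$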
Since the equality between the extreme terms holds for every pair of r.v.s Y and Z as above, the Monotone Class Theorem implies that E(X | B) = E(X | B ∨ C) a.s. But X is B ∨ C-measurable, by (i), thus: E(X | B) = X, a.s. 2. See question 2 of Exercise 2.1. 3. This follows from question 1 of the present exercise.
Solution to Exercise 2.3

1. (i) ⇒ (ii): Let Q be a probability measure which is equivalent to P. Since Γ is independent of G, and dQ/dP is G-measurable, then
$$Q(\Gamma) = E_P\left[\frac{dQ}{dP}\, 1_\Gamma\right] = E_P\left[\frac{dQ}{dP}\right] P(\Gamma) = P(\Gamma).$$
(ii) ⇒ (i): The property (ii) is clearly equivalent to: for every bounded G-measurable function φ, $E_P[\varphi 1_\Gamma] = E_P[\varphi] P(\Gamma)$, which amounts to Γ being independent of G.

2. (j) ⇒ (jj): This follows from (i) ⇒ (ii).

3. Suppose that X is independent of the event {a}, then for any bounded measurable function f: $E[f(X) 1_{\{a\}}] = E[f(X)] P(\{a\})$ but, almost surely, $f(X) 1_{\{a\}} = f(X(a)) 1_{\{a\}}$, so that $E[f(X) 1_{\{a\}}] = f(X(a)) P(\{a\})$, hence E[f(X)] = f(X(a)) for any f, bounded and measurable. This proves that X = X(a), a.s.
Solution to Exercise 2.4

Let f be a bounded Borel function defined on ℝ^{n−1} and g, h be two bounded real valued Borel functions, then:
$$E\big(f(X^*_{(n-1)})\, g(X_J)\, h(S^*_{n-1})\big) = E\big(E(f(X^*_{(n-1)})\, g(X_J) \mid S^*_{n-1})\, h(S^*_{n-1})\big). \tag{2.4.a}$$
On the other hand, put
$$X^{(j)}_{(n-1)} = (X_1, X_2, \dots, X_{j-1}, X_{j+1}, \dots, X_n), \qquad S^{(j)}_{n-1} = \sum_{1 \le i \le n,\ i \ne j} X_i,$$
then from the hypotheses:
$$\begin{aligned} E\big(f(X^*_{(n-1)})\, g(X_J)\, h(S^*_{n-1})\big) &= \sum_{j=1}^{n} E\big(f(X^{(j)}_{(n-1)})\, g(X_j)\, h(S^{(j)}_{n-1})\, 1_{\{J = j\}}\big) \\ &= \sum_{j=1}^{n} E\left(f(X^{(j)}_{(n-1)})\, g(X_j)\, h(S^{(j)}_{n-1})\, \frac{X_j}{S^{(j)}_{n-1} + X_j}\right). \end{aligned}$$
Since $X^{(j)}_{(n-1)}$ and $X_j$ are independent, then:
$$E\big(f(X^*_{(n-1)})\, g(X_J)\, h(S^*_{n-1})\big) = \sum_{j=1}^{n} E\left(E\big(f(X^{(j)}_{(n-1)}) \mid S^{(j)}_{n-1}\big)\, g(X_j)\, h(S^{(j)}_{n-1})\, \frac{X_j}{S^{(j)}_{n-1} + X_j}\right).$$
Put $E\big(f(X^{(j)}_{(n-1)}) \mid S^{(j)}_{n-1}\big) = k(S^{(j)}_{n-1})$, then from above,
$$E\big(f(X^*_{(n-1)})\, g(X_J)\, h(S^*_{n-1})\big) = E\big(k(S^*_{n-1})\, g(X_J)\, h(S^*_{n-1})\big).$$
This implies:
$$\begin{aligned} E\big(f(X^*_{(n-1)})\, g(X_J)\, h(S^*_{n-1})\big) &= E\big(k(S^*_{n-1})\, E(g(X_J) \mid S^*_{n-1})\, h(S^*_{n-1})\big) \\ &= E\big(E(f(X^*_{(n-1)}) \mid S^*_{n-1})\, E(g(X_J) \mid S^*_{n-1})\, h(S^*_{n-1})\big). \end{aligned} \tag{2.4.b}$$
Comparing (2.4.a) and (2.4.b), we have:
$$E\big(f(X^*_{(n-1)})\, g(X_J) \mid S^*_{n-1}\big) = E\big(f(X^*_{(n-1)}) \mid S^*_{n-1}\big)\, E\big(g(X_J) \mid S^*_{n-1}\big),$$
thus, given $S^*_{n-1}$, $X^*_{(n-1)}$ and $X_J$ are independent.

Now, putting $E\big(f(X^*_{(n-1)}) \mid S^*_{n-1}\big) = k_1(S^*_{n-1})$ and $E\big(f(X_{(n-1)}) \mid S_{n-1}\big) = k_2(S_{n-1})$, we shall show that $k_1(S_{n-1}) = k_2(S_{n-1})$ a.s. By the same arguments as above, we have:
$$E\big(f(X^*_{(n-1)})\, g(S^*_{n-1})\big) = \sum_{j=1}^{n} E\left(E\big(f(X^{(j)}_{(n-1)}) \mid S^{(j)}_{n-1}\big)\, g(S^{(j)}_{n-1})\, \frac{X_j}{S_n}\right).$$
Since the law of $X^{(j)}_{(n-1)}$ is the same as the law of $X_{(n-1)}$,
$$E\big(f(X^*_{(n-1)})\, g(S^*_{n-1})\big) = \sum_{j=1}^{n} E\left(k_2(S^{(j)}_{n-1})\, g(S^{(j)}_{n-1})\, \frac{X_j}{S_n}\right) = E\big(k_2(S^*_{n-1})\, g(S^*_{n-1})\big).$$
But also, by definition, $E\big(f(X^*_{(n-1)})\, g(S^*_{n-1})\big) = E\big(k_1(S^*_{n-1})\, g(S^*_{n-1})\big)$, which proves the result.
Solution to Exercise 2.5

1. In this solution, we set: D = ∩_n D_n. It suffices to show that for every bounded, C ∨ D_1-measurable r.v. X:
$$E\Big(X \,\Big|\, \bigcap_n (\mathcal C \vee \mathcal D_n)\Big) = E(X \mid \mathcal C \vee \mathcal D), \quad \text{a.s.} \tag{2.5.a}$$
We may consider variables X of the form: X = fg, f and g being bounded and respectively C-measurable and D_1-measurable. Let X be such a variable, then on the one hand, since C ∨ D_n decreases, Exercise 1.5 implies:
$$E(X \mid \mathcal C \vee \mathcal D_n) \xrightarrow{L^1} E\Big(X \,\Big|\, \bigcap_n (\mathcal C \vee \mathcal D_n)\Big), \quad (n \to \infty). \tag{2.5.b}$$
On the other hand, since C and D_1 are independent conditionally on D_n, then
$$E(X \mid \mathcal C \vee \mathcal D_n) = f\, E(g \mid \mathcal C \vee \mathcal D_n) = f\, E(g \mid \mathcal D_n).$$
But again, from Exercise 1.5, $E(g \mid \mathcal D_n) \xrightarrow{L^1} E(g \mid \mathcal D)$ as n goes to ∞. Moreover, C and D_1 being independent conditionally on D_n for each n, these σ-fields are independent conditionally on D, hence
$$E(X \mid \mathcal C \vee \mathcal D_n) \xrightarrow{L^1} f\, E(g \mid \mathcal D) = E(X \mid \mathcal C \vee \mathcal D), \tag{2.5.c}$$
as n goes to ∞. We deduce (2.5.a) from (2.5.b) and (2.5.c).

2. With f, g, h such that f is C-measurable, g is D_1-measurable and h is D_n-measurable, we want to show that:
$$E\big(E(f \mid \mathcal D_n)\, E(g \mid \mathcal D_n)\, h\big) = E(fgh). \tag{2.5.d}$$
We shall show the following stronger equality:
$$E\big(E(f \mid \mathcal D_n)\, E(g \mid \mathcal D_n)\, h \mid \mathcal E_n\big) = E(fgh \mid \mathcal E_n). \tag{2.5.e}$$
Let k be E_n-measurable; then, from the hypothesis and since gh is D_1-measurable,
$$E(fghk) = E\big(E(f \mid \mathcal E_n)\, E(gh \mid \mathcal E_n)\, k\big) = E\big(E(f \mid \mathcal E_n)\, E(E(g \mid \mathcal D_n)\, h \mid \mathcal E_n)\, k\big).$$
By applying the hypothesis again:
$$E(fghk) = E\big(E(f\, E(g \mid \mathcal D_n)\, h \mid \mathcal E_n)\, k\big).$$
Finally, by conditioning on D_n:
$$E(fghk) = E\big(E(E(f \mid \mathcal D_n)\, E(g \mid \mathcal D_n)\, h \mid \mathcal E_n)\, k\big).$$
Since, on the other hand, for every k, E_n-measurable, $E(fghk) = E(E(fgh \mid \mathcal E_n)\, k)$, one deduces that (2.5.e) holds. (This result holds in that particular case because D_n ⊂ D_1 but it is not true in general.)
3. If (2.5.2) holds then it is obvious that D and C are independent. Suppose now that D and C are independent and let X be bounded and F-measurable and C be bounded and C-measurable, then from Exercise 1.5, E(E(X | Dn )C) −→ E(E(X | D)C) = E(X)E(C) , (n → ∞) . 4. First, we have ∩n (C ∨ Dn ) ⊂ C ∨ σ(Y0 ) since for each n, C ∨ Dn ⊂ C ∨ σ(Y0 ). Moreover, it is obvious that Y0 is ∩n (C ∨ Dn )-measurable, so C ∨ σ(Y0 ) ⊂ ∩n (C ∨ Dn ). On the other hand, we easily check that X1 , X2 , . . . are independent, hence from Kolmogorov’s 0–1 law, the σ-field ∩n Dn is trivial, and C ∨ (∩n Dn ) = C. Comments on the solution: The counterexample given in question 4 is presented in Exercise 4.12 on p. 48 of D. Williams [63].
Solution to Exercise 2.6

1. It follows from the exchangeability property that for any n ≥ 2,
$$(X_1, X_2, X_3, \dots) \overset{(law)}{=} (X_1, X_n, X_{n+1}, \dots),$$
hence for any bounded Borel function Φ,
$$E[\Phi(X_1) \mid \mathcal G_n] \overset{(law)}{=} E[\Phi(X_1) \mid \mathcal G_2].$$
We conclude by applying the result of Exercise 1.5 which asserts that E[Φ(X_1) | G_n] converges in law towards E[Φ(X_1) | G], as n goes to ∞.

2. This is a direct consequence of Exercise 1.11.

3. Question 2 shows that X_1 and G_2 are conditionally independent given G. Similarly, for any n, X_n and G_{n+1} are conditionally independent given G, so, by iteration, all r.v.s of the sequence (X_1, X_2, . . .) are conditionally independent given G.
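To make the “mixture of i.i.d. random variables” reading of de Finetti’s theorem concrete, one may look at the simplest mixture: θ uniform on [0, 1] and, given θ, Bernoulli(θ) coordinates. This example is our own choice and is not taken from the exercise; the function name `mixture_prob` is hypothetical. The sketch checks, in exact arithmetic and for words of length 4 only, that the resulting law is invariant under permutations although the coordinates are not independent.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def mixture_prob(xs):
    """P(X_1 = xs[0], ..., X_n = xs[-1]) when theta is uniform on [0, 1] and,
    given theta, the X_i are i.i.d. Bernoulli(theta). This is the Beta integral
    k! (n - k)! / (n + 1)!, where k is the number of ones in xs."""
    n, k = len(xs), sum(xs)
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

n = 4
words = list(product([0, 1], repeat=n))

# The probabilities of all 0-1 words of length n add up to one.
total = sum(mixture_prob(w) for w in words)

# Exchangeability: permuting the coordinates does not change the probability.
exchangeable = all(
    mixture_prob(w) == mixture_prob(tuple(w[i] for i in perm))
    for w in words for perm in permutations(range(n))
)

# The coordinates are not independent: P(X_1 = 1, X_2 = 1) differs from P(X_1 = 1)^2.
p_one = mixture_prob((1,))
p_one_one = mixture_prob((1, 1))
```

Here P(X_1 = 1) = 1/2 while P(X_1 = 1, X_2 = 1) = 1/3, so the dependence between coordinates is only removed by conditioning (on θ in this example, on the tail σ-field G in the exercise).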
Solution to Exercise 2.7

1. When X and Y are square integrable r.v.s, assumption (I) implies
$$\mathrm{Var}(X - Y) = E[(X - Y)^2] - E[X - Y]^2 = \mathrm{Cov}(X - Y, X) - \mathrm{Cov}(X - Y, Y) = 0,$$
hence X − Y is a.s. constant.

2. Let x ∈ ℝ, then
$$\varphi(x)|H(x)|^2 = E[e^{ixX}]\, E[e^{-ix(X-Y)}]\, E[e^{ix(X-Y)}] = E[e^{ixY}]\, E[e^{ix(X-Y)}] = E[e^{ixX}] = \varphi(x).$$
The second equality follows from the independence between X and X − Y, while the third one follows from the independence between Y and X − Y.

3. This follows immediately from the continuity of ϕ, and the fact that ϕ(0) = 1. Indeed, since ϕ(x) ≠ 0 for |x| < ε, ε sufficiently small, from (2.7.1) we have |H(x)| = 1, for |x| < ε.

4. We shall show that if for an r.v. Z, its characteristic function $\psi(x) = E[e^{ixZ}]$ satisfies |ψ(x)| = 1 for |x| ≤ ε, for some ε > 0, then Z is a.s. constant. Indeed, consider an independent copy Z′ of Z. We have: E[exp ix(Z − Z′)] = 1, for |x| < ε, so that: 1 − cos(x(Z − Z′)) = 0, or equivalently: x(Z − Z′) ∈ 2πℤ, a.s. This implies Z − Z′ = 0, a.s., since if |Z(ω) − Z′(ω)| > 0, then $|Z(\omega) - Z'(\omega)| \ge \frac{2\pi}{|x|} \to \infty$, as x → 0. Now, trivially, Z = Z′ a.s. is equivalent to Z being a.s. constant.

5. First solution: Under the assumption (J), the law of Y − X given (X = x) would be N(0, 1), hence Y − X would be independent of X. Likewise, under the assumption (J), Y − X would be independent of Y. But this could only happen if Y − X is constant, which is in contradiction with the fact (also implied by (J)) that Y − X is N(0, 1). Thus (J) admits no solution.

Second solution: Under the assumption (J), we have:
$$P[X \in dx]\, e^{-\frac{(x-y)^2}{2}}\, dy = P[Y \in dy]\, e^{-\frac{(x-y)^2}{2}}\, dx, \quad x, y \in \mathbb{R}.$$
This implies that
$$P[X \in dx]\, dy = P[Y \in dy]\, dx, \quad x, y \in \mathbb{R}.$$
But this identity cannot hold because P[X ∈ dx] and P[Y ∈ dy] are finite measures whereas the Lebesgue measures dx and dy are not.

6. The bivariate r.v. (X, Y) admits a density which is given by:
$$P[X \in dx, Y \in dy] = \frac{e^{-\frac{(x-a)^2}{2\sigma^2}}}{\sqrt{2\pi\sigma^2}}\, \frac{e^{-\frac{(y-x)^2}{2}}}{\sqrt{2\pi}}\, dx\, dy, \quad x, y \in \mathbb{R}.$$
So, the distribution of Y is given by:
$$\begin{aligned} \int_{\{x \in \mathbb{R}\}} P[X \in dx, Y \in dy] &= \frac{dy}{2\pi\sigma} \int_{\{x \in \mathbb{R}\}} e^{-\frac{(x-a)^2}{2\sigma^2}}\, e^{-\frac{(y-x)^2}{2}}\, dx \\ &= \frac{dy}{2\pi\sigma}\, e^{-\frac{(y-a)^2}{2(1+\sigma^2)}} \int_{\{x \in \mathbb{R}\}} e^{-\frac{1+\sigma^2}{2\sigma^2}\left(x - \frac{a + \sigma^2 y}{1+\sigma^2}\right)^2} dx \\ &= \frac{dy}{\sqrt{2\pi(1+\sigma^2)}}\, e^{-\frac{(y-a)^2}{2(1+\sigma^2)}}, \quad y \in \mathbb{R}. \end{aligned}$$
We conclude that Y is distributed as N(a, 1 + σ²) and the conditional density of X given Y = y is:
$$f_{X \mid Y = y}(x) = \sqrt{\frac{1+\sigma^2}{2\pi\sigma^2}}\, e^{-\frac{1+\sigma^2}{2\sigma^2}\left(x - \frac{a + \sigma^2 y}{1+\sigma^2}\right)^2}, \quad x \in \mathbb{R},$$
which is the density of the law $N\left(\frac{a + \sigma^2 y}{1+\sigma^2}, \frac{\sigma^2}{1+\sigma^2}\right)$.
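The Gaussian computation of question 6 of Exercise 2.7 can be cross-checked numerically. The sketch below is ours: it integrates the joint density under (K) over x with a trapezoidal rule and compares the result with the claimed marginal N(a, 1 + σ²) and the claimed conditional law. The function names, the integration window and the test values a = 0.7, σ² = 2.5, y = −1.3 are arbitrary assumptions of the sketch.

```python
import math

def normal_pdf(z, mean, var):
    """Density of N(mean, var) at z."""
    return math.exp(-(z - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def joint_pdf(x, y, a, sigma2):
    """Density of (X, Y) under (K): X ~ N(a, sigma2) and Y | X = x ~ N(x, 1)."""
    return normal_pdf(x, a, sigma2) * normal_pdf(y, x, 1.0)

def marginal_y(y, a, sigma2, half_width=12.0, steps=4000):
    """Integrate the joint density over x with the trapezoidal rule."""
    sd = math.sqrt(sigma2)
    lo, hi = a - half_width * sd, a + half_width * sd
    h = (hi - lo) / steps
    total = 0.5 * (joint_pdf(lo, y, a, sigma2) + joint_pdf(hi, y, a, sigma2))
    total += sum(joint_pdf(lo + i * h, y, a, sigma2) for i in range(1, steps))
    return total * h

def posterior_params(y, a, sigma2):
    """Mean and variance of the claimed conditional law of X given Y = y."""
    return (a + sigma2 * y) / (1 + sigma2), sigma2 / (1 + sigma2)

a, sigma2, y = 0.7, 2.5, -1.3
post_mean, post_var = posterior_params(y, a, sigma2)
marg = marginal_y(y, a, sigma2)

# Conditional density obtained as joint / marginal, compared with the closed form.
gaps = [abs(joint_pdf(x, y, a, sigma2) / marg - normal_pdf(x, post_mean, post_var))
        for x in (-2.0, -0.7, 0.0, 1.5)]
```

The quantity `marg` should agree with `normal_pdf(y, a, 1 + sigma2)` and the entries of `gaps` should be negligible, up to the quadrature error.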
Solution to Exercise 2.8

We use the notations defined in the hint of the statement. We are looking for X and Y such that
$$E\left[\frac{Y}{X} \,\Big|\, X\right] > 1, \qquad E\left[\frac{X}{Y} \,\Big|\, Y\right] > 1.$$
First, for any n ≥ 0,
$$\begin{aligned} E\left[\frac{Y}{X} \,\Big|\, X = 2^n\right] &= \frac{1}{2^n} \left( 2^{n-1}\, \frac{P(Y = 2^{n-1}, X = 2^n)}{P(X = 2^n)} + 2^{n+1}\, \frac{P(Y = 2^{n+1}, X = 2^n)}{P(X = 2^n)} \right) \\ &= \frac12\, \frac{p_n}{p_n + q_n} + 2\, \frac{q_n}{p_n + q_n}. \end{aligned}$$
The latter expression is greater than 1 whenever
$$q_n > \frac12\, p_n \tag{2.8.a}$$
for any n. Now, the other conditional expectation is equal to:
$$\begin{aligned} E\left[\frac{X}{Y} \,\Big|\, Y = 2^n\right] &= \frac{1}{2^n} \left( 2^{n-1}\, \frac{P(X = 2^{n-1}, Y = 2^n)}{P(Y = 2^n)} + 2^{n+1}\, \frac{P(X = 2^{n+1}, Y = 2^n)}{P(Y = 2^n)} \right) \\ &= \frac12\, \frac{q_{n-1}}{q_{n-1} + p_{n+1}} + 2\, \frac{p_{n+1}}{q_{n-1} + p_{n+1}}. \end{aligned}$$
From above, the necessary and sufficient condition for $E\left[\frac{X}{Y} \mid Y = 2^n\right]$ to be greater than 1 is
$$p_{n+1} > \frac12\, q_{n-1}. \tag{2.8.b}$$
When p_n = q_n, conditions (2.8.a) and (2.8.b) reduce to: $p_n > \frac12 p_{n-2}$. This is satisfied by any sequence (p_n) of the form $p_n (= q_n) = C a^n$, for 0 < C < 1 and $\frac{1}{\sqrt 2} < a < 1$.

Note that conditions (2.8.a) and (2.8.b) imply $E[X] = \sum_{n=0}^{\infty} 2^n (p_n + q_n) = \infty$. Actually, it is easily shown that condition (ii) in the statement never occurs when both X and Y are integrable.
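The threshold a > 1/√2 found in the solution of Exercise 2.8 is easy to test in exact arithmetic. The sketch below, with the hypothetical helper `cond_exp_ratios`, evaluates the two conditional expectations computed above when p_n = q_n is proportional to a^n; the normalising constant cancels in both ratios and is therefore left out, and only indices n ≥ 1 are examined.

```python
from fractions import Fraction

def cond_exp_ratios(a, n_max=12):
    """For p_n = q_n proportional to a**n, return the two lists
    E[Y/X | X = 2**n] and E[X/Y | Y = 2**n] for n = 1..n_max."""
    def p(n):
        return a ** n      # p_n = P(X = 2^n, Y = 2^(n-1)), up to a constant
    def q(n):
        return a ** n      # q_n = P(X = 2^n, Y = 2^(n+1)), up to a constant
    half = Fraction(1, 2)
    given_x = [(half * p(n) + 2 * q(n)) / (p(n) + q(n))
               for n in range(1, n_max + 1)]
    given_y = [(half * q(n - 1) + 2 * p(n + 1)) / (q(n - 1) + p(n + 1))
               for n in range(1, n_max + 1)]
    return given_x, given_y

gx_ok, gy_ok = cond_exp_ratios(Fraction(3, 4))     # a^2 = 9/16 > 1/2
gx_bad, gy_bad = cond_exp_ratios(Fraction(2, 3))   # a^2 = 4/9 < 1/2
```

With a = 3/4 both families of conditional expectations exceed 1 (they equal 5/4 and 26/25), whereas with a = 2/3 the second family drops to 25/26 < 1, in line with conditions (2.8.a) and (2.8.b).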
Solution to Exercise 2.9

1. It suffices to check that
$$P_s[\rho_{p_1} = 1, \dots, \rho_{p_k} = 1] = \prod_{j=1}^{k} P_s[\rho_{p_j} = 1], \tag{2.9.a}$$
for any finite sub-sequence $(\rho_{p_1}, \dots, \rho_{p_k})$, k ≥ 1, of (ρ_p)_{p∈P}. Indeed, for such a sub-sequence, we have:
$$\begin{aligned} P_s[\rho_{p_1} = 1, \dots, \rho_{p_k} = 1] &= P_s\Big[\bigcap_{j=1}^{k} \{l p_j : l \in N\}\Big] = P_s[\{l p_1 \cdots p_k : l \in N\}] \\ &= \sum_{l \in N} \frac{1}{\zeta(s)\,(l p_1 \cdots p_k)^s} = \frac{1}{(p_1 \cdots p_k)^s}, \end{aligned}$$
since $\sum_{l \in N} \frac{1}{\zeta(s)\, l^s} = P_s(N) = 1$. Moreover, this identity implies $P_s(\rho_{p_j} = 1) = 1/p_j^s$, for each j = 1, . . . , k, and (2.9.a) is proven.

To prove Euler’s identity, first note that $\bigcap_{p \in \mathcal P} \{\rho_p = 0\} = \{1\}$. Then, we have $P_s(\{1\}) = P_s[\bigcap_{p \in \mathcal P} \{\rho_p = 0\}]$, that is:
$$\frac{1}{\zeta(s)} = \prod_{p \in \mathcal P} P_s[\rho_p = 0] = \prod_{p \in \mathcal P} \left(1 - \frac{1}{p^s}\right).$$

2. It is not difficult to see that $\{\alpha_p = k\} = p^k \{\rho_p = 0\}$, where we use the notation: $p^k \{\rho_p = 0\} = \{p^k j : j \in N,\ \rho_p(j) = 0\}$. Hence
$$P_s[\alpha_p = k] = P_s[p^k \{\rho_p = 0\}] = \frac{1}{p^{ks}}\, P_s[\rho_p = 0] = \frac{1}{p^{ks}} \left(1 - \frac{1}{p^s}\right).$$
The latter identity shows that α_p is geometrically distributed with parameter $q = 1/p^s$. Let k_1, . . . , k_n be any sequence of integers and p_1, . . . , p_n be any sequence of prime numbers, then
$$\{\alpha_{p_1} = k_1, \dots, \alpha_{p_n} = k_n\} = p_1^{k_1} \cdots p_n^{k_n} \{\rho_{p_1} = 0, \dots, \rho_{p_n} = 0\}.$$
And
$$P_s\big[p_1^{k_1} \cdots p_n^{k_n} \{\rho_{p_1} = 0, \dots, \rho_{p_n} = 0\}\big] = \frac{1}{p_1^{k_1 s} \cdots p_n^{k_n s}}\, P_s[\rho_{p_1} = 0, \dots, \rho_{p_n} = 0].$$
Hence, the independence between $\alpha_{p_1}, \dots, \alpha_{p_n}$ follows from the independence between $\rho_{p_1}, \dots, \rho_{p_n}$.
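The identities of Exercise 2.9 lend themselves to a quick numerical illustration for s = 2, where ζ(2) = π²/6. The sketch below is ours and relies on truncations that are assumptions of the sketch, not of the exercise: the measure P_s is cut off at `N_MAX` and the Euler product at the primes below 10 000, so only approximate agreement can be expected.

```python
import math

S = 2.0
N_MAX = 200_000

def zeta_partial(s, n_max=N_MAX):
    """Partial sum of the zeta series up to n_max."""
    return sum(k ** -s for k in range(1, n_max + 1))

def prob_divisible(d, s=S, n_max=N_MAX):
    """Truncated version of P_s({n : d | n}), summing 1 / (zeta(s) n^s) over multiples of d."""
    z = zeta_partial(s, n_max)
    return sum(n ** -s for n in range(d, n_max + 1, d)) / z

def primes_up_to(m):
    """Sieve of Eratosthenes."""
    sieve = [True] * (m + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

# Independence of rho_2 and rho_3: P(2 | n and 3 | n) = P(6 | n) should factorise.
p2, p3, p6 = prob_divisible(2), prob_divisible(3), prob_divisible(6)

# Euler's identity, with the product restricted to the primes below 10 000.
euler_product = math.prod(1 - p ** -S for p in primes_up_to(10_000))
inv_zeta = 6 / math.pi ** 2      # 1 / zeta(2)
```

Up to truncation errors, `p6` agrees with `p2 * p3` (both close to 1/36), and `euler_product` agrees with `inv_zeta`.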
Solution to Exercise 2.10

1. First note that
$$1_{\cup_{k=1}^m A_k} = 1_{(\cap_{k=1}^m A_k^c)^c} = 1 - 1_{\cap_{k=1}^m A_k^c} = 1 - \prod_{k=1}^{m} (1 - 1_{A_k}).$$
Developing the last term, we have: m
1−
(1 − 1IAk ) =
k=1
m
(−1)k−1
1≤i1 n, then P [Sτ = k | τ = n] = 0. • If k ≤ n, then P [Sτ = k | τ = n] = P [Sn = k | τ = n] = P [∩ni=1 {Xi = xi } | τ = n] {xi :
n
i=1
=
n k
xi =k}
p(1 − q) 1 − pq
k
1−p 1 − pq
n−k
.
6. From above, we have E[Sτ | τ = n] =
n
kP [Sτ = k | τ = n] =
k=1
np(1 − q) , 1 − pq
then it follows that E[Sτ ] = =
∞ n=1 ∞ n=0
P [τ = n]E[Sτ | τ = n] =
∞ (1 − pq)n nqp2 (1 − q) n=1
n(1 − pq)n−1 p2 q(1 − q) =
1−q . q
1 − pq
By definition, $Z_\tau = X_\tau Y_\tau = 1$, which implies $X_\tau = Y_\tau = 1$. Thus we have: $E[T_\tau] = 1$. On the other hand,
$$E[\tau]E[X_1] = E[\tau]\, p = \frac{1}{pq}\, p = \frac{1}{q},$$
and $E[\tau]E[Z_1] = (1/pq)\, pq = 1$.
Solution to Exercise 2.12

1. Taking b = c = 0 in (ii) gives $E[e^{-aL}] = \frac{1/2}{1/2 + a}$, for every a ≥ 0, hence L is exponentially distributed with parameter 1/2. Now set a = 0 and b = c in (ii), then we obtain the characteristic function of W:
$$E[e^{ibW}] = \begin{cases} \left(\cosh b + \dfrac{|b| \sinh b}{b}\right)^{-1}, & \text{if } b \ne 0, \\ 1, & \text{if } b = 0, \end{cases}$$
that is
$$E[e^{ibW}] = e^{-|b|}.$$
Hence, W is Cauchy distributed with parameter 1.

2. Set $\varphi(L) = E[e^{ibW_- + icW_+} \mid L]$. The r.v. ϕ(L) satisfies:
$$E[e^{-aL} \varphi(L)] = E[e^{-aL + ibW_- + icW_+}],$$
for every a ≥ 0. But from the expression of the law of L obtained in question 1 and (ii), we have:
$$E[e^{-aL + ibW_- + icW_+}] = E\left[e^{-aL}\, \frac{c}{\sinh c}\, e^{-\left(\frac{c \coth c}{2} - \frac12 + \frac{|b|}{2}\right) L}\right],$$
thus $\varphi(L) = \frac{c}{\sinh c}\, e^{-\left(\frac{c \coth c}{2} - \frac12 + \frac{|b|}{2}\right) L}$ and it is clear, from this expression for ϕ(L), that it can be written as:
$$\varphi(L) = E[e^{ibW_-} \mid L]\, E[e^{icW_+} \mid L],$$
for all b, c ∈ ℝ, where:
$$E[e^{ibW_-} \mid L] = \exp\left(-\frac{|b|}{2} L\right) \tag{2.12.a}$$
$$E[e^{icW_+} \mid L] = \frac{c}{\sinh c} \exp\left(-\frac12 (c \coth c - 1) L\right). \tag{2.12.b}$$
Thus, W_+ and W_− are conditionally independent given L. From the conditional independence proved above, we have for all l ∈ ℝ_+ and x ∈ ℝ:
$$E[e^{ibW_-} \mid L = l, W_+ = x] = E[e^{ibW_-} \mid L = l],$$
moreover, from (2.12.a), we obtain:
$$E[e^{ibW_-} \mid L = l] = e^{-\frac{l}{2}|b|}, \quad b \in \mathbb{R}.$$
Hence, conditionally on W_+ = x and L = l, W_− is Cauchy distributed with parameter l/2. From (2.12.b), we have
$$E[e^{icW_+} \mid L = l] = \frac{c}{\sinh c}\, e^{-\left(\frac{c \coth c}{2} - \frac12\right) l}, \quad c \ne 0.$$

Comments on the solution: For l = 0, the latter Fourier transform can be inverted whereas for l ≠ 0, there is no known explicit expression for the corresponding density. (However, computations in A.N. Borodin and P. Salminen [7] show that an extremely complicated “closed form” formula might be obtained; this has also been remarked upon by R. Ghomrasni (2003, to appear).) This difficulty has been observed in:
J. Bass and P. Lévy: Propriétés des lois dont les fonctions caractéristiques sont 1/ch z, z/sh z, 1/ch² z. C. R. Acad. Sci. Paris, 230, 815–817 (1950).
One can find series developments for the corresponding densities in A.N. Borodin and P. Salminen [7]. See also:
P. Biane, J. Pitman and M. Yor: Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions. Bull. Amer. Math. Soc. (N.S.), 38, no. 4, 435–465 (2001).
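Solution to Exercise 2.13

1. This is quite similar to the computations made in question 2 of Exercise 2.12, and one finds:
$$\begin{aligned} E\left[\exp\left(-\Big(\frac{\lambda^2}{2} A_- + \frac{\mu^2}{2} A_+\Big)\right) \,\Big|\, L\right] &= E\left[\exp\left(-\frac{\lambda^2}{2} A_-\right) \,\Big|\, L\right] E\left[\exp\left(-\frac{\mu^2}{2} A_+\right) \,\Big|\, L\right] \\ &= \exp\left(-\frac{\lambda}{2} L\right) \frac{\mu}{\sinh \mu} \exp\left(-\frac12 (\mu \coth \mu - 1) L\right). \end{aligned}$$
From question 7 of Exercise 4.2, if T is (s)-distributed (that is $P(T \in dt) = \frac{dt}{\sqrt{2\pi t^3}} \exp\left(-\frac{1}{2t}\right)$, t > 0), then its Laplace transform is:
$$E\left[\exp\left(-\frac{\lambda^2}{2} T\right)\right] = \exp(-\lambda), \quad \lambda \ge 0,$$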
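and the law of T is characterized by the identity
$$T \overset{(law)}{=} \frac{1}{N^2},$$
where N is a centred Gaussian variable with variance 1. Hence, the Laplace transform of A_− may be expressed as:
$$\begin{aligned} E\left[\exp\left(-\frac{\lambda^2}{2} A_-\right)\right] &= \frac{1}{1 + \lambda} = \int_0^\infty dv\, e^{-v} e^{-v\lambda} = \int_0^\infty dv\, e^{-v}\, E\left[e^{-\frac{\lambda^2 v^2}{2} T}\right] \\ &= E\left[\exp\left(-\frac{\lambda^2 Z^2}{2} T\right)\right] = E\left[\exp\left(-\frac{\lambda^2}{2} \frac{Z^2}{N^2}\right)\right] \end{aligned}$$
and question 1 follows.

2. The identity in law (2.13.2) follows from the double equality:
$$\begin{aligned} E[\exp(-aL + i\lambda W_+ + i\mu W_-)] &= E\left[\exp\left(-aL - \frac{\lambda^2}{2} A_+ - \frac{\mu^2}{2} A_-\right)\right] \\ &= E\left[\exp\left(-aL + i\lambda N_+ \sqrt{A_+} + i\mu N_- \sqrt{A_-}\right)\right]. \end{aligned}$$

3. This follows from:
$$E[\exp i(\lambda W_+ + \mu W_-) \mid L = l] = E\left[\exp\left(-\frac{\lambda^2}{2} A_+ - \frac{\mu^2}{2} A_-\right) \,\Big|\, L = l\right].$$

4. Write the Laplace transform of (A_−, A_+) as
$$\begin{aligned} \frac{1}{\cosh \mu + \frac{\lambda}{\mu} \sinh \mu} &= \int_0^\infty dt\, \exp\left(-t\Big(\cosh \mu + \frac{\lambda}{\mu} \sinh \mu\Big)\right) \\ &= \frac{\mu}{\sinh \mu} \int_0^\infty dt\, e^{-t \frac{\mu \cosh \mu}{\sinh \mu}} \int_0^\infty du\, e^{-\frac{\lambda^2 u}{2}}\, \frac{t}{\sqrt{2\pi u^3}}\, e^{-\frac{t^2}{2u}}. \end{aligned}$$
By letting μ converge towards 0 in the above expression, we obtain the following different expressions for the density g of A_−:
$$\begin{aligned} g(u) &= \int_0^\infty dt\, e^{-t}\, \frac{t}{\sqrt{2\pi u^3}} \exp\left(-\frac{t^2}{2u}\right) = \int_0^\infty dt\, e^{-t}\, \frac{1 - e^{-t^2/2u}}{\sqrt{2\pi u}} \\ &= \frac{e^{u/2}}{\sqrt u}\, E[(N - \sqrt u)^+] = \frac{1}{\sqrt{2\pi u}} - e^{u/2}\, P(N \ge \sqrt u), \end{aligned}$$
where N denotes a standard normal variable. These expressions are closely related to the Hermite functions h_{−1} and h_{−2}; see e.g. Lebedev [69].

Note also that from (2.13.1) the variable L is exponentially distributed with parameter 1/2 and from question 1, the law of A_− given L = l is the same as the law of $\frac{l^2}{4} T$.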
The law of L given $A_-$ is then obtained from Bayes' formula, since we know the law of $A_-$ given L, the law of L and the law of $A_-$. So, with $H \overset{(def)}{=} \frac{L}{2}$, we find:

$$P(H \in dh \mid A_- = u) = \frac{1}{g(u)\sqrt{2\pi u^3}}\, h\, \exp\left(-\left(\frac{h^2}{2u} + h\right)\right) dh.$$
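We then use the conditional independence of $A_+$ and $A_-$ given L, to write:

$$E\left[\exp\left(-\frac{\mu^2}{2}A_+\right) \Big|\, A_- = u\right] = E\left[E\left[\exp\left(-\frac{\mu^2}{2}A_+\right) \Big|\, H\right] \Big|\, A_- = u\right] = \int P(H \in dh \mid A_- = u)\, E\left[\exp\left(-\frac{\mu^2}{2}A_+\right) \Big|\, H = h\right]$$

$$= \frac{\mu}{\sinh\mu}\, \frac{1}{g(u)\sqrt{2\pi u^3}} \int_0^\infty dh\, h \exp\left(-\left(\frac{h^2}{2u} + h\right)\right) \exp(-h(\mu\coth\mu - 1))$$

$$= \frac{\mu}{\sinh\mu}\, \frac{1}{g(u)\sqrt{2\pi u}} \int_0^\infty dt\, t\, e^{-\left(\frac{t^2}{2} + t\sqrt{u}\,\mu\coth\mu\right)} = \frac{\mu}{\sinh\mu}\, \mu\coth\mu\, \frac{g(u(\mu\coth\mu)^2)}{g(u)},$$

where the last two equalities use the change of variable $h = t\sqrt{u}$ and the identity $g(v) = \frac{1}{\sqrt{2\pi v}}\int_0^\infty dt\, t\, e^{-(t^2/2 + t\sqrt{v})}$.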
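5. This can be done by first establishing a series development of the Laplace transform of $(A_-, A_+)$ in terms of $e^{-s}$:

$$E\left[\exp\left(-\frac{s^2}{2}(\lambda^2 A_- + \mu^2 A_+)\right)\right] = \frac{1}{\cosh(s\mu) + \frac{\lambda}{\mu}\sinh(s\mu)} = \frac{2}{e^{s\mu}\left(1 + \frac{\lambda}{\mu}\right) + e^{-s\mu}\left(1 - \frac{\lambda}{\mu}\right)} = \frac{2e^{-s\mu}}{\left(1 + \frac{\lambda}{\mu}\right)(1 + \rho e^{-2s\mu})},$$

where $\rho = \dfrac{1 - \frac{\lambda}{\mu}}{1 + \frac{\lambda}{\mu}}$. Note that since |ρ| < 1, we get

$$E\left[\exp\left(-\frac{s^2}{2}(\lambda^2 A_- + \mu^2 A_+)\right)\right] = \frac{2e^{-s\mu}}{1 + \frac{\lambda}{\mu}} \sum_{n=0}^\infty (-\rho)^n e^{-2s\mu n} = \frac{2}{1 + \frac{\lambda}{\mu}} \sum_{n=0}^\infty (-\rho)^n e^{-s\mu(2n+1)} = \frac{2}{1 + \frac{\lambda}{\mu}} \sum_{n=0}^\infty (-\rho)^n a_n \int_0^\infty \frac{dt}{\sqrt{2\pi t^3}}\, e^{-\frac{a_n^2}{2t}}\, e^{-\frac{s^2 t}{2}},$$

where $a_n = (2n+1)\mu$. Hence, we obtain:

$$P(\lambda^2 A_- + \mu^2 A_+ \in dt) = \frac{2}{1 + \frac{\lambda}{\mu}} \sum_{n=0}^\infty (-\rho)^n a_n\, e^{-\frac{a_n^2}{2t}}\, \frac{dt}{\sqrt{2\pi t^3}}.$$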
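The geometric series development used in question 5 can be checked against the closed form of the Laplace transform. This is a minimal sketch under assumptions made here (the function names, the number of terms kept and the sample parameter values are arbitrary choices, with λ, μ, s > 0).

```python
import math

def closed_form(s, lam, mu):
    """1 / (cosh(s mu) + (lam / mu) sinh(s mu))."""
    return 1.0 / (math.cosh(s * mu) + (lam / mu) * math.sinh(s * mu))

def series(s, lam, mu, terms=200):
    """2 / (1 + lam/mu) * sum_n (-rho)^n exp(-s mu (2n + 1)), truncated after `terms` terms."""
    ratio = lam / mu
    rho = (1.0 - ratio) / (1.0 + ratio)
    total = sum((-rho) ** n * math.exp(-s * mu * (2 * n + 1)) for n in range(terms))
    return 2.0 / (1.0 + ratio) * total

if __name__ == "__main__":
    for s, lam, mu in [(1.0, 0.5, 2.0), (0.3, 3.0, 1.0), (2.0, 1.0, 1.0)]:
        print(closed_form(s, lam, mu), series(s, lam, mu))
```

Since |ρ| < 1 and $e^{-2s\mu} < 1$, the truncated series converges geometrically, so a modest number of terms is enough for the two columns to agree.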
Solution to Exercise 2.14 1. We first recall that for Q absolutely continuous with respect to P , and X0 the Radon–Nikodym density of Q with respect to P , then EP (X0 ) = EQ (1) = 1, and X0 ≥ 0 P a.s., since Q(X0 < 0) = EP [X0 1{X0 0,
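$$\mathbb{E}[\Phi \mid f \ge MU] = \frac{E_P\left[\Phi\left(\frac{f}{M} \wedge 1\right)\right]}{E_P\left[\frac{f}{M} \wedge 1\right]} = \frac{E_P[\Phi(f \wedge M)]}{E_P[f \wedge M]},$$

which converges, as M → ∞, towards $E_P[\Phi f]$, by monotone convergence.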
Solution to Exercise 2.16 1. First note that under the conditions of the statement, the functions y → E[f (SX) | Y = y] and z → E[f (SX) | SY = z] are continuous on IR+ and the conditional expectation E[f (SX) | SY = 0] may be defined as E[f (SX) | SY = 0] = lim
ε→0
E[f (SX)1I{0≤SY ≤ε} ] . P (0 ≤ SY ≤ ε)
Let λ > 0 and write E[f (SX)1I{0≤SY ≤ε} ] = E[f (SX)1I{0≤SY ≤ε} 1I{S 0. This can also be written as 0
$$\int_0^\infty \left(E[X \mid L = l]\, e^{-l}\right) e^{-\lambda l}\, dl = \int_0^\infty \left(e^{-a^{-1} l}\right) e^{-\lambda l}\, dl.$$

The functions between parentheses in the above integrals have the same Laplace transform, so they are a.e. equal; hence, we have:

$$E[X \mid L = l] = \exp\left(-(a^{-1} - 1)l\right), \qquad \text{a.e. } (dl),$$

and thus: $a = (1 + \alpha)^{-1}$. If (b) holds, then we may write:

$$E[X \exp(-\lambda L)] = E[E[X \mid L] \exp(-\lambda L)] = E[\exp(-(\alpha + \lambda)L)],$$

and from (2.17.a), this last expression equals $\frac{a}{1+\lambda a}$. So we obtain (a). Suppose that (a) holds. By analytic continuation, we can show that $E[X \exp(\lambda L)] = \frac{a}{1-\lambda a}$ holds for any $\lambda < \frac1a$. Hence, for any $\lambda < \frac1a$, from Fubini’s Theorem and the series development of the exponential function, we have

$$E[X \exp(\lambda L)] = E\left[X \sum_{k \ge 0} \frac{(\lambda L)^k}{k!}\right] = \sum_{k \ge 0} \lambda^k E\left[X \frac{L^k}{k!}\right].$$

Then we obtain (c) by writing $\sum_{k \ge 0} \lambda^k a^{k+1} = \frac{a}{1-\lambda a}$ and identifying this expression with the previous one. Conversely, if (c) holds, then, for any $0 \le \mu < \frac{1}{E[X]}$,

$$\sum_{k \ge 0} \mu^k E\left[X \frac{L^k}{k!}\right] = E[X \exp(\mu L)] = \frac{E[X]}{1 - \mu E[X]},$$
from Fubini’s Theorem. By analytic continuation, this expression also holds for λ = −μ ≥ 0, so (a) holds.
2. For variables of the type of X, it is easy to show that E[X | L] = exp(−αL), i.e. (b) is satisfied.
3. Let us show that under condition (2.17.2), the following properties are equivalent.
(a) There exists a > 0 such that for any λ ≥ 0: $E[X \exp(-\lambda L)] = \left(\frac{a}{1 + \lambda a}\right)^{\gamma}$.
(b) There exists α ≥ 0 such that $E[X \mid L = l] = \exp(-\alpha l)$.
(c) For every k ≥ 1, $E[XL^k] = \gamma(\gamma + 1) \cdots (\gamma + k - 1)\, E[X]^{\frac{k+\gamma}{\gamma}}$.
Moreover, if those properties are satisfied, then the following relationship between a and α holds: $a = \frac{1}{1+\alpha}$. Under condition (2.17.2), we say that L is gamma distributed with parameter γ (see Exercise 4.3). It is not difficult to check that its Laplace transform is given by

$$E[\exp(-\lambda L)] = \left(\frac{1}{\lambda + 1}\right)^{\gamma}, \qquad \lambda > 0. \qquad (2.17.b)$$
We verify (a)⇐⇒(b) in the same manner as in question 1. To check (a)⇐⇒(c), we use the same arguments together with the series development: $(1 + x)^{-\gamma} = 1 - \gamma x + \gamma(\gamma + 1)\frac{x^2}{2} - \cdots + (-1)^k \gamma(\gamma + 1) \cdots (\gamma + k - 1)\frac{x^k}{k!} + \cdots$, which holds for any 0 ≤ x < 1.
4. We start from the equality

$$E\left[X \frac{(tL)^k}{k!}\right] = t^k E[X]^{k+1}, \qquad t \in \mathbb{R}_+,\ k \ge 0.$$

Integrating with respect to ν(dt), we have:

$$\int_{\mathbb{R}_+} \nu(dt)\, E\left[X \frac{(tL)^k}{k!}\right] = \int_{\mathbb{R}_+} \nu(dt)\, E[X](tE[X])^k.$$

Finally we obtain the result by performing the sum over k ≥ 0 and using Fubini’s Theorem.
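The equivalence between (a), (b) and (c) of question 1 can be illustrated numerically in the simplest situation. The sketch below assumes, for illustration only, that L is exponentially distributed with parameter 1 and that X = exp(−αL) exactly (so that (b) holds trivially); the function names and the quadrature are choices made here, not taken from the text.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def moment(k, alpha, lam=0.0, upper=80.0):
    """E[X L^k exp(-lam L)] for L exponential(1) and the hypothetical X = exp(-alpha L)."""
    return simpson(lambda l: l ** k * math.exp(-(1.0 + alpha + lam) * l), 0.0, upper)

if __name__ == "__main__":
    alpha = 0.5
    a = 1.0 / (1.0 + alpha)          # relation a = 1 / (1 + alpha)
    for lam in (0.0, 1.0, 3.0):      # property (a): E[X exp(-lam L)] = a / (1 + lam a)
        print(moment(0, alpha, lam), a / (1.0 + lam * a))
    for k in range(6):               # property (c): E[X L^k] = k! E[X]^(k+1), with E[X] = a
        print(moment(k, alpha), math.factorial(k) * a ** (k + 1))
```

Each printed pair should coincide up to the quadrature error.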
Solution to Exercise 2.18 1. For any pair of bounded Borel functions f and g, we write:

$$E[f(\{Z\})g([Z])] = \sum_{n=0}^\infty \int_n^{n+1} du\, e^{-u} f(u - n)g(n) = \sum_{n=0}^\infty e^{-n} \int_0^1 dv\, f(v)e^{-v} g(n).$$

Hence $P(\{Z\} \in dv) = \frac{e^{-v}}{1 - e^{-1}} \mathbf{1}_{\{v \in [0,1]\}}\, dv$, $P([Z] = n) = e^{-n}(1 - e^{-1})$, $n \in \mathbb{N}$, and {Z} and [Z] are independent.
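The factorization of the joint law of ([Z], {Z}) can be verified directly from the distribution function of the exponential law. In this sketch the function names and the truncation level `n_max` are assumptions made for illustration.

```python
import math

def joint(n, v):
    """P([Z] = n, {Z} <= v) = P(n <= Z <= n + v) for Z exponential with parameter 1."""
    return math.exp(-n) - math.exp(-(n + v))

def law_integer_part(n):
    """P([Z] = n), obtained from the joint law with v = 1."""
    return joint(n, 1.0)

def law_fractional_part(v, n_max=200):
    """P({Z} <= v), obtained by summing the joint law over n (truncated at n_max)."""
    return sum(joint(n, v) for n in range(n_max))

if __name__ == "__main__":
    for n in range(4):
        for v in (0.1, 0.5, 0.9):
            print(n, v, joint(n, v), law_integer_part(n) * law_fractional_part(v))
```

The last two columns agree, which is the independence of {Z} and [Z]; the marginals also match $e^{-n}(1 - e^{-1})$ and $(1 - e^{-v})/(1 - e^{-1})$.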
2. We take $\varphi(t) = e^{-[t]}(1 - e^{-1}) = (1 - e^{-1})e^{\{t\}}e^{-t}$. Then with $Q \overset{(def)}{=} (1 - e^{-1})e^{\{Z\}} P$, we obtain:

$$E_Q[f(\{Z\})g([Z])] = (1 - e^{-1}) \sum_{n=0}^\infty e^{-n} g(n) \int_0^1 dv\, f(v).$$
3. Let X be an r.v. with density ϕ; then

$$E[f(\{X\})g([X])] = \sum_{n=0}^\infty \int_n^{n+1} dt\, f(t - n)g(n)\varphi(t) = \sum_{n=0}^\infty \int_0^1 du\, f(u)\varphi(u + n)\, g(n).$$

From above, the independence between {X} and [X] is equivalent to

$$\sum_{n=0}^\infty \int_0^1 du\, f(u)\varphi(u + n)\, g(n) = \left(\sum_{n=0}^\infty \int_0^1 du\, f(u)\varphi(u + n)\right) \left(\sum_{n=0}^\infty \int_0^1 du\, g(n)\varphi(u + n)\right),$$

for any pair of bounded Borel functions f and g. This is also equivalent to

$$\int_0^1 du\, f(u)\varphi(u + n) = \left(\sum_{k=0}^\infty \int_0^1 du\, f(u)\varphi(k + u)\right) \int_0^1 dv\, \varphi(v + n),$$

for any bounded Borel function f and every n, that is

$$\varphi(u + n) = \left(\sum_{k=0}^\infty \varphi(k + u)\right) \int_0^1 dv\, \varphi(v + n), \qquad \text{a.e. } (du).$$

Then, ϕ must satisfy: for every $n \in \mathbb{N}$,

$$\varphi(u + n) = \psi(u)c_n, \qquad u \in (0, 1), \text{ a.e.,} \qquad (2.18.a)$$
for a positive sequence $(c_n)$ in $l^1$, and $\psi \ge 0$, $\int_0^1 du\, \psi(u) < \infty$. Conversely, it follows from the above arguments that if ϕ satisfies (2.18.a), then the independence of {X} and [X] holds.
Comments on the solution:
(a) More generally, B. Tsirel’son told us about the following result. Given every pair of probabilities on $\mathbb{R}_+$, μ and ν, which admit densities with respect to Lebesgue measure, there exists a joint law on $\mathbb{R}_+^2$, that of (A, B), such that: (i) A is distributed as μ, (ii) B is distributed as ν, (iii) B − A belongs to Q, a.s. In our example: B is exponentially distributed and A = {B}.
(b) Note how the property (2.18.a) is weaker than the “loss of memory property” for the exponential variable, which is equivalent to: $\varphi(u + s) = C\varphi(u)\varphi(s)$, $u, s \ge 0$.
Chapter 3 Gaussian variables Basic Gaussian facts
(a) A Gaussian space G is a sub-Hilbert space of $L^2(\Omega, \mathcal{F}, P)$ which consists of centred Gaussian variables (including 0). A most useful property is that, if $X_1, \ldots, X_k \in G$, they are independent if and only if they are orthogonal, i.e. $E[X_i X_j] = 0$ for all i, j such that i ≠ j.
(b) It follows that if $\mathbb{H}$ is a subspace of G, and if X ∈ G, then: $E[X \mid \sigma(\mathbb{H})] = \mathrm{proj}_{\mathbb{H}}(X)$. For these classical facts, see e.g. Neveu ([42], Chapter 2), and/or Hida–Hitsuda ([28], Chapter 2).
(c) The above facts explain why linear algebraic computations play such an important rôle in the study of Gaussian vectors, and/or processes, e.g. the conditional law of a Gaussian vector with respect to another is Gaussian and can be obtained without manipulating densities (see, e.g., solution to Exercise 3.11).
(d) However, dealing with nonlinear functionals of Gaussian vectors may necessitate other (nonlinear!) techniques; see e.g. Exercise 3.10.
(e) We have not used in our exercises, but would like to mention, the following non-decomposability property of the Gaussian distribution, due to H. Cramér. If X and Y are two independent centred variables such that their sum is Gaussian, then each of them is Gaussian. (See, e.g., Lukacs [38], p. 243, for this result as well as its counterpart, due to Raikov, for the Poisson law.)
(f) Further fine results on Gaussian random functions are found in Fernique [21] and Lifschits [37].
3.1 Constructing Gaussian variables from, but not belonging to, a Gaussian space *
Let X and Y be two centred, independent, Gaussian variables. 1. Show that, if ε is an r.v. which takes only the values +1 and −1 and which is measurable with respect to Y , then εX is a Gaussian variable which is independent of Y . In particular, εX and ε are independent. 2. Show that the sub-σ-field Σ of σ(X, Y ) which is generated by the r.v.s Z which satisfy: (i) Z is Gaussian and (ii) Z is independent of Y is the σ-field σ(X, Y ) itself. 3. Prove an analogous result to that of question 2, when Y is replaced by a Gaussian space IH which is assumed to be independent of X. Comments: The results of this exercise – question 2, say – highlight the difference between the Gaussian space generated by X and Y , i.e. the two-dimensional space Σ = {λX + μY ; λ, μ ∈ IR}, and the space L2 (σ(X, Y )), which contains many other Gaussian variables than the elements of Σ. *
3.2 A complement to Exercise 3.1
Let (Ω, F, P ) be a probability space, and let G (⊆ L2 (Ω, F, P )) be a Gaussian space. Let Z be a σ(G)-measurable r.v., which is, moreover, assumed to be Gaussian, and centred.
1. Show that, if the condition (γ) the closed vector space (in L2 (Ω, F, P )) which is generated by Z and G is a Gaussian space holds, then Z belongs to G. 2. In the case dim(G) = 1, construct an r.v. Z, which is σ(G)-measurable, Gaussian, centred, and such that (γ) is not satisfied. Comments: This exercise complements Exercise 3.1, in that it shows that many variables Z constructed in question 2 of Exercise 3.1 did not satisfy (γ) with G = {λX + μY : λ, μ ∈ IR}. ** 3.3
On the negative moments of norms of Gaussian vectors
We consider, on a probability space, two C-valued r.v.s Z = X + iY and Z′ = X′ + iY′ such that: (a) the vector (X, Y, X′, Y′) is Gaussian, and centred; (b) the variables X and Y have variance 1, and are independent; (c) the variables X′ and Y′ have variance 1, and are independent. In the sequel, $\mathbb{R}^2$ and C are often identified, and ⟨z, z′⟩ denotes the Euclidean scalar product of z and z′, elements of $C \simeq \mathbb{R}^2$, and $|z| = \langle z, z\rangle^{1/2}$.

1. Show that $E\left[\frac{1}{|Z|^p}\right] < \infty$ if and only if p < 2.

2. Let A be the covariance matrix of Z and Z′, that is

$$\forall \theta_1, \theta_2 \in \mathbb{R}^2, \qquad E[\langle\theta_1, Z\rangle \langle\theta_2, Z'\rangle] = \langle\theta_1, A\theta_2\rangle.$$

Show that there exists a Gaussian vector ξ, with values in $\mathbb{R}^2$, independent of Z, such that: $Z' = A^*Z + \xi$, where $A^*$ denotes the transpose of A. Show that ξ = 0 a.s. if and only if A is an orthogonal matrix, that is: $A^*A = \mathrm{Id}$.

3. Show that if $(I_2 - A^*A)$ is invertible, and if p < 2, then:

(i) $E\left[\frac{1}{|Z'|^p} \,\Big|\, Z\right] \le C$, for a certain constant C;

(ii) $E\left[\frac{1}{|Z|^p}\, \frac{1}{|Z'|^p}\right] < \infty$.
Comments and references: It is interesting to compare the fact that the last expression is finite for p < 2, under the hypothesis of question 3, with what happens for Z = Z′, that is: $E\left[\frac{1}{|Z|^{2p}}\right] < \infty$ if, and only if, p < 1. The finiteness result (ii) is a key point in the proof of the asymptotic independence of the winding numbers, as time goes to infinity, for certain linearly correlated planar Brownian motions, as shown in: M. Yor: Etude asymptotique des nombres de tours de plusieurs mouvements browniens complexes corrélés. In: Festschrift volume in honor of F. Spitzer: Random Walks, Brownian Motion and Interacting Particle Systems, eds. R. Durrett, H. Kesten, pp. 441–455, Birkhäuser, 1991.

** 3.4 Quadratic functionals of Gaussian vectors and continued fractions

Let $p \in \mathbb{N}$, and $\beta_0, \beta_1, \ldots, \beta_{p+1}$ be a sequence of (p + 2) independent, Gaussian r.v.s with values in $\mathbb{R}^d$. For each i, the d components of $\beta_i \equiv (\beta_{i,j};\ j = 1, 2, \ldots, d)$ are themselves independent and centred, and there exists $c_i > 0$ such that:

$$E\left[\beta_{i+1,j}^2\right] = \frac{1}{c_i} \qquad (i = -1, 0, \ldots, p;\ j = 1, 2, \ldots, d).$$

We write ⟨x, y⟩ for the scalar product of two vectors in $\mathbb{R}^d$.

1. Let U be an orthogonal d × d matrix, with real coefficients. Show that the r.v.s $\sum_{i=0}^p \langle\beta_i, \beta_{i+1}\rangle$ and $\sum_{i=0}^p \langle\beta_i, U\beta_{i+1}\rangle$ have the same law.
Hint: For every $k \in \mathbb{N}$, $U^k$ is an orthogonal matrix.

2. In this question, we assume d = 2. Define $S_k = \sum_{i=k}^p \langle\beta_i, \beta_{i+1}\rangle$. (In the sequel, it may be helpful to use the convention: $\beta_{-1} = \beta_{p+2} = 0$, and $S_{-1} = S_0$, $S_{p+1} = 0$.) Prove, using descending iteration starting from n = p, that for every n, there exist two functions $h_n$ and $k_n$ such that:

$$E[\exp(ix S_n) \mid \beta_n = m] = h_n(x) \exp\left(-\frac{|m|^2}{2} k_n(x)\right). \qquad (3.4.1)$$
Prove a recurrence relation between, on one hand, $h_n$, $h_{n+1}$ and $k_n$, and on the other hand, $k_n$ and $k_{n+1}$. Deduce therefrom the formulae

$$k_n(x) = \cfrac{x^2}{c_n + \cfrac{x^2}{c_{n+1} + \cfrac{x^2}{c_{n+2} + \cdots + \cfrac{x^2}{c_p}}}} \qquad (3.4.2)$$

and

$$E[\exp(ix S_0)] = \prod_{n=-1}^{p} \frac{c_n k_n(x)}{x^2}. \qquad (3.4.3)$$

3. In this question, we assume d = 1. Define $a_i = \frac{1}{2\sqrt{c_{i-1}c_i}}$ (i = 0, 1, . . . , p). Show that $S_0$ has the same distribution as ⟨g, Ag⟩, where $g = (g_0, \ldots, g_{p+1})$ is a random vector which consists of centred, independent Gaussian variables, with variance 1, and

$$A = \begin{pmatrix} 0 & a_0 & 0 & \cdots & 0 & 0 \\ a_0 & 0 & a_1 & \cdots & 0 & 0 \\ \vdots & \ddots & \ddots & \ddots & & \vdots \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{p-1} & 0 & a_p \\ 0 & 0 & \cdots & 0 & a_p & 0 \end{pmatrix}.$$

Deduce therefrom that $S_0$ is distributed as $\sum_{n=0}^{p+1} d_n g_n^2$, where $(d_n)$ is the sequence of eigenvalues of A.

4. In this question, we assume d = 2. Prove the formula:

$$E[\exp(ix S_0)] = \left(\frac{i}{2x}\right)^{p+2} \frac{1}{\det\left(A + \frac{i}{2x}\,\mathrm{Id}\right)}. \qquad (c)$$

5. Write $D_{a_0,\ldots,a_p}(\lambda) = \det(A + \lambda\,\mathrm{Id})$. Prove a recurrence relation between $D_{a_0,\ldots,a_p}$, $D_{a_1,\ldots,a_p}$, and $D_{a_2,\ldots,a_p}$. How can one understand, in terms of linear algebra, the identity of formulae (b) and (c)?

6. Give a necessary and sufficient condition in terms of the sequence of real numbers $(\alpha_n, n \ge 0)$ such that $\sum_{i=0}^n \alpha_i g_i g_{i+1}$ converges in $L^2$, where $(g_i)$ is a sequence of independent, centred, reduced Gaussian variables.
Comments and references: (a) For more details, we refer to the article from which the above exercise is drawn: Ph. Biane and M. Yor: Variations sur une formule de Paul Lévy. Ann. Inst. H. Poincaré Probab. Statist., 23, no. 2, suppl., 359–377 (1987), and its companion: C. Donati-Martin and M. Yor: Extension d’une formule de Paul Lévy pour la variation quadratique du mouvement brownien plan. Bull. Sci. Math., 116, no. 3, 353–382 (1992). (b) Both articles and, to a lesser extent, the present exercise, show how some classical continued fractions (some originating with Gauss) are related to series expansions of quadratic functionals of, say, Brownian motion. (c) In question 6, using the Martingale Convergence Theorem (see e.g. Exercise 1.5), it is possible to show that $\sum_{i=0}^n \alpha_i g_i g_{i+1}$ converges almost surely as n goes to +∞.
**
3.5 Orthogonal but non-independent Gaussian variables
1. We denote by z = x + iy the generic point of $C = \mathbb{R} + i\mathbb{R}$. Prove that, for every $n \in \mathbb{N}$, there exist two homogeneous real valued polynomials of degree n, $P_n$ and $Q_n$, such that: $z^n = P_n(x, y) + iQ_n(x, y)$. (These polynomials are closely related to the Tchebytcheff polynomials.)

2. Let X and Y be two centred, reduced, independent Gaussian variables. Show that, for every $n \in \mathbb{N}^*$,

$$A_n = \frac{P_n(X, Y)}{(X^2 + Y^2)^{\frac{n-1}{2}}} \quad \text{and} \quad B_n = \frac{Q_n(X, Y)}{(X^2 + Y^2)^{\frac{n-1}{2}}}$$

are two independent, centred, reduced Gaussian variables.

3. Prove that the double sequence $S = \{A_n, B_m;\ n, m \in \mathbb{N}\}$ consists of orthogonal variables, i.e. if C and D are two distinct elements of S, then: E[CD] = 0.

4. Let n ≠ m. Compute the complex cross moments of $(A_n, B_n, A_m, B_m)$, that is $E[Z_n^a (\overline{Z_m})^b]$, for any integers a, b, where $Z_n = A_n + iB_n$ and $Z_m = A_m + iB_m$. Is the vector $(A_n, B_n, A_m, B_m)$ Gaussian?
*
3.6 Isotropy property of multidimensional Gaussian laws
Let X and Y be two r.v.s taking their values in $\mathbb{R}^k$. Let, moreover, $G_k$ be a Gaussian r.v. with values in $\mathbb{R}^k$, which is centred, and has $I_k$ for covariance matrix, and let U be a r.v. which is uniformly distributed on $S_{k-1}$, the unit sphere of $\mathbb{R}^k$. We assume moreover that $G_k$, U, X and Y are independent. Prove that the following properties are equivalent:

(i) $\|X\| \overset{(law)}{=} \|Y\|$; (ii) $\langle G_k, X\rangle \overset{(law)}{=} \langle G_k, Y\rangle$; (iii) $\langle U, X\rangle \overset{(law)}{=} \langle U, Y\rangle$,

where ⟨x, y⟩ is the scalar product of x and y, two generic elements of $\mathbb{R}^k$, and $\|x\| = \langle x, x\rangle^{1/2}$ is the Euclidean norm of $x \in \mathbb{R}^k$.
**
3.7 The Gaussian distribution and matrix transposition

1. Let $G_n$ be an n-sample of the Gaussian, centred distribution with variance 1 on $\mathbb{R}$. Prove that, for every n × n matrix A with real coefficients, one has:

$$\|AG_n\| \overset{(law)}{=} \|A^*G_n\|,$$

where $A^*$ is the transpose of A.

2. Let X be an r.v. with values in $\mathbb{R}$, which has a finite second moment. Prove that if, for every $n \in \mathbb{N}$, and every n × n matrix A, one has:

$$\|AX_n\| \overset{(law)}{=} \|A^*X_n\|,$$

where $X_n$ is an n-sample of the law of X, then X is a centred Gaussian variable.

Comments: The solution we propose for question 1 uses systematically the Gauss transform, as defined in Exercise 4.16. Another proof can be given using the fact that $AA^*$ and $A^*A$ have the same eigenvalues, with the same order of multiplicity: write $AA^* = U^*DU$ and $A^*A = V^*DV$, where U and V are orthogonal matrices and D is diagonal, then use: $G_n \overset{(law)}{=} VG_n \overset{(law)}{=} UG_n$.
* 3.8 A law whose n-samples are preserved by every orthogonal transformation is Gaussian

Let X be an r.v. with values in $\mathbb{R}$, and, for every $n \in \mathbb{N}$, let $X_n$ denote an n-sample of (the law of) X.

1. Prove that if X is Gaussian and centred, then for every $n \in \mathbb{N}$, and every orthogonal n × n matrix R, one has: $X_n \overset{(law)}{=} RX_n$.

2. Prove that, conversely, if the above property holds for every $n \in \mathbb{N}$, and every orthogonal n × n matrix R, then X is Gaussian and centred.

Comments and references about Exercises 3.6, 3.7 and 3.8: This characterization of the normal distribution has a long history going back to the 1930s and early 1940s with H. Cramér and S. Bernstein, the main reference being: M. Kac: On a characterization of the normal distribution. Amer. J. Math., 61, 726–728 (1939). Amongst more recent references, we suggest: C. Donati-Martin, S. Song and M. Yor: On symmetric stable random variables and matrix transposition. Annales de l’I.H.P., 30, (3), 397–413 (1994); G. Letac and X. Milhaud: Une suite stationnaire et isotrope est sphérique. Zeit. für Wahr., 49, 33–36 (1979); G. Letac: Isotropy and sphericity: some characterizations of the normal distribution. The Annals of Statistics, 9, (2), 408–417 (1981); S. Berman: Stationarity, isotropy and sphericity in $\ell_p$. Zeit. für Wahr., 54, 21–23 (1980); and the book by S. Janson [30].
** 3.9
Non-canonical representation of Gaussian random walks
Consider $X_1, \ldots, X_n, \ldots$ a sequence of centred, independent Gaussian r.v.s, each with variance 1. Define for n ≥ 1,

$$S_n = \sum_{i=1}^n X_i \quad \text{and} \quad \Sigma_n = S_n - \sum_{j=1}^n \frac{S_j}{j}.$$

We define the σ-fields: $\mathcal{F}_n = \sigma\{X_1, \ldots, X_n\}$ and $\mathcal{G}_n = \sigma\{X_i - X_j;\ 1 \le i, j \le n\}$.

1. Prove that: $\mathcal{G}_n = \sigma\{\Sigma_1, \Sigma_2, \ldots, \Sigma_n\} = \sigma\left\{\frac{S_i}{i} - \frac{S_j}{j};\ 1 \le i, j \le n\right\}$.

2. Fix n and let $\{\hat{S}_k, k \le n\} = \{S_k - \frac{k}{n}S_n, k \le n\}$. Prove that $\{\hat{S}_k, k \le n\}$ is independent of $S_n$ and is distributed as $\{S_k, k \le n\}$ conditioned by $S_n = 0$. Prove that $\mathcal{G}_n = \sigma\{\hat{S}_k, k \le n\}$.

3. Define $Y_n = X_n - \frac{S_n}{n}$. Show that, for any n ≥ 1, the equality: $\mathcal{G}_n = \sigma\{Y_1, Y_2, \ldots, Y_n\}$ holds, and that, moreover, the variables $(Y_n, n \ge 1)$ are independent. Hence, the sequence $\Sigma_n = \sum_{j=1}^n Y_j$, n = 1, 2, . . . has independent increments.
Let $(\alpha_k, k = 1, 2, \ldots)$ be a sequence of reals, such that $\alpha_k \ne 0$, for all k. Prove that $\Sigma_n^{(\alpha)} \overset{(def)}{=} S_n - \sum_{k=1}^n \alpha_k S_k$ has independent increments if and only if $\alpha_k = \frac1k$, for every k.

4. Prove the equality: $\frac{S_n}{n} - \frac{S_{n+1}}{n+1} = -\frac{Y_{n+1}}{n}$. Deduce therefrom an expression for $E\left[\frac{S_n}{n} \,\Big|\, \mathcal{G}_{n+k}\right]$ ($k \in \mathbb{N}$) as a linear combination, to be computed, of $(Y_{n+1}, Y_{n+2}, \ldots, Y_{n+k})$.

5. Prove that $S_n$ is measurable with respect to $\sigma\{Y_{n+1}, Y_{n+2}, \ldots\}$.

6. Prove that: $\mathcal{F}_\infty = \mathcal{G}_\infty$, where: $\mathcal{F}_\infty = \lim_k \uparrow \mathcal{F}_k$, and $\mathcal{G}_\infty = \lim_k \uparrow \mathcal{G}_k$.
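Since all the variables above are linear combinations of $X_1, \ldots, X_N$, several of the claimed identities can be checked exactly by linear algebra on coefficient vectors. The following sketch does this with rational arithmetic; the horizon `N` and the helper names are assumptions introduced for illustration. For jointly Gaussian variables, zero covariance is equivalent to independence, which is what the covariance checks exploit.

```python
from fractions import Fraction

N = 8  # hypothetical horizon: every variable is a linear combination of X_1..X_N

def unit(i):
    """Coefficient vector of X_i in the basis (X_1, ..., X_N)."""
    return [Fraction(int(j == i)) for j in range(1, N + 1)]

def partial_sum(n):
    """Coefficient vector of S_n = X_1 + ... + X_n."""
    return [Fraction(int(j <= n)) for j in range(1, N + 1)]

def y(n):
    """Coefficient vector of Y_n = X_n - S_n / n."""
    return [x - s / n for x, s in zip(unit(n), partial_sum(n))]

def sigma(n):
    """Coefficient vector of Sigma_n = S_n - sum_{j<=n} S_j / j."""
    vec = partial_sum(n)
    for j in range(1, n + 1):
        vec = [a - s / j for a, s in zip(vec, partial_sum(j))]
    return vec

def cov(u, v):
    """Covariance of two linear combinations of i.i.d. N(0, 1) variables."""
    return sum(a * b for a, b in zip(u, v))

if __name__ == "__main__":
    # The Y_n are pairwise uncorrelated, and uncorrelated with S_N.
    print(all(cov(y(n), y(m)) == 0 for n in range(1, N + 1) for m in range(n + 1, N + 1)))
    print(all(cov(y(n), partial_sum(N)) == 0 for n in range(1, N + 1)))
    # Variance of Y_n, printed for inspection.
    print([cov(y(n), y(n)) for n in range(1, N + 1)])
```

The same vectors can be used to confirm $\Sigma_n = \sum_{j \le n} Y_j$ and the equality of question 4 coefficient by coefficient.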
Comments and references: This is the discrete version of the following “non-canonical” representation of Brownian motion, which is given by:

$$\Sigma_t = S_t - \int_0^t \frac{ds}{s} S_s \equiv \int_0^t \left(1 - \log\frac{t}{u}\right) dS_u, \qquad (3.9.1)$$

where we use the (unusual!) notation $(S_t, t \ge 0)$ for a one-dimensional Brownian motion. Then, $(\Sigma_t, t \ge 0)$ is another Brownian motion, and (3.9.1) is a non-canonical representation of Brownian motion, in the sense that $(S_u;\ u \le t)$ cannot be reconstructed from $(\Sigma_u, u \le t)$; in fact, just as in question 2, the random variable $S_t$ is independent of $\sigma\{\Sigma_u;\ u \le t\}$; nonetheless, just as in question 6, the two σ-fields $\sigma\{S_u;\ u \ge 0\}$ and $\sigma\{\Sigma_u;\ u \ge 0\}$ are equal up to negligible sets. For details, and references, see, e.g., Chapter 1 in M. Yor [65]. Another example, due to Lévy, is:

$$\Sigma_t = S_t - \frac{3}{t}\int_0^t S_u\, du = \int_0^t \left(\frac{3u}{t} - 2\right) dS_u.$$

$(\Sigma_t, t \ge 0)$ is a Brownian motion and $(S_u, u \le t)$ cannot be constructed from $(\Sigma_u, u \le t)$ since $\int_0^t u\, dS_u = tS_t - \int_0^t S_u\, du$ is independent from $(\Sigma_u, u \le t)$. For a more extended discussion, see: Y. Chiu: From an example of Lévy’s. Séminaire de Probabilités, XXIX, 162–165, Lecture Notes in Mathematics, 1613, Springer, Berlin, 1995. A recent contribution to this topic is: Y. Hibino and M. Hitsuda: Canonical property of representations of Gaussian processes with singular Volterra kernels. Infin. Dimens. Anal. Quantum Probab. Relat. Top., 5, (2), 293–296 (2002).
*
3.10 Concentration inequality for Gaussian vectors
Let X and Y be two independent centred d-dimensional Gaussian vectors with covariance matrix $I_d$.

1. Prove that for any $f, g \in C_b^2(\mathbb{R}^d)$,

$$\mathrm{Cov}(f(X), g(X)) = \int_0^1 E\left[\left\langle \nabla f(X), \nabla g\left(\alpha X + \sqrt{1 - \alpha^2}\, Y\right)\right\rangle\right] d\alpha, \qquad (3.10.1)$$

where $\nabla f(x) = \left(\frac{\partial f}{\partial x_i}(x)\right)$.
Hint: First check (3.10.1) for $f(x) = e^{i\langle t, x\rangle}$ and $g(x) = e^{i\langle s, x\rangle}$, $s, t, x \in \mathbb{R}^d$.

2. Let $\mu_\alpha$ be the Gaussian measure in $\mathbb{R}^{2d}$ which is the distribution of $(X, \alpha X + \sqrt{1 - \alpha^2}\, Y)$ and denote by μ the probability measure $\int_0^1 \mu_\alpha\, d\alpha$. Let Z be a random vector in $\mathbb{R}^d$ such that the 2d-dimensional random vector (X, Z) has law μ. Prove that for any Lipschitz function f, such that $\|f\|_{Lip} \le 1$ and E[f(X)] = 0, the inequality

$$E[f(X)e^{tf(X)}] \le tE[e^{tf(Z)}] \qquad (3.10.2)$$

holds for all t ≥ 0 and deduce that

$$E[e^{tf(X)}] \le e^{\frac{t^2}{2}}. \qquad (3.10.3)$$

3. Prove that for every Lipschitz function f on $\mathbb{R}^d$ with $\|f\|_{Lip} \le 1$, the inequality

$$P(f(X) - E[f(X)] \ge \lambda) \le e^{-\frac{\lambda^2}{2}} \qquad (3.10.4)$$

holds for any λ ≥ 0.
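As a sanity check of (3.10.1), which is not part of the exercise, one may take d = 1 and f = g = sin (a choice made here purely for illustration). Writing $W_\alpha = \alpha X + \sqrt{1 - \alpha^2}\, Y$, the pair $(X, W_\alpha)$ is centred Gaussian with unit variances and covariance α, and both sides can be computed in closed form from $E[\cos(\sigma N)] = e^{-\sigma^2/2}$ for a standard normal N:

```latex
\begin{align*}
\mathrm{Cov}(\sin X, \sin X) &= E[\sin^2 X] = \tfrac12\bigl(1 - E[\cos 2X]\bigr) = \tfrac12\bigl(1 - e^{-2}\bigr),\\
E[\cos X \cos W_\alpha] &= \tfrac12\bigl(E[\cos(X - W_\alpha)] + E[\cos(X + W_\alpha)]\bigr)
  = \tfrac12\bigl(e^{-(1-\alpha)} + e^{-(1+\alpha)}\bigr) = e^{-1}\cosh\alpha,\\
\int_0^1 e^{-1}\cosh\alpha \, d\alpha &= e^{-1}\sinh 1 = \tfrac12\bigl(1 - e^{-2}\bigr).
\end{align*}
```

The first and last lines coincide, as (3.10.1) predicts; the middle line uses that $X \mp W_\alpha$ is centred Gaussian with variance $2 \mp 2\alpha$.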
Comments and references: The inequality (3.10.4) is known as the concentration property of the Gaussian law. Many developments based on stochastic calculus may be found in: M. Ledoux: Concentration of measure and logarithmic Sobolev inequalities. Séminaire de Probabilités XXXIII, 120–216, Lecture Notes in Mathematics, 1709, Springer, Berlin, 1999. M. Ledoux: The concentration measure phenomenon. Mathematical Surveys and Monographs, 89. American Mathematical Society, Providence, RI, 2001. See also Theorem 2.6 in: W.V. Li and Q.M. Shao: Gaussian processes: inequalities, small ball probabilities and applications. Stochastic processes: theory and methods, 533–597, Handbook of Statist., 19, North-Holland, Amsterdam, 2001. This inequality has many applications in large deviations theory, see Exercise 5.5; the present exercise was suggested by C. Houdré, who gave us the next reference: C. Houdré: Comparison and deviation from a representation formula. In: Stochastic Processes and Related Topics, 207–218, Trends in Mathematics, Birkhäuser Boston, Boston, MA, 1998. R. Azencott: Grandes déviations et applications. Eighth Saint Flour Probability Summer School–1978 (Saint Flour, 1978), pp. 1–176, Lecture Notes in Mathematics, 774, Springer, Berlin, 1980. The inequality (3.10.4) is somehow related to the so-called Chernoff’s inequality, i.e. $E[(f(X) - E[f(X)])^2] \le E[\|\nabla f(X)\|^2]$, for any $f \in C^1(\mathbb{R}^d)$. Note that this inequality was re-discovered by Chernoff in 1981; it had been already proved much earlier by Nash in 1958. A proof of it may be found in Exercises 4.29 and 4.30, pp. 126 and 127 in Letac [36]. See also Exercise 3.13, Chapter 5 in Revuz and Yor [51] for a proof based on stochastic calculus.
* 3.11 Determining a jointly Gaussian distribution from its conditional marginals

Give a necessary and sufficient condition on the six-tuple $(\alpha, \beta, \gamma, \delta, a, b) \in \mathbb{R}^6$, for the existence of a two-dimensional Gaussian variable (X, Y) (not necessarily centred) with both X and Y non-degenerate satisfying:

$$\mathcal{L}(Y \mid X = x) = \mathcal{N}(\alpha + \beta x, a^2) \qquad (3.11.1)$$
$$\mathcal{L}(X \mid Y = y) = \mathcal{N}(\gamma + \delta y, b^2). \qquad (3.11.2)$$
Solutions for Chapter 3
Solution to Exercise 3.1 1. Let f and g be two bounded measurable functions. Since ε is measurable with respect to Y, and X is symmetric and independent of Y, we prove the result by writing

$$E[f(\varepsilon X)g(Y)] = E[f(X)g(Y)\mathbf{1}_{\{\varepsilon=1\}}] + E[f(-X)g(Y)\mathbf{1}_{\{\varepsilon=-1\}}] = E[f(X)]\left(E[g(Y)\mathbf{1}_{\{\varepsilon=1\}}] + E[g(Y)\mathbf{1}_{\{\varepsilon=-1\}}]\right) = E[f(X)]E[g(Y)].$$

2. For any $a \in \mathbb{R}$, let $\varepsilon_a^Y = \mathbf{1}_{\{Y \le a\}} - \mathbf{1}_{\{Y > a\}}$; then $Z_a = \varepsilon_a^Y X$ is a Gaussian variable which is independent of Y, and $\sigma(X, Y) = \sigma(X, Z_a : a \in \mathbb{R})$. Since, from question 1, each $Z_a$ is independent of Y, the latter identity proves that σ(X, Y) = Σ.

3. Let X be a Gaussian variable which is independent of $\mathbb{H}$. We want to prove that the sub-σ-field of $\sigma(X, \mathbb{H})$ which is generated by the r.v.s satisfying: (i) Z is Gaussian and (ii) Z is independent of $\mathbb{H}$ is the σ-field $\sigma(X, \mathbb{H})$ itself. The proof is essentially the same as in the previous question. To any $Y \in \mathbb{H}$, we associate the family $(\varepsilon_a^Y : a \in \mathbb{R})$ as in question 2. Moreover, we can check as in question 1 that for fixed $a \in \mathbb{R}$ and $Y \in \mathbb{H}$, each $\varepsilon_a^Y X$ is a Gaussian variable which is independent of $\mathbb{H}$. Finally, it is clear that $\sigma(\mathbb{H}) = \sigma(\varepsilon_a^Y : a \in \mathbb{R}, Y \in \mathbb{H})$ and $\sigma(X, \mathbb{H}) = \sigma(X, \varepsilon_a^Y X : a \in \mathbb{R}, Y \in \mathbb{H})$.
Solution to Exercise 3.2 1. Let $\mathbb{H}$ be the closed vector space generated by Z and G, and let $\mathbb{F}$ be the Gaussian subspace of $\mathbb{H}$ such that $\mathbb{H} = \mathbb{F} \oplus G$. Z can be decomposed as: $Z = Z_1 + Z_2$, with $Z_1 \in \mathbb{F}$ and $Z_2 \in G$. It follows from general properties of Gaussian spaces that $Z_1$ is independent of G; hence $E[Z \mid \sigma(G)] = Z_2$. But Z is σ(G)-measurable, so Z (= $Z_2$) belongs to G.

2. Let Y and G be such that $G = \{\alpha Y : \alpha \in \mathbb{R}\}$. Pick a real a > 0 and define $Z = -Y\mathbf{1}_{\{|Y| \le a\}} + Y\mathbf{1}_{\{|Y| > a\}}$. We easily check that Z is Gaussian, centred and σ(G)-measurable, although the vector (Y, Z) is not Gaussian. Indeed, $Y + Z = 2Y\mathbf{1}_{\{|Y| > a\}}$, thus P(Y + Z = 0) = P(|Y| ≤ a) belongs to the open interval (0, 1); the closed vector space generated by Z and G cannot be Gaussian.

Comments on the solution: In the case dim(G) = 2, any variable Z constructed in question 2 of Exercise 3.1 constitutes an example of an r.v. which is σ(G)-measurable, Gaussian, centred, and such that (γ) is not satisfied.
Solution to Exercise 3.3 1. A classical computation shows that $|Z|^2 = X^2 + Y^2$ is exponentially distributed with parameter 1/2. The result follows, since $E[|Z|^{-p}] = \int_0^\infty r^{-p/2}\, \frac12 e^{-r/2}\, dr$ is finite if and only if p/2 < 1.

2. Set $\xi = (\xi_1, \xi_2) = Z' - A^*Z$. From (a), the vector $(X, Y, \xi_1, \xi_2)$ is Gaussian, so to prove that ξ and Z are independent, it is enough to show that their covariance matrix equals 0. But by definition, this matrix is the difference between A and the covariance matrix of Z and $A^*Z$, which is precisely A. The covariance matrix of $A^*Z$ is $A^*A$ and the covariance matrix of Z′ is $I_2$, so ξ has covariance matrix $I_2 - A^*A$ and the result follows.

3. (i) We know from question 1 that if p < 2, then the conditional expectation $E\left[\frac{1}{|Z'|^p} \mid Z\right]$ is integrable, hence a.s. finite. On the other hand, we deduce from the independence between Z and ξ that

$$E\left[\frac{1}{|Z'|^p} \,\Big|\, Z\right] = E\left[\frac{1}{|A^*Z + \xi|^p} \,\Big|\, Z\right] = E\left[\frac{1}{|\xi + x|^p}\right]_{x = A^*Z}. \qquad (3.3.a)$$

Now let us check that the function $x \mapsto E\left[\frac{1}{|\xi + x|^p}\right]$ is defined and bounded on $\mathbb{R}^2$. Since the covariance matrix $I_2 - A^*A$ of ξ is invertible, the law of ξ has a density and moreover, there exists a matrix M such that Mξ is centred and has covariance matrix $I_2$. Let |M| be the norm of the linear operator represented by M; then from the inequality |Mξ + Mx| ≤ |M||ξ + x|, we can write

$$E\left[\frac{1}{|\xi + x|^p}\right] \le |M|^p E\left[\frac{1}{|M\xi + Mx|^p}\right] = \frac{|M|^p}{2\pi} \int_{\mathbb{R}^2} \frac{e^{-[(y_1 - x_1)^2 + (y_2 - x_2)^2]/2}}{(y_1^2 + y_2^2)^{p/2}}\, dy_1\, dy_2,$$

where $Mx = (x_1, x_2)$. The latter expression is clearly bounded in $(x_1, x_2) \in \mathbb{R}^2$. Hence, the conclusion follows from equality (3.3.a) above.

(ii) From (i), we have

$$E\left[\frac{1}{|Z|^p}\, \frac{1}{|Z'|^p}\right] = E\left[\frac{1}{|Z|^p}\, E\left[\frac{1}{|Z'|^p} \,\Big|\, Z\right]\right] \le C\, E\left[\frac{1}{|Z|^p}\right] < +\infty.$$
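Solution to Exercise 3.4 1. Since $\beta_i$ is centred and has covariance matrix $c_{i-1}^{-1}\mathrm{Id}$, then for any orthogonal matrix U, $U\beta_i \overset{(law)}{=} \beta_i$. Moreover, from the independence between the $\beta_i$’s and since for every $k \in \mathbb{N}$, $U^k$ is an orthogonal matrix, we have

$$(\beta_0, \ldots, \beta_{p+1}) \overset{(law)}{=} (\beta_0, U\beta_1, U^2\beta_2, \ldots, U^{p+1}\beta_{p+1}).$$

This identity in law implies

$$\sum_{i=0}^p \langle\beta_i, \beta_{i+1}\rangle \overset{(law)}{=} \sum_{i=0}^p \langle U^i\beta_i, U^{i+1}\beta_{i+1}\rangle.$$

We conclude by noticing that for any i, $\langle U^i\beta_i, U^{i+1}\beta_{i+1}\rangle = \langle\beta_i, U\beta_{i+1}\rangle$.

2. First, for n = p, we have

$$E[e^{ixS_p} \mid \beta_p = m] = E[e^{ix\langle\beta_p, \beta_{p+1}\rangle} \mid \beta_p = m] = E[e^{ix\langle m, \beta_{p+1}\rangle}] = e^{-\frac{x^2|m|^2}{2c_p}},$$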
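and the latter expression has the required form. Now, suppose that for a given index n ≤ p, $E[e^{ixS_n} \mid \beta_n = m]$ has the form which is given in the statement. Using this expression, we get

$$E[e^{ixS_{n-1}} \mid \beta_{n-1} = m] = E[e^{ix\langle\beta_{n-1}, \beta_n\rangle} e^{ixS_n} \mid \beta_{n-1} = m] = E[e^{ix\langle m, \beta_n\rangle} e^{ixS_n}]$$

$$= E[e^{ix\langle m, \beta_n\rangle} E[e^{ixS_n} \mid \beta_n]] = E\left[e^{ix\langle m, \beta_n\rangle} h_n(x) e^{-\frac{|\beta_n|^2 k_n(x)}{2}}\right]$$

$$= h_n(x) \prod_{j=1}^2 \sqrt{\frac{c_{n-1}}{2\pi}} \int_{-\infty}^{+\infty} dy\, e^{ixm_j y}\, e^{-k_n(x)y^2/2}\, e^{-c_{n-1}y^2/2}$$

$$= h_n(x) \prod_{j=1}^2 \sqrt{\frac{c_{n-1}}{k_n(x) + c_{n-1}}}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dz\, e^{\frac{ixm_j z}{\sqrt{k_n(x) + c_{n-1}}}}\, e^{-z^2/2}$$

$$= h_n(x)\, \frac{c_{n-1}}{k_n(x) + c_{n-1}}\, e^{-\frac{x^2|m|^2}{2(k_n(x) + c_{n-1})}}.$$

From the above identity, we deduce the following recurrence relations:

$$h_{n-1}(x) = \frac{h_n(x)c_{n-1}}{k_n(x) + c_{n-1}} \quad \text{and} \quad k_{n-1}(x) = \frac{x^2}{k_n(x) + c_{n-1}}, \qquad (3.4.a)$$

which implies (3.4.2). Expression (3.4.1) yields

$$E[\exp ixS_0] = h_0(x) E[e^{-|\beta_0|^2 k_0(x)/2}] = h_0(x) \left(\sqrt{\frac{c_{-1}}{2\pi}} \int_{-\infty}^{+\infty} dy\, e^{-(k_0(x) + c_{-1})y^2/2}\right)^2 = h_0(x) \left(\sqrt{\frac{c_{-1}}{2\pi(k_0(x) + c_{-1})}} \int_{-\infty}^{+\infty} dz\, e^{-z^2/2}\right)^2 = \frac{h_0(x)c_{-1}}{k_0(x) + c_{-1}}.$$

Combining this identity with the above recurrence relation, we obtain

$$E[\exp ixS_0] = h_0(x)c_{-1}\frac{k_{-1}(x)}{x^2} = \frac{h_1(x)c_0}{k_1(x) + c_0}\, c_{-1}\frac{k_{-1}(x)}{x^2} = h_1(x)c_0\frac{k_0(x)}{x^2}\, c_{-1}\frac{k_{-1}(x)}{x^2} = \cdots = \prod_{n=-1}^p \frac{c_n k_n(x)}{x^2}.$$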
3. Ag being given by

$$Ag = (a_0 g_1,\ a_0 g_0 + a_1 g_2,\ a_1 g_1 + a_2 g_3,\ \ldots,\ a_{p-1}g_{p-1} + a_p g_{p+1},\ a_p g_p),$$

we obtain $\langle g, Ag\rangle = 2a_0 g_0 g_1 + 2a_1 g_1 g_2 + 2a_2 g_2 g_3 + \cdots + 2a_p g_p g_{p+1}$, which allows us to conclude that

$$S_0 = \beta_0\beta_1 + \cdots + \beta_p\beta_{p+1} \overset{(law)}{=} \frac{1}{\sqrt{c_{-1}c_0}}\, g_0 g_1 + \frac{1}{\sqrt{c_0 c_1}}\, g_1 g_2 + \cdots + \frac{1}{\sqrt{c_{p-1}c_p}}\, g_p g_{p+1} = 2a_0 g_0 g_1 + \cdots + 2a_p g_p g_{p+1}.$$

A may be diagonalized as $A = U^{-1}DU$, where U is an orthogonal matrix. Since $Ug \overset{(law)}{=} g$, we have $\langle g, U^{-1}DUg\rangle \overset{(law)}{=} \langle g, Dg\rangle = \sum_{n=0}^{p+1} d_n g_n^2$.
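4. When d = 2, $S_0$ can be developed as follows: $S_0 = \beta_{0,1}\beta_{1,1} + \beta_{0,2}\beta_{1,2} + \beta_{1,1}\beta_{2,1} + \beta_{1,2}\beta_{2,2} + \cdots$, so it clearly appears that $S_0$ is equal in law to the sum of two independent copies of the sum $\sum_{i=0}^p \beta_i\beta_{i+1}$ in the case d = 1. Now let us compute $E[e^{ixS_0}]$ for d = 1:

$$E[e^{ixS_0}] = E\left[e^{ix\sum_{n=0}^{p+1} d_n g_n^2}\right] = \prod_{n=0}^{p+1} \int_{-\infty}^{+\infty} \frac{dy}{\sqrt{2\pi}}\, e^{-(1 - 2xid_n)y^2/2} = \prod_{n=0}^{p+1} \frac{1}{\sqrt{1 - 2xid_n}}.$$

So, when d = 2, taking account of the above remark, one has:

$$E[e^{ixS_0}] = \prod_{n=0}^{p+1} \frac{1}{1 - 2xid_n} = \left(\frac{i}{2x}\right)^{p+2} \frac{1}{\det\left(A + \frac{i}{2x}\,\mathrm{Id}\right)}.$$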
5. Developing the determinant of A + λ Id with respect to the first column, we obtain: $D_{a_0,\ldots,a_p}(\lambda) = \lambda D_{a_1,\ldots,a_p}(\lambda) - a_0 \det(M)$, where

$$M = \begin{pmatrix} a_0 & a_1 & 0 & \cdots & \cdots & 0 \\ 0 & \lambda & a_2 & 0 & \cdots & 0 \\ 0 & a_2 & \lambda & a_3 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & a_{p-1} & \lambda & a_p \\ 0 & \cdots & \cdots & 0 & a_p & \lambda \end{pmatrix}$$

and $\det(M) = a_0 D_{a_2,\ldots,a_p}(\lambda)$, so that

$$D_{a_0,\ldots,a_p}(\lambda) = \lambda D_{a_1,\ldots,a_p}(\lambda) - a_0^2 D_{a_2,\ldots,a_p}(\lambda).$$

This recurrence relation is a classical way to compute the characteristic polynomial of A. Henceforth formulae (b) and (c) combined together yield another manner to obtain this determinant.

6. Set $X_n = \sum_{i=0}^n \alpha_i g_i g_{i+1}$; then for any n and m such that n ≤ m one has

$$E[(X_m - X_n)^2] = \sum_{i=n+1}^m \alpha_i^2,$$

so $(X_n)$ is a Cauchy sequence in $L^2$ (which is equivalent to the convergence of $(X_n)$ in $L^2$) if and only if $\sum_{i=0}^\infty \alpha_i^2 < \infty$.
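For d = 2, the continued fraction formulae (3.4.2) and (3.4.3) and the determinant formula (c) describe the same characteristic function, so they can be compared numerically. The sketch below computes the first from the recurrence $k_n = x^2/(c_n + k_{n+1})$ and the second from the three-term recurrence of question 5; the function names and the sample values of $(c_n)$ are assumptions chosen for illustration.

```python
import math

def k_values(c, x):
    """Given c = [c_{-1}, c_0, ..., c_p], return [k_{-1}(x), ..., k_p(x)]
    from k_n = x^2 / (c_n + k_{n+1}), starting with k_{p+1} = 0."""
    k_next = 0.0
    ks = []
    for c_n in reversed(c):
        k_next = x * x / (c_n + k_next)
        ks.append(k_next)
    ks.reverse()
    return ks

def product_formula(c, x):
    """Right-hand side of (3.4.3): product over n = -1..p of c_n k_n(x) / x^2."""
    result = 1.0
    for c_n, k_n in zip(c, k_values(c, x)):
        result *= c_n * k_n / (x * x)
    return result

def tridiagonal_det(a, lam):
    """det(A + lam Id) for the matrix A with zero diagonal and off-diagonal entries
    a_0..a_p, via D_{a_0..a_p} = lam D_{a_1..a_p} - a_0^2 D_{a_2..a_p}."""
    d_prev, d_curr = 1.0, lam  # determinants of the trailing blocks of size 0 and 1
    for a_i in reversed(a):
        d_prev, d_curr = d_curr, lam * d_curr - a_i * a_i * d_prev
    return d_curr

def determinant_formula(c, x):
    """Right-hand side of (c), with a_i = 1 / (2 sqrt(c_{i-1} c_i)) and lam = i / (2x)."""
    a = [1.0 / (2.0 * math.sqrt(c[i] * c[i + 1])) for i in range(len(c) - 1)]
    lam = 1j / (2.0 * x)
    return lam ** len(c) / tridiagonal_det(a, lam)

if __name__ == "__main__":
    c = [1.5, 0.7, 2.0, 1.1, 0.4]  # hypothetical values of c_{-1}, ..., c_3 (p = 3)
    for x in (0.3, 1.0, 2.5):
        print(product_formula(c, x), determinant_formula(c, x))
```

The second column is returned as a complex number whose imaginary part should vanish up to rounding, and whose real part should match the first column.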
Solution to Exercise 3.5 1. We prove the result by induction. For n = 0 and n = 1, the statement is clearly true. Let us suppose that for a fixed n ∈ IN, there exist two homogeneous polynomials of degree n, Pn and Qn , such that: z n = Pn (x, y) + iQn (x, y) . We can write z n+1 as: z n+1 = xPn (x, y) − yQn (x, y) + i[yPn (x, y) + xQn (x, y)] . The polynomials Pn+1 (x, y) = xPn (x, y)−yQn (x, y) and Qn+1 = yPn (x, y)+xQn (x, y) are homogeneous with degree n + 1, so the result is true for z n+1 . 2. Set Z = X + iY . It is well known that Z may be written as Z = R exp(iΘ), where R = (X 2 + Y 2 )1/2 and Θ is a uniformly distributed r.v. over [0, 2π[ which is n independent of R. On the one hand, note that |Z|Zn−1 = R exp(inΘ) has the same law as Z. On the other hand, from question 1, we have Zn = An + iBn . |Z|n−1 So the complex r.v. An + iBn has the same law as Z, which proves that An and Bn are two independent, centred Gaussian variables. 3. First, from question 2, we know that E[An Bn ] = 0 for any n ∈ IN. Let n = m. With the notations introduced above, we have E[(An + iBn )(Am + iBm )] = E[R2 exp i(n + m)Θ] = E[R2 ]E[exp i(n + m)Θ] = 0 , and E[(An + iBn )(Am − iBm )] = E[R2 exp i(n − m)Θ] = E[R2 ]E[exp i(n − m)Θ] = 0 . Developing the left hand side of the above expressions, we see that the terms E[An Am ] + E[Bn Bm ], E[An Am ] − E[Bn Bm ], E[Am Bn ] + E[Bm An ], and E[Am Bn ] − E[Bm An ] vanish, from which we deduce that E[An Am ] = E[Bn Bm ] = E[Am Bn ] = 0. 4. (An , Bn ) and (Am , Bm ) are pairs of independent, centred Gaussian r.v.s such that Zn = An +iBn = R exp(inΘ) and Zm = Am +iBm = R exp(imΘ). So, for integers a, b: E[Zna (Zm )b ] = E[Ra+b exp(i(na − mb)Θ)] = E[Ra+b ]E[exp(i(na − mb)Θ)] .
From question 1 of Exercise 3.3, $R^2 \overset{\text{(law)}}{=} 2\varepsilon$, where $\varepsilon$ is exponentially distributed with parameter 1, thus $E[R^{a+b}] = 2^{\frac{a+b}{2}} E[\varepsilon^{\frac{a+b}{2}}] = 2^{\frac{a+b}{2}}\, \Gamma\!\left(\frac{a+b}{2} + 1\right)$. Moreover,
$$E[\exp(i(na - mb)\Theta)] = \frac{\exp(2i\pi(na - mb)) - 1}{2i\pi(na - mb)}$$
if $na \neq mb$, and 1 if $na = mb$. Hence
$$E[Z_n^a (\bar{Z}_m)^b] = \begin{cases} 2^{\frac{a+b-2}{2}}\, \dfrac{\exp(2i\pi(na - mb)) - 1}{i\pi(na - mb)}\, \Gamma\!\left(\dfrac{a+b}{2} + 1\right), & \text{if } na \neq mb, \\[2ex] 2^{\frac{a+b}{2}}\, \Gamma\!\left(\dfrac{a+b}{2} + 1\right), & \text{if } na = mb. \end{cases}$$
If the vector $(A_n, B_n, A_m, B_m)$ were Gaussian, then from question 3, the pairs $(A_n, B_n)$ and $(A_m, B_m)$ would be independent, hence $Z_n$ and $Z_m$ would be independent. But this is impossible since $|Z_n| = |Z_m| = R$ is not deterministic. Note that the vector (An , Bn ) is not Gaussian either. This can easily be verified for n = 2.
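The recurrence used in question 1 is easy to check numerically. The following Python sketch iterates $P_{n+1} = xP_n - yQ_n$, $Q_{n+1} = yP_n + xQ_n$ starting from $P_0 = 1$, $Q_0 = 0$, and compares the result with $(x + iy)^n$. The function name `homogeneous_parts` and the test point are illustrative choices, not part of the exercise.

```python
def homogeneous_parts(n, x, y):
    """Return (P_n(x, y), Q_n(x, y)) computed with the recurrence of question 1."""
    p, q = 1.0, 0.0  # z^0 = 1 + 0i
    for _ in range(n):
        # Simultaneous update: both right-hand sides use the old p and q.
        p, q = x * p - y * q, y * p + x * q
    return p, q


if __name__ == "__main__":
    x, y = 0.7, -1.3  # arbitrary test point
    for n in range(6):
        p, q = homogeneous_parts(n, x, y)
        z_n = complex(x, y) ** n
        # The two printed errors should be at rounding-error level.
        print(n, abs(p - z_n.real), abs(q - z_n.imag))
```

This only illustrates the algebraic identity $z^n = P_n + iQ_n$; it says nothing about the probabilistic statements of questions 2 to 4.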
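Solution to Exercise 3.6

(i)⇐⇒(ii): The identity in law $\langle X, G\rangle \overset{\text{(law)}}{=} \langle Y, G\rangle$ is equivalent to $E[e^{i\lambda\langle X,G\rangle}] = E[e^{i\lambda\langle Y,G\rangle}]$ for every $\lambda \in \mathbb{R}$. Computing each member of the equality, we get
$$E\left[e^{-\frac{\lambda^2}{2}\|X\|^2}\right] = E\left[e^{-\frac{\lambda^2}{2}\|Y\|^2}\right]$$
for every $\lambda \in \mathbb{R}$; hence, the Laplace transforms of the (non-negative) r.v.s $\|X\|^2$ and $\|Y\|^2$ are the same, which is equivalent to $\|X\| \overset{\text{(law)}}{=} \|Y\|$.

(ii)⇐⇒(iii): First observe that since the law of $G$ is preserved by any rotation in $\mathbb{R}^k$, $\|G\|\,U \overset{\text{(law)}}{=} G$, hence (ii)⇐⇒ $\|G\|\langle U, X\rangle \overset{\text{(law)}}{=} \|G\|\langle U, Y\rangle$. Now, it is easy to check that $\|G\|$ is a simplifiable r.v., so from Exercise 1.12 (or use Exercise 1.13)
$$\|G\|\langle U, X\rangle \overset{\text{(law)}}{=} \|G\|\langle U, Y\rangle \iff \langle U, X\rangle \overset{\text{(law)}}{=} \langle U, Y\rangle.$$

Comments on the solution: It appears, from above, that these equivalences also hold if we replace $G$ by any other isotropic random variable (i.e. whose law is invariant under orthogonal transformations) whose norm is simplifiable.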
Solution to Exercise 3.7

1. Let $G'_n$ be an independent copy of $G_n$; then for any real $\lambda$, we have:
$$E\left[e^{i\lambda\langle G'_n, AG_n\rangle}\right] = E\left[e^{i\lambda\langle A^*G'_n, G_n\rangle}\right].$$
By conditioning on $G_n$ on the left hand side and using the characteristic function of $G'_n$ (and similarly for the right hand side, with the roles of $G'_n$ and $G_n$ exchanged), the previous identity can be written as:
$$E\left[e^{-\frac{\lambda^2}{2}\|AG_n\|^2}\right] = E\left[e^{-\frac{\lambda^2}{2}\|A^*G_n\|^2}\right].$$
This proves the identity in law $\|AG_n\| \overset{\text{(law)}}{=} \|A^*G_n\|$, by the uniqueness property of the Laplace transform.

2. Let $n \in \mathbb{N}^*$ and denote by $X^{(i)}$, $i = 1, \dots, n$, the coordinates of $X_n$; then by considering the equality in law of the statement for
$$A = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 & \cdots & 1 \\ 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix},$$
we get the following identity in law
$$\frac{1}{\sqrt{n}} \left|\sum_{i=1}^{n} X^{(i)}\right| \overset{\text{(law)}}{=} |X|. \qquad (3.7.a)$$
Since $X$ is square integrable, we can apply the Central Limit Theorem, so, letting $n \to \infty$ in (3.7.a), we see that $|X|$ is distributed as the absolute value of a centred Gaussian variable. It remains for us to show that $X$ is symmetric. By considering the identity in law in the statement of question 2 for
$$A = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix},$$
we obtain
$$|X^{(1)} + X^{(2)}| \overset{\text{(law)}}{=} |X^{(1)} - X^{(2)}|.$$
From Exercise 3.6, this identity in law is equivalent to
$$\varepsilon(X^{(1)} + X^{(2)}) \overset{\text{(law)}}{=} \varepsilon(X^{(1)} - X^{(2)}), \qquad (3.7.b)$$
where $\varepsilon$ is a symmetric Bernoulli r.v. which is independent of both $X^{(1)} + X^{(2)}$ and $X^{(1)} - X^{(2)}$. Let $\varphi$ be the characteristic function of $X$. The identity (3.7.b) may be expressed in terms of $\varphi$ as: $\frac{1}{2}\left(\varphi^2(\lambda) + \bar{\varphi}^2(\lambda)\right) = \varphi(\lambda)\bar{\varphi}(\lambda)$, or equivalently $(\varphi(\lambda) - \bar{\varphi}(\lambda))^2 = 0$. Hence, $\varphi(\lambda)$ takes only real values, that is: $X$ is symmetric.
Solution to Exercise 3.8 1. It suffices to note that RXn is a centred Gaussian vector with covariance matrix RR∗ = I2 .
2. Let $n = 2$ and $X_2 = (X^1, X^2)$; then by taking
$$A = \begin{pmatrix} -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix},$$
we get
$$X^1 + X^2 \overset{\text{(law)}}{=} X^1 - X^2 \overset{\text{(law)}}{=} \sqrt{2}\, X^1. \qquad (3.8.a)$$
Writing this identity in terms of the characteristic function $\varphi$ of $X$, we obtain the equation: $\varphi(t)^2 = \varphi(t)\bar{\varphi}(t) = \varphi(\sqrt{2}\,t)$, for any $t \in \mathbb{R}$. Therefore, $\varphi$ is real and non-negative. Since moreover $\varphi(0) = 1$, then from the above equation and by the continuity of $\varphi$, we deduce that it is strictly positive on $\mathbb{R}$. We may then define $\psi = \log \varphi$ on $\mathbb{R}$. From the equation $2\psi(t) = \psi(\sqrt{2}\,t)$, which holds for all $t \in \mathbb{R}$, and from the continuity of $\psi$, we deduce by classical arguments that $\psi$ has the form $\psi(t) = at^2$, for some $a \in \mathbb{R}$. So $\varphi$ has the form $\varphi(t) = e^{at^2}$. Finally, differentiating each member of the equation $\varphi(t)^2 = \varphi(\sqrt{2}\,t)$, we obtain $a = -\mathrm{var}(X)/2$.

Assuming the variance of $X$ is finite, a more probabilistic proof consists in writing:
$$\varphi(t) = \varphi\!\left(\frac{1}{\sqrt{2}}\,t\right)^{2} = \cdots = \varphi\!\left(\frac{1}{(\sqrt{2})^{n}}\,t\right)^{2^n},$$
from which we deduce:
$$X \overset{\text{(law)}}{=} \frac{1}{2^{n/2}} \sum_{i=1}^{2^n} X_i.$$
Since from (3.8.a) $X$ is centred, letting $n \to \infty$, we see from the central limit theorem that $X$ is a Gaussian variable.
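Solution to Exercise 3.9

1. First note that $\mathcal{G}_n = \sigma\{X_{i+1} - X_i;\ 1 \le i \le n - 1\}$ and $\sigma\{\Sigma_1, \Sigma_2, \dots, \Sigma_n\} = \sigma\{\Sigma_{i+1} - \Sigma_i;\ 1 \le i \le n - 1\}$, so the first equality comes from the relation:
$$X_{i+1} - X_i = \frac{i+1}{i}(\Sigma_{i+1} - \Sigma_i) + \Sigma_{i-1} - \Sigma_i, \qquad 1 \le i \le n - 1.$$
Note also that
$$\sigma\left\{\frac{S_i}{i} - \frac{S_j}{j};\ 1 \le i, j \le n\right\} = \sigma\left\{\frac{S_{i+1}}{i+1} - \frac{S_i}{i};\ 1 \le i \le n - 1\right\},$$
so that the second equality is due to:
$$\frac{S_{i+1}}{i+1} - \frac{S_i}{i} = \frac{1}{i}\left(X_{i+1} - \frac{S_{i+1}}{i+1}\right) = \frac{1}{i}(\Sigma_{i+1} - \Sigma_i), \qquad 1 \le i \le n - 1. \qquad (3.9.a)$$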
2. The vector $(\hat{S}_1, \dots, \hat{S}_n, S_n)$ is Gaussian and for each $k \le n$, $\mathrm{Cov}(\hat{S}_k, S_n) = 0$; hence, $S_n$ is independent of $\{\hat{S}_k, k \le n\}$. Writing $\{S_k, k \le n\} = \{\hat{S}_k + \frac{k}{n} S_n, k \le n\}$, we deduce from this independence that $\{\hat{S}_k, k \le n\}$ has the same law as $\{S_k, k \le n\}$ conditioned by $S_n = 0$. Finally, we deduce from question 1 and the equality $\frac{1}{i}\hat{S}_i - \frac{1}{j}\hat{S}_j = \frac{1}{i}S_i - \frac{1}{j}S_j$ that $\mathcal{G}_n = \sigma\{\hat{S}_k, k \le n\}$.

3. The equality $\mathcal{G}_n = \sigma\{Y_1, \dots, Y_n\}$ is due to (3.9.a). For each $n \ge 1$, $(Y_1, \dots, Y_n)$ is a Gaussian vector whose covariance matrix is diagonal, hence, $Y_1, \dots, Y_n$ are independent. Since the vector $(\Sigma^{(\alpha)}_n, n = 1, 2, \dots)$ is Gaussian, its increments are independent if and only if
$$E[(X_k - \alpha_k S_k)(X_n - \alpha_n S_n)] = 0,$$
for any $k < n$. Under the condition $\alpha_k \neq 0$, this is verified if and only if $\alpha_k = \frac{1}{k}$.

4. This equality has been established in (3.9.a). By successive iterations of equality (3.9.a), we obtain:
$$\frac{S_n}{n} = \frac{S_{n+k}}{n+k} - \sum_{j=1}^{k} \frac{Y_{n+j}}{n + j - 1}. \qquad (3.9.b)$$
Note that from question 2, $S_{n+k}$ is independent of $\mathcal{G}_{n+k}$. Since moreover $S_{n+k}$ is centred and $(Y_{n+1}, \dots, Y_{n+k})$ is $\mathcal{G}_{n+k}$-measurable, we have:
$$E\left[\frac{S_n}{n} \,\Big|\, \mathcal{G}_{n+k}\right] = E\left[\frac{S_{n+k}}{n+k} \,\Big|\, \mathcal{G}_{n+k}\right] - E\left[\sum_{j=1}^{k} \frac{Y_{n+j}}{n + j - 1} \,\Big|\, \mathcal{G}_{n+k}\right] = -\sum_{j=1}^{k} \frac{Y_{n+j}}{n + j - 1}.$$

5. Letting $k \to \infty$ in (3.9.b), we obtain:

a) $\lim_{k\to\infty} \frac{S_{n+k}}{n+k} = 0$, a.s. and in $L^2$, by the law of large numbers,

b) $\frac{S_n}{n} = -\sum_{j=1}^{\infty} \frac{Y_{n+j}}{n + j - 1}$, a.s.,

the convergence of the series holds in $L^2$ and a.s. In particular, note that a.s. $\lim_{k\to\infty} \frac{S_{n+k}}{n+k} = 0$ by the law of large numbers.

6. It is clear, from the definition of $\mathcal{F}_n$ and $\mathcal{G}_n$, that for every $n$, $\mathcal{F}_n = \mathcal{G}_n \vee \sigma(X_1)$, which implies $\mathcal{F}_\infty = \mathcal{G}_\infty \vee \sigma(X_1)$. Moreover, from question 3, $\mathcal{G}_\infty = \sigma\{Y_1, Y_2, \dots\}$ and from question 5, $\sigma(X_1) \subset \sigma\{Y_1, Y_2, \dots\}$. The conclusion follows.
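The telescoping identity (3.9.b) is purely algebraic, so it can be checked path by path. The sketch below does this on one simulated path, under the assumption (read off from (3.9.a) and (3.9.b), since the exercise statement is not reproduced here) that $Y_j = X_j - S_j/j$. The function name and the fixed seed are illustrative choices.

```python
import random


def both_sides_of_3_9_b(xs, n, k):
    """Evaluate both sides of (3.9.b) on one path xs = [X_1, ..., X_N], with N >= n + k."""
    partial = [0.0]  # partial[j] = S_j, with S_0 = 0
    for x in xs:
        partial.append(partial[-1] + x)

    def y(j):
        # Assumed definition, consistent with (3.9.a): Y_j = X_j - S_j / j
        return xs[j - 1] - partial[j] / j

    lhs = partial[n] / n
    rhs = partial[n + k] / (n + k) - sum(
        y(n + j) / (n + j - 1) for j in range(1, k + 1)
    )
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only to obtain a reproducible path
    path = [rng.gauss(0.0, 1.0) for _ in range(50)]
    lhs, rhs = both_sides_of_3_9_b(path, n=5, k=40)
    print(abs(lhs - rhs))  # should be at rounding-error level
```

The identity holds for any real sequence, Gaussian or not; the Gaussian assumption only enters through the independence statements of questions 2 to 4.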
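Solution to Exercise 3.10

1. By standard approximation arguments, it is enough to show identity (3.10.1) for $f(x) = e^{i\langle t, x\rangle}$ and $g(x) = e^{i\langle s, x\rangle}$, with $s, t, x \in \mathbb{R}^d$. Let us denote by $\varphi$ the characteristic function of $X$, that is $E[e^{i\langle t, X\rangle}] = \varphi(t) = e^{-\frac{|t|^2}{2}}$, $t \in \mathbb{R}^d$. First, it follows from the definition of the covariance that
$$\mathrm{Cov}(f(X), g(X)) = \varphi(s + t) - \varphi(s)\varphi(t).$$
On the other hand, the computation of the integral in (3.10.1) gives:
$$\begin{aligned}
\int_0^1 E\left[\left\langle \nabla f(X), \nabla g\left(\alpha X + \sqrt{1 - \alpha^2}\, Y\right)\right\rangle\right] d\alpha
&= \int_0^1 d\alpha\, E\left[\left\langle \left(it_j \exp(i\langle t, X\rangle)\right)_j, \left(is_j \exp\left(i\left\langle s, \alpha X + \sqrt{1 - \alpha^2}\, Y\right\rangle\right)\right)_j \right\rangle\right] \\
&= -\int_0^1 d\alpha \sum_j s_j t_j\, E\left[e^{i\langle t + \alpha s, X\rangle}\right] E\left[e^{i\sqrt{1 - \alpha^2}\langle s, Y\rangle}\right] \\
&= -\int_0^1 d\alpha\, \langle s, t\rangle \exp\left(-\frac{1}{2}\left(|t|^2 + 2\alpha\langle s, t\rangle + |s|^2\right)\right) \\
&= -\varphi(s)\varphi(t)\left(1 - e^{-\langle s, t\rangle}\right) = \varphi(s + t) - \varphi(s)\varphi(t).
\end{aligned}$$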
2. Let $f$ be any Lipschitz function such that $\|f\|_{\mathrm{Lip}} \le 1$ and $E[f(X)] = 0$, and define $g_t \overset{\text{(def)}}{=} e^{tf}$, for $t \ge 0$. Then applying (3.10.1) to $f$ and $g_t$ gives
$$E[f(X)g_t(X)] = E[\langle \nabla f(X), \nabla g_t(Z)\rangle] = tE[\langle \nabla f(X), \nabla f(Z)\rangle\, e^{tf(Z)}] \le tE[e^{tf(Z)}],$$
thanks to the condition $\|f\|_{\mathrm{Lip}} \le 1$. Now let the function $u$ be defined via $E[e^{tf(X)}] = e^{u(t)}$. Then $E[f(X)e^{tf(X)}] = u'(t)e^{u(t)}$, and from the above inequality: $u'(t) \le t$. Since $u(0) = 0$, we conclude that $u(t) \le \frac{t^2}{2}$, that is $E[e^{tf(X)}] \le e^{\frac{t^2}{2}}$.

3. By symmetry, one can also apply (3.10.3) to $-f$, hence (3.10.3) holds for all $t \in \mathbb{R}$ and for every Lipschitz function $f$ on $\mathbb{R}^d$ with $\|f\|_{\mathrm{Lip}} \le 1$ and $E[f(X)] = 0$. Then (3.10.4) follows from Chebychev's inequality, since:
$$e^{t\lambda} P(f(X) - E[f(X)] > \lambda) \le e^{\frac{t^2}{2}}, \quad \text{for every } t \text{ and } \lambda,$$
so that
$$P(f(X) - E[f(X)] > \lambda) \le \exp\left(\inf_{t > 0}\left(\frac{t^2}{2} - \lambda t\right)\right) = \exp\left(-\frac{\lambda^2}{2}\right).$$
Solution to Exercise 3.11 (Given by A. Cherny)

Let $(X, Y)$ be a Gaussian vector with the mean vector $(m_X, m_Y)$ and with the covariance matrix
$$\begin{pmatrix} c_{XX} & c_{XY} \\ c_{XY} & c_{YY} \end{pmatrix}.$$
By the "normal correlation" theorem (see for example, Theorem 2, Chapter II, § 13 in Shiryaev [55], but prove the result directly...),
$$\mathcal{L}(Y \mid X = x) = \mathcal{N}\left(m_Y + \frac{c_{XY}}{c_{XX}}(x - m_X),\ c_{YY} - \frac{c_{XY}^2}{c_{XX}}\right),$$
$$\mathcal{L}(X \mid Y = y) = \mathcal{N}\left(m_X + \frac{c_{XY}}{c_{YY}}(y - m_Y),\ c_{XX} - \frac{c_{XY}^2}{c_{YY}}\right).$$
In order for $(\alpha, \beta, \gamma, \delta, a, b)$ to satisfy the desired property, we should have
$$\alpha = m_Y - m_X\frac{c_{XY}}{c_{XX}}, \qquad \beta = \frac{c_{XY}}{c_{XX}}, \qquad a^2 = \frac{c_{XX}c_{YY} - c_{XY}^2}{c_{XX}},$$
$$\gamma = m_X - m_Y\frac{c_{XY}}{c_{YY}}, \qquad \delta = \frac{c_{XY}}{c_{YY}}, \qquad b^2 = \frac{c_{XX}c_{YY} - c_{XY}^2}{c_{YY}}$$
for some $(m_X, m_Y, c_{XX}, c_{XY}, c_{YY})$. Owing to the inequality $c_{XY}^2 \le c_{XX}c_{YY}$, we have $0 \le \beta\delta \le 1$.

Let us first consider the case when $\beta\delta = 0$. Then $c_{XY} = 0$, which yields $\beta = 0$, $\delta = 0$, $a^2 > 0$, and $b^2 > 0$. On the other hand, for any $(\alpha, \beta, \gamma, \delta, a, b)$ with these properties, we can find corresponding $(m_X, m_Y, c_{XX}, c_{XY}, c_{YY})$.

Let us now consider the case when $0 < \beta\delta < 1$. Then $\beta b^2 = \delta a^2$, $a^2 > 0$, and $b^2 > 0$. On the other hand, for any $(\alpha, \beta, \gamma, \delta, a, b)$ with these properties, we can find corresponding $(m_X, m_Y, c_{XX}, c_{XY}, c_{YY})$.

Let us finally consider the case when $\beta\delta = 1$. Then $c_{XY}^2 = c_{XX}c_{YY}$, which yields $\alpha = -\beta\gamma$, $a^2 = 0$, and $b^2 = 0$. On the other hand, for any $(\alpha, \beta, \gamma, \delta, a, b)$ with these properties, we can find corresponding $(m_X, m_Y, c_{XX}, c_{XY}, c_{YY})$.

As a result, the desired necessary and sufficient condition is:
$$(\alpha, \beta, \gamma, \delta, a, b) \in \{\beta = 0, \delta = 0, a^2 > 0, b^2 > 0\} \cup \{0 < \beta\delta < 1, \beta b^2 = \delta a^2, a^2 > 0, b^2 > 0\} \cup \{\beta\delta = 1, \alpha = -\beta\gamma, a^2 = 0, b^2 = 0\}.$$
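The relations between the two sets of parameters can be explored numerically. The following Python sketch maps $(m_X, m_Y, c_{XX}, c_{XY}, c_{YY})$ to $(\alpha, \beta, \gamma, \delta, a^2, b^2)$ using the formulae above, so that the constraints $\beta b^2 = \delta a^2$ and, in the degenerate case, $\alpha = -\beta\gamma$ can be observed on examples. The function name and the sample values are illustrative, and the sketch assumes $c_{XX} > 0$ and $c_{YY} > 0$.

```python
def regression_parameters(m_x, m_y, c_xx, c_xy, c_yy):
    """Return (alpha, beta, gamma, delta, a2, b2) such that
    L(Y | X = x) = N(alpha + beta * x, a2) and L(X | Y = y) = N(gamma + delta * y, b2).
    Assumes c_xx > 0 and c_yy > 0."""
    det = c_xx * c_yy - c_xy ** 2  # non-negative by the Cauchy-Schwarz inequality
    beta = c_xy / c_xx
    alpha = m_y - m_x * beta
    a2 = det / c_xx
    delta = c_xy / c_yy
    gamma = m_x - m_y * delta
    b2 = det / c_yy
    return alpha, beta, gamma, delta, a2, b2


if __name__ == "__main__":
    # Non-degenerate example: 0 < beta * delta < 1.
    alpha, beta, gamma, delta, a2, b2 = regression_parameters(1.0, -2.0, 2.0, 0.5, 1.0)
    print(beta * delta, beta * b2, delta * a2)

    # Degenerate example: c_xy^2 = c_xx * c_yy, so beta * delta = 1.
    alpha, beta, gamma, delta, a2, b2 = regression_parameters(1.0, -2.0, 4.0, 2.0, 1.0)
    print(beta * delta, alpha, -beta * gamma, a2, b2)
```

The sketch covers only the direct implication (from the Gaussian vector to the six parameters); the converse, which is the substance of the solution, still has to be argued as in the text.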
Chapter 4 Distributional computations
Contents of this chapter
Probabilists often deal with the following two topics.

(i) Given a random vector $Y = (X_1, \dots, X_n)$ whose distribution on $\mathbb{R}^n$ is known, e.g. through its density, compute the law of $\varphi(X_1, \dots, X_n)$, where $\varphi : \mathbb{R}^n \to \mathbb{R}$ is a certain Borel function. This involves essentially changing variables in multiple integrals, hence computing Jacobians, etc.

(ii) The distribution of $Y = (X_1, \dots, X_n)$ may not be so easy to express explicitly. Then, one resorts to finding expressions for some transform of this distribution which characterizes it, e.g. its characteristic function $\Phi_Y(y) = E[\exp i\langle y, Y\rangle]$, $y \in \mathbb{R}^n$, or, if $Y \equiv X_1$ takes values in $\mathbb{R}_+$, its Laplace transform $\lambda \to E[e^{-\lambda Y}]$, $\lambda \ge 0$, or its Mellin transform: $m \to E[Y^m]$, $m \ge 0$, or $m \in \mathbb{C}$, or its Stieltjes transform: $s \to E\left[\frac{1}{1 + sY}\right]$, $s \ge 0$ (or variants of this function). For this topic, we refer for instance to the books by Lukacs [38] and Widder [61].

In this chapter, we have focussed mainly on the families of beta and gamma variables, as well as on stable($\alpha$) unilateral variables. Most of the identities on these stable variables are found in the books by Zolotarev [66] and Uchaikin and Zolotarev [59]. For some proofs (by analysis) of identities involving the gamma and beta functions, we refer the reader to Andrews, Askey and Roy [1] or Lebedev [69].
* 4.1 Hermite polynomials and Gaussian variables
Preliminaries: Let $(x, t)$ denote the generic point in $\mathbb{R}^2$. There exists a sequence of polynomials in two variables (the Hermite polynomials) such that, for every $a \in \mathbb{C}$:
$$e_a(x, t) \equiv \exp\left(ax - \frac{a^2 t}{2}\right) = \sum_{n=0}^{\infty} \frac{a^n}{n!}\, h_n(x, t).$$
In the following, we write simply $e_a(x)$ for $e_a(x, 1)$, resp. $h_n(x)$ for $h_n(x, 1)$. Throughout the exercise, $X$ and $Y$ denote two centred, independent, Gaussian variables, with variance 1.

Part A.

1. Compute $E[e_a(X)e_b(X)]$ ($a, b \in \mathbb{C}$) and deduce: $E[h_n(X)h_m(X)]$ ($n, m \in \mathbb{N}$).

2. Let $a \in \mathbb{C}$, and $c \in \mathbb{R}$. Compute $E[\exp a(X + cY) \mid X]$ in terms of the function of two variables $e_a$, as well as of $X$ and $c$. Compute $E[(X + cY)^k \mid X]$, for $k \in \mathbb{N}$, in terms of the polynomial of two variables $h_k$, and of $X$ and $c$.

Part B. Define $T = \frac{1}{2}(X^2 + Y^2)$, $g = \frac{X^2}{X^2 + Y^2}$, and $\varepsilon = \mathrm{sgn}(X)$.

1. Compute explicitly the joint law of the triple $(T, g, \varepsilon)$. What can be said about these three variables? What is the distribution of $\sqrt{2T}$?

2. Define $\mu = \varepsilon\sqrt{g} = \frac{X}{\sqrt{X^2 + Y^2}}$.
Show that, for any $a \in \mathbb{C}$, the following identities hold.

(i) $E[\exp(aX) \mid \mu] = \varphi(a\mu)$, where: $\varphi(x) = 1 + x \exp\left(\frac{x^2}{2}\right) F(x)$, and $F(x) = \int_{-\infty}^{x} dy\, e^{-y^2/2}$.

(ii) $E[\sinh(aX) \mid \mu] = \sqrt{\frac{\pi}{2}}\; a\mu \exp\left(\frac{a^2\mu^2}{2}\right)$.

(iii) $E[\cosh(aX) \mid \mu] = 1 + a^2\mu^2 \int_0^1 dy\, \exp\left(\frac{a^2\mu^2}{2}(1 - y^2)\right)$.
3. Show that, for any $p \in \mathbb{N}$:
$$E[h_{2p+1}(X) \mid \mu] = c_{2p+1}\, \mu(\mu^2 - 1)^p,$$
$$E[h_{2p}(X) \mid \mu] = c_{2p} + c'_{2p}\, \mu^2 \int_0^1 dy\, \left(\mu^2(1 - y^2) - 1\right)^{p-1} \qquad (p \ge 1),$$
where $c_{2p}$, $c_{2p+1}$, and $c'_{2p}$ are some constants to be determined.

4. Define $\nu = \frac{Y}{\sqrt{2T}}$. Are the variables $\mu$ and $\nu$ independent? Compute, for $a, b \in \mathbb{C}$, the conditional expectation: $E[\exp(aX + bY) \mid \mu, \nu]$ with the help of the function $\varphi$ introduced in question 2 of part B.

Comments and references: This exercise originates from the article:

J. Azéma and M. Yor: Étude d'une martingale remarquable. Séminaire de Probabilités, XXIII, 88–130, Lecture Notes in Mathematics, 1372, Springer, Berlin, 1989.

We now explain how the present exercise has been constructed. The remarkable martingale which is studied in this article is the so-called Azéma martingale: $\mu_t = \mathrm{sgn}(B_t)\sqrt{t - \gamma_t}$, where $(B_t)$ is a one-dimensional Brownian motion and $\gamma_t = \sup\{s \le t : B_s = 0\}$. Let us take $t = 1$, and denote $\mu = \mu_1$, $\varepsilon = \mathrm{sgn}(B_1)$, $g = 1 - \gamma_1$, $X = B_1$. Then the triplet $(X^2, \varepsilon, g)$ is distributed as $\left(X^2, \varepsilon, \frac{X^2}{X^2 + Y^2}\right)$ in the present exercise (Brownian enthusiasts can check this assertion!) so that the (Brownian) quantity $E[\phi(B_1) \mid \mu_1 = z]$ found in the above article is equal to $E\left[\phi(X) \,\Big|\, \varepsilon\sqrt{\frac{X^2}{X^2 + Y^2}} = z\right]$.

** 4.2 The beta–gamma algebra and Poincaré's Lemma
Let $a, b > 0$. Throughout the exercise, $Z_a$ is a gamma variable with parameter $a$, and $Z_{a,b}$ a beta variable with parameters $a$ and $b$, that is:
$$P(Z_a \in dt) = \frac{t^{a-1} e^{-t}\, dt}{\Gamma(a)}, \quad (t \ge 0)$$
and
$$P(Z_{a,b} \in dt) = \frac{t^{a-1}(1 - t)^{b-1}\, dt}{\beta(a, b)}, \quad (t \in [0, 1]).$$

1. Show the identity in law between the two-dimensional variables:
$$(Z_a, Z_b) \overset{\text{(law)}}{=} (Z_{a,b} Z_{a+b},\ (1 - Z_{a,b}) Z_{a+b}), \qquad (4.2.1)$$
where, on the right hand side, the r.v.s $Z_{a,b}$ and $Z_{a+b}$ are assumed to be independent, and on the left hand side, the r.v.s $Z_a$ and $Z_b$ are independent.

2. Show the identity:
$$Z_{a,b+c} \overset{\text{(law)}}{=} Z_{a,b} Z_{a+b,c} \qquad (4.2.2)$$
with the same independence assumption on the right hand side as in the preceding question. Hint: Use Exercise 1.12 about simplifiable variables.

3. Let $k \in \mathbb{N}$, and $N_1, N_2, \dots, N_k$ be $k$ independent centred Gaussian variables, with variance 1. Show that, if we denote $R_k = \left(\sum_{i=1}^{k} N_i^2\right)^{1/2}$, then
$$R_k^2 \overset{\text{(law)}}{=} 2Z_{k/2} \qquad (4.2.3)$$
and $x_k = (x_1, x_2, \dots, x_k) \equiv \frac{1}{R_k}(N_1, N_2, \dots, N_k)$ is uniformly distributed on the unit sphere of $\mathbb{R}^k$, and is independent of $R_k$.

4. Prove Poincaré's Lemma: Fix $k > 0$, $k \in \mathbb{N}$. Consider, for every $n \in \mathbb{N}$, $x_n = (x_1^{(n)}, \dots, x_n^{(n)})$ a uniformly distributed r.v. on the unit sphere of $\mathbb{R}^n$. Prove that: $\sqrt{n}\,(x_1^{(n)}, \dots, x_k^{(n)})$ converges in law, as $n \to \infty$, towards $(N_1, \dots, N_k)$, where the r.v.s $(N_i;\ 1 \le i \le k)$ are Gaussian, centred with variance 1.

5. Let $x_k = (x_1, x_2, \dots, x_k)$ be uniformly distributed on the unit sphere of $\mathbb{R}^k$, and let $p \in \mathbb{N}$, $1 \le p < k$. Show that:
$$\sum_{i=1}^{p} x_i^2 \overset{\text{(law)}}{=} Z_{\frac{p}{2}, \frac{k-p}{2}}. \qquad (4.2.4)$$

6. Let $k \ge 3$. Assume that $x_k$ is distributed as in the previous question. Show that: $\left(\sum_{i=1}^{k-2} x_i^2\right)^{\frac{k}{2} - 1}$ is uniformly distributed on $[0, 1]$.

7. We shall say that a positive r.v. $T$ is (s) distributed if:
$$P(T \in dt) = \frac{dt}{\sqrt{2\pi t^3}} \exp\left(-\frac{1}{2t}\right).$$
(a) Show that if $T$ is (s)-distributed, then
$$T \overset{\text{(law)}}{=} 1/N^2, \qquad (4.2.5)$$
where $N$ is a Gaussian r.v., centred, with variance 1. Deduce from this identity that
$$E\left[\exp\left(-\frac{\alpha^2 T}{2}\right)\right] = \exp(-\alpha), \qquad \alpha \ge 0.$$

(b) Show that if $(p_i)_{1 \le i \le k}$ is a probability distribution, and if $T_1, T_2, \dots, T_k$ are $k$ independent, (s)-distributed r.v.s, then $\sum_{i=1}^{k} p_i^2 T_i$ is (s) distributed.

8. (a) Show that, if $x_k = (x_1, x_2, \dots, x_k)$ is uniformly distributed on the unit sphere of $\mathbb{R}^k$, and if $(p_i)_{1 \le i \le k}$ is a probability distribution, then the distribution of $\sum_{i=1}^{k} \frac{p_i^2}{x_i^2}$ does not depend on $(p_i)_{1 \le i \le k}$.

(b) Compute this distribution.

9. (This question presents a multidimensional extension of question 1; it is independent of the other ones.) Let $(Z_{a_i};\ i \le k)$ be $k$ independent gamma variables, with respective parameters $a_i$. Show that $\sum_{i=1}^{k} Z_{a_i}$ is independent of the vector $\left(Z_{a_i} \Big/ \sum_{j=1}^{k} Z_{a_j};\ i \le k\right)$, and identify the law of this vector, which takes its values in the simplex
$$\left\{(x_1, \dots, x_k);\ x_i \ge 0;\ \sum_{i=1}^{k} x_i = 1\right\}.$$
Comments and references:

(a) There is a large literature about beta and gamma variables. The identities in law (4.2.1) and (4.2.2) are used very often, and it is convenient to refer to them and some of their variants as "beta–gamma algebra"; see e.g.:

D. Dufresne: Algebraic properties of beta and gamma distributions, and applications. Adv. in Appl. Math., 20, no. 3, 285–299 (1998).

J.-F. Chamayou and G. Letac: Additive properties of the Dufresne laws and their multivariate extensions. J. Theoret. Probab., 12, no. 4, 1045–1066 (1999).

(b) As pointed out in:

P. Diaconis and D. Freedman: A dozen of de Finetti results in search of a theory. Ann. Inst. H. Poincaré, 23, Supp. to no. 2, 397–423 (1987),

Poincaré's Lemma is misattributed. It goes back at least to Mehler. Some precise references are given in

D. W. Stroock: Probability Theory: An Analytic View. Cambridge University Press (1993). See, in particular, pp. 78 and 96.

G. Letac: Exercises and Solutions Manual for Integration and Probability. Springer-Verlag, New York, 1995. Second French edition: Masson, 1997.

(c) The distribution of $T$ introduced in question 7 is the unilateral stable distribution with parameter 1/2. The unilateral stable laws will be studied more deeply in Exercises 4.17, 4.18 and 4.19.
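Before attempting questions 1 and 2, it can be reassuring to see that both sides of (4.2.1) and (4.2.2) have the same Mellin transforms. The Python sketch below uses the standard formulae $E[Z_a^s] = \Gamma(a+s)/\Gamma(a)$ and $E[Z_{a,b}^s] = \Gamma(a+s)\Gamma(a+b)/(\Gamma(a)\Gamma(a+b+s))$ together with the independence assumptions of the exercise. The function names and parameter values are illustrative; this is a numerical sanity check, not a substitute for the proofs requested.

```python
from math import exp, lgamma, isclose


def gamma_moment(a, s):
    """E[Z_a^s] = Gamma(a + s) / Gamma(a) for a gamma variable Z_a (needs a + s > 0)."""
    return exp(lgamma(a + s) - lgamma(a))


def beta_moment(a, b, s):
    """E[Z_{a,b}^s] = Gamma(a+s) Gamma(a+b) / (Gamma(a) Gamma(a+b+s)) (needs a + s > 0)."""
    return exp(lgamma(a + s) + lgamma(a + b) - lgamma(a) - lgamma(a + b + s))


if __name__ == "__main__":
    a, b, c, s = 1.5, 0.7, 2.2, 0.9  # arbitrary positive values

    # First coordinate of (4.2.1): Z_a versus Z_{a,b} Z_{a+b}.
    print(gamma_moment(a, s), beta_moment(a, b, s) * gamma_moment(a + b, s))

    # Identity (4.2.2): Z_{a,b+c} versus Z_{a,b} Z_{a+b,c}.
    print(beta_moment(a, b + c, s), beta_moment(a, b, s) * beta_moment(a + b, c, s))

    assert isclose(gamma_moment(a, s), beta_moment(a, b, s) * gamma_moment(a + b, s), rel_tol=1e-9)
```

Working with `lgamma` rather than `gamma` avoids overflow for large parameters.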
** 4.3 An identity in law between reciprocals of gamma variables

We keep the same notations as in Exercise 4.2. Let $\alpha, \mu > 0$, with $\alpha < \mu$; the aim of the present exercise is to prove the identity in law:
$$\frac{1}{Z_\alpha} \overset{\text{(law)}}{=} \frac{1}{Z_\mu} + \left(\frac{1}{Z_\alpha} + \frac{1}{Z_\mu}\right)\frac{Z_A}{Z_B}, \qquad (4.3.1)$$
where $A = (\mu - \alpha)/2$, $B = (\mu + \alpha)/2$ and on the right hand side of (4.3.1), the four gamma variables are independent.

1. Prove that the identity (4.3.1) is equivalent to:
$$\frac{1}{Z_{a+2b}} + \left(\frac{1}{Z_a} + \frac{1}{Z_{a+2b}}\right)\frac{Z_b}{Z_{a+b}} \overset{\text{(law)}}{=} \frac{1}{Z_a}, \qquad a, b > 0. \qquad (4.3.2)$$

2. Prove that (4.3.2) follows from the identity in law
$$\frac{1}{Z_{a+2b}} + \left(\frac{1}{C} + \frac{1}{Z_{a+2b}}\right)\frac{Z_b}{Z_{a+b}} \overset{\text{(law)}}{=} \frac{1}{Z_{a+b}}\left(1 + \frac{Z_b}{C}\right), \qquad (4.3.3)$$
where $C$ is distributed as $Z_a$ and, in the above identity, $C$, $Z_b$, $Z_{a+b}$, $Z_{a+2b}$ are independent. (For the moment, take (4.3.3) for granted). Hint: Use the fundamental beta–gamma algebra identity (4.2.1) in Exercise 4.2.
3. We shall now show that the identity (4.3.3) holds for any random variable $C$ assumed independent of the rest of the r.v.s in (4.3.3). We denote this general identity as $(4.3.3)_{\text{gen}}$. Prove that $(4.3.3)_{\text{gen}}$ holds if and only if there is the identity in law between the two pairs:
$$\left(\frac{1}{Z_{a+2b}} + \frac{Z_b}{Z_{a+b}}\,\frac{1}{Z_{a+2b}},\ \frac{Z_b}{Z_{a+b}}\right) \overset{\text{(law)}}{=} \left(\frac{1}{Z_{a+b}},\ \frac{Z_b}{Z_{a+b}}\right). \qquad (4.3.4)$$

4. Finally, deduce (4.3.4) from the fundamental beta–gamma algebra identity (4.2.1).

Comments: The identity in law (4.3.1) has been found as the probabilistic translation of a complicated relation between McDonald functions $K_\nu$ (see Exercise 4.19 for their definition). The direct proof proposed here is due to D. Dufresne.
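A quick plausibility check of (4.3.2) compares the expectations of both sides, which can be done in exact arithmetic. The sketch below assumes $a > 1$, so that $E[1/Z_a] = 1/(a-1)$ is finite, and uses $E[Z_b] = b$ together with the independence of the four gamma variables. The function names and sample values are illustrative; equality of first moments is of course only a necessary condition for the identity in law.

```python
from fractions import Fraction


def mean_inverse_gamma(c):
    """E[1/Z_c] = 1/(c - 1) for a gamma variable with parameter c > 1."""
    return 1 / (Fraction(c) - 1)


def first_moments_4_3_2(a, b):
    """Return (E[left hand side], E[right hand side]) of (4.3.2); requires a > 1."""
    a, b = Fraction(a), Fraction(b)
    # E[Z_b / Z_{a+b}] = E[Z_b] * E[1/Z_{a+b}] by independence, with E[Z_b] = b.
    ratio_mean = b * mean_inverse_gamma(a + b)
    lhs = mean_inverse_gamma(a + 2 * b) + (
        mean_inverse_gamma(a) + mean_inverse_gamma(a + 2 * b)
    ) * ratio_mean
    rhs = mean_inverse_gamma(a)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = first_moments_4_3_2(Fraction(5, 2), Fraction(3, 4))
    print(lhs, rhs)
```

Using `Fraction` makes the comparison exact, so no tolerance is needed.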
* 4.4 The Gamma process and its associated Dirichlet processes

This exercise is a continuation, at process level, of the discussion of the gamma–beta algebra in Exercise 4.2, especially question 9. A subordinator is a continuous time process with increasing paths whose increments are independent and time homogeneous. Let $(\gamma_t, t \ge 0)$ be a gamma process, i.e. a subordinator whose law at time $t$ is the gamma distribution with parameter $t$:
$$P(\gamma_t \in du) = \frac{u^{t-1} e^{-u}\, du}{\Gamma(t)}.$$

1. Prove that for $t_0 > 0$ fixed, the Dirichlet process of duration $t_0$, which we define as
$$\left(\frac{\gamma_t}{\gamma_{t_0}},\ t \le t_0\right), \qquad (4.4.1)$$
is independent from $\gamma_{t_0}$, hence from $(\gamma_u, u \ge t_0)$.

2. We call the process $\left(\frac{\gamma_t}{\gamma_1},\ t \le 1\right)$ the standard Dirichlet process $(D_t, t \le 1)$. Prove that the random vector $(D_{t_1}, D_{t_2} - D_{t_1}, \dots, D_{t_k} - D_{t_{k-1}}, D_1 - D_{t_k})$, for $t_1 < t_2 < \cdots < t_k \le 1$, follows the Dirichlet law with parameters $(t_1, t_2 - t_1, \dots, 1 - t_k)$, i.e. with density
$$\frac{u_1^{t_1 - 1}\, u_2^{t_2 - t_1 - 1} \cdots u_k^{t_k - t_{k-1} - 1}\, (1 - (u_1 + \cdots + u_k))^{-t_k}}{\Gamma(t_1)\Gamma(t_2 - t_1) \cdots \Gamma(1 - t_k)},$$
with respect to the Lebesgue measure $du_1\, du_2 \cdots du_k$, on the simplex
$$\left\{(u_i)_{1 \le i \le k+1} : \sum_i u_i = 1,\ u_i \ge 0\right\}.$$

3. Prove the following relation for any Borel, bounded function $f : [0, 1] \to \mathbb{R}_+$:
$$E\left[\frac{1}{1 + \int_0^1 f(u)\, dD_u}\right] = E\left[\exp\left(-\int_0^1 f(u)\, d\gamma_u\right)\right]. \qquad (4.4.2)$$
Comments and references: In the following papers

D.M. Cifarelli and E. Regazzini: Distribution functions of means of a Dirichlet process. Ann. Statist., 18, no. 1, 429–442 (1990)

P. Diaconis and J. Kemperman: Some new tools for Dirichlet priors. Bayesian statistics, 5 (Alicante, 1994), 97–106, Oxford Sci. Publ., Oxford Univ. Press, New York, 1996

D.M. Cifarelli and E. Melilli: Some new results for Dirichlet priors. Ann. Statist., 28, no. 5, 1390–1413 (2000)

the authors have used the formula (4.4.2) together with the inversion of the Stieltjes transform to obtain remarkably explicit formulae for the densities of $\int_0^1 f(u)\, dD_u$. As an example,
$$\int_0^1 u\, dD(u) \overset{\text{(law)}}{=} \int_0^1 D(u)\, du$$
has density $\dfrac{e}{\pi}\, \dfrac{\sin(\pi x)}{x^x (1 - x)^{1 - x}}$ on $[0, 1]$.

** 4.5 Gamma variables and Gauss multiplication formulae
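Preliminary: The classical gamma function
$$\Gamma(z) = \int_0^\infty dt\, t^{z-1} e^{-t}$$
satisfies Gauss's multiplication formula: for any $n \in \mathbb{N}$, $n \ge 1$,
$$\Gamma(nz) = \frac{1}{(2\pi)^{\frac{n-1}{2}}}\; n^{nz - \frac{1}{2}} \prod_{k=0}^{n-1} \Gamma\left(z + \frac{k}{n}\right). \qquad (4.5.1)$$
(See e.g. Section 1.5 of [1].) In particular, for $n = 2$, formula (4.5.1) is known as the duplication formula (see e.g. Lebedev [69])
$$\Gamma(2z) = \frac{1}{\sqrt{2\pi}}\; 2^{2z - \frac{1}{2}}\; \Gamma(z)\, \Gamma\left(z + \frac{1}{2}\right), \qquad (4.5.2)$$
and, for $n = 3$, as the triplication formula:
$$\Gamma(3z) = \frac{1}{2\pi}\; 3^{3z - \frac{1}{2}}\; \Gamma(z)\, \Gamma\left(z + \frac{1}{3}\right) \Gamma\left(z + \frac{2}{3}\right). \qquad (4.5.3)$$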
1. Prove that if, for $a > 0$, $Z_a$ denotes a gamma variable with parameter $a$, then the identity in law:
$$(Z_{na})^n \overset{\text{(law)}}{=} n^n\, Z_a\, Z_{a + \frac{1}{n}} \cdots Z_{a + \frac{n-1}{n}} \qquad (4.5.4)$$
holds, where the right hand side of (4.5.4) features $n$ independent gamma variables, with respective parameters $a + \frac{k}{n}$ ($0 \le k \le n - 1$).

2. We denote by $Z_{a,b}$ a beta variable with parameters $(a, b)$. Prove that the identity in law:
$$(Z_{na,nb})^n \overset{\text{(law)}}{=} Z_{a,b}\, Z_{a + \frac{1}{n}, b} \cdots Z_{a + \frac{n-1}{n}, b} \qquad (4.5.5)$$
holds, where the right hand side of (4.5.5) features $n$ independent beta variables with respective parameters $\left(a + \frac{k}{n}, b\right)$ for $0 \le k \le n - 1$. Hint: Use the identity (4.2.1) in Exercise 4.2.

Comments and references:

(a) The identity in law (4.5.4) may be considered as a translation in probabilistic terms of Gauss's multiplication formula (4.5.1). For some interesting consequences of (4.5.4) in terms of one-sided stable random variables, see Exercises 4.17, 4.18 and 4.19 hereafter and the classical reference of V.M. Zolotarev [66].

(b) In this exercise, some identities in law for beta and gamma variables are deduced from the fundamental properties of the gamma function. The inverse attitude is taken by

L. Gordon: A stochastic approach to the Gamma function. Amer. Math. Monthly, 101, 858–865 (1994).

See also:

A. Fuchs and G. Letta: Un résultat élémentaire de fiabilité. Application à la formule de Weierstrass sur la fonction gamma. Séminaire de Probabilités, XXV, 316–323, Lecture Notes in Mathematics, 1485, Springer, Berlin, 1991.
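Gauss's multiplication formula (4.5.1) and its probabilistic counterpart (4.5.4) can both be checked numerically with the standard library. The sketch below evaluates the right hand side of (4.5.1) with `math.gamma`, and compares the moments of order $s$ of $Z_{na}^n$ and of $n^n Z_a Z_{a+1/n} \cdots Z_{a+(n-1)/n}$, using $E[Z_c^s] = \Gamma(c+s)/\Gamma(c)$ and independence. Function names and sample values are illustrative.

```python
from math import gamma, pi, isclose


def gauss_rhs(n, z):
    """Right hand side of Gauss's multiplication formula (4.5.1)."""
    prod = 1.0
    for k in range(n):
        prod *= gamma(z + k / n)
    return (2 * pi) ** (-(n - 1) / 2) * n ** (n * z - 0.5) * prod


def moments_4_5_4(n, a, s):
    """Return (E[(Z_{na})^{ns}], E[(n^n Z_a ... Z_{a+(n-1)/n})^s])."""
    lhs = gamma(n * a + n * s) / gamma(n * a)
    rhs = float(n) ** (n * s)
    for k in range(n):
        rhs *= gamma(a + k / n + s) / gamma(a + k / n)
    return lhs, rhs


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, gamma(n * 1.3), gauss_rhs(n, 1.3))
    lhs, rhs = moments_4_5_4(3, 0.8, 0.6)
    print(lhs, rhs)
    assert isclose(lhs, rhs, rel_tol=1e-9)
```

The second function is just (4.5.1) applied at $z = a + s$ and at $z = a$, which is the mechanism behind question 1.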
** 4.6 The beta–gamma algebra and convergence in law

1. Let $a > 0$, and $n \in \mathbb{N}$, $n \ge 1$. Deduce from the identity (4.2.1) that
$$Z_a \overset{\text{(law)}}{=} (Z_{a,1} Z_{a+1,1} \cdots Z_{a+n-1,1})\, Z_{a+n} \qquad (4.6.1)$$
where, on the right hand side, the $(n + 1)$ random variables which appear are independent.

2. Prove that, as $n \to \infty$, the sequence $\left(\frac{Z_{a+n}}{n},\ n \to \infty\right)$ converges in law, hence in probability, towards a constant; compute this constant.

3. Let $U_1, U_2, \dots, U_n, \dots$ be a sequence of independent random variables, each of which is uniformly distributed on $[0, 1]$. Let $a > 0$. Prove that:
$$n\, U_1^{\frac{1}{a}}\, U_2^{\frac{1}{a+1}} \cdots U_n^{\frac{1}{a+(n-1)}} \xrightarrow[n \to \infty]{\text{(law)}} Z_a.$$

4. Let $a > 0$ and $r > 0$. With the help of an appropriate extension of the two first questions above, prove that:
$$n\, Z_{a,r}\, Z_{a+r,r} \cdots Z_{a+(n-1)r,r} \xrightarrow[n \to \infty]{\text{(law)}} Z_a,$$
where, on the left hand side, the $n$ beta variables are assumed to be independent.

* 4.7 Beta–gamma variables and changes of probability measures

(We keep the same notation as in Exercises 4.2, 4.5, and 4.6.) Consider, on a probability space $(\Omega, \mathcal{A}, P)$, a pair $(A, L)$ of random variables such that $A$ takes its values in $[0, 1]$, and $L$ in $\mathbb{R}_+$, and, moreover:
$$\left(\frac{A}{L},\ \frac{1 - A}{L}\right) \overset{\text{(law)}}{=} \left(\frac{1}{Z_b},\ \frac{1}{Z_a}\right),$$
where $Z_a$ and $Z_b$ are independent (gamma) variables. We write $M = \frac{L}{A(1 - A)}$.

1. Prove the identity in law:
$$(A, M) \overset{\text{(law)}}{=} (Z_{a,b}, Z_{a+b}),$$
where, on the right hand side (hence, also on the left hand side), the two variables are independent.

2. Let $\beta > 0$. Define a new probability measure $Q_\beta$ on $(\Omega, \mathcal{A})$ by the formula: $Q_\beta = c_\beta L^\beta \cdot P$.

(a) Compute the normalizing constant $c_\beta$ in terms of $a$, $b$ and $\beta$.

(b) Prove that, under $Q_\beta$, the variables $A$ and $M$ are independent and that $A$ is now distributed as $Z_{a+\beta, b+\beta}$, whereas $M$ is distributed as $Z_{a+b+\beta}$.

(c) Compute the joint law under $Q_\beta$ of $\frac{A}{L}$ and $\frac{1 - A}{L}$, and prove that they are no longer independent.

* 4.8 Exponential variables and powers of Gaussian variables
1. Let $Z$ be an exponential variable with parameter 1, i.e.: $P(Z \in dt) = e^{-t}\, dt$ $(t > 0)$. Prove that:
$$Z \overset{\text{(law)}}{=} \sqrt{2Z}\, |N|, \qquad (4.8.1)$$
where, on the right hand side, $N$ denotes a centred Gaussian variable, with variance 1, which is independent of $Z$. Hint: Use the identity (4.5.4) for $n = 2$ and $a = \frac{1}{2}$.

2. Let $p \in \mathbb{N}$, $p > 0$. Prove that:
$$Z \overset{\text{(law)}}{=} Z^{\frac{1}{2^p}}\; 2^{\frac{1}{2} + \cdots + \frac{1}{2^p}}\; |N_1|\, |N_2|^{\frac{1}{2}} \cdots |N_p|^{\frac{1}{2^{p-1}}} \qquad (4.8.2)$$
where, on the right hand side of (4.8.2), $N_1, N_2, \dots, N_p$ are independent, centred Gaussian variables with variance 1, which are also independent of $Z$.

3. Let $(N_p, p \in \mathbb{N})$ be a sequence of independent centred Gaussian variables with variance 1. Prove that:
$$\prod_{p=0}^{n} |N_p|^{\frac{1}{2^p}} \xrightarrow[n \to \infty]{\text{(law)}} \frac{1}{2} Z.$$
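The identities of this exercise can be sanity-checked through their moments of order $s$. The sketch below uses $E[Z^s] = \Gamma(1+s)$ and $E|N|^s = 2^{s/2}\,\Gamma\left(\frac{s+1}{2}\right)/\sqrt{\pi}$, both standard, together with the independence assumptions stated in the questions. Function names and the values of $s$ and $n$ are illustrative choices.

```python
from math import gamma, pi, sqrt, isclose


def abs_normal_moment(s):
    """E|N|^s = 2^{s/2} Gamma((s+1)/2) / sqrt(pi) for N standard Gaussian, s > -1."""
    return 2 ** (s / 2) * gamma((s + 1) / 2) / sqrt(pi)


def exponential_moment(s):
    """E[Z^s] = Gamma(1 + s) for Z exponential with parameter 1, s > -1."""
    return gamma(1 + s)


def rhs_moment_4_8_1(s):
    """E[(sqrt(2 Z) |N|)^s], using the independence of Z and N."""
    return 2 ** (s / 2) * exponential_moment(s / 2) * abs_normal_moment(s)


def partial_product_moment(s, n):
    """E[(prod_{p=0}^{n} |N_p|^{1/2^p})^s] for independent standard Gaussian N_p."""
    result = 1.0
    for p in range(n + 1):
        result *= abs_normal_moment(s / 2 ** p)
    return result


if __name__ == "__main__":
    s = 1.7
    print(exponential_moment(s), rhs_moment_4_8_1(s))        # two sides of (4.8.1)
    print(partial_product_moment(s, 40), gamma(1 + s) / 2 ** s)  # question 3, n = 40
    assert isclose(exponential_moment(s), rhs_moment_4_8_1(s), rel_tol=1e-9)
```

The first comparison is exactly the duplication formula (4.5.2) in disguise; the second illustrates the limit in question 3 at the level of moments only.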
* 4.9 Mixtures of exponential distributions
Let λ1 and λ2 be two positive reals such that: 0 < λ1 < λ2 < ∞. Let T1 and T2 be two independent r.v.s such that for every t ≥ 0, P (Ti > t) = exp(−λi t) (i = 1, 2). 1. Show that there exist two constants α and β such that, for every Borel set B of IR+ , P (T1 + T2 ∈ B) = αP (T1 ∈ B) + βP (T2 ∈ B) . Compute explicitly α and β. 2. Consider a third variable T3 , which is independent of the pair (T1 , T2 ), and which satisfies: for every t ≥ 0, P (T3 ≥ t) = exp −(λ2 − λ1 )t. (a) Compute P (T1 > T3 ). (b) Compare the law of the pair (T1 , T3 ), conditionally on (T1 > T3 ), and the law of the pair (T1 + T2 , T2 ). (c) Deduce therefrom a second proof of the formula stated in question 1.
Hint: Write $E[f(T_1)] = E\left[f(T_1) 1_{(T_1 > T_3)}\right] + E\left[f(T_1) 1_{(T_1 \le T_3)}\right]$ for $f \ge 0$, Borel.

3. (This question is independent of question 2). Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be $n$ positive reals, which are distinct, and let $T_1, T_2, \dots, T_n$ be $n$ independent r.v.s such that: for every $t \ge 0$, $P(T_i > t) = \exp(-\lambda_i t)$ $(i = 1, 2, \dots, n)$. Show that there exist $n$ constants $\alpha_i^{(n)}$ such that: for every Borel set $B$ in $\mathbb{R}_+$,
$$P\left(\sum_{i=1}^{n} T_i \in B\right) = \sum_{i=1}^{n} \alpha_i^{(n)} P(T_i \in B). \qquad (4.9.1)$$
Prove the formula
$$\alpha_i^{(n)} = \prod_{j \neq i} \left(1 - \frac{\lambda_i}{\lambda_j}\right)^{-1}. \qquad (4.9.2)$$
Comments and references: Sums of independent exponential variables play an important role in: Ph. Biane, J. Pitman and M. Yor: Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions. Bull. Amer. Math. Soc. (N.S.), 38, no. 4, 435–465 (2001).
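Formula (4.9.2) lends itself to a direct numerical check: the coefficients must sum to 1 (take $B = \mathbb{R}_+$ in (4.9.1)), and (4.9.1) must hold at the level of Laplace transforms, where $E[e^{-sT_i}] = \lambda_i/(\lambda_i + s)$ and the transform of the sum is the product, by independence. The Python sketch below implements both checks; the function names and the rates are illustrative.

```python
from math import isclose


def mixture_coefficients(rates):
    """Coefficients alpha_i^{(n)} of (4.9.2) for distinct positive rates."""
    coeffs = []
    for i, lam_i in enumerate(rates):
        prod = 1.0
        for j, lam_j in enumerate(rates):
            if j != i:
                prod *= 1.0 - lam_i / lam_j
        coeffs.append(1.0 / prod)
    return coeffs


def laplace_of_sum(rates, s):
    """E[exp(-s (T_1 + ... + T_n))] for independent exponential T_i."""
    result = 1.0
    for lam in rates:
        result *= lam / (lam + s)
    return result


def laplace_of_mixture(rates, s):
    """Laplace transform of the signed mixture on the right hand side of (4.9.1)."""
    coeffs = mixture_coefficients(rates)
    return sum(c * lam / (lam + s) for c, lam in zip(coeffs, rates))


if __name__ == "__main__":
    rates = [1.0, 2.5, 4.0]
    print(mixture_coefficients(rates), sum(mixture_coefficients(rates)))
    print(laplace_of_sum(rates, 0.8), laplace_of_mixture(rates, 0.8))
    assert isclose(laplace_of_sum(rates, 0.8), laplace_of_mixture(rates, 0.8), rel_tol=1e-9)
```

Note that some coefficients are negative, so (4.9.1) is a signed combination rather than a genuine probabilistic mixture. When two rates are very close, the coefficients become large with opposite signs and the floating-point evaluation loses accuracy.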
** 4.10 Some computations related to the lack of memory property of the exponential law

Let $T$ and $T'$ be two independent, exponentially distributed r.v.s, i.e. $P(T \in dt) = P(T' \in dt) = e^{-t}\, dt$ $(t > 0)$.

1. Compute the joint law of $\left(\frac{T}{T + T'},\ T + T'\right)$. What can be said of this pair of r.v.s?

2. Let $U$ be a uniformly distributed r.v. on $[-1, +1]$, which is independent of the pair $(T, T')$. Define: $X = \log\frac{1 + U}{1 - U}$, and $Y = U(T + T')$. Prove that the pairs $(X, Y)$ and $\left(\log\frac{T}{T'},\ T - T'\right)$ have the same law. Hint: It may be convenient to consider $\tilde{U} = \frac{1}{2}(1 + U)$.

3. Compute, for all $\mu, \nu \in \mathbb{R}$: $E[\exp i(\nu X + \mu Y)]$. Hints:

(a) It is a rational fraction of: $\frac{\pi\nu}{\sinh(\pi\nu)}$, $(1 + i\mu)^{1 - i\nu}$, $(1 - i\mu)^{1 + i\nu}$.

(b) The formula of complements: $\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin(\pi x)}$ for the classical $\Gamma$ function may be used.

4. Prove that, for any $\mu \in \mathbb{R}$, $E[Y \exp i\mu(Y - X)] = 0$. Compute $E[Y \mid Y - X]$.

5. Prove that the conditional law of $(T, T')$, given $(T > T')$, is identical to the law of $\left(T + \frac{T'}{2},\ \frac{T'}{2}\right)$; in short:
$$((T, T') \mid (T > T')) \overset{\text{(law)}}{=} \left(T + \frac{T'}{2};\ \frac{T'}{2}\right).$$
Deduce that the law of $(|X|, |Y|)$ is identical to that of: $\left(\log\left(1 + \frac{2T}{T'}\right),\ T\right)$.

Comments and references: The origin of questions 4 and 5 is found in the study of certain Brownian functionals in

Ph. Biane and M. Yor: Valeurs principales associées aux temps locaux browniens. Bull. Sci. Math., (2) 111, no. 1, 23–101 (1987).
* 4.11 Some identities in law between Gaussian and exponential variables

1. Let $N_1, N_2, N_3$ be three independent Gaussian variables, centred, with variance 1. Define: $R = \left(\sum_{i=1}^{3} N_i^2\right)^{1/2}$, and compute its law.

2. Let $X$ be an r.v. such that $P(X \in dx \mid R = r) = \frac{1}{r} 1_{]0,r]}(x)\, dx$. Define: $U = \frac{X}{R}$; what is the law of $U$? What can be said of the pair $(R, U)$?

3. Let $Y = R - X$; what is the law of $Y$ given $R$? Compute the law of the pair $(X, Y)$.

4. Define $G = X - Y$ and $T = 2XY$. What is the law of the pair $(G, T)$? What is the law of $G$? of $T$? What can be said of these two variables?

5. Define $V = 2U - 1$. Remark that the following identities hold:
$$G = RV \quad \text{and} \quad T = R^2\left(\frac{1 - V^2}{2}\right),$$
and, therefore:
$$G^2 = 2T\left(\frac{V^2}{1 - V^2}\right). \qquad (4.11.1)$$
What is the law of $\frac{V^2}{1 - V^2}$ given $T$?

6. Let $A$ be an r.v. which is independent of $T$, and which has the arcsine distribution, that is:
$$P(A \in da) = \frac{da}{\pi\sqrt{a(1 - a)}} \quad (a \in\, ]0, 1[).$$
Show that:
$$G^2 \overset{\text{(law)}}{=} 2TA. \qquad (4.11.2)$$
Note and comment upon the difference with the above representation (4.11.1) of $G^2$.
* 4.12 Some functions which preserve the Cauchy law

1. Let $\Theta$ be an r.v. which is uniformly distributed on $[0, 2\pi[$. Give a simple argument to show that $\tan(2\Theta)$ and $\tan(\Theta)$ have the same law.

2. Let $N$ and $N'$ be two centred, independent Gaussian variables with variance 1.

(i) Prove that $C \overset{\text{def}}{=} \frac{N}{N'}$ and $\tan(\Theta)$ have the same distribution.

(ii) Show that $C$ is a Cauchy variable, i.e.
$$P(C \in dx) = \frac{dx}{\pi(1 + x^2)}.$$

3. Prove, only using the two preceding questions, that if $C$ is a Cauchy variable, then so are:
$$\frac{1}{2}\left(C - \frac{1}{C}\right) \quad \text{and} \quad \frac{1 + C}{1 - C}. \qquad (4.12.1)$$

Comments and references: Equation (4.12.1) is an instance of a much more general result which, roughly speaking, says that the complex Cauchy distribution is preserved by a large class of meromorphic functions of the complex upper half plane.

E.J.G. Pitman and E.J. Williams: Cauchy-distributed functions of Cauchy variates. Ann. Math. Statist., 38, 916–918 (1967).

F.B. Knight and P.A. Meyer: Une caractérisation de la loi de Cauchy. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 34, no. 2, 129–134 (1976).

G. Letac: Which functions preserve Cauchy laws? Proc. Amer. Math. Soc., 67, no. 2, 277–286 (1977).

One may find some other related references in:

J. Pitman and M. Yor: Some properties of the arcsine law related to its invariance under a family of rational maps. In a Festschrift volume for H. Rubin, IMS-AMS series, ed: Das Gupta (2004), 45, 126–137.

* 4.13 Uniform laws on the circle
Let $U$ and $V$ be two random variables which are independent and uniformly distributed on $[0, 1[$. Let $m, n, m', n'$ be four integers, all different from 0, that is, elements of $\mathbb{Z} \setminus \{0\}$.
What is the law of $\{mU + nV\}$, where $\{x\}$ denotes the fractional part of $x$? Give some criterion which ensures that the two variables $\{mU + nV\}$ and $\{m'U + n'V\}$ are independent.
Hint: Consider
\[ E\left[\exp 2i\pi\left(p\{mU + nV\} + q\{m'U + n'V\}\right)\right] \qquad \text{for } p, q \in \mathbb{Z}. \]
*
4.14 Trigonometric formulae and probability
1. Let $\Theta$ be an r.v. which is uniformly distributed on $[0, \pi[$. Compute the law of $X = \cos(\Theta)$.
2. Prove that if $\Theta$ and $\Theta'$ are independent and uniformly distributed on $[0, 2\pi[$, then:
\[ \cos(\Theta) + \cos(\Theta') \overset{\text{(law)}}{=} \cos(\Theta + \Theta') + \cos(\Theta - \Theta'). \]
Hint: Use Exercise 4.13.
3. Prove that, if $X$ and $Y$ are independent, and have both the distribution of $\cos(\Theta)$ [which was computed in question 1], then:
\[ \frac{1}{2}(X + Y) \overset{\text{(law)}}{=} XY. \]
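The identity in law of question 3 of Exercise 4.14 can be checked numerically before attempting a proof. The following is only a Monte Carlo sketch, not part of the original exercise: the function names, the sample size, the seed and the grid of test points are illustrative assumptions. It compares the empirical distribution functions of $(X+Y)/2$ and $XY$ on a small grid.

```python
import math
import random


def sample_cos(rng):
    """Draw cos(Theta) with Theta uniform on [0, pi[ (hypothetical helper)."""
    return math.cos(rng.uniform(0.0, math.pi))


def empirical_cdf(samples, x):
    """Fraction of the samples that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)


def max_cdf_gap(n=100_000, seed=0):
    """Largest gap, over a small grid, between the empirical CDFs of (X+Y)/2 and XY."""
    rng = random.Random(seed)
    half_sum = []
    product = []
    for _ in range(n):
        x, y = sample_cos(rng), sample_cos(rng)
        half_sum.append(0.5 * (x + y))
        # A fresh independent pair is used for the product,
        # so that the two samples being compared are independent.
        x2, y2 = sample_cos(rng), sample_cos(rng)
        product.append(x2 * y2)
    grid = [-0.9 + 0.1 * k for k in range(19)]
    return max(
        abs(empirical_cdf(half_sum, g) - empirical_cdf(product, g)) for g in grid
    )


if __name__ == "__main__":
    # A small value is consistent with (X + Y)/2 and XY having the same law.
    print(max_cdf_gap())
```

A small gap is of course only numerical evidence; the proof asked for in the exercise goes through the trigonometric identity of question 2.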
* 4.15
A multidimensional version of the Cauchy distribution
Let $N$ be a centred Gaussian r.v. with variance 1.
1. Prove that if $T \overset{\text{def}}{=} 1/N^2$, then one has:
\[ P(T \in dt) = \frac{dt}{\sqrt{2\pi t^3}} \exp\left(-\frac{1}{2t}\right) \qquad (t > 0). \]
2. Prove that if $C$ is a standard Cauchy variable, that is, its distribution is given by $P(C \in dx) = \dfrac{dx}{\pi(1+x^2)}$ $(x \in \mathbb{R})$, then its characteristic function is
\[ E[\exp(i\lambda C)] = \exp(-|\lambda|) \qquad (\lambda \in \mathbb{R}). \]
3. Deduce from the preceding question that the variable $T$, defined in question 1, satisfies:
\[ E\left[\exp\left(-\frac{\lambda^2}{2} T\right)\right] = \exp(-|\lambda|) \qquad (\lambda \in \mathbb{R}). \]
Hint: Use the representation of a Cauchy variable given in Exercise 4.12, question 2.
4. Let $T_1, T_2, \dots, T_n, \dots$ be a sequence of independent r.v.s with common law that of $T$. Compare the laws of
\[ \frac{1}{n^2}\sum_{i=1}^n T_i \quad\text{and}\quad T. \]
How does this result compare with the law of large numbers?
5. Let $N_0, N_1, \dots, N_n$ be $(n+1)$ independent, centred Gaussian variables, with variance 1. Compute explicitly the law of the random vector:
\[ \left(\frac{N_1}{N_0}, \frac{N_2}{N_0}, \dots, \frac{N_n}{N_0}\right), \tag{4.15.1} \]
and compute also its characteristic function:
\[ E\left[\exp\left(i\sum_{j=1}^n \lambda_j \frac{N_j}{N_0}\right)\right], \qquad\text{where } (\lambda_j)_{j\le n} \in \mathbb{R}^n. \]
6. Prove that this distribution may be characterized as the unique law of $X$ valued in $\mathbb{R}^n$, such that $\langle\theta, X\rangle$ is a standard Cauchy r.v. for any $\theta \in \mathbb{R}^n$, with $|\theta| = 1$.
Comments and references:
(a) The random vector (4.15.1) appears as the limit in law, for $h \to 0$, of:
\[ \frac{1}{\beta^0_{t+h} - \beta^0_t} \left(\sum_{j=0}^n \int_t^{t+h} H^j_s \, d\beta^j_s\right) \tag{4.15.2} \]
where $\beta^0, \beta^1, \dots, \beta^n$ are $(n+1)$ independent Brownian motions, and $\{H^j ; j = 0, 1, \dots, n\}$ are $(n+1)$ continuous adapted processes. The limit in law, as $h \to 0$, for (4.15.2) may be represented as
\[ H^0_t + \sum_{j=1}^n H^j_t \frac{N_j}{N_0} \]
with $(H^0_t, \dots, H^n_t)$ independent of the random vector (4.15.1). For more details, we refer to:
C. Yoeurp: Sur la dérivation des intégrales stochastiques. Séminaire de Probabilités, XIV (Paris, 1978/1979), 249–253, Lecture Notes in Mathematics, 784, Springer, Berlin, 1980.
D. Isaacson: Stochastic integrals and derivatives. Ann. Math. Statist., 40, 1610–1616 (1969).
(b) Student's laws (of which a particular case is the law of $N_0 \big/ \left(\sum_{i=1}^n N_i^2\right)^{1/2}$) are classically considered when testing mean and variances in Gaussian statistics. See e.g. Chapters 26–27 in N.L. Johnson, S. Kotz and N. Balakrishnan [31], and Sections 7.2, 7.4, 7.5 in C. J. Stone: A course in Probability and Statistics. Duxbury Press, 1996.
**
4.16 Some properties of the Gauss transform
Let $A$ be a strictly positive random variable, which is independent of $N$, a Gaussian, centred r.v. with variance 1. Let $X$ be such that:
\[ X \overset{\text{(law)}}{=} \sqrt{A}\,N. \]
We shall say that the law of $X$ is the Gauss transform of the law of $A$.
0. Preliminaries: This question is independent of the sequel of the exercise.
(i) Prove that $X$ admits a density which is given by:
\[ E\left[\frac{1}{\sqrt{2\pi A}} \exp\left(-\frac{x^2}{2A}\right)\right]. \]
(ii) Prove that for any $\lambda \in \mathbb{R}$, $E[\exp(i\lambda X)] = E\left[\exp\left(-\frac{\lambda^2}{2} A\right)\right]$. Hence the Gauss transform is injective on the set of probability measures on $\mathbb{R}_+$.
1. Prove that, for every $\alpha \ge 0$, one has
\[ E\left[\frac{1}{|X|^\alpha}\right] = E\left[\frac{1}{A^{\alpha/2}}\right] E\left[\frac{1}{|N|^\alpha}\right]. \tag{4.16.1} \]
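Give a criterion on $A$ and $\alpha$ which ensures that $E\left[\dfrac{1}{|X|^\alpha}\right] < \infty$.
2. Prove that, for every $\alpha > 0$, the quantity $E\left[\dfrac{1}{A^{\alpha/2}}\right]$ is also equal to:
\[ E\left[\frac{1}{A^{\alpha/2}}\right] = \frac{1}{\Gamma\left(\frac{\alpha}{2}\right) 2^{\frac{\alpha}{2}-1}} \int_0^\infty dx\, x^{\alpha-1} E[\cos(xX)]. \tag{4.16.2} \]
Hint: Use the elementary formula:
\[ \frac{1}{r^{\alpha/2}} = \frac{1}{\Gamma\left(\frac{\alpha}{2}\right)} \int_0^\infty dt\, t^{\frac{\alpha}{2}-1} e^{-tr}. \]
3. The aim of this question is to compare formulae (4.16.1) and (4.16.2) in the case $E\left[\dfrac{1}{|X|^\alpha}\right] < \infty$.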
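3.1. Preliminaries.
(i) Recall (see, e.g. Lebedev [69]) the duplication formula for the gamma function:
\[ \Gamma(2z) = \frac{1}{\sqrt{2\pi}}\, 2^{2z-\frac{1}{2}}\, \Gamma(z)\Gamma\left(z + \frac{1}{2}\right) \]
and the formula of complements $\Gamma(z)\Gamma(1-z) = \dfrac{\pi}{\sin(\pi z)}$.
(ii) For $0 < \alpha < 1$, $a_\alpha \overset{\text{(def)}}{=} \lim_{t\uparrow\infty} \int_0^t dx\, x^{\alpha-1}\cos(x)$ exists in $\mathbb{R}$.
(iii) Compute $E\left[\dfrac{1}{|N|^\alpha}\right]$, when this quantity is finite, in terms of the gamma function.
3.2. Prove that, if $E\left[\dfrac{1}{|X|^\alpha}\right] < \infty$, then:
\[ E\left[\frac{1}{A^{\alpha/2}}\right] = \frac{a_\alpha}{\Gamma\left(\frac{\alpha}{2}\right) 2^{\frac{\alpha}{2}-1}}\, E\left[\frac{1}{|X|^\alpha}\right]. \tag{4.16.3} \]
Hint: Use formula (4.16.2) and dominated convergence.
3.3. Comparing (4.16.1) and (4.16.3) and using the result in 3.1 (iii), prove that:
\[ a_\alpha = \Gamma(\alpha)\cos\left(\frac{\pi\alpha}{2}\right) \qquad (0 < \alpha < 1). \tag{4.16.4} \]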
3.4. Give a direct proof of formula (4.16.4).
4. [A random Cauchy transform] Develop a similar discussion as for the Gauss transform, starting with an identity in law:
\[ Y \overset{\text{(law)}}{=} BC, \]
where, on the right hand side, $B$ and $C$ are independent, $B > 0$ $P$ a.s. and $C$ is a Cauchy variable with parameter 1.
In particular, show that $E\left[\dfrac{1}{|C|^\alpha}\right] = E(|C|^\alpha)$, and compute this quantity, when it is finite, in terms of the gamma function.
Hint: Use question 2 in Exercise 4.12.
Comments and references:
(a) One of the advantages of using the Gauss transform for certain variables $A$ with complicated distributions is that $X \overset{\text{(law)}}{=} N\sqrt{A}$ may have a simple distribution. A noteworthy example is the Kolmogorov–Smirnov statistics: $A = \sup_{s\le 1} |b(s)|^2$, where $(b(s), s \le 1)$ denotes the one-dimensional Brownian bridge; $A$ satisfies
\[ P\left(\sup_{s\le 1} |b(s)| \le x\right) = \sum_{n=-\infty}^{\infty} (-1)^n \exp(-2n^2x^2), \]
whereas there is the simpler expression:
\[ P\left(|N| \sup_{s\le 1} |b(s)| \le a\right) = \tanh(a), \qquad a \ge 0. \]
For a number of other examples, see the paper by Biane–Pitman–Yor referred to in Exercise 4.9.
(b) The Gauss transform appears also quite naturally when subordinating an increasing process $(\tau_t, t \ge 0)$ to an independent Brownian motion $(\beta_u, u \ge 0)$ following Bochner (1955), that is defining: $X_t = \beta_{\tau_t}$. Then, for each $t$, one has: $X_t \overset{\text{(law)}}{=} \sqrt{\tau_t}\, N$.
*
4.17 Unilateral stable distributions (1)
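In this exercise, $Z_a$ denotes a gamma variable with parameter $a > 0$, and, for $0 < \mu < 1$, $T_\mu$ denotes a stable unilateral random variable with exponent $\mu$, i.e. the Laplace transform of $T_\mu$ is given by:
\[ E[\exp(-\lambda T_\mu)] = \exp(-\lambda^\mu) \qquad (\lambda \ge 0). \tag{4.17.1} \]
1. Prove that, for any $\gamma > 0$, and for any Borel function $f : \mathbb{R}_+ \to \mathbb{R}_+$, the following equality holds:
\[ E\left[f\left(\left(Z_{\gamma/\mu}\right)^{1/\mu}\right)\right] = E\left[f\left(\frac{Z_\gamma}{T_\mu}\right) \frac{c}{(T_\mu)^\gamma}\right], \tag{4.17.2} \]
where $c = \mu\,\dfrac{\Gamma(\gamma)}{\Gamma(\gamma/\mu)}$, and, on the right hand side, $Z_\gamma$ and $T_\mu$ are assumed to be independent. In particular, in the case $\mu = \gamma$, one has:
\[ E\left[f\left(Z^{1/\mu}\right)\right] = E\left[f\left(\frac{Z_\mu}{T_\mu}\right) \frac{\Gamma(\mu+1)}{(T_\mu)^\mu}\right], \tag{4.17.3} \]
where $Z$ denotes a standard exponential variable (with expectation 1).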
2. Prove that for $s \in \mathbb{R}$, $s < 1$, one has
\[ E[(T_\mu)^{\mu s}] = \frac{\Gamma(1-s)}{\Gamma(1-\mu s)}. \tag{4.17.4} \]
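As a sanity check of (4.17.4) in the one case where $T_\mu$ is explicit, take $\mu = 1/2$: by comment (b) of this exercise, $T_{1/2}$ has the law of $1/(2N^2)$ with $N$ standard Gaussian. The sketch below compares both sides numerically. The function names are illustrative, and the closed form $E|N|^p = 2^{p/2}\Gamma((p+1)/2)/\sqrt{\pi}$ (valid for $p > -1$) is a standard fact used here as an assumption rather than something derived in the text.

```python
import math


def abs_gaussian_moment(p):
    """E|N|^p for a standard Gaussian N (standard closed form, valid for p > -1)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


def lhs_moment(s):
    """E[(T_{1/2})^{s/2}] computed from T_{1/2} = 1/(2 N^2), for s < 1."""
    # (1 / (2 N^2))^{s/2} = 2^{-s/2} |N|^{-s}
    return 2.0 ** (-s / 2.0) * abs_gaussian_moment(-s)


def rhs_moment(s, mu=0.5):
    """Right-hand side of (4.17.4): Gamma(1 - s) / Gamma(1 - mu * s)."""
    return math.gamma(1.0 - s) / math.gamma(1.0 - mu * s)


if __name__ == "__main__":
    for s in (-3.0, -1.0, 0.0, 0.5, 0.9):
        print(s, lhs_moment(s), rhs_moment(s))
```

The agreement of the two columns is an instance of the duplication formula for the gamma function; it does not replace the general proof asked for in question 2.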
Comments and references:
(a) Formula (4.17.2) may be found in Theorem 1 of:
D.N. Shanbhag and M. Sreehari: An extension of Goldie's result and further results in infinite divisibility. Z. für Wahr., 47, 19–25 (1979).
Some extensions have been established in:
W. Jedidi: Stable processes: mixing and distributional properties. Prépublication du Laboratoire de Probabilités et Modèles Aléatoires (2000).
(b) The variable $T$ considered in Exercise 4.2 satisfies $E[\exp(-\lambda T)] = \exp(-\sqrt{2\lambda})$. Hence, we have: $\frac{1}{2} T \overset{\text{(law)}}{=} T_{1/2}$, with the above notations.
(c) Except for the case $\mu = \frac{1}{2}$ (see the discussion in Exercise 4.18), there is no simple expression for the density of $T_\mu$. However, see:
H. Pollard: The representation of $e^{-x^\lambda}$ as a Laplace integral. Bull. Amer. Math. Soc., 52, 908–910 (1946).
V.M. Zolotarev: On the representation of the densities of stable laws by special functions. (In Russian.) Teor. Veroyatnost. i Primenen., 39 (1994), no. 2, 429–437; translation in Theory Probab. Appl. 39, (1994) no. 2, 354–362 (1995).
*
4.18 Unilateral stable distributions (2)
(We keep the notation of Exercise 4.17.)
0. Preliminary: The aim of this exercise is to study the following relation between the laws of two random variables $X$ and $Y$ taking values in $(0, \infty)$: for every Borel function $f : (0, \infty) \to \mathbb{R}_+$,
\[ E[f(X)] = E\left[f\left(\frac{1}{Y}\right) \frac{c}{Y^\mu}\right], \tag{4.18.1} \]
where $\mu \ge 0$, and $c$ is the constant such that $E\left[\dfrac{c}{Y^\mu}\right] = 1$.
In the case $\mu = 0$, one has obviously $c = 1$, and (4.18.1) means that $X \overset{\text{(law)}}{=} \dfrac{1}{Y}$; moreover, in the general case $\mu > 0$, replacing $X$ and $Y$ by some appropriate powers, we may assume that $0 < \mu < 1$.
1. Prove that the following properties are equivalent:
(i) $X$ and $Y$ satisfy (4.18.1);
(ii) $P(Z_\mu X \in dt) = \dfrac{c}{\Gamma(\mu)}\, t^{\mu-1}\varphi(t)\,dt$, where $Z_\mu$ and $X$ are assumed to be independent, and $\varphi(t) = E[\exp(-tY)]$;
(iii) $P\left(Z^{1/\mu} X \in du\right) = c\mu u^{\mu-1}\varphi_\mu(u)\,du$, where $\varphi_\mu(u) = E[\exp -(uY)^\mu]$, and $X$ and $Z$ are assumed to be independent.
2. Explain the equivalence between (ii) and (iii) using the result of Exercise 4.17.
Comments and references: The scaling property of Brownian motion yields many illustrations of formula (4.18.1). For $\mu = 0$, a classical example is the following identity in law between $B_1$, the Brownian motion taken at time 1, and $T_1^B$, the first hitting time by this process of the level 1:
\[ T_1^B \overset{\text{(law)}}{=} \frac{1}{B_1^2}. \]
For $\mu = \frac{1}{2}$ an example is given by the relationship between the law of the local time for the Brownian bridge and the inverse local time of Brownian motion. We refer to:
Ph. Biane, J.F. Le Gall and M. Yor: Un processus qui ressemble au pont brownien. Séminaire de Probabilités XXI, 270–275, Lecture Notes in Mathematics, 1247, Springer, Berlin, 1987.
J. Pitman and M. Yor: Arcsine laws and interval partitions derived from a stable subordinator. Proc. London Math. Soc., (3) 65, no. 2, 326–356 (1992).
**
4.19 Unilateral stable distributions (3)
(We keep the notation of Exercise 4.17.)
1. Let $0 < \mu \le 1$ and $0 < \nu \le 1$. Prove the double identity in law
\[ \frac{Z^{1/\mu}}{T_\nu} \overset{\text{(law)}}{=} \frac{Z^{1/\nu}}{T_\mu} \overset{\text{(law)}}{=} \frac{Z}{T_\nu T_\mu}, \tag{4.19.1} \]
where the variables $Z$, $T_\mu$ and $T_\nu$ are assumed to be independent, and we make the convention that $T_1 = 1$ a.s. Prove that (4.19.1) is equivalent to
\[ Z^{1/\mu} \overset{\text{(law)}}{=} \frac{Z}{T_\mu}. \tag{4.19.2} \]
2. Recall the identity in law (4.5.4) obtained in Exercise 4.5:
\[ (Z_{na})^n \overset{\text{(law)}}{=} n^n Z_a Z_{a+\frac{1}{n}} \cdots Z_{a+\frac{n-1}{n}}. \]
Combining this identity in law for an appropriate choice of $a$, with (4.19.2), prove that for $n \in \mathbb{N}^*$, one has:
\[ T_{1/n} \overset{\text{(law)}}{=} \left(n^n Z_{\frac{1}{n}} \cdots Z_{\frac{n-1}{n}}\right)^{-1}. \tag{4.19.3} \]
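3. Let $Z_{a,b}$ denote a beta variable with parameters $a$ and $b$, as defined in Exercise 4.2, and recall the identity in law (4.2.1):
\[ Z_a \overset{\text{(law)}}{=} Z_{a,b} Z_{a+b}, \]
obtained in this exercise (on the right hand side, $Z_{a+b}$ and $Z_{a,b}$ are independent). Deduce from this identity that for all $n, m \in \mathbb{N}$, such that $m < n$, we have (with the convention, for $m = 1$, that $\prod_{k=1}^{0} = 1$):
\[ \left(\frac{m}{T_{m/n}}\right)^m \overset{\text{(law)}}{=} n^n \left(\prod_{k=1}^{m-1} Z_{\frac{k}{n},\, k\left(\frac{1}{m}-\frac{1}{n}\right)}\right) \left(\prod_{k=m}^{n-1} Z_{\frac{k}{n}}\right), \tag{4.19.4} \]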
where the r.v.s on the right hand side are independent.
4. Using identity (4.8.2) in Exercise 4.8, prove that for any $p \ge 1$:
\[ T_{2^{-p}} \overset{\text{(law)}}{=} \left(2^{2^p-1}\, N_1^{2^p} N_2^{2^{p-1}} \cdots N_p^{2}\right)^{-1}, \tag{4.19.5} \]
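where $N_1, \dots, N_p$ are independent centred Gaussian variables with variance 1.
5. Applications:
(a) Deduce from above the formulae:
\[ P(T_{1/2} \in dt) = \frac{dt\, e^{-\frac{1}{4t}}}{\sqrt{4\pi t^3}}, \tag{4.19.6} \]
\[ P\left((T_{1/3})^{-1} \in dt\right) = \frac{dt}{\Gamma\left(\frac{1}{3}\right)\Gamma\left(\frac{2}{3}\right)}\, \frac{2}{3^{3/2}\sqrt{t}}\, K_{1/3}\left(\frac{2}{3^{3/2}}\sqrt{t}\right), \tag{4.19.7} \]
where $K_\nu$ is the MacDonald function with index $\nu$, which admits the integral representation (see e.g. Lebedev [69]):
\[ K_\nu(x) = \frac{1}{2}\left(\frac{x}{2}\right)^\nu \int_0^\infty \frac{dt}{t^{\nu+1}} \exp\left(-\left(t + \frac{x^2}{4t}\right)\right). \]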
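(b) From the representation (4.19.5), deduce the formula:
\[ P\left(\frac{1}{4 (T_{1/4})^{1/3}} > y\right) = \frac{1}{\pi} \int_0^1 \frac{da}{\sqrt{a(1-a)}} \exp\left(-\frac{y}{a^{1/3}(1-a)^{2/3}}\right). \tag{4.19.8} \]
(c) Set $T = \dfrac{4}{27}\left(T_{2/3}\right)^{-2}$; prove that
\[ P(T \in dx) = \frac{C\,dx}{x^{1/3}} \int_0^1 \frac{dz}{z^{4/3}(1-z)^{5/6}} \exp\left(-\frac{x}{z}\right), \tag{4.19.9} \]
with $C = \left(\Gamma\left(\frac{2}{3}\right) B\left(\frac{1}{3}, \frac{1}{6}\right)\right)^{-1}$.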
Comments and references:
(a) Formula (4.19.2) is found in:
D.N. Shanbhag and M. Sreehari: On certain self-decomposable distributions. Z. für Wahr., 38, 217–222 (1977).
These authors later conveniently extended this formula to gamma variables, as is presented in Exercise 4.17. As an interpretation of formula (4.19.2) in terms of stochastic processes, we mention:
\[ l_S^{(\alpha)} \overset{\text{(law)}}{=} S^\alpha\, \frac{1}{(T_\alpha)^\alpha}, \]
where $l_S^{(\alpha)}$ is the local time of the Bessel process with dimension $2(1-\alpha)$, considered at the independent exponential time $S$. This relation is due to the scaling property of Bessel processes; in fact, one has:
\[ l_1^{(\alpha)} \overset{\text{(law)}}{=} \frac{1}{(T_\alpha)^\alpha}. \]
Here, $l_1^{(\alpha)}$ is the local time taken at time 1. Its distribution is known as the Mittag–Leffler distribution with parameter $\alpha$, whose Laplace transform is given by:
\[ E\left[\exp\left(x\, l_1^{(\alpha)}\right)\right] = E\left[\exp\left(x (T_\alpha)^{-\alpha}\right)\right] = \sum_{n=0}^{\infty} \frac{x^n}{\Gamma(\alpha n + 1)}. \]
For more details about Mittag–Leffler distributions, we refer to:
S.A. Molchanov and E. Ostrovskii: Symmetric stable processes as traces of degenerate diffusion processes. Teor. Verojatnost. i Primenen., 14, 127–130 (1969).
D.A. Darling and M. Kac: On occupation times for Markoff processes. Trans. Amer. Math. Soc., 84, 444–458 (1957).
R. N. Pillai: On Mittag–Leffler functions and related distributions. Ann. Inst. Statist. Math., 42, no. 1, 157–161 (1990).
(b) Formulae (4.19.3), (4.19.5) and their applications, (4.19.7) and (4.19.8) are found in V.M. Zolotarev [66].
* 4.20 A probabilistic translation of Selberg's integral formulae
Preliminary: A particular case of the celebrated Selberg's integral formulae (see below for references) is the following: for every $\gamma > 0$,
\[ E\left[|\Delta_n(X_1, \dots, X_n)|^{2\gamma}\right] = \prod_{j=2}^{n} \frac{\Gamma(1 + j\gamma)}{\Gamma(1 + \gamma)}, \tag{4.20.1} \]
where X1 , . . . , Xn are independent Gaussian variables, centred, with variance 1, and Δn (x1 , . . . , xn ) = j 0) ,
(4.21.4)
where $C$ denotes a standard Cauchy variable and $C_\mu = \sin(\pi\mu)\,C - \cos(\pi\mu) \overset{\text{(law)}}{=} \dfrac{\sin(\pi\mu - \Theta)}{\sin(\Theta)}$, and $\Theta$ is uniform on $[0, 2\pi[$.
Hint: Using residue calculus, show that
\[ \frac{1}{2\pi} \int_{\mathbb{R}} e^{isx}\, \frac{\sin \pi\mu}{\cosh x + \cos \pi\mu}\, dx = \frac{\sinh \pi\mu s}{\sinh \pi s}. \]
Comments and references:
(a) Formula (4.21.3) is very simple and striking, since it shows that the density of the ratio of two independent, unilateral, stable variables can be made explicit, whereas there is no such explicit formula for the density of $T_\mu$, except in the particular cases which are considered in Exercise 4.19. When $\mu = \frac{1}{2}$, the identity
\[ \frac{T_{1/2}}{T_{1/2} + T'_{1/2}} \overset{\text{(law)}}{=} \frac{1}{1 + C^2} \overset{\text{(law)}}{=} \cos^2(\Theta), \]
where $T_{1/2}$ and $T'_{1/2}$ are independent, $C$ is a Cauchy r.v. and $\Theta$ is uniformly distributed on $[0, 2\pi[$, plays some role in the derivation by:
P. Lévy: Sur certains processus stochastiques homogènes. Compositio Math., 7, 283–339 (1939)
of the arcsine law for $\int_0^1 ds\, \mathbf{1}_{\{B_s > 0\}}$, where $B$ is the standard Brownian motion. His results have been extended to some multidimensional cases in:
M. Barlow, J. Pitman and M. Yor: Une extension multidimensionnelle de la loi de l'arcsinus. Séminaire de Probabilités XXIII, Lecture Notes in Mathematics, 1372, 294–312 (1989).
(b) Formula (4.21.3) is found in both:
V.M. Zolotarev: Mellin–Stieltjes transforms in probability theory. Teor. Veroyatnost. i Primenen., 2, 444–469 (1957)
and
J. Lamperti: An occupation time theorem for a class of stochastic processes. Trans. Amer. Math. Soc., 88, 380–387 (1958).
(c) The same law occurs in free probability, with the free convolution of two free stable variables. See Biane's appendix (Proposition A4.4) in:
H. Bercovici and V. Pata: Stable laws and domains of attraction in free probability theory. With an appendix by Philippe Biane. Ann. of Math., (2) 149, no. 3, 1023–1060 (1999).
(d) See Exercise 5.10 for some discussion of the asymptotic distribution for $T_\mu$, as $\mu \to 0$, using partly (4.21.3).
**
4.22 Solving certain moment problems via simplification
Part A. Let $(\mu(n), n \in \mathbb{N})$ be the sequence of moments of a positive random variable $V$, which is moments determinate. Let $(\nu(n), n \in \mathbb{N})$ be the sequence of moments of a positive random variable $W$, which is simplifiable. Assume further that $V \overset{\text{(law)}}{=} WR$, for $R$ a positive r.v., independent of $W$. Prove that there exists only one moments determinate distribution $\theta(dx)$ on $\mathbb{R}_+$, such that:
\[ \int_0^\infty x^n\, \theta(dx) = \frac{\mu(n)}{\nu(n)}, \qquad n \in \mathbb{N}, \tag{4.22.1} \]
and that $\theta$ is the distribution of $R$.
Part B. (Examples.) Identify $R$ when:
(i) $\mu(n) = (n!)^2$ and $\nu(n) = \Gamma(n + \alpha)\Gamma(n + \beta)$, $\alpha, \beta > 1$;
(ii) $\mu(n) = (2n)!$ and $\nu(n) = n!$;
(iii) $\mu(n) = (3n)!$ and $\nu(n) = (2n)!$.
Comments and references:
(a) This exercise combines some of our previous discussions involving moment problems, simplifications, beta–gamma algebra, etc.
(b) This exercise has been strongly motivated by a discussion with K. Penson, in particular about the paper:
J.R. Klauder, K.A. Penson and J.M. Sixdeniers: Constructing coherent states through solutions of Stieltjes and Hausdorff moment problems. Phys. Review A., 64, 1–18.
4. Distributional computations – solutions
Solutions for Chapter 4
Solution to Exercise 4.1
Part A. 1. On the one hand, we have
\[ E[e_a(X)e_b(X)] = E[\exp (a+b)X]\, \exp\left(-\frac{1}{2}(a^2+b^2)\right) = \exp ab = \sum_{n\ge 0} \frac{(ab)^n}{n!}, \qquad a, b \in \mathbb{C}. \]
On the other hand,
\[ E[e_a(X)e_b(X)] = \sum_{n,m} \frac{a^n b^m}{n!\,m!}\, E[h_n(X)h_m(X)], \qquad a, b \in \mathbb{C}. \]
This shows that $E[h_n(X)h_m(X)] = 0$ if $n \ne m$, and $E[h_n(X)h_m(X)] = n!$ if $n = m$.
2. From the expression of the Laplace transform of $Y$, we deduce
\[ E[\exp(a(X + cY)) \mid X] = \exp(aX)\,E[\exp acY] = e_a(X, -c^2). \]
On the other hand, we have
\[ E[\exp(a(X + cY)) \mid X] = \sum_{n=0}^{\infty} \frac{a^n}{n!}\, E[(X + cY)^n \mid X], \]
so $E[(X + cY)^k \mid X] = \dfrac{\partial^k}{\partial a^k}\, e_a(X, -c^2)\Big|_{a=0} = h_k(X, -c^2)$.
Part B. 1. First, note that $\varepsilon = \operatorname{sgn}(X)$ is a symmetric Bernoulli r.v. which is independent of $(X^2, Y)$, therefore it is independent of $(T, g)$, and we only need to compute the law of this pair of variables. Let $f$ be a bounded measurable function on $\mathbb{R}_+ \times [0, 1]$, then
\[ E[f(T, g)] = \frac{2}{\pi} \int_{\mathbb{R}_+^2} f\left(\frac{1}{2}(x^2 + y^2), \frac{x^2}{x^2 + y^2}\right) e^{-(x^2+y^2)/2}\, dx\, dy. \]
The change of variables
\[ u = (x^2 + y^2)/2, \quad v = x^2/(x^2 + y^2) \qquad\Longleftrightarrow\qquad x = \sqrt{2uv}, \quad y = \sqrt{2u - 2uv} \]
gives
\[ E[f(T, g)] = \int_{\mathbb{R}_+ \times [0,1]} f(u, v)\, e^{-u}\, \frac{\mathbf{1}_{\{v \in ]0,1[\}}}{\pi\sqrt{v(1-v)}}\, du\, dv. \]
Therefore, the density of the law of $(T, g)$ on $\mathbb{R}^2$ is given by
\[ e^{-u}\, \mathbf{1}_{\{u \in [0,\infty[\}}\; \frac{1}{\pi\sqrt{v(1-v)}}\, \mathbf{1}_{\{v \in ]0,1[\}}, \qquad u, v \in \mathbb{R}. \]
This expression shows that $T$ and $g$ are independent and since $\varepsilon$ is independent of the pair $(T, g)$, it follows that $\varepsilon$, $T$ and $g$ are independent. From above, the law of $\sqrt{2T}$ has density $t \exp\left(-\frac{1}{2}t^2\right) \mathbf{1}_{\{t \in [0,\infty[\}}$ (so $T$ is exponentially distributed), and $g$ is arcsine distributed.
2. (i) Observing that $X = \mu\sqrt{2T}$, and using the previous question, we can write
\[ E[\exp(aX) \mid \mu] = E[\exp(a\mu\sqrt{2T}) \mid \mu] = \int_0^\infty \exp(at\mu)\, t \exp\left(-\frac{1}{2}t^2\right) dt = \varphi(a\mu), \]
where $\varphi$ is given by $\varphi(z) = E[\exp(z\sqrt{2T})]$, for any $z \in \mathbb{C}$. For any real $x$, we have
\[ \varphi(x) = \int_0^\infty t\, e^{xt - \frac{t^2}{2}}\, dt = 1 + x \int_0^\infty e^{xt - \frac{t^2}{2}}\, dt = 1 + x \exp\left(\frac{x^2}{2}\right) \int_0^\infty e^{-\frac{1}{2}(t-x)^2}\, dt = 1 + x \exp\left(\frac{x^2}{2}\right) \int_{-\infty}^{x} e^{-\frac{y^2}{2}}\, dy, \]
which gives the expression of $\varphi$ on $\mathbb{C}$, by analytical continuation. The formulas given in (ii) and (iii) simply follow from the development of $\cosh(aX) = \frac{1}{2}\left(e^{aX} + e^{-aX}\right)$.
3. Since sinh is an odd function, from the definition of $e_a(X)$, we can write
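\[ \exp\left(-\frac{a^2}{2}\right) E[\sinh(aX) \mid \mu] = \sum_{n\ge 0} \frac{a^{2n+1}}{(2n+1)!}\, E[h_{2n+1}(X) \mid \mu]. \]
On the other hand, from question 2, we have
\[ \exp\left(-\frac{a^2}{2}\right) E[\sinh(aX) \mid \mu] = \sqrt{\frac{\pi}{2}}\, a\mu \exp\left(\frac{a^2(\mu^2 - 1)}{2}\right) = \sqrt{\frac{\pi}{2}}\, \mu \sum_{n\ge 0} \frac{a^{2n+1}}{2^n n!} (\mu^2 - 1)^n. \]
Equating the two above expressions, we obtain
\[ E[h_{2p+1}(X) \mid \mu] = \sqrt{\frac{\pi}{2}}\, \frac{(2p+1)!}{2^p p!}\, \mu(\mu^2 - 1)^p. \]
From similar arguments applied to $\cosh(aX)$, we obtain
\[ \sum_{n\ge 0} \frac{a^{2n}}{(2n)!}\, E[h_{2n}(X) \mid \mu] = \exp\left(-\frac{a^2}{2}\right) \left(1 + a^2\mu^2 \int_0^1 dy\, \exp\left(\frac{a^2\mu^2}{2}(1 - y^2)\right)\right) \]
\[ = 1 + \sum_{n\ge 1} \left(\frac{(-1)^n}{2^n n!} + \frac{\mu^2}{2^{n-1}(n-1)!} \int_0^1 dy\, \left(\mu^2(1 - y^2) - 1\right)^{n-1}\right) a^{2n}, \]
which gives, for $p \ge 1$,
\[ E[h_{2p}(X) \mid \mu] = \frac{(-1)^p (2p)!}{2^p p!} + \frac{(2p)!\,\mu^2}{2^{p-1}(p-1)!} \int_0^1 dy\, \left(\mu^2(1 - y^2) - 1\right)^{p-1}. \]
4. The r.v.s $\mu$ and $\nu$ cannot be independent since they are related by $\mu^2 + \nu^2 = 1$. Set $\varepsilon' = \operatorname{sgn}(Y)$; then it is obvious that $T$, $g$, $\varepsilon$ and $\varepsilon'$ are independent. Moreover note that $\nu = \varepsilon'\sqrt{1 - g}$, so $\sqrt{2T}$ is independent of the pair of variables $(\mu, \nu)$, and
\[ E[\exp(aX + bY) \mid \mu, \nu] = E[\exp(a\mu\sqrt{2T} + b\nu\sqrt{2T}) \mid \mu, \nu] = \int_0^\infty \exp(at\mu + bt\nu)\, t \exp\left(-\frac{1}{2}t^2\right) dt = \varphi(a\mu + b\nu). \]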
Solution to Exercise 4.2
1. We obtain this identity in law by writing, for any bounded Borel function $f$ defined on $\mathbb{R}_+^2$,
\[ E[f(Z_{a,b}Z_{a+b}, (1 - Z_{a,b})Z_{a+b})] = \int_0^1 \int_0^\infty dx\, dy\, f(xy, (1-x)y)\, \frac{1}{\beta(a,b)}\, x^{a-1}(1-x)^{b-1}\, \frac{1}{\Gamma(a+b)}\, y^{a+b-1} e^{-y} \]
\[ = \frac{1}{\Gamma(a+b)\beta(a,b)} \int_0^\infty \int_0^\infty dz\, dt\, f(z, t)\, z^{a-1} e^{-z}\, t^{b-1} e^{-t}, \]
where the second equality has been obtained with an obvious change of variables. Note that the above calculation (with $f \equiv 1$) allows us to deduce the formula $\Gamma(a)\Gamma(b) = \Gamma(a+b)\beta(a,b)$.
2. Let $Z_a$, $Z_{a+b}$, $Z_{a+b+c}$, $Z_{a,b}$, $Z_{a,b+c}$, and $Z_{a+b,c}$ be independent variables. Since $Z_{a+b+c}$ satisfies the hypothesis of Exercise 1.12, the identity in law we want to prove is equivalent to
\[ Z_{a+b+c} Z_{a,b+c} \overset{\text{(law)}}{=} Z_{a+b+c} Z_{a,b} Z_{a+b,c}. \]
Applying question 1 to the products $Z_{a+b+c}Z_{a,b+c}$ and $Z_{a+b+c}Z_{a+b,c}$, we see that the above identity is also equivalent to
\[ Z_a \overset{\text{(law)}}{=} Z_{a+b} Z_{a,b}, \]
which has been proved in question 1.
3. A classical computation shows that for any $i$, $N_i^2 \overset{\text{(law)}}{=} 2Z_{1/2}$. The identity in law $R_k^2 \overset{\text{(law)}}{=} 2Z_{k/2}$ follows from the expression of the Laplace transform of the gamma variable with parameter $a$, i.e. $E[\exp -\lambda Z_a] = \left(\dfrac{1}{\lambda + 1}\right)^a$, $\lambda > 0$.
The law of a uniformly distributed vector $x_k$ on the unit sphere $S_{k-1}$ of $\mathbb{R}^k$ is characterized by the identities $|x_k| = 1$ and $x_k \overset{\text{(law)}}{=} \rho x_k$ for every rotation matrix $\rho$ in $\mathbb{R}^k$. Since such matrices are orthogonal, from Exercise 3.8 it holds that $(N_1, \dots, N_k) \overset{\text{(law)}}{=} \rho(N_1, \dots, N_k)$. Hence for every rotation matrix $\rho$, one has
\[ \left(\frac{1}{R_k}(N_1, \dots, N_k),\, R_k\right) \overset{\text{(law)}}{=} \left(\rho\, \frac{1}{R_k}(N_1, \dots, N_k),\, R_k\right), \]
and since $\left|\frac{1}{R_k}(N_1, \dots, N_k)\right| = 1$, we conclude that $\frac{1}{R_k}(N_1, \dots, N_k)$ is uniformly distributed on $S_{k-1}$ and independent of $R_k$.
4. Let $(N_i)_{i\ge 1}$ be an infinite sequence of independent centred Gaussian r.v.s with variance 1. It follows from question 3 that for any $k \le n$,
\[ \sqrt{n}\left(x_1^{(n)}, \dots, x_k^{(n)}\right) \overset{\text{(law)}}{=} \frac{\sqrt{n}}{R_n}(N_1, \dots, N_k), \]
where $R_n$ is the norm of $(N_1, \dots, N_k, \dots, N_n)$. From question 3 (i) and the law of large numbers, $\dfrac{\sqrt{n}}{R_n}$ converges almost surely to $1/\sqrt{E[2Z_{1/2}]} = 1$, which proves the result.
5. From question 3 (ii), $R_k^2 \sum_{i=1}^p x_i^2 = \sum_{i=1}^p N_i^2$ and $R_k$ is independent of $\sum_{i=1}^p x_i^2$. Moreover, we know from question 3 (i) that $\sum_{i=1}^p N_i^2$ is distributed as $2Z_{p/2}$ and that $\dfrac{1}{R_k^2}$ is distributed as $\frac{1}{2} Z_{k/2}^{-1}$. So we have
\[ Z_{p/2} \overset{\text{(law)}}{=} \left(\sum_{i=1}^p x_i^2\right) Z_{k/2}. \]
Applying the result of question 2 we also obtain
\[ Z_{p/2} \overset{\text{(law)}}{=} Z_{\frac{p}{2}, \frac{k-p}{2}}\, Z_{k/2} \overset{\text{(law)}}{=} \left(\sum_{i=1}^p x_i^2\right) Z_{k/2}. \]
And we conclude thanks to the fact that $Z_{k/2}$ is simplifiable.
6. From above, we have $\left(\sum_{i=1}^{k-2} x_i^2\right)^{k/2-1} \overset{\text{(law)}}{=} \left(Z_{k/2-1,1}\right)^{k/2-1}$ and we easily check from the above definition of beta variables that for any $a > 0$, $Z_{a,1}^a$ is uniformly distributed on $[0, 1]$.
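7. (a) By the change of variables $t = 1/x^2$, for any bounded Borel function $f$, we have
\[ E\left[f\left(\frac{1}{N^2}\right)\right] = \sqrt{\frac{2}{\pi}} \int_0^\infty f\left(\frac{1}{x^2}\right) e^{-\frac{x^2}{2}}\, dx = \frac{1}{\sqrt{2\pi}} \int_0^\infty f(t)\, e^{-\frac{1}{2t}}\, \frac{dt}{\sqrt{t^3}} = E[f(T)]. \]
To obtain the other identity in law, set $F(\alpha) = E\left[\exp\left(-\frac{\alpha^2}{2} T\right)\right]$. From above, $F(\alpha) = \sqrt{\dfrac{2}{\pi}} \displaystyle\int_0^\infty e^{-\frac{\alpha^2}{2x^2}} e^{-\frac{x^2}{2}}\, dx$, hence one has
\[ F'(\alpha) = -\alpha \sqrt{\frac{2}{\pi}} \int_0^\infty e^{-\frac{\alpha^2}{2x^2}} e^{-\frac{x^2}{2}}\, \frac{dx}{x^2}. \]
The change of variables $u = \alpha/x$ gives
\[ F'(\alpha) = -\sqrt{\frac{2}{\pi}} \int_0^\infty e^{-\frac{u^2}{2}} e^{-\frac{\alpha^2}{2u^2}}\, du = -F(\alpha). \]
Therefore, $F(\alpha) = c\, e^{-\alpha}$, where $c$ is some constant. Since $F(0) = 1$, we conclude that $F(\alpha) = e^{-\alpha}$.
(b) This follows simply from the expression of the Laplace transform of the law of $T$: $E\left[\exp\left(-\frac{1}{2}\alpha^2 \sum_{i=1}^k p_i^2 T_i\right)\right] = \prod_{i=1}^k \exp(-p_i\alpha) = \exp(-\alpha)$.
8. (a) Applying the result of question 7, we obtain that for any $k \ge 1$:
\[ T \overset{\text{(law)}}{=} \sum_{i=1}^k \frac{p_i^2}{N_i^2} \overset{\text{(law)}}{=} \frac{1}{R_k^2} \sum_{i=1}^k \frac{p_i^2}{x_i^2}, \]
where, on the right hand side, $R_k^2$ is independent from $(x_1, \dots, x_k)$. Then we have
\[ \frac{1}{R_k^2} \sum_{i=1}^k \frac{p_i^2}{x_i^2} \overset{\text{(law)}}{=} \frac{1}{R_k^2}\, \frac{1}{x_1^2}. \]
Hence, since $\dfrac{1}{R_k^2}$ is a simplifiable variable (use Exercise 1.12), we obtain:
\[ \sum_{i=1}^k \frac{p_i^2}{x_i^2} \overset{\text{(law)}}{=} \frac{1}{x_1^2} \overset{\text{(law)}}{=} \frac{1}{x_j^2}, \qquad \text{for any } j \le k. \]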
(b) Now we give a more computational proof which also yields the density of The calculation of the characteristic functions of log R12 and log T gives
k
p2i i=1 x2 . i
k
1 E exp iλ log 2 Rk
= E[(2Zk/2 )−iλ ] = 2−iλ
and
2 2−iλ ∞ −iλ−1/2 −s e−x /2 dx = √ (x2 )−iλ √ s e ds π 0 0 2π = 2−iλ Γ(1/2 − iλ) .
E[eiλ log T ] = E[(N 2 )−iλ ] = 2
Γ(k/2 − iλ) Γ(k/2)
∞
So that for all λ ∈ IR,
E exp iλ log
k p2j
Γ( 12 − iλ)Γ( k2 ) Γ(k/2) β( 1 − iλ, k2 − 12 ) = k Γ(k/2 − 1/2) 2 Γ( 2 − iλ)
=
2 j=1 xj
1 Γ(k/2) 1 (1 − t) = exp iλ log Γ(k/2 − 1/2) 0 t t1/2
k−2 ∞ (x − 1) exp(iλ log x) 2 xk/2 1
=
In conclusion, the law of the r.v. 9. Since k i=1
k
i=1
Zai /
k
j=1 Zaj
⎛
Zai and the vector ⎝Zai /
k
p2i i=1 x2 i
k−3 2
k−3 2
dt
dx .
k−3
has density
k−2 (x−1) 2 2 xk/2
1I[1,∞) (x).
= 1, it suffices to show the independence between k
⎞
Zaj : i ≤ k − 1⎠.
j=1
Let f : IR+ → IR and g : IRk−1 → IR be Borel and bounded. From the change of + variables, ⎧ ⎪ y1 = ⎪ ⎪ ⎪ ⎪ ⎪ · ⎪ ⎪ ⎪ ⎨ ·
x1 yk
⎪ · ⎪ ⎪ ⎪ ⎪ ⎪ yk−1 = xk−1 ⎪ yk ⎪ ⎪ ⎩ y = x + ··· + x k 1 k
and putting Za i = Zai /
k j=1
or equivalently
⎧ x1 = y 1 yk ⎪ ⎪ ⎪ ⎪ ⎪ · ⎪ ⎪ ⎪ ⎨ ·
⎪ · ⎪ ⎪ ⎪ ⎪ ⎪ xk−1 = yk−1 yk ⎪ ⎪ k−1 ⎩
xk = y k −
Zaj , we obtain:
E[f (Za1 + · · · + Zak )g(Za 1 , . . . , Za k−1 )]
x1 xk−1 f (x1 + · · · + xk )g ,..., × = x 1 + · · · + xk x 1 + · · · + xk IRk+ xa11 −1 . . . xakk −1 −(x1 +···+xk ) e dx1 . . . dxk Γ(a1 ) . . . Γ(ak )
i=1
yi yk
=
IRk+
f (yk )g(y1 , . . . , yk−1 )(y1 yk )a1 −1 . . . (yk−1 yk )ak−1 −1 ×
yk −
k−1
ak −1
yi yk
i=1
=
e−yk ykk−1 1I k−1 dy1 . . . dyk Γ(a1 ) . . . Γ(ak ) {(y1 ,...,yk−1 )∈[0,1] }
k−1 e−yk yk 1 a1 −1 f (yk ) . . . (yk−1 )ak−1 −1 × dyk k−1 g(y1 , . . . , yk−1 )(y1 ) Γ(a + · · · + a ) IR+ IR+ 1 k a +···+a
1−
k−1
ak −1
yi
i=1
−1
Γ(a1 + · · · + ak ) 1I k−1 dy1 . . . dyk−1 , Γ(a1 ) . . . Γ(ak ) {(y1 ,...,yk−1 )∈[0,1] }
which shows the required⎛independence. The above ⎞ identity also gives the density of the law of the vector: ⎝Zai /
k
Zaj : i ≤ k − 1⎠ on IRk−1 :
j=1 k−1 Γ(a1 + · · · + ak ) yi )ak −1 1I{(y1 ,...,yk−1 )∈[0,1]k−1 } , (y1 )a1 −1 . . . (yk−1 )ak−1 −1 (1 − Γ(a1 ) . . . Γ(ak ) i=1
⎛
and this law entirely characterizes the law of the vector ⎝Zai / the simplex
k
⎞
Zaj : i ≤ k ⎠ on
j=1
k
.
.
Solution to Exercise 4.3
1. We prove the equivalence only by taking $a = \alpha$ and $b = A$.
2. It suffices to show that the right hand sides of (4.3.2) and (4.3.3) are identical in law, that is:
\[ \frac{1}{Z_a} \overset{\text{(law)}}{=} \frac{1}{Z_{a+b}} \left(1 + \frac{Z_b}{Z_a}\right). \]
But the above identity follows directly from the fundamental beta–gamma identity (4.2.1).
3. We obtain (4.3.3) by adding the first coordinate to the second one multiplied by $\frac{1}{C}$, in each member of (4.3.4).
4. This follows (as in question 2) from the equivalent form of (4.2.1):
\[ (Z_{a+b}, Z_b) \overset{\text{(law)}}{=} \left(Z_{a+b,b}\, Z_{a+2b},\; (1 - Z_{a+b,b})\, Z_{a+2b}\right) \]
by trivial algebraic manipulations.
Solution to Exercise 4.4
Both questions 1 and 2 are solved by considering the finite dimensional variables $(\gamma_{t_1}, \gamma_{t_2}, \dots, \gamma_{t_k})$, for $t_1, t_2, \dots, t_k$, and using question 9 of Exercise 4.2.
3. The independence between $(D_u, u \le 1)$ and $\gamma_1$ allows us to write:
\[ E\left[\exp\left(-\gamma_1 \int_0^1 f(u)\, dD_u\right)\right] = \int_0^\infty dt\, e^{-t}\, E\left[\exp\left(-t \int_0^1 f(u)\, dD_u\right)\right] = E\left[\frac{1}{1 + \int_0^1 f(u)\, dD_u}\right]. \]
Solution to Exercise 4.5
1. It suffices to identify the Mellin transforms of both sides of (4.5.4). (See comment (b) in Exercise 1.13.) Observe that for all $a > 0$ and $k \ge 0$, $E[(Z_a)^k] = \dfrac{\Gamma(k+a)}{\Gamma(a)}$. Therefore, for every $k \ge 0$,
\[ E\left[\left(n^n Z_a Z_{a+\frac{1}{n}} \cdots Z_{a+\frac{n-1}{n}}\right)^k\right] = n^{nk} \prod_{j=0}^{n-1} E\left[Z_{a+\frac{j}{n}}^k\right] = n^{nk} \prod_{j=0}^{n-1} \frac{\Gamma\left(k + a + \frac{j}{n}\right)}{\Gamma\left(a + \frac{j}{n}\right)} = \frac{\Gamma(na + nk)}{\Gamma(na)} = E[(Z_{na})^{nk}], \]
where the second to last equality follows from (4.5.1).
2. From question 1 of Exercise 4.2, we deduce that for all $n \ge 1$ and $a > 0$,
\[ (Z_{na})^n \overset{\text{(law)}}{=} (Z_{na+nb})^n (Z_{na,nb})^n. \]
Applying the result of the previous question to $Z_{na}$ and $Z_{n(a+b)}$ gives:
\[ Z_a Z_{a+\frac{1}{n}} \cdots Z_{a+\frac{n-1}{n}} \overset{\text{(law)}}{=} Z_{a+b} Z_{a+b+\frac{1}{n}} \cdots Z_{a+b+\frac{n-1}{n}}\, (Z_{na,nb})^n. \]
Now, from question 1 of Exercise 4.2, we have $Z_{a+\frac{i}{n}} \overset{\text{(law)}}{=} Z_{a+\frac{i}{n}+b}\, Z_{a+\frac{i}{n},b}$, so that
\[ Z_{a+b} Z_{a+b+\frac{1}{n}} \cdots Z_{a+b+\frac{n-1}{n}}\; Z_{a,b} Z_{a+\frac{1}{n},b} \cdots Z_{a+\frac{n-1}{n},b} \overset{\text{(law)}}{=} Z_{a+b} Z_{a+b+\frac{1}{n}} \cdots Z_{a+b+\frac{n-1}{n}}\, (Z_{na,nb})^n. \]
The variable $Z_{a+b} Z_{a+b+\frac{1}{n}} \cdots Z_{a+b+\frac{n-1}{n}}$ being simplifiable, we can apply the result of Exercise 1.12 to get the conclusion. (We might also invoke the injectivity of the Mellin transform.)
Solution to Exercise 4.6
1. From question 1 of Exercise 4.2, we have:
\[ Z_a \overset{\text{(law)}}{=} Z_{a,1} Z_{a+1}, \qquad Z_a \overset{\text{(law)}}{=} Z_{a,1} Z_{a+1,1} Z_{a+2}, \quad \dots, \]
which allows us to obtain (4.6.1) by successive iterations of this result.
2. The Laplace transform of $n^{-1} Z_{a+n}$ is given by $E\left[e^{-\lambda n^{-1} Z_{a+n}}\right] = \left(1 + \dfrac{\lambda}{n}\right)^{-(a+n)}$ and we easily check that it converges towards $e^{-\lambda}$ for any $\lambda \in \mathbb{R}_+$. This proves that $n^{-1} Z_{a+n}$ converges in law towards the constant 1 as $n$ goes to $+\infty$.
A less computational proof consists of writing: $Z_{a+n} \overset{\text{(law)}}{=} Z_a + X_1 + \cdots + X_n$, where on the right hand side of this equality, the $n + 1$ r.v.s are independent, each of the $X_i$s having exponential law with parameter 1. Then applying the law of large numbers, we get $\dfrac{Z_{a+n}}{n} \to 1 = E[X_1]$ in law as $n \to \infty$.
3. As we already noticed in question 6 of Exercise 4.2, for any $n$, the r.v. $U^{\frac{1}{a+(n-1)}}$ is distributed as $Z_{a+(n-1),1}$, so from question 1, we have
\[ Z_a \overset{\text{(law)}}{=} n\, U_1^{\frac{1}{a}} \cdots U_n^{\frac{1}{a+(n-1)}}\; \frac{Z_{a+n}}{n}, \]
where $Z_{a+n}$ is independent of $(U_1, \dots, U_n)$. We deduce the result from this independence and the convergence in law established in question 2.
4. By the same arguments as in question 1, we show that
\[ Z_a \overset{\text{(law)}}{=} Z_{a,r} Z_{a+r,r} \cdots Z_{a+(n-1)r,r}\, Z_{a+nr}, \]
and as in question 2, we show that $n^{-1} Z_{a+nr}$ converges in law to the constant $r$ as $n$ goes to $+\infty$. Then we only need to use the same reasoning as in question 3.
Solution to Exercise 4.7
1. From the identities $\left(\dfrac{A}{L}, \dfrac{1-A}{L}\right) \overset{\text{(law)}}{=} \left(\dfrac{1}{Z_b}, \dfrac{1}{Z_a}\right)$ and $M = \dfrac{L}{A(1-A)}$, we deduce
\[ (A, M) \overset{\text{(law)}}{=} \left(\frac{Z_a}{Z_a + Z_b},\; Z_a + Z_b\right), \]
and we know from question 9 of Exercise 4.2 that the right hand side of the above identity is distributed as $(Z_{a,b}, Z_{a+b})$, these two variables being independent.
2. (a) From $Q_\beta(\Omega) = 1$, we obtain $c_\beta = E[L^\beta]^{-1}$. Since $L = MA(1-A)$, we have from the independence between $M$ and $A$: $E[L^\beta] = E[M^\beta]\, E[A^\beta(1-A)^\beta]$. Now, let us compute $E[A^\beta(1-A)^\beta]$: first we have $A^\beta(1-A)^\beta = \dfrac{Z_a^\beta Z_b^\beta}{(Z_a + Z_b)^{2\beta}}$. Since we can write $(Z_a Z_b)^\beta$ as $(Z_a Z_b)^\beta = A^\beta(1-A)^\beta (Z_a + Z_b)^{2\beta}$, and from the independence between $\left(\dfrac{Z_a}{Z_a + Z_b}, \dfrac{Z_b}{Z_a + Z_b}\right)$ and $Z_a + Z_b$, established in question 9 of Exercise 4.2, we have:
\[ E[A^\beta(1-A)^\beta] = E[(Z_a Z_b)^\beta]\, E[(Z_a + Z_b)^{2\beta}]^{-1} = E[Z_a^\beta]\, E[Z_b^\beta]\, E[Z_{a+b}^{2\beta}]^{-1} = \frac{\Gamma(a+\beta)}{\Gamma(a)}\, \frac{\Gamma(b+\beta)}{\Gamma(b)}\, \frac{\Gamma(a+b)}{\Gamma(a+b+2\beta)}. \]
Finally, with $E[M^\beta] = \dfrac{\Gamma(a+b+\beta)}{\Gamma(a+b)}$, we have:
\[ c_\beta = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+\beta)\Gamma(b+\beta)}\, \frac{\Gamma(a+b+2\beta)}{\Gamma(a+b+\beta)} = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b+\beta)\, B(a+\beta, b+\beta)}. \]
(b) Let us compute the joint Mellin transform of the pair of r.v.s $(A, M)$ under $Q_\beta$, that is: $E_{Q_\beta}[A^k M^n]$, for every $k, n \ge 0$.
\[ E_{Q_\beta}[A^k M^n] = c_\beta\, E[A^k M^n L^\beta] = c_\beta\, E[A^{k+\beta}(1-A)^\beta M^{n+\beta}] = c_\beta\, E[A^{k+\beta}(1-A)^\beta]\, E[M^{n+\beta}]. \]
As in 2 (a), by writing $Z_a^{k+\beta} Z_b^\beta = A^{k+\beta}(1-A)^\beta (Z_a + Z_b)^{k+2\beta}$, we have
\[ E[A^{k+\beta}(1-A)^\beta] = E[Z_a^{k+\beta}]\, E[Z_b^\beta]\, E[Z_{a+b}^{k+2\beta}]^{-1} = \frac{\Gamma(a+k+\beta)}{\Gamma(a)}\, \frac{\Gamma(b+\beta)}{\Gamma(b)}\, \frac{\Gamma(a+b)}{\Gamma(a+b+k+2\beta)}, \]
so that with $E[M^{n+\beta}] = \dfrac{\Gamma(a+b+n+\beta)}{\Gamma(a+b)}$, we obtain:
\[ E_{Q_\beta}[A^k M^n] = c_\beta\, \frac{\Gamma(a+k+\beta)}{\Gamma(a)}\, \frac{\Gamma(b+\beta)}{\Gamma(b)}\, \frac{\Gamma(a+b+n+\beta)}{\Gamma(a+b+k+2\beta)} = \frac{c_\beta}{\Gamma(a)\Gamma(b)}\, B(a+k+\beta, b+\beta)\, \Gamma(a+b+n+\beta) \]
\[ = \frac{B(a+k+\beta, b+\beta)}{B(a+\beta, b+\beta)}\, \frac{\Gamma(a+b+n+\beta)}{\Gamma(a+b+\beta)} = E\left[Z_{a+\beta,b+\beta}^k\right] E\left[Z_{a+b+\beta}^n\right]. \]
Thus, under $Q_\beta$, the r.v. $(A, M)$ has the same joint Mellin transform as the pair $(Z_{a+\beta,b+\beta}, Z_{a+b+\beta})$ of independent variables, hence their laws coincide.
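(c) For $f$ a bounded Borel function, we have
\[ E_{Q_\beta}\!\left[f\!\left(\frac{A}{L}, \frac{1-A}{L}\right)\right] = c_\beta E\!\left[f\!\left(\frac{1}{Z_b}, \frac{1}{Z_a}\right)L^\beta\right] = c_\beta E\!\left[f\!\left(\frac{1}{Z_b}, \frac{1}{Z_a}\right)\frac{(Z_aZ_b)^\beta}{(Z_a+Z_b)^\beta}\right] \]
\[ = \frac{c_\beta}{\Gamma(a)\Gamma(b)}\int_{\mathbb{R}_+^2} f\!\left(\frac{1}{y}, \frac{1}{x}\right)\frac{(xy)^\beta}{(x+y)^\beta}\,x^{a-1}y^{b-1}e^{-(x+y)}\,dx\,dy \]
\[ = \frac{c_\beta}{\Gamma(a)\Gamma(b)}\int_{\mathbb{R}_+^2} f(s,t)\,(s+t)^{-\beta}s^{-(b+1)}t^{-(a+1)}e^{-\frac{s+t}{st}}\,ds\,dt , \]
where we put $s = 1/y$, $t = 1/x$. This gives the density $g$ of $\left(\frac{A}{L}, \frac{1-A}{L}\right)$ under $Q_\beta$:
\[ g(s,t) = \frac{c_\beta}{\Gamma(a)\Gamma(b)}\,(s+t)^{-\beta}s^{-(b+1)}t^{-(a+1)}e^{-\frac{s+t}{st}}\,\mathbf{1}_{\{s\ge 0,\,t\ge 0\}} . \]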
Solution to Exercise 4.8

1. From the identity in law (4.5.4), we have
\[ Z \stackrel{\text{(law)}}{=} \sqrt{2Z}\,\sqrt{2Z_{1/2}} , \]
where $Z_{1/2}$ is a gamma r.v. with parameter $1/2$ which is independent of $Z$. To conclude, it suffices to check that $\sqrt{2Z_{1/2}} \stackrel{\text{(law)}}{=} |N|$, which is easy.

2. Let $Z_1, Z_2, \ldots, Z_p$ be exponentially distributed r.v.s with parameter 1 such that the $2p$ r.v.s $Z_1, \ldots, Z_p, N_1, \ldots, N_p$ are independent. Put $Z = Z_0$. From above, the identities in law $Z_{i-1} \stackrel{\text{(law)}}{=} \sqrt{2Z_i}\,|N_i|$ hold for any $i = 1, \ldots, p$. Replacing $Z_1$ by $\sqrt{2Z_2}\,|N_2|$ in the identity $Z \stackrel{\text{(law)}}{=} \sqrt{2Z_1}\,|N_1|$ and from the independence hypothesis, we obtain:
\[ Z \stackrel{\text{(law)}}{=} Z_2^{\frac14}\,2^{\frac12+\frac14}\,|N_1|\,|N_2|^{\frac12} . \]
We get the result by successive iterations of this process.

3. It is clear that the term $Z^{\frac{1}{2^p}}\,2^{\frac12+\cdots+\frac{1}{2^p}}$ in equation (4.8.2) converges almost surely to 2. Moreover, since
\[ \tfrac12\,Z^{\frac{1}{2^p}}\,2^{\frac12+\cdots+\frac{1}{2^p}}\,|N_1|\,|N_2|^{\frac12}\cdots|N_p|^{\frac{1}{2^{p-1}}} \stackrel{\text{(law)}}{=} \tfrac12 Z \]
according to equation (4.8.2), then $|N_1|\,|N_2|^{\frac12}\cdots|N_p|^{\frac{1}{2^{p-1}}}$ converges in law to $\tfrac12 Z$.
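The identity $Z \stackrel{\text{(law)}}{=} \sqrt{2Z}\,|N|$ and its $p$-fold iteration can be checked on Mellin transforms, since both sides are products of independent positive variables. The sketch below uses the standard moment formulas $E[Z^s] = \Gamma(1+s)$ and $E|N|^s = 2^{s/2}\Gamma(\frac{s+1}{2})/\sqrt{\pi}$; the function names are hypothetical and this is a numerical illustration, not a proof.

```python
import math


def mellin_exponential(s):
    """E[Z^s] for a standard exponential variable Z."""
    return math.gamma(1 + s)


def mellin_abs_gaussian(s):
    """E[|N|^s] for a standard Gaussian variable N (s > -1)."""
    return 2 ** (s / 2) * math.gamma((s + 1) / 2) / math.sqrt(math.pi)


def mellin_one_step(s):
    """E[(sqrt(2 Z) |N|)^s] with Z and N independent."""
    return 2 ** (s / 2) * mellin_exponential(s / 2) * mellin_abs_gaussian(s)


def mellin_iterated(s, p):
    """Mellin transform of Z^(1/2^p) 2^(1/2+...+1/2^p) |N_1| |N_2|^(1/2) ... |N_p|^(1/2^(p-1))."""
    power_of_two = 2 ** (s * (1 - 2 ** (-p)))  # 1/2 + ... + 1/2^p = 1 - 2^(-p)
    result = mellin_exponential(s / 2 ** p) * power_of_two
    for k in range(1, p + 1):
        result *= mellin_abs_gaussian(s / 2 ** (k - 1))
    return result
```

For every $s > 0$ and every $p \ge 1$, `mellin_iterated(s, p)` should agree with `mellin_exponential(s)` up to rounding, which is the duplication formula for the gamma function applied $p$ times.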
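Solution to Exercise 4.9

1. For every $t \ge 0$, we have
\[ P(T_1+T_2 > t) = \int_0^\infty \lambda_1e^{-\lambda_1s}\,P(s+T_2 > t)\,ds = \int_0^t \lambda_1e^{-\lambda_1s}e^{-\lambda_2(t-s)}\,ds + \int_t^\infty \lambda_1e^{-\lambda_1s}\,ds \]
\[ = \frac{\lambda_2}{\lambda_2-\lambda_1}\,e^{-\lambda_1t} - \frac{\lambda_1}{\lambda_2-\lambda_1}\,e^{-\lambda_2t} = \frac{\lambda_2}{\lambda_2-\lambda_1}\,P(T_1 > t) - \frac{\lambda_1}{\lambda_2-\lambda_1}\,P(T_2 > t) . \]
This proves that the probability measure $P(T_1+T_2 \in dt)$ is a linear combination of the probability measures $P(T_1 \in dt)$ and $P(T_2 \in dt)$ with coefficients $\alpha = \frac{\lambda_2}{\lambda_2-\lambda_1}$ and $\beta = -\frac{\lambda_1}{\lambda_2-\lambda_1}$.

2. (a) This follows from the simple calculation:
\[ P(T_1 > T_3) = \int_{\{s\ge t\}} \lambda_1(\lambda_2-\lambda_1)e^{-\lambda_1s}e^{-(\lambda_2-\lambda_1)t}\,ds\,dt = \lambda_1(\lambda_2-\lambda_1)\int_0^\infty e^{-(\lambda_2-\lambda_1)t}\left(\int_t^\infty e^{-\lambda_1s}\,ds\right)dt = \frac{\lambda_2-\lambda_1}{\lambda_2} . \]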
(b) Let us first compute the law of $(T_1, T_3)$ conditionally on $(T_1 > T_3)$. Let $f$ be a bounded Borel function defined on $\mathbb{R}^2$; we have from above:
\[ E[f(T_1,T_3) \mid T_1 > T_3] = \frac{1}{P(T_1 > T_3)}\,E[f(T_1,T_3)\mathbf{1}_{\{T_1>T_3\}}] = \int_{\mathbb{R}^2} f(s,t)\,\lambda_1\lambda_2e^{-\lambda_1s}e^{-(\lambda_2-\lambda_1)t}\,\mathbf{1}_{\{0\le t\le s\}}\,ds\,dt . \]
So, conditionally on $(T_1 > T_3)$, the pair of r.v.s $(T_1, T_3)$ has density $\lambda_1\lambda_2e^{-\lambda_1s}e^{-(\lambda_2-\lambda_1)t}\,\mathbf{1}_{\{0\le t\le s\}}$ on $\mathbb{R}^2$. Now we compute the law of $(T_1+T_2, T_2)$:
\[ E[f(T_1+T_2,T_2)] = \int_{\mathbb{R}_+^2} f(x+y,y)\,\lambda_1\lambda_2e^{-\lambda_1x}e^{-\lambda_2y}\,dx\,dy = \int_{\mathbb{R}^2} f(s,t)\,\lambda_1\lambda_2e^{-\lambda_1s}e^{-(\lambda_2-\lambda_1)t}\,\mathbf{1}_{\{0\le t\le s\}}\,ds\,dt . \]
We note that the pair of r.v.s $(T_1+T_2, T_2)$ has the same law as $(T_1, T_3)$ conditionally on $(T_1 > T_3)$.
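(c) From the result we just proved, we can write:
\[ E[f(T_1)\mathbf{1}_{\{T_1>T_3\}}] = E[f(T_1+T_2)]\,P[T_1 > T_3] = \frac{\lambda_2-\lambda_1}{\lambda_2}\,E[f(T_1+T_2)] . \qquad (4.9.a) \]
On the other hand, inverting $T_1$ and $T_3$ in question 2 (b), we obtain that the pair of r.v.s $(T_3+T_2, T_2)$ has the same law as the pair of r.v.s $(T_3, T_1)$, conditionally on $(T_3 > T_1)$. This implies
\[ E[f(T_1)\mathbf{1}_{\{T_1<T_3\}}] = E[f(T_1) \mid T_3 > T_1]\,P[T_3 > T_1] = \frac{\lambda_1}{\lambda_2}\,E[f(T_2)] . \qquad (4.9.b) \]
Writing $E[f(T_1)] = E[f(T_1)\mathbf{1}_{\{T_1>T_3\}}] + E[f(T_1)\mathbf{1}_{\{T_1<T_3\}}]$ and combining (4.9.a) with (4.9.b), we obtain
\[ E[f(T_1)] = \frac{\lambda_2-\lambda_1}{\lambda_2}\,E[f(T_1+T_2)] + \frac{\lambda_1}{\lambda_2}\,E[f(T_2)] , \]
which is the identity of question 1.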
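3. It can be proved by induction that formula (4.9.1) is verified with the $\alpha_i^{(n)}$'s given in (4.9.2). Question 1 gives the result at rank $n = 2$. Now suppose that the result is true at some rank $n \in \mathbb{N}$ and let $T_{n+1}$ be independent of $T_1, T_2, \ldots, T_n$ and such that $P(T_{n+1} > t) = \exp(-\lambda_{n+1}t)$, $\lambda_{n+1}$ being distinct from $\lambda_1, \ldots, \lambda_n$. For all $t \ge 0$, we have:
\[ P\!\left(\sum_{i=1}^{n+1} T_i > t\right) = \int_0^\infty P\!\left(\sum_{i=1}^{n} T_i > t-s\right)\lambda_{n+1}e^{-\lambda_{n+1}s}\,ds \]
\[ = \int_0^t P\!\left(\sum_{i=1}^{n} T_i > t-s\right)\lambda_{n+1}e^{-\lambda_{n+1}s}\,ds + \int_t^\infty \lambda_{n+1}e^{-\lambda_{n+1}s}\,ds \]
\[ = \int_0^t \sum_{i=1}^{n}\alpha_i^{(n)}P(T_i > t-s)\,\lambda_{n+1}e^{-\lambda_{n+1}s}\,ds + e^{-\lambda_{n+1}t} . \]
Recall that $\sum_{i=1}^n \alpha_i^{(n)} = 1$. Using the fact that for any $i = 1, 2, \ldots, n$, the constants $a_i = \frac{\lambda_{n+1}}{\lambda_{n+1}-\lambda_i}$ and $b_i = \frac{\lambda_i}{\lambda_i-\lambda_{n+1}}$ satisfy $P(T_i+T_{n+1} > t) = a_iP(T_i > t) + b_iP(T_{n+1} > t)$, we can write the above identity in the following manner:
\[ P\!\left(\sum_{i=1}^{n+1} T_i > t\right) = \sum_{i=1}^{n}\alpha_i^{(n)}\left(\int_0^t P(T_i > t-s)\,\lambda_{n+1}e^{-\lambda_{n+1}s}\,ds + e^{-\lambda_{n+1}t}\right) = \sum_{i=1}^{n}\alpha_i^{(n)}P(T_i+T_{n+1} > t) \]
\[ = \sum_{i=1}^{n}\alpha_i^{(n)}\bigl(a_iP(T_i > t) + b_iP(T_{n+1} > t)\bigr) = \sum_{i=1}^{n+1}\alpha_i^{(n+1)}P(T_i > t) , \]
where the $\alpha_i^{(n+1)}$'s are determined via the last equality by $\alpha_i^{(n+1)} = \alpha_i^{(n)}a_i$, $i = 1, \ldots, n$, and $\alpha_{n+1}^{(n+1)} = \sum_{i=1}^n \alpha_i^{(n)}b_i$.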
This equality being valid for all t ≥ 0, we proved (4.9.1) and (4.9.2) at rank n + 1.
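The recursion of the induction step is easy to run numerically. The sketch below (hypothetical function names) builds the coefficients from $\alpha_1^{(1)} = 1$ by applying $\alpha_i^{(n+1)} = \alpha_i^{(n)}a_i$ and $\alpha_{n+1}^{(n+1)} = \sum_i \alpha_i^{(n)}b_i$, assuming the rates are pairwise distinct and positive. Formula (4.9.2) itself is not reproduced in this section; the product $\prod_{j\ne i}\lambda_j/(\lambda_j-\lambda_i)$ used for comparison is simply what the recursion unrolls to.

```python
import math


def mixture_coefficients(rates):
    """Coefficients alpha_i^(n) obtained by iterating the induction step."""
    alphas = [1.0]  # rank n = 1: P(T_1 > t) itself
    for n in range(1, len(rates)):
        lam_new = rates[n]
        a = [lam_new / (lam_new - rates[i]) for i in range(n)]
        b = [rates[i] / (rates[i] - lam_new) for i in range(n)]
        last = sum(alphas[i] * b[i] for i in range(n))
        alphas = [alphas[i] * a[i] for i in range(n)] + [last]
    return alphas


def product_form(rates):
    """Closed product form that the recursion unrolls to."""
    return [
        math.prod(lj / (lj - li) for j, lj in enumerate(rates) if j != i)
        for i, li in enumerate(rates)
    ]


def survival(rates, t):
    """P(T_1 + ... + T_n > t) as a signed mixture of exponential tails."""
    return sum(al * math.exp(-lam * t)
               for al, lam in zip(mixture_coefficients(rates), rates))
```

With rates `[1.0, 2.0, 3.0]` the recursion returns the coefficients 3, -3 and 1, which sum to 1 although they are not all positive.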
Solution to Exercise 4.10
1. According to question 1 (or 9) of Exercise 4.2, the pair $\left(\frac{T}{T+T'}, T+T'\right)$ is distributed as $(\tilde U, Z_2)$, where $\tilde U$ and $Z_2$ are independent, $\tilde U$ is uniformly distributed on $[0,1]$ and $Z_2$ has density $te^{-t}\mathbf{1}_{\{t\ge 0\}}$ (the reader may also want to give a simple direct proof).

2. Put $\tilde U = \frac12(1+U)$; then $\tilde U$ is uniformly distributed on $[0,1]$ and independent of $(T, T')$. Moreover, we have the identity:
\[ \left(\log\frac{1+U}{1-U},\; U(T+T')\right) = \left(\log\frac{\tilde U}{1-\tilde U},\; (2\tilde U-1)(T+T')\right) . \]
To conclude, note that the identity in law
\[ \left(\log\frac{\tilde U}{1-\tilde U},\; (2\tilde U-1)(T+T')\right) \stackrel{\text{(law)}}{=} \left(\log\frac{T}{T'},\; T-T'\right) \]
may be obtained from the identity $\left(\frac{T}{T+T'}, T+T'\right) \stackrel{\text{(law)}}{=} (\tilde U, T+T')$ established in the previous question, from a one to one transformation.
3. The characteristic function of $(X, Y)$ is
\[ E[\exp i(\nu X+\mu Y)] = \int_0^\infty \exp(i\nu\log t)\exp(-t(1-i\mu))\,dt \int_0^\infty \exp(-i\nu\log t')\exp(-t'(1+i\mu))\,dt' \]
\[ = (1-i\mu)^{-(1+i\nu)}(1+i\mu)^{-(1-i\nu)}\int_0^\infty \exp(i\nu\log s - s)\,ds \int_0^\infty \exp(-i\nu\log s' - s')\,ds' \]
\[ = (1-i\mu)^{-(1+i\nu)}(1+i\mu)^{-(1-i\nu)}\,\Gamma(1+i\nu)\Gamma(1-i\nu) = (1-i\mu)^{-(1+i\nu)}(1+i\mu)^{-(1-i\nu)}\,i\nu\,\Gamma(i\nu)\Gamma(1-i\nu) \]
\[ = (1-i\mu)^{-(1+i\nu)}(1+i\mu)^{-(1-i\nu)}\,\frac{\pi\nu}{\sinh(\pi\nu)} , \]
where $\Gamma$ is the extension in the complex plane of the classical gamma function.

4. Differentiating with respect to $\mu$ the expression obtained in question 3, we have:
\[ E[Y\exp i(\nu X+\mu Y)] = -\frac{\pi\nu}{\sinh(\pi\nu)}\;\frac{(1-i\nu)(1+i\mu)^{-i\nu}(1-i\mu)^{i\nu+1} - (i\nu+1)(1-i\mu)^{i\nu}(1+i\mu)^{1-i\nu}}{(1-i\mu)^{2(i\nu+1)}(1+i\mu)^{2(1-i\nu)}} . \]
Setting $\nu = -\mu$ in the right hand side of the above equality, the two terms of the numerator cancel and we obtain the result. The relation
\[ E[Y\exp i\mu(Y-X)] = E\bigl[E[Y \mid Y-X]\exp i\mu(Y-X)\bigr] = 0 , \qquad \mu \in \mathbb{R} , \]
characterizes $E[Y \mid Y-X]$ and shows that this r.v. equals 0.

5. The first part of the question is a particular case of question 2 of Exercise 4.9. To show the second part, first observe that
\[ (X,Y) \stackrel{\text{(law)}}{=} \left(\log\frac{T}{T'},\; T-T'\right) \stackrel{\text{(law)}}{=} \left(\log\frac{T'}{T},\; T'-T\right) \stackrel{\text{(law)}}{=} -(X,Y) \]
and
\[ (|X|,|Y|) = (X,Y)\mathbf{1}_{\{X>0,\,Y>0\}} - (X,Y)\mathbf{1}_{\{X\le 0,\,Y\le 0\}} . \]
The two above equalities imply that
\[ (|X|,|Y|) \stackrel{\text{(law)}}{=} \bigl((X,Y) \mid X>0,\, Y>0\bigr) , \]
which, from the definition of $(X,Y)$, can be written as:
\[ (|X|,|Y|) \stackrel{\text{(law)}}{=} \left(\left(\log\frac{T}{T'},\; T-T'\right) \Bigm| T > T'\right) . \]
From the first part of the question, we finally have:
\[ (|X|,|Y|) \stackrel{\text{(law)}}{=} \left(\log\left(1+\frac{2T}{T'}\right),\; T\right) . \]
Solution to Exercise 4.11

1. From the identity in law (4.2.3), we have $R \stackrel{\text{(law)}}{=} \sqrt{2Z_{3/2}}$, where $Z_{3/2}$ is gamma distributed with parameter $\frac32$, so we easily compute the law of $R$:
\[ P(R \in dr) = \sqrt{\frac{2}{\pi}}\,r^2e^{-\frac{r^2}{2}}\,dr , \qquad r \ge 0 . \]

2. If $f$ is a bounded Borel function defined on $\mathbb{R}^2$, then from the change of variables $(r,u) = (r, \frac{x}{r})$, we obtain:
\[ E[f(R,U)] = \int_{\mathbb{R}_+^2} f\!\left(r, \frac{x}{r}\right)\frac{1}{r}\,\mathbf{1}_{[0,r]}(x)\,\sqrt{\frac{2}{\pi}}\,r^2e^{-\frac{r^2}{2}}\,dx\,dr = \int_0^\infty\!\!\int_0^1 f(r,u)\,\sqrt{\frac{2}{\pi}}\,r^2e^{-\frac{r^2}{2}}\,dr\,du . \]
This shows that $R$ and $U$ are independent and $U$ is uniformly distributed over $[0,1]$.

3. If $f$ is as above, then $E[f(X,Y) \mid R = r] = \frac1r\int_0^r f(x, r-x)\,dx = \frac1r\int_0^r f(r-y, y)\,dy$. This shows that conditionally on $R = r$, $Y$ is uniformly distributed over $[0,r]$.
Integrating this identity with respect to the law of $R$, we obtain:
\[ E[f(X,Y)] = \int_0^\infty \frac1r\int_0^r f(r-y,y)\,dy\,\sqrt{\frac{2}{\pi}}\,r^2e^{-\frac{r^2}{2}}\,dr = \int_0^\infty\!\!\int_0^\infty f(x,y)\,\sqrt{\frac{2}{\pi}}\,(x+y)e^{-\frac{(x+y)^2}{2}}\,dx\,dy , \]
which gives the density of the law of $(X,Y)$: $\sqrt{\frac{2}{\pi}}\,(x+y)e^{-\frac{(x+y)^2}{2}}\,\mathbf{1}_{[0,+\infty)^2}(x,y)$.

4. From the law of the pair $(X,Y)$, we deduce:
\[ E[f(G,T)] = \int_0^\infty\!\!\int_0^\infty f(x-y, xy)\,\sqrt{\frac{2}{\pi}}\,(x+y)e^{-\frac{(x+y)^2}{2}}\,dx\,dy = \int_{\mathbb{R}^2} f(g,t)\,\frac{1}{\sqrt{2\pi}}e^{-\frac{g^2}{2}}\,dg\;2e^{-2t}\mathbf{1}_{[0,\infty)}(t)\,dt . \]
We conclude that $G$ and $T$ are independent. Moreover, $G$ is a centred normal random variable with variance 1 and $T$ is exponentially distributed with parameter 2.

5. From (4.11.1) and the independence between $G$ and $T$, conditionally on $T = t$, $\frac{V^2}{1-V^2}$ has the same law as $\frac{G^2}{2t}$, that is:
\[ P\!\left(\frac{V^2}{1-V^2} \in dz \Bigm| T = t\right) = \left(\frac{t}{\pi}\right)^{\frac12} z^{-\frac12}e^{-tz}\,dz . \]

6. From question 5, we know that $\frac{G^2}{2}$ is gamma distributed with parameter $\frac12$. From question 4, $2T$ is exponentially distributed with parameter 1, and $A$ is beta distributed with parameters $\frac12$ and $\frac12$, so equation (4.11.2) is given by equation (4.2.1) in the particular case $a = b = \frac12$. What makes the difference between (4.11.1) and (4.11.2) is the fact that in (4.11.1) the variable $\frac{V^2}{1-V^2}$ is not independent of $T$ whereas it is the case for $A$ in (4.11.2).
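The change of variables behind question 4 can be checked pointwise. The map $(x,y) \mapsto (g,t) = (x-y, xy)$ is one to one on $[0,\infty)^2$ with Jacobian determinant $x+y$, and $(x+y)^2 = g^2+4t$, so dividing the density of $(X,Y)$ by $x+y$ should give the product of a standard Gaussian density in $g$ and an exponential density with parameter 2 in $t$. The sketch below, with hypothetical function names, evaluates both sides at a given point.

```python
import math


def density_xy(x, y):
    """Density of (X, Y) found in question 3."""
    if x < 0 or y < 0:
        return 0.0
    s = x + y
    return math.sqrt(2 / math.pi) * s * math.exp(-s * s / 2)


def density_gt_by_change_of_variables(x, y):
    """Density of (G, T) at (x - y, x * y), using |det d(g,t)/d(x,y)| = x + y."""
    return density_xy(x, y) / (x + y)


def density_gt_product_form(g, t):
    """Standard Gaussian density in g times exponential(2) density in t."""
    gaussian = math.exp(-g * g / 2) / math.sqrt(2 * math.pi)
    exponential = 2 * math.exp(-2 * t) if t >= 0 else 0.0
    return gaussian * exponential
```

Evaluating `density_gt_by_change_of_variables(x, y)` and `density_gt_product_form(x - y, x * y)` at any point with $x, y > 0$ should give the same value.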
Solution to Exercise 4.12

1. Since $\Theta$ is uniformly distributed on $[0,2\pi[$, we have $2\Theta \pmod{2\pi} \stackrel{\text{(law)}}{=} \Theta$, hence $\tan(2\Theta) \stackrel{\text{(law)}}{=} \tan\Theta$.

2. (i) Let $\Theta$ be the argument of the complex normal variable $Z = N+iN'$. As we already noticed in Exercise 3.5, $\Theta$ is uniformly distributed on $[0,2\pi[$. So, the identity $\tan\Theta = \frac{N'}{N}$ gives the result.
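(ii) Let $f$ be a bounded Borel function. In the equality
\[ E[f(\tan\Theta)] = \frac{1}{2\pi}\int_0^{2\pi} f(\tan x)\,dx = \frac{1}{\pi}\int_0^{\pi} f(\tan x)\,dx , \]
taking $y = \tan x$, we obtain $E[f(\tan\Theta)] = \int_{-\infty}^{+\infty} f(y)\,\frac{dy}{\pi(1+y^2)}$.

3. Thanks to the well known formula $\tan 2a = \frac{2\tan a}{1-\tan^2a}$, and applying question 1, we have
\[ \tan(2\Theta) = \frac{2\tan\Theta}{1-\tan^2\Theta} \stackrel{\text{(law)}}{=} \tan\Theta , \]
so that if $C$ is a Cauchy r.v., then $C \stackrel{\text{(law)}}{=} \frac{2C}{1-C^2}$. Since $C \stackrel{\text{(law)}}{=} \frac1C$, which may be deduced from question 2 (i), we obtain:
\[ C \stackrel{\text{(law)}}{=} \frac{1-C^2}{2C} = \frac12\left(\frac1C - C\right) . \]
To obtain the second equality, first recall from Exercises 3.7 and 3.8 that, $N$ and $N'$ being as in question 2, $N+N'$ and $N-N'$ are independent, hence $N+N' \stackrel{\text{(law)}}{=} N-N' \stackrel{\text{(law)}}{=} \sqrt2\,N$. So, from question 2 (i), we deduce:
\[ C \stackrel{\text{(law)}}{=} \frac{N+N'}{N-N'} = \frac{1+\frac{N'}{N}}{1-\frac{N'}{N}} \stackrel{\text{(law)}}{=} \frac{1+C}{1-C} . \]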
Solution to Exercise 4.13

a) Recall that the law of a variable $Z$ taking values in $[0,1[$ is characterized by its discrete Fourier transform:
\[ \varphi_Z(p) = E[\exp(2i\pi pZ)] , \qquad p \in \mathbb{Z} , \]
and that $Z$ is uniformly distributed on $[0,1[$ if and only if $\varphi_Z(p) = \mathbf{1}_{\{p=0\}}$. As a consequence of this, it follows immediately that $\{mU+nV\}$ is uniformly distributed.

b) A necessary and sufficient condition for the variables $\{mU+nV\}$ and $\{m'U+n'V\}$ to be independent is that
\[ E\bigl[\exp[2i\pi(p\{mU+nV\}+q\{m'U+n'V\})]\bigr] = E[\exp(2i\pi p\{mU+nV\})]\,E[\exp(2i\pi q\{m'U+n'V\})] , \]
for any $p, q \in \mathbb{Z}$. Observe that since
\[ \exp[2i\pi(p\{mU+nV\}+q\{m'U+n'V\})] = \exp[2i\pi(p(mU+nV)+q(m'U+n'V))] , \]
the characteristic function of interest to us is:
\[ E\bigl[\exp[2i\pi(p\{mU+nV\}+q\{m'U+n'V\})]\bigr] = \int_{[0,1]^2} \exp[2i\pi(p(mx+ny)+q(m'x+n'y))]\,dx\,dy . \]
The above expression equals 0 if $pm+qm' \ne 0$ or $pn+qn' \ne 0$, and it equals 1 if $pm+qm' = 0$ and $pn+qn' = 0$. But there exist some integers $(p,q) \in \mathbb{Z}^2\setminus\{(0,0)\}$ such that $pm+qm' = 0$ and $pn+qn' = 0$ if and only if $nm' = n'm$ (in which case take e.g. $q = nm$, $p = -nm' = -n'm$). We conclude that a necessary and sufficient condition for the variables $\{mU+nV\}$ and $\{m'U+n'V\}$ to be independent is that $nm' \ne n'm$.
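The criterion can be explored on a finite grid of frequencies. In the sketch below the function names and the grid bound are hypothetical, and the pairs $(m,n)$, $(m',n')$ are assumed to be non-zero integer pairs, so that each fractional part is uniform by part a). The joint Fourier coefficient is 1 exactly when both integer frequencies $pm+qm'$ and $pn+qn'$ vanish, and independence means that it equals the product of the marginal coefficients for all $(p,q)$.

```python
from itertools import product


def joint_coefficient(p, q, m, n, m2, n2):
    """E[exp(2 i pi (p{mU+nV} + q{m'U+n'V}))] for independent uniform U, V."""
    return 1 if (p * m + q * m2 == 0 and p * n + q * n2 == 0) else 0


def marginal_coefficient(p, m, n):
    """E[exp(2 i pi p {mU+nV})] for independent uniform U, V."""
    return 1 if (p * m == 0 and p * n == 0) else 0


def independent_on_grid(m, n, m2, n2, bound=10):
    """Check the factorisation for all |p|, |q| <= bound (a finite check only)."""
    for p, q in product(range(-bound, bound + 1), repeat=2):
        joint = joint_coefficient(p, q, m, n, m2, n2)
        factored = marginal_coefficient(p, m, n) * marginal_coefficient(q, m2, n2)
        if joint != factored:
            return False
    return True
```

For instance, `independent_on_grid(1, 1, 1, -1)` holds, the case used in Exercise 4.14, while `independent_on_grid(1, 2, 2, 4)` fails because $nm' = n'm = 4$. The grid is only a finite check: for large coefficients the witness $(p,q)$ may fall outside the default bound.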
Solution to Exercise 4.14
1. If $f$ is any bounded Borel function, then $E[f(\cos\Theta)] = \frac1\pi\int_0^\pi f(\cos\theta)\,d\theta = \frac1\pi\int_{-1}^{1} f(x)\,\frac{dx}{\sqrt{1-x^2}}$, where we put $x = \cos\theta$ in the last equality. So, $X$ has density $\frac{1}{\pi\sqrt{1-x^2}}$ on the interval $[-1,1]$.

2. Applying Exercise 4.13 for $m = 1$, $n = 1$, $m' = 1$, $n' = -1$ and $\Theta = \pi U$, $\Theta' = \pi V$ shows that $(\cos(\Theta+\Theta'), \cos(\Theta-\Theta')) \stackrel{\text{(law)}}{=} (\cos\Theta, \cos\Theta')$.

3. This identity in law follows immediately from question 2 and the formula $\cos(\Theta+\Theta') + \cos(\Theta-\Theta') = 2\cos\Theta\cos\Theta'$.
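Solution to Exercise 4.15

1. See question 7 of Exercise 4.2.

2. It suffices to check that the inverse Fourier transform of the function $\exp(-|\lambda|)$, $\lambda \in \mathbb{R}$, corresponds to the density of the standard Cauchy variable. Indeed, we have:
\[ \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\lambda x}e^{-|\lambda|}\,d\lambda = \frac{1}{2\pi}\left(\frac{1}{1-ix} + \frac{1}{1+ix}\right) = \frac{1}{\pi(1+x^2)} , \]
for all $x \in \mathbb{R}$.

3. We proved in question 2 of Exercise 4.12 that if $N$ and $N'$ are independent, centred Gaussian r.v.s with variance 1, then $C \stackrel{\text{(law)}}{=} \frac{N}{N'}$. So, conditioning on $N'$ and applying the result of question 1, we can write:
\[ E[\exp(i\lambda C)] = E\!\left[\exp\!\left(i\lambda\frac{N}{N'}\right)\right] = E\!\left[\exp\!\left(-\frac{\lambda^2}{2}\,\frac{1}{N'^2}\right)\right] = E\!\left[\exp\!\left(-\frac{\lambda^2}{2}\,T\right)\right] . \]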
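4. A direct consequence of the result of question 7 (b), Exercise 4.2, is that $\frac{1}{n^2}\sum_{i=1}^n T_i$ has the same law as $T$, which disagrees with the law of large numbers. Indeed, if the $T_i$'s had finite expectation, then the law of large numbers would imply that $\frac1n\sum_{i=1}^n T_i$ converges almost surely to this expectation and $\frac{1}{n^2}\sum_{i=1}^n T_i$ would converge almost surely to 0.

5. For every bounded Borel function $f$, defined on $\mathbb{R}^n$, we have:
\[ E\!\left[f\!\left(\frac{N_1}{N_0}, \ldots, \frac{N_n}{N_0}\right)\right] = \frac{1}{(2\pi)^{\frac{n+1}{2}}}\int_{\mathbb{R}^{n+1}} f\!\left(\frac{x_1}{x_0}, \ldots, \frac{x_n}{x_0}\right)\exp\!\left(-\frac12\sum_{j=0}^n x_j^2\right)dx_0\,dx_1\ldots dx_n \]
\[ = \frac{1}{(2\pi)^{\frac{n+1}{2}}}\int_{\mathbb{R}^n} f(y_1,\ldots,y_n)\left[\int_{\mathbb{R}} dx_0\,|x_0|^n\exp\!\left(-\frac{x_0^2}{2}\Bigl(1+\sum_{j=1}^n y_j^2\Bigr)\right)\right]dy_1\ldots dy_n \]
\[ = \int_{\mathbb{R}^n} f(y_1,\ldots,y_n)\,\frac{1}{(2\pi)^{\frac n2}}\Bigl(1+\sum_{j=1}^n y_j^2\Bigr)^{-\frac{n+1}{2}}\left[\int_{\mathbb{R}} dx_0\,\frac{|x_0|^n}{\sqrt{2\pi}}\exp\!\left(-\frac{x_0^2}{2}\right)\right]dy_1\ldots dy_n \]
\[ = \int_{\mathbb{R}^n} f(y_1,\ldots,y_n)\,\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\pi^{\frac{n+1}{2}}}\Bigl(1+\sum_{j=1}^n y_j^2\Bigr)^{-\frac{n+1}{2}}dy_1\ldots dy_n . \]
Hence, the density of the law of $\left(\frac{N_1}{N_0}, \ldots, \frac{N_n}{N_0}\right)$ is given by:
\[ \Gamma\!\left(\frac{n+1}{2}\right)\left[\pi\Bigl(1+\sum_{j=1}^n y_j^2\Bigr)\right]^{-\frac{n+1}{2}} , \qquad (y_1,\ldots,y_n) \in \mathbb{R}^n , \]
where $\Gamma\!\left(\frac{n+1}{2}\right) = \sqrt\pi\,2^{-\frac n2}\cdot 1\cdot 3\cdots(n-1)$ if $n$ is even and $\Gamma\!\left(\frac{n+1}{2}\right) = \left(\frac{n-1}{2}\right)!$ if $n$ is odd. To obtain the characteristic function of this random vector, set $T \stackrel{\text{(def)}}{=} \frac{1}{N_0^2}$ and observe that
\[ E\!\left[\exp\!\left(i\sum_{j=1}^n \lambda_j\frac{N_j}{N_0}\right)\Bigm| N_0\right] = \exp\!\left(-\frac{1}{2N_0^2}\sum_{j=1}^n \lambda_j^2\right) = \exp\!\left(-\frac12\sum_{j=1}^n \lambda_j^2\,T\right) , \]
so that from questions 1 and 3
\[ E\!\left[\exp\!\left(i\sum_{j=1}^n \lambda_j\frac{N_j}{N_0}\right)\right] = E\!\left[\exp\!\left(-\frac12\sum_{j=1}^n \lambda_j^2\,T\right)\right] = \exp\!\left(-\Bigl(\sum_{j=1}^n \lambda_j^2\Bigr)^{\frac12}\right) , \qquad (4.15.a) \]
for every $(\lambda_1,\ldots,\lambda_n) \in \mathbb{R}^n$.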
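6. Put $X = \left(\frac{N_1}{N_0}, \ldots, \frac{N_n}{N_0}\right)$. We obtain
\[ E[\exp(it\langle\theta, X\rangle)] = \exp(-|t|) , \]
for all $t \in \mathbb{R}$ and all $\theta \in \mathbb{R}^n$ with $|\theta| = 1$, by taking $\lambda = t\theta$, where $\lambda = (\lambda_1,\ldots,\lambda_n) \in \mathbb{R}^n$, in the equation (4.15.a). Since the law $\nu_X$ of $X$ is characterized by (4.15.a), this proves that $\nu_X$ is the unique distribution such that for any $\theta \in \mathbb{R}^n$ with $|\theta| = 1$, $\langle\theta, X\rangle$ is a standard Cauchy r.v.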
Solution to Exercise 4.16

0. Preliminaries: (i) and (ii) follow directly from the definition of $X$ and the independence between $A$ and $N$.

1. Since each of the variables $A$, $N$ and $X$ is almost surely different from 0, we can write $\frac1X \stackrel{\text{(law)}}{=} \frac{1}{\sqrt A}\,\frac1N$, which implies $\frac{1}{|X|^\alpha} \stackrel{\text{(law)}}{=} \frac{1}{A^{\alpha/2}}\,\frac{1}{|N|^\alpha}$, for every $\alpha \ge 0$. Moreover, each of the variables $\frac{1}{|X|^\alpha}$, $\frac{1}{A^{\alpha/2}}$ and $\frac{1}{|N|^\alpha}$ being positive, their expectation is defined, so formula (4.16.1) follows from the independence between $A$ and $N$. From equation (4.16.1), we deduce that $E\!\left[\frac{1}{|X|^\alpha}\right]$ is finite whenever both $E\!\left[\frac{1}{|N|^\alpha}\right]$ and $E\!\left[\frac{1}{A^{\alpha/2}}\right]$ are finite, that is when $\alpha < 1$ and $A^{-\frac\alpha2}$ is integrable.
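2. Applying the fact that for every $r > 0$, $\alpha > 0$, $r^{-\frac\alpha2} = \frac{1}{\Gamma(\frac\alpha2)}\int_0^\infty dt\,t^{\frac\alpha2-1}e^{-tr}$ and Fubini's theorem, we obtain:
\[ E\!\left[\frac{1}{A^{\alpha/2}}\right] = \frac{1}{\Gamma\!\left(\frac\alpha2\right)}\int_0^\infty dt\,t^{\frac\alpha2-1}E[e^{-tA}] . \]
Now, we deduce from the expression of the characteristic function of $N$, and the independence between this variable and $A$, that $E[e^{-tA}] = E[e^{i\sqrt{2tA}\,N}] = E[e^{i\sqrt{2t}\,X}]$. Finally, one has
\[ E\!\left[\frac{1}{A^{\alpha/2}}\right] = \frac{1}{\Gamma\!\left(\frac\alpha2\right)}\int_0^\infty dt\,t^{\frac\alpha2-1}E[e^{i\sqrt{2t}\,X}] = \frac{1}{\Gamma\!\left(\frac\alpha2\right)2^{\frac\alpha2-1}}\int_0^\infty dx\,x^{\alpha-1}E[e^{ixX}] = \frac{1}{\Gamma\!\left(\frac\alpha2\right)2^{\frac\alpha2-1}}\int_0^\infty dx\,x^{\alpha-1}E[\cos(xX)] , \]
where the last equality is due to the fact that the imaginary part of the previous integral vanishes.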
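3.1. (iii) An obvious calculation gives:
\[ E\!\left[\frac{1}{|N|^\alpha}\right] = \sqrt{\frac2\pi}\int_0^\infty dx\,\frac{1}{x^\alpha}e^{-\frac{x^2}{2}} = \frac{1}{2^{\frac\alpha2}\sqrt\pi}\int_0^\infty dy\,\frac{1}{y^{\frac{\alpha+1}{2}}}e^{-y} = \frac{1}{2^{\frac\alpha2}\sqrt\pi}\,\Gamma\!\left(\frac12-\frac\alpha2\right) . \qquad (4.16.a) \]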
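3.2. First note that from the hypothesis, $|X|$ is almost surely strictly positive and $\int_0^t dx\,x^{\alpha-1}\cos(xX) = \frac{1}{|X|^\alpha}\int_0^{t|X|} dy\,y^{\alpha-1}\cos(y)$. Taking the expectation of each member, and applying Fubini's Theorem, we obtain:
\[ \int_0^t dx\,x^{\alpha-1}E[\cos(xX)] = E\!\left[\frac{1}{|X|^\alpha}\int_0^{t|X|} dy\,y^{\alpha-1}\cos(y)\right] . \]
Moreover, from 3.1 (ii), $\lim_{t\to\infty}\frac{1}{|X|^\alpha}\int_0^{t|X|} dy\,y^{\alpha-1}\cos(y) = \frac{a_\alpha}{|X|^\alpha}$, a.s. Since $E\!\left[\frac{1}{|X|^\alpha}\right] < \infty$ and $\int_0^s dy\,y^{\alpha-1}\cos(y)$ is bounded in $s$, we can apply Lebesgue's theorem of dominated convergence to the variable $\frac{1}{|X|^\alpha}\int_0^{t|X|} dy\,y^{\alpha-1}\cos(y)$. Therefore, from (4.16.2) we have:
\[ E\!\left[\frac{1}{A^{\alpha/2}}\right] = \lim_{t\to\infty}\frac{1}{\Gamma\!\left(\frac\alpha2\right)2^{\frac\alpha2-1}}\int_0^t dx\,x^{\alpha-1}E[\cos(xX)] = \lim_{t\to\infty}\frac{1}{\Gamma\!\left(\frac\alpha2\right)2^{\frac\alpha2-1}}\,E\!\left[\frac{1}{|X|^\alpha}\int_0^{t|X|} dy\,y^{\alpha-1}\cos(y)\right] = \frac{a_\alpha}{\Gamma\!\left(\frac\alpha2\right)2^{\frac\alpha2-1}}\,E\!\left[\frac{1}{|X|^\alpha}\right] . \]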
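3.3. We easily deduce from the previous questions that
\[ a_\alpha = \Gamma\!\left(\frac\alpha2\right)2^{\frac\alpha2-1}\left(E\!\left[\frac{1}{|N|^\alpha}\right]\right)^{-1} = 2^{\alpha-1}\sqrt\pi\,\frac{\Gamma\!\left(\frac\alpha2\right)}{\Gamma\!\left(\frac12-\frac\alpha2\right)} . \]
Preliminary 3.1 (i) implies
\[ \Gamma\!\left(\frac\alpha2\right) = \frac{\sqrt{2\pi}\,2^{\frac12-\alpha}\,\Gamma(\alpha)}{\Gamma\!\left(\frac\alpha2+\frac12\right)} \quad\text{and}\quad \Gamma\!\left(\frac\alpha2+\frac12\right)\Gamma\!\left(\frac12-\frac\alpha2\right) = \frac{\pi}{\sin\!\left(\pi\left(\frac\alpha2+\frac12\right)\right)} = \frac{\pi}{\cos\!\left(\frac{\pi\alpha}{2}\right)} , \]
which yields $a_\alpha = \Gamma(\alpha)\cos\!\left(\frac{\pi\alpha}{2}\right)$.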
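The two expressions of $a_\alpha$ in 3.3 can be compared numerically for $0 < \alpha < 1$, the range in which $\Gamma(\frac12-\frac\alpha2)$ has a positive argument. This is a sketch with hypothetical function names; it only checks the gamma-function identity, not the convergence of the oscillatory integral.

```python
import math


def a_alpha_ratio_form(alpha):
    """a_alpha = 2^(alpha - 1) sqrt(pi) Gamma(alpha/2) / Gamma(1/2 - alpha/2)."""
    return (2 ** (alpha - 1) * math.sqrt(math.pi)
            * math.gamma(alpha / 2) / math.gamma(0.5 - alpha / 2))


def a_alpha_closed_form(alpha):
    """a_alpha = Gamma(alpha) cos(pi alpha / 2)."""
    return math.gamma(alpha) * math.cos(math.pi * alpha / 2)
```

At $\alpha = \frac12$ both functions return $\sqrt{\pi/2}$, since the two gamma factors in the ratio form cancel.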
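3.4. Recall the hint for question 2: $\frac{1}{r^\alpha} = \frac{1}{\Gamma(\alpha)}\int_0^\infty x^{\alpha-1}e^{-rx}\,dx$ for all $r > 0$ and $\alpha > 0$. This formula may be extended to any complex number in the domain $H = \{z : \mathrm{Re}(z) > 0\}$, by analytical continuation, that is
\[ \frac{1}{z^\alpha} = \frac{1}{\Gamma(\alpha)}\int_0^\infty x^{\alpha-1}e^{-zx}\,dx , \qquad z \in H . \]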
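Set $z = \lambda+i\mu$; then identifying the real parts of each side of the above equality, we have:
\[ \frac{\cos\!\left(\frac{\pi\alpha}{2}\right)}{\mu^\alpha} = \lim_{\lambda\to0}\frac{1}{\Gamma(\alpha)}\int_0^\infty x^{\alpha-1}e^{-\lambda x}\cos(\mu x)\,dx , \]
hence by putting $y = \mu x$:
\[ \cos\!\left(\frac{\pi\alpha}{2}\right) = \lim_{\lambda\to0}\frac{1}{\Gamma(\alpha)}\int_0^\infty y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy . \]
It remains for us to check that $\lim_{\lambda\to0}\frac{1}{\Gamma(\alpha)}\int_0^\infty y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy = \frac{1}{\Gamma(\alpha)}\int_0^\infty y^{\alpha-1}\cos(y)\,dy$. To this end, write
\[ \int_{\frac\pi2}^\infty y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy = \sum_{k\ge0}\int_{(2k+1)\frac\pi2}^{(2k+3)\frac\pi2} y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy \]
\[ = \sum_{k\ge0}\left(\int_{(4k+1)\frac\pi2}^{(4k+3)\frac\pi2} y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy + \int_{(4k+3)\frac\pi2}^{(4k+5)\frac\pi2} y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy\right) \]
\[ = \sum_{k\ge0}\int_{(4k+1)\frac\pi2}^{(4k+3)\frac\pi2}\bigl(y^{\alpha-1} - (y+\pi)^{\alpha-1}e^{-\lambda\pi}\bigr)e^{-\lambda y}\cos(y)\,dy . \]
Observe that when $0 < \alpha < 1$, each term of the above series is negative and we can state:
\[ \sum_{k\ge0}\int_{(4k+1)\frac\pi2}^{(4k+3)\frac\pi2}\bigl(y^{\alpha-1} - (y+\pi)^{\alpha-1}e^{-\lambda\pi}\bigr)\cos(y)\,dy \le \int_{\frac\pi2}^\infty y^{\alpha-1}e^{-\lambda y}\cos(y)\,dy \le \sum_{k\ge0}\int_{(4k+1)\frac\pi2}^{(4k+3)\frac\pi2}\bigl(y^{\alpha-1} - (y+\pi)^{\alpha-1}\bigr)e^{-\lambda y}\cos(y)\,dy . \]
By monotone convergence, both series of the above inequalities converge towards
\[ \sum_{k\ge0}\int_{(4k+1)\frac\pi2}^{(4k+3)\frac\pi2}\bigl(y^{\alpha-1} - (y+\pi)^{\alpha-1}\bigr)\cos(y)\,dy = \int_{\frac\pi2}^\infty y^{\alpha-1}\cos(y)\,dy , \]
as $\lambda \to 0$, and the result follows.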
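4. The equivalent form of equation (4.16.1) is
\[ E\!\left[\frac{1}{|Y|^\alpha}\right] = E\!\left[\frac{1}{B^\alpha}\right]E\!\left[\frac{1}{|C|^\alpha}\right] , \qquad (4.16.b) \]
for every $\alpha \ge 0$. Now, recall the expression for the characteristic function of $C$: $E[e^{itC}] = \exp(-|t|)$, $t \in \mathbb{R}$. So, by the same arguments as in question 2, we obtain that for every $\alpha > 0$,
\[ E\!\left[\frac{1}{B^\alpha}\right] = \frac{1}{\Gamma(\alpha)}\int_0^\infty dt\,t^{\alpha-1}E[e^{-tB}] = \frac{1}{\Gamma(\alpha)}\int_0^\infty dt\,t^{\alpha-1}E[e^{itBC}] = \frac{1}{\Gamma(\alpha)}\int_0^\infty dt\,t^{\alpha-1}E[e^{itY}] = \frac{1}{\Gamma(\alpha)}\int_0^\infty dt\,t^{\alpha-1}E[\cos(tY)] . \]
Now, suppose that $E\!\left[\frac{1}{|Y|^\alpha}\right] < +\infty$; then as in question 3.2, we have: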
0:
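\[ E[Z^{\frac\lambda\mu}] = E[Z_\mu^\lambda]\,\Gamma(\mu+1)\,E\!\left[\frac{1}{(T_\mu)^{\lambda+\mu}}\right] , \]
so that:
\[ E\!\left[\frac{1}{(T_\mu)^{\mu(\theta+1)}}\right] = \frac{E[Z^\theta]}{E[Z_\mu^{\theta\mu}]\,\Gamma(\mu+1)} = \frac{\Gamma(1+\theta)}{\mu\,\Gamma(\mu(1+\theta))} . \]
Hence $E\!\left[\frac{1}{(T_\mu)^{\mu\sigma}}\right] = \frac{\Gamma(1+\sigma)}{\Gamma(1+\mu\sigma)}$. Then, we obtain the result by analytical continuation.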
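Solution to Exercise 4.18

1. We first note the following general fact. For every Borel function $f : (0,\infty) \to \mathbb{R}_+$, one has:
\[ E[f(Z_\mu X)] = \int_0^\infty \frac{1}{\Gamma(\mu)}\,s^{\mu-1}E[f(sX)]\,e^{-s}\,ds = E\!\left[\int_0^\infty \frac{1}{\Gamma(\mu)}\,\frac{t^{\mu-1}}{X^\mu}\,f(t)\,e^{-\frac tX}\,dt\right] = \int_0^\infty \frac{1}{\Gamma(\mu)}\,t^{\mu-1}f(t)\,E\!\left[\frac{e^{-\frac tX}}{X^\mu}\right]dt . \qquad (4.18.a) \]
Now, assume (i) holds and observe that an equivalent form of equation (4.18.1) is $E[c\,g(Y)] = E\!\left[g\!\left(\frac1X\right)\frac{1}{X^\mu}\right]$, for every Borel function $g : (0,\infty) \to \mathbb{R}_+$, so that the above expression is
\[ \int_0^\infty \frac{1}{\Gamma(\mu)}\,t^{\mu-1}f(t)\,E[c\,e^{-tY}]\,dt , \]
and (ii) follows. Conversely, if (ii) holds then on the one hand, we have
\[ E[f(Z_\mu X)] = \int_0^\infty \frac{1}{\Gamma(\mu)}\,t^{\mu-1}f(t)\,E[c\,e^{-tY}]\,dt \]
and, on the other hand, (4.18.a) allows us to identify $E[c\,e^{-tY}]$ as $E\!\left[\frac{1}{X^\mu}e^{-\frac tX}\right]$ for every $t \ge 0$, and (i) follows.

(i)$\Leftrightarrow$(iii): Suppose (i) holds; then for every Borel function $f : (0,\infty) \to \mathbb{R}_+$:
\[ E[f(Z^{\frac1\mu}X)] = \int_0^\infty E[f(t^{\frac1\mu}X)]\,e^{-t}\,dt = \int_0^\infty f(u)\,\mu u^{\mu-1}E\!\left[\frac{e^{-\left(\frac uX\right)^\mu}}{X^\mu}\right]du = \int_0^\infty f(u)\,c\mu u^{\mu-1}E\bigl[e^{-(uY)^\mu}\bigr]\,du , \]
and (iii) follows. If (iii) holds, then an obvious change of variables yields for every $\lambda > 0$:
\[ E[f(\lambda^{-\frac1\mu}Z^{\frac1\mu}X)] = \int_0^\infty c\mu u^{\mu-1}f(\lambda^{-\frac1\mu}u)\,E[e^{-(uY)^\mu}]\,du = \int_0^\infty E\!\left[\frac{c}{Y^\mu}\,f\!\left(\frac{t^{\frac1\mu}}{Y}\right)\right]\lambda e^{-\lambda t}\,dt , \]
on one hand, and $E[f(\lambda^{-\frac1\mu}Z^{\frac1\mu}X)] = \int_0^\infty E[f(t^{\frac1\mu}X)]\,\lambda e^{-\lambda t}\,dt$ on the other hand. Equating these expressions, and using the injectivity of the Laplace transform, we obtain (i).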
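2. Suppose $X$, $Z$, $Z_\mu$ and $T_\mu$ are independent. Equation (4.17.3) allows us to write:
\[ E[f(Z^{\frac1\mu}X)] = E\!\left[f\!\left(\frac{Z_\mu}{T_\mu}X\right)\frac{\Gamma(\mu+1)}{T_\mu^\mu}\right] . \]
So the equivalence between (ii) and (iii) is a consequence of the following equalities,
\[ E\!\left[f\!\left(\frac{Z_\mu}{T_\mu}X\right)\frac{\Gamma(\mu+1)}{T_\mu^\mu}\right] = E\!\left[\int_0^\infty \frac{c}{\Gamma(\mu)}\,t^{\mu-1}e^{-tY}f\!\left(\frac{t}{T_\mu}\right)\frac{\Gamma(\mu+1)}{T_\mu^\mu}\,dt\right] \]
\[ = \int_0^\infty du\,c\mu u^{\mu-1}f(u)\,E[\exp(-T_\mu uY)] = \int_0^\infty du\,c\mu u^{\mu-1}f(u)\,E[\exp(-(uY)^\mu)] = E[f(Z^{\frac1\mu}X)] , \]
where we set $u = \frac{t}{T_\mu}$ in the second written expression.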
Solution to Exercise 4.19

1. Let $Z$, $T_\nu$, $T_\mu$ be as in the statement of question 1. For all $t \ge 0$, we have:
\[ P(Z^{\frac1\mu} > tT_\nu) = E\!\left[\int_{t^\mu T_\nu^\mu}^\infty e^{-s}\,ds\right] = E[\exp(-t^\mu T_\nu^\mu)] = E[\exp(-tT_\nu T_\mu)] = P(Z > tT_\nu T_\mu) . \]
The third equality follows from the expression of the Laplace transform of $T_\mu$, given in Exercise 4.17. This proves the identity in law
\[ \frac{Z^{\frac1\mu}}{T_\nu} \stackrel{\text{(law)}}{=} \frac{Z}{T_\nu T_\mu} , \]
and the other identity follows by exchanging the role of $\mu$ and $\nu$. We deduce (4.19.2) from the above identity in law and the fact that $\frac{1}{T_\nu}$ is simplifiable (see Exercise 1.12).

2. Choosing $a = \frac1n$ in (4.5.4) and $\mu = \frac1n$ in (4.19.2), we obtain:
\[ Z^n \stackrel{\text{(law)}}{=} n^n Z_{\frac1n}\cdots Z_{\frac{n-1}{n}}\,Z \stackrel{\text{(law)}}{=} \frac{Z}{T_{\frac1n}} . \]
Since $Z$ is simplifiable, we can write:
\[ n^n Z_{\frac1n}\cdots Z_{\frac{n-1}{n}} \stackrel{\text{(law)}}{=} \frac{1}{T_{\frac1n}} , \]
which leads directly to the required formula.
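3. When we plug the representations:
\[ Z^n \stackrel{\text{(law)}}{=} n^n Z_{\frac1n}\cdots Z_{\frac{n-1}{n}}\,Z , \qquad Z^m \stackrel{\text{(law)}}{=} m^m Z_{\frac1m}\cdots Z_{\frac{m-1}{m}}\,Z , \]
into the formula
\[ Z^n \stackrel{\text{(law)}}{=} \frac{Z^m}{(T_{\frac mn})^m} , \]
we obtain
\[ n^n Z_{\frac1n}\cdots Z_{\frac{n-1}{n}} \stackrel{\text{(law)}}{=} m^m Z_{\frac1m}\cdots Z_{\frac{m-1}{m}}\,\frac{1}{(T_{\frac mn})^m} . \qquad (4.19.a) \]
(We simplified by $Z$ on both sides.) Then we may use the beta-gamma algebra (formula (4.2.1)) to write:
\[ Z_{\frac kn} \stackrel{\text{(law)}}{=} Z_{\frac kn,\,\frac km-\frac kn}\,Z_{\frac km} , \qquad k \le n , \]
where, on the right hand side, $Z_{\frac kn,\,\frac km-\frac kn}$ is independent of $Z_{\frac km}$. Then (4.19.a) simplifies into:
\[ \frac{m^m}{n^n}\,\frac{1}{(T_{\frac mn})^m} \stackrel{\text{(law)}}{=} \prod_{k=1}^{m-1} Z_{\frac kn,\,k\left(\frac1m-\frac1n\right)}\;\prod_{k=m}^{n-1} Z_{\frac kn} , \]
which is the desired formula.

4. Raising formula (4.8.2) to the power $2^p$ gives:
\[ Z^{2^p} \stackrel{\text{(law)}}{=} 2^{2^p-1}\,N_1^{2^p}N_2^{2^{p-1}}\cdots N_p^2\,Z , \qquad (4.19.b) \]
where $N_1, \ldots, N_p$ are independent centred Gaussian variables with variance 1. Combining this identity with equation (4.19.2) for $\mu = \frac{1}{2^p}$ and simplifying by $Z$ gives formula (4.19.5).

5. (a) Equation (4.19.6) follows from either formula (4.19.3) for $n = 2$ or formula (4.19.5) for $p = 1$ and has already been noticed in question 1 of Exercise 4.15: precisely, one has $T_{\frac12} \stackrel{\text{(law)}}{=} \frac{1}{2N^2}$, which yields (4.19.6).

Equation (4.19.7) follows from formula (4.19.3) for $n = 3$:
\[ P\!\left(\frac{1}{T_{\frac13}} > x\right) = P\!\left(Z_{\frac23} > \frac{x}{27Z_{\frac13}}\right) = \frac{1}{\Gamma\!\left(\frac23\right)}\,E\!\left[\int_{x(27Z_{\frac13})^{-1}}^\infty dy\,\frac{e^{-y}}{y^{\frac13}}\right] . \]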
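Hence, we obtain:
\[ P\bigl((T_{\frac13})^{-1} \in dx\bigr) = \frac{dx}{\Gamma\!\left(\frac23\right)}\,E\!\left[\frac{1}{27Z_{\frac13}}\left(\frac{x}{27Z_{\frac13}}\right)^{-\frac13}\exp\!\left(-\frac{x}{27Z_{\frac13}}\right)\right] = \frac{dx}{9\,\Gamma\!\left(\frac13\right)\Gamma\!\left(\frac23\right)x^{\frac13}}\int_0^\infty \frac{dy}{y^{\frac43}}\exp\!\left(-\left(\frac{x}{27y}+y\right)\right) , \]
which yields (4.19.7), after some elementary transformations.

(b) We use the representation (4.19.5) for $p = 2$:
\[ T_{\frac14} \stackrel{\text{(law)}}{=} \frac{1}{8N_1^2N_2^4} . \]
First we write $N_1+iN_2 = R\exp(i\Theta)$, where $\Theta$ and $R$ are independent, $\Theta$ is uniform on $[0,2\pi[$, and from question 3 of Exercise 4.2, $R = (N_1^2+N_2^2)^{\frac12} \stackrel{\text{(law)}}{=} \sqrt{2Z}$. This gives
\[ T_{\frac14} \stackrel{\text{(law)}}{=} \frac{1}{8R^6(\cos\Theta)^2(\sin\Theta)^4} \stackrel{\text{(law)}}{=} \frac{1}{4^3Z^3A(1-A)^2} , \]
where $A = \cos^2\Theta$ is independent of $Z$ and has law $P(A \in da) = \frac1\pi\,\frac{da}{\sqrt{a(1-a)}}$, $a \in [0,1]$.

To obtain (4.19.8), it remains to write:
\[ P\!\left(\frac{1}{4(T_{\frac14})^{\frac13}} > \frac{1}{4x^{\frac13}}\right) = P(T_{\frac14} < x) = P\!\left(Z > \frac{1}{4x^{\frac13}A^{\frac13}(1-A)^{\frac23}}\right) = E\!\left[\exp\!\left(-\frac{1}{4x^{\frac13}A^{\frac13}(1-A)^{\frac23}}\right)\right] . \]

(c) From (4.19.4), we have
\[ T \stackrel{\text{(law)}}{=} Z_{\frac13,\frac16}\,Z_{\frac23} . \]
(We refer to Exercise 4.2 for the law of $Z_{\frac13,\frac16}$.) Then, we obtain:
\[ P(T > x) = P\bigl(Z_{\frac23} > xZ_{\frac13,\frac16}^{-1}\bigr) = \frac{1}{\Gamma\!\left(\frac23\right)}\,E\!\left[\int_{xZ_{\frac13,\frac16}^{-1}}^\infty dt\,t^{\frac23-1}e^{-t}\right] , \]
hence, denoting here $Z$ for $Z_{\frac13,\frac16}$, we obtain:
\[ P(T \in dx) = \frac{dx}{\Gamma\!\left(\frac23\right)}\,E\!\left[\frac1Z\left(\frac Zx\right)^{\frac13}\exp\!\left(-\frac xZ\right)\right] = \frac{C\,dx}{x^{\frac13}}\int_0^1 dz\,\frac{e^{-\frac xz}}{z^{\frac43}(1-z)^{\frac56}} , \]
with $C = \left(\Gamma\!\left(\frac23\right)B\!\left(\frac13,\frac16\right)\right)^{-1}$.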
Solution to Exercise 4.20

To prove identity (4.20.2) (i), it is enough to show that for every $\gamma > 0$,
\[ E\bigl[|\Delta_n(X_1,\ldots,X_n)|^{2\gamma}\bigr] = E\!\left[\left(\prod_{j=2}^n \frac{1}{T_{\frac1j}}\right)^{\!\gamma}\right] , \]
but from identity (4.19.2) (or (4.17.2)), one has
\[ E\!\left[\left(\frac{1}{T_{\frac1j}}\right)^{\!\gamma}\right] = \frac{E[Z^{j\gamma}]}{E[Z^\gamma]} = \frac{\Gamma(1+j\gamma)}{\Gamma(1+\gamma)} , \]
for every $j$, and the result follows thanks to (4.20.1). A direct application of (4.19.3) shows that for every $j = 2, 3, \ldots, n$:
\[ \frac{1}{T_{\frac1j}} \stackrel{\text{(law)}}{=} j^j\prod_{1\le i<j} Z_{\frac ij} . \]
Since the r.v.s $j^j\prod_{1\le i<j} Z_{\frac ij}$, $j = 2, 3, \ldots, n$, on the right hand side are independent, their product is equal in law to the product of the $\frac{1}{T_{\frac1j}}$, $j = 2, 3, \ldots, n$, and we have obtained (4.20.2) (ii).
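Solution to Exercise 4.21

1. Let $Z$ be an exponential variable with mean 1, such that $T_\mu$, $T'_\mu$ and $Z$ are independent. First, for every $s \ge 0$, we have $E\!\left[\frac{1}{1+sX}\right] = E[e^{-sXZ}]$. Moreover, the identity in law (4.19.2) implies that for every $s \ge 0$:
\[ E\bigl[e^{-sXZ}\bigr] = E\bigl[e^{-sT_\mu Z^{\frac1\mu}}\bigr] = E\bigl[e^{-s^\mu Z}\bigr] , \]
and the last term clearly equals $\frac{1}{1+s^\mu}$.

2. From the definition of $X$ we have: $E[X^s] = E[T_\mu^s]\,E[T_\mu^{-s}]$. The identity (4.17.4) implies $E[T_\mu^{-s}] = \frac{\Gamma\left(\frac s\mu\right)}{\mu\Gamma(s)}$, for $s > 0$, as well as $E[T_\mu^s] = \frac{\Gamma\left(1-\frac s\mu\right)}{\Gamma(1-s)}$, for $s < \mu$. So that with the formula of complements, as given in Exercise 4.16, we obtain
\[ E[X^s] = \frac1\mu\,\frac{\Gamma\!\left(\frac s\mu\right)\Gamma\!\left(1-\frac s\mu\right)}{\Gamma(s)\Gamma(1-s)} = \frac{\sin(\pi s)}{\mu\sin\!\left(\frac{\pi s}{\mu}\right)} , \qquad 0 < s < \mu . \]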
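The passage from the gamma-function form of $E[X^s]$ to the sine form relies only on the formula of complements, so it can be checked numerically. The sketch below uses hypothetical function names and assumes $0 < s < \mu < 1$, so that all gamma arguments are positive.

```python
import math


def moment_gamma_form(s, mu):
    """E[X^s] = E[T_mu^s] E[T_mu^(-s)] written with gamma functions."""
    negative_moment = math.gamma(s / mu) / (mu * math.gamma(s))  # E[T_mu^(-s)]
    positive_moment = math.gamma(1 - s / mu) / math.gamma(1 - s)  # E[T_mu^s]
    return positive_moment * negative_moment


def moment_sine_form(s, mu):
    """E[X^s] = sin(pi s) / (mu sin(pi s / mu))."""
    return math.sin(math.pi * s) / (mu * math.sin(math.pi * s / mu))
```

Both functions should agree for any admissible pair, for instance $(s, \mu) = (0.2, 0.5)$.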
3. Now observe that the expression (4.21.4) admits an analytical continuation in the domain $\{z \in \mathbb{C} : 0 < \mathrm{Re}(z) < \mu\}$, hence the characteristic function of $\log X^\mu$ is given by
\[ E[X^{i\mu t}] = E[\exp(it\log X^\mu)] = \frac{\sinh(\pi\mu t)}{\mu\sinh(\pi t)} , \qquad t \in \mathbb{R} . \]
Thanks to the integral computed above, we deduce that
\[ P(\log X^\mu \in dx) = \frac{\sin(\pi\mu)}{2\pi\mu}\,\frac{dx}{\cosh x+\cos(\pi\mu)} , \]
from which we deduce (4.21.3) by an obvious change of variables. Finally, again some elementary change of variables allows us to deduce the expression of $\frac{T_\mu}{T'_\mu}$ in terms of a Cauchy variable from (4.21.3).
Solution to Exercise 4.22

Part A. Let $X$ be an r.v. such that: $E[X^n] = \frac{\mu(n)}{\nu(n)}$. We deduce from our definition of $\mu$ and $\nu$ that $E[V^n] = E[(WX)^n]$, for every $n$; but since $V$ is moments determinate, we have $V \stackrel{\text{(law)}}{=} WX$. On the other hand, we know that $V \stackrel{\text{(law)}}{=} WR$, hence $WX \stackrel{\text{(law)}}{=} WR$. Since $W$ is simplifiable, it follows that $X \stackrel{\text{(law)}}{=} R$.

Part B. Examples.

(i) Write $V \stackrel{\text{(law)}}{=} ZZ'$, where $Z$ and $Z'$ are two independent exponential variables, and $W \stackrel{\text{(law)}}{=} Z_\alpha Z_\beta$, where $Z_\alpha$ and $Z_\beta$ are two independent gamma variables, with respective parameters $\alpha$ and $\beta$. The sequences of moments given in (i) are those of $V$ and $W$ respectively. Moreover, from the beta-gamma algebra (see Exercise 4.2), we have
\[ Z \stackrel{\text{(law)}}{=} Z_{1,\alpha-1}Z_\alpha \quad\text{and}\quad Z' \stackrel{\text{(law)}}{=} Z_{1,\beta-1}Z_\beta , \]
consequently, $R \stackrel{\text{(law)}}{=} Z_{1,\alpha-1}Z_{1,\beta-1}$, whose density is
\[ (\alpha-1)(\beta-1)\int_r^1 (1-t)^{\alpha-2}\,\frac{(t-r)^{\beta-2}}{t^{\beta-1}}\,dt , \qquad (0 < r < 1) . \]

(ii) Here, the moments $\mu(n)$ are those of $V \stackrel{\text{(law)}}{=} Z^2$; then the duplication formula for the gamma function translates probabilistically as: $Z^2 \stackrel{\text{(law)}}{=} 2ZN^2$ (see Exercise 4.5). Hence, since $W \stackrel{\text{(law)}}{=} Z$, we get $R \stackrel{\text{(law)}}{=} 2N^2 \stackrel{\text{(law)}}{=} 4Z_{1/2}$.

(iii) Now, the moments $\mu(n)$ are those of $V \stackrel{\text{(law)}}{=} Z^3$, and the moments $\nu(n)$ are those of $W \stackrel{\text{(law)}}{=} Z^2$. From the discussion in Exercise 4.19, and more particularly (4.19.2)
for $\mu = \frac23$, we find that $R = \frac{1}{(T_{2/3})^2}$ is a desired solution. Note that, here, following the comment (b) in Exercise 1.10, the law of $Z^3$ is not moments determinate!
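Example (ii) of Part B can be checked on integer moments, where everything is exact: $E[(Z^2)^n] = (2n)!$, $E[Z^n] = n!$ and $E[(2N^2)^n] = 2^n\,(2n-1)!!$. The following sketch (hypothetical function names) verifies that the moments of $V = Z^2$ factor as those of $W = Z$ times those of $R = 2N^2$.

```python
import math


def moment_V(n):
    """E[(Z^2)^n] = (2n)! for a standard exponential variable Z."""
    return math.factorial(2 * n)


def moment_W(n):
    """E[Z^n] = n! for a standard exponential variable Z."""
    return math.factorial(n)


def moment_R(n):
    """E[(2 N^2)^n] = 2^n (2n - 1)!! for a standard Gaussian variable N."""
    double_factorial = 1
    for k in range(1, 2 * n, 2):
        double_factorial *= k
    return 2 ** n * double_factorial
```

Since these are integer computations, `moment_V(n) == moment_W(n) * moment_R(n)` holds exactly, for example $720 = 6 \times 120$ at $n = 3$.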
Chapter 5 Convergence of random variables
“Overhead”
• A sequence of random variables $\{X_n, n \in \mathbb{N}\}$ may converge in a number of different ways, i.e.
\[ \{X_n \xrightarrow[n\to\infty]{\text{a.s.}} \cdot\} \;\Rightarrow\; \{X_n \xrightarrow[n\to\infty]{(P)} \cdot\} \;\Rightarrow\; \{\mathcal{L}(X_n) \xrightarrow[n\to\infty]{(w)} \cdot\} , \qquad \{X_n \xrightarrow[n\to\infty]{L^p} \cdot\} \;\Rightarrow\; \{X_n \xrightarrow[n\to\infty]{(P)} \cdot\} . \]
• Important additional assumptions are necessary to obtain implications in reverse order. Adequate references may be found in most of the books cited in our Bibliography.
• The most well-known examples of almost sure convergence occur with the Law of Large Numbers (LLN), whereas the Central Limit Theorem (CLT) involves convergence in distribution. See Jacod and Protter [29] for a “minimalist” discussion.
• To a large extent, the “universal character” of the LLN and the CLT makes them the “two pillars” of Probability Theory.
• None of our exercises (with the exception of Exercise 5.5) discusses an important complement to the LLN and the CLT, namely the LDT: the Large Deviations Theorem of Cramer, and its many extensions due to Donsker–Varadhan and Freidlin–Wentzell, among others. We recommend the nice succession of exercises in Letac ([36], Exercises 408, 409 and 410), which constitutes a self-contained proof of Cramer’s theorem.
* 5.1 Convergence of sums of squares of independent Gaussian variables

Let $(X_n, n \in \mathbb{N})$ be a sequence of independent Gaussian variables, with respective mean $\mu_n$ and variance $\sigma_n^2$.

1. Prove that $\sum_n X_n^2$ converges in $L^1$ if, and only if:
\[ \sum_n (\mu_n^2+\sigma_n^2) < \infty . \qquad (5.1.1) \]

2. Prove that, if the condition (5.1.1) is satisfied, then $\sum_n X_n^2$ converges in $L^p$, for every $p \in [1,\infty[$.

3. Assume that $\mu_n = 0$, for every $n$. Prove that: if $\sum_n \sigma_n^2 = \infty$, then $P(\sum_n X_n^2 = \infty) = 1$. (See complements after Exercise 5.15).

* 5.2 Convergence of moments and convergence in law

Let $(X_n, n \in \mathbb{N})$ be a sequence of r.v.s which take their values in $[0,1]$.

1. Prove that, if for every $k \in \mathbb{N}$, $E(X_n^k) \xrightarrow[n\to\infty]{} \frac{1}{k+1}$, then the sequence $(X_n)$ converges in law; identify the limit law.

2. Let $a > 0$. Solve the same question as question 1 when $\frac{1}{k+1}$ is replaced by $\frac{a}{k+a}$.

Comments and references: The result of this exercise is a very particular case (the $X_n$'s are uniformly bounded) of application of the method of moments to prove convergence in law; precisely, if $\mu_n$ is the law of $X_n$, and if there exists a probability measure $\mu$ such that:

(i) the law $\mu$ is uniquely determined by its moments $\int x^k\,d\mu(x)$, $k \in \mathbb{N}$ (this is the case if $\int \exp(\alpha|x|)\,d\mu(x) < \infty$ for some $\alpha > 0$),

(ii) for every $k \in \mathbb{N}$, $\int x^k\,d\mu_n(x)$ $(= E[X_n^k])$ converges to $\int x^k\,d\mu(x)$,

then the sequence $(X_n)$ converges in distribution to $\mu$ (see Feller [20], p. 269). This is also discussed in Billingsley [4], Theorem 30.2, in the 1979 edition.

* 5.3 Borel test functions and convergence in law

Let $(X_n)$ and $(Y_n)$ be two sequences of r.v.s such that:
(i) the law of $X_n$ does not depend on $n$;

(ii) $(X_n, Y_n) \xrightarrow[n\to\infty]{\text{(law)}} (X, Y)$.

1. Show that for every Borel function $\varphi : \mathbb{R} \to \mathbb{R}$, the pair $(\varphi(X_n), Y_n)$ converges in law towards $(\varphi(X), Y)$.

2. Give an example for which the condition (ii) is satisfied, but (i) is not, and such that there exists a Borel function $\varphi : \mathbb{R} \to \mathbb{R}$ for which $(\varphi(X_n), Y_n)$ does not converge in law towards $(\varphi(X), Y)$.

Comments and references: For some applications to asymptotic results for functionals of Brownian motion, see Revuz–Yor [51], Chapter XIII.

* 5.4 Convergence in law of the normalized maximum of Cauchy variables

Let $(X_1, X_2, \ldots, X_n, \ldots)$ be a sequence of independent Cauchy variables with parameter $a > 0$, i.e.
\[ P(X_i \in dx) = \frac{a\,dx}{\pi(a^2+x^2)} . \]
Show that $\frac1n\sup_{i\le n} X_i$ converges in law towards $\frac1T$, where $T$ is an exponential variable, the parameter of which shall be computed in terms of $a$.

Comments and references: During our final references searches, we found that this exercise is in Grimmett–Stirzaker [26], p. 356!

* 5.5 Large deviations for the maximum of Gaussian vectors

Let $X = (X_1, \ldots, X_n)$ be any centred Gaussian vector.

1. Prove that for every $r \ge 0$:
\[ P\Bigl(\max_{1\le i\le n} X_i \ge E\bigl[\max_{1\le i\le n} X_i\bigr] + \sigma r\Bigr) \le e^{-\frac{r^2}{2}} , \qquad (5.5.1) \]
where $\sigma = \max_{1\le i\le n} E[X_i^2]^{\frac12}$.

Hint: Use Exercise 3.10 for a suitable choice of a Lipschitz function $f$.
2. Deduce from above that: 1 1 log P ( max Xi ≥ r) = − 2 . 2 r→+∞ r 1≤i≤n 2σ lim
(5.5.2)
Comments and references: As in Exercise 3.10, the above results may be extended to continuous time processes. The identity (5.5.2) is a large deviation formulation of the inequality (5.5.1) and aims at investigating the supremum of Gaussian processes on a finite time interval, such as the supremum of the Brownian bridge, see Chapter 6. R. Azencott: Grandes d´eviations et applications. Eighth Saint Flour Probability Summer School–1978 (Saint Flour, 1978), 1–176, Lecture Notes in Mathematics, 774, Springer, Berlin, 1980. M. Ledoux: Isoperimetry and Gaussian analysis. Lectures on probability theory and statistics (Saint-Flour, 1994), 165–294, Lecture Notes in Mathematics, 1648, Springer, Berlin, 1996. *
5.6 A logarithmic normalization
Let $r > 0$, and consider $\pi_r(du) = r \sinh(u)\, e^{r(1-\cosh u)}\, du$, a probability on $\mathbb{R}_+$. Let $X_n$ be an r.v. with distribution $\pi_{1/n}$.
1. What is the law of $\frac{1}{n}\cosh X_n$?
2. Prove that: $\log(\cosh X_n) - \log n \xrightarrow[n\to\infty]{(law)} Y$. Compute the law of $Y$.
3. Deduce therefrom that:
$$\frac{\log(\cosh X_n)}{\log n} \xrightarrow[n\to\infty]{(P)} 1, \quad\text{and then that:}\quad \frac{X_n}{\log n} \xrightarrow[n\to\infty]{(P)} 1.$$
4. Deduce finally that $X_n - \log n \xrightarrow[n\to\infty]{(law)} Y + \log 2$. Give a direct proof of this result, using simply the fact that $X_n$ is distributed with $\pi_{1/n}$.
Comments and references: (a) The purpose of this exercise is to show that simple manipulations on a seemingly complicated sequence of laws may lead to limit results, without using the law of large numbers, or the central limit theorem. (b) The distribution $\pi_r$ occurs in connection with the distribution of the winding number of planar Brownian motion. See: M. Yor: Loi de l’indice du lacet Brownien, et distribution de Hartman-Watson. Zeit. für Wahr. 53 (1980), p. 71–95.
**
5.7 A $\sqrt{n \log n}$ normalization
1. Let $U$ be a uniform r.v. on $[0, 1]$, and $\varepsilon$ an independent Bernoulli r.v., i.e. $P(\varepsilon = +1) = P(\varepsilon = -1) = 1/2$. Compute the law of $X = \varepsilon/\sqrt{U}$.
2. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent r.v.s which are distributed as $X$. Prove that:
$$\frac{X_1 + X_2 + \cdots + X_n}{(n \log n)^{1/2}} \xrightarrow[n\to\infty]{(law)} N,$$
where $N$ is Gaussian, centred, with variance 1.
3. Let $\alpha > 0$, and let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent r.v.s such that:
$$P(X_i \in dy) = c\,\frac{dy}{|y|^{3+\alpha}}\,\mathbf{1}_{\{|y|\ge 1\}}$$
for a certain constant $c$. Give a necessary and sufficient condition on a deterministic sequence $(\varphi(n), n \in \mathbb{N})$ such that: $\varphi(n) \to \infty$, as $n \to \infty$, which implies:
$$\frac{X_1 + \cdots + X_n}{\varphi(n)} \xrightarrow[n\to\infty]{(law)} N,$$
where $N$ is Gaussian, centred. (See complements after Exercise 5.15.)
Comments and references: As a recent reference for limit theorems, we recommend the book by V.V. Petrov [46]. Of course, Feller [20] remains a classic.
*
5.8 The Central Limit Theorem involves convergence in law, not in probability
1. Consider, on a probability space $(\Omega, \mathcal{A}, P)$, a sequence $(Y_n, n \ge 1)$ of independent, equidistributed r.v.s, and $X$ a Gaussian variable which is independent of the sequence $(Y_n, n \ge 1)$. We assume moreover that $Y_1$ has a second moment, and that:
$$E[Y_1] = E[X] = 0; \qquad E[Y_1^2] = E[X^2] = 1.$$
Define: $U_n = \frac{1}{\sqrt{n}}(Y_1 + \cdots + Y_n)$. Show that $\lim_{n\to\infty} E[|U_n - X|]$ exists, and compute this limit. Hint: Use Exercise 1.3.
2. We keep the assumptions concerning the sequence $(Y_n, n \ge 1)$. Show that, for any fixed $p \in \mathbb{N}$, the vector $(Y_1, Y_2, \ldots, Y_p, U_n)$ converges in law as $n \to \infty$. Describe the limit law.
3. Does $(U_n, n \to \infty)$ converge in probability? Hint: Assume that the answer is positive, and use the results of questions 1 and 2.
4. Prove that $\bigcap_p \sigma\{Y_p, Y_{p+1}, \ldots\}$ is trivial, and give another argument to answer question 3.
Comments: See also Exercise 5.11 for some related discussion involving “nonconvergence in probability”.
5.9 Changes of probabilities and the Central Limit Theorem
**
Consider two probabilities $P$ and $Q$ defined on a measurable space $(\Omega, \mathcal{A})$ such that $Q = D \cdot P$, for $D \in L^1_+(\Omega, \mathcal{A}, P)$, i.e. $Q$ is absolutely continuous w.r.t. $P$ on $\mathcal{A}$. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of i.i.d. r.v.s under $P$, with second moment; we write $m = E_P(X_1)$, and $\sigma^2 = E_P((X_1 - m)^2)$. Prove that, under $Q$, one has:
$$\frac{1}{\sigma\sqrt{n}} \sum_{i=1}^n (X_i - m) \xrightarrow{(law)} N,$$
where $N$ denotes a centred Gaussian variable, with variance 1.
Comments and references: (a) This exercise shows that the fundamental limit theorems of probability theory, the Law of Large Numbers and the Central Limit Theorem, are valid not only in the classical framework of i.i.d. r.v.s (with adequate integrability conditions) but also in larger frameworks. A sketch of the proof of this result can be found in P. Billingsley [4], although our proof is somewhat more direct. Of course, there are many other extensions of these fundamental limit theorems, the simplest of which is (arguably) the case proposed in this exercise. (b) This exercise is also found, in fact in greater generality, in Revuz ([50], p. 170, Exercise (5.14)). (c) (Comment by J. Pitman). The property studied in this exercise gave rise to the notion of stable convergence, as developed by Rényi (1963) and Aldous–Eagleson (1978).
*
5.10 Convergence in law of stable(μ) variables, as μ → 0
Let $0 < \mu < 1$, and $T_\mu$ a stable($\mu$) variable whose Laplace transform is given by $\exp(-\lambda^\mu)$, $\lambda \ge 0$.
1. Prove that, as $\mu \to 0$, $(T_\mu)^\mu$ converges in law towards $\frac{1}{Z}$, where $Z$ is a standard exponential variable.
2. Let $T'_\mu$ be an independent copy of $T_\mu$. Prove that $\left(\frac{T_\mu}{T'_\mu}\right)^{\mu}$ converges in law towards $\frac{Z}{Z'}$, where $Z$ and $Z'$ are two independent copies.
3. Prove that: $\frac{Z}{Z'} \overset{(law)}{=} \frac{1}{U} - 1$, where $U$ is uniform on $(0, 1)$, and prove the result of question 2, using the explicit form of the density of $\left(\frac{T_\mu}{T'_\mu}\right)^{\mu}$ as given in formula (4.21.3).
Comments and references: A full discussion of this kind of asymptotic behaviour for the four-parameter family of stable variables is provided by: N. Cressie: A note on the behaviour of the stable distributions for small index α. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 33, no. 1, 61–64 (1975/76).
**
5.11 Finite dimensional convergence in law towards Brownian motion
Let $T_1, T_2, \ldots, T_n, \ldots$ be a sequence of positive r.v.s which have the common distribution:
$$\mu(dt) = \frac{dt}{\sqrt{2\pi t^3}} \exp\left(-\frac{1}{2t}\right).$$
[Before starting to solve this exercise, it may be helpful to have a look at the first three questions of Exercise 4.15.]
1. Let $(\varepsilon_i; i \ge 1)$ be a sequence of positive reals. Give a necessary and sufficient condition which ensures that the sequence:
$$\frac{1}{n}\sum_{i=1}^n \varepsilon_i^2 T_i, \qquad n \in \mathbb{N},$$
converges in law. When this condition is satisfied, what is the limit law? Hint: The limit law may be represented in terms of $\mu$.
2. We now assume, for the remainder of this exercise, that $\varepsilon_i = 1/(2\sqrt{i})$ $(i \ge 1)$. We define:
$$S_n = \frac{1}{n}\sum_{i=1}^n \varepsilon_i^2 T_i.$$
(a) Prove that the sequence $(S_n)_{n\in\mathbb{N}}$ converges in law.
(b) Let $k \in \mathbb{N}$, $k > 1$. Prove that the two-dimensional random sequence $(S_{(k-1)n}, S_{kn})$ converges in law, as $n \to \infty$, towards $(\alpha T_1, \beta T_1 + \gamma T_2)$, where $\alpha, \beta, \gamma$ are three constants which should be expressed in terms of $k$.
(c) More generally, prove that, for every $p \in \mathbb{N}$, $p > 1$, the $p$-dimensional random sequence $(S_n, S_{2n}, \ldots, S_{pn})$ converges in law, as $n \to \infty$, towards an r.v. which may be represented simply in terms of the vector $(T_1, T_2, \ldots, T_p)$.
(d) Does the sequence $(S_n)$ converge in probability?
3. We now refine the preceding study by introducing continuous time.
(a) Define
$$S_n(t) = \frac{1}{n}\sum_{i=1}^{[nt^2]} \varepsilon_i^2 T_i,$$
where $[x]$ denotes the integer part of $x$. Prove that, as $n \to \infty$, the finite dimensional distributions of the process $(S_n(t), t \ge 0)$ converge weakly towards those of a process $(T_t, t \ge 0)$ which has independent and time-homogeneous increments, that is: if $t_1 < t_2 < \cdots < t_p$, the variables $T_{t_1}, T_{t_2} - T_{t_1}, \ldots, T_{t_p} - T_{t_{p-1}}$ are independent, and the law of $T_{t_{i+1}} - T_{t_i}$ depends only on $(t_{i+1} - t_i)$. The latter distribution should be computed explicitly.
(b) Now, let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent r.v.s, which are centred, have the same distribution, and admit a second moment. Define:
$$\xi_n(t) = \frac{1}{\sqrt{n}}\sum_{i=1}^{[nt]} X_i.$$
Prove that, as n → ∞, the finite–dimensional distributions of the process (ξn (t), t ≥ 0) converge weakly towards those of a process with independent and time homogeneous increments which should be identified. Comments and references: (a) If (Bt , t ≥ 0) is the standard Brownian motion, then the weak limit of the process (ξn (t), t ≥ 0), introduced at the end of the exercise, has the same law as (σBt , t ≥ 0), where σ 2 = E[Xi2 ], in (b) above. In this setting, the process (Tt , t ≥ 0) may be defined as the first hitting time process of σB, that is,
for any $t \ge 0$, $T_t = \inf\{s \ge 0, \sigma B_s \ge t\}$. Since $\sigma B$ has independent and time homogeneous increments, the same properties hold for the increments of $(T_t, t \ge 0)$ and the results of questions 3 (a) and 3 (b) follow. (b) We leave to the reader the project of amplifying this exercise when the $T_i$s are replaced by stable($\mu$) variables, i.e. variables which satisfy
$$E[\exp(-\lambda T_i)] = \exp(-\lambda^\mu), \qquad \lambda \ge 0,$$
for some $0 < \mu < 1$ (the case treated here is $\mu = 1/2$).
**
5.12 The empirical process and the Brownian bridge
Let $U_1, U_2, \ldots$ be independent r.v.s which are uniformly distributed on $[0, 1]$. We define the stochastic process $b^{(n)}$ on the interval $[0,1]$ as follows:
$$b^{(n)}(t) = \sqrt{n}\left(\frac{1}{n}\sum_{k=1}^n \mathbf{1}_{[0,t]}(U_k) - t\right), \qquad t \in [0, 1]. \tag{5.12.1}$$
1. For any $s$ and $t$ in the interval $[0,1]$, compute $E[b^{(n)}(t)]$ and $\mathrm{Cov}[b^{(n)}(s), b^{(n)}(t)]$.
2. Prove that, as $n \to \infty$, the finite-dimensional distributions of the process $(b^{(n)}(t), t \in [0, 1])$ converge weakly towards those of a Gaussian process, $(b(t), t \in [0, 1])$, whose means and covariances are the same as those of $b^{(n)}$.
Comments and references: The process $b^{(n)}$ is very often involved in mathematical statistics and is usually called the empirical process. For large $n$, its path properties are very close to those of its weak limit $(b(t), t \in [0, 1])$ which is known as the Brownian bridge. The latter has the law of the Brownian motion on the time interval $[0,1]$, starting from 0 and conditioned to return to 0 at time 1. Actually, the convergence of $b^{(n)}$ towards $b$ holds in much stronger senses. This is the well known Glivenko–Cantelli Theorem which may be found in the following references.
G.R. Shorack: Probability for Statisticians. Springer Texts in Statistics, Springer-Verlag New York, 2000.
G.R. Shorack and J.A. Wellner: Empirical Processes with Applications to Statistics. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons, Inc., New York, 1986.
See also Chapter 8, p. 145, of Toulouse [58].
**
5.13 The Poisson process and Brownian motion
Let $c > 0$, and assume that $(N_t^{(c)}, t \ge 0)$ is a Poisson process with parameter $c$. Define: $X_t^{(c)} = N_t^{(c)} - ct$. Show that, for any $n \in \mathbb{N}^*$, and any $n$-tuple $(t_1, \ldots, t_n) \in \mathbb{R}_+^n$, the random vector $\frac{1}{\sqrt{c}}(X_{t_1}^{(c)}, \ldots, X_{t_n}^{(c)})$ converges in law towards $(\beta_{t_1}, \ldots, \beta_{t_n})$, as $c \to \infty$, where $(\beta_t, t \ge 0)$ is a Gaussian process. Identify this process.
Comments: The processes $(N_t^{(c)}, t \ge 0)$ and $(\beta_t, t \ge 0)$ belong to the important class of stochastic processes with stationary and independent increments, better known as Lévy processes. The Poisson process and Brownian motion may be considered as the two building blocks for Lévy processes.
**
5.14 Brownian bridges converging in law to Brownian motions
Define a standard Brownian bridge $(b(u), 0 \le u \le 1)$ as a centred continuous Gaussian process, with covariance function
$$E[b(s)b(t)] = s(1 - t), \qquad 0 \le s \le t \le 1. \tag{5.14.1}$$
1. (a) Let $(B_t, 0 \le t \le 1)$ be a Brownian motion. Prove that
$$b(u) = B_u - uB_1, \qquad 0 \le u \le 1 \tag{5.14.2}$$
is a standard Brownian bridge, which is independent of $B_1$. This justifies the following assertion. Conditionally on $B_1 = 0$, $(B_u, 0 \le u \le 1)$ is distributed as $(B_u - uB_1, 0 \le u \le 1)$.
(b) Prove that $(b(1 - u), 0 \le u \le 1) \overset{(law)}{=} (b(u), 0 \le u \le 1)$. From now on, we shall use the pathwise representation (5.14.2) for the Brownian bridge.
2. Prove that, when $c \to 0$, the family of processes indexed by $(u, s, t)$
$$\left\{ (b(u), u \le 1);\ \left(\frac{1}{\sqrt{c}}\, b(cs),\ 0 \le s \le \frac{1}{c}\right);\ \left(\frac{1}{\sqrt{c}}\, b(1 - ct),\ 0 \le t \le \frac{1}{c}\right) \right\} \tag{5.14.3}$$
converges in law towards
$$\{(b(u), u \le 1);\ (B_s, s \ge 0);\ (B'_t, t \ge 0)\}, \tag{5.14.4}$$
where $b$, $B$ and $B'$ are independent and $B$, $B'$ are two Brownian motions.
3. Prove that the family of processes $\left(\sqrt{\lambda}\, b(\exp(-t/\lambda)), t \ge 0\right)$ converges in law, as $\lambda \to +\infty$, towards Brownian motion.
Comments and references: These limits in law may be considered as “classical”; some consequences for the convergence of quadratic functionals of the Brownian bridge are discussed in G. Peccati and M. Yor: Four limit theorems involving quadratic functionals of Brownian motion and Brownian bridge. In Asymptotic Methods in Stochastics, Amer. Math. Soc., Comm. Series, in honour of M. Csörgő 44, 75–87 (2004).
5.15 An almost sure convergence result for sums of stable random variables
*
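Let $X_1, X_2, \ldots, X_n, \ldots$ be i.i.d. stable r.v.s with parameter $\alpha \in (0, 2]$, i.e. for any $n \ge 1$,
$$n^{-\frac{1}{\alpha}} S_n \overset{(law)}{=} X_1, \tag{5.15.1}$$
where $S_n = \sum_{k=1}^n X_k$. Prove that for any bounded Borel function $f$,
$$\frac{1}{\log n}\sum_{k=1}^n \frac{1}{k}\, f\left(k^{-\frac{1}{\alpha}} S_k\right) \to E[f(X_1)], \quad \text{almost surely as } n \to +\infty. \tag{5.15.2}$$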
Hint: Introduce the associated continuous time stable Lévy process $(S_t)_{t\ge 0}$, and apply the ergodic theorem to the re-scaled process $Z_t = e^{-\frac{t}{\alpha}} S_{e^t}$, and to the shift transformation.
Comments and references: In the article: P. Erdős and G.A. Hunt: Changes of sign of sums of random variables. Pac. J. Math., 3, 673–687 (1953) the authors proved a conjecture of P. Lévy (1937), i.e. for any sequence of partial sums $S_n = X_1 + \cdots + X_n$ of i.i.d. r.v.s $(X_n)$ with continuous symmetric distribution,
$$\lim_{N\to\infty} \frac{1}{\log N}\sum_{k\le N} \frac{1}{k}\,\mathbf{1}_{\{S_k > 0\}} = \frac{1}{2}, \quad \text{a.s.}$$
This almost sure convergence result has been later investigated independently in P. Schatte: On strong versions of the central limit theorem. Math. Nachr., 137, 249–256 (1988)
and G.A. Brosamler: An almost everywhere central limit theorem. Math. Proc. Camb. Philos. Soc., 104, no. 3, 561–574 (1988), where the authors proved that any sequence of i.i.d. r.v.s $(X_n)$ with $E[X_1] = 0$ and $E[|X_1|^{2+\delta}] < \infty$, for some $\delta > 0$, satisfies the so called “Almost Sure Central Limit Theorem”:
$$\lim_{N\to\infty} \frac{1}{\log N}\sum_{k\le N} \frac{1}{k}\,\mathbf{1}_{\left\{\frac{S_k}{\sqrt{k}} < x\right\}} = \phi(x), \quad \text{a.s., for all } x \in \mathbb{R},$$
($\phi$ is the standard Gaussian distribution function). It was shown slightly later that the constant $\delta$ may be assumed to be zero. Many further developments have been made on this topic. The later ones concern for instance: r.v.s which are in the domain of attraction of a stable law (our exercise is a particular case), continuous time semi-stable processes, etc. An exhaustive survey on the topic is done in I. Berkes: Results and Problems Related to the Pointwise Central Limit Theorem. Asymptotic methods in probability and statistics (Ottawa, ON, 1997), 59–96, North-Holland, Amsterdam, 1998.
Complements to Exercises 5.1 and 5.7:
(C.1) Question 3 of Exercise 5.1 may be completed as follows:
(i) If a sequence of independent variables $(Y_n)$ satisfies: $Y_n \overset{(law)}{=} \sigma_n^2 Y$, with $Y \ge 0$, $E[Y] = 1$, then $\sum_n \sigma_n^2 < \infty$ if and only if $\sum_n Y_n < \infty$, a.s.
(ii) Exhibit a sequence $(Y_n)$ of independent $\mathbb{R}_+$-valued r.v.s such that $E[Y_n] = 1$ (hence: $\sum_n E[Y_n] = \infty$), but $P(\sum_n Y_n < \infty) = 1$.
Solution: $P\left(Y_n = \frac{1}{n^2 - 1}\right) = 1 - \frac{1}{n^2}$; $P(Y_n = n^2 - 1) = \frac{1}{n^2}$.
(C.7) Question 3 of Exercise 5.7 may be completed as follows: Let $\alpha > 1$, and let $X_1, X_2, \ldots$ be a sequence of independent r.v.s such that:
$$P(X_i \in dy) = c\,\frac{dy}{|y|^{\alpha}}\,\mathbf{1}_{\{|y|\ge 1\}},$$
for a certain constant $c$. Give a necessary and sufficient condition on a deterministic sequence $(\varphi(n), n \in \mathbb{N})$ such that: $\varphi(n) \to \infty$, as $n \to \infty$, which implies:
$$\frac{X_1 + \cdots + X_n}{\varphi(n)} \xrightarrow[n\to\infty]{(law)} Z,$$
where $Z$ is $\beta$-stable, for some $\beta \in (0, 2)$.
Solution: $\alpha \in (1, 3)$: take $\varphi(n) = n^{1/(\alpha-1)}$, $\beta = \alpha - 1$; $\alpha = 3$: $\varphi(n) = (n \log n)^{1/2}$, $\beta = 2$; $\alpha > 3$: $\varphi(n) = \sqrt{n}$, $\beta = 2$.
Solutions for Chapter 5
Solution to Exercise 5.1
1. The sum of r.v.s $\sum_i X_i^2$ converges in $L^1$ if and only if:
$$\sum_i E[X_i^2] = \sum_i (\sigma_i^2 + \mu_i^2) < \infty.$$
2. We show that under assumption (5.1.1), $\sum_i X_i^2$ belongs to $L^p$. First observe that we can write $X_i$ as $\sigma_i Y_i + \mu_i$, where $(Y_i)$ is a sequence of independent, centred Gaussian variables with variance 1. Therefore we have:
$$\Big\|\sum_i X_i^2\Big\|_p \le \Big\|\sum_i \sigma_i^2 Y_i^2\Big\|_p + \Big\|\sum_i 2\sigma_i\mu_i Y_i\Big\|_p + \sum_i \mu_i^2 \le \|Y_1^2\|_p \sum_i \sigma_i^2 + \|Y_1\|_p \sum_i 2|\sigma_i\mu_i| + \sum_i \mu_i^2,$$
which is finite thanks to the inequality $2|\sigma_i\mu_i| \le \sigma_i^2 + \mu_i^2$. Thus, $\sum_{i=1}^n X_i^2$ converges in $L^p$ towards $\sum_{i=1}^\infty X_i^2$, by Lebesgue’s dominated convergence.
3. Let $Y$ be a centred Gaussian variable with variance 1. For every $n$, we have:
$$E\left[e^{-\sum_{i=1}^n X_i^2}\right] = \prod_{i=1}^n E\left[e^{-\sigma_i^2 Y^2}\right] = \prod_{i=1}^n (1 + 2\sigma_i^2)^{-\frac{1}{2}}.$$
Since $\sum_{i=1}^n 2\sigma_i^2 \le \prod_{i=1}^n (1 + 2\sigma_i^2)$, when $n \to \infty$, $E\left[e^{-\sum_{i=1}^n X_i^2}\right]$ converges towards 0. This proves that $P(\lim_{n\to+\infty} \sum_{i=1}^n X_i^2 = +\infty) > 0$. But the event $\{\lim_{n\to+\infty} \sum_{i=1}^n X_i^2 = +\infty\}$ belongs to the tail σ-field generated by $(X_n, n \in \mathbb{N})$, which is trivial, so we have $P(\lim_{n\to+\infty} \sum_{i=1}^n X_i^2 = +\infty) = 1$.
Solution to Exercise 5.2
1. Since $\frac{1}{k+1} = \int_0^1 du\, u^k$, it follows that $E[f(X_n)]$ converges towards $\int_0^1 du\, f(u)$ for any polynomial function $f$ defined on $[0,1]$. Then, the Weierstrass Theorem allows us to extend this convergence to any continuous function on $[0,1]$. So $(X_n, n \ge 0)$ converges in law towards the uniform law on $[0, 1]$.
2. We may use the same argument as in question 1. Another means is to consider the Laplace transform $\psi_{X_n}$ of $X_n$. For every $\lambda \ge 0$, $\psi_{X_n}(\lambda)$ converges towards
$$\sum_{k\ge 0} \frac{(-\lambda)^k}{k!}\,\frac{a}{a + k} = \sum_{k\ge 0} \frac{(-\lambda)^k}{k!}\, E[X^k] = E[e^{-\lambda X}],$$
where $X$ is an r.v. whose law has density: $a x^{a-1}\,\mathbf{1}_{\{x\in[0,1]\}}$. So $(X_n, n \ge 0)$ converges in law towards the distribution whose density is as above.
Solution to Exercise 5.3
1. It suffices to show that for any pair of bounded continuous functions $f$ and $g$,
$$\lim_{n\to\infty} E[f(\varphi(X_n))g(Y_n)] = E[f(\varphi(X))g(Y)].$$
Let $g_n$ be a uniformly bounded sequence of Borel functions which satisfies: $E[g(Y_n) \mid X_n] = g_n(X_n)$. From the hypothesis, $X_n$ converges in law towards $X$. Since the law of $X_n$ does not depend on $n$, this law is the law of $X$. So, we can write:
$$E[f(\varphi(X_n))g_n(X_n)] = E[f(\varphi(X))g_n(X)].$$
Set $h = f \circ \varphi$ and let $(h_k)$ be a sequence of continuous functions with compact support which converges towards $h$ in the space $L^1(\mathbb{R}, \mathcal{B}_{\mathbb{R}}, \nu_X)$, where $\nu_X$ is the law of $X$. For any $k$ and $n$, we have the inequalities:
$$|E[h(X)g_n(X) - h(X)g(Y)]| \le E[|h(X)g_n(X) - h_k(X)g_n(X)|] + |E[h_k(X)g_n(X)] - E[h_k(X)g(Y)]| + E[|h_k(X)g(Y) - h(X)g(Y)|]$$
$$\le B\,E[|h(X) - h_k(X)|] + |E[h_k(X)g_n(X)] - E[h_k(X)g(Y)]| + B\,E[|h_k(X) - h(X)|],$$
where $B$ is a uniform bound for both $g$ and $g_n$, that is $|g| \le B$ and $|g_n| \le B$, for all $n$. Since $h_k$ is bounded and continuous, and from the hypotheses of convergence in law, we have for every $k$:
$$\lim_{n\to\infty} |E[h_k(X)g_n(X)] - E[h_k(X)g(Y)]| = \lim_{n\to\infty} |E[h_k(X_n)g(Y_n)] - E[h_k(X)g(Y)]| = 0.$$
Finally, since $h_k(X)$ converges in $L^1(\Omega, \mathcal{F}, P)$ towards $h(X)$, the term $E[|h_k(X) - h(X)|]$ converges towards 0 as $k$ goes to $\infty$.
2. To construct a counter-example, we do not need to rely on a bivariate sequence of r.v.s $(X_n, Y_n)$. It is sufficient to consider a sequence $(X_n)$ which converges in law towards $X$ and a Borel discontinuous function $\varphi$, such that $\varphi(X_n)$ does not converge in law towards $\varphi(X)$. A very simple counter-example is the following: set $X_n = a + \frac{1}{n}$, with $a \in \mathbb{R}$ and $n \ge 1$. Define also $\varphi(x) = \mathbf{1}_{\{x\le a\}}$, $x \in \mathbb{R}$; then $X_n$ converges surely, hence in law, towards $a$, although $\varphi(X_n) = 0$, for all $n \ge 1$. So, it cannot converge in law towards $\varphi(X) = \varphi(a) = 1$.
Solution to Exercise 5.4
Let $F_n$ be the distribution function of $\frac{1}{n}\sup_{i\le n} X_i$. For every $x \in \mathbb{R}$, $F_n$ is given by:
$$F_n(x) = P\Big(\frac{1}{n}\sup_{i\le n} X_i \le x\Big) = P(X_1 \le nx)^n = \left(1 - \int_{nx}^\infty \frac{a\,dy}{\pi(a^2 + y^2)}\right)^n.$$
If $x < 0$, then since $P(X_1 \le nx)$ decreases to 0 as $n \to \infty$, $F_n(x)$ converges to 0. Suppose $x > 0$. In that case, $F_n(x) \sim \exp\left(-n\int_{nx}^\infty \frac{a\,dy}{\pi(a^2+y^2)}\right)$, as $n \to +\infty$, and since $\int_{nx}^\infty \frac{a\,dy}{\pi(a^2+y^2)} = \int_x^\infty \frac{na\,dz}{\pi(a^2+n^2z^2)}$, we have
$$\lim_{n\to+\infty} F_n(x) = \exp\left(-\int_x^\infty \frac{a\,dz}{\pi z^2}\right) = \exp\left(-\frac{a}{\pi x}\right).$$
To conclude, we check that the function $F$ given by $F(x) = \lim_{n\to\infty} F_n(x) = \exp\left(-\frac{a}{\pi x}\right)\mathbf{1}_{(0,\infty)}(x)$, for all $x \ge 0$, is the distribution function of $\frac{1}{T_c}$, where $T_c$ is an exponential r.v. with parameter $c = \frac{a}{\pi}$:
$$\exp\left(-\frac{c}{x}\right) = P\left(T_c > \frac{1}{x}\right) = P\left(\frac{1}{T_c} < x\right).$$
Comments on the solution: The law of $\frac{1}{T_c}$ is a particular case of the Fréchet distributions (with distribution function: $\exp(-ax^{-\alpha})$, $a > 0$, $\alpha > 0$), which are themselves a special class of limit laws for extremes. See, for example: P. Embrechts, C. Klüppelberg and T. Mikosch: Modelling Extremal Events. For Insurance and Finance. Applications of Mathematics, 33. Springer-Verlag, Berlin, 1997.
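The convergence of $F_n(x) = P(X_1 \le nx)^n$ towards $\exp(-a/(\pi x))$ can be checked numerically without any simulation. The following is a minimal sketch, not part of the original solution: the function names are hypothetical, and it relies on the closed form of the Cauchy tail, $P(X_1 > y) = \frac{1}{\pi}\arctan(a/y)$ for $y > 0$.

```python
import math


def cdf_normalized_max(n: int, x: float, a: float) -> float:
    """F_n(x) = P(X_1 <= n x)^n for i.i.d. Cauchy(a) variables and x > 0."""
    # Cauchy tail for y > 0: P(X_1 > y) = atan(a / y) / pi.
    tail = math.atan(a / (n * x)) / math.pi
    # (1 - tail)^n, written with log1p to stay accurate when tail is tiny.
    return math.exp(n * math.log1p(-tail))


def frechet_limit(x: float, a: float) -> float:
    """Limit exp(-a / (pi x)) = P(1/T_c < x), T_c exponential with c = a/pi."""
    return math.exp(-a / (math.pi * x))


if __name__ == "__main__":
    a, x = 2.0, 1.5  # illustrative values, chosen arbitrarily
    for n in (10, 1_000, 100_000):
        print(n, cdf_normalized_max(n, x, a), frechet_limit(x, a))
```

Printing the two columns side by side for increasing $n$ shows how quickly the exact distribution function approaches the Fréchet limit at a given $x > 0$.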
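Solution to Exercise 5.5
1. Let $M$ be a matrix such that $K = M\,{}^t\!M$ (where $K$ denotes the covariance matrix of $X$) and set $f(x) = \max_{1\le i\le n} (Mx)_i$. Then we can check from the Cauchy–Schwarz inequality that $f$ is Lipschitz with:
$$\|f\|_{Lip} = \left(\max_{1\le i\le n} \sum_{j=1}^n M_{i,j}^2\right)^{\frac{1}{2}} = \max_{1\le i\le n} E[X_i^2]^{\frac{1}{2}} = \sigma,$$
hence question 3 of Exercise 3.10 implies:
$$P\Big(\max_{1\le i\le n} X_i \ge E\big[\max_{1\le i\le n} X_i\big] + \sigma r\Big) \le e^{-\frac{r^2}{2}}.$$
2. Question 1 yields the inequality
$$\limsup_{r\to+\infty} \frac{1}{r^2}\log P\Big(\max_{1\le i\le n} X_i \ge r\Big) \le -\frac{1}{2\sigma^2}.$$
To get the other inequality, it suffices to note that for any $i = 1, \ldots, n$:
$$P\Big(\max_{1\le i\le n} X_i \ge r\Big) \ge P(X_i \ge r) = 1 - \Phi\left(\frac{r}{\sigma_i}\right) \ge \frac{\exp\left(-\frac{r^2}{2\sigma_i^2}\right)}{\sqrt{2\pi}\left(1 + \frac{r}{\sigma_i}\right)},$$
where $\sigma_i = E[X_i^2]^{\frac{1}{2}}$ and $\Phi$ is the distribution function of the centred Gaussian law with variance 1. Hence,
$$\liminf_{r\to\infty} \frac{1}{r^2}\log P\Big(\max_{i\le n} X_i \ge r\Big) \ge -\frac{1}{2\sigma_i^2}, \quad \text{for all } i \le n.$$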
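As an illustration of inequality (5.5.1), one can compare empirical tail frequencies of the maximum with the bound $e^{-r^2/2}$. The sketch below is only a sanity check under simplifying assumptions that are not in the text: the coordinates are taken independent and standard Gaussian (so $\sigma = 1$), $E[\max_i X_i]$ is replaced by its Monte Carlo estimate, and the function name and default parameters are hypothetical.

```python
import math
import random


def max_tail_vs_bound(dim=5, trials=20000, r_values=(0.5, 1.0, 2.0), seed=0):
    """Return rows (r, empirical P(max >= E[max] + r), exp(-r^2 / 2)).

    Assumption: independent N(0, 1) coordinates, hence sigma = 1.
    """
    rng = random.Random(seed)
    maxima = [max(rng.gauss(0.0, 1.0) for _ in range(dim)) for _ in range(trials)]
    # Monte Carlo estimate of E[max_i X_i].
    mean_max = sum(maxima) / trials
    rows = []
    for r in r_values:
        freq = sum(m >= mean_max + r for m in maxima) / trials
        rows.append((r, freq, math.exp(-r * r / 2.0)))
    return rows


if __name__ == "__main__":
    for r, freq, bound in max_tail_vs_bound():
        print(f"r={r}: empirical tail {freq:.4f} <= bound {bound:.4f}")
```

In this independent case the empirical frequencies sit well below the bound, which is expected: (5.5.1) holds for every centred Gaussian vector and is not tight for a particular one.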
Solution to Exercise 5.6
1. Let $f$ be a bounded Borel function defined on $\mathbb{R}_+$. We have
$$E\left[f\left(\frac{1}{n}\cosh X_n\right)\right] = \int_0^\infty f\left(\frac{1}{n}\cosh u\right) \frac{1}{n}\sinh u\, \exp\left(\frac{1}{n}(1 - \cosh u)\right) du.$$
With the change of variables $x = \frac{1}{n}\cosh u$, we obtain $du = \frac{n\,dx}{\sqrt{n^2x^2 - 1}}$, and
$$E\left[f\left(\frac{1}{n}\cosh X_n\right)\right] = \int_{\frac{1}{n}}^\infty f(x) \exp\left(\frac{1}{n} - x\right) dx,$$
which shows that the law of $\frac{1}{n}\cosh X_n$ has density $\exp\left(\frac{1}{n} - x\right)\mathbf{1}_{\{x \ge \frac{1}{n}\}}$.
2. From above, the r.v. $\frac{1}{n}(\cosh X_n - 1)$ follows the exponential law with parameter 1, thus $\frac{\cosh X_n}{n}$ converges in law towards the exponential law with parameter 1. Consequently, $\log\frac{\cosh X_n}{n}$ converges in law towards $Y = \log X$, where $X$ follows the exponential law with parameter 1, that is
$$E[f(\log X)] = \int_0^\infty e^{-x} f(\log x)\,dx = \int_{-\infty}^\infty e^{-e^y} e^y f(y)\,dy.$$
3. Since $\log(\cosh X_n) - \log n$ converges in law towards $Y$, which is a.s. finite, the ratio $\frac{\log(\cosh X_n)}{\log n} - 1$ converges in law (and thus in probability) towards 0. Write:
$$\frac{\log(\cosh X_n)}{\log n} = \frac{X_n}{\log n} + \frac{\log(1 + e^{-2X_n}) - \log 2}{\log n}.$$
Since $X_n$ is a.s. positive, $\log(1 + e^{-2X_n})$ is a.s. bounded by $\log 2$ and $\frac{\log(1+e^{-2X_n}) - \log 2}{\log n}$ converges a.s. towards 0. We conclude from above that:
$$\frac{X_n}{\log n} \xrightarrow[n\to\infty]{(P)} 1.$$
4. From the preceding question, we know that
$$\log\left(e^{X_n}\,\frac{1 + e^{-2X_n}}{2}\right) - \log n \xrightarrow[n\to\infty]{(law)} Y,$$
from which we deduce:
$$X_n - \log n - \log 2 \xrightarrow[n\to\infty]{(law)} Y.$$
A direct proof of this result may be obtained by writing:
$$E[f(X_n - \log n)] = \frac{1}{n}\int_{-\log n}^\infty \sinh(v + \log n)\, e^{\frac{1}{n}(1 - \cosh(v + \log n))} f(v)\,dv,$$
for a bounded function $f$ on $\mathbb{R}$, and letting $n$ tend to $+\infty$.
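The result of question 4 can be made concrete through a coupling. By question 2, $\frac{1}{n}(\cosh X_n - 1)$ is exponential with parameter 1, so $X_n$ has the same law as $\operatorname{arccosh}(1 + nE)$ for $E$ a standard exponential variable, and $Y$ has the law of $\log E$. The sketch below (hypothetical function names; the coupling is an illustrative construction, not the argument of the text) evaluates the deterministic gap $X_n - \log n - \log 2 - \log E$ for a fixed value of $E$ and shows that it vanishes as $n$ grows.

```python
import math


def x_n(n: float, e: float) -> float:
    """Value of X_n built from an Exp(1) value e via (cosh X_n - 1) / n = e."""
    return math.acosh(1.0 + n * e)


def gap(n: float, e: float) -> float:
    """X_n - log n - log 2 - log e, which should tend to 0 as n grows."""
    return x_n(n, e) - math.log(n) - math.log(2.0) - math.log(e)


if __name__ == "__main__":
    for n in (10, 10**3, 10**6, 10**8):
        print(n, gap(n, 1.0), gap(n, 0.3))
```

Since the gap tends to 0 for every fixed $e > 0$, the coupled variables converge almost surely, which is consistent with the convergence in law $X_n - \log n - \log 2 \to Y$.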
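Solution to Exercise 5.7
1. For any bounded Borel function $f$, we have:
$$E\left[f\left(\frac{\varepsilon}{\sqrt{U}}\right)\right] = \frac{1}{2}\left(E\left[f\left(\frac{1}{\sqrt{U}}\right)\right] + E\left[f\left(\frac{-1}{\sqrt{U}}\right)\right]\right) = \frac{1}{2}\int_0^1 \left(f\left(\frac{1}{\sqrt{u}}\right) + f\left(\frac{-1}{\sqrt{u}}\right)\right) du = \int_1^\infty (f(x) + f(-x))\,\frac{dx}{x^3},$$
hence, the law of $X$ has density: $\frac{1}{|x|^3}\mathbf{1}_{\{|x|\ge 1\}}$.
2. The characteristic function of $X$ is given by:
$$\varphi_X(t) = \frac{1}{2}\left(E\left[\exp\frac{it}{\sqrt{U}}\right] + E\left[\exp\frac{-it}{\sqrt{U}}\right]\right) = E\left[\cos\frac{t}{\sqrt{U}}\right] = 2t^2\int_t^\infty \cos x\,\frac{dx}{x^3}, \quad t \in \mathbb{R}.$$
When $t$ goes towards 0, $\varphi_X(t) - 1$ is equivalent to $t^2\log t$, so when $n$ goes towards $\infty$, the characteristic function of $\frac{X_1 + \cdots + X_n}{\sqrt{n\log n}}$, that is $\left(\varphi_X\left(\frac{t}{\sqrt{n\log n}}\right)\right)^n$, $t \in \mathbb{R}$, is equivalent to
$$\left(1 + \frac{t^2}{n\log n}\left(\log t - \frac{1}{2}\log n - \frac{1}{2}\log\log n\right)\right)^n, \quad t \in \mathbb{R}.$$
When $n$ goes to $\infty$, the latter expression converges towards $e^{-\frac{t^2}{2}}$, for any $t \in \mathbb{R}$, which proves the result.
3. Let $\varepsilon$ and $U$ be as in question 1. We easily check that the law of $X = \varepsilon U^{-\frac{1}{2+\alpha}}$ has the density which is given in the statement of question 3. Its characteristic function is $E[e^{itX}] = (2 + \alpha)\,t^{2+\alpha}\int_t^\infty \cos x\,\frac{dx}{x^{2+\alpha+1}}$, and is equivalent to $1 - \frac{2+\alpha}{2\alpha}t^2$, when $t$ goes to 0. Therefore, if $\varphi^2(n) \to +\infty$, as $n \to \infty$, then the characteristic function of $\frac{X_1+\cdots+X_n}{\varphi(n)}$ is asymptotically equivalent to $\left(1 - \frac{2+\alpha}{2\alpha}\frac{t^2}{\varphi^2(n)}\right)^n$, for any $t \in \mathbb{R}$, as $n \to \infty$. This expression converges towards $e^{-\frac{t^2}{2}}$ if $\varphi^2(n)$ is equivalent to $\frac{2+\alpha}{\alpha}n$, as $n \to \infty$.
Note the important difference between the setups of questions 1 and 3; in the latter, the classical CLT assumptions are satisfied.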
Solution to Exercise 5.8
1. First we show that $(|U_n - X|, n \ge 1)$ is uniformly integrable, which follows from the fact that this sequence is bounded in $L^2$ (see the discussion following Exercise 1.2). Here are the details. For every $n \ge 1$:
$$E[|U_n - X|^2] \le \left(E[U_n^2]^{\frac{1}{2}} + E[X^2]^{\frac{1}{2}}\right)^2.$$
But $E[U_n^2] = \frac{1}{n}\sum_{k=1}^n E[Y_k^2] = 1$, so that $E[|U_n - X|^2] \le \left(1 + E[X^2]^{\frac{1}{2}}\right)^2$. Now, from the Cauchy–Schwarz and Bienaymé–Tchebychev inequalities, we have for any $a > 0$: $a\,E[|U_n - X|\mathbf{1}_{\{|U_n - X| > a\}}] \le E[(U_n - X)^2]$. Hence
$$E[|U_n - X|\mathbf{1}_{\{|U_n - X| > a\}}] \le \frac{1}{a}\left(1 + E[X^2]^{\frac{1}{2}}\right)^2,$$
so $\lim_{a\to+\infty}\sup_{n\in\mathbb{N}} E[|U_n - X|\mathbf{1}_{\{|U_n - X| > a\}}] = 0$ and $(|U_n - X|, n \ge 1)$ is uniformly integrable. Moreover, applying the Central Limit Theorem, we obtain that $(|U_n - X|, n \ge 1)$ converges in law towards $|Y - X|$, where $Y$ is an independent copy of $X$ (recall that $(U_n)$ is independent from $X$). From the result of Exercise 1.3, we conclude that:
$$\lim_{n\to\infty} E[|U_n - X|] = E[|Y - X|] = \frac{2}{\sqrt{\pi}}. \tag{5.8.a}$$
The last equality follows from the fact that $X - Y$ is a centred Gaussian variable whose variance equals 2 (see Exercise 3.8).
2. From the independence between the r.v.s $Y_n$, we can write, for any bounded continuous function $f$ defined on $\mathbb{R}^{p+1}$:
$$E[f(Y_1, \ldots, Y_p, U_n)] = \int_{\mathbb{R}^p} E\left[f\left(y_1, \ldots, y_p, \frac{1}{\sqrt{n}}(y_1 + \cdots + y_p) + \frac{1}{\sqrt{n}}(Y_{p+1} + \cdots + Y_n)\right)\right] P(Y_1 \in dy_1, \ldots, Y_p \in dy_p).$$
The Central Limit Theorem ensures that for any $(y_1, \ldots, y_p) \in \mathbb{R}^p$,
$$\lim_{n\to\infty} E\left[f\left(y_1, \ldots, y_p, \frac{1}{\sqrt{n}}(y_1 + \cdots + y_p) + \frac{1}{\sqrt{n}}(Y_{p+1} + \cdots + Y_n)\right)\right] = E[f(y_1, \ldots, y_p, X)],$$
where $X$ is a centred reduced Gaussian variable. Finally, from above and since $f$ is bounded, dominated convergence gives
$$\lim_{n\to\infty} E[f(Y_1, \ldots, Y_p, U_n)] = E[f(Y_1, \ldots, Y_p, X)].$$
Hence, when $n$ goes to $\infty$, $(Y_1, \ldots, Y_p, U_n)$ converges in law towards $(Y_1, \ldots, Y_p, X)$, where $X$ is a centred reduced Gaussian variable which is independent of $(Y_1, \ldots, Y_p)$.
3. If $U_n$ would converge in probability, then it would necessarily converge towards an r.v. such as $X$ defined in the previous question. Let us assume that when $n$ goes to $\infty$, $U_n \xrightarrow{(P)} X$, where $X$ is a centred Gaussian variable with variance 1, independent of the sequence $(Y_n, n \ge 1)$. Since from question 1, $(|U_n - X|, n \ge 1)$ is uniformly integrable, the convergence in probability implies that $U_n \xrightarrow{L^1} X$. But, from (5.8.a), $E[|U_n - X|]$ converges towards $\frac{2}{\sqrt{\pi}}$, which contradicts our hypothesis. So $U_n$ cannot converge in probability.
4. Set $\mathcal{G}_\infty = \cap_p\,\sigma\{Y_p, Y_{p+1}, \ldots\}$. It follows from the hypothesis that for any $n \ge 1$ and $m \ge 1$ such that $n < m$, $\sigma\{Y_1, \ldots, Y_n\}$ is independent of $\sigma\{Y_k : k \ge m\}$. Since for any $m \ge 1$, $\mathcal{G}_\infty \subset \sigma\{Y_k : k \ge m\}$, the sigma-fields $\sigma\{Y_1, \ldots, Y_n\}$ and $\mathcal{G}_\infty$ are independent for any $n \ge 1$. An application of the Monotone Class Theorem shows that $\sigma\{Y_1, \ldots, Y_n, \ldots\}$ and $\mathcal{G}_\infty$ are independent. Since $\mathcal{G}_\infty \subset \sigma\{Y_1, \ldots, Y_n, \ldots\}$, the sigma-field $\mathcal{G}_\infty$ is independent of itself, so it is trivial. Now assume that $U_n$ converges in probability. Then the limit would be measurable with respect to $\mathcal{G}_\infty$, since for every fixed $p$ it is also the limit in probability of $\frac{1}{\sqrt{n}}(Y_p + \cdots + Y_n)$. According to what we just proved, this limit must be constant, but this would contradict the Central Limit Theorem.
Comments on the solution: Actually, the property proved in solving question 4 is valid for any sequence of independent random variables and is known as Kolmogorov’s 0−1 law.
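The limit (5.8.a) lends itself to a quick Monte Carlo check. The sketch below makes one specific choice that is not imposed by the exercise: the $Y_k$ are taken to be symmetric $\pm 1$ signs (which are centred with variance 1), and their sum is obtained by counting the 1-bits of a random $n$-bit integer. The function name and default parameters are hypothetical.

```python
import math
import random


def mean_abs_gap(n=64, trials=20000, seed=0):
    """Monte Carlo estimate of E|U_n - X| with Y_k = +/-1 signs and X ~ N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Sum of n independent +/-1 signs equals 2 * (number of 1-bits) - n.
        ones = bin(rng.getrandbits(n)).count("1")
        u_n = (2 * ones - n) / math.sqrt(n)
        x = rng.gauss(0.0, 1.0)  # drawn independently of the signs
        total += abs(u_n - x)
    return total / trials


if __name__ == "__main__":
    print("estimate:", mean_abs_gap())
    print("2 / sqrt(pi):", 2.0 / math.sqrt(math.pi))
```

The estimate stays close to $2/\sqrt{\pi} \approx 1.128$ rather than shrinking towards 0 as $n$ increases, which is the quantitative content of the argument in question 3.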
Solution to Exercise 5.9
Set $\bar{S}_n = \frac{1}{\sigma\sqrt{n}}\sum_{i=1}^n (X_i - m)$ and let $N$ be a Gaussian, centred r.v., with variance 1 (defined on an auxiliary probability space). Our goal is to show that for any bounded continuous function $f$ with compact support on $\mathbb{R}$,
$$E_Q[f(\bar{S}_n)] \to E[f(N)], \quad \text{as } n \to \infty. \tag{5.9.a}$$
For $k \ge 1$, put $D_k = E_P[D \mid \mathcal{G}_k]$, with $\mathcal{G}_k = \sigma\{X_1, \ldots, X_k\}$. Since we deal only with variables $X_1, X_2, \ldots, X_n, \ldots$, we may assume, without loss of generality, that $D$ is measurable with respect to $\mathcal{G}_\infty = \lim_{k\uparrow\infty} \mathcal{G}_k$. We then write:
$$E_Q[f(\bar{S}_n)] = E_P[(D - D_k)f(\bar{S}_n)] + E_P[D_k f(\bar{S}_n)]. \tag{5.9.b}$$
Applying the result of Exercise 1.5, together with the inequality $|E_P[(D - D_k)f(\bar{S}_n)]| \le E_P[|D - D_k|]\,\|f\|_\infty$, we see that the first term of (5.9.b) converges towards 0, as $k$ goes to $\infty$, uniformly in $n$. It remains to show that for any $k \ge 1$,
$$E_P[D_k f(\bar{S}_n)] \to E[f(N)], \quad \text{as } n \to \infty. \tag{5.9.c}$$
To this aim, put $\bar{S}_{k,n} = \frac{1}{\sigma\sqrt{n}}\sum_{i=k+1}^n (X_i - m)$ and note that, since $\bar{S}_{k,n}$ is independent of $\mathcal{G}_k$ and $E_P[D_k] = 1$, one has
$$E_P[D_k f(\bar{S}_{k,n})] = E_P[f(\bar{S}_{k,n})] \to E[f(N)], \quad \text{as } n \to \infty. \tag{5.9.d}$$
Since $f$ is uniformly continuous and $|\bar{S}_{k,n} - \bar{S}_n| \to 0$ a.s., as $n \to +\infty$,
$$|E_P[D_k f(\bar{S}_{k,n})] - E_P[D_k f(\bar{S}_n)]| \to 0, \quad \text{as } n \to +\infty, \tag{5.9.e}$$
by Lebesgue’s Theorem of dominated convergence. This finally leads to (5.9.c), via (5.9.d) and (5.9.e).
Solution to Exercise 5.10
1. This follows easily from the Mellin transform formula:
$$E[(T_\mu)^{\mu s}] = \frac{\Gamma(1 - s)}{\Gamma(1 - \mu s)}, \quad \text{for } s < 1,$$
which was derived in Exercise 4.17 (cf. formula (4.17.4)). Hence, $\lim_{\mu\to 0} E[(T_\mu)^{\mu s}] = \Gamma(1 - s)$, which proves the result.
2. This follows from question 1, since:
$$\log\left(\frac{T_\mu}{T'_\mu}\right)^{\mu} = \mu\log T_\mu - \mu\log T'_\mu$$
converges in law towards $\log Z - \log Z'$.
3. The identity in law: $\frac{Z}{Z'} \overset{(law)}{=} \frac{1}{U} - 1$ follows, e.g., from the beta–gamma algebra, since $(Z, Z') \overset{(law)}{=} (U, 1 - U)(Z + Z')$ (use question 1 of Exercise 4.2). Passing to the limit in formula (4.21.3), as $\mu \to 0$, one obtains:
$$\left(\frac{T_\mu}{T'_\mu}\right)^{\mu} \xrightarrow{(law)} \frac{dy}{(y + 1)^2},$$
and the right hand side is the law of $\frac{1}{U} - 1$.
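Solution to Exercise 5.11
1. Question 3 of Exercise 4.15 allows us to write
$$E\left[\exp\left(-\frac{\lambda^2}{2n}\sum_{i=1}^n \varepsilon_i^2 T_i\right)\right] = \prod_{i=1}^n E\left[\exp\left(-\frac{\lambda^2}{2n}\varepsilon_i^2 T_i\right)\right] = \exp\left(-\frac{|\lambda|}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i\right),$$
for every $\lambda \in \mathbb{R}$. This shows that the sequence $\frac{1}{n}\sum_{i=1}^n \varepsilon_i^2 T_i$ converges in law if and only if $\frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i$ converges towards a finite real value, say $\varepsilon$. In this case, the Laplace transform of the limit in law is: $e^{-\sqrt{2\lambda}\,\varepsilon}$, $\lambda \ge 0$. We deduce from question 3 of Exercise 4.15 that this limit is the law of $\varepsilon^2 T$, where $T$ has law $\mu$.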
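The Laplace transform used in question 1, $E[\exp(-\frac{\lambda^2}{2}T)] = e^{-|\lambda|}$ for $T$ with law $\mu$, can be checked by simulation. The sketch below relies on a fact that is not stated in this solution and is labelled here as an assumption of the example: if $G$ is a standard Gaussian variable, then $1/G^2$ has density $\frac{1}{\sqrt{2\pi t^3}}e^{-1/(2t)}$, i.e. law $\mu$. The function name and defaults are hypothetical.

```python
import math
import random


def laplace_mc(lam: float, trials: int = 50000, seed: int = 0) -> float:
    """Monte Carlo estimate of E[exp(-(lam^2 / 2) T)] with T = 1 / G^2, G ~ N(0, 1).

    Assumption of this sketch: 1 / G^2 has the law mu of the exercise.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        g2 = rng.gauss(0.0, 1.0) ** 2
        # When g2 underflows to 0, T is infinite and the integrand is 0.
        if g2 > 0.0:
            total += math.exp(-lam * lam / (2.0 * g2))
    return total / trials


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0):
        print(lam, laplace_mc(lam), math.exp(-abs(lam)))
```

The product formula of question 1 then follows by applying this identity to each factor with $\lambda$ replaced by $\lambda\varepsilon_i/\sqrt{n}$.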
1 √ , so it is equiv2. (a) The series ni=1 2√ may be compared with the integral 1n 2dx x i √ n 1 alent to n and in this case, √n i=1 εi converges to 1. Applying question 1, we see that Sn converges in law to μ.
170
Exercises in Probability
kn 1 2 (b) First write Skn = k−1 S(k−1)n + kn i=(k−1)n+1 εi Ti . It is clear from question 1 k T1 . The Laplace that the first term on the right hand side converges in law to k−1 k transform of the second term is
⎡
⎛
⎞⎤
⎛
⎞
kn kn 1 λ2 1 λ √ ⎠. E ⎣exp ⎝− ε2i Ti ⎠⎦ = exp ⎝− √ 2 kn i=(k−1)n+1 2 kn i=(k−1)n+1 i
It converges to exp − (k+√kλ√k−1) , as n goes to ∞, so the sequence
1 kn
kn i=(k−1)n+1
ε2i Ti
converges in law to (k+√k1√k−1) T2 , as n goes to ∞. Since both terms in the above decomposition of Skn are independent, we can say that (S(k−1)n , Skn ) converges in law to (T1 , k−1 T1 + (k+√k1√k−1) T2 ), as n goes to ∞. k (c) The p-dimensional sequence (Sn , S2n , . . . , Spn ) may be written as ⎛
⎞
pn n n 2n n 1 1 1 1 1 ⎝ ε2i Ti , ε2i Ti + ε2i Ti , . . . , ε2i Ti + · · · + ε2 Ti ⎠ . n i=1 2n i=1 2n i=n+1 pn i=1 pn i=(p−1)n+1 i
jn 1 2 Since for each k ≤ p, the variables kn i=(j−1)n+1 εi Ti , j = 1, 2, . . . , k are independent and converge respectively in law to kj (j+√j1√j−1) Tj , the limit in law of (Sn , S2n , . . . , Spn ) is
⎛
⎞
p k j j 1 1 1 1 ⎝T1 , T1 + √ T2 , . . . , √ √ √ √ Tj , . . . , Tj ⎠ . 2 k (j + j j − 1) p (j + j j − 1) 2+ 2 j=1 j=1
(d) From above, we see that when n goes to ∞, S2n − Sn converges in law towards 1√ T − 12 T1 which is non-degenerate. This result prevents Sn from converging in 2+ 2 2 probability. 3. (a) The p-dimensional sequence (St1 , St2 , . . . , Stp ) may be written as ⎛
[t21 n]
2
[t1 n] 1 1 ⎜1 2 εi Ti , ε2i Ti + ⎝
n
i=1
n
i=1
[t22 n]
n i=[t2 n]+1
[t21 n]
ε2i Ti , . . . ,
1 n
1
[t2p n]
ε2i Ti + · · · +
i=1
1 n i=[t2
p−1 n]+1
For each k ≤ p, the Laplace transform ⎡
⎛
⎞⎤
[t2 n]
2 k λ2 ⎜ λ ⎟⎥ E[exp(− (Stk − Stk−1 )] = E ⎢ ε2 Ti ⎠⎦ ⎣exp ⎝− 2 2n i=[t2 n]+1 i
⎛
k−1
[t2k n]
⎞
1 λ ⎜ √ ⎟ = exp ⎝− √ ⎠ , 2 n i=[t2 n]+1 i k−1
⎞ ⎟
ε2i Ti ⎠ .
5. Convergence of random variables – solutions
171
2
λ ≥ 0, converges towards E[exp(− λ2 (Ttk − Ttk−1 )] = exp (−λ(tk − tk−1 )). It follows from the independence hypothesis that the r.v.’s (Stk − Stk−1 )1≤k≤p are independent, hence so are the increments (Ttk − Ttk−1 )1≤k≤p . In conclusion, the finite-dimensional distributions of the process (Sn (t), t ≥ 0) converge weakly towards those of a process with independent and time homogeneous increments. The increments, Ttk − Ttk−1 , of the limit process are such that (tk − tk−1 )−2 (Ttk − Ttk−1 ) has law μ. This process is the so-called stable(1/2) subordinator. (b) The increments of the process (ξn (t), t ≥ 0) at times t1 < · · · < tp are ⎛
\[
\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{[nt_1]} X_i,\ \ \frac{1}{\sqrt{n}}\sum_{i=[nt_1]+1}^{[nt_2]} X_i,\ \ \dots,\ \ \frac{1}{\sqrt{n}}\sum_{i=[nt_{p-1}]+1}^{[nt_p]} X_i\right).
\]
Put $t_0 = 0$ and call $\sigma^2$ the variance of the $X_i$'s and $\psi$ their characteristic function. For each $k = 1, \dots, p$, we can check that $\frac{1}{\sqrt{n}}\sum_{i=[nt_{k-1}]+1}^{[nt_k]} X_i$ converges in law towards a centred Gaussian variable with variance $\sigma^2(t_k - t_{k-1})$ as $n$ goes to $+\infty$. Indeed, the characteristic function of the r.v. $\frac{1}{\sqrt{n}}\sum_{i=[nt_{k-1}]+1}^{[nt_k]} X_i$ is equal to:
\[
\varphi_n(t) = \psi\left(\frac{t}{\sqrt{n}}\right)^{[nt_k]-[nt_{k-1}]}
= \left(1 - \frac{\sigma^2}{2n}\,t^2 + o\left(\frac{1}{n}\right)\right)^{[nt_k]-[nt_{k-1}]},
\qquad t \in \mathbb{R}.
\]
When $n$ tends to $+\infty$, $\varphi_n(t)$ converges towards $\exp\left(-\frac{t^2\sigma^2}{2}(t_k - t_{k-1})\right)$. Finally, note that the increments of the process $(\xi_n(t), t \ge 0)$ at times $t_1 < \cdots < t_p$ are independent. So, we conclude as follows. The finite-dimensional distributions of the process $(\xi_n(t), t \ge 0)$ converge weakly towards those of a process with independent and time homogeneous increments. The increments of the limit process at times $t_1 < \cdots < t_p$ are centred Gaussian variables with variance $\sigma^2(t_k - t_{k-1})$, $k = 2, \dots, p$. Hence, the limit process is $(\sigma B_t, t \ge 0)$, where $(B_t, t \ge 0)$ is a Brownian motion.
Solution to Exercise 5.12
1. Let $s$ and $t$ belong to the interval $[0, 1]$, and write $b^{(n)}(t) = \frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left(\mathbf{1}_{[0,t]}(U_k) - t\right)$. We have
\[
E[b^{(n)}(t)] = \frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left(P(U_k \le t) - t\right) = 0 .
\]
Suppose that $s \le t$; from the independence between the r.v.'s $U_k$, we obtain:
\[
\mathrm{Cov}[b^{(n)}(s), b^{(n)}(t)] = \frac{1}{n}\sum_{k=1}^{n}\mathrm{Cov}\left(\mathbf{1}_{[0,t]}(U_k), \mathbf{1}_{[0,s]}(U_k)\right) = s - st ,
\]
so that for any $s$ and $t$,
\[
\mathrm{Cov}[b^{(n)}(s), b^{(n)}(t)] = s \wedge t - st . \tag{5.12.a}
\]

2. Let $\lambda_i > 0$ and $t_i \in [0, 1]$, $i = 1, \dots, j$, for any integer $j \ge 1$ and write:
\[
\sum_{i=1}^{j}\lambda_i\, b^{(n)}(t_i) = \frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left(\sum_{i=1}^{j}\lambda_i \mathbf{1}_{[0,t_i]}(U_k) - \sum_{i=1}^{j}\lambda_i t_i\right). \tag{5.12.b}
\]
From (5.12.b) and the Central Limit Theorem, we see that $\sum_{i=1}^{j}\lambda_i\, b^{(n)}(t_i)$ converges in law towards a centred Gaussian variable with variance:
\[
\mathrm{Var}\left(\sum_{i=1}^{j}\lambda_i \mathbf{1}_{[0,t_i]}(U_k)\right) = \sum_{i=1}^{j}\lambda_i^2\,(t_i - t_i^2) + \sum_{1\le i\ne i'\le j}\lambda_i\lambda_{i'}\,(t_i \wedge t_{i'} - t_i t_{i'}) .
\]
It means that $(b^{(n)}(t_1), \dots, b^{(n)}(t_j))$ converges in law towards a centred Gaussian vector $(b(t_1), \dots, b(t_j))$ whose covariance matrix is given by:
\[
\mathrm{Cov}(b(t_i), b(t_{i'})) = t_i \wedge t_{i'} - t_i t_{i'} , \qquad i, i' = 1, \dots, j ,
\]
which allows us to reach our conclusion.
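As an illustration that is not part of the original solution, the covariance formula (5.12.a) can be checked by simulation. The following is only a sketch, assuming NumPy is available; the function name `empirical_process`, the seed and the sample sizes are arbitrary choices made for this example.

```python
import numpy as np

def empirical_process(uniforms, t):
    """b^(n)(t) = n^(-1/2) * sum_k (1{U_k <= t} - t), one value per row of `uniforms`."""
    n = uniforms.shape[1]
    return ((uniforms <= t).sum(axis=1) - n * t) / np.sqrt(n)

rng = np.random.default_rng(0)      # fixed seed (arbitrary), for reproducibility
n, n_samples = 200, 20_000          # illustrative sizes, not taken from the text
U = rng.random((n_samples, n))      # each row is one sample (U_1, ..., U_n)

s, t = 0.3, 0.7
bs, bt = empirical_process(U, s), empirical_process(U, t)

cov_hat = np.mean(bs * bt)          # both variables are centred
cov_theory = min(s, t) - s * t      # right-hand side of (5.12.a)
print("empirical covariance:", cov_hat)
print("s ^ t - s t         :", cov_theory)
```

The two printed values should be close, up to a Monte Carlo error of order $1/\sqrt{\texttt{n\_samples}}$; note that (5.12.a) holds for every fixed $n$, not only in the limit.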
Solution to Exercise 5.13
The Poisson process with parameter $c$, $(N_t^{(c)}, t \ge 0)$, is the unique (in law) increasing right continuous process with independent and stationary (or time homogeneous) increments, such that for each $t > 0$, $N_t^{(c)}$ has a Poisson distribution with parameter $ct$. It follows that the process $(X_t^{(c)}, t \ge 0)$ also has stationary and independent increments. Set $t_0 = 0$ and let $(t_1, \dots, t_n) \in \mathbb{R}_+^n$ be such that $t_1 < t_2 < \cdots < t_n$, then the r.v.'s $(X_{t_1}^{(c)}, X_{t_2}^{(c)} - X_{t_1}^{(c)}, \dots, X_{t_n}^{(c)} - X_{t_{n-1}}^{(c)})$ are independent and for each $k = 1, \dots, n$, $X_{t_k}^{(c)} - X_{t_{k-1}}^{(c)}$ has the same distribution as $X_{t_k - t_{k-1}}^{(c)}$. Let us compute the characteristic function of $\frac{1}{\sqrt{c}}X_{t_k - t_{k-1}}^{(c)}$. For any $\lambda \in \mathbb{R}$, we have:
\[
E\left[\exp\left(\frac{i\lambda}{\sqrt{c}}\,X_{t_k - t_{k-1}}^{(c)}\right)\right]
= \exp\left(-i\lambda(t_k - t_{k-1})\sqrt{c} - c(t_k - t_{k-1})\right)\sum_{n=0}^{\infty}\frac{(c(t_k - t_{k-1}))^n}{n!}\,e^{\frac{i\lambda}{\sqrt{c}}n}
\]
\[
= \exp\left(-c(t_k - t_{k-1})\left(1 - e^{\frac{i\lambda}{\sqrt{c}}}\right) - i\lambda(t_k - t_{k-1})\sqrt{c}\right).
\]
When $c \to \infty$, the latter expression converges towards $\exp\left(-\frac{\lambda^2}{2}(t_k - t_{k-1})\right)$, which is the characteristic function of a centred Gaussian variable with variance $t_k - t_{k-1}$. This shows that, when $c \to \infty$, the random vector $\frac{1}{\sqrt{c}}(X_{t_1}^{(c)}, X_{t_2}^{(c)} - X_{t_1}^{(c)}, \dots, X_{t_n}^{(c)} - X_{t_{n-1}}^{(c)})$ converges in law towards a random vector whose coordinates are independent centred Gaussian r.v.s with variances $t_k - t_{k-1}$, $k = 1, \dots, n$. So, we conclude that the random vector $\frac{1}{\sqrt{c}}(X_{t_1}^{(c)}, \dots, X_{t_n}^{(c)})$ converges in law towards $(\beta_{t_1}, \dots, \beta_{t_n})$, as $c \to \infty$, where $(\beta_t, t \ge 0)$ is a centred Gaussian process which has stationary and independent increments. The latter property entirely characterizes the process $(\beta_t, t \ge 0)$, which is known as Brownian motion and has already been encountered in question 3 (b) of Exercise 5.8. From above, one may easily check that the covariance of Brownian motion is given by:
\[
E[\beta_s \beta_t] = s \wedge t , \qquad s, t \ge 0 .
\]
Comments on the solution: As an additional exercise, one could check that the process $(N_t^{(c)}, t \ge 0)$ may be constructed as follows. Let $(X_k)_{k\ge1}$ be a sequence of independent exponential r.v.s with parameter $c$. Set $S_n = \sum_{k=1}^{n} X_k$, $n \ge 1$, then
\[
(N_t^{(c)}, t \ge 0) \overset{\text{(law)}}{=} \left(\sum_{n=1}^{\infty}\mathbf{1}_{\{S_n \le t\}},\ t \ge 0\right).
\]
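The convergence of characteristic functions used in the solution to Exercise 5.13 can also be observed numerically. The sketch below evaluates the closed-form characteristic function obtained above, under the assumption (suggested by that computation) that $X_t^{(c)} = N_t^{(c)} - ct$, and compares it with the Gaussian limit; the function names and the values of $\lambda$, $t$ and $c$ are illustrative choices only.

```python
import cmath
import math

def cf_normalised_poisson(lam, t, c):
    """Characteristic function at `lam` of (N_t - c*t)/sqrt(c), where N_t ~ Poisson(c*t)."""
    return cmath.exp(-c * t * (1 - cmath.exp(1j * lam / math.sqrt(c)))
                     - 1j * lam * t * math.sqrt(c))

def cf_gaussian(lam, t):
    """Characteristic function at `lam` of a centred Gaussian variable with variance t."""
    return math.exp(-lam ** 2 * t / 2)

lam, t = 1.5, 0.8                                  # illustrative values
errors = []
for c in (1e1, 1e3, 1e5):
    err = abs(cf_normalised_poisson(lam, t, c) - cf_gaussian(lam, t))
    errors.append(err)
    print(f"c = {c:>8.0f}   |difference| = {err:.2e}")
```

A Taylor expansion of the exponent suggests that the difference decays like $c^{-1/2}$, which is what the printed values should reflect.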
Solution to Exercise 5.14
1. (a) This follows immediately from the Gaussian character of Brownian motion, and the fact that $E[b(u)B_1] = u - u = 0$.

(b) It suffices to write $b(1-u) = B_{1-u} - (1-u)B_1 = uB_1 - (B_1 - B_{1-u})$, and to use the fact that $(-(B_1 - B_{1-u}), 0 \le u \le 1)$ is a Brownian motion.

2. We first write $\frac{1}{\sqrt{c}}\,b(cu)$, $0 \le u \le \frac{1}{c}$, and $\frac{1}{\sqrt{c}}\,b(1-cu)$, $0 \le u \le \frac{1}{c}$, in terms of $B$:
\[
\frac{1}{\sqrt{c}}\,b(cu) = \frac{1}{\sqrt{c}}\,B_{cu} - \sqrt{c}\,u\,B_1 ,
\]
\[
\frac{1}{\sqrt{c}}\,b(1-cu) = \frac{1}{\sqrt{c}}\left(B_{1-cu} - (1-cu)B_1\right) ,
\]
and it follows that this question may be reduced to showing the following: for fixed $U, S, T > 0$, the process
\[
\left((B_u, 0 \le u \le U);\ \left(\frac{1}{\sqrt{c}}\,B_{cu}, 0 \le u \le S\right);\ \left(\frac{1}{\sqrt{c}}\,(B_{1-cu} - B_1), 0 \le u \le T\right)\right) \tag{5.14.a}
\]
converges in law towards $(B, \beta, \beta')$, a three-dimensional Brownian motion. This follows from the fact that the mixed covariance function $E\left[B_u\,\frac{1}{\sqrt{c}}\,B_{cs}\right]$ converges towards 0 as $c$ goes to 0, and that each of the three families which constitute (5.14.a) is constant in law (i.e. its law does not depend on $c$).

3. By writing $\sqrt{\lambda}\,b(e^{-t/\lambda}) = \sqrt{\lambda}\,b(1 - (1 - e^{-t/\lambda}))$ and using the time reversal property of the Brownian bridge, it suffices to show that $\sqrt{\lambda}\,b(1 - e^{-t/\lambda})$ converges in law towards Brownian motion. Now, using the representation of $b$ as $b(u) = B_u - uB_1$, $u \le 1$, we can replace $b$ in $\sqrt{\lambda}\,b(1 - e^{-t/\lambda})$ by $B$, a Brownian motion. Finally, we can restrict ourselves to $t \le T$, for $T$ fixed and then, with the help of the $(\frac{1}{2} - \varepsilon)$ Hölder property of the Brownian trajectories $(B_u, u \le T)$, we have
\[
\sup_{t \le T}\sqrt{\lambda}\,\left|B\left(1 - e^{-t/\lambda}\right) - B\left(\frac{t}{\lambda}\right)\right| \xrightarrow{\text{(a.s.)}} 0 ,
\]
as $\lambda \to \infty$. Now the result follows from the scaling property.
Solution to Exercise 5.15
Let $(S_t)_{t\ge0}$ be the stable process which satisfies $S_1 \overset{\text{(law)}}{=} X_1$, that is a continuous time càdlàg (i.e. right continuous and left limited) process with independent and stationary increments satisfying the scaling property
\[
t^{-\frac{1}{\alpha}}\,S_t \overset{\text{(law)}}{=} S_1 \overset{\text{(law)}}{=} X_1 . \tag{5.15.a}
\]
The process $Z$ defined from $S$ by the Lamperti transformation $Z_t = e^{-\frac{t}{\alpha}}\,S_{e^t}$ is stationary, i.e. for any fixed $s$,
\[
(Z_{t+s}, t \ge 0) = \left(e^{-\frac{t+s}{\alpha}}\,S_{e^{t+s}}, t \ge 0\right) \overset{\text{(law)}}{=} \left(e^{-\frac{t}{\alpha}}\,S_{e^{t}}, t \ge 0\right) = (Z_t, t \ge 0) .
\]
Moreover, $S$ has independent increments. Hence the tail $\sigma$-field $\bigcap_t \sigma\{Z_{t+s}, s \ge t\}$ is trivial and from the Ergodic Theorem,
\[
\frac{1}{t}\int_0^t f(Z_u)\,du \to E[f(Z_1)] , \quad \text{a.s., as } t \to \infty.
\]
From the change of variables $v = e^u$ and the scaling property of $S$, we obtain:
\[
\frac{1}{\log t}\int_1^t f\left(v^{-\frac{1}{\alpha}}\,S_v\right)\frac{dv}{v} \to E[f(S_1)] , \quad \text{a.s., as } t \to \infty.
\]
Finally, the result in discrete time easily follows from the above.

Comments on the solution: Lamperti's transformation is a one-to-one transformation between the class of Lévy processes (i.e. processes with stationary and independent increments) and semistable Markov processes (i.e. non-negative Markov processes satisfying a scaling property such as (5.15.a)). This transformation (and more on semistable processes) is presented in:
P. Embrechts and M. Maejima: Self-similar Processes. Princeton University Press, 2002.
Chapter 6 Random processes
What is a random process?

In this chapter, we shall consider on a probability space $(\Omega, \mathcal{F}, P)$, random (or stochastic) processes $(X_t, t \ge 0)$ indexed by $t \in \mathbb{R}_+$, that is, mappings $(t, \omega) \mapsto X_t(\omega)$ assumed to be measurable with respect to $\mathcal{B}_{\mathbb{R}_+} \otimes \mathcal{F}$, and valued in $\mathbb{R}$ (or $\mathbb{C}$, or $\mathbb{R}^d$). In the early stages of the development of stochastic processes (e.g. the 1950s, see Doob's book [17]), it was a major question to decide whether, given finite-dimensional marginals $\{\mu_{t_1,\dots,t_k}\}$, there existed a (measurable) process $(X_t, t \ge 0)$ with these marginals. Kolmogorov's existence theorem (for a projective family of marginals; see, e.g., Neveu [43]) together with Kolmogorov's continuity criterion
\[
E[|X_t - X_s|^p] \le C|t - s|^{1+\varepsilon} , \qquad t, s \ge 0 ,
\]
for some $p > 0$, $\varepsilon > 0$, and $C < \infty$, which bears only on the two-dimensional marginals, often allows us to obtain at once realizations with continuous paths. It has also been established (Meyer [40]) that on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$, with $(\mathcal{F}_t)_{t\ge0}$ right continuous and $(\mathcal{F}, P)$ complete, a supermartingale admits a "càdlàg" (i.e. right continuous, with left limits) version. As a consequence, many processes admit a càdlàg version, and their laws may be constructed on the corresponding canonical space of càdlàg functions (often endowed with the Skorokhod topology).

Once the (fundamental) question of an adequate realization is solved, the probabilistic study of a random process often bears upon its Markov (or lack of Markov!) property, as well as upon all kinds of behavior (either local or global), and also upon various transformations of one process into another.
Basic facts about Brownian motion
In this chapter, we draw more directly on research papers on Brownian motion (and more generally stochastic processes). Thus, the reader will see how some of the random objects (random variables, etc.) used in the previous chapters appear naturally in the Brownian context. Also, a few times, we have given fewer details in the proofs, and have simply referred the reader to the original papers.

Among continuous time stochastic processes, i.e. families of random variables $(X_t)$ indexed by $t \in \mathbb{R}_+$, Brownian motion, often denoted $(B_t, t \ge 0)$, is one of the most remarkable:

(i) It is a Gaussian process: for any $k$-tuple $t_1 < t_2 < \dots < t_k$, the random vector $(B_{t_1}, B_{t_2}, \dots, B_{t_k})$ is Gaussian, centred, with: $E[B_{t_i}B_{t_j}] = t_i \wedge t_j$.

(ii) It has independent and stationary increments: for every $t_1 < t_2 < \dots < t_k$, $B_{t_1}, B_{t_2} - B_{t_1}, \dots, B_{t_k} - B_{t_{k-1}}$ are independent, and $B_{t_i} - B_{t_{i-1}}$ is distributed as $B_{(t_i - t_{i-1})}$.

(iii) Almost surely, the path $t \mapsto B_t$ is continuous; more precisely, it is locally Hölder of order $(\frac{1}{2} - \varepsilon)$, for every $\varepsilon \in (0, \frac{1}{2})$.

(iv) It is a self-similar process of order $\frac{1}{2}$, i.e. for any $c > 0$,
\[
(B_{ct}, t \ge 0) \overset{\text{(law)}}{=} (\sqrt{c}\,B_t, t \ge 0) .
\]
(Note that the previous chapter just ended with a reference to self-similar processes.)

(v) It is the unique continuous martingale $(M_t, t \ge 0)$ such that: for any $s < t$, $E[(M_t - M_s)^2 \mid \mathcal{F}_s] = t - s$, a famous result due to Lévy. (For clarity, assume that $(\mathcal{F}_t)$ is the natural filtration associated with $(M_t)$.)

Note also that property (i) on the one hand, and properties (ii) and (iii) on the other hand, characterize Brownian motion. These facts are presented in every book dealing with Brownian motion, e.g. Karatzas and Shreve [34], Revuz and Yor [51] and Rogers and Williams [53], etc.
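Properties (i), (ii) and (iv) above can be illustrated by simulation. The sketch below is an addition to the text, assuming NumPy is available: it builds Brownian paths on a regular grid from independent Gaussian increments and compares a few empirical second moments with $t_i \wedge t_j$, $t - s$ and the scaling relation. The grid, the seed, the sample size and the helper `at` are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)                 # fixed seed (arbitrary)
n_paths, n_steps, horizon = 20_000, 200, 1.0   # illustrative sizes
dt = horizon / n_steps

# Independent N(0, dt) increments; their cumulative sums give B on the grid.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(increments, axis=1)              # B[:, j] is B at time (j + 1) * dt

def at(time):
    """Values of all simulated paths at a grid time (hypothetical helper)."""
    return B[:, int(round(time / dt)) - 1]

s, t, c = 0.25, 0.75, 0.5
cov_hat = np.mean(at(s) * at(t))               # (i):  E[B_s B_t] = s ^ t
var_incr = np.var(at(t) - at(s))               # (ii): Var(B_t - B_s) = t - s
ratio = np.var(at(c * t)) / np.var(at(t))      # (iv): Var(B_ct) = c Var(B_t)

print("E[B_s B_t]      ~", cov_hat, " expected", min(s, t))
print("Var(B_t - B_s)  ~", var_incr, " expected", t - s)
print("Var(B_ct)/Var(B_t) ~", ratio, " expected", c)
```

Such a check only probes second moments at a few times; it does not test the path regularity (iii) or the martingale characterization (v).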
A discussion of random processes with respect to filtrations
The basic notions of a constant (respectively increasing, decreasing, affine, differentiable, convex, etc.) function $f$ (which we assume here to be defined on $\mathbb{R}_+$, and taking values in $\mathbb{R}$) have very interesting analogues which apply to stochastic processes $(X_t)_{t\ge0}$ adapted to an increasing family of $\sigma$-fields $(\mathcal{F}_t)_{t\ge0}$:

• an $(\mathcal{F}_t)$ conditionally constant process (for $s \le t$, $E[X_t \mid \mathcal{F}_s] = X_s$) is an $(\mathcal{F}_t)$ martingale;
• an $(\mathcal{F}_t)$ conditionally increasing process (for $s \le t$, $E[X_t \mid \mathcal{F}_s] \ge X_s$) is an $(\mathcal{F}_t)$ submartingale;
• conditionally affine processes are studied in Exercise 6.19; their probabilistic name is harnesses;
• conditionally convex processes have been shown to play some interesting role in Mathematical Finance. See:
P. Bank and N. El Karoui: A stochastic representation theorem with applications to optimization and obstacle problems. Ann. of Prob., vol. 32, no. 1B, 1030–1067 (2004).

A fundamental result of Doob–Meyer is that every conditionally increasing process (i.e. submartingale) is the sum of a conditionally constant process (i.e. martingale) and an increasing process.
* 6.1 Solving a particular SDE

Let $(B_t, t \ge 0)$ denote a one-dimensional Brownian motion. Prove that the stochastic differential equation (SDE):
\[
X_t = x + \int_0^t \sqrt{1 + X_s^2}\;dB_s + \frac{1}{2}\int_0^t X_s\,ds \tag{6.1.1}
\]
admits a unique solution, for every $x \in \mathbb{R}$. Fix $x$. Show that the solution is given by: $X_t = \varphi(B_t)$, $t \ge 0$, where $\varphi : \mathbb{R} \to \mathbb{R}$ is a $C^2$ function. Compute $\varphi$.

Comments and references: A list of explicitly solvable SDEs is provided in:
P.E. Kloeden and E. Platen: Numerical Solutions of Stochastic Differential Equations. Applications of Mathematics, vol. 23, Springer, 1992.

** 6.2 The range process of Brownian motion
Let $(B_t, t \ge 0)$ be a real valued Brownian motion, starting from 0, and $(\mathcal{F}_t, t \ge 0)$ be its natural filtration. Define: $S_t = \sup_{s\le t} B_s$, $I_t = \inf_{s\le t} B_s$, and, for $c > 0$, $\theta_c = \inf\{t : S_t - I_t = c\}$.

1. Show that, for any $\lambda \in \mathbb{R}$,
\[
M_t \equiv \cosh\left(\lambda(S_t - B_t)\right)\exp\left(-\frac{\lambda^2 t}{2}\right)
\]
is an $(\mathcal{F}_t)$ martingale.

2. Prove the formula:
\[
E\left[\exp\left(-\frac{\lambda^2}{2}\,\theta_c\right)\right] = \frac{2}{1 + \cosh(\lambda c)} \equiv \frac{1}{\cosh^2\left(\lambda\frac{c}{2}\right)} .
\]

Comments and references: The range process $(S_t - I_t, t \ge 0)$ of Brownian motion is studied in:
J.P. Imhof: On the range of Brownian motion and its inverse process. Ann. of Prob., 13, 3, 1011–1017 (1985)
P. Vallois: Amplitude du mouvement brownien et juxtaposition des excursions positives et négatives. Séminaire de Probabilités, XXVI, Lecture Notes in Mathematics, 1526, 361–373, Springer, Berlin, 1992
P. Vallois: Decomposing the Brownian path via the range process. Stochastic Process. Appl., 55, no. 2, 211–226 (1995).
* 6.3 Symmetric Lévy processes reflected at their minimum and maximum; E. Csáki's formulae for the ratio of Brownian extremes

1. Let $X = (X_t, t \ge 0)$ be a symmetric Lévy process and define its past minimum and past maximum processes as $I_t = \inf_{s\le t} X_s$ and $S_t = \sup_{s\le t} X_s$. Prove the following identity in law, for a fixed $t$:
\[
(S_t - X_t,\ X_t - I_t,\ X_t) \overset{\text{(law)}}{=} (-I_t,\ S_t,\ X_t) \overset{\text{(law)}}{=} (S_t,\ -I_t,\ -X_t) . \tag{6.3.1}
\]
Hint: Use the time reversal property of Lévy processes, that is
\[
(X_t - X_{(t-u)-},\ u \le t) \overset{\text{(law)}}{=} (X_u,\ u \le t) . \tag{6.3.2}
\]

2. As an application, prove that
\[
\frac{S_t - X_t}{S_t - I_t} \overset{\text{(law)}}{=} \frac{S_t}{S_t - I_t} .
\]

3. In this question, $X$ denotes a standard Brownian motion and $T$ an exponential time with parameter $\frac{\theta^2}{2}$, independent of $X$. Prove that, for $x, y \ge 0$,
\[
P(S_T \le x,\ -I_T \le y,\ X_T \in da) = \frac{\theta}{2}\left(e^{-\theta|a|} - \frac{\sinh(\theta y)}{\sinh(\theta(x+y))}\,e^{-\theta|a-x|} - \frac{\sinh(\theta x)}{\sinh(\theta(x+y))}\,e^{-\theta|a+y|}\right)da \tag{6.3.3}
\]
\[
P(S_T - X_T \le x,\ -I_T + X_T \le y) = P(S_T \le x,\ -I_T \le y) = 1 - \frac{\sinh(\theta x) + \sinh(\theta y)}{\sinh(\theta(x+y))} . \tag{6.3.4}
\]
As an application, prove that the distribution function of the ratio $\frac{S_1}{S_1 - I_1}$ admits both the following continuous integral and discrete representations:
\[
P\left(\frac{S_1}{S_1 - I_1} \le a\right) = \frac{1-a}{2}\int_0^\infty dx\,\frac{\sinh(ax)}{\cosh^2\left(\frac{x}{2}\right)} \tag{6.3.5}
\]
\[
= (1-a)\left(2a\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n+a} + \frac{\pi a}{\sin(\pi a)} - 1\right) . \tag{6.3.6}
\]

4. Check that these formulae agree with Csáki's formulae for the same quantity:
\[
P\left(\frac{S_1}{S_1 - I_1} \le a\right) = 2a(1-a)\sum_{n=1}^{\infty}\frac{(-1)^{n-1}\,n}{n^2 - a^2} \tag{6.3.7}
\]
\[
= a(1-a)\left(\psi\left(1 + \frac{a}{2}\right) - \psi\left(\frac{1}{2} + \frac{a}{2}\right) + \frac{\pi}{\sin(\pi a)} - \frac{1}{a}\right) , \tag{6.3.8}
\]
where $\psi(x) = \frac{d}{dx}\left(\log\Gamma(x)\right)$ is the digamma function.
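Before attempting question 4, the agreement between the representations can be observed numerically. The sketch below is an addition to the text; it reads (6.3.5) as $\frac{1-a}{2}\int_0^\infty \sinh(ax)\cosh^{-2}(x/2)\,dx$ and (6.3.7) as $2a(1-a)\sum_{n\ge1}(-1)^{n-1}n/(n^2-a^2)$, for $0 < a < 1$. The function names, the truncation level and the step counts are arbitrary choices made for this example.

```python
import math

def csaki_series(a, n_terms=20_000):
    """2a(1-a) * sum_{n>=1} (-1)^(n-1) n / (n^2 - a^2), as in (6.3.7)."""
    partial, previous = 0.0, 0.0
    for n in range(1, n_terms + 1):
        previous = partial
        partial += (-1) ** (n - 1) * n / (n * n - a * a)
    # The series is alternating and converges slowly: averaging two
    # consecutive partial sums gives a much better approximation.
    return 2 * a * (1 - a) * 0.5 * (partial + previous)

def integral_form(a, upper=60.0, n_steps=200_000):
    """(1-a)/2 * int_0^upper sinh(a x) / cosh(x/2)^2 dx, by the midpoint rule."""
    h = upper / n_steps
    total = 0.0
    for k in range(n_steps):
        x = (k + 0.5) * h
        total += math.sinh(a * x) / math.cosh(x / 2) ** 2
    return 0.5 * (1 - a) * h * total

for a in (0.25, 0.5, 0.75):
    print(a, csaki_series(a), integral_form(a))
```

By the symmetry in (6.3.1), the ratio and one minus the ratio have the same law, so both expressions should be close to $1/2$ at $a = 1/2$; this gives an independent check of the reading of the formulae used here.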
Comments and references: Formulae (6.3.7) and (6.3.8) are due to E. Csáki (reference below), who deduced them from the classical theta series expansion for the joint law of $(S_t, I_t, X_t)$ (see e.g. Revuz–Yor [51], p. 111, Exercise (3.15)). In the same paper, Csáki also obtained variants of these formulae for the Brownian bridge; these have been interpreted via some path decomposition by Pitman and Yor.
E. Csáki: On some distributions concerning maximum and minimum of a Wiener process. In: Colloq. Math. Soc. János Bolyai, 21, North-Holland, Amsterdam-New York, 43–52, (1979).
J. Pitman and M. Yor: Path decompositions of a Brownian bridge related to the ratio of its maximum and amplitude. Studia Sci. Math. Hungar., 35, no. 3–4, 457–474 (1999).

** 6.4 A toy example for Westwater's renormalization
Let $(B_t, t \ge 0)$ be a complex valued BM, starting from 0.

1. Consider, for every $\lambda > 0$, and $z \in \mathbb{C} \setminus \{0\}$, the equation
\[
Z_t = z + B_t + \lambda\int_0^t \frac{Z_s\,ds}{|Z_s|^2} . \tag{6.4.1}
\]
Identify the process $(|Z_t|, t \ge 0)$. Show the representation:
\[
Z_t = |Z_t|\exp(i\gamma_{H_t}) , \quad \text{where } H_t = \int_0^t \frac{ds}{|Z_s|^2} , \tag{6.4.2}
\]
and $\gamma$ is a process which is independent of $|Z|$. Identify $\gamma$.

2. Define $P_z^\lambda$ to be the law on $C(\mathbb{R}_+, \mathbb{C})$ of the process $Z$, which is the solution of (6.4.1). Show that the family $\left(P_z^\lambda;\ z \in \mathbb{C} \setminus \{0\}\right)$ may be extended by weak continuity to the entire complex plane $\mathbb{C}$. We denote by $P_0^\lambda$ the limit law of $P_z^\lambda$, as $z \to 0$. We still denote by $(Z_t)$ the coordinate process on $C(\mathbb{R}_+, \mathbb{C})$, and $\mathcal{Z}_t = \sigma\{Z_s, s \le t\}$. Show that, for every $z \ne 0$, and every $t > 0$, $P_z^\lambda$ and $P_z^0$ are equivalent on the $\sigma$-field $\mathcal{Z}_t$, and identify the Radon–Nikodym density $dP_z^\lambda / dP_z^0$.

3. Show that, for every continuous, bounded functional $F : C([0, 1], \mathbb{C}) \to \mathbb{R}$, the quantity
\[
\frac{1}{\Lambda_z}\,E_z^0\left[F(Z)\exp\left(-\frac{\lambda^2}{2}\int_0^1\frac{ds}{|Z_s|^2}\right)\right] ,
\quad\text{where}\quad
\Lambda_z = E_z^0\left[\exp\left(-\frac{\lambda^2}{2}\int_0^1\frac{ds}{|Z_s|^2}\right)\right] ,
\]
converges, as $z \to 0$, but $z \ne 0$. Describe the limit in terms of $P_0^\lambda$.
4. Let $(R_t, t \ge 0)$ be a Bessel process with dimension $d > 2$, starting from 1. Define the process $(\varphi_u, u \ge 0)$ via the equation:
\[
\log R_t = \varphi_{H_t} , \quad \text{where } H_t = \int_0^t \frac{ds}{R_s^2} .
\]
(a) Identify the process $(\varphi_u, u \ge 0)$.

(b) Prove that $\frac{1}{\log t}\,H_t$ converges in probability towards a constant which should be computed.

(c) Prove the same result with the help of the Ergodic Theorem by considering:
\[
\int_1^{a^n}\frac{ds}{R_s^2} , \quad \text{as } n \to \infty, \text{ for some fixed } a > 1.
\]
(d) Assume now that $(R_t, t \ge 0)$ is starting from 0. Show that:
\[
\frac{1}{\log\frac{1}{\varepsilon}}\int_\varepsilon^1\frac{ds}{R_s^2}
\]
converges in probability as $\varepsilon \to 0$.
(e) Show the convergence in law, as $\varepsilon \to 0$, and identify the limit of $\left(\log\frac{1}{\varepsilon}\right)^{-1/2}\theta_{(\varepsilon,1)}$, where $\theta_{(\varepsilon,1)}$ is the increment between $\varepsilon$ and 1 of a continuous determination of the angle around 0 made by the process $Z$, under the law $P_0^\lambda$.

Comments and references: This is a "toy example", the aim of which is to provide some insight into (or, at least, some simple analogue for!) Westwater's renormalization result, which we now discuss briefly. First, consider a two-dimensional Brownian motion $(B_t, t \ge 0)$ and let $f_n(z) = n^2 f(nz)$, where $f : \mathbb{R}^2 (\simeq \mathbb{C}) \to \mathbb{R}_+$ is a continuous function with compact support, such that $\iint dx\,dy\,f(x, y) = 1$. It was proved originally by S. Varadhan (1969) [this result has since been reproved and extended in many ways, by J.F. Le Gall and J. Rosen in particular] that, if we denote $\{X\} = X - E(X)$ for a generic integrable variable $X$, the sequence:
\[
\int_0^1 ds\int_0^1 dt\,\{f_n(B_t - B_s)\}
\]
converges a.s. and in every $L^p$, as $n \to \infty$, towards what is now called the renormalized local time of intersection $\gamma$. Moreover, for any $k \in \mathbb{R}_+$, the sequence of probabilities:
\[
W_n^{(k)} \overset{\text{(def)}}{=} c_n\exp\left(-k\int_0^1 ds\int_0^1 dt\,\{f_n(B_t - B_s)\}\right)\cdot P|_{\sigma(B_s, s\le1)}
\]
converges weakly to $c\exp(-k\gamma)\cdot P|_{\sigma(B_s, s\le1)}$.

Westwater's renormalization result is that for the similar problem involving now the three-dimensional Brownian motion $(B_s, s \le 1)$, the sequence $(W_n^{(k)}, n \ge 1)$ converges weakly towards a probability $W^{(k)}$, which is singular w.r.t. $W^{(0)} \equiv P|_{\sigma(B_s, s\le1)}$. In other words, Varadhan's $L^p$-convergence result does not extend to three-dimensional Brownian motion, although a weak convergence result holds. Further studies were made by S. Kusuoka, showing that, under $W^{(k)}$, the reference process still has double intersections. Thus, both for $d = 2$ and $d = 3$, the objective (central in Constructive Quantum Field Theory) to construct in this way some kind of self avoiding Brownian motion could not be reached. Intensive research on this subject is presently being done, by G. Lawler, O. Schramm, W. Werner, using completely different ideas (i.e. a stochastic version of Loewner's equation on the complex plane).

** 6.5 Some asymptotic laws of planar Brownian motion

1. Let $(\gamma_t, t \ge 0)$ be a real-valued Brownian motion, starting from 0.

(a) Let $c \in \mathbb{R}$. Decompose the process $(e^{ic\gamma_u}, u \ge 0)$ as the sum of a continuous martingale, and a continuous process with bounded variation.

(b) Show that the continuous process:
\[
\left(\gamma_u,\ \int_0^u d\gamma_s\exp(ic\gamma_s);\ u \ge 0\right) ,
\]
which takes its values in $\mathbb{R} \times \mathbb{C}$, converges in law as $c \to \infty$ towards
\[
\left(\gamma_u;\ \frac{1}{\sqrt{2}}\,(\gamma'_u + i\gamma''_u);\ u \ge 0\right) ,
\]
where $\gamma, \gamma', \gamma''$ are three real-valued independent Brownian motions, starting from 0.

2. Let $(Z_t, t \ge 0)$ be a complex-valued Brownian motion, starting from $Z_0 = 1$. Show that, as $t \to \infty$,
\[
\frac{1}{\log t}\int_0^t\frac{Z_s\,ds}{|Z_s|^3}
\]
converges in law towards a limit, the law of which will be described.

Comments and references: A much more complete picture of limit laws for planar Brownian motion is given in:
J.W. Pitman and M. Yor: Further asymptotic laws of planar Brownian motion, Ann. Prob., 17, (3), 965–1011 (1989).
See also D. Revuz and M. Yor [51], Chapter XIII and
Y. Hu and M. Yor: Asymptotic studies of Brownian functionals. In: Proceedings of the Random Walks conference held in Budapest (1998), 187–217, Bolyai Soc. Math. Stud., 9, János Bolyai Math. Soc., Budapest, 1999.
The following articles on asymptotic distributions and strong approximations for diffusions are also recommended:
A. Földes: Asymptotic independence and strong approximation. A survey. In: Endre Csáki 65. Period. Math. Hungar., 41, no. 1–2, 121–147 (2000)
E. Csáki, A. Földes and Y. Hu: Strong approximations of additive functionals of a planar Brownian motion. Prépublication 779, Laboratoire de Probabilités et Modèles Aléatoires (2002).
** 6.6 Windings of the three-dimensional Brownian motion around a line

Let $B = (X, Y, Z)$ be a Brownian motion in $\mathbb{R}^3$, such that $B_0 \notin D \equiv \{x = y = 0\}$. Let $(\theta_t, t \ge 0)$ denote a continuous determination of the winding of $B$ around $D$, which may be defined by taking a continuous determination of the argument of the planar Brownian motion $(X_u + iY_u, u \le t)$ around $0 = 0 + i0$. To every Borel function $f : \mathbb{R}_+ \to \mathbb{R}_+$, we associate the volume of revolution
\[
\Gamma^f = \left\{(x, y, z) : (x^2 + y^2)^{1/2} \le f(|z|)\right\} \subset \mathbb{R}^3 .
\]
Denote $\theta_t^f = \int_0^t d\theta_s\,\mathbf{1}_{(B_s \in \Gamma^f)}$.

1. Show that, if $\frac{\log f(\lambda)}{\log\lambda} \to a$ as $\lambda \to \infty$, then
\[
\frac{2\theta_t^f}{\log t} \xrightarrow[t\to\infty]{\text{(law)}} \int_0^\sigma d\gamma_u\,\mathbf{1}_{(\beta_u \le aS_u)} ,
\]
where $(\beta, \gamma)$ is a Brownian motion in $\mathbb{R}^2$, starting from 0, $S_u = \sup_{s\le u}\beta_s$, and $\sigma = \inf\{u : \beta_u = 1\}$.

2. Show, by a simple extension of the preceding result, that, if $a > 1$, then:
\[
\frac{1}{\log t}\,(\theta_t - \theta_t^f) \xrightarrow[t\to\infty]{(P)} 0 .
\]
Show, under the same hypothesis, that, in fact: $\theta_t - \theta_t^f$ converges a.s., as $t \to \infty$.
Comments and references: The aim of this exercise is to explain how the wanderings of $B$ in different surfaces of revolution in $\mathbb{R}^3$, around $D$, contribute to the asymptotic windings: $\frac{2\theta_t}{\log t} \xrightarrow{\text{(law)}} \gamma_\sigma$, as $t \to \infty$. For a full proof and motivations, see:
J.F. Le Gall and M. Yor: Enlacements du mouvement brownien autour des courbes de l'espace. Trans. Amer. Math. Soc., 317, 687–722 (1990).
** 6.7 Cyclic exchangeability property and uniform law related to the Brownian bridge

Let $\{b_t, 0 \le t \le 1\}$ be the standard Brownian bridge and $\mathcal{F}$ the $\sigma$-field it generates. Define the family of transformations $\Theta_u$, $u \in [0, 1]$, acting on the paths of $b$ as follows:
\[
\Theta_u(b)_t = \begin{cases} b_{t+u} - b_u , & \text{if } t < 1 - u , \\ b_{t-(1-u)} - b_u , & \text{if } 1 - u \le t \le 1 . \end{cases}
\]
This transformation consists in re-ordering the paths $\{b_t, 0 \le t \le u\}$ and $\{b_t, u \le t \le 1\}$ in such a way that the new process is continuous and vanishes at times 0 and 1. For a better understanding of the sequel, it is worth drawing a picture.

1. Prove the cyclic exchangeability property for the Brownian bridge, that is:
\[
\Theta_u(b) \overset{\text{(law)}}{=} b , \quad \text{for any } u \in [0, 1] .
\]
Let $\mathcal{I} (\subset \mathcal{F})$ be the sub-$\sigma$-field of the events invariant under the transformations $\Theta_u$, $u \in [0, 1]$, that is: for any given $u \in [0, 1]$,
\[
A \in \mathcal{I} , \quad \text{if and only if } \mathbf{1}_A(b) = \mathbf{1}_A \circ \Theta_u(b) , \quad P\text{-a.s.}
\]
An example of a non trivial $\mathcal{I}$-measurable r.v. is given by the amplitude of the bridge $b$, i.e. $\sup_{0\le u\le1} b_u - \inf_{0\le u\le1} b_u$.

2. Prove that for any functional $F(b) \in L^1(P)$,
\[
E[F(b) \mid \mathcal{I}] = \int_0^1 du\,F \circ \Theta_u(b) . \tag{6.7.1}
\]
Hint: Prove that for every $u, v \in [0, 1]$, $\Theta_u\Theta_v = \Theta_{\{u+v\}}$, where $\{x\}$ is the fractional part of $x$.

3. Let $m$ be the time at which $b$ reaches its absolute minimum. It can be shown that $m$ is almost surely unique. Hence, $m = \inf\{t : b_t = \inf_{s\in[0,1]} b_s\}$. Prove that $m$ is uniformly distributed and is independent of the invariant $\sigma$-field $\mathcal{I}$.
Hint: First note that $\{m \circ \Theta_u + u\} = m$, a.s.

4. Let $A_0 = \int_0^1 du\,\mathbf{1}_{\{b_u \le 0\}}$ be the time that $b$ spends under the level 0. Prove that $A_0$ is uniformly distributed and independent of the invariant $\sigma$-field $\mathcal{I}$.

Comments and references: The uniform law for the minimum time and the time spent in $\mathbb{R}_+$ (or $\mathbb{R}_-$) by the Brownian bridge has about the same history as the arcsine law for the same functionals of Brownian motion and goes back to
P. Lévy: Sur certains processus stochastiques homogènes. Compositio Math., 7, 283–339 (1939).
By now, many proofs and extensions of this uniform law are known; those which are presented in this exercise are drawn from:
L. Chaumont, D.G. Hobson and M. Yor: Some consequences of the cyclic exchangeability property for exponential functionals of Lévy processes. Séminaire de Probabilités, XXXV, Lecture Notes in Mathematics, 1755, 334–347, Springer, Berlin, 2001.
The identity in law between the time at which the process reaches its absolute minimum and the time it spends under the level 0 is actually satisfied by any process with exchangeable increments, as shown in
F.B. Knight: The uniform law for exchangeable and Lévy process bridges. Hommage à P.A. Meyer et J. Neveu. Astérisque, 236, 171–188 (1996)
L. Chaumont: A path transformation and its applications to fluctuation theory. J. London Math. Soc., (2), 59, no. 2, 729–741 (1999).
This property admits an analogous version in discrete time and may be obtained as a consequence of fluctuation identities as first noticed in
E. Sparre-Andersen: On sums of symmetrically dependent random variables. Scand. Aktuar. Tidskr., 26, 123–138 (1953).
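In place of the picture suggested in Exercise 6.7, the transformation $\Theta_u$ can be tried out on a discretized path. The sketch below is an addition to the text, assuming NumPy is available: a bridge is sampled on a grid of $n$ steps through the representation $b_t = B_t - tB_1$, the discrete analogue of $\Theta_u$ with $u = k/n$ is a cyclic shift of the path, and the two relations given as hints in questions 2 and 3 are then checked on the grid. The function name `cyclic_shift`, the grid size and the seed are choices made for this example.

```python
import numpy as np

def cyclic_shift(path, k):
    """Discrete analogue of Theta_u with u = k/n, for a path with path[0] == path[-1] == 0."""
    n = len(path) - 1
    periodic = path[:n]                         # path[n] duplicates path[0]
    shifted = np.roll(periodic, -k) - periodic[k % n]
    return np.append(shifted, 0.0)              # the shifted path also ends at 0

rng = np.random.default_rng(2)                  # fixed seed (arbitrary)
n = 1000
walk = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))
times = np.linspace(0.0, 1.0, n + 1)
bridge = walk - times * walk[-1]                # b_t = B_t - t B_1 on the grid

k, l = 137, 912                                 # two arbitrary shifts
composed = cyclic_shift(cyclic_shift(bridge, l), k)
direct = cyclic_shift(bridge, (k + l) % n)
print("Theta_k Theta_l == Theta_{k+l mod n}:", np.allclose(composed, direct))

m = int(np.argmin(bridge))                      # grid time of the minimum
m_shifted = int(np.argmin(cyclic_shift(bridge, k)))
print("(m o Theta_k + k) mod n == m:", (m_shifted + k) % n == m)
```

These checks are purely algebraic statements about the path transformation; they say nothing yet about laws, which is the content of the exercise.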
** 6.8 Local time and hitting time distributions for the Brownian bridge

Let $P_a$ be the law of the real valued Brownian motion starting from $a \in \mathbb{R}$ and $P_{a\to x}^{(t)}$ be the law of the Brownian bridge with length $t$, starting from $a$ and ending at $x \in \mathbb{R}$ at time $t$. Let $(X_u, u \ge 0)$ be the canonical coordinate process on $C(\mathbb{R}_+, \mathbb{R})$. The density of the Brownian semigroup will be denoted by $p_t$, i.e.
\[
p_t(a, b) = \frac{1}{\sqrt{2\pi t}}\exp\left(-\frac{(a-b)^2}{2t}\right) . \tag{6.8.1}
\]

1. Prove that $P_{a\to x}^{(t)}$ may be realized as the law of the process
\[
\left(a\left(1 - \frac{u}{t}\right) + B_u - \frac{u}{t}\,B_t + \frac{u}{t}\,x ,\ u \le t\right) ,
\]
where $(B_u, u \ge 0)$ denotes a Brownian motion starting from 0.

2. We put, for any $y \in \mathbb{R}$, $T_y = \inf\{t : X_t = y\}$. Prove the reflection principle for Brownian motion, that is, under $P_0$, the process $X^y$ defined by
\[
X_t^y = X_t , \ \text{on } \{t \le T_y\} , \qquad X_t^y = 2y - X_t , \ \text{on } \{t > T_y\} ,
\]
has the same law as $X$. Let $S_t = \sup_{s\le t} X_s$. For $x \le y$, $y > 0$, prove that
\[
P_0(S_t > y, X_t < x) = P_0(X_t < x - 2y) , \tag{6.8.2}
\]
and compute the law of the pair $(X_t, S_t)$ under $P_0$.

3. Deduce from question 1 that for any $a, y \in \mathbb{R}$,
\[
P_a(T_y \in dt) = \left|\frac{\partial}{\partial y}\,p_t(y, a)\right| dt = \frac{|y-a|}{\sqrt{2\pi t^3}}\exp\left(-\frac{(y-a)^2}{2t}\right)dt , \qquad t > 0 . \tag{6.8.3}
\]
(Note that when $a = y$, $P_a(T_a = 0) = 1$.)

4. Let $(L_u, u \le 1)$ be the local time of $X$ at level 0. Using Lévy's identity:
\[
(S - X, S) \overset{\text{(law)}}{=} (|X|, L) , \quad \text{under } P_0 ,
\]
prove that under $P_0$, $(X_t, L_t)$ has density:
\[
\left(\frac{1}{2\pi t^3}\right)^{\frac{1}{2}}(|x| + y)\exp\left(-\frac{(|x| + y)^2}{2t}\right) , \qquad x \in \mathbb{R},\ y \ge 0 . \tag{6.8.4}
\]
Deduce from the preceding formula an explicit expression for the joint law of $(X_t, L_t)$, for a fixed $t < 1$, under the law $P_{0\to0}^{(1)}$ of the standard Brownian bridge.

Comments and references: A detailed discussion of the (very classical!) result stated in the first question is found in:
D. Lamberton and B. Lapeyre: Introduction to Stochastic Calculus Applied to Finance. Chapman & Hall, London, 1996. French second edition: Ellipses, Paris, 1997.
We may also cite D. Freedman's book [24] which uses the reflection principle an infinite number of times to deduce the joint law of the maximum and minimum of Brownian motion, a result which goes back to Bachelier!
L. Bachelier: Probabilités des oscillations maxima. C. R. Acad. Sci. Paris, 212, 836–838 (1941).
Further properties of the local times of the Brownian bridges are discussed in
J.W. Pitman: The distribution of local times of a Brownian bridge. Séminaire de Probabilités, XXXIII, Lecture Notes in Mathematics, 1709, 388–394, Springer, Berlin, 1999.
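The hitting time density stated in (6.8.3) of Exercise 6.8 can be sanity-checked numerically, without giving away its derivation. The sketch below is an addition to the text: it integrates the density $\frac{|y-a|}{\sqrt{2\pi s^3}}e^{-(y-a)^2/(2s)}$ over $(0, t]$ with the midpoint rule and compares the result with the classical consequence of the reflection principle, $P_a(T_y \le t) = 2P(B_t \ge |y-a|) = \mathrm{erfc}\left(|y-a|/\sqrt{2t}\right)$. The function names and the numerical parameters are choices made for this example.

```python
import math

def hitting_density(s, a, y):
    """Density at time s > 0 of T_y under P_a, as in (6.8.3)."""
    d = abs(y - a)
    return d / math.sqrt(2 * math.pi * s ** 3) * math.exp(-d * d / (2 * s))

def hitting_cdf_numeric(t, a, y, n_steps=100_000):
    """Midpoint-rule approximation of P_a(T_y <= t)."""
    h = t / n_steps
    return h * sum(hitting_density((k + 0.5) * h, a, y) for k in range(n_steps))

def hitting_cdf_reflection(t, a, y):
    """P_a(T_y <= t) = 2 P(B_t >= |y - a|), written with the complementary error function."""
    return math.erfc(abs(y - a) / math.sqrt(2 * t))

a, y, t = 0.0, 1.0, 2.0                         # illustrative values
print(hitting_cdf_numeric(t, a, y), hitting_cdf_reflection(t, a, y))
```

The two numbers should agree to several decimal places; the integrand vanishes very fast near $s = 0$, so no special treatment of that endpoint is needed.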
** 6.9 Partial absolute continuity of the Brownian bridge distribution with respect to the Brownian distribution

We keep the same notation as in Exercise 6.8.

1. Prove that, under the Wiener measure $P$, the process
\[
\left(a\left(1 - \frac{u}{t}\right) + \frac{u}{t}\,x + u\,X_{\left(\frac{1}{u} - \frac{1}{t}\right)} ,\ u \le t\right)
\]
has law $P_{a\to x}^{(t)}$. In particular, the family $\{P_{a\to x}^{(t)};\ a \in \mathbb{R}, x \in \mathbb{R}\}$ depends continuously on $a$ and $x$.
Hint: Use the Markov property and the invariance of the Brownian law by time inversion.
Deduce therefrom the law of $T_0$ under $P_{a\to x}^{(t)}$, when $a \ne 0$.
Hint: Take for granted the following result for a nice transient one-dimensional diffusion $(X_t)$:
\[
P_a(L_y \in dt) = C\,p_t(a, y)\,dt ,
\]
where $L_y$ is the last passage time at $y \in \mathbb{R}$ by $(X_t)$ (i.e. $L_y = \sup\{t \ge 0 : X_t = y\}$), and $p_t(a, y)$ denotes the density of the semigroup of $(X_t)$ with respect to Lebesgue measure $dy$. See the reference in the comments below.

2. Prove the following absolute continuity relation, for every $s < t$:
\[
E_{a\to x}^{(t)}[F(X_u, u \le s)] = E_a\left[\frac{p_{t-s}(X_s, x)}{p_t(a, x)}\,F(X_u, u \le s)\right] , \tag{6.9.1}
\]
where $F : C([0, s], \mathbb{R}) \to \mathbb{R}_+$ is any bounded Borel functional.

3. More generally, show that for every stopping time $S$ and every $s < t$,
\[
E_{a\to x}^{(t)}[\mathbf{1}_{\{S<s\}}\,F(X_u, u \le S)] = E_a\left[\frac{p_{t-S}(X_S, x)}{p_t(a, x)}\,\mathbf{1}_{\{S<s\}}\,F(X_u, u \le S)\right] . \tag{6.9.2}
\]

4. Prove the equality: for every $s \le t$,
\[
P_{a\to x}^{(t)}(T_y < s) = E_a\left[\frac{p_{t-T_y}(y, x)}{p_t(a, x)}\,\mathbf{1}_{\{T_y<s\}}\right] . \tag{6.9.3}
\]
Then, using question 2 of Exercise 6.8, prove that for every $s < t$,
\[
P_{a\to x}^{(t)}(T_y \in ds) = ds\,\frac{p_{t-s}(y, x)}{p_t(a, x)}\,\left|\frac{\partial}{\partial y}\,p_s(y, a)\right| , \tag{6.9.4}
\]
and check that when $y = 0$, it agrees with the result found in question 1.

5. Using the symmetry property of Brownian motion, prove that:
\[
P_{a\to x}^{(t)}(T_0 < s) = \frac{p_t(a, -x)}{p_t(a, x)}\,P_{a\to -x}^{(t)}(T_0 < s) , \qquad x, a > 0 , \tag{6.9.5}
\]
\[
P_{a\to x}^{(t)}(T_0 < t) = \frac{p_t(a, -x)}{p_t(a, x)} , \qquad x, a > 0 . \tag{6.9.6}
\]
Comments and references: We have just seen how the time inversion property of Brownian motion (which is shared by only a few diffusions!) allows us to express a Brownian bridge in terms of Brownian motion. This is further exploited in Exercise 6.13. See also related discussions in:
S. Watanabe: On time inversion of one-dimensional diffusion processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 31, 115–124 (1974/75)
J. Pitman and M. Yor: Bessel processes and infinitely divisible laws. In: Stochastic integrals (Proc. Sympos., Univ. Durham, Durham, 1980), pp. 285–370, Lecture Notes in Mathematics, 851, Springer, Berlin, 1981.
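Formulae (6.9.4) and (6.9.6) of Exercise 6.9 can be cross-checked numerically. The sketch below is an addition to the text: it integrates over $s \in (0, t)$ the density $\frac{p_{t-s}(0,x)}{p_t(a,x)}\cdot\frac{a}{\sqrt{2\pi s^3}}e^{-a^2/(2s)}$ (that is, (6.9.4) with $y = 0$, $a > 0$, and the hitting time density written explicitly), and compares the result with $\frac{p_t(a,-x)}{p_t(a,x)} = e^{-2ax/t}$. The function names and numerical parameters are choices made for this example.

```python
import math

def p(t, a, b):
    """Density of the Brownian semigroup, as in (6.8.1)."""
    return math.exp(-(a - b) ** 2 / (2 * t)) / math.sqrt(2 * math.pi * t)

def bridge_hitting_density(s, t, a, x, y=0.0):
    """Density at s in (0, t) of T_y under the bridge law from a to x of length t."""
    d = abs(y - a)
    # Hitting time density of y for Brownian motion started at a.
    free_density = d / math.sqrt(2 * math.pi * s ** 3) * math.exp(-d * d / (2 * s))
    return p(t - s, y, x) / p(t, a, x) * free_density

def prob_bridge_hits_zero(t, a, x, n_steps=200_000):
    """Midpoint-rule approximation of the bridge probability that T_0 < t."""
    h = t / n_steps
    return h * sum(bridge_hitting_density((k + 0.5) * h, t, a, x) for k in range(n_steps))

t, a, x = 1.0, 0.5, 0.8                         # illustrative values with a, x > 0
print(prob_bridge_hits_zero(t, a, x))
print(p(t, a, -x) / p(t, a, x), math.exp(-2 * a * x / t))
```

The three printed numbers should coincide up to the quadrature error; the last two are equal by a direct computation with (6.8.1).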
* 6.10 A Brownian interpretation of the duplication formula for the gamma function

Recall (see (4.5.2)) that the duplication formula is:
\[
\Gamma(2z) = \frac{1}{\sqrt{2\pi}}\,2^{2z-\frac{1}{2}}\,\Gamma(z)\,\Gamma\left(z + \frac{1}{2}\right) .
\]
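As a quick numerical illustration (an addition to the text), the duplication formula can be evaluated with the gamma function from Python's standard library. The function name `duplication_rhs` and the test points are arbitrary choices.

```python
import math

def duplication_rhs(z):
    """Right-hand side of the duplication formula: 2^(2z - 1/2) / sqrt(2 pi) * Gamma(z) * Gamma(z + 1/2)."""
    return 2 ** (2 * z - 0.5) / math.sqrt(2 * math.pi) * math.gamma(z) * math.gamma(z + 0.5)

for z in (0.3, 1.0, 2.5, 7.0):
    print(z, math.gamma(2 * z), duplication_rhs(z))
```

For each $z$ the last two columns should agree up to floating-point rounding.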
1. Prove that (4.5.2) is equivalent to the identity in law
\[
Z_1 \overset{\text{(law)}}{=} 2\sqrt{Z_1 Z_{1/2}} \tag{6.10.1}
\]
where $Z_a$ denotes the standard gamma variable with parameter $a$, and $Z_1$ and $Z_{1/2}$ are assumed to be independent. Remark that (6.10.1) may also be written as
\[
Z_1 \overset{\text{(law)}}{=} \sqrt{2Z_1}\,|N| , \tag{6.10.2}
\]
where $N$ denotes a standard Gaussian variable with variance 1, independent of $Z_1$. (The identity (6.10.2) is, in fact, the identity (4.8.1).)

2. Let $(B_t, t \ge 0)$ denote standard Brownian motion starting from 0, and let $x \ge 0$. Define $T_x = \inf\{t \ge 0 : B_t = x\}$. Prove the following formulae:
\[
P(T_x < Z_1) = \exp(-\sqrt{2}\,x) , \tag{6.10.3}
\]
\[
P(T_x < Z_1) = P\left(x < \sqrt{Z_1}\,|N|\right) , \tag{6.10.4}
\]
where, on the left hand sides, $T_x$ and $Z_1$ are assumed to be independent and on the right hand side, $Z_1$ and $N$ are independent.

3. Deduce from (6.10.3) and (6.10.4) that (6.10.2), hence (6.10.1), are satisfied.

** 6.11 Some deterministic time-changes of Brownian motion
Consider a linear Brownian motion $(B_t, t \ge 0)$ starting from 0, and two regular functions $u, v : \mathbb{R}_+ \to \mathbb{R}$. We assume that $u$ satisfies $u(0) = 0$, is strictly increasing, and is $C^1$. Assume also that $v(t) \ne 0$, for every $t > 0$, and that $v$ has bounded variation.

1. Prove that the process $X_t = v(t)B_{u(t)}$, $t \ge 0$, is a semimartingale (in its own filtration) and that its martingale part is:
\[
\int_0^t v(s)\,dB_{u(s)} , \qquad t \ge 0 .
\]

2. Prove that the martingale part of $X$ is a Brownian motion if, and only if: $v^2(s)\,u'(s) \equiv 1$.

3. We denote by $(L_t^X)$ and $(L_t^B)$ the respective local times at 0 of $X$ and $B$. Prove the formula:
\[
L_t^X = \int_0^{u(t)} v\left(u^{-1}(s)\right)dL_s^B . \tag{6.11.1}
\]
4. If $B$ is a Brownian motion, and $\beta \in \mathbb{R}$, $\beta \neq 0$, the formula:
$$U_t = e^{\beta t}\, B\left(\frac{1 - e^{-2\beta t}}{2\beta}\right) , \qquad t \ge 0 ,$$
defines an Ornstein–Uhlenbeck process with parameter $\beta$, i.e. $(U_t)$ solves the SDE: $dU_t = d\gamma_t + \beta U_t\, dt$, where $\gamma$ is a Brownian motion. Apply the formula in the preceding question to relate the local times at 0 of $U$ and $B$.
5. If $(b(t), t \le 1)$ is a standard Brownian bridge, it follows from question 1 of Exercise 6.9 that the process
$$B_t = (1+t)\, b\left(\frac{t}{1+t}\right) , \qquad t \ge 0 ,$$
is a Brownian motion. Prove that, if $(L_t, t \ge 0)$, resp. $(\ell_u, u \le 1)$, is the local time of $B$, resp. $b$, at 0, then:
$$\int_0^t \frac{dL_v}{1+v} = \ell_{\frac{t}{1+t}} \qquad (t \ge 0) . \tag{6.11.2}$$
Give an explicit expression for the law of $\int_0^t \frac{dL_v}{1+v}$, for fixed $t$.
Hint: Use the result obtained in Exercise 6.8, question 3.
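The time change of question 4 can be probed numerically. The sketch below is only an illustration under our own naming; it takes $u(t) = (1 - e^{-2\beta t})/(2\beta)$ and $v(t) = e^{\beta t}$, checks the criterion $v^2 u' \equiv 1$ of question 2 by a finite difference, and compares the covariance of $v(t) B_{u(t)}$ with that of the solution $U_t = \int_0^t e^{\beta(t-r)}\, d\gamma_r$ of the SDE started at 0 (computed by a midpoint rule).

```python
import math

def u(t, beta):
    """Clock u(t) = (1 - exp(-2 beta t)) / (2 beta) from question 4."""
    return (1.0 - math.exp(-2.0 * beta * t)) / (2.0 * beta)

def v(t, beta):
    """Space factor v(t) = exp(beta t) from question 4."""
    return math.exp(beta * t)

def martingale_rate(t, beta, h=1e-5):
    """v(t)^2 u'(t), with u' replaced by a central finite difference."""
    du = (u(t + h, beta) - u(t - h, beta)) / (2.0 * h)
    return v(t, beta) ** 2 * du

def cov_time_change(s, t, beta):
    """Cov(U_s, U_t) for U_t = v(t) B_{u(t)}, using Cov(B_a, B_b) = min(a, b)."""
    return v(s, beta) * v(t, beta) * min(u(s, beta), u(t, beta))

def cov_sde(s, t, beta, n=2000):
    """Cov(U_s, U_t) for U_t = int_0^t exp(beta (t - r)) d(gamma_r), midpoint rule."""
    upper = min(s, t)
    dr = upper / n
    return sum(
        math.exp(beta * (s + t - 2.0 * (k + 0.5) * dr)) * dr
        for k in range(n)
    )

for beta in (0.7, -1.3):
    print(beta, martingale_rate(0.8, beta),
          cov_time_change(0.5, 1.2, beta), cov_sde(0.5, 1.2, beta))
```

Since both processes are centred Gaussian, equality of the covariances is consistent with the claim that the formula defines an Ornstein–Uhlenbeck process.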
* 6.12 Random scaling of the Brownian bridge
Let $(B_t, t \ge 0)$ be a real valued Brownian motion, starting from 0, and define $\Lambda = \sup\{t > 0 : B_t - t = 0\}$.
1. Prove that
$$\sqrt{\Lambda} \overset{(law)}{=} |N| , \tag{6.12.1}$$
where $N$ is Gaussian, centred, with variance 1.
2. Prove that:
$$(B_t - t,\ t \le \Lambda) \overset{(law)}{=} \left(\sqrt{\Lambda}\, b\left(\frac{t}{\Lambda}\right),\ t \le \Lambda\right) , \tag{6.12.2}$$
where, on the right hand side, $\Lambda$ is independent of the standard Brownian bridge $(b(u), u \le 1)$.
3. Let $\mu \in \mathbb{R}$. Prove the following extension of the two previous questions:
(i)
$$\Lambda_\mu \overset{(law)}{=} \frac{N^2}{\mu^2} . \tag{6.12.3}$$
(ii)
$$(B_t - \mu t,\ t \le \Lambda_\mu) \overset{(law)}{=} \left(\sqrt{\Lambda_\mu}\, b\left(\frac{t}{\Lambda_\mu}\right),\ t \le \Lambda_\mu\right) , \tag{6.12.4}$$
where, on the right hand side, $\Lambda_\mu = \sup\{t > 0 : B_t - \mu t = 0\}$ is independent of the standard Brownian bridge $(b(u), u \le 1)$. Deduce from the identity (6.12.4) that conditionally on $\Lambda_\mu = \lambda$, the process $(B_t - \mu t, t \le \lambda)$ is distributed as a Brownian bridge of length $\lambda$.
Comments and references:
(a) One can also prove that for any $\mu \in \mathbb{R}$, the law of the process $(B_u - \mu u, u \le t)$, given $B_t - \mu t = 0$, does not depend on $\mu$ and is distributed as a Brownian bridge with length $t$. This invariance property follows from the Cameron–Martin absolute continuity relation between $(B_u - \mu u, u \le t)$ and $(B_u, u \le t)$. There are in fact other diffusions than Brownian motion with drifts which have the same bridges as Brownian motion. See:
I. Benjamini and S. Lee: Conditioned diffusions which are Brownian bridges. J. Theoret. Probab., 10, no. 3, 733–736 (1997).
P. Fitzsimmons: Markov processes with identical bridges. Electron. J. Probab., 3, no. 12, 12 pp. (1998).
(b) It is often interesting, in order to study a particular property of Brownian motion, to consider simultaneously its extensions for Brownian motion with drift. Many examples appear in:
D. Williams: Path decomposition and continuity of local time for one-dimensional diffusions I. Proc. London Math. Soc., 28, (3), 738–768 (1974).
* 6.13 Time-inversion and quadratic functionals of Brownian motion; Lévy's stochastic area formula
Let $P^{(m)}_{a \to 0}$ denote the law of the Brownian bridge with duration $m$, starting at $a$, and ending at 0.
1. Prove that, if $(B_t, t \ge 0)$ denotes Brownian motion starting at 0, then the law of the process
$$\left((1+t)\, B\left(\frac{1}{1+t} - \frac{1}{1+m}\right) ;\ 0 \le t \le m\right)$$
conditioned by $B\left(1 - \frac{1}{1+m}\right) = a$ is $P^{(m)}_{a \to 0}$.
Hint: Use the Markov property and the invariance of the Brownian law by time inversion.
2. Let $m > 0$, and consider $\varphi : [0, m] \to \mathbb{R}$ a bounded, Borel function. Prove that, if $(B_t, t \ge 0)$ denotes Brownian motion starting at 0, then the law of $\int_0^m dt\, \varphi(t) B_t^2$, conditioned by $\{B_m = a\}$, is the same as the law of
$$(1+m)^2 \int_0^m ds\, \frac{B_s^2}{(1+s)^4}\, \varphi\left(\frac{(1+m)s}{1+s}\right) ,$$
conditioned by $B_m = a\sqrt{1+m}$.
Show that, consequently, one has:
$$E\left[\exp\left(-\lambda \int_0^m dt\, B_t^2\right) \Big|\, B_m = a\right] = E\left[\exp\left(-\lambda (1+m)^2 \int_0^m ds\, \frac{B_s^2}{(1+s)^4}\right) \Big|\, B_m = a\sqrt{1+m}\right]$$
for every $\lambda \ge 0$. Set $\lambda = \frac{b^2}{2}$. Prove that this common quantity equals, in terms of $a$, $b$ and $m$:
$$\left(\frac{bm}{\sinh(bm)}\right)^{\frac{1}{2}} \exp\left(-\frac{a^2}{2m}\left(bm \coth(bm) - 1\right)\right) . \tag{6.13.1}$$
Comments and references: This exercise is strongly inspired by:
F. B. Knight: Inverse local times, positive sojourns, and maxima for Brownian motion. Colloque Paul Lévy, Astérisque, 157–158, 233–247 (1988).
The expression (6.13.1) of $E\left[\exp\left(-\frac{b^2}{2} \int_0^m dt\, B_t^2\right) \big|\, B_m = a\right]$ is due to P. Lévy (1951).
It has since been the subject of many studies; here we simply refer to M. Yor: Some Aspects of Brownian Motion. Part I. Some Special Functionals. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, 1992, p. 18, formula (2.5). Note that in Exercises 2.12 and 2.13, we saw another instance of this formula in another disguise involving time spent by Brownian motion up to its first hit of 1.
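Formula (6.13.1) lends itself to a simple consistency check, sketched below in Python with names of our own choosing. Integrating the conditional expectation (6.13.1) against the $N(0, m)$ density of $B_m$ should return the unconditional Laplace transform, for which we use the classical Cameron–Martin value $E\left[\exp\left(-\frac{b^2}{2}\int_0^m B_t^2\, dt\right)\right] = (\cosh(bm))^{-1/2}$; this last formula is an input of the check, not something derived in the exercise.

```python
import math

def levy_conditional(a, b, m):
    """Right-hand side of (6.13.1)."""
    bm = b * m
    coth = math.cosh(bm) / math.sinh(bm)
    return math.sqrt(bm / math.sinh(bm)) * math.exp(-(a * a) / (2.0 * m) * (bm * coth - 1.0))

def unconditional_from_levy(b, m, half_width=12.0, n=4000):
    """Integrate (6.13.1) against the N(0, m) density of B_m (trapezoidal rule)."""
    sd = math.sqrt(m)
    lo = -half_width * sd
    step = 2.0 * half_width * sd / n
    total = 0.0
    for k in range(n + 1):
        a = lo + k * step
        density = math.exp(-a * a / (2.0 * m)) / math.sqrt(2.0 * math.pi * m)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * levy_conditional(a, b, m) * density * step
    return total

def cameron_martin(b, m):
    """Classical value of E[exp(-(b^2 / 2) int_0^m B_t^2 dt)]."""
    return 1.0 / math.sqrt(math.cosh(b * m))

for b, m in ((1.0, 1.0), (0.5, 2.0), (2.0, 0.7)):
    print(b, m, unconditional_from_levy(b, m), cameron_martin(b, m))
```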
** 6.14 Quadratic variation and local time of semimartingales
Consider, on a filtered probability space, a continuous semimartingale $X_t = M_t + V_t$ for $t \ge 0$, such that: $X_0 = M_0 = V_0 = 0$, and
$$E\left[\langle M \rangle_\infty + \left(\int_0^\infty |dV_s|\right)^2\right] < \infty . \tag{6.14.1}$$
(Recall that $\langle X \rangle$, the quadratic variation of $X$, does not depend on $V$, i.e. it is equal to $\langle M \rangle$.)
1. Prove that $(X_t)$ satisfies:
$$E\left[(X_T)^2\right] = E\left[\langle X \rangle_T\right] \tag{6.14.2}$$
for every stopping time $T$ if, and only if:
$$\int_0^\infty \mathbf{1}_{(X_s \neq 0)}\, |dV_s| = 0 , \quad \text{a.s.} \tag{6.14.3}$$
2. Prove that $(X_t)$ satisfies (6.14.2) if and only if $(X_t^+)$ and $(X_t^-)$ satisfy (6.14.2).
3. (i) Show that if $(M_t)$ is a square integrable martingale, with $M_0 = 0$, then $X_t = -M_t + S^M_t$, where $S^M_t = \sup_{s \le t} M_s$, satisfies (6.14.2).
(ii) Show that if $(M_t)$ is a square integrable martingale, with $M_0 = 0$, then $X_t = M_t + L^M_t$ satisfies (6.14.2) if and only if $L^M_t \equiv 0$, which is equivalent to $M_t \equiv 0$.
4. Prove that if $(L_t)$ denotes the local time of $(X_t)$ at 0, then $(X_t)$ satisfies:
$$E[|X_T|] = E[L_T] \tag{6.14.4}$$
for every stopping time $T$ if, and only if $(X_t)$ is a martingale.
Comments: This exercise constitutes a warning that the well-known identities (6.14.2) and (6.14.4), which are valid for square integrable martingales, do not extend to general semimartingales.
** 6.15 Geometric Brownian motion
Let $(B_t, t \ge 0)$ and $(W_t, t \ge 0)$ be two independent real valued Brownian motions.
Prove that the two-dimensional process:
$$(X_t, Y_t ;\ t \ge 0) \overset{(def)}{=} \left(\exp(B_t),\ \int_0^t \exp(B_s)\, dW_s ;\ t \ge 0\right) \tag{6.15.1}$$
which takes values in $\mathbb{R}^2$ converges to $\infty$ as $t$ tends to $\infty$ (i.e. $X_t^2 + Y_t^2 \to \infty$, as $t \to \infty$).
Hint: Consider the time change obtained by inverting:
$$A_t = \int_0^t ds\, \exp(2B_s) .$$
Comments and references: The process (6.15.1) occurs very naturally in relation with hyperbolic Brownian motion, i.e. a diffusion in the plane with infinitesimal generator $\frac{y^2}{2}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)$.
J.C. Gruet: Semi-groupe du mouvement brownien hyperbolique. Stochastics and Stochastic Rep., 56, no. 1–2, 53–61 (1996).
For some computations related to this exercise, see e.g.
M. Yor: On some exponential functionals of Brownian motion. Adv. in Appl. Probab., 24, no. 3, 509–531 (1992).
The following figure presents the graphs of the densities of the distributions of $A_t$ for some values of $t$. We thank K. Ishiyama (Nagoya University) for kindly providing us with this picture.
[Figure: densities of the distributions of $A_t$ for t = 1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0]
Numerical problems related to these densities are discussed in
K. Ishiyama: Methods for evaluating density functions of exponential functionals represented as integrals of geometric Brownian motion. To appear in: Method Comput. App. Prob. (2005).
* 6.16 0-self similar processes and conditional expectation
Let $(X_u, u \ge 0)$ be 0-self similar, i.e.
$$(X_{cu}, u \ge 0) \overset{(law)}{=} (X_u, u \ge 0) , \quad \text{for any } c > 0 \tag{6.16.1}$$
and assume that $E[|X_1|] < \infty$. Prove that, for every $t > 0$,
$$E[X_t \mid \overline{X}_t] = \overline{X}_t , \tag{6.16.2}$$
where $\overline{X}_t = \frac{1}{t} \int_0^t ds\, X_s$.
Comments and references:
(a) Let $C$ be a cone in $\mathbb{R}^n$, with vertex at 0 and define: $X_t = \mathbf{1}_{\{B_t \in C\}}$, where $(B_t)$ denotes the $n$-dimensional Brownian motion. Then formula (6.16.2) yields:
$$E\left[\mathbf{1}_{\{B_t \in C\}} \,\Big|\, \frac{1}{t} \int_0^t ds\, \mathbf{1}_{\{B_s \in C\}} = a\right] = a ,$$
a remarkable result since the law of $\int_0^t ds\, \mathbf{1}_{\{B_s \in C\}}$ is unknown except in the very particular case where $C = \{x \in \mathbb{R}^n : x_1 > 0\}$. See the following papers for a number of developments:
R. Pemantle, Y. Peres, J. Pitman and M. Yor: Where did the Brownian particle go? Electron. J. Probab., 6, no. 10, 22 pp. (2001).
J.W. Pitman and M. Yor: Quelques identités en loi pour les processus de Bessel. Hommage à P.A. Meyer et J. Neveu. Astérisque 236, 249–276 (1996).
(b) In the paper:
N.H. Bingham and R.A. Doney: On higher-dimensional analogues of the arc-sine law. J. Appl. Probab., 25, no. 1, 120–131 (1988)
the authors show that in general, time spent in a cone (up to 1) by $n$-dimensional Brownian motion (e.g. in the first quadrant, by two-dimensional Brownian motion) is not beta distributed, which a priori was a reasonable guess, given Lévy's arc-sine law for the time spent in $\mathbb{R}_+$ by linear Brownian motion.
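Identity (6.16.2) can be illustrated by simulation in the simplest case $X_u = \mathbf{1}_{\{B_u > 0\}}$ with $B$ a linear Brownian motion. The sketch below is a rough Monte Carlo experiment under our own assumptions (a Gaussian random walk replaces Brownian motion, and the sample sizes and seed are arbitrary). Since (6.16.2) implies $E[X_1 \overline{X}_1] = E[\overline{X}_1^2]$, the function estimates the difference of these two quantities, which should be close to 0.

```python
import random

def simulate_gap(n_paths=4000, n_steps=200, seed=12345):
    """Monte Carlo estimate of E[X_1 * Xbar_1] - E[Xbar_1^2] for X_u = 1{B_u > 0}.

    Brownian motion on [0, 1] is replaced by a Gaussian random walk with
    n_steps steps; the step variance is irrelevant for the signs involved.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        position = 0.0
        positive_steps = 0
        for _ in range(n_steps):
            position += rng.gauss(0.0, 1.0)
            if position > 0.0:
                positive_steps += 1
        occupation = positive_steps / n_steps      # discretised time average Xbar_1
        endpoint = 1.0 if position > 0.0 else 0.0  # X_1
        total += endpoint * occupation - occupation ** 2
    return total / n_paths

print(simulate_gap())
```

A value near 0 is only consistent with (6.16.2); it tests a single moment identity, not the full conditional statement.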
** 6.17 A Taylor formula for semimartingales; Markov martingales and iterated infinitesimal generators
Let $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ be a filtered probability space. We say that a right continuous process $(X_t, t \ge 0)$, which is $(\mathcal{F}_t)$ adapted, is $(\mathcal{F}_t)$ differentiable if there exists another right continuous process, which we denote as $(X'_t)$, such that:
(i) $M^X_t \overset{(def)}{=} X_t - \int_0^t ds\, X'_s$ is an $(\mathcal{F}_t)$-martingale.
(ii) $E[|X_t|] < \infty$ and $E\left[\int_0^t ds\, |X'_s|\right] < \infty$.
1. Prove that if $(X_t)$ is $(\mathcal{F}_t)$ differentiable, then its $(\mathcal{F}_t)$ derivative, $(X'_t)$, is unique, up to indistinguishability.
2. Assume that $(X_t)$ is $(\mathcal{F}_t)$ differentiable up to order $(n+1)$ and use the notation: $X'' = (X')'$ and by recurrence $X^{(n)} = (X^{(n-1)})'$, $n \ge 3$. Then prove the formula:
$$X_t - t X'_t + \frac{t^2}{2} X''_t + \cdots + (-1)^n \frac{t^n}{n!} X^{(n)}_t = (-1)^n \int_0^t \frac{s^n}{n!} X^{(n+1)}_s\, ds + M^X_t - \int_0^t s\, dM^{X'}_s + \cdots + (-1)^n \int_0^t \frac{s^n}{n!}\, dM^{X^{(n)}}_s . \tag{6.17.1}$$
Hint: Use a recurrence formula and integration by parts.
3. Let $(B_t, t \ge 0)$ denote a real valued Brownian motion. Define the Hermite polynomials $H_k(x, t)$, as:
$$H_k(x,t) = x^k - t\, \frac{k(k-1)}{2}\, x^{k-2} + \frac{t^2}{2}\, \frac{k(k-1)(k-2)(k-3)}{2^2}\, x^{k-4} + \cdots \tag{6.17.2}$$
$$\cdots + (-1)^{[k/2]}\, \frac{t^{[k/2]}}{[k/2]!}\, \frac{k(k-1) \cdots (k - 2[k/2] + 1)}{2^{[k/2]}}\, x^{k - 2[k/2]} . \tag{6.17.3}$$
Prove that $(H_k(B_t, t), t \ge 0)$ is a martingale.
4. Give another proof of this martingale property using the classical generating function expansion:
$$\exp\left(ax - \frac{a^2 t}{2}\right) = \sum_{k=0}^{\infty} \frac{a^k}{k!}\, H_k(x, t) , \qquad a \in \mathbb{R},\ (x,t) \in \mathbb{R} \times \mathbb{R}_+ .$$
Comments and references: The sequence of Hermite polynomials is that of orthogonal polynomials with respect to the Gaussian distribution; hence, it is not astonishing that the Hermite polynomials may be related to Brownian motion. For orthogonal polynomials associated to other Markov processes, see:
W. Schoutens: Stochastic Processes and Orthogonal Polynomials. Lecture Notes in Statistics, 146, Springer-Verlag, New York, 2000. In particular, the Laguerre polynomials, resp. Charlier polynomials are associated respectively to Bessel processes, and to the Poisson and Gamma process.
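For the Hermite polynomials of question 3, the martingale property can be probed symbolically: a polynomial $f(x,t)$ with $\partial_t f + \frac{1}{2} \partial^2_{xx} f = 0$ gives a martingale $f(B_t, t)$ by Itô's formula. The following Python sketch (helper names are ours) builds the coefficients of $H_k$ from (6.17.2)–(6.17.3) in exact rational arithmetic and applies the operator $\partial_t + \frac{1}{2}\partial^2_{xx}$ to them.

```python
from fractions import Fraction
from math import factorial

def hermite_coeffs(k):
    """Coefficients {(i, j): c} of x^i t^j in H_k(x, t), following (6.17.2)-(6.17.3)."""
    coeffs = {}
    for j in range(k // 2 + 1):
        falling = factorial(k) // factorial(k - 2 * j)  # k (k-1) ... (k-2j+1)
        coeffs[(k - 2 * j, j)] = Fraction((-1) ** j * falling, factorial(j) * 2 ** j)
    return coeffs

def heat_operator(coeffs):
    """Apply d/dt + (1/2) d^2/dx^2 to a polynomial in (x, t); drop zero terms."""
    out = {}
    for (i, j), c in coeffs.items():
        if j >= 1:
            out[(i, j - 1)] = out.get((i, j - 1), 0) + c * j
        if i >= 2:
            out[(i - 2, j)] = out.get((i - 2, j), 0) + c * i * (i - 1) / 2
    return {key: val for key, val in out.items() if val != 0}

def evaluate(coeffs, x, t):
    """Evaluate the polynomial at the point (x, t)."""
    return sum(c * x ** i * t ** j for (i, j), c in coeffs.items())

for k in range(7):
    print(k, hermite_coeffs(k), heat_operator(hermite_coeffs(k)))
```

An empty dictionary returned by `heat_operator` means that the polynomial is space-time harmonic; this is a check on small values of $k$, not a substitute for the proof by recurrence.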
** 6.18 A remark of D. Williams: the optional stopping theorem may hold for certain “non-stopping times”
Let $W_0 = 0, W_1, W_2, \ldots, W_n, \ldots$ be a standard coin-tossing random walk, i.e. the variables $W_1, W_2 - W_1, \ldots, W_k - W_{k-1}, \ldots$ are independent Bernoulli variables such that $P(W_k - W_{k-1} = \pm 1) = \frac{1}{2}$, $k = 1, 2, \ldots$. For $p \ge 1$, define:
$$\sigma(p) = \inf\{k \ge 0 : W_k = p\} , \qquad \eta = \sup\{k \le \sigma(p) : W_k = 0\} ,$$
$$m = \sup\{W_k, k \le \eta\} , \qquad \gamma = \inf\{k \ge 0 : W_k = m\} .$$
It is known (see the reference below) that:
(i) $m$ is uniformly distributed on $\{0, 1, \ldots, p-1\}$;
(ii) Conditionally on $\{m = j\}$, the family $\{W_k, k \le \gamma\}$ is distributed as $\{W_k, k \le \sigma(j)\}$.
Take these two results for granted. For $n \in \mathbb{N}$, denote by $\mathcal{F}_n$ the $\sigma$-field generated by $W_0, W_1, \ldots, W_n$. Prove that for every bounded $(\mathcal{F}_n)_{n \ge 0}$ martingale $(M_n)_{n \ge 0}$, and every $j \in \{0, 1, \ldots, p-1\}$, one has:
$$E\left[M_\gamma \mathbf{1}_{\{m = j\}}\right] = E[M_0]\, P(m = j) .$$
In particular, one has: $E[M_\gamma] = E[M_0]$.
Comments and references:
(a) (i) and (ii), and more generally the analogue for a standard random walk of D. Williams's path decomposition of Brownian motion $\{B_t, t \le T_p\}$, where $T_p = \inf\{t : B_t = p\}$, were obtained by J.F. Le Gall in:
J.F. Le Gall: Une approche élémentaire des théorèmes de décomposition de Williams. Séminaire de Probabilités, XX, 1984/85, Lecture Notes in Mathematics, 1204, 447–464, Springer, Berlin, 1986.
Donsker's theorem then allows us to deduce D. Williams' decomposition results from those for the random walk skeletons.
(b) In turn, D. Williams (2002) showed the analogue for Brownian motion of the optional stopping result presented in the above exercise. Our exercise is strongly inspired by D. Williams's result, in:
D. Williams: A “non-stopping” time with the optional stopping property. Bull. London Math. Soc., 34, 610–612 (2002).
(c) It would be very interesting to characterize the random times, which we might call “pseudo-stopping times”, for which the optional stopping theorem is still valid.
* 6.19 Stochastic affine processes, also known as “harnesses”
Note that given three reals $s < t < T$, the only reals $A$ and $B$ which satisfy the equality $f(t) = A f(s) + B f(T)$ for every affine function $f(u) = \alpha u + \beta$, $(\alpha, \beta \in \mathbb{R})$, are:
$$A = \frac{T - t}{T - s} , \quad \text{and} \quad B = \frac{t - s}{T - s} .$$
Now assume that on a probability space $(\Omega, \mathcal{F}, P)$, a process $(\Phi_t, t \ge 0)$ is given such that: $E[|\Phi_t|] < \infty$, for each $t \ge 0$. We define the past–future filtration associated with $\Phi$ as $\mathcal{F}_{s,T} = \sigma\{\Phi_u : u \in [0, s] \cup [T, \infty)\}$. We shall call $\Phi$ an affine process if it satisfies
$$E[\Phi_t \mid \mathcal{F}_{s,T}] = \frac{T - t}{T - s}\, \Phi_s + \frac{t - s}{T - s}\, \Phi_T , \qquad s < t < T . \tag{6.19.1}$$
1. Check that (6.19.1) is equivalent to the property that for $s \le t < t' \le u$, the quantity
$$E\left[\frac{\Phi_t - \Phi_{t'}}{t - t'} \,\Big|\, \mathcal{F}_{s,u}\right] \tag{6.19.2}$$
does not depend on the pair $(t, t')$, hence is equal to $\frac{\Phi_u - \Phi_s}{u - s}$.
2. Prove that Brownian motion is an affine process.
3. Let $(X_t, t \ge 0)$ be a centred Gaussian process, which is Markovian (possibly inhomogeneous). Prove that there exist constants $\alpha_{s,t,T}$ and $\beta_{s,t,T}$ such that
$$E[X_t \mid \mathcal{F}_{s,T}] = \alpha_{s,t,T}\, X_s + \beta_{s,t,T}\, X_T .$$
Compute the functions $\alpha$ and $\beta$ in terms of the covariance function $K(s,t) = E[X_s X_t]$. Which among such processes $(X_t)$ are affine processes?
4. Let $(X_t, t \ge 0)$ be a Lévy process, with $E[|X_t|] < \infty$, for every $t$. Prove that $(X_t)$ is an affine process.
Hint: Prove that for $a < b$, and for every $\lambda \in \mathbb{R}$: $E[a^{-1} X_a \exp(i\lambda X_b)] = E[b^{-1} X_b \exp(i\lambda X_b)]$.
5. Prove that, although a subordinator $(T_t, t \ge 0)$ may not satisfy $E[T_t] < \infty$, nonetheless the conditional property (6.19.1) is satisfied.
Hint: Prove that, for $a < b$, and every $\lambda > 0$,
$$E\left[\frac{T_a}{a} \exp(-\lambda T_b)\right] = E\left[\frac{T_b}{b} \exp(-\lambda T_b)\right] .$$
6. Prove that if an affine process $(X_t)$ is $L^1$-differentiable, i.e.
$$\frac{X_{t+h} - X_t}{h} \overset{L^1}{\longrightarrow} Y_t , \quad \text{as } h \to 0 ,$$
then for every $s, t$, $P(Y_s = Y_t) = 1$, and $(X_t)$ is affine in the strong sense, i.e. $X_t = \alpha t + \beta$, for some random variables $\alpha$ and $\beta$ which are $\mathcal{F}_0 \vee \mathcal{G}_\infty$-measurable.
7. We now assume only that $(X_t, t \ge 0)$ is a process with exchangeable increments, which is continuous in $L^1$. Prove that it is an affine process.
Comments and references:
(a) The general notion of a harness is due to J. Hammersley. The particular case we are studying here is called a simple harness by D. Williams (1980), who proved that essentially the only continuous harnesses are Brownian motions with drifts. In the present exercise, we preferred the term affine process. In his book [63], D. Williams studies harnesses indexed by $\mathbb{Z}$, and proves that they are only affine processes in the strong sense.
(b) Without knowing (or mentioning) the terminology, Jacod and Protter (1988) showed that every integrable Lévy process is a harness. In fact, their proof only uses the exchangeability of increments, as we do to solve question 7. In relation with that question, we should recall that any process with exchangeable increments is a mixture of Lévy processes. (See the comments in Exercise 2.6, and the reference to Aldous' St-Flour course, Proposition 10.5, given there.)
(c) As could be expected, Lévy was first on the scene! (See Lévy's papers referred to below.) Indeed, he remarked that, for $\Phi$ a Brownian motion, the property (6.19.1) holds because
$$\Phi_t - \left(\frac{T - t}{T - s}\, \Phi_s + \frac{t - s}{T - s}\, \Phi_T\right)$$
is independent from $\mathcal{F}_{s,T}$.
(d) If $(\Phi_t = \gamma_t, t \ge 0)$ is the gamma subordinator, then with the notation in (6.19.2), $\frac{\Phi_t - \Phi_{t'}}{\Phi_u - \Phi_s}$ is independent from $\mathcal{F}_{s,u}$, which implies a fortiori that $(\Phi_t)$ is a harness.
(e) The arguments used in questions 4 and 5 also appear in Bertoin's book ([67], p. 85), although no reference to harnesses is made there.
P. Lévy: Un théorème d'invariance projective relatif au mouvement brownien. Comment. Math. Helv., 16, September 1943, p. 242–248.
P. Lévy: Une propriété d'invariance projective dans le mouvement brownien. C. R. Acad. Sci., Paris, 219, October 1944, p. 377–379.
J.M. Hammersley: Harnesses. Proc. Fifth Berkeley Sympos. Mathematical Statistics and Probability, Vol. III: Physical Sciences, Univ. California Press, Berkeley, Calif., 89–117, 1967.
J. Jacod and P. Protter: Time reversal on Lévy processes. Ann. Probab., 16, no. 2, 620–641 (1988).
D. Williams: Brownian motion as a Harness. Unpublished manuscript, (1980).
D. Williams: Some basic theorems on harnesses. In: Stochastic analysis, (a tribute to the memory of Rollo Davidson), eds. D. Kendall, H. Harding, 349–363, Wiley, London, 1973.
(f) For a recent discussion of harnesses, see R. Mansuy and M. Yor: Harnesses, Lévy bridges and Monsieur Jourdain. Stoch. Proc. App., 115, no. 2, 329–338 (2005).
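For question 3 of this exercise, the coefficients $\alpha_{s,t,T}$ and $\beta_{s,t,T}$ can be explored numerically. The sketch below rests on an assumption that should be checked as part of the exercise: for a centred Gaussian Markov process, conditioning on $\mathcal{F}_{s,T}$ reduces to conditioning on $(X_s, X_T)$, so that $\alpha$ and $\beta$ solve the two normal equations $K(s,t) = \alpha K(s,s) + \beta K(s,T)$ and $K(t,T) = \alpha K(s,T) + \beta K(T,T)$. The function names, and the choice of the stationary Ornstein–Uhlenbeck covariance $e^{-|a-b|}$ as a second example, are ours.

```python
import math
from fractions import Fraction

def bridge_coefficients(K, s, t, T):
    """Solve the normal equations for E[X_t | X_s, X_T] = alpha X_s + beta X_T."""
    kss, ksT, kTT = K(s, s), K(s, T), K(T, T)
    kst, ktT = K(s, t), K(t, T)
    det = kss * kTT - ksT * ksT
    alpha = (kst * kTT - ktT * ksT) / det
    beta = (ktT * kss - kst * ksT) / det
    return alpha, beta

def brownian_cov(a, b):
    """Covariance of Brownian motion: K(a, b) = min(a, b)."""
    return min(a, b)

def ou_cov(a, b):
    """Covariance of a stationary Ornstein-Uhlenbeck process (illustrative choice)."""
    return math.exp(-abs(a - b))

s, t, T = Fraction(1), Fraction(2), Fraction(5)
print(bridge_coefficients(brownian_cov, s, t, T))   # compare with A and B above
print(bridge_coefficients(ou_cov, 1.0, 2.0, 5.0))   # coefficients need not sum to 1
```

For Brownian motion the output agrees with $A = \frac{T-t}{T-s}$ and $B = \frac{t-s}{T-s}$, whereas in the Ornstein–Uhlenbeck example $\alpha + \beta \neq 1$, which gives a hint about the last part of question 3.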
* 6.20 A martingale “in the mean over time” is a martingale
Consider a process $(\alpha(s), s \ge 0)$ that is adapted to the filtration $(\mathcal{F}_s)_{s \ge 0}$, and satisfies: for any $t > 0$, $E[|\alpha(t)|] < \infty$ and $E\left[\int_0^t du\, |\alpha(u)|\right] < \infty$. Prove that the two following conditions are equivalent:
(i) $(\alpha(u), u \ge 0)$ is a $(\mathcal{F}_u)$ martingale,
(ii) for every $t > s$,
$$E\left[\frac{1}{t - s} \int_s^t dh\, \alpha(h) \,\Big|\, \mathcal{F}_s\right] = \alpha(s) .$$
Comments and references: Quantities such as on the left hand side of (ii) appear very naturally when one discusses whether a process is a semimartingale with respect to a given filtration by using the method of “Laplaciens approchés” due to P.A. Meyer. For precise statements, see
C. Stricker: Une caractérisation des quasimartingales. Séminaire de Probabilités, IX, Lecture Notes in Mathematics, 465, 420–424, Springer, Berlin, 1975.
* 6.21 A reinforcement of Exercise 6.20
Consider a process $(H(s), s \ge 0)$ not necessarily adapted to the filtration $(\mathcal{F}_s)_{s \ge 0}$, which satisfies:
(i) for any $s > 0$, $E[|H(s)|] < \infty$.
(ii) for any $s$, the process $t \mapsto E\left[\frac{H_t - H_s}{t - s} \,\big|\, \mathcal{F}_s\right]$, $(t > s)$, does not depend on $t$.
We call the common value $\alpha(s)$.
1. Prove that $(\alpha(s), s \ge 0)$ is an $(\mathcal{F}_s)$-martingale.
2. Assume that $(H(s), s \ge 0)$ is $(\mathcal{F}_s)$ adapted and that $(\alpha(s), s \ge 0)$ is measurable. Prove that $(H_t, t \ge 0)$ is of the form
$$H_t = M_t + \int_0^t \alpha(u)\, du ,$$
where $(M_t, t \ge 0)$ is a $(\mathcal{F}_t)$-martingale.
Comments: For a general discussion on Exercises 6.17 to 6.21, see the comments at the beginning of this chapter.
Solutions for Chapter 6
In the solutions developed below, (Ft ) will often denote the “obvious” filtration involved in the question, and F = F (Xu , u ≤ t) a functional on the canonical path space C(IR+ , IRd ), which is measurable with respect to the past up to time t.
Solution to Exercise 6.1
The coefficients of the SDE (6.1.1) satisfy the Lipschitz condition: $\left|\sqrt{1+y^2} - \sqrt{1+z^2}\right| + \frac{1}{2}|y - z| \le \frac{3}{2}|y - z|$, so this equation admits a unique strong solution, for every $x \in \mathbb{R}$. For any $C^2$ function $\varphi : \mathbb{R} \to \mathbb{R}$, Itô's formula applied to $\varphi(B_t)$ gives:
$$d(\varphi(B_t)) = \varphi'(B_t)\, dB_t + \frac{1}{2} \varphi''(B_t)\, dt . \tag{6.1.a}$$
If $X_t = \varphi(B_t)$, for a $C^2$ function $\varphi$, then identifying equations (6.1.1) and (6.1.a), we obtain $\varphi'(B_t) = \sqrt{1 + \varphi^2(B_t)}$ and $\varphi''(B_t) = \varphi(B_t)$. Now, solving the differential equation $\varphi'(y) = \sqrt{1 + \varphi^2(y)}$, $\varphi''(y) = \varphi(y)$, on $\mathbb{R}$, we obtain $\varphi(y) = \sinh(y + c)$, where $c$ is any real constant. This shows that $X_t = \sinh(B_t + \operatorname{argsinh} x)$, $t \ge 0$, is the unique process which satisfies (6.1.1).
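The two differential identities used in this solution are easy to test numerically. The short Python sketch below (function names are ours) checks by finite differences that $\varphi(y) = \sinh(y + c)$ satisfies $\varphi' = \sqrt{1 + \varphi^2}$ and $\varphi'' = \varphi$, and that the choice $c = \operatorname{argsinh} x$ gives $\varphi(0) = x$.

```python
import math

def phi(y, c):
    """Candidate solution phi(y) = sinh(y + c)."""
    return math.sinh(y + c)

def first_order_residual(y, c, h=1e-5):
    """phi'(y) - sqrt(1 + phi(y)^2), with phi' from a central difference."""
    d1 = (phi(y + h, c) - phi(y - h, c)) / (2.0 * h)
    return d1 - math.sqrt(1.0 + phi(y, c) ** 2)

def second_order_residual(y, c, h=1e-4):
    """phi''(y) - phi(y), with phi'' from a second central difference."""
    d2 = (phi(y + h, c) - 2.0 * phi(y, c) + phi(y - h, c)) / (h * h)
    return d2 - phi(y, c)

x0 = 0.7                      # arbitrary starting point, for illustration only
c = math.asinh(x0)
print(phi(0.0, c))
for y in (-1.0, 0.0, 2.0):
    print(y, first_order_residual(y, c), second_order_residual(y, c))
```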
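Solution to Exercise 6.2
1. Let $F : \mathbb{R} \times \mathbb{R}_+^2 \to \mathbb{R}$ be the function defined by $F(x, y, z) = \cosh(\lambda(y - x)) \exp\left(-\frac{\lambda^2}{2} z\right)$. This function satisfies:
$$\frac{1}{2} \frac{\partial^2}{\partial x^2} F(x,y,z) + \frac{\partial}{\partial z} F(x,y,z) = 0 , \qquad \frac{\partial}{\partial y} F(x,x,z) = 0 , \qquad (x,y,z) \in \mathbb{R} \times \mathbb{R}_+^2 . \tag{6.2.a}$$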
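Thanks to the first part of (6.2.a) and the fact that $(S_t, t \ge 0)$ is an increasing process, Itô's formula applied to the semimartingale $(F(B_t, S_t, t), t \ge 0)$ gives:
$$F(B_t, S_t, t) = F(0,0,0) + \int_0^t \frac{\partial}{\partial x} F(B_u, S_u, u)\, dB_u + \int_0^t \frac{\partial}{\partial y} F(B_u, S_u, u)\, dS_u .$$
Now observe that $B_u = S_u$, for each time $u$ at which $S$ increases. This remark and the second part of (6.2.a) show that the integral $\int_0^t \frac{\partial}{\partial y} F(B_u, S_u, u)\, dS_u$ vanishes, hence
$$F(B_t, S_t, t) = 1 + \int_0^t \frac{\partial}{\partial x} F(B_u, S_u, u)\, dB_u ,$$
which proves the result.
2. Since the distribution of Brownian motion is symmetric, we can replace in question 1, $S_t - B_t$ by $B_t - I_t$ and $S_t$ by $I_t$, so that
$$N_t \equiv \cosh(\lambda(B_t - I_t)) \exp\left(-\frac{\lambda^2 t}{2}\right)$$
is a $(\mathcal{F}_t)$ martingale. Observe that $\theta_c$ is a $(\mathcal{F}_t)$ stopping time and that the martingales $(M_{t \wedge \theta_c})$ and $(N_{t \wedge \theta_c})$ are bounded, therefore, we may apply the optional stopping theorem which yields:
$$E[M_{\theta_c}] = 1 , \quad \text{and} \quad E[N_{\theta_c}] = 1 . \tag{6.2.b}$$
Note also that almost surely, either $S_{\theta_c} - B_{\theta_c} = c$ and $B_{\theta_c} - I_{\theta_c} = 0$, or $S_{\theta_c} - B_{\theta_c} = 0$ and $B_{\theta_c} - I_{\theta_c} = c$, so that
$$M_{\theta_c} + N_{\theta_c} = \exp\left(-\frac{\lambda^2 \theta_c}{2}\right) \left(\cosh(\lambda c) \mathbf{1}_{\{S_{\theta_c} - B_{\theta_c} = c\}} + \mathbf{1}_{\{S_{\theta_c} - B_{\theta_c} = 0\}} + \cosh(\lambda c) \mathbf{1}_{\{B_{\theta_c} - I_{\theta_c} = c\}} + \mathbf{1}_{\{B_{\theta_c} - I_{\theta_c} = 0\}}\right)$$
$$= \exp\left(-\frac{\lambda^2 \theta_c}{2}\right) (\cosh(\lambda c) + 1) .$$
The result now follows from equations (6.2.b).
Comments on the solution: The result of this exercise implies that
$$\theta_c \overset{(law)}{=} T_{c/2} + \tilde{T}_{c/2} ,$$
where $T_a$ and $\tilde{T}_a$ are two independent copies of $\inf\{t : |B_t| = a\}$. This follows from the fact that $\cosh(\lambda B_t) \exp\left(-\frac{\lambda^2 t}{2}\right)$ is a $(\mathcal{F}_t)$ martingale.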
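The two ingredients of this solution can be checked numerically. The Python sketch below, with names of our own choosing, verifies by finite differences that $F$ satisfies both parts of (6.2.a), and compares the Laplace transform $E\left[e^{-\lambda^2 \theta_c / 2}\right] = \frac{2}{1 + \cosh(\lambda c)}$ read off from (6.2.b) with $\left(\cosh(\lambda c / 2)\right)^{-2}$, the value expected for a sum of two independent copies of $T_{c/2}$, given that $E\left[e^{-\lambda^2 T_a / 2}\right] = 1/\cosh(\lambda a)$.

```python
import math

def F(x, y, z, lam):
    """F(x, y, z) = cosh(lam (y - x)) exp(-lam^2 z / 2)."""
    return math.cosh(lam * (y - x)) * math.exp(-0.5 * lam * lam * z)

def heat_residual(x, y, z, lam, h=1e-4):
    """(1/2) F_xx + F_z, both derivatives taken by central differences."""
    fxx = (F(x + h, y, z, lam) - 2.0 * F(x, y, z, lam) + F(x - h, y, z, lam)) / (h * h)
    fz = (F(x, y, z + h, lam) - F(x, y, z - h, lam)) / (2.0 * h)
    return 0.5 * fxx + fz

def diagonal_y_derivative(x, z, lam, h=1e-5):
    """dF/dy evaluated on the diagonal y = x."""
    return (F(x, x + h, z, lam) - F(x, x - h, z, lam)) / (2.0 * h)

def laplace_theta(lam, c):
    """E[exp(-lam^2 theta_c / 2)] as obtained from (6.2.b)."""
    return 2.0 / (1.0 + math.cosh(lam * c))

def laplace_sum_of_exit_times(lam, c):
    """Laplace transform of T_{c/2} + T~_{c/2} at lam^2 / 2."""
    return 1.0 / math.cosh(0.5 * lam * c) ** 2

lam = 1.3
print(heat_residual(0.2, 1.0, 0.5, lam), diagonal_y_derivative(0.4, 0.5, lam))
print(laplace_theta(lam, 2.0), laplace_sum_of_exit_times(lam, 2.0))
```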
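Solution to Exercise 6.3
The solutions to questions 1 and 2 are immediate from the Hint; as an example,
$$S_t - X_t \overset{(law)}{=} \sup_{u \le t}\left(X_{(t-u)-} - X_t\right) \overset{(law)}{=} -I_t .$$
3. First note the following identities for $x \ge 0$ and $y \ge 0$:
$$P(S_T \le x, -I_T \le y, X_T \in da) = P(T \le T_x, T \le T_{-y}, X_T \in da) = P(T \le T_x \wedge T_{-y}, X_T \in da)$$
$$= E\left[\left(1 - \mathbf{1}_{\{T_x \wedge T_{-y} \le T\}}\right) \mathbf{1}_{\{X_T \in da\}}\right]$$
$$= P(X_T \in da) - E\left[\int_{T_x \wedge T_{-y}}^{\infty} \frac{\theta^2}{2}\, e^{-\frac{\theta^2}{2} t}\, \mathbf{1}_{\{X_t \in da\}}\, dt\right]$$
$$= P(X_T \in da) - E\left[e^{-\frac{\theta^2}{2} T_x \wedge T_{-y}}\, E_{X_{T_x \wedge T_{-y}}}\left[\int_0^{\infty} \frac{\theta^2}{2}\, e^{-\frac{\theta^2}{2} u}\, \mathbf{1}_{\{X_u \in da\}}\, du\right]\right] .$$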
Now, identity (6.3.3) is easily deduced from the following well known facts. θ −θ|a| e da 2 θ2 sinh(θy) E e− 2 Tx ∧T−y 1I{Tx 0, the process Z (ε) = (Zt+ε , t ≥ 0) satisfies the equation (ε)
$$Z^{(\varepsilon)}_t = Z_\varepsilon + B_t + \lambda \int_0^t \frac{Z^{(\varepsilon)}_s\, ds}{|Z^{(\varepsilon)}_s|^2} . \tag{6.4.a}$$
Moreover, we assume that the law of $Z_\varepsilon$ is radial (i.e. it is invariant by rotation), and that $|Z_\varepsilon|$ is distributed as $\sqrt{\varepsilon}\, \rho$, where $\rho$ is the value at time 1 of a Bessel process with dimension $\delta$, starting from 0. With the help of Kolmogorov's consistency
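theorem, the probability measure $P_0^\lambda$ is now well defined and under $P_0^\lambda$, the process $(|Z_t|^2, t \ge 0)$ is the square of a Bessel process of dimension $\delta = 2(\lambda + 1)$, starting from 0, so $Z_\varepsilon$ converges $P_0^\lambda$-almost surely towards 0, as $\varepsilon \to 0$. Moreover, from the above equation, under $P_0^\lambda$, conditionally on $Z_\varepsilon$, the process $Z^{(\varepsilon)}$ has law $P^\lambda_{Z_\varepsilon}$. This shows the weak convergence of $(P_z^\lambda, z > 0)$ towards $P_0^\lambda$. An application of Girsanov's Theorem shows that for every $t$, the following absolute continuity relationship holds for every $z \neq 0$:
$$P_z^\lambda\big|_{\mathcal{F}_t} = \left|\frac{Z_t}{z}\right|^{\lambda} \exp\left(-\frac{\lambda^2}{2} \int_0^t \frac{ds}{|Z_s|^2}\right) \cdot P_z^0\big|_{\mathcal{F}_t} . \tag{6.4.b}$$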
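3. From (6.4.b), we have, for every bounded, continuous functional $F$ which is $\mathcal{Z}_1$ measurable:
$$\frac{1}{\Lambda_z}\, E_z^0\left[F(Z) \exp\left(-\frac{\lambda^2}{2} \int_0^1 \frac{ds}{|Z_s|^2}\right)\right] = \frac{1}{E_z^\lambda\left[\frac{1}{|Z_1|^\lambda}\right]}\, E_z^\lambda\left[F(Z)\, \frac{1}{|Z_1|^\lambda}\right] ,$$
and from the weak convergence established in the previous question, together with the nice integrability properties of $\frac{1}{|Z_1|^\lambda}$, under the $\{P_z^0\}$ laws, we obtain that the above expression converges towards $E_0^\lambda\left[\frac{1}{|Z_1|^\lambda}\right]^{-1} E_0^\lambda\left[F(Z)\, \frac{1}{|Z_1|^\lambda}\right]$, as $z \to 0$.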
4. (a) Recall that $R$ satisfies the equation
$$R_t = 1 + B_t + \frac{d-1}{2} \int_0^t \frac{ds}{R_s} ,$$
where $B$ is a standard Brownian motion, so that since $R$ does not take the value 0, a.s., we obtain, from Itô's formula:
$$\log R_t = \int_0^t \frac{dB_s}{R_s} + \frac{d-2}{2} \int_0^t \frac{ds}{R_s^2} .$$
Changing time in this equation with the inverse $H^{-1}$ of $H$ (which is equal to the quadratic variation of $\int_0^t \frac{dB_s}{R_s}$), we obtain:
$$\log R_{H^{-1}_u} = \beta_u + \frac{d-2}{2}\, u ,$$
where $\beta_u = \int_0^{H^{-1}_u} \frac{dB_s}{R_s}$ is a standard Brownian motion. This shows that $\varphi_u = \beta_u + \frac{d-2}{2}\, u$.
(b) First, we deduce from the scaling property of $R$ (i.e. $R_t \overset{(law)}{=} t^{1/2} R_1$) that
$$\frac{1}{\log t} \log R_t \overset{P}{\longrightarrow} \frac{1}{2} , \quad \text{as } t \to \infty .$$
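From the scaling property of Brownian motion, we obtain:
$$\frac{1}{\log t}\, \beta_{H_t} = W\left(\frac{1}{(\log t)^2}\, H_t\right) ,$$
where $W_s = \frac{1}{\log t}\, \beta\left((\log t)^2 s\right)$, $s \ge 0$, is a standard Brownian motion. Then we verify from the scaling property of $R$ that $\frac{1}{(\log t)^2} E[H_t] \to 0$, as $t \to \infty$, hence $W\left(\frac{1}{(\log t)^2} H_t\right)$ converges towards 0 in probability. Then we deduce from above and the representation of $\log R_t$ established in the previous question that
$$\frac{1}{\log t}\, H_t \overset{P}{\longrightarrow} \frac{1}{d-2} , \quad \text{as } t \to \infty .$$
(c) For the following argument, it suffices to assume that $R$ starts from 0. Then, the scaling property of $R$ implies that $\int_1^a \frac{ds}{R_s^2} \overset{(law)}{=} \int_{a^{n-1}}^{a^n} \frac{ds}{R_s^2}$, for every positive integer $n$. Since the scaling transform of $R$ (: $R \to \frac{1}{\sqrt{c}} R_{c\,\cdot}$, $c \neq 1$) is ergodic, the Ergodic Theorem implies:
$$\lim_{n \to \infty} \frac{1}{n} \sum_{p=1}^{n} \int_{a^{p-1}}^{a^p} \frac{ds}{R_s^2} = E\left[\int_1^a \frac{ds}{R_s^2}\right] = (\log a)\, E\left[\frac{1}{R_1^2}\right] = \frac{\log a}{d-2} , \quad \text{a.s.,}$$
hence, $\frac{1}{\log a^n} \int_1^{a^n} \frac{ds}{R_s^2}$ converges almost surely towards $\frac{1}{d-2}$, as $n \to \infty$. It is then easily shown that $\frac{1}{\log t} \int_1^t \frac{ds}{R_s^2}$ converges almost surely towards $\frac{1}{d-2}$, as $t \to \infty$.
(d) Again, it follows from the scaling property of $R$ that
$$\frac{1}{\log \frac{1}{\varepsilon}} \int_\varepsilon^1 \frac{ds}{R_s^2} \overset{(law)}{=} \frac{1}{\log \frac{1}{\varepsilon}} \int_1^{1/\varepsilon} \frac{du}{R_u^2} ,$$
so we deduce from the previous questions that $\frac{1}{\log \frac{1}{\varepsilon}} \int_\varepsilon^1 \frac{ds}{R_s^2}$ converges in probability towards $\frac{1}{d-2}$, as $\varepsilon \to 0$.
(e) The representation proved in question 1 shows that
$$\theta_{(\varepsilon,1)} = \gamma_{H_1} - \gamma_{H_\varepsilon} \overset{(law)}{=} \gamma\left(\int_\varepsilon^1 \frac{ds}{|Z_s|^2}\right) ,$$
so that from the scaling property of $\gamma$:
$$\left(\log \frac{1}{\varepsilon}\right)^{-\frac{1}{2}} \theta_{(\varepsilon,1)} \overset{(law)}{=} \gamma\left(\left(\log \frac{1}{\varepsilon}\right)^{-1} \int_\varepsilon^1 \frac{ds}{|Z_s|^2}\right) .$$
We finally deduce from question (d) that the right hand side of the above equality converges in law towards $\gamma_{1/(d-2)}$, as $\varepsilon \to 0$.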
Comments on the solution: As an important complement to the asymptotic result
$$\frac{1}{\log t} \int_0^t \frac{ds}{R_s^2} \overset{a.s.}{\longrightarrow} \frac{1}{d-2} , \quad \text{as } t \to \infty ,$$
for $R$ a Bessel process with dimension $d > 2$, we present the following
$$\frac{4}{(\log t)^2} \int_0^t \frac{ds}{R_s^2} \overset{(law)}{\longrightarrow} T \ \left(\overset{(def)}{=} \inf\{t : \beta_t = 1\}\right) , \quad \text{as } t \to \infty , \tag{6.4.c}$$
where $R$ is a two-dimensional Bessel process starting from $R_0 \neq 0$, and $T$ is stable(1/2). See Revuz–Yor ([51], Chapter XIII) for consequences of (6.4.c) for the asymptotics of the winding number of planar Brownian motion.
Solution to Exercise 6.5
1. (a) Itô's formula allows us to write
$$e^{ic\gamma_t} = 1 + ic \int_0^t e^{ic\gamma_u}\, d\gamma_u - \frac{c^2}{2} \int_0^t e^{ic\gamma_u}\, du .$$
So the process $(e^{ic\gamma_t}, t \ge 0)$ is the sum of the martingale $\left(ic \int_0^t e^{ic\gamma_u}\, d\gamma_u, t \ge 0\right)$ and the process with bounded variation $\left(1 - \frac{c^2}{2} \int_0^t e^{ic\gamma_u}\, du, t \ge 0\right)$.
(b) Define $Z^{(c)}_u = \int_0^u d\gamma_s \exp(ic\gamma_s)$, $u \ge 0$, and consider the $\mathbb{R}^3$-valued martingale $(\gamma, \mathrm{Re}(Z^{(c)}), \mathrm{Im}(Z^{(c)}))$. The different variation and covariation processes of this martingale, e.g. $\langle \gamma, \mathrm{Re}(Z^{(c)}) \rangle_t = \int_0^t ds \cos(c\gamma_s)$, converge a.s., as $c \to \infty$, towards those of $\left(\gamma, \frac{1}{\sqrt{2}}\gamma', \frac{1}{\sqrt{2}}\gamma''\right)$ (this follows from the occupation time density formula). The desired result may be deduced from this, but we shall not give the details.
2. Recall from question 1 of Exercise 6.4 (with $\lambda = 0$), the following representation of the complex Brownian motion
$$Z_t = |Z_t| \exp(i\gamma_{H_t}) ,$$
where $H_t = \int_0^t \frac{du}{|Z_u|^2}$, and $\gamma$ is a standard Brownian motion which is independent of $|Z|$. (This is known as the skew-product representation of complex Brownian motion.) From obvious changes of variables, we obtain:
$$\frac{1}{\log t} \int_0^t \frac{Z_s}{|Z_s|^3}\, ds = \frac{1}{\log t} \int_0^{H_t} dv \exp(i\gamma_v) = \log t \int_0^{\frac{H_t}{(\log t)^2}} ds \exp\left(i(\log t)\tilde{\gamma}_s\right) ,$$
where $\tilde{\gamma}_s = \frac{1}{\log t}\, \gamma_{(\log t)^2 s}$. Now, on the one hand, recall from question 1 (a) and (b) that $\left(\frac{c}{2} \int_0^u ds \exp(ic\gamma_s), u \ge 0\right)$ converges in law towards $\left(\frac{1}{\sqrt{2}}(\gamma'_u + i\gamma''_u), u \ge 0\right)$, as
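$c \to \infty$ ($\gamma'$ and $\gamma''$ are independent Brownian motions). On the other hand, as stated in the Comments and references of Exercise 6.4, $\frac{4}{(\log t)^2} H_t$ converges in law towards a 1/2 stable unilateral random variable, $T$, as $t \to \infty$; in fact, from the independence of $\gamma$ and $|Z|$, we may deduce that when $t \to \infty$,
$$\frac{1}{\log t} \int_0^t \frac{Z_s}{|Z_s|^3}\, ds = \log t \int_0^{\frac{H_t}{(\log t)^2}} ds \exp\left(i(\log t)\tilde{\gamma}_s\right)$$
converges towards
$$\sqrt{2}\left(\gamma'_{\frac{1}{4}T} + i\gamma''_{\frac{1}{4}T}\right) \overset{(law)}{=} \frac{1}{\sqrt{2}}\left(\gamma'_T + i\gamma''_T\right) ,$$
where $T$, $\gamma'$ and $\gamma''$ are independent. This independence and the scaling property of Brownian motion allow us to write:
$$\frac{1}{\sqrt{2}}\left(\gamma'_T + i\gamma''_T\right) \overset{(law)}{=} \frac{1}{\sqrt{2}}\, \frac{1}{\gamma_1}\left(\gamma'_1 + i\gamma''_1\right) ,$$
with the help of question 1 of Exercise 4.15. Finally, from question 5 of Exercise 4.15, we derive the characteristic function of the above variable, which is $\exp\left(-\frac{|\lambda|}{\sqrt{2}}\right)$, $\lambda \in \mathbb{R}^2$.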
Solution to Exercise 6.6
1. For this question, we need to use the so called skew-product representation of the planar Brownian motion $(X, Y)$ (which in fact we have already proven in the solution to question 1 of Exercise 6.4): put $X_t + iY_t = R_t \exp(i\theta_t)$. Then there exists a planar Brownian motion $(\beta, \gamma)$ such that
$$R_t = |B_0| \exp(\beta_{H_t}) , \quad \text{and} \quad \theta_t = \theta_0 + \gamma_{H_t} , \quad \text{with} \quad H_t \overset{(def)}{=} \int_0^t \frac{ds}{R_s^2} . \tag{6.6.a}$$
Now, we may write $\theta^f_t = \int_0^t$
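$d\gamma_{H_s} \mathbf{1}_{\{R_s \le f(|Z_s|)\}}$.
To proceed, we shall admit that, up to the division by $\log t$, we may replace in the definition of $\theta^f_t$ the process $f(|Z_s|)$ by $s^{a/2}$. So, in the sequel, we put $\tilde{\theta}^f_t = \int_0^t d\gamma_{H_s} \mathbf{1}_{\{\exp(\beta_{H_s}) \le s^{a/2}\}}$. Since $u \mapsto \int_0^u dv \exp(2\beta_v)$ is the inverse of $s \mapsto H_s$, we have
$$\frac{2}{\log t}\, \tilde{\theta}^f_t = \frac{2}{\log t} \int_0^{H_t} d\gamma_u\, \mathbf{1}_{\left\{\beta_u \le \frac{a}{2} \log\left(\int_0^u dv \exp(2\beta_v)\right)\right\}} .$$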
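Now, put $\lambda = \frac{\log t}{2}$, make the change of variables $u = \lambda^2 v$, and use the fact that $\tilde{\beta}_v = \frac{1}{\lambda} \beta_{\lambda^2 v}$ and $\tilde{\gamma}_v = \frac{1}{\lambda} \gamma_{\lambda^2 v}$, $v \ge 0$, are two independent Brownian motions. We obtain:
$$\frac{1}{\lambda}\, \tilde{\theta}^f_t = \int_0^{\frac{1}{\lambda^2} H_t} d\tilde{\gamma}_v\, \mathbf{1}_{\left\{\tilde{\beta}_v \le \frac{a}{2\lambda} \log\left(\lambda^2 \int_0^v dh \exp(2\lambda\tilde{\beta}_h)\right)\right\}} .$$
Now observe that the stochastic integral process $\int_0^{\cdot} d\tilde{\gamma}_v\, \mathbf{1}_{\left\{\tilde{\beta}_v \le \frac{a}{2\lambda} \log\left(\lambda^2 \int_0^v dh \exp(2\lambda\tilde{\beta}_h)\right)\right\}}$ converges in probability (uniformly on any interval of time) towards: $\int_0^{\cdot} d\tilde{\gamma}_v\, \mathbf{1}_{\{\tilde{\beta}_v \le a \sup_{s \le v} \tilde{\beta}_s\}}$. This convergence holds jointly with that of $\frac{1}{\lambda^2} H_t$ towards $\tilde{T} = \inf\{v : \tilde{\beta}_v = 1\}$, which proves the desired result.
2. The same computation as above leads to
$$\frac{1}{\log t}\left(\theta_t - \theta^f_t\right) \underset{t \to \infty}{\overset{(law)}{\longrightarrow}} \int_0^{\sigma} d\gamma_u\, \mathbf{1}_{(\beta_u > a S_u)} = 0 ,$$
so that the convergence holds in probability.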
Solution to Exercise 6.7
1. For any $u \in [0,1]$, let $\Theta_u$ be the transformation which consists in re-ordering the paths $\{f(t), 0 \le t \le u\}$ and $\{f(t), u \le t \le 1\}$ of any continuous function $f$ on $[0,1]$, such that $f(0) = 0$, in such a way that the new function $\Theta_u(f)$ is continuous and vanishes at 0. Formally, $\Theta_u(f)$ is defined by
$$\Theta_u(f)(t) = \begin{cases} f(t+u) - f(u) , & \text{if } t < 1-u , \\ f(t - (1-u)) + f(1) - f(u) , & \text{if } 1-u \le t \le 1 . \end{cases}$$
Let $(B_t, t \in [0,1])$ be a real valued Brownian motion on $[0,1]$. Since $B$ has independent and time homogeneous increments, for any $u \in [0,1]$, we have
$$\Theta_u(B) \overset{(law)}{=} B . \tag{6.7.a}$$
Now we use the representation of the Brownian bridge as $b_t \overset{(law)}{=} B_t - tB_1$, $t \in [0,1]$, which is proved in question 5 of Exercise 6.9. For $u \in [0,1]$, put $B' = \Theta_u(B)$, then it is not difficult to check that $\Theta_u(b)_t = B'_t - tB'_1$, $t \in [0,1]$, and the result follows from (6.7.a).
2. First note that $\Theta_v(b) = \{b_{\{t+v\}} - b_v, 0 \le t \le 1\}$. So,
$$\Theta_u \circ \Theta_v(b) = \Theta_u\left(\{b_{\{t+v\}} - b_v, 0 \le t \le 1\}\right) = \{b_{\{\{t+u\}+v\}} - b_{\{u+v\}}, 0 \le t \le 1\} = \Theta_{\{u+v\}}(b) .$$
Let $F \in L^1(P)$; we can check that $\int_0^1 F \circ \Theta_u\, du$ is $\mathcal{I}$-measurable. Indeed, from above we have for any $v \in [0,1]$,
$$\left(\int_0^1 F \circ \Theta_u\, du\right) \circ \Theta_v = \int_0^1 F \circ \Theta_u \circ \Theta_v\, du = \int_0^1 F \circ \Theta_{\{u+v\}}\, du = \int_0^1 F \circ \Theta_u\, du .$$
On the other hand, from question 1, for any $A \in \mathcal{I}$, and any $u \in [0,1]$, we have $E[F \mathbf{1}_A] = E[F \circ \Theta_u\, \mathbf{1}_A]$. Integrating this expression over $[0,1]$ and using Fubini's Theorem, we obtain:
$$E[F \mathbf{1}_A] = E\left[\int_0^1 du\, F \circ \Theta_u\, \mathbf{1}_A\right] ,$$
which proves the result.
3. Consider $G$, a bounded measurable functional; the result follows from the computation:
$$E[G(m) \mid \mathcal{I}] = \int_0^1 G(m) \circ \Theta_u\, du = \int_0^m du\, G(m-u) + \int_m^1 du\, G(1+m-u) = \int_0^1 ds\, G(s) .$$
The first equality above comes from (6.7.1), and the second one comes from the relation $\{m \circ \Theta_u + u\} = m$ which is straightforward and may easily be seen on a picture.
4. As in question 3, consider $G$ a bounded measurable functional and write
E[G(A0 ) | I] =
1
0
G(A0 ◦ Θu ) du
1
0
1
du G
=
0
ds 1I{bs ≤bu } .
Let be the local time of b, at time 1 and level x ∈ IR. Applying the occupation time density formula, we obtain: (l1x )
E[G(A0 ) | I] =
1
0
=
−∞ +∞
= Making the change of variables: z =
0 +∞
l1x
ds 1I{bs ≤bu }
l1x
−∞
x y −∞ l1
dx G
1
dx G
E[G(A0 ) | I] = which proves the desired result.
1
du G
ds 1I{bs ≤x}
0 x
−∞
l1y
dy .
dy in the above equality, we obtain
1
dz G(z) , 0
212
Exercises in Probability
Solution to Exercise 6.8

1. This follows immediately from the Gaussian character of Brownian motion, as in question 1.(a) of Exercise 5.14.

2. Note that $(X^y_{T_y+t} - y, t \ge 0) = (-(X_{T_y+t} - y), t \ge 0)$, hence from the strong Markov property of Brownian motion at time $T_y$, under $P_0$, $(X^y_{T_y+t} - y, t \ge 0)$ is independent of $(X^y_t, t \le T_y) = (X_t, t \le T_y)$ and has the same law as $(X_t, t \ge 0)$. This proves that $X^y$ is a standard Brownian motion. We prove identity (6.8.2) from the following observation:
$$\{S_t > y, X_t < x\} = \{X^y_t > 2y - x\} , \quad P_0\text{-a.s.},$$
which we easily verify on a picture. Then, differentiating (6.8.2) with respect to $x$ and $y$, we obtain the density of the law of the pair $(X_t, S_t)$ under $P_0$:
$$\Big(\frac{2}{\pi t^3}\Big)^{1/2} (2y - x) \exp\Big(-\frac{(2y-x)^2}{2t}\Big) , \quad x \le y,\ y > 0 . \qquad (6.8.a)$$
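As a sanity check on the density (6.8.a), one can integrate it numerically over the region where the maximum exceeds a level and compare with the classical reflection-principle value 2 P_0(X_t > y). The sketch below is only illustrative: the function names, the midpoint rule and the truncation width are choices made here, not part of the text.

```python
import math


def joint_density(x, y, t):
    """Density (6.8.a) of (X_t, S_t) under P_0; zero outside {x <= y, y > 0}."""
    if y <= 0 or x > y:
        return 0.0
    d = 2.0 * y - x
    return math.sqrt(2.0 / (math.pi * t ** 3)) * d * math.exp(-d * d / (2.0 * t))


def prob_max_exceeds(y0, t, n=400, span=8.0):
    """Midpoint-rule approximation of P_0(S_t > y0).

    The inner variable w = y - x >= 0 is used so that both integrals run over
    plain intervals; 'span' standard deviations are kept in each direction.
    """
    h = span * math.sqrt(t) / n
    total = 0.0
    for i in range(n):
        y = y0 + (i + 0.5) * h
        for k in range(n):
            w = (k + 0.5) * h
            total += joint_density(y - w, y, t)
    return total * h * h


t, y0 = 1.0, 0.5
approx = prob_max_exceeds(y0, t)
exact = math.erfc(y0 / math.sqrt(2.0 * t))  # equals 2 * P_0(X_t > y0)
print(approx, exact)
```

The two printed numbers should agree to several decimal places; the residual difference comes from the quadrature step and the truncation.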
3. Putting $x = y$ in (6.8.2), we obtain $P_0(S_t > y, X_t < y) = P_0(X_t > y)$. Since $P_0(S_t > y) = P_0(S_t > y, X_t \ge y) + P_0(S_t > y, X_t < y)$ and $P_0(S_t > y, X_t \ge y) = P_0(X_t > y)$, we deduce from the previous identity that
$$P_0(S_t > y) = 2P_0(X_t > y) , \quad y > 0 . \qquad (6.8.b)$$
Now, (6.8.b) together with the fact that $P_0(S_t > y) = P_0(T_y < t)$ imply $P_0(T_y < t) = 2P_0(X_t > y)$. Moreover, the homogeneity of the Brownian law (see (6.8.1)) yields the equality $P_0(T_{y-a} < t) = P_a(T_y < t)$, for any $a < y$, which finally gives
$$P_a(T_y < t) = 2P_0(X_t > y - a) , \quad y > 0 . \qquad (6.8.c)$$
Then equation (6.8.3), for any $a < y$, follows from (6.8.c) by differentiating with respect to $t$. The general case is obtained from the symmetry property of Brownian motion.

4. From Lévy's identity, we have:
$$(S_t - X_t, S_t) \overset{(law)}{=} (|X_t|, L_t) , \quad \text{under } P_0 .$$
This identity in law together with (6.8.a) yields the density of the law of the pair $(|X_t|, L_t)$:
$$\Big(\frac{2}{\pi t^3}\Big)^{1/2} (x + y) \exp\Big(-\frac{(x+y)^2}{2t}\Big) , \quad x, y \ge 0 .$$
To deduce from above the density of $(X_t, L_t)$, first write $(X_t, L_t) = (\mathrm{sgn}(X_t)|X_t|, L_t)$. Now, observe that since $L_t$ is a functional of $(|X_u|, u \ge 0)$ and $\mathrm{sgn}(X_t)$ is independent of $(|X_u|, u \ge 0)$, then $\mathrm{sgn}(X_t)$ is independent of $(|X_t|, L_t)$. Moreover, $\mathrm{sgn}(X_t)$ is a symmetric Bernoulli random variable. Hence, the density of the pair $(X_t, L_t)$ under $P_0$ is:
$$\Big(\frac{1}{2\pi t^3}\Big)^{1/2} (|x| + y) \exp\Big(-\frac{(|x|+y)^2}{2t}\Big) , \quad x \in \mathbb{R},\ y \ge 0 .$$
We may deduce from above and the absolute continuity relation (6.9.1) an explicit expression for the density of the joint law of $(X_t, L_t)$, $t < 1$, under the law of the standard Brownian bridge, $P^{(1)}_{0\to0}$:
$$\Big(\frac{1}{2\pi(1-t)t^3}\Big)^{1/2} (|x| + y) \exp\Big(-\frac{(|x|+y)^2}{2t} - \frac{x^2}{2(1-t)}\Big) , \quad x \in \mathbb{R},\ y \ge 0 .$$
Solution to Exercise 6.9

1. For any $a, \mu \in \mathbb{R}$, let $P_\mu^a$ be the law of the Brownian motion starting at $\mu$ with drift $a$, that is the law of $(\mu + X_s + as, s \ge 0)$ under $P$. The invariance of the Brownian law by time inversion may be stated as
$$\big[(sX_{1/s}, s > 0), P_\mu^a\big] = \big[(X_s, s > 0), P_a^\mu\big] .$$
In other words, time inversion interchanges the drift and the starting point. From the above identity, we have for any $x \in \mathbb{R}$:
$$\Big[(sX_{1/s}, s > 0), P_\mu^a\Big(\cdot \,\Big|\, X_{1/t} = \frac{x}{t}\Big)\Big] = \big[(X_s, s > 0), P_a^\mu(\cdot \mid X_t = x)\big] .$$
Applying the Markov property at time $s = \frac{1}{t}$ to the process $\big[(sX_{1/s}, s \ge 0), P_\mu^a\big]$ gives
$$\Big[\big(sX_{\frac{1}{s} - \frac{1}{t}}, 0 \le s \le t\big), P^a_{x/t}\Big] = \big[(X_s, 0 \le s \le t), P_a^\mu(\cdot \mid X_t = x)\big] ,$$
which is the first part of the question. Note that the law of the process on the right hand side of the above identity does not depend on $\mu$.

We now deduce the law of $T_0$ under $P^{(t)}_{a\to x}$, with the help of the hint, and of the representation of the bridge in terms of Brownian motion with drift. Indeed, it follows from these two arguments that under $P^{(t)}_{a\to x}$, the r.v. $\frac{1}{T_0} - \frac{1}{t}$ is distributed as the last passage time at 0 of the process $\big(\frac{x}{t} + X_s + as, s \ge 0\big)$, where $(X_s, s \ge 0)$ is a standard Brownian motion. Thus, from the hint, we deduce:
$$P^{(t)}_{a\to x}\Big(\frac{1}{T_0} - \frac{1}{t} \in ds\Big) = (\mathrm{cst})\, p^{(a)}_s\Big(\frac{x}{t}, 0\Big)\, ds , \qquad (6.9.a)$$
where $p^{(a)}_s(x, y)$ is the density of the semigroup of Brownian motion with drift $a$. It is easily shown that:
$$p^{(a)}_s(x, y) = \frac{1}{\sqrt{2\pi s}} \exp\Big(-\Big(\frac{1}{2s}(x-y)^2 + a(x-y) + \frac{a^2 s}{2}\Big)\Big) ,$$
so that it follows from (6.9.a) that:
$$P^{(t)}_{a\to x}(T_0 \in ds) = |a|\, ds\, \sqrt{\frac{t}{2\pi s^3 (t-s)}}\, \exp\Big(-\Big(\frac{x^2}{2}\,\frac{s}{t(t-s)} + a\,\frac{x}{t} + \frac{a^2}{2}\Big(\frac{1}{s} - \frac{1}{t}\Big)\Big)\Big) . \qquad (6.9.b)$$
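Formula (6.9.b) can be checked numerically. For a > 0 and x > 0, the total mass of this density on (0, t) should equal the probability that a Brownian bridge from a to x of length t reaches 0, which is the classical value exp(-2ax/t). The sketch below is a hedged illustration: the function names and the midpoint rule are choices made here, and the parameter values are arbitrary.

```python
import math


def hitting_density(s, a, x, t):
    """Density of T_0 at s in (0, t) under the bridge law from a to x of length t."""
    pref = abs(a) * math.sqrt(t / (2.0 * math.pi * s ** 3 * (t - s)))
    expo = (
        0.5 * x * x * s / (t * (t - s))
        + a * x / t
        + 0.5 * a * a * (1.0 / s - 1.0 / t)
    )
    return pref * math.exp(-expo)


def prob_hit_before_t(a, x, t, n=20000):
    """Midpoint-rule approximation of the bridge probability that T_0 < t."""
    h = t / n
    return h * sum(hitting_density((k + 0.5) * h, a, x, t) for k in range(n))


a, x, t = 1.0, 0.5, 2.0
print(prob_hit_before_t(a, x, t), math.exp(-2.0 * a * x / t))
```

The integrand vanishes very fast at both ends of (0, t) when a and x are non-zero, so a plain midpoint rule is enough here.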
2. Let $f$ be a generic bounded Borel function. It follows from the definition of $P^{(t)}_{a\to x}$ that
$$\int_{\mathbb{R}} E^{(t)}_{a\to x}[F(X_u, u \le s)]\, p_t(a, x) f(x)\, dx = E_a[F(X_u, u \le s) f(X_t)] = E_a\big[F(X_u, u \le s)\, E_{X_s}[f(X_{t-s})]\big] = \int_{\mathbb{R}} E_a[F(X_u, u \le s)\, p_{t-s}(X_s, x)]\, f(x)\, dx ,$$
where the second equality is obtained by applying the Markov property at time $s$. We obtain relation (6.9.1) by identifying the first and last terms of these equalities, using the continuity in $x$ of both quantities $E^{(t)}_{a\to x}[F(X_u, u \le s)]\, p_t(a, x)$ and $E_a[F(X_u, u \le s)\, p_{t-s}(X_s, x)]$, for $F$ a bounded continuous functional.

3. Relation (6.9.2) follows from (6.9.1) using the fact that $(p_{t-s}(X_s, x), s < t)$ is a $P_a$-martingale, together with the optional stopping theorem.

4. We obtain (6.9.3) by taking $S = T_y$ and $F = \mathbf{1}_{\{T_y < s\}}$ in (6.9.2). We obtain (6.9.4) from (6.9.3) by writing:
$$P^{(t)}_{a\to x}(T_y < s) = \int_0^s \frac{p_{t-u}(y, x)}{p_t(a, x)}\, P_a(T_y \in du) .$$
Then, differentiate with respect to $s$ and use (6.8.3). Finally, it is immediate to recover the density of the law of $T_0$ given in (6.9.b) from (6.9.3).

5. To prove (6.9.5), observe that $p_u(0, x) = p_u(0, -x)$ for all $u \ge 0$, hence applying (6.9.3), we have:
$$P^{(t)}_{a\to -x}(T_0 < s) = E_a\Big[\frac{p_{t-T_0}(0, x)}{p_t(a, -x)}\, \mathbf{1}_{\{T_0 < s\}}\Big] .$$
By combining the above equality with (6.9.3), we obtain (6.9.5). In particular, since $P^{(t)}_{a\to -x}(T_0 < t) = 1$, we have (6.9.6).
Solution to Exercise 6.10

1. Recall that $Z_1$ and $2\sqrt{Z_1 Z_{1/2}}$ have the same law if and only if their real moments coincide (see Exercise 4.3). For every $z \ge 0$, we have $E[Z_1^{2z-1}] = \Gamma(2z)$, on the one hand, and
$$E\big[(2\sqrt{Z_1 Z_{1/2}})^{2z-1}\big] = 2^{2z-1}\, E\big[Z_1^{z-\frac12}\big]\, E\big[Z_{1/2}^{z-\frac12}\big] = 2^{2z-1}\, \Gamma\big(z + \tfrac12\big)\, \Gamma\big(\tfrac12\big)^{-1} \Gamma(z) = \frac{1}{\sqrt{2\pi}}\, 2^{2z-\frac12}\, \Gamma(z)\, \Gamma\big(z + \tfrac12\big)$$
on the other hand. The identity (6.10.2) follows from: $N^2 \overset{(law)}{=} 2Z_{1/2}$.
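The equality of the two moment expressions above is the Gamma duplication formula, and it is easy to confirm numerically. The short sketch below uses only `math.gamma`; the function names are chosen here for illustration.

```python
import math


def moment_lhs(z):
    """E[Z_1^(2z-1)] = Gamma(2z), with Z_1 a gamma variable of parameter 1."""
    return math.gamma(2.0 * z)


def moment_rhs(z):
    """E[(2 sqrt(Z_1 Z_{1/2}))^(2z-1)], written as the product of the two moments."""
    return (
        2.0 ** (2.0 * z - 1.0)
        * math.gamma(z + 0.5)
        * math.gamma(z)
        / math.gamma(0.5)
    )


for z in (0.25, 0.5, 1.0, 2.5, 4.0):
    print(z, moment_lhs(z), moment_rhs(z))
```

Each printed row should show two values that agree up to floating-point rounding.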
2. It follows from (6.8.3) that $\frac{T_x}{2x^2}$ has a unilateral stable law with parameter $\frac12$. So, we may write
$$P(T_x < Z_1) = \int_0^\infty P(T_x < t)\, e^{-t}\, dt = E[e^{-T_x}] = \exp(-\sqrt{2}\, x) .$$
The second formula follows from properties of Brownian motion which we now recall. At first, observe that if $\bar B$ is the past maximum process of $B$, defined by $\bar B_t = \sup_{0 \le s \le t} B_s$, then (6.10.4) can be written as $P(T_x < Z_1) = P(x < \bar B_{Z_1})$. Secondly, it follows from the so-called reflection principle of Brownian motion that $\bar B_t \overset{(law)}{=} |B_t|$, for every $t \ge 0$. Finally, the scaling property of Brownian motion implies that $B_t \overset{(law)}{=} t^{\frac12} B_1$ for every deterministic $t$, so that we have
$$P(x < \bar B_{Z_1}) = P(x < |B_{Z_1}|) = P(x < \sqrt{Z_1}\, |B_1|) = P(x < \sqrt{Z_1}\, |N|) .$$

3. With (6.10.3) and (6.10.4), we obtain $P(x < \sqrt{Z_1}\, |N|) = \exp(-\sqrt{2}\, x)$, which precisely implies identity (6.10.2).
Solution to Exercise 6.11

1. The process $(B_{u(s)}, s \ge 0)$ is a martingale in its own filtration, which is also that of $X$, since $v(t) \ne 0$ for all $t > 0$. Hence, using Itô's formula, we may write:
$$X_t = \int_0^t v(s)\, dB_{u(s)} + \int_0^t B_{u(s)}\, dv(s) , \quad t \ge 0 .$$
The process $\big(\int_0^t B_{u(s)}\, dv(s), t \ge 0\big)$ is adapted to the same filtration and has bounded variation. This proves that $(X_t)$ is a semimartingale whose martingale part is $\big(\int_0^t v(s)\, dB_{u(s)}\big)$.

2. According to Lévy's Characterization Theorem, the martingale $\big(\int_0^t v(s)\, dB_{u(s)}, t \ge 0\big)$ is a Brownian motion if and only if its quadratic variation is $t$, that is
$$\int_0^t v^2(s)\, du(s) = t , \quad \text{for every } t \ge 0 .$$
Since $u$ and $v$ are continuous functions, this is equivalent to $v^2(s)\, u'(s) \equiv 1$.

3. Tanaka's formula applied to $X$ gives
$$|X_t| = \int_0^t v(s)\, \mathrm{sgn}(B_{u(s)})\, dB_{u(s)} + \int_0^t |B_{u(s)}|\, dv(s) + L^X_t . \qquad (6.11.a)$$
Thanks to the same formula, we also have
$$|B_{u(t)}| = \int_0^t \mathrm{sgn}(B_{u(s)})\, dB_{u(s)} + L^B_{u(t)} ,$$
which leads to
$$\int_0^t v(s)\, d|B_{u(s)}| = \int_0^t v(s)\, \mathrm{sgn}(B_{u(s)})\, dB_{u(s)} + \int_0^t v(s)\, dL^B_{u(s)} .$$
It follows from the above identity and the decomposition $|X_t| = v(t)|B_{u(t)}| = \int_0^t v(s)\, d|B_{u(s)}| + \int_0^t |B_{u(s)}|\, dv(s)$, that
$$|X_t| = \int_0^t v(s)\, \mathrm{sgn}(B_{u(s)})\, dB_{u(s)} + \int_0^t |B_{u(s)}|\, dv(s) + \int_0^t v(s)\, dL^B_{u(s)} . \qquad (6.11.b)$$
We obtain the result by comparing (6.11.a) and (6.11.b).

4. With $u(t) = \frac{1 - e^{-2\beta t}}{2\beta}$, $u^{-1}(t) = \frac{-1}{2\beta} \log(1 - 2\beta t)$, and $v(t) = e^{\beta t}$, we have from (6.11.1):
$$L^U_t = \int_0^{\frac{1 - e^{-2\beta t}}{2\beta}} \frac{dL^B_s}{\sqrt{1 - 2\beta s}} .$$

5. We obtain this formula simply by applying (6.11.1) with $u(t) = \frac{t}{1-t}$, hence $u^{-1}(t) = \frac{t}{1+t}$, and $v(t) = 1 - t$.
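According to question 2 of Exercise 6.8, the density of the law of $l_t$ is
$$h_t(y) = \Big(\frac{1}{2\pi(1-t)t^3}\Big)^{1/2} \int_{-\infty}^{\infty} (|x| + y) \exp\Big(-\frac{(|x|+y)^2}{2t} - \frac{x^2}{2(1-t)}\Big)\, dx$$
$$= \Big(\frac{2}{\pi(1-t)t^3}\Big)^{1/2} \exp\Big(-\frac{y^2}{2}\Big) \int_0^{\infty} (x + y) \exp\Big(-\frac{(x + (1-t)y)^2}{2t(1-t)}\Big)\, dx$$
$$= \Big(\frac{2}{\pi}\Big)^{1/2} \frac{1}{t} \exp\Big(-\frac{y^2}{2}\Big) \int_{y\sqrt{\frac{1-t}{t}}}^{\infty} \big(\sqrt{t(1-t)}\, z + ty\big) \exp\Big(-\frac{z^2}{2}\Big)\, dz$$
$$= \Big(\frac{2}{\pi}\Big)^{1/2} \sqrt{\frac{1-t}{t}}\, \exp\Big(-\frac{y^2}{2t}\Big) + 2y \exp\Big(-\frac{y^2}{2}\Big)\, \Phi\Big(-\sqrt{\frac{1-t}{t}}\, y\Big) ,$$
where, in the last equality, $\Phi$ is the distribution function of the centred normal distribution with variance 1. Finally, from (6.11.2), the density of $\int_0^t \frac{dL_v}{1+v}$ is given by:
$$h_{\frac{t}{1+t}}(y) = \sqrt{\frac{2}{\pi t}}\, \exp\Big(-\frac{(1+t)y^2}{2t}\Big) + 2y \exp\Big(-\frac{y^2}{2}\Big)\, \Phi\Big(-\frac{y}{\sqrt{t}}\Big) .$$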
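A quick numerical check of the closed form obtained for h_t is that it integrates to 1 over y >= 0 for every t in (0, 1). The sketch below is a minimal illustration under that reading; the function names, the midpoint rule and the truncation at y = 12 are choices made here.

```python
import math


def std_normal_cdf(x):
    """Distribution function of the centred normal law with variance 1."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bridge_local_time_density(y, t):
    """Closed form of h_t(y) for y >= 0 and 0 < t < 1."""
    c = math.sqrt((1.0 - t) / t)
    first = math.sqrt(2.0 / math.pi) * c * math.exp(-y * y / (2.0 * t))
    second = 2.0 * y * math.exp(-y * y / 2.0) * std_normal_cdf(-c * y)
    return first + second


def total_mass(t, n=20000, upper=12.0):
    """Midpoint-rule approximation of the integral of h_t over [0, upper]."""
    h = upper / n
    return h * sum(bridge_local_time_density((k + 0.5) * h, t) for k in range(n))


for t in (0.1, 0.5, 0.9):
    print(t, total_mass(t))
```

Analytically, the first term contributes sqrt(1 - t) and the second contributes 1 - sqrt(1 - t), so the printed masses should all be very close to 1.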
Solution to Exercise 6.12

1. From the time inversion invariance property of Brownian motion, the process $\big(\tilde B_t \overset{(def)}{=} tB_{1/t}, t \ge 0\big)$ is a Brownian motion and we have
$$\frac{1}{\Lambda} = \inf\{u : \tilde B_u = 1\} \overset{(def)}{=} \tilde T_1 .$$
As is well known, and follows from the scaling property and the reflection principle (see question 1 of Exercise 6.8), one has $\tilde T_1 \overset{(law)}{=} \tilde B_1^{-2}$, which gives (6.12.1). It also appears in Exercise 5.11 that $\tilde T_1$ has a unilateral stable (1/2) distribution (see the comments at the end). So, the result follows from question 1 of Exercise 4.15.

2. Reversing the time in (6.12.2), we see that this equation is equivalent to
$$\big(\tilde B_t - 1, t \ge \tilde T_1\big) \overset{(law)}{=} \Big(t\sqrt{\Lambda}\, b_{\frac{1}{t\Lambda}}, t \ge \frac{1}{\Lambda}\Big) .$$
Now, the proof of the above identity may be further reduced, by making the changes of variables $t = \tilde T_1 + u$ and $t = \frac{1}{\Lambda} + u$, to showing:
$$\Big(\tilde T_1, \big(\tilde B_{u + \tilde T_1} - 1, u \ge 0\big)\Big) \overset{(law)}{=} \Big(\frac{1}{\Lambda}, \Big(\frac{1 + \Lambda u}{\sqrt{\Lambda}}\, b_{\frac{1}{1 + \Lambda u}}, u \ge 0\Big)\Big) .$$
Applying the strong Markov property of Brownian motion at time $\tilde T_1$ and using the fact that it has independent and stationary increments, this also may be written as
$$\big(\tilde T_1, (B'_u, u \ge 0)\big) \overset{(law)}{=} \Big(\frac{1}{\Lambda}, \Big(\frac{1 + \Lambda u}{\sqrt{\Lambda}}\, b_{\frac{1}{1 + \Lambda u}}, u \ge 0\Big)\Big) ,$$
where $B'$ is a standard Brownian motion which is independent of $\tilde T_1$. Finally it follows from the scaling property of Brownian motion, and question 5 of Exercise 6.11, that the process
$$\Big(\frac{1 + \Lambda u}{\sqrt{\Lambda}}\, b_{\frac{1}{1 + \Lambda u}}, u \ge 0\Big)$$
is a standard Brownian motion which is independent of $\Lambda$. This proves the result.

3. (i) and (ii). The proofs of (6.12.3) and (6.12.4) are exactly the same as the proofs of questions 1 and 2, once we notice that
$$\frac{1}{\Lambda_\mu} = \inf\{u : \tilde B_u = \mu\} .$$
From the scaling property of Brownian motion, for any $\lambda > 0$, the re-scaled process $\big(\sqrt{\lambda}\, b_{t/\lambda}, 0 \le t \le \lambda\big)$ is distributed as a Brownian bridge starting from 0, ending at 0 and with length $\lambda$. Hence, the second part of the question follows directly from the first.
Solution to Exercise 6.13

1. First recall that the process $\tilde B = (tB_{1/t}, t > 0)$ has the same law as $(B_t, t > 0)$. Moreover, the process $b = \big(\tilde B_s - \frac{s}{1+m}\, \tilde B_{1+m}, s \in [0, 1+m]\big)$ has the law of a standard Brownian bridge $P^{(1+m)}_{0\to0}$. From the Markov property of the Brownian bridge applied at time 1, conditionally on $b_1 = a$, the process
$$(b_{1+t}, t \in [0, m]) = \Big((1+t)\big(B_{\frac{1}{1+t}} - B_{\frac{1}{1+m}}\big), t \in [0, m]\Big)$$
has law $P^{(m)}_{a\to0}$. Finally, we deduce the result from the identity in law
$$\Big((1+t)\big(B_{\frac{1}{1+t}} - B_{\frac{1}{1+m}}\big), t \in [0, m]\Big) \overset{(law)}{=} \Big((1+t)\, B_{\frac{1}{1+t} - \frac{1}{1+m}}, t \in [0, m]\Big) .$$

2. From above, we use the fact that if $(X_u, u \le m)$ is distributed as $P^{(m)}_{a\to0}$, then $(X_{m-u}, u \le m)$ is distributed as $P^{(m)}_{0\to a}$. This leads to the identity in law:
$$\Big(\int_0^m dt\, \varphi(t) B_t^2 \,\Big|\, B_m = a\Big) \overset{(law)}{=} \Big(\int_0^m \varphi(m-t)(1+t)^2\, B^2_{\frac{1}{1+t} - \frac{1}{1+m}}\, dt \,\Big|\, B_{\frac{m}{1+m}} = a\Big) ,$$
and from the change of variable $\frac{s}{1+m} = \frac{1}{1+t} - \frac{1}{1+m}$ in the right hand side, it follows:
$$\Big(\int_0^m dt\, \varphi(t) B_t^2 \,\Big|\, B_m = a\Big) \overset{(law)}{=} \Big(\int_0^m \varphi\Big(\frac{(1+m)s}{1+s}\Big) \frac{(m+1)^3}{(s+1)^4}\, B^2_{\frac{s}{1+m}}\, ds \,\Big|\, B_{\frac{m}{1+m}} = a\Big) .$$
Recall that from the scaling property of Brownian motion,
$$\big(\sqrt{m+1}\, B_{\frac{s}{m+1}}, 0 \le s \le m\big) \overset{(law)}{=} (B_s, 0 \le s \le m) ,$$
which yields, from above:
$$\Big(\int_0^m dt\, \varphi(t) B_t^2 \,\Big|\, B_m = a\Big) \overset{(law)}{=} \Big((1+m)^2 \int_0^m \frac{ds}{(s+1)^4}\, B_s^2\, \varphi\Big(\frac{(1+m)s}{1+s}\Big) \,\Big|\, B_m = a\sqrt{1+m}\Big) .$$
One means to compute the expression $E\big[\exp\big(-\lambda \int_0^m dt\, B_t^2\big) \mid B_m = a\big]$ is to determine the Laplace transform
$$I_{\alpha,b} \overset{(def)}{=} E\Big[\exp\Big(-\alpha B_m^2 - \frac{b^2}{2} \int_0^m dt\, B_t^2\Big)\Big] ,$$
as a consequence of Girsanov's transformation. One will find a detailed proof in the book referred to in the statement of the exercise.
Solution to Exercise 6.14

1. First observe that from the condition (6.14.1), the semimartingale $X$ converges almost surely, so that $X_T$ and $\langle X\rangle_T$ are well defined for any stopping time $T$ (taking possibly the value $\infty$). We also point out that from Itô's formula, we have
$$X_t^2 - \langle X\rangle_t = 2\int_0^t X_s\, dX_s = 2\int_0^t X_s\, dM_s + 2\int_0^t X_s\, dV_s . \qquad (6.14.a)$$
Assume that (6.14.2) holds for every stopping time $T$. Then it is well known that the process $(X_t^2 - \langle X\rangle_t, t \ge 0)$ is a martingale. From (6.14.a) we deduce that $\big(\int_0^t X_s\, dV_s, t \ge 0\big)$ is also a martingale. Since this process has finite variation, it necessarily vanishes. This is equivalent to (6.14.3).

Assume now that (6.14.3) holds. Then the process $\big(\int_0^t X_s\, dV_s, t \ge 0\big)$ vanishes and from (6.14.a), $(X_t^2 - \langle X\rangle_t, t \ge 0)$ is a martingale. Therefore, (6.14.2) holds for every bounded stopping time. If $T$ is any stopping time, then we may find a sequence of (bounded) stopping times $T_n$, $n \ge 1$, which converges almost surely towards $T$ and such that for every $n \ge 1$, $(X^2_{t\wedge T_n} - \langle X\rangle_{t\wedge T_n}, t \ge 0)$ is a bounded martingale. Hence,
$$\lim_{n\to\infty} \big(X^2_{T_n} - \langle X\rangle_{T_n}\big) = X_T^2 - \langle X\rangle_T , \quad \text{a.s.,} \qquad (6.14.b)$$
and
$$E[X^2_{T_n}] = E[\langle X\rangle_{T_n}] . \qquad (6.14.c)$$
From the Burkholder–Davis–Gundy inequalities, there exists a constant $C$ such that:
$$E\Big[\Big(\sup_{s\ge0} |X_s|\Big)^2\Big] \le C\, E\Big[\langle M\rangle_\infty + \Big(\int_0^\infty |dV_s|\Big)^2\Big] , \qquad (6.14.d)$$
hence we obtain (6.14.2) for $T$ from (6.14.b), (6.14.c), (6.14.1) and Lebesgue's Theorem of dominated convergence.

2. $X^+$ and $X^-$ are semimartingales such that for any time $t \ge 0$, $\langle X^+\rangle_t + \langle X^-\rangle_t = \langle X\rangle_t$ and $(X_t^+)^2 + (X_t^-)^2 = X_t^2$, almost surely. Hence, if $X^+$ and $X^-$ satisfy (6.14.2), then so does $X$.

Suppose that $X$ satisfies (6.14.2). From question 1, $X$ also satisfies (6.14.3). We will check that $X^+$ and $X^-$ satisfy (6.14.1) and (6.14.3) (so that they satisfy (6.14.2)), that is
$$E\Big[\langle M^{(+)}\rangle_\infty + \Big(\int_0^\infty |dV_s^{(+)}|\Big)^2\Big] < \infty , \quad E\Big[\langle M^{(-)}\rangle_\infty + \Big(\int_0^\infty |dV_s^{(-)}|\Big)^2\Big] < \infty , \qquad (6.14.e)$$
and
$$\int_0^\infty \mathbf{1}_{\{X_s^+ \ne 0\}}\, |dV_s^{(+)}| = 0 , \quad \int_0^\infty \mathbf{1}_{\{X_s^- \ne 0\}}\, |dV_s^{(-)}| = 0 , \qquad (6.14.f)$$
where $M^{(+)}$ and $M^{(-)}$ are the martingale parts of $X^+$ and $X^-$, respectively, and $V^{(+)}$ and $V^{(-)}$ are their finite variation parts. From Tanaka's formula and (6.14.3), $V^{(+)}$ and $V^{(-)}$ are given by
$$V_t^{(+)} = \int_0^t \mathbf{1}_{\{X_s > 0\}}\, dV_s + \frac12 \int_0^t dL_s^X = \frac12 L_t^X ,$$
$$V_t^{(-)} = -\int_0^t \mathbf{1}_{\{X_s \le 0\}}\, dV_s + \frac12 \int_0^t dL_s^X = -\int_0^t \mathbf{1}_{\{X_s = 0\}}\, dV_s + \frac12 L_t^X ,$$
where $L^X$ is the local time at 0 of $X$. Since $V$ and $L^X$ have zero variation on the set $\{t : X_t \ne 0\}$, which contains both $\{t : X_t^+ \ne 0\}$ and $\{t : X_t^- \ne 0\}$, $V^{(+)}$ (resp. $V^{(-)}$) has zero variation on $\{t : X_t^+ \ne 0\}$ (resp. $\{t : X_t^- \ne 0\}$). This proves (6.14.f).

Now we check (6.14.e). Since $\frac12 L_t^X = X_t^+ - \int_0^t \mathbf{1}_{\{X_s > 0\}}\, dV_s - \int_0^t \mathbf{1}_{\{X_s > 0\}}\, dM_s$, there exists a constant $D$ such that
$$E[(L_\infty^X)^2] \le D\, E\Big[\Big(\sup_{s\ge0} |X_s|\Big)^2 + \Big(\int_0^\infty |dV_s|\Big)^2 + \langle M\rangle_\infty\Big] .$$
From (6.14.d), the right hand side of the above inequality is finite. Finally, we have $\langle M^{(+)}\rangle_\infty = \int_0^\infty \mathbf{1}_{\{X_s > 0\}}\, d\langle M\rangle_s \le \langle M\rangle_\infty$ and $\langle M^{(-)}\rangle_\infty = \int_0^\infty \mathbf{1}_{\{X_s \le 0\}}\, d\langle M\rangle_s \le \langle M\rangle_\infty$, almost surely. This ends the proof of (6.14.e).

3. (i) We easily check that the semimartingale $X = -M + S^M$ satisfies (6.14.3), hence it satisfies (6.14.2).

(ii) Developing the square of the semimartingale $X = M + L^M$, we obtain:
$$E[X_T^2] = E[M_T^2 + 2M_T L_T^M + (L_T^M)^2] = E[M_T^2] + E[(L_T^M)^2] = E[\langle M\rangle_T] + E[(L_T^M)^2] .$$
Since $\langle X\rangle_T = \langle M\rangle_T$, $X$ satisfies (6.14.2) if and only if $L_t^M = 0$, for all $t$; but since $E[|M_t|] = E[L_t^M]$ for every $t$, $L_t^M = 0$ is equivalent to $M_t = 0$.

4. Tanaka's formula yields:
$$|X_t| - L_t = \int_0^t \mathrm{sgn}(X_s)\, dX_s .$$
If (6.14.4) holds for every stopping time $T$ then $(|X_t| - L_t)$ is a martingale and so is $X_t = \int_0^t \mathrm{sgn}(X_s)\, d(|X_s| - L_s)$. If $(X_t)$ is a martingale, then (6.14.4) holds for every bounded stopping time. But as in question 1, the condition (6.14.1) allows us to prove that (6.14.4) holds for every stopping time.
Solution to Exercise 6.15

Let $A_t = \int_0^t ds\, \exp(2B_s)$. First we prove the identity
$$\Big(\exp(B_t), \int_0^t \exp(B_s)\, dW_s ;\ t \ge 0\Big) = (R(A_t), \beta(A_t);\ t \ge 0) , \qquad (6.15.a)$$
where $R$ is a two-dimensional Bessel process started at 1 and $\beta$ is a Brownian motion, independent of $R$.

Since $\big(\int_0^t \exp(B_s)\, dW_s ;\ t \ge 0\big)$ is a martingale whose quadratic variation is $A_t$, the process $\beta$ is nothing but its Dambis–Dubins–Schwarz (DDS) Brownian motion, and the identity
$$\Big(\int_0^t \exp(B_s)\, dW_s ;\ t \ge 0\Big) = (\beta(A_t);\ t \ge 0)$$
follows. To prove the other identity, write from Itô's formula:
$$\exp(2B_t) = 1 + 2\int_0^t \exp(2B_s)\, dB_s + 2\int_0^t \exp(2B_s)\, ds . \qquad (6.15.b)$$
Changing the time on both sides of this equation with the inverse $A_u^{-1}$ of $A_t$, we get
$$\exp(2B_{A_u^{-1}}) = 1 + 2\int_0^u \exp(B_{A_s^{-1}})\, d\gamma_s + 2u ,$$
where $\gamma$ is the (DDS) Brownian motion defined by $\int_0^t \exp(B_s)\, dB_s = \gamma(A_t)$. This shows that $\exp(2B_{A_u^{-1}})$ is the square of a two-dimensional Bessel process started at 1, and the first identity of (6.15.a) is proved.

The independence between $R$ and $\beta$ follows from that of $B$ and $W$: after time changing, $\gamma$, the driving Brownian motion of $R$, is orthogonal to $\beta$, hence these Brownian motions are independent.

Finally, since $\sqrt{R^2 + \beta^2}$ is transient (it is a three-dimensional Bessel process) and $\lim_{t\to+\infty} A_t = +\infty$, it is clear that the norm of the process $(R(A_t), \beta(A_t);\ t \ge 0)$ converges almost surely towards $\infty$ as $t \to +\infty$.

Note that instead of using equation (6.15.b), we might also have simply developed $\exp(B_t)$ with Itô's formula, and found $R$ instead of $R^2$.
Solution to Exercise 6.16

To present the solution, we find it easier to work with the process $I_t \overset{(def)}{=} \int_0^t ds\, X_s$, rather than with $\overline{X}_t = \frac{1}{t} I_t$. Let $f$ be a real valued function defined on $\mathbb{R}$ which is $C^1$, with compact support. Differentiating $f(I_t)$ with respect to $t$ gives $\frac{d}{dt}(f(I_t)) = X_t f'(I_t)$, so that $f(I_t) = f(0) + \int_0^t ds\, X_s f'(I_s)$, and
$$E[f(I_t)] = f(0) + E\Big[\int_0^t ds\, X_s f'(I_s)\Big] .$$
On the other hand, using the scaling property of the process $X$, we obtain that for fixed $t$, $f(I_t) \overset{(law)}{=} f(tI_1) = f(0) + \int_0^t ds\, f'(sI_1) I_1$, hence:
$$E[f(I_t)] = f(0) + \int_0^t ds\, E[f'(sI_1) I_1] .$$
Comparing the two expressions obtained for $E[f(I_t)]$ and again using the scaling property of $X$, we get, for every $s$:
$$E[X_s f'(I_s)] = E[I_1 f'(sI_1)] = \frac{1}{s}\, E[I_s f'(I_s)] ,$$
which yields the result.
Solution to Exercise 6.17

1. Suppose that there exists another process $(Y_t, t \ge 0)$ (rather than $(X'_t, t \ge 0)$) such that $X_t - \int_0^t ds\, Y_s$ is an $\mathcal{F}_t$-martingale. Subtracting $X_t - \int_0^t ds\, X'_s$, we see that $\int_0^t ds\, (X'_s - Y_s)$ is an $\mathcal{F}_t$-martingale. Since $\int_0^t ds\, (X'_s - Y_s)$ has finite variation, it necessarily vanishes. This proves that $X'$ and $Y$ are indistinguishable.

2. It is clear that formula (6.17.1) holds for $n = 1$. Suppose that formula (6.17.1) holds up to order $n - 1$. An integration by parts gives:
$$\frac{t^n}{n!}\, X_t^{(n)} = \int_0^t \frac{s^{n-1}}{(n-1)!}\, X_s^{(n)}\, ds + \int_0^t \frac{s^n}{n!}\, dX_s^{(n)} .$$
Since $dX_s^{(n)} = dM_s^{X^{(n)}} + X_s^{(n+1)}\, ds$, we have:
$$(-1)^{n-1} \int_0^t \frac{s^{n-1}}{(n-1)!}\, X_s^{(n)}\, ds = (-1)^{n-1} \Big(\frac{t^n}{n!}\, X_t^{(n)} - \int_0^t \frac{s^n}{n!}\, dM_s^{X^{(n)}} - \int_0^t \frac{s^n}{n!}\, X_s^{(n+1)}\, ds\Big) . \qquad (6.17.a)$$
Plugging (6.17.a) into formula (6.17.1) taken at order $n - 1$, we obtain (6.17.1) at order $n$.

3. We will prove a more general result. If $(X_t)$ is a Markov process, taking values in $(E, \mathcal{E})$, its extended infinitesimal generator $L$ is an operator acting on functions $f : E \to \mathbb{R}$ such that $f(X_t)$ is $(\mathcal{F}_t)$ differentiable, where $\mathcal{F}_t = \sigma\{X_s, s \le t\}$. Then one can show that $(f(X_t))' = g(X_t)$, for some function $g$. One denotes $g = Lf$ and $f \in D(L)$. Now it follows directly from the previous question that if $f \in D(L^{n+1})$, then:
$$f(X_t) - tLf(X_t) + \frac{t^2}{2}\, L^2 f(X_t) + \cdots + (-1)^n \frac{t^n}{n!}\, L^n f(X_t) - (-1)^n \int_0^t \frac{s^n}{n!}\, L^{n+1} f(X_s)\, ds$$
is a martingale.

The result of this question simply follows from the above property applied to Brownian motion, whose infinitesimal generator $L$ satisfies $Lf = \frac12 f''$ ($f \in C_c^2$), and to $f(x) = x^k$.

4. A formal proof goes as follows. Write, for $s < t$, the martingale property,
$$E\Big[\exp\Big(aB_t - \frac{a^2 t}{2}\Big) \,\Big|\, \mathcal{F}_s\Big] = \exp\Big(aB_s - \frac{a^2 s}{2}\Big) ,$$
and develop both sides as a series in $a$; the martingale property for each process $(H_k(B_t, t), t \ge 0)$ follows. Now, justify fully those arguments!

Comment: The notion of extended infinitesimal generator for a Markov process $(X_t)$, which, to our knowledge, is due to

H. Kunita: Absolute continuity of Markov processes and generators. Nagoya Math. J., 36, 1–26 (1969).

H. Kunita: Absolute continuity of Markov processes. Séminaire de Probabilités X, Lecture Notes in Mathematics, 511, 44–77, Springer, Berlin, 1976

is very convenient to compute martingales associated with $(X_t)$, especially when the laws of $X$ are characterized via a martingale problem, à la Stroock–Varadhan, for which the reader may consult:

D.W. Stroock and S.R.S. Varadhan: Multidimensional Diffusion Processes. Grundlehren der Mathematischen Wissenschaften, 233. Springer-Verlag, Berlin–New York, 1979. Second edition, 1997.
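The series expansion in question 4 can be explored symbolically. The sketch below assumes the normalisation exp(ax - a²t/2) = Σ_k (a^k / k!) H_k(x, t), which is an assumption about the statement of the exercise; under it, differentiating in a gives the recurrence H_{k+1} = x H_k - k t H_{k-1}. The code builds the polynomials and checks that each one is annihilated by ∂/∂t + (1/2) ∂²/∂x², the space-time harmonicity behind the martingale property of H_k(B_t, t). The function names and the dictionary encoding of polynomials are choices made here.

```python
def hermite_space_time(kmax):
    """Return [H_0, ..., H_kmax] as dicts {(i, j): c} encoding c * x**i * t**j."""
    polys = [{(0, 0): 1}]
    if kmax >= 1:
        polys.append({(1, 0): 1})
    for k in range(1, kmax):
        nxt = {}
        for (i, j), c in polys[k].items():      # contribution of x * H_k
            nxt[(i + 1, j)] = nxt.get((i + 1, j), 0) + c
        for (i, j), c in polys[k - 1].items():  # contribution of -k * t * H_{k-1}
            nxt[(i, j + 1)] = nxt.get((i, j + 1), 0) - k * c
        polys.append({m: c for m, c in nxt.items() if c != 0})
    return polys


def heat_operator(p):
    """Apply 2 * (d/dt + (1/2) d^2/dx^2); the factor 2 keeps integer coefficients."""
    out = {}
    for (i, j), c in p.items():
        if j >= 1:
            out[(i, j - 1)] = out.get((i, j - 1), 0) + 2 * j * c
        if i >= 2:
            out[(i - 2, j)] = out.get((i - 2, j), 0) + i * (i - 1) * c
    return {m: c for m, c in out.items() if c != 0}


polys = hermite_space_time(8)
print(polys[2])  # x^2 - t
print(polys[3])  # x^3 - 3 x t
print(all(heat_operator(p) == {} for p in polys))
```

This only checks the algebraic identity; the integrability needed to pass from it to the martingale property is the part the exercise asks the reader to justify.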
Solution to Exercise 6.18

For every $n \in \mathbb{N}$, there exists a function $\varphi_n$ of $(n+1)$ arguments, such that $M_n = \varphi_n(W_0, \ldots, W_n)$. From (ii), we have:
$$E[M_\gamma \mid m = j] = E\big[\varphi_{\sigma(j)}(W_0, W_1, \ldots, W_{\sigma(j)})\big] = E[M_{\sigma(j)}] = E[M_0] ,$$
where the last equality follows from the optional stopping theorem. Consequently, we obtain:
$$E\big[M_\gamma \mathbf{1}_{\{m=j\}}\big] = E\big[E[M_\gamma \mid m = j]\, \mathbf{1}_{\{m=j\}}\big] = E[M_0]\, P(m = j) .$$
Thus, a simple explanation of this result is that, once we condition with respect to $\{m = j\}$, $\gamma$ becomes a stopping time.
Solution to Exercise 6.19

1. The equivalence between (6.19.1) and (6.19.2) is straightforward.

2. Since Brownian motion (which we shall denote here by $X$) is at the same time a Lévy process and a Gaussian process, we know that there exist, for fixed $s, t, T$, two reals $u$ and $v$ such that $E[X_t \mid \mathcal{F}_{s,T}] = uX_s + vX_T$, which we write equivalently as
$$E[X_t - X_s \mid \mathcal{F}_{s,T}] = ((u-1) + v) X_s + v(X_T - X_s) . \qquad (6.19.a)$$
From (6.19.a), we can deduce easily that
$$E[X_s(X_t - X_s)] = 0 = (u-1) + v , \qquad E[(X_t - X_s)(X_T - X_s)] = t - s = v(T - s) ,$$
which yields the right values of $u$ and $v$, namely $v = \frac{t-s}{T-s}$ and $u = \frac{T-t}{T-s}$.

3. Any centred Markovian Gaussian process may be represented as $X_t = u(t)\beta_{v(t)}$, with $\beta$ a Brownian motion and $u$ and $v$ two deterministic functions, $v$ being monotone. Call $(\mathcal{B}_t)$ the natural filtration of $\beta$ and assume that $v$ is increasing (for simplicity). Then from question 2, we have
$$E[X_t \mid \mathcal{F}_{s,T}] = u(t)\, E[\beta_{v(t)} \mid \mathcal{B}_{v(s),v(T)}] = u(t) \Big(\frac{v(T) - v(t)}{v(T) - v(s)}\, \beta_{v(s)} + \frac{v(t) - v(s)}{v(T) - v(s)}\, \beta_{v(T)}\Big) = u(t) \Big(\frac{v(T) - v(t)}{v(T) - v(s)}\, \frac{X_s}{u(s)} + \frac{v(t) - v(s)}{v(T) - v(s)}\, \frac{X_T}{u(T)}\Big) ,$$
which yields the right values of $\alpha$ and $\beta$. Now, for $X$ to be an affine process, we should have
$$\frac{v(T) - v(t)}{v(T) - v(s)}\, \frac{u(t)}{u(s)} = \frac{T - t}{T - s} , \qquad \frac{v(t) - v(s)}{v(T) - v(s)}\, \frac{u(t)}{u(T)} = \frac{t - s}{T - s} .$$

4. The identity presented in the hint is easily deduced from the following
$$E[\exp(i(\lambda X_a + \mu X_b))] = \exp(-a\psi(\lambda + \mu) - (b - a)\psi(\mu)) ,$$
where $\psi$ denotes the Lévy exponent associated with $X$, and $a < b$, $\lambda, \mu \in \mathbb{R}$. Then we have
$$iE[X_a \exp(i\mu X_b)] = \frac{d}{d\lambda}\, E[\exp i(\lambda X_a + \mu X_b)]\Big|_{\lambda=0} = -a\psi'(\mu) \exp(-b\psi(\mu)) ,$$
which implies the desired result. It follows that
$$E\Big[\frac{X_a}{a} \,\Big|\, \mathcal{F}_b^+\Big] = \frac{X_b}{b} ,$$
and with the help of the homogeneity of the increments of $X$, we obtain:
$$E\Big[\frac{X_t - X_s}{t - s} \,\Big|\, \mathcal{F}_{s,T}\Big] = \frac{X_T - X_s}{T - s} ,$$
which shows that $X$ is an affine process.

5. The proof is the same after replacing the characteristic function by the Laplace transform.

6. Assume that $X$ is differentiable. By letting $t' \downarrow t$ in (6.19.2), we obtain $E[Y_t \mid \mathcal{F}_{s,u}]$, which does not depend on $t \in (s, u)$; but we also have $Y_s = \lim_{\varepsilon\to+0} \frac{X_s - X_{s-\varepsilon}}{\varepsilon}$, in $L^1$, and the same is true for $Y_u$. Thus $Y_s$ and $Y_u$ are $\mathcal{F}_{s,u}$-measurable, hence they are both equal to $\frac{X_u - X_s}{u - s}$; the proof of this question is now easily ended.

7. It is easily shown that the property (6.19.2) is satisfied for $t$ and $t'$ of the form $s + \frac{k}{n}(u - s)$, $0 \le k \le n$; hence, it also holds for all $t, t' \in [s, u]$, using the $L^1$ continuity of $X$.
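Joint solution to Exercises 6.20 and 6.21

We note that it suffices to prove Exercise 6.21, since then Exercise 6.20 will follow, by considering
$$H_t = \int_0^t du\, \alpha(u) .$$
To prove Exercise 6.21, we consider $s < t < u$, and we write
$$\alpha(s) = E\Big[\frac{H(u) - H(s)}{u - s} \,\Big|\, \mathcal{F}_s\Big] = E\Big[E\Big[\frac{H(u) - H(s)}{u - s} \,\Big|\, \mathcal{F}_t\Big] \,\Big|\, \mathcal{F}_s\Big] = \frac{u - t}{u - s}\, E[\alpha(t) \mid \mathcal{F}_s] + \frac{t - s}{u - s}\, \alpha(s) .$$
Comparing the extreme terms, we obtain
$$\alpha(s) = E[\alpha(t) \mid \mathcal{F}_s] .$$
To prove question 2 of Exercise 6.21, write $\alpha(s) = E\big[\frac{1}{t-s} \int_s^t du\, \alpha_u \mid \mathcal{F}_s\big]$, hence $E[H_t - H_s \mid \mathcal{F}_s] = E\big[\int_s^t du\, \alpha_u \mid \mathcal{F}_s\big]$, which allows us to deduce that $H_t - \int_0^t du\, \alpha_u$ is a martingale.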
Where is the notion N discussed ?

For each notion, the first reference is to this book and the second to the literature.

Monotone class theorem. In this book: Chapter 1. In the literature: Meyer ([40], T19, T20; pp. 27, 28); Dacunha-Castelle and Duflo ([12], chapter 3).
Uniform integrability. In this book: Ex. 1.2, Ex. 1.3. In the literature: Meyer ([40], pp. 35–40); Durrett ([19], section 4.5); Grimmett and Stirzaker ([26], pp. 353).
Convergence of r.v.s. In this book: Chapter 5. In the literature: Fristedt and Gray ([25], chapter 12); Billingsley [5]; Chung [11].
Independence. In this book: Chapter 2. In the literature: Dacunha-Castelle and Duflo ([12], chapter 5); Kac [32].
Conditioning. In this book: Chapter 2. In the literature: Meyer ([40], chapter 2, section 4); Williams [63]; Fristedt and Gray ([25], chapter 21, chapter 23).
Gaussian space. In this book: Chapter 3. In the literature: Neveu [42]; Janson [30]; Lifschits [37].
Laws of large numbers. In this book: Chapter 5. In the literature: Williams ([64], p. 103); Durrett ([19], chapter 1); Feller ([20], chapter VII).
Central limit theorems. In this book: Ex. 5.8. In the literature: Williams ([64], p. 156); Durrett ([19], chapter 2); Feller ([20], chapter VIII).
Large deviations. In this book: Ex. 5.5. In the literature: Toulouse ([58], chapter 3); Azencott (see ex. 3.10); Durrett ([19], 1.9).
Characteristic functions. In this book: Chapter 4, Chapter 5. In the literature: Lukacs [38]; Williams ([64], p. 166); Fristedt and Gray ([25], chapter 13).
Laplace transform. In this book: Chapter 4. In the literature: Feller ([20], chapter VII, 6); Chung ([11], 66); Meyer [40].
Mellin transform. In this book: Ex. 1.13, Ex. 4.21. In the literature: Zolotarev [66]; Widder [61]; Patterson [44].
Infinitely divisible laws. In this book: Ex. 1.12, Ex. 5.11. In the literature: Fristedt and Gray ([25], chapter 16); Feller ([20], chapter 6); Durrett ([19], 2.8).
Stable laws. In this book: Ex. 4.17 to 4.19, Ex. 5.11. In the literature: Zolotarev [66]; Feller [20].
Domains of attraction. In this book: Ex. 5.15. In the literature: Fristedt and Gray ([25], chapter 17); Feller ([20], chapter IX); Petrov [46].
Ergodic Theorems. In this book: Ex. 1.8. In the literature: Durrett ([19], chapter 6); Billingsley (see ex. 1.8).
Martingales. In this book: Ex. 1.5, Chapter 6. In the literature: Baldi, Mazliak and Priouret [2]; Neveu ([43], section B); Williams [63]; Grimmett and Stirzaker ([26], section 7.7, 7.8, chapter 12).
Final suggestions: how to go further ?

The reader who has had the patience and/or interest to remain with us until now may want to (and will certainly) draw some conclusions from this enterprise. . . . If you are really “hooked” (in French slang, “accro”!) on exercises, counterexamples, etc., you may get into [2], [14], [15], [18], [23], [27], [36], [54], [56], [57].

We would like to suggest two complementary directions.

(i) In the middle of a somewhat complex research program, involving a highly sophisticated probability model, it is often comforting to check one's assertions by “coming down” to some consequences involving only one- (or finite-) dimensional random variables. As explained in the foreword of this book, most of our exercises have been constructed in this way, mainly by “stripping” the Brownian set-up.

(ii) The converse attitude may also be quite fruitful; namely, to view a one-dimensional model as embedded in an infinite-dimensional one. An excellent example of the gains one might draw in “looking at the big picture” is Itô's theory of (Brownian) excursions: jointly (as a process of excursions), they constitute a big Poisson process; this theory allowed us to recover most of Lévy's results for the individual excursion, and indeed many more. . . .

To illustrate, here is a discussion relative to the arc-sine law of P. Lévy: in his famous 1939 paper, P. Lévy noticed that the law of $\frac{1}{t} \int_0^t ds\, \mathbf{1}_{\{B_s > 0\}}$, for fixed $t$, is also the same as that of $\frac{1}{\tau_h} \int_0^{\tau_h} ds\, \mathbf{1}_{\{B_s > 0\}}$, where, for fixed $h$, $\tau_h = \inf\{t : l_t > h\}$, $h \ge 0$, is the inverse of the local time process $l$. This striking remark motivated Pitman and Yor to establish the following infinite-dimensional reinforcement of Lévy's remark: for fixed $t$ and $h$, both sequences $\frac{1}{t}(M_1(t), \ldots, M_n(t), \ldots)$ and $\frac{1}{\tau_h}(M_1(\tau_h), \ldots, M_n(\tau_h), \ldots)$ have the same distribution, where $M_1(t) > M_2(t) > \ldots$ is the decreasing sequence of lengths of Brownian excursions over the interval $(0, t)$.

We could not find any better way to conclude on this topic, and with this book, than by simply reproducing one sentence in Professor Itô's Foreword to his Selected Papers (Springer, 1987): “After several years, it became my habit to observe even finite-dimensional facts from the infinite-dimensional viewpoint.” So, let us try to imitate Professor Itô !!
References [1] G.E. Andrews, R. Askey and R. Roy: Special Functions. Encyclopedia of Mathematics and its Applications, 71. Cambridge University Press, Cambridge, 1999. [2] P. Baldi, L. Mazliak and P. Priouret: Martingales et Chaˆınes de Markov. Collection M´ethodes, Hermann, 1998. English version: Chapman & Hall/CRC, 2002. [3] Ph. Barbe and M. Ledoux: Probabilit´e. De la Licence `a l’Agr´egation. ´ Editions Espaces 34, Belin, 1998. [4] P. Billingsley: Probability and Measure. Third edition. John Wiley & Sons, Inc., New York, 1995. [5] P. Billingsley: Convergence of Probability Measures. Second edition. John Wiley & Sons, Inc., New York, 1999. [6] P. Billingsley: Ergodic Theory and Information. Reprint of the 1965 original. Robert E. Krieger Publishing Co., Huntington, N.Y., 1978. [7] A.N. Borodin and P. Salminen: Handbook of Brownian Motion – Facts and Formulae. Second edition. Probability and its Applications. Birkh¨auser Verlag, Basel, 2002. [8] L. Breiman: Probability. Addison-Wesley Publishing Company, Reading, Mass.-London-Don Mills, Ont. 1968. ´maud: An Introduction to Probabilistic Modeling. Corrected reprint of [9] P. Bre the 1988 original. Springer-Verlag, New York, 1994. [10] Y.S. Chow and H. Teicher: Probability Theory. Independence, Interchangeability, Martingales. Third edition. Springer-Verlag, New York, 1997. [11] K.L. Chung: A course in Probability Theory. Third edition. Academic Press, Inc., San Diego, CA, 2001. 229
230
References
[12] D. Dacunha-Castelle and M. Duflo: Probabilités et statistiques. Tome 1. Problèmes à temps fixe. Masson, Paris, 1982.
[13] D. Dacunha-Castelle and M. Duflo: Probabilités et statistiques. Tome 2. Problèmes à temps mobile. Masson, Paris, 1983.
[14] D. Dacunha-Castelle, M. Duflo and V. Genon-Catalot: Exercices de probabilités et statistiques. Tome 2. Problèmes à temps mobile. Masson, 1984.
[15] D. Dacunha-Castelle, D. Revuz and M. Schreiber: Recueil de problèmes de calcul des probabilités. Deuxième édition. Masson, Paris, 1970.
[16] C. Dellacherie and P.A. Meyer: Probabilités et potentiel. Chapitres I à IV. Hermann, Paris, 1975.
[17] J.L. Doob: Stochastic Processes. A Wiley-Interscience Publication. John Wiley & Sons, Inc., New York, 1990.
[18] A.Y. Dorogovtsev, D.S. Silvestrov, A.V. Skorokhod and M.I. Yadrenko: Probability theory: collection of problems. Translations of Mathematical Monographs, 163. American Mathematical Society, Providence, RI, 1997.
[19] R. Durrett: Probability: Theory and Examples. Second edition. Duxbury Press, Belmont, CA, 1996.
[20] W. Feller: An Introduction to Probability Theory and its Applications. Vol. II. Second edition. John Wiley & Sons, Inc., New York–London–Sydney, 1971.
[21] X. Fernique: Fonctions aléatoires gaussiennes, vecteurs aléatoires gaussiens. Université de Montréal, Centre de Recherches Mathématiques, Montréal, QC, 1997.
[22] D. Foata and A. Fuchs: Calcul des probabilités. Second edition. Dunod, 1998.
[23] D. Foata and A. Fuchs: Processus stochastiques, processus de Poisson, chaînes de Markov et martingales. Cours et exercices corrigés. Dunod, Paris, 2002.
[24] D. Freedman: Brownian Motion and Diffusion. Second edition. Springer-Verlag, New York–Berlin, 1983.
[25] B. Fristedt and L. Gray: A Modern Approach to Probability Theory. Probability and its Applications. Birkhäuser Boston, Inc., Boston, MA, 1997.
[26] G.R. Grimmett and D.R. Stirzaker: Probability and Random Processes. Second edition. The Clarendon Press, Oxford University Press, New York, 1992.
[27] G.R. Grimmett and D.R. Stirzaker: One Thousand Exercises in Probability. The Clarendon Press, Oxford University Press, New York, 2001.
[28] T. Hida and M. Hitsuda: Gaussian processes. Translations of Mathematical Monographs, 120. American Mathematical Society, Providence, RI, 1993.
[29] J. Jacod and Ph. Protter: Probability Essentials. Universitext. Second edition. Springer-Verlag, Berlin, 2002.
[30] S. Janson: Gaussian Hilbert Spaces. Cambridge Tracts in Mathematics, 129. Cambridge University Press, Cambridge, 1997.
[31] N.L. Johnson, S. Kotz and N. Balakrishnan: Continuous univariate distributions. Vol. 2. Second edition. John Wiley & Sons, Inc., New York, 1995.
[32] M. Kac: Statistical Independence in Probability, Analysis and Number Theory. The Carus Mathematical Monographs, No. 12. Distributed by John Wiley and Sons, Inc., New York, 1959.
[33] O. Kallenberg: Foundations of Modern Probability. Second edition. Probability and its Applications. Springer-Verlag, New York, 2002.
[34] I. Karatzas and S.N. Shreve: Brownian Motion and Stochastic Calculus. Second edition. Springer-Verlag, New York, 1991.
[35] A.N. Kolmogorov: Foundations of the Theory of Probability. Chelsea Publishing Company, New York, 1950.
[36] G. Letac: Exercises and Solutions Manual for Integration and Probability. Springer-Verlag, New York, 1995. French edition: Masson, second edition, 1997.
[37] M.A. Lifschits: Gaussian Random Functions. Kluwer Academic Publishers, 1995.
[38] E. Lukacs: Developments in Characteristic Function Theory. Macmillan Co., New York, 1983.
[39] P. Malliavin and H. Airault: Integration and Probability. Graduate Texts in Mathematics, 157. Springer-Verlag, New York, 1995. French edition: Masson, 1994.
[40] P.A. Meyer: Probability and Potentials. Blaisdell Publishing Co. Ginn and Co., Waltham, Mass.–Toronto, Ont.–London, 1966. French edition: Hermann, 1966.
[41] J. Neveu: Mathematical Foundations of the Calculus of Probability. Holden-Day, Inc., 1965. French edition: Masson, second edition, 1970.
[42] J. Neveu: Processus aléatoires gaussiens. Séminaire de Mathématiques Supérieures, No. 34. Les Presses de l'Université de Montréal, 1968.
[43] J. Neveu: Discrete-parameter Martingales. Revised edition. North-Holland Mathematical Library, Vol. 10. Amsterdam–Oxford–New York, 1975.
[44] S.J. Patterson: An introduction to the theory of the Riemann zeta-function. Cambridge Studies in Advanced Mathematics, 14. Cambridge University Press, Cambridge, 1988.
[45] K. Petersen: Ergodic Theory. Cambridge Studies in Advanced Mathematics, 2. Cambridge University Press, 1983.
[46] V.V. Petrov: Limit Theorems of Probability Theory. Sequences of Independent Random Variables. Oxford University Press, New York, 1995.
[47] J. Pitman: Probability. Springer-Verlag, 1993.
[48] A. Rényi: Calcul des probabilités. Collection Universitaire de Mathématiques, No. 21. Dunod, Paris, 1966.
[49] D. Revuz: Intégration. Hermann, 1997.
[50] D. Revuz: Probabilités. Hermann, 1998.
[51] D. Revuz and M. Yor: Continuous Martingales and Brownian Motion. Third edition. Springer-Verlag, Berlin, 1999.
[52] L.C.G. Rogers and D. Williams: Diffusions, Markov Processes, and Martingales. Vol. 1. Foundations. Reprint of the second (1994) edition. Cambridge Mathematical Library. Cambridge University Press, Cambridge, 2000.
[53] L.C.G. Rogers and D. Williams: Diffusions, Markov Processes, and Martingales. Vol. 2. Itô Calculus. Second edition. Cambridge University Press, 2000.
[54] J.P. Romano and A.F. Siegel: Counterexamples in probability and statistics. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, CA, 1986.
[55] A.N. Shiryaev: Probability. Second edition. Graduate Texts in Mathematics, 95. Springer-Verlag, New York, 1996.
[56] J.M. Stoyanov: Counterexamples in Probability. Wiley Series in Probability and Mathematical Statistics. Second edition. John Wiley & Sons, Ltd., Chichester, 1997.
[57] G.J. Székely: Paradoxes in probability theory and mathematical statistics. Mathematics and its Applications (East European Series), 15. D. Reidel Publishing Co., Dordrecht, 1986.
[58] P.S. Toulouse: Thèmes de probabilités et statistiques. Agrégation de mathématiques. Dunod, 1999.
[59] V.V. Uchaikin and V.M. Zolotarev: Chance and Stability. Stable Distributions and their Applications. Modern Probability and Statistics. VSP, Utrecht, 1999.
[60] P. Whittle: Probability via Expectation. Fourth edition. Springer Texts in Statistics. Springer-Verlag, New York, 2000.
[61] D.V. Widder: The Laplace Transform. Princeton Mathematical Series, v. 6. Princeton University Press, Princeton, N.J., 1941.
[62] D. Williams: Diffusions, Markov processes, and martingales. Vol. 1. Foundations. Probability and Mathematical Statistics. John Wiley & Sons, Ltd., Chichester, 1979.
[63] D. Williams: Probability with Martingales. Cambridge Mathematical Textbooks. Cambridge University Press, Cambridge, 1991.
[64] D. Williams: Weighing the Odds. A Course in Probability and Statistics. Cambridge University Press, Cambridge, 2001.
[65] M. Yor: Some Aspects of Brownian Motion. Part I. Some special functionals. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, 1992.
[66] V.M. Zolotarev: One-dimensional Stable Distributions. Translations of Mathematical Monographs, 65. American Mathematical Society, Providence, RI, 1986.
[67] J. Bertoin: Lévy Processes. Cambridge University Press, 1996.
[68] H. Georgii: Gibbs Measures and Phase Transitions. De Gruyter Studies in Mathematics, Vol. 9, 1998.
[69] N. Lebedev: Special Functions and their Applications. Dover, 1972.
[70] D.W. Stroock: Markov processes from Itô's perspective. Princeton University Press, 2002.
Index

The abbreviations ex., sol., and chap. refer respectively to the corresponding exercise, solution, and short presentation of a chapter.
absolute continuity ex. 2.3, ex. 6.9
affine process ex. 6.19
Bessel process ex. 6.4, sol. 6.15
Brownian motion chap. 6
    geometric ex. 6.15
    hyperbolic ex. 6.15
Brownian bridge ex. 6.7
Carleman criterion ex. 1.10
Central Limit Theorem chap. 5, ex. 5.8, ex. 5.9
change of probability ex. 2.14
characteristic function ex. 1.12
concentration inequality ex. 3.10
conditional
    expectation ex. 1.4
    independence ex. 2.12
    law ex. 2.4, ex. 4.10
conditioning chap. 2, ex. 2.14, ex. 2.16, ex. 2.17
continued fractions ex. 1.10, ex. 3.4
convergence
    almost sure ex. 1.5, ex. 3.4
    in law ex. 1.3, ex. 4.6, ex. 5.4, ex. 5.8, ex. 5.2
    in Lp ex. 5.1
    weak ex. 1.3, ex. 5.11, ex. 5.12
density
    Radon–Nikodym ex. 2.15
    of a subspace ex. 1.9
digamma function ex. 6.3
Dirichlet process ex. 4.4
ergodic transformation ex. 1.7, ex. 1.8, ex. 1.9
exchangeable
    sequences of r.v.'s ex. 2.6
    processes ex. 6.7, ex. 6.19
extremes
    asymptotic laws for sol. 5.4
gamma function ex. 4.5, ex. 4.16
gamma process ex. 4.4
Gauss
    multiplication formula ex. 4.5
    duplication formula ex. 6.10
    triplication formula ex. 4.5
harness ex. 6.19
hitting time distribution ex. 6.8
infinitely divisible r.v. ex. 1.12
infinitesimal generator ex. 6.15
    extended sol. 6.17
independence chap. 2, ex. 2.3, ex. 2.7
    asymptotic ex. 2.5
invariance property ex. 6.12
Itô's formula ex. 6.1, sol. 6.1, ex. 6.4
Kolmogorov's 0-1 law sol. 5.8
large deviations ex. 5.5
law of large numbers chap. 5
local time (of a semimartingale) ex. 6.8
Lévy's arcsine law ex. 6.7, ex. 6.16
Lévy's characterization of Brownian motion sol. 6.10
Lévy's identity ex. 6.8
Lévy processes ex. 5.13, sol. 5.15, ex. 6.3, ex. 6.7, ex. 6.19
martingale ex. 1.5, ex. 6.11, ex. 6.17
    complex valued sol. 6.5
Markov property ex. 6.13
    strong sol. 6.8, sol. 6.12
Markov process ex. 6.17
Mittag–Leffler distributions ex. 4.19
moments
    method ex. 5.2
    problem ex. 1.9
    of a random variable ex. 3.3, ex. 5.2
Monotone Class Theorem ex. 1.4
polynomials
    Hermite ex. 6.17
    orthogonal ex. 6.17
    Tchebytcheff ex. 3.5
process
    empirical ex. 5.12
    Gaussian chap. 3, chap. 6
    Lévy ex. 5.13, sol. 5.15, ex. 6.3, ex. 6.7, ex. 6.19
    Poisson ex. 5.13
    semi-stable sol. 5.15
quadratic variation ex. 6.14, ex. 6.15
range process (of Brownian motion) ex. 6.2
reflection principle sol. 6.11
semimartingale ex. 6.17
scaling property sol. 6.3
    random scaling ex. 6.12
Selberg's formula ex. 4.20
self-similar process ex. 6.9, ex. 6.16
skew-product representation sol. 6.5
space
    Gaussian chap. 3, ex. 3.1, ex. 3.2
    Hilbert chap. 3
stopping time ex. 2.11, ex. 6.9
    non- ex. 6.18
sigma-field ex. 2.5
tail σ-field ex. 1.7, sol. 5.1
Tanaka's formula sol. 6.10
time-change ex. 6.11
time-inversion ex. 6.9
transform chap. 4
    Fourier ex. 2.12, sol. 5.7
    Gauss ex. 4.16
    Laplace ex. 2.13, ex. 2.17, sol. 5.2, sol. 5.11
    Mellin ex. 4.21
    Stieltjes ex. 4.21
uniform integrability ex. 1.2, ex. 1.3
variable
    beta ex. 4.2, ex. 4.6, ex. 4.7
    Cauchy ex. 4.12, ex. 4.15, ex. 6.10
    exponential ex. 4.8, ex. 4.9, ex. 4.11, ex. 4.17
    gamma ex. 2.17, ex. 3.3, ex. 4.2, ex. 4.5, ex. 6.10, ex. 4.6, ex. 4.7
    Gaussian chap. 3, ex. 3.1, ex. 3.7, ex. 4.1, ex. 4.11
    simplifiable ex. 1.12, ex. 4.2
    stable ex. 4.17, ex. 4.18, ex. 4.19, ex. 4.21, ex. 5.15
    stable(1/2) ex. 4.15, ex. 5.11
    uniform ex. 4.2, ex. 4.6, ex. 4.13, ex. 6.7