Problem Books in Mathematics
Edited by P. Winkler
For further volumes: http://www.springer.com/series/714
Dmytro Gusak · Alexander Kukush · Alexey Kulik · Yuliya Mishura · Andrey Pilipenko
Theory of Stochastic Processes With Applications to Financial Mathematics and Risk Theory
Dmytro Gusak
Institute of Mathematics of the Ukrainian National Academy of Sciences, Kyiv 01601, Ukraine

Alexander Kukush
Department of Mathematical Analysis, Faculty of Mechanics and Mathematics, National Taras Shevchenko University of Kyiv, Kyiv 01033, Ukraine

Alexey Kulik
Institute of Mathematics of the Ukrainian National Academy of Sciences, Kyiv 01601, Ukraine

Yuliya Mishura
Department of Probability Theory and Mathematical Statistics, Faculty of Mechanics and Mathematics, National Taras Shevchenko University of Kyiv, Kyiv 01033, Ukraine

Andrey Pilipenko
Institute of Mathematics of the Ukrainian National Academy of Sciences, Kyiv 01601, Ukraine

Series Editor
Peter Winkler
Department of Mathematics, Dartmouth College, Hanover, NH 03755-3551, USA
ISSN 0941-3502
ISBN 978-0-387-87861-4        e-ISBN 978-0-387-87862-1
DOI 10.1007/978-0-387-87862-1
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2009939131
Mathematics Subject Classification (2000): 60-xx, 60Gxx, 60G07, 60H10, 91B30
© Springer Science+Business Media, LLC 2010
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
To our families
Preface
This collection of problems is intended as a textbook for university courses in the theory of stochastic processes and related special courses. The problems span a wide spectrum of difficulty and will be useful for readers with various levels of mastery of the theory of stochastic processes. Together with technical and illustrative problems intended for beginners, the book contains a number of problems of a theoretical nature that can be useful for undergraduate and graduate students who pursue advanced studies in the theory of stochastic processes and its applications. Among other aims, an important one is to provide teaching staff with an efficient tool for preparing seminar studies, tests, and exams for university courses in the theory of stochastic processes and related topics.

While composing the book, the authors have partially used the collections of problems in probability theory [16, 65, 75, 83]. Some exercises and problems from the monographs and textbooks [4, 9, 19, 22, 82] were used as well. At the same time, a large part of the problem book contains original material.

The book is organized as follows. The problems are collected into chapters, each chapter being devoted to a certain topic. At the beginning of each chapter, the theoretical grounds for the corresponding topic are given briefly, together with a list of references that the reader can use in order to study the topic in more detail. For most of the problems, either hints or complete solutions (or answers) are given, and some of the problems are provided with both hints and solutions (answers). However, the authors do not recommend that a reader use the hints systematically, because solving a problem without assistance is much more useful than using a ready-made idea. Some statements of particular theoretical interest are formulated in the theoretical grounds, and their proofs are posed as problems for the reader. Such problems are supplied with either complete solutions or detailed hints.

In order to work with the problem book efficiently, a reader should be acquainted with probability theory, calculus, and measure theory within the scope of the respective university courses. Standard notions, such as random variable, measurability, independence, Lebesgue measure and integral, and so on, are used without additional discussion. All the new notions and statements required for solving the problems are given either in the theoretical grounds or in the formulations of the problems straightforwardly.
However, sometimes a notion is used in the text before its formal definition. For instance, the Wiener and Poisson processes are processes with independent increments and thus are formally introduced in the theoretical grounds for Chapter 5, but these processes are used widely in the problems of Chapters 2 to 4. The authors recommend that a reader who encounters an unknown notion or object use the Index in order to find the corresponding formal definition. The same recommendation concerns the standard abbreviations and symbols listed at the end of the book.

Some problems in the book form cycles: the solutions to some of them rely on statements of others or on auxiliary constructions described in preceding solutions. Sometimes, on the contrary, it is proposed to prove the same statement within different problems using essentially different techniques. The authors recommend that a reader pay specific attention to these fruitful internal links between various topics of the theory of stochastic processes.

Every part of the book was composed substantially by one author. Chapters 1–6 and 16 were composed by A. Kulik; Chapters 7, 12–15, 18, and 19 by Yu. Mishura; Chapters 8–10 by A. Pilipenko; Chapter 17 by A. Kukush; and Chapter 20 by D. Gusak. Chapter 11 was prepared jointly by D. Gusak and A. Pilipenko. At the same time, every author has contributed to the other parts of the book by proposing separate problems or cycles of problems, improving preliminary versions of the theoretical grounds, and editing the final text.

The authors would like to express their deep gratitude to M. Portenko and A. Ivanov for their careful reading of a preliminary version of the book and valuable comments that led to significant improvement of the text. The authors are also grateful to T. Yakovenko, G. Shevchenko, O. Soloveyko, Yu. Kartashov, Yu. Klimenko, A. Malenko, and N. Ryabova for their assistance in translation, preparing files and pictures, and composing the subject index and references.

The theory of stochastic processes is an extensive discipline, and the authors understand that the problem book in its current form may invite critical remarks from readers concerning either the structure of the book or the content of separate chapters. Publishing the problem book in its current form, the authors remain open to remarks, comments, and suggestions, and express in advance their gratitude to all their correspondents.

Kyiv
December 2008
Dmytro Gusak Alexander Kukush Alexey Kulik Yuliya Mishura Andrey Pilipenko
Contents
1 Definition of stochastic process. Cylinder σ-algebra, finite-dimensional distributions, the Kolmogorov theorem
2 Characteristics of a stochastic process. Mean and covariance functions. Characteristic functions
3 Trajectories. Modifications. Filtrations
4 Continuity. Differentiability. Integrability
5 Stochastic processes with independent increments. Wiener and Poisson processes. Poisson point measures
6 Gaussian processes
7 Martingales and related processes in discrete and continuous time. Stopping times
8 Stationary discrete- and continuous-time processes. Stochastic integral over measure with orthogonal values
9 Prediction and interpolation
10 Markov chains: Discrete and continuous time
11 Renewal theory. Queueing theory
12 Markov and diffusion processes
13 Itô stochastic integral. Itô formula. Tanaka formula
14 Stochastic differential equations
15 Optimal stopping of random sequences and processes
16 Measures in a functional spaces. Weak convergence, probability metrics. Functional limit theorems
17 Statistics of stochastic processes
18 Stochastic processes in financial mathematics (discrete time)
19 Stochastic processes in financial mathematics (continuous time)
20 Basic functionals of the risk theory

Each chapter contains Theoretical grounds, a Bibliography, Problems, Hints, and Answers and Solutions.

Appendix: List of abbreviations. List of probability distributions. List of symbols. References
Index
1 Definition of stochastic process. Cylinder σ -algebra, finite-dimensional distributions, the Kolmogorov theorem
Theoretical grounds

Let (Ω, F, P) be a probability space, (X, X) be a measurable space, and T be some set.

Definition 1.1. A random function X with phase space X and parameter set T is a function X : T × Ω ∋ (t, ω) → X(t, ω) ∈ X such that for any t ∈ T the mapping X(t, ·) : ω → X(t, ω) is F–X-measurable.

Hereinafter the mapping X(t, ·) is denoted by X(t). According to commonly accepted terminology, it is a random element taking values in X. The definition introduced above is obviously equivalent to the following one.

Definition 1.2. A random function X with phase space X and parameter set T is a family of random elements {X(t), t ∈ T} with values in X indexed by points of T.

A random function, as defined in Definitions 1.1 and 1.2, is sometimes also called a stochastic (random) process, but usually the term stochastic process is reserved for the case where T is an interval or a ray on the real axis R. A random sequence (or stochastic process with discrete time) is a random function defined on T ⊂ Z. A random field is a random function defined on T ⊂ R^d, d > 1. For a fixed ω ∈ Ω, the function X(·, ω) : T ∋ t → X(t, ω) is called a trajectory or a realization of the random function X.

Denote by X^{⊗m} = X ⊗ ··· ⊗ X the product of m copies of the σ-algebra X, that is, the least σ-algebra of subsets of X^m that contains every set of the form A_1 × ··· × A_m, A_1, ..., A_m ∈ X.

Definition 1.3. For given values m ≥ 1 and t_1, ..., t_m ∈ T, the (m-dimensional) finite-dimensional distribution P^X_{t_1,...,t_m} of the random function X is the joint distribution of the random elements X(t_1), ..., X(t_m) or, equivalently, the distribution of the vector (X(t_1), ..., X(t_m)) considered as a random element with values in (X^m, X^{⊗m}). The set {P^X_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is called the set (or the family) of finite-dimensional distributions of the random function X.
Theorem 1.1. (The Kolmogorov theorem on finite-dimensional distributions) Let X be a complete separable metric space and X be its Borel σ-algebra. Suppose a family {P_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is given such that, for any m ≥ 1 and t_1, ..., t_m ∈ T, P_{t_1,...,t_m} is a probability measure on (X^m, X^{⊗m}). The following consistency conditions are necessary and sufficient for a random function to exist such that the family {P_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is the family of finite-dimensional distributions of this function.
(1) For any m ≥ 1, t_1, ..., t_m ∈ T, B_1, ..., B_m ∈ X, and arbitrary permutation π : {1, ..., m} → {1, ..., m},
P_{t_1,...,t_m}(B_1 × ··· × B_m) = P_{t_{π(1)},...,t_{π(m)}}(B_{π(1)} × ··· × B_{π(m)}) (permutation invariance).
(2) For any m > 1, t_1, ..., t_m ∈ T, B_1, ..., B_{m−1} ∈ X,
P_{t_1,...,t_m}(B_1 × ··· × B_{m−1} × X) = P_{t_1,...,t_{m−1}}(B_1 × ··· × B_{m−1}) (projection invariance).

The random function provided by the Kolmogorov theorem can be constructed on a special probability space. Further on, we describe the construction of this probability space. Let Ω = X^T be the space of all functions ω : T → X.

Definition 1.4. A set A ⊂ Ω is called a cylinder set if it has a representation
A = {ω ∈ Ω | (ω(t_1), ..., ω(t_m)) ∈ B}     (1.1)
for some m ≥ 1, t_1, ..., t_m ∈ T, B ∈ X^{⊗m}.

A cylinder set has many representations of the form (1.1). A set B in any representation (1.1) of the cylinder set A is called a base (or basis) of A. The class of all cylinder sets is denoted by C(X, T) or simply by C. This class is an algebra but, in general, not a σ-algebra (see Problem 1.35). The minimal σ-algebra σ(C) that contains this class is called the σ-algebra generated by cylinder sets, or the cylinder σ-algebra.

The random function in the Kolmogorov theorem can be defined on the space Ω = X^T with the σ-algebra F = σ(C), X(t, ω) = ω(t), t ∈ T, ω ∈ Ω, and the probability P constructed in a special way (see, for instance, [79], Chapter 2, or [9], Appendix 1). The cylinder σ-algebra has the following useful characterization (see Problem 1.29).

Theorem 1.2. A set A ⊂ X^T belongs to σ(C(X, T)) if and only if there exist a sequence (t_n)_{n=1}^∞ ⊂ T and a set B ∈ σ(C(X, N)) such that A = {ω | (ω(t_n))_{n=1}^∞ ∈ B}.
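The consistency conditions in Theorem 1.1 can be checked numerically for a concrete family. The sketch below is only an illustration (it assumes Python with NumPy and SciPy, which the book does not use, and picks the Gaussian family with covariance min(s, t) as an example of our own): it verifies projection invariance by integrating the two-dimensional density over the second coordinate and comparing with the one-dimensional distribution.

import numpy as np
from scipy.stats import multivariate_normal
from scipy.integrate import quad

# Finite-dimensional distributions of a centered Gaussian family with
# covariance K(s, t) = min(s, t) (the example is ours, not from the text).
def fdd(times):
    times = np.asarray(times, dtype=float)
    cov = np.minimum.outer(times, times)
    return multivariate_normal(mean=np.zeros(len(times)), cov=cov)

t1, t2 = 0.5, 1.7
P12 = fdd([t1, t2])
P1 = fdd([t1])

# Projection invariance (condition (2) of Theorem 1.1): integrating the density
# of P_{t1,t2} over the second coordinate must give the density of P_{t1}.
for x in (-1.0, 0.0, 0.7):
    integrated, _ = quad(lambda y: P12.pdf([x, y]), -np.inf, np.inf)
    assert abs(integrated - P1.pdf([x])) < 1e-6
print("projection invariance verified at the test points")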
Bibliography
[9], Chapter I; [24], Volume 1, Chapter I, §4; [25], Chapter II, §2; [15], Chapter II, §§1,2; [79], Chapters 1,2.
Problems

1.1. Let η be a random variable with the distribution function F. Prove that X(t) is a stochastic process, if (a) X(t) = ηt; (b) X(t) = min(η, t); (c) X(t) = max(η, t²); (d) X(t) = sign(η + t), where sign x = 1 for x ≥ 0 and sign x = −1 for x < 0. Draw the trajectories of the process X. Find the one-dimensional distributions of the process X.

1.2. Let τ be a random variable with uniform distribution on [0, 1] and {X(t), t ∈ [0, 1]} be the waiting process corresponding to this variable; that is, X(t) = 1I_{t≥τ}, t ∈ [0, 1]. Find all (a) one-dimensional; (b) two-dimensional; (c) m-dimensional distributions of the process X.

1.3. Two devices start to operate at the instant of time t = 0. They operate independently of each other for random periods of time and after that they shut down. The operating time of the ith device has a distribution function F_i, i = 1, 2. Let X(t) be the number of operating devices at the instant t. Find one- and two-dimensional distributions of the process {X(t), t ∈ R_+}.

1.4. Let ξ_1, ..., ξ_n be independent identically distributed random variables with distribution function F, and
X(x) = (1/n) #{k | ξ_k ≤ x} = (1/n) ∑_{k=1}^n 1I_{ξ_k ≤ x},  x ∈ R
(remark that X(·) ≡ F_n^*(·) is the empirical distribution function based on the sample ξ_1, ..., ξ_n). Find all (a) one-dimensional; (b) two-dimensional; (c) m-dimensional distributions of the process X. Here and below, # denotes the number of elements of a set.

1.5. Is it possible for stochastic processes {X_1(t), X_2(t), t ≥ 0} to have (a) identical one-dimensional distributions, but different two-dimensional ones; (b) identical two-dimensional distributions, but different three-dimensional ones?

1.6. Let Ω = T = R_+, F = B(R_+), A ⊂ R_+. Here and below, B(X) denotes the Borel σ-algebra in X. Prove that:
(1) X_A(t, ω) = 1I_{t=ω} · 1I_{ω∈A} is a stochastic process for an arbitrary set A.
(2) Y_A(t, ω) = 1I_{t≥ω} · 1I_{ω∈A} is a random process if and only if A ∈ B(R_+).
Depict all possible realizations of the processes X_A, Y_A.
1.7. At the instant of failure of some unit in a device, this unit is immediately replaced by a reserve one. Nonfailure operating times for each unit are random variables, jointly independent and exponentially distributed with parameter α > 0. Let X(t) be the number of failures up to the time moment t. Find finite-dimensional distributions of the process X(t), if there are (a) n reserve units; (b) an infinite number of reserve units. 1.8. Let random variable ξ have distribution function F. Denote F [−1] (x) = inf{y| F(y) > x},
x ∈ [0, 1]
(the function F [−1] is called the generalized inverse function for F or the quantile transformation of F), and set ζ = F [−1] (ε ), where ε is a random variable uniformly distributed on [0, 1]. Prove that ζ has the same distribution with ξ . 1.9. Prove that it is possible to construct a sequence of independent identically distributed random variables defined on the probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] , which (a) take the values 0 and 1 with probabilities 12 ; (b) are uniformly distributed on [0, 1]; (c) have an arbitrary distribution function F. Here and below, λ 1 |[0,1] denotes the restriction of the Lebesgue measure λ 1 to [0, 1]. 1.10. Prove that it is impossible to construct on the probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] a family of independent identically distributed random variables {ξt ,t ∈ [0, 1]} with a nondegenerate distribution. 1.11. Let μ , ν be such distributions on R2 that μ (R × A) = ν (A × R) for every A ∈ B(R). Prove that it is possible to construct random variables ξ , η , ζ defined on some probability space in such a way that the joint distribution of ξ and η equals μ , and the joint distribution of η and ζ equals ν . 1.12. Assume a two-parameter family of distributions {μm,n , m, n ≥ 1} on R2 is given, consistent in the sense that (a) for any A, B ∈ B(R) and m, n ≥ 1, μm,n (A × B) = μn,m (B × A); (b) for any A ∈ B(R) and l, m, n ≥ 1, μn,m (A × R) = μn,l (A × R). Is it true that for any such family there exists a sequence of random variables {ξn , n ≥ 1} satisfying the relations μm,n (C) = P((ξm , ξn ) ∈ C) for any m, n ≥ 1 and C ∈ B(R2 )? 1.13. Let {Xn , n ≥ 1} be a random sequence. Prove that the following extended random variables (i.e., the variables with possible values +∞ and −∞) are measurable with respect to the σ -algebra generated by cylinder sets: (a) supn Xn ; (b) lim supn Xn ; (c) number of partial limits of the sequence {Xn }. 1.14. In the previous problem, let random variables in the sequence {Xn , n ≥ 1} be independent. Which ones among extended variables presented in the items a) — c) are degenerate, that is, take some value from R ∪ {−∞, +∞} with probability 1?
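Problem 1.8 is the basis of the inverse-transform (quantile) sampling method. A minimal numerical sketch, assuming Python with NumPy (not part of the book) and taking the exponential distribution purely as an example, compares the empirical distribution of ζ = F^{[−1]}(ε) with F:

import numpy as np

rng = np.random.default_rng(0)

# Target distribution Exp(lam); here the generalized inverse is explicit:
# F^{[-1]}(u) = -log(1 - u) / lam.
lam = 2.0
F = lambda x: 1.0 - np.exp(-lam * x)
F_inv = lambda u: -np.log1p(-u) / lam

eps = rng.uniform(size=100_000)     # epsilon ~ Uniform[0, 1]
zeta = F_inv(eps)                   # zeta = F^{[-1]}(epsilon), as in Problem 1.8

for x in (0.1, 0.5, 1.0, 2.0):
    print(f"x={x:4.1f}   empirical P(zeta <= x) = {np.mean(zeta <= x):.4f}   F(x) = {F(x):.4f}")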
1.15. Suppose that in Problem 1.13 random variables {Xn , n ≥ 1} may be dependent, but for some m ≥ 2 and for any l = 1, . . . , m random variables {Xnm+l , n ≥ 0} are independent. Which ones among extended variables presented in the items a) — c) of Problem 1.13 are degenerate? 1.16. Let {ξn , n ≥ 1} be a sequence of i.i.d. random variables. Indicate such a sequence {an , n ≥ 1} ⊂ R+ that lim supn→+∞ ξn /an = 1 almost surely, if (a) ξn ∼ N(0, σ 2 ); (b) ξn ∼ Exp(λ ); (c) ξn ∼ Pois(λ ). 1.17. Let {X(t),t ∈ R+ } be a stochastic process with right continuous trajectories, a ∈ C(R+ ) be some deterministic function. Prove that the following variables are extended random variables (that is, measurable functions from Ω to R ∪ {−∞, +∞}): (a) supt∈R+ X(t)/a(t); (b) lim supt→+∞ X(t)/a(t); (c) the number of partial limits of the function X(·)/a(·) as t → +∞; (d) Var(X(·), [a, b]) (variation of X(·) on the interval [a, b]). 1.18. Are the random variables presented in the previous problem measurable for an arbitrary stochastic process without any additional conditions on its trajectories? 1.19. Suppose that a random process {X(t),t ∈ [0; 1]} has continuous trajectories. Prove that the following sets are measurable. A = {ω ∈ Ω | X(t, ω ),t ∈ [0; 1] satisfies Lipschitz condition}. B = {ω ∈ Ω | mint∈[0;1] X(t, ω ) < 7}. C = {ω ∈ Ω |
∫_0^1 X²(s, ω) ds > 3 max_{s∈[0;1]} X(s, ω)}.
D = {ω ∈ Ω | ∃ t ∈ [0; 1) : X(t, ω ) = 1}. E = {ω ∈ Ω | X(1/2, ω ) + 3 sin X(1, ω ) ≤ 0}. F = {ω ∈ Ω | X(t, ω ),t ∈ [0, 1] is monotonically nondecreasing}. G = {ω ∈ Ω | ∃t1 ,t2 ∈ [0, 1],t1 = t2 : X(t1 , ω ) = X(t2 , ω ) = 0}. H = {ω ∈ Ω | X(t, ω ),t ∈ [0, 1) is monotonically increasing}. I = {ω ∈ Ω | at some point trajectory X(·, ω ) is tangent from above to the axis Ox; that is, there exists such an interval [τ1 , τ2 ] ⊂ [0, 1], that X(t, ω ) ≥ 0 as t ∈ [τ1 , τ2 ] and mint∈[τ1 ,τ2 ] X(t, ω ) = 0}. 1.20. Let {Xn (t), n ≥ 1,t ∈ [0, 1]} be a sequence of random processes with continuous trajectories. Prove that the set {ω ∈ Ω | ∑n Xn (t, ω ) uniformly converges on [0, 1]} is a random event. 1.21. Let Γ ⊂ R be an open set and suppose that the trajectories of a process {X(t),t ∈ R+ } are right continuous and have left limits. (1) Are the following functions extended random variables: (a) τ Γ ≡ sup{t : ∀s ≤ t, X(s) ∈ Γ }; (b) τΓ ≡ inf{t : X(t) ∈ Γ }? (2) Prove that τ Γ = τΓ (this value is called the hitting time of the set Γ by the process X).
1.22. Solve the previous problem assuming Γ is a closed set. 1.23. Let Γ ⊂ R be some set and τΓ ≡ inf{t : X(t) ∈ Γ } be a hitting time of Γ by a process {X(t),t ∈ R+ } with right continuous trajectories that have left limits. Is the variable τΓ an extended random variable? 1.24. Solve the previous problem assuming Γ is a closed set and trajectories of the process {X(t),t ∈ R+ } do not satisfy any continuity conditions. 1.25. Suppose that trajectories of the process {X(t),t ∈ [0, 1]} are right continuous and have left limits. Prove that for any ω ∈ Ω the trajectory X(·, ω ) is Riemann integrable on [0, 1], and 01 X(t) dt is a random variable. 1.26. Suppose that trajectories of the process {X(t),t ∈ [0, 1]} are right continuous. Is it true that for any ω ∈ Ω trajectory X(·, ω ) is Riemann integrable on [0, 1]? 1.27. Let trajectories of the process {X(t),t ∈ [0, 1]} be right continuous, and τ be a random variable with values in [0, 1]. Prove that the function X(τ ) : Ω ω → X(τ (ω ), ω ) is a measurable function; that is, X(τ ) is a random variable. 1.28. Present an example of random process {X(t),t ∈ [0, 1]} and random variable τ taking values in [0, 1] such that X(τ ) is not a random variable. 1.29. Prove Theorem 1.2. 1.30. Prove that the following subsets of R[0,1] do not belong to the σ -algebra generated by cylinder sets. (a) The set of all continuous functions (b) The set of all bounded functions (c) The set of all Borel functions 1.31. Construct a random process {X(t),t ∈ [0, 1]} defined on probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] in such a way that the set {ω ∈ Ω | function X(·, ω ) is continuous} is not a random event. 1.32. Construct a random process {X(t),t ∈ [0, 1]} defined on probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] in such a way that the set {ω ∈ Ω | function X(·, ω ) is bounded} is not a random event. 1.33. Construct a random process {X(t),t ∈ [0, 1]} defined on probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] in such a way that the set {ω ∈ Ω |function X(·, ω ) is measurable} is not a random event. 1.34. Prove that there exist subsets of the set {0, 1}N that do not belong to the σ algebra generated by cylinder sets (suppose that X = 2{0,1} ). 1.35. Prove that the class C(X, T) of cylinder sets is an algebra; if T is an infinite set and X contains at least two points, then the class is not a σ -algebra.
1.36. Let X = {X(t),t ∈ T} be a random function defined on some probability space (Ω , F, P) with phase space X. Prove that for any subset A ⊂ XT that belongs to the cylinder σ -algebra we have: {ω ∈ Ω |X(·, ω ) ∈ A} ∈ F. 1.37. Let X = {X(t),t ∈ T} be a random function defined on some probability space (Ω , F, P) with phase space X and let {ω ∈ Ω |X(·, ω ) ∈ A} ∈ F for some subset A ⊂ XT . Can we assert that A belongs to cylinder σ -algebra? Compare with the previous problem.
Hints 1.2. For any t1 < · · · < tm the random variables X(t1 ), . . . , X(tm ) can take values 0 and 1, only. Moreover, if X(t j ) = 1, then X(tk ) = 1 for k = j + 1, . . . , m. Therefore the joint distribution of X(t1 ), . . . , X(tm ) is concentrated at the points z0 = (1, . . . , 1), z1 = (0, 1, . . . , 1), . . . , zm−1 = (0, . . . , 0, 1), zm = (0, . . . , 0) (there are m + 1 such points). The fact that (X(t1 ), . . . , X(tm )) = z j ( j = 2, . . . , m − 1) means that X(t j−1 ) = 0 and X(t j ) = 1, which gives τ ∈ (t j−1 ,t j ]. 1.6. (1) X(t, ·) ≡ 0 if t ∈ A and X(t, ·) = 1I{t} (·) if t ∈ A; that is, in both these cases we have a measurable function. (2) {ω | X(t, ω ) = 1} = {ω ≤ t, ω ∈ A}.
1.8. {F (ε ) ≤ x} = {inf{y| F(y) > ε } ≤ x} = ∞ n=1 {inf{y| F(y) > ε } < x + 1/n} = ∞ −1 {F(y) > ε } = { ε < F((x + 1/n)−)} = {ε ≤ F(x)}. n=1 n=1 y∈Q,y<x+1/n 1.9. (a) Let εk (ω ) be equal to the kth digit after the point in the binary notation for the number ω ∈ [0, 1]. Then {εk , k ∈ N} are i.i.d. random variables, which take on values 0 and 1 with probabilities 12 (prove it!). (b) Sets N and N2 are equinumerous, therefore on [0, 1] there exists a double sequence {εk, j , k, j ∈ N2 } of i.i.d. random variables that take on values 0 and 1 with probabilities 12 . Take ξk = ∑∞j=1 2− j εk, j . (c) Use item (b) and Problem 1.8. 1.10. Suppose that the set A ⊂ B(R) is such that P(ξt ∈ A) ∈ (0, 1); then for any t = s the distance in L2 (Ω , F, P) between 1Iξt ∈A and 1Iξs ∈A is equal to some constant cA > 0; that is, the space L2 (Ω , F, P) is not separable. Compare with properties of the space L2 ([0, 1]). 1.11. Let ε1 , ε2 be independent variables with uniform distribution on [0, 1]; ξ˜ , η˜ be random variables with joint distribution μ ; {F(y), y ∈ R} be distribution function of η˜ ; and {F(x/y), x, y ∈ R} be conditional distribution function of ξ˜ under conb dition that {η = y} (i.e., P(ξ˜ ≤ a, η˜ ≤ b) = −∞ F(a/y) dF(y), a, b ∈ R). Denote by F [−1] and F [−1] (·/y) the generalized inverse functions of F, F(·/y), y ∈ R, take ξ = F [−1] (ε1 , ε2 ), η = F [−1] (ε2 ), and prove that ξ , η have joint distribution μ (see Problem 1.8). After that repeat the same procedure (with the same ε1 , ε2 ) for variables ζ˜ , η˜ such that the joint distribution of η˜ , ζ˜ equals to ν .
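Hint 1.9(a), the binary-digit construction of an i.i.d. Bernoulli(1/2) sequence from a single uniform variable, is easy to check by simulation. The sketch below assumes Python with NumPy (an assumption of ours, not of the book):

import numpy as np

rng = np.random.default_rng(1)
omega = rng.uniform(size=50_000)      # omega uniform on [0, 1]

def digit(w, k):
    # k-th binary digit of w, i.e. epsilon_k(omega) from hint 1.9(a)
    return (np.floor(w * 2**k).astype(int)) % 2

d1, d2, d5 = digit(omega, 1), digit(omega, 2), digit(omega, 5)
print("P(eps_1 = 1) ~", d1.mean())                    # close to 1/2
print("P(eps_2 = 1) ~", d2.mean())
print("P(eps_1 = eps_2 = 1) ~", (d1 & d2).mean())     # close to 1/4 (independence)
print("P(eps_1 = eps_5 = 1) ~", (d1 & d5).mean())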
1.12. Find a symmetric matrix of the size 3 × 3, which is not a nonnegatively defined one, but such that all three matrices obtained by obliteration of the ith row and the ith column (i = 1, 2, 3) of this matrix are nonnegatively defined. Does there exist a Gaussian three-dimensional vector with such a covariance matrix? of partial limits of the sequence {Xn (ω ), n ≥ 1}, 1.13. (c) Let N(ω ) be the number then {ω ∈ Ω | N(ω ) ≥ k} = α1 ,...,α2k ∈Q,α1 0 that X(s, ω ) ∈ Γ , s ∈ [t,t + ε (ω )). Therefore {τ Γ < a} = b∈Q,b 0. t (b) X(t) = W (e R. ), t2∈ (c) X(t) = W 1 − t , t ∈ [−1, 1]. 2.4. Let W be the Wiener process. Find the characteristic function for W (2)+2W (1). 2.5. Let N be the Poisson process with intensity λ . Find the characteristic function for N(2) + 2N(1). 2.6. Let W be the Wiener process. Find: (a) E(W (t))m , m ∈ N. (b) E exp(2W (1) +W (2)). (c) E cos(2W (1) +W (2)). 2.7. Let N be the Poisson process with intensity λ . Find: (a) P(N(1) = 2, N(2) = 3, N(3) = 5). (b) P(N(1) ≤ 2, N(2) = 3, N(3) ≥ 5). (c) E(N(t) + 1)−1 . (d) EN(t)(N(t) − 1) · · · · · (N(t) − k), k ∈ Z+ .
2.8. Let W be the Wiener process and f ∈ C([0, 1]). Find the characteristic function of the random variable ∫_0^1 f(s)W(s) ds (the integral is defined for every ω in the Riemann sense; see Problem 1.25). Prove that this random variable is normally distributed.
2.9. Let W be the Wiener process, f ∈ C([0, 1]), and X(t) = ∫_0^t f(s)W(s) ds, t ∈ [0, 1]. Find R_{W,X}.
2.10. Let N be the Poisson process, f ∈ C([0, 1]). Find the characteristic functions of the random variables: (a) ∫_0^1 f(s)N(s) ds; (b) ∫_0^1 f(s) dN(s) ≡ ∑ f(s), where the summation is taken over all s ∈ [0, 1] such that N(s) ≠ N(s−).
2.11. Let N be the Poisson process, f, g ∈ C([0, 1]), X(t) = ∫_0^t f(s)N(s) ds, Y(t) = ∫_0^t g(s) dN(s), t ∈ [0, 1]. Find: (a) R_{N,X}; (b) R_{N,Y}; (c) R_{X,Y}.
2.12. Find all one-dimensional and m-dimensional characteristic functions: (a) for the process introduced in Problem 1.2; (b) for the process introduced in Problem 1.4. 2.13. Find the covariance function of the process X(t) = ξ1 f1 (t) + · · · + ξn fn (t), t ∈ R, where f1 , . . . , fn are nonrandom functions, and ξ1 , . . . , ξn are noncorrelated random variables with variances σ12 , . . . , σn2 . 2.14. Let {ξn , n ≥ 1} be the sequence of independent square integrable random variables. Denote an = Eξn , σn2 = Var ξn . (1) Prove that series ∑n ξn converges in the mean square sense if and only if the series ∑n an and ∑n σn2 are convergent. (2) Let { fn (t), t ∈ R}n∈N be the sequence of nonrandom functions. Formulate the necessary and sufficient conditions for the series X(t) = ∑n ξn fn (t) to converge in the mean square for every t ∈ R. Find the mean and covariance functions of the process X. 2.15. Are the following functions nonnegatively defined: (a) K(t, s) = sint sin s; (b) K(t, s) = sin(t + s); (c) K(t, s) = t 2 + s2 (t, s ∈ R)? 2.16. Prove that for α > 2 the function K(t, s) = 12 (t α + sα − |t − s|α ) , t, s ∈ Rm is not a covariance function. 2.17. (1) Let {X(t), t ∈ R+ } be a stochastic process with independent increments and E|X(t)|2 < +∞, t ∈ R+ . Prove that its covariance function is equal to RX (t, s) = F(t ∧ s), t, s ∈ R+ , where F is some nondecreasing function. (2) Let {X(t),t ∈ R+ } be a stochastic process with RX (t, s) = F(t ∧ s), t, s ∈ R+ , where F is some nondecreasing function. Does it imply that X is a process with independent increments? 2.18. Let N be the Poisson process with intensity λ . Let X(t) = 0 when N(t) is odd and X(t) = 1 when N(t) is even. (1) Find the mean and covariance of the process X. (2) Find RN,X .
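The covariance structure in Problem 2.13 is easy to verify by simulation. Below is a rough Monte Carlo check, assuming Python with NumPy; the concrete choices n = 2, f_1(t) = 1, f_2(t) = t and the variances are ours, made only for illustration:

import numpy as np

rng = np.random.default_rng(2)

sigma2 = np.array([1.0, 4.0])               # Var xi_1, Var xi_2
f = lambda t: np.array([1.0, t])            # f_1(t) = 1, f_2(t) = t

N = 200_000
xi = rng.normal(scale=np.sqrt(sigma2), size=(N, 2))   # centered, uncorrelated

t, s = 0.3, 1.2
Xt, Xs = xi @ f(t), xi @ f(s)               # X(t) = xi_1 f_1(t) + xi_2 f_2(t)

mc_cov = np.mean(Xt * Xs)                              # means are zero
formula = np.sum(sigma2 * f(t) * f(s))                 # sum_k sigma_k^2 f_k(t) f_k(s)
print(f"Monte Carlo covariance: {mc_cov:.3f},   sum_k sigma_k^2 f_k(t) f_k(s): {formula:.3f}")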
2.19. Let W and N be the independent Wiener process and Poisson process with intensity λ , respectively. Find the mean and covariance of the process X(t) = W (N(t)). Is X a process with independent increments? 2.20. Find RX,W and RX,N for the process from the previous problem. 2.21. Let N1 , N2 be two independent Poisson processes with intensities λ1 , λ2 , respectively. Define X(t) = (N1 (t))N2 (t) ,t ∈ R+ if at least one of the values N1 (t), N2 (t) is nonzero and X(t) = 1 if N1 (t) = N2 (t) = 0. Find: (a) The mean function of the process X (b) The covariance function of the process X 2.22. Let X,Y be two independent and centered processes and c > 0 be a constant. Prove that RX+Y = RX + RY , R√cX = cRX , RXY = RX RY . 2.23. Let K1 , K2 be two nonnegatively defined functions and c > 0. Prove that the following functions are nonnegatively defined: (a) R = K1 + K2 ; (b) R = cK1 ; (c) R = K1 · K2 . 2.24. Let K be a nonnegatively defined function on T × T. (1) Prove that for every polynomial P(·) with nonnegative coefficients the function R = P(K) is nonnegatively defined. (2) Prove that the function R = eK is nonnegatively defined. (3) When it is additionally assumed that for some p ∈ (0, 1) K(t,t) < p−1 , t ∈ T, prove that the function R = (1 − pK)−1 is nonnegatively defined. 2.25. Give the probabilistic interpretation of items (1)–(3) of the previous problem; that is, construct the stochastic process for which R is the covariance function. 2.26. Let K(t, s) = ts,t, s ∈ R+ . Prove that for an arbitrary polynomial P the function R = P(K) is nonnegatively defined if and only if all coefficients of the polynomial P are nonnegative. Compare with item (1) of Problem 2.24. 2.27. Which of the following functions are nonnegatively defined: (a) K(t, s) = sin(t − s); (b) K(t, s) = cos(t − s); (c) K(t, s) = e−(t−s) ; (d) K(t, s) = e−|t−s| ; 2 4 (e) K(t, s) = e−(t−s) ; (f) K(t, s) = e−(t−s) ? 2.28. Let K ∈ C ([a, b] × [a, b]). Prove that K is nonnegatively defined if and only if the integral operator AK : L2 ([a, b]) → L2 ([a, b]), defined by AK f (t) =
∫_a^b K(t, s) f(s) ds,  f ∈ L²([a, b]),
is nonnegative. 2.29. Let AK be the operator from the previous problem. Check the following statements. (a) The set of eigenvalues of the operator AK is at most countable. (b) The function K is nonnegatively defined if and only if every eigenvalue of the operator AK is nonnegative.
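Problems 2.27–2.29 relate nonnegative definiteness of a kernel K to the spectra of its Gram matrices and of the integral operator A_K. A crude numerical screen, assuming Python with NumPy (our illustration, not the book's method), evaluates the smallest eigenvalue of the matrix (K(t_i, t_j)) on a grid: a clearly negative value rules out nonnegative definiteness.

import numpy as np

def min_eig(K, m=60, T=5.0):
    # smallest eigenvalue of the Gram matrix (K(t_i, t_j)) on a uniform grid
    t = np.linspace(0.0, T, m)
    G = K(t[:, None], t[None, :])
    return np.linalg.eigvalsh(G).min()

print("K(t,s) = exp(-|t-s|):", min_eig(lambda t, s: np.exp(-np.abs(t - s))))  # positive
print("K(t,s) = cos(t-s)   :", min_eig(lambda t, s: np.cos(t - s)))           # ~ 0, nonnegative up to rounding
print("K(t,s) = sin(t+s)   :", min_eig(lambda t, s: np.sin(t + s)))           # clearly negative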
2.30. Let K(s,t) = F(t − s), t, s ∈ R, where the function F is periodic with period 2π and F(x) = π − |x| for |x| ≤ π . Construct the Gaussian process with covariance K of the form ∑n εn fn (t), where {εn , n ≥ 1} is a sequence of the independent normally distributed random variables. 2.31. Solve the previous problem assuming that F has period 2 and F(x) = (1 − x)2 , x ∈ [0, 1]. 2.32. Denote {τn , n ≥ 1} the jump moments for the Poisson process N(t), τ0 = 0. Let {εn , n ≥ 0} be i.i.d. random variables that have expectation a and variance σ 2 . Consider the stochastic processes X(t) = ∑nk=0 εk , t ∈ [τn , τn+1 ), Y (t) = εn , t ∈ [τn , τn+1 ), n ≥ 0. Find the mean and covariance functions of the processes X,Y. Exemplify the models that lead to such processes. 2.33. A radiation measuring instrument accumulates radiation with the rate that equals a Roentgen per hour, right up to the failing moment. Let X(t) be the reading at point of time t ≥ 0. Find the mean and covariance functions for the process X if X(0) = 0, the failing moment has distribution function F, and after the failure the measuring instrument is fixed (a) at zero point; (b) at the last reading. 2.34. The device registers a Poisson flow of particles with intensity λ > 0. Energies of different particles are independent random variables. Expectation of every particle’s energy is equal to a and variance is equal to σ 2 . Let X(t) be the readings of the device at point of time t ≥ 0. Find the mean and covariance functions of the process X if the device shows (a) Total energy of the particles have arrived during the time interval [0,t]. (b) The energy of the last particle. (c) The sum of the energies of the last K particles. 2.35. A Poisson flow of claims with intensity λ > 0 is observed. Let X(t),t ∈ R be the time between t and the moment of the last claim coming before t. Find the mean and covariance functions for the process X.
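A model of the type appearing in Problem 2.34(a) (total energy registered by the device) is a compound Poisson process and is easy to simulate. The sketch below assumes Python with NumPy; the reference values λat for the mean and λ(a² + σ²)(t ∧ s) for the covariance are the standard compound-Poisson moments, quoted here only to have something to compare against, not taken from the book's answers.

import numpy as np

rng = np.random.default_rng(3)

lam, a, sigma = 3.0, 2.0, 0.5       # flow intensity, mean and std of one particle's energy
t, s, N = 2.0, 1.0, 20_000

def increment(T, size):
    # total energy of particles arriving during an interval of length T
    counts = rng.poisson(lam * T, size=size)
    return np.array([rng.normal(a, sigma, size=c).sum() for c in counts])

Xs = increment(s, N)                # X(s)
Xt = Xs + increment(t - s, N)       # X(t) built from the same flow on [0, s]

print("mean X(t):", Xt.mean(), "   lam*a*t =", lam * a * t)
print("cov(X(t), X(s)):", np.mean((Xt - Xt.mean()) * (Xs - Xs.mean())),
      "   lam*(a^2 + sigma^2)*min(t, s) =", lam * (a**2 + sigma**2) * min(t, s))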
Hints 2.1. See the hint to Problem 2.17. 2.4. Because the variables (W (1),W (2)) are jointly Gaussian, the variable W (2) + 2W (1) is normally distributed. Calculate its mean and variance and use the formula for the characteristic function of the Gaussian distribution. Another method is proposed in the following hint. 2.5. N(2) + 2N(1) = N(2) − N(1) + 3N(1). The values N(2) − N(1) and N(1) are Poisson-distributed random variables and thus their characteristic functions are known. These values are independent, that is, the required function can be obtained as a product.
2.6. (a) If η ∼ N(0, 1), then Eη^{2k−1} = 0 and Eη^{2k} = (2k − 1)!! = (2k − 1)(2k − 3) ··· 1 for k ∈ N. Prove and use this for the calculations. (b) Use the explicit formula for the Gaussian density. (c) Use the formula cos x = ½(e^{ix} + e^{−ix}) and Problem 2.4.
2.10. (a) Make calculations similar to those of Problem 2.8. (b) Obtain the characteristic functions of the integrals of piecewise constant functions f and then uniformly approximate the continuous function by piecewise constant ones.
2.17. (1) Let s ≤ t; then the values X(t) − X(s) and X(s) are independent and hence uncorrelated. Therefore cov(X(t), X(s)) = cov(X(t) − X(s), X(s)) + cov(X(s), X(s)) = cov(X(t ∧ s), X(t ∧ s)). The case t ≤ s can be treated similarly.
2.23. Items (a) and (b) can be proved using the definition. In item (c) you can use the previous problem.
2.24. The proof of item (1) can be obtained directly from the previous problem. For the proof of items (2) and (3) use item (1), the Taylor expansions of the functions x → e^x and x → (1 − px)^{−1}, and the fact that the pointwise limit of a sequence of nonnegatively defined functions is also nonnegatively defined. (Prove this fact!)
Answers and Solutions

2.1. R_W(t, s) = t ∧ s, R_N(t, s) = λ(t ∧ s).

2.2. a_X(t) = t, R_X(t, s) = 2(t ∧ s)².

2.3. For arbitrary f : R_+ → R_+, the covariance function of the process X(t) = W(f(t)), t ∈ R_+, is equal to R_X(t, s) = R_W(f(t), f(s)) = f(t) ∧ f(s).

2.8. Let I_n = n^{-1} ∑_{k=1}^n f(k/n) W(k/n). Because the process W a.s. has continuous trajectories and the function f is continuous, the Riemann integral sum I_n converges to I = ∫_0^1 f(t)W(t) dt a.s. Therefore φ_{I_n}(z) → φ_I(z), n → +∞, z ∈ R. Hence,
E e^{izI_n} = E exp{ iz n^{-1} ∑_{k=1}^n f(k/n) W(k/n) } = E exp{ i ∑_{k=1}^n z n^{-1} ( ∑_{j=k}^n f(j/n) ) (W(k/n) − W((k−1)/n)) }
= ∏_{k=1}^n exp{ −(2n)^{-1} ( z n^{-1} ∑_{j=k}^n f(j/n) )² } → exp{ −(z²/2) ∫_0^1 ( ∫_t^1 f(s) ds )² dt },  n → ∞.
Thus I is a Gaussian random variable with zero mean and variance ∫_0^1 ( ∫_t^1 f(s) ds )² dt.

2.9. R_{W,X}(t, s) = ∫_0^s f(r)(t ∧ r) dr.

2.10. (a) φ(z) = exp{ λ ∫_0^1 ( e^{iz ∫_t^1 f(s) ds} − 1 ) dt }.
(b) φ(z) = exp{ λ ∫_0^1 ( e^{iz f(t)} − 1 ) dt }.
2 Mean and covariance functions. Characteristic functions
2.11. RN,X (t, s) = λ 2 0s f (r)(t ∧ r) dr, RN,Y (t, s) = λ 2
u∧s t g(r) dr du. 0 f (u) 0
t∧s 0
g(r) dr, RX,Y (t, s) = λ 2 ×
2.12. (a) Let 0 ≤ t1 < · · · < tn ≤ 1; then φt1 ,...,tm (z1 , . . . , zm ) = t1 eiz1 +···+izm + (t2 − t1 )eiz2 +···+izm + · · · + (tm − tm−1 )eizm + (1 − tm ). (b) Let 0 ≤ t1 < · · · < tn ≤ 1, then −1 −1 −1 −1 φt1 ,...,tm (z1 , . . . , zm ) = F(t1 )eiz1 n +···+izm n + (F(t2 ) − F(t1 ))eiz2 n +···+izm n + · · · + (F(tm ) − F(tm−1 ))eizm n
−1
n + (1 − F(tm )) .
2.13. RX (t, s) = ∑nk=1 σk2 fk (t) fk (s). 2.15. (a) Yes; (b) no; (c) no. 2.17. (2) No, it does not. 2.18. (1) aX (t) = 12 1 + e−2λ t , RX (t, s) = 14 e−2λ |t−s| − e−2λ (t+s) . (2) RN,X (t, s) = −λ (t ∧ s)e−2λ s . 2.19. aX ≡ 0, RX (t, s) = λ (t ∧ s). X is the process with independent increments. 2.20.
k(λ t)k (λ t)k ∑ k! + s · ∑ k! , RX,N ≡ 0. k<s k≥s
RX,W (t, s) = E[N(t) ∧ s] = e
−λ t
2.21. aX (t) = exp λ1teλ2 t − (λ1 + λ2 )t ; function RX is not defined because EX 2 (t) = +∞,t > 0. 2.25. There exist several interpretations, let us give two of them. m + The first one: let R = f (K) and f (x) = ∑∞ m=0 cm x with cm ≥ 0, m ∈ Z . Let the radius of convergence of the series be equal to r f > 0 and K(t,t) < r f ,t ∈ R+ . Consider a triangular array {Xm,k , 1 ≤ k ≤ m} of independent centered identically distributed processes with the covariance function K. In addition, let random variable √ ξ be independent of {Xm,k } and Eξ = 0, Dξ = 1. Then the series X(t) = c0 ξ + √ m ∑∞ m=1 cm ∏k=1 Xm,k (t) converges in the mean square for any t and the covariance function of the process X is equal to R. The second one: using the same notations, denote c = ∑∞ k=0 ck , pk = ck /c, k ≥ 0. Let {Xm , m ≥ 1} be a sequence of independent identically distributed centered processes with the covariance function K, and ξ be as above. Let η be {Xm , m ≥ 1}, with the random variable, independent both on ξ and the processes √ P(η = k) = pk , k ∈ Z+ . Consider the process X(t) = c ∏ηk=1 Xk (t) assuming that ∏0k=1 Xk (t) = ξ . Then the covariance function of the process X is equal to R. In particular, the random variable η should have a Poisson distribution in item (2) and a geometric distribution in item (3).
2.26. Consider the functions Rk = (∂ 2k /∂ t k ∂ sk )R, k ≥ 0. These functions are nonnegatively defined (one can obtain this fact by using either Definition 2.3 or Theorem 4.2). Function Rk can be represented in the form Rk = Pk (K), where the absolute term of the polynomial Pk equals the kth coefficient of the polynomial P multiplied by (k!)2 . Now, the required statement follows from the fact that Q(t,t) ≥ 0 for any nonnegatively defined function Q. 2.27. Functions from the items (b), (d), (e) are nonnegatively defined; the others are not. 2.28. Let K be nonnegatively defined. Then for any f ∈ C([a, b]), (AK f , f )L2 ([a,b]) =
∫_a^b ∫_a^b K(t, s) f(t) f(s) ds dt = lim_{n→∞} ((b − a)/n)² ∑_{j,k=1}^n f(a + j(b−a)/n) f(a + k(b−a)/n) K(a + j(b−a)/n, a + k(b−a)/n) ≥ 0
because every sum under the limit sign is nonnegative. Because C([a, b]) is a dense subset in L2 ([a, b]) the above inequality yields that (AK f , f )L2 ([a,b]) ≥ 0, f ∈ L2 ([a, b]). On the other hand, let (AK f , f )L2 ([a,b]) ≥ 0 for every f ∈ L2 ([a, b]), and let points t1 , . . . ,tm and constants z1 , . . . , zm be fixed. Choose m sequences of continuous functions { fn1 , n ≥ 1}, . . . , { fnm , n ≥ 1} such that, for arbitrary function φ ∈ C([a, b]), ab φ (t) fnj (t) dt → φ (t j ), n → ∞, j = 1, . . . , m. Putting fn = ∑mj=1 z j fnj , we obtain that ∑mj,k=1 z j zk K(t j ,tk ) = limn→∞ ab ab K(t, s) fn (t) fn (s) dsdt = limn→∞ (AK fn , fn ) ≥ 0. 2.29. Statement (a) is a particular case of the theorem on the spectrum of a compact operator. Statement (b) follows from the previous problem and theorem on spectral decomposition of a compact self-adjoint operator.
3 Trajectories. Modifications. Filtrations
Theoretical grounds Definition 3.1. Random functions {X(t),t ∈ T} and {Y (t),t ∈ T}, defined on the same probability space, are called equivalent (or stochastically equivalent), if P(X(t) = Y (t)) = 1 for any t ∈ T. Random functions {X(t),t ∈ T} and {Y (t),t ∈ T}, possibly defined on different probability spaces, are called stochastically equivalent in a wide sense if their corresponding finite-dimensional distributions coincide. A random function Y equivalent to X is called a modification of the random function X. Definition 3.2. Let T be a linearly ordered set. A filtration (or a flow of σ -algebras) on a probability space (Ω , F, P) is a family of σ -algebras F = {Ft ,t ∈ T} that satisfies the condition Fs ⊂ Ft ⊂ F, s,t ∈ T, s ≤ t. Filtration is called complete if every σ -algebra Ft includes all null probability sets from F.
Denote Ft+ = s>t Fs , Ft− = s 0, P(ρ (X(t), X(s)) > ε ) → 0, s → t. Then there exists a modification of the function X that is measurable (i.e., a measurable modification). Remark 3.1. In most textbooks the statement of Theorem 3.1 is formulated and proved under the condition that the space T is compact or σ -compact (e.g., T = R, T = Rd , etc.). For the separable space T, this statement still holds true. This follows from the result of Problem 3.45, published primarily in the paper [86] (see also Problem 3.46). Definition 3.6. A random function X is called separable if there exist a countable dense subset T0 ⊂ T and a set N ∈ F with P(N) = 0 such that, for every open set G ⊂ T and closed set F ⊂ X, {ω | X(t, ω ) ∈ F, t ∈ G}{ω | X(t, ω ) ∈ F, t ∈ G ∩ T0 } ⊂ N. The set T0 is called the set of separability for the function X. Theorem 3.2. Let the space T be separable and the space X be compact. Then every random function {X(t),t ∈ T} has a modification being a separable random function (i.e., a separable modification). Let us consider the question of existence of a modification of a random function with all its trajectories being continuous functions (i.e., a continuous modification). Theorem 3.3. Assume that the space T is compact, the space X is complete, and a random function {X(t),t ∈ T} is continuous in probability. Then the continuous modification for the function X exists if and only if the following condition holds true,
P( ⋃_{n=1}^∞ ⋂_{m=1}^∞ ⋃_{s,t∈T_0, d(t,s)<1/m} { ρ(X(t), X(s)) > 1/n } ) = 0,
where T_0 ⊂ T is a countable dense set.

Theorem 3.4. Let {X(t), t ∈ [0, T]} be a continuous in probability stochastic process. Suppose that there exist a nondecreasing function {g(h), h ∈ [0, T]} and a function {q(c, h), c ∈ R_+, h ∈ [0, T]} such that
P( ρ(X(t − h), X(t)) > c g(h) ) ≤ q(c, h),  h > 0, t ∈ [h, T − h],
∑_{n=0}^∞ g(2^{−n} T) < +∞,   ∑_{n=1}^∞ 2^n q(c, 2^{−n} T) < +∞,   c ∈ R_+.     (3.1)
Then the process X has a continuous modification. As a corollary of Theorem 3.4, one can obtain the well-known sufficient Kolmogorov condition for the existence of a continuous modification.
Theorem 3.5. Let {X(t), t ∈ [0,T]} be a stochastic process satisfying Eρ^α(X(t), X(s)) ≤ C|t−s|^{1+β}, t,s ∈ [0,T], with some positive constants α, β, C. Then the process X possesses a continuous modification.
Under the sufficient Kolmogorov condition, the properties of the trajectories of the process X can be specified in more detail. Recall that a function f(t), t ∈ [0,T], is said to satisfy the Hölder condition with index γ (γ > 0) if
\[
\sup_{t \ne s} |t-s|^{-\gamma}\rho(f(t), f(s)) < +\infty.
\]
Theorem 3.6. Under the conditions of Theorem 3.5, for arbitrary γ < β/α the process X has a modification with the trajectories satisfying the Hölder condition with index γ.
Analogues of the sufficient Kolmogorov condition are available for random functions defined on parametric sets that may have a more complicated structure than an interval. Let us give a version of this condition for random fields.
Theorem 3.7. Let {ξ(x), x = (x_1,...,x_d) ∈ D ⊂ R^d} be a random field such that Eρ^α(ξ(x), ξ(y)) ≤ C‖x − y‖^{d+β}, x,y ∈ D, with some positive constants α, β, C. Then the field ξ possesses a continuous modification.
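For example (this is essentially the computation suggested in the hints to Problems 3.13 and 3.16 below), for the Wiener process W one has
\[
E|W(t) - W(s)|^{2n} = (2n-1)!!\,|t-s|^{n}, \qquad n \in N,
\]
so Theorem 3.5 applies with α = 2n, β = n − 1, and Theorem 3.6 yields a modification with trajectories satisfying the Hölder condition with any index γ < (n−1)/(2n); letting n → ∞, any γ < 1/2 can be reached.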
Remark 3.2. In numerous models and examples, there arises a wide class of stochastic processes that do not possess a continuous modification because of the jump discontinuities of their trajectories. The most typical and important example here is the Poisson process. This leads to the following definitions and notation. A function f: [a,b] → X is called càdlàg if it is right continuous and has left-hand limits at every point of [a,b]. This notation is the abbreviation of the French phrase continue à droite, limite à gauche. Similarly, a function that is left continuous and has right-hand limits at every point is called càglàd. The analogous English abbreviations rcll and rllc are used less frequently. The set of all càdlàg functions f: [a,b] → X is denoted D([a,b], X) and is called the Skorohod space. The short notation for D([a,b], R) is D([a,b]).
The following theorem gives sufficient conditions for a stochastic process to possess a càdlàg modification, formulated in terms of three-dimensional distributions of the process.
Theorem 3.8. Let {X(t), t ∈ [0,T]} be a continuous in probability stochastic process. Suppose that there exist a nondecreasing function {g(h), h ∈ [0,T]} and a function {q(c,h), c ∈ R_+, h ∈ [0,T]} such that (3.1) holds true and P({ρ(X(t−h), X(t)) > cg(h)} ∩ {ρ(X(t), X(t+h)) > cg(h)}) ≤ q(c,h), h > 0, t ∈ [h, T−h]. Then the process X has a càdlàg modification.
Also, for the existence of either càdlàg or continuous modifications, sufficient conditions are available, formulated in terms of conditional probabilities.
Theorem 3.9. Let {X(t), t ∈ [0,T]} be a stochastic process, and {α(ε,δ), ε,δ > 0} be a family of constants such that P(ρ(X(t), X(s)) > ε / F_s^X) ≤ α(ε,δ) a.s., 0 ≤ s ≤ t ≤ s + δ ≤ T, ε > 0. Then
(1) If lim_{δ→0+} α(ε,δ) = 0 for any ε > 0, then the process X has a càdlàg modification.
(2) If lim_{δ→0+} δ^{−1} α(ε,δ) = 0 for any ε > 0, then the process X has a continuous modification.
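For example, for the Poisson process N with parameter λ one may take α(ε,δ) = 1 − e^{−λδ} in Theorem 3.9: for every ε ∈ (0,1),
\[
P(|N(t) - N(s)| > \varepsilon \,/\, F_s^N) = 1 - e^{-\lambda(t-s)} \le 1 - e^{-\lambda\delta} \quad \text{a.s.}, \qquad 0 \le s \le t \le s+\delta,
\]
so item (1) gives a càdlàg modification; on the other hand, δ^{−1}(1 − e^{−λδ}) → λ ≠ 0 as δ → 0+, so item (2) is not applicable, in agreement with the fact that N has no continuous modification.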
Bibliography [9], Chapter I; [24], Volume 1, Chapter III, §2–5; [25], Chapter IV, §2–5; [15], Chapter II, §2; [79], Chapters 8–11.
Problems
3.1. Prove that if the domain T of a random function X is countable and the σ-algebra T includes all one-point sets, then the random function X is measurable.
3.2. (1) Prove that if a process is measurable then each of its trajectories is a measurable function. (2) Give an example of a nonmeasurable stochastic process with all its trajectories being measurable functions. (3) Give an example of a stochastic process with all its trajectories being nonmeasurable functions. 3.3. Let Ω = [0, 1], and σ -algebra F consist of all the subsets of [0, 1] having their Lebesgue measure equal either 0 or 1. Let X(t, ω ) = 1It=ω , ω ∈ [0, 1],t ∈ [0, 1]. Prove that (a) X is a stochastic process; (b) X is not measurable. 3.4. Prove that stochastic process {X(t),t ∈ R+ } is measurable assuming its trajectories are: (a) right continuous; (b) left continuous. 3.5. Assume it is known that every trajectory of the process {X(t),t ∈ R+ } is either right continuous or left continuous. Does it imply that this process is measurable? Compare with the previous problem. 3.6. Prove that if a process {X(t), t ∈ R+ } is measurable and a random variable τ possesses its values in R+ , then X(τ ) is random variable. Compare this problem and Problem 3.4 with Problems 1.27 and 1.28. 3.7. A process {X(t), t ∈ R+ } is called progressively measurable if for every T > 0 the restriction of the function X to [0, T ] × Ω is B([0, T ]) ⊗ FTX − X-measurable. Construct a process X that is measurable, but not progressively measurable. 3.8. Let stochastic process {X(t), t ∈ R+ } be continuous in probability. Prove that this process has: (a) measurable modification; (b) progressively measurable modification. 3.9. Let all values of a process {X(t), t ∈ R+ } be independent and uniformly distributed on [0, 1]. (a) Does this process have a c`adl`ag modification? (b) Does this process have a measurable modification? 3.10. Let T = R+ , (Ω , F, P) = (R+ , B(R+ ), μ ), where μ is a probability measure on R+ that does not have any atoms. Introduce the processes X,Y by X(t, ω ) = 1I{t=ω } , Y (t, ω ) = 0, t ∈ T, ω ∈ Ω . (1) Prove that X and Y are stochastically equivalent; that is, Y is a modification of X. (2) Check that all the trajectories of Y are continuous and all the trajectories of X are discontinuous. 3.11. Prove that if a stochastic process {X(t), t ∈ R} has a continuous modification, then every process Y , stochastically equivalent to X in a wide sense, has a continuous modification too.
3.12. Assume that the random field {ξ(x), x ∈ R^d} is such that, for every n ∈ N, the field {ξ(x), ‖x‖ ≤ n} has a continuous modification. Prove that the field {ξ(x), x ∈ R^d} itself has a continuous modification.
3.13. Let {X(t), t ∈ R+} be a Gaussian process with zero mean and covariance function R_X(t,s) that is equal to (1) exp[−|t − s|] (Ornstein–Uhlenbeck process). (2) (1/2)(t^{2H} + s^{2H} − |t − s|^{2H}), H ∈ (0,1] (fractional Brownian motion). Prove that the process X(·) has a continuous modification. Find the values of γ such that these processes have modifications with their trajectories satisfying the Hölder condition with index γ.
3.14. Prove that if a function f: [0,1] → R satisfies the Hölder condition with index γ > 1, then f is a constant function. Derive that K(t,s) = (1/2)(t^{2H} + s^{2H} − |t − s|^{2H}) is not a covariance function for H > 1. Compare with Problem 2.16.
3.15. Let a Gaussian process {X(t), t ∈ R} be stochastically continuous, and let t_0 ∈ R be fixed. Prove that the process Y(t) = E[X(t)/X(t_0)], t ∈ R, has a continuous modification.
3.16. Prove that the Wiener process has a modification with every trajectory satisfying the Hölder condition with arbitrary index γ < 1/2.
3.17. Let W be the Wiener process. Prove that lim sup_{t→0+} W(t)/√t = +∞ with probability one. In particular, almost every trajectory of the Wiener process does not satisfy the Hölder condition with index γ = 1/2.
3.18. Prove that for any α > 1/2,
\[
P\Big(\lim_{t\to+\infty} \frac{W(t)}{t^{\alpha}} = 0\Big) = 1.
\]
3.19. Let {W(t), t ≥ 0} be the Wiener process. Prove that there exists the limit in probability
\[
\text{P-}\lim_{n\to\infty} \sum_{k=0}^{n} \Big(W\Big(\frac{k+1}{n}\Big) - W\Big(\frac{k}{n}\Big)\Big)^{2},
\]
and find this limit. Prove that
\[
\text{P-}\lim_{n\to\infty} \sum_{k=0}^{n} \Big|W\Big(\frac{k+1}{n}\Big) - W\Big(\frac{k}{n}\Big)\Big| = \infty.
\]
3.20. Prove that almost all trajectories of the Wiener process have unbounded variation on [0, 1]. 3.21. Let {W (t), t ∈ R+ } be the Wiener process. Prove that, with probability one, λ 1 {t ≥ 0| W (t) = 0} = 0.
3.22. Prove that, with probability one, the Wiener process attains its maximum value on [0, 1] only once. 3.23. Let W be the Wiener process. For a, b ∈ R+ , a < b denote Cab = {x ∈ C(R+ )| x(t) = x(a), t ∈ [a, b]}.
Prove that the set {ω | W(·,ω) ∈ ∪_{a<b} C_{ab}} has probability zero.
3.26. Let {X(t), t ∈ R+} be a stochastic process with càdlàg trajectories. Put Y(t) = X(t−), t > 0, Y(0) = X(0). Prove that Y is a stochastic process with càglàd trajectories.
3.27. Let τ be a random variable taking values in R+ and X(t) = 1I_{t>τ}, t ≥ 0. Find a condition on the distribution of τ that would be necessary and sufficient for the process X to have a càdlàg modification.
3.28. Let {X(t), t ∈ [0,T]} be a continuous in probability stochastic process with independent increments. Prove that X has a càdlàg modification.
3.29. Let f ∈ D([a,b]). Prove that (a) The function f is bounded. (b) The set of discontinuities of the function f is at most countable.
3.30. Let the trajectories of a stochastic process X belong to the space D([0,T]) and let c > 0 be given. Prove that τ_c = inf{t | |X(t) − X(t−)| ≥ c} is a random variable.
3.31. Let the trajectories of a stochastic process X belong to the space D([0,T]) and let Γ ∈ B(R) be given. Prove that τ = inf{t | X(t) − X(t−) ∈ Γ} is a random variable.
3.32. Let {X(t), t ∈ R} be a separable stochastic process taking its values in a complete metric space. Prove that if the process X has a continuous modification then there exists a set Ñ ∈ F with P(Ñ) = 0 such that the trajectory X(·,ω) is continuous for any ω ∉ Ñ.
3.33. Let {X(t), t ∈ R} be a separable stochastic process taking its values in a complete metric space. Prove that if the process X has a càdlàg modification then there exists a set Ñ ∈ F with P(Ñ) = 0 such that the trajectory X(·,ω) is càdlàg for any ω ∉ Ñ.
3.34. Let {X(t), t ∈ R} be a separable stochastic process taking its values in a complete metric space. Assume that the process X has a measurable modification. Does this imply the existence of a set Ñ ∈ F with P(Ñ) = 0 such that the restriction of the function X(·,·) to R × (Ω\Ñ) is a measurable function?
3.35. Let {X(t), t ∈ R} be a stochastic process taking its values in X = [0,1]. Does separability of X imply separability of the subspace of L_2(Ω, F, P) generated by the family {X(t), t ∈ R} of the values of this process?
3.36. Prove the following characterization of the σ-algebra F_t^X: it consists of all A ∈ F for which there exists a set A_0 ∈ F_t^{X,0} such that A Δ A_0 ∈ N_P.
3.37. (1) Let X, Y be two stochastically equivalent processes. Prove that they generate the same filtration. (2) Give an example of two stochastically equivalent processes X, Y such that the corresponding filtrations {F_t^{X,0}, t ∈ T} and {F_t^{Y,0}, t ∈ T} do not coincide.
3.38. Let
\[
X(t) =
\begin{cases}
0, & t \in [0, \tfrac12],\\
(t - \tfrac12)\,\eta, & t \in (\tfrac12, 1],
\end{cases}
\]
where P(η = ±1) = 1/2. Describe explicitly the natural filtration of the process X. Is this filtration: (a) right continuous? (b) left continuous?
3.39. Let τ be a random variable uniformly distributed on [0,1], F = σ(τ) (the σ-algebra generated by the random variable τ), and X(t) = 1I_{t>τ}, t ∈ [0,1]. Describe explicitly the natural filtration of the process X. Is this filtration: (a) right continuous? (b) left continuous?
3.40. Let a stochastic process {X(t), t ∈ R+} have continuous trajectories. (1) Prove that its natural filtration is left continuous. (2) Provide an example showing that this filtration is not necessarily right continuous.
3.41. Provide an example of a process having càdlàg trajectories and generating a filtration that is neither left continuous nor right continuous.
3.42. Is a filtration generated by a Wiener process: (a) left continuous? (b) right continuous?
3.43. Is a filtration generated by a Poisson process: (a) left continuous? (b) right continuous?
3.44. Let {W(t), t ∈ R+} be a Wiener process and assume that all its trajectories are continuous. (1) Is it necessary for the filtration {F_t^{W,0}, t ∈ R+} to be: (a) left continuous? (b) right continuous? (2) Answer the same questions for the filtration {F_t^{N,0}, t ∈ R+}, where {N(t), t ∈ R+} is the Poisson process with càdlàg trajectories.
3.45. Let (Ω, F, P) be a probability space, Y be a separable metric space, and (A, A) be a measurable space. Consider a sequence of F ⊗ A-measurable random elements {X_n = X_n(a), a ∈ A, n ≥ 1} taking their values in Y. Assume that for every a ∈ A there exists a limit in probability of the sequence {X_n(a), n ≥ 1}. Prove that there exists an F ⊗ A-measurable random element X(a), a ∈ A, taking its values in Y, such that X_n(a) → X(a) in probability, n → ∞, for every a ∈ A.
3.46. Prove Theorem 3.1.
Hints
3.1. {(t,ω) | X(t,ω) ∈ B} = ∪_{t∈T} {t} × {ω | X(t,ω) ∈ B}; the union is at most countable. Every set {t} × {ω | X(t,ω) ∈ B} belongs to T ⊗ F because X(t) is a random element by the definition of a random function and therefore {ω | X(t,ω) ∈ B} ∈ F.
3.4. You can use Problem 3.1 and the relation X(t,ω) = lim_{n→∞} X(([tn]+1)/n, ω) in item (a) or X(t,ω) = lim_{n→∞} X([tn]/n, ω) in item (b).
3.5. Consider the sum of a process X_A from item (1) of Problem 1.6 and the process Y(t,ω) = 1I_{t>ω}.
3.11. Use Theorem 3.3.
3.13, 3.16. Use Theorem 3.6 with α = 2n, n ∈ N.
3.15. If DX(t_0) > 0, then
\[
E[X(t)/X(t_0)] = EX(t) + (X(t_0) - EX(t_0))\cdot\frac{\mathrm{cov}(X(t), X(t_0))}{DX(t_0)}.
\]
If DX(t_0) = 0, then Y(t) = EX(t).
3.18. Use Problems 3.16 and 6.5 (d).
3.17. Consider the sequence of random variables ξ_k = 2^{−(k+1)/2}(W(2^{−k}) − W(2^{−k−1})), k ∈ N, and prove that P(lim sup_{k→∞} |ξ_k| = +∞) = 1. Use this fact.
3.21. Introduce the function 1I_{{(t,ω) | W(t,ω)=0}} and use the Fubini theorem.
3.22. It is sufficient to prove that for any a < b the probability of the event {max_{t∈[0,a]} W(t) = max_{t∈[b,1]} W(t)} is equal to zero. One has
\[
\max_{t\in[b,1]} W(t) = W(a) + (W(b) - W(a)) + \max_{t\in[b,1]}\big(W(t) - W(b)\big),
\]
and the variables W(b) − W(a) and max_{t∈[b,1]}(W(t) − W(b)) are jointly independent of the σ-algebra F_a^W. In addition, the distribution of W(b) − W(a) is absolutely continuous. Therefore, the conditional distribution of max_{t∈[b,1]} W(t) w.r.t. F_a^W is absolutely continuous. This implies the needed statement.
3.23. The fact that P(W ∈ Cab ) = 0 for fixed a < b, can be proved similarly to the previous hint. Then you can use that a 0 ∃u, v ∈ [0,t − ε ] ∩ Q : |u − v| < δ , |X(u) − X(v)| > c − δ . 3.31. Consider a sequence θk = τ1/k , k ≥ 1 (see Problem 3.30); then τ = min{θk | X(θk ) − X(θk −) ∈ Γ }. 3.34. Put Ω = [0, 1], F is the σ -algebra of Lebesgue measurable sets, P is the Lebesgue measure, X(t, ω ) = 1It=ω 1It∈A + 1It>ω , where A is a Lebesgue nonmeasurable set. 3.35. Consider a stochastic process {X(t),t ∈ R} with i.i.d. values uniformly distributed on [0, 1] (see Problem 1.10 and corresponding hint). This process has a separable modification due to Theorem 3.2. 3.36. Use the “principle of the fitting sets”: prove that the class given in the formulation of the problem is a σ -algebra containing FtX,0 and NP . Prove that every σ -algebra containing both FtX,0 and NP should also contain this class. 3.42. (a) Use Problem 3.40. (b) Use Problem 5.54. 3.43. (a) For any t > 0, B ∈ B(R), one has {N(t) ∈ B}{N(t−) ∈ B} ⊂ {N(t−) = that N has c`adl`ag trajectories). Therefore {N(t) ∈ B} ∈ N(t)} P (assuming ∈N N . σ NP ∪ s 2−k < 2−k . Check that there exists an F ⊗ A-measurable random element X(a), a ∈ A such that Xnk (a) (a) → X(a), k → ∞ almost surely for every a ∈ A. 3.46. Choose a sequence of measurable subsets {Un,k } of the space T such that for any n ≥ 1: (1) T = ∪kUn,k . (2) The sets Un,k and Un, j are disjoint for k = j and their diameters do not exceed 1/n. Let tn,k ∈ Un,k . Then Xn (t) = ∑k X(tnk )1IUn,k (t) is measurable. Prove that for any t sequence {Xn (t), n ≥ 1} converges to X(t) in probability as n → ∞ and use Problem 3.45.
Answers and Solutions
3.2. Item (1) immediately follows from the Fubini theorem. As an example for item (2), one can use the process from Problem 1.6 with a Borel nonmeasurable set. In item (3), one of the possible examples is: Ω = T = [0,1], T = F = B([0,1]), X(t,ω) = 1I_{t∈A}, where A is a Borel nonmeasurable set.
3.3. Every variable X(t,·) is an indicator function of a one-point set and, obviously, is measurable w.r.t. F. This proves (a). In order to prove that the process X is not measurable, let us show that the set {(t,ω) | t = ω} does not belong to B([0,1]) ⊗ F. Denote by K the class of sets C ⊂ [0,1]² satisfying the following condition: there exists a set Δ_C with Lebesgue measure equal to 0 such that for arbitrary t, ω_1, ω_2 ∈ [0,1] it follows from (t,ω_1) ∈ C, (t,ω_2) ∈ C that at least one of the points ω_1, ω_2 belongs to Δ_C. Then K is a σ-algebra (prove this!). In addition, K contains all the sets of the type C = A × B, A ∈ B([0,1]), B ∈ F. Then, by the "principle of the fitting sets", the class K contains B([0,1]) ⊗ F. On the other hand, the set {(t,ω) | t = ω} does not belong to K (verify this!).
3.5. It does not.
3.6. Because a superposition of two measurable mappings is also measurable, it is sufficient to prove that the mapping Ω ∋ ω → (τ(ω), ω) ∈ R+ × Ω is F − B(R+) ⊗ F-measurable. For C = A × B, A ∈ B(R+), B ∈ F, we have that {ω | (τ(ω), ω) ∈ C} = {ω | τ(ω) ∈ A} ∩ B ∈ F. Because the "rectangles" C = A × B generate B(R+) ⊗ F, this proves the needed measurability.
3.7. Take Ω = [0,1], F = B([0,1]),
\[
X(t,\omega) = \begin{cases} 1\!\mathrm{I}_{t=\omega}, & t \in [0,1],\\ \omega, & t \in (1,2]. \end{cases}
\]
Then the process X is measurable (verify this!). On the other hand, the σ-algebra F_1^X is degenerate; that is, it contains only the sets of Lebesgue measure 0 or 1. Then the process X is not progressively measurable, since its restriction to the time interval [0,1] is not measurable (see Problem 3.3).
3.9. (a) No, because otherwise the process would be right continuous in probability. (b) No. Assume that such a modification exists. Then for any t ∈ R+, ω ∈ Ω there exists Y(t,ω) = ∫_0^t (X(s,ω) − 1/2) ds (Lebesgue integral). By the Fubini theorem, EY²(t) = ∫_0^t∫_0^t E(X(s_1) − 1/2)(X(s_2) − 1/2) ds_1 ds_2 = ∫_0^t∫_0^t (1/12)1I_{s_1=s_2} ds_1 ds_2 = 0, and then Y(t) = 0 a.s. Every trajectory of the process Y is continuous as a Lebesgue integral with varying upper bound; therefore Y(·,ω) ≡ 0 for almost all ω. Then X(·,ω) ≡ 1/2 for the same ω. This is impossible.
3.13. (a) γ < 1/2; (b) γ < H.
3.27. Denote by Kτ the set of atoms of distribution for τ (i.e., the points t ∈ R+ for which P(τ = t) > 0). The required condition is as follows. For any t ∈ Kτ there exists ε > 0 such that P(τ ∈ (t,t + ε )) = 0. 3.31. According to Problem 3.30, θk = τ1/k , k ∈ N are random variables. Both the process X and the process Y (t) = X(t−) are measurable (see Problems 3.4 and 3.26), therefore Zk = X(θk ) − X(θk −), k ∈ N are random variables (see Problem 3.6). Thus τ = ∑∞ k=1 θk · 1IZk ∈Γ · ∏m>k 1IZm ∈Γ is a random variable too. 3.34. Not true. 3.35. Not true. 3.37. (1) For any t ∈ T, B ∈ X one has {X(t) ∈ B}{Y (t) ∈ B} ⊂ {X(t) = Y (t)}, and then {X(s) ∈ B} ∈ FtY , s ≤ t (see Problem 3.36). Because NP ⊂ FtY , one has FtX ⊂ FtY . Similarly, it can be proved that FtY ⊂ FtX . (2) Let Ω = [0, 1], F = B([0, 1]), P is the Lebesgue measure, T = {t}, X(t, ω ) ≡ 0,Y (t, ω ) = 1Iω =1 . Then FtX,0 = {∅, [0, 1]}, FtY,0 = {∅, {1}, [0, 1), [0, 1]}. 3.38. (a) No; (b) yes. 3.39. FtX contains all sets of the form {ω : τ (ω ) ∈ A}, where A is an arbitrary Borel set such that either P(A ∩ (t, 1]) = 0 or P(A ∩ (t, 1]) = 1 −t. Filtration FX is both right and left continuous. 3.42. (a) Yes; (b) yes. 3.43. (a) Yes; (b) yes. 3.44. (1) (a) Yes; (b) no. (2) (a) No; (b) no.
4 Continuity. Differentiability. Integrability
Theoretical grounds
Definition 4.1. Let T and X be metric spaces with the metrics d and ρ, respectively. A random function {X(t), t ∈ T} taking values in X is said to be
(1) Stochastically continuous (or continuous in probability) at a point t ∈ T if ρ(X(t), X(s)) → 0 in probability as s → t.
(2) Continuous with probability one (or continuous almost surely, a.s.) at a point t ∈ T if ρ(X(t), X(s)) → 0 with probability one as s → t.
(3) Continuous in the L_p sense, p > 0 (or mean continuous of order p), at a point t ∈ T if Eρ^p(X(t), X(s)) → 0 as s → t.
If a random function is stochastically continuous (continuous with probability one, continuous in the L_p sense) at every point of the parametric set T, then it is said to be stochastically continuous (respectively, continuous with probability one or continuous in the L_p sense) on this set. Note that sometimes, while dealing with continuity of the function X in the L_p sense, one assumes the additional condition Eρ^p(X(t), x) < +∞, t ∈ T, where x ∈ X is a fixed point. Continuity in the L_1 sense is called mean continuity, and continuity in the L_2 sense is called mean square continuity.
Theorem 4.1. Let {X(t), t ∈ [a,b]} be a real-valued stochastic process with EX²(t) < +∞ for t ∈ [a,b]. The process X is mean square continuous if and only if a_X ∈ C([a,b]), R_X ∈ C([a,b] × [a,b]).
Definition 4.2. A real-valued stochastic process {X(t), t ∈ [a,b]} is said to be stochastically differentiable (differentiable with probability one, differentiable in the L_p sense) at a point t ∈ [a,b] if there exists a random variable η such that
\[
\frac{X(t) - X(s)}{t-s} \to \eta, \quad s \to t,
\]
in probability (with probability one or in L_p, respectively). The random variable η is called the derivative of the process X at the point t and is denoted by X′(t).
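For example, for the Wiener process W one has a_W ≡ 0 and R_W(t,s) = min(t,s), which is continuous; hence, by Theorem 4.1, W is mean square continuous on every segment. At the same time,
\[
E\Big(\frac{W(t+h) - W(t)}{h}\Big)^{2} = \frac{1}{h} \to \infty, \quad h \to 0+,
\]
so W is not mean square differentiable at any point.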
If a stochastic process is differentiable (in any sense introduced above) at every point t of the parametric set T, then it is said to be differentiable (in this sense) on this set. If, in addition, its derivative X′ = {X′(t), t ∈ T} is continuous (in the same sense), then the process X is said to be continuously differentiable (in this sense).
Theorem 4.2. Let {X(t), t ∈ [a,b]} be a real-valued stochastic process with EX²(t) < +∞ for t ∈ [a,b]. The process X is continuously differentiable in the mean square if and only if a_X ∈ C¹([a,b]), R_X ∈ C([a,b] × [a,b]), and there exists the continuous derivative ∂²R_X/∂t∂s. In this case,
\[
a_{X'}(t) = a_X'(t), \qquad R_{X'}(t,s) = \frac{\partial^{2}}{\partial t\,\partial s} R_X(t,s), \qquad R_{X,X'}(t,s) = \frac{\partial}{\partial s} R_X(t,s).
\]
Definition 4.3. Let {X(t), t ∈ [a,b]} be a real-valued stochastic process. Assume that there exists a random variable η such that, for any partition sequence {λ_n = {a = t_0^n < t_1^n < ··· < t_n^n = b}, n ∈ N} with max_k (t_k^n − t_{k−1}^n) → 0 and for any sequence of intermediate points {θ_n = (θ_1^n,...,θ_n^n), n ∈ N} with θ_k^n ∈ [t_{k−1}^n, t_k^n], k ≤ n, n ∈ N, the following convergence takes place:
\[
\sum_{k=1}^{n} X(\theta_k^n)\,\big(t_k^n - t_{k-1}^n\big) \to \eta, \quad n \to \infty,
\]
either in probability, with probability one, or in the L_p sense. Then the process X is said to be integrable (in probability, with probability one, or in the L_p sense, respectively) on [a,b]. The random variable η is denoted ∫_a^b X(t) dt and called the integral of the process X over [a,b].
Theorem 4.3. Let {X(t), t ∈ [a,b]} be a real-valued stochastic process with EX²(t) < +∞ for t ∈ [a,b]. The process X is mean square integrable on [a,b] if and only if the functions a_X and R_X are Riemann integrable on [a,b] and [a,b] × [a,b], respectively. In this case,
\[
E\int_a^b X(t)\, dt = \int_a^b a_X(t)\, dt, \qquad D\int_a^b X(t)\, dt = \int_a^b\!\!\int_a^b R_X(t,s)\, dt\, ds.
\]
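For example, for the Wiener process on [0,T], Theorem 4.3 gives
\[
E\int_0^T W(t)\, dt = 0, \qquad D\int_0^T W(t)\, dt = \int_0^T\!\!\int_0^T \min(t,s)\, dt\, ds = \frac{T^{3}}{3}.
\]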
Bibliography [24], Volume 1, Chapter IV, §3; [25], Chapter V, §1; [79], Chapter 14.
Problems
4.1. Let {X(t), t ∈ [0,1]} be a waiting process; that is, X(t) = 1I_{t≥τ}, t ∈ [0,1], where τ is a random variable taking its values in [0,1]. Prove that the process X is continuous (in any sense: in probability, L_p, a.s.) if and only if the distribution function of τ is continuous on [0,1].
4.2. Give an example of a process that is (a) Stochastically continuous but not continuous a.s. (b) Continuous a.s. but not mean square continuous. (c) For given p_1 < p_2, continuous in the L_{p_1} sense but not continuous in the L_{p_2} sense.
4.3. Suppose that all values of a stochastic process {X(t), t ∈ R+} are independent and uniformly distributed on [0,1]. Prove that the process is not continuous in probability.
4.4. Let {X(t), t ∈ [a,b]} be a process that is continuous in probability. Prove that this process is
(a) Bounded in probability: ∀ε > 0 ∃C ∀t ∈ [a,b]:
P (|X(t)| ≥ C) < ε ;
(b) Uniformly continuous in probability: ∀ε > 0 ∃δ > 0 ∀t, s ∈ [a, b], |t − s| < δ :
P (|X(t) − X(s)| ≥ ε ) < ε .
4.5. Prove that the necessary and sufficient condition for stochastic continuity of a real-valued process X at a point t is the following. For any x, y ∈ R, x < y, P(X(t) ≤ x, X(s) ≥ y) + P(X(t) ≥ y, X(s) ≤ x) → 0, s → t. 4.6. Prove that the following condition is sufficient for a real-valued process {X(t), t ∈ R} to be stochastically continuous: for any points t0 , s0 ∈ R, x, y ∈ R, P(X(t) ≤ x, X(s) ≤ y) → P(X(t0 ) ≤ x, X(s0 ) ≤ y), t → t0 , s → s0 . 4.7. (1) Let stochastic process {X(t), t ∈ [a, b]} be mean square differentiable at the point t0 ∈ [a, b]. Prove that cov (η , X (t0 )) = (d/dt)|t=t0 cov(η , X(t)) for any η ∈ L2 (Ω , F, P). (2) Let {X(t),t ∈ [a, b]} bea mean square stochastic process and integrable η ∈ L2 (Ω , F, P). Prove that cov η , ab X(s) ds = ab cov(η , X(s)) ds. 4.8. Let {X(t),t ∈ R} be a real-valued centered process with independent increments and EX 2 (t) < +∞,t ∈ R. Prove that for every t ∈ R there exist the following mean square limits: X(t−) = lims→t− X(s), X(t+) = lims→t+ X(s). 4.9. Let {N(t),t ∈ R+ } be the Poisson process. (1) Prove that N is not differentiable in any point t ∈ R+ in the L p sense for any p ≥ 1. (2) Prove that N is differentiable in an arbitrary point t ∈ R+ in the L p sense for every p ∈ (0, 1). (3) Prove that N is differentiable in an arbitrary point t ∈ R+ in probability and with probability 1. Find the derivatives in items (2) and (3).
4.10. Let {X(t), t ≥ 0} be a real-valued homogeneous process with independent increments and D[X(1) − X(0)] > 0. Prove that the processes X(t) and Y(t) := X(t+1) − X(t) are mean square continuous but not mean square differentiable.
4.11. Consider a mean square continuous process with independent increments {X(t), t ∈ R} such that D(X(t) − X(s)) = F(t − s), t,s ∈ R, for some function F with F″(0) = 0. Describe all such processes.
4.12. Let the mean and covariance functions of the process X be equal to a_X(t) = t², R_X(t,s) = e^{ts}. Prove that X is mean square differentiable and find: (a) E[X(1) + X(2)]². (b) EX(1)X′(2). (c) E(X(1) + ∫_0^1 X(s) ds)². (d) cov(X′(1), ∫_0^1 X(s) ds). (e) E(X(1) + ∫_0^1 X(s) ds)^4, assuming additionally that the process X is Gaussian.
4.13. Let N be the Poisson process with intensity λ = 1 and Ñ(t) = N(t) − t be the corresponding compensated process. Find: (a) E ∫_1^3 Ñ(t) dt · ∫_2^4 Ñ(t) dt. (b) E ∫_0^1 Ñ(t) dt · ∫_1^2 Ñ(t) dt · ∫_2^3 Ñ(t) dt.
4.14. Let X(t) = α sin(t + β), t ∈ R, where α and β are independent random variables, α is exponentially distributed with parameter 2, and β is uniformly distributed on [−π, π]. (1) Is the process X continuous or differentiable in the mean square? (2) Prove using the definition that the mean square derivative X′(t) is equal to α cos(t + β). (3) Find R_X, R_{X,X′}, R_{X′}, a_X, a_{X′}.
4.15. Suppose that the trajectories of a process {X(t), t ∈ R} are differentiable at the point t_0, and the process X is mean square differentiable at this point. Prove that the derivative of the process at the point t_0 in the mean square sense coincides almost surely with the ordinary derivative of the trajectory of the process.
4.16. Consider a mean square continuously differentiable stochastic process {X(t), t ∈ R}. (1) Prove that the processes X and X′ have measurable modifications. Further on, assume X and X′ to be measurable. (2) Prove that
\[
X(t) = X(0) + \int_0^t X'(s)\, ds \quad \text{a.s.,} \quad t \in R.
\tag{4.1}
\]
Consider both the mean square integral and the Lebesgue integral for a fixed ω (by convention, ∫_0^t is equal to −∫_t^0 for t < 0).
(3) Prove that X has a continuous modification.
(4) Assume
\[
\int_{-\infty}^{\infty} EX^{2}(t)\, dt < \infty, \qquad \int_{-\infty}^{\infty} E(X'(t))^{2}\, dt < \infty.
\]
Prove that for almost all ω the function t → X(t,ω) belongs to the Sobolev space W_2^1(R), that is, the space of square-integrable functions with their weak derivatives being also square-integrable. Prove that, for P-almost all ω and λ^1-almost all t, the Sobolev derivative coincides with the mean square derivative.
(5) Will the previous statements still hold true if X is supposed to be mean square differentiable but not necessarily continuously mean square differentiable?
4.17. (1) Verify whether a process {X(t), t ∈ R} is either mean square continuous or mean square differentiable if (a) X(t) = min(t, τ), where the random variable τ is exponentially distributed. (b) X has periodic trajectories with period 1 and X(t) = t − τ, t ∈ [τ, τ+1], where the random variable τ is uniformly distributed on [0,1]. (2) Find R_X. If X is mean square differentiable, then find X′ and R_{X′}, R_{X′,X}.
4.18. Let X(t) = f(t − τ), where τ is exponentially distributed and
\[
f(x) = \begin{cases} 1 - |x|, & |x| \le 1,\\ 0, & |x| > 1. \end{cases}
\]
Is the process X mean square differentiable? If so, find X′ and E(X′(1))².
4.19. Let the random variable τ have the distribution density
\[
p(x) = \begin{cases} (\alpha+1)(1-x)^{\alpha}, & x \in [0,1],\\ 0, & x \notin [0,1], \end{cases}
\]
with α > 0. Is the stochastic process
\[
X(t) = \begin{cases} 0, & t \le \tau,\\ \dfrac{t-\tau}{1-\tau}, & t \in (\tau, 1),\\ 1, & t \ge 1, \end{cases}
\]
either mean square continuous or mean square differentiable (a) on R? (b) On [0,1]?
4.20. Assume that a function f is bounded and satisfies the Lipschitz condition. Let a random variable τ be continuously distributed. Prove that the stochastic process X(t) = f(t − τ), t ∈ R, is mean square differentiable.
4.21. (The fractional effect process). Let {ξ_n, n ≥ 1} be nonnegative nondegenerate i.i.d. random variables. Define S_n = ∑_{k=1}^n ξ_k, f(x) = (1 + x^4)^{−1}. Prove that the stochastic process X(t) = ∑_{n=1}^∞ f(t + S_n), t ∈ R, is mean square continuously differentiable.
4.22. Is it possible that a stochastic process (a) Has continuously differentiable trajectories but is not mean square continuously differentiable? (b) Is mean square continuously differentiable but its trajectories are not continuously differentiable?
4.23. Prove that if {X(t), t ∈ R+} is a mean square continuous process, then the process Y(t) = ∫_0^t X(s) ds is mean square differentiable and Y′(t) = X(t).
4.24. Let {W(t), t ∈ R+} be the Wiener process. (1) Prove that, for a given δ > 0, the stochastic process W_δ(t) = (1/δ)∫_t^{t+δ} W(s) ds, t ∈ R+, is mean square continuously differentiable. Find its derivative. (2) Prove that l.i.m._{δ→0+} W_δ(t) = W(t), where l.i.m. denotes the limit in the mean square sense.
Hints
4.4. Item (a) follows from item (b). Assume that the statement of item (b) is not true and prove that there exist ε > 0 and sequences {t_n} and {s_n} converging to some point t ∈ [a,b] such that P(|X(t_n) − X(s_n)| ≥ ε) ≥ ε. Show that this contradicts the stochastic continuity of X at the point t.
4.6. Use Problem 4.5.
4.8. Prove and use the following fact. If H is a Hilbert space and {h_k, k ∈ N} is an orthogonal system of elements of this space with ∑_k ‖h_k‖²_H < +∞, then there exists h = ∑_k h_k ∈ H, where the series converges in the norm of H. In this problem, one should consider the increments of the process as elements of the Hilbert space H = L_2(Ω, F, P).
4.9. (1), (2) Show that
\[
E\Big|\frac{N(t) - N(s)}{t-s}\Big|^{p} \sim \lambda |t-s|^{1-p}, \quad s \to t.
\]
(3)
\[
\Big\{\frac{N(t) - N(s)}{t-s} \to 0,\ s \to t\Big\} \supset \{N(t-) = N(t)\}.
\]
The derivatives of N in probability, a.s., and in the L_p, p ∈ (0,1), sense are zero.
4.10. Use Problem 2.17 and Theorem 4.2.
4.12. (d) Use Problem 4.7. (e) If ξ ∼ N(a, σ²) then Eξ⁴ = a⁴ + 6a²σ² + 3σ⁴. Prove this formula and use it.
4.16. (1) Both X and X′ are stochastically continuous (prove this!); thus the existence of their measurable modifications is provided by Theorem 3.1.
(2) Let φ ∈ C_0^∞(R) be a compactly supported nonnegative infinitely differentiable function with ∫_{−∞}^{∞} φ(x) dx = 1. Define φ_n(x) = nφ(nx), X_n(t) := ∫_{−∞}^{∞} X(t−s)φ_n(s) ds = ∫_{−∞}^{∞} X(s)φ_n(t−s) ds. Check that the trajectories of the process X_n are infinitely differentiable a.s., X_n′(t) = ∫_{−∞}^{∞} X′(s)φ_n(t−s) ds = ∫_{−∞}^{∞} X(s)φ_n′(t−s) ds, and for any t
\[
\lim_{n\to\infty} X_n(t) = X(t), \qquad \lim_{n\to\infty} X_n'(t) = X'(t)
\tag{4.2}
\]
in the mean square. Because the trajectories of X_n are continuously differentiable a.s., the Newton–Leibnitz formula implies
\[
X_n(t) = X_n(0) + \int_0^t X_n'(s)\, ds \quad \text{a.s.}
\tag{4.3}
\]
One has
\[
E\Big|\int_0^t X_n'(s)\, ds - \int_0^t X'(s)\, ds\Big| \le \int_0^t E|X_n'(s) - X'(s)|\, ds
\]
and E|X_n′(s) − X′(s)| ≤ E|X′(s)| + ∫_{−∞}^{∞} E|X′(s_1)| φ_n(s − s_1) ds_1. Pass to the limit in (4.3) using (4.2) and the Lebesgue dominated convergence theorem.
(3) Follows immediately from item (2).
(4) It is well known that if a function f ∈ L_2(R) has the form
\[
f(t) = c + \int_0^t g(s)\, ds, \quad t \in R,
\tag{4.4}
\]
with some c ∈ R, g ∈ L_2(R), then f ∈ W_2^1(R) and g is equal to its Sobolev derivative. Now use the result from item (2). Note that, moreover, the function f appears to be absolutely continuous. In particular, f is differentiable for almost all t and g is its ordinary derivative.
(5) Statement (1) still holds true. In order to show this, consider the sequence of measurable processes Y_n(t) := n(X(t + n^{−1}) − X(t)) (X is mean square continuous and thus can be supposed to be measurable). Then Y_n(t) → X′(t), n → ∞, in probability. Use Problem 3.45. Statement (2) still holds true assuming that
\[
\int_{-T}^{T} E(X'(t))^{2}\, dt < \infty, \quad T > 0.
\tag{4.5}
\]
This condition is needed in order to construct the approximating sequence {X_n} and justify passing to the limit in (4.3). Note that under condition (4.5) the integral in the right-hand side of (4.1) is well defined as the Lebesgue integral for a.s. ω but not necessarily as the mean square integral. Statement (4) still holds true, and statement (3) holds true assuming (4.5).
4.17. (1) (a) Use the Lebesgue dominated convergence theorem and check that X′(t) = 1I_{t≤τ}. Furthermore,
\[
EX(t) = \int_0^{\infty} \min(t,x) e^{-x}\, dx = \int_0^t x e^{-x}\, dx + \int_t^{\infty} t e^{-x}\, dx = 1 - e^{-t}.
\]
Let s ≤ t. Then
\[
EX(s)X(t) = \int_0^{\infty} \min(s,x)\min(t,x) e^{-x}\, dx = \int_0^{s} x^{2} e^{-x}\, dx + \int_s^{t} s x e^{-x}\, dx + \int_t^{\infty} s t e^{-x}\, dx = -e^{-s}(s+2) + 2 - s e^{-t}.
\]
In order to obtain the covariance function of the derivative, use Theorem 4.2.
(b) Similarly to item (a), the process X(t) is mean square continuous. If there exists X′(t), then X′(t) = 1 a.s. (Problem 4.15) and the process Y(t) = X(t) − t must be mean square continuously differentiable with zero derivative. That is, Y(t) = Y(0) a.s. (Problem 4.16). But this is not correct.
4.19. The process X(t) is mean square differentiable at every point t except possibly t = 1. At the same time, X′(t) = 0 when t ≤ τ or t > 1, and X′(t) = 1/(1 − τ) when t ∈ (τ, 1]. Because
\[
\lim_{t\to 1-} \frac{X(t) - X(1)}{t - 1} = \frac{1}{1-\tau} \ne 0 = \lim_{t\to 1+} \frac{X(t) - X(1)}{t - 1} \quad \text{a.s.},
\]
the process {X(t), t ∈ R} is not differentiable at t = 1. Now, consider the restriction of X(t) to [0,1]. Check that the mean square derivative X′(1) exists if and only if E(1 − τ)^{−2} < ∞; that is, α > 1. Check that X′(1) = (1 − τ)^{−1}.
4.24. Use the considerations from the solution to Problem 4.23.
Answers and Solutions
4.5. For any x < y, P(X(t) ≤ x, X(s) ≥ y) + P(X(t) ≥ y, X(s) ≤ x) ≤ P(|X(t) − X(s)| > y − x) → 0 as s → t in the case when X is stochastically continuous at the point t. On the other hand, let ε, δ > 0 be fixed and let us choose C so that P(|X(t)| > C) < δ. Consider the sets
\[
A_k = \Big\{(x,y)\ \Big|\ x \le \frac{k\varepsilon}{2},\ y \ge \frac{(k+1)\varepsilon}{2}\Big\}, \qquad B_k = \Big\{(x,y)\ \Big|\ x \ge \frac{k\varepsilon}{2},\ y \le \frac{(k-1)\varepsilon}{2}\Big\}, \qquad k \in Z.
\]
Let us take m = [2C/ε] + 1; then {(x,y) | |x − y| > ε, |x| ≤ C} ⊂ ∪_{k=−m}^{m}(A_k ∪ B_k). Under the assumptions made above, P((X(t), X(s)) ∈ A_k ∪ B_k) → 0 as s → t for every k. Therefore lim sup_{s→t} P(|X(t) − X(s)| > ε) ≤ P(|X(t)| > C) < δ. Because δ is arbitrary, we have P(|X(t) − X(s)| > ε) → 0, s → t.
4.11. We get from the independence of increments that D(X(t) − X(0)) = nF(t/n) for every n ∈ N, t ∈ R. It follows from the equality F(z) = F(−z) that F′(0) = 0. Obviously, F(0) = D(X(t) − X(t)) = 0. Thus D(X(t) − X(s)) = o(|t − s|²) as |t − s| → 0. Therefore D(X(t) − X(0)) = 0, t ∈ R, and thus X(t) = X(0) + a_X(t) − a_X(0), t ∈ R.
4.12. (a) e⁴ + 2e² + e + 25. (b) e² + 4. (c) 3e − 2/9 + ∫_0^1 (e^t − 1)/t dt. (d) 1.
(e) (4/3)⁴ + 6(4/3)²(3e − 2 + ∫_0^1 (e^t − 1)/t dt) + 3(3e − 2 + ∫_0^1 (e^t − 1)/t dt)².
4.13. (a) 47/6; (b) 1/2.
4.14. (1) Yes, it is. (2) By the Lebesgue dominated convergence theorem,
\[
\lim_{\varepsilon\to 0} E\Big(\frac{\alpha\sin(t+\varepsilon+\beta) - \alpha\sin(t+\beta)}{\varepsilon} - \alpha\cos(t+\beta)\Big)^{2} = 0.
\]
The Lagrange theorem implies that the expression in parentheses is dominated by the quantity 2α. (3) R_X(s,t) = (1/2)cos(t−s); R_{X,X′}(s,t) = −(1/2)sin(t−s); R_{X′}(s,t) = (1/2)cos(t−s); a_X = a_{X′} ≡ 0.
4.15. Mean square convergence of (X(t) − X(t_0))/(t − t_0) to some random variable ξ as t → t_0 implies convergence in probability. Thus, we can select a sequence t_k → t_0, k → ∞, such that (X(t_k) − X(t_0))/(t_k − t_0) → ξ a.s., k → ∞. Since the trajectories of X(t) are differentiable at the point t_0, the limit ξ coincides a.s. with the ordinary derivative.
4.18. Similarly to item (a) of Problem 4.17, the process X is continuously differentiable in m.s. and
\[
X'(t) = \begin{cases} 1, & t \in [\tau - 1, \tau],\\ -1, & t \in (\tau, \tau + 1],\\ 0, & t \notin [\tau - 1, \tau + 1]. \end{cases}
\]
E(X′(1))² = E 1I_{1∈[τ−1, τ+1]} = P(τ ∈ [0,2]) = 1 − e^{−2α}.
4.20. The Lipschitz continuity of the function f implies its absolute continuity and therefore its differentiability at almost every point w.r.t. the Lebesgue measure. Let U be the set of the points where the derivative of the function f exists. Then, for every t_0 ∈ R,
\[
P\Big(\exists\, \lim_{t\to t_0} \frac{X(t) - X(t_0)}{t - t_0}\Big) \ge P(t_0 - \tau \in U) = 1.
\]
On the other hand, for every t, t_0, t ≠ t_0, the absolute value of the fraction (X(t) − X(t_0))/(t − t_0) does not exceed the Lipschitz constant of the function f. Therefore, by the Lebesgue theorem on dominated convergence, the process {X(t), t ∈ R} is mean square differentiable and X′(t) = g(t − τ), where
\[
g(t) = \begin{cases} f'(t), & t \in U,\\ 0, & t \notin U. \end{cases}
\]
4.22. (a) Yes, if E(X(t))² = +∞. (b) Yes. Look at the process from Problem 4.17.
4.23. Let ε > 0 be arbitrary. Choose δ > 0 such that E(X(t) − X(s))² < ε as soon as |t − s| < δ. Thus, for every s ∈ (t − δ, t + δ), s ≠ t:
\[
E\Big(\frac{\int_s^t X(z)\, dz}{t-s} - X(t)\Big)^{2} = E\Big(\frac{\int_s^t (X(z) - X(t))\, dz}{t-s}\Big)^{2} \le \frac{E\int_s^t (X(z) - X(t))^{2}\, dz}{t-s} < \varepsilon.
\]
5 Stochastic processes with independent increments. Wiener and Poisson processes. Poisson point measures
Theoretical grounds
Definition 5.1. Let T ⊂ R be an interval. A stochastic process {X(t), t ∈ T} taking values in R^d is a process with independent increments if for any m ≥ 1 and t_0,...,t_m ∈ T, t_0 < ··· < t_m, the random vectors X(t_0), X(t_1) − X(t_0),...,X(t_m) − X(t_{m−1}) are jointly independent. A process with independent increments is said to be homogeneous if, for any t,s,v,u ∈ T such that t − s = v − u, the increments X(t) − X(s) and X(v) − X(u) have the same distribution.
The following theorem shows that all the finite-dimensional distributions of a process with independent increments on T = R+ are uniquely determined by the starting distribution (i.e., the distribution of X(0)) and the distributions of the increments (i.e., the distributions of X(t) − X(s), t > s ≥ 0).
Theorem 5.1. The finite-dimensional distributions of the process with independent increments {X(t), t ∈ R+} taking values in R^d are uniquely determined by the family of the characteristic functions
\[
\varphi_0(\cdot) = E\exp\{i(\cdot, X(0))_{R^d}\}, \qquad \varphi_{s,t}(\cdot) = E\exp\{i(\cdot, X(t)-X(s))_{R^d}\}, \quad 0 \le s < t.
\]
On the other hand, in order for a family of functions {φ_0, φ_{s,t}, 0 ≤ s < t} to determine a process with independent increments, it is necessary and sufficient that
(1) Every function φ_0, φ_{s,t}, 0 ≤ s < t, is a characteristic function of a random vector in R^d.
(2) The following consistency condition is fulfilled:
\[
\varphi_{s,t}\,\varphi_{t,u} = \varphi_{s,u}, \qquad 0 \le s < t < u.
\]
Theorem 5.1 is a version of the Kolmogorov theorem on finite-dimensional distributions (see Theorems 1.1, 2.2 and Problem 5.2).
Definition 5.2. The (one-dimensional) Wiener process W is the real-valued homogeneous process with independent increments on R+ such that W(0) = 0 and for any t > s the increment W(t) − W(s) has the distribution N(0, t−s). The multidimensional Wiener process is the m-dimensional process W(t) = (W^1(t),...,W^m(t)), where {W^i(t), t ≥ 0} are jointly independent Wiener processes.
Let T ⊂ R be an interval and κ be a locally finite measure on B(T) (i.e., κ possesses finite values on bounded intervals).
Definition 5.3. The Poisson process X with intensity measure κ on T is the process with independent increments such that the increment X(t) − X(s) for any t > s has the distribution Pois(κ((s,t])). The Poisson process N with parameter λ > 0 is a homogeneous process with independent increments defined on T = R+ such that N(0) = 0 and for any t > s the increment N(t) − N(s) has the distribution Pois(λ(t−s)).
The Poisson process with parameter λ is, obviously, the Poisson process with intensity measure κ = λ·λ^1|_{R+} on T = R+ (λ^1 is the Lebesgue measure, λ^1|_{R+} is its restriction to R+). Note that this form of the measure κ is implied by the homogeneity of the process. Further on, we use the short name the Poisson process for the Poisson process with some parameter λ.
Let 0 ≤ τ_1 ≤ τ_2 ≤ ··· be a sequence of random variables and X(t) = ∑_{k=1}^∞ 1I_{τ_k ≤ t}, t ∈ R+. The process X is called the registration process associated with the sequence {τ_k}. The terminology is due to the following widely used model. Assume that some sequence of events may happen at random (e.g., particles are registered by a device, claims are coming into a telephone exchange or a server, etc.). If the variable τ_k is interpreted as the time moment when the kth event happens, then X(t) counts the number of events that happened until time t.
Proposition 5.1. The Poisson process with parameter λ is the registration process associated with the sequence {τ_k} such that τ_1, τ_2 − τ_1, τ_3 − τ_2,... are i.i.d. random variables with the distribution Exp(λ).
Let us give the full description of the characteristic functions of the increments of stochastically continuous homogeneous processes with independent increments (such processes are called Lévy processes).
Theorem 5.2. (The Lévy–Khinchin formula) Let {X(t), t ≥ 0} be a stochastically continuous homogeneous process with independent increments taking values in R^d. Then φ_{s,t}(z) = exp[(t−s)ψ(z)], s ≤ t, z ∈ R^d,
\[
\psi(z) = i(z,a)_{R^d} - \frac{1}{2}(Bz,z)_{R^d} + \int_{R^d}\Big(e^{i(z,u)_{R^d}} - 1 - i(z,u)_{R^d}\,1\!\mathrm{I}_{\|u\|_{R^d}\le 1}\Big)\,\Pi(du),
\tag{5.1}
\]
where a ∈ R^d, the matrix B ∈ R^{d×d} is nonnegatively defined, and the measure Π satisfies the relation
\[
\int_{R^d}\big(\|u\|^{2}_{R^d} \wedge 1\big)\,\Pi(du) < +\infty.
\]
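For example, for the one-dimensional Wiener process one has a = 0, B = 1, Π = 0 in (5.1), so that ψ(z) = −z²/2; for the Poisson process with parameter λ one has a = λ, B = 0, Π = λδ_1, which gives
\[
\psi(z) = iz\lambda + \lambda\big(e^{iz} - 1 - iz\big) = \lambda\big(e^{iz} - 1\big)
\]
(compare with Problems 5.22 and 5.31).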
The function ψ is called the cumulant of the process X; the measure Π is called the Lévy measure of the process X.
Consider a "multidimensional segment" T = T_1 × ··· × T_d ⊂ R^d with some segments T_i ⊂ R, i = 1,...,d. On the set T, the partial order is naturally defined: t = (t^1,...,t^d) ≥ s = (s^1,...,s^d) ⇔ t^i ≥ s^i, i = 1,...,d. For a function X: T → R and s ≤ t, we denote by Δ_{s,t}(X) the increment of X on the "segment" (s,t] = (s^1,t^1] × ··· × (s^d,t^d]; that is,
\[
\Delta_{s,t}(X) = \sum_{\varepsilon_1,\dots,\varepsilon_d \in \{0,1\}} (-1)^{\varepsilon_1+\dots+\varepsilon_d}\, X\big(t^1 - \varepsilon_1(t^1 - s^1),\dots,t^d - \varepsilon_d(t^d - s^d)\big).
\]
Definition 5.4. Let κ be a locally finite measure on B(T). A real-valued random field {X(t), t ∈ T} is called the Poisson random field with intensity measure κ if for any t_0,...,t_m ∈ T, t_0 ≤ t_1 ≤ ··· ≤ t_m, the variables X(t_0), Δ_{t_0,t_1}(X),...,Δ_{t_{m−1},t_m}(X) are jointly independent and for every s ≤ t the increment Δ_{s,t}(X) has the distribution Pois(κ((s,t])).
Let (E, E, μ) be a space with a σ-finite measure. Denote E_μ = {A ∈ E | μ(A) < +∞}.
Definition 5.5. A random point measure on a ring K ⊂ E_μ is a mapping ν that associates with every set A ∈ K a Z_+-valued random variable ν(A) and satisfies the following condition. For any A_i ∈ K, i ∈ N, such that A_i ∩ A_j = ∅, i ≠ j, and ∪_i A_i ∈ K,
\[
\nu\Big(\bigcup_i A_i\Big) = \sum_i \nu(A_i) \quad \text{a.s.}
\]
The random point measure is called the Poisson point measure with intensity measure μ if for any A_1,...,A_m ∈ E_μ, A_i ∩ A_j = ∅, i ≠ j, the values ν(A_1),...,ν(A_m) are jointly independent and for every A ∈ E_μ the value ν(A) has the distribution Pois(μ(A)).
The term "random point measure" can be explained by the following result (see Problem 5.41), which shows that this object has a natural interpretation as a collection of measures indexed by ω ∈ Ω and concentrated on countable subsets of E ("point measures").
Proposition 5.2. Let ν be a point measure on a ring K. Assume that σ(K) contains all one-point sets and there exists a countable ring K_0 ⊂ K such that σ(K_0) ⊃ K. Then there exists a mapping ν̂: Ω × σ(K) → R_+ such that:
(1) For every A ∈ σ(K) the function ν̂(·, A) is an extended random variable.
(2) For every ω ∈ Ω the mapping A → ν̂(ω, A) is a σ-finite measure concentrated on some countable set and taking natural values at the points of this set.
(3) ν̂(·, A) is equal to ν(A) a.s. for every A ∈ K.
If σ(K) = E, then the Poisson point measure with intensity measure μ defined on K can be extended to the Poisson point measure defined on E_μ, and such an extension is unique.
For a given Poisson point measure ν with intensity μ, the corresponding centered (or compensated) Poisson point measure is defined as ν̃(A) = ν(A) − μ(A), A ∈ E_μ.
Let f: E → R be measurable w.r.t. σ(K) and {f ≠ 0} ⊂ A for some A ∈ K. Then the integral of f over the random point measure ν is naturally defined by
\[
\Big(\int_E f(z)\,\nu(dz)\Big)(\omega) = \int_E f(z)\,\hat\nu(\omega, dz), \quad \omega \in \Omega,
\tag{5.2}
\]
where ν̂ is the collection of measures given by Proposition 5.2 (see Problem 5.42). If ν is a Poisson measure, ν̃ is the corresponding compensated measure, and f ∈ L_2(E, μ), then the integral ∫_E f(z) ν̃(dz) is well defined as the stochastic integral over an orthogonal random measure (see Chapter 8 and Problem 5.43). The two definitions of the integrals mentioned above are adjusted in the sense that if f ∈ L_2(E, dμ) and {f ≠ 0} ∈ E_μ then
\[
\int_E f(z)\,\nu(dz) - \int_E f(z)\,\mu(dz) = \int_E f(z)\,\tilde\nu(dz)
\tag{5.3}
\]
(see Problem 5.44). Frequently, one needs to consider Poisson point measures defined on a product of spaces, for instance, E = R_+ × R^d, E = B(E). In this case, the above-defined integrals of a function (s,u) → f(s,u) 1I_{s≤t, u∈A} are denoted
\[
\int_0^t\!\!\int_A f(s,u)\,\nu(ds,du), \qquad \int_0^t\!\!\int_A f(s,u)\,\tilde\nu(ds,du).
\]
Theorem 5.3. Let {X(t), t ≥ 0} be a stochastically continuous homogeneous process with independent increments taking its values in R^d. Let a, B, Π be, respectively, the vector, the matrix, and the measure appearing in the Lévy–Khinchin formula for the cumulant of this process (see Theorem 5.2). Then there exist an independent d-dimensional Wiener process W and a Poisson point measure ν on E = R_+ × R^d with the intensity measure μ = λ^1|_{R_+} × Π such that
\[
X(t) = at + B^{1/2}W(t) + \int_0^t\!\!\int_{\{\|u\|_{R^d}>1\}} u\,\nu(ds,du) + \int_0^t\!\!\int_{\{\|u\|_{R^d}\le 1\}} u\,\tilde\nu(ds,du),
\tag{5.4}
\]
t ∈ R_+. And vice versa, let X be determined by the equality (5.4) with arbitrary a, B, and independent d-dimensional Wiener process W and Poisson point measure ν with the intensity measure μ = λ^1|_{R_+} × Π. Then X is a Lévy process and its cumulant is determined by the equality (5.1).
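The representation (5.4) can be simulated directly. Below is a minimal sketch (not part of the original text) for d = 1 of a path of the form at + bW(t) + (compound Poisson); for a finite Lévy measure this differs from (5.4) only by the deterministic drift produced by the compensation of the small jumps. The parameter values and the jump-size distribution are illustrative assumptions, and the jump times are generated from exponential interarrival times as in Proposition 5.1.

```python
import numpy as np

def simulate_levy_path(T=1.0, n_steps=1000, a=0.1, b=0.5, jump_rate=2.0, seed=0):
    """Sketch: simulate a path of X(t) = a*t + b*W(t) + compound Poisson jumps.

    For a finite Levy measure Pi (total mass = jump_rate), this agrees with the
    representation (5.4) up to a deterministic drift shift coming from the
    compensation of the small jumps.  Jump times are generated from i.i.d.
    Exp(jump_rate) interarrival times (Proposition 5.1); the jump-size
    distribution N(1, 0.2^2) is an arbitrary illustrative choice.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, T, n_steps + 1)
    dt = T / n_steps

    # Wiener part: cumulative sum of independent N(0, dt) increments.
    W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n_steps))))

    # Jump times via exponential interarrivals.
    jump_times = []
    s = rng.exponential(1.0 / jump_rate)
    while s <= T:
        jump_times.append(s)
        s += rng.exponential(1.0 / jump_rate)
    jump_times = np.array(jump_times)
    jump_sizes = rng.normal(1.0, 0.2, size=len(jump_times))

    # Compound Poisson part evaluated on the grid.
    J = np.array([jump_sizes[jump_times <= u].sum() for u in t])

    return t, a * t + b * W + J

grid, path = simulate_levy_path()
print(path[-1])
```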
Bibliography [9], Chapter II; [24], Volume 1, Chapter III, §1; [25], Chapter VI; [15], Chapter VIII; [78]; [79], Chapters 2, 27, 28.
Problems
5.1. Verify that the consistency condition (condition 2 of Theorem 5.1) holds true for the characteristic functions of the increments of: (a) the Wiener process; (b) the Poisson process with intensity measure κ. Such a verification is necessary for Definitions 5.2 and 5.3 to be formally correct.
5.2. Let the family of functions {ψ_0, ψ_{s,t}, 0 ≤ s < t} be given, and let every function of the family be a characteristic function of some random variable. For any m ≥ 1, t_1,...,t_m ∈ R+, z_1,...,z_m ∈ R, put
\[
\varphi_{t_1,\dots,t_m}(z_1,\dots,z_m) = \psi_0(z_{\pi(1)}+\dots+z_{\pi(m)})\,\psi_{0,t_{\pi(1)}}(z_{\pi(1)}+\dots+z_{\pi(m)})\,\psi_{t_{\pi(1)},t_{\pi(2)}}(z_{\pi(2)}+\dots+z_{\pi(m)})\cdots\psi_{t_{\pi(m-1)},t_{\pi(m)}}(z_{\pi(m)}),
\]
where the permutation π is such that t_{π(1)} ≤ ··· ≤ t_{π(m)}. Prove that the necessary and sufficient condition for the family {φ_{t_1,...,t_m}, t_1,...,t_m ∈ T, m ≥ 1} to satisfy the consistency conditions of Theorem 2.2 is that the family {ψ_0, ψ_{s,t}, 0 ≤ s < t} satisfies the consistency condition of Theorem 5.1. Prove also that if these conditions hold, then the process X with the finite-dimensional characteristic functions {φ_{t_1,...,t_m}} is a process with independent increments.
5.3. Let {μ_t, t > 0} be a family of probability measures on R such that, for any s,t > 0, μ_{t+s} equals the convolution of μ_t and μ_s. Prove that there exists a homogeneous process with independent increments {X(t), t > 0} such that for every t the distribution of X(t) equals μ_t. Describe the finite-dimensional distributions of the process {X(t), t > 0}.
5.4. Let N be the Poisson process with intensity λ. Find: (a) P(N(1) = 2, N(2) = 3, N(3) = 5). (b) P(N(1) ≤ 2, N(2) = 3, N(3) ≥ 5). (c) P(N(√2) = 3). (d) P(N(√3) = 2). (e) P(N(4) = 3, N(1) = 2). (f) P(N(2)N(3) = 2). (g) P(N²(2) ≥ 3N(2) − 2). (h) P(N(2) + N(3) = 1).
5.5. Find E(X(1) + 2X(2) + 3X(3))²; E(X(1) + 2X(2))³; E(X(1) + 2X(2) + 1)³, where X is: (a) the Wiener process; (b) the Poisson process.
5.6. Specify the finite-dimensional distributions for the Poisson process. 5.7. Assume that stochastic process {X(t),t ≥ 0} satisfies the conditions: (a) X takes values in Z+ and X(0) = 0. (b) X has independent increments. (c) P(|X(t + h) − X(t)| > 1) = o(h), P(X(t + h) − X(t) = 1) = λ h + o(h), h → 0 with some given λ > 0. Prove that X is the Poisson process with intensity λ . 5.8. Assume that stochastic process {X(t), t ≥ 0} satisfies conditions (a),(b) from the previous problem and condition (c ) P(|X(t + h) − X(t)| > 1) = o(h), P(X(t + h) − X(t) = 1) = λ (t)h + o(h), h → 0, where λ (·) is some continuous nonnegative function. Find the finite-dimensional distributions for the process X. 5.9. Let {N(t), t ∈ R+ } be the Poisson process, τ1 be the time moment of its first jump. Find the conditional distribution of τ1 given that the process has on [0, 1]: (a) Exactly one jump (b) At most one jump (c) At least m jumps (m ∈ N). 5.10. Prove that the distribution of the Poisson process defined on [0, 1] conditioned that the process has m jumps (m ∈ N) on [0, 1] is equal to the distribution of the process X(t) = ∑m k=1 1Iξk ≤t , t ∈ [0, 1], where ξ1 , . . . , ξm are i.i.d. random variables uniformly distributed on [0, 1]. 5.11. Let τ be a random variable, and X(t) = 1It≥τ ,t ∈ R be the corresponding waiting process. What distribution should the random variable τ follow in order for X to be a process with independent increments? 5.12. Prove Proposition 5.1. 5.13. Let τn be the time moment of the nth jump for the Poisson process. Prove that the distribution density of τn equals
\[
\frac{\lambda^{n} x^{n-1}}{(n-1)!}\, e^{-\lambda x}, \quad x \ge 0.
\]
5.14. Let N be the Poisson process, τ_n be the time moment of its nth jump, and
\[
X(t) = \begin{cases} N(t), & t \in [\tau_{2n}, \tau_{2n+1}),\\ N(t) - 1, & t \in [\tau_{2n-1}, \tau_{2n}). \end{cases}
\]
Draw the trajectories of the process X. Calculate P(X(3) = 2) and P(X(3) = 2, X(5) = 4). Is the process X a process with independent increments?
5.15. Let the number of signals transmitted via a communication channel during the time [0,t] be the Poisson process N with intensity λ > 0. Every signal is successfully received with probability p ∈ (0,1), independently of the process N and of the other signals. Let {X(t), t ≥ 0} be the number of signals received successfully. Find: (a) one-dimensional; (b) multidimensional distributions of X.
5.16. The numbers of failures in work for plants A and B during the time period [0,t] are characterized by two independent Poisson processes with intensities λ_1 and λ_2, respectively. Find: (a) one-dimensional; (b) multidimensional distributions of the total number of failures for both plants A and B during the time period [0,t].
5.17. Let {ξ_n, n ≥ 1} be i.i.d. random variables with the distribution function F and ν be a Poisson distributed random variable with parameter λ independent of {ξ_n}. Prove that the process {X_ν(t) = ∑_{n=1}^{ν} 1I_{ξ_n ≤ t}, t ∈ R} is the Poisson process with the intensity measure κ determined by the relation κ((a,b]) = λ(F(b) − F(a)), a ≤ b.
5.18. Within the conditions of the previous problem, prove that all multidimensional distributions of the process λ^{−1}X_ν weakly converge as λ → +∞ to the corresponding finite-dimensional distributions of a (nonrandom) process that is equal to the deterministic function F.
5.19. Let {ξ_n, n ≥ 1} be i.i.d. random variables with the distribution function F and {N(t), t ∈ R+} be the Poisson process with intensity λ independent of {ξ_n}. Prove that {X(x,t) = ∑_{n=1}^{N(t)} 1I_{ξ_n ≤ x}, (x,t) ∈ R × R+} is the Poisson random field with intensity κ × λ^1|_{R+}, where κ is the measure defined in Problem 5.17.
5.20. The compound Poisson process is a process of the form X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+, where ξ_k, k ∈ N, are i.i.d. random variables and {N(t), t ∈ R+} is a Poisson process independent of {ξ_n}. Prove that the compound Poisson process is a homogeneous process with independent increments.
5.21. Prove that the sum of two processes with independent increments, which are independent of each other, is again a process with independent increments.
5.22. Let W be the Wiener process, N_1,...,N_m be the Poisson processes with parameters λ_1,...,λ_m, and let the processes W, N_1,...,N_m be jointly independent. Let also c_0,...,c_m ∈ R. Prove that X = c_0W + c_1N_1 + ··· + c_mN_m is a homogeneous process with independent increments and find the parameters a, B, Π in the Lévy–Khinchin formula for X (Theorem 5.2).
5.23. Let W be the Wiener process, and let N^λ and N^μ be two Poisson processes with intensities λ and μ, respectively. Let also {η_i, i ≥ 1} be i.i.d. random variables exponentially distributed with parameter α and {ζ_k, k ≥ 1} be i.i.d. random variables exponentially distributed with parameter β. The processes W, N^λ, N^μ and the sequences {η_i}, {ζ_k} are assumed to be jointly independent. Prove that
\[
X(t) = bW(t) + \sum_{i=1}^{N^{\lambda}(t)} \eta_i - \sum_{i=1}^{N^{\mu}(t)} \zeta_i, \quad t \in R_+,
\]
is a homogeneous process with independent increments and find the parameters a, B, Π in the Lévy–Khinchin formula for X. Find the characteristic function of the variable X(t), t > 0, and express the distribution density of this variable in integral form.
5.24. Prove that the m-dimensional Wiener process is a process with independent increments and each of its increments W(t) − W(s), t > s, has the distribution N(0, (t−s)I_{R^m}).
5.25. Let {W(t), t ≥ 0} be the two-dimensional Wiener process, B(0,r) = {x ∈ R² | ‖x‖ ≤ r}, r > 0. Find P(W(t) ∈ B(0,r)).
5.26. Let {W(t) = (W_1(t),...,W_m(t)), t ∈ R+} be an m-dimensional Wiener process and let a set A ⊂ R^m have zero Lebesgue measure. Prove that the total time spent by W in the set A equals zero a.s. (compare with Problem 3.21).
5.27. Let {W(t) = (W_1(t),...,W_m(t)), t ∈ R+} be an m-dimensional Wiener process and x = (x_1,...,x_m) ∈ R^m be a point such that ∑_{i=1}^m x_i² = 1. Prove that Y(t) := ∑_{i=1}^m x_iW_i(t), t ∈ R+, is the Wiener process.
5.28. Let {W(t) = (W_1(t),...,W_m(t)), t ∈ R+} be an m-dimensional Wiener process and let an (m×m)-matrix U have real-valued entries and be orthogonal (i.e., UU^T = E). Prove that {W̃(t) = UW(t), t ≥ 0} is again an m-dimensional Wiener process.
5.29. Prove that there exists a homogeneous process {X(t), t > 0} with independent increments and distribution density
\[
p_t(x) = \frac{1}{\Gamma(t)}\, x^{t-1} e^{-x}\, 1\!\mathrm{I}_{x>0}.
\]
5.30. Prove that there exists a homogeneous process {X(t), t > 0} with independent increments and the characteristic function Ee^{izX(t)} = e^{−t|z|}, t > 0.
5.31. Find the parameters a, B, Π in the Lévy–Khinchin formula for the process from: (a) Problem 5.29; (b) Problem 5.30.
5.32. Prove that there does not exist a process with independent increments X such that: (a) X(0) has a continuous distribution but the distribution of X(1) has an atom. (b) The distribution of X(0) is absolutely continuous but the distribution of X(1) is not.
5.33. Prove that there does not exist a process with independent increments X such that X(0) is uniformly distributed on [0,1] and X(1) is exponentially distributed.
5.34. Prove that there does not exist a homogeneous process with independent increments X with P(X(1) − X(0) = ±1) = 1/2.
5.35. Give examples of homogeneous processes X with independent increments defined on [0,1] for which X(0) = 0 and the distribution of X(1) is: (a) discrete; (b) absolutely continuous; (c) continuous singular.
5.36. Prove that the process with independent increments {X(t), t ∈ R+} is stochastically continuous if and only if its one-dimensional characteristic function φ_t(z) = Ee^{izX(t)}, t ∈ R+, z ∈ R is a continuous function of t for every fixed z.
5.37. Assume {X(t), t ∈ [0, 1]} is a process with independent increments. Does it imply that the process Y(t) = X(−t), t ∈ [−1, 0] has independent increments? Compare with Problem 12.20.
5.38. Let {X(t), t ∈ R+} be a nondegenerate homogeneous process with independent increments. Prove that P(|X(t)| > A) > 0 for any t > 0 and A > 0.
5.39. Let {X(t), t ∈ R+} be a nondegenerate homogeneous process with independent increments. Prove that for every a > 0, b > 0 there exist random variables τ_n, σ_n, n ≥ 1, such that almost surely τ_n < σ_n < τ_{n+1}, σ_n − τ_n ≤ a and |X(σ_n) − X(τ_n)| > b for every n ≥ 1.
5.40. Let {X(t), t ∈ R+} be a homogeneous process with independent increments and piecewise constant trajectories. Prove that X is a compound Poisson process (see Problem 5.20).
5.41. Prove Proposition 5.2.
5.42. Prove that formula (5.2) defines a random variable (i.e., a measurable function of ω).
5.43. Prove that the compensated Poisson point measure ν̃ = ν − μ is a centered orthogonal measure with structural measure μ (here ν is a Poisson point measure with the intensity measure μ).
5.44. Prove equality (5.3).
5.45. Let ν be the Poisson point measure with intensity measure μ. Prove that:
(1) The characteristic function of the random variable ∫_E f(u) ν(du) equals
φ(t) = exp{∫_E (e^{itf(u)} − 1) μ(du)}.
(2) The characteristic function of the random variable ∫_E g(u) ν̃(du) equals
φ̃(t) = exp{∫_E (e^{itg(u)} − 1 − itg(u)) μ(du)}
(the functions f, g are such that the corresponding integrals are correctly defined).
5.46. Let {X(t), t ∈ T_1 × · · · × T_d} be the Poisson random field with intensity measure κ. Define the mapping ν on the ring K = {∪_{i=1}^m (s^i, t^i] : s^i, t^i ∈ T, s^i ≤ t^i, i = 1, . . . , m, m ∈ N} by the equality
ν(∪_{i=1}^m (s^i, t^i]) = ∑_{i=1}^m Δ_{s^i, t^i}(X), if (s^i, t^i] ∩ (s^j, t^j] = ∅, i ≠ j.
Prove that ν is the Poisson point measure with intensity measure κ.
5.47. Let T ⊂ R be an interval and let ν be a Poisson point measure with intensity measure κ defined on the ring K = {∪_{i=1}^m (s^i, t^i] : s^i, t^i ∈ T, s^i ≤ t^i, i = 1, . . . , m, m ∈ N}. Assume that {X(t), t ∈ T} is a stochastic process such that X(t) − X(s) = ν((s, t]) for any s, t ∈ T, s ≤ t. Does it imply that X is the Poisson process with intensity measure κ? Compare with the previous problem.
5.48. Let {X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+} be a compound Poisson process. Define the point measure ν on the ring K = {∪_{i=1}^m (s^i, t^i] : s^i, t^i ∈ R+, s^i ≤ t^i, i = 1, . . . , m, m ∈ N} by the equality ν(∪_{i=1}^m (s^i, t^i]) = ∑_{i=1}^m Δ_{s^i, t^i}(X), as (s^i, t^i] ∩ (s^j, t^j] = ∅, i ≠ j. What distribution should the random variables {ξ_k} follow in order for ν to be a Poisson point measure? What is its intensity measure in that case?
5.49. Let {X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+} be a compound Poisson process, E = R+ × R, K = {∪_{i=1}^m (a^i, b^i] : a^i, b^i ∈ E, a^i ≤ b^i, i = 1, . . . , m, m ∈ N}. For a = (s, x), b = (t, y) ∈ E, a ≤ b, we define ν((a, b]) = ∑_{k=N(s)+1}^{N(t)} 1I_{ξ_k ∈ (x,y]} and put ν(∪_{i=1}^m (a^i, b^i]) = ∑_{i=1}^m ν((a^i, b^i]), as (a^i, b^i] ∩ (a^j, b^j] = ∅, i ≠ j. Prove that ν is a Poisson point measure with intensity measure λ(λ¹ × μ), where λ is the parameter of the process N, λ¹ is the Lebesgue measure, and μ is the distribution of the variable ξ_1.
5.50. Let (E, E) be a measurable space, {X_n, n ≥ 1} be a sequence of i.i.d. random elements with values in E, and ζ be a random variable following the distribution Pois(λ) and independent of {X_n, n ≥ 1}. Prove that the mapping ν: E ∋ A → ν(A) = ∑_{k=1}^ζ 1I_{X_k ∈ A} is the Poisson point measure with the intensity measure λμ, where μ is the distribution of X_1.
5.51. Let {X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+} be a compound Poisson process, and α > 0 be a fixed number. Prove that E|X(t)|^α < +∞ for any t > 0 if and only if E|ξ_1|^α < +∞.
5.52. Let X be a Lévy process with the Lévy measure Π, and α > 0 be a fixed number. Prove that E‖X(t)‖^α_{R^d} < +∞ for any t > 0 if and only if ∫_{‖u‖_{R^d} > 1} ‖u‖^α_{R^d} Π(du) < +∞.
5.53. (General “0 and 1” rule) Let {G_α, α ∈ A} be independent σ-algebras, A ⊃ A_1 ⊃ A_2 ⊃ · · ·, ∩_{n=1}^∞ A_n = ∅, B_k = σ(∪_{α ∈ A_k} G_α), k ∈ N. Prove: if A ∈ ∩_{k=1}^∞ B_k, then P(A) = 0 or 1.
5.54. Let the process {X(t), t > 0} with independent increments have right-hand continuous trajectories. Prove that every random variable, measurable w.r.t. the σ-algebra ∩_{t>0} σ(X(s), s ≤ t), is degenerate, that is, possesses a nonrandom value with probability one.
5.55. Describe all centered continuous Gaussian processes with independent increments whose trajectories have bounded variation on any segment.
Hints
5.1. Write down the explicit expressions for the characteristic functions of the increments.
5.3. Use Theorem 5.1.
5.4. Express the events in terms of increments of the process N. For example, {N(2) + N(3) = 1} = {N(2) = 0, N(3) = 1} = {N(2) − N(0) = 0, N(3) − N(2) = 1}. This implies that P(N(2) + N(3) = 1) = P(N(2) − N(0) = 0) P(N(3) − N(2) = 1) = e^{−2λ} · [e^{−λ} λ] = λ e^{−3λ}.
5.7. Write down the differential equations for the functions f_k(t) = P(X(t) = k), k ∈ Z_+. Prove that the solution to this system of differential equations with f_0(0) = 1, f_k(0) = 0, k ≥ 1 is unique. Verify that the corresponding probabilities for the Poisson process satisfy this system.
5.9. The corresponding conditional distribution functions equal: (a) F_1(y) = P(τ_1 ≤ y / τ_1 ≤ 1, τ_2 > 1). (b) F_2(y) = P(τ_1 ≤ y / τ_2 > 1). (c) F_3(y) = P(τ_1 ≤ y / τ_m ≤ 1). Calculate these conditional probabilities using the identity {τ_m ≤ y} = {N(y) ≥ m}.
5.10. Use Problem 1.2.
5.12. For a given m ≥ 1 and 0 < a_1 < b_1 < a_2 < · · · < b_{m−1} < a_m, calculate the probability P(τ_1 ∈ (a_1, b_1], . . . , τ_{m−1} ∈ (a_{m−1}, b_{m−1}], τ_m > a_m).
5.13. Differentiate with respect to x the equality
P(τ_n ≤ x) = P(N(x) ≥ n) = 1 − ∑_{k=0}^{n−1} ((λx)^k / k!) e^{−λx}, x ≥ 0.
5.17–5.20. Calculate the joint characteristic functions of the increments.
5.21. Use the following general fact: if {η_α, α ∈ A} are jointly independent random variables, A_1, . . . , A_n are disjoint subsets of A, and ζ_i is σ(η_α, α ∈ A_i)-measurable, i = 1, . . . , n, then the variables ζ_1, . . . , ζ_n are jointly independent.
5.23. In order to obtain the characteristic function φ_{X(t)}, use considerations similar to those used in the proof of Problem 5.17. In order to express the distribution density of X(t), use the inversion formula for the characteristic function: p_{X(t)}(x) = (2π)^{−1} ∫_R e^{−izx} φ_{X(t)}(z) dz.
5.28. Verify that the process UW(t) is also a process with independent increments and that UW(t) − UW(s) follows the distribution N(0, (t − s)I_{R^m}).
5.29, 5.30. Use Theorem 5.1.
5.32. Assuming ζ = ξ + η with ξ and η independent, prove that: (a) If the distribution of ξ does not have atoms, then the distribution of ζ does not have them either. (b) If the distribution of ξ has a density, then the distribution of ζ has one, too.
5.33. Write down the characteristic function φ of the uniform distribution and the characteristic function ψ of the exponential distribution. Check that ψ/φ is not a characteristic function of a random variable.
5.38. If P(|X(t)| > A) = 0, then P(|X(t/2)| > A/2) = 0 (prove it!). Conclude that DX(t/2^k) ≤ A²/2^{2k}. After that, deduce that DX(t) = 0.
5.42. For a simple function f, the integral in the right-hand side of (5.2) equals a sum of values of ν̂ on a finite collection of sets A ∈ σ(K) with some nonrandom weights, and thus it is a random variable according to statement (1) of Proposition 5.2. Any nonnegative measurable function can be monotonically approximated by simple ones.
5.44. First prove formula (5.3) for simple functions and then approximate (both pointwise and in the L²(μ) sense) an arbitrary function by a sequence of simple ones.
5.45. For a simple function f, the integrals are sums of (independent) values of the Poisson measure ν or of the compensated Poisson measure ν̃ with some nonrandom weights. Use the explicit formula for the characteristic function of the Poisson random variable. Approximate a measurable function by a sequence of simple ones.
5.49, 5.50. Calculate the joint characteristic functions of the values of ν on disjoint sets A_1, . . . , A_n.
5.54. Use Problem 5.53, putting A = N, G_α = σ(X(t) − X(s), 2^{−α−1} ≤ s < t ≤ 2^{−α}), α ∈ N, A_k = {k, k + 1, . . .}.
5.55. The process is identically equal to zero. In order to show this, make the appropriate change of the time variable and use Problem 3.19.
Answers and Solutions
5.6. P^N_{t_1, . . . , t_m}(A) = ∑_{(u_1, . . . , u_m) ∈ A} P(N(t_1) = u_1, . . . , N(t_m) = u_m). For 0 < t_1 < · · · < t_m and u_1, . . . , u_m ∈ Z_+ such that u_1 ≤ · · · ≤ u_m,
P(N(t_1) = u_1, . . . , N(t_m) = u_m) = P(N(t_1) = u_1) P(N(t_2) − N(t_1) = u_2 − u_1) × · · · × P(N(t_m) − N(t_{m−1}) = u_m − u_{m−1}) = e^{−λ t_m} (λ t_1)^{u_1} · · · (λ t_m − λ t_{m−1})^{u_m − u_{m−1}} / (u_1! · · · (u_m − u_{m−1})!).
5.8. P(X(t_i) = k_i, i = 1, . . . , m) = P(N(Λ(t_i)) = k_i, i = 1, . . . , m), where Λ(t) = ∫_0^t λ(s) ds, t ≥ 0, and N is the Poisson process with parameter λ = 1.
5.10. Let 0 ≤ t_1 < · · · < t_n ≤ 1, u_1 ≤ · · · ≤ u_n ≤ m; then
P(N(t_1) = u_1, . . . , N(t_n) = u_n / N(1) = m)
= (e^{−λ} λ^m / m!)^{−1} · e^{−λ t_1} ((λ t_1)^{u_1} / u_1!) · e^{−λ(t_2 − t_1)} ((λ(t_2 − t_1))^{u_2 − u_1} / (u_2 − u_1)!) × · · · × e^{−λ(1 − t_n)} ((λ(1 − t_n))^{m − u_n} / (m − u_n)!).
We finish the proof by using Problem 1.2.
5.11. The distribution should be degenerate.
5.12. Let m ≥ 1 be fixed. For 0 < a_1 < b_1 < a_2 < · · · < b_{m−1} < a_m we have that
P(τ_1 ∈ (a_1, b_1], . . . , τ_{m−1} ∈ (a_{m−1}, b_{m−1}], τ_m > a_m)
= P(N(a_1) = 0, N(b_1) − N(a_1) = 1, . . . , N(a_m) − N(b_{m−1}) = 0)
= e^{−λ a_1} λ(b_1 − a_1) e^{−λ(b_1 − a_1)} · · · λ(b_{m−1} − a_{m−1}) e^{−λ(b_{m−1} − a_{m−1})} × e^{−λ(a_m − b_{m−1})}
= ∫_{(a_1, b_1] × · · · × (a_{m−1}, b_{m−1}] × (a_m, +∞)} λ^m e^{−λ x_m} dx_1 · · · dx_m.
From the same formula with a_m replaced by arbitrary b_m > a_m, we get
P((τ_1, . . . , τ_m) ∈ A) = ∫_A λ^m e^{−λ x_m} dx_1 · · · dx_m  (5.5)
for every set A of the form
A = (a_1, b_1] × · · · × (a_m, b_m], a_1 < b_1 < · · · < a_m < b_m.  (5.6)
The joint distribution of the variables τ_1, . . . , τ_m is concentrated on the set Δ_m := {(x_1, . . . , x_m) | 0 ≤ x_1 ≤ · · · ≤ x_m}. Because the family of sets of the type (5.6) is a semiring that generates the Borel σ-algebra in Δ_m, relation (5.5) implies that the joint distribution density of the variables τ_1, . . . , τ_m equals p(x_1, . . . , x_m) = λ^m e^{−λ x_m} 1I_{0 ≤ x_1 ≤ · · · ≤ x_m}. On the other hand, for independent Exp(λ) random variables ξ_1, . . . , ξ_m, the joint distribution density of the variables ξ_1, ξ_1 + ξ_2, . . . , ξ_1 + · · · + ξ_m equals
λ e^{−λ x_1} 1I_{x_1 ≥ 0} ∏_{k=2}^m λ e^{−λ(x_k − x_{k−1})} 1I_{x_k − x_{k−1} ≥ 0} = p(x_1, . . . , x_m),
that is, (τ_1, . . . , τ_m) =^d (ξ_1, . . . , ξ_1 + · · · + ξ_m). This proves the required statement, because the finite-dimensional distributions of a registration process are uniquely defined by the finite-dimensional distributions of the associated sequence {τ_k} (prove the latter statement!).
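The identity behind this solution — the equality in distribution of the Poisson jump times and the cumulative sums of independent Exp(λ) variables — is easy to probe numerically, for instance through P(τ_1 > s) = P(N(s) = 0) = e^{−λs}. A small sketch, not part of the original text, with assumed parameter values (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, n_paths, s = 1.5, 200_000, 1.0

# P(tau_1 > s) computed two ways: as P(N(s) = 0) and via an Exp(lam) draw.
no_jump_by_s = np.mean(rng.poisson(lam * s, n_paths) == 0)
tau1 = rng.exponential(1.0 / lam, n_paths)        # first cumulative-sum term
exp_tail = np.mean(tau1 > s)

print(no_jump_by_s, exp_tail, np.exp(-lam * s))    # all approximately equal
```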
5.15. X is the Poisson process with intensity pλ.
5.16. X is the Poisson process with intensity λ_1 + λ_2.
5.17. Take x_1 < · · · < x_n, u_1, . . . , u_n ∈ R and denote ΔX_i = X(x_i) − X(x_{i−1}), ΔF_i = F(x_i) − F(x_{i−1}). We have
E exp[i(u_1 X(x_1) + u_2 ΔX_2 + · · · + u_n ΔX_n)]
= ∑_{k=0}^∞ E(exp[i(u_1 X(x_1) + u_2 ΔX_2 + · · · + u_n ΔX_n)] / ν = k) · (λ^k e^{−λ} / k!)
= ∑_{k=0}^∞ (λ^k e^{−λ} / k!) · E exp[i ∑_{j=1}^k (u_1 1I_{ξ_j ≤ x_1} + u_2 1I_{x_1 < ξ_j ≤ x_2} + · · · + u_n 1I_{x_{n−1} < ξ_j ≤ x_n})]
= ∑_{k=0}^∞ (λ^k e^{−λ} / k!) · (e^{iu_1} F(x_1) + e^{iu_2} ΔF_2 + · · · + e^{iu_n} ΔF_n + (1 − F(x_n)))^k
= exp{λ(e^{iu_1} − 1) F(x_1) + λ(e^{iu_2} − 1) ΔF_2 + · · · + λ(e^{iu_n} − 1) ΔF_n},
so that X(x_1), ΔX_2, . . . , ΔX_n are independent Poisson random variables with parameters λF(x_1), λΔF_2, . . . , λΔF_n.
13.17. Apply the Itô formula and find the stochastic differentials of the processes:
(a) X(t) = W²(t). (b) X(t) = sin t + e^{W(t)}. (c) X(t) = t · W(t). (d) X(t) = W_1²(t) + W_2²(t), where W(t) = (W_1(t), W_2(t)) is a two-dimensional Wiener process. (e) X(t) = W_1(t) · W_2(t). (f) X(t) = (W_1(t) + W_2(t) + W_3(t))(W_2²(t) − W_1(t)W_3(t)), where (W_1(t), W_2(t), W_3(t)) is a three-dimensional Wiener process.
13.18. Let f ∈ L̂_2, X(t) = ∫_0^t f(s) dW(s). Prove that the process M(t) := X²(t) − ∫_0^t f²(s) ds, t ≥ 0 is a martingale.
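As a numerical illustration of 13.17(a) and 13.18: for X(t) = W²(t) the Itô formula gives dX(t) = dt + 2W(t) dW(t), so W²(T) − T should be close to the left-point Riemann sum approximating 2∫_0^T W(s) dW(s). A minimal sketch, not part of the original text (NumPy assumed; the step size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_steps = 1.0, 100_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), n_steps)
W = np.concatenate(([0.0], np.cumsum(dW)))     # W(0) = 0, W(t_1), ..., W(T)

# Left-point (Ito) Riemann sum for 2 * int_0^T W(s) dW(s)
ito_sum = 2.0 * np.sum(W[:-1] * dW)
print(W[-1] ** 2 - T, ito_sum)                 # the two numbers should be close
```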
13.19. Let W(t) = (W_1(t), . . . , W_m(t)), t ≥ 0 be an m-dimensional Wiener process, f: R^m → R, f ∈ C²(R^m). Prove that
f(W(t)) = f(0) + ∫_0^t ∇f(W(s)) dW(s) + (1/2) ∫_0^t Δf(W(s)) ds,
where ∇ = (∂/∂x_1, . . . , ∂/∂x_m) is the gradient and Δ = ∑_{i=1}^m ∂²/∂x_i² is the Laplacian.
13.20. Find the stochastic differential dZ(t) of a process Z(t), if:
(a) Z(t) = ∫_0^t f(s) dW(s), where f ∈ L̂_2([0,t]).
(b) Z_t = exp{αW(t)}.
(c) Z_t = exp{αX(t)}, where the process {X(t), t ∈ R+} has a stochastic differential dX(t) = α dt + β dW(t), α, β ∈ R, β ≠ 0.
(d) Z_t = X^n(t), n ∈ N, dX(t) = αX(t) dt + βX(t) dW(t), α, β ∈ R, β ≠ 0.
(e) Z_t = X^{−1}(t), dX(t) = αX(t) dt + βX(t) dW(t), α, β ∈ R, β ≠ 0.
13.21. Let h_n(x), n ∈ Z_+ be the Hermite polynomial of nth order:
h_n(x) = (−1)^n e^{x²/2} (d^n/dx^n) e^{−x²/2}.
(1) Verify that h_0(x) = 1, h_1(x) = x, h_2(x) = x² − 1, h_3(x) = x³ − 3x.
(2) Prove that the multiple Itô stochastic integrals
I_n(t) := ∫_0^t ∫_0^{t_1} · · · ∫_0^{t_{n−1}} dW(t_n) · · · dW(t_2) dW(t_1), n ≥ 1,
are well-defined.
(3) Prove that
n! I_n(t) = t^{n/2} h_n(W(t)/√t), t > 0.
13.22. Let Δ_n(T) = {(t_1, . . . , t_n) | 0 ≤ t_1 ≤ · · · ≤ t_n ≤ T}. For f ∈ L²(Δ_n(T)), denote
I_n^f(T) := ∫_0^T ∫_0^{t_n} · · · ∫_0^{t_2} f(t_1, . . . , t_n) dW(t_1) · · · dW(t_n).
(1) Prove that EI_n^f(T) I_m^g(T) = 0 for any n ≠ m, f ∈ L²(Δ_n(T)), g ∈ L²(Δ_m(T)), and
EI_n^f(T) I_n^g(T) = ∫_0^T ∫_0^{t_n} · · · ∫_0^{t_2} f(t_1, . . . , t_n) g(t_1, . . . , t_n) dt_1 · · · dt_n
for any n ≥ 1, f, g ∈ L²(Δ_n(T)).
(2) Let 0 = a_0 ≤ a_1 ≤ · · · ≤ a_m ≤ T, k_i ∈ Z_+, i = 0, . . . , m − 1, n = k_0 + · · · + k_{m−1}. Put
f(t_1, . . . , t_n) = ∏_{i=0}^{m−1} ∏_{j=k_0+···+k_i}^{k_0+···+k_{i+1}−1} 1I_{[a_i, a_{i+1})}(t_j).
Prove that
I_n^f(T) = ∏_{i=0}^{m−1} ((a_{i+1} − a_i)^{k_i/2} / k_i!) h_{k_i}((W(a_{i+1}) − W(a_i)) / √(a_{i+1} − a_i)).
(3) Prove that for any m ∈ N, a_i ∈ [0, T], k_i ∈ Z_+, i = 1, . . . , m, there exist functions f_j ∈ L²([0, T]^j), 0 ≤ j ≤ n = k_0 + · · · + k_{m−1}, and a constant c such that
W(a_1)^{k_1} · · · W(a_m)^{k_m} = c + ∑_{j=0}^n I_j^{f_j}(T).
(4) Use the completeness of polynomial functions in any space L²(R^n, γ), where γ is a Gaussian measure (see, e.g., [5]), and prove that for any random variable ξ measurable w.r.t. σ(W(a_1), . . . , W(a_m)) there exists a unique sequence of functions {f_n ∈ L²(Δ_n(T)), n ∈ Z_+} such that
ξ = f_0 + ∑_{j≥1} I_j^{f_j}(T).  (13.2)
Moreover,
f_0 = Eξ,  Eξ² = f_0² + ∑_{n≥1} ‖f_n‖²_{L²(Δ_n(T))}.  (13.3)
(5) Prove that for any random variable ξ ∈ L²(Ω, F_T^W, P) there exists a unique sequence of functions {f_n ∈ L²(Δ_n(T)), n ∈ Z_+} such that (13.2), (13.3) are satisfied.
13.23. Let X(t) be an a.s. positive stochastic process, dX(t) = X(t)(α dt + β dW(t)), α, β ∈ R. Prove that d ln X(t) = (α − (1/2)β²) dt + β dW(t), t ∈ R+.
13.24. Assume that processes X(t) and Y(t), t ∈ R+ have stochastic differentials dX(t) = a(t) dt + b(t) dW(t), dY(t) = α(t) dt + β(t) dW(t). Prove that
d(XY)_t = X(t) dY(t) + Y(t) dX(t) + dX(t) dY(t) = (α(t)X(t) + a(t)Y(t) + b(t)β(t)) dt + (β(t)X(t) + b(t)Y(t)) dW(t).
13.25. Assume that processes X(t) and Y(t), t ∈ R+ have stochastic differentials dX(t) = a dt + b dW(t), dY(t) = Y(t)(α dt + β dW(t)), α, β, a, b ∈ R, and Y(0) > 0 a.s. Prove that dY^{−1}(t) = Y^{−1}(t)((−α + β²) dt − β dW(t)) and d(X(t)Y^{−1}(t)) = Y^{−1}(t)((a − bβ + X(t)(β² − α)) dt + (b − βX(t)) dW(t)). (Note that Y(t) > 0, t ≥ 0 a.s. if Y(0) > 0, see Problem 14.2.)
13.26. Let {W(t), F_t, t ∈ [0, T]} be a Wiener process, {f(t), F_t^W, t ∈ [0, T]} be a bounded process, |f(t)| ≤ C, t ∈ [0, T]. Prove that E|∫_0^t f(s) dW(s)|^{2m} ≤ C^{2m} t^m (2m − 1)!!.
13.27. Assume that a process X has a stochastic differential dX(t) = U(t) dt + dW(t), where U is a bounded process. Define a process Y(t) = X(t)M(t), where M(t) = exp{−∫_0^t U(s) dW(s) − (1/2)∫_0^t U²(s) ds}. Apply the Itô formula and prove that Y(t) is an F_t-martingale, where F_t = σ{U(s), W(s), s ≤ t}.
13.28. Let {W(t), t ∈ [0, T]} be a Wiener process. Assume a process {f(t), F_t^W, t ∈ [0, T]} is such that ∫_0^T E|f(t)|^{2m} dt < ∞. Prove that E|∫_0^t f(s) dW(s)|^{2m} ≤ [m(2m − 1)]^m t^{m−1} ∫_0^t E|f(s)|^{2m} ds.
13.29. Prove that for any p ≥ 1 there exist positive constants c_p, C_p such that
c_p E|max_{t∈[0,T]} ∫_0^t f(s) dW(s)|^{2p} ≤ E|∫_0^T f²(s) ds|^p ≤ C_p E|max_{t∈[0,T]} ∫_0^t f(s) dW(s)|^{2p}.
13.30. Let {f(t), F_t^W, t ∈ R+} be a bounded stochastic process, f ∈ L̂_2, and let a process X(t) be the stochastic exponent, that is, have the form X(t) = exp{∫_0^t f(s) dW(s) − (1/2)∫_0^t f²(s) ds}, where {W(t), F_t, t ∈ R+} is a Wiener process. Prove that dX(t) = X(t) f(t) dW(t).
13.31. Let (Ω, F, {F_t}_{t∈[0,T]}, P) be a filtered probability space, {W(t), F_t, t ∈ [0, T]} be a Wiener process, {γ(t), F_t, t ∈ [0, T]} be a progressively measurable stochastic process such that P{∫_0^T γ²(s) ds < ∞} = 1, and {ξ(t), F_t, t ∈ [0, T]} be a stochastic process of the form ξ(t) = 1 + ∫_0^t γ(s) dW(s), t ∈ [0, T] (its stochastic integrability was established in Problem 13.10). Assume that ξ(t) ≥ 0, t ∈ [0, T] P-a.s. Prove that ξ is a nonnegative supermartingale and Eξ(t) ≤ 1, t ∈ [0, T].
13.32. Assume that a process {X(t), t ∈ R+} has a stochastic differential dX(t) = αX(t) dt + σ(t) dW(t), X(0) = X_0 ∈ R, where α ∈ R, σ ∈ L_2([0,t]) for all t ∈ R+. Find m(t) := EX(t).
13.33. Assume that a process {X(t), t ∈ R+} has a stochastic differential dX(t) = μ(t) dt + σ(t) dW(t), where μ ∈ L_1([0,t]), σ ∈ L_2([0,t]) for all t ∈ R+, and μ(t) ≥ 0 for all t ∈ R+. Prove that {X(t), F_t^W, t ∈ R+} is a submartingale, where F_t^W is the flow of σ-algebras generated by the Wiener process {W(t), t ∈ R+}.
13.34. Let X(t) = exp{∫_0^t μ(s) ds + ∫_0^t σ(s) dW(s)}, where μ ∈ L_1([0,t]), σ ∈ L_2([0,t]) for all t ∈ R+. Find conditions on the functions μ and σ under which the process X(t) is (a) a martingale; (b) a submartingale; (c) a supermartingale.
13.35. Apply the Itô formula and prove that the following processes are martingales with respect to the natural filtration: (a) X_t = e^{t/2} cos W(t). (b) X_t = e^{t/2} sin W(t). (c) X_t = (W(t) + t) exp{−W(t) − (1/2)t}.
13.36. Let {W(t), F_t, t ∈ R+} be the Wiener process. Prove that W⁴(1) = 3 + ∫_0^1 (12(1 − t)W(t) + 4W³(t)) dW(t).
13.37. Let {W(t), F_t, t ∈ R+} be the Wiener process, and τ be a stopping time such that Eτ < ∞. Prove that EW(τ) = 0, EW²(τ) = Eτ.
13.38. Let a < x < b, {X(t) = x + W(t), t ∈ R+}, where W is the Wiener process. Denote ϕ(x) := P(X(τ) = a), where τ is the exit moment of the process X from the interval (a, b). Prove that ϕ ∈ C^∞[a, b] and that ϕ satisfies the differential equation ϕ''(x) = 0, x ∈ [a, b], with ϕ(a) = 1, ϕ(b) = 0. Prove that P(X(τ) = a) = (b − x)/(b − a). This is a particular case of Problem 14.31.
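The exit probability of Problem 13.38, P(X(τ) = a) = (b − x)/(b − a), can be checked roughly by simulating discretized paths of X(t) = x + W(t); the discretization introduces a small bias from overshooting the boundary, so this is only a sanity check and not part of the original text. The parameters below are arbitrary assumptions; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, x = 0.0, 2.0, 0.5
dt, n_paths, max_steps = 1e-3, 20_000, 100_000

pos = np.full(n_paths, x)
alive = np.ones(n_paths, dtype=bool)      # paths still inside (a, b)
hit_a = np.zeros(n_paths, dtype=bool)

for _ in range(max_steps):
    if not alive.any():
        break
    pos[alive] += rng.normal(0.0, np.sqrt(dt), alive.sum())
    hit_a |= alive & (pos <= a)           # exited through the lower end
    alive &= (pos > a) & (pos < b)

print(hit_a.mean(), "theoretical value:", (b - x) / (b - a))
```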
13.39. Let {H(t), F_t^W, t ∈ [0, T]} be a stochastic process with ∫_0^T H²(t) dt < ∞ a.s. Put M(t) := ∫_0^t H(s) dW(s).
(1) Prove that M(t) is a local square integrable martingale.
(2) Let E sup_{t≤T} M²(t) < ∞. Prove that E∫_0^T H²(t) dt < ∞ and the process M(t) is a square integrable martingale.
13.40. Let
p(t, x) = (1/√(1 − t)) exp{−x²/(2(1 − t))}, 0 ≤ t < 1, x ∈ R, and p(1, x) = 0.
Put M(t) := p(t, W(t)), where W(t) is a Wiener process.
(a) Prove that M(t) = M(0) + ∫_0^t (∂p/∂x)(s, W(s)) dW(s).
(b) Let H(t) = (∂p/∂x)(t, W(t)). Prove that ∫_0^1 H²(t) dt < ∞ a.s., and E∫_0^1 H²(t) dt = +∞.
13.41. (1) Let {W(t) = (W_1(t), . . . , W_m(t)), F_t, t ≥ 0} be an m-dimensional Wiener process. Put M(t) := ∑_{i=1}^m (W_i(t))². Prove that {M(t) − mt, F_t, t ≥ 0} is a martingale and its quadratic characteristic equals
⟨M⟩(t) = ∫_0^t 4M(u) du.
(2) Let N(t) = (M(t))^{1/2},
ϕ(x) = ln|x| for m = 2, and ϕ(x) = |x|^{2−m} for m ≥ 3.
Prove that {ϕ (N(t)), Ft ,t ≥ 0} is a local martingale.
13.42. Let {M_t, F_t, t ∈ [0, T]} be a martingale of the form ∫_0^t H(s) dW(s) + ∫_0^t K(s) ds, where ∫_0^t H²(s) ds < ∞ a.s., ∫_0^t |K(s)| ds < ∞ a.s. Prove that K(t) = 0 λ¹|_{[0,T]} × P-a.s.
13.43. Let M ∈ M^{2,c}. Prove that
M²(t) − M²(s) = 2∫_s^t M(u) dM(u) + ⟨M⟩(t) − ⟨M⟩(s), s < t.
13.44. Let {M(t), F_t, t ∈ R+} be a continuous local martingale, M(0) = 0, and lim_{t→∞}⟨M⟩(t) = ∞ a.s. Denote
τ_t := inf{s > 0 | ⟨M⟩(s) > t}, t > 0.
Prove that the time-changed stochastic process {M(τ_t), t ≥ 0} is a Wiener process with respect to the filtration {F_{τ_t}, t ∈ R+}.
13.45. Let M ∈ M^{2,c}_{loc}, M(0) = 0, and E⟨M⟩(t) < ∞ for all t > 0. Prove that M ∈ M^{2,c}.
13.46. (The Dynkin formula) (1) Assume that a stochastic process {X(t), F_t, t ∈ R+} has a stochastic differential dX(t) = μ(t, X(t)) dt + σ(t, X(t)) dW(t), X(0) = x ∈ R, where μ and σ are continuous bounded functions. Let a function f be bounded, f = f(t, x) ∈ C¹(R+) × C²(R), let τ be a bounded stopping time, and let a second-order differential operator L be of the form
Lf(t, x) = μ(t, x) ∂f/∂x + (1/2) σ²(t, x) ∂²f/∂x².
(This operator is similar to the operator L introduced in Definition 12.16, but now its coefficients are nonhomogeneous in time.) Prove that
E f(τ, X_τ) = f(0, X_0) + E ∫_0^τ (∂f/∂t + Lf)(u, X_u) du.
(2) Prove the following version of the Dynkin formula for unbounded stopping times. Let C_0²(R^n) be the class of twice continuously differentiable functions on R^n with compact support, and τ be a stopping time such that E_x τ < ∞. Then
E_x f(X_τ) = f(x) + E_x ∫_0^τ Lf(X_u) du.
(3) Let an n-dimensional stochastic process {X(t), F_t, t ∈ R+} have a stochastic differential of the form dX_i(t) = μ_i(t, X(t)) dt + ∑_{j=1}^m b_{i,j}(t, X(t)) dW_j(t), X(0) = x ∈ R^n, where W(t) = (W_1(t), W_2(t), . . . , W_m(t)) is an m-dimensional Wiener process. Assume that all components of μ and b are bounded continuous functions, a function f is bounded, f = f(t, x) ∈ C¹(R+) × C²(R^n), and τ is a bounded stopping time. Write down and prove the multidimensional Dynkin formula for X.
13.47. Let {W(t), t ∈ R+} be a Wiener process. Prove that the limit (13.1) exists in L²(P) and for all t ∈ R+ and x ∈ R,
(W(t) − x)^+ − (W(0) − x)^+ = ∫_0^t 1I_{W(s)∈[x,∞)} dW(s) + (1/2) L(t, x) a.s.
13.48. Prove Theorem 13.4 (the Tanaka formula).
13.49. Prove that there exists a modification of L(t, x) that is continuous in (t, x).
13.50. Let {W(t), t ∈ R+} be a Wiener process. Prove that for all t ∈ R+ and a ≤ b,
∫_a^b L(t, x) dx = ∫_0^t 1I_{W(s)∈(a,b)} ds a.s.
−∞
L(t, x) f (x)dx =
t 0
f (W (s))ds a.s.
13.52. Prove that a limit in (13.1) exists with probability 1. 13.53. (1) Let {L(t, y),t ≥ 0, y ∈ R} be a local time of a Wiener process. Prove that Ex (L(τa ∧ τb , y)) = 2u(x, y), a ≤ x ≤ y ≤ b where τz := inf{s ≥ 0|W (s) = z}, z = a, b, and u(x, y) = ((x − a)(b − y))/(b − a). (2) Prove that Ex
τa ∧τb 0
f (W (s))ds = 2
b a
u(x, y) f (y)dy
for any nonnegative bounded Borel function f : R → R+ . 13.54. Let W (t) = (W1 (t), . . . ,Wn (t)) be an n-dimensional Wiener process, X(t) = a +W (t) where a = (a1 , . . . , an ) ∈ Rn (n ≥ 2), and let R > a. Denote τR := inf{t ≥ 0 | X(t) ∈ B(0, R)} where B(0, R) = {x ∈ Rn | x < R}. Find EτR . 13.55. Let X be the process from Problem 13.54, but a > R. Find the probability for this process to hit the ball B(0, R).
204
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
13.56. Find some function f = ( f1 , f2 ) such that for any process X(t) = (X1 (t), X2 (t)) satisfying the equation dX1 (t) = −X2 (t)dW (t) + f1 (X1 (t), X2 (t))dt, dX2 (t) = X1 (t)dW (t) + f2 (X1 (t), X2 (t))dt, the equality X12 (t) + X22 (t) = X12 (0) + X22 (0),t ≥ 0 a.s. holds true. 13.57. (1) Prove the L´evy theorem. A stochastic process {W (t), Ft ,t ≥ 0} is a Wiener process if and only if it is a square integrable martingale, W (0) = 0, and E((W (t) − W (s))2 /Fs ) = t − s for any s < t. (2) Prove the following generalization of the L´evy theorem. For any square integrable martingale {M(t), Ft ,t ≥ 0} with the quadratic characteristic (see Definition 7.10) of the form M(t) = 0t α (s) ds where α ∈ L1 ([0,t]) for any t > 0, and α (s) > 0 for all s > 0, there exists a Wiener process {W (t), Ft ,t ≥ 0} such that M(t) = 0t (α (s))1/2 dW (s). (3) Prove the multidimensional version of the L´evy theorem. An n-dimensional stochastic process {W (t) = (W1 (t),W2 (t), . . . ,Wn (t)), Ft ,t ≥ 0} is an n-dimensional Wiener process if and only if it is a square integrable martingale, W (0) = 0, and the processes Wi (t)W j (t) − δi j t are martingales for any 1 ≤ i, j ≤ n. Another formulation for the latter claim is that, for any 1 ≤ i, j ≤ n, the joint quadratic characteristic Wi ,W j (t) equals δi j t . 13.58. Prove that stochastic process Y (t) := process.
t
0 sign (W (s))dW (s),
t ≥ 0 is a Wiener
13.59. Let W (t) = (W1 (t), . . . ,Wm (t)) be a Wiener process. Assume that a progressively measurable process U(t),t ≥ 0 takes values in a space of n × m matrices and U(t)U ∗ (t) = idn a.s., where idn is the identity n × n matrix. Prove that t W (t) = 0 U(s)dW (s) is an n-dimensional Wiener process. 13.60. Assume that a progressively measurable stochastic process β (t),t ≥ 0 satisfies the condition: ∃ c,C > 0 ∀ t ≥ 0 : c ≤ β (t) ≤ C a.s. Let A(t) =
t
−1
β (s)ds, A (t) := inf s ≥ 0|
0
s
β (z)dz = t .
0
Prove that A−1 (t) is a stopping time. Consider filtration {FtA := FA−1 (t) ,t ≥ 0}. Prove that the process (t) = W
−1 (t) A
β 1/2 (s)dW (s)
0
is an
{FtA ,t
≥ 0}-Wiener process. Prove the following change of variables formula, A−1 (t) 0
b(s)dW (s) =
t 0
(z), b(A−1 (z))β −1/2 (A−1 (z))dW
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
205
where b(t),t ≥ 0 is a σ {W (s), s ≤ t}-adapted process such that T 2 b (s)ds < ∞ = 1 P 0
for each T > 0.
Hints 13.1. Represent the integral as a mean-square limit of the corresponding integral sums. 13.2. See the hint to Problem 13.1. The Itˆo formula can also be applied. 13.3. Prove that these random variables are jointly Gaussian (see Problem 13.1), and calculate their covariance. 13.4. (3) Apply the Itˆo formula to W 3 (t). 13.5. Apply the definition of a stochastic integral and properly approximate a continuous process f ∈ Lˆ2 ([0, T ]). kT /n 13.6. Prove that process gn = ∑n−1 k=1 n (k−1)T /n g(s)ds1I[kT /n,(k+1)T /n) belongs to Lˆ2 ([0, T ]), and gn → g, n → ∞ in Lˆ2 ([0, T ]). In order to prove this verify that linear kT /n operator Pn : f → ∑n−1 n f (s)ds1I[kT /n,(k+1)T /n) in Lˆ2 ([0, T ]) has a norm 1, k=1
(k−1)T /n
and a sequence {Pn , n ≥ 1} strongly converges to the identity operator in Lˆ2 ([0, T ]). 13.8. Prove that a sequence of simple processes fn (t) = ∑ n k
k/n
(k−1)/n
f (s)ds1It∈[k/n,(k+1)/n)
converges to a process f in Lˆ2 ([0, T ]). Construct a similar sequence gn for g and ob serve that the Itˆo integrals of these processes 0t fn (s)dW (s), 0t gn (s)dW (s) coincide on the set A for any fixed t ∈ [0, T ]. In order to prove that t t f (s)dW (s) = g(s)dW (s) = 0 P ω ∈ A| ∃t ∈ [0; T ] t
0
0
t
notice that processes 0 f (s)dW (s), 0 g(s)dW (s) are continuous in t. 13.9. Prove the required statements for a stopping time that take values in a finite set. Then approximate arbitrary stopping time τ by a sequence of stopping times τn = ∑k≥1 k/n1Iτ ∈[(k−1)/n, k/n) . See also the solution to Problem 13.37. 13.10. Put τn = inf{t ≥ 0 | 0t f 2 (s)ds = n}∧T. Then τn is a stopping time, a sequence {τn } is nondecreasing, and for a.a. ω there exists n0 = n0 (ω ) such that τn (ω ) = T, n ≥ n0 . Set fn (t) = f (t)1Iτn >t . Then 0T ( fn (t, ω ) − f (t, ω ))2 dt = 0 if τn (ω ) = T. Denote I(t) = 0t fn (s)dW (s) for {ω | τn (ω ) = T } and apply the results of the previous two problems. 13.11. Let
τm = inf{t ≥ 0|
t 0
g2 (s)ds = m}∧T, gn (t) =
n−1
∑n
k=1
kT /n
(k−1)T /n
g(s)ds1I[kT /n,(k+1)T /n) (t).
Then (see Problems 13.6, 13.10) T 0
g(s)dW (s) =
kT /n
n−1
∑n
(k−1)T /n
k=1
T 2 T 2 0 gn (t)dt ≤ 0 g (t)dt for all ω ∈ Ω , n ∈ N, and T 0
g(s)1Iτm =T dW (s),
g(s)ds(W ((k + 1)T /n) −W (kT /n))
=
T 0
gn (t)1Iτm =T dW (t)
for a.a. ω such that τm (ω ) = T . Tend n → ∞ and apply Problems 13.6, 13.10. 13.12. Apply results of Problems13.5–13.10. 13.13. The first sum converges to 01 W (t)dW (t) = (W 2 (1) − 1)/2 in L2 (P). Compare the sums in (b), (c) with the sum in (a) and apply the reasoning of Problem 3.19 to estimate differences between the sums. 13.14. Introduce stopping times
τN,n = inf{t ≥ 0| Then
t 0
b2n (s)ds ≥ N} ∪ {T }.
T
0
2 bn (t)1It≤τN, n − b0 (t)1It≤τN, 0
dt → 0, n → ∞.
Therefore 0T bn (t)1It≤τN,n dW (t) converges in L2 (P) to 0T b0 (t)1It≤τN,n dW (t) as n → ∞, and so converges in probability. The localization property of the stochastic integral (Problem 13.8) implies that T
bn (t)1It≤τN,n dW (t) =
0
T 0
bn (t)dW (t)
for a.a. ω from the set {τN,n = T }. So, to complete the proof, it suffices to observe that ∀ ε > 0 ∃ N : lim P(τN,n < T ) < ε . n→∞
13.15. Introduce auxiliary processes
ξn,m (t) =
∑
k=0
Then
1 0
k+1/m
m−2
m k/m
ξn (s)ds1I ((k+1)/m),((k+2)/m) (t).
2 2 1 ξ (t)dW (t) ≤ 3 ( ξ (t) − ξ (t))dW (t) n,m n 0 0 0 0 n 2 + 01 ξn,m (t)dWn (t) − 01 ξ0,m (t)dW0 (t) 2 . + 01 (ξ0 (t) − ξ0,m (t))dW0 (t)
ξn (t)dWn (t) −
1
(13.4) It is easy to see that for each m we have the following convergence in probability,
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula 1
lim
n→∞ 0
ξn,m (t)dWn (t) =
It can be proved that Pm : f →
207
1
ξ0,m (t)dW0 (t). 0 k+1/m (t) f (s)ds1I ∑m−2 k=0 m k/m ((k+1)/m),((k+2/m)
in
Lˆ2 ([0, T ]) has a norm 1, and a sequence {Pm , m ≥ 1} strongly converges to the identity operator in Lˆ2 ([0, T ]). So, 1 2 lim E (ξ0 (t) − ξ0,m (t))dW0 (t) m→∞
= lim E m→∞
Observe that
0
1 0
(ξ0 (t) − ξ0,m (t))2 dt = 0.
2 E 0 (ξn (t) − ξn,m (t))dWn (t) 1 2 = E 0 (ξn (t) − ξn,m (t)) dt ≤ 3 E 01 (ξn (t) − ξ0 (t))2 dt 1 1 2 2 +E 0 (ξ0 (t) − ξ0,m (t)) dt + E 0 (ξ0,m (t) − ξn,m (t)) dt
1
≤ 6E
1 0
(ξn (t) − ξ0 (t))2 dt + 3E
1 0
(ξ0 (t) − ξ0,m (t))2 dt.
Therefore the first term in the right-hand side of (13.4) converges in the mean square to 0 as n, m → ∞. So, the left-hand side of (13.4) converges to 0 in probability as n → ∞. 13.19. Apply the multidimensional Itˆo formula. 13.20. Apply the Itˆo formula. 13.21. (2), (3) Use the √ method of mathematical induction. In particular, in item (3), let a(s) := (W (s))/ s. Then da(s) = s−1/2 dW (s) − s−3/2W (s)ds/2. Due to the Itˆo formula d sn/2 hn (a) = n2 sn/2−1 hn (a) + sn/2 hn (a(s)) s−1/2 dW (s) − s−3/2W (s)ds/2 −1 ds = sn/2−1 nh (a(s)) − a(s)h (a(s)) + 12 sn/2 hn (a(s))s n n 2 + hn (a(s)) ds + s(n−1)/2 hn (a(s))dW (s). The following property of Hermite polynomials is well known: nhn (x) − xhn (x) + hn (x) = 0, and also hn (x) = nhn−1 (x). That is d sn/2 hn (a(s)) = ns(n−1)/2 hn−1 (a(s))dW (s), that which was to be demonstrated (this is the inductive step). 13.22. (1) Use properties of stochastic integrals. (2) Apply Problem 13.21. (4) Uniqueness follows from item (1). To prove existence, use items (1) and (3). f f (5) Random variables In n (T ), Imm (T ) are orthogonal in L2 (Ω , FTW , P) if m = n (see fn item (1)), so {In (T )| fn ∈ L2 (Δn (T ))} are orthogonal subspaces of L2 (Ω , FTW , P). Therefore, it suffices to verify that linear combinations of multiple Itˆo integrals are dense in L2 (Ω , FTW , P). As mentioned, the set of polynomials is dense in any space L2 (Rn , γ ), where γ is a Gaussian measure. So any square integrable random variable
208
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
of the form g(W (s1 ), . . . ,W (sn )) can be approximated in L2 (P) by linear combinations of multiple stochastic integrals. Prove now that a set of square integrable random variables having the form g(W (s1 ), . . . ,W (sn )), sk ∈ [0, T ], n ∈ N is dense in L2 (Ω , FTW , P). 13.23.–13.25. Apply the Itˆo formula. In Problem 13.24 this can be made straightforwardly; in Problems 13.23 and 13.25 an additional limit procedure should be used because the functions x → ln x and (x, y) → x/y do not belong to the classes C2 (R) and C2 (R2 ), respectively. For instance, for any given c > 0 there exists a function Fc ∈ C2 (R) such that Fc (x) = ln x, x ≥ c. Write the Itˆo formula for Fc (X(t)) and then tend c → 0+. 13.26. Let X(t) := 0t f (s)dW (s). Put τN := inf{t ∈ R+ | sup0≤s≤t |X(s)| ≥ N} ∧ T, apply the Itˆo formula to |X(t ∧ τN )|2m , obtain the estimate E|X(t)|2m ≤ C2 m(2m − t 1) 0 E|X(s)|2m−2 ds, and apply mathematical induction. 13.28. the Itˆo formula and obtain the equality E|X(t ∧ τN )|2m = m(2m − t∧τApply N |X(s)|2m−2 f 2 (s)ds, where X(t) and τN are the same as in Problem 13.26. 1)E 0 Apply the H¨older inequality with p = m/(m − 1), q = m to the right-hand side. Verify and then use the following facts: E|X(t ∧ τN )|2m < ∞ and E|X(t ∧ τN )|2m is nondecreasing in t. 13.29. Prove that the quadraticcharacteristic of the square integrable martingale t t 2 0 f (s)dW (s), t ∈ [0, T ] equals 0 f (s)ds, t ∈ [0, T ]. Apply the Burkholder–Davis inequality for a continuous-time martingale (Theorem 7.17). 13.30. Apply the Itˆo formula. 13.31. Consider the following stopping times: τn = inf{t ∈ [0, T ] | 0t γ 2 (s)ds ≥ n}, T 2 where we set τn = T if 0 γ (s)ds < n. Use properties of the Itˆo integral and verify that process ξ (t ∧ τn ) is a continuous nonnegative Ft -martingale. Check that τn → T a.s. as n → ∞. Apply the Fatou lemma and deduce both statements. 13.32. Write the equation for m(t). 13.33. Prove that 0t σ (t)dW (t) is a martingale with respect to the indicated flow and apply the definition of a submartingale. 13.34. Apply the Itˆo formula. 13.39. (1) Consider a sequence of stopping times τn = inf{t ∈ R+ 0t H 2 (s)ds = n} ∧ T and prove that EM 2 (τn ) = E 0τn H 2 (s)ds. (2) Apply the Lebesgue monotone convergence theorem and relation E 0τn H 2 (s)ds = EM 2 (τn ) ≤ E supt≤T M 2 (t). 13.41. (1) To calculate a quadratic characteristic apply the Itˆo formula to the difference M 2 (t) − M 2 (s), 0 ≤ s < t. (2) Apply the multidimensional version of the Itˆo formula. 13.42. Consider a sequence of stopping times τn = inf{t ∈ R+ | 0t H 2 (s)ds = n} ∧ T. 13.43. Apply the generalized Itˆo formula. 13.46. Apply the Itˆo formula and Doob’s optional sampling theorem. 13.49. Apply Theorem 3.7. 13.51. Use approximation and Problem 13.50. 13.53. (1) Due to Problem 13.38, P{τa < τb } = (b − x)/(b − a). Prove that Ex
τa ∧τb 0
sign(W (s) − y)dW (s) = 0
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
209
and apply Theorem 13.4 to deduce the identity Ex (L(τa ∧ τb , y)) = |b − y| − |x − y| + (b − x)/(b − a)(|a − y| − |b − y|) and then the required statement. (2) Follows from item (1) and Problem 13.51. 13.57. (1) Apply the generalization of the Itˆo formula to a function F(W (t)) = exp {iuW (t)}. Denote I(t) := E(F(W (t))/Fs ). Then I(t) satisfies the equation I(t) = F(W (s)) − (u2 /2) st I(θ ) d θ . Therefore, I(t) = F(W (s)) exp {−(u2 /2)(t − s)}, and so E(exp {iu(W (t) −W (s))}/Fs ) = exp {−(u2 /2)(t − s)}. t (2) Put W (t) := 0 ((dM(s))/(α (s))). Verify that {W (t), t ≥ 0} is a Wiener process. 13.58. A process Yt is a square integrable martingale with quadratic characteristic Y t = 0t sign 2 (W (s))ds. Use Problem 3.21 and prove that Y t = t. Finally apply Problem 13.57 (L´evy theorem). 13.59. See hint to Problem 13.58. 13.60. Due to Problem 7.44, a random variable A−1 (t) is a stopping time. Check that (t) = 0, EW 2 (t) = t, and use the L´evy (t) is a continuous FA−1 (t) -martingale, EW W theorem (Problem 13.57). To prove the change of variable formula, approximate a process b(t) by simple processes, prove the formula for them and apply the result of Problem 13.14.
Answers and Solutions 13.13. (a)
(b)
(c)
13.16.
W 2 (1) − 1 , 2 W 2 (1) + 1 , 2 W 2 (1) . 2 (2n − 1)!! n+1 T . n+1
13.32. m(t) = X0 eα t . 13.34. For almost all s ∈ R+ and ω ∈ Ω , the value μ (s, ω ) − ((σ 2 (s, ω ))/2) should (a) equal 0; (b) ≥ 0: (c) ≤ 0. 13.36. Put X(t) := E(W 4 (1)/FtW ), 0 ≤ t ≤ 1. Due to the Markov property of a Wiener process, X(t) = E(W 4 (1)/W (t)). The conditional distribution of W (1) given W (t) is Gaussian, N(W (t), 1 − t), thus (1) −W (t))4 /W (t) X(t) = E (W(1) −W (t) +W (t))4 /W (t) = E (W + 4E (W (1) −W (t))3W (t)/W (t) + 6E (W (1) −W (t))2W 2 (t)/W (t) + 4E (W (1) −W (t))W 3 (t)/W (t) +W 4 (t) = 3(1 − t)2 + 6(1 − t)W 2 (t) + W 4 (t).
210
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
Hence, due to the Itˆo formula X(s) = X(0) + 0s (12(1 − t)W (t) + 4W (t)3 )dW (t) (check this). Finally, X(0) = EW 4 (1) = 3. 13.37. Consider the adapted function f (s, ω ) := 1Iτ (ω )≥s . Then P 0∞ f 2 (s)ds < ∞ t n = P(τ < ∞) = 1. Let us show that 0 f (s)dW (s) = W (t ∧ τ ) a.s. t Put τn = k/2 n n if (k − 1)/2 ≤ τ ≤ k/2 , τn = ∞ if τ = ∞. Consider integrals 1I dW (s) = ∞ ∞ 0 τn ≥s n for some i, then t 1I 1 I dW (s). If t = i/2 dW (s) = 1 I dW (s) = τ ∧t≥s τ ≥s τ ∧t≥s n n n 0 0 0 stochastic integral and Wiener process, the last equalWτn ∧t . Due to continuity of the ity is satisfied for all t. Next, 0∞ E(1Iτn ≥s −1Iτ ≥s )2 ds = 0∞ (P(s ≤ τn ) − P(s ≤ τ )) ds = t Eτn − Eτ ≤ 1/2n → 0, n → ∞, so 0t 1Is≤τ dW (s) = l.i.m. τ n→∞ 0 1Is≤τn dW ∞ (s) = W ( τ ∧t) = W ( τ ∧t). Then P-a.s. W ( τ ) = 1 I dW (s) = l.i.m.n→∞ n s≤ τ 0 1Is≤τ dW (s), 0 and E 0∞ 1I2s≤τ ds = Eτ < ∞. So, EW (τ ) = E 0∞ 1Is≤τ dW (s) = 0, and EW 2 (τ ) = E( 0∞ 1Is≤τ dW (s))2 = 0∞ E1I2s≤τ ds = Eτ . 13.44. One can verify that random variable τt is a stopping time for each t > 0. One can also check that Mτt = t a.s. Let us show that {Mτt , Fτt , t ∈ R+ } is a square integrable martingale. Define a localizing sequence
σk := inf{t > 0| |Mt | > k}. R+ }
is a bounded martingale, and hence, by Theorem 7.5 a Then {Mt∧σk , Ft , t ∈ process {Mτt ∧σk , Fτt , t ∈ R+ } is a bounded martingale as well. According to Problem 13.43 τt ∧σk M 2 (τt ∧ σk ) = 2 M(s)dM(s) + M(τt ∧ σk ). 0
+2 it holds that It was mentioned in Theoretical grounds in Chapter 7 that for M ∈ M 2 2 M (t) − M(t) is a martingale. Therefore, M (t ∧ σk ) −M(t ∧ σk ), and so M 2 (τt ∧ τ ∧σ σk ) − M(τt ∧ σk ) are martingales. That is, a process 0 t k M(s)dM(s) is a martingale, moreover a bounded martingale. Thus its expectation is zero and EM 2 (τt ∧ σk ) = EM(τt ∧ σk ) ≤ EM(τt ) ≤ t. Due to Fatou’s lemma {M(τt ), Fτt , t ∈ R+ } is a square integrable martingale. Due to now to prove Problem 13.57, it suffices that M(τt ) is a continuous process and E (M(τt ) − M(τs ))2 /Fτs = t − s. Let us first prove the last relation. It is easy to see that E (M(τt ) − M(τs ))2 /Fτs = M τ (t) − M τ (s), where M τ is a quadratic characteristic of a martingale M(τt ). Due to the generalization of the Itˆo formula, M 2 (τt ) = 2 0τt M(s)dM(s) + t. We can con2 sider the τtlast relation as a Doob–Meyer decomposition for supermartingale M (τt ), where 0 M(s)dM(s) is a local martingale, and A(t) = t is nonrandom and thus a predictable nondecreasing process. Because of uniqueness of such decomposition M τ (t) = t. Finally, let us prove the continuity M(τt ). A function τt is right continuous. Therefore M(τt ) has right continuous trajectories a.s. Note that M(τt − ) = M(τt ) = t and M is a continuous process. Denote by A ⊂ Ω the set on which the trajectories are not continuous. Then A = {ω ∈ Ω | there exists t > 0 such that τt − = τt , and M(τt − ) = M(τt )} ⊂ r,s∈Q {ω ∈ Ω | there exists t > 0 such that τt − < r < s < τt , M(r) = M(s), M(r) = M(s)} ⊂ r,s∈Q, 0 M(r)}. Then {M(r) = M(s), M(r) = M(s)} = {σ ≥ s, M(r) = M(σ ∧s)} and M(σ ∧s∧ σk )−M(r ∧ σk ) = 0. Then (M(σ ∧s∧ σk ))2 −
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
(M(r ∧ σk ))2 =
σ ∧s∧σk r∧σk
211
M(u)dM(u), and the right-hand side has zero expectation,
and the expectation of the left-hand side equals E (M(σ ∧ s ∧ σk ) − M(r ∧ σk ))2 . Tend k → ∞ and apply the Fatou lemma: E(M(σ ∧ s) − M(r))2 = 0. It is easy to deduce now that P(A) = 0. 13.45. Let us write down a generalization of the Itˆo formula for a localizing sequence {τn , n ≥ 1} (see also Problem 13.43), M 2 (t ∧ τn ) = 2
t∧τn
t∧τn
0
M(s)dM(s) + M(t ∧ τn ),
and M(s)dM(s) is a martingale (see Theoretical grounds to Chapter 7). So 0 E 0t∧τn M(s)dM(s) = 0, hence EM 2 (t ∧ τn ) ≤ EM(t) < ∞. The application of Fatou’s lemma completes the proof. n μ (t, 13.46. (3) E f (τ , Xτ ) = f (0, X0 )+E 0τ (L f + ∂ f /∂ t) (u, Xu )du, where L = Σi=1 i 1 n 2 m x)(∂ f (t, x))/∂ xi + 2 Σi, j=1 σi j (t, x)(∂ f (t, x))/∂ xi ∂ x j with σi j := Σk=1 bik b jk . 13.47. Put fx (y) = (y − x)+ . Define approximations fxε (y)(ε > 0) of the function fx (y): ⎧ if y ≤ x − ε , ⎨ 0, fxε (y) = (y − x + ε )2 /4ε , if x − ε ≤ y ≤ x + ε , ⎩ y − x, if y ≥ x + ε . There exists a sequence ϕn ∈ C∞ (R) of functions, with compact supports that contract to {0}, and such that gn := ϕn ∗ fxε (i.e., gn (y) = R fxε (y−z)ϕn (z)dz) satisfy relations: gn → fxε and gn → fxε uniformly on R, and gn → fxε pointwise except at points x± ε . Notice that gn ∈ C∞ (R). For example, we can put ϕn (y) = nϕ (ny), where ϕ (y) = c exp{−(1 − y2 )−1 } for |y| < 1 and ϕ (y) = 0 for |y| ≥ 1; and a constant c is 1 such that c −1 ϕ (y)dy = 1. Apply the Itˆo formula to gn : t
1 t g (W (s))ds. 2 0 n 0 For all t a sequence 1Is∈[0,t] gn (W (s)) converges as n → ∞ to 1Is∈[0,t] fxε (W (s)) uni formly on R+ × Ω . So, 0t gn (W (s))dW (s) converges in L2 (P) to 0t fxε (W (s))dW (s). Observe that limn→∞ gn (W (s)) = fxε (W (s)) a.s. for each s ∈ R+ because P(W (s) = x ± ε ) = 0 for all ε > 0. Due to Fubini’s theorem (applyied to a product of measures P × λ 1 ) this limit relation holds a.s. for almost all s ∈ R+ with respect to Lebesgue measure λ 1 . Because |gn | ≤ (2ε )−1 , the Lebesgue dominated theorem implies the convergence 0t gn (W (s))ds → 0t fxε (W (s))ds in L2 (P) and a.s. So, for each x and t t 1 t 1 fxε (W (s))dW (s) + 1I ds. a.s. fxε (W (t)) − fxε (W (0)) = 2 0 2ε W (s)∈(x−ε ,x+ε ) 0 (13.5) + as (0)−x) A sequence fxε (W (t))− fxε (W (0)) converges L2 (P) to (W (t)−x)+ −(W t ε ↓ 0 because | fxε (W (t))− fxε (W (0))| ≤ |W (t)−W (0)|. Moreover E 0 ( fxε (W (s))− √ 1IW (s)∈[x,∞) )2 ds ≤ E 0t 1IW (s)∈(x−ε ,x+ε ) ds ≤ 0t (2ε / 2π s)ds → 0 as ε → 0. Theregn (W (t)) − gn (W (0)) =
gn (W (s))dW (s) +
fore 0t fxε (W (s))dW (s) converges in L2 (P) to required statement.
1 0
1IW (s)∈[x,∞) dW (s). This implies the
212
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
13.48. A process (−W ) is also a Wiener process. So it possesses a local time at the point (−x). Let us denote it by L− (t, −x). Applying Definition 13.1 to L− (t, −x) we obtain that L− (t, −x) = L(t, x) a.s. Combine the last equality and the application of the result of Problem 13.47 to (−W ) and (−x) instead of W and x. Then we get t 1 (W (t) − x)− − (W (0) − x)− = − 1IW (s)∈(−∞,x] dW (s) + L(t, x) 2 0 (check the last equality). Add this equality to the equality from Problem 13.47, and obtain the required statement because 01 1IW (s)=x dW (s) = 0 a.s. 13.50. Without loss of generality we may assume that L(t, x) is continuous in (t, x) (see Problem 13.49). Denote I(t, x) = 0t 1IW (s)∈[x,∞) dW (s), I(t, x) is continuous in (t, x). Then (Problem 13.47) 1 L(t, x) = (W (t) − x)+ − (W (0) − x)+ − I(t, x). (13.6) 2 x+ε Let fxε (z) = (1/2ε ) x− ε 1Iz∈[y,∞) dy (see solution of Problem 13.47). Due to the stochastic Fubini theorem (Theorem 13.5), we obtain t 1 t x+ε 0 f xε (W (s))dW (s) = 2ε 0 x−ε 1IW (s)∈[y,∞) dy dW (s) 1 x+ε t
=
2ε x−ε
0 1IW (s)∈[y,∞) dW (s) dy
=
1 x+ε 2ε x−ε I(t, y)dy
a.s.
(Verify that conditions of the stochastic Fubini theorem are really satisfied.) Substituting the received identity into formula (13.5) we obtain that a.s. 1 x+ε 1 t 1 fxε (W (t)) − fxε (W (0)) − I(t, y)dy = 1I ds. (13.7) 2ε x−ε 2 0 2ε W (s)∈(x−ε ,x+ε ) Let us integrate the last equality with respect to x. Then b" # 1 x+ε fxε (W (t)) − fxε (W (0)) − I(t, y)dy dx 2ε x−ε a 1 t b 1I dxds a.s. (13.8) = 4ε 0 a x−ε <W (s)<x+ε For any z ∈ R
1 b 1I(x−ε ,x+ε ) (z)dx = 1Iz∈(a,b) + 1Iz=a + 1Iz=b . (13.9) ε ↓0 2ε a Make ε ↓ 0 in (13.8). A function I is continuous, so identity (13.9) implies that b 1 1 (W (t) − x)+ − (W (0) − x)+ − I(t, x) dx = 1I ds a.s. 2 0 W (s)∈(a,b) a The required statement follows from the last identity and (13.6). 13.52. Let us apply equality (13.8). Its left-hand side is continuous in ε > 0 and the right-hand side is left-continuous in ε > 0. So, identity (13.8) holds for all ε > 0 simultaneously, for a.a. ω . The left-hand side converges to 12 L(t, x) as ε ↓ 0 (see (13.6)) because I(t, · ) is continuous. That is what was to be demonstrated. 13.54. Let m ∈ N. Apply the multidimensional Dynkin formula to the process X, τ = τm = τR ∧m and a bounded function f ∈ C2 (R) such that f (x) = x2 as x ≤ R. Observe that L f (x) = 12 f (x) = n when x ≤ R. Therefore lim
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula E f (X(τm )) = f (a) + 12 E 0τm f (X(s))ds τm 2 2
= a + E
0
213
nds = a + nEτm .
So Eτm for all m ∈ N. Thus τR = limm→∞ τm < ∞ a.s. and EτR = (1/n)(R2 − a2 ). 13.55. Let σk be the first exit time from the ring Ak = {x ∈ Rn | R ≤ x ≤ 2k R}, k ∈ N. Let also fn,k ∈ C2 (R) have a compact support and − ln |x|, n = 2, fn,k (x) = |x|2−n , n > 2, ≤ (1/n)(R2 − a2 )
as R ≤ x ≤ 2k R. Because fn,k = 0 on a set Ak , Dynkin’s formula implies E fn,k (X(σk )) = fn,k (a) for all k ∈ N. We put pk := P(|X(σk )| = R), qk = P(|X(σk n > 2 separately. If n = 2 then due to (13.10)
)| = 2k R),
(13.10)
and consider cases n = 2 and
− ln R · pk − (ln R + k ln 2)qk = − ln a, k ∈ N. So qk → 0 as k → ∞, and P(σk < ∞) = 1. If n > 2, then (13.10) implies pk · R2−n + qk (2k · R)2−n = a2−n . Because 0 ≤ qk ≤ 1, a 2−n . lim pk = P(σk < ∞) = k→∞ R 13.56. f1 (x1 , x2 ) = −0, 5x1 ; f2 (x1 , x2 ) = −0, 5x2 .
14 Stochastic differential equations
Theoretical grounds Consider a complete filtration {Ft ,t ∈ [0, T ]} and an m-dimensional Wiener process {W (t),t ∈ [0, T ]} with respect to it. By definition, a stochastic differential equation (SDE) is an equation of the form dX(t) = b(t, X(t))dt + σ (t, X(t))dW (t), 0 ≤ t ≤ T,
(14.1)
with X0 = ξ , where ξ is an F0 -measurable random vector, b = b(t, x) : [0, T ] × Rn → Rn , and σ = σ (t, x) : [0, T ] × Rn → Rn×m are measurable functions. Equality (14.1) is simply a formal writing of the stochastic integral equation X(t) = X0 +
t 0
b(s, X(s))ds +
t
σ (s, X(s))dW (s), 0 ≤ t ≤ T.
(14.2)
0
Definition 14.1. A strong solution to stochastic differential equation (14.1) on the interval 0 ≤ t ≤ T is an Ft -adapted Rm -value process {X(t), 0 ≤ t ≤ T } with a.s. continuous paths and such that after its substitution into the left- and right-hand sides of relation (14.2), for each 0 ≤ t ≤ T the equality holds with probability 1. Definition 14.2. Equation (14.1) has a unique strong solution in the interval [0, T ] if the fact that processes X = X(t) and Y = Y (t), t ∈ [0, T ], are strong solutions to the given equation with the same initial condition, implies that X is a modification of Y (then these continuous processes do not differ, i.e., P(X(t) = Y (t),t ∈ [0, T ]) = 1). Theorem 14.1. Assume both Lipschitz and linear growth conditions |b(t, x) − b(t, y)| + |σ (t, x) − σ (t, y)| ≤ L|x − y|, x, y ∈ Rn ,t ∈ [0, T ]; |b(t, x)|2 + |σ (t, x)|2 ≤ L(1 + |x|2 ), t ∈ [0, T ], x ∈ Rn , where L > 0 is a constant. Then the stochastic differential equation has a unique strong solution. Here we denote Euclidean norm of both vector and matrix by the symbol | · |; that is,
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 215 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 14,
216
14 Stochastic differential equations
|b(t, x)| =
n
∑ |bk (t, x)|2
1/2
, |σ (t, x)| =
k=1
n,m
∑
|σk j (t, x)|2
1/2
.
k=1, j=1
Theorem 14.1 gives the simplest conditions for existence and uniqueness of a strong solution; those are the classical Lipschitz and linear growth conditions. There exist generalizations of the theorem to the case where the Lipschitz condition is replaced by a weaker one. Theorem 14.2. Consider a scalar equation with homogeneous coefficients X(t) = X0 +
t 0
b(X(s))ds +
t
σ (X(s))dW (s), 0 ≤ t ≤ T,
(14.3)
0
and assume that its coefficients satisfy the next conditions. (1) The functions b(x) and σ (x) are bounded. (2) There exists a strictly increasing function ρ (u) on [0, ∞) such that ρ (0) = 0, 0+ ρ −2 (u) = ∞, and |σ (x) − σ (y)| ≤ ρ (|x − y|) for all x, y ∈ R (it is Yamada’s condition [91], see also [38]). (3) There exists an increasing convex function ς (u) on [0, ∞), such that ς (0) = 0, −1 0+ ς (u) = ∞, and |b(x) − b(y)| ≤ ς (|x − y|) for all x, y ∈ R. Then equation (14.3) has a unique strong solution. In particular, one may take ρ (u) = uα , α ≥ 12 , and ς (u) = Cu. Now, we pass to the definition of a weak solution. Assume that only nonrandom coefficients b(t, x) and σ (t, x) are given, and at the moment there is no stochastic object at hand. Theorem 14.3. Consider a scalar equation (14.3) with homogeneous coefficients and assume that its coefficients satisfy the Lipschitz and linear growth conditions. Then a process X has a strong Markov property. Definition 14.3. If on a certain probability space (Ω , F, P) one can construct a W (t)), t ∈ [0, T ]}, which are filtration {Ft , t ∈ [0, T ]} and two processes {(X(t), (t), t ∈ [0, T ]} is a Wiener process, and adapted to the filtration and such that {W is a solution to equation (14.1) in which W is changed for W , then it is said that X(t) equation (14.1) has a weak solution. Theorem 14.4. Let coefficients b(t, x) and σ (t, x) be measurable locally bounded functions, continuous in x for each t ∈ [0, T ], and moreover |b(t, x)|2 + |σ (t, x)|2 ≤ L(1 + |x|2 ), t ∈ [0, T ], x ∈ Rn . Then equation (14.1) has a weak solution. Remark 14.1. Throughout this and the next chapters by a Wiener process we mean a process W that satisfies the usual definition of a Wiener process, except, maybe, for the condition W (0) = 0. However, if no initial condition is specified, it is assumed that W (0) = 0.
14 Stochastic differential equations
217
Bibliography [9], Chapter VIII; [38], Chapter IV; [90], Chapter 12, §§12.4–12.5; [24], Volume 3, Chapters II and III; [25], Chapter VIII, §§2–4; [26]; [51], Chapter 19; [57], Chapter 4; [79], Chapter 31; [20], Chapter 14; [8], Chapter 7, §7.5; [46], [49], Chapter 21; Chapter 12, §12.5; [54], Chapter 3, §3.5; [61], Chapter V; [68], Chapter 13, §13.1; [85], Chapters 9, 10, and 15.
Problems 14.1. Let {W (t), t ∈ R+ } be a one-dimensional Wiener process. Prove that the next processes are the solutions to corresponding stochastic differential equations. (a) X(t) = eW (t) is a solution to an SDE dX(t) = 12 X(t)dt + X(t)dW (t). (b) X(t) = (W (t))/(1 + t) with W (0) = 0 is a solution to an SDE dX(t) = −(1/(1 + t))X(t)dt + (1/(1 + t))dW (t), X(0) = 0. (c) X(t) = sinW (t) with W (0) = a ∈ (−(π /2).(π /2)) is a solution to an SDE dX(t) = − 12 X(t)dt +(1−X 2 (t))1/2 dW (t) for t < τ (ω ) = inf{s > 0|W (s) ∈ [−(π /2), (π /2)]}. (d) (X1 (t), X2 (t)) = (t, et W (t)) is a solution to an SDE 0 dX1 (t) 1 = dt + X1 (t) dW (t), X2 (t) dX2 (t) e (e) (X1 (t), X2 (t)) = (chW (t), shW (t)) is a solution to an SDE 1 X1 (t) dX1 (t) X2 (t) = dt + dW (t). dX2 (t) X1 (t) 2 X2 (t) 14.2. Prove that the process X(t) = X0 exp{(r − (σ 2 /2))t + σ W (t)} is a strong solution to an SDE dX(t) = rX(t)dt + σ X(t)dW (t), X(0) = X0 , t ∈ R+ , and find an equation that is satisfied by the process X(t) = X0 exp{rt + σ W (t)}. 14.3. Let the process {X(t),t ∈ R+ } be a solution to an SDE dX(t) = (μ1 X(t) + μ2 )dt + (σ1 X(t) + σ2 )dW (t), X0 = 0, t ∈ R+ . (1) Find an explicit& form for X(t). ' (2) Let S(t) = exp (μ1 − (σ12 /2))t + σ1W (t) , where W is the same Wiener process that is written in the equation for X. (a) Prove that the process {S(t)} is a strong solution to an SDE dS(t) = μ1 S(t)dt + σ1 S(t)dW (t), t ∈ R+ . (b) Find a stochastic differential equation that is satisfied by the process {S−1 (t)}. (3) Prove that d(X(t)S−1 (t)) = S−1 (t) ((μ2 − σ1 σ2 )dt + σ2 dW (t)) . 14.4. Let {W (t) = (W1 (t), . . . ,Wn (t)), t ∈ R+ } be an n-dimensional Wiener process. Find a solution to SDE n dX(t) = rX(t)dt + X(t) ∑ αk dWk (t) , X(0) > 0. k=1
218
14 Stochastic differential equations
14.5. (1) Prove that the process X(t) = α (1 − t/T ) + β t/T + (T − t)
t 0
dW (s) , 0 ≤ t ≤ T, T −s
is a solution to SDE
β − X(t) dt + dW (t), t ∈ [0, T ], X0 = α . T −t (2) Prove that X(t) → β as t → T −, a.s. (3) Prove that the process X is a Brownian bridge over the interval [0, T ] with fixed endpoints X(0) = α and X(T ) = β . (A standard Brownian bridge is obtained with T = 1, α = β = 0, see Example 6.1 and Problems 6.13, 6.21.) dX(t) =
14.6. Solve differential the next stochastic equations. dX1 (t) 1 1 0 dW1 (t) (a) = dt + , 0 0 X1 (t) dX2 (t) dW2 (t) where W (t) = (W1 (t),W2 (t)) is a two-dimensional Wiener process. (b) dX(t) = X(t)dt + dW (t). (c) dX(t) = −X(t)dt + e−t dW (t). 14.7. (1) Solve the Ornstein–Uhlenbeck equation (or the Langevin equation) dX(t) = μ X(t)dt + σ dW (t), μ , σ ∈ R. The solution is called the Ornstein–Uhlenbeck process (cf. Problem 6.12). (2) Find EX(t) and DX(t) (cf. Example 6.2). 14.8. (1) Solve the mean-reverting Ornstein–Uhlenbeck equation dX(t) = (m − X(t))dt + σ dW (t), μ , σ ∈ R. Here the coefficient m is the “mean value”. Respectively, a solution to this equation is called the mean-reverting Ornstein–Uhlenbeck process. (2) Find EX(t) and its asymptotic behavior as t → ∞, and also find DX(t). 14.9. Solve an SDE dX(t) = rdt + α X(t)dW (t), r, α ∈ R. 14.10. Solve the next stochastic differential equations: . . (1) dX(t) = ( 1 + X 2 (t) + 1/2X(t))dt + 1 + X 2 (t)dW (t), 2 X(t) − a(1 + t)2 dt + a(1 + t)2 dW (t). (2) dX(t) = 1+t 14.11. Consider a linear SDE dX(t) = A(t)x(t)dt + σ dW (t), X(0) = x, where A(t) is a nonrandom continuous on R+ function, moreover A(t) ≤ −α < 0 for all t ≥ 0. Prove that σ2 σ2 σ 2 −2α t e (1 − e−2α t ) ≤ + x2 − . EX 2 (t) ≤ e−2α t x2 + 2α 2α 2α
14 Stochastic differential equations
219
14.12. Solve a two-dimensional SDE, dX1 (t) = X2 (t)dt + α dW1 (t), dX2 (t) = X1 (t)dt + β dW2 (t), where (W1 (t),W2 (t)) is a two-dimensional Wiener process and α , β ∈ R. 14.13. (1) Solve a system of stochastic differential equations dX(t) = Y (t)dt, dY (t) = −β X(t)dt − α Y (t)dt + σ dW (t), where α , β , and σ are positive constants. (2) Show that in the case where the vector (X(0),Y (0)) has a joint Gaussian distribution, the vector process (X,Y ) is Gaussian. Find its covariance function. 14.14. (Feynman–Kac formula) Let {X(t), Ft ,t ∈ R+ } be a diffusion process that admits a stochastic differential dX(t) = μ (t, X(t))dt + σ (t, X(t))dW (t), 0 ≤ t ≤ T, and let there exist a solution f (t, x), (t, x) ∈ [0, T ] × R, to a partial differential equation ∂f + L f (t, x) = r(t, x) f (t, x), 0 ≤ t ≤ T, ∂t with boundary condition f (T, x) = g(x). (Here r(t, x) ∈ C([0, T ] × R), r ≥ 0, and the operator L is introduced in Problem 13.46.) Prove that T / f (t, x) = E g(X(T ))e− t r(u,X(u))du X(t) = x . 14.15. Prove that the next one-dimensional SDE has a unique strong solution: dX(t) = log(1 + X 2 (t))dt + 1IX(t)>0 X(t)dW (t), X0 = a ∈ R. 14.16. Let a, c, d be real constants and a > 0. Consider a one-dimensional SDE, dX(t) = (cX(t) + d)dt + (2aX(t) ∨ 0)1/2 dW (t).
(14.4)
(1) Prove that for any initial value X(0), the equation has a unique strong solution. (2) Prove that in the case d ≥ 0 and X(0) ≥ 0, the solution is nonnegative; that is, X(t) ≥ 0 for all t ≥ 0, a.s. 14.17. (Gronwall–Bellman lemma) Let {x(t), 0 ≤ t ≤ T } be a nonnegative contin uous on [0, T ] function that satisfies the inequality x(t) ≤ a + b 0t x(s)ds, t ∈ [0, T ], a, b ≥ 0. Prove that x(t) ≤ aebt , t ∈ [0, T ]. 14.18. Let the assumptions of Theorem 14.1 hold. Prove that under an additional assumption E|ξ |2 < ∞, the unique solution X to equation (14.1) satisfies the inequality E|X(t)|2 ≤ k1 exp(k2t), 0 ≤ t ≤ T with some constants k1 and k2 . 14.19. Let {Xs,x (t),t ≥ s}, x ∈ Rd be a solution to SDE (14.1) for t ≥ s, with initial condition Xs,x (s) = x. Assume that the coefficients of the equation satisfy both Lipschitz and linear growth conditions. Prove that for any T > 0 and p ≥ 2, there exists a constant c such that for Xs,x (t) the following moment bounds are valid.
220
14 Stochastic differential equations
(1) ∀ s,t ∈ [0, T ], s ≤ t ∀ x ∈ Rd : E|Xs,x (t)| p ≤ c(1 + |x| p ). (2) ∀ s,t, 0 ≤ s ≤ t ≤ T ∀ x1 , x2 : E|Xs,x1 (t) − Xs,x2 (t)| p ≤ c|x1 − x2 | p . (3) ∀ x ∀ s,t1 ,t2 ∈ [0, T ], s ≤ t1 ∧ t2 : E|Xs,x (t1 ) − Xs,x (t2 )| p ≤ c(1 + |x| p )|t1 − t2 | p/2 . (4) ∀ x ∀ s1 , s2 ,t ∈ [0, T ], t ≥ s1 ∨ s2 : E|Xs1 ,x (t) − Xs2 ,x (t)| p ≤ c(1 + |x| p )|s1 − s2 | p/2 . 14.20. Assume the conditions of Problem 14.19. Prove that the process {Xs,x (t), t ≥ s}, x ∈ Rn has a modification that is continuous in (s,t, x). 14.21. Let (Ω , F, {Ft }t∈[0,T ] , P) be a filtered probability space, {W (t), Ft ,t ∈ [0, T ]} be a Wiener process, and {γ (t), Ft ,t ∈ [0, T ]} be a progressively measurable stochastic process with P{ 0T γ 2 (s)ds < ∞} = 1. Consider an SDE dX(t) = γ (t)X(t)dW (t), X(0) = 1. (1) Prove that there exists a nonnegative continuous solution to the equation, which is unique and given by the formula X(t) = exp{ 0t γ (s)dW (s) − 12 0t γ 2 (s)ds}. The process X(t) is called a stochastic exponent (see Problem 13.30). (2) Prove that the process X is a supermartingale and EX(t) ≤ 1. 14.22. (Novikov’s condition for martingale property of stochastic exponent) Let the conditions of Problem 14.21 hold. Then under E exp{ 12 0T γ 2 (s)ds} < ∞, a supermartingale X(t, γ ) := exp{ 0t γ (s)dW (s) − 12 0t γ 2 (s)ds} is a martingale and EX(t, γ ) = 1. 14.23. Assume the conditions of Problem 14.21 and for a certain δ > 0 let it hold that sup E exp{δ γ 2 (t)} < ∞. t≤T
Prove that EX(T, γ ) = 1, t ∈ [0, T ]. 14.24. Let under the conditions of Problem 14.21, γ be a Gaussian process with supt≤T E|γ (t)| < ∞ and supt≤T Dγ (t) < ∞. Prove that EX(T, γ ) = 1. 14.25. (Girsanov theorem for continuous time) Let under the conditions of item (1) of Problem 14.21, it hold EX(t, γ ) = 1. Define a probability measure Q by an equality dQ= X(t, γ )dP. Prove that on the probability space (Ω , F,Q) a stochastic process (t) := Wt − t γ (s)ds is Wiener with respect to the filtration {Ft }t∈[0,T ] . W 0 14.26. Let a : Rn → Rn be a bounded measurable function. Construct a weak solution {X(t) = Xx (t)} to an SDE dX(t) = a(X(t))dt + dW (t), X(0) = x ∈ Rn . 2 14.27. Let W be a Wiener ( process. Prove that X(t) := W (t) is a weak solution to an SDE dX(t) = dt + 2 |X(t)|dW (t) where W is another Wiener process.
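The qualitative statements of Problem 14.16 can be explored numerically. The following sketch (Python; not part of the original text) applies the Euler–Maruyama scheme to equation (14.4); the parameter values a = 1, c = −1, d = 0.5, the step size, and the seed are illustrative assumptions only. For d ≥ 0 and X(0) ≥ 0 the simulated paths should stay, up to discretization error, nonnegative.

import numpy as np

def euler_maruyama_14_4(x0, a, c, d, T=1.0, n_steps=10000, seed=0):
    # Euler-Maruyama scheme for dX = (cX + d) dt + (2a X v 0)^(1/2) dW, equation (14.4).
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        x[k + 1] = x[k] + (c * x[k] + d) * dt + np.sqrt(2.0 * a * max(x[k], 0.0)) * dw
    return x

path = euler_maruyama_14_4(x0=0.2, a=1.0, c=-1.0, d=0.5)
# The discretized path may dip slightly below 0 because of the finite time step;
# the violation should vanish as n_steps grows.
print("min of simulated path:", path.min())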
14.28. Let
X(t) = x0 + ∫_0^t a(s)dL(s) + ∫_0^t b(s)dW(s),
where L(t), t ≥ 0 is a continuous nondecreasing adapted process. Consider the processes β, A^{−1}, and W̃ from Problem 13.60 and introduce the processes X̃(t) = X(A^{−1}(t)) and L̃(t) = L(A^{−1}(t)). Prove that
dX̃(t) = a(A^{−1}(t))dL̃(t) + b(A^{−1}(t))β^{−1/2}(A^{−1}(t))dW̃(t).
In particular, if β(t) = c(X(t)), a(t) = α(X(t)), and b(t) = σ(X(t)), where α, σ, c are nonrandom functions, c(x) > 0, and L(t) = t, then the process X̃(t) is a solution to the SDE
dX̃(t) = α(X̃(t))c^{−1}(X̃(t))dt + σ(X̃(t))c^{−1/2}(X̃(t))dW̃(t).
14.29. Let measurable processes α(t) and β(t), t ≥ 0 be adapted to the σ-algebra generated by a Wiener process and ∃ c, C > 0 ∀ t ≥ 0 : |α(t)| ≤ C, c ≤ β(t) ≤ C. Prove that for a process X with dX(t) = α(t)dt + β(t)dW(t), t ≥ 0, it holds with probability 1 that lim sup_{t→∞} |X(t)| = +∞.
14.30. Let X(t),t ≥ 0 satisfy SDE, dX(t) = a(X(t))dt + b(X(t))dW (t), t ≥ 0, where a, b : R → R satisfy the Lipschitz condition. Let b(x) > 0 for all x ∈ [x1 , x2 ] and / X(0) = x0 ∈ [x1 , x2 ]. Prove that with probability 1 the exit time τ = inf{t ≥ 0| X(t) ∈ (x1 , x2 )} of the process X from the interval [x1 , x2 ] is finite and Eτ m < ∞ for all m > 0. 14.31. Let {X(t),t ≥ 0} be a solution to SDE (14.3) with initial condition X(0) = x, where b, σ : R → R satisfy the Lipschitz condition and σ (x) = 0, x ∈ R. For the process X, prove that the probability pab (x), x ∈ (a, b), to hit the point a before the point b equals (s(b) − s(x))/(s(b) − s(a)), where y x 2b(z) dz dy exp − s(x) = 2 c1 c2 σ (z) with arbitrary constants c1 and c2 . 14.32. Find the probability pab (x), x ∈ (a, b) to hit the point a before the point b for a process X that satisfies the next stochastic differential equations with initial condition X(0) = x. (a) dX(t) = dW (t). (b) dX(t) = dW (t) + Kdt. (c) dX(t) = (2 + sin X(t))dW (t). (d) dX(t) = AX(t)dt + BX(t)dW (t) with B = 0 and A > 0. (e) dX(t) = (A/(X(t)))dt + dW (t), where A > 0.
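For Problems 14.31 and 14.32 the probability p_ab(x) of hitting a before b can be checked numerically: compute s by quadrature and compare (s(b) − s(x))/(s(b) − s(a)) with a crude Monte Carlo estimate. The sketch below (Python) uses the drift b(z) = K, σ(z) = 1 of Problem 14.32 (b) purely as an illustrative choice; the grid sizes, time step, and number of paths are arbitrary assumptions.

import numpy as np

K = 0.5                        # illustrative constant drift, case (b) of Problem 14.32
drift = lambda z: K + 0.0 * z
sigma = lambda z: 1.0 + 0.0 * z
a, b, x0 = 0.0, 2.0, 0.7

def s(x, m=400):
    # s(x) = int_a^x exp( - int_a^y 2 b(z)/sigma^2(z) dz ) dy   (constants c1 = c2 = a)
    ys = np.linspace(a, x, m)
    inner = np.array([np.trapz(2.0 * drift(np.linspace(a, y, m)) / sigma(np.linspace(a, y, m)) ** 2,
                               np.linspace(a, y, m)) for y in ys])
    return np.trapz(np.exp(-inner), ys)

p_formula = (s(b) - s(x0)) / (s(b) - s(a))

def p_monte_carlo(n_paths=2000, dt=1e-3, seed=1):
    rng = np.random.default_rng(seed)
    hits_a = 0
    for _ in range(n_paths):
        x = x0
        while a < x < b:
            x += drift(x) * dt + sigma(x) * np.sqrt(dt) * rng.normal()
        hits_a += x <= a
    return hits_a / n_paths

print("scale-function value:", p_formula, " Monte Carlo:", p_monte_carlo())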
14.33. Find the probability that a path of a Wiener process intersects a straight line y = kt + l with t ≥ 0.
14.34. Prove that with probability 1 the process X(t) from Problem 14.32 (e) does not hit the origin in finite time for A ≥ 1/2.
14.35. Let W be a Wiener process on Rn with n > 1, and x ∈ Rn \ {0}. Prove that the stochastic process X(t) = |x + W(t)| satisfies the SDE
dX(t) = ((n − 1)/(2X(t))) dt + dB(t),
where B(t) is a one-dimensional Wiener process. Use Problem 14.34 and check that P(∃ t > 0| X(t) = 0) = 0.
14.36. Let coefficients of an SDE and a function s satisfy the conditions of Problem 14.31. Prove that:
(a) If lim_{x→+∞} s(x) = +∞ and lim_{x→−∞} s(x) = −∞, then P(sup_{t≥0} X(t) = +∞) = P(inf_{t≥0} X(t) = −∞) = 1.
(b) If lim_{x→+∞} s(x) = +∞ and lim_{x→−∞} s(x) = c ∈ R, then P(sup_{t≥0} X(t) < ∞) = P(inf_{t≥0} X(t) = −∞) = P(lim_{t→+∞} X(t) = −∞) = 1.
(c) If there exist finite limits lim_{x→−∞} s(x) = s(−∞) and lim_{x→+∞} s(x) = s(+∞), then P(sup_{t≥0} X(t) < ∞ / X(0) = x) = P(inf_{t≥0} X(t) = −∞ / X(0) = x) = (s(+∞) − s(x))/(s(+∞) − s(−∞)).
14.37. Assume that coefficients of an SDE satisfy the conditions of Problem 14.31. Prove that:
(a) P(lim sup_{t→∞} |X(t)| = ∞) = 1.
(b) P(lim inf_{t→∞} X(t) = −∞, lim sup_{t→∞} X(t) ∈ R) = 0.
(c) With probability 1 one of the three disjoint events occurs: either lim_{t→∞} X(t) = +∞, or lim_{t→∞} X(t) = −∞, or {lim inf_{t→∞} X(t) = −∞, lim sup_{t→∞} X(t) = +∞}; that is,
P(lim inf_{t→∞} X(t) = −∞, lim sup_{t→∞} X(t) = +∞) + P(lim_{t→∞} X(t) = −∞) + P(lim_{t→∞} X(t) = +∞) = 1.
14.38. Let coefficients of an SDE satisfy the conditions of Problem 14.31. Denote by τ_{[a,b]} = inf{t ≥ 0| X(t) ∉ [a, b]} the first exit time of the process X from an interval [a, b]. Prove that the function v(x) = E(τ_{[a,b]} /X(0) = x) is finite and equal to
v(x) = − ∫_a^x 2ϕ(y) (∫_a^y dz/(σ²(z)ϕ(z))) dy + (∫_a^b 2ϕ(y) (∫_a^y dz/(σ²(z)ϕ(z))) dy) · (∫_a^x ϕ(z)dz)/(∫_a^b ϕ(z)dz),
where
ϕ(x) = exp{− ∫_a^x (2b(z)/σ²(z)) dz}.
14.39. Let τ be the exit time of a Wiener process from an interval [−a, b] with a > 0 and b > 0. Find Eτ .
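A quick numerical illustration of Problems 14.38–14.39 (a sketch, not part of the original text): for dX(t) = dW(t) the general formula reduces to E(τ_{[−a,b]} | W(0) = x) = (x + a)(b − x), so Eτ = ab for a path started at 0. The Euler step and the number of paths below are arbitrary choices.

import numpy as np

def mean_exit_time(x0=0.0, a=1.0, b=2.0, n_paths=2000, dt=1e-3, seed=2):
    # Monte Carlo estimate of E(tau | W(0) = x0), tau = exit time from [-a, b].
    rng = np.random.default_rng(seed)
    times = np.empty(n_paths)
    for i in range(n_paths):
        x, t = x0, 0.0
        while -a < x < b:
            x += np.sqrt(dt) * rng.normal()
            t += dt
        times[i] = t
    return times.mean()

a, b = 1.0, 2.0
print("Monte Carlo:", mean_exit_time(0.0, a, b), " closed form (x + a)(b - x) at x = 0:", a * b)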
Hints
14.1. Apply the Itô formula.
14.2. Apply the Itô formula to X(t) = (r − (σ²/2))t + σW(t) and F(x) = e^x.
14.3. (1) Note that it is a linear inhomogeneous equation. Apply a method similar to the variation of constants method. (2) Apply the Itô formula.
14.4. Look for a solution in the form X(t) = X(0) exp{at + ∑_{k=1}^n b_k W_k(t)}, apply the Itô formula, and take into account the independence of the components W_k.
14.5. In order to prove that lim_{t→T−} (T − t) ∫_0^t dW(s)/(T − s) = 0 a.s., set M(t) = ∫_0^t dW(s)/(T − s), apply modified Theorem 7.16 (an analogue of Doob's martingale inequality for continuous time), and prove that
P(sup_{T(1−2^{−n}) ≤ t ≤ T(1−2^{−n−1})} (T − t)|M(t)| > ε) ≤ 2ε^{−2} 2^{−n}.
Apply the Borel–Cantelli lemma and obtain that for a.a. ω there exists n(ω) < ∞ such that for all n ≥ n(ω) it holds ω ∉ A_n, where
A_n = {ω | sup_{T(1−2^{−n}) ≤ t ≤ T(1−2^{−n−1})} (T − t)|M(t)| > 2^{−n/4}}.
14.6. (b) Multiply both sides of the equation by e−t and compare with d(e−t W (t)). 14.7, 14.8. Use Problem 14.1. 14.9. Multiply both sides by exp{−α W (t) + 12 α 2t}. 14.10. (1) First solve an SDE, . 1 dX(t) = 1 + X 2 (t)dW (t) + X(t)dt. 2 For this purpose write the Itˆo formula for a function f which is unknown at the moment: ( t f (X(s)) 1 +X 2 (s)dW (s) f (X(t)) = f (X(0)) + 0 t + 0 f (X(s))X(s) ds + 1/2 0t f (X(s))(1 + X 2 (s))ds, and find the function f such that the integrand expression in the Lebesgue integral in the latter equality is identical zero. Then use the fact that in the initial equation the first summands in the drift and diffusion coefficients coincide. (2) For the most part, the reasoning is similar. 14.11. Use the Itˆo formula for X 2 (t), compose an ordinary differential equation for EX 2 (t), and solve it. 14.15. Use Theorem 14.1. 14.16. (1) Check that the conditions of Theorem 14.2 hold true. (2) Separately consider the cases d = 0 and d > 0. In the first case, based on uniqueness of the solution to equation (14.4), prove that X(t) ≡ 0 if X(0) = 0; and if X(0) > 0 then set σ = inf{t| X(t) = 0} and show that X(t) = X(t ∧ σ ). Let d > 0. We set σ−ε = inf{t| X(t) = −ε } where ε > 0 satisfies −cε + d > 0. Suppose that P(σ−ε < ∞) > 0. Then with probability 1, if to choose any r < σ−ε such that X(t) < 0
for t ∈ (r, σ_{−ε}), we have dX(t) = (cX(t) + d)dt in the interval (r, σ_{−ε}); that is, X(t) is growing in this interval, which is impossible.
14.18. Use the Gronwall–Bellman lemma.
14.20. Use the results of the previous problem and check that ∀ R > 0, T > 0, p ≥ 2 ∃ c > 0 ∀ s1, s2, t1, t2 ∈ [−T, T], s1 ≤ t1, s2 ≤ t2 ∀ x1, x2 ∈ Rn, |x1| ≤ R, |x2| ≤ R :
E|X_{s1,x1}(t1) − X_{s2,x2}(t2)|^p ≤ c(|s1 − s2|^{p/2} + |t1 − t2|^{p/2} + |x1 − x2|^p).
Use Problem 3.12 and Theorem 3.7.
14.21. (1) Existence of the solution of the given form can be derived from the Itô formula. Let Y be another continuous solution. Prove by the Itô formula that d((Y(t))/(X(t))) = 0. (2) It follows from Problem 13.31.
14.24. Use Problem 14.23.
14.25. Similarly to Problem 7.96, check that the process {M_t} is a Q-local martingale if and only if M_t X(t, γ) is a local P-martingale. Next, use the Itô formula and obtain that
W̃(t) · X(t, γ) = ∫_0^t X(s, γ)dW(s) + ∫_0^t W̃(s)X(s, γ)γ(s)dW(s);
that is, this process is a local P-martingale. Thus, W̃ is a local Q-martingale. Now, because the quadratic variation [W̃]_t = t, obtain, based on Theorem 7.17, that W̃ is a square integrable Q-martingale.
14.27. Write the Itô formula for X and compare it with the required equation. Use Problem 13.58.
14.28. Make an ordinary change of variables in the first integral:
∫_0^{A^{−1}(t)} a(s)dL(s) = ∫_0^t a(A^{−1}(z))dL̃(z).
For another integral, use Problem 13.60. 14.33. The desired probability equals the probability that the process satisfying dX(t) = dW (t)−k dt and X(0) = −l hits the point 0. Denote τ = inf{t ≥ 0|X(t) = 0}. If k = 0 then P(τ < ∞) = 1. Let, to be specific, k < 0. Then with probability 1 (see Problem 3.18), limt→+∞ X(t) = +∞. Therefore, P(τ < ∞) = 1 if l ≥ 0. For l < 0, P(τ < ∞) = limn→∞ P(X(t) hits 0 before the point n) −kn −kl = limn→∞ e e−kn−e−1 = e−kl . 14.34. Use the reasoning from Problem 14.36. 14.35. Let ζ = inf{t ≥ 0| X(t) = 0} ∪ {+∞}. Then by the Itˆo formula dX(t) =
((n − 1)/(2X(t))) dt + ∑_{k=1}^{n} (W_k(t)/|W(t)|) dW_k(t), t ≤ ζ.
From Problem 13.59 it follows that the process B(t) =
∑_{k=1}^{n} ∫_0^t (W_k(s)/|W(s)|) dW_k(s)
is Wiener. 14.36. (a) Denote by Px the distribution of X under the condition X(0) = x. Then for any x1 , x2 , x1 ≤ x ≤ x2 : Px (supt≥0 X(t) ≥ x2 ) ≥ Px (the process X hits x2 before x1 ) = (s(x) − s(x1 ))/(s(x2 ) − s(x1 )) (see Problem 14.31). Let x1 → −∞.
(b) For any x2 > x: Px(sup_{t≥0} X(t) > x2) ≤ Px(there exists x1 ≤ x such that the process X hits x2 before x1) = lim_{x1→−∞} Px(X(t) hits x2 before x1) = (s(x) − s(−∞))/(s(x2) − s(−∞)). Let x2 → +∞ and use the result of Problem 14.37.
(c) Use the reasoning of item (b) and the result of Problem 14.37.
14.37. (a) With probability 1 the process X(t) exits from any interval (see Problem 14.30).
(b) Use the strong Markov property of a solution to SDE (see Theorem 14.3), the Borel–Cantelli lemma, and Problem 14.31, and check the following. If for some c ∈ R there exists a sequence of Markov times {τn} such that lim_{n→∞} τn = +∞ and X(τn) = c, then P(lim inf_{t→∞} X(t) = −∞) = P(lim sup_{t→∞} X(t) = +∞) = 1. If P(lim inf_{t→∞} X(t) = −∞ and lim sup_{t→∞} X(t) > c) > 0, then take τn = inf{t ≥ σn : X(t) = c} with σ_{n+1} = inf{t ≥ τn : X(t) = c − 1} and σ0 = 0.
(c) See (a) and (b).
14.38. Notice that the function v is twice continuously differentiable and Lv(x) = −1, x ∈ [a, b], v(a) = v(b) = 0, where Lv(x) = b(x)v′(x) + (1/2)σ²(x)v″(x). Then by the Itô formula and the properties of a stochastic Itô integral one has that for any n ∈ N,
E(v(X(n ∧ τ_{[a,b]}))/X(0) = x) = v(x) + E(∫_0^{n∧τ_{[a,b]}} Lv(X(s))ds / X(0) = x) = v(x) − E(n ∧ τ_{[a,b]} / X(0) = x).
Because τ[a,b] < ∞ a.s. (see Problem 14.30), by the dominated convergence theorem the left-hand side of the equality tends to 0, whereas by the monotone convergence theorem the expectation on the right-hand side converges to E(τ[a,b] /X(0) = x). 14.39. Use the result of Problem 14.38 with the process dX(t) = dW (t).
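The martingale property of the stochastic exponent discussed in Problems 14.21–14.24 can be illustrated numerically. The sketch below (Python; an assumption-laden illustration, not a proof) takes the bounded integrand γ(t) = cos(W(t)), for which Novikov's condition clearly holds, simulates X(T, γ) = exp{∫_0^T γ(s)dW(s) − (1/2)∫_0^T γ²(s)ds} on a grid, and checks that the sample mean of X(T, γ) is close to 1.

import numpy as np

def sample_stochastic_exponent(T=1.0, n_steps=1000, n_paths=20000, seed=3):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    w = np.zeros(n_paths)
    log_x = np.zeros(n_paths)
    for _ in range(n_steps):
        gamma = np.cos(w)                    # bounded integrand, evaluated at the left endpoint (Ito sum)
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        log_x += gamma * dw - 0.5 * gamma ** 2 * dt
        w += dw
    return np.exp(log_x)

x_T = sample_stochastic_exponent()
print("E X(T, gamma) ~", x_T.mean())         # should be close to 1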
Answers and Solutions
14.14. By the Itô formula
d f(t, X(t)) = (∂f/∂t + L f)(t, X(t)) dt + dP(t),
where dP(t) = (∂f/∂x)(t, X(t)) σ(t, X(t)) dW(t). Taking into account that f is a solution to the given partial differential equation, we obtain
d f(t, X(t)) = r(t, X(t)) f(t, X(t)) dt + dP(t), 0 ≤ t ≤ T.
Because this SDE is linear in f, its solution has the form
f(T, X(T)) = e^{∫_t^T r(u,X(u))du} ( f(t, X(t)) + ∫_t^T e^{−∫_t^s r(u,X(u))du} dP(s) ).
Taking the expectation under the condition X(t) = x, accounting for the boundary condition, and using the martingale property of the stochastic process P, we obtain the desired statement.
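A simple numerical check of the Feynman–Kac representation from Problem 14.14 (an illustrative sketch only): take X = W (so μ = 0, σ = 1), a constant rate r(t, x) ≡ ρ and g(x) = x². Then f(t, x) = e^{−ρ(T−t)}(x² + T − t) solves the boundary problem, and the conditional expectation on the right-hand side can be estimated by Monte Carlo.

import numpy as np

rho, T, t, x = 0.3, 1.0, 0.4, 0.8
rng = np.random.default_rng(4)

# Monte Carlo estimate of E[ g(X(T)) exp(-int_t^T r du) | X(t) = x ] with X = W, r = rho, g(y) = y^2.
samples = (x + np.sqrt(T - t) * rng.normal(size=200000)) ** 2 * np.exp(-rho * (T - t))
fk_estimate = samples.mean()

closed_form = np.exp(-rho * (T - t)) * (x ** 2 + (T - t))
print("Monte Carlo:", fk_estimate, " solution of the PDE:", closed_form)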
14.17. Let a > 0. Then for t > 0
(d/dt) log(a + b∫_0^t x(s)ds) = bx(t)/(a + b∫_0^t x(s)ds) ≤ b.
Integrating from 0 up to t we obtain
log(a + b∫_0^t x(s)ds) − log a ≤ bt, t ∈ [0, T],
and x(t) ≤ a + b∫_0^t x(s)ds ≤ ae^{bt}. Consider the case a = 0 yourself.
14.19. (1) Let τ_N = inf{t ≥ s| |X_{s,x}(t)| ≥ N} ∧ T. For simplicity we consider only the case m = n = 1. The Itô formula implies that
|X_{s,x}(t ∧ τ_N)|^p = |x|^p + p ∫_s^{t∧τ_N} |X_{s,x}(z)|^{p−1} sign(X_{s,x}(z)) (b(z, X_{s,x}(z))dz + σ(z, X_{s,x}(z))dW(z)) + (1/2) p(p − 1) ∫_s^{t∧τ_N} |X_{s,x}(z)|^{p−2} σ²(z, X_{s,x}(z))dz.
Therefore,
E|X_{s,x}(t ∧ τ_N)|^p ≤ K1 (|x|^p + E ∫_s^{t∧τ_N} |X_{s,x}(z)|^{p−2}(1 + |X_{s,x}(z)| + |X_{s,x}(z)|²)dz) ≤ K2 (|x|^p + E ∫_s^{t∧τ_N} (1 + |X_{s,x}(z)|^p)dz) ≤ K3 (|x|^p + ∫_s^t (1 + E|X_{s,x}(z ∧ τ_N)|^p)dz) ≤ K3 (|x|^p + T + ∫_s^t E|X_{s,x}(z ∧ τ_N)|^p dz).
Now, the Gronwall–Bellman lemma implies the inequality E|X_{s,x}(t ∧ τ_N)|^p ≤ c(1 + |x|^p), 0 ≤ s ≤ t ≤ T, x ∈ R, where the constant c does not depend on N. Use Fatou's lemma to prove the desired inequality. Items (2) to (4) are proven similarly to (1) based on the Itô formula and the Gronwall–Bellman lemma.
14.22. Let a > 0, and σ_a = inf{t ∈ [0, T] | ∫_0^t γ(s)dW(s) − ∫_0^t γ²(s)ds = −a}, σ_a = T if inf_{t∈[0,T]} (∫_0^t γ(s)dW(s) − ∫_0^t γ²(s)ds) > −a. Let also λ ≤ 0. Show that EX(σ_a, λγ) = 1. According to Problem 14.21,
X(σ_a, λγ) = 1 + λ ∫_0^{σ_a} X(s, λγ)γ(s)dW(s).
Thus, it is enough to show that E ∫_0^{σ_a} X²(s, λγ)γ²(s)ds < ∞. But this relation is implied by the next two bounds: E ∫_0^{σ_a} γ²(s)ds ≤ E ∫_0^T γ²(s)ds ≤ E exp{(1/2)∫_0^T γ²(s)ds} < ∞, and
X(s, λγ) = exp{λ(∫_0^s γ(u)dW(u) − ∫_0^s γ²(u)du)} × exp{(λ − λ²/2)∫_0^s γ²(u)du} ≤ exp{|λ|a}, for all λ ≤ 0 and s ≤ σ_a.
Now, we show that EX(σ_a, λγ) = 1 for 0 < λ ≤ 1. Define
ρ(σ_a, λγ) = e^{λa} X(σ_a, λγ), A(ω) = ∫_0^{σ_a} γ²(s)ds, B(ω) = ∫_0^{σ_a} γ(s)dW(s) − ∫_0^{σ_a} γ²(s)ds + a ≥ 0,
and let u(z) = ρ(σ_a, λγ), where λ = 1 − √(1 − z). It is clear that 0 ≤ λ ≤ 1 if and only if 0 ≤ z ≤ 1. Besides, u(z) = exp{(z/2)A(ω) + (1 − √(1 − z))B(ω)}. For 0 ≤ z < 1, one can P-a.s. expand the function u(z) into a series: u(z) = ∑_{k=0}^∞ (z^k/k!)p_k(ω), where p_k(ω) ≥ 0 P-a.s., for all k ≥ 0. Problem 13.31 implies that Eu(z) ≤ e^{a(1−√(1−z))} < ∞. If 0 ≤ z0 < 1 and |z| ≤ z0, then E ∑_{k=0}^∞ (|z|^k/k!)p_k(ω) ≤ Eu(z0) < ∞. Therefore, due to the Fubini theorem, for any |z| < 1 we have Eu(z) = ∑_{k=0}^∞ (z^k/k!)Ep_k(ω). On the other hand, for −∞ ≤ z < 1 we have e^{a(1−√(1−z))} = ∑_{k=0}^∞ (z^k/k!)c_k, where c_k ≥ 0, k ≥ 0. From this, and also from the equality Eρ(σ_a, λγ) = e^{λa}, λ ≤ 0, we obtain for −1 < z ≤ 0 that ∑_{k=0}^∞ (z^k/k!)Ep_k(ω) = ∑_{k=0}^∞ (z^k/k!)c_k; thus, Ep_k(ω) = c_k, k ≥ 0, and then for 0 ≤ z < 1 we have Eu(z) = ∑_{k=0}^∞ (z^k/k!)c_k = e^{a(1−√(1−z))}. Because A(ω) and B(ω) are nonnegative P-a.s., then ρ(σ_a, λγ) ↑ ρ(σ_a, γ) for λ ↑ 1. Due to the monotone convergence theorem,
e^a = lim_{λ↑1} Eρ(σ_a, λγ) = Eρ(σ_a, γ).
Evidently, 1 = EX(σa , γ ) = EX(σa , γ )1Iσa 0}. Then U ⊂ Γ c. (15.1)
Bibliography [19], Chapter III; [67]; [23], Chapter 6; [81]; [54], Chapter 2; [61], Chapter X; [85].
Problems 15.1. Prove Lemma 15.1. 15.2. Prove: if a function f is excessive and τ ≥ σ are two Markov moments, then Ex f (Xσ ) ≥ Ex f (Xτ ). 15.3. Let a function f be excessive, and τΓ be a time of the first entry into a set Γ ⊂ X by a Markov chain {Xn , n ∈ Z+ }. Prove that the function h(x) := Ex f (XτΓ ) is excessive as well. 15.4. Prove: if a payoff function f is excessive, then the price of the game v equals f . 15.5. Prove: if an excessive function g dominates a payoff function f then it also dominates the price of the game v. 15.6. Prove Lemma 15.2.
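For a finite Markov chain the price of the game v (cf. Problems 15.4, 15.5 and Problems 15.8–15.9 below) can be computed by the standard successive approximations v_0 = f, v_{k+1} = max(f, Pv_k), which increase to the least excessive majorant of f. The sketch below (Python; not part of the original text) does this for a symmetric random walk on {0, 1, ..., N} with absorption at 0 and N; the payoff f is an arbitrary illustrative choice.

import numpy as np

N = 10
P = np.zeros((N + 1, N + 1))
P[0, 0] = P[N, N] = 1.0                      # absorption at 0 and N
for x in range(1, N):
    P[x, x - 1] = P[x, x + 1] = 0.5          # symmetric random walk

f = np.sin(np.linspace(0.0, np.pi, N + 1)) + 0.3 * np.linspace(0.0, 1.0, N + 1)  # illustrative payoff

v = f.copy()
for _ in range(10000):
    v_new = np.maximum(f, P @ v)             # one step of v <- max(f, P v)
    if np.max(np.abs(v_new - v)) < 1e-12:
        break
    v = v_new

print("payoff f :", np.round(f, 3))
print("price  v :", np.round(v, 3))          # least excessive majorant of f
print("stopping set {x : v(x) = f(x)}:", np.where(np.isclose(v, f))[0])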
15.7. Prove the following properties of excessive functions defined on X. (1) If a function f is excessive and α > 0, then the function α f is excessive as well. (2) If functions f1 , f2 are excessive, then the sum f1 + f2 is excessive as well. (3) If { fα , α ∈ A} is a family of excessive functions, then the function f (x) := infα ∈A fα (x) is excessive as well. (4) If { fn , n ≥ 1} are excessive functions and fn ↑ f pointwise, then f is an excessive function as well. 15.8. Let a Markov chain {Xn , n ∈ Z+ } be a symmetric random walk with absorption points 0 and N. That is, X = {0, 1, . . . , N}, pxy = 12 if x = 1, 2, . . . , N − 1, y = x ± 1, pNN = 1, and p 00 = 1. Prove: the class of functions f : X → R+ , which are excessive w.r.t. this random walk, coincides with the class of concave functions. 15.9. Let a Markov chain {Xn , n ∈ Z+ } be the same as in Problem 15.8. (1) Prove that the price of the game v(x) is the least concave function for which v(x) ≥ f (x), x ∈ X = {0, 1, . . . , N}. (2) Prove that to stop at time τ0 , when the chain first enters any of the points x with f (x) = v(x), is an optimal strategy. 15.10. Let a Markov chain {Xn , n ∈ Z+ } be a random walk over the set X = {0, 1, . . . , N} and px(x+1) = p, px(x−1) = q = 1 − p = p if x = 1, 2, . . . , N − 1 and pNN = 1, p 00 = 1 (nonsymmetric random walk with absorption at points 0 and N). (1) Describe the class of functions f : X → R+ which are excessive w.r.t. the random walk. (2) Let a premium function f (x) = x. Find an optimal strategy and calculate the price of the game if: (a) q > p; (b) q < p. 15.11. Let a Markov chain {Xn , n ∈ Z+ } be a random walk over a set X = {0, 1, . . . , N} and px(x+1) = p, px(x−1) = q = 1 − p if x = 1, 2, . . . , N − 1 and pN(N−1) = 1, p 01 = 1 (nonsymmetric random walk with reflection at points 0 and N). Describe the class of functions f : X → R+ that are excessive w.r.t. the random walk. 15.12. (Optimal stopping for a Wiener process with absorption) Consider a Wiener process {W (t), t ∈ R+ } with W (0) = x ∈ [0, a]. Furthermore, if the process visits point 0 or a, then it stays there forever; that is, X = [0, a] (such a process is called a Wiener process with absorption at the points 0 and a). Let a function f ∈ C([0, a]) and is nonnegative. Find the price of the game v(x) = supτ Ex f (W (τ )) and construct a Markov moment τ0 , for which v(x) = Ex f (W (τ0 )). 15.13. (Optimal stopping for a two-dimensional Wiener process) Let W (t) = {(W1 (t), W2 (t)), t ∈ R+ } be a Wiener process in R2 . (1) Prove that only constant functions are superharmonic nonnegative functions with regard to W . (2) Prove that there is no optimal stopping for an unbounded function f . (3) Prove that the continuation set is Γ c = {x| f (x) < || f ||∞ }, where || f ||∞ = supx∈R2 | f (x)|.
(4) Prove that if the logarithmic capacity cap (∂Γ c ) = 0, then τΓ = ∞ a.s. (The logarithmic capacity of a planar compact set is the value γ (E) = exp{−V (E)}, where V (E) = infP E×E ln |u − v|−1 dP(u)dP(v) and the infimum is taken over all probabilistic measures on E. The value V (E) is called the Robbins constant for the set E and the set E is called polar if V (E) = +∞ or, the equivalent, if γ (E) = 0.) (5) Prove that if cap (∂Γ c ) > 0 then τΓ < ∞ a.s. and v f (x) = || f ||∞ = Ex f (WτΓ ). It means that τΓ is the optimal stopping. 15.14. Let us suppose that there exists a Borel set H such that gH (x) := Ex g(X(τH )) dominates a function g and, at the same time, gH (x) ≥ Ex g(X(τ )) for all stopping times τ and all x ∈ Rn . Prove that in this case vg (x) = gH (x); that is, τH is an optimal stopping. 15.15. (Optimal stopping for an n-dimensional Wiener process if n ≥ 3) Let W (t) = {(W1 (t), . . . ,Wn (t)),t ∈ R+ } be a Wiener process in Rn with n ≥ 3. (1) Let a premium function be x−1 , if x ≥ 1, g(x) = 1, ifx < 1, x ∈ R3 . Prove that this function is superharmonic in R3 . Furthermore, it is such that vg = g, and it is optimal to stop immediately regardless on an initial point. (2) Let a premium function be x−α , ifx ≥ 1, h(x) = 1, ifx < 1,
α > 1, a set H = {x ∈ Rn | x ≤ 1}, and a function h(x) = Ex h(W (τH )) (remember that τH is the moment of the first hit on a set H). (a) Show that h(x) = Px (τH < ∞). (b) Show that 1, if x < 1, h(x) = x−1 , if x ≥ 1, It means that the function h coincides with the function g from item 1), which is the superharmonic majorant for h. h = g and the moment τH is an optimal stopping. (c) Prove that vh = 15.16. Let {W (t),t ∈ R+ } be a one-dimensional Wiener process. Find vg and an optimal stopping τ0 where it exists, if: (a) vg (x) = supτ Ex |W (t)| p with p > 0. 2 (b) vg (x) = supτ Ex e−W (τ ) . (c) vg (s, x) = supτ Es,x (e−ρτ chW (τ )), where ρ > 0 and ch z := (ez + e−z )/2. (Here the expectation is taken under the condition W (s) = x.) 15.17. Prove that U ⊂ Γ (see formula (15.1)).
15.18. (Optimal stopping for a time-dependent premium function) Let a premium function g be of the following form: g = g(t, x) : R × Rn → R+, g ∈ C(R × Rn). Find g0(x) and τ0 such that
g0(x) = sup_τ Ex g(τ, X(τ)) = Ex g(τ0, X(τ0)),
where X is a diffusion process with dX(t) = b(X(t))dt + σ(X(t))dW(t), t ∈ R+ and X(0) = x, b : Rn → Rn, σ : Rn → R^{n×m} are measurable functions, and W is an m-dimensional Wiener process.
15.19. Let X(t) = W(t), t ≥ 0 be a one-dimensional Wiener process and a premium function g(t, x) = e^{−αt+βx}, x ∈ R, where α, β ≥ 0 are some constants.
(1) Prove that the generator L̂ of the process Y_{(s,x)}(t) = (s + t, W(t) + x) is given by
L̂ f(s, x) = ∂f/∂s + (1/2) ∂²f/∂x², f ∈ C²(R).
(2) Deduce that L̂g = (−α + (1/2)β²)g and that the identity vg = g holds true when β² ≤ 2α and the optimal strategy is the instant stopping; if β² > 2α then the optimal moment τ0 does not exist and vg = +∞.
15.20. Let {Yn , Fn , 0 ≤ n ≤ N} be the Snell envelope for a nonnegative process {Xn , Fn , 0 ≤ n ≤ N} (see Problem 7.22). Let Tn,N be the set of stopping times taking values in the set {n, n + 1, . . . , N}. (1) Prove that the r.v.
τ0 := inf{0 ≤ n ≤ N| Yn = Xn } is a stopping time and the stopped stochastic process {Ynτ0 = Yn∧τ0 , Fn , 0 ≤ n ≤ N} is a martingale. (2) Prove that Y0 = E(Xτ0 /F0 ) = supτ ∈T0,N E(Xτ /F0 ) (i.e., in this sense τ0 is an optimal stopping). (3) Generalizing the statements (1) and (2), prove that the r.v.
τn := inf{n ≤ j ≤ N| Y_j = X_j} is a stopping time and Yn = sup_{τ∈T_{n,N}} E(Xτ /Fn) = E(X_{τn} /Fn).
15.21. Prove that a stopping τ is optimal if and only if Yτ = Xτ and the stopped process {Ynτ , Fn , 0 ≤ n ≤ N} is a martingale. This statement means that τ0 from Problem 15.20 is the least optimal stopping time.
15.22. Let Yn = Mn −An be the Doob–Meyer decomposition for a Snell envelope (see Problem 7.62). Put τ1 = inf{0 ≤ n ≤ N| An+1 = 0} ∧ N. Prove that τ1 is an optimal stopping and τ1 ≥ τ for any optimal stopping τ . It means that τ1 is the largest optimal stopping. 15.23. Let a sequence {Xn , 0 ≤ n ≤ N} be a homogeneous Markov chain taking values in a finite set X with transition matrix P. Also let a function ϕ = ϕ (n, x) : {0, 1, . . . , N} × X → R be measurable. Prove that the Snell envelope for the sequence Zn := ψ (n, Xn ) is determined by the formula Yn = y(n, Zn ) where the function y is given by relations: y(N, x) = ψ (N, x) for any x ∈ X and for 0 ≤ n ≤ N − 1 u(n, ·) = max(ψ (n, ·), Pu(n + 1, ·)). 15.24. Let {Yn , 0 ≤ n ≤ N} be a Snell envelope for a sequence {Xn , 0 ≤ n ≤ N}. Prove that for every n EYn = supτ ∈Tn,N EXτ . In particular, EY0 = supτ ∈T0,N EXτ . 15.25. Prove that τ0 is optimal if and only if EXτ0 = supτ ∈T0,N EXτ .
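The constructions of Problems 15.20–15.25 are effectively computable: for a Markov sequence (Problem 15.23) the Snell envelope is obtained by the backward recursion u(N, ·) = ψ(N, ·), u(n, ·) = max(ψ(n, ·), Pu(n + 1, ·)). A minimal sketch (Python; the chain, the horizon, and the payoff ψ below are illustrative assumptions):

import numpy as np

N = 5                                         # horizon
states = np.arange(3)
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])               # illustrative transition matrix

def psi(n, x):                                # illustrative payoff psi(n, x)
    return np.maximum(x - 0.1 * n, 0.0)

u = np.zeros((N + 1, len(states)))
u[N] = psi(N, states)
for n in range(N - 1, -1, -1):                # backward induction: u(n,.) = max(psi(n,.), P u(n+1,.))
    u[n] = np.maximum(psi(n, states), P @ u[n + 1])

print("Snell envelope u(n, x):")
print(np.round(u, 4))
# tau_0 = inf{n : u(n, X_n) = psi(n, X_n)} is the least optimal stopping time (Problem 15.20).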
Hints 15.2. Write formula (15.3) (see solution to Problem 15.1) for σ and τ . Deduce that Ex α σ f (Xσ ) ≥ Ex α τ f (Xτ ). Next, let α → 1. 15.4. Derive from the definition of v(x) that v(x) ≥ f (x) and from Problem 15.20 (or Lemma 15.1) that f (x) ≥ v(x). 15.7. Use the definition of excessive function or Lemma 15.1. 15.8. Use the definition of excessive function. 15.9. Use Theorem 15.1 and Problem 15.8. 15.12. I method. At first, one can prove that for the Wiener process with absorption, the class of functions g : [0, a] → R+ satisfying the condition g(x) ≥ Ex g(W (τ )) for all Markov moments τ coincides with the class of all nonnegative concave functions. The proof of the statement that any concave function satisfies this inequality is rather technically complicated (see, e.g., [19]). To prove the concavity of the function g satisfying the inequality g(x) ≥ Ex g(W (τ )), establish the inequality Ex g(W (τ )) = g(x1 )
· (x2 − x)/(x2 − x1) + g(x2) · (x − x1)/(x2 − x1),
where τ is the moment of the first exit from the interval [x1 , x2 ] ∈ [0, a]. In this order check that the probability P(x, x1 , x2 ) to start from a point x and to get to x1 earlier than to x2 for 0 ≤ x1 ≤ x2 ≤ a is equal to (x2 − x)/(x2 − x1 ). (See Problem 14.31.) Next, prove that the price of the game is the least concave dominant of the function f . Consider a strategy τ that consists in waiting till the moment when process W visits point x1 or point x2 and then the strategies τ1 or τ2 , respectively, are used. Here
τi , i = 1, 2 are strategies leading under initial states x1 and x2 to the average payoff which is more than v(x1 ) − ε or v(x2 ) − ε . (The existence of such strategies follows from the definition of supremum.) Derive that Ex f (W (τ )) ≥
((x2 − x)/(x2 − x1)) v(x1) + ((x − x1)/(x2 − x1)) v(x2) − ε,
and that
v(x) ≥ ((x2 − x)/(x2 − x1)) v(x1) + ((x − x1)/(x2 − x1)) v(x2).
Thus, the function v is concave. It means that the price of the game is the least nonnegative, concave dominant for the function f. Indeed, v obviously dominates f and is concave. Moreover, it follows from the above considerations that for any other concave dominant z of f we have that z(x) ≥ Ex z(W(τ)) ≥ Ex f(W(τ)) = v(x). And finally, prove that for f ∈ C([0, a]) the optimal Markov moment equals τ0 = inf{t| W(t) ∈ Γ}, where Γ = {x ∈ [0, a]| f(x) = v(x)}.
II method. Use the fact that a Wiener process is a diffusion process. Take into account the absorption at 0 and a.
15.13. (1) Suppose that f is a nonnegative superharmonic function with regard to W and there exist two points x, y ∈ R2 with f(x) < f(y). Consider Ex f(W(τ)), where τ is the time of the first visit by the Markov process Wt to a small disk with center at y. Use a multidimensional version of the Dynkin formula (Problem 13.46) and Problem 13.55. (2) Follows directly from item (1). (3) Follows from item (1) and the definition of continuation set. (4) and (5) follow from item (3). (See also [63] and [43].) Indeed, according to [43] and [63], if a set B is compact and τB = inf{t > 0| W(t) ∈ B} then P(τB < ∞) equals 0 or 1 depending on whether the set B has zero or positive logarithmic capacity; in our case B = R2 \D is compact.
15.15. (1) Follows obviously from the definitions. (2)(b) Use Problems 13.54 and 13.55. (c) Use item (b) and Problem 15.14.
15.19. The first statement is evident. For the second one consider only the case where β² > 2α. First, the set U := {(s, x)| L̂g(s, x) > 0} = R² in this case, thus, Γ^c = R². It means that τ0 does not exist. Second, use Theorem 15.3 in order to construct the least superharmonic majorant:
g_n(s, x) = sup_{t∈S_n} E_{s,x} g(Y(t)) = sup_{t∈S_n} E e^{−α(s+t)+β(W(t)+x)} = sup_{t∈S_n} e^{−α(s+t)} · e^{βx+(1/2)β²t} = sup_{t∈S_n} g(s, x) e^{(−α+(1/2)β²)t} = g(s, x) exp((−α + (1/2)β²)2^n),
therefore, g_n(s, x) → ∞ as n → ∞.
15.23. Use the definition of a Snell envelope and the fact that for a bounded and measurable function f : X → R and a homogeneous Markov chain {Zn, 0 ≤ n ≤ N} it holds
E ( f (Zn+1 )/Fn ) = P f (Zn ), where P f (x) = ∑y∈X pxy f (y), {pxy }x,y∈X are entries of the transition matrix P. 15.24. Use Problem 15.20. 15.25. Let E(Xτ0 /F0 ) = supτ ∈T0,N E(Xτ /F0 ). Calculate the mathematical expectation for both parts and prove that EXτ0 = supτ ∈T0,N EXτ . On the contrary, let EXτ0 = supτ ∈T0,N EXτ . Prove that in this case Yτ = Xτ and Y τ are martingales.
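The least concave dominant appearing in the hint to Problem 15.12 (and the prices of the games in Problems 15.8–15.9) can be approximated on a grid by the discrete analogue of the iteration used above: v ← max(f, (v(x − h) + v(x + h))/2), with the endpoint values kept fixed. A short sketch (Python; the payoff f and the grid are illustrative assumptions):

import numpy as np

a, m = 1.0, 200
xs = np.linspace(0.0, a, m + 1)
f = np.abs(np.sin(3.0 * np.pi * xs)) * xs            # illustrative continuous payoff on [0, a]

v = f.copy()
for _ in range(200000):
    v_new = v.copy()
    v_new[1:-1] = np.maximum(f[1:-1], 0.5 * (v[:-2] + v[2:]))   # discrete concavity correction
    if np.max(np.abs(v_new - v)) < 1e-10:
        break
    v = v_new

# v approximates the least concave function on [0, a] dominating f, i.e. the price of the
# game for the Wiener process with absorption at the points 0 and a.
print("max of f:", f.max(), " max of v:", v.max())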
Answers and Solutions
15.1. Put ϕ(x) = f(x) − αPf(x), 0 < α < 1. Then f(x) = (ϕ + αPϕ + · · · + α^n P^n ϕ + α^{n+1} P^{n+1} f)(x) and ϕ(x) ≥ 0, x ∈ E. Furthermore, 0 ≤ P^n f = P^{n−1}(Pf) ≤ P^{n−1} f, which means that α^n P^n f → 0 as n → ∞. This implies that f(x) = ∑_{n=0}^∞ α^n P^n ϕ(x), where P^0 = I is the identity operator. Check that P^n ϕ(x) = Ex ϕ(Xn). Then
f(x) = Ex [ ∑_{n=0}^∞ α^n ϕ(Xn) ]. (15.2)
And again, similarly to (15.2), prove that
Ex α^τ f(X_τ) = Ex (α^τ ϕ(X_τ) + α^{τ+1} ϕ(X_{τ+1}) + · · · ). (15.3)
Comparing (15.2) with (15.3) we can conclude that f (x) ≥ Ex α τ f (Xτ ). Now, let α to 1. 15.3. Let τΓ = inf{n ≥ 1| Xn ∈ Γ }. Then, τΓ ≥ τΓ . It follows from Problem 15.2 that Ex f (Xτ ) ≤ Ex f (XτΓ ) = h(x). But, if the first step leads the Markov chain from x to Γ y then Ex f (Xτ ) = Ey f (XτΓ ) = h(y). So, Ex f (Xτ ) = ∑y∈X pxy h(y) = Ph(x). Thus, Γ Γ Ph(x) ≤ h(x), x ∈ X. 15.5. If g ≥ f and g is an excessive function, then for any strategy τ Ex f (Xτ ) ≤ Ex g(Xτ ) ≤ g(x), which implies that v(x) = supτ Ex f (Xτ ) ≤ g(x). 15.6. Because for τ = ∞ the relation Ex f (X∞ ) = 0 holds true, then v(x) ≥ 0. Fix ε > 0. It follows from the definition of supremum that for every y ∈ X there exists a strategy τε ,y such that Ey f (Xτε ,y ) ≥ v(y) − ε . Now let the strategy τ consist in making one step and then, if this step leads to the point y, we continue with strategy τε ,y . It is evident that τ is a Markov moment (check this), and for this τ
Ex f(X_τ) = ∑_{y∈X} p_{xy} E_y f(X_{τ_{ε,y}}) ≥ ∑_{y∈X} p_{xy} (v(y) − ε) ≥ Pv(x) − ε.
Now pass to the limit as ε → 0. 15.10. (1) Put yk = (q/p)k , k = 0, . . . , N, Y = {y0 , . . . , yN }. Put a function f : {0, . . . , N} → R+ . It is excessive if and only if the function g : Y → R+ determined by the identities g(yk ) = f (k), k ∈ {0, . . . , N}, is concave. (2) For f (x) = x the function g defined above is equal to g(y) = logq/p y. It is concave if q > p and convex if q < p. For the first case a concave majorant for g is this function itself, and for the second case such majorant is a linear function w with w(y0 ) = g(y0 ) = 0, w(yN ) = g(yN ) = N; that is, w(y) = ((y − y0 )/(yN − y0 )) N. So, (a) an optimal strategy consists in the immediate stopping, and v(x) = x. (b) An optimal strategy consists in the stopping at the first of one of the points 0 or N, and x q x −1 p q = N N. v(x) = w p q −1 p 15.11. Prove that every excessive function f is constant. For any x consider the moment τx of the first entry into the state x. Because all states are connected with each other and the total number of states is finite, then for any y it holds Py (τx < +∞) = 1, and thus, Ey f (τx ) = f (x). Considering f as a payoff function, we obtain that the corresponding price of the game v(y) ≥ f (x). Because f is excessive, then v = f , and it means that f (y) ≥ f (x). Changing the roles for the points x and y we obtain the opposite inequality, f (x) ≥ f (y). So, f (x) = f (y). 15.14. It is evident that if g(x) is the least excessive majorant for the function g, then g(x) ≤ gH (x). On the other hand, gH (x) ≤ supτ Ex g(X(τ )) = vg (x), and it follows from Theorems 15.2 (1) and (15.3) that vg = gH . 15.16. (a) vg (x) = +∞, and τ0 does not exist. (b) vg (x) = 1, and τ0 = inf{t > 0|W (t) = 0}. (c) if ρ < 12 , then vg (s, x) = +∞, and τ0 does not exist; if ρ ≥ 12 , then vg (s, x) = g(s, x). 15.17. Let a point x ∈ V ⊂ U and τ0 be the time of the first exit from a bounded open set V . According to the Dynkin formula (see Problem 13.46, the multidimensional version) for any v > 0 Ex g(X(τ0 ∧ v)) = g(x) + Ex
∫_0^{τ0 ∧ v} L g(X_s)ds > g(x).
Y(t) = Y_{s,x}(t) as
Y(t) = (s + t, X_x(t)), t ∈ R+,
where X_x(t) is a diffusion starting from the point x, and s ∈ R+. Then
dY(t) = (1, b(X(t)))dt + (0, σ(X(t)))dW(t) = b̃(Y(t))dt + σ̃(Y(t))dW(t),
with b̃(y) = b̃(t, x) = (1, b(x)) ∈ R^{n+1} and σ̃(y) = σ̃(t, x) ∈ R^{(n+1)×m}, the matrix whose first row is (0 . . . 0) and whose remaining rows form σ(x),
where y = (t, x) ∈ R × Rn. We see that Y is the diffusion process starting from the point y = (s, x). Let Py = P_{s,x} be the distribution of Y and Ey = E_{(s,x)} mean the expectation with respect to the measure Py. The problem can be written in terms of Y(t) as follows: to find g0 and τ0 such that
g0(x) = vg(0, x) = sup_τ E_{0,x} g(Y(τ)) = E_{0,x} g(Y(τ0)),
which is a particular case for the problem of finding vg(s, x) and τ0 with
vg(s, x) = sup_τ E_{s,x} g(Y(τ)) = E_{s,x} g(Y(τ0)).
This problem is standard if we replace X(t) with Y (t). 15.20. (1) Because YN = XN , τ0 is correctly defined. Besides, theevents {τ0 = 0} = {Y0 = X0 } ∈ F0 , {τ0 = n} = {Y0 > X0 } ∩ · · · ∩ {Yn−1 > Xn−1 } {Yn = Xn } ∈ Fn , 1 ≤ n ≤ N; that is, τ0 is a stopping time. Furthermore, Y(n+1)∧τ0 − Yn∧τ0 = (Yn+1 − Yn )1Iτ0 >n . Because Yn > Xn on the set {τ0 > n} then Yn = E(Yn+1 /Fn ). So, τ0 τ − Yn 0 /Fn ) = E(Y(n+1)∧τ0 − Yn∧τ0 /Fn ) = 1Iτ0 >n E (Yn+1 − E(Yn+1 /Fn )/Fn ) = E(Yn+1 τ 0. This implies that {Yn 0 , Fn , 0 ≤ n ≤ N} is a martingale. τ (2) Because Y 0 is a martingale then Y0 = Y0τ0 = E(YNτ0 /F0 ) = E(Yτ0 /F0 ) = E(Xτ0 /F0 ). From the other side, if τ ∈ T0,N , then the stopped process Y τ is a supermartingale. So, Y0τ = Y0 ≥ E(YNτ /F0 ) = E(Yτ /F0 ) ≥ E(Xτ /F0 ). Thus, E(Xτ0 /F0 ) = supτ ∈T0,N E(Xτ /F0 ). The proof of item (3) is similar to the proofs of (1) and (2), if we replace 0 with n. 15.21. Let the stopped process Y τ be a martingale. Prove, taking into account the first condition, that Y0 = E(Xτ /F0 ), and deduce the optimality of τ from item 2) of Problem 15.20. And vice versa, let a stopping τ be optimal. Prove the following sequence of inequalities using Problem 15.20 (2). Y0 = E(Xτ /F0 ) ≤ E(Yτ /F0 ) ≤ Y0 . Based on this and the inequality Xτ ≤ Yτ derive that Xτ = Yτ . Then prove the inequalities E(Yτ /F0 ) = Y0 ≥ E(Yτ ∧n /F0 ) ≥ E(Yτ /F0 ) and deduce that E(Yτ ∧n /F0 ) =
E(Yτ /F0 ) = E(E(Yτ /Fn )/F0 ). Because Yτ ∧n ≥ E(Yτ /Fn ) then Yτ ∧n = E(Yτ /Fn ) and this means that {Ynτ , Fn , 0 ≤ n ≤ N} is a martingale. 15.22. Because the process {An , 1 ≤ n ≤ N} is predictable, then τ1 is a stopping time. It is evident that Y τ1 = M τ1 as Aτn1 = An∧τ1 ≡ 0. It means that Y τ1 is a martingale. According to Problem 15.21, we need to prove that Yτ1 = Xτ1 . For ω such that τ1 = N the identities Yτ1 = YN = XN = Xτ1 hold true. If {τ1 = n}, 0 ≤ n < N, then the following identities take place: Yτ1 1Iτ1 =n = Yn 1Iτ1 =n = Mn 1Iτ1 =n = E(Mn+1 /Fn )1Iτ1 =n = E(Yn+1 + An+1 /Fn )1Iτ =n > E(Yn+1 /Fn )1Iτ =n ; that is, Yτ1 1Iτ1 =n = Yn 1Iτ1 =n = Xn 1Iτ1 =n = Xτ1 1Iτ1 =n . Thus, τ1 is an optimal stopping, τ ≥ τ1 , and P(τ > τ1 ) > 0. So, EAτ > 0 and EYτ = EMτ − EAτ = EM0 − EAτ = EY0 − EAτ < EY0 , and the stopped process M τ is a martingale (check why it is true), and thus, τ is not an optimal stopping.
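The answer to Problem 15.10 (b) can be cross-checked numerically: iterate v ← max(f, Pv) for the walk with absorption and compare with the explicit expression v(x) = N((q/p)^x − 1)/((q/p)^N − 1) obtained in the solution above for the case q < p. A sketch (Python; N and p are illustrative assumptions):

import numpy as np

N, p = 8, 0.6                                  # illustrative; q = 1 - p < p
q = 1.0 - p
P = np.zeros((N + 1, N + 1))
P[0, 0] = P[N, N] = 1.0
for x in range(1, N):
    P[x, x + 1], P[x, x - 1] = p, q

f = np.arange(N + 1, dtype=float)              # payoff f(x) = x

v = f.copy()
for _ in range(100000):
    v_new = np.maximum(f, P @ v)
    if np.max(np.abs(v_new - v)) < 1e-12:
        break
    v = v_new

x = np.arange(N + 1)
explicit = N * ((q / p) ** x - 1.0) / ((q / p) ** N - 1.0)
print("iterated price :", np.round(v, 4))
print("explicit value :", np.round(explicit, 4))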
16 Measures in functional spaces. Weak convergence, probability metrics. Functional limit theorems
Theoretical grounds In this chapter, we consider random elements taking values in metric spaces and their distributions. The definition of a random element taking values in X involves the predefined σ -algebra X of subsets of X. The following statement shows that in a separable metric space, in fact, the unique natural choice for the σ -algebra X is the Borel σ -algebra B(X). Lemma 16.1. Let X be a separable metric space, and X be a σ -algebra of subsets of X that contains all open balls. Then X contains every Borel subset of X. Further on, while dealing with random elements taking values in a separable metric space X, we assume X = B(X). For a nonseparable space, a σ -algebra X is given explicitly. Theorem 16.1. (The Ulam theorem) Let X be a Polish space and μ be a finite measure on the Borel σ -algebra B(X). Then for every ε > 0 there exists a compact set Kε ⊂ X such that μ (X\Kε ) < ε . Generally, we deal with functional spaces X; that is, spaces of the functions or the sequences with a given parametric set. Let us give an (incomplete) list of such spaces with the corresponding metrics. In all the cases mentioned below we consider x, y ∈ X. (1) X = C([0, T ]), ρ (x, y) = maxt∈[0,T ] |x(t) − y(t)|.
(2) X = C([0, +∞)), ρ(x, y) = ∑_{k=1}^∞ 2^{−k} (max_{t∈[0,k]} |x(t) − y(t)| ∧ 1).
(3) X = L_p([0, T]), p ∈ [1, +∞), ρ(x, y) = (∫_0^T |x(t) − y(t)|^p dt)^{1/p}.
(4) X = L_∞([0, T]), ρ(x, y) = ess sup_{t∈[0,T]} |x(t) − y(t)|.
(5) X = ℓ_p, p ∈ [1, +∞), ρ(x, y) = [∑_{k=1}^∞ |x_k − y_k|^p]^{1/p}.
(6) X = ℓ_∞, ρ(x, y) = sup_{k∈N} |x_k − y_k|.
(7) X = c_0 = {(x_k)_{k∈N} : ∃ lim_{k→∞} x_k = 0}, ρ(x, y) = sup_{k∈N} |x_k − y_k|.
(8) X = D([a, b]) (Skorokhod space, see Chapter 3, Remark 3.2); the metrics in this space are given below.
Definition 16.1. Let X = {X(t), t ∈ T} be a real-valued random process with T = [0, T] or T = [0, +∞). If there exists a modification X̃ of this process with all its trajectories belonging to some functional space X (e.g., one of the spaces from items (1)–(4), (8) of the above-given list), and the mapping X̂ : Ω ∋ ω → X̃(·, ω) ∈ X is F–X measurable for a certain σ-algebra X in X, then the process X is said to generate the random element X̂ in (X, X). If process X generates a random element X̂ in (X, X), then its distribution in (X, X) is the probability measure μX ≡ P ◦ X̂^{−1}, μX(A) = P(X̂ ∈ A), A ∈ X. The notions of the random element generated by a random sequence and the corresponding distribution are introduced analogously.
Example 16.1. The Wiener process generates a random element in C([0, T]) (see Problem 16.1). The distribution of the Wiener process in C([0, T]) is called the Wiener measure.
Definition 16.2. A sequence of probability measures {μn} defined on the Borel σ-algebra of the metric space X weakly converges to measure μ if
∫_X f dμn → ∫_X f dμ, n → ∞ (16.1)
for arbitrary continuous bounded function f : X → R (notation: μn ⇒ μ ). If the sequence of distributions of random elements Xˆn converges weakly to the distribution ˆ then the sequence of the random elements Xˆn converges weakly or of the element X, d ˆ by distribution to Xˆ (notation: Xˆn ⇒ Xˆ or Xˆn → X). If processes Xn generate random elements in a functional space X and these elements converge weakly to the element generated by a process X, then the sequence of the processes Xn is said to converge to X by distribution in X. A set A ∈ B(X) is called a continuity set for the measure μ if μ (∂ A) = 0 (∂ A denotes the boundary of the set A). Theorem 16.2. All the following statements are equivalent. (1) μn ⇒ μ . (2) Relation (16.1) holds for every bounded function f satisfying the Lipschitz condition: there exists L such that | f (x) − f (y)| ≤ Lρ (x, y), x, y ∈ X (here ρ is the metric in X). (3) lim supn→∞ μn (F) ≤ μ (F) for every closed set F ⊂ X. (4) lim infn→∞ μn (G) ≥ μ (G) for every open set G ⊂ X. (5) limn→∞ μn (A) = μ (A) for every continuity set A for the measure μ . Theorem 16.3. (1) Let X, Y be metric spaces and F : X → Y be an arbitrary function. Then the set DF of the discontinuity points for this function is a Borel set (moreover, a countable union of closed sets).
(2) Let random elements Xˆn , n ≥ 1 in (X, B(X)) converge weakly to a random element Xˆ and P(Xˆ ∈ DF ) = 0. Then random elements Yˆn = F(Xˆn ), n ≥ 1 in (X, B(X)) ˆ converge weakly to the random element Yˆ = F(X). Consider the important partial case X = C([0, T ]). If X is a random process that generates a random element in C([0, T ]), then for every m ≥ 1,t1 , . . . ,tm ∈ [0, T ] the finite-dimensional distribution PtX1 ,...,tm can be represented as the image of the distribution of X in C([0, T ]) under the mapping
πt1 ,...,tm : C([0, 1]) x(·) → (x(t1 ), . . . , x(tm )) ∈ Rm . (“the finite-dimensional projection”). Because every function πt1 ,...,tm is continuous, Theorem 16.3 yields that, for every sequence of random processes Xn that converge by distribution in C([0, T ]) to a process X, for every m ≥ 1,t1 , . . . ,tm ∈ [0, T ] finitedimensional distributions PtX1n,...,tm converge weakly to the finite-dimensional distribution PtX1 ,...,tm . It should be mentioned that the inverse implication does not hold true and convergence of the finite-dimensional distributions of random processes does not provide their convergence by distribution in C([0, T ]) (see Problem 16.13). Definition 16.3. (1) A family of measures {μα , α ∈ A} is called weakly (or relatively) compact if each of its subsequences contains a weakly convergent subsequence. (2) A family of measures {μα , α ∈ A} is called tight if for every ε > 0 there exists a compact set Kε ⊂ X such that μα (X\Kε ) < ε , α ∈ A. Theorem 16.4. (The Prokhorov theorem) (1) If a family of measures {μα , α ∈ A} is tight, then it is weakly compact. (2) If a family of measures {μα , α ∈ A} is weakly compact and X is a Polish space, then this family is tight. It follows from the definition that a sequence of measures {μn } converges weakly if and only if (a) this family if weakly compact, and (b) this family has at most one weak partial limit (i.e., if two of its subsequences converge weakly to measures μ and μ , then μ = μ ). The statements given above provide the following criteria which are very useful for investigation of the convergence of the random processes by the distribution in C([0, T ]). Proposition 16.1. In order for random processes Xn to converge by distribution in C([0, T ]) to a random process X it is necessary and sufficient that (a) The sequence of their distributions in C([0, T ]) is tight. (b) All the finite-dimensional distributions of the processes Xn converge weakly to the corresponding finite-dimensional distributions of the process X. For a tightness of a sequence of distributions in C([0, T ]) of stochastic processes, a wide choice of sufficient conditions is available (see [25], Chapter 9; [4], Chapter 2; [9], Chapter 5). Here we formulate one such condition that is analogous to the Kolmogorov theorem (Theorem 3.12).
Theorem 16.5. Let a sequence of random processes Xn = {Xn (t),t ∈ [0, T ]}, n ≥ 1 be such that, for some constants α , β ,C > 0, E|Xn (t) − Xn (s)|α ≤ C|t − s|1+β , t, s ∈ [0, T ], n ∈ N. Then the sequence of the distributions of these processes in C([0, T ]) is tight. A random walk is a sequence of sums Sn = ∑nk=1 ξk , n ≥ 1, where {ξk k ∈ N} are independent random variables (in general, these variables can have various distributions, but we assume further they are identically distributed. For random walks, see also Chapters 10, 11, and 15). Theorem 16.6. (The Donsker theorem) Let {Sn , n ≥ 1} be a random walk and Eξk2 < +∞. Then the random processes Xn (t) =
(S_{[nt]} − [nt]Eξ1)/√(nDξ1) + (nt − [nt]) (ξ_{[nt]+1} − Eξ1)/√(nDξ1), t ∈ [0, 1], n ≥ 1
converge by distribution in C([0, 1]) to the Wiener process. Corollary 16.1. Let F : C([0, 1]) → R be a functional with its discontinuity set DF having zero Wiener measure. Then F(Xn ) ⇒ F(W ). Note that Corollary 16.1 does not involve any assumptions on the structure of the laws of the summands ξk , and thus the Donsker theorem is frequently called the invariance principle: the limit distribution is invariant w.r.t. choice of the law of ξk . Another name for this statement is the functional limit theorem. Processes with continuous trajectories have a natural interpretation as random elements valued in C([0, T ]). This allows one to study efficiently the limit behavior of the distributions of the functionals of such processes. In order to extend this construction for the processes with c`adl`ag trajectories, one has to endow the set D([0, T ], Y) with a structure of a metric space, and this space should be separable and complete (see Problem 16.17, where an example of an inappropriate metric structure is given). Below, we describe the metric structure on D([0, T ], Y) introduced by A. V. Skorokhod. Let Y be a Polish space with the metric ρ . Denote by Λ the class of strictly monotone mappings λ : [0, T ] → [0, T ] such that λ (0) = 0, λ (T ) = T. Denote λ (t) − λ (s) λ = sup ln . t −s s=t Definition 16.4. For x, y ∈ D([0, T ], Y), denote d(x, y) = inf{ε | ∃λ ∈ Λ , sup ρ (x(λ (t)), y(t)) ≤ ε , sup |λ (t) − t| ≤ ε }, t∈[0,T ]
d0(x, y) = inf{ε | ∃λ ∈ Λ, sup_{t∈[0,T]} ρ(x(λ(t)), y(t)) ≤ ε, ‖λ‖ ≤ ε}.
Theorem 16.7. (1) The functions d, d0 are metrics on D([0, T ], Y). (2) The space (D([0, T ], Y), d) is separable but is not complete. (3) The space (D([0, T ], Y), d0 ) is both separable and complete. (4) A sequence {xn } ⊂ D([0, T ], Y) converges to some x ∈ D([0, T ], Y) in the metric d if and only if this sequence converges to x in the metric d0 . The last statement in Theorem 16.15 shows that the classes of the closed sets in the spaces (D([0, T ], Y), d) and (D([0, T ], Y), d0 ) coincide, and therefore the definitions of the weak convergence in these spaces are equivalent. For a tightness of a sequence of distributions in D([0, T ]) of stochastic processes, a wide choice of sufficient conditions is available (see [25], Chapter 9, [4], Chapter 3). We formulate one of them. Theorem 16.8. Let a sequence of random processes Xn = {Xn (t),t ∈ [0, T ]}, n ≥ 1 be such that, for some constants α , β ,C > 0, E|Xn (t) − Xn (s)|α |Xn (r) − Xn (t)|α ≤ C|r − s|1+β , s < t < r, n ∈ N. Then the sequence of the distributions of these processes in D([0, T ]) is tight. Consider a triangular array of random variables {ξnk , 1 ≤ k ≤ n}, where for every n random variables ξn1 , . . . , ξnn are independent and identically distributed. Consider the random walk Snk = ∑kj=1 ξn j , 1 ≤ k ≤ n corresponding to this triangular array. Theorem 16.9. Let the central limit theorem hold for the array {ξnk , 1 ≤ k ≤ n}; that is there exists random variable η such that Snn ⇒ η . Then the random processes Xn (t) := S[nt] , t ∈ [0, 1], n ≥ 1 converge by distribution in D([0, 1]) to the stochastically continuous homogeneous process with independent increments Z such that d
Z(1) = η . Note that, under an appropriate choice of the array {ξnk }, any infinitely divisible distribution can occur as the distribution of the variable η . Correspondingly, any L´evy process can occur as the limiting process Z. Together with a qualitative statement about convergence of a sequence of distributions, frequently (especially in applications) explicit estimates for the rate of convergence are required. The rate of convergence for a sequence of probability distributions can be naturally controlled by a distance between the prelimit and limit distributions w.r.t. some probability metric; that is, a metric on the family of probability measures. Below, we give a list of the most important and frequently used probability metrics. The class of probability measures on a measurable space (X, X) will be denoted by P(X). Consider first the case X = R, X = B(R). In this case, every measure μ ∈ P(R) is uniquely defined by its distribution function Fμ .
Definition 16.5. The uniform metric (or the Kolmogorov metric) is the function
dU(μ, ν) = sup_{x∈R} |Fμ(x) − Fν(x)|, μ, ν ∈ P(R).
Definition 16.6. The L´evy metric is the function dL (μ , ν ) = inf{ε | Fν (x − ε ) − ε ≤ Fμ (x) ≤ Fν (x + ε ) + ε , x ∈ R},
μ , ν ∈ P(R).
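Definitions 16.5 and 16.6 are easy to evaluate numerically for one-dimensional laws. The sketch below (Python, using only the standard library and numpy; the two normal laws are an arbitrary illustrative choice) approximates dU on a grid and finds dL by bisection over ε.

import numpy as np
from math import erf, sqrt

def F(x, m, s):                                  # normal distribution function
    return 0.5 * (1.0 + erf((x - m) / (s * sqrt(2.0))))

m1, s1, m2, s2 = 0.0, 1.0, 0.3, 1.3              # mu = N(m1, s1^2), nu = N(m2, s2^2)
xs = np.linspace(-10.0, 10.0, 20001)
F_mu = np.array([F(x, m1, s1) for x in xs])
F_nu = np.array([F(x, m2, s2) for x in xs])

d_U = np.max(np.abs(F_mu - F_nu))                # uniform (Kolmogorov) metric

def levy_ok(eps):
    F_nu_minus = np.array([F(x - eps, m2, s2) for x in xs])
    F_nu_plus = np.array([F(x + eps, m2, s2) for x in xs])
    return np.all(F_nu_minus - eps <= F_mu) and np.all(F_mu <= F_nu_plus + eps)

lo, hi = 0.0, 1.0
for _ in range(50):                              # bisection for the Levy metric
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if levy_ok(mid) else (mid, hi)

print("d_U ~", d_U, "  d_L ~", hi)               # note that d_L <= d_U always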
Definition 16.7. The Kantorovich metric is the function
dK(μ, ν) = ∫_R |Fμ(x) − Fν(x)| dx, μ, ν ∈ P(R). (16.2)
Note that the integral in the right-hand side of (16.2) can diverge, and thus the Kantorovich metric can take value +∞. Next, let (X, ρ ) be a metric space, X = B(X). Definition 16.8. The L´evy-Prokhorov metric is the function dLP (μ , ν ) = inf{ε | μ (A) ≤ ν (Aε ) + ε , A ∈ B(X)},
μ , ν ∈ P(X),
where A^ε = {y| ρ(y, A) < ε} is the open ε-neighborhood of the set A. It requires some effort to prove that dLP is a metric indeed; see Problem 16.52. For a Lipschitz function f : X → R, denote by Lip(f) its Lipschitz constant; that is, the infimum of L such that |f(x) − f(y)| ≤ Lρ(x, y), x, y ∈ X.
Definition 16.9. The Lipschitz metric is the function
dLip(μ, ν) = sup_{f : Lip(f)≤1} |∫_X f dμ − ∫_X f dν|, μ, ν ∈ P(X).
The Lipschitz metric is closely related to the Kantorovich metric; see Theorem 16.12. Some authors use term “Kantorovich metric” for the metric dLip . For μ , ν ∈ P(X), denote by C(μ , ν ) the class of all random elements Z = (X,Y ) in (X × X, X ⊗ X) such that the first component X has distribution μ and the second component Y has distribution ν . Such a random element is called a coupling for the measures μ , ν . Definition 16.10. The Wasserstein metric of the power p ∈ [1, +∞) is the function dW,p (μ , ν ) =
inf_{(X,Y)∈C(μ,ν)} [Eρ^p(X,Y)]^{1/p}, μ, ν ∈ P(X).
In general, the Wasserstein metric can take value +∞.
We remark that some authors insist that, from the historical point of view, the correct name for dW,p is the Kantorovich metric (or the Kantorovich–Rubinstein metric). We keep the term “Wasserstein metric” which is now used more frequently. The Wasserstein metric is a typical example of a coupling (or minimal) probability metric. The general definition for the coupling metric has the form
inf_{(X,Y)∈C(μ,ν)} H(X,Y), (16.3)
where H is some metric on the set of random elements (see [92], Chapter 1). In the definition of the Wasserstein metric, H is equal to the L p -distance H(X,Y ) = ρ (X,Y )L p . Under quite general assumptions, the infimum in the definition of the Wasserstein metric is attained; that is, for a given μ , ν ∈ P(X), p ∈ [1, +∞) there exists an element Z ∗ = (X ∗ ,Y ∗ ) ∈ C(μ , ν ) such that p (μ , ν ) Eρ p (X ∗ ,Y ∗ ) = dW,p
(16.4)
(see Problem 16.55). Any element Z ∗ satisfying (16.4) is called an optimal coupling for the measures μ , ν w.r.t. metric dW,p . In the important particular cases, explicit formulae are available both for the Wasserstein metric and for corresponding optimal couplings. Proposition 16.2. Let X = R, ρ (x, y) = |x − y|. For arbitrary μ , ν ∈ P(X) define the vector [−1] [−1] Zμ ,ν = (Fμ (U), Fν (U)), where U is the random variable uniformly distributed on [0, 1] and F [−1] (x) = inf{y| F(y) > x} is the quantile transformation for the function F. Then for every p ∈ [1, +∞) the random vector Zμ ,ν is an optimal coupling for the measures μ , ν w.r.t. to the metric dW,p . In particular (see Problem 16.57), p dW,p (μ , ν ) =
1 [−1] 0
Fμ
[−1]
(x) − Fν
p (x) dx,
p ∈ [1, +∞).
(16.5)
Now, let (X, X) be an arbitrary measurable space. Recall that, by the Hahn theorem, for any σ -finite signed measure κ there exists a set C ∈ X such that κ(A) ≥ 0 for any A ∈ X, A ⊂ C and κ(B) ≤ 0 for any B ∈ X, B ⊂ X\C. The measure |κ| (·) := κ(· ∩ C) − κ(· ∩ (X\C)) is called the variation of the signed measure κ, and |κ| (X) is called the total variation of κ. Definition 16.11. The total variation metric (or the total variation distance) is the function dV (μ , ν ) = μ − ν var , μ , ν ∈ P(X), where μ − ν var is the total variation of the signed measure μ − ν .
248
16 Weak convergence, probability metrics. Functional limit theorems
Definition 16.12. The Hellinger metric is the function ⎡
)3
dH ( μ , ν ) = ⎣
X
dμ − dλ
3
dν dλ
⎤1/2
*2
dλ ⎦
,
μ , ν ∈ P(X),
where λ is an arbitrary σ -finite measure such that μ λ , ν λ . The value dH (μ , ν ) does not depend on the choice of λ (see Problem 16.65). The Hellinger metric is closely related to the Hellinger integrals. Definition 16.13. The Hellinger integral of the power θ ∈ [0, 1] is the function Hθ (μ , ν ) =
X
dμ dλ
θ
dν dλ
1−θ
dλ ,
μ , ν ∈ P(X).
Here, as in the previous definition, λ is a measure such that μ λ , ν λ . For θ = 0 or 1, the notational convention 00 = 1 is used. The Hellinger integral H1/2 (μ , ν ) is also called the Hellinger affinity. The values Hθ (μ , ν ), θ ∈ [0, 1] do not depend on the choice of λ (see Problem 16.65). Hellinger integrals appear to be a useful tool for investigating the properties of absolute continuity and singularity of the measures μ and ν (see Definitions 17.1, 17.2, and Problems 16.68,16.69). In order to estimate how close two probability distributions are each to other, some “distance” functions are also used, not being the metrics in the true sense. These functions can be nonsymmetric w.r.t. μ , ν , fail to satisfy the triangle inequality, and so on. Here we give one such function that is used most frequently. Definition 16.14. For μ , ν ∈ P(X), let λ be a σ -finite measure such that μ λ , ν λ . Denote f = d μ /d λ , g = d ν /d λ . The relative entropy (or the Kullback–Leibler distance) for the measures μ , ν is defined by f dλ . f ln E(μ ν ) = g X Here, the notational conventions 0 ln(0/p) = 0, p ≥ 0, and p ln(p/0) = +∞, p > 0 are used. The relative entropy can take value +∞. Its value for a given μ , ν does not depend on the choice of λ (see Problem 16.65). Let us formulate the most important properties of the probability metrics introduced above. Let X be a Polish space. Theorem 16.10. (1) A sequence {μn } ⊂ P(X) converges weakly to μ ∈ P(X) if and only if dLP (μn , μ ) → 0, n → +∞. (2) The set P(X) with the L´evy–Prokhorov metric dLP forms a Polish metric space.
16 Weak convergence, probability metrics. Functional limit theorems
249
Theorem 16.11. Assume the metric ρ on the set X is bounded. Then for every p ∈ [1, +∞), (1) The set P(X) with the Wasserstein metric dW,p forms a Polish metric space. (2) The sequence {μn } ⊂ P(X) converges weakly to μ ∈ P(X) if and only if dW,p (μn , μ ) → 0, n → +∞. Theorem 16.12. (The Kantorovich–Rubinstein theorem) The Wasserstein metric dW,1 coincides with the Lipschitz metric dLip . Furthermore, in the case X = R, ρ (x, y) = |x − y| both these metrics are equal to the Kantorovich metric dK . Convergence of a sequence of measures w.r.t. total variation metric dTV is called var convergence in variation (notation: μn → μ ). This convergence is stronger than the var weak convergence; that is, μn → μ implies μn ⇒ μ , but inverse implication, in general, does not hold. The following statement, in particular, shows that convergence in the Hellinger metric is equivalent to convergence in variation. Proposition 16.3. For the Hellinger metric dH and the total variation metric dTV , the following relations hold. dH2 ≤ dTV ≤ 2dH . Let us give one more property of the total variation metric, which has a wide range of applications in ergodic theory for Markov processes with a general phase space. The following statement, by different authors, is named the coupling lemma or the Dobrushin lemma. Proposition 16.4. For any μ , ν ∈ P(X), dTV (μ , ν ) = 2
inf
(X,Y )∈C(μ ,ν )
P(X = Y ).
The coupling lemma states that the total variation metric, up to multiplier 2, coincides with the coupling metric that corresponds to the “indicator distance” H(X,Y ) = P(X = Y ). The properties given above show that there exist close connections between various probability metrics. The variety of probability metrics used in the literature is caused by the fact that every such metric arises naturally from a certain class of models and problems. On the other hand, some of the metrics have several additional properties that appear to be useful because these properties provide more convenient and easy analysis involving these metrics. One such property is called the tensorization property and means that some characteristics of the metric are preserved under the operation of taking a tensor product. Let us give two examples of statements of such kind (see Problems 16.60, 16.67). Proposition 16.5. Let X = X1 × X2 and the metric ρ on X has the form
1/p ρ (x, y) = ρ1p (x1 , y1 ) + ρ2p (x2 , y2 ) , x = (x1 , x2 ), y = (y1 , y2 ) ∈ X,
250
16 Weak convergence, probability metrics. Functional limit theorems
where p ∈ [1, +∞) and ρ1 , ρ2 are the metrics in X1 , X2 . Then the distance between arbitrary product-measures μ = μ1 × μ2 , ν = ν1 × ν2 w.r.t. the Wasserstein metric of the power p is equal to
1/p p p (μ1 , ν1 ) + dW,p ( μ 2 , ν2 ) . dW,p (μ , ν ) = dW,p Proposition 16.6. Let X = X1 × X2 . Then the distance between arbitrary productmeasures μ = μ1 × μ2 , ν = ν1 × ν2 w.r.t. the Hellinger metric satisfies 1 2 1 1 dH (μ , ν ) = 1 − 1 − dH2 (μ1 , ν1 ) 1 − dH2 (μ2 , ν2 ) . 2 2 2 In particular, dH2 (μ , ν ) ≤ dH2 (μ1 , ν1 ) + dH2 (μ2 , ν2 ).
Bibliography [4]; [9], Chapter V; [17]; [25], Chapter IX; [88]; [92], Chapter I.
Problems 16.1. Let {X(t),t ∈ [0, T ]} be a process that has a continuous modification. Prove that the process X generates a random element in C([0, T ]). 16.2. Let {X(t),t ∈ [0, T ]} be a process that has a measurable modification and such that E 0T X 2 (t) dt < +∞. Prove that the process X generates a random element in L2 ([0, T ]). 16.3. Let {X(t),t ∈ [0, T ]} be a process that has a c`adl`ag modification. Prove that the process X generates a random element in D([0, T ]). 16.4. Let X be a metric space and μ be a finite measure on B(X). Prove that (1) For any A ∈ B(X), ε > 0 there exist a closed set Fε and open set Gε such that Fε ⊂ A ⊂ Gε and μ (Gε \Fε ) < ε (the regularity property for a measure on a metric space). (2) If X is a Polish space then for any A ⊂ B(X), ε > 0 there exists a compact set Kε ⊂ A such that μ (A\Kε ) < ε (a refinement of the Ulam theorem). (3) For any p ∈ [1, +∞) the set Cb,u (X) of all bounded uniformly continuous functions on X is dense in L p (X, μ ). 16.5. Let (X, ρ ) be a Polish space, and μ : B(X) → [0, 1] be an additive set function. Prove that μ is σ -additive (i.e., is a measure) if and only if μ (A) = sup{μ (K)| K ⊂ A, K is a compact set}, A ∈ B(X).
16.6. Let X be the σ -algebra of subsets of ∞ generated by the open balls, and {ξk , k ≥ 1} are i.i.d. random variables with P(ξk = ±1) = 12 . Prove: (1) The sequence {ξk } generates a random element ξ in (∞ , X). (2) Every compact subset K ⊂ ∞ belongs to the σ -algebra X, and for every such set P(ξ ∈ K) = 0. 16.7. Let {ξk , k ≥ 1} be the sequence of i.i.d. random variables that have standard normal distribution. Prove: # " ( (1) The sequence ζk = ξk / ln(k + 1) does not generate a random element in c0 . (2) The sequence {ζk } generates a random element ζ in the space ∞ with the σ -algebra X generated by the open balls. (3) Every compact subset K ⊂ ∞ belongs to the σ -algebra X, and for every such set P(ζ ∈ K) = 0. Thereby, for the distributions of the element ζ and the element ξ introduced in the previous problem, the statement of the Ulam theorem fails. 16.8. Let Xn , X be the random variables, Xn ⇒ X, and the distribution function FX be continuous in every point of some closed set K. Prove that supx∈K |FXn (x) − FX (x)| → 0, n → +∞. 16.9. Give an example of the random vectors Xn = (Xn1 , Xn2 ), n ≥ 1, X = (X 1 , X 2 ) such that Xn ⇒ X and (a) For every n there exists a function fn ∈ C(R) such that Xn2 = fn (Xn1 ) a.s.. (b) There does not exist a measurable function f such that X 2 = f (X 1 ) a.s. Such an example demonstrates that functional dependence is not preserved under weak convergence. 16.10. Let {Xn } be a sequence of random elements in (X, B(X)) with a tight family of the distributions. Prove that for every f ∈ C(X, Y) the family of the distributions in (Y, B(Y)) of the elements Yn = f (Xn ) is also tight. 16.11. Let X, Y be metric spaces, X × Y be their product, and {μn } be a sequence of measures on the Borel σ -algebra in X × Y. Prove that the sequence {μn } is tight if and only if both the sequences {μn1 }, {μn2 } of the marginal distributions for the measures {μn } are tight. The marginal distributions for a measure μ on B(X × Y) are defined by
μ 1 (A) = μ (A × Y), A ∈ B(X),
μ 2 (B) = μ (X × B), B ∈ B(Y).
16.12. Consider the" following functional spaces: # (a) Lip([0, 1]) = f f Lip := | f (0)| + sups,t∈[0,1],s=t | f (t) − f (s)|/|t − s| < +∞ (Lipschitz functions). " # (b) Hγ ([0, 1]) = f f Hγ ≡ | f (0)| + sups,t∈[0,1],s=t | f (t) − f (s)|/|t − s|γ < +∞
(H¨older functions with the index γ ∈ (0, 1)).
Are they Banach spaces w.r.t. norms · Lip and · Hγ , respectively? Which of these spaces are separable? 16.13. Let
t
Xn (t) = nt1I[0,1/(2n)) + 1 − 1I[1/(2n),1/n) , t ∈ [0, 1], n ≥ 1, X ≡ 0. n Prove that (a) All the finite-dimensional distributions of the process Xn converge weakly to the corresponding finite-dimensional distributions of the process X. (b) The sequence {Xn } does not converge to X by distribution in C([0, 1]). 16.14. Let {a, b, c, d} ⊂ (0, 1) and a < b, c < d. Calculate the distance between the functions x = 1[a,b) and 1I[c,d) , considered as elements of D([0, 1]), w.r.t. the metric (a) d; (b) d0 . 16.15. Let xn (t) = 1I[1/2,1/2+1/n) (t),t ∈ [0, 1], n ≥ 2. (1) Prove that the sequence {xn } is fundamental in D([0, 1]) w.r.t. the metric d, but this sequence does not converge. (2) Check directly that this sequence is not fundamental in D([0, 1]) w.r.t. the metric d0 . 16.16. Is the set D([0, 1]) a closed subset of the space B([0, 1]) of all bounded functions on [0, 1] with the uniform metric? 16.17. Prove that the space D([0, 1]), endowed with the uniform metric, is complete but is not separable. 16.18. Prove that if xn → x in the metrics d of the space D([0, 1]) and the function x is continuous, then xn − x∞ → 0. 16.19. Prove that: (1) C([0, 1]) is a closed subset of D([0, 1]). (2) C([0, 1]) is a nowhere dense subset of D([0, 1]); that is, for every nonempty open ball B ⊂ D([0, 1]) there exists a nonempty open ball B ⊂ B such that B ∩ C([0, 1]) = ∅. 16.20. Give an example of sequences {xn }, {yn } in D([0, 1]) such that the sequences themselves converge in D([0, 1]) but the sequence of the R2 -valued functions {zn (t) = (xn (t), yn (t))} does not converge in D([0, 1], R2 ). 16.21. Is the mapping S : (x, y) → x + y, x, y ∈ D([0, 1]) continuous as a function D([0, 1]) × D([0, 1]) → D([0, 1])? 16.22. For a given a < b consider the following functional on C([0, 1]): Iab (x) =
∫_0^1 1I_{x(s) ∈ [a,b]} ds,   x ∈ C([0, 1]).
Prove that this functional is not continuous, but the set of its discontinuity points has zero Wiener measure.
16.23. Let {X(t) ∈ [0, 1]} be a process with continuous trajectories. For a ∈ R, denote by DXa the set of t ∈ [0, 1] such that the corresponding one-dimensional distribution has an atom in the point a; that is, DXa = {t ∈ [0, 1]| P(X(t) = a) > 0}. Prove that: (1) DXa ∈ B([0, 1]). (2) The set of discontinuity points for the functional Iab introduced in the previous problem has zero measure w.r.t. distribution of the process X if and only if the set DXa ∪ DXb has zero Lebesgue measure. 16.24. For a given a < b, consider the functional Iab (Problem 16.22) on the space D([0, 1]). Prove that this functional is not continuous, but for every c = 0 the set of its discontinuity points has zero measure w.r.t. distribution of the process X(t) = N(t) + ct,t ∈ [0, 1], where N is the Poisson process. Does the last statement remain true for c = 0? 16.25. For a given z ∈ R, consider the functional on C(R+ ) inf{t | x(t) = z}, {t | x(t) = z} = ∅, τ (z, x) = +∞, {t | x(t) = z} = ∅. Prove that τ (z, ·) is not a continuous functional, but the set of its discontinuity points has zero Wiener measure. 16.26. On the space C([0, 1]), consider the functionals M(x) = max x(t), ϑ (x) = min{t | x(t) = M(x)}, x ∈ C([0, 1]). t∈[0,1]
Prove that (a) the functional M is continuous; (b) the functional ϑ is not continuous, but the set of its discontinuity points has zero Wiener measure. 16.27. Prove that the following functionals on C([0, 1]) are not continuous, but the sets of their discontinuity points have zero Wiener measure. κ(x) = min{t ∈ [0, 1]| x(t) = x(1)},
χ (x) = max{t ∈ [0, 1]| x(t) = 0},
x ∈ C([0, 1]). 16.28. Let a : [0, 1] → R be a positive continuous function. Prove that the functional Ma : D([0, 1]) → R, Ma (x) = supt∈[0,1] x(t)/(a(t)) is continuous. 16.29. Denote T (x) = inf{t ∈ [0, 1] | x(t−) = x(t)}, x ∈ D([0, 1]). Is T (·) a continuous functional on D([0, 1])? 16.30. For c > 0, denote Tc (x) = inf{t ∈ [0, 1] | |x(t−) − x(t)| > c}, x ∈ D([0, 1]), that is, the moment of the first jump of the function x with the jump size exceeding c. Prove that for any measure μ on D([0, 1]) there exists at most countable set Aμ such that for arbitrary c ∈ Aμ the set of discontinuity points for Tc has zero measure μ .
16.31. Denote τ_a(x) = inf{t ∈ [0, 1] | x(t) ≥ a}, x ∈ D([0, 1]), that is, the moment of the first passage of x over the level a (if the set is empty, then put τ_a(x) = 1). Describe the set of values a ∈ R such that the set of discontinuity points for τ_a has zero measure w.r.t. the distribution of the Poisson process in D([0, 1]).
16.32. Prove that for arbitrary a ∈ R the set of discontinuity points for the functional τ_a introduced in the previous problem has zero measure w.r.t. the distribution of the process X(t) = N(t) − t, t ∈ [0, 1], where N is the Poisson process.
16.33. Let {S_n = ∑_{k≤n} ξ_k, n ∈ Z_+} be a random walk with Eξ_k = 0, Eξ_k² = 1. Prove that for any a < b

(1/N) ∑_{n≤N} P(S_n ∈ [a√N, b√N]) → ∫_0^1 ∫_a^b (2πs)^{−1/2} e^{−y²/(2s)} dy ds,   N → ∞.
16.34. Let {S_n, n ∈ Z_+} be as in the previous problem. Denote H_S^N(z) = #{n ≤ N | S_n ≥ z·√N}, z ∈ R. Prove that

P( (1/N) H_S^N(0) ≤ α ) → (2/π) arcsin √α,   N → ∞, α ∈ (0, 1).
16.35. (1) Let {S_n = ∑_{k≤n} ξ_k, n ∈ Z_+} be the random walk with P(ξ_k = ±1) = 1/2. Prove that

P(max_{n≤N} S_n ≥ z) = 2P(S_N > z) + P(S_N = z),   z ∈ Z_+.

(2) Let W be the Wiener process. Prove that

P(max_{s≤t} W(s) ≥ z) = 2P(W(t) ≥ z),   z ≥ 0.

(3) Let {S_n = ∑_{k≤n} ξ_k, n ∈ Z_+} be a random walk with Eξ_k = 0, Eξ_k² = 1. Prove that

P(max_{n≤N} S_n ≥ z·√N) / P(S_N ≥ z·√N) → 2,   N → ∞, z ≥ 0.
16.36. Let W be the Wiener process, z > 0. (1) Find the distribution density of the random variable τ (z,W ) (the functional τ (·, ·) is defined in Problem 16.25). (2) Prove that Eτ α (z,W ) = +∞ for α ≥ 12 and Eτ α (z,W ) < +∞ for α ∈ (0, 12 ). 16.37. Prove that {Y (z) = τ (z,W ), z ∈ R+ } is a stochastically continuous homogeneous process with independent increments. 16.38. Find the cumulant and the L´evy measure of the process Y from the previous problem.
16.39. Let {S_n = ∑_{k≤n} ξ_k} be the random walk with P(ξ_k = ±1) = 1/2, and H_S^N be the function defined in Problem 16.34. Prove that for z < 0, α ∈ (0, 1),

P( (1/N) H_S^N(z) ≤ α ) → ∫_0^{1−α} (|z| / √(2π s³)) e^{−z²/(2s)} · (2/π) arcsin √(α/(1−s)) ds,   N → ∞.
Give the formula for limN→∞ P N −1 HSN (z) ≤ α when z > 0. 16.40. Let {Sn } be as in the previous problem. (1) For a given n, N (n < N) find P(Sm ≤ Sn , m ≤ N). (2) Denote ϑ N (S) ≡ min{m : Sm = maxn≤N Sn }. Prove that P(ϑ N (S) ≤ α · N) →
(2/π) arcsin √α,   N → ∞, α ∈ (0, 1).
16.41. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be the random walk with P(ξk = ±1) = 12 . Find P(max Sn = m, ϑ N (S) = k, SN = r). n≤N
16.42. Find the joint distribution of the variables maxt∈[0,1] W (t), ϑ (W ),W (1), where W is a Wiener process, and the functional ϑ is defined in Problem 16.26. 16.43. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be the random walk with P(ξk = ±1) = 12 . Find P(maxn≤N Sn = m, minn≤N Sn = k, SN = r). 16.44. Find the joint distribution of the variables maxt∈[0,1] W (t), mint∈[0,1] W (t),W (1) (W is a Wiener process). Compare with Problem 7.108. 16.45. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be a random walk with Eξk = 0, Eξk2 = 1. Prove that (2m+1)z √ 2 1 √ e−y /2 dy P(max |Sn | ≤ z · N) → ∑ (−1)m n≤N (2m−1)z 2 π m∈Z π 2 (2m + 1)2 4 ∞ (−1)m exp − = 1− ∑ , N → ∞, z > 0. π m=1 m + 1 8z2 d
16.46. Without an explicit calculation of the distribution, show that κ(W) =^d χ(W) (equality in distribution; the functionals κ, χ are defined in Problem 16.27).
16.47. Prove that

P(κ(W) ≤ α) = P(χ(W) ≤ α) = (2/π) arcsin √α = lim_{N→∞} P(S_n ≠ 0, n ≥ α·N),   α ∈ (0, 1),

where {S_n} is the random walk with P(ξ_k = ±1) = 1/2.
16.48. Without passing to the limit, find the distribution of the variables κ(W ), χ (W ). Compare with the previous problem. 16.49. In the array {ξnk , 1 ≤ k ≤ n} let the random variables ξn1 , . . . , ξnn be i.i.d. with P(ξn1 = 1) = λ /n, P(ξn1 = 0) = 1 − λ /n. Prove that for any a > 0, P(Snk ≤ ak, k = 1, . . . , n) → P( max (N(t) − at) ≤ 0), n → ∞, t∈[0,1]
where N is the Poisson process with intensity λ.
16.50. In the situation of the previous problem, prove that for arbitrary a > 0 the distributions of the variables n^{−1} #{k : S_{nk} > ak} weakly converge to the distribution of the variable ∫_0^1 1I_{N(t)>at} dt.
16.51. Verify the following relations between the uniform metric d_U and the Lévy metric d_L.
(1) d_U(μ, ν) ≥ d_L(μ, ν), μ, ν ∈ P(R).
(2) If the measure ν possesses a bounded density p_ν, then d_U(μ, ν) ≤ (1 + sup_{x∈R} p_ν(x)) d_L(μ, ν), μ ∈ P(R).
In particular, if ν ∼ N(0, 1), then d_L(μ, ν) ≤ d_U(μ, ν) ≤ (1 + (2π)^{−1/2}) d_L(μ, ν), μ ∈ P(R).
16.52. Verify the metric axioms for the L´evy–Prokhorov metric dLP : (a) dLP (μ , ν ) = 0 ⇔ μ = ν . (b) dLP (μ , ν ) = dLP (ν , μ ). (c) dLP (μ , ν ) ≤ dLP (μ , π ) + dLP (π , ν ) for any μ , ν , π ∈ P. 16.53. Prove the triangle inequality for the Wasserstein metric. 16.54. Let X be a Polish space, μ , ν ∈ P(X), and let {Zn , n ≥ 1} ⊂ C(μ , ν ) be an arbitrary sequence. Prove that the sequence of distributions of Zn , n ≥ 1 in X × X is weakly compact. 16.55. Let X be a Polish space, μ , ν ∈ P(X), p ∈ [1, +∞). Prove the existence of an optimal coupling for the measures μ , ν , that is, of such an element Z ∗ = (X ∗ ,Y ∗ ) ∈ p (μ , ν ). C(μ , ν ) that Eρ p (X ∗ ,Y ∗ ) = dW,p 16.56. Let X be a Polish space, p ∈ [1, +∞). Prove that the class of optimal couplings for the measures μ , ν depends on μ , ν continuously in the following sense. For any sequences {μn } ⊂ P(X) and {νn } ⊂ P(X), convergent in the metric dW,p to measures μ and ν , respectively, and any sequence Zn∗ , n ≥ 1 of optimal couplings for μn , νn , n ≥ 1 weakly convergent to an element Z ∗ , the element Z ∗ is an optimal coupling for μ , ν .
16.57. Prove formula (16.5) (a) for discrete measures μ , ν ; (b) in the general case. 16.58. Calculate the Wasserstein distance dW,2 (μ , ν ) for μ ∼ U(0, 1), ν ∼ Exp(λ ). For what λ is this distance minimal; that is, which exponential distribution gives the best approximation for the uniform one? 16.59. Calculate the Wasserstein distance dW,2 (μ , ν ) for μ ∼ N(a1 , σ12 ), ν ∼ N(a2 , σ22 ). 16.60. Prove Proposition 16.5. 16.61. Let {λk }, {θk } be a given sequences of nonnegative real numbers such that ∑k λk < +∞, ∑k θk < +∞, and {ξk }, {ηk } are sequences of independent centered Gaussian random variables with the variances {λk } and {θk }, respectively. Find the Wasserstein distance dW,2 between the distributions of the random elements generated by these two sequences in the space 2 . 16.62. Let {X(t),Y (t),t ∈ [a, b]} be centered Gaussian processes, and let their covariance functions RX , RY be continuous on [a, b]2 . (1) Prove that the processes X,Y generate random elements in the space L2 ([a, b]). (2) Prove the following estimate for the Wasserstein distance between the distributions μX , μY of the random elements generated by the processes X,Y in L2 ([a, b]), 9 dW,2 (μX , μY ) ≤
( ∫∫_{[a,b]²} (Q_X(t, s) − Q_Y(t, s))² ds dt )^{1/2},

where Q_X, Q_Y ∈ L_2([a, b]²) are arbitrary kernels satisfying

∫_a^b Q_X(t, r) Q_X(s, r) dr = R_X(t, s),   ∫_a^b Q_Y(t, r) Q_Y(s, r) dr = R_Y(t, s),
t, s ∈ [a, b].
16.63. Prove that the Wasserstein distance d_{W,2} between the distributions of the random elements generated by the Wiener process and the Brownian bridge in L_2([0, 1]) is bounded from below by 1/√3 − √2/3 and from above by 1/√3.
16.64. Prove that taking convex combinations of probability measures does not increase the Wasserstein distance d_{W,p}; that is, for any μ_1, ..., μ_m, ν_1, ..., ν_m ∈ P(X) and α_1, ..., α_m ≥ 0 with ∑_{k=1}^m α_k = 1,

d_{W,p}( ∑_{k=1}^m α_k μ_k , ∑_{k=1}^m α_k ν_k ) ≤ max_{k=1,...,m} d_{W,p}(μ_k, ν_k).
Does this property hold for other coupling metrics? 16.65. A σ -finite measure λ is said to dominate measure μ ∈ P(X) if μ λ . In Definitions 16.11—16.13, the values of the Hellinger metric dH (μ , ν ), Hellinger integrals Hθ (μ , ν ), θ ∈ [0, 1], and relative entropy E(μ ν ) are defined in terms of a measure λ that dominates both μ and ν . Prove that, for a given μ , ν ∈ P(X): (1) There exists at least one such a measure λ . (2) The values of dH (μ , ν ), Hθ (μ , ν ), θ ∈ [0, 1], and E(μ ν ) do not depend on the choice of λ .
16.66. Verify that Hθ (μ1 × μ2 , ν1 × ν2 ) = Hθ (μ1 , ν1 )Hθ (μ2 , ν2 ), θ ∈ [0, 1]. Use this relation for proving Proposition 16.6. 16.67. Prove Proposition 16.3. 16.68. Let μ , ν ∈ P(X). Prove the following statements. (1) H0 (μ , ν ) = H1 (μ , ν ) = 1 and Hθ (μ , ν ) ≤ 1 for every θ ∈ (0, 1). If Hθ (μ , ν ) = 1 for at least one θ ∈ (0, 1), then μ = ν . (2) The function Hμ ,ν : [0, 1] θ → Hθ (μ , ν ) is log-convex; that is, α Hμ ,ν (αθ1 + (1 − α )θ2 ) ≤ Hμα,ν (θ1 )Hμ1− ,ν (θ2 ), θ1 , θ2 ∈ [0, 1], α ∈ (0, 1).
(3) The function Hμ ,ν is continuous on the interval (0, 1). (4) The measure μ is absolutely continuous w.r.t. ν if and only if the function Hμ ,ν is continuous at the point 1. (5) In order for the measures μ and ν to be mutually singular it is necessary that for every θ ∈ (0, 1), and it is sufficient that for some θ ∈ (0, 1), the Hellinger integral Hθ (μ , ν ) equals 0. :
16.69. (Kakutani alternative). Let (X, X) = (∏n∈N Xn , n∈N Xn ) be a countable product of a measurable spaces (Xn , Xn ), n ∈ N, and let μ and ν be the product measures on this space: μ = ∏n∈N νn , μ = ∏n∈N νn , where μn , νn ∈ P(Xn ), n ∈ N. Assuming that for every n ∈ N the measure μn is absolutely continuous w.r.t. νn , prove that for the measures μ , ν only the two following relations are possible. (a) μ ν ; (b) μ ⊥ ν . Prove that the second relation holds if and only if ∏n∈N H1/2 (μn , νn ) = 0. 16.70. Prove that the Hellinger intregrals are continuous w.r.t. the total variation convar var vergence; that is, as soon as μn → μ , νn → ν , n → ∞, one has Hθ (μn , νn ) → Hθ (μ , ν ), n → ∞, θ ∈ [0, 1]. 16.71. Calculate Hθ (μ , ν ), θ ∈ [0, 1] for (a) μ ∼ N(a1 , σ 2 ), ν ∼ N(a2 , σ 2 ). (b) μ ∼ N(a, σ12 ), ν ∼ N(a, σ22 ). (c) μ is the uniform distribution on [a1 , b1 ]; ν is the uniform distribution on [a2 , b2 ] (a1 < b1 , a2 < b2 ). 16.72. Let μ , ν be the distributions of Poisson random variables with the parameters λ and ρ , respectively. Find Hθ (μ , ν ), θ ∈ [0, 1]. In the case λ = ρ , find θ∗ such that Hθ∗ (μ , ν ) = minθ ∈[0,1] Hθ (μ , ν ). 16.73. Let μ and ν be the distributions of the random vectors (ξ1 , . . . , ξm ) and (η1 , . . . , ηm ). Assuming that the components of the vectors are independent and ξk ∼ Pois(λk ), ηk ∼ Pois(ρk ), k = 1, . . . , m, find Hθ (μ , ν ), θ ∈ [0, 1].
16.74. Let μ and ν be the distributions of the random elements in L_2([0, T]) defined by the Poisson processes with the intensity measures κ_1 and κ_2, respectively (see Definition 5.3). Find H_θ(μ, ν), θ ∈ [0, 1].
16.75. Prove that the relative entropy E(μ ‖ ν) is equal to the left derivative at the point 1 of the function θ → H_θ(μ, ν):

E(μ ‖ ν) = lim_{θ→1−} (H_θ(μ, ν) − 1)/(θ − 1).

16.76. Prove that E(μ_1 × μ_2 ‖ ν_1 × ν_2) = E(μ_1 ‖ ν_1) + E(μ_2 ‖ ν_2).
16.77. (Variational formula for the entropy). Prove that, for an arbitrary measure ν ∈ P(X) and an arbitrary nonnegative function h ∈ L_1(X, ν),

ln ∫_X h dν = max_{μ∈P(X)} ( ∫_X ln h dμ − E(μ ‖ ν) ).
Hints

16.1–16.3. Prove that {ω : X̃(·, ω) ∈ B} ∈ F for an arbitrary open (or closed) ball B ⊂ X and the respective modification X̃ of the process X. Use Lemma 16.1.
16.4. (1) Use the "principle of the fitting sets". Prove that the class of the sets described in the formulation of the problem is a σ-algebra that contains all open sets.
(2) If F_{ε/2} is a closed set from the previous statement and K_{ε/2} is a compact set from the statement of the Ulam theorem, then K̃_ε = F_{ε/2} ∩ K_{ε/2} is the required compact set.
(3) Consider the following classes of functions: K_0 = C_{b,u}(X); K_1 = {the functions of the form f = 1I_G, G is an open set}; K_2 = {the functions of the form f = 1I_A, A is a Borel set}; K_3 = L_p(X, μ). Prove that every function from the class K_i (i = 1, 2, 3) can be obtained as an L_p-limit of a sequence of linear combinations of functions from K_{i−1}.
16.7. (1) Use statement (a) of Problem 1.16. (2), (3) Use reasoning analogous to that given in the proof of Problem 16.6.
16.10. Use that the image of a compact set under a continuous mapping is also a compact set.
16.11. Use the previous problem and the following two statements. (1) The functions π_X : X × Y ∋ (x, y) → x ∈ X, π_Y : X × Y ∋ (x, y) → y ∈ Y are continuous; (2) if K_1, K_2 are compact sets in X, Y, then K_1 × K_2 is a compact set in X × Y.
16.13. (a) For given t_1, ..., t_m and n greater than some n_0 = n_0(t_1, ..., t_m), P^{X_n}_{t_1,...,t_m} = P^X_{t_1,...,t_m}.
(b) For the open set G = {y | sup_t |y(t)| < 1/2}, lim_n P(X_n ∈ G) = 0 < 1 = P(X ∈ G).
16.14. If the function λ does not satisfy the conditions λ(c) = a, λ(d) = b, then sup_t |x(λ(t)) − y(t)| = 1.
16.15. Use the previous problem. 16.17. Let xa = 1It∈[a,1] ∈ D([0, 1]), a ∈ [0, 1]. Then, for every a1 = a2 , the uniform distance between xa1 and xa2 is equal to 1. 16.22, 16.23. Verify that x ∈ C([0, 1]) is a continuity point for the functional Iab if and only if 01 1I{a}∪{b} (x(t)) dt = 0. Use the hint to Problem 3.21. 16.24. Verify that x ∈ D([0, 1]) is a continuity point for the functional Iab if and only if 01 1I{a}∪{b} (x(t)) dt = 0. 16.25. Verify that x ∈ C([0, 1]) is a discontinuity point for the functional τ (·, z) in the following cases. (1) {x(t) = z} = ∅ and at least one of the sets {x(t) < z}, {x(t) > z} is not empty. (2) There exists a nonempty interval (a, b) ⊂ R+ such that x(t) = z,t ∈ (a, b). Prove that otherwise x ∈ C([0, 1]) is a continuity point for τ (·, z) and use Problem 3.23. 16.26. (a) | maxt x(t) − maxt y(t)| ≤ supt |x(t) − y(t)|. (b) Verify that x ∈ C([0, 1]) is a continuity point for the functional ϑ if and only if the function x takes its maximum value on [0, 1] in a unique point, and use Problem 3.22. 16.27. Describe explicitly the sets of discontinuity points for the functionals κ, χ (see the Hint to Problem 16.25) and use Problem 3.23. 16.33. Use the invariance principle, Theorem 16.3, and Problem 16.22. 16.34. Use Problem 10.32 and the strong Markov property for the random walk (see also [22], Vol. 1, Chapter III, §4). 16.35. (1) Use the reflection principle (see Problem 10.32 or [22], Vol. 1, Chapter III, §1). (2), (3) Use the invariance principle and item 1). In item (2), you can also use the reflection principle for the Wiener process; see Problem 7.109. 16.36. (1) P(τ (z,W ) ≤ x) = P(maxs≤x W (s) ≥ z). (2) Use item (1). 16.37. Use the strong Markov property for the Wiener process (see Definition 12.9 and Theorem 12.5). 16.38. Let n ∈ N, and denote η = τ (1,W ), ηn = τ (1/n,W ). By Problem 16.36, d
d
η = n2 ηn . Thus, for every n ≥ 1, ηn = n−2 (η1 + · · · + ηn ), where η1 , . . . , ηn are the independent random variables identically distributed with η . Therefore, η has a stable distribution with the parameter α = 12 . In addition, η ≥ 0. For the description of a characteristic function of a stable distribution, see [22] Vol. 2, Chapter XVII, §§3,5. 16.39. Prove the relation P(HSN (z) = m) =
∑_{k=1}^N P(H_S^{N−k}(0) = m) P(τ = k),

where τ = min{l : S_l ≥ z·√N}. Use this relation and Problems 16.34, 16.36.
16.40. See [22], Vol. 1, Chapter III, §7.
16.41–16.44. See [4], Chapter 2, §7.
16.46. Use Problem 6.5, item (e).
16.47. Use Problems 16.27, 16.40, and the invariance principle.
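The arcsine-type limits appearing in Problems 16.34, 16.40, and 16.47 can also be checked by a direct simulation of the walk. A minimal Python sketch follows (the parameters N, n_paths, α are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(1)
N, n_paths, alpha = 2000, 5000, 0.3

# simple random walk S_n with P(xi = +-1) = 1/2
steps = rng.choice([-1, 1], size=(n_paths, N))
S = np.cumsum(steps, axis=1)

# H_S^N(0) = #{n <= N : S_n >= 0}; estimate P(H_S^N(0)/N <= alpha)
H = np.sum(S >= 0, axis=1)
empirical = np.mean(H / N <= alpha)
limit = (2 / np.pi) * np.arcsin(np.sqrt(alpha))
print(empirical, limit)   # the two values should be close for large N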
16.48. P(χ (W ) < x) = P(W (s) = 0, s ∈ [x, 1]) = P(W (x) > 0, min W (s) −W (x) ≥ −W (x)) s∈[x,1]
+ P(W (x) < 0, max W (s) −W (x) ≤ −W (x)) s∈[x,1] −y2 /(2x) ∞ e 2 −z2 /(2(1−x)) √ ( 1− = e dz dy. R |y| 2π x 2π (1 − x) 16.49. Use Theorem 16.9 and Problem 16.28. 16.50. Use Theorem 16.9 and Problem 16.24. 16.51. (1) If ε ≥ dU (μ , ν ), then Fμ (x) ≤ Fν (x) + ε ≤ Fν (x + ε ) + ε for every x ∈ R. (2) If ε > dL (μ , ν ), then Fμ (x) ≤ Fν (x + ε ) + ε = Fν (x) +
x+ε x
pν (y) dy + ε , x ∈ R.
16.53. For random elements taking values in a Polish space X, prove the statement analogous to the one given in Problem 1.11. Then use this statement in order to solve the problem. 16.54. See Problem 16.11. 16.55. Use Problem 16.54 and the Fatou lemma. 16.56. Use the triangle inequality for the Wasserstein metric and, analogously to the solution of Problem 16.55, the Fatou lemma. 16.57. If X is a random variable and Xn = [nX]/n, n ∈ N is its discrete approximation, then E|X − Xn | p ≤ n−p → 0, n → ∞, and therefore the distributions of the variables Xn converge to the distribution of X in the metric dW,p . Therefore the statement of item (b) can be proved using item (a) and Problem 16.56. 16.58. Use formula (16.5). 16.59. Use formula (16.5) and the fact that the distribution function for N(a, σ 2 ) has the form F(x) = Φ ((x − a)/σ ) , where Φ denotes the distribution function for N(0, 1). 16.61. Use Problem 16.59 and Proposition 16.34. 16.62. Use Problem 6.13 in item (1) and Problems 6.28, 6.30 in item (2). 16.63. Use Problems 6.35 and 16.59 in order to obtain the upper and the lower estimates, respectively. 16.64. Let X1 , . . . , Xm be the random elements with distributions μ1 , . . . , μm , respectively, and θ be a random variable, independent of X1 , . . . , Xm and taking values 1, . . . , m with probabilities α1 , . . . , αm . Then the random element ⎧ ⎪ ⎨X1 , θ = 1 Xθ = . . . ⎪ ⎩ Xm , θ = m has the distribution α1 μ1 + · · · + αm μm .
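As a numerical companion to the hints for 16.57–16.59: on R the optimal coupling is monotone, so for two empirical measures with equally many atoms d_{W,p} reduces to pairing order statistics. The Python sketch below uses the closed form d_{W,2}²(N(a_1, σ_1²), N(a_2, σ_2²)) = (a_1 − a_2)² + (σ_1 − σ_2)², which is the content of Problem 16.59; the sample sizes and parameters are illustrative.

import numpy as np

def w_p_empirical(x, y, p=2.0):
    # d_{W,p} between two empirical measures on R with equally many atoms:
    # the monotone (quantile) coupling pairs order statistics, so
    # d_{W,p}^p = (1/n) * sum_i |x_(i) - y_(i)|^p
    x, y = np.sort(x), np.sort(y)
    return np.mean(np.abs(x - y) ** p) ** (1.0 / p)

rng = np.random.default_rng(2)
n = 200_000
a1, s1, a2, s2 = 0.0, 1.0, 1.0, 2.0
x = rng.normal(a1, s1, n)
y = rng.normal(a2, s2, n)
print(w_p_empirical(x, y, 2.0) ** 2)        # Monte Carlo estimate
print((a1 - a2) ** 2 + (s1 - s2) ** 2)      # closed form: 2.0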
16.66. d(μ_1 × μ_2)/d(λ_1 × λ_2) = (dμ_1/dλ_1) · (dμ_2/dλ_2). The Hellinger metric and the Hellinger affinity satisfy the relation 1 − (1/2) d_H²(μ, ν) = H_{1/2}(μ, ν).
16.69. Use Problem 16.68.
16.75. Use item (4) of Problem 16.68, the Fatou lemma, and the Lebesgue dominated convergence theorem.
16.77. Use Jensen's inequality.
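The Hellinger quantities in the last hints are easy to compute numerically. A minimal Python sketch for two one-dimensional Gaussian laws follows; it checks the closed form of Problem 16.71(a) and the relation 1 − d_H²/2 = H_{1/2} used in the hint to 16.66 (the parameters a_1, a_2, σ, θ are illustrative):

import numpy as np

a1, a2, s, theta = 0.0, 1.0, 1.0, 0.5
x = np.linspace(-10, 10, 200_001)
f = np.exp(-(x - a1) ** 2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)
g = np.exp(-(x - a2) ** 2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)

H_theta = np.trapz(f**theta * g**(1 - theta), x)      # Hellinger integral by quadrature
d_H2 = np.trapz((np.sqrt(f) - np.sqrt(g)) ** 2, x)    # squared Hellinger distance

print(H_theta, np.exp(-theta * (1 - theta) * (a1 - a2) ** 2 / (2 * s**2)))  # Problem 16.71(a)
print(1 - d_H2 / 2)   # should equal H_{1/2}(mu, nu)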
Answers and Solutions 16.1. Let B be a closed ball with & the center x ∈ C([0, T ]) and the radius r. Then ω ) ∈ B} = ∩t∈Q∩[0,T ] ω | |X(t, ω ) − x(t)| ≤ r} ∈ F. {ω | X(·, 16.2. Let B be a closed ball with the center x ∈ L2 ([0, T ]) and the radius r. Then {ω | ω ) − x(t))2 dt ≤ r}. Because the process X(t) − x(t) is ω ) ∈ B} = {ω | T (X(t, X(·, 0 T 2 − x(t)) dt is a random variable and thus {ω | X(·, ω ) ∈ B} ∈ F. measurable, 0 (X(t) 16.3. See [4], Theorem 14.5. 16.6. (1) Let B be a closed ball with the center x and radius r, then {ξ ∈ B} = ∩k {|ξk − xk | ≤ r} ∈ F. (2) Every compact set can be represented as an intersection of a countable family of sets, each one being a finite union of the open balls. Therefore, every compact set belongs to X. Let us prove that every open ball B with the radius 1 has zero measure w.r.t. distribution of the element ξ ; because every compact set is covered by a finite union of such balls, this will provide the required statement. Let the center of the ball B be a sequence x = (xk )k∈N . Then, for every k ∈ N, at least one of the inequalities holds true: |xk − 1| ≥ 1, |xk + 1| ≥ 1. Consider the following sequence (yk )k∈N : if for the given k the first relation holds, then yk = −1; otherwise yk = 1. Then {ξ ∈ B} ⊂ k {ξk = yk } and P(ξ ∈ B) ≤ ∏k∈N 12 = 0. 2,n 2 16.9. Consider the points xnjk = (x1,n jk , x jk ) ∈ [0, 1] , j, k = 1, . . . , n such that j−1 j k−1 k 1,n 2,n x jk ∈ , , x jk ∈ , , k, j = 1, . . . , n, n n n n r,n and xr,n jk = xil , r = 1, 2 for every j, k, i, l such that ( j, k) = (i, l). There exists a Borel
(and even a continuous) function fn : [0, 1] → [0, 1] such that fn (xk1,nj ) = xk2,nj , k, j = 1, . . . , n. Define the distribution of the random vector Xn = (Xn1 , Xn2 ) in the following way. Xn takes values xnjk , j, k = 1, . . . , n with the probabilities n−2 . By the definition, fn (Xn1 ) = Xn2 . On the other hand, (Xn1 , Xn2 ) weakly converges to the vector X = (X 1 , X 2 ) with independent components uniformly distributed on [0, 1]. For arbitrary Borel function f : [0, 1] → [0, 1], one has cov( f (X 1 ), X 2 ) = 0 and therefore relation f (X 1 ) = X 2 does not hold. 16.12. Both the spaces Lip([0, 1]) and Hγ ([0, 1]) with arbitrary γ ∈ (0, 1) are Banach. None of these spaces is separable.
16.14. d(x, y) = max |a − c|, |b − d| , % $ a b−a 1−b . , ln d0 (x, y) = min 1, max ln , ln c d −c 1−d 16.16. Yes, it is. 16.18. If xn → x in the metric d of the space D([0, 1]), then there exists a sequence λn ∈ Λ such that supt |λn (t) − t| → 0, supt |xn (t) − x(λn (t))| → 0. As soon as x is continuous, it is uniformly continuous, and thus supt |x(t) − x(λn (t))| → 0. This gives the required convergence supt |xn (t) − x(t)| → 0. 16.20. xn (t) = 1It∈[1/2−1/(2n),1] , yn (t) = 1It∈[1/2−1/(3n),1] . 16.21. No, it is not. Consider the functions xˆn = xn , yˆn = −yn , where xn , yn are the functions from the previous solution. Then xˆn → 1It∈[1/2,1] , yˆn → −1It∈[1/2,1] , but xˆn + yˆn → 0. 16.28. If xn → x in the metric d of the space D([0, 1]) then there exists a sequence λn ∈ Λ , n ≥ 1 such that supt |λn (t) − t| → 0, supt |xn (t) − x(λn (t))| → 0. Because a is continuous, supt |a(λn (t)) − a(t)| → 0. Thus sup xn (t) − sup x(t) = sup xn (t) − sup x(λn (t)) → 0. t a(t) t a(t) t a(t) t a(λn (t)) 16.29. No, it is not. 16.31. For a ∈ Z+ . 16.36. (a) $ % ∞ 2 2 1 d −y2 /(2x) √ e dy = √ x−3/2 e−z /(2x) . p(x) = dx 2π x z 2π ( √ 16.38. Π (du) = (1/ 2π )1Iu>0 u−3/2 du, ψ (z) = − 2|z|. 16.39. For z > 0, α ∈ (0, 1), 3 1−α 3 1 N z −z2 /(2s) 2 α HS (z) ≤ α → ds+ P · e · arcsin N π 3 s3/2 1−s 0 ∞
2 1 z √ · 3/2 e−z /(2s) ds, N → ∞. 2π s 16.52. Statement (c) (the triangle inequality) follows immediately from the rela> 0 (we leave details for the reader). It is obvious that tion (Aε )δ ⊂ Aε +δ , ε , δ dLP (μ , μ ) = 0. Because ε >0 Aε = A for any closed set A, it follows from the relation dLP (μ , ν ) = 0 that μ (A) ≤ ν (A) for every closed A. Then for every continuous function f taking values in (0, 1) one has
+
X
1−α
n
∑ f (tn,k )μ n→+∞
f (x) μ (dx) = lim
≤ lim
n→+∞
k=1 n
∑
k=1
{x| f (x) ∈ [tn,k−1 ,tn,k ]}
f (tn,k )ν {x| f (x) ∈ [tn,k−1 ,tn,k ]} = f (x) ν (dx), X
where the sequence of partitions πn = {0 = tn,0 < · · · < tn,n = 1} is chosen in such a way that |πn | → ∞, n → ∞ and μ ( f = tn,k ) = ν ( f = tn,k ) = 0, 1 ≤ k ≤ n. This inequality and the analogous inequality for f = 1 − f provide that X f d μ = X f d ν . Because f is arbitrary, this yields μ = ν and completes the proof of statement (a). Let us prove (b). It follows from (a) that dLP (μ , ν ) = 0 if and only if dLP (ν , μ ) = 0. Assume that dLP (μ , ν ) > 0 and take t ∈ (0, dLP (μ , ν )). By definition, there exists a set A ∈ B(X) such that μ (A) > ν (At ) +t. Denote B = X\At ; then the latter inequality can be written as μ (A) > 1 − ν (B) + t or, equivalently, ν (B) > μ (X\A) + t. If x ∈ Bt then there exists some y ∈ X\At such that ρ (x, y) < t and thus x ∈ A. Therefore, Bt ⊂ X\A and we have the inequality
ν (B) > μ (X\A) + t ≥ μ (Bt ) + t, which yields t < dLP (ν , μ ). Because t ∈ (0, dLP (μ , ν )) is arbitrary, we obtain that dLP (μ , ν ) ≤ dLP (ν , μ ). By the same arguments applied to the pair (ν , μ ) instead of (μ , ν ), we have dLP (μ , ν ) ≥ dLP (ν , μ ) and thus dLP (μ , ν ) = dLP (ν , μ ). 16.55. If dW,p (μ , ν ) = +∞ then one can take arbitrary coupling Z, thus only the case dW,p (μ , ν ) < +∞ needs detailed consideration. Take a sequence Z n = (X n ,Y n ) ∈ p (μ , ν ), n → ∞. By Problem 16.54 the family C(μ , ν ) such that Eρ p (X n ,Y n ) → dW,p n of distributions of Z , n ≥ 1 is tight. Using the Prokhorov theorem and passing to a limit, we may assume that Z n , n ≥ 1 converges weakly to some Z ∗ = (X ∗ ,Y ∗ ). Because the projection in X × X on one component is a continuous function, by Theorem 16.3 we have X n ⇒ X ∗ ,Y n ⇒ Y ∗ . Therefore, X ∗ and Y ∗ have distributions μ and ν , respectively; that is, Z ∗ ∈ C(μ , ν ). Consider a sequence of continuous functions fk : R+ → R+ , k ∈ N such that ∞ ∑k=1 fk ≡ 1 and fk (t) = 0 t ∈ [k−1, k]. Every function φk (z) = ρ p (x, y) fk (ρ (x, y)), z = (x, y) ∈ X × X is continuous and bounded, and thus Eφk (Z n ) → Eφk (Z ∗ ), n → ∞, k ∈ N. Therefore, by the Fatou lemma, Eρ p (X ∗ ,Y ∗ ) =
∞
Eρ^p(X*, Y*) = ∑_{k=1}^∞ Eφ_k(Z*) ≤ lim sup_{n→∞} ∑_{k=1}^∞ Eφ_k(Z^n) = lim sup_{n→∞} Eρ^p(X^n, Y^n) = d_{W,p}^p(μ, ν).
∑ zik = ν ({tk }) =: yk , ∑ z ji = μ ({t j }) =: x j , i
k, j ∈ N.
i
We denote the class of such matrices also by C(μ , ν ). By the definition of the Wasserstein metric and Problem 16.55,
16 Weak convergence, probability metrics. Functional limit theorems p dW,p (μ , ν ) =
=
∑
inf
{z jk }∈C(μ ,ν ) j,k∈N
∑
265
z jk c jk
z∗jk c jk , c jk := |t j − tk | p , k, j ∈ N,
j,k∈N
where the matrix {z∗jk } corresponds to the distribution of an optimal coupling Z ∗ ∈ C(μ , ν ). We write j ≺ k, if t j < tk . Let us show that for arbitrary j1 ≺ j2 , k1 ≺ jk at least one number z∗j1 k2 , z∗j2 k1 is equal to 0. This would be enough for solving the problem, because this condition on the matrix {z∗jk }, together with the conditions
∑ z∗ik = yk , ∑ z∗ji = x j , i
k, j ∈ N,
i
defines this matrix uniquely, and, on the other hand, this condition is satisfied for the matrix corresponding to the coupling described in Proposition 16.2. Assume that for some j1 ≺ j2 , k1 ≺ k2 the required condition fails, put a = z jk } by min(z∗j1 k2 , z∗j2 k1 ) > 0 and define the new matrix { ⎧ ∗ ⎪ ( j, k) ∈ {( jl , kr ), l, r = 1, 2}, ⎨z jk , ∗ z jk = z jk + a, ( j, k) = ( j1 , k1 ) or ( j2 , k2 ), ⎪ ⎩∗ z jk − a, ( j, k) = ( j1 , k2 ) or ( j2 , k1 ). By the construction, { z jk } ∈ C(μ , ν ) and
∑
z jk c jk = a[c j1 k1 + c j2 k2 − c j1 k2 − c j2 k1 ] +
j,k∈N
∑
z∗jk c jk .
j,k∈N
It can be checked directly that for every s1 < s2 , r1 < r2 , |r1 − s1 | p + |r2 − s2 | p < |r1 − s2 | p + |r2 − s1 | p , whence c j1 k1 + c j2 k2 − c j1 k2 − c j2 k1 < 0. This means that
∑
j,k∈N
z jk c jk
0 be fixed. Take such random elements (X1ε ,Y1ε ) ∈ C(μ1 , ν1 ), (X2ε ,Y2ε ) ∈ C(μ2 , ν2 ) that p p (μ1 , ν1 ) + ε , Eρ2p (X2ε ,Y2ε ) ≤ dW,p ( μ 2 , ν2 ) + ε . Eρ1p (X1ε ,Y1ε ) ≤ dW,p
Such elements exist by the definition of the Wasserstein metric. Construct the rand d dom element Z ε = (Z1ε , Z2ε , Z3ε , Z4ε ) with (Z1ε , Z3ε ) =(X1ε ,Y1ε ), (Z2ε , Z4ε ) =(X2ε ,Y2ε ), and ε ε ε ε ε elements (Z1 , Z3 ), (Z2 , Z4 ) being independent. By construction, Z1 and Z2ε are independent. In addition, Z1ε has distribution μ1 and Z2ε has distribution μ2 . Thus, (Z1ε , Z2ε ) has distribution μ . Analogously, (Z3ε , Z4ε ) has distribution ν . Therefore Z ε ∈ C(μ , ν ) and p p p (μ , ν ) ≤ Eρ p (Z1ε , Z2ε ), (Z3ε , Z4ε ) ≤ dW,p (μ1 , ν1 ) + dW,p (μ2 , ν2 ) + 2ε . dW,p This gives . the required statement because ε > 0 is arbitrary. 1/2
1/2
16.61. ∑k (λk − θk )2 . 16.62. (1) The processes X,Y are mean square continuous (Theorem 4.1), and thus have measurable modification (Theorem 3.1). Hence these processes generate random elements in L2 ([a, b]) (Problem 16.2). (2) Let W be the Wiener process. Put X(t) =
b a
QX (t, s) dW (s), Y (t) =
b a
QY (t, s) dW (s), t ∈ [a, b].
The processes X,Y are centered and their covariance functions equal RX and RY , respectively (Problem 6.35). Therefore (X,Y ) ∈ C(μ , ν ) and 2 (μ , ν ) ≤ EX dW,2
=
b b a
E
a
−Y 2L2 ([a,b]
=E
b a
2 (QX (t, s) − QY (t, s))dW (s) dt =
(X(t) −Y (t))2 dt
[a,b]2
(QX (t, s) − QY (t, s))2 dsdt.
16.63. It follows from Problem 6.13 that the pair of the processes W (t),W (t) − t,t ∈ [0, 1] is a coupling for μ , ν (here W is the Wiener process). Then 2 (μ , ν ) ≤ dW,2
1 0
1 t 2 dt = . 3
16 Weak convergence, probability metrics. Functional limit theorems
267
On the other hand, for arbitrary coupling (X,Y ) ∈ C(μ , ν ), EX −Y 2L2 ([0,1]) ≥ E
$
1
0
%2 (X(t) −Y (t)) dt
= E[ξ − η ]2 ,
where we have used the notation ξ = 01 X(t) dt, η = 01 Y (t) dt. The variables ξ and η are centered Gaussian ones with the variances a = 01 01 (t ∧ s) dtds = 13 and b = 01 01 (t ∧ s − ts) dtds = 29 , respectively. Then Problem 16.59 yields that . EX −Y 2L
2 ([0,1])
≥
√
√ √ 2 1 . a− b = √ − 3 3
16.64. Let X1 , . . . , Xm ,Y1 , . . . ,Ym be random elements with distributions μ1 , . . . , μm , ν1 , . . . , νm , and θ be an independent random variable that takes values 1, . . . , m with probabilities α1 , . . . , αm . Then the variables Xθ ,Yθ (see the Hint for the notation) have distributions ∑k αk μk , ∑k αk νk . For every ε > 0 the elements X1 , . . . , Xm ,Y1 , . . . ,Ym can be constructed in such a way that p Eρ p (X j ,Y j ) ≤ ε + max dW,p (μk , νk ), k
j = 1, . . . , m.
Then p p dW,p (μ , ν ) ≤ Eρ p (X,Y ) = ∑ α j Eρ p (X j ,Y j ) ≤ ε + max dW,p (μk , νk ). k
j
Because ε > 0 is arbitrary, this finishes the proof. Literally the same arguments show that, for arbitrary H from (16.3) (not necessarily a metric), the function dH,min := inf(X,Y )∈C(μ ,ν ) H(X,Y ) has the same property ) dH,min
m
m
k=1
k=1
*
∑ αk μk , ∑ αk νk
≤ max dH,min (μk , νk ) k=1,...,m
as soon as, in the previous notation, H(Xθ ,Yθ ) ≤ ∑ αk H(Xk ,Yk ). k
16.65. (a) λ = 12 (μ + ν ) is a probability measure that dominates both μ and ν . (b) Let λ1 , λ2 dominate μ , ν simultaneously. Assume first that λ1 λ2 . Then d μ d λ1 dμ = d λ2 d λ1 d λ2
λ2 − a.s.,
and thus d μ θ d μ 1−θ d μ θ d μ 1−θ d λ1 θ d λ1 1−θ d λ2 = d λ2 d λ2 d λ1 d λ2 d λ2 X d λ2 X d λ1
268
16 Weak convergence, probability metrics. Functional limit theorems
=
X
dμ d λ1
θ
dμ d λ1
1−θ
d λ1 d λ2
d λ2 =
X
dμ d λ1
θ
dμ d λ1
1−θ
d λ1 .
Now, let λ1 , λ2 do not satisfy any additional assumption. Then the measure λ3 = 1 2 (λ1 + λ2 ) dominates both λ1 and λ2 , and therefore, taking into account previous considerations, we get d μ θ d μ 1−θ d μ θ d μ 1−θ d λ1 = d λ3 d λ1 d λ3 X d λ1 X d λ3 d μ θ d μ 1−θ = d λ2 . d λ2 X d λ2 Invariance of Definitions 16.11, 16.13 with respect to the choice of λ can be proved analogously. 16.67. Denote f =d μ /d λ , g = d ν /d λ , then (μ − ν )(A) = A ( f − g) d λ , A ∈ X, and thus μ − ν var = X | f − g| d λ . Hence ( √ √ ( √ f − g)2 d λ ≤ | f − g|( f + g) d λ X X %1/2 $ %1/2 $ ( ( √ √ = dTV (μ , ν ) ≤ ( f − g)2 d λ ( f + g)2 d λ X X . = dH (μ , ν ) 2 + 2H1/2 (μ , ν ) ≤ 2dH (μ , ν ).
dH2 (μ , ν ) =
(
(
16.68. Denote f = d μ /d λ , g = d ν /d λ . Then H0 (μ , ν ) = X g d λ = 1, H1 (μ , ν ) = older inequality with p = 1/θ X f d λ = 1. Inequality Hθ ( μ , ν ) ≤ 1 comes from the H¨ applied to f θ , g1−θ . Log-convexity of the function Hμ ,ν is provided by the relation
f αθ1 +(1−α )θ2 g1−αθ1 −(1−α )θ2 d λ α 1−α α f θ1 g1−θ1 f θ2 g1−θ2 = d λ ≤ Hμα,ν (θ1 )Hμ1− ,ν (θ2 );
Hμ ,ν (αθ1 + (1 − α )θ2 ) =
X X
in the last inequality we have used the H¨older inequality with p = 1/α . This proves the statements (1), (2). The measures μ and ν are mutually singular if and only if μ (A) = 1, ν (A) = 0 for some set A ∈ X. This condition is equivalent to the condition for the product f g to be equal to zero λ -a.s. (verify this!), whence the statement (5) follows. Statement (3) follows from the statements (2) and (5): if Hμ ,ν takes value 0 at some point, then Hμ ,ν is zero identically (and thus is continuous) on (0, 1). If, otherwise, all the values of Hμ ,ν are positive, then θ → ln Hμ ,ν (θ ) is a bounded convex function on [0, 1], and thus is continuous on (0, 1). statement (4). If μ ν then one can assume λ = ν and Hθ (μ , ν ) = Consider θ d ν . We have f θ → f , θ → 1− pointwise. In addition, f θ ≤ f ∨ 1 ∈ L (X, ν ) for f 1 X every θ ∈ (0, 1), and, combined with the dominated convergence theorem, it implies that Hθ (μ , ν ) → X f d ν = 1 = H1 (μ , ν ), θ → 1−. On the other hand, if μ ν , then there exists A ∈ X with μ (A) > 0, ν (A) = 0. Hence, for every θ ∈ (0, 1)
16 Weak convergence, probability metrics. Functional limit theorems
Hθ (μ , ν ) =
269
f θ g1−θ d λ ≤ [μ (X\A)]θ .
X\A
Taking a limit as θ → 1−, we get lim sup Hθ (μ , ν ) ≤ μ (X\A) < 1 = H1 (μ , ν ); θ →1−
that is, the function Hμ ,ν is discontinuous at the point 1. 16.69. Denote hn (θ ) = Hθ (μn , νn ), HN (θ ) = ∏∞ n=N hn (θ ); then H1 (θ ) = Hθ ( μ , ν ) (verify this!). By the assumption, μn νn , and thus every function hn is continuous at the point 1 (Problem 16.68, item 4)). Assume μ ν . Then H1 is discontinuous at the point 1 and γ := lim infθ →1− H1 (θ ) < 1. Therefore, by statement (2) from the same problem, H1 ( 12 ) ≤ γ 1/2 . Then there exists N1 ∈ N such N1 −1 hn ( 12 ) ≤ γ 1/3 . Because hn (θ ) → 1, θ → 1− for any n ∈ N, we have that ∏n=1 together with statement (2) of Probthe relation lim infθ →1− HN1 (θ ) = γ, which, 1 1/2 lem 16.68, provides an estimate HN1 2 ≤ γ . Therefore, there exists N2 ∈ N such N2 −1 h 1 ≤ γ 1/3 . Repeating these arguments, we obtain a sequence Nk , k ≥ 1 that ∏n=N 1 n 2 Nk −1 hn 12 ≤ γ 1/3 < 1, k ∈ N (here we denote N0 = 1). Thus such that ∏n=N k−1 * ∞ 1 ∏ hn 2 ≤ ∏ γ 1/3 = 0, n=Nk−1 k=1
)
∞
H 1 (μ , ν ) = ∏ 2
k=1
Nk −1
and therefore μ ⊥ ν . −n 16.70. Take λ = 14 (μ + ν + ∑∞ n=1 2 ( μn + νn )) ; then the measure λ dominates every measure μ , ν , μn , νn , n ≥ 1. Denote f=
dμ dν , g= , dλ dλ
fn =
d μn d νn , gn = . dλ dλ
Then X
| fn − f |d λ = μn − μ var → 0,
X
|gn − g|d λ = νn − ν var → 0, n → ∞.
By the H¨older and Minkowski inequalities, θ |Hθ (μ , ν ) − Hθ (μn , νn )| = f θ g1−θ d λ − fnθ g1− d λ n X
≤ +
n → ∞, θ ∈ (0, 1). 2 2 16.71. (a) e−(θ (1−θ )(a1 −a2 ) )/(2σ ) .
X
X
X
| f − fn | d λ |g − gn | d λ
θ
X
g dλ
1−θ X
1−θ
fn d λ
1−θ
→ 0,
270
16 Weak convergence, probability metrics. Functional limit theorems
.
(b)
σ11−θ σ2θ
.
(1 − θ )σ12 + θ σ22
(c) One can assume that a1 ≤ a2 . In this case, Hθ (μ , ν ) =
(b1 − a2 )+ , θ ∈ (0, 1). (b1 − a1 )θ (b2 − a2 )1−θ
16.72. Hθ (μ , ν ) = exp[λ θ ρ 1−θ − θ λ −(1− θ )ρ ]. θ∗ = logλ /ρ [(λ /ρ − 1)/ ln(λ /ρ )].
θ ρ 1−θ − θ λ − (1 − θ )ρ 16.73. Hθ (μ , ν ) = exp ∑m . λ k k k=1 k k 16.74. Denote κi (T ) = κi ([0, T ]), κˆ i = κi /(κi (T )), i = 1, 2. Then
Hθ (μ , ν ) = exp κ1θ (T )κ21−θ (T )Hθ (κˆ 1 , κˆ 2 ) − θ κ1θ (T ) − (1 − θ )κ2θ (T ) .
16.77. Let us prove that X ln h d μ − E(μ ν ) ≤ ln X h d ν for arbitrary measure μ ∈ P(X). If μ ν , then E(μ ν ) = +∞ and the required inequality holds true. If μ ν , denote f = d μ /d ν . Applying Jensen’s inequality to the concave function ln(·), we get X
ln h d μ − E(μ ν ) =
X
≤ ln
(ln h − ln f ) d μ =
X
h d μ = ln f
X
X
(ln h − ln f ) d μ
h dν .
On the other hand, if h/ f = const, then this inequality becomes an equality. Put d μ = f d ν , where f = h/( X h d ν ) if X h d μ > 0 and f ≡ 1 otherwise. Then μ is a probability measure with X ln h d μ − E(μ ν ) = ln X h d ν .
17 Statistics of stochastic processes
Theoretical grounds General statement of the problem of testing two hypotheses Let the trajectory x(·) of a stochastic process {X(t), t ∈ [0, T ]} be observed. It is known that the paths of the process belong to a metric space of functions F[0,T ] defined on [0, T ]. For example, it can be the space of continuous functions C([0, T ]) or Skorokhod space D([0, T ]); see Chapter 16. On F[0,T ] the Borel σ -field B is considered. Two hypotheses concerning finite-dimensional distributions of the process X are given. According to the hypothesis Hk , k = 1, 2, a probability measure μk on the σ -field B corresponds to the process {X(t)}; that is, under the hypothesis Hk the equality holds P(X(·) ∈ A) = μk (A), A ∈ B. Based on the observations, one has to select one of the hypotheses. It can be done on the basis of either randomized or nonrandomized decision rule, as we show below. It is said that a randomized decision rule R is given, if for each possible trajectory x(·) ∈ F[0,T ] the probability p(x(·)) is defined (here p is a measurable functional on F[0,T ] ) to accept H1 if the path x(·) is observed, and 1 − p(x(·)) is the probability to accept the alternative hypothesis H2 . The rule is characterized by the probabilities of Type I and Type II errors: α12 = P(H1 |H2 ) to accept H1 when H2 is true, and α21 = P(H2 |H1 ) to accept H2 when H1 is true. The error probabilities are expressed by integrals
α12 =
F[0,T ]
p(x)μ1 (dx), α21 =
(1 − p(x))μ1 (dx).
F[0,T ]
It is natural to look for the rules minimizing the error probabilities. In many cases it is enough to content oneself with nonrandomized decision rules for which p(x(·)) takes only values 0 and 1. Then F[0,T ] is partitioned into two measurable sets G1 and G2 := F[0,T ] \ G1 ; if x(·) ∈ G1 then H1 is accepted, whereas if x(·) ∈ G2 then H2 is accepted. The set G1 is called the critical region for testing H1 . The error probabilities are calculated as αi j = μ j (Gi ), i = j. D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 271 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 17,
272
17 Statistics of stochastic processes
Absolutely continuous measures on function spaces Let μ1 and μ2 be two finite measures on the σ -field B in F[0,T ] . Definition 17.1. The measures μ1 and μ2 are singular if there exists a partition of the total space F[0,T ] into two sets Q1 and Q2 such that μ1 (Q2 ) = 0 and μ2 (Q1 ) = 0. Notation: μ1 ⊥ μ2 . Definition 17.2. The measure μ2 is absolutely continuous with respect to μ1 if for each A ∈ B such that μ1 (A) = 0, it holds μ2 (A) = 0. Notation: μ2 μ1 . If μ2 μ1 then by the Radon–Nikodim theorem there exists a B-measurable nonnegative function ρ (x) such that for all A ∈ B
μ2 (A) =
ρ (x)μ1 (dx).
(17.1)
A
The function is called the density of the measure μ2 with respect to μ1 . Notation:
ρ (x) =
d μ2 (x). d μ1
Definition 17.3. If μ2 μ1 and μ1 μ2 simultaneously, then the measures μ1 and μ2 are called equivalent. Notation: μ1 ∼ μ2 . Measures μ1 and μ2 are equivalent if and only if the function ρ from the equality (17.1) is positive a.e. with respect to μ1 . In this case
μ1 (A) =
1 μ2 (dx), A ∈ B. ρ (x)
A
For any finite measures μ1 and μ2 one can find pairwise disjoint sets Δ1 , Δ2 , and Δ such that μ1 (Δ2 ) = 0 and μ2 (Δ1 ) = 0, and on Δ the measures are equivalent; that is, there exists a measurable function ρ : Δ → (0, +∞) such that for all A ∈ B,
μ2 (A ∩ Δ ) =
ρ (x)μ1 (dx),
A∩Δ
μ1 (A ∩ Δ ) =
A∩Δ
1 μ2 (dx). ρ (x)
(17.2)
Let H be a real separable infinite-dimensional Hilbert space. Consider a finite measure μ on the Borel σ -field B(H). Definition 17.4. The mean value of a measure μ is called a vector mμ ∈ H such that (mμ , x) =
(z, x)d μ (z), x ∈ H.
H
The correlation operator of a measure μ is called a linear operator Sμ in H such that (Sμ x, y) = (z − mμ , x)(z − mμ , y)d μ (z), x, y ∈ H. H
17 Statistics of stochastic processes
273
It is known that the correlation operator Sμ of a measure μ , if it exists, is a continuous self-adjoint operator. Moreover it is positive; that is, (Sμ x, x) ≥ 0, x ∈ H. Definition 17.5. A measure μ is called a Gaussian measure on H if for each linear continuous functional f on H, the induced measure μ ◦ f −1 is a Gaussian measure on the real line. The correlation operator Sμ of a Gaussian measure μ always exists, moreover ∑∞ i=1 λi (Sμ ) < ∞ where λi (Sμ ) are eigenvalues of Sμ that are counted according to orthonormal eigenvectors. their multiplicity. Let {ei , i ≥ 1} be the corresponding ( They form a basis in H. Define the operator Sμ in H, (
∞
Sμ x = ∑
.
λi (Sμ )(x, ei )ei , x ∈ H.
i=1
Theorem 17.1. (Hajek–Feldman theorem) Let μ and ν be two Gaussian measures in H, with √ common correlation operator √ S and mean values mμ = 0 and mν = a. If a ∈ S(H) then μ ∼ ν , and if a ∈ S(H) then μ ⊥ ν . In the case μ ∼ ν , the Radon–Nikodim derivative is , , √ −1 ,2 ∞ xk ak 1, dν , (x) = exp{− , S a, , + ∑ λ }. dμ 2 k k=1 Here λk are positive eigenvalues of the operator S, and ϕk are the corresponding eigenvectors, the coefficients xk = (x, ϕk ), ak = (a, ϕk ), k ≥ 1, and the series converges for μ –almost all x. The Neyman–Pearson criterion Fix ε ∈ (0, 1). The Neyman–Pearson criterion presents a randomized rule for hypothesis testing, which for a given upper bound ε of a Type I error (that is, when α12 ≤ ε ), minimizes a Type II error α21 . Consider three cases concerning the measures μ1 and μ2 related to the hypotheses H1 and H2 . (1) μ1 ⊥μ2 . Then there exists a set G1 such that μ1 (G1 ) = 1 and μ2 (G1 ) = 0. If x(·) ∈ G1 then we accept H1 , otherwise if x(·) ∈ G1 then H2 is accepted. We have
α12 = μ2 (G1 ) = 0, α21 = μ1 (F[0,T ] \ G1 ) = 0. Thus, in this case one can test the hypotheses without error. (2) μ1 ∼ μ2 . Let ρ be the density of μ2 with respect to μ1 . For λ > 0 denote Rλ = {x ∈ F[0,T ] : ρ (x) < λ }, Γ λ = {x ∈ F[0,T ] : ρ (x) = λ }. Then there exists λ¯ such that
μ2 (Rλ ) ≤ ε , ¯
μ2 (Rλ ∪ Γ λ ) ≥ ε . ¯
¯
274
17 Statistics of stochastic processes
Consider three options. ¯ ¯ (2a) μ2 (Rλ ) = ε . Then set G1 = Rλ . We have
α12 = ε , α21 = 1 − μ1 (Rλ ). ¯
(2b) μ2 (Rλ ) < ε and μ2 (Rλ ∪ Γ λ ) = ε . Then set G1 = Rλ ∪ Γ λ . We have ¯
¯
¯
¯
¯
α12 = ε , α21 = 1 − μ1 (Rλ ) − μ1 (Γ λ ). ¯
¯
(2c) μ2 (Rλ ) < ε and μ2 (Rλ ∪ Γ λ ) > ε . Then we construct a randomized rule by means of the probability functional p(x): ¯ p(x) = 1 if x ∈ Rλ , ¯ ¯ p(x) = 0 if x ∈ F[0,T ] \ (Rλ ∪ Γ λ ), ¯
¯
ε −μ2 (Rλ ) ¯ μ2 (Γ λ ) measure μ2 of
p(x) =
¯
¯
if x ∈ Γ λ . ¯
If the any single path equals 0, then in case (2c) we define a nonran¯ ¯ domized rule as follows. There exists D ⊂ Γ λ such that μ2 (D) = ε − μ2 (Rλ ); we set ¯ G1 = Rλ ∪ D and obtain the decision rule with α12 = ε and minimal α21 . (3) Now, let μ1 and μ2 be neither singular nor equivalent. There exist pairwise disjoint sets Δ1 , Δ2 , and Δ such that (17.2) holds and μ2 (Δ1 ) = μ1 (Δ2 ) = 0. Let Rλ = {x ∈ Δ : ρ (x) < λ } ∪ Δ1 , Γ λ = {x ∈ Δ : ρ (x) = λ }. Consider two options. (3a) ε ≥ 1 − μ2 (Δ2 ). We set G1 = Δ1 ∪ Δ , then
α12 = 1 − μ2 (Δ2 ) ≤ ε , α21 = 0. (3b) ε < 1 − μ2 (Δ2 ). Choose λ¯ such that
μ2 (Rλ ) ≤ ε , ¯
μ2 (Rλ ∪ Γ λ ) ≥ ε , ¯
¯
and construct the rule as in case (2). Therefore, in order to construct an optimal criterion one has to find the sets Δk on which the singular measures are concentrated, or in the case of equivalent measures to find the relative density of measures, wherein the probability law of ρ (x(·)) is needed for each hypothesis. Hypothesis testing for diffusion processes The case of different diffusion matrices Let x(t), t ∈ [0, T ], be a path of a diffusion process in Rm , and under a hypothesis Hk the drift vector of the diffusion process is ak (t, x), and its diffusion matrix is Bk (t, x) (all the functions are continuous in both arguments); k = 1, 2. This means that under Hk the observed diffusion process is a weak solution to the stochastic integral equation
17 Statistics of stochastic processes
x(t) = x0 +
t
ak (s, x(s))ds +
0
t
1/2
Bk (s, x(s))dW (s), t ∈ [0, T ].
275
(17.3)
0
Here W is an m–dimensional Wiener process; that is, W (t) = (W1 (t), . . . ,Wm (t))' , 1/2 t ∈ [0, T ], where Wi , i = 1, m are independent scalar Wiener processes, and Bk is a positive semidefinite matrix such that its square is a positive semidefinite matrix Bk . For the equation (17.3) the analogue of Theorem 14.5 about the existence of a weak solution holds true. Having the path x(t) one can find Bk (t, x(t)), t ∈ [0, T ], provided the hypothesis Hk is true. This can be done as follows. For z ∈ Rm we set 2 2n −1 k+1 k λ (t, z) = lim ∑ x t −x n t ,z . (17.4) n n→∞ 2 2 k=0 The limit in (17.4) exists a.s. under each hypothesis Hk , k = 1, 2, and
λ (t, z) =
t
(Bk (s, x(s))z, z) ds, t ∈ [0, T ],
(17.5)
0
if Hk is true. If on the observed path for some t ∈ [0, T ] and z ∈ Rm , t
(B1 (s, x(s))z, z)ds =
0
t
(B2 (s, x(s))z, z)ds,
(17.6)
0
then the equality (17.5) is correct only for a single value of k. Due to the continuity of the integrand functions, (17.6) holds true if and only if for some t ∈ [0, T ] and z ∈ Rm , (B1 (t, x(t))z, z) = (B2 (t, x(t))z, z). Thus, under this condition we accept Hk if (17.5) holds for that k, and finally obtain the error-free decision rule. Condition for equivalence of measures, and distribution of density under various hypotheses Now, let along the observed path ∀z ∈ Rm :
(B1 (t, x(t))z, z) = (B2 (t, x(t))z, z).
Then B1 (t, x(t)) ≡ B2 (t, x(t)). Therefore, one can assume that ∀t ∈ [0, T ], x ∈ Rm :
B1 (t, x) = B2 (t, x).
Assume that the distribution of x(0) is given and does not depend on the choice of the hypothesis. Denote
276
17 Statistics of stochastic processes
B(t, x) = B1 (t, x) = B2 (t, x), a(t, x) = a2 (t, x) − a1 (t, x). Let μk be a measure generated by the observed process on the space C([0, T ]) under the hypothesis Hk ; k = 1, 2. Theorem 17.2. For the equivalence of measures μ1 ∼ μ2 , the next condition is sufficient: for each t, x there exists b(t, x) ∈ Rm such that the next two conditions hold: (1)
a(t, x) = B(t, x)b(t, x). T
(2)
(a(t, x(t)), b(t, x(t)))dt < ∞
0
for almost every x(·) with respect to the measure μ2 . Therein the density of μ2 with respect to μ1 is T
ρ (x(·)) = exp{ (b(t, x(t)), dx(t)) 0
1 − 2
T
(b(t, x(t)), a1 (t, x(t)) + a2 (t, x(t)))dt}.
(17.7)
0
Here the differential dx(t) is written based on the stochastic equation (17.3), and the first integral on the right-hand side of (17.7) is understood, respectively, as a sum of the Lebesgue integral and the stochastic Ito integral. Homogeneous in space processes Let ak (t, x) = ak (t) and Bk (t, x) = Bk (t), k = 1, 2; that is, the coefficients of the diffusion process do not depend on the spatial variable. As above we assume that all the coefficients are continuous functions. Then the process {x(t), t ∈ [0, T ]} has independent increments. From (17.5) it follows that
λ (t, z) =
t
(Bk (s)z, z)ds
0
if Hk is true. Therefore, the hypotheses are tested without error if there exists such t that B1 (t) = B2 (t). Let B1 (t) = B2 (t) = B(t) and a(t) = a2 (t) − a1 (t). Denote by Lt the range {B(t)z : z ∈ Rm } and by E the set of such t ∈ [0, T ] that a(t) does not belong to Lt . Let P(t) be the projection operator on Lt . If λ 1 (E) > 0 then the hypotheses are tested without error: T
I(x(·)) := 0
P(t)(x(t) − a1 (t))2 dt = 0
17 Statistics of stochastic processes
277
under the hypothesis H1 , and I(x(·)) > 0 under the hypothesis H2 . Now, let a(t) ∈ Lt , t ∈ [0, T ]; that is, ∀t ∃b(t) ∈ Rm : a(t) = B(t)b(t).
(17.8)
In order for the vector b(t) in (17.8) to be uniquely defined, we select it from the subspace Lt ; this is possible because the matrix B(t) is symmetric. Note that then (a(t), b(t)) ≥ 0. Under condition (17.8), the necessary and sufficient condition for the absolute continuity of the measures μ1 and μ2 is the condition T
(a(t), b(t))dt < ∞.
(17.9)
0
Under the conditions (17.8) and (17.9), the density of the measure μ2 with respect to μ1 in the space C([0, T ]) is T
T
0
0
1 ρ (x(·)) = exp{ (b(t), dx(t)) − 2
(b(t), a1 (t) + a2 (t))dt}.
(17.10)
Under the hypothesis Hk it holds log ρ (x(·)) ∼ N(mk , σ 2 ) with
σ =
T
2
(a(t), b(t))dt, mk = (−1)k
0
σ2 ; k = 1, 2. 2
This makes it possible to construct the Neyman–Pearson criterion. Next, we construct an error-free test under the assumption (17.8) and the condition T
(a(t), b(t))dt = +∞.
(17.11)
0
Select a sequence of continuous functions bn (t) such that n≤
T
(B(t)bn (t), bn (t))dt =
T
0
(a(t), bn (t))dt < ∞
0
(this is possible due to the imposed assumptions). Then we accept the hypothesis H1 if ⎛ ⎞ T
lim ⎝ (B(t)bn (t), bn (t))dt ⎠
−1 T
n→∞
0
otherwise the hypothesis H2 is accepted.
0
bn (t)d(x(t) − a1 (t)) = 0,
278
17 Statistics of stochastic processes
Hypothesis testing about the mean of Gaussian process Let x(t), t ∈ [0, T ], be a path of a scalar Gaussian process with given continuous correlation function R(t, s). Under the hypothesis H1 the mean of the process equals 0, whereas under the alternative hypothesis H2 the mean is equal to a given continuous function a(t). Condition for singularity of measures Introduce a linear operator R in X := L2 ([0, T ]), (Rg)(t) =
T
R(t, s)g(s)ds, g ∈ X, t ∈ [0, T ].
0
This is a Hilbert–Schmidt integral operator. Its eigenspace which corresponds to a zero eigenvalue is the kernel of the operator R. Also the operator has a sequence of positive eigenvalues and corresponding normalized eigenfunctions {λk , ϕk ; k ≥ 1} with ∑k≥1 λk < ∞, the functions ϕk are pairwise orthogonal, and their linear combinations are dense in the range of the operator R. Consider two cases. (1) In the space X the function a(·) has no series expansion in the functions ϕk . Then the measures μ1 and μ2 on the space X that correspond to the hypotheses H1 and H2 , are singular. We describe a decision rule. Let a(·) ˆ = a(·) − ∑ (a, ϕk )ϕk (·).
(17.12)
k≥1
Hereafter (·, ·) is the inner product in X, and in the case of an infinite number of ϕk , the series in (17.12) converges in the norm in X. If T
ˆ I(x(·)) :=
x(t)a(t)dt ˆ =0
0
then we accept H1 ; otherwise we accept H2 . (2) In the space X the function a(·) has a Fourier expansion in the functions ϕk : a(t) =
∑ ak ϕk (t),
ak := (a, ϕk ).
(17.13)
k≥1
(In the case of an infinite number of ϕk , the series in (17.13) converges in the norm in X). Assume additionally that ∞
a2
∑ λkk = ∞.
(17.14)
k=1
Then μ1 ⊥μ2 . A decision rule is constructed as follows. Select a sequence {mn } such that mn 2 a ∀n ≥ 1 : ∑ k ≥ n. k=1 λk
17 Statistics of stochastic processes
If
) lim
n→∞
mn
a2k ∑ k=1 λk
*−1
mn
ak ∑ λk k=1
T
x(t)ϕk (t)dt = 0
279
(17.15)
0
then we accept the hypothesis H1 ; otherwise we accept H2 . Condition for equivalence of measures In the notations of the previous subsection, the criterion of the equivalence of the measures μ1 and μ2 is the condition a2
∑ λkk < ∞.
(17.16)
k≥1
Under this condition the density of μ2 with respect to μ1 is 4 a2k xk ak 1 , ρ (x) = exp ∑ − ∑ 2 k≥1 λk k≥1 λk
(17.17)
where xk := (x, ϕk ), and the first series under the exponent converges for μ1 –almost all x. Under the hypothesis Hk , log ρ (x(·)) ∼ N(mk , σ 2 ), σ 2 =
a2
∑ λkk ,
k≥1
1 mk = (−1)k σ 2 ; 2
k ≥ 1.
For the condition (17.16) it is sufficient that in X = L2([0, T]) there exists a solution b(·) to the Fredholm type I equation
a(t) = ∫_0^T R(t, s) b(s) ds,  0 ≤ t ≤ T.    (17.18)
Via this solution the density ρ can be written differently:
ρ(x) = exp{ ∫_0^T x(s) b(s) ds − (1/2) ∫_0^T a(s) b(s) ds }.    (17.19)
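As a numerical aside (not from the original text), equation (17.18) can be discretized on a grid and solved as a linear system; the sketch below uses the kernel R(t, s) = e^{−|t−s|} on [0, 1] and a right-hand side a for which the exact solution b ≡ 1 is known, both of which are illustrative assumptions (cf. Problem 17.6 below).

```python
# A numerical sketch (illustrative only) of equation (17.18): midpoint discretization
# of the kernel R(t, s) = exp(-|t - s|) on [0, 1].  The right-hand side a is chosen as
# the image of b = 1 under the integral operator, so the exact solution is known.
import numpy as np

n, T = 400, 1.0
t = (np.arange(n) + 0.5) * T / n
dt = T / n
K = np.exp(-np.abs(t[:, None] - t[None, :]))   # kernel matrix R(t_i, t_j)

a = 2.0 - np.exp(-t) - np.exp(t - T)           # a(t) = int_0^1 exp(-|t-s|) * 1 ds
b = np.linalg.solve(K * dt, a)                 # discretized Fredholm equation (17.18)

sigma2 = float(np.sum(a * b) * dt)             # sigma^2 = int_0^T a(t) b(t) dt
print("max |b - 1| ~", float(np.max(np.abs(b - 1.0))))
print("sigma^2 ~", sigma2, "(exact value 2/e =", 2 / np.e, ")")
```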
Parameter estimation of distributions of stochastic process
Let x(t), 0 ≤ t ≤ T, be the observed path of a stochastic process that generates a probability measure μθ on the function space F[0,T]. A parameter θ is to be estimated and belongs to a parameter set Θ, which is a complete separable metric space or a Borel subset in such a space.
Definition 17.6. A function θ(x) : F[0,T] → Θ which is B − B(Θ) measurable is called the estimator θ(x) of the parameter θ for any family of measures μθ.
Assume that there exists a σ-finite measure ν on the Borel σ-field B in F[0,T], with respect to which all the measures μθ are absolutely continuous and
dμθ/dν (x) = ρ(θ, x),  θ ∈ Θ, x ∈ F[0,T].
Then the family of measures {μθ, θ ∈ Θ} is called regular.
Definition 17.7. An estimator θ(x) of the parameter θ for a regular family of measures {μθ} is called strictly consistent under increasing T, if θ(x) → θ as T → ∞, a.s.
For a regular family of measures, the estimator can be found by the maximum likelihood method via maximization of the function ρ(θ, x(·)) on Θ. A real parameter θ for a regular family of measures {μθ, θ ∈ Θ} can be estimated by the Bayes method as well. Let Θ be a finite or infinite interval on the real line, on which a pdf ρ(θ) is given. We call it the prior density of the parameter θ. Based on the path x = x(·) one can compute the posterior density
ρ(θ|x) := ρ(θ, x) ρ(θ) / ∫_Θ ρ(θ, x) ρ(θ) dθ,  θ ∈ Θ.
It is correctly defined if for the observed path it holds ρ(θ, x) > 0 for a.e. θ ∈ Θ. For an estimator θ(x), we introduce two loss functions: the quadratic one, L2(θ(x), θ) := (θ(x) − θ)², and the all-or-nothing loss function L0(θ(x), θ) := 1I_{θ(x) ≠ θ}. The latter is approximated by the functions Lε(θ(x), θ) := 1I_{|θ(x)−θ|>ε} as ε → 0+. Under the quadratic loss function, the Bayes estimator θ̂2(x) of the parameter θ is defined as a minimum point of the following function (we suppose that the posterior density possesses a finite second moment):
Q(θ̂) := ∫_Θ L2(θ̂, θ) ρ(θ|x) dθ,  θ̂ ∈ Θ.
This implies that θ̂2(x) coincides with the expectation of the posterior distribution; that is,
θ̂2(x) = ∫_Θ θ ρ(θ|x) dθ.
Under the all-or-nothing loss function, we have the approximating cost functions
Qε(θ̂) := ∫_Θ Lε(θ̂, θ) ρ(θ|x) dθ,  θ̂ ∈ Θ.
Their minimum points, under unimodal and smooth posterior density, tend to the mode of this density. Therefore, the mode is taken as the Bayes estimator θ̂0(x) under the given loss function,
θ̂0(x) = argmax_{θ∈Θ} ρ(θ|x).
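For a concrete (and purely illustrative) numerical sketch of both Bayes estimators, one can tabulate the posterior on a grid; the likelihood used below is the density of the process {θt + σW(t)} with respect to {σW(t)} (cf. Problems 17.18–17.19), and the standard normal prior together with all parameter values are assumptions made only for this example.

```python
# A sketch (illustrative only) of the two Bayes estimators computed from a gridded
# posterior.  The likelihood is the density of {theta*t + sigma*W(t)} with respect to
# {sigma*W(t)}: rho(theta, x) = exp{(theta*x(T) - theta^2*T/2)/sigma^2}; the N(0, 1)
# prior and all parameter values are assumptions made only for this example.
import numpy as np

rng = np.random.default_rng(1)
T, sigma, theta_true = 5.0, 1.0, 0.7
x_T = theta_true * T + sigma * rng.normal(0.0, np.sqrt(T))   # only x(T) enters the likelihood

theta = np.linspace(-3.0, 3.0, 4001)                 # grid over the parameter set Theta
prior = np.exp(-theta**2 / 2)                        # prior density, up to a constant
log_lik = (theta * x_T - theta**2 * T / 2) / sigma**2
post = prior * np.exp(log_lik - log_lik.max())       # unnormalized posterior rho(theta | x)
post /= np.trapz(post, theta)

theta_quad = np.trapz(theta * post, theta)           # posterior mean: Bayes estimator, quadratic loss
theta_mode = theta[np.argmax(post)]                  # posterior mode: Bayes estimator, all-or-nothing loss
print("posterior mean:", theta_quad, " posterior mode:", theta_mode)
```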
The case of a pairwise singular family {μθ, θ ∈ Θ} is more specific for statistics of stochastic processes, in contrast to classical mathematical statistics. It is natural to expect in this case that the parameter θ can be estimated without error by a single path x(t), 0 ≤ t ≤ T.
Definition 17.8. An estimator θ(x) of the parameter θ for a pairwise singular family of measures {μθ} is called consistent if
∀ θ ∈ Θ : μθ{x ∈ F[0,T] : θ(x) = θ} = 1.
Thus, the consistent estimator makes it possible to find the parameter without error for a singular family of measures, which is impossible for a regular family of measures.
Bibliography [51], Chapter 24; [57], Chapters 7, 17; [31], Chapter 4; [37], Chapters 2–4.
Problems
17.1. On [0, T] a process is observed which is the Wiener process {W(t)} under the hypothesis H1, and is the process {γt + W(t)} with given γ ≠ 0 under the hypothesis H2. Construct the Neyman–Pearson test.
17.2. On [0, T] a process is observed which is a homogeneous Poisson process with intensity λk under the hypothesis Hk; k = 1, 2.
(a) Prove that the corresponding measures in the space E = D([0, T]) are equivalent with density
dμ2/dμ1 (x) = ρ(x(·)) = (λ2/λ1)^{x(T)} e^{(λ1−λ2)T},  x ∈ E.
(b) Construct the Neyman–Pearson criterion to test the hypotheses. 17.3. Let {x(t), t ∈ [0, T ]} and μ1 , μ2 be the objects described in the subsection of Theoretical grounds Hypothesis testing about the mean of Gaussian process. Condition for singularity of measures. Prove that: (a) In cases (1) and (2) of the above-mentioned subsection the measures μ1 and μ2 are singular. (b) Under condition (17.16) it holds μ1 ∼ μ2 , and the density of μ2 with respect to μ1 is given in (17.17).
17.4. Prove that the decision rule described in the subsection of Theoretical grounds Hypothesis testing about the mean of Gaussian process. Condition for singularity of measures, case (1), tests the hypotheses without error.
17.5. Prove that the decision rule described in the subsection of Theoretical grounds Hypothesis testing about the mean of Gaussian process. Condition for singularity of measures, case (2), tests the hypotheses without error.
17.6. On [0, 1] a path x(·) of a Gaussian process with correlation function e^{−|t−s|} is observed. Under the hypothesis H1 the mean of the process equals 0, whereas under the hypothesis H2 it is equal to a given function a ∈ C²([0, 1]) with a′(0) = a(0), a′(1) = −a(1). Prove that the Neyman–Pearson criterion is constructed as follows. If
∫_0^1 x(s) (a(s) − a″(s))/2 ds < σ Φ^{−1}(ε) + σ²
then H1 is accepted; otherwise H2 is accepted. Here σ > 0 and
2σ² = ‖a‖²_{L2} + ‖a′‖²_{L2} + a(0)² + a(1)²,
Φ is the cdf of the standard normal law, and Φ^{−1} is the inverse function to the function Φ.
17.7. On [0, T] a path x(·) of a zero mean Gaussian process is observed. Under the hypothesis H1 the correlation function of the process is R(t, s), whereas under the hypothesis H2 it is equal to σ²R(t, s) with unknown positive σ² ≠ 1. Because σ² is unknown, the hypothesis H2 is composite. Here R(t, s) is a given continuous function such that the integral operator in L2([0, T]),
(Ag)(t) = ∫_0^T R(t, s) g(s) ds,  t ∈ [0, T], g ∈ L2([0, T]),
has an infinite number of positive eigenvalues. Construct an error-free criterion to test the hypotheses.
17.8. On [0, 2] a path x(·) of a scalar diffusion process is observed. Under the hypothesis H1 the diffusion coefficient b1(t, x) = 1, whereas under the hypothesis H2 the diffusion coefficient b2(t, x) = t. Under each hypothesis Hk the drift coefficient is the unknown continuous function ak(t, x), k = 1, 2. Construct an error-free test.
17.9. On [0, T] a path x(·) of a scalar diffusion process starting from 0 is observed. Under both hypotheses H1 and H2 its diffusion coefficient is t, t ∈ [0, T], and under the hypothesis H1 its drift coefficient is 0.
(a) Let T < 1 and under H2 the drift coefficient is |log t|^{−1/2} for t ∈ (0, T] and 0 for t = 0. Construct an error-free test.
(b) Let under H2 the drift coefficient be √t, t ∈ [0, T]. Construct the Neyman–Pearson criterion to test the hypotheses.
17.10. On [0, T] a path x(·) of a two-dimensional diffusion process starting from the origin is observed. Under the hypotheses H1 and H2 its diffusion matrix is diagonal with entries 1 and t on the diagonal. Under H1 its drift vector is 0, whereas under H2 the drift vector is (t^{1/4}; t^{1/4})′. Construct the Neyman–Pearson criterion to test the hypotheses.
17.11. On [0, T] a path N(·) of a homogeneous Poisson process with intensity λ is observed.
(a) Based on Problem 17.2 (a), show that the maximum likelihood estimator of the parameter λ is λ̂T = N(T)/T (more precisely, the maximum in λ > 0 of the density at the observed path is attained if N(T) > 0, and in the case N(T) = 0 we set λ̂T = 0).
(b) Prove the following: it is an unbiased estimator, that is, Eλ̂T = λ; it is a strongly consistent estimator under increasing T; the normalized estimator √T(λ̂T − λ) converges in distribution to the normal law as T → ∞; that is, λ̂T is an asymptotically normal estimator.
17.12. On [0, T] a path N(·) is observed of a nonhomogeneous Poisson process with intensity function λt (it is a density of the intensity measure with respect to Lebesgue measure).
(a) Show that the maximum likelihood estimator of the parameter λ is λ̂T = 2N(T)/T² (more precisely, the maximum in λ > 0 of the density on the observed path is attained if N(T) > 0, and in the case N(T) = 0 we set λ̂T = 0).
(b) Prove that λ̂T is an unbiased and strongly consistent estimator (see the corresponding definitions in Problem 17.11).
(c) Prove that T(λ̂T − λ) converges in distribution to the normal law as T → ∞; that is, λ̂T is an asymptotically normal estimator.
17.13. Let f, g ∈ C(R+); f(t) ≥ 0, t > 0; g(t) > 0, t > 0. Nonhomogeneous Poisson processes {Nf(t), Ng(t), t ≥ 0} are given with intensity functions f and g (these functions are the densities of the intensity measures with respect to Lebesgue measure). Let μ1 be the measure generated by the process Ng on D([0, T]), and μ2 be the similar measure for Nf. Prove that μ2 ≪ μ1 and
dμ2/dμ1 (x) = ∏_i (f(ti)/g(ti)) 1I_{x(ti)−x(ti−)=1} · exp{ ∫_0^T (g(t) − f(t)) dt },    (17.20)
x ∈ D([0, T ]). Here ti are jump points of the function x, and if x ∈ C([0, T ]) then the product in (17.20) is set to be equal to 1. 17.14. On [0, T ] a path N(·) is observed of a nonhomogeneous Poisson process with intensity function 1 + λ0t, λ0 > 0 (it is a density of the intensity measure with respect to Lebesgue measure). (a) Write an equation for the maximum likelihood estimator λˆ T of the parameter λ0 and show that with probability 1 this equation has a unique positive root for all T ≥ T0 (ω ). (b) Prove that λˆ T is strongly consistent; that is, λˆ T → λ as T → ∞, a.s.
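As a quick numerical check of Problem 17.12 (illustrative only; the parameter values are assumptions), note that N(T) is Poisson with mean λT²/2, so the estimator λ̂T = 2N(T)/T² can be simulated directly:

```python
# A simulation sketch (illustrative only) for Problem 17.12: N(T) ~ Pois(lambda*T^2/2)
# and lambda_hat = 2*N(T)/T^2 is unbiased, while T*(lambda_hat - lambda) is
# approximately N(0, 2*lambda) for large T.
import numpy as np

rng = np.random.default_rng(2)
lam, T, n_rep = 0.8, 50.0, 10_000

N_T = rng.poisson(lam * T**2 / 2, size=n_rep)    # number of jumps on [0, T]
lam_hat = 2 * N_T / T**2

print("mean of lambda_hat:", lam_hat.mean(), "(true value", lam, ")")
z = T * (lam_hat - lam)
print("variance of T*(lambda_hat - lambda):", z.var(), "(theory: 2*lambda =", 2 * lam, ")")
```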
17.15. On [0, T] a path x(·) is observed of a mean square continuous stochastic process with given correlation function r(s, t). For the integral operator J on L2([0, T]) with the kernel r(t, s) it holds that Ker J = {0}. A mean value m of the process is estimated, and m does not depend on t. Let
M = { m̂ = ∫_0^T f(t) x(t) dt | f ∈ C([0, T]); ∀ m ∈ R : Em m̂ = m }
(that is, M is a certain class of linear unbiased estimators). Here the integral is a mean square limit of integral Riemann sums, and Em is a standard notation for the expectation with respect to the distribution μm of the observed process with mean m. Prove that:
(a) inf_{m̂∈M} Dm̂ = ( ∑_{n=1}^∞ λn^{−1} an² )^{−1},
where {λn, ϕn, n ≥ 1} are all the eigenvalues of J and corresponding orthonormal eigenfunctions, and an = ∫_0^T ϕn(t) dt, n ≥ 1.
(b) In particular, if the series in (a) diverges then ∃ {m̂k, k ≥ 1} ⊂ M : m̂k → m as k → ∞, a.s.; that is, then m̂k is strictly consistent in the sense of Definition 17.7.
17.16. On [0, T] a path x(·) is observed of a mean square continuous stochastic process with given correlation function r(s, t). A mean value m of the process is estimated, and m does not depend on t. Let
M = { m̂F = ∫_0^T x(t) dF(t) | F is a function of bounded variation; ∀ m ∈ R : Em m̂F = m }.
Here the integral is a mean square limit of integral Riemann sums. For the notation Em see the previous problem. Suppose that there exists an estimator m̂F0 ∈ M such that for all s ∈ [0, T], ∫_0^T r(s, t) dF0(t) = C. Prove that
min_{m̂H∈M} Dm̂H = Dm̂F0 = C.
17.17. On [0, T] a path x(·) of the process with given correlation function r(s, t) is observed. A mean value m of the process is estimated, and m does not depend on t. Let M be the class of estimators from Problem 17.16. Prove that the next estimator has the least variance in M.
(a) m̂1 = (2 + βT)^{−1} ( x(0) + x(T) + β ∫_0^T x(t) dt ), if r(s, t) = exp{−β|t − s|} with β > 0.
(b) m̂2 = x(0) if r(s, t) = min(s + 1, t + 1).
(c) In cases (a) and (b) prove that Dm̂G > min_{m̂∈M} Dm̂, where m̂G ∈ M and G is an absolutely continuous function of bounded variation (that is, G(t) = G(0) + ∫_0^t f(s) ds, t ∈ [0, T], with f ∈ L1([0, T])).
17.18. On [0, T] a path x(·) of the process {μt + σW(t)} is observed, where W is a separable Wiener process with unknown parameters μ ∈ R and σ² > 0.
(a) Construct an error-free estimate of the parameter σ².
(b) For a fixed σ², prove that the maximum likelihood estimator of the parameter μ is μ̂T = x(T)/T.
(c) Prove that the expectation of μ̂T is μ, that μ̂T → μ as T → ∞, a.s., and that √T(μ̂T − μ) ∼ N(0, σ²).
17.19. Assume the conditions of Problem 17.18 and let the prior distribution N(μ0, σ0²) of the parameter μ be given.
(a) Find the posterior distribution of the parameter μ.
(b) Construct the Bayes estimator of the parameter under the quadratic loss function.
17.20. On [0, T] a path N(·) is observed of a homogeneous Poisson process with intensity λ that has the prior gamma distribution Γ(α, β).
(a) Find the posterior distribution of the parameter λ.
(b) Construct the Bayes estimator of λ under the quadratic loss function and under the all-or-nothing loss function.
17.21. On [0, T] the process
x(t) = ∫_0^t ( ϕ(s) + ∑_{i=1}^m θi gi(s) ) ds + σW(t)
is observed,
where W is a Wiener process, the unknown function ϕ belongs to a fixed subspace K ⊂ L2([0, T]); {gi} are given functions from L2([0, T]) that are linearly independent modulo K; that is, a linear combination of these functions which belongs to K is always a combination with zero coefficients; and θ = (θ1, . . . , θm)′ ∈ Rm and σ > 0 are unknown parameters. Let
M = { θ̂ = ∫_0^T f(t) dx(t) | f ∈ L2([0, T], Rm); ∀ θ ∈ Rm ∀ ϕ ∈ K : Eϕ,θ θ̂ = θ }.
Prove that there exists a unique estimator θ̂* ∈ M such that for any estimator θ̂ ∈ M the matrix S − S* is positive semidefinite. Here S and S* are the covariance matrices of the estimators θ̂ and θ̂*.
17.22. Let X = [0, 1]², Θ = [0, 1] ∪ [2, 3], and for θ ∈ Θ let μθ be a measure on B(X). If θ ∈ [0, 1] then μθ is Lebesgue measure on [0, 1] × {θ}, whereas for θ ∈ [2, 3], μθ is Lebesgue measure on {θ − 2} × [0, 1].
(a) Check that the measures {μθ} are pairwise singular.
(b) Prove that there is no consistent estimator θ(x), x ∈ X, of the parameter θ.
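A brief simulation sketch (not part of the problem list; all parameter values are assumptions) illustrates Problem 17.18 above: the quadratic variation of the path recovers σ² as the partition is refined, while μ is estimated by x(T)/T.

```python
# A simulation sketch (illustrative only) for Problem 17.18: the dyadic quadratic
# variation of x(t) = mu*t + sigma*W(t) recovers sigma^2*T as the grid is refined,
# and mu is estimated by x(T)/T.
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, T, n = 0.4, 1.3, 2.0, 2**18

dt = T / n
dx = mu * dt + sigma * np.sqrt(dt) * rng.normal(size=n)   # increments of the observed path
x_T = dx.sum()

sigma2_hat = np.sum(dx**2) / T        # (quadratic variation)/T -> sigma^2
mu_hat = x_T / T                      # maximum likelihood estimator for fixed sigma^2
print("sigma^2 estimate:", sigma2_hat, "(true", sigma**2, ")")
print("mu estimate:", mu_hat, "(true", mu, ", standard deviation", sigma / np.sqrt(T), ")")
```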
Hints
17.1. Both processes are diffusion ones with continuous paths, and the density is
ρ(x) = dμ2/dμ1 (x),  x ∈ C([0, T]).
17.2. (a) Use a representation of a homogeneous Poisson process given in Problem 5.17. (b) The Neyman–Pearson criterion for equivalent measures can be applied. 17.3. The μ1 and μ2 are Gaussian measures on Hilbert space X = L2 ([0, T ]). Use the Hajek–Feldman theorem. 17.4. Let L be a closure of the set R(X), where R is an integral operator with kernel R(t, s). If a vector h is orthogonal to R(X) then under the hypothesis H1 the variance of r.v. (x(·), h(·)) is 0, therefore, the r.v. is equal to 0, a.s. Then μ1 (L) = 1. 17.5. Under the hypothesis H1 , {(x(·), ϕk (·)), k ≥ 1} is a sequence of independent Gaussian random variables with distributions N(0, λk ), k ≥ 1. 17.6. Solve an integral equation (17.18) where T = 1, a(·) is the function from the problem situation, and R(t, s) = e−|t−s| , t, s ∈ [0, 1]. 17.7. Let {ϕn , n ∈ N} be an orthonormal system of eigenfunctions of the operator A with corresponding eigenvalues λn , n ∈ N. Then under both hypotheses xn := 0T x(t)ϕn (t)dt, n ≥ 1 is a sequence of centered independent Gaussian random variables. 17.8. Because the diffusion coefficients are different, singular measures on C([0, 2]) correspond to the hypotheses. 17.9. (a) The equality (17.11) holds true. (b) The density of μ2 with respect to μ1 can be found by the formula (17.10). 17.10. The condition (17.9) holds. 17.11. (a) Let dμ ρλ (x) = λ (x), x ∈ D([0, T ]). d μ1 Here μλ , λ > 0 is a measure generated by a homogeneous Poisson process with intensity λ . Then λˆ T is a point of maximum in λ > 0 of the log-density L(λ ; N) := log ρλ (N). (b) For T ∈ N use the SLLN and CLT. 17.12. (a) Let νλ be a measure on D([0, T ]) generated by given process. The formula for the density dν ρλ (x) := λ (x) d ν1 is derived similarly to Problem 17.2 (a). √ (b), (c) The process {N1 (t) := N( 2t), t ≥ 0} is a homogeneous process with intensity λ . 17.13. Use Problem 5.17 and generalize the solution of Problem 17.2 (a). 17.14. (a) Use Problem 17.13. The derivative in λ of the log-density L(λ , N) at the observed path is a strictly decreasing function in λ . (b) Investigate the behavior of the function
ϕ(λ, T) := T^{−2} ∂L(λ, N)/∂λ
as T → ∞, when λ is from the complement to the fixed neighborhood of λ0 . 17.15. (a) Expand f in Fourier series by the basis {ϕn }. (b) Use the Riesz lemma about a subsequence of random variables that converges a.s. 17.16. Let the minimum of the variance be attained at mˆ F ∈ M. For α , β ∈ [0, T ] introduce G(t) = 1It≥α − 1It≥β , t ∈ [0, T ]. Then for all δ ∈ R it holds mˆ F+δ G ∈ M. 17.17. Use Problem 17.16. 17.18. (a) Use Problem (17.4). (b) Let the measure μ1 correspond to the process {σ W (t)}, and the measure μ2 be generated by the given process. Use the formula (17.7). (c) Use Problem 3.18. 17.19. Use a density ρ (x) from the solution of Problem 17.18. 17.20. The density of the distribution of the process is derived in Problem 17.2. 17.21. Reformulate this problem in terms of vectors in the space H = L2 ([0, T ]). 17.22. (b) Prove to the contrary. Let θ (x) be a consistent estimator. Introduce A1 = {x : θ (x) ∈ [0, 1]}. Then for all x2 ∈ [0, 1] it holds λ 1 ({x1 ∈ [0, 1] : (x1 , x2 ) ∈ A1 }) = 1.
Answers and Solutions
17.1. Under the hypothesis Hk the observed process {x(t), t ∈ [0, T]} generates a measure μk on the space E = C([0, T]); k = 1, 2. By Theorem 17.2 we have μ1 ∼ μ2, and by formula (17.7) it holds
ρ(x) = exp{γ x(T) − γ²T/2},  x ∈ E.
Without loss of generality we can assume that γ > 0 (for γ < 0 one should consider the process y(t) := −x(t)). Then for λ > 0
Rλ := {x ∈ E : ρ(x) < λ} = { x ∈ E : x(T) < (1/γ)(log λ + γ²T/2) }.
Under the hypothesis H1, x(T) ∼ N(0, T), whereas under the hypothesis H2, x(T) ∼ N(γT, T). Fix ε ∈ (0, 1). We are looking for c = c(ε) such that μ2({x : x(T) < c}) = ε. We have
μ2({x : x(T) < c}) = μ2{ x : (x(T) − γT)/√T < (c − γT)/√T } = Φ( (c − γT)/√T ) = ε
for c(ε) = Φ^{−1}(ε)√T + γT. According to the Neyman–Pearson criterion we accept H1 if x(T) < c(ε); otherwise we accept H2. At that
α21 = 1 − μ1({x : x(T) < c(ε)}) = Φ̄( c(ε)/√T ),
where Φ̄ := 1 − Φ.
17.2. (a) Let {ξn, n ≥ 1} be independent random variables, uniformly distributed on [0, T], and νi ∼ Pois(λi T), νi independent of {ξn}, i = 1, 2. According to Problem 5.17 the process
Xi(t) = ∑_{n=1}^{νi} 1I_{ξn≤t},  t ∈ [0, T],
is a homogeneous Poisson process on [0, T] with intensity λi. A measure μi on E that is generated by the process Xi is concentrated on the set of functions of the form
f0(t) = 0,  fk(x, t) = ∑_{n=1}^{k} 1I_{xn≤t},  t ∈ [0, T],
where k ≥ 1 and x = (x1, . . . , xk) is a vector of k distinct points from the interval (0, T). Let Fk = { fk(x, ·) | x = (x1, . . . , xk) ∈ Ak }, where Ak is a symmetric Borel set in (0, T)^k, and F0 = { f0 }. Then
μi(Fk) = P{ ∑_{n=1}^{νi} 1I_{ξn≤t} ∈ Fk, νi = k } = P{(ξ1, . . . , ξk) ∈ Ak} · P{νi = k},
where k ≥ 1 and i = 1, 2. Hence for k ≥ 0 we have
μ2(Fk)/μ1(Fk) = P{ν2 = k}/P{ν1 = k} = (λ2/λ1)^k e^{(λ1−λ2)T}.
Therefore, for any Borel set B ⊂ D([0, T]),
∫_B (λ2/λ1)^{x(T)} e^{(λ1−λ2)T} dμ1(x) = ∑_{k=0}^∞ ∫_{{x∈B: x(T)=k}} (λ2/λ1)^{x(T)} e^{(λ1−λ2)T} dμ1(x) = ∑_{k=0}^∞ μ2({x ∈ B : x(T) = k}) = μ2(B).
(b) Suppose that λ2 > λ1. For λ > 0 we have
Rλ := {x ∈ E : ρ(x) < λ} = {x ∈ E : x(T) < y},  y = (log λ + (λ2 − λ1)T)/(log λ2 − log λ1).
Given ε ∈ (0, 1) we are looking for y = nε ∈ Z+ such that μ2 (Rλ ) < ε and μ2 (Rλ ∪ Γ λ ) ≥ ε . Under the hypothesis H2 we have x(T ) ∼ Pois(λ2 T ). The desired nε can be found uniquely from the condition
∑
0≤k 0 by the Chebyshev inequality we have P{|zn | ≥ ε } ≤ ε −4 Ez4n , thus ∞
∑_{n=1}^∞ P{|zn| ≥ ε} < ∞,
and by the Borel–Cantelli lemma with probability 1 for n ≥ n0(ε, ω) it holds |zn| < ε. Therefore, under the hypothesis H1 it holds zn → 0, a.s. Next, under the hypothesis H2,
zn = 1 + ( ∑_{k=1}^{mn} ak²/λk )^{−1} ∑_{k=1}^{mn} (ak/λk) (x − a, ϕk) → 1, a.s.
17.6. We are looking for a continuous solution b(·) to the integral equation
a(t) = ∫_0^1 e^{−|t−s|} b(s) ds,  t ∈ [0, 1].
Rewrite the equation in the form
a(t) = ∫_0^t e^{s−t} b(s) ds + ∫_t^1 e^{t−s} b(s) ds,
whence
a′(t) = − ∫_0^t e^{s−t} b(s) ds + ∫_t^1 e^{t−s} b(s) ds,
a″(t) = ∫_0^t e^{s−t} b(s) ds + ∫_t^1 e^{t−s} b(s) ds − 2b(t),
b(t) = (a(t) − a″(t))/2,  t ∈ [0, 1].    (17.21)
Integrating by parts we verify that this continuous function does satisfy the given integral equation. It is essential that a ∈ C²([0, 1]), a′(0) = a(0), and a′(1) = −a(1). Then the density ρ = dμ2/dμ1 can be written in the form (17.19), with the function b(·) given in (17.21). We have
∫_0^1 a(s) b(s) ds = (1/2)( ∫_0^1 a²(s) ds − ∫_0^1 a(s) a″(s) ds ) = (1/2)( ‖a‖²_{L2} + ‖a′‖²_{L2} + a²(0) + a²(1) ),
which is denoted by σ² in the problem statement. Then
log ρ(x) = ∫_0^1 x(s) b(s) ds − σ²/2,  x ∈ C([0, 1]),    (17.22)
and under both hypotheses
D log ρ(x(·)) = ∫_0^1 ( ∫_0^1 e^{−|t−s|} b(s) ds ) b(t) dt = ∫_0^1 a(t) b(t) dt = σ².
Next, under H1, E log ρ(x(·)) = −σ²/2 and log ρ(x(·)) ∼ N(−σ²/2, σ²). According to representation (17.22) we have under H2 that
E log ρ(x(·)) = ∫_0^1 a(s) b(s) ds − σ²/2 = σ²/2 and log ρ(x(·)) ∼ N(σ²/2, σ²).
We use the Neyman–Pearson criterion to construct the decision rule. Fix ε ∈ (0, 1). We accept H1 if log ρ(x(·)) < C, where the threshold C is found from the equation
μ2({x : log ρ(x) < C}) = P{N(σ²/2, σ²) < C} = ε,
and C = σΦ^{−1}(ε) + σ²/2. Thus, H1 is accepted if
∫_0^1 x(s) b(s) ds < σΦ^{−1}(ε) + σ²;
otherwise we accept H2. Here the Type II error is
α21 = μ1({x : log ρ(x) ≥ C}) = P{N(−σ²/2, σ²) ≥ C} = Φ̄(Φ^{−1}(ε) + σ).
17.7. Under the hypothesis H1 we have Dxn = λn, whereas under the hypothesis H2 it holds Dxn = σ²λn. By the SLLN, under H1 we have
yn := (1/n) ∑_{k=1}^n xk²/λk → 1, a.s.,
whereas under hypothesis H2 it holds yn → σ² ≠ 1, a.s. Thus, we accept H1 if
lim_{n→∞} (1/n) ∑_{k=1}^n (1/λk) ( ∫_0^T x(t) ϕk(t) dt )² = 1;
otherwise we accept H2.
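A numerical sketch of this test (not from the original text; the kernel R(t, s) = min(t, s), the grid, and σ² are illustrative assumptions): the kernel is diagonalized on a grid, a path with covariance σ²R is simulated, and yn is computed from the eigen-coordinates.

```python
# A numerical sketch (illustrative only) of the error-free test of Problem 17.7 with
# the kernel R(t, s) = min(t, s): diagonalize R on a grid, simulate a centered path
# with covariance sigma^2 * R, and compute y_n = (1/n) * sum_{k<=n} x_k^2 / lambda_k.
import numpy as np

rng = np.random.default_rng(4)
T, m, sigma2, n = 1.0, 800, 2.5, 100
t = (np.arange(m) + 0.5) * T / m
dt = T / m

R = np.minimum(t[:, None], t[None, :])           # covariance kernel on the grid
evals, evecs = np.linalg.eigh(R * dt)            # approximate eigenvalues/eigenfunctions
order = np.argsort(evals)[::-1]
lam = evals[order][:n]
phi = evecs[:, order][:, :n] / np.sqrt(dt)       # eigenfunctions normalized in L2([0, T])

L = np.linalg.cholesky(sigma2 * R + 1e-12 * np.eye(m))   # small jitter for numerical safety
x = L @ rng.normal(size=m)                                # path with covariance sigma^2 * R

xk = phi.T @ x * dt                               # x_k = int_0^T x(t) phi_k(t) dt
y_n = np.mean(xk**2 / lam)
print("y_n ~", y_n, "(close to sigma^2 =", sigma2, "under H2; close to 1 under H1)")
```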
17.8. For t ∈ [0, 2] we set
λ(t) = lim_{n→∞} ∑_{k=0}^{2^n−1} ( x((k+1)t/2^n) − x(kt/2^n) )².
Under the hypothesis Hk this limit exists a.s., and by the formula (17.5)
λ(t) = ∫_0^t bk(s, x(s)) ds;  k = 1, 2.
In particular λ(t) = t for k = 1, and λ(t) = t²/2 for k = 2. The values of these functions differ, for example, at t = 1. Therefore, we calculate
λ(1) = lim_{n→∞} ∑_{k=0}^{2^n−1} ( x((k+1)/2^n) − x(k/2^n) )².
We accept the hypothesis H1 if λ(1) = 1; otherwise we accept H2.
17.9. (a) In the notations of subsection Homogeneous in space processes from Theoretical grounds, we have a1(t) = 0, B(t) = t; a2(t) = |log t|^{−1/2} for t ∈ (0, T], a2(0) = 0; a(t) = a2(t). There exists a solution b(t) to the equation (17.8) and it is equal to
b(t) = 0 for t = 0,  b(t) = 1/( t √|log t| ) for 0 < t ≤ T.
a(t)b(t)dt =
0
T 0
dt = +∞. t| logt|
That is why measures in the space C([0, T ]) that correspond to the distribution of the process under the hypotheses H1 and H2 are singular. In order to construct an error-free criterion we have to find a sequence of continuous functions bn (t), n ≥ 1, t ∈ [0, T ], such that T
lim
n→∞
a(t)bn (t)dt = +∞.
0
The functions can be defined as follows: T b(t), T n ≤ t ≤ T, bn (t) = b n , 0 ≤ T < Tn .
17 Statistics of stochastic processes
If
T
lim (
n→∞
tb2n (t)dt)−1
0
T
293
bn (t)dx(t) = 0
0
then we accept H1 ; otherwise we accept H2 . (b) In the notations of subsection √ Homogeneous in space processes from Theoretical grounds, we have a2 (t) = t = a(t); b(t) = t −1/2 for t ∈ (0, T ] and b(0) = 0. Condition (17.9) holds, therefore, a density of μ2 with respect to μ1 in the space C([0, T ]) is equal to T
ρ (x) = exp{
0
dx(t) 1 √ − 2 t
log ρ (x) =
T 0
T 0
1 √ √ tdt}, t
dx(t) T √ − . 2 t
Under the hypothesis Hk we have 1 T log ρ (x) ∼ N(mk , σ 2 ), σ 2 = T , mk = (−1)k σ 2 = (−1)k ; 2 2
k = 1, 2.
Fix a bound ε for the Type I error, ε ∈ (0, 1). We accept the hypothesis H1 if log ρ (x(·)) < L := σ Φ −1 (ε ) +
T 2
which is equivalent to T 0
dx(t) √ √ < T Φ −1 (ε ) + T. t
Otherwise we accept H2 . Then α12 = ε ,
L − m1 σ √ α21 = Φ¯ Φ −1 (ε ) + T .
α21 = P{N(m1 , σ 2 ) ≥ L} = Φ¯
,
17.10. In the notations of subsection Homogeneous in space processes from Theo√ √ retical grounds we have a1 (t) = 0,√ B(t) = diag(1,t), a2 (t) = a(t) = ( 4 t; 4 t)' . From equation (17.8) we find b(t) = ( 4 t; b2 (t))' with b2 (t) = t −3/4 for t ∈ (0, T ] and b2 (0) = 0. Check the condition (17.9): T 0
(a(t), b(t))dt =
T √ 0
1 t+√ t
2 dt = T 3/2 + 2T 1/2 < ∞. 3
294
17 Statistics of stochastic processes
Then by the formula (17.10) log ρ (x) =
T √
T
0
0
4
t dx1 (t) +
dx2 (t) − t 3/4
)
* T 3/2 1/2 +T , 3
where (x1 (t), x2 (t))' = x(t) is the observed vector path. Under the hypothesis Hk , k = 1, 2, * ) σ2 T 3/2 2 2 1/2 +T , mk = (−1)k . log ρ (x(·)) ∼ N(mk , σ ), σ = 2 3 2 Let the bound ε of the Type I error be given, ε ∈ (0, 1). We accept the hypothesis H1 if σ2 , log ρ (x(·)) < L := σ Φ −1 (ε ) + 2 which is equivalent to T √
T
0
0
4
t dx1 (t) +
dx2 (t) < σ 2; t 3/4
otherwise we accept H2 . Then α12 = ε and L − m1 2 ¯ = Φ¯ Φ −1 (ε ) + σ . α21 = P{N(m1 , σ ) ≥ L} = Φ σ 17.11. (a) The log-density is L(λ ; N) = N(T ) log λ + (1 − λ )T, λ > 0. In the case N(T ) > 0 it attains its maximum at λ = λˆ T := T −1 N(T ). (c) Introduce an i.i.d. sequence ni = N(i) − N(i − 1), i ≥ 1; n1 ∼ Pois(λ ). For N = k ∈ N consider as k → ∞ : N(k) n1 + · · · + nk = → En1 = λ , a.s. λˆ k = k k
(17.23)
Next, for any real T ≥ 1 consider N([T ]) [T ] N(T ) − N([T ]) · + δT , δT := . λˆ T = [T ] T T
(17.24)
As a result of (17.23) the first summand in (17.24) tends to λ , a.s. We have 0 ≤ δT ≤
N([T ] + 1) − N([T ]) nm+1 = , m := [T ]. [T ] m
It remains to prove that as m → ∞, nm+1 → 0, a.s. m
(17.25)
17 Statistics of stochastic processes
Consider
∞
∑E
m=1
n
m+1
m
2
=
295
∞
λ +λ2 < ∞. 2 m=1 m
∑
For ε > 0 we have by the Chebyshev inequality that # " n 1 nm+1 2 m+1 , P >ε ≤ 2 m ε m therefore, a series of these probabilities converges. Then by the Borel–Cantelli lemma |nm+1 /m| ≤ ε for all m ≥ m0 (ω ), a.s. This implies (17.25). (d) For T = k ∈ N we have as k → ∞: √ (n1 − λ ) + · · · + (nk − λ ) d √ k(λˆ k − λ ) = → N(0, Dn1 ) = N(0, λ ). k
(17.26)
According to the expansion (17.24), for any T ≥ 1 we have √ N([T ]) − λ T √ + T δT . T (λˆ T − λ ) = T Now, (17.26) implies that the first summand converges in distribution to N(0, λ ) as T → ∞. The second summand is estimated as √ nm+1 0 ≤ T δT ≤ √ , T √ √ P where m = [T ]. Then E| T δT | → 0 as T → ∞, and T δT → 0 as T → ∞. Finally the Slutzky lemma implies that √ d T (λˆ T − λ ) → N(0, λ ), T → ∞. 17.12. (a) The density mentioned in Hints is equal to
ρλ (x) = λ x(T ) e(1−λ )T
2 /2
, x ∈ D([0, T ]),
and this implies the desired relation. (b), (c) For the process N1 introduced in Hints, we have N1 (T 2 /2) , λˆ T = T 2 /2 hence the desired relation follows from Problem 17.1. In particular 3 T2 ˆ d d (λT − λ ) → N(0, λ ), T (λˆ T − λ ) → N(0, 2λ ), 2 as T → ∞. 17.13. Let {ξn1 , n ≥ 1} be independent random variables distributed on [0, T ] −1 with a density gT (t) = g(t) 0T g(s)ds , and {ξn2 , n ≥ 1} be an i.i.d. sequence
296
17 Statistics of stochastic processes
with similar density fT generated by the function f ; ν1 ∼ Pois 0T g(t)dt and ν2 ∼ Pois 0T f (t)dt , and νi is independent of {ξni , n ≥ 1}, i = 1, 2. According to Problem 5.17 the processes 4 Xi (t) =
νi
∑ 1Iξni ≥t ,
t ∈ [0, T ]
n=1
are nonhomogeneous Poisson processes on [0, T ] with intensity functions g (for i = 1) and f (for i = 2). In the notations from the solution to Problem 17.2 (a), we have for the measures μ1 and μ2 for k ≥ 1: μ2 (Fk ) = P{(ξ12 , . . . , ξk2 ) ∈ Ak } · P{ν2 = k} =
k fT (ti )
k
∏ gT (ti ) · ∏ gT (ti )dt1 . . . dtk · P{ν1 = k} k=1 Ak i=1 ⎛ ⎞k T
⎜ × ⎝ 0T
g(t)dt f (t)dt
T ⎟ ⎠ exp{ (g(t) − f (t))dt}. 0
0
Here ∏ki=1 gT (ti ) is a density of random vector (ξ11 , . . . , ξk1 ). Then
μ2 (Fk ) =
Ak
T
f (ti ) ∏ g(ti ) × exp{ (g(t) − f (t))dt}d μ1 (x), i=1 k
0
where ti = ti (x) are jump points of a step-function x(·). As in the solution to Problem 17.2 (a), this implies that for any Borel set B ⊂ D([0, T ]),
μ2 (B) =
B
T
f (ti ) ∏ g(ti ) 1Ix(ti )−x(ti −)=1 exp{ (g(t) − f (t))dt}d μ1 (x). i 0
17.14. (a) Based on Problem 17.13 the estimator λˆ T is found as a maximum point of the function N(T ) λT2 , λ > 0, L0 (λ , N) = ∑ log(1 + λ ti ) − 2 i=1 or (for N(T ) ≥ 1) as a solution to the equation hT (λ ) :=
1 N(T ) ti 1 = , λ > 0. ∑ 2 T i=1 1 + λ ti 2
The function hT is strictly increasing and continuous in λ ≥ 0. We have hT (+∞) = 0 < 12 . For the existence of a unique solution one has to ensure that hT (0) =
1 N(T ) 1 ti > . ∑ 2 T i=1 2
(17.27)
17 Statistics of stochastic processes
297
We have EhT (0) = T −2 EN(T )·E(ti | ti ≤ T ). Here ti is any jump point of the observed path, and under the condition ti ≤ T its density is equal to 1 + λ0 t T + λ02T Then EhT (0) =
1 T2
T
2
.
t(1 + λ0t)dt =
0
1 1 λ0 T + > . 2 3 2
(17.28)
It is straightforward to check that hT (0) − EhT (0) → 0 as T → ∞, a.s. Therefore, (17.28) implies that the inequality (17.27) holds with probability 1 for all T ≥ T0 (ω ). For such T ≥ T0 (ω ) there exists a unique maximum point of the function L0 (λ , N). (b) Notice that EhT (λ0 ) =
1 T2
T 0
t(1 + λ0t) 1 dt = . 1 + λ0 t 2
Fix 0 < ε < λ0 . For 0 < λ ≤ λ0 − ε we have hT (λ ) ≥ hT (λ0 − ε ) = EhT (λ0 − ε ) + o(1) ≥
1 + δ1 (ε ) + o(1). 2
Here δ1 (ε ) > 0 and o(1) is a r.v. tending to 0 as T → ∞, a.s. In a similar way for λ ≥ λ0 + ε we have hT (λ ) ≤ hT (λ0 + ε ) = EhT (λ0 + ε ) + o(1) ≤
1 − δ2 (ε ) + o(1). 2
Because hT (λˆ T ) = 12 , then with probability 1 there exists Tε (ω ) such that for all T ≥ Tε (ω ) it holds |λˆ T − λ | < ε . This proves the strong consistency of λˆ T . 17.15. (a) Let f generate mˆ ∈ M. The unbiasedness of mˆ is equivalent to the condition T 0 f (t)dt = 1. Then ∞
Dmˆ = (J f , f ) = ∑ λn c2n , cn := ( f , ϕn ), n ≥ 1, i=1
at that
∑∞ n=1 cn an
= 1. By the Cauchy–Schwartz inequality ) 1=
∞
∑ cn an
n=1
*2 ≤
∞
∞
n=1
n=1
∑ λn c2n · ∑ λn−1 a2n ,
(17.29)
−1 2 −1 . and Dmˆ ≥ ∑∞ n=1 λn an Let the latter series converge (the divergency case is treated in solution (b) below). The equality in (17.29) is attained if cn is proportional to λn−1 an (though −1 ∑∞ n=1 λn an ϕn is not necessarily a continuous function). Introduce a continuous function
)
N
∑
fN (t) =
*−1
λn−1 a2n
n=1
N
· ∑ λn−1 an ϕn (t), N ≥ 1, t ∈ [0, T ], n=1
and the corresponding estimator mˆ N =
T
) fN (t)x(t)dt = m +
N
∑
*−1
λn−1 a2n
n=1
0
Here 1 xn = √ λn
T
N
∑
.
λn−1 an xn .
n=1
ϕn (t)(x(t) − m) dt, n ≥ 1,
0
is a sequence of uncorrelated random variables with zero mean and unit variance. Then ) *−1 ) *−1 N
∑ λn−1 a2n
lim Dmˆ N = lim
N→∞
N→∞
=
n=1
∞
∑ λn−1 a2n
.
n=1
Moreover there exists the mean square limit of mˆ N as N → ∞, and this is an −1 2 −1 . This estimator can be out unbiased estimator mˆ ∗ with variance ∑∞ n=1 λn an of M. (b) In this case Dmˆ N → 0 as N → ∞. Now, from the unbiasedness of mˆ N it follows P that mˆ N → m as N → ∞, and by the Riesz lemma there exists a subsequence of estimators {mˆ N(k) , k ≥ 1} that converges to m, a.s. 17.16. Let mˆ F ∈ M. Then 0T dF(t) = m and Dmˆ F =
T T
r(s,t)dF(s)dF(t) =: Φ (F).
0 0
Suppose that the minimum of the variance is attained at mˆ F , and G is a function introduced in Hints. For each δ ∈ R we have
Φ (F + δ G) = Φ (F) + δ Φ (G) + 2δ
T
2
R(t)dG(t) ≥ Φ (F)
0
where R(s) =
T 0
r(s,t)dF(t), s ∈ [0, T ]. This implies that β
R(t)dG(t) = R(α ) − R(β ) = 0.
α
Therefore, R(s) ≡ C and Dmˆ F = 0T R(s)dF(s) = C. Vice versa, let F be a function of bounded variation such that mˆ F ∈ M and R(s) ≡ C. Let mˆ H ∈ M; then for G := H − F we have 0T dG(t) = 0 and
Dmˆ H = Φ (F + G) = Φ (F) + Φ (G) + 2
T
R(t)dG(t) 0
= Φ (F) + Φ (G) ≥ Φ (F) = Dmˆ F . 17.17. (a) mˆ 1 = 0T x(t)dF(t), F(t) = (2 + β T )−1 (1It>0 + 1It≥T + β t) . The equality holds 0T dF(t) = F(T ) − F(0) = 1, therefore, mˆ 1 ∈ M. Next, it is straightforward that for each s ∈ [0, T ], T
e−β |t−s| dF(t) =
0
2 = const, 2+βT
thus, mˆ 1 has the least variance in M. (b) mˆ 2 = 0T x(t)dF(t), F(t) = 1It>0 . This estimator belongs to M, and for each s ∈ [0, T ] it holds T
min(s + 1,t + 1)dF(t) = 1 = const.
0
That is why mˆ 2 has the least variance in M. (c) Content ourself with case (a). Based on Problem 17.16 it is enough to show there is no such function f ∈ L1 ([0, T ]) that T
e−β |t−s| f (t)dt ≡ 1, s ∈ [0, T ].
(17.30)
0
To the contrary, suppose that (17.30) holds. Differentiating we obtain that for almost every s ∈ [0, T ], −β s
−β e
s
βt
βs
e f (t)dt + β e
T
e−β t f (t)dt = 0.
s
0
The last two equalities imply that −β s
s
2e
eβ t f (t)dt = 1
0
for almost every s ∈ [0, T ]. Both parts of the equation are continuous in s; then
s βt βs 0 e f (t)dt ≡ e /2. Differentiating this identity we obtain f (s) = β /2 for almost
every s ∈ [0, T ]. But this function does not satisfy (17.30), because the integral I(s) =
T
−β |t−s|
e 0
dt =
T −s
e−β |u| du,
−s
is not a constant. We came to contradiction.
s ∈ [0, T ],
17.18. (a)
σ =T 2
−1
2n −1
∑
lim
n→∞
k=0
k+1 x T 2n
k −x nT 2
2 , a.s.
(b) μ 2T d μ2 −2 , x ∈ C([0, T ]). ρ (x) = (x) = exp σ μ x(T ) − d μ1 2 Hence, μˆ T = argmaxμ >0 log ρ (x(·)) = T −1 x(T ). (c) It holds μˆ T = μ + σ W (T )/T , which implies the desired relations. For that Problem 3.18 is used. 17.19. (a) Write down the prior density up to multipliers that do not depend on μ :
ρ (μ ) ∼ exp{−
μ2 μ μ0 + 2 }. 2 2σ0 σ0
Then the posterior density ρ (μ |x) is proportional to the expression:
ρ (μ |x) ∼ ρ (μ )ρ (x|μ ) ∼ exp{− with A :=
Aμ 2 + Bμ }, 2
μ0 x(T ) T 1 + and B := 2 + 2 . σ 2 σ02 σ σ0
Therefore,
ρ (μ |x) ∼ exp{−
(μ − BA−1 )2 }. 2A−1
The posterior distribution will be the normal law N(μT , σT2 ) with parameters
μT = BA−1 =
μ0 σ 2 + σ02 x(T ) σ02 σ 2 2 and σ = . T T σ02 + σ 2 T σ02 + σ 2
(b) The Bayes estimator is
μT = KT μ0 + (1 − KT )
x(T ) σ2 with KT = 2 . T σ + T σ02
Thus, the estimator is a convex combination of the prior estimator μ0 and the maximum likelihood estimator T −1 x(T ). The coefficient KT is called the confidence factor. As T → ∞ it tends to 0 (that is, for large T we give credence to the maximum likelihood estimator), while as T → 0 it tends to 1 (that is, for small T we give more credence to the prior information rather than to the data). Answer: the posterior distribution is N(μT , σT2 ) with
μT = KT μ0 + (1 − KT )
σ2 x(T ) , KT = 2 , T σ + T σ02
17 Statistics of stochastic processes
σT2 =
σ02 σ 2 , T σ02 + σ 2
and μT is the Bayes estimator of the parameter μ . 17.20. (a) Up to multipliers that do not depend on λ , ρ (λ ) ∼ λ α −1 e−β λ . According to Problem 17.2, the density of the distribution of the process ρ (N|λ ) ∼ λ N(T ) e−λ T . Then ρ (λ |N) ∼ ρ (λ )ρ (N|λ ) ∼ λ α +N(T )−1 e−(β +T )λ . The posterior distribution is gamma distribution Γ (α + N(T ), β + T ). (b) In the first case the Bayes estimator is N(T ) α + N(T ) α = K1 + (1 − K1 ) . λˆ T 1 = β +T β T Here K1 = β /(β + T ) is the confidence factor (see the discussion in the solution of Problem 17.19). In the second case the Bayes estimator is
α + N(T ) − 1 . λˆ T 2 = β +T It exists if α + N(T ) > 1. Under the additional constraint α > 1 it holds
α −1 N(T ) with K2 = K1 . λˆ T 2 = K2 + (1 − K2 ) β T This estimator is a convex combination of the prior estimator (α − 1)/β (under the same loss function) and the maximum likelihood function. Answer: gamma distribution Γ (α + N(T ), β + T ), N(T ) α β with K = , λˆ T 1 = K + (1 − K) β T β +T ) λˆ T 2 = K αβ−1 + (1 − K) N(T T . 17.21. The unbiasedness condition Eϕ ,θ θˆ = θ means the following:
∀ϕ ∈K:
¯ < f , ϕ >= 0;
< f , g' >= Im .
Here g = (g1 , . . . , gm )' ; < f , ϕ > is a column vector with components ( f , ϕi ), and < f , g' > is a matrix with entries ( fi , g j ); Im stands for the unit matrix of size m. Looking for an optimal estimator in the class M is reduced to the following problem. Find a vector function f = ( f1 , . . . , fm )' such that: (1) { f1 , . . . , fm } ⊂ K ⊥ . (2) < f , g' >= Im . (3) For each vector function h that satisfies the conditions (1) and (2), the matrix < h, h' > − < f , f ' > is positive semidefinite.
Let P be the projection operator on K, and Pg = (Pg1 , . . . , Pgm )' , Φ =< Pg, (Pg)' >. The matrix Φ is nonsingular as a result of linear independence of the functions {Pgi }. The desired vector is unique and has a form f = f ∗ = Φ −1 Pg. The covariance matrix S∗ of the corresponding estimator is S∗ = E[(θˆ ∗ − θ )(θˆ ∗ − θ )' ] = σ 2 Φ −1 . 17.22. (b) Continue the reasoning from Hints. By the Fubini theorem
λ 2 (A1 ) =
λ 1 ({x1 : (x1 , x2 ) ∈ A1 })d λ 1 (x2 ) = 1.
[0,1]
Here λ 2 is Lebesgue measure on the plane. Next, A2 := X \ A1 = {x : θ (x) ∈ [2, 3]} and the consistency of the estimator implies that for all x1 ∈ [0, 1] it holds
λ 1 ({x2 ∈ [0, 1] : (x1 , x2 ) ∈ A2 }) = 1. Therefore,
λ 2 (A2 ) =
1 · d λ 1 (x1 ) = 1.
[0,1]
But due to additivity of a measure, 1 = λ 2 (X) = λ 2 (A1 ) + λ 2 (A2 ) = 2. We came to a contradiction.
18 Stochastic processes in financial mathematics (discrete time)
Theoretical grounds
Consider a model of a financial market with a finite number of periods (i.e., of the moments of time) at which it is possible to trade, consume, spend, or receive money or other valuables. The model consists of the following components. There exist d + 1 financial assets, d ≥ 1, and the prices of these assets are available at moments t ∈ T = {0, 1, . . . , T}. The price of the ith asset at moment t is a nonnegative r.v. Si(t) defined on the fixed probability space (Ω, F, P). This space is assumed to support some filtration {Ft}t∈T, and we suppose that the random vector St = (S0(t), S(t)) = (S0(t), S1(t), . . . , Sd(t)) is measurable with respect to the σ-field Ft. For the sake of technical simplicity we assume in what follows that F0 = {∅, Ω} and FT = F. In most applications the asset S0(t) is considered as a risk-free (riskless) bond (numéraire), and sometimes it is supposed that S0(t) = (1 + r)^t, where r > −1 is a risk-free interest rate. In real situations r > 0, but it is not obligatory. Other assets are considered as risky ones, for example, stocks, property, currency, and so on.
Definition 18.1. A predictable d + 1-dimensional stochastic process ξ = (ξ0, ξ) = {(ξ0(t), ξ1(t), . . . , ξd(t)), t ∈ T} is called the trading strategy (portfolio) of a financial investor.
A coordinate ξi(t) of the strategy ξ corresponds to the quantity of units of the ith asset during the tth trading period between the moments t − 1 and t. Therefore, ξi(t)Si(t − 1) is the sum invested into the ith asset at the moment t − 1, and ξi(t)Si(t) is the corresponding sum at the moment t. The total value of the portfolio at the moment t − 1 equals (ξ(t), S(t − 1)) = ∑_{i=0}^d ξi(t)Si(t − 1), and at the moment t this value can be equated to (ξ(t), S(t)) = ∑_{i=0}^d ξi(t)Si(t) (here (·, ·) is, as always, the symbol of the inner product in Euclidean space). The predictability of the strategy reflects the fact that the redistribution of resources happens at the beginning of each trading period, when the future prices are unknown.
Definition 18.2. The strategy ξ is called self-financing if the investor's capital V(t) satisfies the equality V(t) = (ξ(t), S(t)) = (ξ(t + 1), S(t)), t ∈ {1, 2, . . . , T − 1}.
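To make the bookkeeping concrete, here is a small numerical sketch (not part of the original text) with d = 1: a bond S0, one risky asset S1, a predictable strategy, and a check of the gains representation of the capital of a self-financing strategy stated in the next paragraph; all numerical inputs are illustrative assumptions.

```python
# A toy check (illustrative only) of the self-financing identity
#   V(t) = (xi(1), S(0)) + sum_{k=1}^{t} (xi(k), S(k) - S(k-1))
# for a market with one bond S0 and one risky asset S1 (d = 1).
import numpy as np

rng = np.random.default_rng(5)
T, r = 5, 0.01
S0 = (1 + r) ** np.arange(T + 1)                                  # riskless asset
S1 = 10 * np.cumprod(np.r_[1.0, 1 + rng.normal(0.0, 0.05, T)])    # a risky price path

xi1 = rng.normal(0.0, 1.0, T + 1)     # risky positions xi1(1), ..., xi1(T); index 0 unused
xi0 = np.zeros(T + 1)
V = np.zeros(T + 1)
V[0] = 1.0                                                        # initial capital (xi(1), S(0))
xi0[1] = (V[0] - xi1[1] * S1[0]) / S0[0]                          # rest of the capital goes to the bond

for t in range(1, T + 1):
    V[t] = xi0[t] * S0[t] + xi1[t] * S1[t]                        # portfolio value after period t
    if t < T:
        # self-financing rebalancing: the new portfolio has the same value at time t
        xi0[t + 1] = (V[t] - xi1[t + 1] * S1[t]) / S0[t]

gains = V[0] + sum(xi0[k] * (S0[k] - S0[k - 1]) + xi1[k] * (S1[k] - S1[k - 1])
                   for k in range(1, T + 1))
print("V(T) =", V[T], " (xi(1), S(0)) + gains =", gains)
```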
The self-financing property of the trading strategy means that the portfolio is always redistributed in such a way that its total value is preserved. The strategy is self-financing if and only if for any t ∈ {1, 2, . . . , T }, t
V (t) = (ξ (t), S(t)) = (ξ (1), S(0)) + ∑ (ξ (k), (S(k) − S(k − 1))). k=1
The value (ξ (1), S(0)) is the initial investment that is necessary for the purchasing of the portfolio ξ (1). Below we assume that S0 (t) > 0 P-a.s., for all t ∈ T. In this case it is possible to define the discounted prices of the assets X i (t) := (Si (t))/(S0 (t)), t ∈ T, i = 0, 1, . . . , d. Evidently, after the discounting we obtain that X 0 (t) ≡ 1, and X(t) = (X 1 (t), . . . , X d (t)) is the value of the vector of risk assets in terms of units of the asset S0 (t), which is the discounting factor. Despite the fact that the asset S0 (t) is called risk-free and the vector X(t) is called the vector of risk assets, these notions are relative, to some extent. Introduce the extended vector of discounted prices X(t) = (1, X 1 (t), . . . , X d (t)). Definition 18.3. A stochastic process of the form {V (t), Ft ,t ∈ T} where V (t) = (ξ (t), X(t)) =
(ξ (t), S(t)) S0 (t)
is called the discounted capital of investor. If a strategy is self-financing, the equality t
V (t) = (ξ (1), X(0)) + ∑ (ξ (k), (X(k) − X(k − 1))) k=1
holds true for all t ∈ T. Here
∑0k=1
:= 0.
Definition 18.4. A self-financing strategy is called the arbitrage possibility if its capital V satisfies inequalities V (0) ≤ 0, V (T ) ≥ 0 P-a.s., and P(V (T ) > 0) > 0. Definition 18.5. A probability measure Q on (Ω , F) is called the martingale measure, if the vector-valued discounted price process {X(t), Ft ,t ∈ T} is a d-dimensional Q-martingale; that is, EQ X i (t) < ∞ and X i (s) = EQ (X i (t)/Fs ), 0 ≤ s ≤ t ≤ T , 1 ≤ i ≤ d. Theorem 18.1. A financial market is free of arbitrage if and only if the set P of all martingale measures, which is equivalent to measure P, is nonempty. In this case there exists a measure P∗ ∈ P with bounded density dP∗ /dP. Definition 18.6. A nonnegative r.v. C on (Ω , F, P) is called the European contingent claim (payoff). The European contingent claim can be interpreted as an asset that guarantees to its owner the payment C(ω ) at moment T . The moment T is called the expiration date, or maturity date of the claim C. The corresponding discounted contingent claim has a form H = C/ST0 . If this discounted contingent claim can be presented in a functional form, namely, H = f (X(·), where f : Rd(T +1) → R+ is a measurable function, then it is called the derivative, or derivative security, of the vector of primary financial assets X(t),t ∈ T.
Definition 18.7. A contingent claim C is called attainable (replicable, redundant), if there exists a self-financing strategy ξ such that the value of portfolio at the maturity date equals C; that is, C = (ξ T , ST ) P-a.s. In this case we say that a strategy ξ creates a replicating portfolio for C (replicates C, is a hedging strategy for C). A contingent claim is attainable if and only if the corresponding discounted contingent claim has a form T
H = (ξ T , X T ) = VT = V0 + ∑ (ξk , (Xk − Xk−1 )), k=1
for some self-financing strategy ξ . Definition 18.8. An arbitrage-free financial market is called complete if on this market any contingent claim is attainable. Theorem 18.2. A financial market is complete if and only if there exists and is unique the equivalent martingale measure. Definition 18.9. A number π (H) is called the arbitrage-free price (fair price) of discounted European contingent claim H if there exists a nonnegative adapted stochastic process X d+1 = {X d+1 (t), Ft ,t ∈ T} such that X d (0) = π (H), X d (T ) = H, and the extended financial market (X 0 (t), . . . , X d+1 (t)) is arbitrage-free. Theorem 18.3. If a contingent claim is attainable then it has a unique arbitrage-free price EP∗ (H). If a contingent claim is not attainable, then the set of its arbitragefree prices is an interval of the form (π ↓ (H), π ↑ (H)) on nonnegative axis (possibly π ↑ (H) = ∞). Definition 18.10. A nonnegative adapted stochastic process C = {C(t), Ft ,t ∈ T} is called the American contingent claim. An American contingent claim is a contract that is issued at moment t = 0 and obliges the writer to pay a certain amount C(t), provided the buyer decides at moment t to exercise this contract. The contract is exercised only once. If the buyer has not decided to exercise the contract till the maturity date T , then at this moment the contract is automatically exercised. The buyer has a possibility to exercise the contract not only at nonrandom moment t ∈ T, but at any stopping time τ ∈ T. The aim of the buyer is to find an optimal stopping time τ0 in the sense that EC(τ0 ) = sup0≤τ ≤T EC(τ ), where τ are stopping times, or, in terms of the corresponding discounted contingent claim H(t) = (C(t))/(S0 (t)), EH(τ0 ) = sup0≤τ ≤T EH(τ ). Definition 18.11. An American call option on an asset S is the derivative that can be exercised at any stopping time τ ∈ T, and in this case the payment is (S(τ ) − K)+ , where S(t) is the price at moment t of the underlying asset. An American put option is defined similarly. The strategies of the buyer and writer of an American option are different: the buyer wants to exercise the option at that moment τ0 where the mean value of the payment is the biggest, and the writer wants to create his portfolio in order to have a possibility to exercise the option whenever the buyer comes.
Bibliography [55], Chapter IV; [23], Chapters 1,5,6; [84], Volume 2, Chapters V and VI; [21], Chapters I and II; [46], Chapters 4–8; [54], Chapter IX; [62], Chapters 1–4.
Problems 18.1. The owner of a European call option has a right, but not an obligation, to buy some asset, for example, some stock S, at moment T at price K, which is fixed initially. This price is called the strike price. Similarly, the owner of a European put option has a right, but not an obligation, to buy some asset, for example, the same stock S, at moment T at the price K, which is fixed initially. (1) Prove that the value of the call option C = Ccall equals C = C(T ) = (S(T ) − + K) , and the value of the put option P = P put equals P = P(T ) = (K − S(T ))+ . (2) Let a finance market be arbitrage-free, π (C) be the arbitrage-free price of a European call option, and π (P) be the arbitrage-free price of the European put option, both with strike price K. Prove that π (C) ≤ S(0) and π (P) ≤ K, where S(0) is the initial price of the risk asset. 18.2. (1) We know that the financial market is arbitrage-free, the price of an asset at moment 0 equals S(0), and at moment T the possible values of this asset are S(ωi ), i = 1, . . . , M. Also, let the risk-free interest rate at any moment equal r. What is the risk-free price of a European call option on this asset if the strike price equals K with K < min1≤i≤M S(ωi )? (2) What is the risk-free price of a European call option on this asset if the strike price is zero? 18.3. We know that the financial market is arbitrage-free, the interest rate equals r at any period, and T is the expiration date. (1) Prove the following inequalities constructing an explicitly arbitrage strategy in an opposite case : (a) π (P) ≥ K(1 + r)−T − S(0). (b) π (C) ≥ S(0) − K(1 + r)−T . (2) Using a definition of the martingale measure prove the following specifications of the inequalities from item (1). (a) Prove that the arbitrage-free price of a European put option admits the bounds max(0, (1 + r)−T K − S(0)) ≤ π (P) ≤ (1 + r)−T K. (b) Prove that the price of the corresponding call option admits the bounds max(0, S(0) − (1 + r)−T K] ≤ π (C) ≤ S(0). 18.4. We know that a financial market is arbitrage-free, the interest rate equals r at any period, T is the expiration date, and K is the strike price of all the options mentioned below.
18 Stochastic processes in financial mathematics (discrete time)
307
(1) (a) Prove that under the conditions for absence of arbitrage, the put–call parity holds between arbitrage-free prices of call and put options: S(0) + π (P) − π (C) = K(1 + r)−T . (b) Prove the following generalization of the put–call parity to any intermediate moment: S(t) + P(t) −C(t) = K(1 + r)−T +t , where P(t) and C(t) are arbitrage-free prices of put and call options at moment t, respectively. (2) Prove that the selling one asset, selling one put option, and buying one call option yields positive profit with vanishing risk (the arbitrage) under the assumption S(0) + π (P) − π (C) > K(1 + r)−T . (3) Prove that the buying one asset, buying one put option, and selling one call option provides the arbitrage under the assumption S + π (P) − π (C) < K(1 + r)−T . 18.5. Let the price of an asset (e.g., stock) at moment t equal S(t). All the options under consideration are supposed to have the expiration date T and the strike price K, unless otherwise specified. The interest rate equals r at any period between buying and exercising options. Calculate the capital at moment T of the investor whose activity at moment t can be described as follows. (a) She has one call option and one put option. (b) She has one call option with strike price K1 and sells one put option with strike price K2 . (c) She has two call options and sells one asset. (d) She has one asset and sells one call option. 18.6. (Law of one price) Let the financial market be arbitrage-free, C be an attainable contingent claim, and ξ = {ξ (t),t ∈ T} be any replicating portfolio for C. Prove that the initial capital V (0) = (ξ (1), S(0)) is the same for any such portfolio. 18.7. (Binomial model, or Cox–Ross–Rubinstein model) Assume that there is one riskless asset (a bond) {Bn = (1 + r)n , 0 ≤ n ≤ N} with the interest rate r > −1 and one risky asset (a stock) {Sn , 0 ≤ n ≤ N} within the financial market. The price Sn can be calculated as follows. S0 > 0 is a given value, Sn+1 is equal either to Sn (1 + a) or Sn (1 + b), where −1 < a < b. Hence, Ω = {1 + a, 1 + b}N . We put F0 = {∅, Ω } and Fn = σ {S1 , . . . , Sn }, 1 ≤ n ≤ N. Assume that every element of Ω has positive probability. Let Rn = Sn /Sn−1 , 1 ≤ n ≤ N. If {y1 , . . . , yn } is some element of Ω then P ({y1 , . . . , yn }) = P(R1 = y1 , . . . , Rn = yn ). (1) Show that Fn = σ {R1 , . . . , Rn }, 1 ≤ n ≤ N. (2) Show that the discounted stock price Xn := Sn /(1 + r)n is a P∗ -martingale if and only if EP∗ (Rn+1 /Fn ) = 1 + r, 0 ≤ n ≤ N − 1. (3) Prove that the condition r ∈ (a, b) is necessary for the market to be arbitragefree. (4) Prove that under the condition r ∈ (a, b) a random sequence {Xn , Fn , 0 ≤ n ≤ N} is a P∗ -martingale if and only if random variables R1 , . . . , Rn are mutually independent and identically distributed and P∗ (R1 = 1 + b) = (r − a)/(b − a) =: p∗ . Show that the market is complete in this case.
308
18 Stochastic processes in financial mathematics (discrete time)
18.8. In the framework of an arbitrage-free and complete binomial model consider a discounted derivative security of the form H = f (X0 , . . . , XN ), where f : RN+1 → R+ is a measurable function. (1) Prove that H is integrable with respect to the martingale measure P∗ . (2) Prove that the capital Vn of any hedging strategy for H can be presented in the form Vn = EP∗ (H/Fn ) = vn (X0 , X1 (ω ) . . . , Xn (ω )), where the function vn (x0 , . . . , xn ) = EP∗ f x0 , . . . , xn , xn (X1 /X0 ), . . . , xn ((XN−n )/(X0 )) . (3) Prove that a self-financing strategy ξ = (ξ 0 , ξ ), which is a replicating strategy for H, has the form ξn (ω ) = Δn (X0 , X1 (ω ), . . . , Xn−1 (ω )), where
Δn (x0 , x1 , . . . , xn−1 ) = aˆ =
ˆ vn (x0 ,...,xn−1 ,xn−1 b)−v ˆ n (x0 ,...,xn−1 ,xn−1 a) , ˆ a) xn−1 (b− ˆ 1+a ˆ 1+b 1+r , b = 1+r ,
and 0 ξ10 (ω ) = EP∗ (H) − ξ1 (ω )X0 , ξn+1 (ω ) − ξn0 (ω ) = − (ξn+1 (ω ) − ξn (ω ))Xn .
(4) Let H = f (XN ). Prove that in this case the functions vn (xn ) can be presented N−n−k b ˆ k )Ck (p∗ )k (1 − p∗ )N−n−k ; in particular, the in the form vn (xn ) = ∑N−n N−n k=0 f (xn aˆ unique arbitrage-free price of the contingent claim H can be presented in the form π (H) = v0 (X0 ) = ∑Nk=0 f (X0 aˆN−k bˆ k )CNk (p∗ )k (1 − p∗ )N−k . (5) Denote by Cn the price at moment n of a European call option with the expiration date N and the strike price K. Prove that under the conditions of item (4) of Problem 18.7 it holds that Cn = c(n, Sn ), where c(n,x) (1+r)n−N
+ = EP∗ x ∏Ni=n+1 Ri − K
(N−n)! ∗ j ∗ N−n− j x(1 + a)N−n− j (1 + b) j − K + . = ∑N−n j=0 (N−n− j)! j! (p ) (1 − p )
18.9. (Trinomial model) Let a financial market consist of one risky asset (stock) {Sn , 0 ≤ n ≤ N} and one riskless asset (bond) {Bn = (1 + r)n , 0 ≤ n ≤ N} with the interest rate r > −1. The price of Sn is defined as follows. S0 > 0 is a fixed number, and Sn+1 equals either Sn (1 + a), or Sn (1 + b), or Sn (1 + c), where −1 < a < b < c. Therefore, Ω = {1 + a, 1 + b, 1 + c}N ; that is, any element of Ω can be presented as ω = {y1 , . . . , yN }, where yn = 1 + a, or 1 + b, or 1 + c. Similarly to the binomial model, put F0 = {∅, Ω } and Fn = σ {S1 , . . . , Sn }, 1 ≤ n ≤ N. Suppose that any element of Ω has positive probability. (1) How many equivalent martingale measures exist in this model? Is this model arbitrage-free? complete? (2) If we consider two martingale measures, do they lead to the same price of attainable payoff; that is, does the law of one price hold in the trinomial model? 18.10. Denote by C(t) and Y (t), t ∈ T the price which will pay the buyer of a European and American option, respectively, if he buys them at moment t. It is supposed that the options are derivatives at the same asset. Prove that Y (t) ≥ C(t).
18 Stochastic processes in financial mathematics (discrete time)
309
18.11. Consider an American option with payoffs {Y (t),t ∈ T}. Let a European option at moment T have the same payoff Y (T ). Prove the following statement. If C(t) ≥ Y (t) for any t ∈ T and any ω ∈ Ω , then C(t) = Y (t) and the optimal strategy for the buyer is to wait until moment T and then exercise the option. 18.12. (Hedging of American option) Assume an American option can be exercised at any moment n = 0, 1, . . . , N, and let {Sn , Fn , 0 ≤ n ≤ N} be a stochastic process, that is equal to the profit, provided the option is exercised at moment n. Denote by {Sn0 , 0 ≤ n ≤ N} the price of a risk-free asset (the discounting factor), which is supposed to be nonrandom, and let Xn = Sn /Sn0 . Denote also by {Yn , 0 ≤ n ≤ N} the price (value) of the option at moment n. The market is supposed to be arbitrage-free and complete, and let P∗ be the unique martingale measure. (1) Prove that YN = SN . (2) Prove that for any 0 ≤ n ≤ N − 1 the equality holds Yn = max(Sn , Sn0 EP∗ (
Yn+1 /Fn )). 0 Sn+1
(3) Prove that in the case Sn0 = (1 + r)n , the price of the American option can be presented as 1 EP∗ (Yn+1 /Fn )). Yn = max(Sn , 1+r (4) Prove that the discounted price of the American option Zn := Yn /Sn0 is a P∗ supermartingale; moreover it is the smallest P∗ -supermartingale dominating the sequence {Xn , 0 ≤ n ≤ N}. Prove that Zn is a Snell envelope of the sequence {Xn }. (5) Prove that the following equalities hold: Zn = supτ ∈Tn,N EP∗ (Xτ /Fn ), 0 ≤ n ≤ N. (See Problems 7.22 and 15.20 for the corresponding definitions.) 18.13. (Price of an American put option in the context of the Cox–Ross–Rubinstein model) Let a financial market consist of one stock {Sn , 0 ≤ n ≤ N}, with the price defined by the Cox–Ross–Rubinstein model and one bond with the interest rate r > −1 (see Problem 18.7). (1) Prove that at moment n with 0 ≤ n ≤ N the price Pn of the American put option with the maturity date N and the strike price K equals Pn = P(n, Sn ), where P(n, x) can be found as follows. P(N, x) = (K − x)+ , and for 0 ≤ n ≤ N − 1 it holds P(n, x) = max((K − x)+ , (( f (n + 1, x))/(1 + r))), where f (n + 1, x) = pP (n + 1, x(1 + a)) + (1 − p)P (n + 1, x(1 + b)) , with p = (b − r)/(b − a). (2) Prove that P(0, x) = supτ ∈T0,N EP∗ ((1 + r)−τ (K − xVτ )+ ) , where the sequence {Vn , 0 ≤ n ≤ N} is given by V0 = 1,Vn = ∏ni=1 Ui , 1 ≤ n ≤ N, and Ui are some random variables. Determine their simultaneous distribution with respect to the measure P∗ . (3) Use item (2) and prove that the function P(0, x) : R+ → R+ is convex. (4) Let a < 0. Prove that there exists a real number x∗ ∈ [0, K] such that for x ≤ x∗ it holds P(0, x) > (K − x)+ . (5) Let the owner have an American put option at moment t = 0. What are the values of S0 for which it is the most profitable to exercise this option at the same moment?
310
18 Stochastic processes in financial mathematics (discrete time)
18.14. An American option is called attainable if for any stopping time 0 ≤ τ ≤ T there exists a self-financing strategy such that the corresponding capital V satisfies the equality V (τ ) = Y (τ ). Let an American option be attainable, and the corresponding discounted stochastic process H(t) := Y (t)/S0 (t) be a submartingale with respect to some martingale measure P∗ . Prove that the optimal stopping time τ coincides with the exercise date (i.e., τ = T ), and the price of this American option coincides with the price of a European option, C(T ) = Y (T ). 18.15. Consider an American call option that can be exercised at any moment t ∈ T, and in the case where it is exercised at moment t ∈ T, the strike price equals K(1 + q)t , where q is a fixed number; that is, the payoff at moment t equals (S(t) − K(1 + q)t )+ . For q ≤ r, where r is the interest rate, prove that the option will not be exercised before moment t = T . 18.16. Let a risky asset at moment T have the price S(T ), where S(T ) is a r.v. with the distribution determined by the measure P. At moment T, let the option on this asset have the price C(T ). Consider a portfolio consisting of ξ0 units of a riskless asset and ξ units of a risky asset, and the portfolio is constant during all trading periods; such a strategy is called “buy and hold”; it is not obligatory self-financing. We suppose that the interest rate is zero, and let the initial capital equal V (0). (1) Prove that the additional costs that must be invested by the owner of this portfolio with the purpose to be able at moment T to exercise the contingent claim C(T ), can be calculated by the formula D := C(T ) −V (0) − ξ (S(T ) − S(0)). (2) In terms of ES(T ), EC(T ), DS(T ), and cov(S(T ),C(T )), determine those values of V (0) and ξ that minimize E(D2 ), and prove that under these values of V (0) and ξ it holds ED = 0. (3) Prove that in the case of a complete market the option C(T ) linearly depends on S(T ) − S(0), and that V (0) and ξ can be chosen in such a way that D = 0.
Hints
18.7. (1) Express {S1 , . . . , Sn } via {R1 , . . . , Rn } and vice versa.
(2) Write the equation EQ (Xn+1 /Fn ) = Xn in the equivalent form EQ ((Xn+1 /Xn )/Fn ) = 1.
(3) Let the market be arbitrage-free. Then there exists a measure P∗ ∼ P such that Xn is a P∗-martingale. Furthermore, use the statement of item (2).
(4) If Rn are mutually independent and P∗ (Ri = 1 + b) = p∗ then EP∗ (Rn+1 /Fn ) = 1 + r. Check this and use the statement of item (2). And conversely, let EP∗ (Rn+1 /Fn ) = 1 + r. Derive from here that P∗ (Rn+1 = 1 + b/Fn ) = p∗ and P∗ (Rn+1 = 1 + a/Fn ) = 1 − p∗ . Prove by induction that P∗ (R1 = x1 , . . . , Rn = xn ) = ∏ni=1 pi , where pi = p∗ if xi = 1 + b, and pi = 1 − p∗ if xi = 1 + a. Note that the P∗-martingale property of Xn uniquely determines the distribution of (R1 , . . . , RN ) with respect to the measure P∗ and thus uniquely determines the measure P∗ itself. That is why the market is complete.
18.8. (1) Integrability of H is evident, because all the random variables take only a finite number of values.
(2) To prove the equality Vn = EP∗ (H/Fn ) it is necessary to prove first, using backward induction, that all the values Vn of the capital are nonnegative. Then, with the help of the formula for the capital of a self-financing strategy, it is possible to prove that Vn is a martingale with respect to the measure P∗ . To prove the second inequality you can write Xk , n ≤ k ≤ N, in the form Xk = Xn (Xk /Xn ) and use the fact that the r.v. Xk /Xn is independent of Fn and has the same distribution as Xk−n /X0 .
(3) Write the equality ξn (ω )(Xn (ω ) − Xn−1 (ω )) = Vn (ω ) − Vn−1 (ω ), in which ξn (ω ), Xn−1 (ω ), Vn−1 (ω ) depend only on the first n − 1 components of the vector ω . Denote ω a := (y1 , . . . , yn−1 , 1 + a, yn+1 , . . . , yN ), ω b := (y1 , . . . , yn−1 , 1 + b, yn+1 , . . . , yN ) and obtain the equalities ξn (ω )(Xn−1 (ω )b̂ − Xn−1 (ω )) = Vn (ω b ) − Vn−1 (ω ) and ξn (ω )(Xn−1 (ω )â − Xn−1 (ω )) = Vn (ω a ) − Vn−1 (ω ). Derive from here the formulae for ξn (ω ). The formulae for ξn0 (ω ) can be derived from the definition of the self-financing strategy (a numerical sketch of this replication step is given after these hints).
(4) Use item (2).
18.12. (1) Apply backward induction starting at moment N. Use the fact that at moment N the price of the option equals its payoff at this moment (i.e., YN = SN ), and at moment N − 1 it is necessary either to have the capital sufficient for buying at that moment (i.e., SN−1 ), or to have the sum sufficient to buy it at moment N; the cost of this sum at moment N − 1 equals S0N−1 EP∗ (XN /FN−1 ) = S0N−1 EP∗ ((SN /SN0 )/FN−1 ). Statements (2)–(5) follow from item (1) using Problems 15.21–15.23.
18.13. Apply Problem 18.12.
18.15. Apply Problem 18.11.
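The replication step described in Hint 18.8(3) amounts to solving two linear equations, one per successor state. The following one-period sketch works with undiscounted quantities; the prices, rates, and the call payoff are illustrative assumptions.

```python
# Minimal sketch of the replication step behind Hint 18.8(3): at a node of a
# one-period binomial model the stock holding xi and the bank-account amount
# are recovered from the two successor values of the claim.
# Parameter values are illustrative assumptions.

def replicate_one_step(s, r, a, b, v_down, v_up):
    """Return (xi, bank) with xi*s*(1+b) + bank*(1+r) = v_up and
    xi*s*(1+a) + bank*(1+r) = v_down."""
    xi = (v_up - v_down) / (s * (b - a))        # stock units
    bank = (v_up - xi * s * (1 + b)) / (1 + r)  # money kept in the bank account today
    return xi, bank

if __name__ == "__main__":
    s, r, a, b, strike = 100.0, 0.02, -0.1, 0.15, 100.0
    v_up = max(s * (1 + b) - strike, 0.0)    # call payoff after an up move
    v_down = max(s * (1 + a) - strike, 0.0)  # call payoff after a down move
    xi, bank = replicate_one_step(s, r, a, b, v_down, v_up)
    price = xi * s + bank                    # initial capital of the replicating portfolio
    p_star = (r - a) / (b - a)               # risk-neutral probability of the up move
    print(price, (p_star * v_up + (1 - p_star) * v_down) / (1 + r))  # the two agree
```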
Answers and Solutions 18.1. (1) This statement is evident. (2) If π (C) > S(0) then at moment t = 0 the seller of the call option sells it and buys the stock at the price S(0). Therefore, at moment t = T he can pay for the claim concerning the call option, for any market scenario and any strike price. In this case he will obtain a guaranteed profit π (C) − S(0) > 0. Next, if π (P) > K then the seller of the put option sells it at the price π (P) and at moment t = T has a possibility to buy the stock for the stock price K < π (P) in the case where the option will be exercised. In this case he will obtain a guaranteed profit π (P) − K > 0. 18.2. (1) Nonarbitrage price of such an option equals π (C) = S(0) − (K/((1 + r)T )). (2) S(0). 18.3. (1)(a) Suppose that π (P) < K(1 + r)−T − S(0); that is, (π (P) + S(0))(1 + r)T < K. At moment t = 0 it is possible to borrow the sum π (P) + S(0), buy the stock at the price S(0), and buy the option with the strike price K at the price π (P). At moment T we sell the stock at the price K (or even at a higher price, if its market price exceeds K) and return (π (P) + S(0))(1 + r)T as a repayment for the borrowed sum. So, we will have a guaranteed profit not less than K − (π (P) + S(0))(1 + r)T > 0. (b) Suppose that π (C) < S(0) − K(1 + r)−T ; that is, (S(0) − π (C))(1 + r)T > K. At moment t = 0 it is possible to make a short sale of the stock (short sale of a
stock is an immediate sale without real ownership) at the price S(0) and buy the option with the strike price K at the price π (C). At moment T we have the sum (S(0) − π (C))(1 + r)T , so we can buy the stock at the price K (or even at a lower price if its market price is lower than K) and return the borrowed sum. So, we obtain a guaranteed profit not less than (S(0) − π (C))(1 + r)T − K > 0.
18.4. (1) Both relations of the put–call parity are direct consequences of the definition of the martingale measure and the evident equality C(T ) − P(T ) = S(T ) − K.
(2) At moment t = 0 we act as proposed, obtain the sum S(0) + π (P) − π (C), which we put into a bank account, and obtain at moment t = T the sum (S(0) + π (P) − π (C))(1 + r)T . If S(T ) ≥ K we use the call option, buy the stock at the price K, and return the borrowed sum. Possible exercising of the put option will not lead to losses. If S(T ) < K, then after exercising the put option we buy the stock at the price K and return our debt with the help of this stock. We do not exercise the call option. In both cases we have a profit not less than (S(0) + π (P) − π (C))(1 + r)T − K > 0.
(3) At moment t = 0 we borrow the sum S(0) + π (P) − π (C) and act as mentioned in the problem situation. If S(T ) < K then we use the put option, sell the stock at the price K, and return the borrowed sum, the size of which is now (S(0) + π (P) − π (C))(1 + r)T . Possible exercising of a call option will not lead to losses. If S(T ) ≥ K then the call option will be exercised, we sell the stock at the price K, do not exercise the put option, and again can return the borrowed sum, the size of which is now (S(0) + π (P) − π (C))(1 + r)T . In both cases we have a profit of at least K − (S(0) + π (P) − π (C))(1 + r)T > 0.
18.5. (a) (S(T ) − K)+ − (K − S(T ))+ = S(T ) − K. (b) (S(T ) − K1 )+ + π (P)(1 + r)T −t − (K2 − S(T ))+ . (c) 2(S(T ) − K)+ + S(t)(1 + r)T −t . (d) S(T )1IS(T ) 0 (the heat equation, or the diffusion equation), where u(x, 0) = u0 (x) = max{e(k+1)x/2 − e(k−1)x/2 , 0}, v(x, τ ) = exp{−(k − 1)x/2 − (k + 1)2 τ /4}u(x, τ ). Check that the function
u(x, τ ) = (1/(2√(πτ ))) ∫R u0 (y) e−(x−y)²/(4τ ) dy
is the unique solution to the diffusion equation (Cauchy problem) with initial condition u(x, 0) = u0 (x). (3) Obtain the Black–Scholes formula by the inverse change of variables. 19.6. In the framework of the Black–Scholes model consider the discounted capital V (t) = ψ (t)Z(t) + ϕ (t), where Z(t) = B−1 (t)S(t). Prove that the portfolio (ϕ , ψ ) is self-financing if and only if dV (t) = ψ (t)dZ(t). 19.7. Let a market consist of one stock S(t) = S(0) exp{σ W (t) + μ t} and one bond B(t) = exp{rt}. Also, suppose that a filtration F is natural, that is, generated by a Wiener process {W (t),t ∈ [0, T ]}. Prove that this market is arbitrage-free, and for any nonnegative FT -measurable claim X such that EX 2+α < ∞ for some α > 0, there exists a replicating portfolio (ϕ , ψ ). Also, prove that the arbitrage-free price of X at moment t equals
π (X)(t) = B(t) EP∗ (B−1 (T )X|Ft ) = e−r(T −t) EP∗ (X|Ft ), where P∗ is the equivalent martingale measure with respect to which the discounted process {B−1 (t)S(t)} is a martingale.
19.8. Let a stochastic process X be the nominal return, dX(t) = X(t)(α dt + σ dW (t)), and let a stochastic process Y describe the inflation, dY (t) = Y (t)(γ dt + δ dW̃ (t)), where W̃ is a Wiener process independent of W . We suppose that the coefficients α , σ , γ , and δ are constant. Derive the SDE for the real return Z(t) := X(t)/Y (t).
19.9. In the framework of the Black–Scholes model consider a European contingent claim of the form
H = K, if S(T ) ≤ A;
H = K + A − S(T ), if A < S(T ) < K + A;
H = 0, if S(T ) > K + A.
The expiration date of H is supposed to equal T . Define a portfolio consisting of bonds, stocks, and a European call option, that will be constant in time and replicates the claim H. Define the arbitrage-free price of H. 19.10. (1) Using Problem 19.5, choose such a change of variables that permits the reduction of the equation
∂ u/∂ t = ∂ 2 u/∂ x2 + a ∂ u/∂ x + bu,  a, b ∈ R,
to the diffusion one.
(2) Choose such a change of time that permits the reduction of the equation
c(t) ∂ u/∂ t = ∂ 2 u/∂ x2 ,  c(t) > 0, t > 0,
to the diffusion one.
(3) Suppose that σ 2 (·) and r(·) in the Black–Scholes equation are functions of t; however, r(t)/σ 2 (t) does not depend on t. Rewrite the Black–Scholes formula for this case.
19.11. Suppose that in the Black–Scholes equation the functions r(·) and σ 2 (·) are known nonrandom functions of t. Prove that the following steps reduce the Black–Scholes equation to the diffusion one.
(1) Put S = Kex , C = Kv, and t′ = T − t, and obtain the equation
∂ v/∂ t′ = (1/2)σ 2 (t′ ) ∂ 2 v/∂ x2 + ( r(t′ ) − (1/2)σ 2 (t′ ) ) ∂ v/∂ x − r(t′ )v.
(2) Change the time variable as τ̃ (t′ ) = ∫0t′ (1/2)σ 2 (s) ds and obtain the equation
∂ v/∂ τ̃ = ∂ 2 v/∂ x2 + a(τ̃ ) ∂ v/∂ x − b(τ̃ )v,
where a(τ̃ ) = 2r/σ 2 − 1, b(τ̃ ) = 2r/σ 2 .
(3) Prove that the general solution to the first-order partial differential equation of the form
∂ v/∂ τ̃ = a(τ̃ ) ∂ v/∂ x − b(τ̃ )v
can be presented as v(x, τ̃ ) = F(x + A(τ̃ ))e−B(τ̃ ) , where dA(τ̃ )/dτ̃ = a(τ̃ ), dB(τ̃ )/dτ̃ = b(τ̃ ), and F(·) is an arbitrary function.
(4) Prove that the solution to the second-order partial differential equation from item (2) has the form
v(x, τ̃ ) = e−B(τ̃ )V (x̃, τ̃ ),
where x̃ = x + A(τ̃ ), the functions A(τ̃ ) and B(τ̃ ) are taken from item (3), and V is a solution to the diffusion equation ∂ V /∂ τ̃ = ∂ 2V /∂ x̃2 .
(5) Transform the initial data correspondingly to the change of variables.
19.12. Let C(S, t) and P(S, t) be the prices at moment t of a European call and put option, correspondingly, with the same strike price and exercise date.
(1) Prove that both P and C − P satisfy the Black–Scholes equation; moreover, the boundary condition for C − P is extremely simple: C(S, T ) − P(S, T ) = S − K.
(2) Deduce from the put–call parity that S − Ke−r(T −t) is a solution to the Black–Scholes equation with the same boundary condition.
19.13. Use the exact solution to the diffusion equation to find the Black–Scholes price P(S, t) of a put option P(S, T ) = (K − S)+ without using the put–call parity.
19.14. (1) Prove that in the case where the initial condition of the boundary value problem for the heat equation is positive, then u(x, τ ) > 0 for any τ > 0.
(2) Deduce from here that for any option with positive payoff its price is also positive, if it satisfies the Black–Scholes equation.
19.15. (1) In the framework of the Black–Scholes model find the arbitrage-free option price with the payoff f (S(T )), where the function f ∈ C(R) and increases at infinity not faster than a polynomial.
(2) Find the arbitrage-free option price with the payoff of the form BH (K − S(T )), where H (s) = 1Is≥0 is the Heaviside function and B > 0 is some constant (the "cash-or-nothing" option).
(3) The European digital call option of the kind asset-or-nothing has the payoff S(T ) in the case S(T ) > K, and zero payoff in the case S(T ) ≤ K. Find its price.
19.16. What is the probability that a European call option will expire in-the-money?
19.17. Calculate the price of a European call option on an asset with dividends, if the dividend yield equals rD on the interval [0, T ].
19.18. What is the put–call parity relation for options on the asset with dividends?
19.19. What is the delta for the call option with the continuous and constant dividend yield rD ?
19.20. Calculate Δ , Γ , θ , ρ , and V for put and call options.
19.21. On the Black–Scholes market a company issued an asset "Golden logarithm" (briefly GLO). The owner of GLO(T ) with the expiration date T receives at moment T the sum log S(T ) (in the case S(T ) < 1 the owner pays the corresponding sum to the company). Define the price process for GLO(T ).
19.22. Let the functions r(x) and u(x) denote the interest rate and the process (flow) of the cash receipt, correspondingly, under the condition that the initial value of a risk asset X(0) = x, where X is a time-homogeneous diffusion process with the drift
coefficient μ = μ (x) and diffusion coefficient σ = σ (x); μ and σ are continuous functions, and σ (x) ≠ 0 for all x ∈ R. Suppose that u is bounded and continuous.
(1) Write the stochastic differential of the process X.
(2) Check that the function
u(t, x) := E( e−∫0t r(X(s))ds u(X(t)) | X(0) = x ),  t ∈ R+ ,
can be considered as the expected discounted cash flow at moment t under the condition that X(0) = x.
(3) Find a partial differential equation for which the function u(t, x) is a solution.
19.23. (When is the right time to sell the stocks?) Suppose that the stock price {S(t), t ∈ R+ } is a diffusion process of the form dS(t) = rS(t)dt + σ S(t)dW (t), S(0) = x > 0 (for the explicit form of S(t) see Problem 14.3). Here W is a one-dimensional Wiener process, r > 0, and σ ≠ 0. Suppose that there is a fixed transaction cost a > 0 connected with the sale of the asset. Then, taking inflation into account, the discounted asset price at moment t equals e−ρ t (S(t) − a). Find the optimal stopping time τ0 for which
Es,x e−ρτ0 (S(τ0 ) − a) = supτ Es,x e−ρτ (S(τ ) − a) = supτ Es,x g(τ , S(τ )),
where g(t, y) = e−ρ t (y − a).
19.24. (Vasicek stochastic model of interest rate) According to the Vasicek model, the interest rate r(·) satisfies the SDE dr(t) = (b − ar(t))dt + σ dW (t), where W is a Wiener process.
(1) Find an explicit form of r(·).
(2) Find the limit distribution of r(t) as t → ∞ (a simulation sketch follows Problem 19.25).
19.25. (Cox–Ingersoll–Ross stochastic model of interest rate) According to the Cox–Ingersoll–Ross model, the interest rate r(·) satisfies the SDE dr(t) = (α − β r(t))dt + σ √r(t) dW (t), where W is a Wiener process, α > 0, and β > 0. The process {r(t)} is also called the square of the Bessel process. (Concerning the existence and uniqueness of the strong solution to this equation see Problem 14.16.)
(1) Derive the SDE for {√r(t)} in the case α = 0.
(2) Suppose that a nonrandom function u(·) satisfies the ordinary differential equation u′ (t) = −β u(t) − (σ 2 /2)u2 (t), u(0) = θ ∈ R. Fix T > 0 and assume that α = 0. Find the differential equation for the function G(t) = E exp{−u(T − t)r(t)}. Calculate the mean value and variance of r(T ).
(3) In the general case, calculate the density and moment-generating function for the distribution of r(t).
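Problem 19.24 can be explored numerically: the explicit solution (see (19.3) in the Answers below) implies that the one-step transition of r is Gaussian and that r(t) converges in distribution to a normal law with mean b/a and variance σ 2 /(2a). The sketch below simulates the exact transition and compares sample moments with these limits; all parameter values are illustrative assumptions.

```python
# Minimal sketch for Problem 19.24: simulate the Vasicek rate by the exact
# Gaussian one-step transition and compare the long-run sample mean and
# variance with b/a and sigma^2/(2a). All numbers are illustrative assumptions.
import math
import random
import statistics

random.seed(1)
a, b, sigma, r0 = 0.8, 0.04, 0.02, 0.10
dt, n_steps, n_paths = 0.02, 1_000, 2_000

terminal = []
for _ in range(n_paths):
    r = r0
    for _ in range(n_steps):
        # r(t + dt) given r(t) is Gaussian, by the explicit solution (19.3)
        mean = b / a + (r - b / a) * math.exp(-a * dt)
        std = sigma * math.sqrt((1.0 - math.exp(-2.0 * a * dt)) / (2.0 * a))
        r = random.gauss(mean, std)
    terminal.append(r)

print("sample mean", statistics.fmean(terminal), "vs b/a =", b / a)
print("sample var ", statistics.pvariance(terminal), "vs sigma^2/(2a) =", sigma**2 / (2 * a))
```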
Hints 19.2. Yes, for a call option it is true. To check this statement it is necessary to prove that if we fix all other parameters, then the stochastic process {Yt := e−rt (St − K)+ ,t ≥ 0} becomes a submartingale with respect to the natural filtration and to the risk-neutral measure. For put options the situation becomes more complicated, and the answer is negative in the general case. 19.3. I method. Prove that under opposite inequalities the arbitrage is possible. II method. Directly use the form of the solution to the Black–Scholes equation. 19.6. Verify directly the definition of the self-financing property. 19.10. (3) Choose a change of time in order to reduce the equation to the diffusion one, and then use the usual Black–Scholes formula or apply Problem 19.11. 19.15. (2), (3) Substitute the corresponding function f into the formula obtained in item (1). 19.16. This is the probability of the event {S(T ) ≥ K}, and the distribution of log S(T ) is Gaussian. 19.18. Solve the Black–Scholes equation for C − P (with dividends), using the boundary condition C(S, T ) − P(S, T ) = S − K. 19.22. Apply Problem 14.14. 19.24. (2) Use the equality (19.3) (see Answers and Solutions to this chapter) and the fact that the integral on the right-hand side has a Gaussian distribution.
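Hint 19.16 reduces Problem 19.16 to a Gaussian probability, which is easy to evaluate. In the sketch below the choice m = log S(0) + (r − σ 2 /2)T , v = σ 2 T corresponds to the risk-neutral Black–Scholes dynamics and is shown only as an illustration; the numerical inputs are assumptions.

```python
# Minimal sketch for Hint 19.16: if log S(T) is Gaussian with mean m and
# variance v, then P(S(T) >= K) = Phi((m - log K) / sqrt(v)).
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_in_the_money(s0, strike, r, sigma, maturity):
    m = math.log(s0) + (r - 0.5 * sigma**2) * maturity   # mean of log S(T) under P*
    v = sigma**2 * maturity                              # variance of log S(T)
    return normal_cdf((m - math.log(strike)) / math.sqrt(v))

print(prob_in_the_money(s0=100.0, strike=110.0, r=0.03, sigma=0.25, maturity=1.0))
```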
Answers and Solutions
19.1. ES(T ) = (σ √(2π ))−1 exp{a + σ 2 /2}.
19.7. It is possible to construct the martingale measure P∗ by the Girsanov theorem (see Problems 14.25 and 14.22). For this purpose it is necessary to put
dP∗ /dP := exp{ −((μ − r)/σ + σ /2) W (t) − (t/2)((μ − r)/σ + σ /2)2 }.
Then Novikov's condition evidently holds on the finite interval [0, T ], the stochastic process W̃ (t) := W (t) + ((μ − r)/σ + σ /2)t is a Wiener process under P∗ with respect to the same filtration, and the discounted process S(t)/B(t) is a martingale with respect to the measure P∗ and the same filtration. Now, because EX 2+α < ∞ for some
α > 0, it is easy to check with the help of the Hölder inequality that the claim X is square integrable with respect to the measure P∗ . Put V (t) = EP∗ (B−1 (T )X|Ft ). Because the filtration F is generated by a Wiener process, then, for example, by Theorem 5.13 [57], the representation V (t) = EV (0) + ∫0t β (s)dW (s) holds. Now, it is necessary to put ψ (t) = σ −1 β (t)B(t)S−1 (t) and ϕ (t) = V (t) − ψ (t)S(t)B−1 (t). Furthermore, you can check on your own with the help of the Itô formula that the following equations hold: B(t)V (t) = ϕ (t)B(t) + ψ (t)S(t), whence, in particular, X = ϕ (T )B(T ) + ψ (T )S(T ); that is, our portfolio replicates X; and also d(B(t)V (t)) = B(t)dV (t) + V (t)dB(t) = ϕ (t)dB(t) + ψ (t)dS(t), which is equivalent to the self-financing property of the strategy (ϕ , ψ ).
19.8. dZ(t) = Z(t)((α − γ + δ 2 )dt + σ dW (t) − δ dW̃ (t)).
19.9. Write H in the form H = K · 1 − (S(T ) − A)+ + (S(T ) − A − K)+ . Thus, the desired portfolio can be constructed from K bonds of the price 1 each, a short position in the option (S(T ) − A)+ (i.e., this option must be sold), and a long position in the option (S(T ) − A − K)+ (i.e., this option must be bought). Hence the arbitrage-free price of the claim H at moment t equals π (H)(t) = Ke−r(T −t) − π ((S(T ) − A)+ ) + π ((S(T ) − A − K)+ ), where the arbitrage-free prices of the options mentioned above have to be defined by the Black–Scholes formula.
19.15. (1) The required arbitrage-free price equals
(1/(σ √(2π ))) ∫R f (ey ) exp{ −(y − μ )2 /(2σ 2 ) } dy.
19.21. The required price process has a form π (t) = log S(t) + (r − σ 2 /2)(T − t).
19.23. The infinitesimal operator L̂ of the process Y (t) = (s + t, S(t)) is given by the formula
L̂ f (s, x) = ∂ f /∂ s + rx ∂ f /∂ x + (1/2)σ 2 x2 ∂ 2 f /∂ x2 ,  f ∈ C2 (R2 ).
Therefore, in our case L̂g(s, x) = e−ρ s ((r − ρ )x + ρ a), whence
U := {(s, x) | L̂g(s, x) > 0} = R × R+ , if r ≥ ρ , and U = {(s, x) | x < ρ a/(ρ − r)}, if r < ρ .
So, if r ≥ ρ , then U = Γ c = R × R+ , and there is no optimal stopping time. If r > ρ , then vg = ∞, and for r = ρ it holds vg (s, x) = xe−ρ s (prove these statements). Consider the case r < ρ and prove that the set Γ c is invariant in t; that is, Γ c + (t0 , 0) = Γ c for all t0 . Indeed,
Γ c + (t0 , 0) = {(t + t0 , x) | (t, x) ∈ Γ c } = {(s, x) | (s − t0 , x) ∈ Γ c } = {(s, x) | g(s − t0 , x) < vg (s − t0 , x)} = {(s, x) | eρ t0 g(s, x) < eρ t0 vg (s, x)} = {(s, x) | g(s, x) < vg (s, x)} = Γ c . Here the equalities vg (s − t0 , x) = supτ Es−t0 ,x e−ρτ (S(τ ) − a) = supτ Ee−ρ (τ +(s−t0 )) (S(τ ) − a) = eρ t0 supτ Ee−ρ (τ +s) (S(τ ) − a) = eρ t0 vg (s, x) were used. Therefore, a connected component of the set Γ c containing U must have the form Γ c (x0 ) = {(t, x) | 0 < x < x0 }, for some x0 > aρ /(ρ − r). Note that Γ c cannot have any other components, because another component V of the set Γ c must satisfy L̂g < 0 in V , and then for y ∈ V the relation
Ey g(Y (τ )) = g(y) + Ey ∫0τ L̂g(Y (t)) dt < g(y)
holds
for all stopping times bounded by the exit time from a strip in V . So, it follows from Theorem 15.2, item (2), that vg (y) = g(y), and then V = ∅. Put τ (x0 ) = τΓ c (x0 ) and calculate gx0 (s, x) := Es,x g(Y (τ (x0 ))). This function is a solution to the boundary value problem
∂ f /∂ s + rx ∂ f /∂ x + (1/2)σ 2 x2 ∂ 2 f /∂ x2 = 0, 0 < x < x0 ,
f (s, x0 ) = e−ρ s (x0 − a).     (19.1)
If we try a solution of (19.1) of the form f (s, x) = e−ρ s ϕ (x), we get the following one-dimensional problem
−ρϕ (x) + rxϕ ′ (x) + (1/2)σ 2 x2 ϕ ′′ (x) = 0, 0 < x < x0 ,
ϕ (x0 ) = x0 − a.     (19.2)
The general solution to the equation (19.2) has a form
ϕ (x) = C1 xγ1 + C2 xγ2 , where Ci , i = 1, 2, are arbitrary constants, and
γi = σ −2 ( (1/2)σ 2 − r ± √( (r − (1/2)σ 2 )2 + 2ρσ 2 ) ),  γ2 < 0 < γ1 .
Because the function ϕ (x) is bounded as x → 0, it should hold that C2 = 0, and the boundary requirement gives C1 = x0−γ1 (x0 − a). Hence,
gx0 (s, x) = f (s, x) = e−ρ s (x0 − a)(x/x0 )γ1 .
If we fix (s, x), then the maximal value of gx0 (s, x) is attained at x0 = xmax = aγ1 /(γ1 − 1) (here γ1 > 1 if and only if r < ρ ). At last,
vg (s, x) = supx0 Es,x g(τ (x0 ), X(τ (x0 ))) = supx0 gx0 (s, x) = gxmax (s, x).
The conclusion is that one should sell the stock at the first moment when the price of it reaches the value xmax = aγ1 /(γ1 − 1). The expected discounted profit obtained from this strategy equals γ1 − 1 γ1 −1 x γ1 . vg (s, x) = e−ρ s a γ1 19.24. (1) r(t) =
b b + r(0) − e−at + σ a a
t 0
e−a(t−s) dW (s).
(19.3)
( r(t). Then β σ 1 σ2 q(t) + dt + dW (t). dq(t) = − 2 8 q(t) 2
19.25. (1) Denote q(t) =
(2) The function G satisfies the differential equation (in the integral form) G(t) = exp{−θ r(T )} − tT G(s)r(s)(u (T − s) + β u(T − s) + (σ 2 /2)u2 (T − s))ds = E exp {−θ r(T )}; that is, in fact, it does not depend on t. Now, it is easy to prove (please do it yourself) that θ β e−β t . u(t) = 2 β + σ2 θ (1 − e−β t ) Now, it is necessary to write the equality G(0) = E exp{−u(T )r(0)} = E exp{−θ r(T )} and to take the derivative of the left-hand and right-hand sides of this equality in θ to find the corresponding moments. (3) The density of distribution of r(t) equals ft (x) = ce−c(u+x) q=
2α σ2
q/2 x u
√ Iq (2c xr), x ≥ 0, c =
− 1,
and the function Iq (x) =
∞
2β , σ 2 (1−e−β t )
u = r(0)e−β t ,
(x/2)2k+q
∑ k!Γ (k + q + 1)
k=0
is a modified Bessel function of the first kind and order q. This is a noncentral χ 2 distribution with 2(q + 1) degrees of freedom and the skew coefficient 2cu. The density of distribution of r(t) can be also presented in the form
326
19 Stochastic processes in financial mathematics (continuous time)
ft (x) =
∞
∑ (((cu)k )/k!)e−cu gk+q+1,c (x),
k=0
where the function gγ ,λ (x) = (1/(Γ (γ )))λ γ xγ −1 e−λ x is the density of Gamma distribution Γ (γ , λ ). The moment generating function equals m(ν ) =
c q+1 " cuν # . exp c−ν c−ν
(Noncentral χ 2 distributions are considered in detail in [42].)
20 Basic functionals of the risk theory
Theoretical grounds Mathematical foundations of investigating of the risk process in insurance were created by Swedish mathematician Filip Lundberg in 1903–1909. For a long time this theory had been developed by mostly Nordic mathematicians, such as Cram´er, Segerdal, Teklind, and others. Later on risk theory started to develop not only with connection to insurance but also as the method of solving different problems in actuarial and financial mathematics, econometrics. In the second half of the twentieth century the applied area of risk theory was expanded significantly. Processes with independent increments play a very important role in risk theory. The definitions and the characteristics of the homogeneous processes with independent increments are presented in Chapter 5. Not general multidimensional processes but rather real-valued ones with independent increments and with the jumps of the same sign are used in queueing and risk theory. In particular, stepwise processes ξ (t) with jumps ξk satisfying one of the following conditions, E eiαξ1 /ξ1 > 0 =
c , c − iα
E eiαξ1 /ξ1 < 0 =
b , b + iα
(20.1)
have a range of application. Theorem 20.1. The L´evy process {ξ (t), t ≥ 0} is piecewise constant (stepwise) with probability one if and only if its L´evy measure Π satisfies the condition Π (R\{0}) < ∞ and the characteristic function in the L´evy–Khinchin formula (Theorem 5.2) of an increment ξ (t) − ξ (0) is as follows, Eeiα (ξ (t)−ξ (0)) = et ψ (α ) ,
(20.2)
with the cumulant function ψ (α ) determined by the relation
ψ (α ) =
∞ eiα x − 1 Π (dx). −∞
(20.3)
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 327 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 20,
328
20 Basic functionals of the risk theory
Theorem 20.2. The L´evy process {ξ (t),t ≥ 0} is a nondecreasing function of time with probability one if and only if ∞ eiα x − 1 Π (dx), (20.4) Eeiα (ξ (t)−ξ (0)) = et ψ (α ) , ψ (α ) = iα a + 0
where a ≥ 0 and the measure Π satisfies the condition
1 0
xΠ (dx) < ∞.
Theorem 20.3. The process {ξ (t),t ≥ 0} with independent increments has a bounded variation with probability one on any bounded interval if and only if eiα x − 1 Π (dx), (20.5) Eeiα (ξ (t)−ξ (0)) = et ψ (α ) , ψ (α ) = iα a + where a ∈ R,
R
1
−1 |x|Π (dx) < ∞.
Theorem 20.4. The characteristic function of the compound Poisson process N(t)
ξ (t) = at + ∑ ξk , (ξ0 = ξ (0) = 0),
(20.6)
k=0
where {ξk , k ≥ 1} are i.i.d. random variables independent of the simple Poisson process N(t) with intensity λ > 0, can be expressed in the form of identity (20.2) with the following cumulant function, eiα x − 1 dF(x), F(x) = P(ξ1 < x), x ∈ R. ψ (α ) = iα a + λ (20.7) R
Definition 20.1. The compound Poisson (20.6) process with jumps of the same sign is said to be: (1) Upper continuous if a > 0, P(ξk < 0) = 1. (2) Lower continuous if a < 0, P(ξk > 0) = 1. We call such kinds of processes semicontinuous. Definition 20.2. The compound Poisson process (20.6) with a ≤ 0 is said to be almost upper semicontinuous if the first condition in (20.1) is satisfied with c > 0. The processes (20.6) with a ≥ 0 are said almost lower semicontinuous if the second condition holds true in (20.1) with b > 0. If for these processes a = 0, then they are called stepwise almost upper or lower semicontinuous. Let us introduce the basic notions connected with risk processes and their basic characteristics. Definition 20.3. The classic risk process (or the reserve process) is the process N(t)
Ru (t) = ξu (t) = u +Ct − ∑ ξk , C > 0, u > 0, P(ξk > 0) = 1,
(20.8)
k=0
which describes the reserved capital of an insurance company at time t. Here u is N(t) an initial capital, C is a gross risk premium rate. The process S(t) = ∑k=0 ξk (ξ0 = S(0) = 0) determines the outpayments of claims with mean value 0 < μ = Eξ1 < ∞.
20 Basic functionals of the risk theory
329
Definition 20.4. The safety security loading is the number
δ=
Eξ (1) C − λ μ = > 0, ξ (t) = ξ0 (t), ES(1) λμ
(20.9)
(Here Eξ (1) = m := C − λ μ .) Definition 20.5. The claim surplus process is the process
ζ (t) = u − ξu (t) = S(t) −Ct.
(20.10)
Let us denote the extremums of the processes ξ (t) and ζ (t) as
ξ ± (t) = sup (inf) ξ (t ); 0≤t ≤t
±
ξ = sup (inf) ξ (t);
ζ ± (t) = sup (inf) ζ (t ); 0≤t ≤t
±
ζ = sup (inf) ζ (t);
0≤t 0).
(20.12)
It can be written in the terms of distributions of the extremums as follows:
Ψ (u) = P(ζ + > u) = P(τ + (u) < ∞) = P(ξ − < −u).
(20.13)
Definition 20.7. The ruin probability with finite horizon [0, T ] is the probability
Ψ (u, T ) = P(ζ + (T ) > u) = P(τ + (u) < T ).
(20.14)
Besides the extreme values (20.11), other boundary functionals are also used in risk theory. In particular, the following overjump functionals are used:
γ + (u) = ζ (τ + (u)) − u − the value of overjump; γ+ (u) = u − ζ (τ + (u) − 0) − the value of lowerjump; γu+
(20.15)
+
= γ (u) + γ+ (u), u > 0;
here γu+ is the value of the jump covering the level u. Let γ+ (u) be the lowerjump of the stepwise process ζ (t) with a = 0 under the condition that the first jump ξ1 > u crossed the level u ≥ 0. Then γ+ (u) takes up the fixed value u with positive probability P(γ+ (u) = u, ζ + > u) = F(u) = P(ξ1 > u) > 0. Let us also mention that all boundary functionals (20.11), (20.15) have their own interpretation in risk theory, namely: τ + (u) is the ruin time. γ + (u) is the security of ruin. γ+ (u) is the surplus ζ (t) prior to ruin. γu+ is the claim causing ruin.
330
20 Basic functionals of the risk theory
(Figures 2 and 3, page 360 contain the graphs of the process ξu (t) and ζu (t), and of the functionals mentioned above). Denote by τ (u) = inf{t > τ + (u) | ξu (t) > 0} the time of returning of ξu (t) after the ruin into the half-plane Π + = {y ≥ 0}, τ (u) − τ + (u), τ + (u) < ∞, (20.16) T (u) = τ + (u) = ∞. ∞, T (u) is said to be the first “red period”, determining the first duration of ξu (t) being in the risk zone Π − = {y < 0} or ζ (t) being in the risk zone Πu+ = {x > u}. (Figure 6, page 363 contains the graphs of these functionals for the classic risk process). The risk zone Πu+ = {x > u} and the survival zone {x ≤ u} for the process ζ (t) are divided by the “critical” boundary x = u. Let Z + (u) = Z1+ (u) =
ζ (t),
sup
τ + (u)≤tu ds,
0
(20.18)
which determines for t → ∞ the total duration of the “red period” Qu (∞) =
∞
1Iζ (s)>u ds.
0
Let us note that the functionals (20.16)–(20.18) are needed to study the behavior of the risk processes after the ruin. This need is explained by the possibility for the insurance agency to function even after the ruin. It can borrow some capital. In order to estimate the predicted loan, it is important to know the distribution of these functionals. To study the distributions of the functionals from (20.11), (20.15)–(20.18), we need the results from the theory of boundary problems for the processes with independent increments which can be found in [7, 33, 50]. Let {ξ (t), t ≥ 0} (ξ (0) = 0) be a general real-valued homogeneous process with independent increments that has the following characteristic function in the L´evy– Khinchin form: Eeiαξ (t) = et ψ (α ) ,
σ2 ψ (α ) = iαγ − α 2 + 2
∞ −∞
iα x
e
iα x −1− 1 + x2
Π (dx),
(20.19)
20 Basic functionals of the risk theory 0 t) = e−st ,
s > 0, t > 0,
ϕ (s, α ) = Eeiαξ (θs ) ,
ϕ± (s, α ) = Eeiαξ
± (θ
s)
,
and consider a randomly stopped process ξ (θs ). The introduction of θs allows us to write in the short form the Laplace–Karson transform of the distributions of ξ (t), ξ ± (t) and their characteristic function. In particular, P(s, x) := P(ξ (θs ) < x) = s Eeiαξ (θs ) = s
∞ 0
∞ 0
e−st P(ξ (t) < x) dt, x ∈ R,
e−st Eeiαξ (t) dt, Eeiαξ
± (θ
s)
=s
∞ 0
e−st Eeiαξ
± (t)
dt.
It is easy to prove that
ϕ (s, α ) = Eeiαξ (θs ) =
s . s − ψ (α )
(20.20)
Theorem 20.5. The following main factorization identity holds true for the characteristic function of ξ (θs )(see Theorem 2.2 in [33]):
ϕ (s, α ) = ϕ+ (s, α )ϕ− (s, α ), ±∞ iα x ± e − 1 dNs (x) , ϕ± (s, α ) = exp ± Ns+ (x) = − Ns− (x) =
0
∞ 0 ∞
(20.21) (20.22)
0
e−st t −1 P(ξ (t) > x) dt,
x > 0,
e−st t −1 P(ξ (t) < x) dt, x < 0.
The relations (20.22) are called the Spitzer–Rogozin identities. The characteristic functions ϕ± (s, α ) in them are expressed via the complicated transformations of the distributions for the positive (negative) values of ξ (·). Let us mention that the following supplements to the extremums of the process
ξˆ ± (t) = ξ (t) − ξ ∓ (t), ξˆ ± (θs ) = ξ (θs ) − ξ ∓ (θs ) satisfy the following relations, d ξˆ ± (θs ) = ξ ± (θs );
that is,
E exp{iα ξˆ ± (θs )} = ϕ± (s, α ).
332
20 Basic functionals of the risk theory
It means that the components of the main factorization identity (20.22) can be interpreted as the characteristic functions of the supplements ξˆ ± (θs ). Later in the text we use the following notations, P+ (s, x) := P(ξ + (θs ) < x), x > 0; P− (s, x) := P(ξ − (θs ) < x), x < 0. We denote as θv the exponentially distributed random variable (independent of θs and ξ (t) ) with a parameter v > 0. The following statement on the second factorization identity holds true. Theorem 20.6. The joint distribution of the pair {τ + (·), γ + (·)} is determined by the moment generating function +
+
Ee−sτ (θv )−zγ (θv ) 1Iτ + (θv ) 0). = v−z ϕ+ (s, iz)
(20.23)
It is easy to determine the moment generating functions inverting (20.23) in v: Ee−sτ
+ (x)−zγ
+ (x)
−zγ+ (x)
Ee where
1Iτ + (x)0 = Ee−zγ
+ (0)
(because P(ξ + >
0) = P(ξ + = +∞) = 1), and the distributions γ + (0) and γ + (∞) = limu→∞ γ + (u) are connected by the relation + 1 −zγ + (0) 1 − Ee . (20.26) Ee−zγ (∞) = zEγ + (0)
20 Basic functionals of the risk theory
333
If m < 0, then the moment generating function of the absolute maximum can be expressed by the generalized Pollaczek–Khinchin formula / + + p+ , g0 (z) = E e−zγ (0) ξ + > 0 , (20.27) Ee−zξ = 1 − q+ g0 (z) q+ (s) = P(ξ + (θs ) > 0) → q+ = P(ξ + > 0), 0 < q+ < 1, s→0
p+ = 1 − q+ .
The complicated dependence of the distribution of the positive (negative) values of the process for the positive (negative) components ϕ± (s, α ) of the main factorization identity becomes considerably simpler for the semicontinuous and almost semiconthe jumping part of tinuous processes. Later we consider only the processes ξ (t), which has the bounded variation. For such kind of processes |x|≤1 |x|Π (dx) < ∞ and the cumulant is as follows.
ψ (α ) = iα a −
σ2 2 α + 2
eiα x − 1 Π (dx), R
σ 2 ≥ 0.
(20.28)
Thus, the drift coefficient a = 0 and jumps of the process have different signs for the semicontinuous processes with σ 2 = 0. For almost semicontinuous processes with σ 2 = 0 only the exponentially distributed jumps and drift coefficient a have different signs. Let us denote k(r) := ψ (−ir) and write the Lundberg equation k(r) − s = 0, s ≥ 0.
(20.29)
For upper (lower) semicontinuous and almost semicontinuous processes due to the convexity of k(r) in the neighborhood of r = 0 the equation (20.29) has only one positive (negative) root rs = ±ρ± (s) which completely determines ϕ± (s, α ). Theorem 20.8. The following relations hold for the upper continuous nonmonotonic process ξ (t) with the cumulant (20.28), where Π (dx) = 0 for x > 0 and m = k (0):
ϕ+ (s, α ) =
ρ+ (s) , k(ρ+ (s)) = s, ρ+ (s) − iα
P+ (s, x) := P(ξ + (θs ) > x) = e−ρ+ (s)x , x ≥ 0,
ρ+ (s) → 0 for m ≥ 0; s→0
ρ+ (0) = m−1
(20.30)
for m > 0.
The distribution of ξ − (θs ) can be determined by the relation P− (s, x) =
1 P (s, x) + P(s, x), x < 0. ρ+ (s)
(20.31)
If σ 2 > 0, then the derivative P (s, x) (in x) exists for all x ∈ R1 . If σ 2 = 0, a > 0, and ξ (t) have the bounded variation, then the derivative P (s, x) exists only for x = 0 and
334
20 Basic functionals of the risk theory
s P (s, +0) − P (s, −0) = , a p− (s) := P(ξ − (θs ) = 0) =
s > 0, aρ+ (s)
(20.32)
1 m , p− (0) = , m > 0, m a where a is a constant drift from (20.28).
ρ+ (0) =
Theorem 20.9. The following relations hold for the lower continuous nonmonotonic process ξ (t) with the cumulant (20.28), where Π (dx) = 0 for x < 0 and m = k (0):
ϕ− (s, α ) =
ρ− (s) , k(−ρ− (s)) = s, ρ− (s) + iα
ρ− (s) → 0 for m ≤ 0; ρ− (0) = m−1 for m < 0,
(20.33)
s→0
P− (s, x) = eρ− (s)x , x ≤ 0. The distribution of ξ + (θs ) is determined by the relation P+ (s, x) =
1 P (s, x) + P(s, x), x > 0. ρ− (s)
If σ 2 = 0, a < 0, then p+ (s) = P(ξ + (θs ) = 0) = p+ (s) → p+ = s→0
s > 0, |a|ρ− (s)
(20.34)
|m| λμ , q+ = , m < 0. |a| |a|
Theorem 20.10. For the almost upper semicontinuous process ξ (t) satisfying the first condition in (20.1) with c > 0 and with cumulant function of the form
ψ (α ) = cλ1
∞ 0 eiα x − 1 e−cx dx + (eiα x − 1)Π (dx), λ1 > 0, −∞
0
the following relations holds:
ϕ+ (s, α ) =
p+ (s)(c − iα ) , ρ+ (s) = cp+ (s), k(ρ+ (s)) = s, ρ+ (s) − iα P+ (s, x) = q+ (s)e−ρ+ (s)x , x ≥ 0,
ρ+ (s) → 0 if m ≥ 0; s→0
P− (s, x) =
ρ+ (0) = m−1 ,
p+ (0) = (cm)−1 ,
(20.35)
m = Eξ (1) = k (0) > 0.
∞ 1 P(s, x) − cq+ (s) e−cy P(s, x − y) dy , x < 0. p+ (s) 0
(20.36)
20 Basic functionals of the risk theory
335
Theorem 20.11. For the almost lower semicontinuous process ξ (t) satisfying the second condition in (20.1) with b > 0 and with cumulant function of the form
ψ (α ) = bλ2
0 ∞ eiα x − 1 ebx dx + (eiα x − 1)Π (dx), λ2 > 0, −∞
0
the following relations hold true.
ϕ− (s, α ) =
p− (s)(b + iα ) , ρ− (s) = bp− (s), k(−ρ− (s)) = s, ρ− (s) + iα
ρ− (0) = |m|−1 ,
p− (0) = (b|m|)−1 , m = k (0) < 0;
(20.37)
ρ− (s)x
P− (s, x) = q− (s)e , x < 0, ρ− (s) → 0 if m ≤ 0. s→0 0 1 P+ (s, x) = P(s, x) − bq− (s) eby P(s, x − y) dy , x > 0. p− (s) −∞
(20.38)
The corresponding results for the distributions of the absolute extremums are simple corollaries from Theorems 20.8–20.11. Corollary 20.1. The following relations are true for the upper continuous processes ξ (t) in accordance to the sign of m = Eξ (1) (or to the sign of safety security loading δ = m/(λ μ )): (1) If m > 0 then ρ+ (s) → 0, ρ+ (s)s−1 → m−1 , P(ξ + = +∞) = 1. It follows s→0
s→0
from (20.31) that for s → 0 we obtain ∞ − P(ξ < x) = m P(ξ (t) < x) dt , x < 0. 0
(20.39)
x
( √ (2) If m = 0 then we have for s → 0 that ρ+ (s) ≈ 2sσ1−1 , σ1 = Dξ (1), P(ξ ± = ±∞) = 1. (3) If m < 0 then ρ+ (s) → ρ+ > 0 and, according to (20.30), we obtain the following relation as s → 0,
s→0
P(ξ + > x) = e−ρ+ x , x ≥ 0, P(ξ − = −∞) = 1.
(20.40)
Corollary 20.2. The following is true for the lower continuous processes in accordance with the sign of m (or with the sign of δ ). (1) If m < 0, then ρ− (s) → 0, ρ− (s)s−1 → |m|−1 and P(ξ − = −∞) = 1. Acs→0
s→0
cording to (20.34) we obtain the following relations as s → 0: ∞ + P(ξ > x) = m P(ξ (t) > x) dt , x > 0. 0
(20.41)
x
√ (2)(If m = 0, then we obtain the following relations as s → 0: ρ± (s) ≈ 2sσ1−1 , σ1 = Dξ (1), P(ξ ± = ±∞) = 1. (3) If m > 0, then ρ− (s) → ρ− > 0 and according to (20.33) we obtain the following relations as s → 0:
s→0
P(ξ − < x) = eρ− x , x ≤ 0, P(ξ + = +∞) = 1.
(20.42)
336
20 Basic functionals of the risk theory
Note that the process ξ (t) = at + σ W (t) is both upper and lower continuous. Thus, its characteristic functions ξ ± (θs ) are determined by the first formulas in (20.30) and (20.33). Corollary 20.3. The following relations hold true for the almost upper semicontinuous process ξ (t) with a = 0 in accordance with the sign of m (or with the sign of δ ). (1) If m > 0 then ρ+ (s) → 0, ρ+ (s)s−1 → ρ+ (0) = m−1 , P(ξ + = +∞) = 1. s→0
s→0
According to (20.36) for x < 0
lim P− (s, x) = P(ξ − < x) ∞ = cm P(ξ (t) < x) dt − c
s→0
0
∞ ∞
−cy
e
0
0
P(ξ (t) < x − y) dt dy . (20.43)
√ (2) If m = 0 then tending s → 0 we obtain that ρ+ (s) ≈ 2sσ1−1 , P(ξ ± = ±∞) = 1. (3) If m < 0 then ρ+ (s) → ρ+ = cp+ , p+ = P(ξ + = 0) = c|m|/λ > 0, q+ = 1 − p+ s→0
and according to (20.35) P(ξ + > x) = q+ e−ρ+ x , x > 0, P(ξ − = −∞) = 1.
(20.44)
Corollary 20.4. The following relations are true for the almost lower semicontinuous process ξ (t) with a = 0. (1) If m < 0 then ρ− (s) → 0, ρ− (s)s−1 → |m|−1 , P(ξ − = −∞) = 1, p+ = s→0
b|m|/|λ |, q+ = 1 − p+ . According to (20.38) we have for x > 0:
s→0
lim P+ (s, x) = P(ξ + > x) ∞ P(ξ (t) > x) dt − b = b|m|
s→0
0
0
∞ ∞ x
eb(x−y) P(ξ (t) > y) dy dt . (20.45)
√ (2) If m = 0 then given s → 0, ρ− (s) ≈ 2sσ1−1 , P(ξ ± = ±∞) = 1. (3) If m > 0 then ρ− (s) → ρ− = bp− , p− = P(ξ − = 0) = (bm)/λ > 0 and s→0
according to (20.37) for x < 0
P(ξ − < x) = q− eρ− x , q− = 1 − p− , P(ξ + = +∞) = 1.
(20.46)
Let us define the Laplace–Karson transform of the ruin probability with finite horizon (20.14):
Ψs (u) = s
Then, according to (20.34),
∞
0
e−stΨ (u,t) dt.
20 Basic functionals of the risk theory
Ψs (u) =
1 P (s, −u) + P(s, −u), u > 0 ρ− (s)
337
(20.47)
for the upper continuous classic ruin process ξu (t) assigned by the formula (20.8). And it follows from (20.39) that for m > 0 and s → 0, ∞ Ψ (u) = lim Ψs (u) = m P(ξ (t) < x) dt = P(ξ − < −u). (20.48) s→0
0
x=−u
We should take into account that the jumps ξk are positive for the lower continuous claim surplus process ζ (t) = S(t) −Ct (see (20.10)). On the other hand, for the almost lower semicontinuous risk process we have that
ζ (t) = S(t) −C(t), C(t) =
N2 (t)
∑ ξk ,
Eeiαξk =
k=0
b , b > 0, b − iα
(20.49)
N (t)
1 the jumps of the process S(t) = ∑k=0 ξk , that is, the claims ξk are also positive (the Poisson processes N1,2 (t) with intensities λ1 > 0, λ2 > 0 are independent). The cumulant of ζ (t) can be written in two ways:
ψ (α ) = λ1 (ϕ1 (α ) − 1) + λ2 (
b − 1), ϕ1 (α ) = Eeiαξ1 b + iα
or
ψ (α ) = λ (ϕ (α ) − 1), ϕ (α ) = λ
pϕ1 (α ) + q
b b + iα
,
where λ = λ1 + λ2 , p = λ1 /λ , q = 1 − p, δ = (λ |μ |)/(λ1 μ1 ) > 0, λ μ = λ1 μ1 − λ2 b−1 , μ1 = Eξ1 . Let also P(ξ1 > 0) = F 1 (0) = 1, F 1 (x) = P(ξ1 > x), F(x) = pF 1 (x), x > 0. For both processes (20.10) and (20.49) the conditional moment generating function for γ + (0) in Pollaczek–Khinchin formula (see (20.27)) is as follows. / + E e−zγ (0) ζ + > 0 = −1 ∞ −zx (20.50) ϕ0 (z) = 0∞ F(x) dx F(x) dx, for ζ (t) in (20.10) 0 e = g0 (z) = q1+ 0∞ e−yz (dF(y) + bF(y)dy), for ζ (t) in (20.49). Furthermore, the moment generating function for ζ + is determined by the classic Pollaczek–Khinchin formula. This implies the following decomposition, +
Ee−zζ =
∞ p+ = p+ ∑ (q+ ϕ0 (z))n . 1 − q+ ϕ0 (z) n=0
(20.51)
Let us denote Sn = ∑k≤n ξk , where ξk are i.i.d. random variables with the moment generating function ϕ0 (z). Inverting (20.51) in the variable z, we obtain
Ψ (u) = P(ζ + > u) = p+
∞
∑ qn+ P(Sn > u).
n=1
(20.52)
338
20 Basic functionals of the risk theory
It means that ζ + = Sν . Here the random variable ν follows the geometric distribution with parameter q+ = P(ζ + > 0) < 1. d
Definition 20.8. The process (20.49) is called the risk process with random (exponentially distributed) claims (Figures 4 and 5, page 362 contain the graphical images of the functionals of the process (20.49)).
Let us mention that if Eeiαξk = c/(c − iα ), Eeiαξk = b/(b − iα ) then the process (20.49) is both the almost upper and almost lower semicontinuous. The joint moment generating function of the overjump functionals {τ + (x), γk (x), k = 1, 3}, where γ1 (x) = γ + (x), γ2 (x) = γ+ (x), γ3 (x) = γx+ , that is, V (s, x, u1 , u2 , u3 ) = Ee−sτ
+ (x)− 3 u γ (x) ∑k=1 k k
1Iτ + (x) 0. For the lower continuous (almost lower semicontinuous) processes ζ (t) this equation is as follows. (s + λ )V (s, x, u1 , u2 , u3 ) − λ
x
V (s, x − y, u1 , u2 , u3 )dF(y) = A(x, u1 , u2 , u3 ),
−∞
where A(x, u1 , u2 , u3 ) = λ
∞ x
e(u1 −u2 )x−(u1 +u2 )z dF(z), x > 0.
Let us denote the convolution A(·) with dP− (·) (which is the exponential distribution of ζ − (θs ); see (20.33), (20.37)), as G(·). Then G(s, x, u1 , u2 , u3 ) =
+0 −∞
A(x − y, u1 , u2 , u3 ) dP− (s, y)
= p− (s)A(x, u1 , u2 , u3 ) + q− (s)ρ− (s)
0 −∞
(20.53)
A(x − y, u1 , u2 , u3 ) dP− (s, y),
p− (s) ≥ 0. Note that for the lower continuous processes ζ (t) in (20.53) p− (s) = 0, q− (s) = 1. Theorem 20.12. The joint moment generating function of the overjump functionals for the lower semicontinuous (almost semicontinuous) risk processes ζ (t) is determined by the relation V (s, u, u1 , u2 , u3 ) = s−1
u −0
G(s, u − y, u1 , u2 , u3 ) dP+ (s, y).
(20.54)
The moment generating function of the pair {τ + (u), γk (u)}, (k = 1, 3) is determined by the relation Vk (s, u, uk ) : = Ee−sτ = s−1
+ (u)−u
u −0
k γk (u)
1Iτ + (u) 0 for the processes (20.10) and (20.49), P± (s, x) = P(ζ ± (θs ) < x)
(±x > 0),
G1 (s, u, u1 ) = G(s, u, u1 , 0, 0), G2 (s, u, u2 ) = G(s, u, 0, u2 , 0), G3 (s, u, u3 ) = G(s, u, 0, 0, u3 ). In turn, functions Gk corresponding to the process ζ (t) from (20.10) have a form ∞ λ ρ− (s) e−u1 y − e−ρ− (s)y dF(u + y), G1 (s, u, u1 ) = ρ− (s) − u1 0 ∞
e−u2 (u+z)−ρ− (s)z F(u + z) dz, ∞ e−u3 z 1 − eρ− (s)(u−z) dF(z). G3 (s, u, u3 ) = λ
G2 (s, u, u2 ) = λ ρ− (s)
(20.56)
0
u
The relations (20.54) and (20.55) can be easily inverted in uk (k = 1, 3). Denote
φ (s, k, u, x) =
∂ P(γk (u) < x, ζ + (θs ) > u). ∂x
This value tends to a limit as s → 0. We denote this limit φk (u, x). The limit distributions of the overjump functionals and marginal ruin functions can be found using these relations. Theorem 20.13. The first two ruin functions for the lower continuous ruin processes ζ (t) in the case when m < 0 are determined by the relations
Φ1 (u, x) := P(ζ + > u, γ + (u) > x) = =
λ λ F(u + x) + c |m|
u 0
x
φ1 (u, z)dz
F(u + x − z) dP+ (z), F(y) =
Φ2 (u, y) := P(ζ + > u, γ+ (u) > y) = =
∞
⎧ ⎨
∞ y
(20.57)
∞
F(z) dz, y
φ2 (u, z)dz
λ (u)F(y), |m| P + u u−y λ ⎩ p F(u) + F(y) dP (z) + F(u − z) dP (z) , + + + u−y 0 |m|
(20.58) y > u, 0 < y < u,
where P+ (u) = P(ζ + < u), p+ = |m|/C. The distribution density of the claim γu+ that caused the ruin (i.e., of the third ruin function) is as follows.
φ3 (u, y) := =
∂ ∂ P(γu+ < y, ζ + > u) = − Φ3 (u, y) ∂z ∂y
u λ |m| F (y) 0 (y − u + z) dP+ (z), 0 λ |m| F (y) −y (z + y) dP+ (y + u),
y > u, 0 < y < u.
(20.59)
340
20 Basic functionals of the risk theory
In order to study the distribution of the first duration and the total duration of “red period” it should be taken into account that T (u) = τ − (−γ + (u)),
u > 0,
−
x < 0.
d
τ (x) = inf{t > 0| ζ (t) < x},
(20.60)
The statement below follows from (20.55), (20.56), and (20.60) (if k = 1, s → 0). Theorem 20.14. Consider the lower continuous risk process ζ (t) (see (20.10)). The following relations hold true for m = Eζ (1) < 0. g+ (u, z) := Ee−zγ
+ (u)
1Iτ + (u) 0, p+ =
κu (z) := Ee−zZ
+ (u)
+
1Iζ + >u = e−uz Ee−zζ .
(20.66)
20 Basic functionals of the risk theory
341
Theorem 20.15. Let ζ (t) be the lower almost semicontinuous stepwise risk process. The limiting ruin densities for m < 0 are determined by the relations
φ1 (u, x) = F∗ (u + x) +
λ b|m|
u
F∗ (u + x − y)dP+ (y), x ≥ 0,
0
(20.67)
F∗ (x) = F (x) + bF(x), x ≥ 0, F(0) = p > 0. If y = u ≥ 0 then P(γ2 (u) = γ+ (u) = u) = F(u) = pF 1 (u), and for y = u λ F(y)P+ (u), y > u, u φ2 (u, y) = |m| λ −1 |m| F(y) u−y dP+ (z) + b P+ (u − y) , 0 < y < u,
φ3 (u, y) =
u λ −1 |m| F (y) 0− (z + y − u + b )dP+ (z), 0 λ −1 |m| F (y) −y (z + y + b )dP+ (z + u),
y > u, 0 < y < u.
(20.68)
(20.69)
Let us remark that (20.67) follows from (20.64) inverting it on u1 = z and (20.68) and (20.69) follow from (20.55) using the boundary transition as s → 0 and inversion on u2,3 . At that time the integral in the first row in (20.69) can be written as u 0− −1
z + y − u + b−1 dP+ (y)
= p+ (z − u + b ) +
u
z + y − u + b−1 dP+ (y), z > u.
0+
The relations (20.57) and (20.58) as u → 0 imply the following. Corollary 20.5. Let ζ (t) be the lower continuous risk process (see (20.10)). Then for + m < 0 the following relations are true for Ee−sτ (0) 1Iγk (0)>z = P(ζ + (θs ) > θ , γk (0) > z), k = 1, 3.
λ ∞ (z−y)ρ− (s) e F(y) dy, C z ∞ λ e−yρ− (s) F(y) dy, P(γ+ (0) > z, ζ + (θs ) > 0) = C z ∞ λ 1 − e−yρ− (s) dF(y). P(γ0+ > z, ζ + (θs ) > 0) = Cρ− (s) z
P(γ + (0) > z, ζ + (θs ) > 0) =
(20.70)
If m < 0 then for s → 0 and the lower continuous process ζ (t) the formula (20.70) implies p+ = |m|/C, P(γ + (0) > z, ζ + > 0) = P(γ+ (0) > z, ζ + > 0) = P(γ0+ > z, ζ + > 0) =
λ C
∞ z
y dF(y), F(z) =
P(γk (0) > 0, ζ + > 0) = P(ζ + > 0) = q+ =
λμ , C
λ F(z), C
∞
F(y) dy, z
μ = F(0).
(20.71)
342
20 Basic functionals of the risk theory
Corollary 20.6. If ζ (t) is the lower almost semicontinuous risk process (see (20.49)) then the following relations are true (F(x) = pF 1 (x), x > 0, p = F(0)) p± (s) = P(ζ ± (θs ) = 0) > 0,
p+ (s)ρ− (s) =
sb , s+λ
∞ λ F(x) + q− (s)b e(x−y)ρ− (s) F(y) dy , s+λ x ∞ λ b q− (s) e−vρ− (s) F(v) dv, (20.72) P(γ+ (0) > y, ζ + (θs ) > 0) = s+λ y λ bq− (s) ∞ 1 − e−yρ− (s) dF(y) . F(z) + P(γ0+ > z, ζ + (θs ) > 0) = s+λ ρ− (s) z
P(γ + (0) > x, ζ + (θs ) > 0) =
If s → 0 and m < 0 then it follows from (20.72) for the lower almost semicontinuous process ζ (t) that p+ = (b|m|)/λ and P(γ + (0) > x, ζ + > 0) = p F 1 (x) + bF 1 (x) → q+ = p(1 + bμ1 ), x→0
P(γ+ (0) > y, ζ + > 0) = bpF 1 (y), F 1 (y) =
∞
(20.73)
F 1 (x) dx,
y
P(γ+ (0) > 0, ζ + > 0) < q+ , P(γ+ (0) = 0, ζ + > 0) = pF 1 (0), ∞ P(γ0+ > z, ζ + > 0) = p F 1 (z) + b y dF1 (y) → q+ . z
z→0
In order to calculate the moments mk = Eζ (1)k (given mk < ∞, k = 1, 4) of the risk process ζ (t) = S(t) − Ct (or ζ (t) = S(t) − C(t)) the derivatives of its cumulant k(r) = ln Eerζ (1) = ψ (−ir) at zero point (r = 0) are used. They are said to be semiinvariants k (0) =: κ1 = m1 k (0) =: κ2 = Dζ (1) = m2 − m21 ,
(20.74)
k (0) =: κ3 = m3 − 3m1 m2 + 2m31 , k(4) (0) =: κ4 = m4 − 3m1 − 4m1 m3 + 12m21 m2 − 6m41 . This implies that m1 = κ1 , m2 = κ2 + κ12 , m3 = κ3 + 3κ1 κ2 + κ13 ,
(20.75)
m4 = κ4 + 3κ22 + 4κ1 κ3 + 6κ12 κ2 + κ14 . And let us finally mention that it is not always possible to find the ruin probabilities Ψ (T, u) and Ψ (u) in an explicit form from the integro-differential equation derived for Ψ (T, u). Most often the Laplace–Karson transform on T or Laplace or Fourier transform on u are used. That is why the approximating estimates of these probabilities are often used in risk theory. They could be found in [30, 33].
20 Basic functionals of the risk theory
343
Bibliography [7, 33, 47, 50]; [55] Chapter III; [1, 10, 30, 68].
Problems 20.1. Let us consider the process ξ (t) = at + σ W (t) with a characteristic function Eeiαξ (t) = et ψ (α ) , ψ (α ) = iaα − 12 σ 2 α 2 . Write the characteristic function for ξ (θs ), express the components of the main factorization identity (characteristic functions of ξ ± (θs )) in terms of roots rs = ±ρ± (s) of the Lundberg equation. Find the distributions of the extremums P± (s, x) = P(ξ ± (θs ) < x), (±x > 0). For Eξ (1) = a < 0, (a > 0) find the distributions of the absolute extremums. Find out the shape of the distribution for γk (x) (γ1 (x) = γ + (x), γ2 (x) = γ+ (x)) and for the first duration of T (x) being over the level x. If a < 0 find the moment generating function for the total duration of being over the level x > 0. Show that Q0 (t) satisfies the arcsine law for a = 0. 20.2. Let ζ (t) = S(t) −C(t) be a risk process with claims ξk and premiums ξk both following the exponential distributions, ξk with parameter c and ξk with parameter 1. Furthermore, let N1 (t) c , S(t) = ∑ ξk , Eeiαξk = c − iα k=0 C(t) =
N2 (t)
∑ ξk ,
k=0
Eeiαξk =
1 , 1 − iα
where N1,2 (t) are independent Poisson processes with λ1 = λ2 = 1. Find ψ (α ), ϕ (s, α ), and ϕ± (s, α ) using the roots of the Lundberg equation. If ±m < 0 (m = Eζ (1)), find the characteristic function of ζ ± . Find the joint moment generating function of {τ + (x), γ + (x)} relying on the second factorization identity (20.23). If m < 0, find the moment generating function for the distribution of the first duration of “red period” T (x) and the moment generating function for the total duration Qx (∞) of being over the level x > 0. N(t)
20.3. Let ζ (t) = −t + ∑k=0 ξk be a claim surplus process following the exponential distribution ϕ (α ) = Eeiαξk = c(c − iα )−1 , c > 0, and N(t) be a Poisson process with intensity λ > 0. Find ψ (α ), ϕ (s, α ) and express ϕ± (s, α ) via roots of the Lundberg equation. Find the characteristic function of ζ ± and the ruin probability for ±m < 0 (m = Eζ (1) = (λ − c)/c) if the initial capital u > 0 and m < 0. Write the formulas for the densities of the ruin functions φk (u, z) = (∂ /∂ z)P(γ k (u) < z, ζ + > u), k = 1, 2, 3. 20.4. Calculate all three densities of the ruin functions for the process ζ (t) from Problem 20.3 taking into account that
344
20 Basic functionals of the risk theory
P+ (y) = P(ζ + < y) = 1 − q+ e−ρ+ y , y > 0. Prove also that the first density function in the solution of Problem 20.3 can be simplified (see (20.93) below):
∂ P(γ + (u) < x, ζ + > u) = λ e−ρ+ u−cx , ∂x λ P(γ + (u) > x, ζ + > u) = e−ρ+ u−cx . c Find the moment generating function for γ + (x) and the moment generating function for the first duration of the “red period” T (u). 20.5. Let us consider the risk processes with random premiums from Problem 20.2 and the classical risk process with linear premium function from Problem 20.4 the claims of which follow the exponential distribution with parameter c > 0. It was shown that the moment generating functions of the distribution γ + (u) have the same form (see the expressions for g+ (u, z) in the end of the solutions of Problems 20.2 and 20.4). Find the moment generating function of the total deficit Z + (u). N(t)
20.6. Let ζ (t) = (∑k=0 ξk ) − t be a claim surplus process where ξk have the characteristic function 3 1 7 , ϕ (α ) = Eeiαξk = + 2 3 − iα 7 − iα where N(t) is a Poisson process with intensity λ = 3, (m = Eζ (1) = −2/7). Prove that −(iα )3 + 7(iα )2 − 6iα , ψ (α ) = (3 − iα )(7 − iα ) s(3 − iα )(7 − iα ) s = , ϕ (s, α ) = s − ψ (α ) P3 (s, iα ) where ϕ (s, α ) is a fractional rational function, and P3 (s, r) is a cubic polynomial: P3 (s, r) = r3 + r2 (s − 7) + r(6 − 10s) + 21, s > 0. Find the roots of the equation P3 (0, r) = 0 and show that the negative root r1 (s) = −ρ− (s) → 0 and the positive roots r2 (s) = ρ+ (s) < r3 (s) stay positive as s → 0: s→0
ρ+ (s) = r2 (s) → 1, r3 (s) → 6. Express ϕ± (s, α ) via roots found. Find the distribution of ζ + and the ruin probability Ψ (u). 20.7. Find the first two ruin functions Φ1,2 (u, x) for the process from Problem 20.6 taking into account that 1 −3x e + e−7x , x > 0, 2 24 −x 1 −6x + P(ζ > x) = e + e , x > 0. 35 35 F(x) =
20 Basic functionals of the risk theory
345
20.8. Find the characteristic function of the absolute maximum ζ + for the process ζ (t) from Problem 20.6 using the Pollaczek–Khinchin formula (see (20.50) and (20.51)) and show that the denominator of the characteristic +
Eeiαζ =
(3 − iα )(7 − iα ) 2 7 (3 − iα )(7 − iα ) − 3(5 − iα )
coincides with P2 (0, r) after the substitution r = iα . As a result the identity of the last characteristic function obtained by the Pollaczek– Khinchin formula and the characteristic function for ζ + obtained from Problem 20.6 using factorization is assigned. 20.9. Let ζ (t) = S(t)−C(t) be a risk process with exponentially distributed premiums S(t) =
N1 (t)
∑
ξk , C(t) =
k=0
N2 (t)
∑ ξk ,
k=0
where N1,2 (t) are independent Poisson processes with λ1 = λ2 = 1,
ϕ1 (α ) = Eeiαξk =
1 b , ϕ2 (α ) = Eeiαξk = . 2 (1 − iα ) b − iα
Prove that ψ (α ) = λ1 (ϕ1 (α ) − 1) + λ2 (ϕ2 (−α ) − 1),
ϕ (s, α ) = and find
s s − ψ (α )
ϕ± (s, α ) = Eeiαζ
± (θ
s)
.
20.10. Consider the process from Problem 20.9 (given that b = 1/14, m = 2 − b−1 = −12 < 0) taking into account that s−1 ρ− (s) → |m|−1 = 1/12, s−1 ρ− (s) → s→0
(b|m|)−1 = 7/6. Show that
ϕ+ (α ) = lim
s→0
s→0
(1 − iα )2 s (1 − iα )2 = b|m| . p− (s) P2 (s, iα ) P2 (0, iα )
Show that the fractional rational characteristic function ϕ+ (α ) allows the decomposition 3 27 25 1 1 . (20.76) ϕ+ (α ) = + + 7 41 1 − 4iα 287 7iα − 12 Invert (20.76) in α and find the distribution for ζ + and Ψ (u). Find the densities φ1,2 (x), using the formula (20.67) which follows from (20.64) after inversion on z, and (20.68). 20.11. For the above problem find the moment generating function ζ + using the Pollaczek–Khinchin formula (20.51), taking into account the equalities F 1 (x) = (1 + x)e−x , μ1 = 0∞ F 1 (x) dx = 2.
346
20 Basic functionals of the risk theory
20.12. Calculate the ruin functions for the process from Problem 20.9 for u = 0 and m = 2 − b−1 < 0, using formula (20.73) and the relations F 1 (x) = (1 + x)e−x and F 1 (x) = (2 + x)e−x . 20.13. Find the moment generating function T (0) for the duration of the first “red period” for the ruin process ζ (t) from Problem 20.6 (given u = 0, m < 0) using the formula (20.50) for the moment generating function of γ + (0): g+ (0, z) = Ee−zγ
+ (0)
1Iζ + >0 = q+ ϕ0 (z).
Use formula (20.62). 20.14. Consider the process ζ (t) from Problem 20.9. The moment generating function for γ + (0) is determined by the formula (20.50). Find the moment generating function of the first duration of “red period” given u = 0, b = 1/14 (m = −12), using the formula (20.65). For the calculation of the moment generating function ϕ0 (z) it should be taken into account that F 1 (x) = (1 + x)e−x , x > 0, μ = 2,
ϕ0 (z) =
1 2
∞ 0
e−zx F 1 (x) dx.
20.15. Find the moment generating function of γ + (u) for the risk process from Problem 20.6 (given u > 0, λ = 3, m = −2/7 < 0) . Use the solution of Problem 20.7 for Φ1 (u, x). Use the obtained expression g+ (u, z) = Ee−zγ
+ (u)
1Iτ + (u) 0, b = 1/14, m = −12 < 0). Find the moment generating function of γ + (u) using (20.64) and the relations F(x) = (1 + x)e−x , (x > 0), P+ (y) =
27 −y/4 25 −12y/7 e e − , (y > 0). 41 287
Or, in order to do it, you can use the density φ1 (u, x) found in Problem 20.10 and calculate ∞ e−zx φ1 (u, x)dx. g+ (u, z) = 0
Use this relation for g+ (u, z) in order to determine the moment generating function of the total deficit Z + (u) by the formula (20.66). 20.17. The risk process ζ (t) = S(t) −Ct δ = (λ μ )−1 (C − λ μ ) > 0 has the cumulant
20 Basic functionals of the risk theory ∞
k(z) :=
1 ln Ee−zζ (t) = C z + λ ( f (z) − 1), t
f (z) = Ee−zξ1 =
Rewrite it as the queueing process η (t) with cumulant k1 (z) = z + λ1 ( f (z) − 1), λ1 = λ C−1 , 1 − f (z) = z
∞
347
e−zx dF(x).
0
e−zx F(x) dx.
0
Investigate the virtual time waiting process w(t) (w(0) = 0) for η (t) using formula (z, s) := Ee−zw(θs ) by rewriting this expectation via the probability ω (11.3). Find ∞ −su p0 (s) = s 0 e P0 (u) du of the system to be free of claims in the exponentially distributed moment of time θs . Using the boundary transition as z → ∞ show that the atomic probability of w(θs ) being in 0 is positive: (z, s) = p0 (s) > 0. P(w(θs ) = 0) = p+ (s) = lim ω z→0
20.18. It can be identified based on the Figure 1 (page 360) that d θ1 = τη− (−ξ1 ), τη− (−x) = sup{t| η (t) < x}, x < 0.
for the queueing process η (t) from the previous problem. Find the moment generating function θ1 using the average in ξ1 and prove that
−
π (s) = Ee−sθ1 1Iθ 0, (δ − iα )2 Find m = Eξ (1), ψ (α ), ϕ (s, α ), ϕ± (s, α ), and write the main factorization identity. Using the second factorization identity (20.23) find the moment generating function for pairs {τ + (θμ ), γ + (θμ )}, {τ + (x), γ + (x)}. If m = 2λ δ −1 − 1 < 0(λ < δ /2) find the characteristic function for ϕ+ (α ) and compare it with one determined by the Pollaczek–Khinchin formula. Find the distribution function of ζ + if λ = δ /4. 20.23. Consider the process ζ (t) from the previous problem with λ = δ /4. Find g+ (u, z) with help of formula (20.61), d+ (α , μ ) using formula (20.63), gu (s) with help of formula (20.62), and κu (z) using formula (20.66). 20.24. Using the equality (20.57) with m < 0, prove for the process ζ (t) from the formula (20.10) the following equality Eγ + (u)1Iζ + >u = where F-3 (u) =
∞ u
∞ 0
Φ1 (u, x)dx =
λλ F3 (u) + c |m|
u 0
F-3 (u − z)dP+ (z), u ≥ 0,
F(x)dx is the tail of the third order of the d.f. F(x), x > 0.
20.25. Using the equality (20.67) with m < 0, prove for the process ζ (t) from the formula (20.49) the following equality Eγ + (u)1Iζ + >u = where F ∗ (u) =
∞ u
∞ 0
xφ1 (u, x)dx = F ∗ (u) +
λ b|m|
u 0
F ∗ (u − z)dP+ (z), u ≥ 0,
F ∗ (x)dx, F ∗ (x) = F(x) + bF(x), x > 0.
Hints 20.4. To obtain the simplified relation for the first duration of “red period” it is sufficient to calculate the corresponding integral taking into account that for m < 0, ρ+ = c|m|, p+ = (c − λ )/c, ρ+ = cp+ , u
∞ λ q+ ρ+ e−c(u+x−z)−ρ+ z dz |m| 0 0 u ecq+ z dz = λ e−cp+ u−cx − e−c(u+x) . = λ cq+ ρ+ e−c(u+x)
λ |m|
e−c(u+x−z) dP+ (z) =
0
The negative part in the mentioned integral compensates the first term of the first density P(γ + (u) < x, ζ + > u) and, thus, its simple exponential expression is fulfilled. It implies that the moment generating function for the security of ruin γ + (u) has a form
20 Basic functionals of the risk theory
g+ (u, z) = Ee−zγ
+ (u)
1Iζ + >u =
349
cq+ −ρ+ u e . c+z
Thus, the moment generating function T (u) according to (20.62) is determined by gu (s) = g+ (u, ρ− (s)). The dual relations for the second and third densities can be simplified in a similar way. 20.5. Use the moment generating function g+ (u, z) from Problem 20.4 and the formula (20.66). 20.6. The process ζ (t) is lower continuous. The fractional rational expressions for ψ (α ) and ϕ (s, α ) can be found calculating the cumulant ψ (α ) = −iα + λ (ϕ (α ) − 1). The lower continuity of ζ (t) implies that
ϕ− (s, α ) =
ρ− (s) , P(ζ − (θs ) < x) = eρ− (s)x , x < 0. ρ− (s) + iα
Dividing P3 (s, r) by (r + ρ− (s)) we obtain that P2 (s, r) = r2 + r(s − 7 − ρ− (s)) + 21sρ−−1 (s) = (ρ+ (s) − r)(r+ (s) − r).
Furthermore, the main factorization identity implies that
ϕ+ (s, α ) =
s (3 − iα )(7 − iα ) . ρ− (s) P2 (s, iα )
Because m < 0, s−1 ρ− (s) → |m|−1 = 7/2, r2 (s) = ρ+ (s) → ρ+ = 1, r3 (s) → 6 s→0
s→0
then
ϕ+ (α ) = lim ϕ+ (s, α ) = s→0
s→0
2 (3 − iα )(7 − iα ) . 7 P2 (0, iα )
Decompose the fractional rational function of the second order into fractionally linear parts (because P2 (0, r) can be decomposed as (r − 1)(r − 6)) and invert ϕ+ (α ). 20.7. Before calculating the first two ruin functions
Φ1 (u, x) := P(γ + (u) > x, ζ + > u), x ≥ 0, Φ2 (u, x) := P(γ+ (u) > x, ζ + > u), x ≥ 0, use the formulas (20.57) and (20.58) for x = 0 and corresponding conditions from the previous problem: λ = 3, c = 1, m = −2/7, p+ = 2/7, 1 −3x 1 1 −3x 1 −7x e + e−7x , F(x) = e + e , x > 0, F(x) = 2 2 3 7 P+ (z) =
24 −z 6 −6z e + e , z > 0, 35 35
350
20 Basic functionals of the risk theory
and show that
Φ1 (u, 0) = Φ2 (u, 0) = P+ (u) =
24 −u 1 −6u e + e , u > 0. 35 35
The ruin functions Φ1 (u, x) and Φ2 (u, x) can be found by formulas (20.57) and (20.58). 20.9. Find ψ (α ) and show (for m = 2 − b−1 = (2b − 1)/b) that
ϕ (s, α ) =
s(b + iα )(1 − iα )2 , P3 (s, iα )
P3 (s, r) = r3 (s + 2) + r2 (s(b − 2) + b − 4) − r(s + 1)mb + bs = 0. The negative root r1 (s) = −ρ− (s) of the cubic equation P3 (s, r) = 0 can be used for the determination of ϕ− (s, α ), and P2 (s, r) = P3 (s, r)(r + ρ− (s))−1 = r2 (s + 2) + (b − 4 + s(b − 2) − (2 + s)ρ− (s))r + bsρ−−1 (s) for ϕ+ (s, α ). 20.10. For calculation φ1,2 (u, x) by the formulas (20.67)–(20.68) it should be taken into account that under conditions of the problem λ 1 7 1 = , λ = λ1 + λ2 = 1, p = F(0) = , b = , m = −12, 2 14 b|m| 31 1 1 F(x) = (1 + x)e−x , F (x) = xe−x , x > 0, 2 2 15x + 1 −x e , x > 0, F∗ (x) = F (x) + bF(x) = 28 1 25 P+ (y) = 1 − (27e−y/4 − e−12y/7 ), y > 0. 41 7
Notice that for y = u (see (16.47) in [33]) P(γ+ (u) = u, ζ+ > u) = pF¯1 (u) > 0 in order to calculate φ2 (u, y).
Answers and Solutions 20.1. It was mentioned before that ξ (t) is both an upper and lower continuous process for which 2s ϕ (s, α ) = Eeiαξ (θs ) = , (20.77) 2s − 2iα a − (iα )2 σ 2 and the characteristic functions of ξ ± (θs ) can be expressed by formulas (20.30) and (20.33). According to (20.77), the Lundberg equation can be reduced to a quadratic one: σ 2 r2 + 2ar − 2s = 0, r1,2 (s) = ±ρ± (s),
20 Basic functionals of the risk theory
351
√ 2sσ 2 + a2 ∓ a ρ± (s) = , s ≥ 0. σ2 Thus, the main factorization identity and its components have the form:
ϕ (s, α ) = ϕ+ (s, α )ϕ− (s, α ), ϕ± (s, α ) =
ρ± (s) , s ≥ 0. ρ± (s) ∓ iα
(20.78)
The characteristic functions from (20.78) can be easily inverted in α and the densities of ξ ± (θs ) can be found:
∂ P(ξ ± < x) = ρ± (s)e∓ρ± (s)x , (±x > 0). (20.79) ∂x (1) For a < 0 we have that ρ+ (s) → ρ+ = 2|a|σ −2 > 0, ρ− (s) → 0. So, the p± (s, x) =
s→0
s→0
characteristic function and the distribution of ξ + are +
ϕ+ (α ) := Eeiαξ =
ρ+ , P(ξ + > x) = e−ρ+ x , x ≥ 0, ρ + − iα
(20.80)
P(ξ − = −∞) = 1. √ (2) For a = 0 we have that ρ± (s) = 2s/σ → 0, P(ξ ± = ±∞) = 1. s→0
(3) For a > 0 we have that ρ− (s) → ρ− = 2aσ −2 > 0, ρ+ (s) → 0 therefore s→0
P(ξ + = +∞) = 1, −
ϕ− (α ) := Eeiαξ =
s→0
ρ− , P(ξ − < x) = eρ− x , x ≤ 0. ρ − + iα
(20.81)
Because the process ξ (t) = at + σ W (t) is continuous then P(γk (x) = 0) = 1, (k = 1, 3). The formula (20.62) implies Ee−sT (x) 1Iτ + (x) 0. ρ+ ( μ )
(20.83)
The integral transform for Dx (s, μ ) = Ee−μ Qx (θs ) (x ≥ 0) can be defined in a similar way to (20.82), according to (2.70) in [7]: d+ (s, α , μ ) = =
∞ 0
eiα x dx Dx (s, μ )
ϕ+ (s, α ) ρ+ (s) ρ+ (s + μ ) − iα = . ϕ+ (s + μ , α ) ρ+ (s + μ ) ρ+ (s) − iα
(20.84)
After inversion in α we can find, similarly to (20.83), that Dx (s, μ ) = 1 −
ρ+ (s + μ ) − ρ+ (s) −ρ+ (s)x e , x ≥ 0. ρ+ (s + μ )
For x = 0 D0 (s, μ ) = Ee−μ Q0 (θs ) =
(20.85)
ρ+ (s) . ρ+ (s + μ )
√ 2s/σ . So, for ξ (t) = σ w(t) we get 3 ∞ s s e−st Ee−μ Q0 (t) dt = Ee−μ Q0 (θs ) = . s+μ 0
If a = 0, then ρ± (s) =
After inversion in s we obtain the well-known result for the distribution Q0 (t): 3 2 x P(Q0 (t) < x) = arcsin , (0 ≤ x ≤ t). π t 20.2. It was mentioned above that the considering process is both upper and lower almost semicontinuous. We have c (c − 1)iα − (iα )2 1 , + 1− = −ψ (α ) = 1 − 1 + iα c − iα (1 + iα )(c − iα ) (20.86) (c − iα )(1 + iα ) s = . ϕ (s, α ) = s − ψ (α ) (s − 2)α 2 + (1 − c)(1 − s)iα + sc Hence, the characteristic function ϕ (s, α ) is a rational function of the second order. Let us decompose ϕ (s, α ) into a product of a fractional linear multipliers that determine ϕ± (s, α ). After substitution r = iα (making the denominator in (20.86) equal to 0) we obtain the Lundberg equation which is quadratic: −(2 + s)r2 + rcm(1 − s) − sc = 0, m = (1 − c)c−1 .
(20.87)
It follows for s = 0 that 2r2 + rcm = 0 and the roots are r0 = 0, r10 = −cm/2, (r10 > 0 if m < 0, r10 < 0 if m > 0). If s > 0 it is possible to find the roots of the equation (20.87). In particular, for ( m = 0 the roots are of very simple form: r1,2 (s) = ± s/(2 + s) = ±ρ± (s). If m = 0
20 Basic functionals of the risk theory
353
then the roots are r1 (s) = |r2 (s)|, ρ+ (s) = r1 (s), ρ− (s) = −r2 (s). For s > 0 these roots determine the characteristic function for the distribution of ζ ± (θs ), according to (20.35) and (20.37),
ϕ+ (s, α ) =
p+ (s)(c − iα ) , ρ+ = cp+ (s), ρ+ (s) − iα
P(ζ + (θs ) = 0) = p+ (s),
ϕ− (s, α ) =
p+ (s)p− (s) =
s , s+λ
(20.88)
p− (s)(1 + iα ) , ρ− (s) = p− (s) = P(ζ − (θs ) = 0). ρ− (s) + iα
If m = (1 − c)/c < 0, that is equivalent to c > 1, then ρ+ = c|m|/2 = (c − 1)/2 > 0, thus + p+ (c − iα ) ϕ+ (α ) = Eeiαζ = , ρ+ = cp+ . (20.89) ρ + − iα If m > 0 , that is equivalent to c < 1, then ρ− = cm/2 = (1 − c)/2 > 0, thus −
ϕ− (α ) = Eeiαζ =
p− (1 + iα ) , ρ− = p− . ρ − + iα
(20.90)
According to (20.23) it is easy to calculate the joint moment generating function of {τ + (x), γ + (x)}: Ee−sτ
+ (θ )−uγ + (θ ) μ μ
1Iτ + (θμ ) 0 as claims ξk . If s → 0 then it follows from (20.90) that g+ (x, z) = Ee−zγ
+ (x)
1Iτ + (x) 0 r1,2 (s) =
√ 1 −(s + cm)2 ± Ds , ρ+ (s) = r1 (s), ρ− (s) = −r2 (s). 2
These roots determine the components of the main factorization identity (ρ+ (s) = cp+ (s) < c)
ϕ± (s, α ) =
p+ (s)(c − iα ) ρ− (s) , ϕ− (s, α ) = . ρ+ (s) − iα ρ− (s) + iα
(1) For m < 0 we have that ρ+ (s) → ρ+ = c|m| > 0, ρ− (s) → 0, thus P(ζ − = − s→0
∞) = 1 and +
ϕ+ (α ) = Eeiαζ =
s→0
p+ (c − iα ) , P(ζ + > x) = q+ e−ρ+ x , x > 0. ρ + − iα
So, the ruin probability for m < 0 and u > 0 is equal to
Ψ (u) = P(ζ + > u) = q+ e−ρ+ u . (2) For m > 0 we have that ρ+ (s) → 0, ρ− (s) → ρ− = cm = cp− , thus s→0
P(ξ + = ∞) = 1, and −
ϕ− (α ) = Eeiαζ =
s→0
ρ− , P(ζ − < x) = eρ− x , x ≤ 0. ρ − + iα
(3) For m = 0 we have that ρ± (s) → 0, thus P(ζ ± = ±∞) = 1. s→0
Because F(x) = e−cx , then the following relations take place according to the formulas (20.57)–(20.58) for the marginal densities of the ruin functions (for γ1 (x) = γ + (x), γ2 (x) = γ+ (x) and γ3 (x) = γx+ , respectively):
20 Basic functionals of the risk theory
∂ P(γ + (u) < x, ζ + > u) ∂x u λ e−c(u+x−z) dP+ (z), = λ e−c(x+u) + |m| 0 ∂ P(γ+ (u) < y, ζ + > u) φ2 (u, y) := ∂y λ −cy e P+ (u), P+ (u) = P(ζ + < u), y > u, = |m| λ −cy P(u − y < ζ + < y), 0 < y < u, |m| e
355
φ1 (u, x) :=
(20.93)
∂ φ3 (u, z) := P(γu+ < z, ζ + > u) ∂z λ c −cz u z > u, 0 (z − u + y) dP+ (y), |m| e = λ c −cz 0 −z (z + y) dP+ (y + u), 0 < z < u. |m| e 20.4. φ1 (u, x) = (∂ /∂ x)P(γ + (u) < x, ζ + > u) = λ e−ρ+ u−cx , x > 0;
∂ P(γ+ (u) < y, ζ + > u) ∂y ⎧ ⎨ λ e−cy (1 − e−ρ+ u ) , |m| = λ ⎩ |m| q+ e−cy e−ρ+ (u−y) − e−ρ+ u ,
φ2 (u, y) =
y > u, 0 < y < u;
∂ P(γu+ < z, ζ + > u) ∂ z ⎧ ⎨ λ cq+ e−cz (1 − e−ρ+ u ) (z + ρ+−1 ) − u , |m| = λ cq+ −cz −1 −ρ (u−z) ⎩ |m| e ρ+ e + − e−ρ+ u (z + ρ+−1 )
φ3 (u, z) =
g+ (u, z) =
z > u, 0 < z < u;
cq+ −ρ+ z cq+ e e−ρ+ u . , gu (s) = c+z c + ρ− (s)
20.5. Ee−zZ
+ (u)
1Iζ + >u =
cp+ q+ −(ρ+ +z)u e . ρ+ + z
20.6.
ϕ− (s, α ) =
s ρ− (s) (3 − iα )(7 − iα ) . , ϕ+ (s, α ) = ρ− (s) + iα ρ− (s) (ρ+ (s) − iα )(r2 (s) − iα )
24 1 1 1 6 ; + , ρ− (s)s−1 → ρ− (0) = s→0 35 1 − iα 35 6 − iα |m| 24 1 Ψ (u) = P(ζ + > u) = e−u + e−6u . 35 35
ϕ+ (α ) =
356
20.7.
20 Basic functionals of the risk theory
3 −7x 1 −3x 3 3 1 e − e , x ≥ 0. Φ1 (u, x) = e−u e−3x + e−7x + e−6u 5 7 10 7 3 ⎧ −u 3 1 −3y 1 −7y ⎪ 24e − e−6u , y > u, ⎨ 20 3 e + 7 e 3 −7y Φ2 (u, y) = 10 e−u 6e−2y + 2e−6y − 4e−3y − 12 7 e ⎪ ⎩ −6u 1 −y 1 3y −3y − 1 e−7y , 0 < y < u. +e 2e − 6 e +e 14
20.9.
p− (s)(b + iα ) s (1 − iα )2 . , ϕ+ (s, α ) = ρ− (s) + iα ρ− (s) P2 (s, iα )
ϕ− (s, α ) = 20.10.
1 25 −12u/7 −u/4 27e , Ψ (u) = P(ζ > u) = − e 41 7 +
φ1 (u, x) =
e−x 25 9 (3x − 4)e−−12u/7 + (5x + 7)e−u/4 , x ≥ 0; 41 7 4
If y = u P(γ2 (u) = u, ζ + > u) = 12 (1 + u)e−u , if y = u ⎧ 1 27 −u/4 25 −12u/7 −y ⎪ , y > u; + 287 e ⎨ 12 (1 + y)e 1 − 41 e −y 25 −12u/7 1 e 243 −u/4 φ2 (u, y) = 12 (1 + y) 41 7 e − 27e + 2 e−(u−y)/4 ⎪ ⎩ 625 −12(u−y)/7 , 0 < y < u. − 7 e 20.11. +
Ee−zζ =
2p+ (1 + z)2 . 2p+ + 2z2 + (4 − q+ )z
20.12.
Φ1 (0, x) = p(F 1 (x) + bF 1 (x)) 1 4 1 = e−x (1 + x + b(2 + x)) → (1 + 2b) |b=1/14 = ; x→0 2 2 7 1 1 1 Φ2 (0, y) = pbF 1 (y) = b(2 + y)e−y → b |b=1/14 = ; y→0 2 2 14 1 1 P(ζ + > 0, γ+ (0) = 0) = , φ2 (0, 0) = . 2 14 Φ3 (0, z) = p F 1 (z) + b
∞ z
y dF1 (y)
4 1 1 = e−z (1 + z + b(z2 + 2z + 2)) → (1 + 2b) |b=1/14 = . z→0 2 2 7 20.13. −zT (0)
g0 (s) = Ee
3 1Iζ + >0 = 2
1 1 + . 3 + ρ− (s) 7 + ρ− (s)
20 Basic functionals of the risk theory
20.14. g0 (s) = Ee−sT (0) 1Iζ + >0 = q+ q− (s) 20.15. g+ (u, z) =
357
2 + ρ− (s) . 2(1 + ρ− (s))2
3 4(12 + 3z)e−u + (2 − z)e−6u . 10 (3 + z)(7 + z)
20.16. g+ (u, z) =
1 1 63 −u/4 100 −12u/7 15 5 −12u/7 3 −u/4 e e e . − + + e 41 1 + z 4 7 1+z 7 4
(z, s) = (s − k1 (z))−1 (s − z p0 (s)), where p0 (s) is defined in (11.5). 20.17. ω −1 (z, s) = p+ 1 − λ1 0∞ e−zx F(x) dx , 20.20. ω∗ (z) = lims→0 ω p+ = P(w∗ = 0) = 1 − λ1 μ , q+ = 1 − p+ = λ1 μ . 20.22. m = λ μ − 1 = 2λ δ −1 − 1, 2 α )2 ψ (α ) = (δ λ−iδα )2 −iα ; ϕ (s, α ) = s(Pδ −i (s,α ) , 3
r2 +(s−2δ + λ − ρ− (s))r +sδ 2 ρ−−1 (s), ρ− (s) α )2 ϕ+ (s, α ) = ρ−s(s) (Pδ −i ρ− (s)+iα , (s,i α) . 2
P3 (s, r) = P2 (s, r)(ρ− (s)+r),
ϕ (s, α ) = ϕ+ (s, α )ϕ− (s, α );
P2 (s, r) =
ϕ− (s, α ) =
For λ < δ /2 (m < 0)
ϕ+ (α ) = lim ϕ+ (s, α ) = |m| s→0
(δ − iα )2 , P2 (0, iα )
P2 (0, r) = r2 + (λ − 2δ )r + |m|δ 2 = (r − r1 )(r − r2 ), ( 1 r1,2 = (2δ − λ ∓ λ (4δ + λ )) > 0. 2 After decomposition of ϕ+ (α ) into linear-fractional functions we obtain (r1 − δ )2 1 (r2 − δ )2 1 . ϕ+ (α ) = |m| 1 + + r 2 − r 1 r 1 − iα r 1 − r 2 r 2 − iα Thus, the distribution of ζ + can be expressed via e−xr1,2 inverting the previous rewe can lation on α . According to (20.52) and using the Pollacek–Khinchin formula √ 1 obtain the similar result for ϕ+ (α ) if λ = δ /4 p+ = q+ = 2 , r1,2 = (7∓ 17)δ /8 . Inverting it on α we obtain $ % √ √ 5 1 5 + −(7− 17)δ x/8 −(7+ 17)δ x/8 √ √ 1+ P{ζ > x} = . + 1− e e 4 17 17
A Appendix
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1,
359
360
A Appendix
A Appendix
361
362
A Appendix
A Appendix
363
364
A Appendix
List of abbreviations c`adl`ag c`agl`ad cdf CLT HMF HMP i.i.d. i.i.d.r.v. pdf r.v. SDE SLLN
Right continuous having left-hand limits (p. 24) Left continuous having right-hand limits (p. 24) Cumulant distribution function Central limit theorem Homogeneous Markov family (p. 179) Homogeneous Markov process (p. 176) Independent identically distributed Independent identically distributed random variables Probability density function Random variable Stochastic differential equation Strong law of large numbers
List of probability distributions Be(p) Bi(n, p) Geom(p) Pois(λ ) U(a, b) N(a, σ 2 ) Exp(λ ) Γ (α , β )
Bernoulli, P(ξ = 1) = p, P(ξ = 0) = 1 − p Binomial, P(ξ = k) = Cnk pk (1 − p)n−k , k = 0, . . . , n Geometric, P(ξ = k) = pk−1 (1 − p), k ∈ N Poisson, P(ξ = k) = (λ k /k!)e−λ , k ∈ Z+ Uniform on (a, b), P(ξ ≤ x) = 1 ∧ ((x − a)/(b − a))+ , x ∈ R x −(y−a)2 /2σ 2 Normal (Gaussian), P(ξ ≤ x) = (2πσ 2 )−1/2 −∞ e dy, x∈R Exponential, P(ξ ≤ x) = [1 − e−λx ]+ , x ∈ R Gamma, P(ξ ≤ x) = (β α /Γ (α )) 0x yα −1 e−β y dy, x ∈ R+
A Appendix
List of symbols aξ aX Aϕ B(X) B(X) C([0, T ]) C([0, +∞)) 2 (Rm ) Cuni 2 Cfin C(X, T) c0 cap cov(ξ , η ) D([a, b], X) DXa DF ∂A Eμ F F F [−1] Ft+ Ft− FtX,0 FtX Fτ FX (n) fi j F ∗n F ∗0 Hp H(X) Hk (X) HSN Hγ HΛ I[a,b] ( f ) Iab It ( f ) Iξ L(t, x)
12 11 178 177 3 241 241 180 180 2 241 233 11 24 253 242 242 45,108 329 339 4 21 21 21 21 71 21 138 161 161 89 129 129 254 251 129 193 252 193 110 195
L∗ Lˆ2 ([a, b]) Lˆ2 L cl ∞ L∞ ([0, T ]) p L p ([0, T ]) l.i.m. Lip M M c [M] M, N [M, N] Mc Md M∗ M Mloc M2 2 Mloc M 2,c M 2,d + M +loc M +2 M
M τn NP N(a, B) P(s, x,t, B) pi j pi j (t) (n) pi j PtX1 ,...,tm Rλ RX RX,Y Rξ Tt f Uf
181 193 193 129 241 241 241 241 38 251 74 75 75 74 74 74 74 89 73 73 73 73 74 74 89 89 74 73 21 59 176 137 139 137 1 177 11,107 11 12 177 110
365
366
A Appendix
XH Xnr Xns XT (X, X) ZX βN (a, b) Δs M d ϑ N (S) Λ
61 129 129 2 1 108 76 75 255 129,244
τΓ τΓ Φ Φ¯ Φ −1 φξ φtX1 ,...,tm (Ω , F, P) # ⇒
λ 1 |[0,1] λ 1 |R+ πHΛ
4 44 129
→
d
α ∈A Gα
5 5 261,282 288 282 12 12 1 3 242 242 21
References
367
References 1. Asmussen S (2000) Ruin Probability. World Scientist, Singapore 2. Bartlett MS (1978) An Introduction to Stochastic Processes with Special Reference to Methods and Applications. Cambridge University Press, Cambridge, UK. 3. Bertoin J (1996) Levy Processes. Cambridge University Press, Cambridge, UK. 4. Billingsley P (1968) Convergence of Probability Measures. Wiley Series in Probability and Mathematical Statistics, John Wiley, New York 5. Bogachev VI (1998) Gaussian Measures. Mathematical Surveys and Monographs, vol. 62, American Mathematical Society, Providence, RI 6. Borovkov AA (1976) Stochastic Processes in Queueing Theory. Springer-Verlag, Berlin 7. Bratijchuk NS, Gusak DV (1990) Boundary problems for processes with independent increments [in Russian]. Naukova Dumka, Kiev 8. Brzezniak Z, Zastawniak T (1999) Basic Stochastic Processes. Springer-Verlag, Berlin 9. Bulinski AV, Shirjaev AN (2003) Theory of Random Processes [in Russian]. Fizmatgiz, Laboratorija Bazovych Znanij, Moscow 10. B¨uhlmann H (1970) Mathematical Methods in Risk Theory Springer-Verlag, New-York 11. Chaumont L, Yor M (2003) Exercises in Probability: A Guided Tour from Measure Theory to Random Processes, Via Conditioning. Cambridge University Press, Cambridge, UK 12. Chung KL (1960) Markov Chains with Stationary Transition Probabilities. Springer, Berlin 13. Chung KL, Williams RJ, (1990) Introduction to Stochastic Integration. Springer-Verlag New York, LLC 14. Cram´er H, Leadbetter MR (1967) Stationary and Related Stochastic Processes. Sample Function Properties and Their Applications. John Wiley, New York 15. Doob JL (1990) Stochastic Processes. Wiley-Interscience, New York 16. Dorogovtsev AY, Silvesrov DS, Skorokhod AV, Yadrenko MI (1997) Probability Theory: Collection of Problems. American Mathematical society, Providence, RI 17. Dudley, RM (1989) Real Analysis and Probability. Wadsworth & Brooks/Cole, Belmont, CA 18. Dynkin EB (1965) Markov processes. Vols. I, II. Grundlehren der Mathematischen Wissenschaften, vol. 121, 122, Springer-Verlag, Berlin 19. Dynkin EB, Yushkevich AA (1969) Markov Processes-Theorems and Problems. Plenum Press, New York 20. Elliot RJ (1982) Stochastic Calculus and Applications. Applications of Mathematics 18, Springer-Verlag, New York 21. Etheridge A (2006) Financial Calculus. Cambridge University Press, Cambridge, UK 22. Feller W (1970) An Introduction to Probability Theory and Its Applications (3rd ed.). Wiley, New York 23. F¨ollmer H, Schied A (2004) Stochastic Finance: An Introduction in Discrete Time. Walter de Gruyter, Hawthorne, NY 24. Gikhman II, Skorokhod AV (2004) The Theory of Stochastic Processes: Iosif I. Gikhman, Anatoli V. Skorokhod. In 3 volumes, Classics in Mathematics Series, Springer, Berlin 25. Gikhman II, Skorokhod AV (1996) Introduction to the Theory of Random Processes. Courier Dover, Mineola 26. Gikhman II, Skorokhod AV (1982) Stochastic Differential Equations and Their Applications [in Russian]. Naukova dumka, Kiev 27. Gikhman II, Skorokhod AV, Yadrenko MI (1988) Probability Theory and Mathematical Statistics [in Russian] Vyshcha Shkola, Kiev
368
References
28. Gnedenko BV (1973) Priority queueing systems [in Russian]. MSU, Moscow 29. Gnedenko BV, Kovalenko IN (1989) Introduction to Queueing Theory. Birkhauser Boston, Cambridge, MA 30. Grandell J (1993) Aspects of Risk Theory. Springer-Verlag, New York 31. Grenander U (1950) Stochastic Processes and Statistical Inference. Arkiv fur Matematik, Vol. 1, no. 3:1871-2487, Springer, Netherlands 32. Gross D, Shortle JF, Thompson JM, Harris CM (2008) Fundamentals of Queueing Theory (4th ed.). Wiley Series in Probability and Statistics, Hoboken, NJ 33. Gusak DV (2007) Boundary Value Problems for Processes with Independent Increments in the Risk Theory. Pratsi Instytutu Matematyky Natsional’no¨ı Akademi¨ı Nauk Ukra¨ıny. Matematyka ta ¨ı¨ı Zastosuvannya 65. Instytut Matematyky NAN Ukra¨ıny, Ky¨ıv 34. Hida T (1980) Brownian Motion. Applications of Mathematics, 11, Springer-Verlag, New York 35. Ibragimov IA, Linnik YuV (1971) Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff Series of Monographs and Textbooks on Pure and Applied Mathematics, Wolters-Noordhoff, Groningen 36. Ibragimov IA, Rozanov YuA (1978) Gaussian Random Processes. Applications of Math., vol. 9, Springer-Verlag, New York 37. Ibramkhalilov IS, Skorokhod AV (1980) Consistent Estimates of Parameters of Random Processes [in Russian]. Naukova dumka, Kyiv 38. Ikeda N, and Watanabe S (1989) Stochastic Differential Equations and Diffusion Processes, Second edition. North-Holland/Kodansya, Tokyo 39. Ito K (1961) Lectures on Stochastic Processes. Tata Institute of Fundamental Research, Bombay 40. Ito K, McKean H (1996) Diffusion Processes and Their Sample Paths. Springer-Verlag, New York 41. Jacod J, Shiryaev AN (1987) Limit Theorems for Stochastic Processes. Grundlehren der Mathematischen Wissenschaften, vol. 288, Springer-Verlag, Berlin 42. Johnson NL, Kotz S (1970) Distributions in Statistics: Continuous Univariate Distributions. Wiley, New York 43. Kakutani S (1944) Two-dimensional Brownian motion and harmonic functions Proc. Imp. Acad., Tokyo 20:706–714 44. Karlin S (1975) A First Course in Stochastic Processes. Second edition, Academic Press, New York 45. Karlin S (1966) Stochastic Service Systems. Nauka, Moscow 46. Kijima M (2003) Stochastic Processes with Application to Finance. Second edition. Chapman and Hall/CRC, London 47. Klimov GP (1966) Stochastic Service Systems [in Russian]. Nauka, Moscow 48. Kolmogorov AN (1992) Selected Works of A.N. Kolmogorov, Volume II: Probability theory and Mathematical statistics. Kluwer, Dordrecht 49. Koralov LB, Sinai YG (2007) Theory of Probability and Random Processes, Second edition. Springer-Verlag, Berlin 50. Korolyuk VS (1974) Boundary Problems for a Compound Poisson Process. Theory of Probability and its Applications 19, 1-14, SIAM, Philadelphia 51. Korolyuk VS, Portenko NI, Skorokhod AV, Turbin AF (1985) The Reference Book on Probability Theory and Mathematical Statistics [in Russian]. Nauka, Moscow 52. Krylov NV (2002) Introduction to the Theory of Random Processes. American Mathematical Society Bookstore, Providence, RI 53. Lamperti J (1977) Stochastic Processes. Applied Mathematical Sciences, vol. 23, Springer-Verlag, New York
References
369
54. Lamberton D, Lapeyre B (1996) Introduction to Stochastic Calculus Applied to Finance. Chapman and Hall/CRC, London 55. Leonenko MM, Mishura YuS, Parkhomenko VM, Yadrenko MI (1995) Probabilistic and Statistical Methods in Ecomometrics and Financial Mathematics. [in Ukrainian] Informtechnika, Kyiv 56. L´evy P (1948) Processus Stochastiques et Mouvement Brownien. Gauthier-Villars, Paris 57. Liptser RS, Shiryaev AN (2008) Statistics Of Random Processes, Vol. 1. Springer-Verlag New York 58. Liptser RS, Shiryaev AN (1989) Theory of Martingales. Mathematics and Its Applications (Soviet Series), 49, Kluwer Academic, Dordrecht 59. Lifshits MA (1995) Gaussian Random Functions. Springer-Verlag, New York 60. Meyer PA (1966) Probability and Potentials. Blaisdell, New York 61. Øksendal B (2000) Stochastic Differential Equations, Fifth edition. Springer-Verlag, Berlin 62. Pliska SR (1997) Introduction to Mathematical Finance. Discrete Time Models. Blackwell, Oxford 63. Port S, Stone C (1978) Brownian Motion and Classical Potential Theory. Academic Press, New York 64. Protter P (1990) Stochastic Integration and Differential Equations. A New Approach Springer-Verlag, Berlin 65. Prokhorov AV, Ushakov VG, Ushakov NG (1986) Problems in Probability Theory. [in Russian] Nauka, Moscow 66. Revuz D, Yor M (1999) Continuous martingales and Brownian Motion. Third edition. Springer-Verlag, Berlin 67. Robbins H, Sigmund D, Chow Y (1971) Great Expectations: The Theory of Optimal Stopping. Houghton Mifflin, Boston 68. Rolski T, Schmidli H, Schmidt V, Teugels J (1998) Stochastic Processes for Insurance and Finance. John Wiley and Sons, Chichester 69. Rozanov YuA (1977) Probability Theory: A Concise Course. Dover, New York 70. Rozanov YuA (1982) Markov Random Fields. Springer-Verlag, New York 71. Rozanov YuA (1995) Probability Theory, Random Processes and Mathematical Statistics. Kluwer Academic, Boston 72. Rozanov YuA (1967) Stationary Random Processes. Holden-Day, Inc., San Francisco 73. Sato K, Ito K (editor), Barndorff-Nielsen OE (editor) (2004) Stochastic Processes. Springer, New York 74. Sevast’yanov BA (1968) Branching Processes. Mathematical Notes, Volume 4, Number 2 / August, Springer Science+Business Media, New York 75. Sevastyanov BA, Zubkov AM, Chistyakov VP (1988) Collected Problems in Probability Theory. Nauka, Moscow 76. Skorokhod AV (1982) Studies in the Theory of Random Processes. Dover, New York 77. Skorokhod AV (1980) Elements of the Probability Theory and Random Processes [in Russian]. Vyshcha Shkola Publ., Kyiv 78. Skorohod AV (1991) Random Processes with Independent Increments. Mathematics and Its Applications, Soviet Series, 47 Kluwer Academic, Dordrecht 79. Skorohod AV (1996) Lectures on the Theory of Stochastic Processes. VSP, Utrecht 80. Spitzer F (2001) Principles of Random Walk. Springer-Verlag New York 81. Shiryaev AN (1969) Sequential Statistical Analysis. Translations of Mathematical Monographs 38, American Mathematical Society, Providence, RI 82. Shiryaev AN (1995) Probability. Vol 95. Graduate Texts in Mathematics, Springer-Verlag New York
370
References
83. Shiryaev AN (2004) Problems in Probability Theory. MCCME, Moscow 84. Shiryaev AN (1999) Essentials of Stochastic Finance, in 2 vol. World Scientific, River Edge, NJ 85. Steele JM (2001) Stochastic Calculus and Financial Applications. Springer-Verlag, New York 86. Striker C, Yor M (1978) Calcul stochastique dependant d’un parametre. Z. Wahrsch. Verw. Gebiete, 45: no. 2: 109–133. 87. Stroock DW, Varadhan SRS (1979) Multidimensional Diffusion Processes SpringerVerlag, New York 88. Vakhania NN, Tarieladze VI, Chobanjan SA (1987) Probability Distributions on Banach Spaces. Mathematics and Its Applications (Soviet Series), 14. D. Reidel, Dordrecht 89. Ventsel’ ES and Ovcharov LA (1988). Probability Theory and Its Engineering Applications [in Russian]. Nauka, Moscow 90. Wentzell AD (1981) A Course in the Theory of Stochastic Processes. McGraw-Hill, New York 91. Yamada T, Watanabe S (1971) On the uniquenes of solutions of stochastic differential equations J. Math. Kyoto Univ., 11: 155–167 92. Zolotarev VM (1997) Modern Theory of Summation of Random Variables. VSP, Utrecht
Index
Symbols
σ –algebra cylinder, 2 generated by Markov moment, 71 predictable, 72 “0 and 1” rule, 52 B Bayes method, 280 boundary functional, 329 Brownian bridge, 60 C call (put) option American, 305 claim causing ruin, 329 classic risk process, 328 Coding by pile of books method, 146 contingent claim American, 305 attainable, 305 European, 304 continuity set, 242 convergence of measures weak, 242 of random elements by distribution, 242 weak, 242 correlation operator of measure, 272 coupling, 246 optimal, 247 covariance, 11 criterion
for the regularity, 130 Neyman–Pearson, 273 recurrence, 138 critical region, 271 cumulant, 45 cumulant function, 327 D decision rule nonrandomized, 271 randomized, 271 decomposition Doob’s for discrete-time stochastic processes, 87 Doob–Meyer for supermartingales, 72 for the general supermartingales, 73 Krickeberg, 87 Kunita–Watanabe, 90 Riesz, 87 Wald, 129 density of measure, 272 posterior, 280 prior, 280 spectral, 107 diffusion process, 180 distribution finite-dimensional, 1 Gaussian, 59 marginal, 251 371
372
Index
of Markov chain invariant, 138 stationary, 138 E equation Black–Sholes, 316 Fokker–Planck, 181 Langevin, 218 Lundberg, 333 Ornstein–Uhlenbeck, 218 ergodic transformation, 110 errors of Type I and II, 271 estimator Bayes, 280 consistent for singular family of measures, 281 strictly for regular family of measures, 280 of parameter, 279 excessive majorant, 229 F fair price, 305 family of measures tight, 243 weakly compact, 243 filtration, 21 complete, 21 continuous, 21 left-hand continuous, 21 natural, 21 right-hand continuous, 21 financial market, 303 Black–Scholes/(B,S) model, 316 complete, 305 dividend yield, 317 Greeks, 316 flow of σ -algebras, 21 formula Black–Sholes, 316 Dynkin, 202 Feynman–Kac, 219 Itˆo, 194 multidimensional, 194 L´evy–Khinchin, 44 Pollaczek–Khinchin classic, 337 generalized, 333 Tanaka, 195
fractional Brownian motion, 61 function bounded growth, 77 characteristic, 12 m-dimensional, 12 common, 12 covariance, 11, 107 excessive, 229, 230 generalized inverse, 4 H¨older, 23, 251 Lipschitz, 251 lower semicontinuous, 230 mean, 11 mutual covariance, 11 nonnegatively defined, 11, 12 payoff, 229 continuous, 230 premium, 229, 230 renewal, 161 spectral, 107 structural, 108 superharmonic, 230 G generator, 178 Gronwall–Bellman lemma, 219 I inequality Burkholder, 78 Burkholder–Davis, 77, 78 Doob’s, 75, 77 integral, 76 Khinchin, 78 Marcinkievich–Zygmund, 78 infinitesimal operator, 178 K Kakutani alternative, 258 Kolmogorov equation backward, 181 forward, 181 Kolmogorov system of equations first (backward), 139 second (forward), 140 Kolmogorov–Chapman equations, 137, 176
Index L local time, 195 loss function all-or-nothing, 280 quadratic, 280 M main factorization identity, 331 Markov homogeneous family, 179 Markov chain, 137 continuous-time, 138 homogeneous, 137 regular, 139 Markov moment, 71 predictable, 71 Markov process, 175 homogeneous, 176 weakly measurable, 177 Markov transition function, 176 martingale, 71 inverse, 85 L´evy, 79 local, 73 martingale measure, 304 martingale transformation, 80 martingales orthogonal, 90 matrix covariance, 12 joint, 60 maximal deficit during a period, 330 mean value of measure, 272 mean vector, 12 measure absolutely continuous, 272 Gaussian, 59 on Hilbert space, 273 intensity, 45 L´evy, 45 locally absolutely continuous, 80 locally finite, 44 random point, 45 Poisson point, 45 spectral, 107 stochastic orthogonal, 108 structural, 108 Wiener, 242
measures equivalent, 272 singular, 272 model binomial (Cox–Ross–Rubinstein), 307 Ehrenfest P. and T., 146 Laplace, 145 modification continuous, 22 measurable, 22 N number of crossings of a band, 76 O outpayments, 328 overjump functional, 329 P period of a state, 137 Polya scheme, 81 portfolio, 315 self-financing, 315 price of game, 229 principle invariance, 244 of the fitting sets, 10 reflection, 260 process adapted, 71 almost lower semicontinuous, 328 almost upper semicontinuous, 328 Bessel, 88 birth-and-death, 147 claim surplus, 329 differentiable in L p sense, 33 in probability, 33 with probability one, 33 discounted capital, 304 Galton—Watson, 85 geometrical Brownian motion, 316 integrable, 71 L p sense, 34 in probability, 34 with probability one, 34 L´evy, 44 lower continuous, 328 nonbusy, 160
373
374
Index
of the fractional effect, 37 Ornstein–Uhlenbeck, 61, 183 Poisson, 44 compound, 49, 328 with intensity measure κ, 44 with parameter λ , 44 predictable, 72 discrete-time, 72 progressively measurable, 25 registration, 44 renewal delayed, 161 pure, 161 semicontinuous, 328 stepwise, 327 stochastic, 1 uniformly integrable, 72 upper continuous, 328 Wiener, 44 two-sided, 61 with discrete time, 1 with independent increments, 43 homogeneous, 43 Q quadratic characteristic joint, 74 quantile transformation, 4 R random element, 1 generated by a random process, 242 distribution, 242 generated by a random sequence, 242 distribution, 242 random field, 1 Poisson, 45 random function, 1 centered, 11 compensated, 11 continuous a.s., 33 in mean, 33 in mean square, 33 in probability, 22, 33 in the L p sense, 33 with probability one, 33 measurable, 22 separable, 22
stochastically continuous, 33 random functions stochastically equivalent, 21 in a wide sense, 21 random walk, 138 realization, 1 red period, 330 renewal epoch, 161 equation, 161 theorem, 162 representation spectral, 109 reserve process, 328 resolvent operator, 177 risk process, 328 risk zone, 330 ruin probability with finite horizon, 329 ruin time, 329 S safety security loading, 329 second factorization identity, 332 security of ruin, 329 sequence ergodic, 110 random, 1 regular, 129 singular, 129 set continuation, 230 cylinder, 2 of separability, 22 stopping, 230 supporting, 230 total, 118 shift operator, 110 Snell envelope, 81 space functional, 241 Skorohod, 24 Spitzer–Rogozin identity, 331 square variation, 74 square characteristic, 74 square variation joint, 74 State regular, 139
Index state accessible, 137 communicable, 137 essential, 137 inessential, 137 recurrent, 138 transient, 138 stationarity in wide sense, 107 strictly, 109 stationary sequence interpolation, 129, 130 prediction, 129, 130 stochastic basis, 71 stochastic differential, 194 stochastic differential equation, 215 strong solution, 215 weak solution, 216 stochastic integral Itˆo, 193 discrete, 80 over orthogonal measure, 108 stopping optimal, 230 stopping time, 71 optimal, 229 strategy optimal, 229 strong Markov family, 179 submartingale, 71 superharmonic majorant, 230 least, 230 supermartingale, 71 surplus prior to ruin, 329 T telegraph signal, 183 theorem Birkhoff–Khinchin, 110 Bochner, 12 Bochner–Khinchin, 107 Donsker, 244 Doob’s on convergence of submartingale, 76 on number of crossings, 76 optional sampling, 73 ergodic, 138
375
Fubini for stochastic integrals, 195 functional limit, 244 Hajek–Feldman, 273 Herglotz, 107 Hille –Yosida, 178 Kolmogorov on finite-dimensional distributions, 2 on continuous modification, 23 on regularity, 130 L´evy, 204 on normal correlation, 60 Poincare on returns, 119 Prokhorov, 243 Ulam, 241 total maximal deficit, 330 trading strategy, 303 arbitrage possibility, 304 self-financing, 303 trajectory, 1 transform Fourier–Stieltjes, 340 Laplace, 164 Laplace–Karson, 336 Laplace–Stieltjes, 163 transition function, 176 substochastic, 185 transition intensity, 139 transition probabilities matrix, 137 U ultimate ruin probability, 329 uniform integrability of stochastic process, 72 of totality of random variables, 72 V vector Gaussian, 59 virtual waiting time, 160 W waiting process, 3 Wald identity first, 85 fundamental, 85 generalized, 85 second, 85 white noise, 109