OXFORD GRADUATE TEXTS IN MATHEMATICS
Series Editors
R. COHEN  S.K. DONALDSON  S. HILDEBRANDT  T.J. LYONS  M.J. TAYLOR
Books in the series
1. Keith Hannabuss: An introduction to quantum theory
2. Reinhold Meise and Dietmar Vogt: Introduction to functional analysis
3. James G. Oxley: Matroid theory
4. N.J. Hitchin, G.B. Segal, and R.S. Ward: Integrable systems: twistors, loop groups, and Riemann surfaces
5. Wulf Rossmann: Lie groups: An introduction through linear groups
6. Qing Liu: Algebraic geometry and arithmetic curves
7. Martin R. Bridson and Simon M. Salamon (eds): Invitations to geometry and topology
8. Shmuel Kantorovitz: Introduction to modern analysis
9. Terry Lawson: Topology: A geometric approach
10. Meinolf Geck: An introduction to algebraic geometry and algebraic groups
11. Alastair Fletcher and Vladimir Markovic: Quasiconformal maps and Teichmüller theory
12. Dominic Joyce: Riemannian holonomy groups and calibrated geometry
13. Fernando Villegas: Experimental Number Theory
14. Péter Medvegyev: Stochastic Integration Theory
Stochastic Integration Theory
Péter Medvegyev
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto
With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
Published in the United States by Oxford University Press Inc., New York
© Péter Medvegyev, 2007
The moral rights of the author have been asserted
Database right Oxford University Press (maker)
First published 2007
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by Biddles Ltd., King's Lynn, Norfolk
ISBN 978–0–19–921525–6
1 3 5 7 9 10 8 6 4 2
To the memory of my father
Contents

Preface

1 Stochastic processes
   1.1 Random functions
      1.1.1 Trajectories of stochastic processes
      1.1.2 Jumps of stochastic processes
      1.1.3 When are stochastic processes equal?
   1.2 Measurability of Stochastic Processes
      1.2.1 Filtration, adapted, and progressively measurable processes
      1.2.2 Stopping times
      1.2.3 Stopped variables, σ-algebras, and truncated processes
      1.2.4 Predictable processes
   1.3 Martingales
      1.3.1 Doob's inequalities
      1.3.2 The energy equality
      1.3.3 The quadratic variation of discrete time martingales
      1.3.4 The downcrossings inequality
      1.3.5 Regularization of martingales
      1.3.6 The Optional Sampling Theorem
      1.3.7 Application: elementary properties of Lévy processes
      1.3.8 Application: the first passage times of the Wiener processes
      1.3.9 Some remarks on the usual assumptions
   1.4 Localization
      1.4.1 Stability under truncation
      1.4.2 Local martingales
      1.4.3 Convergence of local martingales: uniform convergence on compacts in probability
      1.4.4 Locally bounded processes

2 Stochastic Integration with Locally Square-Integrable Martingales
   2.1 The Itô–Stieltjes Integrals
      2.1.1 Itô–Stieltjes integrals when the integrators have finite variation
      2.1.2 Itô–Stieltjes integrals when the integrators are locally square-integrable martingales
      2.1.3 Itô–Stieltjes integrals when the integrators are semimartingales
      2.1.4 Properties of the Itô–Stieltjes integral
      2.1.5 The integral process
      2.1.6 Integration by parts and the existence of the quadratic variation
      2.1.7 The Kunita–Watanabe inequality
   2.2 The Quadratic Variation of Continuous Local Martingales
   2.3 Integration when Integrators are Continuous Semimartingales
      2.3.1 The space of square-integrable continuous local martingales
      2.3.2 Integration with respect to continuous local martingales
      2.3.3 Integration with respect to semimartingales
      2.3.4 The Dominated Convergence Theorem for stochastic integrals
      2.3.5 Stochastic integration and the Itô–Stieltjes integral
   2.4 Integration when Integrators are Locally Square-Integrable Martingales
      2.4.1 The quadratic variation of locally square-integrable martingales
      2.4.2 Integration when the integrators are locally square-integrable martingales
      2.4.3 Stochastic integration when the integrators are semimartingales

3 The Structure of Local Martingales
   3.1 Predictable Projection
      3.1.1 Predictable stopping times
      3.1.2 Decomposition of thin sets
      3.1.3 The extended conditional expectation
      3.1.4 Definition of the predictable projection
      3.1.5 The uniqueness of the predictable projection, the predictable section theorem
      3.1.6 Properties of the predictable projection
      3.1.7 Predictable projection of local martingales
      3.1.8 Existence of the predictable projection
   3.2 Predictable Compensators
      3.2.1 Predictable Radon–Nikodym Theorem
      3.2.2 Predictable Compensator of locally integrable processes
      3.2.3 Properties of the Predictable Compensator
   3.3 The Fundamental Theorem of Local Martingales
   3.4 Quadratic Variation

4 General Theory of Stochastic Integration
   4.1 Purely Discontinuous Local Martingales
      4.1.1 Orthogonality of local martingales
      4.1.2 Decomposition of local martingales
      4.1.3 Decomposition of semimartingales
   4.2 Purely Discontinuous Local Martingales and Compensated Jumps
      4.2.1 Construction of purely discontinuous local martingales
      4.2.2 Quadratic variation of purely discontinuous local martingales
   4.3 Stochastic Integration With Respect To Local Martingales
      4.3.1 Definition of stochastic integration
      4.3.2 Properties of stochastic integration
   4.4 Stochastic Integration With Respect To Semimartingales
      4.4.1 Integration with respect to special semimartingales
      4.4.2 Linearity of the stochastic integral
      4.4.3 The associativity rule
      4.4.4 Change of measure
   4.5 The Proof of Davis' Inequality
      4.5.1 Discrete-time Davis' inequality
      4.5.2 Burkholder's inequality

5 Some Other Theorems
   5.1 The Doob–Meyer Decomposition
      5.1.1 The proof of the theorem
      5.1.2 Dellacherie's formulas and the natural processes
      5.1.3 The sub-, super-, and quasi-martingales are semimartingales
   5.2 Semimartingales as Good Integrators
   5.3 Integration of Adapted Product Measurable Processes
   5.4 Theorem of Fubini for Stochastic Integrals
   5.5 Martingale Representation

6 Itô's Formula
   6.1 Itô's Formula for Continuous Semimartingales
   6.2 Some Applications of the Formula
      6.2.1 Zeros of Wiener processes
      6.2.2 Continuous Lévy processes
      6.2.3 Lévy's characterization of Wiener processes
      6.2.4 Integral representation theorems for Wiener processes
      6.2.5 Bessel processes
   6.3 Change of Measure for Continuous Semimartingales
      6.3.1 Locally absolutely continuous change of measure
      6.3.2 Semimartingales and change of measure
      6.3.3 Change of measure for continuous semimartingales
      6.3.4 Girsanov's formula for Wiener processes
      6.3.5 Kazamaki–Novikov criteria
   6.4 Itô's Formula for Non-Continuous Semimartingales
      6.4.1 Itô's formula for processes with finite variation
      6.4.2 The proof of Itô's formula
      6.4.3 Exponential semimartingales
   6.5 Itô's Formula For Convex Functions
      6.5.1 Derivative of convex functions
      6.5.2 Definition of local times
      6.5.3 Meyer–Itô formula
      6.5.4 Local times of continuous semimartingales
      6.5.5 Local time of Wiener processes
      6.5.6 Ray–Knight theorem
      6.5.7 Theorem of Dvoretzky, Erdős and Kakutani

7 Processes with Independent Increments
   7.1 Lévy processes
      7.1.1 Poisson processes
      7.1.2 Compound Poisson processes generated by the jumps
      7.1.3 Spectral measure of Lévy processes
      7.1.4 Decomposition of Lévy processes
      7.1.5 Lévy–Khintchine formula for Lévy processes
      7.1.6 Construction of Lévy processes
      7.1.7 Uniqueness of the representation
   7.2 Predictable Compensators of Random Measures
      7.2.1 Measurable random measures
      7.2.2 Existence of predictable compensator
   7.3 Characteristics of Semimartingales
   7.4 Lévy–Khintchine Formula for Semimartingales with Independent Increments
      7.4.1 Examples: probability of jumps of processes with independent increments
      7.4.2 Predictable cumulants
      7.4.3 Semimartingales with independent increments
      7.4.4 Characteristics of semimartingales with independent increments
      7.4.5 The proof of the formula
   7.5 Decomposition of Processes with Independent Increments

Appendix
   A Results from Measure Theory
      A.1 The Monotone Class Theorem
      A.2 Projection and the Measurable Selection Theorems
      A.3 Cramér's Theorem
      A.4 Interpretation of Stopped σ-algebras
   B Wiener Processes
      B.1 Basic Properties
      B.2 Existence of Wiener Processes
      B.3 Quadratic Variation of Wiener Processes
   C Poisson processes

Notes and Comments
References
Index
Preface

I started to write this book a few years ago mainly because I wanted to understand the theory of stochastic integration. Stochastic integration theory is a very popular topic. The main reason for this is that the theory provides the necessary mathematical background for derivative pricing theory. Of course, many books purport to explain the theory of stochastic integration. Most of them concentrate on the case of Brownian motion, and a few of them discuss the general case. Though the first type of book is quite readable, somehow they disguise the main ideas of the general theory. On the other hand, the books concentrating on the general theory were, for me, a bit sketchy. I very often had quite serious problems trying to decode what the ideas of the authors were, and it took me a long time, sometimes days and weeks, to understand some basic ideas of the theory. I was nearly always able to understand the main arguments but, looking back, I think some simple notes and hints could have made my suffering shorter.

The theory of stochastic integration is full of non-trivial technical details. Perhaps from a student's point of view the best way to study and to understand measure theory and the basic principles of modern mathematical analysis is to study probability theory. Unfortunately, this is not true for the general theory of stochastic integration. The reason for this is very simple: the general theory of stochastic integration contains too much measure theory. Perhaps the best way to understand the limits of measure theory is to study the general theory of stochastic integration. I think this beautiful theory pushes modern mathematics to its very limits. On the other hand, despite the many technical details there are just a few simple ideas which make up the backbone of stochastic analysis.

1. The first one is, of course, martingales and local martingales. The basic concept of stochastic analysis is random noise.
But what is the right mathematical model for random noise? Perhaps the most natural idea would be the random walk, that is, processes with stationary and independent increments: the so-called Lévy processes, with mean value zero. But, unfortunately, this class of processes has some very unpleasant properties. Perhaps the biggest problem is that the sum of two Lévy processes is not in general a Lévy process. Modern mathematics is very much built on the idea of linearity. If there is not some very fundamental and very clear reason for it, then every reasonable class of mathematical objects should be closed under linear combinations. The concept of random noise comes
very much from applications. One of the main goals of mathematics is to build safe theoretical tools and, like other scientific instruments, mathematical tools should be both simple and safe, similar to computer tools. Most computer users never read the footnotes in computer manuals; they just have a general feeling about the limits of the software. It is the responsibility of the writer of the software to make the software work in a plausible way. If the behaviour of the software is not reasonable, then its use becomes dangerous, e.g. you could easily lose your files, or delete or modify something and make the computer behave unpredictably, etc. Likewise, if an applied mathematical theory cannot guarantee that the basic objects of the theory behave reasonably, then the theory is badly written, and as one can easily make hidden errors in it, its usage is dangerous. In our case, if the theory cannot guarantee that the sum of two random noises is again a random noise, then the theory is very dangerous from the point of view of sound applications.

The main reason for introducing martingales is that from the intuitive point of view they are very close to the idea of a random walk, but if we fix the amount of observable information they form a linear space. The issue of local martingales is a bit more tricky. Of course, it is local martingales, and not just actual martingales, that form the class of random noise. Without doubt, local martingales make life for a stochastic analyst very difficult. From an intuitive, applied point of view, local martingales and martingales are very close, and that is why it is easy to make mistakes. Therefore, in most cases the mathematical proofs have to be very detailed and cautious. On the other hand, local martingales form a large and stable class, so the resulting theory is very stable and simple to use. As in elementary algebra, most of the problems come from the fact that one cannot divide by zero.
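How well behaved martingales are under stopping — the mechanism behind the localization discussed in this book — can be checked exactly in discrete time. The following sketch is an illustration only (the walk, the barrier values, and the function name are choices of this note, not of the text): a simple symmetric random walk stopped on hitting one of two barriers stays a martingale, so its mean stays at zero at every step. Exact rational arithmetic makes the check free of rounding.

```python
from fractions import Fraction

def stopped_walk_mean(a, b, n_steps):
    """Exact mean of a simple symmetric random walk started at 0 and
    stopped on hitting -a or b.  The full distribution over positions is
    propagated exactly; mass absorbed at a barrier stays there.  By the
    optional stopping property the mean remains 0 at every step."""
    dist = {0: Fraction(1)}
    for _ in range(n_steps):
        nxt = {}
        for pos, p in dist.items():
            if pos in (-a, b):                  # already stopped: frozen
                nxt[pos] = nxt.get(pos, Fraction(0)) + p
            else:                               # still running: +/-1, prob 1/2 each
                for step in (-1, 1):
                    q = pos + step
                    nxt[q] = nxt.get(q, Fraction(0)) + p / 2
        dist = nxt
    return sum(pos * p for pos, p in dist.items())

print(stopped_walk_mean(2, 3, 50))  # -> 0, exactly
```

Each branching step replaces a position by the average of its two successors, so the mean is preserved exactly even though the barriers −a and b are asymmetric.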
In stochastic analysis most of the problems come from the fact that not every local martingale is a martingale, and therefore one can take expected values only with care. Is there some intuitive idea why one should introduce local martingales? Perhaps, yes. First of all one should realize that not really martingales, but uniformly integrable martingales, are the objects of the theory. If we observe a martingale up to a fixed, finite moment of time we get a uniformly integrable martingale, but most of the natural moments of time are special random variables. The measurement of the time-line is, in some sense, very arbitrary. Traditionally we measure it with respect to some physical, astronomical movements. For some processes this coordinate system is rather arbitrary. It is more natural, for example, to say 'after lunch I called my friend' than to say 'I called my friend at twenty-three past one, and sometimes at twenty-two past one, depending on the amount of food my wife gave me'. Of course the moment of time after lunch is a random variable with respect to the coordinate system generated by the relative position of the earth and the sun, but as a basis for observing my general habits this random time, 'after lunch', is the natural point of orientation. So, in some ways, it is very natural to say that a process is a random noise if one can define a sequence of random moments, so-called stopping times, τ_0 < τ_1 < …, such that if we observe the random noise up to τ_k the truncated processes are uniformly integrable martingales, which is exactly the definition of local martingales. The idea that local martingales are the good
mathematical models for random noise comes from the fact that sometimes we want to perturb the measurement of the time-line in an order-preserving way, and we want the class of 'random noise processes' to be invariant under these transformations.

2. The second most important concept is quadratic variation. One can think of stochastic analysis as the mathematical theory of quadratic variation. In classical analysis one can define an integral only when the integrator has bounded variation. Even in this case, one can define two different concepts of integration. One is the Lebesgue–Stieltjes type of integration and the other is the Riemann–Stieltjes concept of integration. If the integrand is continuous, then the two concepts agree. It is easy to see that if the integrand is left-continuous, and in the Riemann–Stieltjes type integrals one may choose only the starting point of the sub-intervals of the partitions as test-point, then these approximating sums of Riemann–Stieltjes type will converge and the limits are equal to the Lebesgue–Stieltjes integrals. One may ask whether one can extend this trick to some more general class of integrators. The answer is yes. It turns out that the same concept works if the integrators are local martingales. There is just one new element: the convergence of the approximating sums holds only in probability. If the integrators are local martingales, or if they have finite variation, then for this integral the so-called integration by parts formula is valid. In this formula, the most notable factor is the quadratic co-variation [X, Y](t). If, for example, X is continuous and Y has finite variation then [X, Y](t) = 0, but in general [X, Y](t) ≠ 0. As the stochastic integrals are defined only by convergence in probability, the random variable [X, Y](t) is defined only up to a measure-zero set. This implies that the trajectories of the process t → [X, Y](t) are undefined.
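The left-endpoint approximating sums just described can be tried numerically. The sketch below (all parameters and names are illustrative choices, not from the text) simulates one Wiener trajectory and forms the left-endpoint Riemann–Stieltjes sum of ∫ W dW. Two facts are visible: on any path the sum equals (W(T)² − Σ(ΔW)²)/2 by pure telescoping algebra, and the discrete quadratic variation Σ(ΔW)² is close to T when the mesh is fine — which is exactly why the Itô integral of W dW is (W(T)² − T)/2 rather than the classical W(T)²/2.

```python
import random

random.seed(0)
T, n = 1.0, 100_000
dt = T / n

# one simulated Wiener trajectory on [0, T]
dW = [random.gauss(0.0, dt ** 0.5) for _ in range(n)]
W = [0.0]
for d in dW:
    W.append(W[-1] + d)

# left-endpoint (Ito-type) Riemann-Stieltjes sum of the integral of W dW
left_sum = sum(W[i] * dW[i] for i in range(n))

# discrete quadratic variation: close to T for a fine mesh
qv = sum(d * d for d in dW)

# pathwise algebraic identity: left_sum = (W(T)^2 - qv) / 2
print(abs(left_sum - 0.5 * (W[-1] ** 2 - qv)) < 1e-6)  # True
print(abs(qv - T) < 0.05)                              # True
```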
One can exert quite a lot of effort to show that there is a right-continuous process with limits from the left, denoted by [X, Y], such that for every t the value of [X, Y] at time t is a version of the random variable [X, Y](t). The key observations in the proof of this famous theorem are that XY − [X, Y] is a local martingale, that [X, Y] is the only process for which this property holds, and that the jump process of [X, Y] is ΔXΔY. The integration by parts formula is the prototype of Itô's formula, which is the main analytical tool of stochastic analysis. Perhaps it is not without interest to emphasize that the main difficulty in proving this famous formula, in the general case of discontinuous processes, is to establish the existence of the quadratic variation. It is worth mentioning that it is relatively easy to show the existence of the quadratic variation for the so-called locally square-integrable martingales. It is nearly trivial to show the existence of the quadratic variation when the trajectories of the process have finite variation. Hence, it is not so difficult to prove the existence of [X] = [X, X] if the process X has a decomposition X = V + H where the trajectories of V have finite variation and H is a so-called locally square-integrable martingale. The main problem is that we do not know that every local martingale has this decomposition! To
prove that this decomposition exists one should show the Fundamental Theorem of Local Martingales, which is perhaps the most demanding result of the theory.

3. The third most important concept of the theory is predictability. There are many interrelated objects in the theory modified by the adjective predictable. Perhaps the simplest and most intuitive one is the concept of a predictable stopping time. Stopping times describe the occurrence of random events. The occurrence of a random event is predictable if there is a sequence of other events which announces the predictable event. That is, a stopping time τ is predictable if there is a sequence of stopping times (τ_n) with τ_n ↑ τ and τ_n < τ whenever τ > 0. This definition is very intuitive and appealing. If τ is a predictable stopping time, then one can say that the event [τ, ∞) = {(t, ω) : τ(ω) ≤ t} ⊆ R+ × Ω is also predictable. The σ-algebra generated by this type of predictable random interval is called the σ-algebra of predictable events. One should agree that this definition of predictability is in some sense very close to the intuitive idea of predictability. Quite naturally, a stochastic process is called predictable if it is measurable with respect to the σ-algebra of the predictable events. It is an important and often useful observation that the σ-algebra of predictable events is the same as the σ-algebra generated by the left-continuous adapted processes. Recall that a process is called adapted when its value at every moment of time is measurable with respect to the σ-algebra representing the amount of information available at that time. The values of left-continuous processes are at least infinitesimally predictable. One of the most surprising facts of stochastic integration theory is that in the general case the integrands of stochastic integrals should be predictable. Although it looks like a very deep mathematical observation, one should also admit that this is a very natural result.
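For reference, the definitions in this paragraph can be set out in display form (a restatement of the text only; the symbol 𝒫 for the predictable σ-algebra is the usual notation, not introduced here by the book):

```latex
% tau is a predictable stopping time when it is announced by a sequence
% of stopping times:
\tau_n \uparrow \tau, \qquad \tau_n < \tau \ \text{on } \{\tau > 0\}.

% the predictable random interval generated by tau:
[\tau, \infty) = \{(t, \omega) \in \mathbb{R}_+ \times \Omega : \tau(\omega) \le t\}.

% the sigma-algebra of predictable events, equivalently generated by the
% left-continuous adapted processes:
\mathcal{P} = \sigma\bigl( [\tau, \infty) : \tau \text{ predictable} \bigr)
            = \sigma\bigl( X : X \text{ adapted and left-continuous} \bigr).
```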
The best interpretation of stochastic integrals is that they are the net results of continuous-time trading or gaming processes. Everybody knows that in a casino one should play a trading strategy only if one decides about the stakes before the random events generating the gains occur. This means that the playing strategy should be predictable.

An important concept related to predictability is the concept of the predictable compensator. If one has a risky stochastic process X, one can ask whether there is a compensator P for the risk of the process X. The compensator should be 'simpler' than the process itself. Generally it is assumed that P is monotone or at least has finite variation. The compensator P should be predictable, and one should assume that X − P is a totally random process, that is, X − P is a local martingale. This is of course a very general setup, but it appears in most of the applications of stochastic analysis. For a process X there are many compensators, that is, there are many processes Y such that X − Y is a local martingale. Perhaps the simplest one is X itself. But it is very important that the predictable
compensator of X, if it exists and if it has finite variation, is in fact unique. The reason for this is that every predictable local martingale is continuous, and if the trajectories of a continuous local martingale have finite variation then the local martingale is constant.

4. Stochastic integration theory is built on probability theory. Therefore every object of the theory is well-defined only almost surely, and this means that stochastic integrals are also defined only almost surely. In classical integration theory, one first defines the integral over some fixed set and then defines the integral function. In stochastic integration theory this approach does not work, as it is entirely non-trivial how one can construct the integral process from the almost surely defined separate integrals. Therefore, in stochastic integration theory one immediately defines the integral processes, so stochastic integrals are processes and not random variables.

5. There are basically two types of local martingales: continuous and purely discontinuous ones. The canonical examples of continuous local martingales are the Wiener processes, and the simplest purely discontinuous local martingales are the compensated Poisson processes. Every local martingale which has trajectories with finite variation is purely discontinuous, but there are purely discontinuous local martingales with infinite variation. Every local martingale has a unique decomposition L = L(0) + L^c + L^d, where L^c is a continuous local martingale and L^d is a purely discontinuous local martingale. A very important property of purely discontinuous local martingales is that they are sums of their continuously compensated single jumps. A process S_i is, by definition, a single jump if there is a stopping time τ such that every trajectory of S_i is constant before and after the random jump-time τ.
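The compensated Poisson process just mentioned is also the simplest worked example of a predictable compensator: for a Poisson process N with intensity λ, the compensator is the deterministic (hence predictable, finite-variation) process λt, and N(t) − λt is a martingale. A seeded simulation sketch (intensity, horizon, and sample count are illustrative choices of this note):

```python
import random

def poisson_count(lam, T, rng):
    """Number of arrivals of a Poisson(lam) process in [0, T],
    simulated via i.i.d. exponential inter-arrival gaps."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(lam)
        if t > T:
            return count
        count += 1

lam, T, n_paths = 1.5, 2.0, 20_000
rng = random.Random(42)

# N(T) - lam*T is the compensated process; the compensator lam*t
# removes the drift, so the sample mean should be near 0
samples = [poisson_count(lam, T, rng) - lam * T for _ in range(n_paths)]
mean_compensated = sum(samples) / n_paths
print(abs(mean_compensated) < 0.1)  # True: mean is statistically zero
```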
The single jumps obviously have trajectories with finite variation, and as the compensators P_i, by definition, also have finite variation, the compensated single jumps L_i = S_i − P_i also have trajectories with finite variation. Of course this does not imply that the trajectories of L, as infinite sums, should also have finite variation. If L is a purely discontinuous local martingale and L = Σ_i L_i, where the L_i are continuously compensated single jumps, then one can think about the stochastic integral with respect to L as the sum of the stochastic integrals with respect to the L_i. Every L_i has finite variation so, in this case, the stochastic integral, as a pathwise integral, is well-defined, and if the integrand is predictable then the integral is a local martingale. Of course one should restrict the class of integrands, as one has to guarantee the convergence of the sum of the already defined integrals. If the integrand is predictable then the stochastic integral with respect to a purely discontinuous local martingale is a sum of local martingales. Therefore it is also a local martingale.

6. The stochastic integral with respect to continuous local martingales is a bit more tricky. The fundamental property of stochastic integrals with respect to local martingales is that the resulting process is also a local martingale. The intuition behind this observation is that the basic interpretation of stochastic integration is that it is the cumulative gain of an investment process in a randomly changing price process. Every moment of time we decide about the size
of our investment, this is the integrand, and our short-term gains are the product of our investment and the change of the random price-integrator. Our total gain is the sum of the short-term gains. If we can choose our strategy only in a predictable way, it is quite natural to assume that our cumulative gain process will also be totally random. That is, if the investment strategy is predictable and the random integrator price process is a local martingale, then the net, cumulative gain process is also a local martingale. What is the quadratic variation of the resulting gain process? If H • L denotes the integral of H with respect to the local martingale L, then one should guarantee the very natural identity [H • L] = H² • [L], where the right-hand side H² • [L] denotes the classical pathwise integral of H² with respect to the increasing process [L]. The identity is really very natural, as [L] describes the 'volatility' of L along the time-line, and if at every moment of time we hold H pieces of L then our short-term change in 'volatility' will be (HΔL)² ≈ H²·Δ[L]. So our aggregated 'volatility' is Σ H²Δ[L] ≈ H² • [L]. It is a very nice observation that there is just one continuous local martingale, denoted by H • L, for which [H • L, N] = H • [L, N] holds for every continuous local martingale N. The stochastic integral with respect to a local martingale L is the sum of two integrals: the integral H • L^c with respect to the continuous part and the integral H • L^d with respect to the purely discontinuous part of L.

7. As there are local martingales which have finite variation, one can ask whether the new and the classical definitions are the same or not. The answer is that if the integrand is predictable the two concepts of integration coincide. This allows us to further generalize the concept of stochastic integration.
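In discrete time the identity [H • L] = H² • [L] is pure algebra, term by term: (H_i ΔL_i)² = H_i²(ΔL_i)². A minimal check with arbitrary numbers (all values below are illustrative):

```python
# Discrete-time version of [H . L] = H^2 . [L]: for the gain process
# G_n = sum_i H_i dL_i, the quadratic variation sum_i (dG_i)^2 equals
# sum_i H_i^2 (dL_i)^2 exactly, path by path.
H  = [0.5, -1.0, 2.0, 0.0, 1.5]    # arbitrary (predictable) stakes
dL = [1.0, -2.0, 0.5, 3.0, -1.0]   # arbitrary increments of the integrator

dG  = [h * d for h, d in zip(H, dL)]              # increments of the gain H . L
lhs = sum(x * x for x in dG)                      # [H . L]
rhs = sum(h * h * d * d for h, d in zip(H, dL))   # H^2 . [L]
print(lhs == rhs)  # True: the identity is algebraic in discrete time
```

The continuous-time statement is then a limit of exactly such sums, which is why it deserves the name 'very natural'.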
We say that a process S is a semimartingale if S = L + V, where L is a local martingale and V is adapted and has finite variation. One can define the integral with respect to S as the sum of the integrals with respect to L and with respect to V. A fundamental problem is that in the discontinuous case, as we have local martingales with finite variation, the decomposition is not unique. But as for processes with finite variation the two concepts of integration coincide, this definition of the stochastic integral with respect to semimartingales is well-defined.

In the first chapter of the book we introduce the basic definitions and some of the elementary theorems of martingale theory. In the second chapter we give an elementary introduction to stochastic integration theory. Our introduction is built on the concept of Itô–Stieltjes integration. In the third chapter we shall discuss the structure of local martingales and in Chapter Four we shall discuss the general theory of stochastic integration. In Chapter Six we prove Itô's formula. In Chapter Seven we apply the general theory to the classical theory of processes with independent increments.

Finally, it is a pleasure to thank those who have helped me to write this book. In particular I would like to thank Tamás Badics from the University of Pannonia and Petrus Potgieter from the University of South Africa for their efforts. They read most of the book, and without their help perhaps I would not have been able
to finish the book. I wish to thank István Dancs and János Száz from Corvinus University for support and help. I would like to express my gratitude to the Magyar Külkereskedelmi Bank for their support.

Budapest, 2006
[email protected] medvegyev.uni-corvinus.hu
1 STOCHASTIC PROCESSES
In this chapter we first discuss the basic definitions of the theory of stochastic processes. Then we discuss the simplest properties of martingales, the Martingale Convergence Theorem and the Optional Sampling Theorem. In the last section of the chapter we introduce the concept of localization.
1.1 Random functions
Let us fix a probability space (Ω, A, P). As in probability theory, we refer to the set of real-valued (Ω, A)-measurable functions as random variables. We assume that the space (Ω, A, P) is complete, that is, all subsets of measure-zero sets are also measurable. This assumption is not a serious restriction, but it is a bit surprising that we need it. We shall need this assumption many times, for example when we prove that the hitting times¹ of Borel measurable sets are stopping times². When we prove this we shall use the so-called Projection Theorem³, which is valid only when the space (Ω, A, P) is complete. We shall also use the Measurable Selection Theorem⁴ several times, which is again valid only when the measure space is complete. Let us remark that all applications of the completeness assumption are connected to the Predictable Projection Theorem, which is the main tool in the discussion of discontinuous semimartingales.

In the theory of stochastic processes, random variables very often have infinite value. Hence the image space of the measurable functions is not R but the set of extended real numbers R̄ = [−∞, ∞]. The most important examples of random variables with infinite value are stopping times. Stopping times give the random time of the occurrence of observable events. If for a certain outcome ω the event never occurs, it is reasonable to say that the value of the stopping time for this ω is +∞.

1 See: Definition 1.26, page 15.
2 See: Definition 1.21, page 13.
3 See: Theorem A.12, page 550.
4 See: Theorem A.13, page 551.
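A hypothetical helper (the function name and its arguments are illustrative, not from the text) makes the +∞ convention concrete: the first-passage time of a discrete path over a level returns the first index at which the level is reached, and +∞ when it never is.

```python
import math

def first_passage(path, level):
    """Index of the first time the (discrete) path reaches `level`;
    math.inf if it never does -- the stopping-time convention
    tau(omega) = +infinity when the observed event never occurs."""
    for t, x in enumerate(path):
        if x >= level:
            return t
    return math.inf

print(first_passage([0, 1, 3, 2], 2))  # -> 2: value 3 >= 2 at index 2
print(first_passage([0, -1, -2], 2))   # -> inf: the level is never reached
```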
1.1.1 Trajectories of stochastic processes
In the most general sense a stochastic process is a function X(t, ω) such that for any fixed parameter t the mapping ω → X(t, ω) is a random variable on (Ω, A, P). The set of possible time parameters Θ is some subset of the extended real numbers. In the theory of continuous-time stochastic processes Θ is an interval, generally Θ = R+ := [0, ∞), but sometimes Θ = [0, ∞] or Θ = (0, ∞) is also possible. If we do not say explicitly what the domain of definition of the stochastic process is, then Θ is R+. It is very important to append some remarks to this definition. In probability theory the random variables are equivalence classes, which means that the random variables X(t) are defined only up to measure zero sets. This means that in general X(t, ω) is meaningless for a fixed ω. If the possible values of the time parameter t are countable then we can select one element from each equivalence class X(t) and fix a measure zero set, and outside of this set the expressions X(t, ω) are meaningful. But this is impossible if Θ is not countable5. Therefore, we shall always assume that X(t) is a function already carefully selected from its equivalence class. To put it another way: when one defines a stochastic process, one should fix the space of possible trajectories, and the stochastic processes are function-valued random variables defined on the space (Ω, A, P).

Definition 1.1 Let us fix the probability space (Ω, A, P) and the set of possible time parameters6 Θ. The function X defined on Θ × Ω is a stochastic process over Θ × Ω if for every t ∈ Θ it is measurable on (Ω, A, P) in its second variable.

Definition 1.2 If we fix an outcome ω ∈ Ω then the function t → X(t, ω) defined over Θ is the trajectory or realization of X corresponding to the outcome ω. If all7 the trajectories of the process X have a certain property then we say that the process itself has this property.
For example, if all the trajectories of X are continuous then we say that X is continuous, if all the trajectories of X have finite variation then we say that X has finite variation, etc. Recall that in probability theory the role of the space (Ω, A, P) is a bit problematic. All the relevant questions of probability theory are related to the joint distributions of random variables and the whole theory is independent of the specific space carrying the random variables having these joint distributions. 5 This is what the author prefers to call the revenge of the zero sets. This is very serious and it will make our life quite difficult. The routine solution to this challenge is that all the processes which we are going to discuss have some sort of continuity property. In fact, we shall nearly always assume that the trajectories of the stochastic processes are regular, that is at every point all the trajectories have limits from both sides and they are either right- or left-continuous. As we want to guarantee that the martingales have proper trajectories we shall need the so-called usual assumptions. 6 In most of the applications Θ is the time parameter. Sometimes the natural interpretation of Θ is not the time but some spatial parameter. See: Example 1.126, page 90. In continuous ‘time’ theory of stochastic processes Θ is an interval in the half-line R+ . 7 Not almost all trajectories. See: Definition 1.8, page 6, Example 1.11, page 8.
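To make the two-variable picture of Definitions 1.1 and 1.2 concrete, here is a small Python sketch (our illustration, not part of the text; all names are invented). A simulated symmetric random walk is stored over a finite grid of times and outcomes, from which both "facets" can be read off: fixing ω gives a trajectory, fixing t gives a random variable.

```python
import random

# A finite sketch of Definitions 1.1-1.2 (our toy model): a simulated symmetric
# random walk X stored as a function of the pair (t, omega).
def make_process(n_paths, n_times, seed=0):
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        x, path = 0, [0]
        for _ in range(n_times - 1):
            x += rng.choice((-1, 1))   # one increment of the walk
            path.append(x)
        paths.append(path)
    return paths

X = make_process(n_paths=3, n_times=5)
trajectory = X[0]                      # t -> X(t, omega): one realization
variable_at_2 = [p[2] for p in X]      # omega -> X(2, omega): a random variable
assert trajectory[0] == 0 and len(variable_at_2) == 3
```

In this discrete picture the "revenge of the zero sets" of footnote 5 does not arise: with finitely many time points every version of the process can be evaluated pointwise.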
Of course it is not sufficient to define the distributions alone. For instance, it is very important to clarify the relation between the lognormal and the normal distribution, and we can do this only when we refer directly to random variables. Hence, somehow, we should assume that there is a measure space carrying the random variables with the given distributions: if ξ has normal distribution then exp(ξ) has lognormal distribution. This is a very simple and very important relation which is not directly evident from the density functions. The existence of a space (Ω, A, P) enables us to use the power of measure theory in probability theory, but the specific structure of (Ω, A, P) is highly irrelevant. The space (Ω, A, P) contains the ‘causes’, but we see only the ‘consequences’. We never observe the outcome ω. We can see only its consequence ξ(ω). As the space (Ω, A, P) is irrelevant, one can define it in a ‘canonical way’. In probability theory, generally, Ω := R, A := B(R) and P is the measure generated by the distribution function of ξ, or in the multidimensional case Ω := Rn and A := B(Rn). In both cases Ω is the space of all possible realizations. Similarly, in the theory of stochastic processes the only entities which one can observe are the trajectories. Sometimes it is convenient if Ω is the space of possible trajectories. In this case we say that Ω is given in its canonical form. It is worth emphasizing that in probability theory there is no advantage at all in using any specific representation. In the theory of stochastic processes the relevant questions are related to time, and all the information about the time should be somehow coded in Ω. Hence it is very plausible to assume that the elements of Ω are not just abstract objects which somehow describe the information about the timing of certain events, but are also functions over the set of possible time values.
That is, in the theory of stochastic processes, the canonical model is not just one of the possible representations: it is very often the right model to discuss certain problems.

1.1.2 Jumps of stochastic processes
Of course, the theory of stochastic processes is an application of mathematical analysis. Hence the basic mathematical tool of the theory of stochastic processes is measure theory. To put it another way, perhaps one of the most powerful applications of measure theory is the theory of stochastic processes. But measure theory is deeply sequential, related on a fundamental level to countable objects. We can apply measure theory to continuous-time stochastic processes only if we restrict the trajectories of the stochastic processes to ‘countably determined functions’.

Definition 1.3 Let I ⊆ R be an interval and let Y be an arbitrary topological space. We say that the function f : I → Y is regular if at any point t ∈ I, where it is meaningful, f has left-limits

f(t−) := f−(t) := lim_{s↑t} f(s) ∈ Y
and right-limits

f(t+) := f+(t) := lim_{s↓t} f(s) ∈ Y.
We say that f is right-regular if it is regular and right-continuous. We say that f is left-regular if it is regular and left-continuous. If f is a real-valued function, that is, if Y = R in the above definition, then the existence of limits means that the function has finite limits. As, in this book, stochastic processes are mainly real-valued, to make the terminology as simple as possible we shall always assume that regular processes have finite limits. If the process X is regular and if t is an interior point of Θ then, as the limits are finite, it is meaningful to define the jump

∆X(t) := X(t+) − X(t−)

of X at t. It is not too important, but a bit confusing, that somehow one should fix the definition of jumps of the regular processes at the endpoints of the time interval Θ. If Θ = R+ then what is the jump of the function χΘ at t = 0? Is it zero or one?

Definition 1.4 We do not know anything about X before t = 0, so by definition we shall assume that X(0−) := X(0). Therefore for any right-regular process on R+

∆X(0) := X(0+) − X(0−) = 0.    (1.1)

In a similar way, if, for example, Θ := [0, 1) and X := χΘ, then X is right-regular and does not have a jump at t = 1. Observe that in both examples the trajectories were continuous functions on Θ, so it is a bit strange to say that the jump process of a continuous process is not zero8. It is not entirely irrelevant how we define the jump process at t = 0. If we consider the process F := χ_{R+} as a distribution function of a measure, then how much is the integral ∫_{[0,1]} 1 dF? We shall assume that the distribution functions are right-regular and not left-regular. By definition9 ∫_0^1 1 dF is the integral over (0, 1], and as F is right-regular the measure of (0, 1] is F(1) − F(0) = 0, so ∫_0^1 1 dF = 0. According to our convention one can think that

∫_{[0,1]} 1 dF = F(1) − F(0−) = F(1) − F(0) = 1 − 1 = 0.

On the other hand one can correctly argue that

∫_{[0,1]} 1 dF = ∫_R χ([0, 1]) dF = 1.

8 One can take another approach. In general: what is the value of an undefined variable? If X is the value process of a game and τ is some exit strategy, then what is the value of the game if we never exit from the game, that is if τ = ∞? It is quite reasonable to say that in this case the value of the game is zero. Starting from this example one can say that once a variable is undefined we shall assume that its value is zero. If one uses this approach then X(0−) := 0 and ∆X(0) = X(0+).
9 In measure theory one can very often find the convention ∫_a^b f dµ := ∫_{[a,b)} f dµ. We shall assume that the integrator processes are right- and not left-continuous, so we shall use the convention ∫_a^b f dµ := ∫_{(a,b]} f dµ.
To avoid these types of problems we shall never include the set {t = 0} in the domain of integration. The regular functions have many interesting properties. We shall very often use the next propositions:

Proposition 1.5 Let f be a real-valued regular function defined on a finite and closed interval [a, b]. For any c > 0 the number of jumps in [a, b] bigger in absolute value than c is finite. The number of jumps of f is at most countable.

Proof. The second part of the proposition is an easy consequence of the first part. Assume that there is an infinite number of points (tn) in [a, b] for which |∆f(tn)| ≥ c. As [a, b] is compact, one can assume that tn → t∗. Obviously we can assume that tn ≤ t∗ or t∗ ≤ tn for an infinite number of points. Hence we can assume that tn ↑ t∗. But f has a left-limit at t∗, so if x, y < t∗ are close enough to t∗ then |f(x) − f(y)| ≤ c/4. If tn is close enough to t∗ and x < tn < y are close enough to tn and to t∗ then

c ≤ |f(tn+) − f(tn−)| ≤ |f(tn+) − f(y)| + |f(y) − f(x)| + |f(x) − f(tn−)| ≤ (3/4)c,

which is impossible.

Proposition 1.6 If a function f is real-valued and regular then it is bounded on any compact interval.

Proof. Fix a finite closed interval [a, b]. If f were not bounded on [a, b] then there would be a sequence (tn) for which |f(tn)| ≥ n. As [a, b] is compact one could assume that tn → t∗. We could also assume that e.g. tn ↑ t∗, and therefore f(tn) → f(t∗−) ∈ R, which is impossible.
Proposition 1.7 Let f be a real-valued regular function defined on a finite and closed interval [a, b]. If the jumps of f are smaller than c then for any ε > 0 there is a δ such that

|f(t′) − f(t″)| < c + ε whenever |t′ − t″| ≤ δ.

Proof. If such a δ were not available then for some δn ↓ 0 for all n there would be t′n, t″n such that |t′n − t″n| ≤ δn and

|f(t′n) − f(t″n)| ≥ c + ε.    (1.2)

As [a, b] is compact, one could assume that t′n → t∗ and t″n → t∗ for some t∗. Notice that, except for a finite number of indexes, (t′n) and (t″n) are on different sides of t∗, since if, for instance, for an infinite number of indexes t′n, t″n ≥ t∗, then for some subsequences t′nk ↓ t∗ and t″nk ↓ t∗, and as f is regular lim_{k→∞} f(t′nk) = lim_{k→∞} f(t″nk), which contradicts (1.2). So we can assume that t′n ↑ t∗ and t″n ↓ t∗. Using again the regularity of f, one has |∆f(t∗)| ≥ c + ε, which contradicts the assumption |∆f| ≤ c.

1.1.3 When are stochastic processes equal?
A stochastic process X has three natural ‘facets’. The first one is the process itself, which is the two-dimensional ‘view’. We shall refer to this as X(t, ω) or just as X. With the first notation we want to emphasize that X is a function of two variables. For instance, the different concepts of measurability, like predictability or progressive measurability, characterize X as a function of two variables. We shall often use the notation X(t), or sometimes Xt, which denotes the random variable ω → X(t, ω), that is, the random variable belonging to the moment t. Similarly we shall use the symbols X(ω), or Xω as well, which refer to the trajectory belonging to ω, that is, X(ω) is the ‘facet’ t → X(t, ω) of X.

Definition 1.8 Let X and Y be two stochastic processes on the probability space (Ω, A, P).
1. The process X is a modification of the process Y if for all t ∈ Θ the variables X(t) and Y(t) are almost surely equal, that is, for all t ∈ Θ

P(X(t) = Y(t)) := P({ω : X(t, ω) = Y(t, ω)}) = 1.

By this definition, the set of outcomes ω where X(t, ω) ≠ Y(t, ω) can depend on t ∈ Θ.
2. The processes X and Y are indistinguishable if there is a set N ⊆ Ω which has probability zero, and whenever ω ∉ N then X(ω) = Y(ω), that is, X(t, ω) = Y(t, ω) for all t ∈ Θ and ω ∉ N.
Proposition 1.9 Assume that the realizations of X and Y are almost surely continuous from the left, or they are almost surely continuous from the right. If X is a modification of Y then X and Y are indistinguishable.

Proof. Let N0 be the set of outcomes where X and Y are not left-continuous or right-continuous. Let (rk) be the set of rational points10 in Θ and let

Nk := {X(rk) ≠ Y(rk)} := {ω : X(rk, ω) ≠ Y(rk, ω)}.

X is a modification of Y, hence P(Nk) = 0 for all k. Therefore if N := ∪_{k=0}^∞ Nk then P(N) = 0. If ω ∉ N then X(rk, ω) = Y(rk, ω) for all k, hence as the trajectories X(ω) and Y(ω) are continuous from the same side, X(t, ω) = Y(t, ω) for all t ∈ Θ. Therefore outside N obviously X(ω) = Y(ω), that is, X and Y are indistinguishable.

Example 1.10 With modification one can change the topological properties of trajectories.
In the definition of stochastic processes one should always fix the analytic properties, like continuity, regularity, differentiability etc., of the trajectories. It is not a great surprise that with modification one can dramatically change these properties. For example, let (Ω, A, P) := ([0, 1], B, λ) and Y(t, ω) ≡ 0. The trajectories of Y are continuous. If χQ is the characteristic function of the rational numbers and X(t, ω) := χQ(t + ω), then for all ω the trajectories of X are nowhere continuous, but X is a modification of Y. From the example it is also obvious that it is possible for X to be a modification of Y but for X and Y not to be indistinguishable. If X and Y are stochastic processes then, unless we explicitly say otherwise, X = Y means that X and Y are indistinguishable.
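The gap between modification and indistinguishability can also be seen numerically. Below is a hedged sketch of a standard companion construction (ours, not the χQ example of the text): on ([0, 1], B, λ) take Y ≡ 0 and X(t, ω) := χ_{t=ω}. For each fixed t the exceptional set is the single point {t}, a null set, so X is a modification of Y; yet every trajectory of X differs from the zero trajectory somewhere.

```python
import random

# A standard companion to Example 1.10 (our construction, not the text's):
# on (Omega, A, P) = ([0, 1], B, lambda) take Y identically 0 and
# X(t, omega) = 1 if t == omega and 0 otherwise.
def X(t, omega):
    return 1.0 if t == omega else 0.0

t = 0.5
rng = random.Random(42)
# the event {omega = t} has probability zero; discard it if it ever occurs
omegas = [w for w in (rng.random() for _ in range(1000)) if w != t]

# For the fixed t, X(t) = Y(t) = 0 almost surely, so X is a modification of Y:
assert all(X(t, w) == 0.0 for w in omegas)
# Yet every trajectory of X differs from the zero trajectory at the point t = omega,
# so X and Y are not indistinguishable:
assert all(X(w, w) == 1.0 for w in omegas)
```

The point of the simulation is exactly the order of quantifiers in Definition 1.8: the null set where X(t) ≠ Y(t) moves with t, and its union over the uncountably many time points covers all of Ω.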
1.2 Measurability of Stochastic Processes
As we have already mentioned, the theory of stochastic processes is an application of measure theory. On the one hand this remark is almost unnecessary, as measure theory is the cornerstone of every serious application of mathematical analysis. On the other hand it is absolutely critical how one defines the class of

10 Recall that Θ is an interval in R. If X and Y are left-continuous then left-continuity is meaningless at the left endpoint of Θ, so if Θ has a left endpoint then we assume that this left endpoint is part of (rk). Similarly, when X and Y are right-continuous and Θ has a right endpoint then we assume that this endpoint is in (rk).
measurable functions which one can use in stochastic analysis. Every stochastic process is a function of two variables, so it is natural to assume that every process is product measurable.

Example 1.11 An almost surely continuous process is not necessarily product measurable.
Let (Ω, A, P) := ([0, 1], B, λ) and let E be a subset of [0, 1] which is not Lebesgue measurable. The process

X(t, ω) := 0 if ω ≠ 0,    X(t, ω) := χE(t) if ω = 0

is almost surely continuous. X is not product measurable, as by Fubini’s theorem product measurability implies partial measurability, but if ω = 0 then t → X(t, ω) is not measurable. Although the example is trivial it is not without interest. Processes X and Y are considered to be equal if they are indistinguishable. So in theory it can happen that X is product measurable and X = Y but Y is not product measurable. To avoid these types of measurability problems we should, for example, assume that the different objects of stochastic analysis, like martingales, local martingales, semimartingales etc., are right-regular and not just almost surely right-regular. Every trajectory of a Wiener process should be continuous, but it can happen that it starts only almost surely from zero.

1.2.1 Filtration, adapted, and progressively measurable processes
A fundamental property of time is its ‘irreversibility’. This property of time is expressed with the introduction of the filtration.

Definition 1.12 Let us fix a probability space (Ω, A, P). For every t ∈ Θ let us select a σ-algebra Ft ⊆ A in such a way that whenever s < t then Fs ⊆ Ft. The correspondence t → Ft is called a filtration and we shall denote this correspondence by F. The quadruplet (Ω, A, P, F) is called a stochastic basis.

With the filtration F one can define the σ-algebras

Ft+ := ∩_{s>t} Fs,
Ft− := σ(∪_{s<t} Fs).11

Let w be a Wiener process, let F^w denote the filtration generated by w, and let F be the event that there is an ε > 0 such that on the interval [0, ε] the trajectory w(ω) is zero. Obviously F = ∪n Fn, where Fn is the set of outcomes ω for which w(ω) is zero on the interval [0, 1/n]. Fn is measurable, as it is equal to the set

{w(rn) = 0, rn ∈ [0, 1/n] ∩ Q}.

Obviously P(Fn) = 0, therefore P(F) = 0. By definition w(0) ≡ 0, therefore F0^w = {Ω, ∅}. Hence F ∉ F0^w. If t > 0 and 1/n ≤ t, then obviously Fn ∈ Ft^w, therefore ∪_{1/n≤t} Fn ∈ Ft^w. On the other hand for every t > 0 evidently ∪_{1/n≤t} Fn = F, since obviously ∪_{1/n≤t} Fn ⊆ F, and if ω ∈ F then ω ∈ Fn ⊆ ∪_{1/n≤t} Fn for some index n. Hence F ∈ ∩_{t>0} Ft^w =: F0+^w, that is, F0^w ≠ F0+^w. Let us remark that, as we shall see later, if N is the collection of sets with

11 One can observe that the interpretation of Ft− is intuitively quite appealing, but the interpretation of Ft+ looks a bit unclear. It is intuitively not obvious what type of information one can get in an infinitesimally short time interval after t, or to put it in another way, it is not too clear how one can get Ft ≠ Ft+. Therefore from an intuitive point of view it is not a great surprise that we shall generally assume that Ft = Ft+.
measure-zero in A then the filtration Ft := σ(Ft^w ∪ N) is right-continuous, so this extended F satisfies the usual conditions12. The σ-algebra F0^w = {Ω, ∅} is complete, which implies that to make F right-continuous one should add to the σ-algebra Ft^w all the null sets from A, or at least the null sets of Ft^w for all t; it is not sufficient to complete the σ-algebras Ft^w separately.

Definition 1.14 We say that a process X is adapted to the filtration F if X(t) is measurable with respect to Ft for every t. A set A ⊆ Θ × Ω is adapted if the process χA is adapted.

In the following we shall fix a stochastic basis (Ω, A, P, F), and if we do not say otherwise we shall always assume that all stochastic processes are adapted with respect to the filtration F of the stochastic basis. It is easy to see that the adapted sets form a σ-algebra.

Example 1.15 If Ft ≡ {∅, Ω} for all t then only the deterministic processes are adapted. If Ft ≡ A for all t then every product measurable stochastic process is adapted.
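Adaptedness is easiest to see in a finite model. The following Python sketch (our toy setup, not from the text: Ω = {0, 1}³, three coin tosses, with Fn generated by the first n tosses) checks the defining property directly: a process is adapted precisely when X(n, ·) is constant on every set of outcomes that share the first n tosses.

```python
from itertools import product

# A finite sketch of adaptedness (our toy setup: Omega = {0,1}^3, F_n generated
# by the first n tosses). A process X(n, omega) is adapted precisely when
# X(n, .) is constant on every cylinder set fixed by the first n coordinates.
def is_adapted(X, horizon=3):
    for n in range(horizon + 1):
        seen = {}
        for omega in product((0, 1), repeat=3):
            key = omega[:n]                    # the information available at time n
            seen.setdefault(key, set()).add(X(n, omega))
        if any(len(values) > 1 for values in seen.values()):
            return False
    return True

def running_sum(n, omega):
    return sum(omega[:n])                      # uses only the past: adapted

def total_sum(n, omega):
    return sum(omega)                          # peeks at the future: not adapted

assert is_adapted(running_sum)
assert not is_adapted(total_sum)
```

This is the discrete shadow of Definition 1.14: measurability of X(t) with respect to Ft means exactly "no dependence on information that Ft cannot distinguish".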
The concept of adapted processes is a dynamic generalization of partial measurability. The dynamic generalization of product measurability is progressive measurability:

Definition 1.16 A set A ⊆ Θ × Ω is progressively measurable if for all t ∈ Θ

A ∩ ([0, t] × Ω) ∈ Rt := B([0, t]) × Ft,

that is, for all t the restriction of A to [0, t] × Ω is measurable with respect to the product σ-algebra Rt := B([0, t]) × Ft. The progressively measurable sets form a σ-algebra R. We say that a process X is progressively measurable if it is measurable with respect to R.

It is clear from the definition that every progressively measurable process is adapted.

Example 1.17 Adapted process which is not progressively measurable.
12 See: Proposition 1.103, page 67.
Let Ω := Θ := [0, 1] and let Ft := A be the σ-algebra generated by the finite subsets of Ω. If D := {t = ω} then the function X := χD is obviously adapted. We prove that it is not product measurable. Assume that {X = 1} = D ∈ B(Θ) × A. By the definition of product measurability Y := [0, 1/2] × Ω ∈ B(Θ) × A. So if D ∈ B(Θ) × A then Y ∩ D ∈ B(Θ) × A. Therefore by the projection theorem13 [0, 1/2] ∈ A, which is impossible. Therefore D ∉ B(Θ) × A. If Ft := A for all t then X is adapted but not progressively measurable.

Example 1.18 Every adapted, continuous from the left and every adapted, continuous from the right process is progressively measurable14.
Assume, for example, that X is adapted and continuous from the right. Fix a t and let 0 = t_0^(n) < t_1^(n) < … < t_k^(n) = t be a partition of [0, t]. Let us define the processes

Xn(s) := X(0) if s = 0,    Xn(s) := X(t_k^(n)) if s ∈ (t_{k−1}^(n), t_k^(n)].

As X is adapted, Xn is measurable with respect to the σ-algebra Rt := B([0, t]) × Ft. If the sequence of partitions (t_k^(n)) is infinitesimal, that is, if

lim_{n→∞} max_k (t_k^(n) − t_{k−1}^(n)) = 0,

then as X is right-continuous Xn → X. Therefore the restriction of X to [0, t] is Rt-measurable. Hence X is progressively measurable.

Example 1.19 If X is regular then ∆X is progressively measurable.
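The approximation in Example 1.18 can be sketched numerically (our discretization with invented names; a dyadic partition plays the role of the infinitesimal sequence). For a right-continuous trajectory, taking the value at the right endpoint of each partition interval converges pointwise, which is exactly where right-continuity enters: the chosen grid points decrease to s.

```python
import math

# Sketch of the approximation used in Example 1.18 (our discretization): for a
# right-continuous trajectory x on [0, t], X_n takes the value of x at the right
# endpoint of each dyadic partition interval, and X_n(s) -> x(s) pointwise.
def approx(x, s, t, n):
    """X_n(s) := x(t_k) for s in (t_{k-1}, t_k], dyadic partition of [0, t]."""
    if s == 0:
        return x(0.0)
    k = math.ceil(s * 2 ** n / t)          # the index with t_{k-1} < s <= t_k
    return x(k * t / 2 ** n)

def x(s):                                   # a right-continuous step trajectory
    return 0.0 if s < 0.3 else 1.0

errors = [abs(approx(x, 0.29, 1.0, n) - x(0.29)) for n in range(1, 12)]
# the error vanishes once the partition separates the point 0.29 from the jump at 0.3
assert errors[0] == 1.0 and errors[-1] == 0.0
```

For a left-continuous process one would symmetrically take the left endpoints; for a merely measurable process no such pointwise scheme is available, which is why the example needs one-sided continuity.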
Like the product measurability, the progressive measurability is also a very mild assumption. It is perhaps the mildest measurability concept one can use in stochastic analysis. The main reason why one should introduce this concept is the following much-used observation:

Proposition 1.20 Assume that V is a right-regular, adapted process and assume that every trajectory of V has finite variation on every finite interval [0, t].
1. If for every ω the trajectories X(ω) are integrable on any finite interval with respect to the measure generated by V(ω) then the parametric integral process

Y(t, ω) := ∫_0^t X(s, ω) V(ds, ω) := ∫_{(0,t]} X(s, ω) V(ds, ω)    (1.4)

13 If P(N) = 0 when N is countable and P(N) = 1 otherwise, then the probability space (Ω, A, P) is complete.
14 In particular, if X(t, ω) is measurable in ω and continuous in t then X is product measurable.
forms a right-regular process and ∆Y = X · ∆V.
2. If additionally X is progressively measurable then Y is adapted.

Proof. The first statement of the proposition is a direct consequence of the Dominated Convergence Theorem. Observe that to prove the second statement one cannot directly apply Fubini’s theorem, but one can easily adapt its usual proof. Let H denote the set of bounded processes for which Y(t) in (1.4) is Ft-measurable. As the measure of finite intervals is finite, H is a linear space, it contains the constant process X ≡ 1, and if 0 ≤ Hn ∈ H and Hn ↑ H and H is bounded then by the Monotone Convergence Theorem H ∈ H. This implies that H is a λ-system. If C ∈ Ft and s1, s2 ≤ t, and B := (s1, s2] × C, then as V is adapted the integral

∫_0^t χB dV = χC (V(s2) − V(s1))
is Ft-measurable. These processes form a π-system, hence by the Monotone Class Theorem H contains the processes which are measurable with respect to the σ-algebra generated by the processes χC χ((s1, s2]). As C ∈ Ft, this π-system generates the σ-algebra of the product measurable sets B((0, t]) × Ft. X is progressively measurable, so its restriction to (0, t] is (B((0, t]) × Ft)-measurable. Hence the proposition is true if X is bounded. From this the general case follows from the Dominated Convergence Theorem.

What is the intuitive idea behind progressive measurability? Generally the filtration F is generated by some process X. Recall that if Z := (ξα)_{α∈A} is a set of random variables and X := σ(ξα : α ∈ A) denotes the σ-algebra generated by them, then X = ∪_{S⊆A} X_S, where the subsets S are arbitrary countable subsets of A and for any S the set X_S denotes the σ-algebra generated by the countably many variables (ξ_{αi})_{αi∈S} of Z, that is, X_S := σ(ξ_{αi} : αi ∈ S). By this structure of the generated σ-algebras, Ft^X contains all the information one can obtain by observing X up to time t countably many times. If a process Y is adapted with respect to F^X then Y reflects the information one can obtain from countably many observations of X. But sometimes, like in (1.4), we want information which depends on an uncountable number of observations of the underlying random source. In these cases one needs progressive measurability!

1.2.2 Stopping times
After filtration, stopping time is perhaps the most important concept of the theory of stochastic processes. As stopping times describe the moments when certain random events occur, it is not a great surprise that most of the relevant questions of the theory are somehow related to stopping times. It is important that not every random time is a stopping time. Stopping times are related to events described by the filtration of the stochastic base15 . At every time t one can observe only the events of the probability space (Ω, Ft , P). If τ is a random time then at time t one cannot observe the whole τ . One can observe only the random variable τ ∧ t! By definition τ is a stopping time if τ ∧ t is an (Ω, Ft , P)-random variable for all t. Definition 1.21 Let Ω be the set of outcomes and let F be a filtration on Ω. Let τ : Ω → Θ ∪ {∞}. 1. The function τ is a stopping time if for every t ∈ Θ {τ ≤ t} ∈ Ft . We denote the set of stopping times by Υ. 2. The function τ is a weak stopping time if for every t ∈ Θ {τ < t} ∈ Ft . Example 1.22 Almost-surely zero functions and stopping times.
Assume that the probability space (Ω, A, P) is complete and for every t the σ-algebra Ft contains the measure-zero sets of A. If N ⊆ Ω is a measure-zero set and the function τ ≥ 0 is zero on the complement of N, then τ is a stopping time, as for all t ≥ 0 we have {τ > t} ⊆ N, so {τ > t} is a measure-zero set and hence {τ > t} ∈ Ft; therefore {τ ≤ t} ∈ Ft. In a similar way, if σ ≥ 0 is almost surely +∞ then σ is a stopping time. These examples are special cases of the following: if (Ω, A, P, F) satisfies the usual conditions and τ is a stopping time and σ ≥ 0 is almost surely equal to τ, then σ is also a stopping time. We shall see several times that in the theory of stochastic processes the time axis is not symmetric. The filtration defines an orientation on the real axis.

15 If we travel from a city to the countryside then the moment when we arrive at the first pub after we leave the city is a stopping time, but the time when we arrive at the last pub before we leave the city is not a stopping time. In a similar way, when X is a stochastic process the first time X is zero is a stopping time, but the last time it is zero is not a stopping time. One of the most important random times which is generally not a stopping time is the moment when X reaches its maximum on a certain interval. See: Example 1.110, page 73.
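The "first pub / last pub" remark of the footnote has a one-line computational shadow (our discrete-time illustration; the names are invented): deciding the first visit to a set uses only the past of the path, while the last visit needs the whole future.

```python
import math

# Discrete-time sketch of the footnote's "first pub / last pub" remark
# (our illustration). A path is observed at the times 0, 1, 2, ...
def first_zero(path):
    """First visit to zero: decidable at time t from path[:t+1] alone."""
    for t, x in enumerate(path):
        if x == 0:
            return t
    return math.inf

def last_zero(path):
    """Last visit to zero: needs the whole trajectory -- not a stopping time."""
    return max((t for t, x in enumerate(path) if x == 0), default=math.inf)

path = [1, 0, 2, 0, 3]
assert first_zero(path) == 1 and last_zero(path) == 3
# Changing the future leaves first_zero alone but can move last_zero:
assert first_zero([1, 0, 2, 0, 0]) == 1
assert last_zero([1, 0, 2, 0, 0]) == 4
```

In the language of Definition 1.21: {first_zero ≤ t} is determined by the first t + 1 observations, i.e. by Ft, while {last_zero ≤ t} is not.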
An elementary but very important consequence of this orientation is the following proposition:

Proposition 1.23 Every stopping time is a weak stopping time. If the filtration F is right-continuous then every weak stopping time is a stopping time.

Proof. As the filtration F is increasing, if τ is a stopping time then for all n

{τ ≤ t − 1/n} ∈ F_{t−1/n} ⊆ Ft.
1 τ ≤t− n
∈ Ft .
On the other hand if F is right-continuous that is if Ft+ = Ft then 1 {τ ≤ t} = ∩n τ < t + ∈ ∩n Ft+1/n Ft+ = Ft . n The right-continuity of the filtration is used in the next proposition as well. Proposition 1.24 If τ and σ are stopping times then τ ∧ σ and τ ∨ σ are also stopping times. If (τ n ) is an increasing sequence of stopping times then τ lim τ n n→∞
is a stopping time. If the filtration F is right-continuous and (τ n ) is a decreasing sequence of stopping times then τ lim τ n n→∞
is a stopping time. Proof. If τ and σ are stopping times then {τ ∧ σ ≤ t} = {τ ≤ t} ∪ {σ ≤ t} ∈ Ft , {τ ∨ σ ≤ t} = {τ ≤ t} ∩ {σ ≤ t} ∈ Ft . If τ n τ then for all t {τ ≤ t} = ∩n {τ n ≤ t} ∈ Ft . If τ n τ then for all t c
{τ ≥ t} = ∩n {τn ≥ t} = ∩n {τn < t}^c ∈ Ft,
that is, {τ < t} = ∪n {τn < t} ∈ Ft. Hence τ is a weak stopping time. If the filtration F is right-continuous then τ is a stopping time.

Corollary 1.25 If the filtration F is right-continuous and (τn) is a sequence of stopping times then

sup_n τn,    inf_n τn,    lim sup_{n→∞} τn,    lim inf_{n→∞} τn

are stopping times.

The next definition concretizes the abstract definition of stopping times:

Definition 1.26 If Γ ⊆ R+ × Ω then the expression

τΓ(ω) := inf {t : (t, ω) ∈ Γ}
(1.5)
is called the début of the set Γ. If B ⊆ Rn and X is a vector-valued stochastic process then

τB(ω) := inf {t : X(t, ω) ∈ B}
(1.6)
is called the hitting time of the set B. If B ⊆ R and X is a stochastic process and if Γ := {X ∈ B}, then τΓ = τB, which means that every hitting time is a special début.

Example 1.27 The most important hitting times are the random functions τa(ω) := inf {t : X(t, ω) R a}, where R is one of the relations ≥, >, ≤, <.

… τ := inf {t > σ : X(t) ∈ B}. The set

Γ := {(t, ω) : X(t, ω) ∈ B} ∩ {(t, ω) : t > σ(ω)}

is progressively measurable, since by the progressive measurability of X the first set in the intersection is progressively measurable, and the characteristic function of the other set is adapted and left-continuous, hence it is also progressively measurable. By the theorem above, if (Ω, A, P, F) satisfies the usual conditions then the expression τ = τΓ := inf {t : (t, ω) ∈ Γ} is a stopping time.

16 See: Theorem A.12, page 550.
17 It can happen that (s, ω) ∈ Γ for all s > t, but (t, ω) ∉ Γ. In this case τΓ(ω) = t, but ω ∉ projΩ(Γ ∩ [0, t) × Ω); therefore in the proof we used the right-continuity of the filtration.
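In discrete time the measurability subtleties surrounding the début disappear, and the hitting time (1.6) becomes a plain first-entry scan. The following is a hedged numerical sketch (our grid-based approximation with invented names), not a substitute for the Projection Theorem machinery above:

```python
import math

# Purely numerical illustration of the hitting time (1.6) (our grid-based
# approximation): the first grid time at which a sampled trajectory lies in B.
def hitting_time(times, path, in_B):
    for t, x in zip(times, path):
        if in_B(x):
            return t
    return math.inf                     # the trajectory never reaches B

times = [k / 100 for k in range(101)]
path = [4.0 * t * (1.0 - t) for t in times]        # a continuous trajectory on [0, 1]
tau = hitting_time(times, path, lambda x: x >= 1.0)
assert tau == 0.5                       # the closed set [1, oo) is first hit at t = 1/2
assert hitting_time(times, [0.5 * x for x in path], lambda x: x >= 1.0) == math.inf
```

The value +∞ for a path that never enters B matches the convention of Section 1.1: a stopping time takes the value +∞ on the outcomes where the observed event never occurs.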
Corollary 1.30 If the stochastic base (Ω, A, P, F) satisfies the usual conditions, the process X is progressively measurable and B is a Borel set, then the hitting times

τ0 := 0,    τ_{n+1} := inf {t > τn : X(t) ∈ B}
are stopping times. Example 1.31 If X is not progressively measurable then the hitting times of Borel sets are not necessarily stopping times.
Let X := χD be the adapted but not progressively measurable process in Example 1.17. The hitting time of the set B := {1} is obviously not a stopping time, as {τB ≤ 1/2} = [0, 1/2] ∉ A = F_{1/2}. The main advantage of the above construction is its generality. An obvious disadvantage of the theorem just proved is that it builds on the Projection Theorem. Very often we do not need the generality of the above construction and we can construct stopping times without referring to the Projection Theorem.

Example 1.32 Construction of stopping times without the Projection Theorem.
1. If the set B is closed and X is a continuous, adapted process then one can easily prove that the hitting time (1.6) is a stopping time. As the trajectories are continuous, the sets K(t, ω) := X([0, t], ω) are compact for every outcome ω. As B is closed, K(t, ω) ∩ B = ∅ if and only if the distance between the two sets is positive. Therefore K(t, ω) ∩ B = ∅ if and only if τB(ω) > t. As the trajectories are continuous, X([0, t] ∩ Q, ω) is dense in the set K(t, ω). As the metric is a continuous function,

{τB ≤ t} = {K(t) ∩ B ≠ ∅} = {d(K(t), B) = 0} = {ω : inf {d(X(s, ω), B) : s ≤ t, s ∈ Q} = 0}.

X(s) is Ft-measurable for a fixed s ≤ t, hence as x → d(x, B) is continuous, d(X(s), B) is also Ft-measurable. The infimum of a countable number of measurable functions is measurable, hence {τB ≤ t} ∈ Ft.
2. We prove that if B is open, the trajectories of X are right-continuous and adapted, and the filtration F is right-continuous, then the hitting time (1.6) is a stopping time. It is sufficient to prove that {τB < t} ∈ Ft for all t. As the trajectories are right-continuous and as B is open, X(s, ω) ∈ B if and only if
there is an ε > 0 such that X(u, ω) ∈ B whenever u ∈ [s, s + ε). From this
{τ_B < t} = ∪_{s∈Q∩[0,t)} {X(s) ∈ B} ∈ Ft.
3. In a similar way one can prove that if X is left-continuous and adapted, F is right-continuous, and B is open, then the hitting time τ_B is a stopping time.
4. If the filtration is right-continuous, and X is a right- or left-continuous adapted process, then for any number c the first passage time τ = inf{t : X(t) > c} is a stopping time.
5. If B is open and the filtration is not right-continuous, then even for continuous processes the hitting time τ_B is not necessarily a stopping time[18]. If X(t, ω) = t · ξ(ω), where ξ is a Gaussian random variable, and Ft is the filtration generated by X, then F0 = {∅, Ω}, and the hitting time τ_B of the set B = {x > 0} is
τ_B(ω) = 0 if ξ(ω) > 0, and τ_B(ω) = ∞ if ξ(ω) ≤ 0.
Obviously {τ_B ≤ 0} ∉ F0, so τ_B is not a stopping time.
6. Finally we show that if σ is an arbitrary stopping time, X is a right-regular, adapted process and c > 0, then the first passage time
τ(ω) = inf{t > σ(ω) : |ΔX(t, ω)| ≥ c}
is a stopping time. Let us fix an outcome ω and let us assume that ∞ > t_n ↘ τ(ω), where |ΔX(t_n, ω)| ≥ c. The trajectory X(ω) is right-regular, therefore the jumps which are bigger than c do not have an accumulation point. Hence for all indexes n large enough the sequence t_n is already constant, that is τ(ω) = t_n > σ(ω), so |ΔX(τ(ω))| = |ΔX(t_n)| ≥ c for some n. This means that |ΔX(τ)| ≥ c on the set {τ < ∞} and on the set {σ < ∞} one has τ > σ. Let A(t) = ([0, t] ∩ Q) ∪ {t}. We prove that τ(ω) ≤ t if and only if for all n ∈ N one can find a pair q_n, p_n ∈ A(t) for which
σ(ω) < p_n < q_n < p_n + 1/n
[18] The reason for this is clear, as the event {τ_B = t} may contain outcomes ω for which the trajectory hits the set B only just after t; therefore one should investigate the events {τ_B < t}.
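The rational-grid characterization in part 1 above can be illustrated numerically: for a continuous path and a closed set B, the hitting time can be detected by scanning countably many time points, since the infimum of d(X(s), B) over a dense grid recovers the infimum over all s ≤ t. The following is only a sketch; the path X(t) = t and the set B = [1, 2] are illustrative choices, not taken from the text.

```python
import math

def hitting_time(path, dist_to_B, T, n_grid):
    """First grid time s in [0, T] with d(path(s), B) = 0 (grid approximation)."""
    for k in range(n_grid + 1):
        s = T * k / n_grid
        if dist_to_B(path(s)) == 0.0:
            return s
    return math.inf

# Illustrative example: X(t) = t and the closed set B = [1, 2],
# so the path first touches B at t = 1.
path = lambda t: t
dist_to_B = lambda x: max(0.0, 1.0 - x) + max(0.0, x - 2.0)

tau = hitting_time(path, dist_to_B, T=2.0, n_grid=200)
print(tau)  # 1.0
```

For sets the path only grazes, one would replace the exact test `== 0.0` by a small tolerance and refine the grid, exactly in the spirit of taking the infimum over rational times.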
MEASURABILITY OF STOCHASTIC PROCESSES
19
and
|X(p_n, ω) − X(q_n, ω)| ≥ c − 1/n.   (1.7)
One implication is evident: if τ(ω) ≤ t, then, as the jumps bigger than c do not have accumulation points, |ΔX(s, ω)| ≥ c for some σ(ω) < s ≤ t. Hence by the regularity of the trajectories one can construct the necessary sequences. On the other hand, let us assume that the sequences (p_n), (q_n) exist. Without loss of generality one can assume that (p_n) and (q_n) are convergent. Let σ(ω) ≤ s ≤ t be the common limit point of these sequences. If p_n ≥ s for an infinite number of indexes, then in any right neighbourhood of s there is an infinite number of intervals [p_n, q_n] on which X changes by more than c/2 > 0, which is impossible as X is right-continuous. Similarly, q_n ≤ s only for a finite number of indexes, as otherwise p_n < q_n ≤ s for an infinite number of indexes, which is impossible as X(ω) has left limits. This means that for indexes n big enough σ(ω) < p_n ≤ s ≤ q_n. Taking the limit in (1.7), |ΔX(s, ω)| ≥ c and hence τ(ω) ≤ s ≤ t. Using this property one can easily prove that
{τ ≤ t} = ∩_{n∈N} ∪_{p,q∈A(t), p<q<p+1/n} ({σ < p} ∩ {|X(p) − X(q)| ≥ c − 1/n}) ∈ Ft.

If r ∈ Q, r ≤ t, then
{σ > r > τ} = {σ > r} ∩ {τ < r} = {σ ≤ r}^c ∩ {τ < r} ∈ Ft.
From this
{σ ≤ τ} ∩ {σ ≤ t} = {σ > τ}^c ∩ {σ ≤ t} = (∪_{r∈Q} {σ > r > τ})^c ∩ {σ ≤ t} = (∪_{r∈Q, r≤t} {σ > r > τ})^c ∩ {σ ≤ t} ∈ Ft.
Hence by the definition of Fσ one has {σ ≤ τ} ∈ Fσ. On the other hand
{τ ≤ σ} ∩ {σ ≤ t} = {σ ≤ t} ∩ {τ ≤ t} ∩ {τ ∧ t ≤ σ ∧ t} ∈ Ft,
since the first two sets, by the definition of stopping times, are in Ft and the two random variables in the third set are Ft-measurable. Hence {τ ≤ σ} ∈ Fσ.

Proposition 1.35 If X is progressively measurable and τ is an arbitrary stopping time then the stopped variable X_τ is Fτ-measurable, and the truncated process X^τ is progressively measurable.

Proof. The first part of the proposition is an easy consequence of the second as, if B is a Borel measurable set and X^τ is adapted, then for all s
{X_τ ∈ B} ∩ {τ ≤ s} = {X(τ ∧ s) ∈ B} ∩ {τ ≤ s} = {X^τ(s) ∈ B} ∩ {τ ≤ s} ∈ Fs,
that is, in this case the stopped variable X_τ is Fτ-measurable. Therefore it is sufficient to prove that if X is progressively measurable then X^τ is also progressively measurable. Let
Y(t, ω) = 1 if t < τ(ω), and Y(t, ω) = 0 if t ≥ τ(ω).
Y is right-regular. τ is a stopping time so {Y(t) = 0} = {τ ≤ t} ∈ Ft. Hence Y is adapted, therefore it is progressively measurable[20]. Obviously if τ(ω) > 0 then[21]
Z(t, ω) = ∫_{(0,t]} X(s, ω) Y(ds, ω) = 0 if t < τ(ω), and Z(t, ω) = −X(τ(ω), ω) if t ≥ τ(ω).
As X is progressively measurable, Z is adapted[22] and also right-regular, so it is again progressively measurable. As X^τ = XY − Z + X(0)χ(τ = 0), X^τ is obviously progressively measurable.

Corollary 1.36 If G = σ(X(τ) : X is right-regular and adapted) then G = Fτ.

Proof. As every right-regular and adapted process is progressively measurable, G ⊆ Fτ. If A ∈ Fτ then the process X(t) = χ_A χ(τ ≤ t) is right-regular and by

[20] See: Example 1.18, page 11.
[21] If τ(ω) = 0 then Z(ω) = 0.
[22] See: Proposition 1.20, page 11.
the definition of Fτ,
{X(t) = 1} = A ∩ {τ ≤ t} ∈ Ft.
Hence X is adapted. Obviously X(τ) = χ_A. Therefore Fτ ⊆ G.

1.2.4
Predictable processes
The class of progressively measurable processes is too large. As we have already remarked, the interesting stochastic processes have regular trajectories. There are two types of regular processes: some of them have left- and some of them have right-continuous trajectories. It is a bit surprising that there is a huge difference between these two classes. But one should recall that the trajectories are not just functions: the time parameter has an obvious orientation: the time line is not symmetric, the time flows from left to right. Definition 1.37 Let (Ω, A, P, F) be a stochastic base, and let us denote by P the σ-algebra of the subsets of Θ × Ω generated by the adapted, continuous processes. The sets in the σ-algebra P are called predictable. A process X is predictable if it is measurable with respect to P. Example 1.38 A deterministic process is predictable if and only if its single trajectory is a Borel-measurable function.
Obviously we call a process X deterministic if it does not depend on the random parameter ω; more exactly, a process X is called deterministic if it is a stochastic process on (Ω, {Ω, ∅}). If A = {Ω, ∅} then the set of continuous stochastic processes is equivalent to the set of continuous functions, and the σ-algebra generated by the continuous functions is equivalent to the σ-algebra of the Borel measurable sets on Θ.

The set of predictable processes is closed under the usual operations of analysis[23]. The most important and specific operation related to stochastic processes is truncation:

Proposition 1.39 If τ is an arbitrary stopping time and X is a predictable stochastic process then the truncated process X^τ is also predictable.

Proof. Let L be the set of bounded stochastic processes X for which X^τ is predictable. It is obvious that L is a λ-system. If X is continuous then X^τ is also continuous, hence the π-system of the bounded continuous processes is in L. From the Monotone Class Theorem it is obvious that L contains the set of bounded predictable processes. If X is an arbitrary predictable process then

[23] Algebraic and lattice type operations, usual limits etc.
X_n = Xχ(|X| ≤ n) is a predictable bounded process and therefore X_n^τ is also predictable. As X_n^τ → X^τ, X^τ is obviously predictable.

To discuss the structure of the predictable processes let us introduce some notation:

Definition 1.40 If τ and σ are stopping times then one can define the random intervals
{(t, ω) ∈ [0, ∞) × Ω : σ(ω) R1 t R2 τ(ω)},
where R1 and R2 are one of the relations < or ≤. One can define four random intervals [σ, τ], [σ, τ), (σ, τ] and (σ, τ), where the meaning of these notations is obvious.

One should emphasize that, in the definition of the stochastic intervals, the value of the time parameter t is always finite. Therefore if τ(ω) = ∞ for some ω then (∞, ω) ∉ [τ, τ]. In measure theory we are used to the fact that the σ-algebras generated by the different types of intervals are the same. In R or in R^n one can construct every type of interval from any other type of interval with a countable number of set operations. For random intervals this is not true! For example, if we want to construct the semi-closed random interval [0, τ) from random closed segments [0, σ] then we need a sequence of stopping times (σ_n) for which σ_n ↗ τ and σ_n < τ. If there is such a sequence[24] then of course [0, σ_n] ↗ [0, τ), that is, in this case [0, τ) is in the σ-algebra generated by the closed random segments. But for an arbitrary stopping time τ such a sequence does not exist. If τ is a stopping time and c > 0 is a constant, then τ − c is generally not a stopping time! On the other hand if c > 0 then τ + c is always a stopping time, hence as [0, τ] = ∩_n [0, τ + 1/n) the closed random intervals [0, τ] are in the σ-algebra generated by the intervals [0, σ). This shows again that in the theory of stochastic processes the time line is not symmetric!

Definition 1.41 Y is a predictable simple process if there is a sequence of stopping times 0 = τ_0 < τ_1 < … < τ_n < … such that
Y = η_0 χ({0}) + Σ_i η_i χ((τ_i, τ_{i+1}])   (1.8)

[24] If for τ there is a sequence of stopping times σ_n ↗ τ, σ_n ≤ τ and σ_n < τ on the set {τ > 0} then we shall say that τ is a predictable stopping time. Of course the main problem is that not every stopping time is predictable. See: Definition 3.5, page 182. The simplest examples are the jumps of the Poisson processes. See: Example 3.7, page 183.
where η_0 is F0-measurable and the η_i are Fτ_i-measurable random variables. If the stopping times (τ_i) are constant then we say that Y is a predictable step process.

Now we are ready to discuss the structure of predictable processes.

Proposition 1.42 Let X be a stochastic process on Θ = [0, ∞). The following statements are equivalent[25]:
1. X is predictable.
2. X is measurable with respect to the σ-algebra generated by the adapted left-regular processes.
3. X is measurable with respect to the σ-algebra generated by the adapted left-continuous processes.
4. X is measurable with respect to the σ-algebra generated by the predictable step processes.
5. X is measurable with respect to the σ-algebra generated by the predictable simple processes.

Proof. Let P1, P2, P3, P4 and P5 denote the σ-algebras in the proposition. Obviously it is sufficient to prove that these five σ-algebras are equal.
1. Obviously P1 ⊆ P2 ⊆ P3.
2. Let X be one of the processes generating P3, that is let X be a left-continuous, adapted process. As X is adapted,
X_n(t) = X(0)χ({0}) + Σ_k X(k/2^n) χ((k/2^n, (k+1)/2^n])
is a predictable step process. As X is left-continuous, obviously X_n → X, so X is P4-measurable, hence P3 ⊆ P4.
3. Obviously P4 ⊆ P5.
4. Let F ∈ F0 and let f_n be continuous functions such that f_n(0) = 1 and f_n is zero on the interval [1/n, ∞). If X_n = f_n χ_F then X_n is obviously P1-measurable, therefore the process
χ_F χ({0}) = lim_{n→∞} X_n
[25] Let us recall that by definition X(0−) = X(0). Therefore if ξ is an arbitrary F0-measurable random variable then the process X = ξχ({0}) is adapted and left-regular, so if Z is predictable then Z + X is also predictable. Hence we cannot generate P without the measurable rectangles {0} × F, F ∈ F0. If one wants to avoid these sets then one should define the predictable processes on the open half-line (0, ∞). This is not necessarily a bad idea, as the predictable processes are the integrands of stochastic integrals, and we shall always integrate only on the intervals (0, t], so in the applications the value of these processes at t = 0 is entirely irrelevant.
is also P1-measurable. If η_0 is an F0-measurable random variable then η_0 is a limit of F0-measurable step functions, therefore the process η_0 χ({0}) is P1-measurable. This means that the first term in (1.8) is P1-measurable. Let us now discuss the second kind of term in (1.8). Let τ be an arbitrary stopping time. If
X_n(t, ω) = 1 if t ≤ τ(ω); 1 − n(t − τ(ω)) if τ(ω) < t < τ(ω) + 1/n; 0 if t ≥ τ(ω) + 1/n,
then X_n has continuous trajectories, and it is easy to see that X_n is adapted. Therefore
χ([0, τ]) = lim_{n→∞} X_n ∈ P1.
If σ ≤ τ is another stopping time then
χ((σ, τ]) = χ([0, τ] \ [0, σ]) = χ([0, τ]) − χ([0, σ]) ∈ P1.
If F ∈ Fσ then
σ_F(ω) = σ(ω) if ω ∈ F, and σ_F(ω) = ∞ if ω ∉ F
is also a stopping time as {σ F ≤ t} = {σ ≤ t} ∩ F ∈ Ft . If σ ≤ τ then Fσ ⊆ Fτ , therefore not only σ F but τ F is also a stopping time. χF χ ((σ, τ ]) = χ ((σ F , τ F ]) ∈ P1 . If η is Fσ -measurable, then η is a limit of step functions, hence if η is Fσ measurable and σ ≤ τ then the process ηχ ((σ, τ ]) is P1 -measurable. By the definition of the predictable simple processes every predictable simple process is P1 -measurable. Hence P5 ⊆ P1 . Corollary 1.43 If Θ = [0, ∞) then the random intervals {0} × F, F ∈ F0 and (σ, τ ] generate the σ-algebra of the predictable sets. Corollary 1.44 If Θ = [0, ∞) then the random intervals {0} × F, F ∈ F0 and [0, τ ] generate the σ-algebra of the predictable sets. Definition 1.45 Let T denote the set of measurable rectangles {0} × F,
F ∈ F0
and the sets
(s, t] × F,  F ∈ Fs.
The sets in T are called predictable rectangles. Corollary 1.46 If Θ = [0, ∞) then the predictable rectangles generate the σ-algebra of predictable sets. It is quite natural to ask what the difference is between the σ-algebras generated by the right-regular and by the left-regular processes. Definition 1.47 The σ-algebra generated by the adapted, right-regular processes is called the σ-algebra of the optional sets. A process is called optional if it is measurable with respect to the σ-algebra of the optional sets. As every continuous process is right-regular so the σ-algebra of the optional sets is never smaller than the σ-algebra of the predictable sets P. Example 1.48 Adapted, right-regular process which is not predictable.
The simplest example of a right-regular process which is not predictable is the Poisson process. Unfortunately, at the present moment it is a bit difficult to prove26 . The next example is ‘elementary’. Let Ω [0, 1] and for all t let Ft
F_t = σ(B([0, t]) ∪ {(t, 1]}) if t < 1, and F_t = B([0, 1]) if t ≥ 1.
If s ≤ t then Fs ⊆ Ft, and hence F is a filtration. It is easy to see that the random function τ(ω) = ω is a stopping time. Let A = [τ] = [τ, τ] be the graph of τ, which is the diagonal of the closed square [0, 1]^2. 1. Let us show that A is optional. It is easy to see that the process X_n = χ([τ, τ + 1/n)) is right-continuous. X_n is adapted as {X_n(t) = 1} =
τ ≤t y} so, as [0, τ ] = {(t, ω) : t ≤ τ (ω)} {(t, ω) : t ≤ ω} = = {(x, y) : x ≤ y} , obviously R ∩ [0, τ ] = ∅ = (∅ × Ω) ∩ [0, τ ] . By the structure of Fs the interval (s, 1] is an atom of Fs . Hence if F ∩ (s, 1] = ∅, then (s, 1] ⊆ F , hence for some B ∈ B ([0, s]) R (s, t] × F = (s, t] × (B ∪ (s, 1]) . So R ∩ [0, τ ] = (s, t] × (B ∪ (s, 1]) ∩ {(x, y) : x ≤ y} = = ((s, t] × (s, 1]) ∩ {(x, y) : x ≤ y} = = ((s, t] × Ω) ∩ [0, τ ] and therefore in both cases the intersection has representation of type B × Ω. This remains true if we take the rectangles of type {0} × F, F ∈ F0 . As 27 If we draw Ω on the y-axis and we draw on the time line the x-axis then [τ , τ ] is the line y = x, [0, τ ] is the upper triangle. In the following argument F is under the diagonal hence the whole rectangle R is under the diagonal.
MARTINGALES
29
the generation and the restriction of the σ-algebras are interchangeable operations:
P ∩ [0, τ] = σ(T) ∩ [0, τ] = σ(T ∩ [0, τ]) = σ((B × Ω) ∩ [0, τ]) = σ(B × Ω) ∩ [0, τ] = (B([0, 1]) × Ω) ∩ [0, τ],
which is exactly (1.9).
4. As the left-regular χ([0, τ]) is adapted and χ([τ, τ]) is not predictable, the right-regular, adapted process χ([0, τ)) = χ([0, τ]) − χ([τ, τ]) is also not predictable.
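The dyadic discretization used in the proof of Proposition 1.42 can be sketched numerically: a left-continuous path X is approximated by the predictable step process X_n taking the value X(k/2^n) on (k/2^n, (k+1)/2^n], and left-continuity makes the approximation converge. The concrete path X(t) = t^2 and the evaluation grid are illustrative choices, not from the text.

```python
import math

def step_value(X, t, n):
    """Value at time t > 0 of the n-th dyadic predictable step process:
    X_n(t) = X(k/2^n) for t in (k/2^n, (k+1)/2^n]."""
    k = math.ceil(t * 2 ** n) - 1
    return X(k / 2 ** n)

X = lambda t: t * t                      # an illustrative left-continuous path
ts = [i / 1000 for i in range(1, 1001)]  # evaluation points in (0, 1]

def sup_error(n):
    return max(abs(X(t) - step_value(X, t, n)) for t in ts)

# The sup error shrinks as the dyadic resolution grows.
print(sup_error(3) > sup_error(6))  # True
```

Note that X_n uses the value at the *left* endpoint of each interval: the approximating process is known strictly before the interval begins, which is exactly what makes it predictable.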
1.3
Martingales
In this section we introduce and discuss some important properties of continuous-time martingales. As martingales are stochastic processes one should fix the properties of their trajectories. We shall assume that the trajectories of the martingales are right-regular. The right-continuity of martingales is essential in the proof of the Optional Sampling Theorem, which describes one of the most important properties of martingales. There are a lot of good books on martingales, so we will not try to prove the theorems in their most general form. We shall present only those results from martingale theory which we shall use in the following. The presentation below is a bit redundant. We could have first proved the Downcrossing Inequality and from it we could have directly proved the Martingale Convergence Theorem. But I don't think that it is a waste of time and paper to show these theorems from different angles.

Definition 1.49 Let us fix a filtration F. The adapted process X is a submartingale if
1. the trajectories of X are right-regular,
2. for any time t the expected value of X^+(t) is finite[28],
3. if s < t, then E(X(t) | Fs) ≥ X(s) almost surely.

[28] Some authors, see: [53], assume that if X is a submartingale then X(t) is integrable for all t. If we need this condition then we shall say that X is an integrable submartingale. The same remark holds for supermartingales as well. Of course martingales are always integrable.
We say that X is a supermartingale if −X is a submartingale. X is a martingale if X is a supermartingale and a submartingale at the same time. This means that
1. the trajectories of X are right-regular,
2. for any time t the expected value of X(t) is finite,
3. if s < t, then E(X(t) | Fs) = X(s) almost surely.
The conditional expectation is always a random variable, that is, the conditional expectation E(X(t) | Fs) is always an equivalence class. As X is a stochastic process, X(s) is a function and not an equivalence class. Hence the two sides in the definition can be equal only in the almost sure sense. Generally we shall not emphasize this, and we shall use the simpler =, ≥ and ≤ relations. If X is a martingale, g is a convex function[29] on R and E(g(X(t))^+) < ∞ for all t, then the process Y(t) = g(X(t)) is a submartingale, as by Jensen's inequality
g(X(s)) = g(E(X(t) | Fs)) ≤ E(g(X(t)) | Fs).
In particular, if X is a martingale, p ≥ 1, and |X(t)|^p is integrable for all t, then the process |X|^p is a submartingale. If X is a submartingale, g is convex and increasing, and Y(t) = g(X(t)) is integrable, then Y is a submartingale, as in this case
E(g(X(t)) | Fs) ≥ g(E(X(t) | Fs)) ≥ g(X(s)).
In particular, if X is a submartingale, then X^+ is also a submartingale.

1.3.1
Doob’s inequalities
The most well-known inequalities of the theory of martingales are Doob's inequalities. First we prove the discrete-time versions, and then we discuss the continuous-time cases.

Proposition 1.50 (Doob's inequalities, discrete-time) Let X = (X_k, F_k)_{k=1}^n be a non-negative submartingale.
1. If λ ≥ 0, then
λ P( max_{1≤k≤n} X_k ≥ λ ) ≤ E(X_n).   (1.10)
2. If p > 1, then[30]
‖ max_{1≤k≤n} X_k ‖_p ≤ p/(p − 1) ‖X_n‖_p = q ‖X_n‖_p.   (1.11)

[29] Convex functions are continuous so g(X) is adapted.
[30] Of course, as usual, 1/p + 1/q = 1.
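The maximal inequality (1.10) can be sanity-checked by Monte Carlo for the non-negative submartingale X_k = |S_k|, where S is a symmetric random walk. The walk, horizon, threshold and seed below are illustrative choices; the simulation only confirms that the estimated left-hand side stays below the estimated E(X_n).

```python
import random

random.seed(0)
n, sims, lam = 20, 20000, 6.0
count_max_ge = 0     # estimates P(max_k X_k >= lam)
sum_Xn = 0.0         # estimates E(X_n)
for _ in range(sims):
    s, m = 0, 0
    for _ in range(n):
        s += random.choice((-1, 1))
        m = max(m, abs(s))
    count_max_ge += (m >= lam)
    sum_Xn += abs(s)

lhs = lam * count_max_ge / sims   # lambda * P(max >= lambda)
rhs = sum_Xn / sims               # E(X_n)
print(lhs <= rhs)  # True: (1.10) holds with considerable slack here
```

For this walk E(X_n) is about sqrt(2n/pi), so the inequality holds with a wide margin and the sampling noise cannot flip the comparison.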
Proof. Let us remark that both inequalities estimate the size of the maximum of the non-negative submartingale.
1. Let λ > 0 and let
A_1 = {X_1 ≥ λ},  A_k = { max_{1≤i<k} X_i < λ ≤ X_k },  A = ∪_{k=1}^n A_k = { max_{1≤k≤n} X_k ≥ λ }.

2. If p > 1, then
‖ sup_{t∈Θ} |X(t)| ‖_p ≤ p/(p − 1) sup_{t∈Θ} ‖X(t)‖_p.   (1.16)
3. If Θ is closed and b is the finite or infinite right endpoint of Θ then under the conditions above
λ P( sup_{t∈Θ} X(t) ≥ λ ) ≤ ‖X^+(b)‖_1,   (1.17)
λ^p P( sup_{t∈Θ} |X(t)| ≥ λ ) ≤ ‖X(b)‖_p^p,
‖ sup_{t∈Θ} |X(t)| ‖_p ≤ p/(p − 1) ‖X(b)‖_p.   (1.18)
We shall very often use the following corollary of (1.16):

Corollary 1.54 If X is a martingale and p > 1, then
X* = sup_{t∈Θ} |X(t)| ∈ Lp(Ω),
that is
(X*)^p = ( sup_{t∈Θ} |X(t)| )^p = sup_{t∈Θ} |X(t)|^p ∈ L1(Ω),
if and only if X is bounded in Lp(Ω).

Definition 1.55 If p ≥ 1, then Hp will denote the space of martingales X for which
‖ sup_t |X(t)| ‖_p < ∞.
Hp also denotes the equivalence classes of these martingales, where two martingales are equivalent whenever they are indistinguishable.

Definition 1.56 If X ∈ H2, then we shall say that X is a square-integrable martingale.

If ‖sup_t |X_n(t) − X(t)|‖_p → 0 then for a subsequence
sup_t |X_{n_k}(t) − X(t)| → 0 almost surely,
hence if X_n is right-regular for every n, then X is almost surely right-regular. From the definition of the Hp spaces it is trivial that for all p ≥ 1 the Hp martingales are uniformly integrable. From these the next observation is obvious:

Proposition 1.57 Hp as a set of equivalence classes with the norm
‖X‖_{Hp} = ‖ sup_t |X(t)| ‖_p   (1.19)
is a Banach space. If p > 1 then by Corollary 1.54 X ∈ Hp if and only if X is bounded in Lp(Ω).

1.3.2
The energy equality
An important elementary property of martingales is the following:

Proposition 1.58 (Energy equality) Let X be a martingale and assume that X(t) is square-integrable for all t. If s < t then
E((X(t) − X(s))^2) = E(X^2(t)) − E(X^2(s)).

Proof. The difference of the two sides is
d = 2 · E(X(s) · (X(s) − X(t))).
As s < t, by the martingale property
d_n = 2 · E(X(s)χ(|X(s)| ≤ n) · (X(s) − X(t)))
= 2 · E(E(X(s)χ(|X(s)| ≤ n) · (X(s) − X(t)) | Fs))
= 2 · E(X(s)χ(|X(s)| ≤ n) · E(X(s) − X(t) | Fs))
= 2 · E(X(s)χ(|X(s)| ≤ n) · 0) = 0.
As X(s), X(t) ∈ L2(Ω), obviously |X(s) · (X(s) − X(t))| is integrable. Hence one can use the Dominated Convergence Theorem on both sides, so d = lim_{n→∞} d_n = 0.
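The energy equality can be verified exactly for the symmetric random walk martingale X_k = ξ_1 + … + ξ_k by enumerating all ±1 paths, each carrying probability 2^{-t}. The horizon and the two time points are illustrative choices.

```python
from itertools import product

s, t = 3, 6
paths = list(product((-1, 1), repeat=t))  # all 2^t equally likely paths

def X(path, k):
    """Value of the walk after k steps along the given path."""
    return sum(path[:k])

# E((X_t - X_s)^2) versus E(X_t^2) - E(X_s^2), computed exactly.
lhs = sum((X(p, t) - X(p, s)) ** 2 for p in paths) / len(paths)
rhs = (sum(X(p, t) ** 2 for p in paths) - sum(X(p, s) ** 2 for p in paths)) / len(paths)
print(lhs, rhs)  # 3.0 3.0
```

Both sides equal t − s = 3, reflecting that the increments are orthogonal to the past, which is exactly the cancellation used in the proof above.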
Corollary 1.59 If X ∈ H2 then there is a random variable, denoted by X(∞), such that X(∞) ∈ L2(Ω, F∞, P) and
X(t) = E(X(∞) | Ft) almost surely   (1.20)
for every t, and in L2(Ω)-convergence
lim_{t→∞} X(t) = X(∞).
Proof. Let t_n ↗ ∞ be arbitrary. By the energy equality the sequence ‖X(t_n)‖_2^2 is increasing, and by the definition of H2 it is bounded from above. Also by the energy equality, if n > m then
‖X(t_n) − X(t_m)‖_2^2 = ‖X(t_n)‖_2^2 − ‖X(t_m)‖_2^2,
hence (X(t_n)) is a Cauchy sequence in L2(Ω). As L2(Ω) is complete, the sequence (X(t_n)) is convergent in L2(Ω). It is obvious from the construction that the limit X(∞) as an object in L2(Ω) is unique, that is, X(∞) ∈ L2(Ω) is independent of the sequence (t_n). X is a martingale, so if s ≥ 0 then almost surely
X(t) = E(X(t + s) | Ft).
In probability spaces L1-convergence follows from L2-convergence, and as the conditional expectation is continuous in L1(Ω), letting s → ∞,
X(t) = E( lim_{s→∞} X(t + s) | Ft ) = E(X(∞) | Ft) almost surely.
Example 1.60 Wiener processes and the structure of the square-integrable martingales.
Let u < ∞ and let w be a Wiener process on the interval Θ = [0, u]. As w has independent increments, for every t ≤ u
E(w(u) | Ft) = E(w(u) − w(t) | Ft) + E(w(t) | Ft) = w(t).
On the half-line R+, w is not bounded in L2(Ω), that is, if Θ = R+ then w ∉ H2, and of course the representation (1.20) does not hold.

Proposition 1.61 Let X be a martingale and let p ≥ 1. If for some random variable X(∞)
X(t) → X(∞) in Lp(Ω),
then
X(t) → X(∞) almost surely
and
X(t) = E(X(∞) | Ft) almost surely, t ≥ 0.   (1.21)
Proof. As the conditional expectation is continuous in L1(Ω), letting s → ∞ in the relation
X(t) = E(X(t + s) | Ft) almost surely, t ≥ 0,
(1.21) follows. For an arbitrary s the increment N(u) = X(u + s) − X(s) is a martingale with respect to the filtration G_u = F_{s+u}. Let
β(s) = sup_{u≥0} |X(u + s) − X(∞)| ≤ sup_u |N(u)| + |X(s) − X(∞)|.
X is right-regular, so it is sufficient to take the supremum over the rational numbers, so β(s) is measurable. Moreover
sup_u ‖N(u)‖_p ≤ ‖X(s) − X(∞)‖_p + sup_u ‖X(u + s) − X(∞)‖_p.
Let ε > 0 be arbitrary. As X(s) → X(∞) in Lp, if s is large enough then the right-hand side is less than ε. By Doob's and by Markov's inequalities
P(β(s) > 2δ) ≤ P(|X(s) − X(∞)| > δ) + P( sup_u |N(u)| > δ )
≤ ‖X(s) − X(∞)‖_p^p / δ^p + (ε/δ)^p.
Therefore if s → ∞ then β(s) → 0 in probability. Every stochastically convergent sequence has an almost surely convergent subsequence. By the definition of β(s), if β(s_k) → 0 almost surely then X(t) → X(∞) almost surely.

Corollary 1.62 If X ∈ H2 then there is a random variable X(∞) ∈ L2(Ω) such that X(t) → X(∞), where the convergence holds in L2(Ω) and almost surely.

1.3.3
The quadratic variation of discrete time martingales
Our goal is to extend the result just proved to spaces Hp , p ≥ 1. The main tool of stochastic analysis is the so-called quadratic variation. Let us first investigate the quadratic variation of discrete-time martingales. Proposition 1.63 (Austin) Let Z denote the set of integers. Let X = (Xn , Fn )n∈Z be a martingale over Z, that is let us assume that Θ = Z. If X
is bounded in L1(Ω) then the 'quadratic variation' of X is almost surely finite:
Σ_{n=−∞}^{∞} (X_{n+1} − X_n)^2 < ∞ almost surely.   (1.22)
Proof. As X is bounded in L1(Ω) there is a k < ∞ such that ‖X_n‖_1 ≤ k for all n ∈ Z. Let X* = sup_n |X_n|. |X| is a non-negative submartingale, so by Doob's inequality
P(X* ≥ p) ≤ k/p,
therefore X* is almost surely finite. Fix a number p and define the continuously differentiable, convex function
f(t) = t^2 if |t| ≤ p, and f(t) = 2p|t| − p^2 if |t| > p.
As f is convex, the expression
g(s_1, s_2) = f(s_2) − f(s_1) − (s_2 − s_1) f′(s_1)
is non-negative. If |s_1|, |s_2| ≤ p then
g(s_1, s_2) = s_2^2 − s_1^2 − (s_2 − s_1) · 2s_1 = (s_2 − s_1)^2.
By the definition of f obviously f(t) ≤ 2p|t|. Therefore
E(f(X_n)) ≤ 2p E(|X_n|) ≤ 2pk.   (1.23)
By the elementary properties of the conditional expectation
E((X_{n+1} − X_n) f′(X_n)) = E(E((X_{n+1} − X_n) f′(X_n) | F_n)) = E(f′(X_n) E(X_{n+1} − X_n | F_n)) = 0
for all n ∈ Z. From this and from (1.23), using the definition of g, for all n
2pk ≥ E(f(X_n)) ≥ E(f(X_n) − f(X_{−n}))
= Σ_{i=−n}^{n−1} E(f(X_{i+1}) − f(X_i))
= Σ_{i=−n}^{n−1} E(f(X_{i+1}) − f(X_i) − (X_{i+1} − X_i) f′(X_i))
= Σ_{i=−n}^{n−1} E(g(X_i, X_{i+1})).
By the Monotone Convergence Theorem
E( Σ_{n∈Z} (X_{n+1} − X_n)^2 χ(X* ≤ p) ) = E( Σ_{n∈Z} g(X_n, X_{n+1}) χ(X* ≤ p) )
≤ E( Σ_{n∈Z} g(X_n, X_{n+1}) ) = Σ_{n∈Z} E(g(X_n, X_{n+1})) ≤ 2pk.
As X* is almost surely finite, Σ_{n∈Z} (X_{n+1} − X_n)^2 is almost surely convergent.
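The properties of the auxiliary function f used in the proof can be checked numerically: its convexity gap g is non-negative everywhere and reduces to the squared increment inside [−p, p]. The value of p and the grid of test points are illustrative choices.

```python
p = 3.0

def f(t):
    # f(t) = t^2 for |t| <= p, linear growth 2p|t| - p^2 outside
    return t * t if abs(t) <= p else 2 * p * abs(t) - p * p

def fprime(t):
    # derivative of f; f is C^1 since both pieces match at |t| = p
    return 2 * t if abs(t) <= p else (2 * p if t > 0 else -2 * p)

def g(s1, s2):
    # convexity gap (Bregman-type difference) used in the proof
    return f(s2) - f(s1) - (s2 - s1) * fprime(s1)

pts = [i / 4 for i in range(-40, 41)]
ok_nonneg = all(g(a, b) >= -1e-9 for a in pts for b in pts)
ok_square = all(abs(g(a, b) - (b - a) ** 2) < 1e-9
                for a in pts for b in pts if abs(a) <= p and abs(b) <= p)
print(ok_nonneg, ok_square)  # True True
```

The key point exploited in the proof is visible here: truncating the walk on {X* ≤ p} turns the uniformly controlled quantity g into the actual squared increments.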
Corollary 1.64 Let X = (X_n, F_n) be a martingale over the natural numbers N. If X is bounded in L1(Ω) and (τ_n) is an increasing sequence of stopping times then almost surely
Σ_{n=1}^{∞} (X(τ_{n+1}) − X(τ_n))^2 < ∞.   (1.24)
Proof. For every m let us introduce the bounded stopping times τ_n^m = τ_n ∧ m. By the discrete-time version of the Optional Sampling Theorem[32]
X^m = (X(τ_n^m), F_{τ_n^m})_n
is a martingale, and therefore from the proof of the previous proposition
2pk ≥ E( Σ_{n=1}^{∞} (X(τ_{n+1}^m) − X(τ_n^m))^2 χ(X* ≤ p) ).
If m → ∞ then by Fatou's lemma
E( Σ_{n=1}^{∞} (X(τ_{n+1}) − X(τ_n))^2 χ(X* ≤ p) ) ≤ 2pk,
from which (1.24) is obvious.

[32] See: Lemma 1.83, page 49.
Corollary 1.65 Let X = (X_n, F_n) be a martingale over the natural numbers N. If X is bounded in L1(Ω) then there is a variable X∞ such that |X∞| < ∞ and
lim_{n→∞} X_n = X∞ almost surely.

Proof. Assume that for some ε > 0, on a set A of positive measure,
lim sup_{p,q→∞} |X_p − X_q| ≥ 2ε.   (1.25)
Let τ_0 = 1, and let
τ_{n+1} = inf{m ≥ τ_n : |X_m − X_{τ_n}| ≥ ε}.
Obviously τ_n is a stopping time for all n and the sequence (τ_n) is increasing. On the set A, |X(τ_{n+1}) − X(τ_n)| ≥ ε for every n. By (1.24) almost surely Σ_{n=0}^{∞} (X(τ_{n+1}) − X(τ_n))^2 < ∞, which is impossible.
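A classical illustration of almost sure martingale convergence is the Pólya urn: the fraction of red balls is a bounded, hence L1(Ω)-bounded, martingale, so by the convergence theorem its trajectories settle down. The seed, urn size and horizon below are illustrative choices; the simulation only exhibits one converging trajectory.

```python
import random

random.seed(1)
red, black = 1, 1
fractions = []
for n in range(6000):
    # draw a ball proportionally and return it with one extra of the same colour
    if random.random() < red / (red + black):
        red += 1
    else:
        black += 1
    fractions.append(red / (red + black))

tail = fractions[5000:]
osc = max(tail) - min(tail)  # late oscillation of the trajectory
print(0.0 <= min(tail) and max(tail) <= 1.0, osc < 0.1)
```

Across many runs the limiting fraction is random (for this urn it is uniformly distributed on [0, 1]), but each individual trajectory converges, which is what the vanishing late oscillation shows.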
Corollary 1.66 If X = (X_n, F_n) is a non-negative martingale then there exists a finite, non-negative variable X∞ such that X∞ ∈ L1(Ω) and almost surely X_n → X∞.

Proof. X is non-negative and the expected value of X_n is the same for all n, hence X is obviously bounded in L1(Ω). So the almost sure limit X_n → X∞ exists. By Fatou's lemma
X(0) = E(X_n | F0) = lim_{n→∞} E(X_n | F0) ≥ E( lim inf_{n→∞} X_n | F0 ) = E(X∞ | F0) ≥ 0,
and therefore X∞ ∈ L1(Ω).

Corollary 1.67 Assume that Θ = R+. If X is a uniformly integrable martingale then there is a variable X(∞) ∈ L1(Ω) such that X(t) → X(∞), where the convergence holds in L1(Ω) and almost surely. For all t
X(t) = E(X(∞) | Ft) almost surely.   (1.26)
Proof. Every uniformly integrable set is bounded in L1, so if t_n ↗ ∞, then there is an X(∞) such that X(t_n) → X(∞) almost surely. By the uniform integrability the convergence holds in L1(Ω) as well. Obviously X(∞) as an equivalence class is independent of (t_n). The relation (1.26) is an easy consequence of the L1(Ω)-continuity of the conditional expectation.
Corollary 1.68 Assume that p ≥ 1 and Θ = R+. If X ∈ Hp then there is a variable X(∞) ∈ Lp(Ω) such that X(t) → X(∞), where the convergence holds in Lp(Ω) and almost surely. For all t
X(t) = E(X(∞) | Ft) almost surely.   (1.27)
Proof. If the measure is finite and p ≤ q then Lq ⊆ Lp. Hence if p ≥ 1 and X ∈ Hp then X ∈ H1, so if t_n ↗ ∞ then there is a variable X(∞) such that X(t_n) → X(∞) almost surely. As by the definition of the Hp spaces |X(t)|^p ≤ sup_s |X(s)|^p ∈ L1(Ω), X(∞) ∈ Lp(Ω) and by the Dominated Convergence Theorem the convergence holds in Lp(Ω) as well. Obviously X(∞), as an equivalence class, is independent of (t_n). The relation (1.27) is an easy consequence of the L1(Ω)-continuity of the conditional expectation.

Theorem 1.69 (Lévy's convergence theorem) If (F_n) is an increasing sequence of σ-algebras, ξ ∈ L1(Ω) and F∞ = σ(∪_n F_n), then
X_n = E(ξ | F_n) → E(ξ | F∞),
where the convergence holds in L1(Ω) and almost surely.

Proof. Let X_n = E(ξ | F_n). As
E(|X_n|) = E(|E(ξ | F_n)|) ≤ E(E(|ξ| | F_n)) = E(|ξ|) < ∞,
X = (X_n, F_n) is an L1(Ω)-bounded martingale. Therefore X_n → X∞ almost surely. After the proof we shall prove as a separate lemma that the sequence (X_n) is uniformly integrable, hence X_n → X∞ in L1. If A ∈ F_n and m ≥ n, then
∫_A X_m dP = ∫_A ξ dP,
hence as X_m → X∞ in L1,
∫_A X∞ dP = ∫_A ξ dP,  A ∈ ∪_n F_n.   (1.28)
As X∞ and ξ are integrable, it is easy to see that the sets A for which (1.28) is true form a λ-system. As (F_n) is increasing, ∪_n F_n is obviously a π-system. Therefore by the Monotone Class Theorem (1.28) is true if A ∈ F∞ = σ(∪_n F_n). X∞ is obviously F∞-measurable, hence X∞ = E(ξ | F∞) almost surely.
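Lévy's theorem can be illustrated with dyadic σ-algebras on [0, 1]: F_n is generated by the dyadic intervals of length 2^{-n}, so E(ξ | F_n) is the block average of ξ, and the L1 distance to ξ shrinks as the σ-algebras grow. The function ξ(ω) = ω^2 and the grid resolution are illustrative choices; everything is computed on a fine uniform grid standing in for [0, 1].

```python
N = 2 ** 12                                      # grid resolution
xi = [((i + 0.5) / N) ** 2 for i in range(N)]    # xi sampled at cell centres

def cond_exp(values, n):
    """E(xi | F_n) for the dyadic sigma-algebra: blockwise averages."""
    block = len(values) // 2 ** n
    out = []
    for b in range(2 ** n):
        chunk = values[b * block:(b + 1) * block]
        avg = sum(chunk) / block
        out.extend([avg] * block)
    return out

def l1_error(n):
    approx = cond_exp(xi, n)
    return sum(abs(a - x) for a, x in zip(approx, xi)) / N

# The L1 distance to xi decreases as the sigma-algebras refine.
print(l1_error(2) > l1_error(5) > l1_error(8))  # True
```

This is the convergence E(ξ | F_n) → E(ξ | F∞) = ξ of the theorem, since here σ(∪_n F_n) separates all the grid cells.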
Lemma 1.70 If ξ ∈ L1, and (F_α)_{α∈A} is an arbitrary set of σ-algebras, then the set of random variables
X_α = E(ξ | F_α),  α ∈ A,   (1.29)
is uniformly integrable.

Proof. By Markov's inequality, for every α
P(|X_α| ≥ n) ≤ (1/n) E(E(|ξ| | F_α)) = (1/n) E(|ξ|).
Therefore for any δ there is an n_0 such that if n ≥ n_0, then P(|X_α| ≥ n) < δ. As for ξ ∈ L1(Ω) the integral function A ↦ ∫_A |ξ| dP is absolutely continuous, for arbitrary ε > 0 there is a δ such that if P(A) < δ, then ∫_A |ξ| dP < ε. Hence if n is large enough, then
∫_{{|X_α|>n}} |X_α| dP ≤ ∫_{{|X_α|>n}} E(|ξ| | F_α) dP = ∫_{{|X_α|>n}} |ξ| dP < ε,
which means that the set (1.29) is uniformly integrable. 1.3.4
The downcrossings inequality
Let X be an arbitrary adapted stochastic process and let a < b. Let us fix a point of time t, and let S = {s_0 < s_1 < · · · < s_m} be a finite set of moments in the time interval [0, t). Let[33]
τ_0 = inf{s ∈ S : X(s) > b} ∧ t.
By induction define
τ_{2k+1} = inf{s ∈ S : s > τ_{2k}, X(s) < a} ∧ t,
τ_{2k} = inf{s ∈ S : s > τ_{2k−1}, X(s) > b} ∧ t.
It is easy to check that τ_k is a stopping time for all k. It is easy to see that if X is an integrable submartingale then, almost surely, the inequality
τ_{2k} ≤ τ_{2k+1} < t

[33] If the set after inf is empty, then the infimum is by definition +∞.
is impossible, as in this case X(τ_{2k}) > b, X(τ_{2k+1}) < a and by the submartingale property[34]
b < E(X(τ_{2k})) ≤ E(X(τ_{2k+1})) < a,
which is impossible. We say that a function f downcrosses the interval [a, b] if there are points u < v with f(u) > b and f(v) < a. By definition f has n downcrossings with thresholds a, b on the set S if there are points in S
u_1 < v_1 < u_2 < v_2 < · · · < u_n < v_n
with f(u_k) > b, f(v_k) < a. Let us denote by D_S^{a,b} the number of downcrossings of X over a < b in the set S. Obviously
{D_S^{a,b} ≥ n} = {τ_{2n−1} < t} ∈ Ft,
and hence D_S^{a,b} is Ft-measurable. We show that
χ(D_S^{a,b} ≥ n) ≤ ( Σ_{k=0}^{m} (X(τ_{2k}) − X(τ_{2k+1})) + (X(t) − b)^+ ) / (n(b − a)).   (1.30)
Recall that m is the number of points in S. Therefore the maximum number of possible downcrossings is obviously m. If we have more than n downcrossings then the first n terms in the sum are bigger than b − a. For every trajectory all but the last non-zero term of the sum are positive, as they are all not smaller than b − a > 0. There are two possibilities: in the last non-zero term either τ_{2k+1} < t or τ_{2k+1} = t. In the first case X(τ_{2k}) − X(τ_{2k+1}) > b − a > 0. In the second case still X(τ_{2k}) > b, therefore in this case X(τ_{2k}) − X(τ_{2k+1}) > b − X(t). Of course b − X(t) can be negative. This is the reason why we added the correction term (X(t) − b)^+ to the sum. If b − X(t) < 0 then
X(τ_{2k}) − X(τ_{2k+1}) + (X(t) − b)^+ = X(τ_{2k}) − X(τ_{2k+1}) + X(t) − b = X(τ_{2k}) − X(t) + X(t) − b = X(τ_{2k}) − b > 0,

[34] See: Lemma 1.83, page 49.
44
STOCHASTIC PROCESSES
which means that (1.30) always holds. Taking the expectation on both sides P
DSa,b
m k=0
≥n ≤E
+
(X(τ 2k ) − X(τ 2k+1 )) + (X(t) − b) n(b − a)
=
1 E (X(τ 2k ) − X(τ 2k+1 )) + n (b − a) k=0
1 + + E (X(t) − b) . n (b − a) m
=
Now assume that X is an integrable submartingale. As t ≥ τ 2k+1 ≥ τ 2k by the discrete Optional Sampling Theorem35 E (X(τ 2k ) − X(τ 2k+1 )) ≤ 0, so
P DSa,b ≥ n ≤
+ E (X(t) − b) n (b − a)
.
If the number of points of S increases by refining S then the number of downcrossings DSa,b does not decrease. If S is an arbitrary countable set then the number of downcrossings in S is the supremum of the downcrossings of the finite subsets of S. With the Monotone Convergence Theorem we get the following important inequality: Theorem 1.71 (Downcrossing inequality) If X is an integrable submartingale and S is an arbitrary finite or countable subset of the time interval [0, t) then
E ((X(t) − b)+ ) P DSa,b ≥ n ≤ . n (b − a) In particular
P DSa,b = ∞ = 0. There are many important consequences of this inequality. The first one is a generalization of the martingale convergence theorem. Corollary 1.72 (Submartingale convergence theorem) Let X (Xn , Fn ) be a submartingale over the natural numbers N. If X is bounded in L1 (Ω) then 35 See:
Lemma 1.83, page 49.
MARTINGALES
45
there is a variable X∞ ∈ L1 (Ω) such that a.s.
lim Xn = X∞ .
(1.31)
n→∞ a.s.
Proof. If Xn → X∞ then by Fatou’s lemma E (|X∞ |) ≤ lim inf E (|Xn |) ≤ k < ∞ n→∞
and X∞ ∈ L1 (Ω). Let a < b be rational thresholds, and let Sm {1, 2, . . . , m}. As E (|Xm |) ≤ k for all m
P DSa,b ≥n ≤ m
+ E (Xm − b) n (b − a)
≤
k . n(b − a)
If m ∞ then for all n
P DNa,b = ∞ ≤ P DNa,b ≥ n ≤
k , n(b − a)
which implies that P DNa,b = ∞ = 0. The convergence in (1.31) easily follows from the next lemma: Lemma 1.73 Let (cn ) be an arbitrary sequence of real numbers. If for every a < b rational thresholds the number of downcrossings of the sequence (cn ) is finite then the (finite or infinite) limit limn→∞ cn exists. Proof. The lim supn cn and the lim inf n cn extended real numbers always exist. If lim inf cn < a < b < lim sup cn n→∞
n→∞
then the number of the downcrossings of (cn ) is infinite. Definition 1.74 Let ξ ∈ L1 (Ω) and let Xn E (ξ | Fn ) , n ∈ N. Assume that the sequence of σ-algebras (Fn ) is decreasing, that is Fn+1 ⊆ Fn for all n ∈ N. These type of sequences are called reversed martingales. If Y−n Xn for all n ∈ N and G−n Fn then Y = (Yn , Gn ) is martingale over the parameter set Θ = {−1, −2, . . .}. If (Xn , Fn ) is a reversed martingale then one can assume that Xn = E (X0 | Fn ) for all n. If X is a continuous-time martingale and tn t∞ then the sequence (X(tn ), Ftn )n is a reversed martingale.
46
STOCHASTIC PROCESSES
Theorem 1.75 (L´ evy) If (Fn ) is a decreasing sequence of σ-algebras, X0 ∈ L1 (Ω) and F∞ ∩n Fn then Xn E (X0 | Fn ) → E (X0 | F∞ ) , where the convergence holds in L1 (Ω) and almost surely. Proof. As (Xn ) is uniformly integrable36 , it is sufficient to prove that (Xn ) is almost surely convergent. Let a < b be rational thresholds. On the set A
lim inf Xn < a < b < lim sup Xn n→∞
n→∞
the number of downcrossings is infinite. As n → X−n is a martingale on Z, the probability of A is zero. Hence a.s.
lim inf Xn = lim sup Xn . n→∞
1.3.5
n→∞
Regularization of martingales
Recall that, by definition, every continuous-time martingale is right-regular. Let F be an arbitrary filtration, and let ξ ∈ L1 (Ω). In discrete-time the sequence Xn E(ξ | Fn ) is a martingale as for every s < t a.s.
E (X(t) | Fs ) E (E(ξ | Ft ) | Fs ) = E (ξ | Fs ) X(s). In continuous-time X is not necessarily a martingale as the trajectories of X are not necessarily right-regular. Definition 1.76 A stochastic process X has martingale structure if E (X (t)) is finite for every t and a.s.
E (X(t) | Fs ) = X(s) for all s < t. Our goal is to show that if the filtration F satisfies the usual conditions then every stochastic process with martingale structure has a modification which is a 36 See:
Lemma 1.70, page 42.
MARTINGALES
47
martingale. The proof depends on the following simple lemma: Lemma 1.77 If X has a martingale structure then there is an Ω0 ⊆ Ω with P(Ω0 ) = 1, such that for every trajectory X(ω) with ω ∈ Ω0 and for every rational threshold a < b the number of downcrossings over the rational numbers a,b is finite. In particular if ω ∈ Ω0 then for every t ∈ Θ the (finite or infinite) DQ limits lim X(s, ω),
st, s∈Q
lim X(s, ω)
st, s∈Q
exist. Proof. The first part of the lemma is a direct consequence of the downcrossings inequality. If limn X(sn , ω) does not exist for some sn t then for some rational / Ω0 . thresholds a < b the number of downcrossings of (X(sn , ω)) is infinite, so ω ∈ Assume that X has a martingale structure. Let Ω0 ⊆ Ω be the subset in the lemma above. (t, ω) X
0 if ω ∈ / Ω0 . limst,s∈Q X(s, ω) if ω ∈ Ω0
(1.32)
is right-regular. Let t < s, ε > 0. We show that X (s, ω) ≤ X (t, ω) − X (tn , ω) + X (t, ω) − X
(s, ω) . + |X (tn , ω) − X (sn , ω)| + X (sn , ω) − X
As for an arbitrary ω ∈ Ω0 the number of ε/3 downcrossings of X over the Q is finite, so one can assume that in a right neighbourhood (t.t + u) of t for every tn , sn ∈ Q |X (tn , ω) − X (sn , ω)|
sup X + (t) = 0, t∈[0,1]
t∈I
1
that is, without the regularity of the trajectories Doob’s inequality does not hold. Of course Y ≡ 0 is regular modification of X, and for Y Doob’s inequality holds. 1.3.6
The Optional Sampling Theorem
As a first step let us prove the discrete-time version of the Optional Sampling Theorem40 . Lemma 1.83 Let X = (Xn , Fn ) be a discrete-time, integrable submartingale. If τ 1 and τ 2 are stopping times and for some p < ∞ P (τ 1 ≤ τ 2 ) = P (τ 2 ≤ p) = 1, then X(τ 1 ) ≤ E (X(τ 2 ) | Fτ 1 ) and E (X0 ) ≤ E (X(τ 1 )) ≤ E (X(τ 2 )) ≤ E (Xp ) . If X is a martingale then in both lines above equality holds everywhere. 40 The reader should observe that we have already used this lemma several times. Of course the proof of the lemma is independent of the results above.
50
STOCHASTIC PROCESSES
Proof. Let τ 1 ≤ τ 2 ≤ p and ϕk χ (τ 1 < k ≤ τ 2 ) . Observe that {ϕk = 1} = {τ 1 < k, τ 2 ≥ k} = c
= {τ 1 ≤ k − 1} ∩ {τ 2 ≤ k − 1} ∈ Fk−1 . By the assumptions Xk is integrable for all k, so Xk − Xk−1 is also integrable, therefore the conditional expectation of the variable Xk − Xk−1 with respect to the σ-algebra Fk−1 exists. ϕk is bounded, hence p E (η) E ϕk [Xk − Xk−1 ] = k=1
=
p
E (E (ϕk [Xk − Xk−1 ] | Fk−1 )) =
k=1
=
p
E (ϕk E (Xk − Xk−1 | Fk−1 )) ≥ 0.
k=1
If τ 1 (ω) = τ 2 (ω) for some outcome ω, then ϕk (ω) = 0 for all k, hence η (ω) 0. If τ 1 (ω) < τ 2 (ω), then η (ω) X (τ 1 (ω) + 1) − X (τ 1 (ω)) + X (τ 1 (ω) + 2) − X (τ 1 (ω) + 1) + . . . + X (τ 2 (ω)) − X (τ 2 (ω) − 1) , which is X (τ 2 (ω)) − X (τ 1 (ω)). Therefore E (η) = E (X (τ 2 ) − X (τ 1 )) ≥ 0. Xk is integrable for all k, therefore E (X (τ 1 )) and E (X (τ 2 )) are finite. By the finiteness of these expected values E (X (τ 2 ) − X (τ 1 )) = E (X (τ 2 )) − E (X (τ 1 )) , hence E (X (τ 2 )) ≥ E (X (τ 1 )) . Let A ∈ Fτ 1 ⊆ Fτ 2 , and let us define the variables τ k (ω) if ω ∈ A . τ ∗k (ω) p + 1 if ω ∈ /A
(1.33)
MARTINGALES
51
τ ∗1 and τ ∗2 are stopping times since if n ≤ p, then {τ ∗k ≤ n} = A ∩ {τ k ≤ n} = A ∩ {τ k ≤ n} ∈ Fn . By (1.33) E (X
(τ ∗2 ))
=
X (τ 2 ) dP+ Ac
A
X (p + 1) dP ≥ E (X (τ ∗1 )) =
X (τ 1 ) dP+
=
X (p + 1) dP. Ac
A
As Xp+1 is integrable one can cancel inequality so
Ac
X (p + 1) dP from both sides of the
X (τ 2 ) dP ≥
A
X (τ 1 ) dP. A
X (τ 1 ) is Fτ 1 -measurable and therefore E (X (τ 2 ) | Fτ 1 ) ≥ X (τ 1 ) . To prove the continuous-time version of the Optional Sampling Theorem we need some technical lemmas: Lemma 1.84 If τ is a stopping time, then there is a sequence of stopping times (τ n ) such that τ n has finite number of values41 , τ < τ n for all n and τn τ. (n)
Proof. Divide the interval [0, n) into n2n equal parts. Ik Let τ n (ω)
k/2n +∞
if otherwise
[(k − 1) /2n , k/2n ).
ω ∈ τ −1 (Ik ) (n)
.
(n)
Obviously τ < τ n . At every step the subintervals Ik are divided equally, and (n) (n) the value of τ n on τ −1 (Ik ) is always the right endpoint of the interval Ik . Therefore τ n τ . τ is a stopping time, hence, using that, every stopping time is a weak stopping time τ 41 τ
n (ω)
−1
(n) Ik
= +∞ is possible.
=
k τ< n 2
k−1 ∩ τ< 2n
c ∈ Fk/2n .
52
STOCHASTIC PROCESSES
Therefore
i τn ≤ n 2
=
k τn = n 2
k≤i
=
(n) τ −1 Ik ∈ Fi/2n . k≤i
The possible values of τ n are among the dyadic numbers i/2n and therefore τ n is a stopping time. Lemma 1.85 If (τ n ) is a sequence of stopping times and τ n τ then Fτ n + Fτ + . If τ n > τ and τ n τ then Fτ n Fτ + . Proof. Recall that by definition A ∈ Fρ+ if A ∩ {ρ ≤ t} ∈ Ft+ for every t. If A ∈ Fρ+ , then A ∩ {ρ < t} =
n
1 A∩ ρ≤t− n
∈ ∪n F(t−1/n)+ ⊆ Ft .
1. Let A ∈ Fρ+ and let ρ ≤ σ. A ∩ {σ ≤ t} = A ∩ {ρ ≤ t} ∩ {σ ≤ t} ∈ Ft+ as A ∩ {ρ ≤ t} ∈ Ft+ and {σ ≤ t} ∈ Ft . From this it is easy to see that Fτ + ⊆ ∩n Fτ n + . If A ∈ ∩n Fτ n + , then as τ n τ A ∩ {τ < t} = A
(∪n {τ n < t}) =
(A ∩ {τ n < t}) ∈ Ft .
n
So A ∩ {τ ≤ t} =
n
1 A∩ τ τ be a finitevalued approximating sequence42 . As τ is bounded there is an N large enough that τ (n) ≤ N . By the first lemma X(τ (n) ) = E (X(N ) | Fτ (n) ) .
(1.35)
As τ (n) > τ , by the last lemma ∩n Fτ (n) = Fτ + . So by the definition of the conditional expectation (1.35) means that X(τ (n) )dP = X(N )dP, A ∈ Fτ + . A
A
X(N ) is integrable therefore the sequence X(τ (n) ) is uniformly integrable43
by (1.35). By the right-continuity of the martingales X (τ ) = limn→∞ X τ (n) , so if A ∈ Fτ + then (n) X(N )dP = lim X(τ )dP = lim X(τ (n) )dP = n→∞
A
=
A
A n→∞
X(τ )dP. A
As X (τ ) is Fτ -measurable and Fτ ⊆ Fτ + , X (τ ) = E (X (N ) | Fτ + ) . If X is uniformly integrable then one can assume that X is a martingale on [0, ∞]. There is a continuous bijective time transformation f between the intervals [0, ∞] and [0, 1]. During this transformation the properties of X and τ do not change, but f (τ ) will be bounded, so using the same argument as above one can prove that X (τ ) = E (X (∞) | Fτ + ) . Finally if τ 1 ≤ τ 2 , then as Fτ 1 + ⊆ Fτ 2 + E (X (τ 2 ) | Fτ 1 + ) = E (E (X (N ) | Fτ 2 + ) | Fτ 1 + ) = = E (X (N ) | Fτ 1 + ) = X (τ 1 ) , where if X is uniformly integrable, then N ∞. 42 See: 43 See:
Lemma 1.84, page 51. Lemma 1.70, page 42.
54
STOCHASTIC PROCESSES
Corollary 1.87 If X is a non-negative martingale and τ 1 ≤ τ 2 , then X(τ 1 ) ≥ E (X(τ 2 ) | Fτ 1 + ) .
(1.36)
Proof. First of all let us remark, that as X is a non-negative martingale X(∞) is meaningful 44 , and if n ∞ then X (τ ∧ n) → X (τ ) for every stopping time τ . Let G σ ∪n F(τ ∧n)+ . Obviously G ⊆ Fτ + . Let A ∈ Fτ + . A ∩ {τ ≤ n} ∩ {τ ∧ n ≤ t} = A ∩ {τ ≤ t ∧ n} ∈ Ft+ , therefore A ∩ {τ ≤ n} ∈ F(τ ∧n)+ . So A ∩ {τ < ∞} ∈ G. Also A ∩ {τ > n} ∩ {τ ∧ n ≤ t} = A ∩ {t ≥ τ > n} ∈ Ft+ so A ∩ {τ > n} ∈ F(τ ∧n)+ . Hence A ∩ {τ = ∞} = A ∩ (∩n {τ > n}) ∈ G, therefore G = Fτ + . Let n1 ≤ n2 . By the Optional Sampling Theorem
X(τ 1 ∧ n1 ) = E X(τ 2 ∧ n2 ) | F(τ 1 ∧n1 )+ . X(τ 2 ∧ n2 ) ∈ L1 (Ω) and therefore by L´evy’s theorem X(τ 1 ) = E (X(τ 2 ∧ n2 ) | Fτ 1 + ) . By Fatou’s lemma X(τ 1 ) = lim E (X(τ 2 ∧ n2 ) | Fτ 1 + ) ≥ E n2 →∞
lim X(τ 2 ∧ n2 ) | Fτ 1 +
n2 →∞
=
= E (X(τ 2 ) | Fτ 1 + ) .
Proposition 1.88 (Optional Sampling Theorem for submartingales) Let τ 1 ≤ τ 2 bounded stopping times. If X is an integrable submartingale then X (τ 1 ) and X (τ 2 ) are integrable and X (τ 1 ) ≤ E (X (τ 2 ) | Fτ 1 ) .
(1.37)
The inequality also holds if τ 1 ≤ τ 2 are arbitrary stopping times and X can be extended as an integrable submartingale to [0, ∞]. Proof. The proof of the proposition is nearly the same as the proof in the martingale case. Again it is sufficient to prove the inequality in the bounded 44 See:
Corollary 1.66, page 40.
MARTINGALES (n)
55
(n)
case. Assume that τ 1 ≤ τ 2 ≤ K and let (τ 1 )n and (τ 2 )n be the finite-valued (n) (n) approximating sequences of τ 1 and τ 2 . By the construction τ 1 ≤ τ 2 , so by the first lemma of the subsection (n) (n) X(τ 1 )dP ≤ X(τ 2 )dP, F ∈ Fτ 1 + . F
F
(n)
By the right-continuity of submartingales X(τ k ) → X(τ k ) and therefore one should prove that the convergence holds in L1 (Ω), that is, one should prove the (n) uniform integrability of the sequences (X(τ k )). Since in this case one can take the limits under the integral signs therefore X(τ 1 )dP ≤ X(τ 2 )dP, F ∈ Fτ 1 + . F
F
As X(τ 1 ) is Fτ 1 + -measurable by the definition of the conditional expectation X (τ 1 ) = E (X (τ 1 ) | Fτ 1 ) ≤ E (X (τ 2 ) | Fτ 1 + ) . This means that (1.37) holds. Let us prove that the sequence uniformly integrable.
(n) X(τ k ) is
1. As X is submartingale, X + is also submartingale, therefore from the finite Optional Sampling Theorem
(n) ≤ E X + (K) | Fτ (n) . 0 ≤ X+ τ k k
The right-hand side is uniformly integrable45 , so the left-hand side is also uniformly integrable. (n)
2. Let Xn X(τ k ). By the finite Optional Sampling Theorem (Xn ) is obviously an integrable reversed submartingale. Let n > m. As (Xn ) is a reversed submartingale 0≤ Xn− dP = − Xn dP = Xn dP − E(Xn ) ≤ {Xn− ≥N } {Xn− ≥N } {Xn− 0, then σ τ ∧ N is a bounded stopping time. If π were not right but left-continuous then one could not apply the Optional Sampling Theorem: if P were left-continuous then P (σ) = 0, and E (π (0)) = 0 = E (−λσ) = E (P (σ) − λσ) = E (π (σ)) . Let w be a Wiener process and let τ a inf {t : w(t) = a}
MARTINGALES
57
be the first passage time of an a = 0. As w is not uniformly integrable and τ a is unbounded, one cannot apply the Optional Sampling Theorem: almost surely46 a.s. τ a < ∞, hence w (τ a ) = a. Therefore E (w (τ a )) = E (a) = a = 0 = E (w (0)) .
Example 1.90 The exponential martingales of Wiener processes are not uniformly integrable.
Let w be a Wiener process. If the so-called exponential martingale X (t) exp (w (t) − t/2) were uniformly integrable, then for every stopping time one could apply the Optional Sampling Theorem. X is a non-negative martingale, therefore there is47 a random variable X (∞) such that almost surely X(t) → X(∞). For almost all trajectories of w the set {w = 0} is unbounded48 , therefore w(σ n ) = 0 for some sequence σ n ∞. Therefore σ
σn
a.s. n X (∞) = lim X (σ n ) lim exp w (σ n ) − = lim exp − = 0. n→∞ n→∞ n→∞ 2 2 a.s.
Since X(0) = 1, X(∞) = 0 and X is continuous, if a < 1 then almost surely τ a inf {t : X(t) = a} < ∞. a.s.
That is X (τ a ) = a. So if a < 1, then E (X (0)) = 1 > a = E (X (τ a )) . Hence X is not uniformly integrable. Proposition 1.91 (Martingales and conservation of the expected value) Let X be an adapted and right-regular process. X is a martingale if and only if X(τ ) ∈ L1 (Ω)
and
E (X(τ )) = E (X(0))
for all bounded stopping times τ . This property holds for every stopping time τ if and only if X is a uniformly integrable martingale. 46 See:
Proposition B.7, page 564. Corollary 1.66, page 40. 48 See: Corollary B.8, page 565. 47 See:
58
STOCHASTIC PROCESSES
Proof. If X is a martingale, or uniformly integrable martingale, then by the Optional Sampling Theorem the proposition holds. Let s < t and let A ∈ Fs . It is easy to check that τ = tχAc + sχA
(1.38)
is a bounded stopping time. By the assumption of the proposition E (X(0)) = E (X(τ )) = E (X(t)χAc ) + E (X(s)χA ) . As τ ≡ t is also a stopping time, E (X(0)) = E (X(t)) = E (X(t)χAc ) + E (X(t)χA ) . Comparing the two equations E (X(s)χA ) = E (X(t)χA ) , that is E (X(s) | Fs ) = E (X(t) | Fs ) . As X is adapted, X(s) is Fs -measurable so X(s) = E (X(t) | Fs ). If one can apply the property E (X(τ )) = E (X(0)) for every stopping time τ then one can apply it for the stopping time τ ≡ ∞ as well. Hence X (∞) exists and in (1.38) t = ∞ is possible, hence X(s) = E (X(∞) | Fs ) , so X is uniformly integrable49 . Corollary 1.92 (Conservation of the martingale property under truncation) If X is a martingale and τ is a stopping time then the truncated process X τ is also a martingale. Proof. If X is right-regular then the truncated process X τ is also right-regular. By Proposition 1.35 X τ is adapted. Let φ be a bounded stopping time. As υ φ ∧ τ is a bounded stopping time by Proposition 1.91 E (X τ (φ)) = E (X(υ)) = E (X(0)) = E (X τ (0)) and therefore X τ is a martingale. 1.3.7
Application: elementary properties of L´ evy processes
L´evy processes are natural generalizations of Wiener and Poisson processes. Let us fix a stochastic base space (Ω, A, P, F) and assume that Θ = [0, ∞). Definition 1.93 Let X be an adapted stochastic process. X is a process with independent increments with respect to the filtration F if 49 See:
Lemma 1.70, page 42 .
MARTINGALES
59
1. X (0) = 0, 2. X is right-regular, 3. whenever s < t then the increment X (t) − X (s) is independent of the σ-algebra Fs . A process X with independent increments is a L´evy process, if it has stationary or homogeneous increments that is for every t and for every h > 0 the distribution of the increment X(t + h) − X(t) is the same as the distribution of X(h) − X(0). By definition every L´evy process and every process with independent increments has right-regular trajectories. This topological assumption is very important as it is not implied by the other assumptions: Example 1.94 Not every process starting from zero and having stationary and independent increments is a L´evy process.
Let Ω be arbitrary and A = Ft = {∅, Ω} and let (xα )α be a Hamel basis of R over the rational numbers. For every t let X(t) be the sum of the coordinates of t in the Hamel basis. Obviously X(t + s) = X(t) + X(s) so X has stationary and independent increments. But as X is highly discontinuous50 it does not have a modification which is a L´evy process. Example 1.95 The sum of two L´evy processes is not necessarily a L´evy process51 .
We show that even the sum of two Wiener processes is not a Wiener process. The present counter example is very important as it shows that, although the L´evy processes are the canonical and most important examples of semimartingales, they are not the right objects from the point of view of the theory. The sum of two semimartingales52 is a semimartingale and the same is true for martingales or for local martingales. But it is not true for L´evy processes! 1. Let Ω be the set of two-dimensional continuous functions R+ → R2 with the property f (0) = (0, 0). Let P1 be a measure on the Borel σ-algebra of Ω for which the canonical stochastic process X (ω, t) = ω (t) is a two-dimensional Wiener process with correlation coefficient 1. In the same way let P2 be the measure on Ω under which X is a Wiener process with correlation coefficient −1. Let P (P1 + P2 )/2. It is easy to see that the coordinate processes w1 (t) and 50 The
image space of X is the rational numbers! example depends on results which we shall prove later. So the reader can skip the example at the first reading. 52 We shall introduce the definitions of semimartingales and local martingales later. 51 The
60
STOCHASTIC PROCESSES
w2 (t) are Wiener processes. On the other hand, a simple calculation shows that the distribution of Z w1 + w2 is not Gaussian. Z is continuous and every continuous L´evy process is a linear combination of a Wiener process and a linear trend53 , therefore, as Z is not a Gaussian process it cannot be a L´evy process. 2. The next example is bit more technical, but very similar: Let w be a Wiener t process with respect to some filtration F. Let X (t) 0 sign (w) dw, where the integral, of course, is an Itˆ o integral. The quadratic variation of X is
t
2
(sign (w)) d [w] =
[X] (t) = 0
t
1ds = t 0
so by L´evy’s characterization theorem54 the continuous local martingale X is also a Wiener process55 with respect to F. If Z w + X = 1 • w + sign (w (s)) • w = (1 + sign (w (s))) • w then Z is a continuous martingale with respect to F with zero expected value. [Z] (t) =
t
2
(1 + sign (w)) d [w] = 0
t
2
(1 + sign (w (s))) ds 0
so Z is not a Wiener process. As in the first example, every continuous L´evy process is a linear combination of a Wiener process and a linear trend, therefore, as Z is not a Wiener process it cannot be a L´evy process. During the proof of the next proposition, we shall need the next very useful simple observation: Lemma 1.96 ξ 1 and ξ 2 are independent vector-valued random variables if and only if ϕ = ϕ 1 · ϕ2 , where ϕ1 is the Fourier transform of ξ 1 and ϕ2 is the Fourier transform of ξ 2 and ϕ is the Fourier transform of the joint distribution of (ξ 1 , ξ 2 ). Proof. If ξ 1 and ξ 2 are independent then the decomposition obviously holds. The other implication is an easy consequence of the Monotone Class Theorem: 53 See:
Theorem 6.11, page 367. Theorem 6.13, page 368. 55 See: Example 6.14, page 370. 54 See:
MARTINGALES
61
fix a vector v and let L be the set of bounded functions u for which E (u (ξ 1 ) · exp (i (v, ξ 2 ))) = E (u (ξ 1 )) · E (exp (i (v, ξ 2 ))) . L is obviously a λ-system. Under the conditions of the lemma L contains the π-system of the functions u (x) = exp (i (u, x)) , so it contains the characteristic functions of the sets of the σ-algebra generated by these exponential functions. Therefore it is easy to see that for every Borel measurable set B E (χB (ξ 1 ) · exp (i (v, ξ 2 ))) = P (ξ 1 ∈ B) · E (exp (i (v, ξ 2 ))) . Now let L be the set of bounded functions v for which E (χB (ξ 1 ) · v (ξ 2 )) = P (ξ 1 ∈ B) · E (v (ξ 2 )) . With the same argument as above, by the Monotone Class Theorem for any Borel measurable set D, one can choose v = χD . So P (ξ 1 ∈ B, ξ 2 ∈ D) = E (χB (ξ 1 ) · χD (ξ 2 )) = P (ξ 1 ∈ B) · P (ξ 2 ∈ D) therefore, by independent.
definition,
the
random
vectors
ξ1
and
ξ2
are
Proposition 1.97 For an adapted process X the increments are independent if and only if the σ-algebra Gt generated by the increments X (u) − X (v) ,
u≥v≥t
is independent of Ft for every t. Proof. To make the notation as simple as possible let X (t0 ) denote an arbitrary Ft0 -measurable random variable. Let 0 = t−1 ≤ t = t0 ≤ t1 ≤ t2 ≤ . . . ≤ tn . We show that if X has independent increments then the random variables X(t0 ), X(t1 ) − X(t0 ), X(t2 ) − X(t1 ), . . . , X(tn ) − X(tn−1 )
(1.39)
are independent. To prove this one should prove that the Fourier transform of the joint distribution of the variables in (1.39) is the product of the Fourier
62
STOCHASTIC PROCESSES
transforms of the distributions of these increments: n uj [X(tj ) − X(tj−1 )] = ϕ(u) E exp i
j=0
= E E exp i
= E exp i
= E exp i
= E exp i
= E exp i
n j=0
n−1
uj ∆X(tj ) E (exp (iun ∆X(tn ))) =
uj ∆X(tj ) ϕtn ,tn−1 (un ) =
j=0 n−1
uj ∆X(tj ) | Ftn−1 =
j=0 n−1
uj ∆X(tj ) E exp (iun ∆X(tn )) | Ftn−1 =
j=0 n−1
uj ∆X(tj ) ϕtn ,tn−1 (un ) = · · · =
j=0
=
n !
ϕtj ,tj−1 (uj ).
j=0
Of course this means that the σ-algebra generated by a finite number of increments is independent of Ft for any t. As the union of σ-algebras generated by finite number of increments is a π-system, with the uniqueness of the extension of the probability measures from π-systems one can prove that the σ-algebra generated by the increments is independent of Ft . Let us denote by ϕt the Fourier transform of X(t). As X has stationary and independent increments, for every u ϕt+s (u) E (exp (iuX(t + s))) = = E (exp (iu (X(t + s) − X (t))) exp (iuX(t))) = = E (exp (iu (X(t + s) − X (t)))) · E (exp (iuX(t))) = = E (exp (iuX(s))) · E (exp (iuX(t))) ϕt (u) · ϕs (u), therefore ϕt+s (u) = |ϕt (u)| · |ϕs (u)| .
(1.40)
MARTINGALES
63
As |ϕt (u)| ≤ 1 for all u and as |ϕ0 (u)| = 1 from Cauchy’s functional equation |ϕt (u)| = exp (t · c(u)) . This implies that ϕt (u) is never zero. Let h > 0. ϕt (u) − ϕt+h (u) = |ϕt (u)| 1 − ϕt+h (u) ≤ ϕt (u) ≤ |1 − ϕh (u)| . X is right-continuous so if h 0 then by the Dominated Convergence Theorem, using that X (0) = 0 lim ϕh (u) = ϕ0 (u) = 1.
h0
So ϕt (u) is right-continuous. If t > 0 then ϕt (u) − ϕt−h (u) = ϕt−h (u) 1 − ϕt (u) ≤ ϕt−h (u) ≤ |1 − ϕh (u)| → 0, so ϕt (u) is also left-continuous. Hence ϕt (u) is continuous in t. Therefore E(exp(iu∆X(t))) = lim E(exp(iu(X(t) − X(t − h)))) = h0
= lim
h0
ϕt (u) = 1, ϕt−h (u)
so ∆X(t) = 0 almost surely. a.s.
a.s.
Hence for some subsequence X (tnk ) → X (t). This implies that X (t−) = X (t). Therefore one can make the next important observation:
Proposition 1.98 If X is a L´evy process then ϕt (u) = 0 for every u and the probability of a jump at time t is zero for every t. This implies that every L´evy process is continuous in probability. We shall need the following generalization: Proposition 1.99 If X is a process with independent increments and X is continuous in probability then ϕt (u) ϕ(u, t) E (exp (iuX (t))) is never zero.
64
STOCHASTIC PROCESSES
Proof. Let us fix the parameter u. As X is continuous in probability ϕ(u, t) is continuous in t. Let t0 (u) inf {t : ϕ (u, t) = 0} . One should prove that t0 (u) = ∞. By definition X (0) = 0 therefore ϕ (u, 0) = 1 and as ϕ (u, t) is continuous in t obviously t0 (u) > 0. Let h (u, s, t) E (exp (iu (X (t) − X (s)))) . X has independent increments, so if s < t then ϕ (u, t) = ϕ (u, s) h (u, s, t) .
(1.41)
By the right-regularity of X ϕ (u, t0 (u)) = 0. As X (t) has limits from the left if t0 (u) < ∞ then ϕ (u, t0 (u) −) is well-defined. We show that it is not zero. By (1.41) if s < t0 (u) < ∞ then ϕ (u, t0 (u) −) = ϕ (u, s) h (u, s, t0 (u) −) . ϕ (u, s) = 0 by the definition of t0 (u), so if ϕ (u, t0 (u) −) = 0 then h (u, s, t0 (u) −) = 0 for every s < t0 (u). 0=
lim h (u, s, t0 (u) −) =
st0 (u)
=
lim E (exp (iuX (t0 (u) −) − iuX (s))) =
st0 (u)
= E (exp (0)) = 1, which is impossible. Therefore 0 = ϕ (u, t0 (u)) = ϕ (u, t0 (u) −) = 0, which is impossible since ϕ is continuous. Let us recall the following simple observation: Proposition 1.100 Let ψ be a complex-valued, continuous curve defined on R. If ψ (t) = 0 for every t then it has a logarithm that is there is a continuous curve φ with the property that ψ = exp (φ). If φ1 (t0 ) = φ2 (t0 ) for some point t0 and ψ = exp (φ1 ) = exp (φ2 ) for some continuous curves φ1 and φ2 then φ1 = φ2 .
MARTINGALES
65
Proof. The proposition and its proof is quite well-known, so we just sketch it: 1. ψ = 0, so if ψ = exp (φ1 ) = exp (φ2 ) then 1=
ψ exp (φ1 ) = = exp (φ1 − φ2 ) . ψ exp (φ2 )
Hence for all t φ1 (t) = φ2 (t) + 2πin (t) , where n (t) is a continuous integer-valued function. As n (t0 ) = 0 obviously n ≡ 0, so φ1 = φ2 . 2. The complex series ln (1 + z) =
∞
n+1
(−1)
n=1
zn n
is convergent if |z| < 1. On the real line exp (ln (1 + z)) = 1 + z.
(1.42)
As ln (1 + z) is analytic (1.42) holds for every |z| < 1. To simplify notation as much as possible let us assume that t0 = 0 and ϕ (t0 ) = 1 and let us assume that we are looking for a curve with φ (t0 ) = 0. From (1.42) there is an r > 0 that ψ (t) ln (ϕ (t)) is well-defined for |t| < r. 3. Let a be the infimum and let b be the supremum of the endpoints of closed intervals where one can define a φ. If an a and bn b and φ is defined on [an , bn ] then by the first point of the proof φ (t) is well-defined on (a, b). Let assume that b < ∞. As ψ (b) = 0 we can define the curve θ (t) ψ (b + t) /ψ (b). Applying the part of the proposition just proved for some r > 0 ψ (t) = exp ( (t)) , ψ (b)
|b − t| < r,
with (b) = 0. Let t ∈ (b − r, b). As the range of the complex exponential function is C\ {0} there is a z ∈ C with ψ (b) = exp (z). exp (φ (t)) = ψ (b) exp ( (t)) = exp (z + (t)) . Hence φ (t) = z + (t) + 2nπi. With z + (t) + 2nπi one can easily continue φ to (a, b + r). This contradiction shows that one can define φ for the whole R.
66
STOCHASTIC PROCESSES
ϕ1 (u) E (exp (iuX (1))) is non-zero and by the Dominated Convergence Theorem it is obviously continuous in u. By the observation just proved ϕ1 (u) = exp (log ϕ1 (u)) exp(φ(u)), where by definition φ(0) = 0. From this by (1.40) ϕn (u) = exp(nφ(u)) and ϕ1/n (u) = exp(n−1 φ(u)) for every n ∈ N. Hence if r is a rational number then ϕr (u) = exp(rφ(u)). By the just proved continuity in t t ∈ R+ .
ϕt (u) = exp (tφ(u)) ,
(1.43)
L´evy processes are not martingales but we can use martingale theory to investigate their properties. The key tool is the so-called exponential martingale of X. Let us define the process Zt (u, ω) Z (t, u, ω)
exp (iuX(t, ω)) . ϕt (u)
(1.44)
ϕt (u) is continuous in t for every fixed u, and therefore Zt (u, ω) is a right-regular stochastic process. Let t > s. E (Zt (u) | Fs ) E =E =
exp (iuX (t)) | Fs ϕt (u)
=
exp (iu (X (t) − X (s))) exp (iuX (s)) | Fs ϕt−s (u) ϕs (u)
exp (iuX (s)) E (exp (iu (X (t) − X (s)))) = ϕs (u) ϕt−s (u)
= Zs (u)
E (exp (iuX (t − s))) = ϕt−s (u)
= Zs (u) · 1 Zs (u) , therefore Zt (u) is a martingale in t for any fixed u. Definition 1.101 Zt (u) is called the exponential martingale of X. Example 1.102 The exponential martingale of a Wiener process. If w is a Wiener process then Zt (u, ω)
exp (iuw(t)) u2 = exp iuw(t) + t . exp(−tu2 /2) 2
=
MARTINGALES
67
If instead of the Fourier transform we normalize with the Laplace transform, then56 exp (uw(t)) u2 = exp uw(t) − t . exp(tu2 /2) 2
Let X be a L´evy process and assume that the filtration is generated by X. Denote this filtration by F X . Obviously F X does not necessarily contain the measure-zero sets57 , so F X does not satisfy the usual conditions. Let N denotes the collection of measure-zero sets and let us introduce the so-called augmented filtration: Ft σ (σ (X (s) : s ≤ t) ∪ N ) .
(1.45)
It is a bit surprising, but for every L´evy process the augmented filtration satisfies the usual conditions. That is, for L´evy processes the augmented filtration F is always right-continuous58 : Proposition 1.103 If X is a L´evy process then (1.45) is right-continuous that is Ft = Ft+ . Proof. Let us take the exponential martingale of X. If t < w < s then exp (iuX (w)) Zw (u) = E (Zs (u) | Fw ) E ϕw (u)
exp (iuX (s)) | Fw , ϕs (u)
therefore Zw (u) ϕs (u) exp (iuX (w))
ϕs (u) = E (exp (iuX (s)) | Fw ) . ϕw (u)
If w t then from the continuity of ϕt and from the right-continuity of X, with L´evy’s theorem59 exp (iuX (t))
ϕs (u) a.s. = E (exp (iuX (s)) | Ft+ ) . ϕt (u)
As exp (iuX (t)) is Ft -measurable, and Zt (u) is a martingale exp (iuX (t)) 56 See:
Example 1.118, page 82. Example 1.13, page 9. 58 See: Example 1.13, page 9. 59 See: Theorem 1.75, page 46. 57 See:
ϕs (u) a.s. = E (exp (iuX (s)) | Ft ) . ϕt (u)
68
STOCHASTIC PROCESSES
Therefore a.s.
E (exp (iuX (s)) | Ft ) = E (exp (iuX (s)) | Ft+ ) .
(1.46)
This equality can be extended to multidimensional trigonometric polynomials. For example, if t < w ≤ s1 ≤ s2 and η u1 X (s1 ) + u2 X (s2 ) then, as X(s2 ) − X (s1 ) is independent of Fs1 : E (exp (iη) | Fw ) = E (exp (iu1 X (s1 )) · exp (iu2 X (s2 )) | Fw ) = = E (exp (i(u1 + u2 )X (s1 )) · E (exp (iu2 (X(s2 ) − X (s1 ))) | Fs1 ) | Fw ) = = E (exp (i(u1 + u2 )X (s1 )) · E (exp (iu2 (X(s2 ) − X (s1 )))) | Fw ) =
= E exp (i (u1 + u2 ) X (s1 )) · ϕs2 −s1 (u2 ) | Fw = = ϕs2 −s1 (u2 ) · E (exp (i (u1 + u2 ) X (s1 )) | Fw ) = = ϕs2 −s1 (u2 ) · ϕs1 (u1 + u2 ) · Zw (u1 + u2 ) . If w t then by the right-continuity of Zs and by L´evy’s theorem60 a.s.
E (exp (iη) | Ft+ ) = ϕs2 −s1 (u2 ) · ϕs1 (u1 + u2 ) · Zt (u1 + u2 ) . On the other hand with the same calculation if w = t a.s.
E (exp (iη) | Ft ) = ϕs2 −s1 (u2 ) · ϕs1 (u1 + u2 ) · Zt (u1 + u2 ) . Therefore a.s.
E (exp (iη) | Ft ) = E (exp (iη) | Ft+ ) . That is if sk > t then
E exp i
uk X(sk )
| Ft+
a.s.
= E exp i
k
uk X(sk )
| Ft
.
(1.47)
k
If sk ≤ t then equation (1.47) trivially holds. Hence if L is the set of bounded functions f for which a.s.
E (f (X (s1 ) , . . . , X (sn )) | Ft+ ) = E (f (X (s1 ) , . . . , X (sn )) | Ft ) then L contains the π-system of the trigonometric polynomials. L is trivially a λsystem, therefore, by the Monotone Class Theorem, L contains the characteristic functions of the sets of the σ-algebra generated by the trigonometric polynomials. 60 See:
Theorem 1.75, page 46.
MARTINGALES
69
That is, if B ∈ B(R^n) then one can write the characteristic functions χ_B in place of f. The collection Z of sets A for which a.s.

    E(χ_A | F_{t+}) = E(χ_A | F_t)

is also a λ-system which contains the sets of the π-system

    ∪_n σ((X(s_k))_{k=1}^n, s_k ≥ 0).

Again, by the Monotone Class Theorem, Z contains the σ-algebra

    F_∞^0 := σ(X(s) : s ≥ 0).

If A ∈ F_{t+} := ∩_n F_{t+1/n} then A ∈ F_∞ := σ(F_∞^0 ∪ N). Therefore there is an A′ ∈ F_∞^0 ⊆ Z with χ_{A′} a.s.= χ_A. As A′ ∈ Z,

    χ_A a.s.= E(χ_A | F_{t+}) a.s.= E(χ_{A′} | F_{t+}) a.s.= E(χ_{A′} | F_t).

Hence up to a measure-zero set χ_A is equal to the F_t-measurable function E(χ_{A′} | F_t). As F_t contains all the measure-zero sets, χ_A is F_t-measurable, that is, A ∈ F_t.

In a similar way one can prove the next proposition:

Proposition 1.104 If X is a process with independent increments and X is continuous in probability then (1.45) is right-continuous, that is F_t = F_{t+}.

Example 1.105 One cannot drop the condition of independent increments. If ζ ≅ N(0, 1) and X(t, ω) := tζ(ω), then the trajectories of X are continuous and X has stationary increments. If F is the augmented filtration, then F_0 = σ(N), and if t > 0, then F_t = σ(σ(ζ), N); hence F is not right-continuous.

Example 1.106 The augmentation is important: if w is a Wiener process then F_t^w := σ(w(s) : s ≤ t) is not necessarily right-continuous⁶¹.
From now on we shall assume that the filtration of every Lévy process satisfies the usual assumptions.

61 See: Example 1.13, page 9.
Proposition 1.107 If the process X is left-continuous then the filtration F_t^X := σ(X(s) : s ≤ t) is left-continuous. This remains true for the augmented filtration.

Proof. Let F_{t−}^X := σ(∪_{s<t} F_s^X).

If t > n then {τ_n ≤ t} = {τ ≤ n}. From (1.51), by the definition of the stopped σ-algebra, A_n := A ∩ {τ ≤ n} ∈ F_{τ_n}. As τ_n is bounded, by (1.49)

    ∫_{A_n} exp(iu(X(τ_n + t) − X(τ_n))) dP = P(A_n)·φ_t(u).    (1.52)
From (1.50) and by the Dominated Convergence Theorem

    ∫_A exp(iuX*(t)) dP = ∫_A lim_{n→∞} χ(τ ≤ n)·exp(iu(X(τ_n + t) − X(τ_n))) dP =
    = lim_{n→∞} ∫_A χ(τ ≤ n)·exp(iu(X(τ_n + t) − X(τ_n))) dP =
    = lim_{n→∞} ∫_{A_n} exp(iu(X(τ_n + t) − X(τ_n))) dP =
    = lim_{n→∞} P(A_n)·φ_t(u) = P(A)·φ_t(u) = P(A)·∫_Ω exp(iuX(t)) dP.
2. If A := Ω then the equation above means that the Fourier transform of X*(t) is φ_t. That is, the distribution of X*(t) and X(t) is the same. Let L be the set of bounded functions f for which, for all A ∈ F_τ,

    ∫_A f(X*(t)) dP = P(A)·∫_Ω f(X*(t)) dP.

Obviously L is a λ-system, and L contains the π-system of the trigonometric polynomials

    x ↦ exp(iux),  u ∈ R.

By the Monotone Class Theorem, L contains the functions f := χ_B with B ∈ B(R). Therefore for every A ∈ F_τ and B ∈ B(R)

    ∫_A χ_B(X*(t)) dP = P(A ∩ {X*(t) ∈ B}) = P(A)·∫_Ω χ_B(X*(t)) dP = P(A)·P(X*(t) ∈ B).

So X*(t) is independent of F_τ.

3. One should prove that X* has stationary and independent increments. If σ := τ + t and

    X**(h) := X(σ + h) − X(σ),
then, using the part of the proposition already proved for the stopping time σ,

    X*(t + h) − X*(t) = (X(τ + t + h) − X(τ)) − (X(τ + t) − X(τ)) = X(σ + h) − X(σ) = X**(h) ≅ X(h),

which is independent of t; therefore X* has stationary increments. Also, by the already proved part of the proposition, X*(t + h) − X*(t) = X**(h) is independent of F_σ ⊇ F_t^*. Obviously X*(0) = 0 and X* is right-regular, therefore X* is a process with independent increments.

4. Now we prove that X and X* have the same distribution. Let 0 = t_0 < t_1 < … < t_n be arbitrary. As we proved,

    X*(t_k) − X*(t_{k−1}) ≅ X*(t_k − t_{k−1}) ≅ X(t_k − t_{k−1}) ≅ X(t_k) − X(t_{k−1}).

As the increments are independent, (X*(t_k) − X*(t_{k−1}))_{k=1}^n has the same distribution as (X(t_k) − X(t_{k−1}))_{k=1}^n. This implies that (X(t_k))_{k=1}^n has the same distribution as (X*(t_k))_{k=1}^n, which, by the Monotone Class Theorem, implies that X* and X have the same distribution.
5. As we proved, X* is a process with independent increments, so F_t^* is independent of the σ-algebra G_t^* generated by the increments⁶⁴

    X*(u) − X*(v),  u ≥ v ≥ t.

So, as a special case, the set {X*(t) : t ≥ 0} is independent of F_0^* = F_τ.

Example 1.110 Random times which are not stopping times.
Let a > 0 and let w be a Wiener process.

1. Let

    γ_a := sup{0 ≤ s ≤ a : w(s) = 0} = inf{s ≥ 0 : w(a − s) = 0}.

64 See: Proposition 1.97, page 61.
Obviously γ_a is F_a-measurable, so it is a random time. As P(w(a) = 0) = 0, almost surely γ_a < a. Assume that γ_a is a stopping time. In this case, by the strong Markov property,

    w*(t) := w(t + γ_a) − w(γ_a)

is also a Wiener process. It is easy to see that if w* is a Wiener process then ŵ(t) := t·w*(1/t) is also a Wiener process⁶⁵. As every one-dimensional Wiener process almost surely returns to the origin⁶⁶, with the strong Markov property it is easy to prove that ŵ returns to the origin almost surely after any time t. This means that there is a sequence t_n ↓ 0 with t_n > 0 such that almost surely w*(t_n) = 0. But this is impossible, as almost surely w* does not have a zero on the interval (0, a − γ_a].

2. Let

    β_a := max{w(s) : 0 ≤ s ≤ a},  ρ_a := inf{0 ≤ s ≤ a : w(s) = β_a}.

We show that ρ_a is not a stopping time. As P(w(a) − w(a/2) < 0) = 1/2, P(ρ_a < a) > 0. If ρ_a were a stopping time, then by the strong Markov property

    w*(t) := w(t + ρ_a) − w(ρ_a)

would be a Wiener process. But this is impossible, as with positive probability the interval (0, a − ρ_a] is not empty and on this interval w* cannot have a positive value.

An important consequence of the strong Markov property is the following:

Proposition 1.111 If the size of the jumps of a Lévy process X is smaller than a constant c > 0, that is |∆X| ≤ c, then on any interval [0, t] the moments of X are uniformly bounded. That is, for each m there is a constant K(m, t) such that

    E(|X^m(s)|) ≤ K(m, t),  s ∈ [0, t].
Proof. One may assume that the stopping time⁶⁷

    τ_1 := inf{t : |X(t)| > c}

65 See: Corollary B.10, page 566.
66 See: Corollary B.8, page 565.
67 Recall that F satisfies the usual assumptions. See: Example 1.32, page 17.
is finite, as by the zero-one law the set of outcomes ω where τ_1(ω) = ∞ has probability 0 or 1. If with probability one τ_1(ω) = ∞ then X is uniformly bounded, hence in this case the proposition holds. Then define the stopping time

    τ_2 := inf{t : |X*(t)| > c} + τ_1 = inf{t : |X(t + τ_1) − X(τ_1)| > c} + τ_1.

In a similar way let us define τ_3, etc. By the strong Markov property the variables {X*(t) : t ≥ 0} are independent of the σ-algebra F_{τ_1}. The variable

    τ_2 − τ_1 = inf{t ≥ 0 : |X*(t)| > c}

is measurable with respect to the σ-algebra generated by the variables {X*(t) : t ≥ 0}, hence τ_2 − τ_1 is independent of F_{τ_1}. In general τ_n − τ_{n−1} is independent of F_{τ_{n−1}}. Also, by the strong Markov property, for all n the distribution of τ_n − τ_{n−1} is the same as the distribution of τ_1. Therefore if τ_0 := 0, then using the independence of the variables (τ_k − τ_{k−1}),

    E(exp(−τ_n)) = E(exp(−Σ_{k=1}^n (τ_k − τ_{k−1}))) = (E(exp(−τ_1)))^n =: q^n,

where 0 < q ≤ 1. If q = 1 then almost surely τ_1 = 0, which by the right-continuity implies that |X(0)| ≥ c > 0, which, by the definition of Lévy processes, is not the case; so q < 1. As the jumps are smaller than c,

    |X(τ_1)| ≤ |X(τ_1−)| + |∆X(τ_1)| ≤ |X(τ_1−)| + c ≤ 2c.

In the same way it is easy to see that in general

    sup_t |X^{τ_n}(t)| = sup{|X(t)| : t ∈ [0, τ_n]} ≤ 2nc.

Therefore by Markov's inequality

    P(|X(t)| > 2nc) ≤ P(τ_n < t) = P(exp(−τ_n) > exp(−t)) ≤ E(exp(−τ_n))/exp(−t) ≤ exp(t)·q^n.
As q < 1,

    L(m) := Σ_{n=0}^∞ [2(n + 1)c]^m·q^n < ∞,
so

    E(|X(t)|^m) ≤ Σ_{n=0}^∞ [2(n + 1)c]^m · P(|X(t)| > 2nc) ≤
    ≤ exp(t)·Σ_{n=0}^∞ [2(n + 1)c]^m·q^n = exp(t)·L(m),
from which the proposition is evident.

One can generalize these observations.

Proposition 1.112 (Strong Markov property for processes with independent increments) Let X be a process with independent increments and assume that X is continuous in probability. Let D([0, ∞)) denote the space of right-regular functions over [0, ∞) and let H be the σ-algebra over D([0, ∞)) generated by the coordinate functionals. If f is a non-negative H-measurable functional⁶⁸ over D([0, ∞)), then for every stopping time τ < ∞

    E(f(X*) | F_τ) = E(f(X_s^*))|_{s=τ},

where X_s^*(t) := X(s + t) − X(s).

Proof. Let φ(u, t) be the Fourier transform of X(t). As X is continuous in probability, φ(u, t) ≠ 0 and

    Z(u, t) := exp(iuX(t))/φ(u, t)

is a martingale⁶⁹. Let τ be a bounded stopping time. By the Optional Sampling Theorem

    E(Z(u, τ + t) | F_τ) = Z(u, τ).

φ(u, τ + t) is F_τ-measurable. Therefore

    E(exp(iuX*(t)) | F_τ) := E(exp(iu(X(τ + t) − X(τ))) | F_τ) =    (1.53)
    = φ(u, τ + t)/φ(u, τ) = (φ(u, s + t)/φ(u, s))|_{s=τ} =
    = E(exp(iu(X(t + s) − X(s))))|_{s=τ} =
    = E(exp(iu·X_s^*(t)))|_{s=τ}.

68 It is easy to see that f(X) = g(X(t_1), X(t_2), …) where g is an R^∞ → R Borel measurable function and (t_k) is a countable sequence in R_+. The canonical example is f(X) := sup_{s≤t} |X(s)|.
69 See: Proposition 1.99, page 63.
If τ is not bounded then τ_n := τ ∧ n is a bounded stopping time. Let

    h(s) := E(exp(iu(X(s + t) − X(s)))).

As τ < ∞,

    X(τ_n + t) − X(τ_n) → X(τ + t) − X(τ),

so by the Dominated Convergence Theorem h(τ_n) → h(τ). If A ∈ F_τ then A ∩ {τ ≤ n} ∈ F_{τ_n}, therefore

    ∫_A χ(τ ≤ n)·exp(iu(X(τ_n + t) − X(τ_n))) dP = ∫_A χ(τ ≤ n)·h(τ_n) dP.

By the Dominated Convergence Theorem one can take the limit n → ∞. Hence in (1.53) we can drop the condition that τ is bounded. With the Monotone Class Theorem one can prove that for any Borel measurable set B

    E(χ_B(X*(t)) | F_τ) = E(χ_B(X_s^*(t)))|_{s=τ}.

In the usual way, using multi-dimensional trigonometric polynomials and the Monotone Class Theorem several times, one can extend the relation to every H-measurable and bounded function. Finally one can prove the proposition with the Monotone Convergence Theorem.

Corollary 1.113 Under the same conditions as above

    E(f(X*) | τ = s) = E(f(X_s^*)).

Let us remark that if X is a Lévy process then the distribution of X_s^* is the same as the distribution of X for every s, so

    E(f(X*) | F_τ) = E(f(X))
for every τ < ∞. If f(X) := exp(i·Σ_{k=1}^n u_k X(t_k)) then

    E(exp(i·Σ_{k=1}^n u_k X*(t_k)) | F_τ) = E(exp(i·Σ_{k=1}^n u_k X(t_k))).
The right-hand side is deterministic, which implies that (X*(t_1), X*(t_2), …, X*(t_n)) is independent of F_τ and has the same distribution as (X(t_1), X(t_2), …, X(t_n)).

Proposition 1.114 If X is a process with independent increments, X is continuous in probability, and the jumps of X are bounded by some constant c, then all the moments of X are uniformly bounded on any finite interval; that is, for every t

    E(|X^m(s)|) ≤ K(m, t) < ∞,  s ∈ [0, t].
Proof. Let us fix t. X has right-regular trajectories, so on any finite interval the trajectories are bounded. Therefore sup_{s≤2t} |X(s)| < ∞. Hence if b is sufficiently large then

    P(sup_{s≤2t} |X(s)| > b/2) < q < 1.

Let τ := inf{s : |X(s)| > a} ∧ 2t. By the definition of τ

    {τ < t} ⊆ {sup_{s≤t} |X(s)| > a} ⊆ {τ ≤ t}.
If for some ω

    ω ∈ {sup_{s≤t} |X(s)| > a} \ {τ < t},

then sup_{s<t} |X(s, ω)| ≤ a and sup_{s≤t} |X(s, ω)| > a, so the process X has a jump at (t, ω), which by the stochastic continuity of X has probability zero. As the size of the jumps is bounded, by the right-continuity

    sup_{s≤τ} |X(s)| ≤ sup_{s≤τ} |X(s−)| + sup_{s≤τ} |∆X(s)| ≤ a + c.
We show that this implies that

    {sup_{s≤t} |X(s)| > a + b + c} ⊆ {sup_{s≤t} |X(s)| > a, sup_{s≤t} |X(τ + s) − X(τ)| > b}.

If

    sup_{s≤t} |X(s)| > a + b + c,

then obviously sup_{s≤t} |X(s)| > a, hence τ ≤ t; so if sup_{s≤t} |X(τ + s) − X(τ)| ≤ b, then

    sup_{s≤t} |X(s)| ≤ sup_{s≤τ} |X(s)| + sup_{s≤t} |X(τ + s) − X(τ)| ≤ a + b + c,

which is impossible. If u ≤ t, then

    sup_{s≤t} |X(u + s) − X(u)| ≤ 2·sup_{s≤2t} |X(s)|.
Therefore if u ≤ t, then

    {sup_{s≤t} |X(u + s) − X(u)| > b} ⊆ {sup_{s≤2t} |X(s)| > b/2}.

Let F be the distribution function of τ. By the just proved strong Markov property

    P(sup_{s≤t} |X(s)| > a + b + c) ≤
    ≤ P(sup_{s≤t} |X(s)| > a, sup_{s≤t} |X(τ + s) − X(τ)| > b) =
    = P(τ < t, sup_{s≤t} |X(τ + s) − X(τ)| > b) =
    = ∫_{[0,t)} P(sup_{s≤t} |X(τ + s) − X(τ)| > b | τ = u) dF(u) =
    = ∫_{[0,t)} P(sup_{s≤t} |X(u + s) − X(u)| > b) dF(u) ≤
    ≤ P(sup_{s≤2t} |X(s)| > b/2) · P(τ < t) =
    = q·P(τ < t) ≤ q·P(sup_{s≤t} |X(s)| > a).
From this, for an arbitrary n,

    P(sup_{s≤t} |X(s)| > n(b + c)) ≤ q^n.

Hence

    E(|X(t)|^m) ≤ E(sup_{s≤t} |X(s)|^m) ≤ Σ_{n=1}^∞ (n(b + c))^m·q^{n−1} < ∞.
We shall return to Lévy processes in section 7.1. If the reader is interested only in Lévy processes then they can continue reading there.

1.3.8 Application: the first passage times of the Wiener processes
In this subsection we present some applications of the Optional Sampling Theorem. Let w be a Wiener process. We shall discuss some properties of the first passage times

    τ_a := inf{t : w(t) = a}.    (1.54)
The set {a} is closed and w is continuous, hence τ_a is a stopping time⁷⁰. Recall that⁷¹ almost surely

    lim sup_{t→∞} w(t) = ∞,  lim inf_{t→∞} w(t) = −∞.    (1.55)

Therefore, as w is continuous, τ_a is almost surely finite.

Example 1.115 The martingale convergence theorem does not hold in L¹(Ω).
Let w be a Wiener process and let X := w + 1. Let τ be the first passage time of zero for X, that is, let

    τ := inf{t : X(t) = 0} = τ_{−1} := inf{t : w(t) = −1}.

70 See: Example 1.32, page 17.
71 See: Proposition B.7, page 564.
As X is a martingale, X^τ is a non-negative martingale. By the martingale convergence theorem for non-negative martingales⁷², if t ↗ ∞ then X^τ(t) is almost surely convergent. As we remarked, τ is almost surely finite, therefore obviously X^τ(∞) = 0. By the Optional Sampling Theorem

    ‖X^τ(t)‖_1 = ‖X(τ ∧ t)‖_1 = E(X(τ ∧ t)) = E(X(0)) = 1

for any t. Hence the convergence does not hold in L¹(Ω).

Example 1.116 If a < 0 < b and τ_a and τ_b are the respective first passage times of some Wiener process w, then

    P(τ_a < τ_b) = b/(b − a),  P(τ_b < τ_a) = −a/(b − a).
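These formulas have an exact discrete counterpart that is convenient for a quick numerical check: for a simple symmetric random walk started at 0 and absorbed at −A and B, the same optional sampling argument (applied to the walk S_n and to S_n² − n) gives P(hit −A first) = B/(A + B) and expected absorption time A·B. A minimal Monte Carlo sketch; the barriers and sample size below are arbitrary choices:

```python
import random

random.seed(42)

A, B = 1, 2            # absorbing barriers at -A and B; the walk starts at 0
n_paths = 20_000

hits_lower = 0
total_steps = 0
for _ in range(n_paths):
    x, steps = 0, 0
    while -A < x < B:
        x += random.choice((-1, 1))   # one +-1 step of the walk
        steps += 1
    hits_lower += (x == -A)
    total_steps += steps

p_lower = hits_lower / n_paths       # should be close to B/(A+B) = 2/3
mean_time = total_steps / n_paths    # should be close to A*B = 2
```

The same estimates can be produced for the Wiener process itself by simulating paths on a fine time grid, at the cost of a small discretization bias at the barriers.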
By (1.55), with probability one the trajectories of w are unbounded. Therefore, as w starts from the origin, the trajectories of w finally leave the interval [a, b]. So

    P(τ_a < τ_b) + P(τ_b < τ_a) = 1.

If τ := τ_a ∧ τ_b then w^τ is a bounded martingale. Hence one can use the Optional Sampling Theorem. Obviously w^τ_τ is either a or b, hence

    E(w^τ_τ) = a·P(τ_a < τ_b) + b·P(τ_b < τ_a) = E(w^τ(0)) = 0.

We have two equations with two unknowns. Solving this system of linear equations, one can easily deduce the formulas above.

Example 1.117 Let a < 0 < b and let τ_a and τ_b be the respective first passage times of some Wiener process w. If τ := τ_a ∧ τ_b, then E(τ) = |ab|.

With direct calculation it is easy to see that the process w²(t) − t is a martingale. From this it is easy to show that the process X(t) := (w(t) − a)(b − w(t)) + t is also a martingale. By the Optional Sampling Theorem

    |ab| = −ab = E(X(0)) = E(X(τ ∧ n)) = E((w(τ ∧ n) − a)(b − w(τ ∧ n))) + E(τ ∧ n).

72 See: Corollary 1.66, page 40.
If n ↗ ∞ then by the Monotone and by the Dominated Convergence Theorems the limit of the right-hand side is E(τ).

Example 1.118 Let w be a Wiener process. The Laplace transform of the first passage time τ_a is

    L(s) := E(exp(−sτ_a)) = exp(−|a|·√(2s)),  s ≥ 0.    (1.56)
Let a > 0. For every u the process

    X(t) := exp(u·w(t) − t·u²/2)

is a martingale⁷³, so the truncated process X^{τ_a} is also a martingale. If u ≥ 0, then

    0 ≤ X^{τ_a}(t) = exp(u·w(τ_a ∧ t) − u²(τ_a ∧ t)/2) ≤ exp(au),

hence X^{τ_a} is a bounded martingale. Every bounded martingale is uniformly integrable, therefore one can apply the Optional Sampling Theorem. So

    E(X^{τ_a}_{τ_a}) = E(exp(ua − u²τ_a/2)) = E(X^{τ_a}(0)) = 1.

Hence

    E(exp(−u²τ_a/2)) = exp(−ua).

If u := √(2s) ≥ 0 then

    L(s) := E(exp(−sτ_a)) = exp(−a·√(2s)).

If a < 0 then, repeating the calculations for the Wiener process −w,

    L(s) = exp(−|a|·√(2s)).
Example 1.119 The Laplace transform of the first passage time τ̃_a of the reflected Wiener process |w| is

    L̃(s) := E(exp(−s·τ̃_a)) = 1/cosh(a·√(2s)),  s ≥ 0.    (1.57)

73 See: (1.44), page 66.
By definition τ̃_a := inf{t : |w(t)| = a}. Let

    X(t) := ((exp(uw(t)) + exp(−uw(t)))/2)·exp(−u²t/2) = cosh(uw(t))·exp(−u²t/2).

X is the sum of two martingales, hence it is a martingale. X^{τ̃_a} ≤ cosh(ua), therefore one can again apply the Optional Sampling Theorem:

    E(X^{τ̃_a}_{τ̃_a}) = E(cosh(ua)·exp(−u²τ̃_a/2)) = 1,

therefore

    E(exp(−u²τ̃_a/2)) = 1/cosh(ua).

If u := √(2s) then

    E(exp(−s·τ̃_a)) = 1/cosh(a·√(2s)).

Example 1.120 The density function of the distribution of the first passage time τ_a of a Wiener process is

    f(x) = |a|·(2πx³)^{−1/2}·exp(−a²/(2x)).    (1.58)
By the uniqueness of the Laplace transform it is sufficient to prove that the Laplace transform of (1.58) is exp(−|a|·√(2s)). By the definition of the Laplace transform

    L(s) := ∫₀^∞ exp(−sx)·f(x) dx,  s ≥ 0.

If F denotes the distribution function of (1.58) then

    F(x) := ∫₀^x f(t) dt = 2·∫_a^∞ (1/√(2πx))·exp(−u²/(2x)) du,    (1.59)
since if we substitute t := xa²/u², then

    F(x) = ∫_a^∞ (au³/(a³·√(2πx³)))·exp(−u²/(2x))·(2xa²·u^{−3}) du = 2·∫_a^∞ (1/√(2πx))·exp(−u²/(2x)) du.

Integrating by parts and using that F(0) = 0, if s > 0 then

    L(s) = [exp(−sx)·F(x)]₀^∞ + ∫₀^∞ s·exp(−sx)·F(x) dx = s·∫₀^∞ exp(−sx)·F(x) dx.
By (1.59)

    L(s) = 2s·∫₀^∞ exp(−sx)·∫_a^∞ (1/√(2πx))·exp(−u²/(2x)) du dx.

Fix s and let us take L(s) as a function of a. Let us denote this function by g(a). We show that if a > 0 then g(a) satisfies the differential equation

    d²g(a)/da² = 2s·g(a).    (1.60)
The integrand is non-negative, so by Fubini's theorem one can change the order of the integration, so

    g(a) = 2s·∫_a^∞ ∫₀^∞ exp(−sx)·(1/√(2πx))·exp(−u²/(2x)) dx du.
As

    ∫₀^∞ (1/√(2πx))·exp(−sx) dx = (1/√(2πs))·Γ(1/2) = 1/√(2s).

If s > 0 and z := s + it then

    z^{1/2} = exp(½·log z) = exp(½·ln|z|)·exp(i·½·arg z) =
    = ⁴√(s² + t²)·(cos(arctan(t/s)/2) + i·sin(arctan(t/s)/2)).
The complex Laplace transform is continuous, so

    φ(t) = L(−it) =
    = lim_{s↓0} exp(−a·√2·⁴√(s² + t²)·(cos(arctan(−t/s)/2) + i·sin(arctan(−t/s)/2))) =
    = exp(−a·√(2|t|)·(cos(−(π/4)·sgn t) + i·sin(−(π/4)·sgn t))) =
    = exp(−a·√|t|·(1 − i·sgn t)).
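Formula (1.59) identifies the law of τ_a explicitly: P(τ_a ≤ x) = 2(1 − Φ(a/√x)), which is exactly the distribution function of a²/Z² for a standard normal variable Z. So τ_a can be sampled exactly, and the closed form just derived can be checked by simulation. A hedged sketch; the sample size and the evaluation point t are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
a, t = 1.0, 1.0

Z = rng.standard_normal(400_000)
Z = Z[Z != 0.0]                 # guard against a (measure-zero) division by zero
tau = a**2 / Z**2               # exact samples of tau_a via (1.59)

emp = np.exp(1j * t * tau).mean()                              # empirical E(exp(i*t*tau_a))
theory = np.exp(-a * np.sqrt(abs(t)) * (1 - 1j * np.sign(t)))  # closed form above
```

The empirical characteristic function and the closed form agree to Monte Carlo accuracy.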
Example 1.122 The maximum process of a Wiener process. Let w be a Wiener process, and let us introduce the maximum process

    S(t) := sup_{s≤t} w(s) = max_{s≤t} w(s).

We show that for every a ≥ 0 and t ≥ 0

    P(S(t) ≥ a) = P(τ_a ≤ t) = 2·P(w(t) ≥ a) = P(|w(t)| ≥ a).    (1.61)
The first and last equalities are trivial. We prove the second one: recall that the density function of the distribution of τ_a is

    (d/dt)·P(τ_a ≤ t) =: F′(t) = f(t) = (a/√(2πt³))·exp(−a²/(2t)).

w(t) ≅ N(0, t), so

    U(t) := 2·P(w(t) ≥ a) = 2·(1 − Φ(a/√t)) = (2/√(2π))·∫_{a/√t}^∞ exp(−u²/2) du.

Differentiating with respect to t,

    (d/dt)·U(t) = (a/√(2π))·exp(−a²/(2t))·t^{−3/2},

hence the derivatives of P(τ_a ≤ t) and 2·P(w(t) ≥ a) with respect to t are the same. The two functions are equal if t = 0, therefore 2·P(w(t) ≥ a) = P(τ_a ≤ t) for every t.
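Equality (1.61) can also be checked by direct path simulation. The maximum over a discrete time grid slightly underestimates S(t), so a fine grid and a generous tolerance are needed; all sizes below are arbitrary choices:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
n_paths, n_steps, T, a = 2_000, 4_000, 1.0, 0.8
dt = T / n_steps

# Brownian increments and the grid approximation of S(T) = sup_{s<=T} w(s)
incr = rng.standard_normal((n_paths, n_steps)) * sqrt(dt)
S = np.cumsum(incr, axis=1).max(axis=1)

p_emp = (S >= a).mean()
Phi = 0.5 * (1 + erf(a / sqrt(T) / sqrt(2.0)))
p_theory = 2 * (1 - Phi)        # reflection principle, formula (1.61)
```

The empirical exceedance probability of the running maximum matches 2·P(w(T) ≥ a) up to grid bias and Monte Carlo noise.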
Example 1.123 The density function of S(t) := sup_{s≤t} w(s) is

    f(x) = (2/√(2πt))·exp(−x²/(2t)),  x > 0.

By (1.61), P(S(t) ≥ x) = 2·(1 − Φ(x/√t)). Differentiating we get the formula.

Example 1.124 If w is a Wiener process then

    E(sup_{s≤1} |w(s)|) = √(π/2),  E(sup_{s≤1} w(s)) = √(2/π).
Let

    S(t) := sup_{s≤t} |w(s)| = max_{s≤t} |w(s)|,  τ_a := inf{t : |w(t)| = a}.

If x > 0, then⁷⁴

    P(S(t) ≤ x) = P(max_{s≤t} x·|w(s/x²)| ≤ x) = P(max_{s≤t} |w(s/x²)| ≤ 1) =
    = P(max_{s≤t/x²} |w(s)| ≤ 1) = P(τ_1 ≥ t/x²) = P(1/√τ_1 ≤ x/√t).

If σ > 0, then

    √(2/π)·∫₀^∞ exp(−x²/(2σ²)) dx = σ.

74 Recall that s ↦ x·w(s/x²) is also a Wiener process.
The expected value depends only on the distribution, so by Fubini's theorem and by (1.57)

    E(S(1)) = E(1/√τ_1) = E(√(2/π)·∫₀^∞ exp(−x²τ_1/2) dx) =
    = √(2/π)·∫₀^∞ E(exp(−x²τ_1/2)) dx = √(2/π)·∫₀^∞ (1/cosh x) dx =
    = 2·√(2/π)·∫₀^∞ (exp(x)/(exp(2x) + 1)) dx = 2·√(2/π)·∫₁^∞ (1/(y² + 1)) dy =
    = 2·√(2/π)·(π/4) = √(π/2).

In a similar way, if S denotes the supremum of w then

    E(S(1)) = E(1/√τ_1) = E(√(2/π)·∫₀^∞ exp(−x²τ_1/2) dx) =
    = √(2/π)·∫₀^∞ E(exp(−x²τ_1/2)) dx = √(2/π)·∫₀^∞ exp(−x) dx = √(2/π).
One can prove the last relation with (1.61) as well:

    E(S(1)) = E(|w(1)|) = √(2/π)·∫₀^∞ x·exp(−x²/2) dx = √(2/π).
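Both expectations can be approximated from the same simulated paths. The grid maximum biases both estimates slightly downward, so the tolerances below are loose; path count and grid size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps = 2_000, 4_000
dt = 1.0 / n_steps

# Brownian paths on [0, 1]
w = np.cumsum(rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt), axis=1)

mean_sup_abs = np.abs(w).max(axis=1).mean()   # should approach sqrt(pi/2) ~ 1.2533
mean_sup = w.max(axis=1).mean()               # should approach sqrt(2/pi) ~ 0.7979
```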
Example 1.125 The intersection of a two-dimensional Wiener process with a line has Cauchy distribution.

Let w_1 and w_2 be independent Wiener processes, and let us consider the line⁷⁵ L := {x = a} where a > 0. The two-dimensional process w(t) := (w_1(t), w_2(t)) meets L the first time at

    τ_a := inf{t : w_1(t) = a}.

75 The Wiener processes are invariant under rotation, so the result is true for an arbitrary line. One can generalize the result to an arbitrary dimension. In the general case, we are investigating the distribution of the intersection of the Wiener processes with hyperplanes.
What is the distribution of the y coordinate, that is, what is the distribution of w_2(τ_a)?

1. For an arbitrary u the process t ↦ u^{−1}·w_1(u²t) is also a Wiener process, hence the distribution of its maximum process is the same as the distribution of the maximum process of w_1. Let us denote this maximum process by S_1. With u := √x,

    P(τ_a ≥ x) = P(S_1(x) ≤ a) = P(√x·S_1(1) ≤ a) = P(a²/S_1²(1) ≥ x).

w intersects L at w_2(τ_a). τ_a is σ(w_1)-measurable, and as w_1 and w_2 are independent, that is, the σ-algebras σ(w_2) and σ(w_1) are independent, τ_a is independent of w_2. We show that

    w_2(τ_a) ≅ √τ_a·w_2(1),    (1.62)

that is, the distribution of w_2(τ_a) is the same as the distribution of √τ_a·w_2(1). Using the independence of τ_a and w_2,

    P(w_2(τ_a) ≤ x | τ_a = t) = P(w_2(t) ≤ x) = P(√t·w_2(1) ≤ x),

and

    P(√τ_a·w_2(1) ≤ x | τ_a = t) = P(√t·w_2(1) ≤ x).

Integrating both equations by the distribution of τ_a we get (1.62). Hence

    w_2(τ_a) ≅ √τ_a·w_2(1) ≅ (a/S_1(1))·w_2(1) ≅ (a/|w_1(1)|)·w_2(1).

w_1(1) and w_2(1) are independent with distribution N(0, 1). Therefore w_2(τ_a) has a Cauchy distribution.

2. One can also prove the relation with Fourier transforms. Let us calculate the
Fourier transform of w_2(τ_a). The Fourier transform of N(0, 1) is exp(−t²/2). By the independence of τ_a and w_2 and by (1.56), if G denotes the distribution function of τ_a,

    φ(t) := E(exp(itw_2(τ_a))) = ∫₀^∞ E(exp(itw_2(τ_a)) | τ_a = u) dG(u) =
    = ∫₀^∞ E(exp(itw_2(u))) dG(u) = ∫₀^∞ exp(−(t²/2)·u) dG(u) =
    = E(exp(−(t²/2)·τ_a)) = L(t²/2) = exp(−a·√t²) = exp(−a|t|),

which is the Fourier transform of a Cauchy distribution.

Example 1.126 The process of first passage times of Wiener processes.
Let w be a Wiener process and let us define the hitting times

    τ_a := inf{t : w(t) = a},  σ_a := inf{t : w(t) > a}.

w is continuous, the set {x > a} is open, hence σ_a is a weak stopping time. As the augmented filtration of w is right-continuous, σ_a is a stopping time⁷⁶. w has continuous trajectories, so obviously τ_a ≤ σ_a. As the trajectories of w can contain 'peaks and flat segments', it can happen that for some outcomes τ_a is strictly smaller than σ_a. As we shall immediately see, almost surely τ_a = σ_a. One can define the stochastic processes

    T(a, ω) := τ_a(ω),  S(a, ω) := σ_a(ω)

with a ∈ R_+. It is easy to see that T and S have strictly increasing trajectories. If a_n ↗ a then w(τ_{a_n}) = a_n ↗ a, hence obviously τ_{a_n} ↗ τ_a, so T is left-continuous. On the other hand, it is easy to see that if a_n ↘ a, then σ_{a_n} ↘ σ_a, hence S is right-continuous. It is also easy to see that T(a+, ω) = S(a, ω) and S(a−, ω) = T(a, ω) for all ω. Obviously τ_a and σ_a are almost surely finite. By the strong Markov property of w,

    w*(t) := w(τ_a + t) − w(τ_a)

is also a Wiener process. {τ_a < σ_a} is in the set

    {w*(t) ≤ 0 on some interval [0, r], r ∈ Q}.

As w* is a Wiener process, it is not difficult to prove⁷⁷ that if r > 0 then

    P(w*(t) ≤ 0, ∀t ∈ [0, r]) = 0.

76 See: Example 1.32, page 17.
77 See: Corollary B.12, page 566.
Hence

    P(τ_a ≠ σ_a) = P(τ_a < σ_a) = 0

for every a. Therefore S is a right-continuous modification of T. Obviously, if b > a and τ*_{b−a} is the first passage time of w* to b − a, then τ_b − τ_a = τ*_{b−a}. By the strong Markov property τ*_{b−a} is independent of F_{τ_a}. Therefore T(b) − T(a) is independent of F_{τ_a}. In general, one can easily prove that T, and therefore S, have independent increments with respect to the filtration G_a := F_{τ_a}. Obviously S(0) = 0, hence S is a Lévy process with respect to the filtration G.

1.3.9 Some remarks on the usual assumptions
The usual assumptions are crucial conditions of stochastic analysis. Without them very few statements of the theory would hold. The most important objects of stochastic analysis are related to stopping times, as these objects express the timing of events. The main tool of stochastic analysis is measure theory. In measure theory, objects are defined up to measure-zero sets. From a technical point of view it is therefore not a great surprise that we want to guarantee that every random time which is almost surely equal to a stopping time should also be a stopping time. The definition of a stopping time is very natural: at time t one can observe only τ ∧ t, so we should assume τ ∧ t to be F_t-measurable for every t. Hence if τ and τ′ are almost surely equal and they differ on a set N, then every subset of N should also be F_t-measurable. This implies that one should add all the measure-zero sets and all their subsets to the filtration⁷⁸.

The right-continuity of the filtration is more problematic; it assumes that somehow we can foresee the events of the near future. At first sight it seems natural; in our usual experience we always have some knowledge about the near future. Our basic experience is speed and momentum, and these objects are by definition the derivatives of the trajectories. By definition, differentiability means that the right-derivative is equal to the left-derivative, and the left-derivative depends on the past and the present. So in our differentiable world we always know the right-derivative, hence—infinitesimally—we can always see the future. But in stochastic analysis we are interested in objects which are non-differentiable. Recall that for a continuous process the hitting time of a closed set is a stopping time⁷⁹. At the moment that we hit a closed set we know that we are in the set. But what about the hitting times⁸⁰ of open sets? We hit an open set at its boundary, and when we hit it we are generally still outside the set. Recall that the hitting time of an open set is a stopping time only when the filtration is right-continuous⁸¹. That is, when we hit the boundary of an open set—by the

78 See: Example 6.37, page 386.
79 See: Example 1.32, page 17.
80 See: Definition 1.26, page 15.
81 See: Example 1.32, page 17.
right-continuity of the filtration—we can ask for some extra information about the future which tells us whether we shall really enter the set or not. This is, of course, a very strong assumption. If we want to go to a restaurant and we are at the door, we know that we shall enter the restaurant. But a Wiener process can easily turn back at the door. One of the most surprising statements of the theory is that the augmented filtration of a Lévy process is right-continuous. This is true not only for Lévy processes, but under more general conditions⁸². It is important to understand the reason behind this phenomenon. The event that a one-dimensional Wiener process hits the boundary of an open set without actually entering the set itself has zero⁸³ probability! And in general the right-continuity of an augmented filtration means that all the events which need some insight into the future⁸⁴ have zero probability. We cannot see the future, we are just ignoring the irrelevant information!
1.4 Localization
Localization is one of the most frequently used concepts of mathematical analysis. For example, if f is a continuous function on R, then of course generally f is not integrable on the whole real line. But this is not a problem at all. We can still talk about the integral function F(x) := ∫₀ˣ f(t) dt of f. The functions of calculus are generally not integrable, they are just locally integrable. In real analysis we say that a certain property holds locally if it holds on every compact subset of the underlying topological space⁸⁵. On the real line it is enough to ask that the property holds on every closed, bounded interval; in particular, for any t the property should hold on the interval [0, t]. Very often, as in the case of local integrability, it is sufficient to ask that the property should hold on some intervals [0, t_n] where t_n ↗ ∞. In stochastic analysis we should choose the upper bounds t_n in a measurable way with respect to the underlying filtration. This explains the next definition:

Definition 1.127 Let X be a family of processes. We say that a process X is locally in X if there is a sequence of stopping times (τ_n) for which almost surely⁸⁶ τ_n ↗ ∞, and the truncated processes X^{τ_n} belong to X for every n. The sequence (τ_n) is called a localizing sequence of X. X_loc denotes the set of processes locally belonging to X.

A specific problem of the definition above is that with localization one cannot modify the value of the variable X(0), since every truncated process X^{τ_n} at the

82 This is true e.g. for so-called Feller processes, which form an important subclass of the Markov processes.
83 See: Example 1.126, page 90, Corollary B.12, page 566. But see: Example 6.10, page 364.
84 Like sudden jumps of the Poisson processes.
85 Generally the topological space is locally compact.
86 Almost surely and not everywhere! See: Proposition 1.130, page 94.
time t = 0 has the same value X(0). To overcome this problem some authors⁸⁷, instead of using X^{τ_n}, use the process X^{τ_n}·χ(τ_n > 0) in the definition of the localization, or instead of X they localize the process X − X(0). In most cases it does not matter how we define the localization. First of all we shall use the localization procedure to define the different classes of local martingales. From the point of view of stochastic analysis one can always assume that every local martingale is zero at time t = 0, as our final goal is to investigate the class of semimartingales, and the semimartingales have the representation

    X(0) + L + V,

where L is a local martingale, zero at time t = 0. Just to fix the ideas, we shall later explicitly concretize the definitions in the cases of local martingales and locally bounded processes. In both cases we localize the processes X − X(0).

1.4.1 Stability under truncation
It is quite natural to ask for which types of processes X one has (X_loc)_loc = X_loc.

Definition 1.128 We say that a space of processes X is closed or stable under truncation, or closed under stopping, if whenever X ∈ X then X^τ ∈ X for an arbitrary stopping time τ.

It is an important consequence of this property that if X is closed under truncation, X_k ∈ X_loc and (τ_n^{(k)}) are the localizing sequences of the processes X_k, then τ_n := ∧_{k=1}^m τ_n^{(k)} for any finite m is a common localizing sequence of the first m processes. That is, if X is closed under truncation, then for a finite number of processes we can always assume that they have a common localizing sequence. From the definition it is clear that if X is closed under truncation, then X_loc is also closed under truncation as, if (τ_n) is a localizing sequence of X and τ is an arbitrary stopping time, then (τ_n) is obviously a localizing sequence of the truncated process X^τ.

Example 1.129 M, the space of uniformly integrable martingales, H², the space of the square-integrable martingales, and K, the set of bounded processes are closed under truncation.

It is obvious from the definition that K is closed under truncation. By the Optional Sampling Theorem if M ∈ M, then M^τ ∈ M. As

    ‖X^τ(t)‖₂² ≤ E(sup_t |X(t)|²)
    α_n := { ∞ if |ξ| ≤ n; 0 if |ξ| > n },  β_n := { ∞ if |η| ≤ n; 0 if |η| > n }

are stopping times. Obviously ρ_n := τ_n ∧ σ_n ∧ α_n ∧ β_n is a stopping time and ρ_n ↗ ∞, so (ρ_n) is a localizing sequence.

    Z^{ρ_n} = (ξX + ηY)^{ρ_n} = χ(|ξ| ≤ n)·ξX^{ρ_n} + χ(|η| ≤ n)·ηY^{ρ_n}.    (1.63)

As X^{ρ_n}, Y^{ρ_n} ∈ M, and as χ(|ξ| ≤ n)·ξ and χ(|η| ≤ n)·η are bounded F_0-measurable variables, obviously Z^{ρ_n} ∈ M and therefore Z is a local martingale. Let us observe that in line (1.63) we used that X, Y ∈ L, that is, X(0) = Y(0) = 0. If in the definition of local martingales one had used the simpler X ∈ M_loc definition, then in this proposition one should have assumed ξ and η to be bounded.

90 See: Lemma 1.70, page 42.
Lemma 1.70, page 42.
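Although the text is purely measure-theoretic, the truncation operation and the common localizing sequence can be mimicked on a discretized path. A minimal sketch (assuming numpy; grid indices stand in for stopping times, and the path is illustrative data):

```python
import numpy as np

def truncate(path, tau):
    """X^tau: the path frozen from the (grid-index) stopping time tau onwards."""
    out = path.copy()
    out[tau:] = path[tau]
    return out

rng = np.random.default_rng(0)
X = np.cumsum(rng.standard_normal(1000))   # a discretized sample path

tau1, tau2 = 400, 700
common = min(tau1, tau2)                   # tau1 ∧ tau2, a common stopping time

# truncating twice equals truncating at the minimum: (X^tau1)^tau2 = X^(tau1 ∧ tau2)
assert np.array_equal(truncate(truncate(X, tau1), tau2), truncate(X, common))
```

Taking the minimum over finitely many stopping times is exactly the construction of the common localizing sequence above.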
STOCHASTIC PROCESSES
One can observe that in the definition of local martingales we used the class of uniformly integrable martingales and not the class of martingales. If L^{τ_n} is a martingale for some τ_n, then L^{τ_n ∧ n} ∈ M, so the class of local martingales is the same as the class of 'locally uniformly integrable martingales'. Very often we prove theorems first for uniformly integrable martingales and then extend them by localization to local martingales. In most cases one can use the same method if one wants to extend a result from uniformly integrable martingales just to martingales. An important subclass of local martingales is the space of locally square-integrable martingales:

Definition 1.135 X is a locally square-integrable martingale if X − X(0) ∈ H²_loc.
Example 1.136 Every martingale with square-integrable values is a locally square-integrable martingale.

By definition a martingale X is square-integrable if X(t) ∈ L²(Ω) for every t. In this case X(0) ∈ L²(Ω), therefore X(t) − X(0) ∈ L²(Ω) for all t, so again one can assume that X(0) = 0. If τ_n := n then (τ_n) is a localizing sequence. By Doob's inequality

‖sup_t |X^{τ_n}(t)|‖_2 = ‖sup_{t≤n} |X(t)|‖_2 ≤ 2 · ‖X(n)‖_2 < ∞,

so X^{τ_n} ∈ H² and therefore X ∈ H²_loc.
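Doob's inequality used above is easy to probe by simulation. A minimal sketch, assuming numpy, with a symmetric random walk as a stand-in martingale (the constants are the ones from the inequality, nothing more):

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_paths = 200, 20_000
X = np.cumsum(rng.choice([-1.0, 1.0], size=(n_paths, n_steps)), axis=1)

# Doob: || sup_{t<=n} |X(t)| ||_2 <= 2 ||X(n)||_2, i.e. E(sup^2) <= 4 E(X(n)^2)
lhs = np.mean(np.max(np.abs(X), axis=1) ** 2)
rhs = 4.0 * np.mean(X[:, -1] ** 2)
assert lhs <= rhs
```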
Example 1.137 Every continuous local martingale is locally square-integrable91.

Let X be a continuous local martingale and let (τ_n) be a localizing sequence of X. As X is continuous, σ_n := inf{t : |X(t)| ≥ n} is a stopping time. If ρ_n := τ_n ∧ σ_n then ρ_n ↗ ∞ and |X^{ρ_n}| ≤ n by the continuity of X, so X^{ρ_n} is a bounded, hence square-integrable, martingale. Therefore X ∈ H²_loc.

Example 1.138 Martingales which are not in H²_loc.

91 One can easily generalize this example: if the jumps of X are bounded then X is in H²_loc. See: Proposition 1.152, page 107.
LOCALIZATION
Let us denote by σ(N) the σ-algebra generated by the measure-zero sets. Let

F_t := { σ(N) if t < 1; A if t ≥ 1 }

and let ξ ∈ L¹(Ω) but ξ ∉ L²(Ω). Let us also assume that E(ξ) = 0. F satisfies the usual conditions, hence X(t) := E(ξ | F_t) is a martingale. X(0) = 0 almost surely, hence not only X ∈ M_loc but also X ∈ L. On the other hand X ∉ H²_loc as, if the stopping time τ is not almost surely smaller than 1, then almost surely τ ≥ 1, hence X^τ(t) = ξ ∉ L²(Ω) for all t ≥ 1.

It is a quite natural, but wrong, guess that local martingales are just martingales with poor integrability. Local martingales are far more mysterious objects.

Example 1.139 An integrable local martingale which is not a martingale.
Let Ω := C[0, ∞), that is, let Ω be the set of continuous functions defined on the half-line R₊. Let X be the canonical coordinate process, that is, if ω ∈ Ω then let X(t, ω) := ω(t), and let the filtration F be the filtration generated by X. Let P be the probability measure on Ω for which X is a Wiener process starting from the point 1. Let

τ_0 := inf{t : X(t) = 0}.

Let us define the measure Q(t) on the σ-algebra F_t with the Radon–Nikodym derivative

dQ(t)/dP := X(t ∧ τ_0) = X(t)χ(t < τ_0) + X(τ_0)χ(t ≥ τ_0) = X(t)χ(t < τ_0).

As the truncated martingales are martingales, X^{τ_0} is a martingale under the measure P. Hence

E(X(t ∧ τ_0) | F_s) = X(s ∧ τ_0).

The measures (Q(t))_{t≥0} are consistent: if s < t and F ∈ F_s ⊆ F_t, then

Q(s)(F) = ∫_F (dQ(s)/dP) dP = ∫_F X(s ∧ τ_0) dP = ∫_F X(t ∧ τ_0) dP = ∫_F (dQ(t)/dP) dP = Q(t)(F).
In particular

Q(t)(Ω) = ∫_Ω X(t ∧ τ_0) dP = ∫_Ω X(0) dP = 1,

so Q(t) is a probability measure for every t. The space C[0, ∞) is a Kolmogorov-type measure space, so on the Borel sets of C[0, ∞) there is a probability measure Q which, restricted to F_t, is Q(t). {τ_0 ≤ t} ∈ F_t for every t, so, as X(τ_0) = 0 almost surely,

Q(τ_0 ≤ t) = Q(t)(τ_0 ≤ t) = ∫_Ω χ(τ_0 ≤ t) X(τ_0 ∧ t) dP = ∫_Ω χ(τ_0 ≤ t) X(τ_0) dP = 0,
so Q(τ_0 = ∞) = 1, that is, X is almost surely never zero under Q. Hence X > 0 under Q, so under Q the process Y := 1/X is almost surely well-defined.

1. As a first step let us show that Y is not a martingale under Q. To show this it is sufficient to prove that the Q-expected value of Y decreases to zero. As P(τ_0 < ∞) = 1, if t ↗ ∞ then

E_Q(Y(t)) = ∫_Ω Y(t) dQ = ∫_Ω (1/X(t)) dQ(t) = ∫_Ω (1/X(t)) χ(t < τ_0) X(t) dP = ∫_Ω χ(t < τ_0) dP = P(t < τ_0) → 0.
2. Now we prove that Y is a local martingale under Q. Let ε > 0 and let τ_ε := inf{t : X(t) = ε}. X is continuous, therefore if ε ↘ 0 then τ_ε(ω) ↗ τ_0(ω) for every outcome ω. Since Q(τ_0 = ∞) = 1, obviously τ_ε ↗ ∞ Q-almost surely92. Let us show that under Q the truncated process Y^{τ_ε} is a martingale. Almost surely 0 < Y^{τ_ε} ≤ 1/ε, hence Y^{τ_ε} is almost surely bounded, hence uniformly integrable. One should only prove that Y^{τ_ε} is a martingale under Q. If s < t and F ∈ F_s, then, as τ_ε < τ_0,

∫_F Y^{τ_ε}(t) dQ = ∫_F (1/X(t ∧ τ_ε)) dQ(t) =    (1.64)
= ∫_F (1/X(t ∧ τ_ε)) X(t ∧ τ_0) dP =
= ∫_F [χ(t < τ_ε)/X(t) + χ(t ≥ τ_ε)/X(τ_ε)] X(t) χ(t < τ_0) dP =
= ∫_F [χ(t < τ_ε) + (X(t)/ε) χ(τ_0 > t ≥ τ_ε)] dP =
= (1/ε) ∫_F [ε + (X^{τ_0}(t) − ε) χ(t ≥ τ_ε)] dP.

Let us prove that M(t) := (X^{τ_0}(t) − ε) χ(t ≥ τ_ε) is a martingale under P. If σ is a bounded stopping time, then, as τ_ε < τ_0, by the elementary properties of the conditional expectation93 and by the Optional Sampling Theorem

E(M(σ)) = E((X^{τ_0}(σ) − ε) χ(σ ≥ τ_ε)) =
= E(E((X^{τ_0}(σ) − ε) χ(σ ≥ τ_ε) | F_{σ∧τ_ε})) =
= E(E(X^{τ_0}(σ) − ε | F_{σ∧τ_ε}) χ(σ ≥ τ_ε)) =
= E(E(X(τ_0 ∧ σ) − ε | F_{σ∧τ_ε}) χ(σ ≥ τ_ε)) =
= E((X(σ ∧ τ_ε) − ε) χ(σ ≥ τ_ε)) =
= E((X(τ_ε) − ε) χ(σ ≥ τ_ε)) = 0,

which means that M is really a martingale94. As M is a martingale under P, in the last integral of (1.64) one can substitute s in the place of t, so, calculating backwards,

∫_F Y^{τ_ε}(t) dQ = ∫_F (1/X(t ∧ τ_ε)) dQ = ∫_F (1/X(s ∧ τ_ε)) dQ = ∫_F Y^{τ_ε}(s) dQ,

that is, Y^{τ_ε} is a martingale under Q. Therefore (τ_{1/n}) localizes Y under Q.

92 Let us recall that by the definition of the localizing sequence it is sufficient if the localizing sequence converges just almost surely to infinity.
93 See: Proposition 1.34, page 20.
94 See: Proposition 1.91, page 57.
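The probability P(t < τ_0) appearing in step 1 has the closed form 2Φ(1/√t) − 1 by the reflection principle; this classical fact is quoted here only as a cross-check. A rough simulation sketch under these assumptions (numpy; discrete monitoring slightly overestimates survival):

```python
import numpy as np
from math import erf, sqrt

def survival(t):
    """P(tau_0 > t) for a Wiener process started from 1 (reflection principle)."""
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return 2.0 * Phi(1.0 / sqrt(t)) - 1.0

rng = np.random.default_rng(2)
n_paths, n_steps, T = 5_000, 1_000, 4.0
dt = T / n_steps
W = 1.0 + np.cumsum(sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
est = np.mean(np.all(W > 0.0, axis=1))   # fraction of paths never hitting 0 on [0, T]

# E_Q(Y(t)) = P(t < tau_0) decreases to zero, so Y cannot be a Q-martingale
assert survival(1.0) > survival(4.0) > survival(100.0)
assert abs(est - survival(T)) < 0.05
```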
Example 1.140 An L²(Ω)-bounded local martingale which is not a martingale95.
Let w be a standard Wiener process in R³, and let X(t) := w(t) + u, where u ≠ 0 is a fixed vector. By the elementary properties of Wiener processes96, if t → ∞ then

R(t) := ‖X(t)‖₂ → ∞.    (1.65)

With a direct calculation it is easy to check that on R³ \ {0} the function

g(x) := 1/‖x‖₂ = 1/√(x₁² + x₂² + x₃²)

is harmonic, that is97,

∆g := ∂²g/∂x₁² + ∂²g/∂x₂² + ∂²g/∂x₃² = 0.

Hence by Itô's formula98 M := 1/R is a local martingale. The density function of X(t) is

f_t(x) := (1/√(2πt)³) exp(−‖x − u‖₂²/(2t)).
If t ≥ 1 then f_t is uniformly bounded, say f_t ≤ c, so if t ≥ 1 then obviously

E(M²(t)) = ∫_{R³} (1/‖x‖₂²) f_t(x) dλ³(x) ≤ c · ∫_{‖x‖₂≤1} (1/‖x‖₂²) dλ³(x) + ∫_{‖x‖₂>1} f_t(x) dλ³(x) ≤ c · I + 1,

since outside the unit ball 1/‖x‖₂² ≤ 1 and f_t is a density. Evidently the integral can diverge only around x = 0, so it remains to show that

I := ∫_{‖x‖₂≤1} (1/‖x‖₂²) dλ³(x) = Σ_k ∫_{G(k)} (1/‖x‖₂²) dλ³(x)
95 We shall use several results which we shall prove later, so one can skip this example during the first reading.
96 See: Proposition B.7, page 564, and Corollary 6.9, page 363.
97 Here ∆ denotes the Laplace operator.
98 See: Theorem 6.2, page 353. As n = 3, almost surely X(t) ≠ 0, hence we can use the formula. See: Theorem 6.7, page 359.
is finite, where

G(k) := { x : 1/2^{k+1} < ‖x‖₂ ≤ 1/2^k }.

As 2^k G(k) = G(0), using the transformation T(x) := 2^k x,

∫_{G(k)} (1/‖x‖₂²) dλ³(x) = (1/2^{3k}) ∫_{G(0)} (2^{2k}/‖x‖₂²) dλ³(x) = 2^{−k} ∫_{G(0)} (1/‖x‖₂²) dλ³(x).

Hence

I = Σ_{k=0}^∞ 2^{−k} ∫_{G(0)} (1/‖x‖₂²) dλ³(x) < ∞.
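The dyadic-shell argument can be verified exactly: in spherical coordinates each shell integral has a closed form, and the geometric series sums to 4π. A small pure-Python check (the closed form 4π(2^{−k} − 2^{−(k+1)}) follows from ∫ r^{−2} · 4πr² dr):

```python
from math import pi

def shell_integral(k):
    """Integral of 1/||x||^2 over G(k) = {2^-(k+1) < ||x|| <= 2^-k}: the integrand
    times the sphere's surface area is constant, giving 4*pi*dr in the radius."""
    return 4.0 * pi * (2.0 ** -k - 2.0 ** -(k + 1))

# the scaling identity of the text: shell k carries 2^-k times the mass of shell 0
for k in range(10):
    assert abs(shell_integral(k) - 2.0 ** -k * shell_integral(0)) < 1e-15

# hence I = sum over shells = 4*pi < infinity
I = sum(shell_integral(k) for k in range(60))
assert abs(I - 4.0 * pi) < 1e-12
```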
It is easy to show that E(M²(t)) is continuous in t, therefore it is bounded on [0, 1]. Hence E(M²(t)) is bounded on R₊. By (1.65) M(t) → 0 almost surely. M is bounded in L²(Ω), therefore it is uniformly integrable, so M(t) → 0 in L¹(Ω) as well. If M were a martingale then

0 < M(t) = E(M(∞) | F_t) = E(0 | F_t) = 0,

which is impossible.

As the uniformly integrable local martingales are not necessarily martingales, even the next, nearly trivial, observation is very useful:

Proposition 1.141 Every non-negative local martingale is a supermartingale.

Proof: Let M = M(0) + L be a non-negative local martingale. Observe that M(0) ≥ 0 is not necessarily integrable, so one cannot assume that M(t) is integrable; as M ≥ 0, the conditional expectations below are still meaningful. As L ∈ L there is a localizing sequence (τ_n) such that L^{τ_n} ∈ M for all n. If t > s, then, as M ≥ 0, by Fatou's lemma

E(M(t) | F_s) = E(liminf_{n→∞} M^{τ_n}(t) | F_s) ≤ liminf_{n→∞} E(M^{τ_n}(t) | F_s) =
= M(0) + liminf_{n→∞} E(L^{τ_n}(t) | F_s) =
= M(0) + liminf_{n→∞} L^{τ_n}(s) = M(s).

Corollary 1.142 If M ∈ L and M ≥ 0 then M = 0.

Proof: As M is a supermartingale, 0 ≤ E(M(t)) ≤ E(M(0)) = 0 for all t ≥ 0, so almost surely M(t) = 0.
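Example 1.140 is easy to explore numerically. The sketch below assumes numpy; the closed form E(M(t)) = erf(1/√(2t)) (for u = (1, 0, 0)) is the classical Newtonian potential of a spherical Gaussian cloud and is quoted only as a cross-check:

```python
import numpy as np
from math import erf, sqrt

def mean_M(t, n=200_000, seed=3):
    """Monte Carlo estimate of E(M(t)) = E(1 / ||w(t) + u||), u = (1, 0, 0)."""
    rng = np.random.default_rng(seed)
    x = sqrt(t) * rng.standard_normal((n, 3))
    x[:, 0] += 1.0
    return float(np.mean(1.0 / np.linalg.norm(x, axis=1)))

for t in (1.0, 4.0, 9.0):
    # E(M(t)) = erf(1/sqrt(2t)) decreases to 0, so M is a strict supermartingale
    assert abs(mean_M(t) - erf(1.0 / sqrt(2.0 * t))) < 0.01

assert mean_M(1.0) > mean_M(9.0)
```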
The most striking and puzzling feature of local martingales is that even uniform integrability is not sufficient to guarantee that local martingales are proper martingales. The reason is the following: if Γ is a set of stopping times, then the uniform integrability of the family (X(t))_{t≥0} does not guarantee the uniform integrability of the stopped family (X(τ))_{τ∈Γ}. This cannot happen if the local martingale belongs to the so-called class D.

Definition 1.143 A process X belongs to the Dirichlet–Doob class99, shortly X is in class D, if the set

{X(τ) : τ < ∞ is an arbitrary finite-valued stopping time}

is uniformly integrable. We shall also denote by D the set of processes in class D.

Proposition 1.144 Let L be a local martingale. L is in class D if and only if L ∈ M, that is, if L is a uniformly integrable martingale.

Proof: Recall that we constructed a non-negative L²(Ω)-bounded local martingale which is not a proper martingale.

1. Let L ∈ D be a local martingale. As τ = 0 is a stopping time, by the definition of D, L(0) is integrable, so one can assume that L ∈ L. If (τ_n) is a localizing sequence of L then

L(τ_n ∧ s) = L^{τ_n}(s) = E(L^{τ_n}(t) | F_s) = E(L(τ_n ∧ t) | F_s).

τ_n ↗ ∞, hence the sequences (L(τ_n ∧ s))_n and (L(τ_n ∧ t))_n converge to L(s) and L(t). By uniform integrability the convergence L(τ_n ∧ t) → L(t) holds in L¹(Ω) as well. By the L¹-continuity of the conditional expectation

L(s) = E(L(t) | F_s),

hence L is a martingale100. Obviously the set {L(t)}_t ⊆ {L(τ)}_τ is uniformly integrable, so L ∈ M.

2. The reverse implication is obvious: if L is a uniformly integrable martingale then by the Optional Sampling Theorem L(τ) = E(L(∞) | F_τ) for every stopping time τ, hence the family (L(τ))_τ is uniformly integrable101.

99 In [77] on page 244 class D is called the Dirichlet class. [74] on page 107 remarks that class D stands for Doob's class and that the definition was introduced by P.A. Meyer in 1963.
100 Observe that it is enough to assume that {L(τ)}_τ is uniformly integrable for the set of bounded stopping times τ.
101 See: Lemma 1.70, page 42.
Corollary 1.145 If a process X is dominated by an integrable variable then X ∈ D; hence if X is a local martingale and X is dominated by an integrable variable102, then X ∈ M.

Example 1.146 Let us assume that L has independent increments. If X := exp(L), then X is a local martingale if and only if X is a martingale.
One should only prove that if X is a local martingale then X is a martingale. By the definition of processes with independent increments L(0) = 0, hence X(0) = 1. X is a non-negative local martingale, so it is a supermartingale103. If m(t) denotes the expected value of X(t), then by the supermartingale property 1 ≥ m(t) > 0. Let us prove that M(t) := X(t)/m(t) is a martingale. As L has independent increments, if t > s then

m(t) := E(X(t)) = E(X(s)) E(exp(L(t) − L(s))) = m(s) E(exp(L(t) − L(s))).

From this

E(M(t) | F_s) := E(exp(L(t))/m(t) | F_s) =
= E(exp(L(t) − L(s) + L(s))/m(t) | F_s) =
= (exp(L(s))/m(t)) E(exp(L(t) − L(s)) | F_s) =
= (exp(L(s))/m(t)) E(exp(L(t) − L(s))) =
= exp(L(s))/m(s) = M(s),

hence M is a martingale. For arbitrary T < ∞, on the interval [0, T] the martingale M is uniformly integrable, that is, M is in class D. As on the interval [0, T]

0 ≤ X = Mm ≤ M,

X is also in class D. Therefore X ∈ D and X is a local martingale on [0, T]. This means that X is a martingale on [0, T] for every T, hence X is a martingale on R₊.

102 See: Davis’ inequality, Theorem 4.62, page 277.
103 See: Proposition 1.141, page 101.
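For L = w a Wiener process, which has independent increments, m(t) = E(exp(w(t))) = e^{t/2}, and the normalization used in the example can be watched at work. A minimal simulation sketch, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, n_steps, T = 100_000, 64, 2.0
dt = T / n_steps
w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
t = dt * np.arange(1, n_steps + 1)

M = np.exp(w - t / 2.0)      # X/m with X = exp(w) and m(t) = e^{t/2}
# the normalized process keeps constant expectation 1, as a martingale must
assert np.all(np.abs(M.mean(axis=0) - 1.0) < 0.05)
```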
If a process has independent increments and the expected value of the process is zero, then it is obviously a martingale. Therefore martingales are generalizations of random walks. From an intuitive point of view one can also think of local martingales as generalized random walks, as we shall later prove the following, somewhat striking, theorem:

Theorem 1.147 Assume that the stochastic base satisfies the usual conditions. If a local martingale has independent increments then it is a true martingale104.

1.4.3 Convergence of local martingales: uniform convergence on compacts in probability
Let X be an arbitrary space of processes. In X_loc it is very natural to define the topology by localization: X_m → X if X and the elements of the sequence (X_m) have a common localizing sequence (τ_n) and for every n, in the topology of X,

lim_{m→∞} X_m^{τ_n} = X^{τ_n}.

Let us assume105 that (X_m) and X are in H^p_loc. In H^p one should define the topology with the norm

‖X‖_{H^p} := ‖sup_s |X(s)|‖_p.

If τ_n ↗ ∞ and t < ∞, then for every δ > 0 one can find an n such that P(τ_n ≤ t) < δ. Let ε > 0 be arbitrary. If

A := { sup_{s≤t} |X_m(s) − X(s)| > ε },

then

P(A) = P((τ_n ≤ t) ∩ A) + P((τ_n > t) ∩ A) ≤
≤ P(τ_n ≤ t) + P((τ_n > t) ∩ A) ≤ δ + P((τ_n > t) ∩ A) ≤
≤ δ + P(sup_{s≤t} |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε) ≤
≤ δ + P(sup_s |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε).

By Markov's inequality stochastic convergence follows from convergence in L^p(Ω). Therefore if lim_{m→∞} X_m^{τ_n} = X^{τ_n} in H^p, then

lim_{m→∞} P(sup_s |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε) = 0.

This implies that for every ε > 0 and for every t

lim_{m→∞} P(sup_{s≤t} |X_m(s) − X(s)| > ε) = 0.

104 Of course the main point is that a local martingale with independent increments has finite expected value. See: Theorem 7.97, page 545.
105 It is an important consequence of the Fundamental Theorem of Local Martingales that every local martingale is in H¹_loc. See: Corollary 3.59, page 221.
Hence one should expect that the next definition is very useful106:

Definition 1.148 We say that the sequence of stochastic processes (X_n) converges uniformly on compacts in probability to the process X if for arbitrary107 t and ε > 0

lim_{n→∞} P(sup_{s≤t} |X_n(s) − X(s)| > ε) = 0.

Let X be a stochastic process and for a > 0 let

τ_a := inf{t : |X(t)| > a}.

If X is right-regular then |X(τ_a)| ≥ a, but as X can reach the level a with a jump, it can happen that for certain outcomes |X(τ_a)| > a. For right-continuous processes one can only use the estimation

|X(τ_a)| ≤ a + |∆X(τ_a)|.

As the jump |∆X(τ_a)| can be arbitrarily large, X is not necessarily bounded on the random interval

[0, τ_a] := {(t, ω) : 0 ≤ t ≤ τ_a(ω) < ∞}.    (1.66)

On the other hand, let us assume that X is left-continuous. If τ_a(ω) > 0 and |X(τ_a(ω), ω)| > a for some outcome ω, then by the left-continuity one could decrease the value of τ_a(ω), which by definition is impossible. Hence |X(τ_a)| ≤ a on the set {τ_a > 0}. This means that if X is left-continuous and X(0) = 0 then X is bounded on the random interval (1.66). These observations are the core of the next two propositions:

Proposition 1.151 If the filtration is right-continuous then every left-regular process is locally bounded.

Proof: Let X be left-regular. The process X − X(0) is also left-regular, so one can assume that X(0) = 0. Define the random times

τ_n := inf{t : |X(t)| > n}.

The filtration is right-continuous and X is left-regular, so τ_n is a stopping time109. As X(0) = 0, if τ_n(ω) = 0 then |X(τ_n)| ≤ n. If τ_n(ω) > 0 then |X(τ_n(ω), ω)| > n is impossible, as in this case, by the left-continuity of X, one could decrease τ_n(ω).

108 See: Proposition 1.6, page 5.
109 See: Example 1.32, page 17.
Hence the truncated process X^{τ_n} is bounded. Let us show that τ_n ↗ ∞, that is, let us show that the sequence (τ_n) is a localizing sequence. Obviously (τ_n) is never decreasing. If for some outcome ω the sequence (τ_n(ω)) were bounded, then one could find a bounded sequence (t_n) for which |X(t_n, ω)| > n. Let (t_{n_k})_k be a monotone, convergent subsequence of (t_n). If t_{n_k} → t*, then |X(t_{n_k}, ω)| → ∞, which is impossible as X has finite left and right limits.

Proposition 1.152 If the filtration is right-continuous and the jumps of the right-regular process X are bounded, then X is locally bounded.

Proof: We can again assume that X(0) = 0. Assume that |∆X| ≤ a. As in the previous proposition, if τ_n := inf{t : |X(t)| > n}, then (τ_n) is a localizing sequence and |X(τ_n−)| ≤ n, therefore

|X^{τ_n}| ≤ n + |∆X(τ_n)| ≤ n + a.
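The first-exit times of Propositions 1.151 and 1.152 are easy to realize on a discretized path. A minimal sketch (assuming numpy; grid indices play the role of stopping times):

```python
import numpy as np

rng = np.random.default_rng(5)
X = np.cumsum(0.05 * rng.standard_normal(5000))   # a continuous-looking sample path

def first_exit(path, level):
    """tau := inf{ k : |path[k]| > level }; len(path) if the level is never exceeded."""
    hits = np.flatnonzero(np.abs(path) > level)
    return int(hits[0]) if hits.size else len(path)

for n in (1.0, 2.0, 3.0):
    tau = first_exit(X, n)
    before = X[:tau]
    # strictly before the exit time the path stays within the level
    assert before.size == 0 or np.max(np.abs(before)) <= n
```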
Example 1.153 In the previous propositions one cannot drop the condition of regularity.

The process

X(t) := { 1/t if t > 0; 0 if t = 0 }

is continuous from the left but not regular, and it is obviously not locally bounded. The process

X(t) := { 1/(1 − t) if t < 1; 0 if t ≥ 1 }

is continuous from the right, but it is also not locally bounded.
2 STOCHASTIC INTEGRATION WITH LOCALLY SQUARE-INTEGRABLE MARTINGALES

In this chapter we shall present a relatively simple introduction to stochastic integration theory. Our main simplifying assumption is that the integrators are locally square-integrable martingales. Every continuous process is locally bounded, hence the space H²_loc contains the continuous local martingales. In most of the applications the integrator is continuous, therefore in this chapter we shall mainly concentrate on the continuous case. As we shall see, the slightly more general case, when the integrator is in H²_loc, is nearly the same as the continuous one.

The central concept of this chapter is the quadratic variation [X]. We shall show that if X is a continuous local martingale then [X] is continuous and increasing, and X² − [X] is also a local martingale. It is a crucial observation that in the continuous case these properties characterize the quadratic variation. When the integrator X is discontinuous then the quadratic variation [X] is also discontinuous. As in the continuous case, X² − [X] is still a local martingale, but this property does not characterize the quadratic variation for local martingales in general. The jump process of the quadratic variation satisfies the identity ∆[X] = (∆X)², and [X] is the only right-continuous, increasing process for which X² − [X] is a local martingale and the identity ∆[X] = (∆X)² holds.

When the integrators are continuous one can define the stochastic integral for progressively measurable integrands. The main difference between the continuous and the H²_loc case is that in the discontinuous case we should take into account the jumps of the integral. Because of this extra burden, in the discontinuous case one can define the stochastic integral only when the integrands are predictable.

In the first part of the chapter we shall introduce the so-called Itô–Stieltjes integral. We shall use the existence theorem of the Itô–Stieltjes integral to prove the existence of the quadratic variation. After this, we present the construction
of the stochastic integral when the integrators are continuous local martingales. At the end of the chapter we briefly discuss the difference between the continuous and the H²_loc case.

In the present chapter we assume that the filtration is right-continuous and that if N ∈ A has probability zero, then N ∈ F_s for all s. But we shall not need the assumption that (Ω, A, P) is complete.
2.1 The Itô–Stieltjes Integrals
In this section we introduce the simplest concept of stochastic integration, which I prefer to call Itô–Stieltjes integration. Every integral is basically a limit of certain approximating sums. The meaning of the integral is generally obvious for the finite approximations, and by definition the integral operator extends the meaning of the finite sums to some more complicated infinite objects. In stochastic integration theory we have two stochastic processes: the integrator X and the integrand Y. As in elementary analysis, let us fix an interval [a, b] and let

∆_n : a = t_0^{(n)} < t_1^{(n)} < ··· < t_{m_n}^{(n)} = b    (2.1)

be a partition of [a, b]. For a fixed partition ∆_n let us define the finite approximating sum

S_n := Σ_{k=1}^{m_n} Y(τ_k^{(n)}) (X(t_k^{(n)}) − X(t_{k−1}^{(n)})),

where the test points τ_k^{(n)} have been chosen in some way from the time subintervals [t_{k−1}^{(n)}, t_k^{(n)}]. If the integrator X is the price of some risky asset then X(t_k^{(n)}) − X(t_{k−1}^{(n)}) is the change of the price during the time interval [t_{k−1}^{(n)}, t_k^{(n)}], and if Y(τ_k^{(n)}) is the number of assets one holds during this time period, then S_n is the net change of the value of the portfolio during the whole time period [a, b]. If

lim_{n→∞} max_k (t_k^{(n)} − t_{k−1}^{(n)}) = 0,

then the sequence of partitions (∆_n) is called infinitesimal. In this section we say that the integral ∫_a^b Y dX exists if for any infinitesimal sequence of partitions of [a, b] the sequence of approximating sums (S_n) is convergent and the limit is independent of the partitions (∆_n). The main problem is the following: under which conditions and in which sense does the limit lim_{n→∞} S_n exist? Generally we can only guarantee that the approximating sequence (S_n) is convergent in probability, and for the existence of the integral we should assume that the test points τ_k^{(n)} have been chosen in a very restricted way. That is, we should assume that τ_k^{(n)} = t_{k−1}^{(n)}. This type of integral we shall call the Itô–Stieltjes integral of Y against X. Perhaps the most important and most unusual point in the theory is that we should restrict the choice of the test points τ_k^{(n)}. The simplest example showing why this is necessary follows:

Example 2.1 Let w be a Wiener process. Try to define the integral ∫_a^b w dw!
Consider the approximating sums

S_n := Σ_k w(t_k^{(n)}) (w(t_k^{(n)}) − w(t_{k−1}^{(n)}))

and

I_n := Σ_k w(t_{k−1}^{(n)}) (w(t_k^{(n)}) − w(t_{k−1}^{(n)})).

In the first case τ_k^{(n)} := t_k^{(n)} and in the second case τ_k^{(n)} := t_{k−1}^{(n)}. Obviously

S_n − I_n = Σ_k (w(t_k^{(n)}) − w(t_{k−1}^{(n)}))²,

which is the approximating sum of the quadratic variation of the Wiener process. As we will prove1, if n → ∞ then in L²(Ω)-norm

lim_{n→∞} (S_n − I_n) = b − a ≠ 0,

that is, the limit of the approximating sums depends on the choice of the test points τ_k^{(n)}. As the interpretation of the stochastic integral is basically the net gain of some gambling process, it is quite reasonable to choose τ_k^{(n)} as t_{k−1}^{(n)}: one should decide about the size of a portfolio before the prices change, since it is quite unrealistic to assume that one can decide about the size of an investment after the new prices have already been announced. It is very simple to see that

I_n = (1/2) Σ_k (w²(t_k^{(n)}) − w²(t_{k−1}^{(n)})) − (1/2) Σ_k (w(t_k^{(n)}) − w(t_{k−1}^{(n)}))²,

1 See: Example 2.27, page 129, Theorem B.17, page 571.
2
(n) w(tk−1 )
=
ˆ THE ITO–STIELTJES INTEGRALS
111
hence lim In =
n→∞
=
1 & 2 'b 1 w (t) a − (b − a) = 2 2 1 1 2 w (b) − w2 (a) − (b − a) , 2 2
and similarly lim Sn =
n→∞
=
2.1.1
1 & 2 'b 1 w (t) a − (b − a) + (b − a) = 2 2 1 1 2 w (b) − w2 (a) + (b − a) . 2 2
Itˆ o–Stieltjes integrals when the integrators have finite variation
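Before specializing to finite-variation integrators, it is instructive to replay Example 2.1 numerically. A minimal sketch, assuming numpy: the left- and right-endpoint sums satisfy the algebraic identities of Example 2.1 exactly, and their difference, the quadratic-variation sum, concentrates around b − a:

```python
import numpy as np

rng = np.random.default_rng(6)
a, b, n = 0.0, 1.0, 20_000
t = np.linspace(a, b, n + 1)
w = np.concatenate([[0.0], np.cumsum(np.sqrt(np.diff(t)) * rng.standard_normal(n))])
dw = np.diff(w)

S = np.sum(w[1:] * dw)      # right-endpoint test points
I = np.sum(w[:-1] * dw)     # left-endpoint (Ito-Stieltjes) test points
qv = np.sum(dw ** 2)        # quadratic-variation sum

# exact algebraic identities from Example 2.1
assert abs(I - (0.5 * (w[-1] ** 2 - w[0] ** 2) - 0.5 * qv)) < 1e-9
assert abs(S - (0.5 * (w[-1] ** 2 - w[0] ** 2) + 0.5 * qv)) < 1e-9
# the quadratic-variation sum is close to b - a, so S and I differ in the limit
assert abs(qv - (b - a)) < 0.1
```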
Integration theory is quite simple when the trajectories of the integrator X have finite variation on every finite interval. As a point of departure it is worth recalling a classical theorem from elementary analysis. The following simple proposition is well known; it is just a parametrized version of one of the most important existence theorems of the calculus.

Proposition 2.2 (Existence of Riemann–Stieltjes integrals) Let us fix a finite time interval [a, b]. If the trajectories of the integrator X have finite variation and the integrand Y is continuous, then for all outcomes ω the limit of the approximating sums

S_n := Σ_{k=1}^{m_n} Y(τ_k^{(n)}) (X(t_k^{(n)}) − X(t_{k−1}^{(n)}))    (2.2)

exists and it is independent of the choice of the infinitesimal sequence of partitions (2.1) and of the choice of the test points τ_k^{(n)} ∈ [t_{k−1}^{(n)}, t_k^{(n)}].

Proof. As the trajectories Y(ω) are continuous on [a, b], they are uniformly continuous, and therefore for any ε > 0 there is a δ(ω) > 0 such that if |t′ − t′′| < δ(ω) then2

|Y(t′, ω) − Y(t′′, ω)| < ε / (2 · Var(X(ω), a, b)).    (2.3)

2 We can assume that Var(X(ω), a, b) > 0, otherwise X(ω) is constant on [a, b] and the integral trivially exists.
If all partitions of [a, b] are finer than δ(ω)/2, that is, if for all n

max_k (t_k^{(n)} − t_{k−1}^{(n)}) < δ(ω)/2,

then by (2.3)

0 ≤ |S_i − S_j| = |Σ_k Y(τ_k^{(i)})(X(t_k^{(i)}) − X(t_{k−1}^{(i)})) − Σ_l Y(τ_l^{(j)})(X(t_l^{(j)}) − X(t_{l−1}^{(j)}))| =
= |Σ_r (Y(θ_r^{(i)}) − Y(θ_r^{(j)}))(X(s_r) − X(s_{r−1}))| ≤
≤ max_r |Y(θ_r^{(i)}) − Y(θ_r^{(j)})| · Σ_r |X(s_r) − X(s_{r−1})| ≤
≤ max_r |Y(θ_r^{(i)}) − Y(θ_r^{(j)})| · Var(X, a, b) ≤ ε,

where (s_r) is any partition containing the points (t_k^{(i)}) and (t_l^{(j)}), and θ_r^{(i)} and θ_r^{(j)} are the original test points τ_k^{(i)} and τ_k^{(j)} corresponding to [s_{r−1}, s_r], respectively. So for any ω, (S_n(ω)) is a Cauchy sequence, so for all ω the limit

(∫_a^b Y dX)(ω) := lim_{n→∞} S_n(ω)

exists. If (S_p) and (S_q) are two different approximating sequences generated by different infinitesimal sequences of partitions of [a, b], or they belong to different choices of test points, and

I_n := { S_p if n = 2p; S_q if n = 2q − 1 },

then by the argument just presented (I_n) also has a limit, which is of course the common limit of (S_p) and (S_q). Hence the limit depends neither on the infinitesimal sequence of partitions (t_k^{(n)}) nor on the way of choosing the test points (τ_k^{(n)}).

Definition 2.3 If the value of the integral is independent of the choice of the test points (τ_k^{(n)}), then the integral is called the Riemann–Stieltjes integral of Y against X. Of course the integral is denoted by ∫_a^b Y dX.
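Proposition 2.2 and Definition 2.3 can be illustrated with a smooth finite-variation integrator. In the sketch below (assuming numpy; X = sin and Y = exp are arbitrary illustrative choices, with ∫₀¹ eᵗ d(sin t) = ∫₀¹ eᵗ cos t dt) both endpoint choices converge to the same value:

```python
import numpy as np

def rs_sum(Y, X, grid, use_left):
    """sum_k Y(tau_k)(X(t_k) - X(t_{k-1})) with tau_k the left or right endpoint."""
    tau = grid[:-1] if use_left else grid[1:]
    return float(np.sum(Y(tau) * np.diff(X(grid))))

X, Y = np.sin, np.exp                      # smooth X => finite variation on [0, 1]
exact = (np.e * (np.sin(1.0) + np.cos(1.0)) - 1.0) / 2.0   # integral of e^t cos t

grid = np.linspace(0.0, 1.0, 2001)
left, right = rs_sum(Y, X, grid, True), rs_sum(Y, X, grid, False)
# with a finite-variation integrator and continuous integrand the test points do not matter
assert abs(left - exact) < 1e-2 and abs(right - exact) < 1e-2
```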
Example 2.4 If Y and X have common points of discontinuity then the Riemann–Stieltjes integral ∫_a^b Y dX does not exist.

If

Y(t) := { 0 if t ≤ 0; 1 if t > 0 }    and    X(t) := { 0 if t < 0; 1 if t ≥ 0 },

then the Riemann–Stieltjes integral ∫_{−1}^{1} X dY does not exist: if τ_k^{(n)} ≤ 0 for the subinterval containing t = 0 then S_n = 0, otherwise S_n = 1. Observe that if the test point τ_k^{(n)} is the left endpoint of the subinterval, then S_n = 0, hence the so-called Itô–Stieltjes integral3 is zero.

Our goal is to extend the integral to discontinuous integrands. As a first step, we extend the integral to regular integrands. As we saw in the previous example, even for left-regular integrands we cannot choose the test points τ_k^{(n)} arbitrarily.

Definition 2.5 If the value of the test point τ_k^{(n)} is always the left endpoint of the subinterval [t_{k−1}^{(n)}, t_k^{(n)}], that is, if τ_k^{(n)} = t_{k−1}^{(n)} for all k, then the integral is called the Itô–Stieltjes integral of Y against X. Of course the Itô–Stieltjes integrals are also denoted by ∫_a^b Y dX.

Example 2.6 If f is a simple predictable jump, that is,

f(t) := { c₁ if t ≤ t₀; c₂ if t > t₀ },

then for any regular function g the Itô–Stieltjes integral is

∫_a^b f dg = c₁ (g(t₀+) − g(a)) + c₂ (g(b) − g(t₀+)).    (2.4)

If f is a simple jump, that is,

f(t) := { c₁ if t < t₀; c₃ if t = t₀; c₂ if t > t₀ },

then for any right-regular function g the Itô–Stieltjes integral is again (2.4).

3 See the definition below.
If t₀ = b then by definition g(t₀+) = g(b+) := g(b), so in this case (2.4) is obvious. Let (t_k^{(n)}) be an infinitesimal sequence of partitions. By the definition of the integral

S_n := Σ_k f(t_{k−1}^{(n)}) (g(t_k^{(n)}) − g(t_{k−1}^{(n)})) = c₁ (g(t_j^{(n)}) − g(a)) + c₂ (g(b) − g(t_j^{(n)})),

where t₀ ∈ [t_{j−1}^{(n)}, t_j^{(n)}). If n → ∞ then t_j^{(n)} ↘ t₀+, and as g is regular the limit lim_n S_n exists and is equal to the formula given. Assume now that g is right-regular. If t₀ ≠ t_{j−1}^{(n)} then the approximating sums do not change. If t₀ = t_{j−1}^{(n)} then

S_n = c₁ (g(t₀) − g(a)) + c₃ (g(t_j^{(n)}) − g(t₀)) + c₂ (g(b) − g(t_j^{(n)})).

g is right-continuous at t₀, so g(t_j^{(n)}) − g(t₀) → 0, hence the limit is again the same as in the previous case.

One can easily generalize the example above4:

Lemma 2.7 If every trajectory of the integrand Y is a step function with a finite number of jumps and X is a right-continuous process, then for arbitrary a < b the Itô–Stieltjes integral ∫_a^b Y dX exists and the approximating sums converge for every outcome ω.

Example 2.8 If f is a simple spike, that is, if

f(t) := { c if t = t₀; 0 if t ≠ t₀ },

then for any right-continuous integrator the Itô–Stieltjes integral of f is zero.
The approximating sum is

S_n = { 0 if t₀ ≠ t_j^{(n)} for every j; c · (g(t_{j+1}^{(n)}) − g(t_j^{(n)})) if t₀ = t_j^{(n)} }.

4 Let us observe that the Itô–Stieltjes integral is, trivially, additive.
In the first case of course lim_n S_n = 0; in the second case, as g is right-continuous,

lim_{n→∞} S_n = c · lim_{n→∞} (g(t_{j+1}^{(n)}) − g(t_j^{(n)})) = c · lim_{n→∞} (g(t_{j+1}^{(n)}) − g(t₀)) = 0.
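The computations of Examples 2.6 and 2.8, and the contrast with the Lebesgue–Stieltjes value, can be replayed in plain Python (the concrete values t₀ = 0.5, c₁ = 2, c₂ = 3, c = 7 are arbitrary illustrative choices):

```python
def ito_sum(f, g, grid):
    """Ito-Stieltjes sum with left-endpoint test points."""
    return sum(f(grid[k - 1]) * (g(grid[k]) - g(grid[k - 1]))
               for k in range(1, len(grid)))

t0, c1, c2, c = 0.5, 2.0, 3.0, 7.0
step  = lambda t: c1 if t <= t0 else c2      # simple predictable jump (Example 2.6)
spike = lambda t: c if t == t0 else 0.0      # simple spike (Example 2.8)
g     = lambda t: 0.0 if t < t0 else 1.0     # right-continuous, one jump at t0

grid = [k / 1000.0 for k in range(1001)]     # t0 = 0.5 is a grid point
# formula (2.4): c1 (g(t0+) - g(a)) + c2 (g(b) - g(t0+)) = c1 here
assert abs(ito_sum(step, g, grid) - c1) < 1e-12
# the spike integrates to zero, while the Lebesgue-Stieltjes value is f(t0) dg(t0) = c
assert ito_sum(spike, g, grid) == 0.0
assert spike(t0) * 1.0 == c
```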
Observe that if g has bounded variation, then g defines a signed measure on R. b The Lebesgue–Stieltjes integral is a f dg = f (t0 )∆g(t0 ) which is different from the Itˆo–Stieltjes integral. Later5 we shall show that for left-regular processes the Lebesgue–Stieltjes and the Itˆ o–Stieltjes integrals are equal but, as in this case f is not left-regular, the theorem is not applicable6 . We shall very often use the following simple observation: Proposition 2.9 (The existence of the Itˆ o–Stieltjes integral) If the integrator X is right-continuous7 and it has finite variation and the integrand Y is b regular then for any time interval [a, b] the Itˆ o–Stieltjes integral a Y dX exists and for all outcome ω the approximating sequences In (ω)
(n) (n) (n) Y (tk−1 , ω) X(tk , ω) − X(tk−1 , ω)
k
are convergent. Proof. The proof is similar to the proof of the existence of Riemann–Stieltjes integrals. Fix an outcome ω and let (In ) be the sequence of the approximating sums. Fix an ε > 0 and an outcome ω. By the regularity of Y (ω) there are only a finite number of jumps bigger than8 c Let J
ε . 4 · Var (X) (a, b, ω)
∆Y · χ (|∆Y | ≥ c) and Z Y − J. (J)
1. Let us denote by (In ) the approximating sums formed with J. As Y is regular the number of ‘big jumps’ on every trajectory is finite. X is right-continuous, b hence by the previous lemma the integral a J (ω) dX (ω) exists for any ω. Hence if i and j are big enough, then ε (J) (J) Ii (ω) − Ij (ω) ≤ . 2 5 It is an easy consequence of the Dominated Convergence Theorem. See: Theorem 2.88, page 174. See also the properties of the stochastic integral on page 434. 6 Recall that the Riemann–Stieltjes integral b f dg does not exist. a 7 If X is not right-continuous then we should assume that Y is left-regular. 8 See: Proposition 1.5, page 5. We can assume that Var (X (ω) , a, b) > 0 otherwise X (ω) is constant on [a, b] and the proposition is trivially satisfied.
116
STOCHASTIC INTEGRATION
2. Finally let us define the approximating sums In(Z)
(n) (n) (n) Z(tk−1 , ω)X (tk , ω) − X(tk−1 , ω) .
k
The jumps of Z are smaller than c and Z is regular, hence9 there is a δ(ω) such that if |s − t| ≤ δ(ω) then |Z(s, ω) − Z(t, ω)| ≤ 2c. (n) (n) If maxk tk − tk−1 ≤ δ(ω)/2 for all n ≥ N then as in the case of the ordinary Riemann–Stieltjes integral ε (Z) (Z) Ii (ω) − Ij (ω) ≤ 2c · Var (X (ω) , a, b) ≤ . 2 3. Adding up the two inequalities above if i and j are sufficiently large then (J) (J) |Ii (ω) − Ij (ω)| ≤ Ii (ω) − Ij (ω) + (Z) (Z) + Ii (ω) − Ij (ω) ≤ ε.
(2.5)
This means that $(I_n(\omega))$ is a Cauchy sequence for any $\omega$. The rest of the proof is the same as the last part of the proof of the previous proposition.

Example 2.10 The Itô–Stieltjes and the Lebesgue–Stieltjes integrals are not equal.

One should emphasize that, as $X$ has bounded variation, one can also define the pathwise Lebesgue–Stieltjes integral of $Y$ with respect to the measures generated by the trajectories of $X$. If $Y$ is left-continuous then
$$Y = \lim_{n\to\infty} \sum_k Y(t_{k-1}^{(n)})\,\chi\left( \left( t_{k-1}^{(n)}, t_k^{(n)} \right] \right),$$
so by the Dominated Convergence Theorem the two integrals are equal. But in general the Itô–Stieltjes and the Lebesgue–Stieltjes integrals are not equal. If
$$Y(t) = X(t) \triangleq \begin{cases} 0 & \text{if } t < 1/2, \\ 1 & \text{if } t \ge 1/2, \end{cases}$$

9. See: Proposition 1.7, page 6.
THE ITÔ–STIELTJES INTEGRALS
then the measure generated by $X$ is the Dirac measure $\delta_{1/2}$, so the Lebesgue–Stieltjes integral over $(0,1]$ is one, while the Itô–Stieltjes integral is zero¹⁰.

2.1.2 Itô–Stieltjes integrals when the integrators are locally square-integrable martingales

Perhaps the most important stochastic processes are the Wiener processes. As the trajectories of Wiener processes almost surely do not have finite variation¹¹, we cannot apply the previous construction when the integrator is a Wiener process.

Theorem 2.11 (Fisk) Let $L$ be a continuous local martingale. If the trajectories of $L$ have finite variation then for almost all outcomes $\omega$ the trajectories of $L$ are constant functions.

Proof. Consider the local martingale $M \triangleq L - L(0)$. It is sufficient to prove that $M = 0$. Let $V \triangleq \operatorname{Var}(M)$ and let $(\rho_n)$ be a localizing sequence of $M$. As the variation of a continuous function is continuous,
$$\upsilon_n(\omega) \triangleq \inf\{t : |M(t,\omega)| \ge n\} \quad\text{and}\quad \kappa_n(\omega) \triangleq \inf\{t : V(t,\omega) \ge n\}$$
are stopping times. Hence $\tau_n \triangleq \upsilon_n \wedge \kappa_n \wedge \rho_n$ is also a stopping time. Obviously $\tau_n \nearrow \infty$, hence if $M^{\tau_n} = 0$ for all $n$ then $M$ is zero on $[0,\tau_n]$ for all $n$, and therefore $M$ is zero on $\cup_n [0,\tau_n] = \mathbb{R}^+ \times \Omega$, so $M = 0$. As the trajectories of $M^{\tau_n}$ and $V^{\tau_n}$ are bounded, one can assume that $M$ and $V \triangleq \operatorname{Var}(M)$ are bounded. Let $(t_k^{(n)})$ be an arbitrary infinitesimal sequence of partitions of $[0,t]$. By the energy identity¹², if $u > v$, then
$$\mathbf{E}\left( \left( M(u) - M(v) \right)^2 \right) = \mathbf{E}\left( M^2(u) - M^2(v) \right), \tag{2.6}$$
hence, as $M(0) = 0$,
$$\mathbf{E}\left( M^2(t) \right) = \mathbf{E}\left( M^2(t) \right) - \mathbf{E}\left( M^2(0) \right) = \mathbf{E}\left( \sum_k M^2(t_k^{(n)}) - M^2(t_{k-1}^{(n)}) \right) = \mathbf{E}\left( \sum_k \left( M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right)^2 \right).$$

10. See: Example 2.6, page 113.
11. See: Theorem B.17, page 571.
12. See: Proposition 1.58, page 35.
$V$ is bounded, hence $V \triangleq \operatorname{Var}(M) \le c$. Therefore
$$\mathbf{E}\left( M^2(t) \right) \le \mathbf{E}\left( \sum_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \cdot \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right) \le \mathbf{E}\left( V(t) \cdot \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right) \le c \cdot \mathbf{E}\left( \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right).$$
The trajectories of $M$ are continuous, hence they are uniformly continuous on $[0,t]$, so
$$\lim_{n\to\infty} \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| = 0.$$
On the other hand,
$$\max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \le V(t) \le c,$$
so we can use the Dominated Convergence Theorem:
$$\lim_{n\to\infty} \mathbf{E}\left( \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right) = 0.$$
Hence $M(t) \overset{a.s.}{=} 0$ for every $t$. The trajectories of $M$ are continuous, and therefore¹³ for almost all outcomes $\omega$ one has $M(t,\omega) = 0$ for all $t$.

This means that when the integrators are continuous local martingales we need another approach. First we prove two very simple lemmata:

Lemma 2.12 Let $(M_k, \mathcal{F}_k)$ be a discrete-time martingale and let $(N_k)$ be an $\mathbf{F} \triangleq (\mathcal{F}_k)$-adapted process. If the variables $N_{k-1} \cdot (M_k - M_{k-1})$ are integrable then the sequence
$$Z_0 \triangleq 0, \qquad Z_n \triangleq \sum_{k=1}^n N_{k-1} \cdot (M_k - M_{k-1})$$

13. See: Proposition 1.9, page 7.
is an $\mathbf{F}$-martingale. Specifically, if $N$ is uniformly bounded and $M$ is an arbitrary discrete-time martingale, then $Z$ is a martingale.

Proof. By the assumptions $N_{k-1} \cdot (M_k - M_{k-1})$ is integrable, hence if $k-1 \ge m$ then
$$\mathbf{E}\left( N_{k-1}(M_k - M_{k-1}) \mid \mathcal{F}_m \right) = \mathbf{E}\left( \mathbf{E}\left( N_{k-1}(M_k - M_{k-1}) \mid \mathcal{F}_{k-1} \right) \mid \mathcal{F}_m \right) = \mathbf{E}\left( N_{k-1}\,\mathbf{E}\left( M_k - M_{k-1} \mid \mathcal{F}_{k-1} \right) \mid \mathcal{F}_m \right) = \mathbf{E}\left( N_{k-1} \cdot 0 \mid \mathcal{F}_m \right) = 0,$$
from which the lemma is evident.

Lemma 2.13 Let $(M_k, \mathcal{F}_k)$ be a discrete-time $L^2(\Omega)$-valued martingale. If $|N_k| \le c$ is an $\mathbf{F}$-adapted sequence and
$$Z_0 \triangleq 0, \qquad Z_n \triangleq \sum_{k=1}^n N_{k-1} \cdot (M_k - M_{k-1}),$$
then
$$\|Z_n\|_2^2 \le c^2\left( \|M_n\|_2^2 - \|M_0\|_2^2 \right).$$

Proof. By the previous lemma $(Z_n)$ is a martingale, so by the energy equality
$$\|Z_n\|_2^2 = \sum_{k=1}^n \left\| N_{k-1}(M_k - M_{k-1}) \right\|_2^2.$$
Using the energy equality again,
$$\|Z_n\|_2^2 \le c^2 \sum_{k=1}^n \|M_k - M_{k-1}\|_2^2 = c^2 \sum_{k=1}^n \left( \|M_k\|_2^2 - \|M_{k-1}\|_2^2 \right) = c^2\left( \|M_n\|_2^2 - \|M_0\|_2^2 \right).$$

First we prove the existence of the integral for continuous integrands.

Proposition 2.14 (Existence of Itô–Stieltjes integrals for continuous integrands) If $X \in \mathcal{H}^2$ and $Y$ is adapted and continuous on a finite interval
$[a,b]$, then the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the approximating sums
$$I_n \triangleq \sum_k Y(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right)$$
converge in probability.

Proof. The proof is similar to the proof of the existence of the integral when the integrator has finite variation.

1. The basic, but not entirely correct, trick is that as $Y$ is continuous it is uniformly continuous; hence if $I_n$ and $I_m$ are two approximating sums of the integral, then by the previous lemma, writing $(t_k)$ for the union of the two partitions and $t'_{k-1}$, $t''_{k-1}$ for the points of the respective partitions lying directly below $t_{k-1}$,
$$\|I_n - I_m\|_2 = \left\| \sum_k Y(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right) - \sum_k Y(t_{k-1}^{(m)})\left( X(t_k^{(m)}) - X(t_{k-1}^{(m)}) \right) \right\|_2 = \left\| \sum_k \left( Y(t'_{k-1}) - Y(t''_{k-1}) \right)\left( X(t_k) - X(t_{k-1}) \right) \right\|_2 \le c\sqrt{\|X(b)\|_2^2 - \|X(a)\|_2^2}.$$
Of course, the main problem with this estimation is that one cannot guarantee that for any fixed partition
$$\left| Y(t'_{k-1},\omega) - Y(t''_{k-1},\omega) \right| \le c \tag{2.7}$$
for every $\omega$. What one can show is that if the partitions $(t_k^{(n)})$ and $(t_k^{(m)})$ are sufficiently fine then, outside of an event with small probability, the estimation (2.7) is valid. That is the reason why one can prove only that the approximating sums converge in probability and not in $L^2(\Omega)$.

2. To give the correct proof, fix an $\alpha$ and a $\beta$ and let
$$c \triangleq \sqrt{ \frac{\beta\alpha^2}{2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)} }.$$
For every $\delta > 0$ let us define the modulus of continuity of $Y$:
$$M_\delta(\omega,u) \triangleq \sup\left\{ |Y(t,\omega) - Y(s,\omega)| : |t-s| \le \delta,\ t,s \in [a,u] \right\}.$$
As $Y$ is continuous, one can calculate the supremum over rational $s$ and $t$, so $M_\delta$ is adapted; and as $Y$ is continuous, obviously $M_\delta$ is also continuous.
$Y$ is continuous, so every trajectory of $Y$ is uniformly continuous on $[a,b]$; hence for every $\omega$
$$\lim_{\delta\searrow 0} M_\delta(\omega,b) = 0.$$
This means that if $\delta$ is sufficiently small then
$$\mathbf{P}(M_\delta(b) \ge c) \le \frac{\beta}{2}.$$
Fix this $\delta$ and define the stopping time
$$\tau \triangleq \inf\{u : M_\delta(u) \ge c\} \wedge b.$$
As $\tau$ is a stopping time, $Z \triangleq Y^\tau$ is adapted, and if $|x-y| \le \delta$ then $|Z(x) - Z(y)| \le c$. Let
$$I_n^{(Z)} \triangleq \sum_k Z(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right).$$
If the partitions $(t_k^{(i)})$ and $(t_k^{(j)})$ are finer than $\delta/2$ then by the previous lemma
$$\left\| I_i^{(Z)} - I_j^{(Z)} \right\|_2^2 \le c^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right) = \frac{\beta\alpha^2}{2}.$$
Let $A \triangleq \{M_\delta(b) \ge c\}$. It is easy to see that $Z = Y$ on $A^c$. By Chebyshev's inequality
$$\mathbf{P}(|I_i - I_j| > \alpha) = \mathbf{P}\left( \{|I_i - I_j| > \alpha\} \cap A \right) + \mathbf{P}\left( \{|I_i - I_j| > \alpha\} \cap A^c \right) \le \mathbf{P}(A) + \mathbf{P}\left( \left\{ \left| I_i^{(Z)} - I_j^{(Z)} \right| > \alpha \right\} \cap A^c \right) \le \frac{\beta}{2} + \mathbf{P}\left( \left| I_i^{(Z)} - I_j^{(Z)} \right| > \alpha \right) \le \frac{\beta}{2} + \frac{\left\| I_i^{(Z)} - I_j^{(Z)} \right\|_2^2}{\alpha^2} \le \frac{\beta}{2} + \frac{c^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)}{\alpha^2} = \frac{\beta}{2} + \frac{\beta}{2} = \beta.$$
Hence $(I_n)$ is convergent in probability.

Now we generalize the theorem to regular integrands.
Proposition 2.15 (The existence of the Itô–Stieltjes integral for $\mathcal{H}^2$ integrators) If on a finite interval $[a,b]$ the adapted stochastic process $Y$ is regular and $X \in \mathcal{H}^2$ then the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the Itô-type approximating sums converge in probability.

Proof. The proof is similar to the proof of the existence of the integral when the integrator has finite variation. Let $(I_n)$ be an approximating sequence of the integral $\int_a^b Y\,dX$. Fix an $\varepsilon$ and a $\beta$, and let again
$$c \triangleq \sqrt{ \frac{\beta\varepsilon^2}{48\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)} }, \qquad J \triangleq \Delta Y\,\chi(|\Delta Y| \ge c), \qquad Z \triangleq Y - J.$$

1. As the trajectories of $Y$ are regular, for any $\omega$ the trajectory $Y(\omega)$ has a finite number of jumps which are larger than $c$. $X \in \mathcal{H}^2$ and by definition $X$ is right-continuous, hence the integral $\int_a^b J\,dX$ exists. As it converges for every outcome $\omega$, it converges stochastically as well, so if $i$ and $j$ are big enough then
$$\mathbf{P}\left( \left| I_i^{(J)} - I_j^{(J)} \right| > \frac{\varepsilon}{2} \right) \le \frac{\beta}{3}.$$

2. The jumps of $Z$ are smaller than $c$. As in the continuous case¹⁴, if $\delta > 0$ is small enough then there is a stopping time $\tau$ such that
$$\mathbf{P}(\tau < b) \triangleq \mathbf{P}(A) \le \frac{\beta}{3}$$
and if $|x-y| \le \delta$ then $|Z(x) - Z(y)| \le 2c$ on the random interval $[a,\tau]$. If $V \triangleq Z^\tau$ then $|V(x) - V(y)| \le 2c$ whenever $|x-y| \le \delta$. If the partitions $(t_k^{(i)})$ and $(t_k^{(j)})$ are finer than $\delta/2$ then, again as in the continuous case,
$$\left\| I_i^{(V)} - I_j^{(V)} \right\|_2^2 \le (2c)^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right).$$
By Chebyshev's inequality
$$\mathbf{P}\left( \left| I_i^{(V)} - I_j^{(V)} \right| > \frac{\varepsilon}{2} \right) \le \frac{(2c)^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)}{(\varepsilon/2)^2} = \frac{\beta}{3}.$$

14. See: Proposition 1.7, page 6.
3. If $i$ and $j$ are big enough, then
$$\mathbf{P}(|I_i - I_j| > \varepsilon) \le \mathbf{P}\left( \left| I_i^{(J)} - I_j^{(J)} \right| > \frac{\varepsilon}{2} \right) + \mathbf{P}\left( \left| I_i^{(Z)} - I_j^{(Z)} \right| > \frac{\varepsilon}{2} \right) \le \frac{\beta}{3} + \mathbf{P}(A) + \mathbf{P}\left( A^c \cap \left\{ \left| I_i^{(Z)} - I_j^{(Z)} \right| > \frac{\varepsilon}{2} \right\} \right) \le \frac{2\beta}{3} + \mathbf{P}\left( \left| I_i^{(V)} - I_j^{(V)} \right| > \frac{\varepsilon}{2} \right) \le \beta.$$
This means that $(I_n)$ is a Cauchy sequence in probability and hence it converges in probability.

Corollary 2.16 Let $Y$ be an adapted, regular process on a finite interval $[a,b]$. If $X \in \mathcal{H}^2_{loc}$ then the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the approximating sums converge in probability.

Proof. Assume that $X \in \mathcal{H}^2_{loc}$ and let $(\tau_n)$ be a localizing sequence of $X$. As $\tau_n \nearrow \infty$, for any $\beta > 0$ if $s$ is big enough then $\mathbf{P}(\tau_s \le b) < \beta/2$. Let
$$I_n \triangleq \sum_k Y(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right), \qquad S_n \triangleq \sum_k Y(t_{k-1}^{(n)})\left( X^{\tau_s}(t_k^{(n)}) - X^{\tau_s}(t_{k-1}^{(n)}) \right).$$
For any $\alpha > 0$
$$\mathbf{P}(|I_n - I_m| > \alpha) \le \mathbf{P}(\tau_s \le b) + \mathbf{P}(|I_n - I_m| > \alpha,\ \tau_s \ge b) \le \frac{\beta}{2} + \mathbf{P}(|S_n - S_m| > \alpha).$$
As $X^{\tau_s} \in \mathcal{H}^2$, by the previous proposition $\mathbf{P}(|S_n - S_m| > \alpha) \to 0$. Hence $(I_n)$ is a stochastic Cauchy sequence, so it is convergent in probability.
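Though not part of the argument, the Itô-type sums above can be watched converging in a simulation (a sketch; the Brownian grid, the seed, and the choice $Y = X = w$ on $[0,1]$ are illustrative assumptions, with the known limit $\int_0^1 w\,dw = (w^2(1)-1)/2$):

```python
import numpy as np

# Left-endpoint (Ito-type) approximating sums for the integral of w dw on [0, 1].
rng = np.random.default_rng(0)
n_fine = 2**14
dw = rng.normal(0.0, np.sqrt(1.0 / n_fine), size=n_fine)
w = np.concatenate([[0.0], np.cumsum(dw)])          # Brownian path, w(0) = 0

def ito_sum(step):
    """Left-endpoint approximating sum over the subgrid of the given step."""
    wk = w[::step]
    return float(np.sum(wk[:-1] * np.diff(wk)))

exact = 0.5 * (w[-1]**2 - 1.0)                      # known limit: (w(1)^2 - 1)/2
errors = [abs(ito_sum(s) - exact) for s in (1024, 64, 4, 1)]
print(errors)                                       # the error typically shrinks
```

On the finest grid the sum equals $\tfrac{1}{2}(w^2(1) - \sum_k (\Delta w_k)^2)$ exactly, which is the algebraic identity driving the convergence.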
2.1.3 Itô–Stieltjes integrals when the integrators are semimartingales
As we can integrate with respect to processes with finite variation and with respect to locally square-integrable martingales, the next definition is very natural:

Definition 2.17 An adapted process $X$ is called a semimartingale if $X$ has a decomposition
$$X = X(0) + V + H, \tag{2.8}$$
where $V$ is a right-continuous, adapted process with finite variation, $H \in \mathcal{H}^2_{loc}$, and $V(0) = H(0) = 0$.
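A discrete-time sketch may make the decomposition concrete (a hypothetical illustration, not from the text: a drifted random walk, where the predictable drift plays the role of $V$ and the centred walk the role of $H$):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, n = 0.3, 1000
steps = mu + rng.normal(size=n)              # i.i.d. increments with mean mu
X = np.concatenate([[0.0], np.cumsum(steps)])

# discrete analogue of X = X(0) + V + H:
V = mu * np.arange(n + 1)                    # predictable, increasing (finite variation)
H = X - X[0] - V                             # centred walk: a martingale

assert np.allclose(X, X[0] + V + H)          # the decomposition is exact
assert np.all(np.diff(V) > 0)                # V is increasing
```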
It is important to emphasize that at the moment we do not know too much about the class of semimartingales. As there are martingales which are not locally square-integrable, it is not even evident from the definition that every martingale is a semimartingale. Later we shall prove that every local martingale is a semimartingale in the above sense¹⁵. We shall also prove that every integrable sub- and supermartingale is a semimartingale¹⁶. Therefore the class of semimartingales is a very broad one. Every continuous local martingale is locally square-integrable¹⁷, therefore in the continuous case we can use the following definition:

Definition 2.18 An adapted continuous stochastic process $X$ is called a continuous semimartingale if $X$ has a decomposition (2.8) where $H$ is a continuous local martingale and $V$ is a continuous, adapted process with finite variation.

Proposition 2.19 If $X$ is a continuous semimartingale then the decomposition (2.8) is unique.

Proof. If $X = X(0) + H_1 + V_1$ and $X = X(0) + H_2 + V_2$ then $H_1 - H_2 = V_2 - V_1$ is a continuous local martingale having finite variation. Hence by Fisk's theorem¹⁸ $H_1 - H_2 = V_2 - V_1 = 0$.

Example 2.20 For discontinuous semimartingales the decomposition (2.8) is not necessarily unique.

15. This is the so-called Fundamental Theorem of Local Martingales. See: Theorem 3.57, page 220.
16. This is a direct consequence of the so-called Doob–Meyer decomposition. See: Proposition 5.11, page 303.
17. See: Example 1.137, page 96.
18. See: Theorem 2.11, page 117.
The simplest example is the compensated Poisson process. If $\pi$ is a Poisson process with parameter $\lambda$ then the compensated Poisson process $X(t) \triangleq \pi(t) - \lambda t$ is in $\mathcal{H}^2_{loc}$, and the trajectories of $X$ on any finite interval have finite variation. So $H \triangleq X$, $V \triangleq 0$ and $H \triangleq 0$, $V \triangleq X$ are both proper decompositions of $X$.

Almost surely convergent sequences are convergent in probability, therefore one can easily prove the following theorem:

Theorem 2.21 (Existence of Itô–Stieltjes integrals) If $X$ is a semimartingale and $Y$ is a regular and adapted process then for any finite interval $[a,b]$ the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the approximating sums converge in probability. The value of the integral is independent of the value of the jumps of $Y$; that is, for any regular $Y$,
$$\int_a^b Y\,dX = \int_a^b Y_-\,dX = \int_a^b Y_+\,dX.$$

Proof. We have already proved the first part of the theorem. Let $(I_n)$ be the sequence of approximating sums for $\int_a^b Y\,dX$ and let $(S_n)$ be the sequence of approximating sums when the integrand is $Y_-$. We need to prove that
$$I_n - S_n = \sum_k \left( Y(t_{k-1}^{(n)}) - Y_-(t_{k-1}^{(n)}) \right)\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right) \xrightarrow{\ \mathbf{P}\ } 0. \tag{2.9}$$
Observe that the situation is very similar to that in the proof of Proposition 2.15. We can separate the big jumps and the small jumps and apply the same argument as above¹⁹.

Example 2.22 Wiener integrals.
The simplest case of stochastic integration is the so-called Wiener integral: the integrator is a Wiener process $w$ and the integrand is a deterministic function $f$. If $f$ is regular then $f$, as a stochastic process, is adapted and regular, hence by the above theorem the expression $\int_a^b f(s)\,dw(s)$ is meaningful. The increments of a Wiener process are independent. As the sum of independent normally distributed variables is again normally distributed,
$$\sum_i f(t_{i-1}^{(n)})\left( w(t_i^{(n)}) - w(t_{i-1}^{(n)}) \right) \cong N\left( 0, \sum_i f^2(t_{i-1}^{(n)})\left( t_i^{(n)} - t_{i-1}^{(n)} \right) \right).$$

19. See: Example 2.8, page 114.
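This normality of the approximating sums can be checked by simulation (a sketch; the choice $f(s) = s$ on $[0,1]$, the grid, and the seed are arbitrary assumptions, giving variance $\sum_i f^2(t_{i-1})\Delta t_i \approx \int_0^1 s^2\,ds = 1/3$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 512, 4000
t = np.linspace(0.0, 1.0, n + 1)
f = t[:-1]                                   # f evaluated at the left endpoints
dw = rng.normal(0.0, np.sqrt(1.0 / n), size=(reps, n))
samples = dw @ f                             # sums f(t_{i-1}) (w(t_i) - w(t_{i-1}))

# predicted distribution: approximately N(0, 1/3)
print(samples.mean(), samples.var())
```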
Stochastic convergence implies convergence in distribution, hence
$$\int_a^b f\,dw \cong N\left( 0, \int_a^b f^2(t)\,dt \right),$$
where $N(\mu,\sigma^2)$ denotes the normal distribution with expected value $\mu$ and variance $\sigma^2$.

2.1.4 Properties of the Itô–Stieltjes integral
The next properties of the Itô–Stieltjes integral are obvious:

Proposition 2.23 If $X_1$, $X_2$ and $X$ are semimartingales, $Y_1$, $Y_2$ and $Y$ are adapted regular processes, and $\alpha$ and $\beta$ are constants, then:
1. $\alpha\int_a^b Y_1\,dX + \beta\int_a^b Y_2\,dX \overset{a.s.}{=} \int_a^b (\alpha Y_1 + \beta Y_2)\,dX$.
2. $\int_a^b Y\,d(\alpha X_1 + \beta X_2) \overset{a.s.}{=} \alpha\int_a^b Y\,dX_1 + \beta\int_a^b Y\,dX_2$.
3. If $a < c < b$, then $\int_a^b Y\,dX \overset{a.s.}{=} \int_a^c Y\,dX + \int_c^b Y\,dX$.
4. If $Y_1\chi_A$ is an equivalent modification of $Y_2\chi_A$ for some $A \subseteq \Omega$ then the integrals $\int_a^b Y_1\,dX$ and $\int_a^b Y_2\,dX$ are almost surely equal on $A$.

Since the approximating sums converge only in probability, it is important to note that the Itô–Stieltjes integral is defined only as an equivalence class. In the following we shall not distinguish between functions and equivalence classes, so when it is not important to emphasize this difference we shall use the simpler sign $=$ instead of $\overset{a.s.}{=}$.

2.1.5 The integral process

Let us briefly investigate the integral process
$$(Y \bullet X)(t) \triangleq \int_a^t Y\,dX.$$
We have defined the stochastic integral only for fixed time intervals. On every time interval the definition determines the value of the stochastic integral up to a measure-zero set, hence the properties of the integral process $t \mapsto (Y \bullet X)(t)$ are unclear. It is not a stochastic process, just an indexed set of random variables! When does it have a version which is a martingale? Assume that $X \in \mathcal{H}^2$ and that $Y$ is adapted. Assume also that $Y$ is uniformly bounded, that is, $|Y| \le c$ for some constant $c$. As the filtration $\mathbf{F}$ is right-continuous, the right-regular process $Y_+$ is also adapted. As we have seen²⁰, for every $t \in [a,b]$
$$\|I_n(t)\|_2^2 \triangleq \mathbf{E}\left( \left( \sum_k Y_+(t_{k-1}^{(n)} \wedge t)\left( X(t_k^{(n)} \wedge t) - X(t_{k-1}^{(n)} \wedge t) \right) \right)^2 \right) \le c^2\left( \mathbf{E}\left( X^2(b) \right) - \mathbf{E}\left( X^2(a) \right) \right) \triangleq K,$$
hence the sequence
$$I_n(t) \triangleq \sum_k Y_+(t_{k-1}^{(n)} \wedge t)\left( X(t_k^{(n)} \wedge t) - X(t_{k-1}^{(n)} \wedge t) \right)$$
is bounded in $L^2(\Omega)$, so the sequence of the approximating sums is uniformly integrable; hence not only
$$I_n(t) \xrightarrow{\ \mathbf{P}\ } (Y \bullet X)(t) \quad\text{but also}\quad I_n(t) \xrightarrow{\ L^1\ } (Y \bullet X)(t).$$
It is easy to see²¹ that if $s < t$ then $\mathbf{E}(I_n(t) \mid \mathcal{F}_s) = I_n(s)$. As $I_n(t) \xrightarrow{L^1} \int_a^t Y\,dX$, using the $L^1(\Omega)$-continuity of the conditional expectation operator,
$$\mathbf{E}\left( \int_a^t Y\,dX \,\Big|\, \mathcal{F}_s \right) = \int_a^s Y\,dX.$$
Observe that $I_n(t)$ is right-regular, so $I_n$ is a martingale for every $n$. As $I_m - I_n$ is a martingale, by Doob's inequality, for any $\lambda > 0$,
$$\lambda\,\mathbf{P}\left( \sup_t |I_n(t) - I_m(t)| \ge \lambda \right) \le \|I_n(b) - I_m(b)\|_1.$$
$(I_n(b))$ is convergent in $L^1(\Omega)$, so
$$\sup_t |I_n(t) - I_m(t)| \xrightarrow{\ \mathbf{P}\ } 0,$$
hence for a subsequence
$$\sup_t |I_{n_k}(t) - I_{m_k}(t)| \xrightarrow{\ a.s.\ } 0, \tag{2.10}$$
so except for a measure-zero set the continuity-type properties of the trajectories of $(I_n)$ are preserved, and we get the following proposition:

Proposition 2.24 If $Y$ is an adapted, regular, and uniformly bounded process and $X \in \mathcal{H}^2$, then the integral process
$$(Y \bullet X)(t) \triangleq \int_a^t Y\,dX, \qquad t \ge a,$$
has a version which is a martingale. If $(I_n)$ is the sequence of approximating sums then for every $t$
$$\sup_{a\le s\le t} |I_n(s) - (Y \bullet X)(s)| \xrightarrow{\ \mathbf{P}\ } 0. \tag{2.11}$$

20. See: Lemma 2.13, page 119.
21. See: Lemma 2.12, page 118.
If $X$ is continuous and bounded then $Y \bullet X$ has a continuous version.

Let us emphasize that in the argument above the set of exceptional points $N$ in (2.10) is in $\mathcal{F}_b$. Of course we should define the integral process on $N$ as well, and we should guarantee that the integral process is adapted. We can do this only when we assume that $N \in \mathcal{F}_s$ for all $s \le b$. This assumption is part of the usual conditions. Observe that in the continuous case we do not explicitly use the right-continuity of the filtration. On the other hand, this is a very uninteresting remark since, in most cases²², if we add the measure-zero sets to the filtration then the augmented filtration is right-continuous.

2.1.6 Integration by parts and the existence of the quadratic variation
One of the most important concepts of stochastic analysis is the quadratic variation. The main reason to introduce the Itô–Stieltjes integral is that from the existence theorem of the Itô–Stieltjes integral one can easily deduce the existence of the quadratic variation of semimartingales.

Definition 2.25 Let $U$ and $V$ be stochastic processes on $[a,b]$. If for every infinitesimal sequence of partitions $(t_k^{(n)})$ of $[a,b]$ the sequence
$$Q_n \triangleq \sum_k \left( U(t_k^{(n)}) - U(t_{k-1}^{(n)}) \right)\left( V(t_k^{(n)}) - V(t_{k-1}^{(n)}) \right)$$
is convergent in probability, then the limit $\lim_{n\to\infty} Q_n$ is called the quadratic co-variation of $U$ and $V$. The quadratic co-variation of $U$ and $V$ on $[a,b]$ is denoted by $[U,V]_a^b$. If $V = U$ then $[U,U]_a^b \triangleq [U]_a^b$ is called the quadratic variation of $U$. Of course, in stochastic convergence,
$$[U]_a^b \triangleq \lim_{n\to\infty} \sum_k \left( U(t_k^{(n)}) - U(t_{k-1}^{(n)}) \right)^2.$$

22. E.g. if the filtration is generated by a Lévy process. See: Proposition 1.103, page 67.
Example 2.26 If the trajectories of $X$ are continuous and the trajectories of $V$ have finite variation then $[X,V]_a^b \overset{a.s.}{=} 0$ for any interval $[a,b]$.

By the continuity assumption, the trajectories of $X$ are uniformly continuous on the compact interval $[a,b]$. Hence if $\max_k (t_k^{(n)} - t_{k-1}^{(n)}) \to 0$ then for every $\omega$
$$\lim_{n\to\infty} \max_k \left| X(t_k^{(n)},\omega) - X(t_{k-1}^{(n)},\omega) \right| = 0.$$
Therefore, as $\operatorname{Var}(V,a,b) < \infty$,
$$|Q_n| \le \sum_k \left| X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right| \left| V(t_k^{(n)}) - V(t_{k-1}^{(n)}) \right| \le \max_k \left| X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right| \operatorname{Var}(V,a,b) \to 0.$$
Example 2.27 If $w$ is a Wiener process²³ then $[w]_0^t \overset{a.s.}{=} t$. If $\pi$ is a Poisson process then $[\pi]_0^t \overset{a.s.}{=} \pi(t)$.

If $\pi$ is a Poisson process then for any $\omega$ the number of jumps on any finite interval $[0,t]$ is finite, so for any $\omega$ one can assume that, for a fine enough partition, every subinterval contains at most one jump; hence $Q_n(t,\omega)$ is the number of jumps of the trajectory $\pi(\omega)$ during the time interval $[0,t]$. So evidently $Q_n(t,\omega) = \pi(t,\omega)$.

Proposition 2.28 (Integration By Parts Formula) If $M$ and $N$ are semimartingales then:
1. For any finite interval $[a,b]$ the quadratic co-variation $[M,N]_a^b$ exists.
2. The following integration by parts formula holds:
$$(MN)(b) - (MN)(a) = \int_a^b M_-\,dN + \int_a^b N_-\,dM + [M,N]_a^b. \tag{2.12}$$

23. See: Theorem B.17, page 571.
Proof. By definition semimartingales are right-regular processes, so the processes $M_-$ and $N_-$ are well-defined left-regular processes. For any partition $(t_k^{(n)})$ of $[a,b]$ let us define the approximating sums
$$\sum_k M(t_{k-1}^{(n)})\,\Delta N(t_k^{(n)}) + \sum_k N(t_{k-1}^{(n)})\,\Delta M(t_k^{(n)}) + \sum_k \Delta M(t_k^{(n)})\,\Delta N(t_k^{(n)}),$$
where $\Delta M(t_k^{(n)}) \triangleq M(t_k^{(n)}) - M(t_{k-1}^{(n)})$, and similarly for $N$. With elementary calculation, for all $k$,
$$M(t_k^{(n)})N(t_k^{(n)}) - M(t_{k-1}^{(n)})N(t_{k-1}^{(n)}) = M(t_{k-1}^{(n)})\left( N(t_k^{(n)}) - N(t_{k-1}^{(n)}) \right) + N(t_{k-1}^{(n)})\left( M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right) + \left( M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right)\left( N(t_k^{(n)}) - N(t_{k-1}^{(n)}) \right).$$
Adding up over $k$, on the left side one gets a telescopic sum which adds up to
$$M(b)N(b) - M(a)N(a),$$
which is the expression on the left-hand side of (2.12). The approximating sums on the right-hand side converge to the Itô–Stieltjes integrals
$$\int_a^b M\,dN = \int_a^b M_-\,dN \quad\text{and}\quad \int_a^b N\,dM = \int_a^b N_-\,dM,$$
so $[M,N]_a^b$ exists and the formula (2.12) holds.

Example 2.29 The jumps of independent Poisson processes.
Let $N_1$ and $N_2$ be two Poisson processes with respect to the same filtration²⁴ $\mathbf{F}$. For $s \ge 0$ let
$$U_i(s,t) \triangleq \frac{\exp(-sN_i(t))}{\mathbf{E}(\exp(-sN_i(t)))}, \qquad i = 1,2,$$
be the exponential martingales defined by the Laplace transforms of the Poisson processes. By the Integration By Parts Formula
$$U_1(s_1,t)\,U_2(s_2,t) - 1 = \int_0^t U_1(s_1,r-)\,U_2(s_2,dr) + \int_0^t U_2(s_2,r-)\,U_1(s_1,dr) + [U_1(s_1),U_2(s_2)](t).$$
It is easy to see that $U_1$ and $U_2$ are bounded martingales with respect to $\mathbf{F}$ for any $s \ge 0$ on any finite interval $[0,t]$. As they are also $\mathbf{F}$-adapted, the stochastic integrals are martingales²⁵, and therefore the expected values of the stochastic integrals are zero. So
$$\mathbf{E}\left( U_1(s_1,t)\,U_2(s_2,t) \right) - 1 = \mathbf{E}\left( [U_1(s_1),U_2(s_2)](t) \right).$$
By the definition of $U_1$ and $U_2$
$$\mathbf{E}\left( \exp\left( -\sum_{i=1}^2 s_i N_i(t) \right) \right) = \prod_{i=1}^2 \mathbf{E}\left( \exp(-s_i N_i(t)) \right)$$
if and only if
$$\mathbf{E}\left( [U_1(s_1),U_2(s_2)](t) \right) = 0. \tag{2.13}$$
That is, $N_1(t)$ and $N_2(t)$ are independent if and only if (2.13) holds²⁶. As the Laplace transform is continuous in time,
$$\Delta U_i(s,r) = \frac{\exp(-sN_i(r)) - \exp(-sN_i(r-))}{\mathbf{E}(\exp(-sN_i(r)))} \le 0,$$
and it is easy to see that
$$[U_1(s_1),U_2(s_2)](t) = \sum_{r\le t} \Delta U_1(s_1,r)\,\Delta U_2(s_2,r) \ge 0.$$
Therefore its expected value is zero if and only if it is almost surely zero. Hence $N_1(t)$ and $N_2(t)$ are independent if and only if, with probability one, $N_1$ and $N_2$ do not have common jumps on the interval $[0,t]$.

24. That is, $N_1$ and $N_2$ are counting Lévy processes with respect to the same filtration.
25. See: Proposition 2.24, page 128.
26. One can easily modify the proof of Lemma 1.96 on page 60.
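In discrete time the Integration By Parts Formula is a purely algebraic telescoping identity, which can be verified exactly on arbitrary sequences standing in for the two semimartingales (a sketch; the random paths are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
M = np.cumsum(rng.normal(size=201))          # two arbitrary discrete "paths"
N = np.cumsum(rng.normal(size=201))
dM, dN = np.diff(M), np.diff(N)

lhs = M[-1] * N[-1] - M[0] * N[0]
# discrete analogue of the right-hand side of (2.12)
rhs = np.sum(M[:-1] * dN) + np.sum(N[:-1] * dM) + np.sum(dM * dN)
assert np.isclose(lhs, rhs)                  # exact for every realization
```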
The next property of the quadratic co-variation is obvious:

Proposition 2.30 If $M$, $N$ and $U$ are arbitrary semimartingales and $\xi$ and $\eta$ are $\mathcal{F}_0$-measurable random variables, then for any interval $[a,b]$
$$[\xi M + \eta N, U]_a^b \overset{a.s.}{=} \xi\,[M,U]_a^b + \eta\,[N,U]_a^b.$$
Specifically,
$$[M+N] \overset{a.s.}{=} [M] + 2[M,N] + [N].$$
Example 2.31 If $X = X(0) + L + V$ is a continuous semimartingale then $[X]_a^b \overset{a.s.}{=} [L]_a^b$ for any interval $[a,b]$, where $L$ is the continuous local martingale part of $X$.

As $V$ and $L$ are continuous and the trajectories of $V$ have finite variation, $[V]_a^b \overset{a.s.}{=} 0$ and $[V,L] \overset{a.s.}{=} 0$. By the additivity:
$$[X]_a^b \triangleq [X(0) + L + V]_a^b \overset{a.s.}{=} [L+V]_a^b \overset{a.s.}{=} [L]_a^b + 2[L,V]_a^b + [V]_a^b \overset{a.s.}{=} [L]_a^b.$$
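The example can be checked numerically (a sketch; the smooth finite-variation perturbation $V(t) = \sin 2\pi t$, the grid, and the seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2**14
tt = np.linspace(0.0, 1.0, n + 1)
L = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), size=n))])
V = np.sin(2 * np.pi * tt)                   # smooth, hence finite variation
X = L + V

def qv(path):
    return float(np.sum(np.diff(path)**2))

print(qv(X), qv(L), qv(V))                   # qv(V) is tiny, qv(X) is close to qv(L)
```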
Example 2.32 Assume that $F$ is a deterministic, right-regular function with finite variation. If $w$ is a Wiener process then
$$\int_0^t w(s)\,dF(s) \cong N\left( 0, \int_0^t \left( F(t) - F(s) \right)^2\,ds \right).$$

$w$ is continuous and $F$ has finite variation, therefore $[w,F] = 0$. By the integration by parts formula
$$w(t)F(t) = \int_0^t w\,dF + \int_0^t F_-\,dw,$$
hence
$$\int_0^t w\,dF = w(t)F(t) - \int_0^t F_-\,dw = \int_0^t F(t)\,dw - \int_0^t F_-\,dw = \int_0^t \left( F(t) - F(s-) \right)\,dw(s).$$
The last integral is a Wiener integral, so
$$\int_0^t w\,dF \cong N\left( 0, \int_0^t \left( F(t) - F(s-) \right)^2\,ds \right) = N\left( 0, \int_0^t \left( F(t) - F(s) \right)^2\,ds \right).$$
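A simulation sketch of the example (assumptions, not from the text: $F(s) = s$ and $t = 1$, so $\int_0^1 w\,dF$ is just the time average of $w$ and the predicted law is $N(0, \int_0^1 (1-s)^2\,ds) = N(0, 1/3)$):

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 256, 4000
dw = rng.normal(0.0, np.sqrt(1.0 / n), size=(reps, n))
w_paths = np.cumsum(dw, axis=1)              # w evaluated on the grid k/n

samples = w_paths.mean(axis=1)               # approximates the integral of w ds
print(samples.mean(), samples.var())         # compare with (0, 1/3)
```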
As we have remarked, if $X$ has finite variation and $Y$ is continuous then²⁷ $[X,Y] = 0$. Hence in this case the integration by parts formula is
$$XY - X(0)Y(0) = Y \bullet X + X_- \bullet Y.$$
In fact, for this formula we do not need the continuity of $Y$. Observe that, as $X$ has finite variation, every trajectory of $X$ defines a measure on $\mathbb{R}^+$. Let $Y$ be an arbitrary semimartingale, and let $\Delta Y$ denote the jumps of $Y$. We show that in this case
$$[Y,X] = \Delta Y \bullet X,$$
where the integral is the Lebesgue–Stieltjes integral defined by the trajectories of $X$. If $U \triangleq \Delta Y\,\chi(|\Delta Y| \ge \varepsilon)$ are the jumps of $Y$ which are bigger than $\varepsilon$ then, as the number of such jumps on every finite interval is finite,
$$[Y,X] = [Y-U,X] + [U,X] = [Y-U,X] + \sum \Delta Y\,\chi(|\Delta Y| \ge \varepsilon)\,\Delta X = [Y-U,X] + \Delta Y\,\chi(|\Delta Y| \ge \varepsilon) \bullet X.$$
The jumps of the regular process $Z \triangleq Y - U$ are smaller than $\varepsilon$, hence if the partition of the interval $[a,b]$ is fine enough then²⁸
$$\left| Z(t_k^{(n)},\omega) - Z(t_{k-1}^{(n)},\omega) \right| \le 2\varepsilon$$
for any $\omega$. Therefore if $n \to \infty$
$$\left| \sum_k \left( Z(t_k^{(n)}) - Z(t_{k-1}^{(n)}) \right)\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right) \right| \le 2\varepsilon\,\operatorname{Var}(X,a,b) \to 0.$$
As $X$ has finite variation and the integral is a Lebesgue–Stieltjes integral, one can use the Dominated Convergence Theorem. From this theorem, for every trajectory,
$$\Delta Y\,\chi(|\Delta Y| \ge \varepsilon) \bullet X \to \Delta Y \bullet X = \sum \Delta Y\,\Delta X,$$

27. See: Example 2.26, page 129.
28. See: Proposition 1.7, page 6.
assuming, of course, that for every trajectory, on every finite interval, $|\Delta Y|$ is integrable. But this has to be true, as the trajectories of $Y$ are regular, so on every finite interval every trajectory of $Y$ is bounded²⁹.

Proposition 2.33 If $X$ is right-continuous and has finite variation and $Y$ is an arbitrary semimartingale, then
$$[X,Y] = \sum \Delta Y\,\Delta X = \Delta Y \bullet X, \tag{2.14}$$
therefore³⁰
$$XY - X(0)Y(0) = Y_- \bullet X + X_- \bullet Y + [X,Y] = Y_- \bullet X + X_- \bullet Y + \Delta Y \bullet X = Y \bullet X + X_- \bullet Y,$$
where the integral with respect to $X$ is a Lebesgue–Stieltjes integral and the integral with respect to $Y$ is an Itô–Stieltjes integral.

2.1.7 The Kunita–Watanabe inequality

In the construction of the stochastic integral below we shall use the following simple inequality:

Proposition 2.34 (Kunita–Watanabe inequality) If $X$, $Y$ are product measurable processes, $M$, $N$ are semimartingales, $a \le b \le \infty$ and $V \triangleq \operatorname{Var}([M,N])$, then
$$\int_a^b |XY|\,dV \overset{a.s.}{\le} \sqrt{\int_a^b X^2\,d[M]}\,\sqrt{\int_a^b Y^2\,d[N]}. \tag{2.15}$$
Remark first that the meaning of the proposition is not entirely clear, as it is not clear what the meaning of $[M]$, $[N]$ and $[M,N]$ is. So far we have defined the quadratic variation only for fixed time intervals; the quadratic variation for every time interval is defined as a limit in stochastic convergence, and hence the quadratic variation on any interval is defined only up to a measure-zero set. If $X$ is a semimartingale then for every $t$ one can define $[X](t) \triangleq [X]_0^t$, but this $[X]$ is not a stochastic process, since for a fixed $\omega$ and $t$ the value of $[X](t,\omega)$ is undefined. Of course, if $t$ is restricted to the set of rational numbers then we can collect the corresponding measure-zero sets into just one measure-zero set, but it is unclear how one can extend this process to the irrational values of $t$, as at the moment we have not proved any continuity property of the quadratic variation. Observe that we do not know anything about integral processes; in particular, we do not know when they will be martingales. If the integral process is a semimartingale then, by definition, it has a right-continuous version, so by (2.12) the quadratic variation also has a right-continuous version. One of the goals of the later developments will be to provide a right-continuous version of the quadratic variation process or, which is the same, to prove some martingale-type properties of the stochastic integral. So, to prove the inequality, up to the end of the section we assume that there are processes $[M]$, $[N]$ and $[M,N]$ which are right-continuous and which for any $t$ provide a version of the related quadratic variation. In this case $[M](\omega)$, $[N](\omega)$ and $\operatorname{Var}([M,N],\omega)$ are increasing, right-continuous functions for every $\omega$, hence they define a measure, and for every $\omega$ the integrals in (2.15) are defined as Lebesgue–Stieltjes integrals.

Proof. It is sufficient to prove the proposition for finite $a$ and $b$. One can prove the case $b = \infty$ by the Monotone Convergence Theorem. Also by the Monotone Convergence Theorem one can assume that $X$ and $Y$ are bounded. It suffices to prove the inequality with $\left| \int_a^b XY\,d[M,N] \right|$ on the left-hand side, since to prove (2.15) one can replace $Y$ by
$$\widetilde{Y} \triangleq Y \cdot \operatorname{sgn}(XY)\,\frac{dV}{d[M,N]}.$$

29. See: Proposition 1.6, page 5.
30. Observe that the Lebesgue–Stieltjes integral $Y \bullet X$ exists: the trajectories of $Y$ are regular, hence they are bounded on every finite interval.
1. First assume that $X = 1$ and $Y = 1$. In this case the inequality is
$$\left| [M,N]_a^b \right| \overset{a.s.}{\le} \sqrt{[M]_a^b}\,\sqrt{[N]_a^b}. \tag{2.16}$$
Fix a $u$ and a $v$. The proof of (2.16) is nearly the same as the proof of the classical Cauchy–Schwarz inequality. It is easy to see that for all rational numbers $r$
$$0 \overset{a.s.}{\le} [M + rN]_u^v = [M,M]_u^v + 2r\,[M,N]_u^v + r^2\,[N,N]_u^v \triangleq Ar^2 + Br + C.$$
Hence there is a measure-zero set $Z$ such that on the complement of $Z$ the inequality above is true for all rational, and therefore all real, $r$. Hence, as in the proof of the Cauchy–Schwarz inequality, $B^2 - 4AC \overset{a.s.}{\le} 0$, so (2.16) holds with $a = u$ and $b = v$. Unifying the measure-zero sets one can easily prove (2.16) for
all rational $u$ and $v$. By the assumption above the quadratic variation is right-continuous, so the relation (2.16) holds for every real $a = u$ and $b = v$.

2. Let $(t_k)$ be a partition of $[a,b]$ and assume that $X$ and $Y$ are constant on every subinterval $(t_{k-1}, t_k]$. We are integrating by trajectory, so
$$\left| \int_a^b XY\,d[M,N] \right| \le \sum_k |X(t_k)Y(t_k)| \left| [M,N]_{t_k}^{t_{k+1}} \right| \le \sum_k |X(t_k)Y(t_k)| \sqrt{[M]_{t_k}^{t_{k+1}}}\,\sqrt{[N]_{t_k}^{t_{k+1}}}.$$
Using the Cauchy–Schwarz inequality we can continue:
$$\left| \int_a^b XY\,d[M,N] \right| \le \sqrt{\sum_k X^2(t_k)\,[M]_{t_k}^{t_{k+1}}}\,\sqrt{\sum_k Y^2(t_k)\,[N]_{t_k}^{t_{k+1}}} = \sqrt{\int_a^b X^2\,d[M]}\,\sqrt{\int_a^b Y^2\,d[N]}.$$
3. Using standard measure theory one can easily prove³¹ that if $\mu$ is a finite, regular measure on the real line and $g$ is a bounded Borel measurable function, then there is a sequence of step functions
$$s_n \triangleq \sum_i c_i\,\chi\left( \left( t_i^{(n)}, t_{i+1}^{(n)} \right] \right)$$
such that $s_n \to g$ almost surely in $\mu$. As $\mu$ is finite and $g$ is bounded, $s_n \to g$ in $L^2(\mu)$.

4. We prove that the Kunita–Watanabe inequality holds for every outcome for which (2.16) holds for every real $a$ and $b$. Fix the process $Y$ and an outcome $\omega$, and consider the set of processes $X$ for which the inequality holds for this $\omega$. Let $s_n \to X(\omega)$ be a sequence of step functions. By (2.16) the measure generated by $[M,N](\omega)$ is absolutely continuous with respect to the measure generated by $[M](\omega)$. Hence $s_n \to X(\omega)$ almost surely in $[M,N](\omega)$. Therefore, by the Dominated Convergence Theorem, using that $X$ and $Y$ are bounded, $a$ and $b$ are finite, and that the convergence holds almost everywhere in $[M,N](\omega)$ and in $L^2([M](\omega))$,
$$\left| \int_a^b XY\,d[M,N] \right| \le \sqrt{\int_a^b X^2\,d[M]}\,\sqrt{\int_a^b Y^2\,d[N]}$$

31. Use Lusin's theorem [80], page 56, and the uniform continuity of continuous functions on compact sets.
for the outcome $\omega$. If $X$ is product measurable then by Fubini's theorem every trajectory of $X$ is Borel measurable. Hence if $X$ is product measurable then inequality (2.15) holds for almost all outcomes $\omega$.

5. Now we fix $X$ and repeat the argument for $Y$.

Corollary 2.35 If $q, p \ge 1$ and $1/p + 1/q = 1$, then
$$\mathbf{E}\left( \int_0^\infty |XY|\,d\operatorname{Var}([M,N]) \right) \le \left\| \sqrt{\int_0^\infty X^2\,d[M]} \right\|_p \left\| \sqrt{\int_0^\infty Y^2\,d[N]} \right\|_q.$$
Proof. By Hölder's inequality and by (2.15),
$$\mathbf{E}\left( \int_0^\infty |XY|\,d\operatorname{Var}([M,N]) \right) \le \mathbf{E}\left( \sqrt{\int_0^\infty X^2\,d[M]}\,\sqrt{\int_0^\infty Y^2\,d[N]} \right) \le \left\| \sqrt{\int_0^\infty X^2\,d[M]} \right\|_p \left\| \sqrt{\int_0^\infty Y^2\,d[N]} \right\|_q.$$
Corollary 2.36 If $M$ and $N$ are semimartingales then
$$|[M,N]| \le \sqrt{[M]}\,\sqrt{[N]}, \tag{2.17}$$
$$[M+N]^{1/2} \le [M]^{1/2} + [N]^{1/2}$$
and
$$[M+N] \le 2\left( [M] + [N] \right).$$

Proof. The first inequality is just the Kunita–Watanabe inequality when $X = Y = 1$. Next,
$$[M+N] = [M] + 2[M,N] + [N] \le [M] + 2\sqrt{[M]}\,\sqrt{[N]} + [N] = \left( [M]^{1/2} + [N]^{1/2} \right)^2,$$
from which the second inequality is obvious. In a similar way,
$$[M+N] \le [M] + 2\sqrt{[M]}\,\sqrt{[N]} + [N] \le [M] + \left( [M] + [N] \right) + [N] = 2\left( [M] + [N] \right).$$
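In discrete time the Kunita–Watanabe inequality reduces to the Cauchy–Schwarz inequality for the increments, which can be checked directly (a sketch; all sequences are arbitrary stand-ins for the processes in (2.15)):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
dM = rng.normal(size=n)                      # increments of two arbitrary paths
dN = rng.normal(size=n)
X = rng.uniform(-1.0, 1.0, size=n)           # bounded "integrands"
Y = rng.uniform(-1.0, 1.0, size=n)

# discrete analogue of (2.15): dVar([M,N]) = |dM dN|, d[M] = dM^2, d[N] = dN^2
lhs = np.sum(np.abs(X * Y) * np.abs(dM * dN))
rhs = np.sqrt(np.sum(X**2 * dM**2)) * np.sqrt(np.sum(Y**2 * dN**2))
assert lhs <= rhs + 1e-12                    # Cauchy-Schwarz
```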
2.2 The Quadratic Variation of Continuous Local Martingales
The following proposition is the starting point of our construction of the stochastic integral process.

Proposition 2.37 (Simple Doob–Meyer decomposition) If $M$ is a uniformly bounded, continuous martingale, then:
1. The quadratic variation $[M](t) \triangleq [M]_0^t$ exists.
2. $[M]$ has a version which is increasing and continuous.
3. For this version $M^2 - [M]$ is a martingale.
4. $[M]$ is indistinguishable from any increasing, continuous process $P$ for which $P(0) = 0$ and $M^2 - P$ is a martingale.

If $(t_k^{(n)})$ is an infinitesimal sequence of partitions of $[0,t]$ then
$$\sup_{s\le t} |Q_n(s) - [M](s)| \xrightarrow{\ \mathbf{P}\ } 0 \tag{2.18}$$
for any $t$, where
$$Q_n(s) \triangleq \sum_k \left( M(t_k^{(n)} \wedge s) - M(t_{k-1}^{(n)} \wedge s) \right)^2.$$
Proof. By the Integration By Parts Formula for any t M 2 (t) − M 2 (0) = 2
t
M dM + [M ] (t) = 2 · (M • M ) (t) + [M ] (t) . 0
As M is continuous and uniformly bounded the integral process M • M has a version which is a continuous martingale32 , therefore as M 2 is continuous [M ] M 2 − M 2 (0) − 2 · M • M is continuous, and by Proposition 2.24 M 2 − [M ] = M 2 (0) + 2 · (M • M ) 32 See:
Proposition 2.24, page 128.
is a martingale. $[M](t)$ is a version of the quadratic variation $[M]_0^t$ for any t. For any rational numbers $p \leq q$ we have $[M]_0^p \leq [M]_0^q$ almost surely. Taking the union of the measure-zero sets and using the continuity of $[M]$ we can construct a version which is increasing. If P is another continuous, increasing process for which $P(0) = 0$ and $M^2 - P$ is a martingale, then $N \triangleq P - [M]$ is also a continuous martingale and $N(0) = 0$. As N is the difference of two increasing processes, the trajectories of N have finite variation. By Fisk's theorem (see: Theorem 2.11, page 117) $N = 0$, so P is indistinguishable from $[M]$. The convergence (2.18) is a simple consequence of (2.11).

First we extend the proposition to continuous local martingales. In order to do it we need the following rule:

Proposition 2.38 Under the assumptions of the previous proposition, if τ is an arbitrary stopping time then $[M^\tau] = [M]^\tau$.
Proof. As $(M^\tau)^2 = (M^2)^\tau$,

$$(M^\tau)^2 - [M]^\tau = (M^2)^\tau - [M]^\tau = \left( M^2 - [M] \right)^\tau.$$

Stopped martingales are martingales, hence $(M^2 - [M])^\tau$ is a martingale. $[M]^\tau$ is increasing, so by the uniqueness of the quadratic variation $[M^\tau] = [M]^\tau$.

Proposition 2.39 If M is a continuous local martingale then there is one and only one continuous, increasing process $[M]$ such that:

1. $[M](0) = 0$ and
2. $M^2 - [M]$ is a continuous local martingale.
For any t, if $(t_k^{(n)})$ is an infinitesimal sequence of partitions of $[0,t]$ then

$$\sup_{s \leq t} \left| Q_n(s) - [M](s) \right| \xrightarrow{p} 0 \qquad (2.19)$$

where

$$Q_n(s) \triangleq \sum_k \left( M(t_k^{(n)} \wedge s) - M(t_{k-1}^{(n)} \wedge s) \right)^2.$$

Proof. Let M be a continuous local martingale and let $(\sigma_n)$ be a localizing sequence of M. As M is continuous, the hitting times

$$\upsilon_n \triangleq \inf\{ t : |M(t)| \geq n \}$$
are stopping times. Stopped martingales are martingales, so if instead of $\sigma_n$ we take the localizing sequence $\tau_n \triangleq \sigma_n \wedge \upsilon_n$ then the processes $M_n \triangleq M^{\tau_n}$ are bounded martingales.

1. As $M_n$ is a bounded, continuous martingale, $[M_n]$ is an increasing process and $M_n^2 - [M_n]$ is a continuous martingale. By the previous proposition

$$[M_{n+1}]^{\tau_n} = \left[ M_{n+1}^{\tau_n} \right] = [M_n],$$

hence $[M_n] = [M_{n+1}]$ on the interval $[0, \tau_n]$. As $\tau_n \nearrow \infty$, one can define the process $[M]$ as the 'union' of the processes $[M_n]$, that is

$$[M](t, \omega) \triangleq [M_n](t, \omega), \qquad t \leq \tau_n(\omega).$$

Evidently $[M]$ is continuous, increasing and $[M](0) = 0$. Of course

$$\left( M^2 - [M] \right)^{\tau_n} = (M^{\tau_n})^2 - [M]^{\tau_n} = M_n^2 - [M_n],$$

which is a martingale, hence $M^2 - [M]$ is a local martingale.

2. Assume that $A(0) = 0$ and $M^2 - A$ is a continuous local martingale for some continuous, increasing process A.
3. Finally, let us prove (2.19). Fix ε, δ, t > 0 and (tk )k . Let Qn be (m) the approximating sum for [M ] and let Qn be the approximating sum for [Mm ]. A sup |Qn (s) − [M ] (s)| > ε , s≤t
(m)
A
(m) sup Qn (s) − [Mm ] (s) > ε . s≤t
As τ m ∞, for m large enough P (τ m ≤ t) ≤ δ/2 and P A(m) ≤ δ/2. Obviously P (A) = P (A ∩ (τ m ≤ t)) + P (A ∩ (τ m > t)) ≤ ≤ P ((τ m ≤ t)) + P (A ∩ (τ m > t)) ≤
$$\leq \frac{\delta}{2} + P(A \cap (\tau_m > t)) = \frac{\delta}{2} + P\left( A^{(m)} \cap (\tau_m > t) \right) \leq \frac{\delta}{2} + P\left( A^{(m)} \right) \leq \frac{\delta}{2} + \frac{\delta}{2},$$
hence (2.19) holds.

Proposition 2.40 If M and N are continuous local martingales then $[M,N]$ is the only continuous process with finite variation on finite intervals for which:

1. $[M,N](0) = 0$ and
2. $MN - [M,N]$ is a continuous local martingale.

For any infinitesimal sequence of partitions $(t_k^{(n)})$ of $[0,t]$

$$\sup_{s \leq t} \left| Q_n(s) - [M,N](s) \right| \xrightarrow{p} 0 \qquad (2.20)$$

where

$$Q_n(s) \triangleq \sum_k \left( M(t_k \wedge s) - M(t_{k-1} \wedge s) \right) \left( N(t_k \wedge s) - N(t_{k-1} \wedge s) \right).$$
Proof. From Fisk's theorem the uniqueness of $[M,N]$ is again trivial: if $MN - A$ and $MN - B$ are continuous local martingales for some A and B, then $A - B$ is a continuous local martingale with finite variation, so $A - B$ is constant. As $A(0) = B(0) = 0$, obviously $A = B$. Since

$$MN = \frac{1}{4}\left( (M+N)^2 - (M-N)^2 \right),$$

it is easy to see that Proposition 2.39 can be applied to

$$[M,N] \triangleq \frac{1}{4}\left( [M+N] - [M-N] \right) \qquad (2.21)$$

in order to show that $MN - [M,N]$ is a continuous local martingale and that (2.20) holds.

Definition 2.41 If for some process X there is a process P such that $X - P$ is a local martingale, then we say that P is a compensator of X. If P is continuous then we say that P is a continuous compensator of X. If P is predictable then we say that P is a predictable compensator of X, etc.

So far we have proved that if M is a continuous local martingale then $[M]$ is the only increasing, continuous compensator of $M^2$. It is important to emphasize that this property of $[M]$ holds only for continuous local martingales.
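The convergence (2.18) is easy to observe numerically. The following sketch (an illustration, not part of the text; grid sizes and the random seed are arbitrary choices) simulates a Wiener path on ever finer partitions of $[0,1]$ and computes the approximating sums $Q_n$; since $[w](t) = t$ for a Wiener process, the sums should concentrate around 1:

```python
import math
import random

def brownian_path(t, n, rng):
    """Simulate a Wiener process on n equal steps over [0, t]."""
    dt = t / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def approximating_sum(path):
    """Q_n: the sum of squared increments along the partition."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

rng = random.Random(0)
# For a Wiener process [w](1) = 1, so Q_n should approach 1 as the mesh shrinks.
qs = {n: approximating_sum(brownian_path(1.0, n, rng)) for n in (10, 100, 10_000)}
```

For the Wiener process one can compute that the variance of $Q_n$ on $[0,1]$ is $2/n$, so for $n = 10\,000$ the sum is typically within a couple of percent of 1, in line with (2.18).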
Example 2.42 Quadratic variation of the compensated Poisson processes.

Let π be a Poisson process with parameter λ. The increments of π are independent and the expected value of $\pi(t)$ is λt, hence the compensated process $\nu(t) \triangleq \pi(t) - \lambda t$ is a martingale. We show that $\nu^2(t) - \lambda t$ is also a martingale, that is: λt is a continuous, increasing compensator for $\nu^2$.

$$E\left( \nu^2(t) - \lambda t \mid \mathcal{F}_s \right) = \nu^2(s) + 2\nu(s)E(\nu(t) - \nu(s) \mid \mathcal{F}_s) + E\left( (\nu(t) - \nu(s))^2 \mid \mathcal{F}_s \right) - \lambda t.$$

The increments of π are independent, hence the conditional expectation is a real expectation. Given that the increments are stationary,

$$2\nu(s)E(\nu(t) - \nu(s) \mid \mathcal{F}_s) = 2\nu(s)E(\nu(t-s)) = 0,$$

$$E\left( (\nu(t) - \nu(s))^2 \mid \mathcal{F}_s \right) = E\left( \nu^2(t-s) \right) = \lambda(t-s),$$

hence

$$E\left( \nu^2(t) - \lambda t \mid \mathcal{F}_s \right) = \nu^2(s) + \lambda(t-s) - \lambda t = \nu^2(s) - \lambda s.$$
If we partition the interval $[0,t]$, then if $Q_n^{(\nu)}$ is the sequence of approximating sums for $[\nu]$ and $Q_n^{(\pi)}$ is the one for $[\pi]$, then

$$Q_n^{(\nu)} = Q_n^{(\pi)} - 2\lambda \sum_k \left( \pi(t_k^{(n)}) - \pi(t_{k-1}^{(n)}) \right)\left( t_k^{(n)} - t_{k-1}^{(n)} \right) + \lambda^2 \sum_k \left( t_k^{(n)} - t_{k-1}^{(n)} \right)^2.$$
It is easy to see that if $\max_k \left( t_k^{(n)} - t_{k-1}^{(n)} \right) \to 0$ then the limit of $Q_n^{(\pi)}$ is the process π. The limits of the other expressions are zero. Hence $[\nu] = \pi$.

Proposition 2.43 If M, N and U are continuous local martingales and ξ and η are $\mathcal{F}_0$-measurable random variables then

$$[\xi M + \eta N, U] = \xi[M,U] + \eta[N,U].$$

Proof. $MU - [M,U]$ and $NU - [N,U]$ are local martingales, hence $(M+N)U - ([M,U] + [N,U])$ is also a local martingale, and by the uniqueness property of
the quadratic co-variation

$$[M+N, U] = [M,U] + [N,U].$$

In a similar way: $MU - [M,U]$ is a local martingale and ξ is $\mathcal{F}_0$-measurable, hence $\xi(MU - [M,U])$ is also a local martingale, so again by the uniqueness property of the quadratic co-variation $[\xi M, U] = \xi[M,U]$.

Proposition 2.44 If M and N are continuous local martingales then

$$[M,N] = [M - M(0), N - N(0)] = [M - M(0), N].$$

Proof. Obviously $[M - M(0), N] = [M,N] - [M(0), N]$. As $M(0)$ is $\mathcal{F}_0$-measurable, $M(0)N$ is a continuous local martingale. Hence $[M(0), N] = 0$.

Proposition 2.45 (Stopping rule for quadratic variation) Let τ be an arbitrary stopping time.

1. If M is a continuous local martingale then $[M^\tau] = [M]^\tau$.
2. If M and N are continuous local martingales then $[M^\tau, N^\tau] = [M,N]^\tau = [M^\tau, N]$.

Proof. $[M^\tau]$ is the only continuous, increasing process A for which $A(0) = 0$ and $(M^\tau)^2 - A$ is a continuous local martingale. $M^2 - [M]$ is a continuous local martingale, hence

$$\left( M^2 - [M] \right)^\tau = (M^2)^\tau - [M]^\tau = (M^\tau)^2 - [M]^\tau$$

is a continuous local martingale, hence by the uniqueness $[M]^\tau = [M^\tau]$. From (2.21) and from the first part of the proof

$$[M^\tau, N^\tau] = \frac{1}{4}\left( [(M+N)^\tau] - [(M-N)^\tau] \right) = \frac{1}{4}\left( [M+N]^\tau - [M-N]^\tau \right) = [M,N]^\tau.$$

If U and V are martingales and τ is a stopping time, then for any bounded stopping time σ, by the Optional Sampling Theorem,

$$E\left( (U^\tau \cdot (V - V^\tau))(\sigma) \right) = E\left( U(\tau \wedge \sigma) \cdot E(V(\sigma) - V(\tau \wedge \sigma) \mid \mathcal{F}_{\tau \wedge \sigma}) \right) = E(U(\tau \wedge \sigma) \cdot 0) = 0,$$
hence $U^\tau(V - V^\tau)$ is a martingale. From this it is easy to prove with localization that $M^\tau(N - N^\tau)$ is a local martingale, hence

$$M^\tau N - [M,N]^\tau = M^\tau N - M^\tau N^\tau + M^\tau N^\tau - [M,N]^\tau = M^\tau(N - N^\tau) + \left( (MN)^\tau - [M,N]^\tau \right)$$

is also a local martingale. From the uniqueness of the quadratic co-variation

$$[M^\tau, N] = [M,N]^\tau = [M^\tau, N^\tau].$$
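The identity $[\nu] = \pi$ obtained in Example 2.42 above can also be checked by simulation. The sketch below (an illustration, not part of the text) approximates the Poisson increments on a fine grid by Bernoulli draws, a standard discretization assumption, and compares the approximating sum for $[\nu]$ with the jump count $\pi(t)$:

```python
import random

def compensated_poisson_qv(lam, t, n, seed=1):
    """Approximate [nu](t) for nu(s) = pi(s) - lam*s on an n-cell grid.

    Each cell of length dt carries a jump with probability lam*dt,
    a Bernoulli approximation of the Poisson increment."""
    rng = random.Random(seed)
    dt = t / n
    jumps = 0        # pi(t), the number of jumps
    q = 0.0          # approximating sum for [nu](t)
    for _ in range(n):
        k = 1 if rng.random() < lam * dt else 0
        jumps += k
        inc = k - lam * dt          # increment of the compensated process
        q += inc * inc
    return q, jumps

q, jumps = compensated_poisson_qv(lam=3.0, t=5.0, n=200_000)
```

The cross terms and the $\lambda^2 (\Delta t)^2$ terms of Example 2.42 vanish as the mesh shrinks, so `q` lands very close to the integer `jumps`, exactly as $[\nu] = \pi$ predicts.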
Example 2.46 If M and N are independent and they are continuous local martingales with respect to their own filtrations then $[M,N] = 0$.

Let $\mathcal{F}^M$ and $\mathcal{F}^N$ be the filtrations generated by M and N. Let $\mathcal{F}_s$ be the σ-algebra generated by the sets

$$A \cap B, \qquad A \in \mathcal{F}_s^M, \quad B \in \mathcal{F}_s^N.$$

We shall prove that if M and N are independent martingales then MN is a martingale under the filtration $\mathcal{F}$. As M and N are martingales, $M(t)$ and $N(t)$ are integrable. $M(t)$ and $N(t)$ are independent for any t, hence the product $M(t)N(t)$ is also integrable. If $F \triangleq A \cap B$ with $A \in \mathcal{F}_s^M$ and $B \in \mathcal{F}_s^N$ then

$$E(MN(t)\chi_F) = E(M(t)\chi_A N(t)\chi_B) = E(M(t)\chi_A)E(N(t)\chi_B) = E(M(s)\chi_A)E(N(s)\chi_B) = E(MN(s)\chi_F),$$

which by the uniqueness of the extension of finite measures can be extended to every $F \in \mathcal{F}_s$. Hence MN is an $\mathcal{F}$-martingale, so $[M,N] = 0$. The quadratic co-variation is independent of the filtration (here we directly used the definition of the quadratic variation as the limit of the approximating sums), so $[M,N] = 0$ under the original filtration. If M and N are local martingales with respect to their own filtrations, then the localized processes are independent martingales. Hence if $(\tau_n)$ is a common localizing sequence then $[M,N]^{\tau_n} = [M^{\tau_n}, N^{\tau_n}] = 0$. Hence $[M,N] = 0$.

Proposition 2.47 Let M be a continuous local martingale. M is indistinguishable from a constant if and only if the quadratic variation $[M]$ is zero.
Proof. If M is a constant then $M^2$ is also a constant, hence $M^2$ is a local martingale (see: Definition 1.131, page 94), so $[M] = 0$. On the other hand, if $[M] = 0$ then $M^2 - [M] = M^2$ is a local martingale. The proposition follows from the next proposition.

Proposition 2.48 M and $M^2$ are continuous local martingales if and only if M is a constant.

Proof. If M is constant then M and $M^2$ are local martingales. On the other hand,

$$(M - M(0))^2 = M^2 - 2 \cdot M \cdot M(0) + M^2(0).$$

Since M and $M^2$ are local martingales and $M(0)$ is $\mathcal{F}_0$-measurable, $(M - M(0))^2$ is also a local martingale. Let $(\tau_n)$ be a localizing sequence for $(M - M(0))^2$. By the martingale property

$$E\left( (M^{\tau_n}(t) - M^{\tau_n}(0))^2 \right) = E\left( (M^{\tau_n}(0) - M^{\tau_n}(0))^2 \right) = 0,$$

hence for any t almost surely $M(t \wedge \tau_n) = M(0)$. Therefore for any t almost surely

$$M(t) = \lim_{n \to \infty} M(t \wedge \tau_n) = M(0).$$

The local martingales are right-regular, therefore M is indistinguishable from $M(0)$.

Corollary 2.49 Let $a \leq b < \infty$. A continuous local martingale M is constant on $[a,b]$ if and only if $[M]$ is constant on $[a,b]$.

Proof. If $\tau_n \nearrow \infty$ then a process X is constant on an interval $[a,b]$ if and only if $X^{\tau_n}$ is constant on $[a,b]$ for all n. Using this fact and that $[M^{\tau_n}] = [M]^{\tau_n}$, one can assume that M is a martingale.

1. Define the stochastic process $N(t) \triangleq M(t+a) - M(a)$. N is trivially a martingale for the filtration $\mathcal{G}_t \triangleq \mathcal{F}_{t+a}$, $t \geq 0$.

$$N^2(t) - ([M](t+a) - [M](a)) = M^2(t+a) - ([M](t+a) - [M](a)) - 2M(t+a)M(a) + M^2(a).$$
Obviously $M^2(t+a) - ([M](t+a) - [M](a))$ is a $\mathcal{G}$-martingale. $M(t+a)$ is also a $\mathcal{G}$-martingale, hence $-2M(t+a)M(a) + M^2(a)$ is obviously a $\mathcal{G}$-local martingale, hence by the uniqueness of the quadratic variation

$$[N](t) = [M](t+a) - [M](a).$$

2. M is constant on the interval $[a,b]$ if and only if N is zero on the interval $[0, b-a]$. As we proved, N is constant on $[0, b-a]$ if and only if $[N] = 0$ on $[0, b-a]$. Hence M is constant on $[a,b]$ if and only if $[M]$ is constant on $[a,b]$.

We summarize the statements above in the following proposition:

Proposition 2.50 $[M,N]$ is a symmetric bilinear form and $[M] \geq 0$. $[M] = 0$ if and only if M is constant. This is also true on any half-line $[a, \infty)$ if instead of $[M,N]$ we use the increments $[M,N] - [M,N](a)$.
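Two of the facts above are easy to probe with simulated paths. In the sketch below (an illustration, not part of the text; step counts and the seed are arbitrary), the polarization identity (2.21) holds exactly, term by term, for the discrete approximating sums, and for two independent Wiener paths the co-variation sum is near zero, as in Example 2.46:

```python
import math
import random

def brownian(n, dt, rng):
    """A Wiener path on n equal steps."""
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def qv(p):
    """Approximating sum for the quadratic variation."""
    return sum((b - a) ** 2 for a, b in zip(p, p[1:]))

def cov(p, q):
    """Approximating sum for [M, N] along a common partition."""
    return sum((b - a) * (d - c)
               for a, b, c, d in zip(p, p[1:], q, q[1:]))

rng = random.Random(2)
n, dt = 10_000, 1.0 / 10_000
m = brownian(n, dt, rng)
nn = brownian(n, dt, rng)          # independent of m

# Polarization: per increment, ((dm+dn)^2 - (dm-dn)^2)/4 == dm*dn exactly.
plus = [a + b for a, b in zip(m, nn)]
minus = [a - b for a, b in zip(m, nn)]
polar = (qv(plus) - qv(minus)) / 4.0
```

Here `polar` agrees with `cov(m, nn)` up to floating-point rounding, and both are close to zero because the two paths are independent.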
2.3 Integration when Integrators are Continuous Semimartingales
In this section we introduce a simple construction of the stochastic integral when the integrator X is a continuous semimartingale and the integrand Y is progressively measurable (see: [78]). Every continuous semimartingale has a unique decomposition of type $X = X(0) + L + V$, where V is continuous and has finite variation and L is a continuous local martingale. The integration with respect to V is a simple measure-theoretic exercise: $V(\omega)$ generates a σ-finite measure on $\mathbb{R}_+$ for every ω. Every progressively measurable process is product measurable, hence all trajectories $Y(\omega)$ are measurable. For every ω and for every t one can define the pathwise integral

$$(Y \bullet V)(t, \omega) \triangleq \int_0^t Y(s, \omega) \, V(ds, \omega),$$

where the integrals are simple Lebesgue integrals (see: Proposition 1.20, page 11). The main problem is how to define the stochastic integral with respect to the local martingale part L!
2.3.1 The space of square-integrable continuous local martingales
Recall the definition and some elementary properties of square-integrable martingales:

Definition 2.51 As before, $\mathcal{H}^2$ is the space of $L^2(\Omega)$-bounded martingales on $\mathbb{R}_+$ (that is, if M is a martingale then $M \in \mathcal{H}^2$, i.e. M is square-integrable, if and only if $\sup_t \|M(t)\|_2 < \infty$). Let $\mathcal{G}^2 \triangleq \mathcal{H}_c^2$ denote the space of $L^2(\Omega)$-bounded, continuous martingales. Let

$$\mathcal{H}_0^2 \triangleq \left\{ M \in \mathcal{H}^2 : M(0) = 0 \right\}, \qquad \mathcal{G}_0^2 \triangleq \left\{ M \in \mathcal{G}^2 : M(0) = 0 \right\}.$$
The elements of $\mathcal{H}^2$, $\mathcal{G}^2$, $\mathcal{H}_0^2$ and $\mathcal{G}_0^2$ are equivalence classes: $M_1$ and $M_2$ are in the same equivalence class if they are indistinguishable.

Proposition 2.52 $M \in \mathcal{H}^2$ if and only if $\sup_t M^2(t) \in L^1(\Omega)$. $\left( \mathcal{H}^2, \|\cdot\|_{\mathcal{H}^2} \right)$ is a Hilbert space, where

$$\|M\|_{\mathcal{H}^2} \triangleq \|M(\infty)\|_2 = \lim_{t \to \infty} \|M(t)\|_2.$$

The set of continuous square-integrable martingales $\mathcal{G}^2$ is a closed subspace of $\mathcal{H}^2$.

Proof. The first statement follows from Doob's inequality (see: Corollary 1.54, page 34). The relation $\|M(\infty)\|_2 = \lim_{t \to \infty} \|M(t)\|_2$ is obviously true as $M(t)$ converges to $M(\infty)$ in $L^2(\Omega)$ (see: Corollary 1.59, page 35), and the norm is a continuous function. In order to show that $\mathcal{G}^2$ is closed, let $(M_n)$ be a sequence of continuous square-integrable martingales and assume that $M_n \to M$ in $\mathcal{H}^2$. By Doob's inequality (see: line (1.18), page 34)

$$E\left( \sup_t |M_n(t) - M(t)|^2 \right) \leq 4\|M_n(\infty) - M(\infty)\|_2^2 \triangleq 4\|M_n - M\|_{\mathcal{H}^2}^2 \to 0.$$
From the $L^2$-convergence one has a subsequence for which almost surely

$$\sup_t |M_{n_k}(t) - M(t)| \to 0,$$

hence $M_{n_k}(t, \omega) \to M(t, \omega)$ uniformly in t for almost all ω. Hence $M(t, \omega)$ is continuous in t for almost all ω. So the trajectories of M are almost surely continuous, therefore $\mathcal{G}^2$ is closed.

Our direct goal is to prove that if M is a square-integrable martingale and $M(0) = 0$ then

$$\|M\|_{\mathcal{H}^2}^2 \triangleq \|M(\infty)\|_2^2 = E\left( M^2(\infty) \right) = E([M](\infty)).$$

To do this one should prove that $M^2 - [M]$ is not only a local martingale but a uniformly integrable martingale.

Proposition 2.53 (Characterization of square-integrable martingales) Let M be a continuous local martingale. The following statements are equivalent:

1. M is square-integrable,
2. $M(0) \in L^2(\Omega)$ and $E([M](\infty)) < \infty$.

In both cases $M^2 - [M]$ is a uniformly integrable martingale.

Proof. The proof of the equivalence of the statements is the following:

1. Let $(\tau_n)$ be a localizing sequence of the local martingale $M^2 - [M]$ and let $\sigma_n \triangleq \tau_n \wedge n$. By the martingale property of $(M^{\tau_n})^2 - [M^{\tau_n}]$
$$E\left( M^2(\sigma_n) - [M](\sigma_n) \right) = E\left( M^2(0) \right). \qquad (2.22)$$

As M is square-integrable,

$$M^2(\sigma_n) \leq \sup_t M^2(t) \in L^1(\Omega),$$

so by the Dominated Convergence Theorem

$$\lim_{n \to \infty} E\left( M^2(\sigma_n) \right) = E\left( \lim_{n \to \infty} M^2(\sigma_n) \right) = E\left( M^2(\infty) \right) < \infty.$$

$[M]$ is increasing, therefore by the Monotone Convergence Theorem and by (2.22)

$$E([M](\infty)) = \lim_{n \to \infty} E([M](\sigma_n)) = \lim_{n \to \infty} \left( E\left( M^2(\sigma_n) \right) - E\left( M^2(0) \right) \right) < \infty,$$
that is $[M](\infty) \in L^1(\Omega)$, and 1 implies 2. For every stopping time τ

$$\left| M^2 - [M] \right|(\tau) \leq \sup_t M^2(t) + \sup_t [M](t) = \sup_t M^2(t) + [M](\infty) \in L^1(\Omega),$$

hence the set $\left\{ M^2(\tau) - [M](\tau) \right\}_\tau$ is dominated by an integrable variable and therefore it is uniformly integrable. By this $M^2 - [M]$ is a class D local martingale, hence it is a uniformly integrable martingale (see: Proposition 1.144, page 102).

2. Let τ be an arbitrary stopping time. Let $(\sigma_n)$ be a localizing sequence of M. One can assume that $M^{\sigma_n} - M(0)$ is bounded (in the general case, when M is not necessarily continuous, one can assume that $M_-^{\sigma_n} - M(0)$ is bounded). Let $N \triangleq M^{\tau \wedge \sigma_n} - M(0)$. By the definition of the quadratic variation

$$N^2(t) = 2\int_0^t N_- \, dN + [N](t).$$

As $N_-$ is bounded, the Itô–Stieltjes integral defines a martingale (see: Proposition 2.24, page 128). So

$$E\left( N^2(t) \right) = E([N](t)) = E\left( \left[ M^{\tau \wedge \sigma_n} \right](t) \right) \leq E([M](\infty)).$$

Applying Fatou's lemma,

$$E\left( (M - M(0))^2(\tau) \right) \leq E([M](\infty)). \qquad (2.23)$$

By the second assumption of 2 the expected value on the right-hand side is finite, so the set of variables S of type $(M - M(0))(\tau)$ is bounded in $L^2(\Omega)$. Hence S is a uniformly integrable set, and therefore $M - M(0)$ is a class D local martingale and hence it is a martingale (see: Proposition 1.144, page 102). By (2.23) $M - M(0)$ is trivially bounded in $L^2(\Omega)$, that is $M - M(0) \in \mathcal{G}^2$. As $M(0) \in L^2(\Omega)$ by the first assumption of 2, obviously $M \in \mathcal{G}^2$.

Corollary 2.54 If $M \in \mathcal{G}^2$ and $\sigma \leq \tau$ are stopping times then
$$E\left( M^2(\tau) - M^2(\sigma) \mid \mathcal{F}_\sigma \right) = E\left( [M](\tau) - [M](\sigma) \mid \mathcal{F}_\sigma \right) = E\left( (M(\tau) - M(\sigma))^2 \mid \mathcal{F}_\sigma \right),$$

specifically

$$E\left( M^2(\tau) \right) - E\left( M^2(0) \right) = E([M](\tau)). \qquad (2.24)$$
Proof. By the previous proposition $M^2 - [M]$ is a uniformly integrable martingale, hence if $\sigma \leq \tau$ then by the Optional Sampling Theorem

$$E\left( M^2(\tau) - [M](\tau) \mid \mathcal{F}_\sigma \right) = M^2(\sigma) - [M](\sigma),$$

from which the first equation follows. M is also uniformly integrable, hence again by the Optional Sampling Theorem $M(\sigma) = E(M(\tau) \mid \mathcal{F}_\sigma)$. So

$$E\left( (M(\tau) - M(\sigma))^2 \mid \mathcal{F}_\sigma \right) = E\left( M^2(\tau) + M^2(\sigma) - 2M(\sigma)M(\tau) \mid \mathcal{F}_\sigma \right) = E\left( M^2(\tau) \mid \mathcal{F}_\sigma \right) + M^2(\sigma) - 2M^2(\sigma) = E\left( M^2(\tau) - M^2(\sigma) \mid \mathcal{F}_\sigma \right).$$

Let M be a semimartingale. Let us define

$$\alpha_M(C) \triangleq E\int_0^\infty \chi_C \, d[M],$$

where the integral with respect to $[M]$ is the pathwise Lebesgue–Stieltjes integral generated by the increasing, right-regular process $[M]$ (tacitly we again assume that $[M]$ has a right-regular version). It is not entirely trivial that $\alpha_M$ is well-defined, that is, that the expression under the expected value is measurable. By the Monotone Convergence Theorem

$$E\int_0^\infty \chi_C \, d[M] = E\left( \lim_{n \to \infty} \int_0^n \chi_C \, d[M] \right).$$

As $\int_0^n \chi_C \, d[M]$ is measurable for every n (see: Proposition 1.20, page 11), the parametric integral under the expected value is measurable. Obviously $\alpha_M$ is a measure on $\mathcal{B}(\mathbb{R}_+) \times \mathcal{A}$.

Example 2.55 If $M \in \mathcal{G}^2$ and τ is a stopping time then $\alpha_M([0,\tau]) = E\left( M^2(\tau) \right) - E\left( M^2(0) \right)$. If $M \in \mathcal{G}_0^2$ then
$$\|M\|_{\mathcal{H}^2}^2 \triangleq E\left( M^2(\infty) \right) = E([M](\infty)) = \alpha_M(\mathbb{R}_+ \times \Omega). \qquad (2.25)$$
If τ is an arbitrary random time then

$$\alpha_M([0,\tau]) \triangleq E\int_0^\infty \chi([0,\tau]) \, d[M] = E([M](\tau) - [M](0)) = E([M](\tau)).$$

By (2.24), for every stopping time

$$E([M](\tau)) = E\left( M^2(\tau) \right) - E\left( M^2(0) \right),$$

hence

$$\alpha_M([0,\tau]) = E([M](\tau)) - E([M](0)) = E([M](\tau)) = E\left( M^2(\tau) \right) - E\left( M^2(0) \right).$$

If $M \in \mathcal{G}_0^2$ then $M(0) = 0$, hence by (2.24)
$$E\left( M^2(\infty) \right) = E\left( M^2(\infty) \right) - E\left( M^2(0) \right) = E([M](\infty)).$$

The other relations are consequences of the definitions.

Definition 2.56 $\alpha_M$ is called the Doléans measure generated by the quadratic variation of M (see: Definition 5.4, page 295).

2.3.2 Integration with respect to continuous local martingales
Let us start with the simplest case:

Definition 2.57 Let M be a continuous local martingale. Let $L^2(M)$ denote the space of equivalence classes of square-integrable and progressively measurable functions on the measure space $(\mathbb{R}_+ \times \Omega, \mathcal{R}, \alpha_M)$, that is, let $L^2(M) \triangleq L^2(\mathbb{R}_+ \times \Omega, \mathcal{R}, \alpha_M)$, where $\mathcal{R}$, as before, denotes the σ-algebra of progressively measurable sets. Let $\|\cdot\|_M$ denote the norm of the Hilbert space $L^2(M)$:

$$\|X\|_M \triangleq \sqrt{\int_{\mathbb{R}_+ \times \Omega} X^2 \, d\alpha_M} = \sqrt{E\int_0^\infty X^2 \, d[M]}.$$

Example 2.58 The space $L^2(w)$.
The quadratic variation of a Wiener process on an interval $[0,s]$ is s. Hence $\|X\|_w^2 = E\int_0^t X^2(s)\,ds$ on the interval $[0,t]$. If $t < \infty$ then $w \in L^2(w)$, since by Fubini's theorem

$$\|w\|_w^2 = E\int_0^t w^2(s)\,ds = \int_0^t E\left( w^2(s) \right) ds = \int_0^t s\,ds < \infty.$$
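The value $\|w\|_w^2 = \int_0^t s\,ds = t^2/2$ in Example 2.58 can be approximated by averaging the pathwise integral $\int_0^t w^2(s)\,ds$ over simulated paths. A Monte Carlo sketch (an illustration, not part of the text; with $t = 1$ the target value is $1/2$, and the path count and grid are arbitrary choices):

```python
import math
import random

def integral_of_w_squared(steps, rng):
    """One sample of the pathwise integral of w^2 over [0, 1]
    via a left-endpoint Riemann sum."""
    dt = 1.0 / steps
    w, total = 0.0, 0.0
    for _ in range(steps):
        total += w * w * dt
        w += rng.gauss(0.0, math.sqrt(dt))
    return total

rng = random.Random(3)
paths = 2_000
estimate = sum(integral_of_w_squared(400, rng) for _ in range(paths)) / paths
```

The sample mean `estimate` approximates $\int_0^1 E(w^2(s))\,ds = 1/2$.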
The main result of this section is the following:

Proposition 2.59 (Stochastic integration and quadratic variation) If M is a continuous local martingale and $X \in L^2(M)$ then there is a unique process in $\mathcal{G}_0^2$, denoted by $X \bullet M$, such that for every $N \in \mathcal{G}^2$

$$[X \bullet M, N] = X \bullet [M,N]. \qquad (2.26)$$

If we denote $X \bullet M$ by $\int_0^t X\,dM$ then (2.26) can be written as

$$\left[ \int_0^t X\,dM, N \right] = \int_0^t X\,d[M,N].$$

Proof. We divide the proof into several steps: we prove that $X \bullet M$ exists and that the definition of $X \bullet M$ is correct, that is, the process $X \bullet M$ is unique.

1. The proof of uniqueness is easy. If $I_1$ and $I_2$ are two processes in $\mathcal{G}_0^2$ satisfying (2.26) then $[I_1, N] = [I_2, N]$ for all $N \in \mathcal{G}_0^2$. Hence $[I_1 - I_2, N] = 0$ for all $N \in \mathcal{G}_0^2$. As $I_1 - I_2 \in \mathcal{G}_0^2$,

$$[I_1 - I_2, I_1 - I_2] \triangleq [I_1 - I_2] = 0,$$

hence $I_1 - I_2$ is constant (see: Proposition 2.47, page 144). As $I_1 - I_2 \in \mathcal{G}_0^2$, $I_1 - I_2 = 0$, so $I_1 = I_2$.

2. Now we prove the existence of $X \bullet M$. Assume first that $N \in \mathcal{G}_0^2$. By the Kunita–Watanabe inequality (see: Corollary 2.35, page 137) and by the formula (2.25),
$$E\int_0^\infty |X| \, d[M,N] \leq \left\| \sqrt{\int_0^\infty X^2\,d[M]} \right\|_2 \left\| \sqrt{\int_0^\infty d[N]} \right\|_2 = \|X\|_M \sqrt{E([N](\infty))} = \|X\|_M \|N\|_{\mathcal{H}^2}. \qquad (2.27)$$
Observe that $\|X\|_M \|N\|_{\mathcal{H}^2} < \infty$, hence $\int_0^\infty X\,d[M,N]$ is almost surely finite. So the right-hand side of (2.26) is well-defined. By the bilinearity of the quadratic co-variation,

$$N \mapsto E\int_0^\infty X\,d[M,N]$$

is a continuous linear functional on the Hilbert space $\mathcal{G}_0^2$. As every continuous linear functional on a Hilbert space has a scalar product representation, there is an $X \bullet M \in \mathcal{G}_0^2$ such that for every $N \in \mathcal{G}_0^2$

$$E\int_0^\infty X\,d[M,N] = (X \bullet M, N) \triangleq E\left( (X \bullet M)(\infty) N(\infty) \right). \qquad (2.28)$$

3. The main part of the proof is to show that for $X \bullet M$ the identity (2.26) holds. Define the process

$$S \triangleq (X \bullet M)N - X \bullet [M,N].$$

To prove (2.26) we show that S is a continuous martingale, hence by the uniqueness of the quadratic co-variation $[X \bullet M, N] = X \bullet [M,N]$!

First observe that S is adapted: $(X \bullet M)N$ is a product of two martingales, that is, the product of two adapted processes. X is progressively measurable by the definition of $L^2(M)$, so the integral $\int_0^t X\,d[M,N]$ is also adapted (see: Proposition 1.20, page 11).

S is continuous: by the construction $(X \bullet M)N$ is a product of two continuous functions, so it is continuous, and since M and N are continuous the quadratic variation $[M,N]$ is also continuous. Therefore the integral $\int_0^t X\,d[M,N]$, as a function of t, is continuous.

Finally, to show that S is a martingale one should prove that (see: Proposition 1.91, page 57)

$$E(S(\tau)) = E(S(0)) = 0 \qquad (2.29)$$

for every bounded stopping time τ. By definition $X \bullet M$ is a uniformly integrable martingale. Therefore by the Optional Sampling Theorem

$$(X \bullet M)(\tau) = E\left( (X \bullet M)(\infty) \mid \mathcal{F}_\tau \right).$$
Using that $N^\tau \in \mathcal{G}_0^2$ and (2.28),

$$E(S(\tau)) \triangleq E\left( (X \bullet M)(\tau)N(\tau) - \int_0^\tau X\,d[M,N] \right) =$$
$$= E\left( (X \bullet M)(\tau)N(\tau) \right) - E\int_0^\tau X\,d[M,N] =$$
$$= E\left( E\left( (X \bullet M)(\infty) \mid \mathcal{F}_\tau \right) N(\tau) \right) - E\int_0^\tau X\,d[M,N] =$$
$$= E\left( E\left( (X \bullet M)(\infty)N(\tau) \mid \mathcal{F}_\tau \right) \right) - E\int_0^\tau X\,d[M,N] =$$
$$= E\left( (X \bullet M)(\infty)N^\tau(\infty) \right) - E\int_0^\infty X\,d[M, N^\tau] = 0.$$

Therefore (2.29) holds.

4. Finally, if $N \in \mathcal{G}^2$ then $N - N(0) \in \mathcal{G}_0^2$, hence

$$[X \bullet M, N] = [X \bullet M, N - N(0)] = X \bullet [M, N - N(0)] = X \bullet [M,N].$$
Proposition 2.60 (Stopping rule for stochastic integrals) If M is an arbitrary continuous local martingale, $X \in L^2(M)$ and τ is an arbitrary stopping time then

$$X \bullet M^\tau = (\chi([0,\tau])X) \bullet M = (X \bullet M)^\tau = X^\tau \bullet M^\tau. \qquad (2.30)$$

Proof. By (2.26) and by the stopping rule for the quadratic variation, if $N \in \mathcal{G}^2$

$$[(X \bullet M)^\tau, N] = [X \bullet M, N]^\tau = (X \bullet [M,N])^\tau = X \bullet [M,N]^\tau = X \bullet [M^\tau, N] = [X \bullet M^\tau, N].$$

By the bilinearity of the quadratic variation,

$$[(X \bullet M)^\tau - X \bullet M^\tau, N] = 0, \qquad N \in \mathcal{G}^2,$$

from which $[(X \bullet M)^\tau - X \bullet M^\tau] = 0$, that is

$$(X \bullet M)^\tau = X \bullet M^\tau.$$
If $X \in L^2(M)$ then trivially $\chi([0,\tau])X \in L^2(M)$. For every $N \in \mathcal{G}^2$

$$[X \bullet M^\tau, N] = X \bullet [M^\tau, N] = X \bullet [M,N]^\tau = (\chi([0,\tau])X) \bullet [M,N] = [(\chi([0,\tau])X) \bullet M, N],$$

hence again

$$X \bullet M^\tau = (\chi([0,\tau])X) \bullet M.$$

Using stopping rule (2.30) we can extend the stochastic integral to the space $L^2_{loc}(M)$.

Definition 2.61 Let M be a continuous local martingale. The space $L^2_{loc}(M)$ is the set of progressively measurable processes X for which there is a localizing sequence of stopping times $(\tau_n)$ such that

$$E\int_0^\infty X^2\,d[M^{\tau_n}] = E\int_0^{\tau_n} X^2\,d[M] = E\int_0^\infty \chi([0,\tau_n])X^2\,d[M] = \int_{(0,\infty)\times\Omega} \chi([0,\tau_n])X^2\,d\alpha_M < \infty.$$
Example 2.62 If M is a continuous local martingale and X is locally bounded then $X \in L^2_{loc}(M)$.

One can assume that $X(0) = 0$, as obviously every $\mathcal{F}_0$-measurable constant process is in $L^2_{loc}$. As M is continuous, $M \in \mathcal{H}^2_{loc}$. Let $(\tau_n)$ be a common localizing sequence of X and M. $M^{\tau_n} \in \mathcal{H}^2$ (see: Proposition 2.53, page 148), so $[M^{\tau_n}](\infty) \in L^1(\Omega)$. Therefore

$$E\int_0^\infty X^2\,d[M^{\tau_n}] \leq \sup_{t \leq \tau_n} X^2(t) \cdot E\left( [M^{\tau_n}](\infty) \right) < \infty.$$

Proposition 2.63 If M is a continuous local martingale then for every $X \in L^2_{loc}(M)$ there is a process, denoted by $X \bullet M$, such that

1. $(X \bullet M)(0) = 0$ and $X \bullet M$ is a continuous local martingale,
2. for every continuous local martingale N

$$[X \bullet M, N] = X \bullet [M,N]. \qquad (2.31)$$
$X \bullet M$ is unambiguously defined by (2.31), that is, $X \bullet M$ is the only continuous local martingale for which (2.31) holds for every continuous local martingale N.

Proof. M is a continuous local martingale, so it is locally bounded, hence $M \in \mathcal{H}^2_{loc}$. Assume that $X \in L^2_{loc}(M)$ and let $(\tau_n)$ be such a localizing sequence of X for which $E\int_0^\infty X^2\,d[M^{\tau_n}] < \infty$, that is, let $X \in L^2(M^{\tau_n})$. Consider the integrals $I_n \triangleq X \bullet M^{\tau_n}$.

$$I_{n+1}^{\tau_n} \triangleq \left( X \bullet M^{\tau_{n+1}} \right)^{\tau_n} = X \bullet \left( M^{\tau_{n+1}} \right)^{\tau_n} = X \bullet M^{\tau_n} = I_n,$$

hence $I_{n+1}$ and $I_n$ are equal on $[0, \tau_n]$. One can define the integral process $X \bullet M$ unambiguously if for all n the value of $X \bullet M$ is by definition $I_n$ on the interval $[0, \tau_n]$. By the stopping rule for stochastic integrals it is obvious from the construction that $X \bullet M$ is independent of the localizing sequence $(\tau_n)$. Obviously $(X \bullet M)(0) = 0$ and $X \bullet M$ is continuous. Trivially

$$(X \bullet M)^{\tau_n} \triangleq \left( X \bullet M^{\tau_n} \right)^{\tau_n} = X \bullet M^{\tau_n}$$

and $X \bullet M^{\tau_n} \in \mathcal{G}_0^2$, hence $(X \bullet M)^{\tau_n}$ is a uniformly integrable martingale, so $X \bullet M$ is a local martingale.

We should prove (2.31). Let $(\tau_n)$ be such a localizing sequence that $X \in L^2(M^{\tau_n})$ and $N^{\tau_n} \in \mathcal{G}^2$. As $X \in L^2(M^{\tau_n})$ and $N^{\tau_n} \in \mathcal{G}^2$, by the stopping rule for the quadratic variation (see: Proposition 2.45, page 143)

$$[X \bullet M, N]^{\tau_n} = \left[ (X \bullet M)^{\tau_n}, N^{\tau_n} \right] \triangleq [X \bullet M^{\tau_n}, N^{\tau_n}] = X \bullet [M^{\tau_n}, N^{\tau_n}] = X \bullet [M,N]^{\tau_n} = (X \bullet [M,N])^{\tau_n},$$
2 2 E (X • M ) (∞) X • M H2 = XM E
2
0
54 See:
Proposition 2.45, page 143.
∞
X d [M ] . 2
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
157
Proof. Using the definition of the norm in H2 and (2.25), by (2.31)
2 2 X • M H2 E (X • M ) (∞) = E ([X • M ] (∞)) E ([X • M, X • M ] (∞)) = ∞ =E Xd [X • M, M ] = E 0
∞
Xd (X • [M ]) .
0
In the right-hand side of the identity [X • M, M ] = X • [M, M ]. The integral is taken pathwise, hence ∞ 2 X • M H2 = E Xd (X • [M ]) = 0
∞
=E 0
2 X 2 d [M ] XM ,
and hence the mapping X → X • M is an isometry. 1
Example 2.65 The standard deviation of
0
√ wdw is 1/ 2.
The integral is meaningful and as on finite intervals w ∈ L2 (w) the integral 1 process w • w is a martingale. Hence the expected value of the integral 0 wdw is zero. By Itˆo’s isometry and by Fubini’s theorem 2 1 1 1
2 wdw w (s) ds = E w2 (s) ds = =E E 0
0
0
1
=
sds = 0
1 . 2
√ Hence the standard deviation is 1/ 2. We can calculate the standard deviation in the following way as well:
2
t
wdw
−
0
wdw 0
is a martingale, hence 2 1 E wdw =E 0
1
wdw
E
0
1
0
w2 d [w] = 0
1
wdw,
0
=E using (2.26) directly.
t
1
1
wdw 0
1 E w2 (s) ds = , 2
=
158
STOCHASTIC INTEGRATION
Proposition 2.66 If M is a continuous local martingale and X ∈ L2loc (M ) then [X • M ] = X 2 • [M ] .
(2.32)
Proof. By simple calculation using (2.31), and that on the right-hand side of (2.31), we have a pathwise integral [X • M ] [X • M, X • M ] = X • [M, X • M ] = = X • (X • [M, M ]) = X 2 • [M ] . Corollary 2.67 If M is a continuous local martingale and X is a progressively measurable process then X ∈ L2loc (M ) if and only if for all t almost surely
t
X 2 d [M ] X 2 • [M ] (t) < ∞.
(2.33)
0
Proof. The quadratic variation [X • M ], like every quadratic variation, is almost surely finite, hence if X ∈ L2loc (M ) then by (2.32), (2.33) holds. On the other hand, assume that (2.33) holds. For all n let us define the stopping times t τ n inf t : X 2 d [M ] ≥ n . 0
As [M ] is continuous, X 2 • [M ] is also continuous, hence
τn
X 2 d [M ] ≤ n,
0
that is X ∈ L2 (M τ n ) , hence X ∈ L2loc (M ) , so the space L2loc (M ) contains all the R-measurable processes, for which (2.33) holds for all t. Corollary 2.68 Assume that M is a local martingale and X ∈ L2loc (M ). If on an interval [a, b] 1. X (t, ω) = 0 for all ω or 2. M (t, ω) = M (a, ω) , then X • M is constant on [a, b]. Proof. The integral X 2 •[M, M ] is a pathwise integral, hence under the assumptions X 2 • [M, M ] is constant on [a, b]. As [X • M ] = X 2 • [M ] , the local martingale X • M is constant on55 [a, b]. 55 See:
Proposition 2.47, page 144.
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
159
Proposition 2.69 (Stopping rule for stochastic integrals) If M is a continuous local martingale, X ∈ L2loc (M ) and τ is an arbitrary stopping time then τ
(X • M ) = χ ([0, τ ]) X • M = X τ • M τ = X • M τ .
(2.34)
Proof. Let τ be an arbitrary stopping time. If X ∈ L2loc (M ) , then as |χ ([0, τ ]) X| ≤ |X| trivially χ ([0, τ ]) X ∈ L2loc (M ). Using the analogous properties of the L2 (M ) integrals τ τn
((X • M ) )
τ
τ
τ
= ((X • M ) n ) (X • M τ n ) = = χ ([0, τ ]) X • M τ n τn
(χ ([0, τ ]) X • M )
.
The proof of the other parts of (2.34) are analogous. Proposition 2.70 (Linearity) X • M is bilinear, that is if α1 and α2 are constants then X • (α1 M1 + α2 M2 ) = α1 (X • M1 ) + α2 (X • M2 ) and (α1 X1 + α2 X2 ) • M = α1 (X1 • M ) + α2 (X2 • M ) when all the expressions are meaningful. In these relations if two integrals are meaningful then the third one is meaningful. Proof. If X ∈ L2loc (M1 ) ∩ L2loc (M2 ) then for all t
$$\int_0^t X^2\,d[M_1] < \infty \quad \text{and} \quad \int_0^t X^2\,d[M_2] < \infty.$$

Obviously, by the Kunita–Watanabe inequality (see: Corollary 2.36, page 137),

$$[M_1 + M_2] \leq 2([M_1] + [M_2]),$$

hence

$$\int_0^t X^2\,d[M_1 + M_2] \leq 2\left( \int_0^t X^2\,d[M_1] + \int_0^t X^2\,d[M_2] \right) < \infty,$$
therefore $X \in L^2_{loc}(M_1 + M_2)$. From the linearity of the pathwise integration and from the bilinearity of the quadratic variation,

$$[X \bullet (\alpha_1 M_1 + \alpha_2 M_2), N] = X \bullet [\alpha_1 M_1 + \alpha_2 M_2, N] = X \bullet (\alpha_1 [M_1, N] + \alpha_2 [M_2, N]) = \alpha_1 X \bullet [M_1, N] + \alpha_2 X \bullet [M_2, N] = [\alpha_1 X \bullet M_1 + \alpha_2 X \bullet M_2, N],$$

from which the linearity of the integral in the integrator is evident. The linearity in the integrand is also evident, as

$$[(\alpha_1 X_1 + \alpha_2 X_2) \bullet M, N] = (\alpha_1 X_1 + \alpha_2 X_2) \bullet [M,N] = \alpha_1 X_1 \bullet [M,N] + \alpha_2 X_2 \bullet [M,N] = [\alpha_1 X_1 \bullet M, N] + [\alpha_2 X_2 \bullet M, N] = [\alpha_1 X_1 \bullet M + \alpha_2 X_2 \bullet M, N].$$

The remark about the integrability is evident from the trivial linearity of the space $L^2_{loc}(M)$.

Proposition 2.71 (Associativity) If $X \in L^2(M)$ then $Y \in L^2(X \bullet M)$ if and only if $XY \in L^2(M)$. If $X \in L^2_{loc}(M)$ then $Y \in L^2_{loc}(X \bullet M)$ if and only if $XY \in L^2_{loc}(M)$. In both cases

$$(YX) \bullet M = Y \bullet (X \bullet M). \qquad (2.35)$$
Proof. Using the construction of the stochastic integral and given that the associativity formula (2.35) is valid for pathwise integration,
\[ [X \bullet M] = [X \bullet M, X \bullet M] = X \bullet [M, X \bullet M] = X \bullet (X \bullet [M, M]) = X^2 \bullet [M, M]. \]
By the associativity of the pathwise integration for non-negative integrands
\[ E\left(\int_0^{\infty} Y^2\,d[X \bullet M]\right) = E\left(\int_0^{\infty} Y^2\,d\left(\int_0^s X^2\,d[M]\right)\right) = E\left(\int_0^{\infty} Y^2 X^2\,d[M]\right), \]
hence $YX \in L^2(M)$ if and only if $Y \in L^2(X \bullet M)$. If $X \in L^2(M)$, then by the Kunita–Watanabe inequality, for almost all $\omega$ the trajectory $X(\omega)$ is integrable
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
with respect to $[M, N](\omega)$. If $XY \in L^2(M)$ then, using (2.26) again,
\[ [(YX) \bullet M, N] = (YX) \bullet [M, N] = \int_0^t YX\,d[M, N] = \int_0^t Y\,d\left(\int_0^s X\,d[M, N]\right) = Y \bullet (X \bullet [M, N]). \tag{2.36} \]
Using (2.26) and that $Y \in L^2(X \bullet M)$,
\[ Y \bullet (X \bullet [M, N]) = Y \bullet [X \bullet M, N] = [Y \bullet (X \bullet M), N]. \]
Comparing this with line (2.36),
\[ [(YX) \bullet M, N] = [Y \bullet (X \bullet M), N]. \]
Hence by the uniqueness of the stochastic integral $(YX) \bullet M = Y \bullet (X \bullet M)$. To prove the general case, observe that $XY \in L^2_{\mathrm{loc}}(M)$ if and only if for some localizing sequence $(\tau_n)$
\[ E\left(\chi([0,\tau_n]) X^2 Y^2 \bullet [M]\right) < \infty. \]
As
\[ \chi([0,\tau_n]) Y^2 \bullet (X^2 \bullet [M]) = \chi([0,\tau_n]) Y^2 X^2 \bullet [M], \]
$XY \in L^2_{\mathrm{loc}}(M)$ if and only if $Y \in L^2_{\mathrm{loc}}(X \bullet M)$. Let $(\tau_n)$ be a common localizing sequence for $M$ and $X \bullet M$. If $Y \in L^2_{\mathrm{loc}}(X \bullet M)$ then evidently
\[ Y \in L^2\left((X \bullet M)^{\tau_n}\right) = L^2\left(X \bullet M^{\tau_n}\right). \]
So
\[ (Y \bullet (X \bullet M))^{\tau_n} = Y \bullet (X \bullet M)^{\tau_n} = Y \bullet (X \bullet M^{\tau_n}) = (YX \bullet M^{\tau_n}) = (YX \bullet M)^{\tau_n}, \]
from which the associativity is evident. ∎
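The associativity can also be seen at the level of the approximating sums, where $(YX) \bullet M = Y \bullet (X \bullet M)$ holds term by term. A short numerical sketch (not from the text; a simulated Brownian integrator with hypothetical left-point integrands):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
dt = 1.0 / n

dM = rng.normal(0.0, np.sqrt(dt), n)       # increments of a Brownian motion M
M = np.concatenate([[0.0], np.cumsum(dM)])

X = np.cos(M[:-1])                         # adapted left-point integrands
Y = M[:-1] ** 2

d_XM = X * dM                              # increments of the integral X . M
lhs = np.cumsum(Y * d_XM)                  # Y . (X . M)
rhs = np.cumsum((Y * X) * dM)              # (YX) . M
gap = float(np.max(np.abs(lhs - rhs)))
```

The two discrete sums coincide up to floating-point rounding, which is the discrete shadow of (2.35).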
2.3.3 Integration with respect to semimartingales
We can again extend the definition of stochastic integration, this time to semimartingales:

Definition 2.72 Let $X = X(0) + L + V$ be a continuous semimartingale. If for some process $Y$ the integrals $Y \bullet L$ and $Y \bullet V$ are meaningful, then the stochastic integral $Y \bullet X$ of $Y$ with respect to $X$ is by definition the sum
\[ Y \bullet X \triangleq Y \bullet L + Y \bullet V. \]
Remember that by Fisk's theorem the decomposition $X = X(0) + L + V$ is unique, hence the integral is well-defined.

Proposition 2.73 The most important properties of the stochastic integral $Y \bullet X$ are the following:
1. $Y \bullet X$ is bilinear, that is,
\[ Y \bullet (\alpha_1 X_1 + \alpha_2 X_2) = \alpha_1 (Y \bullet X_1) + \alpha_2 (Y \bullet X_2) \]
and
\[ (\alpha_1 Y_1 + \alpha_2 Y_2) \bullet X = \alpha_1 (Y_1 \bullet X) + \alpha_2 (Y_2 \bullet X), \]
assuming that all the expressions are meaningful. If two integrals are meaningful then the third is meaningful.
2. For all locally bounded processes $Y, Z$: $Z \bullet (Y \bullet X) = (ZY) \bullet X$.
3. For every stopping time $\tau$
\[ (Y \bullet X)^{\tau} = (Y\chi([0,\tau])) \bullet X = Y \bullet X^{\tau}. \]
4. If the integrator $X$ is a local martingale, or if $X$ has bounded variation on finite intervals, then the same is true for the integral process $Y \bullet X$.
5. $Y \bullet X$ is constant on any interval where either $Y = 0$ or $X$ is constant.
6. $[Y \bullet X, Z] = Y \bullet [X, Z]$ for any continuous semimartingale $Z$.

2.3.4 The Dominated Convergence Theorem for stochastic integrals

A crucial property of every integral is that under some conditions one can swap the order of taking the limit and the integration:

Proposition 2.74 (Dominated Convergence Theorem for stochastic integrals) Let $X$ be a continuous semimartingale, and let $(Y_n)$ be a sequence of
progressively measurable processes. Assume that $(Y_n(t,\omega))$ converges to $Y_\infty(t,\omega)$ in every point $(t,\omega)$. If there is an integrable process $Y$ such that⁵⁷ $|Y_n| \le Y$ for all $n$, then $Y_n \bullet X \to Y_\infty \bullet X$, where the convergence is uniform in probability on every compact interval, that is,
\[ \sup_{s \le t} |(Y_n \bullet X)(s) - (Y_\infty \bullet X)(s)| \xrightarrow{\,p\,} 0 \]
for all $t \ge 0$.
Proof. One can prove the proposition separately when $X$ has finite variation and when $X$ is a local martingale. It is sufficient to prove the proposition when $Y_\infty \equiv 0$.
1. First, assume that $X$ has finite variation. In this case the integrability of $Y$ means that for every $t$
\[ \int_0^t |Y|\,d\mathrm{Var}(X) < \infty. \]
As $|Y_n| \le Y$, for every $\omega$ the trajectory $Y_n(\omega)$ is also integrable on every interval $[0,t]$. Applying the classical Dominated Convergence Theorem for every trajectory individually, for all $s \le t$
\[ \left|\int_0^s Y_n\,dX\right| \le \int_0^t |Y_n|\,d\mathrm{Var}(X) \to 0. \]
Hence the integral, as a function of the upper bound, converges uniformly to zero. Pointwise convergence on a finite measure space implies convergence in measure, so when the integrator has finite variation the proposition holds.
2. Let $X$ be a local martingale. $Y$ is integrable with respect to $X$, hence by definition $Y \in L^2_{\mathrm{loc}}(X)$. Let $\varepsilon, \delta > 0$ be arbitrary, and let $(\tau_n)$ be a localizing sequence of $Y$. To make the notation simpler, let us denote by $\sigma$ a $\tau_n$ for which $P(\tau_n < t) \le \delta/2$. By the stopping rule $(Y_n \bullet X)^{\sigma} = Y_n \bullet X^{\sigma}$, that is, if $s \le \sigma(\omega)$ then
\[ (Y_n \bullet X)(s,\omega) = (Y_n \bullet X^{\sigma})(s,\omega). \]
If
\[ A \triangleq \left\{ \sup_{s \le t} |Y_n \bullet X|(s) > \varepsilon \right\}, \qquad A_{\sigma} \triangleq \left\{ \sup_{s \le t} |Y_n \bullet X^{\sigma}|(s) > \varepsilon \right\}, \]

⁵⁷ The integrability of $Y$ depends on the integrator $X$. If $X$ is a local martingale, then by definition this means that $Y \in L^2_{\mathrm{loc}}(X)$.
then
\[ P(A) = P((\sigma < t) \cap A) + P((\sigma \ge t) \cap A) \le P(\sigma < t) + P((t \le \sigma) \cap A) \le \frac{\delta}{2} + P(A_{\sigma}). \]
Since $Y \in L^2(X^{\sigma})$, obviously $Y_n \in L^2(X^{\sigma})$. Hence by the classical Dominated Convergence Theorem, as $Y_n \to 0$ and $|Y_n| \le Y$,
\[ \|Y_n\|^2_{X^{\sigma}} \triangleq E\left(\int_0^{\infty} Y_n^2\,d[X^{\sigma}]\right) = E\left(\int_0^{\infty} \chi([0,\sigma]) Y_n^2\,d[X]\right) = E\left(\int_0^{\sigma} Y_n^2\,d[X]\right) \to 0, \]
that is, $Y_n \to 0$ in $L^2(X^{\sigma})$. By Itô's isometry the correspondence $Z \mapsto Z \bullet X^{\sigma}$ is an $L^2(X^{\sigma}) \to \mathcal{H}^2$ isometry⁵⁸. Hence $Y_n \bullet X^{\sigma} \xrightarrow{\mathcal{H}^2} 0$. By Doob's inequality⁵⁹
\[ E\left(\sup_{s \le \infty} |Y_n \bullet X^{\sigma}|^2(s)\right) \le 4E\left((Y_n \bullet X^{\sigma})^2(\infty)\right) = 4\|Y_n \bullet X^{\sigma}\|^2_{\mathcal{H}^2} \to 0. \]
By Markov's inequality, stochastic convergence follows from the $L^2(\Omega)$-convergence, hence
\[ P(A_{\sigma}) = P\left(\sup_{s \le t} |Y_n \bullet X^{\sigma}|(s) > \varepsilon\right) \to 0. \]
Hence for $n$ large enough
\[ P(A) = P\left(\sup_{s \le t} |Y_n \bullet X|(s) > \varepsilon\right) \le \delta. \qquad\blacksquare \]
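For the finite-variation half of the argument the statement reduces to the classical dominated convergence theorem applied path by path. A small deterministic sketch (not from the text; the hypothetical choices are $X(t)=t$ and $Y_n(t)=\sin(nt)/n$, dominated by $Y \equiv 1$) shows the uniform convergence of the integral processes to zero:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 10_001)
dX = np.diff(t)                             # X(t) = t has finite variation

sups = []
for n in (1, 2, 4, 8, 16):
    Yn = np.sin(n * t[:-1]) / n             # |Yn| <= 1 and Yn -> 0 pointwise
    integral = np.concatenate([[0.0], np.cumsum(Yn * dX)])
    sups.append(float(np.max(np.abs(integral))))   # sup over [0,1] of |(Yn . X)(s)|
```

The suprema shrink with $n$, exactly the uniform-in-the-upper-bound convergence used in step 1 of the proof.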
2.3.5 Stochastic integration and the Itô–Stieltjes integral

As we mentioned, every integral is in some sense the limit of certain approximating sums. From the construction above it is not clear in which sense the integral $X \bullet M$ is a limit of the approximating sums.

⁵⁸ See: Itô's isometry, Proposition 2.64, page 156.
⁵⁹ See: line (1.17), page 34, and Proposition 2.52, page 147.
Lemma 2.75 If $X$ is a continuous semimartingale and
\[ Y \triangleq \sum_i \eta_i \cdot \chi((\tau_i, \tau_{i+1}]) \]
is an integrable, non-negative predictable simple process⁶⁰ then
\[ (Y \bullet X)(t) = \sum_i \eta_i \cdot (X(\tau_{i+1} \wedge t) - X(\tau_i \wedge t)). \]
Proof. If $\sigma \le \tau$ are stopping times, then using the linearity and the stopping rule
\[ \chi((\sigma,\tau]) \bullet X = (\chi([0,\tau]) - \chi([0,\sigma])) \bullet X = (1 \bullet X)^{\tau} - (1 \bullet X)^{\sigma} = X^{\tau} - X^{\sigma}. \]
Hence the formula holds with $\eta \equiv 1$. It is easy to check that if $F \in \mathcal{F}_{\sigma} \subseteq \mathcal{F}_{\tau}$ then
\[ \sigma_F(\omega) \triangleq \begin{cases} \sigma(\omega) & \text{if } \omega \in F \\ \infty & \text{if } \omega \notin F \end{cases}, \qquad \tau_F(\omega) \triangleq \begin{cases} \tau(\omega) & \text{if } \omega \in F \\ \infty & \text{if } \omega \notin F \end{cases} \]
are also stopping times, hence
\[ (\chi_F \chi((\sigma,\tau])) \bullet X = \chi((\sigma_F, \tau_F]) \bullet X = X^{\tau_F} - X^{\sigma_F} = \chi_F (X^{\tau} - X^{\sigma}), \]
hence the formula is valid if $\eta = \chi_F$, $F \in \mathcal{F}_{\sigma}$. If $\eta$ is an $\mathcal{F}_{\sigma}$-measurable step function, then since the integral is linear one can write $\eta$ in the place of $\chi_F$. It is easy to show that for any $\mathcal{F}_{\sigma}$-measurable function $\eta$ the process $\eta\chi((\sigma,\tau])$ is integrable with respect to $X$, hence using the Dominated Convergence Theorem one can prove the formula when $\eta$ is an arbitrary $\mathcal{F}_{\sigma}$-measurable function. As $Y \ge 0$,
\[ 0 \le Y_n \triangleq \sum_{i=1}^{n} \eta_i \chi((\tau_i, \tau_{i+1}]) \le Y. \]
The general case follows from the Dominated Convergence Theorem and from the linearity of the integral. ∎

Corollary 2.76 If $X$ is a continuous semimartingale, $\tau_n \nearrow \infty$ and $Y \triangleq \sum_i \eta_i \cdot \chi((\tau_i, \tau_{i+1}])$ is a predictable simple process then
\[ \int_0^t Y\,dX \triangleq (Y \bullet X)(t) = \sum_i \eta_i \cdot (X(\tau_{i+1} \wedge t) - X(\tau_i \wedge t)). \]

⁶⁰ See: Definition 1.41, page 24.
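Before turning to the proof, the telescoping formula of the corollary can be compared with a direct Riemann–Stieltjes approximation of $\int_0^t Y\,dX$. The sketch below uses hypothetical data, not from the text: the smooth finite-variation integrator $X(s)=s^2$ and four deterministic intervals.

```python
import numpy as np

def X(s):                                    # a smooth finite-variation integrator
    return s ** 2

taus = np.array([0.0, 0.2, 0.5, 0.9, 1.5])   # stopping times tau_0 < ... < tau_4
etas = np.array([1.0, -2.0, 0.5, 3.0])       # eta_i held on (tau_i, tau_{i+1}]

def simple_integral(t):
    # (Y . X)(t) = sum_i eta_i (X(tau_{i+1} ^ t) - X(tau_i ^ t))
    return float(np.sum(etas * (X(np.minimum(taus[1:], t)) - X(np.minimum(taus[:-1], t)))))

def Y(s):
    # the left-continuous step function sum_i eta_i chi((tau_i, tau_{i+1}])(s)
    idx = np.clip(np.searchsorted(taus, s, side="left") - 1, 0, len(etas) - 1)
    return etas[idx] * (s > taus[0]) * (s <= taus[-1])

def riemann(t, n=200_000):
    # crude Riemann-Stieltjes sum of int_0^t Y dX on a fine grid
    s = np.linspace(0.0, t, n + 1)
    return float(np.sum(Y(s[1:]) * np.diff(X(s))))

gaps = [abs(simple_integral(t) - riemann(t)) for t in (0.1, 0.35, 0.7, 1.0, 2.0)]
```

The telescoping evaluation and the fine-grid sum agree to within the discretization error.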
Proof. As $\tau_n \nearrow \infty$, $Y$ is left-continuous and has right-hand limits. So $Y$ is locally bounded on $[0,\infty)$ and therefore $Y^{\pm}$ are integrable. ∎

Proposition 2.77 If $X$ is a continuous semimartingale and $Y$ is a left-continuous, adapted and locally bounded process, then $(Y \bullet X)(t)$ is the Itô–Stieltjes integral for every $t$. The convergence of the approximating sums is uniform in probability on every compact interval. The partitions of the intervals can be random as well.

Proof. More precisely, let $\tau_k^{(n)} \le \tau_{k+1}^{(n)} \nearrow \infty$ be a sequence of stopping times. For each $t$ let
\[ \sum_k Y(\tau_k^{(n)}) \left( X(\tau_{k+1}^{(n)} \wedge t) - X(\tau_k^{(n)} \wedge t) \right) \]
be the sequence of Itô-type approximating processes. Assume that for each $\omega$
\[ \lim_{n\to\infty} \max_k \left( \tau_{k+1}^{(n)}(\omega) - \tau_k^{(n)}(\omega) \right) = 0. \]
Define the locally bounded simple predictable processes
\[ Y^{(n)} \triangleq \sum_k Y(\tau_k^{(n)}) \chi\left(\left(\tau_k^{(n)}, \tau_{k+1}^{(n)}\right]\right). \]
As we saw,
\[ \left(Y^{(n)} \bullet X\right)(t) = \sum_k Y(\tau_k^{(n)}) \left( X(\tau_{k+1}^{(n)} \wedge t) - X(\tau_k^{(n)} \wedge t) \right). \]
$Y$ is continuous from the left, hence in every point $Y^{(n)} \to Y$. Let $K(t) \triangleq \sup_{s \le t} |Y(s)|$.
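The left-point (Itô-type) approximating sums can be tested on the classical example $\int_0^1 B\,dB = (B(1)^2 - 1)/2$. The following numerical sketch is not part of the text; it assumes a simulated Brownian path on a deterministic grid, a special case of the random partitions above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
dt = 1.0 / n

dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate([[0.0], np.cumsum(dB)])

# Ito-type approximating sum with left-point evaluation Y(tau_k) = B(t_k)
approx = float(np.sum(B[:-1] * dB))
exact = 0.5 * (B[-1] ** 2 - 1.0)   # closed form of the Ito integral of B dB over [0, 1]
err = abs(approx - exact)
```

Note that evaluating the integrand at the left endpoint is essential: a right-point or midpoint rule converges to a different (Stratonovich-type) limit.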
If $m > n$ then by Davis' inequality
\[ E\left(\sup_t \left|\sum_{i=n}^{m} L_i(t)\right|\right) \le C \cdot E\left(\sqrt{\left[\sum_{i=n}^{m} L_i\right](\infty)}\right) = C \cdot E\left(\sqrt{\sum_{i=n}^{m} [L_i](\infty)}\right). \]
As $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+$, by the Dominated Convergence Theorem
\[ \lim_{n,m\to\infty} E\left(\sqrt{\sum_{i=n}^{m} [L_i](\infty)}\right) = 0, \]
which implies that
\[ \lim_{n,m\to\infty} E\left(\sup_t \left|\sum_{i=n}^{m} L_i(t)\right|\right) = 0. \]
LOCAL MARTINGALES AND COMPENSATED JUMPS
As $L^1(\Omega)$ is complete, $\sup_t |\sum_{i=1}^{n} L_i(t)|$ is convergent in $L^1(\Omega)$. From the convergence in $L^1(\Omega)$ one has a subsequence which is almost surely convergent, therefore there is a process $L$ such that for almost all $\omega$
\[ \lim_{k\to\infty} \sup_t \left| \sum_{i=1}^{n_k} L_i(t,\omega) - L(t,\omega) \right| = 0. \]
$L$ is obviously right-regular, and of course $\sum_{i=1}^{n} L_i$ converges to $L$ uniformly in $L^1(\Omega)$, that is,
\[ \lim_{n\to\infty} E\left(\sup_t \left| \sum_{i=1}^{n} L_i(t) - L(t) \right|\right) = 0. \]
Again by Davis' inequality
\[ E\left(\sup_t |L_i(t)|\right) \le C \cdot E\left([L_i]^{1/2}(\infty)\right) < \infty, \]
hence $L_i$ is a class $D$ local martingale, hence it is a martingale. From the convergence in $L^1(\Omega)$ it follows that $L \triangleq \sum_{i=1}^{\infty} L_i$ is also a martingale. As
\[ E\left(\sup_t |L(t)|\right) \le E\left(\sup_t \left|\sum_{i=1}^{n} L_i(t)\right|\right) + E\left(\sup_t \left| L - \sum_{i=1}^{n} L_i \right|(t)\right) < \infty, \]
the limit $L$ is in class $D$, that is, $L$ is a uniformly integrable martingale.

Now let us assume that $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+_{\mathrm{loc}}$. In this case there is a localizing sequence $(\tau_k)$ for which
\[ \sqrt{\sum_{n=1}^{\infty} [L_n^{\tau_k}]} = \sqrt{\sum_{n=1}^{\infty} [L_n]^{\tau_k}} = \left(\sqrt{\sum_{n=1}^{\infty} [L_n]}\right)^{\tau_k} \in \mathcal{A}^+. \]
Observe that $(\tau_k)$ is a common localizing sequence for all $L_n$, that is, $\sqrt{[L_n^{\tau_k}]} \in \mathcal{A}$ for all $n$. Observe also that by Davis' inequality $L_n^{\tau_k} \in \mathcal{M}$ for every $n$ and $k$. By the first part of the proof for every $k$ there is an $L^{(k)} \in \mathcal{M}$ such that $\sum_{n=1}^{\infty} L_n^{\tau_k} = L^{(k)}$. Obviously $(L^{(k+1)})^{\tau_k} = L^{(k)}$, so one can define an $L \in \mathcal{L}$ for which $L^{\tau_k} = L^{(k)}$. Let us fix an $\varepsilon$ and a $\delta$. As $\tau_k \nearrow \infty$, for every $t < \infty$ there is
GENERAL THEORY OF STOCHASTIC INTEGRATION
an $n$ such that $P(\tau_k \le t) \le \delta/2$ whenever $k \ge n$. In the usual way, for $k \ge n$
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le P(\tau_k \le t) + P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\ \tau_k > t\right). \]
The first probability is smaller than $\delta/2$; the second probability is
\[ P\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon,\ \tau_k > t\right), \]
which is smaller than
\[ P\left(\sup_s\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right). \]
As $L_n^{\tau_k} \to L^{\tau_k}$ uniformly in $L^1(\Omega)$, by Markov's inequality
\[ P\left(\sup_s\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right) \to 0, \]
from which one can easily show that for $n$ large enough
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) < \delta, \]
that is, $\sum_{k=1}^{n} L_k \xrightarrow{\mathrm{ucp}} L$, which means that on every compact interval, in the topology of uniform convergence in probability,
\[ \lim_{n\to\infty} \sum_{k=1}^{n} L_k = \sum_{k=1}^{\infty} L_k = L. \qquad\blacksquare \]
Theorem 4.27 (Parseval's identity) Under the conditions of the theorem above, for every $t$
\[ \lim_{n\to\infty} \sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)} = 0 \tag{4.3} \]
and
\[ [L](t) \overset{\mathrm{a.s.}}{=} \sum_{k=1}^{\infty} [L_k](t), \tag{4.4} \]
where in both cases the convergence holds in probability.

Proof. By Davis' inequality
\[ E\left(\sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)}\right) \le \frac{1}{c} \cdot E\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right|\right). \]
If $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+$ then by the theorem just proved
\[ \lim_{m\to\infty} E\left(\sup_{s \le t}\left|L(s) - \sum_{n=1}^{m} L_n(s)\right|\right) = 0. \]
By Markov's inequality convergence in $L^1(\Omega)$ implies convergence in probability, therefore if $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+$ then (4.3) holds. Let $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+_{\mathrm{loc}}$ and let $(\tau_k)$ be a localizing sequence of $\sqrt{\sum_{n=1}^{\infty} [L_n]}$. Let us fix an $\varepsilon$ and a $\delta$. As $\tau_k \nearrow \infty$, for every $t < \infty$ there is a $q$ such that $P(\tau_k \le t) \le \delta/2$ whenever $k \ge q$. In the usual way, for $k \ge q$
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le P(\tau_k \le t) + P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\ \tau_k > t\right). \]
Obviously
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\ \tau_k > t\right) = P\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon,\ \tau_k > t\right) \le P\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right). \]
By the stopping rule of the quadratic variation
\[ \sqrt{\sum_{n=1}^{\infty} [L_n^{\tau_k}]} = \sqrt{\sum_{n=1}^{\infty} [L_n]^{\tau_k}} = \left(\sqrt{\sum_{n=1}^{\infty} [L_n]}\right)^{\tau_k} \in \mathcal{A}^+, \]
so by the first part of the proof, if $n$ is large enough,
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le \frac{\delta}{2} + \frac{\delta}{2}, \]
that is, (4.3) holds in the general case. By the Kunita–Watanabe inequality²⁶
\[ \left|\sqrt{[L](t)} - \sqrt{\left[\sum_{k=1}^{n} L_k\right](t)}\right| \le \sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)}. \]
This implies that
\[ [L](t) = \lim_{n\to\infty}\left[\sum_{k=1}^{n} L_k\right](t) = \lim_{n\to\infty}\sum_{k=1}^{n} [L_k](t) = \sum_{k=1}^{\infty} [L_k](t), \]
where the convergences hold in probability. ∎

4.2.1 Construction of purely discontinuous local martingales
The cornerstone of the construction of the general stochastic integral is the next proposition:

Proposition 4.28 Let $H$ be a progressively measurable process. There is one and only one purely discontinuous local martingale $L \in \mathcal{L}$ for which $\Delta L = H$ if and only if
1. the set $\{H \ne 0\}$ is thin,
2. ${}^pH = 0$ and
3. $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$.

Proof. By the definition of the thin sets, for every $\omega$ there exists just a countable number of points where the trajectory $H(\omega)$ is not zero. Hence the sum $\sum H^2(t) \triangleq \sum_{s \le t} H^2(s)$ is meaningful. Observe that from the condition $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$ it implicitly follows that $H(0) = 0$.

²⁶ See: Corollary 2.36, page 137.
1. The uniqueness of $L$ is obvious, as if purely discontinuous local martingales have the same jumps then they are indistinguishable²⁷.
2. If $H \triangleq \Delta L$ for some $L \in \mathcal{L}$ then ${}^pH = {}^p(\Delta L) = 0$, and as $(\Delta L)^2 = \Delta[L]$ and $[L]$ is increasing,
\[ \sum H^2 = \sum (\Delta L)^2 \le \sum (\Delta L)^2 + [L]^c = [L]. \]
Since²⁸ $\sqrt{[L]} \in \mathcal{A}^+_{\mathrm{loc}}$, obviously $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$, so the conditions are necessary.
3. Let us assume that $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$, and let the sequence of stopping times $(\rho_m)$ be exhausting²⁹ for the thin set $\{H \ne 0\}$. We can assume that each $\rho_m$ is either totally inaccessible or predictable. For every stopping time $\rho_m$ let us define a simple jump process which jumps at $\rho_m$ and for which the value of the jump is $H(\rho_m)$:
\[ N_m \triangleq H(\rho_m)\chi([\rho_m,\infty)). \]
It is worth emphasizing that it is possible that $\cup_m [\rho_m] \ne \{H \ne 0\}$. That is, the inclusion $\{H \ne 0\} \subseteq \cup_m [\rho_m]$ can be proper, but
\[ \cup_m \{\Delta N_m \ne 0\} = \{H \ne 0\}. \]
$N_m$ is right-regular, $H$ is progressively measurable, hence the stopped variables $H(\rho_m)$ are $\mathcal{F}_{\rho_m}$-measurable and so $N_m$ is adapted. As $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$,
\[ |N_m| \le \sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}} \]
for every $m$, hence $N_m$ has locally integrable variation, so it has a compensator $N_m^p$.
4. We show that $N_m^p$ is continuous. If $\rho_m$ is predictable then the graph $[\rho_m]$ of $\rho_m$ is a predictable set³⁰, so using property 6. of the predictable

²⁷ See: Corollary 4.7, page 228.
²⁸ See: line (3.20), page 222.
²⁹ See: Proposition 3.22, page 189.
³⁰ See: Corollary 3.34, page 199.
compensator³¹, up to indistinguishability,
\[ \Delta(N_m^p) = {}^p(\Delta N_m) = {}^p(H(\rho_m)\chi([\rho_m])) = {}^p(H\chi([\rho_m])) = ({}^pH)\chi([\rho_m]) = 0 \cdot \chi([\rho_m]) = 0. \]
Hence $N_m^p$ is continuous. Let $\rho_m$ be totally inaccessible. As above,
\[ \Delta(N_m^p) = {}^p(\Delta N_m) = {}^p(H\chi([\rho_m])). \]
$\rho_m$ is totally inaccessible and therefore $P(\rho_m = \sigma) = 0$ for every predictable stopping time $\sigma$, hence if $\sigma$ is predictable then
\[ {}^p(H\chi([\rho_m]))(\sigma) = \mathbf{E}\left(H\chi([\rho_m])(\sigma) \mid \mathcal{F}_{\sigma-}\right) = \mathbf{E}(0 \mid \mathcal{F}_{\sigma-}) = 0. \]
By the definition of the predictable projection $\Delta(N_m^p) = 0$.
5. Let $L_m \triangleq N_m - N_m^p \in \mathcal{L}$ be the compensated jumps. As the compensators are continuous and have finite variation, if $i \ne j$ then
\[ [L_i, L_j] = [N_i, N_j] = 0, \]
and
\[ \sqrt{\sum_k [L_k]} = \sqrt{\sum_k [N_k]} = \sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}. \]
Hence³² there is an $L \in \mathcal{L}$ for which $L = \sum_k L_k$. As the convergence is uniform in probability, there is a subsequence for which the convergence is almost surely uniform. Hence up to indistinguishability
\[ \Delta L = \Delta \sum_k L_k = \sum_k \Delta L_k = H. \]
Observe that in the last step we have used the fact that
\[ \{H \ne 0\} = \cup_m \{\Delta N_m \ne 0\} = \cup_m \{\Delta L_m \ne 0\}. \]
6. Let us prove that $L$ is purely discontinuous. Let $M$ be a continuous local martingale. Obviously $[L_k, M] = 0$. Therefore by the inequality of Kunita and

³¹ See: page 217.
³² See: Theorem 4.26, page 236.
Watanabe³³ and by (4.3),
\[ |[M, L]| \le \left|\left[M, L - \sum_{k=1}^{n} L_k\right]\right| + \left|\left[M, \sum_{k=1}^{n} L_k\right]\right| = \left|\left[M, L - \sum_{k=1}^{n} L_k\right]\right| \le \sqrt{[M]}\sqrt{\left[L - \sum_{k=1}^{n} L_k\right]} \to 0, \]
which implies that $[M, L] = 0$, that is, $M$ and $L$ are orthogonal. Hence $L$ is purely discontinuous. ∎

Definition 4.29 The following definitions are useful:
1. We say that a process $X$ is a single jump if there is a stopping time $\rho$ and an $\mathcal{F}_{\rho}$-measurable random variable $\xi$ such that $X = \xi\chi([\rho,\infty))$.
2. We say that a process $X$ is a compensated single jump if there is a single jump $Y$ for which $X = Y - Y^p$.
3. We say that $X$ is a continuously compensated single jump if $Y^p$ in 2. is continuous.

Proposition 4.30 (The structure of purely discontinuous local martingales) If $L \in \mathcal{L}$ is a purely discontinuous local martingale then, in the topology of uniform convergence in probability on compact intervals,
\[ L = \sum_{k=1}^{\infty} L_k, \]
where for all $k$:
1. $L_k \in \mathcal{L}$ is a continuously compensated single jump,
2. the jumps of $L_k$ are jumps of $L$,
3. if $i \ne j$ then $[L_i, L_j] = 0$, that is, $L_i$ and $L_j$ are strongly orthogonal,
4. $[L_k] = (\Delta L(\rho_k))^2 \chi([\rho_k,\infty))$, where $\rho_k$ denotes the stopping time of $L_k$,
5. if $i \ne j$ then the graphs $[\rho_i]$ and $[\rho_j]$ are disjoint.
If $\sqrt{[L]} \in \mathcal{A}^+$ then the convergence holds in the topology of uniform convergence in $L^1(\Omega)$.

Proof. It is sufficient to remark that if $L \in \mathcal{L}$ is purely discontinuous then the jump process of $L$ satisfies the conditions of the above proposition³⁴. ∎

³³ See: Corollary 2.36, page 137.
³⁴ See: Proposition 4.28, page 240.
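The building blocks above can be illustrated numerically: a single jump at a totally inaccessible (exponential) time, compensated continuously, is centred. The following Monte Carlo sketch is not from the text; the rate, horizon and sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, paths = 1.0, 1.0, 200_000

rho = rng.exponential(1.0 / lam, paths)     # totally inaccessible jump times
N = (rho <= t).astype(float)                # single jump N = chi([rho, oo)) evaluated at t
Np = lam * np.minimum(rho, t)               # its continuous compensator at time t
L = N - Np                                  # continuously compensated single jump L(t)

mean_L = float(np.mean(L))
```

The sample mean of $L(t)$ is statistically indistinguishable from zero, consistent with $L = N - N^p$ being a martingale.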
4.2.2 Quadratic variation of purely discontinuous local martingales

In this subsection we return to the investigation of the quadratic variation.

Definition 4.31 We say that $M$ is a pure quadratic jump process if
\[ [M] = \sum (\Delta M)^2. \tag{4.5} \]

Example 4.32 Every $V \in \mathcal{V}$ is a pure quadratic jump process³⁵.

By (2.14)
\[ [V, V] = \sum \Delta V \Delta V = \sum (\Delta V)^2. \]

Theorem 4.33 (Quadratic variation of purely discontinuous local martingales) A local martingale $L \in \mathcal{L}$ is a pure quadratic jump process if and only if it is purely discontinuous.

Proof. Let $L \in \mathcal{L}$.
1. If $L$ is purely discontinuous, then by the structure of purely discontinuous local martingales³⁶ $L = \sum_k L_k$, where
\[ [L_k, L_j] = \begin{cases} 0 & \text{if } k \ne j \\ (\Delta L(\rho_k))^2 \chi([\rho_k,\infty)) & \text{if } k = j \end{cases}. \]
By Parseval's identity (4.4), for every $t$
\[ [L](t) \overset{\mathrm{a.s.}}{=} \sum_{k=1}^{\infty} [L_k](t) = \sum_{s \le t} (\Delta L)^2(s). \]
As both sides of the equation are right-regular, $[L]$ and $\sum_{s \le t} (\Delta L)^2$ are indistinguishable.
2. If $L$ is a pure quadratic jump process, then
\[ [L] = \sum (\Delta L)^2. \]

³⁵ See: Proposition 2.33, page 134.
³⁶ See: Proposition 4.30, page 243.
Let $L = L^c + L^d$ be the decomposition of $L \in \mathcal{L}$. As $L^c$ is continuous³⁷,
\[ [L] = [L^c + L^d] = [L^c] + 2[L^c, L^d] + [L^d] = [L^c] + [L^d]. \]
By the part of the theorem already proved,
\[ [L^d] = \sum (\Delta L^d)^2 = \sum (\Delta L^d + \Delta L^c)^2 = \sum (\Delta L)^2. \]
Hence $[L^c] = 0$, therefore $L^c = 0$ and so $L = L^d$. ∎

Corollary 4.34 If $X$ is a purely discontinuous local martingale then for every local martingale $Y$
\[ [X, Y] = \sum \Delta X \Delta Y. \tag{4.6} \]
Proof. Obviously
\[ [X, Y] = [X, Y^c + Y^d] = [X, Y^c] + [X, Y^d]. \]
By the definition of the orthogonality $[X, Y^c]$ is a local martingale. $\Delta[X, Y^c] = \Delta X \Delta Y^c = 0$, hence $[X, Y^c]$ is continuous. $[X, Y^c] \in \mathcal{V} \cap \mathcal{L}$, so by Fisk's theorem $[X, Y^c] = 0$. As the purely discontinuous local martingales form a linear space,
\[ [X, Y^d] = \frac{1}{4}\left(\left[X + Y^d\right] - \left[X - Y^d\right]\right) = \frac{1}{4}\left(\sum (\Delta X + \Delta Y^d)^2 - \sum (\Delta X - \Delta Y^d)^2\right) = \sum \Delta X \Delta Y^d = \sum \Delta X (\Delta Y^d + \Delta Y^c) = \sum \Delta X \Delta Y. \qquad\blacksquare \]

Proposition 4.35 (Quadratic variation of semimartingales) For every semimartingale $X$
\[ [X] = [X^c] + \sum (\Delta X)^2, \tag{4.7} \]
where, as before³⁸, $X^c$ denotes the continuous part of the local martingale part of $X$. More generally, if $X$ and $Y$ are semimartingales then
\[ [X, Y] = [X^c, Y^c] + \sum \Delta X \Delta Y. \tag{4.8} \]

³⁷ See: Corollary 4.10, page 229.
³⁸ See: Definition 4.23, page 235.
Proof. Recall³⁹ that every semimartingale $X$ has a decomposition
\[ X = X(0) + X^c + H + V, \]
where $X^c$ is a continuous local martingale, $V \in \mathcal{V}$ and $H$ is a purely discontinuous local martingale. By simple calculation
\[ [X] = [X^c] + [V] + [H] + 2[X^c, H] + 2[X^c, V] + 2[H, V]. \]
As $X^c$ is continuous and $V$ has finite variation, $[X^c, V] = 0$. $H$ is purely discontinuous and $X^c$ is continuous, hence by (4.6) $[X^c, H] = 0$. Therefore
\[ [X] = [X^c] + [V] + [H] + 2[H, V]. \]
Every process with finite variation is a pure quadratic jump process, so
\[ [V] = \sum (\Delta V)^2. \]
$H$ is purely discontinuous, hence it is also a pure quadratic jump process, so
\[ [H] = \sum (\Delta H)^2. \]
As $V$ has finite variation, by (2.14)
\[ [H, V] = \sum \Delta H \Delta V. \]
Therefore
\[ [V] + [H] + 2[H, V] = \sum (\Delta H + \Delta V)^2 = \sum (\Delta X)^2, \]
so (4.7) holds. The proof of the general case is similar. ∎

Corollary 4.36 If $X$ is a semimartingale then $[X^c] = [X]^c$. More generally, if $X$ and $Y$ are semimartingales then $[X^c, Y^c] = [X, Y]^c$.
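Formula (4.7) can be observed on a discretized path: the realized quadratic variation on a fine grid approximates $[X^c](t)$ plus the sum of the squared jumps. The sketch below is not from the text; it assumes the hypothetical semimartingale $X = B + (N - \lambda t)$, a Brownian motion plus a compensated Poisson process.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T, lam = 200_000, 1.0, 3.0
dt = T / n

dB = rng.normal(0.0, np.sqrt(dt), n)                      # continuous-part increments
n_jumps = int(rng.poisson(lam * T))
jump_times = np.sort(rng.uniform(0.0, T, n_jumps))
dN = np.zeros(n)
np.add.at(dN, np.minimum((jump_times / dt).astype(int), n - 1), 1.0)

dX = dB + dN - lam * dt                                   # increments of X = B + (N - lam t)
realized_qv = float(np.sum(dX ** 2))                      # fine-grid approximation of [X](T)
predicted = T + float(np.sum(dN ** 2))                    # [X^c](T) + sum of squared jumps
err = abs(realized_qv - predicted)
```

The compensator, having finite variation and being continuous, contributes nothing in the limit; only $[B](T) = T$ and the squared jumps survive.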
4.3 Stochastic Integration With Respect To Local Martingales

Recall that so far we have defined the stochastic integral with respect to local martingales only when the integrator $Y$ was locally square-integrable. In fact, in this case the construction of the stochastic integral is nearly the same as the construction when the integrator is a continuous local martingale. The only

³⁹ See: Theorem 4.19, page 232.
STOCHASTIC INTEGRATION WITH RESPECT TO LOCAL MARTINGALES
difference is that when $Y \in \mathcal{H}^2_{\mathrm{loc}}$ one can integrate only predictable processes, and one has to consider the condition for the jumps of the integral, $\Delta(X \bullet Y) = X\Delta Y$, as well. Recall that if $Y \in \mathcal{H}^2_{\mathrm{loc}}$ then a predictable process $X$ is integrable if and only if
\[ X \in L^2_{\mathrm{loc}}(Y) \triangleq \left\{ Z : Z^2 \bullet [Y] \in \mathcal{A}^+_{\mathrm{loc}} \right\}. \]
In this case $X \bullet Y \in \mathcal{H}^2_{\mathrm{loc}}$. Observe that the condition $X \in L^2_{\mathrm{loc}}(Y)$ is very natural. If $M$ is a local martingale then $M \in \mathcal{H}^2_{\mathrm{loc}}$ if and only if⁴⁰ $[M] \in \mathcal{A}^+_{\mathrm{loc}}$. As $[X \bullet Y] = X^2 \bullet [Y]$, obviously $X \bullet Y \in \mathcal{H}^2_{\mathrm{loc}}$ if and only if $X \in L^2_{\mathrm{loc}}(Y)$. As $\Delta(X \bullet Y) = X\Delta Y$, if $Y$ is continuous then $X \bullet Y$ is also continuous. Let $Y = Y(0) + Y^c + Y^d$ be the decomposition of $Y$ into continuous and purely discontinuous local martingales. As $[Y] \in \mathcal{A}^+_{\mathrm{loc}}$ and as
\[ [Y] = [Y^c] + [Y^d], \tag{4.9} \]
it is obvious that $[Y^c], [Y^d] \in \mathcal{A}^+_{\mathrm{loc}}$. This immediately implies that $Y^c$ and $Y^d$ are in $\mathcal{H}^2_{\mathrm{loc}}$. From (4.9) it is also clear that $X \in L^2_{\mathrm{loc}}(Y)$ if and only if $X \in L^2_{\mathrm{loc}}(Y^c)$ and $X \in L^2_{\mathrm{loc}}(Y^d)$. This implies that $X \bullet Y^c$ and $X \bullet Y^d$ exist, and obviously
\[ X \bullet Y = X \bullet Y^c + X \bullet Y^d. \]
By the construction $X \bullet Y^c$ is continuous. Observe that $X \bullet Y^d$ is a purely discontinuous local martingale, as for any continuous local martingale $L$
\[ [X \bullet Y^d, L] = X \bullet [Y^d, L] = X \bullet 0 = 0, \]
that is, $X \bullet Y^d$ is strongly orthogonal to every continuous local martingale. The goal of this section is to extend the integration to the case when the integrator is an arbitrary local martingale. To do this one should define the stochastic integral for every purely discontinuous local martingale. From the integration procedure we expect the following properties:
1. If $L \in \mathcal{L}$ is purely discontinuous then $X \bullet L \in \mathcal{L}$ should also be purely discontinuous.
2. Purely discontinuous local martingales are uniquely determined by their jumps⁴¹, hence it is sufficient to prescribe the jumps of $X \bullet L$: it is very natural to ask that the formula $\Delta(X \bullet L) = X\Delta L$ should hold.

⁴⁰ See: Proposition 3.64, page 223.
⁴¹ See: Corollary 4.7, page 228.
3. We have proved⁴² that $[L]^{1/2} \in \mathcal{A}^+_{\mathrm{loc}}$ for any local martingale $L$; therefore, if $X \bullet L$ is a purely discontinuous local martingale, then the expression
\[ \sqrt{[X \bullet L]} = \sqrt{\sum (X \Delta L)^2} \]
should have locally integrable variation.
4. If $L \in \mathcal{L}$ then ${}^p(\Delta L) = 0$. By the jump condition, if $X$ is predictable then
\[ {}^p(\Delta(X \bullet L)) = {}^p(X \cdot \Delta L) = X \cdot ({}^p(\Delta L)) = X \cdot 0 = 0, \]
from which one can expect that one can guarantee only for predictable integrands $X$ that $X \bullet L \in \mathcal{L}$ and $\Delta(X \bullet L) = X\Delta L$.

4.3.1 Definition of stochastic integration
Assume that $L \in \mathcal{L}$ is a purely discontinuous local martingale. As $L$ is a local martingale, ${}^p(\Delta L)$ is finite and ${}^p(\Delta L) = 0$. If $H$ is a predictable real-valued process then, as ${}^p(\Delta L)$ is finite⁴³,
\[ {}^p(H\Delta L) = H({}^p(\Delta L)) = 0, \]
hence if
\[ \sqrt{\sum H^2 (\Delta L)^2} \in \mathcal{A}^+_{\mathrm{loc}}, \]
then there is one and only one purely discontinuous local martingale⁴⁴, denoted by $H \bullet L$, for which $\Delta(H \bullet L) = H\Delta L$. If one expects the properties
\[ H\Delta L = \Delta(H \bullet L) \quad\text{and}\quad (H \bullet L)^d = H \bullet L^d \]
from the stochastic integral $H \bullet L$, then this definition is the only possible one for $H \bullet L$.

Definition 4.37 If $L$ is a purely discontinuous local martingale then $H \bullet L$ is the stochastic integral of $H$ with respect to $L$.

Definition 4.38 If $L = L(0) + L^c + L^d$ is a local martingale and $H$ is a predictable process for which $\sqrt{H^2 \bullet [L]} \in \mathcal{A}^+_{\mathrm{loc}}$, then
\[ H \bullet L \triangleq H \bullet L^c + H \bullet L^d. \]
$H \bullet L$ is the stochastic integral of $H$ with respect to $L$.

⁴² See: (3.20), page 222.
⁴³ See: Proposition 3.37, page 201.
⁴⁴ See: Proposition 4.28, page 240.
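Definition 4.38 with $L^c = 0$ can be exercised on a compensated Poisson process: for the predictable integrand $H(s) = s$ one has $(H \bullet L)(T) = \sum_{\rho_i \le T} \rho_i - \lambda \int_0^T s\,ds$, which has zero mean. The following Monte Carlo sketch (hypothetical parameters, not from the text):

```python
import numpy as np

rng = np.random.default_rng(5)
lam, T, paths = 1.0, 1.0, 100_000

counts = rng.poisson(lam * T, paths)          # number of jumps on [0, T] per path
# sum of jump times per path (jump times of a Poisson process are uniform given the count)
total = np.array([rng.uniform(0.0, T, int(k)).sum() for k in counts])
HL = total - lam * T ** 2 / 2.0               # (H . L)(T) with H(s) = s

mean_HL = float(np.mean(HL))
```

The sample mean is statistically indistinguishable from zero, reflecting that $H \bullet L$ is again a (local) martingale.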
Example 4.39 If $X \in \mathcal{V}$ is predictable⁴⁵ and $L$ is a local martingale then
\[ \Delta X \bullet L = \sum \Delta X \Delta L. \]
1. The trajectories of $L$ are right-regular, therefore they are bounded on finite intervals⁴⁶. As $X \in \mathcal{V}$, obviously $\Delta L \bullet X$ exists and $\sum \Delta X \Delta L = \Delta L \bullet X$. $X$ is predictable and right-regular, therefore it is locally bounded⁴⁷. As $\mathrm{Var}(X)$ is also predictable and right-regular, it is also locally bounded.
2. $|\Delta X| \le \mathrm{Var}(X)$, which implies that $\Delta X \bullet L$ is well-defined. Let $L = L(0) + L^c + L^d$ be the decomposition of $L$. For any local martingale $N$, $\Delta X \bullet [L^c, N] = 0$, hence $\Delta X \bullet L^c = 0$. Therefore one can assume that $L$ is purely discontinuous.
\[ \sum |\Delta X \Delta L| \le \sqrt{\sum (\Delta X)^2}\sqrt{\sum (\Delta L)^2} \le \sqrt{[X]}\sqrt{[L]} < \infty. \]
Obviously $\Delta(\sum \Delta X \Delta L) = \Delta X \Delta L$. As $\sum \Delta X \Delta L$ has finite variation, if it is a local martingale then it is a purely discontinuous local martingale. Therefore we should prove that $\sum \Delta X \Delta L$ is a local martingale, hence we should prove that $\Delta L \bullet X$ is a local martingale.
3. With localization one can assume that $X$ and $\mathrm{Var}(X)$ are bounded. As $X$ and $\mathrm{Var}(X)$ are bounded,
\[ |\Delta L| \bullet \mathrm{Var}(X) = \sum |\Delta X||\Delta L| \le \sqrt{\sum (\Delta X)^2}\sqrt{[L]} \le \sqrt{\sup |X| \cdot \mathrm{Var}(X)}\sqrt{[L]} \in \mathcal{A}^+_{\mathrm{loc}}. \]
Hence with further localization we can assume that $\Delta L \bullet X \in \mathcal{A}$. If $\tau$ is a stopping time then
\[ E((\Delta L \bullet X)(\tau)) = E((\Delta L \bullet X^{\tau})(\infty)). \]
As $X^{\tau}$ is also predictable⁴⁸, one should prove that if $\Delta L \bullet X \in \mathcal{A}$ and $X$ is predictable, then $E((\Delta L \bullet X)(\infty)) = 0$. By Dellacherie's formula⁴⁹, using that

⁴⁵ If $X$ is not predictable then $\Delta X$ is also not predictable, so $\Delta X \bullet L$ is undefined.
⁴⁶ See: Proposition 1.6, page 5.
⁴⁷ See: Proposition 3.35, page 200.
⁴⁸ See: Proposition 1.39, page 23.
⁴⁹ See: Proposition 5.9, page 301.
$L$ is a local martingale, hence ${}^p(\Delta L) = 0$:
\[ E((\Delta L \bullet X)(\infty)) = E\left(({}^p(\Delta L) \bullet X)(\infty)\right) = 0. \]
That is, $\Delta L \bullet X = \sum \Delta X \Delta L$ is a local martingale.

4.3.2 Properties of stochastic integration
Let us discuss the properties of stochastic integration with respect to local martingales:
1. If $\sqrt{H^2 \bullet [L]} \in \mathcal{A}^+_{\mathrm{loc}}$ then the definition is meaningful and $H \bullet L \in \mathcal{L}$. Specifically, every locally bounded predictable process is integrable⁵⁰. For any local martingale $L$
\[ [L] = [L^c] + \sum (\Delta L)^2. \tag{4.10} \]
The integral $H^2 \bullet [L^c]$ is finite, hence the integral $H \bullet L^c$ exists⁵¹. By (4.10)
\[ \sqrt{H^2 \bullet [L^d]} = \sqrt{H^2 \bullet \sum (\Delta L)^2} = \sqrt{\sum (H\Delta L)^2} \in \mathcal{A}_{\mathrm{loc}}, \]
hence $H \bullet L^d$ is also meaningful. Both integrals are local martingales, hence the sum $H \bullet L \triangleq H \bullet L^c + H \bullet L^d$ is also a local martingale. The second observation easily follows from the relation $\sqrt{[L]} \in \mathcal{A}^+_{\mathrm{loc}}$.
2. $H\Delta L = \Delta(H \bullet L)$.
3. $(H \bullet L)^c = H \bullet L^c$ and $(H \bullet L)^d = H \bullet L^d$.
4. $[H \bullet L] = H^2 \bullet [L]$:
\[ [H \bullet L] = [(H \bullet L)^c] + \sum (\Delta(H \bullet L))^2 = H^2 \bullet [L^c] + \sum (H\Delta L)^2 = H^2 \bullet [L^c] + H^2 \bullet [L^d] = H^2 \bullet [L]. \]
5. $H \bullet L$ is the only process in $\mathcal{L}$ for which
\[ [H \bullet L, N] = H \bullet [L, N] \]
holds for every $N \in \mathcal{L}$. By the inequality of Kunita and Watanabe
\[ |H| \bullet \mathrm{Var}([L, N]) \le \sqrt{H^2 \bullet [L]}\sqrt{[N]}, \]

⁵⁰ $\sqrt{[M]} \in \mathcal{A}^+_{\mathrm{loc}}$ for any local martingale $M$, hence the present construction of $H \bullet L$ is maximal in $H$; that is, if one wants to extend the definition of the stochastic integral to a broader class of integrands $H$, then $H \bullet L$ will not necessarily be a local martingale.
⁵¹ See: Corollary 2.67, page 158.
hence the integral $H \bullet [L, N]$ is meaningful. Therefore
\[ [H \bullet L, N] = [(H \bullet L)^c, N^c] + [(H \bullet L)^d, N^d] = [H \bullet L^c, N^c] + \sum H\Delta L\Delta N = H \bullet [L^c, N^c] + [H \bullet L^d, N^d] = H \bullet [L^c, N^c] + H \bullet [L^d, N^d] = H \bullet [L, N]. \]
If $H \bullet [L, N] = [Y, N]$ for some local martingale $Y$, then $[Y - H \bullet L, N] = 0$. Hence if $N \triangleq Y - H \bullet L$ then $[Y - H \bullet L] = 0$. $Y - H \bullet L$ is a local martingale, therefore⁵² $Y - H \bullet L = 0$.
6. If $\tau$ is an arbitrary stopping time and $H \bullet L$ exists, then
\[ H \bullet L^{\tau} = (H \bullet L)^{\tau} = (\chi([0,\tau])H) \bullet L. \]
If $\sqrt{H^2 \bullet [L]} \in \mathcal{A}_{\mathrm{loc}}$, then trivially
\[ \sqrt{H^2 \bullet [L^{\tau}]} = \sqrt{\chi([0,\tau])H^2 \bullet [L]} \in \mathcal{A}_{\mathrm{loc}}, \]
so the integrals above exist. By the stopping rule of the quadratic variation, if $N \in \mathcal{L}$,
\[ [(H \bullet L)^{\tau}, N] = [(H \bullet L), N]^{\tau} = (H \bullet [L, N])^{\tau} = H \bullet [L, N]^{\tau} = H \bullet [L^{\tau}, N] = [H \bullet L^{\tau}, N], \]
hence by the bilinearity of the quadratic variation
\[ [(H \bullet L)^{\tau} - H \bullet L^{\tau}, N] = 0, \qquad N \in \mathcal{L}, \]
from which
\[ (H \bullet L)^{\tau} = H \bullet L^{\tau}. \]
For arbitrary $N \in \mathcal{L}$
\[ [H \bullet L^{\tau}, N] = H \bullet [L^{\tau}, N] = H \bullet [L, N]^{\tau} = (\chi([0,\tau])H) \bullet [L, N] = [(\chi([0,\tau])H) \bullet L, N], \]
hence again $H \bullet L^{\tau} = (\chi([0,\tau])H) \bullet L$ from Property 5.

⁵² See: Proposition 2.82, page 170.
7. The integral is linear in the integrand. By elementary calculation
\[ \sqrt{(H_1 + H_2)^2 \bullet [L]} \le \sqrt{H_1^2 \bullet [L]} + \sqrt{H_2^2 \bullet [L]}, \]
hence if $H_1 \bullet L$ and $H_2 \bullet L$ exist then the integral $(H_1 + H_2) \bullet L$ also exists. When the integrator is continuous the integral is linear. The linearity of the purely discontinuous part is a simple consequence of the relation
\[ (H_1 + H_2)\Delta L = H_1\Delta L + H_2\Delta L. \]
The proof of the homogeneity is analogous.
8. The integral is linear in the integrator. By the inequality of Kunita and Watanabe⁵³
\[ [L_1 + L_2] \le 2([L_1] + [L_2]), \]
hence if the integrals $H \bullet L_1$ and $H \bullet L_2$ exist then $H \bullet (L_1 + L_2)$ also exists. The decomposition of the local martingales into continuous and purely discontinuous martingales is unique, so $(L_1 + L_2)^c = L_1^c + L_2^c$ and $(L_1 + L_2)^d = L_1^d + L_2^d$. For continuous local martingales we have already proved the linearity; the linearity of the purely discontinuous part is evident from the relation $\Delta(L_1 + L_2) = \Delta L_1 + \Delta L_2$.
9. If $H \triangleq \sum_i \xi_i \chi((\tau_i, \tau_{i+1}])$ is an adapted simple process then
\[ (H \bullet L)(t) = \sum_i \xi_i (L(\tau_{i+1} \wedge t) - L(\tau_i \wedge t)). \tag{4.11} \]
By the linearity it is sufficient to calculate the integral just for one jump. For the continuous part we have already deduced the formula. For the discontinuous part it is sufficient to remark that if $\xi_i$ is $\mathcal{F}_{\tau_i}$-measurable and $L$ is a purely discontinuous local martingale, then $\xi_i (L(\tau_{i+1} \wedge t) - L(\tau_i \wedge t))$ is a purely discontinuous local martingale⁵⁴, with jumps $\xi_i \chi((\tau_i, \tau_{i+1}])\Delta L$.
10. Assume that the integral $H \bullet L$ exists. The integral $K \bullet (H \bullet L)$ exists if and only if the integral $(KH) \bullet L$ exists. In this case
\[ (KH) \bullet L = K \bullet (H \bullet L). \]
Let us remark that, as the integrals are pathwise integrals with respect to processes with finite variation,
\[ \sqrt{K^2 \bullet (H^2 \bullet [L])} = \sqrt{(KH)^2 \bullet [L]}. \]
Corollary 2.36, page 137. space of purely discontinuous local martingales is closed under stopping.
STOCHASTIC INTEGRATION WITH RESPECT TO LOCAL MARTINGALES
253
K • (H • L) exists if and only if ) " " 2 K 2 • [H • L] = K 2 • (H 2 • [L]) = (KH) • [L] ∈ A+ loc , from which the first part is evident. If N is an arbitrary local martingale then [K • (H • L) , N ] = K • [H • L, N ] = KH • [L, N ] = = [KH • L, N ] , from which the second part is evident. 11. If τ is an arbitrary stopping time then τ
H • Lτ = (χ ([0, τ ]) H) • L = (H • L) . If N is an arbitrary local martingale, then τ
[H • Lτ , N ] = H • [Lτ , N ] = H • [L, N ] = = Hχ ([0, τ ]) • [L, N ] = = [Hχ ([0, τ ]) • L, N ] = τ
τ
= (H • [L, N ]) = [H • L, N ] = τ
= [(H • L) , N ] , from which the property is evident. 12. The Dominated Convergence Theorem is valid, that is if (Hn ) is a sequence of predictable processes, Hn → H∞ and there is a predictable process H, for which the integral H • L exists and |Hn | ≤ H then the integrals Hn • L also exist and Hn • L → H∞ • L, where the convergence is uniform in probability on the compact time-intervals. As Hn2 • [L] ≤ H 2 • [L] for all n ≤ ∞ the integrals Hn • L exist. By Davis’ inequality, for every stopping time τ %
2 τ E sup |((Hn − H∞ ) • L ) (t)| ≤ C · E (Hn − H∞ ) • [L] (∞) .
τ
t
"
τ H 2 • [L] m (∞) < ∞, hence by There is a localizing sequence (τ m ), that E the classical Dominated Convergence Theorem E
) 2 τ (Hn − H∞ ) • [L] m (∞) → 0
GENERAL THEORY OF STOCHASTIC INTEGRATION
hence

    sup_t |((H_n − H_∞) • L^{τ_m})(t)| → 0 in L¹,

from which, as in the continuous case⁵⁵, one can guarantee the uniform convergence in probability on every compact interval.

13. The definition of the integral is unambiguous, that is, if L ∈ V ∩ L then the two possible concepts of integration give the same result. This is trivial from Proposition 2.89.

14. If X is left-continuous and locally bounded then (X • L)(t) is an Itô–Stieltjes integral for every t, where the convergence of the approximating sums is uniform in probability on every compact interval. The approximating partitions can be random as well. The proof is the same as in the continuous case⁵⁶.
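The elementary formula (4.11) and the stopping rule in property 11 can be checked mechanically on a single discrete path. The sketch below uses assumed toy data (a deterministic path L sampled at integer times, deterministic times τ_i and values ξ_i), so it only illustrates the pathwise algebra, not the measurability assumptions.

```python
# Discrete sketch of formula (4.11) and of the stopping rule in property 11.
# All inputs below are assumed toy data.
taus = [0, 2, 5, 8]                  # tau_0 < tau_1 < ... (assumed deterministic)
xis  = [1.0, -2.0, 0.5]              # xi_i, "known at time tau_i"
L    = [0, 1, -1, 2, 1, 3, 2, 4, 5]  # a path of the integrator, L[t] = L(t)

def H_dot_L(path, t):
    """(H . L)(t) = sum_i xi_i (L(tau_{i+1} ^ t) - L(tau_i ^ t)), eq. (4.11)."""
    return sum(x * (path[min(b, t)] - path[min(a, t)])
               for x, a, b in zip(xis, taus, taus[1:]))

stop = 4                                         # a (deterministic) stopping time
L_stopped = [L[min(t, stop)] for t in range(len(L))]

# Property 11 pathwise: H . (L^tau) = (H . L)^tau.
for t in range(len(L)):
    assert H_dot_L(L_stopped, t) == H_dot_L(L, min(t, stop))
print("property 11 holds pathwise:", [H_dot_L(L, t) for t in range(len(L))])
```

Since H is constant between the τ_i, the stopped and unstopped transforms agree term by term, which is exactly why property 11 needs no extra argument for simple integrands.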
4.4 Stochastic Integration With Respect To Semimartingales
Recall the definition of stochastic integration with respect to semimartingales:

Definition 4.40 If the semimartingale X has a decomposition

    X = X(0) + L + V,    V ∈ V, L ∈ L,

for which the integrals H • L and H • V exist, then H • X ≜ H • L + H • V.

By Proposition 2.89 the next statement is trivial⁵⁷:

Proposition 4.41 For predictable integrands the definition is unambiguous, that is, the integral is independent of the decomposition of the integrator.

Proposition 4.42 If X and Y are arbitrary semimartingales and the integrals U • X and V • Y exist, then

    [U • X, V • Y] = UV • [X, Y].

Proof. Let X ≜ X_L + X_V and Y ≜ Y_L + Y_V be the decompositions of X and Y. Then

    [U • X, V • Y] = [U • X_L, V • Y_L] + [U • X_L, V • Y_V] + [U • X_V, V • Y_L] + [U • X_V, V • Y_V].

⁵⁵ See: Proposition 2.74, page 162. ⁵⁶ See: Proposition 2.77, page 166. ⁵⁷ See: Subsection 2.4.3, page 176.
For integrals with respect to local martingales

    [U • X_L, V • Y_L] = UV • [X_L, Y_L].

In the three other expressions one factor has finite variation, hence the quadratic co-variation is the sum of the products of the jumps⁵⁸. For example

    [U • X_L, V • Y_V] = Σ ∆(U • X_L) ∆(V • Y_V) = Σ (U ∆X_L)(V ∆Y_V).

On the other hand, for the same reason,

    UV • [X_L, Y_V] = UV • Σ ∆X_L ∆Y_V = Σ UV ∆X_L ∆Y_V,

hence [U • X_L, V • Y_V] = UV • [X_L, Y_V]. One can finish the proof with the same calculation for the other terms.

Observe that the existence of the integral H • X means that for some decomposition X = X(0) + L + V one can define the integral; it does not mean that in every decomposition of X the two integrals are meaningful. Observe also that with this definition we have extended the class of integrable processes even for local martingales: it is possible that the integral H • L, as an integral with respect to the local martingale L, does not exist, but L has a decomposition L = L(0) + M + V, M ∈ L, V ∈ V, for which H is integrable with respect to M and V. Of course in this general case we cannot guarantee that⁵⁹ H • L ∈ L.

Example 4.43 If the integrand is not locally bounded then the stochastic integral with respect to a local martingale is not necessarily a local martingale.
Let M be a compound Poisson process, where P(ξ_k = ±1) = 1/2 for the jumps ξ_k. M is a martingale and the trajectories of M are not continuous. Let τ₁ be the time of the first jump of M and let

    X(t, ω) ≜ (1/t) · χ((0, τ₁(ω)])(t).

⁵⁸ See: line (2.14), page 134. ⁵⁹ See: Example 4.43, page 255.
X is predictable but it is not locally bounded. As the trajectories of M have finite variation, the pathwise stochastic integral

    (X • M)(t, ω) = ∫_{(0,t]} (1/s) χ((0, τ₁(ω)])(s) dM(s, ω) ≜ L(t, ω),

which equals 0 if t < τ₁(ω) and ξ₁(ω)/τ₁(ω) if τ₁(ω) ≤ t, is meaningful. We prove that L is not a local martingale. If (ρ_k) were a localizing sequence for L then L^{ρ₁} would be a uniformly integrable martingale. Hence for the stopping time σ ≜ ρ₁ ∧ t

    E(L(σ)) = E(L(ρ₁ ∧ t)) = E(L^{ρ₁}(t)) = E(L(0)) = 0.

Therefore it is sufficient to prove that for any finite stopping time σ ≠ 0

    E(|L(σ)|) = ∞.    (4.12)

Let σ be a finite stopping time with respect to the filtration F generated by M. Then

    E(|L(σ)|) = ∫_Ω (1/τ₁) χ(τ₁ ≤ σ) dP ≥ ∫_Ω (1/τ₁) χ(τ₁ ≤ σ ∧ τ₁) dP.

Hence to prove (4.12) one can assume that σ ≤ τ₁. In this case σ is F_{τ₁}-measurable, hence it is independent of the variables (ξ_n). So one can assume that σ is a stopping time for the filtration generated by the point process part of M. By the formula for the representation of stopping times of point processes⁶⁰

    σ = φ₀ χ(σ < τ₁) + Σ_{n=1}^∞ χ(τ_n ≤ σ < τ_{n+1}) φ_n(τ₀, …, τ_n)
      = Σ_{n=0}^∞ χ(τ_n ≤ σ < τ_{n+1}) φ_n(τ₀, …, τ_n)
      = φ₀ χ(σ < τ₁) + χ(σ ≥ τ₁) φ₁(τ₁).

From this {τ₁ ≤ φ₀} ⊆ {τ₁ ≤ σ}. If φ₀ > 0 then, using that τ₁ has an exponential distribution,

    E(|L(σ)|) = ∫_Ω (1/τ₁) χ(τ₁ ≤ σ) dP ≥ ∫_Ω (1/τ₁) χ(τ₁ ≤ φ₀) dP = ∫_0^{φ₀} (1/x) λ exp(−λx) dx = ∞.

⁶⁰ See: Proposition C.6, page 581.
Since σ ≠ 0 and F₀ = {∅, Ω}, we have {σ ≤ 0} = ∅. Hence σ > 0, so if φ₀ = 0 then σ ≥ τ₁. Hence again

    E(|L(σ)|) = ∫_Ω (1/τ₁) χ(τ₁ ≤ σ) dP = ∫_Ω (1/τ₁) dP = ∞.

By the definition of the integral it is clear that if a process H is integrable with respect to semimartingales X₁ and X₂, then H is integrable with respect to aX₁ + bX₂ for any constants a, b and

    H • (aX₁ + bX₂) = a(H • X₁) + b(H • X₂).

Observe that by the above definitions the other additivity of the integral, that is, the relation

    (H₁ + H₂) • X = H₁ • X + H₂ • X,

is not clear. Our direct goal in the following two subsections is to prove this additivity property of the integral.

4.4.1 Integration with respect to special semimartingales
Recall that by definition S is a special semimartingale if it has a decomposition

    S = S(0) + V + L,    V ∈ V, L ∈ L,    (4.13)
where V is predictable.

Theorem 4.44 (Characterization of special semimartingales) Let S be a semimartingale. The next statements are equivalent:
1. S is a special semimartingale, i.e. there is a decomposition (4.13) where V is predictable.
2. There is a decomposition (4.13) where V ∈ A_loc.
3. For all decompositions (4.13), V ∈ A_loc.
4. S*(t) ≜ sup_{s≤t} |S(s) − S(0)| ∈ A⁺_loc.

Proof. We prove the equivalence of the statements backwards.
1. Let us assume that the last statement holds, and let S = S(0) + V + L be a decomposition of S. Let L*(t) ≜ sup_{s≤t} |L(s)|. L* is in⁶¹ A⁺_loc, hence from the assumption of the fourth statement

    V*(t) ≜ sup_{s≤t} |V(s)| ≤ S*(t) + L*(t) ∈ A⁺_loc.

⁶¹ See: Example 3.3, page 181.
The process Var(V)₋ is increasing and continuous from the left, hence it is locally bounded, so Var(V)₋ ∈ A⁺_loc. As

    Var(V) ≤ Var(V)₋ + ∆(Var(V)) ≤ Var(V)₋ + 2V*,

we have Var(V) ∈ A⁺_loc, hence the third condition holds.
2. From the third condition the second one follows trivially.
3. If V ∈ A_loc in the decomposition S = S(0) + V + L, then V^p, the predictable compensator of V, exists. V − V^p is a local martingale, hence

    S = S(0) + V^p + (V − V^p + L)

is a decomposition where V^p ∈ V is predictable, so S is a special semimartingale.
4. Let us assume that S(0) = 0, so S = V + L. If V*(t) ≜ sup_{s≤t} |V(s)|, then as V* ≤ Var(V),

    S* ≤ V* + L* ≤ Var(V) + L*.

L* ∈ A⁺_loc, so it is sufficient to prove that if V ∈ V is predictable then Var(V) ∈ A⁺_loc; for this it is sufficient to prove that Var(V) is locally bounded. V is continuous from the right, hence when one calculates Var(V) it suffices to use partitions with dyadic rational points, so if V is predictable then Var(V) is also predictable. Var(V) is right-continuous and predictable, hence it is locally bounded⁶².

Example 4.45 X ∈ V is a special semimartingale if and only if X ∈ A_loc. A compound Poisson process is a special semimartingale if and only if the expected value of the distribution of the jumps is finite.

The first remark is evident from the theorem. Recall that a compound Poisson process has locally integrable variation if and only if the distribution of the jumps has finite expected value⁶³.

Example 4.46 If a semimartingale S is locally bounded then S is a special semimartingale.

Example 4.47 If a semimartingale S has bounded jumps then S is a special semimartingale⁶⁴.
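To make Example 4.45 concrete, here is a sketch of the canonical decomposition in the compound Poisson case, assuming intensity λ and a jump distribution with finite mean μ ≜ E(ξ_k) (notation assumed, not from the text):

```latex
X(t) = \sum_{k=1}^{N(t)} \xi_k
     = \underbrace{\Bigl(\sum_{k=1}^{N(t)} \xi_k - \lambda\mu t\Bigr)}_{L\,\in\,\mathcal{L}}
     + \underbrace{\lambda\mu t}_{V\,\in\,\mathcal{V}\ \text{predictable}} .
```

When E|ξ_k| = ∞ the compensator λμt is not defined, the variation of X is not locally integrable, and by Theorem 4.44 no decomposition with predictable V can exist.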
⁶² See: Proposition 3.35, page 200. ⁶³ See: Example 3.2, page 180. ⁶⁴ See: Proposition 1.152, page 107.
Example 4.48 Decomposition of continuous semimartingales.
Recall that by definition S is a continuous semimartingale if S has a decomposition S = S(0) + V + L, where V ∈ V, L ∈ L and V and L are continuous⁶⁵. Let S now be a semimartingale and let us assume that its trajectories are continuous. As S is continuous it is locally bounded, so S is a special semimartingale. By the theorem just proved, S has a decomposition S(0) + V + L, where V ∈ V is predictable and L ∈ L. As S is continuous, L is also predictable, hence it is continuous⁶⁶. This implies that V is also continuous. This means that S is a continuous semimartingale.

The stochastic integral X • Y is always a semimartingale. One can ask: when is it a special semimartingale?

Theorem 4.49 (Integration with respect to special semimartingales) Let X be a special semimartingale. Assume that for a predictable process H the integral H • X exists. Let X ≜ X(0) + A + L be the canonical decomposition of X. H • X is a special semimartingale if and only if the integrals H • A and H • L exist and H • L is a local martingale. In this case the canonical decomposition of H • X is exactly H • A + H • L.

Proof. Let us first remark that if U and W are predictable, W ∈ V, and the integral U • W exists, then it is predictable. This is obviously true if

    U ≜ χ((s, t]) χ_F,    F ∈ F_s,

as⁶⁷

    U • W = χ_F (W^t − W^s) = (χ_F χ((s, ∞))) • (W^t − W^s).

The general case follows from the Monotone Class Theorem.

Assume that the integral⁶⁸ Z ≜ H • X ≜ H • V + H • M exists and it is a special semimartingale. Let Z ≜ B + N be the canonical decomposition of Z. B ∈ A_loc and B is predictable. χ(|H| ≤ n) is bounded and predictable, hence the integral

    χ(|H| ≤ n) • Z ≜ χ(|H| ≤ n) • B + χ(|H| ≤ n) • N

⁶⁵ See: Definition 2.18, page 124. ⁶⁶ See: Proposition 3.40, page 205. ⁶⁷ See: Proposition 1.39, page 23. ⁶⁸ With some decomposition X = X(0) + V + M.
exists. χ(|H| ≤ n) is bounded and B ∈ A_loc, hence χ(|H| ≤ n) • B ∈ A_loc. As χ(|H| ≤ n) and B are predictable, χ(|H| ≤ n) • B is also predictable. Let H_n ≜ Hχ(|H| ≤ n). H_n is bounded and predictable, hence the integral

    H_n • X ≜ H_n • A + H_n • L

is meaningful. H_n • A ∈ A_loc, H_n • A is predictable and H_n • L ∈ L, so H_n • X is a special semimartingale with canonical decomposition H_n • A + H_n • L. By the associativity rule of integration with respect to local martingales and processes with finite variation, and by the linearity in the integrator,

    χ(|H| ≤ n) • Z ≜ χ(|H| ≤ n) • (H • X) ≜ χ(|H| ≤ n) • (H • V + H • M)
      = χ(|H| ≤ n) • (H • V) + χ(|H| ≤ n) • (H • M)
      = (χ(|H| ≤ n)H) • V + (χ(|H| ≤ n)H) • M ≜ (χ(|H| ≤ n)H) • X ≜ H_n • X = H_n • A + H_n • L.

The canonical decomposition of special semimartingales is unique, hence

    χ(|H| ≤ n) • B = H_n • A,    χ(|H| ≤ n) • N = H_n • L.

As we have seen,

    (χ(|H| ≤ n)H)² • [L] ≜ H_n² • [L] = [H_n • L] = [χ(|H| ≤ n) • N] = χ(|H| ≤ n) • [N] ≤ [N].

√[N] ∈ A⁺_loc, so by the Monotone Convergence Theorem √(H² • [L]) ∈ A⁺_loc, and therefore the integral H • L ∈ L exists; by the Dominated Convergence Theorem N = H • L. Similarly H • A exists, it is in A_loc, and H • A = B. If H and A are predictable then H • A is predictable, hence the other implication is evident.

Corollary 4.50 Let L be a local martingale and let us assume that the integral H • L exists. H • L is a local martingale if and only if sup_{s≤t} |(H • L)(s)| is locally integrable, that is,

    sup_{s≤t} |(H • L)(s)| ∈ A⁺_loc.
Proof. As sup_{s≤t} |M(s)| is locally integrable⁶⁹ for every local martingale M ∈ L, one should only prove that if sup_{s≤t} |(H • L)(s)| is locally integrable then H • L is a local martingale. X ≜ L is a special semimartingale with canonical decomposition X = L + 0. Hence H • L is a local martingale if and only if Y ≜ H • L is a special semimartingale. But as Y(0) = 0, the process Y is a special semimartingale⁷⁰ if and only if sup_{s≤t} |Y(s)| ∈ A⁺_loc.

4.4.2 Linearity of the stochastic integral
The most important property of every integral is the linearity in the integrand. Now we are ready to prove this important property:

Theorem 4.51 (Additivity of stochastic integration) Let X be an arbitrary semimartingale. If H₁ and H₂ are predictable processes and the integrals H₁ • X and H₂ • X exist, then for arbitrary constants a and b the integral (aH₁ + bH₂) • X exists and

    (aH₁ + bH₂) • X = a(H₁ • X) + b(H₂ • X).    (4.14)

Proof. Let

    B ≜ {|∆X| > 1, |∆(H₁ • X)| > 1, |∆(H₂ • X)| > 1}

be the set of the 'big jumps'. Observe that

    ∆(H_i • X) ≜ ∆(H_i • V_i + H_i • L_i) = ∆(H_i • V_i) + ∆(H_i • L_i) = H_i ∆V_i + H_i ∆L_i = H_i ∆X,

so

    B = {|∆X| > 1, |H₁ ∆X| > 1, |H₂ ∆X| > 1}.

Obviously, for an arbitrary ω the section B(ω) does not have an accumulation point. Let us separate the 'big jumps' from X: let

    X̂(t) ≜ Σ_{s≤t} ∆X(s) χ_B(s),    X̃ ≜ X − X̂.

Observe that, by the simple structure of B, X̂ ∈ V and the integrals H_k • X̂ are simple sums, so they exist. By the construction of the stochastic integral

⁶⁹ See: Example 3.3, page 181. ⁷⁰ See: Theorem 4.44, page 257.
H_k • X̃ also exists⁷¹. As the jumps of X̃ are bounded, X̃ is a special semimartingale⁷². Since

    ∆(H_k • X̃) = H_k ∆X̃ = H_k ∆(X − X̂) = H_k ∆X χ_{Bᶜ},

the jumps of H_k • X̃ are also bounded, and therefore the processes H_k • X̃ are also special semimartingales. Let X̃ = X̃(0) + A + L be the canonical decomposition of X̃. By the previous theorem the integrals H_k • A and H_k • L also exist. Integration with respect to local martingales and with respect to processes with finite variation is additive, hence

    (H₁ + H₂) • A = H₁ • A + H₂ • A,    (H₁ + H₂) • L = H₁ • L + H₂ • L,

which of course means that the integrals on the left-hand side exist. The integrals H_k • X̂ are ordinary sums, hence

    (H₁ + H₂) • X̂ = H₁ • X̂ + H₂ • X̂.

Adding up the three lines above and using that the integral is additive in the integrator we get (4.14). The homogeneity of the integral is obvious from the definition of the integral.

4.4.3 The associativity rule

Like additivity, the associativity rule is also not directly evident from the definition of the stochastic integral.

Theorem 4.52 (Associativity rule) Let X be an arbitrary semimartingale and let us assume that the integral H • X exists. The integral K • (H • X) exists if and only if the integral (KH) • X exists. In this case

    K • (H • X) = (KH) • X.

⁷¹ H² • [L̃] ≤ H² • [L] and Var(Ṽ) ≤ Var(V)! ⁷² See: Example 4.47, page 258.
Proof. Assume that K is integrable with respect to the semimartingale Y ≜ H • X. Let B again be the set of the 'big jumps', that is,

    B ≜ {|∆X| > 1, |∆Y| > 1, |∆(K • Y)| > 1}.

As in the previous subsection, for every ω the section B(ω) is a discrete set. Let us define the processes

    X̂ ≜ Σ χ_B ∆X,    X̃ ≜ X − X̂,    Ŷ ≜ Σ χ_B ∆Y,    Ỹ ≜ Y − Ŷ.

Using the formula for the jumps of the integrals and the additivity of the integral in the integrator,

    Ỹ = Y − Ŷ = H • X − H • X̂ = H • X̃.

As the jumps of X̃ are bounded, X̃ is a special semimartingale. Let X̃ = X̃(0) + A + L be the canonical decomposition of X̃. For the same reason Ỹ is also a special semimartingale, and as we saw above the canonical decomposition of Ỹ is

    Ỹ = H • X̃ = H • A + H • L.

The integral K • Ŷ on any finite interval is a finite sum, hence if K • Y exists then K • Ỹ also exists. Since

    ∆(K • Ỹ) = K ∆Ỹ = K ∆Y χ_{Bᶜ},

the jumps of K • Ỹ are bounded, so K • Ỹ is also a special semimartingale. Therefore the integrals K • (H • A) and K • (H • L) exist and K • (H • L) is a local martingale. By the associativity rule for local martingales and for processes with finite variation

    K • (H • A) = (KH) • A,    K • (H • L) = (KH) • L.
Adding up the corresponding lines,

    K • Y = K • Ỹ + K • Ŷ = K • (H • A + H • L) + K • (H • X̂)
          = (KH) • A + (KH) • L + (KH) • X̂ = (KH) • X̃ + (KH) • X̂ = (KH) • X.

The proof of the reverse implication is similar. Assume that the integrals Y ≜ H • X and (KH) • X exist, and let

    B ≜ {|∆X| > 1, |∆Y| > 1, |∆((KH) • X)| > 1}.

In this case

    H • X̃ = H • A + H • L,
    (KH) • X̃ = (KH) • A + (KH) • L = K • (H • A) + K • (H • L),

where of course the integrals exist. (KH) • X̂ is again a simple sum, therefore

    (KH) • X = (KH) • X̃ + (KH) • X̂
             = K • (H • A) + K • (H • L) + K • (H • X̂)
             = K • (H • A + H • L + H • X̂)
             = K • (H • (A + L + X̂)) = K • (H • X).
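Both linearity (Theorem 4.51) and the associativity rule can be sanity-checked on a single discrete finite-variation path, where every integral reduces to a sum of products. The data below are assumed toy inputs; the check illustrates only the pathwise algebra.

```python
# Discrete sanity check of (KH) . X = K . (H . X) on one path (toy data).
dX = [0.0, 1.0, -2.0, 0.5, 3.0, -1.0]   # increments of X at times 0..5
H  = [0.0, 2.0, -1.0, 0.0, 1.5, 4.0]    # integrand values at those times
K  = [0.0, 1.0, 3.0, -2.0, 0.5, 1.0]

def integrate(integrand, increments):
    """Pathwise integral: returns the increments of (G . Z)."""
    return [g * dz for g, dz in zip(integrand, increments)]

d_HX = integrate(H, dX)                                  # increments of H . X
lhs  = integrate(K, d_HX)                                # K . (H . X)
rhs  = integrate([k * h for k, h in zip(K, H)], dX)      # (KH) . X
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
print("associativity increments:", lhs)
```

The whole difficulty of Theorem 4.52 is, of course, that for a general semimartingale the integral is not this pathwise sum; the discrete identity only shows what the theorem asserts.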
4.4.4 Change of measure
In this subsection we discuss the behaviour of the stochastic integral when we change the measure on the underlying probability space.

Definition 4.53 Let P and Q be two probability measures on a measurable space (Ω, A), and let us fix a filtration F. If Q is absolutely continuous with respect to P on the measurable space (Ω, F_t) for every t, then we say that Q is locally absolutely continuous with respect to P. In this case we shall use the notation Q ≪_loc P.
If Q ≪_loc P then one can define the Radon–Nikodym derivatives

    Λ(t) ≜ dQ(t)/dP(t),

where Q(t) is the restriction of Q and P(t) is the restriction of P to F_t. If s < t and F ∈ F_s then

    ∫_F Λ(t) dP = ∫_F (dQ(t)/dP(t)) dP = Q(t)(F) = Q(s)(F) = ∫_F (dQ(s)/dP(s)) dP = ∫_F Λ(s) dP.

If the filtration F satisfies the usual conditions then the process Λ has a modification which is a martingale. As Λ(t) is defined only up to a set of measure zero, one can assume that the Radon–Nikodym process Λ is a martingale.
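The density process is easy to see on a finite example. The sketch below (all parameters assumed) builds Λ(t) = dQ(t)/dP(t) on a two-period binary tree by summing both measures over the atoms of F_t, and checks the martingale property under P together with the change-of-measure identity E_Q(f) = E_P(Λ f).

```python
from itertools import product

p_P, p_Q = 0.5, 0.7   # assumed up-probabilities under P and Q

omega = list(product([0, 1], repeat=2))           # two i.i.d. coin flips
P = {w: p_P**sum(w) * (1 - p_P)**(2 - sum(w)) for w in omega}
Q = {w: p_Q**sum(w) * (1 - p_Q)**(2 - sum(w)) for w in omega}

def Lam(t, w):
    """Lambda(t) = dQ(t)/dP(t): ratio of the measures on the F_t-atom of w."""
    atom = [v for v in omega if v[:t] == w[:t]]
    return sum(Q[v] for v in atom) / sum(P[v] for v in atom)

# Martingale property under P: E_P(Lambda(2) | F_1) = Lambda(1) on every atom.
for a in (0, 1):
    atom = [v for v in omega if v[0] == a]
    pa = sum(P[v] for v in atom)
    cond = sum(P[v] * Lam(2, v) for v in atom) / pa
    assert abs(cond - Lam(1, atom[0])) < 1e-12

# Change of measure: E_Q(f) = E_P(Lambda(2) f) for F_2-measurable f.
f = lambda w: 3 * w[0] - w[1]
assert abs(sum(Q[w] * f(w) for w in omega)
           - sum(P[w] * Lam(2, w) * f(w) for w in omega)) < 1e-12
print("density process checks passed")
```

Restricting a measure to F_t corresponds exactly to summing over the F_t-atoms, which is why Λ(t) is constant on each atom.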
Lemma 4.54 If Q ≪_loc P and σ is a bounded stopping time then Λ(σ) is the Radon–Nikodym derivative dQ/dP on the σ-algebra F_σ. If Λ is uniformly integrable then this is true for any stopping time σ.

Proof. If σ is a bounded stopping time and σ ≤ t, then by the Optional Sampling Theorem, since Λ is a martingale,

    Λ(σ) = E(Λ(t) | F_σ).

That is, if F ∈ F_σ ⊆ F_t then

    ∫_F Λ(σ) dP = ∫_F Λ(t) dP = Q(t)(F) = Q(F).
As Λ is not always a uniformly integrable martingale⁷³, the lemma is not valid for an arbitrary stopping time σ. Since Λ is non-negative, Λ(t) → Λ(∞) almost surely, where Λ(∞) ≥ 0 is an integrable⁷⁴ variable. By Fatou's lemma

    Λ(t) = E(Λ(N) | F_t) = liminf_{N→∞} E(Λ(N) | F_t) ≥ E(liminf_{N→∞} Λ(N) | F_t) = E(Λ(∞) | F_t).

Hence the extended process is a non-negative, integrable supermartingale on [0, ∞]. By the Optional Sampling Theorem for Submartingales⁷⁵, if σ ≤ τ are

⁷³ See: Example 6.34, page 384. ⁷⁴ See: Corollary 1.66, page 40. ⁷⁵ See: Proposition 1.88, page 54.
arbitrary stopping times then

    Λ(σ) ≥ E(Λ(τ) | F_σ).    (4.15)

Let us introduce the stopping time τ ≜ inf{t : Λ(t) = 0}. Let L be a local martingale and let

    U ≜ ∆L(τ) χ([τ, ∞)).

As L is a local martingale, U ∈ A_loc, so U has a compensator U^p. With this notation we have the following theorem:
Proposition 4.55 Let Q ≪_loc P and let

    Λ(t) ≜ dQ(t)/dP(t).

Then Λ⁻¹ is meaningful and right-regular⁷⁶ under Q. If L is a local martingale under the measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under Q and

    L̃ ≜ L − Λ⁻¹ • [L, Λ] + U^p

is a local martingale⁷⁷ under the measure Q.

Proof. We divide the proof into several steps.
1. First we show that Λ > 0 almost surely under Q. Let τ ≜ inf{t : Λ(t) = 0}. Λ is right-continuous, so if τ(ω) < ∞ then Λ(τ(ω), ω) = 0. If 0 ≤ q ∈ ℚ then τ + q ≥ τ. Hence by (4.15)

    Λ(τ) χ(τ < ∞) ≥ χ(τ < ∞) · E(Λ(τ + q) | F_τ) = E(Λ(τ + q) χ(τ < ∞) | F_τ).

Taking expected values,

    0 ≥ E(Λ(τ + q) χ(τ < ∞)) ≥ 0.

⁷⁶ That is, Λ⁻¹ is almost surely finite and right-regular with respect to Q; that is, Λ > 0 a.s. with respect to Q. In this case Λ⁻¹ = Λ̃ under Q. See: (4.18). ⁷⁷ More precisely, L̃ is indistinguishable from a local martingale under Q.
Hence Λ(τ + q) = 0 almost surely on the set {τ < ∞} for any q ∈ ℚ. As Λ is right-continuous, outside a set with P-measure zero, if τ(ω) ≤ t < ∞ then Λ(t, ω) = 0. Therefore

    Q(t)({Λ(t) = 0}) = ∫_{{Λ(t)=0}} (dQ(t)/dP) dP = ∫_{{Λ(t)=0}} Λ(t) dP = 0,

so Λ(t) > 0 almost surely with respect to Q(t). Hence

    Q(Λ(t) = 0 for some t) = Q(τ < ∞) = Q(∪_n {Λ(n) = 0}) ≤ Σ_{n=1}^∞ Q(Λ(n) = 0) = Σ_{n=1}^∞ Q(n)(Λ(n) = 0) = 0.
Hence Λ⁻¹ is meaningful and Λ⁻¹ > 0 almost surely under Q. We prove that Λ₋ is also almost surely positive with respect to Q. Let

    ρ ≜ inf{t : Λ₋(t) = 0},    ρ_n ≜ inf{t : Λ(t) ≤ 1/n}.

As Λ is right-regular, Λ(ρ_n) ≤ 1/n. Obviously, on the set {ρ < ∞},

    lim_{n→∞} Λ(ρ_n) = Λ(ρ−) = 0.

By (4.15), for any positive rational number q,

    Λ(ρ_n) χ(ρ_n < ∞) ≥ E(Λ(ρ_n + q) χ(ρ_n < ∞) | F_{ρ_n}).

Taking expected values,

    1/n ≥ E(Λ(ρ_n + q) χ(ρ_n < ∞)) ≥ 0.

By Fatou's lemma E(Λ((ρ + q)−) χ(ρ < ∞)) = 0. Hence for every q ≥ 0, almost surely,

    Λ((ρ + q)−) χ(ρ < ∞) = 0.    (4.16)
Hence, outside a set with P-measure zero, if ρ(ω) ≤ t < ∞ then Λ₋(t, ω) = 0; hence if ρ(ω) < t < ∞ then Λ(t, ω) = 0. Therefore τ(ω) ≤ ρ(ω) and

    Q(t)({Λ₋(t) = 0}) ≤ Q(t)({ρ ≤ t}) = ∫_{{ρ≤t}} Λ(t) dP ≤ ∫_{{τ≤t}} Λ(t) dP = 0.

With the same argument as above one can easily prove that Q(Λ₋(t) = 0 for some t) = 0. If for some ω the trajectories Λ(ω) and Λ₋(ω) are positive, then, as Λ(ω) is right-regular, Λ⁻¹(ω) is also right-regular; therefore it is bounded on any finite interval⁷⁸. Hence if V ∈ V then Λ⁻¹ • V is well-defined and Λ⁻¹ • V ∈ V under Q.

2. Assume that for some right-regular, adapted process N the product NΛ is a local martingale under P. We show that N is a local martingale under Q. Let σ be a stopping time and let us assume that the truncated process (ΛN)^σ is a martingale⁷⁹ under P. If F ∈ F_{σ∧t} and r ≥ t, then
    ∫_F N^σ(t) dQ = ∫_F N^σ(t) Λ^σ(t) dP = ∫_F N^σ(r) Λ^σ(r) dP = ∫_F N^σ(r) dQ.

Hence N^σ is a martingale under Q with respect to the filtration (F_{σ∧t})_t. We show that it is a martingale under Q with respect to the filtration F. Let ρ be a bounded stopping time under F. We show that τ ≜ ρ ∧ σ is a stopping time under (F_{σ∧t})_t. One should show that {ρ ∧ σ ≤ t} ∈ F_{σ∧t}. By definition this means that {ρ ∧ σ ≤ t} ∩ {σ ∧ t ≤ r} ∈ F_r. If t ≤ r then this is true, as ρ ∧ σ and σ ∧ t are stopping times. If t > r then the set above is {σ ≤ r} ∈ F_r. By the Optional Sampling Theorem, using that τ ≜ ρ ∧ σ is a stopping time under (F_{σ∧t})_t and N^σ is a Q-martingale under this filtration,

    ∫_Ω N^σ(0) dQ = ∫_Ω N^σ(τ) dQ = ∫_Ω N^σ(ρ) dQ.

⁷⁸ See: Proposition 1.6, page 5. ⁷⁹ See: Lemma 4.54, page 265.
This implies that N^σ is a martingale under Q. Hence N is a local martingale under Q.

3. To simplify the notation let L(0) = 0, from which L̃(0) = 0. Integrating by parts,

    LΛ = L₋ • Λ + Λ₋ • L + [L, Λ].    (4.17)

Λ and L are local martingales under P, so the stochastic integrals on the right-hand side are local martingales under P. Let

    ã ≜ a⁻¹ if a > 0, and ã ≜ 0 if a = 0,    (4.18)

and let

    A ≜ Λ̃ • [L, Λ].    (4.19)
A is almost surely finite under Q, as Λ > 0 and Λ₋ > 0 almost surely under Q. But we are now defining A under P, and with positive probability Λ̃ can be unbounded on some finite intervals under P; hence we do not know that A is well-defined under P. To solve this problem, let us observe that (ρ_n) in (4.16) is a localizing sequence under Q, and one can localize L̃. So it is sufficient to prove that (L̃)^{ρ_n} = (L^{ρ_n})˜ is a local martingale under Q for every n. For L^{ρ_n} the integral (4.19) is well-defined, so one can assume that A is finite. Again integrating by parts, noting that Λ is right-continuous,

    ΛA = A₋ • Λ + Λ₋ • A + [A, Λ] = A₋ • Λ + Λ₋ • A + Σ ∆A ∆Λ
       = A₋ • Λ + Λ₋ • A + ∆Λ • A = A₋ • Λ + Λ • A
       = A₋ • Λ + ΛΛ̃ • [L, Λ] = A₋ • Λ + χ(Λ > 0) • [L, Λ].

Finally⁸⁰

    ΛU^p = U^p₋ • Λ + Λ₋ • U^p + [U^p, Λ] = U^p₋ • Λ + Λ₋ • U^p + Σ ∆U^p ∆Λ
         = U^p₋ • Λ + Λ₋ • U^p + ∆U^p • Λ = U^p • Λ + Λ₋ • U^p
         = U^p • Λ − Λ₋ • (U − U^p) + Λ₋ • U.

The stochastic integrals with respect to local martingales are local martingales, and the sum of local martingales is a local martingale, so

    ΛL̃ = ΛL − ΛA + ΛU^p = local martingale + [L, Λ] − χ(Λ > 0) • [L, Λ] + Λ₋ • U.

Observe that the last terms are

    χ(Λ = 0) • [L, Λ] + Λ₋ • U = χ(t ≥ τ) • [L, Λ] + Λ₋(τ) ∆L(τ) χ(t ≥ τ)
      = χ(t ≥ τ) ∆L(τ) ∆Λ(τ) + Λ₋(τ) ∆L(τ) χ(t ≥ τ) = Λ(τ) ∆L(τ) χ(t ≥ τ) = 0,

where we have used that [L, Λ] is constant⁸¹ on {t ≥ τ} and that Λ(τ) = Λ₋(τ) + ∆Λ(τ) = 0 on {τ < ∞}. Hence ΛL̃ is a local martingale under P. So by the second part of the proof L̃ is a local martingale under Q.

⁸⁰ See: Example 4.39, page 249.
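A minimal one-period discrete sketch makes the correction term in Proposition 4.55 visible (everything below is an assumed toy model, not the book's construction): under P the jump of L is ±1 with probability 1/2, Q tilts the up-probability to q, Λ never vanishes, so U^p = 0 and the claim reduces to L̃ = L − Λ⁻¹ • [L, Λ] having Q-mean zero.

```python
# One-period sanity check of Proposition 4.55 on a two-point space (toy model).
q = 0.7                                   # assumed Q up-probability
P = {+1: 0.5, -1: 0.5}
Q = {+1: q, -1: 1 - q}
Lam0 = 1.0
Lam1 = {w: Q[w] / P[w] for w in P}        # density dQ(1)/dP(1)

def Ltilde1(w):
    dL, dLam = w, Lam1[w] - Lam0          # jumps of L and of Lambda
    # L - Lambda^{-1} . [L, Lambda]; the integrand is evaluated at the jump time
    return w - (1.0 / Lam1[w]) * dL * dLam

EQ = sum(Q[w] * Ltilde1(w) for w in P)
assert abs(EQ) < 1e-12                    # Ltilde is a Q-martingale over one step
print("one-period Girsanov check passed:", EQ)
```

Note that L itself is *not* a Q-martingale here (E_Q(L(1)) = 2q − 1 ≠ 0 for q ≠ 1/2); the compensating term Λ⁻¹ • [L, Λ] removes exactly this drift.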
Corollary 4.56 Let Q ≪_loc P and P ≪_loc Q, that is, let us assume that Q ∼_loc P. If

    Λ(t) ≜ dQ(t)/dP(t),

then Λ > 0. If L is a local martingale under the measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under Q and

    L̃ ≜ L − Λ⁻¹ • [L, Λ]

is a local martingale under the measure Q.

Corollary 4.57 Let Q ≪_loc P. If

    Λ(t) ≜ dQ(t)/dP(t)

and L is a continuous local martingale under the measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under the measure Q and

    L̃ ≜ L − Λ⁻¹ • [L, Λ]

is a local martingale under the measure Q.

⁸¹ See: Corollary 2.49, page 145.
If V ∈ V under P and Q ≪_loc P, then obviously V ∈ V under Q. Hence the proof of the following observation is trivial:

Corollary 4.58 If X is a semimartingale under P and Q ≪_loc P, then X is a semimartingale under Q.

Let V ∈ V and assume that the integral H • V exists under the measure P. By definition this means that the pathwise integrals (H • V)(ω) exist almost surely under P. If Q ≪_loc P then the integral H • V exists under the measure Q as well, and the values of the two processes are almost surely the same under Q. It is not too surprising that this is true for any semimartingale.

Proposition 4.59 Let X be an arbitrary semimartingale and let H be a predictable process. Assume that the integral H • X exists under the measure P. If Q ≪_loc P then the integral H • X exists under the measure Q as well, and the two integral processes are indistinguishable under the measure Q.

Proof. By the remark above it is obviously sufficient to prove the proposition when X ∈ L under P. It is also sufficient to prove that for every T > 0 the two integrals exist on the interval [0, T] and they are almost surely equal.

1. Let X = Xᶜ + Xᵈ be the decomposition of X into continuous and purely discontinuous local martingales. As the time horizon is finite, Λ is a uniformly integrable martingale. Recall that if L is a local martingale under the measure P then

    L̃ ≜ L − Λ⁻¹ • [L, Λ] + U^p    (4.20)
is a local martingale under the measure Q, and if L is continuous then U^p can be dropped. Now

    X − Λ⁻¹ • [X, Λ] + U^p = Xᶜ + Xᵈ − Λ⁻¹ • [Xᶜ + Xᵈ, Λ] + U^p
      = (Xᶜ − Λ⁻¹ • [Xᶜ, Λ]) + (Xᵈ − Λ⁻¹ • [Xᵈ, Λ] + U^p).

By (4.20) the processes

    X̃ᶜ ≜ Xᶜ − Λ⁻¹ • [Xᶜ, Λ]    and    X̃ᵈ ≜ Xᵈ − Λ⁻¹ • [Xᵈ, Λ] + U^p

are local martingales under the measure Q. Xᶜ is continuous, hence the quadratic co-variation [Xᶜ, Λ] is also continuous⁸². Hence X̃ᶜ is continuous. If W and V

⁸² See: line (3.19), page 222.
are pure quadratic jump processes then

    [W + V] = [W] + 2[W, V] + [V] = Σ (∆W)² + 2Σ ∆W ∆V + Σ (∆V)² = Σ (∆(W + V))²,

hence W + V is also a pure quadratic jump process. Processes with finite variation are pure quadratic jump processes⁸³, hence X̃ᵈ is a pure quadratic jump process under P. Under the change of measure the quadratic variation does not change, hence X̃ᵈ is a pure quadratic jump process under Q. Hence X̃ᵈ is a purely discontinuous local martingale under Q. We want to show that H • X̃ exists under Q; this means that H • X̃ exists on (0, t] for every t. To prove this one need only prove that the integrals H • X̃ᶜ and H • X̃ᵈ exist under Q.

2. X̃ᶜ is a continuous local martingale, hence H • X̃ᶜ exists under Q if and only if … Λ(τ)) ≤ c.

5. Z₋ is locally bounded. Let (ρ_n) be a localizing sequence of Z₋ and let

    τ_n ≜ inf{s : Λ(s) > n} ∧ ρ_n ∧ n.    (4.21)
τ_n is a bounded stopping time and if s < τ_n(ω) then Λ(s, ω) ≤ n. Hence, using the estimate just proved,

    E_Q(Z(τ_n−)) = E(Z(τ_n−) dQ/dP) = E(Z(τ_n−) E(dQ/dP | F_{τ_n})) = E(Z(τ_n−) Λ(τ_n))
      ≤ k_n · E(Λ(τ_n)) = k_n · E(χ(τ_n > 0) Λ(τ_n) + χ(τ_n = 0) Λ(τ_n)) ≤ k_n · (n + E(Λ(0))) < ∞.

⁸⁴ See: Corollary 1.87, page 54.
6. We show that ∆U^p = 0. The stopping time τ can be covered by its predictable and totally inaccessible parts, so one can assume that τ is either totally inaccessible or predictable. If τ is predictable then χ([τ]) is predictable, therefore

    ∆(U^p) = ^p(∆U) = ^p(∆X(τ) χ([τ])) = ^p(∆X · χ([τ])) = (^p∆X) · χ([τ]) = 0 · χ([τ]) = 0.

If τ is totally inaccessible then P(τ = σ) = 0 for every predictable stopping time σ, hence

    ^p(∆Xχ([τ]))(σ) ≜ E((∆Xχ([τ]))(σ) | F_{σ−}) = E(0 | F_{σ−}) = 0,

so ∆U^p = ^p(∆Xχ([τ])) = 0. Therefore in both cases ∆U^p = 0.
7. X̃ᵈ is purely discontinuous, hence [X̃ᵈ] = Σ (∆X̃ᵈ)² and

    ∆X̃ᵈ = ∆Xᵈ − Λ̃ ∆[Xᵈ, Λ] + ∆U^p.

Since ∆U^p = 0,

    ∆X̃ᵈ = ∆Xᵈ − Λ̃ · ∆Xᵈ ∆Λ = ∆Xᵈ (1 − Λ̃ · ∆Λ) = ∆Xᵈ (χ(Λ = 0) + Λ̃ · Λ₋).

√(H² • [Xᵈ]) ∈ A_loc under P. One can assume that τ_n localizes √(H² • [Xᵈ]) in (4.21). Therefore one may assume that

    E(√((H² • [Xᵈ])(τ_n))) < ∞,

hence

    E(|∆Xᵈ(τ_n) H(τ_n)|) ≤ E(√((H² • [Xᵈ])(τ_n))) < ∞.

Using this,
    E_Q(√(∆(H² • [X̃ᵈ])(τ_n))) = E(√(H² (∆X̃ᵈ)²)(τ_n) Λ(τ_n)) = E(|H ∆X̃ᵈ|(τ_n) Λ(τ_n))
      = E(|H ∆Xᵈ|(τ_n) (χ(Λ = 0) + Λ̃ Λ₋)(τ_n) Λ(τ_n))
      = E(|H ∆Xᵈ|(τ_n) (Λ̃ Λ₋ Λ)(τ_n))
      ≤ E(|H(τ_n) ∆Xᵈ(τ_n)| Λ(τ_n−)) ≤ n · E(|∆Xᵈ(τ_n) H(τ_n)|) < ∞.
8. As √(x + y) ≤ √x + √y,

    E_Q(Z(τ_n)) ≜ E_Q(√((H² • [X̃ᵈ])(τ_n))) ≤ E_Q(Z(τ_n−)) + E_Q(√(∆(H² • [X̃ᵈ])(τ_n))) < ∞.
Therefore Z ∈ A_loc under the measure Q.

9. Let us consider the decomposition

    X = X̃ + (Λ̃ • [X, Λ] − U^p) ≜ X̃ + A − U^p

and let us assume that the integral H • X exists under the measure P. As the integral H • X̃ exists under Q, one should prove that the Lebesgue–Stieltjes integrals H • A and H • U^p also exist. By the inequality of Kunita and Watanabe

    ∫₀ᵀ |H| dVar(A) = ∫₀ᵀ |H| Λ̃ dVar([X, Λ]) ≤ √(∫₀ᵀ |H|² Λ̃ d[X]) √(∫₀ᵀ Λ̃ d[Λ]) = √(∫₀ᵀ Λ̃ d(|H|² • [X])) √(∫₀ᵀ Λ̃ d[Λ]).

Λ > 0 and Λ₋ > 0 almost surely under Q, that is, almost all trajectories of Λ and Λ₋ are positive⁸⁵, hence Λ̃ has regular trajectories almost surely under Q. Hence almost surely the trajectories of Λ̃ are bounded on every finite interval, therefore the expression ∫₀ᵀ Λ̃ d[Λ] is finite. Similarly, as H • X exists, R ≜ |H|² • [X] ∈ V, hence Λ̃ • R is finite under Q. That is, for every trajectory ∫₀ᵀ |H| dVar(A) < ∞, hence H • A exists under Q. Let σ be a stopping time in a localizing sequence of √(H² • [X]). Then

    E((|H| • U^p)(σ)) = E((|H| • U)(σ)) ≤ E(√((H² • [X])(σ))) < ∞.

Hence H • U^p is almost surely finite under P, so it is almost surely finite under Q. Therefore the integral H • X exists under Q.

10. Let us denote by (P) H • X and by (Q) H • X the value of H • X under P and under Q respectively. Let us denote by H the set of processes H for

⁸⁵ See: Proposition 4.55, page 266.
which (P) H • X and (Q) H • X are indistinguishable under Q. From the Dominated Convergence Theorem and from the linearity of the stochastic integral it is obvious that H is a λ-system which contains the π-system of the elementary processes. From the Monotone Class Theorem it is clear that H contains all the bounded predictable processes.

11. If H_n ≜ Hχ(|H| ≤ n) then H_n is bounded, hence the value of the integral (P) H_n • X is Q-almost surely equal to the integral (Q) H_n • X. As H • X exists under P and under Q, by the Dominated Convergence Theorem, uniformly in probability on compact intervals, (P) H_n • X → (P) H • X and (Q) H_n • X → (Q) H • X. Stochastic convergence under P implies⁸⁶ stochastic convergence under Q, hence (P) H • X = (Q) H • X almost surely under Q.

Let us prove some consequences of the proposition. During the construction of the stochastic integral we emphasized that we cannot define the integral pathwise. But this does not mean that the integral is not determined by the trajectories of the integrator and the integrand.

Corollary 4.60 Let X and X′ be semimartingales. Assume that for the predictable processes H and H′ the integrals H • X and H′ • X′ exist. If

    A ≜ {ω : H(ω) = H′(ω)} ∩ {ω : X(ω) = X′(ω)},

then the processes H • X and H′ • X′ are indistinguishable on A.

Proof. One may assume that P(A) > 0. Define the measure

    Q(B) ≜ P(A ∩ B) / P(A).

Obviously Q ≪ P. The processes H, H′ and X, X′ are indistinguishable under Q. Hence the processes (Q) H • X and (Q) H′ • X′ are indistinguishable under Q. By the proposition above, under Q, up to indistinguishability,

    (P) H • X = (Q) H • X = (Q) H′ • X′ = (P) H′ • X′,

which means that (P) H • X = (P) H′ • X′ on A.

The proof of the following corollary is similar:

Corollary 4.61 Let X be a semimartingale and let us assume that the integral H • X exists. If on a set B the trajectories of X have finite variation, then almost surely on B the trajectories of H • X are equal to the pathwise integrals of H with respect to X.

⁸⁶ A sequence is stochastically convergent if and only if every subsequence of the sequence has another subsequence which is almost surely convergent to the same fixed random variable.
THE PROOF OF DAVIS’ INEQUALITY
4.5
277
The Proof of Davis’ Inequality
In this section we prove the following inequality: Theorem 4.62 (Davis’ inequality) There are positive constants c and C such that for any local martingale L ∈ L and for any stopping time τ
"
" [L] (τ ) ≤ E sup |L (t)| ≤ C · E [L] (τ ) . c·E t≤τ
Example 4.63 In the inequality one cannot write |L| (τ ) in the place of supt≤τ |L|.
If w is a Wiener process and τ inf {t : w (t) = 1} then L wτ is a martingale. E (L (t)) = 0 for every t, hence
L (t)1 = E (|L(t)|) = 2E L+ (t) ≤ 2. On the other hand if t → ∞ " √
√ τ ∧t →E τ . [L] (t) = E 1
The density function87 of τ is
1 exp − f (x) = √ 3 2x 2x π √ hence the expected value of τ is 1
E
,
x > 0,
1 exp − dx = 2x 2x3 π 0 ∞ 1 1 1 √ exp − = dx = 2x 2π x 0 ∞ u
1 1 exp − du = ∞. =√ 2 2π 0 u
√ τ =
∞
√
x√
1
If σ is an arbitrary stopping time then in place of L one can write Lσ in the inequality. On the other hand if for some localizing sequence σ n ∞ the inequality is true for all Lσn then by the Monotone Convergence Theorem it is true for L as well. By the Fundamental Theorem of Local Martingales L ∈ L has a 2 decomposition L = H + A where H ∈ Hloc and A ∈ Aloc . With localization 2 one can assume that H ∈ H and A ∈ A. L− is left-regular, hence it is locally 87 See:
(1.58) on page 83.
278
GENERAL THEORY OF STOCHASTIC INTEGRATION
bounded, so with further localization of the inequality one can assume that L− is bounded. It suffices to prove the inequality on any finite time horizon [0, T ]. It is
suffi(n) is an cient to prove the inequality for finite, discrete-time horizons: If tk infinitesimal sequence of partitions of [0, T ] then trivially
(n) E sup L tk E sup |L (t)| . (n)
t≤T
tk ≤T
Recall that as L(0) = 0 at any time t the quadratic variation [L] is the limit in probability of the sequence (n)
[L]
(t)
2 ( (n) (n) L tk ∧ t − L tk−1 ∧ t = k
= L2 (t) − 2
(
(n) (n) (n) L tk−1 ∧ t L tk ∧ t − L tk−1 ∧ t .
k
If Yn (t)
(n) (n) (n) L tk−1 ∧ t χ tk−1 ∧ t, tk ∧ t ,
k
then the sum in the above expression is (Yn • L) (t). Obviously Yn → L− and |Yn (t)| ≤ sup |L− (s)| ≤ k. s≤t
Repeating the proof of the Dominated Convergence Theorem we prove that for all t (Yn • L) (t) → (L− • L) (t) in L1 (Ω). As (Yn ) is uniformly bounded, by Itˆ o’s isometry the convergence Yn • H → L− • H holds in H2 and therefore L2
(Yn • H) (t) → (L− • H) (t). Obviously |(Yn • A) (t) − (L− • A) (t)| ≤ 2k · Var (A) (t) . As A ∈ A by the classical Dominated Convergence Theorem L1
(Yn • A) (t) → (L− • A) (t) .
THE PROOF OF DAVIS’ INEQUALITY
279
Therefore, as we said, L1
(Yn • A) (T ) → (L− • A) (T ) . (n)
L1
Hence [L] (T ) → [L] (T ) , so by Jensen’s inequality ) ) "
" (n) E ≤ E [L](n) (T ) − [L] (T ) ≤ [L] (T ) − E [L] (T ) % (n) ≤E [L] (T ) − [L] (T ) ≤
%
(n) ≤ E [L] (T ) − [L] (T ) → 0. This means that if the inequality holds in discrete-time then it is true in continuous-time. 4.5.1
Discrete-time Davis’ inequality
Up to the end of this section we assume that if M is a martingale then M (0) = 0. Definition 4.64 Let us first introduce some notation. For any sequence M (Mn ) ∆Mn Mn − Mn−1 . If M (Mn ) is a discrete-time martingale then (∆Mn ) is the martingale difference of M . [M ]n
n k=1
Mn∗
2
(∆Mk ) =
n
2
(Mk − Mk−1 )
k=1
sup |Mk | k≤n
for any n. If n is the maximal element in the parameter set or n = ∞ then we drop the subscript n. With this notation the discrete-time Davis’ inequality has the following form: Theorem 4.65 (Discrete-time Davis’ inequality ) There are positive constants c and C such that for every discrete-time martingale M for which M (0) = 0 "
"
c·E [M ] ≤ E (M ∗ ) ≤ C · E [M ] .
280
GENERAL THEORY OF STOCHASTIC INTEGRATION
The proof of the discrete-time Davis’ inequality is a simple but lengthy88 calculation. Let us first prove two lemmas: Lemma 4.66 Let M (Mn , Fn ) be a martingale and let V (Vn , Fn−1 ) be a predictable sequence89 , for which |∆Mn | |Mn − Mn−1 | ≤ Vn . If λ > 0 and 0 < δ < β − 1 then
" P M ∗ > βλ, [M ] ∨ V ∗ ≤ δλ ≤ P
"
[M ] > βλ, M ∗ ∨ V ∗ ≤ δλ ≤
2δ 2 (β − δ − 1)
2 P (M
∗
> λ) ,
" 9δ 2 [M ] > λ . P 2 β −δ −1 2
Proof. The proof of the two inequalities are similar. 1. Let us introduce the stopping times µ inf {n : |Mn | > λ} , ν inf {n : |Mn | > βλ} , ) σ inf n : [M ]n ∨ Vn+1 > δλ . For every j c
Fj {µ < j ≤ ν ∧ σ} = {µ < j} ∩ {ν ∧ σ < j} ∈ Fj−1 , hence if Hn
n
∆Mj χFj ,
j=1
then n ∆Mj χFj | Fn−1 ) = E (Hn | Fn−1 ) E( j=1
=
n−1 j=1
88 And 89 That
boring. is Vn is Fn−1 -measurable.
∆Mj χFj + E(∆Mn χFn | Fn−1 ) =
THE PROOF OF DAVIS’ INEQUALITY
=
n−1
281
∆Mj χFj + χFn E(∆Mn | Fn−1 ) =
j=1
=
n−1
∆Mj χFj Hn−1 ,
j=1
therefore (Hn ) is a martingale. By the assumptions of the lemma |∆Mj | ≤ Vj , hence by the definition of σ
2 [H]n ≤ [M ]σ = [M ]σ−1 + (∆Mσ ) χ (σ < ∞) + [M ]σ χ (σ = ∞) ≤
≤ [M ]σ−1 + Vσ2 χ (σ < ∞) + [M ]σ χ (σ = ∞) ≤ ≤ 2δ 2 λ2 . {M ∗ ≤ λ} = {µ = ∞} hence on this set H = 0 so [H] = 0. Therefore E ([H]) = E ([H] χ (M ∗ > λ) + [H] χ (M ∗ ≤ λ)) = = E ([H] χ (M ∗ > λ)) ≤ 2δ 2 λ2 P (M ∗ > λ) . Observe that Fj ∩ {ν < ∞, σ = ∞} = {µ < j ≤ ν} ∩ {ν < ∞, σ = ∞} hence on the set {ν < ∞, σ = ∞} Hn = Mν∧n − Mµ∧n . On {ν < ∞} obviously supn |Mν∧n | ≥ λβ. On {σ = ∞} by definition V ∗ ≤ δλ, hence |Mµ | = |Mµ−1 + ∆Mµ | ≤ λ + δλ. This implies that on the set {ν < ∞, σ = ∞} H ∗ = sup |Mν∧n − Mµ∧n | > λβ − λ (δ + 1) = λ (β − (1 + δ)) . n
282
GENERAL THEORY OF STOCHASTIC INTEGRATION
By Doob’s inequality90 using the definition of ν and σ
" P1 P M ∗ > βλ, [M ] ∨ V ∗ ≤ δλ = = P (ν < ∞, σ = ∞) ≤
2 E H∞
∗
≤ P (H > λ (β − (1 + δ))) ≤ ≤ ≤
E ([H]) λ (β − 1 − δ) 2
2
2
λ2 (β − 1 − δ)
≤
≤
2δ 2 λ2 P (M ∗ > λ) 2
λ2 (β − (1 + δ))
=
2δ 2 (β − 1 − δ)
2 P (M
∗
> λ) ,
which is the first inequality. 2. Analogously, let us introduce the stopping times ) µ inf n : [M ]n > λ ,
) ν inf n : [M ]n > βλ ,
σ inf {n : Mn∗ ∨ Vn+1 > δλ} . Again for all j let Fj {µ < j ≤ ν ∧ σ } . As Fj ∈ Fj−1 Gn
n
∆Mj χFj
j=1
is again a martingale. If µ ≥ σ then G∗ = 0. Hence if σ < ∞ then G∗ = G∗ χ (µ < σ ) ≤
≤ Mµ∗ + Mσ∗ χ (µ < σ ) ≤
≤ Mσ∗ −1 + Mσ∗ χ (µ < σ ) =
= Mσ∗ −1 + Mσ∗ −1 + ∆Mσ∗ χ (µ < σ ) ≤
≤ Mσ∗ −1 + Mσ∗ −1 + Vσ χ (µ < σ ) ≤ ≤ δλ + δλ + δλ = 3δλ. 90 See:
line (1.14), page 33.
THE PROOF OF DAVIS’ INEQUALITY
283
If σ = ∞ then of course σ − 1 is meaningless, but in this case obviously
∗ Mµ + Mσ∗ χ (µ < σ ) ≤ 2δλ, so in this case the inequality G∗ ≤ 3δλ still holds. On the set
" [M ] ≤ λ =
{µ = ∞} obviously G∗ = 0. "
"
2 2 2 E (G∗ ) = E (G∗ ) χ [M ] > λ + (G∗ ) χ [M ] ≤ λ = "
"
2 [M ] > λ ≤ 9δ 2 λ2 P [M ] > λ . = E (G∗ ) χ On the set {ν < ∞, σ = ∞} [G]n = [M ]ν ∧n − [M ]µ ∧n . By this using that ν < ∞ and σ = ∞ 2
2
[G] > (βλ) − [M ]µ −1 − (∆Mµ ) ≥ 2
2
≥ (βλ) − λ2 − (Vµ ) ≥
2 ≥ (βλ) − 1 + δ 2 λ2 . By Markov’s inequality and by the energy identity91 "
P2 P [M ] > βλ, M ∗ ∨ V ∗ ≤ δλ = = P (ν < ∞, σ = ∞) ≤
≤ P [G] > λ2 β 2 − 1 + δ 2 ≤
E ([G]) =
λ β − 1 + δ2
2 ∗ 2 ) E (G E G ≤ 2 2 ≤ = 2 2 λ β − 1 + δ2 λ β − 1 + δ2 "
9δ 2 P [M ] > λ . ≤ 2 2 β − (1 + δ) 2
2
Lemma 4.67 Let M (Mn , Fn ) be a martingale and let assume that M0 = 0. If dj ∆Mj Mj − Mj−1 ,
aj dj χ |dj | ≤ 2d∗j−1 − E dj χ |dj | ≤ 2d∗j−1 | Fj−1 ,
bj dj χ |dj | > 2d∗j−1 − E dj χ |dj | > 2d∗j−1 | Fj−1 E G2 = ∞ then the inequality is true, otherwise one can use Proposition 1.58 on page 35. 91 If
284
GENERAL THEORY OF STOCHASTIC INTEGRATION
then the sequences Gn
n
aj
and
j=1
Hn
n
bj ,
j=1
are F-martingales, M = G + H and |aj | ≤ 4d∗j−1 , ∞
(4.22)
dj χ |dj | > 2d∗j−1 ≤ 2d∗ ,
(4.23)
j=1 ∞
E (|bj |) ≤ 4E (d∗ ) .
(4.24)
j=1
Proof. As M0 = 0 n
dj
j=1
n
∆Mj = Mn − M0 = Mn .
j=1
One should only prove the three inequalities, since from this identity the other parts of the lemma are obvious92 . 1. (4.22) is evident. / . 2. |dj | + 2d∗j−1 ≤ 2 |dj | on |dj | > 2d∗j−1 , hence ∞ ∞
dj χ |dj | > 2d∗j−1 ≤ 2 |dj | − 2d∗j−1 χ |dj | > 2d∗j−1 ≤ j=1
j=1
≤2
∞
d∗j − d∗j−1 = 2d∗ ,
j=1
which is exactly (4.23). 3. ∞ j=1
E (|bj |) ≤
∞
E |dj | χ |dj | > 2d∗j−1 +
j=1
+
∞
E E dj χ |dj | > 2d∗j−1 | Fj−1 .
j=1 92 For any sequence (ξ , F ) E (ξ | F n n−1 ) = 0 if and only if (ξ n , Fn ) n n difference sequence.
is a martingale
THE PROOF OF DAVIS’ INEQUALITY
285
If in the second sum we bring the absolute value into the conditional expectation, then ∞ ∞
E (|bj |) ≤ 2E |dj | χ |dj | > 2d∗j−1 . j=1
j=1
By (4.23) the expression in the conditional expectation is not larger than 2d∗ , from which (4.24) is evident. The proof of the discrete-time Davis’ inequality: Let M = H + G be n the decomposition of the previous lemma. Gn j=1 aj is a martingale, |aj | ≤ 4d∗j−1 , hence by the first lemma, if λ > 0 and 0 < δ < β − 1, then
" P G∗ > βλ, [G] ∨ 4d∗ ≤ δλ ≤ P
"
[G] > βλ, G∗ ∨ 4d∗ ≤ δλ ≤
2δ 2 (β − δ − 1)
2 P (G
∗
> λ) ,
" 9δ 2 [G] > λ . P β 2 − δ2 − 1
Hence for any λ > 0 P (G∗ > βλ) ≤ P +
"
[G] > δλ + P (4d∗ > δλ) + 2δ 2
(β − δ − 1)
2 P (G
∗
> λ) ,
and P
"
[G] > βλ ≤ P (G∗ > δλ) + P (4d∗ > δλ) + +
" 9δ 2 [G] > λ . P β 2 − δ2 − 1
Integrating w.r.t. λ and using that if ξ ≥ 0 then ∞ ∞ E (ξ) = 1 − F (x)dx = P(ξ > x)dx, 0
0
one has that ∗
E (G ) ≤ β
E
"
[G] +
δ +
2δ 2 (β − δ − 1)
4E (d∗ ) + δ 2 E (G
∗
),
286
GENERAL THEORY OF STOCHASTIC INTEGRATION
and E
"
[G]
β
≤
E (G∗ ) 4E (d∗ ) + + δ δ "
9δ 2 + 2 E [G] . β − δ2 − 1
For the stopped martingale Gn the expected values in the inequalities are finite, hence one can reorder the inequalities
2
1 2δ − 2 β (β − δ − 1)
E (G∗n ) ≤
E
"
[G]n
δ
+
4E (∆Mn∗ ) . δ
and
1 9δ 2 − 2 β β − δ2 − 1
E
"
E (G∗ ) 4E (∆M ∗ ) n n + . [G]n < δ δ
If δ is small enough then the constants on the left-hand side are positive, hence we can divide by them. Hence if n ∞ then by the Monotone Convergence Theorem "
∗ E (G∗ ) ≤ A1 E [G] + A2 E (∆M ) , "
∗ E [G] ≤ B1 E (G∗ ) + B2 E (∆M ) . By the second lemma E (M ∗ ) ≤ E (G∗ + H ∗ ) ≤ ≤ E (G∗ ) + E (|bj |) ≤ E (G∗ ) + 4E (d∗ ) ≤ "
j
∗ ∗ ≤ A1 E [G] + A2 E (∆M ) + 4E (∆M ) , "
" "
E [M ] ≤ E [G] + [H] ≤ " "
≤E [G] + E (|bj |) ≤ E [G] + 4E (d∗ ) ≤ j
∗ ∗ ≤ B1 E (G ) + B2 E (∆M ) + 4E (∆M ) . ∗
THE PROOF OF DAVIS’ INEQUALITY
287
As G = M − H by the second lemma again E (G∗ ) ≤ E (M ∗ ) + E (H ∗ ) ≤ E (M ∗ ) +
∗ ≤ E (M ∗ ) + 4E (∆M )
∞
E (|bj |) ≤
j=1
and E
"
∞
"
"
"
[G] ≤ E [M ] + E [H] ≤ E [M ] + E (|bj |) ≤
"
∗ ≤E [M ] + 4E (∆M ) .
j=1
From this with simple calculation E (M ∗ ) ≤ A1 E
"
"
∗ [M ] + A3 E (∆M ) ≤ A · E [M ] ,
and E
"
∗ [M ] ≤ B1 E (M ∗ ) + B3 E (∆M ) ≤ B · E (M ∗ ) ,
from which Davis’ inequality already follows, trivially. 4.5.2
Burkholder’s inequality
One can extend Davis’ inequality in such a way that instead of the L1 (Ω)-norm one can write the Lp (Ω)-norm for every p ≥ 1. Theorem 4.68 (Burkholder’s inequality) For any p > 1 there are constants cp and Cp , such that for every local martingale L ∈ L and for every stopping time τ " " cp [L] (τ ) ≤ sup |L (t)| ≤ Cp [L] (τ ) . p
t≤τ
p
p
During the proof of the inequality we shall use the next result: Lemma 4.69 Let A be a right-regular, non-negative, increasing, adapted process and let ξ be a non-negative random variable. Assume that almost surely for every t E (A (∞) − A (t) | Ft ) ≤ E (ξ | Ft )
(4.25)
288
GENERAL THEORY OF STOCHASTIC INTEGRATION
and ∆A (t) ≤ ξ. Then for every p ≥ 1 A (∞)p ≤ 2p ξp .
(4.26)
Proof. A is increasing, so for every n χ (A (t) ≥ n) (A (∞) − A (t)) = (A ∧ n) (∞) − (A ∧ n) (t) . So if (4.26) holds for some A then it holds for A ∧ n. Hence one can assume that A is bounded, since otherwise we can replace A with A ∧ n and in (4.26) one can take n ∞. If ξ is not integrable then the inequality trivially holds. Hence one can assume that ξ is integrable. 1. As ξ is integrable E (ξ | Ft ) is a uniformly integrable martingale. As A is bounded E (A (∞) − A (t) − ξ | Ft ) = E (A (∞) | Ft ) − E (ξ | Ft ) − A (t) is a uniformly integrable, non-positive supermartingale. By the Optional Sampling Theorem for every stopping time τ E (A (∞) − A (τ ) | Fτ ) ≤ E (ξ | Fτ ) .
(4.27)
Let x > 0 and let τ x inf {t : A (t) ≥ x} . Obviously A (τ x −) ≤ x. By (4.27) E ((A (∞) − x) χ (x < A (∞))) ≤ E ((A (∞) − x) χ (τ x < ∞)) = ≤ E ((A (∞) − A (τ x −)) χ (τ x < ∞)) = = E ((A (∞) − A (τ x )) χ (τ x < ∞)) + E (∆A (τ x ) χ (τ x < ∞)) ≤ ≤ E (ξχ (τ x < ∞)) + E (ξχ (τ x < ∞)) ≤ ≤ 2E (ξχ (x ≤ A (∞))) .
THE PROOF OF DAVIS’ INEQUALITY
2. With inequality
simple
calculation
using
Fubini’s
theorem
and
289
H¨ older’s
p
A (∞)p E (Ap (∞)) = pE (Ap (∞)) − (p − 1) E (Ap (∞)) = A(∞) p−2 = p (p − 1) E A (∞) x dx 0
− p (p − 1) E
p−1
x
=
A(∞)
(A (∞) − x) x
= p (p − 1) E
dx
0
= p (p − 1)
A(∞)
p−2
dx
=
0 ∞
E ((A (∞) − x) χ (x < A (∞))) xp−2 dx ≤
0
∞
≤ 2p (p − 1)
E (ξχ (x ≤ A (∞))) xp−2 dx =
0
A(∞)
= 2p (p − 1) E
p−2
ξx
dx
= 2p · E ξAp−1 (∞) ≤
0 p−1
≤ 2p · ξp A (∞)p
. p−1
If A (∞)p > 0 then we can divide both sides by A (∞)p inequality trivially holds.
, otherwise the
Proof of Burkholder’s inequality: Let L be a local martingale. Let B ∈ Ft and let N χB (L − Lt ). N is a local martingale so by Davis’ inequality c·E
"
"
[N ] (∞) ≤ E sup |N (s)| ≤ C · E [N ] (∞) , s
which immediately implies that c·E
"
[L − Lt ] (∞) | Ft ≤ E sup |L − Ls | | Ft ≤ s
≤C ·E
"
[L − Lt ] (∞) | Ft .
Let L∗ (t) sups≤t |L (s)|. Since " " " " " [L] (∞) − [L] (t) ≤ [L] (∞) − [L] (t) = [L − Lt ] (∞) ≤ [L] (∞)
290
GENERAL THEORY OF STOCHASTIC INTEGRATION
and L∗ (∞) − L∗ (s) ≤ sup L − Lt (s) ≤ 2L∗ (∞) s
if A (t)
" [L] (t)
and ξ c−1 2 · L• (∞)
or if A (t) L∗ (t)
and ξ C
" [L] (∞)
then estimation (4.25) in the lemma holds. Without loss of generality one can assume that the constants in the definition of ξ are larger than one. Since for every constant k ≥ 1 " " ∆L∗ ≤ |∆L| = ∆ [L] ≤ k · [L] (∞) " " ∆ [L] ≤ ∆ [L] = |∆L| ≤ k · 2L∗ (∞) in both cases we get that ∆A ≤ ξ. Hence A (∞)p ≤ 2p ξp which is just the two sides of Burkholder’s inequality. p/2
p Corollary 4.70 If L ∈ L and p ≥ 1 then L ∈ Hloc if and only if [L]
∈ Aloc .
Corollary 4.71 If M is a local martingale and for some p ≥ 1 for every sequence of infinitesimal partitions of the interval [0, t] (n)
[M ]
Lp
(t) → [M ] (t) ,
then M ∗ (t) sup |M (s)| ∈ Lp (Ω) s≤t
that is M ∈ Hp on the interval [0, t]. (n)
Proof. Let (Mn ) be a discrete-time of M . If [M ] (t) is con approximation (n) p vergent in L (Ω), then K supn [M ] (t) < ∞. By the Davis–Burkholder p
inequality and by Jensen’s inequality ) % (n) sup |Mn | (s) ≤ Cp [M ](n) (t) ≤ Cp ] (t) [M ≤ L < ∞. s≤t
p
p
p
THE PROOF OF DAVIS’ INEQUALITY
291
For a subsequence sup |Mn | sup |M | , hence by the Monotone Convergence Theorem M ∗ (t)p ≤ L < ∞. Corollary 4.72 If q ≥ 1 and L ∈ Hq is purely discontinuous then L is the Hq -sum of its compensated jumps. Proof. Let us denote by (ρk ) the stopping times exhausting the jumps of L. Let L ∈ Hq be purely discontinuous and let L = Lk where Nk H (ρk ) χ ([ρk , ∞)) and Lk N − Nkp are the the compensated jumps of L. Recall that the convergence holds in the topology of uniform convergence in probability93 . L ∈ Hq so q/2 by Burkholder’s inequality [L] ∈ A and as the compensator Nkp is continuous q/2
[Lk ]
q
(∞) = (∆L (ρk )) ≤ q/2
≤ [L]
2
q/2
(∆L) (∞)
≤
(∞) ∈ L1 (Ω) .
This implies that Lk ∈ Hq . Hq is a vector space hence Yn n > m then ≤ sup Yn − Ym Hq |Y (t) − Y (t)| n m t
n k=1
Lk ∈ Hq . If
q
" ≤ Cp [Yn − Ym ] (∞) = q 2 = Cp (∆L) (s)χ (B \B ) n m , s q
where Bn ∪nk=1 [ρk ].
)
2
(∆L) is in Lq (Ω). Therefore if n, m → ∞ then Yn − Ym Hq → 0.
So (Yn ) is convergent in Hq . Convergence in Hq implies uniform convergence in Hq
probability so obviously Yn → L.
93 See:
Proposition 4.30, page 243.
5 SOME OTHER THEOREMS In this chapter we shall discuss some further theorems from the general theory of stochastic processes. First we shall prove the so-called Doob–Meyer decomposition. By the Doob–Meyer decomposition every integrable submartingale is a semimartingale. We shall also prove the theorem of Bichteler and Dellacherie, which states that the semimartingales are the only ‘good integrators’.
5.1
The Doob–Meyer Decomposition
If A ∈ A+ and M ∈ M then X A + M is a class D submartingale. Since if τ is a finite valued stopping time then |A (τ )| = |A (τ ) − A (0)| ≤ Var (A) (∞) ∈ L1 (Ω) ,
(5.1)
hence the set {X (τ ) : τ < ∞ is a stopping time} is uniformly integrable. The central observation of the stochastic analysis is that the reverse implication is also true: Theorem 5.1 (Doob–Meyer decomposition) If a submartingale X is in class D then X has a decomposition X = X (0) + M + A, where A ∈ A+ , M ∈ M and A is predictable. Up to indistinguishability this decomposition is unique. 5.1.1
The proof of the theorem
We divide the proof into several steps. The proof of the uniqueness is simple. If X (0) + M1 + A1 = X (0) + M2 + A2 292
THE DOOB–MEYER DECOMPOSITION
293
are two decompositions of X then M1 − M2 = A2 − A1 . A2 − A1 is a predictable martingale, hence it is continuous1 . As A2 − A1 has finite variation by Fisk’s theorem2 A1 = A2 , hence M1 = M2 . The proof of the existence is a bit more complicated. Definition 5.2 We say that a supermartingale P is a potential 3 , if 1. P is non-negative and 2. limt→∞ E (P (t)) = 0. Proposition 5.3 (Riesz’s decomposition) If X is a class D submartingale then X has a decomposition X = X (0) + M − P
(5.2)
where P is a class D potential and M is a uniformly integrable martingale. Up to indistinguishability this decomposition is unique. Proof. As X is in class D the set {X (t) : t ≥ 0} is uniformly integrable, hence it is bounded in L1 (Ω). Hence
sup E X + (t) ≤ sup E (|X (t)|) < K. t
t
By the submartingale convergence theorem4 the limit lim X (t) = X (∞) ∈ L1 (Ω)
t→∞
exists. Let us define the variables M (t) E (X (∞) | Ft ). As the filtration satisfies the usual conditions M has a version which is a uniformly integrable martingale. The process P M − X is in class D since it is the difference of two processes of class D. By the submartingale property P (s) M (s) − X (s) ≥ E (M (t) | Fs ) − E (X (t) | Fs ) = = E (M (t) − X (t) | Fs ) . a.s.
If t → ∞, then M (t) − X (t) → 0 and as (M (t) − X (t))t is uniformly integrable the convergence holds in L1 (Ω) as well. By the L1 (Ω)-continuity of the 1 See:
Corollary 3.40, page 205. Theorem 2.11. page 117. 3 Recall that the expected value of the supermartingales is decreasing. 4 See: Corollary 1.72, page 44. 2 See:
294
SOME OTHER THEOREMS
conditional expectation the right-hand side of the inequality almost surely goes a.s.
to zero, that is P (s) ≥ 0. E (P (s)) = E (M (s)) − E (X (s)) → E (M (∞)) − E (X (∞)) = 0, hence P is a potential. Assume that the decomposition is not unique. Let Pi , Mi , i = 1, 2 be two decompositions of X. In this case (P1 − P2 ) (t) = M1 (t) − M2 (t) = E (M1 (∞) − M2 (∞) | Ft ) . L
By the definition of the potential Pi (t) →1 0. Hence if t → ∞, then 0 = E (M1 (∞) − M2 (∞) | F∞ ) = M1 (∞) − M2 (∞) , hence M1 = M2 , so P1 = P2 . It is sufficient to proof the Doob–Meyer decomposition for the potential part of the submartingale. One should prove that if P is a class D potential, then there is one and only one N ∈ M and a predictable process A ∈ A+ for which P = N − A. If it holds then substituting −P = −N + A into line (5.2) we get the needed decomposition of X. From the definition of the potential E (A (t)) = E (N (t)) − E (P (t)) ≤ E (N (∞)) . A ∈ A+ , so A is increasing. 0 = A (0) ≤ A (t) A (∞) where E (A (∞)) < ∞. L1
Hence by the Monotone Convergence Theorem A (t) → A (∞). By the definition L1
of the potential P (t) → P (∞) = 0, hence A (∞) = N (∞). So to prove the theorem it is sufficient to prove that there is a predictable process A ∈ A+ and N ∈ M such that P (t) + A (t) = N (t) = E (N (∞) | Ft ) = E (A (∞) | Ft ) , which holds if there is an A ∈ A+ such that P (t) = E (A (∞) − A (t) | Ft ) . By the definition of the conditional expectation it is equivalent to E (χF (A (∞) − A (t))) = E (χF P (t)) = E (χF (P (t) − P (∞))) ,
F ∈ Ft .
THE DOOB–MEYER DECOMPOSITION
295
Observe that S −P is a submartingale and S (∞) = 0, hence the previous line is equivalent to E (χF (A (∞) − A (t))) = E (χF (S (∞) − S (t))) ,
F ∈ Ft .
(5.3)
For an arbitrary process X on the set of predictable rectangles (s, t] × F,
F ∈ Fs
let us define the set function µX ((s, t] × F ) E (χF (X (t) − X (s))) . Recall5 that the predictable rectangles and the sets {0} × F, F ∈ F0 generate the σ-algebra of the predictable sets P. Let µX ({0} × F ) 0,
F ∈ F0 .
Definition 5.4 If a set function µX has a unique extension to the σ-algebra P which is a measure on P then µX is called6 the Dol´eans type measure of X. Observe that the sets in (5.3) are in the σ-algebra generated by the predictable rectangles. Hence to prove the Doob–Meyer decomposition one should prove the following: Proposition 5.5 If S ∈ D is a submartingale then there is a predictable process A ∈ A+ such that the measure µS of S on the predictable sets is generated by A, that is there is a predictable process A ∈ A+ such that µA (Y ) = µS (Y ) ,
Y ∈ P.
(5.4)
As a first step we prove that µS is really a measure on P. Proposition 5.6 If S is a class D submartingale then the Dol´eans type measure µS of S can be extended from the semi-algebra of the predictable rectangles to the σ-algebra of the predictable sets. Proof. Denote by C the semi-algebra of the predictable rectangles. We want to use Carath´eodory’s extension theorem. To do this we should prove that µS is a measure on C. As S is a submartingale µS is non-negative. µS is trivially additive, hence µS is monotone on C. For all C ∈ C, using that µS is monotone 5 See: 6 See:
Corollary 1.44, page 26. Definition 2.56, page 151.
296
SOME OTHER THEOREMS
and (0, ∞] ∈ C, µS (C) ≤ µS ([0, ∞]) = µS ({0} × Ω) + µS ((0, ∞]) = = µS ((0, ∞]) E (S (∞) − S (0)) ≤ ≤ E (|S (∞)|) + E (|S (0)|) < ∞. Observe that in the last line we used that S is uniformly integrable and therefore S (∞) and S (0) are integrable. As µS is finite it is sufficient to prove that whenever Cn ∈ C, and Cn ∅, then µS (Cn ) 0. Let ε > 0 be arbitrary. If (s, t] × F ∈ C then 1 1 s + , t × F ⊆ s + , t × F ⊆ (s, t] × F. n n S is a submartingale so for every F ∈ Fs 1 E χF S s + − S (s) ≥ 0, n 1 E χF c S s + − S (s) ≥ 0. n S is uniform integrable, hence for the sum of the two sequences above 1 1 − S (s) = E lim S s + − S (s) = lim E S s + n→∞ n→∞ n n = E (S (s+) − S (s)) = 0, hence
1 lim E χF S s + − S (s) =0 n→∞ n so lim µS
n→∞
s+
1 1 , t × F lim E χF S (t) − S s + = n→∞ n n = E (χF [S (t) − S (s)]) µS ((s, t] × F ) .
Hence for every Cn ∈ C there are sets Kn and Bn ∈ C such that Bn ⊆ Kn ⊆ Cn , and for all ω the sections Kn (ω) of Kn are compact and µS (Cn ) < µS (Bn ) + ε2−n .
(5.5)
THE DOOB–MEYER DECOMPOSITION
297
Let us introduce the decreasing sequence Ln ∩k≤n Bk . C is a semi-algebra, hence Ln ∈ C for every n. Let Ln and B n be the sets in which we close the time intervals of Ln and Bn . Ln ⊆ B n ⊆ Kn ⊆ Cn ∅, We prove that if / . γ n (ω) inf {t : (t, ω) ∈ Ln } = min t : (t, ω) ∈ Ln < ∞ then γ n (ω) ∞ for all ω. Otherwise γ n (ω) ≤ K for some ω and K < ∞ and (γ n (ω) , ω) ∈ Ln . The sets [0, K] ∩ Ln (ω) are compact and γ n (ω) ∈ [0, K] ∩ Ln (ω) for all n. Hence their intersection is non-empty. Let γ ∞ be in the intersection. Then (γ ∞ , ω) ∈ Ln for all n so (γ ∞ , ω) ∈ ∩n Ln , which is impossible. Let S = S(0) + M − P be the decomposition of S, where P is the potential part of S. As M is uniformly integrable E(M (∞)) = E(M (γ n )). Therefore µS (Ln ) ≤ E(S(∞) − S(γ n )) = E(P (γ n )). As P is in class D (P (γ n ∧ t)) is uniformly integrable for every t, so as γ n ∞ lim E(P (γ n ∧ t)) = E(P (t)).
n→∞
Using that P is a supermartingale lim sup E(P (γ n )) ≤ lim sup E(P (γ n ∧ t)) = E(P (t)). n→∞
n→∞
As lim E(P (t)) = 0
t→∞
obviously µS (Ln ) → 0. By (5.5) µS (Ln ) ≤ E (S (γ n ) − S (∞)) → 0. By (5.5) c
µS (Cn \ Ln ) µS (Cn ∩ (∩k≤n Bk ) ) = µS (Cn ∩ (∪k≤n Bkc )) ≤ ≤
n k=1
µS (Cn \ Bk ) ≤
n k=1
µS (Ck \ Bk ) ≤ ε,
298
SOME OTHER THEOREMS
hence lim sup µS (Cn ) ≤ lim sup µS (Cn \ Ln ) + lim sup µS (Ln ) ≤ ε. n→∞
n→∞
n→∞
Now we can finish the proof of the Doob–Meyer decomposition. Let us recall that by (5.4) one should prove that there is a predictable process A such that Y ∈ P.
µA (Y ) = µS (Y ) ,
(5.6)
To construct A let us extend µS from P to the product measurable subsets of R+ × Ω with the definition µ (Y ) µS ( Y ) p
p R+ ×Ω
χY dµS .
(5.7)
Observe that as p χY is well-defined the set function µ (Y ) is also well-defined. If Y1 and Y2 are disjoint then by the additivity of the predictable projection µ (Y1 ∪ Y2 ) µS ( (Y1 ∪ Y2 ))
p
p
p
=
R+ ×Ω
= R+ ×Ω
p
R+ ×Ω
χY1 ∪Y2 dµS =
χY1 + χY2 dµS =
χY1 +
p
χY2 dµS =
= µS (p Y1 ) + µS (p Y2 ) µ (Y1 ) + µ (Y2 ) , so µ is additive. It is clear from the Monotone Convergence Theorem for the predictable projection that µ is σ-additive. Hence µ is a measure. µ is absolutely continuous, since if Y ⊆ R+ ×Ω is a negligible set, then there is a set N ⊆ Ω with probability zero that Y can be covered by the random intervals [0, τ n ] where τ n (ω)
n 0
if ω ∈ N . if ω ∈ /N
As P (N ) = 0 and as the usual conditions hold τ n is a stopping time for every n. Hence the intervals [0, τ n ] are predictable, and their Dol´eans-measure is obviously zero. So µ (Y ) ≤
n
µ ([0, τ n ]) =
n
µS ([0, τ n ]) = 0.
THE DOOB–MEYER DECOMPOSITION
299
By the generalized Radon–Nikodym theorem7 we can represent µ with a predictable8 process A ∈ A+ . Hence for all predictable Y µA (Y ) = µ (Y ) µS (p Y ) = µS (Y ) therefore for this A (5.6) holds. 5.1.2
Dellacherie’s formulas and the natural processes
In some applications of the Doob–Meyer decomposition it is more convenient to assume that in the decomposition the increasing process A is natural. Definition 5.7 We say that a process V ∈ V is natural if for every non-negative, bounded martingale N
t
N dV
E
t
=E
N− dV
0
.
(5.8)
0
Recall that for local martingales p N = N− , hence (5.8) can be written as
t
N dV
E
t p
=E
0
N dV
.
0
Proposition 5.8 (Dellacherie’s formula) If V ∈ A+ is natural then for every non-negative, product measurable process X
∞
E
XdV 0
∞
=E
p
XdV
,
(5.9)
0
where the two sides exist or do not exist in the same time. Proof. If η is non-negative, bounded random variable and X η · χ ((s, t]) then E
∞
XdV
= E (η (V (t) − V (s))) =
0
(n)
(n) =E η V tk − V tk−1 =
8 See:
=
k
(n) (n) E E η V tk − V tk−1 | Ft(n) =
k 7 See:
Proposition 3.49, page 208. Proposition 3.51, page 211.
k
300
SOME OTHER THEOREMS
=E
(n) (n) E η | Ft(n) V tk − V tk−1
k
k
E
M
(n) tk
(n) (n) V tk − V tk−1 .
k
By our general assumption the filtration satisfies the usual conditions so M (t) E (η | Ft ) has a version which is a bounded, non-negative martingale. If (n) (n) max tk − tk−1 → 0 k
then using that M , as every martingale, is right-continuous, Mn
(n) (n) (n) χ tk−1 , tk → M. M tk
k
η is bounded and V ∈ A+ , hence the sum behind the expected value is dominated by an integrable variable, so by the Dominated Convergence Theorem
∞
XdV
E 0
= lim E n→∞
=E
=E
lim
n→∞
lim
n→∞
k
M
(n) tk
(n) (n) V tk − V tk−1
k
M
(n) tk
t
(n) (n) V tk − V tk−1
Mn dV
=E
s
t
lim Mn dV
s n→∞
=
=
t
=E
M dV
.
s
Remember that if X η · χI then9 p
X
p
(η · χI ) = M− · χI .
Using that V is natural E
∞
XdV
=E
0
t
M dV s
=E
=E
M− dV s
∞
M− χ ((s, t]) dV 0
t
=
=E
∞
p
XdV
.
0
Hence for this special X (5.9) holds. These processes form a π-system. The bounded processes for which (5.9) is true is a λ-system, hence by the Monotone 9 See:
Corollary 3.43, page 206.
THE DOOB–MEYER DECOMPOSITION
301
Class Theorem one can extend (5.9) to the bounded processes which are measurable with respect to the σ-algebra generated by the processes X η · χ ((s, t]), hence (5.9) is true if X is a bounded product measurable process. To prove the proposition it is sufficient to apply the Monotone Convergence Theorem. Proposition 5.9 (Dellacherie’s formula) If A ∈ V and A is predictable then for any non-negative, product measurable process X ∞ ∞ p E XdA = E XdA , 0
0
where the two sides exist or do not exist in the same time. Proof. If A is predictable then Var (A) is also predictable. Therefore we can assume that A is increasing. In this case the expressions in the expectations exist and they are non-negative. Define the process σ (t, ω) inf {s : A (s, ω) ≥ t} . As A is increasing σ (t, ω) is increasing and right-continuous in t for any fixed ω. As the usual conditions hold σ t , as a function of ω is a stopping time for any fixed t. Observe that as A is right-continuous [σ t ] ⊆ {A ≥ t} , so as A is predictable Graph (σ t ) = [σ t ] = [0, σ t ] ∩ {A ≥ t} ∈ P, hence σ t is a predictable stopping time10 . By the definition of the predictable projection E (X (σ t ) χ (σ t < ∞)) = E ( p X (σ t ) χ (σ t < ∞)) . Let us remark, that for every non-negative Borel measurable function f ∞ ∞ f (u) dA (u) = f (σ t ) χ (σ t < ∞) dt. 0
0
To see this let us remark that A is right-continuous and increasing hence {t ≤ A (v)} = {σ t ≤ v} . So if f χ ([0, v]) then as A (0) = 0 ∞ ∞ f dA = A (v) = χ (t ≤ A (v)) dt = 0
=
0 ∞
0 10 See:
χ (σ t ≤ v) dt =
Corollary 3.34, page 199.
0
∞
f (σ t ) χ (σ t < ∞) dt.
(5.10)
302
SOME OTHER THEOREMS
One can prove the general case in the usual way. As σ t is predictable and as σ (t, ω) is product measurable by Fubini’s theorem ∞ ∞ XdA = E X (σ t ) χ (σ t < ∞) dt = E 0
0
∞
=
E (X (σ t ) χ (σ t < ∞)) dt =
0
∞
=
E ( p X (σ t ) χ (σ t < ∞)) dt =
0
∞
=E
p
XdA .
0
Theorem 5.10 (Dol´ eans) A process V ∈ A+ is natural if and only if V is predictable. Proof. If V is natural, then by the first formula of Dellacherie if p X = p Y , then µV (X) = µV (Y ), hence by the uniqueness of the representation of µV V is predictable11 . To see the other implication assume that V is predictable. By the second formula of Dellacherie for every product measurable process X
∞
XdV
E
∞
=E
0
p
XdV
.
0
If N is a local martingale then12 p N = N− , hence V is natural. Dellacherie’s formulas have an interesting consequence. When the integrator is a continuous local martingale then the stochastic integral is meaningful whenever the integrand is progressively measurable. By Dellacheries’s formulas even in this case the set of all possible integral processes is the same as the set of integral processes when the integrands are just predictable. Assume first 2 that X ∈ L2 (M ). By Jensen’s inequality ( p X) ≤ p X 2 , hence by the second Dellacherie’s formula p X ∈ L2 (M ). [M, N ] is continuous, hence it is predictable also by Dellacherie’s formula
∞
E
Xd [M, N ] = E
0
∞
p
Xd [M, N ] .
0
Hence during the definition of the stochastic integral the linear functionals N → E
∞
Xd [M, N ] ,
0 11 See: 12 See:
Proposition 3.51, page 211. Proposition 3.38, page 204.
N → E 0
∞
p
Xd [M, N ]
THE DOOB–MEYER DECOMPOSITION
303
coincide. Hence X • M = p X • M , and with localization if X ∈ L2loc (M ) then X ∈ L2loc (M ) and X • M = p X • M .
p
5.1.3
The sub- super- and the quasi-martingales are semimartingales
The main problem with the definition of the semimartingales is that it is very formal. An important consequence of the Doob–Meyer decomposition is that we can show some nontrivial examples of semimartingales. The most important direct application of the Doob–Meyer decomposition is the following:

Proposition 5.11 Every integrable¹³ sub- and supermartingale X is a semimartingale.

Proof. To make the notation simple we shall assume that X(0) = 0.

1. Let us first assume that X is an integrable submartingale. Let τ be an arbitrary stopping time. We prove that, as in the case of martingales, $X^\tau$ is also a submartingale. Let s < t and $A\in\mathcal F_s$. Let us define the bounded stopping time
$$
\sigma\triangleq(\tau\wedge t)\chi_{A^c}+(\tau\wedge s)\chi_A.
$$
As X is integrable one can use the Optional Sampling Theorem, hence as $\sigma\le\tau\wedge t$
$$
E(X(\sigma))=E\bigl(X(\tau\wedge t)\chi_{A^c}+X(\tau\wedge s)\chi_A\bigr)\le E(X(\tau\wedge t))=E\bigl(X^\tau(t)\chi_{A^c}+X^\tau(t)\chi_A\bigr),
$$
therefore
$$
E\bigl(X^\tau(s)\chi_A\bigr)\le E\bigl(X^\tau(t)\chi_A\bigr),
$$
which means that
$$
X^\tau(s)\le E\bigl(X^\tau(t)\mid\mathcal F_s\bigr),
$$
that is, $X^\tau$ is a submartingale.

2. If the submartingale X is in class D then by the Doob–Meyer decomposition X is a semimartingale. One should prove that there is a localizing sequence $(\tau_n)$ for which $X^{\tau_n}$ is in class D for all n. Then, as the Doob–Meyer decomposition is unique, the decomposition $L_{n+1}+V_{n+1}$ of $X^{\tau_{n+1}}$ on the interval $[0,\tau_n]$ is indistinguishable from the decomposition $L_n+V_n$ of $X^{\tau_n}$. From this it is clear that X has the decomposition
$$
L+V\triangleq\lim_n L_n+\lim_n V_n,
$$
where L is a local martingale and V has finite variation.

3. Let us define the bounded stopping times $\tau_n\triangleq\inf\{t : |X(t)|>n\}\wedge n$. As X is integrable, by the Optional Sampling Theorem $X(\tau_n)\in L^1(\Omega)$. For all t
$$
|X^{\tau_n}(t)|\le n+|X(\tau_n)|\in L^1(\Omega),
$$
hence $X^{\tau_n}$ is a class D submartingale. Obviously $\tau_n\le\tau_{n+1}$. Assume that for some ω the sequence $(\tau_n(\omega))$ is bounded. In this case $\tau_n(\omega)\nearrow\tau_\infty(\omega)<\infty$, so there is an N such that if n ≥ N then $\tau_n(\omega)<n$. Hence $|X(\tau_n(\omega))|\ge n$ by the definition of $\tau_n$, therefore the sequence $(X(\tau_n(\omega)))$ is not convergent, which is a contradiction, as by the right-regularity of submartingales X has a finite left limit at $\tau_\infty(\omega)$.

The semimartingales form a linear space, therefore if X = Y − Z, where Y and Z are integrable, non-negative supermartingales, then X is also a semimartingale. Let us extend X to t = ∞: by definition let $X(\infty)\triangleq Y(\infty)\triangleq Z(\infty)\triangleq 0$. As Y and Z are non-negative, after this extension they remain supermartingales¹⁴. Hence one can assume that Y, Z and X are defined on [0, ∞]. Let
$$
\Delta : 0=t_0<t_1<\ldots<t_n<t_{n+1}=\infty \tag{5.11}
$$
be an arbitrary decomposition of [0, ∞]. Let us define the expression
$$
\sup_\Delta E\left(\sum_{i=0}^n \bigl|E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right), \tag{5.12}
$$

¹³ That is X(t) is integrable for every t.
¹⁴ Observe that we used the non-negativity assumption.

SOME OTHER THEOREMS
where one should calculate the supremum over all possible subdivisions (5.11).
$$
E\left(\sum_i \bigl|E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right)\le E\left(\sum_i \bigl|E(Y(t_i)-Y(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right)+E\left(\sum_i \bigl|E(Z(t_i)-Z(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right).
$$
Y is a supermartingale, hence
$$
E(Y(t_i)-Y(t_{i+1})\mid\mathcal F_{t_i})=Y(t_i)-E(Y(t_{i+1})\mid\mathcal F_{t_i})\ge 0.
$$
Therefore one can drop the absolute value. By the simple properties of the conditional expectation, using the assumption that Y is integrable,
$$
E\left(\sum_{i=0}^n \bigl|E(Y(t_i)-Y(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right)=E(Y(0))-E(Y(\infty))=E(Y(0))<\infty.
$$
Applying the same to Z one can easily see that if X has the just mentioned decomposition then the supremum (5.12) is finite.

Definition 5.12 We say that the integrable¹⁵, adapted, right-regular process X is a quasi-martingale if the supremum in (5.12) is finite.

Proposition 5.13 (Rao) An integrable, right-regular process X defined on $\mathbb R_+$ is a quasi-martingale if and only if it has a decomposition X = Y − Z, where Y and Z are non-negative supermartingales.

Proof. We have already proved one implication. We should only show that every quasi-martingale has the mentioned decomposition. X is defined on $\mathbb R_+$, hence as above we shall assume that $X(\infty)\triangleq 0$. Let us fix an s. For any decomposition $\Delta : t_0=s<t_1<t_2<\ldots$ of [s, ∞] let us define the two variables
$$
C_\Delta^\pm(s)\triangleq E\left(\sum_i \bigl(E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr)^\pm\,\Big|\,\mathcal F_s\right).
$$

¹⁵ That is X(t) is integrable for every t.
The variables $C_\Delta^\pm(s)$ are $\mathcal F_s$-measurable. Let $(\Delta_n)$ be an infinitesimal¹⁶ sequence of partitions of [s, ∞], and let us assume that $\Delta_n\subseteq\Delta_{n+1}$, that is, let us assume that we get $\Delta_{n+1}$ by adding further points to $\Delta_n$. We shall prove that the sequences $C_{\Delta_n}^\pm(s)$ are almost surely convergent and the limits are almost surely finite. First we prove that if the partition $\Delta'$ is finer than $\Delta$, then
$$
C_\Delta^\pm(s)\le C_{\Delta'}^\pm(s), \tag{5.13}
$$
which will imply the convergence. By the quasi-martingale property the set of variables $C_\Delta^\pm(s)$ is bounded in $L^1(\Omega)$. From the Monotone Convergence Theorem it is obvious that $C_{\Delta_n}^\pm(s)\nearrow\infty$ cannot hold on a set which has positive measure. To prove (5.13) let us assume that the new point t is between $t_i$ and $t_{i+1}$. Let us introduce the variables
$$
\xi\triangleq E(X(t_i)-X(t)\mid\mathcal F_{t_i}),\qquad
\eta\triangleq E(X(t)-X(t_{i+1})\mid\mathcal F_t),\qquad
\zeta\triangleq E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i}).
$$
As $\zeta=\xi+E(\eta\mid\mathcal F_{t_i})$, by Jensen's inequality
$$
\zeta^+\le\xi^+ +\bigl(E(\eta\mid\mathcal F_{t_i})\bigr)^+\le\xi^+ +E(\eta^+\mid\mathcal F_{t_i}),
$$
hence
$$
E(\zeta^+\mid\mathcal F_s)\le E(\xi^+\mid\mathcal F_s)+E(\eta^+\mid\mathcal F_s),
$$
from which the inequality (5.13) is trivial. Let us introduce the variables
$$
C^\pm(s)\triangleq\lim_{n\to\infty}C_{\Delta_n}^\pm(s).
$$
Obviously $C^\pm(s)$ is integrable and $\mathcal F_s$-measurable. Let us observe that the variables $C_{\Delta_n}^\pm(s)$ are defined up to a measure-zero set, hence the variables $C^\pm(s)$ are also defined up to a measure-zero set. For arbitrary partitions $\Delta_n=(t_i^{(n)})$, as $X(\infty)\triangleq 0$ and as X is adapted,
$$
C_{\Delta_n}^+(s)-C_{\Delta_n}^-(s)=E\left(\sum_i E\bigl(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_{t_i^{(n)}}\bigr)\,\Big|\,\mathcal F_s\right)=\sum_i E\bigl(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_s\bigr)=
$$
$$
=E(X(s)\mid\mathcal F_s)-E(X(\infty)\mid\mathcal F_s)\stackrel{\text{a.s.}}{=}X(s).
$$

¹⁶ As the length of [s, ∞] is infinite, this property means that we map [0, ∞] order-preservingly onto [0, 1] and then require that $(\Delta_n)_n$ be infinitesimal on [0, 1].
This remains valid after we take the limit, hence for all s
$$
C^+(s)-C^-(s)\stackrel{\text{a.s.}}{=}X(s). \tag{5.14}
$$
Let us assume that t is in $\Delta_n$ for all n. As s < t,
$$
E\bigl(C_{\Delta_n}^\pm(t)\mid\mathcal F_s\bigr)=E\left(\sum_{t_i^{(n)}\ge t}\bigl(E(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_{t_i^{(n)}})\bigr)^\pm\,\Big|\,\mathcal F_s\right)\le
$$
$$
\le E\left(\sum_i\bigl(E(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_{t_i^{(n)}})\bigr)^\pm\,\Big|\,\mathcal F_s\right)=C_{\Delta_n}^\pm(s),
$$
from which, taking the limit and using the Monotone Convergence Theorem for the conditional expectation,
$$
E\bigl(C^\pm(t)\mid\mathcal F_s\bigr)\le C^\pm(s). \tag{5.15}
$$
Let $(\Delta_n)$ be an infinitesimal sequence of partitions of [0, ∞]. Let S be the union of the points in $(\Delta_n)$. Obviously S is dense in $\mathbb R_+$. By the above, $C^\pm$ are supermartingales on S. As S is countable, on S one can define the trajectories of $C^\pm$ up to a measure-zero set. By the supermartingale property, except on a measure-zero set N, for every t the limits
$$
D^\pm(t,\omega)\triangleq C^\pm(t+,\omega)\triangleq\lim_{s\searrow t,\,s\in S}C^\pm(s,\omega)
$$
exist and $D^\pm(t)$ is right-regular. X is also right-regular, hence from (5.14) on $N^c$ for every t ≥ 0
$$
D^+(t)-D^-(t)=X(t).
$$
$D^\pm(t)$ is $\mathcal F_{t+1/n}$-measurable for all n, hence $D^\pm(t)$ is $\mathcal F_{t+}$-measurable. As F satisfies the usual conditions, $D^\pm(t)$ is $\mathcal F_t$-measurable, that is, the processes $D^\pm$ are adapted. If $s_n\searrow t$ and $s_n\in S$, then the sequence $(C^\pm(s_n))$ is a reversed supermartingale. Hence for the $L^1(\Omega)$ convergence of $(C^\pm(s_n))$ it is necessary and sufficient that the sequence is bounded in $L^1(\Omega)$. By the supermartingale property, as $(s_n)$ is decreasing, the expected value of $(C^\pm(s_n))_n$ is increasing. By the quasi-martingale property the variables $C^\pm(0)$ are integrable, hence by the non-negativity the sequences $(C^\pm(s_n))$ are bounded in $L^1(\Omega)$. Hence they are convergent in $L^1(\Omega)$. From this $D^\pm(t)$ is integrable for all t. The conditional expectation is continuous in $L^1(\Omega)$, therefore one can take the limit in (5.15) inside the conditional expectation. Hence the processes $D^\pm$ are integrable supermartingales on $\mathbb R_+$.

Corollary 5.14 Every quasi-martingale is a semimartingale.
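The conditional variation (5.12) can be computed exactly in a finite discrete model. The following sketch is our own illustration (the two-step binomial model and the drift are not from the text): it checks the telescoping identity used in the proof above, namely that for a supermartingale Y the conditional increments are non-negative and sum in expectation to E(Y(0)) − E(Y(∞)).

```python
from itertools import product

# Finite two-step model: Omega = {-1, +1}^2, uniform probability.
paths = list(product([-1, 1], repeat=2))

def Y(path, i):
    # An illustrative supermartingale: a random walk minus the drift i/2.
    return sum(path[:i]) - 0.5 * i

def cond_exp(f, i):
    # E(f | F_i): average f over all paths sharing the first i coordinates.
    out = {}
    for p in paths:
        atom = [q for q in paths if q[:i] == p[:i]]
        out[p] = sum(f(q) for q in atom) / len(atom)
    return out

# Conditional variation of Y along the partition 0 < 1 < 2, as in (5.12);
# for a supermartingale the absolute values are redundant.
var = 0.0
for i in range(2):
    ce = cond_exp(lambda q, i=i: Y(q, i) - Y(q, i + 1), i)
    var += sum(abs(ce[p]) for p in paths) / len(paths)

EY0 = sum(Y(p, 0) for p in paths) / len(paths)
EY2 = sum(Y(p, 2) for p in paths) / len(paths)
print(var, EY0 - EY2)  # prints: 1.0 1.0
```

As the proof shows, refining the partition can only increase this quantity, and for a supermartingale it is capped by E(Y(0)) − E(Y(∞)) regardless of the partition.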
5.2
Semimartingales as Good Integrators
The definition of the semimartingales is quite artificial. In this section we present an important characterization of the semimartingales. We shall prove that the only class of integrators for which one can define a stochastic integral with reasonable properties is the class of the semimartingales. Recall the following definition:

Definition 5.15 Process E is a predictable step process if
$$
E=\sum_{i=0}^n\xi_i\chi\bigl((t_i,t_{i+1}]\bigr),
$$
where $0=t_0<t_1<\ldots<t_{n+1}$ and the $\xi_i$ are $\mathcal F_{t_i}$-measurable random variables.

If X is an arbitrary process then the only reasonable definition of the stochastic integral E • X is¹⁷
$$
(E\bullet X)(t)=\sum_i\xi_i\bigl(X(t_{i+1}\wedge t)-X(t_i\wedge t)\bigr).
$$
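The formula for (E • X)(t) translates directly into code. A minimal sketch for a single trajectory (function names are our own, and the deterministic trajectory is purely illustrative):

```python
def elementary_integral(t, times, xi, X):
    """(E . X)(t) for the predictable step process
    E = sum_i xi[i] * chi((times[i], times[i+1]]),
    where X is one trajectory given as a function of time."""
    total = 0.0
    for i, x in enumerate(xi):
        a, b = times[i], times[i + 1]
        # Each step contributes xi_i * (X(t_{i+1} ^ t) - X(t_i ^ t)).
        total += x * (X(min(b, t)) - X(min(a, t)))
    return total

# Sanity check: integrating E = 1 on (0, 2] against X(t) = t^2
# reproduces X(t) - X(0) for t <= 2, and X(2) - X(0) afterwards.
val = elementary_integral(1.5, [0.0, 2.0], [1.0], lambda s: s * s)
print(val)  # prints: 2.25
```

Note that the integral is capped at the right endpoint of the last interval, reflecting that (E • X)(t) is by definition the integral on (0, t].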
For an arbitrary stochastic process X this definition obviously makes the integral linear over the linear space of the predictable step processes. On the other hand, it is reasonable to say that a linear mapping is an integral only if the correspondence has some continuity property. Let us define the topology of uniform convergence in (t, ω) among the predictable step processes, and let us equip the random variables with the topology of stochastic convergence.

Definition 5.16 We say that the process X is a good integrator if for every t the correspondence E ↦ (E • X)(t) is a continuous, linear mapping from the space of predictable step processes to the set of random variables.

Observe that the required continuity property is very weak, as on the domain of definition we have a very strong, and on the image space a very weak, topology. As the integral is linear, it is continuous if and only if it is continuous at E = 0. This means that if a sequence of step processes converges uniformly to zero then for any t the integral on the interval (0, t] converges stochastically to zero.

¹⁷ See: Theorem 2.88, page 174, line (4.11), page 252. Recall that by definition (E • X)(t) is the integral on (0, t].
Theorem 5.17 (Bichteler–Dellacherie) An adapted, right-regular process X is a semimartingale if and only if it is a good integrator.

Proof. If X is a semimartingale, then by the Dominated Convergence Theorem it is obviously a good integrator¹⁸. Hence we have to prove only the other direction. We split the proof into several steps.

1. As a first step let us separate the 'big jumps' of X, that is, let us separate from X the jumps which are larger than one. By the assumptions of the theorem the trajectories of X are regular, so the 'big jumps' do not have an accumulation point. Hence the decomposition is meaningful. From this it trivially follows that the process
$$
\sum_{s\le t}\Delta X(s)\chi\bigl(|\Delta X(s)|\ge 1\bigr)
$$
has finite variation. As the continuity property of the good integrators holds for processes with finite variation,
$$
Y\triangleq X-\sum_{s\le t}\Delta X(s)\chi\bigl(|\Delta X(s)|\ge 1\bigr)
$$
is also a good integrator. If we prove that Y is a semimartingale, then we obviously prove that X is a semimartingale as well. Y does not contain 'big jumps', hence if it is a semimartingale, then it is a special semimartingale¹⁹. Therefore the decomposition of Y is unique²⁰. As the decomposition is unique, it is sufficient to prove that Y is a semimartingale on every interval [0, t].

2. As we have already seen²¹, if the probability measures P and Q are equivalent, that is, the measure-zero sets under P and Q are the same, then X is a semimartingale under P if and only if it is a semimartingale under Q. Therefore it is sufficient to prove that if X is a good integrator under P then one can find a probability measure Q which is equivalent to P and under which X is a semimartingale. Observe that a sequence of random variables is stochastically convergent to some random variable if and only if any subsequence of the original sequence has a further subsequence which is almost surely convergent to the same function. Therefore the stochastic convergence depends only on the collection of measure-zero sets, which does not change during an equivalent change of measure. From this it is obvious that the class of good integrators does not change under an equivalent change of measure.

3. Let us fix an interval [0, t]. As the trajectories of X are regular, the trajectories are bounded on any finite interval. Hence $\eta\triangleq\sup_{s\le t}|X(s)|<\infty$. Again by the regularity of the trajectories it is sufficient to calculate the supremum over the rational points s ≤ t. Therefore η is a random variable. Let $A_m\triangleq\{m\le\eta<m+1\}$ and $\zeta\triangleq\sum_m 2^{-m}\chi_{A_m}$. ζ is evidently bounded, and as

¹⁸ See: Lemma 2.12, page 118.
¹⁹ See: Example 4.47, page 258.
²⁰ See: Corollary 3.41, page 205.
²¹ See: Corollary 4.58, page 271.
η is finite, ζ is trivially positive. As
$$
E(\eta\zeta)=\sum_m E\bigl(\eta 2^{-m}\chi(m\le\eta<m+1)\bigr)\le\sum_m(m+1)2^{-m}<\infty,
$$
it is obvious that ηζ is integrable under P. Hence
$$
R(A)\triangleq\frac{1}{E(\zeta)}\int_A\zeta\,dP
$$
is a probability measure, and as ζ is positive it is equivalent to P. For every s ≤ t
$$
\int_\Omega|X(s)|\,dR\le\int_\Omega\eta\,dR=\frac{1}{E(\zeta)}\int_\Omega\eta\zeta\,dP<\infty,
$$
therefore X(s) is integrable under R for all s. To make the notation simple we assume that the X(s) are already integrable under P for all s ∈ [0, t].

4. Let us define the set
$$
B\triangleq\{(E\bullet X)(t) : |E|\le 1,\ E\in\mathcal E\}, \tag{5.16}
$$
where $\mathcal E$ is the set of predictable step processes over [0, t]. Using the continuity property of the good integrators we prove that B is stochastically bounded, that is, for every ε > 0 there is a number k such that P(|η| ≥ k) < ε for all η ∈ B. If it were not true then there would be an ε > 0, a sequence of step processes $|E_n|\le 1$ and $k_n\nearrow\infty$, such that
$$
P\left(\frac{|(E_n\bullet X)(t)|}{k_n}\ge 1\right)\ge\varepsilon.
$$
The sequence $(E_n/k_n)$ converges uniformly to zero, hence by the continuity property of the good integrators
$$
\frac{(E_n\bullet X)(t)}{k_n}=\left(\frac{E_n}{k_n}\bullet X\right)(t)\xrightarrow{P}0,
$$
which, by the indirect assumption, is not true.

5. As a last step of the proof, in the next point we shall prove that for every non-empty, stochastically bounded, convex subset B of $L^1$ there is a probability measure Q which is equivalent to P and for which
$$
\sup\left\{\int_\Omega\beta\,dQ : \beta\in B\right\}\triangleq c<\infty. \tag{5.17}
$$
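The normalising density ζ of step 3 is easy to test numerically: even when η has infinite mean, E(ηζ) stays below the deterministic bound $\sum_m(m+1)2^{-m}=4$, whatever the law of η. A minimal sketch (the Pareto-like tail for η is our own illustrative choice, not from the text):

```python
import random

random.seed(0)

# eta: a heavy-tailed positive variable with infinite mean (Pareto, alpha = 1).
samples = [1.0 / (1.0 - random.random()) for _ in range(100_000)]

def zeta(eta):
    # zeta = 2^(-m) on the event {m <= eta < m+1}.
    return 2.0 ** (-int(eta))

# Each term eta * zeta(eta) is at most (m+1) * 2^(-m), so the sample mean of
# eta * zeta must stay below sum_m (m+1) 2^(-m) = 4, although E(eta) = infinity.
mean = sum(e * zeta(e) for e in samples) / len(samples)
print(mean)  # stays well below the bound 4
```

The point of the construction is exactly this damping: the equivalent measure R built from ζ makes the originally non-integrable η integrable.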
From this the theorem follows, as for every partition $0=t_0<t_1<\ldots<t_{n+1}=t$ of [0, t], if²²
$$
\xi_i\triangleq\operatorname{sgn}\bigl(E^Q(X(t_{i+1})-X(t_i)\mid\mathcal F_{t_i})\bigr)\qquad\text{and}\qquad E\triangleq\sum_i\xi_i\chi\bigl((t_i,t_{i+1}]\bigr),
$$
then, as |E| ≤ 1, (E • X)(t) ∈ B, therefore
$$
c\ge E^Q\bigl((E\bullet X)(t)\bigr)=\sum_{i=0}^n E^Q\bigl(\xi_i[X(t_{i+1})-X(t_i)]\bigr)=\sum_{i=0}^n E^Q\Bigl(E^Q\bigl(\xi_i[X(t_{i+1})-X(t_i)]\mid\mathcal F_{t_i}\bigr)\Bigr)=
$$
$$
=\sum_{i=0}^n E^Q\Bigl(\xi_i E^Q\bigl(X(t_{i+1})-X(t_i)\mid\mathcal F_{t_i}\bigr)\Bigr)=E^Q\left(\sum_{i=0}^n\bigl|E^Q(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right).
$$
Hence X is a quasi-martingale under Q. Therefore²³ it is a semimartingale under Q.

6. Let $B\subseteq L^1(\Omega)$ be a non-empty, stochastically bounded, convex set²⁴. We prove the existence of the equivalent measure Q in (5.17) with the Hahn–Banach theorem. Let $L^\infty_+$ denote the set of non-negative functions in $L^\infty$. Let
$$
H\triangleq\left\{\zeta\in L^\infty_+ : \sup\left\{\int_\Omega\beta\zeta\,dP : \beta\in B\right\}<\infty\right\}.
$$

²² That is, for every ε > 0 there is a k(ε) such that P(|β| ≥ k(ε)) ≤ ε for all β ∈ B.
²³ See: Corollary 5.14.
It is sufficient to prove that H contains a strictly positive function $\zeta_0$, since in this case
$$
Q(A)\triangleq\frac{1}{E(\zeta_0)}\int_A\zeta_0\,dP
$$
is an equivalent probability measure for which (5.17) holds. Let $\mathcal G$ be the collection of the sets of positivity of the functions in H. The collection $\mathcal G$ is closed under countable union: if $\zeta_n\in H$ and
$$
\sup\left\{\int_\Omega\beta\zeta_n\,dP : \beta\in B\right\}\le c_n,\qquad c_n\ge 1,
$$
then
$$
\sum_n\frac{2^{-n}}{c_n\|\zeta_n\|_\infty}\,\zeta_n\in H.
$$
Using the lattice property of $\mathcal G$ in the usual way one can prove that $\mathcal G$ contains a set D which has maximal measure, that is, P(G) ≤ P(D) for all $G\in\mathcal G$. Of course to D there is a $\zeta_D\in H$. We should prove that P(D) = 1, since in this case $\zeta_D$, as an equivalence class, is strictly positive. Let us denote by C the complement of D. We shall prove that P(C) = 0. As an indirect assumption let us assume that
$$
P(C)\triangleq\varepsilon>0. \tag{5.18}
$$
As B is stochastically bounded, to our ε > 0 in (5.18) there is a k such that P(β ≥ k) ≤ ε/2 for every random variable β ∈ B. From this $\theta\triangleq 2k\chi_C\notin B$. Of course, if ϑ ≥ 0, then P(θ + ϑ ≥ k) ≥ ε, hence θ + ϑ ∉ B, that is, $\theta\notin B-L^1_+$. We can prove a bit more: θ is not even in the $L^1(\Omega)$-closure of the convex²⁵ set $B-L^1_+$, that is,
$$
\theta\notin\operatorname{cl}\bigl(B-L^1_+\bigr).
$$
If $\gamma_n\triangleq\beta_n-\vartheta_n\to\theta$ in $L^1(\Omega)$, then $\gamma_n\to\theta$ in probability, but if δ is small enough, then as $\vartheta_n\ge 0$
$$
P(|\gamma_n-\theta|>\delta)=P(|\beta_n-\vartheta_n-\theta|>\delta)\ge P(\{\beta_n<k\}\cap\{\theta\ge 2k\})=P(\{\beta_n<k\}\cap C)=P(C\setminus\{\beta_n\ge k\})\ge\frac{\varepsilon}{2},
$$

²⁵ B is convex, hence $B-L^1_+$ is also convex.
which is impossible. By the Hahn–Banach theorem²⁶ there is a ζ ≠ 0, $\zeta\in L^\infty(\Omega)$, such that
$$
\sup\left\{\int_\Omega(\beta-\vartheta)\zeta\,dP : \beta\in B,\ \vartheta\in L^1_+\right\}<\int_\Omega\theta\zeta\,dP.
$$
$$
\sup_{i,j\ge m}P\bigl(d(Z_i(c),Z_j(c))>2^{-k}\bigr)<2^{-k}.
$$
As we observed, the real valued functions $d(Z_i(c,\omega),Z_j(c,\omega))$ are measurable in (c, ω), therefore by Fubini's theorem the probability in the formula depends on c in a measurable way. Hence $n_k$ is a measurable function of c. Let us define the 'stopped variables'
$$
Y_k(c,t,\omega)\triangleq Z_{n_k(c)}(c,t,\omega).
$$
For every open set G
$$
\{Y_k\in G\}=\cup_p\{n_k=p,\ Z_p\in G\},
$$
therefore $Y_k$ is also product measurable. For all c
$$
\sum_k\sup_{i,j\ge k}P\bigl(d(Y_i(c),Y_j(c))>2^{-k}\bigr)\le\sum_k 2^{-k}<\infty,
$$
hence for every c, by the Borel–Cantelli lemma, if the indexes i, j are big enough then, except on a measure-zero set ω ∈ N(c),
$$
d(Y_i(c,\omega),Y_j(c,\omega))\le 2^{-k}.
$$
D[0, ∞) is complete, hence $(Y_i(c,\omega))$ is almost surely convergent in D[0, ∞) for all c. The function
$$
Z(c,t,\omega)\triangleq\begin{cases}\lim_i Y_i(c,t,\omega)&\text{if the limit exists,}\\ 0&\text{otherwise}\end{cases}
$$
is product measurable and Z is right-regular almost surely for all c. For an arbitrary c, $(Y_i(c)-Z(c))$ is a subsequence of $(Z_n(c)-Z(c))$, therefore it is stochastically convergent in D[0, ∞). The measure is finite, therefore for metric space valued random variables the almost sure convergence implies the stochastic convergence. Hence Z(c, ω) is the limit of the sequence $(Z_n(c,\omega))$ for almost all ω.

Returning to the proof of the proposition, let us assume that $H_n\in S$ and $0\le H_n\nearrow H$, where H is bounded. By the Dominated Convergence
Theorem $H_n(c)\bullet X\xrightarrow{ucp}H(c)\bullet X$ for every c. Hence by the lemma H • X has a $(\mathcal C\times\mathcal B(\mathbb R_+)\times\mathcal A)$-measurable version. That is, H ∈ S. Hence the proposition is valid for bounded processes. If H is not bounded, then let $H_n\triangleq H\chi(|H|\le n)$. The processes $H_n$ are also $(\mathcal C\times\mathcal G)$-measurable, and of course they are bounded. Therefore the processes $H_n\bullet X$ have the stated version. By the Dominated Convergence Theorem $H_n(c)\bullet X\xrightarrow{ucp}H(c)\bullet X$ for every c. By the lemma this means that H(c) • X also has a measurable version.

Theorem 5.25 (Fubini's theorem for bounded integrands) Let X be a semimartingale, and let $(C,\mathcal C,\mu)$ be an arbitrary finite measure space. Let H(c, t, ω) be a function measurable with respect to the product σ-algebra $\mathcal C\times\mathcal G$. Let us denote by (H • X)(c) the product measurable version of the parametric integral c ↦ H(c) • X. If H(c, t, ω) is bounded, then
$$
\int_C(H\bullet X)(c)\,d\mu(c)=\left(\int_C H(c)\,d\mu(c)\right)\bullet X, \tag{5.23}
$$
that is, the integral of the parametric stochastic integral on the left side is indistinguishable from the stochastic integral on the right side.

Proof. It is not a big surprise that the proof is built on the Monotone Class Theorem again.

1. By the Fundamental Theorem of Local Martingales the semimartingale X has a decomposition X(0) + V + L, where $V\in\mathcal V$ and $L\in\mathcal H^2_{loc}$. For $V\in\mathcal V$ one can prove the equality by the classical theorem of Fubini, hence one can assume that $X\in\mathcal H^2_{loc}$. One can easily localize the right side of (5.23). On the left side one can interchange the localization and the integration with respect to c, therefore one can assume that X(0) = 0 and $X\in\mathcal H^2$. Therefore³³ we can assume that $E([X](\infty))<\infty$.

2. Let us denote by S the set of bounded, $(\mathcal C\times\mathcal G)$-measurable processes for which the theorem holds. If $H\triangleq H_1(c)H_2(t,\omega)$, where $H_1$ is a $\mathcal C$-measurable step function, $H_2$ is $\mathcal G$-measurable, and $H_1$ and $H_2$ are bounded functions, then, arguing as in the previous proposition,
$$
\int_C H\bullet X\,d\mu=\int_C\bigl(H_1(c)H_2\bigr)\bullet X\,d\mu(c)=\int_C\left(\Bigl(\sum_i\alpha_i\chi_{B_i}\Bigr)H_2\right)\bullet X\,d\mu(c)=\sum_i\alpha_i\int_C\chi_{B_i}\,(H_2\bullet X)\,d\mu(c)=
$$
$$
=\left(\int_C H_1(c)\,d\mu(c)\right)(H_2\bullet X)=\left(\left(\int_C H_1\,d\mu\right)H_2\right)\bullet X=\left(\int_C H\,d\mu\right)\bullet X,
$$
so H ∈ S.

³³ See: Proposition 3.64, page 223.
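For step integrands, (5.23) is a finite identity that can be checked pathwise, one trajectory at a time. A small sketch of this (the discrete times, the two-point parameter space C, and all names are our own illustration):

```python
# Discrete-time sketch of (5.23): X is one fixed trajectory sampled at integer
# times, C = {0, 1} carries a finite measure mu, and H(c, i) is deterministic.
X = [0.0, 1.0, -1.0, 2.0, 1.5]           # X(0..4) along one path
mu = {0: 0.5, 1: 2.0}                     # a finite measure on C
H = {0: [1.0, 0.0, 2.0, 1.0],             # H(c, i) on the interval (i, i+1]
     1: [0.0, 1.0, 1.0, 3.0]}

def integral(h, X):
    # (h . X)(T) = sum_i h[i] * (X(i+1) - X(i))
    return sum(h[i] * (X[i + 1] - X[i]) for i in range(len(h)))

# Left side of (5.23): integrate the parametric stochastic integral over C.
left = sum(mu[c] * integral(H[c], X) for c in mu)

# Right side: integrate H over C first, then against X.
Hbar = [sum(mu[c] * H[c][i] for c in mu) for i in range(4)]
right = integral(Hbar, X)

print(left, right)  # prints: 2.25 2.25
```

In this finite setting the two sides agree by mere reordering of finite sums; the content of Theorem 5.25 is that the identity survives the limit passage to general bounded integrands.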
THEOREM OF FUBINI FOR STOCHASTIC INTEGRALS
3. By the Monotone Class Theorem one should prove that S is a λ-system. Let $H_n\in S$ and let $0\le H_n\nearrow H$, where H is bounded. We prove that one can take the limit in the equation
$$
\int_C(H_n\bullet X)\,d\mu=\left(\int_C H_n\,d\mu\right)\bullet X.
$$
As H is bounded and µ is finite, on the right-hand side the integrands are uniformly bounded, so one can apply the classical and the stochastic Dominated Convergence Theorems; hence on the right-hand side
$$
\left(\int_C H_n\,d\mu\right)\bullet X\xrightarrow{ucp}\left(\int_C H\,d\mu\right)\bullet X. \tag{5.24}
$$
4. Introduce the notations $Z_n\triangleq H_n\bullet X$ and $Z\triangleq H\bullet X$. One should prove that the left-hand side is also convergent, that is,
$$
\delta\triangleq\sup_t\left|\int_C Z_n(c)\,d\mu(c)-\int_C Z(c)\,d\mu(c)\right|\xrightarrow{P}0.
$$
By the inequalities of Cauchy–Schwarz and Doob
$$
E(\delta)\le E\left(\int_C\sup_t|Z_n(c)-Z(c)|\,d\mu(c)\right)\le\sqrt{\mu(C)}\,E\left(\sqrt{\int_C\sup_t|Z_n(c)-Z(c)|^2\,d\mu(c)}\right)\le
$$
$$
\le\sqrt{\mu(C)}\sqrt{E\left(\int_C\sup_t|Z_n(c)-Z(c)|^2\,d\mu(c)\right)}\le\sqrt{\mu(C)}\sqrt{4\int_C E\bigl((Z_n(c,\infty)-Z(c,\infty))^2\bigr)\,d\mu(c)}.
$$
By Itô's isometry³⁴ the last integral is
$$
\int_C E\left(\int_0^\infty(H_n-H)^2\,d[X]\right)d\mu. \tag{5.25}
$$
As µ and $E([X](\infty))$ are finite, the integrand is bounded and $H_n\to H$, by the classical Dominated Convergence Theorem (5.25) goes to zero.

³⁴ See: Proposition 2.64, page 156.
So E(δ) → 0, that is,
$$
\int_C(H_n\bullet X)\,d\mu=\int_C Z_n\,d\mu\xrightarrow{ucp}\int_C Z\,d\mu=\int_C(H\bullet X)\,d\mu. \tag{5.26}
$$
In particular,
$$
\int_C\sup_t|Z_n(c)-Z(c)|\,d\mu(c)<\infty\quad\text{a.s.}
$$
The expression
$$
\left(\int_C H_n(c)\,d\mu(c)\right)\bullet X=\int_C\bigl(H_n(c)\bullet X\bigr)\,d\mu(c)=\int_C Z_n\,d\mu
$$
is meaningful, therefore for all t and for almost all outcomes ω
$$
\int_C\bigl|(H(c)\bullet X)(t,\omega)\bigr|\,d\mu(c)=\int_C|Z(c,t,\omega)|\,d\mu(c)<\infty.
$$
Hence the left-hand side of (5.23) is meaningful for H as well. By (5.24) the right-hand side is also convergent, hence from (5.26)
$$
\left(\int_C H\,d\mu\right)\bullet X=\lim_{n\to\infty}\left(\int_C H_n\,d\mu\right)\bullet X=\lim_{n\to\infty}\int_C(H_n\bullet X)\,d\mu=\int_C(H\bullet X)\,d\mu.
$$
The just proved stochastic generalization of Fubini's theorem is sufficient for most of the applications. On the other hand one can still be interested in the unbounded case:

Theorem 5.26 (Fubini's theorem for unbounded integrands) Let X be a semimartingale and let $(C,\mathcal C,\mu)$ be a finite measure space. Let H(c, t, ω) be a $(\mathcal C\times\mathcal G)$-measurable process, and assume that the expression
$$
\|H(t,\omega)\|_2\triangleq\sqrt{\int_C H^2(c,t,\omega)\,d\mu(c)}<\infty \tag{5.27}
$$
is integrable with respect to X. Under these conditions, µ-almost surely the stochastic integral H(c) • X exists, and if (H • X)(c) denotes the measurable version of this parametric integral then
$$
\int_C(H\bullet X)(c)\,d\mu(c)=\left(\int_C H(c)\,d\mu(c)\right)\bullet X. \tag{5.28}
$$
Proof. If in the place of H one puts $H_n\triangleq H\chi(|H|\le n)$, then the equality holds by the previous theorem. As in the proof of the classical Fubini's theorem, one should take a limit on both sides of the truncated equality.

1. Let us first investigate the right-hand side of the equality. By the Cauchy–Schwarz inequality
$$
\int_C|H(c,t,\omega)|\,d\mu(c)\le\sqrt{\mu(C)}\sqrt{\int_C H^2(c,t,\omega)\,d\mu(c)}. \tag{5.29}
$$
By the assumptions µ is finite and H(c, t, ω), as a function of c, is in the space $L^2(\mu)\subseteq L^1(\mu)$, hence by the Dominated Convergence Theorem for all (t, ω)
$$
\int_C H_n(c,t,\omega)\,d\mu(c)\to\int_C H(c,t,\omega)\,d\mu(c).
$$
By the just proved inequality (5.29) the processes $\int_C H\,d\mu$ and $\int_C|H|\,d\mu$ are integrable with respect to X, hence by the Dominated Convergence Theorem for stochastic integrals
$$
\left(\int_C H_n\,d\mu\right)\bullet X\xrightarrow{ucp}\left(\int_C H\,d\mu\right)\bullet X.
$$
This means that one can take the limit on the right side of the equation.

2. Now let us investigate the left-hand side. We first prove that for almost all c the integral H(c) • X exists. Let $X\triangleq X(0)+V+L$, where $V\in\mathcal V$, $L\in\mathcal L$, be the decomposition of X for which the integral $\|H(t,\omega)\|_2\bullet X$ exists. One can assume that $V\in\mathcal V^+$. Using (5.29) and, for every trajectory, the theorem of Fubini,
$$
\int_C\int_0^t|H|\,dV\,d\mu=\int_0^t\int_C|H|\,d\mu\,dV=\int_0^t\|H\|_1\,dV\le\sqrt{\mu(C)}\int_0^t\|H\|_2\,dV<\infty.
$$
Therefore for any t, for almost every³⁵ c, the integral $\int_0^t H(c)\,dV$ is finite. Of course if the integral exists for every rational t then it exists for every t, therefore, unifying the measure-zero sets, it is easy to show that for almost all c the integral H(c) • V is meaningful. Recall that a process G is integrable with respect to the local martingale L if and only if $\sqrt{G^2\bullet[L]}\in\mathcal A^+_{loc}$. This means that $\|H\|_2\triangleq\sqrt{\int_C H^2(c)\,d\mu(c)}$ is integrable if and only if there is a localizing sequence $(\tau_n)$

³⁵ Of course with respect to µ.
for which the expected value of
$$
\sqrt{\int_0^{\tau_n}\int_C H^2(c)\,d\mu(c)\,d[L]}=\sqrt{\int_C\int_0^{\tau_n}H^2(c)\,d[L]\,d\mu(c)}
$$
is finite. By Jensen's inequality
$$
\sqrt{\int_C\int_0^{\tau_n}H^2(c)\,d[L]\,\frac{d\mu(c)}{\mu(C)}}\ge\int_C\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}\,\frac{d\mu(c)}{\mu(C)}.
$$
Therefore by Fubini's theorem
$$
E\left(\int_C\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}\,d\mu(c)\right)=\int_C E\left(\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}\right)d\mu(c)<\infty.
$$
Hence, except on a set $C_n$ with $\mu(C_n)=0$, the expected value of
$$
\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}
$$
is finite. Unifying the measure-zero sets $C_n$ one can easily see that $\sqrt{H^2(c)\bullet[L]}\in\mathcal A^+_{loc}$ for almost all³⁶ c, that is, for almost all c the integral H(c) • L exists.
3. If the integral H(c) • X exists, then $H_n(c)\bullet X\xrightarrow{ucp}H(c)\bullet X$. Unfortunately, as we mentioned above, from the inequality $|H_n(c)|\le|H(c)|$ the inequality $|H_n(c)\bullet X|\le|H(c)\bullet X|$ does not follow, and we do not know that H(c) • X is µ-integrable, hence one cannot use the classical Dominated Convergence Theorem for the outer integral with respect to µ. Therefore, as in the proof of the previous theorem, we prove the convergence of the left side with a direct estimation. As by the classical Fubini's theorem the theorem is obviously valid if the integrator has finite variation, one can assume that $X\in\mathcal L$.

4. Let s ≥ 0. As in the previous proof, introduce the variable
$$
\delta_n\triangleq\sup_{t\le s}\left|\int_C\bigl((H_n(c)-H(c))\bullet X\bigr)\,d\mu(c)\right|.
$$

³⁶ Of course with respect to µ.
By Davis' inequality
$$
E(\delta_n)\le E\left(\int_C\sup_{t\le s}\bigl|(H_n(c)-H(c))\bullet X\bigr|\,d\mu(c)\right)=\int_C E\left(\sup_{t\le s}\bigl|(H_n(c)-H(c))\bullet X\bigr|\right)d\mu(c)\le \tag{5.30}
$$
$$
\le K\int_C E\left(\sqrt{\bigl[(H_n(c)-H(c))\bullet X\bigr](s)}\right)d\mu=K\int_C E\left(\sqrt{(H_n(c)-H(c))^2\bullet[X](s)}\right)d\mu=
$$
$$
=\mu(C)\,K\,E\left(\int_C\sqrt{(H_n(c)-H(c))^2\bullet[X](s)}\,\frac{d\mu}{\mu(C)}\right)\le\mu(C)\,K\,E\left(\sqrt{\int_C(H_n(c)-H(c))^2\bullet[X](s)\,\frac{d\mu}{\mu(C)}}\right)=
$$
$$
=\sqrt{\mu(C)}\,K\,E\left(\sqrt{\left(\int_C(H_n(c)-H(c))^2\,d\mu\right)\bullet[X](s)}\right).
$$
$\sqrt{\int_C H^2\,d\mu}$ is integrable with respect to X, therefore
$$
\left(\int_C(H_n(c)-H(c))^2\,d\mu\right)\bullet[X]\le\left(\int_C H^2\,d\mu\right)\bullet[X]\in\mathcal A^+_{loc}.
$$
Let $(\tau_m)$ be a localizing sequence. With localization one can assume that the last expected value is finite, that is,
$$
E\left(\sqrt{\left(\int_C H^2\,d\mu\right)\bullet[X^{\tau_m}]}\right)<\infty.
$$
Applying the estimation (5.30) to $X^{\tau_m}$ and writing $\delta_n^{(m)}$ instead of $\delta_n$, by the classical Dominated Convergence Theorem $E(\delta_n^{(m)})\to 0$. Hence if m is sufficiently large then
$$
P(\delta_n>\varepsilon)\le P(\delta_n>\varepsilon,\ \tau_m>s)+P(\tau_m\le s)\le P\bigl(\delta_n^{(m)}>\varepsilon\bigr)+P(\tau_m\le s).
$$
Therefore $\delta_n\to 0$ in probability. From this point the proof of the theorem is the same as the proof of the previous one.
Corollary 5.27 (Fubini's theorem for local martingales) Let $(C,\mathcal C,\mu)$ be a finite measure space. If L is a local martingale, H(c, t, ω) is a $(\mathcal C\times\mathcal P)$-measurable function and
$$
\sqrt{\int_0^t\int_C H^2(c,s)\,d\mu(c)\,d[L](s)}\in\mathcal A^+_{loc},
$$
then
$$
\int_C\int_0^t H(c,s)\,dL(s)\,d\mu(c)=\int_0^t\int_C H(c,s)\,d\mu(c)\,dL(s). \tag{5.31}
$$
If L is a continuous local martingale, H is a $(\mathcal C\times\mathcal R)$-measurable process and
$$
P\left(\int_0^t\int_C H^2(c,s)\,d\mu(c)\,d[L](s)<\infty\right)=1,
$$
then (5.31) holds.

Corollary 5.28 (Fubini's theorem for Wiener processes) Let $(C,\mathcal C,\mu)$ be a finite measure space. If w is a Wiener process, H(c, t, ω) is an adapted, product measurable process and
$$
P\left(\int_0^t\int_C H^2(c,s)\,d\mu(c)\,ds<\infty\right)=1,
$$
then
$$
\int_C\int_0^t H(c,s)\,dw(s)\,d\mu(c)=\int_0^t\int_C H(c,s)\,d\mu(c)\,dw(s).
$$

5.5
Martingale Representation
Let $\mathcal H^p_0$ denote the space of $\mathcal H^p$ martingales which are zero at time zero. Recall that by definition the martingales M and N are orthogonal if their product MN is a local martingale. This is equivalent to the condition that the quadratic variation [M, N] is a local martingale. This implies that if M and N are orthogonal then $M^\tau$ and N are also orthogonal for every stopping time τ. The topology in the spaces $\mathcal H^p_0$ is given by the norm $\|\sup_t|M(t)|\|_p$. The basic message of the Burkholder–Davis inequality is that this norm is equivalent to the norm
$$
\|M\|_{\mathcal H^p_0}\triangleq\Bigl\|\sqrt{[M](\infty)}\Bigr\|_p. \tag{5.32}
$$
In this section we shall use this norm. Observe that if p ≥ 1 then $\mathcal H^p_0$ is a Banach space.
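The identity underlying the equivalence in (5.32) can be checked exactly for a discrete martingale with orthogonal increments, where $E(M(n)^2)=E([M](n))=E(\sum_i\Delta M_i^2)$. A sketch on the full path space of n coin tosses (the setting is our own illustration, not from the text):

```python
from itertools import product

n = 4
# All 2^n equally likely +-1 increment paths of a simple random walk M.
paths = list(product([-1, 1], repeat=n))

# E(M(n)^2) computed by enumerating every path.
E_M2 = sum(sum(p) ** 2 for p in paths) / len(paths)
# E([M](n)) = E(sum of squared increments).
E_QV = sum(sum(dx * dx for dx in p) for p in paths) / len(paths)
print(E_M2, E_QV)  # prints: 4.0 4.0
```

Both expectations equal n because the cross terms $E(\Delta M_i\Delta M_j)$, i ≠ j, vanish; the Burkholder–Davis inequality extends this second-moment identity to an equivalence of the two norms for general p.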
Definition 5.29 Let 1 ≤ p < ∞. We say that the closed, linear subspace $\mathcal X$ of $\mathcal H^p_0$ is stable if it is stable under truncation, that is, if $X\in\mathcal X$ then $X^\tau\in\mathcal X$ for every stopping time τ. If $\mathcal X$ is a subset of $\mathcal H^p_0$ then we shall denote by $\mathrm{stable}_p(\mathcal X)$ the smallest closed linear subspace of $\mathcal H^p_0$ which is closed under truncation and contains $\mathcal X$.

Obviously $\mathcal H^p_0$ is a stable subspace. The intersection of stable subspaces is also stable, hence $\mathrm{stable}_p(\mathcal X)$ is meaningful for every $\mathcal X\subseteq\mathcal H^p_0$. To make the notation as simple as possible, if the subscript p is not important we shall drop it and instead of $\mathrm{stable}_p(\mathcal X)$ we shall simply write $\mathrm{stable}(\mathcal X)$.

Lemma 5.30 Let 1 ≤ p < ∞ and let $\mathcal X\subseteq\mathcal H^p_0$. Let N be a bounded martingale. If N is orthogonal to $\mathcal X$ then N is orthogonal to $\mathrm{stable}(\mathcal X)$.

Proof. Let us denote by $\mathcal Y$ the set of $\mathcal H^p_0$-martingales which are orthogonal to N. Of course $\mathcal X\subseteq\mathcal Y$, so it is sufficient to prove that $\mathcal Y$ is a stable subspace of $\mathcal H^p_0$. As we remarked, $\mathcal Y$ is closed under stopping. Let $M_n\in\mathcal Y$ and let $M_n\to M_\infty$ in $\mathcal H^p_0$. As N is bounded, $M_nN$ is a local martingale which is in class D. Hence it is a uniformly integrable martingale. So $E((M_nN)(\tau))=0$ for every stopping time τ. Let k < ∞ be an upper bound of N. Then
$$
|E((M_\infty N)(\tau))|=|E((M_\infty N)(\tau))-E((M_nN)(\tau))|\le E\bigl(|((M_\infty-M_n)N)(\tau)|\bigr)\le
$$
$$
\le k\cdot E\bigl(|(M_\infty-M_n)(\tau)|\bigr)\le k\cdot E\left(\sqrt{[M_\infty-M_n](\infty)}\right)\le k\cdot\|M_\infty-M_n\|_{\mathcal H^p_0}\to 0.
$$
So $M_\infty N$ is also a martingale. Hence $\mathcal Y\triangleq\{X\in\mathcal H^p_0 : X\perp N\}$ is closed in $\mathcal H^p_0$.

Definition 5.31 Let 1 ≤ p < ∞. We say that the subset $\mathcal X\subseteq\mathcal H^p_0$ has the Martingale Representation Property if $\mathcal H^p_0=\mathrm{stable}(\mathcal X)$.

Recall that we have fixed a stochastic base $(\Omega,\mathcal A,P,\mathbf F)$.

Definition 5.32 Let 1 ≤ p < ∞. Let us say that the probability measure Q on $(\Omega,\mathcal A)$ is an $\mathcal H^p_0$-measure of the subset $\mathcal X\subseteq\mathcal H^p_0$ if
1. Q ≪ P,
2. Q = P on $\mathcal F_0$,
3. if $M\in\mathcal X$ then M is in $\mathcal H^p_0$ under Q as well.
$\mathcal M_p(\mathcal X)$ will denote the set of $\mathcal H^p_0$-measures of $\mathcal X$.
Lemma 5.33 $\mathcal M_p(\mathcal X)$ is always convex.

Proof. If $Q_1,Q_2\in\mathcal M_p(\mathcal X)$, 0 ≤ λ ≤ 1 and $Q_\lambda\triangleq\lambda Q_1+(1-\lambda)Q_2$, then for every $M\in\mathcal X$
$$
E^{Q_\lambda}\left(\sup_t|M(t)|^p\right)=\lambda E^{Q_1}\left(\sup_t|M(t)|^p\right)+(1-\lambda)E^{Q_2}\left(\sup_t|M(t)|^p\right)<\infty.
$$
If $F\in\mathcal F_s$ and t > s then by the martingale property under $Q_1$ and $Q_2$
$$
\int_F M(t)\,dQ_\lambda=\lambda\int_F M(t)\,dQ_1+(1-\lambda)\int_F M(t)\,dQ_2=\lambda\int_F M(s)\,dQ_1+(1-\lambda)\int_F M(s)\,dQ_2=\int_F M(s)\,dQ_\lambda.
$$
Hence M is in $\mathcal H^p_0$ under $Q_\lambda$.

Definition 5.34 If C is a convex set and x ∈ C, then we say that x is an extremal point of C if whenever u, v ∈ C and x = λu + (1 − λ)v for some 0 ≤ λ ≤ 1, then x = u or x = v.

Proposition 5.35 Let 1 ≤ p < ∞ and let $\mathcal X\subseteq\mathcal H^p_0$. If $\mathcal X$ has the Martingale Representation Property then P is an extremal point of $\mathcal M_p(\mathcal X)$.

Proof. Assume that P = λQ + (1 − λ)R, where 0 ≤ λ ≤ 1 and $Q,R\in\mathcal M_p(\mathcal X)$. As R ≥ 0, obviously Q ≪ P, so one can define the Radon–Nikodym derivative $L(\infty)\triangleq dQ/dP\in L^1(\Omega,P,\mathcal F_\infty)$. Define the martingale
$$
L(t)\triangleq E\bigl(L(\infty)\mid\mathcal F_t\bigr).
$$
From the definition of the conditional expectation
$$
\int_F L(t)\,dP=\int_F L(\infty)\,dP=Q(F),\qquad F\in\mathcal F_t,
$$
so L(t) is the Radon–Nikodym derivative of Q with respect to P on the measure space $(\Omega,\mathcal F_t)$. Let $X\in\mathcal X$. If s < t and $F\in\mathcal F_s$ then, as X is a
martingale under Q,
$$
\int_F X(t)L(t)\,dP=\int_F X(t)\,\frac{dQ}{dP}\,dP=\int_F X(t)\,dQ=\int_F X(s)\,dQ=\int_F X(s)L(s)\,dP,
$$
so XL is a martingale under P. Obviously Q ≤ P/λ, so 0 ≤ L ≤ 1/λ. Hence L is uniformly bounded. L(0) is bounded and $\mathcal F_0$-measurable, so X · L(0) is a martingale. This implies that X · (L − L(0)) is also a martingale under P, that is, X and L − L(0) are orthogonal as local martingales. That is, L − L(0) is orthogonal to $\mathcal X$. Hence by the previous lemma L − L(0) is orthogonal to $\mathrm{stable}(\mathcal X)$. As $\mathcal X$ has the Martingale Representation Property, L − L(0) is orthogonal to $\mathcal H^p_0$. As L − L(0) is bounded, $L-L(0)\in\mathcal H^p_0$. But this means³⁷ that L − L(0) = 0. By definition Q and P are equal on $\mathcal F_0$, hence L(∞) = L(0) = 1. Hence P = Q.

Now we want to prove the converse statement for p = 1. Let P be an extremal point of $\mathcal M_p(\mathcal X)$ and assume that $\mathcal X$ does not have the Martingale Representation Property, that is, $\mathrm{stable}(\mathcal X)\ne\mathcal H^p_0$. As $\mathrm{stable}(\mathcal X)$ is a closed linear space, by the Hahn–Banach theorem there is a non-zero linear functional L for which
$$
L\bigl(\mathrm{stable}(\mathcal X)\bigr)=0. \tag{5.33}
$$
Assume temporarily that L has the following representation: there is a locally bounded local martingale N such that
$$
L(M)=E\bigl([M,N](\infty)\bigr),\qquad M\in\mathcal H^p_0. \tag{5.34}
$$
$\mathrm{stable}(\mathcal X)$ is closed under truncation, hence for every stopping time τ
$$
E\bigl([M,N^\tau](\infty)\bigr)=E\bigl([M,N]^\tau(\infty)\bigr)=E\bigl([M^\tau,N](\infty)\bigr)=L(M^\tau)=0
$$
whenever $M\in\mathrm{stable}(\mathcal X)$. Hence instead of N we can use $N^\tau$. As N is locally bounded, we can assume that N is a uniformly bounded martingale. Instead of N we can also write N − N(0), so one can assume that N(0) = 0. Let |N| ≤ c. If
$$
dQ\triangleq\left(1-\frac{N(\infty)}{2c}\right)dP,\qquad dR\triangleq\left(1+\frac{N(\infty)}{2c}\right)dP,
$$
then Q and R are non-negative measures. As N is a bounded martingale,
$$
E(N(\infty))=E(N(0))=E(0)=0,
$$

³⁷ See: Proposition 4.4, page 228.
so Q and R are probability measures and obviously P = (Q + R) /2. If X ∈ X then
p
p
sup |X(s)| dQ =
sup |X(s)|
s
Ω
Ω
s
N (∞) 1− 2c
dP ≤
p
≤2
sup |X(s)| dP < ∞. Ω
s
If s < t and F ∈ Fs then
N (∞) X(t) 1 − dP = 2c F 1 = X(t)dP − X (t) N (∞) dP = 2c F F 1 = X (s) dP − X (t) N (∞) dP. 2c F F
X (t) dQ F
As F ∈ Fs σ(ω)
if ω ∈ F if ω ∈ /F
s ∞
is a stopping time. As s ≤ t τ (ω)
t if ω ∈ F ∞ if ω ∈ /F
is also a stopping time. Hence X τ , X σ ∈ stable(X ), so
X τ − X s = X t − X s χF ∈ stable(X ).
(5.35)
Obviously H₀ᵖ ⊆ H₀¹ if p ≥ 1, so

|MN| ≤ sup_t |M(t)| · sup_t |N(t)| ∈ L¹(Ω).

As N is bounded, obviously³⁸ N ∈ H₀^q. Hence by the Kunita–Watanabe inequality, using also Hölder's inequality,

|[M, N]| ≤ √([M](∞)) √([N](∞)) ∈ L¹(Ω).

³⁸ Recall the definition of the Hᵖ spaces! See: (5.32) on page 328. Implicitly we have used the Burkholder–Davis inequality.
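In discrete form the Kunita–Watanabe bound is just the Cauchy–Schwarz inequality applied to the sequences of increments; a small numerical sketch (the increment data are made up for illustration):

```python
import math

# Hypothetical increment sequences of two discrete "martingales" M and N.
dM = [0.5, -1.2, 0.3, 0.9, -0.4]
dN = [1.0, 0.2, -0.7, 0.1, 0.6]

# Total variation of the discrete bracket [M, N]: sum of |dM_k * dN_k| ...
var_MN = sum(abs(a * b) for a, b in zip(dM, dN))

# ... is dominated by sqrt([M](inf)) * sqrt([N](inf)), as in the text.
bound = math.sqrt(sum(a * a for a in dM)) * math.sqrt(sum(b * b for b in dN))

assert var_MN <= bound + 1e-12
```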
MARTINGALE REPRESENTATION
By this, MN − [M, N] is a class D local martingale, hence it is a uniformly integrable martingale³⁹. Hence

E(M(∞)N(∞)) = E(M(∞)N(∞)) − L(M) = E(M(∞)N(∞) − [M, N](∞)) = E(M(0)N(0) − [M, N](0)) = 0,

so by (5.35)

E(N(∞) χ_F X^t(∞)) = E(N(∞) χ_F X^s(∞)).

Therefore

∫_F X(t)N(∞) dP = ∫_F X(s)N(∞) dP.

Hence X is a martingale under Q. This implies that Q ∈ Mᵖ(X). In a similar way R ∈ Mᵖ(X), which is a contradiction. So one should only prove that if stable(X) ≠ H₀ᵖ then there is a locally bounded local martingale N for which (5.33) and (5.34) hold. It is easy to see that if p > 1 then the dual of H₀ᵖ is H₀^q, where of course 1/p + 1/q = 1. The H₀^q martingales need not be locally bounded⁴⁰, so the argument above is not valid if p > 1. Assume that p = 1.

Proposition 5.36 If L is a continuous linear functional over H₀¹ then (5.34) holds, that is, for some locally bounded local martingale N

L(M) = E([M, N](∞)),    M ∈ H₀¹.
Proof. Obviously H₀² ⊆ H₀¹ and ‖M‖_{H¹} ≤ ‖M‖_{H²}, so if c = ‖L‖ then

|L(M)| ≤ c ‖M‖_{H₀¹} ≤ c ‖M‖_{H₀²},

so L is a continuous linear functional over H₀².

1. H₀² is a Hilbert space, so for some N ∈ H₀²

L(M) = E(M(∞)N(∞)),    M ∈ H₀².

Let M ∈ H₀². From the Kunita–Watanabe inequality⁴¹

|[M, N]| ≤ √([M]) √([N]) ≤ √([M](∞)) √([N](∞)) ∈ L¹(Ω).

³⁹ See: Example 1.144, page 102.
⁴⁰ One can easily modify Example 1.138, on page 96, to construct a counter-example.
⁴¹ Observe that we used again that the two definitions of the H₀² spaces are equivalent.
Also, as M, N ∈ H₀²,

|(MN)(t)| ≤ sup_t |M(t)| · sup_t |N(t)| ∈ L¹(Ω).

Therefore MN − [M, N] has an integrable majorant, so it is a local martingale from class D. Therefore it is a uniformly integrable martingale. This implies that for some N ∈ H₀²

L(M) = E(M(∞)N(∞)) = E([M, N](∞)),    M ∈ H₀².    (5.36)
2. Now we prove that for almost all trajectories |∆N| ≤ 2c. Let τ = inf{t : |∆N| > 2c}. As N(0) = 0 and N is right-continuous, τ > 0. If τ(ω) < ∞ then |∆N(τ)|(ω) > 2c. Hence we should prove that P(|∆N(τ)| > 2c) = 0. Every stopping time can be covered by a countable number of totally inaccessible or predictable stopping times, hence one can assume that τ is either predictable or totally inaccessible. If P(|∆N(τ)| > 2c) > 0 then let

ξ = sgn(∆N(τ)) χ(|∆N(τ)| > 2c) / P(|∆N(τ)| > 2c).

S = ξχ([τ, ∞)) is adapted, right-continuous and it has integrable variation. Let M = S − Sᵖ. If τ is predictable then the graph [τ] is a predictable set, hence

∆(Sᵖ) = ᵖ(∆S) = ᵖ(ξχ([τ])) = (ᵖξ)χ([τ]),

where ᵖξ is the predictable projection of the constant process U(t) ≡ ξ. By the definition of the predictable projection,

ᵖξ(τ) = E(ξ | F_{τ−}).

If τ is totally inaccessible then P(τ = σ) = 0 for every predictable stopping time σ. Hence

ᵖ(∆S)(σ) = ᵖ(ξχ([τ]))(σ) = E(ξχ([τ])(σ) | F_{σ−}) = E(0 | F_{σ−}) = 0,

so ∆(Sᵖ) = ᵖ(∆S) = 0. Therefore in both cases Sᵖ has at most one jump, which can occur only at τ. This implies that M has finite variation and it has just one jump, which occurs at τ. As we have seen,

∆M(τ) = ξ − E(ξ | F_{τ−}) if τ is predictable, and ∆M(τ) = ξ if τ is totally inaccessible.
Obviously

‖M‖_{H¹} = E(√([M](∞))) = E(√((∆M)²(τ))) = E(|∆M(τ)|) ≤ E(|ξ|) + E(|E(ξ | F_{τ−})|) ≤ 2E(|ξ|) = 2.

∫₀ᵗ M₋ dM is a local martingale with localizing sequence (ρₙ). By the integration by parts formula and by Fatou's lemma,

E(M²(t)) = E(lim_{n→∞} M²(t ∧ ρₙ)) ≤ limsup_{n→∞} E(M²(t ∧ ρₙ)) = limsup_{n→∞} E([M](t ∧ ρₙ)) ≤ E([M](t)) ≤ E([M](∞)) = E((∆M(τ))²) < ∞.

Hence M ∈ H₀². If τ is totally inaccessible then

L(M) = E([M, N](∞)) = E(∆M(τ)∆N(τ)) = E(ξ∆N(τ)) = E(|∆N(τ)| χ(|∆N(τ)| > 2c)) / P(|∆N(τ)| > 2c) > 2c E(χ(|∆N(τ)| > 2c)) / P(|∆N(τ)| > 2c) = 2c ≥ c ‖M‖_{H¹},

which is impossible. If τ is predictable then

E(∆M(τ)∆N(τ)) = E(ξ∆N(τ)) − E(E(ξ | F_{τ−}) ∆N(τ)).

N is a martingale, therefore ᵖ(∆N) = 0, so

E(E(ξ | F_{τ−}) ∆N(τ)) = E(E(ξ | F_{τ−}) E(∆N(τ) | F_{τ−})) = 0,

and we can get the same contradiction as above. This implies that |∆N| ≤ 2c. Therefore N is locally bounded.

3. To finish the proof we should show that the identity in the theorem holds not only in H₀² but in H₀¹ as well. To do this we should prove that H₀² is dense in H₀¹ and that E([M, N](∞)) is a continuous linear functional on H₀¹. Because these statements have some general importance we shall present them as separate lemmas.
Lemma 5.37 H² is dense in H¹.

Proof. If M ∈ H¹ then M = M^c + M^d, where M^c is the continuous part and M^d is the purely discontinuous part of M. [M] = [M^c] + [M^d], so from (5.32) it is obvious that M^c, M^d ∈ H¹.

1. M^c is locally bounded, so there is a localizing sequence (τₙ) such that (M^c)^{τₙ} ∈ H² for all n. Observe that if (τₙ) is a localizing sequence then by the Dominated Convergence Theorem ‖M^{τₙ} − M‖_{H¹} → 0 for every M ∈ H¹.

2. For the purely discontinuous part, M^d = Σ_{k=1}^∞ L_k, where the L_k are continuously compensated single jumps of M. Recall⁴² that the series Σ_k L_k converges in H¹. Therefore it is sufficient to prove the lemma when M = S − Sᵖ is a continuously compensated single jump. Let τ be the jump-time of M, that is, let S = ∆M(τ)χ([τ, ∞)). Let

ξ_k = ∆M(τ) χ(|∆M(τ)| ≤ k).

Let S_k = ξ_k χ([τ, ∞)) and M_k = S_k − S_kᵖ. By the construction of L_k the stopping time τ is either predictable or totally inaccessible. In the same way as in the proof of the proposition just above one can easily prove that M_k has just one jump, which occurs at τ. Also, as during the previous proof, one can easily prove that M_k ∈ H². Then

‖M − M_k‖_{H¹} = ‖∆M(τ) − ∆M_k(τ)‖₁.

If τ is totally inaccessible then, as ∆M(τ) is integrable,

‖∆M(τ) − ∆M_k(τ)‖₁ = ‖∆M(τ) χ(|∆M(τ)| > k)‖₁ → 0.

If τ is predictable then we also have the component

‖E(∆M(τ) χ(|∆M(τ)| > k) | F_{τ−})‖₁.

But if k → ∞ then in L¹(Ω), by the conditional Dominated Convergence Theorem,

lim_{k→∞} E(∆M(τ) χ(|∆M(τ)| > k) | F_{τ−}) = 0,

from which the lemma is obvious.

⁴² See: Theorem 4.26, page 236 and Proposition 4.30, page 243.
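The truncation step above — E(|∆M(τ)| χ(|∆M(τ)| > k)) → 0 as k → ∞ for an integrable jump — is plain dominated convergence; a toy illustration with a made-up finite sample standing in for the jump distribution:

```python
# A hypothetical finite sample standing in for the integrable jump ΔM(τ).
jumps = [0.1, -2.5, 7.0, -0.3, 12.0, 1.5, -4.0]

def tail_mean(k):
    # empirical analogue of E(|ΔM(τ)| χ(|ΔM(τ)| > k))
    return sum(abs(x) for x in jumps if abs(x) > k) / len(jumps)

tails = [tail_mean(k) for k in (0, 5, 10, 15)]
# the truncated tails decrease to 0 as k grows
assert tails == sorted(tails, reverse=True)
assert tails[-1] == 0.0
```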
Our next goal is to prove that E([M, N](∞)) in (5.36) is a continuous linear functional over H₀¹. To do this we need two lemmas. As a first step we prove the following observation:

Lemma 5.38 If for some N ∈ H₀²

E(|[M, N](∞)|) ≤ c · ‖M‖_{H₀¹},    M ∈ H₀²,

then

sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞.
Lemma 5.39 If N ∈ H₀² and

k² = sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞,

then (E(|[M, N](∞)|))² ≤ 2k² ‖M‖²_{H₀¹} for every M ∈ H₀².

Proof. From the Kunita–Watanabe inequality

∫₀^∞ 1 dVar([M, N]) ≤ √(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) · √(∫₀^∞ (√[M] + √[M]₋) d[N]).

Therefore by the Cauchy–Schwarz inequality, and since [M]₋ ≤ [M],

(E(|[M, N](∞)|))² ≤ E(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) · E(∫₀^∞ (√[M] + √[M]₋) d[N]) ≤ E(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) · E(∫₀^∞ 2√[M] d[N]).

Let

a = t₀⁽ⁿ⁾ < t₁⁽ⁿ⁾ < … < tₙ⁽ⁿ⁾ = b
be an infinitesimal sequence of partitions of [a, b]. Let f > 0 be a right-regular function with bounded variation on [a, b]. Then

√f(b) − √f(a) = Σᵢ₌₁ⁿ (√f(tᵢ⁽ⁿ⁾) − √f(tᵢ₋₁⁽ⁿ⁾)) = Σᵢ₌₁ⁿ (f(tᵢ⁽ⁿ⁾) − f(tᵢ₋₁⁽ⁿ⁾)) / (√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)).

f generates a finite measure on [a, b]. As f is right-regular and positive,

1/(√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾))

is bounded, and for every t ∈ (tᵢ₋₁⁽ⁿ⁾, tᵢ⁽ⁿ⁾]

1/(√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)) → 1/(√f(t) + √f(t−)).

So by the Dominated Convergence Theorem it is easy to see that if n → ∞ then

√f(b) − √f(a) = ∫ₐᵇ 1/(√f(t) + √f(t−)) df(t).

With the Monotone Convergence Theorem one can easily prove that if f is a right-regular, non-negative, increasing function then⁴³

√f(∞) − √f(0) = ∫₀^∞ 1/(√f(t) + √f(t−)) df(t).

Using this,

E(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) = E(√([M](∞))) = ‖M‖_{H¹}.

⁴³ See: Example 6.50, page 400.
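The change-of-variables identity √f(∞) − √f(0) = ∫ df/(√f + √f₋) can be checked directly for an increasing pure-jump function, where the integral is a finite sum over the jumps and the terms telescope exactly (jump sizes below are made up):

```python
import math

# a right-continuous increasing pure-jump function: cumulative sums of jumps
f0 = 1.0
jumps = [0.5, 2.0, 0.25, 3.0]   # made-up jump sizes

f = f0
integral = 0.0
for d in jumps:
    f_minus, f = f, f + d
    # df / (sqrt(f) + sqrt(f_-)) summed over the jumps equals sqrt(f) - sqrt(f_-)
    integral += d / (math.sqrt(f) + math.sqrt(f_minus))

assert abs(integral - (math.sqrt(f) - math.sqrt(f0))) < 1e-12
```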
Let us estimate the second integral. Integrating by parts,

E(∫₀^∞ √[M] d[N]) = E(√([M](∞)) [N](∞) − ∫₀^∞ [N]₋ d√[M]) = E(∫₀^∞ ([N](∞) − [N]₋) d√[M]).

It is easy to see that⁴⁴

E(∫₀^∞ [N](∞) d√[M]) = E([N](∞) √([M](∞))) = E([N](∞) Σₖ (√([M](sₖ)) − √([M](sₖ₋₁)))) = E(Σₖ E([N](∞) | F_{sₖ}) (√([M](sₖ)) − √([M](sₖ₋₁)))) = E(∫₀^∞ E([N](∞) | Fₛ) d√[M](s)).

So if

k² = sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞,

then

E(∫₀^∞ √[M] d[N]) = E(∫₀^∞ (E([N](∞) | Fₛ) − [N](s−)) d√[M](s)) = E(∫₀^∞ (E([N](∞) | Fₛ) − [N](s) + ∆[N](s)) d√[M](s)) = E(∫₀^∞ E((N(∞) − N(s))² + (∆N(s))² | Fₛ) d√[M](s)) = E(∫₀^∞ E((N(∞) − N(s−))² | Fₛ) d√[M](s)) ≤ k² · E(√([M](∞))) = k² · ‖M‖_{H¹}.

⁴⁴ First one should assume that [N](∞) is bounded and we should use that √([M](∞)) is integrable. Then with the Monotone Convergence Theorem one can drop the assumption that [N](∞) is bounded.
So

(E(|[M, N](∞)|))² ≤ 2 · k² · ‖M‖²_{H₀¹},
which proves the inequality.

Definition 5.40 N is a BMO martingale if N ∈ H² and

sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞.

Corollary 5.41 The BMO martingales are locally bounded.

Corollary 5.42 (Dual of H₀¹) L is a continuous linear functional over H₀¹ if and only if for some BMO martingale N

L(M) = E([M, N](∞)).

The dual of the Banach space H₀¹ is the space of BMO martingales.

Let us return to the Martingale Representation Problem. We proved the following statement:

Theorem 5.43 (Jacod–Yor) The set X ⊆ H₀¹ has the Martingale Representation Property if and only if the underlying probability measure P is an extremal point of M¹(X).

Proposition 5.44 Let 1 ≤ p < ∞ and let X be a closed linear subspace of H₀ᵖ. The following properties are equivalent:
1. If M ∈ X and H • M ∈ H₀ᵖ for some predictable process H then H • M ∈ X.
2. If M ∈ X and H is a bounded and predictable process then H • M ∈ X.
3. X is stable under truncation, that is, if M ∈ X and τ is an arbitrary stopping time then M^τ ∈ X.
4. If M ∈ X, s ≤ t ≤ ∞ and F ∈ Fₛ then (M^t − M^s)χ_F ∈ X.

Proof. Let H be a bounded predictable process and let |H| ≤ c. Then

[H • M](∞) = (H² • [M])(∞) ≤ c² [M](∞),
so if M ∈ H₀ᵖ then H • M ∈ H₀ᵖ, and the implication 1. ⇒ 2. is obvious. If τ is an arbitrary stopping time then

χ([0, τ]) • M = 1 • M^τ = M^τ − M(0) = M^τ,

hence 2. implies 3. If F ∈ Fₛ then

τ(ω) = s if ω ∈ F, ∞ if ω ∉ F

is a stopping time. If 3. holds then M^τ ∈ X. As s ≤ t,

σ(ω) = t if ω ∈ F, ∞ if ω ∉ F

is also a stopping time, hence M^σ ∈ X. As X is a linear space, M^σ − M^τ ∈ X. But obviously M^σ − M^τ = (M^t − M^s)χ_F, hence 3. implies 4. Now let

H = Σᵢ χ_{Fᵢ} χ((tᵢ, tᵢ₊₁]),    (5.37)

where Fᵢ ∈ F_{tᵢ}. Obviously

(H • M)(t) = Σᵢ χ_{Fᵢ} (M(t ∧ tᵢ₊₁) − M(t ∧ tᵢ)),

and by 4., H • M ∈ X. For processes of this type,

‖Hₙ • M − H • M‖_{H₀ᵖ} = ‖(Hₙ − H) • M‖_{H₀ᵖ} = ‖√([(Hₙ − H) • M](∞))‖ₚ = ‖√((Hₙ − H)² • [M](∞))‖ₚ.

M ∈ H₀ᵖ, so ‖√([M](∞))‖ₚ < ∞. Therefore if Hₙ → H is a uniformly bounded sequence of predictable processes then from the Dominated Convergence Theorem it is obvious that

‖Hₙ • M − H • M‖_{H₀ᵖ} = ‖√((Hₙ − H)² • [M](∞))‖ₚ → 0.
X is closed, so if Hₙ • M ∈ X for all n then H • M ∈ X as well. Using this property and 4. with the Monotone Class Theorem one can easily show that if H is a bounded predictable process then H • M ∈ X. If H • M ∈ H₀ᵖ for some predictable process H then

‖√((H² • [M])(∞))‖ₚ < ∞.

From this, as above, it is easy to show that in H₀ᵖ

(H χ(|H| ≤ n)) • M → H • M,

so H • M ∈ X.

Proposition 5.45 If 1 ≤ p < ∞ and M ∈ H₀ᵖ then the set C = {X ∈ H₀ᵖ : X = H • M} is closed in H₀ᵖ.

Proof. It is easy to see that the set of predictable processes H for which⁴⁵

‖H‖_{Lᵖ(M)} = ‖√((H² • [M])(∞))‖ₚ < ∞    (5.38)

is a linear space. In the usual way, as in the classical theory of Lᵖ-spaces⁴⁶, one can prove that if H₁ ∼ H₂ whenever ‖H₁ − H₂‖_{Lᵖ(M)} = 0, then the set of equivalence classes, denoted by Lᵖ(M), is a Banach space. Let Xₙ ∈ C and assume that Xₙ → X in H₀ᵖ. Let Xₙ = Hₙ • M. Then

‖Xₙ‖_{H₀ᵖ} = ‖√([Xₙ](∞))‖ₚ = ‖√((Hₙ² • [M])(∞))‖ₚ = ‖Hₙ‖_{Lᵖ(M)}.

This implies that (Hₙ) is a Cauchy sequence in Lᵖ(M), so it is convergent, hence Hₙ → H in Lᵖ(M) for some H and Hₙ • M → H • M. Therefore X = H • M, so C is closed.
Proposition 5.46 Let (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. Assume that if i ≠ j then the martingales Mᵢ and Mⱼ are strongly orthogonal⁴⁷ as local martingales, that is, [Mᵢ, Mⱼ] = 0 whenever i ≠ j. In this case

stable(M₁, M₂, …, Mₙ) = {Σᵢ₌₁ⁿ Hᵢ • Mᵢ : Hᵢ ∈ Lᵖ(Mᵢ)}.

⁴⁵ See: Definition 2.57, page 151.
⁴⁶ See: [80], Theorem 3.11, page 69.
⁴⁷ See: Definition 4.1, page 227.
That is, the stable subspace generated by a finite set of strongly orthogonal H₀ᵖ-martingales is the linear subspace generated by the stochastic integrals Hᵢ • Mᵢ, Hᵢ ∈ Lᵖ(Mᵢ).

Proof. Recall that, as in the previous proposition, Lᵖ(M) is the set of equivalence classes of progressively measurable processes for which (5.38) holds. Let I denote the linear space on the right side of the equality. By Proposition 5.44, for all i,

Hᵢ • Mᵢ ∈ stable(Mᵢ) ⊆ stable(X),

hence I ⊆ stable(X). From the stopping rule of stochastic integrals, I is closed under stopping. Mᵢ(0) = 0 and Mᵢ = 1 • Mᵢ, so Mᵢ ∈ I for all i. By strong orthogonality

‖√([Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞))‖ₚ = ‖√(Σᵢ₌₁ⁿ (Hᵢ² • [Mᵢ])(∞))‖ₚ ≤ ‖Σᵢ₌₁ⁿ √((Hᵢ² • [Mᵢ])(∞))‖ₚ.

From Jensen's inequality it is also easy to show that

(1/√n) ‖Σᵢ₌₁ⁿ √((Hᵢ² • [Mᵢ])(∞))‖ₚ ≤ ‖√([Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞))‖ₚ.

This means that the norms ‖√([Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞))‖ₚ and ‖Σᵢ₌₁ⁿ √((Hᵢ² • [Mᵢ])(∞))‖ₚ are equivalent. In a similar way as in the previous proposition, one can show that I is a closed linear subspace of H₀ᵖ. Therefore stable(M₁, …, Mₙ) ⊆ I.
Example 5.47 The assumption about orthogonality is important.
Let w₁ and w₂ be independent Wiener processes. Let J(t) = t. If

M₁ = w₁,    M₂ = (1 − J) • w₁ + J • w₂,

then

[M₁, M₂] = [w₁, (1 − J) • w₁ + J • w₂] = (1 − J) • [w₁] = (1 − J) • J,

which is not a local martingale. So the conditions of the above proposition do not hold. We show that

I = {Σᵢ₌₁² Hᵢ • Mᵢ : Hᵢ ∈ Lᵖ(Mᵢ)}

is not a closed set in H₀ᵖ. Let ε > 0. Obviously

H₁⁽ᵋ⁾ = (J − 1 + ε)/(J + ε),    H₂⁽ᵋ⁾ = 1/(J + ε)

are bounded predictable processes.

X_ε = H₁⁽ᵋ⁾ • M₁ + H₂⁽ᵋ⁾ • M₂ = ((J − 1 + ε)/(J + ε)) • w₁ + ((1 − J)/(J + ε)) • w₁ + (J/(J + ε)) • w₂ = (ε/(J + ε)) • w₁ + w₂ − (ε/(J + ε)) • w₂.

As w₁ and w₂ are independent,

[(ε/(J + ε)) • w₁ − (ε/(J + ε)) • w₂](t) = 2 ∫₀ᵗ (ε/(s + ε))² ds → 0,
so X_ε → w₂ in H₀ᵖ. Assume that for some H₁ and H₂

w₂ = H₁ • M₁ + H₂ • M₂ = H₁ • w₁ + (H₂(1 − J)) • w₁ + (H₂J) • w₂.

Reordering,

(H₁ + H₂(1 − J)) • w₁ = (1 − H₂J) • w₂.

From this

[(1 − H₂J) • w₂] = [(H₁ + H₂(1 − J)) • w₁, (1 − H₂J) • w₂] = ((H₁ + H₂(1 − J))(1 − H₂J)) • [w₁, w₂] = 0,
so

(H₁ + H₂(1 − J)) • w₁ = (1 − H₂J) • w₂ = 0.

This implies that 1 − H₂J = H₁ + H₂(1 − J) = 0, that is, H₂ = 1/J and H₁ = 1 − 1/J. But as

∫₀ᵗ (1 − 1/s)² ds = +∞,

H₁ = 1 − 1/J ∉ Lᵖ(w₁).
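The H₀ᵖ-convergence X_ε → w₂ above rests on the elementary fact that ∫₀ᵗ (ε/(s + ε))² ds = ε − ε²/(t + ε) → 0 as ε → 0; a quick numerical confirmation (t = 1, midpoint quadrature):

```python
def bracket(eps, t=1.0, n=100_000):
    # midpoint rule for the integral of (eps/(s+eps))^2 over [0, t]
    h = t / n
    return sum((eps / ((i + 0.5) * h + eps)) ** 2 for i in range(n)) * h

for eps in (0.1, 0.01, 0.001):
    exact = eps - eps**2 / (1.0 + eps)   # closed form of the integral
    assert abs(bracket(eps) - exact) < 1e-6

# the integral, and hence the H^p distance, vanishes with eps
assert bracket(0.001) < bracket(0.01) < bracket(0.1)
```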
Definition 5.48 Let (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. We say that (Mᵢ)ᵢ₌₁ⁿ has the Integral Representation Property if for every M ∈ H₀ᵖ

M = Σᵢ₌₁ⁿ Hᵢ • Mᵢ,    Hᵢ ∈ Lᵖ(Mᵢ).

The main result about integral representation is an easy consequence of the Jacod–Yor theorem and the previous proposition:
Theorem 5.49 (Jacod–Yor) Let 1 ≤ p < ∞ and let X = (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. Assume that if i ≠ j then the martingales Mᵢ and Mⱼ are strongly orthogonal⁴⁸ as local martingales, that is, [Mᵢ, Mⱼ] = 0 whenever i ≠ j. If these assumptions hold then X has the Integral Representation Property in H₀ᵖ if and only if P is an extremal point of Mᵖ(X).

Proof. If X has the Integral Representation Property then⁴⁹ stable(X) = H₀ᵖ, so P is an extremal point of Mᵖ(X). Assume that X does not have the Integral Representation Property. This means that stableᵖ(X) ≠ H₀ᵖ. We show that in this case stable¹(X) ≠ H₀¹ as well: if stable¹(X) = H₀¹ then for every M ∈ H₀ᵖ ⊆ H₀¹

M = Σᵢ₌₁ⁿ Hᵢ • Mᵢ,    Hᵢ ∈ L¹(Mᵢ).

⁴⁸ See: Definition 4.1, page 227.
⁴⁹ See: Proposition 5.35, page 330.
But by the strong orthogonality assumption, for every k

[M](∞) = [Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞) = Σᵢ₌₁ⁿ (Hᵢ² • [Mᵢ])(∞) ≥ (Hₖ² • [Mₖ])(∞).

√([M](∞)) ∈ Lᵖ(Ω), so √((Hᵢ² • [Mᵢ])(∞)) ∈ Lᵖ(Ω). Hence Hᵢ ∈ Lᵖ(Mᵢ) for every i, which is impossible as X does not have the Integral Representation Property in H₀ᵖ. Hence

stableᵖ(X) ⊆ stable¹(X) ≠ H₀¹.

By the Hahn–Banach theorem there is a continuous linear functional L ∈ (H₀¹)* such that L(stable¹(X)) = 0. This implies that L(stableᵖ(X)) = 0. L is of course represented by a BMO martingale, so it is locally bounded. As we have remarked, one can assume that L is bounded. As we already discussed, in this case P is not an extremal point of Mᵖ(X).

The most important example is the following:

Example 5.50 If X = (wₖ)ₖ₌₁ⁿ are independent Wiener processes and the filtration F is the filtration generated by X, then X has the Integral Representation Property on any finite interval.
On any finite interval⁵⁰ wₖ ∈ H₀¹. We show that M¹(X) = {P}. If Q ∈ M¹(X) then wₖ is a continuous local martingale under Q for every k. Obviously [wₖ, wⱼ](t) = δₖⱼ t. By Lévy's characterization theorem⁵¹, X = (w₁, w₂, …, wₙ) is an n-dimensional Wiener process under Q as well. This implies that

∫_Ω f(X) dP = ∫_Ω f(X) dQ

for every F_∞-measurable bounded function f. As F is the filtration generated by X, this implies that P(F) = Q(F) for every F ∈ F_∞, so P = Q.

Example 5.51 If X = (πₖ)ₖ₌₁ⁿ are independent compensated Poisson processes and the filtration F is the filtration generated by X, then X has the Integral Representation Property on any finite interval.

⁵⁰ On the finite interval [0, s], ‖w‖_{H¹} = E(√([w](s))) = √s. See: Example 1.124, page 87.
⁵¹ See: Theorem 6.13, page 368.
On any finite interval πₖ ∈ H₀¹. If two Poisson processes are independent then they do not have common jumps⁵², so [πₖ, πⱼ] = 0. So we can apply the Jacod–Yor theorem. We shall prove that again M¹(X) = {P}. If X is a compensated Poisson process, then a.s.

[X](t) − λt = X(t).    (5.39)

Of course this identity holds under any probability measure Q ≪ P. As in the previous example, one should show that if X is a local martingale then (5.39) implies that X is a compensated Poisson process with parameter λ. Let us assume that for some process X under some measure (5.39) holds. In this case obviously (∆X)² = ∆X, that is, if ∆X ≠ 0 then ∆X = 1. [X] has finite variation, hence X also has finite variation, so X ∈ V ∩ L. Hence X is purely discontinuous, that is, X is a quadratic jump process: [X] = Σ(∆X)². The size of the jumps is constant, so as [X] is finite for every trajectory there is just a finite number of jumps on every finite interval. Let N(t) denote the number of jumps in the interval [0, t]. Then

N(t) − λt = [X](t) − λt = X(t).    (5.40)

As X is a local martingale this means that the compensator of N is λt. N is a counting process, so

exp(itN(u)) − 1 = Σ_{s≤u} (exp(itN(s)) − exp(itN(s−))) = Σ_{s≤u} (exp(it(N(s−) + 1)) − exp(itN(s−))) [N(s) − N(s−)] = (exp(it) − 1) Σ_{s≤u} exp(itN(s−)) [N(s) − N(s−)] = (exp(it) − 1) ∫₀ᵘ exp(itN(s−)) dN(s).

Taking expected values and using elementary properties of the compensator, and that on every finite interval N has only a finite number of jumps,

ϕᵤ(t) − 1 = E(exp(itN(u))) − 1 = (exp(it) − 1) E(∫₀ᵘ exp(itN(s−)) dN(s)) = (exp(it) − 1) E(∫₀ᵘ exp(itN(s−)) dNᵖ(s)) = λ(exp(it) − 1) E(∫₀ᵘ exp(itN(s−)) ds) = λ(exp(it) − 1) E(∫₀ᵘ exp(itN(s)) ds) = λ(exp(it) − 1) ∫₀ᵘ ϕₛ(t) ds,

⁵² See: Proposition 7.13, page 471.
where ϕᵤ(t) = E(exp(itN(u))) is the Fourier transform of N(u). Differentiating both sides by u,

(d/du) ϕᵤ(t) = λ(exp(it) − 1) ϕᵤ(t).

The solution of this equation is

ϕᵤ(t) = exp(λu(exp(it) − 1)).

Hence N(u) has a Poisson distribution with parameter λu. By (5.40), X is a compensated Poisson process with parameter λ. Finally recall that Poisson processes are independent if and only if⁵³ they do not have common jumps. This means that under Q the processes πₖ remain independent Poisson processes.

Example 5.52 A continuous martingale which does not have the Integral Representation Property.
Let ((w₁, w₂), G) be a two-dimensional Wiener process. Let X = w₁ • w₂, and let F be the filtration generated by X. Evidently Fₜ ⊆ Gₜ. X is obviously a local martingale under G.

E([X](T)) = E(∫₀ᵀ w₁² d[w₂]) = ∫₀ᵀ E(w₁²(t)) dt < ∞,
so on every finite interval X is in H₀². Hence X is a G-martingale. As X is F-adapted, one can easily show that X is an F-martingale. The quadratic variation [X] is F-adapted:

[X](t) = ∫₀ᵗ w₁² d[w₂] = ∫₀ᵗ w₁²(s) ds,

therefore the derivative of [X] is w₁². This implies that w₁² is also F-adapted. As [w₁] is deterministic,

Z = ½(w₁² − [w₁]) = w₁ • w₁

is also F-adapted. Z is an F-martingale: if s < t then, using that Z = ½(w₁² − [w₁]) is a G-martingale⁵⁴,

E(Z(t) | Fₛ) = E(E(Z(t) | Gₛ) | Fₛ) = E(Z(s) | Fₛ) = Z(s).

If X had the Integral Representation Property then for some Y

Z = Y • X = Y • (w₁ • w₂) = (Y w₁) • w₂.

As w₁ and w₂ are independent, [w₁, w₂] = 0, so

0 < [Z] = [w₁ • w₁, Y • X] = [w₁ • w₁, (Y w₁) • w₂] = (Y w₁²) • [w₁, w₂] = 0,

which is impossible.

⁵³ See: Proposition 7.11, page 469 and 7.13, page 471.
⁵⁴ w₁ is in H₀².
6 ITÔ'S FORMULA

Itô's formula is the most important relation of stochastic analysis. The formula is a stochastic generalization of the Fundamental Theorem of Calculus. Recall that for an arbitrary process X, for an arbitrary differentiable function f and for an arbitrary partition (tₖ⁽ⁿ⁾) of an interval [0, t],

f(X(t)) − f(X(0)) = Σₖ (f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾))) = Σₖ f′(ξₖ⁽ⁿ⁾)(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)),    (6.1)

where ξₖ⁽ⁿ⁾ ∈ (X(tₖ₋₁⁽ⁿ⁾), X(tₖ⁽ⁿ⁾)). If X is continuous then by the intermediate value theorem ξₖ⁽ⁿ⁾ = X(τₖ⁽ⁿ⁾), where τₖ⁽ⁿ⁾ ∈ (tₖ₋₁⁽ⁿ⁾, tₖ⁽ⁿ⁾). If X has finite variation, then if n → ∞ the sum on the right-hand side will be convergent and one can easily get the Fundamental Theorem of Calculus:

f(X(t)) − f(X(0)) = ∫₀ᵗ f′(X(s)) dX(s).

On the other hand, if X is a local martingale then the telescopic sum on the right-hand side of (6.1) does not necessarily converge to the stochastic integral ∫₀ᵗ f′(X(s)) dX(s), as one cannot guarantee the convergence unless τₖ⁽ⁿ⁾ = tₖ₋₁⁽ⁿ⁾. If we make a second-order approximation

f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾)) = f′(X(tₖ₋₁⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)) + ½ f″(ξₖ⁽ⁿ⁾)(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))²,

then the sum of the first-order terms

Iₙ = Σₖ f′(X(tₖ₋₁⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))
is an approximating sum of the Itô–Stieltjes integral ∫₀ᵗ f′(X(s)) dX(s). Of course the sum of the second-order terms is also convergent; the only question is, what is the limit? As

(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))² ≈ [X](tₖ⁽ⁿ⁾) − [X](tₖ₋₁⁽ⁿ⁾),

one can guess that the limit is

½ ∫₀ᵗ f″(X(s)) d[X](s).

This is true if X is continuous, as in this case again ξₖ⁽ⁿ⁾ = X(τₖ⁽ⁿ⁾) and the second-order term is 'close' to the Stieltjes-type approximating sum

½ Σₖ f″(X(τₖ⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))².

The argument just introduced is 'nearly valid' even if X is discontinuous. In this case the first-order term is again an Itô–Stieltjes-type approximating sum, it is convergent again in the Itô–Stieltjes sense, and the limit is¹

∫₀ᵗ f′(X(s)) dX(s) = ∫₀ᵗ f′(X₋(s)) dX(s).

The main difference is that in this case one cannot apply the intermediate value theorem to the second-order term. Therefore the second-order term is not a simple Stieltjes-type approximating sum. If we take only the 'continuous' subintervals, then one gets a Stieltjes-type approximating sum and the limit is

½ ∫₀ᵗ f″(X₋(s)) d[X^c].

For the remaining terms one can only apply the approximation

½ f″(ξₖ⁽ⁿ⁾)(∆X(tₖ⁽ⁿ⁾))² = f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾)) − f′(X(tₖ₋₁⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)),

which converges to

f(X(s)) − f(X(s−)) − f′(X(s−))∆X(s),

¹ See: Theorem 2.21, page 125. The second integral is convergent in the general sense as well.
so in the limit the second-order term is

½ ∫₀ᵗ f″(X₋(s)) d[X^c] + Σ_{0<s≤t} (f(X(s)) − f(X(s−)) − f′(X(s−))∆X(s)).
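For a continuous path the two-term expansion above can be checked numerically: along a simulated Brownian path, f(X(t)) − f(X(0)) is matched by Σ f′ΔX + ½ Σ f″(ΔX)², and Σ(ΔX)² ≈ t. A sketch with f(x) = x² (fixed seed; for this particular f the discrete identity is exact):

```python
import random, math

random.seed(42)
n, t = 100_000, 1.0
sqdt = math.sqrt(t / n)

x = 0.0
first, second, qv = 0.0, 0.0, 0.0
for _ in range(n):
    dx = random.gauss(0.0, sqdt)
    first += 2.0 * x * dx            # f'(x) = 2x, left-endpoint sum
    second += 0.5 * 2.0 * dx * dx    # (1/2) f''(x) (dx)^2 with f'' = 2
    qv += dx * dx                    # discrete quadratic variation
    x += dx

# Itô for f(x) = x^2: x(t)^2 = 2 ∫ x dx + [x](t), and [x](t) ≈ t
assert abs(x * x - (first + second)) < 1e-9   # exact discrete identity for x^2
assert abs(qv - t) < 0.05                     # Σ(Δx)^2 ≈ t
```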
6.1 Itô's Formula for Continuous Semimartingales
Recall that for continuous semimartingales one has the following integration by parts formula²:

Proposition 6.1 If X and Y are continuous semimartingales then for every t

X(t)Y(t) − X(0)Y(0) = ∫₀ᵗ X dY + ∫₀ᵗ Y dX + [X, Y](t).    (6.2)
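Formula (6.2) has an exact discrete counterpart: over any partition, X(t)Y(t) − X(0)Y(0) = Σ XₖΔY + Σ YₖΔX + Σ ΔXΔY, with the last sum approximating [X, Y](t). A quick check on made-up increments:

```python
# arbitrary made-up increments of two discrete paths
dX = [0.3, -0.1, 0.4, -0.2, 0.05]
dY = [-0.2, 0.25, 0.1, -0.3, 0.15]

X, Y = [0.0], [0.0]
for a, b in zip(dX, dY):
    X.append(X[-1] + a)
    Y.append(Y[-1] + b)

int_XdY = sum(X[k] * dY[k] for k in range(len(dY)))   # left-endpoint ∫ X dY
int_YdX = sum(Y[k] * dX[k] for k in range(len(dX)))   # left-endpoint ∫ Y dX
bracket = sum(a * b for a, b in zip(dX, dY))          # discrete [X, Y]

lhs = X[-1] * Y[-1] - X[0] * Y[0]
assert abs(lhs - (int_XdY + int_YdX + bracket)) < 1e-12
```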
Theorem 6.2 (Itô's formula) Let U be an open subset of ℝⁿ. If the elements of the vector X = (X₁, X₂, …, Xₙ) are continuous semimartingales, X(t) ∈ U for every t and f ∈ C²(U), then

f(X(t)) − f(X(0)) = Σₖ₌₁ⁿ ∫₀ᵗ (∂f/∂xₖ)(X) dXₖ + ½ Σᵢ,ⱼ ∫₀ᵗ (∂²f/∂xᵢ∂xⱼ)(X) d[Xᵢ, Xⱼ].    (6.3)

Proof. We divide the proof into several steps.

1. As a first step we prove the theorem for polynomials. If f ≡ c, where c is a constant, then the theorem is trivial. It is sufficient to prove that if the identity is valid for a polynomial f then it is true for the polynomial g = xₗ f as well. Assume that

f(X) = f(X(0)) + Σₖ (∂f/∂xₖ)(X) • Xₖ + ½ Σᵢ,ⱼ (∂²f/∂xᵢ∂xⱼ)(X) • [Xᵢ, Xⱼ].

By (6.2)

g(X) = Xₗ f(X) = g(X(0)) + Xₗ • f(X) + f(X) • Xₗ + [Xₗ, f(X)] =

² See: Proposition 2.28, page 129.
= g(X(0)) + Xₗ • f(X(0)) + Xₗ • (Σₖ (∂f/∂xₖ)(X) • Xₖ) + ½ Xₗ • (Σᵢ,ⱼ (∂²f/∂xᵢ∂xⱼ)(X) • [Xᵢ, Xⱼ]) + f(X) • Xₗ + [Xₗ, f(X)].

Now Xₗ • f(X(0)) = 0, and by the associativity rule for stochastic integrals³

g(X) = g(X(0)) + Σₖ (Xₗ (∂f/∂xₖ)(X)) • Xₖ + ½ Σᵢ,ⱼ (Xₗ (∂²f/∂xᵢ∂xⱼ)(X)) • [Xᵢ, Xⱼ] + f(X) • Xₗ + [Xₗ, f(X)].

By the product rule of differentiation

∂g/∂xₖ = xₗ ∂f/∂xₖ if k ≠ l,    ∂g/∂xₗ = xₗ ∂f/∂xₗ + f.    (6.4)

Substituting it in the formula above,

g(X) = g(X(0)) + Σₖ (∂g/∂xₖ)(X) • Xₖ + ½ Σᵢ,ⱼ (Xₗ (∂²f/∂xᵢ∂xⱼ)(X)) • [Xᵢ, Xⱼ] + [Xₗ, f(X)].

The second partial derivatives of g are

∂²g/∂xᵢ∂xⱼ = xₗ ∂²f/∂xᵢ∂xⱼ if i, j ≠ l;
∂²g/∂xₗ∂xⱼ = xₗ ∂²f/∂xₗ∂xⱼ + ∂f/∂xⱼ if j ≠ l;
∂²g/∂xᵢ∂xₗ = xₗ ∂²f/∂xᵢ∂xₗ + ∂f/∂xᵢ if i ≠ l;
∂²g/∂xₗ² = xₗ ∂²f/∂xₗ² + 2 ∂f/∂xₗ;    (6.5)

³ See: Proposition 2.71, page 160.
that is, the second-derivative matrices of f and g differ only in column l and in row l. It is sufficient to prove that

[Xₗ, f(X)] = Σⱼ₌₁ⁿ (∂f/∂xⱼ)(X) • [Xₗ, Xⱼ].

By the induction hypothesis f(X) is a semimartingale. As Xₗ is continuous, the quadratic co-variation of the bounded variation part of f(X) is zero. The quadratic co-variation of the stochastic integral part is

[Xₗ, Σₖ (∂f/∂xₖ)(X) • Xₖ] = Σⱼ₌₁ⁿ (∂f/∂xⱼ)(X) • [Xₗ, Xⱼ].

This means that the theorem is valid for polynomials.

2. Let us prove that one can localize the expression. That is, it is sufficient to prove the theorem for X^{τₙ}, where (τₙ) is some localizing sequence of X. Let τ be an arbitrary stopping time. The integrals in the second line are integrals taken by trajectory, hence obviously

(∂²f/∂xᵢ∂xⱼ)(X^τ) • [Xᵢ^τ, Xⱼ^τ] = (∂²f/∂xᵢ∂xⱼ)(X^τ) • [Xᵢ, Xⱼ]^τ = (∂²f/∂xᵢ∂xⱼ)(X) χ([0, τ]) • [Xᵢ, Xⱼ].

In a similar way, using the stopping rule for stochastic integrals,

(∂f/∂xₖ)(X^τ) • Xₖ^τ = (∂f/∂xₖ)(X^τ) χ([0, τ]) • Xₖ = (∂f/∂xₖ)(X) χ([0, τ]) • Xₖ.

Assume that the theorem is valid for the truncated processes X^{τₙ}. f ∈ C²(U), hence the trajectories of (∂f/∂xₖ)(X) and (∂²f/∂xᵢ∂xⱼ)(X) are continuous and therefore they are integrable. Evidently the integrands above are dominated by these common integrable processes. If τₙ → ∞ then χ([0, τₙ]) → 1. Applying the Dominated Convergence Theorem on both sides and using that f(X^{τₙ}) → f(X), one can easily prove the equality.

3. As X is continuous it is locally bounded. Let (τₙ) be a localizing sequence for which the images of the stopped processes X^{τₙ} are bounded. Let K ⊆ U be a compact set which contains the image of X^{τₙ}. One can prove that there is a sequence of polynomials (pₙ) such that pₙ|_K → f|_K in the topology of C²(K). By the definition of the topology of C², all the derivatives
are uniformly convergent. As the formula is valid for every polynomial, by the Dominated Convergence Theorem it is valid for the function f ∈ C²(U) as well.

Proposition 6.3 If the semimartingale Xₗ has finite variation, then it is sufficient to assume that the partial derivative ∂f/∂xₗ exists and is continuous. In this case in formula (6.3) one can drop the second-order terms with index l.

Proof. If Xₗ has finite variation then, as Xᵢ is continuous, [Xₗ, Xᵢ] = 0. If f is a polynomial, then the second-order terms with index l are zero, and in the approximation we do not need the second-order terms with index l.

Corollary 6.4 (Time-dependent Itô formula) If the elements of the vector X = (X₁, X₂, …, Xₙ) are continuous semimartingales, the image space of X is part of an open subset U ⊆ ℝⁿ and f ∈ C²(ℝ₊ × U), then⁴

f(t, X(t)) = f(0, X(0)) + ∫₀ᵗ (∂f/∂s)(s, X(s)) ds + Σᵢ₌₁ⁿ ∫₀ᵗ (∂f/∂xᵢ)(s, X(s)) dXᵢ(s) + ½ Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ∫₀ᵗ (∂²f/∂xᵢ∂xⱼ)(s, X(s)) d[Xᵢ, Xⱼ](s).

If X and Y are real-valued semimartingales then we can define the object Z = X + iY, which one can call a complex semimartingale. Let f : ℂ → ℂ be a holomorphic function. f(z) has the representation u(x, y) + iv(x, y), where u and v are differentiable functions. Recall that

∂u/∂x = ∂v/∂y and ∂u/∂y = −∂v/∂x.

If Z is a complex semimartingale then f(Z) = u(X, Y) + iv(X, Y).

⁴ It is sufficient to assume that f is continuously differentiable by the time parameter.
One can apply Itô's formula for u and for v:

u(X(t), Y(t)) = u(X(0), Y(0)) + ∫₀ᵗ (∂u/∂x)(X, Y) dX + ∫₀ᵗ (∂u/∂y)(X, Y) dY + ½ ∫₀ᵗ (∂²u/∂x²)(X, Y) d[X, X] + ½ ∫₀ᵗ (∂²u/∂y²)(X, Y) d[Y, Y] + ∫₀ᵗ (∂²u/∂x∂y)(X, Y) d[X, Y]

and

v(X(t), Y(t)) = v(X(0), Y(0)) + ∫₀ᵗ (∂v/∂x)(X, Y) dX + ∫₀ᵗ (∂v/∂y)(X, Y) dY + ½ ∫₀ᵗ (∂²v/∂x²)(X, Y) d[X, X] + ½ ∫₀ᵗ (∂²v/∂y²)(X, Y) d[Y, Y] + ∫₀ᵗ (∂²v/∂x∂y)(X, Y) d[X, Y].

The sum of the first-order terms is

∫₀ᵗ (∂u/∂x)(X, Y) dX + ∫₀ᵗ (∂u/∂y)(X, Y) dY + i ∫₀ᵗ (∂v/∂x)(X, Y) dX + i ∫₀ᵗ (∂v/∂y)(X, Y) dY.

As uₓ + ivₓ = v_y − iu_y = f′, this sum is

∫₀ᵗ f′(Z) dX + ∫₀ᵗ f′(Z) d(iY) = ∫₀ᵗ f′(Z) dZ.
Let us calculate the second-order terms:

∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[X, X],

∫₀ᵗ (∂²u/∂y² + i ∂²v/∂y²) d[Y, Y] = −∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[Y, Y] = ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[iY, iY],

∫₀ᵗ (∂²u/∂x∂y + i ∂²v/∂x∂y) d[X, Y] = i ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[X, Y] = ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[X, iY].

Also, by definition, [Z] = [X] + 2i[X, Y] − [Y]. Therefore the second-order term is

½ ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[Z] = ½ ∫₀ᵗ f″(Z) d[Z].
Corollary 6.5 (Itô's formula for holomorphic functions) If f(t, z) is continuously differentiable in t and holomorphic in z, and Z is a continuous complex semimartingale, then

f(t, Z(t)) = f(0, Z(0)) + ∫₀ᵗ (∂f/∂s)(s, Z(s)) ds + ∫₀ᵗ (∂f/∂z)(s, Z(s)) dZ(s) + ½ ∫₀ᵗ (∂²f/∂z²)(s, Z(s)) d[Z](s).

Example 6.6 If Z = w₁ + iw₂ is a planar Brownian motion and f is an entire function, then f(Z) is a complex local martingale and

f(Z(t)) = f(Z(0)) + ∫₀ᵗ f′(Z) dZ.
As [w₁, w₂] = 0 and [w₁](t) = [w₂](t) = t, obviously [Z] = [w₁] + 2i[w₁, w₂] − [w₂] = 0.
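The cancellation [Z] = [w₁] − [w₂] + 2i[w₁, w₂] = 0 can also be seen numerically: along simulated independent paths, Σ(ΔZ)² = Σ(Δw₁² − Δw₂²) + 2i ΣΔw₁Δw₂ stays near zero while Σ|ΔZ|² ≈ 2t (fixed seed, illustrative only):

```python
import random, math

random.seed(0)
n, t = 200_000, 1.0
s = math.sqrt(t / n)

sum_dz2 = 0.0 + 0.0j   # Σ (ΔZ)^2, should be ≈ 0
sum_abs2 = 0.0         # Σ |ΔZ|^2, should be ≈ 2t
for _ in range(n):
    dz = complex(random.gauss(0.0, s), random.gauss(0.0, s))
    sum_dz2 += dz * dz
    sum_abs2 += abs(dz) ** 2

assert abs(sum_dz2) < 0.05            # [Z](t) = 0 for Z = w1 + i w2
assert abs(sum_abs2 - 2.0 * t) < 0.05  # but Σ|ΔZ|^2 ≈ 2t, not 0
```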
6.2 Some Applications of the Formula

In this section we present some famous and important applications of the formula.

6.2.1 Zeros of Wiener processes
As a first application let us investigate some important properties of the multidimensional Wiener processes. By definition assume that the coordinates of a d-dimensional Wiener process w are independent one-dimensional Wiener processes. To simplify the notation we say that a stochastic process w is a d-dimensional Wiener process starting from some point x ∈ Rd if it has the rep where w is an ordinary d-dimensional Wiener process, resentation w = x + w, obviously starting from the origin. In the same way if x is an F0 -measurable random vector then one can talk about a Wiener process starting from x. Assume that w starts from some vector x. Let5 ϑ inf {w (t) : t ≥ 0} . What is the distribution of ϑ? Theorem 6.7 (Return of a Wiener process to the origin) Every d-dimensional Wiener process w starting from some vector x = 0 satisfies the following6 : 1. If d ≥ 2 then for almost every outcome ω the trajectory w(ω) is never zero, that is P (w (t) = 0, ∀t > 0) = 1. 2. If d = 2 then P (ϑ = 0) = 1, that is, w is almost surely never zero, but it hits every neighborhood of the origin almost surely. 3. If d = 2 then the trajectories of w are almost surely dense in R2 . 4. If d ≥ 3 and w (0) = x = 0 then P (ϑ ≤ r) = 5 In
this section x denotes the norm
6 See:
Corollary B.8. page 565.
r x
d−2 ,
k
x2k .
if
0 ≤ r ≤ x .
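Statement 4 can be illustrated by simulation. The sketch below is not from the book; the horizon, grid and path counts are ad hoc choices. For $d=3$, $\|x\|=1$, $r=1/2$ the theorem gives $P(\vartheta\le r)=1/2$ exactly; the estimate is biased slightly downward because the horizon is finite and the minimum is taken over a discrete grid.

```python
import numpy as np

rng = np.random.default_rng(8)
d, T, n_steps, n_paths = 3, 30.0, 30_000, 400
dt = T / n_steps
x = np.array([1.0, 0.0, 0.0])  # start at distance 1 from the origin
r = 0.5                        # exact answer: (r/1)^(d-2) = 0.5
hits = 0
for _ in range(n_paths):
    w = x + np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_steps, d)), axis=0)
    if (np.sum(w**2, axis=1) <= r**2).any():
        hits += 1
frac = hits / n_paths
print(frac)  # roughly 0.5, slightly low
```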
Proof. Assume that the twice continuously differentiable function $f$ defined on an open set $U\subseteq\mathbb R^d$ satisfies the Laplace equation
$$\sum_{k=1}^d\frac{\partial^2f}{\partial x_k^2}=0,\qquad f\in C^2(U).\tag{6.6}$$
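The radial functions used throughout this proof (and in Example 6.10 below) satisfy (6.6) away from their singularity. A quick finite-difference check, illustrative and not from the book, with an arbitrarily chosen evaluation point:

```python
import numpy as np

def laplacian(f, p, h=1e-4):
    """Central finite-difference Laplacian of f at the point p."""
    p = np.asarray(p, dtype=float)
    total = 0.0
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        total += (f(p + e) - 2 * f(p) + f(p - e)) / h**2
    return total

f3 = lambda x: 1.0 / np.linalg.norm(x)    # |x|^(2-d) for d = 3
f2 = lambda x: np.log(np.linalg.norm(x))  # log|x| for d = 2
l3 = laplacian(f3, [0.7, -0.4, 0.5])
l2 = laplacian(f2, [0.7, -0.4])
print(l3, l2)  # both approximately 0
```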
Let $\tau$ be a stopping time. If a $d$-dimensional Wiener process $w$ starting from an $x$ remains in $U$ then by Itô's formula
$$f(w^\tau)-f(w(0))=\sum_{k=1}^d\frac{\partial f}{\partial x_k}(w^\tau)\bullet w_k^\tau+\frac12\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(w^\tau)\bullet\left[w_i^\tau,w_j^\tau\right].$$
If $i\ne j$ then^7 $\left[w_i^\tau,w_j^\tau\right]=0^\tau=0$. Hence, as $[w_i^\tau](s)=s\wedge\tau$,
$$f(w^\tau(t))-f(x)=f(w^\tau(t))-f(w(0))=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(w^\tau)\,dw_k^\tau+\frac12\int_0^{t\wedge\tau}(\Delta f)(w^\tau(s))\,ds=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(w^\tau)\,dw_k^\tau.\tag{6.7}$$
Assume that $\tau<\infty$ and that $w$ is bounded on the random interval $[0,\tau]$. In this case the integrands in (6.7) are bounded. As on any finite interval $w^\tau$ is square-integrable, the stochastic integrals are martingales^8. For every $\varepsilon>0$
$$P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\right)=\int_{\mathbb R^d}P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\;\middle|\;w(\varepsilon)=y\right)d\rho(y),$$
where $\rho$ is the distribution of $w(\varepsilon)$. Let us calculate the conditional probability. As $w$ has stationary and independent increments
$$\begin{aligned}
P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\;\middle|\;w(\varepsilon)=y\right)&=P\left(\inf_{t\ge\varepsilon}\|w(t)-w(\varepsilon)+w(\varepsilon)\|>0\;\middle|\;w(\varepsilon)=y\right)\\
&=P\left(\inf_{t\ge\varepsilon}\|w(t)-w(\varepsilon)+y\|>0\right)\\
&=P\left(\inf_{u\ge0}\|w(u)+y\|>0\right)=P\left(\inf_{u\ge0}\|w_y(u)\|>0\right),
\end{aligned}$$
where $w_y$ is the Wiener process starting from the point $y$. By the formula already proved for $x\ne0$ in 3. and 4. above,
$$P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\right)=\int_{\mathbb R^d}P\left(\inf_{t\ge0}\|w_y(t)\|>0\right)d\rho(y)=\int_{\mathbb R^d\setminus\{0\}}P\left(\inf_{t\ge0}\|w_y(t)\|>0\right)d\rho(y)=\int_{\mathbb R^d\setminus\{0\}}1\,d\rho(y)=1.$$
If $\varepsilon\to0$ then
$$P\left(w(t)\ne0,\ \forall t>0\right)=\lim_{\varepsilon\searrow0}P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\right)=1.$$
This means that with probability one $w$ does not return to the origin. Hence we have proved the theorem for all initial vectors $x\in\mathbb R^d$.
6. Instead of balls around the origin one can take any ball. If we take the balls with rational centers and rational radii then the two-dimensional Wiener process with probability one intersects all of them. Therefore the trajectories of the two-dimensional Wiener processes are dense in $\mathbb R^2$.
In the same way one can prove the following:

Corollary 6.8 Let $d\ge3$ and let $w$ be a $d$-dimensional Wiener process starting from some random vector $x$. If $x$ is deterministic then
$$P(\vartheta\le r)=\left(\frac r{\|x\|}\right)^{d-2},\qquad 0\le r\le\|x\|.$$
Corollary 6.9 If $d\ge3$ and $w$ is a $d$-dimensional Wiener process then $\lim_{t\to\infty}\|w(t)\|=\infty$.

Proof. Let $r>0$ be arbitrary and for any $a\ge r$ let
$$\tau_a\triangleq\inf\{t:\|w(t)\|\ge a\}.$$
As almost surely^{12} $\limsup_{t\to\infty}\|w(t)\|=\infty$, obviously $\tau_a<\infty$ almost surely. By the strong Markov property of $w$,
$$w^*(t)\triangleq\left(w(t+\tau_a)-w(\tau_a)\right)+w(\tau_a),\qquad t\ge0,$$
is a Wiener process starting from the random point $w(\tau_a)\in\{\|u\|=a\}$. Since $d\ge3$,
$$P\left(\exists t\ge\tau_a,\ \|w(t)\|\le r\right)=P\left(\exists t\ge0,\ \|w^*(t)\|\le r\right)=\left(\frac ra\right)^{d-2}.$$

12. See: Proposition B.7, page 564.
If $a\to\infty$ then this probability goes to zero. Let $a_n\nearrow\infty$. The probability that $\|w(t)\|$ returns to the ball $\{\|u\|\le r\}$ after infinitely many of the times $\tau_{a_n}$ is zero. Hence with probability one for any $\omega$ there is an $n\triangleq n(\omega)$ such that
$$w(t,\omega)\notin\{\|u\|\le r\},\qquad t\ge\tau_{a_n}(\omega).$$
That is, with probability one^{13}, if $t\to\infty$ then $\|w(t,\omega)\|\to\infty$.

Example 6.10 Hitting times of open and closed sets in higher dimensions^{14}.
1. Let $B(x_0,r)\triangleq\{x\in\mathbb R^d:\|x-x_0\|<r\}$. Let $x_0\ne0$, $x_0\in B(0,1)$, and let
$$f(x)\triangleq g\left(\|x-x_0\|\right)\triangleq\begin{cases}\log\|x-x_0\|&\text{if }d=2,\\\|x-x_0\|^{2-d}&\text{if }d\ge3.\end{cases}$$
Obviously $f$ satisfies the Laplace equation (6.6) on $\mathbb R^d\setminus\{x_0\}$. If $B(x_0,r)\subseteq B(0,1)$ and $B\triangleq B(0,1)\setminus\operatorname{cl}(B(x_0,r))$ then $f$ is bounded on $B$. Let $w$ be a $d$-dimensional Wiener process and let
$$\tau\triangleq\inf\{t:w(t)\in\partial B(0,1)\}.$$
As $\limsup_t\|w(t)\|=\infty$, obviously^{15} almost surely $\tau<\infty$. By Itô's formula $X\triangleq f(w^\tau)$ is a bounded local martingale on $B$, therefore $X$ is a uniformly integrable martingale^{16}. Hence if
$$\rho\triangleq\inf\{t:w(t)\in\partial B\},$$
then $\mathsf E(X(\rho))=\mathsf E(X(0))=f(0)$. If
$$\rho_1\triangleq\inf\{t:w(t)\in\partial B(0,1)\},\qquad\rho_2\triangleq\inf\{t:w(t)\in\partial B(x_0,r)\},$$

13. Take $r\triangleq1,2,\dots$
14. See: Corollary B.12, page 566.
15. See: Proposition B.7, page 564.
16. See: Corollary 1.145, page 103.
then, as $\rho=\rho_1\wedge\rho_2$,
$$f(0)=\mathsf E(X(\rho))=\mathsf E\left(X(\rho_1)\chi(\rho_1\le\rho_2)\right)+\mathsf E\left(X(\rho_2)\chi(\rho_2<\rho_1)\right).$$
Obviously $\mathsf E\left(X(\rho_2)\chi(\rho_2<\rho_1)\right)=g(r)\cdot P(\rho_2<\rho_1)$, and for some $k$, $\left|\mathsf E\left(X(\rho_1)\chi(\rho_1\le\rho_2)\right)\right|\le k$ for all $r>0$. This implies that for any $0<r<1$
$$P(\rho_2<\rho_1)=\frac{g(\|x_0\|)-\mathsf E\left(X(\rho_1)\chi(\rho_1\le\rho_2)\right)}{g(r)}\le\frac{|g(\|x_0\|)|+k}{|g(r)|}.$$
If $r\to0$ then the right-hand side goes to zero, so for any $\varepsilon>0$ there is an $r>0$ such that $P(\rho_2<\rho_1)<\varepsilon$.
2. Let $(q_i)$ be the non-zero rational points of $B(0,1)$ and for any $i$ let $r_i>0$ be such that
$$P\left(\rho_2^{(i)}<\rho_1\right)<2^{-(i+1)},$$
where of course
$$\rho_2^{(i)}\triangleq\inf\{t:w(t)\in\partial B(q_i,r_i)\}.$$
Let $G\triangleq\cup_iB(q_i,r_i)$. Obviously $G$ is open and
$$\tau_{\operatorname{cl}(G)}\triangleq\inf\{t:w(t)\in\operatorname{cl}(G)\}=\inf\{t:w(t)\in\operatorname{cl}(B(0,1))\}=0.$$
On the other hand obviously $\rho_1>0$, and if $\tau_G\triangleq\inf\{t:w(t)\in G\}$ then
$$P(\tau_G\ge\rho_1)=1-P(\tau_G<\rho_1)\ge1-\sum_iP\left(\rho_2^{(i)}<\rho_1\right)\ge1-\sum_i2^{-(i+1)}\ge\frac12.$$
Therefore $\tau_{\operatorname{cl}(G)}$ and $\tau_G$ are not almost surely equal.
6.2.2 Continuous Lévy processes
Let $X$ be a continuous Lévy process. Since $X$ is continuous, all the moments of $X$ are finite^{17}. Hence $X(t)$ has an expected value for every $t$. Observe that as on any finite interval the second moments are bounded, $X$ is uniformly integrable on these intervals. Therefore $\mathsf E(X(t))$ is continuous in $t$, hence $\mathsf E(X(t))=t\,\mathsf E(X(1))$. Therefore if $m$ denotes the expected value of $X(1)$ then $X(t)-t\cdot m$ is a martingale. This means that $X$ is a continuous semimartingale. To simplify the notation assume that $m=0$. By the definition of the quadratic variation $[X]$ is also a continuous Lévy process. This again implies that $Y(t)\triangleq[X](t)-\mathsf E([X](t))$ is a martingale. As $Y$ obviously has finite variation, by Fisk's theorem^{18} it is constant. So $[X](t)=\mathsf E([X](t))=a\cdot t$. By Itô's formula
$$\exp(iuX(t))-1=iu\int_0^t\exp(iuX(s))\,dX(s)-\frac{u^2}2\int_0^t\exp(iuX(s))\,d[X](s).$$
$\exp(iuX(t))$ is bounded and the quadratic variation of $X$ is deterministic, therefore by the characterization of $\mathcal H^2$-martingales^{19} the stochastic integral is a martingale. Taking expected value on both sides,
$$\mathsf E(\exp(iuX(t)))-1=-\frac{u^2}2\,\mathsf E\left(\int_0^t\exp(iuX(s))\,d[X](s)\right)=-\frac{u^2}2\,\mathsf E\left(\int_0^t\exp(iuX(s))\,d(as)\right)=-a\frac{u^2}2\int_0^t\mathsf E(\exp(iuX(s)))\,ds.$$
If $\varphi(u,t)\triangleq\mathsf E(\exp(iuX(t)))$ then
$$\varphi(u,t)-1=-a\frac{u^2}2\int_0^t\varphi(u,s)\,ds.$$
Differentiating with respect to $t$,
$$\frac{d\varphi(u,t)}{dt}=-a\frac{u^2}2\cdot\varphi(u,t).$$
Solving the differential equation,
$$\varphi(u,t)=\exp\left(-a\frac{u^2}2t\right)$$

17. See: Proposition 1.111, page 74.
18. See: Theorem 2.11, page 117.
19. See: Proposition 2.53, page 148.
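The closed form $\varphi(u,t)=\exp(-au^2t/2)$ can be compared with a Monte Carlo estimate of the characteristic function of $X=\sqrt a\,w$, the continuous Lévy process with $m=0$ and $[X](t)=at$. A sketch, not from the book; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
a, t, u, n = 2.0, 1.5, 0.8, 500_000
# X(t) = sqrt(a) * w(t), so [X](t) = a*t and m = 0
X_t = np.sqrt(a) * rng.normal(0.0, np.sqrt(t), n)
phi_mc = np.exp(1j * u * X_t).mean()
phi_exact = np.exp(-a * u**2 * t / 2)
err = abs(phi_mc - phi_exact)
print(err)  # small
```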
for every $u$. By the formula of the Fourier transform for the normal distribution, $X(t)\cong N\left(0,\sqrt{at}\right)$. Hence $X/\sqrt a$ is a Wiener process. In general $m$ is not zero, hence we have proved the next proposition:

Theorem 6.11 Every continuous Lévy process is a linear combination of a Wiener process and a linear trend.

One can extend the theorem to processes with independent increments:

Theorem 6.12 Every continuous process with independent increments is a Gaussian process, that is, for every $t_1,t_2,\dots,t_n$ the vector $(X(t_1),X(t_2),\dots,X(t_n))$ has Gaussian distribution.

Proof. If $X$ has independent increments then $Z(t)\triangleq X(t+s)-X(s)$ also has independent increments for every $s$. Therefore it is easy to prove that it is sufficient to show that $X(t)$ has a Gaussian distribution for every $t$. By the continuity of $X$ all the moments of $X$ are bounded on every finite interval^{20}. Therefore the expected value $\mathsf E(X(t))$ is finite for every $t$. As $X$ is bounded in $L^2(\Omega)$ on every finite interval, it is uniformly integrable on any finite interval, so $\mathsf E(X(t))$ is continuous. Hence it is easy to see that $Y(t)\triangleq X(t)-\mathsf E(X(t))$ is a continuous martingale. Therefore one may assume that $X$ is a continuous martingale. As $X$ has independent increments, $[X]$ also has independent increments, so $U(t)\triangleq[X](t)-\mathsf E([X](t))$ is again a continuous martingale. As $[X]$ is increasing, $U$ has finite variation. So by Fisk's theorem almost surely $U\equiv0$. Therefore one can assume that $[X]$ is deterministic. By Itô's formula
$$\exp(iuX(t))-1=iu\int_0^t\exp(iuX(s))\,dX(s)-\frac{u^2}2\int_0^t\exp(iuX(s))\,d[X](s).$$
$\exp(iuX)$ is bounded and on any finite interval $X\in\mathcal H^2$, therefore the stochastic integral is a martingale^{21}. Taking expected value,
$$\mathsf E(\exp(iuX(t)))-1=-\frac{u^2}2\cdot\mathsf E\left(\int_0^t\exp(iuX(s))\,d[X](s)\right).\tag{6.11}$$

20. See: Proposition 1.114, page 78.
21. See: Proposition 2.24, page 128.
The quadratic variation is deterministic so one can change the order of the integration:
$$\mathsf E(\exp(iuX(t)))-1=-\frac{u^2}2\cdot\int_0^t\mathsf E(\exp(iuX(s)))\,d[X](s).$$
If $\varphi(u,t)\triangleq\mathsf E(\exp(iuX(t)))$, then $\varphi$ satisfies the integral equation
$$\varphi(u,t)-1=-\frac{u^2}2\cdot\int_0^t\varphi(u,s)\,d[X](s).\tag{6.12}$$
If
$$\varphi(u,t)\triangleq\exp\left(-\frac{u^2}2[X](t)\right)\tag{6.13}$$
then, as $[X]$ is deterministic with finite variation, $\varphi$ satisfies^{22} (6.12). One can easily prove^{23} that (6.13) is the only solution of (6.12). Therefore $X(t)$ has a Gaussian distribution for every $t$.

6.2.3 Lévy's characterization of Wiener processes
The characterization theorem of Lévy is similar to the proposition just proved: it characterizes Wiener processes among the continuous local martingales. If $X\in\mathcal L$ and if $[X](t)=t$ then by the same argument^{24} as above one can prove that $X(t)\cong N\left(0,\sqrt t\right)$ for every $t$. As $X(t+s)-X(s)\in\mathcal L$, the increments of $X$ are also Gaussian. As $X(u)-X(v)\cong N\left(0,\sqrt{u-v}\right)$, it is easy to prove that the increments of $X$ are not correlated. As $X$ has Gaussian increments, the increments are independent. Therefore by the same argument as above one can prove that $X$ is a Wiener process with respect to its own filtration^{25}. Our goal is to prove that $X$ is a Wiener process with respect to the original filtration^{26}.

Theorem 6.13 (Lévy's characterization of Wiener processes) Let us fix a filtration $\mathcal F$. If the $n$-dimensional continuous process $X\triangleq(X_1,X_2,\dots,X_n)$ is zero at $t=0$ then the next three statements are equivalent:
1. $X$ is an $n$-dimensional Wiener process with respect to $\mathcal F$.
2. $X$ is a local martingale with respect to $\mathcal F$ and $[X_i,X_j](t)=\delta_{ij}t$.

22. See: (6.32), page 398.
23. See: (6.48), page 416.
24. Of course $X\in\mathcal H^2_{loc}$ and not $X\in\mathcal H^2$, so one can first localize $X$ and then take the limit in (6.11); otherwise the argument is nearly the same.
25. See: Definition B.1, page 559.
26. See: Definition B.4, page 561.
3. Whenever $f_k\in L^2(\mathbb R^+,\lambda)$, where $\lambda$ is Lebesgue's measure, then
$$\mathcal E(i(f\bullet X))(t)\triangleq\exp\left(i\sum_{k=1}^n\int_0^tf_k\,dX_k+\frac12\sum_{k=1}^n\int_0^tf_k^2\,d\lambda\right)$$
will be a complex martingale with respect to $\mathcal F$.

In particular, if $X$ is a continuous local martingale and $Y(t)\triangleq X^2(t)-t$ is a continuous local martingale then $X$ is a Wiener process.

Proof. Let us show that each statement implies the next one.
1. The implication 1. $\Rightarrow$ 2. follows^{27} from the relation $[w](t)=t$.
2. The proof of the implication 2. $\Rightarrow$ 3. is the following: using Itô's formula, with a simple calculation one can show that $\mathcal E(if\bullet X)$ is a local martingale. As $f_k\in L^2(\mathbb R^+,\lambda)$,
$$\mathcal E(i(f\bullet X))(t)=\exp\left(i\sum_{k=1}^n\int_0^tf_k\,dX_k\right)\exp\left(\frac12\sum_{k=1}^n\int_0^tf_k^2\,d\lambda\right)$$
is uniformly bounded, hence it is a local martingale in class D. Hence $\mathcal E(if\bullet X)$ is a martingale^{28}.
3. Finally we prove the implication 3. $\Rightarrow$ 1. If $u\in\mathbb R^n$, $0\le r<\infty$ and $f\triangleq u\chi([0,r])$ then, as $X(0)=0$,
$$\mathcal E(if\bullet X)(t)=\exp\left(i\sum_{k=1}^n\int_0^tu_k\chi([0,r])\,dX_k+\frac12\|u\|^2(t\wedge r)\right)=\exp\left(i(u,X(r\wedge t))+\frac12\|u\|^2(t\wedge r)\right).$$
$\mathcal E(if\bullet X)\ne0$ is a martingale, hence if $s<t<r$ then
$$1=\mathsf E\left(\mathcal E(if\bullet X)(t)\left(\mathcal E(if\bullet X)(s)\right)^{-1}\;\middle|\;\mathcal F_s\right)=\mathsf E\left(\exp\left(i(u,X(t)-X(s))+\frac12\|u\|^2(t-s)\right)\;\middle|\;\mathcal F_s\right),$$
therefore
$$\mathsf E\left(\exp\left(i(u,X(t)-X(s))\right)\;\middle|\;\mathcal F_s\right)=\exp\left(-\frac12\|u\|^2(t-s)\right),$$

27. See: Example 2.27, page 129; Example 2.46, page 144.
28. See: Proposition 1.144, page 102.
which means that for any set $F\in\mathcal F_s$
$$\int_F\exp\left(i(u,X(t)-X(s))\right)dP=P(F)\cdot\exp\left(-\frac12\|u\|^2(t-s)\right).$$
If $F=\Omega$ then this implies that the distribution of $X_i(t)-X_i(s)$ is $N\left(0,\sqrt{t-s}\right)$. Therefore
$$\int_F\exp\left(i(u,X(t)-X(s))\right)dP=P(F)\cdot\int_\Omega\exp\left(i(u,X(t)-X(s))\right)dP.$$
Since this equality holds for every trigonometric polynomial, by the Monotone Class Theorem for every $B\in\mathcal B(\mathbb R^n)$
$$P\left(\{X(t)-X(s)\in B\}\cap F\right)=\int_F\chi_B(X(t)-X(s))\,dP=P(F)\int_\Omega\chi_B(X(t)-X(s))\,dP=P\left(\{X(t)-X(s)\in B\}\right)\cdot P(F).$$
Hence the increment $X(t)-X(s)$ is independent of the $\sigma$-algebra $\mathcal F_s$. So $X$ is a Wiener process.
Example 6.14 For every Wiener process $w$ the integral $\operatorname{sgn}(w)\bullet w$ is a Wiener process.

The process is a continuous local martingale. The quadratic variation of $\operatorname{sgn}(w)\bullet w$ is
$$\int_0^t\left(\operatorname{sgn}(w)\right)^2d[w]=\int_0^t\left(\operatorname{sgn}(w(s))\right)^2ds=t.$$
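Example 6.14 can be illustrated by discretizing the stochastic integral with left-endpoint Itô sums (an illustrative sketch, not from the book; path and step counts are arbitrary): the resulting variable should have mean $\approx0$ and variance $\approx t$.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps, t = 5_000, 1_000, 1.0
dt = t / n_steps
dw = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
w = np.cumsum(dw, axis=1)
# Ito sums: integrand sgn(w) evaluated at the LEFT endpoint of each step
sgn = np.sign(np.hstack([np.zeros((n_paths, 1)), w[:, :-1]]))
I = (sgn * dw).sum(axis=1)
print(I.mean(), I.var())  # approximately 0 and t = 1
```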
Example 6.15 The reflected Wiener process is also a Wiener process.

Let $w$ be a Wiener process and let $\tau$ be a stopping time. Define the reflected process
$$\widetilde w(t,\omega)\triangleq\begin{cases}w(t,\omega)&\text{if }t\le\tau(\omega)\\2w(\tau(\omega),\omega)-w(t,\omega)&\text{if }t>\tau(\omega)\end{cases}=\left(2w^\tau-w\right)(t,\omega).$$
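The reflection can be implemented directly on simulated paths (illustrative sketch, not from the book; the stopping time "first passage of the level $1/2$" is an arbitrary choice). In the discrete model, flipping the increments after a stopping time preserves the distribution, so $\widetilde w(T)$ should again be centered with variance $T$.

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, T = 10_000, 500, 1.0
dt = T / n_steps
w = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
# stopping time: first index where the path reaches the level 0.5
reached = w >= 0.5
hit = np.argmax(reached, axis=1)
never = ~reached.any(axis=1)          # tau > T for these paths
w_tau = w[np.arange(n_paths), hit]
# reflected endpoint: w(T) if tau > T, else 2 w(tau) - w(T)
wt = np.where(never, w[:, -1], 2 * w_tau - w[:, -1])
print(wt.mean(), wt.var())  # approximately 0 and T = 1
```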
Obviously $\widetilde w(0)=0$ and the trajectories of $\widetilde w$ are continuous. It is also obvious that
$$\left[\widetilde w\right]=\left[2w^\tau-w\right]=4\left[w^\tau\right]-4\left[w^\tau,w\right]+[w]=4[w]^\tau-4[w]^\tau+[w]=[w].$$
As $w^\tau$ is a martingale and the sum of martingales is again a martingale, $\widetilde w$ is a continuous local martingale, so by Lévy's theorem it is a Wiener process.

Let us discuss an interesting relation between exponential martingales and the quadratic variation:

Proposition 6.16 Let $X$ and $A$ be continuous adapted processes on the half-line $t\ge0$. If $X(0)=0$ then the next statements are equivalent:
1. $A$ has finite variation and for every $\alpha\in\mathbb C$ the process $Y_\alpha\triangleq\exp\left(\alpha X-\alpha^2A/2\right)$ is a local martingale,
2. $X$ is a local martingale and $[X]=A$.
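Statement 1 can be sanity-checked for $X=w$ and $A(t)=t$: for any complex $\alpha$ the variable $Y_\alpha(t)=\exp(\alpha w(t)-\alpha^2t/2)$ has expectation $Y_\alpha(0)=1$. A Monte Carlo sketch, not from the book; the two values of $\alpha$ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
t, n = 1.0, 400_000
w = rng.normal(0.0, np.sqrt(t), n)   # X = w, A(t) = [w](t) = t
means = {}
for alpha in (0.5, 1.0 + 0.5j):      # one real, one genuinely complex alpha
    means[alpha] = np.exp(alpha * w - alpha**2 * t / 2).mean()
print(means)  # each value approximately 1
```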
Proof. We prove that each statement implies the other one.
1. Assume that $Y_\alpha$ is a local martingale and let $(\sigma_n)$ be a localizing sequence of $Y_\alpha$. Let
$$\tau_n\triangleq\inf\{t:|X(t)|\ge n\}\wedge\inf\{t:|A(t)|\ge n\}\wedge\sigma_n.$$
$Y_\alpha^{\tau_n}$ is a martingale and obviously
$$\left|Y_\alpha^{\tau_n}\right|\le\exp\left(|\alpha|n+\frac12|\alpha|^2n\right),\qquad\left|\frac d{d\alpha}Y_\alpha^{\tau_n}\right|\le\left|Y_\alpha^{\tau_n}\right|\left|X^{\tau_n}-\alpha A^{\tau_n}\right|,\qquad\left|\frac{d^2}{d\alpha^2}Y_\alpha^{\tau_n}\right|\le\left|Y_\alpha^{\tau_n}\right|\left|\left(X^{\tau_n}-\alpha A^{\tau_n}\right)^2-A^{\tau_n}\right|.$$
It is easy to see that if $\alpha$ is in a bounded neighbourhood of the origin then the expressions on the right-hand side are bounded. Hence in the next calculation one can differentiate under the integral sign at $\alpha=0$.
If $\alpha=0$ then
$$\frac d{d\alpha}Y_\alpha^{\tau_n}=X^{\tau_n},$$
hence for any $F\in\mathcal F_s$
$$\int_F\mathsf E\left(X^{\tau_n}(t)\mid\mathcal F_s\right)dP=\int_F\mathsf E\left(\frac d{d\alpha}Y_\alpha^{\tau_n}(t)\;\middle|\;\mathcal F_s\right)dP=\int_F\frac d{d\alpha}Y_\alpha^{\tau_n}(t)\,dP=\frac d{d\alpha}\int_FY_\alpha^{\tau_n}(t)\,dP=\frac d{d\alpha}\int_FY_\alpha^{\tau_n}(s)\,dP=\int_F\frac d{d\alpha}Y_\alpha^{\tau_n}(s)\,dP=\int_FX^{\tau_n}(s)\,dP,$$
therefore almost surely
$$\mathsf E\left(X^{\tau_n}(t)\mid\mathcal F_s\right)=X^{\tau_n}(s).$$
Therefore $X^{\tau_n}$ is a martingale, hence $X$ is a local martingale. In a similar way, using that at $\alpha=0$
$$\frac{d^2Y_\alpha^{\tau_n}}{d\alpha^2}=\left(X^{\tau_n}\right)^2-A^{\tau_n},$$
one can prove that $\left(X^{\tau_n}\right)^2-A^{\tau_n}$ is a martingale. This implies^{29} that $A$ is increasing and $[X]=A$.
2. The implication 2. $\Rightarrow$ 1. is an easy consequence of Itô's formula. As the quadratic variation of a continuous semimartingale is equal to the quadratic variation of its local martingale part, if $Z\triangleq\alpha X-\alpha^2A/2$ then $Y_\alpha=\exp(Z)$ and
$$Y_\alpha-Y_\alpha(0)=Y_\alpha\bullet Z+\frac12Y_\alpha\bullet[Z]=Y_\alpha\bullet\left(\alpha X-\alpha^2\frac A2\right)+\frac12Y_\alpha\bullet[\alpha X]$$

29. See: Proposition 2.40, page 141.
$$=\alpha Y_\alpha\bullet X-\frac{\alpha^2}2Y_\alpha\bullet[X]+\frac{\alpha^2}2Y_\alpha\bullet[X]=\alpha Y_\alpha\bullet X,$$
which is, as a stochastic integral with respect to a continuous local martingale, a local martingale.

6.2.4 Integral representation theorems for Wiener processes
In this subsection we return to the Integral Representation Problem. Let $w$ be a Wiener process and let $\mathcal F$ be the filtration generated by $w$. Let $L$ be a local martingale with respect to $\mathcal F$ and assume that $L(0)=0$. Every local martingale has an $\mathcal H^1$-localization^{30}. By the integral representation property of Wiener processes^{31}, $L^{\tau_n}=H\bullet w$ on any finite interval. Hence
$$[L]^{\tau_n}=\left[L^{\tau_n}\right]=[H\bullet w]=H^2\bullet[w].$$
As $[w](t)=t$ it is obvious that $[L]$ is continuous. Therefore $L$ is continuous. So $L\in\mathcal H^2_{loc}$ and one can assume that $L^{\tau_n}\in\mathcal H^2$. This implies that $H\in L^2(w)$. By Itô's isometry^{32}, $H$ is unique in $L^2(w)$. Hence $L=H\bullet w$ for some $H\in L^2_{loc}(w)$.

Proposition 6.17 If $w$ is a Wiener process and $L$ is a local martingale with respect to the filtration generated by $w$, then $L$ is continuous and $L=L(0)+H\bullet w$ with some $H\in L^2_{loc}(w)$.

Our next statement is an easy consequence of Lévy's characterization theorem.

Proposition 6.18 (Doob) Let $M$ be a continuous local martingale on a stochastic base $(\Omega,\mathcal A,P,\mathcal F)$. If the quadratic variation of $M$ has the representation
$$[M](t,\omega)=\int_0^t\alpha^2(s,\omega)\,ds,\tag{6.14}$$

30. See: Corollary 3.59, page 221.
31. See: Example 5.50, page 347.
32. See: Proposition 2.64, page 156.
where $\alpha(t,\omega)>0$ and $\alpha$ is an adapted and product measurable process, then there is a Wiener process $w$ on $(\Omega,\mathcal A,P,\mathcal F)$ for which
$$M(t)=M(0)+\int_0^t\alpha(s)\,dw(s).$$

Proof. One can explicitly construct the Wiener process $w$:
$$w\triangleq\frac1\alpha\bullet M.\tag{6.15}$$
First we prove that the integral exists. $[M]\ll\lambda$, so if $\alpha_M$ is the Doléans measure of $M$ then $\alpha_M\ll\lambda\times P$. Therefore the stochastic integrals are defined among adapted product measurable processes^{33}. As
$$\int_0^t\frac1{\alpha^2}\,d[M]=\int_0^t\frac1{\alpha^2}\alpha^2\,ds=t<\infty,$$
$1/\alpha\in L^2_{loc}(M)$. So $1/\alpha$ is integrable with respect to $M$; that is, integral (6.15) exists. As $M$ is continuous, $w$ is a continuous local martingale. By (6.14)
$$[w](t)=\left[\frac1\alpha\bullet M\right](t)=\frac1{\alpha^2}\bullet[M](t)=t.$$
Therefore by Lévy's theorem $w$ is a Wiener process. By (6.14) $\alpha$ is integrable with respect to $w$, therefore
$$\alpha\bullet w\triangleq\alpha\bullet\left(\frac1\alpha\bullet M\right)=\alpha\frac1\alpha\bullet M=1\bullet M=M-M(0).$$
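The construction (6.15) can be mimicked on a grid (illustrative sketch, not from the book; the deterministic positive $\alpha$ below is an arbitrary choice): building $M\triangleq\alpha\bullet w_0$ and then $w\triangleq(1/\alpha)\bullet M$, the discrete quadratic variation of $w$ comes out $\approx t$, as Lévy's theorem requires.

```python
import numpy as np

rng = np.random.default_rng(4)
n_steps, T = 100_000, 1.0
dt = T / n_steps
t = np.arange(n_steps) * dt
dw0 = rng.normal(0.0, np.sqrt(dt), n_steps)
alpha = 1.0 + 0.5 * np.sin(2 * np.pi * t)  # deterministic, strictly positive
dM = alpha * dw0                           # M = alpha . w0, so [M] = alpha^2 . lambda
dw = dM / alpha                            # w = (1/alpha) . M
qv = np.cumsum(dw**2)                      # discrete quadratic variation of w
print(qv[-1])  # approximately T = 1
```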
Hence the proposition holds.

Corollary 6.19 Let $M$ be a continuous local martingale on a stochastic base $(\Omega,\mathcal A,P,\mathcal F)$. If $[M]\ll\lambda$ then there is an extension $(\widetilde\Omega,\widetilde{\mathcal A},\widetilde P,\widetilde{\mathcal F})$ of $(\Omega,\mathcal A,P,\mathcal F)$ and a Wiener process $w$ on the extended base space such that
$$M(t)=M(0)+\int_0^t\sqrt{\frac{d[M]}{d\lambda}}\,dw(s).$$

Proof. Let $w_0$ be an arbitrary Wiener process on some stochastic base $(\Omega_0,\mathcal A_0,P_0,\mathcal F_0)$. Let the new stochastic base be the product of $(\Omega,\mathcal A,P,\mathcal F)$ and

33. See: Proposition 5.20, page 314.
$(\Omega_0,\mathcal A_0,P_0,\mathcal F_0)$. Obviously $w_0$ is independent of $\mathcal A$. Let us define $\alpha$ by $[M](t)\triangleq\int_0^t\alpha^2(s)\,ds$; that is, let
$$\alpha\triangleq\sqrt{\frac{d[M]}{d\lambda}}.$$
The process
$$w(t)\triangleq\int_0^t\frac1\alpha\chi(\alpha>0)\,dM+\int_0^t\chi(\alpha=0)\,dw_0$$
is a continuous local martingale. The quadratic co-variation of independent local martingales is zero^{34}, so $[M,w_0]=0$. Therefore
$$[w](t)=\int_0^t\chi(\alpha>0)\,ds+\int_0^t\chi(\alpha=0)\,ds=t.$$
Hence by Lévy's theorem $w$ is a Wiener process.
$$\alpha\bullet w\triangleq\alpha\bullet\left(\frac1\alpha\chi(\alpha>0)\bullet M+\chi(\alpha=0)\bullet w_0\right)=\chi(\alpha>0)\bullet M.$$
On the other hand $[\chi(\alpha=0)\bullet M]=\chi(\alpha=0)\bullet[M]=0$, hence $\chi(\alpha=0)\bullet M=0$. So
$$\alpha\bullet w=\chi(\alpha>0)\bullet M+\chi(\alpha=0)\bullet M=1\bullet M=M-M(0).$$
6.2.5 Bessel processes

As an application of Lévy's theorem let us investigate the Bessel processes. Let $w\triangleq(w_1,w_2,\dots,w_d)$ be a $d$-dimensional Wiener process. Define the Bessel process
$$R\triangleq\|w\|\triangleq\|w\|_2\triangleq\sqrt{\sum_{k=1}^dw_k^2}.$$
We assume that $w$ starts at $x\in\mathbb R^d$, that is, $R(0)=\|x\|$. If it is necessary we shall explicitly indicate the initial value $x$. Evidently the distribution of $R$

34. See: Example 2.46, page 144.
depends on $x$ only through the size of $r\triangleq\|x\|$: if $\|x\|=\|y\|$ then $Qx=y$ for some orthonormal transformation $Q$. It is easy to show that $Qw$ is also a Wiener process and $Qw$ starts at $y$. Obviously $R_x\triangleq\|w\|=\|Qw\|\triangleq R_y$.

Proposition 6.20 If $d\ge2$ and $r\ge0$ then, if we start $w$ from some point $x\in\mathbb R^d$ with $r=\|x\|$, the process $R\triangleq\|w\|$ satisfies the integral equation
$$R(t)=r+\int_0^t\frac{d-1}{2R(s)}\,ds+B(t),\qquad0\le t<\infty,\tag{6.16}$$
where $B$ is a Wiener process and
$$B\triangleq\sum_kB^{(k)},\qquad B^{(k)}(s)\triangleq\int_0^s\frac{w_k}R\,dw_k.\tag{6.17}$$
Put another way, $R\triangleq\|w\|$ satisfies the stochastic differential equation
$$dR=\frac{d-1}{2R}\,dt+dB.$$
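Proposition 6.20 can be illustrated numerically (a sketch, not from the book; the dimension, step sizes, and the reflecting safeguard `np.abs` are ad hoc choices): the norm of a simulated $d$-dimensional Wiener process and an Euler–Maruyama discretization of $dR=\frac{d-1}{2R}dt+dB$, started from the same $r$, should produce approximately the same distribution — here we compare the means at time $T$.

```python
import numpy as np

rng = np.random.default_rng(5)
d, r, T, n_steps, n_paths = 3, 1.0, 1.0, 500, 5_000
dt = T / n_steps

# (a) R as the norm of a d-dimensional Wiener process started from x, |x| = r
x = np.zeros(d)
x[0] = r
w = x + np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps, d)), axis=1)
R_norm = np.linalg.norm(w[:, -1, :], axis=1)

# (b) Euler-Maruyama for dR = (d-1)/(2R) dt + dB, same r; np.abs is only a
#     numerical safeguard against the rare step that would cross zero
R = np.full(n_paths, r)
for _ in range(n_steps):
    R = np.abs(R + (d - 1) / (2 * R) * dt + rng.normal(0.0, np.sqrt(dt), n_paths))
print(R_norm.mean(), R.mean())  # the two means approximately agree
```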
Proof. First observe that the expression in (6.16) is meaningful: as $d\ge2$, the $R(s)$ in the denominator is almost surely not zero for every $s\ge0$. As the integral in (6.16) is taken by trajectories, it is also meaningful. On the other hand
$$\int_0^t\left(\frac{w_k}R\right)^2d[w_k]=\int_0^t\left(\frac{w_k}R\right)^2d\lambda\le\int_0^t1\,d\lambda=t,$$
hence the stochastic integrals in (6.17) are in $L^2(w_k)$ on every finite interval. Therefore the stochastic integrals $B^{(k)}$ are also meaningful.
1. By the formula for the quadratic co-variation of stochastic integrals
$$\left[B^{(k)},B^{(l)}\right](t)=\int_0^t\frac{w_kw_l}{R^2}\,d[w_k,w_l]=\delta_{kl}\int_0^t\frac{w_k^2}{R^2}\,d\lambda,$$
therefore
$$[B](t)=\sum_k\left[B^{(k)}\right](t)=\int_0^t\sum_k\frac{w_k^2}{R^2}\,d\lambda=\int_0^t1\,d\lambda=t.$$
The sum of local martingales is again a local martingale. Therefore by the characterization theorem of Lévy, $B$ is a Wiener process.
2. The proof of (6.16) uses the integration by parts formula:
$$R^2(t)-R^2(0)=2\sum_kw_k\bullet w_k+\sum_k[w_k](t)=2\sum_k\int_0^tw_k\,dw_k+t\cdot d.\tag{6.18}$$
The multi-dimensional Wiener processes are almost surely not zero^{35}, therefore almost surely $R^2>0$. Hence one can use Itô's formula with $\sqrt x$:
$$R-r=\frac12\frac1{\sqrt{R^2}}\bullet R^2-\frac18\frac1{\left(R^2\right)^{3/2}}\bullet\left[R^2\right]=\sum_k\frac{w_k}R\bullet w_k+\frac d{2R}\bullet\lambda-\frac18\frac4{R^3}\sum_kw_k^2\bullet\lambda=\sum_k\frac{w_k}R\bullet w_k+\frac{d-1}{2R}\bullet\lambda.\tag{6.19}$$
6.3 Change of measure for continuous semimartingales
The class of semimartingales is remarkably stable under a lot of operations. For example, by Itô's formula a $C^2$ transform of a semimartingale is again a semimartingale. Later we shall show that convex transforms of semimartingales are also semimartingales. In this section we return to the discussion of the operation of equivalent changes of measure.

6.3.1 Locally absolutely continuous change of measure

If a measure $Q$ is absolutely continuous with respect to $P$ then one can define the Radon–Nikodym derivative $dQ/dP$. If a filtration $\mathcal F$ satisfies the usual conditions then the process
$$\Lambda(t)\triangleq\mathsf E\left(\frac{dQ}{dP}\;\middle|\;\mathcal F_t\right)$$
is a martingale, and as
$$\int_F\Lambda(t)\,dP=\int_F\frac{dQ}{dP}\,dP=Q(F),\qquad F\in\mathcal F_t,$$
35. Let us remark that this is a critical observation, as here we used the assumption that $d\ge2$. If $d=1$ then one cannot use Itô's formula, as in this case one can only assume that $R^2\ge0$ and the function $\sqrt x$ for $x\ge0$ is not a $C^2$ function. If we formally still apply the formula, then we get the relation $R=\operatorname{sign}(w)\bullet w$. By Example 6.14 this expression is a Wiener process. The left-hand side is non-negative, hence the two sides cannot be equal.
$\Lambda(t)$ is the Radon–Nikodym derivative of $Q$ on $(\Omega,\mathcal F_t,P)$. On the other hand, let $Q(t)$ be the restriction of $Q$ and let $P(t)$ be that of $P$ to $\mathcal F_t$. If $Q(t)$ is absolutely continuous with respect to $P(t)$ then one can define the derivative
$$\Lambda(t)\triangleq\frac{dQ(t)}{dP(t)}.$$
If $F\in\mathcal F_s\subseteq\mathcal F_t$ then
$$\int_F\Lambda(t)\,dP=\int_F\frac{dQ(t)}{dP(t)}\,dP=Q(F)=\int_F\frac{dQ(s)}{dP(s)}\,dP=\int_F\Lambda(s)\,dP,$$
hence $\Lambda$ is a martingale. Of course $\Lambda$ is not necessarily uniformly integrable, so it can happen that there is no $\xi$ for which $\Lambda(t)=\mathsf E(\xi\mid\mathcal F_t)$. To put it another way, it can happen that $Q\ll P$ on $\mathcal F_t$ for every $t$, but $Q$ is not absolutely continuous on the $\sigma$-algebra $\mathcal F_\infty\triangleq\sigma(\cup_t\mathcal F_t)$. So the derivative $dQ/dP$ need not necessarily exist. Recall the following definition:

Definition 6.21 We say that a measure $Q$ is locally absolutely continuous with respect to a measure $P$ if $Q(t)\ll P(t)$ for every $t$, where $Q(t)$ is the restriction of $Q$ and $P(t)$ is the restriction of $P$ to $\mathcal F_t$. We shall denote this relation by $Q\overset{loc}{\ll}P$. If $Q\overset{loc}{\ll}P$ and $P\overset{loc}{\ll}Q$ then we shall say that $P$ and $Q$ are locally equivalent. We shall denote this by $P\overset{loc}{\sim}Q$.

Definition 6.22 If $Q\overset{loc}{\ll}P$ then the right-regular version of
$$\Lambda(t)\triangleq\frac{dQ(t)}{dP(t)}$$
is called the Radon–Nikodym process of $P$ and $Q$.

6.3.2 Semimartingales and change of measure
We have already proved the following important observations^{36}:

Proposition 6.23 (Invariance of semimartingales) If $Q\overset{loc}{\ll}P$ then every semimartingale under $P$ is a semimartingale under $Q$.

Proposition 6.24 (Integration and change of measure) Let $X$ be an arbitrary semimartingale and assume that the integral $H\bullet X$ exists under the measure $P$. If $Q\overset{loc}{\ll}P$ then $H\bullet X$ exists under $Q$ as well. Under the measure $Q$ the two processes, the integral under $P$ and the integral under $Q$, are indistinguishable.

36. See: Proposition 4.55, page 266; Corollary 4.58, page 271; Proposition 4.59, page 271.
Proposition 6.25 (Transformation of local martingales) Let $Q\overset{loc}{\ll}P$ and let $\Lambda$ be the Radon–Nikodym process of $P$ and $Q$. If $L$ is a continuous local martingale under the measure $P$ then under the measure $Q$:
1. $\Lambda^{-1}$ is well defined,
2. the integral $\Lambda^{-1}\bullet[L,\Lambda]$ exists and has finite variation on compact intervals,
3. the expression
$$\widetilde L\triangleq L-\Lambda^{-1}\bullet[L,\Lambda]\tag{6.20}$$
is a local martingale.
Corollary 6.26 If $Q\overset{loc}{\sim}P$ then $\Lambda>0$ and $\Lambda^{-1}$ is a martingale under $Q$.

Proof. One only needs to prove that $\Lambda^{-1}$ is a martingale under $Q$. If $F\in\mathcal F_s$ and $t>s$ then
$$\int_F\frac1{\Lambda(t)}\,dQ=\int_F\frac1{\Lambda(t)}\Lambda(t)\,dP=P(F)=\int_F\frac{\Lambda(s)}{\Lambda(s)}\,dP=\int_F\frac1{\Lambda(s)}\,dQ.$$

Corollary 6.27 If $Q\overset{loc}{\ll}P$ and $X$ and $Y$ are semimartingales then $[X,Y]$ calculated under $Q$ is indistinguishable under $Q$ from $[X,Y]$ calculated under $P$. If $L$ is a local martingale and $N$ is a continuous semimartingale then
$$[L,N]=\left[\widetilde L,N\right],$$
where $\widetilde L$ is as in (6.20).

Proof. As
$$[X,Y]\triangleq XY-X(0)Y(0)-Y_-\bullet X-X_-\bullet Y,$$
the first statement is obvious from Proposition 6.24. $\Lambda^{-1}\bullet[L,\Lambda]\in\mathcal V$ and $N$ is continuous, so
$$\left[\widetilde L,N\right]=\left[L-\Lambda^{-1}\bullet[L,\Lambda],N\right]=[L,N]-\left[\Lambda^{-1}\bullet[L,\Lambda],N\right]=[L,N].$$

Definition 6.28 $\widetilde L$ in (6.20) is called the Girsanov transform of $L$.
6.3.3 Change of measure for continuous semimartingales
If $L$ is a continuous local martingale then from Itô's formula it is trivial that the exponential martingale
$$\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right)$$
is a positive local martingale.

Proposition 6.29 (Logarithm of local martingales) If $\Lambda$ is a positive and continuous local martingale then there is a continuous local martingale
$$L\triangleq\operatorname{Log}(\Lambda)\triangleq\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda,$$
which is the only continuous local martingale for which
$$\Lambda=\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right).$$
Moreover
$$\log\Lambda=L-\frac12[L]=\operatorname{Log}(\Lambda)-\frac12\left[\operatorname{Log}(\Lambda)\right].$$
Proof. If $\Lambda=\mathcal E(L_1)=\mathcal E(L_2)$, then as $\Lambda>0$
$$1=\frac\Lambda\Lambda=\exp\left(L_1-L_2-\frac12[L_1]+\frac12[L_2]\right),$$
that is, $L_1-L_2=\frac12\left([L_1]-[L_2]\right)$. Hence the continuous local martingale $L_1-L_2$ has bounded variation and it is constant. Evidently $L_1(0)=L_2(0)$, therefore $L_1=L_2$. As $\Lambda>0$ the expression $\log\Lambda$ is meaningful. By Itô's formula
$$\log\Lambda=\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda-\frac12\frac1{\Lambda^2}\bullet[\Lambda]=L-\frac12\frac1{\Lambda^2}\bullet[\Lambda]=L-\frac12[L].$$
Therefore
$$\Lambda=\exp(\log\Lambda)=\exp\left(L-\frac12[L]\right)\triangleq\mathcal E(L).$$

Proposition 6.30 (Logarithmic transformation of local martingales) Assume that $P\overset{loc}{\sim}Q$ and let
$$\Lambda(t)\triangleq\frac{dQ(t)}{dP(t)}$$
be continuous. If $\Lambda=\mathcal E(L)$, that is, $L=\operatorname{Log}(\Lambda)$, then
$$\frac{dP}{dQ}(t)=\left(\frac{dQ}{dP}(t)\right)^{-1}=\left(\mathcal E(L)(t)\right)^{-1}=\mathcal E\left(-\widetilde L\right)(t).$$
If $M$ is a local martingale under measure $P$ then
$$\widetilde M\triangleq M-[M,L]=M-[M,\operatorname{Log}(\Lambda)]\tag{6.21}$$
is a local martingale under measure $Q$.

Proof. $\Lambda>0$ as $P\overset{loc}{\sim}Q$.
$$[M,L]\triangleq[M,\operatorname{Log}(\Lambda)]\triangleq\left[M,\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda\right]=\left[M,\Lambda^{-1}\bullet\Lambda\right]=\Lambda^{-1}\bullet[M,\Lambda],$$
so
$$\widetilde M\triangleq M-\Lambda^{-1}\bullet[M,\Lambda]=M-[M,L].$$
As $\widetilde L=L-[L,L]$,
$$\mathcal E\left(-\widetilde L\right)=\exp\left(-\widetilde L-\frac12\left[-\widetilde L,-\widetilde L\right]\right)=\exp\left(-L+[L,L]-\frac12[L,L]\right)=\exp\left(-\left(L-\frac12[L,L]\right)\right)=\left(\mathcal E(L)\right)^{-1}.$$

Proposition 6.31 (Girsanov's formula) If $M$ and $L\in\mathcal L$ are continuous local martingales and the process
$$\Lambda\triangleq\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right)$$
is a martingale on the finite or infinite interval $[0,s]$, then under the measure
$$Q(A)\triangleq\int_A\Lambda(s)\,dP\tag{6.22}$$
the process
$$\widetilde M\triangleq M-[L,M]=M-\frac1\Lambda\bullet[\Lambda,M]$$
is a continuous local martingale on $[0,s]$.
Proof. $L(0)=0$, therefore $\Lambda(0)=1$. $\Lambda$ is a martingale on $[0,s]$, so
$$Q(\Omega)=\int_\Omega\Lambda(s)\,dP=1.$$
Hence $Q$ is also a probability measure.
$$\Lambda(t)=\mathsf E(\Lambda(s)\mid\mathcal F_t)\triangleq\mathsf E\left(\frac{dQ}{dP}\;\middle|\;\mathcal F_t\right),$$
that is, if $F\in\mathcal F_t$ then
$$\int_F\Lambda(t)\,dP=\int_F\frac{dQ}{dP}\,dP=Q(F),$$
so $\Lambda(t)=dQ(t)/dP(t)$ on $\mathcal F_t$. The other parts of the proposition are obvious from Proposition 6.30.

6.3.4 Girsanov's formula for Wiener processes
Let $w$ be a Wiener process under measure $P$. If $Q\overset{loc}{\ll}P$ then $w$ is a continuous semimartingale^{37} under $Q$. Let $M+V$ be its decomposition under $Q$. $M$ is a continuous local martingale and $M(0)=0$. The quadratic variation of $M$ under $Q$ is^{38}
$$[M](t)=[M+V](t)=[w](t)=t.$$
By Lévy's theorem^{39} $M$ is therefore a Wiener process under the measure $Q$. By (6.20)
$$\widetilde w\triangleq w-\Lambda^{-1}\bullet[w,\Lambda]$$
is a continuous local martingale. As $\Lambda^{-1}\bullet[w,\Lambda]$ has finite variation, by Fisk's theorem $M=\widetilde w$. If $\mathcal F$ is the augmented filtration of $w$ then by the integral representation property of the Wiener processes $\Lambda$ is continuous^{40}. If $Q\overset{loc}{\sim}P$ then $\Lambda>0$, hence for some $L$
$$\Lambda\triangleq\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right).$$

37. See: Proposition 6.23, page 378.
38. See: Example 2.26, page 129.
39. See: Theorem 6.13, page 368.
40. See: Proposition 6.17, page 373.
Therefore by Proposition 6.30
$$M=\widetilde w=w-[w,L].$$
If $\mathcal F$ is the augmented filtration of $w$ then $\mathcal F_0$ is the trivial $\sigma$-algebra, so $\Lambda(0)=1$, hence $L(0)=0$. Again by the integral representation theorem there exists an $X\in L^2_{loc}(w)$ such that
$$L=L(0)+X\bullet w=X\bullet w,\qquad X\in L^2_{loc}(w).$$
Hence
$$M=\widetilde w=w-[w,L]=w-[w,X\bullet w]=w-X\bullet[w].$$
Hence if $P\overset{loc}{\sim}Q$ then there is an $X\in L^2_{loc}(w)$ such that
$$\Lambda(t)\triangleq\exp\left(\int_0^tX(s)\,dw(s)-\frac12\int_0^tX^2(s)\,ds\right)\triangleq\exp\left(X\bullet w-\frac12X^2\bullet[w]\right)(t)\triangleq\mathcal E(X\bullet w)\tag{6.23}$$
and
$$\widetilde w(t)\triangleq w(t)-\int_0^tX(s)\,ds,\qquad X\in L^2_{loc}(w),\tag{6.24}$$
is a Wiener process under $Q$. On the other hand, let $X\in L^2_{loc}(w,[0,s])$. Assume that $\Lambda$ in (6.23) is a martingale on $[0,s]$. Define the measure $Q$ by $dQ/dP\triangleq\Lambda(s)$. Obviously the process in (6.24) is a Wiener process under $Q$.

Theorem 6.32 (Girsanov formula for Wiener processes) Let $w$ be a Wiener process under measure $P$ and let $\mathcal F$ be the augmented filtration of $w$. Girsanov's transform $\widetilde w$ of $w$ has the following properties:
1. If $Q\overset{loc}{\ll}P$ then the Girsanov transform of $w$ is a Wiener process under measure $Q$.
2. If $Q\overset{loc}{\sim}P$ then the Girsanov transform of $w$ has the representation (6.24).
3. If $X\in L^2_{loc}(w)$ and the process $\Lambda$ in line (6.23) is a martingale over the segment $[0,s]$ then the process $\widetilde w$ in (6.24) is a Wiener process over $[0,s]$ under the measure $Q$ where $dQ/dP\triangleq\Lambda(s)$.

Example 6.33 Even on finite intervals $\Lambda\triangleq\mathcal E(X\bullet w)$ is not always a martingale.
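The positive direction of Theorem 6.32 admits a simple one-dimensional check with a constant integrand $X\triangleq\mu$ (illustrative sketch, not from the book; $\mu$ and $T$ are arbitrary): $\Lambda(T)$ in (6.23) integrates to one, and reweighting by $\Lambda(T)$ centers $w(T)$ at $\mu T$, i.e. $\widetilde w=w-\mu t$ has $Q$-mean zero.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, T, n = 0.7, 1.0, 1_000_000
wT = rng.normal(0.0, np.sqrt(T), n)
# density of Q w.r.t. P on F_T for the constant integrand X = mu, as in (6.23)
Lam = np.exp(mu * wT - 0.5 * mu**2 * T)
mass = Lam.mean()                               # approx 1: Lambda is a martingale here
shift = np.average(wT - mu * T, weights=Lam)    # approx 0: Q-mean of w~(T)
print(mass, shift)
```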
Let $s=1$ and let $\tau\triangleq\inf\{t:w^2(t)=1-t\}$. If $t=0$ then almost surely $w^2(t,\omega)<1-t$, and if $t=1$ then almost surely $w^2(t,\omega)>1-t$. So by the intermediate value theorem $P(0<\tau<1)=1$. If
$$X(t)\triangleq\frac{-2w(t)\chi(\tau\ge t)}{(1-t)^2},$$
then, as $\tau<1$ and $w^2(t)\le1-t$ for $t\le\tau$,
$$\int_0^1X^2\,d[w]=4\int_0^\tau\frac{w^2(t)}{(1-t)^4}\,dt\le4\int_0^\tau\frac{1-t}{(1-t)^4}\,dt=4\int_0^\tau\frac{dt}{(1-t)^3}<\infty.$$
Hence $X\in L^2_{loc}(w,[0,1])$. By Itô's formula, if $t<1$ then
$$\frac{w^2(t)}{(1-t)^2}=\int_0^t\frac{2w^2(s)}{(1-s)^3}\,ds+\int_0^t\frac{2w(s)}{(1-s)^2}\,dw(s)+\int_0^t\frac{ds}{(1-s)^2}.$$
From this
$$\begin{aligned}
I&\triangleq\int_0^1X\,dw-\frac12\int_0^1X^2\,ds=(X\bullet w)(\tau)-\frac12\left(X^2\bullet[w]\right)(\tau)\\
&=-\frac{w^2(\tau)}{(1-\tau)^2}+\int_0^\tau\frac{2w^2(s)}{(1-s)^3}\,ds+\int_0^\tau\frac{ds}{(1-s)^2}-2\int_0^\tau\frac{w^2(s)}{(1-s)^4}\,ds\\
&=-\frac1{1-\tau}+\int_0^\tau\left(2w^2(s)\left(\frac1{(1-s)^3}-\frac1{(1-s)^4}\right)+\frac1{(1-s)^2}\right)ds\\
&\le-\frac1{1-\tau}+\int_0^\tau\frac{ds}{(1-s)^2}=-\frac1{1-\tau}+\left(\frac1{1-\tau}-1\right)=-1,
\end{aligned}$$
where we used that $w^2(\tau)=1-\tau$ and that $(1-s)^{-3}\le(1-s)^{-4}$ for $0\le s<1$. Therefore $\Lambda(1)=\exp(I)\le1/e$. Hence
$$\mathsf E(\Lambda(1))=\mathsf E(\exp(I))\le\frac1e<1=\mathsf E(\Lambda(0)),$$
so $\Lambda$ is not a martingale.

Example 6.34 If $\widetilde w(t)\triangleq w(t)-\mu\cdot t$ then there is no probability measure $Q\ll P$ on $\mathcal F_\infty$ for which $\widetilde w$ is a Wiener process under $Q$.

Let $\mu\ne0$ and let
$$A\triangleq\left\{\lim_{t\to\infty}\frac{\widetilde w(t)}t=0\right\}=\left\{\lim_{t\to\infty}\frac{w(t)}t=\mu\right\}.$$
If $\widetilde w$ is a Wiener process under $Q$ then by the law of large numbers $Q(A)=1$, while $P(A)=0$. Therefore $Q$ is not absolutely continuous with respect to $P$ on $\mathcal F_\infty$. Observe that the martingale
$$\Lambda(t)=\exp\left(\mu w(t)-\frac12\mu^2t\right)$$
is not uniformly integrable. Therefore if $s=\infty$ then $\Lambda$ is not a martingale on $[0,s]$. Let us discuss the underlying measure-theoretic problem.

Definition 6.35 Let $(\Omega,\mathcal F)$ be a filtered space. We say that the probability spaces $(\Omega,\mathcal F_t,P_t)$ are consistent if for any $s<t$ the restriction of $P_t$ to $\mathcal F_s$ is $P_s$. The filtered space $(\Omega,\mathcal F)$ is a Kolmogorov-type filtered space if, whenever $(\Omega,\mathcal F_t,P_t)$ are consistent probability spaces for $0\le t<\infty$, there is a probability measure $P$ on $\mathcal F_\infty\triangleq\sigma(\mathcal F_t:t\ge0)$ such that every $P_t$ is a restriction of $P$ to $\mathcal F_t$.

Example 6.36 The space $C([0,\infty))$ with its natural filtration is a Kolmogorov-type filtered space.
One can identify the σ-algebra Ft with the Borel sets of C ([0, t]). Let C ∪t≥0 Ft . If we have a consistent stream of probability spaces over F, then one can define a set function P (C) Pt (C) on C. C ([0, t]) is a complete, separable metric space so P is compact regular on C, hence P is σ-additive on C. By Carath´eodory’s theorem one can extend P to σ (C) = B (C [0, ∞)) = F∞ . Observe that in Example 6.34 Λ is a martingale so the measure spaces (Ω, Ft , Qt ) are consistent. If we use the canonical representation, that is Ω = C ([0, ∞)) , then there is a probability measure Q on Ω such that Q (t) is a restriction of Q for every t. Obviously w 0 is a Wiener process under Q with respect to the natural filtration F Ω . Recall that by the previous example Q cannot be absolutely continuous with respect to P. The P-measure of set A is zero so A and all of its subsets are in the augmented filtration F P . As Q (A) = 1 obviously w 0 cannot be a Wiener process under F P . If the measures P and Q are not equivalent then the augmented filtrations can be different! Hence with the change of the measure one should also change the filtration. Of course one should augment the natural filtration F Ω because F Ω does not satisfy the usual conditions. There is a simple method to solve this problem. Observe that on every FtΩ the two measures P and Q are equivalent. It is very natural to assume that we augment
386 ITÔ'S FORMULA
$\mathcal{F}_t^\Omega$ not with every measure-zero set of $\mathcal{F}_\infty^\Omega$ but only with the measure-zero sets of the $\sigma$-algebras $\mathcal{F}_t^\Omega$ for $t \ge 0$. It is not difficult to see that this filtration is right-continuous and most of the results of stochastic analysis remain valid with this augmented filtration.
There is nothing special in the problem above. Let us show a similar elementary example. Example 6.37 The filtration generated by the dyadic rational numbers.
Let $(\Omega,\mathcal{A},P)$ be the interval $[0,1]$ with Lebesgue's measure $\lambda$ as the probability $P$. We change the filtration only at the points $t = 0,1,2,\ldots$: if $n \le t < n+1$ then $\mathcal{F}_t \triangleq \mathcal{F}_n$. Obviously $\mathcal{F}$ is right-continuous. Let $\mathcal{F}_n$ be the $\sigma$-algebra generated by the finitely many intervals $[k2^{-n},(k+1)2^{-n}]$, where $k = 0,1,\ldots,2^n-1$. Observe that as the intervals are closed, $\mathcal{F}_n$ contains all the dyadic rational points $0 < k2^{-n} < 1$ as one-point sets. It is also clear that $\{0\},\{1\} \notin \mathcal{F}_t$. It is also worth noting that the sets of dyadic rational numbers $0 < k2^{-n} < 1$ form the only measure-zero subsets of $\mathcal{F}_n$. This implies that if $P_t$ is the restriction of $P$ to $\mathcal{F}_t$, then $(\Omega,\mathcal{F}_t,P_t)$ is complete. $\mathcal{F}_\infty \triangleq \sigma(\mathcal{F}_t, t \ge 0)$ is the $\sigma$-algebra generated by the intervals with dyadic rational endpoints, so $\mathcal{F}_\infty$ is the Borel $\sigma$-algebra of $[0,1]$. $\mathcal{B}([0,1])$ is not complete under Lebesgue's measure. If we complete it, the new measure space is the set of Lebesgue measurable subsets of $[0,1]$. In the completed space the number of measure-zero sets is $2^{\mathfrak{c}}$, where $\mathfrak{c}$ denotes the cardinality of the continuum. If we augment $\mathcal{F}_\infty$ only with the measure-zero sets of the $\sigma$-algebras $\mathcal{F}_t$ then $\mathcal{F}_\infty$ does not change: the cardinality of $\mathcal{B}([0,1])$ is just $\mathfrak{c}$! Let $Q$ be Dirac's measure $\delta_0$. If $t < \infty$, then the set $\{0\}$ is not in $\mathcal{F}_t$, so if $A \in \mathcal{F}_t$ and $P_t(A) = 0$, then $Q(A) = 0$, that is $Q$ is absolutely continuous with respect
to $P_t$ for every $t < \infty$, that is $Q \stackrel{\mathrm{loc}}{\ll} P$. Obviously $Q \ll P$ does not hold.

6.3.5 Kazamaki–Novikov criteria
From Itô's formula it is clear that if $L$ is a continuous local martingale then $\mathcal{E}(L)$ is also a local martingale. It is very natural to ask when $\mathcal{E}(L)$ is a true martingale on some $[0,T]$. As $\mathcal{E}(L) \ge 0$, from Fatou's lemma it is clear that it is a supermartingale, that is, if $t > s$ then
$$E\left(\mathcal{E}(L)(t) \mid \mathcal{F}_s\right) = E\left(\lim_{n\to\infty}\mathcal{E}(L)^{\tau_n}(t) \mid \mathcal{F}_s\right) \le \liminf_{n\to\infty}\mathcal{E}(L)^{\tau_n}(s) = \mathcal{E}(L)(s).$$
Hence, taking expected values on both sides,
$$E(\mathcal{E}(L)(t)) \le E(\mathcal{E}(L)(s)), \qquad t \ge s.$$
CHANGE OF MEASURE FOR CONTINUOUS SEMIMARTINGALES 387
If $L(0) = 0$ then $\mathcal{E}(L)(0) = 1$, and in this case $\mathcal{E}(L)$ is a martingale on some $[0,t]$ if and only if $E(\mathcal{E}(L)(t)) = 1$. Let us first mention a simple but very frequently used condition:

Proposition 6.38 If $X$ is constant and $w$ is a Wiener process then $\Lambda \triangleq \mathcal{E}(X \bullet w)$ is a martingale on any finite interval $[0,t]$. A bit more generally: if $X$ and $w$ are independent then $\Lambda \triangleq \mathcal{E}(X \bullet w)$ is a martingale on any finite interval $[0,t]$.

Proof. The first part of the proposition trivially follows from the formula for the expected value of the lognormal distribution. Under the second condition one can assume that
$$(\Omega,\mathcal{A},P) = (\Omega_1,\mathcal{A}_1,P_1)\times(\Omega_2,\mathcal{A}_2,P_2).$$
$X$ depends only on $\omega_2$; hence for every fixed $\omega_2$ the integrand below is a martingale on $\Omega_1$, so
$$E(\Lambda(t)) = \int_{\Omega_1\times\Omega_2}\Lambda(t)\,d(P_1\times P_2) = \int_{\Omega_2}\int_{\Omega_1}\exp\left(\int_0^t X(\omega_2)\,dw(\omega_1) - \frac12\int_0^t X^2(\omega_2)\,d\lambda\right)dP_1\,dP_2 = \int_{\Omega_2}1\,dP_2 = 1.$$
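The lognormal expectation behind the first part of the proposition is easy to check numerically. A minimal sketch (the function name, quadrature rule and tolerances are my choices, not from the text): for constant $X \equiv \sigma$, $\Lambda(t) = \exp(\sigma w(t) - \sigma^2 t/2)$ with $w(t) \sim N(0,t)$, so $E(\Lambda(t))$ is a Gaussian integral that should equal $1$.

```python
import math

def expected_exponential(sigma, t, n=20_000, cutoff=12.0):
    """Approximate E[exp(sigma*w - sigma^2*t/2)] for w ~ N(0, t) with the
    trapezoidal rule over [-cutoff*sqrt(t), cutoff*sqrt(t)]."""
    sd = math.sqrt(t)
    a, b = -cutoff * sd, cutoff * sd
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        density = math.exp(-x * x / (2 * t)) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(sigma * x - sigma ** 2 * t / 2) * density
    return total * h

print(expected_exponential(1.0, 2.0))  # close to 1 for every sigma and t
```

The integrand is just the $N(\sigma t, t)$ density in disguise, which is why the integral is $1$ independently of $\sigma$ and $t$: exactly the martingale property $E(\Lambda(t)) = 1$.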
The next condition is more general:

Proposition 6.39 (Kazamaki's criterion) If for a continuous local martingale $L \in \mathcal{L}$
$$\sup_{\tau\le T}E\left(\exp\left(\frac12 L(\tau)\right)\right) < \infty, \qquad (6.25)$$
where the supremum is taken over all stopping times $\tau$ for which $\tau \le T$, then $\mathcal{E}(L)$ is a uniformly integrable martingale on $[0,T]$. In the case $T = \infty$ it is sufficient to assume that the supremum in (6.25) is finite over just the bounded stopping times.
Proof. Observe that if $\tau$ is an arbitrary stopping time and (6.25) holds with bound $k$ over the bounded stopping times, then by Fatou's lemma
$$E\left(\exp\left(\frac12 L(\tau)\right)\right) = E\left(\lim_{n\to\infty}\exp\left(\frac12 L(\tau\wedge n)\right)\chi(\tau<\infty)\right) \le \liminf_{n\to\infty}E\left(\exp\left(\frac12 L(\tau\wedge n)\right)\right) \le k.$$

1. Let $p > 1$ and assume that
$$\sup_{\tau\le T}E\left(\exp\left(\frac{\sqrt p}{2(\sqrt p-1)}L(\tau)\right)\right) \triangleq k < \infty, \qquad (6.26)$$
where the supremum is taken over all bounded stopping times $\tau \le T$. We show that $\mathcal{E}(L)(\tau)$ is bounded in $L^q(\Omega)$, where $1/p + 1/q = 1$. The $L^q(\Omega)$-bounded sets are uniformly integrable, hence if (6.26) holds then $\mathcal{E}(L)$ is a uniformly integrable martingale. Let
$$r \triangleq \frac{\sqrt p+1}{\sqrt p-1},$$
and let $s$ be the conjugate exponent of $r$. By simple calculation
$$s = \frac{\sqrt p+1}{2}.$$
Obviously
$$\mathcal{E}(L)^q = \exp\left(\sqrt{\frac qr}\,L - \frac q2[L]\right)\exp\left(\left(q-\sqrt{\frac qr}\right)L\right) = \mathcal{E}\left(\sqrt{rq}\,L\right)^{1/r}\exp\left(\left(q-\sqrt{\frac qr}\right)L\right).$$
By Hölder's inequality
$$E\left(\mathcal{E}(L)(\tau)^q\right) \le E\left(\mathcal{E}\left(\sqrt{rq}\,L\right)(\tau)\right)^{1/r}E\left(\exp\left(s\left(q-\sqrt{\frac qr}\right)L(\tau)\right)\right)^{1/s}.$$
$\mathcal{E}(\sqrt{rq}\,L)$ is a non-negative local martingale, so it is a supermartingale. Hence by the Optional Sampling Theorem⁴¹ the first factor of the product cannot be larger than $1$. As
$$s\left(q-\sqrt{\frac qr}\right) = \frac{\sqrt p}{2(\sqrt p-1)},$$
hence
$$E\left(\mathcal{E}(L)(\tau)^q\right) \le E\left(\exp\left(\frac{\sqrt p}{2(\sqrt p-1)}L(\tau)\right)\right)^{1/s} \le k^{1/s}.$$

2. As
$$\exp(x) \le \exp\left(x^+\right) \le \exp(x) + 1,$$
one has
$$E\left(\exp\left(\frac12 L(\tau)\right)\right) \le E\left(\exp\left(\frac12 L^+(\tau)\right)\right) \le E\left(\exp\left(\frac12 L(\tau)\right)\right) + 1,$$
from which it is obvious that
$$\sup_{\tau\le T}E\left(\exp\left(\frac12 L^+(\tau)\right)\right) < \infty.$$
If $0 < a < 1$ then
$$\mathcal{E}(aL) = \exp\left(aL - \frac{a^2}2[L]\right) = \mathcal{E}(L)^{a^2}\exp\left(a(1-a)L\right),$$
so by Hölder's inequality with the exponents $1/a^2$ and $1/(1-a^2)$
$$1 = E\left(\mathcal{E}(aL)(T)\right) \le E\left(\mathcal{E}(L)(T)\right)^{a^2}E\left(\exp\left(\frac a{1+a}L(T)\right)\right)^{1-a^2},$$
where $E(\mathcal{E}(aL)(T)) = 1$ holds because $aL$ satisfies (6.26) with $p \triangleq (1-a)^{-2}$, so by the first part $\mathcal{E}(aL)$ is a uniformly integrable martingale. As $a/(1+a) < 1/2$, the variable $\exp\left(\frac a{1+a}L(T)\right)$ is dominated by the integrable variable $\exp\left(\frac12 L^+(T)\right)$, hence by the Dominated Convergence Theorem
$$\lim_{a\nearrow1}E\left(\exp\left(\frac a{1+a}L(T)\right)\right) = E\left(\exp\left(\frac12 L(T)\right)\right) < \infty,$$
and since the exponent $1-a^2 \searrow 0$,
$$\lim_{a\nearrow1}E\left(\exp\left(\frac a{1+a}L(T)\right)\right)^{1-a^2} = 1.$$
Therefore $1 \le E(\mathcal{E}(L)(T))$, from which, by the supermartingale property of $\mathcal{E}(L)$, the proposition is obvious.

⁴¹ See: Proposition 1.88, page 54.
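The exponent algebra in step 1 of the proof is purely arithmetic and can be verified numerically. A short sketch (function name and the sample values of $p$ are my choices; the symbols $q$, $r$, $s$ are those of the proof):

```python
import math

def kazamaki_exponents(p):
    """Return (q, r, s): q conjugate to p, r = (sqrt(p)+1)/(sqrt(p)-1),
    s conjugate to r, as in step 1 of Kazamaki's proof."""
    q = p / (p - 1.0)
    r = (math.sqrt(p) + 1) / (math.sqrt(p) - 1)
    s = r / (r - 1.0)
    return q, r, s

for p in (1.5, 2.0, 4.0, 9.0, 100.0):
    q, r, s = kazamaki_exponents(p)
    # s = (sqrt(p) + 1)/2
    assert abs(s - (math.sqrt(p) + 1) / 2) < 1e-12
    # s*(q - sqrt(q/r)) = sqrt(p)/(2*(sqrt(p)-1)): the exponent in (6.26)
    lhs = s * (q - math.sqrt(q / r))
    rhs = math.sqrt(p) / (2 * (math.sqrt(p) - 1))
    assert abs(lhs - rhs) < 1e-12
print("Kazamaki exponent identities verified")
```

Note that as $p \downarrow 1$ the coefficient $\sqrt p/(2(\sqrt p-1))$ blows up, and as $p \to \infty$ it decreases to $1/2$ — which is why the criterion with constant $1/2$ in (6.25) requires the limiting argument of step 2.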
Corollary 6.40 If $L$ is a continuous local martingale and $\exp(\frac12 L)$ is a uniformly integrable submartingale then $\mathcal{E}(L)$ is a uniformly integrable martingale.

Proof. By the uniform integrability one can consider $\exp(\frac12 L)$ on the closed interval $[0,T]$. By the Optional Sampling Theorem for integrable submartingales⁴², if $\tau \le T$ then
$$\exp\left(\frac12 L(\tau)\right) \le E\left(\exp\left(\frac12 L(T)\right)\Big|\,\mathcal{F}_\tau\right),$$
from which (6.25) holds.

Corollary 6.41 If $L$ is a uniformly integrable continuous martingale and $E(\exp(\frac12 L(T))) < \infty$ then $\mathcal{E}(L)$ is a uniformly integrable martingale.

⁴² See: Proposition 1.88, page 54.
Proof. As $L$ is uniformly integrable, $L(T)$ is meaningful. A convex function of a martingale is a submartingale:
$$\exp\left(\frac12 L(t)\right) \le E\left(\exp\left(\frac12 L(T)\right)\Big|\,\mathcal{F}_t\right).$$
Taking the expected value on both sides, it is clear that $\exp(\frac12 L)$ is an integrable submartingale. By the Optional Sampling Theorem
for submartingales, $\exp(\frac12 L(\tau))$ is integrable for every $\tau$ and (6.25) holds.

Corollary 6.42 (Novikov's criterion) If $L \in \mathcal{L}$ is a continuous local martingale on some finite or infinite interval $[0,T]$ and
$$E\left(\exp\left(\frac12[L](T)\right)\right) < \infty, \qquad (6.27)$$
and $\Lambda \triangleq \mathcal{E}(L)$, then $E(\Lambda(T)) = E(\Lambda(0)) = 1$ and $\Lambda$ is a uniformly integrable martingale on $[0,T]$.

Proof. $\mathcal{E}(L)$ is a non-negative local martingale, hence it is a supermartingale. By the Optional Sampling Theorem⁴³, for any bounded stopping time $\tau$
$$E(\mathcal{E}(L)(\tau)) \le E(\mathcal{E}(L)(0)) = 1.$$
By the Cauchy–Schwarz inequality
$$E\left(\exp\left(\frac12 L(\tau)\right)\right) = E\left(\sqrt{\exp\left(L(\tau)-\frac{[L](\tau)}2\right)}\sqrt{\exp\left(\frac{[L](\tau)}2\right)}\right) \le \sqrt{E(\mathcal{E}(L)(\tau))}\sqrt{E\left(\exp\left(\frac{[L](\tau)}2\right)\right)} \le \sqrt{E\left(\exp\left(\frac12[L](T)\right)\right)} < \infty.$$
Hence Kazamaki's criterion holds.

⁴³ See: Proposition 1.88, page 54.
Corollary 6.43 If $L \triangleq X \bullet w$, $T$ is finite and for some $\delta > 0$
$$\sup_{t\le T}E\left(\exp\left(\delta X^2(t)\right)\right) < \infty \qquad (6.28)$$
then
$$\Lambda(t) \triangleq \exp\left(\int_0^t X\,dw - \frac12\int_0^t X^2\,d\lambda\right)$$
is a martingale on $[0,T]$.

Proof. Let $L \triangleq X \bullet w$. By Jensen's inequality
$$\exp\left(\frac12[L](T)\right) = \exp\left(\frac1T\int_0^T \frac{T X^2(t)}2\,dt\right) \le \frac1T\int_0^T\exp\left(\frac{T X^2(t)}2\right)dt.$$
If $T/2 \le \delta$ then we can continue the estimation:
$$E\left(\exp\left(\frac12[L](T)\right)\right) \le \frac1T\int_0^T E\left(\exp\left(\frac{T X^2(t)}2\right)\right)dt \le \sup_{t\le T}E\left(\exp\left(\delta X^2(t)\right)\right) < \infty$$
by condition (6.28), so Novikov's criterion holds and $E(\Lambda(T)) = 1$. In the general case let $(t_k)_{k=0}^n$ be a partition of $[0,T]$ and assume that the length of every interval $[t_{k-1},t_k]$ is smaller than $2\delta$. If
$$\Lambda_k \triangleq \exp\left(\int_{t_k}^{t_{k+1}}X(s)\,dw(s) - \frac12\int_{t_k}^{t_{k+1}}X^2(s)\,ds\right),$$
then $\Lambda = \prod_k\Lambda_k$, $E(\Lambda_k) = 1$ and $E(\Lambda_k \mid \mathcal{F}_{t_k}) = 1$ almost surely. Hence
$$E(\Lambda(T)) = E\left(E\left(\Lambda(T)\mid\mathcal{F}_{t_{n-1}}\right)\right) = E\left(E\left(\Lambda_{n-1}\Lambda(t_{n-1})\mid\mathcal{F}_{t_{n-1}}\right)\right) = E\left(\Lambda(t_{n-1})E\left(\Lambda_{n-1}\mid\mathcal{F}_{t_{n-1}}\right)\right) = E(\Lambda(t_{n-1})) = \cdots = E(\Lambda(t_1)) = 1.$$
Corollary 6.44 If $X$ is a Gaussian process, $T$ is finite and
$$\sup_{t\le T}D^2(X(t)) < \infty,$$
then $\Lambda = \mathcal{E}(X \bullet w)$ is a martingale on $[0,T]$. If $\mu_t$ and $\sigma_t$ denote the expected value and the standard deviation of $X(t)$ then
$$E\left(\exp\left(\delta X^2(t)\right)\right) = \frac1{\sigma_t\sqrt{2\pi}}\int_{\mathbb{R}}\exp\left(\delta x^2\right)\exp\left(-\frac12\left(\frac{x-\mu_t}{\sigma_t}\right)^2\right)dx = \frac{\exp\left(\delta\mu_t^2/\left(1-2\delta\sigma_t^2\right)\right)}{\sqrt{1-2\delta\sigma_t^2}}.$$
If $\delta < 1/\left(2\sup_{t\le T}D^2(X(t))\right)$ then $E(\exp(\delta X^2(t)))$ is bounded.

Example 6.45 Novikov's criterion is elegant, but it is not a very strong condition.
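The closed-form Gaussian moment used in Corollary 6.44 can be sanity-checked numerically. A minimal sketch (function names, the quadrature rule and the test parameters are my choices, not from the text): for $X \sim N(\mu,\sigma^2)$ with $\delta\sigma^2 < 1/2$, numerical integration should reproduce $\exp(\delta\mu^2/(1-2\delta\sigma^2))/\sqrt{1-2\delta\sigma^2}$.

```python
import math

def gauss_quad_moment(delta, mu, sigma, n=40_000, cutoff=14.0):
    """E[exp(delta*X^2)] for X ~ N(mu, sigma^2), trapezoidal rule."""
    a, b = mu - cutoff * sigma, mu + cutoff * sigma
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        density = math.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(delta * x * x) * density
    return total * h

def closed_form(delta, mu, sigma):
    # valid only while delta*sigma^2 < 1/2
    d = 1 - 2 * delta * sigma ** 2
    return math.exp(delta * mu ** 2 / d) / math.sqrt(d)

print(gauss_quad_moment(0.1, 0.5, 1.0), closed_form(0.1, 0.5, 1.0))
```

Completing the square in the exponent shows the integrand is a rescaled Gaussian with variance $\sigma^2/(1-2\delta\sigma^2)$, which is exactly where the constraint $\delta\sigma^2 < 1/2$ comes from.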
Let $\tau$ be a stopping time. If $L$ is a continuous local martingale, then $L^\tau$ is also a continuous local martingale.
$$\mathcal{E}(L^\tau) = \exp\left(L^\tau - \frac12[L^\tau]\right) = \mathcal{E}(L)^\tau,$$
so one could write any stopping time $\tau \le T$ in (6.27) instead of $T$. If for a stopping time $\tau$ [...]

Let $r > t$ be a point of continuity of $\mu$. Then
$$\limsup_{n\to\infty}\mu_n((0,t]) \le \limsup_{n\to\infty}\mu_n((0,r]) = \mu((0,r]).$$
Since the points of continuity of $\mu$ are dense in $\mathbb{R}_+$ and as $\mu$ is right-continuous,
$$\limsup_{n\to\infty}\mu_n((0,t]) \le \mu((0,t]) \qquad (6.38)$$
for every t ≥ 0. Also recall that µc denotes the continuous part of the increasing function t → µ ((0, t]). Definition 6.51 Let (∆n ) be an infinitesimal52 sequence of partitions: (n)
∆ n : 0 = t0
(n)
< t1
(n)
< . . . < tkn = ∞.
1. We say that a right-regular function f on [0, ∞) has finite quadratic variation with respect to (∆n ) if the sequence of point measures53
2
(n)
(n) (n) f ti+1 − f ti δ ti
µn
(n)
ti
∈∆n
50 One
should use the fact that X− is locally bounded. the points of continuity are dense the limit is unique. 52 That is, on any finite interval max (n) − t(n) → 0. k tk+1 k 51 As
53 Recall
that δ (a) is Dirac’s measure concentrated at point a.
(6.39)
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
405
converges in the vague topology to a locally finite measure µ where µ has the decomposition µ ((0, t]) = µc ((0, t]) +
2
(∆f (s)) .
s≤t
We shall denote µ ((0, t]) by [f ] (t) [f, f ] (t). 2. We say that right-regular functions f and g on [0, ∞) have finite quadratic co-variation with respect to (∆n ) if [f ] , [g] and [f + g] exist. In this case 1 ([f + g] − [f ] − [g]) . 2
[f, g]
3. A function g is (∆n )-integrable with respect to some function G if the limit lim
n→∞
(n)
ti
(n) (n) (n) g ti G ti+1 − G ti
≤t
is finite for every t ≥ 0. We shall denote this (∆n )-integral by
t
g (s−) dG (s) . 0
Theorem 6.52 (F¨ ollmer) Let F ∈ C 2 Rd and let (∆n ) be an infinitesimal d sequence of partitions of [0, ∞). If f (fk )k=1 are right-regular functions on R+ with finite quadratic variation and co-variation with respect to (∆n ) then for every t > 0 F (f (t)) − F (f (0)) = t ∂F = (f (s−)) , df (s) + ∂x 0 t ∂2F 1 (f (s−)) d [fi , fj ] (s) − + 2 i,j 0 ∂xi ∂xj −
+
s≤t
1 ∂2F (f (s−)) ∆fi (s) ∆fj (s) + 2 ∂xi ∂xj i,j s≤t
d ∂F F (f (s)) − F (f (s−)) − (f (s−)) ∆fi (s) ∂xi i=1
ˆ FORMULA ITO’s
406 where
t 0
∂F (n)
(n)
∂F (n) , f ti+1 − f ti (f (s−)) , df (s) lim f ti n→∞ ∂x ∂x (n) ti
≤t
where ∂F ∂x
∂F ∂F ∂F , ,..., ∂x1 ∂x2 ∂xd
denotes the gradient vector of F and all the other integrals are (∆n )-integrals. If the coordinates of the vector X (X1 , X2 , . . . , Xn ) are semimartingales, then the quadratic variations and co-variations exist and they converge uniformly on compact sets in probability. This implies that for some subsequence they converge uniformly, almost surely. Also, for semimartingales the stochastic integrals 0
t
∂F (X (s−)) dXk (s) ∂xk
exist and by the Dominated Convergence Theorem, uniformly on compact intervals in probability, 0
t
∂F ∂F (n) (n) (X (s−)) dXk (s) = (X (ti )) Xk ti+1 − X ti ∂xk ∂xk (n) ti
≤t
therefore F¨ ollmer’s theorem implies Itˆo’s formula. Proof. Fix t > 0. To simplify the notation we drop the superscript n. 1. If the first point in ∆n which is larger than t is tkn then tkn t. As f is right-continuous F (f (t)) − F (f (0)) = lim F (f (tkn )) − F (f (0)) = n→∞ = lim (F (f (ti+1 )) − F (f (ti ))) . n→∞
i
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
407
To simplify the notation further we drop all the point from ∆n which are larger than tkn . By Taylor’s formula F (f (ti+1 )) − F (f (ti )) =
d ∂F (f (ti )) (fk (ti+1 ) − fk (ti )) + ∂xk
k=1
1 ∂2F (f (ti )) (fk (ti+1 ) − fk (ti )) (fl (ti+1 ) − fl (ti )) + + 2 ∂xk ∂xl k,l
+r (f (ti ) , f (ti+1 )) where 2
|r (a, b)| ≤ ϕ (b − a) b − a . As F is twice continuously differentiable one may assume that ϕ is increasing and limc0 ϕ (c) = 0. 2. Given ε > 0 we split the set of jumps of f into two classes. C1 is a finite set and C2 is the set of jumps for which
s∈C2 ,s≤t
d
2 |∆fk (s)|
≤ ε.
k=1
As f has quadratic variation and co-variation this separation is possible. Since C1 is finite and as f is right-regular if (1) denotes the sum over the sub-intervals which contain a point from C1 then lim
n→∞
(F (f (ti+1 )) − F (f (ti ))) =
(F (f (s)) − F (f (s−))) .
(6.40)
s∈C1
(1)
Let F denote the first derivative and F the second derivative of F . Adding up the increments of other intervals
(F (f (ti+1 )) − F (f (ti ))) =
F (f (ti )) (f (ti+1 ) − fk (ti )) +
(2)
+ −
(1)
1 2
F (f (ti )) (f (ti+1 ) − f (ti )) −
1 F (f (ti )) (f (ti+1 ) − f (ti )) + F (f (ti )) (f (ti+1 ) − f (ti )) + 2 +
(2)
r (f (ti ) , f (ti+1 )) .
408
ˆ FORMULA ITO’s
As C1 is finite the expression in the third line goes to (1)
1 F (f (s−)) ∆f (s) + F (f (s−)) (∆f (s)) . 2
(6.41)
One can estimate the last expression as 2 ≤ ϕ max r (f (t ) , f (t )) f (t ) − f (t ) f (ti+1 ) − f (ti ) i i+1 i+1 i (2) (2) (2) therefore, using (6.38), lim sup r (f (ti ) , f (ti+1 )) ≤ k→∞ (2) 2 ≤ ϕ (ε+) lim sup f (ti ) − f (ti+1 ) ≤ n→∞
≤ ϕ (ε+) lim sup n→∞
d
ti ≤t
µ(k) n ((0, t]) ≤ ϕ (ε+)
k=1
d
[fk ] (t) .
k=1
If ε 0 then this expression goes to zero and the difference of (6.40) and (6.41) goes to s≤t
1 F (f (s)) − F (f (s−)) − F (f (s−)) (∆f (s)) − F (f (s−)) (∆f (s)) 2
3. Let G now be a continuous function. We show that if f is one of the functions fk or fk + fl then
2
G (f (ti )) (f (ti+1 ) − f (ti )) = t = G (f (s−)) d [f ] (s) .
lim
n→∞
0
Using the definition of measures related to the quadratic variation this means that lim
n→∞
t
G (f ) dµn = 0
t
G (f (s−)) dµ (s) , 0
(6.42)
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
where the integrals are usual Lebesgue–Stieltjes integrals. and let
h (u)
409
Let ε > 0
∆f (s) .
s∈C1 ,s≤u (C )
As C1 is a finite set it is Let µn 1 be the point measure like (6.39) based on h.
(C1 ) easy to see that the sequence of point measures µn converges to the point measure µ(C1 )
2
(∆f (s)) δ (s) .
s∈C1
As C1 is finite it is also easy to see, that
t
lim
G (f
n→∞
(s)) dµn(C1 )
t
G (f (s−)) dµ(C1 ) (s) .
(s) =
0
(6.43)
0
Let g f − h. As f = h + g obviously
2
(f (ti+1 ) − f (ti )) =
ti ≤u
+ +2
2
(h (ti+1 ) − h (ti )) +
ti ≤u 2
(g (ti+1 ) − g (ti )) +
ti ≤u
(g (ti+1 ) − g (ti )) (h (ti+1 ) − h (ti )) .
ti ≤u
C1 has only a finite number of points and if h is not continuous at some point s (C ) then g is continuous at s. Hence the third term goes to zero. Therefore µn −µn 1 converges to µ − µ(C1 ) . t t
(C1 ) (C1 ) G (f (s)) d µn − µn G (f (s−)) d µ − µ (s) − (s) ≤ 0 0 t t
(C1 ) (C1 ) G (f (s)) d µn − µn G (f (s)) d µ − µ (s) − (s) + ≤ 0
0
t
+ G (f (s)) − G (f (s−)) d µ − µ(C1 ) (s) . 0
The total size of the atoms of the measure µ − µ(C1 ) is smaller than ε2 . The function G (f ) is continuous at the point of continuity of µ − µ(C1 ) so one can
410
ˆ FORMULA ITO’s
estimate the second term by t
(C1 ) G (f (s)) − G (f (s−)) d µ − µ (s) ≤ 2ε2 sup |G (f (s))| . s≤t 0
Recall that f is bounded54 , and therefore sup |G (f (s))| < ∞. s≤t
Obviously µ − µ(C1 ) (C1 ) = 0. Hence there are finitely many open intervals which cover the points of C1 with total measure smaller than ε. Let O be the union of these intervals. As the points of continuity are dense one may assume that the points of the boundary of O are points of continuity of µ − µ(C1 ) . By the vague convergence one can assume that for some n sufficiently large (C ) (µn − µn 1 ) (O) < ε. If one deletes O from [0, t] the jumps of f are smaller than ε then on the compact set [0, t] \C1 . G is uniformly continuous on the bounded range55 of f so there is a δ such that if s1 , s2 ∈ [0, t] \O and |s1 − s2 | < δ then |G (f (s1 )) − G (f (s2 ))| < 2ε. This means that there is a step function H such that |H (s) − G (f (s))| < 2ε on [0, t] \O. On may also assume that the points of discontinuities of the step function H are points of continuity of measure µ − µ(C1 ) . t t
(C1 ) (C1 ) (s) − (s) ≤ G (f (s)) d µn − µn G (f (s)) d µ − µ lim sup n→∞ 0
0
≤ 2ε sup |G (f (s))| +
n→∞
s≤t
+2ε µn − µn(C1 ) ([0, t]) + µ − µ(C1 ) ([0, t]) + t t
(C1 ) (C1 ) H (s) d µn − µ H (s) d µ − µ (s) − (s) . + lim sup
0
0
Since the last expression, by the vague convergence goes to zero, for some k independent of ε t t
lim sup G (f (s)) d µn − µn(C1 ) (s) − G (f (s)) d µ − µ(C) (s) ≤ εk. n→∞
54 See: 55 See:
0
Proposition 1.6, page 5. Proposition 1.7, page 6.
0
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
411
As ε is arbitrary lim
n→∞
t
t
G (f (s)) d µn − µn(C1 ) (s) = G (f (s−)) d µ − µ(C1 ) (s) .
0
0
Using (6.43) one can easily show (6.42). 4. Applying this observation and the definition of the co-variation one gets the convergence of F (f (ti )) (f (ti+1 ) − f (ti )) = =
∂ 2 F (f (ti )) (fk (ti+1 ) − fk (ti )) (fl (ti+1 ) − fl (ti )) ∂xk ∂xk k,l
to the sum of integrals k,l
t
0
∂2F (f (s−)) d [fk , fl ] (s) . ∂xk ∂xl
5. As all the other terms converge, ∂F (f (ti )) (f (ti )) , f (ti+1 ) − f (ti ) ∂x i also converges and its limit, by definition, is t ∂F (f (s−)) , df (s) ∂x 0 which proves the formula. 6.4.3
Exponential semimartingales
As an application of the general Itˆ o formula let us discuss the exponential semimartingales. Let Z be an arbitrary complex semimartingale, that is let Z X + iY , where X and Y are real-valued semimartingales. Let us investigate the stochastic integral equation E = 1 + E− • Z.
(6.44)
Definition 6.53 The equation (6.44) is called the Dol´eans equation. The simplest version of the equation is when Z(s) ≡ s E (t) = 1 +
t
E (s−) ds = 1 + 0
t
E (s) ds, 0
412
ˆ FORMULA ITO’s
which characterizes the exponential function E (t) = exp (t). This explains the next definition: Definition 6.54 The solution of (6.44), denoted by E (Z), is called the exponential semimartingale of Z. Proposition 6.55 (Yor’s formula) If X and Y are arbitrary semimartingales then E (X) E (Y ) = E (X + Y + [X, Y ]) . Proof. By the formula for the quadratic variation of stochastic integrals ' & [E (X) , E (Y )] 1 + E (X)− • X, 1 + E (Y )− • Y =
= E (X)− E (Y )− • [X, Y ] . Integrating by parts E (X) E (Y ) − 1 = E (X)− • E (Y ) + E (Y )− • E (X) + [E (X) , E (Y )] =
= E (X)− E (Y )− • (Y + X + [X, Y ]) , from which, evident.
by the definition of the operator E,
Yor’s formula is
In the definition of E(Z) and during the proof of Yor’s formula we have implicitly used the following theorem: Theorem 6.56 (Solution of Dol´ eans’ equation) Let Z be an arbitrary complex semimartingale. 1. There is a process E which satisfies the integral equation (6.44). 2. If E1 and E2 are two solutions of (6.44) then E1 and E2 are indistinguishable. 3. If τ inf {t : ∆Z = −1} then E (Z) = 0 on [0, τ ), E (Z)− = 0 on [0, τ ] and E (Z) = 0 on [τ , ∞). 4. E (Z) is a semimartingale. 5. If Z has finite variation then E (Z) has finite variation. 6. If Z is a local martingale then E (Z) is a local martingale. 7. E has the following representation: 1 c (6.45) E E (Z) = exp Z − Z (0) − [Z] × 2 ! × (1 + ∆Z) exp (−∆Z) , where the product in the formula is absolutely convergent.
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
413
Proof. The proof of the theorem is a direct and simple, but lengthy calculation. We divide the proof into several steps. variation of semimartingales is finite. Hence the sum 1. The quadratic 2 |∆Z (s)| is convergent. Therefore on the interval [0, t] there are just finitely s≤t many moments when |∆Z| > 1/2. If |u| ≤ 1/2, then 2
|ln (1 + u) − u| ≤ C |u| , hence ln
!
|1 + ∆Z| |exp (−∆Z)| = (ln (|1 + ∆Z|) − |∆Z|) ≤ ≤ |ln (1 + |∆Z|) − |∆Z|| ≤ 2 ≤C |∆Z| < ∞.
Therefore the product V (t)
!
(1 + ∆Z (s)) exp (−∆Z (s))
s≤t
is absolutely convergent. Separating the real and the imaginary parts and taking logarithm, one can immediately see that V is a right-regular process with finite variation. By the definition of the product operation obviously56 V (0)
!
(1 + ∆Z (s)) = 1 + ∆Z (0) = 1.
s≤0
2. Let us denote by U the expression in the exponent of E (Z): U (t) Z − Z (0) −
1 c [Z ] . 2
With this notation E E (Z) V exp (U ) . By Itˆo’s formula for complex semimartingales, using that E (0) = 1, c and that V has finite variation, the co-variation [U, V ] = [U c , V c ] and 56 See:
(1.1) on page 4.
414
ˆ FORMULA ITO’s
c
[V ] = [V c ] are zero and hence E = 1 + E− • U + exp (U− ) • V + 1 c + E− • [U ] + 2 + (∆E − V− exp (U− ) ∆U − exp (U− ) ∆V ) . V is a pure jump process and therefore A exp (U− ) • V =
exp (U− ) ∆V.
As ∆U = ∆Z ∆E E − E− exp (U ) V − exp (U− ) V− = = exp (U− + ∆U ) V− (1 + ∆Z) exp (−∆Z) − exp (U− ) V− = = exp (U− + ∆U ) exp (−∆U ) V− (1 + ∆U ) − exp (U− ) V− = = exp (U− ) V− ∆U E− ∆U. Substituting the expressions A and ∆E A+
(∆E − E− ∆U − exp (U− ) ∆V ) = 0.
Obviously c 1 c c [U ] Z − Z (0) − [Z] = [Z c ] = [Z] , 2 c
and therefore 1 c E = 1 + E− • U + E− • [U ] = 2 1 c = 1 + E− • U + [Z] 2 1 + E− • (Z − Z (0)) = 1 + E− • Z, hence E satisfies (6.44). 3. One has to prove that the solution is unique. Let Y be an arbitrary solution of (6.44). The stochastic integrals are semimartingales so Y is a semimartingale. By Itˆo’s formula H Y · exp (−U ) is also a semimartingale. Applying the
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
415
multidimensional complex Itˆ o’s formula for the complex function z1 · exp (−z2 ) H = 1 − H− • U + exp (−U− ) • Y + 1 c c + H− • [U ] − exp (−U− ) • [U, Y ] + 2 + (∆H + H− ∆U − exp (−U− ) ∆Y ) . Y is a solution of the Dol´eans equation so exp (−U− ) • Y = exp (−U− ) Y− • Z H− • Z. c
c
c
[U, Y ] = [U, (Y− • Z)] = Y− • [U, Z] c 1 c c Y− • Z − [Z] , Z = Y− • [Z] . 2 c
c
exp (−U− ) • [U, Y ] = H− • [Z] . c
c
Adding up these terms and using that [U ] = [Z]
1 c c H− • Z + [U ] − [Z] 2
= H− • U,
hence H =1+
(∆H + H− ∆U − exp (−U− ) ∆Y ) .
Y is a solution of (6.44), so ∆Y = Y− ∆Z = Y− ∆U. Hence H =1+ 1+ =1+
(∆H + H− ∆U − exp (−U− ) Y− ∆U ) (∆H + H− ∆U − H− ∆U ) = ∆H.
(6.46)
416
ˆ FORMULA ITO’s
On the other hand, using (6.46) again ∆H H − H− Y exp (−U ) − H− = = exp (−U− − ∆U ) (Y− + ∆Y ) − H− = = exp (−U− − ∆U ) Y− (1 + ∆Z) − H− = = exp (−U− ) Y− exp (−∆U ) (1 + ∆Z) − H− = = H− (exp (−∆Z) (1 + ∆Z) − 1) so H = 1 + H− • R,
(6.47)
where R
(exp (−∆Z) (1 + ∆Z) − 1) .
For some constant C if |x| ≤ 1/2 |exp (−x) (1 + x) − 1| ≤ Cx2 . 2 Z is a semimartingale so (∆Z) < ∞ and therefore R is a complex process with finite variation. 4. Let us prove the following simple general observation: if v is a right-regular function with finite variation then the only right-regular function f for which
h
h≥0
f (s−) dv (s) ,
f (h) =
(6.48)
0
is f ≡ 0. Let s inf {t : f (t) = 0}. Obviously f = 0 on the interval [0, s). Hence by the integral equation (6.48)
s
s
f (t−) dv (t) =
f (s) = 0
0dv = 0. 0
If s < ∞ then, as v is right-regular, there is a t > s such that Var (v (t)) − Var (v (s)) ≤ 1/2. If t ≥ u > s then
u
s
≤ Var(v, s, u) sup |f (u)| ≤ s≤u≤t
u
f− dv ≤
f− dv =
f (u) = f (s) +
s
1 sup |f (u)| 2 s≤u≤t
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
417
and therefore sup |f (u)| ≤
s a) dX+
χ (X (s−) > a) (X (s) − a)− +
0<s≤t
+
χ (X (s−) ≤ a) (X (s) − a)+ +
0<s≤t
+ 66 See:
1 L (a, t) , 2
Proposition 5.23 page 319. Observe that the integrand is uniformly bounded. we show that for continuous local martingales the local time L (a, t, ω) has a version which is continuous in (a, t). 67 Later
426
ˆ FORMULA ITO’s
or (X (t) − a)− − (X (0) − a)− = −
t
χ (X− ≤ a) dX+
0
+
χ (X (s−) > a) (X (s) − a)− +
0<s≤t
+
χ (X (s−) ≤ a) (X (s) − a)+ +
0<s≤t
1 L (a, t) . 2
+
These formulas are called Tanaka’s formulas.
Let us apply the generalization of Itˆ o’s formula (6.54) for convex functions + − f (x) (x − a) and g (x) (x − a) :
t
f (X (t)) = f (X (0)) + 0 t
g (X (t)) = g (X (0)) +
f (X− ) dX + A(+) (t) , g (X− ) dX + A(−) (t) .
0
Subtracting the two lines above and using that f (x) = χ (x > a) ,
g (x) = −χ (x ≤ a)
one gets
t
1dX + A(+) (t) − A(−) (t) .
X (t) − X (0) = 0
This implies that A(+) (t) = A(−) (t). If B (+) (t) A(+) (t) −
(f (X (s)) − f (X (s−)) − f (X (s−)) ∆X (s))
0<s≤t
B (−) (t) A(−) (t) −
(g (X (s)) − g (X (s−)) − g (X (s−)) ∆X (s))
0<s≤t
then by the definition of the local time B (+) (t) + B (−) (t) = L (a, t). As the difference of the sums above is zero B (+) (t) = B (−) (t) , hence B (+) = B (−) = L (a) , so the formula is valid.
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
427
For any process X one can introduce the occupation times measure µt (B) λ (s ≤ t : X (s) ∈ B) . Later we shall see68 that for Wiener processes the local time L (a, t) is the density function of µt . With the usual interpretation of the density functions for Wiener processes one can think about L (a, t) da as the time during the time interval [0, t] a Wiener process is infinitely closely around a. Example 6.70 The Green function and the local time of Wiener processes.
Let w(x) be a Wiener process starting from point x. Let x ∈ I (a, b) be a (x) bounded interval, and let τ I be the exit-time of w(x) from I. Let us calculate the expected value E L(x) (y, τ I ) . By the definition of local time L(x) t
(x) sign w(x) − y dw(x) + L(x) (y, t) . w (t) − y = w(x) (0) − y + 0 (x)
If we truncate w(x) by τ I then the truncated process is bounded. If we truncate (x) both sides with τ I then the truncated integrator is in H2 . By Itˆo’s isometry the integral is also in H2 . Therefore the stochastic integral is a uniformly integrable martingale. By the Optional Sampling Theorem the expected value of the stochastic integral is zero, so
(x) (x) E w(x) τ I − y = |x − y| + E L(x) y, τ I . w(x) leaves the bounded set [a, b] almost surely so
(x) (x) E w(x) τ I − y = |a − y| P w τ I =a
(x) + |b − y| P w τ I =b . With the Optional Sampling Theorem one can easily calculate the probabilities69 . Obviously
(x) (x) P w τI = a + P w τI = b = 1, 68 See: 69 See:
Corollary 6.75, page 435. Example 1.116, page 81.
428
ˆ FORMULA ITO’s
and
(x) x = E w(x) (0) = E w(x) τ I
(x) (x) = aP w τ I = a + bP w τ I =b . Solving the equations
b−x (x) , P w τI =a = b−a
x−a (x) P w τI . =b = b−a
Substituting back
x−a b−x (x) + |b − y| − |x − y| . = |a − y| E L(x) y, τ I b−a b−a With elementary calculation
(x)
E L
(x) y, τ I
2 = b−a
(x − a) (b − y) if a ≤ x ≤ y ≤ b . (y − a) (b − x) if a ≤ y ≤ x ≤ b
If we introduce the so-called Green function 1 (x − a) (b − y) if a ≤ x ≤ y ≤ b GI (x, y) (y − a) (b − x) if a ≤ y ≤ x ≤ b b−a then
(x) E L(x) y, τ I = 2GI (x, y) .
Example 6.71 If 0 < a < b then before reaching point b a Wiener process starting from x = 0 on average spends 2 (b − a) da time units in the da neighbourhood point a.
Let w be a Wiener process and let 0 < a < b. Let us denote by τ b the first passage time of point b. Using the interpretation of the local times one should calculate the expected value E (L (a, τ b )). Using the same method as in the previous example = |a| + E
|b − a| = E (|w (τ b ) − a|) = τb sign (w (s) − a) dw (s) + E (L (a, τ b )) .
0
Observe that now wτ b is not bounded, so it is not in H2 so the stochastic integral is not a uniformly integrable martingale. If c < 0 < a < b, then as in the previous
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
429
example E (L (a, τ b ∧ τ c )) = 2G(c,b) (0, a) . If c −∞, then the limit on the right-hand side is 2 (b − a). On the left-hand side τ b ∧τ c τ b and as t → L (a, t) is increasing and continuous by the Monotone Convergence Theorem E (L (a, τ b )) = 2 (b − a) . 6.5.3
Meyer–Itˆ o formula
Theorem 6.72 Let X be a semimartingale. If L (a) is the local time of X at point a then for almost all outcome ω the support of the measure generated by the increasing function t → L (a, t, ω) is in the set {s : X (s−, ω) = X (s, ω) = a} . Proof. By the definition of local times L (a, t, ω) is continuous in time parameter t. This implies that the measure of every single point, with respect to the measure generated by L (a, t, ω), is zero. For every trajectory the number of the jumps of X is maximum countable, so it is sufficient to prove that the support of the measure generated by L (a, t, ω) is a subset of {s : X (s−, ω) = a} for almost all outcome ω. As convex functions of semimartingales are semimartingales Y |X − a| is a semimartingale. Y 2 = Y 2 (0) + 2Y− • Y + [Y ] . Z X − a is also a semimartingale. Y 2 = Z 2 = Y 2 (0) + 2Z− • Z + [Z] . Obviously [Z] = [Y ], therefore Y− • Y = Z− • Z. As Y = |Z|
t
sign (Z− ) dZ + Aa (t) .
Y (t) = Y (0) + 0
By the associativity rule
t
Y− dY = 0
t
0
t
Y− dAa .
Y− sign (Z− ) dZ + 0
430
ˆ FORMULA ITO’s
By the definition of sign Y− sign (Z− ) |Z− | sign (Z− ) = Z− .
(6.58)
Therefore
t
t
Z− dZ =
Y− dY =
0
0
t
0
t
Y− dAa .
Z− dZ + 0
Hence, by the definition of L (a, t, ω)
t
Y− dAa =
0=
(6.59)
0
t
Y− dLa +
[4pt] = 0
Y (s−) (∆ |Z (s)| − sign (Z (s−)) ∆Z (t)) .
0<s≤t
Observe that by (6.58) the expression after the sum is finite and has the form |a| (|b| − |a|) − a (b − a) = |a| |b| − a2 − ab + a2 = = |ab| − ab ≥ 0. t La is increasing, therefore the integral 0 Y− dLa is non-negative. This implies that the sum and the integral in (6.59) are zero. But as the integral is zero the support of the measure generated by La is part of the set {Y (s−) = 0} {|X (s−) − a| = 0} = {X (s−) = a} .
Example 6.73 If L is the local time of a Wiener process and τ b is the first passage time of a point b and 0 ≤ a < b then L (a, τ b ) has an exponential distribution with parameter70 λ (2 (b − a))−1 .
We show that the Laplace transform of the random variable L (a, τ b ) is l (s) E (exp (−s · L (a, τ b ))) = 70 See:
Example 6.71, page 428.
1 . 1 + 2s · (b − a)
(6.60)
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
431
As the Laplace transform of an exponentially distributed random variable is 1 1 + s/λ this implies the statement. 1. The main idea of the proof is to show that X (t)
1 + + s · (w (t) − a) exp (−s · L (a, t)) 2
is a local martingale. As Xτb =
1 + + s · (wτ b − a) exp (−s · Lτ b (a)) 2
(6.61)
is bounded, X τ b is a bounded local martingale. Hence (6.61) is a uniformly integrable martingale. Therefore by the Optional Sampling Theorem as 0 ≤ a < b 1 =E 2
1 + exp (−s · L (a, 0)) = + s · (w (0) − a) 2
1 + + s · (w (τ b ) − a) exp (−s · L (a, τ b )) = 2
=E =
1 + s · (b − a) l (s) , 2
from which (6.60) is trivial. 2. Let us return to process X. Let U (t)
1 + + s · (w (t) − a) , 2
V (t) exp (−sL (a, t)) .
Integrating by parts
t
U dV +
X (t) = U (t) V (t) = X (0) + 0
t
V dU + [U, V ] . 0
U is continuous, V has finite variation so [U, V ] = 0. By the previous theorem the support of the measure generated by V is in {w = a}, so
t
U dV = 0
1 1 + + s · (a − a) (V (t) − V (0)) = (V (t) − 1) . 2 2
432
ˆ FORMULA ITO’s
By Tanaka’s formula 1 U (t) H (t) + s · L (a, t) , 2 where H isa continuous local martingale. V is continuous so it is locally bounded t so Z (t) 0 V dH is a local martingale. On the other hand, by the Fundamental Theorem of Calculus71
t
Vd 0
1 s t s·L = exp (−s · L (a, u)) L (a, du) = 2 2 0 t s exp (−s · L (a, u)) = = 2 −s 0 1 1 = − (exp (−s · L (a, u)) − 1) = − (V (t) − 1) . 2 2
Hence X (t) = X (0) + Z (t) +
1 1 (V (t) − 1) − (V (t) − 1) = 2 2
= X (0) + Z (t) , that is, X is a local martingale. Theorem 6.74 (Meyer–Itˆ o formula) Let X be a semimartingale and let f be denotes the left derivative of f and µ is the second a convex function. If f f− generalized derivative of f and L is the local time of X, then f (X (t)) − f (X (0)) = t f (X− ) dX+ = +
(6.62)
0
(f (X (s)) − f (X (s−)) − f (X (s−)) ∆X (s)) +
0<s≤t
+
1 2
L (a, t) dµ (a) . R
Proof. Recall that the second generalized derivative of |x| is 2δ 0 . So if f (x) = |x| , then by the theorem one gets just the definition of local times. 1. Let us first assume that the support of µ is compact. In this case the representation (6.53) holds. If f (x) = αx + β then the theorem is trivially true, 71 See:
(6.32), page 398. Or, if one likes, by Itˆ o’s formula.
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
433
therefore one can assume that
\[
f(x) = \frac12\int_{\mathbb{R}} |x-a|\,d\mu(a).
\]
With the Dominated Convergence Theorem one can differentiate under the integral sign:
\[
f'_-(x) = \frac12\int_{\mathbb{R}} \operatorname{sign}(x-a)\,d\mu(a).
\]
If
\[
J(a,t) \triangleq \sum_{0<s\leq t}\left(|X(s)-a| - |X(s-)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right),
\]
then by the Monotone Convergence Theorem
\[
\frac12\int_{\mathbb{R}} J(a,t)\,d\mu(a) = \sum_{0<s\leq t}\frac12\int_{\mathbb{R}}\left(|X(s)-a| - |X(s-)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right)d\mu(a) = \sum_{0<s\leq t}\left(f(X(s)) - f(X(s-)) - f'_-(X(s-))\,\Delta X(s)\right).
\]
Similarly, if $H(a,t) \triangleq |X(t)-a| - |X(0)-a|$, then
\[
f(X(t)) - f(X(0)) = \frac12\int_{\mathbb{R}} H(a,t)\,d\mu(a).
\]
Let
\[
Z(a,t) \triangleq \int_0^t \operatorname{sign}\left(X(s-)-a\right)dX(s)
\]
and let us take a $\mathcal{B}(\mathbb{R})\times\mathcal{B}(\mathbb{R}_+)\times\mathcal{A}$-measurable version of this parametric integral⁷². By Fubini's theorem for stochastic integrals⁷³,
\[
\frac12\int_{\mathbb{R}} Z(a,t)\,d\mu(a) = \int_0^t \frac12\int_{\mathbb{R}} \operatorname{sign}\left(X(s-)-a\right)d\mu(a)\,dX(s) = \int_0^t f'_-(X(s-))\,dX(s).
\]
By the definition of local times $L = H - J - Z$, that is,
\[
\frac12 H = \frac12 J + \frac12 Z + \frac12 L.
\]
Integrating by $\mu$ and using the already proved formulas one can easily prove the theorem.
2. Let us take the general case and let
\[
f_n(x) \triangleq \begin{cases} f(-n) + f'_-(-n)(x+n) & \text{if } x \leq -n, \\ f(x) & \text{if } -n < x < n, \\ f(n) + f'_-(n)(x-n) & \text{if } x \geq n. \end{cases}
\]
$f_n$ is also convex. Let $\mu_n$ be the generalized second derivative of $f_n$. Obviously the support of $\mu_n$ is in $[-n,n]$ and the measure $\mu_n$ is finite. Hence we can use the already proved part of the theorem. Let
\[
\tau_n \triangleq \inf\left\{t : |X(t)| \geq n\right\},
\]
and let us consider the stopped processes $X^{\tau_n}$. By the already proved part of the theorem,
\[
f_n\left(X^{\tau_n}(t)\right) - f_n\left(X^{\tau_n}(0)\right) = \int_0^t f'_{n-}\left(X^{\tau_n}(s-)\right)dX^{\tau_n}(s) + \sum_{0<s\leq t}\left(\Delta f_n\left(X^{\tau_n}(s)\right) - f'_{n-}\left(X^{\tau_n}(s-)\right)\Delta X^{\tau_n}(s)\right) + \frac12\int_{\mathbb{R}} L_n(a,t)\,d\mu_n(a),
\]
where obviously $L_n(a)$ denotes the local time of $X^{\tau_n}$. Observe that $|X^{\tau_n}| \leq n$ on $[0,\tau_n)$. Therefore on $[0,\tau_n)$ one can write $f$ instead of $f_n$. The support of the measure generated by $L_n(a)$ is in the set $\{X^{\tau_n}(s-) = a\}$, that is, if $|a| \geq n$, then
⁷² See Proposition 5.23, page 319.
⁷³ See Theorem 5.25, page 322.
$L_n(a,t) = 0$ for all $t$. The measures $\mu$ and $\mu_n$ are equal on the interval $[-n,n]$, so in the integral containing the local time one can write $\mu$ instead of $\mu_n$. That is,
\[
\int_{\mathbb{R}} L_n(a,t)\,d\mu_n(a) = \int_{\mathbb{R}} L_n(a,t)\,d\mu(a).
\]
From the definition of the local time it is evident that the local time of $X^{\tau_n}$ is $L^{\tau_n}$. Hence if $t \leq \tau_n$, then
\[
\int_{\mathbb{R}} L_n(a,t)\,d\mu_n(a) = \int_{\mathbb{R}} L_n(a,t)\,d\mu(a) = \int_{\mathbb{R}} L^{\tau_n}(a,t)\,d\mu(a) = \int_{\mathbb{R}} L(a,t)\,d\mu(a).
\]
If $n \to \infty$, then $\tau_n \nearrow \infty$, and the theorem holds in the general case as well.

Corollary 6.75 (Occupation Times Formula) If $X$ is a semimartingale and $L$ is the local time of $X$, then for every bounded Borel measurable function $g : \mathbb{R} \to \mathbb{R}$ and for all $t$, for almost all outcomes,
\[
\int_{\mathbb{R}} L(a,t)\,g(a)\,da = \int_0^t g(X(s-))\,d[X^c](s). \tag{6.63}
\]
The identity is meaningful, and it is also valid, if $g$ is a non-negative Borel measurable function.

Proof. Let $f$ be convex and let $f \in C^2$. In this case one can use Itô's formula. Comparing Itô's formula with (6.62),
\[
\int_{\mathbb{R}} L(a,t,\omega)\,f''(a)\,da = \int_0^t f''(X(s-))\,d[X^c].
\]
Of course, instead of $f''$ one can write any non-negative continuous function $g$. By the Monotone Class Theorem the identity is valid for every bounded Borel measurable function. With the Monotone Convergence Theorem one can extend the identity to non-negative Borel measurable functions.

Let $X = w$ be a Wiener process. In this case $[X](s) = s$, and by (6.63), for every Borel measurable set $B$,
\[
\int_B L(a,t)\,da = \int_0^t \chi\left(w(s) \in B\right)ds = \lambda\left(s \leq t : w(s) \in B\right).
\]
The last variable gives the time $w$ spends in the set $B$. For fixed $t$ this occupation time is a measure on the time-line and $L(a,t)$ is the Radon–Nikodym derivative
of this occupation time measure. By the interpretation of density functions, $L(a,t)\,da$ is the time $w$ spends around $a$ during the time interval $[0,t]$.

Corollary 6.76 If $X$ is a semimartingale and $L$ is the local time of $X$, then
\[
[X^c](t) = \int_{\mathbb{R}} L(a,t)\,da.
\]

Corollary 6.77 (Meyer–Tanaka formula) If $X$ is a continuous semimartingale and $L$ denotes the local time of $X$, then⁷⁴
\[
|X| = |X(0)| + \operatorname{sign}(X)\bullet X + L(0).
\]
By Itô's formula and by the Itô–Meyer formula the class of semimartingales is closed under a quite broad class of transformations. That is why the next example is interesting.

Example 6.78 If $X \neq 0$ is a continuous local martingale, $X(0) = 0$ and $0 < \alpha < 1$, then $|X|^{\alpha}$ is not a semimartingale.

1. The example is a bit surprising because $|X|$ is a semimartingale, and by the Itô–Meyer formula a concave function of a semimartingale is again a semimartingale. But recall that in Theorem 6.65 the domain of definition of $F$ is the whole real line, or at least an open convex set containing the range of $X$. Now this is not true. Let us also observe that the function $|x|^{\alpha}$ is not concave on the whole line.
2. Let $L$ be the local time of $X$. Assume that $L(0) \equiv 0$. By the Meyer–Tanaka formula
\[
|X| = \operatorname{sign}(X)\bullet X + L(0) = \operatorname{sign}(X)\bullet X.
\]
On the right-hand side the integral is a local martingale, hence $|X|$ is a non-negative local martingale, so by Fatou's lemma it is a supermartingale⁷⁵. As $|X(0)| = 0$,
\[
0 = E\left(|X(0)|\right) \geq E\left(|X(t)|\right),
\]
which implies that if $L(0) \equiv 0$ then $|X| = 0$.
3. Now we prove that if $Y \triangleq |X|^{\alpha}$ is a semimartingale then $L(0) \equiv 0$. With localization one can assume that $X \in \mathcal{H}_0^2$. The support of $L(0)$ is in $\{X(s) = 0\}$,
⁷⁴ Obviously $L(0)$ denotes the process $t \mapsto L(0,t)$.
⁷⁵ See page 386.
so by the Meyer–Tanaka formula
\[
L(0,t) = \int_0^t 1\,dL = \int_0^t \chi\left(X(s) = 0\right)dL(0,s) = \int_0^t \chi\left(X(s) = 0\right)d|X(s)| - \int_0^t \chi\left(X(s) = 0\right)\operatorname{sign}\left(X(s)\right)dX(s).
\]
Let us first investigate the second integral
\[
Z(t) \triangleq \int_0^t \chi\left(X(s) = 0\right)\operatorname{sign}\left(X(s)\right)dX(s).
\]
By Itô's isometry and by (6.63),
\[
E\left(Z^2(t)\right) = E\left(\int_0^t \chi\left(X(s) = 0\right)d[X](s)\right) = E\left(\int_{\mathbb{R}} \chi_{\{0\}}(a)\,L(a,t)\,da\right) = E\left(\int_{\{0\}} L(a,t)\,da\right) = 0,
\]
hence $Z = 0$. Now let us calculate the first integral
\[
\int_0^t \chi\left(X(s) = 0\right)d|X(s)|.
\]
As $0 < \alpha < 1$, $\beta \triangleq 1/\alpha > 1$. If $\beta \geq 2$ then by Itô's formula for $C^2$ functions
\[
|X| = Y^{\beta} = \beta Y^{\beta-1}\bullet Y + \frac{\beta(\beta-1)}{2}Y^{\beta-2}\bullet[Y].
\]
Using that $\{X(s) = 0\} = \{Y(s) = 0\}$,
\[
I(t) \triangleq \int_0^t \chi\left(Y(s) = 0\right)d|X(s)| = \beta\int_0^t \chi\left(Y(s) = 0\right)Y^{\beta-1}\,dY + \frac{\beta(\beta-1)}{2}\int_0^t \chi\left(Y(s) = 0\right)Y^{\beta-2}\,d[Y].
\]
The integrand in the first integral is zero, so the integral is zero. If $\beta > 2$ then the integrand in the second integral is also zero, so the second integral is zero again. If $\beta = 2$, then using (6.63),
\[
\int_0^t \chi\left(Y(s) = 0\right)d[Y] = \int_{\mathbb{R}} L(a,t)\,\chi_{\{0\}}(a)\,da = \int_{\{0\}} L(a,t)\,da = 0.
\]
Let $2 > \beta > 1$. The function
\[
g(x) \triangleq \begin{cases} x^{\beta} & \text{if } x > 0, \\ 0 & \text{if } x \leq 0 \end{cases}
\]
is a convex function on $\mathbb{R}$. Hence by Itô's formula for convex functions
\[
|X| = g(Y) = Y^{\beta} = g'_-(Y)\bullet Y + \frac12\int_{\mathbb{R}} H(a)\,d\mu(a),
\]
where $H$ is the local time of $Y$. In this case again
\[
\int_0^t \chi\left(X = 0\right)g'_-(Y)\,dY = \int_0^t \chi\left(X = 0\right)\beta Y^{\beta-1}\,dY = 0.
\]
Let us calculate the integral
\[
\int_0^t \chi\left(Y(s) = 0\right)d\left(\int_{\mathbb{R}} H(a,s)\,d\mu(a)\right). \tag{6.64}
\]
$\mu$ is defined by the increasing function
\[
g'_-(x) = \begin{cases} \beta x^{\beta-1} & \text{if } x > 0, \\ 0 & \text{if } x \leq 0 \end{cases} = \int_{-\infty}^x h(t)\,dt,
\]
where
\[
h(x) \triangleq \begin{cases} \beta(\beta-1)x^{\beta-2} & \text{if } x > 0, \\ 0 & \text{if } x \leq 0. \end{cases}
\]
$H$ is the local time of $Y$, so
\[
\int_{\mathbb{R}} H(a,s)\,d\mu(a) = \int_0^{\infty} H(a,s)\,h(a)\,da = \int_0^s h(Y)\,d[Y],
\]
therefore (6.64) is
\[
\int_0^t \chi\left(Y(s) = 0\right)h(Y)\,d[Y] = 0.
\]
This means that if $Y$ is a semimartingale then $L(0) = 0$, hence $X = 0$.

6.5.4 Local times of continuous semimartingales
Observe that for every $a$ the local time $L(a,t,\omega)$ is defined only up to indistinguishability. This means that for every $a$ one can modify $L(a,t,\omega)$ on a set with probability zero. The local time is always continuous in the parameter $t$, so one can think of $L$ as a $C([0,\infty))$-valued stochastic process $(a,\omega) \mapsto L(a,\omega)$,
where $L(a,\omega)$ denotes the trajectory of $L$ in $t$. As this function-valued process is defined only almost surely, one can use any of its modifications as local time. In this subsection we prove that, under some restrictions on the semimartingale $X$, the process $L(a,t,\omega)$ has a version which is right-regular in $a$. To do this we shall use the next result:

Proposition 6.79 (Kolmogorov's criterion) Let $I$ be an interval in $\mathbb{R}$ and let $X$ be a Banach space valued stochastic process on $I$. If for some positive constants $a$, $b$ and $c$
\[
E\left(\left\|X(u) - X(v)\right\|^a\right) \leq c\,|u-v|^{1+b},
\]
then $X$ has a continuous modification.

Proposition 6.80 If $X$ is a continuous local martingale then the local time $L(a,t,\omega)$ of $X$ has a modification in $a$ which is continuous in $(a,t)$.

Proof. One can localize the proposition, as if $L$ is the local time of $X$ and $\tau$ is a stopping time then the local time of $X^{\tau}$ is $L^{\tau}$. Therefore one can assume that $X - X(0) \in \mathcal{H}_0^2$. By definition
\[
L(a,t) = |X(t)-a| - |X(0)-a| - \int_0^t \operatorname{sign}\left(X(s)-a\right)dX(s).
\]
Let us introduce the notation⁷⁶
\[
\widetilde{M}(a,u) \triangleq \int_0^u \operatorname{sign}\left(X(s)-a\right)dX^c(s).
\]
It is sufficient to show that $\widetilde{M}$ has a continuous version. We want to apply Kolmogorov's criterion. $C([0,t])$ is a Banach space for arbitrary fixed $t$. Obviously, if a function $g : I \to C([0,t])$ is continuous then it defines a continuous function over $I \times [0,t]$. We show that for all $t$
\[
E\left(\left\|\widetilde{M}(a) - \widetilde{M}(b)\right\|^4_{C([0,t])}\right) = E\left(\sup_{s\leq t}\left|\widetilde{M}(a,s) - \widetilde{M}(b,s)\right|^4\right) \leq k\cdot|a-b|^2. \tag{6.65}
\]
⁷⁶ Of course now instead of $X^c$ one can write $X$. But later we shall re-use this part of the proof in a bit different situation.
By Burkholder's and by Jensen's inequality, using the Occupation Times Formula,
\[
E\left(\sup_{s\leq t}\left|\widetilde{M}(a,s) - \widetilde{M}(b,s)\right|^4\right) \leq c\cdot E\left(\left[\widetilde{M}(a) - \widetilde{M}(b)\right]^2(t)\right) = \tag{6.66}
\]
\[
= c\cdot E\left(\left(\int_0^t 4\chi\left(a < X(s) \leq b\right)d[X^c](s)\right)^2\right) = 16c\cdot E\left(\left(\int_a^b L(x,t)\,dx\right)^2\right) =
\]
\[
= 16c\cdot(b-a)^2\,E\left(\left(\frac{1}{b-a}\int_a^b L(x,t)\,dx\right)^2\right) \leq 16c\cdot(b-a)^2\,E\left(\frac{1}{b-a}\int_a^b L^2(x,t)\,dx\right).
\]
Changing the integrals by Fubini's theorem one can estimate the last line with the following expression:
\[
16c\cdot(b-a)^2\sup_x E\left(L^2(x,t)\right). \tag{6.67}
\]
Using the definition of local times and the elementary inequalities
\[
\left||X(t)-a| - |X(0)-a|\right| \leq |X(t)-X(0)|, \qquad (z_1-z_2)^2 \leq 2\left(z_1^2+z_2^2\right),
\]
\[
L^2(x,t) \leq 2\left(X(t)-X(0)\right)^2 + 2\left(\int_0^t \operatorname{sign}\left(X(s)-x\right)dX(s)\right)^2.
\]
One can estimate the expected value in (6.67) by
\[
2\left\|X-X(0)\right\|^2_{\mathcal{H}^2} + 2\left\|\operatorname{sign}(X-x)\bullet X\right\|^2_{\mathcal{H}^2}.
\]
By Itô's isometry
\[
\left\|\operatorname{sign}(X-x)\bullet X\right\|^2_{\mathcal{H}^2} = E\left(\int_0^{\infty} 1\,d[X]\right) = \left\|1\bullet X\right\|^2_{\mathcal{H}^2} = \left\|X-X(0)\right\|^2_{\mathcal{H}^2},
\]
so the estimate of $E\left(L^2(x,t)\right)$ is independent of $x$. So by (6.67) inequality (6.65) follows.

Definition 6.81 If $X$ is a continuous local martingale then $L(a,t,\omega)$ denotes the version which is continuous in $(a,t)$.

Corollary 6.82 If $X$ is a continuous local martingale then almost surely, for every value of the parameters $a$ and $t$,
\[
L(a,t) = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_0^t \chi\left(a-\varepsilon < X(s) < a+\varepsilon\right)d[X](s). \tag{6.68}
\]

Proof. By the occupation times formula, for any interval $I$,
\[
\frac{1}{\lambda(I)}\int_0^t \chi_I\left(X(s)\right)d[X](s) = \frac{1}{\lambda(I)}\int_{\mathbb{R}} L(a,t)\,\chi_I(a)\,da = \frac{1}{\lambda(I)}\int_I L(a,t)\,da.
\]
$L$ is continuous in $a$, hence if $a_0 \in I$ and $\lambda(I) \to 0$, then
\[
\frac{1}{\lambda(I)}\int_I L(a,t)\,da \to L(a_0,t),
\]
from which (6.68) is evident.

Corollary 6.83 If $w$ is a Wiener process then the occupation time measure $\mu_t(B) \triangleq \lambda\left(s \leq t : w(s) \in B\right)$ almost surely has a differentiable distribution function, and the derivative of this function is $L(a,t)$.

Definition 6.84 A semimartingale $X$ satisfies the so-called hypothesis A if for every $t$, almost surely,
\[
\sum_{0<s\leq t}\left|\Delta X(s)\right| < \infty.
\]
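The limit (6.68) lends itself to a numerical sanity check on a simulated Wiener path, for which $[X](s) = s$. The sketch below is not from the text: the step count `n`, the band width `eps` and the number of paths are ad hoc simulation choices, and the known value $E(L(0,1)) = E|N(0,1)| = \sqrt{2/\pi} \approx 0.798$ is used only for comparison.

```python
import numpy as np

# Monte Carlo sketch of the band-occupation limit (6.68) for a Wiener process.
# All discretization parameters (n, eps, number of paths) are ad hoc choices.

def local_time_estimate(t=1.0, a=0.0, n=100_000, eps=0.01, rng=None):
    """(1/(2*eps)) * lambda{s <= t : |w(s) - a| < eps} on a discretized path."""
    if rng is None:
        rng = np.random.default_rng(0)
    dt = t / n
    w = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))
    return np.sum(np.abs(w - a) < eps) * dt / (2.0 * eps)

rng = np.random.default_rng(42)
estimates = [local_time_estimate(rng=rng) for _ in range(150)]
mean_L = float(np.mean(estimates))
# for comparison: E(L(0,1)) = E|N(0,1)| = sqrt(2/pi) ≈ 0.798
print(mean_L)
```

The band width must dominate the step size ($\varepsilon \gg \sqrt{t/n}$), otherwise the discrete path jumps over the band and the estimate is biased downwards.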
Proposition 6.85 If semimartingale X satisfies hypothesis A then the local time L (a, t, ω) has a B (R) × P-measurable equivalent modification which is almost surely continuous in t and right-regular in a.
Proof. If $X$ satisfies hypothesis A then the process $\Delta X$ has finite variation. In this case $X - \Delta X$ is meaningful and it is a continuous semimartingale. Let $J \triangleq \Delta X$. As $Y \triangleq X - J$ is a continuous semimartingale, it has a unique decomposition $M + V$, where $M$ is a continuous local martingale and $V$ is a continuous process with finite variation. By the definition of local times
\[
|X(t)-a| = |X(0)-a| + \int_0^t \operatorname{sign}\left(X(s-)-a\right)dX(s) + \sum_{0<s\leq t}\left(\Delta|X(s)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right) + L(a,t).
\]
For every $s$, by the triangle inequality,
\[
\left|\Delta|X(s)-a|\right| \leq \left|\Delta X(s)\right|. \tag{6.69}
\]
Therefore by hypothesis A the sums
\[
\sum_{0<s\leq t}\operatorname{sign}\left(X(s-)-a\right)\Delta X(s) \qquad\text{and}\qquad \sum_{0<s\leq t}\Delta|X(s)-a|
\]
are finite. Hence one can separate the terms in
\[
\sum_{0<s\leq t}\left(\Delta|X(s)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right). \tag{6.70}
\]
For every semimartingale $Z$ let
\[
\widetilde{Z}(a,t) \triangleq \int_0^t \operatorname{sign}\left(X(s-)-a\right)dZ(s).
\]
Observe that the second term of the sum (6.70) is $-\widetilde{J}(a,t)$. Using the decomposition $X = M + V + J$,
\[
|X(t)-a| = |X(0)-a| + \widetilde{M}(a,t) + \widetilde{V}(a,t) + \widetilde{J}(a,t) - \widetilde{J}(a,t) + \sum_{0<s\leq t}\Delta|X(s)-a| + L(a,t),
\]
that is,
\[
L(a,t) = |X(t)-a| - |X(0)-a| - \widetilde{M}(a,t) - \widetilde{V}(a,t) - \sum_{0<s\leq t}\Delta|X(s)-a|. \tag{6.71}
\]
By (6.69) and by hypothesis A, $\Delta|X(s)-a|$ is continuous in $a$ and it is dominated by an integrable variable with respect to the counting measure. By the Dominated Convergence Theorem
\[
\lim_{u\to a}\sum_{0<s\leq t}\Delta|X(s)-u| = \sum_{0<s\leq t}\Delta|X(s)-a|,
\]
so the sum is continuous with respect to $a$. One should show that the proposition is valid for $\widetilde{M}(a,t)$ and $\widetilde{V}(a,t)$. $V$ has finite variation on any finite interval, and the bounded function $\operatorname{sign}\left(X(s-)-u\right)$ is right-regular with respect to $u$. By the Dominated Convergence Theorem $\widetilde{V}$ is right-regular with respect to $a$. Finally, let us consider $\widetilde{M}$. The continuous part of the semimartingale $X$ is $X^c = M$, so repeating the proof of the previous proposition one can easily prove that $\widetilde{M}(a,t)$ has a continuous version.

Corollary 6.86 If a semimartingale $X$ satisfies hypothesis A and if $M + V$ is the decomposition of $X - \Delta X$, then
\[
\Delta L(a,t) \triangleq L(a,t) - L(a-,t) = 2\int_0^t \chi\left(X(s-)=a\right)dV(s) = 2\int_0^t \chi\left(X(s)=a\right)dV(s).
\]

Proof. By the proof of the previous proposition only $\widetilde{V}(a,t)$ is not continuous in $a$, so
\[
\Delta L(a) = -\Delta\widetilde{V}(a) = -\int_0^t \left(\operatorname{sign}\left(X(s-)-a\right) - \operatorname{sign}\left(X(s-)-(a-)\right)\right)dV(s) = 2\int_0^t \chi\left(X(s-)=a\right)dV(s).
\]
$V$ is continuous and $X(s-) = X(s)$ outside a countable number of points $s$, so
\[
2\int_0^t \chi\left(X(s-)=a\right)dV(s) = 2\int_0^t \chi\left(X(s)=a\right)dV(s).
\]
Example 6.87 Even for continuous semimartingales the local time can be discontinuous.

1. Let $w$ be a Wiener process and let $X \triangleq |w|$. As the support of the measure generated by $L(a)$ is in the set $\{X = a\}$, if $a < 0$ then $L(a,t) = 0$. Let $a = 0$. $L$ is right-continuous in the parameter $a$, therefore, using the occupation times formula,
\[
L(0,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^{\varepsilon} L(a,t)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_{\mathbb{R}} \chi\left(0 \leq a < \varepsilon\right)L(a,t)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(|w| < \varepsilon\right)d[|w|].
\]
By Tanaka's formula $|w| = \operatorname{sign}(w)\bullet w + L_w(0)$. $L_w(0)$ is continuous and increasing, so $[|w|] = [\operatorname{sign}(w)\bullet w] = [w]$. Hence, using again that $L_w$ is continuous,
\[
L(0,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(-\varepsilon < w < \varepsilon\right)d[w] = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_{-\varepsilon}^{\varepsilon} L_w(a)\,da = 2L_w(0) \neq 0.
\]
This implies that the local time $L(a,t)$ is not left-continuous in the parameter $a$.
2. On the other hand it is interesting to discuss the case $a > 0$. Again by the right-continuity,
\[
L(a,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(|w| \in [a,a+\varepsilon)\right)(s)\,ds = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(w \in [a,a+\varepsilon)\right)(s)\,ds + \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(-w \in [a,a+\varepsilon)\right)(s)\,ds.
\]
The first limit is $L_w(a,t)$ and the second is $L_w(-a,t)$. Hence
\[
L(a,t) = L_w(-a,t) + L_w(a,t).
\]
This expression is continuous on the set $a \geq 0$.
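The two claims of Example 6.87 are visible in a simple discretized experiment (the path, the grid and the band widths below are ad hoc choices, not from the text): the band-occupation density of $X = |w|$ is exactly $0$ for $a < 0$ but strictly positive at $a = 0$, so $a \mapsto L(a,t)$ cannot be left-continuous at $0$, while for $a > 0$ the band count for $|w|$ splits into the counts for $w$ near $a$ and near $-a$.

```python
import numpy as np

# Discretized illustration of Example 6.87 (grid sizes are ad hoc choices).
rng = np.random.default_rng(7)
n, t = 400_000, 1.0
dt = t / n
w = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))
X = np.abs(w)
eps = 0.01

def band_density(path, a):
    # right-band occupation density (1/eps) * lambda{s <= t : path(s) in [a, a+eps)}
    return np.sum((path >= a) & (path < a + eps)) * dt / eps

L_neg = band_density(X, -0.05)   # a < 0: |w| never enters the band, so exactly 0
L_zero = band_density(X, 0.0)    # a = 0: strictly positive for a typical path
a = 0.3
lhs = band_density(X, a)                               # estimate of L(a, t) for X = |w|
rhs = band_density(w, a) + band_density(w, -a - eps)   # estimate of L_w(a,t) + L_w(-a,t)
print(L_neg, L_zero, lhs, rhs)
```

Since $\{|w| \in [a, a+\varepsilon)\} = \{w \in [a, a+\varepsilon)\} \cup \{w \in (-a-\varepsilon, -a]\}$, the split `lhs ≈ rhs` holds up to band-endpoint effects.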
6.5.5 Local time of Wiener processes
In this subsection we shall investigate the local times of Wiener processes.

Definition 6.88 If $w$ is a Wiener process then $L$ denotes the local time of $w$ at the point $a = 0$. That is, $L \triangleq L_w(0)$. We shall very often refer to $L$ as the local time of $w$.

Example 6.89 Tanaka's formula for Wiener processes.

If $w$ is a Wiener process and $L \triangleq L_w(0)$ is the local time of $w$, then by Tanaka's formula
\[
|w| = \operatorname{sign}(w)\bullet w + L \triangleq \beta + L. \tag{6.72}
\]
$\operatorname{sign}(w)\bullet w$ is a continuous local martingale with quadratic variation $\left(\operatorname{sign}(w)\right)^2\bullet[w] = [w]$. By Lévy's characterization theorem⁷⁷, $\beta \triangleq \operatorname{sign}(w)\bullet w$ is also a Wiener process.

Our goal is to describe the distribution of $L$. To do this we shall need the next simple lemma:

Lemma 6.90 (Skorohod) If $y$ is a continuous function defined on $\mathbb{R}_+$ and $y(0) \geq 0$, then there are functions $z$ and $a$ on $\mathbb{R}_+$ for which:
1. $z = y + a$;
2. $z$ is non-negative;
3. $a$ is increasing, continuous, $a(0) = 0$, and the support of the measure generated by $a$ is in the set $\{z = 0\}$.
The functions $a$ and $z$ are unique, and
\[
a(t) = \sup_{s\leq t} y^-(s) \triangleq \sup_{s\leq t}\max\left(-y(s),0\right). \tag{6.73}
\]

Proof. First we show that the decomposition is unique. Let $(a_1,z_1)$ and $(a_2,z_2)$ be two decompositions satisfying the conditions of the lemma. As
\[
y = z_1 - a_1 = z_2 - a_2,
\]
⁷⁷ See Theorem 6.13, page 368.
so $z_1 - z_2 = a_1 - a_2$. As $a_1$ and $a_2$ are increasing, $z_1 - z_2$ and $a_1 - a_2$ have finite variation. Integrating by parts,
\[
0 \leq \left(z_1 - z_2\right)^2(t) = 2\int_0^t \left(z_1(s) - z_2(s)\right)d\left(z_1 - z_2\right)(s) = 2\int_0^t \left(z_1(s) - z_2(s)\right)d\left(a_1 - a_2\right)(s).
\]
By the assumption about the support of the measures generated by the functions $a_1$ and $a_2$, and as $z_1 \geq 0$ and $z_2 \geq 0$, the last integral is
\[
-2\int_0^t z_1(s)\,da_2 - 2\int_0^t z_2(s)\,da_1 \leq 0.
\]
Hence $z_1 = z_2$.
As a second step we show that $a$ in (6.73) and $z \triangleq y + a$ satisfy the conditions of the lemma. $a$ is trivially increasing. By the assumptions $y$ is continuous, hence $y^-$ is also continuous, and it is easy to show that $a$ is continuous. For every $t$,
\[
z(t) \triangleq y(t) + a(t) \geq y(t) + y^-(t) = y^+(t) \geq 0.
\]
One should prove that the support of the measure generated by $a$ is in the set $\{z = 0\}$, that is,
\[
\int_{\mathbb{R}_+} \chi\left(z > 0\right)da = \lim_{n\to\infty}\int_{\mathbb{R}_+} \chi\left(z > \frac{1}{n}\right)da = 0.
\]
This means that one should prove that for every $\varepsilon > 0$
\[
\int_{\mathbb{R}_+} \chi\left(z > \varepsilon\right)da = 0.
\]
$z$ is continuous, hence for every $\varepsilon > 0$ the set $\{z > \varepsilon\}$ is open, hence $\{z > \varepsilon\}$ is a union of countably many open intervals. Let $(u,v)$ be one of these intervals. It is sufficient to prove that $a(v) = a(u)$. If $s \in (u,v)$, then
\[
-y(s) = a(s) - z(s) \leq a(v) - \varepsilon.
\]
From this
\[
a(v) = \max\left(a(u),\ \sup_{u\leq s\leq v} y^-(s)\right) \leq \max\left(a(u),\ a(v) - \varepsilon\right).
\]
This can happen only if $a(v) \leq a(u)$, that is, $a(v) = a(u)$.
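The explicit formula (6.73) is easy to test on a discrete path. In the sketch below (the random path and the grid are ad hoc choices, not from the text) `a` is the running supremum of $y^-$, and the three properties of Skorohod's lemma hold exactly on the grid: $z = y + a \geq 0$, $a$ is increasing, and $a$ increases only at points where $z = 0$.

```python
import numpy as np

# Discrete sketch of Skorohod's lemma: y is a discretized Wiener-like path
# with y(0) near 0; the grid and step law are ad hoc simulation choices.
rng = np.random.default_rng(1)
n = 100_000
y = np.cumsum(rng.normal(0.0, n ** -0.5, size=n))

a = np.maximum.accumulate(np.maximum(-y, 0.0))  # a(t) = sup_{s<=t} max(-y(s), 0)
z = y + a                                       # z = y + a

min_z = float(z.min())                          # should be >= 0
min_da = float(np.diff(a).min())                # should be >= 0 (a increasing)
increase = np.diff(a) > 0                       # steps where a actually grows
# wherever a increases, a(t) = -y(t), hence z(t) = 0 exactly
max_z_on_increase = float(z[1:][increase].max()) if increase.any() else 0.0
print(min_z, min_da, max_z_on_increase)
```

The last check is the discrete analogue of the support condition: the measure generated by `a` only charges the zero set of `z`.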
Proposition 6.91 The distribution of $L(t,\omega) \triangleq L(0,t,\omega)$ is the same as the distribution of the maximum of a Wiener process on the interval $[0,t]$. Hence the density function of $L(t)$ is
\[
f_t(x) \triangleq \frac{2}{\sqrt{2\pi t}}\exp\left(-\frac{x^2}{2t}\right), \qquad x > 0.
\]

Proof. By Tanaka's formula $|w| = \beta + L$, where $\beta$ is a Wiener process and the two sides are equal up to indistinguishability. The support of the measure generated by $L$ is in the set $\{|w| = 0\}$. Hence by Skorohod's lemma
\[
L(t) = \sup_{s\leq t}\beta^-(s) = \sup_{s\leq t}\left(-\beta(s)\right) \triangleq S_{-\beta}(t) \quad\text{a.s.}, \tag{6.74}
\]
from which, by the symmetry of the Wiener process, the proposition is evident⁷⁸.

Proposition 6.92 The augmented filtration generated by $\beta \triangleq \operatorname{sign}(w)\bullet w$ is the same as the augmented filtration generated by $|w|$.

Proof. Let $\mathcal{F}^{\beta}$ and $\mathcal{F}^{|w|}$ be the augmented filtrations generated by $\beta$ and by $|w|$. By (6.74), $L$ is adapted with respect to $\mathcal{F}^{\beta}$. By Tanaka's formula $|w|$ is $\mathcal{F}^{\beta}$-adapted. Hence $\mathcal{F}^{|w|} \subseteq \mathcal{F}^{\beta}$. On the other hand, for Wiener processes $L(a,t)$ is almost surely continuous in $a$, so by (6.68) and by the occupation times formula
\[
L(t) = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_{-\varepsilon}^{\varepsilon} L(a,t,\omega)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_{\mathbb{R}} L(a,t,\omega)\,\chi\left((-\varepsilon,\varepsilon)\right)(a)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_0^t \chi\left(|w(s)| < \varepsilon\right)ds.
\]
Hence $L$ is $\mathcal{F}^{|w|}$-adapted. Therefore $\beta$ is $\mathcal{F}^{|w|}$-adapted, so $\mathcal{F}^{\beta} \subseteq \mathcal{F}^{|w|}$.

Proposition 6.93 If $L(a,\infty,\omega)$ denotes the limit $\lim_{t\to\infty} L(a,t,\omega)$, then for every $a$
\[
P\left(L(a,\infty) = \infty\right) = 1.
\]
⁷⁸ See Example 1.123, page 87 and Proposition B.7, page 564.
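Proposition 6.91 can be checked by simulation: by (6.74), $L(t)$ has the law of the running maximum of a Wiener process, i.e. of $|N(0,t)|$, whose mean at $t = 1$ is $\sqrt{2/\pi} \approx 0.798$. The sketch below (sample sizes are ad hoc choices) slightly underestimates the supremum because the discrete grid misses the true maximum.

```python
import numpy as np

# Monte Carlo check of Proposition 6.91 (discretization parameters ad hoc):
# sup_{s<=1} beta(s) should have the law of |N(0,1)|, with mean sqrt(2/pi).
rng = np.random.default_rng(3)
paths, n = 1000, 5000
dt = 1.0 / n
increments = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
# running maximum of each discretized path (max with 0, since beta(0) = 0)
sups = np.maximum(np.cumsum(increments, axis=1).max(axis=1), 0.0)
mean_sup = float(sups.mean())
print(mean_sup)   # theoretical value: sqrt(2/pi) ≈ 0.798
```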
Proof. By definition
\[
|w(t)-a| = |a| + \beta(t) + L(a,t),
\]
where $\beta \triangleq \operatorname{sign}(w-a)\bullet w$. By Lévy's theorem $\beta$ is a Wiener process. Again by Skorohod's lemma
\[
L(a,t) = \sup_{s\leq t}\left(\beta(s) + |a|\right)^-.
\]
Hence, as $\liminf_{t\to\infty}\beta(t) = -\infty$ almost surely, $P\left(L(a,\infty) = \infty\right) = 1$.

Finally we show that for Wiener processes the support of the measure generated by $t \mapsto L(t,\omega)$ is not only almost surely in the set $Z(\omega) \triangleq \{t : w(t,\omega) = 0\}$, but the two sets are almost surely equal.

Proposition 6.94 For almost all outcomes $\omega$ the set $Z(\omega)$ is closed and has empty interior.

Proof. The trajectories of Wiener processes are continuous, which immediately implies that $Z(\omega)$ is closed. We show that almost surely the Lebesgue measure of $Z(\omega)$ is zero. This will imply that $Z(\omega)$ does not contain a segment with positive length. By Fubini's theorem, using that for every $t > 0$ the value of a Wiener process has a non-degenerate Gaussian distribution, so $P(w(t) = 0) = 0$ for every $t > 0$,
\[
E\left(\lambda\left(Z(\omega)\right)\right) = E\left(\int_0^{\infty}\chi\left(Z(\omega)\right)(t)\,dt\right) = \int_0^{\infty} E\left(\chi\left(Z(\omega)\right)(t)\right)dt = 0,
\]
hence $\lambda\left(Z(\omega)\right) = 0$ almost surely.

Definition 6.95 If $w$ is a Wiener process then the intervals in the open set $Z^c(\omega) = \{|w(\omega)| > 0\}$ are called the excursion intervals of $w$.
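Both statements can be seen on a simulated path (the grid below is an ad hoc choice): the occupation measure of a shrinking band around $0$ tends to $0$, consistent with $\lambda(Z(\omega)) = 0$, while the path still changes sign many times, so the grid resolves many excursion intervals.

```python
import numpy as np

# Discretized illustration of Proposition 6.94 / Definition 6.95 (ad hoc grid).
rng = np.random.default_rng(11)
n = 200_000
w = np.cumsum(rng.normal(0.0, n ** -0.5, size=n))

# Lebesgue measure of {s <= 1 : |w(s)| < delta} for shrinking delta:
occ = [float(np.mean(np.abs(w) < d)) for d in (0.1, 0.01, 0.001)]
# number of sign changes ~ number of excursion intervals resolved by the grid
sign_changes = int(np.sum(np.sign(w[1:]) != np.sign(w[:-1])))
print(occ, sign_changes)
```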
For every $t$ let
\[
\sigma_t(\omega) \triangleq \inf\left\{s > 0 : L(s) \geq t\right\}, \qquad \rho_t(\omega) \triangleq \inf\left\{s > 0 : L(s) > t\right\}.
\]
$\sigma_t$ and $\rho_t$ are obviously stopping times. $[\sigma_t,\rho_t]$ is the largest closed interval where $L$ is constantly $t$. Let
\[
O(\omega) \triangleq \bigcup_t \left(\sigma_t(\omega),\rho_t(\omega)\right).
\]
$O(\omega)$ is an open set in $\mathbb{R}$, so by the structure of the open sets of the real line $O(\omega)$ is the union of at most countably many disjoint intervals. As $L$ is increasing, it is easy to see that if $t_1 \neq t_2$ then
\[
\left(\sigma_{t_1}(\omega),\rho_{t_1}(\omega)\right) \cap \left(\sigma_{t_2}(\omega),\rho_{t_2}(\omega)\right) = \emptyset.
\]
Hence $O(\omega)$ is the union of at most countably many of the intervals $\left(\sigma_t(\omega),\rho_t(\omega)\right)$. Obviously $O(\omega)$ is the union of the at most countably many intervals where $L$ is constant.

Proposition 6.96 If $w$ is a Wiener process and $L$ is the local time of $w$ at zero, then almost surely $O(\omega)$ is the union of the excursion intervals of $w$, that is,
\[
O(\omega) \overset{\text{a.s.}}{=} \left\{|w(\omega)| > 0\right\} = Z^c(\omega).
\]

Proof. The proof uses several interesting properties of Wiener processes.
1. Observe that with probability one the maxima of a Wiener process $\beta$ on any two disjoint compact intervals are different: if $a < b < c < d < \infty$, then by the definition of the conditional expectation, using the independence of the increments,
\[
P\left(\sup_{a\leq t\leq b}\beta(t) \neq \sup_{c\leq t\leq d}\beta(t)\right) = P\left(\sup_{a\leq t\leq b}\left(\beta(t)-\beta(b)\right) + \beta(b) \neq \sup_{c\leq t\leq d}\left(\beta(t)-\beta(c)\right) + \beta(c)\right) =
\]
\[
= \int_{\mathbb{R}}\int_{\mathbb{R}} P\left(\beta(c)-\beta(b) \neq x-y\right)dF(x)\,dG(y) = \int_{\mathbb{R}}\int_{\mathbb{R}} 1\,dF(x)\,dG(y) = 1.
\]
Unifying the measure-zero sets one can prove the same result for every interval with rational endpoints.
2. This implies that with probability one every local maximum of a Wiener process has a different value.
3. By Tanaka's formula
\[
|w| = L - \beta \tag{6.75}
\]
for some Wiener process $\beta$. Recall that by Skorohod's lemma⁷⁹, $L$ is the running maximum of $\beta$. This and (6.75) imply that $L$ is constant on any interval⁸⁰ where $|w| > 0$. As with probability one the local maxima of $\beta$ are different, on the flat segments of $L$, with probability one, $w$ is not zero. Hence the excursion intervals of $w$ and the flat parts of $L$ are almost surely equal.

Proposition 6.97 Let $w$ be a Wiener process. For almost all $\omega$ the following three sets are equal⁸¹:
1. the set of zeros of $w$;
2. the complement of $O(\omega)$;
3. the support of the measure generated by the local time $L(\omega)$.

Proof. Let $S(\omega)$ denote the support of the measure generated by $L(\omega)$. By definition $S(\omega)$ is the complement of the largest open set $G(\omega)$ with $L(G(\omega)) = 0$. $L$ is constant on the components of $O$, so $L(O) = 0$, that is, $O(\omega) \subseteq G(\omega)$. Hence
\[
S(\omega) \triangleq G^c(\omega) \subseteq O^c(\omega).
\]
Let $I$ be an open interval with $I \cap O(\omega) = \emptyset$. If $s_1 < s_2$ are in $I$, then $L(s_1,\omega) = L(s_2,\omega)$ is impossible, so the measure of $I$ with respect to $L(\omega)$ is positive. Hence $O(\omega)$ is the maximal open set with zero measure, that is, $O(\omega) = G(\omega)$. Hence the equivalence of the last two sets is evident. By the previous proposition
\[
Z^c(\omega) \overset{\text{a.s.}}{=} O(\omega) = S^c(\omega),
\]
so $Z(\omega) = S(\omega)$.

6.5.6 Ray–Knight theorem

Let $b$ be an arbitrary number and let $\tau_b$ be the hitting time of $b$. On $[0,b]$ one can define the process
\[
Z(a,\omega) \triangleq L\left(b-a,\tau_b(\omega),\omega\right), \qquad a \in [0,b]. \tag{6.76}
\]
⁷⁹ See Proposition 6.91, page 447.
⁸⁰ See Proposition 6.97, page 450.
⁸¹ See Example 7.43, page 494.
If $a > 0$ then $Z(a)$ has an exponential distribution⁸² with parameter $\lambda \triangleq 1/(2a)$. In this subsection we try to find some deep reason for this surprising result. Let us first prove some lemmas.

Lemma 6.98 Let $\mathcal{Z} \triangleq (\mathcal{Z}_a)$ be the filtration generated by (6.76). If $\xi \in L^2\left(\Omega,\mathcal{Z}_a,P\right)$, then $\xi$ has the following representation:
\[
\xi = E(\xi) + \int_0^{\infty} H\cdot\chi\left(b \geq w > b-a\right)dw. \tag{6.77}
\]
In the representation $H$ is a predictable process and
\[
E\left(\int_0^{\infty} H^2\,\chi\left(b \geq w > b-a\right)d[w]\right) < \infty.
\]

Proof. Let us emphasize that the predictability of $H$ means that $H$ is predictable with respect to the filtration $\mathcal{F}$ generated by the underlying Wiener process.
1. Let $U$ be the set of random variables $\xi$ with representation (6.77). $\chi\left(b \geq w > b-a\right)$ is a left-regular process, so the processes
\[
U \triangleq H\cdot\chi\left(b \geq w > b-a\right), \qquad H \in L^2(w)
\]
form a closed subset of $L^2(w)$. From Itô's isometry it is clear that the random variables satisfying (6.77) form a closed subset of $L^2\left(\Omega,\mathcal{F}_{\infty},P\right)$. Obviously $\mathcal{Z}_a \subseteq \mathcal{F}_{\infty}$, and so the set of variables with the given property is a closed subspace of $L^2\left(\Omega,\mathcal{Z}_a,P\right)$.
2. Let
\[
\eta_g \triangleq \exp\left(-\int_0^a g(s)\,Z(s)\,ds\right), \qquad g \in C_c^1\left([0,a]\right),
\]
where $C_c^1([0,a])$ denotes the set of continuously differentiable functions which are zero outside $[0,a]$. $Z$ is continuous, so the $\sigma$-algebra generated by the variables $\eta_g$ is equal to $\mathcal{Z}_a$. Let
\[
U(t) \triangleq \exp\left(-\int_0^t g\left(b-w(s)\right)ds\right) \triangleq \exp\left(-K(t)\right).
\]
⁸² See Example 6.73, page 430.
$g$ is bounded, so $U$ is bounded. By the Occupation Times Formula
\[
\eta_g \triangleq \exp\left(-\int_0^a g(s)\,Z(s)\,ds\right) = \exp\left(-\int_0^a g(s)\,L\left(b-s,\tau_b\right)ds\right) = \exp\left(-\int_{b-a}^b g(b-v)\,L\left(v,\tau_b\right)dv\right) =
\]
\[
= \exp\left(-\int_{\mathbb{R}} g(b-v)\,L\left(v,\tau_b\right)dv\right) = \exp\left(-\int_0^{\tau_b} g\left(b-w(v)\right)dv\right) = U\left(\tau_b\right).
\]
Let $f \in C^2$ and
\[
M \triangleq f(w)\exp(-K) \triangleq f(w)\,U.
\]
$K$ is continuously differentiable, so it has finite variation, so by Itô's formula
\[
M - M(0) = f'(w)U\bullet w - f(w)U\bullet K + \frac12 Uf''(w)\bullet[w].
\]
Let $f$ be zero on $(-\infty,b-a]$, $f(b) = 1$ and $f''(x) = 2g(b-x)f(x)$. The third integral is
\[
\frac12 Uf''(w)\bullet[w] = Ug(b-w)f(w)\bullet[w] = Uf(w)\bullet K,
\]
hence the second and the third integrals are the same. Hence
\[
M - M(0) = f'(w)U\bullet w.
\]
As $f'(x) = f'(x)\chi\left(x > b-a\right)$,
\[
\eta_g = U\left(\tau_b\right) = \frac{M\left(\tau_b\right)}{f\left(w(\tau_b)\right)} = \frac{M\left(\tau_b\right)}{f(b)} = M\left(\tau_b\right) = M(0) + \int_0^{\tau_b} U(s)f'\left(w(s)\right)dw(s) =
\]
\[
= M(0) + \int_0^{\tau_b} U(s)f'\left(w(s)\right)\chi\left(w(s) > b-a\right)dw(s) \triangleq E\left(\eta_g\right) + \int_0^{\tau_b} H\chi\left(w > b-a\right)dw.
\]
So for $\eta_g$ the representation (6.77) is valid. As $\eta_g$ generates $\mathcal{Z}_a$, and the set of variables for which (6.77) is valid is closed, the lemma holds.

Lemma 6.99 If the filtration is given by $\mathcal{Z}$, then $Z(a) - 2a$ is a continuous martingale on $[0,b]$.

Proof. Obviously $Z(a) - 2a$ is continuous in $a$. By Tanaka's formula
\[
\left(w(t) - (b-a)\right)^+ = \int_0^t \chi\left(w(s) > b-a\right)dw(s) + \frac12 L\left(b-a,t\right).
\]
If $t = \tau_b$, then
\[
Z(a) - 2a \triangleq L\left(b-a,\tau_b\right) - 2a = -2\int_0^{\tau_b}\chi\left(w(s) > b-a\right)dw(s) = -2\int_0^{\infty}\chi\left(b \geq w(s) > b-a\right)dw(s).
\]
From this $Z(a)$ is integrable and its expected value is $2a$. If $u < v$, then for every $\mathcal{Z}_u$-measurable bounded variable $\xi$, by the previous lemma and by Itô's isometry,
\[
E\left(\left(Z(v)-2v\right)\xi\right) = -2E\left(\int_0^{\infty}\chi\left(b \geq w > b-v\right)dw\int_0^{\infty} H\chi\left(b \geq w > b-u\right)dw\right) =
\]
\[
= -2E\left(\int_0^{\infty}\chi\left(b \geq w(s) > b-v\right)H\chi\left(b \geq w(s) > b-u\right)ds\right) = -2E\left(\int_0^{\infty} H\chi\left(b \geq w(s) > b-u\right)ds\right) = E\left(\left(Z(u)-2u\right)\xi\right).
\]
Hence $Z(a) - 2a$ is a martingale.

Lemma 6.100 If $X$ is a continuous local martingale and $\sigma \geq 0$ is a random variable, then the quadratic variation of the stochastic process $L_{\sigma}(a,\omega) \triangleq L\left(a,\sigma(\omega),\omega\right)$ is finite. If $u < v$, then the quadratic variation of $L_{\sigma}$ on the interval $[u,v]$ is
\[
[L_{\sigma}]_u^v \overset{\text{a.s.}}{=} 4\int_u^v L(a,\sigma)\,da.
\]

Proof. Of course, by definition, the random variable $\xi$ is the quadratic variation of $L_{\sigma}$ on the interval $[u,v]$ if for an arbitrary infinitesimal partition $\left(a_k^{(n)}\right)_{k,n}$ of $[u,v]$, as $n \to \infty$,
\[
\sum_k\left(L_{\sigma}\left(a_k^{(n)}\right) - L_{\sigma}\left(a_{k-1}^{(n)}\right)\right)^2 \overset{P}{\to} \xi.
\]
1. Let us fix $t$. Let
\[
\widetilde{X}(a) \triangleq \int_0^t \operatorname{sign}\left(X(s)-a\right)dX(s).
\]
By the definition of local times
\[
L(a,t) = |X(t)-a| - |X(0)-a| - \widetilde{X}(a,t).
\]
Let us remark that if $f$ is a continuous and $g$ is a Lipschitz continuous function, then
\[
\left|[f,g]\right| \leq \limsup_{n\to\infty}\ \max_k\left|f\left(a_k^{(n)}\right) - f\left(a_{k-1}^{(n)}\right)\right|\sum_k\left|g\left(a_k^{(n)}\right) - g\left(a_{k-1}^{(n)}\right)\right| \leq \limsup_{n\to\infty}\ \max_k\left|f\left(a_k^{(n)}\right) - f\left(a_{k-1}^{(n)}\right)\right|\sum_k K\left|a_k^{(n)} - a_{k-1}^{(n)}\right| = 0.
\]
The process
\[
F_{\sigma}(a) \triangleq |X(\sigma)-a| - |X(0)-a|
\]
is obviously Lipschitz continuous in the parameter $a$. $X$ is a continuous local martingale, so $\widetilde{X}$ is continuous⁸³ in $a$, so for every outcome
\[
\left[F_{\sigma} + \widetilde{X}_{\sigma}, F_{\sigma}\right] = 0 \qquad\text{and}\qquad \left[F_{\sigma}\right] = 0.
\]
Therefore
\[
\left[L_{\sigma}\right] = \left[F_{\sigma} + \widetilde{X}_{\sigma}\right] = \left[\widetilde{X}_{\sigma}\right].
\]
2. By Itô's formula
\[
\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right)^2 = 2\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right)\bullet\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right) + \left[\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right].
\]
⁸³ See Proposition 6.80, page 439.
By the Occupation Times Formula, for every $t$, almost surely,
\[
\left[\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right] = \left[\left(\operatorname{sign}\left(X - a_k^{(n)}\right) - \operatorname{sign}\left(X - a_{k-1}^{(n)}\right)\right)\bullet X\right] = \left[-2\chi\left(a_{k-1}^{(n)} < X \leq a_k^{(n)}\right)\bullet X\right] =
\]
\[
= 4\chi\left(a_{k-1}^{(n)} < X \leq a_k^{(n)}\right)\bullet[X] = 4\int_{a_{k-1}^{(n)}}^{a_k^{(n)}} L(a)\,da.
\]
Hence almost surely
\[
\sum_k\left[\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right](\sigma) = 4\int_u^v L(a,\sigma)\,da = 4\int_u^v L_{\sigma}(a)\,da.
\]
3. Finally we should calculate the limit of the sum of the first terms. The sum of the stochastic integrals is
\[
-2\sum_k\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right)\chi\left(a_{k-1}^{(n)} < X \leq a_k^{(n)}\right)\bullet X.
\]
As $\widetilde{X}$ is continuous, if $n \to \infty$ the integrand goes to zero. The integrand is locally bounded, so the stochastic integral goes to zero uniformly on compact intervals in probability.

Theorem 6.101 (Ray–Knight) There is a Wiener process $\beta$ with respect to the filtration $\mathcal{Z}$ such that $Z(a) \triangleq L\left(b-a,\tau_b\right)$ satisfies the equation
\[
Z(a) - 2a = 2\int_0^a \sqrt{Z}\,d\beta, \qquad a \in [0,b]. \tag{6.78}
\]

Proof. $L(u,t)$ is positive for every $t > 0$, so $Z(a) > 0$. The quadratic variation of $Z(a) - 2a$ is $4\int_0^a Z(s)\,ds$. By Doob's representation theorem⁸⁴ there is a Wiener process $\beta$ with respect to the filtration generated by $Z$ for which (6.78) is valid. $Z(a)$ is a continuous semimartingale. By Itô's formula
\[
\exp\left(-sZ(a)\right) - 1 = \int_0^a \exp(-sZ)\,d(-sZ) + \frac12\int_0^a \exp(-sZ)\,d[-sZ].
\]
$Y(u) \triangleq Z(u) - 2u$ is a martingale and, as $Z \geq 0$, $\exp(-sZ) \leq 1$, so
\[
E\left(\int_0^a \left(\exp(-sZ)\right)^2 d[-sZ]\right) \leq E\left(\int_0^a d[-sZ]\right) = 4s^2 E\left(\int_0^a Z(u)\,du\right) = 4s^2\int_0^a E\left(Z(u)\right)du = 8s^2\int_0^a u\,du < \infty.
\]
Hence the integral
\[
\int_0^a \exp\left(-sZ(u)\right)d\left(-s\left(Z(u)-2u\right)\right)
\]
is a martingale. Let
\[
L(a,s) \triangleq E\left(\exp\left(-sZ(a)\right)\right).
\]
Taking expected values on both sides of Itô's formula and using the martingale property of the above integral,
\[
L(a,s) - 1 = E\left(\int_0^a \exp\left(-sZ(u)\right)d(-2su)\right) + \frac12 E\left(\int_0^a \exp(-sZ)\,d[-sZ]\right).
\]
Let us calculate the second integral. Using (6.78),
\[
\frac12 E\left(\int_0^a \exp(-sZ)\,d[-sZ]\right) = 2s^2 E\left(\int_0^a \exp\left(-sZ(u)\right)Z(u)\,du\right) = 2s^2\int_0^a E\left(\exp\left(-sZ(u)\right)Z(u)\right)du = -2s^2\int_0^a E\left(\frac{d}{ds}\exp\left(-sZ(u)\right)\right)du.
\]
Changing the expected value and differentiating by $a$,
\[
\frac{\partial L}{\partial a} = -2sL(a,s) - 2s^2 E\left(\frac{d}{ds}\exp\left(-sZ(a)\right)\right).
\]
For Laplace transforms one can interchange the differentiation and the integration, so
\[
\frac{\partial L}{\partial a} = -2sL(a,s) - 2s^2\frac{\partial L}{\partial s}, \qquad L(a,0) = 1.
\]
⁸⁴ See Proposition 6.18, page 373.
457
With direct calculation one can easily verify that L (a, s) =
1 1 + 2sa
satisfies the equation. The Laplace transform L (a, s) is necessarily analytic so by the theorem of Cauchy and Kovalevskaja 1/ (1 + 2sa) is the unique solution of the equation. This implies that Z (a) has an exponential distribution with parameter λ = 1/ (2a). 6.5.7
Theorem of Dvoretzky Erd˝ os and Kakutani
First let us introduce some definitions: Definition 6.102 Let f be a real valued function on an interval I ⊆ R. 1. We say that t is a point of increase of f if there is a δ > 0 such that f (s) ≤ f (t) ≤ f (u) whenever s, u ∈ I ∩ (−δ + t, t + δ) and s < t < u. 2. We say that t is a point of strict increase of f if there is a δ > 0 such that f (s) < f (t) < f (u) whenever s, u ∈ I ∩ (−δ + t, t + δ) and s < t < u. A striking feature of Wiener processes is the following observation: Theorem 6.103 (Dvoretzky–Erd˝ os–Kakutani) Almost surely the trajectories of Wiener processes do not have a point of increase. Proof. Let w be a Wiener process. 1. One should show that P ({ω : w (ω) has a point of increase}) = 0. Obviously sufficient to prove that for an arbitrary v > 0 P ({ω : w (ω) has a point of increase in [0, v]}) = 0. By Girsanov’s theorem there is a probability measure P ∼ Q on (Ω, Fv ) such that w (t) w (t) + t is a Wiener process on [0, v] under Q. Every point of increase of w is a strict point of increase of w. Therefore it is sufficient to prove that P ({ω : w (ω) has a point of strict increase in [0, v]}) = 0.
458
ˆ FORMULA ITO’s
Of course this is the same as P ({ω : w (ω) has a point of strict increase}) = 0. To prove this it is sufficient to show that P (Ωp,q ) = 0 for every rational numbers p and q where Ωp,q
ω : ∃t such that w (s, ω) < w (t, ω) < w (u, ω) , for every s, u ∈ (p, q) , s < t < u
.
Using the strong Markov property of w one can assume that p = 0. 2. Let L be the local time of w. We show that for every b almost surely Z (a) L (b − a, τ b (ω) , ω) > 0,
∀a ∈ (0, b] .
As we know85 if a > 0 then Z (a) has an exponential distribution so it is almost surely positive for every fixed a ∈ (0, b]. Z (a) is continuous so if Ωn is the set of outcomes ω for which Z (a, ω) ≥ 1/n for every rational a then Z (a, ω) ≥ 1/n for every a ∈ (0, b]. If Ω ∪n Ωn then P (Ω ) = 1 and if ω ∈ Ω then Z (a, ω) > 0 for every a ∈ (0, b]. 3. Now it is obvious that there is an Ω∗ with P (Ω∗ ) = 1 that whenever ω ∈ Ω∗ then a. L (a, t, ω) is continuous in (a, t); b. the support of L (a, ω) is {w (ω) = a} for every rational number a; c. Z (a) L (b − a, τ b (ω) , ω) > 0 whenever 0 < a ≤ b for every rational number b. 4. Let ω ∈ Ω∗ and let ω ∈ Ωp,q = Ω0,q . This means that for some t w (s, ω) < w (t, ω) < w (u, ω) ,
0 ≤ s < t < u ≤ q.
(6.79)
Let us fix a rational number w (t, ω) < b < w (q, ω). Let (bn ) be a sequence of rational numbers for which bn w (t, ω). As w (t, ω) < b and b is rational by c. L (w (t, ω) , τ b (ω) , ω) = L (b − (b − w (t, ω)) , τ b (ω) , ω) > 0. L is continuous so the measure of every single point is zero so by b. Obviously L (bn , τ bn , ω) = 0. So L (w (t, ω) , τ b (ω) , ω) = L (w (t, ω) , τ b (ω) , ω) − L (bn , τ b (ω) , ω) + + L (bn , τ b (ω) , ω) − L (bn , t, ω) + + L (bn , t, ω) − L (bn , τ bn , ω) . 85 See:
Example 6.73, page 430.
By the construction, as t is a point of increase,

b_n < w(t, ω) < w(a, ω) < b,  a ∈ (t, τ_b).

By b. the support of the measure generated by L(b_n, ω) is {w(ω) = b_n}. Hence the second line in the above estimation is zero. t is a point of increase, so by (6.79) if n → ∞ then τ_{b_n} → t. Therefore, using a.,

0 < L(w(t, ω), τ_b(ω), ω) = L(w(t, ω), τ_b(ω), ω) − L(w(t, ω), τ_b(ω), ω)
 + L(w(t, ω), t, ω) − L(w(t, ω), t, ω) = 0.

This is a contradiction, so if ω ∈ Ω* then ω ∉ Ω_{p,q}. Hence P(Ω_{p,q}) = 0.
7
PROCESSES WITH INDEPENDENT INCREMENTS

In this chapter we discuss the classical theory of processes with independent increments. In the first section we return to the theory of Lévy processes. The increments of Lévy processes are not only independent but also stationary. Lévy processes are semimartingales, but the same is not true for all processes with independent increments. In the second part of the chapter we present the generalization of the Lévy–Khintchine formula to processes with merely independent increments. The main difference between the theory of Lévy processes and the more general theory of processes with independent increments is that every Lévy process is continuous in probability, while this property does not hold for the more general class. This implies that a process with independent increments can jump at a fixed moment of time with positive probability.
7.1
Lévy processes
In this section we briefly return to the theory of Lévy processes. The theory of Lévy processes is much simpler than the more general theory of processes with independent increments. Recall that Lévy processes have stationary and independent increments. The main consequence of these assumptions is that if ϕ_t(u) denotes the Fourier transform of X(t), then for every u

ϕ_{t+s}(u) = ϕ_t(u)ϕ_s(u),  (7.1)

so for every u the function t → ϕ_t(u) satisfies Cauchy's functional equation (see line (1.40), page 62). As the Fourier transforms of distributions are always bounded, the solutions of equation (7.1) have the form

ϕ_t(u) = exp(tφ(u)),  (7.2)

for some φ. One of our main goals is to find the proper form of φ(u); this is the famous Lévy–Khintchine formula. Representation (7.2) has two very important consequences:
1. ϕ_t(u) ≠ 0 for every u and t,
2. ϕ_t(u) is continuous in t.
As ϕ_t is continuous in t, if t_n ↑ t then ϕ_{t_n}(u) → ϕ_t(u) for every u. Hence X(t_n) − X(t) → 0 weakly, that is, X(t_n) − X(t) → 0 in probability. Hence for some subsequence X(t_{n_k}) → X(t) almost surely. As the trajectories have left limits, X(t−) = X(t) almost surely. Hence if X is a Lévy process then it is continuous in probability and, as a consequence of this continuity, for every moment of time t the probability of a jump at t is zero, that is, P(∆X(t) ≠ 0) = 0 for every t. As ϕ_t(u) ≠ 0 for every u, one can define the exponential martingale

Z_t(u, ω) := exp(iuX(t, ω)) / ϕ_t(u).  (7.3)
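The functional equation (7.1) and the representation (7.2) can be checked numerically. The sketch below is only an illustration, not from the text: it takes a Brownian motion as the Lévy process (so that φ(u) = −u²/2) and estimates the Fourier transforms by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
u, t, s, n = 1.2, 0.5, 0.8, 400_000

# Brownian motion as a Levy process: X(t) ~ N(0, t), with independent,
# stationary increments; here phi(u) = -u^2/2 in representation (7.2).
x_t = rng.normal(0.0, np.sqrt(t), n)        # X(t)
inc = rng.normal(0.0, np.sqrt(s), n)        # X(t+s) - X(t), independent of X(t)

phi_t = np.exp(1j * u * x_t).mean()         # estimate of phi_t(u)
phi_s = np.exp(1j * u * inc).mean()         # estimate of phi_s(u)
phi_ts = np.exp(1j * u * (x_t + inc)).mean()

# Cauchy functional equation (7.1): phi_{t+s}(u) = phi_t(u) * phi_s(u)
assert abs(phi_ts - phi_t * phi_s) < 0.01
# Exponential form (7.2): phi_t(u) = exp(t * phi(u)) with phi(u) = -u^2/2
assert abs(phi_ts - np.exp(-(t + s) * u**2 / 2)) < 0.01
```

The factorization of the empirical characteristic function is exactly the independence and stationarity of the increments at work.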
Recall that, applying the Optional Sampling Theorem to (7.3), one can prove that every Lévy process is a strong Markov process (see Proposition 1.109, page 70).

7.1.1
Poisson processes
Let us recall that a Lévy process X is a Poisson process if its trajectories are increasing and the image of the trajectories is almost surely the set of integers {0, 1, 2, . . .}. One should emphasize that all the non-negative integers have to be in the image of the trajectories, so Poisson processes do not have jumps larger than one. To put it another way: Poisson processes are the Lévy-type counting processes.

Definition 7.1 A process is a counting process if its image space is the set of integers {0, 1, . . .}. X is a Poisson process with respect to a filtration F if it is a counting Lévy process with respect to the filtration F.

Since the values of the process are integers and the trajectories are right-regular, there is always a positive amount of time between the jumps. That is, if X(t, ω) = k then X(t + u, ω) = k whenever 0 ≤ u ≤ δ for some δ(t, ω) > 0. As the trajectories are defined for every t ≥ 0 and the values of the trajectories are finite at every t, the jumps of the process cannot accumulate. Let

τ₁(ω) := inf{t : X(t, ω) = 1} = inf{t : X(t, ω) > 0} < ∞.
τ₁ is obviously a stopping time. We show that τ₁ is exponentially distributed: if u, v ≥ 0 then

P(τ₁ > u + v) = P(X(u + v) = 0)
 = P(X(u) = 0, X(u + v) − X(u) = 0)
 = P(X(u) = 0) · P(X(u + v) − X(u) = 0)
 = P(X(u) = 0) · P(X(v) = 0),

hence if f(t) := P(τ₁ > t) then

f(u + v) = f(u) · f(v),  u, v ≥ 0.

f ≡ 0 and f ≡ 1 cannot be solutions, as X cannot be a non-trivial Lévy process: if f ≡ 1 then τ₁ = ∞, hence X ≡ 0, and the image of the trajectories would be {0} alone and not the set of integers. So for some 0 < λ < ∞

P(τ₁ > t) = P(X(t) = 0) = exp(−λt).

By the strong Markov property of Lévy processes (see Proposition 1.109, page 70) the distribution of X₁*(t) := X(τ₁ + t) − X(τ₁) is the same as the distribution of X(t), so if

τ₂(ω) := inf{t : X(t + τ₁(ω), ω) = 2} = inf{t : X₁*(t, ω) > 0} < ∞,

then τ₁ and τ₂ are independent and they have the same distribution. (Recall that τ₁ is F_{τ₁}-measurable and by the strong Markov property X₁* is independent of F_{τ₁}.)

Proposition 7.2 If λ denotes the common parameter, then for every t ≥ 0

P(Σ_{k=1}^{n+1} τ_k > t ≥ Σ_{k=1}^n τ_k) = P(X(t) = n) = (λt)^n/n! · exp(−λt).
Proof. Recall that a non-negative variable has gamma distribution Γ(a, λ) if the density function of the distribution is

f_{a,λ}(x) := λ^a x^{a−1} exp(−λx)/Γ(a),  x > 0.

First we show that if the ξ_i are independent random variables with distribution Γ(a_i, λ), then the distribution of Σ_{i=1}^n ξ_i is Γ(Σ_{i=1}^n a_i, λ). It is sufficient to
show the calculation for two variables. If the distribution of ξ₁ is Γ(a, λ) and the distribution of ξ₂ is Γ(b, λ), and if they are independent, then the density function of ξ₁ + ξ₂ is the convolution of the density functions of ξ₁ and ξ₂:

h(x) := ∫_{−∞}^∞ f_{a,λ}(x − t) f_{b,λ}(t) dt
 = ∫₀^x [λ^a (x − t)^{a−1}/Γ(a)] exp(−λ(x − t)) · [λ^b t^{b−1}/Γ(b)] exp(−λt) dt
 = [λ^{a+b} exp(−λx)/(Γ(a)Γ(b))] ∫₀^x (x − t)^{a−1} t^{b−1} dt
 = [λ^{a+b} exp(−λx)/(Γ(a)Γ(b))] ∫₀^1 (x − xz)^{a−1} (xz)^{b−1} x dz
 = [λ^{a+b} exp(−λx) x^{a+b−1}/(Γ(a)Γ(b))] ∫₀^1 (1 − z)^{a−1} z^{b−1} dz
 = λ^{a+b} exp(−λx) x^{a+b−1}/Γ(a + b).

Hence the distribution of ξ₁ + ξ₂ is Γ(a + b, λ). The density function of Γ(1, λ) is

[λ¹/Γ(1)] x^{1−1} exp(−λx) = λ exp(−λx),  x > 0,

so Γ(1, λ) is the exponential distribution with parameter λ. If σ_m := Σ_{k=1}^m τ_k, then σ_m has gamma distribution Γ(m, λ). So

P(X(t) < n + 1) = P(σ_{n+1} > t) = ∫_t^∞ [λ^{n+1} x^n/Γ(n + 1)] exp(−λx) dx
 = [−(λx)^n exp(−λx)/Γ(n + 1)]_t^∞ + ∫_t^∞ n [λ^n x^{n−1}/Γ(n + 1)] exp(−λx) dx
 = (λt)^n/n! · exp(−λt) + P(X(t) < n).

Hence

P(X(t) = n) = P(X(t) < n + 1) − P(X(t) < n) = (λt)^n/n! · exp(−λt).
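Proposition 7.2 is easy to test by simulation. The sketch below is only an illustration (the rate and time point are arbitrary choices): it builds the jump times σ_m as partial sums of independent exponentials and compares P(σ_n ≤ t < σ_{n+1}) with the Poisson probability.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
lam, t, n_jumps = 2.0, 1.0, 3      # check P(X(t) = 3) for a rate-2 process
n_paths = 300_000

# tau_k ~ Exp(lam) i.i.d.; sigma_m = tau_1 + ... + tau_m ~ Gamma(m, lam)
waits = rng.exponential(scale=1.0 / lam, size=(n_paths, n_jumps + 1))
sigma = np.cumsum(waits, axis=1)

# {X(t) = n} = {sigma_n <= t < sigma_{n+1}}
p_mc = ((sigma[:, n_jumps - 1] <= t) & (sigma[:, n_jumps] > t)).mean()
p_exact = (lam * t) ** n_jumps / math.factorial(n_jumps) * math.exp(-lam * t)
assert abs(p_mc - p_exact) < 5e-3
```

The event {σ_n ≤ t < σ_{n+1}} is exactly the event {X(t) = n} of the proposition.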
7.1.2
Compound Poisson processes generated by the jumps
Let X now be a Lévy process and let Λ be a Borel measurable set. Let

τ₁(ω) := inf{t : ∆X(t, ω) ∈ Λ}.

Since (Ω, A, P, F) satisfies the usual conditions, τ₁ is a stopping time (see Corollary 1.29, page 16, and Example 1.32, page 17). As τ₁ is measurable,

P(τ₁ > t) = P(∆X(u) ∉ Λ, ∀u ∈ [0, t])

is meaningful. Assume that the closure of Λ, denoted by cl(Λ), does not contain the point 0, that is, Λ is in the complement of a ball with some positive radius r > 0. As X is right-continuous and X(0) = 0, obviously 0 < τ₁ ≤ ∞. In a similar way as in the previous subsection, using that the jumps in Λ cannot accumulate,

P(τ₁ > t₁ + t₂) = P(∆X(u) ∉ Λ, u ∈ (0, t₁ + t₂])
 = P(∆X(u) ∉ Λ, u ∈ (0, t₁]) · P(∆X(u) ∉ Λ, u ∈ (t₁, t₁ + t₂])
 = P(∆X(u) ∉ Λ, u ∈ (0, t₁]) · P(∆X(u) ∉ Λ, u ∈ (0, t₂])
 = P(τ₁ > t₁) · P(τ₁ > t₂).

So τ₁ has an exponential distribution. Let us observe that now we cannot guarantee that λ > 0, as τ₁ ≡ ∞ is possible. Let us assume that τ₁ < ∞. Let X*(t) := X(τ₁ + t) − X(τ₁) and let

τ₂ := inf{t : ∆X*(t) ∈ Λ},

etc. If τ₁ < ∞ then τ_k < ∞ for all k. Let σ_n := Σ_{k=1}^n τ_k. As 0 ∉ cl(Λ), all the jumps in Λ are larger than some r > 0, and as X has limits from the left, the almost surely strictly increasing sequence (σ_n) almost surely cannot have a finite accumulation point. (The trajectories are only almost surely nice: for example, N(ω) ≡ 0 occurs with probability zero.) So almost surely σ_n ↗ ∞. As on every trajectory the number of jumps is at most countable, one can define the
465
process N Λ which counts the jumps of X with ∆X ∈ Λ. N Λ (t)
χΛ (∆X (s)) =
∞
χ {σ n ≤ t} .
(7.4)
n=1
0<s≤t
N Λ (t) − N Λ (s) is the number of jumps in Λ during the time interval (s, t] so it is evidently measurable with respect to the σ-algebra generated by the increments of X. Hence10 N Λ (t) − N Λ (s) is independent of the σ-algebra Fs . So N Λ has independent increments. It is also easy to prove that the distribution of N Λ (t)−N Λ (s) is the same as the distribution of N Λ (t − s). It is trivial from the definition that N Λ is a right-regular counting process. Hence N Λ is a counting L´evy process. Therefore we have proved the following: Lemma 7.3 If 0 ∈ / cl (Λ) then N Λ is a Poisson process. Definition 7.4 A stopping time σ is a jump time of a process X if ∆X (σ) = 0 almost surely. Example 7.5 The jump times of L´evy processes are totally inaccessible.
Let τ be a predictable stopping time and let P (∆X (τ ) = 0) > 0. We can assume that P (|∆X (τ )| ≥ ε) > 0 for some ε > 0. If Λ {|x| ≥ ε} and if (σ n ) are the stopping times of the Poisson process N Λ then P (σ n = τ ) > 0 for some n. But this is impossible as σ n is totally inaccessible11 for every n. Therefore if τ is predictable then P (∆X (τ ) = 0) = 0. With N Λ one can define the process J Λ (t, ω)
∆X (s, ω) χΛ (∆X (s, ω)) =
(7.5)
0<s≤t N Λ (t)
=
n=1
∆X (σ n ) =
∞
∆X (σ n ) χ {σ n ≤ t} .
n=1
Lemma 7.6 If 0 ∈ / cl (Λ) then J Λ is a compound Poisson process that is: 1. J Λ (0) = 0. 2. J Λ has countable many jumps. 3. After every jump J Λ has an exponentially distributed waiting time. After this waiting time J Λ jumps again. The time between the jumps are independent and they have the same distribution. 10 See: 11 See:
Proposition 1.97, page 61. Example 3.7, page 183.
466
PROCESSES WITH INDEPENDENT INCREMENTS
4. The sizes of the jumps are independent of the waiting times up to the jumps. 5. The sizes of the jumps have the same distribution and they are independent random variables. Proof. If η n ∆X (σ n ) then by the strong Markov property the variables (η n ) are independent and they have the same distribution. One need only prove that (σ n ) and (η n ) are independent. Let τ n σ n − σ n−1 . 1. If s > t, then (t) (t) {η 1 < a, σ 1 > s} = {σ 1 > t} ∩ η 1 < a, σ 1 > s − t , where η 1 and σ 1 are the size and the time of the first jump of X ∗ (u) = X (u + t) − X (t). As σ 1 is a stopping time {σ 1 > t} ∈ Ft . Hence by the strong Markov property {σ 1 > t} is independent of (t)
(t)
(t) (t) η 1 < a, σ 1 > s − t . Hence again by the strong Markov property
(t) (t) P (η 1 < a, σ 1 > s) = P {σ 1 > t} ∩ η 1 < a, σ 1 > s − t =
(t) (t) = P (σ 1 > t) P η 1 < a, σ 1 > s − t = = P (σ 1 > t) P (η 1 < a, σ 1 > s − t) . If s t then using that 0 ∈ / cl (Λ) and therefore P (σ 1 > 0) = 1, P (η 1 < a, σ 1 > t) = P (σ 1 > t) P (η 1 < a, σ 1 > 0) = = P (σ 1 > t) · P (η 1 < a) . Hence σ 1 τ 1 and η 1 are independent. In a similar way, using the strong Markov property again one can prove that τ n is independent of η n for every n. 3. By the strong Markov property (η n , τ n ) is independent of Fσn−1 . Hence
E exp i
= E E exp i
N
um η m + i
m=1 N m=1
um η m + i
N
vn τ n
n=1 N n=1
vn τ n
=
| FσN −1
=
´ LEVY PROCESSES
= E exp i
N −1
= E exp i
N −1
um η m + i
m=1 N −1
vn τ n
E exp (iuN η N + ivN τ N ) | FσN −1
n=1
m=1
= E exp i
um η m + i
N −1
N −1
um η m + i
m=1
=
vn τ n
n=1 N −1
467
· E (exp (iuN η N + ivN τ N )) =
vn σ n
· E (exp (iuN η N )) · E (exp (ivN τ N )) =
n=1
= ··· =
N !
E (exp (ium η m ))
m=1
N !
E (exp (ivm τ m )) .
m=1
This implies12 that the σ-algebras generated by (η m ) and (τ n ) are independent. Hence (η m ) and (σ n ) are also independent. Lemma 7.7 The Fourier transform of J Λ (s) is
(exp (iux) − 1) dF (x) E exp iu · J Λ (s) = exp λs R
where λ is the parameter of the Poisson part and F is the common distribution function of the jumps. Proof. Let G be the distribution function of N Λ (s). N Λ (s) ϕ (u) E exp iu · ∆X (σ k ) = = R
k=1
N Λ (s)
E exp iu ·
∆X (σ k ) | N Λ (s) = n dG (n) .
k=1
N Λ (s) has a Poisson distribution. As N Λ (s) and the variables (∆X (σ k )) are independent one can substitute and drop the condition N Λ (s) = k: ∞ n n (λs) exp (−λs) = ϕ (u) = E exp iu · ∆X (σ k ) n! n=0 k=1 n ∞ n (λs) exp (−λs) = = exp (iux) dF (x) n! R n=0 = exp λs (exp (iux) − 1) dF (x) . R
12 See:
Lemma 1.96, page 60.
468
PROCESSES WITH INDEPENDENT INCREMENTS
Lemma 7.8 If X is a L´evy process with respect to some filtration F and 0 ∈ / cl (Λ) then J Λ and X − J Λ are also L´evy processes with respect to F. Proof. First recall13 that if X is a L´evy process then the σ-algebra Gt generated by the increments X (u) − X (v) ,
u≥v≥t
is independent of Ft for all t. Observe that for all t increments of J Λ and X−J Λ of this type are Gt -measurable. So these processes have independent increment with respect to F. From the strong Markov property it is clear that the increments of these processes are stationary. As J Λ obviously has right-regular trajectories the processes in the lemma are L´evy processes as well. Lemma 7.9 If X is a L´evy process, Λ is a Borel measurable set and 0 ∈ / cl (Λ)
then the variables J Λ (t) and X − J Λ (t) are independent for every t ≥ 0. Proof. Let us fix a t. To prove the independence of the variables J Λ (t) and X (t) − J Λ (t) it is sufficient to prove14 that
'
& ϕ (u, v) E exp i u · J Λ (t) + v · X (t) − J Λ (t) =
= E exp iu · J Λ (t) · E exp iv · X (t) − J Λ (t) .
(7.6)
Let us emphasize that as 0 ∈ / cl (Λ) on every finite interval the number of jumps in Λ is finite so J Λ has trajectories with finite variation. That is J Λ ∈ V. Let
exp iu · J Λ (s, ω) , M (s, ω, u) E (exp (iu · J Λ (s, ω)))
& ' exp iv · X (s, ω) − J Λ (s, ω) N (s, ω, v) E (exp (iv · [X (s, ω) − J Λ (s, ω)])) be the exponential martingale of J Λ and X − J Λ . The Fourier transforms in the denominators are never zero and they are continuous, hence the expressions are meaningful and the jumps of these processes are the jumps of the numerators. Integrating by parts M (t) N (t) − M (0) N (0) =
t
M− dN + 0
+ [M, N ] (t) . 13 See: 14 See:
Proposition 1.97, page 61. Lemma 1.96, page 60.
t
N− dM + 0
´ LEVY PROCESSES
469
The Fourier transforms in the denominators are never zero and they are continuous so their absolute value have a positive minimum on the compact interval [0, t]. The numerators are bounded, so the integrators are bounded on any finite interval. Hence the stochastic integrals above are real martingales15 . So their expected value is zero. We show that [M, N ] = 0. As J Λ (t) has a compound Poisson distribution one can explicitly write down its Fourier transform: E exp iu · J (s) = exp λs (exp (iux) − 1) dF (x)
Λ
R
exp (s · φ (u)) As J Λ ∈ V obviously M ∈ V. So M is purely discontinuous. Hence16 [M, N ] =
∆M ∆N.
J Λ and X − J Λ do not have common jumps, therefore [M, N ] (t) =
∆M (s) ∆N (s) = 0.
0<s≤t
Hence E (M (t) N (t)) = E (M (0) N (0)) = 1. From which (7.6) trivially holds. If N1 and N2 are Poisson processes and N1 and N2 do not have common jumps then [N1 , N2 ] =
∆N1 ∆N2 = 0.
Using this one can prove in a similar way as above the following observation: Lemma 7.10 If N1 and N2 are Poisson processes with respect to some filtration F and N1 and N2 do not have common jumps almost surely then N1 (t) and N2 (t) are independent for every t. Proposition 7.11 If (Ni ) are finitely many Poisson processes with respect to some filtration then they do not have common jumps almost surely if and only if the variables (Ni (t)) are independent17 for every t. 15 See:
Proposition 2.24, page 128. Corollary 4.34, page 245. 17 See: Example 2.29, page 130. 16 See:
470
PROCESSES WITH INDEPENDENT INCREMENTS
Proof. If the values of Poisson processes are independent then the same is true for the compensated Poison processes. By the independence on every finite time interval the compensated Poisson processes are orthogonal in the Hilbert space H02 . Hence they are orthogonal as local martingales18 . Therefore their quadratic variation is a uniformly integrable martingale19 . This implies that the expected value of the quadratic co-variation [N1 , N2 ] =
∆N1 ∆N2
is almost surely zero. As ∆N1 ∆N2 ≥ 0 the quadratic co-variation is almost surely zero. Hence the two processes do not have common jumps almost surely. The proof of the other part of the proposition is clear from the previous lemma. Theorem 7.12 (Decomposition of L´ evy processes ) If X is a L´evy process, Λ is a Borel measurable set and 0 ∈ / cl (Λ) then J Λ and X − J Λ are independent L´evy processes. Proof. Recall that by definition two processes are independent if they are independent as sets of random variables. As we proved20 J Λ (t) and X − J Λ (t) are independent for every t. From the Markov property it is clear that if h > 0 then the increments J Λ (t + h) − J Λ (t) and
X − J Λ (t + h) − X − J Λ (t)
are also independent. Let (tk ) be a time sequence. Let (αk ) denote the corresponding increments of J Λ and let (β k ) denote the corresponding increments of X −J Λ . Let Gt be the σ-algebra generated by the increments of X after t. Observe that αk and β k are Gtk -measurable. Hence the linear combination uk αk + vk β k is also Gtk -measurable. So uk αk + vk β k is independent21 of Ftk . Using these one can easily decompose the joint Fourier transform: n n iuk αk + ivk β k = ϕ (u, v) E exp = E exp
k=1 n
k=1 18 See:
Proposition 4.15, Proposition 2.84, 20 See: Lemma 7.9, page 21 See: Proposition 1.97, 19 See:
page 230. page 170. 468. page 61.
k=1
i (uk αk + vk β k )
=
´ LEVY PROCESSES
= E E exp = E exp
n−1
n
k=1
i (uk αk + vk β k )
471
| Ftn−1
=
i (uk αk + vk β k ) E (exp (i (un αn + vn β n )))
=
k=1
= ··· =
n !
E (exp (i (uk αk + vk β k ))) =
k=1
=
n !
(E (exp (iuk αk )) · E (exp (ivk β k ))) = ϕ1 (u) · ϕ2 (v) .
k=1
This means that the sets of variables (αk ) and (β k ) are independent. Hence the σalgebras generated by the increments, that is by the processes, are independent. Therefore the processes X − J Λ and J Λ are independent. With nearly the same method one can prove the following proposition. Proposition 7.13 If (Ni ) are finitely many Poisson processes with respect to some common filtration then they do not have common jumps almost surely if and only if the processes are independent. Proof. Let F be the common filtration of N1 and N2 and let U and V be the exponential martingales of N1 and N2 . As N1 and N2 do not have a common jumps the quadratic co-variation of U and V is zero. Hence they are orthogonal. That is U V is a local martingale with respect to F. On every finite interval U, V ∈ H2 , therefore |U V (t)| ≤ sup |U (s)| sup |V (s)| ∈ L1 (Ω). s
s
Hence U V is a martingale. Therefore
E U V (tk ) | Ftk−1 = U V (tk−1 ) . If we use the notation of the proof of the previous proposition then with simple calculation one can write this as
E exp (i (uk αk + vk β k )) | Ftn−1 = E (exp (iuk αk )) · E (exp (ivk β k )) . From this the proof of the proposition is obvious. Corollary 7.14 If (Ni ) are countably many independent Poisson processes then they do not have common jumps almost surely. Proof. Let N1 and N2 be independent Poisson processes and let F (1) and F (2) be the filtration generated by the processes. Let U and V be the exponential
472
PROCESSES WITH INDEPENDENT INCREMENTS
martingales of N1 and N2 . U and V are martingales with respect to filtrations F (1) and F (2) . Let F be the filtration generated by the two processes N1 and N2 . Using the independence of N1 and N2 we show that U and V are martingales (1) (2) with respect to F as well. If F1 ∈ Fs and F2 ∈ Fs where s < t then F1 ∩F2
U (t) dP = E χF1 χF2 U (t) = E χF2 E χF1 U (t) =
= E χF2 E χF1 U (s) = E χF2 χF1 U (s) = U (s) dP. = F1 ∩F2
With the Monotone Class Theorem one can prove that the equality holds for every F ∈ σ F1 ∩ F2 : F1 ∈ Fs(1) , F2 ∈ Fs(2) = Fs , that is E (U (t) | Fs ) = U (s). Hence U is a martingale with respect to F.
Example 7.15 Poisson processes without common jumps which are not independent.
Let (σ k ) be the jump times generating some Poisson process. Obviously variables (2σ k ) also generate a Poisson process. As the probability that two independent continuous random variable is equal is zero the jump times of the two processes are almost surely never equal. But as they generate the same non-trivial σ-algebra they are obviously not independent. Proposition 7.16 If X is a L´evy process and (Λk ) are finitely many
disjoint Borel measurable sets with 0 ∈ / cl (Λk) for all k, then processes N Λk are independent. The same is true for J Λk . Proof. It is sufficient to show the second part of the proposition. If X J ∪i=1 Λk n then J ∪i=2 Λk = X − J Λ1 and J Λ1 are independent. From this the proposition is obvious. n
7.1.3
Spectral measure of L´ evy processes
First let us prove a very simple identity.
´ LEVY PROCESSES
473
Definition 7.17 Let (X, A) and (Y, B) be measurable spaces. A function µ : X × B → [0, ∞] is a random measure if: 1. for every B ∈ B the function x → µ (x, B) is A-measurable, 2. for every x ∈ X the set function B → µ (x, B) is a measure on (Y, B). Proposition 7.18 Let (X, A) and (Y, B) be measurable spaces and let µ : X × B → [0, ∞] be a random measure. If ρ is a measure on (X, A) and ν (B)
µ (x, B) dρ (x) , X
then ν is a measure on (Y, B). If f is a measurable function on (Y, B) then
f (y) µ (x, dy) dρ (x) ,
f (y) dν (y) = Y
X
Y
whenever the integral on the left-hand side
f dν is meaningful.
Y
Proof. ν is non-negative and if (Bn ) are disjoint sets then by the Monotone Convergence Theorem ν (∪n Bn )
µ (x, ∪n Bn ) dρ (x) =
X
=
n
X
µ (x, Bn ) dρ (x)
X
µ (x, Bn ) dρ (x) =
n
ν (Bn ) ,
n
so ν is really a measure. If f = χB , B ∈ B, then
f (y) dν (y) = ν (B)
Y
=
µ (x, B) dρ (x) = X
χB (y) µ (x, dy) dρ (x) = X
Y
X
Y
=
f (y) µ (x, dy) dρ (x) .
In the usual way, using the linearity of the integration and the Monotone Convergence Theorem the formula can be extended to non-negative measurable functions. If f is non-negative and Y f dν is finite then almost surely w.r.t. ρ
PROCESSES WITH INDEPENDENT INCREMENTS
474
the inner integral is also finite. Let f = f + − f − and assume that the integral of f − w.r.t. ν is finite. In this case, as we remarked, the integral Y f − (y) µ (x, dy) is finite for almost all x and the integral
f (y) µ (x, dy) −
f (y) µ (x, dy) = Y
+
Y
f − (y) µ (x, dy)
Y
is almost surely meaningful. The integral of the second part with respect to ρ is finite, hence
f dν Y
f dν − +
Y
f − dν =
Y
f + (y) µ (x, dy) dρ (x) −
= X
Y
X
f (y) µ (x, dy) −
X
Y
−
f (y) µ (x, dy) dρ (x)
+
=
f − (y) µ (x, dy) dρ (x) =
Y
Y
f (y) µ (x, dy) dρ (x) . X
Y
Let us fix a moment t. For an arbitrary ω define the counting measure supported by the jumps of s → X (s, ω) in [0, t]. Denote this random measure by µX (t, ω, Λ) = µX t (ω, Λ). That is µX t (ω, Λ)
χΛ (∆X (s, ω)) = N Λ (t, ω) .
(7.7)
0<s≤t
In general the process X is fixed so in order to simplify the notation as much as possible we shall drop the superscript X and instead of µX we shall simply write µ. If 0 ∈ / cl (Λ) then by (7.7) µt (ω, Λ) is measurable in ω. Obviously if Λ ⊆ R \ {0} then c
µ (t, ω, Λ) = lim µ (t, ω, Λ ∩ [−1/n, 1/n] ) , n→∞
so µt (ω, Λ) is also measurable in ω for any Borel measurable subset Λ of R \ {0}. This implies that µt (ω, Λ) is a random measure over R \ {0}. Hence Λ → ν t (Λ) E (µt (Λ))
µt (ω, Λ) dP (ω) ,
Λ ∈ B (R \ {0})
Ω
is a measure on (R \ {0} , B (R \ {0})). If 0 ∈ / cl (Λ) then ν t (Λ) is the expected value of a Poisson process at a fixed time, therefore ν t (Λ) < ∞. Therefore ν t is σ-finite for every t.
´ LEVY PROCESSES
475
Definition 7.19 The measures ν t (Λ) E (µt (Λ)) ,
Λ ∈ B (R \ {0})
are called the spectral measures of X. To simplify the notation let ν ν 1 . Lemma 7.20 ν t (Λ) = t · ν 1 (Λ) t · ν (Λ). Proof. If 0 ∈ / cl (Λ) then N Λ is a Poisson process. In this case
ν t (Λ) E N Λ (t) = t · E N Λ (1) tν (Λ) . In the general case by the Monotone Convergence Theorem
c ν t (Λ) = E lim µt (Λ ∩ [−1/n, 1/n] ) = n→∞
c
= lim E (µt (Λ ∩ [−1/n, 1/n] )) = n→∞
c
= lim t · ν (Λ ∩ [−1/n, 1/n] ) = t · ν (Λ) . n→∞
Proposition 7.21 (L1 -identity) If X is a L´evy process then for every Borel measurable function f : R \ {0} → R E f dµt = E f (∆X (s)) χ (∆X (s) = 0) R\{0}
0<s≤t
= R\{0}
whenever the integral
R\{0}
f dν t = t
f dν,
(7.8)
R\{0}
f dν is meaningful.
Proof. As µt (ω, Λ) is a counting measure for ever Borel measurable function f f (x) µt (ω, dx) = f (∆X (s, ω)) χ (∆X (s) = 0) . R\{0}
0<s≤t
The other parts of (7.8) are direct consequences of the previous proposition. Corollary 7.22 Let X be a L´evy process. If 0 ∈ / cl (Λ) and Λ xdν (x) is finite then
J Λ (t) − E J Λ (t) = J Λ (t) − t xdν (x) (7.9) Λ
is a martingale. In particular if Λ is bounded and 0 ∈ / cl (Λ) then (7.9) is a martingale.
476
PROCESSES WITH INDEPENDENT INCREMENTS
Proof. As Λ xdν (x) R\{0} xχΛ (x) dν (x) is finite by the L1 -identity with f (x) xχΛ (x)
E J Λ (t) E
∆X (s) χΛ (∆X (s)) =
0≤s≤t
=t R\{0}
xχΛ (x) dν (x) = t
xdν (x) . Λ
/ cl (Λ) the jumps X is a L´evy process so J Λ has independent increments. As 0 ∈ Λ has right-regular trajectories. This implies that in Λ cannot accumulate. So J J Λ (t) − E J Λ (t) is a martingale. Let P denote the σ-algebra of the predictable sets. By the martingale property of the compensated jumps it is clear that if 0 ∈ / cl (Λ), F ∈ Fs and s < t then
µ (t, ω, Λ) − t · ν (Λ) dP (ω) = F
µ (s, ω, Λ) − s · ν (Λ) dP (ω) . F
This means that as ν (Λ) < ∞
µ (t, ω, Λ) − µ (s, ω, Λ) dP (ω) = F
(t − s) · ν (Λ) dP (ω) , F
that is if H (u, ω, e) χΛ (e) χF (ω) χ(s,t] (u) then
∞
Hµ (du, ω, de)
E 0
R\{0}
∞
=E
Hdν (e) du .
0
R\{0}
The meaning of the left-hand side is the following. For every ω let µ (ω, D) denote22 the counting measure of the jumps of X, that is if D ∈ B (R+ ) × B (R \ {0}) then let µ (ω, D) be the number of jumps in D. First we integrate by this measure and then, if it is meaningful, we take the expected value. If the time interval is finite and we restrict µ to a set with ν (Λ) < ∞ then the set of bounded processes for which the formula is valid is a linear space. From this in the usual way, using the Monotone Class Theorem and the Monotone 22 See:
Definition 7.44, page 496.
´ LEVY PROCESSES
477
Convergence Theorem, one can prove the following: Proposition 7.23 (General L1 -identity) If H ≥ 0 is measurable with respect to P × B (R \ {0}) then
∞
E
H (u, ω, e) µ (du, ω, de) 0
∞
=E
R\{0}
H (u, ω, e) dν (e) du . R\{0}
0
Example 7.24 The L´evy–Khintchine formula for compound Poisson processes.
Let X be a L´evy process and let 0 ∈ / cl (Λ). Let J Λ be the compound Poisson process of the jumps of X. The Fourier transform of J Λ (s) is23
exp λs R
(exp (iux) − 1) dF (x)
,
where F is the common distribution function of the jumps, and λ is the parameter of the underlying Poisson process. What is the relation between F and ν? If B ∈ B (R\ {0}) and τ is the time of the first jump in Λ then by the general L1 -identity using that χ ([0, τ ]) is predictable F (B) = P (∆X (τ ) ∈ B ∩ Λ) = E (χB∩Λ (∆X (τ ))) = ∞ =E χ ([0, τ ]) µ (du, B ∩ Λ) = 0
∞
=E
χB∩Λ (e) χ ([0, τ ]) µ (du, de)
R\{0}
0 ∞
χB∩Λ (e) χ ([0, τ ]) dν (e) du
=E 0
R\{0}
= ν (B ∩ Λ) E
∞
χ ([0, τ ]) du
=
0
= ν (B ∩ Λ) E (τ ) =
ν (B ∩ Λ) . λ
That is the Fourier transform of J Λ (s) is (exp (iux) − 1) dν (x) . exp s Λ 23 See:
Lemma 7.7, page 467.
=
=
478
PROCESSES WITH INDEPENDENT INCREMENTS
Definition 7.25 Let (E, E, ν) be a measure space and let (Ω, A, P) be a probability space. We say that the random measure µ : Ω × E → [0, ∞] is a random Poisson measure with control measure ν if: 1. whenever the sets (Λk ) are disjoint the variables µ (ω, Λk ) are independent and 2. whenever ν (Λ) < ∞ the variable µ (ω, Λ) has a Poisson distribution with parameter ν (Λ). Proposition 7.26 Let X be a L´evy process. For every t the counting measure µt (ω, Λ) is a random Poisson measure. The control measure of µt is the spectral measure ν t . 'c & Proof. For every Λ ⊆ R \ {0} let Λn Λ ∩ − n1 , n1 . Obviously 0 ∈ / cl (Λn ) and µ (t, ω, Λ) = lim µ (t, ω, Λn ) . n→∞
As 0 ∈ / cl (Λn ) the variable ω → µ (t, ω, Λn ) = N Λn (t, ω) has a Poisson distribution. The Fourier transform of this variable is exp (tν (Λn ) (exp (iu) − 1)) . The convergence for every ω implies the weak convergence, so if ν (Λ) < ∞, then as ν (Λ) = limn→∞ ν (Λn ) the Fourier transform of ω → µ (t, ω, Λ) is exp (tν (Λ) (exp (iu) − 1)) . Hence it has a Poisson distribution. If the sets Λk =
∪n Λ(k) n
∪n
c 1 1 Λk ∩ − , n n
(k)
are disjoint then the sets Λn are also disjoint for every n. Hence the variables
µ t, ω, Λ(k) n are independent. The limit of independent variables is independent, so if the sets (Λk ) are disjoint, then the variables µ (t, ω, Λk ) are independent.
´ LEVY PROCESSES
479
Definition 7.27 Let H be a Hilbert space and let (C, C, ν) be a measure space and let S ⊆ C denote the subsets of C with finite measure. π : S → H is a vector measure with control measure ν if for every S ∈ S: 1. π (S) ∈ H is defined, 2
2. π (S)H = ν (S), 3. if S1 and S2 are disjoint sets in S then the vectors π (S1 ) and π (S2 ) are orthogonal. We say that a function f : C → R is integrable to π if there is a
with respect sequence of finite valued step functions (sn ) = c χ k nk Cnk with: 1. sn → f in L2 (ν) and 2. In k cnk π (Cnk ) is a Cauchy sequence in H. If I limn→∞ In , then we shall call this limit I the integral of f with respect to π. We shall denote this integral by C f (x) dπ (x) or simply C f dπ. Proposition 7.28 If f ∈ L2 (C, C, ν) and π is a vector measure with control measure (C, C, ν) then f is integrable with respect to π and f dπ C
Proof. Let s and 2.
k ck
- = f 2
f 2 dν.
(7.10)
C
H
· χCk where Ck are disjoint and in S. By conditions 3.
2 2 s dπ = c · π (C ) k k = C H k H 2 2 = ck · π (Ck )H = k
=
k
(7.11)
c2k
2
· ν (Ck ) = C
s2 dν = s2 .
there is a sequence sn k cnk χCnk with As the step functions are dense in L2 sn → f in L2 (ν). From (7.11) In k cnk π (Cnk ) is a Cauchy sequence in H. From this the proposition is obvious. Corollary 7.29 If f ∈ L2 (C, C, ν) and π is a vector measure with control measure ν then the value of the vector integral C f dπ is independent of the approximating sequence (sn ). Proposition 7.30 If X is a L´evy process and H L2 (Ω) then for every t ≥ 0 π t (Λ) N Λ (t) − ν t (Λ) = N Λ (t) − t · ν (Λ)
(7.12)
480
PROCESSES WITH INDEPENDENT INCREMENTS
is a a Hilbert space valued vector measure over (R \ {0} , B (R \ {0}) , ν t ). The same is true if H H02 on the time interval [0, t] and (π (Λ)) (s) N Λ (s) − s · ν (Λ) ,
s ≤ t < ∞.
Proof. As we have already proved, if Λ ⊆ R \ {0} and ν t (Λ) < ∞ then the Fourier transform of N Λ (t) is exp (ν t (Λ) (exp (iu) − 1)) . Hence if ν t (Λ) < ∞, then N Λ (t) has a Poisson distribution with parameter 2 ν t (Λ). This implies that the expected value of (7.12) is zero and π t H = ν t (Λ). Λ1 As we have also proved that if Λ1 ∩ Λ2 = ∅ then N (t) and N Λ2 (t) are independent24 . So (π t (Λ1 ) , π t (Λ2 )) π t (Λ1 ) π t (Λ2 ) dP = 0. Ω
7.1.4
Decomposition of L´ evy processes
Now we are ready to prove that L´evy processes are semimartingales. Proposition 7.31 If X is a L´evy process then: 1. X is a semimartingale, 2. X has a decomposition X =V +M where: 3. V and M are independent L´evy processes, 4. M is a martingale with bounded jumps and on every finite interval M ∈ H02 , 5. V ∈ V, that is on every finite interval the trajectories of V have finite variation. Proof. If Λ {|x| ≥ 1} then the jumps of Y X − J Λ are bounded. Y is a L´evy process with bounded jumps25 . This implies that Y (t) has an expected value26 for every t. Therefore M (t) Y (t) − E (Y (t)) =
= X (t) − J Λ (t) − t · E X (1) − J Λ (1) X (t) − J Λ (t) − t · γ
24 See:
Proposition 7.16. page 472. Lemma 7.8, page 468. 26 See: Proposition 1.111, page 74. 25 See:
´ LEVY PROCESSES
481
is a L´evy process with zero expected value. Hence M is a martingale. The martingale M has finite moments, so on any finite interval M is in H02 . Therefore M satisfies 4. Obviously V (t) J Λ (t) + E (Y (t)) J Λ (t) + γ · t satisfies 5. As X − J Λ and J Λ are independent27 the proposition holds.
Corollary 7.32 The spectral measure ν has the following properties x2 dν (x) < ∞.
ν (|x| ≥ 1) < ∞, 0